Layer 7 Switching + Load Balancing = Layer 7 Load Balancing
Modern load balancers (application delivery controllers) blend traditional load-balancing capabilities with advanced, application-aware layer 7 switching to support the design of a highly scalable, optimized application delivery network. Here's the difference between the two technologies, and the benefits of combining them into a single application delivery controller.

LOAD BALANCING

Load balancing is the process of balancing load (application requests) across a number of servers. The load balancer presents to the outside world a "virtual server" that accepts requests on behalf of a pool (also called a cluster or farm) of servers and distributes those requests across all servers based on a load-balancing algorithm. All servers in the pool must contain the same content. Load balancers generally use one of several industry-standard algorithms to distribute requests. Some of the most common standard load-balancing algorithms are:

- round robin
- weighted round robin
- least connections
- weighted least connections

Load balancers are used to increase the capacity of a web site or application, to ensure availability through failover capabilities, and to improve application performance.

LAYER 7 SWITCHING

Layer 7 switching takes its name from the OSI model, indicating that the device switches requests based on layer 7 (application) data. Layer 7 switching is also known as "request switching", "application switching", and "content-based routing". A layer 7 switch presents to the outside world a "virtual server" that accepts requests on behalf of a number of servers and distributes those requests based on policies that use application data to determine which server should service which request. This allows the application infrastructure to be specifically tuned and optimized to serve specific types of content. For example, one server can be tuned to serve only images, another for execution of server-side scripting languages like PHP and ASP, and another for static content such as HTML, CSS, and JavaScript.

Unlike load balancing, layer 7 switching does not require that all servers in the pool (farm/cluster) have the same content. In fact, layer 7 switching expects that servers will have different content, thus the need to more deeply inspect requests before determining where they should be directed. Layer 7 switches are capable of directing requests based on URI, host, HTTP headers, and anything in the application message. The latter capability is what gives layer 7 switches the ability to perform content-based routing for ESBs and XML/SOAP services.

LAYER 7 LOAD BALANCING

By combining load balancing with layer 7 switching, we arrive at layer 7 load balancing, a core capability of all modern load balancers (a.k.a. application delivery controllers). Layer 7 load balancing combines the standard load-balancing features of a load balancer with the content awareness of layer 7 switching to provide failover and improved capacity for specific types of content. This allows the architect to design an application delivery network that is highly optimized to serve specific types of content but is also highly available. Layer 7 load balancing allows additional features offered by application delivery controllers to be applied based on content type, which further improves performance by executing only those policies that are applicable to the content. For example, data security in the form of data scrubbing is likely not necessary on JPG or GIF images, so it need only be applied to HTML and PHP.
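To make the combination concrete, here is a minimal Python sketch of the idea (an illustration only, not any vendor's implementation): the layer 7 switching step inspects the request URI and selects a content-specific pool, and a plain round-robin load-balancing step then picks a server within that pool. The pool names, addresses, and file-extension rules are invented for the example.

```python
from itertools import cycle
from urllib.parse import urlparse

# Hypothetical content-specific pools; names and addresses are illustrative only.
POOLS = {
    "images":  cycle(["10.0.1.10", "10.0.1.11"]),               # tuned for images
    "scripts": cycle(["10.0.2.10", "10.0.2.11", "10.0.2.12"]),  # PHP/ASP execution
    "static":  cycle(["10.0.3.10", "10.0.3.11"]),               # HTML, CSS, JavaScript
}

def select_pool(uri: str) -> str:
    """Layer 7 switching: choose a pool based on application data (the URI)."""
    path = urlparse(uri).path.lower()
    if path.endswith((".jpg", ".gif", ".png")):
        return "images"
    if path.endswith((".php", ".asp")):
        return "scripts"
    return "static"

def route(uri: str) -> str:
    """Layer 7 load balancing: layer 7 switching plus round robin within the pool."""
    pool = select_pool(uri)
    return next(POOLS[pool])  # round robin within the selected pool

if __name__ == "__main__":
    for uri in ["/img/logo.gif", "/cart.php", "/css/site.css", "/img/banner.jpg"]:
        print(uri, "->", route(uri))
```

In a real deployment the same routing decision would also gate which optimization and security policies apply to the request, which is the point about data scrubbing above.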
Layer 7 load balancing also allows for increased efficiency of the application infrastructure. For example, only two highly tuned image servers may be required to meet application performance and user concurrency needs, while three or four optimized servers may be necessary to meet the same requirements for PHP or ASP scripting services. Being able to separate out content based on type, URI, or data allows for better allocation of physical resources in the application infrastructure.
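The allocation point above is essentially what weighted distribution algorithms express in configuration: servers (or pools) sized for heavier work receive a proportionally larger share of requests. Below is a small, hypothetical sketch of weighted round robin, one of the standard algorithms listed earlier; the server names and weights are assumptions for illustration.

```python
def weighted_round_robin(weights):
    """Yield servers in proportion to their configured integer weights."""
    max_weight = max(weights.values())
    while True:
        # Walk the weight "rounds" so higher-weight servers appear more often,
        # interleaved rather than in long bursts.
        for threshold in range(max_weight, 0, -1):
            for server, weight in weights.items():
                if weight >= threshold:
                    yield server

# Hypothetical pool: the scripting servers are sized to take twice the load
# of the image-optimized server.
pool = weighted_round_robin({"php-1": 2, "php-2": 2, "img-1": 1})
print([next(pool) for _ in range(10)])   # php servers appear twice as often
```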
Load Balancing on the Inside

Business-critical internal processing systems often require high availability and fault tolerance, too.

Load balancing and application delivery is almost always associated with scaling out interactive, web-based applications. Rarely does anyone think about load balancing and application delivery in batch processing systems, even when those systems might be critical to the business they are supporting. But scaling out non-interactive processing systems and providing high availability to such critical systems is just as easily accomplished with an application delivery controller (ADC) as scaling out an interactive web-based application. Maybe easier. When that system also requires a bit more intelligence than just simple load balancing, it makes a lot of sense to look closer at a context-aware system that can support all the requirements in a single solution.

THE SCENARIO

A batch document processing system uses a document ID to match all related documents to the same “case.” The first time a document ID is encountered, it creates a new “case,” and subsequent documents bearing that ID are attached to the original case. To ensure processing around the clock, a redundant set of application servers is configured to process the documents, and the vendor’s application server clustering solution is used to load balance documents (in simple round-robin fashion) across the two instances.

A load test is conducted, ramping up to 2500 documents per hour (41 per minute, fewer than 1 per second). During the test it is discovered that in some situations two documents with the same ID will arrive at the clustering solution in close succession and each will be load balanced to a separate instance. There is no existing “case” for this document ID. Because of processing times and load on the servers, both documents result in the creation of separate “cases.” The test is considered a failure because the system, while managing the load fine from a network perspective, executed incorrectly under load from a process perspective.

The solution? Reconfigure the clustering solution to an active-standby configuration, thus introducing the process latency needed to ensure that the scenario does not occur. Retest. Success. The result? The investment in the second instance of the application server – hardware, software licenses, management, maintenance – is wasted. It is a “failover” node only and reduces the overall capacity – and ultimately performance at higher load levels – of the system.

WHEN CONTEXT MATTERS

This scenario is real; it was described to me by a program manager at a Fortune 500 with a great deal of frustration, as it seemed, to her anyway, that the architects could not come up with a working solution other than wasting a perfectly good set of resources. Instinctively she described a solution that leveraged persistence to force all documents with the same ID to the same server, as it had been proven repeatedly that if all documents with the same ID were processed by the same application server, the system processed them correctly and associated them with the right “case” in all situations. But the application server clustering solution, which can provide server affinity (persistence) based on a few variables, was for some reason not able to support affinity (persistence) based on the document ID. After a few questions regarding the overall system and processing times it became clear that a context-aware application delivery controller could indeed solve this problem.
The solution is fairly simple, actually, and based on existing persistence-based load balancing solutions. It is a given that documents with the same ID are batch processed within minutes of each other. Thus, a persistence table with a life of an hour or even thirty minutes would provide the proper context in which documents could be processed and directed to the “right” application server. This requires context; it requires that the load balancing solution, the application delivery controller, be aware of not only what it is processing but what it has processed already, and where it’s been sent.

Document ID Based Persistence Logic

- Extract the document ID from the document
- Check the persistence table for the document ID
- If the document ID already exists, route the document to the same server as the previous document(s) with that ID
- If the document ID does not exist, decide which server the document will be sent to for processing and create an entry in the persistence table
- Wash. Rinse. Repeat.

(A minimal sketch of this logic appears at the end of this post.)

This problem is really about process-level execution; about enforcing a business requirement on the technological implementation. In order to achieve compliance with the business process expectations it is necessary to be able to view each request in the context of that process rather than as an individual request that needs to be executed. Thus each touch point in the architecture that needs to manipulate, transform, or perform some task with or on or to the request needs to be able to take into consideration the process; it needs to be context-aware so that its decisions are made within the context of the entire process and not just the individual request.

Layer 7 switching, application load balancing, application delivery. Whatever you want to call it, it is the way in which load balancing becomes context-aware and becomes collaborative. It enables the business requirements to be not only taken into consideration but enforced, while ensuring that CapEx and OpEx investments in additional systems are not left to sit idle; wasted. It improves capacity while still introducing the process latency the business process requires – but only where it is required. By forcing the process to follow a particular path, the application delivery controller helps the technological implementation meet the goals of the business. In other words, it aligns IT with the business. Sometimes the marketing fluff is more solid than it appears.

- To Boldly Go Where No Production Application Has Gone Before
- WILS: Network Load Balancing versus Application Load Balancing
- Sessions and Cookies and Persistence, oh my!
- Persistent and Persistence, What's the Difference?
- If Load Balancers Are Dead Why Do We Keep Talking About Them?
- A new era in application delivery
- Infrastructure 2.0: The Diseconomy of Scale Virus
- The Politics of Load Balancing
- Business-Layer Load Balancing
- Not all application requests are created equal
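As promised above, here is a minimal, hypothetical Python sketch of the document-ID persistence logic: a persistence table with a bounded lifetime, a lookup on the document ID, and plain round robin only when no entry exists. The thirty-minute lifetime comes from the discussion above; the server names and the shape of the table are assumptions, not a description of any particular ADC.

```python
import time
from itertools import cycle

SERVERS = cycle(["app-1", "app-2"])   # hypothetical redundant application servers
PERSISTENCE_TTL = 30 * 60             # 30-minute table lifetime, per the discussion above
persistence_table = {}                # document ID -> (server, expiry timestamp)

def route_document(doc_id: str) -> str:
    """Send all documents sharing an ID to the same server while the entry lives."""
    now = time.time()
    entry = persistence_table.get(doc_id)
    if entry and entry[1] > now:
        return entry[0]               # existing "case": reuse the same server
    server = next(SERVERS)            # new ID: ordinary load balancing decides
    persistence_table[doc_id] = (server, now + PERSISTENCE_TTL)
    return server

for doc_id in ["A100", "B200", "A100", "C300", "A100"]:
    print(doc_id, "->", route_document(doc_id))
```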
WILS: Virtualization, Clustering, and Disaster Recovery

#virtualization Clustering is local. Disaster recovery is global.

There are two levels of reliability for an application. There’s local and there’s global. We might want to consider it more simply as “inside” and “outside” reliability. Virtualization enables local reliability – the inside kind of reliability. Whether you’re relying upon clustering or load balancing (each has advantages and disadvantages, but for purposes of reliability and this discussion we’ll assume equal capabilities) to provide the abstraction isn’t as important as recognizing that in terms of reliability you’re acting at the local, i.e. inside, level. A cluster or pool, in load balancing parlance, is able to maintain local reliability by distributing load across multiple instances of the application. We can transparently add or remove instances to achieve the elasticity necessary to meet demand, thus ensuring reliability. In the event of a local disaster, such as the failure of a virtual machine, we can take the failed instance out of the rotation and even provision another to replace it.

What clustering (load balancing) can’t do is address global reliability, i.e. outside reliability. Global reliability must be addressed using a different technology, normally referred to as Global Server Load Balancing (GSLB). The terminology grew out of the days when global reliability was achieved by load balancing individual servers across the globe to ensure a failure in the network or at a specific location could not interrupt the service. As demand grew, GSLB performed the same functions, but did so at a site level, essentially load balancing sites instead of individual servers. The name remains, however confusing that may be to the uninitiated.

To achieve global reliability you need GSLB. To avoid the detrimental effects of a disaster in the network or at the site level, you must be able to direct users to an active location. This is realized in most implementations through simple DNS load balancing techniques; i.e. when a user makes a request, the GSLB service responds with the IP address of an appropriate, active site (a minimal sketch of this decision appears at the end of this post). GSLB is capable of much more complex decision making, however, and decisions can be based on a variety of business and operational parameters, at the discretion of the organization. The GSLB service monitors each of the local sites, and is able to detect an outage within seconds and begin directing users elsewhere. At the local level, clustering and load balancing also monitor the “health” of individual instances and can react similarly in the event of a failure, but do so only at the local level. If the site fails, as might be the case in the event of a disaster, the local service is unable to do anything about it. It can’t redirect globally, it can’t notify other components. It’s just gone.

For disaster recovery purposes, this is important stuff. When cloud first drifted onto the scene it was postulated that the cheaper compute would make implementing secondary data centers specifically for disaster recovery purposes more financially feasible for a wider variety of organizations. While that’s true in the sense that it’s way cheaper than building a secondary data center, many of the technological foundations remain the same: GSLB and a replicated environment. Some folks balk at the replication and point to transparent migration as a solution. After all, why pay even pennies on the hour for instances that may never be put into commission?
The problem is that transparent migration of virtual machines is only useful while the VMs are live and running. If they aren’t, as might be the case in the event of a disaster, the site can’t be replicated and global reliability fails. A cluster-to-cluster failover via a bridged network to the cloud might sound like a good idea, but it isn’t practical when applied to a disaster recovery scenario. Too much depends on the availability of the site, of the network, and of the clustering/load balancing mechanism itself. If any one of the components has failed, global reliability is unrealizable.

To achieve true global reliability, regardless of the involvement of cloud computing, you’re going to need to implement a good old-fashioned GSLB architecture, complete with the network components and replicated application infrastructure. Local reliability (inside) may be achievable with virtual clustering solutions, but global reliability requires a very different architecture and set of technologies. Disaster recovery strategies cannot rely on local reliability; they must be based on global reliability.

WILS: Write It Like Seth. Seth Godin always gets his point across with brevity and wit. WILS is an attempt to be concise about application delivery topics and just get straight to the point. No dilly dallying around.

- Back to Basics: Load balancing Virtualized Applications
- The Cost of Ignoring ‘Non-Human’ Visitors
- Cloud Bursting: Gateway Drug for Hybrid Cloud
- The HTTP 2.0 War has Just Begun
- Why Layer 7 Load Balancing Doesn’t Suck
- Network versus Application Layer Prioritization
- WILS: The Many Faces of TCP
- WILS: WPO versus FEO
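To make the GSLB discussion in this post a little more concrete, here is a minimal, hypothetical sketch of the DNS-style decision described above: monitor each site and answer name queries with the address of an active one. The site names, addresses, and the stubbed health check are assumptions; a real GSLB service weighs many more business and operational parameters.

```python
# Hypothetical GSLB decision logic: answer with the IP of an active site.
SITES = [
    {"name": "primary-dc", "ip": "198.51.100.10"},
    {"name": "dr-site",    "ip": "203.0.113.10"},
]

def site_is_healthy(site: dict) -> bool:
    """Stand-in for the GSLB health monitor (ICMP/TCP/HTTP probes in practice)."""
    return site.get("healthy", True)

def resolve(hostname: str) -> str:
    """Return the IP of the first active site; the primary answers while it is up."""
    for site in SITES:
        if site_is_healthy(site):
            return site["ip"]
    raise RuntimeError(f"no active site available for {hostname}")

# Simulate a disaster at the primary data center.
print(resolve("app.example.com"))   # primary answers while healthy
SITES[0]["healthy"] = False         # primary site goes down
print(resolve("app.example.com"))   # users are now directed to the DR site
```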
Not all application requests are created equal

ArsTechnica has an interesting little article on what Windows Azure is and is not. During the course of discussion with Steven Martin, Microsoft's senior director of Developer Platform Product Management, a fascinating – or disturbing, in my opinion – statement was made:

There is a distinction between the hosting world and the cloud world that Martin wanted to underline. Whereas hosting means simply the purchase of space under certain conditions (as opposed to buying the actual hardware), the cloud completely hides all issues of clustering and/or load balancing, and it offers an entirely virtualized instance that takes care of all your application's needs. [emphasis added]

The reason this is disturbing is because not all application requests are created equal and therefore should not necessarily be handled in the same way by a "clustering and/or load balancing solution". But that's exactly what hiding clustering and/or load balancing ends up doing. While it's nice that the nitty-gritty details are obscured in the cloud from developers and, in most cases today, the administrators as well, the lack of control over how application requests are distributed actually makes the cloud and its automatic scalability (elasticity) less effective. To understand why, you need a bit of background regarding industry-standard load balancing algorithms.

In the beginning there was Round Robin, an algorithm that is completely application agnostic and simply distributes requests based on a list of servers, one after the other. If there are five servers in a pool/farm/cluster, then each one gets a turn. It's an egalitarian algorithm that treats all servers and all requests the same. Round Robin achieves availability, but often at the cost of application performance.

When application performance became an issue we got new algorithms like Least Connections and Fastest Response Time. These algorithms tried to take into account the load on the servers in the pool/farm/cluster before making a decision, and could therefore improve utilization such that application performance started getting better. But these algorithms only consider the server and its load, and don't take into consideration the actual request itself. And therein lies the problem, for not all requests are created equal. A request for an image requires X processing on a server and Y memory and is usually consistent across time and users. But a request that actually invokes application logic and perhaps executes a database query is variable in its processing time and memory utilization. Some may take longer than others, and require more memory than others. Each request is a unique snowflake whose characteristics are determined by user, by resource, and by the conditions that exist at the time it was made.

It turns out that in order to effectively determine how to load balance requests in a way that optimizes utilization on servers and offers the best application performance, you actually have to understand the request. That epiphany gave rise to layer 7 load balancing and the ability to exact finer-grained control over load balancing. Between understanding the request and digging deeper into the server – understanding CPU utilization, memory, network capacity – load balancers were suddenly very effective at distributing load in a way that made sense on a per-request basis. The result was better architectures, better performing applications, and better overall utilization of the resources available.
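The difference between a server-centric algorithm and a request-aware one can be sketched in a few lines. In this hypothetical example, least connections counts every request the same, while the request-aware variant also classifies the request and accounts for how much work it will create; the cost model and server names are invented for illustration.

```python
# Hypothetical comparison: connection counting vs. request-aware cost accounting.
def request_cost(uri: str) -> int:
    """Assumed cost model: dynamic requests are far heavier than static ones."""
    return 10 if uri.endswith((".php", ".asp")) else 1

def least_connections(load: dict, uri: str) -> str:
    """Server-centric: every request counts the same, regardless of what it is."""
    server = min(load, key=load.get)
    load[server] += 1
    return server

def request_aware(load: dict, uri: str) -> str:
    """Layer 7 aware: place the request based on how much work it will create."""
    server = min(load, key=load.get)
    load[server] += request_cost(uri)
    return server

requests = ["/report.php", "/logo.gif", "/search.php", "/style.css", "/cart.php"]
for picker in (least_connections, request_aware):
    load = {"web-1": 0, "web-2": 0}
    placement = [picker(load, uri) for uri in requests]
    print(picker.__name__, placement)
```

Run against the same request stream, the connection-counting version piles all three expensive dynamic requests onto one server, while the request-aware version spreads them out.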
Now comes the cloud and its "we hide all the dirty infrastructure details from you" mantra. The problem with this approach is simple: a generic load balancing algorithm is not the most effective method of distributing load across servers, but a cloud provider is not prescient and therefore has no idea what algorithm might be best for your application. Therefore the provider has very little choice in which algorithm is used for load balancing, and any choice made will certainly provide availability but will likely not be the most effective for your specific application.

So while it may sound nice that all the dirty details of load balancing and clustering are "taken care of for you" in the cloud, it's actually doing you and your application a disservice. Hiding the load balancing and/or clustering capabilities of the cloud, in this case Azure, from the developer is not necessarily the bonus Martin portrays it to be. The ability to control how requests are distributed is just as important in the cloud as it is in your own data center. As Gartner analyst Daryl Plummer points out, underutilizing resources in the cloud, as may happen when using simplistic load balancing algorithms, can be as expensive as running your own data center and may negatively impact application performance. Without some input into the configuration of load balancers and other relevant infrastructure, there isn't much you can do about that, either, but start up another instance and hope that horizontal scalability will improve performance – at the expense of your budget.

Remember that when someone else makes decisions for you, you are necessarily giving up control. That's not always a bad thing. But it's important for you to understand what you are giving up before you hand over the reins. So do your research. You may not have direct control, but you can ask about the "clustering and/or load balancing" provided and understand what effect that may – or may not – have on the performance of your application and the effectiveness of the utilization of the resources for which you are paying.
Why you should not use clustering to scale an application

It is often the case that application server clustering and load balancing are mistakenly believed to be the same thing. They are not. While server clustering does provide rudimentary load-balancing functionality, it does a better job of providing basic failover and availability assurance than it does load balancing. In fact, load balancing has effectively been overtaken by application delivery, which builds on load balancing but is much, much more than that today.

Clustering essentially turns one instance of an application server into a controlling node, a proxy of sorts, through which requests are funneled and then distributed amongst several instances of application servers. Sounds like load balancing, on the surface, but digging deeper will reveal there are many reasons why application server clustering will not support long-term scalability and efficiency. Aside from the obvious hardware-accelerated functions provided by an application delivery controller (a.k.a. modern load balancer), there are a number of other reasons to look to options other than application server clustering when you are trying to build out a scalable, efficient application architecture. Here are the top three reasons you should reconsider (or not consider in the first place) a scalability solution centered around application server clustering technology.

JUST LOAD BALANCING ISN'T EFFICIENT

Simple load balancing is not efficient. It uses industry-standard algorithms ultimately derived from network load balancing to distribute requests across a pool (or farm) of servers. Those algorithms don't take into consideration the wide variety of factors that can affect not only the capacity of an application but also its performance. There is no intelligence, no real awareness of the application, in an application server clustering architecture, and thus the solution does not utilize resources in a way that squeezes as much capacity and performance as possible out of applications. Application server clustering also lacks many of the features available in today's application delivery controllers that enhance the efficiency of servers and supporting infrastructure. Optimization of core protocols and reuse of connections can dramatically increase the efficiency and performance of applications, and neither option is available in application server clustering solutions. That's because the application server clustering solution relies on the same core protocol stack (TCP/IP) as the application server and operating system, and neither is optimized for scalability.

LACK OF SUPPORT FOR CLOUD COMPUTING AND VIRTUALIZED ENVIRONMENTS

Dynamism is the ability of your application and network infrastructure to handle the expansion and contraction of applications in an on-demand environment. If you're considering building your own private cloud computing environment and taking advantage of the latest style of computing, you'll want to consider options other than application server clustering to serve as your "control node". Aside from failing to exhibit the four core properties necessary in a cloud computing infrastructure (transparency, scalability, security, and application intelligence), application server clustering itself is not designed to handle a fluid application infrastructure. Like early load balancers, it expects to manage a fixed number of servers in a farm and that the number (and location) will remain the same.
Its configuration is static, not dynamic, and it is not well suited to automatically adjusting to changing infrastructure conditions in the data center (see the sketch at the end of this post). Virtualization initiatives put similar demands on controlling solutions like application delivery controllers and application server cluster controllers; demands that cannot be met by application server cluster controllers due to the static nature of their configuration.

IT ISN'T SCALABLE

When it comes down to it, there is only one reason you really need to stay away from application server clustering as a mechanism for scaling your applications: application server clustering doesn't scale well. Think about it this way: you are trying to scale out an application by taking an instance of the application server (the one you need to scale, by the way) and turning it into a controlling node. While the application server clustering functionality is likely capable of supporting twice the number of concurrent connections as a single instance running an application, it isn't likely to be able to handle three or four times that number. You are still limited by the software, by the operating system, and by the hardware capabilities of the server on which the clustering solution is deployed.

The number of web sites that are static and do not involve dynamic components served from application servers of some kind is dwindling. Most sites recognize the impact of Web 2.0 on their customer base and necessarily include dynamic content as the primary source of web site content. That means they're trying to serve a high number of concurrent customers on traditional application server technology solutions. Scaling those applications is an important part of deploying a site today, both to ensure availability and to meet increasingly demanding performance requirements. Application server clustering technology wasn't designed for this kind of scalability, and there's a reason that folks like Microsoft, Oracle/BEA, and IBM partner with hardware application delivery solution providers: they know that in order to truly scale an application, you're going to need a hardware-based solution. Application server vendors build application servers that are focused on building, deploying, and serving up rich, robust applications. And every one of them has said in the past, "Use a hardware load balancer to scale." If the recommendation of your application server vendor isn't enough to convince you that application server clustering isn't the right choice for scaling web applications, I don't know what is.
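As referenced above, here is a small, hypothetical sketch of the kind of dynamic pool membership a fluid, on-demand environment requires: members join and leave at runtime, and the distribution logic simply works with whoever is currently present. It illustrates the concept only and is not any vendor's API.

```python
import threading

class DynamicPool:
    """A pool whose membership can change at runtime, unlike a static cluster config."""

    def __init__(self):
        self._members = []
        self._index = 0
        self._lock = threading.Lock()

    def add(self, server: str) -> None:
        with self._lock:
            if server not in self._members:
                self._members.append(server)   # new instance provisioned on demand

    def remove(self, server: str) -> None:
        with self._lock:
            if server in self._members:
                self._members.remove(server)   # instance decommissioned or failed

    def next_server(self) -> str:
        with self._lock:
            if not self._members:
                raise RuntimeError("no available members in the pool")
            self._index = (self._index + 1) % len(self._members)
            return self._members[self._index]

pool = DynamicPool()
pool.add("vm-1"); pool.add("vm-2")
print([pool.next_server() for _ in range(4)])  # traffic spread across two instances
pool.add("vm-3")                               # demand grows: a third instance joins
pool.remove("vm-1")                            # the first instance is retired
print([pool.next_server() for _ in range(4)])  # distribution adapts automatically
```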
Clustering versus load-balancing

What's the difference, really? There are actually quite a few differences, even if you ignore that clustering is generally used to refer to the capability of a software product to provide load-balancing services, while load-balancing often refers to a hardware-based (or at least third-party software) solution. Clustering is most often used in conjunction with application servers such as BEA WebLogic, IBM WebSphere, and Oracle AS (10g), as are the load-balancing features found within Application Delivery Controllers (ADCs) like BIG-IP. In the world of hardware load balancers the term "pool" or "farm" is used to describe a grouping of servers across which application requests will be distributed. In the world of software load balancing the term used is "cluster". I will try to forget the use of the term factotum for this concept as it still gives me nightmares.

Scalability

Clustering typically makes one instance of an application server into a master controller through which all requests are processed and distributed to a number of instances using industry-standard algorithms like round robin, weighted round robin, and least connections. Clustering, like load balancing, enables horizontal scalability, that is, the ability to add more instances of an application server nearly transparently to increase the capacity or response-time performance of an application. Clustering features usually include the ability to ensure an instance is available through the use of ICMP ping checks and, in some cases, TCP or HTTP connection checks. ADCs typically support these same industry-standard algorithms, but add more complex calculations and parameters that can include per-server CPU and memory utilization and fastest response times. ADCs also support health monitoring capabilities, but they generally go beyond the rudimentary capabilities found in application server clustering solutions. This includes the ability to verify content or perform passive monitoring, which removes even the relatively low impact of health checking on application server instances. (A small sketch of such checks appears at the end of this post.)

Server Affinity

Clustering uses server affinity to ensure that applications requiring the user to interact with the same server during a session get to the right server. This is most often used in applications executing a process, for example order entry, in which the session is used between requests (pages) to store information that will be used to conclude a transaction, for example a shopping cart. ADCs use persistence to provide the same functionality. While clustering solutions are generally limited in the variables that can be used, ADCs can use traditional application variables as well as custom information from within the application data or network-based information.

High Availability (Failover)

Clustering solutions claim to provide HA/failover capabilities, but this failover is related to application process-level failover, not high availability of the clustering controller itself. This is an important distinction, as in the event the clustering controller instance fails, the entire system falls apart. While cluster-based load balancing provides high availability for members of the cluster, the controller instance becomes a single point of failure in the data path. ADCs are built for redundancy and include sophisticated features that not only ensure applications are still available if one ADC fails, but also replicate session state between two ADCs so that if the primary fails, application sessions are not lost.
This replication capability is also available in most clustering application server solutions.

Transparency

Many clustering solutions require that a node agent be deployed on each instance of an application server being clustered by the controller. This agent is often already deployed, so it's often not a burden in terms of deployment and management, but it is another process running on each server that is consuming resources such as memory and CPU, and it adds another point of failure into the data path. ADCs require no server-side components; they are completely transparent.

Making A Choice

So which should you choose? That depends highly on the reasons you are considering implementing either clustering or deploying an ADC, and whether or not you will have to make an additional purchase to enable clustering capabilities for your particular application server. There's also a broader question of whether you will need to provide this support for more than one application server brand. Clustering is proprietary to the application server, while ADCs can provide these services for any application or web server.

Clustering

The pros:
- Generally available as part of an enterprise package for an application server
- Solution doesn't require a lot of networking skills
- Generally less expensive than a redundant ADC deployment

The cons:
- High availability is not assured using clustering solutions
- Best practices dictate the cluster controller be deployed on separate hardware
- Requires node agents on managed application server instances
- Clustering is "proprietary" in that you can only cluster homogeneous servers

ADCs

The pros:
- Can provide high availability and load balancing across heterogeneous environments
- Offers additional value such as optimization, security, and acceleration for applications
- Transparent – doesn't require changes to applications or the servers on which they are deployed

The cons:
- Adds another piece of infrastructure to the architecture
- Generally more expensive than clustering solutions
- May require a new set of skills to deploy and manage

Want more insight into performance, configuration, and use cases? Check out this testing-based article on ADCs and this testing-based review of application server clustering.
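Finally, to make the health-monitoring comparison from the Scalability section a bit more concrete, here is a minimal, hypothetical sketch of two of the check types mentioned there: a rudimentary TCP connection check and a deeper HTTP check that also verifies content. The host, port, path, and expected marker string are all assumptions for illustration; a monitor would run checks like these on an interval and pull a member out of rotation when they fail.

```python
import socket
import http.client

def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Rudimentary availability check: can we open a TCP connection at all?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def http_content_check(host: str, port: int, path: str, expected: str) -> bool:
    """Deeper check: the server must answer 200 and the body must contain a marker."""
    conn = http.client.HTTPConnection(host, port, timeout=2.0)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        body = response.read().decode(errors="replace")
        return response.status == 200 and expected in body
    except OSError:
        return False
    finally:
        conn.close()

# Hypothetical pool member addresses; both checks will simply report False
# if nothing is listening there.
print("tcp:", tcp_check("192.0.2.10", 80))
print("http:", http_content_check("192.0.2.10", 80, "/healthcheck", "OK"))
```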