cookies
Persistent and Persistence, What's the Difference?
The English language is one of the most expressive, and confusing, in existence. Words can have different meanings based not only on context, but on placement within a given sentence. Add in the twists that come from technical jargon and suddenly you've got words meaning completely different things. This is evident in the use of persistent and persistence. While the conceptual basis of persistence and persistent is essentially the same, in practice they refer to two different technical concepts.

Both persistent and persistence relate to the handling of connections. The former is often used as a general description of the behavior of HTTP and, necessarily, TCP connections, though it is also used in the context of database connections. The latter is most often related to TCP/HTTP connection handling but almost exclusively in the context of load-balancing.

Persistent

Persistent connections are connections that are kept open and reused. The most commonly implemented form of persistent connection is HTTP, with database connections a close second. Persistent HTTP connections were implemented as part of the HTTP 1.1 specification as a method of improving the efficiency of HTTP in general. Before HTTP 1.1 a browser would generally open one connection per object on a page in order to retrieve all the appropriate resources. As the number of objects in a page grew, this became increasingly inefficient and significantly reduced the capacity of web servers while causing browsers to appear slow to retrieve data. HTTP 1.1 and the Keep-Alive header in HTTP 1.0 were aimed at improving the performance of HTTP by reusing TCP connections to retrieve objects. They made the connections persistent so that multiple HTTP requests could be sent over the same TCP connection.
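To get a feel for what connection reuse saves, here is a rough back-of-the-envelope sketch. It assumes the textbook case of a three-segment TCP handshake (SYN, SYN-ACK, ACK) and a four-segment close (FIN, ACK, FIN, ACK), one request per object, and no parallel connections; the function name and the page size are made up for illustration.

```python
HANDSHAKE_SEGMENTS = 3  # SYN, SYN-ACK, ACK
TEARDOWN_SEGMENTS = 4   # FIN, ACK, FIN, ACK


def overhead_segments(requests: int, persistent: bool) -> int:
    """Count the TCP setup/teardown segments needed to send `requests` HTTP requests.

    Only connection management is counted, not the segments carrying
    the requests and responses themselves.
    """
    if requests <= 0:
        return 0
    # A persistent connection is opened and closed once; otherwise
    # every request pays for its own connection.
    connections = 1 if persistent else requests
    return connections * (HANDSHAKE_SEGMENTS + TEARDOWN_SEGMENTS)


objects_on_page = 20  # hypothetical page with 20 objects
print(overhead_segments(objects_on_page, persistent=False))  # 140
print(overhead_segments(objects_on_page, persistent=True))   # 7
```

Real browsers open several connections in parallel, so the actual saving is smaller than this single-connection comparison suggests, but the shape of the argument is the same.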
Similarly, this notion was implemented by proxy-based load-balancers as a way to improve performance of web applications and increase capacity on web servers. Persistent connections between a load-balancer and web servers are usually referred to as TCP multiplexing. Just like browsers, the load-balancer opens a few TCP connections to the servers and then reuses them to send multiple HTTP requests.

Persistent connections, both in browsers and load-balancers, have several advantages:

Less network traffic due to less TCP setup/teardown. It requires no fewer than 7 exchanges of data to set up and tear down a TCP connection, so each connection that can be reused reduces the number of exchanges required, resulting in less traffic.

Improved performance. Because subsequent requests do not need to set up and tear down a TCP connection, requests arrive faster and responses are returned more quickly. TCP has built-in mechanisms, for example window sizing, to address network congestion. Persistent connections give TCP the time to adjust itself appropriately to current network conditions, thus improving overall performance. Non-persistent connections are not able to adjust because they are opened and almost immediately closed.

Less server overhead. Servers are able to increase the number of concurrent users served because each user requires fewer connections through which to complete requests.

Persistence

Persistence, on the other hand, is related to the ability of a load-balancer or other traffic management solution to maintain a virtual connection between a client and a specific server. Persistence is often referred to in the application delivery networking world as "stickiness" while in the web and application server demesne it is called "server affinity". Persistence ensures that once a client has made a connection to a specific server, subsequent requests are sent to the same server.
This is very important to maintain state and session-specific information in some application architectures and for handling of SSL-enabled applications.

When the first request is seen by the load-balancer it chooses a server. On subsequent requests the load-balancer will automatically choose the same server to ensure continuity of the application or, in the case of SSL, to avoid the compute-intensive process of renegotiation. This persistence is often implemented using cookies but can be based on other identifying attributes such as IP address. Load-balancers that have evolved into application delivery controllers are capable of implementing persistence based on any piece of data in the application message (payload), in the headers, or at the transport protocol (TCP) and network protocol (IP) layers.

Some advantages of persistence are:

Avoid renegotiation of SSL. By ensuring that SSL-enabled connections are directed to the same server throughout a session, it is possible to avoid renegotiating the keys associated with the session, which is compute and resource intensive. This improves performance and reduces overhead on servers.

No need to rewrite applications. Applications developed without load-balancing in mind may break when deployed in a load-balanced architecture because they depend on session data that is stored only on the original server on which the session was initiated. Load-balancers capable of session persistence ensure that those applications do not break by always directing requests to the same server, preserving the session data without requiring that applications be rewritten.
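The "choose a server on the first request, then keep choosing it" behavior described above can be sketched in a few lines. This is only an illustration, not how any particular product implements it: the class and method names are hypothetical, round-robin is assumed for the initial choice, and the session id stands in for whatever cookie value the load-balancer keys on.

```python
import itertools


class StickyBalancer:
    """Toy load-balancer: round-robin for new clients, sticky for known sessions."""

    def __init__(self, servers):
        self._round_robin = itertools.cycle(list(servers))
        self._table = {}  # session id -> server that owns the session

    def route(self, session_id=None):
        """Pick a server; `session_id` is the cookie value, if the client sent one."""
        if session_id in self._table:
            return self._table[session_id]
        return next(self._round_robin)

    def remember(self, session_id, server):
        """Record the session id seen in the chosen server's response."""
        self._table[session_id] = server


lb = StickyBalancer(["app1", "app2", "app3"])
first = lb.route()             # no cookie yet, so round-robin decides
lb.remember("abc123", first)   # the response carried a session cookie
lb.route()                     # some other new client moves the rotation on
print(lb.route("abc123") == first)  # True: the session stays where it started
```

A real persistence table also needs a timeout so entries expire with the sessions they track; that is left out here to keep the sketch short.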
Summary

So persistent connections are connections that are kept open so they can be reused to send multiple requests, while persistence is the process of ensuring that connections and subsequent requests are sent to the same server through a load-balancer or other proxy device. Both are important facets of communication between clients, servers, and mediators like load-balancers, and both increase the overall performance and efficiency of the infrastructure as well as improving the end-user experience.

Sessions and Cookies and Persistence, oh my!
At some point (you hope!) it becomes necessary to implement load-balancing for your applications. So you went out and got one, either from a hardware vendor or maybe downloaded a solution, and put it into place. Now you're ready to go, right? Maybe not just yet. Do your applications require persistence? Yes? You did remember to validate that your solution is capable of performing persistence-based load-balancing, didn't you? If you're shaking your head wondering why this application thing is important to load balancing, read on. Persistence is one of the best examples of why it's so very important to understand how the applications you will be load-balancing work, because if an application needs persistence, you may break it without a persistence-capable load-balancing solution.

The Relationship between Sessions and Cookies

Sessions are not cookies, but they can (and do) work together to create the illusion of persistence in an otherwise stateless protocol. Sometimes persistence is referred to as "stickiness", or "sticky connections." That's because what persistence does is ensure that a client connects to the "real" server on which his/her current session is active. When a user connects the first time to a site, a session is created on the server to which the user was directed. If the site is load balanced and the user is directed to a second server on the next request, a new session is created. Obviously this is not an optimal situation. What we need is some mechanism to ensure that a user is reconnected to the same server for the duration of a session. Persistence is the process of ensuring that a user is connected to the same server every time they make a request within the boundaries of a single session. Even though users could be persisted based on their IP address, this is rarely done because many users often share a single IP address.
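Why shared addresses are a problem for IP-based persistence is easy to see in a small sketch. The hashing scheme below is an assumption chosen for illustration (real products use their own methods), and the address is taken from a documentation-only range.

```python
import hashlib
from collections import Counter


def server_for_ip(client_ip, servers):
    """Map a source IP address to a server using a stable hash of the address."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["app1", "app2", "app3"]

# 500 different users behind one corporate NAT all present the same source IP.
nat_address = "203.0.113.10"
placements = Counter(server_for_ip(nat_address, servers) for _ in range(500))

# Every one of them lands on the same server, so the load is not spread at all.
print(len(placements))  # 1
```

The mapping is perfectly "sticky", but it cannot tell those 500 users apart, which is why a per-session value such as a cookie is usually the better persistence key.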
Persistence is most often implemented using a cookie containing the server session id because it is the most accurate method of determining where a user's session is currently stored. If you're a web developer or administrator, you might think "that sounds a lot like server affinity". You'd be right, of course; server affinity and persistence are two different terms that mean the same thing.

Sessions are stored on the server, and are not reliant on cookies being enabled in the client's browser. Sessions are where web developers store bits of application-relevant data that they may wish to use across requests. Shopping carts are the most ubiquitous example of session data, but there are other uses for it, especially in complex web applications like CRM (customer relationship management) or SFA (sales force automation) applications. Cookies store bits of data on the client (the browser) and are passed to the server via the HTTP header Cookie.

Without persistence, users would be unknowingly creating sessions willy-nilly across multiple web servers in a load-balanced environment. That's a waste of resources, as sessions will remain in memory on the web server until they time out according to the web server's configuration. Additionally, a lack of persistence breaks web applications, because the data stored in the session on previous requests is not accessible on Server 2 while it's still sitting over on Server 1. This is why it's so important that a load-balancing or application delivery solution is capable of handling persistence-based load distribution. If your load-balancing solution works based on an industry standard algorithm like round-robin, least-connections, or a weighted version of either, then you're likely to break those applications which require persistence because the load balancing algorithms aren't taking session persistence needs into consideration.

Persisting Connections

The most common data used to persist connections is SSL session id.
SSL connections without persistence are like crust without the bread. Yeah, it's that bad. Basically, load balancing SSL without persistence doesn't work. The second most common data used to persist connections is application or server session id, like JSESSIONID or PHPSESSID. These IDs are automatically generated by applications and web servers, and are generally passed to the client as a cookie on the first response, and then used by the load balancer to determine to which server it should direct subsequent requests.

An example of HTTP headers storing a JSESSIONID in a cookie:

    Cookie: JSESSIONID=9597856473431
    Cache-Control: no-cache
    Host: 127.0.0.2:8080
    Connection: Keep-Alive

Your chosen load-balancing or application delivery solution needs to be able to take application session data into consideration when making routing decisions. It must be able to look at HTTP headers and extract the data the web application stored to determine which server it should direct the request to, or you risk breaking your web applications and wasting resources on your servers.

ADDITIONAL RESOURCES

Colin has a great entry in his 20LOL series that implements JSessionID based persistence. The iRule should be easily modified to support other types of application ID based persistence, as long as the value is stored in the HTTP headers somewhere. And Joe has a short article on enabling session persistence on BIG-IP. Wikipedia has a great discussion on Web server session management.

Five questions you need to ask about load balancing and the cloud
Whether you are aware of it or not, if you’re deploying applications in the cloud or building out your own “enterprise class” cloud, you’re going to be using load balancing. Horizontal scaling of applications is a fairly well understood process that involves (old skool) server virtualization of the network kind: making many servers (instances) look like one to the outside world. When you start adding instances to increase capacity for your application, load balancing necessarily gets involved as it’s the way in which horizontal scalability is implemented today.

The fact that you may have already deployed an application in the cloud and scaled it up without recognizing this basic fact may lead you to believe you don’t need to care about load balancing options. But nothing could be further from the truth. If you haven’t asked already, you should. But more than that you need to understand the importance of load balancing and its implications for the application. That’s even more true if you’re considering an enterprise cloud, because it will most assuredly be your problem in the long run.

Do not be fooled; the options available for load balancing and assuring availability of your application in the cloud will affect your application – if not right now, then later. So let’s start with the five most important things you need to ask about load balancing and cloud environments regardless of where they may reside.

#5 DIRECT SERVER RETURN

If you’re going to be serving up video or audio – real-time streaming media – you should definitely be interested in whether or not the load balancing solution is capable of supporting direct server return (DSR). While there are pros and cons to using DSR, for video and audio content it’s nearly an untouchable axiom of application delivery that you should enable this capability.
DSR basically allows the server to return content directly to the client without being processed by any intermediary (other than routers/switches/etc… which of course need to process individual packets). In most load balancing situations the responses from the server are returned via the same path they took to get to the server, notably through the load balancer. DSR allows responses to return outside the path of the load balancer or, if still returning through it, to do so unmolested. In the latter scenario the load balancer basically acts as a simple packet forwarder and does no additional processing on the packets. The advantage of DSR is that it removes any additional latency imposed by additional processing by intermediaries. Because real-time streaming media is very sensitive to latency and to variation in latency (jitter), DSR is often suggested as a best practice when load balancing servers responsible for serving such content.

Question: Is it supported?

#4 HEALTH CHECKING

One of the ways in which load balancers/application delivery controllers make decisions regarding which server should handle which request is to understand the current status of the application. It’s part of being context-aware, and it provides information about the application that is invaluable not just to the load balancing decision but to the overall availability of the application.

Health checking allows a load balancing solution to determine whether a server/instance is “available” based on a variety of factors. At the simplest level an ICMP ping can be used to determine whether the server is available. But that tells it nothing of the state of the application. A three-way TCP handshake is the next “step” up the ladder, and this will tell the load balancing solution whether an application is capable of accepting connections, but still tells it nothing of the state of the application.
A simple HTTP GET takes it one step further, but what’s really necessary is the ability of the load balancing solution to retrieve actual data and ensure it is valid in order to consider an application “available”. As the availability of an application may be (and should be if it is not) one way to determine whether new instances are necessary or not, the ability to determine whether the actual application is available and responding appropriately is important in keeping costs down in a cloud environment, lest instances be launched for no reason or – more dangerously – instances are not launched when necessary due to an outage or failure.

In an external cloud environment it is important to understand how the infrastructure determines when an application is “available” or “not”, based on such monitoring, as the subtle differences in what is actually being monitored/tested can impact application availability.

Question: What determines when an application (instance) is available and responding as expected?

#3 PERSISTENCE

Persistence is one of the most important facets of load balancing that every application developer, architect, and network professional needs to understand. Nearly every application today makes heavy use of application sessions to maintain state, but not every application utilizes a shared database model for its session management. If you’re using standard application or web server session features to manage state in your application, you will need to understand whether the available load balancing solutions support persistence and how that persistence is implemented.

Persistence basically ensures that once a user has been “assigned” a server/instance, all subsequent requests go to that same server/instance in order to preserve access to the application session. Persistence can be based on just about anything depending on the load balancing solution available, but most commonly it is based on either the source IP address or a cookie.
In the case of the former there’s very little for you to do, though you should be somewhat concerned over the use of such a rudimentary method of enabling persistence as it is quite possible – probable, in fact – that many users will be sharing the same source IP address based on NAT and masquerading at the edge of corporate and shared networks. If the persistence is cookie-based then you’ll need to understand whether you have the ability to determine what data is used to enable that persistence. For example, many applications use PHPSESSID or ASPSESSIONID as it is routine for those environments to ensure that these values are inserted into the HTTP header and are available for such use. But if you can’t configure the option yourself, you’ll need to understand what values are used for persistence and ensure your application can support that value in order to match up users with their application state.

Question: How is persistence implemented?

#2 QUIESCING (BLEEDING) CONNECTIONS

Part of the allure of a cloud architecture is the ability to provision resources on-demand. With that comes the assumption that you can also de-provision resources when they are no longer needed. One would further hope this process is automated and based on a policy configurable by the user, but we are still in the early days of cloud so that may be just a goal at this point.

Load balancers and clustering solutions can usually be told to begin quiescing (bleeding off) connections. This means that they stop distributing new requests to the specified servers/instances but allow existing users to continue using the application until they are finished. It basically takes a server/instance out of the “rotation” but keeps it online until all users have finished and the server/instance is no longer needed. At that point either through a manual or automated process the server/instance can be de-provisioned or taken offline.
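The quiescing behavior just described can be sketched roughly as follows. This is an illustration under stated assumptions rather than any vendor's implementation: the `Pool` class and its method names are hypothetical, and new sessions are assumed to go to the least-loaded server still in rotation.

```python
class Pool:
    """Toy server pool that supports quiescing (bleeding off) a member."""

    def __init__(self, servers):
        self.sessions = {name: 0 for name in servers}  # server -> open sessions
        self.draining = set()                          # servers out of rotation

    def quiesce(self, name):
        """Stop sending new sessions to `name`; existing sessions carry on."""
        self.draining.add(name)

    def open_session(self):
        """Place a new session on the least-loaded server still in rotation."""
        candidates = [s for s in self.sessions if s not in self.draining]
        if not candidates:
            raise RuntimeError("no servers left in rotation")
        target = min(candidates, key=lambda s: self.sessions[s])
        self.sessions[target] += 1
        return target

    def close_session(self, name):
        """Record that a user on `name` has finished."""
        if self.sessions[name] > 0:
            self.sessions[name] -= 1

    def removable(self, name):
        """A server can be de-provisioned once it is draining and empty."""
        return name in self.draining and self.sessions[name] == 0


pool = Pool(["a", "b"])
pool.open_session()         # goes to "a"
pool.open_session()         # goes to "b"
pool.quiesce("a")
print(pool.removable("a"))  # False: one user is still on "a"
pool.close_session("a")
print(pool.removable("a"))  # True: safe to take "a" offline
```

Whether `quiesce` is called by an operator or by a policy watching connection counts, response times, or the clock is exactly the manual-versus-automated question to put to a provider.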
This is often used in traditional data centers to enable maintenance such as patching/upgrades to occur without interrupting application availability. By taking one server/instance at a time offline the other servers/instances remain in service, serving up requests. In an on-demand environment this is of course used to keep costs controlled by only keeping the instances necessary for current capacity online.

What you need to understand is whether that process is manual, i.e. you need to push a button to begin the process of bleeding off connections, or automated. If the latter, then you’ll need to ask about what variables you can use to create a policy to trigger the process. Variables might be number of total connections, requests, users, or bandwidth. It could also, if the load balancing solution is “smart enough”, include application performance (response time) or even time of day variables.

Question: How do connections quiesce (bleed) off – manually or automatically based on thresholds?

#1 FAILOVER

We talk a lot about the cloud as a means to scale applications, but we don’t very often mention availability. Availability usually means there needs to be in place some sort of “failover” mechanism, in case an application or server fails. Applications crash, hardware fails, these things happen. What should not happen, however, is that the application becomes unavailable because of these types of inevitable problems.

If one instance suddenly becomes unavailable, what happens? That’s the question you need to ask. If there is more than one instance running at that time, then any load balancing solution worth its salt will direct subsequent requests to the remaining available instances. But if there are no other instances running, what happens? If the provisioning process is manual, you may need to push a button and wait for the new instance to come online.
If the provisioning process is not manual, then you need to understand how long it will take for the automated system to bring a new instance online, and perhaps ask about the ability to serve up customized “apology” pages that reassure visitors that the site will return shortly.

Question: What kind of failover options are available (if any)?

THERE ARE NO STUPID QUESTIONS

Folks seem to talk and write as if cloud computing relieves IT staff (customers) of the need to understand the infrastructure and architecture of the environments in which applications will be deployed. Because there is an increasingly symbiotic relationship between applications and their infrastructure – both network and application network – this fallacy needs to be exposed for the falsehood it is. It is more important today, with cloud computing, than it ever has been for all of IT – application, network, and security – to understand the infrastructure and how it works together to deliver applications.

That means there are no stupid questions when it comes to cloud computing infrastructure. There are certainly other questions you can – and should – ask a potential provider or vendor in order to make the right decision about where to deploy your applications. Because when it comes down to it, it’s your application, and your customers, partners, and users are not going to be calling/e-mailing/tweeting the cloud provider; they’re going to be gunning for you if things don’t work as expected.

Getting the answers to these five questions will provide a better understanding of how your application will handle unexpected failures, allow you to plan appropriately for maintenance/upgrades/patches, and help you formulate the proper policies for dealing with the nuances of a load balanced application environment. Don’t just ask about product/vendor and hope that will answer your questions.
Sure, your cloud provider may be using F5 or another advanced application delivery platform, but that doesn’t mean that they’re utilizing the product in a way that would offer the features you need to ensure your application is always available. So dig deeper and ask questions. It’s your application, it’s your responsibility, no matter where it ends up running.

20 Lines or Less #9
What could you do with your code in 20 Lines or Less? That's the question I ask every week, and every week I go looking to find cool new examples that show just how flexible and powerful iRules can be without getting in over your head.

This week I've got a combination of entries from our awesome forum users, and a rule I wrote a while back to meet a certain need at the time. We're almost at 10 editions of the 20LoL, and I'm looking forward to many more. Hopefully you're still finding it interesting and useful. Shoot me a line and let me know what's good, what's bad, what can be better and what you want to hear about. In the meantime, here's this week's 20 Lines or Less.

Multi-Conditional Redirect

http://devcentral.f5.com/s/Default.aspx?tabid=53&forumid=5&postid=25219&view=topic

Hoolio delivers this short and sweet iRule in the forums to show how you can use multiple pieces of data to decide when to perform a redirect. Not only does he make use of a normal string comparison, but also an IP::addr comparison against the client's IP address. So in one line you're getting two comparisons on two different pieces of data. This is a good example for someone looking to redirect only a small subset of people, based on multiple pieces of data.

    when HTTP_REQUEST {
      if { [string tolower [HTTP::path]] ends_with "/_grid/print/print_data.aspx" \
        and (not ([IP::addr [IP::client_addr]/8 equals 10.0.0.0]))} {
        HTTP::redirect "http://google.com"
      }
    }

Syslog Priority Rewriting

This is a variation on some actual code I wrote a while back to translate the syslog priority numbers when needed. Depending on the different syslog configurations, these numbers may not line up. This can be a problem when you're trying to aggregate many syslog systems into one main log server. This iRule shows how you can catch these messages inline and modify them with whatever equation fits your environment.
    when CLIENT_DATA {
      set pri [regexp -inline {} [UDP::payload] ]
      set newPri [expr ( ($pri - (($pri / 6) * 8) ) ) ]
      regsub $pri [UDP::payload] $newPri newPayload
      UDP::payload replace 0 [UDP::payload length] $newPayload
    }

Duplicate Cookie Definitions

http://devcentral.f5.com/s/Default.aspx?tabid=53&forumid=5&postid=25215&view=topic

Going back to the forums, it seems that hoolio is at it again. In this cool example he shows a fellow community member how to check for and remove multiple Set-Cookie entries with the same name. This way they can ensure that there is only one cookie present, regardless of how many times different apps may have tried to set it. This one looks a little long, but remove the comments and some of the white space, and it's under 20 lines...I checked.

    when HTTP_RESPONSE {
      # Insert some test response headers
      HTTP::header insert Set-Cookie {SESSIONID=AAAAAAAA; domain=.domain.com; path=/path/1}
      HTTP::header insert Set-Cookie {keeper=don't delete; domain=.domain.com; path=/path/2}
      HTTP::header insert Set-Cookie {SESSIONID=BBBBBBBB; domain=.domain.com; path=/path/3}
      HTTP::header insert Set-Cookie {SESSIONID=CCCCCCCC; domain=.domain.com; path=/path/4}
      log local0. "Set-Cookie header values: [HTTP::header values Set-Cookie]"
      log local0. "First Set-Cookie header which starts with SESSIONID: \
        [lsearch -glob -inline [HTTP::header values Set-Cookie] "SESSIONID*"]"
      log local0. "Last Set-Cookie header which starts with SESSIONID: \
        [lsearch -glob -inline -start end [HTTP::header values Set-Cookie] "SESSIONID*"]"
      set set_cookie_header [lsearch -glob -inline -start end [HTTP::header values Set-Cookie] "SESSIONID*"]
      log local0. "\$set_cookie_header: $set_cookie_header"
      # Remove all SESSIONID cookies
      while {[HTTP::cookie exists SESSIONID]} {
        HTTP::cookie remove SESSIONID
      }
      log local0. "Set-Cookie values: [HTTP::header values Set-Cookie]"
      # Re-insert the last SESSIONID Set-Cookie header
      HTTP::header insert Set-Cookie $set_cookie_header
      log local0. "SESSIONID cookie: [HTTP::cookie SESSIONID]"
    }

There you have it, 3 more examples in under 60 lines of code. Keep checking back every week to see what cool things can be done in just a few keystrokes. Many thanks to the awesome community and the people posting these examples. You're truly making DC a great place to be.

#Colin

4 things you can do in your code now to make it more scalable later
No one likes to hear that they need to rewrite or re-architect an application because it doesn't scale. I'm sure no one at Twitter thought that they'd need to be overhauling their architecture because it gained popularity as quickly as it did. Many developers, especially in the enterprise space, don't worry about the kind of scalability that sites like Twitter or LinkedIn need to concern themselves with, but they still need to be (or at least should be) concerned with scalability in general and the effects of inserting an application into a high-scalability environment, such as one fronted by a load balancer or application delivery controller.

There are some very simple things you can do in your code, when you're developing an application, that can ease the transition into a high-availability architecture and that will eventually lead to a faster, more scalable application. Here are four things you can do now - and why - to make your application fit better into a high availability environment in the future and avoid rewriting or re-architecting your solutions later.

1. Don't assume your application is always responsible for cookie encryption

In today's privacy-lax Internet, encrypting cookies is the responsible thing to do. In the first iterations of your application you will certainly be responsible for handling the encryption and decryption of cookies, but later on, when the application is inserted into a high-availability environment and there exists an application delivery controller (ADC), that functionality can be offloaded to the ADC.
Offloading the responsibility for encryption and decryption of cookies to the ADC improves performance because the ADC employs hardware acceleration. To make it easier to offload this responsibility to an ADC in the future but support it early on, use a configuration flag to indicate whether you should decrypt or encrypt cookies before examining them. That way you can simply change the configuration flag later on and immediately take advantage of a performance boost from the network infrastructure.

2. Don't assume the client IP is accurate

If you need to use/store/access the client's IP address, don't assume the address your web server reports for the connection is accurate. Early on it certainly will be, but when the application is inserted into a high availability environment and a full-proxy solution is sitting in front of your application, it won't be. A full-proxy mediates between client and server, which means it is the client when talking to the server, so its IP address becomes the "client IP". Almost all full-proxies insert the real client IP address into the X-Forwarded-For HTTP header, so you should always check that header before checking the client IP address. If there is an X-Forwarded-For value, you'll more than likely want to use it instead of the client IP address. This simple check should alleviate the need to make changes to your application when it's moved into a high availability environment.

3. Don't use relative paths

Always use the FQDN (fully qualified domain name) when referencing images, scripts, etc... inside your application. Furthermore, use different host names for different content types - i.e. images.example.com and scripts.example.com. Early on all the hosts will probably point to the same server, but ensuring that you're using the FQDN now makes architecting that high availability environment much easier.
While any intelligent application delivery controller can perform layer 7 switching on any part of the URI and arrive at the same architecture, it's much more efficient to load balance and route application data based on the host name. By using the FQDN and separating host names by content type you can later optimize and tune specific servers for delivery of that content, or use the CNAME trick to improve parallelism and performance in request heavy applications.

4. Separate out API rate limiting functionality

If you're writing an application with an API for integration later, separate out the rate limiting functionality. Initially you may need it, but when the application is inserted into a high-availability environment with an intelligent application delivery controller, it can take over that functionality and spare your application from having to reject requests that exceed the set limits. Like cookie encryption, use a configuration flag to determine whether you should check this limitation or not so it can easily be turned on and off at will. By offloading the responsibility for rate limiting to an application delivery controller you remove the need for the server to waste resources (connections, RAM, cycles) on requests it won't respond to anyway. This improves the capacity of the server and thus your application, making it more efficient and more scalable.

By thinking about the ways in which your application will need to interact with a high availability infrastructure later and adjusting your code to take that into consideration you can save yourself a lot of headaches later on when your application is inserted into that infrastructure. That means less rewriting of applications, less troubleshooting, and fewer servers needed to scale up quickly to meet demand.

Happy coding!

Lightboard Lessons: What are Website Cookies?
When you visit a website today, you might notice a banner that pops up and says something like "This website uses cookies," and you have to accept them to make the site work properly. But do you know what a "cookie" is, or what it is used for? This lightboard lesson digs into the details of website cookies, so check it out to learn more!
The third greatest (useful) hack in the history of the Web
Developers have an almost supernatural ability to work around restrictions, even though some of the restrictions on building applications delivered via the web have been akin to kryptonite. Like Superman fighting through the debilitating effects of the imaginary mineral, they've gotten around those restrictions by coming up with ways to implement functionality and improve the behavior of browsers, and thus web applications, anyway. The first greatest hack was giving HTTP state. The second? Cookie-based persistence. The third? The CNAME trick.
THE PROBLEM
The "CNAME trick" came about because browsers, particularly versions of Internet Explorer prior to IE8, limited the number of connections to a single host. With only 2 connections per host name allowed and many times that number of objects on a page, the ability of IE in particular, but really all browsers, to quickly retrieve and render all those objects was hampered. This resulted in the appearance that the application performed poorly, when in reality it wasn't the application but the inherent delivery mechanisms that were slow due to limitations beyond the user's, the network admin's, and the developer's control. Users, of course, don't care about any of this. All they know is that the application they are using is slow and they want it fast. And when some of those users are corporate business users, the developers are going to hear about it, because the help desk is going to call them when it gets barraged with complaints from users. This is the real reason developers develop nearly supernatural powers of hacking; they'll do anything to stop users from complaining.
THE HACK
Developers all over (including ours inside F5, working on building our application acceleration solution, WebAccelerator) figured that if the browser was going to limit the number of connections to a single host, the answer was simply to trick the browser into thinking it was talking to more than one host. Turns out doing this is rather trivial: simply add multiple CNAMEs for the same host to DNS, and then reference those as the host for some of the objects in the page. So www.example.com becomes www1, www2, www3, and so on. This required changes to the application so that the additional host names were referenced, unless you made use of a proxy-based solution like WebAccelerator and BIG-IP Local Traffic Manager capable of rewriting outbound host names and virtualizing them to appear to the outside world as if they were a single host.
THE NEW PROBLEMS
This improved application performance, but at the cost of increasing the number of simultaneous connections to the server. This was "bad" in the sense that a web server, even a well-tuned one, can only support X simultaneous connections, and if each user consumes Y connections per page, the server can support only about X/Y concurrent users, so raising Y with this hack lowers that number. This made servers less efficient and required additional servers to ensure availability and scale.
Along comes AJAX, and the popularity of the "CNAME trick" rose rapidly. This is because AJAX became a way to provide near real-time updates to web browsers on a per-component basis. The result is a rich, interactive application that ends up maintaining a connection to the server on a nearly continual basis. So not only is a single application using more connections, and thus server resources, to load a page, it is also consuming more resources and connections throughout its execution.
Some web servers aren't good about dealing with virtual hosts.
Rather than having a single instance serve all the virtual hosts, such a server spawns more and more child instances, each one consuming resources, until there are so many instances running that each one can handle fewer concurrent users than the parent.
The CNAME trick is also difficult to scale. Every time a new CNAME is added, it must be referenced within the application, which often means modifying applications, a costly proposition in terms of time and effort. If the CNAME trick is used as a reactive measure to address poor application performance, the entire application must be modified to reference the new host names.
THE SOLUTIONS
There are several solutions to the problems created by the CNAME trick, but they all take advantage of the same core principle: a proxy-based mediator. The proxy's job is to mediate for the client, aggregating connection requests to the server and making them more efficient by reusing connections to servers, by load-balancing them more efficiently, or by rewriting requests. The advantage of implementing a proxy-based solution proactively is that applications don't necessarily need to be modified if the solution can rewrite host names in outbound responses. Basically, if the proxy is smart enough, it can change the host names in the page before it reaches the browser, and then aggregate and optimize the requests for each object as they come back in, essentially taking advantage of layer 7 switching to route all the requests to the appropriate web server.
The problem of additional connections can't be solved without a proxy of some kind. How efficiently a web server handles the additional connections can be improved with tuning and optimization, but without a mediator to deal with them, the additional connections and their consequences are going to hit that web server regardless.
THE FUTURE
This is a great hack, there is no doubt about it. It's one of the best ways to work around the problems with browser limitations.
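The application-side half of the hack, referencing www1, www2, www3 and so on for different objects, can be sketched as below. This is only an illustration under stated assumptions: the helper name `sharded_url`, the default of three host names, and the use of a CRC32 hash are all choices made for this example, and the host names are assumed to already exist as CNAMEs in DNS.

```python
import zlib


def sharded_url(path: str, shards: int = 3, domain: str = "example.com") -> str:
    """Map an object path onto one of the www1..wwwN host names.

    Hypothetical helper: a stable hash keeps each object on the same
    host name across page views, so the browser cache stays effective.
    """
    if shards < 1:
        raise ValueError("shards must be at least 1")

    # crc32 is deterministic across runs, unlike Python's built-in hash().
    index = zlib.crc32(path.encode("utf-8")) % shards + 1

    # Always emit a fully qualified URL so the browser sees distinct hosts.
    return f"http://www{index}.{domain}/{path.lstrip('/')}"
```

Every template that references an image or script would have to call a helper like this, which is exactly the application change a host-name-rewriting proxy avoids.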
Now that IE8 is increasing the connection limit from 2 to 6, the CNAME hack will be less necessary to combat the perception of poor application performance. But the problem of inefficiency and resource consumption on servers will not go away; it's likely to get worse as IE8 is adopted over the next year. Until now, not everyone implemented the "CNAME trick", and thus not everyone's infrastructure felt the strain of those increased connections. With IE8 everyone will feel the impact of additional connections upon their application infrastructure, whether they're a small to medium business, a large enterprise, or a service provider.
9 ways to use network-side scripting to architect faster, scalable, more secure applications
You may recall a recent overview on network-side scripting that described a few uses of this technology integrated with application delivery controllers. With thousands of examples of the uses of network-side scripting it's hard to choose just one to adequately represent its potential. Luckily, we don't have to stick to just one. Viva la Internet! Based on the technical session the great network-side scripting guru Colin and I ran at SD Best Practices in October, I've pulled nine ways to use network-side scripting that can enhance the scalability, security, and performance of web applications into a presentation for your viewing pleasure. These uses of network-side scripting technology improve security, performance, and scalability primarily by offloading resource intensive and shared application logic onto the application delivery platform. In a few cases the application intelligence provided by such platforms is also used to aid in the architecture of a more scalable infrastructure (e.g., application switching). I'm including the list here, but if you want the goods (replete with explanations and sample code) then you'll want to walk through the shared presentation.
Cookie encryption (security, performance, scalability)
Session persistence (scalability)
URI Rewrite (scalability, security)
Application switching (scalability, security)
Exception handling (security)
Data scrubbing (security, scalability)
Intelligent compression (performance, scalability)
LDAP connection proxying (performance, scalability)
Customized 404 responses (scalability)
Amazon Makes the Cloud Sticky
Stateless applications may be the long term answer to scalability of applications in the cloud, but until then, we need a solution like sticky sessions (persistence).
Amazon recently introduced “stickiness” to its ELB (Elastic Load Balancing) offering. I’ve written a bit before about “stickiness”, a.k.a. what we’ve called persistence for oh, nearly ten years now, so I won’t reiterate except to say, “it’s about time.” A description of why sticky sessions are necessary was offered in the AWS blog announcing the new feature:
Up until now each Load balancer had the freedom to forward each incoming HTTP or TCP request to any of the EC2 instances under its purview. This resulted in a reasonably even load on each instance, but it also meant that each instance would have to retrieve, manipulate, and store session data for each request without any possible benefit from locality of reference. -- New Elastic Load Balancing Feature: Sticky Sessions
What the author is really trying to say is that without “sticky sessions” ELB breaks applications because it does not honor state. Remember that most web applications today rely upon state (session) to store quite a bit of application and user specific data that’s necessary for the application to behave properly. When a load balancer distributes requests across instances without consideration for where that state (session) is stored, the application behavior can become erratic and unpredictable. Hence the need for “stickiness”.
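The idea behind stickiness can be sketched with a cookie that pins a client to the instance holding its session. This is a simplified illustration and not how ELB itself is implemented: the class `StickyBalancer`, the cookie name `lb_backend`, and the round-robin choice for new clients are all assumptions made for the example.

```python
import itertools
from typing import Dict, List, Optional, Tuple

COOKIE_NAME = "lb_backend"  # hypothetical cookie name, not an ELB identifier


class StickyBalancer:
    """Toy load balancer that honors a persistence cookie."""

    def __init__(self, backends: List[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._round_robin = itertools.cycle(self._backends)

    def route(self, cookies: Dict[str, str]) -> Tuple[str, Optional[str]]:
        """Return (backend, Set-Cookie value or None) for one request."""
        pinned = cookies.get(COOKIE_NAME)
        # Returning clients go back to the instance that holds their session.
        if pinned in self._backends:
            return pinned, None
        # New clients, or clients pinned to a removed instance, are assigned
        # the next backend in rotation and told to remember it.
        backend = next(self._round_robin)
        return backend, f"{COOKIE_NAME}={backend}"
```

A client with no cookie is spread across instances as before, while a client that presents the cookie keeps landing on the same instance, which is the behavior the application's session state depends on.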