Domain sharding is a well-known practice to improve application performance – and you can implement automatically without modifying your applications today.
If you’re a web developer, especially one that deals with AJAX or is responsible for page optimization (aka “Make It Faster or Else”), then you’re likely familiar with the technique of domain sharding, if not the specific terminology.
For those who aren’t familiar with the technique (or the term), domain sharding is a well-known practice used to trick browsers into opening many more connections with a server than is allowed by default. This is important for improving page load times in the face of a page containing many objects. Given that the number of objects comprising a page has more than tripled in the past 8 years, now averaging nearly 85 objects per page, this technique is not only useful, it’s often a requirement. Modern browsers like to limit browsers to 8 connections per host, which means just to load one page a browser has to not only make 85 requests over 8 connections, but it must also receive those requests over those same, limited 8 connections. Consider, too, that the browser only downloads 2-6 objects over a single connection at a time, making this process somewhat fraught with peril when it comes to performance. This is generally why bandwidth isn’t a bottleneck for web applications but rather it’s TCP related issues such as round trip time (latency).
Here are the two main points that need to be understood when discussing Bandwidth vs. RTT in regards to page load times:
1.) The average web page has over 50 objects that will need to be downloaded (reference: http://www.websiteoptimization.com/speed/tweak/average-web-page/) to complete page rendering of a single page.
2.) Browsers cannot (generally speaking) request all 50 objects at once. They will request between 2-6 (again, generally speaking) objects at a time, depending on browser configuration.
This means that to receive the objects necessary for an average web page you will have to wait for around 25 Round Trips to occur, maybe even more. Assuming a reasonably low 150ms average RTT, that’s a full 3.75 seconds of page loading time not counting the time to download a single file. That’s just the time it takes for the network communication to happen to and from the server. Here’s where the bandwidth vs. RTT discussion takes a turn decidedly in the favor of RTT.
So the way this is generally addressed is to “shard” the domain – create many imaginary hosts that the browser views as being separate and thus eligible for their own set of connections. This spreads out the object requests and responses over more connections simultaneously, allowing what is effectively parallelization of page loading functions to improve performance.
Obviously this requires some coordination, because every host name needs a DNS entry and then you have to … yeah, modify the application to use those “new” hosts to improve performance. The downside is that you have to modify the application, of course, but also that this results in a static mapping. On the one hand, this can be the perfect time to perform some architectural overhauls and combine domain sharding with creating scalability domains to improve not only performance but scalability (and thus availability). You’ll still be stuck with the problem of tightly-coupled hosts to content, but hey – you’re getting somewhere which is better than nowhere.
Or the better way (this is an F5 Friday so you knew that was coming) would be to leverage a solution capable of automatically sharding domains for you. No mess, no fuss, no modifying the application. All the benefits at one-tenth the work.
What BIG-IP WebAccelerator does, automatically, is shard domains by adding a prefix to the FQDN (Fully Qualified Domain Name). The user would initiate a request for “www.example.com” and WebAccelerator would go about its business of requesting it (or pulling objects from the cache, as per its configuration). Before returning the content to the user, however, WebAccelerator then shards the domain, adding prefixes to objects. The browser then does its normal processing and opens the appropriate number of connections to each of the hosts, requesting each of the individual objects. As WebAccelerator receives those requests, it knows to deshard (unshard?) the hosts and make the proper requests to the web or application server, thus insuring that the application understands the requests. This means no changes to the actual application. The only changes necessary are to DNS to ensure the additional hosts are recognized appropriately and to WebAccelerator, to configure domain sharding on-demand.
This technique is useful for improving performance of web applications and is further enhanced with BIG-IP platform technology like OneConnect which multiplexes (and thus reuses) TCP connections to origin servers. This reduces the round trip time between WebAccelerator and the origin servers by keeping connections open, thus eliminating the overhead of TCP connection management. It improves page load time by allowing the browser to request more objects simultaneously.
This particular feature falls into the transformative category of web application acceleration as it transforms content as a means to improve performance. This is also called FEO (Front End Optimization) as opposed to WPO (Web Performance Optimization) which focuses on optimization and acceleration of delivery channels, such as the use of compressing and caching.