Curing the Cloud Performance Arrhythmia
Arrhythmias are most often associated with the human heart. The heart beats in a specific, known and measurable rhythm to deliver oxygen to the entire body in a predictable fashion. Arrhythmias occur when the heart beats irregularly. Some arrhythmias are little more than annoying, such as PVCs, but others can be life-threatening, such as ventricular fibrillation. All arrhythmias should be actively managed.
Inconsistent application performance is much like a cardiac arrhythmia. Users may experience a sudden interruption in performance at any time, with no real rhyme or reason. In cloud computing environments, this is more likely, because there are relatively few, if any, means of managing these incidents.
A 2011 global study on cloud conducted on behalf of Alcatel-Lucent showed that while security is still top of mind for IT decision makers considering cloud computing, performance – in particular reliable performance – ranks higher on the list of demands than security or costs.
THE PERFORMANCE PRESCRIPTION
One of the underlying reasons for performance arrhythmias in the cloud is a lack of attention paid to TCP management at the load balancing layer. TCP has not gotten any lighter during our migration to cloud computing and while most enterprise implementations have long since taken advantage of TCP management capabilities in the data center to redress inconsistent performance, these techniques are either not available or simply not enabled in cloud computing environments.
Two capabilities critical to managing performance arrhythmias of web applications are caching and TCP multiplexing. These two technologies, enabled at the load balancing layer, reduce the burden of delivering content on web and application servers by offloading to a service specifically designed to perform these tasks – and do so fast and reliably.
In doing so, the load balancer is able to process the 10,000th connection with the same vim and verve as the first. This is not true of servers, whose ability to process connections degrades as load increases. That degradation raises response latency, which the end user experiences as degrading performance.
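To make the multiplexing idea concrete, here is a minimal sketch in Python. It is a toy model, not how any particular load balancer implements the feature: there are no real sockets, and the names `Server`, `ServerConnection`, `MultiplexingLoadBalancer` and the `max_idle` parameter are all invented for illustration. It only counts how many connections the server has to accept when requests arrive directly versus through a proxy that reuses server-side connections.

```python
from collections import deque


class Server:
    """Stand-in for a web server; counts the TCP connections it has to accept."""

    def __init__(self):
        self.connections_accepted = 0

    def accept(self):
        # Each call models one full TCP connection setup on the server.
        self.connections_accepted += 1
        return ServerConnection(self)


class ServerConnection:
    """An open, reusable connection to the server (no real sockets involved)."""

    def __init__(self, server):
        self.server = server

    def send(self, request):
        return f"200 OK for {request}"


class MultiplexingLoadBalancer:
    """Funnels many client requests over a small pool of server connections."""

    def __init__(self, server, max_idle=4):
        self.server = server
        self.max_idle = max_idle  # hypothetical cap on pooled idle connections
        self.idle = deque()

    def handle(self, request):
        # Reuse an idle server-side connection if one exists, else open one.
        conn = self.idle.popleft() if self.idle else self.server.accept()
        try:
            return conn.send(request)
        finally:
            # Keep the connection open for the next client instead of closing it.
            if len(self.idle) < self.max_idle:
                self.idle.append(conn)


def serve_directly(server, requests):
    # Without multiplexing: every client request costs the server a new connection.
    return [server.accept().send(r) for r in requests]


def serve_through_load_balancer(lb, requests):
    # With multiplexing: the load balancer decides when a new connection is needed.
    return [lb.handle(r) for r in requests]


if __name__ == "__main__":
    direct = Server()
    serve_directly(direct, range(10_000))

    pooled = Server()
    serve_through_load_balancer(MultiplexingLoadBalancer(pooled), range(10_000))

    print(direct.connections_accepted, pooled.connections_accepted)
```

Because the requests in this sketch arrive one after another, the pooled server accepts a single connection while the direct server accepts one per request. A real deployment sees concurrent clients and would hold more than one connection open, but the ratio is the point: the server spends its cycles on requests rather than on connection setup and teardown.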
Failure to cache HTTP objects outside the web or application server has a similar negative impact. The server must repeatedly serve up the same static content to every user, chewing up valuable resources, which eventually burdens the server and degrades performance.
Caching such objects at the load balancing layer offloads the burden of processing and delivering these objects, enabling servers to more efficiently process those requests that require business logic and data.
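The offload effect of caching can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a description of any product: `ObjectCache`, its `ttl_seconds` parameter and the `origin` callable are hypothetical names, and a real HTTP cache would also honor headers such as Cache-Control, which this sketch ignores.

```python
import time


class ObjectCache:
    """Tiny time-to-live cache sitting in front of an origin server."""

    def __init__(self, origin, ttl_seconds=60.0, clock=time.monotonic):
        self.origin = origin      # callable: path -> response body
        self.ttl = ttl_seconds    # how long a cached object stays fresh
        self.clock = clock        # injectable so the cache can be tested
        self.store = {}           # path -> (body, expiry time)
        self.hits = 0
        self.misses = 0

    def get(self, path):
        now = self.clock()
        entry = self.store.get(path)
        if entry is not None and entry[1] > now:
            # Fresh copy available: the origin server is never contacted.
            self.hits += 1
            return entry[0]
        # Missing or expired: fetch once from the origin and remember it.
        self.misses += 1
        body = self.origin(path)
        self.store[path] = (body, now + self.ttl)
        return body


if __name__ == "__main__":
    origin_requests = []

    def origin(path):
        # Hypothetical origin; records every request that reaches the server.
        origin_requests.append(path)
        return b"bytes for " + path.encode()

    cache = ObjectCache(origin)
    for _ in range(1000):
        cache.get("/logo.png")

    print(len(origin_requests), cache.hits, cache.misses)
```

In this run the same static object is requested a thousand times within the freshness window, so only the first request reaches the origin. Everything else is answered from the cache, which is exactly the work the server no longer has to do.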
FAILURE IN THE CLOUD
Interestingly, customers are very aware of the disparity between cloud computing and data center environments in terms of services available.
In a recent article on this topic by Shamus McGillicuddy, Tom Hollingsworth, a senior network engineer with United Systems, an Oklahoma City-based value-added reseller (VAR), put it this way: "I want to replicate [in the cloud with] as much functionality [customers] have for load balancers, firewalls and things like that."
So why are cloud providers resistant to offering such services?
Shamus offered some insight in the aforementioned article, citing maintenance and scalability as inhibitors to cloud provider offerings in the L4-7 service space. There is also a business reality: offload technologies not only make application performance better and more consistent, they also make more efficient use of the resources available to the application. A single virtual instance can therefore serve more users, which means the customer needs fewer instances to support the same user base. Fewer instances sold means less revenue for the provider, which negatively impacts its ARPU (Average Revenue Per User), one of the key metrics used to evaluate the health and growth of providers today.
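The arithmetic behind that incentive is simple enough to sketch. The figures below are made up purely for illustration (they are not measurements and do not come from the article or any provider), and `instances_needed` is a hypothetical helper.

```python
import math


def instances_needed(concurrent_users, users_per_instance):
    """Smallest number of instances that can carry the given user load."""
    return math.ceil(concurrent_users / users_per_instance)


if __name__ == "__main__":
    users = 10_000                 # hypothetical user base
    capacity_plain = 500           # hypothetical users per instance, no offload
    capacity_offloaded = 1_000     # hypothetical users per instance with offload

    before = instances_needed(users, capacity_plain)
    after = instances_needed(users, capacity_offloaded)
    print(before, after)
```

With these assumed capacities, doubling what each instance can handle halves the number of instances the customer pays for. That is good news for the customer's bill and not such good news for the provider's per-user revenue.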
But the reality is that providers will need to start addressing these concerns if they are to woo enterprise customers and convince them the cloud is where it's at. Enabling consistent performance is a requirement, and a decade of experience has shown customers that consistent performance in a scalable environment requires more than simple load balancing – it requires the very L4-7 services that do not exist in provider environments today.
Referenced blogs & articles:
- Layer 4-7 cloud networking still scarce in IaaS market
- Understanding the market opportunity for carrier cloud services
- The Need for (HTML5) Speed
- SPDY versus HTML5 WebSockets
- QoS without Context: Good for the Network, Not So Good for the End user
- The Cloud Integration Stack
- HTML5 WebSockets: High-Speed Infrastructure Integration Bus?
- Cloud Delivery Model is about Ops, not Apps