#cloud #loadbalancing Scalability requires load balancing, but it doesn't require efficient or cost-effective load balancing.
It's not the first time we've heard the statement that cloud can be too expensive, and I doubt it will be the last. This latest episode comes from Alexei Rodriguez, Head of Ops at Evernote, by way of Structure 2014:
Original Tweet: https://twitter.com/joeweinman/status/479769276027379712
It is important to note that this admission - like those in the past - has come from what we call "web monsters." Web monsters are, as the name implies, web-first (and usually web-only) organizations that have millions (or billions) of users. Modern web monsters generally have only one application for which they are responsible, a la Evernote, Netflix, Facebook, etc.
It is unlikely that most enterprises will encounter this same conundrum - that of the cloud actually costing more than a DIY approach - for short-lived projects. Marketing campaigns, seasonal promotions and offerings, and the like are almost certainly never going to approach the consumption levels of a Facebook or Evernote, and thus their costs will almost certainly be less in the cloud than in house.
That's not to say that enterprises won't run into this problem, or won't need to carefully evaluate the long-term costs of cloud for an application against their own ability to service it, especially as the Internet of Things begins to arrive and push at often already bulging data center seams.
One of the ways in which cloud can end up costing more comes down to the load balancing service you choose to use.
Load balancing is at the heart of every cloud computing model. Without load balancing of some kind you can't scale, and scalability is one of cloud's biggest benefits, as well as a top driver according to North Bridge Ventures 2014 Future of Cloud.
Load balancing, of course, distributes load across multiple instances of an application to enable scale, improve performance, and maintain availability. In most cloud environments, where provider-supplied load balancing services are made available, these services are based on a scale-out model, meaning scalability is based purely on the cloning of new application instances when demand reaches a certain (usually customer-defined) threshold.
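To make the scale-out model concrete, here is a minimal sketch of a threshold check. It assumes the threshold is expressed as connections per instance; the function name and the numbers are hypothetical and not taken from any particular provider's service.

```python
import math


def desired_instances(total_connections: int,
                      per_instance_threshold: int,
                      current_instances: int) -> int:
    """Return the instance count after a scale-out check.

    Hypothetical model: a new instance is cloned whenever the existing
    ones would exceed the customer-defined threshold. This sketch only
    scales out; it never removes instances.
    """
    if per_instance_threshold <= 0:
        raise ValueError("threshold must be positive")
    needed = math.ceil(total_connections / per_instance_threshold)
    return max(current_instances, needed, 1)


# 1,000 connections against a threshold of 400 per instance needs 3 instances.
print(desired_instances(1000, 400, current_instances=2))
```

Note that nothing in this check asks how efficiently each instance is being used. It only counts demand against a fixed threshold.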
Now, that's all pretty simple stuff. All load balancing services offer scalability this way. What separates enterprise-class load balancing from the simplistic offerings from providers is the ability to optimize server-side (virtual or physical) resource utilization in order to eke out the most capacity from each one, without compromising on other service level requirements such as performance.
Enterprise-class load balancing services achieve this by using a variety of TCP optimizations designed to offload protocol overhead from the server (instance). TCP multiplexing and response buffering capabilities enable enterprise-class load balancing to improve the capacity of servers (instances) by 25% or more, on average.
Obviously if a server (instance) can serve 25% more user requests, you don't scale out as quickly. In other words, you aren't launching more instances as frequently. Which means you aren't paying for more instances as often, either. Interesting, isn't that?
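A quick back-of-the-envelope calculation shows the effect. The 25% figure comes from the paragraph above; the peak load and per-instance capacity are made-up numbers chosen only for illustration.

```python
import math


def instances_needed(peak_requests_per_sec: float,
                     base_capacity_per_instance: float,
                     capacity_gain: float = 0.0) -> int:
    """Instances required to serve the peak, given a fractional capacity gain."""
    effective_capacity = base_capacity_per_instance * (1 + capacity_gain)
    return math.ceil(peak_requests_per_sec / effective_capacity)


# Hypothetical workload: 10,000 requests/sec, 400 requests/sec per instance.
baseline = instances_needed(10_000, 400)         # no optimization
optimized = instances_needed(10_000, 400, 0.25)  # 25% more capacity per instance
print(baseline, optimized)
```

Under these assumed numbers the same peak is served by 20 instances instead of 25, which is five fewer instances to pay for.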
Enterprise load balancing services also offer a variety of load balancing algorithms, each of which has advantages and disadvantages. All load balancing services generally support the most basic of algorithms, round robin, but more sophisticated algorithms are rarely implemented. It is here, along with TCP optimizations, that efficient scalability becomes problematic. Round robin is application and server load agnostic, meaning it doesn't care if the instance selected has 400 connections while a second instance has only 50; it's still going to send that request to the next one in line. A least connections algorithm, by contrast, sends each request to the instance with the fewest active connections. While least connections may not be the most efficient algorithm available, it's definitely more application load-aware than round robin.
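The difference is easy to see in a few lines of code. This is a simplified sketch, not how any particular load balancer implements either algorithm; the instance names are hypothetical and the connection counts mirror the 400-versus-50 example above.

```python
from itertools import cycle


def round_robin(instances):
    """Return an iterator that rotates through instances, ignoring load."""
    return cycle(instances)


def least_connections(active_connections):
    """Pick the instance with the fewest active connections."""
    return min(active_connections, key=active_connections.get)


# Hypothetical pool: one busy instance, one nearly idle.
active = {"app-1": 400, "app-2": 50}

rr = round_robin(list(active))
rr_choice = next(rr)                    # next in line, regardless of load
lc_choice = least_connections(active)   # the less loaded instance

print(rr_choice, lc_choice)
```

Here round robin hands the request to app-1 simply because it is next in the rotation, while least connections picks app-2.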
Most enterprise-driven load balancing algorithms take the load on a given application instance into consideration in some way, whether through weights or connection counts. Rather than just distribute requests, they attempt to distribute them efficiently and equally in order to maximize resource utilization without impacting performance or availability.
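Weights and connection counts can also be combined. The sketch below assumes one common formulation, in which the instance with the lowest ratio of active connections to weight is chosen; the pool, names, and weights are hypothetical.

```python
def weighted_least_connections(pool):
    """Pick the instance with the lowest connections-to-weight ratio.

    pool maps an instance name to (active_connections, weight), where a
    larger weight means the instance is expected to handle more load.
    """
    return min(pool, key=lambda name: pool[name][0] / pool[name][1])


# Hypothetical pool: a small instance and one rated at four times its capacity.
pool = {"small": (50, 1), "large": (120, 4)}

# small: 50 / 1 = 50.0, large: 120 / 4 = 30.0, so "large" is selected
# even though it holds more connections in absolute terms.
print(weighted_least_connections(pool))
```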
Thus, the use of simple load balancing services with rudimentary algorithmic support and an apathetic view toward server (instance) load serves to distribute load unequally.
These load balancing services do, however, serve to ensure that more instances are launched and more bandwidth is used, which necessarily incurs additional costs.
The load balancing service you choose does ultimately impact the overall cost of cloud. While it's not the primary cause behind observations from organizations like Evernote, it's certainly a contributor.