Gaming the System: The $23,698,655.93 per hour Cloud Computing Instance?

An interesting look at how automation combined with cloud computing resource brokering could go very, very wrong

Automation is not a new concept. People – regular old people – have been using it for years for tasks that require specific timing or reaction to other actions, like bidding on eBay or other auction-focused sites.

The general concept is pretty simple as it’s just an event-driven system that automatically performs an action when the specified trigger occurs. Usually, at least when money is concerned, there’s an upper limit. The action can’t be completed if the resulting total would be above a specified maximum amount.

Sometimes, however, things go horribly wrong.

THE MOST EXPENSIVE BOOK IN HISTORY

I was out trolling Facebook and happened to see a link to an article claiming a book was actually listed on Amazon for – wait for it, wait for it - $23,698,655.93. Seriously, it was listed for that much for a short period of time.

There’s a lengthy explanation of why and it turns out that an automated “pricing war” of sorts was to blame. Two competing sellers tried to keep their prices within specific percentages – one slightly below 100% of the other while the other tried to stay slightly above 100% of the other. The mathematically astute can figure out what happens when the differences were not equal – specifically the seller keeping his price higher used a higher percentage off 100% than the seller trying to stay below the other guy. Stair-step increases over time ultimately resulted in the price hitting over $23 million dollars before someone noticed what was going on.

Needless to say neither seller found a buyer at that price, mores the pity for them.

THE POTENTIAL DANGER for CLOUD BROKERS

The concept of cloud brokers, services that provide competitive bidding and essentially auctioning of cloud resources, is one that plays well into the demesne of commoditized resource services.

Commoditization, after all, engenders an environment in which the consumer indicates the value and therefore price they will pay for a resource, and generally providers of that resource respond. Allowing consumers to “bid” on the resource allows the market to determine the value in a very agile manner. Seasonal or event-driven spikes in capacity needs, for example, could allow those resources that are most valuable in those moments to rise in price, while at other times it may drive the price downward. While making it difficult, perhaps, to budget properly across a financial reporting period, such volatility can be positive as it also indicates to the market the price consumers will bear in general.

But assume that, like the Amazon marketplace, two such brokers begin setting prices based on each other rather than through market participation. Two brokers that wish to remain competitive, each with different value propositions such that one sets its price slightly lower than other, automatically, while the other sets the pricing of instances slightly higher than the other, automatically.

Indeed, you could arrive at the nearly $24 million dollar per hour cloud computing instance. Or nearly $24 million dollar block storage, or gigabit per second of bandwidth or whatever resource the two brokers are offering.

THE POTENTITIAL DANGER for DATA CENTERS

Now certainly this is an extreme – and unlikely - scenario. But if we apply the same concept to a dynamic, integrated infrastructure tasked with delivering applications based on certain business and operational parameters, you might see that the same scenario could become reality with slightly different impacts to the data center and the business it serves.

While not directly related to pricing, it is other policies regarding the security, availability and performance of a applications that could be impacted and problems compounded if controls and limitations are not clearly set upon automated responses to conditions within the data center. Policies that govern network speeds and feeds, for example, could impose limitations on users or applications based on prioritization or capacity. Other policies regarding performance might react to the initiation of those policies in an attempt to counter a degradation of performance, which again triggers a tightening of network consumption, which again triggers… You get the picture. Circular references – whether in a book or cloud computing resource market or internal to the data center infrastructure – can cascade such that the inevitable result is a negative impact on availability and performance.

Limitations, thresholds, and clear controls are necessary in any automated system. In programming we use the term “terminal condition” to indicate at what point a piece of given code should terminate, or exit, a potentially infinite loop. Such terminal conditions must be present in data center automation as a means to combat a potentially infinite loop between two or more pieces of infrastructure that control the flow of application data. Collaboration, not just integration, is critical. While Infrastructure 2.0 enables the integration necessary to support a context-aware data center architecture capable of adapting on-demand to conditions as a means of ensuring availability, security and performance goals are met, that integration requires collaboration across people – across architects and devops and admins – who can recognize such potential infinite loops and address them by implementing the proper terminal conditions in those processes.

COLLABORATION without CONTROL is BAD, M’KAY?

Whether the implementation is focusing on automating a pricing process or the enablement of a security or performance policy in the data center, careful attention to controls is necessary to avoid an infinite regression of policies that counteract one another. Terminal conditions, limitations, thresholds. These are necessary implements to ensure that the efficiencies gained through automation do not negatively impact application delivery. The slow but steady increase of a book beyond a “normal” price should have been recognized as being out of bounds in context – the context of the market, of activity, of other similar book pricing. IN the data center, the same contextual-awareness is necessary to understand why more capacity may be needed or why performance may be degrading. Is it a multi-layer (modern) attack? Is it a legitimate flash crowd of traffic? These questions must be able to be answered in order to properly adjust policies and ensure the right folks are notified in the event that changes in the volume being handled by the data center may be detrimental to not only the security but the budget of the data center and applications it is delivering.

Collaboration and integration go hand in hand, as do automation and control.