Damned if you do, damned if you don't

There has been much fervor around the outages of cloud computing providers of late, which seems to be leading to an increased and perhaps unwarranted emphasis on SLAs, the likes of which we haven't seen since... well, the last time IT saw anything outsourced reach the hype level of cloud computing. Consider this snippet of goodness for a moment, and pay careful attention to the last paragraph.

From Five Key Challenges of Enterprise Cloud Computing

I won’t beat the dead “Gmail down, EC2 down, etc down” horse here. But the truth of the matter is enterprises today cannot reasonably rely on the cloud infrastructures/platforms to run their business. There’s almost no SLAs provided by the cloud providers today. Even Jeff Barr from Amazon said that AWS only provides SLA for their S3 service.

[...]

Can you imagine enterprises signing up cloud computing contracts without SLAs clearly defined? It’s like going to host their business critical infrastructure in a data center that doesn’t have clearly defined SLA.

We all know that SLAs really doesn’t buy you much. In most cases, enterprises get refunded for the amount of time that the network was down. No SLA will cover business loss. However, as one of the CSOs I met said, it’s about risk transfer. As long as there’s a defined SLA on paper, when the network/site goes down, they can go after somebody. If there’s no SLA, it will be the CIO/CSO’s head that’s on the chopping block.

Let's look at this rationally for a moment. SLAs really don't buy you much. True. True of cloud computing providers, true of the enterprise. No SLA covers business loss. True. True of cloud computing providers, true of the enterprise.

What I find amusing about this article is that the author asks if we can imagine "signing up cloud computing contracts without SLAs clearly defined?" Well, why not? Businesses do it every day when IT deploys the latest "Business App v4.5.3.2a". Microsoft Office 2007 relies heavily on on-line components, but we don't demand an SLA from Microsoft for it. Likewise, the anti-phishing capabilities of IE7 don't necessarily come with an SLA and businesses don't shy away from making it their corporate standard anyway.

In fact, I'd argue that most cloudware today comes with an anti-SLA: use at your own risk, we don't guarantee anything.

The CIO/CSO's head is on the chopping block if he does have an SLA, because there's no guarantee that IT can meet it. Oh, usually they do, because the SLA is broadly defined for all of IT in terms of "we'll have 5 9's of availability for the network" and "applications will have less than an X second response time" and so on. But it isn't as if IT and the business sit down and negotiate SLAs for every single application they deploy into the enterprise data center.
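To put "5 9's" in perspective, here is a rough back-of-the-envelope sketch of how much downtime a given availability target actually allows. The function name and the 365-day year are my own assumptions for illustration, not anything taken from a real SLA.

```python
def allowed_downtime_minutes(availability: float, period_hours: float = 365 * 24) -> float:
    """Minutes of downtime permitted in a period at a given availability fraction.

    Assumes a 365-day year by default; real SLAs define their own
    measurement windows and exclusions.
    """
    return (1.0 - availability) * period_hours * 60


if __name__ == "__main__":
    # Two nines through five nines, expressed as fractions.
    for target in (0.99, 0.999, 0.9999, 0.99999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target:.5f} availability allows about {minutes:.1f} minutes of downtime per year")
```

Five nines works out to roughly five minutes of downtime a year, which is why a broad promise like that is a lot easier to write down than to keep.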

If they do, then they're the exception, not the rule. And the applications this is true of are so time-sensitive and mission critical that it's unlikely the responsibility for them will ever be outsourced. Financial services and brokerages are a good example of this. Outsourced? Unlikely. The IT folks responsible for the applications and networks in those industries are probably laughing uproariously at the idea.

The argument that an SLA simply places a target on someone's head regarding responsibility for uptime and performance of applications is largely true. But that would seem to indicate that if you're a CIO/CSO and can wrangle any SLA out of a cloud computing provider, you should immediately use them for everything, because you can pass the mantle of responsibility for failing to meet SLAs to them instead of shouldering it yourself.

This isn't a cloud computing problem; it's a problem of responsibility and managing expectations. It's a problem with expecting that a million moving parts, hundreds of connections, routers, switches, intermediaries, servers, operating systems, libraries, and applications will somehow always manage to be available. Unpossible, I say, and unrealistic regardless of whether we're talking cloud computing or enterprise infrastructure.
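A quick sketch shows why all those moving parts matter. It assumes, simplistically, that every component in the chain is required and that components fail independently, so the end-to-end availability is just the product of the individual ones. The component count and the 99.9% figure are hypothetical numbers picked for illustration.

```python
from math import prod
from typing import Iterable


def serial_availability(component_availabilities: Iterable[float]) -> float:
    """End-to-end availability of components that must ALL be up.

    Assumes independent failures and no redundancy, which is a
    simplification of any real network or data center.
    """
    return prod(component_availabilities)


if __name__ == "__main__":
    # Hypothetical chain: ten components (router, switch, server, OS, app, ...),
    # each individually available 99.9% of the time.
    chain = [0.999] * 10
    print(f"End-to-end availability: {serial_availability(chain):.4%}")
```

Under those assumptions, ten "three nines" components in a row already leave you at about 99%, and that holds whether the chain lives in the cloud or in your own data center.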

Basically, the CIO/CSO is damned if he has an SLA because chances are IT is going to fail to meet it at some point, and he's damned if he doesn't have an SLA because that means he's solely responsible for the reliability and performance of all of IT.

And people wonder why C-level execs command the compensation levels they do. It's to make sure they can afford the steady stream of antacids they need just to get through the day.


Published Sep 10, 2008
Version 1.0