Gotta Catch Em All. Multiple bottlenecks are a part of the IT lifestyle

My older children, like most kids in their age group, all played with or collected Pokemon cards. Just like I and all of my friends had GI Joes and discussed the strengths and weaknesses of Kung-fu grip versus hard hands, they and all of their friends sat around talking about how much cooler their current favorite Pokemon card was compared to all of the others. We let them play and kept an eye on how cards were being passed about the group (they’re small and tend to walk off, so we patrolled a bit, but otherwise stayed out of the way). And the interesting thing about Pokemon – or any other Collectible Card Game – is that as soon as you’ve settled your discussion about which card is “best”, someone picks a new favorite so you can rehash all the same issues with this new card in the mix. People – mostly but not exclusively children - honestly spend hours at this pass-time, and every time they resolve the differences, it starts all over again. The point of Pokemon is to catch and train little creatures (build a deck of cards) that will, on your command, battle other little creatures (the other players’ card deck) for supremacy. But that’s often lost in the discussions of which individual card or small combinations of cards is “best”. Everyone has their favorites and a focused direction, so these conversations can grow quite heated.

It is no mistake that I’m discussing Pokemon in an IT blog. Our role is to support the business with applications that will allow them to do  their job, or do their job better, or do things the competition can’t do. That’s why we’re here. But everyone in IT has a focus and direction – Developer, Architect, Network Admin, Systems Admin, Storage Admin, Business Analyst… The list goes on – and sometimes our conversations about how to best serve the business get quite heated. More importantly, sometimes the point of IT – to  support the business – gets lost in examining the minutiae, just like comparing two Pokemon cards when there are hundreds of cards to build decks from. There are a few – like Charizard pictured here – that are special until they’re superseded by even cooler cards.

But a lot of what we do is written in stone, and is easily lost in the shuffle. Just as no one champions the basic “energy” cards in Pokemon – because they don’t DO anything by themselves – we often don’t discuss some of the basic issues IT always has and always will struggle with, because they’re known, set in stone, and should be self-evident. Or at least we think they should. So I’ll remind you of one of the basics, and perhaps that will spur you to keep the simple stuff in mind whilst arguing over the coolest new toy in the datacenter.

 

Image courtesy of Pokebeach.com

The item I’ve chosen? There is never one bottleneck. It is a truth. If you find and eliminate the performance bottleneck of your application, you have not resolved all problems, you have simply removed a roadblock on the way to the next bottleneck. A system that ran fine last week may not be running fine this week because a new bottleneck threshold has been hit. And the bottlenecks are always – always inter-related.

(Warning – of course I reference F5 products in this list, if you have other vendors, insert their names)

Consider this, your web app is having performance problems, and you track it down to your network card utilization. So you upgrade the server or throw it behind your BIG-IP (or other ADC or a load balancer), and the problem is resolved. So now your CPU utilization is fine, but the application’s performance degrades again relatively quickly. You go researching and discover that your new bottleneck is storage. Too many high-access files on a single NAS device is slowing down simple file reads and writes. So you move your web servers to use a different NAS device (downright simple if you have ARX in-house, not too terribly difficult if you don’t), and a couple of weeks later users are complaining again. You dig and research, and all seems well to you, but there are enough complaints that you are pretty certain there’s a problem. So you call up a coworker in a remote office and have them check. They say performance stinks. So you go home that night and try it from home, and sure enough, outside the building performance stinks. Inside, it’s fine. Now your problem is your Internet connection. So you check the statistics, and back-end services like replication are burying your Internet connection. So you do some research and decide that your problems are best addressed by reducing the bandwidth required for those back-end processes and setting guaranteed bandwidth numbers for HTTP traffic. Enter WAN Optimization. If you’re an F5 customer, you just add WOM to your BIG-IP and configure it. Other vendors have a few more steps, but not terribly more than if you were not an F5 customer and bought BIG-IP with WOM to solve this problem. And once all of that clears up, guess what? We’re back to  Pikachu. Your two servers, now completely cleared of other bottlenecks, are servicing so many requests that their CPU utilization is spiking. Time for a third server.

Now this whole story sounds simple, but it isn’t. Network, Storage, Systems, all fall under the bailiwick of different groups within IT. It is never so easy as the above paragraph makes it sound… I’ve glossed over the long nights, the endless status meetings, the frustration of not finding the bottleneck right away – mine are obvious only because I list them, I skip the part where you check fifty other things first. And inevitable, there is the discussion of what’s the right solution to a given problem that starts to sound like people who discuss the “best” Pokemon card. Someone wants to cut back on the amount of bandwidth back-office applications use by turning off services, someone wants to  buy a bigger pipe, someone suggests WAN optimization, and we go a few rounds until we settle on a plan that’s best for the organization in question.

But in the end, keeping the business going and customers happy is the key to IT. Sure, clearing up one bottleneck will create another and spawn another round of “right solution” discussions, but that’s the point. It’s why you’re there. You have the skills and the expertise the company needs to keep moving forward, and this is how they’re applied. And along the way you’ll get to find the new hot toy in the datacenter and propose it as the right solution to everything, because it is your Charizard – until the next round of discussion anyway.

And admit it, this stuff is fun, just like the game. Choosing the right solution, getting it implemented, that’s what drives all good IT people. Figuring out problems that are complex enough to be called rocket science under pressure that is sometimes oppressive. But the rush is there when the solution is in and is right. And it’s often a team effort by all of the different groups in IT.

I personally think IT should throw itself more parties, but I guess we’ll just have to settle for more dinner-at-the-desk moments for the time being.

Published Feb 08, 2011
Version 1.0
No CommentsBe the first to comment