In Replication, Speed Isn't the Only Issue
In the US, many people watch the entire NASCAR season without ever really paying attention to the racing. They are fixated on seeing a crash, and at the speeds NASCAR races average – 81 mph on the most complex track to 188 mph on the least curvy one – they're likely to get what they're watching for. But that misses the point of the races. The merging of man and machine to react at lightning speed to changes in the environment is what racing is about. Of course speed figures in, but it is not the only issue. Mechanical problems and the dreaded "other driver" are things every driver on the track must watch for.
I've been writing a whole lot about remote replication and keeping systems up-to-date over a limited WAN pipe, but in all of those posts I've only lightly touched upon some of the other very important issues, because first and foremost in most datacenter or storage managers' minds is "how fast can I cram out a big update, and how fast can I restore if needed?" But the other issues are more in-your-face, so in the interest of not being lax, I'll hit them a little more directly here.
Images from NASCAR.com
The cost of remote replication and backups is bandwidth. Whether that bandwidth is taken in huge bursts or leached from your WAN connection in a steady stream is merely a question of implementation. You have X amount of data to move in Y amount of time before the systems on the other end are no longer current enough to be useful. Some systems copy changes as they occur (mostly application-based replication that resembles, or is actually called, Continuous Data Protection), while others (think traditional backups) run the transfers in one large lump at a given time of day. Both move roughly the same amount of data; the only variable is the level of impact on your WAN connection. There are good reasons to implement either – a small, steady stream of data is unlikely to clog your WAN connection and will keep you closest to up-to-date, while traditional backups can be scheduled so that at peak times they use no bandwidth whatsoever and utilize the connection when there is not much other usage.
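To make that trade-off concrete, here's a back-of-the-envelope sketch in Python comparing the two approaches. The change-set size and backup window are hypothetical numbers chosen for illustration, not measurements from any real environment:

```python
# Back-of-the-envelope sketch: the same daily change set, moved two ways.
# All numbers here are hypothetical illustrations, not vendor figures.

daily_change_gb = 50                      # data changed per day (assumed)
seconds_per_day = 24 * 3600

# CDP-style: trickle the changes out as they happen, all day long.
steady_mbps = daily_change_gb * 8 * 1024 / seconds_per_day

# Traditional backup: cram the same data into a 4-hour overnight window.
window_hours = 4
burst_mbps = daily_change_gb * 8 * 1024 / (window_hours * 3600)

print(f"steady stream: {steady_mbps:.1f} Mbit/s all day")
print(f"nightly burst: {burst_mbps:.1f} Mbit/s for {window_hours} hours")
```

Same data either way – the nightly burst just needs roughly six times the instantaneous bandwidth of the steady stream, but only during a window you get to choose.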
Of course your environment is not so simple. There is always other usage, and if you're a global organization, "peak time" becomes "peak times" in a very real sense as the sun travels around the globe and different people come online at different hours. This has implications for both types of remote replication, because even the CDP style uses bandwidth in bursty bits. When you hit a peak time, changes to databases and files also peak. This can effectively throttle your connection by increasing replication bandwidth at the same moment that normal usage needs more of it.
The obvious answer to this dilemma is the same one that is obvious for every "the pipe is full" problem – get a bigger connection. But we've gone over this before: bigger connections are a monthly fee, and the larger you go, the larger the hike in price. In fact, the price grows near exponentially with capacity, and that's something most of us can't just shell out for. So the obvious answer is often a dead end. Not to mention that the smaller the city your datacenters are in, the harder it is to get more bandwidth in a single connection. This is improving in some places, but it is still very much the truth in many smaller metropolitan areas.
So what is an IT admin to do? This is where WAN Optimization Controllers come into the game. Standard disclaimer: F5 plays in this space with our WOM module.
Many users approach WAN Optimization products from the perspective of cramming more through the pipe – which most are very good at. But often the need is not for a bigger pipe; it is for a more evenly utilized pipe, or one that can differentiate between the traffic that absolutely must get through (like replication and web store orders) and traffic that doesn't have to – like YouTube streams to employees' desks.
If you could allocate bandwidth to data going through the pipe in such a way that the important data was tagged and tracked, you could reduce the chance that your backups are invalid due to network congestion, and improve the responsiveness of backups and other critical applications simply by rating them higher and allocating bandwidth to them.
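As a rough illustration of what such allocation looks like, here's a minimal weighted-fair-share sketch in Python. The traffic class names, weights, and demands are invented for the example and don't reflect the configuration of any particular product:

```python
# Hypothetical sketch of priority-weighted bandwidth allocation.
# Class names, weights, and demands below are illustrative assumptions.

pipe_mbps = 100

traffic_classes = {
    "replication":   {"weight": 5, "demand_mbps": 60},
    "web_store":     {"weight": 4, "demand_mbps": 30},
    "bulk_internet": {"weight": 1, "demand_mbps": 80},
}

def allocate(pipe, classes):
    """Weighted fair share: each class gets bandwidth proportional to its
    weight, capped at its demand; leftover is redistributed to the rest."""
    alloc = {name: 0.0 for name in classes}
    active = set(classes)
    remaining = pipe
    while active and remaining > 1e-9:
        total_weight = sum(classes[n]["weight"] for n in active)
        leftover = 0.0
        for name in list(active):
            share = remaining * classes[name]["weight"] / total_weight
            need = classes[name]["demand_mbps"] - alloc[name]
            given = min(share, need)
            alloc[name] += given
            leftover += share - given
            if alloc[name] >= classes[name]["demand_mbps"] - 1e-9:
                active.discard(name)   # demand met; free its share
        remaining = leftover
    return alloc

for name, mbps in allocate(pipe_mbps, traffic_classes).items():
    print(f"{name}: {mbps:.1f} Mbit/s")
```

The point of the toy model: when the pipe is oversubscribed, the low-weight bulk traffic is squeezed first, while replication keeps the lion's share it was rated for.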
Add WAN Optimization style on-the-fly compression and deduplication, and you're sending less data over the pipe while dedicating bandwidth to it. Leaving more room for other applications while guaranteeing that critical ones get the time they need is a huge combination. Of course, the science of bandwidth allocation requires a good solid product, and the art of it requires knowledge of your organization. Only you know what is critical and how much of your pipe that critical data needs. You can get help making these determinations, but in the end your staff has the knowledge necessary to make a go of it.
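For a feel of why compression and deduplication shrink replication traffic so dramatically, here's a toy Python sketch. The 4 KB chunk size, SHA-256 chunk hashing, and sample payload are all illustrative assumptions – real products use their own chunking and reference schemes:

```python
# Toy sketch of two WAN-optimization ideas: chunk-level deduplication
# (skip chunks the far side already has) plus compression of what remains.
# Chunk size, hashing scheme, and payload are illustrative assumptions.

import hashlib
import zlib

CHUNK = 4096

def bytes_to_send(data: bytes, seen: set) -> int:
    """Return how many bytes actually cross the wire after dedup + compression.
    `seen` holds hashes of chunks the remote side already stores."""
    sent = 0
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).digest()
        if digest in seen:
            sent += len(digest)            # send just a reference, not data
        else:
            seen.add(digest)
            sent += len(zlib.compress(chunk))
    return sent

# A highly repetitive "database dump": mostly unchanged, repeated blocks.
payload = (b"customer-record-template" * 200)[:CHUNK] * 50 + b"new data today"
remote_seen: set = set()
first = bytes_to_send(payload, remote_seen)
second = bytes_to_send(payload, remote_seen)   # re-replicate the same data
print(f"raw: {len(payload)} bytes, first pass: {first}, second pass: {second}")
```

Even the first pass sends a tiny fraction of the raw bytes, and the second pass – replicating data the far side already holds – sends almost nothing but chunk references. That's the mechanism behind the bandwidth reductions discussed above.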
But think about it: your replication taking 20–50% (or less – there are lots of variables in this number) of its current bandwidth requirements while becoming more reliable. Even if nothing in your organization runs one tiny smidgen faster (and that is highly unlikely if you're using a WAN Optimization Controller), that's a win in overall circuit usage.
And that’s huge. Like I’ve said before, don’t buy a bigger pipe, use your connection more intelligently.
Not all WAN Optimization products offer bandwidth allocation, so check with your vendor. Or call us – we've got it all built in, because WOM runs on TMOS, and all LTM functionality comes with the package.
Once you've cleared away the mechanical failures and the risks of collision, then you can focus on speed. Unlike a NASCAR driver, we don't have to live with the risk. Maybe that's why they're famous and we're geeks ;-).
And no, sorry, I’m not a NASCAR fan. Just a geek with Google.