Stop Repeating Yourself. Deduping WAN-Opt Style

Ever hang out with the person who just wants to make their point and, no matter where the conversation goes, says the same thing over and over in slightly different ways? Ever want to tell them they are doing their favorite cause/point/whatever a huge disservice by acting like a repetitive fool? That’s what your data is doing when you send it across the WAN. Ever seen the data in a database file? Or in your corporate marketing documents? R E P E T I T I V E. And under a normal backup or replication scenario, or a remote office scenario, you are sending the same sequences of bytes over and over and over. Machines may be quad word these days, but your pipe is still measured in bits. That means even most of your large integers, stored in 64 bits, carry 32 bits of redundant zeroes. Let’s not talk about all the places your corporate logo is in files, or how many times the word “the” appears in your documents.

It is worth noting, for those of you just delving into this topic, that WAN deduplication shares some features and even technologies with storage deduplication, but it is a very different beast than disk-based deduplication. The WAN has to handle an essentially unlimited stream of data running through it, and it does not have to store that data or keep differentials going forward. WAN deduplication is more along the lines of “fire and forget” (though “forget” is the wrong word, since it keeps duplicate info for future reference), while storage deduplication is “fire and remember exactly what we did”.

Thankfully, your data doesn’t have feelings, so we can offer a technological solution to its repetitive babbling. There are a growing number of products out there that tell your data “Hey! Say it once and move on!” These products either are, or implement, in-flight data deduplication. They require a system on each end, one to dedupe and one to rehydrate, and there are a variety of options the developer can choose, along with a few that you can choose, that make the deduplication more or less effective. Interestingly, some of these options are perfect for one customer’s data set and not at all high-return for another’s.

So I thought we’d talk through them generically, giving you an idea of what to ask your vendor when you consider deduplication as part of your WAN Optimization strategy.
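Before getting into those options, here is a rough Python sketch of the dedupe/rehydrate pairing described above. It is a toy under stated assumptions, not how any particular product works: the fixed 64-byte chunk size, the use of SHA-256 as the chunk fingerprint, and the names `Deduper`, `Rehydrator`, and `wire_size` are all made up for illustration, and the caches here grow without bound where a real device would cap and age them.

```python
import hashlib

CHUNK_SIZE = 64  # hypothetical fixed chunk size in bytes (an assumption)


class Deduper:
    """Sending side: swaps chunks it has already sent for short references."""

    def __init__(self):
        self.seen = set()  # fingerprints of chunks the far end should hold

    def encode(self, data: bytes):
        tokens = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).digest()
            if digest in self.seen:
                tokens.append(("ref", digest))  # say it once and move on
            else:
                self.seen.add(digest)
                tokens.append(("raw", chunk))   # first sighting: send in full
        return tokens


class Rehydrator:
    """Receiving side: rebuilds the original bytes from raw chunks and refs."""

    def __init__(self):
        self.store = {}  # fingerprint -> chunk bytes seen so far

    def decode(self, tokens) -> bytes:
        out = bytearray()
        for kind, payload in tokens:
            if kind == "raw":
                self.store[hashlib.sha256(payload).digest()] = payload
                out += payload
            else:
                out += self.store[payload]  # look up the chunk by fingerprint
        return bytes(out)


def wire_size(tokens) -> int:
    """Payload bytes that would cross the WAN (ignores framing overhead)."""
    return sum(len(payload) for _, payload in tokens)


if __name__ == "__main__":
    deduper, rehydrator = Deduper(), Rehydrator()
    document = b"the corporate logo " * 64

    first = deduper.encode(document)   # nothing cached yet
    second = deduper.encode(document)  # same bytes sent again later

    assert rehydrator.decode(first) == document
    assert rehydrator.decode(second) == document
    print(wire_size(first), wire_size(second))
```

The second transfer shrinks because every chunk is replaced by its 32-byte fingerprint. Both ends have to build the same cache in the same order, which is why a box is needed on each side of the link, and choices like chunk size are exactly the kind of option that pays off for one data set and not another.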


    


Published Jun 30, 2010
Version 1.0