Cloud Storage Gateways. Short term win, but long term…?
In the rush to cloud, there are many tools and technologies out there that are brand new. I've covered a few – nowhere near a complete list – but it's interesting to see what is going on out there from a broad-spectrum view. I have talked a bit about Cloud Storage Gateways here, and I'm slowly becoming a fan of this technology for those who are considering storing in the cloud tier. There are a couple of good reasons to consider these products, and I was thinking about those reasons and their standing validity. I thought I'd share where I stand on them at this time, and what I see happening that might impact their value proposition.

The two vendors I have taken some time to research while preparing this blog for you are Nasuni and Panzura. No doubt there are plenty of others, but I'm writing you a blog here, not researching a major IT initiative. So I researched two of them to have some points of comparison, and leave the in-depth vendor selection research to you and your staff. These two vendors present similar base technology and very different additional feature sets. Both rely heavily upon local caching in the controller box, both work with multiple cloud vendors, and both claim to manage compression.

Nasuni delivers as a virtual appliance, includes encryption on your network before transmitting to the cloud, automated cloud provisioning, and caching that has timed updates to the cloud but can perform a forced update if the cache gets full. It presents the cloud storage you've provisioned as a NAS on your end. Panzura delivers a hardware appliance that also presents the cloud as a NAS, works with multiple cloud vendors, handles encryption on-device, and claims global dedupe. I say claims, because "global" has a meaning that is "all", and in their case "all" means "all the storage we know about", not "all the storage you know". I would prefer a different term, but I get what they mean. Like everything else, they can't de-dupe what they don't control. They too present the cloud storage you've provisioned as a NAS on your end, but claim to accelerate CIFS and NFS also.

Panzura is also trying to make a big splash about speeding access to MS SharePoint, but honestly, as a TMM for F5 – a company that makes two astounding products that speed access to SharePoint and nearly everything else on the Internet (LTM and WOM) – I'm not impressed by SharePoint acceleration. In fact, our SharePoint Application Ready Solution is here, and our list of Application Ready Solutions is here. Those are just complete architectures we support directly, and don't touch on what you can do with the products through Virtuals, iRules, profiles, and the host of other dials and knobs. I could go on and on about this topic, but that's not the point of this blog, so suffice it to say there are some excellent application acceleration and WAN optimization products out there, so this point solution alone should not be a buying criterion.

There are some compelling reasons to purchase one of these products if you are considering cloud storage as a possible solution. Let's take a look at them.

Present cloud storage as a NAS – This is a huge benefit right now, but over time the importance will hopefully decrease as standards for cloud storage access emerge. Even if there is no actual standard that everyone agrees to, it will behoove smaller players to emulate the larger players that are allowing access to their storage in a manner that is similar to other storage technologies.
Encryption – As far as I can see this will always be a big driver. They're taking care of encryption for you, so you can sleep at night as they ship your files to the public cloud. If you're considering them for a non-public cloud, this point may still be huge if your pipe to the storage is over the public Internet.

Local Caching – With current broadband bandwidths, this will be a large driver for the foreseeable future. You need your storage to be responsive, and local caching increases responsiveness; depending upon implementation, cache size, and how many writes you are doing, this could be a huge improvement.

De-duplication – I wish I had more time to dig into what these vendors mean by dedupe. Replacing duplicate files with a symlink is simplest and most resembles existing file systems, but it is also significantly less effective than partial-file de-dupe (a rough sketch of the simpler file-level approach appears below). Let's face it, most organizations have a lot more duplication lying around in files named Filename.Draft1.doc through Filename.DraftX.doc than they do in completely duplicate files. Check with the vendors if you're considering this technology to find out what you can hope to gain from their de-dupe. This is important for the simple reason that in the cloud, you pay for what you use. That makes de-duplication more important than it has historically been.

The largest caution sign I can see is vendor viability. This is a new space, and we have plenty of history with early-entry players in a new space. Some will fold, some will get bought up by companies in adjacent spaces, some will be successful… at something other than Cloud Storage Gateways, and some will still be around in five or ten years. Since these products compress, encrypt, and de-dupe your data, and both of them manage your relationship with the cloud vendor, losing them is a huge risk. I would advise some due diligence before signing on with one – new companies in new market spaces are not always a risky proposition, but you'll have to explore the possibilities to make sure your company is protected. After all, if they're as good as they seem, you'll soon have more data running through them than you'll have free space in your data center, making eliminating them difficult at best.

I haven't done the research to say which product I prefer, and my gut reaction may well be wrong, so I'll leave it to you to check into them if the topic interests you. They would certainly fit well with an ARX, as I mentioned in that other blog post. Here's a sample architecture that would make "the Cloud Tier" just another piece of your virtual storage directory under ARX, complete with the automated tiering and replication capabilities that ARX owners thrive on. This sample architecture shows your storage going to a remote data center over EDGE Gateway, to the cloud over Nasuni, and to NAS boxes, all run through an ARX to make the client (which could be a server or a user – remember this is the NAS client) see a super-simplified, unified directory view of the entire thing. Note that this is theoretical; to my knowledge no testing has occurred between Nasuni and ARX, and usually (though certainly not always) the storage traffic sent over EDGE Gateway will be from a local NAS to a remote one, but there is no reason I can think of for this not to work as expected – as long as the Cloud Gateway really presents itself as a NAS. That gives you several paths to replicate your data, and still presents client machines with a clean, single-directory NAS that participates in ADS if required.
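As a point of reference for the de-duplication note above, here is a minimal sketch – not any vendor's actual algorithm, just my own illustration – of what whole-file duplicate detection amounts to: hash every file's contents and group files whose hashes match. Partial-file de-dupe would chunk each file and hash the chunks instead, which is where the Draft1-through-DraftX savings come from. The mount point is a made-up example.

```python
import hashlib
import os
from collections import defaultdict

def file_digest(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks so large files don't blow up memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def find_whole_file_duplicates(root):
    """Group files under 'root' by content hash; any group larger than one is duplicate data."""
    by_hash = defaultdict(list)
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_hash[file_digest(path)].append(path)
            except OSError:
                continue  # unreadable file – skip it
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}

if __name__ == "__main__":
    dupes = find_whole_file_duplicates("/mnt/nas/share")  # hypothetical mount point
    wasted = sum(
        os.path.getsize(paths[0]) * (len(paths) - 1) for paths in dupes.values()
    )
    print(f"{len(dupes)} duplicate groups, ~{wasted / 2**30:.1f} GB reclaimable")
```

Notice how little this catches compared to a chunk-level approach: two drafts that differ by a single sentence hash to completely different values, which is exactly why it's worth asking vendors what their de-dupe actually does.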
In this case Tier One could be NAS Vendor 1, Tier Two NAS Vendor 2, your replication targets securely connected over EDGE Gateway, and Tier Three (things you want to save but no longer need to replicate, for example) the cloud as presented by the Cloud Gateway. The Cloud Gateway would arbitrate between actual file systems and whatever idiotic interface the cloud provider decided to present and tell you to deal with, while the ARX presents all of these different sources as a single-directory-tree NAS to the clients, handling tiering between them, access control, etc.

And yes, if you're not an F5 shop, you could indeed accomplish pieces of this architecture with other solutions. Of course I'm biased, but I'm pretty certain the solution would not be nearly as efficient, cool, or let you sleep as well at night. Storage is complicated, but this architecture cleans it up a bit. And that's got to be good for you.

All things considered, the only issue that is truly concerning is failure of a startup cloud gateway vendor. If another vendor takes one over, they'll either support it or provide a migration path; if they are successful at something else, you'll have plenty of time to move off of their storage gateway product. So only outright failure is a major concern.

Related Articles and Blogs
Panzura Launches ANS, Cloud Storage Enabled Alternative to NAS
Nasuni Cloud Storage Gateway
InfoSmack Podcasts #52: Nasuni (Podcast)
F5's BIG-IP Edge Gateway Solution Takes New Approach to Unifying, Optimizing Data Center Access
Tiering is Like Tables or Storing in the Cloud Tier
First Conferences, 12 TB, and a Road Trip!

Thursday was quite the day for us. I mentioned earlier in the week that I was setting up the storage for Lori to digitize all of the DVDs; well, we came to the conclusion that we needed 12 terabytes of raw disk to hold movies plus music. Our current NAS total was just over four terabytes – clearly not enough. While I take it in stride that I would consider purchasing an additional 12 TB of disk space, you have to stop in awe for a moment, don't you? It was just a decade ago that many pundits were saying most enterprises didn't need more than a terabyte of data, and now I'm considering 12 for personal use? And while it isn't cheap, it is in the range of my budget, if I shop smartly. Kind of mind-boggling. Makes you wonder where most enterprises are at. Sure, they're not digitizing HD movies, but they are producing HD content, and that's just a portion of what they're turning out. I'll have to go dig up some research on current market trends.

So I ordered another NAS from Dell. For price plus simplicity, they suited our needs best – though the HP product a PR rep sent me a link to was interesting, it only had 1.5 TB of space, and I don't feel like bulk-replacing disks on brand new kit. It seems that my ARX is going to get its workout in the near future… I almost gave it up about a month ago; now I'm glad I didn't – that's four NAS devices plus a couple of large shares that it can manage for us. We're also reconfiguring backups, and I (at this time) have no idea how best to make use of the ARX in this process… It's learning time.

Shortly after I placed the order for the new NAS, we left for Milwaukee, where Lori was due to speak at the Paragon Development Services conference (they're a partner of F5's). We decided that since he was nearly three years old, it was time to take The Toddler to his first IT conference… He went, fit in well, and appears to have all the traits of conference attendees. He picked up on the "free water" thing pretty quickly, paid rapt attention, and was ready to go before she even started. Since PDS is pretty big on private clouds, and Lori knows a thing or two about clouds, their issues, their potential, and how to get started, it was a natural fit for her to speak there, and all accounts are that she did well. Yes, I had The Toddler in hand, so I left before she actually started to speak; I didn't want him to do to her what he did in church, shouting "I see a barn!" into the silence of her taking a breath. But everyone that talked to us on the way out was pleased with her presentation, and the slides were rock-solid, so I'm assuming she rocked it like she always does.

So our Thursday was pretty full of excitement, but all went well. We have a shiny new NAS on the way (Dell Storage employees, it is on the way, isn't it?), The Toddler got to see his first tech conference, and we ran all over Milwaukee looking for a hobby store afterward. The one we wanted to go to either closed or moved – the building was empty – and we ended up coming home without the hobby store stop along the way.

Lori preparing to speak at PDS Tech 2010
Free Water!

I will no doubt update you about the new NAS when it arrives, much the same as I have on previous new NAS purchases. Meanwhile, it's back to the joys of WOM for me…
Once Again, I Can Haz Storage As A Service?

While plenty of people have had a mouthful (or page full, or pipe full) of things to say about the Amazon outage, the one thing that it brings to the fore is not a problem with cloud, but a problem with storage. Long ago, the default mechanism for "high availability" was to have two complete copies of something (say a network switch), and when one went down, the other was brought up with the same IP. It is sad to say that even this is far-and-away better than the level of redundancy most of us place in our storage. The reasons are pretty straightforward: you can put in a redundant pair of NAS heads, or a redundant pair of file/directory virtualization appliances like our own ARX, but a redundant pair of all of your storage? The cost alone would be prohibitive. Amazon's problems seem to stem from a controller problem, not a data redundancy problem, but I'm not an insider, so that is just appearances. Most of us suffer from the opposite: high availability entry points protect data that is all too often a single point of failure. I know I lived through the sudden and irrevocable crashing of an AIX NAS once, and it wasn't pretty. When the third disk turned up bad, we were out of luck, and had to wait for priority shipment of new disks and then do a full restore… the entire time being down in a business where time is money.

The critical importance of the data that is the engine of our enterprises these days makes building that cost-prohibitive, truly redundant architecture a necessity. If you don't already have a complete replica of your data somewhere, it is worth looking into file and database replication technologies. Honestly, if you choose to keep your replicas on cheaper, slower disk, you can save a bundle and still have the security that even if your entire storage system goes down, you'll have the ability to keep the business running.

But what I'd like to see is full-blown storage as a service. We couldn't call it SaaS, so I'll propose we name it Storage Tiers As A Service Host, just so we can use the acronym Staash. The worthy goal of this technology would be the ability to automatically, with no administrator interaction, redirect all traffic to device A over to device B, heterogeneously. So your core datacenter NAS goes down hard – let's call it a power failure to one or more racks – and Staash detects that the primary is off-line and substitutes your secondary for it in the storage hierarchy. People might notice that files are served up more slowly, depending upon your configuration, but they'll also still be working. Given sufficient maturity, this model could even go so far as to allow them to save changes made to documents that were open at the time that the primary NAS went down, though this would be a future iteration of the concept. Today we have automated redundancy all the way to the final step; it is high time we implemented redundancy on that last little bit and made our storage more agile.

While I could reasonably argue that a file/directory virtualization device like F5's ARX is the perfect place to implement this functionality – it is already heterogeneous, it sits between users and data, and it is capable of being deployed in HA pairs… all the prerequisites for Staash to be implemented – I don't think your average storage or server administrator much cares where it is implemented, as long as it is implemented. We're about 90% there. You can replicate your data – and you can replicate it heterogeneously.
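To make the Staash idea concrete, here's a minimal sketch of the switch-over logic described above – a monitor that health-checks a primary NAS export and, when it stops answering, remounts the replica in its place. This is my own illustration, not an ARX feature or anyone's shipping product; the host names, export paths, and check interval are made up, and a real implementation would also have to worry about write fencing, split-brain, and failing back.

```python
import socket
import subprocess
import time

PRIMARY = {"host": "nas-primary.example.com", "export": "/vol/projects"}    # hypothetical
SECONDARY = {"host": "nas-replica.example.com", "export": "/vol/projects"}  # hypothetical
MOUNT_POINT = "/mnt/projects"
CHECK_INTERVAL = 10  # seconds between health checks

def nas_is_up(host, port=2049, timeout=3):
    """Cheap liveness check: can we open a TCP connection to the NFS port?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def mount(target):
    """Remount the share from the given NAS head at the same client-side path."""
    subprocess.run(["umount", "-l", MOUNT_POINT], check=False)  # lazy unmount of the dead head
    subprocess.run(
        ["mount", "-t", "nfs", f"{target['host']}:{target['export']}", MOUNT_POINT],
        check=True,
    )

def monitor():
    active = PRIMARY
    while True:
        if not nas_is_up(active["host"]):
            failover_to = SECONDARY if active is PRIMARY else PRIMARY
            print(f"{active['host']} is down, switching to {failover_to['host']}")
            mount(failover_to)
            active = failover_to
        time.sleep(CHECK_INTERVAL)

if __name__ == "__main__":
    monitor()
```

Even this toy version shows why a strategic point of control like a file/directory virtualization layer is the natural home for the idea: done per-client, the remount has to happen on every machine that consumes the share.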
You can set up an HA pair of NAS heads (if you are a single-vendor shop) or file/directory virtualization devices whether you are single-vendor or heterogeneous, and with a file/directory virtualization tool you have already abstracted the user from the physical storage location in a very IT-friendly way (files are still saved together, storage is managed in a reasonable manner, only files with naming conflicts are renamed, etc.). All that is left is to auto-switch from your high-end primary to a replica created however your organization does these things… and then you are truly redundantly designed. It's been, what, forty years? That's almost as long as I've been alive.

Of course, I think this would fit in well with my long-term vision of protocol independence too, but sometimes I want to pack too much into one idea or one blog, so I'll leave it with "let's start implementing our storage architecture like we do our server architecture… no single point of failure." No doubt someone out there is working on this configuration… Here's hoping they call it Staash when it comes out.

The cat in the picture above is Jennifer Leggio's kitty Clarabelle. Thanks to Jen for letting me use the pic!
Graduating Your Storage

Lori's and my youngest daughter graduated from high school this year, and her class chose one of the many good Vince Lombardi quotes for the theme of their graduation – "The measure of who we are is what we do with what we have." Those who know me well know that I'm not a huge football fan (don't tell my friends here in Green Bay that… the stadium can hold roughly half the city's population, and they aren't real friendly to those who don't join in the frenzy), but Vince Lombardi certainly had a lot of great quotes over the course of his career, and I am a fan of solid quotes. This is a good example of his ability to say things short and to the point. This is the point where I say that I'm proud of our daughter – for a lot more than simply making it through school – and wish her the best of luck in that rollercoaster ride known as adult life.

About the same time as our daughter was graduating, Lori sent me a link to this Research and Markets report on storage usage at High Performance Computing sites. I found it to be very interesting, just because HPC sites are generally on the larger end of storage environments, and are where the rubber really hits the road in terms of storage performance and access times. One thing that stood out was the large percentage of disk that is reported as DAS. While you know there's a lot of underutilized disk sitting in servers, I would have expected the age of virtualization to have used a larger chunk of that disk with local images and more swap space for the multiple OS instances. Another thing of interest was that NAS and SAN are about evenly represented. Just a few years ago, that would not have been true at all. Fibre Channel has definitely lost some space to IP-based storage if they're about even in HPC environment deployments. What's good for some of the most resource-intensive environments on earth is good for most enterprises, and I suspect that NAS has eclipsed SAN in terms of sheer storage space in the average enterprise (though that's a conjecture on my part, not anything from the report).

And that brings us back to the Vince Lombardi quote. NAS disk space is growing. DAS disk space is still plentiful. The measure of the service your IT staff delivers will be what you do with what you have. And in this case, what you have is DAS disk not being used and a growing number of NAS heads to manage all that NAS storage. What do you do with that? Well, you do what makes the most sense. In this case, storage tiering comes to mind, but DAS isn't generally considered a tier, right? It is if you have file virtualization (also called directory virtualization) in place. Seriously. By placing all that spare DAS into the directory tree, it is available as a pool of resources to service storage needs – and by utilizing automated, rule-based tiering, what is stored there can be tightly controlled by tiering rules, so that you are not taking over all of the available space on the DAS and things are stored in the place that makes the most sense based upon modification and/or access times. With tiering and file virtualization in place, you have a solution that can utilize all that DAS, and an automated system to move things to the place that makes the most sense. While you're at it, move the rest of the disk into the virtual directory, and you can run backups off the directory virtualization engine rather than set them up for each machine. You can even create rules to copy data off to slow disk and back it up from there, if you like.
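To illustrate what an age-based tiering rule amounts to – this is a generic sketch of the idea, not ARX configuration syntax, and the tier names, thresholds, and paths are assumptions I made up for the example – the logic is essentially "look at when a file was last touched and pick the pool that makes the most sense":

```python
import os
import time
from dataclasses import dataclass

DAY = 86400  # seconds

@dataclass
class TierRule:
    name: str           # e.g. "tier1-nas", "tier2-das", "cloud"
    max_idle_days: int  # files idle up to this many days belong on this tier

# Hypothetical policy: hot data on the fast NAS, warm data on spare DAS,
# anything untouched for a year is a candidate for the cloud tier.
RULES = [
    TierRule("tier1-nas", 30),
    TierRule("tier2-das", 365),
    TierRule("cloud", 10**9),  # effectively "everything older"
]

def pick_tier(path, rules=RULES):
    """Return the tier a file belongs on, based on its last access/modify time."""
    st = os.stat(path)
    idle_days = (time.time() - max(st.st_atime, st.st_mtime)) / DAY
    for rule in rules:
        if idle_days <= rule.max_idle_days:
            return rule.name
    return rules[-1].name

def plan_moves(root):
    """Walk a share and report where a tiering engine would place each file."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            yield path, pick_tier(path)

if __name__ == "__main__":
    for path, tier in plan_moves("/mnt/virtual-directory"):  # hypothetical path
        print(f"{tier:10s} {path}")
```

The real value of doing this behind a file virtualization layer is that the move itself is invisible to clients – the file keeps its place in the directory tree even though the bits now live on cheaper disk.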
And with the direction things are headed, throw in an encrypting Cloud Storage Gateway like our ARX Cloud Extender, and you have a solution that utilizes your internal DAS and NAS both intelligently and to the maximum, plus a gateway to cloud storage for overflow, Tier N, or archival storage… depending upon how you're using cloud storage. Then you are doing the most with what you have – and setting up an infinitely expandable pool to cover for unforeseen growth.

All of the above makes your storage environment more rational, improves utilization of DAS (and in most cases NAS), retains your files with their names intact, and moves unstructured data to the storage that makes the most sense for it. There is very little not to like. So check it out. We have ARX, and other vendors offer their solutions – though ARX is the leader in this space, so I don't feel I'm pandering when I say you'll find us a better fit.
Store Storing Stored? Or Blocked?

Now that Lori has her new HP TouchSmart for an upcoming holiday gift, we are finally digitizing our DVD collection. You would think that since our tastes are somewhat similar, we'd be good to go with a relatively small number of DVDs… We're not. I'm a huge fan of well-done war movies and documentaries, we share history and fantasy interests, and she likes a pretty eclectic list of pop-culture movies, so the pile is pretty big. I'm working out how to store them all on the NAS such that we can play them on any TV on the network, and that got me to pondering the nature of storage access these days. We own a SAN; it never occurred to me to put these shows on it – that would limit access to those devices with an FC card… or we'd end up creating a share to run them all through one machine with an FC card acting as a NAS head of sorts.

In the long litany of different ways that we store things – direct attached or networked, cloud or WAN, object store or hierarchical – the one that stands out as the most glaring, and the one that has traditionally gotten the most attention, is file versus block. For at least a decade the argument has raged over which is more suited to enterprise use, while most of us have watched from the sidelines, somewhat bemused by the conversation, because the enterprise is using both. As a rule of thumb, if you need to boot from it or write sectors of data to it, you need block. Everything else is generally file. And that's where I'm starting to wonder. I know there was a movement not too many years ago to make databases file-based instead of block-based, and that the big vendors were going in that direction, but I do wonder if maybe it's time for block to retire at the OS level. Of course, for old disks to be compatible the OS would still have to handle block, but restricting sector reads and writes to OS-level calls (I know, it's harder with each release; that's death by a thousand cuts though) would resolve much of the problem. Then a VMware-style boot-from-file-structure would resolve the last bit. Soon we could cut our file protocols in half.

Seriously, at this point in time, what does block give us? Not much, actually. Thin/auto provisioning is available on NAS, high-end performance tweaks are available on NAS, and the extensive secondary network (be it FC or IP) is not necessary for NAS. Though there are some cases where throughput may demand it, those are not your everyday case in a world of 1 Gig networks with multi-Gig backplanes on most devices. And 10 Gig is available pretty readily these days. SAN has been slowly dying; I'm just pondering the question of whether it should be finished off. Seriously, people say "SAN is the only thing for high performance!" but I can guarantee you that I can find plenty of NAS boxes that perform better than plenty of SAN networks – it's just a question of vendor and connectivity. I'm a big fan of iSCSI, but am no longer sure there's a need for it out there. Our storage environment, as I've blogged before, has become horribly complex, with choices at every turn, many of which are tied more to vendors and profits than to needs and customer desires. Strip away the marketing and I wonder if SAN has a use in the future of the enterprise. I'm starting to think not, but I won't declare it dead, as I am still laughing at those who have declared tape dead for the last 20 years – and still are, regardless of what tape vendors' sales look like. It would be hypocritical of me to laugh at them and make the same type of pronouncement.
SAN will be dead when customers stop buying it, not before. Block will end when vendors stop supporting it, not before… so I really am just pondering the state of the market, playing devil's advocate a bit. I have heard people proclaim that block is much faster for database access. I have written and optimized B-tree code, and yeah, it is. But that's because we write databases to work on blocks. If we used a different mechanism, we'd get a different result. It is no trivial thing to move to a different storage method, but if the DB already supports file access, the work is half done; only optimizing for the new method or introducing shims to make chunks of files look like blocks would be required. If you think about it, if your DB is running in a VM, this is already essentially the case. The VM is in a file, the DB is in that file… so though the DB might think it's directly accessing disk blocks, it is not. Food for thought.
VE as in Very Exciting. ARX VE Trial

The limiting factor in adoption of file virtualization has been, in my opinion, twofold. First is the FUD created by the confusion with block-level virtualization and proprietary vendors wanting to sell you more of their gear – both of which are rapidly disappearing – and second is the unknown element: the simple "how does this set of products improve my environment, save me money, or cut man-hours?" Well, now this issue is going to rapidly go away also, because you can find out easily enough.

Those of you who follow my writing know that I was a hard sell for file virtualization services. In fact, until I had a device in my environment and running, giving me a chance to tinker with it, I remained a bit skeptical even after understanding the use cases. The reason is probably well known to many in IT… What does file virtualization offer that the independent NAS boxes don't cover in one way or another? The answer that I came to was something we here at F5 call strategic points of control. The ARX in my environment allowed me to utilize the back-end NAS devices/file servers to the maximum while alleviating quite a bit of "out of disk space" concern. This is simply a case of the ARX seeing across NAS devices and giving the ability to move things to less utilized space without client machines even having to know they moved. This goes a lot further than simple disk space utilization and allows tiering and enhanced automated backup. But I digress. My point is that, much like you and I cannot know what it is like to walk in space, I didn't "get it" until I had my hands on the tool and could toy with it. Oh, I conceptually understood the benefits, but wasn't certain of the ROI for those benefits versus the cost of the device. Having one to configure and implement changes through was what it took for me to fully understand what benefits file virtualization had to offer in my environment.

Today our Data Solutions Group introduced a new version of F5 ARX – F5 ARX VE Trial, or ARX Virtual Edition Trial. Yes indeed: now, assuming you have the right environment, you can download a copy of ARX and kick the tires, and see for yourself what I found in our network – that file management, replication, and tiering are all enabled by the F5 ARX line of products at a level that makes life easier for storage admins, desktop support, security, and systems admins. Of course, no software exists in a vacuum, so I'll cover the minimum requirements here, then talk about issues and differences from an ARX appliance.

Image Courtesy of NASA

The requirements are not too strenuous, and shouldn't be too much of a surprise. First, it is a VM, so you'll need VMware – specifically VMware ESX 4.0 Update 2 or VMware ESX 4.1. The VMware install must be able to offer the ARX VM one virtual processor, two gigabytes of memory, and forty gigabytes of disk space in order for it to run. And finally, you'll need Internet access – either directly from the VM, or via a management machine – so you can get the license key from the F5 license server. You'll want it to have routes to whatever subnets the storage you're trying it out with is on, of course, and clients should have a route to it – or you won't be doing much with it – but other than that, you're set. I know there are a lot of you out there who have wondered at my change of heart vis-à-vis file virtualization… Several of you have written to me about it, in fact.
But now is your chance to see why I went from wondering that this market exists to being an advocate of putting your storage behind such a device. The trial is free, with a few limitations, so let's go over them here. Remember, the point of this product is to try out ARX, not to put in a fully functional production VM. More on that later; for now, understand that the following limitations exist and should still offer more than enough power for you to check it out:

The biggest one, in my opinion, is that you are limited to 32,768 files per system. That means your test environment will have to be carefully selected – you'd be amazed how fast 32K files (not 32K of storage, actual unique files) build up. (A quick way to count the files in a candidate share is sketched at the end of this post.)

Next is that you are really only going to have 32 mount points available on the ARX. This is somewhat less of an issue, because from a single mount point at root you can get to the entire storage system.

The documentation that I have does not mention NFS at all, so presumably it is not supported in the Trial version – but let me caveat that with "just because I haven't seen it doesn't mean it isn't there". I'll be installing and playing with this over the next couple of weeks, and will pop back in to let you know what I find.

All in all, you can drop this into a VM, fire it up, and figure out just how much benefit you could get from file virtualization. That should be the point of a trial version, so I think they hit the nail on the head. As to upgrading in the future, there are some caveats. What you do in the Trial Edition won't transfer to a production box, for example; you'll have to reconfigure. But it's meant for testing only, so that's not a huge limitation. I know when I first install any unfamiliar infrastructure element there is that first bit of learning time that creates clutter anyway, so losing that clutter shouldn't be all bad. Unless you're just better than me anyway :-).
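Since the 32,768-file ceiling is the limit most likely to bite during a trial, here's a trivial sketch – my own, not an F5 tool – for sizing up a candidate share before you point the trial VM at it. The mount path is a made-up example.

```python
import os

def count_files(share_root):
    """Count regular files under a share so you know whether it fits the trial limit."""
    total = 0
    for _, _, filenames in os.walk(share_root):
        total += len(filenames)
    return total

if __name__ == "__main__":
    TRIAL_LIMIT = 32768
    n = count_files("/mnt/test-share")  # hypothetical candidate share
    verdict = "fits within" if n <= TRIAL_LIMIT else "exceeds"
    print(f"{n} files – {verdict} the {TRIAL_LIMIT}-file trial limit")
```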
F5 Friday: Data Inventory Control

Today's F5 Friday post comes to you courtesy of our own Don MacVittie, who blogs more often than not on storage-related topics like file virtualization, cloud storage, and automated tiering goodness. You can connect, converse, and contradict Don in any of the usual social networking avenues. Enjoy!

I have touched a few times on managing your unstructured data, and knowing what you have so you know what to do with it. As you no doubt know, automating some of that process in a generic manner is nearly impossible, since it is tied to your business, your vertical, and your organizational structure. But some of it – file-level metadata and some small amount of content metadata – can absolutely be handled for you by an automated system. This process is increasingly necessary as tiering, virtualization, and cloud become more and more prevalent as elements of the modern data center. The cost driver is pretty obvious to anyone that has handled storage budgeting in the last decade… Disk is expensive, and tier one disk is the most expensive. Add to that the never-ending growth of unstructured data and you have a steady bleed of IT infrastructure dollars that you're going to want to get under control.

One of the tools that F5 (and presumably other vendors, but this is an F5 Friday post) offers to help you get a handle on your unstructured data – with an eye to making your entire storage ecosystem more efficient, reliable, and scalable – is Data Manager. Data Manager is a software tool that helps you categorize the data flowing through your corporate NAS infrastructure by looking at the unstructured files and evaluating all that it can about them from context, metadata, and source. Giving you a solid inventory of your unstructured data files is a good start toward automated tiering, including things like using an F5 ARX and a Cloud Storage Gateway to store your infrequently accessed data encrypted in the cloud.

Automating tiering is a well-known science, but providing you with data about your files, heterogeneous file system usage, and data mobility is less covered in the marketplace. You cannot manage what you don't understand, and we all know that we've been managing storage simply by buying more each budgeting cycle. That process is starting to weigh on the ops budget as well as the tightened budgets that the current market is enforcing – and though there are signs that the tight market might be lifting, who wants to keep overhead any higher than it absolutely has to be? Auto-tiering is well known to the developers that make it happen, but consider the complexity of classifying files, identifying tiers, and moving that data while not causing disruptions to users or applications that need access to the files in question. It is definitely not the easiest task that your data center is performing, particularly in a heterogeneous environment where communications with the several vendors' NAS devices can vary pretty wildly. The guys that write this stuff certainly have my admiration, but it does work. The part where you identify file servers and classify data – setting up communications with the various file servers and accessing the various folders to get at the unstructured files – is necessary just for classification, and that is what Data Manager is all about. Add in that Data Manager can help you understand utilization on all of these resources – in fact, does help you understand utilization of them – and you've got a powerful tool for understanding what is going on in your NAS storage infrastructure.
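To give a feel for the kind of classification being described – this is a generic illustration of a metadata inventory, not how Data Manager itself works internally, and the mount points and age buckets are assumptions of mine – the core of the job is walking each share and recording file-level metadata you can later report on:

```python
import csv
import os
import time

DAY = 86400
SHARES = ["/mnt/nas-vendor1/eng", "/mnt/nas-vendor2/marketing"]  # hypothetical mounts

def age_bucket(last_used):
    """Classify a file by how long it has sat untouched."""
    idle_days = (time.time() - last_used) / DAY
    if idle_days <= 30:
        return "active"
    if idle_days <= 365:
        return "aging"
    return "stale"

def inventory(shares, out_path="nas_inventory.csv"):
    """Record per-file metadata (share, path, size, owner uid, age class) for reporting."""
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["share", "path", "bytes", "owner_uid", "age_class"])
        for share in shares:
            for dirpath, _, filenames in os.walk(share):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    try:
                        st = os.stat(path)
                    except OSError:
                        continue  # file vanished or is unreadable
                    writer.writerow(
                        [share, path, st.st_size, st.st_uid,
                         age_bucket(max(st.st_atime, st.st_mtime))]
                    )

if __name__ == "__main__":
    inventory(SHARES)
```

From an inventory like that, rolling up utilization by share, by owner, or by age class is a straightforward aggregation – which is the sort of reporting described next.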
The reports Data Manager generates are in PDF, and can be run from the directory level all the way up to a group of NAS boxes. Here's a sample directory-level report from one of our NAS devices…

The coolest part? Data Manager is free for 90 days. Just download it here, install it, and start telling it about your NAS devices. See what it can do for you, decide if you like it, provide us with feedback, and consider whether it helps enough to warrant a purchase. It is an excellent tool for discovering where you can get the most benefit out of file virtualization solutions like the F5 ARX. And yes, we hope you'll buy Data Manager and ask about ARX. But the point is, you can try it out and decide if it helps your organization or not. If nothing else, you will get an inventory of your NAS devices and what type of utilization you are getting out of them.
Cloud Storage: Just In Time For Data Center Consolidation.

There's a funny thing about pouring two bags of M&Ms into one candy dish. The number of M&Ms is exactly the same as when you started, but now they're all in one location. You have, in theory, saved yourself from having to wash a second candy dish, but the same number of people can enjoy the same number of M&Ms, you'll run out of M&Ms at about the same time, and if you have junior high kids in the crowd, the green M&Ms will disappear at approximately the same rate. The big difference is that fewer people will fit around one candy dish than two, unless you take extraordinary steps to make that one candy dish more accessible. And if the one candy dish is specifically designed to hold one or one and a half bags of M&Ms, well then you're going to need a place to store the excess.

The debate about whether data center consolidation is a good thing or not is pretty much irrelevant if, for any reason, your organization chooses to pursue this path. Seriously, while analysts want to make a trend out of everything these days, there are good reasons to consolidate data centers, ranging from a skills shortage at one location to a hostile regulatory environment at another. Cost savings are very real when you consolidate data centers, though they're rarely as large as you expect them to be in the planning stages, because the work still has to be done, the connections still have to be routed, and the data still has to be stored. You will get some synergies by hosting apps side-by-side that would normally need separate resources, but honestly, a data center consolidation project isn't an application consolidation project. It can be, but that's a distinct piece of the pie that introduces a whole lot more complexity than simply shifting loads, and all the projects I've seen with both as a part of them have them in two separate and distinct phases – "let's get everything moved, and then focus on reducing our app footprint".

Lori and the M&Ms of doom.

While F5 offers products to help you with all manner of consolidation problems, this is not a sales blog, so I'll focus on one opportunity in the cloud that is just too low-hanging a fruit for you not to be considering it: moving the "no longer needed no matter what" files out to the cloud. I've mentioned this in previous Cloud Storage and Cloud Storage Gateway posts, but in the context of data center consolidation, it moves from the "it just makes sense" category to the "urgently needed" category. You're going to be increasing the workload at your converged data center by an unknown amount, and storage requirements will stay relatively static, but you're shifting those requirements from two or more data centers to one. This is the perfect time to consider your options with cloud storage. What if you could move an entire classification of your data out to the cloud, so you didn't have to care whether you were accessing it from a data center in Seattle or Cairo? What if you could move that selection of data out to the cloud and then purposely shift data centers without having to worry about that data? Well, you can… and I see this as one of the drivers for cloud storage adoption. In general you will want a Cloud Storage Gateway like our ARX Cloud Extender, and using ARX or another rules-based tiering engine will certainly make the initial cloud storage propagation process easier, but the idea is simple.
Skim off those thousands of files that haven't been accessed in X days and move them to cloud storage, freeing up local space so that maybe you won't need to move or replace that big old NAS system from the redundant data center. X is very much dependent upon your business and even the owning business unit; I would seriously work with the business leaders to set reasonable numbers – and offer them guidance about what it will take (in terms of how many days X needs to be) to save the company from moving or replacing an expensive (and expensive-to-ship) NAS. While the benefits appear to be short-term – not consolidating the NAS devices while consolidating data centers – they are actually very long term. They allow you to learn about cloud storage and how it fits into your architectural plans with relatively low-risk data; as time goes on, the number of files (and terabytes) that qualify for movement to the cloud will continue to increase, keeping an escape valve on your NAS growth; and the files that generally don't need to be backed up every month or so will all be hanging off your cloud storage gateway, simplifying the backup process and reducing backup/replication windows.

I would be remiss if I didn't point out the ongoing costs of cloud storage; after all, you will be paying each and every month. But I contend you would be anyway. If this becomes an issue from the business or from accounts payable, it should be relatively easy, with a little research, to come up with a number for what storage growth costs the company when it owns the NAS devices. The only number available to act as a damper on this cost would be the benefits of depreciation, but that's a fraction of the total in real-dollar benefits, so my guess is that companies within the normal bounds of storage growth over the last five years can show a cost reduction over time without having to include cost-of-money-over-time calculations for "buy before you use" storage. So the cost of cloud being pieced out over months is beneficial, particularly at the prices in effect today for cloud storage.

There will no doubt be a few speed bumps, but getting them out of the way now with this never-accessed data is better than waiting until you need cloud storage and trying to figure it out on the fly. And it does increase your ability to respond to rapidly changing storage needs… which over the last decade have been rapidly changing in the upward direction. Data center consolidation is never easy on a variety of fronts, but this could make it just a little bit less painful and provide lasting benefits into the future. It's worth considering if you're in that position – and truthfully, to avoid storage hardware sprawl, even if you're not.

Related Articles and Blogs
Cloud Storage Gateways, stairway to (thin provisioning) heaven?
Certainly Cirtas! Cloud Storage Gains Momentum
Cloud Storage Gateways. Short term win, but long term…?
Cloud Storage and Adaptability. Plan Ahead
Like "API" Is "Storage Tier" Redefining itself?
The Problem With Storage Growth is That No One Is Minding the Store
F5 Friday: F5 ARX Cloud Extender Opens Cloud Storage
Chances Are That Your Cloud Will Be Screaming WAN.
In The End, You Have to Clean.

Lori and I have a large technical reference library, both in print and electronic. Part of the reason it is large is that we are electronics geeks: we seriously want to know what there is to know about computers, networks, systems, and development tools. Part of the reason is that we don't often enough sit down and pare the collection down to those books that still have a valid reason for sitting on our (many) bookshelves of technical reference. The collection runs the gamut from the outdated to the state of the art, from the old stand-bys to the obscure, and we've been at it for 20 years… so many of them just don't belong anymore. One time we went through and cleaned up. The few books we got rid of were not only out of date (mainframe Pascal data structures was one of them), but weren't very good when they were new. And we need to do it again. From where I sit at my desk, I can see an OSF DCE reference, the Turbo Assembler documentation, a Perl 5 reference, a MicroC/OS-II reference, and Mastering Web Server Security. All of which are just not relevant anymore. There's more, but I'll save you the pain; you get the point. The thing is, I'm more likely to take a ton of my valuable time and sort through these books, recycling those that no longer make sense unless they have sentimental value – Lori and I wrote an object-oriented programming book back in 1996, and that's not going to recycling – than you are to go through your file system and clean the junk out of it.

Two of ten…

A funny thing happens in highly complex areas of human endeavor: people start avoiding ugly truths by thinking they're someone else's problem. In my case (and Lori's), I worry about recycling a book that she has a future use for. Someone else's problem syndrome (or an SEP field, if you read Douglas Adams) has been the source of tremendous folly throughout mankind's history, and storage at enterprises is a prime example of just such folly. Now don't get me wrong, I've been around the block, been responsible for an ever-growing pool of storage, and know that IT management has to worry that the second they start deleting unused files they're going to end up in the hot seat because someone thought they needed the picture of the sign in front of the building circa 1995… But if IT (who owns the storage space) isn't doing it, and business unit leaders (who own the files on the storage) aren't doing it… well, you're going to have a nice big stack of storage building up over the next couple of years. Just like the last couple.

I could – and will – tell you that you can use our ARX product to help you solve the problem, particularly with ARX Cloud Extender and a trusted cloud provider, by shuffling files out to the cloud. But in the longer term, you've got to clean up the bookshelf, so to speak. ARX is very good at many things, but not at making those extra files disappear. You're going to pay for more disk, or you're going to pay a cloud provider, until you delete them. I haven't been in IT management for a while, but if I were right now, I'd get the storage guys to build me a pie chart showing who owns how much data, then gather a couple of outrageous examples of wasted space (a PowerPoint that is more than five years old is good – better than the football pool for marketing from ten years ago, because PowerPoint uses a ton more disk space), and then talk with business leaders about the savings they can bring the company by cleaning up. While you can't make it their priority, you can give them the information they need.
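If you want those numbers without waiting on the storage team, here's a rough sketch of how they could be pulled together – a generic walk of the shares that totals bytes per owner and flags what hasn't been touched in a year. It's an illustration under my own assumptions (POSIX ownership and a made-up mount point), not an F5 tool; on Windows shares you'd read ACL owners instead of UIDs.

```python
import os
import pwd
import time
from collections import defaultdict

YEAR = 365 * 86400  # seconds in a year (close enough for a report)

def usage_by_owner(share_root):
    """Total bytes and 'untouched in a year' bytes per owning user under a share."""
    totals = defaultdict(lambda: {"bytes": 0, "stale_bytes": 0})
    now = time.time()
    for dirpath, _, filenames in os.walk(share_root):
        for name in filenames:
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue
            try:
                owner = pwd.getpwuid(st.st_uid).pw_name
            except KeyError:
                owner = str(st.st_uid)  # uid with no local account
            totals[owner]["bytes"] += st.st_size
            if now - max(st.st_atime, st.st_mtime) > YEAR:
                totals[owner]["stale_bytes"] += st.st_size
    return totals

if __name__ == "__main__":
    for owner, t in sorted(usage_by_owner("/mnt/nas/dept-shares").items()):  # hypothetical
        pct_stale = 100 * t["stale_bytes"] / t["bytes"] if t["bytes"] else 0
        print(f"{owner:15s} {t['bytes'] / 2**30:8.1f} GB total, {pct_stale:4.1f}% untouched in a year")
```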
If marketing is responsible for 30% of the disk usage on NAS boxes (or, I suppose, unstructured storage in general, though that exercise is more complex with mixed SAN/NAS numbers – not terribly more complex), and you can show that 40% of the files owned by marketing haven't been touched in a year… that's compelling at the C-level. 12% of your disk is sitting there, just from one department, holding easy-to-identify unused files. Some CIOs I've known have laid the smackdown – "delete X percent by Y date or we will remove this list of files" is actually from a CIO's memo – but that's just bad PR in my opinion. Convincing business leaders that they're costing the company money – what's 12% of your NAS investment, for example, plus 12% of the time of the storage staff dedicated to NAS? – is a much better plan, because you're not the bad guy; you're the person trying to save money while not negatively impacting their jobs.

So yeah, install ARX, because it has a ton of other benefits, but also go to the bookshelf, dust off that copy of the Fedora 2 Admin Guide, and finally put it to rest. That's what I'll be doing this weekend, I know that.
Given Enough Standards, Define Anarchy

If a given nation independently developed twelve or fourteen governmental systems that all sat side by side and attempted to cooperate but never interoperate, anarchy would result. Not necessarily overnight, but issues about who is responsible for what, where a given function is best handled, and more would spring up nearly every day.

Related Articles and Blogs:
NEC's New I/O Technology Enables Simultaneous Sharing of I/O
Storage Area Networking
Network Attached Storage
SNIA (website)
HP Flexfabric Gets Raves from Storage Networking Vendors