storage tiering
5 TopicsSSDs, Velocity and the Rate of Change.
The rate of change in a mathematical equation can vary immensely based upon the equation and the inputs to the equation. Certainly the rate of change for f(x) = x^2 is a far different picture than the rate of change for f(x)=2x, for example. The old adage “the only constant is change” is absolutely true in high tech. The definition of “high” in tech changes every time something becomes mainstream. You’re working with tools and systems that even ten years ago were hardly imaginable. You’re carrying a phone that Alexander Graham Bell would not recognize – or know how to use. You have tablets with the power that was not so long ago only held by mainframes. But that change did not occur overnight. Apologies to iPhone fans, but all the bits Apple put together to produce the iPhone had existed before, Apple merely had the foresight to see how they could be put together in a way customers would love. The changes happen over time, and we’re in the midst of them, sometimes that’s difficult to remember. Sometimes that’s really easy to remember, as our brand-new system or piece of architecture gives us headaches. Depends upon the day. Image generated at Cool Math So what is coming of age right now? Well, SSDs for one. They’re being deployed in the numbers that were expected long ago, largely because prices have come down far enough to make them affordable. We offer an SSD option for some of our systems these days, and since the stability of our products is of tantamount to our customers’ interests, we certainly aren’t out there on the cutting edge with this development. They’re stable enough for mission critical use, and the uptick in sales reflects that fact. If you have a high-performance application that relies upon speedy database access, you might look into them. There are a lot of other valid places to deploy SSDs – Tier one for example – but a database is an easy win. If access times are impacting application performance, it is relatively easy to drop in an SSD drive and point the DB (cache or the whole DB) at them, speeding performance of every application that relies on that DBMS. That’s an equation that is pretty simple to figure out, even if the precise numbers are elusive. Faster disk access = faster database response times = faster applications. That is the same type of equation that led us to offer SSDs for some of our products. They sit in the network between data and the applications that need the data. Faster is better, assuming reliability, which after years of tweaking and incremental development, SSDs offer. Another place to consider SSDs is in your virtual environment. If you have twenty VMs on a server, and two of them have high disk access requirements, putting SSDs into place will lighten the load on the overall system simply by reducing the blocking time waiting for disk responses. While there are some starting to call for SSDs everywhere, remember that there were some who said cloud computing meant no one should ever build out a datacenter again also. The price of HDs has gone down with the price of SSDs pushing them from the top, so there is still a significant cost differential, and frankly, a lot of applications just don’t need the level of performance that SSDs offer. The final place I’ll offer up for SSDs is if you are implementing storage tiering such as that available through our ARX product. If you have high-performance NAS needs, placing an SSD array as tier one behind a tiering device can significantly speed access to the files most frequently used. And that acceleration is global to the organization. All clients/apps that access the data receive the performance boost, making it another high-gain solution. Will we eventually end up in a market where old-school HDDs are a thing of the past and we’re all using SSDs for everything? I honestly can’t say. We have plenty of examples in high-tech where as demand went down, the older technology started to cost more because margins plus volume equals profit. Tube monitors versus LCDs, a variety of memory types, and even big old HDDs – the 5.25 inch ones. But the key is whether SDDs can fulfill all the roles of HDDs, and whether you and I believe they can. That has yet to be seen, IMO. The arc of price reduction for both HDDs and SSDs plays in there also – if quality HDDs remain cheaper, they’ll remain heavily used. If they don’t, that market will get eaten by SSDs just because all other things being roughly equal, speed wins. It’s an interesting time. I’m trying to come up with a plausible use for this puppy just so I can buy one and play with it. Suggestions are welcome, our websites don’t have enough volume to warrant it, and this monster for laptop backups would be extreme – though it would shorten my personal backup window ;-). OCZ Technology 1 TB SSD. The Golden Age of Data Mobility? What Do You Really Need? Use the Force Luke. (Zzzaap) Don't Confuse A Rubber Stamp With Validation On Cloud, Integration and Performance Data Center Feng Shui: Architecting for Predictable Performance F5 Friday: Performance, Throughput and DPS F5 Friday: Performance Analytics–More Than Eye-Candy Reports Audio White Paper - High-Performance DNS Services in BIG-IP ... Analyzing Performance Metrics for File Virtualization274Views0likes0CommentsToll Booths and Dams. And Strategic Points of Control
An interesting thing about toll booths, they provide a point at which all sorts of things can happen. When you are stopped to pay a toll, it smooths the flow of traffic by letting a finite number of vehicles through per minute, reducing congestion by naturally spacing things out. Dams are much the same, holding water back on a river and letting it flow through at a rate determined by the operators of the dam. The really interesting bit is the other things that these two points introduce. When necessary, toll booths have been used to find and stop suspected criminals. They have also been used as advertising and information transmission points. None of the above are things toll booths were created for. They were created to collect tolls. And yet by nature of where they sit in the highway system, can be utilized for much more. The same is true of a dam. Dams today almost always generate electricity. Often they function as bridges over the very water they’re controlling. They control the migration of fish, and operate as a check on predatory invasive species. Again, none of these things is the primary reason dams were originally invented, but the nature of their location allows them to be utilized effectively in all of these roles. Toll booths - Wikipedia We’ve talked a bit about strategic points of control. They’re much like toll booths and dams in the sense that their location makes them key to controlling a whole lot of traffic on your LAN. In the case of F5’s defined strategic points of control, they all tie in to the history of F5’s product lineup much like a toll booth was originally to collect tolls. F5BIG-IPLTM sits at the network strategic point of control. Initially LTM was a load balancer, but by virtue of its location and the needs of customers has grown into one of the most comprehensive Application Delivery Controllers on the market – everything from security to uptime monitoring is facilitated by LTM. F5 ARX is much the same, being the file-based storage strategic point of control allows such things as directing some requests to cloud storage and others to storage by vendor A, while still others go to vendor B, and the remainder go to a Linux or Windows machine with a ton of free disk space on it. The WAN strategic point of control is where you can improve performance over the WAN via WOM, but it is also a place where you can extend LTM functionality to remote locations, including the cloud. Budgets for most organizations are not growing due to the state of the economy. Whether you’re government, public, private, or small business, you’ve been doing more with less for so long that doing more with the same would be a nice change. If you’re lucky, you’ll see growth in IT budgeting due to increasing needs of security and growth of application footprints. Some others will see essentially flat budgets, and many – including most government IT orgs - will see shrinking budgets. While that is generally bad news, it does give you the opportunity to look around and figure out how to make more effective use of existing technology. Yes, I have said that before, because you’re living that reality, so it is worth repeating. Since I work for F5, here are a few examples though, something I’ve not done before. From the network strategic point of control, we can help you with DNSSec, AAA, Application Security, Encryption, performance on several levels (from TCP optimizations to compression), HA, and even WAN optimization issues if needed. From the storage strategic point of control we can help you harness cloud storage, implement tiering, and balance load across existing infrastructure to help stave off expensive new storage purchases. Backups and replication can be massively improved (both in terms of time and data transferred) from this location also. We’re not the only vendor that can help you out without having to build a whole new infrastructure. It might be worthwhile to have a vendor day, where you invite vendors in to give presentations about how they can help – larger companies and the federal government do this regularly, you can do the same in a scaled down manner, and what sales person is going to tell you “no, we don’t want to come tell you how you can help and we can sell you more stuff”? Really? Another option is, as I’ve said in the past, make sure you know not just the functionality you are using, but the capabilities of the IT gear, software, and services that you already have in-house. Chances are there are cost savings by using existing functionality of an existing product, with time being your only expense. That’s not free, but it’s about as close as IT gets. Hoover Dam from the air - Wikipedia So far we in IT have been lucky, the global recession hasn’t hit our industry as hard as it has hit most, but it has constricted our ability to spend big, so little things like those above can make a huge difference. Since I am on a computer or Playbook for the better part of 16 hours a day, hitting websites maintained by people like you, I can happily say that you all rock. A highly complex, difficult to manage set of variables rarely produces a stable ecosystem like we have. No matter how good the technology, in the end it is people who did that, and keep it that way. You all rock. And you never know, but you might just find the AllSpark hidden in the basement ;-).259Views0likes0CommentsIt Is Not What The Market Is Doing, But What You Are.
We spend an obsessive amount of time looking at the market and trying to lean toward accepted technologies. Seriously, when I was in IT management, there were an inordinate number of discussions about the state of market X or Y. While these conversations almost always revolved around what we were doing, and thus were put into context, sometimes an enterprise sits around waiting for everyone else to jump on board before joining in the flood. While sometimes this is commendable behavior, it is just as often self-defeating. If you have a project that could use technology X, then find the best implementation of said technology for your needs, and implement it. Using an alternative or inferior technology just because market adoption hasn’t happened will bite you as often as it will save you. Take PDAs, back in the bad old days when cell phones were either non-existent of just plain phones. Those organizations that used them reaped benefits from them, those that did not… Did not. While you could talk forever about the herky-jerky relationship of IT with small personal devices like PDAs, the fact is that they helped management stay better organized and kept salespeople with a device they could manage while on the road going from appointment to appointment. For those who didn’t wait to see what happened or didn’t raise a bunch of barriers and arguments that, retrospectively, appear almost ridiculous. Yeah, data might leak out on them. Of course, that was before USB sticks and in more than one case entire hard disks of information walked away, proving that a PDA wasn’t as unique in that respect as people wanted to claim. There is a whole selection of technologies that seem to have fallen into that same funky bubble – perhaps because, like PDAs, the value proposition was just not quite right. When cell phones became your PDA also, nearly all restrictions on them were lifted in every industry, simply because the cell phone + PDA was a more complete solution. One tool to rule them all and all that. Palm Pilot, image courtesy of wikipedia Like PDAs, there is benefit to be had from going “no, we need this, let’s do it”. Storage tiering was stuck in the valley of wait-and-see for many years, and finally seems to be climbing out of that valley simply because of the ongoing cost of storage combined with the parallel growth of storage. Still, there are many looking to save money on storage that aren’t making the move – almost like there’s some kind of natural resistance. It is rare to hear of an organization that introduced storage tiering and then dumped it to go back to independent NAS boxes/racks/whatever, so the inhibition seems to be strictly one of inexperience. Likewise, cloud suffers from some reluctance that I personally attribute to not only valid security concerns, but to very poor handling of those concerns by industry. If you read me regularly, you know I was appalled when people started making wild claims like “the cloud is more secure than your DC”, because that lack of touch with reality made people more standoffish, not less. But some of it is people not seeing a need in their organization, which is certainly valid if they’ve checked it out and come to that conclusion. Quite a bit of it, I think, is the same resistance that was applied to SaaS early on – if it’s not in your physical network, is it in your control? And that’s a valid fear that often shuts down the discussion before it starts – particularly if you don’t have an application that requires the cloud – lots of spikes in traffic, for example. Application Firewalls are an easier one in my book – having been a developer, I know that they’re going to be suspicious of canned security protecting their custom app. While I would contend that it isn’t “canned security” in the case of an Application Firewall, I can certainly understand their concern, and it is a communications issue on the part of the Application Firewall vendor community that will have to be resolved if uptake is to spike. Regulatory issues are helping, but far better an organization purchase a product because they believe it helps than because someone forced them to purchase. With HP’s exit from the tablet market, this is another field that is in danger of falling into the valley of waiting. While it’s conjecture, I’ll contend that not every organization will be willing to go with iPads as a corporate roll-out for groups that can benefit from tablet PCs – like field sales staff – and RIM is in such a funk organizations are unlikely to rush their money to them. The only major contender that seems to remain is Samsung with the Galaxy Tab (android-based), but I bought one for Lori for her last birthday, and as-delivered it is really a mini gaming platform, not a productivity tool. Since that is configurable within the bounds set in the Android environment, it might not be such a big deal, but someone will have to custom-install them for corporate use. But the point is this. If you’re spending too much on storage and don’t have tiering implemented, contact a vendor that suits your needs and look into it. I of course recommend F5ARX, but since I’m an F5 employee, expecting anything else would be silly. Along the same lines, find a project and send it to the cloud. Doesn’t matter how big or small it is, the point is to build your expertise so you know when the cloud will be most useful to you. And cloud storage doesn’t count, for whatever reason it is seeing a just peachy uptake (see last Thursday’s blog), and uses a different skill set than cloud for application deployment. Application Firewalls can protect your applications in a variety of ways, and those smart organizations that have deployed them have done so as an additive protection to application development security efforts. If for some odd reason you’re against layered protection, then think about this… They can stop most attacks before they ever get to your servers, meaning even fingerprinting becomes a more difficult chore. Of course I think F5 products rule the roost in this market – see my note above about ARX. As to tablet PCs, well, right now you have a few choices, if you can get a benefit from them now, determine what will work for you for the next couple of years and run with it. You can always do a total refresh after the market has matured those couple of years. Right now I see Apple, RIM, and Samsung as your best choices, with RIM being kind of shaky. Lori and I own Playbooks and love them, but RIM has managed to make a debacle of itself right when they hit the market, and doesn’t seem to be climbing out with any speed. Or you could snatch up a whole bunch of those really inexpensive Web-OS pads and save a lot of money until your next refresh :-). But if you need it, don’t wait for the market. The market is like FaceBook… Eventually consistent, but not guaranteed consistent at any given moment. Think more about what’s best for your organization and less about what everyone else is doing, you’ll be happier that way. And yes, there’s slightly more risk, but risk is not the only element in calculating best for the organization, it is one small input that can be largely mitigated by dealing with companies likely to be around for the next few years.218Views0likes0CommentsGive Your Unstructured Data The Meyers-Briggs (TM)
For those who don’t know, according to the Meyers & Briggs Foundation, part of the Meyers-Briggs Assessment is defined as: The essence of the theory is that much seemingly random variation in the behavior is actually quite orderly and consistent… The same can be said about your data. Much that is seemingly random is consistent and predictable. One of the problems currently facing the enterprise is to properly categorize that data so that its “personality” is well known. You cannot sort (or tier) what you don’t know, and this is a simple proposal for how you might begin such a categorization. No matter what your organization does, it has a variety of data in a variety of types with a variety of attributes that can be built into indices to help you understand not only what you have, but how much of it you have, what its relative importance is to the organization, and how you can make use of all of this information to help you move data about in an intelligent manner. Meyers-Briggs uses initials to give your average Joe (or Jane) an easy-to-access summary of a given individual’s personality type, and by extension how to interface with that person. This little tool aims to do the same type of thing for your data, so I kept their initials and mapped them to information about your data that will help you figure out what to do with it. Perhaps a bit gimmicky, but it’s valid, so let’s get started defining your data We can break data information into two categories – physical and extended. Physical information can be readily accessed and utilized by automated tiering systems like the one built into our own ARX, but extended information is unique to your organization and some of it will fluctuate over time. That is the hard part that interviews and intelligent data analysis will be required to determine. Not insurmountable, but certainly a task, and if you’re a geek that doesn’t like to play with “squishy” data, not an envious task at all. Though knowing this stuff will help you come to logical conclusions about where and how the data will be stored. Physical Extended Extension Interest Timestamp Jurisdiction Size Permissions Filesystem Necessity First, clear definitions of each data type. Extension is the file extension. It will not only tell you type of file (generally), it will also tell you aggregate type. An AVI falls into the video category, for example. Yeah, it can be audio too, but most organizations treat the two media types similarly when making decisions strictly on extension. Timestamp the last time that the filesystem shows this file as written. If you have a tool (like ARX) that allows you to accurately track last access date/time also, then you could use that information much more intelligently than last save time. Size Let’s face it, the multi-gigabyte file is going to be treated differently than the 10K file just because it is a big win to get it off of tier one storage and on to something cheaper. Filesystem Files on the SAN generally take more money to keep there than those on the NAS. Now if your SAN is low-end and your NAS high-end, it is possible that this is untrue for you, but either way, knowing what filesystem a file is hosted on helps you to understand what the impact of moving that file will be. Interest How much interest would this file be to ne’er-do-wells that got access to the storage medium it is currently stored on? Jurisdication Who is the ultimate owner of this data? The person who can make decisions about its use, distribution, and access rights? Permissions Who has access to this file, and is it by user or group, is access to this file managed on the file itself, or the file system it is stored on? Necessity How is this file used within the organization? If it went away tomorrow, who would be impacted and how would they be impacted? The idea is to collect all of this information about your files so that you can make intelligent decisions about how to move that data around and store it in the most appropriate place. As I said above, there are tools to help you with the physical stuff, and some of them help with Permissions also. But you’ll still need to collect the other data, and that’s a lot of work. If you just plain don’t have time to interview director-level people about their team’s data usage and specific files, then start with directories. Something is better than nothing after all, and behaviorally most groups put like data into folders as far as usage, permissions, jurisdiction, and necessity. After all, the fantasy football spreadsheet isn’t generally stored in the new product development folder. Using these values, you can properly categorize your data, which is the first step to both understanding it and organizing it – and tiering it. Unlike Meyers-Briggs, these attributes can have multiple non-numeric values, so your tracking will be a little bit more complex than a Meyers-Briggs score, but it will be highly valuable in helping you figure out what to do with your data. Data whose necessity is high will obviously take pride of place on your tier one storage systems – unless it is almost never accessed, which the better version of timestamp could tell you. If you just don’t have the manhours, cooperation, or desire to work through all of this, then invest in an automated tiering product, let it learn, and turn it on. It will get you 50% there, maybe 5/8ths of the way there, with no significant effort on your part. You’ll have to install and configure it, and monitor it… But the investment is small compared to interviewing business owners and asking them to make definitive statements about all of the data they own. And it gets you started. In the end, you can’t send stuff that is of high interest unprotected into the cloud, you can’t run stuff that is frequently accessed into an archival format, and you want to check how many movies and audio files you have, where they’re stored, and how much space they take up. So the more you know, the more power over your storage environment you will have. Meyers-Briggs is a trademark of the Meyers and Briggs Foundation. Related Articles and Blogs: Wikipedia Meyers-Briggs Type Indicator Create A Smarter Storage Strategy (pdf) Storage Industry Tackles Making Sense of Metadata Tiering is Like Tables or Storing in the Cloud Tier188Views0likes0Comments- 134Views0likes0Comments