Learning EMC server stats for statistic collection
Once upon a time, not too long ago, the EMC's command line interface utilized a few known commands for gathering statistics. The commands I grew used to were called server_cifsstat and server_nfsstat but then one day when I called upon their mighty powers they responded to me with an unexpected echo: Info 26306752352: server_2 : This command has been deprecated and replaced with server_stats command. "Deprecated!?", I yelled into the air, fists raised high as if I was clutching doom! I slowly put my hands down, took a deep breath and started to focus my inner child towards more positive thoughts. Years of software development and design decisions played through my head as I realized, "there must be a new way of doing the old thing." I just had to search for an answer, much as my old friend Indiana Jones would do, I went into the deep dark caverns of documentation to solve the riddle. Okay, I'm really not that great at reading documentation but once I found a command called server_stats (keyword "stats" gave it away to me) I realized I was heading down the right path… but I was still lost in the darkness. When I executed the command I received a cryptic message: [nasadmin@EMC-VNX-SIM ~]$ server_stats USAGE: server_stats -list | -info [-all|[,...]] | -service { -start [-port ] | -stop | -delete | -status } | -monitor -action {status|enable|disable} |[ [{ -monitor {statpath_name|statgroup_name}[,...] | -monitor {statpath_name|statgroup_name} [-sort ] [-order {asc|desc}] [-lines ] }...] [-count ] [-interval ] [-terminationsummary {no|yes|only}] [-format {text [-titles {never|once|}]|csv}] [-type {rate|diff|accu}] [-file [-overwrite]] The command, which was pretty easy before, has turned into something a bit harder for me to wield but, thanks to some research, experimentation and friends here I can count on, we formulated a single-line statistics gathering command that would do what we needed. I emphasis friends that I can count on, because we have some pretty brilliant folks here. This is the magic sauce (all one one line): server_stats server_2 -monitor cifs.smb1,cifs.smb2,nfs.v2,nfs.v3,nfs.v4,cifs.global,nfs.basic -format csv -terminationsummary no -count 144 -interval 300 -type accu -file name-server_2.csv This command will capture statistics for CIFS/SMB version 1 and 2 as well as NFS version 2, 3 and 4 along with a few more statistics that we may want in the future (bandwidth, other stats and goodies, etc.) It will capture 144 statistical snapshots in time, every 300 seconds, and save them into name-server_2.csv (a comma-separated file with a nice header). One piece of the puzzle, that took me by surprise, was the -type accu option, which accumulates statistics upon each capture rather than starting back at a baseline of zero. You can also do 'diff' to capture the difference from interval to interval, which is nice… but unfortunately I am not able to utilize that feature. We have written tools to scan statistics on some storage devices like EMC and Network Appliance and, while this new command is super awesome, it's not consistent with anything else out there (even older releases prior to deprecation) so our in-house tools which calculate differences do the work for us. If you're looking to start working with the newerserver_stats feature, I suggest using the online manual pages (man server_stats) to get a slightly more clear understanding of all the features and what they can do for you. I believe the command is a bit large for what it needs to do for us, considering it deprecated a much simpler series of commands. However, we work with what we have and hopefully our example command line implementation will give you an understanding of how you can unlock the potential of server_stats for your own needs.437Views0likes1CommentMultiple Stream Protocols, eBooks, And You.
EBook readers are an astounding thing, if you really stop and think about it. Prior to their creation, how could you reasonably have hundreds or thousands of books in one place, all the notes you took and highlighting you wanted to do, and your current page in each book all stored together in one easy to use place? We have a room that is a library. It has shelf upon shelf of books. We have other bookshelves throughout our house with more books. And do you think where you last left off in those books is remembered? Sure, some of them will retain bookmarks, but not automatically, you have to take steps to put the bookmark physically into the book and then hope that no one else messes with it. Lori and I have very similar tastes in reading, and we share almost 100% of the books in the house, which means inevitably someone’s page or a quote marker or something gets lost. Not with eBooks. We use Kindles, and all the books I read show up in her archive to read, all the books she reads show up in mine. My notes are mine, her notes are hers. All at the same time. No confusion at all. The revolution in reading that eBook readers have enabled is not on the “uber-fast” pace that I would have expected, just because of the cost of entry. Buy a book to read today for $8 USD, or scrounge $100 to $500 USD to purchase a reader? For lots of people tight on cash, there is no choice there. The big-name publishers themselves haven’t helped any either. I’m not going to pay book price for a book I already own, just for the right to put it on my eReader, I’ll just pick up the paper copy, thanks. But it’s still moving along at a rapid pace because demand for one small tablet device to contain tons of books was unknown until it was real, but now that it’s real, demand is growing. The same is true for stream protocols. That is protocols that bundle streams together into a single connection. From Java to VDI, these protocols are growing because they encapsulate the entire communications thread and can optimize strictly based upon whatever it is they’re transporting. In the case of Amazon’s SPDY or VDI, they’re transporting an awful lot, and often in two-way communications. And yet, like eBook readers, technology has come far enough that they do so pretty darned well. The real difference between these protocols and TCP or HTTP is that they allow multiple message streams within a single connection. Always remembering where each is, detecting lost data and which streams it impacts… Much like an eBook remembering your notes. One corner of our library And they’re growing in popularity. For Virtual Desktop Infrastructure, shared protocols are standard. For Amazon, SPDY capability is assumed on the server (or SPDY wouldn’t be an option), though it won’t be used if the client can’t support it. For Java, support of the IETF Stream Control Transmission Protocol (SCTP) is completely optional… For the developer. Since these protocols don’t impact the end user in any noticeable way, they will continue to gain popularity to multiplex several related functions over a single connection. And you should be aware of that, because if you do any load balancing or own any tool that uses packet inspection for anything, you'll want to check with your vendor about what they do/intend to support. It’s passingly difficult, for example, to load balance SPDY unless the load balancer has special features to do so. The reason is simple, the current world of TCP and HTTP has a source and a target, but under SPDY you have a source and targets. If your device doesn’t know how to crack open SPDY and see what it’s trying to do, the device can’t very well route it to the best server to handle the request. That is true of all of the multiple stream protocols, and as they gain in popularity, or when you start supporting one on your servers, you’ll want to make sure your infrastructure can deal with them intelligently. Much like back seven or so years ago, when content based routing hit the “what about encryption?” snag, you will see similar issues pop up with these protocols. If you’re using QoS functionality, for example, what if you limited video bandwidth to make certain your remote backup could complete in a timely manner, but users are streaming video over SPDY? How do you account for that without limiting all of SPDY? Well you don’t, unless your device is smart enough to handle the protocol. That doesn’t even touch the potential for prioritization that SPDY allows… If your device can parse it. My Kindle currently holds more books than those shelves. So pay attention to what’s happening in the space – when you have time – and perk up your ears for the impact on your infrastructure if someone wants to bring a product in-house that utilizes one of these protocols. They’re very cool, but don’t get caught unaware. Of course now that I’ve equated them to eBook readers, perhaps you’ll think of them whenever you read . And just like my kindle holds as many books as we have in our large library (my Kindle is around 500 right now, no idea how many are in the library, but 500 is a big number), those Multiple Stream Protocols could hold more connections than your other servers are seeing. On the bright side, at least today, IT has to make a positive decision to use a product that requires these protocols, so you’ll get a chance to do some homework. Related Articles and Blogs F5 Friday: Doing VDI, Only Better Oops! HTML5 Does It Again The Conspecific Hybrid Cloud F5 and vCloud Director: A Yellow Bricks How-to SPDY Wikipedia entry SCTP RFC Microsoft RDP description189Views0likes0CommentsThe Magic of Mobile Cloud
It’s like unicorns…and rainbows! #mobile Mark my words, the term “mobile” is the noun (or is it a verb? Depends on the context, doesn’t it?) that will replace “cloud” as the most used and abused and misapplied term in technology in the coming year. If I was to find a pitch in my inbox that did not someway invoke the term “mobile” I’d be surprised. The latest one to catch my eye was pitching a survey on the “mobile cloud”. The idea, apparently, around this pitch involving “mobile cloud” is the miraculous capability bestowed upon cloud deployed services to automagically perform synchronization and storage tasks. The proliferation of mobile devices has created demand for services that allow users to access personal data and content from any device at any time. Mobile cloud services are emerging that synchronise data across multiple mobile devices with centralised storage in the cloud. While the statement regarding demand is true, the follow-on assertion is at best inaccurate, at worst it is false. There are no services, in the cloud or anywhere else, that can synchronize data across multiple devices. Oh, services may be emerging that claim to do so, but they can’t and don’t. Without fail, services “in the cloud” are invoked from the client – each individual client, mind you – and without that initiating event a cloud service would no more be able to synchronize data than previous incarnations of mobile services when we called them hosted applications. SERVICE-SIDE PUSH This is because the underlying technology used to access these services is still, regardless of the interface presented, the web. It’s an API. It’s HTTP. It’s a client-server paradigm that hasn’t changed very much since it rose to ascendancy as the preferred application architectural model back in the last century. The reason SPDY has started to gain attention and mindshare is not necessarily because it’s faster (that’s a plus, mind you, but it’s not the whole enchilada) but because of its bidirectional communication capabilities. SPDY can push to clients in a way that HTTP has never really been able to do, though many have tried. They’ve come close with approximations and solutions that to the untrained user appear to be a “push” but in reality they are little more than “dragging out a pull response.” And yet SPDY is still constrained in the same way as traditional HTTP: the client must initiate the connection. The capability to push from the service-side does not and will not imbue “cloud services” of any kind with the ability to initiate actions, because the “cloud” cannot push to a client unless a connection is already established. And who initiates connections? That’s right, clients. The only entity that could make a claim that it could initiate anything on a mobile device would be a service provider. That’s because they are the only ones who can actually find and connect to a device on-demand – and then it’s only their devices on their mobile networks. And then they’d best only do that if it’s (1) part of their terms of service or (2) the user specifically checked a box allowing them (or their service) to do so. But consider the impracticality of “service-side push” to clients to synchronize data. Client devices are, well, mobile. That means their connectivity is not assured. “Always on” is a misnomer. Yes, the device is always on in a way the PC has never been, but it’s also in stand-by mode, which often means the radio – its means of communication – is off. This little fact is a problem for performance-focused IT, and it’s even more troublesome to those who’d like to create a service-side “push”. So Bob uploads a photo to a “cloud storage” service and the service wants to synchronize it with Bob’s other (configured by Bob, of course) devices. So the service starts sending out messages to try to connect to Bob’s other devices. Right. One is turned off and the other is in flight mode to prevent his three-year old from purchasing God only knows what apps through the Android market and the third? It’s in standby, the radio is off. That’s not the way it works today and it certainly shouldn’t be the way it works in the future. It’s a waste of processing power, of bandwidth, of resources in general. The client will eventually be online and will open a session with the “cloud service” and ask it for updates. MOBILE CLOUD Whether applications use web technologies because of the reality that clients are not “always on” or because it’s the model (client initiated and more importantly to them, controlled) most familiar and acceptable to consumers, reality is that mobile devices and clients leverage web technologies to store, share, and synchronize data across services. The “mobile cloud” and its alleged ability to “synchronize data across devices” is little more than cloud washing, as is the term “mobile cloud” itself which some have tried to claim is defined by the way in which a device accesses its services. From differentiation between network type (wired versus wireless) to the client-model (thin client browser versus thick client application), some continue to try to make the case that there exists some “mobile cloud” that is completely different than that of the “regular old cloud.” There is not. The web is the web, the presentation layer of an application (thick or thin) does not define its server-side technological model, and service-side push (and control) remains yet another marketing phrase used to describe capabilities that is not technically accurate and which ultimately sets unrealistic expectations for consumers – and in the enterprise, IT. The notion that you’d build a “mobile cloud” that is somehow separate from the “regular cloud” is preposterous precisely because it contradicts the purported purpose for building it: synchronization and “access from anywhere.” It’s that “anywhere” requirement that makes a mobile cloud as realistic as unicorns. If I upload a photo to I should be able to access – and thus synchronize – from any device, and that includes my laptop or desktop PC, the latter of which is certainly not “mobile”. These assertions that a mobile cloud exist only serve to reinforce the heretofore unknown Clark’s Third (and a half) Law: Any sufficiently advanced web technology is indistinguishable from cloud in the eyes of the marketing department.172Views0likes0CommentsFile Virtualization Performance: Understanding CIFS Create ANDX
Once upon a time, files resided on a local disk and file access performance was measured by how fast the disk head accessed the platters, but today those platters may be miles away; creating, accessing and deleting files takes on a new challenge and products like the F5 ARX become the Frodo Baggins of the modern age. File Virtualization devices are burdened with a hefty task (this is where my Lord of the Ring Analogy really beings to play out) of becoming largely responsible for how close your important files are to your finger tips. How fast do you want it to perform? Quite expectedly, “as fast as possible,” will be the typical response. File Virtualization requires you to meet the expectations of an entire office—many of which are working miles away from the data they access and every single user hates to see a waiting cursor. To judge the performance of a storage environment we often ask the question, “How many files do you create and how many files do you open?” Fortunately, Microsoft CIFS allows humans to interact with their files over a network and it does so with many unique Remote Procedure Calls (RPCs). One such procedure call, Create ANDX, was initially intended to create new files on a file system but became the de-facto standard for opening files as well. While you and I can clearly see an obvious distinction between opening a file and creating a file, CIFS liberal use of Create ANDX, gives us pause, as this one tiny procedure has been overloaded to perform both tasks. Why is this a problem? Creating a file and opening a file requires a completely separate amount of work with entirely different results, one of the great challenges of File Virtualization. Imagine if you were given the option between writing a book like The Fellowship of The Ring or simply opening the one already created. Which is easier? Creating a file may require metadata about the file (security information, other identifiers, etc.) and allocating sufficient space on disk takes a little time. Opening a file is a much faster operation compared to “create” and, often, will be followed by one or more read operations. Many storage solutions, EMC and Network Appliance come to mind, have statistics to track just how many CIFS RPC’s have been requested by clients in the office. These statistics are highly valuable when analyzing the performance of a storage environment for File Virtualization with the F5 ARX. Gathering the RPC statistics over a fixed interval of time allow easier understanding of the environment but one key statistic, Create ANDX, leaves room for improvement… this is the “all seeing eye” of RPC’s because of its evil intentions. Are we creating 300 files per second or simply opening them? Perhaps it’s a mix of both and we’ve got to better understand what’s going on in the storage network. When we analyze a storage environment we put additional focus on the Create ANDX RPC and utilize a few other RPC’s to try to guess what the client’s intentions so we can size the environment for the correct hardware. In a network with 300 Create ANDX procedures a second, we would then look into how many read RPC’s we can find compared to the write RPC’s and attempt to judge what the client is intending to perform as an action. For example, a storage system with 300 “creates” that then performs 1200 reads and five writes is probably spending much of its time opening files, not creating them. Logic dictates that a client would open a file to read from it and not create a 0-byte file and read emptiness, which just doesn’t make much sense. Tracking fifteen minute intervals of statistics on your storage device, over a 24-hour period, will give you a bit of understanding as to what RPC’s are heavily used in the environment (a 48-hour sample will yield even more detailed results.) Taking a bit of time to read into the intentions of Create ANDX and try to understand how your clients are using the storage environment, are they opening files or are they creating files? Just as creating files on storage systems is a more intensive process compared to the simple open action, the same can be said for the F5 ARX. The ARX will also track metadata for newly created files for its virtualization layer and the beefier the ARX hardware, the more file creations can be done in a short interval of time. Remember, while it’s interesting and often times impressive to know just how many files are virtualized behind an F5 ARX or sitting on your storage environment, it’s much more interesting when you know how many are actually actively accessed. With a handful of applications, multiple protocols, dozens of RPC’s, hundreds of clients and several petabytes of information, do you know how your files are accessed?241Views0likes0CommentsThe IPv6 Application Integration Factor
#IPv6 Integration with partners, suppliers and cloud providers will make migration to IPv6 even more challenging than we might think… My father was in the construction business most of the time I was growing up. He used to joke with us when we were small that there was a single nail in every house that – if removed – would bring down the entire building. Now that’s not true in construction, of course, but when the analogy is applied to IPv6 it may be more true than we’d like to think, especially when that nail is named “integration”. Most of the buzz around IPv6 thus far has been about the network; it’s been focused on getting routers, switches and application delivery network components supporting the standard in ways that make it possible to migrate to IPv6 while maintaining support for IPv4 because, well, we aren’t going to turn the Internet off for a day in order to flip from IPv4 to IPv6. Not many discussions have asked the very important question: “Are your applications ready for IPv6?” It’s been ignored so long that many, likely, are not even sure about what that might mean let alone what they need to do to ready their applications for IPv6. IT’S the INTEGRATION The bulk of issues that will need to be addressed in the realm of applications when the inevitable migration takes off is in integration. This will be particularly true for applications integrating with cloud computing services. Whether the integration is at the network level – i.e. cloud bursting – or at the application layer – i.e. integration with SaaS such as Salesforce.com or through PaaS services – once a major point of integration migrates it will likely cause a chain reaction, forcing enterprises to migrate whether they’re ready or not. Consider for example, that cloud bursting, assumes a single, shared application “package” that can be pushed into a cloud computing environment as a means to increase capacity. If – when – a cloud computing provider decides to migrate to IPv6 this process could become a lot more complicated than it is today. Suddenly the “package” that assumed IPv4 internal to the corporate data center must assume IPv6 internal to the cloud computing provider. Reconfiguration of the OS, platform and even application layer becomes necessary for a successful migration. Enterprises reliant on SaaS for productivity and business applications will likely be first to experience the teetering of the house of (integration) cards. Enterprises are moving to the cloud, according to Yankee Group’s 2011 US FastView: Cloud Computing Survey. Approximately 48 percent of the respondents said remote/mobile user connectivity is driving the enterprises to deploy software as a service. This is significant as there is a 92 percent increase over 2010. Around 38 percent of enterprises project the deployment of over half of their software applications on a cloud platform within three years compared to 11 percent today, Yankee Group said in its “2011 Fast View Survey: Cloud Computing Motivations Evolve to Mobility and Productivity.” -- Enterprise SaaS Adoption Almost Doubles in 2011: Yankee Group Survey Enterprise don’t just adopt SaaS and cloud services, they integrate them. Data stored in cloud-hosted software is invaluable to business decision makers but first must be loaded – integrated – into the enterprise-deployed systems responsible for assisting in analysis of that data. Secondary integration is also often required to enable business processes to flow naturally between on- and off-premise deployed systems. It is that integration that will likely first be hit by a migration on either side of the equation. If the enterprise moves first, they must address the challenge of integrating two systems that speak incompatible network protocol versions. Gateways and dual-stack strategies – even potentially translators – will be necessary to enable a smooth transition regardless of who blinks first in the migratory journey toward IPv6 deployment. Even that may not be enough. Peruse RFC 4038, “Application Aspects of IPv6 Transition”, and you’ll find a good number of issues that are going to be as knots in wood to a nail including DNS, conversion functions between hostnames and IP addresses (implying underlying changes to development frameworks that would certainly need to be replicated in PaaS environments which, according to a recent report from Gartner, indicates a 267% increase in inquiries regarding PaaS this year alone), and storage of IP addresses – whether for user identification, access policies or integration purposes. Integration is the magic nail; the one item on the migratory checklist that is likely to make or break the success of IPv6 migration. It’s also likely to be the “thing” that forces organizations to move faster. As partners, sources and other integrated systems make the move it may cause applications to become incompatible. If one environment chooses an all or nothing strategy to migration, its integrated partners may be left with no option but to migrate and support IPv6 on a timeline not their own. TOO TIGHTLY COUPLED While the answer for IPv6 migration is generally accepted to be found in a dual-stack approach, the same cannot be said for Intercloud application mobility. There’s no “dual stack” in which services aren’t tightly coupled to IP address, regardless of version, and no way currently to depict an architecture without relying heavily on topological concepts such as IP. Cloud computing – whether IaaS or PaaS or SaaS – is currently entrenched in a management and deployment system that tightly couples IP addresses to services. Integration relying upon those services, then, becomes heavily reliant on IP addresses and by extension IP, making migration a serious challenge for providers if they intend to manage both IPv4 and IPv6 customers at the same time. But eventually, they’ll have to do it. Some have likened the IPv4 –> IPv6 transition as the network’s “Y2K”. That’s probably apposite but incomplete. The transition will also be as challenging for the application layers as it will for the network, and even more so for the providers caught between two versions of a protocol upon which so many integrations and services rely. Unlike Y2K we have no deadline pushing us to transition, which means someone is going to have to be the one to pull the magic nail out of the IPv4 house and force a rebuilding using IPv6. That someone may end up being a cloud computing provider as they are likely to have not only the impetus to do so to support their growing base of customers, but the reach and influence to make the transition an imperative for everyone else. IPv6 has been treated as primarily a network concern, but because applications rely on the network and communication between IPv4 and IPv6 without the proper support is impossible, application owners will need to pay more attention to the network as the necessary migration begins – or potentially suffer undesirable interruption to services.280Views0likes0CommentsThe Many Faces of DDoS: Variations on a Theme or Two
Many denial of service attacks boil down to the exploitation of how protocols work and are, in fact, very similar under the hood. Recognizing these themes is paramount to choosing the right solution to mitigate the attack. When you look across the “class” of attacks used to perpetrate a denial of service attack you start seeing patterns. These patterns are important in determining what resources are being targeted because it provides the means to implement solutions that mitigate the consumption of those resources while under an attack. Once you recognize the underlying cause of a service outage due to an attack you can enact policies and solutions that mitigate that root cause, which better serves to protect against the entire class of attacks rather than employing individual solutions that focus on specific attack types. This is because attacks are constantly evolving, and the attacks solutions protect against today will certainly morph into a variation on that theme, and solutions that protect against specific attacks rather than addressing the root cause will not necessarily be capable of defending against those evolutions. In general, there are two types of denial of service attacks: those that target the network layers and those that target the application layer. And of course as we’ve seen this past week or so, attackers are leveraging both types simultaneously to exhaust resources and affect outages across the globe. NETWORK DoS ATTACKS Network-focused DoS attacks often take advantage of the way network protocols work innately. There’s nothing wrong with the protocols, no security vulnerabilities, nada. It’s just the way they behave and the inherent trust placed in the communication that takes place using these protocols. Still others simply attempt to overwhelm a single host with so much traffic that it falls over. Sometimes successful, other times it turns out the infrastructure falls over before the individual host and results in more a disruption of service than a complete denial, but with similar impact to the organization and customers. SYN FLOOD A SYN flood is an attack against a system for the purpose of exhausting that system’s resources. An attacker launching a SYN flood against a target system attempts to occupy all available resources used to establish TCP connections by sending multiple SYN segments containing incorrect IP addresses. Note that the term SYN refers to a type of connection state that occurs during establishment of a TCP/IP connection. More specifically, a SYN flood is designed to fill up a SYN queue. A SYN queue is a set of connections stored in the connection table in the SYN-RECEIVED state, as part of the standard three-way TCP handshake. A SYN queue can hold a specified maximum number of connections in the SYN-RECEIVED state. Connections in the SYN-RECEIVED state are considered to be half-open and waiting for an acknowledgement from the client. When a SYN flood causes the maximum number of allowed connections in the SYN-RECEIVED state to be reached, the SYN queue is said to be full, thus preventing the target system from establishing other legitimate connections. A full SYN queue therefore results in partially-open TCP connections to IP addresses that either do not exist or are unreachable. In these cases, the connections must reach their timeout before the server can continue fulfilling other requests. ICMP FLOOD (Smurf) The ICMP flood, sometimes referred to as a Smurf attack, is an attack based on a method of making a remote network send ICMP Echo replies to a single host. In this attack, a single packet from the attacker goes to an unprotected network’s broadcast address. Typically, this causes every machine on that network to answer with a packet sent to the target. UDP FLOOD The UDP flood attack is most commonly a distributed denial-of-service attack (DDoS), where multiple remote systems are sending a large flood of UDP packets to the target. UDP FRAGMENT The UDP fragment attack is based on forcing the system to reassemble huge amounts of UDP data sent as fragmented packets. The goal of this attack is to consume system resources to the point where the system fails. PING of DEATH The Ping of Death attack is an attack with ICMP echo packets that are larger than 65535 bytes. Since this is the maximum allowed ICMP packet size, this can crash systems that attempt to reassemble the packet. NETWORK ATTACK THEME: FLOOD The theme with network-based attacks is “flooding”. A target is flooded with some kind of traffic, forcing the victim to expend all its resources on processing that traffic and, ultimately, becoming completely unresponsive. This is the traditional denial of service attack that has grown into distributed denial of service attacks primarily because of the steady evolution of web sites and applications to handle higher and higher volumes of traffic. These are also the types of attacks with which most network and application components have had long years of experience with and are thus well-versed in mitigating. APPLICATION DoS ATTACKS Application DoS attacks are becoming the norm primarily because we’ve had years of experience with network-based DoS attacks and infrastructure has come a long way in being able to repel such attacks. That and Moore’s Law, anyway. Application DoS attacks are likely more insidious simply because like their network-based counterparts they take advantage of application protocol behaviors but unlike their network-based counterparts it requires far fewer clients to overwhelm a host. This is part of the reason application-based DoS attacks are so hard to detect – because there are fewer clients necessary (owing to the large chunks of resources consumed by a single client) they don’t fit the “blast” pattern that is so typical of a network-based DoS. It can take literally millions of ICMP requests to saturate a host and its network, but it requires only tens of thousands of requests to consume the resources of an application host such that it becomes unreliable and unavailable. And given the ubiquitous nature of HTTP – over which most of these attacks are perpetrated – and the relative ease with which it is possible to hijack unsuspecting browsers and force their participation in such an attack – an attack can be in progress and look like nothing more than a “flash crowd” – a perfectly acceptable and in many industries desirable event. A common method of attack involves saturating the target (victim) machine with external communications requests, so that the target system cannot respond to legitimate traffic, or responds so slowly as to be rendered effectively unavailable. In general terms, DoS attacks are implemented by forcing the targeted computer to reset, or by consuming its resources so that it can no longer provide its intended service, or by obstructing the communication media between the intended users and the victim so that they can no longer communicate adequately. HTTP GET FLOOD An HTTP GET flood is as exactly as it sounds: it’s a massive influx of legitimate HTTP GET requests that come from large numbers of users, usually connection-oriented bots. These requests mimic legitimate users and are nearly impossible for applications and even harder for traditional security components to detect. This result of this attack is similar to the effect: server errors, increasingly degraded performance, and resource exhaustion. This attack is particularly dangerous to applications deployed in cloud-based environments (public or private) that are enabled with auto-scaling policies, as the system will respond to the attack by launching more and more instances of the application. Limits must be imposed on auto-scaling policies to ensure the financial impact of an HTTP GET flood does not become overwhelming. SLOW LORIS Slowloris consumes resources by “holding” connections open by sending partial HTTP requests. It subsequently sends headers at regular intervals to keep the connections from timing out or being closed due to lack of activity. This causes resources on the web /application servers to remain dedicated to the clients attacking and keeps them unavailable for fulfilling legitimate requests. SLOW HTTP POST A slow HTTP Post is a twist on Slow Loris in which the client sends POST headers with a legitimate content-length. After the headers are sent the message body is transmitted at slow speed, thus tying up the connection (server resources) for long periods of time. A relatively small number of clients performing this attack can effectively consume all resources on the web / application server and render it useless to legitimate users. APPLICATION ATTACK THEME: SLOW Notice a theme, here? That’s because clients can purposefully (and sometimes inadvertently) affect a DoS on a service simply by filling its send/receive queues slowly. The reason this works is similar to the theory behind SYN flood attacks, where all available queues are filled and thus render the server incapable of accepting/responding until the queues have been emptied. Slow pulls or pushes of content keep data in the web/application server queue and thus “tie up” the resources (RAM) associated with that queue. A web/application server has only so much RAM available to commit to queues, and thus a DoS can be affected simply by using a small number of v e r y slow clients that do little other than tie up resources with what are otherwise legitimate interactions. While the HTTP GET flood (page flood) is still common (and works well) the “slow” variations are becoming more popular because they require fewer clients to be successful. Fewer clients makes it harder for infrastructure to determine an attack is in progress because historically flooding using high volumes of traffic is more typical of an attack and solutions are designed to recognize such events. They are not, however, generally designed to recognize what appears to be a somewhat higher volume of very slow clients as an attack. THEMES HELP POINT to a SOLUTION Recognizing the common themes underlying modern attacks are helpful in detecting the attack and subsequently determining what type of solution is necessary to mitigate such an attack. In the case of flooding, high-performance security infrastructure and policies regarding transaction rates coupled with rate shaping based on protocols can mitigate attacks. In the case of slow consumption of resources, it is generally necessary to leverage a high-capacity intermediary that essentially shields the web/application servers from the impact of such requests, coupled with emerging technology that enables a context-aware solution better detect such attacks and then act upon that knowledge to reject them. When faced with a new attack type, it is useful to try to determine the technique behind the attack – regardless of implementation – as it can provide the clues necessary to implement a solution and address the attack before it can impact the availability and performance of web applications. It is important to recognize that solutions only mitigate denial of service attacks. They cannot prevent them from occurring. What We Learned from Anonymous: DDoS is now 3DoS There Is No Such Thing as Cloud Security Jedi Mind Tricks: HTTP Request Smuggling When Is More Important Than Where in Web Application Security Defeating Attacks Easier Than Detecting Them The Application Delivery Spell Book: Contingency What is a Strategic Point of Control Anyway? Layer 4 vs Layer 7 DoS Attack Putting a Price on Uptime Why Single-Stack Infrastructure Sucks425Views0likes3CommentsNew TCP vulnerability about trust, not technology
I read about a "new" TCP flaw that, according to C|Net News, Related Posts puts Web sites at risk. There is very little technical information available; the researchers who discovered this tasty TCP tidbit canceled a conference talk on the subject and have been sketchy about the details of the flaw when talking publicly. So I did some digging and ran into a wall of secrecy almost as high as the one Kaminsky placed around the DNS vulnerability. Layer 4 vs Layer 7 DoS Attack The Unpossible Task of Eliminating Risk Soylent Security So I hit Twitter and leveraged the simple but effective power of asking for help. Which resulted in several replies, leading me to Fyodor and an April 2000 Bugtraq entry. The consensus at this time seems to be that the wall Kaminsky built was for good reason, but this one? No one's even trying to ram it down because it doesn't appear to be anything new. Which makes the "oooh, scary!" coverage by mainstream and trade press almost amusing and definitely annoying. The latest 'exploit' appears to be, in a nutshell, a second (or more) discovery regarding the nature of TCP. It appears to exploit the way in which TCP legitimizes a client. In that sense the rediscovery (I really hesitate to call it that, by the way) is on par with Kaminsky's DNS vulnerability simply because the exploit appears to be about the way in the protocol works, and not any technical-based vulnerability like a buffer overflow. TCP and applications riding atop TCP inherently trust any client that knocks on the door (SYN) and responds correctly (ACK) when TCP answers the door (SYN ACK). It is simply the inherent trust of the TCP handshake as validation of the legitimacy of a client that makes these kinds of attacks possible. But that's what makes the web work, kids, and it's not something we should be getting all worked up about. Really, the headlines should read more like "Bad people could misuse the way the web works. Again." This likely isn't about technology, it's about trust, and the fact that the folks who wrote TCP never thought about how evil some people can be and that they'd take advantage of that trust and exploit it. Silly them, forgetting to take into account human nature when writing a technical standard. If they had, however, we wouldn't have the Internet we have today because the trust model on the Web would have to be "deny everything, trust no one" rather than "trust everyone unless they prove otherwise." So is the danger so great as is being portrayed around the web? I doubt it, unless the researchers have stumbled upon something really new. We've known about these kinds of attacks for quite some time now. Changing the inherent nature of TCP isn't something likely to happen anytime soon, but contrary to the statements made regarding there being no workarounds or solutions to these problem, there are plenty of solutions that address these kinds of attacks. I checked in with our engineers, just in case, and got the low-down on how BIG-IP handles this kind of a situation and, as expected, folks with web sites and applications being delivered via a BIG-IP really have no reason to be concerned about the style of attack described by Fyodor. If it turns out there's more to this vulnerability, then I'll check in again. But until then, I'm going to join the rest of the security world and not worry much about this "new" attack. In the end, it appears that the researchers are not only exploiting the trust model of TCP, they're exploiting the trust between people; the trust that the press has in "technology experts" to find real technical vulnerabilities and the trust that folks have in the trade press to tell them about it. That kind of exploitation is something that can't be addressed with technology. It can't be fixed by rewriting a TCP stack, and it certainly can't be patched by any vendor.204Views0likes2CommentsHTTP: The de facto application transport protocol of the Web
When the OSI defined its model it included a transport layer which was supposed to handle end-to-end connections and address communication reliability. In the early days of the web HTTP sat at the application layer (layer 7) and rode atop TCP, its transport layer. An interesting thing happened on the way to the 21st century; HTTP became an application transport layer. Many web applications today use HTTP to transport other application protocols such as JSON and SOAP and RSS. Applications now "speak" using a variety of languages to communicate, but underlying them all is HTTP. This is not the same as tunneling a different application through port 80 simply because almost all HTTP traffic flows through that port and it is therefore likely to be open on the corporate firewall. Those applications that simply tunnel through port 80 use TCP and their own application layer protocols, they're essentially just pretending to be HTTP by using the same port to fool firewalls into allowing their traffic to pass unhindered. No, this is different. This is the use of HTTP to wrap other application protocols and transport them. The web server interprets the HTTP and handles sessions and cookies and parameters, but another application is required to interpret the messages contained within because they represent the protocol of yet another application. In today's world the availability of exponentially expanding collaboration and syndication applications, all requiring different applications, is driving the need for smarter application delivery solutions to ensure availability, reliability, and scalability. Simple layer 4 (TCP) load balancing is not enough, neither is load balancing based on layer 7 (HTTP). Load balancing requests based on TCP or HTTP doesn't address the need to distribute application requests because the app is no longer HTTP, it's something else entirely. HTTP has been relegated to the status of application transport protocol, and that means in order to intelligently deliver an application we have to dig even deeper than layer 7. We've got to get inside. The problem is, of course, that there are no standards beyond HTTP. My JSON-based Web 2.0 application looks nothing like your SOAP-based Web 2.0 application. And yet a single solution must be able to adapt to those differences and provide the same level of scalability and reliability for me as it does you. It has to be extensible. It has to provide some mechanism for adding custom behavior and addressing the specific needs of application protocols that are unknown at the time the solution is created. This is an important facet of application delivery that is often overlooked. Applications aren't about HTTP anymore, they're about undefined and unknowable protocols. An application delivery solution can't distribute application load across servers unless it can understand which application it's supposed to be managing. And because HTTP connections are artificially limited by browsers, multiple application protocols are using the same HTTP connections over which to exchange data. That means an application delivery solution has to be able to dig into the application protocol and figure out where that request should be directed, and how to treat it, and what policies to apply. Application delivery today is about the message, not the protocol, and the message is undefined until it's created by a developer. There's a lot of traffic out there that's just HTTP, as it was conceived of and implemented years ago. But there's a growing amount of traffic out there that's more than HTTP, that's relegated this ubiquitous protocol to an application transport layer protocol and uses it as such to deliver custom applications that use protocols without RFCs, without standards bodies, without the W3C. If your application delivery solution doesn't offer a way that easily allows you to dig into the real application protocols, but instead relegates you to making load balancing and routing decisions based solely on HTTP, you need to reconsider your solution. HTTP is the de facto application transport protocol today, but because it's so often used this way we have to get smarter about how we load balance and distribute those messages riding on HTTP if we want to architect smarter, greener, more efficient architectures. Imbibing: Coffee222Views0likes1CommentDo you control your application network stack? You should.
Owning the stack is important to security, but it’s also integral to a lot of other application delivery functions. And in some cases, it’s downright necessary. Hoff rants with his usual finesse in a recent posting with which I could not agree more. Not only does he point out the wrongness of equating SaaS with “The Cloud”, but points out the importance of “owning the stack” to security. Those that have control/ownership over the entire stack naturally have the opportunity for much tighter control over the "security" of their offerings. Why? because they run their business and the datacenters and applications housed in them with the same level of diligence that an enterprise would. They have context. They have visibility. They have control. They have ownership of the entire stack. Owning the stack has broader implications than just security. The control, visibility, and context-awareness implicit in owning the stack provides much more flexibility in all aspects covering the delivery of applications. Whether we’re talking about emerging or traditional data center architectures the importance of owning the application networking stack should not be underestimated. The arguments over whether virtualized application delivery makes more sense in a cloud computing- based architecture fail to recognize that a virtualized application delivery network forfeits that control over the stack. While it certainly maintains some control at higher levels, it relies upon other software – the virtual machine, hypervisor, and operating system – which shares control of that stack and, in fact, processes all requests before it reaches the virtual application delivery controller. This is quite different from a hardened application delivery controller that maintains control over the stack and provides the means by which security, network, and application experts can tweak, tune, and exert that control in myriad ways to better protect their unique environment. If you don’t completely control layer 4, for example, how can you accurately detect and thus prevent layer 4 focused attacks, such as denial of service and manipulation of the TCP stack? You can’t. If you don’t have control over the stack at the point of entry into the application environment, you are risking a successful attack. As the entry point into application, whether it’s in “the” cloud, “a” cloud, or a traditional data center architecture, a properly implemented application delivery network can offer the control necessary to detect and prevent myriad attacks at every layer of the stack, without concern that an OS or hypervisor-targeted attack will manage to penetrate before the application delivery network can stop it. The visibility, control, and contextual awareness afforded by application delivery solutions also allows the means by which finer-grained control over protocols, users, and applications may be exercised in order to improve performance at the network and application layers. As a full proxy implementation these solutions are capable of enforcing compliance with RFCs for protocols up and down the stack, implement additional technological solutions that improve the efficiency of TCP-based applications, and offer customized solutions through network-side scripting that can be used to immediately address security risks and architectural design decisions. The importance of owning the stack, particularly at the perimeter of the data center, cannot and should not be underestimated. The loss of control, the addition of processing points at which the stack may be exploited, and the inability to change the very behavior of the stack at the point of entry comes from putting into place solutions incapable of controlling the stack. If you don’t own the stack you don’t have control. And if you don’t have control, who does?252Views0likes0CommentsThe ironic truth about the ugly truth about web application acceleration
Lately I've been seeing quite a few links to a white paper popping up in my alerts and feed-reader. Regardless of who's linking to it, it generally reads as promising to reveal some grand secret about how web application acceleration is an epic failure. I finally gave in and clicked on a link and ended up directed to download a white-paper, the description for which essentially distilled "web application acceleration" down to "caching". And then promised to tell me why caching wasn't a good way to accelerate web applications. I didn't download the white paper primarily because equating "web application acceleration" to "caching" is exceedingly shortsighted and intimated that the actual content of the paper was most likely equally astigmatic. While caching is certainly one mechanism used by web application acceleration solutions, it's not the only one. (I also found it ironic (hence the title) to see a paper denigrating the use of caches being promoted by a vendor for whose technology caching, albeit at the byte and object level, is an underlying foundation. Oh, none of us with these kinds of solutions call it that because it needs a snazzy marketing name, but that's the core of what it really is. Anyone who claims otherwise is just (desperately) trying to sell you something. Go ahead, ask me about how we implement that particular technology. Part of the answer is "caching at the byte and network level". I could continue this parenthetical aside further and say that what these guys are accelerating in the end is applications, so technically they're a (web) application acceleration solution, too, (So aren't they really pointing out their own faults then?) but that's probably better left as another post for another day. (Man, that was almost as bad as programming in LISP, but now I really digress.)) In any case, the truth is that caching is only one of the technologies used by web application acceleration to improve performance of web applications. Otherwise we'd call just them caches, wouldn't we? Other technologies used by web application acceleration include: Protocol optimization You would be amazed (or maybe you wouldn't) at the number of enhancements to protocols like TCP are available that improve the performance of web (and other applications riding on the protocol) applications. Web application acceleration solutions often implement these enhancements. Compression Fewer packets means faster transmission and one of the ways in which you turn big fat web application messages into fewer packets is to compress them. This is especially true for HTML and XML and other web application formats like JSON which are text-based and are therefore highly compressable. Sometimes the answer to improving performance is not to compress, and web application acceleration solutions need to be intelligent enough to determine when to use this technology. Linearization of PDF documents Waiting for a PDF to load is one of the most painful experiences on the web. Technology that linearizes PDF documents and allows you to start reading page one while pages two through, well, whatever reduces the pain of that transfer. It still takes as long, but at least you aren't wasting your time waiting for it load. Leveraging the browser cache Caching often implies a server-side technology but let's not forget that there's a cache in every browser. Unfortunately, it's often not leveraged to its fullest potential. Web application acceleration technology improves the use of the browser cache to improve performance, because the fastest way to load content is to get it locally if it's fresh and available. Content spooling How fast a server can dish out content to a client is highly dependent upon the size of the network pipe (usually not a problem), number of concurrent users, number of open connections, distribution of resources being requested, and the total load and memory utilization on the server. Web application acceleration solutions, at least good ones, suck the content from the server as fast as possible and then serve it up to the browser. This reduces the number of open connections and concurrent users which reduces the load and resource consumption on the server and thus improves the speed at which the server can dish out content. Which makes the application faster. SSL acceleration It's forever been true that encryption and decryption is slow, and can be dramatically improved through the use of hardware assisted acceleration. Web application acceleration solutions generally employ this technique to improve the performance of secure web applications delivered via SSL. I could go on about how web application acceleration solutions further leverage technology like TCP multiplexing to improve performance by eliminating the overhead associated with the constant opening and closing of TCP sessions, which in turns makes servers more efficient which in turn makes them faster resulting in ... faster application delivery. But I think that's enough for one day. The truth is that web application acceleration technologies comprise more than just caching and employ multiple techniques that improve the efficiency of servers and the performance of applications whether they are delivered over the LAN, the WAN, or the Internet.182Views0likes0Comments