The Order of (Network) Operations
Thought those math rules you learned in 6 th grade were useless? Think again…some are more applicable to the architecture of your data center than you might think. Remember back when you were in the 6 th grade, learning about the order of operations in math class? You might recall that you learned that the order in which mathematical operators were applied can have a significant impact on the result. That’s why we learned there’s an order of operations – a set of rules – that we need to follow in order to ensure that we always get the correct answer when performing mathematical equations. Rule 1: First perform any calculations inside parentheses. Rule 2: Next perform all multiplications and divisions, working from left to right. Rule 3: Lastly, perform all additions and subtractions, working from left to right. Similarly, the order in which network and application delivery operations are applied can dramatically impact the performance and efficiency of the delivery of applications – no matter where those applications reside.358Views0likes1CommentDeduplication and Compression – Exactly the same, but different.
One day many years ago, Lori and I’s oldest son held up two sheets of paper and said “These two things are exactly the same, but different!” Now, he’s a very bright individual, he was just young, and didn’t even get how incongruous the statement was. We, being a fun loving family that likes to tease each other on occasion, we of course have not yet let him live it down. It was honestly more than a decade ago, but all is fair, he doesn’t let Lori live down something funny that she did before he was born. It is all in good fun of course. Why am I bringing up this family story? Because that phrase does come to mind when you start talking about deduplication and compression. Highly complimentary and very similar, they are pretty much “Exactly the same, but different”. Since these technologies are both used pretty heavily in WAN Optimization, and are growing in use on storage products, this topic intrigued me. To get this out of the way, at F5, compression is built into the BIG-IP family as a feature of the core BIG-IP LTM product, and deduplication is an added layer implemented over BIG-IP LTM on BIG-IP WAN Optimization Module (WOM). Other vendors have similar but varied (there goes a variant of that phrase again) implementation details. Before we delve too deeply into this topic though, what caught my attention and started me pondering the whys of this topic was that F5’s deduplication is applied before compression, and it seems that reversing the order changes performance characteristics. I love a good puzzle, and while the fact that one should come before the other was no surprise, I started wanting to know why the order it was, and what the impact of reversing them in processing might be. So I started working to understand the details of implementation for these two technologies. Not understand them from an F5 perspective, though that is certainly where I started, but try to understand how they interact and compliment each other. While much of this discussion also applies to in-place compression and deduplication such as that used on many storage devices, some of it does not, so assume that I am talking about networking, specifically WAN networking, throughout this blog. At the very highest level, deduplication and compression are the same thing. They both look for ways to shrink your dataset before passing it along. After that, it gets a bit more complex. If it was really that simple, after all, we wouldn’t call them two different things. Well, okay, we might, IT has a way of having competing standards, product categories, even jobs that we lump together with the same name. But still, they wouldn’t warrant two different names in the same product like F5 does with BIG-IP WOM. The thing is that compression can do transformations to data to shrink it, and it also looks for small groupings of repetitive byte patterns and replaces them, while deduplication looks for larger groupings of repetitive byte patterns and replaces them. In the implementation you’ll see on BIG-IP WOM, deduplication looks for larger byte patterns repeated across all streams, while compression applies transformations to the data, and when removing duplication only looks for smaller combinations on a single stream. The net result? The two are very complimentary, but if you run compression before deduplication, it will find a whole collection of small repeating byte patterns and between that and transformations, deduplication will find nothing, making compression work harder and deduplication spin its wheels. There are other differences – because deduplication deals with large runs of repetitive data (I believe that in BIG-IP the minimum size is over a K), it uses some form of caching to hold patterns that duplicates can match, and the larger the caching, the more strings of bytes you have to compare to. This introduces some fun around where the cache should be stored. In memory is fast, but limited in size, on flash disk is fast and has a greater size, but is expensive, and on disk is slow but has a huge advantage in size. Good deduplication engines can support all three and thus are customizable to what your organization needs and can afford. Some workloads just won’t benefit from one, but will get a huge benefit from the other. The extremes are good examples of this phenomenon – if you have a lot of in-the-stream repetitive data that is too small for deduplication to pick up, and little or no cross-stream duplication, then deduplication will be of limited use to you, and the act of running through the dedupe engine might actually degrade performance a negligible amount – of course, everything is algorithm dependent, so depending upon your vendor it might degrade performance a large amount also. On the other extreme, if you have a lot of large byte count duplication across streams, but very little within a given stream, deduplication is going to save your day, while compression will, at best, offer you a little benefit. So yes, they’re exactly the same from the 50,000 foot view, but very very different from the benefits and use cases view. And they’re very complimentary, giving you more bang for the buck.296Views0likes1CommentSelective Compression on BIG-IP
BIG-IP provides Local Traffic Policies that simplify the way in which you can manage traffic associated with a virtual server. You can associate a BIG-IP local traffic policy to support selective compression for types of content that can benefit from compression, like HTML, XML, and CSS stylesheets. These file types can realize performance improvements, especially across slow connections, by compressing them. You can easily configure your BIG-IP system to use a simple Local Traffic Policy that selectively compresses these file types. In order to use a policy, you will want to create and configure a draft policy, publish that policy, and then associate the policy with a virtual server in BIG-IP v12. Alright, let’s log into a BIG-IP The first thing you’ll need to do is create a draft policy. On the main menu select Local Traffic>Policies>Policy List and then the Create or + button. This takes us to the create policy config screen. We’ll name the policy SelectiveCompression, add a description like ‘This policy compresses file types,’ and we’ll leave the Strategy as the default of Execute First matching rule. This is so the policy uses the first rule that matches the request. Click Create Policy which saves the policy to the policies list. When saved, the Rules search field appears but has no rules. Click Create under Rules. This brings us to the Rules General Properties area of the policy. We’ll give this rule a name (CompressFiles) and then the first settings we need to configure are the conditions that need to match the request. Click the + button to associate file types. We know that the files for compression are comprised of specific file types associated with a content type HTTP Header. We choose HTTP Header and select Content-Type in the Named field. Select ‘begins with’ next and type ‘text/’ for the condition and compress at the ‘response’ time. We’ll add another condition to manage CPU usage effectively. So we click CPU Usage from the list with a duration of 1 minute with a conditional operator of ‘less than or equal to’ 5 as the usage level at response time. Next under Do the following, click the create + button to create a new action when those conditions are met. Here, we’ll enable compression at the response time. Click Save. Now the draft policy screen appears with the General Properties and a list of rules. Here we want to click Save Draft. Now we need to publish the draft policy and associate it with a virtual server. Select the policy and click Publish. Next, on the main menu click Local Traffic>Virtual Servers>Virtual Server List and click the name of the virtual server you’d like to associate for the policy. On the menu bar click Resources and for Policies click Manage. Move SelectiveCompression to the Enabled list and click Finished. The SelectiveCompression policy is now listed in the policies list which is now associated with the chosen virtual server. The virtual server with the SelectiveCompression Local Traffic Policy will compress the file types you specified. Congrats! You’ve now added a local traffic policy for selective compression! You can also watch the full video demo thanks to our TechPubs team. ps978Views0likes7CommentsTrue or False: Application acceleration solutions teach developers to write inefficient code
It has been suggested that the use of application acceleration solutions as a means to improve application performance would result in programmers writing less efficient code. In a comment on “The House that Load Balancing Built” a reader replies: Not only will it cause the application to grow in cost and complexity, it's teaching new and old programmers to not write efficient code and rely on other products and services on [sic] thier behalf. I.E. Why write security into the app, when the ADC can do that for me. Why write code that executes faster, the ADC will do that for me, etc., etc. While no one can control whether a programmer writes “fast” code, the truth is that application acceleration solutions do not affect the execution of code in any way. A poorly constructed loop will run just as slow with or without an application acceleration solution in place. Complex mathematical calculations will execute with the same speed regardless of the external systems that may be in place to assist in improving application performance. The answer is, unequivocally, that the presence or lack thereof of an application acceleration solution should have no impact on the application developer because it does nothing to affect the internal execution of written code. If you answered false, you got the answer right. The question has to be, then, just what does an application acceleration solution do that improves performance? If it isn’t making the application logic execute faster, what’s the point? It’s a good question, and one that deserves an answer. Application acceleration is part of a solution we call “application delivery”. Application delivery focuses on improving application performance through optimization of the use and behavior of transport (TCP) and application transport (HTTP/S) protocols, offloading certain functions from the application that are more efficiently handled by an external often hardware-based system, and accelerating the delivery of the application data. OPTIMIZATION Application acceleration improves performance by understanding how these protocols (TCP, HTTP/S) interact across a WAN or LAN and acting on that understanding to improve its overall performance. There are a large number of performance enhancing RFCs (standards) around TCP that are usually implemented by application acceleration solutions. Delayed and Selective Acknowledgments (RFC 2018) Explicit Congestion Notification (RFC 3168) Limited and Fast Re-Transmits (RFC 3042 and RFC 2582) Adaptive Initial Congestion Windows (RFC 3390) Slow Start with Congestion Avoidance (RFC 2581) TCP Slow Start (RFC 3390) TimeStamps and Windows Scaling (RFC 1323) All of these RFCs deal with TCP and therefore have very little to do with the code developers create. Most developers code within a framework that hides the details of TCP and HTTP connection management from them. It is the rare programmer today that writes code to directly interact with HTTP connections, and even rare to find one coding directly at the TCP socket layer. The execution of code written by the developer takes just as long regardless of the implementation or lack of implementation of these RFCs. The application acceleration solution improves the performance of the delivery of the application data over TCP and HTTP which increases the performance of the application as seen from the user’s point of view. OFFLOAD Offloading compute intensive processing from application and web servers improves performance by reducing the consumption of CPU and memory required to perform those tasks. SSL and other encryption/decryption functions (cookie security, for example) are computationally expensive and require additional CPU and memory on the server. The reason offloading these functions to an application delivery controller or stand-alone application acceleration solution improves application performance is because it frees the CPU and memory available on the server and allows it to be dedicated to the application. If the application or web server does not need to perform these tasks, it saves CPU cycles that would otherwise be used to perform them. Those cycles can be used by the application and thus increases the performance of the application. Also beneficial is the way in which application delivery controllers manage TCP connections made to the web or application server. Opening and closing TCP connections takes time, and the time required is not something a developer – coding within a framework – can affect. Application acceleration solutions proxy connections for the client and subsequently reduce the number of TCP connections required on the web or application server as well as the frequency with which those connections need to be open and closed. By reducing the connections and frequency of connections the application performance is increased because it is not spending time opening and closing TCP connections, which are necessarily part of the performance equation but not directly affected by anything the developer does in his or her code. The commenter believes that an application delivery controller implementation should be an afterthought. However, the ability of modern application delivery controllers to offload certain application logic functions such as cookie security and HTTP header manipulation in a centralized, optimized manner through network-side scripting can be a performance benefit as well as a way to address browser-specific quirks and therefore should be seriously considered during the development process. ACCELERATION Finally, application acceleration solutions improve performance through the use of caching and compression technologies. Caching includes not just server-side caching, but the intelligent use of the client (usually the browser) cache to reduce the number of requests that must be handled by the server. By reducing the number of requests the server is responding to, the web or application server is less burdened in terms of managing TCP and HTTP sessions and state, and has more CPU cycles and memory that can be dedicated to executing the application. Compression, whether using traditional industry standard web-based compression (GZip) or WAN-focused data de-duplication techniques, decreases the amount of data that must be transferred from the server to the client. Decreasing traffic (bandwidth) results in fewer packets traversing the network which results in quicker delivery to the user. This makes it appear that the application is performing faster than it is, simply because it arrived sooner. Of all these techniques, the only one that could possibly contribute to the delinquency of developers is caching. This is because application acceleration caching features act on HTTP caching headers that can be set by the developer, but rarely are. These headers can also be configured by the web or application server administrator, but rarely are in a way that makes sense because most content today is generated dynamically and is rarely static, even though individual components inside the dynamically generated page may in fact be very static (CSS, JavaScript, images, headers, footers, etc…). However, the methods through which caching (pragma) headers are set is fairly standard and the actual code is usually handled by the framework in which the application is developed, meaning the developer ultimately cannot affect the efficiency of the use of this method because it was developed by someone else. The point of the comment was likely more broad, however. I am fairly certain that the commenter meant to imply that if developers know the performance of the application they are developing will be accelerated by an external solution that they will not be as concerned about writing efficient code. That’s a layer 8 (people) problem that isn’t peculiar to application delivery solutions at all. If a developer is going to write inefficient code, there’s a problem – but that problem isn’t with the solutions implemented to improve the end-user experience or scalability, it’s a problem with the developer. No technology can fix that.247Views0likes4CommentsWILS: How can a load balancer keep a single server site available?
Most people don’t start thinking they need a “load balancer” until they need a second server. But even if you’ve only got one server a “load balancer” can help with availability, with performance, and make the transition later on to a multiple server site a whole lot easier. Before we reveal the secret sauce, let me first say that if you have only one server and the application crashes or the network stack flakes out, you’re out of luck. There are a lot of things load balancers/application delivery controllers can do with only one server, but automagically fixing application crashes or network connectivity issues ain’t in the list. If these are concerns, then you really do need a second server. But if you’re just worried about standing up to the load then a Load balancer for even a single server can definitely give you a boost.421Views0likes2CommentsI am wondering why not all websites enabling this great feature GZIP?
Understanding the impact of compression on server resources and application performance While doing some research on a related topic, I ran across this question and thought “that deserves an answer” because it certainly seems like a no-brainer. If you want to decrease bandwidth – which subsequently decreases response time and improves application performance – turn on compression. After all, a large portion of web site traffic is text-based: CSS, JavaScript, HTML, RSS feeds, which means it will greatly benefit from compression. Typical GZIP compression affords at least a 3:1 reduction in size, with hardware-assisted compression yielding an average of 4:1 compression ratios. That can dramatically affect the response time of applications. As I said, seems like a no-brainer. Here’s the rub: turning on compression often has a negative impact on capacity because it is CPU-bound and under certain conditions can actually cause a degradation in performance due to the latency inherent in compressing data compared to the speed of the network over which the data will be delivered. Here comes the science. IMPACT ON CPU UTILIZATION Compression via GZIP is CPU bound. It requires a lot more CPU than you might think. The larger the file being compressed, the more CPU resources are required. Consider for a moment what compression is really doing: it’s finding all similar patterns and replacing them with representations (symbols, indexes into a table, etc…) to a single instance of the text instead. So it makes sense that the larger a file is, the more resources are required – RAM and CPU – to execute such a process. Of course the larger the file is the more benefit you see from compression in terms of bandwidth and improvement in response time. It’s kind of a Catch-22: you want the benefits but you end up paying in terms of capacity. If CPU and RAM is being chewed up by the compression process then the server can handle fewer requests and fewer concurrent users. You don’t have to take my word for it – there are quite a few examples of testing done on web servers and compression that illustrate the impact on CPU utilization. Measuring the Performance Effects of Dynamic Compression in IIS 7.0 Measuring the Performance Effects of mod_deflate in Apache 2.2 HTTP Compression for Web Applications They all essentially say the same thing; if you’re serving dynamic content (or static content and don’t have local caching on the web server enabled) then there is a significant negative impact on CPU utilization that occurs when enabling GZIP/compression for web applications. Given the exceedingly dynamic nature of Web 2.0 applications, the use of AJAX and similar technologies, and the data-driven world in which we live today, that means there are very few types of applications running on web servers for which compression will not negatively impact the capacity of the web server. In case you don’t (want || have time) to slog through the above articles, here’s a quick recap: File Size Bandwidth decrease CPU utilization increase IIS 7.0 10KB 55% 4x 50KB 67% 20x 100KB 64% 30x Apache 2.2 10KB 55% 4x 50KB 65% 10x 100KB 63% 30x It’s interesting to note that IIS 7.0 and Apache 2.2 mod_deflate have essentially the same performance characteristics. This data falls in line with the aforementioned Intel report on HTTP compression which noted that CPU utilization was increased 25-35% when compression was enabled. So essentially when you enable compression you are trading its benefits – bandwidth reduction, response time improvement – for a reduction in capacity. You’re robbing Peter to pay Paul, because instead of paying for bandwidth you’re paying for more servers to handle the same load. THE MYTH OF IMPROVED RESPONSE TIME One of the reasons you’d want to compress content is to improve response time by decreasing the total number of packets that have to traverse a wire. This is a necessity when transferring content via a WAN, but can actually cause a decrease in performance for application delivery over the LAN. This is because the time it takes to compress the content and then deliver it is actually greater than the time to just transfer the original file via the LAN. The speed of the network over which the content is being delivered is highly relevant to whether compression yields benefits for response time. The increasing consumption of CPU resources as volume increases, too, has a negative impact on the ability of the server to process and subsequently respond, which also means an increase in application response time, which is not the desired result. Maybe you’re thinking “I’ll just get more CPU then. After all, there’s like billion core servers out there, that ought to solve the problem!” Compression algorithms, like FTP, are greedy. FTP will, if allowed, consume as much bandwidth as possible in an effort to transfer data as quickly as possible. Compression will do the same thing to CPU resources: consume as much as it can to perform its task as quickly as possible. Eventually, yes, you’ll find a machine with enough cores to support both compression and capacity needs, but at what cost? It may well have been more financially efficient to invest in a better solution (that also brings additional benefits to the table) than just increasing the size of the server. But hey, it’s your data, you need to do what you need to do. The size of the content, too, has an impact on whether compression will benefit application performance. Consider that the goal of compression is to decrease the number of packets being transferred to the client. Generally speaking, the standard MTU for most network is 1500 bytes because that’s what works best with ethernet and IP. That means you can assume around 1400 bytes per packet available to transfer data. That means if content is 1400 bytes or less, you get absolutely no benefit out of compression because it’s already going to take only one packet to transfer; you can’t really send half-packets, after all, and in some networks packets that are too small can actually freak out some network devices because they’re optimized to handle the large content being served today – which means many full packets. TO COMPRESS OR NOT COMPRESS There is real benefit to compression; it’s part of the core techniques used by both application acceleration and WAN application delivery services to improve performance and reduce costs. It can drastically reduce the size of data and especially when you might be paying by the MB or GB transferred (such as applications deployed in cloud environments) this a very important feature to consider. But if you end up paying for additional servers (or instances in a cloud) to make up for the lost capacity due to higher CPU utilization because of that compression, you’ve pretty much ended up right where you started: no financial benefit at all. The question is not if you should compress content, it’s when and where and what you should compress. The answer to “should I compress this content” almost always needs to be based on a set of criteria that require context-awareness – the ability to factor into the decision making process the content, the network, the application, and the user. If the user is on a mobile device and the size of the content is greater than 2000 bytes and the type of content is text-based and … It is this type of intelligence that is required to effectively apply compression such that the greatest benefits of reduction in costs, application performance, and maximization of server resources is achieved. Any implementation that can’t factor all these variables into the decision to compress or not is not an optimal solution, as it’s just guessing or blindly applying the same policy to all kinds of content. Such implementations effectively defeat the purpose of employing compression in the first place. That’s why the answer to where is almost always “on the load-balancer or application delivery controller”. Not only are such devices capable of factoring in all the necessary variables but they also generally employ specialized hardware designed to speed up the compression process. By offloading compression to an application delivery device, you can reap the benefits without sacrificing performance or CPU resources. Measuring the Performance Effects of Dynamic Compression in IIS 7.0 Measuring the Performance Effects of mod_deflate in Apache 2.2 HTTP Compression for Web Applications The Context-Aware Cloud The Revolution Continues: Let Them Eat Cloud Nerd Rage675Views0likes2CommentsProgrammable Cache-Control: One Size Does Not Fit All
#webperf For addressing challenges related to performance of #mobile devices and networks, caching is making a comeback. It's interesting - and almost amusing - to watch the circle of technology run around best practices with respect to performance over time. Back in the day caching was the ultimate means by which web application performance was improved and there was no lack of solutions and techniques that manipulated caching capabilities to achieve optimal performance. Then it was suddenly in vogue to address the performance issues associated with Javascript on the client. As Web 2.0 ascended and AJAX-based architectures ruled the day, Javascript was Enemy #1 of performance (and security, for that matter). Solutions and best practices began to arise to address when Javascript loaded, from where, and whether or not it was even active. And now, once again, we're back at the beginning with caching. In the interim years, it turns out developers have not become better about how they mark content for caching and with the proliferation of access from mobile devices over sometimes constrained networks, it's once again come to the attention of operations (who are ultimately responsible for some reason for performance of web applications) that caching can dramatically improve the performance of web applications. [ Excuse me while I take a breather - that was one long thought to type. ] Steve Souders, web performance engineer extraordinaire, gave a great presentation at HTML5DevCon that was picked up by High Scalability: Cache is King!. The aforementioned articles notes: Use HTTP cache control mechanisms: max-age, etag, last-modified, if-modified-since, if-none-match, no-cache, must-revalidate, no-store. Want to prevent HTTP sending conditional GET requests, especially over high latency mobile networks. Use a long max-age and change resource names any time the content changes so that it won't be cached improperly. -- Better Browser Caching Is More Important Than No Javascript Or Fast Networks For HTTP Performance The problem is, of course, that developers aren't putting all these nifty-neato-keen tags and meta-data in their content and the cost to modify existing applications to do so may result in a prioritization somewhere right below having an optional, unnecessary root canal. In other cases the way in which web applications are built today - we're still using AJAX-based, real-time updates of chunks of content rather than whole pages - means simply adding tags and meta-data to the HTML isn't necessarily going to help because it refers to the page and not the data/content being retrieved and updated for that "I'm a live, real-time application" feel that everyone has to have today. Too, caching tags and meta-data in HTML doesn't address every type of data. JSON, for example, commonly returned as the response to an API call (used as the building blocks for web applications more and more frequently these days) aren't going to be impacted by the HTML caching directives. That has to be addressed in a different way, either on the server side (think Apache mod_expire) or on the client (HTML5 contains new capabilities specifically for this purpose and there are usually cache directives hidden in AJAX frameworks like jQuery). The Programmable Network to the Rescue What you need is the ability to insert the appropriate tags, on the appropriate content, in such a way as to make sure whatever you're about to do (a) doesn't break the application and (b) is actually going to improve the performance of the end-user experience for that specific request. Note that (b) is pretty important, actually, because there are things you do to content being delivered to end users on mobile devices over mobile networks that might make things worse if you do it to the same content being delivered to the same end user on the same device over the wireless LAN. Network capabilities matter, so it's important to remember that. To avoid rewriting applications (and perhaps changing the entire server-side architecture by adding on modules) you could just take advantage of programmability in the network. When enabled as part of a full-proxy, network intermediary the ability to programmatically modify content in-flight becomes invaluable as a mechanism for improving performance, particularly with respect to adding (or modifying) cache headers, tags, and meta-data. By allowing the intermediary to cache the cacheable content while simultaneously inserting the appropriate cache control headers to manage the client-side cache, performance is improved. By leveraging programmability, you can start to apply device or network or application (or any combination thereof) logic to manipulate the cache as necessary while also using additional performance-enhancing techniques like compression (when appropriate) or image optimization (for mobile devices). The thing is that a generic "all on" or "all off" for caching isn't going to always result in the best performance. There's logic to it that says you need the capability to say "if X and Y then ON else if Z then OFF". That's the power of a programmable network, of the ability to write the kind of logic that takes into consideration the context of a request and takes the appropriate actions in real-time. Because one size (setting) simply does not fit all.202Views0likes0CommentsBare Metal Blog: Throughput Sometimes Has Meaning
#BareMetalBlog Knowing what to test is half the battle. Knowing how it was tested the other. Knowing what that means is the third. That’s some testing, real clear numbers. In most countries, top speed is no longer the thing that auto manufacturers want to talk about. Top speed is great if you need it, but for the vast bulk of us, we’ll never need it. Since the flow of traffic dictates that too much speed is hazardous on the vast bulk of roads, automobile manufacturers have correctly moved the conversation to other things – cup holders (did you know there is a magic number of them for female purchasers? Did you know people actually debate not the existence of such a number, but what it is?), USB/bluetooth connectivity, backup cameras, etc. Safety and convenience features have supplanted top speed as the things to discuss. The same is true of networking gear. While I was at Network Computing, focus was shifting from “speeds and feeds” as the industry called it, to overall performance in a real enterprise environment. Not only was it getting increasingly difficult and expensive to push ever-larger switches until they could no longer handle the throughput, enterprise IT staff was more interested in what the capabilities of the box were than how fast it could go. Capabilities is a vague term that I used on purpose. The definition is a moving target across both time and market, with a far different set of criteria for, say, an ADC versus a WAP. There are times, however, where you really do want to know about the straight-up throughput, even if you know it is the equivalent of a professional driver on a closed course, and your network will never see the level of performance that is claimed for the device. There are actually several cases where you will want to know about the maximum performance of an ADC, using the tools I pay the most attention to at the moment as an example. WAN optimization is a good one. In WANOpt, the goal is to shrink the amount of data being transferred between two dedicated points to try and maximize the amount of throughput. When “maximize the amount of throughput” is in the description, speeds and feeds matter. WANOpt is a pretty interesting example too, because there’s more than just “how much data did I send over the wire in that fifteen minute window”. It’s more complex than that (isn’t it always?). The best testing I’ve seen for WANOpt starts with “how many bytes were sent by the originating machine”, then measures that the same number of bytes were received by the WANOpt device, then measures how much is going out the Internet port of the WANOpt device – to measure compression levels and bandwidth usage – then measures the number of bytes the receiving machine at the remote location receives to make sure it matches the originating machine. So even though I say “speeds and feeds matter”, there is a caveat. You want to measure latency introduced with compression and dedupe, and possibly with encryption since WANOpt is almost always over the public Internet these days, throughput, and bandwidth usage. All technically “speeds and feeds” numbers, but taken together giving you an overall picture of what good the WANOpt device is doing. There are scenarios where the “good” is astounding. I’ve seen the numbers that range as high as 95x the performance. If you’re sending a ton of data over WANOpt connections, even 4x or 5x is a huge savings in connection upgrades, anything higher than that is astounding. This is an (older) diagram of WAN Optimization I’ve marked up to show where the testing took place, because sometimes a picture is indeed worth a thousand words. And yeah, I used F5 gear for the example image… That really should not surprise you . So basically, you count the bytes the server sends, the bytes the WANOpt device sends (which will be less for 99.99% of loads if compression and de-dupe are used), and the total number of bytes received by the target server. Then you know what percentage improvement you got out of the WANOpt device (by comparing server out bytes to WANOpt out bytes), that the WANOpt devices functioned as expected (server received bytes == server sent bytes), and what the overall throughput improvement was (server received bytes/time to transfer). There are other scenarios where simple speeds and feeds matter, but less of them than their used to be, and the trend is continuing. When a device designed to improve application traffic is introduced, there are certainly few. The ability to handle a gazillion connections per second I’ve mentioned before is a good guardian against DDoS attacks, but what those connections can do is a different question. Lots of devices in many networking market spaces show little or even no latency introduction on their glossy sales hand-outs, but make those devices do the job they’re purchased for and see what the latency numbers look like. It can be ugly, or you could be pleasantly surprised, but you need to know. Because you’re not going to use it in a pristine lab with perfect conditions, you’re going to slap it into a network where all sorts of things are happening and it is expected to carry its load. So again, I’ll wrap with acknowledgement that you all are smart puppies and know where speeds and feeds matter, make sure you have realistic performance numbers for those cases too. Technorati Tags: Testing,Application Delivery Controller,WAN Optimization,throughput,latency,compression,deduplication,Bare Metal Blog,F5 Networks,Don MacVittie The Whole Bare Metal Blog series: Bare Metal Blog: Introduction to FPGAs | F5 DevCentral Bare Metal Blog: Testing for Numbers or Performance? | F5 ... Bare Metal Blog: Test for reality. | F5 DevCentral Bare Metal Blog: FPGAs The Benefits and Risks | F5 DevCentral Bare Metal Blog: FPGAs: Reaping the Benefits | F5 DevCentral Bare Metal Blog: Introduction | F5 DevCentral203Views0likes0CommentsThe “All of the Above” Approach to Improving Application Performance
#ado #fasterapp #stirling Carnegie Mellon testing of ADO solutions answers age old question: less filling or tastes great? You probably recall years ago the old “Tastes Great vs Less Filling” advertisements. The ones that always concluded in the end that the beer in question was not one or the other, but both. Whenever there are two ostensibly competing technologies attempting to solve the same problem, we run into the same old style argument. This time, in the SPDY versus Web Acceleration debate, we’re inevitably going to arrive at the conclusion it’s both less filling and tastes great. SPDY versus Web Acceleration In general, what may appear on the surface to be competing technologies are actually complementary. Testing by Carnegie Mellon supports this conclusion, showing marked improvements in web application performance when both SPDY and Web Acceleration techniques are used together. That’s primarily because web application traffic shows a similar pattern across modern, interactive Web 2.0 sites: big, fat initial pages with a subsequent steady stream of small requests and a variety of response sizes, typically small to medium in content length. We know from experience and testing that web acceleration techniques like compression provide the greatest improvements in performance when acting upon medium-large sized responses though actual improvement rates depend highly on the network over which data is being exchanged. We know that compression can actually be detrimental to performance when responses are small (in the 1K range) and being transferred over a LAN. That’s because the processing time incurred to compress that data is greater than the time to traverse the network. But when used to compress larger responses traversing congested or bandwidth constrained connections, compression is a boon to performance. It’s less filling. SPDY, though relatively new on the scene, is the rising star of web acceleration. Its primary purposes is to optimize the application layer exchanges that typically occur via HTTP (requests and responses) by streamlining connection management (SPDY only uses one connection per client-host), dramatically reducing header sizes, and introducing asynchronicity along with prioritization. It tastes great. What Carnegie Mellon testing shows is that when you combine the two, you get the best results because each improves performance of specific data exchanges that occur over the life of a user interaction. HERE COMES the DATA The testing was specifically designed to measure the impact of each of the technologies separately and then together. For the web acceleration functionality they chose to employ BoostEdge (a software ADC) though one can reasonably expect similar results from other ADCs provided they offer the same web acceleration and optimization capabilities, which is generally a good bet in today’s market. The testing specifically looked at two approaches: Two of the most promising software approaches are (a) content optimization and compression, and (b) optimizing network protocols. Since network protocol optimization and data optimization operate at different levels, there is an opportunity for improvement beyond what can be achieved by either of the approaches individually. In this paper, we report on the performance benefits observed by following a unified approach, using both network protocol and data optimization techniques, and the inherent benefits in network performance by combining these approaches in to a single solution. -- Data and Network Optimization Effect on Web Performance, Carnegie Mellon, Feb 2012 The results should not be surprising – when SPDY is combined with ADC optimization technologies, the result is both less filling and it tastes great. When tested in various combinations, we find that the effects are more or less additive and that the maximum improvement is gained by using BoostEdge and SPDY together. Interestingly, the two approaches are also complimentary; i.e., in situations where data predominates (i.e. “heavy” data, and fewer network requests), BoostEdge provides a larger boost via its data optimization capabilities and in cases where the data is relatively small, or “light”, but there are many network transactions required, SPDY provides an increased proportion of the overall boost. The general effect is that relative level of improvement remains consistent over various types of websites. -- Data and Network Optimization Effect on Web Performance, Carnegie Mellon, Feb 2012 NOT THE WHOLE STORY Interestingly, the testing results show there is room for even greater improvement. The paper notes that comparisons include “an 11% cost of running SSL” (p 13). This is largely due to the use of a software ADC solution which employs no specialized hardware to address the latency incurred by compute intense processing such as compression and cryptography. Leveraging a hardware ADC with cryptographic hardware and compression acceleration capabilities should further improve results by reducing the latency incurred by both processes. Testers further did not test under load (which would have a significant impact on the results in a negative fashion) or leverage other proven application delivery optimization (ADO) techniques such as TCP multiplexing. While the authors mention the use of front-end optimization (FEO) techniques such as image optimization and client-side cache optimization, it does not appear that these were enabled during testing. Other techniques such as EXIF stripping, CSS caching, domain sharding, and core TCP optimizations are likely to provide even greater benefits to web application performance when used in conjunction with SPDY. CHOOSE “ALL of the ABOVE” What the testing concluded was that an “all of the above” approach would appear to net the biggest benefits in terms of application performance. Using SPDY along with complimentary ADO technologies provides the best mitigation of latency-inducing issues that ultimately degrade the end-user experience. Ultimately, SPDY is one of a plethora of ADO technologies designed to streamline and improve web application performance. Like most ADO technologies, it is not universally beneficial to every exchange. That’s why a comprehensive, inclusive ADO strategy is necessary. Only an approach that leverages “all of the above” at the right time and on the right data will net optimal performance across the board. The HTTP 2.0 War has Just Begun Stripping EXIF From Images as a Security Measure F5 Friday: Domain Sharding On-Demand WILS: WPO versus FEO WILS: The Many Faces of TCP HTML5 Web Sockets Changes the Scalability Game Network versus Application Layer Prioritization The Three Axioms of Application Delivery229Views0likes0CommentsThe Four V’s of Big Data
#stirling #bigdata #ado #interop “Big data” focuses almost entirely on data at rest. But before it was at rest – it was transmitted over the network. That ultimately means trouble for application performance. The problem of “big data” is highly dependent upon to whom you are speaking. It could be an issue of security, of scale, of processing, of transferring from one place to another. What’s rarely discussed as a problem is that all that data got where it is in the same way: over a network and via an application. What’s also rarely discussed is how it was generated: by users. If the amount of data at rest is mind-boggling, consider the number of transactions and users that must be involved to create that data in the first place – and how that must impact the network. Which in turn, of course, impacts the users and applications creating it. It’s a vicious cycle, when you stop and think about it. This cycle shows no end in sight. The amount of data being transferred over networks, according to Cisco, is only going to grow at a staggering rate – right along with the number of users and variety of devices generating that data. The impact on the network will be increasing amounts of congestion and latency, leading to poorer application performance and greater user frustration. MITIGATING the RISKS of BIG DATA SIDE EFFECTS Addressing that frustration and improving performance is critical to maintaining a vibrant and increasingly fickle user community. A Yotta blog detailing the business impact of site performance (compiled from a variety of sources) indicates a serious risk to the business. According to its compilation, a delay of 1 second in page load time results in: 7% Loss in Conversions 11% Fewer Pages Viewed 16% Decrease in Customer Satisfaction This delay is particularly noticeable on mobile networks, where latency is high and bandwidth is low – a deadly combination for those trying to maintain service level agreements with respect to application performance. But users accessing sites over the LAN or Internet are hardly immune from the impact; the increasing pressure on networks inside and outside the data center inevitably result in failures to perform – and frustrated users who are as likely to abandon and never return as are mobile users. Thus, the importance of optimizing the delivery of applications amidst potentially difficult network conditions is rapidly growing. The definition of “available” is broadening and now includes performance as a key component. A user considers a site or application “available” if it responds within a specific time interval – and that time interval is steadily decreasing. Optimizing the delivery of applications while taking into consideration the network type and conditions is no easy task, and requires a level of intelligence (to apply the right optimization at the right time) that can only be achieved by a solution positioned in a strategic point of control – at the application delivery tier. Application Delivery Optimization (ADO) Application delivery optimization (ADO) is a comprehensive, strategic approach to addressing performance issues, period. It is not a focus on mobile, or on cloud, or on wireless networks. It is a strategy that employs visibility and intelligence at a strategic point of control in the data path that enables solutions to apply the right type of optimization at the right time to ensure individual users are assured the best performance possible given their unique set of circumstances. The technological underpinnings of ADO are both technological and topological, leveraging location along with technologies like load balancing, caching, and protocols to improve performance on a per-session basis. The difficulties in executing on an overarching, comprehensive ADO strategy is addressing variables of myriad environments, networks, devices, and applications with the fewest number of components possible, so as not to compound the problems by introducing more latency due to additional processing and network traversal. A unified platform approach to ADO is necessary to ensure minimal impact from the solution on the results. ADO must therefore support topology and technology in such a way as to ensure the flexible application of any combination as may be required to mitigate performance problems on demand. Topologies Symmetric Acceleration Front-End Optimization (Asymmetric Acceleration) Lengthy debate has surrounded the advantages and disadvantages of symmetric and asymmetric optimization techniques. The reality is that both are beneficial to optimization efforts. Each approach has varying benefits in specific scenarios, as each approach focuses on specific problem areas within application delivery chain. Neither is necessarily appropriate for every situation, nor will either one necessarily resolve performance issues in which the root cause lies outside the approach's intended domain expertise. A successful application delivery optimization strategy is to leverage both techniques when appropriate. Technologies Protocol Optimization Load Balancing Offload Location Whether the technology is new – SPDY – or old – hundreds of RFC standards improving on TCP – it is undeniable that technology implementation plays a significant role in improving application performance across a broad spectrum of networks, clients, and applications. From improving upon the way in which existing protocols behave to implementing emerging protocols, from offloading computationally expensive processing to choosing the best location from which to serve a user, the technologies of ADO achieve the best results when applied intelligently and dynamically, taking into consideration real-time conditions across the user-network-server spectrum. ADO cannot effectively scale as a solution if it focuses on one or two comprising solutions. It must necessarily address what is a polyvariable problem with a polyvariable solution: one that can apply the right set of technological and topological solutions to the problem at hand. That requires a level of collaboration across ADO solutions that is almost impossible to achieve unless the solutions are tightly integrated. A holistic approach to ADO is the most operationally efficient and effective means of realizing performance gains in the face of increasingly hostile network conditions. Mobile versus Mobile: 867-5309 Identity Gone Wild! Cloud Edition Network versus Application Layer Prioritization Performance in the Cloud: Business Jitter is Bad The Three Axioms of Application Delivery Fire and Ice, Silk and Chrome, SPDY and HTTP The HTTP 2.0 War has Just Begun Stripping EXIF From Images as a Security Measure224Views0likes0Comments