I am wondering why not all websites enabling this great feature GZIP?

Understanding the impact of compression on server resources and application performance

While doing some research on a related topic, I ran across this question and thought “that deserves an answer” because it certainly seems like a no-brainer. If you want to decrease bandwidth – which subsequently decreases response time and improves application performance – turn on compression. After all, a large portion of web site traffic is text-based: CSS, JavaScript, HTML, RSS feeds, which means it will greatly benefit from compression. Typical GZIP compression affords at least a 3:1 reduction in size, with hardware-assisted compression yielding an average of 4:1 compression ratios. That can dramatically affect the response time of applications. As I said, seems like a no-brainer.

Here’s the rub: turning on compression often has a negative impact on capacity because it is CPU-bound and under certain conditions can actually cause a degradation in performance due to the latency inherent in compressing data compared to the speed of the network over which the data will be delivered.

Here comes the science.

Compression via GZIP is CPU bound. It requires a lot more CPU than you might think. The larger the file being compressed, the more CPU resources are required. Consider for a moment what compression is really doing: it’s finding all similar patterns and replacing them with representations (symbols, indexes into a table, etc…) to a single instance of the text instead. So it makes sense that the larger a file is, the more resources are required – RAM and CPU – to execute such a process.

Of course the larger the file is the more benefit you see from compression in terms of bandwidth and improvement in response time. It’s kind of a Catch-22: you want the benefits but you end up paying in terms of capacity. If CPU and RAM is being chewed up by the compression process then the server can handle fewer requests and fewer concurrent users.

You don’t have to take my word for it – there are quite a few examples of testing done on web servers and compression that illustrate the impact on CPU utilization.

They all essentially say the same thing; if you’re serving dynamic content (or static content and don’t have local caching on the web server enabled) then there is a significant negative impact on CPU utilization that occurs when enabling GZIP/compression for web applications. Given the exceedingly dynamic nature of Web 2.0 applications, the use of AJAX and similar technologies, and the data-driven world in which we live today, that means there are very few types of applications running on web servers for which compression will not negatively impact the capacity of the web server. 

In case you don’t (want || have time) to slog through the above articles, here’s a quick recap:

 File SizeBandwidth decreaseCPU utilization increase
IIS 7.010KB55%4x
Apache 2.2 10KB55%4x

It’s interesting to note that IIS 7.0 and Apache 2.2 mod_deflate have essentially the same performance characteristics. This data falls in line with the aforementioned Intel report on HTTP compression which noted that CPU utilization was increased 25-35% when compression was enabled.

So essentially when you enable compression you are trading its benefits – bandwidth reduction, response time improvement – for a reduction in capacity. You’re robbing Peter to pay Paul, because instead of paying for bandwidth you’re paying for more servers to handle the same load.

One of the reasons you’d want to compress content is to improve response time by decreasing the total number of packets that have to traverse a wire. This is a necessity when transferring content via a WAN, but can actually cause a decrease in performance for application delivery over the LAN. This is because the time it takes to compress the content and then deliver it is actually greater than the time to just transfer the original file via the LAN. The speed of the network over which the content is being delivered is highly relevant to whether compression yields benefits for response time.

The increasing consumption of CPU resources as volume increases, too, has a negative impact on the ability of the server to process and subsequently respond, which also means an increase in application response time, which is not the desired result.

Maybe you’re thinking “I’ll just get more CPU then. After all, there’s like billion core servers out there, that ought to solve the problem!” Compression algorithms, like FTP, are greedy. FTP will, if allowed, consume as much bandwidth as possible in an effort to transfer data as quickly as possible. Compression will do the same thing to CPU resources: consume as much as it can to perform its task as quickly as possible. Eventually, yes, you’ll find a machine with enough cores to support both compression and capacity needs, but at what cost? It may well have been more financially efficient to invest in a better solution (that also brings additional benefits to the table) than just increasing the size of the server. But hey, it’s your data, you need to do what you need to do.

The size of the content, too, has an impact on whether compression will benefit application performance. Consider that the goal of compression is to decrease the number of packets being transferred to the client. Generally speaking, the standard MTU for most network is 1500 bytes because that’s what works best with ethernet and IP. That means you can assume around 1400 bytes per packet available to transfer data. That means if content is 1400 bytes or less, you get absolutely no benefit out of compression because it’s already going to take only one packet to transfer; you can’t really send half-packets, after all, and in some networks packets that are too small can actually freak out some network devices because they’re optimized to handle the large content being served today – which means many full packets.


There is real benefit to compression; it’s part of the core techniques used by both application acceleration and WAN application delivery services to improve performance and reduce costs. It can drastically reduce the size of data and especially when you might be paying by the MB or GB transferred (such as applications deployed in cloud environments) this a very important feature to consider. But if you end up paying for additional servers (or instances in a cloud) to make up for the lost capacity due to higher CPU utilization because of that compression, you’ve pretty much ended up right where you started: no financial benefit at all.

The question is not if you should compress content, it’s when and where and what you should compress.

The answer to “should I compress this content” almost always needs to be based on a set of criteria that require context-awareness – the ability to factor into the decision making process the content, the network, the application, and the user. If the user is on a mobile device and the size of the content is greater than 2000 bytes and the type of content is text-based and … It is this type of intelligence that is required to effectively apply compression such that the greatest benefits of reduction in costs, application performance, and maximization of server resources is achieved. Any implementation that can’t factor all these variables into the decision to compress or not is not an optimal solution, as it’s just guessing or blindly applying the same policy to all kinds of content. Such implementations effectively defeat the purpose of employing compression in the first place.

That’s why the answer to where is almost always “on the load-balancer or application delivery controller”. Not only are such devices capable of factoring in all the necessary variables but they also generally employ specialized hardware designed to speed up the compression process. By offloading compression to an application delivery device, you can reap the benefits without sacrificing performance or CPU resources.


AddThis Feed Button Bookmark and Share

Published May 27, 2009
Version 1.0

Was this article helpful?


  • If most of the content is static, I see no reason why there would be increased computation required for every request. It's far simpler to compress the content once, cache the result, and then serve that cached version for subsequent requests. Then you get the benefit of reduced bandwidth requirements without the extra CPU load.