Caching for Faster APIs

Pop quiz: Do you know what the number one driver cited in 2016 for networking investments was? No peeking!

If you guessed speed and performance, you guessed right. If you guessed security, don’t feel bad; it came in at number two, just ahead of availability.

Still, it’s telling that the same things that have always driven network upgrades, improvements, and architectures continue to do so. We want fast, secure, and reliable networks that deliver fast, secure, and reliable applications.

Go figure.

The problem is that a blazing fast, secure, and reliable network does not automatically translate into a fast, secure, and reliable application.

But it can provide a much-needed boost. And I’m here to tell you how (and it won’t even cost you shipping and handling fees).

The thing is that web and app servers (from which apps and APIs are generally delivered) have long offered options for both caching and compression. The other thing is that they’re often not enabled. Caching headers are part of the HTTP specification. They’re built in, but they only do anything if they’re actually packaged up with each request and response. So if a developer doesn’t add them, they aren’t there.

Except when you’ve got an upstream, programmable proxy with which you can insert them. Cause when we say software-defined, we really mean software-defined. As in “Automation is cool and stuff, but interacting with requests/responses in real-time is even cooler.”

So, to get on with it, there are several mechanisms for managing caching within HTTP, two of which are: ETag and Last-Modified.

ETag. The HTTP header “ETag” contains a hash or checksum that can be used to determine whether or not content has changed. It’s like the MD5 signature on compressed files or RPMs. While MD5 signatures are usually associated with security, they can also be used to detect changes. In the case of browser caching, the browser can make a request (using the If-None-Match header) that says “hey, only give me new content if it’s changed”. The server side compares ETags to determine if it has and, if not, sends back an HTTP 304 (Not Modified) response with no body. The browser says “Cool” and pulls the content from its own local cache. The request still makes the trip, but the content doesn’t come back with it, which saves bandwidth and transfer time (a lot of both if the content is large) and thus improves performance.
Last-Modified. This is really the same thing as an ETag, but with timestamps instead. Browsers ask (using the If-Modified-Since header) to be served new content only if it has been modified since a specific date. This, too, saves on bandwidth and transfer times, and can improve performance.
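
To make those two validators a little more concrete, here is a minimal sketch of the server side of that conversation. It is written in Python rather than iRules so it can be run anywhere, and everything in it is an assumption for illustration: the function names, the choice of a truncated SHA-256 digest as the ETag, and the simplification that If-None-Match carries exactly one tag (real requests can carry a list, weak tags, or `*`).

```python
import hashlib
from email.utils import formatdate, parsedate_to_datetime


def make_etag(body: bytes) -> str:
    """Derive an ETag from the body. HTTP requires the value to be quoted."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes, modified_epoch: float):
    """Return (status, headers, body) for a GET, honoring conditional headers."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Last-Modified": formatdate(modified_epoch, usegmt=True),
    }

    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When both validators are sent, the ETag check takes precedence.
        if if_none_match == etag:
            return 304, headers, b""
        return 200, headers, body

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since).timestamp()
        # HTTP dates have one-second resolution, so compare whole seconds.
        if int(modified_epoch) <= int(since):
            return 304, headers, b""

    return 200, headers, body
```

The first request comes back as a 200 with both validators attached; replaying either one in a later request gets the empty 304 as long as nothing has changed.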

Now, these mechanisms were put into place primarily to help with web-based content. Caching images and other infrequently changing presentation components (think style-sheets, a la CSS) can have a significant impact on performance and scalability of an application. But we’re talking about APIs, and as we recall, APIs are not web pages. So how do HTTP’s caching options help with APIs?

Well, very much the same way, especially given that most APIs today are RESTful, which means they use HTTP.

If I’ve got an app (and I’ve got lots of them) that depends on an API, there are still going to be a lot of content types that look just like web content, such as images. Those images can (and should) certainly be cached when possible, especially if the app is a mobile one. Data, too, for frequently retrieved content can be cached, even if it is just a big blob of JSON.

Consider the situation in which I have an app and every day the “new arrivals” are highlighted. But they’re only updated once a day, or on a well-known schedule. The first time I open the menu item to see the “new arrivals”, the app should certainly go get the new content, because it’s new. But after that, there’s virtually no reason for the app to go requesting that data. I already paid the performance price to get it, and it hasn’t changed, neither the JSON objects representing the individual items nor the thumbnails depicting them. Using HTTP caching headers and semantics, I can ask “have you changed this yet?” and the server can quickly respond “Not yet.” That saves downloading the same data again and again while I click on fourteen different pairs of shoes* off the “new arrivals” list and then come back to browse for more.
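
The “have you changed this yet?” exchange can be sketched from the app’s side, too. This is a toy in Python with no real network involved; the `Origin` and `Client` classes and the `new_arrivals` method are hypothetical names invented for the illustration, and the ETag is again assumed to be a truncated SHA-256 digest of the JSON body.

```python
import hashlib
import json


class Origin:
    """Stand-in for the API server; counts how many full bodies it sends."""

    def __init__(self, items):
        self.items = items
        self.full_responses = 0

    def get(self, if_none_match=None):
        body = json.dumps(self.items).encode()
        etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
        if if_none_match == etag:
            return 304, etag, b""  # "Not yet."
        self.full_responses += 1
        return 200, etag, body


class Client:
    """App-side cache holding the last body and the ETag it arrived with."""

    def __init__(self, origin):
        self.origin = origin
        self.etag = None
        self.body = None

    def new_arrivals(self):
        # Always ask, but send the ETag so the server can skip the body.
        status, etag, body = self.origin.get(if_none_match=self.etag)
        if status == 200:
            self.etag, self.body = etag, body
        return json.loads(self.body)
```

However many times the app calls `new_arrivals()` during the day, the origin only ships the full list again once the list actually changes.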

If the API developer hasn’t added the appropriate HTTP cues in the headers, however, you’re stuck grabbing and regrabbing the same content and wasting bandwidth as well as valuable client and server-side resources. That’s where an upstream programmable proxy comes in: it can insert them and provide both a performance boost (for the client) and greater scalability (for the server).

Basically, you can insert anything you want into the request/response using a programmable proxy, but we’ll focus on just HTTP headers right now. The basic pattern is:

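   when HTTP_RESPONSE {
        # ETag is a response header, so insert it on the way back to the client.
        # The value is a placeholder; a real ETag is a quoted string.
        HTTP::header insert "ETag" "my-computed-value"
   }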

Really, that’s all there is to it. Now, you probably want some logic in there to not override an existing header because if the developer put it in, there’s a good reason. This is where I mini-lecture you on the cultural side of DevOps and remind you that communication is as critical as code when it comes to improving the deployment and delivery of applications. And there’s certainly going to be some other stuff that goes along with it, but the general premise is that the insertion of caching-related HTTP headers is pretty simple to achieve.
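
That “don’t override the developer” logic is small enough to sketch. The version below is Python rather than iRules so it can be run outside the proxy; `add_etag_if_missing` is a hypothetical helper, and the digest-based ETag is an assumption, not a recommendation for how the value should be computed.

```python
import hashlib


def add_etag_if_missing(headers: dict, body: bytes) -> dict:
    """Return a copy of the response headers with an ETag added only if absent."""
    result = dict(headers)
    # HTTP header names are case-insensitive, so compare in lowercase.
    if not any(name.lower() == "etag" for name in result):
        result["ETag"] = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return result
```

If the response already carries an ETag, in any capitalization, it passes through untouched.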

For example, we could insert a Last-Modified header for any JPG image:

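   when HTTP_RESPONSE {
        # Media types are case-insensitive, so normalize before comparing
        if { [string tolower [HTTP::header "Content-Type"]] starts_with "image/jpeg" } {
            # Placeholder: a real value must be an HTTP-date,
            # e.g. "Thu, 13 Oct 2016 00:00:00 GMT"
            HTTP::header insert "Last-Modified" "timestamp value"
        }
   }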

We could do the same for CSS, or JS, as well. And we could get more complex and make decisions based on a hundred other variables and conditions. Cause, software-defined delivery kinda means you can do whatever you need to do.
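
As a rough illustration of extending that decision to CSS and JS, here is a Python sketch (again, not iRules) that picks responses by media type and stamps them with a properly formatted HTTP-date. The `CACHEABLE_TYPES` set is a made-up policy, the helper names are hypothetical, and the sketch assumes header names arrive in their canonical capitalization.

```python
from email.utils import formatdate

# Hypothetical policy: media types treated as infrequently changing.
CACHEABLE_TYPES = {"image/jpeg", "text/css", "application/javascript"}


def media_type(content_type: str) -> str:
    """Strip parameters such as '; charset=utf-8' and normalize case."""
    return content_type.split(";", 1)[0].strip().lower()


def add_last_modified(headers: dict, modified_epoch: float) -> dict:
    """Return a copy of the headers, adding Last-Modified for cacheable types."""
    result = dict(headers)
    if "Last-Modified" in result:
        return result  # the developer already set it; leave it alone
    if media_type(result.get("Content-Type", "")) in CACHEABLE_TYPES:
        # formatdate with usegmt=True yields the HTTP-date form, ending in "GMT".
        result["Last-Modified"] = formatdate(modified_epoch, usegmt=True)
    return result
```

Where the modification time comes from (a file timestamp, a deploy time, a database column) is the part that actually takes some thought; the header insertion itself stays trivial.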

Another reason a programmable proxy is an excellent option in this case is that it allows you to extend HTTP with unofficial functionality when servers do not. For example, there’s an unofficial “PURGE” method that’s used by Varnish for invalidating cache entries. Because it’s unofficial, it’s not universally supported by the web servers on which APIs are implemented. But a programmable proxy could be used to implement that functionality on behalf of the web server (cause that’s what proxies do) and relieve pressure on web servers to do so themselves. That’s important when external caches like memcached and Varnish enter the picture. Because sometimes it’s not just about caching on the client, but in the infrastructure.
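
To show the shape of the idea, here is a toy in-memory caching proxy in Python that answers PURGE itself instead of passing it to the origin. The `CachingProxy` class and its status codes are assumptions for the sketch (PURGE is unofficial, so there is no standard response to copy), and a real deployment would also want to restrict who is allowed to purge.

```python
class CachingProxy:
    """Toy in-memory cache in front of an origin callable(path) -> bytes."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def handle(self, method: str, path: str):
        if method == "PURGE":
            # Answer on behalf of the origin; the request never reaches it.
            existed = self.cache.pop(path, None) is not None
            return (200, b"purged") if existed else (404, b"not in cache")
        if method == "GET":
            if path not in self.cache:
                self.cache[path] = self.origin(path)
            return 200, self.cache[path]
        return 405, b"method not allowed"
```

The origin only ever sees GETs, and the next GET after a PURGE goes back to it for a fresh copy.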

In any case, HTTP caching mechanisms can improve performance of APIs, particularly when they are returning infrequently changing content like images or static text. Not taking advantage of them is a lost opportunity.

* you shop for what you want, I’ll shop for shoes.

Published Oct 13, 2016
