The impact of SPDY on infrastructure architecture
The Internets were abuzz with the revelation that the custom browser Silk, distributed on Amazon’s latest endeavor Fire, leverages com...
SPDY, short for "speedy" was developed by Google as a way of augmenting the regular HTTP protocol. It uses compression and several methods of optimizing and even predicting requests so resources are sent faster from the server to the browser.
Amazon Silk uses SPDY for its connection to the EC2 cloud. Google also uses it in Chrome for all connections to Google sites. SPDY is an open protocol, so anyone is free to use it and Google is encouraging websites to adopt it.
This isn’t the first time I’ve pondered SPDY; the last time was a dive into how SPDY combined with Map/Reduce might make for a very interesting and scalable web architecture. We could focus on the advantages of owning the “hardware” with SPDY, Fire, Chrome, and the interaction with Google and Amazon’s own cloud services, but I think it more valuable to examine SPDY from the potential impact on infrastructure, architecture, and the broader market. With both web behemoths taking advantage of SPDY, the increasing rate of consumerization of IT and eager adoption of mobile devices by consumers, it’s becoming more obvious that perhaps the fledgling protocol might indeed be something the market should take a much, much closer look at.
There are three main points to SPDY that are (most) relevant to modern and emerging web architectures:
1. Only one, asynchronous connection is allowed between client and server
2. SPDY sits above TCP and encapsulates HTTP
3. Requests can be prioritized
The main premise of SPDY is the use of a single, asynchronous connection (1) between client and server to reduce latency inherent in network transfer times. Clients then fire off a series of requests with or without priority (3) desired over that connection. Those requests are encapsulated into SPDY (2) and sent to a SPDY-capable web server infrastructure where they are translated and processed before being returned. This process is, as Google points out, much faster than traditional acceleration techniques involving parallelization of requests because a browser simply cannot open a number of connections to the web server commensurate with the number of objects (requests) it must retrieve. Connection limitations and the synchronous nature of the HTTP protocol impose a performance penalty that is nigh unto impossible to eliminate. Hence, SPDY, which eliminates the penalty by changing the rules of the game. Google has some fairly impressive performance results from using SPDY (no doubt improved further by the proximity of Google services to the Internet backbone, as is also the case with Amazon) that I will not even attempt to refute. As I’ve long verified similar improvements in the use of TCP and HTTP multiplexing on the server-side of this equation, Google’s numbers are no doubt an accurate depiction of the improvements gained using similar techniques on the client side.
Now, the question becomes, “how could a typical enterprise take advantage of this?” After all, we all want faster web sites and user-experiences. But the reliance of SPDY on support in both the client and the server infrastructure seems a potentially insurmountable challenge. While we may be able to take advantage of Silk or Chrome’s native SPDY support, we would still need to ensure the web/application server infrastructure on which web applications are deployed (in the cloud and in the data center) supports it as well. This is because the traditional HTTP transaction is encapsulated inside the SPDY frame: it must be extracted before it can be processed by traditional web/application infrastructure.
The most obvious impact to any infrastructure between a SPDY-enabled client and server is that it drives intermediate processing back to layer 4, to TCP. Because SPDY is its own protocol and encapsulates HTTP inside its frame, infrastructure focused on layer 7 (application, HTTP) would effectively be blinded by SPDY. While traditional layer 4 functionality – network firewall, QoS, load balancing – would remain unaffected by being in the data path, traditional layer 7 functionality – web application firewall, web access management, application switching (a.k.a. “page routing” in Facebook speak), etc… – would be rendered ineffectual. This, much in the same way IPv4 encapsulation in IPv6-enabled architectures, would render infrastructure architectures dependent on such processing inoperable. After all, if web application firewall or access management services rely on a URI to determine which policy should be applied and that URI is no longer available as part of the request, the service cannot function.
Additionally, the use of encryption as an integral component of the protocol would prove problematic for many infrastructure intermediaries unable to decrypt the data and perform inspect of the contents. “TLS encrypts the contents of all transmission (except the handshake itself), making it difficult for attackers to control the data which could be used in a cross-protocol attack.” – SPDY Protocol
That said, any intermediary capable of decrypting and subsequently extracting the HTTP data – such as via network-side scripting capable infrastructure - would be capable of serving in the same capacity as it has in the past, albeit while incurring some amount of latency while doing so.
There are several operationally-related issues that SPDY introduces as well as those it does not address. For example, SPDY does not address capacity constraints nor does it intelligently apply compression. Furthermore it assumes security will be applied at the server. Architectural best practices concerning both performance and capacity dictate SSL/TLS be offloaded upstream and that TCP multiplexing between intermediate load balancing servers and servers be used to improve utilization of server resources.
The behavior of SPDY assumes a persistent connection between the client and the server. Again, best practices for architecting highly available applications include the use of load balancing services to provide a mechanism for failover as well as scalability, introducing potential issues with persistence. While rudimentary, albeit effective, layer 4 load balancing services would alleviate this potential pitfall (layer 4 load balancing maps clients and servers at the TCP layer upon initial connection and thereafter act as little more than a layer 3 forwarding mechanism, a.k.a. a switch) but introduce others, such as effective server capacity management necessary to dynamically scale in and out, i.e. elasticity.
Other questions remain, as well, around basic security of SPDY-enabled infrastructure. The premise of the protocol is a rapid-firing of requests with client-specified priority at a web site or application. Differentiating between legitimate clients and potential attackers may be an interesting exercise and one that is not directly addressed by SPDY with the exception of a mention regarding the ability to throttle clients.
The conclusion at this point may be that traditional architecture and infrastructure is inherently incompatible with SPDY. This is far from an accurate conclusion. For application delivery components, at least, the introduction of SPDY should not be viewed as problematic nor a show-stopper. In fact, it could be viewed as the opposite. An advanced application delivery controller serves as a SPDY-enabling technology.
An advanced application delivery controller with network-side scripting ability could easily act as a translator for SPDY enabled clients interacting with non-SPDY enabled sites and applications. Through the use of network-side scripting, the HTTP data could be extracted, inserted into a traditional HTTP exchange with servers, and the responses re-injected into a SPDY-compliant data exchange with the client. Such an architecture could easily serve as a migratory step toward a fully SPDY-enabled web and application architecture, or as a means to support (the currently limited set of) SPDY-enabled browsers. Even in a SPDY-enabled architecture, an advanced application delivery controller remains a key component for security, access and capacity management. If able to extract and interpret SPDY, the intermediary retains its ability to perform processing on the data without sacrificing the inherent performance gains achieved by the nascent protocol. This allows existing web and application architectures to remain in place but supports the use of the protocol as a means of accelerating the end-user experience.
This approach is likely preferred by the infrastructure market itself as it remains to be seen whether SPDY will become more widely used in mobile devices (and other browsers) and the investment to natively support what is a new protocol in what is a rather large and varied market would need to be justified by widespread demand. If intermediaries are able to "speak" SPDY natively, they could act in much the same way IPv6 Gateways today serve as an transitional step between IPv4 and IPv6. In this respect, a component based on a full-proxy architecture is perfectly suited to inserting itself (and thus its other capabilities such as security and access control) into a SPDY conversation to ensure both sides of the equation are equally efficient and performant.
Because of the limited deployment and support for SPDY, this is likely to remain a non-issue for most organizations for quite some time. However, with growing use on mobile devices like Silk and other Android-based devices and the push to integrate Amazon and Google cloud services with enterprise architectures, it may be time to put SPDY on the list of technologies to keep a closer eye on.