What is HTTP Part VII - OneConnect
In the last article in this What is HTTP? series we finished covering the settings in the HTTP profile itself. This week, we begin the transition to technologies that support, enhance, optimize, secure, and (insert your descriptive qualifier here…) the HTTP traffic flowing through the BIG-IP, starting with OneConnect.

What is OneConnect?

What comes to mind when you hear the phrase "Reduce Reuse Recycle"? Does a little green pseudo-triangle with arrows flash before your eyes, or do you think if they really cared about the reduction, the phrase would be "re[duce|use|cycle]"? But I digress. OneConnect reduces the number of server-side connections between the BIG-IP and application servers by reusing existing server-side connections. So whereas you might be serving thousands of connections on the client side, there will be a reduced number of server-side connections, offloading the overhead of tcp connection setup and teardown.

You can see in the image above that before HTTP/1.1, a single request per connection on the client and server sides resulted in inefficiencies. With HTTP/1.1, you can multiplex requests on single connections client-side, and as shown below, can multiplex multiple client-side connections onto single server-side connections. This is the power OneConnect offers: both clients' requests, each arriving in a single tcp connection per client, are pooled together on a single tcp connection between the BIG-IP and the server. There's even hope for HTTP/1.0 clients via the OneConnect Transformations I mentioned in part five of this series. In this case, the four individual connections on the client side of the BIG-IP are transformed into a single connection on the server side thanks to OneConnect.

The OneConnect Profile

Let's start with a screen shot of the fields available in the OneConnect profile. As this is the default parent, you don't have the option to name it and set the parent, but when creating a custom profile, those settings would be available for configuration as well in the General Properties.

Source Prefix Length

Also known as the Source Mask field in earlier versions, this is where you specify the mask length for connection reuse. It's important to understand that this is the contextual source IP of the server-side connection. This means that if you are using a SNAT address, it is applied before the mask length is evaluated for OneConnect reuse eligibility. The masking behavior details by version are maintained on AskF5 in knowledge base articles K5911 (10.x - 11.x) and K85357500 (12.x.) In my current lab version of 12.1.2, the prefix length can be set to IPv4 or IPv6 mask lengths, and is supplied in CIDR notation instead of traditional netmasks. There's no rocket science to the mask itself: a mask length of 0 (the default) is most efficient, as OneConnect will look for any open idle tcp connection on the destination server. A length of /32 for IPv4 or /128 for IPv6 is a host match, so only an open connection between the source host IP (or SNAT) and the destination server would be eligible for reuse.

Maximum Size

This setting controls the maximum number of idle server-side tcp connections that BIG-IP holds in the connection reuse pool. This value is divided by the number of TMMs on BIG-IP, so for the default of 10,000 on a system with 10 TMMs, the maximum number of idle connections per TMM would be 1,000. This can be tested with an artificially low maximum-size of two, which on a two-TMM system means only one idle connection is available per TMM in the connection reuse pool.
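For reference, here is a minimal tmsh sketch of setting up that test; the profile and virtual server names are mine, and the attribute names should be verified against your version's CLI reference:

# create a OneConnect profile with an artificially low reuse pool for testing
# (max-size 2 yields one idle connection per TMM on a two-TMM system)
tmsh create ltm profile one-connect mytest_oneconnect { defaults-from oneconnect max-size 2 }

# attach it to an existing virtual server (myvirtual is hypothetical)
tmsh modify ltm virtual myvirtual profiles add { mytest_oneconnect }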
Let's use this iRule to make it simple to confirm whether a connection has been reused or not:

when HTTP_REQUEST_RELEASE {
    log local0. "BIG-IP: [IP::local_addr]:[TCP::local_port] sent [HTTP::method] to [IP::server_addr]:[serverside {TCP::remote_port}]"
}

If we have two servers and send two requests to them, each request goes to each server respectively (everything except max size here is default, with the round-robin LB method):

Rule /Common/check_oneconnect : BIG-IP: 172.16.199.234:38648 sent GET to 172.16.199.31:80
Rule /Common/check_oneconnect : BIG-IP: 172.16.199.234:38650 sent GET to 172.16.199.32:80

The server2 connection (from the second request) is not added to the connection reuse pool because max size is set to 2 (1 per TMM) and server1 is already there. The third request goes to server1 and reuses the connection from the first request (src port 38648):

Rule /Common/check_oneconnect : BIG-IP: 172.16.199.234:38648 sent GET to 172.16.199.31:80

The fourth request goes to server2, and because there are no idle connections open to server2, a new connection is created:

Rule /Common/check_oneconnect : BIG-IP: 172.16.199.234:38646 sent GET to 172.16.199.32:80

The fifth request again goes to server1, still reusing the idle server1 connection:

Rule /Common/check_oneconnect : BIG-IP: 172.16.199.234:38648 sent GET to 172.16.199.31:80

A sixth request to server2 also creates a new connection because server1 is still occupying the only spot available in the connection reuse pool:

Rule /Common/check_oneconnect : BIG-IP: 172.16.199.234:38658 sent GET to 172.16.199.32:80

Maximum Age

This is the maximum number of seconds that a connection stays in the connection reuse pool before the BIG-IP marks it ineligible for reuse. Two key points on the max age:

It is triggered after the first time the connection is reused, not when the connection is used for the initial request. It is only after the first reuse is completed and the connection is idle that the max age countdown begins.

The max-age setting does not close connections when expired; it marks connections ineligible for future reuse. They are then closed in one of two ways. The first way: the BIG-IP will close an ineligible idle connection when a new request comes in. The second way: wait for the server's HTTP keep alive to time out. Both sides of the connection have the ability to close it down.

Let's take a look at the max age behavior by setting it artificially low on the BIG-IP to 5 seconds, and setting the test apache server to 60 seconds to prevent apache from closing the connection first. If we send only one request (frame 10), the connection is closed 60 seconds later (frame 46), honoring the keep alive we set on apache. The max age did not kick in here, as the connection was not reused. However, let's make another initial request (frame 10.) This time we make a second request (frame 51) 20 seconds later, but before the 60 second timeout. Notice the connection has been reused. Now that the max age setting has kicked in, we wait another 20 seconds (well after the 5 second expiration we set) and BIG-IP closes the previous connection (frame 77) and opens a new one (frame 92.) Once the new connection is established, the client-side request is sent to the server (frame 96.) You can also see that after another minute of idleness, the apache server keep alive timeout has expired as well, and it closes the connection (frame 130.)
If we had opened a new connection within 5 seconds after frame 115, then the connection would have been reused, and max-age would kick in again immediately after the connection was idle again.

Maximum Reuse

This setting controls the maximum number (default: 1,000) of times a particular server connection can be reused before it is closed. So for example, if you set the maximum reuse to two connections, a server-side tcp connection (172.16.199.234:63826 <-> 172.16.199.31:80) would be used for three requests (one initial in frame 10 and two reuses in frames 49 and 90,) and then BIG-IP, unlike with max age marking a connection ineligible, will immediately close the connection (frame 119.)

Idle Timeout Override

When enabled (disabled by default,) this setting overrides any timeout set by protocol profiles such as tcp, which has a default timeout of 300 seconds. So with the BIG-IP tcp idle timeout at 300 seconds and apache (from earlier) set to 60 seconds on the keep alive, the idle connection will be closed by the server. If we set the keep alive to 301 seconds, BIG-IP would reset the connection. For the sake of testing the idle timeout override, however, let's enable it in the OneConnect profile by setting it to 30 seconds. You can see in the capture above that after the request is completed (frame 33) and the connection is now idle, it is reset after 30 seconds (frame 44,) occurring before the server timeout and the BIG-IP tcp timeout. You can also set the idle-timeout-override to infinite, which will ignore the other BIG-IP protocol timers but will not override the server's keep alive settings.

Limit Type

This setting controls how connection limits are managed in conjunction with OneConnect.

None (default) - OneConnect is not involved with the connection limits.
Idle - Specifies that idle connections will be dropped as the tcp connection limit is reached. For short intervals, during the overlap of the idle connection being dropped and the new connection being established, the tcp connection limit may be exceeded.
Strict - Specifies that the tcp connection limit is honored without exception. This means that idle connections will prevent new tcp connections from being made until they expire, even if they could otherwise be reused. This is not a recommended configuration, but might prove useful in exceptional cases with very short expiration timeouts.

Understanding OneConnect Behaviors

OneConnect does not override load balancing behavior. If there are two pool members (s1 and s2) and we use the round-robin load balancing method with OneConnect applied, the behavior is as follows:

First request goes to s1
Second request goes to s2
Third request goes to s1 and reuses the idle connection
Fourth request goes to s2 and reuses the idle connection

Applying a OneConnect profile to a virtual server without a layer seven transactional (read: clearly defined requests and responses) protocol like HTTP is generally considered a misconfiguration, and you may see odd behavior when doing so. There may be exceptions, but this configuration should be considered carefully and tested thoroughly before deploying to a production environment. Without OneConnect on a virtual server with HTTP, you will find that persistence data does not appear to be honored. This is because, by default, the BIG-IP system performs load balancing for each tcp connection, not each HTTP request, so further requests on the same client-side connection will follow suit from the original decision.
By applying OneConnect to the virtual, you effectively detach the server-side connection from the client-side connection after each request, forcing a new load balancing decision and using persistence information if it is available.

Resources

OneConnect profile overview
OneConnect and persistence
OneConnect HTTP headers
OneConnect and Traffic Distribution
OneConnect and Source Mask

Thanks to Rodrigo Albuquerque for the fantastic test source material!

What is HTTP Part V - HTTP Profile Basic Settings
In the first four parts of this series on HTTP, we laid the foundation for understanding what’s to come. In this article, we’ll focus on the basic settings in the BIG-IP HTTP profile. Before we dive into the HTTP-specific settings, however, let’s briefly discuss the nature of a profile. In the BIG-IP, every application is serviced by one or more virtual servers. There are literally hundreds of individual settings one can make to tune the various protocols used to service applications. Tweaking every individual setting for every individual application is not only time consuming, but at risk for human error. A profile allows an administrator to configure a sub-set of (usually) protocol specific settings that can then be applied to one or more virtual servers. Profiles can also be tiered in that you can have a baseline parent profile, and then use that to create child profiles based on that parent profile to tweak individual settings as necessary. The built-in profiles can be changed but it is not recommended you do so. The recommendation is to create a child based on a built-in profile, and customize from there. The settings we’ll focus on today are shown in the screen capture below. Note that this is TMOS version 12.1 HF1. The HTTP profile has oscillated over the versions to include or disclude various feature sets, so don’t be alarmed if your view is different. There are a few proxy modes available in the HTTP profile, but we’ll focus on reverse proxy mode. Basic Auth Realm The realm is a scope of protection for resources on an origin server. Think of realms as partitions where each can have their own authentication schemes. When authentication is required, the server should respond with a status of 401 and the WWW-Authenticate header. If you set this field to testrealm, the header value returned to the client will be Basic realm=“testrealm”. On the client end, when this response is received, a browser pop-up will trigger and provide a message like Please enter your username and password for : though some browsers do not provide the realm in this message. Fallback The two fallback fields in the profile are useful for handling adverse conditions in availability of your server pool of resources. The Fallback on Error Codes field allows you to set status codes in a space-separated list that will redirect to your Fallback Host when the server responds with any of those codes. This is useful hiding back end issues for a better user experience as well as securing any underlying server information leakage. The Fallback Host by itself is utilized when there are no available nodes to send requests to. When this happens, a status of 302 (a temporary redirect)with the specified host is returned to the client. Header Insertions, Erasures, & Allowances The profile supports limited insert/remove/sanitize functionality. On requests, you can select a single header to insert and a single header to remove. There is a precedence for insertions and removals you'll want to keep in mind if you are inserting and removing the same header. Not sure why you'd do that but I tested and it appears that the HSTS and X-Forwarded-For features will be inserted first, if either are specified in the Request Header Erase they will be removed, and if also specified in the Request Header Insert, will be re-inserted on the way to the server. On responses, you can specify a space-separated list of allowed headers and headers not in that list will be removed before the response is returned to the client. 
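To make those settings concrete, here is a hedged tmsh sketch of a child profile using them; the profile name, hostnames, and header values are all mine, so treat the attribute names as a sketch to verify against your version:

# child HTTP profile inheriting from the built-in http parent (recommended practice)
tmsh create ltm profile http myapp_http {
    defaults-from http
    basic-auth-realm testrealm
    fallback-host maintenance.example.com
    fallback-status-codes { 500 502 503 }
    header-insert "X-Env: staging"
    header-erase "X-Internal-Token"
    response-headers-permitted "Content-Type Content-Length Set-Cookie"
}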
If any of that doesn't meet your needs, you can further customize the experience with local traffic policies or iRules, which we'll dig into in part nine of this series.

Chunking

The request/response chunking fields help deal with compression, which we'll talk about in part eight of this series. These fields control how BIG-IP handles chunked content from clients and servers as indicated in the Transfer-Encoding header. The table below shows how the settings in the profile handle chunked and unchunked data for requests and responses.

OneConnect Transformations

We'll address OneConnect in part seven of this series, but basically, enabling transformations AND having a OneConnect profile allows the system to change non-keepalive connection requests on the client side to keepalive connections on the server side. If this feature is enabled WITHOUT a OneConnect profile, which is the default, no action is taken.

Redirect Rewrites

SSL offload is a very popular component of BIG-IP deployments. Traffic from client->BIG-IP is typically secured with SSL, and often not protected from BIG-IP->server. Some servers are either not configured for, or just not able to handle, returning secured URLs if the server is serving unsecured content. There are many ways to handle this, but this particular setting can be configured to assure that all (or request-matching) URLs are rewritten from http:// to https:// links. For redirects to node IP addresses, you can also configure the redirect to be rewritten to the virtual server address instead. The default for this setting is to not rewrite redirects. The actions taken for each option are as follows:

None: Specifies that the system does not rewrite the URI in any HTTP redirect responses.
All: Specifies that the system rewrites the URI in all HTTP redirect responses.
Matching: Specifies that the system rewrites the URI in any HTTP redirect responses that match the request URI.
Nodes: Specifies that if the URI contains a node IP address instead of a host name, the system changes it to the virtual server address.

Note that it is not content that is being rewritten here. Only the Location header in the redirect is updated.

Cookie Encryption

This is a security feature, enabling you to take an unencrypted cookie in a server response, encrypt it with a 192-bit AES cipher, and b64 encode it before sending the response to the client. Multiple cookies can be specified in a space-separated list. This can also be done in iRules against all cookies, without calling out specific cookie names.

X-Forwarded-For

The X-Forwarded-For header is a de-facto standard for passing client IP addresses to servers when the true nature of the force (ahem, the IP path) is not visible to servers. This is common with network address translation being prevalent between clients and servers. By default the Insert field is disabled, but if enabled, the client-side IP address of the packets being received by BIG-IP will be inserted into the X-Forwarded-For header and sent to the server. As X-Forwarded-For is a de-facto standard, that means it is no standard at all: some proxies will insert an additional header and some will just append to an existing one. The Accept XFF and XFF Alternate Names fields are used for system level statistics and whether they can be trusted. More details on that (specific to ASM) can be found in solution K12264. Enabling X-Forwarded-For in the HTTP profile does not sanitize existing X-Forwarded-For headers already present in the client request.
If there are two present in the request, then BIG-IP, if configured to insert one, will add a third. You can test this easily with curl:

curl -H "X-Forwarded-For: 172.16.31.2" -H "X-Forwarded-For: 172.16.31.3" http://www.test.local/

Here are my wireshark captures on my Ubuntu test server before and after enabling the XFF header insertion in the profile. If you want to pass one X-Forwarded-For header and one only to the server, an iRule is a good idea to achieve this behavior.

Linear White Space

There are two fields for handling linear white space: max columns (default: 80) and the separator (default: \r\n). Linear white space is any number of spaces, tabs, and newlines IF followed by at least one space or tab. From RFC 2616:

A CRLF is allowed in the definition of TEXT only as part of a header field continuation. It is expected that the folding LWS will be replaced with a single SP before interpretation of the TEXT value.

I've not had a reason to ever change these settings. You can test for the presence of linear white space in a header with the HTTP::header lws command. This stack thread has a good answer on what linear white space is.

Max Requests

This is pretty straightforward. The default of zero does not limit the number of HTTP requests per connection, but changing this number will limit client requests on a per-connection basis to the value specified.

Via Headers

Regardless of request or response, this setting controls how BIG-IP handles the Via header. Like traceroute for IP packets, the Via header is appended to by each intermediary along the way, so when the message reaches its destination, the HTTP path (not the IP path) should be known. (All HTTP/1.1 proxies are required to participate, whereas HTTP/1.0 proxies may or may not append the Via header.) When updating the header, the proxy is required to identify its hostname (or a pseudonym) and the HTTP version of the previous server in the path. Each proxy appends its information in sequential order to the header. Here's an example Via header from Mozilla's developer portal:

Via: 1.1 vegur
Via: HTTP/1.1 GWA
Via: 1.0 fred, 1.1 p.example.net

The available selections for this setting, request or response, are to preserve, remove, or append the Via header.

Server Agent Name

This is the value that BIG-IP populates in the Server header for response traffic. The default of BigIP gives away infrastructure clues, so many implementations will change this default. The Server header is analogous in response traffic to the User-Agent header in client traffic.

What questions do you have? Drop a comment below if there is anything I can clarify. Join me next week when we move on to the HTTP profile enforcement settings.

What is HTTP Part VIII - Compression and Caching
In the last article of this What is HTTP? series we covered the nuances of OneConnect on HTTP traffic through the BIG-IP. In this article, we'll cover caching and compression. We'll deal with compression first, and then move on to caching.

Compression

In the very early days of the internet, much of the content was text based. This meant that the majority of resources were very small in nature. As popularity grew, the desire for more rich content filled with images grew as well, and resource sizes began to explode. What had not yet exploded, however, was the bandwidth available to handle all that rich content (and you could argue that's still the case in mobile and remote terrestrial networks as well.) This intersection of more resources without more bandwidth led to HTTP development in a few different areas:

Methods for getting or sending partial resources
Methods for identifying if resources needed to be retrieved at all
Methods for reducing resources during transit that could be successfully reproduced after receipt

The various range headers were developed to handle the first case; caching, which we will discuss later in this article, was developed to handle the second case; and compression was developed to handle the third case. The basic definition of data compression is simply reducing the bits necessary to accurately represent the resource. This is done not only to save network bandwidth, but also on storage devices to save space. And of course money in both areas as well. In HTTP/1.0, end-to-end compression was possible, but not hop-by-hop, as it has no distinguishing mechanism between the two. That is addressed in HTTP/1.1, so intermediaries can use complex algorithms unknown to the server or client to compress data between them and translate accordingly when speaking to the clients and servers respectively.

In 11.x forward, compression is managed in its own profile; prior to 11.x, it was included in the http profile. The httpcompression profile overview on AskF5 is very thorough, so I won't repeat that information here, but you will want to pay attention to the compression level if you are using gzip (the default). The default of level 1 is fast from the perspective of the act of compressing on BIG-IP, but having done minimal compressing, reaps the least amount of benefit on the wire. If a particular application has great need for less bandwidth utilization toward the client side of the network footprint, bumping up to level 6 will increase the reduction in bandwidth without overly taxing the BIG-IP to perform the operation. Also, it's best to avoid compressing data that has already been compressed, like images and pdfs. Compressing them actually makes the resource larger, and wastes BIG-IP resources doing it! (SVG format would be an exception to that rule.) Also, don't compress small files; the profile default is 1M for minimum content length. For BIG-IP hardware platforms, compression can be performed in hardware to offload that function. There is a database variable that you can configure to select the data compression strategy via sys modify db compression.strategy. The default value is latency, but there are four other strategies you can employ as covered in the manual.
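Pulling those recommendations together, here is a hedged tmsh sketch of a custom compression profile; the profile name and excluded content types are illustrative, and the attribute names should be verified against your version:

# child compression profile: stronger gzip, skip already-compressed types
tmsh create ltm profile http-compression myapp_compression {
    defaults-from httpcompression
    gzip-level 6
    content-type-exclude { image/jpeg image/png application/pdf }
}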
Caching

Web caching could (and probably should) be its own multi-part series. The complexities are numerous, and the details plentiful. We did a series called Project Acceleration that covered some of the TCP optimization and compression topics, as well as the larger product we used to call Web Accelerator but is now the Application Acceleration Manager, or AAM. AAM is caching and application optimization on steroids, and we are not going to dive that deep here. We are going to focus specifically on HTTP caching and how the default functionality of the ramcache works on the BIG-IP.

Consider the situation where there is no caching: every request from the browser communicates with the web server, no matter how infrequently the content changes. This is a wasteful use of resources on the server, the network, and even the client itself. The most important resource to our short-attention-span end users is time! The more objects and the more distance from the server, the longer the end user waits for that page to render.

One way to help is to allow local caching in the browser. This way, the first request will hit the web server, and repeat requests for that same resource will be pulled from the cache (assuming the resource is still valid; more on that below.)

Finally, there is the intermediary cache. This can live immediately in front of the end users like in an enterprise LAN, in a content distribution network, immediately in front of the servers in a datacenter, or all of the above! In this case, the browser1 client requests an object not yet in the cache serving all the browser clients shown. Once the cache has the object from the server, it will serve it to all the browser clients, which offloads the requests to the server, saves the time in doing so, and brings the response closer to the browser clients as well.

Given the benefits of a caching solution, let's talk briefly of the risks. If you take the control of what's served away from the server and put it in the hands of an intermediary, especially an intermediary the administrators of the origin server might not have authority over, how do you control what content the browsers ultimately load? That's where the HTTP standards on caching control come into play. HTTP/1.0 introduced the Pragma, If-Modified-Since, Last-Modified, and Expires headers for cache control. The Cache-Control and ETag headers, along with a slew of "If-" conditional headers, were introduced in HTTP/1.1, but you will see many of the HTTP/1.0 cache headers in responses alongside the HTTP/1.1 headers for backwards compatibility. Rather than try to cover the breadth of caching here, I'll leave it to the reader to dig into the quite good resources linked at the bottom (start with "Things Caches Do") for detailed understanding. However, there's a lot to glean from your browser developer tools and tools like Fiddler and HttpWatch.

Consider this request from my browser for the globe-sm.svg file on f5.com. Near the bottom of the image, I've highlighted the request Cache-Control header, which has a value of no-cache. This isn't a very intuitive name, but what the client is directing the cache to do is submit the request to the origin server every time, even if the content is fresh. This assures authentication is respected while still allowing the cache to be utilized for content delivery. In the response, the Cache-Control header has two values: public and max-age. The max-age here is quite large, so this is obviously an asset that is not expected to change much. The public directive means the resource can be stored in a shared cache.
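Since the screenshot doesn't travel well in text, here is a hedged plain-text reconstruction of the exchange just described; the max-age value and Content-Type are illustrative, not the exact values from the capture:

GET /globe-sm.svg HTTP/1.1
Host: f5.com
Cache-Control: no-cache

HTTP/1.1 200 OK
Content-Type: image/svg+xml
Cache-Control: public, max-age=31536000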
Now that we have a basic idea what caching is, how does the BIG-IP handle it? The basic caching available in LTM is handled in the same profile that AAM uses, but there are some features missing when AAM is not provisioned. It used to be called ramcache, but is now the webacceleration profile. Solution K14903 provides the overview of the webacceleration profile, but we'll discuss the cache size briefly. Unlike the Web Accelerator, there is no disk associated with the ramcache. As the name implies, this is "hot" cache in memory. So if you are memory limited on your BIG-IP, 100MB might be a little too large to keep locally. Managing the items in cache can be done via the tmsh command line with the ltm profile ramcache command; tmsh show/delete operations can be used against this method. An example show on my local test box:

root@(ltm3)(cfg-sync Standalone)(Active)(/Common)(tmos)# show ltm profile ramcache webacceleration
Ltm::Ramcaches /Common/webacceleration
Host: 192.168.102.62
URI : /
--------------------------------------
Source Slot/TMM    1/1
Owner Slot/TMM     1/1
Rank               1
Size (bytes)       3545
Hits               5
Received           2017-11-30 22:16:47
Last Sent          2017-11-30 22:56:33
Expires            2017-11-30 23:16:47
Vary Type          encoding
Vary Count         1
Vary User Agent    none
Vary Encoding      gzip,deflate

Again, if you have AAM licensed, you can provision it, and additional fields will then be shown in the webacceleration profile above to allow an acceleration policy to be applied against your virtual server.

Resources

RFC 2616 - The standard fine print.
Things Caches Do - Excellent napkin diagrams that provide simple explanations of caching operations.
Caching Tutorial - Comprehensive walk through of caching.
HTTP Caching - Brief but informative look at caching from a webdev perspective.
HTTP Caching - Google developers page with examples, flowcharts, and advice on caching strategies.
Project Acceleration - Our 10 part series on web acceleration technology available on the BIG-IP platform in LTM and/or AAM modules.
Solution K5157 - BIG-IP caching and the Vary header
Make Your Cache Work For You - Article by Dawn Parzych here on DevCentral on tuning techniques

What is HTTP Part X - HTTP/2
In the penultimate article in this What is HTTP? series we covered iRules and local traffic policies and the power they can unleash on your HTTP traffic. To date in this series, the content has focused primarily on HTTP/1.1, as that is still the predominant industry standard. But make no mistake, HTTP/2 is here and here to stay, garnering 30% of all website traffic and climbing steadily. In this article, we'll discuss the problems in HTTP/1.1 addressed in HTTP/2 and how BIG-IP supports the major update.

What's So Wrong with HTTP/1.1?

It's obviously a pretty good standard since it's lasted as long as it has, right? So what's the problem? Well, let's set security aside for this article, since the HTTP/2 committee pretty much punted on it anyway, and let's instead talk about performance. Keep in mind that the foundational constructs of the HTTP protocol come from the internet equivalent of the Jurassic age, where the primary function was to get and post text objects. As the functionality stretched from static sites to dynamic, interactive, and real-time applications, the underlying protocols didn't change much to support this departure. That said, the two big issues with HTTP/1.1 as far as performance goes are repetitive metadata and head of line blocking. HTTP was designed to be stateless. As such, all applicable metadata is sent on every request and response, which adds anywhere from a minimal to a grotesque amount of overhead.

Head of Line Blocking

For HTTP/1.1, this phenomenon occurs because each request needs a completed response before a client can make another request. Browser hacks to get around this problem involved increasing the number of TCP connections allowed to each host from one to two, and currently to six, as you can see in the image below. More connections, more objects, right? Well, yeah, but you still deal with the overhead of all those connections, and as the number of objects per page continues to grow, the scale doesn't make sense. Other hacks on the server side include things like domain sharding, where you create the illusion of many hosts so the browser creates more connections. This still presents a scale problem eventually. Pipelining was a thing as well, allowing for parallel connections and the utopia of improved performance. But as it turns out, it was not a good thing at all, proving quite difficult to implement properly and brittle at that, resulting in a grand total of ZERO major browsers actually supporting it.

Radical Departures - The Big Changes in HTTP/2

HTTP/2 still has the same semantics as HTTP/1. It still has request/response, headers in key/value format, a body, etc. And the great thing for clients is that the browser handles the wire protocols, so there are no compatibility issues on that front. There are many improvements and feature enhancements in the HTTP/2 spec, but we'll focus here on a few of the major changes. John recorded a Lightboard Lesson a while back on HTTP/2 with an overview of more of the features not covered here.

From Text to Binary

With HTTP/2 comes a new binary framing layer, doing away with the text-based roots of HTTP. As I said, the semantics of HTTP are unchanged, but the way they are encapsulated and transferred between client and server changes significantly. Instead of a text message with headers and body in tow, there are clear delineations for headers and data, transferred in isolated binary-encoded frames (photo courtesy of Google).
Client and server need to understand this new wire format in order to exchange messages, but the applications need not change to utilize the core HTTP/2 changes. For backwards compatibility, all client connections begin as HTTP/1 requests with an upgrade header indicating to the server that HTTP/2 is possible. If the server can handle it, a 101 response to switch protocols is issued by the server; if it can't, the header is simply ignored and the interaction will remain on HTTP/1. You'll note in the picture above that TLS is optional, and while that's true to the letter of the RFC law (see my punting-on-security comment earlier), the major browsers have not implemented it as optional, so if you want to use HTTP/2, you'll most likely need to do it with encryption.

Multiplexed Streams

HTTP/2 solves the HTTP/1.1 head of line problem by multiplexing requests over a single TCP connection. This allows clients to make multiple requests of the server without requiring a response to earlier requests. Responses can arrive in any order as the streams all have identifiers (photo courtesy of Google). Compare the image below of an HTTP/2 request to the one from the HTTP/1.1 section above. Notice two things: 1) the reduction of TCP connections from six to one and 2) the concurrency of all the objects being requested. In the brief video below, I toggle back and forth between HTTP/1.1 and HTTP/2 requests at increasing latencies, thanks to a demo tool on golang.org, and show the associated reductions in page load experience as a result. Even at very low latency there is an incredible efficiency in making the switch to HTTP/2. This one change obviates the need for many of the hacks in place for HTTP/1.1 deployments.

One thing to note on head of line blocking: TCP actually becomes a stumbling block for HTTP/2 due to its congestion control algorithms. If there is any packet loss in the TCP connection, the retransmit has to be processed before any of the other streams are managed, effectively halting all traffic on that connection. Protocols like QUIC are being developed to ride the UDP wave and overcome some of the limitations in TCP holding back even better performance in HTTP/2.

Header Compression

Given that headers and data are now isolated by frame types, the headers can be compressed independently, and there is a new compression utility specifically for this called HPACK. This occurs at the connection level. The improvements are two-fold. First, the header fields are encoded using Huffman coding, thus reducing their transfer size. Second, the client and server maintain a table of previous headers that is indexed. This table has static entries that are pre-defined for common HTTP headers, plus dynamic entries added as headers are seen. Once dynamic entries are present in the table, the index for that dynamic entry will be passed instead of the header values themselves (photo courtesy of amphinicy.com).

BIG-IP Support

F5 introduced the HTTP/2 profile in 11.6 as an early access feature, and it hit general availability in 12.0. The BIG-IP implementation supports HTTP/2 as a gateway, meaning that all your clients can interact with the BIG-IP over HTTP/2, but server-side traffic remains HTTP/1.1. Applying the profile also requires the HTTP and clientssl profiles. If using the GUI to configure the virtual server, the HTTP/2 Profile field will be grayed out until you select an HTTP profile.
It will let you try to save it at that point even without a clientssl profile, but will complain when saving:

01070734:3: Configuration error: In Virtual Server (/Common/h2testvip) http2 specified activation mode requires a client ssl profile

As far as the profile itself is concerned, the fields available for configuration are shown in the image below. Most of the fields are pretty self-explanatory, but I'll discuss a few of them briefly.

Insert Header - This field allows you to configure a header to inform the HTTP/1.1 server on the back end that the front-end connection is HTTP/2.
Activation Modes - The options here are to restrict modes to ALPN only, which would then allow HTTP/1.1 or negotiate to HTTP/2, or Always, which tells BIG-IP that all connections will be HTTP/2.
Receive Window - We didn't cover the flow control functionality in HTTP/2, but this setting sets the level (HTTP/2 v3+) where individual streams can be stalled.
Write Size - This is the size of the data frames in bytes that HTTP/2 will send in a single write operation. A larger size will improve network utilization at the expense of an increased data buffer.
Header Table Size - This is the size of the indexed static/dynamic table that HPACK uses for header compression. A larger table size will improve compression, but at the expense of memory.

In this article, we covered the basics of the major benefits of HTTP/2. There are more optimizations and features to explore, such as server push, which is not yet supported by BIG-IP. You can read about many of those features in this very excellent article on Google's developers portal, where some of the images in this article came from.
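To tie the BIG-IP pieces together, here is a hedged tmsh sketch of the gateway setup described above; every object name is mine, and the http2 attribute names and values (particularly activation-modes) are assumptions to verify against your version:

# child http2 profile restricted to ALPN negotiation
tmsh create ltm profile http2 myapp_http2 { defaults-from http2 activation-modes { alpn } }

# HTTP/2 gateway virtual: http + clientssl + http2 on the client side,
# HTTP/1.1 to the pool members (myapp_clientssl and myapp_pool are hypothetical)
tmsh create ltm virtual myapp_https {
    destination 10.0.0.10:443
    ip-protocol tcp
    profiles add { http myapp_clientssl myapp_http2 }
    pool myapp_pool
}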
What is HTTP?

tl;dr - The Hypertext Transfer Protocol, or HTTP, is the predominant tool in the transferring of resources on the web, and a "must-know" for many application delivery concepts utilized on BIG-IP.

HTTP defines the structure of messages between web components such as browser or command line clients, servers like Apache or Nginx, and proxies like the BIG-IP. As most of our customers manage, optimize, and secure at least some HTTP traffic on their BIG-IP devices, it's important to understand the protocol. This introductory article is the first of eleven parts on the HTTP protocol and how BIG-IP supports it. The series will take the following shape:

What is HTTP? (this article)
HTTP Series Part II - Underlying Protocols
HTTP Series Part III - Terminology
HTTP Series Part IV - Clients, Proxies, & Servers — Oh My!
HTTP Series Part V - Profile Basic Settings
HTTP Series Part VI - Profile Enforcement
HTTP Series Part VII - Oneconnect
HTTP Series Part VIII - Compression & Caching
HTTP Series Part IX - Policies & iRules
HTTP Series Part X - HTTP/2

A Little History

Before the World Wide Web of Hypertext Markup Language (HTML) was pioneered, the internet was alive and well with bulletin boards, ftp, and gopher, among other applications. In fact, by the early 1990s, ftp accounted for more than 50% of the internet traffic! But with the advent of HTML and HTTP, it only took a few years for the World Wide Web to completely rework the makeup of the internet. By the late 1990s, more than 75% of the internet traffic belonged to the web.

What makes up the web? Well, get ready for a little acronym salad. There are three semantic components of the web: URIs, HTML, and HTTP.

The URI is the Uniform Resource Identifier. Think of the URI as a pointer. The URI is a simple string and consists of three parts: the protocol, the server, and the resource. Consider https://devcentral.f5.com/s/articles/. The protocol is https, the server is devcentral.f5.com, and the resource is /articles/. URL, which stands for Uniform Resource Locator, is actually a form of a URI, but for the most part they can be used interchangeably. I will clarify the difference in the terminology article.

HTML is short for the HyperText Markup Language. It's based on the more generic SGML, or Standard Generalized Markup Language. HTML allows content creators to provide structure, text, pictures, and links to documents. In our context, this is the HTTP payload that BIG-IP might inspect, block, update, etc.

HTTP, as declared earlier, is the most common way of transferring resources on the web. Its core functionality is a request/response relationship where messages are exchanged. An example of a GET message in the HTTP/1.1 version is shown in the image below. This is eye candy for now, as we'll dig into the underlying protocols and HTTP terminology shown here in the following two articles. But take notice of the components we talked about earlier defined there. The protocol is identified as HTTP. Following the method is our resource /home, and the server is identified in the Host header. Also take note of all those silly carriage returns and new lines. Oh, the CRLF!! If you've dealt with monitors, you can feel our collective pain!

HTTP Version History

Whereas HTTP/2 has been done for more than two years now, current usage is growing but still less than 20%, with HTTP/1.1 laboring along as the dominant player.
We'll cover version-specific nuances later in this series, but the major releases throughout the history of the web are:

HTTP/0.9 - 1990
HTTP/1.0 - 1996
HTTP/1.1 - 1999
HTTP/2 - 2015

Given the advancements in technology in the last 18 years, the longevity of HTTP/1.1 is a testament to that committee (or an indictment of the HTTP/2 committee, you decide!) Needless to say, due to the longevity of HTTP/1.1, most of the industry expertise exists there. We'll wrap this series with HTTP/2, but up front, know that it's a pretty major departure from HTTP/1.1, most notably in that it is a binary protocol, whereas earlier versions of HTTP were all textual.

What is HTTP Part III - Terminology
Have you watched the construction of a big building over time? For the first few weeks, the footers and foundation are being prepared to support the building. It seems like not much is happening, but that ground work is vital to the overall success of the project. So it is with this series, so stay with me: the foundation is important. This week, we begin to dig into the HTTP specifications, and we'll start with defining the related terminology.

World Wide Web, or WWW, or simply "the web" - The collection of resources accessible amongst the global interconnected system of computers.
Resource - An object or service identified by a URI. A web page is a resource, but images, scripts, and stylesheets for a webpage would also be resources.
Web Page - A document accessible on the web by way of a URI. Example: this article.
Web Site - A collection of web pages. An example would be this site, accessed by the DNS hostname devcentral.f5.com.
Web Client - The software application that requests the resources on a web site by generating, receiving, and processing HTTP messages. A web client is always the initiator.
Web Server - The software application that serves the resources of a web site by receiving, processing, and generating HTTP messages. A web server does not initiate traffic to a client. Examples would be Apache, NGINX, IIS, or even an F5 BIG-IP iRule!
Uniform Resource Identifier, or URI - As made clear in the name, the URI is an identifier, which can mean the resource name, or location, or both. All URLs (locators) and URNs (names) are URIs, but not all URIs are URLs or URNs. Think Venn diagrams here. Consider https://clouddocs.f5.com/api/irules/. The key difference between the URI (which would be /api/irules/ in this case) and the URL is that the URL combines the name of the resource, the location of that resource (clouddocs.f5.com,) and the method to access that resource (https://.) In the URL, the port is assumed to be 80 if the request method is http, and assumed to be 443 if the request method is https. If the default ports are not appropriate for a particular location, the port will need to be provided immediately after the hostname, like https://clouddocs.f5.com:50443/api/irules/.
Message - This is the basic HTTP unit of communication.
Header - This is the control section within an HTTP message.
Entity - This is the body of an HTTP message.
User Agent - This is a string of tokens that should be inserted by a web client as a header called User-Agent. The tokens are listed in order of significance. Most web clients perform this on your behalf, but there are browser tools that allow you to manipulate this, which can be good for testing, client statistics, or even client remediation. Keep in mind that bad actors might also manipulate this for nefarious purposes, so any kind of access control based solely on user agents is ill advised.
Proxy - Like in the business world, a proxy acts as an intermediary: a server to clients, and a client to servers. Proxies must understand HTTP messages. We'll dig deeper into proxies in the next article.
Cache - This is web resource storage that can exist at the server, any number of intermediary proxies, the browser, or all of the above. The goals of caching are to reduce bandwidth consumption on the networks, reduce compute resource utilization on the servers, and reduce page load latency on the clients.
Cookie - Originally added for managing state (since HTTP itself is a stateless protocol,) a cookie is a small piece of data to be stored by the web client as instructed by the web server.
Standards Group Language - I won't dive deep into this, but as you learn a protocol, knowing how to read RFCs will serve you well. As new protocols, or new versions of existing protocols, are released, there are interpretation challenges that companies work through to make sure everyone is adhering to the "standard." Sometimes this is an agreeable process, other times not so much. A basic understanding of what must, should, or may be done in a request/response can make your troubleshooting efforts go much more smoothly.

Basic Message Format

Requests

The syntax of an HTTP request message has the following format:

request-line
headers
CRLF (carriage return / line feed)
message body (optional)

An example of this is shown below:

-----Request Line----
|GET / HTTP/1.1
---------------------
-------Headers-------
|Host: roboraiders.net
|Connection: keep-alive
|User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36
|Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
|Accept-Encoding: gzip, deflate
|Accept-Language: en-US,en;q=0.8
----------------------

The CRLF (present in the tcpdump capture but not seen here) denotes the end of the headers, and there is no body. Notice in the request line, you see the request method, the URI, and the http protocol version of the client.

Responses

The syntax of an HTTP response message has a very similar format:

status-line
headers
CRLF (carriage return / line feed)
message body (optional)

An example of this is shown below:

-----Status Line-----
|HTTP/1.1 200 OK
---------------------
-------Headers-------
|Server: openresty/1.9.15.1
|Date: Thu, 21 Sep 2017 17:19:01 GMT
|Content-Type: text/html; charset=utf-8
|Transfer-Encoding: chunked
|Connection: keep-alive
|Vary: Accept-Encoding
|Content-Encoding: gzip
---------------------
---Zipped Content----
|866
|...........Y.v.....S...;..$7IsH...7.I..I9:..Y.C`....].b....'.7....eEQoZ..~vf.....x..o?^....|[Q.R/..|.ZV.".*..$......$/EZH../..n.._E..W^..
---------------------

Like the request line of an HTTP request, the protocol of the server is stated in the HTTP response status-line. Also stated is the response code, in this case a 200 OK. You'll notice that the only header in this case that is similar between the request and response is the Connection header.

HTTP Request Methods

The URI is a resource with which the client wants to interact. The request method provides the "how" the client would like to interact with the resource. There are many request methods, but we'll focus on the few most popular methods in this article.

GET - This method is used by a client to retrieve resources from a server.
HEAD - Like the GET method, but only retrieves the metadata that would be returned with the payload of a GET; no payload is returned. This is useful in monitoring and troubleshooting.
POST - Used primarily to upload data from a client to a server, by means of creating or updating an object through a process handler on the server. Due to security concerns, there are usually limitations on who can do this, how it's done, and how big an update will be allowed.
PUT - Typically used to replace the contents of a resource.
DELETE - This method removes a resource.
PATCH - Used to modify but not replace the contents of a resource.

If you are at all interested in using iControl REST to perform automation on your BIG-IP infrastructure, all the methods above except HEAD are instrumental in working with the REST interface.

HTTP Headers

There are general headers that can apply to both requests and responses, and then there are specific headers based on whether it is a request OR a response message. Note that there are differences between supported HTTP/1.0 and HTTP/1.1 headers, but I'll leave the nuances to the reader to study. We'll cover HTTP/2 at the end of this series. Examples in parentheses are not complete, and do not denote the actual header names. RFC 2616 has all the HTTP/1.1-defined headers documented.

General Headers

These headers can be present in either requests or responses. Conceptually, they deal with the broader issues of client/server sessions like timing, caching, and connection management. The Connection header in the request and response messages above is an example of a general connection-management header in action.

Request Headers

These headers are for requests only, and are utilized to inform the server (and intermediaries) of preferred response behavior (acceptable encodings,) constraints for the server (range of content or host definition,) conditional requests (resource modification timestamps,) and client profile (user agent, authorization.) The Host header in the request message above is an example of a request header constraining the server to that identity.

Response Headers

Like with requests, response headers are for responses only, and are utilized for security (authentication challenges,) caching (timing and validation,) information sharing (identification,) and redirection. The Server header in the response message above is an example of a response header identifying the server.

Entity Headers

Entity headers exist not to provide request or response messaging context, but to provide specific insight about the body or payload of the message. The Content-Type header in the response message above, for example, instructs the client that the payload of the response is just text and should be rendered as such.

A note on MIME types: web clients/servers are for the most part "dumb" in that they do not guess at content types based on analysis; they follow the instructions in the message via the Content-Type header. I've experienced this in both directions. For iControl REST development, BIG-IP returns an error if you send json payload but do not specify application/json in the Content-Type header. I also had a bug in an ASM deployment once where the ASM violation response page was set to text/html but should have been application/json, so the browser client never displayed the error; you had to find it buried in browser tools until we corrected that issue.

HTTP Response Status Codes

We will conclude with a brief discussion on status codes. Before getting into the specifics, there are a couple of general areas where regular awareness and analysis should be part of your system management: security and SEO. On security, there are many things one can learn through status codes (and headers for that matter) about server patterns and behavior, as well as information leakage from not scrubbing application errors before they are returned to clients. With SEO, how redirects and missing files are handled can hurt or help your overall impressions and ranking power. Moz has a good best practices article on managing status codes for search engines.
But back to the point at hand: status codes. There are five categories and 41 status codes recognized in HTTP/1.1.

Informational - 1xx - This category was added for HTTP/1.1. Used to inform clients that a request has been received and the initial request (likely a POST of data) can continue.
Success - 2xx - Used to inform the client that the request was processed successfully.
Redirection - 3xx - The request was received, but the resource needs to be dealt with in a different way.
Client Error - 4xx - Something went wrong on the client side (bad resource, bad authentication, etc.)
Server Error - 5xx - Something went wrong on the server side.

Application monitors pay particularly close attention to the 5xx errors. Security practitioners focus in on 4xx/5xx errors, but watch even 2xx/3xx messages if baseline volume and accessed resources start to skew from normal.

Join us next week when we start to talk about clients, proxies, and servers, oh my!

What is HTTP Part VI - HTTP Profile Enforcement Settings
In last week's article in the What is HTTP? series, we covered the basic settings of the HTTP profile. This week, we'll cover the enforcement parameters available in the profile, again focusing strictly on the reverse proxy mode profile options. In TMOS version 12.1, the enforcement parameters are as follows (default values shown.)

Allow Truncated Redirect

This is a simple toggle that allows you to choose how BIG-IP handles redirects without the trailing CRLF. By default the redirect will be dropped silently.

Maximum Header Size & Count

The HTTP specification does not dictate the max header size, which in this case is not the size of a single header but the size of the request line and all headers combined, and various servers set their maximums differently. The BIG-IP default is 32k, but Apache's is 8k, and depending on version, nginx is 4k-8k, IIS is 8k-16k, and Tomcat is 8k-48k. So if you are fronting any of these servers other than potentially Tomcat on a particular virtual, you can dial down the profile default significantly without impacting the servers. There is a behavior difference, however. Most servers will return a 4xx status code, and I've seen 400, 404, and 413 offered up. Don't you love standards? BIG-IP will simply reset the TCP connection. You can limit individual header sizes in iRules, which has been done in various rules written to provide 0day mitigations in advance of application server patches (a sketch follows after the Method Management section below.) The max header count is simply that: as long as the header count in the request is less than 65, all is well.

Pipeline Action

The pipeline action is another simple toggle that allows or rejects new requests from clients if previous requests have not yet received a response. What is pipelining, you ask? Think of a telephone conversation: when you ask a question, or make a comment, the person on the other end responds. This is the behavior you see without pipelining. Now think of a debate you've observed, where one debater lays out a ten-part question. He typically does so all at once. After he has made all his statements or posed all his questions, the other debater will handle those statements and questions in order. This is the behavior you see with pipelining. This pipeline action setting is where the moderator in the debate would step in and allow or refuse the first debater any more statements/questions until the second debater had responded. Lori MacVittie shared this good visual for pipelining behavior in her article on security risks inherent with allowing pipelining (shown below.)

Method Management

These settings are helpful to eliminate attack vectors to the unknown. If your application is a simple GET/POST application, then there is no need to allow for anything else. By rejecting unknown methods and disabling all methods except GET and POST in the enabled methods list, you reduce your threat landscape to issues the server MAY be exposed to. Managing these settings carefully should mean that you have a tight communicative relationship with your application team, so that as the application evolves there aren't occasions where new methods are introduced without profile adjustments.
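As a hedged illustration of the iRule approach mentioned above (the 4k threshold and the 405 response are my choices, not a specific published mitigation), something like this could enforce a per-header size cap and a GET/POST-only policy:

when HTTP_REQUEST {
    # reject any request carrying a single header value over 4k
    foreach name [HTTP::header names] {
        if { [string length [HTTP::header $name]] > 4096 } {
            reject
            return
        }
    }
    # allow only GET and POST, mirroring a strict method management policy
    if { !([HTTP::method] eq "GET" || [HTTP::method] eq "POST") } {
        HTTP::respond 405 noserver
        return
    }
}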
Bonus Features!

These are not enforcement settings, but they wrap up the reverse proxy settings in the HTTP profile, so I am including them here.

sFlow

sFlow is a scalable packet sampling technology that allows you to observe traffic patterns for performance, troubleshooting, billing, and security purposes. The two settings in the HTTP profile are the polling interval and the sampling rate. The polling interval is the maximum interval between polls, and the sampling rate is the ratio of packets observed to samples generated. So for the defaults of 10 seconds and 1024 packets respectively, there would be two polls within 10 seconds of each other, in which the system randomly generates one sample for every 1024 packets observed. You can specify the sample rate and polling interval here in the profile, or you can accept the default, which inherits the settings specified for HTTP under System->sFlow->Global Settings.

HTTP Strict Transport Security

Strict Transport Security is a security header that instructs the browser to submit every request for that domain via SSL, even if the protocol on the URL returned from the server was HTTP instead of HTTPS. HSTS is one of those settings that began as an iRule (a simple header insertion for client responses) and later became part of the product. You enable HSTS by checking the Mode box. The Maximum Age setting is in seconds, and the default is about six months. The Include Subdomains checkbox, if enabled, will instruct the browser to do the same thing for subdomains, which may or may not be desirable.

Include subdomains can be dangerous! Say you want to enable HSTS on domain test.local because it is SSL-enabled. But you forget that you have an HTTP-only site called suckstobeyou.test.local. If you checked the include subdomains box, then any client that hit the test.local domain will now be unable to get to the subdomain until they clear their browser of this setting or the max age expires, which at a default of six months is effectively bricking your own domain! Be careful!
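Since HSTS began life as an iRule, here is roughly what that looks like; a minimal sketch, assuming the profile's roughly-six-month default max-age and no subdomain scope. The Mode checkbox in the profile is the supported way to do this now, so treat the iRule form as a fallback for versions or virtuals where the profile option isn't available.

when HTTP_RESPONSE {
    # insert the HSTS header on every response; 16070400 seconds is
    # roughly six months, mirroring the profile default
    HTTP::header insert Strict-Transport-Security "max-age=16070400"
}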
What is HTTP Part IX - Policies and iRules

In the previous article of this What is HTTP? series we covered optimization techniques in HTTP with compression and caching. In this article we pivot slightly from describing the protocol itself to showing ways to interact with the HTTP protocol on BIG-IP: policies and iRules. There is a plethora of content on DevCentral covering what policies are and how they work, so rather than regurgitate that here, we'll focus on what you can do with each approach and show a couple of common examples. Before we do, however, run (don't walk!) over to Chase Abbott's To iRule, or not to iRule: Introduction to Local Traffic Policies article and Steve McCarthy's LTM Policy article and read those first. Seriously, stop here and read those articles!

Example 1: Simple HTTP Redirect

I haven't performed any advanced analysis on our Q&A section in quite a while, but at one time, redirection held the lion's share of questions, and I still see them trickle in. Sometimes an application owner will move resources from one location to another within the app, either because the underlying platform has changed or a navigation or site map redesign has occurred. Alternatively, the app owner may just want some vanity or random URLs to surface in advertisements so it's easier to market and track demographically. To reorient application clients to the new resource locations, you can simply rewrite the URL before sending it to the server, which goes unnoticed by the client, or you can redirect them to the new resource, which results in an HTTP response to the client and a second request for BIG-IP to handle for the "right" resource. The 300-level status codes are used by the "server" to instruct the client on necessary additional actions. The two primary redirection codes are 301, which instructs the client that the resource has moved permanently, and 302, which instructs the client that the move is temporary. For this example, we're going to create a vanity URL to temporarily redirect clients from /http to this series landing page at /articles/sid/8311.

Policy Solution

Each policy has one or more rules. In this policy, we just need the one rule, with the condition of /http as the path and the action of redirecting a matching request to the specified URL.

ltm policy simple_redirect {
    controls { forwarding }
    last-modified 2017-12-14:23:22:51
    requires { http }
    rules {
        vanity_http {
            actions {
                0 {
                    http-reply
                    redirect
                    location https://devcentral.f5.com/s/articles/sid/8311
                }
            }
            conditions {
                0 {
                    http-uri
                    path
                    values { /http }
                }
            }
        }
    }
    status published
    strategy first-match
}

Writing policies textually can be a little overwhelming, but you can also configure them in the GUI as shown below. Note that in version 12.1 and later, you need to publish a policy before it can be used, and altering policies requires making a draft copy before activating any changes.

iRules Solution

The iRule solution is actually quite short.

when HTTP_REQUEST {
    if { [HTTP::path] eq "/http" } {
        HTTP::redirect "https://devcentral.f5.com/s/articles/sid/8311"
    }
}

While short and simple, it is not likely to be as performant as the policy. However, since it is a very simple iRule with little overhead, the performance savings might not justify the complexity of using policies, especially if the work is being done on the command line.
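One hedged variation on that iRule: if whoever owns the vanity URL wants to know which link drove the visit, you could carry the original URI along as a query parameter. The parameter name "from" and the analytics use case are assumptions for illustration, not part of the original example.

when HTTP_REQUEST {
    if { [HTTP::path] eq "/http" } {
        # append the originally requested URI so the landing page (or its
        # analytics) can see where the client came from
        HTTP::redirect "https://devcentral.f5.com/s/articles/sid/8311?from=[URI::encode [HTTP::uri]]"
    }
}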
Also of note, if the requirement is a permanent redirect, then you must use an iRule, because you cannot force a 301 from a policy alone. With a permanent redirect, you have to change the HTTP::redirect command to an HTTP::respond and specify the 301 with a Location header.

when HTTP_REQUEST {
    if { [HTTP::path] eq "/http" } {
        HTTP::respond 301 Location "https://devcentral.f5.com/s/articles/sid/8311"
    }
}

There is a way to use a policy/iRule hybrid for 301 redirection, but if you need an iRule to complete the policy, I'd recommend just using the iRule approach.

Example 2: Replace X-Forwarded-For with Client IP Address

Malicious users might try to forge client IPs in the X-Forwarded-For (XFF) header, which can potentially lead to bad things. In this example, we'll strip any XFF headers from the client request and insert a new one.

Policy Solution

In this policy, we don't care what the conditions are; we want a sweeping decision to replace the XFF header, so no condition is necessary, just an action. You'll notice the tcl: immediately before the iRules [IP::client_addr] command. These Tcl expressions are supported in policies, and trade a little performance for additional flexibility.

ltm policy xff {
    last-modified 2017-12-15:00:39:26
    requires { http }
    rules {
        remove_replace {
            actions {
                0 {
                    http-header
                    replace
                    name X-Forwarded-For
                    value tcl:[IP::client_addr]
                }
            }
        }
    }
    status published
    strategy first-match
}

This works great if you have a single XFF header, but what if you have multiples? Let's check Wireshark on my curl command that inserts two headers, with IP addresses 172.16.31.2 and 172.16.31.3. You can see that one of the two XFF headers was replaced, but not both, which does not accomplish our goal. So instead, we need to do a remove and an insert in the policy, like below.

ltm policy xff {
    last-modified 2017-12-15:00:35:35
    requires { http }
    rules {
        remove_replace {
            actions {
                0 {
                    http-header
                    remove
                    name X-Forwarded-For
                }
                1 {
                    http-header
                    insert
                    name X-Forwarded-For
                    value tcl:[IP::client_addr]
                }
            }
        }
    }
    status published
    strategy first-match
}

By removing and re-inserting, we achieved our goal, as shown in the Wireshark output from the same curl request with two IP addresses inserted in XFF headers.

iRules Solution

The iRule is much like the policy in that we'll remove and insert.

when HTTP_REQUEST {
    HTTP::header remove "X-Forwarded-For"
    HTTP::header insert "X-Forwarded-For" [IP::client_addr]
    # You could also use replace since it will insert if missing
    # HTTP::header replace "X-Forwarded-For" [IP::client_addr]
}

Again, short and sweet! Like the policy example above, HTTP::header remove removes all occurrences of the XFF header, whereas HTTP::header replace only rewrites the last match, which is why the remove-and-insert pattern is the safe one here.

Profile Solution

This is where we step back a little and evaluate whether we need an iRule or policy at all. You might recall from part five of this series the settings in the HTTP profile for the X-Forwarded-For header. But as I noted there, enabling this just adds another X-Forwarded-For header; it doesn't replace the existing ones. A slick little workaround (much like our policy solution above, but completely contained in the HTTP profile!) is to add the header to the erase and insert fields, with the values set as shown below. Keep in mind you can only erase/insert a single header, so it works for this single solution, but as soon as you have more parameters, the policy or iRule is the better option.
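A hedged middle ground between replacing and trusting: if your forensics needs call for preserving the upstream chain rather than discarding it, you could append instead of replace. This is a sketch, not a recommendation; the comma-joined format follows common XFF convention, and forged upstream values are retained by design, so only the final address is trustworthy.

when HTTP_REQUEST {
    # collect every XFF value the client (or intermediaries) sent
    set xff [HTTP::header values X-Forwarded-For]
    HTTP::header remove X-Forwarded-For
    if { [llength $xff] > 0 } {
        # keep the (untrusted) upstream chain and append the address
        # we actually saw on the connection
        HTTP::header insert X-Forwarded-For "[join $xff {, }], [IP::client_addr]"
    } else {
        HTTP::header insert X-Forwarded-For [IP::client_addr]
    }
}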
Profiles, Policies, and iRules…Where's My Point of Control?

This isn't an HTTP problem, but since we're talking about solutions, I'd be remiss if I didn't address it. Performance is a big concern with deployments, but not the only concern. One other big area that needs to be carefully planned is where your operational team is comfortable with its points of control. If application logic is distributed among profiles, policies, and iRules, and that distribution varies not just within a single app but across all applications, there is significant risk in resolving issues when they do arise if documentation of the solutions is lacking. It's perfectly ok to spread the logic around, but everyone should be aware that XYZ is in iRules, ABC is in policies, and DEF is in profiles. Otherwise, your operations teams will spend a significant amount of time just mapping out what belongs where so they can understand the application flow within the BIG-IP! I say all that to drive home the point that an all-iRules approach is an ok solution if the needs policies address are a small percentage of the overall application services configured on BIG-IP. It's also ok to standardize on policies and push back on application teams to move the issues policies can't address back into the application itself. The great thing is you have options.

Additional Resources

Getting Started with iRules
iRule Recipe 1: Single URL Explicit Redirect
Codeshare: iRules & HTTP
Getting Started with Policies (AskF5 Manual)
LTM Policy Recipes I
LTM Policy Recipes II
Automatically Redirect http to https on a Virtual Server (AskF5 Solution)
Forwarded HTTP Extension Insertion - RFC 7239 (Codeshare iRule)
What is HTTP Part II - Underlying Protocols

Last week in part one of this series, we took the 50,000-foot view of the HTTP protocol. HTTP defines the structure of message transfer for web resources, but doesn't have anything to say about, or do with, the underlying infrastructure (through version 1.1; version 2 is a slight departure from my claim, and we'll cover that in part 12.) So before we move on to HTTP terminology next week, we'll spend time today discussing internet plumbing.

Photo: Bob Duran on Flickr

The first two layers in the OSI model, physical and data-link, are consolidated in the network interface layer of the TCP/IP model. Consider the network interface layer to be a river and ship, a road and a car, a rail and a train, or a plane and smooth skies. The lowest-level means by which messages are transferred doesn't really matter to HTTP. It doesn't care about or have visibility into the network interface layer, so whether it's frame relay, ATM, SONET, DWDM, Ethernet, etc., it doesn't care.

IP Layer

The network layer (OSI model), or the internet protocol (IP) layer (TCP/IP model), is the next layer. Think of this as the shipping container your letters or boxes are contained in. That container left its source, was shipped via many potential shipping paths, and arrived at a destination. At that destination, the container is then opened and your letter or box is delivered to you (the next layer up, which would be TCP.) Now, HTTP still doesn't care about this purely from a messaging standpoint, but you as an administrator, or the way the server is configured, might care about some of the information regarding that container, like the source and destination information. It's really amazing the internet functions as well as it does. It's not one centrally-controlled organism; it's built by thousands of companies with thousands of devices that just play hot potato all day long. Receive here, send there. Routing infrastructures are complex beasts, but aside from some security controls in place, they really just pass packets as fast as possible.

TCP Layer

The transmission control protocol (TCP) layer, or transport layer (OSI), as should be obvious by its name, is the control freak of message transportation. The letter pulled from your container has overhead (the envelope) and payload (your actual letter.) The mailman doesn't read your letter; he just reads the IP information (src/dst), validates postage, and notes any particular services you might have on that envelope. Say you sent 10 letters and they arrived in 4 different containers: the mailman would collect them as they arrived and re-order them for you before handing them up the chain. Ok, maybe that's a lot to expect of a mailman, but hopefully the analogy is clear.

At this layer, typically only the client and server machines, and intermediary devices like firewalls and proxies, get involved. TCP is session-oriented, and as such, there are expectations of behavior. Think of a phone call. Jason dials a phone number and waits for John to answer. Once John says "hey," Jason receives that acknowledgement and can respond as well. So Jason and John have that "handshake" to start the call (the 3-way handshake: phone dial, acknowledge, acknowledge) followed by a meaningful conversation (payload) and finally a closing salutation (the 4-way close: I'm finished, I'm finished, good-bye, good-bye.) So it is with TCP.
And what we do naturally with our speech (slowing down if the listener can't take in that many words at once, repeating ourselves if there was static on the line), TCP handles as well through its protocol definitions. So HTTP does not need to manage transport reliability; TCP does this on HTTP's behalf. This is important to understand for highly available technologies like session mirroring, where mirroring is largely unnecessary overhead for protocols like HTTP. As with IP information, there is some TCP information you might care about with HTTP, like src/dst ports, or other header information, as when Akamai uses TCP options to insert the original IPs of clients hiding behind proxies, but very little of that information is useful or helpful at the HTTP layer.

DNS

DNS is not a layer, but a critical application that, combined with HTTP, makes the web work. Back in my network engineer days (and my younger days, when I had a better memory) I used to call DNS a crutch for people without the fortitude to remember IP addresses. DNS, or the domain name system, exists to map names to numbers. Largely, people remember names better than numbers. DNS, like the IP infrastructure it relies on, is a patchwork of individually controlled and managed systems in a hierarchical architecture of domains. For more details on how DNS works and why it's important, check out Peter's DevCentral Basics What is DNS? article.

Putting It All Together

When you type https://devcentral.f5.com/s into the locator bar of your browser, the first task is not to send an IP packet to our server. First, the browser does a DNS lookup for that name, checking your system's cache, then asking the local DNS server configured on your system. At that point, the LDNS server will do the lookups on your behalf, retrieve the answer, and return it to your system, where the answer will be cached for some period of time (hopefully at or shorter than the specified time to live.) Then your system will attempt to establish a session with our server (dial the phone and greet each other.) If they reach an agreement, only THEN will the HTTP message be sent from the client to the server. A simple flow of the HTTP traffic can be seen in the image below.
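And if a diagram isn't handy, here is the same resolve-then-connect-then-speak sequence as a back-of-the-napkin sketch in plain Tcl, the language iRules extend. This is not how a browser is implemented: shelling out to the host utility for the lookup, the example hostname, and the use of plain HTTP on port 80 (to sidestep TLS) are all assumptions for illustration.

# 1. resolve the name first (a browser checks its cache, then asks LDNS)
set name "www.example.com"
set ip [lindex [exec host -t A $name] end]

# 2. establish the TCP session: the 3-way handshake happens inside [socket]
set sock [socket $ip 80]
fconfigure $sock -translation binary

# 3. only then does the HTTP message go on the wire
puts -nonewline $sock "GET / HTTP/1.1\r\nHost: $name\r\nConnection: close\r\n\r\n"
flush $sock
puts [read $sock]
close $sock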
Conclusion

HTTP messages aren't exchanged in a vacuum. A robust infrastructure of routers, firewalls, proxies, switches, and assisting protocols like DNS exists to help make the web work. Come back next week when we begin the journey into the specifics of the HTTP protocol.

What is HTTP Part IV - Clients, Servers, and Proxies. Oh My!

So far in this series, we have covered the basics of HTTP messaging, the underlying protocols, and last week, we focused on digging a little deeper into the terminology. This week we will focus on the HTTP speakers: clients, servers, and proxies. Borrowing from the underlying protocols article, we'll just place a happy little accidental blur over the devices in the hop path that aren't actively engaged at the HTTP protocol layer. Note that the devices speaking HTTP also perform these lower layers, but that is not the focus here.

Clients and Servers

Every journey has a beginning and an end, a source and a destination. So it is with HTTP. I'm lumping these two together because they are the consumers and suppliers in this story. An application is an HTTP client if it speaks HTTP. Now, the client might do a lot of other things unrelated to HTTP itself, but the core client functionality is this: send a request message and receive a response message. With that in mind, you can use all sorts of tools as HTTP clients. Like telnet:

MACDADDY:~ jrahm$ telnet
telnet> open 10.10.10.10 5000
Trying 10.10.10.10...
Connected to vmlab.
Escape character is '^]'.
GET /rants.html HTTP/1.1

HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 1126
Server: Werkzeug/0.11.15 Python/2.7.12
Date: Thu, 28 Sep 2017 18:06:33 GMT
...

Or build your own socket client with Python!

>>> import socket
>>> s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>> s.connect(('10.10.10.10', 5000))
>>> s.send('GET /rants.html HTTP/1.1\n\n')
26
>>> while True:
...     response = s.recv(1024)
...     if response == '': break
...     print response,
...
HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 1126
Server: Werkzeug/0.11.15 Python/2.7.12
Date: Thu, 28 Sep 2017 18:10:41 GMT
...
>>> s.close()

In many client-server systems, most if not all intelligence resides on the server side of the relationship. This is not the case with web traffic. There is shared intelligence between web clients and servers, with more intelligence seemingly accelerating toward the big daddy of all HTTP clients: the browser. I don't want to dive too far down the rabbit hole, but the browser evolution over the years is nothing short of astounding. Do you recall the crazy number of plugins or off-browser technical gymnastics you had to do to play a video? Now we can't get them NOT to play! Browsers manage name resolution, compression, caching, cookies, request routing, rendering, security, etc. The browser is an incredibly powerful and rich application platform that far exceeds the scope of this article or series, but an HTTP client it is nonetheless.

Servers serve content. It's what they do (thank you, Capt. Obvious!) It can be static content like scripts, stylesheets, images, etc., but most modern sites are incredibly dynamic. Quick aside: in some architectures, you have the web server, the application server, and then the database server. Sometimes these are collapsed into front/back pairs or combined altogether. The added layers and functionality, however, don't change the underlying HTTP messaging required for a server of that web content to return a response back to the client. Servers can be just as surprisingly simple or mind-numbingly complex as the client.
For example, an iRule running on your BIG-IP can be a "server" and serve up HTTP responses with a single command in the HTTP_REQUEST event, or even just respond to a certain URL:

# respond to every request with a simple "ok"
when HTTP_REQUEST {
    HTTP::respond 200 content "ok"
}

# respond to /service-check.html only
when HTTP_REQUEST {
    if { [HTTP::path] eq "/service-check.html" } {
        HTTP::respond 200 content "ok"
    }
}

Also, with micro-frameworks like Python's Flask package, you can serve up HTTP response messages with a few lines of code:

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello, World!'

In addition to content management, like clients, there are additional services that can be offered by servers: some of the same services as the client, such as compression and caching, and some additional services like authentication and authorization, session handling, tenant isolation, etc. Lightweight command-line HTTP clients (ab, netcat, curl) and server applications (like twisted) are fantastic for doing functional testing of your BIG-IP deployments, and many of them can be scaled out to do some pretty helpful performance testing as well. And whereas BIG-IP (via iRules, policies, or HTTP profiles) can be a client and/or server in the messaging game, it is not usually considered as such. BIG-IP is considered a proxy.

Proxies

If clients and servers are the consumers and suppliers in our story, then proxies are the slick used car salesmen. The high-volume operators. The border agents inspecting the precious cargo. The custom fabricators. They are the middle men! And just like in society, there are positive use cases for middle men, and some not so positive ones (man-in-the-middle attacks.) Earlier this year, as part of our DevCentral Basics series, an excellent article was published covering the basic concepts of proxies, so I won't belabor that here, but I do want to address some proxy functionality:

Caching - En vogue on client-side networks back when bandwidth was at a premium, a caching proxy would retrieve and store content locally so multiple requests for the same content didn't need to rob bandwidth for repeat fetches. In this use case it would be a forward proxy, but you can cache at the reverse proxy as well. This is available in the Application Acceleration Manager module. We will discuss caching more in-depth later in this series.

Non-HTTP services gateway - As an intermediary, a proxy manages client and server traffic, and can implement other services that may or may not be HTTP-related at all. ICAP is a good example of this.

Transformations - Managing client- and server-side available functionality separately. For example, maintaining HTTP/1.0 on the client side and HTTP/1.1 on the server side.

Client Anonymity - On the server side, you might want to track the history of where your clients are coming from for various purposes, but if a large percentage of them are behind a shared proxy, this could impact not only your statistics but also your load distribution in your infrastructure, depending on how you architect your solution. Some companies like Akamai will insert in TCP options the (as far as they can tell) source IP so that information is available. The more common way proxies deal with this (where it is activated) is by inserting the HTTP X-Forwarded-For header.

Request/Response Inspection/Filtering - This happens with forward or reverse proxy implementations. You want a border control agent making sure the tourists are who they say they are and aren't trying to cross the border with unauthorized goods (a quick sketch of this follows the list).
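Here is a minimal sketch of that inspection/filtering role on a BIG-IP reverse proxy. The /admin path and the flat 403 response are hypothetical placeholders for whatever your border-control policy actually is; a real deployment would lean on Application Security Manager for anything beyond simple path gating.

when HTTP_REQUEST {
    # hypothetical rule: nothing under /admin should ever arrive from
    # the public side of this proxy, so refuse it at the border
    if { [HTTP::path] starts_with "/admin" } {
        HTTP::respond 403 content "Forbidden"
    }
}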
The BIG-IP platform supports the forward proxy use case through Secure Web Gateway Services, and the reverse proxy functionality through the core Local Traffic Manager module, with additional access and security services delivered by Access Policy Manager and Application Security Manager respectively. Moving forward in this series, we will focus primarily on the reverse proxy functionality within BIG-IP. Join us next week when we start to talk about the BIG-IP HTTP profile.