tcp
60 Topics

Layer 4 vs Layer 7 DoS Attack
Not all DoS (Denial of Service) attacks are the same. While the end result is to consume as much - hopefully all - of a server or site's resources such that legitimate users are denied service (hence the name), there is a subtle difference in how these attacks are perpetrated that makes one easier to stop than the other.

SYN Flood

A Layer 4 DoS attack is often referred to as a SYN flood. It works at the transport protocol (TCP) layer. A TCP connection is established in what is known as a 3-way handshake. The client sends a SYN packet, the server responds with a SYN ACK, and the client responds to that with an ACK. After the 3-way handshake is complete, the TCP connection is considered established. It is at this point that applications begin sending data using a Layer 7 or application layer protocol, such as HTTP.

A SYN flood uses the inherent patience of the TCP stack to overwhelm a server by sending a flood of SYN packets and then ignoring the SYN ACKs returned by the server. This causes the server to use up resources waiting a configured amount of time for the anticipated ACK that should come from a legitimate client. Because web and application servers are limited in the number of concurrent TCP connections they can have open, an attacker who sends enough SYN packets can easily chew through the allowed number of TCP connections, thus preventing legitimate requests from being answered by the server.

SYN floods are fairly easy for proxy-based application delivery and security products to detect. Because they proxy connections for the servers, and are generally hardware-based with a much higher TCP connection limit, the proxy-based solution can handle the high volume of connections without becoming overwhelmed. Because the proxy-based solution is usually terminating the TCP connection (i.e. it is the "endpoint" of the connection), it will not pass the connection to the server until it has completed the 3-way handshake.
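A minimal sketch of the proxy behavior described above, as a toy model rather than any real product's implementation. The `HandshakeProxy` class, its `max_half_open` limit, and the `(address, port)` client IDs are all hypothetical names chosen for illustration. The point it shows is that half-open connections pile up at the proxy, and the server only ever sees clients that sent the final ACK.

```python
class HandshakeProxy:
    """Toy model: the proxy owns the 3-way handshake on behalf of the server."""

    def __init__(self, max_half_open):
        self.max_half_open = max_half_open  # capacity for SYNs awaiting an ACK
        self.half_open = set()              # clients that sent SYN, no ACK yet
        self.forwarded = []                 # connections passed to the server

    def on_syn(self, client):
        """Record a SYN; return True if a SYN ACK would be sent."""
        if len(self.half_open) >= self.max_half_open:
            return False                    # table full: the SYN is dropped
        self.half_open.add(client)
        return True

    def on_ack(self, client):
        """Complete the handshake; only now does the server see the client."""
        if client in self.half_open:
            self.half_open.remove(client)
            self.forwarded.append(client)
            return True
        return False


proxy = HandshakeProxy(max_half_open=1000)

# An attacker sends SYNs and never answers the SYN ACKs.
for port in range(500):
    proxy.on_syn(("198.51.100.7", port))

# A legitimate client completes the handshake.
proxy.on_syn(("203.0.113.5", 40000))
proxy.on_ack(("203.0.113.5", 40000))

print(len(proxy.half_open))  # half-open entries held at the proxy
print(proxy.forwarded)       # only the completed connection reaches the server
```

A real proxy would also expire half-open entries after a timeout; that is left out here to keep the sketch short.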
Thus, a SYN flood is stopped at the proxy and legitimate connections are passed on to the server with alacrity. The attackers are generally stopped from flooding the network through the use of SYN cookies. SYN cookies utilize cryptographic hashing and are therefore computationally expensive, making it desirable to allow a proxy/delivery solution with hardware-accelerated cryptographic capabilities to handle this type of security measure. Servers can implement SYN cookies, but the additional burden placed on the server negates much of the gain achieved by preventing SYN floods and often results in available, but unacceptably slow, servers and sites.

HTTP GET DoS

A Layer 7 DoS attack is a different beast and it's more difficult to detect. A Layer 7 DoS attack is often perpetrated through the use of HTTP GET. This means that the 3-way TCP handshake has been completed, thus fooling devices and solutions which are only examining layer 4 and TCP communications. The attacker looks like a legitimate connection, and is therefore passed on to the web or application server. At that point the attacker begins requesting large numbers of files/objects using HTTP GET. They are generally legitimate requests; there are just a lot of them. So many, in fact, that the server quickly becomes focused on responding to those requests and has a hard time responding to new, legitimate requests.

When rate-limiting was used to stop this type of attack, the bad guys moved to using a distributed system of bots (zombies) to ensure that the requests (the attack) were coming from myriad IP addresses and were therefore not only more difficult to detect, but more difficult to stop. The attacker uses malware and trojans to deposit a bot on servers and clients, and then remotely includes them in the attack by instructing the bots to request a list of objects from a specific site or server.
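A minimal sketch of per-client rate limiting of the kind mentioned above, assuming a sliding-window counter with a temporary blacklist. The `RateLimiter` class and its `max_requests`, `window`, and `ban_time` parameters are hypothetical, and timestamps are passed in by the caller to keep the example deterministic. It is not any vendor's implementation. It also makes the weakness visible: limits are tracked per IP address, so a distributed attack spreads its requests under the threshold.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Sliding-window request counter with a temporary per-client blacklist."""

    def __init__(self, max_requests, window, ban_time):
        self.max_requests = max_requests    # allowed requests per window
        self.window = window                # window length in seconds
        self.ban_time = ban_time            # blacklist duration in seconds
        self.history = defaultdict(deque)   # client IP -> recent timestamps
        self.banned_until = {}              # client IP -> time the ban ends

    def allow(self, ip, now):
        """Return True if the request at time `now` should be served."""
        if self.banned_until.get(ip, 0) > now:
            return False                    # still blacklisted
        recent = self.history[ip]
        while recent and recent[0] <= now - self.window:
            recent.popleft()                # forget requests outside the window
        recent.append(now)
        if len(recent) > self.max_requests:
            self.banned_until[ip] = now + self.ban_time
            recent.clear()
            return False
        return True


limiter = RateLimiter(max_requests=5, window=1.0, ban_time=60.0)
burst = [limiter.allow("192.0.2.10", now=0.1 * i) for i in range(8)]
print(burst)                                   # first 5 allowed, then denied
print(limiter.allow("192.0.2.10", now=30.0))   # still blacklisted
print(limiter.allow("192.0.2.10", now=61.0))   # ban has expired
print(limiter.allow("192.0.2.99", now=0.5))    # other clients are unaffected
```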
The attacker might not use bots, but instead might gather enough evil friends to launch an attack against a site that has annoyed them for some reason.

Layer 7 DoS attacks are more difficult to detect because the TCP connection is valid and so are the requests. The trick is to realize when there are multiple clients requesting large numbers of objects at the same time and to recognize that it is, in fact, an attack. This is tricky because there may very well be legitimate requests mixed in with the attack, which means a "deny all" philosophy will result in the very situation the attackers are trying to force: a denial of service.

Defending against Layer 7 DoS attacks usually involves some sort of rate-shaping algorithm that watches clients and ensures that they request no more than a configurable number of objects per time period, usually measured in seconds or minutes. If the client requests more than the configured number, the client's IP address is blacklisted for a specified time period and subsequent requests are denied until the address has been freed from the blacklist.

Because this can still affect legitimate users, layer 7 firewall (application firewall) vendors are working on ways to get smarter about stopping layer 7 DoS attacks without affecting legitimate clients. It is a subtle dance and requires a bit more understanding of the application and its flow, but if implemented correctly it can improve the ability of such devices to detect and prevent layer 7 DoS attacks from reaching web and application servers and taking a site down.

The goal of deploying an application firewall or proxy-based application delivery solution is to ensure the fast and secure delivery of an application. By preventing both layer 4 and layer 7 DoS attacks, such solutions allow servers to continue serving up applications without a degradation in performance caused by dealing with layer 4 or layer 7 attacks.

The TCP Send Buffer, In-Depth
Earlier this year, my guide to TCP Profile tuning set out some guidelines on how to set send-buffer-size in the TCP profile. Today I'll dive a little deeper into how the send buffer works, and why it's important. I also want to call your attention to cases where the setting doesn't do what you probably think it does.

What is the TCP Send Buffer?

The TCP send buffer contains all data sent to the remote host but not yet acknowledged by that host. With a few isolated exceptions*, data not yet sent is not in the buffer and remains in the proxy buffer, which is the subject of different profile parameters.

The send buffer exists because sent data might need to be retransmitted. When an acknowledgment for some data arrives, there will be no retransmission and the sender can free that data.**

Each TCP connection will only take system memory when it has data to store in the buffer, but the profile sets a limit called send-buffer-size to cap the memory footprint of any one connection.

Note that there are two send buffers in most connections, as indicated in the figure above: one for data sent to the client, regulated by the clientside TCP profile; and one for data sent to the server, regulated by the serverside TCP profile.

Cases Where the Configured Send Buffer Limit Doesn't Apply

Through TMOS v12.1, there are important cases where the configured send buffer limit is not operative. It does not apply when the system variable tm.tcpprogressive.autobuffertuning is enabled, which is the default, AND at least one of the following attributes is set in the TCP profile:

- MPTCP is enabled
- Rate Pacing is enabled
- Tail Loss Probe is enabled
- TCP Fast Open is enabled
- Nagle's Algorithm is in 'Auto' mode
- Congestion Metrics Cache Timeout > 0
- The Congestion Control algorithm is Vegas, Illinois, Woodside, CHD, CDG, Cubic, or Westwood
- The virtual server executes an iRule with the 'TCP::autowin enable' command
- The system variable tm.tcpprogressive is set to 'enable' or 'mptcp' (the default value is 'negotiate')
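The conditions above can be read as a single predicate. The sketch below is only an illustration of that logic: the function name `configured_limit_applies`, the dictionary keys, and the lower-case algorithm names are hypothetical stand-ins, not real BIG-IP configuration attributes or an F5 API.

```python
# Congestion control algorithms listed above as triggers (names lower-cased here).
AUTOTUNING_CC = {"vegas", "illinois", "woodside", "chd", "cdg", "cubic", "westwood"}


def configured_limit_applies(profile, autobuffertuning=True, tcpprogressive="negotiate"):
    """Return True if send-buffer-size is enforced, per the rules listed above.

    `profile` is a plain dict with hypothetical keys; it is not a BIG-IP object.
    """
    if not autobuffertuning:
        return True                          # auto-tuning off: limit always applies
    triggers = [
        profile.get("mptcp", False),
        profile.get("rate_pace", False),
        profile.get("tail_loss_probe", False),
        profile.get("fast_open", False),
        profile.get("nagle") == "auto",
        profile.get("cmetrics_cache_timeout", 0) > 0,
        profile.get("congestion_control") in AUTOTUNING_CC,
        profile.get("irule_autowin", False),
        tcpprogressive in ("enable", "mptcp"),
    ]
    return not any(triggers)                 # any single trigger lifts the limit


print(configured_limit_applies({"congestion_control": "reno"}))
print(configured_limit_applies({"congestion_control": "cubic"}))
print(configured_limit_applies({"congestion_control": "cubic"}, autobuffertuning=False))
```

The `"reno"` value is just an example of an algorithm that is not in the trigger list.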
Note that none of these settings apply to the default TCP profile, so the default profile enforces the send buffer limit.

Given the conditions above, the send buffer maximum is one of two values, instead of the configured one:

- If the configured send buffer size AND the configured receive buffer size are 64K or less, the maximum send buffer size is 64K.
- Otherwise, the maximum send buffer size is equal to the system variable tm.tcpprogressive.sndbufmax, which defaults to 16MB.

We fully recognize that this is not an intuitive way to operate, and have plans to streamline it soon. However, note that you can force the configured send buffer limit to always apply by setting tm.tcpprogressive.autobuffertuning to 'disabled', or force it to never apply by enabling tm.tcpprogressive.

What if send-buffer-size is too small?

The send buffer size is a practical limit on how much data can be in flight at once. Say you have 10 packets to send (allowed by both congestion control and the peer's receive window) but only 5 spaces in the send buffer. Then the other 5 will have to wait in the proxy buffer until at least the first 5 are acknowledged, which will take one full Round Trip Time (RTT). Generally, this means your sending rate is firmly limited to

(Sending Rate) = (send-buffer-size) / RTT

regardless of whatever available bandwidth there happens to be, congestion and peer receive windows, and so on. Therefore, we recommend that your send buffer be set to at least your (maximum achievable sending rate) * RTT, more generally known as the Bandwidth-Delay Product (BDP). There's more on getting the proper RTT below.

What if send-buffer-size is too large?

If the configured size is larger than the bandwidth-delay product, your BIG-IP may reserve more memory per connection than it can put to use at any given time, reducing the capacity of your system.

A sending rate that exceeds the uncongested BDP of the path will cause router queues to build up and possibly overflow, resulting in packet losses.
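To put numbers on the send-buffer ceiling described above, here is a short worked calculation. The function names and the example figures (a 64 KB buffer, a 100 ms RTT, a 100 Mbit/s path) are illustrative assumptions, not measurements from the article.

```python
def max_send_rate(send_buffer_bytes, rtt_seconds):
    """Upper bound on the sending rate in bytes/second: buffer size / RTT."""
    return send_buffer_bytes / rtt_seconds


def bdp_bytes(bandwidth_bits_per_sec, rtt_seconds):
    """Bandwidth-delay product in bytes."""
    return bandwidth_bits_per_sec / 8 * rtt_seconds


buffer_size = 64 * 1024          # 64 KB send buffer
rtt = 0.100                      # 100 ms round trip

rate = max_send_rate(buffer_size, rtt)
print(f"ceiling: {rate * 8 / 1e6:.2f} Mbit/s")

needed = bdp_bytes(100e6, rtt)   # buffer needed to fill a 100 Mbit/s path
print(f"needed:  {needed / 1024:.0f} KB")
```

With these example values the buffer, not the link, sets the pace: the ceiling comes out far below 100 Mbit/s, and the buffer needed to fill the path is well over a megabyte.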
Although such queue buildup is intrinsic to TCP's design, a sufficiently low send buffer size prevents TCP congestion control from reaching sending rates where it will obviously cause congestion losses.

An over-large send buffer may not matter depending on the remote host's advertised receive window. BIG-IP will not bring data into the send buffer if the receive window says it can't be sent. The size of that receive window is limited by the TCP window scale option (see RFC 7323, Section 2.2) in the SYN packet.

How Do I Get the Bandwidth-Delay Product?

The profile tuning article gives some pointers on using iRules to figure out the bandwidth and RTT on certain paths, which I won't repeat here. TCP Analytics can also generate some useful data here. And you may have various third-party tools (the simplest of which is "ping") to get one or both of these metrics.

When computing BDP, beware of the highest RTTs you observe on a path. Why? Bufferbloat. Over some intervals, TCP will send data faster than the bottleneck bandwidth, which fills up queues and adds to RTT. As a result, TCP's peak bandwidth will exceed the path's, and the highest RTT will include a lot of queueing time. This isn't good. A sending rate that includes self-induced queueing delay isn't getting data there any faster; instead, it's just increasing latency for itself and everybody else.

I wish I could give you more precise advice, but there are no easy answers here. To the extent you can probe the characteristics of the networks you operate on, you want to take (max bandwidth) * (min RTT) to find each path's BDP, and take the maximum of all those path BDPs. Easier said than done! But perhaps this article has given you enough intuition about the problem to create a better send-buffer-size setting.

In case a measurement program is not in the cards, I'll leave you with the chart below, which plots 64KB, 128KB, and 256KB buffer sizes against representative BDPs of various link types.
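For readers who do have measurements, the selection rule described above, (max bandwidth) * (min RTT) per path and then the maximum across paths, is simple to compute. This is only a sketch: the function names are hypothetical and the sample figures are invented for illustration, not taken from any real network or from the chart.

```python
def path_bdp(bandwidth_samples_bps, rtt_samples_s):
    """BDP in bytes from the highest bandwidth and the lowest RTT seen on a path."""
    return max(bandwidth_samples_bps) / 8 * min(rtt_samples_s)


def recommended_send_buffer(paths):
    """Largest per-path BDP across all measured paths."""
    return max(path_bdp(bw, rtt) for bw, rtt in paths.values())


# Invented measurements: bandwidth in bits/s, RTT in seconds.
# The high RTT samples stand in for bufferbloat and are ignored by min().
paths = {
    "dsl":   ([8e6, 10e6], [0.030, 0.045, 0.120]),
    "lte":   ([20e6, 35e6], [0.050, 0.070, 0.300]),
    "fiber": ([900e6, 940e6], [0.004, 0.006]),
}

for name, (bw, rtt) in paths.items():
    print(f"{name}: {path_bdp(bw, rtt) / 1024:.0f} KB")
print(f"send-buffer-size candidate: {recommended_send_buffer(paths) / 1024:.0f} KB")
```

Using the minimum RTT rather than the average is the design choice that keeps self-induced queueing delay out of the result.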
* If TCP blocks transmission due to the Nagle Algorithm or Rate Pacing, the unsent data will already be in the send buffer.
** A client can renege on a SACK (Selective Acknowledgment), so a SACK alone is not sufficient to free the SACKed data.

TCP Internals: 3-way Handshake and Sequence Numbers Explained
In this article, I will explain and show you what really happens during a TCP 3-way handshake as captured by the tcpdump tool. We'll go deeper into the details of the TCP 3-way handshake (SYN, SYN/ACK and ACK) and how Sequence Numbers and Acknowledgement Numbers actually work. Moreover, I'll also briefly explain, using real data, how TCP Receive Window and Maximum Segment Size play an important role in a TCP connection. As a side note, I will not touch TCP SACK and TCP Timestamps this time, as they should be covered in a future article about TCP retransmissions.

FYI, the TCP capture was generated by a simple HTTP GET request to BIG-IP to get hold of a file called script.pl in the /cgi-bin/ directory using the HTTP/1.1 protocol. BIG-IP then responds with HTTP/1.1 200 OK and the requested data. This is not very relevant as we'll be looking at the TCP layer, but it's good to understand the capture's context to fully understand what's going on.

This is what a TCP 3-way handshake looks like on Wireshark:

As we can see, the first 3 packets are exchanged less than 1 second apart from each other. The IN/OUT portion of the Info field on BIG-IP's capture tells us if the packet is coming IN or being sent OUT by BIG-IP (as the capture was taken on BIG-IP). As this is a slightly more in-depth explanation of TCP internals, I am assuming you know at least what a TCP 3-way handshake is conceptually.

The TCP SYN, SYN/ACK and ACK Segments

We can see that the first packet is [SYN], the second one is [SYN/ACK] and the last one is [ACK], as displayed on Wireshark. The Info section as a whole only shows a summary of the most relevant fields copied from the TCP header. It is just enough to make us understand the context of the TCP segment. Let's now have a look at what these fields mean, with the exception of SACK_PERM and TSval. When we double click on the [SYN] packet below, we find the same information again in the actual TCP header:

The most important thing to understand here is that [SYN], [SYN/ACK] and [ACK] are all part of the Flags header above.
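As a small illustration of how those labels map onto the Flags field, the sketch below decodes a flags byte into names. The bit values are the standard ones from the TCP specification; the `decode_flags` helper itself is a hypothetical name and has nothing to do with how Wireshark is implemented.

```python
# Bit values of the classic TCP control flags (as defined in RFC 9293).
TCP_FLAGS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_flags(value):
    """Return the names of the flags whose bit is set in `value`."""
    return [name for name, bit in TCP_FLAGS.items() if value & bit]


print(decode_flags(0x02))  # first handshake segment: SYN only
print(decode_flags(0x12))  # second segment: SYN and ACK bits both set
print(decode_flags(0x10))  # third segment: ACK only
print(decode_flags(0x18))  # a data segment: PSH and ACK
```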
These flags are just 1's and 0's. When the SYN flag is enabled (i.e. its value is 1), the receiving end (in this case BIG-IP) should automatically understand that someone (my client PC in this case) is trying to establish a TCP connection. The response from BIG-IP (SYN/ACK) is an acknowledgement of the SYN packet and therefore it has both SYN and ACK flags set to 1. The client's last response is just an ACK as seen below:

As per RFC, both sides should now assume a TCP connection is established. For the plain-text HTTP/1.1 protocol, there should now be a GET request in another layer as a payload of (or encapsulated by) the TCP layer. If our traffic is protected by TLS, then the TLS layer should come first as the payload of the TCP layer and HTTP would be the payload of the TLS layer. Does it make sense? That's how things work in the real world.

TCP Sequence numbers

A side note: Wireshark shows that our first SYN segment's Sequence number is 0 (Seq=0):

It also shows that it is a relative sequence number, but this is not the real TCP sequence number. Wireshark automatically zeroes it for you to make it easier to visualise and/or troubleshoot. In reality, the real sequence number is a much longer number that is calculated by your OS using the current time and other random parameters for security purposes. This is how we see the real sequence number in Wireshark:

Now back to business. Some people say that if the Client sends a TCP segment to BIG-IP, BIG-IP's ACK should be the client's sequence number + 1, right? Wrong! Instead of +1 it should be + the number of bytes last received from the peer, or +1 for SYN or FIN segments. To clarify, here's the full Flow Graph of our capture using relative sequence numbers to make it easier to grasp (.135 = Client and .143 = BIG-IP):

On the 4th segment above (PSH, ACK - Len: 93), the client sends a TCP segment with Seq = 1 and a TCP payload data length (comprised of the HTTP layer) of 93 bytes. In this case, BIG-IP's response is not ACK = 2 (1 + 1) as some might think.
Instead, BIG-IP responds with whatever the client's last Sequence number was plus the number of bytes last received. As the last sequence number was 1 and the client also sent a TCP payload of 93 bytes, the ACK is 94! This is the most important concept to grasp for understanding sequence numbers and ACKs. SEQs and ACKs only increment when there is a TCP payload involved (by the number of bytes). SYN, FIN or ZeroWindow segments count as 1 byte for SEQs/ACKs. I added a full analysis using real TCP SEQs/ACKs to an Appendix section if you'd like to go deeper into it. For the moment let's shift our attention towards the TCP Receive Window.

TCP Receive Window and Maximum Segment Size (MSS)

During the 3-way handshake, the Receive Window (Window size value on Wireshark) tells each side of the connection the maximum receiving buffer in bytes each side can handle:

So it's literally like this (read red lines first please):

[1] → Hey, BIG-IP! My receiving buffer size is 29200 bytes. That means you can initially send me up to 29200 bytes before you even bother waiting for an ACK from me to send further data.
[2] → This should be the same as [1], unless the Window Scale TCP Option is active. Window Scale should be the subject of a different article but I briefly touch on it in [3].
[3] → The original TCP Window Size field is limited to 16 bits, so the maximum buffer size is just 65,535 bytes, which is too little for today's speedy connections. This option scales the 16-bit window field so that much larger windows (up to about 1 GiB per RFC 7323) can be advertised, but because BIG-IP did not advertise the Window Scale option for this connection, it is disabled, as both sides must support it for it to be used.
[4] → Hey, client! My receiving buffer size is 4380 bytes. That means you can initially send me up to 4380 bytes before you even bother waiting for an ACK from me to send further data.

The reason why the word initially is emphasized in [1] and [4] is that the Window size typically changes during the connection. For example, the client's initial window size is 29200 bytes, right?
This means that if it receives 200 bytes from BIG-IP it should go down to 29000 bytes. Easy, eh? But that's not what always happens in real life. In fact, in our capture it's the opposite! The Bytes in flight column shows the data in bytes that BIG-IP (*.143) is sending to our client (*.135) and that has not yet been acknowledged. I've added a column with the Window Size value to make it easier to spot how variable this field is:

It is the OS TCP flow control implementation that dictates the Receive Window size, taking into account the current "health" of its TCP stack and of course your configuration. Yes, in many cases, especially in the middle of a connection, the Window Size does decrease based on the amount of data received/buffered, so our first explanation also makes sense!

How does BIG-IP know that the client has freed up its buffer again? As we can see above, when the Client ACKs the receipt of BIG-IP's data, it also advertises the size of its buffer in the Window Size value field. That's how BIG-IP knows how much data it can send to the Client before it receives another ACK.

What about the Maximum Segment Size? Each side also displays a TCP Option - Maximum Segment size of 1460 bytes. This advertises the maximum size of the TCP payload each side can send at a time (per TCP segment). Looking at the picture above, BIG-IP sent 334 bytes of TCP payload to the client, right? In theory, this could've been up to 1460 bytes, as it's also within the client's initial buffer of 29200 bytes. So apart from informing each other about the maximum buffer, the two sides also advertise the maximum size of a TCP segment.

TCP Len vs Bytes in Flight Column (BIF)

If we look at our last picture, we can see that whatever is in the Len field matches what's in our BIF column, right? Are they the same? No! Len shows the current size of the TCP payload (excluding the size of the TCP header). Remember that the TCP payload in this case is the whole HTTP portion that our TCP segment is carrying.
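The acknowledgment rule from the sequence-number section earlier reduces to a one-line calculation, sketched here with relative sequence numbers. The helper name `expected_ack` is hypothetical; the modulo reflects the fact that TCP sequence numbers are 32-bit values that wrap around.

```python
def expected_ack(seq, payload_len, syn=False, fin=False):
    """Acknowledgment number the peer should send back for a segment.

    Payload bytes advance the number one-for-one; SYN and FIN each
    consume one sequence number even though they carry no data.
    """
    return (seq + payload_len + int(syn) + int(fin)) % 2**32  # 32-bit wraparound


print(expected_ack(seq=0, payload_len=0, syn=True))   # handshake SYN with Seq=0
print(expected_ack(seq=1, payload_len=93))            # the 93-byte HTTP request
print(expected_ack(seq=2**32 - 1, payload_len=2))     # wraps past the 32-bit limit
```

For the 93-byte request from the capture this gives 94, the value BIG-IP acknowledged.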
Bytes in flight is not really part of the TCP header; it is something Wireshark adds to make it easier for us to troubleshoot. It just means the number of bytes sent that have not yet been acknowledged by the receiver. In our capture, data is acknowledged immediately so both Len and BIF are the same. I've picked a different capture here where there are 3 TCP segments sent with no acknowledgement, so the BIF column increments for each unacknowledged data segment but goes back to zero as soon as an ACK arrives from the receiver:

Notice that BIF values now differ from the TCP payload (the equivalent of Len in the Info column).

That's it for now. The next article will be about TCP retransmission.

Appendix - Going in depth into TCP sequence numbers!

Here's a full explanation of what actually takes place on the TCP layer from the point of view of BIG-IP:

Just follow along from [1] to [10]. That's it.

The Disadvantages of DSR (Direct Server Return)
I read a very nice blog post yesterday discussing some of the traditional pros and cons of load-balancing configurations. The author comes to the conclusion that if you can use direct server return, you should. I agree with the author's list of pros and cons; DSR is the least intrusive method of deploying a load-balancer in terms of network configuration. But there are quite a few disadvantages missing from the author's list.

Author's List of Disadvantages of DSR

The disadvantages of Direct Routing are:

- Backend server must respond to both its own IP (for health checks) and the virtual IP (for load balanced traffic)
- Port translation or cookie insertion cannot be implemented.
- The backend server must not reply to ARP requests for the VIP (otherwise it will steal all the traffic from the load balancer)
- Prior to Windows Server 2008 some odd routing behavior could occur in
- In some situations either the application or the operating system cannot be modified to utilise Direct Routing.

Some additional disadvantages:

- Protocol sanitization can't be performed. This means vulnerabilities introduced due to manipulation or lax enforcement of RFCs and protocol specifications can't be addressed.
- Application acceleration can't be applied. Even the simplest of acceleration techniques, e.g. compression, can't be applied because the traffic is bypassing the load-balancer (a.k.a. application delivery controller).
- Implementing caching solutions becomes more complex. With a DSR configuration the routing that makes it so easy to implement requires that caching solutions be deployed elsewhere, such as via WCCP on the router. This requires additional configuration and changes to the routing infrastructure, and introduces another point of failure as well as an additional hop, increasing latency.
- Error/Exception/SOAP fault handling can't be implemented.
In order to address failures in applications such as missing files (404) and SOAP Faults (500) it is necessary for the load-balancer to inspect outbound messages. Using a DSR configuration this ability is lost, which means errors are passed directly back to the user without the ability to retry a request, write an entry in the log, or notify an administrator.
- Data Leak Prevention can't be accomplished. Without the ability to inspect outbound messages, you can't prevent sensitive data (SSN, credit card numbers) from leaving the building.
- Connection Optimization functionality is lost. TCP multiplexing can't be accomplished in a DSR configuration because it relies on separating client connections from server connections. This reduces the efficiency of your servers and minimizes the value added to your network by a load balancer.

There are more disadvantages than you're likely willing to read, so I'll stop there. Suffice it to say that the problem with the suggestion to use DSR whenever possible is that if you're an application-aware network administrator you know that most of the time, DSR isn't the right solution because it restricts the ability of the load-balancer (application delivery controller) to perform additional functions that improve the security, performance, and availability of the applications it is delivering.

DSR is well-suited, and always has been, to UDP-based streaming applications such as audio and video delivered via RTSP. However, in the increasingly sensitive environment that is application infrastructure, it is necessary to do more than just "load balancing" to improve the performance and reliability of applications. Additional application delivery techniques are an integral component of a well-performing, efficient application infrastructure.

DSR may be easier to implement and, in some cases, may be the right solution. But in most cases, it's going to leave you simply serving applications, instead of delivering them.
Just because you can, doesn't mean you should.

Persistent and Persistence, What's the Difference?
The English language is one of the most expressive, and confusing, in existence. Words can have different meanings based not only on context, but on placement within a given sentence. Add in the twists that come from technical jargon and suddenly you've got words meaning completely different things. This is evident in the use of persistent and persistence. While the conceptual basis of persistence and persistent are essentially the same, in reality they refer to two different technical concepts.

Both persistent and persistence relate to the handling of connections. The former is often used as a general description of the behavior of HTTP and, necessarily, TCP connections, though it is also used in the context of database connections. The latter is most often related to TCP/HTTP connection handling but almost exclusively in the context of load-balancing.

Persistent

Related Links:
- HTTP 1.1 RFC
- Persistent Connection Behavior of Popular Browsers
- Persistent Database Connections
- Apache Keep-Alive Support
- Cookies, Sessions, and Persistence

Persistent connections are connections that are kept open and reused. The most commonly implemented form of persistent connection is HTTP, with database connections a close second. Persistent HTTP connections were implemented as part of the HTTP 1.1 specification as a method of improving the efficiency of HTTP in general. Before HTTP 1.1 a browser would generally open one connection per object on a page in order to retrieve all the appropriate resources. As the number of objects in a page grew, this became increasingly inefficient and significantly reduced the capacity of web servers while causing browsers to appear slow to retrieve data. HTTP 1.1 and the Keep-Alive header in HTTP 1.0 were aimed at improving the performance of HTTP by reusing TCP connections to retrieve objects. They made the connections persistent such that they could be reused to send multiple HTTP requests using the same TCP connection.
Similarly, this notion was implemented by proxy-based load-balancers as a way to improve performance of web applications and increase capacity on web servers. Persistent connections between a load-balancer and web servers are usually referred to as TCP multiplexing. Just like browsers, the load-balancer opens a few TCP connections to the servers and then reuses them to send multiple HTTP requests.

Persistent connections, both in browsers and load-balancers, have several advantages:

- Less network traffic due to less TCP setup/teardown. It requires no fewer than 7 exchanges of data to set up and tear down a TCP connection, thus each connection that can be reused reduces the number of exchanges required, resulting in less traffic.
- Improved performance. Because subsequent requests do not need to set up and tear down a TCP connection, requests arrive faster and responses are returned quicker. TCP has built-in mechanisms, for example window sizing, to address network congestion. Persistent connections give TCP the time to adjust itself appropriately to current network conditions, thus improving overall performance. Non-persistent connections are not able to adjust because they are opened and almost immediately closed.
- Less server overhead. Servers are able to increase the number of concurrent users served because each user requires fewer connections through which to complete requests.

Persistence

Persistence, on the other hand, is related to the ability of a load-balancer or other traffic management solution to maintain a virtual connection between a client and a specific server. Persistence is often referred to in the application delivery networking world as "stickiness" while in the web and application server demesne it is called "server affinity". Persistence ensures that once a client has made a connection to a specific server, subsequent requests are sent to the same server.
This is very important to maintain state and session-specific information in some application architectures and for handling of SSL-enabled applications.

Examples of Persistence:
- Hash Load Balancing and Persistence
- LTM Source Address Persistence
- Enabling Session Persistence
- 20 Lines or Less #7: JSessionID Persistence

When the first request is seen by the load-balancer it chooses a server. On subsequent requests the load-balancer will automatically choose the same server to ensure continuity of the application or, in the case of SSL, to avoid the compute-intensive process of renegotiation. This persistence is often implemented using cookies but can be based on other identifying attributes such as IP address. Load-balancers that have evolved into application delivery controllers are capable of implementing persistence based on any piece of data in the application message (payload), headers, or in the transport protocol (TCP) and network protocol (IP) layers.

Some advantages of persistence are:

- Avoid renegotiation of SSL. By ensuring that SSL-enabled connections are directed to the same server throughout a session, it is possible to avoid renegotiating the keys associated with the session, which is compute and resource intensive. This improves performance and reduces overhead on servers.
- No need to rewrite applications. Applications developed without load-balancing in mind may break when deployed in a load-balanced architecture because they depend on session data that is stored only on the original server on which the session was initiated. Load-balancers capable of session persistence ensure that those applications do not break by always directing requests to the same server, preserving the session data without requiring that applications be rewritten.
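The server-selection behavior described above can be sketched in a few lines. This is a toy model under stated assumptions: the `StickyBalancer` class is a hypothetical name, new clients are assigned round-robin, and the persistence key is simply whatever identifying value the caller passes in (a cookie value, a source IP, and so on). Real products also expire entries and handle server failure, which is omitted here.

```python
import itertools


class StickyBalancer:
    """Round-robin for new clients, the same server for returning ones."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)
        self._table = {}                     # persistence key -> chosen server

    def pick(self, key):
        """`key` is any identifying attribute: a cookie value, source IP, etc."""
        if key not in self._table:
            self._table[key] = next(self._next_server)
        return self._table[key]


lb = StickyBalancer(["app1", "app2", "app3"])
print(lb.pick("10.0.0.1"))   # first request: a server is chosen
print(lb.pick("10.0.0.2"))   # a different client gets the next server
print(lb.pick("10.0.0.1"))   # same client, same server as before
```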
Summary

So persistent connections are connections that are kept open so they can be reused to send multiple requests, while persistence is the process of ensuring that connections and subsequent requests are sent to the same server through a load-balancer or other proxy device. Both are important facets of communication between clients, servers, and mediators like load-balancers, and increase the overall performance and efficiency of the infrastructure as well as improving the end-user experience.

The TCP Proxy Buffer
The proxy buffer is probably the least intuitive of the three TCP buffer sizes that you can configure in F5's TCP Optimization offering. Today I'll describe what it does, and how to set the "high" and "low" buffer limits in the profile.

The proxy buffer is the place BIG-IP stores data that isn't ready to go out to the remote host. The send buffer, by definition, is data already sent but unacknowledged. Everything else is in the proxy buffer. That's really all there is to it.

From this description, it should be clear why we need limits on the size of this buffer. Probably the most common deployment of a BIG-IP has a connection to the server that is way faster than the connection to the client. In these cases, data will simply accumulate at the BIG-IP as it waits to pass through the bottleneck of the client connection. This consumes precious resources on the BIG-IP, instead of commodity servers.

So proxy-buffer-high is simply a limit where the BIG-IP will tell the server, "enough." proxy-buffer-low is when it will tell the server to start sending data again. The gap between the two is simply hysteresis: if proxy-buffer-high were the same as proxy-buffer-low, we'd generate tons of start/stop signals to the server as the buffer level bounced above and below the threshold. We like that gap to be about 64KB, as a rule of thumb.

So how does it tell the server to stop? TCP simply stops increasing the receive window: once the advertised bytes available have been sent, TCP will advertise a zero receive window. This stops server transmissions (except for some probes) until the BIG-IP signals it is ready again by sending an acknowledgment with a non-zero receive window advertisement.

Setting a very large proxy-buffer-high will obviously increase the potential memory footprint of each connection. But what is the impact of setting a low one?
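Before turning to that question, the start/stop hysteresis described above can be illustrated with a toy simulation. Everything here is an assumption made for the sketch: the `ProxyBuffer` class, the `simulate` helper, the tick-based model, and the chunk sizes. It does not model BIG-IP's actual behavior; it only counts how many stop/start signals a fast server and a slow client produce for a given gap between the two limits.

```python
class ProxyBuffer:
    """Toy high/low watermark flow control with hysteresis."""

    def __init__(self, high, low):
        assert low < high
        self.high = high            # stop reading from the server at this level
        self.low = low              # resume once drained to this level
        self.level = 0              # bytes currently buffered
        self.paused = False         # True while the server is told to stop
        self.signals = []           # record of stop/start transitions

    def _update(self):
        if not self.paused and self.level >= self.high:
            self.paused = True
            self.signals.append("stop")
        elif self.paused and self.level <= self.low:
            self.paused = False
            self.signals.append("start")

    def from_server(self, nbytes):
        self.level += nbytes
        self._update()

    def to_client(self, nbytes):
        self.level = max(0, self.level - nbytes)
        self._update()


def simulate(high, low, ticks=200, server_chunk=32 * 1024, client_chunk=8 * 1024):
    """Fast server, slow client: count the stop/start signals generated."""
    buf = ProxyBuffer(high, low)
    for _ in range(ticks):
        if not buf.paused:
            buf.from_server(server_chunk)   # server sends only while allowed
        buf.to_client(client_chunk)         # client drains steadily
    return len(buf.signals)


high = 128 * 1024
print(simulate(high, low=high - 1))         # almost no gap between the limits
print(simulate(high, low=high - 64 * 1024)) # 64KB gap, per the rule of thumb
```

With almost no gap, the buffer flips between stopped and started far more often than it does with the 64KB gap.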
On the sending side, the worst-case scenario is that a large chunk of the send buffer clears at once, probably because a retransmitted packet allows acknowledgement of a missing packet and a bunch of previously received data. At worst, this could cause the entire send buffer to empty and cause the sending TCP to ask the proxy buffer to accept a whole send buffer's worth of data. So if you're not that worried about the memory footprint, the safe thing is to set proxy-buffer-high to the same size as the send buffer.

The limits on proxy-buffer-low are somewhat more complicated to derive, but the issue is that if a proxy buffer at proxy-buffer-low suddenly drains, it will take one serverside Round Trip Time (RTT) to send the window update and start getting data again. So the total amount of data that has to be in the proxy buffer at the low point is the RTT of the serverside times the bandwidth of the clientside. If the proxy buffer is filling up, the serverside rate generally exceeds the clientside data rate, so that will be sufficient.

If you're not deeply concerned about the memory footprint of connections, the minimum proxy buffer settings that will prevent any impairment of throughput are as follows for the clientside:

proxy-buffer-high = send-buffer-size = (clientside bandwidth) * (clientside RTT)
proxy-buffer-low = (clientside bandwidth) * (serverside RTT)

proxy-buffer-low must be sufficiently below proxy-buffer-high to avoid flapping. If you are running up against memory limits, then cutting back on these settings will only hurt you in the cases above. Economizing on proxy buffer space is definitely preferable to limiting the send rate by making the send buffer too small.

F5 Unveils New Built-In TCP Profiles
[Update 3/17: Some representative performance results are at the bottom.]

Longtime readers know that F5's built-in TCP profiles were in need of a refresh. I'm pleased to announce that in TMOS® version 13.0, available now, there are substantial improvements to the built-in profile scheme. Users expect defaults to reflect best common practice, and we've made a huge step towards that being true.

New Built-in Profiles

We've kept virtually all of the old built-in profiles, for those of you who are happy with them, or have built other profiles that derive from them. But there are four new ones to load directly into your virtual servers or use as a basis for your own tuning.

The first three are optimized for particular network use cases: f5-tcp-wan, f5-tcp-lan, and f5-tcp-mobile are updated versions of tcp-wan-optimized, tcp-lan-optimized, and tcp-mobile-optimized. These adapt all settings to the appropriate link types, except that they don't enable the very newest features. If the hosts you're communicating with tend to use one kind of link, these are a great choice.

The fourth is f5-tcp-progressive. This is meant to be a general-use profile (like the tcp default), but it contains the very latest features for early adopters.

In our benchmark testing, we had the following criteria:

f5-tcp-wan, f5-tcp-lan, and f5-tcp-mobile achieved throughput at least as high as, and often better than, the default tcp profile for that link type.
f5-tcp-progressive had equal or higher throughput than default TCP across all representative network types.

The relative performance of f5-tcp-wan/lan/mobile and progressive in each link type will vary given the new features that f5-tcp-progressive enables.

Living, Read-Only Profiles

These four new profiles, and the default 'tcp' profile, are now "living." This means that we'll continually update them with best practices as they evolve. Brand-new features, if they are generally applicable, will immediately appear in f5-tcp-progressive.
For our more conservative users, these new features will appear in the other four living profiles after a couple of releases. The default tcp profile hasn't changed yet, but it will in future releases!

These five profiles are also now read-only, meaning that to make modifications you'll have to create a new profile that descends from these. This will aid in troubleshooting. If there are any settings that you like so much that you never want them to change, simply click the "custom" button in the child profile and the changes we push out in the future won't affect your settings.

How This Affects Your Existing Custom Profiles

If you've put thought into your TCP profiles, we aren't going to mess with them. If your profile descends from any of the previous built-ins besides default 'tcp,' there is no change to settings whatsoever.

Upgrades to 13.0 will automatically prevent disruptions to your configuration. We've copied all of the default tcp profile settings to tcp-legacy, which is not a "living" profile. All of the old built-in profiles (like tcp-wan-optimized), and any custom profiles descended from default tcp, will now descend instead from tcp-legacy, and never change due to upgrades from F5. tcp-legacy will also include any modifications you made to the default tcp profile, as this profile is not read-only.

Our data shows that few, if any, users are using the current (TMOS 12.1 and before) tcp-legacy settings. If you are, it is wise to make a note of those settings before you upgrade.

How This Affects Your Existing Virtual Servers

As the section above describes, if your virtual server uses any profile other than default 'tcp' or tcp-legacy, there will be no settings change at all. Given the weaknesses of the current default settings, we believe most users who use virtuals with the TCP default are not carefully considering their settings.
Those virtuals will continue to use the default profile, and therefore settings will begin to evolve as we modernize the default profile in 13.1 and later releases. If you very much like the default TCP profile, perhaps because you customized it when it wasn't read-only, you should manually change the virtual to use tcp-legacy with no change in behavior.

Use the New Profiles for Better Performance

The internet changes. Bandwidths increase, we develop better algorithms to automatically tune your settings, and the TCP standard itself evolves. If you use the new profile framework, you'll keep up with the state of the art and maximize the throughput your applications receive.

Below, I've included some throughput measurements from our in-house testing. We used parameters representative of seven different link types and measured the throughput using some relevant built-in profiles. Obviously, the performance in your deployment may vary. Aside from LANs, where frankly tuning isn't all that hard, the benefits are pretty clear.

Stop Using the Base TCP Profile!
[Update 1 Mar 2017: F5 has new built-in profiles in TMOS v13.0. Although the default profile settings still haven't changed, there is good news on that front as well.]

If the customer data I've seen is any indication, the vast majority of our customers are using the base 'tcp' profile to configure their TCP optimization. This has poor performance consequences and I strongly encourage you to replace it immediately.

What's wrong with it?

The Buffers are too small. Both the receive and send buffers are limited to 64KB, and the proxy buffer won't exceed 48K. If the bandwidth/delay product of your connection exceeds the send or receive buffer, which it will in most of today's internet for all but the smallest files and shortest delays, your applications will be limited not by the available bandwidth but by an arbitrary memory limitation.

The Initial Congestion Window is too small. As the early thin-pipe, small-buffer days of the internet recede, the Internet Engineering Task Force (see IETF RFC 6928) increased the allowed size of a sender's initial burst. This allows more file transfers to complete in a single round trip time and allows TCP to discover the true available bandwidth faster.

Delayed ACKs. The base profile enables Delayed ACK, which tries to reduce ACK traffic by waiting 200ms to see if more data comes in. This incurs a serious performance penalty on SSL, among other upper-layer protocols.

What should you do instead?

The best answer is to build a custom profile based on your specific environment and requirements. But we recognize that some of you will find that daunting! So we've created a variety of profiles customized for different environments. Frankly, we should do some work to improve these profiles, but even today there are much better choices than base 'tcp'.

If you have an HTTP profile attached to the virtual, we recommend you use tcp-mobile-optimized. This is true even if your clients aren't mobile. The name is misleading!
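To put a rough number on the buffer limitation described above: a sender can have at most one buffer's worth of unacknowledged data in flight per round trip, so throughput is capped at about the buffer size divided by the RTT. The following is a back-of-the-envelope sketch only. The function names, the 100 ms RTT, and the 100 Mbit/s link are illustrative assumptions, not F5 tooling or measurements.

```python
def max_throughput_bps(buffer_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on throughput (bits/s) when the buffer is the bottleneck."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    return buffer_bytes * 8 / rtt_seconds


def bandwidth_delay_product_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bytes that must be in flight to keep a path of this bandwidth and RTT full."""
    return bandwidth_bps / 8 * rtt_seconds


# The base profile's 64KB send/receive limit, on an assumed 100 ms path.
BASE_BUFFER_BYTES = 64 * 1024
ASSUMED_RTT_SECONDS = 0.100

cap_bps = max_throughput_bps(BASE_BUFFER_BYTES, ASSUMED_RTT_SECONDS)
needed_bytes = bandwidth_delay_product_bytes(100_000_000, ASSUMED_RTT_SECONDS)

print(f"64KB buffer at 100 ms caps throughput near {cap_bps / 1e6:.1f} Mbit/s")
print(f"Filling a 100 Mbit/s path at 100 ms needs about {needed_bytes / 1e6:.2f} MB in flight")
```

Under these assumed numbers the 64KB limit, not the link, sets the ceiling: the cap is roughly 5 Mbit/s on a path that could carry 100 Mbit/s.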
As I said, the default profiles need work.

If you're just a bit more adventurous with your virtual with an HTTP profile, then mptcp-mobile-optimized will likely outperform the above. Besides enabling Multipath TCP (MPTCP) for clients that ask for it, it uses a more advanced congestion control ("Illinois") and rate pacing. We recognize, however, that if you're still using the base 'tcp' profile today then you're probably not comfortable with the newest, most innovative enhancements to TCP. So plain old tcp-mobile-optimized might be a more gentle step forward.

If your virtual doesn't have an HTTP profile, the best decision is to use a modified version of tcp-mobile-optimized or mptcp-mobile-optimized. Just derive a profile from whichever you prefer and disable the Nagle algorithm. That's it!

If you are absolutely dead set against modifying a default profile, then wam-tcp-lan-optimized is the next best choice. It doesn't really matter if the attached network is actually a LAN or the open internet.

Why did we create a default profile with undesirable settings?

That answer is lost in the mists of time. But now it's hard to change: altering the profile from which all other profiles are derived will cause sudden changes in customer TCP behavior when they upgrade. Most would benefit, and many may not even notice, but we try not to surprise people.

Nevertheless, if you want a quick, cheap, and easy boost to your application performance, simply switch your TCP profile from the base to one of our other defaults. You won't regret it.

Tuning the TCP Profile, Part One
A few months ago I pointed out some problems with the existing F5-provided TCP profiles, especially the default one. Today I'll begin a pass through the (long) TCP profile to point out the latest thinking on how to get the most performance for your applications. We'll go in the order you see these profile options in the GUI.

But first, a note about programmability: in many cases below, I'm going to ask you to generalize about the clients or servers you interact with, and the nature of the paths to those hosts. In a perfect world, we'd detect that stuff automatically and set it for you, and in fact we're rolling that out setting by setting. In the meantime, you can customize your TCP parameters on a per-connection basis using iRules for many of the settings described below, something I'll explain further where applicable.

In general, when I refer to "performance" below, I'm referring to the speed at which your customer gets her data. Performance can also refer to the scalability of your application delivery due to CPU and memory limitations, and when that's what I mean, I'll say so.

Timer Management

The one here with a big performance impact is Minimum RTO. When TCP computes its Retransmission Timeout (RTO), it takes the average measured Round Trip Time (RTT) and adds a few standard deviations to make sure it doesn't falsely detect loss. (False detections have very negative performance implications.) But if RTT is low and stable, that RTO may be too low, and the minimum is designed to catch known fluctuations in RTT that the connection may not have observed. Set Minimum RTO too low, and TCP may improperly enter congestion response and reduce the sending rate all the way down to one packet per round trip. Set it too high, and TCP sits idle when it ought to retransmit lost data.

So what's the right value? Obviously, if you have a sense of the maximum RTT to your clients (which you can get with the ping command), that's a floor for your value.
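The role of the minimum can be seen in a simplified RTO calculation. This is a sketch in the general style of the standard TCP estimator (RFC 6298: a smoothed RTT plus four times a smoothed deviation), with clock granularity omitted. BIG-IP's actual estimator is not described here, and the function name, sample values, and 300 ms minimum are assumptions chosen for illustration.

```python
def rto_estimates(rtt_samples, min_rto):
    """Return (computed_rto, effective_rto) after each RTT sample, in seconds.

    computed_rto is the smoothed RTT plus four smoothed deviations;
    effective_rto is that value clamped to the configured minimum.
    """
    alpha, beta = 1 / 8, 1 / 4  # smoothing gains used by RFC 6298
    srtt = rttvar = None
    results = []
    for sample in rtt_samples:
        if srtt is None:
            # First measurement seeds both estimators.
            srtt, rttvar = sample, sample / 2
        else:
            # The deviation is updated first, using the previous smoothed RTT.
            rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)
            srtt = (1 - alpha) * srtt + alpha * sample
        computed = srtt + 4 * rttvar
        results.append((computed, max(computed, min_rto)))
    return results


# A low, perfectly stable 10 ms path: the deviation term decays toward zero,
# so the computed RTO shrinks to barely more than one RTT.
stable_samples = [0.010] * 50
computed, effective = rto_estimates(stable_samples, min_rto=0.300)[-1]

print(f"computed RTO:  {computed * 1000:.1f} ms")
print(f"effective RTO: {effective * 1000:.1f} ms (held up by the minimum)")
```

On this assumed path, a single delayed ACK or brief queueing spike would exceed the computed value, which is the kind of unobserved fluctuation the minimum is there to absorb.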
Furthermore, many clients and servers will implement some sort of Delayed ACK, which reduces ACK volume by sometimes holding them back for up to 200ms to see if it can aggregate more data in the ACK. RFC 5681 actually allows delays of up to 500ms, but this is less common. So take the maximum RTT and add 200 to 500 ms.

Another group of settings isn't really about throughput, but about helping clients and servers close gracefully, at the cost of consuming some system resources. Long Close Wait, Fin Wait 1, Fin Wait 2, and Time Wait timers will keep connection state alive to make sure the remote host got all the connection close messages. Enabling Reset On Timeout sends a message that tells the peer to tear down the connection. Similarly, disabling Time Wait Recycle will prevent new connections from using the same address/port combination, making sure that the old connection with that combination gets a full close.

The last group of settings keeps possibly dead connections alive, using system resources to maintain state in case they come back to life. Idle Timeout and Zero Window Timeout commit resources until the timer expires. If you set Keep Alive Interval to a value less than the Idle Timeout, then on the clientside BIG-IP will keep the connection alive as long as the client keeps responding to keepalive and the server doesn't terminate the connection itself. In theory, this could be forever!

Memory Management

In terms of high throughput performance, you want all of these settings to be as large as possible up to a point. The tradeoff is that setting them too high may waste memory and reduce the number of supportable concurrent connections. I say "may" waste because these are limits on memory use, and BIG-IP doesn't allocate the memory until it needs it for buffered data. Even so, the trick is to set the limits large enough that there are no performance penalties, but no larger.

Send Buffer and Receive Window are easy to set in principle, but can be tricky in practice.
For both, answer these questions:

What is the maximum bandwidth (Bytes/second) that BIG-IP might experience sending or receiving?
Out of all paths data might travel, what minimum delay among those paths is the highest? (What is the "maximum of the minimums"?)

Then you simply multiply Bytes/second by seconds of delay to get a number of bytes. This is the maximum amount of data that TCP ought to have in flight at any one time, which should be enough to prevent TCP connections from idling for lack of memory. If your application doesn't involve sending or receiving much data on that side of the proxy, you can probably get away with lowering the corresponding buffer size to save on memory. For example, a traditional HTTP proxy's clientside probably can afford to have a smaller receive buffer if memory-constrained.

There are three principles to follow in setting Proxy Buffer Limits:

Proxy Buffer High should be at least as big as the Send Buffer. Otherwise, if a large ACK clears the send buffer all at once there may be less data available than TCP can send.
Proxy Buffer Low should be at least as big as the Receive Window on the peer TCP profile (i.e. for the clientside profile, use the receive window on the serverside profile). If not, when the peer connection exits the zero-window state, new data may not arrive before BIG-IP sends all the data it has.
Proxy Buffer High should be significantly larger than Proxy Buffer Low (we like to use a 64 KB gap) to avoid constant flapping to and from the zero-window state on the receive side.

Obviously, figuring out bandwidth and delay before a deployment can be tricky. This is a place where some iRule mojo can really come in handy. The TCP::rtt and TCP::bandwidth* commands can give you estimates of both quantities you need, even though the RTT isn't a minimum RTT.
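The sizing rules above can be combined into a small calculation. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical (they are not BIG-IP profile attributes), the inputs are estimates you would supply yourself, and the 64 KB gap is the rule of thumb quoted above.

```python
GAP_BYTES = 64 * 1024  # rule-of-thumb gap between Proxy Buffer High and Low


def size_buffers(bandwidth_bytes_per_s: int, delay_ms: int, peer_receive_window: int) -> dict:
    """Suggest send buffer and proxy buffer limits for one side of the proxy.

    bandwidth_bytes_per_s: maximum bandwidth expected on this side
    delay_ms:              the "maximum of the minimums" path delay
    peer_receive_window:   receive window configured on the peer-side profile
    """
    # Bytes/second times seconds of delay, rounded up (integer math avoids
    # floating-point rounding surprises).
    send_buffer = -(-bandwidth_bytes_per_s * delay_ms // 1000)

    # Principle 2: Low is at least the peer profile's receive window.
    proxy_low = peer_receive_window

    # Principles 1 and 3: High covers the send buffer and sits a gap above Low.
    proxy_high = max(send_buffer, proxy_low + GAP_BYTES)

    return {"send_buffer": send_buffer, "proxy_low": proxy_low, "proxy_high": proxy_high}


# Assumed example: 100 Mbit/s (12,500,000 Bytes/s) clientside, 80 ms delay,
# and a 128 KB receive window on the serverside profile.
fast_path = size_buffers(12_500_000, 80, 128 * 1024)

# Assumed example: a slow path, where the gap rule decides Proxy Buffer High.
slow_path = size_buffers(1_000_000, 50, 64 * 1024)

print(fast_path)
print(slow_path)
```

In the first case the send buffer dominates Proxy Buffer High; in the second the send buffer is small, so the 64 KB gap above Proxy Buffer Low determines it instead.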
Alternatively, if you've enabled cmetrics-cache in the profile, you can also obtain historical data for a destination using the ROUTE::cwnd* command, which is a good (possibly low) guess at the value you should plug into the send and receive buffers. You can then set buffer limits directly using TCP::sendbuf**, TCP::recvwnd**, and TCP::proxybuffer**. Getting this to work very well will be difficult, and I don't have any examples where someone worked it through and proved a benefit. But if your application travels highly varied paths and you have the inclination to tinker, you could end up with an optimized configuration. If not, set the buffer sizes using conservatively high inputs and carry on.

*These iRule commands are only supported in TMOS® version 12.0.0 and later.
**These iRule commands are only supported in TMOS® version 11.6.0 and later.

What is server offload and why do I need it?
One of the tasks of an enterprise architect is to design a framework atop which developers can implement and deploy applications consistently and easily. The consistency is important for internal business continuity and reuse; common objects, operations, and processes can be reused across applications to make development and integration with other applications and systems easier. Architects also often decide where functionality resides and design the base application infrastructure framework. Application server, identity management, messaging, and integration are all often a part of such architecture designs.

Rarely does the architect concern him/herself with the network infrastructure, as that is the purview of “that group”; the “you know who I’m talking about” group. And for the most part there’s no need for architects to concern themselves with network-oriented architecture. Applications should not need to know on which VLAN they will be deployed or what their default gateway might be. But what architects might need to know – and probably should know – is whether the network infrastructure supports “server offload” of some application functions or not, and how that can benefit their enterprise architecture and the applications which will be deployed atop it.

WHAT IT IS

Server offload is a generic term used by the networking industry to indicate some functionality designed to improve the performance or security of applications. We use the term “offload” because the functionality is “offloaded” from the server and moved to an application network infrastructure device instead. Server offload works because the application network infrastructure is, these days, almost always deployed in front of the web/application servers and is in fact acting as a broker (proxy) between the client and the server. Server offload is generally offered by load balancers and application delivery controllers.

You can think of server offload like a relay race.
The application network infrastructure device runs the first leg and then hands off the baton (the request) to the server. When the server is finished, the application network infrastructure device gets to run another leg, and then the race is done as the response is sent back to the client.

There are basically two kinds of server offload functionality:

Protocol processing offload

Protocol processing offload includes functions like SSL termination and TCP optimizations. Rather than enable SSL communication on the web/application server, it can be “offloaded” to an application network infrastructure device and shared across all applications requiring secured communications. Offloading SSL to an application network infrastructure device improves application performance because the device is generally optimized to handle the complex calculations involved in encryption and decryption of secured data and web/application servers are not.

TCP optimization is a little different. We say TCP session management is “offloaded” from the server, but that’s really not what happens, as obviously TCP connections are still opened, closed, and managed on the server as well. Offloading TCP session management means that the application network infrastructure is managing the connections between itself and the server in such a way as to reduce the total number of connections needed without impacting the capacity of the application. This is more commonly referred to as TCP multiplexing, and it “offloads” the overhead of TCP connection management from the web/application server to the application network infrastructure device by effectively giving up control over those connections. By allowing an application network infrastructure device to decide how many connections to maintain and which ones to use to communicate with the server, it can manage thousands of client-side connections using merely hundreds of server-side connections.
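The idea behind TCP multiplexing can be shown with a toy model. This is a conceptual sketch only, not how any particular product implements it: the class, the wave-based scheduling, and the numbers are all assumptions made for illustration. The point is that the number of server-side connections tracks how many requests are in flight at once, not how many clients exist.

```python
from collections import deque


class ServerSidePool:
    """Toy pool of reusable server-side connections held by the proxy."""

    def __init__(self, max_connections: int):
        self.max_connections = max_connections
        self.idle = deque()   # connections that are open but not in use
        self.opened = 0       # total server-side connections ever opened

    def acquire(self) -> int:
        # Reuse an idle connection when one exists; open a new one otherwise.
        if self.idle:
            return self.idle.popleft()
        if self.opened >= self.max_connections:
            raise RuntimeError("all server-side connections are busy")
        self.opened += 1
        return self.opened

    def release(self, connection_id: int) -> None:
        self.idle.append(connection_id)


def run_clients(pool: ServerSidePool, total_clients: int, concurrent_requests: int) -> int:
    """Serve one request per client, with a fixed number in flight at a time."""
    served = 0
    while served < total_clients:
        wave = min(concurrent_requests, total_clients - served)
        in_use = [pool.acquire() for _ in range(wave)]  # requests in flight
        for connection_id in in_use:                    # responses returned
            pool.release(connection_id)
        served += wave
    return served


pool = ServerSidePool(max_connections=200)
served = run_clients(pool, total_clients=5000, concurrent_requests=150)
print(f"served {served} clients over {pool.opened} server-side connections")
```

In this model the server sees 150 connection setups instead of 5000, which is where the reduction in socket open/close overhead comes from.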
Reducing the overhead associated with opening and closing TCP sockets on the web/application server improves application performance and actually increases the user capacity of servers. TCP offload is beneficial to all TCP-based applications, but is particularly beneficial for Web 2.0 applications making use of AJAX and other near real-time technologies that maintain one or more connections to the server for their functionality.

Protocol processing offload does not require any modifications to the applications.

Application-oriented offload

Application-oriented offload includes the ability to implement shared services on an application network infrastructure device. This is often accomplished via a network-side scripting capability, but some functionality has become so commonplace that it is now built into the core features available on application network infrastructure solutions. Application-oriented offload can include functions like cookie encryption/decryption, compression, caching, URI rewriting, HTTP redirection, DLP (Data Leak Prevention), selective data encryption, application security functionality, and data transformation. When network-side scripting is available, virtually any kind of pre- or post-processing can be offloaded to the application network infrastructure and thereafter shared with all applications.

Application-oriented offload works because the application network infrastructure solution is mediating between the client and the server and it has the ability to inspect and manipulate the application data. The benefits of application-oriented offload are that the services implemented can be shared across multiple applications and in many cases the functionality removes the need for the web/application server to handle a specific request. For example, HTTP redirection can be fully accomplished on the application network infrastructure device.
HTTP redirection is often used as a means to handle application upgrades, commonly mistyped URIs, or as part of the application logic when certain conditions are met.

Application security offload usually falls into this category because it is application – or at least application data – specific. Application security offload can include scanning URIs and data for malicious content, validating the existence of specific cookies/data required for the application, and so on. This kind of offload improves server efficiency and performance but a bigger benefit is consistent, shared security across all applications for which the service is enabled.

Some application-oriented offload can require modification to the application, so it is important to design such features into the application architecture before development and deployment. While it is certainly possible to add such functionality into the architecture after deployment, it is always easier to do so at the beginning.

WHY YOU NEED IT

Server offload is a way to increase the efficiency of servers and improve application performance and security. Server offload increases efficiency of servers by alleviating the need for the web/application server to consume resources performing tasks that can be performed more efficiently on an application network infrastructure solution. The two best examples of this are SSL encryption/decryption and compression. Both are CPU-intensive operations that can consume 20-40% of a web/application server’s resources. By offloading these functions to an application network infrastructure solution, servers “reclaim” those resources and can use them instead to execute application logic, serve more users, handle more requests, and do so faster.
Server offload improves application performance by allowing the web/application server to concentrate on what it is designed to do, serving applications, while putting the onus for performing ancillary functions on a platform that is better optimized to handle those functions.

Server offload provides these benefits whether you have a traditional client-server architecture or have moved (or are moving) toward a virtualized infrastructure. Applications deployed on virtual servers still use TCP connections and SSL and run applications and therefore will benefit the same as those deployed on traditional servers.

I am wondering why not all websites enabling this great feature GZIP?
3 Really good reasons you should use TCP multiplexing
SOA & Web 2.0: The Connection Management Challenge
Understanding network-side scripting
I am in your HTTP headers, attacking your application
Infrastructure 2.0: As a matter of fact that isn't what it means