TCP Configuration Just Got Easier: Autobuffer Tuning
One of the hardest things about configuring a TCP profile for optimum performance is picking the right buffer sizes. Guess too small, and your connection can't utilize the available bandwidth. Guess too large, and you're wasting system memory and potentially adding to path latency through the phenomenon of "bufferbloat." But if you get the path Bandwidth-Delay Product (BDP) right, you're in Nirvana: close to full link utilization without packet loss or latency spikes due to overflowing queues. Beginning in F5® TMOS® 13.0, help has arrived with F5's new 'autobuffer tuning' feature. Check the "Auto Proxy Buffer", "Auto Receive Window", and "Auto Send Buffer" boxes in your TCP profile configuration, and you need not worry about those buffer sizes any more.

What it Does

The concept is simple. To get a bandwidth-delay product, we need the bandwidth and the delay. We have a good idea of the delay from TCP's round-trip-time (RTT) measurement. In particular, the minimum observed RTT is a good indicator of the delay when queues aren't built up by over-aggressive flows. The bandwidth is a little trickier to measure. For the send buffer, the algorithm looks at long-term averages of arriving ACKs to estimate how quickly data is arriving at the destination. For the receive buffer, it's fairly straightforward to count the incoming bytes.

The buffers start at 64 KB. When the BDP calculation suggests that's not enough, the algorithm increments the buffers upwards and takes new measurements. After a few iterations, your connection buffer sizes should converge on something approaching the path BDP, plus a small bonus to cover measurement imprecision and leave space for later bandwidth increases.

Knobs! Lots of Knobs!

There are no configuration options in the profile to control autotuning except to turn it on and off. We figure you don't want to tune your autotuning! However, for inveterate optimizers, there are some sys db variables under the hood to make this feature behave exactly how you want.

For send buffers, the algorithm computes bandwidth and updates the buffer size every tm.tcpprogressive.sndbufmininterval milliseconds (default 100). The send buffer size is determined by (bandwidth_max * RTT_min) * tm.tcpprogressive.sndbufbdpmultiplier + tm.tcpprogressive.sndbufincr. The defaults for the multiplier and increment are 1 and 64 KB, respectively. Both of these quantities exist to provide a little "wiggle room" to discover newly available bandwidth and provision for measurement imprecision. The send buffer size starts at tm.tcpprogressive.sndbufmin (default 64 KB) and is limited to tm.tcpprogressive.sndbufmax (default 16 MB). For receive buffers, replace 'sndbuf' with 'rcvbuf' above.

For proxy buffers, the high watermark is MAX(send buffer size, peer TCP's receive window + tm.tcpprogressive.proxybufoffset) and the low watermark is (proxy buffer high) - tm.tcpprogressive.proxybufoffset. The proxy buffer high is limited by tm.tcpprogressive.proxybufmin (default 64 KB) and tm.tcpprogressive.proxybufmax (default 2 MB). When the send or receive buffers change, the proxy buffers are updated too.
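To make the arithmetic concrete, here's a quick Python sketch of the send-buffer formula above. The constants mirror the sys db defaults just listed, but the function itself is only an illustration of the calculation, not BIG-IP code, and the example bandwidth and RTT are made up.

```python
# Illustrative sketch of the send-buffer autotuning formula described above.
# The defaults mirror the sys db variables; the code itself is not BIG-IP source.

SNDBUF_MIN = 64 * 1024          # tm.tcpprogressive.sndbufmin (64 KB)
SNDBUF_MAX = 16 * 1024 * 1024   # tm.tcpprogressive.sndbufmax (16 MB)
SNDBUF_INCR = 64 * 1024         # tm.tcpprogressive.sndbufincr (64 KB)
SNDBUF_BDP_MULTIPLIER = 1       # tm.tcpprogressive.sndbufbdpmultiplier

def next_send_buffer(bandwidth_max_bps: float, rtt_min_s: float) -> int:
    """Return the target send buffer size in bytes for one tuning interval."""
    bdp_bytes = (bandwidth_max_bps / 8) * rtt_min_s
    target = bdp_bytes * SNDBUF_BDP_MULTIPLIER + SNDBUF_INCR
    return int(min(max(target, SNDBUF_MIN), SNDBUF_MAX))

# Example: a 50 Mbps path with a 40 ms minimum RTT has a BDP of 250 KB,
# so the buffer should settle around 250 KB plus the 64 KB increment.
print(next_send_buffer(50e6, 0.040))  # ~315536 bytes
```

The same shape of calculation applies to the receive buffer, with the 'rcvbuf' variables substituted.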
This May Not Be For Some Users

Some of you out there already have a great understanding of your network, have solid estimates of BDPs, and have configured your buffers accordingly. You may be better off sticking with your carefully measured settings. Autobuffer tuning starts out with no knowledge of the network and converges on the right setting; that's inferior to knowing the correct setting beforehand and going right to it.

Autotuning Simplifies TCP Configuration

We've heard from the field that many people find TCP profiles too hard to configure. Together with the Autonagle option, autobuffer tuning is designed to take away some of the pain of getting the most out of your TCP stack. If you don't know where to start with setting buffer sizes, turn on autotuning and let the BIG-IP® set them for you.

TCP Internals: 3-way Handshake and Sequence Numbers Explained
In this article, I will explain and show you what really happens during a TCP 3-way handshake as captured by the tcpdump tool. We'll go deeper into the details of the TCP 3-way handshake (SYN, SYN/ACK and ACK) and how Sequence Numbers and Acknowledgement Numbers actually work. Moreover, I'll also briefly explain, using real data, how the TCP Receive Window and Maximum Segment Size play an important role in a TCP connection. As a side note, I will not touch TCP SACK and TCP Timestamps this time, as they should be covered in a future article about TCP retransmissions.

FYI, the TCP capture was generated by a simple HTTP GET request to BIG-IP to get hold of a file in the /cgi-bin/ directory called script.pl, using the HTTP/1.1 protocol. BIG-IP then responds with HTTP/1.1 200 OK with the requested data. This is not very relevant as we'll be looking at the TCP layer, but it's good to understand the capture's context to fully understand what's going on.

This is what a TCP 3-way handshake looks like on Wireshark: As we can see, the first 3 packets are exchanged less than 1 second apart from each other. The IN/OUT portion of the Info field on BIG-IP's capture tells us whether the packet is coming IN or being sent OUT by BIG-IP (as the capture was taken on BIG-IP). As this is a slightly more in-depth explanation of TCP internals, I am assuming you know at least what a TCP 3-way handshake is conceptually.

The TCP SYN, SYN/ACK and ACK Segments

We can see that the first packet is [SYN], the second one is [SYN/ACK] and the last one is [ACK], as displayed on Wireshark. The Info section as a whole only shows a summary of the most relevant fields copied from the TCP header. It is just enough to make us understand the context of the TCP segment. Let's now have a look at what these fields mean, with the exception of SACK_PERM and TSval. When we double-click on the [SYN] packet below, we find the same information again in the actual TCP header: The most important thing to understand here is that [SYN], [SYN/ACK] and [ACK] are all part of the Flags header above. They're just 1's and 0's. When the SYN flag is enabled (i.e. its value is 1), the receiving end (in this case BIG-IP) should automatically understand that someone (my client PC in this case) is trying to establish a TCP connection. The response from BIG-IP (SYN/ACK) is an acknowledgement of the SYN packet, and therefore it has both the SYN and ACK flags set to 1. The client's last response is just an ACK, as seen below: As per the RFC, both sides should now assume a TCP connection is established. For the plain-text HTTP/1.1 protocol, there should now be a GET request in another layer as a payload of (or encapsulated by) the TCP layer. If our traffic is protected by TLS, then the TLS layer should come first as the payload of the TCP layer, and HTTP would be the payload of the TLS layer. Does it make sense? That's how things work in the real world.

TCP Sequence Numbers

As a side note, Wireshark shows that our first SYN segment's Sequence number is 0 (Seq=0). It also shows that this is a relative sequence number, not the real TCP sequence number. Wireshark automatically zeroes it for you to make it easier to visualise and/or troubleshoot. In reality, the real sequence number is a much longer number that is calculated by your OS using the current time and other random parameters for security purposes. This is how we see the real sequence number in Wireshark: Now back to business. Some people say that if the Client sends a TCP segment to BIG-IP, BIG-IP's ACK should be the client's sequence number + 1, right? Wrong! Instead of +1, it should be + (the number of bytes last received from the peer), or +1 for SYN or FIN segments.
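If it helps to see that rule as code, here is a tiny Python sketch of the arithmetic (using relative sequence numbers, as Wireshark displays them); it's purely illustrative.

```python
# Minimal sketch of the acknowledgement arithmetic described above
# (relative sequence numbers, as Wireshark displays them).

def next_ack(peer_seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """ACK number to send after receiving a segment from the peer."""
    # SYN and FIN each consume one sequence number; payload consumes its length.
    consumed = payload_len + (1 if syn else 0) + (1 if fin else 0)
    return peer_seq + consumed

print(next_ack(0, 0, syn=True))   # ACK for a SYN with Seq=0 -> 1
print(next_ack(1, 93))            # ACK for Seq=1 carrying 93 bytes -> 94
```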
To clarify, here's the full Flow Graph of our capture using relative sequence numbers to make it easier to grasp (.135 = Client and .143 = BIG-IP). On the 4th segment above (PSH, ACK - Len: 93), the client sends a TCP segment with Seq = 1 and a TCP payload (comprised of the HTTP layer) of 93 bytes. In this case, BIG-IP's response is not ACK = 2 (1 + 1), as some might think. Instead, BIG-IP responds with whatever the client's last Sequence number was plus the number of bytes last received. As the last sequence number was 1 and the client also sent a TCP payload of 93 bytes, the ACK is 94! This is the most important concept to grasp for understanding sequence numbers and ACKs. SEQs and ACKs only increment when there is a TCP payload involved (by the number of bytes). SYN, FIN or ZeroWindow segments count as 1 byte for SEQs/ACKs. I added a full analysis using real TCP SEQs/ACKs to an Appendix section if you'd like to go deeper into it. For the moment, let's shift our attention towards the TCP Receive Window.

TCP Receive Window and Maximum Segment Size (MSS)

During the 3-way handshake, the Receive Window (Window size value on Wireshark) tells each side of the connection the maximum receive buffer, in bytes, that each side can handle. So it's literally like this (read the red lines first, please):

[1] → Hey, BIG-IP! My receive buffer size is 29200 bytes. That means you can initially send me up to 29200 bytes before you even bother waiting for an ACK from me to send further data.

[2] → This should be the same as [1], unless the Window Scale TCP Option is active. Window Scale should be the subject of a different article, but I briefly touch on it in [3].

[3] → The original TCP Window Size field is limited to 16 bits, so the maximum buffer size is just 65,535 bytes, which is too little for today's speedy connections. This option extends the 16-bit window to a 32-bit window, but because BIG-IP did not advertise the Window Scale option for this connection, it is disabled, as both sides must support it for it to be used.

[4] → Hey, client! My receive buffer size is 4380 bytes. That means you can initially send me up to 4380 bytes before you even bother waiting for an ACK from me to send further data.

The reason why the word initially is underlined in [1] and [4] is that the Window size typically changes during the connection. For example, the client's initial window size is 29200 bytes, right? This means that if it receives 200 bytes from BIG-IP, it should go down to 29000 bytes. Easy, eh? But that's not what always happens in real life. In fact, in our capture it's the opposite! The Bytes in flight column shows the data BIG-IP (*.143) is sending, in bytes, to our client (*.135) that has not yet been acknowledged. I've added a column with the Window Size value to make it easier to spot how variable this field is. It is the OS TCP flow control implementation that dictates the Receive Window size, taking into account the current "health" of its TCP stack and, of course, your configuration. Yes, in many cases, especially in the middle of a connection, the Window Size does decrease based on the amount of data received/buffered, so our first explanation also makes sense!

How does BIG-IP know that the client has freed up its buffer again? As we can see above, when the Client ACKs the receipt of BIG-IP's data, it also informs the size of its buffer in the Window Size value field. That's how BIG-IP knows how much data it can send to the Client before it receives another ACK.
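Here's a small Python sketch of that flow-control idea; the 29200-byte window and 334-byte payload come from the capture above, and the function is only an illustration (it ignores congestion control entirely).

```python
# Minimal sketch of how a sender is limited by the peer's advertised
# receive window (ignoring congestion control for simplicity).

def sendable_bytes(advertised_window: int, bytes_in_flight: int) -> int:
    """How much more data may be sent before the next ACK arrives."""
    return max(advertised_window - bytes_in_flight, 0)

# The client initially advertises 29200 bytes; while 334 bytes from BIG-IP
# are still unacknowledged, only 29200 - 334 bytes remain sendable.
print(sendable_bytes(29200, 334))  # 28866
```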
What about the Maximum Segment Size?

Each side also displays a TCP Option - Maximum Segment Size of 1460 bytes. This informs the maximum size of the TCP payload each side can send at a time (per TCP segment). Looking at the picture above, BIG-IP sent 334 bytes of TCP payload to the client, right? In theory, this could have been up to 1460 bytes, as that is also within the client's initial buffer of 29200 bytes. So apart from informing each other about the maximum buffer, the maximum size of a TCP segment is also communicated.

TCP Len vs Bytes in Flight Column (BIF)

If we look at our last picture, we can see that whatever is in the Len field matches what's in our BIF column, right? Are they the same? No! Len shows the current size of the TCP payload (excluding the size of the TCP header). Remember that the TCP payload in this case is the whole HTTP portion that our TCP segment is carrying. Bytes in flight is not really part of the TCP header; it's something Wireshark adds to make it easier for us to troubleshoot. It just means the number of bytes sent that have not yet been acknowledged by the receiver. In our capture, data is acknowledged immediately, so both Len and BIF are the same. I've picked a different capture here where there are 3 TCP segments sent with no acknowledgement, so the BIF column increments for each unacknowledged data segment but goes back to zero as soon as an ACK is received by the receiver. Notice that the BIF values now differ from the TCP payload (the equivalent of Len in the Info column).

That's it for now. The next article will be about TCP retransmissions.

Appendix - Going in depth into TCP sequence numbers!

Here's a full explanation of what actually takes place at the TCP layer from the point of view of BIG-IP: just follow along from [1] to [10]. That's it.

The TCP Proxy Buffer
The proxy buffer is probably the least intuitive of the three TCP buffer sizes that you can configure in F5's TCP optimization offering. Today I'll describe what it does, and how to set the "high" and "low" buffer limits in the profile.

The proxy buffer is the place BIG-IP stores data that isn't ready to go out to the remote host. The send buffer, by definition, holds data already sent but unacknowledged. Everything else is in the proxy buffer. That's really all there is to it.

From this description, it should be clear why we need limits on the size of this buffer. Probably the most common deployment of a BIG-IP has a connection to the server that is way faster than the connection to the client. In these cases, data will simply accumulate at the BIG-IP as it waits to pass through the bottleneck of the client connection. This consumes precious resources on the BIG-IP, instead of commodity servers. So proxy-buffer-high is simply a limit where the BIG-IP will tell the server, "enough." proxy-buffer-low is when it will tell the server to start sending data again. The gap between the two is simply hysteresis: if proxy-buffer-high were the same as proxy-buffer-low, we'd generate tons of start/stop signals to the server as the buffer level bounced above and below the threshold. We like that gap to be about 64 KB, as a rule of thumb.

So how does it tell the server to stop? TCP simply stops increasing the receive window: once the advertised bytes available have been sent, TCP will advertise a zero receive window. This stops server transmissions (except for some probes) until the BIG-IP signals it is ready again by sending an acknowledgment with a non-zero receive window advertisement.

Setting a very large proxy-buffer-high will obviously increase the potential memory footprint of each connection. But what is the impact of setting a low one? On the sending side, the worst-case scenario is that a large chunk of the send buffer clears at once, probably because a retransmitted packet allows acknowledgement of a missing packet and a bunch of previously received data. At worst, this could cause the entire send buffer to empty and cause the sending TCP to ask the proxy buffer to accept a whole send buffer's worth of data. So if you're not that worried about the memory footprint, the safe thing is to set proxy-buffer-high to the same size as the send buffer.

The limits on proxy-buffer-low are somewhat more complicated to derive, but the issue is that if a proxy buffer at proxy-buffer-low suddenly drains, it will take one serverside Round Trip Time (RTT) to send the window update and start getting data again. So the total amount of data that has to be in the proxy buffer at the low point is the RTT of the serverside times the bandwidth of the clientside. If the proxy buffer is filling up, the serverside rate generally exceeds the clientside data rate, so that will be sufficient.

If you're not deeply concerned about the memory footprint of connections, the minimum proxy buffer settings that will prevent any impairment of throughput are as follows for the clientside:

proxy-buffer-high = send-buffer-size = (clientside bandwidth) * (clientside RTT)
proxy-buffer-low = (clientside bandwidth) * (serverside RTT)

proxy-buffer-low must be sufficiently below proxy-buffer-high to avoid flapping.
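Here's a small Python sketch of those formulas; the bandwidth and RTT numbers in the example are made up, and the function is only an illustration of the guidance above, not how BIG-IP computes anything.

```python
# Illustrative sketch of the proxy buffer guidance above. Bandwidth is in
# bits per second and RTTs are in seconds; this is not BIG-IP code.

def proxy_buffer_settings(clientside_bw_bps: float,
                          clientside_rtt_s: float,
                          serverside_rtt_s: float) -> dict:
    clientside_Bps = clientside_bw_bps / 8
    high = int(clientside_Bps * clientside_rtt_s)   # = send buffer size
    low = int(clientside_Bps * serverside_rtt_s)
    # Keep roughly a 64 KB gap so the start/stop signal doesn't flap.
    if high > 64 * 1024:
        low = min(low, high - 64 * 1024)
    return {"proxy-buffer-high": high, "proxy-buffer-low": max(low, 0)}

# Example: 20 Mbps clientside with a 60 ms clientside RTT and 5 ms serverside RTT.
print(proxy_buffer_settings(20e6, 0.060, 0.005))
# {'proxy-buffer-high': 150000, 'proxy-buffer-low': 12500}
```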
If you are running up against memory limits, then cutting back on these settings will only hurt you in the cases above. Economizing on proxy buffer space is definitely preferable to limiting the send rate by making the send buffer too small.

Is TCP's Nagle Algorithm Right for Me?
Of all the settings in the TCP profile, the Nagle algorithm may get the most questions. Designed to avoid sending small packets wherever possible, the question of whether it's right for your application rarely has an easy, standard answer.

What does Nagle do?

Without the Nagle algorithm, in some circumstances TCP might send tiny packets. In the case of BIG-IP®, this would usually happen because the server delivers packets that are small relative to the clientside Maximum Transmission Unit (MTU). If Nagle is disabled, BIG-IP will simply send them, even though waiting a few milliseconds would allow TCP to aggregate the data into larger packets.

The result can be pernicious. Every TCP/IP packet has at least 40 bytes of header overhead, and in most cases 52 bytes. If payloads are small enough, most of your network traffic will be overhead, reducing the effective throughput of your connection. Second, clients with battery limitations really don't appreciate turning on their radios to send and receive packets more frequently than necessary. Lastly, some routers in the field give preferential treatment to smaller packets. If your data has a series of differently-sized packets, and the misfortune to encounter one of these routers, it will experience severe packet reordering, which can trigger unnecessary retransmissions and severely degrade performance.

Specified in RFC 896 all the way back in 1984, the Nagle algorithm gets around this problem by holding sub-MTU-sized data until the receiver has acked all outstanding data. In most cases, the next chunk of data is coming up right behind, and the delay is minimal.

What are the Drawbacks?

The benefits of aggregating data in fewer packets are pretty intuitive. But under certain circumstances, Nagle can cause problems:

- In a proxy like BIG-IP, rewriting arriving packets in memory into a different, larger spot in memory taxes the CPU more than simply passing payloads through without modification.

- If an application is "chatty," with message traffic passing back and forth, the added delay can add up to a lot of time. For example, imagine a network with a 1500-byte MTU where the application needs a reply from the client after each 2000-byte message. In the figure at right, the left diagram shows the exchange without Nagle: BIG-IP sends all the data in one shot, and the reply comes in one round trip, allowing it to deliver four messages in four round trips. On the right is the same exchange with Nagle enabled. Nagle withholds the 500-byte packet until the client acks the 1500-byte packet, meaning it takes two round trips to get the reply that allows the application to proceed. Thus sending four messages takes eight round trips. This scenario is a somewhat contrived worst case, but if your application is more like this than not, then Nagle is a poor choice. (The sketch after this list walks through the arithmetic.)

- If the client is using delayed ACKs (RFC 1122), it might not send an acknowledgment until up to 500 ms after receipt of the packet. That's time BIG-IP is holding your data, waiting for acknowledgment. This multiplies the effect on chatty applications described above.
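Here is the arithmetic from the chatty-application example above as a small Python sketch; it's a deliberately simplified model (one sub-MTU tail per message, no delayed ACKs), not a protocol simulation.

```python
# Back-of-the-envelope version of the chatty-application example above:
# each 2000-byte message must fit through a 1500-byte MTU, and the sender
# waits for an application reply before sending the next message.

def round_trips(messages: int, message_size: int, mtu: int, nagle: bool) -> int:
    """Round trips needed to deliver `messages` request/reply exchanges."""
    tail = message_size % mtu          # sub-MTU remainder of each message
    per_message = 2 if (nagle and tail) else 1
    return messages * per_message

print(round_trips(4, 2000, 1500, nagle=False))  # 4 round trips
print(round_trips(4, 2000, 1500, nagle=True))   # 8 round trips
```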
F5 Has Improved on Nagle

The drawbacks described above sound really scary, but I don't want to talk you out of using Nagle at all. The benefits are real, particularly if your application servers deliver data in small pieces and the application isn't very chatty. More importantly, F5® has made a number of enhancements that remove a lot of the pain while keeping the gain:

- Nagle-aware HTTP Profiles: all TMOS HTTP profiles send a special control message to TCP when they have no more data to send. This tells TCP to send what it has without waiting for more data to fill out a packet.

- Autonagle: in TMOS v12.0, users can configure Nagle as "autotuned" instead of simply enabling or disabling it in their TCP profile. This mechanism starts out not executing the Nagle algorithm, but uses heuristics to test whether the receiver is using delayed acknowledgments on a connection; if not, it applies Nagle for the remainder of the connection. If delayed ACKs are in use, TCP will not wait to send packets but will still try to concatenate small packets into MSS-size packets when all are available. [UPDATE: v13.0 substantially improves this feature.]

- One small packet allowed per RTT: beginning with TMOS® v12.0, when 'auto' mode has enabled Nagle, TCP will allow one unacknowledged undersized packet at a time, rather than zero. This speeds up sending the sub-MTU tail of any message while not allowing a continuous stream of undersized packets. This averts the nightmare scenario above completely.

Given these improvements, the Nagle algorithm is suitable for a wide variety of applications and environments. It's worth looking at both your applications and the behavior of your servers to see if Nagle is right for you.

TCP SACK Kernel Panic Vuln - F5 impacted?
Hi Team, I know this question is eventually going to be asked, so I may as well do it: with the news this week around the three CVEs relating to Selective ACKs (CVE-2019-11477, CVE-2019-11478, CVE-2019-5599), I wanted to know whether we need to disable SACK on our TCP profiles, or is this threat mitigated by the fact that traffic is handled by the F5 pipeline instead of the Linux socket buffer? Thanks.

Huge number of TCP 3WHS rejected (bad ACK), chksum incorrect
Hi guys, hope you can help me with this; to me it's a complete mystery. I'm getting lots of the following.

Wireshark text export from an F5 tcpdump:

4815 17:27:58.597830 CLIENT_IP F5_VS_IP TCP 162 36562 → 443 [SYN] Seq=0 Win=29200 Len=0 MSS=1460 SACK_PERM=1 TSval=681665250 TSecr=0 WS=128
4816 17:27:58.597846 F5_VS_IP CLIENT_IP TCP 193 443 → 36562 [SYN,ACK] Seq=0 Ack=1 Win=4380 Len=0 MSS=1460 TSval=751619333 TSecr=681665250 SACK_PERM=1
4817 17:27:58.660439 CLIENT_IP F5_VS_IP TCP 185 36562 → 443 [ACK] Seq=1 Ack=1 Win=29200 Len=0 TSval=681665313 TSecr=751619333
4818 17:27:58.730179 CLIENT_IP F5_VS_IP TLSv1.2 380 Client Hello
4819 17:27:58.730201 F5_VS_IP CLIENT_IP TLSv1.2 4529 Server Hello
4820 17:27:58.792837 CLIENT_IP F5_VS_IP TCP 185 36562 → 443 [ACK] Seq=196 Ack=4345 Win=37648 Len=0 TSval=681665445 TSecr=751619465
4821 17:27:58.792854 F5_VS_IP CLIENT_IP TLSv1.2 706 Certificate, Server Hello Done
4822 17:27:58.855416 CLIENT_IP F5_VS_IP TCP 185 36562 → 443 [ACK] Seq=196 Ack=4866 Win=40544 Len=0 TSval=681665508 TSecr=751619528
4823 17:27:58.857719 CLIENT_IP F5_VS_IP TLSv1.2 543 Client Key Exchange, Change Cipher Spec, Encrypted Handshake Message
4824 17:27:58.857731 F5_VS_IP CLIENT_IP TCP 185 443 → 36562 [ACK] Seq=4866 Ack=554 Win=4933 Len=0 TSval=751619593 TSecr=681665510
4825 17:27:58.859778 F5_VS_IP CLIENT_IP TLSv1.2 276 Change Cipher Spec, Encrypted Handshake Message
4826 17:27:58.923739 CLIENT_IP F5_VS_IP TLSv1.2 670 Application Data
4827 17:27:58.923793 F5_VS_IP CLIENT_IP TCP 185 443 → 36562 [ACK] Seq=4957 Ack=1039 Win=5418 Len=0 TSval=751619659 TSecr=681665576
4828 17:27:58.923981 F5_FLOAT_IP SERVER_IP TCP 193 1360 → 8080 [SYN] Seq=0 Win=4380 Len=0 MSS=1460 TSval=751619659 TSecr=0 SACK_PERM=1
4829 17:27:59.923626 F5_FLOAT_IP SERVER_IP TCP 193 [TCP Retransmission] 1360 → 8080 [SYN] Seq=0 Win=4380 Len=0 MSS=1460 TSval=751620659 TSecr=0 SACK_PERM=1
4830 17:27:59.923874 SERVER_IP F5_FLOAT_IP TCP 173 [TCP ACKed unseen segment] 8080 → 1360 [ACK] Seq=1 Ack=993763571 Win=29845 Len=0
4831 17:27:59.923882 F5_FLOAT_IP SERVER_IP TCP 209 1360 → 8080 [RST] Seq=993763571 Win=0 Len=0
4832 17:28:00.923733 F5_FLOAT_IP SERVER_IP TCP 193 [TCP Retransmission] 1360 → 8080 [SYN] Seq=0 Win=4380 Len=0 MSS=1460 TSval=751621659 TSecr=0 SACK_PERM=1
4833 17:28:01.923650 F5_FLOAT_IP SERVER_IP TCP 181 [TCP Retransmission] 1360 → 8080 [SYN] Seq=0 Win=4380 Len=0 MSS=1460 SACK_PERM=1
4834 17:28:01.923822 SERVER_IP F5_FLOAT_IP TCP 173 [TCP ACKed unseen segment] 8080 → 1360 [ACK] Seq=2408178215 Ack=1538671403 Win=30282 Len=0
4835 17:28:01.923845 F5_FLOAT_IP SERVER_IP TCP 209 1360 → 8080 [RST] Seq=1538671403 Win=0 Len=0
4836 17:28:02.923550 F5_VS_IP CLIENT_IP TCP 204 443 → 36562 [RST,ACK] Seq=4957 Ack=1039 Win=0 Len=0
4837 17:28:02.923561 F5_FLOAT_IP SERVER_IP TCP 204 [TCP ACKed unseen segment] 1360 → 8080 [RST, ACK] Seq=1 Ack=591246314 Win=0 Len=0

The F5 tcpdump shows the following (this is from a different case):

F5_FLOAT_IP.27216 > SERVER_IP.8080: Flags [R], cksum 0x95b2 (incorrect -> 0x0a17), seq 3911139265, win 0, length 0 out slot1/tmm10 lis=/Common/https_production flowtype=128 flowid=5618A9EEBE00 peerid=56189FD35F00 conflags=4000024 inslot=2 inport=9 haunit=1 priority=3 rst_cause="[0x2b07e6a:2314] TCP 3WHS rejected (bad ACK)" peerremote=00000000:00000000:X:X peerlocal=00000000:00000000:X:X remoteport=59656 localport=443 proto=6 vlan=4093

It is hitting constantly, and quite a lot.
As per K13223, this RST cause means "The BIG-IP system failed to establish a TCP connection with the host (client or server) due to a failure during the TCP 3-way handshake process." In my case it is the communication between the F5 and the server pool (all nodes affected). There is no firewall between the F5 and the server pool(s). It is happening with both AutoMap and SNAT. Are there any guides or cases on how to debug this issue further? My tests show that it is not tied to the client type (browser, curl, ...) or the URL (the same URL works in 99 percent of cases; that 1% is what's bothering me). Thank you!

Stop Using the Base TCP Profile!
[Update 1 Mar 2017: F5 has new built-in profiles in TMOS v13.0. Although the default profile settings still haven't changed, there is good news on that front as well.]

If the customer data I've seen is any indication, the vast majority of our customers are using the base 'tcp' profile to configure their TCP optimization. This has poor performance consequences, and I strongly encourage you to replace it immediately.

What's wrong with it?

- The buffers are too small. Both the receive and send buffers are limited to 64 KB, and the proxy buffer won't exceed 48 KB. If the bandwidth-delay product of your connection exceeds the send or receive buffer, which it will in most of today's internet for all but the smallest files and shortest delays, your applications will be limited not by the available bandwidth but by an arbitrary memory limitation. (The sketch after this list makes the arithmetic concrete.)

- The Initial Congestion Window is too small. As the early thin-pipe, small-buffer days of the internet recede, the Internet Engineering Task Force (see IETF RFC 6928) increased the allowed size of a sender's initial burst. This allows more file transfers to complete in a single round trip and allows TCP to discover the true available bandwidth faster.

- Delayed ACKs. The base profile enables Delayed ACK, which tries to reduce ACK traffic by waiting 200 ms to see if more data comes in. This incurs a serious performance penalty on SSL, among other upper-layer protocols.
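To put numbers on the first point, here's a quick Python check; the bandwidths and RTTs are arbitrary examples, not measurements.

```python
# Quick check of the claim above: does a 64 KB buffer cover the path's
# bandwidth-delay product? The bandwidth/RTT pairs are hypothetical examples.

def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> int:
    return int(bandwidth_bps / 8 * rtt_s)

BASE_TCP_BUFFER = 64 * 1024   # send/receive limit in the base 'tcp' profile

for bw, rtt in [(10e6, 0.020), (25e6, 0.080), (100e6, 0.060)]:
    bdp = bdp_bytes(bw, rtt)
    verdict = "buffer-limited" if bdp > BASE_TCP_BUFFER else "ok"
    print(f"{bw/1e6:.0f} Mbps @ {rtt*1000:.0f} ms RTT -> BDP {bdp} bytes ({verdict})")
```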
What should you do instead?

The best answer is to build a custom profile based on your specific environment and requirements. But we recognize that some of you will find that daunting! So we've created a variety of profiles customized for different environments. Frankly, we should do some work to improve these profiles, but even today there are much better choices than the base 'tcp'.

If you have an HTTP profile attached to the virtual, we recommend you use tcp-mobile-optimized. This is true even if your clients aren't mobile. The name is misleading! As I said, the default profiles need work.

If you're just a bit more adventurous with a virtual that has an HTTP profile, then mptcp-mobile-optimized will likely outperform the above. Besides enabling Multipath TCP (MPTCP) for clients that ask for it, it uses a more advanced congestion control algorithm ("Illinois") and rate pacing. We recognize, however, that if you're still using the base 'tcp' profile today, you're probably not comfortable with the newest, most innovative enhancements to TCP, so plain old tcp-mobile-optimized might be a gentler step forward.

If your virtual doesn't have an HTTP profile, the best decision is to use a modified version of tcp-mobile-optimized or mptcp-mobile-optimized. Just derive a profile from whichever you prefer and disable the Nagle algorithm. That's it! If you are absolutely dead set against modifying a default profile, then wam-tcp-lan-optimized is the next best choice. It doesn't really matter whether the attached network is actually a LAN or the open internet.

Why did we create a default profile with undesirable settings?

That answer is lost in the mists of time. But now it's hard to change: altering the profile from which all other profiles are derived would cause sudden changes in customer TCP behavior when they upgrade. Most would benefit, and many might not even notice, but we try not to surprise people. Nevertheless, if you want a quick, cheap, and easy boost to your application performance, simply switch your TCP profile from the base to one of our other defaults. You won't regret it.

F5 Unveils New Built-In TCP Profiles
[Update 3/17: Some representative performance results are at the bottom.]

Longtime readers know that F5's built-in TCP profiles were in need of a refresh. I'm pleased to announce that in TMOS® version 13.0, available now, there are substantial improvements to the built-in profile scheme. Users expect defaults to reflect best common practice, and we've made a huge step towards that being true.

New Built-in Profiles

We've kept virtually all of the old built-in profiles, for those of you who are happy with them or have built other profiles that derive from them. But there are four new ones to load directly into your virtual servers or use as a basis for your own tuning. The first three are optimized for particular network use cases: f5-tcp-wan, f5-tcp-lan, and f5-tcp-mobile are updated versions of tcp-wan-optimized, tcp-lan-optimized, and tcp-mobile-optimized. These adapt all settings to the appropriate link types, except that they don't enable the very newest features. If the hosts you're communicating with tend to use one kind of link, these are a great choice. The fourth is f5-tcp-progressive. This is meant to be a general-use profile (like the tcp default), but it contains the very latest features for early adopters.

In our benchmark testing, we had the following criteria: f5-tcp-wan, f5-tcp-lan, and f5-tcp-mobile achieved throughput at least as high as, and often better than, the default tcp profile for that link type; f5-tcp-progressive had equal or higher throughput than default TCP across all representative network types. The relative performance of f5-tcp-wan/lan/mobile and f5-tcp-progressive on each link type will vary given the new features that f5-tcp-progressive enables.

Living, Read-Only Profiles

These four new profiles, and the default 'tcp' profile, are now "living." This means that we'll continually update them with best practices as they evolve. Brand-new features, if they are generally applicable, will immediately appear in f5-tcp-progressive. For our more conservative users, these new features will appear in the other four living profiles after a couple of releases. The default tcp profile hasn't changed yet, but it will in future releases! These five profiles are also now read-only, meaning that to make modifications you'll have to create a new profile that descends from them. This will aid in troubleshooting. If there are any settings that you like so much that you never want them to change, simply click the "custom" button in the child profile, and the changes we push out in the future won't affect your settings.

How This Affects Your Existing Custom Profiles

If you've put thought into your TCP profiles, we aren't going to mess with them. If your profile descends from any of the previous built-ins besides default 'tcp', there is no change to settings whatsoever. Upgrades to 13.0 will automatically prevent disruptions to your configuration. We've copied all of the default tcp profile settings to tcp-legacy, which is not a "living" profile. All of the old built-in profiles (like tcp-wan-optimized), and any custom profiles descended from default tcp, will now descend instead from tcp-legacy and never change due to upgrades from F5. tcp-legacy will also include any modifications you made to the default tcp profile, as that profile is not read-only. Our data shows that few, if any, users are using the current (TMOS 12.1 and before) tcp-legacy settings. If you are, it is wise to make a note of those settings before you upgrade.
How This Affects Your Existing Virtual Servers

As the section above describes, if your virtual server uses any profile other than default 'tcp' or tcp-legacy, there will be no settings change at all. Given the weaknesses of the current default settings, we believe most users who use virtuals with the TCP default are not carefully considering their settings. Those virtuals will continue to use the default profile, and therefore their settings will begin to evolve as we modernize the default profile in 13.1 and later releases. If you very much like the default TCP profile, perhaps because you customized it when it wasn't read-only, you should manually change the virtual to use tcp-legacy; there will be no change in behavior.

Use the New Profiles for Better Performance

The internet changes. Bandwidths increase, we develop better algorithms to automatically tune your settings, and the TCP standard itself evolves. If you use the new profile framework, you'll keep up with the state of the art and maximize the throughput your applications receive. Below, I've included some throughput measurements from our in-house testing. We used parameters representative of seven different link types and measured the throughput using some relevant built-in profiles. Obviously, the performance in your deployment may vary. Aside from LANs, where frankly tuning isn't all that hard, the benefits are pretty clear.

Introducing QUIC and HTTP/3
QUIC [1] is a new transport protocol that provides similar service guarantees to TCP, and then some, operating over a UDP substrate. It has important advantages over TCP:

- Streams: QUIC provides multiple reliable, ordered byte streams, which has several advantages for user experience and loss response over the single stream in TCP. The stream concept was used in HTTP/2, but moving it into the transport further amplifies the benefits.

- Latency: QUIC can complete the transport and TLS handshakes in a single round trip. Under some conditions, it can complete the application handshake (e.g. HTTP requests) in a single round trip as well.

- Privacy and Security: QUIC always uses TLS 1.3, the latest standard in application security, and hides much more data about the connection from prying eyes. Moreover, it is much more resistant than TCP to various attacks on the protocol, because almost all of its packets are authenticated.

- Mobility: If put in the right sort of data center infrastructure, QUIC seamlessly adjusts to changes in IP address without losing connectivity. [2]

- Extensibility: Innovation in TCP is frequently hobbled by middleboxes peering into packets and dropping anything that seems non-standard. QUIC's encryption, authentication, and versioning should make it much easier to evolve the transport as the internet evolves.

Google started experimenting with early versions of QUIC in 2012, eventually deploying it on Chrome browsers, its mobile apps, and most of its server applications. Anyone using these tools together has been using QUIC for years! The Internet Engineering Task Force (IETF) has been working to standardize it since 2016, and we expect that work to complete in a series of Internet Requests for Comments (RFC) standards documents in late 2020.

The first application to take advantage of QUIC is HTTP. The HTTP/3 standard will publish at the same time as QUIC, and it primarily revises HTTP/2 to move the stream multiplexing down into the transport.

F5 has been tracking the development of the internet standard. In TMOS 15.1.0.1, we released clientside support for draft-24 of the standard. That is, BIG-IP can proxy your HTTP/1 and HTTP/2 servers so that they communicate with HTTP/3 clients. We rolled out support for draft-25 in 15.1.0.2 and draft-27 in 15.1.0.3. While earlier drafts are available in Chrome Canary and other experimental browser builds, draft-27 is expected to see wide deployment across the internet. While we won't support all drafts indefinitely going forward, our policy will be to support two drafts in any given maintenance release. For example, 15.1.0.2 supports both draft-24 and draft-25.

If you're delivering HTTP applications, I hope you take a look at the cutting edge and give HTTP/3 a try! You can learn more about deploying HTTP/3 on BIG-IP on our support page at K60235402: Overview of the BIG-IP HTTP/3 and QUIC profiles.

-----
[1] Despite rumors to the contrary, QUIC is not an acronym.
[2] F5 doesn't yet support QUIC mobility features. We're still in the midst of rolling out improvements.