Investigating the LTM TCP Profile: Nagle’s Algorithm
Introduction
The LTM TCP profile has over thirty settings that can be tuned to improve the experience between client and server. Because the TCP profile is applied to the virtual server, the stack can be customized (in both the client and server directions) for every application delivered by the LTM. In this series, we will dive into several of the configurable options and discuss the pros and cons of using them when delivering applications.
- Nagle's Algorithm
- Max Syn Retransmissions & Idle Timeout
- Windows & Buffers
- Timers
- QoS
- Slow Start
- Congestion Control Algorithms
- Acknowledgements
- Extended Congestion Notification & Limited Transmit Recovery
- The Finish Line
Quick aside for those unfamiliar with TCP: the Transmission Control Protocol (layer 4) rides on top of the Internet Protocol (layer 3) and is responsible for establishing connections between clients and servers so data can be exchanged reliably between them.
Normal TCP communication consists of a client and a server, a three-way handshake, reliable data exchange, and a four-way close. With the LTM as an intermediary in the client/server architecture, the session setup and teardown are duplicated, with the LTM playing the role of server to the client and client to the server. These two sessions are completely independent, even though in most cases the LTM can carry the client's TCP source port over to the server-side connection and, depending on your underlying network architecture, can also carry over the source IP.
Nagle's Algorithm, defined in RFC 896, is a congestion control mechanism designed to bundle smaller chunks of data for delivery in one big packet. The algorithm:
if there is new data to send
    if the window size >= MSS and available data is >= MSS
        send complete MSS segment now
    else
        if there is unconfirmed data still in the pipe
            enqueue data in the buffer until an acknowledge is received
        else
            send data immediately
        end if
    end if
end if
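The pseudocode above can be restated as a small Python function to make the three outcomes easier to see. This is only a sketch of the decision logic, not how any real stack (or the LTM) implements it; the function name, parameter names, and the 1460-byte MSS used in the demo are assumptions chosen for illustration.

```python
from enum import Enum


class Action(Enum):
    SEND_FULL_SEGMENT = "send complete MSS segment now"
    QUEUE = "enqueue until an ACK is received"
    SEND_IMMEDIATELY = "send data immediately"


def nagle_decision(window_size, available_data, mss, unacked_bytes):
    """Return the action the pseudocode takes for newly available data.

    All names are hypothetical; every size is in bytes.
    """
    if available_data <= 0:
        return None  # no new data, so there is nothing to decide
    if window_size >= mss and available_data >= mss:
        return Action.SEND_FULL_SEGMENT
    if unacked_bytes > 0:
        # Small data while earlier data is unconfirmed: hold it back.
        return Action.QUEUE
    return Action.SEND_IMMEDIATELY


if __name__ == "__main__":
    mss = 1460  # assumed MSS, typical for Ethernet without TCP options
    print(nagle_decision(65535, 4000, mss, unacked_bytes=0))
    print(nagle_decision(65535, 10, mss, unacked_bytes=0))
    print(nagle_decision(65535, 10, mss, unacked_bytes=10))
```

The third call is the case that matters for interactive traffic: a small write is held only because an earlier segment has not been acknowledged yet.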
Sending packets with 40 bytes of overhead (a 20-byte IP header plus a 20-byte TCP header) to carry little data is very inefficient, and Nagle's was created to address this inefficiency. Efficiency, however, is not the only consideration. Delay-sensitive applications such as remote desktop protocol can be severely impacted by Nagle's. An RDP user connecting to a terminal server expects real-time movement on the desktop presentation, but with Nagle's enabled, the sending station will queue small writes while earlier data is still unacknowledged. The user can perceive this as the network being slow when, in actuality, it is performing as designed.
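To put rough numbers on that inefficiency, here is a minimal arithmetic sketch. It assumes the 40 bytes are a 20-byte IPv4 header plus a 20-byte TCP header with no options, and it ignores link-layer framing; the function names are hypothetical.

```python
HEADER_BYTES = 40  # assumed: 20-byte IPv4 header + 20-byte TCP header, no options


def header_share(payload_bytes, header_bytes=HEADER_BYTES):
    """Fraction of each packet's bytes that is header rather than payload."""
    return header_bytes / (header_bytes + payload_bytes)


def packets_needed(total_bytes, payload_per_packet):
    """Packets required to carry total_bytes at a given payload size."""
    return -(-total_bytes // payload_per_packet)  # ceiling division


if __name__ == "__main__":
    for payload in (1, 100, 1460):
        print(f"{payload:>5} B payload: {header_share(payload):.1%} of the packet is header")
```

A single keystroke sent alone travels in a 41-byte packet, of which 40 bytes are header. Coalescing the same bytes into one full segment is what recovers the efficiency, at the cost of the queuing delay described above.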
Even for non-real-time applications, there can be a noticeable difference on the wire, even if the end user is oblivious to the performance change. This can come into play with automated performance scripts that enforce thresholds. For example, in one installation a first-generation load balancer was scheduled to be replaced. All TCP was simply passed through by the load balancer, so the controlled optimization points were isolated to the servers. The server TCP stacks were tuned with the help of a couple of monitoring tools: one that measured the time to paint the front page of the application, and one that performed a transaction within the application. During testing, inserting the LTM with the default TCP profile negated the optimizations made on the server TCP stacks, and the tools alerted the administrators accordingly with a twofold drop in performance. Disabling Nagle's alone resulted in a significant improvement over the default profile, but the final configuration included additional options, which will be discussed in the coming weeks.
One warning: Nagle's and delayed acknowledgements do not play well in the same sandbox. There's a good analysis here and a commentary on their interaction by Mr. Nagle himself here.
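A toy timing model helps show why the two features interact badly. It is a sketch under several stated assumptions, not a measurement: the sender issues two small writes back to back at time zero, the receiver has nothing to send back until both have arrived (so the ACK cannot piggyback on data), the path is symmetric, and the delayed-ACK timer runs its full length. The 200 ms timer is only an illustrative value, since real stacks differ, and the function name is hypothetical.

```python
def second_write_arrival_ms(rtt_ms, nagle, delayed_ack, ack_delay_ms=200.0):
    """Time at which the second of two small back-to-back writes reaches the receiver."""
    one_way = rtt_ms / 2
    if not nagle:
        # Without Nagle's, the second segment leaves right behind the first.
        return one_way
    # With Nagle's, the second write is queued until the first is acknowledged.
    ack_hold = ack_delay_ms if delayed_ack else 0.0
    first_arrives = one_way
    ack_arrives = first_arrives + ack_hold + one_way
    return ack_arrives + one_way


if __name__ == "__main__":
    for nagle, delayed_ack in ((False, False), (True, False), (True, True)):
        t = second_write_arrival_ms(10.0, nagle, delayed_ack)
        print(f"nagle={nagle!s:<5} delayed_ack={delayed_ack!s:<5} -> {t:.0f} ms")
```

In this model, Nagle's alone costs one extra round trip, while adding a delayed ACK makes the sender wait out the receiver's timer on top of it, which is why the stall dwarfs the network latency on a short path.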
In conclusion, Nagle's algorithm can make your bandwidth utilization more efficient in relation to packet overhead, but a careful analysis of the overall architecture will help in deciding if you should enable it.
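For comparison when testing, the same on/off choice exists on ordinary endpoints as the standard TCP_NODELAY socket option. The sketch below shows that endpoint-side toggle in Python; it is not LTM profile configuration, and the helper names are hypothetical.

```python
import socket


def make_socket(disable_nagle):
    """Create an unconnected TCP socket, optionally with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if disable_nagle:
        # TCP_NODELAY = 1 tells the stack to send small writes without coalescing.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock):
    """Report whether TCP_NODELAY is set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    for disable in (False, True):
        s = make_socket(disable)
        print(f"requested disable={disable}, TCP_NODELAY set: {nagle_disabled(s)}")
        s.close()
```

With the LTM in the path, each side of the proxy is its own TCP connection, so an endpoint setting like this affects only the connection that endpoint terminates.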
- L4L7_53191 (Nimbostratus): Nice write up Jason, as usual. One note - the real big issue with Nagling is that delayed ACK problem.
- Frank_30530 (Altocumulus): If Delayed ACK and Nagle are better not enabled simultaneously, why is it that in TMOS 11.1 in the tcp-wan-optimized profile both options are enabled?
- JRahm (Admin): That's a good question, Frank. My experience has been to keep nagle's disabled, but the product development team tests far more scenarios than I do on what makes an overall more stable and performant stack. Best practice is to test with your app and tune as necessary. Each app impacts tcp differently.
- Joe_Chapman_416 (Nimbostratus): This should really be incorporated into or mentioned in the F5 - RDP deployment guide.
- JRahm (Admin): Hi Joe, I'll pass along your recommendation to the group responsible for the deployment guides
- JRahm (Admin): Hi Joe, the deployment guides are being updated based on your feedback! Nice work!
- MR_RJ (Cirrus): Thanks a lot for this post.