Investigating the LTM TCP Profile: Nagle’s Algorithm

Introduction

The LTM TCP profile has over thirty settings that can be manipulated to enhance the experience between client and server. Because the TCP profile is applied to the virtual server, the flexibility exists to customize the stack (in both client & server directions) for every application delivered by the LTM. In this series, we will dive into several of the configurable options and discuss the pros and cons of their inclusion in delivering applications.

Quick aside for those unfamiliar with TCP: the transmission control protocol (layer 4) rides on top of the internet protocol (layer 3) and is responsible for establishing connections between clients and servers so data can be exchanged reliably between them.

Normal TCP communication consists of a client and a server, a three-way handshake, reliable data exchange, and a four-way close. With the LTM as an intermediary in the client/server architecture, the session setup/teardown is duplicated, with the LTM playing the role of server to the client and client to the server. These sessions are completely independent, even though the LTM can, in most cases, reuse the client's TCP source port on the server-side connection and, depending on your underlying network architecture, the client's source IP as well.

Nagle's Algorithm, defined in RFC 896, is a congestion control mechanism designed to bundle smaller chunks of data for delivery in one big packet. The algorithm:

if there is new data to send
  if the window size >= MSS and available data is >= MSS
    send complete MSS segment now
  else
    if there is unconfirmed data still in the pipe
      enqueue data in the buffer until an acknowledge is received
    else
      send data immediately
    end if
  end if
end if
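
The pseudocode above can be expressed as a small Python function. This is a minimal sketch for illustration only: the function name, parameter names, and return labels are hypothetical and do not correspond to any LTM setting or real TCP stack API.

```python
def nagle_decision(pending_bytes, window_size, mss, unacked_bytes):
    """Return what a Nagle-enabled sender would do with its pending data.

    All names here are illustrative assumptions, not part of any real stack:
      pending_bytes  - application data waiting to be sent
      window_size    - currently usable send window, in bytes
      mss            - maximum segment size, in bytes
      unacked_bytes  - data already sent but not yet acknowledged
    """
    if pending_bytes <= 0:
        return "idle"                  # no new data to send
    if window_size >= mss and pending_bytes >= mss:
        return "send_full_segment"     # a complete MSS segment can go now
    if unacked_bytes > 0:
        return "buffer"                # hold small data until an ACK arrives
    return "send_now"                  # nothing in flight, send immediately


if __name__ == "__main__":
    print(nagle_decision(pending_bytes=10, window_size=65535, mss=1460, unacked_bytes=100))
    print(nagle_decision(pending_bytes=10, window_size=65535, mss=1460, unacked_bytes=0))
```

In this sketch, a 10-byte write with 100 bytes still unacknowledged returns "buffer", while the same write on an idle connection returns "send_now".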

Sending packets with 40 bytes of overhead (a 20-byte IP header plus a 20-byte TCP header) to carry little data is very inefficient, and Nagle's was created to address this inefficiency. Efficiency, however, is not the only consideration. Delay-sensitive applications such as remote desktop protocol can be severely impacted by Nagle's. An RDP user connecting to a terminal server expects real-time movement on the desktop presentation, but with Nagle's enabled, the sending station will queue small chunks of content while earlier data is still unacknowledged. The user can perceive this as the network being slow when, in actuality, it is performing as expected.
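
To put a number on that inefficiency, here is a rough sketch that computes the share of each packet that is actual payload. It assumes the 40 bytes of overhead mentioned above with no IP or TCP options, and the 1460-byte full segment is an assumed typical Ethernet MSS, not a value from this article. The function name is hypothetical.

```python
HEADER_OVERHEAD = 40  # bytes per packet, as cited above; assumes no IP/TCP options


def payload_efficiency(payload_bytes, overhead_bytes=HEADER_OVERHEAD):
    """Return the fraction of on-the-wire bytes that carry application data."""
    if payload_bytes < 0 or overhead_bytes < 0:
        raise ValueError("byte counts must be non-negative")
    total = payload_bytes + overhead_bytes
    return payload_bytes / total if total else 0.0


if __name__ == "__main__":
    # A single keystroke (1 byte) versus a full segment (assumed MSS of 1460).
    print(f"{payload_efficiency(1):.1%}")     # 2.4%
    print(f"{payload_efficiency(1460):.1%}")  # 97.3%
```

A one-byte payload uses about 2.4% of the packet for data, which is the case Nagle's tries to avoid by coalescing small writes.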

Even for non-real-time applications, there can be a noticeable difference on the wire, even if the end user is oblivious to the performance gain. This can come into play with automated performance scripts that alert on thresholds. For example, in one installation a first-generation load balancer was scheduled to be replaced. All TCP was simply passed by the load balancer, so the controlled optimization points were isolated to the servers. The server TCP stacks were tuned with the help of a couple of monitoring tools: one that measured the time to paint the front page of the application, and one that performed a transaction within the application. During testing, inserting the LTM with the default TCP profile negated the optimizations performed on the server TCP stacks, and the tools alerted the administrators accordingly with a twofold drop in performance. Disabling Nagle's alone resulted in a significant improvement from the default profile, but the final configuration included additional options, which will be discussed in the coming weeks.
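
On an endpoint, as opposed to the LTM profile, the standard sockets way to turn Nagle's off for a single connection is the TCP_NODELAY option. The sketch below uses Python's socket module to show the idea; the helper name is hypothetical, and this is not a claim about how the servers in the installation above were tuned.

```python
import socket


def new_socket_without_nagle():
    """Create a TCP socket with Nagle's algorithm disabled via TCP_NODELAY."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value tells the local stack to send small segments immediately
    # instead of holding them while earlier data is unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    with new_socket_without_nagle() as s:
        # Read the option back to confirm the stack accepted it.
        print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
```

Note that this only affects the sending side of that one socket. With the LTM in the path, each side of the proxy has its own independent TCP behavior, so the profile on the virtual server still needs to be considered separately.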

One warning: Nagle's and delayed acknowledgements do not play well in the same sandbox. There's a good analysis here and a commentary on their interaction by Mr. Nagle himself here.
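
A rough way to see why the two interact badly is to model an application that issues two small writes back to back. The sketch below is a deliberately simplified model, not a TCP simulator. It assumes the receiver has no data of its own to piggyback an ACK on and that neither write fills a segment. The function and parameter names are hypothetical, and the 200 ms figure in the example is an assumed timer value, not one taken from the LTM.

```python
def second_write_delay_ms(rtt_ms, delayed_ack_wait_ms, nagle_enabled, delayed_ack_enabled):
    """Estimate how long the second of two small back-to-back writes is held.

    Simplified model (assumptions, not real stack behavior):
      - the first small write is sent immediately;
      - with Nagle's on, the second write is buffered until the first is ACKed;
      - with delayed ACK on, the receiver holds that ACK for delayed_ack_wait_ms.
    """
    if not nagle_enabled:
        return 0                          # second write goes out right away
    wait = rtt_ms                         # first segment out, ACK back
    if delayed_ack_enabled:
        wait += delayed_ack_wait_ms       # receiver sits on the ACK first
    return wait


if __name__ == "__main__":
    for nagle in (True, False):
        for delack in (True, False):
            delay = second_write_delay_ms(10, 200, nagle, delack)
            print(f"nagle={nagle} delayed_ack={delack} -> {delay} ms")
```

Under these assumptions the stall only becomes large when both features are active at once: the sender waits for an ACK that the receiver is deliberately holding back.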

In conclusion, Nagle's algorithm can make your bandwidth utilization more efficient in relation to packet overhead, but a careful analysis of the overall architecture will help in deciding whether you should enable it.

Updated Nov 30, 2023

Version 3.0

application delivery

news

series-the-tcp-profile

tcp

tech tip

JRahm

Admin

Christ Follower, Husband, Father, Technologist. I love community and I especially love THIS community. My background is networking, but I've dabbled in all the F5 iStuff, I'm a recovering Perl guy, and am very much a python enthusiast. Learning alongside all of you in this accelerating industry toward modern apps and architectures.


7 Comments

L4L7_53191
Nimbostratus
Aug 03, 2011
Nice write up Jason, as usual. One note - the real big issue with Nagling is that delayed ACK problem.

Check out line 5 of the algorithm above. The line "if there is unconfirmed data still in the pipe" is the specific bit that doesn't play well with delayed ACKs: if there's an ACK outstanding Nagle will buffer. If it gets one, it'll immediately send the data. But now, you've got delayed ACK waiting for data before it sends an ACK! Then you're stuck - you're buffering on the send side (Nagle), but waiting for that delayed ACK timer to fire on the receive side.

Also, that delayed ACK timer is set on bootstrap and it can fire at any time between 1-500ms (RFC states no more than 500) when the ACK is being delayed on the receive side. So you may not get totally clean and predictable stalls - you'll just know that it's not performing well.

One last bit, regarding BigIP. If you disable Nagle, you may also want to enable "Acknowledge on Push" and test. I've seen dramatic improvements when these are done together...

--Matt
Frank_30530
Altocumulus
May 07, 2012
If Delayed ACK and Nagle are better not enabled simultaneously, why is it that in TMOS 11.1 in the tcp-wan-optimized profile both options are enabled?

--Frank
JRahm
Admin
May 18, 2012
That's a good question, Frank. My experience has been to keep Nagle's disabled, but the product development team tests far more scenarios than I do on what makes an overall more stable and performant stack. Best practice is to test with your app and tune as necessary. Each app impacts TCP differently.
Joe_Chapman_416
Nimbostratus
Sep 12, 2012
This should really be incorporated into or mentioned in the F5 - RDP deployment guide.

It definitely makes a huge difference with the RDP experience if Nagle’s Algorithm is disabled.
JRahm
Admin
Sep 24, 2012
Hi Joe, I'll pass along your recommendation to the group responsible for the deployment guides
JRahm
Admin
Sep 26, 2012
Hi Joe, the deployment guides are being updated based on your feedback! Nice work!
MR_RJ
Cirrus
Sep 18, 2013
Thanks a lot for this post.

Nagle's caused us a lot of issues over several months. I finally managed to find this post, which mentions that it could have a severe impact on remote desktop.

Once we disabled it, it all works smoothly, with no latency issues.

The whitepapers / deployment guide mention that an optimized LAN profile should be used. For some reason we changed this back to the default TCP profile while troubleshooting not-responding issues with MS Exchange on the RDS servers. And by doing that we introduced new issues without really realizing it.

Once again, thanks for an excellent post and information on what the impact might be.
