Investigating the LTM TCP Profile: Quality of Service

Introduction
The LTM TCP profile has over thirty settings that can be manipulated to enhance the experience between client and server.  Because the TCP profile is applied to the virtual server, the flexibility exists to customize the stack (in both client & server directions) for every application delivered by the LTM.  In this series, we will dive into several of the configurable options and discuss the pros and cons of their inclusion in delivering applications.
  1. Nagle's Algorithm
  2. Max Syn Retransmissions & Idle Timeout
  3. Windows & Buffers
  4. Timers
  5. QoS
  6. Slow Start
  7. Congestion Control Algorithms
  8. Acknowledgements
  9. Extended Congestion Notification & Limited Transmit Recovery
  10. The Finish Line
Quick aside for those unfamiliar with TCP: the transmission control protocol (layer 4) rides on top of the internet protocol (layer 3) and is responsible for establishing connections between clients and servers so data can be exchanged reliably between them. 
Normal TCP communication consists of a client and a server, a three-way handshake, reliable data exchange, and a four-way close.  With the LTM as an intermediary in the client/server architecture, the session setup/teardown is duplicated, with the LTM playing the role of server to the client and client to the server.  These sessions are completely independent, even though the LTM can duplicate the TCP source port over to the server-side connection in most cases and, depending on your underlying network architecture, can also duplicate the source IP.
Why QoS?
First, let's define QoS as it is implemented in the profile: the capability to apply an identifier to a specific type of traffic so the network infrastructure can treat it differently from other types. So now that we know what it is, why is it necessary? There are numerous reasons, but let's again consider the remote desktop protocol. Remote users expect immediate response to their mouse and keyboard movements. If a large print job is released and its packets reach the campus egress point toward the remote branch ahead of the terminal server responses, a standard first-in, first-out router queue will process the print packets first, delaying the user session enough that the user notices. Implementing a queuing strategy at the egress (at least) will ensure the higher priority traffic gets attention before the print job.
QoS Options

The LTM supports setting priority at layer 2 with Link QoS and at layer 3 with IP ToS. Either can be configured on a pool, in a virtual server’s TCP/UDP profile, or in an iRule. The Link QoS field is three bits within the VLAN tag of an Ethernet frame, so its value must be between zero and seven. The IP ToS field in the IP packet header is eight bits long, but only the six most significant bits represent DSCP. This is depicted in the following diagram:

The precedence level at both layers runs low to high in terms of criticality: zero is the standard “no precedence” setting and seven is the highest priority. Things like print jobs and stateless web traffic can be assigned lower in the priority scheme, whereas interactive media or voice should be higher. RFC 4594 is a guideline for establishing DSCP classifications. DSCP, or Differentiated Services Code Point, is defined in RFC 2474. DSCP provides not only a method to prioritize traffic into classes, but also a way to assign a drop probability within those classes. The drop probability also runs low to high: a higher value means the traffic is more likely to be dropped. The table below shows the precedence and drop probability bits, along with the corresponding DSCP value (in decimal) and class name. For the IP ToS setting on the LTM, whether in a profile, a pool, or an iRule, use the value in the IP::tos column rather than the DSCP value itself. The IP::tos value is four times the DSCP value because the DSCP bits occupy the six most significant bits of the ToS byte, which shifts them left by two bit positions (for example, af21 is DSCP 18, so its IP::tos value is 18 × 4 = 72).
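The relationship between the DSCP value and the IP::tos value can be checked with plain Tcl arithmetic inside an iRule. This is a minimal sketch rather than a production rule: it assumes IP::tos returns the full ToS byte as a decimal integer (which is what the comparisons against 72 and 40 later in this article imply), and the log message format is purely illustrative.

```tcl
when CLIENT_ACCEPTED {
    # Assumption: IP::tos returns the whole eight-bit ToS byte in decimal.
    set tos [IP::tos]

    # DSCP is the upper six bits, so discard the two low-order bits.
    set dscp [expr {$tos >> 2}]

    # Going the other way: af21 is DSCP 18, and 18 << 2 = 72,
    # which is the IP::tos value listed for af21 in the table.
    set af21_tos [expr {18 << 2}]

    log local0. "ToS=$tos DSCP=$dscp af21-as-ToS=$af21_tos"
}
```

Logging on every connection is only reasonable for a quick test; remove the log line before using anything like this on a busy virtual server.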

DSCP Mappings for IP::tos Command
Precedence  Drop Probability  DSCP Class  DSCP Value  IP::tos Value
000         00                none        0           0
001         00                cs1         8           32
001         01                af11        10          40
001         10                af12        12          48
001         11                af13        14          56
010         00                cs2         16          64
010         01                af21        18          72
010         10                af22        20          80
010         11                af23        22          88
011         00                cs3         24          96
011         01                af31        26          104
011         10                af32        28          112
011         11                af33        30          120
100         00                cs4         32          128
100         01                af41        34          136
100         10                af42        36          144
100         11                af43        38          152
101         00                cs5         40          160
101         11                ef          46          184
110         00                cs6         48          192
111         00                cs7         56          224

 

The cs classes are the original IP precedence (pre-dating DSCP) values. The assured forwarding (af) classes are defined in RFC 2597, and the expedited forwarding (ef) class is defined in RFC 2598 (since obsoleted by RFC 3246). So for example, traffic in af33 (class 3) will have higher priority than traffic in af21 (class 2), but will experience greater drops than traffic in af31, which is in the same class with a lower drop probability.
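To make the class and drop probability comparison concrete, the two fields can be pulled out of a DSCP value with shifts and a mask. The sketch below assumes, as the table does, that the upper three DSCP bits are the class and the next two bits are the drop probability; the variable names and log format are illustrative only.

```tcl
when CLIENT_ACCEPTED {
    # DSCP is the upper six bits of the ToS byte.
    set dscp [expr {[IP::tos] >> 2}]

    # Upper three DSCP bits: the class (the Precedence column).
    set class [expr {$dscp >> 3}]

    # Next two bits: the drop probability within an af class.
    set drop [expr {($dscp >> 1) & 3}]

    # af33 (DSCP 30) yields class 3, drop 3.
    # af21 (DSCP 18) yields class 2, drop 1.
    # af31 (DSCP 26) yields class 3, drop 1.
    log local0. "DSCP=$dscp class=$class drop=$drop"
}
```

Read this way, af33 outranks af21 on class (3 versus 2) but is dropped sooner than af31 because its drop value is 3 rather than 1.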
Application

As indicated above, the Link QoS and IP ToS settings can be applied globally to all traffic hitting a pool, or to all traffic hitting a virtual server to which the profile is applied. They can also be set selectively with iRules or, just as cool, read to make a forwarding decision. In this example, requests that arrive marked AF21 (DSCP 18, IP::tos 72) are forwarded to the platinum server pool, requests marked AF11 (DSCP 10, IP::tos 40) to the gold pool, and all others to the standard pool.

when CLIENT_ACCEPTED {
    # 72 is the IP::tos value for AF21 (DSCP 18)
    if { [IP::tos] == 72 } {
        pool platinum
    # 40 is the IP::tos value for AF11 (DSCP 10)
    } elseif { [IP::tos] == 40 } {
        pool gold
    } else {
        pool standard
    }
}

In this example, set the Ethernet priority on traffic to the server to three if the request came from the 10.10.10.0/24 network:

when CLIENT_ACCEPTED {
    # Match any client in 10.10.10.0/24
    if { [IP::addr [IP::client_addr]/24 equals "10.10.10.0"] } {
        LINK::qos serverside 3
    }
}
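
The layer 3 marking can be set from an iRule in a similar way. The following is a sketch under stated assumptions, not a tested rule: it assumes that calling IP::tos with a value sets the ToS byte on packets sent to the peer in the current event context (the server side in SERVER_CONNECTED), and the choice of af41 for this subnet is a hypothetical policy. Check the iRules reference for your BIG-IP version and confirm the result with a packet capture.

```tcl
when SERVER_CONNECTED {
    # Hypothetical policy: mark server-bound traffic for clients
    # in 10.10.10.0/24 as af41.
    if { [IP::addr [IP::client_addr]/24 equals "10.10.10.0"] } {
        # af41 is DSCP 34; the IP::tos value is 34 << 2 = 136.
        IP::tos 136
    }
}
```

Note that the value passed is the IP::tos column from the table, not the raw DSCP value.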
Final Thoughts

Note that setting the Link QoS and/or IP ToS values does not by itself guarantee quality of service. A QoS architecture needs to be implemented in the network before these markings will be honored. The LTM can play a role in the QoS strategy because it can mark traffic more accurately, and at lower cost, than the router or switch to which it is connected. Knowing your network, or communicating with the teams that do, will go a long way toward getting value out of these features.

Updated Nov 30, 2023
Version 3.0
  • Please fix the table. next prec bit will be 100 after 011. But you have 110 after 011.
  • Can the f5 simply be configured to trust DSCP tags from the network? We already have all our tagging done on the access edge, but the f5 is resetting all of them. It would be ideal to allow the f5 to just trust the existing DSCP tags.
  • Default QoS is set to pass through at the TCP profile and pool levels. Check whether that is the case in your environment. Also, if this is for BIG-IP VE, there is a known issue where vswitches in some hypervisor environments strip the QoS field. You might take a tcpdump and check the field on outgoing packets to see whether the BIG-IP is setting (or passing) it properly.