WILS: Load Balancing and Ephemeral Port Exhaustion

Understanding the relationship between SNAT and connection limitations in full proxy intermediaries.

If you’ve previously delved into the world of SNAT (which is becoming increasingly important in large-scale implementations, such as those in the service provider world) you remember that SNAT essentially provides an IP address from which a full-proxy intermediary can communicate with server-side resources and maintain control over the return routing path.

There is an interesting relationship between intermediaries that leverage two separate TCP stacks (such as full-proxies) and SNAT in terms of concurrent (open) connections that can be supported by any given “virtual” server (or virtual IP address, as they are often referred to in the industry). The number of ephemeral ports that can be used by any client IP address is 65535. Programmer types will recognize that as a natural limitation imposed by the use of an unsigned short integer (16 bits) in many programming languages.

Now, what that means is that for each SNAT address assigned to a virtual IP address, a theoretical total of 65535 connections can be open at any other single address at any given time. This is because in a full-proxy architecture the intermediary is acting as a client and while servers use well-known ports for communication, clients do not. They use ephemeral (temporary) ports, the value of which is communicated to the server in the source port field in the request. Each additional SNAT address available increases the total number of connections by some portion of that space. As you should never use ephemeral ports in the privileged range (port numbers under 1024 are traditionally reserved for firewall and other sanity checkers - see /etc/services on any Unix box) that number can be as many as 64512 available ports between the SNAT address and any other IP address. For example, if a server pool (virtual or iron) has 24 members and assuming the SNAT address is configured to use ephemeral ports in the range of 1024-65535, then a single SNAT address results in a total of 24 x 48k = 1,152k concurrent connections to the pool. If the SNAT is assigned to a virtual server that is targeting a single address (like another virtual server or another intermediate device) then the total connections is 1 x 48k = 48k connections.

Obviously this has a rather profound impact on scalability and capacity planning. If you only have one SNAT address available and you need the capabilities of a full-proxy (such as payload inspection inbound and out) you can only support a limited number of connections (and by extension, users). Some solutions provide the means by which these limitations can be mitigated, such as the ability to configure a SNAT pool (a set of dedicated IP addresses) from which SNAT addresses can be automatically pulled and used to automatically increase the number of available ephemeral ports.

Running out of ephemeral ports is known as “ephemeral port exhaustion” as you have exhausted the ports available from which a connection to the server resource can be made. In practice the number of ephemeral ports available for any given IP address can be limited by operating system implementations and is always much lower than the 65535 available per IP address. For example, the IANA official suggestion is that ephemeral ports use 49152 through 65535, which means a limitation of 16383 open connections per address. Any full-proxy intermediary that has adopted this suggestion would necessarily require more SNAT addresses to scale an application to more concurrent connections.

One of the advantages of a solution implementing a custom TCP/IP stack, then, is that they can ignore the suggestion on ephemeral port assignment typically imposed at the operating system or underlying software layer and increase the range to the full 65535 if desired. Another major advantage is making aggressive use of TIME-WAIT recycling. Normal TCP stacks hold on to the ephemeral port for seconds to minutes after a connection closes. This leads to odd bursting behavior. With proper use of TCP timestamps you can recycle that ephemeral port almost immediately.

Regardless, it is an important relationship to remember, especially if it appears that the Load balancer (intermediary) is suddenly the bottleneck when demand increases. It may be that you don’t have enough IP addresses and thus ports available to handle the load.

WILS: Write It Like Seth. Seth Godin always gets his point across with brevity and wit. WILS is an ATTEMPT TO BE concise about application delivery TOPICS AND just get straight to the point. NO DILLY DALLYING AROUND.