Forum Discussion
steelplate_8766
Nimbostratus
May 25, 2010
sNAT to Windows server and port collision
Hi,
I have an F5 doing sNAT, and the problem I face is that the Windows server keeps the port in TIME_WAIT (currently the default of 240 seconds on Windows 2003 Server). The F5 attempts to reuse the client port within that interval, which causes a port collision: the SYNs don't even get ACKs.
Windows is doing a full TCP close (FIN,ACK with an ACK response in both directions), so the F5 deems it OK to reuse the port.
My understanding is that the F5 shouldn't try to reuse this port for 2MSL, but where can I find the default MSL for the F5? I assume I should make the Windows TcpTimedWaitDelay <= the F5's 2MSL.
I tried setting the F5 to always change the client port, as this should have caused the F5 to use a new port that wasn't in use, but instead it makes the problem worse: I see the F5 try to reuse the changed client port in < 1 second. Again, I assume this is because the F5 sees a full close.
How have other users dealt with this problem? It must have affected many others.
14 Replies
- Michael_Yates
Nimbostratus
Do you have a OneConnect profile applied to the virtual server that you are having this problem with?
- hoolio
Cirrostratus
You should be able to modify this behavior with a custom TCP profile. See this post from Deb for more info on OneConnect and the TCP profile options (and a few related solutions):
OneConnect Transformations reuse from same source IP address and port number?
http://devcentral.f5.com/Forums/tabid/1082223/asg/52/showtab/groupforums/aff/31/aft/17949/afv/topic/Default.aspx
SOL7559: Overview of the TCP profile
https://support.f5.com/kb/en-us/solutions/public/7000/500/sol7559.html
SOL7208: Overview of the OneConnect profile
https://support.f5.com/kb/en-us/solutions/public/7000/200/sol7208.html
OneConnect wiki page
http://devcentral.f5.com/wiki/default.aspx/AdvDesignConfig/oneconnect
Aaron
- steelplate_8766
Nimbostratus
Sorry for the delay; I have turned on alerts so I see replies now :-)
No, we are not using OneConnect.
Our server is Windows 2003, which has the TcpTimedWaitDelay registry setting available, and we can set it to 1 second, lower than the documented setting. However, according to MS, on Server 2008, if you set it below 30 seconds it reverts to the default of 240 seconds without informing you.
Our problem occurs when the server initiates the close and goes through the RFC 793 close process: the server ends up in TIME_WAIT for 2xMSL. By default this is 240 seconds on Windows 2003 Server.
However, the F5 immediately tries to reuse this port (<150 ms between sending the ACK for the server's FIN and sending the new SYN). The server ignores the SYN, the F5 retries the SYN 3 times, then sends a RST to the client. The client then shows an error to the user.
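The failure sequence described above can be illustrated with a toy model. This is only a sketch, not how Windows or the F5 is implemented: the `TimeWaitServer` class, the addresses, and the port numbers are hypothetical, and the model deliberately ignores any TIME_WAIT reopen logic a real stack may have.

```python
from dataclasses import dataclass, field


@dataclass
class TimeWaitServer:
    """Toy server that ignores SYNs for a 4-tuple still held in TIME_WAIT."""

    time_wait_s: float = 240.0  # Windows 2003 default quoted in this thread
    _expiry: dict = field(default_factory=dict)

    def close(self, four_tuple, now):
        # The server initiated the close, so it is the side holding TIME_WAIT.
        self._expiry[four_tuple] = now + self.time_wait_s

    def on_syn(self, four_tuple, now):
        """Return True if the SYN would be answered, False if it is ignored."""
        expires = self._expiry.get(four_tuple)
        if expires is not None and now < expires:
            return False
        self._expiry.pop(four_tuple, None)
        return True


# Hypothetical (SNAT ip, SNAT port, server ip, server port) tuple.
conn = ("10.0.0.5", 40001, "10.0.0.20", 80)

server = TimeWaitServer(time_wait_s=240.0)
server.close(conn, now=0.0)

print(server.on_syn(conn, now=0.150))  # reuse 150 ms after the close
print(server.on_syn(conn, now=241.0))  # reuse after TIME_WAIT has expired
```

In this model, lowering `time_wait_s` to 1 second shrinks the window but does not remove it: a reuse 150 ms after the close is still ignored.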
We have added a large range of IPs to the sNAT pool, set "Port Reuse" (not strict), and set the server's TcpTimedWaitDelay to 1 second to minimize the problem, but it could still occur.
If we have no pool and use "Port Change", we see the client fail immediately, as the second TCP connection will always be a reuse.
If the F5 is following the RFC, it should respect the MSL before reusing a port, so I am assuming the F5 has some really low MSL and this mismatch is the problem?
From our F5:
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_rfc1337 = 0
- Jerome_42901
Nimbostratus
I've had this port collision issue as well, on an application we load balance that involves very short-lived connections.
To address the issue, I created a custom FastL4 profile with a custom TCP close timeout that matches the TcpTimedWaitDelay we set on the Windows servers (which we lowered from 240s to 30s, iirc), and added more IPs to the SNAT pool. We have not faced another port collision issue since this "fix" went live.
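A rough way to see why both changes help is to estimate how many new connections per second a SNAT pool can sustain toward a single server ip:port if every (SNAT IP, port) pair has to rest for the server's TIME_WAIT before reuse. This is back-of-the-envelope arithmetic only; the function name and the usable port range are assumptions, and the figure is a ceiling that is only reached if the device never picks a recently used pair.

```python
def max_new_connections_per_second(snat_ips, ports_per_ip, time_wait_s):
    """Upper bound on sustained new connections/s to ONE server ip:port
    when each (snat_ip, port) pair must stay idle for time_wait_s."""
    if time_wait_s <= 0:
        raise ValueError("time_wait_s must be positive")
    return snat_ips * ports_per_ip / time_wait_s


# Assumption: ports 1025-65535 are usable; the real range depends on the device.
PORTS_PER_IP = 65535 - 1025 + 1

for ips, time_wait in [(1, 240), (1, 30), (8, 30)]:
    rate = max_new_connections_per_second(ips, PORTS_PER_IP, time_wait)
    print(f"{ips} SNAT IP(s), TIME_WAIT {time_wait}s: ~{rate:.0f} conn/s")
```

Shortening TIME_WAIT and adding SNAT addresses both raise the ceiling, which lowers the odds of a collision without guaranteeing that one never happens.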
I hope it helps.
Edit: I'm not sure toying with sysctl on the F5 itself will help in any way, as I think it would only affect TCP connections to the F5's own services (ssh, https, whatever) but not to the VIP.
- steelplate_8766
Nimbostratus
tcp_close timer shouldn't make any difference in my case anyway
http://tools.ietf.org/html/rfc793#section-3.5
has this diagram
    TCP A                         TCP B
1.  ESTABLISHED                   ESTABLISHED
2.  (Close)
    FIN-WAIT-1   -->      -->     CLOSE-WAIT
3.  FIN-WAIT-2   <--      <--     CLOSE-WAIT
4.                                (Close)
    TIME-WAIT    <--      <--     LAST-ACK
5.  TIME-WAIT    -->      -->     CLOSED
6.  (2 MSL)
    CLOSED
The problem I have is that the server goes to point 6, and within the 2MSL period a new SYN comes in (after 150 ms; the close_wait timer is 5 seconds by default, I think).
- steelplate_8766
Nimbostratus
I have been reading RFC 1337, "TIME-WAIT Assassination Hazards", and running some tests direct to the Windows server (no F5). I can see port reuse succeed within the TIME_WAIT period, due to MS implementing this feature.
However, when using sNAT on the F5, the ISN generated doesn't fall within a range MS considers valid for TIME_WAIT assassination, so the server ignores it.
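To make the ISN point concrete, here is a sketch of one acceptance rule that BSD-derived stacks are commonly described as using: a SYN arriving in TIME_WAIT is accepted only if its ISN is "after" the last sequence number seen on the old connection, compared in 32-bit modular arithmetic. Whether Windows applies this exact rule is an assumption, not something confirmed in this thread, and the function names are hypothetical.

```python
MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


def seq_gt(a, b):
    """True if sequence number a is 'after' b in 32-bit modular arithmetic."""
    diff = (a - b) % MOD
    return 0 < diff < (1 << 31)


def syn_accepted_in_time_wait(new_isn, last_seq_seen):
    # Assumed BSD-style rule; not confirmed for the Windows TCP stack.
    return seq_gt(new_isn, last_seq_seen)


print(syn_accepted_in_time_wait(5000, 4000))        # ISN ahead of the old connection
print(syn_accepted_in_time_wait(3000, 4000))        # ISN behind the old connection
print(syn_accepted_in_time_wait(10, 0xFFFFFFF0))    # ahead, across the 32-bit wrap
```

Under a rule like this, a client that draws each ISN independently of the previous connection on the same 4-tuple would land "behind" roughly half the time, which would look like intermittently ignored SYNs.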
Does anyone know if the F5 is assuming assassination works, hence quickly reusing the port, and if so, can this be turned off for outbound connections?
Also, does anyone know what algorithm the F5 uses for ISN generation? (I know about the security aspects of asking such a question, but with respect to RFC 1337 they could give enough detail to see if it's compatible with MS's TCP stack.)
- Hamish
Cirrocumulus
Are you translating or preserving client source ports?
- steelplate_8766
Nimbostratus
We are preserving ports (NOT strict).
- Hamish
Cirrocumulus
Disable the preserving of ports... I suspect strict vs non-strict is simply a matter of whether the port is already open or not, and ignores 2xMSL completely...
Source port preservation isn't going to help much anyway...
H
- steelplate_8766
Nimbostratus
I tried setting the F5 to always change the client port, as this should have caused the F5 to use a new port that wasn't in use, but instead it makes the problem worse: I see the F5 try to reuse the changed client port in < 1 second. Again, I assume this is because the F5 sees a full close.
See above, from my first post. I prefer to have the option to preserve ports, though, as it makes Wireshark views of the F5 tcpdump a little less painful.
