Forum Discussion
Shayne_Rinne_84
Nimbostratus
Apr 24, 2008
F5 as a default gateway
Hello,
We are running CA SiteMinder policy servers on Solaris 8 behind a BIG-IP LTM, and many of our connections to Active Directory LDAP user directories are going into a TCP IDLE state. This eventually cripples the policy server, and we have to restart it to clear the IDLE connections. The LTM is only acting as a default gateway, and the internal-to-external VIP uses the Performance (Layer 4) type. Could this be the cause of the IDLE state? What can we do to determine whether the F5 is causing the IDLE connections?
4 Replies
- hoolio
Cirrostratus
Hi,
If either the client or server attempts to close the connection with a FIN or RST, BIG-IP should honor that and close the corresponding connection. You can check what the BIG-IP is tracking in its connection table by running 'b conn all show all'.
If the connections aren't being closed by either the client or server, and you want the BIG-IP to reap them sooner than it is now, you can lower the idle timeout on the FastL4 profile. By default it's set to 300 seconds. You can view/modify the setting under Local Traffic >> Profiles >> Protocol >> FastL4 >> Idle Timeout. You might want to create a custom FastL4 profile if you end up modifying the setting.
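As a config fragment, the steps above look roughly like the following (v9 bigpipe syntax; "fastl4_ldap" is a made-up profile name, and the exact attribute keywords can vary by version, so verify against `b profile fastL4 help` on your unit):

```shell
# Inspect what BIG-IP currently holds in its connection table
b conn all show all

# Create a custom FastL4 profile with a longer idle timeout (in seconds)
# so the default profile on other virtual servers is left untouched
b profile fastL4 fastl4_ldap idle timeout 1800
```

You would then attach the custom profile to the forwarding virtual server in place of the default fastL4 profile.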
There are a few related AskF5 solutions:
SOL7166: Configuring BIG-IP to close idle connections
https://support.f5.com/kb/en-us/solutions/public/7000/100/sol7166.html?sr=685167
SOL5401: Idle connections may be allowed to exist after the idle timeout expires
https://support.f5.com/kb/en-us/solutions/public/5000/400/sol5401.html?sr=685167
SOL7412: The output from the "bigpipe conn show all" command does not correctly display the idle time for PVA assisted connections
https://support.f5.com/kb/en-us/solutions/public/7000/400/sol7412.html?sr=685167
Aaron
- Shayne_Rinne_84
Nimbostratus
We are still having an issue with this, and we have narrowed the problem down to what appears to be a configuration interaction between Solaris 8, the F5, and MS 2000/2003. We ran a sniffer against the Solaris server and determined that when MS AD closes the connection, the TCP connections on the Solaris server show up as IDLE in lsof and netstat.
Flow:
Solaris establishes a connection with MS
At random points MS sends a FIN, ACK to close the connection
Solaris sends an ACK to MS
1 minute later MS sends a RST, ACK to Solaris
Solaris changes the TCP state to IDLE
The applications involved are the SiteMinder policy server on Solaris and Active Directory on MS. We have only seen this IDLE state issue on Solaris servers using the F5 as a default gateway.
netstat output of an IDLE TCP connection:
*.* *.* 0 0 24576 0 IDLE
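One way to confirm the FIN/ACK/RST sequence described above is a capture on the Solaris side. This is a diagnostic sketch, not a tested recipe: the interface name, LDAP port 389, and output path are all assumptions to adjust for your environment.

```shell
# Capture the teardown on the Solaris server (interface and port assumed)
snoop -d hme0 -o /tmp/ldap.cap port 389

# Then replay the capture verbosely and look at the TCP flags to spot
# the FIN, the lone ACK from Solaris, and the RST ~60 seconds later
snoop -i /tmp/ldap.cap -v | grep -i 'Flags'
```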
The F5 inbound-to-outbound virtual server has Performance Layer 4 as its type.
We are looking to rule out the F5 as the cause of the problem and are having difficulty doing so. Help?
- Hamish
Cirrocumulus
Sounds like a classic firewall/F5 long-lived connection problem. If the connection through the F5 is idle for long enough, it is flushed from the connection table. Then, when the connection is eventually closed by one end or the other, the packets (FIN/FIN-ACK/ACK) are dropped because there is no connection table entry to say where they should be forwarded.
There are a couple of options...
1. Enable tcp keepalives on the hosts, and set the keepalive timeouts lower than the connection table idle timeout (Only works if one of them requests SO_KEEPALIVE on the socket after creating it).
2. Increase the connection table idle timeout. (May cause the tables to grow quite large, so you're better off making that change on a VS specific to that port & destination IP).
3. Create a new tcp profile that uses loose-initiation/close. (Which will create a new connection table entry on any packet, not just the SYN/SYNACK/ACK sequence).
Option 3 is probably the best one. It won't affect connection table size; however, it doesn't work if there is more than one way to route the packet and there is no shared state on the next hop (e.g. when creating a firewall sandwich, the F5 has no way of knowing which of the pool members was used for the original connection, so it might guess wrong on the loosely opened one). But if you're just forwarding through a single gateway, or your gateways sync state, or the next hop doesn't do stateful connection tracking, then you're in luck.
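On the application side, option 1 boils down to one socket option. Here is a minimal Python sketch of the idea; the per-socket TCP_KEEP* constants are Linux-specific, and on Solaris 8 the interval is instead a global ndd tunable, so treat the numbers and names as illustrative assumptions.

```python
import socket

def make_keepalive_socket(idle_secs=120, interval_secs=30, probes=4):
    """Create a TCP socket whose keepalives fire well inside the
    BIG-IP connection-table idle timeout (300 seconds by default)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # This is the SO_KEEPALIVE request Hamish mentions
    s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Per-socket keepalive tuning is platform specific; these constants
    # exist on Linux. On Solaris 8 the rough equivalent is global:
    #   ndd -set /dev/tcp tcp_keepalive_interval 120000
    if hasattr(socket, "TCP_KEEPIDLE"):
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_secs)
    if hasattr(socket, "TCP_KEEPINTVL"):
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_secs)
    if hasattr(socket, "TCP_KEEPCNT"):
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return s

sock = make_keepalive_socket()
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```

The point is that keepalive probes count as traffic, so the BIG-IP connection table entry never goes idle long enough to be reaped.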
H - Shayne_Rinne_84
Nimbostratus
Thank you for the replies. It turns out that loose initiation in combination with loose close is the issue. The F5 is sending a RST 60 seconds after the first FIN to both the Solaris and MS servers. This causes the Solaris server to create a TCP IDLE state connection that can only be cleared by restarting the process holding it. We understand the reset to be sent because of the combination of loose close enabled, a 60-second TCP close timeout, and reset on close enabled. We are looking at our options and have come up with three:
1. Turn off reset on close
2. Turn off loose close
3. Increase the close timeout to match the IDLE timeout
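Each of the three options above maps to a one-line change on a custom FastL4 profile. A config sketch in v9 bigpipe syntax ("fastl4_ldap" is a hypothetical profile name, and the exact attribute keywords may differ on your version, so check `b profile fastL4 help` first):

```shell
# Option 1: stop sending a RST when the close timeout expires
b profile fastL4 fastl4_ldap reset on timeout disable

# Option 2: disable loose close so BIG-IP tracks the full FIN handshake
b profile fastL4 fastl4_ldap loose close disable

# Option 3: raise the TCP close timeout to match the idle timeout
b profile fastL4 fastl4_ldap tcp close timeout 300
```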
Any recommendations?