Forum Discussion
pagema1_69881
Nimbostratus
Mar 08, 2010
Very Slow Application Performance Behind F5
We have one application that performs very poorly behind the F5. There is a 9 second delay on the initial GET request going through the VIP. If we bypass the F5 and go to the servers directly, there is no delay. Wireshark shows a lot of reassembled PDUs. I'm no guru with captures, so I'm not sure what this means. Here is our setup:
SSL Offloading VIP.
one http pool with 2 members.
TCP LAN/WAN optimized profiles on the VIP, with a OneConnect profile.
We are using SNAT.
We tried disabling Nagle's algorithm, no effect.
We tried enabling proxy max segment, no effect.
We tried going through the F5 using HTTP only, no effect.
If we connect to the servers directly, that 9 second initial delay vanishes.
No packet loss on the NICs.
The switch is set to 100 Mb full duplex, as are the F5 NICs.
Two LTM 3400s in an HA pair, version 10.0.1.
We do have a case open with support, but they have not been able to identify the issue from our tcpdumps. Has anyone seen this type of delay only on the initial GET request? Any tips on improving performance? Our other applications behind the F5 don't have this delay.
Thanks,
Marc
27 Replies
- Mark_Cloutier
Nimbostratus
Just a guess, but does your web server have redundant NICs? I ran into a situation where load balanced servers with teamed NICs were load balancing transmitted traffic. This caused problems with the v9.4.6 LTM, where I was running a FastL4 profile and doing SSL termination, since it was tracking the connection based on source MAC, which was changing. Once we changed the teamed NIC config to only use the secondary NIC in the event of failure on the first, it worked fine.
- William_64205
Nimbostratus
Is the website multi-tiered? Does it also connect to other databases?
- pagema1_69881
Nimbostratus
First of all, thanks for so much feedback.
The web server NICs were teamed, but we broke the team and it made no difference.
We don't use self-signed certs. They are from the cert authority.
The app is multi-tiered and talks to a DB cluster and also SQL Server reporting servers.
We will disable NetBIOS on the NICs and post an update if that helps. So far we have not identified the issue, but we are still researching.
Marc
- pagema1_69881
Nimbostratus
Disabling NetBIOS on the NIC worked!!! We are golden now with no 9 second delay. Thanks for everyone's input!
Marc
- William_64205
Nimbostratus
Wow, disabling NetBIOS on the NIC worked? I have never heard of that affecting it.
- JRahm
Admin
Maybe this?
http://support.microsoft.com/kb/166159
- William_64205
Nimbostratus
9 seconds =~ 3 retries x 3 seconds. The relevant setting "specifies the number of times the system will retry NetBIOS name query broadcasts. The default is 3." The timeout for each NetBIOS query is ~3 seconds. Mind you, these are rough numbers for the query timeout. But a Google search for Windows 2003 NetBIOS issues turns up a few reports of this.
Nice thinking, hoolio.
- hoolio
Cirrostratus
The NetBIOS over TCP/IP wiki page has some details on what NetBIOS is used for:
http://en.wikipedia.org/wiki/NetBIOS_over_TCP/IP
Were you able to see what host the server was performing the lookup for? It would be quite strange if the server was looking up the client IP address.
Aaron
- L4L7_53191
Nimbostratus
Maybe this then (source: Wikipedia). Having the VIP in line somehow affected this process and the lookups were breaking? An interesting bit of data may be to re-enable it on the NIC, then dump that UDP 137 traffic and see exactly what is going on in Wireshark (if you've not already done this).
"In order to start sessions or distribute datagrams, an application must register its NetBIOS name using the name service. NetBIOS names are 16 bytes in length and vary based on the particular implementation. Frequently, the 16th byte is used to designate a "type" similar to the use of ports in TCP/IP. In NBT, the name service operates on UDP port 137 (TCP port 137 can also be used, but it is rarely if ever used)."
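The 16-byte name layout described in the quote can be sketched in Python. This is only an illustration: the host name WEB01 and the helper name are made up, and the space padding and upper-casing reflect common NetBIOS practice rather than anything stated in this thread.

```python
def netbios_name(name: str, suffix: int = 0x00) -> bytes:
    """Build a 16-byte NetBIOS name: 15 name bytes plus a 1-byte type suffix."""
    if len(name) > 15:
        raise ValueError("NetBIOS names hold at most 15 characters before the suffix")
    # Pad with spaces to 15 bytes, then append the type byte as the 16th byte.
    return name.upper().ljust(15).encode("ascii") + bytes([suffix])


# Hypothetical host name; 0x20 is the suffix commonly used for the file server service.
raw = netbios_name("WEB01", 0x20)
print(len(raw), raw)
```

When looking at the UDP 137 traffic in a capture, the last byte of each queried name is the quickest hint as to which service the server is trying to find.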
-Matt
- Joel_Moses
Nimbostratus
I know the issue's been solved, but I thought I'd share some things I've learned about Windows web server loadbalancing, NetBIOS, and Windows Integrated Authentication.
The behavior described above is absolutely the behavior of a Windows box that has its NetBIOS node-type set incorrectly and is set to use Integrated auth. What it's doing here is attempting to locate domain resources based on the incoming interface IP address. The system probably doesn't have a WINS server set (making it a b-node), or may have a WINS server set to which it can't talk. In both cases, it's sending out either a broadcast or a WINS query which isn't getting a response, and it times out after 9 seconds (3x3). I run into this all the time with "multi-tier" Windows apps that have their Web front-ends in a firewalled DMZ.
It gets even worse when the system has Windows Integrated Auth enabled and can communicate with a WINS server but not the AD domain for which the WINS host is primary. The delay will be equal to 3 times the number of domain controllers that WINS reports are in the domain. In a big environment, this could take up to two minutes and stall the connection the whole time.
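As a rough back-of-the-envelope check of those numbers, here is a small Python sketch. The 3-second figure is the approximate per-attempt timeout quoted in this thread, not a documented constant, and the function name is hypothetical.

```python
def worst_case_stall(attempts: int, timeout_s: float = 3.0) -> float:
    """Seconds spent waiting if every attempt times out with no response."""
    return attempts * timeout_s


# Three unanswered name queries: the 9 second delay from the original post.
nine = worst_case_stall(3)

# One timed-out attempt per domain controller: 40 DCs is about two minutes.
two_min = worst_case_stall(40)

print(nine, two_min)
```

Under these assumptions, a stall of around two minutes corresponds to roughly 40 domain controllers being reported by WINS.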
This can occur in two places: when the user connects and authenticates to the front-end, or when the front-end tries to authenticate to a WCF web service on the back-end. In both cases, the solution is the same: you must modify either the authentication order (Negotiate,NTLM becomes NTLM,Negotiate - http://support.microsoft.com/kb/215383) or disable Windows Integrated auth.
There's another little gotcha that can cause a connection to stall. If you've got WIA enabled and Negotiate mode comes up first as an authentication offer, most modern Windows systems will attempt to locate Kerberos services first, then fall back to NTLM. This works fine on an internal network, but not if you leave this setting in place when you move the app to be Internet-facing but choose (smartly) to leave your Kerberos servers unexposed. In a strange little twist, the Kerberos lookups usually last longer than the browser's TCP timeout, leading to a "Page cannot be displayed" error. The browser will try to find a domain Kerberos server, never failing over to NTLM as it should.
This iRule works to knock down the Negotiate header from being sent to the browser, keeping this from occurring. It should be installed on the F5 that's handling incoming connections from the Internet.

    when RULE_INIT {
        # Set to 1 to log each time the Negotiate offer is replaced.
        set ::negotiate_rule_debug 0
    }
    when HTTP_REQUEST {
        set negotiate_disable 0
        if { [HTTP::header exists "Authorization"] } {
            # Client is already authenticating; leave the response alone.
            set negotiate_disable 0
        } else {
            set auth_host [string tolower [HTTP::host]]
            set negotiate_disable 1
        }
    }
    when HTTP_RESPONSE {
        # On 401 requests, if we've got the Negotiate holddown flag, remove the
        # WWW-Authenticate headers for Negotiate and keep only the NTLM and
        # basic auth ones headed to the client. The realm header on the basic
        # auth is set to the present hostname captured in HTTP_REQUEST.
        if { ($negotiate_disable) && ([HTTP::status] == "401") } {
            HTTP::header remove WWW-Authenticate
            HTTP::header insert "WWW-Authenticate" "NTLM"
            HTTP::header insert "WWW-Authenticate" "Basic realm=\"$auth_host\""
            unset auth_host
            unset negotiate_disable
            if { ($::negotiate_rule_debug) } {
                log local0. "Replacing Negotiate for initial authentication."
            }
        }
    }
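For anyone who wants to reason about what the iRule does to a response without an F5 at hand, here is a minimal Python sketch of the same header rewrite. It is not F5 code: the function name, the header list format, and the host name app.example.com are all assumptions made for illustration.

```python
def strip_negotiate(status, headers, host, had_authorization):
    """Mimic the iRule: on a 401 to a client that sent no Authorization
    header, drop every WWW-Authenticate offer and re-add NTLM and Basic only.

    headers is a list of (name, value) tuples; a new list is returned.
    """
    if had_authorization or status != 401:
        return list(headers)
    kept = [(n, v) for n, v in headers if n.lower() != "www-authenticate"]
    kept.append(("WWW-Authenticate", "NTLM"))
    kept.append(("WWW-Authenticate", f'Basic realm="{host.lower()}"'))
    return kept


# Hypothetical 401 response from a server offering Negotiate first.
response = [
    ("Content-Type", "text/html"),
    ("WWW-Authenticate", "Negotiate"),
    ("WWW-Authenticate", "NTLM"),
]
rewritten = strip_negotiate(401, response, "App.Example.com", had_authorization=False)
for name, value in rewritten:
    print(f"{name}: {value}")
```

With the Negotiate offer gone, the browser has no reason to go looking for a Kerberos server and should start with NTLM (or basic auth) right away.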
