Forum Discussion
Steve_Scott_873
Jan 28, 2010
Historic F5 Account
Unexpected TMOS utilisation hit
I have a BIG-IP 6400 platform running LTM v9.4.7.
We have an iRule to direct HTTP requests coming into the F5 from the datacentre out to the appropriate external service (For XML messaging). This is a lot cleaner than having 250 vips for 250 different services.
The rule matches the HTTP host to a pool defined on the server, so if we need to update a host, we can do this via the web interface, without making any changes to the iRule by hand (= lower risk, easier operation).
Slap the iRule on the virtual server and a serverssl profile on there and you've got encrypted traffic going out.
We've also got some error handling to deal with people trying to get to services they shouldn't from this particular vip, and error messages if the external service is down.
Here is the code:
# iRule to direct HTTP requests based on hostname
when HTTP_REQUEST {
    # Extract hostname - needs to be lower case to match pool name
    set host [string tolower [HTTP::host]]
    # Check that the hostname (and therefore the pool name the request will be sent to)
    # ends with the correct domain - prevents using this VS to hop out to a different pool
    if { ($host ends_with ".bob.com") and !($host contains ".test.")} {
        if { [catch { pool $host } ] } {
            # No matching pool name - so move on to error
            HTTP::respond 404 content "Endpoint not defined"
        }
    } else {
        HTTP::respond 403 content "Invalid URL"
    }
}
when LB_FAILED {
    HTTP::respond 504 content "Endpoint Unavailable"
}
So far, so good. We've tested, we've run iRule benchmarking, and it's reasonably efficient...
config # cat /proc/cpuinfo
model name : AMD Opteron(tm) Processor 246
stepping : 1
cpu MHz : 1992.276
config # bigpipe rule Prod_Spine_Generic_Routing show all
RULE Prod_Spine_Generic_Routing
+-> HTTP_REQUEST 3123 total 0 fail 0 abort
| | Cycles (min, avg, max) = (17437, 51642, 92343)
+-> LB_FAILED 0 total 0 fail 0 abort
| Cycles (min, avg, max) = (0, 0, 0)
When that's spreadsheeted it comes back with 40,000 TPS, which is perfectly reasonable for the amount of traffic we're expecting.
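The spreadsheet step can be reproduced with a minimal sketch like the one below. It assumes that the cycle counts reported by bigpipe are CPU clock cycles on a single core running at the clock speed shown in /proc/cpuinfo, and that the CPU does nothing except run the rule; the function name is made up for illustration.

```python
# Figures copied from the post: CPU clock from /proc/cpuinfo and the
# HTTP_REQUEST cycle counts reported by "bigpipe rule ... show all".
CPU_HZ = 1992.276e6
CYCLES = {"min": 17437, "avg": 51642, "max": 92343}


def max_requests_per_second(cycles_per_request, cpu_hz=CPU_HZ):
    """Upper bound on requests/s if one core did nothing but run the rule.

    Assumption: reported cycles are clock cycles of a single core.
    """
    return cpu_hz / cycles_per_request


for label, cycles in CYCLES.items():
    print(f"{label}: {max_requests_per_second(cycles):,.0f} requests/s")
```

On the average figure this works out to a little under 39,000 requests per second, which is in line with the roughly 40,000 TPS quoted.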
So at that point we said, we've got this sorted, put it into production and carried on to the next piece of work.
We've now got people using this VIP, and during their first bulk load run they got to the dizzying heights of 12 HTTP requests per second. During this period the TMOS utilisation moved from its usual 1-2% to 25%, and when they backed off to 6 TPS it reduced to 10-12%.
Clearly something's not right here, but the timing stats (produced from production data) show it's all fine. There has to be some sort of hidden cost somewhere, but I can't see any obvious place it's coming from.
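To put a rough size on that hidden cost, here is a hedged back-of-the-envelope sketch. It assumes the utilisation figure refers to a single core at the clock speed from /proc/cpuinfo, takes 1.5% as the baseline (midpoint of the usual 1-2%) and 11% as the reading at 6 TPS (midpoint of 10-12%); those midpoints and the function names are assumptions for illustration only.

```python
CPU_HZ = 1992.276e6       # "cpu MHz" from /proc/cpuinfo in the post
AVG_RULE_CYCLES = 51642   # average HTTP_REQUEST cycles from the post
BASELINE_PERCENT = 1.5    # assumption: midpoint of the "usual 1-2%"


def rule_cpu_percent(requests_per_second):
    """CPU share the iRule alone would need at this request rate."""
    return 100.0 * requests_per_second * AVG_RULE_CYCLES / CPU_HZ


def implied_cycles_per_request(observed_percent, requests_per_second):
    """Cycles per request needed to explain the rise above baseline."""
    extra_fraction = (observed_percent - BASELINE_PERCENT) / 100.0
    return extra_fraction * CPU_HZ / requests_per_second


# Load points from the post; 11.0 is an assumed midpoint of "10-12%".
for tps, observed in ((12, 25.0), (6, 11.0)):
    print(f"{tps} req/s: rule alone needs {rule_cpu_percent(tps):.3f}% CPU, "
          f"observed rise implies "
          f"{implied_cycles_per_request(observed, tps):,.0f} cycles/request")
```

Under these assumptions the rule itself accounts for only a few hundredths of a percent of CPU at 12 requests per second, while the observed rise implies tens of millions of cycles per request, so whatever is costing the time is not showing up in the iRule timing stats.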
Our reseller isn't being very helpful, so I'm between a rock and a hard place here.
Any thoughts would be most gratefully received...
14 Replies
- hoolio
Cirrostratus
I talked with Chris this morning and discussed some testing options that the two of you can try together. If the troubleshooting gets stuck he'll let me know and I can try to help where I can.
Aaron
- Steve_Scott_873
Historic F5 Account
Well, it appears the iRule timings are quite correct; the inefficiency is coming from server certificate checking being enabled on the serverssl profile. My understanding was that this was almost entirely dealt with in hardware, so I wasn't expecting problems there.
Hopefully I can get that raised as a support case, as the capacity there is somewhat lower than advertised.
- Hamish
Cirrocumulus
Is cert checking even supposed to be accelerated? (Do you mean the 'Server Authentication' option?)
Handshakes and bulk crypto are accelerated for SOME ciphers, not all (see SOL6739 for a list of fully accelerated ciphers). SOL6808 lists the accelerated (native) and unaccelerated (compat) ciphers for v9.x +
Oh... Have you tried setting oneconnect? Are you getting http keepalives or not?
H
- Steve_Scott_873
Historic F5 Account
I do indeed mean the server authentication option.
In our case, we have a mix of native and compat ciphers enabled. We were using native ciphers; however, it seems that having compat ciphers enabled meant that the session cache was not working correctly and sessions were not being resumed, which obviously means more work.
Changing to native only (disabling DHE and DH ciphers in our case) seems to have the session cache working again, and even if the session cache size is set to 0 it's still only pushing 10-15% with 120 TPS, rather than 50% with compat there. (Again, we were using AES256+SHA, so it's on the native fully accelerated list; it seems merely having compats there in a serverssl profile is enough to cause problems.)
I couldn't find anything on this in the knowledge base, either before or after I found the solution.