Forum Discussion
Steve_Scott_873
Jan 28, 2010
Historic F5 Account
Unexpected TMOS utilisation hit
I have a BIG-IP 6400 platform running LTM v9.4.7.
We have an iRule to direct HTTP requests coming into the F5 from the datacentre out to the appropriate external service (for XML messaging). This is a lot cleaner than having 250 VIPs for 250 different services.
The rule matches the HTTP host to a pool defined on the server, so if we need to update a host, we can do this via the web interface without making any changes to the iRule by hand (lower risk, easier operation).
Slap the iRule and a serverssl profile on the virtual server and you've got encrypted traffic going out.
We've also got some error handling to deal with people trying to get to services they shouldn't from this particular VIP, plus error messages if the external service is down.
Here is the code:
# iRule to direct HTTP requests based on hostname
when HTTP_REQUEST {
    # Extract hostname - needs to be lower case to match pool name
    set host [string tolower [HTTP::host]]
    # Check that the hostname (and therefore the pool name the request will be sent to)
    # ends with the correct domain - prevents using this VS to hop out via a different pool
    if { ($host ends_with ".bob.com") and !($host contains ".test.") } {
        if { [catch { pool $host }] } {
            # No matching pool name - so move on to error
            HTTP::respond 404 content "Endpoint not defined"
        }
    } else {
        HTTP::respond 403 content "Invalid URL"
    }
}
when LB_FAILED {
    HTTP::respond 504 content "Endpoint Unavailable"
}
So far, so good. We've tested, we've run iRule benchmarking and it's reasonably efficient...
config # cat /proc/cpuinfo
model name : AMD Opteron(tm) Processor 246
stepping : 1
cpu MHz : 1992.276
config # bigpipe rule Prod_Spine_Generic_Routing show all
RULE Prod_Spine_Generic_Routing
+-> HTTP_REQUEST 3123 total 0 fail 0 abort
| | Cycles (min, avg, max) = (17437, 51642, 92343)
+-> LB_FAILED 0 total 0 fail 0 abort
| Cycles (min, avg, max) = (0, 0, 0)
When that's spreadsheeted it comes back at roughly 40,000 TPS, which is comfortably more than the amount of traffic we're expecting.
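For reference, the arithmetic behind that figure can be sketched as below. It is only a rough ceiling: it assumes a single 1992.276 MHz core spends every cycle on this rule and nothing else, and the function name is mine rather than anything from the F5 tooling.

```python
# Rough ceiling on rule executions per second from the timing stats above.
# Assumption: one 1992.276 MHz core spends every cycle on this iRule.

CPU_HZ = 1992.276e6  # "cpu MHz" from /proc/cpuinfo

# Cycles per HTTP_REQUEST execution, from "bigpipe rule ... show all"
CYCLES = {"min": 17437, "avg": 51642, "max": 92343}


def max_executions_per_second(cycles_per_execution, cpu_hz=CPU_HZ):
    """Upper bound on executions per second if the CPU did nothing else."""
    return cpu_hz / cycles_per_execution


for label, cycles in CYCLES.items():
    print(f"{label}: {max_executions_per_second(cycles):,.0f} executions/sec")
```

The average case works out at a little under 39,000 executions per second, which is where the roughly 40,000 TPS figure comes from.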
So at that point we said, we've got this sorted, put it into production and carried on to the next piece of work.
We've now got people using this VIP, and during their first bulk load run they got to the dizzying heights of 12 HTTP requests per second. During this period the TMOS utilisation moved from its usual 1-2% to 25%, and when they backed off to 6 TPS it reduced to 10-12%.
Clearly something's not right here, but the timing stats (produced from production data) show it's all fine. There has to be some sort of hidden cost somewhere, but I can't see any obvious place it's coming from.
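To put a rough number on that gap, the sketch below works backwards from the observed utilisation to an implied cycle cost per request. It assumes the utilisation percentages refer to one 1992.276 MHz core, treats 2% as the idle baseline, and takes 11% as the midpoint of the 10-12% reading. The helper name is made up for illustration.

```python
# Back-of-envelope: cycles consumed per request implied by observed utilisation.
# Assumptions: utilisation is for a single 1992.276 MHz core, the idle baseline
# is 2%, and 11% stands in for the "10-12%" reading.

CPU_HZ = 1992.276e6          # "cpu MHz" from /proc/cpuinfo
BASELINE = 0.02              # usual utilisation with no bulk load
IRULE_AVG_CYCLES = 51642     # average HTTP_REQUEST cost from the rule stats


def implied_cycles_per_request(utilisation, requests_per_second,
                               baseline=BASELINE, cpu_hz=CPU_HZ):
    """Cycles per request needed to explain the utilisation above baseline."""
    return (utilisation - baseline) * cpu_hz / requests_per_second


observations = [(0.25, 12), (0.11, 6)]  # (utilisation, requests per second)

for utilisation, rps in observations:
    cycles = implied_cycles_per_request(utilisation, rps)
    ratio = cycles / IRULE_AVG_CYCLES
    print(f"{rps} req/s at {utilisation:.0%}: ~{cycles / 1e6:.0f}M cycles/request, "
          f"{ratio:,.0f}x the measured iRule cost")
```

On those assumptions each request accounts for tens of millions of cycles, several hundred times what the rule timing reports, which suggests the cost sits somewhere the HTTP_REQUEST timing does not measure.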
Our reseller isn't being very helpful, so I'm between a rock and a hard place here.
Any thoughts would be most gratefully received...
- hoolio
Cirrostratus
Hi Steve,
- Steve_Scott_873
Historic F5 Account
Aaron,
CPU: 5% busy 95% idle 0% sleep Thu Jan 28 19:19:56 2010 Memory Allocated New Flow Old Flow Poll 34,248,152 / 3,472,883,712 14,169 441,524 135 Cycles [ . : . | . : . ] 16 215 14,651,347 Total Tcp Crypto Ops Random Class 106 Timers 56 Open 3 (total) 407 (total) 0 Stats 15 Accepts 0 rsa 0 Pseudo 13 Connects 0 full hs 0 Entropy Virtual Class Wait 5 record 407 Secure 12,349,665 (total) 0 Rtx 0 cipher 10,485,780 mco db 32 Del ACK -1 (unseen) 1,399,046 ssl 258,108 tcl 206,731 (unseen) Umem Class 53,519 (total) 47,632 ssl_session 1,559 listener 1,333 xfrag 727 poolmbr 582 laddr 565 pool 436 vaddr 306 packet 158 selfip 66 connflow 31 rt_entry 19 arp_entry 15 cn_key 15 proxy_ctx cac 15 ssl_profile 12 http_data 11 rtm_internal 9 lasthop 9 ssl_cn 8 ncache_entry vnic 4 CallFrame 118,280b rx link 125,928b tx 3 ssl_shim [ . : . | . : . ] bg [ . : . | . : . ] 1 mpi_recv_desc 471,792b rx 1,000 link 129,536b tx 1 shaper_domain [ . : . | . : . ] bg [ . : . | . : . ] 1 ssl_hs 21,616b rx 1,000 link 118,496b tx 1 ssl_keys [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx
- Steve_Scott_873
Historic F5 Account
TMStat without anything going (baseline):
CPU: 0% busy 100% idle 0% sleep Thu Jan 28 19:36:14 2010 Memory Allocated New Flow Old Flow Poll 34,009,560 / 3,472,883,712 18,435 9,425 132 Cycles [ . : . | . : . ] 3 29 15,048,450 Total Tcp Crypto Ops Random Class 64 Timers 30 Open 0 (total) 0 (total) 0 Stats 2 Accepts 0 rsa 0 Pseudo 0 Connects 0 full hs 0 Entropy Virtual Class Wait 0 record 0 Secure 12,181,457 (total) 0 Rtx 0 cipher 10,485,780 mco db 8 Del ACK 0 (unseen) 1,231,264 ssl 257,682 tcl 206,731 (unseen) Umem Class 53,481 (total) 47,632 ssl_session 1,559 listener 1,335 xfrag 727 poolmbr 582 laddr 565 pool 436 vaddr 306 packet 158 selfip 38 connflow 31 rt_entry 19 arp_entry 15 cn_key 15 ssl_profile 13 proxy_ctx cac 11 rtm_internal 10 http_data 10 ssl_cn 8 ncache_entry 7 lasthop vnic 1 CallFrame 57,232b rx link 26,408b tx 1 mpi_recv_desc [ . : . | . : . ] bg [ . : . | . : . ] 1 shaper_domain 18,400b rx 1,000 link 21,496b tx 1 ssl_keys [ . : . | . : . ] bg [ . : . | . : . ] 13,288b rx 1,000 link 9,504b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx [ . : . | . : . ] [ . : . | . : . ] 0b rx 0 link 0b tx
- Steve_Scott_873
Historic F5 Account
This is from the TMOS CPU graph. It's real traffic, with nobody on the F5 (CLI or otherwise).
- hoolio
Cirrostratus
Hmm.. so the TMM CPU (CPU1) usage was most likely peaking at 30%. I just did a similar test with 10 curl clients looping requests to a VIP with your rule set up. tmstat and the TMM Utilization graph both show less than 3% usage.
- Steve_Scott_873
Historic F5 Account
Yes, they did a test run with 2 servers running initially; the 2nd server ran into trouble so they stopped, and then ran with a single server for the remainder of the session. The first spike was 12 HTTP requests a second, and the second, prolonged period was 6 transactions per second. As you can see, in normal operation (it's a webapp we're hosting, so busiest during the day at ~100 TPS) the TMOS utilisation is minimal, despite there being more complex iRules running.
- hoolio
Cirrostratus
I tested on a 6400 running 9.4.8. I didn't test to a pool. All my requests hit the catch statement for the non-existent pool condition. I was focusing on the iRule operation as opposed to load balancing. I guess there could be something horrible happening with the SSL negotiations or serverside connections, but I've never seen something like that use so many CPU cycles.
- Steve_Scott_873
Historic F5 Account
It could be something going wrong at the pool / SSL level. That's the sort of area I'd expect, as the iRule timings seem to indicate things aren't being held up there. I could try giving it a curling at the catch level and seeing if I can reproduce. That would at least narrow it down to pool selection or the SSL profile.
- hoolio
Cirrostratus
The timing command is used to calculate the CPU cycles required to run the iRule. That data is used to extrapolate how many evaluations of the rule can be done for the platform. As far as I'm aware, timing doesn't take into account any other resource requirements. Unfortunately, there isn't a simple way to determine how many TMM CPU cycles other activities are using. So I'm not sure that your scenario shows that the iRule timing is broken. And I don't think there is a linear relationship between the requests per second and the CPU usage.
- Steve_Scott_873
Historic F5 Account
Agreed, but if it's not the iRule eating the cycles, it's not custom development eating the cycles.