Forum Discussion
iRule performs 100x slower under load
Hello,
I have an iRule (BIGIP v9.4.8) whose HTTP_REQUEST timing performance drops by 100x when under load. How can I diagnose what's happening? Under what conditions can an iRule slow down (in terms of cycles/request) when there's more significant traffic (200-1000 HTTP requests per second)?
HTTP_REQUEST performance in a development environment (BIG IP 3400 w/ v9.4.8) ran with timing stats of roughly 1,000,000 cycles/request average and 10,000,000 cycles/request max. By commenting out pieces of code, we found that 90% of that time is spent on decrypting cookies. According to the "F5DevCentral_iRulesRuntimeCalculator" (see "Evaluating iRule Performance") the 10^6 cycles only represent 357 microseconds. No big deal. The max of 10^7 is a little worry but is still only 3.5 milliseconds.
"b rule show all" output from developer environment (BIG IP v9.4.8, BIG IP 3400)
+-> HTTP_REQUEST 7252 total 0 fail 0 abort
| | Cycles (min, avg, max) = (46052, 838937, 9343832)
+-> HTTP_RESPONSE 7079 total 0 fail 0 abort
| | Cycles (min, avg, max) = (34872, 849654, 9031528)
When we deploy this rule into production (BIG IP v9.4.8, BIG IP 3400), the performance falls off a cliff:
+-> HTTP_REQUEST 4530 total 2 fail 0 abort
| | Cycles (min, avg, max) = (239321, 103.3M, 200.0M)
+-> HTTP_RESPONSE 4406 total 8 fail 0 abort
| | Cycles (min, avg, max) = (390987, 132.1M, 197.9M)
(Note: the 7252 requests in DEV were from an entire day of testing where the 4530 requests in prod came from only a couple of minutes.)
We've done fairly comprehensive comparisons between dev and production and I don't see any significant configuration differences. The CPU is about 30% busy. The memory (through 'top') is running at 50% (of 2GB). This box, like the dev box, serves a large of virtual servers but under greater load.
I've gone through the "Ten Steps to iRules Optimization" and the "Evaluating iRule Performance" documents linked to from here: SOL11769. However, I haven't yet found info that would explain the timing behavior differences between dev and prod.
From other posts on DEVCENTRAL, I've found the 'tmctl' and 'b memory' commands to see if I have a memory leak.
Commenting out sections of the iRule step by step in the development environment, reseting statistics and re-exercising the rule, I can account for all of the "cycles" seen in the DEV environment. But there is a 100x slow down when the rule is facing heavier traffic in production. What tricks do you know to figure out why?
Thanks for your time,
-- Chris
- The_BhattmanNimbostratusHi Chris,
- Chris_Huston_16NimbostratusHi Bhattman. I can't post the script - makes things difficult I know. But given the situation, what tricks would you use to find the problem? What culprits would you suspect?
- The_BhattmanNimbostratusHi Chris,
- Chris_Huston_16NimbostratusHi Bhattman,
- hoolioCirrostratusHi Chris,
- Colin_Walker_12Historic F5 AccountWithout knowing more about what's actually happening inside the iRule tracking this down is going to be tough. I realize you can't post the entire rule, but any details you can provide would be extremely helpful. Which commands are being used, which events they're being used in, context (client vs server), etc.
- hoolioCirrostratusChris,
- Colin_Walker_12Historic F5 AccountNice Aaron, I didn't realize there was a suspend bug in timing's accounting. Good to know!
- Chris_Huston_16NimbostratusAaron:
- Case C850935
- Thank you for the heads up on the timing reporting accuracy. I believe the timing reflected real issues in that you could see the delay when exercising the application. The slow performance of the app vanished when the problem iRule was reverted to an earlier version.
- Definitely looking forward to upgrading to version 10+ but it's a big company and things move slowly.
- Lastly, I will modify the rule to use AES encryption instead of HTTP::cookie encrypt and compare performance.
+-> CLIENTSSL_CLIENTCERT 2 total 0 fail 0 abort | | Cycles (min, avg, max) = (39284, 74243, 91724) +-> CLIENTSSL_HANDSHAKE 299 total 0 fail 0 abort | | Cycles (min, avg, max) = (6484, 30481, 174556) +-> CLIENT_ACCEPTED 296 total 0 fail 0 abort | | Cycles (min, avg, max) = (21288, 62419, 112668) +-> CLIENT_CLOSED 291 total 0 fail 0 abort | | Cycles (min, avg, max) = (3420, 24984, 58528) +-> HTTP_REQUEST 43262 total 0 fail 0 abort | | Cycles (min, avg, max) = (52624, 238869, 9670788) +-> HTTP_RESPONSE 43176 total 0 fail 0 abort | | Cycles (min, avg, max) = (18668, 1455857, 10122004) +-> RULE_INIT 1 total 0 fail 0 abort | Cycles (min, avg, max) = (182880, 182880, 182880)
An second test run containing a different mix of traffic - one that has heavier reliance on cookie encryption produced the following metrics:
+-> CLIENTSSL_CLIENTCERT 0 total 0 fail 0 abort | | Cycles (min, avg, max) = (0, 0, 0) +-> CLIENTSSL_HANDSHAKE 18 total 0 fail 0 abort | | Cycles (min, avg, max) = (21828, 51560, 227392) +-> CLIENT_ACCEPTED 17 total 0 fail 0 abort | | Cycles (min, avg, max) = (48332, 90744, 110596) +-> CLIENT_CLOSED 18 total 0 fail 0 abort | | Cycles (min, avg, max) = (19092, 48822, 52680) +-> HTTP_REQUEST 6878 total 0 fail 0 abort | | Cycles (min, avg, max) = (78288, 1062803, 9434612) +-> HTTP_RESPONSE 6877 total 0 fail 0 abort | | Cycles (min, avg, max) = (31408, 1975720, 9943836) +-> RULE_INIT 0 total 0 fail 0 abort | Cycles (min, avg, max) = (0, 0, 0)
Obviously nothing close to the 100 million cycles/request measured in production.| ssl 7.116M 7.212M 0 | tcl 1.201M 1.647M 0
JMeter Samples 3361, Average 22073ms, Median 21883ms, 90% line 30147ms and Max of 55250ms
session add ssl "s[SSL::sessionid]" $ccSubject $::timeout
- Chris_Huston_16Nimbostratus
Wow! AES encryption had a huge and positive impact on rule performance:
Before (with HTTP::cookie encrypt)
+-> HTTP_REQUEST 6878 total 0 fail 0 abort | | Cycles (min, avg, max) = (78288, 1062803, 9434612) +-> HTTP_RESPONSE 6877 total 0 fail 0 abort | | Cycles (min, avg, max) = (31408, 1975720, 9943836)
After (with AES::encrypt)
+-> HTTP_REQUEST 3201 total 0 fail 0 abort | | Cycles (min, avg, max) = (111540, 280527, 884712) +-> HTTP_RESPONSE 3200 total 0 fail 0 abort | | Cycles (min, avg, max) = (35024, 156626, 594192)
A 5x-10x improvement. Great suggestion Aaron. Thank you.
Recent Discussions
Related Content
* Getting Started on DevCentral
* Community Guidelines
* Community Terms of Use / EULA
* Community Ranking Explained
* Community Resources
* Contact the DevCentral Team
* Update MFA on account.f5.com