iRule performs 100x slower under load
Hello,
I have an iRule (BIGIP v9.4.8) whose HTTP_REQUEST timing performance drops by 100x when under load. How can I diagnose what's happening? Under what conditions can an iRule slow down (in terms of cycles/request) when there's more significant traffic (200-1000 HTTP requests per second)?
HTTP_REQUEST performance in a development environment (BIG IP 3400 w/ v9.4.8) ran with timing stats of roughly 1,000,000 cycles/request average and 10,000,000 cycles/request max. By commenting out pieces of code, we found that 90% of that time is spent on decrypting cookies. According to the "F5DevCentral_iRulesRuntimeCalculator" (see "Evaluating iRule Performance"), the 10^6 cycles represent only 357 microseconds. No big deal. The max of 10^7 is a little worrying but is still only about 3.6 milliseconds.
"b rule show all" output from developer environment (BIG IP v9.4.8, BIG IP 3400)
+-> HTTP_REQUEST 7252 total 0 fail 0 abort
| | Cycles (min, avg, max) = (46052, 838937, 9343832)
+-> HTTP_RESPONSE 7079 total 0 fail 0 abort
| | Cycles (min, avg, max) = (34872, 849654, 9031528)
When we deploy this rule into production (BIG IP v9.4.8, BIG IP 3400), the performance falls off a cliff:
+-> HTTP_REQUEST 4530 total 2 fail 0 abort
| | Cycles (min, avg, max) = (239321, 103.3M, 200.0M)
+-> HTTP_RESPONSE 4406 total 8 fail 0 abort
| | Cycles (min, avg, max) = (390987, 132.1M, 197.9M)
(Note: the 7252 requests in DEV came from an entire day of testing, whereas the 4530 requests in prod came from only a couple of minutes.)
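To put the two sets of numbers on the same footing, here is a rough sketch of the cycles-to-time conversion applied to the HTTP_REQUEST stats above. The 2.8 GHz clock is an assumption, inferred from the calculator's figure of 357 microseconds per 10^6 cycles; the function and variable names are only illustrative.

```python
# Assumed CPU clock, inferred from "10^6 cycles = 357 microseconds".
# This is not a confirmed hardware spec for the BIG-IP 3400.
ASSUMED_CPU_HZ = 2.8e9


def cycles_to_ms(cycles, cpu_hz=ASSUMED_CPU_HZ):
    """Convert a CPU cycle count to milliseconds at the given clock speed."""
    return cycles / cpu_hz * 1000.0


# (min, avg, max) cycles for HTTP_REQUEST, copied from the "b rule show all" output.
http_request_stats = {
    "dev": (46052, 838937, 9343832),
    "prod": (239321, 103.3e6, 200.0e6),
}

for env, (lo, avg, hi) in http_request_stats.items():
    print(
        f"{env}: min={cycles_to_ms(lo):.3f} ms, "
        f"avg={cycles_to_ms(avg):.3f} ms, max={cycles_to_ms(hi):.3f} ms"
    )

# Ratio of average cycles per request, prod vs. dev.
slowdown = http_request_stats["prod"][1] / http_request_stats["dev"][1]
print(f"average slowdown: {slowdown:.0f}x")
```

Under that clock assumption, the production average works out to roughly 37 ms per HTTP_REQUEST and the max to roughly 71 ms, versus about 0.3 ms average in dev.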
We've done fairly comprehensive comparisons between dev and production and I don't see any significant configuration differences. The CPU is about 30% busy. The memory (through 'top') is running at 50% (of 2GB). This box, like the dev box, serves a large number of virtual servers but under greater load.
I've gone through the "Ten Steps to iRules Optimization" and the "Evaluating iRule Performance" documents linked to from here: SOL11769. However, I haven't yet found info that would explain the timing behavior differences between dev and prod.
From other posts on DevCentral, I've found the 'tmctl' and 'b memory' commands, which I can use to check whether I have a memory leak.
By commenting out sections of the iRule step by step in the development environment, resetting statistics and re-exercising the rule, I can account for all of the "cycles" seen in the DEV environment. But there is a 100x slowdown when the rule is facing heavier traffic in production. What tricks do you know to figure out why?
Thanks for your time,
-- Chris