Forum Discussion
iRule killing BIG-IP
Hi,
I have a huge issue with an iRule that is apparently killing a BIG-IP 2000. In tests the iRule logic works OK, but under load the BIG-IP is overloaded.
Scenario:
- BIG-IP no iRule, 30 to 40k concurrent connections - CPU usage 40-50%
- Same number of concurrent connections with iRule attached to VS - CPU usage skyrocketing to almost 100%
I am looking for directions on what to check first. The iRule is quite complicated and certainly not optimized, for lack of knowledge and time on my side. Still, it does not look that complicated to me, so I am puzzled.
What to look for first and how?
It's an HTTP-intensive iRule in the first place: most of the logic is in the HTTP_REQUEST event.
I wonder what load could be caused by inactive debug statements like:
if {0} {log local0. "something"} - of course 0 is really a global static variable like $static::debug
I have around 40 of those.
Then I have nested if statements, around 8, mostly using HTTP::cookie value $cookie_name in the condition together with subtable lookups.
Then of course there are some set/add operations on keys: one main subtable with around 5000 keys, plus just a single key in the main table.
One switch without -glob and five conditions resulting in HTTP::respond - either 403 or 302 with Location and Set-Cookie.
So what is the first candidate for closer investigation? Should I use timing on (it seems that from 11.5.0 onward it's enabled by default, so there is no need to set it in the iRule)?
Any advice or ideas will be appreciated a lot, I am running out of options here :-(
Piotr
12 Replies
- Brad_Parker
Cirrus
Try to not use subtables if possible.
"Manipulating entries in subtables has higher overhead than manipulating an entry not in a subtable. Each subtable itself also takes up memory. All of the entries in a given subtable are on the same processor. So if you put all of your entries (or the vast majority of them) into the same subtable, then one CPU will take a disproportionate amount of memory and load. Which you probably don't want."
https://devcentral.f5.com/wiki/iRules.table.ashx
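Since all entries of one subtable live on the same processor, one way to spread a large key set is to split it across several subtables by hashing the key. A minimal sketch, assuming a hypothetical cookie named "session", hypothetical subtable names "cookies_0" to "cookies_7", and the iRules crc32 command as the hash; how evenly the resulting subtables land on different TMMs is not guaranteed:

```tcl
when HTTP_REQUEST {
    # Hypothetical names: "session" cookie, "cookies_N" subtables, 8 shards.
    set shards 8
    set cookie_value [HTTP::cookie value "session"]
    if { $cookie_value ne "" } {
        # Hash the key so the same cookie always maps to the same subtable.
        set shard [expr { [crc32 $cookie_value] % $shards }]
        # Empty string means the key is not present in that subtable.
        set known [table lookup -subtable "cookies_$shard" $cookie_value]
    }
}
```

Whatever code writes the keys would have to use the same hash and shard count, otherwise lookups will miss.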
- dragonflymr
Cirrostratus
Hi,
I know that part. Do you think that a subtable with 5 000 keys, each with just 1 set as its value, can kill a BIG-IP 2000 with 8GB RAM by itself?
I can use subtable splitting, but according to my other discussion (Subtables, performance and resouces) it should not be the main issue here - or am I wrong?
Piotr
- dragonflymr
Cirrostratus
OK, it's two subtables with different names that are created by separate iRules attached to separate VSs - so they should be distributed among at least two TMMs?
Piotr
- VernonWells
Employee
While it is improbable that the "if { $static::debug } { ... }" construct is chewing up tons of CPU, you should absolutely remove those statements for production use. Either comment them out, or remove them completely (some advocate using a simple iApp to choose whether to include them or not, then in the iApp logic purge the statements, most easily accomplished if they are on a single line). To illustrate, the timing value after 100 executions for the following rule:

    when CLIENT_ACCEPTED {
        return
    }

is as follows:

    ---------------------------------------------
    Ltm::Rule Event: comment-test:CLIENT_ACCEPTED
    ---------------------------------------------
    Priority                    500
    Executions
      Total                     100
      Failures                    0
      Aborts                      0
    CPU Cycles on Executing
      Average                 12.1K
      Maximum                 44.2K
      Minimum                  6.2K

On the other hand, for the following:

    when RULE_INIT {
        set static::ldebug 0
    }
    when CLIENT_ACCEPTED {
        if { $static::ldebug } { log local0. "Log Entry 01" }
        if { $static::ldebug } { log local0. "Log Entry 02" }
        if { $static::ldebug } { log local0. "Log Entry 03" }
        if { $static::ldebug } { log local0. "Log Entry 04" }
        if { $static::ldebug } { log local0. "Log Entry 05" }
        if { $static::ldebug } { log local0. "Log Entry 06" }
        if { $static::ldebug } { log local0. "Log Entry 07" }
        if { $static::ldebug } { log local0. "Log Entry 08" }
        if { $static::ldebug } { log local0. "Log Entry 09" }
        if { $static::ldebug } { log local0. "Log Entry 10" }
        if { $static::ldebug } { log local0. "Log Entry 11" }
        if { $static::ldebug } { log local0. "Log Entry 12" }
        if { $static::ldebug } { log local0. "Log Entry 13" }
        if { $static::ldebug } { log local0. "Log Entry 14" }
        if { $static::ldebug } { log local0. "Log Entry 15" }
        if { $static::ldebug } { log local0. "Log Entry 16" }
        if { $static::ldebug } { log local0. "Log Entry 17" }
        if { $static::ldebug } { log local0. "Log Entry 18" }
        if { $static::ldebug } { log local0. "Log Entry 19" }
        if { $static::ldebug } { log local0. "Log Entry 20" }
        return
    }

the timing result is:

    ---------------------------------------------
    Ltm::Rule Event: comment-test:CLIENT_ACCEPTED
    ---------------------------------------------
    Priority                    500
    Executions
      Total                     100
      Failures                    0
      Aborts                      0
    CPU Cycles on Executing
      Average                 31.7K
      Maximum                 96.3K
      Minimum                 20.9K

Now ~20k extra cycles is definitely not a deal-breaker, but it doesn't help, and that's 20k extra cycles per connection. I'll also say that this value scales linearly, with cycle consumption proportional to the number of these you have (so, if I double it to 40 instances, the average climbs to 49k).
As an aside, I strongly advocate engaging F5 Professional Services to analyze your rule and help optimize. As a bonus, the rule will be archived by F5 Support, and will receive full support from that point onward.
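The linear scaling described above can be turned into a rough per-statement cost. The sketch below is plain Tcl (runnable in tclsh, not an iRule) and uses only the two averages quoted in this reply; the strictly linear model is an assumption, and it lands a little above the 49k that was actually measured for 40 statements:

```tcl
# Averages from the two timing outputs quoted above, in CPU cycles.
set baseline 12100.0   ;# rule with no debug statements
set with20   31700.0   ;# rule with 20 inactive debug statements

# Assumed linear model: every inactive statement costs the same.
set per_stmt [expr {($with20 - $baseline) / 20}]
set est40    [expr {$baseline + 40 * $per_stmt}]

puts "cycles per inactive debug statement: $per_stmt"
puts "estimated average with 40 statements: $est40"
```

On these figures each disabled debug statement costs roughly 1k cycles per event, so 40 of them add on the order of 40k cycles to every execution.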
- dragonflymr
Cirrostratus
Hi Vernon,
Thanks a lot for your comments. I have to do a serious cleanup of my iRule and then run performance tests under high load.
Is there any way to estimate how much the rule influences CPU usage on a specific BIG-IP device using the timing results? I know about the spreadsheet for iRule performance calculation but have never used it, and it seems to be a rather old topic - I don't know if it's still valid for 11.5.x or 11.6.x.
Piotr
- VernonWells
Employee
There is no easy way to answer the question "how much CPU time is devoted to the iRule, and how much is devoted to everything else happening with the connection handling". You may consider reaching out to your F5 Account team. The SE may be able to provide you with an estimate of the CPU budget available on your platform. From there, at least, you can estimate the total CPU cycles consumed by your rule (turn on timing for the rule by adding "timing on" to the top of the rule, then use "show ltm rule" to view the time consumed). The total consumption is:

    average-cycles * events-per-second

(In your case, an "event" is an HTTP Request.) So, my results above show an average of 31.7k cycles. If there were 1000 requests/second, the estimated CPU cycles consumed would be 31.7M. If the total CPU budget on the platform is 100M cycles/second, then the rule, under that load, would consume ~30% of the available CPU (assuming the traffic is uniformly distributed across processing cores; that is, tmm instances).
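The estimate above can be written out as a few lines of plain Tcl (runnable in tclsh, not an iRule). All three inputs are the figures from this reply; the 100M cycles/second budget is the illustrative number used there, not a real platform value:

```tcl
# Inputs taken from the reply above.
set avg_cycles        31700       ;# average CPU cycles per event
set events_per_second 1000        ;# HTTP requests per second
set cpu_budget        100000000   ;# illustrative cycles/second budget

# total consumption = average-cycles * events-per-second
set consumed [expr {$avg_cycles * $events_per_second}]
set share    [expr {100.0 * $consumed / $cpu_budget}]

puts "cycles consumed per second: $consumed"
puts [format "share of CPU budget: %.1f%%" $share]
```

This prints 31700000 cycles per second and a share of 31.7%, which is the "~30%" quoted above.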
- dragonflymr
Cirrostratus
Thanks for the info. Are you sure timing on is necessary on 11.5+? I have no timing on enabled on events in my iRule and can see stats when using show ltm rule rule_name. Do you know by chance the F5DevCentral_iRulesRuntimeCalculator.xls tool provided in this article?
Is this tool still useful for rough evaluation?
When I entered data from my iRule (setting CPU to 2000 000 000 - I assume it's correct for my VE reporting "cpu MHz : 2000.000" from cat /proc/cpuinfo), the result in the last table, "Max number of requests", for average is around 561. I assume that means the max connections/s that can be handled per CPU?
Piotr
- VernonWells
Employee
You are correct. As described here:
starting with 11.5 timing is enabled by default.
Looking at cpuinfo may be somewhat inaccurate because of how scheduling is performed by TMOS. Moreover, by default, the hyperthread cores are not used for tmm in 11.5.
Having said all of that, this number may be useful for a rough estimate.
In my crude formula, "events-per-second" is the peak number of events you see in practice on the system. If you wish to calculate it per-core, that's fine, too.
- dragonflymr
Cirrostratus
Hmm, so I wonder how to reverse this to judge what CPU budget is necessary - or rather what device should be used.
Assuming that the average for my iRule is 3.5M cycles, then to handle 2k CPS I need a budget of 7000 000 000 cycles/second - is that correct?
Then I am not sure how it applies to a VE with, for example, 4 vCPUs: if the CPU is reported as 2GHz and there are 4 vCPUs, is my rough budget 8000 000 000? So the iRule will most probably kill the VE?
Piotr
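The arithmetic in this question can be checked with the same formula, in plain Tcl (runnable in tclsh, not an iRule). The inputs are the numbers from the post; treating vCPUs times clock speed as the budget is the poster's assumption, and given the earlier caveats about cpuinfo and hyperthread cores it should be read as an upper bound at best:

```tcl
# Numbers from the post above.
set avg_cycles 3500000       ;# 3.5M cycles per iRule execution
set cps        2000          ;# target connections per second
set vcpus      4
set hz         2000000000    ;# 2 GHz as reported by /proc/cpuinfo

set required [expr {$avg_cycles * $cps}]   ;# cycles/second the rule needs
set nominal  [expr {$vcpus * $hz}]         ;# assumed upper-bound budget

puts "required cycles/second: $required"
puts "nominal budget:         $nominal"
puts [format "rule share of nominal budget: %.1f%%" \
    [expr {100.0 * $required / $nominal}]]
```

This confirms 7 000 000 000 required against a nominal 8 000 000 000, or 87.5% for the rule alone before any other connection handling is counted.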
- dragonflymr
Cirrostratus
I wonder if this loop should really kill my VE:

    for {set i 0} { $i < $stbl_size } {incr i} {
        set cookie_value "[string range [AES::key 128] 15 end][string range [AES::key 128] 15 end]"
        table set -subtable $stbl_name $cookie_value 1 0 $stbl_life
    }

Here $stbl_size is 5 000. Can this really be killing my 2 vCPU, 8GB RAM VE? I bet it's the set cookie_value line. Is that really such a resource hog?
Not always but often execution of this loop results in:
- Dec 9 19:17:48 bigip11 emerg logger: Re-starting bigd
- Dec 9 19:18:10 bigip11 emerg logger: Re-starting tmm
- Dec 9 19:18:10 bigip11 emerg logger: Re-starting tmm1
My [clock clicks -milliseconds] calculations report around 32 300 ms to perform the above loop.
Piotr
- VernonWells
Employee
AES::key is the culprit. The process of generating an AES key is computationally non-trivial. If you are doing it twice per cookie times 5000 times per connection, your performance is going to be quite poor.
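The measurement reported earlier gives a rough scale for this: 32 300 ms for 5 000 iterations with two AES::key calls each is about 3.2 ms per call, if key generation dominates the loop. A minimal sketch to confirm that on a test box, timing only the key generation with no table writes, might look like the following. The event, the iteration count of 100 and the log message are assumptions for illustration; this is a diagnostic, not something to leave attached in production:

```tcl
when CLIENT_ACCEPTED {
    # Hypothetical diagnostic: time key generation on its own,
    # without any table writes, to see how much of the loop it explains.
    set n 100
    set t0 [clock clicks -milliseconds]
    for {set i 0} {$i < $n} {incr i} {
        set k [AES::key 128]
    }
    set elapsed [expr {[clock clicks -milliseconds] - $t0}]
    log local0. "$n AES::key calls took $elapsed ms"
}
```

If the logged time per call is close to the per-call figure derived above, the table set calls are contributing little and the key generation is what needs to be replaced or moved out of the per-connection path.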