Forum Discussion
High CPU consumption after upgrading to 11.6.0
Hello, could anyone help me?
We have a pair of BIG-IP devices running LTM, WebAccelerator and ASM in a failover configuration. Yesterday we upgraded from 11.5.1 Build 5.0.147 to 11.6.0 Build 3.0.412, but now I'm seeing high CPU consumption, and less than 24 hours after the upgrade my active BIG-IP failed over.
Below are the log messages from when the problem started:
Wed Feb 11 15:12:53 BRST 2015 sod[6881] Offline
Wed Feb 11 15:12:53 BRST 2015 sod[6881] Offline for traffic group /Common/traffic-group-1.
Wed Feb 11 15:12:53 BRST 2015 sod[6881] HA daemon_heartbeat tmm3 fails action is go offline down links and restart.
Wed Feb 11 15:12:53 BRST 2015 sod[6881] HA daemon_heartbeat tmm2 fails action is go offline down links and restart.
Wed Feb 11 15:12:53 BRST 2015 sod[6881] HA daemon_heartbeat tmm1 fails action is go offline down links and restart.
Wed Feb 11 15:12:53 BRST 2015 sod[6881] HA daemon_heartbeat tmm fails action is go offline down links and restart.
Wed Feb 11 15:12:48 BRST 2015 mcpd[8248] Attempting to connect to CMI peer 192.168.254.2 port 6699
Wed Feb 11 15:12:48 BRST 2015 mcpd[8248] CMI reconnect timer: enabled
Wed Feb 11 15:12:48 BRST 2015 mcpd[8248] Closed connection to device /Common/Bigip-HA02.viajanetmia.local.
Wed Feb 11 15:12:48 BRST 2015 mcpd[8248] Connection to CMI peer 192.168.254.2 has been removed
Wed Feb 11 15:12:42 BRST 2015 lacpd[8847] Failover event detected. (Switchboard failsafe disabled while offline)
Wed Feb 11 15:12:41 BRST 2015 sod[6881] Sod requests links down.
Wed Feb 11 15:12:41 BRST 2015 sod[6881] HA reports tmm NOT ready.
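For anyone who wants to pull the heartbeat failures out of a longer log, here is a rough Python sketch. It assumes one entry per line in the layout seen in the excerpt above (timestamp, then daemon name, pid in brackets, then the message). The function name `heartbeat_failures` and both regular expressions are my own illustration, not an F5 tool, and other log sources may use a different layout.

```python
import re
from collections import Counter

# Assumed entry layout, taken from the excerpt in this post:
#   <weekday> <month> <day> <HH:MM:SS> <tz> <year> <daemon>[<pid>] <message>
# The whitespace before the daemon name and before the message is optional.
ENTRY = re.compile(
    r"^(?P<when>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \w+ \d{4})\s*"
    r"(?P<daemon>[A-Za-z_]+)\[(?P<pid>\d+)\]\s*(?P<msg>.*)$"
)
HEARTBEAT = re.compile(r"HA daemon_heartbeat (?P<name>\S+) fails")


def heartbeat_failures(lines):
    """Return (timestamp, daemon name) for each heartbeat failure logged by sod."""
    hits = []
    for line in lines:
        entry = ENTRY.match(line.strip())
        if entry is None or entry["daemon"] != "sod":
            continue
        beat = HEARTBEAT.search(entry["msg"])
        if beat:
            hits.append((entry["when"], beat["name"]))
    return hits


# Three entries copied from the excerpt; the mcpd line should be ignored.
sample = [
    "Wed Feb 11 15:12:53 BRST 2015sod[6881]HA daemon_heartbeat tmm3 fails action is go offline down links and restart.",
    "Wed Feb 11 15:12:53 BRST 2015sod[6881]HA daemon_heartbeat tmm fails action is go offline down links and restart.",
    "Wed Feb 11 15:12:48 BRST 2015mcpd[8248]CMI reconnect timer: enabled",
]
failures = heartbeat_failures(sample)
print(Counter(name for _, name in failures))
```

Counting failures per tmm instance makes it easier to see whether all tmm processes missed their heartbeat at the same moment, as they did here, or only one of them.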
Our top processes are:
tmm, learning_Manage and mysqld
Does anyone have any idea what is happening?
- Brad_ParkerCirrus
I highly recommend opening a case with support.
- JGCumulonimbus
Also, there is a huge structural change in v11.6.0: you will have 50% fewer tmm threads running than before the upgrade. The tmm processes are now each pinned to a specific CPU core, and half of the cores are dedicated to admin tasks only. See the Release Notes.
- Cristian_Gal_12Nimbostratus
I have the same issue on a 4200v with LTM and ASM, 11.6 HF3. Whenever I turn on an ASM policy for a VS, CPU usage jumps to 100%. top shows mysqld and learn_manage as the most CPU-consuming processes, with the management-plane cores at 30% each. However, I don't have a cluster, so I can't tell whether a failover would occur. LTM doesn't seem to be impacted by this high CPU usage.
- Eduardo_de_OlivNimbostratus
Hmm... I can see something related to policy learning. If I activate policy learning, the CPU jumps to 100% and the top process is learn_manage, so I turned learning off and everything looks good. BUT I DON'T want to leave it off... something is wrong here, for sure!
When our pair failed over, I looked through the logs and discovered that the failover was caused by disk latency: around 10 seconds to respond. I also read somewhere that this version changed how logs are written, so I suspect that change has something to do with policy learning, maybe excessive disk I/O. I don't know.
- JGCumulonimbus
v11.6.0 has stability issues. I'd recommend rolling back and waiting for the release of HF4.
- Shane_Fought_18Nimbostratus
I have the same issue on a set of 4000's running LTM, ASM, APM, AVR, and AAM in a failover HA configuration. top shows that mysqld, tmm, and learn_manager are using the majority of the CPU. At times I see mysqld using 95%+. I put in a ticket with F5 today. Another issue I'm having (which may be related), is that pulling up illegal requests (Security->Event Logs->Application->Requests) has gone from 30 seconds for the page to load to 5 minutes plus since we upgraded from 11.4 to 11.6 HF3. I have another ticket in for that, but I'm wondering if the issues are related.
- Eduardo_de_OlivNimbostratus
Thanks, Shane! Please share with us any answer you get from F5.
- Shane_Fought_18Nimbostratus
I quoted the F5 response at the bottom of this post. Since we haven't seen any latency through the F5, I'm going to consider this a non-issue for now. I'm still waiting to hear back as to why it is taking so long to pull up the Security event logs:
From F5 support:
"Hi Shane,
As we discussed, the graphs under Statistics > Performance > System CPU Usage > View Detailed Graph will show you CPU utilization per core, and what you'll see is that just Core 7 is running high, and that's probably due to learning mode being enabled.
The other issue is a memory leak ramp caused by monpd which you already knew about.
So basically as long as Core 7 doesn't stay pegged at 100% for hours then the box is running within spec."
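To check the "pegged for hours" condition outside the GUI graphs, per-core utilization can be sampled from the shell. The sketch below is generic Linux, not an F5 utility: it assumes a standard `/proc/stat` and a Python 3 interpreter are available on the box (not verified for every BIG-IP release), and that the `cpu7` row corresponds to the "Core 7" shown in the graphs. The 95% threshold and all function names are hypothetical choices for illustration.

```python
def parse_proc_stat(text):
    """Map 'cpuN' -> (busy_ticks, total_ticks) from /proc/stat content."""
    cores = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core rows look like "cpu7 ..."; the aggregate "cpu" row is skipped.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        ticks = [int(v) for v in fields[1:9]]  # user through steal
        not_busy = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)  # idle + iowait
        total = sum(ticks)
        cores[fields[0]] = (total - not_busy, total)
    return cores


def utilization(before, after):
    """Percent busy per core between two parse_proc_stat() snapshots."""
    result = {}
    for core, (busy1, total1) in after.items():
        busy0, total0 = before.get(core, (0, 0))
        elapsed = total1 - total0
        result[core] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return result


def pegged_cores(samples, threshold=95.0):
    """Cores at or above threshold in every sample (a list of utilization dicts)."""
    if not samples:
        return []
    return sorted(c for c in samples[0]
                  if all(s.get(c, 0.0) >= threshold for s in samples))


def snapshot(path="/proc/stat"):
    """Read and parse the live counters; only meaningful on a Linux host."""
    with open(path) as handle:
        return parse_proc_stat(handle.read())


# Synthetic counters standing in for two readings taken some time apart.
t0 = parse_proc_stat("cpu0 100 0 100 800 0 0 0 0\ncpu7 100 0 100 800 0 0 0 0\n")
t1 = parse_proc_stat("cpu0 110 0 110 1780 0 0 0 0\ncpu7 1000 0 190 810 0 0 0 0\n")
usage = utilization(t0, t1)
print(usage, pegged_cores([usage]))
```

On a live system, calling `snapshot()` at fixed intervals (for example once a minute) and passing the resulting utilization dicts to `pegged_cores` would show whether a single core stays saturated across the whole window or only spikes briefly.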
- JGCumulonimbus
Thanks for sharing this info. We have exactly the same situation of high, persistent usage of Core 7, and were thinking of opening a Support case to find out why.
- Daniel_SchröterNimbostratus
11.6 HF4 was released a few days ago. Does it solve your problem?
- Eduardo_de_OlivNimbostratus
lol, I did the upgrade yesterday. For now the consumption appears to be the same... let's wait a few days to see whether it solves the crash dump.