In this episode of Lightboard Lessons, Jason details how BIG-IP’s Traffic Management Microkernel (TMM) utilizes Intel’s Hyper-Threading Technology (on applicable platforms) and how the behavior differs depending on the TMOS version. In versions before 11.5, each hyper-thread has a separate TMM running. In versions 11.5 and later, TMM and control plane tasks are assigned to separate hyper-threads by default, a behavior we call HTSplit. This behavior is controlled by the scheduler.splitplanes.ltm database key.
Note: This video is a result of a great question by community member Aditya. You can read about the background here.
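If you want to check or change this behavior on your own system, here is a minimal sketch from the shell (assuming a v11.5 or later build; exact restart requirements can vary by version, so verify on yours):
tmsh list sys db scheduler.splitplanes.ltm # show the current value
tmsh modify sys db scheduler.splitplanes.ltm value disable # or enable to restore the default split
bigstart restart tmm # a tmm restart is typically needed for the change to take effect (traffic impacting)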
Hi Jason,
Great video! Finally a topic about HT on the lightboard! (can't wait to share this out 🙂 )
I had a question though: after 11.5, if the even cores exceed 80% and the load on the CPU is still increasing, how will BIG-IP allocate the load? Will it first use tmm0 until it reaches 100% and then move to tmm1 to use that 80%? Or will BIG-IP distribute the load to tmm1 once tmm0 exceeds 80%?
Thanks!
Hi Jason,
I second Zack's opinion - great video. Now CPU/HT/TMM is plain and simple :-) Thanks a lot for extending your explanations from our Answers section into this video.
I am also interested in the answer to Zack's question.
Piotr
Hey Jason,
Great video. It visually explains what we were discussing about the relation between CPU, core, and TMM.
Also I can see the Idle enforcer:
info kernel: ltm driver: Idle enforcer starting - tid: 12663 cpu: 11/11
info kernel: ltm driver: Idle enforcer exiting - tid: 12663 cpu: 11
info kernel: ltm driver: Idle enforcer starting - tid: 13015 cpu: 5/5
info kernel: ltm driver: Idle enforcer exiting - tid: 13015 cpu: 5
Great question, Zack. The disaggregator (or DAG as we call it, also referenced as the cmp hash) distributes load to all TMMs as requests come in, hashed per VLAN (configurable in the VLAN advanced settings in 11.3+; see the notes on CMP in K14358, linked above), rather than filling one bucket and moving to the next. So you should see all TMMs grow together. If you have the default cmp hash and an application that uses the same src/dst ports (like NTP), heavy traffic for that application can overwhelm a single TMM, since all of that traffic will be hashed to the same TMM.
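If you want to inspect or change the hash from the CLI, something along these lines should work on 11.3+ (the VLAN name "external" is just a placeholder here; confirm the allowed values on your version):
tmsh list net vlan external cmp-hash # show the current hash setting for the vlan
tmsh modify net vlan external cmp-hash src-ip # hash on source IP only; dst-ip and default are the other typical options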
Hi,
I wonder how the info in the thread that triggered this lesson relates to vCMP and the i5800.
In the thread, issuing ps T | grep tmm | grep T rendered output.
I tried the same command on both the vHost (i5800) and a 2 vCPU vGuest - on both there was no output. When using just:
ps T | grep tmm
results are:
vHost (13.0.0.2.0.1671)
8910 ? S 0:00 runsv tmm
12846 ? S 0:06 /usr/bin/tmipsecd --tmmcount 1
12848 ? S 0:00 /etc/bigstart/scripts/tmm.start /var/run 1 1 1 --platform C119 -m -s 512
18071 ? RL 544:29 /usr/bin/tmm --platform C119 -m -s 512
vGuest (12.1.1.1.0.196)
4491 ? S 0:00 runsv tmm
4506 ? S 0:00 /etc/bigstart/scripts/tmm.start /var/run 1 1 0 --platform Z101 --split-planes -m -s 8804
7340 ? S 0:07 /usr/bin/tmipsecd --tmmcount 1
10994 ? S
So why is there nothing like in the thread? The only obvious fact is that HTSplit is disabled on the vHost but enabled on the vGuest (--split-planes).
I also wonder what 11/11 means in this kern.log message:
info kernel: ltm driver: Idle enforcer starting - tid: 12663 cpu: 11/11
Looking at the picture in K15003, the active TMM on core 5 has id 10, so I can understand that the first 11 is the number of the TMM activated after the 80% threshold was reached on TMM 10, but what does the second one mean? It would be easier to debug with info about the physical core and TMM id, like 5/11.
Can you point me to some resources about how to find the reason for high CPU and RAM usage on a vGuest? It seems that the Dashboard data for the vGuest doesn't match the CPU load reported on the vHost in vCMP ›› Guest List.
Piotr
Another area where I am really lost: performance statistics.
Let's assume the i5800 is used for vCMP. It's 1 x quad-core HT processor. If I'm not completely lost, that means:
cat /proc/cpuinfo uses this terminology:
The part I don't get is all the statistics info available on BIG-IP. Let's start with the vHost on the i5800:
---------------------------
Sys::TMM: 0.65535
---------------------------
Global
TMM Process Id 18071
Running TMM Id 65535
TMM Count 1
CPU Id 65535
Memory (bytes)
Total 406.0M
Used 94.3M
CPU Usage Ratio (%)
Last 5 Seconds 3
Last 1 Minute 3
Last 5 Minutes 3
So one TMM (process?) 0. Why are there two ids, what is the difference between TMM Process Id and Running TMM Id, and why is CPU Id 65535 and not 0?
What is CPU Usage Ratio (%) reporting here - total processor usage (so all physical/virtual cores)?
I assume it's the same value as reported by:
snmpwalk -c public localhost .1.3.6.1.4.1.3375.2.1.8.2.3.1.37
Is Statistics ›› Performance ›› System ›› CPU Utilization, with one line labeled Utilization, the GUI counterpart of the commands above?
Then in Statistics ›› Performance ›› System ›› CPU Utilization: View Detailed Graph, there are two graphs, each for four CPUs - 0-3 and 4-7 - so it's reporting virtual cores, right?
It seems that the CLI counterpart is
tmsh show sys cpu
or tmsh show sys host-info
?
For an HTSplit-enabled system, is it safe to assume that the stats for even-numbered CPUs are in fact the CPU utilization caused by TMM threads - so CPU 0, 2, 4, 6 is TMM utilization for conditions under 80%?
And CPU 1, 3, 5, 7 is CPU utilization by control plane tasks, right?
How, then, can CPU utilization be checked for a situation where a TMM thread (or threads) is loaded over 80%? In such a situation part of the CPU utilization is on the even CPUs and part on the odd CPUs - the graph doesn't show which part on the odd CPUs is control plane and which is data plane.
Are there any tools (GUI or CLI) that provide exact CPU utilization (virtual core utilization) per TMM thread?
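For what it's worth, the only per-TMM numbers I have found so far seem to come from:
tmsh show sys tmm-info # per-TMM memory and CPU usage ratio; this looks like the source of the Sys::TMM output pasted above
but I'm not sure its CPU Usage Ratio maps cleanly onto a single virtual core.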
Last but not least, what exactly is reported by the Dashboard CPU section? What is CPU here:
If it's a virtual core being reported here, is there some difference in the way it's reported for vCMP versus non-vCMP systems? For example, could much higher usage be reported for the busiest CPU?
And really last 🙂
What can cause a big difference between the busiest CPU and the least busy one? Assuming that the DAG statistically distributes connections in equal shares between TMM threads, a big difference should rarely happen - or am I wrong?
Piotr
Sorry for that, my bad, I'm a really curious creature :-). Take your time, I really appreciate the time and effort required to answer such questions.
Piotr
Hi,
I am back 🙂
Some more questions:
HTSplit and vCMP - does the same rule about even and odd virtual cores apply to a vCMP guest? What about when we have a 1 vCPU vGuest?
ASM - according to the KB, ASM uses the highest numbered core for part of its control plane tasks. Does the same rule about throttling the odd core when the even core exceeds 80% still apply here - so ASM control plane processes will be throttled to 20% if the even core is loaded over 80%?
Is ASM really limited to one and only one core for those selected control plane tasks? I've seen configs where the highest numbered core was almost constantly loaded at 90-100% - mainly by the ASM control plane, I assume. If only one core can be dedicated to ASM, what are the options in such a case?
Piotr
Does this cause packet drops when we see those logs in /var/log/kern? I can see it on our box when we do load tests on certain VIPs and these logs start growing, and the application sees timeouts from the VIP.