You Want Action on a Threshold Violation? Use iCall!
iCall has been around since the 11.4 release, yet there seems to be a prevailing gap in awareness of this amazing functionality in BIG-IP. A blog I wrote last year gives an overview of the iCall system, but in brief, it provides event-based automation. The events can be periodic (like cron functionality), perpetual (watching for something like a file to appear in a directory), or triggered by an alert (like a pool member failure).
Late last week I was at the mother ship (F5 Corporate in Seattle) and found this question in Q&A (paraphrased):
What is a good method for toggling interface 1.1 if active pool members in a pool falls below 70%?
My mind went immediately to iCall, as this is a perfect use case: it binds an event (a pool's active members falling below a threshold) to a task (disabling an interface). I didn't have time to flesh out the solution last week, but I dropped some (errant) code in the thread to point the original poster (Lee) down the right path. Flash forward to this week, and I was intrigued enough by the solution that I thought I'd take a crack at making it work.
Building Out the Solution
Given that Lee set a threshold of 70% of active pool members, I figured a test pool of four members would be a good candidate, since failing one member would leave me just over the threshold at 75%, whereas failing a second member would take me to 50%. I suppose a pool of three members would have been equally fine, but I like to see that a single failure doesn't accidentally trigger the event. So I fired up my test BIG-IP device and a Linux VM with several interface aliases and built a pool with four members.
ltm pool pool4 {
    members {
        192.168.101.10:80 {
            address 192.168.101.10
            session monitor-enabled
            state up
        }
        192.168.101.20:80 {
            address 192.168.101.20
            session monitor-enabled
            state up
        }
        192.168.101.21:80 {
            address 192.168.101.21
            session monitor-enabled
            state up
        }
        192.168.101.22:80 {
            address 192.168.101.22
            session monitor-enabled
            state up
        }
    }
    monitor http
}
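For reference, the same pool can be stood up with a single tmsh command rather than editing the config directly. This is just a quick sketch; the pool name and member addresses are from my lab and would obviously change in yours.

# build the four-member test pool with an http monitor in one shot
tmsh create ltm pool pool4 monitor http members add { 192.168.101.10:80 192.168.101.20:80 192.168.101.21:80 192.168.101.22:80 }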
Next, I needed to build the iCall script. An iCall script is just a tmsh script stored in a specific section of the configuration; it's Tcl, just like tmsh. But what does the script need to do? A few things:
- Define the pool of interest
- Set the total number of pool members
- Set the number of available members
- Do math
- Enable/Disable the interface based on the result of that math
Steps 1, 4, and 5 are pretty self-explanatory. In tmsh scripting, setting an interface (like most other tmsh-based commands) looks nearly identical to the shell command:
# tmsh shell
tmsh modify /net interface 1.1 disabled
# tmsh script
tmsh::modify /net interface 1.1 disabled
Where it gets tricky is figuring out how to get pool member data. This is where the tmsh::get_status and tmsh::get_field_value commands come into play. Everything is object based in tmsh, and it can be a little overwhelming to figure out how to address the objects. If you were to just run the commands below in a script, the resulting output (in /var/tmp/scriptd.out) shows you the nomenclature of the addressable objects in that data.
set pn "/Common/pool4"
set pooldata [tmsh::get_status /ltm pool $pn detail]
puts $pooldata
#data set
ltm pool pool4 {
    active-member-cnt 4
    connq-all.age-edm 0
    connq-all.age-ema 0
    connq-all.age-head 0
    connq-all.age-max 0
    connq-all.depth 0
    connq-all.serviced 0
    connq.age-edm 0
    connq.age-ema 0
    connq.age-head 0
    connq.age-max 0
    connq.depth 0
    connq.serviced 0
    cur-sessions 0
    members {
        192.168.101.10:80 {
            addr 192.168.101.10
            connq.age-edm 0
            connq.age-ema 0
            connq.age-head 0
            connq.age-max 0
            connq.depth 0
            connq.serviced 0
            cur-sessions 0
            monitor-rule http (pool monitor)
            monitor-status up
            node-name 192.168.101.10
            nodes {
                192.168.101.10 {
                    addr 192.168.101.10
                    cur-sessions 0
                    monitor-rule none
                    monitor-status unchecked
...continued...
So I get to the pool member data by first getting the pool data. The data needed for pool member availability is the availability-state and the enabled-state from the pool member data (an incomplete view of the data is shown below, but the necessary information is there).
members 192.168.101.22:80 {
    addr 192.168.101.22
    monitor-rule http (pool monitor)
    monitor-status up
    node-name 192.168.101.22
    nodes {
        192.168.101.22 {
            addr 192.168.101.22
            cur-sessions 0
            monitor-rule none
            monitor-status unchecked
            name 192.168.101.22
            session-status enabled
            status.availability-state unknown
            status.enabled-state enabled
            status.status-reason
            tot-requests 0
        }
    }
    pool-name pool4
    port 80
    session-status enabled
    status.availability-state available
    status.enabled-state enabled
    status.status-reason Pool member is available
}
Now that the data set is known, the script can be completed. Note that to get to the particular state information shown above, I just reference those attributes against the member in the tmsh::get_field_value commands below. The math part is simple, though to get floating point division, a .0 is appended to the $usable count variable in the expression. Logging statements and puts commands (sending data to /var/tmp/scriptd.out for debugging) are included in the script for demonstration purposes.
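As a quick illustration of why the .0 matters: Tcl's expr performs integer division when both operands are integers, so the usable/total ratio would always truncate. Appending .0 to the count forces a floating point result, as a quick tclsh check shows:

% expr 3 / 4      ;# both operands are integers, so the result truncates to 0
0
% expr 3.0 / 4    ;# one floating point operand gives real division
0.75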
sys icall script poolCheck.v1.0.0 {
    app-service none
    definition {
        set pn "/Common/pool4"
        set total 0
        set usable 0
        foreach obj [tmsh::get_status /ltm pool $pn detail] {
            puts $obj
            foreach member [tmsh::get_field_value $obj members] {
                puts $member
                incr total
                if { [tmsh::get_field_value $member status.availability-state] == "available" && \
                     [tmsh::get_field_value $member status.enabled-state] == "enabled" } {
                    incr usable
                }
            }
        }
        if { [expr $usable.0 / $total] < 0.7 } {
            tmsh::log "Not enough pool members in pool $pn, interface 1.3 disabled"
            tmsh::modify /net interface 1.3 disabled
        } else {
            tmsh::log "Enough pool members in pool $pn, interface 1.3 enabled"
            tmsh::modify /net interface 1.3 enabled
        }
    }
    description none
    events none
}
Now that the script is complete, I just need to create the handler. A triggered handler could be created to run the script every time a pool member alert fires (as configured in /config/user_alert.conf), but for demo purposes I used a periodic handler with a 60-second interval. (A rough sketch of the triggered alternative follows the periodic handler below.)
sys icall handler periodic poolCheck.v1.0.0 {
    first-occurrence 2014-09-16:11:00:00
    interval 60
    script poolCheck.v1.0.0
}
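For completeness, here's roughly what the triggered approach mentioned above could look like. This is an untested sketch: the alert, event, and handler names are ones I made up for illustration, and you'd want to verify the user_alert.conf match string against the monitor status messages in your own logs before relying on it.

# /config/user_alert.conf - fire a custom iCall event when a pool4 member changes monitor status
alert pool4_member_status "Pool /Common/pool4 member .* monitor status" {
    exec command="tmsh generate sys icall event pool4_member_status context none"
}

# triggered handler that runs the same script whenever the event above fires
sys icall handler triggered poolCheck.triggered {
    script poolCheck.v1.0.0
    subscriptions {
        memberStatus {
            event-name pool4_member_status
        }
    }
}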
Configuration complete, moving on to test!
Testing the Solution
To test, I activated the VM instance in my lab and validated that my BIG-IP interfaces and pool members were up. Then I shut down one Apache virtual host ahead of the first period at 11:26; since I still had 75% availability, the interface remained enabled. Next, I shut down a second Apache virtual host, dropping availability to 50%. At 11:27, the BIG-IP interface was deactivated. Finally, I re-enabled the Apache virtual hosts, and at the next period the BIG-IP interface was reactivated. Log files and a ping test to that interface are shown below.
# Log Files
Sep 16 11:25:43 Pool /Common/pool4 member /Common/192.168.101.21:80 monitor status down.
Sep 16 11:26:00 Enough pool members in pool /Common/pool4, interface 1.3 enabled
Sep 16 11:26:26 Pool /Common/pool4 member /Common/192.168.101.22:80 monitor status down.
Sep 16 11:27:00 Not enough pool members in pool /Common/pool4, interface 1.3 disabled
Sep 16 11:27:32 Pool /Common/pool4 member /Common/192.168.101.21:80 monitor status up.
Sep 16 11:27:36 Pool /Common/pool4 member /Common/192.168.101.22:80 monitor status up.
Sep 16 11:28:01 Enough pool members in pool /Common/pool4, interface 1.3 enabled

# Ping Test to Interface 1.3
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Request timed out.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.205: Destination host unreachable.
Reply from 10.10.10.5: bytes=32 time=1000ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
Reply from 10.10.10.5: bytes=32 time=1ms TTL=255
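As an aside, if you'd rather not touch the back-end services at all during testing, you could likely get the same effect by forcing members down from tmsh. I didn't test it this way, but a forced-offline member should fail the script's availability/enabled check just the same; a sketch:

# force a member offline so the script no longer counts it as usable...
tmsh modify ltm pool pool4 members modify { 192.168.101.21:80 { state user-down } }
# ...and bring it back when you're done testing
tmsh modify ltm pool pool4 members modify { 192.168.101.21:80 { state user-up } }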
One note on this solution: don't rely on the GUI or CLI status of the interface (observed on 11.5.x and later). Bug 471860 catalogs the reporting issue on BIG-IP for the interface status: at boot time, if the interface is up it reports as ENABLED, but if you disable and then re-enable it, it reports as DISABLED even though it will be up and passing traffic.
Dig into iCall!
iCall (and tmsh more generally) is tremendously powerful; take a look at the other use cases already in the iCall codeshare! This solution has been added to the codeshare as well.
3 Comments
- LEON_LI_38034 (Nimbostratus)
  Cool! I admire you very much @Jason.
- JRahm (Admin)
  that was a fun one, Lee, keep the ideas coming!
- Kohlaa (Nimbostratus)
  for those people who stumbled here looking for which folder/file iCall scripts are stored in, as you can't browse to /Common..
  the file is located in /config/bigip_script.conf