Part 3: Monitoring the health of BIG-IP APM network access PPP connections with a periodic iCall handler
Published May 12, 2020
Version 1.0
Dear Kin,
I first ran into the issue with a different script that I wrote myself. To be sure, I made a copy of your script and tested it, and the same issue occurred.
Here is the code I'm using:
set apm_log "/var/log/apm"
set errormsg "Reconnect"
set crit_threshold 4
set alert_threshold 7
set emerg_threshold 10
set period 3.0
set DEBUG 1
set total 0
puts "\n[clock format [clock seconds] -format "%b %e %H:%M:%S"] Running script..."
for {set i 1} {$i <= $period} {incr i} {
    set hourmin [clock format [clock scan "-$i minute"] -format "%b %e %H:%M:"]
    set errorcode [catch {exec grep $errormsg $apm_log | grep $hourmin | wc -l} num_entries]
    if {$DEBUG} {puts "DEBUG: $hourmin \"$errormsg\" logged $num_entries times."}
    set total [expr {$total + $num_entries}]
}
set average [expr {$total / $period}]
set average [format "%.1f" $average]
if {$average < $crit_threshold} {
    if {$DEBUG} {puts "DEBUG: $hourmin \"$errormsg\" logged $average times on average. Below all threshold. No action."}
    exit
}
if {$average < $alert_threshold} {
    if {$DEBUG} {puts "DEBUG: $hourmin \"$errormsg\" logged $average times on average. Reached critical threshold $crit_threshold. Log Critical msg."}
    exec logger -p local1.crit "01490266: \"$errormsg\" logged $average times on average in last $period mins. >= critical threshold $crit_threshold."
    exit
}
if {$average < $emerg_threshold} {
    if {$DEBUG} {puts "DEBUG: $hourmin \"$errormsg\" logged $average times on average. Reached alert threshold $alert_threshold. Log Alert msg."}
    exec logger -p local1.alert "01490266: \"$errormsg\" logged $average times on average in last $period mins. >= alert threshold $alert_threshold."
    exit
}
if {$DEBUG} {puts "DEBUG: $hourmin \"$errormsg\" logged $average times on average in last $period mins. Log Emerg msg"}
exec logger -p local1.emerg "01490266: \"$errormsg\" logged $average times on average in last $period mins. >= emerg threshold $emerg_threshold."
exit
I removed the check "if {$errorcode} {set num_entries 0}" so that the error output would be logged.
And here is what I see in /var/tmp/scriptd.out:
Apr 13 09:44:45 Running script...
DEBUG: Apr 13 09:43: "Reconnect" logged 0
child process exited abnormally times.
Other errors follow, but that is expected because num_entries is not set correctly.
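As far as I can tell, the extra text comes from Tcl's exec: grep exits with status 1 when no line matches, exec reports that as an error, and catch then stores wc's output followed by "child process exited abnormally" in num_entries. Below is a minimal sketch of how the count could still be recovered, assuming the count from wc is always the first line of the result. The helper name count_from_exec and the throwaway file under /tmp are my own illustrations, not part of the original script.

```tcl
# Hypothetical helper: pull the numeric count out of whatever catch/exec stored.
proc count_from_exec {output} {
    # wc -l prints the count first; any exec error text follows on later lines.
    set first [string trim [lindex [split $output "\n"] 0]]
    if {[string is integer -strict $first]} {
        return $first
    }
    # Nothing usable in the first line; treat it as zero matches.
    return 0
}

# Same pipeline as in the script, run against a throwaway file with no matches.
# The /tmp location is an assumption for illustration only.
set tmpfile [file join /tmp "apm_sample_[pid].log"]
set fh [open $tmpfile w]
puts $fh "Apr 13 09:43:01 some unrelated message"
close $fh

set errorcode [catch {exec grep Reconnect $tmpfile | grep "Apr 13 09:43:" | wc -l} num_entries]
set count [count_from_exec $num_entries]
file delete $tmpfile

puts "errorcode=$errorcode count=$count"
```

With something like this, the sum in the loop would only ever see a plain integer, even when grep finds nothing for a given minute.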
I hope this clarifies the issue.
Thanks.
Abdessamad