Part 3: Monitoring the health of BIG-IP APM network access PPP connections with a periodic iCall handler
In this part, you monitor the health of PPP connections on the BIG-IP APM system by monitoring the frequency of a particular log message in the /var/log/apm file. In this log file, when the system is...
Published May 12, 2020
Version 1.0
Kin
Employee
Joined June 05, 2019
Abdessamad1
Apr 13, 2021
Cirrostratus
Dear Kin,
I initially had the issue with a different script that I wrote myself, but to be sure, I made a copy of your script and tested it, and the same issue happened.
Here is the code I'm using:
set apm_log "/var/log/apm"
set errormsg "Reconnect"
set crit_threshold 4
set alert_threshold 7
set emerg_threshold 10
set period 3.0
set DEBUG 1
set total 0
puts "\n[clock format [clock seconds] -format "%b %e %H:%M:%S"] Running script..."
for {set i 1} {$i <= $period} {incr i} {
    set hourmin [clock format [clock scan "-$i minute"] -format "%b %e %H:%M:"]
    set errorcode [catch {exec grep $errormsg $apm_log | grep $hourmin | wc -l} num_entries]
    if {$DEBUG} {puts "DEBUG: $hourmin \"$errormsg\" logged $num_entries times."}
    set total [expr {$total + $num_entries}]
}
set average [expr $total / $period]
set average [format "%.1f" $average]
if {$average < $crit_threshold} {
    if {$DEBUG} {puts "DEBUG: $hourmin \"$errormsg\" logged $average times on average. Below all threshold. No action."}
    exit
}
if {$average < $alert_threshold} {
    if {$DEBUG} {puts "DEBUG: $hourmin \"$errormsg\" logged $average times on average. Reached critical threshold $crit_threshold. Log Critical msg."}
    exec logger -p local1.crit "01490266: \"$errormsg\" logged $average times on average in last $period mins. >= critical threshold $crit_threshold."
    exit
}
if {$average < $emerg_threshold} {
    if {$DEBUG} {puts "DEBUG: $hourmin \"$errormsg\" logged $average times on average. Reached alert threshold $alert_threshold. Log Alert msg."}
    exec logger -p local1.alert "01490266: \"$errormsg\" logged $average times on average in last $period mins. >= alert threshold $alert_threshold."
    exit
}
if {$DEBUG} {puts "DEBUG: $hourmin \"$errormsg\" logged $average times on average in last $period mins. Log Emerg msg"}
exec logger -p local1.emerg "01490266: \"$errormsg\" logged $average times on average in last $period mins. >= emerg threshold $emerg_threshold."
exit
I removed the check "if {$errorcode} {set num_entries 0}" to have the error output logged.
And here is what I see in /var/tmp/scriptd.out:
Apr 13 09:44:45 Running script...
DEBUG: Apr 13 09:43: "Reconnect" logged 0
child process exited abnormally times.
This is followed by other errors, but that's expected, as num_entries is not set correctly.
I hope this clarifies the issue.
Thanks.
Abdessamad
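Note on the output above: the two-line DEBUG message is consistent with how Tcl's exec behaves. grep exits with status 1 when it finds no matching line, and exec treats a non-zero exit status anywhere in the pipeline as an error, appending "child process exited abnormally" to the text that catch stores in num_entries. The following is a minimal sketch, not the article author's fix, of one way to keep only the count. count_matches is a hypothetical helper, and the sample log lines and the /tmp path are made up for the demo.

```tcl
# Sketch only: count_matches is a hypothetical helper, not part of the article.
proc count_matches {pattern stamp logfile} {
    # grep exits 1 when nothing matches. exec then raises an error whose
    # message is the pipeline's stdout plus "child process exited abnormally".
    catch {exec grep -- $pattern $logfile | grep -- $stamp | wc -l} out
    # The first line of the captured text is still the number printed by wc -l.
    set first [string trim [lindex [split $out "\n"] 0]]
    if {[string is integer -strict $first]} {
        return $first
    }
    # Anything else (for example an unexpected error message) counts as zero.
    return 0
}

# Demo data: made-up log lines written to a throwaway file.
set log "/tmp/demo_apm_[pid].log"
set fh [open $log w]
puts $fh "Apr 13 09:43:10 host notice tmm: Reconnect"
puts $fh "Apr 13 09:43:20 host notice tmm: Reconnect"
puts $fh "Apr 13 09:44:01 host notice tmm: something else"
close $fh

set hits [count_matches "Reconnect" "Apr 13 09:43:" $log]
set none [count_matches "Reconnect" "Apr 13 09:50:" $log]
puts "09:43 -> $hits, 09:50 -> $none"
file delete $log
```

With a helper like this, a minute with no matching entries yields a plain 0 instead of the two-line error text, so the later expr on total does not fail. Whether this is the right approach for a given system is an assumption to verify; restoring the original "if {$errorcode} {set num_entries 0}" check is the other option, with the caveat that it also hides real errors such as a missing log file.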