Generic SNMP monitor using snmpget
Problem this snippet solves:
The following is a small script intended for use as an external monitor. It performs an snmpget request against the node, evaluates the response, and returns the server status based on the result.
Requirements:
- Command line arguments:
  - community - the RO snmp community
  - oid - the OID to request
  - expected_val - the expected value to indicate server is UP
OR
- Environment variables:
  - community - the RO snmp community
  - oid - the OID to request
  - expected_val - the expected value to indicate server is UP
I found that using the variables is simply clearer to configure and review; apart from that, the two approaches are 100% the same.
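A minimal sketch of how a script could accept either form, assuming the monitor variables arrive in the environment under the names community, oid and expected_val listed above. The helper name resolve_param is hypothetical and is not part of the script posted below.

```shell
#!/bin/sh
# resolve_param is a hypothetical helper, not part of the original script.
# It prints the environment value when it is set, otherwise the
# command-line value.
resolve_param() {
    env_value=$1
    arg_value=$2
    if [ -n "${env_value}" ]; then
        printf '%s\n' "${env_value}"
    else
        printf '%s\n' "${arg_value}"
    fi
}

# Arguments 1 and 2 are the node IP and port, so the optional
# values start at $3.
community=$(resolve_param "${community}" "$3")
oid=$(resolve_param "${oid}" "$4")
expected_val=$(resolve_param "${expected_val}" "$5")
```

With this approach the environment variable wins when both are supplied, which matches the preference for variables stated above.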
Enjoy,
Lahav Savir lahavs at exelmind dot com
Code :
#!/bin/sh

# application API
# --------------------
dest_ip=$(echo $1 | sed 's/::ffff://'); #IP (nnn.nnn.nnn.nnn notation or hostname)
dest_port=$2; #port (decimal, host byte order)

# you should configure the following variables on the monitor configuration or enable these lines to pass the arguments on the command line
# I just found that it's easier to do it through variables as it's more clear to review them
#community=$3; #community
#oid=$4; #oid
#expected_val=$5; #expected value (for setting monitor UP)

# configurations
# -------------------
pid=$$
timeout=2
up=UP
down=DOWN
loglevel=1
pidfile="/var/run/generic-snmp-monitor-${dest_ip}-${dest_port}-${community}-${oid}.pid"
log_file=/var/log/generic-snmp-monitor-${dest_ip}.log

function write_log () {
    echo "$(date +%Y-%m-%d) $(date +%T) ${pid} $*" >> $log_file
}

function init () {
    write_log "=== monitor started ==="
    if [ -f $pidfile ]; then
        write_log "${pidfile} exist"
        kill -9 `cat $pidfile` > /dev/null 2>&1
        err=$?
        write_log "PID:$(cat $pidfile) killed, error code:${err}"
    fi
    echo ${pid} > $pidfile
    write_log "setting ${pidfile} with pid: ${pid}"
    verify_param "dest_ip" "${dest_ip}"
    verify_param "dest_port" "${dest_port}"
    verify_param "community" "${community}"
    verify_param "oid" "${oid}"
    verify_param "expected_val" "${expected_val}"
}

function verify_param () {
    name=$1;
    val=$2;
    if [ "${name}" = "" ] || [ "${val}" = "" ]; then
        on_error "name:${name} or value:${val} not set"
    fi
    #else
    #   write_log "name:${name} val:${val}"
    #fi
}

function run_snmp_test () {
    cmd="snmpget -O qv -t ${timeout} -v2c -c ${community} ${dest_ip}:161 ${oid}"
    write_log "cmd=${cmd}"
    result=$(${cmd})
    err=$?
    if [ ${err} -ne 0 ]; then
        on_error "snmpget exited with error code:${err}"
    fi
    if [ "${result}" = "${expected_val}" ]; then
        write_log "OK - RESULT:${result} as expected"
        response ${up}
    else
        write_log "FAIL - RESULT:${result} != ${expected_val}"
        response ${down}
    fi
}

function response () {
    status=$1
    write_log "response status:${status}"
    if [ "${status}" = "${up}" ]; then
        echo ${status}
    fi
}

function cleanup () {
    write_log "deleting ${pidfile}"
    rm -f ${pidfile} >/dev/null
}

function on_error () {
    write_log "ERROR: $1"
    response ${down};
    cleanup;
    write_log "=== monitor aborted ==="
    exit -1
}

function main () {
    init;
    run_snmp_test;
    cleanup;
    write_log "=== monitor ended ==="
    exit 0
}

main;
- brad_11480 (Nimbostratus)
Why is this monitor failing when the 'echo' is done in function response ()?
I have re-downloaded this and the script still fails to complete. I have put it on another system and it fails to complete there as well.
If I log right after the echo, the entry doesn't appear in the log.
I have tried moving the echo response and it doesn't matter. It seems that as soon as the echo is issued the process finishes and nothing else is processed, including the removal of the pid file. The next time it runs it finds the file and errors trying to kill the process (that had finished!).
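One way to avoid the error on a leftover pid file is to test whether the recorded PID is still alive before killing it. This is only a sketch: pid_is_running and kill_previous_instance are hypothetical names that do not appear in the original script, and the pid file path is passed in as an argument.

```shell
#!/bin/sh
# pid_is_running is a hypothetical helper. kill -0 sends no signal;
# it only reports whether the PID exists and can be signalled.
pid_is_running() {
    [ -n "$1" ] && kill -0 "$1" 2>/dev/null
}

# kill_previous_instance is also hypothetical. It kills the recorded
# PID only if it is still alive, and always removes the old pid file.
kill_previous_instance() {
    pidfile=$1
    if [ -f "${pidfile}" ]; then
        old_pid=$(cat "${pidfile}")
        if pid_is_running "${old_pid}"; then
            kill -9 "${old_pid}" 2>/dev/null
        fi
        rm -f "${pidfile}"
    fi
}
```

Note that kill -0 only shows that some process has that PID; it does not prove the process is still the monitor, so removing the pid file before the script ends remains the safer fix.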
- brad_11480 (Nimbostratus)
Well, as soon as anything is written to standard output (the echo command) the script stops, so the cleanup never runs in this script.
This is really crazy. I spent most of the day figuring out something hoolio had already mentioned a few times (I found a template he posted and a comment in it that revealed it).
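Given the behavior described above (the script is stopped once it writes to stdout), the ordering could be sketched like this. finish_monitor is a hypothetical name, and the pid file path is passed in as an argument rather than taken from the original script's variables.

```shell
#!/bin/sh
# finish_monitor is a hypothetical helper, not part of the original script.
# It removes the pid file first and only then writes to stdout, so the
# cleanup is done even if nothing after the echo gets to run.
finish_monitor() {
    status=$1
    pidfile=$2
    rm -f "${pidfile}"
    # Only an UP result produces output; a DOWN result stays silent.
    if [ "${status}" = "UP" ]; then
        echo "UP"
    fi
}
```

The echo is deliberately the last statement, so no logging or cleanup depends on the script surviving past it.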
- brad_11480 (Nimbostratus)
The following is equivalent to the original submission in this post. The structure of that code is very nice and the check itself works, but the script stops before cleanup completes, leaving a stale pid file; on the next run that file could cause an unrelated process to be killed.
Using the template and adding the SNMP portion of the code from above results in:
#!/bin/bash

# Save as /usr/bin/monitors/custom_monitor.bash
# Make executable using chmod 700 custom_monitor.bash

# Use a custom shell command to perform a health check of the pool member IP address and port

# Log debug to local0.debug (/var/log/ltm)?
# Check if a variable named DEBUG exists from the monitor definition
# This can be set using a monitor variable DEBUG=0 or 1
if [ -n "$DEBUG" ]
then
   if [ $DEBUG -eq 1 ]; then echo "EAV `basename $0`: \$DEBUG: $DEBUG" | logger -p local0.debug; fi
else
   # If the monitor config didn't specify debug, enable/disable it here
   DEBUG=0
   #echo "EAV `basename $0`: \$DEBUG: $DEBUG" | logger -p local0.debug
fi

# Remove IPv6/IPv4 compatibility prefix (LTM passes addresses in IPv6 format)
IP=`echo $1 | sed 's/::ffff://'`

# Save the port for use in the shell command
PORT=$2

# commented below as they are now sent as variables from the monitor rather than arguments
#community=$3; #community
#oid=$4; #oid
#expected_val=$5; #expected value (for setting monitor UP)

# Check if there is a prior instance of the monitor running
pidfile="/var/run/`basename $0`.$IP.$PORT.pid"
if [ -f $pidfile ]
then
   kill -9 `cat $pidfile` > /dev/null 2>&1
   echo "EAV `basename $0`: exceeded monitor interval, needed to kill ${IP}:${PORT} with PID `cat $pidfile`" | logger -p local0.error
fi

# Add the current PID to the pidfile
echo "$$" > $pidfile

# Debug
if [ $DEBUG -eq 1 ]
then
   # Customize the log statement here if you want to log the command run or the output
   echo "EAV `basename $0`: Running for ${IP}:${PORT} using custom command" | logger -p local0.debug
fi

# Customize the shell command to run here.
# Use $IP and $PORT to specify which host/port to perform the check against
# Modify this portion of the line:
# nc $IP $PORT | grep "my receive string"
# And leave this portion as is:
# '> /dev/null 2>&1'
# The above code redirects stderr and stdout to nothing to ensure we don't errantly mark the pool member up

# Send the request and check the response
snmpget -On -c $community $IP $oid | grep "$expected_val" > /dev/null 2>&1

# Check if the command ran successfully
# Note that any standard output will result in the script execution being stopped
# So do any cleanup before echoing to STDOUT
if [ $? -eq 0 ]
then
   rm -f $pidfile
   if [ $DEBUG -eq 1 ]; then echo "EAV `basename $0`: Succeeded for ${IP}:${PORT}" | logger -p local0.debug; fi
   echo "UP"
else
   rm -f $pidfile
   if [ $DEBUG -eq 1 ]; then echo "EAV `basename $0`: Failed for ${IP}:${PORT}" | logger -p local0.debug; fi
fi
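One difference from the original submission is worth illustrating: the original compares the snmpget value to expected_val exactly, while the grep above matches any substring of the output line, including the numeric OID that -On prints. The sketch below shows that difference. The function names are hypothetical, and the sample line is illustrative rather than captured from a real device.

```shell
#!/bin/sh
# Both helpers are hypothetical and exist only to compare the two styles.

# Substring match, as done by piping the snmpget output through grep.
substring_match() {
    printf '%s\n' "$1" | grep -q -- "$2"
}

# Exact match, as done by the [ "${result}" = "${expected_val}" ] test
# in the original submission.
exact_match() {
    [ "$1" = "$2" ]
}

# Illustrative output line in the "-On" style: numeric OID, type, value.
sample_line='.1.3.6.1.2.1.2.2.1.8.1 = INTEGER: 2'
# Value-only form of the same illustrative response.
sample_value='2'

# An expected value of "1" matches the OID digits in the full line,
# even though the returned value is 2.
if substring_match "${sample_line}" "1"; then
    echo "substring match: would report UP"
fi
if ! exact_match "${sample_value}" "1"; then
    echo "exact match: would report DOWN"
fi
```

If the expected value is short or numeric, comparing against the value-only output (the original uses -O qv for this) is likely the safer choice.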