Generic SNMP monitor using snmpget

Problem this snippet solves:

The following is a small script intended for use as an external monitor. It runs an snmpget request against the node, evaluates the response, and reports the server status accordingly.

Requirements:

  • Command line arguments:
    • community - the RO snmp community
    • oid - the OID to request
    • expected_val - the expected value to indicate server is UP

OR

  • environment variables:
    • community - the RO snmp community
    • oid - the OID to request
    • expected_val - the expected value to indicate server is UP

I found that using variables makes the monitor clearer to configure and review; otherwise the two approaches behave identically.
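As a rough illustration of the two options above, the sketch below reads each setting from the environment and falls back to the positional argument when the variable is unset or empty. The function name `resolve_params` is hypothetical and not part of the original script; argument positions 3 to 5 follow the order used in the script's commented-out lines.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): prefer monitor variables
# from the environment, fall back to command line arguments 3 to 5.
resolve_params() {
    # $1 and $2 are the node IP and port supplied by the monitor framework
    community="${community:-$3}"
    oid="${oid:-$4}"
    expected_val="${expected_val:-$5}"
}
```

Calling `resolve_params "$@"` near the top of the script would let the same file work with either configuration style.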

Enjoy,

Lahav Savir lahavs at exelmind dot com

Code:

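#!/bin/bash
# bash is required: the script uses the "function" keyword, which plain POSIX sh does not support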

# application API
# --------------------
dest_ip=$(echo "$1" | sed 's/::ffff://') # IP (nnn.nnn.nnn.nnn notation or hostname), with the IPv4-mapped IPv6 prefix removed
dest_port=$2;#port (decimal, host byte order)

# Configure the following variables in the monitor configuration, or uncomment these lines to pass them as command line arguments.
# I found it easier to use variables, as they are clearer to review.
#community=$3; #community 
#oid=$4; #oid
#expected_val=$5; #expected value (for setting monitor UP)

# configurations
# -------------------
pid=$$
timeout=2
up=UP
down=DOWN
loglevel=1
pidfile="/var/run/generic-snmp-monitor-${dest_ip}-${dest_port}-${community}-${oid}.pid"
log_file=/var/log/generic-snmp-monitor-${dest_ip}.log

function write_log ()
{
echo "$(date +%Y-%m-%d) $(date +%T) ${pid} $*" >> $log_file
}

function init ()
{
write_log "=== monitor started ==="
if [ -f $pidfile ]; then
write_log "${pidfile} exist"
kill -9 `cat $pidfile` > /dev/null 2>&1
err=$?
write_log "PID:$(cat $pidfile) killed, error code:${err}"
fi
echo ${pid} > $pidfile
write_log "setting ${pidfile} with pid: ${pid}"

verify_param "dest_ip" "${dest_ip}"
verify_param "dest_port" "${dest_port}"
verify_param "community" "${community}"
verify_param "oid" "${oid}"
verify_param "expected_val" "${expected_val}"
}

function verify_param ()
{
name=$1; val=$2;
if [ "${name}" = "" ] || [ "${val}" = "" ]; then
on_error "name:'${name}' or value:'${val}' not set"
fi
#else
#  write_log "name:${name} val:${val}"
#fi
}

function run_snmp_test ()
{
cmd="snmpget -O qv -t ${timeout} -v2c -c ${community} ${dest_ip}:161 ${oid}"
write_log "cmd=${cmd}"
result=$(${cmd})
err=$?

if [ ${err} -ne 0 ]; then
on_error "snmpget exited with error code:${err}"
fi 

if [ "${result}" = "${expected_val}" ]; then
write_log "OK - RESULT:${result} as expected"
response ${up}
else
write_log "FAIL - RESULT:${result} != ${expected_val}"
response ${down}
fi
}

function response ()
{
status=$1
write_log "response status:${status}"
if [ "${status}" = "${up}" ]; then
echo ${status}
fi
}

function cleanup ()
{
write_log "deleting ${pidfile}"
rm -f ${pidfile} >/dev/null
}

function on_error ()
{
write_log "ERROR: $1"
response ${down};
cleanup;
write_log "=== monitor aborted ==="
exit 1
}

function main ()
{
init;
run_snmp_test;
cleanup;
write_log "=== monitor ended ==="
exit 0
}
main;
Published Mar 12, 2015
Version 1.0
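
The script above stores the snmpget command in a single string and expands it unquoted, which would split a community string that contains spaces. A minimal sketch of an alternative, assuming bash arrays are available; the function name `build_snmpget_cmd` and the array `snmp_cmd` are hypothetical, and the flags are the ones the script already uses.

```shell
#!/bin/bash
# Hypothetical helper: build the snmpget argument list as an array so each
# argument stays a single word even if it contains spaces.
build_snmpget_cmd() {
    local timeout="$1" community="$2" host="$3" oid="$4"
    snmp_cmd=(snmpget -O qv -t "$timeout" -v2c -c "$community" "${host}:161" "$oid")
}

# Usage (commented out so that the sketch runs without an SNMP agent):
# build_snmpget_cmd 2 "$community" "$dest_ip" "$oid"
# result=$("${snmp_cmd[@]}")
```

Expanding the array as `"${snmp_cmd[@]}"` passes every element as exactly one argument, so no extra quoting is needed at the call site.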
  • Why is this monitor failing when the 'echo' is done in function response ()?

    I have re-downloaded the script and it still fails to run to completion. I put it on another system and it fails to complete there as well.

    If I log right after the echo, the entry doesn't appear in the log.

    I have tried moving the echo response and it doesn't matter. It seems that as soon as the echo is issued the process finishes and nothing else is processed, including the removal of the pid file. The next time it runs it finds the file and errors trying to kill the process (that had finished!).
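
The failure described here leaves behind a pidfile that names a process which no longer exists. A minimal sketch of a guard, assuming the monitor runs with permission to signal its own earlier instances: `kill -0` sends no signal and only tests whether the PID exists. The function name `live_pid_from_file` is hypothetical, and the check does not protect against the PID having been reused by a different process.

```shell
#!/bin/bash
# Hypothetical helper: print the PID stored in a pidfile only if that PID
# currently belongs to a running process; return non-zero otherwise.
live_pid_from_file() {
    local pidfile="$1" pid
    [ -f "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric content: treat as stale
    esac
    kill -0 "$pid" 2>/dev/null || return 1   # signal 0 only tests for existence
    printf '%s\n' "$pid"
}
```

A caller could then use something like `old_pid=$(live_pid_from_file "$pidfile") && kill -9 "$old_pid"`, so a stale file is simply overwritten instead of triggering a kill.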

     

  • Well, as soon as anything is written to standard output (the echo command) the script stops, so the cleanup in this script never runs.

    This is really crazy. I spent most of the day figuring out what hoolio had already mentioned a few times (I found a template he posted and a comment that revealed it).
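
The behaviour described in this comment suggests a simple ordering rule: do all cleanup and logging first, and write to standard output only as the very last step. A minimal sketch under that assumption; the function name `finish` is hypothetical, and it takes the status and pidfile as arguments instead of using the original script's globals.

```shell
#!/bin/bash
# Hypothetical helper: remove the pidfile first, then report the status.
# Nothing may follow the echo, because the monitor process may be
# terminated as soon as it writes to stdout.
finish() {
    local status="$1" pidfile="$2"
    rm -f "$pidfile"
    if [ "$status" = "UP" ]; then
        echo "UP"
    fi
    # For any other status, print nothing: output on stdout is what marks the member up.
}
```

With this shape, both the success path and the error path can end in a single `finish` call followed by `exit`.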

     

  • The following provides the equivalent of the original submission in this post. While the structure of that code is very nice and the check itself works, the script stops before cleanup completes, leaving a dangling pid file that could later cause an unrelated process to be killed.

    Using the template and adding the SNMP portion of the code from above results in:

     

    #!/bin/bash

    # Save as /usr/bin/monitors/custom_monitor.bash
    # Make executable using chmod 700 custom_monitor.bash

    # Use a custom shell command to perform a health check of the pool member IP address and port

    # Log debug to local0.debug (/var/log/ltm)?
    # Check if a variable named DEBUG exists from the monitor definition
    # This can be set using a monitor variable DEBUG=0 or 1
    if [ -n "$DEBUG" ]
    then
       if [ $DEBUG -eq 1 ]; then echo "EAV `basename $0`: \$DEBUG: $DEBUG" | logger -p local0.debug; fi
    else
       # If the monitor config didn't specify debug, enable/disable it here
       DEBUG=0
       echo "EAV `basename $0`: \$DEBUG: $DEBUG" | logger -p local0.debug
    fi
     
    # Remove IPv6/IPv4 compatibility prefix (LTM passes addresses in IPv6 format)
    IP=`echo $1 | sed 's/::ffff://'`

    # Save the port for use in the shell command
    PORT=$2
    # Commented out below, as these are now sent as variables from the monitor rather than as arguments
    #community=$3; # community
    #oid=$4; # oid
    #expected_val=$5; # expected value (for setting monitor UP)
    
     
    # Check if there is a prior instance of the monitor running
    pidfile="/var/run/`basename $0`.$IP.$PORT.pid"
    if [ -f $pidfile ]
    then
       kill -9 `cat $pidfile` > /dev/null 2>&1
       echo "EAV `basename $0`: exceeded monitor interval, needed to kill ${IP}:${PORT} with PID `cat $pidfile`" | logger -p local0.error
    fi

    # Add the current PID to the pidfile
    echo "$$" > $pidfile

    # Debug
    if [ $DEBUG -eq 1 ]
    then
       # Customize the log statement here if you want to log the command run or the output
       echo "EAV `basename $0`: Running for ${IP}:${PORT} using custom command" | logger -p local0.debug
    fi
     
    # Customize the shell command to run here.
    # Use $IP and $PORT to specify which host/port to perform the check against
    # Modify this portion of the line:
    # nc $IP $PORT | grep "my receive string"
    # And leave this portion as is:
    # '> /dev/null 2>&1'
    # The above code redirects stdout and stderr to nothing to ensure we don't errantly mark the pool member up

    # Send the request and check the response
    # (-v2c added to match the original script; adjust if your agent uses another SNMP version)
    snmpget -On -v2c -c "$community" "$IP" "$oid" | grep "$expected_val" > /dev/null 2>&1

    # Check if the command ran successfully
    # Note that any standard output will result in the script execution being stopped
    # So do any cleanup before echoing to STDOUT
    if [ $? -eq 0 ]
    then
       rm -f $pidfile
       if [ $DEBUG -eq 1 ]; then echo "EAV `basename $0`: Succeeded for ${IP}:${PORT}" | logger -p local0.debug; fi
       echo "UP"
    else
       rm -f $pidfile
       if [ $DEBUG -eq 1 ]; then echo "EAV `basename $0`: Failed for ${IP}:${PORT}" | logger -p local0.debug; fi
    fi
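
One difference from the original submission is worth noting: the `grep` in this version succeeds on any substring match, so an expected value of `1` would also match a response of `10` (or a digit inside the numeric OID that `-On` prints), whereas the original script compared the whole value for equality. A minimal sketch of an exact comparison, assuming the value is fetched with `-O qv` as in the original so that only the value is printed; the function name `value_matches` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed only when the whole value equals the expected one.
value_matches() {
    local result="$1" expected="$2"
    [ "$result" = "$expected" ]
}

# Substring matching, as grep does it, accepts "10" for an expected "1":
#   echo "10" | grep -q "1"      succeeds
#   value_matches "10" "1"       fails
```

Whether substring or exact matching is preferable depends on the OID being polled; for integer status values the exact form avoids accidental matches.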