Forum Discussion

Benoit_Durand_1
Aug 31, 2018

Monitoring one node to control the status of another. Feasible?

This may sound oddball-ish, but is there a way to monitor the status on one node to control the status of another? Here is my example:

 

Pool with 4 "service" nodes providing customer connectivity. I have a 5th server whose only job is to monitor in detail the services of the first 4 and to post a web page if all is well. So, my monitoring server would have a web page indicating that service node 1 is "good", another indicating that service node 2 is good, etc. I'm just looking for a 200 OK from those, nothing more complex. However, if that page is not there for node 1, for example, I need to take that node out of the pool.

 

So, monitors on a 5th server that doesn't provide services, but controls inclusion in the pool of the 4 nodes that do.

 

Any leads on how to approach this?

 

Thanks.

 

  • Ben
  • There's a neat way to do this with iRules, and you can actually apply monitors to both pools.

    • Create a pool for your application servers and apply an appropriate monitor. Attach this pool to your VIP.

    • Create a separate pool for the 5th server and apply an appropriate monitor, but don't attach it to a VIP.

    • Apply an iRule like this to your VIP:

      when HTTP_REQUEST {
          if { [active_members "server5_pool"] < 1 } {
               # your 5th server monitor has failed - do something here
          }
      }
      

    So then if the 5th server monitor fails, the iRule can take some action, like presenting a maintenance page, or redirecting to a different URL.
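    As a concrete example of the "do something" step, the iRule could return a maintenance response when the status-server pool has no active members (the 503 page content here is just a hypothetical placeholder):

    ```tcl
    when HTTP_REQUEST {
        # No active members in the status-server pool means the detailed
        # health checks have failed - serve a maintenance response
        if { [active_members "server5_pool"] < 1 } {
            HTTP::respond 503 content "<html><body>Down for maintenance</body></html>" "Content-Type" "text/html"
        }
    }
    ```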

  • Would it not make more sense, then, to apply this logic to the main pool monitor? It'd be pretty straightforward to have an HTTP or external monitor poll the respective stat pages.
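    A rough sketch of how that might look: one HTTP monitor per app-pool member, each polling that member's status page on the monitoring server via the monitor's destination (alias) address. The 10.1.10.100 status-server address and the object names here are hypothetical:

    ```sh
    # One monitor per member, each aliased to the status server
    tmsh create ltm monitor http node1_status defaults-from http \
        send "GET /test1.html HTTP/1.1\r\nHost: status\r\nConnection: close\r\n\r\n" \
        recv "200 OK" destination 10.1.10.100:80

    # Attach the monitor to the matching member of the app pool
    tmsh modify ltm pool webserver-pool members modify { 10.1.10.90:80 { monitor node1_status } }
    ```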

     

  • Okay, just a thought, but it seems like you'd need some way to map a given test page to a pool member. I put the below together using an external monitor script applied to the 5th server (status server) pool.

    #!/bin/sh
    
    pidfile="/var/run/$MONITOR_NAME.$1..$2.pid"
    if [ -f $pidfile ]
    then
       kill -9 -`cat $pidfile` > /dev/null 2>&1
    fi
    echo "$$" > $pidfile
    
    # =================================
    # User-defined: app-server pool name
    web_server=webserver-pool
    
    # User-defined: array list to map each test page to its respective pool member
    arr=(
       "test1.html=10.1.10.90:80"
       "test2.html=10.1.10.91:80"
       "test3.html=10.1.10.92:80"
    )
    # =================================
    
    
    # This is the status server IP and port
    stat_server=`echo $1 | sed 's/::ffff://'`
    port=$2
    
    # Loop through the array
    for item in ${arr[*]}
    do
       test=$(cut -d '=' -f1 <<< $item)
       mbr=$(cut -d '=' -f2 <<< $item)
       status=$(curl -s -o /dev/null -w "%{http_code}" http://${stat_server}:${port}/${test})
       mbr_state=$(tmsh list ltm pool ${web_server} members { ${mbr} { state } } |grep state |awk -F" " '{ print $2 }')
    
       if [ "$status" != "200" ]
       then
          if [ "$mbr_state" == "up" ]
          then
             # Status is not 200 - if the pool member is up, send it down
             tmsh modify ltm pool ${web_server} members modify { ${mbr} { state user-down } }
          fi
       else
          if [ "$mbr_state" != "up" ]
          then
             # Status is 200 - if the pool member is down, send it up
             tmsh modify ltm pool ${web_server} members modify { ${mbr} { state user-up } }
          fi
       fi
    done
    
    rm -f $pidfile
    # Echo "up" to keep the status server pool alive
    echo "up"
    

    I'd also recommend pushing the monitor interval and timeout values out a bit, maybe 15-50 seconds.
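    For reference, wiring the script up would look roughly like this (the file path and object names are hypothetical; interval 15 / timeout 46 is shown only as an example of pushed-out values):

    ```sh
    # Import the script and create an external monitor that runs it,
    # then apply that monitor to the status-server pool
    tmsh create sys file external-monitor status_check_script source-path file:/var/tmp/status_check.sh
    tmsh create ltm monitor external status_check run status_check_script interval 15 timeout 46
    tmsh modify ltm pool status-server-pool monitor status_check
    ```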

    This script effectively loops through the array and requests each test page from the status server in order. If the result is not 200, the corresponding member from the app server pool is marked down. This could definitely use some tweaking, but should otherwise work.
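    Outside of a BIG-IP, the mapping-and-branching logic at the heart of the script can be exercised on its own - a minimal sketch with the curl and tmsh calls stubbed out (the addresses and stubbed status codes are hypothetical):

    ```shell
    #!/bin/bash
    # Map test pages to pool members, as in the monitor script above
    arr=(
       "test1.html=10.1.10.90:80"
       "test2.html=10.1.10.91:80"
    )

    # Stub standing in for the curl status check: pretend test1.html
    # returns 200 and anything else returns 404, to hit both branches
    get_status() {
       case "$1" in
          test1.html) echo 200 ;;
          *)          echo 404 ;;
       esac
    }

    for item in "${arr[@]}"
    do
       test=$(cut -d '=' -f1 <<< "$item")
       mbr=$(cut -d '=' -f2 <<< "$item")
       status=$(get_status "$test")
       if [ "$status" != "200" ]
       then
          echo "would mark ${mbr} user-down"
       else
          echo "would mark ${mbr} user-up"
       fi
    done
    ```

    Running it prints a "user-up" line for the stubbed-200 member and a "user-down" line for the other, confirming the page-to-member mapping drives the right branch.
    
    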