Forum Discussion

Omnix_TIMS_4122
Nimbostratus
Jan 11, 2015

Question on Interface fail-safe script

Dear all, can someone help me understand exactly what the script below does?

    shift
    shift

    interfaces="$*"

    interfaces="1/2.1 1/2.3 2/2.1"
    b interface show > /tmp/b_interface_show
    for i in $interfaces ; do
        status=$(grep "^ *$i " /tmp/b_interface_show | awk '{print $2}')
        logger -p local0.notice "interface $i is parsed and status is $status"

        if [ "$status" = "DN" ] ; then
            logger -p local0.notice "$MON_TMPL_NAME: interface $i is DOWN (status: $status)"
            for f in $interfaces ; do
                logger -p local0.notice "$MON_TMPL_NAME: bring down other interfaces"
                if [ $f != $i ] ; then
                    b interface $f disable
                fi
            done
            echo "failed" > /tmp/int_fail_state
            exit 1
        fi
        if [ "$status" = "UP" ] ; then
            state=$(cat /tmp/int_fail_state)
            if [ "$state" = "failed" ] ; then
                logger -p local0.notice "$MON_TMPL_NAME: interface $i is back UP (status: $status)"
                for f in $interfaces ; do
                    logger -p local0.notice "$MON_TMPL_NAME: bring UP interface $f in group ($interfaces)"
                    b interface $f enable
                done
                echo "ok" > /tmp/int_fail_state
            fi
        fi
    done

    # All specified interfaces are up...
    echo "up"
    exit 0

I have an active F5a and a standby F5b and need to set up interface fail-safe between them. When I applied the script on F5a (active) while interfaces 1/2.1 and 2/2.2 were still up, F5a changed to standby and the other F5 did not go active. I have another interface fail-safe monitor for interfaces 1/2.1 1/2.3 2/2.1. I need to understand this script clearly to find out why F5a went to standby.

One important point: these interfaces are connected to EFS routers that do not accept ICMP traffic (ping). Is it possible that the F5 sends test traffic (like ping) to check whether these interfaces are up, received no reply, marked them as "DOWN", and triggered the failover?

Many thanks.

2 Replies

  • Hi Omnix, the script does the following:

    shift 
    shift
    interfaces="$*"
    interfaces="1/2.1 1/2.3 2/2.1"
    b interface show > /tmp/b_interface_show
    for i in $interfaces
    do
        # Command substitution: capture the status column reported for this interface
        status=$(grep "^ *$i " /tmp/b_interface_show | awk '{print $2}')
        logger -p local0.notice "interface $i is parsed and status is $status"
        if [ "$status" = "DN" ]
        then
            logger -p local0.notice "$MON_TMPL_NAME: interface $i is DOWN (status: $status)"
            for f in $interfaces
            do
                logger -p local0.notice "$MON_TMPL_NAME: bring down other interfaces"
                if [ "$f" != "$i" ]
                then 
                    b interface $f disable
                fi 
            done
            echo "failed" > /tmp/int_fail_state
            exit 1
        fi
        if [ "$status" = "UP" ]
        then
            # Read the state recorded by an earlier run (empty if the file does not exist yet)
            state=$(cat /tmp/int_fail_state 2>/dev/null)
            if [ "$state" = "failed" ]
            then
                logger -p local0.notice "$MON_TMPL_NAME: interface $i is back UP (status: $status)"
                for f in $interfaces
                do
                    logger -p local0.notice "$MON_TMPL_NAME: bring UP interface $f in group ($interfaces)"
                    b interface $f enable
                done
                echo "ok" > /tmp/int_fail_state
            fi
        fi
    done
    

    A list of interfaces (probably on a VIPRION due to the leading blade number) is specified.

    The bigpipe command (b) to show the interface state is run. If one of the interfaces in the list is reported as down, all other interfaces in the list are administratively disabled.

    The file /tmp/int_fail_state will then contain the string "failed". If no interface is reported as down and the file /tmp/int_fail_state still indicates the "failed" state, all interfaces are re-enabled by the bigpipe command and /tmp/int_fail_state will contain the string "ok".

    As F5 does not provide a mechanism to track interface states, you need workarounds like this one if you really need to rely on this aspect.

    From my perspective there are much better built-in alternatives available:

    1. Since TMOS v10 you can use so-called HA groups and use trunk (aggregated link) states as a failover trigger for sub-second failover. In HA groups you can also monitor specific nodes, e.g. gateways, with a monitor of your choice.

    2. Since BIG-IP v4 you have VLAN failsafe and gateway failsafe to trigger a failover. Please keep in mind that VLAN failsafe and gateway failsafe may still result in standby/standby if none of the monitored resources are available.

    The "gateway failsafe" mechanism allows monitoring of specified resources (e.g. via ping). Two failsafe pools must be bound to the two different unit IDs (assuming you are still on v9 or v10, as you're using "bigpipe" in your script).

    Thanks, Stephan
  • Dear Stephan,

    Many thanks for your answer. The script worked as expected after restarting BIG-IP on each box.

    Thanks once again.
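
For anyone who wants to follow the logic described in the first reply without touching a live box, here is a minimal sketch in plain bash. It is not the original monitor. The function names (interface_list, get_status, check_group), the sample listing and the temporary paths are all hypothetical, and the listing format is only an assumed stand-in for what `b interface show` prints. The `b interface ... disable` / `enable` and `logger` calls are replaced by simple echo lines. Note that nothing in the script sends ping or any other test traffic; it only parses the status column.

```shell
#!/bin/bash
# Sketch only. Function names and the sample listing are hypothetical;
# the real script reads `b interface show` and calls
# `b interface <name> disable|enable` instead of echoing.

# Mirror `shift; shift; interfaces="$*"`: drop the first two arguments
# and treat everything that remains as the interface list.
interface_list() {
    shift 2
    echo "$*"
}

# Print the second column of the line that starts with the interface name.
# $1 = interface name, $2 = file holding the (assumed) listing
get_status() {
    grep "^ *$1 " "$2" | awk '{print $2}'
}

# Walk the group in order, as the original loop does.
# $1 = listing file, $2 = state file, remaining args = interface names
check_group() {
    local listing=$1 state_file=$2 i status
    shift 2
    for i in "$@"; do
        status=$(get_status "$i" "$listing")
        if [ "$status" = "DN" ]; then
            # Original: disable every other interface, then exit 1.
            echo "failed" > "$state_file"
            echo "down:$i"
            return 1
        fi
        if [ "$status" = "UP" ] && [ "$(cat "$state_file" 2>/dev/null)" = "failed" ]; then
            # Original: re-enable every interface in the group.
            echo "ok" > "$state_file"
            echo "recovered:$i"
        fi
    done
    echo "up"
    return 0
}

# Demo with an assumed listing in which 1/2.3 is down.
tmp=$(mktemp -d)
printf '%s\n' '1/2.1 UP' '1/2.3 DN' '2/2.1 UP' > "$tmp/listing"
group=$(interface_list 10.0.0.1 80 1/2.1 1/2.3 2/2.1)
# $group is intentionally unquoted so it splits into one argument per interface.
check_group "$tmp/listing" "$tmp/state" $group || echo "monitor would exit 1"
```

In the demo, 1/2.3 is reported as DN, so check_group records "failed" in the state file and returns non-zero. One detail the sketch makes visible: because the recovery branch runs per interface inside the loop, a group is re-enabled as soon as the first interface in the list shows UP, before the remaining interfaces have been checked.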