Forum Discussion

Don_22992
Nimbostratus
Jan 28, 2008

Email notification of node/vs down

I have searched for a while now and have not found a solution.

Alertd appears to be the method of sending an email alert; how do I tell it a node is down?

A custom scripted monitor also looked like a candidate.

I am not necessarily looking for the solution, just guidance on where to investigate further.
  • Here's a baseline for SEC rules that match F5 SNMP traps for node down and node up.

    # SUPPRESS MATCHES FROM EWBIGIP4 TO PREVENT DUPLICATE EMAILS
    # pattern: self IP of the standby F5 on the default gateway net
    type=suppress
    ptype=Substr
    pattern=xxx.xxx.xxx
    desc=Suppress Events for Traps from standby F5

    # Service DOWN
    type=single
    ptype=regexp
    pattern=(?i) (\S+) Pool member xxx.xxx.xxx:80 monitor status down.
    desc= WARNING HTTP Web Service on xxx.xxx.xxx DOWN.
    action=pipe '%s' /usr/bin/mail -s 'F5 ALERT: WMS Web Service Node DOWN' abc@123

    # Service UP
    type=single
    ptype=regexp
    pattern=(?i) (\S+) Pool member xxx.xxx.xxx:80 monitor status up.
    desc= NOTIFICATION HTTP Web Service on xxx.xxx.xxx UP.
    action=pipe '%s' /usr/bin/mail -s 'F5 NOTIFICATION: WMS Web Server Node UP' abc@123

     

    • David_Dennison
      Nimbostratus
      Not sure; we moved to Nagios to check our F5s. SEC should still work, though. The only thing to watch is that the trap message keeps the same format between versions. If it changes, update the regexp pattern to match the new trap message.
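
One way to check that a pattern still matches after an upgrade is to run it against a few captured messages. The sketch below does this in Python rather than in SEC. The address 192.0.2.10 stands in for the xxx.xxx.xxx placeholder, and the sample lines are invented for illustration; they are not taken from a real BIG-IP, so substitute lines from your own logs.

```python
import re

# Hypothetical pool member; replaces the xxx.xxx.xxx placeholder from the SEC rules.
MEMBER = r"192\.0\.2\.10:80"

# Same shape as the SEC patterns in this thread: case-insensitive, a leading
# non-space token, then the fixed "Pool member ... monitor status" text.
DOWN = re.compile(r"(?i) (\S+) Pool member " + MEMBER + r" monitor status down\.")
UP = re.compile(r"(?i) (\S+) Pool member " + MEMBER + r" monitor status up\.")


def classify(line):
    """Return 'down', 'up', or None for a single log or trap line.

    search() is used because the SEC rules above are not anchored to the
    start of the line either.
    """
    if DOWN.search(line):
        return "down"
    if UP.search(line):
        return "up"
    return None


# Invented sample lines (assumptions, not real BIG-IP output).
SAMPLE_DOWN = "Jan 28 10:15:01 bigip1 mcpd[1234]: 01070638:5: Pool member 192.0.2.10:80 monitor status down."
SAMPLE_UP = "Jan 28 10:20:44 bigip1 mcpd[1234]: 01070727:5: Pool member 192.0.2.10:80 monitor status up."
# A hypothetical reworded message, to show what a format change looks like.
SAMPLE_CHANGED = "Jan 28 10:15:01 bigip1 mcpd[1234]: 01070638:5: Pool /Common/web member /Common/192.0.2.10:80 monitor status down."

if __name__ == "__main__":
    for sample in (SAMPLE_DOWN, SAMPLE_UP, SAMPLE_CHANGED):
        print(classify(sample), "<-", sample)
```

A line that returns None where you expected 'down' or 'up' is the signal that the message format changed and the SEC pattern needs the same adjustment.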
  • Hi All

    Although I used the syntax from article K3667, I get the error ' Feb 22 19:47:42 localhost emerg logger: Re-starting alertd'.

    The error keeps repeating. I have to copy an empty user_alert.conf file into place to stop it.

    Below is the configuration I added to /config/user_alert.conf (permissions 644):
    alert BIGIP_LIBHAL_HALERR_BLADE_POWERED_OFF {  
        snmptrap OID=".1.3.6.1.4.1.3375.2.4.0.119";  
        lcdwarn description="Blade is about to be powered off." priority="4"  
        email toaddress="..."  
        fromaddress="..."  
        body="The test of this Solution worked!"  
    }  
    

    Thanks for your help.

    • Kevin_K_51432
      Historic F5 Account

      Greetings,

      I took a quick peek at the article and didn't notice any mention of lcdwarn or description.

      I removed those and it's working:

      alert BIGIP_LIBHAL_HALERR_BLADE_POWERED_OFF {  
          snmptrap OID=".1.3.6.1.4.1.3375.2.4.0.119";  
          email toaddress="..."  
          fromaddress="..."  
          body="The test of this Solution worked!"  
      }  
      

      Hopefully those aren't necessary for you?

      Kevin
    • aygitci_128716
      Nimbostratus

      Thanks for your reply. Yes, it's working now.

      However, I didn't receive the email. I'm wondering whether I chose the wrong SNMP OID.

      Do you know where I can find the OID that is triggered?

      Thanks again :)

    • Kevin_K_51432
      Historic F5 Account

      Sure thing!

      So, internally, TMM picks up the message "BIGIP_LIBHAL_HALERR_BLADE_POWERED_OFF".

      Then it sends the trap with that OID. You could actually type anything in there.

      Have you followed this article for configuring the SMTP mailhub on BIG-IP?

      https://support.f5.com/csp/article/K13180

      Lastly, if you are using domain names, have you set up a DNS name server on BIG-IP?

      Thanks,

      Kevin
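
If the alert fires but no mail arrives, the two points above (name resolution and reachability of the mail hub) can be checked separately. Here is a minimal Python 3 sketch, assuming SMTP on port 25 and run from a host that uses the same DNS servers as the BIG-IP. The helper names `resolve_mailhub` and `can_connect` are made up for this example and are not part of any F5 tooling.

```python
import socket


def resolve_mailhub(hostname, port=25, resolver=socket.getaddrinfo):
    """Return the sorted IP addresses that hostname resolves to, or [] if lookup fails.

    The resolver argument exists only so the lookup can be replaced in tests.
    """
    try:
        infos = resolver(hostname, port, 0, socket.SOCK_STREAM)
    except socket.gaierror:
        # Name could not be resolved (for example, no DNS server configured).
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def can_connect(address, port=25, timeout=5.0):
    """Return True if a TCP connection to address:port opens within the timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `resolve_mailhub("mail.example.com")` (a hypothetical mail hub name) returning an empty list points at DNS, while a resolved address for which `can_connect(address)` is False points at routing, a firewall, or the mail server itself.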