Technical Forum
Ask questions. Discover Answers.
F5 not sending health check to all members in a pool

Dave_Watts_1515
Nimbostratus

We have an active/standby setup. The standby unit is only sending health checks to 1 out of 10 members in a pool. I manually tested the connection (SMTP) and the F5 can connect to all members. When capturing traffic via tcpdump, you never see the traffic go out to 9 of the 10 members.

 

Anyone seen anything like this before?
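For anyone reproducing this, a typical way to confirm whether monitor traffic actually leaves the box is to capture on the member's address and service port. The member address and port below are placeholders for illustration:

```shell
# Capture on all interfaces (0.0) for one suspect pool member;
# replace 10.0.0.5 and port 25 with your member's address and service port.
tcpdump -ni 0.0 host 10.0.0.5 and port 25
```

If the monitor is running, you should see a probe roughly every monitor interval; silence here means bigd never attempted the check.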

 

11 REPLIES

Brad_Parker_139
Nacreous

Are the nodes marked available?

 

1 of 10 members is marked as available. All 10 of them are reachable via ICMP and telnet to the SMTP port from tmsh. The other 9 are marked as down by the monitor. When I watched via tcpdump, you can see the health checks going to the 1 server; however, you never see traffic to the other 9.

nitass
F5 Employee

Have you checked the MAC address? Is it correct?

 

Have you ever tried to restart bigd?
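For reference, restarting bigd (the monitoring daemon) is normally done from the BIG-IP shell; all monitors re-run once it comes back up:

```shell
# Restart the health-monitoring daemon on this unit.
bigstart restart bigd
```

On a clustered chassis, the same command can be run across all blades with `clsh bigstart restart bigd`.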

 

Brad_Parker
Cirrus

I've seen this before, but don't yet have an answer as to why it happens. Force them down on the standby and then re-enable them to force the monitor to retest.

 

Are you using a VLAN group?

 

Dave_Watts_1515
Nimbostratus

Forcing them offline and then re-enabling them fixed this. Is there a specific condition that causes the health check service/process to fail, requiring this manual intervention? Thanks for the tip!
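For anyone else trying this, the force-offline/re-enable cycle can be done from tmsh; the pool name and member address below are assumptions for illustration:

```shell
# Force the member offline so the monitor stops tracking it
# (replace smtp_pool and 10.0.0.5:25 with your own pool and member).
tmsh modify ltm pool smtp_pool members modify { 10.0.0.5:25 { state user-down session user-disabled } }

# Re-enable it, which forces the monitor to retest the member.
tmsh modify ltm pool smtp_pool members modify { 10.0.0.5:25 { state user-up session user-enabled } }
```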

 

Dave_Watts_1515
Nimbostratus

Argh. This fixed the standby; however, it broke the active unit.

 

I think it may be good to open a support case. It would be much appreciated if you update us on the outcome. 🙂

I'm having the same problem. I use the HTTP monitor and some members get stuck in "checking" status indefinitely; with tcpdump, the check attempt is never even made.

 

Tests with ping, telnet, and curl are all positive.

 

>> I already restarted bigd;

 

>> I forced the members offline and re-enabled them;

 

*** I believe a possible solution would be the touch /service/mcpd/forceload procedure followed by a reboot; however, doing this I would not find the root cause of the problem.
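For completeness, the forceload procedure referenced above tells the BIG-IP to rebuild the binary MCP database from the stored configuration files on the next boot:

```shell
# Flag mcpd to reload its database from the configuration files at next startup,
# then reboot the unit for the flag to take effect.
touch /service/mcpd/forceload
reboot
```

As noted, this is disruptive and can mask the root cause, so it is best kept as a last resort after support has collected diagnostics.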

 

Note: As a stopgap I enabled the TCP monitor and some members returned; however, some still have the problem.

 

Update: November 14, 2018

 

The "Logging Monitor" option enabled on a node was the trigger for the health check failure; it is a known bug, as described in article K06263705.

 

After disabling the logging monitor and restarting the bigd process (clsh bigstart restart bigd), the environment returned to normal.

 

Problem solved!