Forum Discussion
F5 not sending health check to all members in a pool
We have an active/standby setup. The standby unit is only sending health checks to 1 out of 10 members in a pool. I manually tested the connection (SMTP), and the F5 can connect to all members. When capturing traffic via tcpdump, you never see the health-check traffic go out to 9 of the 10 members.
Anyone seen anything like this before?
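For reference, the comparison between monitor traffic and a manual test can be reproduced from the BIG-IP shell along these lines; the interface, member IP, and port are placeholders, not values from the post:
    tcpdump -ni 0.0 host <member_ip> and port 25    # watch for outbound SMTP monitor probes on all VLANs
    telnet <member_ip> 25                           # manual SMTP connectivity test from the BIG-IP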
- Brad_ParkerCirrus
Are the nodes marked available?
- Dave_Watts_1515Nimbostratus
Only 1 of 10 members is marked as available. All 10 of them are reachable via ICMP and telnet to the SMTP port from the BIG-IP command line. The other 9 are marked as down by the monitor. Watching via tcpdump, you can see the health checks going to the 1 available server, but you never see any traffic to the other 9.
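For anyone checking the same thing, per-member availability and monitor status can be listed from tmsh with something like this (the pool name is a placeholder):
    tmsh show ltm pool <pool_name> members    # availability and monitor status per member
    tmsh show ltm node                        # node-level availability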
- nitassEmployee
Have you checked the MAC address? Is it correct?
Have you tried restarting bigd?
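A minimal sketch of those two checks, assuming standard BIG-IP tooling (the member IP is a placeholder):
    tmsh show net arp | grep <member_ip>    # confirm the learned MAC address for the member is correct
    bigstart restart bigd                   # restart the monitoring daemon; monitors briefly re-run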
- Brad_ParkerCirrus
I've seen this before, but don't yet have an answer as to why it happens. Force them down on the standby and then re-enable them to force the monitor to retest.
Are you using a VLAN group?
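The force-down / re-enable step can be done from tmsh roughly as follows; the pool name, member IP, and port are placeholders:
    # force the member offline so the stale monitor state is cleared
    tmsh modify ltm pool <pool_name> members modify { <member_ip>:25 { state user-down session user-disabled } }
    # re-enable the member so the monitor retests it
    tmsh modify ltm pool <pool_name> members modify { <member_ip>:25 { state user-up session user-enabled } }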
- Dave_Watts_1515Nimbostratus
Forcing them offline and then re-enabling them fixed this. Is there a specific condition that causes the health-check process to fail, requiring this manual intervention? Thanks for the tip!
- Dave_Watts_1515Nimbostratus
Argh. This fixed the standby; however, it broke the active unit.
- nitassEmployee
I think it may be good to open a support case. Much appreciated if you update us with the outcome. :-)
I'm having the same problem. I use the http monitor, and some members stay stuck in a checking status indefinitely; with tcpdump, the check attempt is never even made.
Tests with ping, telnet, and curl are all positive.
>> I have already restarted bigd;
>> I have forced the members offline and re-enabled them;
*** I believe a possible fix would be the touch /service/mcpd/forceload procedure plus a reboot (sketched after this post); however, doing that would not give me the root cause of the problem.
Note: As a workaround I enabled the tcp monitor and some members came back up; however, some still have the problem.
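For context, the steps mentioned above correspond roughly to the following shell commands; the forceload procedure makes mcpd rebuild its running configuration from the stored files at the next boot. This is a sketch, not the poster's exact commands:
    bigstart restart bigd            # restart the monitoring daemon
    touch /service/mcpd/forceload    # flag mcpd to reload the configuration on the next boot
    reboot                           # required for the forceload flag to take effect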
Update, November 14, 2018
The "Logging Monitor" option enabled on a node was the trigger for the health-check failure; it is a known bug, described in article K06263705.
After disabling the monitor logging and restarting the bigd process (clsh bigstart restart bigd), the environment returned to normal.
Problem solved!
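The fix described above maps approximately to the following; the node property name (logging) is my reading of the tmsh schema and should be verified on your version, while the bigd restart command is quoted from the post:
    tmsh modify ltm node <node_address> logging disabled    # assumption: disables monitor-instance logging on the node
    clsh bigstart restart bigd                              # restart bigd on all blades / cluster members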