Forum Discussion

ShaunSimmons
Mar 05, 2025

Standby Has Fewer Online VIPs Than Active – Requires Manual Monitor Reset

 

Hello F5 community,

I’ll preface this by saying that networking has been verified as fully routable between the Active and Standby units. Both devices can ping and SSH to each other’s Self-IPs, and rebooting the Standby did not resolve the issue.

Issue: Discrepancy in Online VIPs Between Active & Standby

Despite being In-Sync, the Active and Standby units show a different number of Online VIPs.

  • If I randomly select one or two VIPs that should be online, remove their monitors, and then re-add them, BOOM, the VIP comes online (roughly the tmsh toggle sketched just below this list).
  • The VIPs in question were both HTTPS (443).
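
For reference, the toggle is just removing and re-applying the monitor at the pool level, roughly like this in tmsh (the pool and monitor names below are placeholders, not my actual objects):

    tmsh modify ltm pool example_https_pool monitor none
    tmsh modify ltm pool example_https_pool monitor https
    tmsh show ltm pool example_https_pool members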

Side Note: Frequent TCP Monitor Failures

In my environment, I also frequently see generic ‘TCP’ monitors failing, leading to outages. While I understand that TCP monitoring alone isn’t ideal, my hands are tied as all changes must go through upper management for approval.

Has anyone encountered a similar issue where VIPs don’t come online until the monitor is manually reset? Any insights into potential root causes or troubleshooting steps would be greatly appreciated!

Thanks in advance.

 



  • Which of the two devices, active or standby, generally shows more online VIPs? Do you notice any patterns in terms of a particular pool or pool member(s) failing their health checks more often than others? Are you using the same nodes across a large number of pools? Do you have health monitors applied at the node level as well, or only at the pool / pool member level?
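
    One quick way to check where monitors are attached is to list just that property; something along these lines in tmsh:

    tmsh list ltm node monitor
    tmsh list ltm pool monitor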

     

    I would recommend checking out the following article which has good tips on how to troubleshoot health monitors:
    Troubleshooting health monitors

    From past experience, I have had cases where health check failures were attributed to the back-end pool member and other cases where the BIG-IP itself was the cause. By default, the BIG-IP uses the "bigd" daemon to send health check probes from the control plane. If you have a significant number of pools with health monitors applied, bigd could be getting overwhelmed. As the previously mentioned article describes, you can check the memory usage of the "bigd" daemon by running the following command:

     

    ps aux | grep bigd


    If you notice that the memory usage for this daemon is high, you may want to consider switching to "in-TMM" monitoring, which makes the BIG-IP send health check probes using TMM (the data plane) instead of bigd (the control plane).

    More information about in-TMM monitoring is available here.
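
    If I recall correctly, in-TMM monitoring is toggled with a sys db variable along these lines (double-check the exact key and the caveats in the linked article before changing it):

    tmsh modify sys db bigd.tmm value enable
    tmsh save sys config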

    • ShaunSimmons

      Thank you for the responses -
      Troubleshooting monitors is not the issue here; the problem I am experiencing is out of the ordinary.

       

      -=TMOS 17.1.1.3=-

      Totals: 126 VIPs, 138 pools, 220 nodes. No noticeable patterns.

           The VIPs and servers are all on their respective IP subnets (an uncomplicated network topology).
                Ex: VIP(s) 192.168.10.x / Server(s): 192.168.11.x

      "bigd": I am used to environments with over 2,000 VIPs and more than 10,000 nodes. In my current role, the LTMs are basically desk paperweights. bigd is in a good state.

       

      The nodes do not have a monitor configured; only at a pool level.
      - To test whether bigd was the problem, I switched monitoring to in-TMM. No dice.

      Below: both the active and standby units can reach the pool members with curl without issues (the port is open).

      [Screenshots: Standby - Monitor status / Active - Monitor status]
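
      For reference, the check from each unit was along these lines (the address is an example from the server subnet above, not one of my actual nodes):

      curl -vk https://192.168.11.10:443/ --connect-timeout 5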

       

  • Have you recently upgraded? Check for duplicate IPs as well. Our network team ran into this: the Standby showed the VIPs up while the Active showed them down. There wasn't really a fix; it started after an upgrade, and once the boxes were rebooted it stabilized on its own.
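
    If you want to rule out a duplicate IP quickly, one option is duplicate address detection from a Linux host on the VIP subnet, assuming iputils arping is available (the interface and address below are placeholders):

    arping -D -c 3 -I eth0 192.168.10.50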

  • Sounds like a possible issue with stale monitor states or a sync problem between the units. Have you tried forcing a full config sync and clearing the monitor stats? Also, check if there's any persistence in the monitoring cache causing this. If removing/re-adding the monitor fixes it, maybe something’s getting stuck in the process. For the TCP monitor failures, could be network latency, firewall interference, or just strict timeout settings. Might be worth tweaking those if possible.
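
    A rough tmsh sketch of those suggestions (the device group, pool, and monitor names are placeholders; create a custom TCP monitor rather than editing the built-in one):

    tmsh run cm config-sync force-full-load-push to-group example_device_group
    tmsh reset-stats ltm pool example_pool
    tmsh create ltm monitor tcp example_tcp_monitor interval 10 timeout 31
    tmsh modify ltm pool example_pool monitor none
    tmsh modify ltm pool example_pool monitor example_tcp_monitor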