GTM Pool Members Gone After Maintenance? It's Probably This One Setting
You finish a maintenance window, everything looks good on LTM, and then someone notices Wide IPs are resolving to fewer destinations than before. You check the GTM pools and the members are just... gone. The virtual servers are fine on LTM. GTM just doesn't know about them anymore — and more importantly, it doesn't remember that they were ever pool members.

This happens more often than it should, and it almost always comes back to the same thing: virtual-server-discovery enabled doing exactly what it was designed to do, at exactly the wrong moment.

What's Actually Going On

When virtual-server-discovery is set to enabled on a GTM server object, GTM keeps its view of LTM virtual servers in sync via iQuery. It automatically adds new virtual servers, updates existing ones, and — this is the part that causes problems — deletes virtual servers that LTM stops reporting.

That delete behavior is the issue. Any time iQuery reports zero virtual servers, even temporarily, GTM treats it as a mass deletion event. The virtual servers get pulled from the server object, and with them, their pool memberships. When LTM eventually reports those virtual servers again, GTM re-discovers them as brand-new objects with no memory of which pools they belonged to.

Two scenarios trigger this consistently.

Scenario 1: LTM Software Upgrade

This is the one that catches most people. During an upgrade, LTM reboots and goes through a phase where iQuery can connect but the full configuration hasn't finished loading yet. From GTM's perspective, LTM is reachable but reporting no virtual servers. GTM interprets that as a deletion event, clears out the discovered virtual servers, and empties the pools.

When LTM finishes loading and the virtual servers come back, GTM re-discovers them — but the pool memberships are gone. You're left manually rebuilding what was there before the maintenance window started. The telltale sign is pool members coming back in blue/CHECKING state.
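The mass-deletion behavior is easier to see as a toy model. This is plain Python invented purely for illustration, not anything from gtmd; the function and all the names are assumptions:

```python
# Toy model (not F5 code) of virtual-server-discovery "enabled".
# gtm_state maps VS name -> set of GTM pools that VS belongs to.
def sync_enabled(gtm_state, reported_vs):
    # Delete anything LTM stopped reporting, which is *everything*
    # when a mid-upgrade report comes back empty.
    for vs in list(gtm_state):
        if vs not in reported_vs:
            del gtm_state[vs]      # the pool memberships go with it
    # Returning virtual servers are re-added as brand-new objects
    # with no memory of their old pool bindings.
    for vs in reported_vs:
        gtm_state.setdefault(vs, set())
    return gtm_state

state = {"vs_app": {"pool_app"}, "vs_api": {"pool_api"}}
sync_enabled(state, [])                    # LTM rebooting: empty report
sync_enabled(state, ["vs_app", "vs_api"])  # LTM finishes loading
print(state)  # {'vs_app': set(), 'vs_api': set()} -- re-discovered, pools empty
```

The round trip through the empty report is the whole story: the virtual servers come back, the pool memberships do not.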
Blue/CHECKING only happens to newly discovered objects. GTM treated a returning virtual server as a brand-new one — because as far as it's concerned, it is. The GTM log won't show a deletion event, only the re-add. That gap in the logs is a known blind spot with virtual-server-discovery enabled, and it's exactly why the problem is hard to diagnose after the fact.

What you'll typically see in /var/log/gtm after the LTM comes back:

alert gtmd[xxxxx]: 011a1005:1: SNMP_TRAP: Pool your_pool state change green --> red (No enabled pool members available)
alert gtmd[xxxxx]: 011a3004:1: SNMP_TRAP: Wide IP your.wideip.example.com state change green --> red (No enabled pools available)

And then shortly after, the virtual servers re-appear in CHECKING state as GTM re-discovers them — but with no pool bindings.

Scenario 2: LTM HA Failover

This one surprises people because the LTM pair is still running — it's just switching active units. After a failover, the new active device may not have its iQuery connections fully re-established yet. GTM sees the iQuery state as inconsistent, virtual server status updates stop coming through, and members disappear from the discovered list.

What makes this harder to diagnose is that tmsh show gtm iquery may show "connected" — but connected doesn't mean the config sync is working correctly. In a GTM sync group, only the device assigned local ID 0 (the GTM with the lowest IP address) is responsible for writing auto-discovery results to the configuration. If that specific device loses its iQuery connection during the failover window, discovery events are missed entirely — even if every other GTM in the group can still reach the LTM.

So you can have a situation where five out of six GTMs look perfectly healthy, iQuery shows connected everywhere, and yet pool members are still disappearing — because the one device that matters for discovery is the one with the broken connection.
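Since local ID 0 goes to the sync-group member with the lowest IP address, you can sanity-check which device that should be. A quick sketch in plain Python (the IPs are made up; the sys db query on the box remains the authoritative answer):

```python
import ipaddress

# Hypothetical GTM sync-group self-IPs; substitute your own.
gtm_self_ips = ["10.1.1.100", "10.1.1.20", "10.1.1.9"]

# Compare numerically, octet by octet, not as strings: a plain string
# sort would put "10.1.1.100" before "10.1.1.20".
local_id_0 = min(gtm_self_ips, key=ipaddress.ip_address)
print(local_id_0)  # 10.1.1.9
```

The numeric comparison matters when self-IPs differ only in a late octet; eyeballing the list as text can point you at the wrong box.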
You can check which device in your sync group holds local ID 0 with:

tmsh list sys db gtm.peerinfolocalid

If that device's iQuery connection to the LTM is the one that dropped during the failover window, that's your answer — even if everything else looks fine.

The Fix: enabled-no-delete

Both scenarios share the same root cause: GTM's auto-delete behavior treating a temporary iQuery disruption as a permanent deletion event. The fix is the same for both:

gtm server /Common/site1-ltm {
    addresses {
        10.1.1.1 {
            device-name site1-ltm
        }
    }
    datacenter /Common/dc1
    monitor /Common/bigip
    virtual-server-discovery enabled-no-delete
}

With enabled-no-delete, GTM still auto-discovers new virtual servers and keeps existing ones updated. The only thing that changes is that it will never delete a virtual server just because LTM temporarily stopped reporting it. Your pool memberships survive both scenarios above.

| Mode | Adds new VS | Updates VS | Deletes VS | Pool memberships survive iQuery disruption? |
| --- | --- | --- | --- | --- |
| disabled | No | No | No | Yes — nothing changes |
| enabled | Yes | Yes | Yes | No — any disruption can empty pools |
| enabled-no-delete | Yes | Yes | No | Yes — preserved |

The Trade-Off

enabled-no-delete won't clean up after you when you intentionally decommission a virtual server on LTM. The stale GTM object stays in the discovered list until you remove it manually. In environments with a lot of VS churn, this can accumulate over time.

The question is which failure mode you'd rather manage: pool members silently disappearing during a maintenance window, or occasionally needing to clean up stale objects after a planned decommission. For most production environments, the latter is far easier to deal with — and far less likely to wake someone up at 2am.
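The three discovery modes can be double-checked with a small simulation. Again, this is plain Python invented for illustration, not taken from gtmd; it just encodes the add/update/delete rules described above and replays the empty-report event against each mode:

```python
# Toy model of one iQuery report under each virtual-server-discovery mode.
# discovered maps VS name -> set of pool memberships (invented structure).
def apply_report(mode, discovered, reported):
    if mode == "disabled":
        return discovered                  # no adds, updates, or deletes
    out = dict(discovered)
    if mode == "enabled":                  # only this mode deletes
        out = {vs: pools for vs, pools in out.items() if vs in reported}
    for vs in reported:                    # both enabled modes add new VS
        out.setdefault(vs, set())
    return out

before = {"vs_app": {"pool_app"}}
for mode in ("disabled", "enabled", "enabled-no-delete"):
    after = apply_report(mode, before, set())       # the empty report
    after = apply_report(mode, after, {"vs_app"})   # LTM comes back
    print(mode, after)
```

Only "enabled" comes out of the round trip with vs_app stripped of its pool membership; "enabled-no-delete" rides through the disruption untouched.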
How to Make the Change

Via tmsh:

tmsh modify gtm server /Common/site1-ltm \
    virtual-server-discovery enabled-no-delete
tmsh save sys config

Via GUI:

1. Go to DNS → GSLB → Servers
2. Select the server object
3. Set Virtual Server Discovery to Enabled (No Delete)
4. Click Update

This takes effect immediately and does not affect existing discovered virtual servers or current pool memberships.

Cleaning Up Stale Objects

When you intentionally decommission a virtual server on LTM, remove the leftover GTM object manually:

# List virtual servers under a GTM server object
tmsh list gtm server /Common/site1-ltm virtual-server

# Remove a specific stale entry
tmsh modify gtm server /Common/site1-ltm \
    virtual-servers delete { /Common/old-vs-name }
tmsh save sys config

Make this part of your standard VS decommission runbook and stale objects will never pile up.

Quick Diagnostic When Members Go Missing

Before assuming it's a discovery issue, check iQuery health across all GTM devices first:

tmsh show gtm iquery

Look for:

- State: should be connected for all entries
- Reconnects: a high count suggests instability even if the connection looks up
- Configuration Time: None means the config has never successfully synced from that LTM

Then confirm which GTM holds local ID 0 and verify its connectivity specifically:

tmsh list sys db gtm.peerinfolocalid

If the local ID 0 device is the one with the broken iQuery connection, that's your answer — regardless of what the other devices are showing.

Wrapping Up

Whether it's an LTM upgrade or an HA failover, the pattern is the same: iQuery goes quiet for a moment, GTM interprets silence as deletion, and your pool memberships are gone. It's working as designed — just not in a way that's useful to you.

enabled-no-delete is a one-line change that stops this from happening. The cleanup overhead it introduces is predictable and manageable. The alternative — rebuilding pool memberships after an unplanned event — is not.
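If you want to automate the Quick Diagnostic checklist above across a fleet, a rough triage sketch follows. Plain Python, not an F5 tool: the sample text is an invented approximation of tmsh show gtm iquery output, so adjust the field matching to whatever your version actually prints.

```python
# Invented approximation of `tmsh show gtm iquery` output (an assumption,
# not the guaranteed format).
sample = """\
Gtm::IQuery: 10.1.1.1
  State               connected
  Reconnects          47
  Configuration Time  None
"""

def flag_problems(text, reconnect_threshold=10):
    """Return (peer_ip, issue) tuples for suspect iQuery sessions."""
    problems, ip = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Gtm::IQuery:"):
            ip = line.split()[-1]
        elif line.startswith("State") and line.split()[-1] != "connected":
            problems.append((ip, "not connected"))
        elif line.startswith("Reconnects") and int(line.split()[-1]) > reconnect_threshold:
            problems.append((ip, "unstable: high reconnect count"))
        elif line.startswith("Configuration Time") and line.split()[-1] == "None":
            problems.append((ip, "config never synced from this peer"))
    return problems

print(flag_problems(sample))
# [('10.1.1.1', 'unstable: high reconnect count'), ('10.1.1.1', 'config never synced from this peer')]
```

Run against output collected from every GTM in the sync group, this catches the "connected but never synced" case that a quick glance at State alone will miss.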
Have you run into either of these scenarios in your environment? Drop a comment below, especially if you've seen the local ID 0 shift cause issues during a rolling GTM upgrade.

iQuery / BIG-IP DNS server certificate trust problem
Unable to establish iQuery between BIG-IP devices. Connectivity is in place, but it's failing with:

SSL error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed

I take this to be a certificate chain failure. The device certificates have been added to both DNS > GSLB > Servers > Trusted Server Certificates and System > Cert Mgmt > Device Cert Mgmt > Device Trust Certs. Yet, still no joy; running openssl confirms trust issues. Device certs are issued by a two-tier PKI (intermediate and root). BIG-IP is 13 HF 2. Any suggestions? Is it commonplace to be using internal certs here?

f5 Enterprise Manager fails to connect to LTMs
Hello, I have a handful of LTMs that can't communicate with EM; about half can talk to the EM and half can't. The LTMs are 11.4.1 and EM is 3.1.0. For the working half, the EM talks to the LTMs fine, with iQuery communication in the dump logs looking OK. On one of the LTMs in question, an engineer (still working on the case I already have open) found these errors a couple of days ago:

67 May 21 14:44:18 aprcorpextltm1 err eventd[8174]: 012d0012:3: Notification attempt to consumer id D6E738E8-1974-626A-2E52-EF1569494AD FAILED with error Failed to connect to host 10.58.1.124, port 443: Operation already in progress.
108 May 21 16:31:33 aprcorpextltm1 err eventd[8174]: 012d0012:3: Notification attempt to consumer id 7451CF6C-1974-F300-1696-9E58A25A09A FAILED with error Failed to connect to host 10.58.1.124, port 443: Operation already in progress.

Anyone run into this before? Thanks

iQuery Network Latency Guidelines
Are there any guidelines on maximum network latency for iQuery between GTMs and LTMs to work reliably? Any F5 best practices around directing iQuery communication over internal links between the systems, or over the Internet (via IPsec VPN tunnels) if spread across different data centers? Or are there restrictions on which interfaces iQuery can be established between the GTMs/LTMs?

Seeking easiest way to get iQuery grid working, with new GTM being added
Hi, all - we have a BIG-IP DNS controller we're trying to add to our iQuery grid, but it has a newer TMOS (and thus newer big3d) version than our existing GTMs and LTMs.

Existing GTMs: 11.5.2 final
New GTM: 11.5.4
Existing LTMs: 11.5.2 final (match the existing GTMs)

For logistical reasons, I'd like to avoid upgrading TMOS on all the existing equipment at this point. The articles say to make sure your GTMs are all at the same TMOS version - but I see in the answers that people are successful as long as all the GTMs have the same big3d version (and the LTMs have that big3d version or newer). Is that true, based on your experience, that GTMs can in fact be at slightly different versions, as long as the big3d installs match?

If so - and looking for the quickest/easiest path here - is it possible to install a back-leveled 11.5.2 version of big3d on the new GTM? The big3d_install script won't do it, because it detects the newer 11.5.4 version of big3d on the new GTM. Is there a manual procedure that will permit back-leveling of big3d?

If not - I assume it would work to upgrade just big3d on the existing GTMs and all the LTMs, to match the new GTM?

Barring that, is it necessary to back-level the TMOS version on the new GTM to 11.5.2, just to get that older version of big3d? That would involve a trip to the unmanned data center, to build it from scratch again, since the existing 11.5.4 config won't copy into the older 11.5.2 boot image.

Of course, eventually, we'll update all this equipment - but as a holding position, we would like to get this new GTM in with as little perturbation as possible. thx

iQuery not Returning VIP data from LTM
I have a GTM (12.0.0) and an LTM (12.0.0) set up and have gone through the steps to set up communication between them using bigip_add. When I run iqdump on the GTM, I see "server" data and "getconfig_element" data being returned, but not "vip" data. I ran tcpdump on the LTM and I see iQuery communications coming into and going out of the LTM. Based on this, I assume the trust is set up correctly. The LTM has a "Green" status and I can go directly to it and reach my web server. Problem is, the GTM shows the LTM object as "Red" and will not return the IP address. We verified the LTM object defined in the GTM has the correct IP address and port number. Has anyone seen this before? Thanks

choosing self-IPs for iQuery communication
I'm looking for advice on choosing self-IPs for iQuery communication between BIG-IP DNS (GTM) and LTM. We considered using the LTM virtual server self-IPs, but that net is the most likely to experience external (DDoS) attacks. We could configure a small network just for iQuery, but that seems like overkill. I'm curious to hear any thoughts/best practices on how self-IPs were chosen for bigip_add to exchange iQuery SSL certificates with a remote BIG-IP system and for ongoing iQuery communication.