GTM Pool Members Gone After Maintenance? It's Probably This One Setting
You finish a maintenance window, everything looks good on LTM, and then someone notices Wide IPs are resolving to fewer destinations than before. You check the GTM pools and the members are just... gone. The virtual servers are fine on LTM. GTM just doesn't know about them anymore — and, more importantly, it doesn't remember that they were ever pool members.

This happens more often than it should, and it almost always comes back to the same thing: virtual-server-discovery enabled doing exactly what it was designed to do, at exactly the wrong moment.

What's Actually Going On

When virtual-server-discovery is set to enabled on a GTM server object, GTM keeps its view of LTM virtual servers in sync via iQuery. It automatically adds new virtual servers, updates existing ones, and — this is the part that causes problems — deletes virtual servers that LTM stops reporting.

That delete behavior is the issue. Any time iQuery reports zero virtual servers, even temporarily, GTM treats it as a mass deletion event. The virtual servers get pulled from the server object, and with them, their pool memberships. When LTM eventually reports those virtual servers again, GTM re-discovers them as brand-new objects with no memory of which pools they belonged to.

Two scenarios trigger this consistently.

Scenario 1: LTM Software Upgrade

This is the one that catches most people. During an upgrade, LTM reboots and goes through a phase where iQuery can connect but the full configuration hasn't finished loading yet. From GTM's perspective, LTM is reachable but reporting no virtual servers. GTM interprets that as a deletion event, clears out the discovered virtual servers, and empties the pools.

When LTM finishes loading and the virtual servers come back, GTM re-discovers them — but the pool memberships are gone. You're left manually rebuilding what was there before the maintenance window started. The telltale sign is pool members coming back in blue/CHECKING state.
That only happens to newly discovered objects. GTM treated a returning virtual server as a brand new one — because as far as it's concerned, it is.

The GTM log won't show a deletion event, only the re-add. That gap in the logs is a known blind spot with virtual-server-discovery enabled, and it's exactly why the problem is hard to diagnose after the fact. What you'll typically see in /var/log/gtm after the LTM comes back:

```
alert gtmd[xxxxx]: 011a1005:1: SNMP_TRAP: Pool your_pool state change green --> red (No enabled pool members available)
alert gtmd[xxxxx]: 011a3004:1: SNMP_TRAP: Wide IP your.wideip.example.com state change green --> red (No enabled pools available)
```

And then, shortly after, the virtual servers reappear in CHECKING state as GTM re-discovers them — but with no pool bindings.

Scenario 2: LTM HA Failover

This one surprises people because the LTM pair is still running — it's just switching active units. After a failover, the new active device may not have its iQuery connections fully re-established yet. GTM sees the iQuery state as inconsistent, virtual server status updates stop coming through, and members disappear from the discovered list.

What makes this harder to diagnose is that tmsh show gtm iquery may show "connected" — but connected doesn't mean the config sync is working correctly. In a GTM sync group, only the device assigned local ID 0 (the GTM with the lowest IP address) is responsible for writing auto-discovery results to the configuration. If that specific device loses its iQuery connection during the failover window, discovery events are missed entirely — even if every other GTM in the group can still reach the LTM.

So you can have a situation where five out of six GTMs look perfectly healthy, iQuery shows connected everywhere, and yet pool members are still disappearing — because the one device that matters for discovery is the one with the broken connection.
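The "lowest IP wins" rule above is worth internalizing, because string comparison gets it wrong. Here is a minimal, hypothetical sketch (not F5 code) of which sync-group member would hold local ID 0, using proper numeric IP ordering:

```python
import ipaddress

def discovery_writer(sync_group_ips):
    """Return the member IP that would hold local ID 0.

    Assumption from the article: the GTM with the numerically lowest
    IP address in the sync group is assigned local ID 0 and is the
    only device that writes auto-discovery results.
    """
    # Compare as IP addresses, not strings: "10.1.1.10" sorts before
    # "10.1.1.2" lexically but comes after it numerically.
    return min(sync_group_ips, key=lambda ip: ipaddress.ip_address(ip))

# Example: a six-GTM sync group reduced to four members for brevity
members = ["10.2.0.15", "10.1.0.20", "10.1.0.5", "192.168.1.1"]
print(discovery_writer(members))  # -> 10.1.0.5
```

The takeaway: when triaging, identify this one device first, because its iQuery health is the only one that determines whether discovery events are recorded.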
You can check which device in your sync group holds local ID 0 with:

```
tmsh list sys db gtm.peerinfolocalid
```

If that device's iQuery connection to the LTM is the one that dropped during the failover window, that's your answer — even if everything else looks fine.

The Fix: enabled-no-delete

Both scenarios share the same root cause: GTM's auto-delete behavior treating a temporary iQuery disruption as a permanent deletion event. The fix is the same for both:

```
gtm server /Common/site1-ltm {
    addresses {
        10.1.1.1 {
            device-name site1-ltm
        }
    }
    datacenter /Common/dc1
    monitor /Common/bigip
    virtual-server-discovery enabled-no-delete
}
```

With enabled-no-delete, GTM still auto-discovers new virtual servers and keeps existing ones updated. The only thing that changes is that it will never delete a virtual server just because LTM temporarily stopped reporting it. Your pool memberships survive both scenarios above.

| Mode | Adds new VS | Updates VS | Deletes VS | Pool memberships survive iQuery disruption? |
|---|---|---|---|---|
| disabled | No | No | No | Yes — nothing changes |
| enabled | Yes | Yes | Yes | No — any disruption can empty pools |
| enabled-no-delete | Yes | Yes | No | Yes — preserved |

The Trade-Off

enabled-no-delete won't clean up after you when you intentionally decommission a virtual server on LTM. The stale GTM object stays in the discovered list until you remove it manually. In environments with a lot of VS churn, this can accumulate over time.

The question is which failure mode you'd rather manage: pool members silently disappearing during a maintenance window, or occasionally needing to clean up stale objects after a planned decommission. For most production environments, the latter is far easier to deal with — and far less likely to wake someone up at 2am.
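The three modes in the table differ only in how GTM reconciles what it knows against what LTM just reported. A small simulation (my own sketch, not F5 code) makes the failure mode concrete: when a mid-upgrade LTM reports zero virtual servers, enabled wipes GTM's view while enabled-no-delete leaves it intact.

```python
def reconcile(known, reported, mode):
    """Apply one iQuery discovery cycle to GTM's view of LTM virtual servers.

    known:    set of VS names GTM currently holds for the server object
    reported: set of VS names LTM sent in this iQuery update
    mode:     'disabled', 'enabled', or 'enabled-no-delete'
    """
    if mode == "disabled":
        return set(known)                  # discovery off: nothing changes
    if mode == "enabled":
        return set(reported)               # adds AND deletes: mirrors LTM exactly
    if mode == "enabled-no-delete":
        return set(known) | set(reported)  # adds only: never deletes
    raise ValueError(f"unknown mode: {mode}")

vs = {"vs_app1", "vs_app2"}
# Mid-upgrade, LTM briefly reports zero virtual servers:
print(reconcile(vs, set(), "enabled"))            # -> set()  (pools emptied)
print(reconcile(vs, set(), "enabled-no-delete"))  # both VSs kept
```

Note that `enabled` is not wrong per se — mirroring LTM exactly is the designed behavior. The problem is that a transient empty report is indistinguishable from a mass decommission.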
How to Make the Change

Via tmsh:

```
tmsh modify gtm server /Common/site1-ltm \
    virtual-server-discovery enabled-no-delete
tmsh save sys config
```

Via GUI:

1. Go to DNS → GSLB → Servers
2. Select the server object
3. Set Virtual Server Discovery to Enabled (No Delete)
4. Click Update

This takes effect immediately and does not affect existing discovered virtual servers or current pool memberships.

Cleaning Up Stale Objects

When you intentionally decommission a virtual server on LTM, remove the leftover GTM object manually:

```
# List virtual servers under a GTM server object
tmsh list gtm server /Common/site1-ltm virtual-server

# Remove a specific stale entry
tmsh modify gtm server /Common/site1-ltm \
    virtual-servers delete { /Common/old-vs-name }
tmsh save sys config
```

Make this part of your standard VS decommission runbook and stale objects will never pile up.

Quick Diagnostic When Members Go Missing

Before assuming it's a discovery issue, check iQuery health across all GTM devices first:

```
tmsh show gtm iquery
```

Look for:

- State: should be connected for all entries
- Reconnects: a high count suggests instability even if the connection looks up
- Configuration Time: None means the config has never successfully synced from that LTM

Then confirm which GTM holds local ID 0 and verify its connectivity specifically:

```
tmsh list sys db gtm.peerinfolocalid
```

If the local ID 0 device is the one with the broken iQuery connection, that's your answer — regardless of what the other devices are showing.

Wrapping Up

Whether it's an LTM upgrade or an HA failover, the pattern is the same: iQuery goes quiet for a moment, GTM interprets silence as deletion, and your pool memberships are gone. It's working as designed — just not in a way that's useful to you.

enabled-no-delete is a one-line change that stops this from happening. The cleanup overhead it introduces is predictable and manageable. The alternative — rebuilding pool memberships after an unplanned event — is not.
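One last diagnostic aid: the SNMP_TRAP log lines quoted earlier follow a regular shape, so you can pull the green-to-red transitions out of a saved /var/log/gtm copy. This is a hedged sketch based only on the two message formats shown in this article, not a complete parser for every gtmd log line:

```python
import re

# Matches the SNMP_TRAP lines quoted earlier in the article; the
# message IDs and wording there are the only formats this assumes.
TRAP = re.compile(r"SNMP_TRAP: (Pool|Wide IP) (\S+) state change green --> red")

def red_transitions(log_lines):
    """Return (object_type, name) for each green->red transition found."""
    hits = []
    for line in log_lines:
        m = TRAP.search(line)
        if m:
            hits.append((m.group(1), m.group(2)))
    return hits

sample = [
    "alert gtmd[1234]: 011a1005:1: SNMP_TRAP: Pool your_pool state change green --> red (No enabled pool members available)",
    "alert gtmd[1234]: 011a3004:1: SNMP_TRAP: Wide IP your.wideip.example.com state change green --> red (No enabled pools available)",
]
print(red_transitions(sample))
# -> [('Pool', 'your_pool'), ('Wide IP', 'your.wideip.example.com')]
```

A cluster of these transitions immediately after an LTM reboot or failover, with no corresponding deletion events, is the signature described above.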
Have you run into either of these scenarios in your environment? Drop a comment below, especially if you've seen the local ID 0 shift cause issues during a rolling GTM upgrade.

Modernizing F5 Platforms with Ansible
I've been meaning to publish this article for some time now. Over the past few months, I've been building Ansible automation that I believe will help customers modernize their F5 infrastructure. This is especially true for those looking to migrate from legacy BIG-IP hardware to next-generation platforms like VELOS and rSeries.

As I explored tools like F5 Journeys and traditional CLI-based migration methods, I noticed a significant amount of manual pre-work was still required. This includes:

- Ensuring the Master Key used to encrypt the UCS archive is preserved and securely handled
- Storing the UCS, Master Key, and information assets on a backup host
- Pre-configuring all VLANs and properly tagging them on the VELOS partition before deploying a Tenant OS

To streamline this, I created an Ansible playbook with supporting roles tailored for Red Hat Ansible Automation Platform. It's built to perform a lift-and-shift migration of an F5 BIG-IP configuration from one device to another—with optional OS upgrades included. In the demo video below, you'll see an automated migration of an F5 i10800 running 15.1.10 to a VELOS BX110 Tenant OS running 17.5.0—demonstrating a smooth, hands-free modernization process.

Currently Working

VELOS:

- VELOS Controller/Partition running F5OS-C 1.8.1, which allows the Tenant Management IP to be in a different VLAN
- Migrates a standalone F5 BIG-IP i10800 to a VELOS BX110 Tenant OS
- VLAN'ed source tenant required (doesn't support non-VLAN tenants)

rSeries:

- Shares MGMT IP with the same subnet as the Chassis Partition
- Migrates a standalone F5 BIG-IP i10800 to an R5000 Tenant OS
- VLAN'ed source tenant required (doesn't support non-VLAN tenants)

Handles:

- Configuration and crypto backup
- UCS creation, transfer, and validation
- F5OS system VLAN creation and association to the tenant (does not manage interface-to-VLAN mapping)
- F5OS tenant provisioning and deployment
- Inline OS upgrades during the migration

Roadmap / What's Next

- Expanding testing to include Viprion/iSeries (vCMP) tenants
- Supporting hardware-to-virtual platform migrations
- Adding functionality for HA (High Availability) environments

Watch the Demo Video

View the Source Code on GitHub: https://github.com/f5devcentral/f5-bd-ansible-platform-modernization

This project is built for the community—so feel free to take it, fork it, and expand it. Let's make F5 platform modernization as seamless and automated as possible.
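On the "UCS creation, transfer, and validation" step above: one simple way to validate a transfer is to compare a digest of the archive before and after it lands on the backup host. This is my own illustrative sketch of that idea, not code from the project's repository:

```python
import hashlib

def sha256sum(path, chunk=1 << 20):
    """Hash a UCS archive in 1 MiB chunks so large archives aren't read into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def transfer_ok(source_path, copied_path):
    """True if the copied archive is byte-identical to the source."""
    return sha256sum(source_path) == sha256sum(copied_path)
```

In an Ansible context the same check maps naturally to capturing a checksum on the source device and asserting it matches on the backup host before the original is touched.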
BIG-IP 16.1.x End of Technical Support July 31, 2025
Hello, Community! I wanted to share an important update regarding BIG-IP 16.1.x. As of July 31, 2025, this version will officially reach End of Technical Support (EoTS).

If you are on version 16.1.x and haven't started planning your upgrade, now is the perfect time. Keeping your system on supported software ensures continued technical and software development support, and planning ahead fosters a smooth transition.

To help you navigate this update, I have compiled a list of Knowledge Articles that can assist in planning your upgrade:

- K000139937: BIG-IP 15.1.x and 16.1.x are reaching End of Technical Support
- K5903: BIG-IP software support policy
- K84554955: Overview of BIG-IP system software upgrades
- K13845: Overview of supported BIG-IP upgrade paths and an upgrade planning reference
- K18074701: iHealth Upgrade Advisor
- K7727: License activation may be required before a software upgrade for BIG-IP
- K16022: Opening a proactive service request with F5 Support

If you have any questions, please feel free to leave them below or contact F5 Support for customized assistance. Together we can keep your systems secure, supported, and optimized.

R-Series after upgrading to 1.8 - RADIUS Auth stopped working
RADIUS user authentication was working just fine while running v1.4.0. After upgrading to 1.8.0, any attempt results in "Failed authentication." Running tcpdump does not show any traffic going to the RADIUS server, and the RADIUS server has no entry for the failures in its log.

I have deleted and recreated the RADIUS server group - that did not help. I have deleted and recreated users - that did not help. Any guidance on what to try next is appreciated.

Dave

F5 BIG-IP 3900 (C106) - Software Upgrade Advisor not working
Hello,

I need to upgrade a 3900 (C106) from version 11.5.4 HF2 to 12.1.3.7. I have uploaded the QKView to the iHealth tool to run the Software Upgrade Advisor, but it prompts the following message: "You are already running the latest version." The menu to select the version to upgrade to does not appear.

I have checked that this model can be upgraded to this version in the compatibility matrix. In fact, in the Diagnostics section of the Status page in iHealth, the following upgrade options appear:

Upgrade Options:
- 11.5.4.HF4 (hotfix)
- 11.5.9 (stability release)
- 12.1.3.7 (latest release)

Any idea why this is happening? Thanks

Upgrade F5 BIG-IP
Dear Team,

I hope you are all doing well. I want to upgrade my BIG-IP tenant from 17.1.1.3 Build 0.70.5 to the new 17.1.2, and when I try to download the software there are several options:

- 17.1.2
- 17.1.2_Tenant_F5OS

I want to know what the difference is between these two. Just to let you know, my setup is like this: rSeries 2600 ---> F5OS ---> BIG-IP.

Can you please clearly let me know which one I should follow, and what the use cases are? Are both valid for my setup? Please find the picture attached, and also the URL below.

Appreciate your support.

Regards,

[APM] - Error: failed to reset strict operations; disconnecting from mcpd
Hello Experts,

When we try to verify the sys config with the command "load sys config verify", we get the error message below, followed by a restart of services. See Bug ID 997793 (f5.com). We tried removing the old epsec-package file and restarting, but no luck. Can anyone please advise?

```
Validating configuration...
/config/bigip_base.conf
/config/bigip_user.conf
/config/bigip.conf
/config/bigip_script.conf
Error: failed to reset strict operations; disconnecting from mcpd. Will reconnect on next command.
The connection to mcpd has been lost, try again.
```

R-Series Appliance No GUI (host system GUI) after 1.7.0 upgrade
After running the 1.7.0 upgrade on the rSeries r5000 appliance, the login screen does not display in a browser. Admin and root passwords are as they were before the upgrade. The mgmt-ip settings are the same. The device can ping its upstream router on the mgmt-ip network, but the device's mgmt-ip address does not reply to ping (or any request: ssh, http, ...). From the console I can see the whole running config - it is the same as before the upgrade.

Any help is appreciated,
Dave Mehlberg

F5OS R4800 upgrade to 1.7.0
Hi All,

I have installed an F5 r4800 platform in our test environment, and now I want to upgrade it to 1.7.0. After uploading the new image via the GUI, it stays in "Signature Verification Failed" status. I have installed an r5900 platform in production, and there I had no issue upgrading to version 1.7.0.

I don't know whether the platform performs the verification through the internet or not. Does anyone know, by chance?

Thanks in advance.