Forum Discussion

jomedusa
Jul 05, 2023

LTM Traffic group failover

We have 4 LTMs in a device group and moved one of our traffic groups to a different LTM a few weeks ago. All 4 devices show padf5-core1.csiweb.com (the LTM it was moved to) as the active device, but there are no stats or connections present on that LTM. The original LTM it was moved from still has all the traffic, and I have traced MAC addresses to that device. I have failed over/moved traffic groups many, many times without issues. I opened a case with support, which has not gotten very far. From the QKVIEWs they say that both devices are "active" and that I need to disable the VSs on the original for it to work properly. I don't understand this, as I have never had to do this in the past, and if I were to disable the VSs on any inactive LTM, how would failover work in the event of an outage? I have requested a support session with F5 on this issue but haven't received a response at this time.

FYI, both LTMs in question have other traffic groups running properly on them.

Thanks,

Joe

  • You may have a split-brain situation where config sync and network failover are not actually working between all 4 devices. Do you have all 4 devices within the same device group/traffic group? When you run iqdump on each unit, do you see the other 3 units communicating properly? Can you provide a screenshot of the sync page, where the F5 units that are part of the device group are listed with their current status (i.e., grey or green balls)?

    • jomedusa

      My ultimate goal is to have only 2 devices in the traffic group for failover, as we will be decommissioning 2 of the LTMs later this year.

      From the image below, I would ultimately like to have only padf5-core1 and padf5-core2 in the traffic group, but currently all the traffic is on Pad-F5-2.

       

      Can you please provide more information on the iqdump?

      Thanks,

      • whisperer

        Do you have more than one traffic group configured? That is another instance where you would have more than one F5 unit active. If the traffic groups are not on the same unit, then multiple units may be active. If you need to move connections ASAP, just force offline the problematic F5 unit.

        It looks like the devices all see each other and the configs are synced, so there shouldn't be any issue with network failover, the port lockdown settings, or iQuery comms in this case. (If you log into each unit and the standby/active devices all look the same in the GUI, i.e., connected, with the same grey or green ball colors, then connectivity-wise you should be OK.) I would check the traffic groups.
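The "more than one unit active" check described above can be sketched in a few lines of Python. This is only an illustration: the `reports` list is hypothetical data typed in by hand from what each unit shows for its traffic groups (the device and traffic group names come from this thread, the states are made up), and the function name is not part of any F5 tool.

```python
from collections import defaultdict

# Hypothetical snapshot: (device, traffic group, state) as reported by each unit.
# The states here are invented for illustration, not taken from the real devices.
reports = [
    ("padf5-core1", "CORESERVICES", "active"),
    ("Pad-F5-2", "CORESERVICES", "active"),
    ("padf5-core2", "CORESERVICES", "standby"),
    ("padf5-core1", "traffic-group-1", "standby"),
    ("Pad-F5-2", "traffic-group-1", "active"),
]


def groups_with_multiple_active(reports):
    """Return {traffic_group: [devices]} for groups active on more than one device."""
    active = defaultdict(list)
    for device, group, state in reports:
        if state == "active":
            active[group].append(device)
    # A traffic group should be active on exactly one device at a time.
    return {g: sorted(devs) for g, devs in active.items() if len(devs) > 1}


if __name__ == "__main__":
    for group, devices in groups_with_multiple_active(reports).items():
        print(f"{group} is active on more than one device: {', '.join(devices)}")
```

Different traffic groups being active on different units is normal; the case worth investigating is a single traffic group that shows as active on two units at once.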

  • I am still having the issue. It appears that one specific traffic group will go "Active" on another LTM, but the traffic still seems to be "stuck" on the original active device. I failed a different traffic group over to another LTM without issue this morning, but when I tried to move the CORESERVICES one last night, the MAC/ARP entries were still present on the original F5.

    I had a case open with F5, and they stated that I needed to disable the VIPs on the existing LTM after I migrate the traffic group (force it to standby). But since the devices are in a config sync group, that would shut down the VIPs on all LTMs. I am lost at this point. I am going to work on moving all traffic off the existing LTM and then rebooting it to see if that clears the issue.

    Any ideas would be appreciated.

    Thanks,

    Joe

  • I think I found the issue, but I am not completely sure how it got this way or exactly how to resolve it. It appears the only way this traffic group will fail over to another LTM is when "traffic-group-1" is moved as well. I looked at Virtual Servers : Virtual Address List, and all of the addresses are associated with "traffic-group-1" instead of the proper traffic group.

    I need to figure out how we got into this state and whether the fix is as simple as changing each one to the proper traffic group. I would also like to determine how to prevent this issue going forward.

    Thanks,

    Joe
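The audit described in the last post (finding virtual addresses that sit in "traffic-group-1" instead of their intended traffic group) could be sketched roughly as follows. Everything here is an assumption for illustration: the addresses are placeholders from a documentation range, the `intended_group` mapping is hypothetical, and the `trafficGroup` field name and the generated `modify ltm virtual-address ... traffic-group ...` lines should be checked against the iControl REST and tmsh references for your BIG-IP version before use.

```python
# Hypothetical export of virtual addresses and their current traffic group.
# Addresses are placeholders; the "trafficGroup" key is an assumed field name.
virtual_addresses = [
    {"name": "192.0.2.10", "trafficGroup": "/Common/traffic-group-1"},
    {"name": "192.0.2.11", "trafficGroup": "/Common/CORESERVICES"},
    {"name": "192.0.2.12", "trafficGroup": "/Common/traffic-group-1"},
]

# Hypothetical mapping of each address to the traffic group it should float with.
intended_group = {
    "192.0.2.10": "/Common/CORESERVICES",
    "192.0.2.11": "/Common/CORESERVICES",
    "192.0.2.12": "/Common/traffic-group-1",
}


def find_mismatches(addresses, intended):
    """Return (name, current, wanted) for addresses in the wrong traffic group."""
    mismatches = []
    for va in addresses:
        wanted = intended.get(va["name"])
        if wanted is not None and va["trafficGroup"] != wanted:
            mismatches.append((va["name"], va["trafficGroup"], wanted))
    return mismatches


def proposed_commands(mismatches):
    """Build tmsh-style lines for review only; nothing is executed here."""
    return [
        f"modify ltm virtual-address {name} traffic-group {wanted}"
        for name, _current, wanted in mismatches
    ]


if __name__ == "__main__":
    for line in proposed_commands(find_mismatches(virtual_addresses, intended_group)):
        print(line)
```

The sketch only prints proposed changes so they can be reviewed before anything is modified on a device in a sync group.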