Forum Discussion

InquisitiveMai
Apr 18, 2023
Solved

Rebooting HA pair

Is an outage expected when an HA failover happens?

I'm trying to figure out what would cause an outage in an HA pair when the configuration is in sync and the following process is followed:

 

F5A (Active) and F5B (Standby)

1) HA failover happens: F5A is forced to standby under the traffic group

I was running a ping test and only one ping dropped.

2) F5A (now standby) is rebooted from the GUI and comes up as standby

3) F5B is forced to standby and a ping to a VIP is dropped

Both boxes are rebooted one at a time, i.e. only the standby device is rebooted.

In this scenario, the expectation is no outage or a minimal outage. But since an outage is seen, I'm wondering whether something needs to be changed on the client, or whether there is anything we can do on the F5.
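A single dropped ping only shows that one probe was lost. To see how long the VIP was actually unreachable, it can help to probe more often and over TCP. The following is a minimal sketch, not an F5 tool: the function names, the probe interval, and the address `192.0.2.10` (a documentation-range placeholder for the VIP) are all assumptions made for illustration.

```python
import socket
import time


def probe_tcp(host, port, timeout=1.0):
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def longest_outage(samples):
    """Return the longest outage in seconds.

    samples is a time-ordered list of (timestamp_seconds, ok) tuples.
    An outage runs from the first failed probe to the next successful
    one, or to the last sample if the service never recovers.
    """
    longest = 0.0
    start = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts  # outage begins
        elif ok and start is not None:
            longest = max(longest, ts - start)  # outage ends
            start = None
    if start is not None and samples:
        longest = max(longest, samples[-1][0] - start)
    return longest


def monitor(host, port, duration=60.0, interval=0.2):
    """Probe host:port every `interval` seconds for `duration` seconds."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), probe_tcp(host, port)))
        time.sleep(interval)
    return samples
```

Start something like `longest_outage(monitor("192.0.2.10", 443))` just before forcing the active unit to standby. Note that each probe opens a new connection, so this measures reachability of the VIP, not whether already established connections survive the failover.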

 

 

  • G-Rob
    Apr 21, 2023

    MAC Masquerade means that all of the self IP addresses on the F5 unit will share a single MAC address, instead of having a unique MAC per VLAN. That greatly speeds up the MAC learning process on the upstream device as it only needs to learn a single MAC for the entire appliance to move within the CAM table. 

    However, the MAC does change during a failover, and the F5 will send out Gratuitous ARPs (GARPs) that notify adjacent devices of the L2 change. You can tune how fast the GARP flood starts and how long it continues using database variables.

    For TCP connections, connection mirroring is required for seamless failover. This is how the standby device knows about established connections so that it can continue those flows during a failover event. ICMP and UDP traffic will create a new flow upon the first packet, so you should not see an interruption for the stateless protocols. Thus, for true HA failover, enable connection mirroring. The "system degradation" isn't really a factor, but if you're concerned, use a dedicated interface for HA (config sync and mirroring) to keep that overhead away from your data interfaces.
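To make the GARP mechanism above concrete, here is a rough sketch of what a gratuitous ARP frame looks like on the wire. It follows the generic ARP layout (an ARP request whose sender and target IP are the same address) and is not taken from F5's implementation. The function name and the MAC and IP values in the usage note are hypothetical.

```python
import socket
import struct


def build_garp(mac, ip):
    """Build a 42-byte Ethernet frame carrying a gratuitous ARP request."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    ip_bytes = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our source MAC, EtherType ARP.
    ethernet = b"\xff" * 6 + mac_bytes + struct.pack("!H", 0x0806)

    # ARP header: hardware type Ethernet (1), protocol IPv4 (0x0800),
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # Sender MAC/IP, then target MAC (all zeros) and target IP.
    # Target IP == sender IP is what makes the ARP "gratuitous".
    arp += mac_bytes + ip_bytes + b"\x00" * 6 + ip_bytes

    return ethernet + arp
```

For example, `build_garp("02:01:d7:93:35:08", "192.0.2.10")` returns the bytes a newly active unit could broadcast so that switches relearn the port for that MAC and hosts refresh their ARP entry for the IP. Actually sending the frame needs a raw socket and is left out here.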

4 Replies

  • Aside from the MAC masquerading mentioned by mihaic, the only other thing that comes to mind to make failover as smooth as possible is configuring connection mirroring, so that connection and persistence information is already synced to the other unit when traffic fails over.

  • As far as I know, one dropped ping is normal.

    But I would not call it an outage. Most of the traffic in my case is TCP, so I would not see an outage because of TCP features (TCP is a connection-oriented transport protocol). Besides, almost all of it is HTTP. In that case, again, there is no outage, because HTTP is a short-lived (transactional) protocol.

    Also, make sure you use MAC masquerade:

    https://my.f5.com/manage/s/article/K13502
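MAC masquerade needs a MAC address that does not collide with any burned-in address on the network. One common way to get such an address, sketched below as an assumption and not as a quote from the article, is to take an existing interface MAC and set the locally administered bit in its first octet. The helper name and the sample address are hypothetical; check K13502 for F5's own guidance before choosing a value.

```python
def to_locally_administered(mac):
    """Return mac with the locally administered bit set and the
    multicast bit cleared in the first octet."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected six colon-separated octets")
    # Bit 0x02 marks the address as locally administered;
    # bit 0x01 must be 0 for a unicast address.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{octet:02x}" for octet in octets)
```

For instance, `to_locally_administered("00:01:d7:93:35:08")` gives `02:01:d7:93:35:08`.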

  • Thank you mihaic and CA_Valli. Are there any limitations to MAC Masquerade? I see that connection mirroring causes system degradation... So is MAC Masquerade the preferred option?

    From the client side, Windows or Linux, what would be best for them to change for a smooth failover event? Or is it preferred to make the changes on the F5?

     
