Active-Active Collision
Hello.
I had a cluster with this initial state:
- BIG-IP1 -> Active
- BIG-IP2 -> Standby
Because of a power outage in BIG-IP2's datacenter, that device rebooted and communication between the two devices was lost for several minutes.
During that time, neither device received status messages from its peer, so each one marked the other as disconnected.
From BIG-IP1's perspective:
- BIG-IP1 -> Active
- BIG-IP2 -> Disconnected
From BIG-IP2's perspective:
- BIG-IP1 -> Disconnected
- BIG-IP2 -> Active
When communication between the two devices was re-established, BIG-IP1 went to Standby in favor of the other device:
Apr 15 05:23:08 slot1/BIG-IP1 notice sod[5827]: 010c007e:5: Not receiving status updates from peer device /Common/BIG-IP2.mydomain.local (10.0.0.2) (Disconnected).
Apr 15 05:41:38 slot1/BIG-IP1 warning sod[5827]: 010c0084:4: Failover status message received after 1111.500 second gap, from device /Common/BIG-IP2.mydomain.local (10.0.0.2) (unicast: -> 10.255.1.209).
Apr 15 05:41:38 slot1/BIG-IP1 notice sod[5827]: 010c007f:5: Receiving status updates from peer device /Common/BIG-IP2.mydomain.local (10.0.0.2) (Online).
Apr 15 05:41:41 slot1/BIG-IP1 notice sod[5827]: 010c004a:5: Leaving active in favor of active peer.
Apr 15 05:41:41 slot1/BIG-IP1 notice sod[5827]: 010c0052:5: Standby for traffic group /Common/traffic-group-1.
Apr 15 05:41:41 slot1/BIG-IP1 notice sod[5827]: 010c0018:5: Standby
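To line up the sequence in these logs, a short script can pull the timestamp and message ID out of each sod entry. This is only a minimal sketch: the regular expression assumes the syslog layout of the lines quoted above, the function name `parse_sod` is made up for illustration, and the year is not present in the log lines, so 2019 is assumed from the traffic-group output.

```python
import re
from datetime import datetime

# Pattern assumes the syslog layout of the sod lines quoted above.
SOD_LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ (?P<level>\w+) "
    r"sod\[\d+\]: (?P<msgid>[0-9a-f]{8}):\d+: (?P<text>.*)$"
)

def parse_sod(line, year=2019):
    """Return (datetime, msgid, text) for a sod log line, or None if no match."""
    m = SOD_LINE.match(line)
    if m is None:
        return None
    # The log lines carry no year; the default is an assumption.
    ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
    return ts, m["msgid"], m["text"]

lines = [
    "Apr 15 05:23:08 slot1/BIG-IP1 notice sod[5827]: 010c007e:5: Not receiving status updates from peer device /Common/BIG-IP2.mydomain.local (10.0.0.2) (Disconnected).",
    "Apr 15 05:41:38 slot1/BIG-IP1 notice sod[5827]: 010c007f:5: Receiving status updates from peer device /Common/BIG-IP2.mydomain.local (10.0.0.2) (Online).",
    "Apr 15 05:41:41 slot1/BIG-IP1 notice sod[5827]: 010c004a:5: Leaving active in favor of active peer.",
]

events = [parse_sod(line) for line in lines]
lost, regained, stepped_down = (e[0] for e in events)

# Time between the "Disconnected" and "Online" entries, in seconds.
print("peer unreachable for", (regained - lost).total_seconds(), "s")
# Time between the peer coming back and BIG-IP1 leaving active.
print("stepped down", (stepped_down - regained).total_seconds(), "s after reconnect")
```

The two log timestamps are 1110 seconds apart, which is consistent with the 1111.500-second gap that sod reports, and BIG-IP1 gave up the active role 3 seconds after it started receiving status updates again.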
I would like to know which criteria are used to decide which device leaves its active state in favor of the other.
By the way, both devices are configured as Load Aware with default values, and there is only one traffic group.
----------------------------------------------------------------------------------------------------------------------------------
CM::Traffic-Group
Name                      Device                  Status   Next    Load  Next Active  HA Group  Times Became  Last Became
                                                           Active        Load                   Active        Active
----------------------------------------------------------------------------------------------------------------------------------
traffic-group-1           BIG-IP1.mydomain.local  standby  true    -     1            -         3             2019-Apr-15 05:41:35
traffic-group-1           BIG-IP2.mydomain.local  active   false   1     -            -         2             2019-Apr-15 05:39:50
traffic-group-local-only  -
Thanks in advance.
KR, Dario.
- Andy_McGrath
Cumulonimbus
First, I recommend reading K95002127: Troubleshooting BIG-IP failover events.
I tested a similar failover/failback event years ago for a customer (who was overly worried about a split-brain event occurring) and found that, if the configuration is in sync, the outcome has to do with the base MAC address of each device. You can check it with:
tmsh show sys hardware | grep -i "base mac"
I think the lower the value, the higher the priority when resolving a traffic-group active/active conflict.
Again, this was a long time ago, and I got the information from F5 Support.
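If that recollection holds, the comparison could be sketched as below. This is an illustration only: the "lowest base MAC wins" rule is my assumption from memory and is not confirmed F5 behaviour, the MAC values are hypothetical, and the helper names `mac_to_int` and `likely_winner` are made up for the example.

```python
def mac_to_int(mac):
    """Convert a MAC address such as '00:01:d7:aa:bb:40' to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)

def likely_winner(base_macs):
    """Return the device name with the numerically lowest base MAC.

    Assumption: mirrors the tie-break recalled above (lower base MAC
    has higher priority); not verified against F5 documentation.
    """
    return min(base_macs, key=lambda name: mac_to_int(base_macs[name]))

# Hypothetical base MAC values; substitute the output of
# `tmsh show sys hardware | grep -i "base mac"` from each device.
base_macs = {
    "BIG-IP1.mydomain.local": "00:01:d7:aa:bb:40",
    "BIG-IP2.mydomain.local": "00:01:d7:aa:bb:00",
}

print(likely_winner(base_macs))
```

With these made-up values the second device has the lower base MAC, so under the assumed rule it would keep the active role, which would match what happened in your case if BIG-IP2's real base MAC is the lower one.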