Forum Discussion

Chris_Reid_1643
Jun 20, 2016

High-availability configuration produces a status of "ONLINE (STANDBY), In Sync"

Problem: High-availability configuration produces a status of "ONLINE (STANDBY), In Sync" on the F5 primary and standby units.

 

Models: F5 1600

 

Big-IP Version: BIG-IP 11.5.0 Build 7.0.265 Hotfix HF7

 

Steps used to configure high-availability:

 

  1. Connect a network cable on port 1.3 of each F5 1600
  2. Create a dedicated VLAN for high-availability on each F5 1600
  3. Configure an IP address for the high-availability VLAN on each F5 1600
  4. Ensure that both F5 1600 units can ping each other from the high-availability VLAN
  5. On each F5 1600, navigate to "Device Management" -> "Devices" -> "Device List". Select the F5 1600 system labelled as "self"
  6. On each F5 1600, navigate to "Device Connectivity" -> "ConfigSync". Select the IP address assigned to the high-availability VLAN
  7. On each F5 1600, navigate to "Device Connectivity" -> "Network Failover". Add the IP address assigned to the high-availability VLAN to the failover unicast configuration
  8. Force the standby unit offline
  9. On the active unit, navigate to "Device" -> "Peer List". Click "Add", and add the standby unit to the high-availability configuration
  10. At this point, the primary F5 unit has a status of "ONLINE (ACTIVE), In Sync", and the standby unit has a status of "FORCED (OFFLINE), In Sync"
  11. On the primary unit, navigate to "Device Management" -> "Device Groups" to create a device group

At this point, both units have a status of "ONLINE (STANDBY), In Sync". Any ideas as to why this is happening?

 

My goal is to have high-availability configured in an ACTIVE/STANDBY pair.

 

  • Usually, if there were an issue with the communication channel, you would end up in an Active-Active state, not Standby-Standby.

     

    • step 7 - check whether the network failover checkbox is ticked
    • verify that your device group includes both devices (on the left side)
    • check /var/log/ltm for high-availability status; look for "Active" and "Standby" messages

    You can try to reset the trust and set up HA again as per: https://support.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/tmos-implementations-11-5-0/2.html
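
The log check suggested above can be sketched in Python. This is only a rough illustration, not an F5 tool: the function name `ha_state_lines` is made up for this example, and it does nothing more than a case-insensitive whole-word search for "active" and "standby". The sample input is the pair of log lines quoted elsewhere in this thread.

```python
import re
from typing import Iterable, List

# Whole-word match so that e.g. "inactive" is not reported as "active".
HA_WORDS = re.compile(r"\b(active|standby)\b", re.IGNORECASE)


def ha_state_lines(lines: Iterable[str]) -> List[str]:
    """Return the log lines that mention an Active or Standby state."""
    return [line.rstrip("\n") for line in lines if HA_WORDS.search(line)]


# Two lines taken from the log excerpt posted in this thread.
sample = [
    "Jun 21 02:34:29 err mcpd[6588]: 01071587:3: Commit ID message ignored, "
    "commit ID originator not specified",
    "Jun 21 02:34:31 notice sod[7120]: 010c005c:5: Failover condition reported "
    "by /Common/ for traffic group /Common/traffic-group-1, it will not be "
    "able to go active.",
]

matches = ha_state_lines(sample)
for line in matches:
    print(line)
```

To run it against a real log, pass an open file instead of the list, for example a copy of /var/log/ltm pulled to a workstation: `with open("ltm") as f: ha_state_lines(f)`.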

     

    • Jankes_162915
      Sure, you don't need internal and external VLANs; these are used to process actual client and server traffic. For HA you need only a single VLAN to pass HA traffic. For pure failover you can use the management interface/network. However, for config sync you will need a dedicated VLAN. This indeed looks like VLAN failsafe behaviour, like Odaah said. You can double-check that: https://support.f5.com/kb/en-us/products/big-ip_ltm/manuals/product/tmos-concepts-11-5-0/19.html
    • Chris_Reid_1643
      I've looked over this documentation in the past. However, our current setup doesn't have an external VLAN because the F5 is used for internal purposes only. Additionally, I don't currently have an internal VLAN configured because I will be migrating settings into this F5 implementation. Can I still use this documentation to configure high-availability?
  • Shot in the dark - do you have VLAN failover configured with no servers or gateway active in one or more VLANs?

     

  • One common issue to check for...

    4. Ensure that both F5 1600 units can ping each other from the high-availability VLAN

    What is your Port Lockdown setting on the self IP? Make sure it's not 'Allow None'. Preferred configurations for HA VLAN self IPs are 'Allow Default' (the safe bet regardless of BIG-IP version), or 'Allow Custom' with all the ports that the HA daemons use (these vary across BIG-IP versions).

    • Chris_Reid_1643
      After the "Device Group" configuration on the primary F5, the following error messages appear in the LTM log file:

      Error messages on the primary F5 unit:
        Jun 21 02:34:31 YPG-L-TLIDC4-001 err mcpd[4953]: 01071587:3: Commit ID message ignored, commit ID originator not specified

      Error messages on the secondary F5 unit:
        Jun 21 02:34:29 err mcpd[6588]: 01071587:3: Commit ID message ignored, commit ID originator not specified
        Jun 21 02:34:31 notice sod[7120]: 010c005c:5: Failover condition reported by /Common/ for traffic group /Common/traffic-group-1, it will not be able to go active.
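
For comparing entries across the two units, the log lines above can be split into fields. The following is a hedged Python sketch based only on the three lines posted here: the field names (`severity`, `msg_id`, `level`, and so on) are my own labels rather than official F5 terminology, and the pattern treats the hostname as optional because the secondary unit's lines appear without one.

```python
import re
from typing import Dict, Optional

# Pattern inferred from the three lines in this thread; an assumption,
# not an official description of the /var/log/ltm format.
LTM_LINE = re.compile(
    r"^(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?:(?P<host>\S+) )?"                    # hostname may be absent
    r"(?P<severity>[a-z]+) "
    r"(?P<daemon>[\w.-]+)\[(?P<pid>\d+)\]: "
    r"(?P<msg_id>[0-9a-fA-F]{8}):(?P<level>\d): "
    r"(?P<text>.*)$"
)


def parse_ltm_line(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Split one log line into named fields, or return None if it does not fit."""
    match = LTM_LINE.match(line.strip())
    return match.groupdict() if match else None


entries = [
    "Jun 21 02:34:31 YPG-L-TLIDC4-001 err mcpd[4953]: 01071587:3: Commit ID "
    "message ignored, commit ID originator not specified",
    "Jun 21 02:34:29 err mcpd[6588]: 01071587:3: Commit ID message ignored, "
    "commit ID originator not specified",
    "Jun 21 02:34:31 notice sod[7120]: 010c005c:5: Failover condition reported "
    "by /Common/ for traffic group /Common/traffic-group-1, it will not be "
    "able to go active.",
]

parsed = [parse_ltm_line(entry) for entry in entries]
for fields in parsed:
    if fields:
        print(fields["daemon"], fields["msg_id"], fields["text"])
```

Grouping the parsed entries by `daemon` or `msg_id` makes it easier to see that both units log the same mcpd message, while only the secondary logs the sod failover condition.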
    • Hannes_Rapp_162
      OK. Have a look at the /var/log/ltm file and grep it for 'daemon_heartbeat'. That file usually reports any change in HA status directly, along with the reason behind it.