Forum Discussion

Rajaraman_12066

Nimbostratus

Jan 07, 2016

Solved

BIG-IP HA over WAN - Is it a recommended practice?

Hi, we plan to set up two F5 devices, one each in DC1 and DC2, and run them as an HA pair. The DCs are 6 km apart and connected by a dedicated fibre link. Is this a recommended practice? Can someone please advise?

application delivery

big-ip

wan

3 Replies

BinaryCanary_19
Historic F5 Account
Jan 08, 2016
It's not explicitly discouraged as far as I know; you just need to be aware of potential latency issues. Network failover works by heartbeats sent across the configured failover link, so packet loss or excessive delay may trigger unwanted failovers, and potentially even "split-brain" scenarios, where both devices briefly think their peer has gone down and both assume the active role (active-active).

There is a DB key that lets you configure how long the system waits before declaring its peer dead: failover.nettimeoutsec. The default value is 3 seconds, which assumes devices in the same data centre. You might want to raise this a little to account for the increased distance. With the value set at 3 seconds, it takes 3 seconds to trigger a failover if the peer is truly down, which means up to 3 seconds with no device actively handling traffic. Raising the value widens this window, but also helps mitigate the risk of split-brain due to transient network issues.
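
To make the trade-off concrete, here is a small Python sketch of the timeout logic. It is a simplified model, not BIG-IP's actual implementation: the function names, the one-second heartbeat spacing, and the "silence reaches the timeout" rule are all assumptions made for illustration. Only the 3-second default comes from the description of failover.nettimeoutsec above.

```python
def longest_silence(arrival_times_sec, observed_until_sec):
    """Longest stretch (in seconds) with no heartbeat received from the peer.

    arrival_times_sec: times at which heartbeats arrived (hypothetical data).
    observed_until_sec: the moment at which we stop observing.
    """
    points = sorted(arrival_times_sec) + [observed_until_sec]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


def would_fail_over(arrival_times_sec, observed_until_sec, timeout_sec=3.0):
    """True if the silence reaches the timeout (assumed model of the check)."""
    return longest_silence(arrival_times_sec, observed_until_sec) >= timeout_sec


# A transient WAN problem drops heartbeats between t=2 and t=6 (assumed data).
blip = [0.0, 1.0, 2.0, 6.0, 7.0, 8.0]
print(would_fail_over(blip, 8.0, timeout_sec=3.0))   # True: unwanted failover
print(would_fail_over(blip, 8.0, timeout_sec=5.0))   # False: blip is tolerated

# The peer really dies after t=2: both settings fail over, the higher one later.
dead = [0.0, 1.0, 2.0]
print(would_fail_over(dead, 10.0, timeout_sec=5.0))  # True
```

In this model a 4-second interruption trips the 3-second setting but not a 5-second one, while a genuinely dead peer is still detected with either value, just 2 seconds later with the higher one.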
- Mr__Katic_15215
  Altocumulus
  Jun 03, 2016
  Just one additional comment: you cannot use the dedicated failover port built into BIG-IP hardware. So if you want redundancy for the HA pair link used for heartbeats, you will need to dedicate another fibre to it.
- Rajaraman_12066
  Nimbostratus
  Jan 08, 2016
  Hi, thanks for the response. It helps.