Bidirectional Forwarding Detection (BFD) Protocol Cheat Sheet
Definition
BFD is a protocol defined in RFC 5880, with the IPv4/IPv6 specifics in RFC 5881. I would describe it as an aggressive 'hello-like' protocol with shorter timers, yet very lightweight on the wire and cheap to process, as it is designed to be implemented in the forwarding plane (although the RFC does not forbid implementing it in the control plane). It also includes a feature called Echo that brings CPU processing down to roughly ZERO: the receiving system literally just 'loops' the BFD Echo packets sent by the peer back to it without even 'touching' (processing) them.
BFD helps routing protocols detect peer failure at a sub-second level, and BIG-IP supports it on all its routing protocols. On BIG-IP it is control-plane independent: TMM takes care of sending and receiving the unicast BFD probes (yes, no multicast!), while BIG-IP's Advanced Routing Module® is responsible only for configuration (of course!) and for receiving state information from TMM, which is displayed in show commands. The BIG-IP control plane daemon that communicates with TMM is oamd. This daemon starts when BFD is enabled in the route domain, like any other routing protocol.
BFD Handshake Explained
218: BFD was configured on the Cisco router but not on BIG-IP, so the neighbour signals session state Down with no flags set
219: I had just enabled BFD on BIG-IP; session state is now Init and only the Control Plane Independent flag is set¹
220: Poll flag is set to validate initial bidirectional connectivity (expecting Final flag set in response)
221: BIG-IP sets Final flag and handshake is complete²
¹The Control Plane Independent flag is set because BFD is not actively performed by BIG-IP's control plane. BIG-IP's BFD control plane daemon (oamd) just tells TMM which BFD sessions are required; TMM takes care of sending and receiving all BFD control traffic and reports session state back to the Advanced Routing Module's daemon.
²Packets 222-223 just show that, once the handshake is finished, all flags are cleared unless another event triggers them. Packet 218 is how the Cisco router sees BIG-IP when BFD is not enabled. The Control Plane Independent flag on BIG-IP remains set, though, for the reasons already explained above.
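The state changes behind this handshake can be sketched in a few lines of Python. This is a simplified reading of the reception rules in RFC 5880 (section 6.8.6), not how TMM implements them; the names State and next_state are made up for illustration, and the Poll/Final exchange, timers and diagnostic codes are left out.

```python
from enum import IntEnum


class State(IntEnum):
    """Session states as encoded in the BFD control packet."""
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


def next_state(local: State, received: State) -> State:
    """Return the new local state after receiving a control packet.

    Simplified sketch: timers, Poll/Final and diagnostics are ignored.
    """
    if local == State.ADMIN_DOWN:
        return local  # packets are discarded while administratively down
    if received == State.ADMIN_DOWN:
        return State.DOWN
    if local == State.DOWN:
        if received == State.DOWN:
            return State.INIT
        if received == State.INIT:
            return State.UP
        return State.DOWN
    if local == State.INIT:
        return State.UP if received in (State.INIT, State.UP) else State.INIT
    # local is UP: only a peer reporting Down tears the session down
    return State.DOWN if received == State.DOWN else State.UP


# BIG-IP starts in Down, hears Down from the router, then hears Up.
bigip = next_state(State.DOWN, State.DOWN)   # moves to Init
bigip = next_state(bigip, State.UP)          # moves to Up
```

Each side moves Down, then Init, then Up only after hearing from the other, which is why a session cannot come up over a one-way path.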
Protocol fields
Diagnostic codes
0 (No Diagnostic): Typically seen when BFD session is UP and there are no errors.
1 (Control Detection Time Expired): BFD Detect Timer expired and session was marked down.
2 (Echo Function Failed): BFD Echo packet loop verification failed and the session was marked down.
3 (Neighbor Signaled Session Down): The neighbour advertised its session state as Down, so the local session was marked down.
4 (Forwarding Plane Reset): The forwarding plane (TMM) was reset, so the peer should not rely on BFD and the session is marked down.
5 (Path Down): An external application can signal BFD that the path is down (e.g. in Demand mode), so the session is set down using this code.
6 (Concatenated Path Down):
7 (Administratively Down): Session is marked down by an administrator
8 (Reverse Concatenated Path Down):
9-31: Reserved for future use
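When reading captures or logs it can help to turn the numeric code back into its name. The lookup below is only a sketch built from the list above; DIAG_CODES and diag_name are hypothetical names, not part of any BIG-IP tool.

```python
# Diagnostic codes as listed above (RFC 5880); 9-31 are reserved.
DIAG_CODES = {
    0: "No Diagnostic",
    1: "Control Detection Time Expired",
    2: "Echo Function Failed",
    3: "Neighbor Signaled Session Down",
    4: "Forwarding Plane Reset",
    5: "Path Down",
    6: "Concatenated Path Down",
    7: "Administratively Down",
    8: "Reverse Concatenated Path Down",
}


def diag_name(code: int) -> str:
    """Map a 5-bit diagnostic code to its name."""
    if not 0 <= code <= 31:
        raise ValueError("diagnostic code is a 5-bit field (0-31)")
    return DIAG_CODES.get(code, "Reserved")
```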
BFD verification 'show' commands
³Type IP address to see specific session
Modes
Asynchronous (default): hello-like mode where BIG-IP periodically sends (and receives) BFD control packets and if control detection timer expires, session is marked as down. It uses UDP port 3784.
Demand: BFD handshake is completed but no periodic BFD control packets are exchanged as this mode assumes system has its own way of verifying connectivity with peer and may trigger BFD verification on demand, i.e. when it needs to use it according to its implementation. BIG-IP currently does not support this mode.
Asynchronous + Echo Function: When enabled, the forwarding plane literally loops the BFD Echo packets that peers send on UDP port 3785 back to them without processing them, as if it weren't enough that this protocol is already lightweight. In this mode, a less aggressive timer (> 1 second) should be used for regular BFD control packets over port 3784, while a more aggressive timer is used by the Echo function. BIG-IP currently does not support this mode.
Header Fields
Protocol Version: BFD version used. Latest one is v1 (RFC5880)
Diagnostic Code: BFD error code for diagnostics purpose.
Session State: How transmitting system sees the session state which can be AdminDown, Down, Init or Up.
Message Flags: Additional session configuration or functionality (e.g. flag that says authentication is enabled)
Detect Time Multiplier: Tells the remote peer to mark the BFD session down if no control packet arrives within the negotiated transmit interval multiplied by this value
Message Length (bytes): Length of BFD Control packet
My Discriminator: Each peer uses a unique discriminator for each BFD session to differentiate between multiple sessions.
Your Discriminator: When BIG-IP receives a BFD control message from its peer, it copies the peer's My Discriminator into Your Discriminator in its own header.
Desired Min TX Interval (microseconds): Fastest we can send BFD control packets to remote peer (no less than configured value here)
Required Min RX Interval (microseconds): Fastest we can receive BFD control packets from remote peer (no less than configured value here)
Required Min Echo Interval (microseconds): Fastest we can loop BFD echo packets back to remote system (0 means Echo function is disabled)
Session State
AdminDown: Administratively forced down by command
Down: Either the control detection time expired in an already established BFD session or the session never came up. For example, if the probing interval (min_tx) is set to 100 ms and the multiplier is 3, then no response after 300 ms causes the session to be marked down.
Init: Signals a desire to bring session up in the beginning of BFD handshake.
Up: Indicates session is Up
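The 100 ms × 3 arithmetic from the Down entry can be written out as a small Python sketch. It follows the Asynchronous mode rule in RFC 5880 (section 6.8.4), where the local detection time is the remote multiplier times the larger of the remote Desired Min TX Interval and the local Required Min RX Interval; the function name is made up for illustration.

```python
def detection_time_us(remote_detect_mult: int,
                      remote_desired_min_tx_us: int,
                      local_required_min_rx_us: int) -> int:
    """Detection time in microseconds for Asynchronous mode."""
    # The peer never sends faster than we are willing to receive.
    agreed_interval_us = max(remote_desired_min_tx_us, local_required_min_rx_us)
    return remote_detect_mult * agreed_interval_us


# Both sides at 100 ms with multiplier 3: 300000 microseconds (300 ms).
print(detection_time_us(3, 100_000, 100_000))
```

If the local side can only receive every 200 ms, the same call with local_required_min_rx_us=200_000 gives 600 ms, since the slower of the two intervals wins.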
Message Flags
Poll: The Poll flag is just a 'ping' that requires the peer to respond with the Final flag set. In the BFD handshake, as well as in Demand mode, a Poll message is a request to validate bidirectional connectivity.
Final: Sent in response to packet with Poll bit set
Control Plane Independent: Set if BFD can continue to function if control plane is disrupted¹
Authentication Present: Only set if authentication is being used
Demand: If set, it is implied that periodic BFD control packets are no longer sent and another mechanism (on demand) is used instead.
Multipoint: Reserved for future use of point-to-multipoint extension. Should be 0 on both sides.
¹ This is the case for BIG-IP as BFD is implemented in forwarding plane (TMM)
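The six flags occupy the low six bits of the second header byte, after the two session state bits. A hedged Python sketch of decoding them, with bit positions taken from RFC 5880 section 4.1 and BfdFlags, decode_flags and flag_names as made-up names:

```python
from enum import IntFlag


class BfdFlags(IntFlag):
    """Flag bits in the second byte of the BFD control packet."""
    POLL = 0x20
    FINAL = 0x10
    CONTROL_PLANE_INDEPENDENT = 0x08
    AUTHENTICATION_PRESENT = 0x04
    DEMAND = 0x02
    MULTIPOINT = 0x01


def decode_flags(second_byte: int) -> BfdFlags:
    """Mask off the two state bits and return the flag set."""
    return BfdFlags(second_byte & 0x3F)


def flag_names(flags: BfdFlags) -> set:
    """Names of the individual flags that are set."""
    return {flag.name for flag in BfdFlags if flag in flags}


# 0xE8: state Up, Poll and Control Plane Independent set.
flags = decode_flags(0xE8)
```

For the sample byte, flag_names(flags) contains POLL and CONTROL_PLANE_INDEPENDENT.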
BFD Configuration
Configure desired transmit and receive intervals as well as multiplier on BIG-IP.
And Cisco Router:
You will typically configure the above regardless of routing protocol used.
BFD BGP Configuration
And Cisco Router:
BFD OSPFv2/v3 Configuration
BFD ISIS Configuration
BFD RIPv1/v2 Configuration
BFD Static Configuration
All interfaces no matter what:
Specific interface only:
Tie BFD configuration to static route:
- dragonflymrCirrostratus
Hi,
No problem at all. Thanks a lot for all answers and patience.
Piotr
Hi Piotr,
I'm sorry but I missed your last comment. You're correct. BFD just reports that path is no longer reachable. It's up to the routing protocol (or control plane if static route) to bypass the failed path. If you'd like BIG-IP to failover you'd need to configure BIG-IP Failsafe. Regarding second question, I haven't tested this scenario yet where we have one BIG-IP and multiple neighbours responding on the same path but I would imagine that as long as there is at least a single session on that link BIG-IP would not bring the link down if it's tied to an interface. If it's tied to a static route, it would remove the static route from the peer that went down. I would have to test to confirm this behaviour though.
Best regards,
Rodrigo
- dragonflymrCirrostratus
Hi,
Sorry for the confusion and thanks for your patience. I will try to be more precise. Let's assume we have:
- Active-Standby cluster
- Each node is using a different network path (let's say switches) to reach the router
- Route is configured with router IP
- BFD is configured to monitor this router
Then BFD on the Active node reports that the router is down (in fact, let's assume that some device in the network path is down). As you said, that means the route is removed from the routing table. That means no traffic will be sent to this router, but can it trigger failover or not really? So if BFD detects the router as down on Active, that's it: the route is removed from the routing table and nothing more, which means it can't be used to trigger failover, and functionalities like Gateway Failsafe or HA Group still have to be configured?
Related to the second part: I am talking about a scenario where you have multiple static routes configured (not for redundancy) pointing to different routers via which different subnets can be reached. All of those direct traffic via the same VLAN (maybe using the same Floating IP) but target different routers, each able to pass traffic to a different subnet/IP.
The question was whether it's possible to configure BFD to selectively detect the failure of each router: can only one router on a given VLAN/interface be monitored, or can you set up monitoring for every router IP so that the failure of a specific router is discovered? Or is it all or nothing, with one router IP per VLAN/interface monitored?
Hope it makes any sense...
Piotr
Hi Piotr
When you tie static routes with BFD like this:
ip static <source> <destination> fall-over bfd
You're literally making the static route dependent on BFD status, which means that if the BFD session goes down, or even if you force it administratively down, your static route is removed from the routing table, so you can't really say it does nothing, can you?
BTW, this is how it's used in real life (for redundancy).
Maybe I didn't quite understand your question but multiple next hop for single Floating IP is usually achieved by using VRRP protocol. BFD creates a session for end-points already 'reachable'.
- dragonflymrCirrostratus
Hi,
Thanks for the fast reply and the info provided. I just wonder how BFD (in the context of static routing) can be used in real life. If BIG-IP detects via BFD that the next hop router is down, what will happen? Nothing (it's just info), or some action (that can be set up)?
I wonder as well whether you can configure multiple next hop routers for a single Floating IP, like when you have a few static routes pointing to different next hop routers in the same VLAN.
Piotr
Hi Piotr
Yes, BFD can be configured in HA pair with same floating Self IP as source address but such functionality was only implemented in v14.1.0+ via ID701289. In effect, BFD is configured on both Active and Standby but control packets are temporarily suspended on Standby. I'd still advise you to use it with caution and test it in non-production environment before configuring it in production environment.
Cheers.
Rodrigo
- dragonflymrCirrostratus
Hi,
Great article, even if a bit too technical for me :-( I wonder how BFD can be used for static routes in a cluster config. Sorry if my question does not make any sense, but I am a real newbie when it comes to more advanced routing topics.
Let's say we have a BIG-IP cluster; then I can see these usage scenarios:
- BIG-IP has static route set pointing to external router as next hop
- External router has BIG-IP set as next hop
- Mix of above
In a cluster setup, using a Self IP does not make sense, so a Floating IP should be used. Is that possible with BFD for static routing? If so, can one use BFD functionality to trigger failover?
For example Active node is detecting that next hop router is down, so failover to Standby should be performed.
In case of external router I guess things should be easier, if Floating IP can be configured as target for BFD check then after failover same IP will be up so everything should work OK.
Piotr