Forum Discussion
ARP/MAC Tables Not Updating on Core Switches After F5 LTM Failover (GARP Issue?)
We have finally resolved this issue and, as promised, here is what it turned out to be. We confirmed 100% with a tcpdump on the F5s that they were sending gratuitous ARPs out their 10G interfaces for all virtual-addresses after a failover event.
We opened a TAC case with Cisco and found that there is a hardware rate-limiter in place on the particular F1 card (a very old card) that these F5s were terminating into. The rate-limit for class rl-4, to which ARP was assigned, was set to 100 packets per second. This is far too low for the amount of ARP traffic the F5 generates, and we had millions of ARP drops on this particular card.
We analyzed the pcap file and found the rate at which the F5 transmitted these GARPs and adjusted the rate-limit on the rl-4 class to 3000 packets per second. We performed failover tests and the MAC addresses on both 7Ks updated immediately for all virtual-addresses.
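For anyone wanting to repeat the pcap analysis, here is a minimal sketch of how a peak GARP rate could be measured. It assumes the GARP timestamps (in seconds) have already been exported from the capture into a list. The function name `peak_rate` and the sample timestamps are hypothetical and are not taken from our capture.

```python
from collections import deque

def peak_rate(timestamps, window=1.0):
    """Return the largest number of packets seen in any `window`-second span."""
    recent = deque()
    peak = 0
    for ts in sorted(timestamps):
        recent.append(ts)
        # Discard packets that are a full window or more older than ts.
        while ts - recent[0] >= window:
            recent.popleft()
        peak = max(peak, len(recent))
    return peak

# Hypothetical burst: 2500 GARPs spaced 0.4 ms apart (illustrative only).
garp_times = [i * 0.0004 for i in range(2500)]
print(peak_rate(garp_times))
```

The peak value, plus some headroom, is one reasonable way to pick a new rate-limit figure.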
Thanks for all the input you guys provided.
- tatmotiv | Oct 07, 2016 | Cirrostratus
Interesting! Thanks for the update!
- Destiny3986_116 | Oct 07, 2016 | Nimbostratus
I think you could try configuring a "MAC Masquerade Address" for the traffic group.
- eddiepar_317026 | Apr 10, 2017 | Nimbostratus
How was the adjustment of rl-4 done on the switches? Was it done on the interfaces in pairs? On each switch?
Thank you
- portoalegre | Aug 25, 2017 | Nimbostratus
I'm having the same problem with a pair of LTMs across data centres using 4 x Nexus 7710 switches across OTV. I have failed over to the standby LTM and back again twice to test failover, and most of my VSs become unavailable. When I run SH IP ARP X.X.X.X (VS), the MAC address on the switch that now connects to the new standby (demoted primary) is the wrong MAC. I have to clear the ARP entry for every failed VS on the switch to get things working. So one DC works, but the demoted DC doesn't work fully. Frustrating!
I know the F5 is sending gratuitous ARPs; I can see that in my packet capture. I logged a Cisco TAC case, but they haven't been very helpful so far. The ticket I logged with F5 suggests MAC masquerading, which I'm not too confident about and which is a large production change for me with vPCs, OTV, etc.
The only limit I could see (running 7710 with N77-SUP2E) is a rate limiter for glean packets, which is only 100! So I guess these glean packets include gratuitous ARPs where the MAC has changed or the switch cannot find ARP resolution? Glean may not be relevant, but there are drops as shown below, and it is the only thing I could find so far. Please be aware this infrastructure was previously working fine over a pair of 6500 switches across L2 DWDM between the DCs.
Any suggestions would be helpful.
Module: 1
Rate-limiter PG Multiplier: 1.00

R-L Class            Config    Allowed         Dropped         Total
+------------------+--------+---------------+---------------+-----------------+
L3 glean             100       172479882       7398242         179878124

Port group with configuration same as default configuration
Eth1/1-2 Eth1/3-4 Eth1/5-6 Eth1/7-8 Eth1/9-10 Eth1/11-12
Eth1/13-14 Eth1/15-16 Eth1/17-18 Eth1/19-20 Eth1/21-22 Eth1/23-24
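As a quick sanity check on the counters above, the share of glean packets dropped can be worked out directly. This is just arithmetic on the posted numbers; the variable names are only illustrative.

```python
# Counters copied from the "L3 glean" row of the rate-limiter output.
allowed = 172479882
dropped = 7398242

total = allowed + dropped            # should match the Total column
drop_pct = 100 * dropped / total     # percentage of glean packets dropped

print(total)               # 179878124
print(round(drop_pct, 2))  # 4.11
```

The percentage is averaged over the whole counter lifetime, so it says nothing about how badly a short burst, such as the one after a failover, was hit.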
- Ron_Peters_2122 | Aug 25, 2017 | Altostratus
The particular issue we were having was with the hardware rate limiter on older F1 cards. I found the following two links, which may be helpful. When I worked with TAC, I was able to prove via the packet captures that the F5s were in fact sending the GARPs. Since the F5s were directly attached to the 7Ks and the ARP tables were not updating (even though the GARPs were being SPAN'd to a separate destination port on the 7K), it was clearly an issue with the 7Ks.
Perhaps it is an issue with how ARP works over OTV:
While the links below are for the 7000 series, they should also apply to the L3 glean class on the 77xx:
In any case, be persistent with Cisco and escalate if need be. Good Luck!
- portoalegre | Apr 16, 2018 | Nimbostratus
My problem is finally fixed! I increased IP GLEAN from 100 to 5000 on each Cisco 7700 switch, then manually forced the primary LTM into standby. The newly promoted LTM sent out 2000 gratuitous ARPs, and this time the burst of ARPs was seen across OTV on the other aggregation Nexus 7700 switches, so all virtual servers now map to a single F5 LTM MAC address. This command made my failover work, so the problem was that IP GLEAN has a default limit on how many PPS it can send (the default is 100, which isn't a lot IMO). NB: once you configure IP GLEAN on the 7700s, the IP GLEAN config propagates to the OTV, AGG VDCs, etc.
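A rough way to see why 100 PPS was not enough for a 2000-GARP burst is sketched below. It assumes a deliberately simple model in which the whole burst arrives within one second and the limiter passes at most `limit_pps` packets in that second. The real hardware rate limiter may behave differently, and the function name `dropped_in_burst` is hypothetical.

```python
def dropped_in_burst(burst_packets, limit_pps, burst_seconds=1.0):
    """Estimate packets dropped when a burst exceeds a simple PPS cap."""
    # Simplified model: the limiter admits limit_pps packets per second.
    allowed = int(limit_pps * burst_seconds)
    return max(0, burst_packets - allowed)

print(dropped_in_burst(2000, 100))   # 1900 with the default limit
print(dropped_in_burst(2000, 5000))  # 0 after raising the limit
```

Under this model most of the GARPs are lost at the default setting, which matches the stale ARP entries seen after failover.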