Forum Discussion

Jason_Keating
Nov 13, 2010

HA failover and ARP confusion

Hi,

I am having some trouble with an HA failover scenario. I suspect the switch, although I can see no evidence of that; the problem presents itself as stale ARP entries, so I thought I would ask here.

I am running 10.2.0 1755.1

I have 3 VLANs configured:

VLAN_A on 1.1
VLAN_B on 1.2 (tagged)
VLAN_C on 1.2 (tagged)

I have self IPs for VLAN A only (one static for each unit and one floating).

My default route is a gateway on VLAN A.

I have virtuals on all three VLANs.

Both units are synced.

VLAN configs are duplicated, and by default I am not using MAC masquerading.

When Unit 1 is Active, all virtuals work fine.

Problem:

I force Unit 1 to standby ... all nodes and virtuals on the new active (Unit 2) are green.

Virtuals on VLAN A continue to work as desired, and the appropriate changes to the ARP table on the switch are observed.

Virtuals on VLAN B and C do not respond.

Checking the ARP table on the switch, the entries for the virtuals on B and C appear stale.

If I fail back, everything works OK.

I only have 4 virtuals on VLAN C and 9 on B, so I don't think it's gratuitous ARP spam (ref http://support.f5.com/kb/en-us/solu...r=11136085)

I've tried using MAC masquerade on the offending VLANs, although I won't have visibility into the switch again until tomorrow, when I'll find out whether its table is correct. However, I did do a 'b load' on both units and failed over again, with no difference in results.

I've also observed that I only have ARP entries on either LTM for VLAN A. I assume this is because I only have a self IP for A (I assume I need an address on the VLAN I want to issue gratuitous ARPs on, for the reply). I am unsure whether I should see entries for the other two VLANs, given I have no self IPs for B and C. Unit 1 works without visible entries, so I assumed the same would be fine for Unit 2.

I've seen the switchport (Cisco IOS) config, and the switch ports for interface 1.2 on Unit 1 and Unit 2 are configured identically.

Any ideas? Have I missed something in my config? Any clues about what to look for on the switch?

Any advice appreciated.

Regards

J

  • Hi

    I was a little confused about it too, so I read the TMOS guide's section on self IPs again and again, trying to determine whether I needed one.

    According to the text, I only need a self IP on a VLAN if I want to route to destination servers based on the self IP (as a means of identifying the VLAN, and therefore the egress interface), but I have a default gateway for all destinations. The other case is SNAT, where it ensures responses are routed back through the LTM; I am SNATing most of my traffic, with everything coming back to 10.162.134.183.

    Here is my config. I can add self IPs but did not see anything stating I must.

    Thanks for having a look

    stp instance 0 {
       interfaces {
          1.1 {
             external path cost 20000
             internal path cost 20000
          }
          1.2 {
             external path cost 20000
             internal path cost 20000
          }
       }
       vlans {
          internal_A
          internal_B
          internal_C
       }
    }
    vlan internal_A {
       tag 4094
       interfaces 1.1
    }
    vlan internal_B {
       tag 214
       interfaces tagged 1.2
    }
    vlan internal_C {
       tag 645
       mac masq 40:017:B2:25:44
       interfaces tagged 1.2
    }
    self 10.162.134.181 {
       netmask 255.255.255.0
       vlan internal_A
       allow default
    }
    self 10.162.134.183 {
       netmask 255.255.255.0
       unit 1
       floating enable
       vlan internal_A
       allow default
    }
    route default inet {
       gateway 10.162.134.1
    }
    snatpool myfloating_SNAT {
       members 10.162.134.183
    }
    virtual virtual_1 {
       snatpool myfloating_SNAT
       pool pool_1
       destination 10.162.134.187:http
       ip protocol tcp
       profiles {
          http {}
          tcp {}
       }
    }
    virtual virtual_2 {
       snatpool myfloating_SNAT
       pool pool_2
       destination 10.162.142.2:https
       ip protocol tcp
       vlans internal_B enable
    }
    virtual virtual_3 {
       snatpool myfloating_SNAT
       pool pool_3
       destination 10.162.146.173:ldap
       ip protocol tcp
       vlans internal_C enable
    }
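
A quick way to see the gap in the config above is to check which virtual server destinations fall inside a subnet that has a self IP. The following is only an illustrative Python sketch, not an F5 tool: the addresses and netmasks are copied from the posted config, while the function name `uncovered_virtuals` is made up for this example.

```python
import ipaddress

# Self IPs and netmasks copied from the posted config (both on internal_A).
self_ips = [
    ("10.162.134.181", "255.255.255.0"),
    ("10.162.134.183", "255.255.255.0"),
]

# Virtual server destinations copied from the posted config.
virtuals = {
    "virtual_1": "10.162.134.187",
    "virtual_2": "10.162.142.2",
    "virtual_3": "10.162.146.173",
}


def uncovered_virtuals(self_ips, virtuals):
    """Return names of virtuals whose destination is in no self-IP subnet."""
    # strict=False lets a host address plus netmask define its network.
    nets = {
        ipaddress.ip_network(f"{ip}/{mask}", strict=False)
        for ip, mask in self_ips
    }
    return sorted(
        name
        for name, dest in virtuals.items()
        if not any(ipaddress.ip_address(dest) in net for net in nets)
    )


print(uncovered_virtuals(self_ips, virtuals))  # ['virtual_2', 'virtual_3']
```

The two virtuals it reports are the ones on internal_B and internal_C, which are the VLANs where failover left stale ARP entries.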
    
    
    

  • As I tested, there was no GARP on the internal VLAN (which has no self IP). However, with MAC masquerading, a client in the internal VLAN was able to connect to the virtual server after failover.

    Without MAC masquerading, the client's ARP entry did not change, since no GARP was sent out on the internal VLAN. So a client in the internal VLAN cannot connect to the virtual server after failover.

    
    bigip01:
    [root@bigip01:Active] config  b version|grep -iA 2 version
    BIG-IP Version 10.2.0 1755.1
    Hotfix HF1 Edition
    
    vlan external {
       tag 4093
       interfaces 1.1
    }
    vlan internal {
       tag 4094
       mac masq 02:01:D7:1E:C3:43
       interfaces 1.3
    }
    self 172.28.17.50 {
       netmask 255.255.255.0
       vlan external
       allow all
    }
    self 172.28.17.99 {
       netmask 255.255.255.0
       unit 1
       floating enable
       vlan external
       allow all
    }
    virtual bar {
       snat automap
       pool foo
       destination 10.10.70.100:http
       ip protocol tcp
    }
    
    bigip02:
    [root@bigip02:Standby] config  b version|grep -iA 2 version
    BIG-IP Version 10.2.0 1755.1
    Hotfix HF1 Edition
    
    vlan external {
       tag 4093
       interfaces 1.1
    }
    vlan internal {
       tag 4094
       mac masq 02:01:D7:1E:C3:43
       interfaces 1.3
    }
    self 172.28.17.10 {
       netmask 255.255.255.0
       vlan external
       allow all
    }
    self 172.28.17.99 {
       netmask 255.255.255.0
       unit 1
       floating enable
       vlan external
       allow all
    }
    virtual bar {
       snat automap
       pool foo
       destination 10.10.70.100:http
       ip protocol tcp
    }
    
    bigip01:
    [root@bigip01:Active] config  b fo standby
    
    bigip02:
    [root@bigip02:Standby] config  tcpdump -e -nni 0.0 'arp[14:4] = arp[24:4]'
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on 0.0, link-type EN10MB (Ethernet), capture size 108 bytes
    03:57:02.596939 00:01:d7:1e:c3:44 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 4093, p 0, ethertype ARP, arp who-has 172.28.17.99 (ff:ff:ff:ff:ff:ff) tell 172.28.17.99
    03:57:03.596976 00:01:d7:1e:c3:44 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 4093, p 0, ethertype ARP, arp who-has 172.28.17.99 (ff:ff:ff:ff:ff:ff) tell 172.28.17.99
    03:57:04.596594 00:01:d7:1e:c3:44 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 4093, p 0, ethertype ARP, arp who-has 172.28.17.99 (ff:ff:ff:ff:ff:ff) tell 172.28.17.99
    03:57:05.596643 00:01:d7:1e:c3:44 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 4093, p 0, ethertype ARP, arp who-has 172.28.17.99 (ff:ff:ff:ff:ff:ff) tell 172.28.17.99
    03:57:06.596685 00:01:d7:1e:c3:44 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 4093, p 0, ethertype ARP, arp who-has 172.28.17.99 (ff:ff:ff:ff:ff:ff) tell 172.28.17.99
    
    client in internal vlan:
    [root@web1 ~] arp -a|grep 10.10.70.100
    ? (10.10.70.100) at 02:01:D7:1E:C3:43 [ether] on eth0
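
The tcpdump filter used above, `arp[14:4] = arp[24:4]`, matches ARP packets whose sender IP (bytes 14 to 17 of the ARP payload) equals the target IP (bytes 24 to 27), which is what makes an ARP gratuitous. The Python sketch below reproduces that comparison on hand-built payloads. The helper names are hypothetical, and the second packet's addresses (172.28.17.1 as the target) are made up purely for contrast; only the first packet mirrors the capture.

```python
import socket
import struct


def build_arp(op, sender_mac, sender_ip, target_mac, target_ip):
    """Pack a 28-byte Ethernet/IPv4 ARP payload (no Ethernet or 802.1Q header)."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,       # hardware type: Ethernet
        0x0800,  # protocol type: IPv4
        6,       # hardware address length
        4,       # protocol address length
        op,      # 1 = request, 2 = reply
        bytes.fromhex(sender_mac.replace(":", "")),
        socket.inet_aton(sender_ip),
        bytes.fromhex(target_mac.replace(":", "")),
        socket.inet_aton(target_ip),
    )


def is_gratuitous(arp):
    """Same test as the capture filter: sender IP bytes equal target IP bytes."""
    return arp[14:18] == arp[24:28]


# Mirrors the captured GARP for the floating self IP 172.28.17.99.
garp = build_arp(1, "00:01:d7:1e:c3:44", "172.28.17.99",
                 "ff:ff:ff:ff:ff:ff", "172.28.17.99")
# Hypothetical ordinary request, for contrast.
normal = build_arp(1, "00:01:d7:1e:c3:44", "172.28.17.50",
                   "00:00:00:00:00:00", "172.28.17.1")

print(is_gratuitous(garp), is_gratuitous(normal))  # True False
```

Note that every packet in the capture carries vlan 4093 (external); nothing matching the filter shows up on the internal VLAN, which is consistent with the observation that no GARP is sent there.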
    
  • I'm a little bit confused. You don't have a self IP on VLAN B and C, but you have virtual servers on them. What's the virtual server config?

    Would you mind posting the VLAN, self IP and virtual server config here?
  • If I'm not wrong, GARP is sent for the floating self IP and the VIPs. Since those IPs are in VLAN A, I don't think any GARP is being sent on VLAN B and C. Anyway, I'll have a look to see if there is anything I can find.

    SOL11985: Overview of the ARP.GratuitousRate bigpipe database variable
    http://support.f5.com/kb/en-us/solutions/public/11000/900/sol11985.html
  • Thanks for your help on this. I added self IPs, which as we suspected sorted the problem.

    I'm still unclear on why this is needed (F5's documentation, as far as I can see, does not set it out as a requirement), although I appreciate why it's likely required.

    I need to learn more about GARP. I've watched the GARPs with the new self IPs and am puzzled by the MACs in them (they don't match the physical interface). Anyhow, it all works as expected, so like I say, some more testing, research and learning required.

    Thanks again for helping out.

    Cheers

    J