Forum Discussion

Ev-_28244
Nimbostratus
Sep 11, 2008

VLAN Group ARP Issues

Hello all,

 

 

I have a very interesting issue with one of my configurations using an LTM. Due to some environment limitations and time constraints, I had to insert the LTM into an existing network and establish a VLAN Group bridging two VLANs.

 

 

This configuration is currently in production with no issues at all. However, in my DEV/TEST area I have some problems with ARP and the VLAN association.

 

 

Outline of the configuration:

 

 

LTM DEV/TEST
Version : BIG-IP 9.4.3 Build 1.4 Final
VLANs : External (int 1.1), Internal (int 1.2)
VLAN Group : DMZ-Bridge (1.1, 1.2)
Transparency Mode : Translucent
Bridge all traffic : Enabled
VLAN IDs : External, 4094 / Internal, 4093 / DMZ-Bridge, 4094
Self-IP : 192.168.0.2
VIP : 192.168.0.5:*
Pool : 192.168.0.10, 192.168.0.20

 

 

 

See the attached diagram for a complete picture.

 

 

The servers reside on the Internal interface.

 

 

Everything runs fine for about 5-6 hours, then suddenly my servers start failing their health checks and the F5 drops connections from the VIP.

 

 

After some investigation I have discovered the following:

 

 

1) Every 300 seconds (the ARP entry timeout), the VLAN associated with each server's ARP entry changes; it can move from VLAN 4093 to 4094. On occasion the ARP timeout is only 30 seconds. This can be seen by running "b arp show".

2) After about 5-6 hours, at the next ARP table update, the F5 sends the ARP request out but ignores the response, so the health checks fail. About 25-30 seconds later everything recovers, but by then client connections have been dropped.
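
The VLAN changes described in point 1 can be tracked over time with a small script. The following is a minimal Python sketch, assuming the ARP table has already been sampled periodically (for example from repeated "b arp show" runs) and reduced by hand or by another tool to (timestamp, IP, VLAN) tuples. The function name find_vlan_flaps and the sample data are hypothetical, and no particular "b arp show" output format is assumed.

```python
from collections import defaultdict


def find_vlan_flaps(samples):
    """Report every time an IP's ARP entry moves to a different VLAN.

    samples: iterable of (timestamp_seconds, ip, vlan) tuples in time order.
    Returns {ip: [(timestamp, old_vlan, new_vlan), ...]} for IPs that moved.
    """
    last_vlan = {}
    flaps = defaultdict(list)
    for ts, ip, vlan in samples:
        prev = last_vlan.get(ip)
        if prev is not None and prev != vlan:
            flaps[ip].append((ts, prev, vlan))
        last_vlan[ip] = vlan
    return dict(flaps)


# Hypothetical samples: (seconds since start, server IP, VLAN ID of the ARP entry)
samples = [
    (0, "192.168.0.10", "4093"),
    (0, "192.168.0.20", "4093"),
    (300, "192.168.0.10", "4094"),
    (300, "192.168.0.20", "4093"),
    (330, "192.168.0.10", "4093"),
]

for ip, moves in find_vlan_flaps(samples).items():
    for ts, old, new in moves:
        print(f"t={ts}s {ip}: VLAN {old} -> {new}")
```

Logging the timestamps of each move alongside the health-check failures would show whether the outages in point 2 line up with a VLAN change.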

 

 

I have a workaround in TEST/DEV: creating static ARP entries for the servers. The only issue is that I cannot associate a static entry with any VLAN other than the VLAN bridge, so any traffic destined for the servers gets sent out both VLANs in the bridge rather than just one.

 

 

 

I do not have this issue in PROD. PROD is the same with the exception of:

 

 

F5 Version : BIG-IP 9.4.1 Build 29.2

 

Switch : Cisco 3750

 

 

This has me a bit stumped. Any ideas?

 

 

Thanks in advance,

 

 

Evan.

 

  • It seems to me like a version/revision problem; we went through a lot of upgrades before getting ours working. I would try to get both environments onto the same build, which should help a lot. I would also try to get a second switch and make the group transparent (if possible), just to check.

     

     

    The question is, do you see ARP responses where they should be? The MAC moving between VLANs could be caused by that. Check the switch; I have seen this (but years ago) with a multihomed Sun SPARC and Cisco. In that case a bridging F5 would be very confused...

     

     

    We have had a lot of problems with L2 (failover bridging in early 9.x, VLAN groups of more than 2 VLANs, etc.) and we are phasing out the L2 designs. However, we have one last cluster in the same kind of environment you have described: two stacked 3750s and a two-unit cluster of BIG-IP 9.4.4 Build 65.1 Final with 6 VLAN groups. It appears to be running OK for now, but we are in the process of replacing it with an L3 design, which appears to be far more stable. I am not sure about the IOS version on the 3750s, but if you are interested I can ask the admin.
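
One way to answer the question above about whether ARP responses show up where they should is to pair requests with replies from a packet capture. The following is a minimal Python sketch, assuming the capture has already been reduced to (timestamp, kind, target IP) tuples; the function name unanswered_requests, the one-second window, and the sample events are all hypothetical, and no specific capture tool or output format is assumed.

```python
def unanswered_requests(events, window=1.0):
    """Find ARP requests that got no reply for the same IP within `window` seconds.

    events: list of (timestamp, kind, target_ip) sorted by timestamp,
            where kind is "request" or "reply".
    Returns a list of (timestamp, target_ip) for the unanswered requests.
    """
    missing = []
    for i, (ts, kind, ip) in enumerate(events):
        if kind != "request":
            continue
        # Look only at later events for a matching reply inside the window.
        answered = any(
            k == "reply" and target == ip and t <= ts + window
            for t, k, target in events[i + 1:]
        )
        if not answered:
            missing.append((ts, ip))
    return missing


# Hypothetical events: a normal exchange, then two requests whose
# reply only arrives about 25 seconds later.
events = [
    (0.0, "request", "192.168.0.10"),
    (0.01, "reply", "192.168.0.10"),
    (300.0, "request", "192.168.0.10"),
    (301.0, "request", "192.168.0.10"),
    (325.0, "reply", "192.168.0.10"),
]

for ts, ip in unanswered_requests(events):
    print(f"t={ts}s: no reply for {ip} within the window")
```

Running the same check against captures taken on each side of the bridge (External and Internal) would show on which VLAN the replies actually appear.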