Forum Discussion

Dazzla_20011
Nimbostratus
Feb 15, 2010

Virtual Server not responding

We've recently bought two F5 LTMs to load-balance our internal applications. We keep having an intermittent problem in which the Virtual Servers for our test applications stop responding. When the problem occurs, the applications can still be accessed using the real addresses. As soon as I press Update on one of the Virtual Servers, without making any changes, the problem is fixed for all our applications.

The applications are load-balanced using one IP address across different TCP ports, using least connections.

Does anyone have any ideas? I'm curious as to what the Update button does.

A packet capture from the client shows the following:

17,"4.511206","10.216.144.65","10.128.144.24","TCP","4607 > 58053 [SYN] Seq=0 Win=16384 Len=0 MSS=1460"
19,"7.426665","10.216.144.65","10.128.144.24","TCP","4607 > 58053 [SYN] Seq=0 Win=16384 Len=0 MSS=1460"
23,"13.442368","10.216.144.65","10.128.144.24","TCP","4607 > 58053 [SYN] Seq=0 Win=16384 Len=0 MSS=1460"

Many Thanks
  • Those look like typical TCP retries, based on the timing: about 3 seconds before the first retry, 6 seconds before the second, and so on. Here are a few ideas off the top of my head. Note: I'm assuming from the capture above that the VS address is 10.128.144.24.
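The retry spacing can be checked directly from the timestamps in the capture. Here is a minimal Python sketch of that arithmetic; the function name `retry_gaps` is just illustrative, and the only inputs are the three SYN timestamps posted above.

```python
# Timestamps (in seconds) of the three SYNs from the client capture.
syn_times = [4.511206, 7.426665, 13.442368]


def retry_gaps(times):
    """Return the delay between each consecutive pair of timestamps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


gaps = retry_gaps(syn_times)
for attempt, gap in enumerate(gaps, start=1):
    print(f"retry {attempt}: {gap:.3f} s after the previous SYN")

# Ratio of each gap to the one before it. A value near 2 is what you
# would expect from the backoff applied to an unanswered SYN.
ratios = [b / a for a, b in zip(gaps, gaps[1:])]
print("gap ratios:", [round(r, 2) for r in ratios])
```

The gaps come out at roughly 3 and 6 seconds, so the client is simply retransmitting a SYN that never gets an answer.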

     

     

    Typical issues

    Some typical issues that crop up with newer installs are:

    1) ARP cache issues. This is a biggie. Double-check for this and other L2 problems (spanning tree, etc.).

    2) Duplicate IPs. Check for duplicate IPs and IP contention on the BIG-IPs (look in /var/log/ltm for errors).

    What I would do next

    Straight away, fire up a CLI session on the BIG-IP (the active unit) and:

    1) Run tcpdump on the VLAN in question, filtering for the VS address. Something like: 'tcpdump -ni <vlan_name> tcp and host 10.128.144.24'

    2) Do the same, but look for ARP issues: 'tcpdump -ni <vlan_name> arp'

    3) Let the tcpdump from step 1 run while you access the VS on the appropriate port, and see what happens, if anything. If you don't see the traffic hitting the BIG-IP, you've got a routing issue.
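If you want the two captures from the steps above ready to paste, a small Python sketch like this one assembles the command lines. The VLAN name `internal` is a hypothetical placeholder (substitute the VLAN your VS lives on), and the VS address is the one assumed from the capture. It only prints the commands; it does not run tcpdump.

```python
import shlex

VLAN = "internal"             # hypothetical VLAN name; substitute your own
VS_ADDRESS = "10.128.144.24"  # VS address assumed from the client capture


def tcpdump_command(interface, capture_filter):
    """Build a tcpdump argument list.

    -n disables name resolution and -i selects the interface to listen on.
    """
    return ["tcpdump", "-n", "-i", interface] + capture_filter.split()


vs_capture = tcpdump_command(VLAN, f"tcp and host {VS_ADDRESS}")
arp_capture = tcpdump_command(VLAN, "arp")

# Print each command as a single shell-safe line.
for command in (vs_capture, arp_capture):
    print(shlex.join(command))
```

Running the two printed commands in separate CLI sessions lets you watch the VS traffic and the ARP traffic side by side.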

     

     

    Assuming you've got some ARP cache issues, you can run 'b load' from the CLI, which will issue a GARP (gratuitous ARP) on the wire on most versions.

    If none of this turns anything up, call support ASAP. I'd probably do that in parallel with the steps above.

     

     

    Good luck!

     

    -Matt
  • Thanks very much. You've just given me a thought: we're migrating from Cisco CSS devices. I've just checked the CSS, and there appears to be an old content rule for that Virtual Server. I wonder if the CSS is responding to ARP requests.

     

  • It's very possible. The good news is that you'll be able to track that one down pretty quickly with some captures. Please post back the final analysis!

     

    -Matt
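One way to confirm that a second device is answering ARP for the VS address is to note which MAC address each ARP reply in a capture claims for which IP, then look for an IP claimed by more than one MAC. The Python sketch below shows the idea. The function name `find_contended_ips`, the MAC addresses, and the second IP are all made up for illustration; the pairs would be transcribed by hand from your own capture.

```python
from collections import defaultdict


def find_contended_ips(arp_replies):
    """Return IPs claimed by more than one MAC, mapped to those MACs."""
    claims = defaultdict(set)
    for ip, mac in arp_replies:
        # Normalise case so the same MAC is not counted twice.
        claims[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in claims.items() if len(macs) > 1}


# Hypothetical (IP, MAC) pairs taken from ARP replies; all values are invented.
observed = [
    ("10.128.144.24", "00:01:d7:aa:bb:01"),
    ("10.128.144.24", "00:0b:be:cc:dd:02"),
    ("10.128.144.25", "00:01:d7:aa:bb:01"),
]

for ip, macs in find_contended_ips(observed).items():
    print(f"{ip} is claimed by {len(macs)} MACs: {', '.join(macs)}")
```

If the VS address shows up with two MACs, one of them should belong to the BIG-IP and the other to whatever else is still answering, such as an old load balancer.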