Forum Discussion

Marc_57522
Nimbostratus
Jun 06, 2011

High ping latency and loss of connection

I've installed the trial of LTM VE on ESXi 4.1 and, most of the time, it works perfectly. Randomly, however, I lose connectivity to the LTM virtual machine and all nodes behind it. During these outages the LTM cannot ping any nodes, including VMs on the same ESXi host.

This, to me, indicates that the issue isn't with networking outside of the ESXi host, but rather within the virtual machine or the virtual switch. I've moved the VM to another ESXi host, but the problem persists.

Another curious sign is the ping latency from the LTM out to a VM node on the same ESXi host:
PING 172.16.xxx.xxx (172.16.xxx.xxx) 56(84) bytes of data.
64 bytes from 172.16.xxx.xxx: icmp_seq=1 ttl=128 time=7.25 ms
64 bytes from 172.16.xxx.xxx: icmp_seq=2 ttl=128 time=9.26 ms
64 bytes from 172.16.xxx.xxx: icmp_seq=3 ttl=128 time=10.2 ms
64 bytes from 172.16.xxx.xxx: icmp_seq=4 ttl=128 time=10.2 ms
64 bytes from 172.16.xxx.xxx: icmp_seq=5 ttl=128 time=9.12 ms
64 bytes from 172.16.xxx.xxx: icmp_seq=6 ttl=128 time=10.3 ms

--- 172.16.xxx.xxx ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5035ms
rtt min/avg/max/mdev = 7.252/9.421/10.319/1.091 ms

If, on the other hand, I ping from a node to the LTM, I get <1ms latency. So:

LTM VE -> MyHost = ~10ms
MyHost -> LTM VE = <1ms
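One thing I may try next is capturing the ICMP exchange on both ends and comparing the timing, to see whether the extra ~10ms shows up on the LTM's side or the node's side. A rough sketch (the VLAN name "internal", the node interface eth0, and the masked addresses are placeholders for my setup):

# From the LTM VE's bash prompt: capture echo request/reply on the internal VLAN,
# printing the time delta between packets (-ttt)
tcpdump -nni internal -ttt icmp and host 172.16.xxx.xxx

# On the Linux VM node: capture the same exchange on its interface
tcpdump -nni eth0 -ttt icmp

If the node's capture shows it answering immediately while the LTM doesn't see the reply until ~10ms later, the delay would seem to be added on the VE's receive path or the vSwitch rather than by the node itself.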

Marc

  • A few notes:

    1. My LTM VE instance has two interfaces, but the internal interface handles four tagged VLANs; the external interface carries a single untagged VLAN (roughly the layout sketched below).

    2. Nothing is logged to any of the /var/log files that would be of any help.

    3. Performance graphs don't indicate that I'm hitting any sort of ceiling (there's no load on this yet).

    4. Outages last for 2-3 minutes, then traffic resumes on its own.
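    For reference, the layout from note 1 looks roughly like this in tmsh (the VLAN names, tags, and interface numbers here are placeholders, not my real config):

    # internal interface (1.2): four tagged VLANs
    create net vlan vlan_app1 tag 101 interfaces add { 1.2 { tagged } }
    create net vlan vlan_app2 tag 102 interfaces add { 1.2 { tagged } }
    create net vlan vlan_app3 tag 103 interfaces add { 1.2 { tagged } }
    create net vlan vlan_app4 tag 104 interfaces add { 1.2 { tagged } }

    # external interface (1.1): one untagged VLAN
    create net vlan vlan_ext interfaces add { 1.1 { untagged } }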

  • 1) Intermittent connectivity like this sounds like a duplicate MAC or IP address present elsewhere in the layer 2 infrastructure.

    2) 10ms ping responses from a node generally indicate a slow node/polling driver; the fact that the VE answers the node's pings quickly points to the node side rather than the VE. Are your nodes using E1000 NICs or other fully emulated NIC types? I've seen slow responses there.
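    A couple of quick checks along those lines (just a sketch; the interface names and masked address are placeholders, and this assumes the iputils arping run from a Linux host on the same VLAN):

    # See whether anything else answers ARP for the LTM self IP
    # (-D = duplicate address detection, -I = interface to probe from)
    arping -D -c 3 -I eth0 172.16.xxx.xxx

    # Watch which MAC addresses are answering ARP on that segment
    tcpdump -enni eth0 arp

    # On a VM node, confirm which virtual NIC driver it is really using
    ethtool -i eth0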

  • Posted By qe on 06/06/2011 01:19 PM

    1) Intermittent connectivity like this sounds like a duplicate MAC or IP address present elsewhere in the layer 2 infrastructure.

    Wouldn't a dupe only cause problems on one of the interfaces? I can't ping out of any of the LTM's interfaces while this is going on. Would the LTM log IP conflicts anywhere?

    2) 10ms ping responses from a node generally indicate a slow node/polling driver; the fact that the VE answers the node's pings quickly points to the node side rather than the VE. Are your nodes using E1000 NICs or other fully emulated NIC types? I've seen slow responses there.

    Node-to-node ping (even across two LTM interfaces) is about 1ms. The only pings that exceed this are LTM-to-node. The LTM installation is the trial, v10.1.0.3341, and it has E1000 interfaces. All of my VM nodes use VMXNET 3 adapters, and a few are physical nodes with Broadcom NICs.

    Is there a chance that the trial version of LTM VE is rate-limited by holding every packet for 10ms?
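    One way I can think of to test that (a sketch from the LTM's bash prompt; the node address is a placeholder): see whether the ~10ms changes with packet rate and size. A throughput cap should get worse as the rate or size goes up, while a fixed polling/coalescing delay should stay at roughly 10ms either way:

    # baseline: default 56-byte payload, one packet per second
    ping -c 20 172.16.xxx.xxx

    # higher rate: five packets per second
    ping -c 50 -i 0.2 172.16.xxx.xxx

    # larger payload (1400 bytes, still inside a 1500-byte MTU)
    ping -c 20 -s 1400 172.16.xxx.xxx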

  • Hi Marc,

    Have you ever found a solution to this problem? The exact same situation has been affecting my setup.
  • John_Hall_11177
    Historic F5 Account
    Marc,

    If LTM held packets for 10ms, then you'd see the same delay on LTM-to-LTM interfaces. One other diagnostic step would be to temporarily switch the nodes you're communicating with to the e1000 driver. We've seen some very weird behavior with VMware when receiving traffic on the VMXNET 3 driver: very large packets get handed off to upper layers that aren't expecting them. Apparently the VMXNET 3 driver doesn't always honor the configured MTU of the guest.
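    If you want to look for that on one of the Linux nodes first, a rough sketch (assuming an interface named eth0 and a standard 1500-byte MTU):

    # confirm the guest's configured MTU and the driver actually in use
    ip link show eth0
    ethtool -i eth0

    # watch for frames arriving larger than the configured MTU
    # (1514 bytes = 1500-byte payload + 14-byte Ethernet header)
    tcpdump -nni eth0 'greater 1515'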