I'll try to explain this as best I can. I've inherited an F5 with an untested solution configured, and I'm trying to troubleshoot an issue.

I have a VLAN with middle-tier servers that use the F5 as their default gateway. When they try to access a VS on the same F5, they fail to get to the pages on the pool members. My forwarding VS works fine: the servers can reach the rest of the network, and my VS IPs are pingable.

Traffic flow: an external user hits the VS on 10.1.1.1 port 8010 with a specific URL, coming into the F5 on a front-end VLAN. The F5 sends the connection to one of the pool members on the back-end VLAN, 10.2.1.1 port 8010. That server then opens a new connection to the same VS to access services on one of the same pool members (same VS, different URL). It could get balanced to a different server or to the same one. These services are needed in order to render the page.

This is where it fails: since the server cannot get through the VS, it cannot render the page and errors out. It fails on all VSs on the F5 from that back-end VLAN, including some that go to static content. If you hit the VSs from outside of the F5, they work.

Any ideas? Thanks.

Details: F5 3900, V.11.3.2806

Problem with servers using F5 as DG

15 Replies

HHeredia_36237
Nimbostratus
Aug 14, 2013
Hmm, the information is somewhat general. Try adding SNAT Automap in the advanced configuration of the VS, just for testing. SNAT also helps with some routing problems. It seems some broken sessions are occurring there.

Also, it's kind of odd that the server issues a request to the VS IP address and, because of the LB decision, the F5 may forward the packet to the same server again.

Try tcpdump on the internal and external interfaces. It would also be nice to run Wireshark on the server itself to find out where the packet is getting stuck.
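As a rough sketch of those captures on the BIG-IP, assuming the VLANs are named internal and external (placeholder names, substitute your own) and using the addresses from the original post:

```shell
# Run each capture in its own SSH session on the BIG-IP.

# Back-end VLAN: both legs of the failing connection use port 8010 here
# (server -> VS, then F5 -> pool member), so filter on the port alone.
tcpdump -ni internal -s0 -c 200 port 8010

# Front-end VLAN: capture a working external request to the VS for comparison.
tcpdump -ni external -s0 -c 200 host 10.1.1.1 and port 8010
```

The -c 200 only stops each capture after 200 packets; drop it to capture until Ctrl-C.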

Good Luck,

HHeredia
Kevin_Stewart
Employee
Aug 14, 2013
Just to amplify HHeredia's suggestions, it's very likely that your servers have a known route to each other. It doesn't matter that the F5 is their default route, because the server they're talking to is on the same subnet. If you look at a packet capture you'll likely see the pool member talk to the VIP, which then talks to another pool member, and then nothing else (probably a reset). A SNAT applied to the VIP will translate the client (pool member) source address to an IP controlled by the F5 so that return traffic is forced back through the F5.
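To make the routing point concrete, here is a small sketch of the decision a pool member makes when it replies. It assumes a /24 mask on the back-end VLAN and a second pool member at 10.2.1.2, neither of which is stated in the thread, and the helper names are made up for illustration.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    local IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if two addresses fall in the same subnet for the given prefix length.
same_subnet() {
    local mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

# How a server at $1 (prefix length $3) would send a reply to source address $2.
reply_path() {
    if same_subnet "$1" "$2" "$3"; then
        echo "direct"
    else
        echo "via-gateway"
    fi
}

# Without SNAT the reply targets the requesting pool member (assumed 10.2.1.2),
# which is on-subnet, so the reply never passes back through the F5.
reply_path 10.2.1.1 10.2.1.2 24

# An external client (documentation address) is off-subnet, so the reply
# goes to the default gateway, which is the F5.
reply_path 10.2.1.1 192.0.2.10 24
```

With SNAT in place the source the pool member sees is an address owned by the F5, so whichever path the reply takes, it lands back on the F5.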
Mike_Marvel_629
Nimbostratus
Aug 15, 2013
A bit more detail... I tried SNAT to see if that made a difference but it didn't. The servers, when they are trying to hit the VS IP, are actually routing to the DG to get there because it's in a different VLAN but on the same F5.
Kevin_Stewart
Employee
Aug 15, 2013
So do you see the servers reaching the VIP? If so, do you see the F5 talking to the pool members?
Mark_Harris_608
Cirrus
Aug 15, 2013
If I understand your configuration, it's a rare one: a server trying to reach servers on the same network and in the same server pool as the requesting server. The SNAT should work, but perhaps you used Auto-SNAT and the source is the same as the VS address (e.g. 10.1.1.1), which could cause the loop you are seeing. Try a SNAT, but use an address other than the VIP. Another alternative is to create a new VS with a VIP on the same network as the server (and pool members) you are trying to access (i.e. 10.2.1.x).

I have no idea what would happen if the requesting server gets itself as the destination in this scenario. It seems to me you should put the content on another IP address range on the same server, or even on another pool of servers, and make this a whole lot easier! All this looping back through the LTM can't be more efficient or better performing than just distributing the content to different IP addresses and using internal VIPs instead of pushing everything back through the 10.1.1.x VIP.
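If you want to try a SNAT that isn't automap, something along these lines should do it from the BIG-IP shell. The object names (snat_backend, vs_app_8010) and the 10.2.1.250 address are hypothetical; pick an unused address the F5 can own on the back-end network.

```shell
# Create a SNAT pool holding one translation address
# (hypothetical address; it must be unused on the back-end network).
tmsh create ltm snatpool snat_backend members add { 10.2.1.250 }

# Point the existing virtual server (hypothetical name) at that SNAT pool.
tmsh modify ltm virtual vs_app_8010 source-address-translation { type snat pool snat_backend }

# Confirm the setting took effect.
tmsh list ltm virtual vs_app_8010 source-address-translation
```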
- Mike_Marvel_629
  Nimbostratus
  Aug 15, 2013
  Preaching to the choir on this one Mark... but you know how Devs work. :) I tried SNAT but only tried auto-map. I'll try to SNAT to something else and see how that goes. Thanks.
- BinaryCanary_19
  Historic F5 Account
  Aug 15, 2013
  This is strange. Why would the server connect back to the virtual server that is sending it load-balanced connections? If the resources are on the server itself, is it not more efficient for it to connect to its own loopback address?
- Mark_Harris_608
  Cirrus
  Aug 15, 2013
  Yes, and to continue to distribute the requests across the same array of servers for dynamic page content the same way the initial connection is distributed, simply running another IP address on the server and adding to a separate pool would resolve the networking loops. Loopback address would concentrate requests for the dynamic content to the same server initially connected to. I assume the developers are trying to create one big cluster so it scales better, more servers, more resources for initial and page-building content. I say go "old school" and run multiple IP addresses on each server.
Kevin_Stewart
Employee
Aug 15, 2013
Do you need to maintain any kind of persistence or tracking information from the external client request to the server's request? If not, perhaps it'd be easier to duplicate the external VIP on the internal VLAN so that you don't have to use a DG setting. You'll still need SNAT.
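A minimal sketch of what that duplicate could look like, assuming a pool named pool_app_8010, a back-end VLAN named internal, and a free address 10.2.1.100 (all hypothetical). The profiles should mirror whatever the original VS uses; tcp and http are only a guess here.

```shell
# Internal copy of the VIP, listening only on the back-end VLAN,
# with SNAT automap so replies return through the F5.
tmsh create ltm virtual vs_app_8010_internal \
    destination 10.2.1.100:8010 \
    ip-protocol tcp \
    pool pool_app_8010 \
    profiles add { tcp http } \
    source-address-translation { type automap } \
    vlans-enabled vlans add { internal }
```

The servers would then call 10.2.1.100:8010 instead of 10.1.1.1:8010, so the request no longer has to be routed through the default gateway.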
Kevin_Stewart
Employee
Aug 16, 2013
I may still be missing some details, but I don't think you have a lot of options without SNAT. Default gateway or not, even if the packets from a server made it to a pool member, they'd carry a client source address that the pool member would know how to route back to directly. Besides, a server wouldn't use its default gateway if the source of the request was on the same subnet.

I'm beginning to believe that accessing a VIP on another BIG-IP VLAN, using the BIG-IP as the DG, is another problem entirely. I can't say for certain if that's supposed to work. In any case, if you recreate the VIP on the internal VLAN for internal servers to use, you shouldn't need the DG. The only alternative to a SNAT, I believe, is static routes - which would get messy.
marco_octavian_
Nimbostratus
Aug 16, 2013
This scenario isn't that uncommon. I've seen it a lot at Fortune 50 companies using IBM/IHS-WAS combinations. Shared resources/modules/services on the same box. I've also seen variations of SAP do this. The SAP implementation would get ugly sometimes because ip_address/host info would be parsed in the sap payload, if I am recalling correctly. Both of these are by design.

Mike,

1) Read Kevin's post again. Once that "un-snat'd" packet comes across, all the communication is local. The only part I have concern with is the option of static routes. The static routes won't work because the 10.2.1.0 network is a directly connected interface. It will always win.

2) Something is missing here. Perform a tcpdump again and catch a transaction. What is the port (VS:port combo) for this middleware VS? 10.1.1.1:xx ??? ex: tcpdump -ni internal host x.x.x.x and port xxxx

3) Automap should work fine. Without it, the communication might be breaking due to confusing arp entries on the server(s) since the first syn packet will have the F5's mac address and the reply would theoretically be the server arp entry in the local cache of the destination server. Just thinking out loud here. Could be wrong. Is "Allow SNAT" set to "No" at the pool level (advanced settings)?

4) Perhaps you could use an iRule with 1:1 SNAT mappings to pool members. At least this would satisfy the logging need for original source ip addresses.

5) I would also create an additional FastL4 virtual server (10.1.1.x) and see if it behaves any differently with and without SNAT enabled. I'm sure you can temporarily point a single server to this new VIP for testing.

6) What happens when you manually configure a server to point to a middleware server's ip address specifically and not use the vip (bypass LTM) ? Does that work?

I have run into this problem before, but I just can't recall how I solved it. I think I used an iRule to swap IP addresses, etc. Just can't remember, sorry.

Either way, some piece of data seems to be missing. The fact that Automap doesn't work tells me we are missing some data here. Let's get those tcpdumps. Feel free to post config snippets as well.
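For anyone collecting the same data, a sketch of the checks mentioned above, run from the BIG-IP shell. The pool and virtual server names are hypothetical, and the output path is just an example.

```shell
# Is SNAT permitted for this pool? (See the "Allow SNAT" question in point 3.)
tmsh list ltm pool pool_app_8010 allow-snat

# Which source address translation is actually set on the virtual server?
tmsh list ltm virtual vs_app_8010 source-address-translation

# Capture across all VLANs (0.0 is the BIG-IP "all interfaces" name) to a file
# that can be opened in Wireshark. Stop with Ctrl-C after reproducing the failure.
tcpdump -ni 0.0 -s0 -w /var/tmp/vs8010.pcap port 8010
```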
- BinaryCanary_19
  Historic F5 Account
  Aug 16, 2013
  The scenario you explain suggests possibly a loop (packets going back and forth between VIP and server until TTL expires). Perhaps this guy has autolasthop disabled.
- BinaryCanary_19
  Historic F5 Account
  Aug 16, 2013
  Also in this situation, it is not exactly connections that the servers are sending back, just the same packets but with the payload modified (compression perhaps). In the few such scenarios I've seen, the users typically configure specific virtual servers for the external-to-internal communications, and specific ones to handle the server-to-vip connections, placing both on different VLANs.