Forum Discussion

smp_86112
Sep 29, 2010

Cookie Persistence Failure Due To Client Behavior

Here's a tricky problem with a VIP using Cookie persistence and a Pool using Least Connections (member). A client opens up a first TCP connection which is load-balanced to M1 and sends a POST. Before receiving a response (i.e. it has no cookie), a second TCP connection is opened which is load-balanced to M2 and a second POST is sent. When the first response is received, I presume the cookie for M1 is set. However when the second response is received, the M1 cookie is clobbered by the M2 cookie. Next, the client sends another transaction on the TCP connection for M1, but with the M2 cookie. Because the app also uses cookies, it sees the request came to the wrong app server which generates an error.
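To make the sequence above concrete, here is a minimal Python sketch of the race. It is a toy model and not F5 code: the `Pool` class, the `persist` cookie name, and the tie-breaking rule are all assumptions made for illustration.

```python
class Pool:
    """Toy Least Connections (member) pool; hypothetical, not F5 code."""

    def __init__(self, members):
        self.members = list(members)
        self.active = {m: 0 for m in self.members}

    def pick(self):
        # Choose the member with the fewest active connections
        # (ties go to the earliest member in the list).
        member = min(self.members, key=lambda m: self.active[m])
        self.active[member] += 1
        return member


def simulate_race():
    pool = Pool(["M1", "M2"])
    cookie_jar = {}  # the browser keeps one value per cookie name

    # Two connections open before any response arrives, so neither
    # request carries a persistence cookie.
    conn1_member = pool.pick()
    conn2_member = pool.pick()

    # Responses arrive in order; each Set-Cookie overwrites the previous one.
    for member in (conn1_member, conn2_member):
        cookie_jar["persist"] = member

    # The next request reuses connection 1 but carries the surviving cookie.
    return conn1_member, conn2_member, cookie_jar["persist"]


print(simulate_race())
```

In this model the call returns `('M1', 'M2', 'M2')`: connection 1 is still bound to M1, but the only cookie left in the browser names M2.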



Not being all that familiar with persistence behavior, we were thinking about two possible ways to resolve this:


1) Enable OneConnect. However we fear this will not address the situation where a second cookie overwrites the first in a case where a client has two outstanding HTTP transactions.


2) Enable source_addr as a Fallback persistence.



If we use source_addr as Fallback persistence though, won't that mean that the load-balancing decision happens during the initial handshake, and that the BigIP cookie becomes irrelevant?



Would appreciate some thoughts on how to get this working. Thanks.


7 Replies

  • Hi SMP,



    What kind of client and app is this? For a standard browser and web app, the client can't make a second request before a response comes back to the first request as the client won't know what to request. For example, a client couldn't know to request an image if it hadn't already requested a URI which references the image.



    Are you testing some kind of atypical client and/or web app? If not, I'd guess you have multiple clients connecting to a virtual server from behind a shared proxy. If that's the case, you might see serverside TCP connections being re-used for the "wrong" clients. A simple fix for this is to add a OneConnect profile. This ensures LTM will open a new serverside connection if a request mid-connection needs to be sent to a different pool member. If you're using SNAT, you can set the OneConnect source mask so that serverside connections are reused for all clients. If you're not using SNAT, you can set the source mask so that a serverside connection is only reused for the same client IP.
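As a rough illustration of how a source mask can widen or narrow connection reuse, the Python sketch below ANDs a client address with a mask to produce a grouping key. The function name `reuse_key` and the example masks are assumptions chosen for illustration, not values taken from this post, and the real OneConnect logic may differ in detail.

```python
import ipaddress


def reuse_key(client_ip: str, source_mask: str) -> str:
    """Return the masked client address used to group clients (hypothetical helper)."""
    ip = int(ipaddress.IPv4Address(client_ip))
    mask = int(ipaddress.IPv4Address(source_mask))
    # Clients whose masked addresses match would share a reuse group.
    return str(ipaddress.IPv4Address(ip & mask))


# An all-zero mask puts every client in the same group.
print(reuse_key("10.1.1.5", "0.0.0.0"), reuse_key("192.168.0.9", "0.0.0.0"))

# An all-ones mask gives each client IP its own group.
print(reuse_key("10.1.1.5", "255.255.255.255"))
```

The wider the group, the more clients can share an idle serverside connection, which is why the choice depends on whether the server sees the real client address or a SNAT address.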



  • This is an SAP portal app with IE7 as the browser. We just discovered a problem the other day with the same app that triggers an IE bug, but only under really specific conditions and is nearly impossible to reproduce. So there's all sorts of funky stuff going on with this thing. I know AJAX is involved, but don't know much else.



    What I see is two new TCP connections opened to two different pool members from the same client. The client does a POST that is sent on Connection1, but before its response is received, I see a second POST sent on Connection2. Neither of these POSTs contains a persistence cookie. Then a response is received on Connection1, followed by a response on Connection2. The next transaction I see is a POST on Connection1 with the Connection2 cookie. This results in a server error.



    The more I think about this, the more I think the right answer is to apply source_addr persistence. When the initial TCP connection is made, the client will get locked into a pool member. When he starts sending HTTP, the persistence cookie will take over. If the client decides to open another TCP connection, source_addr will keep him on the same member.
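The idea in the last paragraph can be sketched in Python as a lookup order: cookie first, then a source-address record, then a fresh load-balancing pick. This is a simplified model under my own assumptions; `choose_member`, `source_table`, and the client IP are hypothetical, and record timeouts are ignored.

```python
import itertools


def choose_member(cookie_member, client_ip, source_table, pick):
    """Pick a pool member: cookie first, then source-address fallback."""
    if cookie_member is not None:
        return cookie_member            # the persistence cookie wins when present
    if client_ip in source_table:
        return source_table[client_ip]  # fallback: reuse the member seen for this IP
    member = pick()                     # no record yet: make a load-balancing decision
    source_table[client_ip] = member    # remember it for later cookieless requests
    return member


members = itertools.cycle(["M1", "M2"])  # stand-in for the load-balancing method
table = {}

# Two cookieless connections from the same client land on the same member.
first = choose_member(None, "10.0.0.7", table, lambda: next(members))
second = choose_member(None, "10.0.0.7", table, lambda: next(members))
print(first, second)
```

In this model both cookieless requests resolve to the same member, so the two Set-Cookie responses would carry the same value and nothing gets clobbered.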
  • This sounds really tricky. I'm actually curious to know if it's an AJAX component firing off the other POST, as this behavior sounds pretty atypical.


  • Source address persistence would probably be a safe, quick fix.



    Thinking about this though, I can't see how a client could send two concurrent POST requests to start a session. So either this is a very weird scenario or maybe you're not seeing the initial application session setup. Regardless, I imagine F5 Support could help diagnose the issue once you have more time to troubleshoot it.



  • You are right hoolio, the two POSTs I am referring to are not the start of the browser session. In fact, they are at the very end because the user stopped navigation as soon as he got the error. I'm sorry if I gave you that impression - that was miscommunication on my part. The behavior I describe happens while navigating the application. In this case the user was just browsing around the app trying to hit the error, so there is stuff going on before these two POSTs I describe. To make it worse, I didn't stop and start the trace before each attempt. But I'm pretty confident that the behavior I described is at the root.



    I just applied OneConnect. Initial feedback is that it didn't break anything, and they claim they can't repro the error condition now. But lots more testing needs to be done yet...
  • Update...testing appears to show OneConnect fixed the issue. I also found out that the app is AJAX-based. Looking at the POSTs in the traces, I saw that the referrer is from another app. So I think what happens is that the browser hits one app (which I was not tracing), that generates multiple AJAX calls to another app. Those AJAX calls are getting sent without cookies, and the last one to return wins. With OneConnect, that last cookie then gets evaluated with each subsequent request which keeps the client persisted to the same pool member throughout the remaining life of the browser session.



    Whew...this one was nasty.
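A small Python sketch of why evaluating the cookie on every request resolves the mismatch. It models only the behavior described in this thread; the `route` function and its `per_request` flag are hypothetical names, not an F5 API.

```python
def route(conn_member, cookie_member, per_request):
    """Return the pool member that receives a request.

    conn_member   -- member chosen when the TCP connection was first load-balanced
    cookie_member -- member named by the persistence cookie, or None
    per_request   -- True models persistence being evaluated on every request
    """
    if per_request and cookie_member is not None:
        return cookie_member   # follow the cookie, even mid-connection
    return conn_member         # otherwise stay with the connection's first decision


# After the race: connection 1 was balanced to M1, but the surviving cookie names M2.
without_per_request = route("M1", "M2", per_request=False)
with_per_request = route("M1", "M2", per_request=True)

# The app errors when the request lands on a member other than the one in the cookie.
print(without_per_request, with_per_request)
```

With per-request evaluation the request follows the last cookie to M2, which matches the observation that the last AJAX response to return decides where the rest of the session goes.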
  • Can anyone help me out? After logging in to my retail banking site, I leave the session idle for about 2 minutes. When I then browse to the next page, it says the page is not available.


    When I ran Fiddler, I saw that requests with the same session ID and the same cookie were being directed to different servers.


    We are using cookie-based persistence, with source_addr as the Fallback persistence.