Forum Discussion

AndOs
Mar 30, 2011

Cookie persistence and intermittent node failure

Hi!

I'm using cookie persistence for one of our applications.

We are having an issue where the nodes go up and down intermittently.

While that is being looked at, I would like to keep users on the member they were load balanced to, even if it temporarily goes down.

I thought that cookie persistence would take care of this and keep a user on a specific member until the cookie timed out, but that's not what's happening.

Instead, I'm seeing users get load balanced away from the member that's marked down, and a new cookie is sent from BIG-IP.

Have I done something wrong in the configuration? The virtual server was created from an IIS template with cookie persistence as the default.

I've tried adding a OneConnect profile, which I read about in the manual, but that didn't change the behavior.

Does BIG-IP always make a new load balancing decision for a user when a member goes down, regardless of the persistence method used?

/Andreas

  • What is Action on Service Down set to on the pool? Can you post the relevant anonymized config?
  • AndOs
    Action on Service Down is set to None. I guess this is the default?

    This is the config I'm using:

    monitor cookie_app_monitor {
       defaults from http
       interval 30
       timeout 91
       partition App1
    }
    
    profile persist cookie_app_persist_profile {
       defaults from cookie
       mode cookie
    }
    
    profile tcp cookie_app_lan-optimized_tcp_profile {
       defaults from tcp-lan-optimized
    }
    profile tcp cookie_app_wan-optimized_tcp_profile {
       defaults from tcp-wan-optimized
    }
    
    node 10.1.2.11 {
       screen cookie_app_node1
    }
    node 10.1.2.12 {
       screen cookie_app_node2
    }
    
    pool cookie_app_pool {
       lb method member least conn
       monitor all tcp_half_open
       members {
          10.10.2.11:http {
             priority 1
          }
          10.10.2.12:http {
             priority 1
          }
       }
    }
    
    virtual cookie_app_virtual_server {
       snat automap
       pool cookie_app_pool
       destination 10.10.3.22:http
       ip protocol tcp
       persist cookie_app_persist_profile
       profiles {
          cookie_app_lan-optimized_tcp_profile {
             serverside
          }
          cookie_app_wan-optimized_tcp_profile {
             clientside
          }
          microsoft_iis_http-wan-optimized-compression_shared_http {}
       }
    }
    
    
    
    
    
  • AndOs,

    To my knowledge, you cannot do what you are attempting with the BIG-IP LTM.

    The LTM will try to keep your client connected to an available pool member. If one goes down, you can have the LTM respond to the failure automatically on the client's behalf (Action On Service Down; the default is None, and I suggest changing it to Reselect).

    Cookie persistence is how the BIG-IP LTM keeps track of the server that the client was load balanced to. If that server goes down, then Action On Service Down should select a new server and issue a new cookie. That behavior is correct.
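
The decision described above can be sketched roughly as follows. This is only an illustrative model, not BIG-IP code: the `Member` class, the `pick_member` function, and the use of a plain member address as the cookie value are all assumptions made for the example (a real BIG-IP cookie encodes the member rather than carrying it in clear text). Least connections is used for the fallback choice because that is the pool's configured method.

```python
from dataclasses import dataclass


@dataclass
class Member:
    address: str          # hypothetical "ip:port" label for a pool member
    up: bool = True       # health-monitor status
    connections: int = 0  # current connection count


def pick_member(members, cookie_value):
    """Return (member, set_cookie) for a new client connection.

    cookie_value is the member address carried by the persistence cookie,
    or None when the client sent no cookie. set_cookie is True when a new
    cookie has to be issued to the client.
    """
    by_address = {m.address: m for m in members}
    persisted = by_address.get(cookie_value)
    if persisted is not None and persisted.up:
        return persisted, False  # honor the existing cookie
    available = [m for m in members if m.up]
    if not available:
        raise RuntimeError("no pool members available")
    # Persisted member is down or unknown: fall back to least connections.
    chosen = min(available, key=lambda m: m.connections)
    return chosen, True          # the old persistence is replaced


pool = [Member("10.10.2.11:80"), Member("10.10.2.12:80")]
member, new_cookie = pick_member(pool, "10.10.2.11:80")
print(member.address, new_cookie)  # 10.10.2.11:80 False

pool[0].up = False                 # monitor marks the member down
member, new_cookie = pick_member(pool, "10.10.2.11:80")
print(member.address, new_cookie)  # 10.10.2.12:80 True
```

In this model the cookie is only honored while the member it names is marked up, which matches the behavior reported in the original post.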

     

     

    If you are looking to keep the session information consistent, you will need a different approach (store session information in a database, a session cookie, etc.).

    The BIG-IP LTM does not preserve session information for the client when a server fails. It can keep persistence and state information current between a pair of BIG-IP LTMs (virtual server configuration, Connection Mirroring), but there is an associated performance cost for doing this.

    Hope this helps.
  • Actually, Action on Service Down plays directly into the behavior he's looking for.

    I've done extensive testing on the behavior of pool members and the Action on Service Down settings. For the record, "reselect" does not behave the way you might think or want it to, at least in the later builds of 9.4.8, where I did most of my testing. I've opened several cases on this.

    Reselect will not automatically "move" connections to available hosts if data is constantly flowing on that connection to a server that is still responding. To stop connections to that server without fail, you would have to use "reject", which sends a RST to both client and server, but that's not what AndOs is looking for.

    None should give you the behavior you're looking for, as long as the TCP connection is still established, the client is still sending, and the server is still responding.

     

     

    LTM: Action on Service Down

    http://devcentral.f5.com/Tutorials/TechTips/tabid/63/articleType/ArticleView/articleId/179/LTM-Action-on-Service-Down.aspx

    None (default)

    LTM will continue to send data on established connections as long as client is sending and server is responding. Connection management / recovery / cleanup is via standard TCP mechanics for both clientside and serverside flows.

    Use "None" if you don't want LTM to intervene in managing either side of the connection. Useful if your servers may not be accepting new connections, but should be allowed to continue servicing existing connections when marked DOWN. Also supports custom monitoring designed to support connection bleeding and other non-standard state management schemes.
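
As a rough way to compare the three settings mentioned in this thread, here is a small model of what each one is meant to do to an already established connection when its member is marked down. It is a sketch under assumptions, not BIG-IP behavior verified by me: the `ServiceDownAction` enum, the `on_member_down` function, and the dictionary shape are all made up for illustration, and the "reselect" branch shows the documented intent, which, as noted above, did not always match what was observed on 9.4.8.

```python
from enum import Enum


class ServiceDownAction(Enum):
    NONE = "none"
    REJECT = "reject"
    RESELECT = "reselect"


def on_member_down(action, connection, available_members):
    """Model the fate of one established connection when its member goes down.

    connection is a dict with "client" and "member" keys (hypothetical shape).
    available_members lists the members still marked up.
    """
    if action is ServiceDownAction.NONE:
        # Leave the flow alone; ordinary TCP mechanics clean it up later.
        return {"state": "established", "member": connection["member"]}
    if action is ServiceDownAction.REJECT:
        # Reset both the client side and the server side.
        return {"state": "reset", "member": None}
    if action is ServiceDownAction.RESELECT:
        # Intended behavior: continue on another member that is still up.
        if not available_members:
            return {"state": "reset", "member": None}
        return {"state": "established", "member": available_members[0]}
    raise ValueError(f"unknown action: {action!r}")


conn = {"client": "198.51.100.7", "member": "10.10.2.11:80"}
still_up = ["10.10.2.12:80"]
for action in ServiceDownAction:
    print(action.value, on_member_down(action, conn, still_up))
```

Note that this only covers connections that already exist. A new connection from the same client still goes through the persistence lookup, which is where a down member leads to a new load balancing decision.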

     

     

     

     

    The session cookie should last as long as the "session", so I believe as long as the browser is open. I don't believe it's the cookie timing out; it's the connection. You may be able to get around this by removing your optimization profiles, setting up a TCP profile, and changing its "idle-timeout" (default 300 seconds).

    See if that works for you. I'm also not sure why your pool members have a priority of 1 assigned; it doesn't look like it should make a difference, since I don't see a minimum active members setting with it. I know priorities aren't displayed like that by default in 10.2, but I don't remember how it was in prior versions. What version are you running?

     

     

  • AndOs
    Posted By Michael Yates on 03/30/2011 01:02 PM

    The BigIP LTM does not keep session information for the client in response to a server failure.

    Ah, no, that's not what I meant.

    I don't want the LTM to hold session information; that lives in the web server.

    I want the LTM to direct the client to the same member it was previously load balanced to, even if the member goes down for a few seconds.

     

     

     

    Posted By iRuleYou on 03/30/2011 07:10 PM

     

     

    See if that works for you.. I'm also not sure why you have priorities assigned to pool members of 1.. doesn't look like it should make a difference.. I don't see a minimum active members command with it.. I know they don't display by default like that in 10.2.. but I don't remember in prior versions.. what version are you running?

     

     

    We're running 10.2.0 Build 1755.1.

    The VIP and pool were created from the IIS template, so I guess the priorities were set by default.

    I'll try the Action on Service Down setting and see what happens...

  • It sounds like your Action on Service Down is already set to the default, "none", which should give you the desired behavior. Read the Action on Service Down article linked above.

    What I recommended was:

    The session cookie should last as long as the "session".. So I believe as long as their browser is open.. I don't believe it's the cookie timing out, it's the connection.. You may be able to get around this by removing your optimization profiles and setting up a TCP profile and changing your "idle-timeout", default 300 seconds..
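
To make the idle-timeout reasoning concrete, here is a small sketch. It rests on my reading of the suggestion: with Action on Service Down set to None, an established connection keeps being served by its member, but once the connection sits idle longer than the TCP profile's idle timeout it is closed, and the next request arrives on a new connection that gets a fresh persistence lookup. The function name and the strict "greater than" comparison are assumptions made for illustration only.

```python
IDLE_TIMEOUT = 300  # seconds; the default quoted above


def first_request_on_new_connection(request_times, idle_timeout=IDLE_TIMEOUT):
    """Return the index of the first request that follows an idle gap longer
    than idle_timeout, or None if the connection never idles out.

    request_times is an ascending list of request timestamps in seconds.
    """
    for i in range(1, len(request_times)):
        gap = request_times[i] - request_times[i - 1]
        if gap > idle_timeout:
            return i
    return None


times = [0, 120, 200, 650]  # the user pauses for 450 seconds before the last request
print(first_request_on_new_connection(times))                    # 3
print(first_request_on_new_connection(times, idle_timeout=600))  # None
```

With the default of 300 seconds, the fourth request needs a new connection, so if the member happens to be marked down at that moment the client is load balanced elsewhere. With a longer idle timeout, the same request would still ride the existing connection.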