Forum Discussion

dubdub
Apr 25, 2011

iRule to work around Windows patching schedule

I have five Windows servers in a .NET cluster for my production environment. I'm currently balancing them round-robin, and everything is working fine. However, we use a SQL database server for our .NET session state store (necessary in a farm setup without sticky sessions). This database server has occasional outages that cause problems for the 400+ applications in my environment: many of them rely on session state, and users are badly affected when it is unavailable, even briefly.



For one particular application on this cluster, which has its own VIP, I would like to modify the configuration to use cookie persistence. Then I can have the application change its session state model to in-process, and eliminate all dependencies on the SQL server, so database outages don't affect the app.



However, my five servers go through planned reboots every month for Windows patches. If I use persistent sessions for this application, some percentage of the users will get kicked out as each server is rebooted. I'd like to mitigate that by preventing connections to a given server for some period of time prior to the Windows update/reboot.



So I was thinking of writing an iRule to check the current day and time, and compare it to a data group list where I'd keep the five server names + their patch day/time. If the request is within a certain timeframe of a planned reboot, the iRule would direct the request to a pool that contained servers not on that patch schedule. Seemed pretty straightforward.
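To make the idea concrete, here is a minimal sketch of the lookup logic in Python rather than as an actual iRule. The server names, the reboot times, and the function name are all made up for illustration; on the BIG-IP the table would presumably live in a data group and the result would decide which pool the request goes to.

```python
from datetime import datetime, timedelta

# Hypothetical patch table: server name -> next planned reboot.
# These names and times are illustrative, not taken from a real environment.
NEXT_REBOOT = {
    "web1": datetime(2011, 5, 10, 2, 0),
    "web2": datetime(2011, 5, 11, 2, 0),
    "web3": datetime(2011, 5, 12, 2, 0),
    "web4": datetime(2011, 5, 13, 2, 0),
    "web5": datetime(2011, 5, 14, 2, 0),
}


def servers_accepting_new_sessions(now, schedule, drain):
    """Return the servers that are NOT within `drain` of their planned reboot."""
    eligible = []
    for name, reboot in sorted(schedule.items()):
        # A server is draining from (reboot - drain) up to the reboot itself.
        in_drain_window = timedelta(0) <= reboot - now <= drain
        if not in_drain_window:
            eligible.append(name)
    return eligible


# Six hours before web1's reboot, with a 12-hour drain period:
print(servers_accepting_new_sessions(
    datetime(2011, 5, 9, 20, 0), NEXT_REBOOT, timedelta(hours=12)))
# prints ['web2', 'web3', 'web4', 'web5']
```

Once a server's reboot time has passed, it drops out of its drain window and becomes eligible again; the sketch assumes the table is updated with the next month's reboot time after each patch cycle.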



A slight wrinkle has arisen, in that the session timeout for this application is 12 hours. If I start trying to prevent connections to a server within 12 hours of a reboot, I need to spread out the reboots over practically an entire week, which is manageable but ugly. If I don't spread them out, too many servers will have their outage windows too close together and I run the risk of eventually having no pool members available.
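The overlap concern can be checked with a little arithmetic. The following is a rough Python sketch, with a made-up function name and made-up reboot times, that counts how many servers would be inside their 12-hour drain window at the same moment for a given schedule.

```python
from datetime import datetime, timedelta


def max_draining_at_once(reboots, drain):
    """Largest number of servers inside their drain window at the same time.

    `reboots` is a non-empty list of planned reboot datetimes, one per server.
    """
    windows = [(reboot - drain, reboot) for reboot in reboots]
    # The overlap of closed intervals peaks at the start of one of them,
    # so probing every window start is enough.
    return max(
        sum(1 for start, end in windows if start <= probe <= end)
        for probe, _ in windows
    )


drain = timedelta(hours=12)

# Hypothetical schedule A: one server per night at 02:00 on five nights.
nightly = [datetime(2011, 5, 10 + i, 2, 0) for i in range(5)]
# Hypothetical schedule B: all five in one night, an hour apart.
same_night = [datetime(2011, 5, 10, 2 + i, 0) for i in range(5)]

print(max_draining_at_once(nightly, drain))     # prints 1
print(max_draining_at_once(same_night, drain))  # prints 5
```

With one reboot per night, only one of the five servers is ever draining. With all five packed into one night, there is a stretch where every server is draining and no pool member would accept a new session.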



An additional complication is that the main page of this application (after logging in) refreshes itself every 30 seconds, which invokes a web service call to get a fresh data set. So, in essence, the session will never time out.



Any ideas on how I can work around the patch schedules without impacting user sessions would be much appreciated!







4 Replies

  • Given the scenario, I think your only option is to spread the reboots over a longer period of time so that you can disable each server ahead of its reboot and keep new connections off it (though with the self-refreshing application you still run the risk of impacting your customers).



    I think the application owner's design has lowered your number of available options, so they are going to have to help you mitigate the risk of customer impact.



    My only suggestion is to ensure that the "Action On Service Down" option on the pool is set to "Reselect", so that when you do reboot the server the customers are sent to a different server rather than just dropped.


  • Would it be acceptable to display a "Please try again later" page to users who have a valid persistence cookie but whose chosen pool member is down?



    I agree with Michael; I think setting "Action On Service Down" to "Reselect" is probably the best option if you want to avoid annoying your user population with splash screens or "page cannot be displayed" errors. But this application's design is pretty much boxing you into a very difficult position.
  • dubdub
    Hi Michael and Joel,



    Even if I reselect on service down, though, that new server isn't going to have the in-memory session state that the server being patched will have - so from the user's perspective, they will have to log out/log back in either way, right?





  • That would be correct, based on my understanding of how your application is set up. Basically, the existing session will be invalidated and the user will be required to log back in. This is probably a better scenario than just showing the user a blank screen or an error splash screen, though!