LTM: Action on Service Down

BIG-IP has long been known for its extensive application monitoring capabilities, and LTM has expanded both the types of monitoring activities possible and the possible reactions to varying monitor states. In this article, I'll explain the options for the "Action on Service Down" setting.

Overview

LTM's health monitoring facility lets you configure health monitors specific to your application. These monitors manage the availability of pool members based on whether they respond as expected to a specific request. The "Action on Service Down" setting controls LTM's connection management behavior when a monitor transitions a pool member to a DOWN state after that member has already been selected as the load balancing target for a connection.

In most cases, pool members that are marked DOWN were in the process of servicing active connections.  For example, let's say there are 3 connections to a pool member when a configured application health monitor marks it DOWN:

  • Connection #1 is brand new, still awaiting the response to the handshake SYN packet;
  • Connection #2 is open but idle;
  • Connection #3 is open and active.

In all cases, the configured "Action on Service Down" option will influence how LTM manages the connection, and the application logic will determine how the client and/or server respond to the various options.

Choosing and configuring the best option for your application

"Action on Service Down" is a Pool setting, and can be found in the GUI (Local Traffic -> Pools).  Choose "Advanced" in the Configuration dropdown to reveal the "Action on Service Down" setting. 

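The same setting can also be changed without the GUI. The Python sketch below only builds the request (it does not send it), and it rests on assumptions that are not from this article: that the iControl REST property is named `serviceDownAction`, mirroring the tmsh pool property `service-down-action`, and that the GUI option "Reject" is spelled `reset` there. The host and pool names are placeholders. Check the reference for your BIG-IP version before relying on it.

```python
import json

# Assumed mapping from GUI labels to tmsh / iControl REST values.
# Note that the GUI's "Reject" is assumed to be "reset" in the API.
GUI_TO_API = {
    "None": "none",
    "Reject": "reset",
    "Drop": "drop",
    "Reselect": "reselect",
}


def build_pool_patch(host, pool, gui_option, partition="Common"):
    """Return (method, url, body) for a request that would set the
    pool's Action on Service Down. Nothing is sent over the network."""
    value = GUI_TO_API[gui_option]  # KeyError on an unknown option
    url = f"https://{host}/mgmt/tm/ltm/pool/~{partition}~{pool}"
    body = json.dumps({"serviceDownAction": value})
    return "PATCH", url, body


# Hypothetical host and pool names, for illustration only.
method, url, body = build_pool_patch("bigip.example.com", "app_pool", "Reject")
print(method, url, body)
```

Building the request separately from sending it makes it easy to review exactly what would change before pointing any HTTP client at a production device.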
The "best" option for your application depends entirely on that application's client and server implementations, and some experimentation may be required to find the optimal setting. The possible options are None, Reject, Drop, and Reselect. Here's a description of the mechanics of each option, and some scenarios in which each might be useful:

Reject
After the pool member transitions to a DOWN state, LTM will reap any active connection as soon as it receives traffic on that connection: it sends a RST to both client and server and removes the connection from the LTM connection table.

Use "Reject" when you want LTM to explicitly close both sides of the connection when the server goes DOWN. "Reject" is the most commonly used option for the service down setting. It often results in the quickest recovery of a failed connection, since it forces the client side of the connection to close, in many cases triggering an automatic re-connect and re-send of the request in process. This option replicates BIG-IP 4.x behaviour when the "svcdown_reset" db variable is set to "enabled".

Drop
LTM will silently drop any new client data sent on established connections.  The connection remains in the connection table until 1) an LTM idle timer related to the connection times out or 2) either side closes the connection.

Use "Drop" when sending a RST to the client is not desirable. This method does not immediately reflect the server's state change to the client, and depends on the client to close or otherwise manage the connection.

Reselect
LTM will choose another pool member if one is available and rebind the client connection to a new serverside flow.  (Existing OneConnect flows may be re-used if available, or a new server connection will be established if necessary.)

Use "Reselect" when the client can continue with a new server seamlessly.  The request in play at the time of state transition may be lost, so the client will need to be able to recover gracefully to use this option successfully.

None (default)
LTM will continue to send data on established connections as long as the client is sending and the server is responding. Connection management, recovery, and cleanup happen via standard TCP mechanics for both clientside and serverside flows.

Use "None" if you don't want LTM to intervene in managing either side of the connection. This is useful if your servers may not be accepting new connections but should be allowed to continue servicing existing connections when marked DOWN. It also supports custom monitoring designed for connection bleeding and other non-standard state management schemes.
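
The four options can be compared side by side with a toy model. The Python sketch below is not LTM code; it only encodes the descriptions above for the moment client data arrives on an established connection whose pool member has been marked DOWN. The names (`Action`, `Outcome`, `on_client_data`) are made up for illustration. Picking the first available member stands in for whatever load balancing method the pool uses, and the case where Reselect finds no available member raises an error because this article does not describe it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Action(Enum):
    NONE = "None"
    REJECT = "Reject"
    DROP = "Drop"
    RESELECT = "Reselect"


@dataclass
class Outcome:
    forwarded_to: Optional[str]  # member that receives the client data, if any
    rst_sent: bool               # True when a RST goes to client and server
    in_conn_table: bool          # True when LTM keeps the connection entry


def on_client_data(action: Action, down_member: str,
                   up_members: List[str]) -> Outcome:
    """Model client data arriving after down_member was marked DOWN."""
    if action is Action.NONE:
        # LTM does not intervene; traffic keeps flowing to the same member.
        return Outcome(down_member, rst_sent=False, in_conn_table=True)
    if action is Action.REJECT:
        # RST to both sides, connection removed from the table.
        return Outcome(None, rst_sent=True, in_conn_table=False)
    if action is Action.DROP:
        # Data is silently dropped; the entry stays until an idle
        # timeout or until either side closes the connection.
        return Outcome(None, rst_sent=False, in_conn_table=True)
    if action is Action.RESELECT:
        if not up_members:
            # Not covered by the article, so the model refuses to guess.
            raise LookupError("no available pool member to reselect")
        # Simplification: take the first available member.
        return Outcome(up_members[0], rst_sent=False, in_conn_table=True)
    raise ValueError(f"unknown action: {action!r}")


# Hypothetical member addresses, for illustration only.
for action in Action:
    print(action.value, on_client_data(action, "10.0.0.1:80", ["10.0.0.2:80"]))
```

The model covers only data arriving on an already established connection. How the client and server react to each outcome is up to the application, which is why some experimentation is still needed.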

Related Topics

An iRules event and a couple of iRules commands can be used to replicate some of the functionality provided by the "Action on Service Down" setting, although the logic to do so is considerably more complex than using the built-in pool setting.

  • The LB_FAILED event is triggered if the pool member doesn't respond to the handshake request.
  • The LB::status command may be used to determine the current status of a pool member.
  • The LB::reselect command may be used to choose the next available pool member.
Updated Jan 26, 2023
Version 2.0
  • This and other F5 documents are all tailored towards TCP. It would be good to document how Action on Service Down behaves on UDP, and on UDP with an Immediate Connection release profile.
  • Hey Deb, thanks for the helpful article.

    I have a question: what happens if there are existing connections to the backend servers, Action on Service Down is set to None, and the members are disabled and forced down on the LTM?

    Technically the server is still up and able to respond to TCP connections, but as part of a migration the members have been forced down and disabled on the LTM. Would the connections still go to the server in that case?