Forum Discussion

hui_37443
Nimbostratus
Nov 27, 2012

what happens if LB_FAILED is not handled

I have one VIP without an iRule that handles LB_FAILED, and another VIP with one. Both have similar pool settings, and the same web application sits behind both VIPs, which makes them comparable.

 

The VIP with the LB_FAILED handler reports LB_FAILED every minute, suggesting something goes wrong when it tries to open a TCP connection to the backend web servers. The VIP without the handler doesn't seem to have any ongoing issues. That makes me wonder: if there is no iRule to deal with the LB_FAILED event, how does the F5 handle the situation?

 

Thanks,

 

6 Replies

  • Does the client automatically retry by itself, or is "Reselect Tries" configured?

     

     

    However, beginning in BIG-IP version 10.x it is possible to cause a reselection by using the Reselect Tries advanced pool feature. This feature is equivalent to the LB::reselect iRule command. Once a connection attempt to a pool member is considered failed (equivalent to the LB_FAILED iRule event), the BIG-IP system selects a new pool member. Should that pool member also fail, the system repeats the reselection until it reaches the limit configured for this option. Only at this point is the client connection reset.

    sol10640: Pool member reselection options

     

    http://support.f5.com/kb/en-us/solutions/public/10000/600/sol10640.html

     

     

    just my 2 cents.
  • My opinion is that if Reselect Tries is at the default value of 0, then the client connection is reset using a RST. I would also expect the client to attempt to reconnect at that point. However, this is no doubt introducing some delay and you should really be trying to establish why Pool Member selection is failing. Perhaps you could add some further logging to the rule that contains the LB_FAILED event to ascertain the cause, or at least add the LB::reselect command?

     

     

    If your servers are reaching their connection limits, OneConnect may also help here.
  • Thanks for the quick responses. We have worked out that the web app regularly fails to respond to TCP SYN. We are investigating why that happens, which is a separate story.

     

     

    The clients are common web browsers. Therefore, I guess they don't do auto reconnect?
  • I wouldn't expect this to be a browser function. If this happens on an initial connection then perhaps you'd see an error but if this occurs midway through a session I'd expect the host TCP/IP stack to automatically reconnect if there's still data to send.

     

     

    Again, you might find OneConnect useful to help with the server issue and using LB::reselect in the iRule should minimise the issue.
  • Actually we do have OneConnect configured and I am not sure whether it contributes to the trouble or not. We are not using LB::reselect, but instead switch to another pool. We then found that pool switching happens far too frequently, which can't be right.
  • By the sounds of it, if the server is the issue, that's what needs solving. Alternatively, before then, use LB::reselect instead of the other pool and things should improve.
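
For reference, the two suggestions made in the replies above (log more detail in the LB_FAILED handler, and reselect within the same pool rather than switching pools) could be sketched as an iRule along these lines. This is a minimal sketch, not the original poster's rule: the retry limit of 3 and the variable name lb_retries are assumptions chosen for illustration, and the log format is arbitrary.

```tcl
when CLIENT_ACCEPTED {
    # Per-connection retry counter (hypothetical variable name).
    set lb_retries 0
}

when LB_FAILED {
    # Record which pool member failed so the cause can be traced in the LTM log.
    log local0. "LB_FAILED: client [IP::client_addr]:[TCP::client_port], pool [LB::server pool], member [LB::server addr]:[LB::server port], retries so far $lb_retries"

    # Try another member of the same pool, up to an assumed limit of 3 attempts.
    if { $lb_retries < 3 } {
        incr lb_retries
        LB::reselect
    }
}
```

The counter here is set once per client connection, so with OneConnect in place it is not reset for each request; whether that matters depends on the traffic. Where no logging is needed, the pool's Reselect Tries setting quoted in the first reply covers the same retry behaviour without an iRule.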