Forum Discussion

aj_102030
Nimbostratus
Feb 08, 2013

Health monitor fails for same pool members in multiple pools

Good Day

I am new to DevCentral and hope my topic is suited to this forum.

We currently run an LTM v11.1.0 cluster. We had a need to run the same pool members in multiple pools. These pool members are a mixture of production pool members and HA pool members; the HA members are started up only for testing or in a recovery event. What we have found is that when these HA pool members have been down for a while and we enable them, the health monitor fails, and on some of them even the node ICMP check fails. The workaround I use is to change the default node monitor from ICMP to None and then back to ICMP, after which the node comes up (see the tmsh sketch below). This does not always work, and when it doesn't I have to delete the node and the pool member out of all the pools that reference them and re-add everything from scratch. I can't recall having this issue on version 9.4.8, so I am thinking it must be related to version 11.
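
For reference, the toggle I described looks roughly like this in tmsh (the node address below is just a placeholder for one of our HA nodes):

    # remove the default ICMP monitor from the node
    tmsh modify ltm node 10.0.0.10 monitor none
    # re-apply it; the node usually comes back up after this
    tmsh modify ltm node 10.0.0.10 monitor icmp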

Any assistance will be greatly appreciated.

Regards

A

6 Replies

  • Are the HA nodes different from the production ones in any way? I take it they are not running all the time? How long have you waited after enabling the HA pool members?

    I'd suspect an ARP issue or something similar here. Have you ever tried clearing the F5's TMM/LTM ARP cache (see the tmsh sketch below)? Is Reciprocal Update enabled?

    Keep in mind that even when disabled, the device will continue to monitor pool members, and if they are unavailable for a long time, retries may max out and so on.
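
    A minimal sketch of the ARP cache commands in tmsh (nothing here is specific to your config):

        # list the current dynamic ARP entries
        tmsh show net arp

        # flush all dynamic ARP entries
        tmsh delete net arp all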
  • Just out of curiosity, why did you decide to use the same pool members in different pools?

    If you also set a monitor at the pool level, then you are actually monitoring the same servers several times, i.e. a monitor from every single pool that contains the reused pool members, plus the node-level check. This doesn't sound good (see the sketch below).

    Why not reuse the same pool with different virtual servers?
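
    For illustration only (the pool names and address are made up), a config like this probes the same server once per pool monitor, plus once via the node check:

        # the same member reused in two pools, each pool with its own monitor
        tmsh create ltm pool prod_pool members add { 10.0.0.10:80 } monitor http
        tmsh create ltm pool ha_pool members add { 10.0.0.10:80 } monitor http

        # and the node-level check on top of that
        tmsh modify ltm node 10.0.0.10 monitor icmp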

    Regards!

  • Hi Guys

    The HA nodes and the production nodes are identical; the production VMs will be cloned and moved to the DR site.

    Even though the same pool members are used in multiple pools, the pools are not identical: one pool has both production nodes and HA nodes, whereas the other pool has only the HA nodes. The reason is that if the front-end production nodes come under strain, we bring up the HA nodes. These additional front ends could strain the SQL back end; when that happens, we break the SQL replication between the SQL clusters and bring up the second pool, which consists of just the HA nodes, and they then talk to a different SQL cluster to alleviate the load.

    Reciprocal Update is enabled.

    For now I have disabled the node checks for these HA nodes. There is a scheduled DR run on 23rd Feb, and then I will be able to see whether this makes a difference.

    Regards

    A
  • Particularly knowing VMs are involved, I really would try clearing the device ARP cache as a first measure when bringing up the HA servers.
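
    A rough sketch of what that could look like when bringing the HA members up (ha_pool and the address are placeholders):

        # flush the dynamic ARP entries first
        tmsh delete net arp all

        # then bring the HA member back into service
        tmsh modify ltm pool ha_pool members modify { 10.0.0.10:80 { state user-up session user-enabled } }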