F5 LTM manual resume problem

mwi
Altocumulus

Hi,

we have configured a pool with two nodes, one primary and one secondary; each node has its own monitor.

The primary node's monitor has the "manual resume" option set, so if this node isn't available, traffic goes to the secondary node but doesn't automatically switch back.
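For reference, a minimal tmsh sketch of such a setup; the monitor and pool names, member addresses, and port are hypothetical, and a plain TCP monitor stands in for whatever health check is actually in use:

# Primary node's monitor: manual resume keeps the member down until an admin re-enables it
tmsh create ltm monitor tcp mon_primary manual-resume enabled
# Secondary node's monitor: resumes automatically
tmsh create ltm monitor tcp mon_secondary
# Pool with a per-member monitor on each node
tmsh create ltm pool pool_db members add { 10.0.0.1:1433 { monitor mon_primary } 10.0.0.2:1433 { monitor mon_secondary } }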

It works fine under normal circumstances, but yesterday we had some network issues: the active load balancer lost its connection to the primary node, while the passive load balancer had no problems.

We had to take down the active load balancer, and the problem was that the previously passive load balancer routed traffic to the primary node again.

Is there any solution to sync the node status automatically?

 

Best regards.

 

10 REPLIES

M_Saeed
Cirrus

@mwi 
This is normal: with "manual resume" enabled, a member that has gone down must be manually re-enabled before it receives traffic again.

You need a different approach, "Priority Group Activation"; that fits your scenario.

K13525153: Configure a standby pool member to process traffic when primary pool member goes down
Refer to this article for the full procedure: https://my.f5.com/manage/s/article/K13525153
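For illustration, a minimal tmsh sketch of Priority Group Activation on a single pool; the pool name, addresses, and port are hypothetical. Members with the higher priority-group value receive traffic as long as at least min-active-members of them are up:

# Primary member in the higher priority group; the backup activates only when fewer than one primary member is available
tmsh modify ltm pool pool_db min-active-members 1 members modify { 10.0.0.1:1433 { priority-group 10 } 10.0.0.2:1433 { priority-group 5 } }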

@mwi 

So if I understand correctly, your issue is failback, not failover.

What I believe is happening is that when you bring the "active" node back online (so both are now green), traffic stays on the "backup" node. The behaviour here is that the active node should start taking on new traffic, but existing traffic will stay on your current live node, in this case the backup node, until the connection is broken or times out.
If you want the backup node to stop processing traffic, you need to find a clean way of closing the connections so they are re-established at the F5 from the client. This will depend on your application, and it might not be too graceful.

You could try putting the backup node into the disabled or forced-offline state to help with this process, but if the application doesn't open and close connections regularly, this may not help you. So you'll be back to working out a way to close the connections on the backup server so they are re-established on the active server.
It would be worth testing this with your application if you have that option.
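If you do go that route, one option is to clear the relevant entries from the BIG-IP connection table by hand. A minimal tmsh sketch, assuming a hypothetical backup member address of 10.0.0.2:

# List connections whose server-side destination is the backup member
tmsh show sys connection ss-server-addr 10.0.0.2
# Delete those entries so clients reconnect and are load-balanced again
tmsh delete sys connection ss-server-addr 10.0.0.2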

mwi
Altocumulus

@M_Saeed this is already configured and works. Sorry, I didn't mention it.

The problem is: if the active member of the HA cluster loses the connection to the primary node and the passive member does not, then after a failover the primary node is automatically active again.

I have drawn a little picture; I hope this makes my problem clearer.

 

@PSFletchTheTek I think that's not part of my problem.

@mwi 
Let's assume you separate the two nodes into two different pools, and we conditionally govern this via an iRule.


when CLIENT_ACCEPTED {
    # Remember the virtual server's default pool (the primary pool)
    set default_pool [LB::server pool]
    # If the primary pool has no active members, send this connection to the backup pool
    if { [active_members $default_pool] < 1 } {
        pool Service_backup_y_pool
    }
}
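For context, a minimal tmsh sketch of the objects around this iRule; the pool names, member addresses, virtual server vs_db, and rule name db_failover_rule are all hypothetical:

# Hypothetical primary (default) and backup pools
tmsh create ltm pool Service_default_x_pool members add { 10.0.0.1:1433 }
tmsh create ltm pool Service_backup_y_pool members add { 10.0.0.2:1433 }
# Make the primary pool the virtual server's default and attach the iRule
tmsh modify ltm virtual vs_db pool Service_default_x_pool rules { db_failover_rule }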

@M_Saeed good idea, but the primary node would still be active on the second HA cluster member.

Did you see the picture?

@mwi Can you take a moment to explain the situation in a different way? I am not understanding what you mean, even after looking at your diagram.

I can try.

We have an HA cluster with two F5 load balancers (LB1 and LB2). On this cluster there is a pool with two nodes (primary and secondary); the primary node has manual resume configured, the secondary does not.

This all works as intended.

The problem occurs if LB1 is the active member of the cluster and loses network connection to the primary node while LB2 does not: LB1 routes the traffic to the secondary node. If LB1 then goes down (for example, loses power) and LB2 becomes the active cluster member, it routes traffic to the primary node again, because it doesn't know that LB1 switched to the secondary node.

We need to know whether you are handling stateful or stateless traffic, e.g. is it database or HTTP traffic?

It's a database.

zamroni777
Altocumulus

You can add a gateway pool into the HA settings.
So if a unit's gateway pool status is down, it won't be the active node.

(screenshot attachment: zamroni777_0-1700203017408.png)
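A minimal tmsh sketch of that idea; the pool, HA group, and gateway address are hypothetical. The unit whose gateway pool is down loses HA score, so the peer takes over the traffic group:

# Pool that health-checks the upstream gateway
tmsh create ltm pool pool_gw monitor gateway_icmp members add { 10.0.0.254:0 }
# HA group whose score depends on that pool being up
tmsh create sys ha-group ha_group_gw pools add { pool_gw { weight 10 } } active-bonus 10
# Bind the HA group to the traffic group so failover follows the score
tmsh modify cm traffic-group traffic-group-1 ha-group ha_group_gw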