Forum Discussion
Monitoring Node State (Enabled/Disabled)
First posting, so if I'm missing any standard info please advise.
Environment & Requirements
Single LTM
Two "stacks" of servers - each stack has two physical windows hosts
Server 1 - One "application" with two processes listening on different ports. The port 80 process hosts two different sites (distinguished by host header), so this host has a total of three virtual servers associated with it.
Server 2 - Two "applications", each with two processes listening on different ports, so this host has a total of four virtual servers associated with it.
The processes do not communicate with each other directly, not even the ones on the same host - they communicate with the virtual server on the LTM.
If a single process becomes unavailable, the entire stack has to fail over to the second stack (a second pair of hosts with the same configuration).
The nodes in stack1 cannot talk to the nodes in stack2 - customer requirement
One of the seven services cannot be "monitored to death" - in fact the vendor states the service can only receive one monitor event every 30 seconds.
Two of the services cannot be in the same pool - vendor requirement
Solution (to date)
We created:
Seven virtual servers (the host-header-based sites each got their own IPs for non-technical reasons)
Seven pools - one per actual "process" (six) plus a seventh to manage the stack failover
Six IP-specific monitors to check the availability of the processes in stack1. All six monitors are assigned to a single pool which has the Availability Requirement set to All.
Seven iRules (example below) to force pool selection to stack2 if the "global pool" is down
when CLIENT_ACCEPTED {
    # qa-pool-stack1 is the pool that carries all six monitors
    if { [active_members qa-pool-stack1] < 1 } {
        # The pool name differs per iRule - it is the pool serving the second stack for that service
        pool qa-s2-ae2-p80
    }
}
This works exactly as designed. If any of the processes are shut down, the "stack pool" goes offline and all traffic is rerouted to stack2.
Problem
The only catch we have run into so far is that if one of the nodes is disabled in the LTM but the services are not actually shut down, the LTM doesn't fail over to stack2. This makes sense: none of the monitors have failed, because they are not tied to the node's enabled/disabled state in the LTM - they check the actual availability of the service on the host.
We've tried using LB::status, but it is not available in the CLIENT_ACCEPTED scope. We implemented it in the HTTP_REQUEST scope instead and used it to mark a pool member down when a node gets disabled (which works), but we haven't sorted out how to get the member marked back up when the node is re-enabled. We didn't really want to evaluate both nodes, twice, for every connection across seven virtual servers.
when HTTP_REQUEST {
    # If the node has been disabled in the LTM, force the corresponding pool member down
    if { [LB::status node 10.0.1.2] ne "up" } {
        LB::down pool qa-s1-ae1-p80 member 10.0.1.2 80
    }
}
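For clarity, the kind of per-request, both-directions check we were trying to avoid would look roughly like the sketch below (treating LB::up as the counterpart of LB::down, with the same pool/member arguments, is our assumption - we haven't verified it, and the address/pool names are just the earlier examples):
when HTTP_REQUEST {
    # Assumed sketch: mark the member down while the node is disabled,
    # and back up once it is re-enabled - repeated per node, per request,
    # in each of the seven virtual servers, which is the overhead we want to avoid
    if { [LB::status node 10.0.1.2] ne "up" } {
        LB::down pool qa-s1-ae1-p80 member 10.0.1.2 80
    } else {
        LB::up pool qa-s1-ae1-p80 member 10.0.1.2 80
    }
}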
What we were really hoping to find is a "monitor" that we can add to the stack pool that goes down when a node is set to disabled and comes back up when it is enabled.
Anyone know if there is something like that out there - or another way to achieve this goal of monitoring node status?
Thanks
Adam
4 Replies
- Arie
Altostratus
A maximum of one HTTP request every 30 seconds for monitoring, per the vendor? That's a new one...
There are some problems with this. Assuming that the application really will fail if requests are made more frequently, what happens if people actually start using it? Secondly, a node is generally only marked down after three consecutive failed checks, so in your case that would mean it won't be marked down for at least 90 seconds after it has failed.
Given this limitation, it would seem that the only way to monitor this app without bringing it down (per the vendor's specs) is to use passive monitoring, i.e. use an iRule to verify that the server's response (to a real client request) is valid and mark nodes down via the iRule.
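As a rough illustration only (assuming an HTTP pool, and assuming a 5xx reply is what counts as an invalid response for this app - adjust the check to whatever "valid" means here), a passive check could be as simple as:
when HTTP_RESPONSE {
    # Passive monitoring sketch: judge health from real client traffic instead of active probes.
    # LB::down with no arguments marks the pool member selected for this connection down.
    if { [HTTP::status] >= 500 } {
        LB::down
    }
}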
- What_Lies_Bene1
Cirrostratus
Is persistence involved here? If so, I'd assume failover and failback aren't going to be too effective, although OneConnect might help.
- What_Lies_Bene1
Cirrostratus
Shouldn't this: ' if {[active_members qa-pool-stack1] < 1}' be ' if {[active_members qa-pool-stack1] < 7}'?
- tarac_37545
Nimbostratus
Thanks for the replies! I thought my browser had crashed and the post hadn't actually worked - haven't figured out how to find "my posts" yet.
I'll look into passive monitoring. Yeah - that monitoring requirement seemed like integrator nonsense to me, but I don't get to make the call on it.
Persistence is not involved on most of the virtual servers but there is a cookie profile configured on two of them.
I thought active_members checked the number of nodes in the pool not the number of monitors - is that inaccurate?
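For anyone checking the same thing later, a quick way to see what active_members actually returns for the stack pool is a throwaway logging rule like this (just a debugging sketch, not something we have in place):
when CLIENT_ACCEPTED {
    # As we understand it, active_members returns the count of pool members
    # currently considered available, not the number of monitors on the pool
    log local0. "qa-pool-stack1 active members: [active_members qa-pool-stack1]"
}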