Forum Discussion
Load balancing decision
Hi Guys,
I've been doing some research on load balancing methods and algorithms on DevCentral, and so far I haven't been able to find an answer to this question:
During the load balancing operation, when does the LB decide that a server is down? How many health check attempts does the LB make before deciding that the server in line is down? Is this configurable?
Thanks, Fabian
7 Replies
- Vitaliy_Savrans
Nacreous
Hi,
The LB decides that a server is down based on its health monitor. The number of health checks depends on the following health monitor parameters:
Interval 5
Timeout 16
With these settings there will be 3 health checks before the server is marked down.
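The arithmetic behind "3 health checks" can be sketched as follows. This is only an illustration of the reasoning in this reply, not F5's actual implementation; the function name `checks_before_down` is made up for the example, and it assumes whole-second values and that a check only counts if it is sent strictly before the timeout expires.

```python
def checks_before_down(interval: int, timeout: int) -> int:
    """Count the checks sent after the last good response but before the timeout expires.

    Hypothetical helper for illustration; assumes integer seconds.
    """
    if interval <= 0 or timeout <= 0:
        raise ValueError("interval and timeout must be positive")
    # Checks go out at interval, 2*interval, ... seconds after the last response.
    # Only those sent strictly before `timeout` can go unanswered before the
    # server is marked down.
    return (timeout - 1) // interval


print(checks_before_down(5, 16))  # 3 (checks sent at 5, 10 and 15 seconds)
```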
- Dicky_Moe_13167
Nimbostratus
So the LB keeps different timeout records for each healthcheck attempt?
- giltjr
Nimbostratus
I don't think so. I think it just keeps track of the amount of time since the last response. Using the time values in Vitaliy Savranskiy's example, it would look something like this:
- Time 0 - send health check 1
- Time 0 - received response to check 1. Save last response as "0" seconds ago
- Time 1 - is last response time less than 16 seconds ago: Yes, leave as up.
- Time 5 - send health check 2
- Time 6 - is last response time less than 16 seconds ago: Yes, leave as up.
- Time 10 - send health check 3
- Time 11 - is last response time less than 16 seconds ago: Yes, leave as up.
- Time 15 - send health check 4
- Time 16 - is last response time less than 16 seconds ago: No, mark as down.
So as long as it has received a response within the last 16 seconds, it will leave the node marked as up.
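The timeline above can be replayed with a small simulation. It is a sketch of the "time since last response" model described in this reply, which is itself an assumption about how the F5 behaves; the function `simulate` and its parameters are hypothetical, and time advances in whole seconds.

```python
def simulate(interval, timeout, response_times, horizon):
    """Replay the timeline second by second and return the list of events.

    response_times: set of seconds at which a response arrives (assumed input).
    The node is marked down once `timeout` seconds pass with no response.
    """
    events = []
    last_response = None
    for now in range(horizon + 1):
        if now % interval == 0:
            events.append((now, "send check"))
        if now in response_times:
            last_response = now
            events.append((now, "response"))
        if last_response is not None and now - last_response >= timeout:
            events.append((now, "mark down"))
            break
    return events


# Only the first check (at time 0) is answered, as in the timeline above.
for second, event in simulate(interval=5, timeout=16, response_times={0}, horizon=30):
    print(second, event)
```

With these inputs the last event is "mark down" at second 16, after checks were sent at 0, 5, 10 and 15.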
- Dicky_Moe_13167
Nimbostratus
Could be. But, is it an assumption or do you know that for a fact?
- giltjr
Nimbostratus
It is an assumption based on a couple of decades of writing code.
Think about it: do you really care which check you get a response from, or do you just care whether it has been more or less than 16 seconds since your last response? You only care whether it has been more or less than 16 seconds. You don't care whether 1, 2, or all 3 checks have received a response. You just care that you have not received any response during the last 16 seconds.
If they are really keeping track of each outstanding check, then IMHO they are wasting resources.
Offhand, without giving it too much thought, what I would do is start a 16-second countdown timer and reset it back to 16 every time I get a response. If the timer ever gets to zero, then change the status to down.
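A minimal sketch of that countdown idea might look like the following. The class name `CountdownMonitor` and its methods are invented for illustration; whether a later response brings the node back up is not stated in the post, so that behavior is an assumption here.

```python
class CountdownMonitor:
    """Single countdown timer that is reset whenever any response arrives."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.remaining = timeout
        self.up = True

    def on_response(self):
        # Any response resets the timer back to the full timeout.
        # Assumption: a response also marks a down node as up again.
        self.remaining = self.timeout
        self.up = True

    def tick(self, seconds=1):
        # Advance the clock; when the timer reaches zero the node goes down.
        self.remaining = max(0, self.remaining - seconds)
        if self.remaining == 0:
            self.up = False
        return self.up


monitor = CountdownMonitor(timeout=16)
for _ in range(10):
    monitor.tick()
monitor.on_response()       # response at second 10 resets the timer to 16
for _ in range(15):
    monitor.tick()
print(monitor.up)           # True: only 15 seconds since the last response
print(monitor.tick())       # False: 16 seconds without a response
```

Note that nothing here tracks individual checks; only the time since the last response matters, which is the point being made above.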
- Dicky_Moe_13167
Nimbostratus
Most likely you are right. I just thought that since the F5 already keeps track of every connection in the connection table, it also treated the health checks as just another connection.
It does seem a waste of resources though.
Thanks!
- giltjr
Nimbostratus
Well, in a sense they do, but maybe not in the way you are thinking of. Remember that all monitors are an "ask" and "answer" type of function, so somewhere in the F5 there is something waiting for the "answer".
If the monitor is a ping (ICMP), that something knows that the request has been sent and is waiting for an answer. There is a timeout for that specific answer. I'm not sure what the timeout on each individual ICMP request is.
If the monitor is an HTTP request, then you have multiple levels of "status" that the F5 code could be checking. You have the TCP connection status: the F5 code could be checking to see whether the 3-way handshake has completed. Then you have the actual HTTP GET/RESPONSE. Again, at some level something in the F5 knows that the GET has been issued and is waiting for the response.
So something knows the status of each monitor check; it is just a question of how the F5 is using that information.
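To make the "multiple levels of status" idea concrete, here is a rough sketch of an HTTP check that reports which level failed. It uses Python's standard `http.client` and says nothing about how the F5 monitor is actually written; the names `CheckResult` and `http_check`, the 2-second per-check timeout, and treating only status 200 as healthy are all assumptions for the example.

```python
import http.client
from enum import Enum


class CheckResult(Enum):
    TCP_FAILED = "tcp connection failed"
    NO_RESPONSE = "connected, but no HTTP response"
    BAD_STATUS = "HTTP response with unexpected status"
    UP = "up"


def http_check(host, port, path="/", timeout=2.0):
    """Run one HTTP health check and report which level it reached."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        try:
            conn.connect()  # level 1: TCP 3-way handshake
        except OSError:
            return CheckResult.TCP_FAILED
        try:
            conn.request("GET", path)  # level 2: HTTP GET and its response
            response = conn.getresponse()
            response.read()
        except (OSError, http.client.HTTPException):
            return CheckResult.NO_RESPONSE
        # Assumption: only a 200 status counts as healthy.
        return CheckResult.UP if response.status == 200 else CheckResult.BAD_STATUS
    finally:
        conn.close()
```

A caller could feed `CheckResult.UP` into a "last response" timer like the one discussed earlier in the thread and ignore the finer-grained results, or use them for logging.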