Forum Discussion
F5 BIG-IP cluster active/standby in Azure, failover very slow
Hello,
I'm contacting you because I need to configure an F5 BIG-IP cluster in active/standby mode in Azure, and I'm encountering a problem with failover.
My infrastructure and part of the configuration looks like this:
With the mentioned iRules, the failover goes fine.
My problem is that it's dramatically slow (between 30 seconds and 3 minutes for the ALB to realize the failover).
Do you know a way of minimizing this delay?
Thanks in advance for your help.
5 Replies
HA works completely differently in cloud environments because L2 networking is abstracted away. Instead, I would just run a few active BIG-IP instances and load balance via DNS. More specifically, I would use Terraform, Ansible, or some other automation framework to auto-build my cloud BIG-IP instances from a master configuration managed on GitHub. Then I would use cloud-native automation tools to manage the DNS entries. Sadly, there is no way to minimize HA failover delay in the cloud; you are trying to fit a square peg into a triangle hole. F5 HA was designed for traditional, locally connected L2 network designs. You don't have that in the cloud, so trying to replicate similar functionality is really a hack. Go active with DNS-based failover.
A better way would be to look into F5 Distributed Cloud for HTTP/S workloads.
- Alexandre_Demasi
Nimbostratus
Thank you for your answers.
I'll keep looking, starting with the links and ideas you mentioned. Have a nice day.
The Azure API is much slower than, for example, the AWS API. F5 in the cloud uses CFE (Cloud Failover Extension): when a failover happens, the BIG-IP has to call the cloud API to change the elastic IP address attachment, because cloud environments do not support GARP. This API call is where failover times of more than a minute come from.
For ASM and LTM deployments, use the Azure Load Balancer (ALB) for failover. This is what the old F5 Azure ARM templates did, and you can now modify the autoscale group if needed:
Lightboard Lessons: BIG-IP Deployments in Azure Cl... - DevCentral (f5.com)
For APM VPN deployments, I suggest using F5 GTM and standalone APM devices:
- MichaelOLeary
Employee
I know it's 2 years late but I just came across your question. You have an excellent diagram and thank you for providing it. I also like this approach and I have written about it on DevCentral in the past in this article: https://community.f5.com/kb/technicalarticles/transparent-load-balancing-in-azure-part-2/332419
I have helped customers set this up. You have explained it well. But your question is: why does it take up to 3 mins for the ALB to realize failover?
If your iRule is working properly, the health probes from ALB will start failing immediately when a device becomes Standby for traffic-group-1. So I think we can say one of the following things could be your problem:
- Your ALB/iRule is not 100% correct. Are you doing TCP health checks at ALB but expecting an HTTP request in your iRule? I've seen customers set that up accidentally.
- Your ALB health probe interval is too long.
- Your second iRule is a simple HTTP::respond 200. In my example, I actually respond with content like "I am healthy!". That may not matter, but that's what I do.
- Could there be a port mismatch between your 2 VIPs? I have seen a customer set up the first VIP listening on 8001 but then the "probing-vip" is listening on a different port, so the forwarding iRule is not getting traffic on the right port.
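To put rough numbers on the probe interval point above, here is a minimal sketch of how long a load balancer might take to mark a backend as down. It assumes a simple model: probes fire every `interval_s` seconds and the backend is marked unhealthy after `unhealthy_threshold` consecutive failures. The function name, the parameter names, and the sample values are all hypothetical and are not taken from the original poster's configuration or from Azure's documentation, so check your actual probe settings before relying on it.

```python
def detection_window(interval_s: float,
                     unhealthy_threshold: int,
                     probe_timeout_s: float = 0.0) -> tuple[float, float]:
    """Return (best_case, worst_case) seconds until a backend is marked down.

    Simplified model (an assumption, not Azure's documented behaviour):
    - a probe is sent every `interval_s` seconds;
    - the backend is marked down after `unhealthy_threshold` consecutive
      failed probes;
    - each failed probe may take up to `probe_timeout_s` to be declared failed.
    """
    if interval_s <= 0 or unhealthy_threshold < 1 or probe_timeout_s < 0:
        raise ValueError("interval must be > 0, threshold >= 1, timeout >= 0")

    # Best case: the device goes Standby just before a probe is sent,
    # so the first failure is observed almost immediately.
    best = (unhealthy_threshold - 1) * interval_s + probe_timeout_s

    # Worst case: the device goes Standby just after a successful probe,
    # so a full interval passes before the first failure is observed.
    worst = unhealthy_threshold * interval_s + probe_timeout_s
    return best, worst


if __name__ == "__main__":
    # Hypothetical (interval, threshold) pairs, for illustration only.
    for interval, threshold in [(5, 2), (15, 2), (30, 3)]:
        best, worst = detection_window(interval, threshold)
        print(f"interval={interval}s threshold={threshold}: "
              f"down after {best:.0f}s to {worst:.0f}s")
```

If the delay you observe is far longer than the worst case this model gives for your probe settings, the probe interval alone probably does not explain it, and one of the other causes in the list is a more likely suspect.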
Those are my best guesses. Because this setup requires some advanced knowledge, I only recommend it to F5 customers that I know have the confidence in their entire F5/Azure team to support it. Sometimes, as long as they are using SNAT AutoMap and not requiring true src IP at their application, I tell them to use architecture number 4 from this article and run Active/Active: https://community.f5.com/kb/technicalarticles/four-activeactive-load-balancing-examples-with-f5-big-ip-and-azure-load-balancer/327959
Message me over DevCentral or reach out some other way if you ever want to catch up about this. That goes for any customer reading this :)
Mike.