I have been going through some articles on implementing BIG-IP (LTM) HA on Azure cloud; however, I stumbled upon contradictory statements: one says an Azure load balancer is required to achieve BIG-IP HA, whereas another describes an implementation without an Azure load balancer. Can someone please clarify which one is correct?
F5 provides different Azure deployment designs, which can be found here...
The "Autoscale" templates cover a setup of two or more standalone VEs in a load-balanced configuration and do not utilize session-state replication between those VEs. The load can be distributed via RR-DNS or front-ending Azure LBs, which distribute the load between the individual VEs.
The "Failover" templates cover a traditional Sync-Failover F5 setup including session-state replication. The active/passive network integration is handled either by your VEs via Azure API calls (i.e. dynamically assigning the public IP to the currently active unit) or via front-ending Azure LBs.
Personally I don't use any of the provided templates, since they are not flexible enough (i.e. no 2-arm setup available and way too many pre-configured settings). Because of that I usually install two standalone 2-NIC VEs from scratch (i.e. MGMT and Production interfaces), create an LTM Sync-Failover cluster as usual (via the Self-IPs of the Production network) and then deploy an Azure LB in front of the units to provide network failover (L2 failover/clustering does not work in Azure). In this setup each Virtual Server is simply configured with a /31 network mask (i.e. two consecutive IPs for each VS) and each of the VE units listens to just one of those /31 IPs (via additional Virtual Machine IPs). If VE unit A is currently active, the Azure load balancer will mark IP A as active and IP B as inactive and forward the traffic via IP A to unit A; if VE unit B is currently active, it will mark IP A as inactive and IP B as active and forward the traffic via IP B to unit B. The outcome of this setup is a fully functional Sync-Failover cluster with failover delays of 5-10 seconds...
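To illustrate the idea, here is a minimal tmsh sketch of one such /31 virtual server (all names and addresses are hypothetical, not Kai's actual config). On the Azure side, the lower address would be a secondary IP configuration on unit A's production NIC and the upper address on unit B's:

```
# Hypothetical /31 VIP pair: 10.0.1.10 (unit A) and 10.0.1.11 (unit B).
# A single network virtual server with a /31 destination covers both
# consecutive addresses and is replicated to both units via config-sync:
tmsh create ltm virtual vs_https_app \
    destination 10.0.1.10:443 mask 255.255.255.254 \
    ip-protocol tcp pool pool_app \
    source-address-translation { type automap }
```

Whichever unit is active answers on its own IP of the pair, and the Azure LB's health probe decides where to send the traffic.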
Hi Kai. You mention
" In this setup each Virtual Server is simply configured with an /31 network mask (aka. two subsequent IPs for each VS) and each of the VE units is listening to just one of those /31 IPs (via additional Virtual Machine IPs)"
Is there a good document that details best practices for how to do this? Including configuration of any required load balancer or traffic manager?
AFAIK there is no such guide available from F5.
You basically have to treat the Azure-based VEs the same way you would a cluster in an on-prem environment. The missing L2 capabilities of Azure are simply replaced by those /31 VS instances (in Azure, IP-1 gets assigned to Unit-A and IP-2 to Unit-B) and an Azure LB in front of those IP pairs, which health-monitors which system is active and performs the failover if needed.
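As a rough sketch of the Azure side (resource-group, LB, and pool names here are made up), the LB needs a health probe plus a load-balancing rule per VIP port, with both VEs' production NIC IPs in the backend pool:

```
# Hypothetical resource names; a TCP probe against the VIP port is what
# tells the Azure LB which unit currently answers on its /31 IP.
az network lb probe create \
    --resource-group rg-f5 --lb-name lb-f5-ext \
    --name probe-https --protocol tcp --port 443

az network lb rule create \
    --resource-group rg-f5 --lb-name lb-f5-ext \
    --name rule-https --protocol Tcp \
    --frontend-port 443 --backend-port 443 \
    --frontend-ip-name fe-vip --backend-pool-name pool-f5 \
    --probe-name probe-https
```

Only the active unit responds to the probe on its IP, so the LB marks the standby's address down and steers traffic to the active unit.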
Once you've got it up and running, you simply operate a usual, fully featured active-passive VE cluster with config and session-state sync. There is basically no difference between on-prem and Azure anymore…
Please review my articles, which show various HA patterns as well as what the BIG-IP virtual server should look like.
And overall HA guidance in the 3 major cloud providers with recommendations:
I developed the mentioned setup – let's call it the "Azure LB assisted /31 subnet mask Sync-Failover cluster" deployment – way back in 2017/2018, when F5 had only API-based Sync-Failover support, with all the drawbacks the Azure API is able to cause.
The major difference to your outlined “HA Using ALB for Failover” approach is that my scenario is able to use a 2-NIC template (just a Production NIC and a Management NIC) and uses multiple /31 subnet mask virtual servers with the same :80 and :443 port assignments, instead of multiple /0 subnet mask virtual servers with different ports for individual applications/services.
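The /31 pairing is just even/odd address arithmetic: each virtual server owns two consecutive IPs, the lower one on unit A and the upper one on unit B. A small Python sketch (addresses hypothetical) shows which two IPs share one virtual server:

```python
import ipaddress

def slash31_pair(ip: str) -> list[str]:
    """Return the two consecutive addresses sharing a /31 with `ip`."""
    # strict=False snaps the host address to its /31 network boundary,
    # so passing either member of the pair yields the same two addresses.
    net = ipaddress.ip_network(f"{ip}/31", strict=False)
    return [str(a) for a in net]

# Unit A would listen on the first address, unit B on the second:
print(slash31_pair("10.0.1.10"))  # → ['10.0.1.10', '10.0.1.11']
```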
Note: Since then I’ve checked the new releases of the supported ARM templates here and there, and also had a couple of talks with the guys behind them. But to be honest, those wizard-based configurations are pretty non-intuitive and produce a rather unfamiliar/unclean configuration set (at least for my heavily OCD-impaired brain). I will definitely continue to stick to an as-clean-as-possible 2-NIC standalone initial VE deployment (with all autogenerated settings removed) and then just create a Sync-Failover cluster setup the way I would for on-prem environments. Compared to a well-known on-prem deployment, the one and only difference is those /31 subnet mask virtual servers (or /0 with different ports, as you like) and the front-ending Azure LB to make up for the limited Layer 2 capabilities of Azure. But that is all...
My video is general and doesn't really depend on the number of NICs. If you need 2 NICs, you can check out my Azure BIG-IP Terraform repo. It also builds out the necessary example VIPs for HA via LB templates (e.g. a 0.0.0.0/0 VIP).
There are many ways to carve out the VIPs with varying masks, as long as the Azure LB has the right backend targets and ports. It all depends on how you want to manage virtual server naming and standards in your environment.
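For comparison with the /31 style, a wildcard-mask VIP from that pattern might look like the following sketch (hypothetical names; not the repo's exact output):

```
# One wildcard (/0) virtual server per application, distinguished by
# port rather than by destination address; the Azure LB rule's backend
# port selects which application a given frontend maps to.
tmsh create ltm virtual vs_app1 destination 0.0.0.0:80 mask any ip-protocol tcp pool pool_app1
tmsh create ltm virtual vs_app2 destination 0.0.0.0:8443 mask any ip-protocol tcp pool pool_app2
```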
*Note: most deployments use a 2-NIC BIG-IP, but I'm currently in the process of updating the repo to 3-NIC. It's easier for people to delete code than to figure out what to add.
Hello Kai, can you tell me what health probe you implemented on the Azure LB? I've deployed the F5 template which creates two active/passive F5s behind an Azure LB, but as I'm load balancing a UDP application (Always On VPN) I'm unsure what health probe I need to create on the Azure LB.
Review the supported tree:
As well as the experimental tree which has a few more options:
You can also look into the Terraform examples to build out the components via that tooling method. Here are examples of using other orchestration methods (Terraform, Ansible).