Lightboard Lessons: BIG-IP Deployments in Azure Cloud
In this edition of Lightboard Lessons, I cover the deployment of a BIG-IP in the Azure cloud. There are a few videos associated with this topic, and each video addresses a specific use case.
Topics will include the following:
- Azure Overview with BIG-IP
- BIG-IP High Availability Failover Methods
Glossary:
- ALB = Azure Load Balancer
- ILB = Azure Internal Load Balancer
- HA = High Availability
- VE = Virtual Edition
- NVA = Network Virtual Appliance
- DSR = Direct Server Return
- RT = Route Table
- UDR = User Defined Route
- WAF = Web Application Firewall
Azure Overview with BIG-IP
This overview covers an on-prem BIG-IP in a 3-NIC example setup. I then discuss the Azure cloud network and its components and how they relate to making a BIG-IP work in the Azure cloud. Topics discussed include NICs, routes, network security groups, and IP configurations. The important thing to remember is that the cloud is not like on-prem when it comes to Layer 2 and Layer 3 networking. This makes a difference as you assign NICs and IPs to each virtual machine in the cloud.
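As a concrete illustration of the IP configuration piece, here is a minimal Azure CLI sketch that adds a secondary private IP to an existing NIC; secondary IPs like this are what typically back a BIG-IP VIP in Azure. The resource group, NIC, and address values are hypothetical.

```shell
# Hypothetical names: resource group "bigip-rg", external NIC "bigip-vm0-ext-nic".
# Add a secondary IP configuration; BIG-IP VIPs in Azure are usually
# secondary private IPs on the external NIC.
az network nic ip-config create \
  --resource-group bigip-rg \
  --nic-name bigip-vm0-ext-nic \
  --name vip-ipconfig \
  --private-ip-address 10.1.1.10
```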
Read more here on F5 CloudDocs for Azure BIG-IP Deployments.
BIG-IP HA and Failover Methods
The high availability section spans three videos. They discuss the failover methods for a BIG-IP cluster, how traffic fails over to the second device upon a failover event, and the IP addressing options for the BIG-IP VIP/listeners and why.
Question: “Which F5 solution is right for me? Autoscaling or HA solutions?”
Use these bullet points as guidance:
- Auto Scale Solution
- Ramp up/down time to consider as new instances come and go
- Dynamically adjust instance count based on CPU, memory, and throughput
- No failover, all devices are Active/Active
- Self-healing upon device failure (thanks to cloud provider native features)
- Instances are deployed with 1-NIC only
- HA Failover (non auto scale)
- No Ramp up/down time since no additional devices are "auto" scaling
- No dynamic scaling of the cluster, it will remain as two (2) instances
- Yes failover, UDRs and IP configs will fail over to the other BIG-IP instance
- No self-healing, manual maintenance is required by user (similar to on-prem)
- Instances can be deployed with multiple NICs if needed
HA Using API for Failover
How do IP addresses and routes fail over to the other BIG-IP unit and still process traffic with no Layer 2 (L2) networking? Easy, API calls to the cloud. When you deploy an HA pair of BIG-IP instances in the Azure cloud, the BIG-IP instances are onboarded with various cloud scripts. These scripts facilitate the moving of cloud objects by detecting failover events and triggering API calls to the Azure cloud that move cloud objects (ex. Azure IPs, Azure route next-hops). Traffic then processes successfully on the newly active BIG-IP instance.
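To make the API mechanics concrete, here is a rough sketch of what the failover scripts accomplish, expressed as equivalent Azure CLI calls. All resource names and addresses are hypothetical, and the real templates call the Azure REST API directly rather than the CLI.

```shell
# 1. Move the VIP's secondary IP config: remove it from the failed
#    unit's NIC...
az network nic ip-config delete \
  --resource-group bigip-rg --nic-name bigip1-ext-nic --name vip-ipconfig

# ...and recreate it on the newly active unit's NIC.
az network nic ip-config create \
  --resource-group bigip-rg --nic-name bigip2-ext-nic \
  --name vip-ipconfig --private-ip-address 10.1.1.10

# 2. Repoint the UDR next-hop at the newly active unit's self IP.
az network route-table route update \
  --resource-group bigip-rg --route-table-name app-rt --name default-route \
  --next-hop-ip-address 10.1.1.6
```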
Benefits of Failover via API:
- This is most similar to a traditional HA setup
- No ALB or ILB required
- VIPs, SNATs, floating IPs, and managed routes (UDR) can fail over to the peer
- SNAT pool can be used if port exhaustion is a concern
- SNAT automap is optional (UDR routes are needed if SNAT is none)
Requirements for Failover via API:
- Service Principal required with correct permissions
- BIG-IP needs outbound internet access to Azure REST API on port 443
- Multi-NIC required
Other things to know:
- BIG-IP pair will be active/standby
- Failover times are dependent on Azure API queue (30-90 seconds, sometimes longer)
- I have experienced up to 20 minutes to fail over IPs (public IPs, private IPs)
- UDR route table entries typically take 5-10 seconds in my testing experience
- BIG-IP listener
- can be secondary private IP associated with NIC
- can be an IP within network prefix being routed to BIG-IP via UDR
Read about the F5 GitHub Azure Failover via API templates.
HA Using ALB for Failover
This type of BIG-IP deployment in Azure requires the use of an Azure load balancer. The ALB sits in a Tier 1 position and acts as a Layer 4 (L4) load balancer to the BIG-IP instances. The ALB performs health checks against the BIG-IP instances with configurable timers, which can result in a much faster failover time than the "HA via API" method, which is dependent on the Azure API queue. In default mode, the ALB has Direct Server Return (DSR) disabled. This means the ALB will DNAT the destination IP requested by the client, so the BIG-IP VIP/listener must listen on a wildcard 0.0.0.0/0 or on the NIC subnet range like 10.1.1.0/24. Why? Because the ALB sends traffic to the BIG-IP instance on a private IP. This IP is unique per BIG-IP instance and cannot "float" over without an API call. Remember, no L2...no ARP in the cloud. Rather than create two different listener IP objects for each app, you can simply use a network range listener or a wildcard. The video has a quick example of this using various ports like 0.0.0.0/0:443 and 0.0.0.0/0:9443.
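A wildcard listener like the one described above can be sketched in tmsh roughly as follows. This is a minimal, hypothetical example (virtual server and pool names are made up), not the exact configuration the F5 templates deploy.

```shell
# Hypothetical tmsh sketch: a wildcard HTTPS listener that matches any
# destination IP, needed because the ALB (DSR disabled) DNATs traffic
# to each unit's unique private IP.
tmsh create ltm virtual vip_any_443 \
  destination 0.0.0.0:443 mask any \
  ip-protocol tcp \
  profiles add { http } \
  source-address-translation { type automap } \
  pool app_pool
```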
Benefits of Failover via LB:
- 3-NIC deployment supports sync-only Active/Active or sync-fail Active/Standby
- Failover times depend on ALB health probe (ex. 5 sec)
- Multiple traffic groups are supported
Requirements for Failover via LB:
- ALB and/or ILB required
- SNAT automap required
Other things to know:
- BIG-IP pair will be active/standby or active/active depending on setup
- ALB is for internet traffic
- ILB is for internal traffic
- ALB has DSR disabled by default
- Failover times are much quicker than "HA via API"
- Times are dependent on Azure LB health probe timers
- Azure LB health probe can be tcp:80 for example (keep it simple)
- Backend pool members for ALB are the BIG-IP secondary private IPs
- BIG-IP listener
- can be wildcard like 0.0.0.0/0
- can be network range associated with NIC subnet like 10.1.1.0/24
- can use different ports for different apps like 0.0.0.0/0:443, 0.0.0.0/0:9443
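The simple tcp:80 health probe mentioned above can be sketched with the Azure CLI; the resource group and load balancer names are hypothetical.

```shell
# Hypothetical sketch: a TCP/80 health probe on an existing ALB.
# interval 5s with threshold 2 roughly matches the "ex. 5 sec"
# failover behavior described in the article.
az network lb probe create \
  --resource-group bigip-rg --lb-name bigip-alb \
  --name tcp-80-probe --protocol Tcp --port 80 \
  --interval 5 --threshold 2
```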
Read about the F5 GitHub Azure Failover via ALB templates.
HA Using ALB for Failover with DSR Enabled (Floating IP)
This is a quick follow-up to the previous "HA via ALB" video. In this fourth video, I discuss the "HA via ALB" method again, but this time the ALB has DSR enabled. Whew! Lots of acronyms! When DSR is enabled, the ALB forwards traffic to the backend pool (aka the BIG-IP instances) without performing destination NAT (DNAT). This means...if the client requested 2.2.2.2, then the ALB sends the request to the backend pool (BIG-IP) with the same destination 2.2.2.2. As a result, the BIG-IP VIP/listener will match the public IP on the ALB. This makes use of a floating IP.
Benefits of Failover via LB with ALB DSR Enabled:
- Reduces configuration complexity between the ALB and BIG-IP
- The IP you see on the ALB will be the same IP as the BIG-IP listener
- Failover times depend on ALB health probe (ex. 5 sec)
Requirements for Failover via LB:
- DSR enabled on the Azure ALB or ILB
- ALB and/or ILB required
- SNAT automap required
- Dummy VIP "healthprobe" to check status of BIG-IP on individual self IP of each instance
- Create one "healthprobe" listener for each BIG-IP (total of 2)
- VIP listener IP #1 will be BIG-IP #1 self IP of external network
- VIP listener IP #2 will be BIG-IP #2 self IP of external network
- VIP listener port can be 8888 for example (this must match the port on the ALB health probe)
- attach iRule to listener for up/down status
Example iRule:
when HTTP_REQUEST {
    HTTP::respond 200 content "OK"
}
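Wiring that iRule into the per-unit "healthprobe" listener might look like the tmsh sketch below. The self IP, port, and object names are hypothetical, and the iRule is assumed to already exist on the box as "alb_health_rule"; repeat the equivalent on BIG-IP #2 with its own external self IP.

```shell
# Hypothetical sketch on BIG-IP #1 (external self IP 10.1.1.5 assumed).
# The ALB health probe targets this listener on port 8888; the iRule
# answers 200 OK so the probe marks this unit healthy.
tmsh create ltm virtual healthprobe_vip \
  destination 10.1.1.5:8888 ip-protocol tcp \
  profiles add { http } \
  rules { alb_health_rule }
```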
Other things to know:
- ALB is for internet traffic
- ILB is for internal traffic
- BIG-IP pair will operate as active/active
- Failover times are much quicker than "HA via API"
- Times are dependent on Azure LB health probe timers
- Backend pool members for ALB are the BIG-IP primary private IPs
- BIG-IP listener
- can be same IP as the ALB public IP
- can use different ports for different apps like 2.2.2.2:443, 2.2.2.2:8443
Read about the F5 GitHub Azure Failover via ALB templates. Also read about Azure LB and DSR.
Auto Scale BIG-IP with ALB
This type of BIG-IP deployment takes advantage of the native cloud features by creating an auto scaling group of BIG-IP instances. Similar to the "HA via LB" mentioned earlier, this deployment makes use of an ALB that sits in a Tier 1 position and acts as a Layer 4 (L4) load balancer to the BIG-IP instances. Azure auto scaling is accomplished by using Azure Virtual Machine Scale Sets that automatically increase or decrease BIG-IP instance count.
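As a rough illustration of the scale-set side, here is a hypothetical Azure CLI sketch of CPU-based autoscale settings for an existing BIG-IP VM Scale Set. Names and thresholds are made up; the F5 templates configure this for you.

```shell
# Hypothetical names: VM Scale Set "bigip-vmss" in resource group "bigip-rg".
# Define the autoscale profile: 2-8 instances, default 2.
az monitor autoscale create \
  --resource-group bigip-rg \
  --resource bigip-vmss \
  --resource-type Microsoft.Compute/virtualMachineScaleSets \
  --name bigip-autoscale --min-count 2 --max-count 8 --count 2

# Scale out by 1 instance when average CPU exceeds 75% over 5 minutes.
az monitor autoscale rule create \
  --resource-group bigip-rg --autoscale-name bigip-autoscale \
  --condition "Percentage CPU > 75 avg 5m" \
  --scale out 1
```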
Benefits of Auto Scale with LB:
- Dynamically increase/decrease BIG-IP instance count based on CPU and throughput
- If using F5 auto scale WAF templates, then those come with pre-configured WAF policies
- F5 devices will self-heal (cloud VM scale set will replace damaged instances with new)
Requirements for Auto Scale with LB:
- Service Principal required with correct permissions
- BIG-IP needs outbound internet access to Azure REST API on port 443
- ALB required
- SNAT automap required
Other things to know:
- BIG-IP cluster will be active/active
- BIG-IP will be deployed with 1-NIC
- BIG-IP onboarding time
- BIG-IP VE process takes about 3-8 minutes depending on instance type and modules
- Azure VM Scale Set configured with a 10-minute window for scale up/down (ex. to prevent flapping)
- Take these timers into account when looking at full readiness to accept traffic
- BIG-IP listener
- can be wildcard like 0.0.0.0/0
- can use different ports for different apps like 0.0.0.0/0:443, 0.0.0.0/0:9443
- Licensing
- PAYG marketplace licensing can be used
- BIG-IQ license manager can be used for BYOL licensing
Sorry, no video yet...a picture will have to do! Here's an example diagram of auto scale with ALB.
Read about the F5 GitHub Azure Auto Scale via ALB templates.
Auto Scale BIG-IP with DNS
This type of BIG-IP deployment takes advantage of the native cloud features by creating an auto scaling group of BIG-IP instances. Unlike the "HA via LB" or "Auto Scale with ALB" deployments mentioned earlier, this deployment uses DNS to distribute traffic to the auto scaling BIG-IP instances. This solution integrates with F5 BIG-IP DNS (formerly named GTM). And...since there is no ALB in front of the BIG-IP instances, you do not need SNAT automap on the BIG-IP listeners. In other words, if you have apps that need to see the real client IP and they are non-HTTP apps (they can't pass an XFF header), then this is one method to consider.
Benefits of Auto Scale with DNS:
- Dynamically increase/decrease BIG-IP instance count based on CPU and throughput
- If using F5 auto scale WAF templates, then those come with pre-configured WAF policies
- F5 devices will self-heal (cloud VM scale set will replace damaged instances with new)
- ALB not required (cost savings)
- SNAT automap not required
Requirements for Auto Scale with DNS:
- Service Principal required with correct permissions
- BIG-IP needs outbound internet access to Azure REST API on port 443
- SNAT automap optional
- BIG-IP DNS (aka GTM) needs connectivity to each BIG-IP auto scaled instance
Other things to know:
- BIG-IP cluster will be active/active
- BIG-IP will be deployed with 1-NIC
- BIG-IP onboarding time
- BIG-IP VE process takes about 3-8 minutes depending on instance type and modules
- Azure VM Scale Set configured with a 10-minute window for scale up/down (ex. to prevent flapping)
- Take these timers into account when looking at full readiness to accept traffic
- BIG-IP listener
- can be wildcard like 0.0.0.0/0
- can use different ports for different apps like 0.0.0.0/0:443, 0.0.0.0/0:9443
- Licensing
- PAYG marketplace licensing can be used
- BIG-IQ license manager can be used for BYOL licensing
Sorry, no video yet...a picture will have to do! Here's an example diagram of auto scale with DNS.
Read about the F5 GitHub Azure Auto Scale via DNS templates.
Summary
That's it for now! I hope you enjoyed the video series (here in full on YouTube) and quick explanation. Please leave a comment if this helped or if you have additional questions.
Additional Resources
F5 High Availability - Public Cloud Guidance
The Hitchhiker’s Guide to BIG-IP in Azure
The Hitchhiker’s Guide to BIG-IP in Azure – “Deployment Scenarios”
The Hitchhiker’s Guide to BIG-IP in Azure – “High Availability”
The Hitchhiker’s Guide to BIG-IP in Azure – “Life Cycle Management”
- Jeff_Giroux_F5Ret. Employee
You will need to choose the correct affinity on the Azure LB side to maintain persistence when BIG-IP is active-active. Affinity is a separate LB setting from DSR/floating IP. As for round-robin, the Azure LB by default uses a hash that sends traffic to any healthy node; there is no typical round-robin. To stick to a certain BIG-IP, source IP affinity is required.
"The hash is used to route traffic to healthy backend instances within the backend pool. The algorithm provides stickiness only within a transport session. When the client starts a new session from the same source IP, the source port changes and causes the traffic to go to a different backend instance."
https://learn.microsoft.com/en-us/azure/load-balancer/distribution-mode-concepts
By default, Azure Load Balancer uses a five-tuple hash to map flows to available servers. The five tuple includes:
- Source IP address
- Source port
- Destination IP address
- Destination port
- IP protocol number
- yamashin55Cirrus
Thank you for the easy to understand explanation.
Please tell me about "HA Using ALB for Failover with DSR Enabled (Floating IP)". I also read your article below.
"F5 High Availability - Public Cloud Guidance"
https://community.f5.com/t5/technical-articles/f5-high-availability-public-cloud-guidance/ta-p/284381
For the "ALB for Failover with DSR Enabled (Floating IP)" configuration,
when BIG-IP is in an Active-Active configuration, can sessions from the ALB be maintained?
Will the ALB round-robin between Active BIG-IP #1 and Active BIG-IP #2? I want session persistence (from the ALB to the BIG-IP).
- Vincent77Nimbostratus
Thanks for this documentation.
For HA Using ALB for Failover with DSR Enabled (Floating IP)
I created the "Dummy VIP" on each device and assigned the iRule.
The problem is that on each device, the self IP answers.
Therefore, the ALB sends traffic to both members of the cluster, not only to the active member.
- grovesNimbostratus
Excellent work - excited for the DNS auto-scale lesson and thanks for the diagram in the interim!
- RemoteAdminNimbostratus
Thank you for the easy to understand explanation of how this all works. Great job!
- Jeff_Giroux_F5Ret. Employee
I plan to cover auto scale deployments and a few others in future articles.