Scaling SSL VPN using BIG-IP Local Traffic Manager (LTM)
Background
In response to the COVID-19 pandemic, many organisations have implemented a remote working policy, resulting in a significant increase in the number of users requiring remote access. While some customers are able to support this increase on their existing physical or virtual infrastructure, many are facing unprecedented demand that cannot be supported on their existing infrastructure.
F5 recommends the following:
- Utilizing several High-Performance Access Policy Manager (APM) Virtual Editions (VE) to provide horizontal scaling with appropriate APM CCU licensing.
- Traffic steering across the High-Performance APM VEs using one of the following mechanisms:
  a) Global Server Load Balancing (GSLB) via F5 DNS (formerly known as GTM)
  b) Global Server Load Balancing (GSLB) via F5 DNS Load Balancer Cloud Service
  c) Load Balancing via BIG-IP Local Traffic Manager (LTM)
  d) A combination of the above
Note: The above mechanisms can also be used by customers who choose to redeploy existing Active-Passive APM clusters as Active-Active behind the LTM or GSLB.
This solution brief focuses on Load Balancing via BIG-IP LTM, option (c).
Overview
The solution consists of two tiers: an existing BIG-IP Local Traffic Manager (LTM) that load balances inbound SSL VPN traffic, and several APM VEs that perform SSL VPN termination. The High-Performance APM VEs use VE subscription licenses that support up to 24 vCPUs and have no throughput limits. 2 GB of RAM must be allocated per vCPU.
The design goal is to spread the required SSL VPN load across a larger number of APM VE instances, each with a lower concurrent user count, rather than across a smaller number of instances, each with a higher concurrent user count. This approach is designed to maximise the throughput per user.
The recommendation is for no more than 5,000 concurrent users per APM VE.
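The sizing guidance above can be turned into a quick calculation. The following Python sketch is illustrative only: the function names and the 18,000-user, 8-vCPU example are hypothetical, and it does not account for any spare capacity you may want for instance failure.

```python
import math

MAX_USERS_PER_VE = 5000   # recommended ceiling per APM VE (from this brief)
RAM_GB_PER_VCPU = 2       # RAM to allocate per vCPU (from this brief)
MAX_VCPUS_PER_VE = 24     # upper bound of the VE subscription license (from this brief)


def apm_ve_count(total_users, users_per_ve=MAX_USERS_PER_VE):
    """Smallest number of APM VEs that keeps each one at or below users_per_ve."""
    if total_users <= 0:
        return 0
    return math.ceil(total_users / users_per_ve)


def ram_gb(vcpus):
    """RAM in GB to allocate to one APM VE with the given vCPU count."""
    if not 1 <= vcpus <= MAX_VCPUS_PER_VE:
        raise ValueError("vCPU count must be between 1 and 24")
    return vcpus * RAM_GB_PER_VCPU


# Hypothetical example: 18,000 remote users on 8-vCPU instances.
print(apm_ve_count(18000))  # 4
print(ram_gb(8))            # 16
```

Passing a lower users_per_ve than 5,000 models the design goal of more instances with fewer users each.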
Figure 1: Scaling SSL VPN using BIG-IP Local Traffic Manager (LTM) Conceptual Diagram
The LTM is deployed and configured with a FastL4 Virtual Server (VS) in front of the APM instances in a routed environment. Source Address Translation (SNAT) is disabled on the LTM FastL4 Virtual Server because the APM VE instances are on the same subnet as the Internal VLAN of the LTM. This allows the APM VE to see the real client IP address and allows the default gateway of the APM VE instances to point to the internal network (useful for very large customer networks), with Auto Last-Hop returning traffic to the LTM. SNAT is also not applied on the APM VE instances.
If the APM VE instances are located on a different subnet from the LTM, then SNAT is required. Please refer to the Appendices for details, as changes to the Virtual Server configuration are required to support this.
Solution Details
As shown in Figure 2, the solution consists of one LTM instance (3-NIC) with N standalone APM VEs (3-NIC) configured in the LTM pool. Each APM VE has its own unique lease pool to assign to SSL VPN tunnels.
Figure 2: Scaling SSL VPN using BIG-IP Local Traffic Manager (LTM) Solution Detail Diagram
The LTM monitors the availability of the VPN Pool using an HTTPS monitor and an ICMP monitor.
Each APM VE runs a custom monitor that checks its concurrent connectivity usage (CCU) against a specified threshold every 10 seconds and alters the HTTP response that the LTM tracks using an HTTP monitor. The page responds with “online” if the APM VE is below the threshold and “offline” if the allocation is exhausted.
This is designed to provide a graceful limit to concurrent users before a license limit is hit. It also provides customers a mechanism to scale the user count horizontally before the APM license limit is reached, for example limiting the concurrent user count to 2,000 on an APM VE instance that is licensed for 2,500.
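The threshold decision described above can be sketched in a few lines of Python. This illustrates the logic only and is not the vpn_ccu_monitor.pl script. The function names are hypothetical, and treating a count exactly at the threshold as “offline” is an assumption.

```python
def monitor_status(current_ccu, threshold):
    """Body the monitor page would serve for a given concurrent user count.

    Assumption: a count equal to the threshold is treated as exhausted.
    """
    return "online" if current_ccu < threshold else "offline"


def license_headroom(license_limit, threshold):
    """Sessions left between the graceful threshold and the hard license limit."""
    return license_limit - threshold


# Example from the text: threshold of 2,000 on an instance licensed for 2,500.
print(monitor_status(1999, 2000))      # online
print(monitor_status(2000, 2000))      # offline
print(license_headroom(2500, 2000))    # 500
```

The headroom matters because sessions that are already established stay on the APM VE after it reports “offline”; only new connections are steered elsewhere by the LTM.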
The TMSH commands used to generate the LTM and APM configuration that matches the network shown in Figure 2 are available from https://github.com/brett-at-f5/f5-scaling-ssl-vpn. These are provided as a mechanism to quickly stand up the service, but they need to be modified on a per-customer basis to suit the environment.
Monitor Configuration
As noted, the LTM uses an HTTP monitor to poll the APM VEs and check their availability based on their concurrent user load, receiving either an “online” or “offline” response. The response from each APM VE is generated by an iRule that checks the status of a table. The table is updated by a Perl script (vpn_ccu_monitor.pl) that is stored in /shared/scripts and run every 10 seconds by an iCall. These components work together as shown in Figure 3.
Figure 3: VPN CCU Monitor Process
The iCall can be created from the bash command prompt after uploading the Perl script (vpn_ccu_monitor.pl).
tmsh create sys icall script vpn_ccu_script definition { catch { exec /shared/scripts/vpn_ccu_monitor.pl } }
tmsh create sys icall handler periodic vpn_ccu_handler script vpn_ccu_script interval 10
The HTTP Virtual Server (vpn_monitor_vs) has the following iRule (vpn_ccu_monitor_irule) attached, which responds to the BIG-IP LTM monitor.
when HTTP_REQUEST {
    switch [string tolower [HTTP::path]] {
        "/offline" {
            # Record that the CCU threshold has been reached
            table set -subtable ccu monitor offline indef indef
            HTTP::respond 200 content "offline"
        }
        "/online" {
            # Record that the APM VE is below the CCU threshold
            table set -subtable ccu monitor online indef indef
            HTTP::respond 200 content "online"
        }
        "/monitor" {
            # Return the stored state to the LTM HTTP monitor
            set response [table lookup -subtable ccu monitor]
            HTTP::respond 200 content $response
        }
        default {
            HTTP::close
        }
    }
}
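To reason about how the LTM monitor sees state changes, the iRule's behaviour can be modelled outside the BIG-IP. The Python sketch below is a simplified stand-in, not F5 code: the class name is hypothetical, a dictionary replaces the ccu subtable, and it assumes that a lookup of a key that has never been set yields an empty body.

```python
class CcuMonitorModel:
    """Simplified model of vpn_ccu_monitor_irule (hypothetical helper)."""

    def __init__(self):
        self.table = {}  # stands in for the "ccu" subtable

    def handle(self, path):
        """Return (status, body) for a request path, or None if the connection is closed."""
        path = path.lower()  # the iRule lower-cases the path before matching
        if path in ("/offline", "/online"):
            state = path.lstrip("/")
            self.table["monitor"] = state  # stored with no expiry in the iRule
            return (200, state)
        if path == "/monitor":
            # Assumption: a missing key produces an empty response body.
            return (200, self.table.get("monitor", ""))
        return None  # default branch closes the connection


model = CcuMonitorModel()
print(model.handle("/monitor"))   # (200, '')
model.handle("/online")
print(model.handle("/monitor"))   # (200, 'online')
model.handle("/offline")
print(model.handle("/monitor"))   # (200, 'offline')
```

Under that assumption, an APM VE would return an empty body on /monitor until /online or /offline has been requested at least once, so an LTM monitor expecting “online” would presumably not mark it available before then.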
APM Access Policy
The TMSH commands available from https://github.com/brett-at-f5/f5-scaling-ssl-vpn create an empty APM policy. This policy needs to be modified to suit your requirements. For testing purposes, the following basic APM policy was created to authenticate the user against AD and check AD group membership:
Figure 4: Example Access Policy
For production purposes, F5 strongly recommends that multi-factor authentication be enforced by configuring two or more distinct authentication factors in the APM access policy. APM supports a wide range of authentication methods.
The recommendation is to create the Access Policy on a single APM instance and then use the export/import function in APM to replicate the policy onto the remaining APM VE instances.
Figure 5: Access Policy Export
When importing the Access Profile for the first time, select Reuse Existing Objects, as the TMSH commands listed in https://github.com/brett-at-f5/f5-scaling-ssl-vpn will create a number of the underlying objects, such as AAA servers and Connectivity Profiles.
Figure 6: Access Profile Import - Reuse Existing Objects
Attach the Access Profile and Connectivity Profile to the Virtual Server.
Figure 7: Virtual Server Access Profile and Connectivity Profile
Note: If a change is made to an Access Policy and you wish to re-import it with the same Policy name, the existing Access Policy needs to be removed from the Virtual Servers and the Access Profile deleted, keeping the APM objects.
Ensure that only the Access Profile is removed by unchecking the remaining APM objects.
Figure 8: Deleting an Access Policy without removing other APM Objects
VoIP through Network Access connections
When your VoIP application is not compatible with network address translation, or you want to retain the source address of the VoIP clients, it is recommended to enable DTLS and Strict Port Translation.
K16680: VoIP through Network Access connections
Disabling Compression
Unless you are using specific F5 hardware that provides hardware compression, it is recommended to disable all compression on the APM SSL VPN tunnel.
K12524516: APM Network Access (VPN) compression causes CPU usage higher
Split vs Full VPN Tunnel
Where security policies permit, split tunnelling is recommended to reduce the amount of traffic being processed by the APM instances.
modify apm resource network-access vpn_na split-tunneling true
modify apm resource network-access vpn_na address-space-include-dns-name add { f5.demo }
modify apm resource network-access vpn_na address-space-include-subnet add { { subnet 10.0.0.0/8 } }
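The effect of the include subnet can be illustrated with a small Python sketch. This is a simplified model rather than how the VPN client actually routes traffic: the function name is hypothetical, and it considers only the 10.0.0.0/8 include subnet from the example above, ignoring DNS-name includes and any exclude lists.

```python
import ipaddress

# Include subnet taken from the example configuration above.
INCLUDE_SUBNETS = [ipaddress.ip_network("10.0.0.0/8")]


def uses_tunnel(destination_ip, include_subnets=INCLUDE_SUBNETS):
    """True if traffic to destination_ip would be sent through the VPN tunnel."""
    addr = ipaddress.ip_address(destination_ip)
    return any(addr in net for net in include_subnets)


print(uses_tunnel("10.20.30.40"))    # True: internal address, sent via the APM VE
print(uses_tunnel("203.0.113.10"))   # False: goes out via the client's local network
```

Only the destinations that return True consume throughput on the APM instances, which is why split tunnelling reduces their load.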
LTM Virtual Server with SNAT and XFF
If SNAT is required on the BIG-IP LTM Virtual Server, the following are required for the APM instance to see the real client IP:
- Standard Virtual Server
- HTTP profile
- Client-side and Server-side SSL profiles
- Automap / SNAT Pool
- iRule – see below
The HTTP profile must be disabled after user authentication to allow the VPN tunnel to establish. The following iRule can be used to insert the XFF header and disable the HTTP profile after authentication.
when HTTP_REQUEST {
    # Pass the real client IP to the APM VE
    HTTP::header insert X-Forwarded-For [IP::remote_addr]
    # Stop HTTP processing for the VPN tunnel requests
    if { [HTTP::uri] starts_with "/myvpn" || [HTTP::uri] starts_with "/isession" } {
        HTTP::disable
    }
}
In addition to the above iRule, if Client Certificate or Machine Certificate authentication is required by APM, you will also need to configure Client Certificate Constrained Delegation (C3D). Please follow K14065425: Configuring Client Certificate Constrained Delegation (C3D).
Additional References
K17160: Achieving consistent high-performance on BIG-IP VE
K13267: BIG-IP APM connectivity license use
Manual: BIG-IP Access Policy Manager: Authentication Methods