I had a conversation with someone not too long ago on the subject of BIG-IP high availability. BIG-IP is primarily a load balancer, so some forms of high availability (load balancing across multiple web servers, for instance) are obvious. But as you probably well know, there are several other characteristics of BIG-IP that can create high availability. The dialog started off slowly (load balancing, health monitors, blah, blah, etc., etc.), then as the possibilities started stacking up the conversation got lively and it became sort of a game. Out came the whiteboard, and after a few more ideas we had 10 ways to HA.

As time went by the game itself evolved in my mind. Sure, “10 ways to HA” is sort of catchy, but I knew there was more. I continued to share this idea with colleagues and eventually hit a happy 14 (though there are definitely more), which I’d like to share with you now.

Before I begin, let me be clear that not all of these high availability characteristics are attributes of BIG-IP itself (most are); some belong to the environment where BIG-IP is a key player. We tried to think of every conceivable reason why a user couldn’t access an application and thought about how the environment could defend against that. Let’s start with a picture:
BIG-IP installed in pairs: Installing them in pairs, typically active/standby, ensures that the failure of one BIG-IP does not bring down your applications. In fact, with a hardware heartbeat connection between them, failover typically completes in under a second.
Redundant switches: Switches and/or routers are typically deployed in front of and/or behind the BIG-IP to increase port density. Multiple switches, with similar failover capabilities, also ensure there’s always a network path from client to application.
Shared IP addresses: When you create a VLAN and assign self-IP addresses, you can also create “floating” IP addresses that span both members of the BIG-IP pair. This ensures proper layer 3 routing if a BIG-IP should fail, because the return IP address is always the same no matter which unit is active.
Shared MAC addresses: Along with floating IP addresses, you can define “masquerading” MAC addresses that ensure a proper layer 2 path if a BIG-IP should fail.
Trunking: Link aggregation and 802.1Q “VLAN tagging” not only allow you to aggregate bandwidth from multiple physical ports, but also provide redundancy to a VLAN should a physical port connection fail.
Load balancing: This is really a no-brainer, but load balancing across multiple services ensures that the application can support greater numbers of requests without overloading a single server.
Health monitoring: Where load balancing spreads application traffic across multiple services, health monitoring ensures that no requests are sent to services that have failed. This, in my opinion, is one of the coolest and most powerful attributes of the BIG-IP. Health monitors can monitor and interact with applications and servers at pretty much every level.
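To make the pairing of load balancing and health monitoring concrete, here is a minimal Python sketch. It is a toy model, not BIG-IP configuration or code: the `Pool` class, its method names, and the example addresses are all hypothetical. It simply rotates through pool members and skips any that a monitor has marked down.

```python
class Pool:
    """Toy pool: round robin over members, skipping ones marked down."""

    def __init__(self, members):
        self.members = list(members)
        self.healthy = set(self.members)  # every member starts out "up"
        self._next = 0

    def mark_down(self, member):
        # A health monitor would call this after a failed check.
        self.healthy.discard(member)

    def mark_up(self, member):
        # ...and this once the member passes its check again.
        if member in self.members:
            self.healthy.add(member)

    def pick(self):
        # Try each member at most once per call, in rotation order.
        for _ in range(len(self.members)):
            member = self.members[self._next % len(self.members)]
            self._next += 1
            if member in self.healthy:
                return member
        return None  # nothing healthy is left to send traffic to


pool = Pool(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
pool.mark_down("10.0.0.2")
print([pool.pick() for _ in range(4)])
```

With one member marked down, requests keep flowing to the remaining two, which is the whole point: the failure never reaches the user.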
Transparent monitoring: So cool it deserves its own title, transparent monitors can monitor through a device (like a router or switch between endpoints), essentially giving you high availability along a path.
Global load balancing: Where local load balancing leaves off, global load balancing ensures high availability across datacenters, across WANs, across the planet! So rest assured: should Godzilla attack your Japan office, your application will still be accessible from another datacenter.
Global load balancing monitors: There are monitors at the global level that can monitor the load balancers that are monitoring your applications.
VMware integration: This is a great feature that employs F5’s iControl capability. You provision offline resources in your VM environment, and when VirtualCenter detects the passing of some pre-defined threshold (memory, processor, or user concurrency overload) it turns those resources on. Once they are up and ready to start taking some of the load, VirtualCenter contacts the BIG-IP through the iControl interface and passes it the IP addresses of the new VMs. The BIG-IP automatically adds those addresses to the load balancing pool. No intervention is required, which gives you a dynamic, self-provisioning environment that grows as customer demand increases.
Session state sharing: Not so much a BIG-IP thing, but most modern web servers allow their application session states to be shared, usually in a database or another “state server”. So typically, when you log into an application that needs to maintain session state, it returns a session token that it uses to track your movement and ensure authentication. That session token (usually a cookie) contains a unique identifier that maps to a piece of memory in the web server’s session table. So while the BIG-IP maintains persistence to that server, if the application fails and you must be sent to another server, your session is gone and you’ll likely have to start over. But if you allow the web servers to share session state, that in-memory table is replaced with something that is accessible to all of the application servers. You can then literally shut servers off for maintenance in the middle of the afternoon and never interrupt user sessions.
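The difference between a private session table and a shared one can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `WebServer` is a hypothetical class, and a plain dictionary stands in for the database or state server.

```python
class WebServer:
    """Toy web server that looks up users by session token."""

    def __init__(self, name, session_store=None):
        self.name = name
        # With no shared store, each server keeps its own private table.
        self.sessions = session_store if session_store is not None else {}

    def login(self, token, user):
        self.sessions[token] = {"user": user}

    def handle(self, token):
        # Returns the user for a known token, or None if the session is unknown.
        session = self.sessions.get(token)
        return session["user"] if session else None


# Private tables: the session created on web1 is unknown to web2.
web1, web2 = WebServer("web1"), WebServer("web2")
web1.login("token-123", "alice")
print(web2.handle("token-123"))

# Shared store (a dict standing in for a database or state server).
shared = {}
web3, web4 = WebServer("web3", shared), WebServer("web4", shared)
web3.login("token-456", "bob")
print(web4.handle("token-456"))
```

In the second case either server can pick up the session, so taking one of them offline does not force the user to log in again.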
Session mirroring: It’s a little-known feature of BIG-IP that allows it to share, or rather mirror, session information between peers (that’s persistence information, anything stored in the session table for a user, etc.), so that the information is still there if the standby unit has to take over.
iRules: And finally there’s iRules. Wait, what? iRules create high availability? Why yes, they do. I could go on and on about the coolness of iRules, but for starters there are events like LB_FAILED that are designed specifically to catch availability failures, and commands like HTTP::retry that allow you to retry a request if it originally failed. You could even use iRules to replicate some of the functions of an application. The sky is really the limit.
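The catch-the-failure-and-retry idea behind LB_FAILED and HTTP::retry can be shown with a rough Python analogy. To be clear, this is not iRule syntax and not how BIG-IP implements it: `send_with_retry`, the `send` callback, and the server names are all made up for illustration.

```python
def send_with_retry(request, servers, send, max_attempts=3):
    """Try the request on successive servers until one of them answers.

    `send` is a caller-supplied function (hypothetical here) that raises
    ConnectionError when a server cannot be reached.
    """
    last_error = None
    for server in servers[:max_attempts]:
        try:
            return send(server, request)
        except ConnectionError as exc:
            # Analogous to catching a failed load-balancing attempt:
            # remember the error and move on to the next candidate.
            last_error = exc
    raise ConnectionError("all attempts failed") from last_error


def fake_send(server, request):
    # Stand-in transport: pretend web1 is down.
    if server == "web1":
        raise ConnectionError("web1 unreachable")
    return f"{server} handled {request}"


print(send_with_retry("GET /", ["web1", "web2"], fake_send))
```

The user only ever sees the successful response from the second server; the failed first attempt is absorbed along the way.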
And there you have it, 10 – err, I mean 14 ways to HA! Let nothing stand in our way - Muhaha! Seriously though, there are so many different ways to achieve high availability in a BIG-IP environment. I left out database clustering (also not really a BIG-IP thing, but the BIG-IP SQL monitors are awesome!), and fast data replication and de-duplication across WAN links with iSessions. There’s also Access Policy Manager (APM) credential caching, VMware VMotion across datacenters with the WAN Optimization Module (WOM), Edge Gateway’s “always connected” capability, and link load balancing with Link Controller. This has turned out to be a pretty entertaining topic amongst my geeky colleagues. I now challenge you to keep thinking about ways to achieve high availability in a BIG-IP environment. Who’ll be the first to hit 20?!