The Concise Guide to Proxies
We often mention that the benefits derived from some application delivery controllers are due to the nature of being a full proxy. And in the same breath we might mention reverse, half, and forward proxies, which makes the technology sound more like a description of the positions on a sports team than an application delivery solution. So what do these terms really mean? Here's the lowdown on the different kinds of proxies in one concise guide.

PROXIES

Proxies (often called intermediaries in the SOA world) are hardware or software solutions that sit between the client and the server and do something to requests and sometimes responses. The most often heard use of the term proxy is in conjunction with anonymizing Web surfing. That's because proxies sit between your browser and your desired destination and proxy the connection; that is, you talk to the proxy while the proxy talks to the web server, and neither you nor the web server knows about the other.

Proxies are not all the same. Some are half proxies, some are full proxies; some are forward and some are reverse. Yes, that came excruciatingly close to sounding like a Dr. Seuss book. (Go ahead, you know you want to. You may even remember this from... well, when it was first circulated.)

FORWARD PROXIES

Forward proxies are probably the most well known of all proxies, primarily because most folks have dealt with them either directly or indirectly. Forward proxies are those proxies that sit between two networks, usually a private internal network and the public Internet. Forward proxies have also traditionally been employed by large service providers as a bridge between their isolated network of subscribers and the public Internet, such as CompuServe and AOL in days gone by. These are often referred to as "mega-proxies" because they managed such high volumes of traffic.

Forward proxies are generally HTTP (Web) proxies that provide a number of services but primarily focus on web content filtering and caching services. These forward proxies often include authentication and authorization as a part of their product to provide more control over access to public content. If you've ever gotten a web page that says "Your request has been denied by blah blah blah. If you think this is an error please contact the help desk/your administrator" then you've probably used a forward proxy.

REVERSE PROXIES

A reverse proxy is less well known, generally because we don't use the term anymore to describe products used as such. Load balancers (application delivery controllers) and caches are good examples of reverse proxies. Reverse proxies sit in front of web and application servers and process requests for applications and content coming in from the public Internet to the internal, private network. This is the primary reason for the appellation "reverse" proxy - to differentiate it from a proxy that handles outbound requests.

Reverse proxies are also generally focused on HTTP, but in recent years have expanded to include a number of other protocols commonly used on the web such as streaming audio (RTSP), file transfers (FTP), and generally any application protocol capable of being delivered via UDP or TCP.

HALF PROXIES

Half-proxy is a description of the way in which a proxy, reverse or forward, handles connections. There are two uses of the term half-proxy: one describes a deployment configuration that affects the way connections are handled, and one simply describes the difference between the first and subsequent connections.
The deployment-focused definition of half-proxy is associated with a direct server return (DSR) configuration. Requests are proxied by the device, but the responses do not return through the device; they are sent directly to the client. For some types of data - particularly streaming protocols - this configuration results in improved performance. This configuration is known as a half-proxy because only half the connection (incoming) is proxied while the other half, the response, is not.

The second use of the term "half-proxy" describes a solution in which the proxy performs what is known as delayed binding in order to provide additional functionality. This allows the proxy to examine the request before determining where to send it. Once the proxy determines where to route the request, the connection between the client and the server is "stitched" together. This is referred to as a half-proxy because the initial TCP handshaking and first requests are proxied by the solution, but subsequently forwarded without interception.

Half-proxies can look at incoming requests in order to determine where the connection should be sent and can even use techniques to perform layer 7 inspection, but they are rarely capable of examining the responses. Almost all half-proxies fall into the category of reverse proxies.

FULL PROXIES

Full proxy is also a description of the way in which a proxy, reverse or forward, handles connections. A full proxy maintains two separate connections - one between itself and the client and one between itself and the destination server. A full proxy completely understands the protocols, and is itself an endpoint and an originator for the protocols. Full proxies are so named because they completely proxy connections - incoming and outgoing.

Because the full proxy is an actual protocol endpoint, it must fully implement the protocols as both a client and a server (a packet-based design does not). This also means the full proxy can have its own TCP connection behavior, such as buffering, retransmits, and TCP options. With a full proxy, each connection is unique; each can have its own TCP connection behavior. This means that a client connecting to the full proxy device would likely have different connection behavior than the full proxy might use for communicating with servers. Full proxies can look at incoming requests and outbound responses and can manipulate both if the solution allows it.

Many reverse and forward proxies use a full proxy model today. There is no guarantee that a given solution is a full proxy, so if that is important to you, always ask your solution provider whether the solution is a full proxy.
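To make the full-proxy model concrete, here is a minimal sketch of a TCP-level full proxy in Python. This is not how any particular product implements the model; the listen port and backend address are placeholder assumptions, and the "inspection" is just a byte count. The point is the defining trait: two independent connections, one per side, with every request and response passing through the proxy where it could be examined or modified.

```python
# Minimal full-proxy sketch. Assumptions: listen port 8080 and backend
# ("127.0.0.1", 9000) are placeholders; "inspection" is just a byte count.
import socket
import threading

BACKEND = ("127.0.0.1", 9000)  # hypothetical backend server

def pump(src, dst, label):
    """Relay bytes from one connection to the other, with a hook for inspection."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            # Full-proxy advantage: every byte in both directions passes through
            # here, so requests AND responses can be examined or modified.
            print(f"[{label}] {len(data)} bytes")
            dst.sendall(data)
    except OSError:
        pass
    finally:
        for s in (src, dst):  # closing both sides tears down the proxied session
            try:
                s.close()
            except OSError:
                pass

def handle(client_sock):
    # Two independent connections: one per side of the "air gap".
    server_sock = socket.create_connection(BACKEND)
    threading.Thread(target=pump, args=(client_sock, server_sock, "client->server"),
                     daemon=True).start()
    pump(server_sock, client_sock, "server->client")

def main():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("0.0.0.0", 8080))
    listener.listen()
    while True:
        client, _ = listener.accept()
        threading.Thread(target=handle, args=(client,), daemon=True).start()

if __name__ == "__main__":
    main()
```

A half-proxy, by contrast, would hand the flow off after the handshake and first requests and lose visibility into the responses; that loss of visibility is exactly what the full-proxy model avoids.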
Announcing F5 NGINX Gateway Fabric 1.4.0 with IPv6 and TLS Passthrough

We announced the next release of F5 NGINX Gateway Fabric, version 1.4.0, which includes a lot of smaller but very necessary features. This allows us to dedicate more time to advancing our non-functional testing framework and ensuring we maintain top performance across releases. Nevertheless, we have some great highlights in this release:

- IPv6 support
- TLS passthrough (via TLSRoute)
- Server zone metrics
- Ability to add custom pod annotations
- Plenty of bug fixes!

During this release cycle, we discovered a bug around our custom policies that occurred when you had the same path for more than one Route: the policy would not be applied to either Route. For this release, we've decided to enforce a restriction so that policies cannot be applied when two or more Routes share the same path. However, we are pursuing a long-term solution to lift this restriction on this edge case, as we understand that use cases that route based on header, query parameter, or other request attributes on the same path do exist.

IPv6 Support

While most Kubernetes clusters are still utilizing IPv4, we recognized that anyone employing an IPv6 cluster would have no ability to deploy NGINX Gateway Fabric. Thus, we implemented a simple feature to support dual IPv4/IPv6 networking for NGINX Gateway Fabric. This option is enabled by default, so you can simply install as normal on an IPv6 cluster.

TLS Passthrough

New with 1.4 is TLSRoute support. This Route type enables the TLS passthrough use case and is set up much like an HTTPRoute. It allows you to pass encrypted traffic through NGINX Gateway Fabric, where it is terminated by your backend application, ensuring end-to-end encryption. As most information simply passes through NGINX Gateway Fabric with this Route type, setup is easy. You can enable TLS passthrough for any application using our guide available here.

Non-Functional Testing

This release marks the completion of automating the non-functional testing that we execute before each release. If you are unfamiliar with these tests, our team runs NGINX Gateway Fabric through a series of scenarios - non-functional tests - to check whether our performance is regressing or improving from previous releases. As an infrastructure product that you rely on, it is our top priority to ensure that stability and performance are not compromised as new features are released. The results of all non-functional testing are available in the GitHub repository for anyone to see and should give you an idea of how well NGINX Gateway Fabric performs in general and across releases.

What's Next

NGINX Gateway Fabric 1.5.0 will bring NGINX code snippets to the Gateway API, along with a first-class Upstream Settings policy to configure keepalive connections and NGINX zone size. If you are familiar with NGINX, or find that you need a feature that NGINX provides that is not yet available via a Gateway API extension, you can put an NGINX code snippet within a SnippetFilter to apply NGINX configuration to a Route rule. You will even be able to use the feature to load other modules NGINX provides and leverage the vast wealth of NGINX functionality. We will still be providing many NGINX features via first-class policies and filters, such as the Upstream Settings policy; these custom policies and filters allow us to handle much of the complexity of translating NGINX configuration to the Gateway API framework for you.
The Upstream Settings policy can set upstream management directives that cannot be applied effectively via snippets. We will continue to deliver these custom policies and filters across all of our releases, in addition to new Gateway API resources and NGINX Gateway Fabric specific features. You can see a preview of the full snippet design here, though not all features may be implemented in one release cycle. For more information on our strategy towards first-class NGINX customization via Gateway API extensions, see our full enhancement proposal here.

Resources

For the complete changelog for NGINX Gateway Fabric 1.4.0, see the Release Notes. To try NGINX Gateway Fabric for Kubernetes with NGINX Plus, start your free 30-day trial today or contact us to discuss your use cases. If you would like to get involved, see what is coming next, or see the source code for NGINX Gateway Fabric, check out our repository on GitHub! We have weekly community meetings on Tuesdays at 9:30AM Pacific/12:30PM Eastern/5:30PM GMT. The meeting link, updates, agenda, and notes are on the NGINX Gateway Fabric Meeting Calendar. Links are also always available from our GitHub readme.
Programmability in the Network: Canary Deployments

#devops The canary deployment pattern is another means of enabling continuous delivery.

Deployment patterns (or as I like to call them of late, devops patterns) are good examples of how devops can put into place systems and tools that enable continuous delivery to be, well, continuous. The goal of these patterns is, for the most part, to make sure operations can smoothly move features, functions, releases or applications into production. We've previously looked at the Blue Green deployment pattern and today we're going to look at a variation: Canary deployments.

Canary deployments are applicable when you're running a cluster of servers. In other words, you've got lots and lots of (probably active right now while you're considering pushing that next release) users. What you don't want is to do the traditional "we're sorry, we're down for maintenance, here's a picture of a funny squirrel to amuse you while you wait" maintenance page. You want to be able to roll out the new release without disruption. Yeah, that's quite the ask, isn't it?

The Canary deployment pattern is an incremental upgrade methodology. First, the build is pushed to a small set of servers to which only a select group of users are directed. If that goes well, the release is pushed to a larger set of servers with a limited set of users. Finally, if that goes well, then the release is pushed out to all servers and all users. If issues occur at any stage, the release is halted - it goes no further. Hence the naming of the pattern - after the miner's canary, used because "its demise provided a warning of dangerous levels of toxic gases".

The trick to implementing this pattern is twofold: first, being able to group the servers used in each step into discrete pools and second, the ability to direct specific sets of users to the appropriate pools. Both capabilities require the ability to execute some logic to perform user-based load balancing. Nolio, in its first Devops Best Practices video, implements Canary deployments by manipulating the pools of servers at the load balancing tier, removing them to upgrade and then reinserting them for testing before moving onto the next phase. If your load balancing solution is programmable, there's no need to actually remove them, as you can simply insert logic to remove them from being selected until they've been upgraded. You can also then insert the logic to determine which users are directed to which pool of servers.

If the load balancing platform is really programmable, you can even extend that determination to querying a database to determine user inclusion in certain groups, such as those you might use to perform AB testing. Such logic might base the decision on IP address (not the best option, but an option). Later, when you're actually rolling out to a percentage of users, you can write logic that randomly selects users based on location or their user name - like sharding, only in reverse - or pretty much anything you can think of. You can even split that further if you're rolling out an update to an API that's used by both mobile and traditional clients, to catch both or neither or specific types in an orderly fashion so you can test methodically - because you want to test methodically when you're using live users as test subjects.

The beauty of this pattern is that it allows continuous delivery. Users are never disrupted (if you do it right) and the upgrade occurs in a safely staged, incremental fashion.
That enables you to back out quickly if necessary, because you do have a back button plan, right? Right?
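As a sketch of the user-selection logic described above, here is one way the routing decision might look in Python. The pool names and rollout percentages are made up for illustration; a programmable load balancer would express the same idea in its own scripting environment, and a real rollout would also gate each stage on observed error rates.

```python
# Canary routing sketch: deterministically bucket users and widen the canary
# audience in stages. Pool names and percentages are hypothetical.
import hashlib

STABLE_POOL = "app_pool_stable"    # hypothetical pool running the current release
CANARY_POOL = "app_pool_canary"    # hypothetical pool running the new release

def bucket(user_id: str) -> int:
    """Map a user deterministically to a bucket in [0, 100)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100

def choose_pool(user_id: str, canary_percent: int) -> str:
    """Send roughly `canary_percent` of users to the canary pool, the rest to stable.

    Hashing keeps a given user pinned to the same pool for the whole rollout,
    which makes observations easier to interpret than a purely random choice.
    """
    return CANARY_POOL if bucket(user_id) < canary_percent else STABLE_POOL

# Staged rollout: widen the canary audience only if the previous stage went well.
if __name__ == "__main__":
    for stage in (1, 10, 50, 100):
        hits = sum(choose_pool(f"user{i}", stage) == CANARY_POOL for i in range(10_000))
        print(f"{stage:>3}% stage: {hits / 100:.1f}% of simulated users hit the canary pool")
```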
Back to Basics: Health Monitors and Load Balancing

#webperf #ado Because every connection counts

One of the truisms of architecting highly available systems is that you never, ever want to load balance a request to a system that is down. Therefore, some sort of health (status) monitoring is required. For applications, that means not just pinging the network interface or opening a TCP connection; it means querying the application and verifying that the response is valid. This, obviously, requires the application to respond. And respond often. Best practices suggest determining availability every 5 seconds or so.

That means every X seconds the load balancing service is going to open up a connection to the application and make a request, just like a user would do. That adds load to the application. It consumes network, transport, application and (possibly) database resources. Resources that cannot be used to service customers. While the impact on a single application may appear trivial, it's not. Remember, as load increases, performance decreases. And no matter how trivial it may appear, health monitoring is adding load to what may be an already heavily loaded application.

But Lori, you may be thinking, you expound on the importance of monitoring and visibility all the time! Are you saying we shouldn't be monitoring applications? Nope, not at all. Visibility is paramount, providing the actionable data necessary to enable highly dynamic, automated operations such as elasticity. Visibility through health monitoring is a critical means of ensuring availability at both the local and global level. What we may need to do, however, is move from active to passive monitoring.

PASSIVE MONITORING

Passive monitoring, as the modifier suggests, is not an active process. The load balancer does not open up connections or query an application itself. Instead, it snoops on responses being returned to clients and from that infers the current status of the application. For example, if a request for content results in an HTTP error message, the load balancer can determine whether or not the application is available and capable of processing subsequent requests. If the load balancer is a BIG-IP, it can mark the service as "down" and invoke an active monitor to probe the application status as well as retrying the request to another available instance - ensuring end-users do not see an error.

Passive (inband) monitors are not binary. That is, they aren't a simple "on" or "off" based on HTTP status codes. Such monitors can be configured to track the number of failures and evaluate failure rates against a configurable failure interval. When such thresholds are exceeded, the application can then be marked as "down". Passive monitors aren't restricted to availability status, either. They can also monitor for performance (response time). Failure to meet response time expectations results in a failure, and the application continues to be watched for subsequent failures.

Passive monitors are, like most inline/inband technologies, transparent. They quietly monitor traffic and act upon that traffic without adding overhead to the process. Passive monitoring gives operations the visibility necessary to enable predictable performance and to meet or exceed user expectations with respect to uptime, without negatively impacting the performance or capacity of the applications it is monitoring.
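Here is a rough sketch of the passive-monitoring bookkeeping described above: count failures observed in real client traffic against a configurable threshold and interval, and treat slow responses as failures too. The specific numbers (3 failures in 30 seconds, a 500 ms response-time ceiling) are illustrative assumptions, not values taken from any particular product.

```python
# Passive (inband) monitor sketch: infer health from real responses, no probes.
import time
from collections import deque

class PassiveMonitor:
    """Infer member health from real client responses instead of active probes."""

    def __init__(self, max_failures=3, interval=30.0, max_response_time=0.5):
        self.max_failures = max_failures          # failures tolerated per interval
        self.interval = interval                  # sliding window, in seconds
        self.max_response_time = max_response_time
        self.failures = deque()                   # timestamps of recent failures
        self.up = True

    def observe(self, status_code: int, response_time: float) -> None:
        """Called for every real client response flowing through the proxy."""
        now = time.monotonic()
        # A 5xx status or a slow response each count as a failure.
        if status_code >= 500 or response_time > self.max_response_time:
            self.failures.append(now)
        # Drop failures that have aged out of the interval.
        while self.failures and now - self.failures[0] > self.interval:
            self.failures.popleft()
        # Mark the member down only when the threshold is crossed; an active
        # probe (not shown) would be responsible for bringing it back up.
        self.up = len(self.failures) <= self.max_failures

# Example: four slow responses inside the window take the member down.
monitor = PassiveMonitor()
for rt in (0.1, 0.9, 0.8, 0.7, 0.6):
    monitor.observe(200, rt)
print("member up?", monitor.up)   # False
```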
The Limits of Cloud: Gratuitous ARP and Failover

#Cloud is great at many things. At other things, not so much. Understanding the limitations of cloud will better enable a successful migration strategy.

One of the truisms of technology is that it takes a few years of adoption before folks really start figuring out what it excels at – and conversely what it doesn't. That's generally because early adoption is focused on lab-style experimentation that rarely extends beyond basic needs. It's when adoption reaches critical mass and folks start trying to use the technology to implement more advanced architectures that the "gotchas" start to be discovered. Cloud is no exception. A few of the things we've learned over the past years of adoption are that cloud is always on, it's simple to manage, and it makes applications and infrastructure services easy to scale. Some of the things we're learning now are that cloud isn't so great at supporting application mobility, monitoring of deployed services, and providing advanced networking capabilities. The reason that last part is so important is that a variety of enterprise-class capabilities we've come to rely upon are ultimately enabled by some of the advanced networking techniques cloud simply does not support. Take gratuitous ARP, for example. Most cloud providers do not allow or support this feature, which ultimately means an inability to take advantage of higher-level functions traditionally taken for granted in the enterprise – like failover.

GRATUITOUS ARP and ITS IMPLICATIONS

For those unfamiliar with gratuitous ARP, let's get you familiar with it quickly. A gratuitous ARP is an unsolicited ARP request made by a network element (host, switch, device, etc.) to resolve its own IP address. The source and destination IP address are identical to the source IP address assigned to the network element. The destination MAC is a broadcast address. Gratuitous ARP is used for a variety of reasons. For example, if there is an ARP reply to the request, it means there exists an IP conflict. When a system first boots up, it will often send a gratuitous ARP to indicate it is "up" and available. And finally, it is used as the basis for load balancing failover.

To ensure availability of load balancing services, two load balancers will share an IP address (often referred to as a floating IP). Upstream devices recognize the "primary" device by means of a simple ARP entry associating the floating IP with the active device. If the active device fails, the secondary immediately notices (due to heartbeat monitoring between the two) and will send out a gratuitous ARP indicating it is now associated with the IP address and won't the rest of the network please send subsequent traffic to it rather than the failed primary. VRRP and HSRP may also use gratuitous ARP to implement router failover.

Most cloud environments do not allow broadcast traffic of this nature. After all, it's practically guaranteed that you are sharing a network segment with other tenants, and thus broadcasting traffic could certainly disrupt other tenants' traffic. Additionally, as security minded folks will be eager to remind us, it is fairly well-established that the default for accepting gratuitous ARPs on the network should be "don't do it". The astute observer will realize the reason for this: there is no security, no ability to verify, no authentication, nothing.
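To illustrate just how little there is to verify, here is a sketch that builds the raw bytes of a gratuitous ARP frame. The MAC and IP values are invented; actually putting such a frame on the wire would require a raw socket and elevated privileges, but constructing it requires neither, which is precisely the problem.

```python
# Gratuitous ARP frame construction sketch. Hypothetical MAC/IP values; note
# there is no signature, no secret, nothing a receiver can verify.
import socket
import struct

def gratuitous_arp(sender_mac: bytes, sender_ip: str) -> bytes:
    broadcast = b"\xff" * 6
    ip = socket.inet_aton(sender_ip)
    # Ethernet header: broadcast destination, our source MAC, EtherType 0x0806 (ARP).
    eth = broadcast + sender_mac + struct.pack("!H", 0x0806)
    # ARP payload: Ethernet/IPv4, opcode 1 (request), and, the gratuitous part,
    # sender IP and target IP set to the same address.
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                        # hardware type: Ethernet
        0x0800,                   # protocol type: IPv4
        6, 4,                     # hardware / protocol address lengths
        1,                        # opcode: request
        sender_mac, ip,           # sender MAC / IP
        b"\x00" * 6, ip,          # target MAC (ignored) / target IP == sender IP
    )
    return eth + arp

frame = gratuitous_arp(b"\x02\x00\x00\xaa\xbb\xcc", "10.0.0.50")
print(len(frame), "bytes:", frame.hex())  # 42 unauthenticated bytes claiming 10.0.0.50
```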
A network element configured to accept gratuitous ARPs does so at the risk of being tricked into trusting, explicitly, every gratuitous ARP – even those that may be attempting to fool the network into believing it is a device it is not supposed to be. That, in essence, is ARP poisoning, and it's one of the security risks associated with the use of gratuitous ARP. Granted, someone needs to be physically on the network to pull this off, but in a cloud environment that's not nearly as difficult as it might be on a locked down corporate network. Gratuitous ARP can further be used to execute denial of service, man in the middle and MAC flooding attacks. None of which have particularly pleasant outcomes, especially in a cloud environment where such attacks would be against shared infrastructure, potentially impacting many tenants.

Thus cloud providers are understandably leery about allowing network elements to willy-nilly announce their own IP addresses. That said, most enterprise-class network elements have implemented protections against these attacks precisely because of the reliance on gratuitous ARP for various infrastructure services. Most of these protections use a technique that will tentatively accept a gratuitous ARP, but not enter it in the ARP cache unless it has a valid IP-to-MAC mapping, as defined by the device configuration. Validation can take the form of matching against DHCP-assigned addresses or existence in a trusted database. Obviously these techniques would put an undue burden on a cloud provider's network given that any IP address on a network segment might be assigned to a very large set of MAC addresses.

Simply put, gratuitous ARP is not cloud-friendly, and thus you will be hard pressed to find a cloud provider that supports it. What does that mean? That means, ultimately, that failover mechanisms in the cloud cannot be based on traditional techniques unless a means to replicate gratuitous ARP functionality without its negative implications can be designed. Which means, unfortunately, that traditional failover architectures – even using enterprise-class load balancers in cloud environments – cannot really be implemented today. What that means for IT preparing to migrate business critical applications and services to cloud environments is a careful review of their requirements and of the cloud environment's capabilities to determine whether availability and uptime goals can – or cannot – be met using a combination of cloud and traditional load balancing services.
The Full-Proxy Data Center Architecture

Why a full-proxy architecture is important to both infrastructure and data centers.

In the early days of load balancing and application delivery there was a lot of confusion about proxy-based architectures and in particular the definition of a full-proxy architecture. Understanding what a full-proxy is will be increasingly important as we continue to re-architect the data center to support a more mobile, virtualized infrastructure in the quest to realize IT as a Service.

THE FULL-PROXY PLATFORM

The reason there is a distinction made between "proxy" and "full-proxy" stems from the handling of connections as they flow through the device. All proxies sit between two entities – in the Internet age almost always "client" and "server" – and mediate connections. While all full-proxies are proxies, the converse is not true. Not all proxies are full-proxies, and it is this distinction that needs to be made when making decisions that will impact the data center architecture.

A full-proxy maintains two separate session tables – one on the client-side, one on the server-side. There is effectively an "air gap" isolation layer between the two internal to the proxy, one that enables focused profiles to be applied specifically to address issues peculiar to each "side" of the proxy. Clients often experience higher latency because of lower bandwidth connections, while the servers are generally low latency because they're connected via a high-speed LAN. The optimizations and acceleration techniques used on the client side are far different than those on the LAN side because the issues that give rise to performance and availability challenges are vastly different.

A full-proxy, with separate connection handling on either side of the "air gap", can address these challenges. A proxy, which may be a full-proxy but more often than not simply uses a buffer-and-stitch methodology to perform connection management, cannot optimally do so. A typical proxy buffers a connection, often through the TCP handshake process and potentially into the first few packets of application data, but then "stitches" a connection to a given server on the back-end using either layer 4 or layer 7 data, perhaps both. The connection is a single flow from end-to-end and must choose which characteristics of the connection to focus on – client or server – because it cannot simultaneously optimize for both.

The second advantage of a full-proxy is its ability to perform more tasks on the data being exchanged over the connection as it is flowing through the component. Because specific action must be taken to "match up" the connection as it's flowing through the full-proxy, the component can inspect, manipulate, and otherwise modify the data before sending it on its way on the server-side. This is what enables termination of SSL, enforcement of security policies, and performance-related services to be applied on a per-client, per-application basis. This capability translates to broader usage in data center architecture by enabling the implementation of an application delivery tier in which operational risk can be addressed through the enforcement of various policies. In effect, we've created a full-proxy data center architecture in which the application delivery tier as a whole serves as the "full proxy" that mediates between the clients and the applications.
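Here is a small sketch of the "two profiles on either side of the air gap" idea, using nothing more than per-socket options in Python: the client-facing listener and the server-facing connection are separate sockets, so each can be tuned independently. The backend address, port, and option values are arbitrary assumptions for illustration, not recommendations, and real application delivery controllers expose far richer per-side TCP profiles than raw socket options.

```python
# Per-side "profile" sketch: the two sides of the air gap are separate sockets,
# so each gets its own TCP tuning. All addresses and values are hypothetical.
import socket

BACKEND = ("10.0.0.20", 8080)   # hypothetical application server on the LAN

def client_side_listener() -> socket.socket:
    """Client side of the air gap: assume slower, higher-latency WAN links."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 256 * 1024)   # bigger buffer for high-latency paths
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0)         # let small writes coalesce
    s.bind(("0.0.0.0", 8443))
    s.listen()
    return s

def server_side_connection() -> socket.socket:
    """Server side of the air gap: low-latency LAN, favor immediate small writes."""
    s = socket.create_connection(BACKEND)
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)         # no Nagle delay on the LAN
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 64 * 1024)    # a smaller buffer is plenty here
    return s
```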
THE FULL-PROXY DATA CENTER ARCHITECTURE

A full-proxy data center architecture installs a digital "air gap" between the client and applications by serving as the aggregation (and conversely disaggregation) point for services. Because all communication is funneled through virtualized applications and services at the application delivery tier, it serves as a strategic point of control at which delivery policies addressing operational risk (performance, availability, security) can be enforced.

A full-proxy data center architecture further has the advantage of isolating end-users from the volatility inherent in highly virtualized and dynamic environments such as cloud computing. It enables solutions such as those used to overcome limitations with virtualization technology, such as those encountered with pod-architectural constraints in VMware View deployments. Traditional access management technologies, for example, are tightly coupled to host names and IP addresses. In a highly virtualized or cloud computing environment, this constraint may spell disaster for either performance or ability to function, or both. By implementing access management in the application delivery tier – on a full-proxy device – volatility is managed through virtualization of the resources, allowing the application delivery controller to worry about details such as IP address and VLAN segments, freeing the access management solution to concern itself with determining whether this user on this device from that location is allowed to access a given resource.

Basically, we're taking the concept of a full-proxy and expanding it outward to the architecture. Inserting an "application delivery tier" allows for an agile, flexible architecture more supportive of the rapid changes today's IT organizations must deal with. Such a tier also provides an effective means to combat modern attacks. Because of its ability to isolate applications, services, and even infrastructure resources, an application delivery tier improves an organization's capability to withstand the onslaught of a concerted DDoS attack. The magnitude of difference between the connection capacity of an application delivery controller and most infrastructure (and all servers) gives the entire architecture a higher resiliency in the face of overwhelming connections. This ensures better availability and, when coupled with virtual infrastructure that can scale on-demand when necessary, can also maintain performance levels required by business concerns.

A full-proxy data center architecture is an invaluable asset to IT organizations in meeting the challenges of volatility both inside and outside the data center.

Related blogs & articles:
- The Concise Guide to Proxies
- At the Intersection of Cloud and Control…
- Cloud Computing and the Truth About SLAs
- IT Services: Creating Commodities out of Complexity
- What is a Strategic Point of Control Anyway?
- The Battle of Economy of Scale versus Control and Flexibility
- F5 Friday: When Firewalls Fail…
- F5 Friday: Platform versus Product
Useful IT. Bringing Health Record Transfer into the 21st Century.

I read the Life as a Healthcare CIO blog on occasion, mostly because as a former radiographer, health care records integration and other non-diagnostic IT use in healthcare is a passing interest of mine. Within the last hospital I worked at, the systems didn't communicate – not even close, as in there was no effort to make them do so. This intrigues me, as since I've entered IT I have watched technology uptake in healthcare slowly ramp up, on a curve well behind the rest of the business world. Oh, make no mistake, technology has been in overdrive on the equipment used, but things like systems interoperability and utilizing technology to make doctors', nurses', and techs' lives easier are just slower in the medical world.

A huge chunk of the resistance is grounded in a very common sense philosophy: "When people's lives are on the line you do not rush willy-nilly to the newest gadget." No one in healthcare says it that way – at least not to my knowledge – but that's the essence of what they think. I can think of a few businesses that could use that same mentality applied occasionally with a slightly different twist: "When the company's viability is on the line…" but that's a different blog.

Even with this very common-sense resistance, there has been a steady acceleration of uptake in technology use for things like patient records and prescriptions. It has been interesting to watch, as someone on the outside with plenty of experience with the way hospitals worked and their systems were all silos. Healthcare IT is to be commended for things like electronic prescription pads and instant transfer of (now nearly all electronic) X-Rays to those who need them to care for the patient. Applying the "this can help with little impact on critical care" or even "this can help with positive impact on critical care and little risk of negative impact" viewpoint as a counter to the above-noted resistance has produced some astounding results.

A friend of mine from my radiographer days is manager of a Cardiac Cath Lab, and talking with him is just fun. "Dude, ninety percent of the pups coming out of Radiology schools can't set an exposure!" is evidence that diagnostic tools are continuing to take advantage of technology – in this case auto-detecting X-Ray exposure limits. He has more glowing things to say about the non-diagnostic growth of technology within any given organization. But outside the organization? Well, that's a completely different story.

The healthcare organization wants to keep your records safe and intact, and rarely even wants to let you touch them. That's just a case of the "intact" bit. Some people might want their records to not contain some portion – like their blood alcohol level when brought to the ER – and some people might inadvertently lose some portion of the record. While they're more than happy to send them on a referral, and willing to give you a copy if you're seeking a second opinion, these records all have one archaic quality. Paper.

If I want to buy a movie, I can go to Netflix, sign up, and stream it (at least many of them) to watch. If I want my medical records transferred to a specialist so I can get treatment before my left eye oozes out of its socket, they have to be copied, verified, and mailed. If they're short or my eye is on the verge of falling out right this instant, then they might be faxed. But the bulk of records are mailed. Even overnight is another day lost in the treatment cycle.
Recently – the last couple of years – there has been a movement to replicate the records delivery process electronically. As time goes on, more and more of your medical records are being stored digitally. It saves room and time, and makes it easier for a doctor to "request" your record should he need it in a hurry. It also makes it easier to track accidental or even intentional changes in records. While it doesn't happen as often as fear-mongers and ambulance chasers want you to believe, of course there are deletions and misplacements in the medical records of the 300 million US citizens. An electronic system never forgets, so while something as simple as a piece of paper falling out of a record could forever change it, in electronic form that can't happen. Even an intentional deletion can be "deleted" in the sense that it no longer shows up, but it is still there, stored with your other information, so that changes can be checked should the need ever arise.

The inevitable offshoot of electronic records is the ability to communicate them between hospitals. If you're in the ER in Tulsa and your normal doctor is in Manhattan, getting your records quickly and accurately could save your life. So it made sense that as the percentage of new records that were electronic grew, someone would start to put together a way to communicate them. No doubt you're familiar with the debate about national health information databases; a centralized location for records is a big screaming target from many people's perspectives, while it is a potentially life-saving technological advancement to others (they're both right, but I think the infosec crowd has the stronger argument). But a smart group of people put together a project to facilitate doing electronically exactly what is being done today physically. The process today is that the patient (or another doctor) requests the records be sent, they are pulled out, copied, mailed or faxed, and then a follow-up or "record received" communication occurs to ensure that the source doctor got your records where they belong. Electronically this equates to the same thing, but instead of "selected" you get "looked up", and instead of "mailed or faxed" you get "sent electronically". There's a lot more to it, but that's the gist of The Direct Project.

There are several reasons I got sucked into reading about this project. From a former healthcare worker's perspective, it's very cool to see non-diagnostic technology making a positive difference in healthcare; from a patient perspective, I would like the transfer of records to be as streamlined as possible; from the InfoSec perspective (I did a couple of brief stints in InfoSec), I like that it is not a massive database, but rather a "faster transit" mechanism; and from an F5 perspective, the possibilities for our gear to help make this viable were in my mind while reading. While Dr. Halamka has a lot of interesting stuff on his blog, this is one topic where I followed the links and read up on the details. It's a pretty cool initiative, and what may seem very limiting in their scope assumptions holds true to The Direct Project's idea of replacing the transfer mechanism and not creating a centralized database. While they're not specifying formats to use during said transfer, they do list some recommended reading on that topic. What they do have is a registry of people who can receive records, and a system for transferring data over the wire.
They worry about DNS-style health-care provider lookups, transfer protocols, and encryption, which is certainly a large enough chunk for them to bite off, and then they show how they fit into the larger nation-wide healthcare electronic records efforts going on. I hope they get it right, and the system they're helping to build results in near-instantaneous, secure records transfers, but many inventions are a product of the time and society in which they live, and even if The Direct Project fails, something like it will eventually succeed. If you're in Healthcare IT, this is certainly a way to add value to the organization, and worth checking out.

Meanwhile, I'm going to continue to delve into their work and the work of other organizations they've linked to and see if there isn't a way F5 can help. After all, we can compress, dedupe, and encrypt communications on the wire, and the entire system is about on-the-wire communications, so it seems like a perfectly logical route to explore. Though the patient care guy in me will be reading up as much as the IT guy, because healthcare was a very rewarding field that seriously needed a bit more non-diagnostic technology when I was doing it.