Application Delivery
Securing Model Serving in Red Hat OpenShift AI (on ROSA) with F5 Distributed Cloud API Security
Learn how Red Hat OpenShift AI on ROSA and F5 Distributed Cloud API Security work together to protect generative AI model inference endpoints. This integration ensures robust API discovery, schema enforcement, LLM-aware threat detection, bot mitigation, sensitive data redaction, and continuous observability—enabling secure, compliant, and high-performance AI-driven experiences at scale.
How I did it - “Delivering Kasm Workspaces three ways”
Securing modern, containerized platforms like Kasm Workspaces requires a robust and multi-faceted approach to ensure performance, reliability, and data protection. In this edition of "How I did it" we'll see how F5 technologies can enhance the security and scalability of Kasm Workspaces deployments.
5 Technical Sessions That Should Be Great: F5 AppWorld 2025
These F5 Academy sessions explore modern app delivery, security, and operations. The full list of sessions is on the F5 AppWorld 2025 Academy page - if you haven't yet registered, you can do so here: Register for F5 AppWorld 2025

LAB - F5 Distributed Cloud: Discovering & Securing APIs
API security has never been more critical, and this lab dives straight into the tough stuff. Learn how to find hidden endpoints, detect sensitive data and authentication states, and apply integrated API security measures to keep your environment locked tight.

TECHNICAL BRIEFING - LLM Security and Delivery with F5’s Distributed Cloud Security Ecosystem
AI is fueling the next wave of applications—but it’s also introducing new security blind spots. This briefing explores how to secure LLMs and integrate the right solutions to ensure your AI-driven workloads remain fast, cost-effective, and protected.

LAB - F5 NGINX Plus Ingress as an API Gateway for Kubernetes
Containerized environments and microservices are here to stay, and this lab helps you navigate the complexity. Configure NGINX Plus Ingress as a powerful API gateway for your Kubernetes workloads, enabling schema enforcement, authorization, and rate-limiting all in one streamlined solution.

LAB - Zero Trust at Scale With F5 NGINX
Zero trust principles become a whole lot more meaningful when you can scale them. Get hands-on with NGINX Plus and BIG-IP GTM to build a robust, scalable zero trust architecture, ensuring secure and seamless app access across enterprises and multi-cluster Kubernetes environments.

LAB - F5 Distributed Cloud: Security Automation & Zero Day Mitigation
In this lab, you’ll learn how to leverage advanced matching criteria and custom rules to quickly respond to emerging threats. Shore up your defenses with automated policies that deliver frictionless security and agile zero-day mitigation.

Session Updates Coming in January 🚨
AppWorld's breakout sessions officially drop in January 2025, but here is a sneak preview! Check back in January to add these to your agenda.
- Global App Delivery With a Global Network
- How Generative AI Breaks Traditional Application Security and What You Can Do About It
- The New Wave of Bots: A Deep Dive into Residential IP Proxy Networks
- From ZTNA to Universal ZTNA: Expanding Your App Security Strategy

See you at F5 AppWorld 2025! #AppWorld25
Updating SSL Certificates on BIG-IP using REST API
Simple cURL REST API commands to seamlessly update SSL certificates on a BIG-IP system. This method is ideal for those who prefer automation and want to integrate the process into their workflows. By following this guide, you will be able to:
- Upload a certificate and private key.
- Install them on the BIG-IP system.
- Update an SSL profile with the new certificate and key.
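For reference, a minimal sketch of what those three steps can look like with iControl REST. The host, credentials, and object names below are placeholders rather than values from the article, and newer TMOS releases may prefer updating the profile's certKeyChain instead of the cert/key fields.

# 1) Upload the new certificate file (repeat the same call for the key file)
SIZE=$(stat -c%s app.example.com.crt)
curl -sku admin:secret "https://bigip.example.com/mgmt/shared/file-transfer/uploads/app.example.com.crt" \
  -H "Content-Type: application/octet-stream" \
  -H "Content-Range: 0-$((SIZE-1))/$SIZE" \
  --data-binary @app.example.com.crt

# 2) Install the uploaded files as managed certificate/key objects
curl -sku admin:secret -X POST "https://bigip.example.com/mgmt/tm/sys/crypto/cert" \
  -H "Content-Type: application/json" \
  -d '{"command":"install","name":"app.example.com.crt","from-local-file":"/var/config/rest/downloads/app.example.com.crt"}'

curl -sku admin:secret -X POST "https://bigip.example.com/mgmt/tm/sys/crypto/key" \
  -H "Content-Type: application/json" \
  -d '{"command":"install","name":"app.example.com.key","from-local-file":"/var/config/rest/downloads/app.example.com.key"}'

# 3) Point the client SSL profile at the new certificate and key
curl -sku admin:secret -X PATCH "https://bigip.example.com/mgmt/tm/ltm/profile/client-ssl/app_clientssl" \
  -H "Content-Type: application/json" \
  -d '{"cert":"/Common/app.example.com.crt","key":"/Common/app.example.com.key"}'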
Setting up BIG-IP with AWS CloudHSM
Recently I was working on a project where there was a requirement to use AWS CloudHSM. F5 has documented the process to install the AWS CloudHSM client in the implementation guide, but I found it light on details of what a configuration should look like and short on examples. So let's pick up where the guide leaves off: with the client software installed, what does a working configuration look like?
IdP Routing With BIG-IP APM To Enable Seamless SSO User Experience
Organizations utilizing multiple identity providers may find it challenging to implement a unified single sign-on experience. This article showcases how F5 BIG-IP Access Policy Manager (APM) can address the problem.

Problem Statement
Organizations may find themselves running multiple identity providers (IdPs) in the environment, which could occur for reasons such as:
- Mergers and acquisitions
- Migrating to a new IdP
- Business units operating in their own silos
In one such encounter, an F5 customer had two IdPs managing identities for two groups of users, each with its own email domain (the third scenario above). The challenge arose when they wanted to enable SAML Single Sign-On (SSO) for their business applications: the logon process needed to direct users to the right IdP to complete authentication and access the applications.

Implementation
With BIG-IP Access Policy Manager (APM) being a well-established solution for the SAML Service Provider (SP) use case, we can address the challenge by extending existing SAML SP access policies. The business applications are hosted behind BIG-IP Virtual Servers, with APM access policies attached to them.

The idea is simple - leverage the programmability of the BIG-IP data plane to capture the differentiating user attribute and use it to route users to the corresponding IdP. In this case, since each user group has its own unique domain in the email address/username, we first present a logon page from APM to capture the information from the user. The email address/username, stored under the APM session variable session.logon.last.username, is then passed on to the next action item, which evaluates custom expressions to categorize users by email domain, with each domain represented by a branch out of that action (a sketch of what such branch expressions can look like follows below).
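As an illustration only (the domains and the exact expressions are assumptions based on this article, not the customer's actual policy), branch rules in the Visual Policy Editor's advanced editor could evaluate the captured username with Tcl expressions along these lines:

# Branch 1: route f5.com users to the Microsoft Entra ID SAML Auth action
expr { [string match -nocase "*@f5.com" [mcget {session.logon.last.username}]] }

# Branch 2: route gmail.com users to the Auth0 SAML Auth action
expr { [string match -nocase "*@gmail.com" [mcget {session.logon.last.username}]] }

# Fallback branch: any other domain (deny, or send to a default IdP)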
Based on the domain in the provided email address, the SAML Auth actions redirect users to the different IdPs. For consistency, each business application is registered on both IdPs with the same entity IDs and callback URLs. For example, someone@f5.com (domain f5.com) is sent to Microsoft Entra ID, while someone@gmail.com (domain gmail.com) is sent to Auth0, each presenting its own logon page. Once the user finishes the logon process through the IdP, they are redirected back to BIG-IP, which allows the user through to the business application, completing the SSO flow.

You will find that when redirected to the IdP logon page, users have to re-enter their email address, despite having already done so on the APM logon page. To avoid this, some IdPs such as Microsoft Entra ID support login hints, where the sign-in name field is pre-populated with the email address provided on the initial APM logon page. Implementation is IdP-dependent; a neat example for configuring APM to support login hints for Microsoft Entra ID can be found in this CodeShare article.

Conclusion
With the above implementation, APM removes the burden of handling multiple IdP logon flows from the applications themselves, securing them via SSO without introducing additional user friction.

How I did it - “Securing NVIDIA’s Morpheus AI Framework with NGINX Plus Ingress Controller”
In this installment of "How I Did It," we continue our journey into AI security. I have documented how I deployed an NVIDIA Morpheus AI infrastructure along with F5's NGINX Plus Ingress Controller to provide secure and scalable external access.
F5 App Connect and NetApp S3 Storage – Secured Scalable AI RAG
F5 Distributed Cloud (XC) is a SaaS solution which securely exposes services to the correct service consumers, whether the endpoints of the communications are in public clouds, private clouds, or on-prem data centers. This is particularly top of mind now as AI RAG implementations are easy to set up but are really only effective when the correct, often guarded, enterprise data stores are consumed by the solution. It is a common scenario where the AI compute loads are executed in one location, on-prem or perhaps in a cloud tenant, while the data to be ingested, embedded, and stored in a vector database to empower inferencing may be distributed across many different geographies. The data sources to be ingested into RAG are often stored in NetApp form factors, for instance StorageGRID, a native object-first clustered solution for housing S3 buckets. In the ONTAP family, where files are frequently accessed through NAS protocols like NFS or SMB, the RAG source content can today be exposed as objects through S3-compliant API calls and the corresponding protocol license.

Technology Overview
The XC App Connect module leverages L4-L7 distributed load balancers to securely provide a conduit to enterprise NetApp-housed data for centralized AI workloads leveraging RAG. The setup objective for this article: although many customer edge (CE) sites exist, we aim to bring corporate documents (objects) in a San Jose, California datacenter together with a self-hosted AI/RAG solution running in a Seattle-area datacenter.

The Advantage of Load Balancers to Securely Attach to Your Data
Previous articles have leveraged the XC Network Connect module to bring together elements of NetApp storage through NAS protocols like NFS in order to run AI RAG workloads, both self-hosted and through secure consumption of Azure OpenAI. The Network Connect module provides secure L3 (IP) connectivity between disparate sites. An advantage is that Network Connect supports all IP-based protocol transactions between sites, with firewall rules to preclude unwanted traffic. Network Connect is great when ease of deployment is paramount; however, if you know the protocols to be supported are HTTP or TCP-based, read on about App Connect, a solution that can also address any IP overlap that may exist between the sites to be interconnected.

App Connect is a different take on providing connectivity. It sets up a distributed load balancer between the consumer (in our case an AI server running LLMs and a vector database) and the services required (NAS or S3-accessible remote data stores on NetApp appliances or hosted services). The load balancer may be an HTTPS offering, which allows the leveraging of F5's years of experience in web security solutions, including an industry-leading web application firewall (WAF) component. For non-web protocols, think NFS or SMB, a TCP-layer load balancer is available. An advantage is that only the subnets where the consumer exists will ever receive connectivity and advertisements for the configured service. The load balancer can also expose origin pools that are not just private-IP-addressed appliances; origin pools can also be Kubernetes services. A final noteworthy App Connect feature: the offering provided is an L4 through L7 service (such as HTTPS), and as such the local layer 3 environment of the consumer and, in our case, the storage offering is irrelevant.
A complete overlap of IP, perhaps with both ends using the same 10.0.0.0/16 allotments, is acceptable - something extremely valuable within divisions of large corporations that have separately embraced similar RFC-1918 address spaces. Also, mergers and acquisitions often leave a major institution with widespread instances of duplicate IP space in use, and IP renumbering projects are legendary as they are lengthy and fraught with the risks of touching core routing tables. Lastly, applications that require users to configure IP addresses into GUIs are problematic when values are dynamic; App Connect typically provides services by name, which is less burdensome for the IT staff who manage applications.

A Working AI RAG Setup Using NetApp StorageGRID and XC App Connect
A Ubuntu 22.04 Linux server was configured as a hosted LLM solution in a Seattle-area datacenter. The Ollama open-source project was installed in order to quickly serve both generative AI LLMs (llama3.1, mistral and phi3 were all used for comparative results) and the required embedding LLM. The latter is needed to create vector embeddings of both the source enterprise documents and the subsequent real-time inference query payloads. Through semantic similarity analysis, RAG provides augmented prompts with useful and relevant enterprise data to the Ollama-served models for better AI outcomes.

Using the s3fs offering on Linux, one can quickly mount S3 buckets as file systems using FUSE (file system in user space). The net result is that any S3 bucket, supported natively by NetApp StorageGRID and through a protocol license on ONTAP appliances, can now be mounted as a Linux folder for your RAG embedding pipeline to build a vector database. The key really is how to easily tie together S3-compliant data sources throughout your modern enterprise, no matter where they exist and the form factor they are in. This is where XC App Connect enters the equation, dropping a modern distributed load balancer in place to project services across your network locations.

The first step in configuring the HTTPS load balancer to connect sites is to enter the Multi-Cloud App Connect module of the XC console. Once there, primarily three key items need to be configured:
- An origin pool that points at the StorageGRID nodes, or at the local load balancer sitting in front of the nodes; these are private addresses within our San Jose site.
- An HTTPS load balancer that ties a virtual service name (in our case the arbitrary name s3content.local) to our origin pool.
- Where the service name will be projected by DNS and connectivity allowed; the service s3content.local is not to be projected into global DNS but rather will only be advertised on the Seattle CE inside interface, essentially making this load balancer a private offering.
For the origin pool setup, in our case a BIG-IP is being used as a local load balancer for StorageGRID, so its private San Jose datacenter address is used. To achieve the second item, the HTTPS load balancer, we key in the FQDN of the service (s3content.local), the fact that we will provide a TLS certificate/key pair to be used by the load balancer, and the one-check option to also create an HTTP-to-HTTPS redirection service. Lastly, the advertisement of our service will only be supported by the CE node at the Seattle site; requests for s3content.local from our AI server will resolve to the local CE node's inside network interface IP address.
The App Connect load balancer will ensure the underlying connectivity, through the underlay network, to the origin pool (StorageGRID) in San Jose.

RAG Using Ollama-Based Models and Remote StorageGRID Content
Various methods exist to leverage freely available, downloadable AI LLMs. One popular approach is huggingface.com, whereas another is to leverage the Ollama framework and download both embedding and generative models from ollama.com. The latter approach was followed in this article and, in keeping with past explorations, Python 3 was used to manipulate AI programmatically, including the RAG indexing tasks and the subsequent inferencing jobs. Ollama supports a Docker-like syntax when used interactively from the command line; the one embedding model and three generative models are seen below (from the Ubuntu 22.04 terminal).

$ ollama ls
NAME                      ID            SIZE    MODIFIED
llama3.1:latest           42182419e950  4.7 GB  6 days ago
mistral:latest            974a74358d6   4.1 GB  6 days ago
nomic-embed-text:latest   0a109f422b47  274 MB  6 days ago
phi3:latest               4f2222927938  2.2 GB  7 days ago

The RAG tests included ingestion of both .txt and .pdf documents provided by App Connect from NetApp StorageGRID. A private CA certificate was created using an OpenSSL-derived tool and loaded into the Seattle Linux and Windows hosts. That CA cert was then used to create a PKCS#12-packaged certificate and key set for s3content.local, which was uploaded to the HTTPS load balancer in Distributed Cloud. A quick Windows-based S3 Browser test confirmed reachability from Seattle.

To leverage RAG document ingestion on Linux, the s3fs project was used. A Linux folder can be mounted using NAS protocols like NFS; a simple example might be:

# sudo mount -t nfs 10.50.0.202:/f5busdev /home/steve/f5bizdev/

Using s3fs, folders can similarly be mounted which tie back to buckets on the remote StorageGRID:

# s3fs mybucket001 /home/steve/rag-files-001 -o url=https://s3content.local:443 -o use_path_request_style -o passwd_file=/home/steve/.passwd-s3fs

At this point RAG ingestion can take place. The net effectiveness of RAG is often attributed to the quality of the embedding LLM used. In this case the Python scripts leveraged the Ollama API and the locally downloaded "nomic-embed-text" model, one that is generally considered just passable but is small and runs well on CPU-only hosts. The Project Gutenberg novel The Great Gatsby by F. Scott Fitzgerald was consumed from the remote NetApp bucket and a detailed question was asked of llama3.1, first without RAG's augmentations, meaning only the pre-trained data was leveraged.

$ python search.py What university did Gatsby attend?

Answered without RAG: "In the novel "The Great Gatsby" by F. Scott Fitzgerald, the character of Jay Gatsby does not reveal where he attended university, as it is never explicitly stated in the book. However, it has been speculated and suggested in various adaptations and interpretations that he may have studied at St. Olaf College in Minnesota or Yale University due to some similarities with their campus architecture in the novel. But again, this remains purely speculative and not confirmed by Fitzgerald himself."

Note that neither the user prompt shown nor the system prompt told llama3.1 that the question pertained to a specific novel; it figured that part out on its own based on the impressive base training that is baked into foundational models. However, the answer is imperfect.
With RAG supporting the query the result was much better; it mimics the response of an individual who had actually just read the entire novel, cover to cover, and is able to provide more nuanced and accurate answers (trimmed for brevity).

$ python search.py What university did Gatsby attend?

Answered with RAG: "According to the text, Gatsby attended 'Oggsford College' in England, not Oxford University. It is also mentioned later in the text that Gatsby told Mr. Wolfshiem that he had been employed by Dan Cody on a boat, and that this occurred after World War I, but before he claimed to have attended Oggsford College (which suggests that his attendance at the college may be questionable)."

Pointing Linux-mounted folders towards buckets containing PDF documents seemed to work best when the source documents were smaller. For instance, user manuals worked well, even though the embeddings focused upon textual chunks and disregarded diagrams. This script was instructed to provide attributions within the augmented text provided to the LLM, specifically the manual page number and document chunk number from that page. The following is a result using the smallest Ollama generative LLM tested, the Phi3 model, quizzing it about some lawn maintenance equipment.

$ python query_data.py "Are there any concerns with the muffler on this lawn mower?"

Response with RAG: Yes, according to the given context, there is a concern about the muffler on this lawn mower. It is stated that if the engine has been running, the muffler will be hot and can severely burn you. Therefore, it is recommended to keep away from the hot muffler.
Sources: ['../rag-files-003/toro_lawn_mower_manual.pdf:12:2', '../rag-files-003/toro_lawn_mower_manual.pdf:12:1', '../rag-files-003/toro_lawn_mower_manual.pdf:21:1', ...]

The findings were less positive with very large PDF documents. For instance, Joseph Hickey's 1975 book A Guide to Bird Watching is available in the public domain, totals almost 300 pages, and is 32 megabytes in size. Regardless of the LLM, mistral or llama3.1 included, questions taken directly from specific pages were rarely answered with precision. Questions supported by statements buried within the text, such as "Where was the prairie chicken first observed?" or "Have greenfinches ever been spotted in North America and how did they arrive?", all went unanswered.

Get Better Mileage from Your RAG
To optimize RAG, it's unlikely the generative LLMs are at fault; with Ollama allowing serial tests it was quickly observed that none of Mistral, Llama3.1 or Phi3 differed when RAG struggled. The most likely route to improved responses is to experiment with the embedding LLM. The ability to derive semantic meaning from paragraphs of text can vary; Hugging Face provides a leaderboard for embedding LLMs with its own appraisal via a performance scoring system, the Massive Text Embedding Benchmark (MTEB). Another idea is to use significantly larger chunk sizes for large documents, to reduce the overall number of vector embeddings being semantically compared, although a traditional 2,048-token context window at inference time limits how much augmented text can be provided per RAG-enabled prompt. Finally, multiple ways exist to choose similar vector embeddings from the database, using measures like cosine similarity or Euclidean distance. In these experiments, the native feature to find the "k" most similar vectors is provided by ChromaDB itself.
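To make the workflow concrete, below is a minimal sketch (not the article's actual search.py or query_data.py scripts) of the embed, store, and query loop under the same assumptions: a local Ollama instance with nomic-embed-text and llama3.1 pulled, and documents visible under the s3fs-mounted folder. Paths, chunk size, and the collection name are illustrative only.

# pip install ollama chromadb
from pathlib import Path
import ollama
import chromadb

DOCS = Path("/home/steve/rag-files-001")   # s3fs-mounted StorageGRID bucket
collection = chromadb.PersistentClient(path="./chroma_db").get_or_create_collection("rag-files-001")

def embed(text: str) -> list[float]:
    # Vectorize a chunk of text with the locally served embedding model
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

# Ingestion: naive fixed-size chunking of each mounted document, storing vector plus raw text
chunk_size = 2000
doc_id = 0
for path in DOCS.glob("*.txt"):
    text = path.read_text(errors="ignore")
    for i in range(0, len(text), chunk_size):
        chunk = text[i:i + chunk_size]
        collection.add(ids=[f"{path.name}-{doc_id}"], embeddings=[embed(chunk)], documents=[chunk])
        doc_id += 1

# Query: retrieve the k most similar chunks and hand them to the generative model
question = "What university did Gatsby attend?"
hits = collection.query(query_embeddings=[embed(question)], n_results=4)
context = "\n\n".join(hits["documents"][0])
reply = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user",
               "content": f"Use this context to answer.\n\n{context}\n\nQuestion: {question}"}],
)
print(reply["message"]["content"])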
Other methods that perform this critical search for related, helpful content include Facebook AI Similarity Search (FAISS), which uncouples the search feature from the database, reducing the risk of vector DB vendor lock-in. Other libraries, such as the compute-cosine-similarity library, are available online, including support for languages like JavaScript or TypeScript. Future experiments with better embedding models, or possibly larger document chunk sizes, might well produce even better results when scaling up your RAG deployment to enterprise scale and the expected larger documents.

F5 XC App Connect for Security and Dashboard Monitoring
The F5 Distributed Cloud load balancers provide a wealth of performance and security visibility. For any specific time range, just glancing at the Performance Dashboard for our HTTPS load balancer quickly gives us rich details, including how much data has been moved to and from NetApp, the browser types used, and specifically which buckets are active on the StorageGRID. Enterprises frequently invest heavily in dedicated monitoring solutions, everything from raw packet loggers to application performance measurement (APM) tools, sometimes offering PCAP exports to be consumed in turn by other tools such as Wireshark. Although Distributed Cloud load balancers are a secure communications solution, a wealth of monitoring information is available: both directions of a transaction, the request coupled with its response as a single entity, are monitored and available in both a decoded information pane and a rawer JSON format for consumption by other tools. One example is an S3 object write from Seattle, crossing the XC HTTPS load balancer and storing content on the San Jose StorageGRID: the nature of the client, including browser agent (S3 Browser) and the client's TLS signature, is available, as well as the object name and the bucket it was targeting on the NetApp appliance.

Useful Approaches for Managing and Securing Storage API Traffic
A powerful Distributed Cloud module that locks in on NetApp traffic, in this case carried between cities by S3-compliant APIs, is the API Discovery module. An operator can add approved API endpoints to the "Inventory", similar to adding the endpoint to a Swagger file, something easily exported from the GUI and potentially integral to enterprise API documentation. The remaining "Shadow" endpoints, all automatically discovered, are brought quickly to the operator's attention and enable a positive security approach whereby shadow traffic could be blocked immediately by the load balancer. In this case, that amounts to a quick method of blocking unsanctioned access to specific StorageGRID buckets. Also worth noting, the most active APIs for a selected time period, anywhere from 5 minutes to a full day, are surfaced to the operator. A final example of the API Discovery module's value is the sensitive data columns: both custom data types observed in flight (such as phone numbers or national health ID values) and non-compliance with industry-wide guidelines (such as PCI DSS) are flagged per S3 API endpoint.

Automatic Malicious User Mitigation
Distributed Cloud also offers a valuable real-time malicious user mitigation feature. Using behavioral observations, many harnessing AI itself, rogue consumers or generators of S3 objects can be automatically blocked.
This may be of particular use when the distributed HTTPS load balancer provides NetApp S3 access to a wider range of participants; think of partner enterprises with software CE sites installed at their locations. Or, ramping up, consider general Internet access where the App Connect load balancer projects access to FQDNs through global DNS and the load balancer is instantiated on an international network of regional edge (RE) sites in 30+ major metropolitan markets. This user mitigation feature can be enacted by first tying a user identity policy to the HTTPS load balancer. One valuable approach is to make use of state-of-the-art client-side TLS fingerprinting, JA4 signatures. Combining TLS fingerprinting with other elements, such as client source IP, can assist in categorizing the unique user driving any attempted transaction. With this selection in place, an operator need only flip "on" the automatic mitigations and security will be ratcheted up for the load balancer. XC and its algorithms can gauge the threat level presented by users and respond accordingly. For detected low threat levels, JavaScript challenges can be presented to a client browser, something not requiring human intervention but which assists in isolating DDoS attackers from legitimate service clients. For behavior consistent with medium threat levels, something like a CAPTCHA challenge can be opted for, in cases where a human should be driving interactions with the service presented by the load balancer. Finally, upon the perception of high threat levels, Distributed Cloud will see to it that the user is placed into a temporarily blocked state.

Summary
In this article we have demonstrated that the F5 Distributed Cloud App Connect module can be set up to provide a distributed L4-L7 load balancer that brings remote islands of storage, in this case NetApp appliances, to a centralized RAG AI compute platform. Although TCP-based NAS protocols like NFS could be utilized through TCP load balancing, this particular article focused upon the growing S3-compatible API approach to object retrieval, which uses HTTPS as transport. Through a pair of CE sites, an S3 service was projected from San Jose to Seattle; the origin pool was a cluster of NetApp StorageGRID nodes, and the consuming entity was Ubuntu running Ollama LLMs in support of RAG. The service was exclusively projected to Seattle, and the net result was AI outcomes that consumed representative novels, research guides, and product user manuals for contextually meaningful responses. The App Connect module empowers features such as rich transactional metrics, API discovery coupled with enforcement of S3 bucket access, and the ability to auto-mitigate worrisome users with reactions in accordance with threat risk levels.
Migrating between sites using stretched VLANs

Background
Recently I was asked the following question by a customer:

I have 2 physical datacenters. Datacenter A is a legacy datacenter and almost all workloads run here, and Datacenter B is a new location with almost no workloads deployed. I must migrate quickly out of location A into B, in order to close location A. Using NSX-T, I have "stretched" VLANs between sites, meaning that a single VLAN and CIDR block is available to VMs in either datacenter. I have 2x F5 BIG-IP devices configured in an HA pair (Active/Standby) in location A, but none yet in B. I have thousands of Virtual Servers configured on my BIG-IP devices, and many thousands of VMs configured as pool members, all running in location A. I can migrate workloads by replicating VMs across datacenters, leaving their IP addresses unchanged after migration to location B. What are my options for migrating the BIG-IP devices to location B? I'd like to maintain High Availability within each datacenter, minimize any traffic between the sites, and minimize operational difficulties.

Let's take a look at these requirements and review potential solutions.

Defining our migration
Firstly, let's define our existing environment and challenges in a little more detail.

What is a stretched VLAN?
"Stretched VLAN" is a shorthand phrase for the practice of extending Layer 2 networks across physical sites. This is useful in situations like VM migration across data centers without needing to change the VM's network configuration. If datacenters are within a few miles of each other, direct Layer 2 connectivity may be possible. A more commonly preferred approach is tunneling Layer 2 networks across routed (Layer 3) networks, which allows for more control over networking and relieves some constraints of direct L2 connections. The primary technology used to stretch a VLAN across physical data centers is VxLAN, which extends Layer 2 connectivity over a Layer 3 network, effectively creating a virtual overlay network that can span multiple data centers while maintaining logical segmentation within the VLANs. VxLAN is used by VMware's NSX-T offering, but other technologies also exist, such as L2TPv3 and MPLS.

How can NSX-T minimize inter-site traffic?
Because NSX-T can have logically distributed routers, we can define an overlay-backed segment. In an overlay-backed segment, traffic between two VMs on different hosts but attached to the same overlay segment has its Layer 2 traffic carried by a tunnel directly between the hosts, bypassing the physical VLANs. In practice, this means that traffic between two VMs in the same segment - even if they are on different VLANs and different ESXi hosts - does not need to traverse the physical network gateway. That is, if our VLAN's default gateway of ".1" exists on a physical router in Location A, but traffic is being sent between two VMs on different hosts and VLANs in Location B, the traffic does not need to traverse the inter-site link. This is very powerful. To minimize VM-to-VM traffic crossing the inter-site link, we must configure NSX-T correctly, migrate all VMs in a given app/workload between data centers at the same time, and at the same time move any F5 VIPs that process application traffic for that workload.

Pre-migration BIG-IP overview
The customer's pre-migration environment is as follows. The legacy datacenter still hosts almost all workloads, but the plumbing has been laid for migration.
- Using NSX-T, the VLANs in datacenter A are stretched to datacenter B.
- Existing BIG-IPs can reach pool members in Datacenter B.
- Any VM can be migrated between sites A and B without changing its IP address.

Post-migration BIG-IP end state
Our post-migration goal must meet our requirements. We want to get there with:
- HA maintained within each datacenter (we'll have 4 devices for a period of time)
- Minimal inter-site traffic (skip the physical VLAN gateway if possible)
- The easiest method possible (i.e., sync configs and do not deploy new config on a disparate BIG-IP cluster)

Migration options
Let's review a few options for migration.

1. HA pair across physical locations
Given that our end goal is to have all VM workloads and BIG-IP VEs in location B, we may be tempted to take a shortcut approach: just migrate one BIG-IP VE to site B and run an Active/Standby pair across datacenters. This could work in terms of connectivity, but it has some disadvantages in a production scenario:
- Latency. Only 1 BIG-IP can be Active. Latency will occur for all traffic between the Active BIG-IP and any nodes in the other datacenter.
- HA. Running a single HA pair across sites leaves us without High Availability within either site.
- Hard cutover. A cutover of the Active/Standby roles from site A to site B can be planned, but it's an "all or nothing" approach in terms of which site hosts the Active BIG-IP. There's no graceful way to keep both the VM workloads and the Active BIG-IP in the same site and migrate them together.
I have personally managed HA pairs run across two physical datacenters with Layer 2 connectivity. In a scenario with very little latency between sites, or where a migration was not planned, that might be appropriate. However, in this case, the disadvantages listed here make this option less than ideal.

2. Second HA pair of BIG-IP devices in site B
Given that our end goal is to have a single HA pair of devices in site B, we could build a brand new, disparate BIG-IP cluster in site B and migrate Virtual Servers from cluster A to B. After migration, decommission cluster A. This could work but raises unnecessary complications:
- IP conflicts. If we build 2x HA pairs, both on the same VLANs, we have the potential for IP conflicts. We must migrate every Virtual Server and Pool by recreating our configuration on cluster B, which means every VIP must change.
- Tediousness. We could alleviate some pain by cleverly replicating the configuration from cluster A on cluster B, but disabling Virtual Addresses until the time of cutover. This would be possible but tedious, and would introduce some risk. We could automate the deletion of VIPs and pools from one device, and their creation on another, but this automation is likely to be difficult if the original configuration was not created with automation itself.
- DNS. If IP addresses of Virtual Servers do change, we must consider the DNS changes that would be required.
One advantage, however, is that two separate clusters is an architecture that is very easy to understand.

3. Single device cluster with 4x BIG-IPs
This is my preferred approach when latency between sites is non-trivial and we must operate for some time with Active BIG-IPs in both locations. We'll temporarily grow our cluster from 2 devices to 4, with 2 devices in each datacenter, and we'll introduce an additional Traffic Group. Traffic Group 1 (the existing TG) will be Active on BIG-IP 1 with BIG-IP 2 as next preferred. Traffic Group 2 (the new TG) will be Active on BIG-IP 3 with BIG-IP 4 as next preferred.
Pre-migration, all VIPs exist in TG 1. Migrate the components of a workload together:
- Migrate the related VMs for the workload to Datacenter B.
- Migrate the appropriate VIPs from TG 1 to TG 2.
- East-West traffic between workload VMs and BIG-IP should then all remain within Location B.
Once all VIPs are migrated to Traffic Group 2:
- Delete Traffic Group 1.
- Remove and decommission BIG-IPs 1 and 2.
The advantages to this approach are:
- No IP address changes. Virtual Addresses of Virtual Servers do not change.
- Operationally smooth. A TMSH command can move a Virtual Address between Traffic Groups (see the sketch after this list). Workloads can move app-by-app and not in an "all-or-nothing" approach.
- No DNS changes required.
- HA maintained within sites.
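As a rough sketch only (device names and addresses below are hypothetical, not from this environment), creating the second Traffic Group and moving one application's objects into it at migration time might look like this in tmsh:

# Create the new Traffic Group, preferring the site B devices (run once; config syncs to all members)
tmsh create cm traffic-group traffic-group-2 ha-order { bigip3.example.com bigip4.example.com }

# Move an application's Virtual Address to the new Traffic Group
tmsh modify ltm virtual-address 10.1.20.50 traffic-group traffic-group-2

# If the app uses a dedicated SNAT pool, move its translation addresses as well
tmsh modify ltm snat-translation 10.1.20.60 traffic-group traffic-group-2

tmsh save sys config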
Further details when configuring a single, 4-Device Cluster
The basic idea behind this migration architecture is that a given VIP will be advertised via ARP from either Location A or B, depending on the Traffic Group to which the Virtual Address belongs. This allows us to have a single Device Group (easiest, because config is synced between all devices) and to use two Traffic Groups.
- The Device Group type is still Sync-Failover (and not Sync-Only). In a Sync-Only group, the /Common folder is not synced between member devices, but in our case the existing configuration lives within /Common and so we need it replicated to all four devices.
- Multiple Device Groups within a single Trust Domain are not planned in this scenario. Because partitions are mapped to a Device Group, and all of our existing configuration is within the single /Common partition, multiple Device Groups are not appropriate in this case.
- Failover is configured individually for each Traffic Group. TG 1 will fail over between BIG-IP 1 & 2, and TG 2 will fail over between BIG-IP 3 & 4.
- We often refer to two-node clusters as "Active/Standby" or "Active/Active". When a cluster has 3+ nodes we often refer to it as "ScaleN" or "Active/Active/Standby", or similar. For this scenario, we might use the term "Active/Standby/Active/Standby", since we'll have 2 Active devices and 4 devices total during our migration.

Further questions for our customer scenario

When do we migrate the default gateway for each of our physical VLANs?
The physical VLANs have a gateway - let's call it ".1" - currently configured on routers in Virginia. This is relevant because some traffic may traverse the physical VLANs: traffic between VMs that are actually running in different datacenters because they were not migrated together, traffic from regional sites that is routed by the WAN to Virginia, traffic to physical servers that will be migrated separately, etc. Moving that ".1" from Virginia to Dallas will be a once-off move that's best left to the customer's network team. Advanced network management technologies can distribute a VLAN and gateway for physical networks, just like our NSX-T example does for VLANs within the virtual environment, but that decision is outside the scope of the F5 BIG-IP migration recommendation.

What about physical servers in Virginia? How do we handle those?
Physical machines must be rebuilt in Location B. Physical VLANs can be stretched, just like VLANs within the virtual environment, or physical machines may reside on VLANs that are local to each datacenter. In the case of a stretched physical VLAN, a physical server that is a pool member in BIG-IP could maintain its IP address and the BIG-IP configuration would not change. If the IP address of a pool member does change, of course the BIG-IP configuration must be updated. This makes the migration of physical servers less automated than VMs. In this scenario, the BIG-IPs themselves are virtual machines; if they were physical appliances on a stretched VLAN, the same approach would apply for BIG-IP migration (a single Device Group with two Traffic Groups).

What about self IP addresses?
This is a potentially important detail. Each BIG-IP will have unique Self IPs in each VLAN to which it is connected. Let's walk through this from two perspectives: health monitoring and SNAT.

Health monitoring of pool members will be sourced from the non-floating Self IP that is local to the Active BIG-IP. Health monitoring is not sourced from a floating Self IP, if one exists. By introducing two additional devices we will be adding two IP addresses from which pool members may receive health checks. In our customer's case, no firewall currently exists between BIG-IP and pool members, so it's very unlikely that health monitoring from a new IP address will break an application. But it is important to note: if an application expects health monitoring to come only from one of the two existing BIG-IP Self IPs, we may need to update that application.

SNAT, which is extremely common, means connections between BIG-IP and pool members will be sourced from a Self IP. If a floating Self IP exists, SNAT connections will be sourced from that address; otherwise they will be sourced from the non-floating address. Floating Self IPs are assigned to a Traffic Group and will fail over between devices accordingly. This means that when a Virtual Address is updated from Traffic Group 1 to 2, SNAT'd connections will now be sourced from a new IP address. The same potential concern exists for SNAT as with health monitors: if applications or firewalls accept only pre-configured IP addresses, we'll need to update them. However, with SNAT an administrator might plan ahead with SNAT Lists. SNAT Lists are also assigned to Traffic Groups, and an administrator might create a SNAT List dedicated to an application. This SNAT List could be migrated between Traffic Groups at the same time as the Virtual Address and associated VM workload migration. Be mindful if updating the Traffic Group of a floating Self IP or SNAT List: if multiple applications expect specific source IPs for health monitors or SNAT'd connections, those applications may need to be migrated together.

Conclusion
A stretched VLAN allows additional options for VM migrations. Careful planning of F5 BIG-IP deployments, using multiple Traffic Groups, will allow for a smooth migration. Thanks for reading!
BIG-IP Next: iRules pool routing
If you use routing in iRules with the pool command in BIG-IP and you're starting to kick the tires on BIG-IP Next in your lab environments, please note that the pool reference is not just the pool name. For example, in classic BIG-IP, if I had a pool named myPool then the command in my iRule would just be pool myPool. In BIG-IP Next (as of this publish date, they are working on restoring the relative path) you will need these additional details:
- Tenant Name
- Application Name
- Pool Service Name
The format for the pool reference is then:

pool /app/myTenantName:ApplicationName/pool/PoolServiceName

Consider this partial AS3 declaration (stripped down to the necessary lines for brevity):

{
    "class": "ADC",
    "tenant2zBLVQCbR2unEw5ge6nVrQ": {
        "class": "Tenant",
        "testapp1": {
            "class": "Application",
            "pool_reference_test_7_testvip1": {
                "class": "iRule",
                "iRule": {
                    "base64": ""
                }
            },
            "testpool1": {
                "class": "Pool"
            },
            "testpool2": {
                "class": "Pool"
            },
            "testpool2_service": {
                "class": "Service_Pool",
                "pool": "testpool2"
            },
            "testpool3": {
                "class": "Pool"
            },
            "testpool3_service": {
                "class": "Service_Pool",
                "pool": "testpool3"
            },
            "testvip1": {
                "class": "Service_HTTP",
                "iRules": [
                    "pool_reference_test_7_testvip1"
                ],
                "pool": "testpool1",
                "virtualAddresses": [
                    "10.0.2.51"
                ],
                "virtualPort": 80
            }
        }
    }
}

In this case, there is a default pool (testpool1) attached to the virtual server, but the ones that will require routing in the iRule, testpool2 and testpool3, are not attached. They are mapped in the Service_Pool classes though, and that's what we need for the iRule. From this declaration, we need:
- Tenant Name: tenant2zBLVQCbR2unEw5ge6nVrQ
- Application Name: testapp1
- Service Pool Names: testpool2_service, testpool3_service

The original iRule then, as shown here:

when HTTP_REQUEST {
    if { [string tolower [HTTP::uri]] == "/tp2" } {
        pool testpool2
        HTTP::uri /
    } elseif { [string tolower [HTTP::uri]] == "/tp3" } {
        pool testpool3
        HTTP::uri /
    }
}

Becomes:

when HTTP_REQUEST {
    if { [string tolower [HTTP::uri]] == "/tp2" } {
        pool /app/tenant2zBLVQCbR2unEw5ge6nVrQ:testapp1/pool/testpool2_service
        HTTP::uri /
    } elseif { [string tolower [HTTP::uri]] == "/tp3" } {
        pool /app/tenant2zBLVQCbR2unEw5ge6nVrQ:testapp1/pool/testpool3_service
        HTTP::uri /
    }
}

When creating an application service in the Central Manager GUI, here's the workflow I used:
1. Create the application service without the iRule, but with whatever pools you're going to route to, so that the pools and pool services are defined.
2. Validate the app and view results. This is where you'll find your tenant and service pool names. The app's name should be obvious as you set it!
3. Go ahead and deploy; there isn't a way here to save drafts currently.
4. Create or edit the iRule with the pool format above, filled in with your details.
5. Edit the deployment to reference your iRule (and the correct version), then redeploy.
This should get you where you need to be! Comment below or start a thread in the forums if you get stuck.