hybrid cloud
18 TopicsThe Power of &: F5 Hybrid DNS solution
While some organizations prioritize the advantages of a SaaS solution like scalability, others value the benefits of an on-premises solution, such as data control and migration flexibility. This is why having the option to deploy a hybrid model can be beneficial, not just for redundancy, but also for allowing organizations to blend the best of both worlds. Understanding the Architecture’s components F5 BIG-IP DNS - (formerly BIG-IP GTM) is a well-known on-premise solution for delivering high-performance DNS services such as DNSExpress and DNS Caching. It is also recognized for offering intelligent DNS responses that are based on various factors such as LDNS’ Geolocation (GSLB) and health status of applications. F5 Distributed Cloud DNS (F5 XC DNS) – It is F5’s SaaS-based DNS solution which is built on a global data plane, ensuring automatic scalability to meet high-volume demand. Additionally, it also provides GSLB and security such as DNS DoS protection. On the diagram above, BIG-IP DNS will be the hidden primary DNS, acting as the source of truth for DNS records. This setup ensures centralized control and adds an extra layer of security by reducing exposure to potential attacks. F5XC DNS will function as the secondary DNS server, receiving DNS records from BIG-IP via Zone Transfer. It will be responsible for handling public DNS queries and providing domain name resolution services to clients. In the first part of this article, we will show you how to set up and configure BIG-IP DNS as the hidden primary and F5XC DNS as the authoritative secondary DNS server. For some, this setup is sufficient for their requirements, but for others, there may be additional requirements to consider in this hybrid design. In the later part of this article, we will demonstrate how we can address these challenges by leveraging F5's platform features and capabilities! Steps on Implementing F5 Hybrid DNS Solution Step 1: Configure BIG-IP DNS First, we need to configure BIG-IP DNS to be able to perform a zone transfer to F5XC DNS. For more details on the configuration, you can check this link: https://community.f5.com/kb/technicalarticles/configuring-big-ip-for-zone-transfer-and-dnssec/330359 Step 2: Configure F5XC DNS Now after you've configured BIG-IP DNS, we need to configure F5 XC DNS to be a secondary DNS server. For more details, check the steps below: Log into XC Console, select DNS Management option, click Add Zone. In Domain Name field, enter the domain/subdomain. In our example, it will be f5sg.com Zone Type: Secondary DNS Configuration Under the Secondary DNS Configuration field, click Configure On the DNS primary server IP field, enter the public IP address of the Primary DNS. In our example it will be the Public IP of BIG-IP DNS. On TSIG key, enter the name we used to generate TSIG earlier in BIG-IP. In our example, used example. On the TSIG Key algorithm field and select an algorithm from the drop-down. Select hmac-sha256. Click Configure in the TSIG key value in base 64 format section, On the Secret Type field, select Clear Secret and paste the secret in the Secret field. Use the same secret we generated earlier in BIG-IP DNS. Click Apply. You should see the DNS records transferred from BIG-IP DNS to F5XC DNS Step 3: Configure Domain Registrar In this example, the domain registrar I'm using is Namecheap. I'll configure it so that the authoritative name server for the domain f5sg.com is set to F5XC (ns1.f5clouddns.com and ns2.f5clouddns.com). The steps will vary depending on which domain registrar you are using. Refer to the documentation of your registrar. See the screenshots below for how I configured it in Namecheap. F5 XC DNS should now be able to answer DNS queries since it is set to be the authoritative DNS. Now, let's do some testing! On my local machine, I will perform a dig on the f5sg.com domain. See below: You can see that on the dig result, the NS for f5sg.com is set to ns1.f5clouddns.com & ns2.f5clouddns.com! I can also resolve sales.f5sg.com! We have successfully implemented BIG-IP as Hidden Primary and F5XC as Authoritative Secondary DNS! Challenges and Considerations Now let's discuss the additional requirements or challenges that we might encounter with this hybrid setup solution: Security: We need to comply with security compliance. Nowadays, there are laws requiring the implementation of DNSSEC (DNS Security Extensions). We need to consider this in the design and implement it without adding complexity. Resiliency: Although F5XC DNS Infrastructure is built to be resilient, we still want a backup plan to failover to the BIG-IP Primary DNS in case of unforeseen events. This process will be manual, as we need to change the NS records at the registrar to promote the hidden BIG-IP Primary DNS as the authoritative NS for the domain once F5XC is unavailable. Synchronization: BIG-IP will not be able to synchronize the GSLB functionality with F5XC because Wide-IP records are non-standard and cannot be transferred as part of zone transfers. Solution to Challenges Now comes the fun part: tackling the challenges we’ve laid out! Fortunately, F5 Distributed Cloud is an API-first platform that enables us to automate configuration. At the same time, we have the power of the BIG-IP platform, where you can run custom scripts that will enable us to integrate it with F5XC through API. Solution to Challenge #1: This is easy. DNSSEC records like RRSIG, DNSKEY, DS, NSEC, and NSEC3 are standardized and can be synchronized as part of a zone transfer. Since BIG-IP DNS is our primary DNS and supports DNSSEC, we can enable it. The records will synchronize to F5XC DNS and still respond with signed records, maintaining the integrity and security of our DNS infrastructure. How do you enable it? Check the last part of the technical article below: https://community.f5.com/kb/technicalarticles/configuring-big-ip-for-zone-transfer-and-dnssec/330359 Solution to Challenge #2: We need to automate failover! But when automating tasks, you need two things: a trigger and an action. In our scenario, our trigger should be the availability of F5XC DNS to resolve DNS queries, and the action should be to change our nameserver to BIG-IP on the domain registrar. If you can create and run a script in BIG-IP, it means you can continuously monitor the health of F5XC DNS, allowing us to determine the trigger. But what about the action to change the domain name server records in the registrar? It's easy—check if it can be configured via API, then the problem is solved! Let's explore using Namecheap as our registrar for this example. We will use the BIG-IP EAV (external) monitor to run the script. If you're unfamiliar with the BIG-IP external monitor and its capabilities, check this out → https://my.f5.com/manage/s/article/K71282813 A dummy pool configured with an external monitor will run at intervals. The attached script is designed to monitor F5XC and check if it can resolve DNS queries. If it cannot, the script will trigger an API call to Namecheap (our domain registrar) to change the nameservers back to BIG-IP DNS. Simultaneously, the script will update the domain's NS records from F5XC to BIG-IP. Step 1: Create an external monitor using the custom script. Refer to article K71282813 how to create the external monitor. See the codeshare link for the sample custom script I used: Namecheap and BIG-IP Integration via API | DevCentral Step 2: Create a dummy pool and attach the custom external monitor Let's do some tests! See the results in the later part of this article. Solution to Challenge #3: We can't use Zone Transfer to synchronize GSLB configurations? No problem! Instead, we'll harness the power of APIs. We can run a custom script in BIG-IP to convert Wide-IP configurations into F5XC DNSLB records via API. Let's see below how we can do this. On BIG-IP DNS, configure the zone records for the domain f5sg.com to delegate the subdomains needed for GSLB. For example, we need to perform GSLB for www.f5sg.com, we will configure the zone like below: www.f5sg.com CNAME www.gslb.f5sg.com gslb.f5sg.com NS ns1.f5clouddns.com On BIG-IP we will create Wide-IP configuration for www.gslb.f5sg.com which should hold the A records. These Wide-IP configurations can be converted by a script to F5XC DNSLB configurations. Check the sample script on this codeshare link: BIG-IP Wide-IP to F5XC DNSLB converter | DevCentral Testing and Result Challenge #2: Failover Testing To simulate the scenario in which F5XC is unable to respond to DNS queries, we designed the script to execute a dig command to F5XC for a TXT record. If F5XC responds with "RESPONSE-OK," no further action is needed. However, if it fails to respond correctly or does not respond at all, the script will trigger a failover action. Scenario 1: When F5XC responds to DNS queries (TXT record value is RESPONSE-OK) Namecheap dashboard shows F5XC nameservers BIG-IP DNS zone records shows F5XC nameservers F5XC zone records shows F5XC nameservers Result when performing dig to resolve sales.f5sg.com -> it shows that F5XC nameservers are Authoritative Scenario: When F5XC doesn't respond to DNS queries (TXT record value is RESPONSE-NOT-OK) We changed the TXT record value to 'RESPONSE-NOT-OK,' which should mark the monitor as down. The dummy pool went down, which means the script inside the monitor detected that the dig result was not what it expected. You can see from the zone records below that the NS records have now changed to GTM (gtm1.f5sg.com and gtm2.f5sg.com) When we check our domain registrar, Namecheap, we can see that the nameservers are now automatically set to BIG-IP GTMs. When I issue a dig command from my workstation, I can see that the nameserver responding to my query is gtm1.f5sg.com Online DNS tools (like MXToolbox) also report that gtm1.f5sg.com is the authoritative NS that responds to the DNS queries for sales.f5sg.com, which resolves to 2.2.2.2 We have now solved one of the challenges by implementing a backup failover plan using custom monitors and automations, made possible by the power of BIG-IP and APIs! Challenge #3: Synchronization Testing Using this script, we can convert and synchronize the BIG-IP Wide-IP configuration to its F5XC equivalent configuration Note: The sample script is limited to handling a Wide-IP with a single GTM pool. Inside the pool is where you will define the IP addresses that you want to load balance. The pool load balancing method is also limited to Round Robin, Ratio, Static Persist, and Global Availability. The script is designed to run at intervals. There are several ways to execute it: you can use external monitors (as we did earlier) or utilize a cronjob, etc. For testing and simplicity, I will use a cronjob set to run every 10 minutes. Let's begin creating our GSLB configuration. If you've configured BIG-IP GTM/DNS before, one of the first objects you need to create is a GTM server. I've configured two Generic Servers representing the application in two different Data Centers. Next is we create a GTM Pool which we will associate the Virtual Server inside the GTM server we created earlier. (i.e. I'm assigning 1.1.1.1 and 2.2.2.2 as the members of the pool) Lastly, we will create the Wide-IP record and attach the GTM Pool we created earlier After this, the script should get triggered and convert this BIG-IP DNS Wide-IP configuration into F5XC DNS configuration. We should see that a new Primary Zone will be created in F5XC (gslb.f5sg.com) When you view the resource records, we should see a DNSLB record which has the record name equivalent to the subdomain of the wide-IP record. (BIG-IP DNS Wide-IP record is www.glsb.f5sg.com, In F5XC DNS zone gslb.f5sg.com, the record name is www and pointing to DNSLB object) The load balancing rules should have the DNSLB pool (pool-www) which is the equivalent of the GTM Pool (pool_www) configured in BIG-IP DNS The DNSLB pool members will include the same IP addresses we defined as GTM Pool members in BIG-IP DNS. There are four load balancing methods available in F5XC, and there is an equivalent BIG-IP DNS load balancing method. The script was created to match this methods but if you configure the BIG-IP DNS pool load balancing method to something other than these four, it will default to Round Robin. BIG-IP DNS F5XC DNS Round Robin Round-Robin Ratio Ratio-Member Static Persist Static-Persist Global Availability Priority Based on the results above, we have successfully converted and synchronize BIG-IP DNS Wide-IP configuration into F5XC DNSLB records! Conclusion We have resolved DNS challenges using the power and integration of F5 solutions! By utilizing both BIG-IP and F5XC platforms, which can sign and serve DNSSEC records, we can seamlessly implement DNSSEC in a hybrid setup without complexity. Furthermore, our scalable F5XC Cloud DNS will shield you from myriad DNS DoS attacks, which are continually evolving, especially with the rise of AI. In terms of DNS resiliency, with the power of our API-first platforms and automation, we can create a DNS hybrid solution capable of automatically failing over from Cloud DNS to on-prem DNS. Lastly, we can synchronize the configurations of both platforms using standards like Zone Transfer and APIs. This capability allows us to convert and synchronize GSLB configurations between our on-prem DNS and Cloud DNS, making administration easier, and establishing a single source of truth.1.4KViews3likes0CommentsSecure AI RAG using F5 Distributed Cloud in Red Hat OpenShift AI and NetApp ONTAP Environment
Introduction Retrieval Augmented Generation (RAG) is a powerful technique that allows Large Language Models (LLMs) to access information beyond their training data. The “R” in RAG refers to the data retrieval process, where the system retrieves relevant information from an external knowledge base based on the input query. Next, the “A” in RAG represents the augmentation of context enrichment, as the system combines the retrieved relevant information and the input query to create a more comprehensive prompt for the LLM. Lastly, the “G” in RAG stands for response generation, where the LLM generates a response with a more contextually accurate output based on the augmented prompt as a result. RAG is becoming increasingly popular in enterprise AI applications due to its ability to provide more accurate and contextually relevant responses to a wide range of queries. However, deploying RAG can introduce complexity due to its components being located in different environments. For instance, the datastore or corpus, which is a collection of data, is typically on-premise for enhanced control over data access and management due to data security, governance, and compliance with regulations within the enterprise. Meanwhile, inference services are often deployed in the cloud for their scalability and cost-effectiveness. In this article, we will discuss how F5 Distributed Cloud can simplify the complexity and securely connect all RAG components seamlessly for enterprise RAG-enabled AI applications deployments. Specifically, we will focus on Network Connect, App Connect, and Web App & API Protection. We will demonstrate how these F5 Distributed Cloud features can be leveraged to secure RAG in collaboration with Red Hat OpenShift AI and NetApp ONTAP. Example Topology F5 Distributed Cloud Network Connect F5 Distributed Cloud Network Connect enables seamless and secure network connectivity across hybrid and multicloud environments. By deploying F5 Distributed Cloud Customer Edge (CE) at site, it allows us to easily establish encrypted site-to-site connectivity across on-premises, multi-cloud, and edge environment. Jensen Huang, CEO of NVIDIA, has said that "Nearly half of the files in the world are stored on-prem on NetApp.”. In our example, enterprise data stores are deployed on NetApp ONTAP in a data center in Seattle managed by organization B (Segment-B: s-gorman-production-segment), while RAG services, including embedding Large Language Model (LLM) and vector database, is deployed on-premise on a Red Hat OpenShift cluster in a data center in California managed by Organization A (Segment-A: jy-ocp). By leveraging F5 Distributed Cloud Network Connect, we can quickly and easily establish a secure connection for seamless and efficient data transfer from the enterprise data stores to RAG services between these two segments only: F5 Distributed Cloud CE can be deployed as a virtual machine (VM) or as a pod on a Red Hat OpenShift cluster. In California, we deploy the CE as a VM using Red Hat OpenShift Virtualization — click here to find out more on Deploying F5 Distributed Cloud Customer Edge in Red Hat OpenShift Virtualization: Segment-A: jy-ocp on CE in California and Segment-B: s-gorman-production-segment on CE in Seattle: Simply and securely connect Segment-A: jy-ocp and Segment-B: s-gorman-production-segment only, using Segment Connector: NetApp ONTAP in Seattle has a LUN named “tbd-RAG”, which serves as the enterprise data store in our demo setup and contains a collection of data. After these two data centers are connected using F5 XC Network Connect, a secure encrypted end-to-end connection is established between them. In our example, “test-ai-tbd” is in the data center in California where it hosts the RAG services, including embedding Large Language Model (LLM) and vector database, and it can now successfully connect to the enterprise data stores on NetApp ONTAP in the data center in Seattle: F5 Distributed Cloud App Connect F5 Distributed Cloud App Connect securely connects and delivers distributed applications and services across hybrid and multicloud environments. By utilizing F5 Distributed Cloud App Connect, we can direct the inference traffic through F5 Distributed Cloud's security layers to safeguard our inference endpoints. Red Hat OpenShift on Amazon Web Services (ROSA) is a fully managed service that allows users to develop, run, and scale applications in a native AWS environment. We can host our inference service on ROSA so that we can leverage the scalability, cost-effectiveness, and numerous benefits of AWS’s managed infrastructure services. For instance, we can host our inference service on ROSA by deploying Ollama with multiple AI/ML models: Or, we can enable Model Serving on Red Hat OpenShift AI (RHOAI). Red Hat OpenShift AI (RHOAI) is a flexible and scalable AI/ML platform builds on the capabilities of Red Hat OpenShift that facilitates collaboration among data scientists, engineers, and app developers. This platform allows them to serve, build, train, deploy, test, and monitor AI/ML models and applications either on-premise or in the cloud, fostering efficient innovation within organizations. In our example, we use Red Hat OpenShift AI (RHOAI) Model Serving on ROSA for our inference service: Once inference service is deployed on ROSA, we can utilize F5 Distributed Cloud to secure our inference endpoint by steering the inference traffic through F5 Distributed Cloud's security layers, which offers an extensive suite of features designed specifically for the security of modern AI/ML inference endpoints. This setup would allow us to scrutinize requests, implement policies for detected threats, and protect sensitive datasets before they reach the inferencing service hosted within ROSA. In our example, we setup a F5 Distributed Cloud HTTP Load Balancer (rhoai-llm-serving.f5-demo.com), and we advertise it to the CE in the datacenter in California only: We now reach our Red Hat OpenShift AI (RHOAI) inference endpoint through F5 Distributed Cloud: F5 Distributed Cloud Web App & API Protection F5 Distributed Cloud Web App & API Protection provides comprehensive sets of security features, and uniform observability and policy enforcement to protect apps and APIs across hybrid and multicloud environments. We utilize F5 Distributed Cloud App Connect to steer the inference traffic through F5 Distributed Cloud to secure our inference endpoint. In our example, we protect our Red Hat OpenShift AI (RHOAI) inference endpoint by rate-limiting the access, so that we can ensure no single client would exhaust the inference service: A "Too Many Requests" is received in the response when a single client repeatedly requests access to the inference service at a rate higher than the configured threshold: This is just one of the many security features to protect our inference service. Click here to find out more on Securing Model Serving in Red Hat OpenShift AI (on ROSA) with F5 Distributed Cloud API Security. Demonstration In a real-world scenario, the front-end application could be hosted on the cloud, or hosted at the edge, or served through F5 Distributed Cloud, offering flexible alternatives for efficient application delivery based on user preferences and specific needs. To illustrate how all the discussed components work seamlessly together, we simplify our example by deploying Open WebUI as the front-end application on the Red Hat OpenShift cluster in the data center in California, which includes RAG services. While a DPU or GPU could be used for improved performance, our setup utilizes a CPU for inferencing tasks. We connect our app to our enterprise data stores deployed on NetApp ONTAP in the data center in Seattle using F5 Distributed Cloud Network Connect, where we have a copy of "Chapter 1. About the Migration Toolkit for Virtualization" from Red Hat. These documents are processed and saved to the Vector DB: Our embedding Large Language Model (LLM) is Sentence-Transformers/all-MiniLM-L6-v2, and here is our RAG template: Instead of connecting to the inference endpoint on Red Hat OpenShift AI (RHOAI) on ROSA directly, we connect to the F5 Distributed Cloud HTTP Load Balancer (rhoai-llm-serving.f5-demo.com) from F5 Distributed Cloud App Connect: Previously, we asked, "What is MTV?“ and we never received a response related to Red Hat Migration Toolkit for Virtualization: Now, let's try asking the same question again with RAG services enabled: We finally received the response we had anticipated. Next, we use F5 Distributed Cloud Web App & API Protection to safeguard our Red Hat OpenShift AI (RHOAI) inference endpoint on ROSA by rate-limiting the access, thus preventing a single client from exhausting the inference service: As expected, we received "Too Many Requests" in the response on our app upon requesting the inference service at a rate greater than the set threshold: With F5 Distributed Cloud's real-time observability and security analytics from the F5 Distributed Console, we can proactively monitor for potential threats. For example, if necessary, we can block a client from accessing the inference service by adding it to the Blocked Clients List: As expected, this specific client is now unable to access the inference service: Summary Deploying and securing RAG for enterprise RAG-enabled AI applications in a multi-vendor, hybrid, and multi-cloud environment can present complex challenges. In collaboration with Red Hat OpenShift AI (RHOAI) and NetApp ONTAP, F5 Distributed Cloud provides an effortless solution that secures RAG components seamlessly for enterprise RAG-enabled AI applications.417Views1like0CommentsSite-to-Site Connectivity in F5 Distributed Cloud Network Connect – Reference Architecture
Purpose This guide describes the reference architecture for deploying F5 Distributed Cloud’s (XC) Multicloud Network Connect service to interconnect their workload across private connectivity or the internet. It enumerates the options available to an F5 Distributed Cloud user to configure site-to-site connectivity using F5 XC Customer Edges (CEs) and explains them in detail to help the user make informed decisions to choose the correct topology for their use case. Audience This guide is for technical readers, including network admins and architects who want to understand how the Multicloud Network Connect service works and what network topology they must use to interconnect their workloads across data centers, branches, and public clouds. This guide assumes the reader is familiar with networking concepts like routing, default gateway configurations, IPSec and SSL encryption, and private connectivity solutions provided by public clouds like AWS Direct Connect and Azure Express Route. Introduction Ensuring workload reachability across data centers, branches, and/or public cloud can be challenging and operationally complex if done in the traditional way. The network teams must design, configure, and maintain multiple networking and security equipment and need expertise across many vendor solutions that provide functionality like NAT, SD-WAN, VPN, firewalls, access control lists (ACLs), etc. This gets even more nuanced while connecting two networks that have overlapping IP address CIDRs which is often in the case of hybrid cloud deployments and during mergers and acquisitions. F5 XC Multicloud Network Connect provides a simple way to configure these interconnections and manage access and security policies across multiple heterogeneous environments, from a single console. It abstracts the complexities by taking the user intent and automating the underlying networking and security while providing the flexibility to choose to connect over a private network or the public internet. Customer Edge as a Gateway To provide site-to-site reachability and ensure enforcement of the security policies, traffic must flow through the CE site. For this, the CE’s Site Local Inside (SLI) IP address must be used as the default gateway or as the next hop to reach the networks on other sites. Figure: Using CE as the gateway Physical vs. Logical Connectivity Two or more CE sites can be physically connected in multiple ways. But this does not automatically allow networks on different sites to be L3 routable to each other by default. For this, the user must associate networks with segments. The physical connection dictates the path the packets will take while going from one site to the other, and the logical connection connects the VLANs (on-prem) and the VPCs/VNETs (in the public cloud) using network overlay and provides segmentation. Note: Multicloud Network Connect provides Layer 3 connectivity between networks. Configuring L3 connectivity is not required to have app-to-app connectivity across the sites. This is done using the distributed load balancer feature under App Connect. Physical Transit Options Over F5 Global Network Backbone A CE site is always connected to the two nearest Regional Edges (REs) for redundancy, using IPSec or SSL tunnels. The REs across different regions are connected via a global, private F5 backbone network. F5’s global network backbone provides high-speed, private transit across regions where Regional Edges are located. Users can use the CE-RE tunnels over the internet to securely connect to this backbone locally and leverage the private connectivity to connect across regions. Figure: Default CE-CE connectivity over REs and F5 network backbone Pros: No need to manage underlay networking if using CE-RE tunnels over the internet. High-speed private transit between geographically distant regions, at no extra cost. End-to-end encryption of traffic between sites. Option to have end-to-end private connectivity Cons: Throughput is limited to the bandwidth of the two tunnels per site. When To Use: The easiest way to connect when private connectivity is not available between data centers or to the cloud VPCs in the case of hybrid cloud. IPSec/SSL tunnels over the internet are acceptable, but you do not want to manage multiple VPN tunnels or SD-WAN devices. Connecting geographically distant sites and you need better end-to-end latency and reliability than going over the internet. Direct Site-to-Site Over Internet If security regulations prevent the use of F5’s private backbone network, users can connect the CE sites directly to each other using IPSec tunnels (SSL encryption is not supported in this case). This is done using the Site Mesh Group (SMG) feature. Note: Even when the CEs are a part of Site Mesh Group, they will still connect to the REs using encrypted tunnels as this is required for control plane connectivity. When the sites are in an SMG, the data path traffic flows through the CE-CE tunnels as the preferred path. If this link fails, as a backup, the traffic gets routed to the REs and over the F5 network backbone to the other site. The number of tunnels on each link between two CE sites depends on the number of control nodes they have. Two single-node sites are connected using only one tunnel. If any one of these sites has three control nodes, it forms three tunnels to the other site. Figure: Number of tunnels between sites in a SMG Pros: Sites are directly connected, so the data path is not dependent on RE. Easy connectivity over the internet Traffic is always encrypted in transit Eliminates the need to manually configure cross-site VPNs or SD-WAN. Cons: Encryption and decryption require more CPU resources on CE nodes as performance requirements increase. Note: L3 Mode Enhanced Performance can be enabled on the sites to get more performance from available CPU and memory resources, but this must be enabled when the site is used only for L3 connectivity as it reduces the resources available for L7 features. Direct Site-to-Site Over Customer’s Network Backbone The CE-CE tunnels can also be configured to be connected over a private network if end-to-end private connectivity is required. Customer can leverage their existing private connectivity between data centers provisioned using private NaaS providers like Equinix. The sites can either be connected directly using SMG where the connections are encrypted or using the DC Cluster Group (DCG) feature, which connects the sites using IP-in-IP tunnels (no encryption). The number of tunnels on each link between two CE sites depends on the number of control nodes they have. Two single-node sites are connected using only one tunnel. If any one of these sites has three control nodes, it forms three tunnels to the other site. A DCG will give better performance, while an SMG is more secure. Unlike SMG, DCG does not fall back to sending RE-CE tunnels if private connectivity fails. Pros: Data path confined within the customer’s private perimeter. Sites are directly connected, so the data path is not dependent on RE. Option to choose encrypted or unencrypted transit. Simplifies the ACLs on the physical network and allows users to manage segmentation using the F5 XC console. Cons: Customer needs to manage the private connectivity across data centers or from the data center to the public cloud. Direct Site-to-Site Connectivity Topologies For direct site-to-site connectivity, sites can be grouped into Site Mesh Group (SMG) or DC Cluster Group (DCG). These allow sites to connect either in Full Mesh or Hub-Spoke topologies as described below: Full Mesh Site Mesh Group All sites that are part of a full-mesh SMG are connected to every other site using IPSec tunnels, forming a full-mesh topology. Figure: Sites in full mesh Site Mesh Group When To Use: In Hybrid cloud use cases. When all sites have equal functionality (e.g. connecting workloads across data centers). High fault tolerance is required for site-to-site connectivity (not dependent on any one site for transit). Hub-Spoke Site Mesh Group This mode allows the sites to be grouped into a hub SMG and a spoke SMG. The sites within the hub SMG are connected using full mesh topology. The sites within the spoke SMG are connected to the sites in hub SMG only and not to other sites in the spoke SMG. Figure: Sites in Hub-Spoke Site Mesh Group Some characteristics of Hub-Spoke SMG: The hub can have multiple sites for redundancy, but it usually has one site in most customer use cases. A hub site can be a spoke site for a different Hub-Spoke SMG. A CE site can be a spoke for multiple hubs. When To Use: For data center/cloud to edge/branch connectivity use cases. Full Mesh DC Cluster Group DC Cluster Group only supports full mesh topology. Every site in a DCG is connected to every other site using IP-in-IP tunnels. Traffic is not encrypted in transit, but DCG is only supported when sites can be connected over a private network. Figure: Sites in DC Cluster Group When To Use: Connecting VLANs on a data center to VLANs on other data centers or to public cloud VPCs/VNETs, when there is private connectivity between them. When the security regulations allow unencrypted traffic over the private transit. Offline Survivability CE Sites require control plane connectivity to the REs and Global Controller (GC) to exchange routes, renew certificates, and decrypt blindfolded secrets. To enable business continuity during an upstream outage, the Offline Survivability feature can be enabled on all sites in a Full Mesh SMG or DCG. The feature is not supported for Hub-Spoke SMG. With this feature enabled the sites can continue normal operations for 7 days without connecting to the REs and the GC. With the offline survivability feature enabled on a CE site, the local control plane becomes the certificate authority in case of connectivity loss. The decrypted secrets and certificates are cached locally on the CE. So, this feature is not turned on by default to allow the user to decide if enabling it aligns with the security regulations of the company. Logical Connectivity Once the physical transit is configured and the connection topology is chosen the workloads on the networks across the sites can be connected using segments or the applications on one site can be delivered to any other site by configuring a distributed load balancer. Connect Networks - Segmentation Users can create segments and add data center VLANs or public cloud VPCs/VNETs to them. All such networks added to a segment become part of a common routing domain and all workloads on these networks can reach each other using the CE sites as gateways. Users must ensure the networks added to a segment do not overlap. Segments are isolated layer 3 domains. So, workloads on one segment cannot access workloads from other segments by default. However, users can configure Segment Connectors to allow traffic from one segment to another. Figure: Segmentation Connect Applications - Distributed Load Balancing Instead of allowing the workloads to route directly to one another, the user can configure a distributed load balancer to publish a service from one site to other sites. This is done by adding the service endpoints to an origin pool of a load balancer object and advertising it using a custom VIP to another site or multiple sites. This allows the client to connect to the service as a local resource. Using Distributed Load Balancing, an LB admin can configure policies to expose the required API of the application only to the required sites. This reduces the attack surface and increases app security. Figure: Distributed Load Balancing E.g. in the figure above, the on-prem database is advertised to client apps on AWS and Azure which can access the DB using their local VIP, and the on-prem application is advertised to the client on Azure only. Decision Flow to Choose Physical Connectivity Options Once you understand the various physical and logical connectivity options, the below chart can help you to make an informed decision based on the connectivity requirements, infrastructure/platform available, and security restrictions. Once the connectivity is decided, you can choose to connect the network or only publish apps to sites where required based on the application requirements. Related Articles F5XC Site Site Mesh Group DC Cluster Group Segmentation Distributed Load Balancer928Views1like0Comments