f5 xc
12 TopicsUsing Distributed Cloud DNS Load Balancer with Geo-Proximity and failover scenarios
Introduction To have both high performance and responsive apps available on the Internet, you need a cloud DNS that’s both scalable and one that operates at a global level to effectively connect users to the nearest point of presence. The F5 Distributed Cloud DNS Load Balancer positions the best features used with GSLB DNS to enable the delivery of hybrid and multi-cloud applications with compute positioned right at the edge, closest to users. With Global Server Load Balancing (GSLB) features available in a cloud-based SaaS format, the Distributed Cloud DNS Load Balancer has a number distinct advantages: Speed and simplicity: Integrate with DevOps pipelines, with an automation focus and a rich and intuitive user interface Flexibility and scale: Global auto-scale keeps up with demand as the number of apps increases and traffic patterns change Security: Built-in DDoS protection, automatic failover, and DNSSEC features help ensure your apps are effectively protected. Disaster recovery: With automatic detection of site failures, apps dynamically fail over to individual recovery-designated locations without intervention. Adding user-location proximity policies to DNS load balancing rules allows the steering of users to specific instances of an app. This not only improves the overall experience but it guarantees and safeguards data, effectively silo’ing user data keeping it region-specific. In the case of disaster recovery, catch-all rules can be created to send users to alternate destinations where restrictions to data don’t apply. Integrated Solution This solution uses a cloud-based Distributed Cloud DNS to load balance traffic to VIP’s that connect to region-specific pools of servers. When data privacy isn’t a requirement, catch-all rules can further distribute traffic should a preferred pool of origin servers become unhealthy or unreachable. The following solution covers the following three DNS LB scenarios: Geo-IP Proximity Active/Standby failover within a region Disaster Recovery for manually activated failovers Autonomous System Number (ASN) Lists Fallback pool for automated failovers The configuration for this solution assumes the following: The app is in multiple regions Users are from different regions Distributed Cloud hosts/manages/is delegated the DNS domain or subdomain (optional) Failover to another region is allowed Prerequisite Steps Distributed Cloud must be providing primary DNS for the domain. Your domain must be registered with a public domain name registrar with the nameservers ns1.f5clouddns.com and ns2.f5clouddns.com. F5 XC automatically validates the domain registration when configured to be the primary nameserver. Navigate to DNS Management > domain > Manage Configuration > Edit Configuration >> DNS Zone Configuration: Primary DNZ Configuration > Edit Configuration. Select “Add Item”, with Record Set type “DNS Load Balancer” Enter the Record Name and then select Add Item to create a new load balancer record. This opens the submenu to create DNS Load Balancer rules. DNS LB for Geo-Proximity Name the rule “app-dns-rule” then go to Load Balancing Rules > Configure. Select “Add Item” then under the Load Balancing Rule, within the default Geo Location Selection, expand the “Selector Expression” and select “geoip.ves.io/continent”. Select Operator “In” and then the value “EU”. Click Apply. Under the Action “Use DNS Load Balancer pool”, click “Add Item”. Name the pool “eu-pool”, and under Pool Type (A) > Pool Members, click “Add Item”. Enter a “Public IP”, then click “Apply”. Repeat this process to have a second IP Endpoint in the pool. Scroll down to Load Balancing Method and select “Static-Persist”. Now click Continue, and then Apply to the Load Balancing Rule, and then “Add Item” to add a second rule. In the new rule, choose Geo Location Selection value “Geo Location Set selector”, and use the default “system/global-users”. Click “Add Item”. Name this new pool “global-pool” and add then select “Add Item” with the following pool member: 54.208.44.177. Change the Load Balancing Mode to “Static-Persist”, then click Continue. Click “Continue”. Now set the Load Balancing Rule Score to 90. This allows the first load balancing rule, specific to EU users, to be returned as the only answer for users of that region unless the regional servers are unhealthy. Note: The rule with the highest score is returned. When two or more rules match and have the same score, answers for each rule is returned. Although there are legitimate reasons for doing this, matching more than one rule with the same score may provide an unanticipated outcome. Now click "Apply", “Apply”, and “Continue”. Click the final “Apply” to create the new DNS Zone Resource Record Set. Now click “Apply” to the DNS Zone configuration to commit the new Resource Record. Click “Save and Exit” to finalize everything and complete the DNS Zone configuration! To view the status of the services that were just created, navigate to DNS Management > Overview > DNS Load Balancers > app-dns-rule. Clicking on the rule “eu-pool”, you can find the status for each individual IP endpoint, showing the overall health of each pool’s service that has been configured. With the DNS Load Balancing rule configured to connect two separate regions, when one of the primary sites goes down in the eu-pool users will instead be directed to the global-pool. This provides reliability in the context of site failover that spans regions. If data privacy is also a requirement, additional rules can be configured to support more sites in the same region. DNS LB for Active-Passive Sites In the previous scenario, two members are configured to be equally active for a single location. We can change the weight of the pool members so that of the two only one is used when the other is unhealthy or disabled. This creates a backup/passive scenario within a region. Navigate to DNS Load Balancer Management > DNS Load Balancers. Go to the service name "app-dns-rule", then under Actions, select Manage Configuration. Click Edit Configuration for the DNS rule. Go to the Load Balancing Rules section, and Edit Configuration. On the Load Balancing Rules order menu, go to Actions > Edit for the eu-pool Rule Action. In the Load Balancing Rule menu for eu-pool, under the section Action, click Edit Configuration. In the rule for eu-pool, under Pool Type (A) > Pool Members click the Edit action In the IP Endpoint section, change the Load Balancing Priority to 1, then click Apply. Change the Load Balancing Mode to Priority, then exit and save all changes by clicking Continue, Apply, Apply, and then Save and Exit. DNS LB for Disaster Recovery Unlike with backup/standby where failover can happen automatically depending on the status of a service's health, disaster recovery (DR) can either happen automatically or be configured to require manual intervention. In the following two scenarios, I'll show how to configure manual DR failover within a region, and also how to manual failover outside the region. To support east/west manual DR failover within the EU region, use the steps above to create a new Load Balancing Rule with the same label selector as the EU rule (eu-pool) above, then create a new DNS LB pool (name it something like eu-dr-pool) and add new designated DR IP pool endpoints. Change the DR Load Balancing Rule Score to 80, and then click Apply. On the Load Balanacing Rules page, change the order of the rules and confirm that the score is such that it aligns to the following image, then click Apply, and then Save and Exit. In the previous active/standby scenario the Global rule functions as a backup for EU users when all sites in EU are down. To force a non-regional failover, you can change F5 XC DNS to send all EU users to the Global DNS rule by disabling each of the two EU DNS rule(s) above. To disable the EU DNS rules, Navigate to DNS Load Balancer Management > DNS Load Balancers, and then under Actions, select Manage Configuration. Click Edit Configuration for the DNS rule. Go to the Load Balancing Rules section, and Edit Configuration. On the Load Balancing Rules order menu, go to Actions > Edit for the eu-pool Rule Action. In the Load Balance Rule menu for eu-pool, under the section Action, click Edit Configuration. In the top section labeled Metadata, check the box to Disable the rule. Then click Continue, Apply, Apply, and then Save and Exit. With the EU DNS LB rules disabled, all requests in the EU region are served by the Global Pool. When it's time to restore regional services, all that's needed is to re-enter the configuration rule and uncheck the Disable box to each rule. DNS LB with ASN Lists ASN stands for Autonomous System Number. It is a unique identifier assigned to networks on the internet that operate under a single administration or entity. By mapping IP addresses to their corresponding ASN, DNS LB administrators can manage some traffic more effectively. To configure Distributed Cloud DNS LB to use ASN lists, navigate to DNS Management > DNS Load Balancer Management, then "Managed Configuration" for a DNS LB service. Choose "Add Item", and on the next page, select "ASN List", and enter one or more ASN's that apply to this rule, select a DNS LB pool, and optionally configure the score (weight). When the same ASN exists in multiple DNS LB rules, the rule having the highest score is used. Note: F5 XC uses ASPlain (4-byte) formatted AS numbers. Multiple numbers are configured one per item line. DNS LB with IP Prefix Lists and IP Prefix Sets Intermediate DNS servers are almost always involved in server name resolution. By default, DNS LB doesn't see originating IP address or subnet prefix of the client making the DNS request. To improve the effectiveness of DNS-based services like DNS LB by making more informed decisions about which server will be the closest to the client, RFC 7871 proposes a solution using the EDNS0 field to allow intermediate DNS servers to add to the DNS request the client subnet (EDNS Client Subnet or EDS). The IP Prefix List and IP Prefix Set in F5 XC DNS is used when DNS requests contain the client subnet and the prefix is within one of the prefixed defined in one or more DNS LB rule sets. To configure an IP Prefix rule, navigate to DNS Management > DNS Load Balancer Management, then "Manage Configuration" of your DNS LB service. Now click "Edit Configuration" at the top left corner, then "Edit Configuration" in the section dedicated to Load Balancing Rules. Inside the section for Load Balancing Rules, click "Add Item" and in the Client Selection box choose either "IP Prefix List" or "IP Prefix Sets" from the menu. For IP Prefix List, enter the IPv4 CIDR prefix, one prefix per line. For IP Prefix Sets, you have the option of choosing whether to use a pre-existing set created in the Shared Configuration space in your tenant or you can Add Item to create a completely new set. ::rt-arrow:: Note: IP Prefix Sets are intended to be part of much larger groups of IP CIDR block prefixes and are used for additional features in F5 XC, such as in L7 WAF and L3 Network Firewall access lists. IP Prefix Sets support the use of both IPv4 and IPv6 CIDR blocks. In the following example, the configured IP Prefix rule having client subnet 192.168.1.0/24 get an answer to our eu-dr-pool (1.1.1.1). Meanwhile, a request not having a client subnet in the defined prefix and is also outside of the EU region, get an answer for the pool global-poolx (54.208.44.177). Command line: ; <<>> DiG 9.10.6 <<>> @ns1.f5clouddns.com www.f5-cloud-demo.com +subnet=1.2.3.0/24 in a ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44218 ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; WARNING: recursion requested but not available ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ; CLIENT-SUBNET: 1.2.3.0/24/0 ;; QUESTION SECTION: ;www.f5-cloud-demo.com. IN A ;; ANSWER SECTION: www.f5-cloud-demo.com. 30 IN A 54.208.44.177 ;; Query time: 73 msec ;; SERVER: 107.162.234.197#53(107.162.234.197) ;; WHEN: Wed Jun 05 21:46:04 PDT 2024 ;; MSG SIZE rcvd: 77 ; <<>> DiG 9.10.6 <<>> @ns1.f5clouddns.com www.f5-cloud-demo.com +subnet=192.168.1.0/24 in a ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48622 ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1 ;; WARNING: recursion requested but not available ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags:; udp: 4096 ; CLIENT-SUBNET: 192.168.1.0/24/0 ;; QUESTION SECTION: ;www.f5-cloud-demo.com. IN A ;; ANSWER SECTION: www.f5-cloud-demo.com. 30 IN A 1.1.1.1 ;; Query time: 79 msec ;; SERVER: 107.162.234.197#53(107.162.234.197) ;; WHEN: Wed Jun 05 21:46:50 PDT 2024 ;; MSG SIZE rcvd: 77 Here we're able to see that a different answer is given based on the client-subnet provided in the DNS request. Additional use-cases apply. The ability to make a DNS LB decision with a client subnet improves ability of the F5 XC nameservers to deliver an optimal response. DNS LB Fallback Pool (Failsafe) The scenarios above illustrate how to designate alternate pools both regional and global when an individual pool fails. However, in the event of a catastrophic failure that brings all service pools are down, F5 XC provides one final mechanism, the fallback pool. Ideally, when implemented, the fallback pool should be independent from all existing pool-related infrastructure and services to deliver a failsafe service. To configure the Fallback Pool, navigate to DNS Management > DNS Load Balancer Management, then "Managed Configuration" of your DNS LB service. Click "Edit Configuration", navigate to the "Fallback Pool" box and choose an existing pool. If no qualified pool exists, the option is available to add a new pool. In my case, I've desginated "global-poolx" as my failsafe fallback pool which already functions as a regional backup. Best practice for the fallback pool is that it should be a pool not referenced elsewhere in the DNSLB configuration, a pool that exists on completely independent resources not regionally-bound. DNS LB Health Checks and Observability For sake of simplicity the above scenarios do not have DNS LB health checks configured and it's assumed that each pool's IP members are always reachable and healthy. My next article shows how to configure health checks to enable automatic failovers and ensure that users always reach a working server. Conclusion Using the Distributed Cloud DNS Load Balancer enables better performance of your apps while also providing greater uptime. With scaling and security automatically built into the service, responding to large volumes of queries without manual intervention is seamless. Layers of security deliver protection and automatic failover. Built-in DDoS protection, DNSSEC, and more make the Distributed Cloud DNS Load Balancer an ideal do-it-all GSLB distributor for multi-cloud and hybrid apps. To see a walkthrough where I configure first scenario above for Geo-IP proximity, watch the following accompanying video. Additional Resources Next article: Using Distributed Cloud DNS Load Balancer health checks and DNS observability More information about Distributed Cloud DNS Load Balancer available at: https://www.f5.com/cloud/products/dns-load-balancer Product Documentation: DNS LB Product Documentation DNS Zone Management6.7KViews4likes0CommentsF5 Distributed Cloud Telemetry (Metrics) - Prometheus
Scope This article walks through the process of collecting metrics from F5 Distributed Cloud’s (XC) Service Graph API and exposing them in a format that Prometheus can scrape. Prometheus then scrapes these metrics, which can be visualized in Grafana. Introduction Metrics are essential for gaining real-time insight into service performance and behaviour. F5 Distributed Cloud (XC) provides a Service Graph API that captures service-to-service communication data across your infrastructure. Prometheus, a leading open-source monitoring system, can scrape and store time-series metrics — and when paired with Grafana, offers powerful visualization capabilities. This article shows how to integrate a custom Python-based exporter that transforms Service Graph API data into Prometheus-compatible metrics. These metrics are then scraped by Prometheus and visualized in Grafana, all running in Docker for easy deployment. Prerequisites Access to F5 Distributed Cloud (XC) SaaS tenant VM with Python3 installed Running Prometheus instance (If not check "Configuring Prometheus" section below) Running Grafana instance (If not check "Configuring Grafana" section below) Note – In this demo, an AWS VM is used with Python installed and running exporter (port - 8888), Prometheus (host port - 9090) and Grafana (port - 3000) running as docker instance, all in same VM. Architecture Overview F5 XC API → Python Exporter → Prometheus → Grafana Building the Python Exporter To collect metrics from the F5 Distributed Cloud (XC) Service Graph API and expose them in a format Prometheus understands, we created a lightweight Python exporter using Flask. This exporter acts as a transformation layer — it fetches service graph data, parses it, and exposes it through a /metrics endpoint that Prometheus can scrape. Code Link -> exporter.py Key Functions of the Exporter Uses XC-Provided .p12 File for Authentication: To authenticate API requests to F5 Distributed Cloud (XC), the exporter uses a client certificate packaged in a .p12 file. This file must be manually downloaded from the F5 XC console (steps) and stored on the VM where the Python script runs. The script expects the full path to the .p12 file and its associated password to be specified in the configuration section. Fetches Service Graph Metrics: The script pulls service-level metrics such as request rates, error rates, throughput, and latency from the XC API. It supports both aggregated and individual load balancer views. Processes and Structures the Data: The exporter parses the raw API response to extract the latest metric values and converts them into Prometheus exposition format. Each metric is labelled (e.g., by vhost and direction) for flexibility in Grafana queries. Exposes a /metrics Endpoint: A Flask web server runs on port 8888, serving the /metrics endpoint. Prometheus periodically scrapes this endpoint to ingest the latest metrics. Handles Multiple Metric Types: Traffic metrics and health scores are handled and formatted individually. Each metric includes a descriptive name, type declaration, and optional labels for fine-grained monitoring and visualization. Running the Exporter python3 exporter.py > python.log 2>&1 & This command runs exporter.py using Python3 in background and redirects all standard output and error messages to python.log for easier debugging. Configuring Prometheus docker run -d --name=prometheus --network=host -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus:latest Prometheus is running as docker instance in host network (port 9090) mode with below configuration (prometheus.yml), scrapping /metrics endpoint exposed from python flask exporter on port 8888 every 60 seconds. Configuring Grafana docker run -d --name=grafana -p 3000:3000 grafana/grafana:latest Private IP of the Prometheus docker instance along with port (9090) is used as data source in Grafana configuration. Once Prometheus is configured under Grafana Data sources, follow below steps: Navigate to Explore menu Select “Prometheus” in data source picker Choose appropriate metric, in this case “f5xc_downstream_http_request_rate” Select desired time range and click “Run query” Observe metrics graph will be displayed Note : Some requests need to be generated for metrics to be visible in Grafana. A broader, high-level view of all metrics can be accessed by navigating to “Drilldown” and selecting “Metrics”, providing a comprehensive snapshot across services. Conclusion F5 Distributed Cloud’s (F5 XC) Service Graph API provides deep visibility into service-to-service communication, and when paired with Prometheus and Grafana, it enables powerful, real-time monitoring without vendor lock-in. This integration highlights F5 XC’s alignment with open-source ecosystems, allowing users to build flexible and scalable observability pipelines. The custom Python exporter bridges the gap between the XC API and Prometheus, offering a lightweight and adaptable solution for transforming and exposing metrics. With Grafana dashboards on top, teams can gain instant insight into service health and performance. This open approach empowers operations teams to respond faster, optimize more effectively, and evolve their observability practices with confidence and control.207Views3likes0CommentsConfigure Generic Webhook Alert Receiver using F5 Distributed Cloud Platform
Generic Webhook Alerts feature in F5 Distributed Cloud (F5 XC) gives feasibility to easily configure and send Alert notifications related to Application Infrastructure (IaaS) to specified URL receiver. F5 XC SaaS console platform sends alert messages to web servers to receive as soon as the events gets triggered.375Views2likes0CommentsConverting a BIG-IP Maintenance Page iRule to Distributed Cloud using App Stack
If you are familiar with BIG-IP, you are probably also familiar with its flexible and robust iRule functionality. In fact, I would argue that iRules makes BIG-IP the swiss-army knife that it is. If there is ever a need for advanced traffic manipulation, you can usually come up with an iRule to solve the problem. F5 Distributed Cloud (XC) has its own suite of tools to help in this regard. If you need to do some sort of traffic manipulation/routing you can usually handle that with Service Policies or simply using Routes. Even with these features, however, there are going to be some cases where iRule functionality from the BIG-IP cannot be reproduced directly in XC. When this happens, we switch to using App Stack, which is XC’s version of a swiss army knife. In this article, I wanted to walk through an example of how you can leverage XC's App Stack for a specific iRule conversion use case: Displaying a Custom Maintenance Page when all pool members are down. For reference, here is the iRule: when LB_FAILED { if { [active_members [LB::server pool]] == 0 } { if { ([string tolower [HTTP::host]] contains "example.com")} { if { [HTTP::uri] ends_with "SystemMaintenance.jpg" } { HTTP::respond 200 content [ifile get "SystemMaintenance.jpg"] "Content-Type" "image/jpg" } else { HTTP::respond 200 content "<!DOCTYPE html> <html lang="en"> <head> <title>System Maintenance</title> <style type="text/css"> .base { font-family: 'Tahoma'; font-size: large; } </style> </head> <body> <br> <center><img alt="sad" height="200" src="SystemMaintenance.jpg" width="200" /></center><br> <center><span class="base">This application is currently under system maintenance.</span></center> <br> <center><span class="base">All services will be back online in a few mintues.</span> </body> </html>" } } } } When dissecting this iRule, you can see we have to solve for the following: Trigger the maintenance page when all pool members are down Serve local files (images, css, etc.) Display the static HTML page So, how do we do this? Well, App Stack allows us to deploy and host a container in Distributed Cloud. So we can easily create a simple container (using NGINX for bonus points!) that contains all these images, stylesheets, HTML files, etc. and manipulate our pools so that it uses this container when required! Let’s deep dive into the step-by-step process… Step by Step Walk-through: Container Creation First, we have to create our container. I'm not going to go too deep into how to create a container in this article, but I will highlight the main steps I took. To start, I simply extracted the HTML from the iRule above and saved all the required files (images, stylesheets, etc.) in one directory. Since I am adding NGINX to the container, I must also create and include a nginx.conf file in this directory. Below was my configuration: worker_processes 1; error_log /var/log/nginx/error.log warn; pid /tmp/nginx.pid; events { worker_connections 1024; } http { client_body_temp_path /tmp/client_temp; proxy_temp_path /tmp/proxy_temp_path; fastcgi_temp_path /tmp/fastcgi_temp; uwsgi_temp_path /tmp/uwsgi_temp; scgi_temp_path /tmp/scgi_temp; include /etc/nginx/mime.types; server { listen 8080; location / { root /usr/share/nginx/html/; index index.html; } location ~* \.(js|jpg|png|css)$ { root /usr/share/nginx/html/; } } sendfile on; keepalive_timeout 65; } There really isn’t much to the NGINX configuration for this example, but keep in mind that you can expand on this and make it much more robust for other use cases. (One note about the configuration above is that you will see /tmp paths mentioned. These are required since our container will run as a non-root user. For more information, see the NGINX documentation here: https://hub.docker.com/_/nginx) Finally, I included a Dockerfile with my requirements for NGINX and exposing port 8080. Once that was all set, I built my container and pushed it Docker Hub as a private repository. App Stack Deployment Now that we have the container created and uploaded to Docker Hub, we are ready to bring it to XC. Start by opening up the F5 XC Console and navigate to the Distributed Apps tile. Navigate to Applications -> Container Registries, then click Add Container Registry. Here we just have to add a name for the Container Registry, our Docker Hub Username, “docker.io” for the Server FQDN, and then blindfold our password for Docker Hub. After saving, we are now ready to configure our workload To do so, we have to navigate over to Applications -> Virtual K8s. I already had a Virtual Site and Virtual K8s created, but you'll need to create those if you don't already have them. For your reference, here are some links to a walk-through on each of these: Virtual Site Creation: https://docs.cloud.f5.com/docs/how-to/fleets-vsites/create-virtual-site Virtual K8s Creation: https://docs.cloud.f5.com/docs/how-to/app-management/create-vk8s-obj Select your Virtual K8s cluster: After selecting your cluster, navigate to the Workloads tab. Under Workloads, click on Add VK8s Workload. Give your workload a name and then change the Type of Workload to Service instead of Simple Service. Your configuration should look something like below: You'll notice we now have to configure the Service. Click Configure. The first step is to tell XC which container we want to deploy for this service. Under Containers, select Add Item: Give the container a name, and then input your Image Name. The format for the image name is "registry/image:tagname". If you leave the tag name blank, it defaults to “latest”. Under the Select Container Registry drop down, select Private Registry. This will bring up another drop-down where we will select the container registry we created earlier. Your configuration should end up looking similar to below: For this simple use case, we can skip the Configuration Parameters and move to our Deploy Options. Here, we have some flexibility on where we want to deploy our workload. You can choose All Regional Edges (F5 PoPs), specific REs, or even custom CEs and Virtual Sites. In my basic example, I chose Regional Edge Sites and picked the ny8-nyc RE for now: Next, we have to configure where we want to advertise this workload. We have the option to keep it internal and only advertise in the vK8s Cluster or we could advertise this workload directly on the Internet. Since we only want this maintenance page to be seen when the pool members are all down, we are going to keep this to Advertise In Cluster. After selecting the advertisement, we have to configure our Port Information. Click Configure. Under the advertisement configuration, you’ll see we are simply choosing our ports. If you toggle “Show Advanced fields” you can see we have some flexibility on the port we want to advertise and the actual target port for the container. In my case, I am going to use 8080 for both, but you may want to have a different combination (i.e. 80:8080). Click Apply once finished. Now that we have the ports defined, we can simply hit Apply on the Service configuration and Save and Exit the workload to kick off the deployment. We should now see our new maintenance-page workload in the list. You’ll notice that after refreshing a couple times, the Running/Completed Pods and Total Pods fields will be populated with the number of REs/CEs you chose to deploy the workload to. After a few minutes, you should have a matching number of Running/Completed Pods to your Total Pods. This gives us an indication that the workload is ready to be used for our application. (Note: you can click on the pod numbers in this list to see a more detailed status of the pods. This helps when troubleshooting) Pool Creation With our workload live and advertised in the cluster, it is time to create our pool. In the top left of the platform we’ll need to Select Service and change to Mulitcloud App Connect: Under Mulit-Cloud App Connect, navigate to Manage -> Load Balancers -> Origin Pools and Select Add Origin Pool. Here, we’ll give our origin pool a name and then go directly to Origin Servers. Under Origin Servers, click Add Item. Change the Type of the Origin Server to be K8s Service Name of Origin Server on given Sites. Under Service Name, we have to use the format "servicename.namespace:cluster-id" to point to our workload. In my case, it was "maintenance-page.bohanson:bohanson-test" since I had the following: Service Name: maintenance-page Namespace: bo-hanson VK8s Cluster: bohanson-test Under Site or Virtual Site, I chose the Virtual Site I already had created. The last step is to change the network to vK8s Networks on Site and Click Apply. The result should look like the below: We now need to change our Origin Server port to be the port we defined in the workload advertisement configuration. In my case, I chose port 8080. The rest of the configuration of the origin server is up to you, but I chose to include a simple http health check to monitor the service. Once the configuration finished, click Save and Exit. The final pool configuration should look like this: Application Deployment: With our maintenance container up and running and our pool all set, it is time to finally deploy our solution. In this case, we can select any existing Load Balancer configuration where we want to add the maintenance page. You could also create a new Load Balancer from scratch, of course, but for this example I am deploying to an existing configuration. Under Manage -> Load Balancers, find the load balancer of your choosing and then select Manage Configuration. Once in the Load Balancer view, select Edit Configuration in the top right. To deploy the solution, we just need to navigate to our Origins section and add our new maintenance pool. Select Add Item. At this point, you may be thinking, “Well that is great, but how am I going to get the pool to only show when all other pool members are down?” That is the beauty of the F5 Distributed Cloud pool configuration. We have two options that we can set when adding a pool: Weight and Priority. Both of those options are pretty self-explanatory if you have used a load balancer before, but what is interesting here is when you give these options a value of zero. Giving a pool a weight of zero would disable the pool. For a maintenance pool use case, that could be helpful since we can manually go into the Load Balancer configuration during a maintenance window, disable the main pool, and then bring up the maintenance pool until our change window is closed when we could then reverse the weights and bring the main pool back online. That ALMOST solves our iRule use case, but it would be manual. Alternatively, we can give a pool a Priority of zero. Doing so would mean that all other pools take priority and will be used unless they go down. In the event of the main pool going down, it would default to the lowest priority pool (zero). Now that is more like it! This means we can set our maintenance pool to a Priority of zero and it will automatically be used when the health of all our other pool members go down – which completely fulfills the original iRule requirement. So in our configuration, let's add our new maintenance pool and set: Weight: 1 Priority: 0 After clicking save, the final pool configuration should look something like this: Testing To test, we can simply switch our health check on the main pool to something that would fail. In my case, I just changed the expected status code on the health check to something arbitrary that I knew would fail, but this could be different in your case. After changing the health check, we can navigate to our application in a browser, and see our maintenance page dynamically appear! Changing the health check on the main pool back to a working one should dynamically turn off the maintenance page as well: Summary This is just one example of how you can use App Stack to convert some more advanced/dynamic iRules over to F5 Distributed Cloud. I only used a basic NGINX configuration in this example, but you can start to see how leveraging NGINX in App Stack can give us even more flexibility. Hopefully this helps!396Views2likes0CommentsF5 Distributed Cloud Telemetry (Metrics) - ELK Stack
As we are looking into exporting metrics data to the ELK stack using Python script, let's first get a high-level overview of the same. Metrics are numerical values that provide actionable insights into the performance, health and behavior of systems or applications over time, allowing teams to monitor and improve the reliability, stability and performance of modern distributed systems. ELK Stack (Elasticsearch, Logstash, and Kibana) is a powerful open-source platform. It enables organizations to collect, process, store, and visualize telemetry data such as logs, metrics, and traces from remote systems in real-time.101Views1like0CommentsGetting started with F5 Distributed Cloud (XC) Telemetry
Introduction: This is an introductory article on the F5 Distributed Cloud (XC) telemetry series covering the basics. Going forward, there will be more articles focusing on exporting and visualizing logs and metrics from XC platform to telemetry tools like ELK Stack, Loki, Prometheus, Grafana etc. What is Telemetry? Telemetry refers to the process of collection and transmission of various kinds of data from remote systems to some central receiving entity for monitoring, analyzing and improving the performance, reliability, and security of remote systems. Telemetry Data involves: Metrics: Quantitative data like request rates, error rates, request/response throughputs etc. collected at regular intervals over a period of time. Logs: Textual time and event-based records generated by applications like request logs, security logs, etc. Traces: Information regarding journey/flow of requests across multiple services in a distributed system. Alerts: Alerts use telemetry data to set limits and send real-time notifications allowing organizations to act quickly if their systems don’t behave as expected. This makes alerts a critical pillar of observability. Overview: The F5 Distributed Cloud platform is designed to meet the needs of today’s modern and distributed applications. It allows for delivery, security, and observability across multiple clouds, hybrid clouds, and edge environments. This will create telemetry data that can be seen in XC’s own dashboards. But there may be times when customers want to collect their application’s telemetry data from different platforms to their own SIEM systems. To fulfill this kind of requirement, XC has come up with the Global Log Receiver (GLR) which will send XC logs to customer’s log collection systems. Along with this XC also exposes API that contains metrics data which can be fetched by exporter scripts and can be parsed and processed in such a way that telemetry tools can understand. As shown in the above diagram, there are a few steps involved before raw telemetry data can be presented into the dashboards, which include data collection, storage, and processing from remote systems. Once done, only then will the telemetry data be sent to the visualization tools for real-time monitoring and observability. To achieve this, there are several telemetry tools available like Prometheus (which is used for collecting, storing, and analyzing metrics), ELK stack, Grafana etc. We have covered a brief description of a few such tools below. F5 XC Global Log Receiver: F5 XC Global Log Receiver facilitates sending XC logs (Request, Audit, Security event and DNS request logs) to an external log collection system. The sent logs include all system and application logs of F5 XC tenant. Global log receiver supports sending the logs for the following log collection systems: AWS Cloudwatch AWS S3 HTTP Receiver Azure Blob Storage Azure Event Hubs Datadog GCP Bucket Generic HTTP or HTTPs server IBM QRadar Kafka NewRelic Splunk SumoLogic More information on how to setup or configure XC GLR can be found in this document. Observability/Monitoring Tools: Note: Below is a brief description of a few commonly used monitoring tools used by organizations. Prometheus: Prometheus is an open-source monitoring and alerting tool designed for collecting, storing, and analyzing time-series data (metrics) from modern, cloud-native, and distributed systems. It scrapes metrics from targets via HTTP endpoints, stores them in its optimized time-series database, and allows querying using the powerful PromQL language. Prometheus integrates seamlessly with tools like Grafana for visualization and includes Alertmanager for real-time alerting. It can also be integrated with Kubernetes and can help in continuously discovering and monitoring services from remote systems. Loki: Loki is a lightweight, open-source log aggregation tool designed for storing and querying logs from remote systems. Unlike traditional log management systems, Loki focuses on processing logs alongside metrics and is often paired with Prometheus, making it more efficient. It does not index the log content; rather it sets labels for each log stream. Logs can be queried using LogQL, a PromQL-like language. It is best suited for debugging and monitoring logs in cloud-native or containerized environments like Kubernetes. Grafana: Grafana is an open-source visualization and analytics platform for creating real-time dashboards from diverse data sets. It integrates with tools like Prometheus, Loki, Elasticsearch, and more. Grafana enables users to visualize trends, monitor performance, and set up alerts using a highly customizable interface. ELK Stack: The ELK Stack (Elasticsearch, Logstash, Kibana) is a powerful open-source solution for log management, search, and analytics. Elasticsearch handles storing, indexing, and querying data. Logstash ingests, parses, and transforms logs from various sources. Kibana provides an interactive interface for visualizing data and building dashboards. Conclusion: Telemetry turns system data into actionable insights enabling real-time visibility, early detection of issues, and performance tuning, thereby ensuring system reliability, security, stability, and efficiency. In this article, we’ve explored some of the foundational building blocks and essential tools that will set the stage for the topics we’ll cover in the upcoming articles of this series! Related Articles: F5 Distributed Cloud Telemetry (Logs) - ELK Stack F5 Distributed Cloud Telemetry (Metrics) - ELK Stack F5 Distributed Cloud Telemetry (Logs) - Loki F5 Distributed Cloud Telemetry (Metrics) - Prometheus References: XC Global Log Receiver Prometheus ELK Stack Loki268Views1like3CommentsAccelerate Your Initiatives: Secure & Scale Hybrid Cloud Apps on F5 BIG-IP & Distributed Cloud DNS
It's rare now to find an application that runs exclusively in one homogeneous environment. Users are now global, and enterprises must support applications that are always-on and available. These applications must also scale to meet demand while continuing to run efficiently, continuously delivering a positive user experience with minimal cost. Introduction In F5’s 2024 State of Application Strategy Report, Hybrid and Multicloud deployments are pervasive. With the need for flexibility and resilience, most businesses will deploy applications that span multiple clouds and use complex hybrid environments. In the following solution, we walk through how an organization can expand and scale an application that has matured and now needs to be highly-available to internal users while also being accessible to external partners and customers at scale. Enterprises using different form-factors such as F5 BIG-IP TMOS and F5 Distributed Cloud can quickly right-size and scale legacy and modern applications that were originally only available in an on-prem datacenter. Secure & Scale Applications Let’s consider the following example. Bookinfo is an enterprise application running in an on-prem datacenter that only internal employees use. This application provides product information and details that the business’ users access from an on-site call center in another building on the campus. To secure the application and make it highly-available, the enterprise has deployed an F5 BIG-IP TMOS in front of each of endpoint An endpoint is the combination of an IP, port, and service URL. In this scenario, our app has endpoints for the frontend product page and backend resources that only the product page pulls from. Internal on-prem users access the app with internal DNS on BIG-IP TMOS. GSLB on the device sends another class of internal users, who aren’t on campus and access by VPN, to the public cloud frontend in AWS. The frontend that runs in AWS can scale with demand, allowing it to expand as needed to serve an influx of external users. Both internal users who are off-campus and external users will now always connect to the frontend in AWS through the F5 Global Network and Regional Edges with Distributed Cloud DNS and App Connect. Enabling the frontend for the app in AWS, it now needs to pull data from backend services that still run on-prem. Expanding the frontend requires additional connectivity, and to do that we first deploy an F5 Distributed Cloud Customer Edge (CE) to the on-prem datacenter. The CE connects to the F5 Global Network and it extends Distributed Cloud Services, such as DNS and Service Discovery, WAF, API Security, DDoS, and Bot protection to apps running on BIG-IP. These protections not only secure the app but also help reduce unnecessary traffic to the on-prem datacenter. With Distributed Cloud connecting the public cloud and on-prem datacenter, Service Discovery is configured on the CE on-prem. This makes a catalog of apps (virtual servers) on the BIG-IP available to Distributed Cloud App Connect. Using App Connect with managed DNS, Distributed Cloud automatically creates the fully qualified domain name (FQDN) for external users to access the app publicly, and it uses Service Discovery to make the backend services running on the BIG-IP available to the frontend in AWS. Here are the virtual servers running on BIG-IP. Two of the virtual servers, “details” and “reviews,” need to be made available to the frontend in AWS while continuing to work for the frontend that’s on-prem. To make the virtual servers on BIG-IP available as upstream servers in App Connect, all that’s needed is to click “Add HTTP Load Balancer” directly from the Discovered Services menu. To make the details and reviews sevices that are on-prem available to the frontend product page in AWS, we advertise each of their virtual servers on BIG-IP to only the CE running in AWS. The menu below makes this possible with only a few clicks as service discovery eliminates the need to find the virtual IP and port for each virtual server. Because the CE in AWS runs within Kubernetes, the name of the new service being advertised is recognized by the frontend product page and is automatically handled by the CE. This creates a split-DNS situation where an internal client can resolve and access both the internal on-prem and external AWS versions of the app. The subdomain “external.f5-cloud-demo.com” is now resolved by Distributed Cloud DNS, and “on-prem.f5-cloud-demo.com” is resolved by the BIG-IP. When combined with GSLB, internal users who aren’t on campus and use a VPN will be redirected to the external version of the app. Demo The following video explains this solution in greater detail, showing how to configure connectivity to each service the app uses, as well as how the app looks to internal and external users. (Note: it looks and works identically! Just the way it should be and with minimal time needed to configure it). Key Takeaways BIG-IP TMOS has long delivered best-in-class service with high-availability and scale to enterprise and complex applications. When integrated with Distributed Cloud, freely expand and migrate application services regardless of the deployment model (on-prem, cloud, and edge). This combination leverages cloud environments for extreme scale and global availability while freeing up resources on-prem that would be needed to scrub and sanitize traffic. Conclusion Using the BIG-IP platform with Distributed Cloud services addresses key challenges that enterprises face today: whether it's making internal apps available globally to workforces in multiple regions or scaling services without purchasing more fixed-cost on-prem resources. F5 has the products to unlock your enterprise’s growth potential while keeping resources nimble. Check out the select resources below to explore more about the products and services featured in this solution. Additional Resources Solution Overview: Distributed Cloud DNS Solution Overview: One DNS – Four Expressions Interactive Demo: Distributed Cloud DNS at F5 DevCentral: The Power of &: F5 Hybrid DNS solution F5 Hybrid Security Architectures: One WAF Engine, Total Flexibility225Views1like0Comments