Displaying Application Study Tool (AST) Dashboards in Your Own Grafana Instance
The Application Study Tool (AST) has its own Prometheus and Grafana instances. These instances run as containers and are designed to coexist with other Prometheus and Grafana instances in your environment, even on the same host. However, during demos and discussions with customers, many have expressed the desire to use their existing Grafana instance to display AST dashboards. Although it may not be obvious to new Grafana users, this process is straightforward.

This blog will walk you through launching a second, generic Grafana container instance, connecting it to the AST instance of Prometheus (the data source), importing a dashboard from the AST instance of Grafana, and displaying it in the new Grafana instance. If you already have a non-AST instance of Grafana running in your environment, the steps to launch a second Grafana container are optional. However, you may want to run one in order to test the import functionality and make your own customizations before importing the dashboard again into your "production" Grafana instance.

Launch a Second (Generic) Grafana Container

If you already have a Grafana instance, you may skip this step. However, if you don't, or you would like a "sandbox" for testing customizations before importing the dashboard into your "production" Grafana instance, use the following steps to launch a new Grafana container. The steps that follow make these assumptions:

- You are using Docker as your container runtime. (If you are using Podman, simply substitute "podman" for "docker" in each of the following commands. Other container runtimes may also work for this exercise, but I have not tested them.)
- You have sufficient privileges to run containers. If you don't, you may need to run these commands with "sudo". If that also fails due to permissions errors, request the necessary privileges from your Linux administrator.
- We want to run Grafana version 11.5.2. Any recent version should work; this is simply the latest version as of the writing of this blog.
- The IP address of the host where you are running these containers is 192.168.0.15. Yours will likely be different. Use your own host's IP when you run "curl" inside the grafana2 container.
- In my testing, I used macOS. This will also work on any current Linux distribution and should work on Windows.

First, launch the Grafana container. I set this new instance of Grafana to listen on port 3002 (the default for Grafana is 3000) to avoid conflicts with the AST instance if they are running on the same host.

```
$ docker run -d --name=grafana2 -p 3002:3000 grafana/grafana:11.5.2
```

Next, exec into the container to ensure it can connect to the AST instance of Prometheus. You can instead check connectivity from the Grafana UI, but the method below is a good way to troubleshoot any connectivity errors you may encounter.

```
$ docker exec -it grafana2 bash
```

You are now running a Bash shell inside the new Grafana container. Run a curl command to confirm the new Grafana container can reach the Prometheus application, which listens on port 9090 by default. (The IP address, 192.168.0.15, is used as an example. Use your own host's IP address here.)

```
5d3e8256af3d:/usr/share/grafana$ curl 192.168.0.15:9090
<a href="/graph">Found</a>.
```

Now it is time to test the new Grafana instance. Open a web browser and navigate to the host where this new Grafana container is running, at port 3002. If you are running on your local machine, the URL is http://localhost:3002/.
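If you prefer to confirm from the command line before opening a browser, Grafana exposes a simple health endpoint you can probe. Here is a minimal sketch, assuming the container is published on localhost port 3002 as above (the exact JSON fields may vary by Grafana version):

```
$ curl -s http://localhost:3002/api/health
{
  "database": "ok",
  "version": "11.5.2"
}
```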
The default credentials are admin/admin. When first logging in, Grafana will prompt you to change the password. You may choose to change it now or click "Skip" to leave it as is. Now you can export one of the dashboards from AST and import it into this instance.

Export a Dashboard from AST

Now that you have launched a second instance of Grafana (or you are running your own non-AST instance), it is time to import a dashboard from AST. You can import just one dashboard of your choosing (for example, BigIP - Device >> Device Virtual Servers), or several (or even all) dashboards from AST. For this example, we will import only one dashboard, BigIP - Device >> Device Virtual Servers. If you wish to import other dashboards, the steps are the same.

1. Navigate to the dashboard you would like to import into your Grafana instance. For the example used here, navigate to Dashboards >> BigIP - Device >> Device Virtual Servers.
2. Click the blue "Share" button near the upper-right corner.
3. In the pop-up box, click the Export tab.
4. Click the blue "Save to file" button to download the JSON file representing the dashboard.

Two notes:

- If you wish to use your own non-AST instance of Prometheus, you will need to move the slider for "Export for sharing externally" (available in the Share pop-up box, under the Export tab) to the right to enable it. This will allow you to select your own Prometheus instance as the data source when importing the dashboard into the alternate Grafana instance.
- The default JSON for these dashboards is also available in the "dashboards" folder of the repo: https://github.com/f5devcentral/application-study-tool/tree/main/services/grafana/provisioning/dashboards. This version has the "Export for sharing externally" option enabled, so you will need to select the desired Prometheus data source (either your own or the AST instance) when importing the dashboard into the alternate Grafana instance.

Import the Dashboard into the New (or Existing) Grafana Instance

If you have just launched a new, generic Grafana container using the instructions in the section above, Launch a Second (Generic) Grafana Container, you can now open the UI from a web browser by navigating to http://localhost:3002/ (assuming you are running it on your local machine). The default login credentials are admin/admin. If this is just a temporary test instance, you may click "Skip" when prompted to update your password. (For a production instance, or any instance that will be used more than briefly, we recommend changing this to a stronger password.) If you are using an existing Grafana instance, navigate to it and log in.

Connect the New Grafana Instance to the AST Prometheus Instance

From this non-AST Grafana instance, verify the Prometheus data source is reachable from Grafana, and then connect to it by following these steps (if you prefer to script this step, see the sketch after the list):

1. In the menu bar on the left, click Connections >> Data sources.
2. If this is a new instance of Grafana, the "Add data source" button will appear in the middle of the screen. If this is an existing instance with pre-existing data sources, the button will be in the upper-right corner of the screen and will say "Add new data source". Click it.
3. Select Prometheus from the list of data sources. You may have to scroll down or enter "prometheus" in the search bar.
4. Fill in a name (for example, "ast-prometheus") and the URL to connect to the Prometheus instance. In my case, that was my host's private IP address, 192.168.0.15, and the port Prometheus is listening on (9090 by default): http://192.168.0.15:9090.
5. Set the "Interval behaviour >> Scrape interval" to be the same as the value used for the collection_interval setting in your AST configuration. If you did not explicitly change it when configuring AST, it will be the default value of 60s.
6. Click the blue "Save & test" button and ensure you get the message "Successfully queried the Prometheus API" at the bottom of the screen.
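As an aside, if you would rather script this step than click through the UI, Grafana's HTTP API can create the same data source. Here is a minimal sketch, assuming the new container is still listening on localhost port 3002 with the default admin/admin credentials and the example host IP from this blog; adjust all three to your environment:

```
$ curl -s -u admin:admin \
    -H "Content-Type: application/json" \
    -X POST http://localhost:3002/api/datasources \
    -d '{
          "name": "ast-prometheus",
          "type": "prometheus",
          "access": "proxy",
          "url": "http://192.168.0.15:9090",
          "jsonData": { "timeInterval": "60s" }
        }'
```

A successful call returns a JSON confirmation along with the new data source's ID.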
Import the Dashboard into the New Grafana Instance

1. Click "Dashboards" in the menu on the left.
2. Click the blue "New" button in the upper-right and, from the drop-down, select "Import".
3. Click "Upload dashboard JSON file" and upload the JSON file you previously exported from the original AST dashboard.
4. Give it a name (under Name).
5. Under the Prometheus drop-down, select your Prometheus data source. (In the example above, it is called "ast-prometheus". If you accepted the default name, it will just be "prometheus".)
6. Click Import.

Voilà! You are now taken to the newly imported Grafana dashboard.

Conclusion

The Application Study Tool offers excellent observability for F5 BIG-IP systems and the traffic they handle. If you have your own Grafana instance with your own set of dashboards, there is no need to manage two separate instances. You can combine the two so you have all your dashboards in one place. The flexibility of Grafana also makes it highly customizable, so you can modify any of the out-of-the-box dashboards AST provides and even create your own. If you have gotten value from customizing some of the default AST dashboards, feel free to post what you did below, as many of our readers will find it valuable.
COVID-19 Response: Supplement to the BIG-IP APM and BIG-IP Edge Client Operations Guides

In the new F5 VPN for Business Continuity guide published to AskF5, you can learn how to:

- License and provision many new BIG-IP Network Access VPN users.
- Ensure seamless and secure BIG-IP APM Network Access VPN service any time.
- Troubleshoot problems when your BIG-IP Network Access VPN is not functioning optimally.

The F5 VPN for Business Continuity guide includes the following chapters:

- Chapter 1: Contents
- Chapter 2: What is Network Access VPN?
- Chapter 3: Licensing and provisioning Network Access VPN
- Chapter 4: Optimizing Network Access VPN
- Chapter 5: Securing Network Access VPN
- Chapter 6: Troubleshooting Network Access VPN
The Power of F5 and NGINX

NGINX Controller 3.0 released

Since the NGINX acquisition, F5 and NGINX have been integrating teams, listening to customers, and planning our first release as a unified company. Now, we have introduced NGINX Controller 3.0, which allows you to manage apps and services across a variety of deployment models, including multi-cloud scenarios. NGINX Controller 3.0 shifts from an infrastructure-centric to an application-centric design, improving developer productivity and accelerating time-to-market for new applications. In this article, learn about core NGINX concepts and explore new NGINX documentation on AskF5.

Core NGINX concepts

- Putting your Apps First (5 mins): Learn how an app-centric delivery platform can increase collaboration, decrease risk, and help you move with speed.
- Load Balancing in a Multi-Cloud World (4 mins): Explore considerations for deploying your applications to multiple clouds.
- Managing a Real Time API (4 mins): Learn the benefits of a lightweight API management platform.
- Simplifying the Move to Microservices (5 mins): Learn about options to successfully deploy microservices and see our six-point checklist to help you determine if you're ready for a service mesh.

New NGINX documentation on AskF5

As the F5 and NGINX engineering teams release products together, engineers from both Support teams and AskF5 are combining forces to produce new documentation. For example, if you want to deploy your BIG-IP LTM system with HTTP load balancing to two NGINX proxies in AWS, see Quick deployment: BIG-IP LTM system with HTTP load balancing to two NGINX Plus web servers in AWS.

More NGINX articles on AskF5:

- K74544015: Removing nginx/<version> from HTTP response headers
- K82655201: Host OS swap space must be disabled in NGINX Controller 2.8.0 and later
- K24214052: NGINX Controller 2.0.0 installation fails when the host OS locale is not UTF-8
- K64001240: Enabling NGINX Controller Agent debug logging
- K06962163: Resetting the Admin account password on the NGINX Controller system
- K30389284: Backing up and restoring the NGINX Controller system
- K10640269: Setting nginx-controller as the default Kubernetes namespace
- K51798430: Using the proxy_headers_hash_max_size and proxy_headers_hash_bucket_size directives
- K03453121: Basic Authentication on the health check request
- K21528053: [crit] message in error.log says '24: Too many open files'
- K43542013: NGINX returns status '400 Request Header Or Cookie Too Large' or '414 Request-URI Too Large'
- K48373902: [warn] message in error log: an upstream response is buffered to a temporary file while reading upstream
- K84508595: Different SSL protocols for different servers
- K18050039: Enabling client certificate authentication for NGINX
- K95305552: How to download or update the GeoIP2 database
- K68914062: Displaying a custom 502 response page
- K13912623: Configuring a default 'catchall' server
- K04600350: Using a common set of directives in the NGINX Plus configuration
- K46613025: High Availability solutions available for NGINX Plus in Azure
- K42497190: NGINX versions that support Lightweight M2M protocol
- K53631303: Capturing HTTP headers of a request in a log file
- K95324441: modsec_audit.log dramatically increasing

What new NGINX topics would you like to see on AskF5? Leave your suggestions in the comments.
Tools and facilities to troubleshoot HTTP/3 over QUIC with the BIG-IP system

Introduction

This article is for engineers who are troubleshooting issues related to HTTP/3 over QUIC as you deploy this new technology on your BIG-IP system. As you perform your troubleshooting tasks, the BIG-IP system provides you a set of useful tools, along with other third-party software, to identify the root cause of issues and even tune HTTP/3 performance to maximize your system's potential.

Overview of HTTP/3 and QUIC

HTTP/3 is the next version of the HTTP protocol after HTTP/2. The most significant change in HTTP/3 from its predecessors is that it uses the UDP protocol instead of TCP. HTTP/3 runs over a new Internet transport protocol, QUIC, which uses streams at the transport layer and provides TCP-like congestion control and loss recovery. One major improvement QUIC provides is that it combines the typical 3-way TCP handshake with TLS 1.3's handshake, which reduces the time required to establish a connection. Hence, you may see QUIC as providing the functions previously provided by TCP, TLS, and HTTP/2.

For an overview of HTTP/3 over QUIC with the BIG-IP system, refer to K60235402: Overview of the BIG-IP HTTP/3 and QUIC profiles.

Available tools and facilities

Beginning in BIG-IP 15.1.0.1, HTTP/3 over QUIC (client-side only) is available as an experimental feature on the BIG-IP system. Beginning in BIG-IP 16.1.0, the BIG-IP system supports QUIC and HTTP/3. In addition, the following tools and facilities are available to help you troubleshoot issues you might encounter:

1. Install an HTTP/3 command line client.
2. Use a browser that supports QUIC.
3. Review statistics on your BIG-IP system.
4. Enable QUIC debug logging on the BIG-IP system.
5. Perform advanced troubleshooting with packet tracing:
   a. Use the NetLog feature from the Chromium Project to capture a NetLog dump.
   b. Use the tcpdump command and Wireshark to capture and analyze traffic.
   c. Use the qlog trace system database key on the BIG-IP system.

Important: For BIG-IP versions prior to 16.1.0, which implement QUIC in its experimental stages, note in your troubleshooting which version of the IETF draft your client and server implement. For example, in the Hello packets between the client and server, version negotiation is performed to ensure that the client and server agree on a QUIC version that is mutually supported. In BIG-IP 15.1.0.1, the HTTP/3 and QUIC profiles in the BIG-IP system are experimental implementations of draft-ietf-quic-http-24 and draft-ietf-quic-transport-24, respectively. You need to consider this when configuring the Alt-Svc header in HTTP/3 discovery. For some browsers, such as those from the Chromium project (Chrome Canary, Microsoft Edge Canary, and Opera), you need to provide the QUIC version the browser implements when starting it from the command line. For example, for Chrome Canary, you run the following command:

```
chrome.exe --enable-quic --quic-version=h3-25
```

Only implementations of the final, published RFC can identify themselves as h3. Until such an RFC exists, implementations must not identify themselves using the h3 string.

1. Install an HTTP/3 command line client

Keep in mind that HTTP/3 over QUIC runs on UDP instead of TCP. By default, browsers always initiate a connection to the server using the traditional TCP handshake, which will not work with a QUIC server listening for UDP packets. You therefore need to configure HTTP/3 discovery on your BIG-IP system. This can be done by using the HTTP Alternative Services concept, which can be implemented either by inserting the Alt-Svc header or via DNS as an HTTPSVC DNS resource record. To insert the Alt-Svc header, refer to K16240003: Configuring HTTP/3 discovery for BIG-IP virtual server.
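Once discovery is configured, a quick way to confirm that the Alt-Svc header is actually being returned is to inspect the response headers over ordinary HTTPS. Here is a minimal sketch, assuming www.example.com stands in for your virtual server's hostname and that your deployment advertises draft version h3-25; the draft token you see must match what your clients implement:

```
$ curl -skI https://www.example.com/ | grep -i alt-svc
Alt-Svc: h3-25=":443"; ma=86400
```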
As you troubleshoot your HTTP/3 discovery implementation, you can use a command line tool that does not come with the overhead of HTTP/3 discovery. The following are two popular tools that you can install on your client system:

- The picoquic client
- The curl client, where you have the option to use either the ngtcp2 or quiche software libraries

2. Use a browser that supports QUIC

At this time, browsers do not support QUIC by default and do not send the server UDP packets to establish a QUIC connection. The following browsers, which are in development, support it:

- Firefox Nightly
- Chrome Canary (Chromium Project)
- Microsoft Edge Canary (Chromium Project)
- Opera (Chromium Project)

Note that for browsers from the Chromium project, you need to specify the QUIC IETF version that the browser supports when you launch it. For example, for Chrome, run the following command:

```
chrome.exe --enable-quic --quic-version=h3-25
```

In most browsers today, you can quickly view the HTTP information exchanged by using the built-in developer tool. To open the tool, select F12 after your browser opens and access any site that supports QUIC. Select the Network tab and, under the Protocol column, look for h3-<draft_version>. If the Protocol column is not there, you may have to right-click the toolbar to add it.

Note: Only implementations of the final, published RFC can identify themselves as h3. Until such an RFC exists, implementations must not identify themselves using the h3 string.

Click the name of the HTTP request, and you can see that the site returns the Alt-Svc header indicating that it supports HTTP/3 with its IETF draft version.

3. Review statistics on your BIG-IP system

The statistics facility on the BIG-IP system displays the system's QUIC traffic processing. In the Configuration utility, go to Local Traffic > Virtual Servers. Select the name of your virtual server and select the Statistics tab. In the Profiles section, select the HTTP/3 and QUIC profiles associated with the virtual server. Alternatively, you can view the statistics from the TMOS shell (tmsh) utility using the following command syntax:

```
tmsh show ltm profile http3 <http3_profile_name>
tmsh show ltm profile quic <quic_profile_name>
```

4. Enable QUIC debug logging on the BIG-IP system

You can use the sys db variable tmm.quic.log.level to adjust the verbosity of QUIC logging to the /var/log/ltm file. Type the following command to see the list of values:

```
tmsh list sys db tmm.quic.log.level value-range
sys db tmm.quic.log.level {
    default-value "Critical"
    value-range "Critical Error Warning Notice Info Debug"
}
```

For example:

```
tmsh modify sys db tmm.quic.log.level value debug
```
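A convenient way to put this to work is to raise the log level, watch the log while you reproduce the problem, and then restore the default. Here is a minimal sketch from the BIG-IP command line; the grep filter is only a convenience and assumes the messages of interest mention "quic":

```
tmsh modify sys db tmm.quic.log.level value debug
tail -f /var/log/ltm | grep -i quic
# Reproduce the issue, review the output, then restore the default:
tmsh modify sys db tmm.quic.log.level value Critical
```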
5. Perform advanced troubleshooting with packet tracing

a. Use the NetLog feature from the Chromium Project to capture a NetLog dump

NetLog is an event logging mechanism for Chrome's network stack that helps you debug problems and analyze performance, not just for HTTP/3 over QUIC traffic but also for HTTP/1.1 and HTTP/2. This feature is available only in browsers from the Chromium project, such as Google Chrome, Opera, and Microsoft Edge. The feature provides detailed client-side logging, including the SSL handshake and HTTP content, without having to perform decryption or run any commands on your BIG-IP system.

To start capturing, open a new tab in your browser and go to, for example, chrome://net-export (Chromium only). For a step-by-step guide, refer to How to capture a NetLog dump. Once you have your NetLog dump, you can view and analyze it by navigating to netlog-viewer (Chromium only). To analyze QUIC traffic, on the left panel, select Events. In the Description column, identify the URL you requested. For QUIC SSL handshake events, select QUIC_SESSION. For HTTP content, select URL_REQUEST. For example, in one such NetLog dump, the connection failed at the beginning because the client and server could not negotiate a common QUIC version.

b. Use the tcpdump command and Wireshark to capture and analyze traffic

The tcpdump command and Wireshark are both essential tools when you need to examine any communication at the packet level. To generate captures and the SSL secrets required to decrypt them, follow the procedure in K05822509: Decrypting HTTP/3 over QUIC with Wireshark. Keep your Wireshark version updated at all times, as Wireshark's ability to decode QUIC packets continues to evolve as we speak.

c. Use the qlog traces on the BIG-IP system

The BIG-IP qlog trace facility provides another tool to troubleshoot QUIC communications. By enabling a database variable, the system logs packets and other events to /var/log/trace<TMM_number>.qlog files. qlog is a standardized structured logging format for QUIC; it is essentially a well-defined JSON file with specified event names and associated data that allows you to use tools like qvis for visualization. Note that the payload is not logged. The qlog trace files are compliant with the IETF schema specified in draft-marx-qlog-main-schema-01. To capture and analyze qlog trace files on your BIG-IP system, perform the following procedure.

Capturing qlog trace files on the BIG-IP system

1. Log in to the BIG-IP command line.
2. Enable qlogging by typing the following command:

```
tmsh modify sys db quic.qlogging value enable
```

3. Reproduce the issue you are troubleshooting by initiating QUIC traffic to your virtual server.
4. Disable qlogging by typing the following command:

```
tmsh modify sys db quic.qlogging value disable
```

Note: This step logs the required closing JSON content to the trace files and terminates the trace logging gracefully; it is required before you view the files.

Sanitizing the qlog trace files

Before loading the trace files onto a graphical visualization tool, you first need to sanitize the JSON content. The tools attempt to fix some common JSON errors, but there may be cases where you need to manually correct some JSON syntax errors by adding closing braces or commas. Note: Knowledge of different JSON constructs, such as objects, arrays, and members, may be required when you fix the JSON files. You can use any of the available online tools, such as the following:

- Json Formatter
- Fixjson
- freeformatter

Important: Even though payload information such as IP addresses or HTTP content is not included in the trace files, you should exercise caution when uploading content to online tools. F5 is not responsible for the privacy and security of your data when you use the third-party software listed in this procedure.

Alternatively, you can download and install any of the following command line tools on your client device (or use the quick check shown after this list):

- jsonlint-php
- jsonlint-py
- jsonlint-cli
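If Python is available on your workstation, its built-in json.tool module also offers a quick offline validity check without installing anything or uploading your traces. A minimal sketch, assuming a trace file from TMM 0 (the file name is an example):

```
python3 -m json.tool /var/log/trace0.qlog > /dev/null && echo "valid JSON"
```

If the file is malformed, json.tool prints the line and column of the first error, which tells you where to add the missing brace or comma.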
Loading and analyzing the qlog trace files with a visualization tool

When you have sanitized your JSON trace files, upload them to a visualization tool for analysis. For example, you can use the following tool, available for free:

- qvis: QUIC and HTTP/3 visualization toolsuite

The visualization tool can provide you graphical representations of the sequence of messages, congestion information, and qlog stats for troubleshooting. For example, a sequence diagram can illustrate the SSL handshake.

Important: Even though payload information such as IP addresses or HTTP content is not included in the trace files, you should exercise caution when uploading content to online tools. F5 is not responsible for the privacy and security of your data when you use the third-party software listed in this procedure.

Summary

As you use the tools described in this article, you will notice that each one helps you troubleshoot issues at different OSI layers. The built-in developer tools in each browser provide a quick and easy way to view HTTP content but do not let you see the details of the protocol, as the NetLog tool and Wireshark do. However, viewing qlog traces in the qvis graphical tool provides high-level trends and statistics that packet captures do not show. Using the appropriate tool with the right troubleshooting methodology maximizes the potential of HTTP/3 and QUIC for your organization.
Upgrade your BIG-IP systems for more secure apps

Keeping your BIG-IP software up to date ensures you have access to advanced capabilities and higher-quality, more secure releases. At a minimum, F5 recommends implementing BIG-IP 14.1.0 across your BIG-IP fleet. BIG-IP 15.1.0 and 16.0.0 will reward you with the most advanced BIG-IP functionality, including new protocol support and security capabilities.

Walkthrough Videos

For a walkthrough of three common upgrade scenarios, watch the following videos:

- Performing a software upgrade on a BIG-IP system (standalone)
- Updating BIG-IP HA systems with a point release
- Upgrading BIG-IP VIPRION vCMP systems

Automation

For information about upgrading with the assistance of automation, watch DevCentral Connects: Automating Software Updates with BIG-IQ or Ansible.

Official Training

F5 has temporarily, and until further notice, made the following training available for free:

- BIG-IP Fundamentals: Upgrading a BIG-IP System

For more information about upgrading, refer to the F5 page on BIG-IP upgrade.
Next Quarterly Security Notification is October 19, 2022

F5 discloses security vulnerabilities and security exposures for F5 products in a Quarterly Security Notification (QSN). On the day of the last QSN, August 3, 2022, F5 announced that the next QSN will occur Wednesday, October 19, 2022. QSN dates are published in advance so that customers can schedule updates and business operations ahead of the public disclosure date.

K67091411: Guidance for Quarterly Security Notifications includes steps you can take before and after a QSN, such as scheduling maintenance windows in advance, saving a UCS archive backup file, and planning for any upgrades that may be required. K67091411 also includes links to articles detailing additional security best practices.
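As one example of that preparation, saving a UCS archive is a single command from the BIG-IP command line. A minimal sketch (the file name is an example; store a copy off-box as well):

```
tmsh save /sys ucs /var/local/ucs/pre-qsn-backup.ucs
```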
Upgrade your BIG-IQ system: Videos and procedures

AskF5 published a new series of videos and procedures to take you through the BIG-IQ upgrade process.

Videos

In this playlist, you'll find the following videos:

Preparing for a BIG-IQ upgrade

- Part 1: Checking the BIG-IQ DCD configuration (1:38)
- Part 2: Downloading and uploading files (2:23)
- Part 3: Running the pre-upgrade check tool (4:43)
- Part 4: Ensuring you have multiple DCDs (1:17)

Upgrading the BIG-IQ and DCDs

- Part 5: Upgrading the BIG-IQ system (3:20)
- Part 6: Upgrading the secondary system (2:21)

Completing the post-upgrade process

- Part 7: Adding back the secondary BIG-IQ system (1:28)
- Part 8: Re-importing the devices and services (2:37)
- Part 9: (VMware) Installing the certificate (1:37)

Procedures

To review the written procedures, refer to K51342220: Upgrading the BIG-IQ videos and procedures.
AskF5's new BIG-IP upgrade guide

AskF5 published the new BIG-IP upgrade guide to help you through the BIG-IP upgrade process. At a minimum, F5 recommends that you upgrade your BIG-IP appliances to at least version 14.1.x and your BIG-IP VE systems to at least version 15.1.x. Because F5 customers use a variety of configurations and tools, there are many upgrade scenarios. In this new Operations Guide, you'll find the following topics:

Prepare to upgrade

- Chapter 1: Guide introduction and contents
- Chapter 2: Choosing a BIG-IP upgrade version
- Chapter 3: Preparing to upgrade the BIG-IP system

BIG-IP hardware and VE platforms

BIG-IP systems

- Chapter 4: Upgrading a standalone BIG-IP system
- Chapter 5: Upgrading a standalone BIG-IP system using the TMOS Shell

VIPRION BIG-IP systems

- Chapter 6: Upgrading a VIPRION HA pair
- Chapter 7: Upgrading BIG-IP VIPRION vCMP systems

Other BIG-IP vCMP systems

- Chapter 8: Upgrading BIG-IP vCMP systems (non-VIPRION)

BIG-IP cloud platforms

BIG-IP on AWS

- Chapter 9: Upgrading a BIG-IP VE instance on AWS using the f5-aws-migrate.py script

BIG-IP on Microsoft Azure

- Chapter 10: Upgrading a BIG-IP Azure VM using the Azure portal to deploy an ARM template
- Chapter 11: Upgrading a BIG-IP Azure VM using Terraform to deploy an ARM template
- Chapter 12: Upgrading a BIG-IP Azure VM using Ansible to deploy an ARM template

BIG-IP on GCP

- Chapter 13: Upgrading a BIG-IP GCP VM using Terraform

Automation

- Chapter 14: Using F5 Modules for Ansible to upgrade BIG-IP system software

Send Feedback

This guide is a work in progress; request the upgrade scenarios you want to see in this guide. Use the feedback or survey button at the bottom of any guide page, or email TellAskF5@f5.com.

Related Content

For supported paths and general considerations, refer to K13845: Overview of supported BIG-IP upgrade paths and an upgrade planning reference.
Simplifying Application Study Monitoring with F5 BIG-IP

A simple agreement between BIG-IP administrators and application owners can foster smooth collaboration between teams. Application owners define their own simple or complex health monitors and agree to expose a conventional /health endpoint. When the /health endpoint responds with an HTTP 200 status, the BIG-IP assumes the application is healthy based on the application owners' own criteria.

The Challenge of Health Monitoring in Modern Environments

F5 BIG-IP administrators in Network Operations (NetOps) teams often work with application teams because the BIG-IP acts as a full proxy, providing services like:

- TLS termination
- Load balancing
- Health monitoring

Health checks are crucial for effective load balancing. The BIG-IP uses them to determine where to send traffic among back-end application servers. However, health monitoring frequently causes friction between teams.

Problems with the Traditional Approach

Traditionally, BIG-IP administrators create and maintain health monitors ranging from simple ICMP pings to complex monitors that:

- Simulate user transactions
- Verify HTTP response codes
- Validate payload contents
- Track application dependencies

This leads to several issues:

- Knowledge Gap: NetOps may not fully grasp each application's intricacies.
- Change Management Overhead: Application updates require retesting monitors, causing delays.
- Production Risk: Monitors can break after application changes, incorrectly marking services as up/down.
- Team Friction: Troubleshooting failed health checks involves tedious back-and-forth between teams.

A Cloud-Native Solution

The cloud-native and microservices communities have patterns that elegantly solve these problems. One widely used pattern is the health endpoint, which adapts well to BIG-IP environments.

The /health Endpoint Convention

Cloud-native applications commonly expose dedicated health endpoints like /health, /healthy, or /ready. These return standard status codes reflecting the application's state. The /health endpoint provides a clear contract between NetOps and application teams for BIG-IP integration.

Implementing the Contract

This approach establishes a simple agreement:

Application Team Responsibilities:

- Implement /health to return HTTP 200 when the application is ready for traffic
- Define "healthy" based on application needs (database connectivity, dependencies, etc.)
- Maintain the health check logic as the application changes

BIG-IP Team Responsibilities:

- Configure an HTTP monitor targeting the /health endpoint
- Treat 200 as "healthy", anything else as "unhealthy"

Benefits of This Approach

- Aligned Expertise: Application teams define health based on their knowledge.
- Less Friction: BIG-IP configuration stays stable as applications evolve.
- Better Reliability: Health checks reflect true application health, including dependencies.
- Easier Troubleshooting: The /health endpoint can return detailed diagnostic info; the BIG-IP ignores the body, and the detail is used strictly for troubleshooting.
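Before wiring anything into the BIG-IP, either team can exercise the contract by hand with a plain HTTP request. A minimal sketch, assuming the application from the examples below listens on app.example.com port 3000 (both are placeholder names, and the response body shown is illustrative):

```
$ curl -i http://app.example.com:3000/health
HTTP/1.1 200 OK
Content-Type: application/json

{"status":"healthy","database":"connected","services":"available"}
```

The only thing the BIG-IP monitor will care about is the 200 status line; the JSON body is diagnostic detail for humans.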
Implementation Examples

F5 BIG-IP Health Monitor Configuration

```
ltm monitor http /Common/app-health-monitor {
    defaults-from /Common/http
    destination *:*
    interval 5
    recv 200
    recv-disable none
    send "GET /health HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n"
    time-until-up 0
    timeout 16
}
```

Node.js Health Endpoint Implementation

```
const express = require('express');
const app = express();
const port = 3000;

app.get('/', (req, res) => {
  res.send('Application is running');
});

app.get('/health', async (req, res) => {
  try {
    const dbStatus = await checkDatabaseConnection();
    const serviceStatus = await checkDependentServices();

    if (dbStatus && serviceStatus) {
      return res.status(200).json({
        status: 'healthy',
        database: 'connected',
        services: 'available',
        timestamp: new Date().toISOString()
      });
    }

    res.status(503).json({
      status: 'unhealthy',
      database: dbStatus ? 'connected' : 'disconnected',
      services: serviceStatus ? 'available' : 'unavailable',
      timestamp: new Date().toISOString()
    });
  } catch (error) {
    res.status(500).json({
      status: 'error',
      message: error.message,
      timestamp: new Date().toISOString()
    });
  }
});

async function checkDatabaseConnection() {
  // Check real database connection
  return true;
}

async function checkDependentServices() {
  // Check required service connections
  return true;
}

app.listen(port, () => {
  console.log(`Application listening at http://localhost:${port}`);
});
```

Adopting this health check pattern can greatly reduce friction between NetOps and application teams while improving reliability. The simple contract of HTTP 200 for "healthy" provides the needed integration while letting each team focus on its expertise.

For apps that can't implement a custom /health endpoint, BIG-IP admins can still use traditional ICMP or TCP port monitoring. However, these basic checks can't accurately reflect an app's true health and complex dependencies.

This approach fosters collaboration and leverages the specialized knowledge of both network and application teams. The result is more reliable services and smoother operations.
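As a closing practical note, once the monitor object exists, attaching it to the application's pool is a one-line operation from tmsh. A minimal sketch; the pool name and member addresses are hypothetical, and the member port matches the Node.js example above:

```
tmsh create ltm pool /Common/app-pool members add { 10.1.20.11:3000 10.1.20.12:3000 } monitor /Common/app-health-monitor
```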