Telemetry
Getting started with F5 Distributed Cloud (XC) Telemetry
Introduction:
This is an introductory article in the F5 Distributed Cloud (XC) telemetry series, covering the basics. Subsequent articles will focus on exporting and visualizing logs and metrics from the XC platform with telemetry tools such as the ELK Stack, Loki, Prometheus, and Grafana.

What is Telemetry?
Telemetry refers to the collection and transmission of various kinds of data from remote systems to a central receiving entity for monitoring, analyzing, and improving the performance, reliability, and security of those systems.

Telemetry data involves:
Metrics: Quantitative data such as request rates, error rates, and request/response throughput, collected at regular intervals over a period of time.
Logs: Textual, time- and event-based records generated by applications, such as request logs and security logs.
Traces: Information about the journey of a request as it flows across multiple services in a distributed system.
Alerts: Alerts use telemetry data to set thresholds and send real-time notifications, allowing organizations to act quickly when their systems don't behave as expected. This makes alerting a critical pillar of observability.

Overview:
The F5 Distributed Cloud platform is designed to meet the needs of today's modern, distributed applications. It provides delivery, security, and observability across multiple clouds, hybrid clouds, and edge environments, generating telemetry data that can be viewed in XC's own dashboards. There are times, however, when customers want to collect their application's telemetry data from different platforms into their own SIEM systems. To meet this requirement, XC provides the Global Log Receiver (GLR), which sends XC logs to a customer's log collection system. Along with this, XC also exposes APIs that provide metrics data, which exporter scripts can fetch, parse, and process into a format that telemetry tools understand; a minimal sketch of this approach appears later in this article.

As shown in the diagram above, a few steps are involved before raw telemetry data can be presented in dashboards: data collection, storage, and processing from remote systems. Only then is the telemetry data sent to visualization tools for real-time monitoring and observability. Several telemetry tools are available for this, such as Prometheus (used for collecting, storing, and analyzing metrics), the ELK Stack, and Grafana. A brief description of a few such tools is included below.

F5 XC Global Log Receiver:
The F5 XC Global Log Receiver facilitates sending XC logs (request, audit, security event, and DNS request logs) to an external log collection system. The sent logs include all system and application logs of the F5 XC tenant. Global Log Receiver supports sending logs to the following log collection systems:
AWS Cloudwatch
AWS S3
HTTP Receiver
Azure Blob Storage
Azure Event Hubs
Datadog
GCP Bucket
Generic HTTP or HTTPs server
IBM QRadar
Kafka
NewRelic
Splunk
SumoLogic
More information on how to set up and configure XC GLR can be found in this document.
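Alongside GLR's push model for logs, the exporter-script approach mentioned in the overview boils down to pulling JSON from an XC API endpoint and re-emitting it in whatever shape the downstream tool expects. The following is a minimal, hedged sketch of that pattern in Python; the tenant URL, API path, authentication header, and response fields are illustrative placeholders rather than documented XC endpoints — the Prometheus article later in this series works against the actual Service Graph API.

```python
# Hedged sketch only: the tenant URL, API path, token header, and JSON field
# names below are illustrative placeholders, not documented XC endpoints.
import requests

TENANT = "https://example-tenant.console.ves.volterra.io"   # placeholder tenant URL
API_TOKEN = "REPLACE_ME"                                     # assumption: API-token auth
METRICS_PATH = "/api/data/namespaces/example-ns/metrics"     # hypothetical metrics path

def fetch_metrics():
    """Pull raw metrics JSON from an XC API endpoint and print it in a flat,
    tool-friendly key/value form."""
    resp = requests.get(
        TENANT + METRICS_PATH,
        headers={"Authorization": "APIToken " + API_TOKEN},
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json()
    # The real response structure depends on the endpoint; here we simply
    # flatten whatever top-level numeric fields we find.
    for key, value in data.items():
        if isinstance(value, (int, float)):
            print(f"{key} {value}")

if __name__ == "__main__":
    fetch_metrics()
```

A real exporter would map these values onto the input format of the chosen tool (Prometheus exposition text, Elasticsearch documents, and so on), which is exactly what the later articles in this series do.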
Observability/Monitoring Tools:
Note: Below is a brief description of a few monitoring tools commonly used by organizations.

Prometheus:
Prometheus is an open-source monitoring and alerting tool designed for collecting, storing, and analyzing time-series data (metrics) from modern, cloud-native, and distributed systems. It scrapes metrics from targets via HTTP endpoints, stores them in its optimized time-series database, and allows querying with the powerful PromQL language. Prometheus integrates seamlessly with tools like Grafana for visualization and includes Alertmanager for real-time alerting. It also integrates with Kubernetes and can continuously discover and monitor services on remote systems.

Loki:
Loki is a lightweight, open-source log aggregation tool designed for storing and querying logs from remote systems. Unlike traditional log management systems, Loki does not index the log content; instead, it attaches labels to each log stream, which keeps it efficient. It is designed to work alongside metrics and is often paired with Prometheus. Logs can be queried using LogQL, a PromQL-like language. It is best suited for debugging and monitoring logs in cloud-native or containerized environments such as Kubernetes.

Grafana:
Grafana is an open-source visualization and analytics platform for creating real-time dashboards from diverse data sources. It integrates with tools such as Prometheus, Loki, Elasticsearch, and more. Grafana enables users to visualize trends, monitor performance, and set up alerts through a highly customizable interface.

ELK Stack:
The ELK Stack (Elasticsearch, Logstash, Kibana) is a powerful open-source solution for log management, search, and analytics. Elasticsearch handles storing, indexing, and querying data. Logstash ingests, parses, and transforms logs from various sources. Kibana provides an interactive interface for visualizing data and building dashboards.

Conclusion:
Telemetry turns system data into actionable insights, enabling real-time visibility, early detection of issues, and performance tuning, thereby ensuring system reliability, security, stability, and efficiency. In this article, we explored some of the foundational building blocks and essential tools that set the stage for the topics covered in the upcoming articles of this series.

Related Articles:
F5 Distributed Cloud Telemetry (Logs) - ELK Stack
F5 Distributed Cloud Telemetry (Metrics) - ELK Stack
F5 Distributed Cloud Telemetry (Logs) - Loki
F5 Distributed Cloud Telemetry (Metrics) - Prometheus

References:
XC Global Log Receiver
Prometheus
ELK Stack
Loki
F5 Distributed Cloud Telemetry (Metrics) - ELK Stack

As we are looking at exporting metrics data to the ELK Stack using a Python script, let's first get a high-level overview. Metrics are numerical values that provide actionable insight into the performance, health, and behavior of systems or applications over time, allowing teams to monitor and improve the reliability, stability, and performance of modern distributed systems. The ELK Stack (Elasticsearch, Logstash, and Kibana) is a powerful open-source platform. It enables organizations to collect, process, store, and visualize telemetry data such as logs, metrics, and traces from remote systems in real time. A minimal sketch of writing a metric into Elasticsearch is shown below.
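The full exporter belongs to its own article and script; the core idea of indexing a metric sample into Elasticsearch can, however, be sketched in a few lines. This is only an illustration under assumptions of my own: the index name, document fields, and local Elasticsearch address are hypothetical and not taken from the article's script.

```python
# Hedged illustration: index name, field names, and the local Elasticsearch
# address are placeholders, not values from the article's exporter.
import datetime
import requests

ES_URL = "http://localhost:9200"   # assumption: local, unsecured Elasticsearch
INDEX = "f5xc-metrics"             # hypothetical index name

def push_metric(name, value, labels=None):
    """Index a single metric sample as a JSON document."""
    doc = {
        "@timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "metric_name": name,
        "metric_value": value,
        "labels": labels or {},
    }
    resp = requests.post(f"{ES_URL}/{INDEX}/_doc", json=doc, timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    # Example sample; in a real exporter these values would come from the XC API.
    push_metric("downstream_http_request_rate", 12.5, {"vhost": "example-lb"})
```

Once documents like this exist, a Kibana data view matching the index (for example f5xc-metrics*) makes them searchable and chartable.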
F5 Distributed Cloud Telemetry (Logs) - Loki

Scope
This article walks through the process of integrating log data from F5 Distributed Cloud's (F5 XC) Global Log Receiver (GLR) with Grafana Loki. By the end, you'll have a working log pipeline in which logs sent from F5 XC can be visualized and explored through Grafana.

Introduction
Observability is a critical part of managing modern applications and infrastructure. F5 XC offers the GLR as a centralized system to stream logs from across distributed services. Grafana Loki, part of the Grafana observability stack, is a powerful and efficient tool for aggregating and querying logs. To improve observability, you can forward logs from F5 XC into Loki for centralized log analysis and visualization. This article shows you how to implement a lightweight Python webhook that bridges F5 XC GLR with Grafana Loki. The webhook acts as a log ingestion and transformation service, enabling logs to flow seamlessly into Loki for real-time exploration via Grafana.

Prerequisites
Access to an F5 Distributed Cloud (XC) SaaS tenant with GLR set up
VM with Python3 installed
Running Loki instance (if not, see the "Configuring Loki and Grafana" section below)
Running Grafana instance (if not, see the "Configuring Loki and Grafana" section below)
Note – In this demo, an AWS VM is used with Python3 installed and running the webhook (port 5000), with Loki (port 3100) and Grafana (port 3000) running as Docker instances, all on the same VM.

Architecture Overview
F5 XC GLR → Python Webhook → Loki → Grafana

F5 XC GLR Configuration
Follow the steps in the F5 XC GLR documentation to set up and configure the Global Log Receiver (GLR).

Building the Python Webhook
To send the log data from the F5 Distributed Cloud Global Log Receiver (GLR) to Grafana Loki, we used a lightweight Python webhook implemented with the Flask framework. This webhook acts as a simple transformation and relay service. It receives raw log entries from F5 XC, repackages them in the structure Loki expects, and pushes them to a Loki instance running on the same virtual machine; a hedged sketch of such a webhook is included further below.

Key Functions of the Webhook
Listens for Log Data: The webhook exposes an endpoint (/glr-webhook) on port 5000 that accepts HTTP POST requests from the GLR. Each request can contain one or more newline-separated log entries.
Parses and Structures the Logs: Incoming logs are expected to be JSON-formatted. The webhook parses each line individually and assigns a consistent timestamp (in nanoseconds, as required by Loki).
Formats the Payload for Loki: The logs are then wrapped in a structure that conforms to Loki's push API format. This includes organizing them into a stream, which can be labeled (e.g., with a job name like f5-glr) to make logs easier to query and group in Grafana.
Pushes Logs to Loki: Once formatted, the webhook sends the payload to the Loki HTTP API using a standard POST request. If the request is successful, Loki returns a 204 No Content status.
Handles Errors Gracefully: The webhook includes basic error handling for malformed JSON, network issues, and unexpected failures, returning appropriate HTTP responses.

Running the Webhook
```
python3 webhook.py > python.log 2>&1 &
```
This command runs webhook.py with Python3 in the background and redirects all standard output and error messages to python.log for easier debugging.
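The article's webhook.py is not reproduced here; for readers who want to see the shape of such a service, below is a minimal, hedged sketch that follows the key functions described above. It is not the original script — error handling is reduced to the basics, and the Loki URL and label values are assumptions matching the demo layout (everything on one VM).

```python
# Minimal sketch of a GLR-to-Loki relay, not the article's webhook.py.
# Assumptions: Loki listens on localhost:3100 and the stream label is job=f5-glr.
import json
import time

import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
LOKI_PUSH_URL = "http://localhost:3100/loki/api/v1/push"  # assumption: same-VM Loki

@app.route("/glr-webhook", methods=["POST"])
def glr_webhook():
    raw = request.get_data(as_text=True)
    now_ns = str(time.time_ns())  # Loki expects nanosecond timestamps as strings
    values = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            json.loads(line)           # validate that each entry is JSON
        except ValueError:
            return jsonify({"error": "malformed JSON log line"}), 400
        values.append([now_ns, line])  # keep the original log line as the payload

    payload = {"streams": [{"stream": {"job": "f5-glr"}, "values": values}]}
    try:
        resp = requests.post(LOKI_PUSH_URL, json=payload, timeout=10)
    except requests.RequestException as exc:
        return jsonify({"error": str(exc)}), 502
    # Loki returns 204 No Content on success
    return ("", resp.status_code)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Once logs are flowing, they can be queried from Grafana's Explore view with a LogQL selector such as {job="f5-glr"}, matching the label used in the stream above.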
Configuring Loki and Grafana
```
docker run -d --name=loki -p 3100:3100 grafana/loki:latest
docker run -d --name=grafana -p 3000:3000 grafana/grafana:latest
```
Loki and Grafana run as Docker instances on the same VM; the private IP of the Loki Docker instance, along with its port, is used as the data source in the Grafana configuration. Once Loki is configured under Grafana Data sources, follow the steps below:
Navigate to the Explore menu
Select "Loki" in the data source picker
Choose the appropriate label and value, in this case label=job and value=f5-glr
Select the desired time range and click "Run query"
Observe that logs are displayed based on the "Log Type" selected in the F5 XC GLR configuration
Note: Some requests need to be generated for logs to be visible in Grafana, based on the Log Type selected.

Conclusion
F5 Distributed Cloud's (F5 XC) Global Log Receiver (GLR) unlocks real-time observability by integrating with open-source tools like Grafana Loki. This reflects F5 XC's commitment to open source, enabling seamless log management with minimal overhead. A customizable Python webhook ensures adaptability to evolving needs. Centralizing logs in Loki and visualizing them in Grafana empowers teams with actionable insights, accelerating troubleshooting and optimization. F5 XC GLR's flexibility future-proofs observability strategies. This integration showcases F5's dedication to interoperability and to empowering customers with community-driven solutions.
F5 Distributed Cloud Telemetry (Metrics) - Prometheus

Scope
This article walks through the process of collecting metrics from F5 Distributed Cloud's (XC) Service Graph API and exposing them in a format that Prometheus can scrape. Prometheus then scrapes these metrics, which can be visualized in Grafana.

Introduction
Metrics are essential for gaining real-time insight into service performance and behaviour. F5 Distributed Cloud (XC) provides a Service Graph API that captures service-to-service communication data across your infrastructure. Prometheus, a leading open-source monitoring system, can scrape and store time-series metrics — and when paired with Grafana, offers powerful visualization capabilities. This article shows how to integrate a custom Python-based exporter that transforms Service Graph API data into Prometheus-compatible metrics. These metrics are then scraped by Prometheus and visualized in Grafana, all running in Docker for easy deployment.

Prerequisites
Access to an F5 Distributed Cloud (XC) SaaS tenant
VM with Python3 installed
Running Prometheus instance (if not, see the "Configuring Prometheus" section below)
Running Grafana instance (if not, see the "Configuring Grafana" section below)
Note – In this demo, an AWS VM is used with Python installed and running the exporter (port 8888), with Prometheus (host port 9090) and Grafana (port 3000) running as Docker instances, all on the same VM.

Architecture Overview
F5 XC API → Python Exporter → Prometheus → Grafana

Building the Python Exporter
To collect metrics from the F5 Distributed Cloud (XC) Service Graph API and expose them in a format Prometheus understands, we created a lightweight Python exporter using Flask. This exporter acts as a transformation layer — it fetches service graph data, parses it, and exposes it through a /metrics endpoint that Prometheus can scrape; a hedged sketch of the exposition side is included further below.
Code Link -> exporter.py

Key Functions of the Exporter
Uses the XC-Provided .p12 File for Authentication: To authenticate API requests to F5 Distributed Cloud (XC), the exporter uses a client certificate packaged in a .p12 file. This file must be manually downloaded from the F5 XC console (steps) and stored on the VM where the Python script runs. The script expects the full path to the .p12 file and its associated password to be specified in the configuration section.
Fetches Service Graph Metrics: The script pulls service-level metrics such as request rates, error rates, throughput, and latency from the XC API. It supports both aggregated and individual load balancer views.
Processes and Structures the Data: The exporter parses the raw API response to extract the latest metric values and converts them into the Prometheus exposition format. Each metric is labelled (e.g., by vhost and direction) for flexibility in Grafana queries.
Exposes a /metrics Endpoint: A Flask web server runs on port 8888, serving the /metrics endpoint. Prometheus periodically scrapes this endpoint to ingest the latest metrics.
Handles Multiple Metric Types: Traffic metrics and health scores are handled and formatted individually. Each metric includes a descriptive name, type declaration, and optional labels for fine-grained monitoring and visualization.

Running the Exporter
```
python3 exporter.py > python.log 2>&1 &
```
This command runs exporter.py with Python3 in the background and redirects all standard output and error messages to python.log for easier debugging.
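The real exporter is linked above as exporter.py; the sketch below only illustrates the exposition side — a Flask /metrics endpoint emitting Prometheus text format. The metric name matches the one queried later in this article (f5xc_downstream_http_request_rate), but the fetch_from_service_graph() stub, its return shape, and the omitted .p12 handling are placeholders, not XC API calls.

```python
# Sketch of the /metrics exposition side only; fetch_from_service_graph() is a
# placeholder stub, and real .p12-based authentication to the XC API is omitted.
from flask import Flask, Response

app = Flask(__name__)

def fetch_from_service_graph():
    """Placeholder for the XC Service Graph API call.
    Returns {(vhost, direction): request_rate} with dummy values."""
    return {("example-lb", "downstream"): 12.5}

@app.route("/metrics")
def metrics():
    lines = [
        "# HELP f5xc_downstream_http_request_rate HTTP request rate per vhost",
        "# TYPE f5xc_downstream_http_request_rate gauge",
    ]
    for (vhost, direction), rate in fetch_from_service_graph().items():
        lines.append(
            f'f5xc_downstream_http_request_rate{{vhost="{vhost}",direction="{direction}"}} {rate}'
        )
    # Prometheus expects the plain-text exposition format, one sample per line
    return Response("\n".join(lines) + "\n", mimetype="text/plain")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8888)
```

Prometheus then scrapes this endpoint according to the scrape configuration described in the next section.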
Configuring Prometheus
```
docker run -d --name=prometheus --network=host -v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus:latest
```
Prometheus runs as a Docker instance in host network mode (port 9090), scraping the /metrics endpoint exposed by the Python Flask exporter on port 8888 every 60 seconds; a representative prometheus.yml for this setup is sketched at the end of this article.

Configuring Grafana
```
docker run -d --name=grafana -p 3000:3000 grafana/grafana:latest
```
The private IP of the Prometheus Docker instance, along with its port (9090), is used as the data source in the Grafana configuration. Once Prometheus is configured under Grafana Data sources, follow the steps below:
Navigate to the Explore menu
Select "Prometheus" in the data source picker
Choose the appropriate metric, in this case "f5xc_downstream_http_request_rate"
Select the desired time range and click "Run query"
Observe that the metrics graph is displayed
Note: Some requests need to be generated for metrics to be visible in Grafana. A broader, high-level view of all metrics can be accessed by navigating to "Drilldown" and selecting "Metrics", providing a comprehensive snapshot across services.

Conclusion
F5 Distributed Cloud's (F5 XC) Service Graph API provides deep visibility into service-to-service communication, and when paired with Prometheus and Grafana, it enables powerful, real-time monitoring without vendor lock-in. This integration highlights F5 XC's alignment with open-source ecosystems, allowing users to build flexible and scalable observability pipelines. The custom Python exporter bridges the gap between the XC API and Prometheus, offering a lightweight and adaptable solution for transforming and exposing metrics. With Grafana dashboards on top, teams can gain instant insight into service health and performance. This open approach empowers operations teams to respond faster, optimize more effectively, and evolve their observability practices with confidence and control.
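A representative prometheus.yml matching the setup described above — a 60-second scrape against the Flask exporter on port 8888 — might look like the following; the job name and target address are assumptions rather than values from the original file.

```yaml
# Hedged reconstruction of prometheus.yml; the job name and target are assumptions.
global:
  scrape_interval: 60s

scrape_configs:
  - job_name: "f5xc-exporter"
    static_configs:
      - targets: ["localhost:8888"]   # Flask exporter /metrics endpoint
```

Because the Prometheus container runs with --network=host, localhost here resolves to the VM itself, where the exporter is listening.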
F5 Distributed Cloud Telemetry (Logs) - ELK Stack

Introduction:
This article is part of the F5 Distributed Cloud (F5 XC) telemetry series. Here we discuss how to export logs from the XC console to the ELK Stack using XC's GLR (Global Log Receiver).

F5 Distributed Cloud GLR (Global Log Receiver):
Global Log Receiver is a feature provided by Distributed Cloud. It enables customers to send logs from the F5 Distributed Cloud (F5 XC) console dashboards to their respective centralized SIEM tools, such as ELK. Global Log Receiver supports the following log collection systems:
AWS Cloudwatch
AWS S3
Azure Blob Storage
Azure Event Hubs
Datadog
GCP Bucket
Generic HTTP or HTTPs server
IBM QRadar
Kafka
NewRelic
Splunk
SumoLogic
As of now, Global Log Receiver supports sending request (access) logs, DNS request logs, security events, and audit logs for all HTTP load balancers and sites.

ELK Stack:
The ELK Stack is a popular and powerful open-source suite of tools used for centralized log aggregation, analysis, and visualization. "ELK" stands for Elasticsearch, Logstash, and Kibana. Together, these tools collect, process, and visualize machine-generated data, helping organizations gain insight into their systems.

Components of the ELK Stack:
Elasticsearch: Elasticsearch is a highly scalable, distributed RESTful search and analytics engine that serves as the core backend of the ELK stack. It is the central data store where all logs are indexed and stored. It is designed to search and analyze large volumes of structured or unstructured data, such as logs and metrics, quickly and in near real time.
Logstash: Logstash is a data ingestion and processing tool that collects data (logs or events) from various sources, transforms it, and sends it to Elasticsearch (or other destinations). It acts as a data collection pipeline with configurable input, filter, and output blocks.
Kibana: Kibana is the visualization layer of the ELK stack. It provides a powerful interface for exploring, visualizing, and analyzing data (logs or events) stored in Elasticsearch, using charts, graphs, and maps. It helps organizations monitor the health, performance, and behavior of applications and make data-driven decisions.

Architecture Diagram:
For this demo, we have configured GLR to export logs from a namespace to Logstash listening on port 8080. Logstash receives and processes the logs and sends them to Elasticsearch, where the logs are indexed and stored to enable real-time search and queries. Finally, Kibana retrieves the logs from Elasticsearch and presents them through interactive dashboards.

Demonstration:
To bring the setup up, we will first deploy the ELK stack in a Docker environment.

ELK deployment and configurations:
Step 1: Clone the repository using the command: git clone https://github.com/deviantony/docker-elk.git
Step 2: Update ./docker-elk/docker-compose.yml by adding HTTP receiver port 8080 under the logstash section (see the sketch after this list).
Step 3: Update the ./docker-elk/logstash/pipeline/logstash.conf file (see the sketch after this list).
Step 4: Now run the command: docker-compose up setup, followed by the command: docker-compose up
Step 5: Check the status of the ELK stack containers with the command: docker ps
Step 6: Once the ELK stack is up and running, you can access the ELK GUI at http://<public-ip>:5601 using the default username/password (elastic/changeme).
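The edits in Steps 2 and 3 only need to open an HTTP listener for GLR. As a hedged sketch: in docker-compose.yml, an extra port mapping (for example "8080:8080") is added under the logstash service's ports, and logstash.conf gains an HTTP input alongside the Elasticsearch output. The file below is a reconstruction, not the screenshots from the original article — the index name is an assumption, and elastic/changeme are simply the docker-elk defaults.

```
# Hedged reconstruction of logstash.conf for this demo; the index name is an
# assumption, and elastic/changeme are the docker-elk default credentials.
input {
  http {
    port => 8080            # receives logs pushed by F5 XC GLR
    codec => json
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    user => "elastic"
    password => "changeme"
    index => "logs-f5xc-%{+YYYY.MM.dd}"
  }
}
```

An index name beginning with "logs-" lines up with the verification step later in this article, where a logs-* data view is created in Kibana.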
F5 XC GLR configurations:
Step 1: Log in to the XC console. From the home page, select the Multi-Cloud Network Connect service or the Shared Configuration service.
Multi-Cloud Network Connect service: Select Manage > Log Management > Global Log Receiver.
Shared Configuration service: Select Manage > Global Log Receiver.
Select Add Global Log Receiver.
Note: If the path [Multi-Cloud Network Connect service: Select Manage > Log Management > Global Log Receiver] is used, Log Message Selection can only be set to the current namespace.
Step 2: Enter a name in the Metadata section. Optionally, set labels and add a description. From the Log Type menu, select Request Logs, Security Events, Audit Logs, or DNS Request Logs. Request logs are set by default. For this demo, we have selected Security Events.
Step 3: In the case of the Multi-Cloud Network Connect service, select from the Log Message Selection menu; for this demo, we have set it to "Select logs in specific namespaces".
Step 4: From the Receiver Configuration drop-down menu, select a receiver. For this demo, we have set it to HTTP Receiver and provided an HTTP URI (the public IP of the ELK stack along with the receiver port we set in the Logstash configuration, i.e. 8080).
Step 5: Optionally, configure advanced settings. Click Save and Exit.
Step 6: Finally, inspect your connection by clicking the Test Connection button and verify that logs are collected in the receiver (access the ELK GUI at http://<ELK instance public IP>:5601, navigate to Home > Analytics > Discover, and add logs-* as a data view filter).

Verification:
Step 1: Monitor security event logs of the load balancers deployed in the specified namespace from the XC console. Select the WAAP service and your namespace, then navigate to Overview > Security, select the LB, and click on the Security Analytics tab.
Step 2: Access the ELK GUI at http://<ELK instance public IP>:5601 and navigate to Home > Analytics > Discover. Add logs-* as a data view filter. You will notice that the logs have been exported to ELK.
Step 3: Optionally, navigate to Home > Analytics > Dashboards and click Create visualization to generate a customized visualization dashboard for your collected logs.

Conclusion:
F5 XC already has a built-in observability dashboard providing real-time visualization to monitor, analyze, and troubleshoot applications and infrastructure across multi-cloud and edge environments. This helps organizations boost efficiency, reduce downtime, and ensure system reliability. With the help of XC's GLR feature, XC also provides seamless integration with other SIEM tools, such as the ELK Stack, for customers who prefer to consolidate telemetry data from multiple platforms into their centralized SIEM systems.

References:
XC Global Log Receiver
Docker-elk
ELK Stack DevCentral Article
Telemetry streaming to Elasticsearch

Hi all,
I am following a couple of threads because I want to send ASM logging to Elasticsearch, like this one from Greg. What I understand is that I need to send an AS3 declaration and a TS declaration. But there are a couple of things that are not entirely clear to me.
1. Can I remove the iRule, Service_TCP, Pool, Log_Destination, Log_Publisher, and Traffic_Log_profile declarations from the AS3 declaration JSON? In the example, the telemetry_asm_security_log_profile does not seem to depend on these.
2. In the AS3 declaration JSON, an IP address of 255.255.255.254 is specified (perhaps just an example, since it is a subnet mask), and one also appears in the TS declaration, where it is 172.16.60.194. How is the IP in the servers section of the AS3 declaration related to the one in the consumer part of the TS declaration?
3. In telemetry_asm_security_log_profile, the field remoteStorage is set to splunk. According to the reference guide (Reference Guide, security-log-profile-application-object), the allowed values are "remote", "splunk", "arcsight", "bigiq". I would opt for just remote. Is that the correct choice?
Regards
Hans
Telemetry Streaming: getting HTTP statistics via SNMP

Hi F5 community,
I am looking to get HTTP statistics (total count, and broken down by response code) metrics from Telemetry Streaming via SNMP (it seems to be the most viable option):
F5-BIGIP-LOCAL-MIB::ltmHttpProfileStat, OID .1.3.6.1.4.1.3375.2.2.6.7.6
However, the stats don't seem to come out correct at all: I do see deltas happening, but they don't match the traffic rate I expect to see at all. Furthermore, I have done some tests where I would start a load-testing tool (vegeta) to fire concurrent HTTP requests, for which I do see the logs from the virtual server, but no matching increment in the above SNMP OID entries on any of the profiles configured.
What am I doing wrong? Does something need to be enabled on the HTTP profile in use to collect those stats?
Best,
Owayss
Application observability (Open Telemetry Tracing)

Hello, do you, or your customers, need BIG-IP to deliver OTEL tracing? It won't (AFAIK) be implemented in BIG-IP classic, but I've opened an RFE to ask for implementation of OpenTelemetry (distributed) tracing on BIG-IP Next:
RFE: (Bug alias 1621853) [RFE] Implement OTEL traces
If you need it, don't hesitate to open a support case and link that RFE ID; that will give it more weight for prioritization.
Streaming Telemetry Errors to Kafka

Has anyone seen errors like the following in the restnoded.log file?

```
Fri, 13 Oct 2023 12:45:34 GMT - severe: [telemetry.f5telemetry_default::My_System::SystemPoller_1] Error: EndpointLoader.loadEndpoint: provisioning: Error: Bad status code: 500 Server Error for http://localhost:8100/mgmt/tm/sys/provision
Fri, 13 Oct 2023 12:45:34 GMT - severe: [telemetry.f5telemetry_default::My_System::SystemPoller_1] Error: EndpointLoader.loadEndpoint: bashDisabled: Error: Bad status code: 500 Server Error for http://localhost:8100/mgmt/tm/sys/db/systemauth.disablebash
Fri, 13 Oct 2023 12:45:34 GMT - severe: [telemetry.f5telemetry_default::My_System::SystemPoller_1] Error: SystemStats._loadData: provisioning (undefined): Error: Bad status code: 500 Server Error for http://localhost:8100/mgmt/tm/sys/provision
Fri, 13 Oct 2023 12:45:34 GMT - severe: [telemetry.f5telemetry_default::My_System::SystemPoller_1] Error: SystemStats._loadData: bashDisabled (undefined): Error: Bad status code: 500 Server Error for http://localhost:8100/mgmt/tm/sys/db/systemauth.disablebash
Fri, 13 Oct 2023 12:45:34 GMT - severe: [telemetry.f5telemetry_default::My_System::SystemPoller_1] Error: SystemStats._processProperty: provisioning (provisioning::items): Error: Bad status code: 500 Server Error for http://localhost:8100/mgmt/tm/sys/provision
Fri, 13 Oct 2023 12:45:34 GMT - severe: [telemetry.f5telemetry_default::My_System::SystemPoller_1] Error: SystemStats._processProperty: bashDisabled (bashDisabled::value): Error: Bad status code: 500 Server Error for http://localhost:8100/mgmt/tm/sys/db/systemauth.disablebash
Fri, 13 Oct 2023 12:45:34 GMT - severe: [telemetry.f5telemetry_default::My_System::SystemPoller_1] Bad status code: 500 Server Error for http://localhost:8100/mgmt/tm/sys/provision
```

I pushed a JSON file similar to the following (a few fields redacted with variables):

```
{
  "class": "Telemetry",
  "schemaVersion": "1.33.0",
  "My_System": {
    "class": "Telemetry_System",
    "systemPoller": {
      "interval": 60,
      "enable": true
    },
    "enable": true,
    "host": "localhost",
    "port": 8100,
    "protocol": "http",
    "allowSelfSignedCert": false
  },
  "My_Listener": {
    "class": "Telemetry_Listener",
    "port": 6514,
    "enable": true
  },
  "My_Consumer": {
    "class": "Telemetry_Consumer",
    "type": "Kafka",
    "topic": "myTopic",
    "host": "myHost",
    "protocol": "binaryTcpTls",
    "port": 9093,
    "allowSelfSignedCert": false,
    "enable": true
  }
}
```

What would be causing this? I tried turning logging up to debug and trace but didn't have much luck. Debug didn't show much more, and I could not actually locate the trace file.
Thanks in advance,
Josh