How I did it - "High-Performance S3 Load Balancing of Dell ObjectScale with F5 BIG-IP"
As AI and data-driven workloads grow, enterprises need scalable, high-performance, and resilient storage. Dell ObjectScale delivers this with its cloud-native, S3-compatible design, making it well suited to AI/ML and analytics workloads. F5 BIG-IP LTM and DNS enhance ObjectScale by providing intelligent traffic management and global load balancing, ensuring consistent performance and availability across distributed environments. This article introduces Dell ObjectScale and its integration with F5 solutions for advanced use cases.
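As a rough illustration of the BIG-IP LTM side of such a design, a minimal tmsh sketch could look like the following: a pool of ObjectScale S3 endpoints behind a TLS-terminating virtual server. All names, addresses, and the S3 data port below are hypothetical placeholders, not values from the article itself.

# Pool of ObjectScale S3 nodes with an HTTPS health monitor
# (addresses and port are placeholders):
tmsh create ltm pool objectscale_s3_pool \
    members add { 10.1.20.11:9021 10.1.20.12:9021 10.1.20.13:9021 } \
    monitor https

# Virtual server terminating client TLS and re-encrypting to the pool:
tmsh create ltm virtual objectscale_s3_vs \
    destination 10.1.10.100:443 ip-protocol tcp \
    pool objectscale_s3_pool \
    profiles add { tcp http clientssl serverssl } \
    source-address-translation { type automap }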
F5OS API - not able to create vlan

Hi all,

A GET request to https://10.10.10.10/api/data/openconfig-vlan:vlans gives me this JSON payload:

{
  "openconfig-vlan:vlans": {
    "vlan": [
      {
        "vlan-id": 405,
        "config": {
          "vlan-id": 405,
          "name": "v405_10.10.20.0_m29"
        },
        "members": {
          "member": [
            {
              "state": {
                "interface": "Production_trunk"
              }
            }
          ]
        }
      }
    ]
  }
}

When I try to add a new vlan, or even just send the same content back with "PATCH https://10.10.10.10/api/data/openconfig-vlan:vlans", I get a 400 Bad Request error:

{
  "ietf-restconf:errors": {
    "error": [
      {
        "error-type": "application",
        "error-tag": "malformed-message",
        "error-path": "/openconfig-vlan:vlans",
        "error-message": "object is not writable: /oc-vlan:vlans/oc-vlan:vlan[oc-vlan:vlan-id='405']/oc-vlan:members/oc-vlan:member"
      }
    ]
  }
}

Why is that, and how does PATCH work here? From the documentation it behaves more like PUT. Am I supposed to send all vlans when I only need to add or remove one, i.e. do I always have to send all of them in the PATCH?

Thanks, Zdenek
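For what it is worth, the error message itself is the clue: in the OpenConfig VLAN model, the members/member list is state (read-only) data, populated from the interface side, so a PATCH payload that echoes it back gets rejected. Also, RESTCONF PATCH (RFC 8040) performs a merge rather than a replace, so only the VLAN being added needs to be sent. A hedged sketch with placeholder credentials and VLAN values:

curl -sk -u admin:admin \
  -X PATCH \
  -H 'Content-Type: application/yang-data+json' \
  'https://10.10.10.10/api/data/openconfig-vlan:vlans' \
  -d '{
    "openconfig-vlan:vlans": {
      "vlan": [
        {
          "vlan-id": 406,
          "config": { "vlan-id": 406, "name": "v406_new" }
        }
      ]
    }
  }'

VLAN-to-trunk membership would then be configured on the interface/trunk side of the OpenConfig model rather than under the vlans tree.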
BIG-IP Next for Kubernetes Nvidia DPU deployment walkthrough

Introduction

Modern AI factories, hyperscale environments powering everything from generative AI to autonomous systems, are pushing the limits of traditional infrastructure. As these facilities process exabytes of data and demand near-real-time communication between thousands of GPUs, legacy CPUs struggle to balance application logic with infrastructure tasks like networking, encryption, and storage management. Data Processing Units (DPUs) are purpose-built accelerators that offload these housekeeping tasks, freeing CPUs and GPUs to focus on what they do best.

DPUs are specialized system-on-chip (SoC) devices designed to handle data-centric operations such as network virtualization, storage processing, and security enforcement. By decoupling infrastructure management from computational workloads, DPUs reduce latency, lower operational costs, and enable AI factories to scale horizontally.

BIG-IP Next for Kubernetes and Nvidia DPU

For F5 to deliver and secure every app, its capabilities need to be deployable at multiple levels, a crucial one being the edge and the DPU. Installing F5 BIG-IP Next for Kubernetes on an Nvidia DPU requires Nvidia's DOCA framework.

What's DOCA? NVIDIA DOCA is a software development kit for NVIDIA BlueField DPUs. BlueField provides data center infrastructure-on-a-chip, optimized for high-performance enterprise and cloud computing. DOCA is the key to unlocking the potential of the NVIDIA BlueField data processing unit (DPU) to offload, accelerate, and isolate data center workloads. With DOCA, developers can program the data center infrastructure of tomorrow by creating software-defined, cloud-native, GPU-accelerated services with zero-trust protection.

Now, let's explore the BIG-IP Next for Kubernetes components. The solution has two main parts: the Data Plane, a Traffic Management Micro-kernel (TMM), and the Control Plane. The Control Plane watches over the Kubernetes cluster and updates the TMM's configuration. The Data Plane (TMM) manages network traffic both entering and leaving the Kubernetes cluster and proxies that traffic to the applications running in the cluster. The TMM runs on the BlueField-3 Data Processing Unit (DPU) node, using the DPU resources to handle traffic and freeing up the host CPU for applications. The Control Plane can run on the CPU or on other nodes in the Kubernetes cluster, which ensures the DPU stays dedicated to processing traffic.

Use-case examples

F5's team recently released some great use cases based on conversations and work from the field:

- Protecting MCP servers with F5 BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs
- LLM routing with dynamic load balancing with F5 BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs
- F5 optimizes GPUs for distributed AI inferencing with NVIDIA Dynamo and KV cache integration
Deployment walk-through

In our demo, we go through the configuration of BIG-IP Next for Kubernetes:

- Main BIG-IP Next for Kubernetes features
- L4 ingress flow
- HTTP/HTTPS ingress flow
- Egress flow
- BGP integration
- Logging and troubleshooting (Qkview, iHealth)

You can find a quick walk-through via BIG-IP Next for Kubernetes - walk-through.

Related Content

- BIG-IP Next for Kubernetes - walk-through
- BIG-IP Next for Kubernetes
- BIG-IP Next for Kubernetes and Nvidia DPU-3 walkthrough
- BIG-IP Next for Kubernetes
- F5 BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs
Is there a way to export BigIP Analytics http captured transactions?

Are there any commands to export the HTTP captured transactions? When troubleshooting TLS 1.3 encrypted traffic, we can set up an Analytics profile to capture the specific traffic we want. It works, and we can see the captured transactions manually, one by one, through the web interface. Is there a way to export all those captured transactions so we can share them offline?
How to get a F5 BIG-IP VE Developer Lab License

(applies to BIG-IP TMOS Edition)

To assist operational teams improve their development for the BIG-IP platform, F5 offers a low-cost developer lab license. This license can be purchased from your authorized F5 vendor. If you do not have an F5 vendor and you are in either Canada or the US, you can purchase a lab license online:

- CDW BIG-IP Virtual Edition Lab License
- CDW Canada BIG-IP Virtual Edition Lab License

Once completed, the order is sent to F5 for fulfillment and your license will be delivered shortly after via e-mail. F5 is investigating ways to improve this process.

To download the BIG-IP Virtual Edition, log into my.f5.com (separate login from DevCentral) and navigate down to the Downloads card under the Support Resources section of the page. Select BIG-IP from the product group family and then the current version of BIG-IP. You will be presented with a list of options; at the bottom, select the Virtual-Edition option that has the following description:

- For VMware Fusion or Workstation or ESX/i: Image fileset for VMware ESX/i Server
- For Microsoft Hyper-V: Image fileset for Microsoft Hyper-V
- For KVM RHEL/CentOS: Image fileset for KVM Red Hat Enterprise Linux/CentOS

Note: There are also 1 Slot versions of the above images where a 2nd boot partition is not needed for in-place upgrades. These images include _1SLOT- in the image name instead of ALL.

The guides below will help get you started with F5 BIG-IP Virtual Edition for VMware Fusion, AWS, Azure, VMware, or Microsoft Hyper-V. These guides follow standard practices for installing in production environments; performance recommendations change based on lower-use/non-critical needs for development or lab environments. Similar to driving a tank, use your best judgement.

- Deploying F5 BIG-IP Virtual Edition on VMware Fusion
- Deploying F5 BIG-IP in Microsoft Azure for Developers
- Deploying F5 BIG-IP in AWS for Developers
- Deploying F5 BIG-IP in Windows Server Hyper-V for Developers
- Deploying F5 BIG-IP in VMware vCloud Director and ESX for Developers

Note: F5 Support maintains authoritative Azure, AWS, Hyper-V, and ESX/vCloud installation documentation. VMware Fusion is not an official F5-supported hypervisor, so DevCentral publishes the Fusion guide with the help of our Field Systems Engineering teams.
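As a quick illustration for the KVM option, a sketch of importing the downloaded qcow2 with virt-install might look like this. The file path, bridge names, and sizing are placeholders; check the official KVM guide for supported values.

# Import the BIG-IP VE qcow2 into KVM/libvirt; all names and paths are placeholders.
virt-install \
  --name bigip-ve-lab \
  --memory 8192 \
  --vcpus 4 \
  --import \
  --disk /var/lib/libvirt/images/BIGIP-VE.qcow2,bus=virtio \
  --network bridge=br-mgmt,model=virtio \
  --network bridge=br-external,model=virtio \
  --network bridge=br-internal,model=virtio \
  --os-variant generic \
  --noautoconsole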
Advertise OpenShift AI inference servers from F5 Distributed Cloud

Introduction

This article describes how inference servers in OpenShift AI (KServe), hosted in a public cloud, a private cloud, or at the edge, can be anycast-advertised securely to the Internet using F5 Distributed Cloud (XC) deployed inside OpenShift clusters. Red Hat, and by extension OpenShift AI, provides enterprise-ready, mission-critical open source software. In this article, the AI model is hosted in OpenShift AI's KServe single-model framework. OpenShift in AWS (aka ROSA) was used for the creation of this article, but it could have been OpenShift in any public or private cloud, an edge deployment, or a mix of these.

Once the model is available for serving in OpenShift, XC can be used to advertise it globally. This is done by just installing an in-cluster XC Customer Edge (CE) SMSv1 in OpenShift. This CE component transparently connects to the closest Regional Edges (REs) of F5 XC's Global Anycast Network, exposing the VIP of the AI inference server in all F5 XC PoPs (IP anycast), reducing latency to the customer and providing redundancy and application security, including Layer 7 DDoS protection.

The overall setup can be seen in the next figure. The only F5 component that has to be installed is the CE. The REs are pre-existing in the F5 Global Anycast Network and are used automatically as access points for the CEs. Connectivity between the CEs and REs happens through TLS or IPsec tunnels, which are set up automatically at CE deployment time without any user intervention.

The next sections cover the following topics:

- Traffic path overview
- Setup of an inference service of a generative AI model using KServe and vLLM
- Setup of the F5 XC CE in OpenShift using SMSv1
- Creation of a global anycast VIP in XC exposing the created inference server
- Securing the inference service

Traffic path overview

The traffic flow is shown in the figure above, starting with the request towards the inference service:

1. A request is sent to the inference service (e.g. inference.demos.bd.f5.com). DNS resolves this to an F5 XC anycast VIP address.
2. Through Internet routing, the request reaches the VIP at the closest F5 XC Point of Presence (PoP). F5 XC validates that the request is for an expected hostname and applies any security policy to the traffic.
3. F5 XC load balances towards the CEs where there are origin pools for the application. In this article a single OpenShift cluster is used, but several clusters on different sites could have been used, all sharing the same VIP. The traffic is ultimately sent through the CE's designated RE.
4. Traffic reaches the CE inside the OpenShift cluster through a pre-established tunnel (TLS or IPsec).
5. The CE has previously discovered the local origin servers through DNS service discovery within the OpenShift cluster. For this AI model, KServe deploys a Service of type ExternalName named vllm-cpu.vllm.svc.cluster.local. This is the recommended way to access a KServe AI model from workloads that are not part of the mesh, like the CE component. The exact service name for the deployed model is reported in the OpenShift UI as shown in the next figure.

The ExternalName of vllm-cpu.vllm.svc.cluster.local (effectively a DNS CNAME) points to Istio's kserve-local-gateway.istio-system.svc.cluster.local Gateway, exposed by another Service as the name indicates. This is shown next:

% oc -n vllm get svc vllm-cpu
NAME       TYPE           CLUSTER-IP   EXTERNAL-IP                                           PORT(S)   AGE
vllm-cpu   ExternalName   <none>       kserve-local-gateway.istio-system.svc.cluster.local   <none>    4d1h
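Since an ExternalName Service is implemented as a DNS CNAME, the mapping can also be verified from inside the cluster with a throwaway lookup pod (the pod name and image are arbitrary choices, not part of the article's setup); the lookup should follow the CNAME to kserve-local-gateway.istio-system.svc.cluster.local:

# One-off DNS check from inside the cluster; pod name and image are arbitrary.
oc run -it --rm dns-check --image=busybox --restart=Never -- \
  nslookup vllm-cpu.vllm.svc.cluster.local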
6. The Customer Edge sends the traffic towards kserve-local-gateway's Service clusterIP. This Service load balances between the available Istio instances (in the next output, there is only one).

% oc -n istio-system get svc,ep kserve-local-gateway
NAME                           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/kserve-local-gateway   ClusterIP   172.30.6.250   <none>        443/TCP   5d6h

NAME                             ENDPOINTS          AGE
endpoints/kserve-local-gateway   10.131.0.12:8445   5d6h

7. The CNI sends the traffic to the selected Istio instance.
8. Istio ultimately sends the request to the AI model PODs.

Steps 6 and 7 have been simplified: within these steps there are activators, autoscalers, and queuing components, which are transparent to F5 XC. These components are set up automatically by KServe's Knative infrastructure and would add unnecessary complexity to the picture above. If you are interested in these details, they can be found at this link.

Setup of an inference service of a generative AI model using KServe and vLLM

To instantiate an AI model in KServe, three resources need to be created:

- A Secret resource containing the storage (Data Connection) to be used.
- A ServingRuntime resource, which defines the model runtime, its image, and its parameters. The next resource uses it. In the OpenShift AI UI, these are created using Templates.
- An InferenceService resource that binds the previous resources and actually instantiates the AI model. It specifies the min and max replicas, the amount of memory for the replicas, and the storage (Data Connection) to be used.

The AI model used as an example in this article uses a custom configuration for a vLLM CPU model (useful for PoC purposes). You can find another example of a vLLM CPU AI model at https://github.com/rh-ai-kickstart/vllm-cpu. The configuration is as follows.

Example Data Connection for this example, using S3:

The example ServingRuntime using a vLLM model for CPUs:

The InferenceService (shown as Models and model servers in the UI) used in this example:
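The screenshots above come from the OpenShift AI UI, but the same InferenceService can also be expressed declaratively. The following is only a rough sketch of the shape of such a resource; the runtime name, storage URI, namespace, and sizing are made-up placeholders, not the exact values used in this article:

oc apply -f - <<'EOF'
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: vllm-cpu
  namespace: vllm
spec:
  predictor:
    minReplicas: 1
    maxReplicas: 2
    model:
      runtime: vllm-cpu-runtime        # placeholder: the custom ServingRuntime described above
      modelFormat:
        name: vLLM
      storageUri: s3://models/example  # placeholder: resolved via the S3 Data Connection secret
      resources:
        limits:
          memory: 16Gi
EOF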
Setup of the F5 XC CE SMSv1 in OpenShift

Please note that the CE SMSv2 is not yet available for in-cluster Kubernetes deployments. To deploy the CE SMSv1 as PODs in the OpenShift cluster, follow the instructions on the F5 XC docs site.

Creating a global anycast VIP in XC exposing the created inference server

This involves creating the following objects in the given order:

1. An HTTP/2 health check for the AI model.
2. An Origin Pool for the AI model, with the created health check attached.
3. A VIP specifying Internet advertisement, with the created Origin Pool attached.

These steps are described in detail next. Log in to cloud.f5.com and go to "Multi-Cloud App Connect". All the configurations take place in this section of the UI.

In Manage >> Load Balancers >> Health Checks, create a new HTTP/2 health check indicating a path that can be used to test the AI model; in this example it is "/health". The whole configuration is shown next:

In Manage >> Load Balancers >> Origin Pools, create a new pool where the servers are discovered using "DNS Name of Origin Server on given Sites" for the DNS name vllm-cpu.vllm.svc.cluster.local (from the traffic path overview section) in the Outside network (the only one the CE PODs actually have). Attach the previously created health check and set the pool not to require TLS validation of the server. The latter is necessary because internal Istio components use self-signed certificates by default. The configuration is shown next:

In Manage >> Load Balancers >> HTTP Load Balancers, create a new load balancer, specify the FQDN of the VIP, and attach the previously created Origin Pool. In this example, it is used as an HTTP load balancer; XC automatically creates the DNS hosting and certificates. At the very bottom of the HTTP Load Balancer creation screen, you can advertise the VIP on the Internet.
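Once the VIP is advertised and DNS is in place, a quick smoke test can be run from anywhere on the Internet. vLLM exposes an OpenAI-compatible API, so, assuming HTTPS is enabled on the load balancer via XC's automatic certificates, something along these lines should work (the model name is a placeholder):

# List the models served behind the anycast VIP:
curl -s https://inference.demos.bd.f5.com/v1/models

# Send a small completion request:
curl -s https://inference.demos.bd.f5.com/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "example-model", "prompt": "Hello", "max_tokens": 16}'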
Securing the inference service

Once the inference service is exposed to the Internet, it exposes many APIs that we might not want to expose, and the service has to be secured from abuse and breaches. To address this, F5 XC offers the following features for AI:

- Automated API Discovery & Posture Management: identify all inference endpoints automatically, eliminating hidden APIs. Enforce schemas based on observed traffic to ensure requests and responses follow expected patterns. Integrate "shift-left" security checks into CI/CD pipelines, catching misconfigurations before production.
- LLM-Aware Threat Detection & Request Validation: detect attempts to manipulate prompts or break compliance rules, ensuring suspicious requests are blocked before reaching the model.
- Bot Mitigation & Adaptive Rate Controls: differentiate between legitimate users and bots or scrapers, blocking automated attacks. Dynamically adjust rate limits and policies based on usage history and real-time conditions, maintaining performance and reliability.
- Sensitive Data Redaction & Compliance: identify and mask PII or sensitive tokens in requests and responses. Adhere to data protection regulations and maintain detailed logs for auditing, monitoring, and compliance reporting.

All these security features live in the same XC console, where both application delivery and security dashboards are centralized. These provide analytics to monitor real-time metrics (latency, error rates, compliance indicators) to continuously refine policies and adapt to emerging threats. It is recommended to check this article's "F5 Distributed Cloud Capabilities in Action" section to see how to implement these.

Conclusion and final remarks

F5 XC can be used with Red Hat OpenShift AI in AWS or any other public or private cloud. This makes it easy to share an AI model with the Internet while providing security, preventing breaches and abuse of these models. I hope this article has been an eye-opener to the possibilities of how F5 XC can easily and securely advertise AI models. This article shows how to advertise the AI model on the Internet; XC lets you advertise it in any private location just as easily. I would love to hear if you have any specific requirements not covered in this article.

Privileged Access in Action: Technical Controls for Real-World Environments

Introduction

This article provides a technical walk-through demo of implementing Privileged User Access (PUA) with BIG-IP APM.

In modern IT environments, privileged user access refers to elevated permissions granted to administrators, engineers, and service accounts that manage critical infrastructure, applications, and data. These accounts can bypass standard security controls, modify configurations, provision resources, and access sensitive systems, which makes them a high-value target for attackers. From domain admins in Active Directory to root accounts on Linux servers and cloud Identity and Access Management (IAM) roles, the scope of privileged access spans on-prem, hybrid, and cloud-native stacks. As environments scale and become more dynamic, especially with DevOps and automation, controlling and auditing privileged access is no longer optional. It's a foundational requirement for operational integrity, threat detection, and zero trust security.

Some of the common use cases for PUA:

1. Hybrid Infrastructure Management
   Description: Admins manage Linux/Windows servers across on-prem and cloud (AWS, Azure, GCP) using root or admin access.
   Risk if unsecured: Lateral movement, persistence, full system compromise.
   PUA control measures: Just-in-time access, session recording, MFA, IP restrictions.

2. Database Administration
   Description: DBAs access production databases for tuning, backups, or incident response.
   Risk if unsecured: Data exfiltration, insider threats, compliance violations (e.g., GDPR, PCI-DSS).
   PUA control measures: Role-based access, query auditing, access approval workflows, credential vaulting.

3. CI/CD Pipeline Secrets Access
   Description: DevOps pipelines use privileged credentials to deploy apps, access build environments, and manage cloud resources.
   Risk if unsecured: Secrets leakage, automated misuse, supply chain attacks.
   PUA control measures: Secrets management tools (e.g., HashiCorp Vault), scoped tokens, access expiration, auditing.

4. Cloud IAM Role Escalation
   Description: Cloud engineers assume elevated IAM roles (e.g., AWS Admin, Azure Owner) to provision infrastructure and configure services.
   Risk if unsecured: Privilege escalation, unauthorized changes, excessive entitlements.
   PUA control measures: Attribute-based access control (ABAC), IAM role scoping, just-in-time elevation, CloudTrail monitoring.

5. Third-Party Vendor Access
   Description: External support teams or vendors are given temporary privileged access to troubleshoot or maintain systems.
   Risk if unsecured: External compromise, unmanaged persistence, lack of accountability.
   PUA control measures: Time-limited access, gateway proxies (e.g., bastion hosts), approval-based workflows, full session logging.

BIG-IP APM & PUA

BIG-IP APM provides Privileged User Access so that you can add Common Access Card (CAC) authentication, Personal Identity Verification (PIV), or other strong authentication methods to network infrastructure for enhanced security. This solution integrates directly into DoD PKI systems and works cooperatively with existing RADIUS, TACACS, Active Directory, and a variety of third-party authentication databases.

Deployment of Privileged User Access requires a license and involves the configuration of these components:

- Ephemeral Authentication Server
- WebSSH Proxy
- Authentication Server (for RADIUS and/or LDAP or LDAPS)

What is Ephemeral Authentication?

The Privileged User Access license lets you create an Ephemeral Authentication server that generates and manages temporary (ephemeral) passwords. BIG-IP APM acts as the Ephemeral Authentication server. It ensures a secure end-to-end encrypted connection while eliminating the possibility of credential reuse.

The Ephemeral Authentication server includes the access profile/policy that authenticates the end user and contains the webtop resources for ephemeral authentication (so the server also acts as a webtop proxy).

Going through the traffic flow steps below:

1. The user logs into the APM virtual server using a Smartcard or other credential. (The APM virtual server is the one that acts as the Ephemeral Authentication server, on which the APM access profile/policy is configured.)
2. The APM access policy checks the provided credentials, retrieves AD/LDAP group membership information, and returns a webtop showing backend resources.
3. When the user clicks on a resource, APM generates an ephemeral password and saves the username and password.
4. Using SSO, APM signs the user on to the WebSSH virtual server with their ephemeral authentication credentials. At this point, portal access can be used instead.
5. WebSSH makes an SSH connection (or HTTPS) to the router/server, still using the ephemeral authentication credentials.
6. The router sends an authentication request to the RADIUS or LDAP virtual server.
7. The RADIUS or LDAP virtual server verifies the ephemeral password.
8. The RADIUS or LDAP virtual server returns a Success or Failure response.
9. The SSH (or HTTPS) session is established or denied.
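Before pointing real routers at the RADIUS virtual server (steps 6 through 8 above), it can be handy to confirm it answers at all. radtest from the FreeRADIUS client utilities works for this; all values below are placeholders, and in a real flow the username/password would be the APM-generated ephemeral credentials:

# radtest <user> <password> <radius-server[:port]> <nas-port-number> <secret>
radtest testuser ephemeral-pass 10.10.30.50 0 radius-secret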
For technical implementation, please review our demo here and go through our technical documentation, Securing Privileged User Access Concepts.

Related Content

- Privileged User Access
- BIG-IP Access Policy Manager: Privileged User Access
- BIG-IP APM and PUA licenses