devops
1585 TopicsAgentic AI with F5 BIG-IP v21 using Model Context Protocoland OpenShift
Introduction to Agentic AI Agentic AI is the capability of extending the Large Language Models (LLM) by means of adding tools. This allows the LLMs to interoperate with functionalities external to the LLM. Examples of these are the capability to search a flight or to push code into github. Agentic AI operates proactively, minimising human intervention and making decisions and adapting to perform complex tasks by using tools, data, and the Internet. This is done by basically giving to the LLM the knowledge of the APIs of github or the flight agency, then the reasoning of the LLM makes use of these APIs. The external (to the LLM) functionality can be run in the local computer or in network MCP servers. This article focuses in network MCP servers, which fits in the F5 AI Reference Architecture components and the insertion point indicated in green of the shown next: Introduction to Model Context Protocol Model Context Protocol (MCP) is a universal connector between LLMs and tools. Without MCP, it is needed that the LLM is programmed to support the different APIs of the different tools. This is not a scalable model because it requires a lot of effort to add all tools for a given LLM and for a tool to support several LLMs. Instead, when using MCP, the LLM (or AI application) and the tool only need to support MCP. Without further coding, the LLM model automatically is able to use any tool that exposes its functionalities through MCP. This is exhibit in the following figure: MCP example workflow In the next diagram it is exposed the basic MCP workflow using the LibreChat AI application as example. The flow is as follows: The AI application queries agents (MCP servers) which tools they provide. The agents return a list of the tools, with a description and parameters required. When the AI application makes a request to the AI model it includes in the request information about the tools available. When the AI model finds out it doesn´t have built-in what it is required to fulfil the request, it makes use of the tools. The tools are accessed through the AI application. The AI model composes a result with its local knowledge and the results from the tools. Out of the workflow above, the most interesting is step 1 which is used to retrieve the information required for the AI model to use the tools. Using the mcpLogger iRule provided in this article later on, we can see the MCP messages exchanged. Step 1a: { "method": "tools/list", "jsonrpc": "2.0", "id": 2 } Step 1b: { "jsonrpc": "2.0", "id": 2, "result": { "tools": [ { "name": "airport_search", "description": "Search for airport codes by name or city.\n\nArgs:\n query: The search term (city name, airport name, or partial code)\n\nReturns:\n List of matching airports with their codes", "inputSchema": { "properties": { "query": { "type": "string" } }, "required": [ "query" ], "type": "object" }, "outputSchema": { "properties": { "result": { "type": "string" } }, "required": [ "result" ], "type": "object", "x-fastmcp-wrap-result": 1 }, "_meta": { "_fastmcp": { "tags": [] } } } ] } } Note from the above that the AI model only requires a description of the tool in human language and a formal declaration of the input and output parameters. That´s all!. The reasoning of the AI model is what will make good use of the API described through MCP. The AI models will interpret even the error messages. For example, if the AI model miss-interprets the input parameters (typically because of wrong descriptor of the tool), the AI model might correct itself if the error message is descriptive enough and call the tool again with the right parameters. Of course, the MCP protocol is more than this but the above is necessary to understand the basis of how tools are used by LLM and how the magic works. F5 BIG-IP and MCP BIG-IP v21 introduces support for MCP, which is based on JSON-RPC. MCP protocol had several iterations. For IP based communication, initially the transport of the JSON-RPC messages used HTTP+SSE transport (now considered legacy) but this has been completely replaced by Streamable HTTP transport. This later still uses SSE when streaming multiple server messages. Regardless of the MCP version, in the F5 BIG-IP it is just needed to enable the JSON and SSE profiles in the Virtual Server for handling MCP. This is shown next: By enabling these profiles we automatically get basic protocol validation but more relevantly, we obtain the ability to handle MCP messages with JSON and SSE oriented events and functions. These allows parsing and manipulation of MCP messages but also the capability of doing traffic management (load balancing, rate limiting, etc...). Next it can be seen the parameters available for these profiles, which allow to limit the size of the various parts of the messages. Defaults are fine for most of the cases: Check the next links for information on iRules events and commands available for the JSON and SSE protocols. MCP and persistence Session persistence is optional in MCP but when the server indicates an Mcp-Session-Id it is mandatory for the client. MCP servers require persistence when they keep a context (state) for the MCP dialog. This means that the F5 BIG-IP must support handling this Mcp-Session-Id as well and it does by using UIE (Universal) persistence with this header. A sample iRule mcpPersistence is provided in the gitHub repository. Demo and gitHub repository The video below demonstrate 3 functionalities using the BIG-IP MCP functionalities, these are: Using MCP persistence Getting visibility of MCP traffic by logging remotely the JSON-RPC payloads of the request and response messages using High Speed Logging. Controlling which tools are allowed or blocked, and logging the allowed/block actions with High Speed Logging. These functionalities are implemented with iRules available in this GitHub repository and deployed in Red Hat OpenShift using the Container Ingress Services (CIS) controller which automates the deployment of the configuration using Kubernetes resources. The overall setup is shown next: In the next embedded video we can see how this is deployed and used. Conclusion and next steps F5 BIG-IP v21 introduces support for MCP protocol and thanks to F5 CIS these setups can be automated in your OpenShift cluster using the Kubernetes API. The possibilities of Agentic AI are infinite, thanks to MCP it is possible to extend the LLM models to use any tool easily. The tools can be used to query or execute actions. I suggest to take a look to repositories of MCP servers and realize the endless possibilities of Agentic AI: https://mcpservers.org/ https://www.pulsemcp.com/servers https://mcpmarket.com/server https://mcp.so/ https://github.com/punkpeye/awesome-mcp-servers
81Views2likes0CommentsDelivering Secure Application Services Anywhere with Nutanix Flow and F5 Distributed Cloud
Introduction F5 Application Delivery and Security Platform (ADSP) is the premier solution for converging high-performance delivery and security for every app and API across any environment. It provides a unified platform offering granular visibility, streamlined operations, and AI-driven insights — deployable anywhere and in any form factor. The F5 ADSP Partner Ecosystem brings together a broad range of partners to deliver customer value across the entire lifecycle. This includes cohesive solutions, cloud synergies, and access to expert services that help customers maximize outcomes while simplifying operations. In this article, we’ll explore the upcoming integration between Nutanix Flow and F5 Distributed Cloud, showcasing how F5 and Nutanix collaborate to deliver secure, resilient application services across hybrid and multi-cloud environments. Integration Overview At the heart of this integration is the capability to deploy a F5 Distributed Cloud Customer Edge (CE) inside a Nutanix Flow VPC, establish BGP peering with the Nutanix Flow BGP Gateway, and inject CE-advertised BGP routes into the VPC routing table. This architecture provides us complete control over application delivery and security within the VPC. We can selectively advertise HTTP load balancers (LBs) or VIPs to designated VPCs, ensuring secure and efficient connectivity. Additionally, the integration securely simplifies network segmentation across hybrid and multi-cloud environments. By leveraging F5 Distributed Cloud to segment and extend the network to remote locations, combined with Nutanix Flow Security for microsegmentation within VPCs, we deliver comprehensive end-to-end network security. This approach enforces a consistent security posture while simplifying segmentation across environments. In this article, we’ll focus on application delivery and security, and explore segmentation in the next article. Demo Walkthrough Let’s walk through a demo to see how this integration works. The goal of this demo is to enable secure application delivery for nutanix5.f5-demo.com within the Nutanix Flow Virtual Private Cloud (VPC) named dev3. Our demo environment, dev3, is a Nutanix Flow VPC with a F5 Distributed Cloud Customer Edge (CE) named jy-nutanix-overlay-dev3 deployed inside: *Note: CE is named jy-nutanix-overlay-dev3 in the F5 Distributed Cloud Console and xc-ce-dev3 in the Nutanix Prism Central. eBGP peering is ESTABLISHED between the CE and the Nutanix Flow BGP Gateway: On the F5 Distributed Cloud Console, we created an HTTP Load Balancer named jy-nutanix-internal-5 serving the FQDN nutanix5.f5-demo.com. This load balancer distributes workloads across hybrid multicloud environments and is protected by a WAF policy named nutanix-demo: We advertised this HTTP Load Balancer with a Virtual IP (VIP) 10.10.111.175 to the CE jy-nutanix-overlay-dev3 deployed inside Nutanix Flow VPC dev3: The CE then advertised the VIP route to its peer via BGP – the Nutanix Flow BGP Gateway: The Nutanix Flow BGP Gateway received the VIP route and installed it in the VPC routing table: Finally, the VMs in dev3 can securely access nutanix5.f5-demo.com while continuing to use the VPC logical router as their default gateway: F5 Distributed Cloud Console observability provides deep visibility into applications and security events. For example, it offers comprehensive dashboards and metrics to monitor the performance and health of applications served through HTTP load balancers. These include detailed insights into traffic patterns, latency, HTTP error rates, and the status of backend services: Furthermore, the built-in AI assistant provides real-time visibility and actionable guidance on security incidents, improving situational awareness and supporting informed decision-making. This capability enables rapid threat detection and response, helping maintain a strong and resilient security posture: Conclusion The integration demonstrates how F5 Distributed Cloud and Nutanix Flow collaborate to deliver secure, resilient application services across hybrid and multi-cloud environments. Together, F5 and Nutanix enable organizations to scale with confidence, optimize application performance, and maintain robust security—empowering businesses to achieve greater agility and resilience across any environment. This integration is coming soon in CY2026. If you’re interested in early access, please contact your F5 representative. Reference URLs https://www.f5.com/products/distributed-cloud-services https://www.nutanix.com/products/flow/networking
96Views1like0CommentsFile Permissions Errors When Installing F5 Application Study Tool? Here’s Why.
F5 Application Study Tool is a powerful utility for monitoring and observing your BIG-IP ecosystem. It provides valuable insights into the performance of your BIG-IPs, the applications it delivers, potential threats, and traffic patterns. In my work with my own customers and those of my colleagues, we have sometimes run into permissions errors when initially launching the tool post-installation. This generally prevents the tool from working correctly and, in some cases, from running at all. I tend to see this more in RHEL installations, but the problem can occur with any modern Linux distribution. In this blog, I go through the most common causes, the underlying reasons, and how to fix it. Signs that You Have a File Permissions Issue These issues can appear as empty dashboard panels in Grafana, dashboards with errors in each panel (pink squares with white warning triangles, as seen in the image below), or the Grafana dashboard not loading at all. This image shows the Grafana dashboard with errors in each panel. When diving deeper, we see at least one of the three containers are down or continuously restarting. In the below example, the Prometheus container is continuously restarting: ubuntu@ubuntu:~$ docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 59a5e474ce36 prom/prometheus "/bin/prometheus --c…" 2 minutes ago Restarting (2) 18 seconds ago prometheus c494909b8317 grafana/grafana "/run.sh" 2 minutes ago Up 2 minutes 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp grafana eb3d25ff00b3 ghcr.io/f5devcentral/application-stu... "/otelcol-custom --c…" 2 minutes ago Up 2 minutes 4317/tcp, 55679-55680/tcp application-study-tool_otel-collector_1 A look at the container’s logs shows a file permissions error: ubuntu@ubuntu:~$ docker logs 59a5e474ce36 ts=2025-10-09T21:41:25.341Z caller=main.go:184 level=info msg="Experimental OTLP write receiver enabled" ts=2025-10-09T21:41:25.341Z caller=main.go:537 level=error msg="Error loading config (--config.file=/etc/prometheus/prometheus.yml)" file=/etc/prometheus/prometheus.yml err="open /etc/prometheus/prometheus.yml: permission denied" Note that the path, “/etc/prometheus/prometheus.yml”, is the path of the file within the container, not the actual location on the host. There are several ways to get the file’s actual location on the host. One easy method is to view the docker-compose.yaml file. Within the prometheus service, in the volumes section, you will find the following line: - ./services/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml” This indicates the file is located at “./services/prometheus/prometheus.yml” on the host. If we look at its permissions, we see that the user, “other” (represented by the three right-most characters in the permissions information to the left of the filename) are all dashes (“-“). This means the permissions are unset (they are disabled) for this user for reading, writing, or executing the file: ubuntu@ubuntu:~$ ls -l services/prometheus/prometheus.yml -rw-rw---- 1 ubuntu ubuntu 270 Aug 10 21:16 services/prometheus/prometheus.yml For a description of default user roles in Linux and file permissions, see Red Hat’s guide, “Managing file system permissions”. Since all containers in the Application Study Tool run as “other” by default, they will not have any access to this file. At minimum, they require read permissions. Without this, you will see the error above. The Fix! Once you figure out the problem lies in file permissions, it’s usually straightforward to fix it. A simple “chmod o+r” (or “chmod 664” for those who like numbers) on the file, followed by a restart of Docker Compose, will get you back up and running most of the time. For example: ubuntu@ubuntu:~$ ls -l services/prometheus/prometheus.yml -rw-rw---- 1 ubuntu ubuntu 270 Aug 10 21:16 services/prometheus/prometheus.yml ubuntu@ubuntu:~$ chmod o+r services/prometheus/prometheus.yml ubuntu@ubuntu:~$ ls -l services/prometheus/prometheus.yml -rw-rw-r-- 1 ubuntu ubuntu 270 Aug 10 21:16 services/prometheus/prometheus.yml ubuntu@ubuntu:~$ docker-compose down ubuntu@ubuntu:~$ docker-compose up -d The above is sufficient when read permission issues only impact in a few specific files. To ensure read permissions are enabled for "other" for all files in the services directory tree (which is where the AST containers read from), you can recursively set these permissions with the following commands: cd services chmod -R o+r . For AST to work, all containing directories also need to be executable by "other", or the tool will not be able to traverse these directories and reach the files. In this case, you will continue to see permissions errors. If that is the case, you can set execute permission recursively, just like the read permission setting performed above. To do this only for the services directory (which is the only place you should need it), run the following commands: # If you just ran the steps in the previous command section, you will still be in the services/ subdirectory. In the case, run "cd .." before running the following commands. chmod o+x services cd services chmod -R o+X . Notes: The dot (".") must be included at the end of the command. This tells chmod to start with the current working directory. The "-R" tells it to recursively act on all subdirectories. The "X" in "o+X" must capitalized to tell chmod to only operate on directories, not regular files. Execute permission is not needed for regular files in AST. For a good description of how directory permissions work in Linux, see https://linuxvox.com/blog/understanding-linux-directory-permissions-reasoning/ But Why Does this Happen? While the above discussion will fix file permissions issues after they've occurred, I wanted to understand what was actually causing this. Until recently, I had just chalked this up to some odd behavior in certain Red Hat installations (RHEL was the only place I had seen this) that modifies file permissions when they are pulled from GitHub repos. However, there is a better explanation. Many organizations have specific hardening practices when configuring new Linux machines. This sometimes involves the use of “umask” to set default file permissions for new files. Certain umask settings, such as 0007 and 0027 (anything ending with 7) will remove all permissions for “other”. This only affects newly created files, such as those pulled from a Git repo. It does not alter existing files. This example shows how the newly created file, testfile, gets created without read permissions for "other" when the umask is set to 0007. ubuntu@ubuntu:~$ umask 0007 ubuntu@ubuntu:~$ umask 0007 ubuntu@ubuntu:~$ touch testfile ubuntu@ubuntu:~$ ls -l testfile -rw-rw---- 1 ubuntu ubuntu 0 Oct 9 22:34 testfile Notes: In the above command block, note the last three characters in the permissions information, "-rw-rw----". These are all dashes ("-"), indicating the permission is disabled for user "other". The umask setting is available in any modern Linux distribution, but I see it more often on RHEL. Also, if you are curious, this post offers a good explanation of how umask works: What is "umask" and how does it work? To prevent permissions problems in the first place, you can run “umask” on the command line to check the setting before cloning the GitHub repo. If it ends in a 7, modify it (assuming your user account has permissions to do this) to something like “0002” or “0022”. This removes write permissions from “other”, or “group” and “other”, respectively, but does not modify read or execute permissions for anyone. You can also set it to “0000” which will cause it to make no changes to the file permissions of any new files. Alternatively, you can take a reactive approach, installing and launching AST as you normally would and only modifying file permissions when you encounter permission errors. If your umask is set to strip out read and/or execute permissions for "other", this will take more work than setting umask ahead of time. However, you can facilitate this by running the recursive "chmod -R o+r ." and "chmod -R o+X ." commands, as discussed above, to give "other" read permissions for all files and execute permissions for all subdirectories in the directory tree. (Note that this will also enable read permissions on all files, including those where it is not needed, so consider this before selecting this approach.) For a more in-depth discussion of file permissions, see Red Hat’s guide, “Managing file system permissions”. Hope this is helpful when you run into this type of error. Feel free to post questions below.39Views1like0CommentsModernizing F5 BIG-IP Synchronized HA Pairs with Ansible Validated Content
I wanted to provide an update to a previous article I released a few months ago where we developed Ansible Automation Platform code to help with migrating Standalone Legacy platforms (non-iSeries, iSeries and Viprion Instances) to our Modern Architectures (rSeries and Velos) using F5OS Tenant instances. I am happy to announce that the code for Synchronized HA pairs has been completed, and we have uploaded it to Ansible Automation Hub as Validated Content. What is Ansible Automation Hub Validated Content? Ansible validated content collections contain pre-built YAML content (such as playbooks or roles) to address the most common automation use cases. You can use Ansible validated content out-of-the-box or as a learning opportunity to develop your skills. It's a trusted starting point to bootstrap your automation: use it, customize it, and learn from it. Due to the focus on customization and the intent for this content to be modified, it is not subject to the same support requirements as our certified collections. To this end, any issues with this content should be filed directly at the source repository for that collection. Why Synchronized HA Pairs? This is a very common use case for a lot of our customers who want resiliency and redundancy, especially for their applications and services. The biggest issue with migrating an HA Pair is that because of the way they are set up, things like Management IP Addresses and Master Keys are essential to the transition process. Even mismatched versions during upgrades cannot synchronize during the process of upgrading to major/minor releases. What does the updated Validated Code do? Standalone Migrations – Where you can change the Management IP, due to the nature of being a standalone device, an outage will occur during the transition period. o There are 2 options for Playbooks Single Playbook for the full migration 2 Parts where Part 1 – Does backups and does a big start stop of the unit Part 2 – Migrates the standalone device HA Pairs – Combined – This code is designed for a customer who just needs to transition both HA Units but isn’t concerned about an outage window. It will migrate both units at the same time to F5OS Tenants. The Playbooks for this Code are broken apart in specific areas Part 1 – Backup the Information Part 2 – Ensure Both Units are offline and Migrate both units at the same time. HA Pairs – Sequential – This code is designed for customers who need to migrate one unit at a time and maintain availability of their applications. It will migrate the Standby Unit first as part of the code When ready to transition the active unit, it will place it in Standby and make the Transitioned Standby unit the Active Node transferring services to it Then the previously Active Unit (now standby) will be migrated There are playbooks to the Code to break apart specific areas of the transition Part 1 – Backup the Information Part 2 – Ensure the Standby Devices are offline (via Management IP) and Migrate the Standby Unit Part 3 – Transition the Standby to become the Active Unit and Begin Transitioning the New Standby Unit (Previously Active Unit) similarly to Part 2 This code has been tested and validated against many different platforms, and there are plans to continue testing for other use cases. The Transition can be Like-for-Like versioning (i.e. 16.1.x to 16.1.x) within the same family tree or can be an upgrade at the same time (i.e. 15.1.10 à 17.5.1.3 or even 21.0.0) These are Ansible Playbooks with supporting roles tailored for Red Hat Ansible Automation Platform. It’s built to perform a lift-and-shift migration of a F5 BIG-IP configuration from one device to another—with optional OS upgrades included. What is the future of the code? I plan on adding some Validation code to separate roles/playbooks so customers could have points of references for testing, i.e. (ping tests and pool tests) before and after the transition, QKView Backups, and other information provided on the state of the unit prior to transition to ensure when migrated it can be validated that everything is the way it was. Notes about the Code The code is not designed to handle Non-VLANed infrastructure (F5OS is designed to be multi-tenant and setup with VLANs to deal with Multi-Tenancy) If your BIG-IPs use Untagged networks, they will need to be migrated to VLANed prior to using this code. Has not been validated/tested with FIPS-based environments Has not been validated/tested F5 DNS environments – Coming Soon HA Pairs must retain the Management IP address from source to destination; the code will ensure that the source device is powered off prior to transitioning it. Cool Additions Override variables are allowed as extra_vars to create flexibility in your deployment override_cpu - This allows you to set the CPUs of the Tenant OS. If the memory override isn’t set, it will be set to the same formula that the F5OS Gui would calculate. DEFAULT is set to 4 CPU override_disk_size - This allows you to set the Disk Space of the Tenant OS. DEFAULT is set to 120GB override_memory - This allows you to set the Memory of the Tenant OS. Be warned if over-provisioned, the Tenant may not start. DEFAULT is calculated by the CPU counts formula used in the GUI. tenant_nodes - This allows you to set the slot for the Tenant OS if there are multiple slots associated with your F5OS Partition. DEFAULT is an array object and it is set to [1] cryptos - This allows you to set the Crypto on the Tenant OS to either enabled or disabled. DEFAULT is set to enabled Variables for deployments – the code is designed to utilize specific hostnames and group names to execute the code. These variables allow connectivity to BIGIP and F5OS Tenants. When creating these hosts, you will need to provide When creating hosts in AAP, you will need to provide the following information ansible_host: - This is the IP Address of the device of the host ansible_user: - This is the username to login to the device ansible_password: - this is the password to login to the device; if using a credential in AAP, you would associate that variables information here as a reference. i.e. Standalone deployments host_vars f5_destination_partition – This is the F5OS Partition information f5_destination_tenant – This is the F5OS Tenant information f5_source – This is the source device HA Pair Deployments group_vars ha_pair_destination_chassis – contains a group of 2 hosts for the destination tenants to be deployed to (can be 2 hosts with the same information or different) ha_pair_source – contains a group of 2 hosts for the source BIG-IP Devices in a synchronized HA Pair. ha_pair_source_dynamic - this group is created automatically throughout the code to program the new Tenant OSes after deployment (DOES NOT NEED TO BE CREATED) Demos/Information We have uploaded a new demo video below, you’ll see an migration of a synchronized HA Pair of BIG-IPs running as Viprion Tenants on F5 B2250 Blades running 15.1.10 transitioning to a pair of rSeries R5800s Tenant OSs running 17.5.1.x — demonstrating a smooth modernization process. Watch the synchronized HA migration Demo Video If you want to check out the information and demo video on the Standalone migrations, check out my other article at – Modernizing F5 Platforms with Ansible | DevCentral You can access the validated content via Ansible Automation Hub (Need Red Hat Account with AAP) https://console.redhat.com/ansible/automation-hub/repo/validated/f5networks/f5_platform_modernization/ Or you can access the direct code from our GitHub Repository https://github.com/f5devcentral/f5-bd-ansible-platform-modernization This project is built for the community/partners/system integrators — so as I always say, feel free to take it, fork it, and expand it. Let’s make F5 platform modernization as seamless and automated as possible!51Views3likes0CommentsGetting Started with the Certified F5 NGINX Gateway Fabric Operator on Red Hat OpenShift
As enterprises modernize their Kubernetes strategies, the shift from standard Ingress Controllers to the Kubernetes Gateway API is redefining how we manage traffic. For years, the F5 NGINX Ingress Controller has been a foundational component in OpenShift environments. With the certification of F5 NGINX Gateway Fabric (NGF) 2.2 for Red Hat OpenShift, that legacy enters its next chapter. This new certified operator brings the high-performance NGINX data plane into the standardized, role-oriented Gateway API model—with full integration into OpenShift Operator Lifecycle Manager (OLM). Whether you're a platform engineer managing cluster ingress or a developer routing traffic to microservices, NGF on OpenShift 4.19+ delivers a unified, secure, and fully supported traffic fabric. In this guide, we walk through installing the operator, configuring the NginxGatewayFabric resource, and addressing OpenShift-specific networking patterns such as NodePort + Route. Why NGINX Gateway Fabric on OpenShift? While Red Hat OpenShift 4.19+ includes native support for the Gateway API (v1.2.1), integrating NGF adds critical enterprise capabilities: ✔ Certified & OpenShift-Ready The operator is fully validated by Red Hat, ensuring UBI-compliant images and compatibility with OpenShift’s strict Security Context Constraints (SCCs). ✔ High Performance, Low Complexity NGF delivers the core benefits long associated with NGINX—efficiency, simplicity, and predictable performance. ✔ Advanced Traffic Capabilities Capabilities like Regular Expression path matching and support for ExternalName services allow for complex, hybrid-cloud traffic patterns. ✔ AI/ML Readiness NGF 2.2 supports the Gateway API Inference Extension, enabling inference-aware routing for GenAI and LLM workloads on platforms like Red Hat OpenShift AI. Prerequisites Before we begin, ensure you have: Cluster Administrator access to an OpenShift cluster (version 4.19 or later is recommended for Gateway API GA support). Access to the OpenShift Console and the oc CLI. Ability to pull images from ghcr.io or your internal mirror. Step 1: Installing the Operator from OperatorHub We leverage the Operator Lifecycle Manager (OLM) for a "point-and-click" installation that handles lifecycle management and upgrades. Log into the OpenShift Web Console as an administrator. Navigate to Operators > OperatorHub. Search for NGINX Gateway Fabric in the search box. Select the NGINX Gateway Fabric Operator card and click Install Accept the default installation mode (All namespaces) or select a specific namespace (e.g. nginx-gateway), and click Install. Wait until the status shows Succeeded. Once installed, the operator will manage NGF lifecycle automatically. Step 2: Configuring the NginxGatewayFabric Resource Unlike the Ingress Controller, which used NginxIngressController resources, NGF uses the NginxGatewayFabric Custom Resource (CR) to configure the control plane and data plane. In the Console, go to Installed Operators > NGINX Gateway Fabric Operator. Click the NginxGatewayFabric tab and select Create NginxGatewayFabric. Select YAML view to configure the deployment specifics. Step 3: Configuring the NginxGatewayFabric Resource NGF uses a Kubernetes Service to expose its data plane. Before the data plane launches, we must tell the Controller how to expose it. Option A - LoadBalancer (ROSA, ARO, Managed OpenShift) By default, the NGINX Gateway Fabric Operator configures the service type as LoadBalancer. On public cloud managed OpenShift services (like ROSA on AWS or ARO on Azure), this native default works out-of-the-box to provision a cloud load balancer. No additional steps required. Option B - NodePort with OpenShift Route (On-Prem/Hybrid) However, for on-premise or bare-metal OpenShift clusters lacking a native LoadBalancer implementation, the common pattern is to use a NodePort service exposed via an OpenShift Route. Update the NGF CR to use NodePort In the Console, go to Installed Operators > NGINX Gateway Fabric Operator. Click the NginxGatewayFabric tab and select NginxGatewayFabric. Select YAML view to directly edit the configuration specifics. Change the spec.nginx.service.type to NodePort: apiVersion: gateway.nginx.org/v1alpha1 kind: NginxGatewayFabric metadata: name: default namespace: nginx-gateway spec: nginx: service: type: NodePort Create the OpenShift Route: After applying the CR, create a Route to expose the NGINX Service. oc create route edge ngf \ --service=nginxgatewayfabric-sample-nginx-gateway-fabric\ --port=http \ -n nginx-gateway Note: This creates an Edge TLS termination route. For passthrough TLS (allowing NGINX to handle certificates), use --passthrough and target the https port. Step 4: Validating the Deployment Verify that the operator has deployed the control plane pods successfully. oc get pod -n nginx-gateway NAME READY STATUS RESTARTS AGE nginx-gateway-fabric-controller-manager-dd6586597-bfdl5 1/1 Running 0 23m nginxgatewayfabric-sample-nginx-gateway-fabric-564cc6df4d-hztm8 1/1 Running 0 18m oc get gatewayclass NAME CONTROLLER ACCEPTED AGE nginx gateway.nginx.org/nginx-gateway-controller True 4d1h You should also see a GatewayClass named nginx. This indicates the controller is ready to manage Gateway resources. Step 5: Functional Check with Gateway API To test traffic, we will use the standard Gateway API resources (Gateway and HTTPRoute) Deploy a Test Application (Cafe Service) Ensure you have a backend service running. You can use a simple service for validation. Create a Gateway This resource opens the listener on the NGINX data plane. apiVersion: gateway.networking.k8s.io/v1 kind: Gateway metadata: name: cafe spec: gatewayClassName: nginx listeners: - name: http port: 80 protocol: HTTP Create an HTTPRoute This binds the traffic to your backend service. apiVersion: gateway.networking.k8s.io/v1 kind: HTTPRoute metadata: name: coffee spec: parentRefs: - name: cafe hostnames: - "cafe.example.com" rules: - matches: - path: type: PathPrefix value: / backendRefs: - name: coffee port: 80 Test Connectivity If you used Option B (Route), send a request to your OpenShift Route hostname. If you used Option A, send it to the LoadBalancer IP. OpenShift 4.19 Compatibility Meanwhile, it is vital to understand the "under the hood" constraints of OpenShift 4.19: Gateway API Version Pinning: OpenShift 4.19 ships with Gateway API CRDs pinned to v1.2.1. While NGF 2.2 supports v1.3.0 features, it has been conformance-tested against v1.2.1 to ensure stability within OpenShift's version-locked environment. oc get crd gateways.gateway.networking.k8s.io -o yaml | grep "gateway.networking.k8s.io/" gateway.networking.k8s.io/bundle-version: v1.2.1 gateway.networking.k8s.io/channel: standard However, looking ahead, future NGINX Gateway Fabric releases may rely on newer Gateway API specifications that are not natively supported by the pinned CRDs in OpenShift 4.19. If you anticipate running a newer NGF version that may not be compatible with the current OpenShift Gateway API version, please reach out to us to discuss your compatibility requirements. Security Context Constraints (SCC): In previous manual deployments, you might have wrestled with NET_BIND_SERVICE capabilities or creating custom SCCs. The Certified Operator handles these permissions automatically, using UBI-based images that comply with Red Hat's security standards out of the box. Next Steps: AI Inference With NGF running, you are ready for advanced use cases: AI Inference: Explore the Gateway API Inference Extension to route traffic to LLMs efficiently, optimizing GPU usage on Red Hat OpenShift AI. The certified NGINX Gateway Fabric Operator simplifies the operational burden, letting you focus on what matters: delivering secure, high-performance applications and AI workloads. References: NGINX Gateway Fabric Operator on Red Hat Catalog F5 NGINX Gateway Fabric Certified for Red Hat OpenShift NGINX Gateway Fabric Installation Docs164Views1like0CommentsI Tried to Beat OpenAI with Ollama in n8n—Here’s Why It Failed (and the Bug I’m Filing)
Hey, community. I wanted to share a story about how I built the n8n Labs workflow. It watches a YouTube channel, summarizes the latest videos with AI agents, and sends a clean HTML newsletter via Gmail. In the video, I show it working flawlessly with OpenAI. But before I got there, I spent a lot of time trying to copy the same flow using open source models through Ollama with the n8n Ollama node. My results were all over the map. I really wanted this to be a great “open source first” build. I tried many local models via Ollama, tuned prompts, adjusted parameters, and re‑ran tests. The outputs were always unpredictable: sometimes I’d get partial JSON, sometimes extra text around the JSON. Sometimes fields would be missing. Sometimes it would just refuse to stick to the structure I asked for. After enough iterations, I started to doubt whether my understanding of the agent setup was off. So, I built a quick proof inside the n8n Code node. If the AI Agent step is supposed to take the XML→JSON feed and reshape it into a structured list—title, description, content URL, thumbnail URL—then I should be able to do that deterministically in JavaScript and compare. I wrote a tiny snippet that reads the entries array, grabs the media fields, and formats a minimal output. And guess what? Voila. It worked on the first try and my HTML generator lit up exactly the way I wanted. That told me two things: one, my upstream data (HTTP Request + XML→JSON) was solid; and two, my desired output structure was clear and achievable without any trickery. With that proof in hand, I turned to OpenAI. I wired the same agent prompt, the same structured output parser, and the same workflow wiring—but swapped the Ollama node for an OpenAI chat model. It worked immediately. Fast, cheap, predictable. The agent returned a perfectly clean JSON with the fields I requested. My code node transformed it into HTML. The preview looked right, and Gmail sent the newsletter just like in the demo. So at that point, I felt confident the approach was sound and the transcript you saw in the video was repeatable—at least with OpenAI in the loop. Where does that leave Ollama and open source models? I’m not throwing shade—I love open source, and I want this path to be great. My current belief is the failure is somewhere inside the n8n Ollama node code path. I don’t think it’s the models themselves in isolation; I think the node may be mishandling one or more of these details: how messages are composed (system vs user). Whether “JSON mode” or a grammar/format hint is being passed, token/length defaults that cause truncation, stop settings that let extra text leak into the output; or the way the structured output parser constraints are communicated. If you’ve worked with local models, you know they can follow structure very well when you give them a strict format or grammar. If the node isn’t exposing that (or is dropping it on the floor), you get variability. To make sure this gets eyes from the right folks, my intent is to file a bug with n8n for the Ollama node. I’ll include a minimal, reproducible workflow: the same RSS fetch, the same XML→JSON conversion, the same agent prompt and required output shape, and a comparison run where OpenAI succeeds and Ollama does not. I’ll share versions, logs, model names, and settings so the team can trace exactly where the behavior diverges. If there’s a missing parameter (like format: json) or a message-role mix‑up, great—let’s fix it. If it needs a small enhancement to pass a grammar or schema to the model, even better. The net‑net is simple: for AI agents inside n8n to feel predictable with Ollama, we need the node to enforce reliably structured outputs the same way the OpenAI path does. That unlocks a ton of practical automation for folks who prefer local models. In the meantime, if you’re following the lab and want a rock‑solid fallback, you can use the Code node to do the exact transformation the agent would do. Here’s the JavaScript I wrote and tested in the workflow: const entries = $input.first().json.feed?.entry ?? []; function truncate(str, max) { if (!str) return ''; const s = String(str).trim(); return s.length > max ? s.slice(0, max) + '…' : s; // If you want total length (including …) to be max, use: // return s.length > max ? s.slice(0, Math.max(0, max - 1)) + '…' : s; } const output = entries.map(entry => { const g = entry['media:group'] ?? {}; return { title: g['media:title'] ?? '', description: truncate(g['media:description'], 60), contentUrl: g['media:content']?.url ?? '', thumbnailUrl: g['media:thumbnail']?.url ?? '' }; }); return [{ json: { output } }]; That snippet proves the data is there and your HTML builder is fine. If OpenAI reproduces the same structured JSON as the code, and Ollama doesn’t, the issue is likely in the node’s request/response handling rather than your workflow logic. I’ll keep pushing on the bug report so we can make agents with Ollama as predictable as they need to be. Until then, if you want speed and consistency to get the job done, OpenAI works great. If you’re experimenting with open source, try enforcing stricter formats and shorter outputs—and keep an eye on what the node actually sends to the model. As always, I’ll share updates, because I love sharing knowledge—and I want the open-source path to shine right alongside the rest of our AI, agents, n8n, Gmail, and OpenAI workflows. As always, community, if you have a resolution and can pull it off, please share!
144Views1like0CommentsModernizing F5 Platforms with Ansible
I’ve been meaning to publish this article for some time now. Over the past few months, I’ve been building Ansible automation that I believe will help customers modernize their F5 infrastructure. This especially true for those looking to migrate from legacy BIG-IP hardware to next-generation platforms like VELOS and rSeries. As I explored tools like F5 Journeys and traditional CLI-based migration methods, I noticed a significant amount of manual pre-work was still required. This includes: Ensuring the Master Key used to encrypt the UCS archive is preserved and securely handled Storing UCS, Master Key and information assets in a backup host Pre-configuring all VLANs and properly tagging them on the VELOS partition before deploying a Tenant OS To streamline this, I created an Ansible Playbook with supporting roles tailored for Red Hat Ansible Automation Platform. It’s built to perform a lift-and-shift migration of a F5 BIG-IP configuration from one device to another—with optional OS upgrades included. In the demo video below, you’ll see an automated migration of a F5 i10800 running 15.1.10 to a VELOS BX110 Tenant OS running 17.5.0—demonstrating a smooth, hands-free modernization process. Currently Working Velos Velos Controller/Partition running (F5OS-C 1.8.1) - which allows Tenant Management IP to be in a different VLAN Migrates a standalone F5 BIG-IP i10800 to a VELOS BX110 Tenant OS VLAN'ed Source tenant required (Doesn’t support non-vlan tenants) rSeries Shares MGMT IP with the same subnet as the Chassis Partition. Migrates a standalone F5 BIG-IP i10800 to a R5000 Tenant OS VLAN'ed Source tenant required (Doesn’t support non-vlan tenants) Handles: Configuration and crypto backup UCS creation, transfer, and validation F5OS System VLAN Creation, and Association to Tenant - (Does Not manage Interface to VLAN Mapping) F5 OS Tenant provisioning and deployment inline OS upgrades during the migration Roadmap / What's Next Expanding Testing to include Viprion/iSeries (Using VCMP) Tenant Testing. Supporting hardware-to-virtual platform migrations Adding functionality for HA (High Availability) environments Watch the Demo Video View the Source Code on GitHub https://github.com/f5devcentral/f5-bd-ansible-platform-modernization This project is built for the community—so feel free to take it, fork it, and expand it. Let’s make F5 platform modernization as seamless and automated as possible.
1.2KViews4likes2CommentsBIG-IP Next Edge Firewall CNF for Edge workloads
Introduction The CNF architecture aligns with cloud-native principles by enabling horizontal scaling, ensuring that applications can expand seamlessly without compromising performance. It preserves the deterministic reliability essential for telecom environments, balancing scalability with the stringent demands of real-time processing. More background information about what value CNF brings to the environment, https://community.f5.com/kb/technicalarticles/from-virtual-to-cloud-native-infrastructure-evolution/342364 Telecom service providers make use of CNFs for performance optimization, Enable efficient and secure processing of N6-LAN traffic at the edge to meet the stringent requirements of 5G networks. Optimize AI-RAN deployments with dynamic scaling and enhanced security, ensuring that AI workloads are processed efficiently and securely at the edge, improving overall network performance. Deploy advanced AI applications at the edge with the confidence of carrier-grade security and traffic management, ensuring real-time processing and analytics for a variety of edge use cases. CNF Firewall Implementation Overview Let’s start with understanding how different CRs are enabled within a CNF implementation this allows CNF to achieve more optimized performance, Capex and Opex. The traditional way of inserting services to the Kubernetes is as below, Moving to a consolidated Dataplane approach saved 60% of the Kubernetes environment’s performance The F5BigFwPolicy Custom Resource (CR) applies industry-standard firewall rules to the Traffic Management Microkernel (TMM), ensuring that only connections initiated by trusted clients will be accepted. When a new F5BigFwPolicy CR configuration is applied, the firewall rules are first sent to the Application Firewall Management (AFM) Pod, where they are compiled into Binary Large Objects (BLOBs) to enhance processing performance. Once the firewall BLOB is compiled, it is sent to the TMM Proxy Pod, which begins inspecting and filtering network packets based on the defined rules. Enabling AFM within BIG-IP Controller Let’s explore how we can enable and configure CNF Firewall. Below is an overview of the steps needed to set up the environment up until the CNF CRs installations [Enabling the AFM] Enabling AFM CR within BIG-IP Controller definition global: afm: enabled: true pccd: enabled: true f5-afm: enabled: true cert-orchestrator: enabled: true afm: pccd: enabled: true image: repository: "local.registry.com" [Configuration] Example for Firewall policy settings apiVersion: "k8s.f5net.com/v1" kind: F5BigFwPolicy metadata: name: "cnf-fw-policy" namespace: "cnf-gateway" spec: rule: - name: allow-10-20-http action: "accept" logging: true servicePolicy: "service-policy1" ipProtocol: tcp source: addresses: - "2002::10:20:0:0/96" zones: - "zone1" - "zone2" destination: ports: - "80" zones: - "zone3" - "zone4" - name: allow-10-30-ftp action: "accept" logging: true ipProtocol: tcp source: addresses: - "2002::10:30:0:0/96" zones: - "zone1" - "zone2" destination: ports: - "20" - "21" zones: - "zone3" - "zone4" - name: allow-us-traffic action: "accept" logging: true source: geos: - "US:California" destination: geos: - "MX:Baja California" - "MX:Chihuahua" - name: drop-all action: "drop" logging: true ipProtocol: any source: addresses: - "::0/0" - "0.0.0.0/0" [Logging & Monitoring] CNF firewall settings allow not only local logging but also to use HSL logging to external logging destinations. apiVersion: "k8s.f5net.com/v1" kind: F5BigLogProfile metadata: name: "cnf-log-profile" namespace: "cnf-gateway" spec: name: "cnf-logs" firewall: enabled: true network: publisher: "cnf-hsl-pub" events: aclMatchAccept: true aclMatchDrop: true tcpEvents: true translationFields: true Verifying the CNF firewall settings can be done through the sidecar container kubectl exec -it deploy/f5-tmm -c debug -n cnf-gateway – bash tmctl -d blade fw_rule_stat context_type context_name ------------ ------------------------------------------ virtual cnf-gateway-cnf-fw-policy-SecureContext_vs rule_name micro_rules counter last_hit_time action ------------------------------------ ----------- ------- ------------- ------ allow-10-20-http-firewallpolicyrule 1 2 1638572860 2 allow-10-30-ftp-firewallpolicyrule 1 5 1638573270 2 Conclusion To conclude our article, we showed how CNFs with consolidated data planes help with optimizing CNF deployments. In this article we went through the overview of BIG-IP Next Edge Firewall CNF implementation, sample configuration and monitoring capabilities. More use cases to cover different use cases to be following. Related content F5BigFwPolicy BIG-IP Next Cloud-Native Network Functions (CNFs) CNF Home86Views2likes1CommentPowering Progressive Deployment in Kubernetes with NGINX and Argo Rollouts
This article demonstrates you how you can use NGINX Gateway Fabric combined with Argo Rollouts to perform progressive delivery. The Canary pattern will be introduced and is employed as we explore and execute rollout scenarios.475Views1like0CommentsUsing n8n To Orchestrate Multiple Agents
I’ve been heads-down building a series of AI step-by-step labs, and this one might be my favorite so far: a practical, cost-savvy “mixture of experts” architectural pattern you can run with n8n and self-hosted models on Ollama. The idea is simple. Not every prompt needs a heavyweight reasoning model. In fact, most don’t. So we put a small, fast model in front to classify the user’s request—coding, reasoning, or something else—and then hand that prompt to the right expert. That way, you keep your spend and latency down, and only bring out the big guns when you really need them. Architecture at a glance: Two hosts: one for your models (Ollama) and one for your n8n app. Keeping these separate helps n8n stay snappy while the model server does the heavy lifting. Docker everywhere, with persistent volumes for both Ollama and n8n so nothing gets lost across restarts. Optional but recommended: NVIDIA GPU on the model host, configured with the NVIDIA Container Toolkit to get the most out of inference. On the model server, we spin up Ollama and pull a small set of targeted models: deepseek-r1:1.5b for the classifier and general chit-chat deepseek-r1:7b for the reasoning agent (this is your “brains-on” model) codellama:latest for coding tasks (Python, JSON, Node.js, iRules, etc.) llama3.2:3b as an alternative generalist On the app server, we run n8n. Inside n8n, the flow starts with the “On Chat Message” trigger. I like to immediately send a test prompt so there’s data available in the node inspector as I build. It makes mapping inputs easier and speeds up debugging. Next up is the Text Classifier node. The trick here is a tight system, prompt and clear categories: Categories: Reasoning and Coding Options: When no clear match → Send to an “Other” branch Optional: You can allow multiple matches if you want the same prompt to hit more than one expert. I’ve tried both approaches. For certain, ambiguous asks, allowing multiple can yield surprisingly strong results. I attach deepseek-r1:1.5b to the classifier. It’s inexpensive and fast, which is exactly what you want for routing. In the System Prompt Template, I tell it: If a prompt explicitly asks for coding help, classify it as Coding If it explicitly asks for reasoning help, classify it as Reasoning Otherwise, pass the original chat input to a Generalist From there, each classifier output connects to its own AI Agent node: Reasoning Agent → deepseek-r1:7b Coding Agent → codellama:latest Generalist Agent (the “Other” branch) → deepseek-r1:1.5b or llama3.2:3b I enable “Retry on Fail” on the classifier and each agent. In my environment (cloud and long-lived connections), a few retries smooth out transient hiccups. It’s not a silver bullet, but it prevents a lot of unnecessary red Xs while you’re iterating. Does this actually save money? If you’re paying per token on hosted models, absolutely. You’re deferring the expensive reasoning calls until a small model decides they’re justified. Even self-hosted, you’ll feel the difference in throughput and latency. CodeLlama crushes most code-related queries without dragging a reasoning model into it. And for general questions—“How do I make this sandwich?”—A small generalist is plenty. A few practical notes from the build: Good inputs help. If you know you’re asking for code, say so. Your classifier and downstream agent will have an easier time. Tuning beats guessing. Spend time on the classifier’s system prompt. Small changes go a long way. Non-determinism is real. You’ll see variance run-to-run. Between retries, better prompts, and a firm “When no clear match” path, you can keep that variance sane. Bigger models, better answers. If you have the budget or hardware, plugging in something like Claude, GPT, or a higher-parameter DeepSeek will lift quality. The routing pattern stays the same. Where to take it next: Wire this to Slack so an engineering channel can drop prompts and get routed answers in place. Add more “experts” (e.g., a data-analysis agent or an internal knowledge agent) and expand your classifier categories. Log token counts/latency per branch so you can actually measure savings and adjust thresholds/models over time. This is a lab, not a production, but the pattern is production-worthy with the right guardrails. Start small, measure, tune, and only scale up the heavy models where you’re seeing real business value. Let me know what you build—especially if you try multi-class routing and send prompts to more than one expert. Some of the combined answers I’ve seen are pretty great. Here's the lab in our git, if you'd like to try it out for yourself. If video is more your thing, try this: Thanks for building along, and I’ll see you in the next lab.
148Views3likes0Comments