Using n8n To Orchestrate Multiple Agents
I’ve been heads-down building a series of AI step-by-step labs, and this one might be my favorite so far: a practical, cost-savvy “mixture of experts” architectural pattern you can run with n8n and self-hosted models on Ollama. The idea is simple. Not every prompt needs a heavyweight reasoning model. In fact, most don’t. So we put a small, fast model in front to classify the user’s request—coding, reasoning, or something else—and then hand that prompt to the right expert. That way, you keep your spend and latency down, and only bring out the big guns when you really need them.

Architecture at a glance:
- Two hosts: one for your models (Ollama) and one for your n8n app. Keeping these separate helps n8n stay snappy while the model server does the heavy lifting.
- Docker everywhere, with persistent volumes for both Ollama and n8n so nothing gets lost across restarts.
- Optional but recommended: an NVIDIA GPU on the model host, configured with the NVIDIA Container Toolkit to get the most out of inference.

On the model server, we spin up Ollama and pull a small set of targeted models:
- deepseek-r1:1.5b for the classifier and general chit-chat
- deepseek-r1:7b for the reasoning agent (this is your “brains-on” model)
- codellama:latest for coding tasks (Python, JSON, Node.js, iRules, etc.)
- llama3.2:3b as an alternative generalist

On the app server, we run n8n. Inside n8n, the flow starts with the “On Chat Message” trigger. I like to immediately send a test prompt so there’s data available in the node inspector as I build. It makes mapping inputs easier and speeds up debugging.

Next up is the Text Classifier node. The trick here is a tight system prompt and clear categories:
- Categories: Reasoning and Coding
- Options: When no clear match → send to an “Other” branch
- Optional: You can allow multiple matches if you want the same prompt to hit more than one expert.

I’ve tried both approaches. For certain ambiguous asks, allowing multiple matches can yield surprisingly strong results.

I attach deepseek-r1:1.5b to the classifier. It’s inexpensive and fast, which is exactly what you want for routing. In the System Prompt Template, I tell it:
- If a prompt explicitly asks for coding help, classify it as Coding
- If it explicitly asks for reasoning help, classify it as Reasoning
- Otherwise, pass the original chat input to a Generalist

From there, each classifier output connects to its own AI Agent node (a sketch of the same routing idea in plain code follows at the end of this article):
- Reasoning Agent → deepseek-r1:7b
- Coding Agent → codellama:latest
- Generalist Agent (the “Other” branch) → deepseek-r1:1.5b or llama3.2:3b

I enable “Retry on Fail” on the classifier and each agent. In my environment (cloud and long-lived connections), a few retries smooth out transient hiccups. It’s not a silver bullet, but it prevents a lot of unnecessary red Xs while you’re iterating.

Does this actually save money? If you’re paying per token on hosted models, absolutely. You’re deferring the expensive reasoning calls until a small model decides they’re justified. Even self-hosted, you’ll feel the difference in throughput and latency. CodeLlama crushes most code-related queries without dragging a reasoning model into it. And for general questions—“How do I make this sandwich?”—a small generalist is plenty.

A few practical notes from the build:
- Good inputs help. If you know you’re asking for code, say so. Your classifier and downstream agent will have an easier time.
- Tuning beats guessing. Spend time on the classifier’s system prompt. Small changes go a long way.
- Non-determinism is real. You’ll see variance run-to-run. Between retries, better prompts, and a firm “When no clear match” path, you can keep that variance sane.
- Bigger models, better answers. If you have the budget or hardware, plugging in something like Claude, GPT, or a higher-parameter DeepSeek will lift quality. The routing pattern stays the same.

Where to take it next:
- Wire this to Slack so an engineering channel can drop prompts and get routed answers in place.
- Add more “experts” (e.g., a data-analysis agent or an internal knowledge agent) and expand your classifier categories.
- Log token counts and latency per branch so you can actually measure savings and adjust thresholds and models over time.

This is a lab, not production, but the pattern is production-worthy with the right guardrails. Start small, measure, tune, and only scale up the heavy models where you’re seeing real business value. Let me know what you build—especially if you try multi-class routing and send prompts to more than one expert. Some of the combined answers I’ve seen are pretty great. Here's the lab in our git, if you'd like to try it out for yourself. If video is more your thing, try this:

Thanks for building along, and I’ll see you in the next lab.
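One more thing before you go: outside of n8n, the same classify-then-route idea fits in a few lines against Ollama's local REST API. The sketch below is illustrative only; it assumes Ollama's default endpoint on localhost:11434 and the model names from this lab, and the prompt wording and fallback behavior are my own rather than part of the workflow.

import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local Ollama endpoint

CLASSIFIER_MODEL = "deepseek-r1:1.5b"
EXPERTS = {
    "coding": "codellama:latest",
    "reasoning": "deepseek-r1:7b",
    "other": "llama3.2:3b",
}

def ask(model, prompt):
    # One non-streaming generation call to Ollama; returns the model's text.
    resp = requests.post(OLLAMA_URL, json={"model": model, "prompt": prompt, "stream": False}, timeout=300)
    resp.raise_for_status()
    return resp.json()["response"]

def classify(prompt):
    # Let the small model act as the router; fall back to the generalist on ambiguity.
    label = ask(
        CLASSIFIER_MODEL,
        "Classify the user request as exactly one word: coding, reasoning, or other.\n"
        f"Request: {prompt}\nAnswer:",
    ).strip().lower()
    for category in ("coding", "reasoning"):
        if category in label:
            return category
    return "other"

def route(prompt):
    return ask(EXPERTS[classify(prompt)], prompt)

if __name__ == "__main__":
    print(route("Write a Python function that reverses a linked list."))

Note that deepseek-r1 models tend to emit their reasoning before the answer, so in practice you would tighten the label parsing (or the temperature) just as you would tighten the classifier's system prompt in n8n.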
Managing Model Context Protocol in iRules - Part 3
In part 2 of this series, we took a look at a couple of iRules use cases that do not require the json or sse profiles and don't capitalize on the new JSON commands and events introduced in the v21 release. That changes now! In this article, we'll take a look at two use cases: logging MCP activity and removing MCP tools from a server's tool list.

Event logging

This iRule logs various HTTP, SSE, and JSON-related events for debugging and monitoring purposes. It provides clear visibility into request/response flow and detects anomalies or errors.

How it works
- HTTP_REQUEST: Logs each HTTP request with its URI and client IP. Example: "HTTP request received: URI /example from 192.168.1.1"
- SSE_RESPONSE: Logs when a Server-Sent Event (SSE) response is identified. Example: "SSE response detected successfully."
- JSON_REQUEST and JSON_RESPONSE: Logs when valid JSON requests or responses are detected. Examples: "JSON Request detected successfully." / "JSON response detected successfully."
- JSON_REQUEST_MISSING and JSON_RESPONSE_MISSING: Logs if JSON payloads are missing from requests or responses. Examples: "JSON Request missing." / "JSON Response missing."
- JSON_REQUEST_ERROR and JSON_RESPONSE_ERROR: Logs when there are errors in parsing JSON during requests or responses. Examples: "Error processing JSON request. Rejecting request." / "Error processing JSON response."

iRule: Event Logging

when HTTP_REQUEST {
    # Log the event (for debugging)
    log local0. "HTTP request received: URI [HTTP::uri] from [IP::client_addr]"
}
when SSE_RESPONSE {
    # Triggered when a Server-Sent Event response is detected
    log local0. "SSE response detected successfully."
}
when JSON_REQUEST {
    # Triggered when the JSON request is detected
    log local0. "JSON Request detected successfully."
}
when JSON_RESPONSE {
    # Triggered when a JSON response is detected
    log local0. "JSON response detected successfully."
}
when JSON_RESPONSE_MISSING {
    # Triggered when the JSON payload is missing from the server response
    log local0. "JSON Response missing."
}
when JSON_REQUEST_MISSING {
    # Triggered when the JSON is missing or can't be parsed in the request
    log local0. "JSON Request missing."
}
when JSON_RESPONSE_ERROR {
    # Triggered when there's an error in the JSON response processing
    log local0. "Error processing JSON response."
    #HTTP::respond 500 content "Invalid JSON response from server."
}
when JSON_REQUEST_ERROR {
    # Triggered when an error occurs (e.g., malformed JSON) during JSON processing
    log local0. "Error processing JSON request. Rejecting request."
    #HTTP::respond 400 content "Malformed JSON payload. Please check your input."
}

MCP tool removal

This iRule modifies server JSON responses by removing disallowed tools from the result.tools array while logging detailed debugging information.

How it works

JSON parsing and logging
- print procedure - recursively traverses and logs the JSON structure, including arrays, objects, strings, and other types.
- jpath procedure - extracts values or JSON elements based on a provided path, allowing targeted retrieval of nested properties.

JSON response handling
- When JSON_RESPONSE is triggered: Logs the root JSON object and parses it using JSON::root. Extracts the tools array from result.tools.

Tool removal logic
- Iterates over the tools array and retrieves the name of each tool.
- If the tool name matches start-notification-stream: Removes it from the array using JSON::array remove. Logs that the tool is not allowed.
- If the tool does not match: Logs that the tool is allowed and moves to the next one.
Logging information
- Logs all JSON structures and actions: the full JSON structure, the extracted tools array, and the tools allowed or removed.

Input JSON Response

{
  "result": {
    "tools": [
      {"name": "start-notification-stream"},
      {"name": "allowed-tool"}
    ]
  }
}

Modified Response

{
  "result": {
    "tools": [
      {"name": "allowed-tool"}
    ]
  }
}

iRule: Remove tool list

# Code to check JSON and print in logs
proc print { e } {
    set t [JSON::type $e]
    set v [JSON::get $e]
    set p0 [string repeat " " [expr {2 * ([info level] - 1)}]]
    set p [string repeat " " [expr {2 * [info level]}]]
    switch $t {
        array {
            log local0. "$p0\["
            set size [JSON::array size $v]
            for {set i 0} {$i < $size} {incr i} {
                set e2 [JSON::array get $v $i]
                call print $e2
            }
            log local0. "$p0\]"
        }
        object {
            log local0. "$p0{"
            set keys [JSON::object keys $v]
            foreach k $keys {
                set e2 [JSON::object get $v $k]
                log local0. "$p${k}:"
                call print $e2
            }
            log local0. "$p0}"
        }
        string -
        literal {
            set v2 [JSON::get $e $t]
            log local0. "$p\"$v2\""
        }
        default {
            set v2 [JSON::get $e $t]
            log local0. "$p$v2"
        }
    }
}

proc jpath { e path {d .} } {
    if { [catch {set v [call jpath2 $e $path $d]} err] } {
        return ""
    }
    return $v
}

proc jpath2 { e path {d .} } {
    set parray [split $path $d]
    set plen [llength $parray]
    set i 0
    for {} {$i < [expr {$plen}]} {incr i} {
        set p [lindex $parray $i]
        set t [JSON::type $e]
        set v [JSON::get $e]
        if { $t eq "array" } {
            # array
            set e [JSON::array get $v $p]
        } else {
            # object
            set e [JSON::object get $v $p]
        }
    }
    set t [JSON::type $e]
    set v [JSON::get $e $t]
    return $v
}

# Modify in response
when JSON_RESPONSE {
    log local0. "JSON::root"
    set root [JSON::root]
    call print $root
    set tools [call jpath $root result.tools]
    log local0. "root = $root tools= $tools"
    if { $tools ne "" } {
        log local0. "TOOLS not empty"
        set i 0
        set block_tool "start-notification-stream"
        while { $i < 100 } {
            set name [call jpath $root result.tools.${i}.name]
            if { $name eq "" } {
                break
            }
            if { $name eq $block_tool } {
                log local0. "tool $name is not allowed"
                JSON::array remove $tools $i
            } else {
                log local0. "tool $name is allowed"
                incr i
            }
        }
    } else {
        log local0. "no tools"
    }
}

Conclusion

This not only concludes the article, but also this introductory series on managing MCP in iRules. Note that all these commands handle all things JSON, so you are not limited to MCP contexts. We look forward to what the community will build (and hopefully share back) with this new functionality!

NOTE: This series is ghostwritten. Awaiting permission from original author to credit.
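One footnote on the tool-removal logic: if it helps to see the transformation outside of Tcl, here is a minimal Python sketch of the same result.tools filtering (purely illustrative; it is not part of the lab, and the payload shape simply mirrors the example above).

import json

BLOCKED_TOOLS = {"start-notification-stream"}  # tools to strip from the response

def filter_tools(raw_response):
    # Remove disallowed entries from result.tools, mirroring the iRule's loop.
    doc = json.loads(raw_response)
    result = doc.get("result")
    if isinstance(result, dict) and isinstance(result.get("tools"), list):
        result["tools"] = [t for t in result["tools"] if t.get("name") not in BLOCKED_TOOLS]
    return json.dumps(doc)

original = '{"result": {"tools": [{"name": "start-notification-stream"}, {"name": "allowed-tool"}]}}'
print(filter_tools(original))  # {"result": {"tools": [{"name": "allowed-tool"}]}}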
Managing Model Context Protocol in iRules - Part 2

In the first article in this series, we took a look at what Model Context Protocol (MCP) is, and how to get the F5 BIG-IP set up to manage it with iRules. In this article, we'll take a look at the first couple of use cases with session persistence and routing. Note that the use cases in this article do not require the json or sse profiles to work. That will change in part 3.

Session persistence and routing

This iRule ensures session persistence and traffic routing for three endpoints: /sse, /messages, and /mcp. It injects routing information (f5Session) via query parameters or headers, processes them for routing to specific pool members, and transparently forwards requests to the server.

How it works

The client sends an HTTP GET request to the SSE endpoint of the server (typically /sse):

GET /sse HTTP/1.1

The server responds 200 OK with an SSE event stream. It includes an SSE message with an "event" field of "endpoint", which provides the client with a URI where all its future HTTP requests must be sent. This is where servers might include a session ID:

event: endpoint
data: /messages?sessionId=abcd1234efgh5678

NOTE: the MCP spec does not specify how a session ID can be encoded in the endpoint here. While we have only seen use of a sessionId query parameter, theoretically a server could implement its session IDs with any arbitrary query parameter name, or even as part of the path like this:

event: endpoint
data: /messages/abcd1234efgh5678

Our iRule can take advantage of this mechanism by injecting a query parameter into this path that tells us which server we should persist future requests to. So when we forward the SSE message to the client, it looks something like this:

event: endpoint
data: /messages?f5Session=some_pool_name,10.10.10.5:8080&sessionId=abcd1234efgh5678

or

event: endpoint
data: /messages/abcd1234efgh5678?f5Session=some_pool_name,10.10.10.5:8080

When the client sends a subsequent HTTP request, it will use this endpoint. Thus, when processing HTTP requests, we can read the f5Session secret we inserted earlier, route to that pool member, and then remove our secret before forwarding the request to the server using the original endpoint/sessionId it provided.
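For context on why this injection is transparent, it helps to see what a typical SSE-transport MCP client does: it reads the endpoint event and then POSTs every JSON-RPC message to whatever URI was announced. A minimal, hedged Python sketch follows (the host name is hypothetical and the SSE parsing is deliberately simplified). With that behavior in mind, the two iRules follow.

import requests

BASE = "https://mcp.example.com"  # hypothetical MCP server published through BIG-IP

def get_session_endpoint():
    # Open the SSE stream and return the URI announced in the "endpoint" event.
    event, data = None, None
    with requests.get(BASE + "/sse", stream=True, timeout=30) as resp:
        resp.raise_for_status()
        for raw in resp.iter_lines(decode_unicode=True):
            if raw.startswith("event:"):
                event = raw.split(":", 1)[1].strip()
            elif raw.startswith("data:"):
                data = raw.split(":", 1)[1].strip()
            elif raw == "" and event == "endpoint" and data:
                # e.g. /messages?f5Session=some_pool_name,10.10.10.5:8080&sessionId=abcd1234efgh5678
                return data
    raise RuntimeError("server never announced an endpoint event")

endpoint = get_session_endpoint()
# Every later request reuses the announced URI verbatim, so a query parameter
# injected by BIG-IP rides along and can be stripped again on the way back in.
requests.post(BASE + endpoint, json={"jsonrpc": "2.0", "id": 1, "method": "tools/list"})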
Load Balancing

when HTTP_REQUEST {
    set is_req_to_sse_endpoint false

    # Handle requests to `/sse` (Server-Sent Event endpoint)
    if { [HTTP::path] eq "/sse" } {
        set is_req_to_sse_endpoint true
        return
    }

    # Handle `/messages` endpoint persistence query processing
    if { [HTTP::path] eq "/messages" } {
        set query_string [HTTP::query]
        set f5_sess_found false
        set new_query_string ""
        set query_separator ""
        set queries [split $query_string "&"] ;# Split query string into individual key-value pairs
        foreach query $queries {
            if { $f5_sess_found } {
                append new_query_string "${query_separator}${query}"
                set query_separator "&"
            } elseif { [string match "f5Session=*" $query] } {
                # Parse `f5Session` for persistence routing
                set pmbr_info [string range $query 10 end]
                set pmbr_parts [split $pmbr_info ","]
                if { [llength $pmbr_parts] == 2 } {
                    set pmbr_tuple [split [lindex $pmbr_parts 1] ":"]
                    if { [llength $pmbr_tuple] == 2 } {
                        pool [lindex $pmbr_parts 0] member [lindex $pmbr_parts 1]
                        set f5_sess_found true
                    } else {
                        HTTP::respond 404 noserver
                        return
                    }
                } else {
                    HTTP::respond 404 noserver
                    return
                }
            } else {
                append new_query_string "${query_separator}${query}"
                set query_separator "&"
            }
        }
        if { $f5_sess_found } {
            HTTP::query $new_query_string
        } else {
            HTTP::respond 404 noserver
        }
        return
    }

    # Handle `/mcp` endpoint persistence via session header
    if { [HTTP::path] eq "/mcp" } {
        if { [HTTP::header exists "Mcp-Session-Id"] } {
            set header_value [HTTP::header "Mcp-Session-Id"]
            set header_parts [split $header_value ","]
            if { [llength $header_parts] == 3 } {
                set pmbr_tuple [split [lindex $header_parts 1] ":"]
                if { [llength $pmbr_tuple] == 2 } {
                    pool [lindex $header_parts 0] member [lindex $header_parts 1]
                    HTTP::header replace "Mcp-Session-Id" [lindex $header_parts 2]
                } else {
                    HTTP::respond 404 noserver
                }
            } else {
                HTTP::respond 404 noserver
            }
        }
    }
}

when HTTP_RESPONSE {
    # Persist session for MCP responses
    if { [HTTP::header exists "Mcp-Session-Id"] } {
        set pool_member [LB::server pool],[IP::remote_addr]:[TCP::remote_port]
        set header_value [HTTP::header "Mcp-Session-Id"]
        set new_header_value "$pool_member,$header_value"
        HTTP::header replace "Mcp-Session-Id" $new_header_value
    }

    # Inject persistence information into response payloads for Server-Sent Events
    if { $is_req_to_sse_endpoint } {
        set sse_data [HTTP::payload] ;# Get the SSE payload

        # Extract existing query params from the SSE response
        set old_queries [URI::query $sse_data]
        if { [string length $old_queries] == 0 } {
            set query_separator ""
        } else {
            set query_separator "&"
        }

        # Insert `f5Session` persistence information into query
        set new_query "f5Session=[URI::encode [LB::server pool],[IP::remote_addr]:[TCP::remote_port]]"
        set new_payload "?${new_query}${query_separator}${old_queries}"

        # Replace the payload in the SSE response
        HTTP::payload replace 0 [string length $sse_data] $new_payload
    }
}

Persistence

when CLIENT_ACCEPTED {
    # Log when a new TCP connection arrives (useful for debugging)
    log local0. "New TCP connection accepted from [IP::client_addr]:[TCP::client_port]"
}

when HTTP_REQUEST {
    # Check if this might be an SSE request by examining the Accept header
    if {[HTTP::header exists "Accept"] && [HTTP::header "Accept"] contains "text/event-stream"} {
        log local0. "SSE Request detected from [IP::client_addr] to [HTTP::uri]"
        # Insert a custom persistence key (optional)
        set sse_persistence_key "[IP::client_addr]:[HTTP::uri]"
        persist uie $sse_persistence_key
    }
}

when HTTP_RESPONSE {
    # Ensure this is an SSE connection by checking the Content-Type
    if {[HTTP::header exists "Content-Type"] && [HTTP::header "Content-Type"] equals "text/event-stream"} {
        log local0. "SSE Response detected for [IP::client_addr]. Enabling persistence."
        # Use the same persistence key for the response
        persist add uie $sse_persistence_key
    }
}

Conclusion

Thank you for your patience! Now is the time to continue on to part 3 where we'll finally get into the new JSON commands and events added in version 21!

NOTE: This series is ghostwritten. Awaiting permission from original author to credit.

Managing Model Context Protocol in iRules - Part 1
The Model Context Protocol (MCP) was introduced by Anthropic in November of 2024, and has taken the industry by storm since. MCP provides a standardized way for AI applications to connect with external data sources and tools through a single protocol, eliminating the need for custom integrations for each service and enabling AI systems to dynamically discover and use available capabilities. It has gained rapid industry adoption because major model providers and numerous IDE and tool makers have embraced it as an open standard, with tens of thousands of MCP servers built and widespread recognition that it mostly solves the fragmented integration challenge that previously plagued AI development. In this article, we'll take a look at the MCP components, how MCP works, and how you can use the JSON iRules events and commands introduced in version 21 to control the messaging between MCP clients and servers.

MCP components

Host: The host is the AI application where the LLM logic resides, such as Claude Desktop, AI-powered IDEs like Cursor, Open WebUI with the mcpo proxy like in our AI Step-by-Step labs, or custom agentic systems that receive user requests and orchestrate the overall interaction.

Client: The client exists within the host application and maintains a one-to-one connection with each MCP server, converting user requests into the structured format that the protocol can process and managing session details like timeouts and reconnects.

Server: Servers are lightweight programs that expose data and functionality from external systems, whether internal databases or external APIs, allowing connections to both local and remote resources. Multiple clients can exist within a host, but each client has a dedicated (or perceived, in the case of using a proxy) 1:1 relationship with an MCP server. MCP servers expose three main types of capabilities:
- Resources - information retrieval without executing actions
- Tools - performing side effects like calculations or API requests
- Prompts - reusable templates and workflows for LLM-server communication

Message format (JSON-RPC)

The transport layer between clients and servers uses the JSON-RPC format for two-way message conversion, allowing the transport of various data structures and their processing rules. This enforces a consistent request/response format across all tools, so applications don't have to handle different response types for different services.

Transport options

MCP supports three standard transport mechanisms: stdio (standard input/output for local connections), Server-Sent Events (SSE, for remote connections with separate endpoints for requests and responses), and Streamable HTTP (a newer method introduced in March 2025 that uses a single HTTP endpoint for bidirectional messaging).

NOTE: SSE transport has been deprecated as of protocol version 2024-11-05 and replaced by Streamable HTTP, which addresses limitations like the lack of resumable streams and the need to maintain long-lived connections, though SSE is still supported for backward compatibility.

MCP workflow

Pictures tell a compelling story. First, the diagram. The steps in the diagram above are as follows:
1. The MCP client requests capabilities from the MCP server
2. The MCP server provides a list of available tools and services
3. The MCP client sends the question and the retrieved MCP server tools and services to the LLM
4. The LLM specifies which tools and services to use
5. The MCP client calls the specific tool or service
6. The MCP server returns the result/context to the MCP client
7. The MCP client passes the result/context to the LLM
8. The LLM uses the result/context to prepare the answer

iRules MCP-based use cases

There are a bunch of use cases for MCP handling, such as:
- Load balancing of MCP traffic across MCP servers
- High availability of the MCP servers
- MCP message validation on behalf of MCP servers
- MCP protocol inspection and payload modification
- Monitoring the MCP servers' health and their transport protocol status. In case of any error in an MCP request or response, BIG-IP should be able to detect and report it to the user
- Optimization profiles support: use a OneConnect profile, use a compression profile
- Security support for MCP servers. There are no native features for this yet, but you can build your own secure business logic into the iRules logic for now.

LTM profiles

Configuring MCP involves creating two profiles - an SSE profile and a JSON profile - and then attaching them to a virtual server. The SSE profile is for backwards compatibility should you need it in your MCP client/server environment. The defaults for these profiles are shown below.

[root@ltm21a:Active:Standalone] config # tmsh list ltm profile sse all-properties
ltm profile sse sse {
    app-service none
    defaults-from none
    description none
    max-buffered-msg-bytes 65536
    max-field-name-size 1024
    partition Common
}
[root@ltm21a:Active:Standalone] config # tmsh list ltm profile json all-properties
ltm profile json json {
    app-service none
    defaults-from none
    description none
    maximum-bytes 65536
    maximum-entries 2048
    maximum-non-json-bytes 32768
    partition Common
}

These can be tuned down from these maximums by creating custom profiles that will meet the needs of your environment, for example (without all properties like above):

[root@ltm21a:Active:Standalone] config # tmsh create ltm profile sse sse_test_env max-buffered-msg-bytes 1000 max-field-name-size 500
[root@ltm21a:Active:Standalone] config # tmsh create ltm profile json json_test_env maximum-bytes 3000 maximum-entries 1000 maximum-non-json-bytes 2000
[root@ltm21a:Active:Standalone] config # tmsh list ltm profile sse sse_test_env
ltm profile sse sse_test_env {
    app-service none
    max-buffered-msg-bytes 1000
    max-field-name-size 500
}
[root@ltm21a:Active:Standalone] config # tmsh list ltm profile json json_test_env
ltm profile json json_test_env {
    app-service none
    maximum-bytes 3000
    maximum-entries 1000
    maximum-non-json-bytes 2000
}

NOTE: Both profiles have database keys that can be temporarily enabled for troubleshooting purposes. The keys are log.sse.level and log.json.level. You can set the value for one or both to debug. Do not leave them in debug mode!

Conclusion

Now that we have laid the foundation, continue on to part 2 where we'll look at the first two use cases.

NOTE: This series is ghostwritten. Awaiting permission from original author to credit.
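As a quick aside before part 2: the JSON-RPC framing described above rides over plain HTTP, so you can exercise an MCP endpoint with a few lines of Python. This is only a sketch; the URL is hypothetical, real servers typically expect an initialize handshake and a session ID first, and Streamable HTTP responses may arrive as an SSE body rather than plain JSON.

import requests

MCP_URL = "https://mcp.example.com/mcp"  # hypothetical Streamable HTTP endpoint

def rpc(method, params=None, msg_id=1):
    # Send one JSON-RPC 2.0 request to an MCP server over Streamable HTTP.
    payload = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        payload["params"] = params
    resp = requests.post(
        MCP_URL,
        json=payload,
        headers={"Accept": "application/json, text/event-stream"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# Ask the server which tools it exposes; the reply carries result.tools,
# the same array the part 3 tool-removal iRule filters.
print(rpc("tools/list", msg_id=2))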
Accelerating AI Data Delivery with F5 BIG-IP

Introduction

AI continues to rely heavily on efficient data delivery infrastructures to innovate across industries. S3 is the protocol that AI/ML engineers rely on for data delivery. As AI workloads grow in complexity, ensuring seamless and resilient data ingestion and delivery becomes critical to supporting massive datasets, robust training workflows, and production-grade outputs. S3 is HTTP-based, so F5 is commonly used to provide advanced capabilities for managing S3-compatible storage pipelines, enforcing policies, and preventing delivery failures. This enables businesses to maintain operational excellence in AI environments.

This article explores three key functions of F5 BIG-IP within AI data delivery through embedded demo videos. From optimizing S3 data pipelines and enforcing granular policies to monitoring traffic health in real time, F5 presents core functions for developers and organizations striving for agility in their AI operations.

The diagram shows a scalable, resilient, and secure AI architecture facilitated by F5 BIG-IP. End-user traffic is directed to the front-end application through F5, ensuring secure and load-balanced access via the "Web and API front door." This traffic interacts with the AI Factory, comprising components like AI agents, inference, and model training, also secured and scaled through F5. Data is ingested into enterprise events and data stores, which are securely delivered back to the AI Factory's model training through F5 to support optimized resource utilization. Additionally, the architecture includes Retrieval-Augmented Generation (RAG), securely backed by AI object storage and connected through F5 for AI APIs. Whether from the front-end applications or the AI Factory, traffic to downstream services like AI agents, databases, websites, or queues is routed via F5 to ensure consistency, security, and high availability across the ecosystem. This comprehensive deployment highlights F5's critical role in enabling secure, efficient AI-powered operations.

1. Ensure Resilient AI Data and S3 Delivery Pipelines with F5 BIG-IP

Modern AI workflows often rely on S3-compatible storage for high-throughput data delivery. However, a common problem is inefficient resource utilization in clusters due to uneven traffic distribution across storage nodes, causing bottlenecks, delays, and reliability concerns. If you manage your own storage environment, or have spoken to a storage administrator, you'll know that "hot spots" are something to avoid when dealing with disk arrays.

In this demo, F5 BIG-IP demonstrates how a loose-coupling architecture solves these issues. By intelligently distributing traffic across all cluster nodes via a virtual server, BIG-IP ensures balanced load distribution, eliminates bottlenecks, and provides high-performance bandwidth for AI workloads. The demo uses Warp, an S3 benchmarking tool, to highlight how F5 BIG-IP can take incoming S3 traffic and route it efficiently to storage clusters. We use the least-connection load balancing algorithm to minimize latency across the nodes while maximizing resource utilization. We also add new nodes to the load balancing pool, ensuring smooth, scalable, and resilient storage pipelines.

2. Enforce Policy-Driven AI Data Delivery with F5 BIG-IP

AI workloads are susceptible to traffic spikes that can destabilize storage clusters and impact concurrent data workflows. The video demonstrates using iRules to cap connections and stabilize clusters under high request-per-second spikes.
Additionally, we use local traffic policies to redirect specific buckets while preserving other ongoing requests. For operational clarity, the study tool visualizes real-time cluster metrics, offering deep insights into how policies influence traffic.

3. Prevent AI Data Delivery Failures with F5 BIG-IP

AI operations depend on high efficiency and reliable data delivery to maintain optimal training and model fine-tuning workflows. The video demonstrates how F5 BIG-IP uses real-time health monitors to ensure storage clusters remain operational during failure scenarios. By dynamically detecting node health and write quorum thresholds, BIG-IP intelligently routes traffic to backup pools or read quorum clusters without disrupting endpoints. The health monitors also detect partial node failures, which is important to avoid the risk of partial writes when working with S3 storage.

Conclusion

Once again, with AI so reliant on HTTP-based S3 storage, F5 administrators find themselves as a critical part of the latest technologies. By enabling loose coupling, enforcing granular policies, and monitoring traffic health in real time, F5 optimizes data delivery for improved AI model accuracy, faster innovation, and future-proof architectures. Whether facing unpredictable traffic surges or handling partial failures in clusters, BIG-IP ensures your applications remain resilient and ready to meet business demands with ease.

Related Resources
- AI Data Delivery Use Case
- AI Reference Architecture
- Enterprise AI delivery and security
Getting Started With n8n For AI Automation
First, what is n8n?

If you're not familiar with n8n yet, it's a workflow automation utility that allows us to use nodes to connect services quite easily. It's been the subject of quite a bit of Artificial Intelligence hype because it helps you construct AI agents. I'm going to be diving more into n8n and what it can do with AI. My hope is that you can use this in your own labs to work out some of these AI networking and security challenges in your environment. Here's an example of how someone could use Ollama to control multiple Twitter accounts, for instance:

How do you install it?

Well... it's all Node, so the best way to install it in any environment is to ensure you have Node version 22 installed on your machine (on macOS, via Homebrew: brew install node@22), along with nvm (again, on macOS: brew install nvm), and then run npm install -g n8n. Done! Really... that simple.

How much does it cost?

While there is support and expanded functionality for paid subscribers, there is also a community edition, which is what I have used here, and it's free.

How to license:
S3 Traffic Optimization with F5 BIG-IP & NetApp StorageGRID
The S3 protocol is seeing tremendous growth, often in projects involving AI where storage is a critical component, ranging from model training to inference with RAG. Model training projects usually need large, readily available datasets to keep GPUs operating efficiently. Additionally, storing sizable checkpoints—detailed snapshots of the model—is essential for these tasks. These are vital for resuming training after interruptions that could otherwise imperil weeks- or months-long projects. Regardless of the AI ecosystem, S3 increasingly comes into play in some manner. For example, it is common for one tier of the storage involved, perhaps an outer tier, to be tasked with network loads to accrue knowledge sources. This data acquisition role is normally achieved today using S3 protocol transactions.

Why does F5 BIG-IP, an industry-leading ADC solution, work so well for optimizing S3 flows that are directed at an S3 storage solution such as StorageGRID? An interesting aspect of S3 is just how easily extensible it is, to a degree other protocols may not be. Take, for example, routing protocols like OSPF or BGP-4 that are governed by RFCs controlled and published by the IETF (Internet Engineering Task Force). Unwavering compliance with RFC specifications is often non-negotiable with customers. Similarly, storage protocols are often governed by SNIA (Storage Networking Industry Association), and extensibility may not have been top of mind when standards were initially released. S3, as opposed to these examples, is proactively steered by Amazon. In fact, S3 (Simple Storage Service) was one of the earliest AWS services, launched in 2006. When referring to "S3" today, many times the reference is to the set of S3 API commands that the industry has adopted in general, not specifically the storage service of AWS. Amazon's documentation for S3, the starting point of which is found here, is easily digested, not clinical sets of clauses and archaic "must" or "should" directives. An entire category of user-defined metadata is made available, and encourages unbounded extensibility:

"User-defined metadata is metadata that you can choose to set at the time that you upload an object. This user-defined metadata is a set of name-value pairs"

Why S3's Flexibility Amplifies the Possibilities of BIG-IP and StorageGRID

The extensibility baked into S3 is tailor-made for the sophisticated and unique capabilities of BIG-IP. Specifically, BIG-IP ships with what has become an industry standard in data plane programmability: F5 iRules. For years, iRules have let IT administrators define specific, real-time actions based on the content and characteristics of network traffic. This enables them to do advanced tasks like content steering, protocol manipulation, customized persistence rules, and security policy enforcement.

StorageGRID from NetApp allows a site of clustered storage nodes to use the S3 protocol to both ingest content and deliver requested objects. Automatic backend synchronization allows any node to be offered up as a target by a server load balancer like BIG-IP. This allows overall storage node utilization to be optimized across the node set and scaled performance to reach the highest S3 API bandwidth levels, all while offering high availability to S3 API consumers. Should any node go offline, or be taken out of service for maintenance, laser-precise health checks from BIG-IP will remove that node from the pool used by F5, and customer S3 traffic will flow unimpeded.
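To make that name-value pair idea concrete, here is a minimal boto3 sketch of writing an object with user-defined metadata. The endpoint, credentials, bucket, and the specific object-storage-qos key anticipate the QoS steering lab described below; they are illustrative values, not required names.

import boto3

# Hypothetical lab values: the BIG-IP virtual server fronting StorageGRID and a test bucket
s3 = boto3.client(
    "s3",
    endpoint_url="https://10.150.92.75:443",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
    verify=False,  # lab only: skip certificate validation, like s3cmd --no-check-certificate
)

# boto3 sends user metadata as x-amz-meta-* headers, so an iRule on the BIG-IP
# sees x-amz-meta-object-storage-qos exactly as it would from any other client.
with open("large1megabytefile.txt", "rb") as f:
    s3.put_object(
        Bucket="mybucket001",
        Key="large1megabytefile.txt",
        Body=f,
        Metadata={"object-storage-qos": "silver"},
    )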
A previous article available here dove into the setup details and value delivered out of the box by BIG-IP for S3 clusters, such as NetApp StorageGRID.

Sample BIG-IP and StorageGRID Configuration – S3 Storage QoS Steering

One potential use case for BIG-IP is to expedite S3 writes such that the initial transaction is directed to the best storage node. Consider a workload that requires perhaps the latest in SSD technology, reflected by the lowest latency and highest IOPS. This will drive the S3 write (or read) towards the fastest S3 transactional completion time possible. A user can add this nuance, the need for this differentiated service level, easily as S3 metadata. The BIG-IP will make this custom field actionable, directing traffic to the backend storage node type required. As per normal StorageGRID behavior, any node can handle subsequent reads due to backend synchronization that occurs post-S3 write. Here is a sample scenario where storage nodes have been grouped into QoS levels, gold, silver and bronze, to reflect the performance of the media in use.

To demonstrate this use case, a lab environment was configured with three pools. This simulated a production solution where each pool could have differing hardware vintages and performance characteristics. To understand the use of S3 metadata fields, one may look at a simple packet trace of an S3 download (GET) directed towards a StorageGRID solution. Out of the box, a few header fields will indicate that the traffic is S3. By disabling TLS, a packet analyzer on the client, Wireshark, can display the User-Agent field value as highlighted above. Often S3 traffic will originate with specific S3 graphical utilities like S3 Browser, Cyberduck, or FileZilla. The indicator of S3 is also found in the presence of common HTTP X-headers; in the example we observe x-amz-content-sha256 and x-amz-date highlighted.

The guidance for adding one's own headers is to preface the new headers with "x-amz-meta-". In this exercise, we have chosen to include the header "x-amz-meta-object-storage-qos" and have BIG-IP search out the values gold, silver, and bronze to select the appropriate storage node pool. Traffic without this header, including S3 control plane actions, will be proxied by default to the bronze pool.

To exercise the lab setup, we will use s3cmd, a free command-line utility available for Linux and macOS with about 60 command-line options; s3cmd easily allows for header inclusions. Here is our example, which will move a 1-megabyte file using S3 into a StorageGRID-hosted bucket on the "Silver" pool:

s3cmd put large1megabytefile.txt s3://mybucket001/large1megabytefile.txt --host=10.150.92.75:443 --no-check-certificate --add-header="x-amz-meta-object-storage-qos:silver"

BIG-IP Setup and Validation of QoS Steering

The setup of a virtual server on BIG-IP is straightforward in our case. S3 traffic will be accepted on TCP port 443 of the server address, 10.150.92.75, and independent TLS sessions will face both the client and the backend storage nodes. As seen in the following BIG-IP screenshot, three pools have been defined. An iRule can be written from scratch or, alternatively, can easily be created using AI. The F5 AI Assistant, a chat interface found within the Distributed Cloud console, is extensively trained on iRules. It was used to create the following (double-click image to enlarge).
Upon issuing the client's s3cmd command to upload the 1-megabyte file with the "silver" QoS requirement, we immediately observe the following traffic to the pools, having just zeroed all counters. Converting the proxied 8.4 million bits to megabytes confirms that the 1,024 KB file indeed was sent to the correct pool. Other S3 control plane interactions make up the small amount of traffic proxied to the bronze, spinning-disks pool.

BIG-IP and XML Tag Monitoring for Security Purposes

Beyond HTTP header metadata, S3 user consoles can be harnessed to apply content tags to StorageGRID buckets and objects within buckets. In the following simple example, a bucket "bucketaugust002" has been tied to various XML user-defined tags, such as "corporate-viewer-permissions-level: restricted". When an S3 client lists objects within a bucket tagged as restricted, it may be prudent to log this access. One approach would be iRules, but it's not the optimal path, as the entirety of the HTTP response payload would need to be scoured for bucket listings. iRules could then apply REGEX scanning for "restricted", as one simple example.

A better approach is to use the Data Guard feature of the BIG-IP Advanced WAF module. Data Guard is built into the TMM (Traffic Management Microkernel) of BIG-IP at great optimization, whereas the TCL engine of iRules is not. Thus, iRules is fine when the exact offset of a pattern is known, and really good with request and response header manipulation, but for launching scans throughout payloads Data Guard is even better. Within AWAF, one need simply build a policy and, under the "Advanced" option, put in Data Guard strings of interest in a REGEX format, in our example (?:^|\W)restricted(?:$|\W), as seen below (double-click to see image in sharper detail).

Although the screenshot indicates "blocking" is the enforcement mode, in our use case alerting was only sought. As an S3 Browser user explores the contents of a bucket flagged as "restricted", the following log is seen by an authorized BIG-IP administrator (double-click).

Other use cases for Data Guard and S3 would include scanning of textual data objects being retrieved, specifically for the presence of fields that might be classified as PII, such as credit card numbers, US Social Security numbers, or any other custom value an organization is interested in. A multitude of online resources provide REGEX expressions matching the presence of any of a variety of potentially sensitive information mandated through international standards. Occurrences within retrieved data, such as data formats matching drivers' licenses, national health care values, and passport IDs, are all quickly keyed in on. The ability of Data Guard to obfuscate sensitive information, beyond blocking or alerting, is a core reason to use BIG-IP in line with your data.

BIG-IP Advanced Traffic Management - Preserve Customer Performance

The BIG-IP offers a myriad of safeguards and enforcement options around the issue of high-rate traffic management. The risks today include any one traffic source maliciously or inadvertently overwhelming the StorageGRID cluster with unreasonable loads. The possible protections are simply touched upon in this section, starting with some simple features, enabled per virtual server, with a few clicks. A BIG-IP deployment can be as simple as one virtual server directing all traffic to a backend cluster of StorageGRID nodes.
However, just as easily, it could be another extreme: a virtual server entry reserved for one company, one department, even one specific high-value client machine. In the above screenshot, one sees a few options. A virtual server may have a cap set on how many connections are permitted, normally TCP port 443-based for S3. This is a primitive but effective safeguard against denial-of-service attacks that attempt to overwhelm with bulk connection loads. As seen, an eviction policy can be enabled that closes connections that exhibit negative characteristics, like clients who become unresponsive. Lastly, it is seen that connection rate safeguards are also available and can be applied to consider source IP addresses. This can counteract singular, rogue clients in larger deployments.

Another set of advanced features serves to protect the StorageGRID infrastructure by clamping down on the total traffic that the BIG-IP will forward to a cluster. These features are interesting as many can apply not just at the per-virtual-server level; some controls can be applied to the entirety of traffic being sent on particular VLANs or network interfaces that might interconnect to backend pools. BIG-IP's advanced features include the aptly named Bandwidth Controllers, which come in two varieties: static and dynamic. Static is likely the correct choice for limiting the total bandwidth a given virtual server can direct at the backend pool of storage nodes. The setup is trivial: just visit the "Acceleration" tab of BIG-IP and create a static bandwidth controller of the magnitude desired. At this point, nothing further is required beyond simply choosing the policy from the virtual server advanced configuration screen.

For even more agile behavior, one might choose to use a dynamic bandwidth controller. As seen below, the options now extend into individual user traffic-level metering. With ancillary BIG-IP modules like Access Policy Manager (APM), traffic can be tied back to individual authenticated users by integration with common Identity Provider (IdP) solutions like Microsoft Active Directory (AD) servers or using the SSO framework standard SAML 2.0; others exist as well. However, with LTM by itself, we can consider users as sessions consisting of source IP and source (ephemeral) TCP port pairs and can apply dynamic bandwidth controls upon this definition of a user. The following provides an example of the degree to which traffic management can be imposed by BIG-IP with dynamic controllers, including per-user bandwidth.

The resulting dynamic bandwidth solution can be tied to any virtual server with the following few lines of a simple iRule:

when CLIENT_ACCEPTED {
    set mycookie [IP::remote_addr]:[TCP::remote_port]
    BWC::policy attach No_user_to_exceed_5Mbps $mycookie
}

Summary

The NetApp StorageGRID multi-node S3-compatible object storage solution pairs well with a high-performance server load balancer, making the F5 BIG-IP a natural fit. There are various features within the BIG-IP toolset that can open up a broad set of StorageGRID's practical use cases. As documented, a BIG-IP iRule can key in on any HTTP-level S3 header, including customized user metadata fields. This S3 metadata now becomes actionable – the resulting options are bounded only by what can be imagined. One potential option, selecting a specific pool of storage nodes based upon a signaled QoS value, was successfully tested.
Other use cases include using the advanced WAF capabilities of BIG-IP, such as the Data Guard feature, to analyze real-time response content and alert upon XML tags, or to obfuscate or entirely block transfers containing sensitive data fields. A rich set of traffic safeguards was also discussed, including per-virtual-server connection limits and advanced bandwidth controls. When combined with optimizations discussed in other articles, such as the OneConnect profile that can drastically suppress the number of concurrent TCP sessions handled by StorageGRID nodes, one quickly sees that the attainable performance and security improvements make the suggested architecture compelling.

F5 AI Roadmap
We are currently using F5 AFM for DDoS mitigation and plan to enhance our protection by adding WAF in production soon. Could you share insights on F5's AI roadmap in the context of DDoS protection? Additionally, what AI-driven features or capabilities can we enable to strengthen our defenses? Is there any PPT or PDF for the roadmap?

Just Announced! Attend a lab and receive a Raspberry Pi
Have a Slice of AI from a Raspberry Pi

Services such as ChatGPT have made accessing Generative AI as simple as visiting a web page. Whether at work or at home, there are advantages to channeling your user base (or family, in the case of at home) through a central point where you can apply safeguards to their usage.

In this lab, you will learn how to:
- Deliver centralized AI access through something as basic as a Raspberry Pi
- Learn basic methods for safeguarding AI
- Learn how users might circumvent basic safeguards
- Learn how to deploy additional services from F5 to enforce broader enterprise policies

Register Here

This lab takes place in an F5 virtual lab environment. Participants who complete the lab will receive a Raspberry Pi* to build the solution in their own environment.

*Limited stock. Raspberry Pi is exclusive to this lab. To qualify, complete the lab and join a follow-up call with F5.

Webinar (08/21): Deliver and Secure Enterprise AI Applications
Upcoming Webinar: Deliver and Secure Enterprise AI Applications
Date: August 21, 2025
Time: 10:00am PT | 1:00pm ET

What's the webinar about?

AI technology is evolving rapidly, moving from early predictive models and machine learning to the widespread use of generative AI. Enterprises are now deploying these generative capabilities and preparing for the next phase: autonomous agentic AI systems and physical AI that connects the digital and real worlds. Each phase brings increased demands on infrastructure, data delivery, and security.

In this webinar, we will explore how enterprises can scale, secure, and manage modern AI workloads by focusing on two critical areas: AI Application & Data Delivery and AI Security. As AI workloads extend across hybrid and multicloud environments, it is crucial that enterprises ensure low-latency performance, reliable data throughput, and robust protection against emerging threats. Join us to discover how to deliver and defend high-performance AI applications—and protect the data, models, and APIs that drive them.

Learning Objectives:
- Protect AI applications, models, and data with full visibility, scalable security, and compliance-ready controls.
- Accelerate the delivery of AI applications and the high-throughput data flows that power them.
- Orchestrate seamless API and data connectivity to optimize inference pipelines across hybrid multicloud environments.

Learn more, register now: https://www.f5.com/company/events/webinars/deliver-and-secure-enterprise-ai-applications