Context Cloak: Hiding PII from LLMs with F5 BIG-IP
The Story

As I dove deeper into the world of AI -- MCP servers, LLM orchestration, tool-calling models, agentic workflows -- one question kept nagging me: how do you use the power of LLMs to process sensitive data without actually exposing that data to the model?

Banks, healthcare providers, government agencies -- they all want to leverage AI for report generation, customer analysis, and workflow automation. But the data they need to process is full of PII: Social Security Numbers, account numbers, names, phone numbers. Sending that to an LLM (whether cloud-hosted or self-hosted) creates a security and compliance risk that most organizations can't accept.

I've spent years working with F5 technology, and when I learned that BIG-IP TMOS v21 added native support for the MCP protocol, the lightbulb went on. BIG-IP already sits in the data path between clients and servers. It already inspects, transforms, and enforces policy on HTTP traffic. What if it could transparently cloak PII before it reaches the LLM, and de-cloak it on the way back? That's Context Cloak.

The Problem

An analyst asks an LLM: "Generate a financial report for John Doe, SSN 078-05-1120, account 4532-1189-0042." The LLM now has real PII. Whether it's logged, cached, fine-tuned on, or exfiltrated -- that data is exposed.

Traditional approaches fall short:

| Approach | What Happens | The Issue |
| --- | --- | --- |
| Masking (****) | LLM can't see the data | Can't reason about what it can't see |
| Tokenization (<<SSN:001>>) | LLM sees placeholders | Works with larger models (14B+); smaller models may hallucinate |
| Do nothing | LLM sees real PII | Security and compliance violation |

The Solution: Value Substitution

Context Cloak takes a different approach -- substitute real PII with realistic fake values:

John Doe --> Maria Garcia
078-05-1120 --> 523-50-6675
4532-1189-0042 --> 7865-4412-3375

The LLM sees what looks like real data and reasons about it naturally. It generates a perfect financial report for "Maria Garcia."
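Conceptually, the substitution works something like the following sketch. This is an illustration only, not the shipped iRule logic -- the name pool and the per-field digit-shift amounts here are assumptions, though the shifts (+5 for the SSN, +3 for the account) do reproduce the example values above:

```python
# Sketch of session-scoped value substitution (illustrative, not the BIG-IP code).
NAME_POOL = ["Maria Garcia", "James Chen", "Aisha Patel"]  # hypothetical pool

def digit_shift(value: str, shift: int) -> str:
    """Shift every digit by a fixed amount (mod 10), keeping separators intact."""
    return "".join(str((int(c) + shift) % 10) if c.isdigit() else c for c in value)

class CloakSession:
    """Bidirectional real<->fake mapping, consistent within one session."""
    def __init__(self):
        self.real_to_fake = {}
        self.fake_to_real = {}
        self._names_used = 0

    def cloak_name(self, real: str) -> str:
        if real not in self.real_to_fake:
            fake = NAME_POOL[self._names_used % len(NAME_POOL)]
            self._names_used += 1
            self._remember(real, fake)
        return self.real_to_fake[real]

    def cloak_number(self, real: str, shift: int) -> str:
        if real not in self.real_to_fake:
            self._remember(real, digit_shift(real, shift))
        return self.real_to_fake[real]

    def decloak(self, text: str) -> str:
        # Reverse every fake back to its real value on the way out.
        for fake, real in self.fake_to_real.items():
            text = text.replace(fake, real)
        return text

    def _remember(self, real, fake):
        self.real_to_fake[real] = fake
        self.fake_to_real[fake] = real

s = CloakSession()
print(s.cloak_number("078-05-1120", 5))  # 523-50-6675
```

In the real system the mapping lives in a BIG-IP subtable keyed by session, not in a Python object, but the shape of the problem is the same: a consistent, reversible map built as PII flows through.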
On the way back, BIG-IP swaps the fakes back to the real values. The user sees a report about John Doe. The LLM never knew John Doe existed. This is conceptually a substitution cipher -- every real value maps to a consistent fake within the session, and the mapping is reversed transparently.

When I was thinking about this concept, my mind kept coming back to James Veitch's TED talk about messing with email scammers. Veitch tells the scammer they need to use a code for security:

Lawyer --> Gummy Bear
Bank --> Cream Egg
Documents --> Jelly Beans
Western Union --> A Giant Gummy Lizard

The scammer actually uses the code. He writes back: "I am trying to raise the balance for the Gummy Bear so he can submit all the needed Fizzy Cola Bottle Jelly Beans to the Creme Egg... Send 1,500 pounds via a Giant Gummy Lizard."

The real transaction details -- the amounts, the urgency, the process -- all stayed intact. Only the sensitive terms were swapped. The scammer didn't even question it.

That idea stuck with me -- what if we could do the same thing to protect PII from LLMs? But rotate the candy -- so it's not a static code book, but a fresh set of substitutions every session.

Watch the talk: https://www.ted.com/talks/james_veitch_this_is_what_happens_when_you_reply_to_spam_email?t=280

Why BIG-IP?
F5 BIG-IP was the natural candidate:

- Already in the data path -- BIG-IP is a reverse proxy that organizations already deploy
- MCP protocol support -- TMOS v21 added native MCP awareness via iRules
- iRules -- Tcl-based traffic manipulation for real-time HTTP payload inspection and rewriting
- Subtables -- in-memory key-value storage perfect for session-scoped cloaking maps
- iAppLX -- deployable application packages with REST APIs and web UIs
- Trust boundary -- BIG-IP is already the enforcement point for SSL, WAF, and access control

How Context Cloak Works

1. An analyst asks a question in Open WebUI
2. Open WebUI calls MCP tools through the BIG-IP MCP Virtual Server
3. The MCP server queries Postgres and returns real customer data (name, SSN, accounts, transactions)
4. BIG-IP's MCP iRule scans the structured JSON response, extracts PII from known field names, generates deterministic fakes, and stores bidirectional mappings in a session-keyed subtable. The response passes through unmodified so tool chaining works.
5. Open WebUI receives real data and composes a prompt
6. When the prompt goes to the LLM through the BIG-IP Inference VS, the iRule uses [string map] to swap every real PII value with its fake counterpart
7. The LLM generates its response using fake data
8. BIG-IP intercepts the response and swaps fakes back to reals. The analyst sees a report about John Doe with his real SSN and account numbers.

Two Cloaking Modes

Context Cloak supports two modes, configurable per PII field:

Substitute Mode

Replaces PII with realistic fake values. Names come from a deterministic pool, numbers are digit-shifted, emails are derived. The LLM reasons about the data naturally because it looks real.

John Doe --> Maria Garcia (name pool)
078-05-1120 --> 523-50-6675 (digit shift +5)
4532-1189-0042 --> 7865-4412-3375 (digit shift +3)
john@email.com --> maria.g@example.net (derived)

Best for: fields the LLM needs to reason about naturally -- names in reports, account numbers in summaries.
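The prompt-side swap and the response-side reverse swap described above amount to two applications of the same string map, with the directions inverted. A minimal Python sketch (session map values taken from the examples above; the iRule does this with [string map]):

```python
# Sketch of the cloak/de-cloak round trip (illustrative; not the iRule itself).
session_map = {
    "John Doe": "Maria Garcia",
    "078-05-1120": "523-50-6675",
    "4532-1189-0042": "7865-4412-3375",
}

def swap(text: str, mapping: dict) -> str:
    # Replace longest keys first so one value can't partially clobber another.
    for real in sorted(mapping, key=len, reverse=True):
        text = text.replace(real, mapping[real])
    return text

prompt = "Generate a financial report for John Doe, SSN 078-05-1120, account 4532-1189-0042."
cloaked = swap(prompt, session_map)                                # what the LLM sees
restored = swap(cloaked, {v: k for k, v in session_map.items()})  # what the user sees
```

Because the mapping is bijective within the session, the reverse swap restores the original text exactly -- which is what makes the whole thing transparent to both the analyst and the LLM.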
Tokenize Mode

Replaces PII with structured placeholders:

078-05-1120 --> <<SSN:32.192.169.232:001>>
John Doe --> <<name:32.192.169.232:001>>
4532-1189-0042 --> <<digit_shift:32.192.169.232:001>>

A guidance prompt is automatically injected into the LLM request, instructing it to reproduce the tokens exactly as-is. Larger models (14B+ parameters) handle this reliably; smaller models (7B) may struggle.

Best for: defense-in-depth with F5 AI Guardrails. The tokens are intentionally distinctive -- if one leaks through de-cloaking, a guardrails policy can catch it.

Both modes can be mixed per-field in the same request.

The iAppLX Package

Context Cloak is packaged as an iAppLX extension -- a deployable application on BIG-IP with a REST API and web-based configuration UI. When deployed, it creates all required BIG-IP objects: data groups, iRules, HTTP profiles, SSL profiles, pools, monitors, and virtual servers.

The PII Field Configuration is the core of Context Cloak. The admin selects which JSON fields in MCP responses contain PII and chooses the cloaking mode per field:

| Field | Aliases | Mode | Type / Label |
| --- | --- | --- | --- |
| full_name | customer_name | Substitute | Name Pool |
| ssn | | Tokenize | SSN |
| account_number | | Substitute | Digit Shift |
| phone | | Substitute | Phone |
| email | | Substitute | Email |

The iRules are data-group-driven -- no PII field names are hardcoded. Change the data group via the GUI, and the cloaking behavior changes instantly. This means Context Cloak works with any MCP server, not just the financial demo.

Live Demo

Enough theory -- here's what it looks like in practice.
Step 1: Install the RPM

Installing Context Cloak via BIG-IP Package Management LX

Step 2: Configure and Deploy

Context Cloak GUI -- MCP server, LLM endpoint, PII fields, one-click deploy
Deployment output showing session config and saved configuration

Step 3: Verify Virtual Servers

BIG-IP Local Traffic showing MCP VS and Inference VS created by Context Cloak

Step 4: Baseline -- No Cloaking

Without Context Cloak: real PII flows directly to the LLM in cleartext

This is the "before" picture. The LLM sees everything: real names, real SSNs, real account numbers.

Demo 1: Substitute Mode -- SSN Lookup

Prompt: "Show me the SSN number for John Doe. Just display the number."

Substitute mode -- Open WebUI + Context Cloak GUI showing all fields as Substitute

Result: User sees real SSN 078-05-1120. LLM saw a digit-shifted fake.

Demo 2: Substitute Mode -- Account Lookup

Prompt: "What accounts are associated to John Doe?"

Left: Open WebUI with real data. Right: vLLM logs showing "Maria Garcia" with fake account numbers

What the LLM saw:

"customer_name": "Maria Garcia"
"account_number": "7865-4412-3375" (checking)
"account_number": "7865-4412-3322" (investment)
"account_number": "7865-4412-3376" (savings)

What the user saw:

Customer: John Doe
Checking: 4532-1189-0042 -- $45,230.18
Investment: 4532-1189-0099 -- $312,500.00
Savings: 4532-1189-0043 -- $128,750.00

Switching to Tokenize Mode

Changing PII fields from Substitute to Tokenize in the GUI

Demo 3: Mixed Mode -- Tokenized SSN

SSN set to Tokenize, name set to Substitute. Prompt: "Show me the SSN number for Jane Smith. Just display the number."

Mixed mode -- real SSN de-cloaked on left, <<SSN:...>> token visible in vLLM logs on right

What the LLM saw:

"customer_name": "Maria Thompson"
"ssn": "<<SSN:32.192.169.232:001>>"

What the user saw: Jane Smith, SSN 219-09-9999

Both modes operating on the same customer record, in the same request.

Demo 4: Full Tokenize -- The Punchline

ALL fields set to Tokenize mode.
Prompt: "Show me the SSN and account information for Carlos Rivera. Display all the numbers."

Full tokenize -- every PII field as a token, all de-cloaked on return

What the LLM saw -- every PII field was a token:

"full_name": "<<name:32.192.169.232:001>>"
"ssn": "<<SSN:32.192.169.232:002>>"
"phone": "<<phone:32.192.169.232:002>>"
"email": "<<email:32.192.169.232:001>>"
"account_number": "<<digit_shift:32.192.169.232:002>>" (checking)
"account_number": "<<digit_shift:32.192.169.232:003>>" (investment)
"account_number": "<<digit_shift:32.192.169.232:004>>" (savings)

What the user saw -- all real data restored:

Name: Carlos Rivera
SSN: 323-45-6789
Checking: 6789-3345-0022 -- $89,120.45
Investment: 6789-3345-0024 -- $890,000.00
Savings: 6789-3345-0023 -- $245,000.00

And here's the best part. Qwen's last line in the response: "Please note that the actual numerical values for the SSN and account numbers are masked due to privacy concerns."

The LLM genuinely believed it showed the user masked data. It apologized for the "privacy masking" -- not knowing that BIG-IP had already de-cloaked every token back to the real values. The user saw the full, real, unmasked report.

What's Next: F5 AI Guardrails Integration

Context Cloak's tokenize mode is designed to complement F5 AI Guardrails. The <<TYPE:ID:SEQ>> format is intentionally distinctive -- if any token leaks through de-cloaking, a guardrails policy can catch it as a pattern match violation.

The vision: Context Cloak as the first layer of defense (PII never reaches the LLM), AI Guardrails as the safety net (catches anything that slips through). Defense in depth for AI data protection.
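As a sketch, a pattern match for a leaked token might look like this. The regex is my own approximation of the <<TYPE:ID:SEQ>> format shown in the demos, not an actual F5 AI Guardrails policy:

```python
# Illustrative leak detector for Context Cloak tokens, e.g.
# <<SSN:32.192.169.232:001>> or <<digit_shift:32.192.169.232:004>>.
import re

# TYPE is a word (letters/underscores), ID is IP-like in the demo, SEQ is 3 digits.
TOKEN_RE = re.compile(r"<<[A-Za-z_]+:[0-9.]+:\d{3}>>")

def leaked_tokens(response_text: str) -> list[str]:
    """Return any cloaking tokens that survived de-cloaking."""
    return TOKEN_RE.findall(response_text)
```

If `leaked_tokens` returns anything on the post-de-cloak response, that token never had a mapping in the subtable -- exactly the failure case a guardrails safety net should flag.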
Other areas I'm exploring:

- Hostname-based LLM routing -- BIG-IP as a model gateway with per-route cloaking policies
- JSON profile integration -- native BIG-IP JSON DOM parsing instead of regex
- Auto-discovery of MCP tool schemas for PII field detection
- Centralized cloaking policy management across multiple BIG-IP instances

Try It Yourself

The complete project is open source: https://github.com/j2rsolutions/f5_mcp_context_cloak

The repository includes Terraform for AWS infrastructure, Kubernetes manifests, the iAppLX package (RPM available in Releases), iRules, sample financial data, a test script, comprehensive documentation, and a full demo walkthrough with GIFs (see docs/demo-evidence.md).

A Note on Production Readiness

I want to be clear: this is a lab proof-of-concept. I have not tested this in a production environment. The cloaking subtable stores PII in BIG-IP memory, the fake name pool is small (100 combinations), the SSL certificates are self-signed, and there's no authentication on the MCP server. There are edge cases around streaming responses, subtable TTL expiry, and LLM-derived values that need more work.

But the core concept is proven: BIG-IP can transparently cloak PII in LLM workflows using value substitution and tokenization, and the iAppLX packaging makes it deployable and configurable without touching iRule code.

I'd love to hear what the community thinks. Is this approach viable for your use cases? What PII types would you need to support? How would you handle the edge cases? What would it take to make this production-ready for your environment? Let me know in the comments -- and if you want to contribute, PRs are welcome!
Demo Environment

- F5 BIG-IP VE v21.0.0.1 on AWS (m5.xlarge)
- Qwen 2.5 14B Instruct AWQ on vLLM 0.8.5 (NVIDIA L4, 24GB VRAM)
- MCP Server: FastMCP 1.26 + PostgreSQL 16 on Kubernetes (RKE2)
- Open WebUI v0.8.10
- Context Cloak iAppLX v0.2.0

References

- Managing MCP in iRules -- Part 1
- Managing MCP in iRules -- Part 2
- Managing MCP in iRules -- Part 3
- Model Context Protocol Specification
- James Veitch: This is what happens when you reply to spam email (TED, skip to 4:40)

Using the Model Context Protocol with Open WebUI
This year we started building out a series of hands-on labs you can do on your own in our AI Step-by-Step repo on GitHub. In my latest lab, I walk you through setting up a Model Context Protocol (MCP) server and the mcpo proxy to allow you to use MCP tools in a locally-hosted Open WebUI + Ollama environment. The steps are well-covered there, but I wanted to highlight what you learn in the lab.

What is MCP and why does it matter?

MCP is a JSON-based open standard from Anthropic that (shockingly!) is only about 13 months old now. It allows AI assistants to securely connect to external data sources and tools through a unified interface. The key benefit that drove its rapid adoption is that it solves the fragmentation problem in AI integrations -- instead of every AI system needing custom code to connect to each tool or database, MCP provides a single protocol that works across different AI models and data sources.

MCP in the local lab

My first exposure to MCP was using Claude and Docker tools to replicate a video Sebastian_Maniak released showing how to configure a BIG-IP application service. I wanted to see how F5-agnostic I could be in my prompt and still get a successful result, and it turned out that the only domain-specific language I needed, after it came up with a solution and deployed it, was to specify the load balancing algorithm. Everything else was correct. Kinda blew my mind. I spoke about this experience throughout the year at F5 Academy events and at a solutions days event in Toronto, but more so, I wanted to see how far I could take this in a local setting away from the pay-to-play tooling offered at that time. This was the genesis for this lab.
Tools

In this lab, you'll use the following tools:

- Ollama
- Open WebUI
- mcpo
- custom MCP server

Ollama and Open WebUI are assumed to already be installed; those labs are also in the AI Step-by-Step repo:

- Installing Ollama
- Installing Open WebUI

Once those are in place, you can clone the repo and deploy in Docker or Podman -- just make sure the Open WebUI containers are in the same network as the repo you're deploying.

Results

Success in getting your Open WebUI inference through the mcpo proxy and the MCP servers (mine is very basic, just for test purposes; there are more that you can test or build yourself) depends greatly on your prompting skills and the abilities of the local models you choose. I had varying success with llama3.2:3b. But the goal here isn't production-ready tooling; it's to build and discover and get comfortable in this new world of AI assistants, leveraging them where it makes sense to augment our toolbox.

Drop a comment below if you build this lab and share your successes and failures. Community is the best learning environment.
Securing MCP Servers with F5 Distributed Cloud WAF
Learn how F5 Distributed Cloud WAF protects MCP Servers and seamlessly integrates with MCP Clients.

As Agentic AI adoption accelerates, remote MCP (Model Context Protocol) servers are becoming more prevalent. The MCP protocol allows AI agents to reach many more tools than was possible through the previous model of tight, local integration between the client and the MCP server. MCP tools are now the new APIs, and more and more organizations are exposing their resources through MCP servers, allowing them to be consumed by MCP clients.
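To ground the "MCP tools are the new APIs" point, here is the rough shape of an MCP tool invocation as it crosses the wire -- a JSON-RPC 2.0 `tools/call` request that a WAF sitting in front of the server would be inspecting. The tool name and arguments are invented for the example:

```python
# Illustrative MCP tool-call body (JSON-RPC 2.0); the tool name and
# arguments below are hypothetical, not from a real server.
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_customer_record",          # hypothetical tool
        "arguments": {"customer_name": "John Doe"},
    },
}
body = json.dumps(request)  # what actually travels in the HTTP POST
```

From the WAF's point of view this is just structured JSON over HTTP -- which is exactly why the same inspection and policy machinery used for REST APIs applies to MCP traffic.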
Managing Model Context Protocol in iRules - Part 3
In part 2 of this series, we took a look at a couple of iRule use cases that do not require the json or sse profiles and don't capitalize on the new JSON commands and events introduced in the v21 release. That changes now! In this article, we'll take a look at two use cases: logging MCP activity and removing MCP tools from a server's tool list.

Event logging

This iRule logs various HTTP, SSE, and JSON-related events for debugging and monitoring purposes. It provides clear visibility into request/response flow and detects anomalies or errors.

How it works

- HTTP_REQUEST: Logs each HTTP request with its URI and client IP. Example: "HTTP request received: URI /example from 192.168.1.1"
- SSE_RESPONSE: Logs when a Server-Sent Event (SSE) response is identified. Example: "SSE response detected successfully."
- JSON_REQUEST and JSON_RESPONSE: Log when valid JSON requests or responses are detected. Examples: "JSON Request detected successfully." / "JSON response detected successfully."
- JSON_REQUEST_MISSING and JSON_RESPONSE_MISSING: Log if JSON payloads are missing from requests or responses. Examples: "JSON Request missing." / "JSON Response missing."
- JSON_REQUEST_ERROR and JSON_RESPONSE_ERROR: Log when there are errors parsing JSON in requests or responses. Examples: "Error processing JSON request. Rejecting request." / "Error processing JSON response."

iRule: Event Logging

```tcl
when HTTP_REQUEST {
    # Log the event (for debugging)
    log local0. "HTTP request received: URI [HTTP::uri] from [IP::client_addr]"
}
when SSE_RESPONSE {
    # Triggered when a Server-Sent Event response is detected
    log local0. "SSE response detected successfully."
}
when JSON_REQUEST {
    # Triggered when the JSON request is detected
    log local0. "JSON Request detected successfully."
}
when JSON_RESPONSE {
    # Triggered when a JSON response is detected
    log local0. "JSON response detected successfully."
}
when JSON_RESPONSE_MISSING {
    # Triggered when the JSON payload is missing from the server response
    log local0. "JSON Response missing."
}
when JSON_REQUEST_MISSING {
    # Triggered when the JSON is missing or can't be parsed in the request
    log local0. "JSON Request missing."
}
when JSON_RESPONSE_ERROR {
    # Triggered when there's an error in the JSON response processing
    log local0. "Error processing JSON response."
    #HTTP::respond 500 content "Invalid JSON response from server."
}
when JSON_REQUEST_ERROR {
    # Triggered when an error occurs (e.g., malformed JSON) during JSON processing
    log local0. "Error processing JSON request. Rejecting request."
    #HTTP::respond 400 content "Malformed JSON payload. Please check your input."
}
```

MCP tool removal

This iRule modifies server JSON responses by removing disallowed tools from the result.tools array while logging detailed debugging information.

How it works

JSON parsing and logging
- print procedure -- recursively traverses and logs the JSON structure, including arrays, objects, strings, and other types.
- jpath procedure -- extracts values or JSON elements based on a provided path, allowing targeted retrieval of nested properties.

JSON response handling
When JSON_RESPONSE is triggered:
- Logs the root JSON object and parses it using JSON::root.
- Extracts the tools array from result.tools.

Tool removal logic
- Iterates over the tools array and retrieves the name of each tool.
- If the tool name matches start-notification-stream, removes it from the array using JSON::array remove and logs that the tool is not allowed.
- If the tool does not match, logs that the tool is allowed and moves to the next one.

Logging information
Logs all JSON structures and actions: the full JSON structure, the extracted tools array, and the tools allowed or removed.
Input JSON Response

```json
{
  "result": {
    "tools": [
      {"name": "start-notification-stream"},
      {"name": "allowed-tool"}
    ]
  }
}
```

Modified Response

```json
{
  "result": {
    "tools": [
      {"name": "allowed-tool"}
    ]
  }
}
```

iRule: Remove tool list

```tcl
# Code to check JSON and print in logs
proc print { e } {
    set t [JSON::type $e]
    set v [JSON::get $e]
    set p0 [string repeat " " [expr {2 * ([info level] - 1)}]]
    set p [string repeat " " [expr {2 * [info level]}]]
    switch $t {
        array {
            log local0. "$p0\["
            set size [JSON::array size $v]
            for {set i 0} {$i < $size} {incr i} {
                set e2 [JSON::array get $v $i]
                call print $e2
            }
            log local0. "$p0\]"
        }
        object {
            log local0. "$p0{"
            set keys [JSON::object keys $v]
            foreach k $keys {
                set e2 [JSON::object get $v $k]
                log local0. "$p${k}:"
                call print $e2
            }
            log local0. "$p0}"
        }
        string - literal {
            set v2 [JSON::get $e $t]
            log local0. "$p\"$v2\""
        }
        default {
            set v2 [JSON::get $e $t]
            log local0. "$p$v2"
        }
    }
}

proc jpath { e path {d .} } {
    if { [catch {set v [call jpath2 $e $path $d]} err] } {
        return ""
    }
    return $v
}

proc jpath2 { e path {d .} } {
    set parray [split $path $d]
    set plen [llength $parray]
    set i 0
    for {} {$i < [expr {$plen}]} {incr i} {
        set p [lindex $parray $i]
        set t [JSON::type $e]
        set v [JSON::get $e]
        if { $t eq "array" } {
            # array
            set e [JSON::array get $v $p]
        } else {
            # object
            set e [JSON::object get $v $p]
        }
    }
    set t [JSON::type $e]
    set v [JSON::get $e $t]
    return $v
}

# Modify in response
when JSON_RESPONSE {
    log local0. "JSON::root"
    set root [JSON::root]
    call print $root
    set tools [call jpath $root result.tools]
    log local0. "root = $root tools= $tools"
    if { $tools ne "" } {
        log local0. "TOOLS not empty"
        set i 0
        set block_tool "start-notification-stream"
        while { $i < 100 } {
            set name [call jpath $root result.tools.${i}.name]
            if { $name eq "" } {
                break
            }
            if { $name eq $block_tool } {
                log local0. "tool $name is not allowed"
                JSON::array remove $tools $i
            } else {
                log local0. "tool $name is allowed"
                incr i
            }
        }
    } else {
        log local0. "no tools"
    }
}
```

Conclusion

This not only concludes the article, but also this introductory series on managing MCP in iRules. Note that all these commands handle all things JSON, so you are not limited to MCP contexts. We look forward to what the community will build (and hopefully share back) with this new functionality!

NOTE: This series is ghostwritten. Awaiting permission from original author to credit.

Managing Model Context Protocol in iRules - Part 2
In the first article in this series, we took a look at what Model Context Protocol (MCP) is, and how to get the F5 BIG-IP set up to manage it with iRules. In this article, we'll take a look at the first couple of use cases with session persistence and routing. Note that the use cases in this article do not require the json or sse profiles to work. That will change in part 3.

Session persistence and routing

This iRule ensures session persistence and traffic routing for three endpoints: /sse, /messages, and /mcp. It injects routing information (f5Session) via query parameters or headers, processes them for routing to specific pool members, and transparently forwards requests to the server.

How it works

Client sends an HTTP GET request to the SSE endpoint of the server (typically /sse):

```
GET /sse HTTP/1.1
```

Server responds 200 OK with an SSE event stream. It includes an SSE message with an "event" field of "endpoint", which provides the client with a URI where all its future HTTP requests must be sent. This is where servers might include a session ID:

```
event: endpoint
data: /messages?sessionId=abcd1234efgh5678
```

NOTE: the MCP spec does not specify how a session ID can be encoded in the endpoint here. While we have only seen use of a sessionId query parameter, theoretically a server could implement its session IDs with any arbitrary query parameter name, or even as part of the path like this:

```
event: endpoint
data: /messages/abcd1234efgh5678
```

Our iRule can take advantage of this mechanism by injecting a query parameter into this path that tells us which server we should persist future requests to. So when we forward the SSE message to the client, it looks something like this:

```
event: endpoint
data: /messages?f5Session=some_pool_name,10.10.10.5:8080&sessionId=abcd1234efgh5678
```

or

```
event: endpoint
data: /messages/abcd1234efgh5678?f5Session=some_pool_name,10.10.10.5:8080
```

When the client sends a subsequent HTTP request, it will use this endpoint.
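To make the endpoint rewrite concrete, here's a small Python sketch of the same injection and extraction logic. The pool and member values are examples, and this is an illustration of the mechanism only -- the real implementation is the iRule shown below:

```python
# Illustrative f5Session injection (SSE response side) and extraction
# (subsequent request side). URL-encode the secret so commas/colons survive.
from urllib.parse import quote, unquote

def inject_f5_session(endpoint: str, pool: str, member: str) -> str:
    """Prepend an f5Session parameter to the endpoint the server advertised."""
    secret = quote(f"{pool},{member}", safe="")
    path, _, query = endpoint.partition("?")
    tail = f"&{query}" if query else ""
    return f"{path}?f5Session={secret}{tail}"

def extract_f5_session(query: str):
    """Pull out f5Session; return (pool, member, remaining query) or None."""
    kept, target = [], None
    for pair in query.split("&"):
        if pair.startswith("f5Session="):
            target = unquote(pair[len("f5Session="):])
        else:
            kept.append(pair)
    if target is None:
        return None
    pool, _, member = target.partition(",")
    return pool, member, "&".join(kept)

endpoint = inject_f5_session("/messages?sessionId=abcd1234efgh5678",
                             "some_pool_name", "10.10.10.5:8080")
# -> /messages?f5Session=some_pool_name%2C10.10.10.5%3A8080&sessionId=abcd1234efgh5678
```

The round trip matters: whatever we inject on the way out must be removed on the way back in, so the server only ever sees the endpoint and sessionId it originally issued.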
Thus, when processing HTTP requests, we can read the f5Session secret we inserted earlier, route to that pool member, and then remove our secret before forwarding the request to the server using the original endpoint/sessionId it provided.

Load Balancing

```tcl
when HTTP_REQUEST {
    set is_req_to_sse_endpoint false

    # Handle requests to `/sse` (Server-Sent Event endpoint)
    if { [HTTP::path] eq "/sse" } {
        set is_req_to_sse_endpoint true
        return
    }

    # Handle `/messages` endpoint persistence query processing
    if { [HTTP::path] eq "/messages" } {
        set query_string [HTTP::query]
        set f5_sess_found false
        set new_query_string ""
        set query_separator ""
        set queries [split $query_string "&"] ;# Split query string into individual key-value pairs
        foreach query $queries {
            if { $f5_sess_found } {
                append new_query_string "${query_separator}${query}"
                set query_separator "&"
            } elseif { [string match "f5Session=*" $query] } {
                # Parse `f5Session` for persistence routing
                set pmbr_info [string range $query 10 end]
                set pmbr_parts [split $pmbr_info ","]
                if { [llength $pmbr_parts] == 2 } {
                    set pmbr_tuple [split [lindex $pmbr_parts 1] ":"]
                    if { [llength $pmbr_tuple] == 2 } {
                        pool [lindex $pmbr_parts 0] member [lindex $pmbr_parts 1]
                        set f5_sess_found true
                    } else {
                        HTTP::respond 404 noserver
                        return
                    }
                } else {
                    HTTP::respond 404 noserver
                    return
                }
            } else {
                append new_query_string "${query_separator}${query}"
                set query_separator "&"
            }
        }
        if { $f5_sess_found } {
            HTTP::query $new_query_string
        } else {
            HTTP::respond 404 noserver
        }
        return
    }

    # Handle `/mcp` endpoint persistence via session header
    if { [HTTP::path] eq "/mcp" } {
        if { [HTTP::header exists "Mcp-Session-Id"] } {
            set header_value [HTTP::header "Mcp-Session-Id"]
            set header_parts [split $header_value ","]
            if { [llength $header_parts] == 3 } {
                set pmbr_tuple [split [lindex $header_parts 1] ":"]
                if { [llength $pmbr_tuple] == 2 } {
                    pool [lindex $header_parts 0] member [lindex $header_parts 1]
                    HTTP::header replace "Mcp-Session-Id" [lindex $header_parts 2]
                } else {
                    HTTP::respond 404 noserver
                }
            } else {
                HTTP::respond 404 noserver
            }
        }
    }
}

when HTTP_RESPONSE {
    # Persist session for MCP responses
    if { [HTTP::header exists "Mcp-Session-Id"] } {
        set pool_member [LB::server pool],[IP::remote_addr]:[TCP::remote_port]
        set header_value [HTTP::header "Mcp-Session-Id"]
        set new_header_value "$pool_member,$header_value"
        HTTP::header replace "Mcp-Session-Id" $new_header_value
    }

    # Inject persistence information into response payloads for Server-Sent Events
    if { $is_req_to_sse_endpoint } {
        set sse_data [HTTP::payload] ;# Get the SSE payload

        # Extract existing query params from the SSE response
        set old_queries [URI::query $sse_data]
        if { [string length $old_queries] == 0 } {
            set query_separator ""
        } else {
            set query_separator "&"
        }

        # Insert `f5Session` persistence information into query
        set new_query "f5Session=[URI::encode [LB::server pool],[IP::remote_addr]:[TCP::remote_port]]"
        set new_payload "?${new_query}${query_separator}${old_queries}"

        # Replace the payload in the SSE response
        HTTP::payload replace 0 [string length $sse_data] $new_payload
    }
}
```

Persistence

```tcl
when CLIENT_ACCEPTED {
    # Log when a new TCP connection arrives (useful for debugging)
    log local0. "New TCP connection accepted from [IP::client_addr]:[TCP::client_port]"
}
when HTTP_REQUEST {
    # Check if this might be an SSE request by examining the Accept header
    if {[HTTP::header exists "Accept"] && [HTTP::header "Accept"] contains "text/event-stream"} {
        log local0. "SSE Request detected from [IP::client_addr] to [HTTP::uri]"
        # Insert a custom persistence key (optional)
        set sse_persistence_key "[IP::client_addr]:[HTTP::uri]"
        persist uie $sse_persistence_key
    }
}
when HTTP_RESPONSE {
    # Ensure this is an SSE connection by checking the Content-Type
    if {[HTTP::header exists "Content-Type"] && [HTTP::header "Content-Type"] equals "text/event-stream"} {
        log local0. "SSE Response detected for [IP::client_addr]. Enabling persistence."
        # Use the same persistence key for the response
        persist add uie $sse_persistence_key
    }
}
```

Conclusion

Thank you for your patience! Now is the time to continue on to part 3 where we'll finally get into the new JSON commands and events added in version 21!

NOTE: This series is ghostwritten. Awaiting permission from original author to credit.

Managing Model Context Protocol in iRules - Part 1
The Model Context Protocol (MCP) was introduced by Anthropic in November of 2024, and has taken the industry by storm since. MCP provides a standardized way for AI applications to connect with external data sources and tools through a single protocol, eliminating the need for custom integrations for each service and enabling AI systems to dynamically discover and use available capabilities. It's gained rapid industry adoption because major model providers and numerous IDE and tool makers have embraced it as an open standard, with tens of thousands of MCP servers built and widespread recognition that it mostly solves the fragmented integration challenge that previously plagued AI development.

In this article, we'll take a look at the MCP components, how MCP works, and how you can use the JSON iRules events and commands introduced in version 21 to control the messaging between MCP clients and servers.

MCP components

Host
The host is the AI application where the LLM logic resides, such as Claude Desktop, AI-powered IDEs like Cursor, Open WebUI with the mcpo proxy like in our AI Step-by-Step labs, or custom agentic systems that receive user requests and orchestrate the overall interaction.

Client
The client exists within the host application and maintains a one-to-one connection with each MCP server, converting user requests into the structured format that the protocol can process and managing session details like timeouts and reconnects.

Server
Servers are lightweight programs that expose data and functionality from external systems, whether internal databases or external APIs, allowing connections to both local and remote resources. Multiple clients can exist within a host, but each client has a dedicated (or perceived, in the case of using a proxy) 1:1 relationship with an MCP server.
MCP servers expose three main types of capabilities: Resources - information retrieval without executing actions Tools - performing side effects like calculations or API requests Prompts - reusable templates and workflows for LLM-server communication Message format (JSON-RPC) The transport layer between clients and servers uses JSON-RPC format for two-way message conversion, allowing the transport of various data structures and their processing rules. This enforces a consistent request/response format across all tools, so applications don't have to handle different response types for different services. Transport options MCP supports three standard transport mechanisms: stdio (standard input/output for local connections), Server-Sent Events (SSE for remote connections with separate endpoints for requests and responses), and Streamable HTTP (a newer method introduced in March 2025 that uses a single HTTP endpoint for bidirectional messaging). NOTE: SSE transport has been deprecated as of protocol version 2024-11-05 and replaced by Streamable HTTP, which addresses limitations like lack of resumable streams and the need to maintain long-lived connections, though SSE is still supported for backward compatibility. MCP workflow Pictures tell a compelling story. First, the diagram. The steps in the diagram above are as follows: The MCP client requests capabilities from the MCP server The MCP server provides a list of available tools and services the MCP client sends the question and the retrieved MCP server tools and services to the LLM The LLM specifies which tools and services to use. 
5. The MCP client calls the specific tool or service
6. The MCP server returns the result/context to the MCP client
7. The MCP client passes the result/context to the LLM
8. The LLM uses the result/context to prepare the answer

iRules MCP-based use cases

There are a number of use cases for MCP handling, such as:

- Load balancing of MCP traffic across MCP servers
- High availability of the MCP servers
- MCP message validation on behalf of MCP servers
- MCP protocol inspection and payload modification
- Monitoring the MCP servers' health and their transport protocol status; in case of any error in an MCP request or response, BIG-IP should be able to detect it and report to the user
- Optimization profile support (OneConnect profile, compression profile)
- Security support for MCP servers; there are no native features for this yet, but you can build your own secure business logic into iRules for now

LTM profiles

Configuring MCP involves creating two profiles - an SSE profile and a JSON profile - and then attaching them to a virtual server. The SSE profile is for backwards compatibility should you need it in your MCP client/server environment. The defaults for these profiles are shown below.
```
[root@ltm21a:Active:Standalone] config # tmsh list ltm profile sse all-properties
ltm profile sse sse {
    app-service none
    defaults-from none
    description none
    max-buffered-msg-bytes 65536
    max-field-name-size 1024
    partition Common
}
[root@ltm21a:Active:Standalone] config # tmsh list ltm profile json all-properties
ltm profile json json {
    app-service none
    defaults-from none
    description none
    maximum-bytes 65536
    maximum-entries 2048
    maximum-non-json-bytes 32768
    partition Common
}
```

These can be tuned down from these maximums by creating custom profiles that meet the needs of your environment, for example (listed without all properties, unlike above):

```
[root@ltm21a:Active:Standalone] config # tmsh create ltm profile sse sse_test_env max-buffered-msg-bytes 1000 max-field-name-size 500
[root@ltm21a:Active:Standalone] config # tmsh create ltm profile json json_test_env maximum-bytes 3000 maximum-entries 1000 maximum-non-json-bytes 2000
[root@ltm21a:Active:Standalone] config # tmsh list ltm profile sse sse_test_env
ltm profile sse sse_test_env {
    app-service none
    max-buffered-msg-bytes 1000
    max-field-name-size 500
}
[root@ltm21a:Active:Standalone] config # tmsh list ltm profile json json_test_env
ltm profile json json_test_env {
    app-service none
    maximum-bytes 3000
    maximum-entries 1000
    maximum-non-json-bytes 2000
}
```

NOTE: Both profiles have database keys that can be temporarily enabled for troubleshooting purposes. The keys are log.sse.level and log.json.level. You can set the value of one or both to debug. Do not leave them in debug mode!

Conclusion

Now that we have laid the foundation, continue on to part 2, where we'll look at the first two use cases.

NOTE: This series is ghostwritten. Awaiting permission from the original author to credit.

Scaling and Traffic-Managed Model Context Protocol (MCP) with BIG-IP Next for K8s
Introduction

As AI models get more advanced, running them at scale, especially in cloud-native environments like Kubernetes, can be tricky. That's where the Model Context Protocol (MCP) comes in. MCP makes it easier to connect and interact with AI models, but managing all the traffic and scaling these services as demand grows is a whole different challenge. In this article and demo video, I will show how F5's BIG-IP Next for K8s (BNK), a powerful cloud-native traffic management platform from F5, can solve that, keep things running smoothly, and scale your MCP services as needed.

Model Context Protocol (MCP) in a nutshell

There are many articles on the internet explaining what MCP is; please refer to those for the details. In a nutshell, it is a standard framework, or specification, for securely connecting AI apps to your critical data, tools, and workflows. The specification allows:

- Tracking of context across multiple conversations
- Tool integration - the model can call external tools
- Shared memory/state - remembering information
- MCP's "glue" model, connecting to tools through a universal interface: "USB-C for AI"

What EXACTLY does MCP solve?

MCP addresses many challenges in the AI ecosystem. I believe it solves two key ones:

1. The complexity of integrating AI models (LLMs) with external sources and tools
   - Standardization with a universal connector ("USB-C for AI"): everyone builds a "USB-C for AI" port so components plug into each other easily
   - Interoperability
2. Security of external integrations
   - A framework for establishing secure connections
   - Managing permissions and authorization

What is BIG-IP Next for K8s (BNK)?

BNK is F5's modernized version of the well-known BIG-IP platform, redesigned to work seamlessly in cloud-native environments like Kubernetes. It is a scalable networking and security solution for ingress and egress traffic control. It builds on decades of F5's leadership in application delivery and security, and it powers Kubernetes networking for today's complex workloads.
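To ground the scaling story, the sketch below shows what an MCP server deployment can look like on the Kubernetes side. This is a generic, hedged example: the `mcp-server` name, the placeholder image, and port 8080 are all assumptions, and in a real BNK setup, ingress and traffic management for these pods would be handled by BNK rather than by a bare ClusterIP Service.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-server          # hypothetical name for illustration
spec:
  replicas: 3               # scale out replicas as MCP demand grows
  selector:
    matchLabels:
      app: mcp-server
  template:
    metadata:
      labels:
        app: mcp-server
    spec:
      containers:
      - name: mcp-server
        image: mcp-server:latest    # placeholder image
        ports:
        - containerPort: 8080       # Streamable HTTP endpoint (assumed)
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-server
spec:
  selector:
    app: mcp-server
  ports:
  - port: 80
    targetPort: 8080
```

With the MCP servers running as ordinary pods like this, BNK sits in the data path in front of them to load balance, secure, and observe the MCP traffic.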
BNK can be deployed on x86 architecture or on ARM architecture, such as the NVIDIA Data Processing Unit (DPU). Let's see how F5's BNK scales and traffic-manages an AIOps ecosystem.

DEMO

Architecture Setup

Video

Key Takeaways

- BIG-IP Next for K8s is the backbone of the MCP architecture
- Technology built on decades of market-leading application delivery controller technology
- Secure, deliver, and optimize your AI infrastructure
- Provides deep insight through observability and visibility of your MCP traffic