
Context Cloak: Hiding PII from LLMs with F5 BIG-IP

Like a lot of people in the infrastructure and security space, I didn't set out to build an AI project...

The Story

As I dove deeper into the world of AI -- MCP servers, LLM orchestration, tool-calling models, agentic workflows -- one question kept nagging me: how do you use the power of LLMs to process sensitive data without actually exposing that data to the model?

Banks, healthcare providers, government agencies -- they all want to leverage AI for report generation, customer analysis, and workflow automation. But the data they need to process is full of PII: Social Security Numbers, account numbers, names, phone numbers. Sending that to an LLM (whether cloud-hosted or self-hosted) creates a security and compliance risk that most organizations can't accept.

I've spent years working with F5 technology, and when I learned that BIG-IP TMOS v21 added native support for the MCP protocol, the lightbulb went on. BIG-IP already sits in the data path between clients and servers. It already inspects, transforms, and enforces policy on HTTP traffic. What if it could transparently cloak PII before it reaches the LLM, and de-cloak it on the way back?

That's Context Cloak.

The Problem

An analyst asks an LLM: "Generate a financial report for John Doe, SSN 078-05-1120, account 4532-1189-0042."

The LLM now has real PII. Whether it's logged, cached, fine-tuned on, or exfiltrated -- that data is exposed. Traditional approaches fall short:

Approach                     What Happens            The Issue
Masking (****)               LLM can't see the data  Can't reason about what it can't see
Tokenization (<<SSN:001>>)   LLM sees placeholders   Works with larger models (14B+); smaller models may hallucinate
Do nothing                   LLM sees real PII       Security and compliance violation

 

The Solution: Value Substitution

Context Cloak takes a different approach -- substitute real PII with realistic fake values:

John Doe        -->  Maria Garcia
078-05-1120     -->  523-50-6675
4532-1189-0042  -->  7865-4412-3375

The LLM sees what looks like real data and reasons about it naturally. It generates a perfect financial report for "Maria Garcia." On the way back, BIG-IP swaps the fakes back to the real values. The user sees a report about John Doe. The LLM never knew John Doe existed.

This is conceptually a substitution cipher -- every real value maps to a consistent fake within the session, and the mapping is reversed transparently.
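To make the cipher analogy concrete, here is a minimal sketch in plain Python (illustrative only; the actual implementation is Tcl in a BIG-IP iRule backed by a session-keyed subtable). The class name and values are my own:

```python
# Minimal sketch of a session-scoped substitution map: every real value
# maps to one consistent fake, and the mapping reverses transparently.
class CloakMap:
    def __init__(self):
        self.real_to_fake = {}
        self.fake_to_real = {}

    def register(self, real, fake):
        # Bidirectional mapping, consistent for the life of the session.
        self.real_to_fake[real] = fake
        self.fake_to_real[fake] = real

    def to_fake(self, text):
        # Cloak: applied on the way to the LLM.
        for real, fake in self.real_to_fake.items():
            text = text.replace(real, fake)
        return text

    def to_real(self, text):
        # De-cloak: applied on the way back to the user.
        for fake, real in self.fake_to_real.items():
            text = text.replace(fake, real)
        return text

m = CloakMap()
m.register("John Doe", "Maria Garcia")
m.register("078-05-1120", "523-50-6675")

cloaked = m.to_fake("Report for John Doe, SSN 078-05-1120")
# LLM sees: "Report for Maria Garcia, SSN 523-50-6675"
restored = m.to_real(cloaked)
# User sees: "Report for John Doe, SSN 078-05-1120"
```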

 

When I was thinking about this concept, my mind kept coming back to James Veitch's TED talk about messing with email scammers. Veitch tells the scammer they need to use a code for security:

Lawyer         -->  Gummy Bear
Bank           -->  Cream Egg
Documents      -->  Jelly Beans
Western Union  -->  A Giant Gummy Lizard

The scammer actually uses the code. He writes back:

"I am trying to raise the balance for the Gummy Bear so he can submit all the needed Fizzy Cola Bottle Jelly Beans to the Creme Egg... Send 1,500 pounds via a Giant Gummy Lizard."

The real transaction details -- the amounts, the urgency, the process -- all stayed intact. Only the sensitive terms were swapped. The scammer didn't even question it. That idea stuck with me -- what if we could do the same thing to protect PII from LLMs? But rotate the candy -- so it's not a static code book, but a fresh set of substitutions every session.

Watch the talk: https://www.ted.com/talks/james_veitch_this_is_what_happens_when_you_reply_to_spam_email?t=280

Why BIG-IP?

F5 BIG-IP was the natural candidate:

  • Already in the data path -- BIG-IP is a reverse proxy that organizations already deploy
  • MCP protocol support -- TMOS v21 added native MCP awareness via iRules
  • iRules -- Tcl-based traffic manipulation for real-time HTTP payload inspection and rewriting
  • Subtables -- in-memory key-value storage perfect for session-scoped cloaking maps
  • iAppLX -- deployable application packages with REST APIs and web UIs
  • Trust boundary -- BIG-IP is already the enforcement point for SSL, WAF, and access control

How Context Cloak Works

  1. An analyst asks a question in Open WebUI
  2. Open WebUI calls MCP tools through the BIG-IP MCP Virtual Server
  3. The MCP server queries Postgres and returns real customer data (name, SSN, accounts, transactions)
  4. BIG-IP's MCP iRule scans the structured JSON response, extracts PII from known field names, generates deterministic fakes, and stores bidirectional mappings in a session-keyed subtable. The response passes through unmodified so tool chaining works.
  5. Open WebUI receives real data and composes a prompt
  6. When the prompt goes to the LLM through the BIG-IP Inference VS, the iRule uses [string map] to swap every real PII value with its fake counterpart
  7. The LLM generates its response using fake data
  8. BIG-IP intercepts the response and swaps fakes back to reals. The analyst sees a report about John Doe with his real SSN and account numbers.
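A rough Python analogue of the response-side scan in step 4 (field names and the `make_fake` callback are simplified assumptions here; the production logic is Tcl in an iRule and the field list comes from a data group):

```python
import json

# PII field names, as configured in the BIG-IP data group (assumed set).
PII_FIELDS = {"customer_name", "ssn", "account_number"}

def scan_mcp_response(body: str, cloak_map: dict, make_fake) -> str:
    """Record real->fake mappings for configured PII fields found anywhere
    in the JSON document. The body is returned unmodified (step 4):
    substitution happens later, on the inference path."""
    def walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key in PII_FIELDS and isinstance(value, str):
                    # Same real value always gets the same fake.
                    cloak_map.setdefault(value, make_fake(key, value))
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)
    walk(json.loads(body))
    return body  # pass through untouched so tool chaining still works

cloak_map = {}
scan_mcp_response('{"customer_name": "John Doe", "ssn": "078-05-1120"}',
                  cloak_map, make_fake=lambda k, v: f"FAKE-{k}")
# cloak_map == {"John Doe": "FAKE-customer_name", "078-05-1120": "FAKE-ssn"}
```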

Two Cloaking Modes

Context Cloak supports two modes, configurable per PII field:

Substitute Mode

Replaces PII with realistic fake values. Names come from a deterministic pool, numbers are digit-shifted, emails are derived. The LLM reasons about the data naturally because it looks real.

John Doe        -->  Maria Garcia       (name pool)
078-05-1120     -->  523-50-6675        (digit shift +5)
4532-1189-0042  -->  7865-4412-3375     (digit shift +3)
john@email.com  -->  maria.g@example.net (derived)

Best for: fields the LLM needs to reason about naturally -- names in reports, account numbers in summaries.
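The digit shift is easy to reconstruct from the examples above (this is my reading of the behavior, not the project's actual Tcl):

```python
def digit_shift(value: str, shift: int) -> str:
    # Shift every digit by a fixed offset mod 10; separators pass through.
    # Reversible: shifting again by (10 - shift) restores the original
    # (a shift of 5 happens to be its own inverse).
    return "".join(str((int(c) + shift) % 10) if c.isdigit() else c
                   for c in value)

print(digit_shift("078-05-1120", 5))     # -> 523-50-6675
print(digit_shift("4532-1189-0042", 3))  # -> 7865-4412-3375
print(digit_shift("523-50-6675", 5))     # -> 078-05-1120 (round trip)
```

Because the transform preserves length and formatting, the fake still "looks like" an SSN or card number, which is exactly what lets the LLM reason about it naturally.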

Tokenize Mode

Replaces PII with structured placeholders:

078-05-1120     -->  <<SSN:32.192.169.232:001>>
John Doe        -->  <<name:32.192.169.232:001>>
4532-1189-0042  -->  <<digit_shift:32.192.169.232:001>>

A guidance prompt is automatically injected into the LLM request, instructing it to reproduce the tokens exactly as-is. Larger models (14B+ parameters) handle this reliably; smaller models (7B) may struggle.

Best for: defense-in-depth with F5 AI Guardrails. The tokens are intentionally distinctive -- if one leaks through de-cloaking, a guardrails policy can catch it.

Both modes can be mixed per-field in the same request.
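A sketch of how `<<TYPE:ID:SEQ>>` tokens might be minted and later de-cloaked (the class, session ID handling, and counters are my assumptions; the real work happens in the iRule):

```python
import re

class Tokenizer:
    def __init__(self, session_id: str):
        self.session_id = session_id
        self.counters = {}        # per-type sequence numbers
        self.token_to_real = {}   # reverse map for de-cloaking

    def tokenize(self, pii_type: str, real_value: str) -> str:
        seq = self.counters.get(pii_type, 0) + 1
        self.counters[pii_type] = seq
        token = f"<<{pii_type}:{self.session_id}:{seq:03d}>>"
        self.token_to_real[token] = real_value
        return token

    def decloak(self, text: str) -> str:
        # Swap any surviving tokens back to their real values; unknown
        # tokens are left alone (and would be caught by guardrails).
        pattern = re.compile(r"<<[^:<>]+:[^:<>]+:\d{3}>>")
        return pattern.sub(
            lambda m: self.token_to_real.get(m.group(0), m.group(0)), text)

t = Tokenizer("32.192.169.232")
tok = t.tokenize("SSN", "078-05-1120")
# tok == "<<SSN:32.192.169.232:001>>"
restored = t.decloak(f"The SSN is {tok}.")
# restored == "The SSN is 078-05-1120."
```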

The iAppLX Package

Context Cloak is packaged as an iAppLX extension -- a deployable application on BIG-IP with a REST API and web-based configuration UI. When deployed, it creates all required BIG-IP objects: data groups, iRules, HTTP profiles, SSL profiles, pools, monitors, and virtual servers.

The PII Field Configuration is the core of Context Cloak. The admin selects which JSON fields in MCP responses contain PII and chooses the cloaking mode per field:

Field           Aliases         Mode        Type / Label
full_name       customer_name   Substitute  Name Pool
ssn                             Tokenize    SSN
account_number                  Substitute  Digit Shift
phone                           Substitute  Phone
email                           Substitute  Email

The iRules are data-group-driven -- no PII field names are hardcoded. Change the data group via the GUI, and the cloaking behavior changes instantly. This means Context Cloak works with any MCP server, not just the financial demo.

Live Demo

Enough theory -- here's what it looks like in practice.

Step 1: Install the RPM

Installing Context Cloak via BIG-IP Package Management LX

Step 2: Configure and Deploy

Context Cloak GUI -- MCP server, LLM endpoint, PII fields, one-click deploy

Deployment output showing session config and saved configuration

Step 3: Verify Virtual Servers

BIG-IP Local Traffic showing MCP VS and Inference VS created by Context Cloak

Step 4: Baseline -- No Cloaking

Without Context Cloak: real PII flows directly to the LLM in cleartext

This is the "before" picture. The LLM sees everything: real names, real SSNs, real account numbers.

Demo 1: Substitute Mode -- SSN Lookup

Prompt: "Show me the SSN number for John Doe. Just display the number."

Substitute mode -- Open WebUI + Context Cloak GUI showing all fields as Substitute

Result: User sees real SSN 078-05-1120. LLM saw a digit-shifted fake.

Demo 2: Substitute Mode -- Account Lookup

Prompt: "What accounts are associated to John Doe?"

 

Left: Open WebUI with real data. Right: vLLM logs showing "Maria Garcia" with fake account numbers

What the LLM saw:

"customer_name": "Maria Garcia"
"account_number": "7865-4412-3375"  (checking)
"account_number": "7865-4412-3322"  (investment)
"account_number": "7865-4412-3376"  (savings)

What the user saw:

Customer: John Doe
Checking:   4532-1189-0042  --  $45,230.18
Investment: 4532-1189-0099  --  $312,500.00
Savings:    4532-1189-0043  --  $128,750.00

Switching to Tokenize Mode

Changing PII fields from Substitute to Tokenize in the GUI

Demo 3: Mixed Mode -- Tokenized SSN

SSN set to Tokenize, name set to Substitute.

Prompt: "Show me the SSN number for Jane Smith. Just display the number."

 

Mixed mode -- real SSN de-cloaked on left, <<SSN:...>> token visible in vLLM logs on right

What the LLM saw:

"customer_name": "Maria Thompson"
"ssn": "<<SSN:32.192.169.232:001>>"

What the user saw: Jane Smith, SSN 219-09-9999

Both modes operating on the same customer record, in the same request.

Demo 4: Full Tokenize -- The Punchline

ALL fields set to Tokenize mode.

Prompt: "Show me the SSN and account information for Carlos Rivera. Display all the numbers."

Full tokenize -- every PII field as a token, all de-cloaked on return

What the LLM saw -- every PII field was a token:

"full_name":       "<<name:32.192.169.232:001>>"
"ssn":             "<<SSN:32.192.169.232:002>>"
"phone":           "<<phone:32.192.169.232:002>>"
"email":           "<<email:32.192.169.232:001>>"
"account_number":  "<<digit_shift:32.192.169.232:002>>"  (checking)
"account_number":  "<<digit_shift:32.192.169.232:003>>"  (investment)
"account_number":  "<<digit_shift:32.192.169.232:004>>"  (savings)

What the user saw -- all real data restored:

Name:        Carlos Rivera
SSN:         323-45-6789
Checking:    6789-3345-0022  --  $89,120.45
Investment:  6789-3345-0024  --  $890,000.00
Savings:     6789-3345-0023  --  $245,000.00

 

And here's the best part. Qwen's last line in the response:

"Please note that the actual numerical values for the SSN and account numbers are masked due to privacy concerns."

 

The LLM genuinely believed it showed the user masked data. It apologized for the "privacy masking" -- not knowing that BIG-IP had already de-cloaked every token back to the real values. The user saw the full, real, unmasked report.

What's Next: F5 AI Guardrails Integration

Context Cloak's tokenize mode is designed to complement F5 AI Guardrails. The <<TYPE:ID:SEQ>> format is intentionally distinctive -- if any token leaks through de-cloaking, a guardrails policy can catch it as a pattern match violation.
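Because the format is so regular, a leaked token is trivial to spot with a pattern match. A hypothetical check (plain Python, not the actual AI Guardrails policy syntax):

```python
import re

# Matches any <<TYPE:ID:SEQ>> token that survived de-cloaking.
TOKEN_LEAK = re.compile(r"<<[A-Za-z_]+:[0-9.]+:\d{3}>>")

def leaked_tokens(response_text: str) -> list[str]:
    return TOKEN_LEAK.findall(response_text)

print(leaked_tokens("Balance for <<SSN:32.192.169.232:001>> is $45,230.18"))
# -> ['<<SSN:32.192.169.232:001>>']
print(leaked_tokens("Balance for John Doe is $45,230.18"))
# -> []
```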

The vision: Context Cloak as the first layer of defense (PII never reaches the LLM), AI Guardrails as the safety net (catches anything that slips through). Defense in depth for AI data protection.

Other areas I'm exploring:

  • Hostname-based LLM routing -- BIG-IP as a model gateway with per-route cloaking policies
  • JSON profile integration -- native BIG-IP JSON DOM parsing instead of regex
  • Auto-discovery of MCP tool schemas for PII field detection
  • Centralized cloaking policy management across multiple BIG-IP instances

Try It Yourself

The complete project is open source:

https://github.com/j2rsolutions/f5_mcp_context_cloak

The repository includes Terraform for AWS infrastructure, Kubernetes manifests, the iAppLX package (RPM available in Releases), iRules, sample financial data, a test script, comprehensive documentation, and a full demo walkthrough with GIFs (see docs/demo-evidence.md).

A Note on Production Readiness

I want to be clear: this is a lab proof-of-concept. I have not tested this in a production environment. The cloaking subtable stores PII in BIG-IP memory, the fake name pool is small (100 combinations), the SSL certificates are self-signed, and there's no authentication on the MCP server. There are edge cases around streaming responses, subtable TTL expiry, and LLM-derived values that need more work.

But the core concept is proven: BIG-IP can transparently cloak PII in LLM workflows using value substitution and tokenization, and the iAppLX packaging makes it deployable and configurable without touching iRule code.

I'd love to hear what the community thinks. Is this approach viable for your use cases? What PII types would you need to support? How would you handle the edge cases? What would it take to make this production-ready for your environment?

Let me know in the comments -- and if you want to contribute, PRs are welcome!

Demo Environment

  • F5 BIG-IP VE v21.0.0.1 on AWS (m5.xlarge)
  • Qwen 2.5 14B Instruct AWQ on vLLM 0.8.5 (NVIDIA L4, 24GB VRAM)
  • MCP Server: FastMCP 1.26 + PostgreSQL 16 on Kubernetes (RKE2)
  • Open WebUI v0.8.10
  • Context Cloak iAppLX v0.2.0


Updated Apr 07, 2026
Version 2.0