vegas

12 Topics

JSON-query'ish meta language for iRules
Intro

Jason Rahm recently dropped his "Working with JSON data in iRules" series, which included a few JSON challenges and a subtle hint [string toupper [string replace Jason 1 1 ""]] about the upcoming iRule challenge at AppWorld 2026 in Las Vegas. With cash prizes and bragging rights on the line, my colleagues and I dove into Jason's code. While his series is a great foundation, we saw an opportunity to push the boundaries of security and performance and to add RFC compliance.

Problem

Although F5 recently introduced native iRule commands for JSON parsing (v21.x), these tools remain "bare metal" compared to modern programming languages. They offer minimal abstraction, requiring developers to possess both deep JSON schema knowledge and advanced iRule expertise to implement safely. Without a supporting framework, engineers are forced to manually manage complex types, nested objects, and arrays, a process that is both labor-intensive and error-prone. As JSON has become the de facto standard for AI-centric workloads and modern API traffic, the need to efficiently manipulate session data on the ADC platform has never been greater.

Solution

Our goal is to bridge this gap by developing a "Swiss Army Knife" framework for iRule JSON parsing, providing the abstraction and reliability needed for high-performance traffic management.
Imagine a JSON data structure as shown below:

{
  "my_string": "Hello World",
  "my_number": 42,
  "my_boolean": true,
  "my_null": null,
  "my_array": [ 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 ],
  "my_object": { "nested_string": "I'm nested" },
  "my_children": [
    {"name": "Anna Conda","firstname": "Anna", "surname": "Conda"},
    {"name": "Justin Case","firstname": "Justin", "surname": "Case"},
    {"name": "Don Key","firstname": "Don", "surname": "Key"},
    {"name": "Artie Choke","firstname": "Artie", "surname": "Choke"},
    {"name": "Barbie Doll","firstname": "Barbie", "surname": "Doll"}
  ]
}

The [call json_get] and [call json_set] procedures from our iRule introduce a JSON-Query meta-language to slice information into and out of JSON. Here are a few examples of how these procedures can be used:

# Define JSON root element
set root [JSON::root]

# Without a filter it behaves like json_stringify
log [call json_get $root ""]
-> {"my_string": "Hello World","my_number": 42,"my_boolean": true,"my_null": .... <truncated for better readability>

# But as soon as you add filters, it becomes parsing on steroids!
log [call json_get $root "my_string"]
-> "Hello World"

# You simply ask for a path and you promptly get an answer!
log [call json_get $root "my_object nested_string"]
-> "I'm nested"

# Are you ready for the more advanced examples?
log [call json_get $root "my_array (5)"]
-> [5]
log [call json_get $root "my_array (0,5-10,16-18)"]
-> [0,5,6,7,8,9,10,16,17,18]
log [call json_get $root "my_children (*) firstname"]
-> ["Anna","Justin","Don","Artie","Barbie"]
log [call json_get $root "my_children (*) {firstname|surname}"]
-> [["Anna","Conda"],["Justin","Case"],["Don","Key"],["Artie","Choke"],["Barbie","Doll"]]

# Let's add some information to my children...
call json_set $root "my_children (0,4) gender" string "she/her"
call json_set $root "my_children (1-3) gender" string "he/him"
call json_set $root "my_children (2) gender" string "they/them"
log [call json_get $root "my_children (*) name|gender"]
-> [["Anna Conda","she/her"],["Justin Case","he/him"],["Don Key","they/them"],["Artie Choke","he/him"],["Barbie Doll","she/her"]]

# Let's write into an empty cache...
set empty_cache [JSON::create]
call json_set $empty_cache "rootpath subpath" string "I'm deeply nested"
log [call json_get $empty_cache]
-> {"rootpath": {"subpath": "I'm deeply nested"}}

Now that you have seen what our project is about, let's see how [call json_get] and [call json_set] can be used to solve the challenges Jason suggested in his Working with JSON data in iRules series. As a reminder, this is Jason's final iRule with his open challenges to the community:

when JSON_REQUEST priority 500 {
    set json_data [JSON::root]
    if {[call find_key $json_data "nested_array"] contains "b" } {
        set cache [JSON::create]
        set rootval [JSON::root $cache]
        JSON::set $rootval object
        set obj [JSON::get $rootval object]
        JSON::object add $obj "[IP::client_addr] status" string "rejected"
        set rendered [JSON::render $cache]
        log local0. "$rendered"
        HTTP::respond 200 content $rendered "Content-Type" "application/json"
    }
}

"Now, I offer you a couple challenges. lines 4-9 in the JSON_REQUEST example above should really be split off to become another proc, so that the logic of the JSON_REQUEST is laser-focused. How would YOU write that proc, and how would you call it from the JSON_REQUEST event? The find_key proc works, but there's a Tcl-native way to get at that information with just the JSON::object subcommands that is far less complex and more performant. Come at me!"
-Jason Rahm

By using our general-purpose iRule procedures, we achieve the laser-focused syntax Jason requested:

when JSON_REQUEST priority 500 {
    set json_data [JSON::root]
    if { [call json_get $json_data "my_object nested_array"] contains "b" } then {
        set cache [JSON::create]
        call json_set $cache "{[IP::client_addr] status}" string "rejected"
        HTTP::respond 200 content [JSON::render $cache] "Content-Type" "application/json"
    }
}

Despite our larger codebase, our code runs ~20% faster per JSON request (425 vs. 532 microseconds). This performance gain stems from traversing the JSON structure with a provided path; the procedure knows exactly where to look without unnecessary searching. Additionally, we used performance-oriented syntax that prefers fast commands, deploys variables only when necessary, and avoids string-to-list conversions (Tcl shimmering).

Impact

Our project highlights the current state of JSON-related iRule commands and suggests that meta-languages are more suitable for the average iRule developer. We hope this project catches the attention of F5 product development so that a similar JSON-query language can be provided natively. In the meantime, we are deploying this code in production environments and will continue to maintain it.

Code

Because of size restrictions we had to attach the code as a file.

Installation

Upload the submitted iRule code to your BIG-IP and save it as a new iRule. Attach a JSON profile to your virtual server, then attach the iRule to this virtual server. Ready for testing, enjoy!

Demo Video Link

https://youtu.be/wAHjeC-j8MM
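To make the path syntax from the json_get examples more concrete, here is a minimal off-box sketch in Python (not iRule Tcl, so it can be run without a BIG-IP). It is not the attached iRule and makes no claim about how json_get is implemented internally; it only reproduces the behavior shown in the examples above: space-separated keys, index selectors such as (5), (0,5-10,16-18) and (*), and key alternation such as {firstname|surname}. The helper names expand_selector and _walk and the shortened DOC sample are assumptions made for illustration.

```python
# Illustrative sketch only: mimics the path syntax shown in the article,
# not the authors' actual iRule implementation.

DOC = {
    "my_string": "Hello World",
    "my_array": list(range(21)),
    "my_object": {"nested_string": "I'm nested"},
    "my_children": [
        {"name": "Anna Conda", "firstname": "Anna", "surname": "Conda"},
        {"name": "Justin Case", "firstname": "Justin", "surname": "Case"},
        {"name": "Don Key", "firstname": "Don", "surname": "Key"},
    ],
}


def expand_selector(selector, length):
    """Expand '(5)', '(0,5-10,16-18)' or '(*)' into a list of valid indices."""
    body = selector[1:-1].strip()
    if body == "*":
        return list(range(length))
    indices = []
    for part in body.split(","):
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            indices.extend(range(start, end + 1))  # ranges are inclusive
        else:
            indices.append(int(part))
    # Silently skip indices that fall outside the array.
    return [i for i in indices if 0 <= i < length]


def _walk(node, tokens):
    """Consume path tokens one at a time, descending into the parsed JSON."""
    if not tokens:
        return node
    head, rest = tokens[0], tokens[1:]
    if head.startswith("(") and head.endswith(")"):
        # Index selector: apply the rest of the path to every selected element.
        return [_walk(node[i], rest) for i in expand_selector(head, len(node))]
    keys = head.strip("{}").split("|")
    if len(keys) > 1:
        # Key alternation: collect one value per listed key.
        return [_walk(node[k], rest) for k in keys]
    return _walk(node[keys[0]], rest)


def json_get(node, path):
    """Resolve a space-separated path; an empty path returns the whole node."""
    return _walk(node, path.split())


if __name__ == "__main__":
    print(json_get(DOC, "my_object nested_string"))
    print(json_get(DOC, "my_array (0,5-10,16-18)"))
    print(json_get(DOC, "my_children (*) {firstname|surname}"))
```

Because every token names exactly where to descend next, the lookup never has to search the tree, which is the same property the article credits for its speed advantage over a recursive find_key.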
Daniel_Wolf
Mar 10, 2026 Place Contest Entries
Automation Is Not Your Enemy.
Sun Tzu wrote that you cannot win if you do not know your enemy and yourself. In his sense, he was talking about knowing your army and its capabilities, but this rule seriously applies to nearly every endeavor, and certainly every competitive endeavor. Knowing your own strengths and weaknesses (in our case, the strengths and weaknesses of IT staff and architecture) is imperative if you are to meet the challenges that your IT department faces every day. It is not enough to know that you must do X; you must know how X fits (or doesn’t!) into your architecture, and how easily your staff will be able to absorb the knowledge necessary to implement X.

Take RSS feeds for example. RSS is largely automated. But if you receive a requirement to implement RSS in the corporate intranet or web portal, the first question is “can the system handle it?” If the answer is no, the next question is “can staff today implement it?” If the answer is no, the next question is “do we buy something to do this for us, or train staff to implement a solution?” Remember this is all hypothetical. Unless you had very specific needs, I would not recommend training staff to write an RSS parser. At best I’d say get a library and train them to use calls to it… Which does indicate a corollary to this point of Sun Tzu’s… Know the terrain (in this case the RSS ecosystem) in which you will meet your enemies.

By extension, knowing the terrain implies “have some R&D time in normal workloads”. I’ve said that before, but it’s worth saying over and over. Sure, some employees might waste that R&D time. Some won’t. Ask Google. It doesn’t have to be some huge percentage, just don’t ask your staff to be up-to-date on things they don’t have time to go research. But I digress.

As virtualization and cloud grow in importance, so too does the ability to automate some functionality.
As end user computing starts to utilize a growing breadth of devices, automation becomes even more imperative. Seriously, on my team alone we have Android, Blackberry, and Apple tablets, Apple and Blackberry phones… And we’re all hitting websites originally designed for Windows. The ability to serve all of these devices intelligently is facilitated by the ability to detect and route them to the correct location – and to be able to monitor usage and switch infrastructure resources to the place that they’re most needed.

Some IT staff reasonably worry that automation is going to negatively impact their job prospects. Network Admins in particular have seen many jobs other than theirs shipped off-shore or automated out of existence, and don’t want to end up the same way. But there are two types of automation advancement: those that eliminate or minimize the need for people – as factory automation often does to keep expenses down – and the type that frees people up to handle greater volumes or more complex tasks – as virtualization did. Virtualization reduced the time to bring up a new server to near zero. That eliminated approximately zero systems admin jobs. The reason is that there was a pent up demand for more servers, and once IT wasn’t holding requests up with cost and timing bottlenecks, demand exploded. Also, admins had more responsibilities – now there were the host systems and dozens of resident VMs.

The same will be true of increasing network automation. Yes, some of the tasks regularly done by network admins will get automated out of existence, but in return, managing the system that automates those tasks will fall upon the shoulders of the very administrators that have more time. And the complexity of networks in the age of cloud and virtualization is headed up, meaning the specialized knowledge required to keep these networks not just working, but performing well will end up with the network admins.
That makes network automation an opportunity, not a risk. An opportunity to better serve customers, an opportunity to learn new things, an opportunity to take on greater responsibility. And to make things happen that need to happen at 2am, without the dreaded on-call phone call.

We at F5 have been calling it “ABLE infrastructure” to reference our network automation efforts, and that’s really what it boils down to – make the network ABLE to do what network admins have been doing, so they can do the next step, integrating WAN and cloud as if they were on the LAN, and dealing with the ever-growing number of VMs requesting IP addresses. And some R&D. After all, once automation is in place, another “must have” project will come along. They always do, and for most of us in IT, that’s a good thing.
Don_MacVittie_1
Oct 03, 2011 Place Technical Articles
SwagWAF Wins The Budget Bodyguard Award
ENGINEERING DETAILS: In The Weeds

AppWorld'26 - iRules Contest Entry: SwagWAF

1. Problem Statement

The Challenge: AI/LLM API endpoints face unique threats that enterprises can't afford to miss with traditional WAFs. Management expects SREs to prove resilience and present governance plans for AI adoption before approving budget increases for disruptive technologies, which they may not even understand. Here are a few of the things that can easily get overlooked:

- Prompt injection & automation hijacks raise new risks as AI agents spawn at scale across the enterprise
- Bot scraping/abuse drains API credits (OpenAI charges per token)
- Weak APIs and fragile supply chains can turn into open doors for attackers, exposing sensitive data and credentials across agent workflows
- Prompt injection attacks can bypass LLM & chatbot safety guardrails
- Rapid-fire inference requests from automated scripts can cripple performance
- Slow-rolling "Discovery" attacks from multiple vectors may never even be recognized
- Insecure API integrations leaking sensitive prompts/responses create additional risks
- Traditional WAFs are expensive ($$$) and/or don't cover AI-specific attack patterns
- Smaller teams might need lightweight protection to prove the need for increased enterprise WAF budgets

2. Single iRule & Simple Solution

This is NOT just a really clever iRule; This is NOT just a "Poor Man's WAF"; This has NOT just been "enhanced for AI"... THIS is a lightweight AI & API protection framework.

Yes, this iRule handles L4/L7 web traffic for standard workloads, and then some. The heavy lifting is provided by BIG-IP. This addresses the unique challenges of protecting APIs & AI workloads, such as resource-exhausting long responses, prompt engineering exploits, and automated data scraping. We're using a simple Bot Detection Engine for Sliding Window Rate Limiting, and adding Prompt Injection Defense Posturing to detect (and mitigate) common LLM jailbreak attempts via pattern-matching.

The Concept: Here are some of the key features.

How it all works (iRule Event Handlers):
- HTTP_REQUEST: Rate limiting + XFF sanitization
- HTTP_REQUEST_DATA: JSON payload inspection
- CLIENTSSL_HANDSHAKE: TLS enforcement
- HTTP_RESPONSE: Security headers + cookie hardening

1. Security Hardening (Production Best Practices)
- TLS 1.2+ enforcement (rejects insecure connections)
- X-Forwarded-For sanitization (accurate rate limiting)
- HSTS, Cache-Control, X-Content-Type-Options headers
- Cookie security (Secure + HttpOnly flags)

2. Dynamic Bot Detection Engine (Sliding Window Rate Limiting)
- Tracks request velocity per IP (10 req / 2s default)
- Violation counter with escalating penalties
- Temporary IP blocks (10 min) for repeat offenders
- Returns JSON error responses (AI-friendly format)

3. Prompt Injection Defense (Dynamic Pattern Matching)
- Detects common LLM jailbreak attempts ("ignore previous instructions", etc.)
- SQL injection variants targeting RAG databases
- XSS attempts in prompt payloads
- Increments violation counter faster (3× multiplier)

4. Adaptive Intelligence (Dynamic iRule Data-Groups)
This SwagWAF solution can be easily extended to use externally managed BIG-IP data groups for jailbreak patterns, malicious IP reputation, trusted client bypasses, and endpoint-specific rate limits. This allows SOC teams, CI/CD pipelines, or scheduled automation scripts to update threat intelligence without editing the iRule itself, preserving high-performance local lookups while improving adaptability over time. (more on that later)

3. Impact

Business Value Impact
- Infinite ROI: 100% FREE (As in FREE BEER: $0 CapEx / OpEx & Licensing Costs) vs $10K–50K/year enterprise WAF Solutions
- Literally Deploys in <5 minutes
- Saves REAL Money: Requests exceeding the threshold trigger progressive penalties and temporary IP blocking.
- Cost Controls: Prevents bot abuse from draining your precious API credits
- Security Compliance: OWASP Top 10 coverage without dedicated WAF
- Rapid deployment: drop-in protection (no code changes)
- Developer-friendly: JSON error responses

Real-World Use Cases
- ChatGPT-style apps protecting backend APIs
- RAG pipelines with vector DBs
- Model inference endpoints (HuggingFace, Bedrock, etc.)
- Multi-tenant AI API gateways

4. The Code

iRule Source Code

#--------------------------------------------------------------------------
# iRule Name: SwagWAF - v0.2.6
#--------------------------------------------------------------------------
# ABSTRACT: "Poor Man's WAF for AI API Endpoints"
# PURPOSE:  Protect LLM/AI inference APIs from abuse, injection attacks, and
#           bot scraping while enforcing security best practices
# THEME:    AI Infrastructure - Traffic management & security for AI workloads
# CREATED:  2026-03-10 FOR: AppWorld 2026 iRules Contest
# AUTHOR:   Joe Negron
#--------------------------------------------------------------------------
# FEATURES:
# - Bot detection via rate limiting (sliding window, violation tracking)
# - Prompt injection pattern detection (AI-specific threat protection)
# - TLS 1.2+ enforcement (secure AI API communications)
# - X-Forwarded-For sanitization (accurate client IP tracking)
# - Security header hardening (HSTS, cache control, MIME sniffing prevention)
# - Cookie security (Secure + HttpOnly flags)
# - JSON payload validation (AI API request inspection)
#--------------------------------------------------------------------------
when RULE_INIT {
    # === RATE LIMITING CONFIG (Bot Detection) ===
    set static::max_requests 10            ;# Max requests per window
    set static::window_ms 2000             ;# 2-second sliding window
    set static::violation_threshold 5      ;# Violations before block
    set static::violation_window_ms 30000  ;# 30s violation window
    set static::block_seconds 600          ;# 10 min block duration
    # The table command takes its timeouts in seconds, not milliseconds,
    # so derive second-based values once for the table calls below.
    set static::window_seconds [expr {$static::window_ms / 1000}]
    set static::violation_window_seconds [expr {$static::violation_window_ms / 1000}]

    # === AI-SPECIFIC PROTECTION ===
    # Prompt injection patterns (examples of common LLM jailbreak attempts)
    set static::injection_patterns {
        "ignore previous instructions"
        "disregard all prior"
        "forget everything"
        "system prompt"
        "you are now in developer mode"
        "<script>"
        "'; DROP TABLE"
        "UNION SELECT"
    }

    # === DEBUG LOGGING ===
    set static::debug 1
}

#--------------------------------------------------------------------------
# CLIENTSSL_HANDSHAKE - TLS Version Enforcement
#--------------------------------------------------------------------------
when CLIENTSSL_HANDSHAKE {
    if {$static::debug} {log local0. "<DEBUG>[IP::client_addr]:[TCP::client_port]:[virtual name]:== TLS VERSION CHECK"}
    if {[SSL::cipher version] ne "TLSv1.2" && [SSL::cipher version] ne "TLSv1.3"} {
        log local0. "REJECTED: Client [IP::client_addr] attempted insecure TLS version: [SSL::cipher version]"
        # No HTTP request exists yet at this point, so the connection is
        # simply reset; an HTTP error body cannot be returned from here.
        reject
    }
}

#--------------------------------------------------------------------------
# HTTP_REQUEST - Multi-Layer Protection
#--------------------------------------------------------------------------
when HTTP_REQUEST {
    set ip [IP::client_addr]
    set now [clock clicks -milliseconds]
    set window_start [expr {$now - $static::window_ms}]

    # === X-FORWARDED-FOR SANITIZATION ===
    if {$static::debug} {log local0. "<DEBUG>$ip:[TCP::client_port]:[virtual name]:== SANITIZING XFF"}
    HTTP::header remove x-forwarded-for
    HTTP::header insert x-forwarded-for [IP::remote_addr]
    HTTP::header remove X-Custom-XFF
    HTTP::header insert X-Custom-XFF [IP::remote_addr]

    # === CHECK IF IP IS BLOCKED ===
    if {[table lookup "block:$ip"] eq "1"} {
        if {$static::debug} {log local0. "BLOCKED: $ip (repeated abuse)"}
        HTTP::respond 429 content "{\n \"error\": \"rate_limit_exceeded\",\n \"message\": \"Temporarily blocked for repeated abuse\",\n \"retry_after\": 600\n}" "Content-Type" "application/json"
        return
    }

    # === CLEANUP OLD REQUEST TIMESTAMPS ===
    foreach ts [table keys -subtable "ts:$ip"] {
        if {$ts < $window_start} {
            table delete -subtable "ts:$ip" $ts
        }
    }

    # === COUNT REQUESTS IN CURRENT WINDOW ===
    set req_count [llength [table keys -subtable "ts:$ip"]]
    if {$req_count >= $static::max_requests} {
        # Record violation
        set v [table incr "viol:$ip"]
        table timeout "viol:$ip" $static::violation_window_seconds
        if {$v >= $static::violation_threshold} {
            # Block IP temporarily
            table set "block:$ip" 1 $static::block_seconds
            log local0. "BLOCKED: $ip (violation threshold: $v)"
            HTTP::respond 429 content "{\n \"error\": \"rate_limit_exceeded\",\n \"message\": \"Blocked for repeated abuse\",\n \"retry_after\": 600\n}" "Content-Type" "application/json"
            return
        }
        log local0. "RATE_LIMITED: $ip (req_count: $req_count, violations: $v)"
        HTTP::respond 429 content "{\n \"error\": \"rate_limit_exceeded\",\n \"message\": \"Too many requests - slow down\",\n \"retry_after\": 2\n}" "Content-Type" "application/json"
        return
    }

    # === LOG TIMESTAMP OF THIS REQUEST ===
    table set -subtable "ts:$ip" $now 1 $static::window_seconds

    # === AI-SPECIFIC: PROMPT INJECTION DETECTION ===
    # Only inspect POST requests with JSON payload
    if {[HTTP::method] eq "POST" && [HTTP::header exists "Content-Type"] && [HTTP::header "Content-Type"] contains "application/json"} {
        if {[HTTP::header exists "Content-Length"] && [HTTP::header "Content-Length"] < 65536} {
            HTTP::collect [HTTP::header "Content-Length"]
        }
    }
}

#--------------------------------------------------------------------------
# HTTP_REQUEST_DATA - JSON Payload Inspection
#--------------------------------------------------------------------------
when HTTP_REQUEST_DATA {
    set payload [HTTP::payload]
    set payload_lower [string tolower $payload]

    # Check for prompt injection patterns
    foreach pattern $static::injection_patterns {
        if {[string match -nocase "*$pattern*" $payload_lower]} {
            set ip [IP::client_addr]
            log local0. "INJECTION_ATTEMPT: $ip tried pattern: $pattern"
            # Increment violation counter (treat injection attempts seriously)
            set v [table incr "viol:$ip" 3]
            table timeout "viol:$ip" $static::violation_window_seconds
            if {$v >= $static::violation_threshold} {
                table set "block:$ip" 1 $static::block_seconds
                HTTP::respond 403 content "{\n \"error\": \"forbidden\",\n \"message\": \"Malicious payload detected\"\n}" "Content-Type" "application/json"
                return
            }
            HTTP::respond 400 content "{\n \"error\": \"invalid_request\",\n \"message\": \"Request rejected by security policy\"\n}" "Content-Type" "application/json"
            return
        }
    }
}

#--------------------------------------------------------------------------
# HTTP_RESPONSE - Security Header Hardening
#--------------------------------------------------------------------------
when HTTP_RESPONSE {
    if {$static::debug} {log local0. "<DEBUG>[IP::client_addr]:[TCP::client_port]:[virtual name]:== SANITIZING RESPONSE HEADERS"}

    # Remove server fingerprinting headers
    HTTP::header remove "Server"
    HTTP::header remove "X-Powered-By"
    HTTP::header remove "X-AspNet-Version"
    HTTP::header remove "X-AspNetMvc-Version"

    # Enforce security headers
    HTTP::header remove "Cache-Control"
    HTTP::header remove "Strict-Transport-Security"
    HTTP::header remove "X-Content-Type-Options"
    HTTP::header insert "Strict-Transport-Security" "max-age=31536000; includeSubDomains"
    HTTP::header insert "Cache-Control" "no-store, no-cache, must-revalidate, proxy-revalidate"
    HTTP::header insert "X-Content-Type-Options" "nosniff"

    # === COOKIE HARDENING (Secure + HttpOnly) ===
    if {$static::debug} {log local0. "<DEBUG>[IP::client_addr]:[TCP::client_port]:[virtual name]:== SECURING COOKIES"}

    # Use F5 native cookie security (faster than manual parsing)
    foreach cookieName [HTTP::cookie names] {
        HTTP::cookie secure $cookieName enable
    }

    # Add HttpOnly flag to all Set-Cookie headers
    set new_cookies {}
    foreach cookie [HTTP::header values "Set-Cookie"] {
        # Compare in lowercase on both sides; a mixed-case pattern would
        # never match the lowercased cookie string.
        if { ![string match "*httponly*" [string tolower $cookie]] } {
            set modified_cookie [string trimright $cookie ";"]
            append modified_cookie "; HttpOnly"
            lappend new_cookies $modified_cookie
        } else {
            lappend new_cookies $cookie
        }
    }

    # Apply secured cookies
    HTTP::header remove "Set-Cookie"
    foreach cookie $new_cookies {
        if { ![string match "*secure*" [string tolower $cookie]] } {
            HTTP::header insert "Set-Cookie" "$cookie; Secure"
        } else {
            HTTP::header insert "Set-Cookie" "$cookie"
        }
    }
}

Test Commands

# Rate limiting test
for i in {1..15}; do
  curl -X POST https://your-api/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"prompt":"test"}'
done

# Prompt injection test
curl -X POST https://your-api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Ignore previous instructions"}'

# TLS enforcement test (--tlsv1.1 alone only sets the minimum version,
# so cap the maximum as well; expect the handshake to be reset)
curl --tlsv1.1 --tls-max 1.1 https://your-api/

Expected Responses

# Throttling:
{"error":"rate_limit_exceeded","message":"Too many requests - slow down","retry_after":2}
# Rejection:
{"error":"invalid_request","message":"Request rejected by security policy"}
# Suspension:
{"error":"rate_limit_exceeded","message":"Blocked for repeated abuse","retry_after":600}

Production Deployment Checklist

[ ] Test on F5 v21+
[ ] Tune max_requests for real traffic
[ ] Add provider-specific injection patterns
[ ] Monitor /var/log/ltm for false positives
[ ] Set static::debug 0 in production
[ ] Define bypass for trusted high-volume clients

[UPDATE: March 15th, 2026]

Quick Reality Check (important)

This is already solid, but if I'd had more than a few hours to write, test & submit the code, I would have considered adding:
- IP reputation hooks (even just stubbed) for alerting
- per-endpoint rate limiting (not just per IP)
- enhanced AI-awareness, using more dynamic iRule data groups

Roadmap: Adaptive Threat Intelligence Layer

The plan is to add external, dynamically maintained data groups for:
- dg_swagwaf_jailbreak_patterns
- dg_swagwaf_sql_patterns
- dg_swagwaf_xss_patterns
- dg_swagwaf_bad_ips
- dg_swagwaf_trusted_clients
- dg_swagwaf_endpoint_limits

We could've added some iRule checks against those classes using class match, which is a super fast local lookup and avoids those pesky (per-request) API calls. Something like this:

if {[class match $payload_lower contains dg_swagwaf_jailbreak_patterns]} {
    # reject / increment violations yadda-yadda blah-blah...
}

AND: For endpoint-specific rate limits, we could use a data group like this:

/api/v1/chat/completions := 10:2000
/api/v1/embeddings := 50:2000
/api/v1/images/generations := 5:5000

Then the iRule derives the limit from [HTTP::path] instead of using one global static::max_requests.

AND: External scripts should update the data groups on a schedule or event trigger.

Those few tweaks would additionally give us “a lightweight, extensible AI API protection framework with DevSecOps integration”:
- faster runtime decisions
- dynamic jailbreak-pattern updates
- reusable shared protection across multiple iRules / VIPs
- lower operational risk because updates happen out-of-band
- better governance because pattern changes can go through Git/CI/CD

to be continued...
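The per-endpoint limits from the roadmap can be tried out off-box before touching the iRule. The following is a minimal sketch in Python (not iRule Tcl, so it runs without a BIG-IP) of a sliding-window limiter keyed by client IP and path, fed with records in the "path := max_requests:window_ms" form shown above. The names parse_limits and SlidingWindowLimiter, the in-memory dictionary standing in for the session table, and the fallback to a 10-requests-per-2000-ms default are assumptions for illustration; this is not part of SwagWAF.

```python
from collections import defaultdict, deque

# Hypothetical data-group records, copied from the roadmap example.
ENDPOINT_LIMITS = """
/api/v1/chat/completions := 10:2000
/api/v1/embeddings := 50:2000
/api/v1/images/generations := 5:5000
"""

# Assumed fallback, mirroring static::max_requests and static::window_ms.
DEFAULT_LIMIT = (10, 2000)


def parse_limits(text):
    """Turn 'path := max:window_ms' records into {path: (max, window_ms)}."""
    limits = {}
    for line in text.strip().splitlines():
        path, _, value = line.partition(":=")
        max_requests, window_ms = value.strip().split(":")
        limits[path.strip()] = (int(max_requests), int(window_ms))
    return limits


class SlidingWindowLimiter:
    """Keeps recent request timestamps per (ip, path) and enforces a limit."""

    def __init__(self, limits, default=DEFAULT_LIMIT):
        self.limits = limits
        self.default = default
        self.history = defaultdict(deque)  # (ip, path) -> timestamps in ms

    def allow(self, ip, path, now_ms):
        max_requests, window_ms = self.limits.get(path, self.default)
        stamps = self.history[(ip, path)]
        # Drop timestamps that have slid out of the window.
        while stamps and stamps[0] < now_ms - window_ms:
            stamps.popleft()
        if len(stamps) >= max_requests:
            return False  # over the limit; the rejected request is not recorded
        stamps.append(now_ms)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(parse_limits(ENDPOINT_LIMITS))
    for t in range(0, 600, 100):
        print(t, limiter.allow("10.0.0.1", "/api/v1/images/generations", t))
```

As in the iRule, a rejected request does not add a timestamp, so a client that keeps hammering an endpoint is throttled by the window alone; the violation counter and temporary block would sit on top of this check.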
Joe_Negron_NYC
May 05, 2026 Place Contest Entries
In Times Of Change, IT Can Lead, Follow, Or Get Out of the Way.
Information Technology – geeks like you and me – has been responsible for an amazing transformation of business over the last thirty or forty years. The systems that have been put into place since computers became standard fare for businesses have allowed the business to scale out in almost every direction. Greater production, more customers, better marketing and sales follow-through, even insanely targeted marketing for those of you selling to consumers. There is not a piece of the business that would be better off without us.

With that change came great responsibility though. Inability to access systems and/or data brings the organization to a screeching halt. So we spend a lot of time putting in redundant systems – for all of its power as an Advanced Application Delivery Controller, many of F5’s customers rely on BIG-IP LTM to keep their systems online even if a server fails. Because it’s good at that (among other things), and they need redundancy to keep the business running.

When computerization first came about, and later when Palm and Blackberry were introducing the first personal devices, people – not always IT people – advocated change, and those changes impacted every facet of the business, and provide you and me with steady work. The people advocating were vocal, persistent, and knew that there would be long-term benefit from the systems, or even short-term benefit to dealing with ever increasing workloads. Many of them were rewarded with work maintaining and improving the systems they had advocated for, and all of them were leaders.

As we crest the wave of virtualization and start to seriously consider cloud computing on a massive scale – be it cloud storage, cloud applications, or SOA applications that have been cloud-washed – it is time to seriously consider IT’s role in this process once again.
Those leaders of the past pushed at business management until they got the systems they thought the organization needed, and another group of people will do the same this time. So as I’ve said before, you need to facilitate this activity. Don’t make them go outside the IT organization, because history says that any application or system allowed to grow outside the IT organization will inevitably fall upon the shoulders of IT to manage. Take that bull by the horns, frame the conversation in the manner that makes the most sense to your business, your management, and your existing infrastructure. Companies like F5 can help you move to the cloud with products like ARX Cloud Extender to make cloud storage look like local NAS, and BIG-IP LTM VE to make cloud apps able to partake of load balancing and other ADC functionality, but all the help in the world doesn’t do you any good if you don’t have a plan. Look at the cloud options available, they’re certainly telling you about themselves right now so that should be easy, then look at your organization’s acceptance of risk, and the policies of cloud service providers in regards to that risk, and come up with ideas on how to utilize the cloud. One thing about a new market that includes a cool buzz word like cloud, if you aren’t proposing where it fits, someone in your organization is. And that person is never going to be as qualified as IT to determine which applications and data belong outside the firewall. Never. I’ve said make a plan before, but many organizations don’t seem to be listening, so I’m saying it again. Whether Cloud is an enabling technology for your organization or a disruptive one for IT is completely in your hands. Be the leader of the past, it’s exciting stuff if managed properly, and like many new technologies, scary stuff if not managed in the context of the rest of your architecture. 
So build a checklist, pick some apps and even files that could sit in the cloud without a level of risk greater than your organization is willing to accept, and take the list to business leaders. Tell them that cloud is helping to enable IT to better serve them and ask if they’d like to participate in bringing cloud to the enterprise. It doesn’t have to be big stuff, just enough to make them feel like you’re leading the effort, and enough to make you feel like you’re checking cloud out without “going all in”.

After a few pilots, you’ll find you have one more set of tools to solve business problems. And that is almost never a bad thing. Even if you decide cloud usage isn’t for your organization, you chose what was put out there, not a random business person who sees the possibilities but doesn’t know the steps required and the issues to confront.

Related Blogs:
Risk is not a Synonym for “Lack of Security”
Cloud Changes Cost of Attacks
Cloud Computing: Location is important, but not the way you think
Cloud Storage Gateways, stairway to (thin provisioning) heaven?
If Security in the Cloud Were Handled Like Car Accidents
Operational Risk Comprises More Than Just Security
Quarantine First to Mitigate Risk of VM App Stores
CloudFucius Tunes into Radio KCloud
Risk Averse or Cutting Edge? Both at Once.
Don_MacVittie_1
Jun 14, 2011 Place Technical Articles
Load Balancing For Developers: Security and TCP Optimizations
It has been a while since I wrote a Load Balancing for Developers installment, and since they’re pretty popular and there’s still a lot about Application Delivery Controllers (ADCs) that is taken for granted in the Networking industry but relatively unknown in the development world, I thought I’d throw one out about making your security more resilient with ADCs. For those who are just joining this series, here’s the full list of posts I’ve tagged as Load Balancing for Developers, though only the ones whose title starts with “Load Balancing for Developers” or “Advance Load Balancing for Developers” were actually written from this perspective, utilizing our fictional web application Zap’N’Go! as an example. This post, like most of them, doesn’t require that you read the other entries in the “Load Balancing for Developers” series, but if you’re interested in the topic, they are all written from the developer’s perspective, and only bring in the networking/ops portions where it makes sense.

So your organization has a truly successful web application called Zap’N’Go! that has taken the Internet by storm. Your hits are in the thousands an hour, and orders are rolling in. All was going well until your server couldn’t keep up and you went to a load balanced scenario so that multiple servers could share the load. The problem is that with the money you’ve generated off of Zap’N’Go!, you’ve bought a competitor and started several new web applications, set up a forum or portal for your customers to communicate with you and each other directly, and are using the old datacenter from the company you purchased as a redundant datacenter in case the worst should happen. And all of that means that you are suffering server (and VM) sprawl. The CPU cycles being eaten up by your applications are truly astounding, and you’re looking into ways to drive them down.
Virtualization helped you to be more agile in responding to the requests of the business, but also brings a lot of management overhead in making certain servers aren’t overloaded with too high a virtual density. One of the cool bits about an ADC is that they do a lot more than load balance, and much of that can be utilized to improve application performance without re-architecting the entire system. While there are a lot of ways that an ADC can improve application performance, we’ll look at a couple of easy ones here, and leave some of the more difficult or involved ones for another time. That keeps me in writing topics, and makes certain that I can give each one the attention it deserves in the space available. The biggest and most obvious improvement in an ADC is of course load balancing. This blog assumes you already have an ADC in place, and load balancing was your primary reason for purchasing it. While I don’t have market numbers in front of me, it is my experience that this is true of the vast majority of ADC customers. If you have overburdened web applications and have not looked into load balancing, before you go rewriting your entire system, take a look at the rest of this series. There really are options out there to help. After that win, I think the biggest place – in a virtualized environment – that developers can reap benefits from an ADC is one that developers wouldn’t normally think of. That’s the reason for this series, so I suppose that would be a good thing. Nearly every application out there hits a point where SSL is enabled. That point may be simply the act of accessing it, or it may be when they go to the “shopping cart” section of the web site, but they all use SSL to protect sensitive user data being passed over the Internet. As a developer, you don’t have to care too much about this fact. Pay attention to the protocol if you’re writing at that level and to the ports if you have reason to, but beyond that you don’t have to care. 
Networking takes care of all of that for you. But what if you could put a request in to your networking group that would greatly improve performance without changing a thing in your code, and that from a security perspective wouldn’t change much? Most companies would see it as not changing anything, while a few will want to talk about it first. What if you could make this change over lunch and users wouldn’t know the difference? Here’s the background. SSL encryption is expensive in terms of CPU cycles. No doubt you know that; most developers have to face this issue head-on at some point. It takes a lot of power to do encryption, and while commodity hardware is now fast enough that it isn’t a problem on a stand-alone server, in a VM environment the number of applications requesting SSL encryption on the same physical hardware is many times what it once was. That creates a burden that, at this time at least, often drags on the hardware. It’s not the fault of any one application or a rogue programmer; it is the sum of the burdens placed by each application requiring SSL encryption. One solution to this problem is to try to manage VM deployment such that encryption is only required on a couple of applications per physical server, but this is not a very appealing long-term solution as loads shift and priorities change. From a developer’s point of view, do you trust the systems/network teams to guarantee your application is not sharing hardware with a zillion applications that all require SSL encryption? Over time, this is not going to be their number one priority, and when performance troubles crop up, the first place that everyone looks in an in-house developed app is at the development team. We could argue whether that’s the right starting point or not, but it certainly is where we start. Another, more generic solution is to take advantage of a non-development feature of your ADC. This feature is SSL termination.
Since the ADC sits between your application and the Internet, you can tell your ADC to handle encryption for your application, and then not worry about it again. If your network team sets this up for all of your applications, then you have no worries that SSL is burning up your CPU cycles behind your back. Is there a negative? A minor one that most organizations (as noted above) just won’t see as an issue: from the ADC to your application, communications will happen in the clear. If your application is internal, this really isn’t a big deal at all. If you suspect a bad guy on your internal network, you have much more to worry about than whether communications between two boxes are in the clear. If your application is in the cloud, this concern is more realistic, but in that case, SSL termination is limited in usefulness anyway because you can’t know if the other apps on the same hardware are utilizing it. So you simply flick a switch on your ADC to turn on SSL termination, and then turn SSL off on your applications, and you have what the ADC industry calls “SSL offload”. If your ADC is purpose-built hardware (like our BIG-IP), then there is encryption hardware in the box and you don’t have to worry about the impact to the ADC of overloading it with SSL requests; it’s built to handle the load. If your ADC is software or a VM (like our BIG-IP LTM VE), then you’ll have to do a bit of testing to see what the tolerance level for SSL load is on the hardware you deployed it on, but you can ask the network staff to worry about all of that once you’ve started the conversation. Is this the only security-based performance boost you can get? No, but it is the easy one. Everything on the Internet remains encrypted, but your application is not burdening the server’s CPU with encryption requests each time communications in or out occur. The other easy one is TCP optimizations. This one requires less talk because it is completely out of the realm of the developer.
Simply put, TCP is a well designed protocol that sometimes gets bogged down communicating and has a lot of overhead in those situations. Turning on TCP optimizations in your ADC can reduce the overhead – more or less, depending upon what is on the other end of the communications network – and improve perceived performance, which honestly is one of the most important measures of web application availability. By making it seem to load faster, you’ve improved your customer experience, and nothing about your development has to change. TCP optimizations are not new, and thus the ones that are turned on when you activate the option on most ADCs are stable and won’t disrupt most applications. Of course you should run a short test cycle with them enabled, just to be certain, but I would be surprised if you saw any issues. They’re not unheard of, but they are very rare. That’s enough for now, I think. I don’t want these to get so long that you wander off to develop some more. Keep doing what you do. And strive to keep your users from doing this. Slow apps anger users
Don_MacVittie_1
Apr 07, 2011 Place Technical Articles
299Views
0likes
0Comments
Mounting Offline VM Drives
Most of the files I use in my virtual desktop environment are centrally located in a share I make accessible to the host and all the guests for ease of transfer between them. However, there is one guest I keep fairly isolated for security reasons. This is great, but when I need a file from it, I have previously had to start that guest, wait, log in, move the files I need to the share, then shut down. It’s frequent enough to be annoying. I’d leave it up, but I prefer to keep my BIG-IP LTM VE and a couple of Linux guests running, and there are only so many resources. Anyway, the annoyance hit the tipping point today, and I found out that VMware Workstation includes a tool to mount the guest drives in the host OS.
JRahm
Sep 01, 2010 Place Technical Articles
246Views
0likes
0Comments
Useful Cloud Advice, Part Two. Applications
This is the second part of this series talking about things you need to consider, and where cloud usage makes sense given the current state of cloud evolution. The first one, Cloud Storage, can be found here. The point of the series is to help you figure out what you can do now, and what you have to consider when moving to the cloud. This will hopefully help you to consider your options when pressure from the business or management to “do something” mounts. Once again, our definition of cloud is Infrastructure as a Service (IaaS), meaning “VM containers”, not SOA or other variants of cloud. For our purposes, we’ll also assume “public cloud”. The reasoning here is simple: if you’re implementing internal cloud, you’re likely already very virtualized, and you don’t have the external vendor issues, so you don’t really need this advice, though some of it will still apply to you, so read on anyway.
Related Articles and Blogs
Maybe Ubuntu Enterprise Cloud Makes Cloud Computing Too Easy
Cloud Balancing, Cloud Bursting, and Intercloud
Bursting the Cloud
The Impossibility of CAP and Cloud
Amazon Makes the Cloud Sticky
Cloud, Standards, and Pants
The Inevitable Eventual Consistency of Cloud Computing
Infrastructure 2.0 + Cloud + IT as a Service = An Architectural ...
Cloud Computing Makes Servers Obsolete
Cloud Computing's Other Achilles' Heel: Software Licensing
Don_MacVittie_1
Jun 24, 2011 Place Technical Articles
211Views
0likes
0Comments
AI Token Limit Enforcement
Problem
Companies that run AI inference services on-premises instead of using public cloud providers often do so to keep sensitive data local. However, local LLM infrastructure introduces a new challenge: resource control. Without proper limits, users or applications can generate excessive inference requests and consume GPU or CPU capacity uncontrollably. Inference stacks may lack built-in mechanisms for enforcing per-user or per-role token budgets, so organizations need a way to control usage before requests reach the model.
Solution
Our approach uses only BIG-IP LTM iRules to control access and usage:
JWT validation: The company issues a JWT for each user request. When the request arrives at the iRule, we verify the JWT's RSA signature to ensure it hasn’t been tampered with.
Role-based token limits: The JWT payload includes the user role. We have three roles with different token budgets:
standard_user → small token budget
extended_user → medium token budget
power_user → large token budget
Token tracking with table commands: The remaining budget, the time of the last refill, and the current role are stored per user in the session table.
Budget enforcement: If a user has already used too many tokens, the iRule returns HTTP 429. Otherwise, the token budget is decreased and the request is allowed to proceed.
Role-change handling: If the user role changes during a session, the token budget updates accordingly.
Impact
This iRule enables token budget enforcement directly on BIG-IP LTM without requiring additional modules or external gateways. By validating JWTs and extracting user and role information, the iRule applies role-based token limits before requests reach the inference service. This provides a simple, native way to introduce quota control and protect on-premises AI infrastructure from uncontrolled usage.
Authors
Marcio Goncalves <[email protected]>, Sven Schaefer <[email protected]>
Code
Main iRule, requires the procedure library (proc_lib) below.
# Title: AI Token Limit Enforcement
# Author: Marcio Goncalves <[email protected]>, Sven Schaefer <[email protected]>
# Version: 1.0
# Description:
# This iRule enforces token budgets for AI inference services. The main goal
# is to limit how many tokens a user can consume based on their assigned
# role. Each role has a configurable token budget and a reset timer that
# defines when the budget is refreshed.
# The role information is provided through a JWT. Because the iRule relies
# on the JWT to determine the user identity and role, the token must first
# be validated before any request can be processed.
#
# JWT validation is therefore only a prerequisite. It ensures that the
# request is authenticated and that the role information can be trusted.
# Without a valid JWT the request cannot be processed, since neither the
# user nor the role would be known.
# The iRule validates the RSA signature of the JWT using the public key
# referenced by the key ID (kid) in the JWT header. Multiple keys are
# supported to allow key rollover. The expiration time (exp claim) is also
# verified to ensure the token is still valid.
#
# Once the JWT is validated, the iRule extracts the username and role from
# the payload and applies the corresponding token limits. If a user exceeds
# the allowed token budget, the iRule returns HTTP status code 429 (Too Many
# Requests).
#
# Logging is intentionally very verbose and controlled via debug levels
# ranging from 0 (silent) to 5 (logging like crazy).
#
# The overall goal is to implement a native LTM-only mechanism for enforcing
# token limits for AI workloads, without requiring APM.
#
# Credits / Sources:
# JWT validation logic adapted from:
# https://github.com/JuergenMang/f5-irules-jwt/blob/main/jwt-validate
# (Juergen Mang)
#
# JSON handling techniques inspired by:
# https://community.f5.com/kb/technicalarticles/working-with-json-data-in-irules---part-2/345282
# (Jason Rahm)

when RULE_INIT priority 100 {
    # SHA256 signing header
    set static::jwt_validate_digest_header_sha256 "3031300d060960864801650304020105000420"

    # Public key for signature validation
    set static::jwt_validate_pubkey_kid1 {-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEA1RAIiNKFjm4DEuQet0zN
SQQ1/LDXP1xqUuEWEBWZ7nfhOru/l9eiJibtfoO+F8vUUFBTthm0SdiVWETF/psT
yqoDqKSjobqGquaglGmK63KDQparjnh5nJjtmMELvA4DSz6e5pO5mDdATVRpVXvp
j45rIW7eBoxMGAB0ivVm88ChyGA0UJUuyTSRuZnXyY8sMHz8JkhxWwr6i87i5p+p
E27HJ9WaCikBL2RALJIZLL+ByVknTWuRW785hN1A6V+/o/Yy9Cdqt0hif0zSC2+r
D+hIMHqDSR6WLb07KqCTbbL8q9v2selR8X5lbYYYh0vk9voD3JFvRbTtfz1YystH
qQIDAQAB
-----END PUBLIC KEY-----
}
    set static::jwt_validate_pubkey_kid2 {-----BEGIN PUBLIC KEY-----
MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAwlik5HcRTfp4c4oP5Jta
Thhqa4EjV+dJB9w9EqQa9dMQzVWXG8O1b3izee1kESICe+YUryVS9I6TbJavqH1t
ut0cM0VHLnWYQJAd7w2nK7qoDYX+uj9Lcq6pTSUH6zM/Sro0D4+/Ha6LAtyiJosx
QzA+yxaFrBwJHzXRgnCd/6crMG3eP/jaz+xid/AecHerQ1C0kRBTZd7FHt+SS677
489emEMwtpjNZCq2YnHgTULxQKjKEKMQGQrD1OOnz8ZyN9wtYSQp24lDmXVw5p6G
a42UqjQ5C6Nbj3qr/FV+49maLrXEw6kowMAb0qWpAui1BrEjxR95WrWQQrdfWZCU
6wIDAQAB
-----END PUBLIC KEY-----
}

    array set static::user_role_token_limits {
        standard_user 10000
        extended_user 50000
        power_user 100000
    }
    set static::user_role_default_token_limit 1000
    set static::token_limit_reset_timer 30
}

when HTTP_REQUEST priority 100 {
    # Debug
    set debug_mode 3

    if { not ([HTTP::header value Authorization] starts_with "Bearer ") } {
        HTTP::respond 401 content "Authorization required" "Content-Type" "text/plain" "WWW-Authenticate" "Bearer"
        log local0. "No bearer token found"
        return
    }

    # Get JWT from authorization header
    set jwt_header_b64_url [string range [getfield [HTTP::header value Authorization] "." 1] 7 end]
    set jwt_body_b64_url [getfield [HTTP::header value Authorization] "." 2]
    set jwt_sig_b64_url [getfield [HTTP::header value Authorization] "." 3]
    if { $jwt_header_b64_url eq "" or $jwt_body_b64_url eq "" or $jwt_sig_b64_url eq "" } {
        HTTP::respond 401 content "Authorization required" "Content-Type" "text/plain" "WWW-Authenticate" "Bearer"
        log local0. "No bearer token found"
        return
    }
    if {$debug_mode > 3} {log local0. "Header: $jwt_header_b64_url"}
    if {$debug_mode > 3} {log local0. "Body: $jwt_body_b64_url"}
    if {$debug_mode > 3} {log local0. "Sig: $jwt_sig_b64_url"}

    # Decode JWT components
    set jwt_header [call proc_lib::b64url_decode $jwt_header_b64_url]
    if {$debug_mode > 3} {log local0. "JWT Header: $jwt_header"}
    set jwt_body [call proc_lib::b64url_decode $jwt_body_b64_url]
    if {$debug_mode > 3} {log local0. "JWT Body: $jwt_body"}
    set jwt_sig [call proc_lib::b64url_decode $jwt_sig_b64_url]
    if { $jwt_header eq "" or $jwt_body eq "" or $jwt_sig eq ""} {
        HTTP::respond 401 content "Authorization required" "Content-Type" "text/plain" "WWW-Authenticate" "Bearer"
        log local0. "Unable to decode jwt components"
        return
    }

    # Get signing algorithm
    set jwt_algo [call proc_lib::get_json_str "alg" $jwt_header]
    if {$debug_mode > 3} {log local0. "JWT signing: $jwt_algo"}
    if { $jwt_algo ne "RS256" } {
        HTTP::respond 401 content "Authorization required" "Content-Type" "text/plain" "WWW-Authenticate" "Bearer"
        log local0. "Unsupported signature algorithm"
        return
    }

    # Get expiration
    set jwt_exp [call proc_lib::get_json_num "exp" $jwt_body]
    if {$debug_mode > 3} {log local0. "JWT expiration: $jwt_exp"}
    set now [clock seconds]
    if { $jwt_exp < $now } {
        HTTP::respond 401 content "Authorization required" "Content-Type" "text/plain" "WWW-Authenticate" "Bearer"
        log local0. "JWT expired"
        return
    }

    # Get key id
    set jwt_kid [call proc_lib::get_json_str "kid" $jwt_header]
    switch -- $jwt_kid {
        "kid1" {
            set jwt_pubkey $static::jwt_validate_pubkey_kid1
        }
        "kid2" {
            set jwt_pubkey $static::jwt_validate_pubkey_kid2
        }
        default {
            HTTP::respond 401 content "Authorization required" "Content-Type" "text/plain" "WWW-Authenticate" "Bearer"
            log local0. "Unknown kid: $jwt_kid"
            return
        }
    }

    # Decrypt signature with public key
    if { [catch {
        set jwt_sig_decrypted [CRYPTO::decrypt -alg rsa-pub -key $jwt_pubkey $jwt_sig]
        binary scan $jwt_sig_decrypted H* jwt_sig_decrypted_hex
        if {$debug_mode > 3} {log local0. "Signature: $jwt_sig_decrypted_hex"}
    }] } {
        HTTP::respond 401 content "Authorization required" "Content-Type" "text/plain" "WWW-Authenticate" "Bearer"
        log local0. "Unable to decrypt signature: [subst "\$::errorInfo"]"
        return
    }

    # Create hash from JWT header and payload
    set hash [sha256 "$jwt_header_b64_url.$jwt_body_b64_url"]
    binary scan $hash H* hash_hex
    if {$debug_mode > 3} {log local0. "Calculated: ${static::jwt_validate_digest_header_sha256}${hash_hex}"}

    # Compare calculated and decrypted hash
    if { "${static::jwt_validate_digest_header_sha256}${hash_hex}" ne $jwt_sig_decrypted_hex } {
        HTTP::respond 401 content "Authorization required" "Content-Type" "text/plain" "WWW-Authenticate" "Bearer"
        return
    }

    set jwt_user [call proc_lib::get_json_str "user" $jwt_body]
    set jwt_role [call proc_lib::get_json_str "role" $jwt_body]
    if {$debug_mode > 0} {log local0. "Signature verified. JWT accepted. User: $jwt_user, Role: $jwt_role"}
}

when JSON_REQUEST {
    if {$debug_mode > 4} {log local0. "JSON Request detected successfully."}

    # Get JSON data from request body
    set json_data [JSON::root]
    if {$debug_mode > 4} {
        #call proc_lib::print $json_data
        log local0. [call proc_lib::stringify $json_data]
    }
    set user_prompts [call proc_lib::find_key $json_data "messages"]
    if {$debug_mode > 4} {log local0. "User-Prompts: $user_prompts"}
    if {$debug_mode > 3} {log local0. "JWT-User: $jwt_user"}
    if {$debug_mode > 3} {log local0. "JWT-Role: $jwt_role"}

    # check if role exists in the limits array
    if {[info exists static::user_role_token_limits($jwt_role)]} {
        # get configured token limit
        set initial_tokens $static::user_role_token_limits($jwt_role)
    } else {
        if {$debug_mode > 0} {log local0. "Role \"$jwt_role\" unknown, applying default limit"}
        # fallback value
        set initial_tokens $static::user_role_default_token_limit
    }
    if {$debug_mode > 1} {log local0. "Initial Tokens: $initial_tokens"}

    # Rough estimate: about four characters per token
    set estimated_tokens [expr {[string length $user_prompts] / 4}]
    if {$debug_mode > 1} {log local0. "Estimated Tokens: $estimated_tokens"}

    # Current time
    set now [clock seconds]

    # Check last refill for this user
    set last_refill [table lookup "last_refill:$jwt_user"]

    # If no refill exists or the reset timer has elapsed
    if {$last_refill eq "" || ($now - $last_refill) >= $static::token_limit_reset_timer} {
        if {$debug_mode > 1} {log local0. "Refilling tokens for user $jwt_user, because reset timer expired."}
        table set "tokens_remaining:$jwt_user" $initial_tokens indef
        table set "last_refill:$jwt_user" $now indef
    }

    set prev_role [table lookup "user_role:$jwt_user"]
    if {$prev_role eq ""} {
        if {$debug_mode > 1} {log local0. "Role not yet defined for user $jwt_user"}
        table set "user_role:$jwt_user" $jwt_role indef
    } elseif {$prev_role ne $jwt_role} {
        if {$debug_mode > 0} {log local0. "Role change detected for user $jwt_user: $prev_role -> $jwt_role"}
        # Re-calculate token limits based on new role
        set tokens_left [table lookup "tokens_remaining:$jwt_user"]
        # Fall back to the default limit so an unknown role cannot raise a Tcl error
        set prev_role_limit $static::user_role_default_token_limit
        if {[info exists static::user_role_token_limits($prev_role)]} {
            set prev_role_limit $static::user_role_token_limits($prev_role)
        }
        set new_role_limit $initial_tokens
        set new_role_limit_diff [expr {$new_role_limit - $prev_role_limit}]
        set tokens_left [expr {$tokens_left + $new_role_limit_diff}]
        if {$debug_mode > 1} {log local0. "Adjusting tokens for role change. Previous role limit: $prev_role_limit, New role limit: $new_role_limit, Tokens left adjusted by: $new_role_limit_diff, New tokens left: $tokens_left"}
        table set "tokens_remaining:$jwt_user" $tokens_left indef
        table set "user_role:$jwt_user" $jwt_role indef
    } else {
        if {$debug_mode > 1} {log local0. "Role for user $jwt_user remains unchanged: $jwt_role"}
    }

    set tokens_left [table lookup "tokens_remaining:$jwt_user"]

    # Initialize or reset token count if new session or role has changed
    if {$tokens_left eq "" || $prev_role ne $jwt_role} {
        set tokens_left $initial_tokens
    }

    if {$debug_mode > 3} {log local0. "Session table info for user $jwt_user"}
    foreach key [list "tokens_remaining:$jwt_user" "tokens_used:$jwt_user" "prompt:$jwt_user" "user_role:$jwt_user"] {
        set val [table lookup $key]
        if {$debug_mode > 3} {log local0. " $key = $val"}
    }

    if {$tokens_left < $estimated_tokens} {
        if {$debug_mode > 0} {log local0. "Token budget exceeded for user $jwt_user (role: $jwt_role). Remaining: $tokens_left, needed: $estimated_tokens"}
        HTTP::respond 429 content "Token budget exceeded for role $jwt_role. Please upgrade your plan." "Content-Type" "text/plain"
        return
    } else {
        # decrease remaining tokens
        if {$debug_mode > 1} {log local0. "Decreasing tokens for user $jwt_user (role: $jwt_role). Remaining: $tokens_left, needed: $estimated_tokens"}
        set tokens_left [expr {$tokens_left - $estimated_tokens}]
        table set "tokens_remaining:$jwt_user" $tokens_left indef

        # initialize or update used tokens
        if {$debug_mode > 1} {log local0. "Updating used tokens for user $jwt_user (role: $jwt_role). Used: $estimated_tokens"}
        set tokens_used [table lookup "tokens_used:$jwt_user"]
        if {$tokens_used eq ""} {
            set tokens_used 0
        }
        set tokens_used [expr {$tokens_used + $estimated_tokens}]
        table set "tokens_used:$jwt_user" $tokens_used indef
    }
}

when JSON_REQUEST_MISSING {
    if {$debug_mode > 4} {log local0. "JSON Request missing."}
}

when JSON_REQUEST_ERROR {
    if {$debug_mode > 4} {log local0. "Error processing JSON request. Rejecting request."}
}

when JSON_RESPONSE {
    if {$debug_mode > 4} {log local0. "JSON response detected successfully."}
}

when JSON_RESPONSE_MISSING {
    if {$debug_mode > 4} {log local0. "JSON Response missing."}
}

when JSON_RESPONSE_ERROR {
    if {$debug_mode > 4} {log local0. "Error processing JSON response."}
}

This is the procedure library; it must be named proc_lib:

proc b64url_decode { str } {
    # Restore the padding that base64url encoding strips
    set mod [expr { [string length $str] % 4 } ]
    if { $mod == 2 } {
        append str "=="
    } elseif {$mod == 3} {
        append str "="
    }
    if { [catch { b64decode [ string map {- + _ /} $str] } str_b64decoded ] == 0 and $str_b64decoded ne "" } {
        return $str_b64decoded
    } else {
        log local0. "Base64URL decoding error: [subst "\$::errorInfo"]"
        return ""
    }
}

proc get_json_num { key str } {
    set value [findstr $str "\"$key\"" [ expr { [string length $key] + 2 } ] ]
    set value [string trimleft $value {: }]
    return [scan $value {%[0-9]}]
}

proc get_json_str { key str } {
    set value [findstr $str "\"$key\"" [ expr { [string length $key] + 2 } ] ]
    set value [string trimleft $value {:" }]
    set json_value ""
    set escaped 0
    foreach char [split $value ""] {
        if { $escaped == 0 } {
            if { $char eq "\\" } {
                # next char is escaped
                set escaped 1
            } elseif { $char eq {"} } {
                # exit loop on first unescaped quotation mark
                break
            } else {
                append json_value $char
            }
        } else {
            switch -- $char {
                "\"" -
                "\\" {
                    append json_value $char
                }
                default {
                    # simply ignore other escaped values
                }
            }
            set escaped 0
        }
    }
    return $json_value
}

proc print { e } {
    set t [JSON::type $e]
    set v [JSON::get $e]
    set p0 [string repeat " " [expr {2 * ([info level] - 1)}]]
    set p [string repeat " " [expr {2 * [info level]}]]
    switch $t {
        array {
            log local0. "$p0\["
            set size [JSON::array size $v]
            for {set i 0} {$i < $size} {incr i} {
                set e2 [JSON::array get $v $i]
                call proc_lib::print $e2
            }
            log local0. "$p0\]"
        }
        object {
            log local0. "$p0{"
            set keys [JSON::object keys $v]
            foreach k $keys {
                set e2 [JSON::object get $v $k]
                log local0. "$p${k}:"
                call proc_lib::print $e2
            }
            log local0. "$p0}"
        }
        string -
        literal {
            set v2 [JSON::get $e $t]
            log local0. "$p\"$v2\""
        }
        default {
            set v2 [JSON::get $e $t]
            if { $v2 eq "" && $t eq "null" } {
                log local0. "${p}null"
            } elseif { $v2 == 1 && $t eq "boolean" } {
                log local0. "${p}true"
            } elseif { $v2 == 0 && $t eq "boolean" } {
                log local0. "${p}false"
            } else {
                log local0. "$p$v2"
            }
        }
    }
}

proc stringify { json_element } {
    set element_type [JSON::type $json_element]
    set element_value [JSON::get $json_element]
    set output ""
    switch -- $element_type {
        array {
            append output "\["
            set array_size [JSON::array size $element_value]
            for {set index 0} {$index < $array_size} {incr index} {
                set array_item [JSON::array get $element_value $index]
                append output [call proc_lib::stringify $array_item]
                if {$index < $array_size - 1} {
                    append output ","
                }
            }
            append output "\]"
        }
        object {
            append output "{"
            set object_keys [JSON::object keys $element_value]
            set key_count [llength $object_keys]
            set current_index 0
            foreach current_key $object_keys {
                set nested_element [JSON::object get $element_value $current_key]
                append output "\"${current_key}\":"
                append output [call proc_lib::stringify $nested_element]
                if {$current_index < $key_count - 1} {
                    append output ","
                }
                incr current_index
            }
            append output "}"
        }
        string -
        literal {
            set actual_value [JSON::get $json_element $element_type]
            append output "\"$actual_value\""
        }
        default {
            set actual_value [JSON::get $json_element $element_type]
            append output "$actual_value"
        }
    }
    return $output
}

proc find_key { json_element search_key } {
    # Depth-first search; returns the first value found for search_key
    set element_type [JSON::type $json_element]
    set element_value [JSON::get $json_element]
    switch -- $element_type {
        array {
            set array_size [JSON::array size $element_value]
            for {set index 0} {$index < $array_size} {incr index} {
                set array_item [JSON::array get $element_value $index]
                set result [call proc_lib::find_key $array_item $search_key]
                if {$result ne ""} {
                    return $result
                }
            }
        }
        object {
            set object_keys [JSON::object keys $element_value]
            foreach current_key $object_keys {
                if {$current_key eq $search_key} {
                    set found_element [JSON::object get $element_value $current_key]
                    set found_type [JSON::type $found_element]
                    if {$found_type eq "object" || $found_type eq "array"} {
                        set found_value [call proc_lib::stringify $found_element]
                    } else {
                        set found_value [JSON::get $found_element $found_type]
                    }
                    return $found_value
                }
                set nested_element [JSON::object get $element_value $current_key]
                set result [call proc_lib::find_key $nested_element $search_key]
                if {$result ne ""} {
                    return $result
                }
            }
        }
    }
    return ""
}

Example JWT:
eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6ImtpZDEifQ.eyJzdWIiOiIxMjM0NTY3ODkwIiwidXNlciI6ImpvaG4uZG9lQGNvbmNlbnRyYWRlLmRlIiwicm9sZSI6InN0YW5kYXJkX3VzZXIiLCJpYXQiOjE3NzU4NzU5MjMsImV4cCI6MTc3NTg3NTkyM30.rV-gaGKOEG1p_1G652_dFUBHT_X4pI-KNgu2W_I0eJevIg3FviO_0c9BOoOOUspBADttCjzEciBhLPJ2P5r_PqIdXu5khUCjH4Sq5P6zV_sTQjbRiPatYirLWtbypamSJby_TfnEFFl7sz642YuDQ7zyvbHbPCllaM4stE_Zsa1QtOy18lUJO3Uy4ngJR8CRZ6flgPhvk79rTOGXAczYNJVo5gwHyKKA6Stdp5_c7FjyEySpCfYNmWQ2AasF3DDFCDiQQpxgW-hr--NnLc0FFBan4IfQ7btn73Pc56mhJC5gAwgRJLnLLe7LbR5chfjZ26COuH0ILYvaBq0w3yCE2g

Example POST Data:
{
  "model": "llama3.1:8b",
  "messages": [
    { "role": "system", "content": "You are a helpful assistant for security operations." },
    { "role": "user", "content": "Analyze this HTTP request and tell me whether it looks malicious." }
  ],
  "stream": false,
  "options": { "temperature": 0.2 }
}
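The budget logic in the iRule above (role limit lookup with a default, timed refill, a length-based token estimate, and a 429 when the budget runs out) can be checked off-box with a small model. The following is a minimal Python sketch, not the authors' implementation: the `TokenBudget` class, its method names, and the injectable clock are hypothetical, a plain dictionary stands in for the BIG-IP session table, and role-change handling is left out for brevity. The limits and the 30-second reset timer are copied from RULE_INIT.

```python
import time

# Values copied from the iRule's RULE_INIT block.
ROLE_LIMITS = {"standard_user": 10000, "extended_user": 50000, "power_user": 100000}
DEFAULT_LIMIT = 1000
RESET_SECONDS = 30


class TokenBudget:
    """Hypothetical in-memory stand-in for the session-table bookkeeping."""

    def __init__(self, clock=time.time):
        self.clock = clock        # injectable so tests can control time
        self.remaining = {}       # user -> tokens left in the current period
        self.last_refill = {}     # user -> timestamp of the last refill

    def check(self, user, role, prompt_text):
        """Return 200 if the request fits the budget, 429 otherwise."""
        limit = ROLE_LIMITS.get(role, DEFAULT_LIMIT)
        now = self.clock()
        last = self.last_refill.get(user)
        # Refill on first sight of the user or once the reset timer has elapsed.
        if last is None or now - last >= RESET_SECONDS:
            self.remaining[user] = limit
            self.last_refill[user] = now
        # Same rough estimate as the iRule: one token per four characters.
        needed = len(prompt_text) // 4
        if self.remaining[user] < needed:
            return 429
        self.remaining[user] -= needed
        return 200
```

Passing a fake clock makes the refill behavior easy to exercise: a user with an unknown role gets the default budget of 1000, is rejected once a prompt needs more than what is left, and is accepted again after 30 seconds.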
Marcio_G
Mar 10, 2026 Place Contest Entries
200Views
4likes
0Comments
Rate limiting WebSocket messages for Agents
Problem
WebSocket-based AI services need protection from overload caused by high message rates, from temporary spikes, from resource waste caused by duplicate or repeated messages, and from aggressive or malicious agents. Operators also need visibility into these events.
Solution
This iRule protects WebSocket endpoints from aggressive or misbehaving AI agents by enforcing message rate limits, burst controls, and duplicate suppression. Each client IP is allowed up to 40 messages per 10 seconds (rate_limit / rate_window) with a maximum of 20 messages per second (burst_limit). Duplicate messages within 5 seconds (dup_ttl) are dropped, and any client exceeding limits is temporarily penalized for 60 seconds (penalty_time) and disconnected. All violations are logged in JSON format to an HSL pool, including timestamp, client IP, event type, message content, and count.
Impact
For organizations running AI at scale, this safeguards availability, performance, and security across potentially thousands of clients simultaneously.
Code
when RULE_INIT {
    # HSL pool for JSON logging
    set static::hsl_pool "syslog_pool"

    # Sliding window rate limit: 40 messages per 10 seconds
    set static::rate_limit 40
    set static::rate_window 10

    # Burst protection: 20 messages per second
    set static::burst_limit 20

    # Duplicate message suppression TTL (seconds)
    set static::dup_ttl 5

    # Penalty/quarantine duration (seconds)
    set static::penalty_time 60
}

# -----------------------------
# Detect WebSocket Upgrade
# -----------------------------
when HTTP_REQUEST {
    if {[string tolower [HTTP::header "Upgrade"]] eq "websocket"} {
        # Nothing required here, IP can be grabbed from client_addr in other events
    }
}

# -----------------------------
# Inspect WebSocket Frames
# -----------------------------
when WS_CLIENT_DATA {
    set payload [WS::payload]
}

when WS_CLIENT_FRAME {
    set ip [IP::client_addr]

    # Open HSL
    set hsl [HSL::open -proto UDP -pool $static::hsl_pool]

    # -----------------------------
    # Check penalty/quarantine
    # -----------------------------
    if {[table lookup "ws_penalty:$ip"] ne ""} {
        # Log JSON event
        set ts [clock format [clock seconds] -gmt 1 -format "%Y-%m-%dT%H:%M:%SZ"]
        set logmsg [string map {\" \\\" \n "" \r ""} $payload]
        set json "{\
            \"timestamp\":\"$ts\",\
            \"client_ip\":\"$ip\",\
            \"event\":\"penalty_block\",\
            \"message\":\"$logmsg\",\
            \"count\":\"0\"\
        }"
        HSL::send $hsl $json

        # Mark violation for disconnect
        table set "ws_violation:$ip" 1 2
        return
    }

    # -----------------------------
    # Sliding window rate counter
    # -----------------------------
    set rate_key "ws_rate:$ip"
    set rate [table incr $rate_key]
    if {$rate == 1} {
        table timeout $rate_key $static::rate_window
    }

    # -----------------------------
    # Burst detection
    # -----------------------------
    set burst_key "ws_burst:$ip"
    set burst [table incr $burst_key]
    if {$burst == 1} {
        table timeout $burst_key 1
    }

    # -----------------------------
    # Duplicate message detection
    # -----------------------------
    set hash [crc32 $payload]
    set dup_key "ws_dup:$ip:$hash"
    if {[table lookup $dup_key] ne ""} {
        # Log duplicate message
        set ts [clock format [clock seconds] -gmt 1 -format "%Y-%m-%dT%H:%M:%SZ"]
        set logmsg [string map {\" \\\" \n "" \r ""} $payload]
        set json "{\
            \"timestamp\":\"$ts\",\
            \"client_ip\":\"$ip\",\
            \"event\":\"duplicate_message\",\
            \"message\":\"$logmsg\",\
            \"count\":\"$rate\"\
        }"
        HSL::send $hsl $json
        WS::frame drop
        return
    }

    # Store this message hash for duplicate detection
    table set $dup_key 1 $static::dup_ttl

    # -----------------------------
    # Rate violation check
    # -----------------------------
    if {$rate > $static::rate_limit || $burst > $static::burst_limit} {
        # Log rate limit exceeded
        set ts [clock format [clock seconds] -gmt 1 -format "%Y-%m-%dT%H:%M:%SZ"]
        set logmsg [string map {\" \\\" \n "" \r ""} $payload]
        set json "{\
            \"timestamp\":\"$ts\",\
            \"client_ip\":\"$ip\",\
            \"event\":\"rate_limit_exceeded\",\
            \"message\":\"$logmsg\",\
            \"count\":\"$rate\"\
        }"
        HSL::send $hsl $json

        # Apply penalty/quarantine
        table set "ws_penalty:$ip" 1 $static::penalty_time

        # Mark violation for disconnect in FRAME_DONE
        table set "ws_violation:$ip" 1 2
        return
    }
}

# -----------------------------
# Disconnect violating clients in valid event
# -----------------------------
when WS_CLIENT_FRAME_DONE {
    set ip [IP::client_addr]
    if {[table lookup "ws_violation:$ip"] eq "1"} {
        WS::disconnect 1000 "Violation occurred"
        table delete "ws_violation:$ip"
    }
}
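To reason about how the four checks in the iRule above interact (penalty, rate window, burst, duplicate), it can help to replay the same decision order outside BIG-IP. The sketch below is a simplified Python model under stated assumptions: the `FrameLimiter` class and its method names are hypothetical, each counter is treated as a fixed window that expires a set time after its first message (a simplification of session-table timeouts), and `zlib.crc32` stands in for the iRule `crc32` command. The defaults mirror the values in RULE_INIT.

```python
import zlib


class FrameLimiter:
    """Hypothetical off-box model of the per-IP WebSocket checks."""

    def __init__(self, rate_limit=40, rate_window=10, burst_limit=20,
                 dup_ttl=5, penalty_time=60):
        self.rate_limit = rate_limit
        self.rate_window = rate_window
        self.burst_limit = burst_limit
        self.dup_ttl = dup_ttl
        self.penalty_time = penalty_time
        self.counters = {}   # key -> (count, expires_at)
        self.flags = {}      # key -> expires_at

    def _incr(self, key, ttl, now):
        # Start a fresh window when the previous one has expired.
        count, expires = self.counters.get(key, (0, 0))
        if now >= expires:
            count, expires = 0, now + ttl
        count += 1
        self.counters[key] = (count, expires)
        return count

    def _alive(self, key, now):
        return self.flags.get(key, 0) > now

    def on_message(self, ip, payload, now):
        """Return the event name the iRule would log, or "ok"."""
        if self._alive(("penalty", ip), now):
            return "penalty_block"
        rate = self._incr(("rate", ip), self.rate_window, now)
        burst = self._incr(("burst", ip), 1, now)
        dup_key = ("dup", ip, zlib.crc32(payload))
        if self._alive(dup_key, now):
            return "duplicate_message"
        self.flags[dup_key] = now + self.dup_ttl
        if rate > self.rate_limit or burst > self.burst_limit:
            self.flags[("penalty", ip)] = now + self.penalty_time
            return "rate_limit_exceeded"
        return "ok"
```

One property the model makes visible is that a dropped duplicate still increments the rate and burst counters, because the iRule increments both before the duplicate check runs.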
mcabral10
Mar 10, 2026 Place Contest Entries
AI/Bot Traffic Throttling iRule (UA Substring + IP Range Mapping)
Problem
Tags: appworld 2026, vegas, irules
Created by Tim Riker using AI for the DevCentral competition. Written entirely by ChatGPT.

Executive Summary
This iRule provides a practical, production-ready method for throttling AI agents, crawlers, automation frameworks, and other high-volume HTTP clients at the BIG-IP edge. Bots are identified first by User-Agent substring matching and, if necessary, by source IP range mapping.

Solution
Throttling is enforced per bot identity rather than per client IP, which more accurately reflects how modern AI systems operate using distributed egress networks. The solution is entirely data-group driven, operationally simple, and requires no external systems. Security and operations teams can adjust bot behavior dynamically without modifying the iRule itself.

Why This Matters
Modern AI agents, LLM training bots, search indexers, and automation frameworks can generate extremely high request volumes. Even legitimate AI services can unintentionally:
• Create excessive origin load
• Increase bandwidth and infrastructure cost
• Trigger autoscaling events
• Impact latency for real users
• Skew analytics and performance metrics
Rather than blocking AI traffic outright, organizations often need controlled rate limiting. This iRule enables responsible throttling while preserving service availability and fairness.

Contest Justification

Innovation and Creativity
This iRule implements identity-based throttling rather than traditional per-IP rate limiting. Because AI agents frequently operate from multiple IP addresses, shared throttling by canonical bot identity provides significantly more accurate control. The dual attribution model (User-Agent substring first, IP-range fallback second) allows the system to handle both transparent and opaque clients, including cases where User-Agent headers are missing or spoofed.

Technical Excellence
This implementation uses native BIG-IP primitives only:
• class match -element -- contains for efficient substring matching
• class match -value for IP range mapping
• table incr for shared counters
• HTTP 429 with Retry-After for standards-compliant throttling
The iRule parses only the first two whitespace tokens of the datagroup value, allowing inline comments while maintaining strict numeric enforcement. The logic executes only when a bot match occurs, keeping overhead minimal.

Theme Alignment
As AI-generated traffic becomes increasingly common, edge enforcement policies must evolve. This iRule demonstrates a practical, deployable mechanism for managing AI-era traffic patterns directly at the application delivery layer.

Impact
Organizations deploying AI throttling controls can:
• Protect origin infrastructure from automated traffic surges
• Maintain consistent performance for human users
• Reduce infrastructure and bandwidth cost
• Avoid over-provisioning driven by bot bursts
• Implement governance policies for AI consumption
Because throttle limits are configured via datagroups, operational adjustments can be made instantly without code changes, reducing risk and change-control friction.

Code

Required Datagroup Configuration

dg_bot_agent (String Datagroup)
Key: User-Agent substring or canonical bot name.
Value format: First two whitespace-separated integers define <limit> <window>. Additional text after the first two tokens is ignored.
    googlebot = "5 60"
    bingbot = "3 30 search crawler"
    my-ai-agent = "10 10 internal load test"
"5 60" means allow 5 requests per 60 seconds.

dg_bot_net (Address Datagroup)
Key: IP address or CIDR range.
Value: Must match a key defined in dg_bot_agent.
    198.51.100.0/24 = "my-ai-agent"
    203.0.113.0/25 = "googlebot"

Deployment Steps
1. Create dg_bot_agent (string).
2. Create dg_bot_net (address).
3. Populate dg_bot_agent using "<limit> <window> optional comment".
4. Populate dg_bot_net ranges mapping to dg_bot_agent keys.
5. Attach the iRule to an HTTP virtual server.

Testing Scenario
1. Set dg_bot_agent entry: my-ai-agent = "3 30 demo".
2. Send four rapid requests using User-Agent: my-ai-agent.
3. The first three succeed.
4. The fourth returns HTTP 429 with Retry-After: 30.
5. Map an IP range in dg_bot_net to my-ai-agent.
6. Multiple clients within that range will share the same throttle counter.

Operational Notes
• Throttling is per bot identity, not per IP.
• Enable logging by setting static::bot_log to 1.
• Configure table mirroring if cluster-wide counters are required.
• Validate on BIG-IP v21 to meet contest eligibility requirements.

Architectural Diagram Description
The solution can be visualized as an edge-side decision pipeline on BIG-IP, where each HTTP request is classified and optionally rate-limited before it reaches the application.

Diagram components:
• Client: Human browser, bot, crawler, AI agent, automation framework, or any HTTP client.
• BIG-IP Virtual Server (HTTP): Entry point where the iRule executes in the HTTP_REQUEST event.
• Identification Layer: Determines the bot identity using a two-stage method (User-Agent first, IP fallback).
• Configuration Datagroups: dg_bot_agent and dg_bot_net provide bot identification and throttle settings.
• Shared Rate Counter (table): A per-bot bucket that tracks request counts over a time window.
• Decision Output: Either allow request through to the pool or return HTTP 429 with Retry-After.
• Application Pool: Origin servers that only receive traffic allowed by the throttle policy.

Diagram flow (left-to-right):
Step 1: Client sends HTTP request to BIG-IP VIP.
Step 2: BIG-IP extracts User-Agent and client IP.
Step 3: User-Agent substring lookup is performed using class match -element -- <ua> contains dg_bot_agent.
Step 4: If Step 3 finds a match, the matched dg_bot_agent key becomes the canonical bot identity and its value provides <limit> <window>.
Step 5: If Step 3 does not match, BIG-IP checks client IP against dg_bot_net.
If the IP matches a range, dg_bot_net returns a canonical bot identity.
Step 6: BIG-IP uses that canonical identity to look up throttle values in dg_bot_agent. If no dg_bot_agent entry exists, the iRule exits and does not throttle.
Step 7: BIG-IP increments a shared counter in table using the canonical bot identity as the only key (no IP component). All IPs mapped to that bot share the same bucket.
Step 8: If the request count exceeds the configured limit within the configured window, BIG-IP returns HTTP 429 with a Retry-After header. Otherwise, the request is forwarded to the application pool.

Key design choice: This architecture intentionally rate-limits by bot identity rather than by source IP. This is important for AI agents and modern crawlers because they frequently distribute traffic across many IP addresses. A per-IP limiter can be bypassed unintentionally or can fail to represent the true load being generated by the bot as a whole. A shared per-identity bucket enforces a realistic, policy-driven ceiling on aggregate bot traffic.

Code

# ------------------------------------------------------------------------------
# iRule: Bot Throttle via Data Groups
#
# Created by Tim Riker using AI for the DevCentral competition.
# Written entirely by ChatGPT.
#
# DESCRIPTION:
#   Throttles HTTP requests for known bots and AI agents based on configuration
#   stored in datagroups. User-Agent matching is attempted first. If no match
#   is found, client IP is evaluated against a network datagroup to determine
#   the bot identity.
#
# WHY THIS MATTERS:
#   Modern AI agents, crawlers, LLM training bots, search indexers, and
#   automation frameworks can generate extremely high request volumes.
#   Having a controlled throttling mechanism allows organizations to protect
#   infrastructure, manage costs, and preserve UX without blocking outright.
#
# IMPLEMENTATION NOTES:
#   • Throttling is performed per unique bot key (NOT per IP).
#   • All IPs mapped to the same bot share a single counter.
#   • Throttle values are configurable per bot in dg_bot_agent.
#
# REQUIRED DATAGROUP FORMATS
#
#   dg_bot_agent (string):
#     Key:   UA substring (and/or canonical bot name used by dg_bot_net values)
#     Value: "<limit> <window> [optional comment...]"
#            Only the first two whitespace tokens are used.
#
#   dg_bot_net (address):
#     Key:   IP/CIDR range
#     Value: MUST match a key in dg_bot_agent
# ------------------------------------------------------------------------------

when RULE_INIT {
    set static::bot_limit 3
    set static::bot_window 30
    set static::bot_log 0
    set static::bot_table "bot_throttle"
}

when HTTP_REQUEST {
    set ua [string tolower [HTTP::header "User-Agent"]]
    set ip [IP::client_addr]

    set dg_key ""
    set dg_value ""

    if { $ua ne "" } {
        set result [class match -element -- $ua contains dg_bot_agent]
        if { $result ne "" } {
            set dg_key [lindex $result 0]
            set dg_value [lindex $result 1]
            if { $dg_value eq "" } {
                set dg_value [class lookup $dg_key dg_bot_agent]
            }
        }
    }

    if { $dg_key eq "" } {
        if { [class match $ip equals dg_bot_net] } {
            set net_val [class match -value $ip equals dg_bot_net]
            if { $net_val ne "" } {
                set dg_key $net_val
                set dg_value [class lookup $dg_key dg_bot_agent]
            } else {
                return
            }
        } else {
            return
        }
    }

    if { $dg_key eq "" || $dg_value eq "" } {
        return
    }

    set vlimit ""
    set vwindow ""
    set tokens [regexp -inline -all {\S+} $dg_value]

    if { [llength $tokens] >= 1 } {
        set t1 [lindex $tokens 0]
        if { [string is integer -strict $t1] } {
            set vlimit $t1
        }
    }
    if { [llength $tokens] >= 2 } {
        set t2 [lindex $tokens 1]
        if { [string is integer -strict $t2] } {
            set vwindow $t2
        }
    }

    if { $vlimit ne "" } {
        set bot_limit $vlimit
    } else {
        set bot_limit $static::bot_limit
    }
    if { $vwindow ne "" } {
        set bot_window $vwindow
    } else {
        set bot_window $static::bot_window
    }

    set bot_key [string tolower [string trim $dg_key]]
    set count [table incr -subtable $static::bot_table $bot_key]
    if { $count == 1 } {
        table timeout -subtable $static::bot_table $bot_key $bot_window
    }

    if { $count > $bot_limit } {
        if { $static::bot_log } {
            log local0. "BOT_THROTTLED bot=$bot_key limit=$bot_limit window=$bot_window count=$count ip=$ip ua=\"$ua\""
        }
        HTTP::respond 429 content "Too Many Requests\r\n" \
            "Retry-After" $bot_window \
            "Connection" "close"
        return
    }
}
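
The value parsing and the shared per-bot counter used by the rule above can be tried outside BIG-IP. The following is a rough sketch in plain Tcl (runnable in tclsh) that mirrors the "<limit> <window> [comment]" parsing and replays the Testing Scenario. The procs parse_throttle and allow_request and the buckets array are hypothetical stand-ins, not iRules commands. The counter is modeled as a fixed window that starts at the first request, which only approximates how the session table expires entries.

```tcl
# Hypothetical stand-ins for the iRule logic; nothing here is an iRules API.

# Parse "<limit> <window> [comment...]". A token that is missing or is not a
# strict integer falls back to the supplied default, as in the rule above.
proc parse_throttle {value default_limit default_window} {
    set tokens [regexp -inline -all {\S+} $value]
    set limit  $default_limit
    set window $default_window
    set t1 [lindex $tokens 0]
    set t2 [lindex $tokens 1]
    if {[string is integer -strict $t1]} { set limit $t1 }
    if {[string is integer -strict $t2]} { set window $t2 }
    return [list $limit $window]
}

# One shared bucket per bot identity (no IP component in the key).
# buckets(<bot>) holds {count window_start}. The current time is passed in as
# "now" so the behaviour can be checked without waiting.
proc allow_request {bot limit window now} {
    global buckets
    if {![info exists buckets($bot)] ||
        $now - [lindex $buckets($bot) 1] >= $window} {
        # Assumption: a fixed window that restarts once it has fully elapsed.
        set buckets($bot) [list 0 $now]
    }
    foreach {count start} $buckets($bot) break
    incr count
    set buckets($bot) [list $count $start]
    return [expr {$count <= $limit}]
}

# Replay the Testing Scenario: my-ai-agent = "3 30 demo".
foreach {limit window} [parse_throttle "3 30 demo" 3 30] break
foreach t {0 1 2 3 31} {
    puts "t=$t allowed=[allow_request my-ai-agent $limit $window $t]"
}
```

With these inputs the first three requests are allowed, the fourth is rejected, and the request at t=31 is allowed again because the 30-second window has elapsed. On BIG-IP the expiry behaviour depends on how the table entry's timeout and lifetime are set, so the boundary timing may differ from this model.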
TimRiker
Mar 10, 2026 Place Contest Entries