# One Quick Step to Make your website AI-Agent/MCP Ready with an iRule
## The Problem Nobody Warned You About

Here's the thing about the AI agent explosion: GPTBot, ClaudeBot, PerplexityBot, and a dozen other crawlers are hitting your web applications today. And they're getting back the same bloated HTML your browser gets, complete with navigation bars, cookie banners, SVG icons, inline JavaScript, and CSS that means absolutely nothing to an LLM, other than a hit to your token usage.

These agents don't need your `<nav>` with 47 links. They don't need your cookie consent modal. They definitely don't need 200+ lines of minified CSS and JS. They need the content: the headings, the paragraphs, the links, the data. If you, or anyone else, use an agent to access the data on the page, it burns through a massive number of tokens, generally ~2k per GET.

But what if your BIG-IP could intercept these requests, see that the client is an AI agent, and transform that HTML response into clean markdown before it ever leaves your network? BTW, there is plenty of room for improvement here, and a small disclaimer at the end!

## The Approach

The iRule works in three phases across three HTTP events. Here's the flow:

1. Client request => HTTP_REQUEST (detect agent, strip Accept-Encoding)
2. Origin response => HTTP_RESPONSE (check HTML, collect body)
3. Body received => HTTP_RESPONSE_DATA (convert HTML => Markdown, replace body)
4. Client receives clean markdown with `Content-Type: text/plain`

## Detection: Who's an AI Agent?

This example detects agents three ways, because different agents announce themselves differently, and we want to give humans a way to trigger it too (mostly I used this for testing; notes on that later).
```tcl
when HTTP_REQUEST {
    set is_ai_agent 0
    set ua [string tolower [HTTP::header "User-Agent"]]

    # The usual suspects
    if { $ua contains "gptbot" || $ua contains "chatgpt-user" ||
         $ua contains "claudebot" || $ua contains "claude-web" ||
         $ua contains "perplexitybot" || $ua contains "cohere-ai" ||
         $ua contains "google-extended" || $ua contains "applebot-extended" ||
         $ua contains "bytespider" || $ua contains "ccbot" ||
         $ua contains "amazonbot" } {
        set is_ai_agent 1
    }

    # Explicit opt-in via header
    if { [HTTP::header "X-Request-Format"] eq "markdown" } {
        set is_ai_agent 1
    }

    # Content negotiation (the standards-correct way)
    if { [HTTP::header "Accept"] contains "text/markdown" } {
        set is_ai_agent 1
    }
```

Why three methods? User-Agent detection handles the common crawlers automatically. The X-Request-Format header lets any client explicitly request markdown. And `Accept: text/markdown` is proper HTTP content negotiation, the way it should work once the ecosystem matures.

## The Demo Path: /md/ Prefix

I added one more trigger that's purely for demos:

```tcl
    set orig_uri [HTTP::uri]
    if { $orig_uri starts_with "/md/" } {
        set is_ai_agent 1
        set new_uri [string range $orig_uri 3 end]
        if { $new_uri eq "" } { set new_uri "/" }
        HTTP::uri $new_uri
    }
```

Visit /md/ in your browser and you get the markdown version of the upstream site. This is great for showing the capability to someone without having to modify your User-Agent string or install curl.

## Preventing Compressed Responses

This one bit me during testing. And if you can believe it, Kunal Anand is the one who gave me the tip that led to the resolution. If the origin returns gzip-compressed HTML, `HTTP::payload` gives you binary garbage. The fix:

```tcl
    if { $is_ai_agent } {
        HTTP::header replace "Accept-Encoding" "identity"
    }
```

We just need to strip the Accept-Encoding header on the request side so the origin sends us uncompressed HTML.
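To make the compression pitfall concrete, here's a small standalone Python sketch (not part of the iRule; the page content is made up) showing what a regex pipeline sees when the origin gzips the body versus when it sends it plain:

```python
import gzip
import re

# Hypothetical origin response body
html = "<html><body><h1>Listings</h1><p>Unit A: $500k</p></body></html>"

# What the payload looks like if the origin honors gzip:
compressed = gzip.compress(html.encode("utf-8"), mtime=0)

# To a regex-based converter the compressed stream really is binary
# garbage: the markup patterns never match it.
assert compressed[:2] == b"\x1f\x8b"                       # gzip magic bytes
assert re.search(rb"<h1[^>]*>Listings", compressed) is None

# With Accept-Encoding: identity the body arrives as plain HTML,
# and the same pattern matches as expected.
assert re.search(rb"<h1[^>]*>Listings", html.encode("utf-8")) is not None

print("gzip body is opaque to regex; identity body is not")
```

This is exactly why both the request-side header override and the response-side safety net below exist.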
And I added a safety net in HTTP_RESPONSE:

```tcl
when HTTP_RESPONSE {
    if { $is_ai_agent } {
        if { [HTTP::header "Content-Type"] contains "text/html" } {
            set ce [HTTP::header "Content-Encoding"]
            if { $ce ne "" } {
                if { $ce ne "identity" } {
                    set is_ai_agent 0
                    HTTP::header insert "X-Markdown-Skipped" "compressed-response"
                    return
                }
            }
        }
    }
}
```

If the upstream ignores our Accept-Encoding override and sends gzip anyway, we bail gracefully instead of serving corrupted content. Defense in depth!

## The Conversion: Where the Magic Happens

This is HTTP_RESPONSE_DATA: the body has been collected and we have the raw HTML. Now we convert it to markdown through a series of regex passes.

### Phase 1: The Multiline Problem

Tcl's `.` in regex doesn't match newlines here. Every `<script>`, `<style>`, and `<nav>` block in real HTML spans multiple lines. So this won't work:

```tcl
# This silently fails on multiline <script> blocks!
regsub -all -nocase {<script[^>]*>.*?</script>} $html_body "" html_body
```

The fix, again another hint from Kunal: collapse all newlines to a sentinel character before stripping block elements, then restore them after:

```tcl
set NL_MARK "\x01"
set html_body [string map [list "\r\n" $NL_MARK "\r" $NL_MARK "\n" $NL_MARK] $html_body]

# NOW these work, everything is one "line"
regsub -all -nocase "<script\[^>\]*>.*?</script>" $html_body "" html_body
regsub -all -nocase "<style\[^>\]*>.*?</style>" $html_body "" html_body
regsub -all -nocase "<nav\[^>\]*>.*?</nav>" $html_body "" html_body
# ... strip footer, header, noscript, svg, comments, forms, cookie banners

# Restore newlines
set html_body [string map [list $NL_MARK "\n"] $html_body]
```

This is the single biggest quality improvement. Without it, you get raw JavaScript and CSS bleeding into your markdown output.

### Phase 2: Converting Structure

With the junk stripped and newlines restored, we convert HTML elements to markdown syntax. Here's the key insight that took a few iterations: use `[^<]*` instead of `.*?` for tag content.

```tcl
# BAD: .*? crosses newlines in Tcl and matches across multiple tags
regsub -all -nocase {<a[^>]*href="(/[^"]*)"[^>]*>(.*?)</a>} ...

# GOOD: [^<]* stops at the next tag boundary
regsub -all -nocase {<a[^>]*href="(/[^"]*)"[^>]*>([^<]*)</a>} ...
```

This matters when you have two `<a>` tags on adjacent lines. The `.*?` version matches from the first `<a>` opening all the way to the second `</a>` closing: one giant broken link. The `[^<]*` version correctly matches each link individually.

Here's the conversion order (it matters):

```tcl
# 1. Headings
regsub -all -nocase {<h2[^>]*>([^<]*)</h2>} $html_body "\n## \\1\n\n" html_body

# 2. Emphasis BEFORE links (so **bold** inside links works)
regsub -all -nocase {<strong[^>]*>([^<]*)</strong>} $html_body {**\1**} html_body

# 3. Links with relative URL resolution
regsub -all -nocase {<a[^>]*href="(/[^"]*)"[^>]*>([^<]*)</a>} \
    $html_body "\[\\2\](https://${http_request_host}\\1)" html_body

# 4. Tables, code, lists, paragraphs, blockquotes, images...

# 5. Strip ALL remaining tags
regsub -all {<[^>]+>} $html_body "" html_body

# 6. Decode HTML entities
regsub -all {&ldquo;} $html_body {"} html_body
regsub -all {&rsquo;} $html_body {'} html_body
# ... 20+ entity decodings
```

Emphasis before links is important. If you have `<a href="/pricing"><strong>$149,900</strong></a>`, converting emphasis first gives you `<a href="/pricing">**$149,900**</a>`, which then converts to `[**$149,900**](/pricing)`. Do it the other way, and the bold markers end up orphaned.

## URL Resolution

AI agents need absolute URLs. A relative link like /properties is useless to a bot that doesn't know what host it's talking to. We capture `$http_request_host` in HTTP_REQUEST and use it during link conversion:

```tcl
# Relative to absolute
regsub -all -nocase {<a[^>]*href="(/[^"]*)"[^>]*>([^<]*)</a>} \
    $html_body "\[\\2\](https://${http_request_host}\\1)" html_body

# Absolute stays absolute
regsub -all -nocase {<a[^>]*href="(https?://[^"]*)"[^>]*>([^<]*)</a>} \
    $html_body "\[\\2\](\\1)" html_body
```

Same treatment for images.
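Both regex tricks above can be sanity-checked outside TMM. Here's a toy Python model (hypothetical HTML and hostname; Python's regex flavor differs from Tcl's, so a greedy dot is used to reproduce the cross-tag failure described above):

```python
import re

# Hypothetical page: a multiline <style> block plus two links on
# adjacent lines -- exactly the cases discussed above.
html = """<html><head><style>
body { color: red; }
</style></head><body>
<h2>Listings</h2>
<p><a href="/unit-a">Unit A</a></p>
<p><a href="/unit-b">Unit B</a></p>
</body></html>"""

# Trick 1: collapse newlines to a sentinel, strip block elements as
# single-"line" spans, then restore the newlines.
NL = "\x01"
flat = html.replace("\r\n", NL).replace("\r", NL).replace("\n", NL)
flat = re.sub(r"<style[^>]*>.*?</style>", "", flat, flags=re.I)
body = flat.replace(NL, "\n")
assert "color: red" not in body          # CSS no longer bleeds through

# Trick 2: a dot-based quantifier crosses tag boundaries and swallows
# both links as one giant broken match...
bad = re.findall(r'<a[^>]*href="(/[^"]*)"[^>]*>(.*)</a>', body, flags=re.S)
print(len(bad))                          # 1: one match spans both <a> tags

# ...while [^<]* stops at the next tag boundary, so each link
# converts to clean markdown individually.
md = re.sub(r'<a[^>]*href="(/[^"]*)"[^>]*>([^<]*)</a>',
            r"[\2](https://www.example.com\1)", body)
print("[Unit A](https://www.example.com/unit-a)" in md)   # True
```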
## Dynamic Table Separators

(Yet another place Kunal offered some tips.) This one is tricky to solve because of common HTML table structure conventions. Markdown tables need a separator row between the header and body:

```
| Name | Price | Status |
|------|-------|--------|
| Unit A | $500k | Available |
```

The separator needs the right number of columns. We count `<th>` tags in the `<thead>` and build it dynamically (but what if there is no thead? I try to account for that, too):

```tcl
set col_count 0
set thead_check $html_body
if { [regsub -nocase {<thead[^>]*>(.*?)</thead>} $thead_check "\\1" first_thead] } {
    set col_count [regsub -all -nocase {<th[^>]*>} $first_thead "" _discard]
}
if { $col_count > 0 } {
    set sep "\n|"
    for { set c 0 } { $c < $col_count } { incr c } {
        append sep "---|"
    }
    append sep "\n"
}
```

If we can't count the columns in the thead, we default to 6 columns, which could still use some work. But at least we no longer have a hardcoded 2-column separator breaking our 5-column tables.

## Performance Considerations

This iRule runs in TMM. Every CPU cycle it uses is a cycle not spent processing other connections. So I built in several guardrails (these could be better still):

Size limit: pages over 512KB skip conversion entirely. The regex chain gets expensive on large documents, and the output quality degrades anyway.

```tcl
if { $content_length > 524288 } {
    set is_ai_agent 0
    HTTP::header insert "X-Markdown-Skipped" "body-too-large"
}
```

Targeted Accept-Encoding: by stripping Accept-Encoding only for AI agent requests, normal browser traffic still gets compressed responses. No performance impact on human users.

Logging: every conversion logs the byte reduction to /var/log/ltm so you can monitor the overhead:

```
markdown: converted 15526 bytes -> 4200 bytes (73% reduction)
```

## What This Doesn't Do (And I Think That's OK)

I will be honest about the limitations:

No DOM parsing. This is regex-based conversion. Complex nested structures (a `<strong>` that wraps three `<div>`s) won't convert perfectly.
You need a real DOM parser for that, and iRules doesn't have one. I avoided using iRulesLX for this project entirely.

Multiline tags within content blocks. The newline collapse trick handles `<script>` and `<style>`, but a `<p>` tag with inline markup that spans lines will only partially match. The `[^<]*` pattern helps, but it can't capture text that contains child tags.

Tables without `<thead>`. It detects column count from `<th>` tags. Tables that use plain `<tr><td>` with no header get a fallback separator.

For 80% of web pages, the output is surprisingly good. For the other 20%, consider iRulesLX (a Node.js sidecar with a real DOM parser) or a sideband approach with compiled-language HTML parsing.

## The Complete iRule

Here it is. Attach it to your virtual server and you're done:

```tcl
when HTTP_REQUEST {
    set is_ai_agent 0
    set ua [string tolower [HTTP::header "User-Agent"]]

    if { $ua contains "gptbot" || $ua contains "chatgpt-user" ||
         $ua contains "claudebot" || $ua contains "claude-web" ||
         $ua contains "perplexitybot" || $ua contains "cohere-ai" ||
         $ua contains "google-extended" || $ua contains "applebot-extended" ||
         $ua contains "bytespider" || $ua contains "ccbot" ||
         $ua contains "amazonbot" } {
        set is_ai_agent 1
    }

    if { [HTTP::header "X-Request-Format"] eq "markdown" } {
        set is_ai_agent 1
    }
    if { [HTTP::header "Accept"] contains "text/markdown" } {
        set is_ai_agent 1
    }

    set orig_uri [HTTP::uri]
    if { $orig_uri starts_with "/md/" } {
        set is_ai_agent 1
        set new_uri [string range $orig_uri 3 end]
        if { $new_uri eq "" } { set new_uri "/" }
        HTTP::uri $new_uri
    } elseif { $orig_uri eq "/md" } {
        set is_ai_agent 1
        HTTP::uri "/"
    }

    set http_request_host [HTTP::host]

    if { $is_ai_agent } {
        HTTP::header replace "Accept-Encoding" "identity"
    }
}

when HTTP_RESPONSE {
    if { $is_ai_agent } {
        if { [HTTP::header "Content-Type"] contains "text/html" } {
            set ce [HTTP::header "Content-Encoding"]
            if { $ce ne "" } {
                if { $ce ne "identity" } {
                    set is_ai_agent 0
                    HTTP::header insert "X-Markdown-Skipped" "compressed-response"
                    return
                }
            }
            set content_length [HTTP::header "Content-Length"]
            set do_collect 1
            if { $content_length ne "" } {
                if { $content_length > 524288 } {
                    set is_ai_agent 0
                    set do_collect 0
                    HTTP::header insert "X-Markdown-Skipped" "body-too-large"
                }
            }
            if { $do_collect } {
                if { $content_length ne "" } {
                    if { $content_length > 0 } {
                        HTTP::collect $content_length
                    }
                } else {
                    HTTP::collect 524288
                }
            }
        }
    }
}

when HTTP_RESPONSE_DATA {
    if { $is_ai_agent } {
        set html_body [HTTP::payload]
        set orig_size [string length $html_body]

        # Phase 1: Collapse newlines for multiline tag stripping
        set NL_MARK "\x01"
        set html_body [string map [list "\r\n" $NL_MARK "\r" $NL_MARK "\n" $NL_MARK] $html_body]

        regsub -all -nocase "<script\[^>\]*>.*?</script>" $html_body "" html_body
        regsub -all -nocase "<style\[^>\]*>.*?</style>" $html_body "" html_body
        regsub -all -nocase "<nav\[^>\]*>.*?</nav>" $html_body "" html_body
        regsub -all -nocase "<footer\[^>\]*>.*?</footer>" $html_body "" html_body
        regsub -all -nocase "<header\[^>\]*>.*?</header>" $html_body "" html_body
        regsub -all -nocase "<noscript\[^>\]*>.*?</noscript>" $html_body "" html_body
        regsub -all -nocase "<svg\[^>\]*>.*?</svg>" $html_body "" html_body
        regsub -all "<!--.*?-->" $html_body "" html_body
        regsub -all -nocase "<form\[^>\]*>.*?</form>" $html_body "" html_body

        # Phase 2: Restore newlines, convert structure
        set html_body [string map [list $NL_MARK "\n"] $html_body]

        regsub -all -nocase {<h1[^>]*>([^<]*)</h1>} $html_body "# \\1\n\n" html_body
        regsub -all -nocase {<h2[^>]*>([^<]*)</h2>} $html_body "\n## \\1\n\n" html_body
        regsub -all -nocase {<h3[^>]*>([^<]*)</h3>} $html_body "\n### \\1\n\n" html_body
        regsub -all -nocase {<h4[^>]*>([^<]*)</h4>} $html_body "\n#### \\1\n\n" html_body
        regsub -all -nocase {<strong[^>]*>([^<]*)</strong>} $html_body {**\1**} html_body
        regsub -all -nocase {<b[^>]*>([^<]*)</b>} $html_body {**\1**} html_body
        regsub -all -nocase {<em>([^<]*)</em>} $html_body {*\1*} html_body
        regsub -all -nocase {<i>([^<]*)</i>} $html_body {*\1*} html_body
        regsub -all -nocase {<a[^>]*href="(/[^"]*)"[^>]*>([^<]*)</a>} $html_body "\[\\2\](https://${http_request_host}\\1)" html_body
        regsub -all -nocase {<a[^>]*href="(https?://[^"]*)"[^>]*>([^<]*)</a>} $html_body "\[\\2\](\\1)" html_body
        regsub -all -nocase {<a[^>]*>([^<]*)</a>} $html_body {\1} html_body
        regsub -all -nocase {<th[^>]*>([^<]*)</th>} $html_body "| \\1 " html_body
        regsub -all -nocase {<td[^>]*>([^<]*)</td>} $html_body "| \\1 " html_body
        regsub -all -nocase {</tr>} $html_body "|\n" html_body
        regsub -all -nocase {<code>([^<]*)</code>} $html_body {`\1`} html_body
        regsub -all -nocase {<li[^>]*>([^<]*)</li>} $html_body "- \\1\n" html_body
        regsub -all -nocase {</?[uo]l[^>]*>} $html_body "\n" html_body
        regsub -all -nocase {<p[^>]*>([^<]*)</p>} $html_body "\\1\n\n" html_body
        regsub -all -nocase {<br\s*/?>} $html_body "\n" html_body
        regsub -all -nocase {<hr\s*/?>} $html_body "\n---\n\n" html_body
        regsub -all -nocase {<blockquote[^>]*>} $html_body "> " html_body
        regsub -all -nocase {</blockquote>} $html_body "\n\n" html_body
        regsub -all -nocase {<cite>([^<]*)</cite>} $html_body "-- *\\1*\n" html_body
        regsub -all {<[^>]+>} $html_body "" html_body

        regsub -all {&amp;} $html_body {\&} html_body
        regsub -all {&lt;} $html_body {<} html_body
        regsub -all {&gt;} $html_body {>} html_body
        regsub -all {&quot;} $html_body {"} html_body
        regsub -all {&nbsp;} $html_body { } html_body
        regsub -all {&ldquo;} $html_body {"} html_body
        regsub -all {&rdquo;} $html_body {"} html_body
        regsub -all {&lsquo;} $html_body {'} html_body
        regsub -all {&rsquo;} $html_body {'} html_body
        regsub -all {&mdash;} $html_body {--} html_body
        regsub -all {&ndash;} $html_body {-} html_body
        regsub -all {&hellip;} $html_body {...} html_body
        regsub -all {&#[0-9]+;} $html_body {} html_body

        regsub -all {\n +} $html_body "\n" html_body
        regsub -all {\n{3,}} $html_body "\n\n" html_body
        regsub -all {([^\n])\n\n([^#\n\[>*-])} $html_body "\\1\n\\2" html_body

        set html_body [string trim $html_body]

        HTTP::payload replace 0 [HTTP::payload length] $html_body
        HTTP::header replace "Content-Type" "text/plain; charset=utf-8"
        HTTP::header replace "Content-Length" [string length $html_body]
        HTTP::header insert "X-Markdown-Source" "bigip-irule"
    }
}
```

## Testing / Demoing It

```shell
# Normal browser request, HTML as usual
curl https://your-site.example.com/

# AI agent simulation
curl -H "User-Agent: GPTBot/1.0" https://your-site.example.com/

# Explicit markdown request
curl -H "X-Request-Format: markdown" https://your-site.example.com/

# Browser-friendly demo (or just visit it in your browser)
curl https://your-site.example.com/md/
```

## What's Next

This is a solid starting point for making your existing sites AI-agent ready without touching application code. A few directions to take it:

- Agent discovery files: serve /llms.txt and /.well-known/ai-plugin.json so agents can programmatically discover your markdown capability
- iRulesLX upgrade path: when regex-based conversion isn't enough, move the HTML parsing to a Node.js sidecar with a real DOM parser (cheerio, jsdom). Same detection logic, better conversion quality.

The AI agent wave isn't coming. It's here. Your BIG-IP already sees every request. Might as well make those responses useful.

## Disclaimer!

The iRule in this article was developed as part of a proof-of-concept for edge-layer HTML-to-Markdown conversion. It's been tested on BIG-IP 17.5.1+. Your mileage may vary on complex single-page applications, but for content-heavy sites, it works remarkably well for something that's "just regex."

---

# Just Announced! Attend a lab and receive a Raspberry Pi
## Have a Slice of AI from a Raspberry Pi

Services such as ChatGPT have made accessing generative AI as simple as visiting a web page. Whether at work or at home, there are advantages to channeling your user base (or family, in the case of home) through a central point where you can apply safeguards to their usage.

In this lab, you will learn how to:

- Deliver centralized AI access through something as basic as a Raspberry Pi
- Learn basic methods for safeguarding AI
- Learn how users might circumvent basic safeguards
- Learn how to deploy additional services from F5 to enforce broader enterprise policies

Register Here

This lab takes place in an F5 virtual lab environment. Participants who complete the lab will receive a Raspberry Pi* to build the solution in their own environment.

*Limited stock. Raspberry Pi is exclusive to this lab. To qualify, complete the lab and join a follow-up call with F5.

---

# Hey DeepSeek, can you write iRules?
## Back in time...

Two years ago I asked ChatGPT whether it could write iRules. My conclusion after giving several tasks to ChatGPT was that it can help with simple tasks, but it cannot write intermediate or complex iRules.

## A new AI enters the competition

Two weeks ago DeepSeek entered the scene, and I thought it was a good idea to ask it about its capabilities to write iRules. Spoiler alert: it cannot.

## New AI, same challenges

I asked DeepSeek the same questions I asked ChatGPT two years ago:

1. Write me an iRule that redirects HTTP to HTTPS.
2. Can you write an iRule that rewrites the host header in HTTP request and response?
3. Can you write an iRule that will make a loadbalancing decision based on the HTTP Host header?
4. Can you write an iRule that will make a loadbalancing decision based on the HTTP URI?
5. Write me an iRule that shows different ASM blocking pages based on the host header. The response should include the support ID.

I stopped asking DeepSeek after the fifth question; DeepSeek is clueless about iRules. The answer I got from DeepSeek to 1, 2, 4 and 5 was always the same:

```tcl
when HTTP_REQUEST {
    # Check if the request is coming to port 80 (HTTP)
    if { [TCP::local_port] equals 80 } {
        # Construct the HTTPS URL
        set host [HTTP::host]
        set uri [HTTP::uri]
        set redirect_url "https://${host}${uri}"

        # Perform the redirect
        HTTP::redirect $redirect_url
    }
}
```

While this is a solution to task 1, it is plain wrong for 2, 4 and 5. And even for the first challenge this is not a good one. Actually, it hurts me reading this iRule... Its answer to task 2, for example, was just wrong. And DeepSeek's answer to task 3 (shown as a screenshot in the original post) was no better.

## ChatGPT in 2025

For completeness, I gave the same tasks from 2023 to ChatGPT again. Briefly said: ChatGPT was OK at solving tasks 1-4 in 2023 and still is. It improved its solution for task 5, the ASM iRule challenge. In 2023 I had two more tasks related to rewriting and redirecting; ChatGPT still failed to provide a solid solution for those two tasks.
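For reference, here's roughly what task 2 actually asks for. This is a sketch only, with hypothetical hostnames, and it treats "rewrite in the response" as fixing redirect Location headers, since HTTP responses carry no Host header:

```tcl
when HTTP_REQUEST {
    # Rewrite the Host header toward the backend (hostname is hypothetical)
    HTTP::header replace "Host" "backend.internal.example.com"
}

when HTTP_RESPONSE {
    # On the way back, fix redirects that would leak the internal hostname
    if { [HTTP::header exists "Location"] } {
        HTTP::header replace "Location" [string map \
            {backend.internal.example.com www.example.com} \
            [HTTP::header "Location"]]
    }
}
```

This only runs inside TMM on a BIG-IP, and a production version would also need to handle cookies and response bodies that embed the hostname.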
## Conclusion

DeepSeek cannot write iRules, and ChatGPT still isn't good at it. Write your own iRules, or ask the friendly people here on DevCentral to help you.

---

# Securing Generative AI: Defending the Future of Innovation and Creativity
Protect your organization's generative AI investments by mitigating security risks effectively. This comprehensive guide examines the assets of AI systems, analyzes potential threats, and offers actionable recommendations to strengthen security and maintain the integrity of your AI-powered applications.

---

# F5 Distributed Cloud WAF AI/ML Model to Suppress False Positives
## Introduction

The Web Application Firewall (WAF) has evolved to protect web applications from attack. A signature-based WAF responds to threats through application-specific detection rules that block malicious traffic. These managed rules work extremely well for patterns of established attack vectors, as they have been extensively tested to minimize both false negatives and false positives.

Most web application development concentrates on delivering services seamlessly rather than integrating security against every recent attack. Some applications have logic or operations that look suspicious but are simply how the application was built to behave. Under these circumstances the WAF considers requests to these areas an attack, which they truly are not, and the respective attack signature is invoked; this is called a false positive. Though the requests are legitimate, the WAF blocks them. Updating the signature rule set by hand to compensate is tedious and requires significant effort. AI/ML helps solve this problem so that real user requests are not blocked by the WAF.

This article covers configuring the WAF with automatic attack signature tuning to suppress false positives using the AI/ML model.

## A More Intelligent Solution

The F5 Distributed Cloud (F5 XC) AI/ML model uses a self-learning probabilistic machine learning model that suppresses false positives triggered by the signature engine. It identifies false positives and acts as an additional layer of intelligence that automatically suppresses them, without human intervention. The model determines the probability that a triggered signature is evidence of an attack, or just an error or a change in how users interact with the application.

This model is trained on vast amounts of benign and attack traffic from real customer logs. It does not rely on human involvement to understand operational patterns and user interactions with the web application, which saves a lot of effort.

## Step-by-step procedure to enable attack signature tuning to suppress false positives

These are the steps to enable attack signatures and their accuracy:

1. Create a firewall with automatic attack signature tuning enabled
2. Assign the firewall to a load balancer

### Step 1: Create an App Firewall

- Navigate to F5 XC Console Home > Load Balancers > Security > App Firewall and click Add App Firewall.
- Enter a valid name for the firewall and navigate to Detection Settings.
- Within Detection Settings, set Security Policy to "Custom" and set Automatic Attack Signatures Tuning to "Enable".
- Set Signature Selection by Accuracy to "High and Medium" from the dropdown.
- Scroll to the bottom and click "Save and Exit".

### Step 2: Assign the Firewall to the Load Balancer

- From the F5 XC Console homepage, navigate to Load Balancers > Manage > Load Balancers > HTTP Load Balancers.
- Select the load balancer to which the firewall created above should be assigned.
- Click the menu in the Actions column of the load balancer and click Manage Configuration to display the load balancer config, then click the Edit Configuration button at the top right of the page.
- Navigate to the Security Configuration settings and choose Enable in the Web Application Firewall (WAF) dropdown.
- Assign the firewall created in Step 1 by selecting its name from the dropdown.
- Scroll to the bottom and click "Save and Exit". The firewall is now assigned to the load balancer.

### Step 3: Verify the auto-suppressed signatures for false positives

- From the F5 XC Console homepage, navigate to Web App and API Protection > Apps & APIs > Security and select the load balancer.
- Select Security Events and click Add filter.
- Enter the keyword Signatures.states and select Auto Suppressed.
- The displayed logs show the signatures that were auto-suppressed by the AI/ML model.

"Nature is a mutable cloud, which is always and never the same." - Ralph Waldo Emerson

We might not wax that philosophically around here, but our heads are in the cloud nonetheless! Join the F5 Distributed Cloud user group today and learn more with your peers and other F5 experts.

## Conclusion

With an additional layer of intelligence on top of the signature engine, F5 XC's AI/ML model can automatically suppress false positives without human intervention. Customers can be less concerned about application behavior that merely looks suspicious, and legitimate requests are not blocked. Decisions are based on an enormous amount of real data fed to the system to understand application and user behavior, which makes the model more intelligent.

---

# Introducing F5 AI Red Team
F5 AI Red Team simulates adversarial attacks such as prompt injection and jailbreaks at unprecedented speed and scale, allowing continuous assessment throughout the application lifecycle. It provides insights into threats and integrates with F5 AI Guardrails to convert those insights into security policies.
---

# Secure, Deliver and Optimize Your Modern Generative AI Apps with F5
In this demo, Foo-Bang Chan explores how F5's solutions can help you implement, secure, and optimize your chatbots and other AI applications, ensuring they perform at their best while protecting sensitive data. One of the AI frameworks shown is Enterprise Retrieval-Augmented Generation (RAG). The demo leverages F5 Distributed Cloud (XC) AppStack, Distributed Cloud WAAP, NGINX Plus as API Gateway, API-Discovery, API-Protection, LangChain, vector databases, and Flowise AI.

---

# SSL Orchestrator Advanced Use Cases: Detecting Generative AI
## Introduction

Quick, take a look at the following list and answer this question: "What do these movies have in common?"

- 2001: A Space Odyssey
- Westworld
- Tron
- WarGames
- Electric Dreams
- The Terminator
- The Matrix
- Eagle Eye
- Ex Machina
- Avengers: Age of Ultron
- M3GAN

If you answered, "They're all about artificial intelligence", yes, but... If you answered, "They're all about artificial intelligence that went terribly, sometimes horribly wrong", you'd be absolutely correct.

The simple fact is, artificial intelligence (AI) can be scary. Proponents and opponents will disagree on many aspects, but they can all at least acknowledge there's a handful of ways to do AI correctly... and a million ways to do it badly. Not to be an alarmist, but while SkyNet was fictional, semi-autonomous guns on robot dogs are not...

But then why am I talking about this on a technical forum, you may ask? Well, when most of the above films were made, AI was largely still science fiction. That's clearly not the case anymore, and tools like ChatGPT are just the tip of the coming AI frontier. To be fair, I don't make the claim that all AI is bad, and many have indeed lauded ChatGPT and other generative AI tools as the next great evolution in technology. But it's also fair to say that generative AI tools, like ChatGPT, have a very real potential to cause harm. At the very least, these tools can be convincing, even when they're wrong. And worse, they could lead to sensitive information disclosures. One only has to do a cursory search to find a few examples of questionable behavior:

- Lawyers File Motion Written by AI, Face Sanctions and Possible Disbarment
- Higher Ed Beware: 10 Dangers of ChatGPT Schools Need to Know
- ChatGPT and AI in the Workplace: Should Employers Be Concerned?
- OpenAI's New Chatbot Will Tell You How to Shoplift and Make Explosives
- Giant Bank JP Morgan Bans ChatGPT Use Among Employees
- Samsung Bans ChatGPT Among Employees After Sensitive Code Leak

But again... what does this have to do with a technical forum? And more important, what does this have to do with you? Simply stated, if you are in an organization where generative AI tools could be abused, understanding, and optionally controlling, how and when these tools are accessed could help prevent the next big exploit or disclosure.

If you search beyond the above links, you'll find an abundance of information on both the benefits and the security concerns of AI technologies. And ultimately you'll still be left to decide if these AI tools are safe for your organization. It may simply be worthwhile to understand WHAT tools are being used, and in some cases, it may be important to disable access to them.

Given the general depth and diversity of AI functions within arm's reach today, and growing, it'd be irresponsible to claim "complete awareness". The bulk of these functions are delivered over standard HTTPS, so the best course of action will be to categorize on known assets and adjust as new ones come along. As of the publishing of this article, the industry has yet to define a standard set of categories for AI, and specifically, generative AI. So in this article, we're going to build one and attach it to F5 BIG-IP SSL Orchestrator to enable proactive detection and optional control of Internet-based AI tool access in your organization. Let's get started!

## BIG-IP SSL Orchestrator Use Case: Detecting Generative AI

The real beauty of this solution is that it can be implemented faster than it probably took to read the above introduction. Essentially, you're going to create a custom URL category on F5 BIG-IP, populate that with known generative AI URLs, and employ that custom category in a BIG-IP SSL Orchestrator security policy rule.
Within that policy rule, you can elect to dynamically decrypt and send the traffic to the set of inspection products in your security enclave.

**Step 1: Create the custom URL category and populate with known AI URLs.** Access the BIG-IP command shell and run the following command. This will initiate a script that creates and populates the URL category:

```shell
curl -s https://raw.githubusercontent.com/f5devcentral/sslo-script-tools/main/sslo-generative-ai-categories/sslo-create-ai-category.sh | bash
```

**Step 2: Create a BIG-IP SSL Orchestrator policy rule to use this data.** The above script creates (or re-populates) a custom URL category named SSLO_GENERATIVE_AI_CHAT, containing a set of known generative AI URLs. To use it, navigate to the BIG-IP SSL Orchestrator UI and edit a security policy. Click Add to create a new rule, use the "Category Lookup (All)" policy condition, then add the above URL category. Set the Action to "Allow", the SSL Proxy Action to "Intercept", and the Service Chain to whatever service chain you've already created.

With Summary Logging enabled in the BIG-IP SSL Orchestrator topology configuration, you'll also get Syslog reporting for each AI resource match: who made the request, to what, and when.

The URL category is employed here to identify known AI tools. In this instance, BIG-IP SSL Orchestrator is used to make that assessment and act on it (i.e., allow, TLS intercept, service chain, log). Should you want even more granular control over the conditions and actions applied to the decrypted AI tool traffic, you can also deploy an F5 Secure Web Gateway Services policy inside the SSL Orchestrator service chain. With SWG, you can expand beyond simple detection and blocking, and build more complex rules to decide who can access, when, and how.
It should be said that beyond logging, allowing, or denying access to generative AI tools, SSL Orchestrator also provides decryption and the opportunity to dynamically steer the decrypted AI traffic to any set of security products best suited to protect against potential malware.

## Summary

As previously alluded to, this is not an exhaustive list of AI tool URLs; not even close. But it contains the most common ones you'll see in the wild. The script above populates the category with an initial list of URLs that you are free to update as you become aware of new ones. And of course, we invite you to recommend additional AI tools to add to this list.

References:

- https://github.com/f5devcentral/sslo-script-tools/tree/main/sslo-generative-ai-categories