One Quick Step to Make your website AI-Agent/MCP Ready with an iRule
The Problem Nobody Warned You About

Here's the thing about the AI agent explosion: GPTBot, ClaudeBot, PerplexityBot, and a dozen other crawlers are hitting your web applications today. And they're getting back the same bloated HTML that your browser gets, complete with navigation bars, cookie banners, SVG icons, inline JavaScript, and CSS that means absolutely nothing to an LLM, other than a hit to your token usage.

These agents don't need your <nav> with 47 links. They don't need your cookie consent modal. They definitely don't need 200+ lines of minified CSS and JS. They need the content: the headings, the paragraphs, the links, the data. If you, or anyone, are using an agent to access and utilize the data on the page, it's burning through a massive number of tokens, generally ~2k per GET.

But what if your BIG-IP could intercept these requests, see that the client is an AI agent, and transform that HTML response into clean markdown before it ever leaves your network? By the way, there is plenty of room for improvement here, and a small disclaimer at the end!

The Approach

The iRule works in three phases across three HTTP events. Here's the flow:

1. Client request => HTTP_REQUEST (detect agent, strip Accept-Encoding)
2. Origin response => HTTP_RESPONSE (check HTML, collect body)
3. Body received => HTTP_RESPONSE_DATA (convert HTML => Markdown, replace body)

The client then receives clean markdown with Content-Type: text/plain.

Detection: Who's an AI Agent?

This example is set up to detect agents three ways, because different agents announce themselves differently, and we want to give humans a way to trigger it too (mostly I used this for testing; notes on that later).
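Before diving into the Tcl, the three detection checks are easy to prototype off-box. Here's a minimal Python model of the same logic (for experimentation only; the user-agent token list mirrors the iRule below, and nothing here runs on the BIG-IP itself):

```python
# Off-box model of the iRule's three detection methods.
AI_UA_TOKENS = (
    "gptbot", "chatgpt-user", "claudebot", "claude-web", "perplexitybot",
    "cohere-ai", "google-extended", "applebot-extended", "bytespider",
    "ccbot", "amazonbot",
)

def is_ai_agent(headers: dict) -> bool:
    ua = headers.get("User-Agent", "").lower()
    if any(token in ua for token in AI_UA_TOKENS):
        return True  # method 1: known crawler User-Agent
    if headers.get("X-Request-Format") == "markdown":
        return True  # method 2: explicit opt-in header
    if "text/markdown" in headers.get("Accept", ""):
        return True  # method 3: HTTP content negotiation
    return False
```

Feeding it a captured request's headers is a quick way to confirm which of the three methods would fire for a given client.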
    when HTTP_REQUEST {
        set is_ai_agent 0
        set ua [string tolower [HTTP::header "User-Agent"]]

        # The usual suspects
        if { $ua contains "gptbot" || $ua contains "chatgpt-user" ||
             $ua contains "claudebot" || $ua contains "claude-web" ||
             $ua contains "perplexitybot" || $ua contains "cohere-ai" ||
             $ua contains "google-extended" || $ua contains "applebot-extended" ||
             $ua contains "bytespider" || $ua contains "ccbot" ||
             $ua contains "amazonbot" } {
            set is_ai_agent 1
        }

        # Explicit opt-in via header
        if { [HTTP::header "X-Request-Format"] eq "markdown" } {
            set is_ai_agent 1
        }

        # Content negotiation (the standards-correct way)
        if { [HTTP::header "Accept"] contains "text/markdown" } {
            set is_ai_agent 1
        }

Why three methods? User-Agent detection handles the common crawlers automatically. The X-Request-Format header lets any client explicitly request markdown. And Accept: text/markdown is proper HTTP content negotiation, the way it should work once the ecosystem matures.

The Demo Path: /md/ Prefix

I added one more trigger that's purely for demos:

    set orig_uri [HTTP::uri]
    if { $orig_uri starts_with "/md/" } {
        set is_ai_agent 1
        set new_uri [string range $orig_uri 3 end]
        if { $new_uri eq "" } { set new_uri "/" }
        HTTP::uri $new_uri
    }

Visit /md/ in your browser and you get the markdown version of the upstream site. This is great for showing the capability to someone without having to modify your User-Agent string or install curl.

Preventing Compressed Responses

This one bit me during testing, and believe it or not, Kunal Anand is the one who gave me the tip that led to the resolution. If the origin returns gzip-compressed HTML, HTTP::payload gives you binary garbage. The fix:

    if { $is_ai_agent } {
        HTTP::header replace "Accept-Encoding" "identity"
    }

We just need to strip the Accept-Encoding header on the request side so the origin sends us uncompressed HTML.
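If the `[string range $orig_uri 3 end]` arithmetic in the demo-path rewrite looks suspicious, it can be sanity-checked off-box. Here's a Python model of the same rewrite (illustrative only, not the iRule itself):

```python
def rewrite_md_prefix(uri: str) -> str:
    """Model of the iRule's /md/ demo-path handling."""
    if uri.startswith("/md/"):
        # Tcl's [string range $uri 3 end] == Python's uri[3:]:
        # "/md/foo" -> "/foo", and "/md/" -> "/" (the 4th char survives).
        new_uri = uri[3:]
        return new_uri if new_uri else "/"
    if uri == "/md":  # the bare /md case, handled by an elseif in the full iRule
        return "/"
    return uri
```

The index 3 (not 4) is what preserves the leading slash on the rewritten URI.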
And I added a safety net in HTTP_RESPONSE:

    when HTTP_RESPONSE {
        if { $is_ai_agent } {
            if { [HTTP::header "Content-Type"] contains "text/html" } {
                set ce [HTTP::header "Content-Encoding"]
                if { $ce ne "" } {
                    if { $ce ne "identity" } {
                        set is_ai_agent 0
                        HTTP::header insert "X-Markdown-Skipped" "compressed-response"
                        return
                    }
                }
            }
        }
    }

If the upstream ignores our Accept-Encoding override and sends gzip anyway, we bail gracefully instead of serving corrupted content. Defense in depth!

The Conversion: Where the Magic Happens

This is HTTP_RESPONSE_DATA: the body has been collected and we have the raw HTML. Now we convert it to markdown through a series of regex passes.

Phase 1: The Multiline Problem

Tcl's . in regex doesn't match newlines. Every <script>, <style>, and <nav> block in real HTML spans multiple lines. So this won't work:

    # This silently fails on multiline <script> blocks!
    regsub -all -nocase {<script[^>]*>.*?</script>} $html_body "" html_body

The fix, again courtesy of another hint from Kunal: collapse all newlines to a sentinel character before stripping block elements, then restore them after:

    set NL_MARK "\x01"
    set html_body [string map [list "\r\n" $NL_MARK "\r" $NL_MARK "\n" $NL_MARK] $html_body]

    # NOW these work, everything is one "line"
    regsub -all -nocase "<script\[^>\]*>.*?</script>" $html_body "" html_body
    regsub -all -nocase "<style\[^>\]*>.*?</style>" $html_body "" html_body
    regsub -all -nocase "<nav\[^>\]*>.*?</nav>" $html_body "" html_body
    # ... strip footer, header, noscript, svg, comments, forms, cookie banners

    # Restore newlines
    set html_body [string map [list $NL_MARK "\n"] $html_body]

This is the single biggest quality improvement. Without it, you get raw JavaScript and CSS bleeding into your markdown output.

Phase 2: Converting Structure

With the junk stripped and newlines restored, we convert HTML elements to markdown syntax. Here's the key insight that took a few iterations: use [^<]* instead of .*? for tag content.

    # BAD: .*? crosses newlines in Tcl and matches across multiple tags
    regsub -all -nocase {<a[^>]*href="(/[^"]*)"[^>]*>(.*?)</a>} ...

    # GOOD: [^<]* stops at the next tag boundary
    regsub -all -nocase {<a[^>]*href="(/[^"]*)"[^>]*>([^<]*)</a>} ...

This matters when you have two <a> tags on adjacent lines. The .*? version matches from the first <a> opening all the way to the second </a> closing: one giant broken link. The [^<]* version correctly matches each link individually.

Here's the conversion order (it matters):

    # 1. Headings
    regsub -all -nocase {<h2[^>]*>([^<]*)</h2>} $html_body "\n## \\1\n\n" html_body

    # 2. Emphasis BEFORE links (so **bold** inside links works)
    regsub -all -nocase {<strong[^>]*>([^<]*)</strong>} $html_body {**\1**} html_body

    # 3. Links with relative URL resolution
    regsub -all -nocase {<a[^>]*href="(/[^"]*)"[^>]*>([^<]*)</a>} \
        $html_body "\[\\2\](https://${http_request_host}\\1)" html_body

    # 4. Tables, code, lists, paragraphs, blockquotes, images...

    # 5. Strip ALL remaining tags
    regsub -all {<[^>]+>} $html_body "" html_body

    # 6. Decode HTML entities
    regsub -all {&ldquo;} $html_body {"} html_body
    regsub -all {&rsquo;} $html_body {'} html_body
    # ... 20+ entity decodings

Emphasis before links is important. If you have <a href="/pricing"><strong>$149,900</strong></a>, converting emphasis first gives you <a href="/pricing">**$149,900**</a>, which then converts to [**$149,900**](/pricing). Do it the other way, and the bold markers end up orphaned.

URL Resolution

AI agents need absolute URLs. A relative link like /properties is useless to a bot that doesn't know what host it's talking to. We capture $http_request_host in HTTP_REQUEST and use it during link conversion:

    # Relative to absolute
    regsub -all -nocase {<a[^>]*href="(/[^"]*)"[^>]*>([^<]*)</a>} \
        $html_body "\[\\2\](https://${http_request_host}\\1)" html_body

    # Absolute stays absolute
    regsub -all -nocase {<a[^>]*href="(https?://[^"]*)"[^>]*>([^<]*)</a>} \
        $html_body "\[\\2\](\\1)" html_body

Same treatment for images.
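Here's a quick way to convince yourself that the [^<]* capture handles adjacent links correctly, using Python's re module as an off-box stand-in for Tcl regsub (the host name is a placeholder for the captured request host):

```python
import re

HOST = "www.example.com"  # placeholder for the captured $http_request_host

def links_to_markdown(html: str, host: str = HOST) -> str:
    # Relative links become absolute, using the captured request host.
    html = re.sub(
        r'<a[^>]*href="(/[^"]*)"[^>]*>([^<]*)</a>',
        lambda m: "[{}](https://{}{})".format(m.group(2), host, m.group(1)),
        html, flags=re.IGNORECASE)
    # Absolute links stay absolute.
    html = re.sub(
        r'<a[^>]*href="(https?://[^"]*)"[^>]*>([^<]*)</a>',
        r'[\2](\1)',
        html, flags=re.IGNORECASE)
    return html
```

Run it against two adjacent anchors and each one converts independently; swap the content capture back to .*? and you get one giant broken link spanning both.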
Dynamic Table Separators

(Yet another place Kunal offered some tips.) This one is tricky to solve because of common HTML table structure conventions. Markdown tables need a separator row between the header and body:

    | Name | Price | Status |
    |------|-------|--------|
    | Unit A | $500k | Available |

The separator needs the right number of columns. We count <th> tags in the <thead> and build it dynamically (but what if there is no thead? I try to account for that, too):

    set col_count 0
    set thead_check $html_body
    if { [regsub -nocase {<thead[^>]*>(.*?)</thead>} $thead_check "\\1" first_thead] } {
        set col_count [regsub -all -nocase {<th[^>]*>} $first_thead "" _discard]
    }
    if { $col_count > 0 } {
        set sep "\n|"
        for { set c 0 } { $c < $col_count } { incr c } {
            append sep "---|"
        }
        append sep "\n"
    }

If we can't count the columns in the thead, we default to 6 columns, which could still use some work. But at least we avoid a hardcoded 2-column separator breaking our 5-column tables.

Performance Considerations

This iRule runs in TMM. Every CPU cycle it uses is a cycle not spent processing other connections. So I built in several guardrails (which could still be improved):

Size limit: Pages over 512KB skip conversion entirely. The regex chain gets expensive on large documents, and the output quality degrades anyway.

    if { $content_length > 524288 } {
        set is_ai_agent 0
        HTTP::header insert "X-Markdown-Skipped" "body-too-large"
    }

Targeted Accept-Encoding: By stripping Accept-Encoding only for AI agent requests, normal browser traffic still gets compressed responses. No performance impact on human users.

Logging: Every conversion logs the byte reduction to /var/log/ltm so you can monitor the overhead:

    markdown: converted 15526 bytes -> 4200 bytes (73% reduction)

What This Doesn't Do (And I Think That's OK)

I will be honest about the limitations:

No DOM parsing. This is regex-based conversion. Complex nested structures (a <strong> that wraps three <div>s) won't convert perfectly.
You need a real DOM parser for that, and iRules doesn't have one. (I avoided using iRulesLX for this project entirely.)

Multiline tags within content blocks. The newline-collapse trick handles <script> and <style>, but a <p> tag with inline markup that spans lines will only partially match. The [^<]* pattern helps, but it can't capture text that contains child tags.

Tables without <thead>. The iRule detects column count from <th> tags. Tables that use plain <tr><td> with no header get a fallback separator.

For 80% of web pages, the output is surprisingly good. For the other 20%, consider iRulesLX (a Node.js sidecar with a real DOM parser) or a sideband approach with compiled-language HTML parsing.

The Complete iRule

Here it is; attach it to your virtual server and you're done:

    when HTTP_REQUEST {
        set is_ai_agent 0
        set ua [string tolower [HTTP::header "User-Agent"]]

        if { $ua contains "gptbot" || $ua contains "chatgpt-user" ||
             $ua contains "claudebot" || $ua contains "claude-web" ||
             $ua contains "perplexitybot" || $ua contains "cohere-ai" ||
             $ua contains "google-extended" || $ua contains "applebot-extended" ||
             $ua contains "bytespider" || $ua contains "ccbot" ||
             $ua contains "amazonbot" } {
            set is_ai_agent 1
        }

        if { [HTTP::header "X-Request-Format"] eq "markdown" } { set is_ai_agent 1 }
        if { [HTTP::header "Accept"] contains "text/markdown" } { set is_ai_agent 1 }

        set orig_uri [HTTP::uri]
        if { $orig_uri starts_with "/md/" } {
            set is_ai_agent 1
            set new_uri [string range $orig_uri 3 end]
            if { $new_uri eq "" } { set new_uri "/" }
            HTTP::uri $new_uri
        } elseif { $orig_uri eq "/md" } {
            set is_ai_agent 1
            HTTP::uri "/"
        }

        set http_request_host [HTTP::host]

        if { $is_ai_agent } {
            HTTP::header replace "Accept-Encoding" "identity"
        }
    }

    when HTTP_RESPONSE {
        if { $is_ai_agent } {
            if { [HTTP::header "Content-Type"] contains "text/html" } {
                set ce [HTTP::header "Content-Encoding"]
                if { $ce ne "" } {
                    if { $ce ne "identity" } {
                        set is_ai_agent 0
                        HTTP::header insert "X-Markdown-Skipped" "compressed-response"
                        return
                    }
                }
                set content_length [HTTP::header "Content-Length"]
                set do_collect 1
                if { $content_length ne "" } {
                    if { $content_length > 524288 } {
                        set is_ai_agent 0
                        set do_collect 0
                        HTTP::header insert "X-Markdown-Skipped" "body-too-large"
                    }
                }
                if { $do_collect } {
                    if { $content_length ne "" } {
                        if { $content_length > 0 } {
                            HTTP::collect $content_length
                        }
                    } else {
                        HTTP::collect 524288
                    }
                }
            }
        }
    }

    when HTTP_RESPONSE_DATA {
        if { $is_ai_agent } {
            set html_body [HTTP::payload]
            set orig_size [string length $html_body]

            # Phase 1: Collapse newlines for multiline tag stripping
            set NL_MARK "\x01"
            set html_body [string map [list "\r\n" $NL_MARK "\r" $NL_MARK "\n" $NL_MARK] $html_body]
            regsub -all -nocase "<script\[^>\]*>.*?</script>" $html_body "" html_body
            regsub -all -nocase "<style\[^>\]*>.*?</style>" $html_body "" html_body
            regsub -all -nocase "<nav\[^>\]*>.*?</nav>" $html_body "" html_body
            regsub -all -nocase "<footer\[^>\]*>.*?</footer>" $html_body "" html_body
            regsub -all -nocase "<header\[^>\]*>.*?</header>" $html_body "" html_body
            regsub -all -nocase "<noscript\[^>\]*>.*?</noscript>" $html_body "" html_body
            regsub -all -nocase "<svg\[^>\]*>.*?</svg>" $html_body "" html_body
            regsub -all "<!--.*?-->" $html_body "" html_body
            regsub -all -nocase "<form\[^>\]*>.*?</form>" $html_body "" html_body

            # Phase 2: Restore newlines, convert structure
            set html_body [string map [list $NL_MARK "\n"] $html_body]
            regsub -all -nocase {<h1[^>]*>([^<]*)</h1>} $html_body "# \\1\n\n" html_body
            regsub -all -nocase {<h2[^>]*>([^<]*)</h2>} $html_body "\n## \\1\n\n" html_body
            regsub -all -nocase {<h3[^>]*>([^<]*)</h3>} $html_body "\n### \\1\n\n" html_body
            regsub -all -nocase {<h4[^>]*>([^<]*)</h4>} $html_body "\n#### \\1\n\n" html_body
            regsub -all -nocase {<strong[^>]*>([^<]*)</strong>} $html_body {**\1**} html_body
            regsub -all -nocase {<b[^>]*>([^<]*)</b>} $html_body {**\1**} html_body
            regsub -all -nocase {<em>([^<]*)</em>} $html_body {*\1*} html_body
            regsub -all -nocase {<i>([^<]*)</i>} $html_body {*\1*} html_body
            regsub -all -nocase {<a[^>]*href="(/[^"]*)"[^>]*>([^<]*)</a>} $html_body "\[\\2\](https://${http_request_host}\\1)" html_body
            regsub -all -nocase {<a[^>]*href="(https?://[^"]*)"[^>]*>([^<]*)</a>} $html_body "\[\\2\](\\1)" html_body
            regsub -all -nocase {<a[^>]*>([^<]*)</a>} $html_body {\1} html_body
            regsub -all -nocase {<th[^>]*>([^<]*)</th>} $html_body "| \\1 " html_body
            regsub -all -nocase {<td[^>]*>([^<]*)</td>} $html_body "| \\1 " html_body
            regsub -all -nocase {</tr>} $html_body "|\n" html_body
            regsub -all -nocase {<code>([^<]*)</code>} $html_body {`\1`} html_body
            regsub -all -nocase {<li[^>]*>([^<]*)</li>} $html_body "- \\1\n" html_body
            regsub -all -nocase {</?[uo]l[^>]*>} $html_body "\n" html_body
            regsub -all -nocase {<p[^>]*>([^<]*)</p>} $html_body "\\1\n\n" html_body
            regsub -all -nocase {<br\s*/?>} $html_body "\n" html_body
            regsub -all -nocase {<hr\s*/?>} $html_body "\n---\n\n" html_body
            regsub -all -nocase {<blockquote[^>]*>} $html_body "> " html_body
            regsub -all -nocase {</blockquote>} $html_body "\n\n" html_body
            regsub -all -nocase {<cite>([^<]*)</cite>} $html_body "-- *\\1*\n" html_body
            regsub -all {<[^>]+>} $html_body "" html_body

            # Decode HTML entities
            regsub -all {&amp;} $html_body {\&} html_body
            regsub -all {&lt;} $html_body {<} html_body
            regsub -all {&gt;} $html_body {>} html_body
            regsub -all {&quot;} $html_body {"} html_body
            regsub -all {&nbsp;} $html_body { } html_body
            regsub -all {&ldquo;} $html_body {"} html_body
            regsub -all {&rdquo;} $html_body {"} html_body
            regsub -all {&lsquo;} $html_body {'} html_body
            regsub -all {&rsquo;} $html_body {'} html_body
            regsub -all {&mdash;} $html_body {--} html_body
            regsub -all {&ndash;} $html_body {-} html_body
            regsub -all {&hellip;} $html_body {...} html_body
            regsub -all {&#[0-9]+;} $html_body {} html_body

            # Whitespace cleanup
            regsub -all {\n +} $html_body "\n" html_body
            regsub -all {\n{3,}} $html_body "\n\n" html_body
            regsub -all {([^\n])\n\n([^#\n\[>*-])} $html_body "\\1\n\\2" html_body
            set html_body [string trim $html_body]

            HTTP::payload replace 0 [HTTP::payload length] $html_body
            HTTP::header replace "Content-Type" "text/plain; charset=utf-8"
            HTTP::header replace "Content-Length" [string length $html_body]
            HTTP::header insert "X-Markdown-Source" "bigip-irule"
        }
    }

Testing / Demoing It

    # Normal browser request, HTML as usual
    curl https://your-site.example.com/

    # AI agent simulation
    curl -H "User-Agent: GPTBot/1.0" https://your-site.example.com/

    # Explicit markdown request
    curl -H "X-Request-Format: markdown" https://your-site.example.com/

    # Browser-friendly demo
    curl https://your-site.example.com/md/
    # (or just visit it in your browser)

What's Next

This is a solid starting point for making your existing sites AI-agent ready without touching application code. A few directions to take it:

- Agent discovery files: serve /llms.txt and /.well-known/ai-plugin.json so agents can programmatically discover your markdown capability.
- iRulesLX upgrade path: when regex-based conversion isn't enough, move the HTML parsing to a Node.js sidecar with a real DOM parser (cheerio, jsdom). Same detection logic, better conversion quality.

The AI agent wave isn't coming. It's here. Your BIG-IP already sees every request. Might as well make those responses useful.

Disclaimer!

The iRule in this article was developed as part of a proof-of-concept for edge-layer HTML-to-Markdown conversion. It's been tested on BIG-IP 17.5.1+. Your mileage may vary on complex single-page applications, but for content-heavy sites, it works remarkably well for something that's "just regex."

Scality RING and F5 BIG-IP: High-Performance S3 Object Storage
The load balancing of F5 BIG-IP, both locally within a site and for global traffic steering to an optimal site across large geographies, works effectively with Scality RING, a modern and massively scalable object storage solution. The RING architecture takes an innovative "bring-your-own-Linux" approach to turning highly performant servers, equipped with ample disks, into a resilient, durable storage solution. BIG-IP can scale in lockstep with offered S3 access loads, for use cases like AI data delivery for model training, keeping any single RING node from becoming a hot spot through load balancing algorithms like "Least Connections" or "Fastest", to name just a couple. From a global server load balancing perspective, BIG-IP DNS can apply similarly advanced logic, for instance steering S3 traffic to the optimal RING site based on the geographic locale of the traffic source, or on ongoing latency measurements from those source sites.

Scality RING – High Capacity and Durability for Today's Object Storage

The Scality solution is well known for its ability to grow an enterprise's storage capacity with agility: simply license the usable storage needed today and upgrade on an as-needed basis as business warrants. RING supports both object and file storage; however, the focus of this investigation is object. Industry drivers of object storage growth include its prevalence in AI model training, specifically for content accrual, which will in turn feed GPUs, as well as data lakehouse implementations. There is an extremely long-tailed distribution of other use cases, such as video clip retention in the media and entertainment industry, medical imaging repositories, updates to traditional uses like NAS offload to S3, and the evolution of enterprise storage backups.

At the very minimum, a 3-node site with 200 TB of storage serves as a starting point for a RING implementation.
The underlying servers typically run RHEL 9 or Rocky Linux on x86 hardware from Intel or AMD, and a representative server offers disk bays, front or back, with loaded disks totaling anywhere from 10 to dozens of units. Generally, S3 objects are stored on spinning hard disk drives (HDD), while the corresponding metadata warrants the inclusion of a subset of flash drives in a typical Scality deployment. A representative diagram of BIG-IP in support of a single RING site would be as follows.

One of the known attributes of a well-engineered RING solution is 100 percent data availability. In industry terms, this is an RPO (recovery point objective) of zero, meaning that no data is lost between the moment a failure occurs and the moment the system is restored to its last known good state. This is achieved through means like multiple nodes, multiple disks, and often multiple sites, combined with replication for small objects (such as retaining 2 or 3 copies of objects smaller than 60 kilobytes) and erasure coding (EC) for larger objects. Erasure coding is a nuanced topic within the storage industry; Scality uses a sophisticated take on erasure coding known as ARC (Advanced Resiliency Coding).

In alignment with availability is the durability of data that can be achieved through RING. This is to say, how "intact" can I believe my data at rest is? The Scality solution offers fourteen 9's of durability, exceeding most other advertised values, including that of AWS. What 9's correspond to in terms of downtime in a single year can be found here, although it is telling that Wikipedia, as of early 2026, does not even provide calculations beyond twelve 9's.

Finally, in keeping with sound information lifecycle management (ILM), the Scality site may offer an additional server running XDM (eXtended Data Management) to act as a bridge between on-premises RING and public clouds such as AWS and Azure.
This allows a tiering approach, where older, "cold" data is moved off-site. Archive-to-tape solutions are also available options.

Scality – Quick Overview of Data at Rest Protection

The two principal approaches to protecting data in large single-site or multi-site RING deployments are used in combination: replication and erasure coding. Replication is simple to understand: for smaller objects, an operator simply chooses the number of replicas desired. If two replicas are chosen, indicated by class of service (COS) 2, two copies are spread across nodes; for COS 3, three copies are spread across nodes. A frequent rule of thumb is the three percent rule, this being the fraction of files across a full object storage environment that are 60 kilobytes or less and therefore replicated; the replicas remain available in cases of hardware disruptions on a given node.

Erasure coding is an adjustable technique where larger objects are divided into data chunks, sometimes called data shards or data blocks, and spread (or "striped") across many nodes. To add resilience in the case of one or more hardware issues with nodes, or disks within nodes, additional parity chunks are mathematically derived. This way, cleverly and by design, only a subset of the data chunks and parity chunks is required when the solution is under duress, and the original object is still easily provided upon an S3 request. In smaller deployments, it is possible to consider a single RING server as two entities by dividing its storage into two "disk groups." However, for an ideal, larger RING site, the approach depicted is preferred. The erasure coding depicted, normally referred to with the nomenclature EC(9,3), leads into a deeper design consideration where storage overhead is traded off against data resiliency. In the diagram, as many as 3 nodes holding portions of the data could become unreachable and still the erasure-coded object would be available.
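The data-versus-parity arithmetic behind these trade-offs is easy to sketch. Here's an illustrative Python helper (an assumption-level model of EC(k, m), where k data chunks plus m parity chunks tolerate any m losses, not Scality's internal implementation):

```python
def ec_profile(data_chunks: int, parity_chunks: int) -> dict:
    """EC(k, m): k data chunks + m parity chunks; any m chunk losses survivable."""
    return {
        "chunks_stored": data_chunks + parity_chunks,
        "failures_tolerated": parity_chunks,
        # Storage overhead relative to the raw object size is m/k.
        "overhead_pct": round(100 * parity_chunks / data_chunks, 1),
    }
```

For example, ec_profile(9, 3) stores 12 chunks, tolerates 3 failures, and costs about 33% overhead, while ec_profile(8, 4) tolerates 4 failures at 50% overhead.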
The overhead can be considered 33 percent, as 3 additional parity chunks were created and stored beyond the 9 data chunks. For more risk-averse operators, an EC of, say, EC(8,4) would allow even more: four points of failure. The trade-off would be, in this case, a 50 percent overhead to achieve that increased resiliency. The overhead is still much less than replication, which can incur hundreds of percent in overhead, hence the logical choice to use replication only for small objects. Together, replication and EC lead to an overall storage efficiency number. Considering an environment with 3 percent small objects, EC(9,3) plus COS 3 replication might lead to a palatable long-term data protection posture, all for a total cost of 41 percent additional storage overhead. The ability to scale out and protect the S3 data in flight is the domain of BIG-IP, which we will review next.

BIG-IP – Bring Scale and Traffic Control to Scality RING

A starting point for any discussion around BIG-IP is its rich set of load balancing algorithms and its ability to drop unhealthy nodes from an origin pool, transparent to users who only interact with the configured virtual server. Load balancing for S3 involves avoiding "hot spots", where a single RING node might otherwise be overly tasked by users communicating with it directly while other nodes remain vastly underutilized. By steering DNS resolution of S3 services to BIG-IP and its configured virtual servers, traffic can be spread across all healthy nodes in accordance with interesting algorithms. Popular ones for S3 include:

Least Connections – RING nodes with fewer established TCP connections receive proportionally more of the new S3 transactions, towards a goal of balanced load in the server cluster.

Ratio (member) – Although sound practice would be for all RING members to have similar compute and storage makeup, in some cases perhaps two vintages of server exist.
Ratio allows proportionally more traffic to target the newer, more performant class of Scality nodes.

Fastest (Application) – The number of "in progress" transactions any one server in a pool is handling is considered. If traffic steered to all members is generally similar over time, a member with the fewest transactions actively in progress is considered a faster member of the pool, and new transactions can favor such low-latency servers.

The RING nodes are contacted through Scality "S3 Connectors"; in an all-object deployment, the connector resides on the storage node itself. For some configurations, perhaps one with file-based protocols like NFS running concurrently, the S3 Connectors can also be installed on VMs or 1U appliances.

Of course, an unhealthy node should be precluded from an origin pool, and low-impact HTTP-based health monitors, such as an HTTP HEAD request to see whether an endpoint is responsive, are frequently used. With BIG-IP Extended Application Validation (EAV), one can move towards even more sophisticated health checks: an S3 access and secret token pair installed on BIG-IP can be harnessed to perpetually upload and download small objects to each pool member, assuring the BIG-IP administrator that S3 is unequivocally healthy on each pool member.

BIG-IP – Control-Plane and Data-Plane Safeguards

A popular topic in a Scality software-defined distributed storage solution is that of the noisy neighbor when multiple tenants are considered. Perhaps one tenant has an S3 application which consumes disproportionate amounts of shared resources (CPU, network, or disk I/O), degrading performance for other tenants; controls are needed to counter this. With BIG-IP, a simple control-plane threshold can be invoked with a straightforward iRule, a programmatic rule which can, for example, limit a source to no more than 25 S3 requests over 10 seconds. An iRule is a powerful but normally short, event-driven script.
A sample is provided below. Most modern generative AI solutions are well-versed in F5 iRules and can summarize even the most advanced scripts in digestible terms. This iRule examines an application ("client_addr") that connects to a BIG-IP virtual server and starts a counter; after 10 transactions within 6 seconds, further S3 commands are rejected. The approach is that of a leaky bucket, and the application is replenished with credits for future transactions over time.

Whereas iRules frequently target layer 7, HTTP-layer activity, a wealth of layer 3 and layer 4 controls exist to limit excessive data-plane consumption. Take, for example, the static bandwidth controller concept: simply create a profile such as the following 10 Mbps example. This bandwidth controller can then be applied against a virtual server, including one supporting, say, lower-priority S3 application traffic.

Focusing on layer 4, the TCP layer, a number of BIG-IP safeguards exist, among which are those that can defend against orphaned S3 connections, including connections intentionally set up and left open by a bad actor to try to deplete RING resources. Another safeguard is the ability to re-map DiffServ code points or Type of Service (TOS) precedence bits. In this manner, a source that exceeds ideal traffic rates can be passed without intervention; however, by remapping heavy upstream traffic, BIG-IP enables the network infrastructure adjacent to Scality RING nodes to police or discard such traffic if required.

Evolving Modern S3 Traffic with Fresh Takes on TLS

TLS underwent a major improvement with the first release of TLS 1.3 in 2018. It removed a number of antiquated security components from official support, such as RSA-style key agreements, SHA-1 hashes, and DES encryption. From a performance point of view, however, the upgrade to TLS 1.3 is equally significant.
When establishing a TLS 1.2 session, perhaps towards the goal of an S3 transaction with RING, an application can expect 2 round-trip times, once the TCP connection is established, to complete the TLS negotiation phase and move forward with encrypted communications. TLS 1.3 cuts the round trips in half: a new TLS 1.3 session can proceed to encrypted data exchange after a single round-trip time. In fact, when resuming a previously established TLS 1.3 session, 0-RTT is possible, meaning the first resumption message from the client can itself carry encrypted data. The following packet trace demonstrates 1-RTT TLS 1.3 establishment (double-click to enlarge image). To turn on this feature, simply use a client-facing TLS profile on BIG-IP and remove the "No TLS1.3" option.

Another advancement in TLS, which requires TLS 1.3 to be enabled in the first place, is quantum-computing resistance in the shared key agreement algorithms of TLS. This is a foundational building block of post-quantum cryptography (PQC), and the most well-known of these techniques is NIST FIPS-203 ML-KEM. The concern with not supporting PQC today is that traffic in flight, which may be surreptitiously siphoned off and stored long term, will be readable in the future with quantum computers, perhaps as early as 2030. This risk stems from thought leadership like Shor's algorithm, which indicates that public key (asymmetric) cryptography, foundational to shared key establishment between parties in TLS, is at risk: large-scale, fault-tolerant quantum computers could potentially crack elliptic curve cryptography (ECC) and Diffie-Hellman (DH) algorithms. This risk, the so-called Harvest Now, Decrypt Later threat, means sensitive data like tax records, medical information, and anything with long-term retention value requires protection today. It cannot safely be put off; action needs to be taken now.
FIPS-203 ML-KEM suggests a hybrid approach to shared key derivation, after which TLS parties can safely continue to use symmetric encryption algorithms like AES, which are thought to be far less susceptible to quantum attacks. Updating our initial one-site topology, we can consider the following improvements.

A key understanding is that a hybrid key agreement scheme is used in FIPS-203. Essentially, a parallel set of crypto operations using a traditional key agreement like the X25519 ECDH key exchange is performed simultaneously with the new MLKEM768 quantum-resistant key encapsulation approach. The net result is that a significant amount of crypto is carried out, two sets of calculations plus the final combining of outcomes into an agreed-upon shared key. The conclusion is that this load is likely best suited for only a subset of S3 flows: those with objects housing PII of high long-term potential value. A method to achieve this balance, the trade-off between security and performance, is to use multiple BIG-IP virtual servers: a regular set of S3 endpoints with classical TLS support, and higher-security S3 endpoints for selective use. The latter would support the PQC provisions of modern TLS. A full article on configuring BIG-IP for PQC, including a video demonstration of the click-through to add support to a virtual server, can be found here.

Multi-site Global Server Load Balancing with BIG-IP and Scality RING

An illustrative diagram showing two RING sites, asynchronously connected and offering S3 ingestion and object retrieval, is shown below. Note that BIG-IP DNS, although frequently deployed independently from BIG-IP LTM appliances, can also operate on the same, existing LTM appliances. In this example, an S3 application physically situated in Phoenix, Arizona, in the American southwest, will use its configured local DNS resolver (frequently shortened to LDNS) to resolve S3 targets to IP addresses.
Think finance.s3.acme.com or humanresources.s3.acme.com. In F5 terms, these example domain names are referred to as “Wide IPs”. An organization such as the fictitious acme.com will delegate the relevant sub-domains to F5 DNS, such as s3.acme.com in our example, meaning the F5 appliances in San Francisco and Boston hold the DNS nameserver (NS) resource records for the S3 domain in question, and can answer the client’s DNS resolver authoritatively. The DNS A queries required by the S3 application will land on either BIG-IP DNS platform, San Francisco or Boston. The pair serve redundancy purposes, and both can provide an enterprise-controlled answer. In other words, should the S3 application target be resolved to Los Angeles or New York City?

The F5 solution allows for a multitude of considerations when providing the answer to the above question. Interesting options and their impact on our topology diagram:

Global Availability – A common disaster recovery approach. The BIG-IP DNS appliance distributes DNS name resolution requests to the first available virtual server in a pool list the administrator configures. BIG-IP DNS starts at the top of the list of virtual servers and sends requests to the first available virtual server in the list. Only when that virtual server becomes unavailable does BIG-IP DNS send requests to the next virtual server in the list. If we want S3 traffic generally to travel to Los Angeles, and only utilize New York when application availability problems arise, this would be a good approach.

Ratio – In a case where we would like, say, an 80/20 split between S3 traffic landing in Los Angeles versus New York, this would be a sound method. Perhaps market reasons make the cost of ingesting traffic in New York more expensive.

Round Robin – The logical choice where we would like to see both data centers receive, generally, over time, the same amount of S3 transactions.
Topology – BIG-IP DNS distributes DNS name resolution requests using proximity-based load balancing. BIG-IP DNS determines the proximity of the resource by comparing location information derived from the DNS message to the topology records in a topology statement. A great choice if data centers are of similar capacity and S3 transactions are best serviced by the closest physical data center. Note that the source IP address of the application’s DNS resolver is analyzed; if a centralized DNS service is used, perhaps it is not in Phoenix at all. There are techniques like EDNS0 client subnet to better place the actual locality of the application.

Round Trip Time – An advanced algorithm that is dynamic, not static. BIG-IP DNS distributes DNS name resolution requests to the virtual server with the fastest measured round-trip time between that data center and a client’s LDNS. This is achieved by having sites send low-impact probes, from “prober pools”, to each application’s DNS resolver over time. Therefore, for new DNS resolution requests, the BIG-IP DNS can tap into real-world latency knowledge to direct S3 traffic to the site demonstrably known to offer the lowest latency. This again works best when the application and DNS resolver are in the same location.

The BIG-IP DNS, when selecting between virtual servers, such as those in Los Angeles and New York City in our simple example, can have a primary algorithm, a secondary algorithm and a fall-back, hard-coded IP. For instance, consider that the first two algorithms are, in order, dynamic approaches: prober pools measuring round-trip time and, as a second approach, the measurement of active hop counts between sites and the application’s LDNS. Should both methods fail to provide results, an IP address of last resort, perhaps in our case Los Angeles, will be provided through the configured fall-back IP.
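The selection methods above are easy to reason about in miniature. A hedged Python sketch (site names, health flags, and weights are illustrative, not F5 APIs) showing how Global Availability differs from Ratio-style selection:

```python
import random

SITES = ["los-angeles", "new-york"]  # ordered preference list

def global_availability(health):
    # Walk the ordered list; return the first healthy site.
    for site in SITES:
        if health.get(site):
            return site
    return None  # no site available

def ratio(health, weights, rng):
    # Weighted pick among healthy sites, e.g. {"los-angeles": 80, "new-york": 20}.
    healthy = [s for s in SITES if health.get(s)]
    if not healthy:
        return None
    return rng.choices(healthy, weights=[weights[s] for s in healthy])[0]

health = {"los-angeles": False, "new-york": True}
print(global_availability(health))  # new-york
```

Round Robin would simply rotate through the healthy list; the real BIG-IP DNS decision also folds in the health-monitor state described later in the article.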
Key takeaway: what is being provided by F5 and Scality is “intelligent” DNS; traffic is not directed to Los Angeles or New York based merely upon basic network reachability. In reality, the solution looks behind the local load balancing tier and is aware of the health of each Scality RING member. Thus, traffic is steered in accordance with back-end application health monitoring, something a regular DNS solution would not offer.

Multi-site Solutions for Global Deployments and Geo-Awareness

One potentially interesting use case for F5 BIG-IP DNS and Scality RING sites would be to tier all data centers into pools, based upon wider geographies. Consider a use case such as the following, with Scality RING sites spread across both North America and Europe. The BIG-IP DNS solution can handle this higher layer of abstraction: the first layer involves choosing between a pool of sites, before delving down one more layer into the pool of virtual servers spread across the sites within the optimal region. Policy drives the response to a DNS query for S3 services all the way through these two layers. To explore all load balancing methods is an interesting exercise but beyond the scope of this article. The manual here drills into the possible options. To direct traffic at the country or even continent level, one can follow the “Topology” algorithm for first selecting the correct site pool. Persistence can be enabled, allowing future requests from the same LDNS resolver to follow prior outcomes. First, it is good practice to ensure the geo-IP database of BIG-IP is up to date. A brief video here steps a user through the update. The next thing to create is regions. In this diagram the user has created an “Americas” and a “Europe” region. In fact, in this particular setup, the Europe region is seen to match all traffic with DNS queries originating outside of North and South America, per the list of member continents.
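The region-matching idea can be sketched with Python's ipaddress module, a simplified stand-in for BIG-IP's geo-IP database. The prefixes below are documentation ranges chosen purely for illustration:

```python
import ipaddress

# Hypothetical prefixes standing in for a geo-IP database lookup
AMERICAS = [ipaddress.ip_network("198.51.100.0/24"),
            ipaddress.ip_network("203.0.113.0/24")]

def region_for(ldns_ip):
    # Match the resolver's source IP against the Americas list;
    # everything else falls through to Europe, mirroring the
    # "all other continents" region described above.
    addr = ipaddress.ip_address(ldns_ip)
    if any(addr in net for net in AMERICAS):
        return "Americas"
    return "Europe"

print(region_for("198.51.100.25"))  # Americas
print(region_for("192.0.2.10"))     # Europe
```

The real geo-IP database maps prefixes to countries and continents rather than a hand-built list, but the decision flow, source IP in, region out, is the same.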
With regions defined, one now creates simple topology records to control DNS responses for S3 services based upon the source IP of DNS queries made on behalf of S3 applications. The net result is a worldwide set of controls with regard to which Scality site S3 transactions will land upon. The decision, based upon enterprise objectives, can fully consider geographies like continents or individual countries. In our example, once a source region has been decided upon for an inbound DNS request, any of the previous algorithms can kick in. This would include options like global availability for DR within the selected region, or perhaps measured latency to steer traffic to the most performant site in the region.

Summary

Scality RING is a software-defined object and file solution that supports data resiliency at levels expected by risk-averse storage groups, all with contemporary Linux-friendly hardware platforms selected by the enterprise. The F5 BIG-IP application delivery controller complements S3 object traffic involving Scality through massive scale-out of nodes coupled with innovative algorithms for agile spreading of the traffic. Health of RING nodes is perpetually monitored so as to seamlessly bypass any troubled system. When moving to multi-site RING deployments, within a country or even across continents, BIG-IP DNS is harnessed to steer traffic to the optimal site, potentially including geo-IP rules, proximity between user and data center, and established baseline latencies offered by each site to the S3 application’s home location.

YouTube RSS Newsletter in n8n Root Cause: Why the Ollama Node Broke My Agent
Hey community—Aubrey here. I want to talk about a failure I ran into while building an n8n workflow, because this one cost me some real time and I think it’s going to save you an afternoon if you’re headed down the same road. The short version: I had a workflow working great with OpenAI, and I wanted to swap in Ollama so I could run the LLM locally. Same prompt, same data, same structured output requirements. In my head, that should’ve been a clean plug-and-play change. It wasn’t. It broke in a way that looked like “the model isn’t returning valid JSON,” but the real root cause was something else entirely—and it’s actually documented.

What broke (and where it broke)

The failure always showed up in the Structured Output Parser. n8n would run the flow, then the parser would throw: “Model output doesn’t fit required format.” That’s a super reasonable error if your model is rambling, adding commentary, wrapping JSON in markdown, returning tool traces, whatever. So that’s where my head went first: “Okay, I need to tighten the prompt. Maybe the schema is too strict. Maybe Ollama’s being weird.” But here’s the thing: this wasn’t one of those “LLM didn’t obey” moments. This was repeatable, consistent, and it didn’t really matter how I tuned the prompt. The OpenAI version worked; the Ollama version failed, and the parser was just the first place it showed up.

The first big clue: the 5-minute wall

As I dug in, I started seeing a pattern: a hard failure at exactly five minutes. Not “about five minutes,” not “sometimes,” but right on the dot. That error often surfaced as: “fetch failed.” So now we’re not talking about a formatting issue anymore—we’re talking about the request itself failing. That matters, because if the model call dies mid-stream, the structured parser downstream is going to be handed something incomplete or empty or error-shaped, and it’s going to complain that it doesn’t match the schema. That’s not the parser being wrong. That’s the parser doing its job.
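To make that failure mode concrete: when the upstream call dies mid-stream, the parser receives truncated text, and strict JSON parsing rejects it exactly as described. A quick illustrative sketch (the payloads are made up):

```python
import json

complete = '{"title": "Video", "description": "..."}'
truncated = '{"title": "Video", "descri'   # stream cut off mid-response

print(json.loads(complete)["title"])       # Video

try:
    json.loads(truncated)
except json.JSONDecodeError as exc:
    # This is the moment a structured output parser reports that the
    # model output doesn't fit the required format. The model didn't
    # disobey; the payload never fully arrived.
    print("parse failed:", exc.msg)
```

The same schema that happily validates the complete payload has no chance against the truncated one.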
This 5-minute behavior is also what other folks reported in n8n issue #13655—the Ollama chat model node timing out after 5 minutes even when people tried to change the “keep alive” setting.

Reproducing behavior with logs (and why it matters)

One of the most useful things I found in that issue thread was simple: Ollama’s own logs clearly show the request dying at 5 minutes when driven by n8n’s AI nodes. You’ll see entries like:

failure case: 500 | 5m0s | POST "/api/chat"

Then an n8n community member swapped the same payload into a manual HTTP Request node in n8n (which does let you set a timeout), and suddenly the same call works:

success case: 200 | 7m0s | POST "/api/chat"

That’s a huge diagnostic move. Because it tells you the model isn’t “incapable” or “too slow”—it tells you the client behavior is the problem (timeout / abort / cancellation), not your prompt, not your JSON schema, not the content. And that lined up perfectly with what I was seeing on my side.

Getting serious: tcpdump, FIN/ACK, and “context canceled”

At some point I wanted proof of what was actually happening on the wire, so I ran a tcpdump against the Ollama port. And yeah—this is where it got real. What I saw was:

n8n connects to Ollama fine
Data flows for a while (so we’re not talking about “can’t reach host”)
At the ~5 minute mark, n8n sends a TCP FIN/ACK (client closes connection)
Then an HTTP 500 follows, containing an error like "context canceled"

In the issue thread, you can literally see an example of that pattern: client FIN from n8n → Ollama, then 'HTTP/1.1 500 Internal Server Error' with a body indicating: 'context canceled.' So when I originally said “the structured output parser fails because Ollama’s tool call output isn’t close to what’s expected,” I wasn’t totally wrong about the symptom. But the deeper “why” is: the request is being canceled and what comes back is not a valid structured model output.
The parser is just where it becomes obvious when you force the node to return data back to the agent.

The root cause (and the part I want everyone to notice)

Now here’s the punchline, and this is the part I want to underline, bold, highlight, put on a billboard: The n8n Ollama model node does not work with LLM Tools implementations. That’s not a rumor. That’s in the n8n docs! After a quick recap and discussion, JRahm pointed me to the documentation for the Ollama model integration, and it straight-up says the Ollama node does not support tools, and recommends using Basic LLM Chain instead:

What I’m doing next (and what you should do)

I’m not done with Ollama—I’m just done trying to use it the wrong way, and this is going to spawn two follow-up efforts for me:

1. Attempt to rebuild the same idea using Basic LLM Chain with Ollama, the way the docs recommend.
2. Write a deeper explainer on LLM Tools—what they are, why agents use them, and how that’s different than RAG (because those concepts get mashed together constantly).

So if you’re out there wiring up an Agent with structured output and you’re thinking “I’ll just switch the model to Ollama,” don’t do what I did. Read that doc line first. If you need tools, pick a model/node combo that supports tools. If you’re using Ollama, design for the Basic LLM Chain path and you’ll save yourself the five-minute timeout rabbit hole and the structured-parser blame game.

Using the Model Context Protocol with Open WebUI
This year we started building out a series of hands-on labs you can do on your own in our AI Step-by-Step repo on GitHub. In my latest lab, I walk you through setting up a Model Context Protocol (MCP) server and the mcpo proxy to allow you to use MCP tools in a locally-hosted Open WebUI + Ollama environment. The steps are well-covered there, but I wanted to highlight what you learn in the lab.

What is MCP and why does it matter?

MCP is a JSON-based open standard from Anthropic that (shockingly!) is only about 13 months old now. It allows AI assistants to securely connect to external data sources and tools through a unified interface. The key delivery that led to its rapid adoption is that it solves the fragmentation problem in AI integrations—instead of every AI system needing custom code to connect to each tool or database, MCP provides a single protocol that works across different AI models and data sources.

MCP in the local lab

My first exposure to MCP was using Claude and Docker tools to replicate a video Sebastian_Maniak released showing how to configure a BIG-IP application service. I wanted to see how F5-agnostic I could be in my prompt and still get a successful result, and it turned out that the only domain-specific language I needed, after it came up with a solution and deployed it, was to specify the load balancing algorithm. Everything else was correct. Kinda blew my mind. I spoke about this experience throughout the year at F5 Academy events and at a solutions day event in Toronto, but more so, I wanted to see how far I could take this in a local setting away from the pay-to-play tooling offered at that time. This was the genesis for this lab.
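Under the hood, MCP messages ride on JSON-RPC 2.0. A minimal sketch of the shape a client sends to invoke a tool (the tool name and arguments here are hypothetical, not from the lab's server):

```python
import json

def mcp_tool_call(request_id, tool_name, arguments):
    # Shape of an MCP "tools/call" request in its JSON-RPC 2.0 framing
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

msg = mcp_tool_call(1, "get_weather", {"city": "Seattle"})
print(json.dumps(msg, indent=2))
```

The mcpo proxy's job in the lab is essentially to translate this style of call into plain HTTP endpoints that Open WebUI can consume.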
Tools

In this lab, you'll use the following tools:

- Ollama
- Open WebUI
- mcpo
- custom MCP server

Ollama and Open WebUI are assumed to already be installed; those labs are also in the AI Step-by-Step repo:

- Installing Ollama
- Installing Open WebUI

Once those are in place, you can clone the repo and deploy in docker or podman; just make sure the containers for Open WebUI are in the same network as the repo you're deploying.

Results

The success of getting your Open WebUI inference through the mcpo proxy and the MCP servers (mine is very basic, just for test purposes; there are more that you can test or build yourself) depends greatly on your prompting skills and the abilities of the local models you choose. I had varying success with llama3.2:3b. But the goal here isn't production-ready tooling; it's to build and discover and get comfortable in this new world of AI assistants and leveraging them where it makes sense to augment our toolbox. Drop a comment below if you build this lab and share your successes and failures. Community is the best learning environment.
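If you are curious what wiring mcpo in front of an MCP server looks like, it boils down to a one-liner. The port and server command below are placeholders; check the mcpo README for current flags before relying on this:

```shell
# Expose a (hypothetical) local MCP server as HTTP endpoints
# that Open WebUI can reach as a tool backend
uvx mcpo --port 8000 -- python my_mcp_server.py
```

Everything after the `--` is the command that launches your MCP server; mcpo proxies it for Open WebUI.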
The Fast Path to Safer Labs: CycloneDX SBOMs for F5 Practitioners
Quick note up front about my intent with this lab... I built it to quickly help F5 practitioners keep their lab environments safe from hidden threats. Fast, approachable, and useful on day one. We used the bundled Dependency-Track container because it's trivial to stand up in a lab. In production, please deploy Dependency-Track backed by a production-grade database and tune it for scale and durability. Lab-first, but think ahead to enterprise-ready. Now, let's talk about why I chose CycloneDX for the SBOM we generated with Trivy, and why it's the accepted standard I recommend for modern, AI-heavy workloads.

At a high level, an SBOM is your ingredient list for software. Containers that host LLM apps are layered: base OS, GPU drivers and CUDA, language runtimes, Python packages, app binaries, plus external services you call (hosted inference, embeddings, vector databases). If you don't know what's in that stack, you can't manage risk when new CVEs land. CycloneDX gives you that visibility and does it with a security-first design.

Here's why CycloneDX is such a good fit:

- Security-first schema. CycloneDX was born into the AppSec world at OWASP. It bakes in identifiers that vulnerability tooling actually uses—package URLs (purls), CPEs, hashes—and a proper dependency graph. That graph matters when the vulnerable thing isn't your top-level app but the library three layers deep.
- Broad component coverage, including services. Real LLM apps don't stop at "libraries." CycloneDX can represent applications, libraries, containers, operating systems, files, and services. That service support is huge: if you depend on an external inference API, a hosted vector DB, or a third-party embedding service, CycloneDX can document that right in your SBOM. Your risk picture is no longer just what's "in the image," but what the image calls.
- VEX support to cut noise.
CycloneDX supports VEX (Vulnerability Exploitability eXchange), which lets you annotate "not affected" or "mitigated" when a CVE shows up in your base image but is not exploitable in your specific deployment. That's how you keep the signal high and the noise low.

- Toolchain adoption. The path we used in the lab—Trivy generates CycloneDX JSON in a single command, Dependency-Track ingests it cleanly—is exactly what you want. Fewer conversions, fewer surprises, more time looking at risk with a project-centric view.

So how does that map to LLM app security, specifically?

- Containers and drivers: CycloneDX captures the full container context—OS packages, runtime layers, GPU driver stacks—so when you rebuild to pick up a CUDA or base image update, your SBOM reflects the change and your risk dashboard stays current.
- Python ecosystems: For model-serving and data pipelines, CycloneDX tracks the Python libraries and their transitive dependencies, so when a popular package pushes a patch for a nasty CVE, you'll see the impact across your projects.
- Model artifacts and files: CycloneDX can represent file components with hashes. If you pin or verify model files, that checksum data helps you detect drift or tampering.
- External services: Many LLM apps rely on hosted endpoints. CycloneDX's service component type lets you document those dependencies, so governance isn't blind to the parts of your "system" that live outside your containers.

Now, let's compare CycloneDX to other SBOM standards you'll hear about.

SPDX (Software Package Data Exchange)

- Strengths: It's a Linux Foundation standard with deep traction, especially for license compliance. Legal and compliance teams love it for moving license information through CI/CD.
- Tradeoffs for AppSec: SPDX can represent dependencies and has added security-relevant fields, but its heritage is compliance rather than vulnerability analysis.
Modeling external services is less natural, and a lot of AppSec tooling (like the Trivy -> Dependency-Track workflow we used) is tuned for CycloneDX. If your primary goal is security visibility and CVE triage for containerized AI apps, CycloneDX tends to be the smoother path.

SWID tags (ISO/IEC 19770-2)

- Strengths: Vendor-provided software identification for asset management—who installed what, what version, and how it's licensed.
- Tradeoffs: Limited open tooling, and not a great fit for layered containers or fast-moving dependency graphs. You won't get the rich, developer-centric view you need for daily AppSec in LLM environments.

And a quick reality check: package manifests and lockfiles (pip freeze, requirements.txt, package-lock.json) are useful, but they're not SBOMs. They miss OS packages, drivers, and container layers. CycloneDX gives you the whole picture.

Practically speaking, here's the loop we ran—and why CycloneDX makes it painless:

- Generate: Use Trivy to scan your AI container and spit out CycloneDX JSON. It's trivial—one line, usually under a minute.
- Ingest: Push that SBOM into Dependency-Track via the API. You get components, licenses, vulnerability scores, dependency graphs, and a clean project/version history.
- Act: Watch for new CVEs. Use VEX to mark what's not exploitable in your context. Rebuild, rescan, repeat. Automate it in CI so your SBOM stays fresh without manual babysitting.

Production note again, because it matters: the bundled Dependency-Track container is perfect for labs and demos. In production, deploy Dependency-Track with a production-grade database, persistent storage, backups, and access controls that match your enterprise standards.

Bottom line: SPDX and CycloneDX are both legitimate, widely accepted SBOM standards. If your priority is license compliance, SPDX is an excellent fit.
If your priority is application security for modern, service-heavy, containerized LLM apps, CycloneDX gives you security-first modeling, service coverage, VEX, and an ecosystem that lets you move fast without sacrificing visibility. Voila—grab Trivy, generate CycloneDX, feed Dependency-Track, and start getting signals instead of noise. Fresh installs often look green on day one, but when something changes tomorrow, you'll see it. That's the whole game: make hidden threats visible, then make them go away. If you'd like to try the lab, it's located here. If you want to check out the video of the lab instead, try this one:
Introducing F5 AI Red Team
F5 AI Red Team simulates adversarial attacks such as prompt injection and jailbreaks at unprecedented speed and scale, allowing for continuous assessment throughout the application lifecycle. It provides insights into threats and integrates with F5 AI Guardrails to convert those insights into security policies.
Key Steps to Securely Scale and Optimize Production-Ready AI for Banking and Financial Services
This article outlines three key actions banks and financial firms can take to more securely scale, connect, and optimize their AI workflows, demonstrated through a scenario of a bank taking a new AI application to production.

I Tried to Beat OpenAI with Ollama in n8n—Here’s Why It Failed (and the Bug I’m Filing)
Hey, community. I wanted to share a story about how I built the n8n Labs workflow. It watches a YouTube channel, summarizes the latest videos with AI agents, and sends a clean HTML newsletter via Gmail. In the video, I show it working flawlessly with OpenAI. But before I got there, I spent a lot of time trying to copy the same flow using open source models through Ollama with the n8n Ollama node. My results were all over the map. I really wanted this to be a great “open source first” build. I tried many local models via Ollama, tuned prompts, adjusted parameters, and re‑ran tests. The outputs were always unpredictable: sometimes I’d get partial JSON, sometimes extra text around the JSON. Sometimes fields would be missing. Sometimes it would just refuse to stick to the structure I asked for. After enough iterations, I started to doubt whether my understanding of the agent setup was off. So, I built a quick proof inside the n8n Code node. If the AI Agent step is supposed to take the XML→JSON feed and reshape it into a structured list—title, description, content URL, thumbnail URL—then I should be able to do that deterministically in JavaScript and compare. I wrote a tiny snippet that reads the entries array, grabs the media fields, and formats a minimal output. And guess what? Voila. It worked on the first try and my HTML generator lit up exactly the way I wanted. That told me two things: one, my upstream data (HTTP Request + XML→JSON) was solid; and two, my desired output structure was clear and achievable without any trickery. With that proof in hand, I turned to OpenAI. I wired the same agent prompt, the same structured output parser, and the same workflow wiring—but swapped the Ollama node for an OpenAI chat model. It worked immediately. Fast, cheap, predictable. The agent returned a perfectly clean JSON with the fields I requested. My code node transformed it into HTML. The preview looked right, and Gmail sent the newsletter just like in the demo. 
So at that point, I felt confident the approach was sound and the workflow you saw in the video was repeatable—at least with OpenAI in the loop. Where does that leave Ollama and open source models? I’m not throwing shade—I love open source, and I want this path to be great. My current belief is the failure is somewhere inside the n8n Ollama node code path. I don’t think it’s the models themselves in isolation; I think the node may be mishandling one or more of these details: how messages are composed (system vs. user); whether “JSON mode” or a grammar/format hint is being passed; token/length defaults that cause truncation; stop settings that let extra text leak into the output; or the way the structured output parser constraints are communicated. If you’ve worked with local models, you know they can follow structure very well when you give them a strict format or grammar. If the node isn’t exposing that (or is dropping it on the floor), you get variability. To make sure this gets eyes from the right folks, my intent is to file a bug with n8n for the Ollama node. I’ll include a minimal, reproducible workflow: the same RSS fetch, the same XML→JSON conversion, the same agent prompt and required output shape, and a comparison run where OpenAI succeeds and Ollama does not. I’ll share versions, logs, model names, and settings so the team can trace exactly where the behavior diverges. If there’s a missing parameter (like format: json) or a message-role mix‑up, great—let’s fix it. If it needs a small enhancement to pass a grammar or schema to the model, even better. The net‑net is simple: for AI agents inside n8n to feel predictable with Ollama, we need the node to reliably enforce structured outputs the same way the OpenAI path does. That unlocks a ton of practical automation for folks who prefer local models. In the meantime, if you’re following the lab and want a rock‑solid fallback, you can use the Code node to do the exact transformation the agent would do.
Here’s the JavaScript I wrote and tested in the workflow:

const entries = $input.first().json.feed?.entry ?? [];

function truncate(str, max) {
  if (!str) return '';
  const s = String(str).trim();
  return s.length > max ? s.slice(0, max) + '…' : s;
  // If you want total length (including …) to be max, use:
  // return s.length > max ? s.slice(0, Math.max(0, max - 1)) + '…' : s;
}

const output = entries.map(entry => {
  const g = entry['media:group'] ?? {};
  return {
    title: g['media:title'] ?? '',
    description: truncate(g['media:description'], 60),
    contentUrl: g['media:content']?.url ?? '',
    thumbnailUrl: g['media:thumbnail']?.url ?? ''
  };
});

return [{ json: { output } }];

That snippet proves the data is there and your HTML builder is fine. If OpenAI reproduces the same structured JSON as the code, and Ollama doesn’t, the issue is likely in the node’s request/response handling rather than your workflow logic. I’ll keep pushing on the bug report so we can make agents with Ollama as predictable as they need to be. Until then, if you want speed and consistency to get the job done, OpenAI works great. If you’re experimenting with open source, try enforcing stricter formats and shorter outputs—and keep an eye on what the node actually sends to the model. I’ll share updates, because I love sharing knowledge—and I want the open-source path to shine right alongside the rest of our AI, agents, n8n, Gmail, and OpenAI workflows. As always, community, if you have a resolution and can pull it off, please share!
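One way to test the "missing format: json" theory yourself is to call Ollama's chat endpoint directly with a generous client timeout, bypassing the AI node entirely. A hedged sketch (the model name and prompt are placeholders; adjust the host and port to your Ollama instance):

```shell
# Call Ollama directly with a 10-minute client-side timeout and
# JSON mode enabled, so you can compare against the node's behavior
curl --max-time 600 http://localhost:11434/api/chat \
  -H 'Content-Type: application/json' \
  -d '{
        "model": "llama3.2:3b",
        "stream": false,
        "format": "json",
        "messages": [
          {"role": "user", "content": "Return {\"ok\": true} as JSON only."}
        ]
      }'
```

If this returns clean JSON while the node's output does not, that is strong evidence the node, not the model, is the weak link.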
How I did it.....again "High-Performance S3 Load Balancing with F5 BIG-IP"
Introduction

Welcome back to the "How I did it" series! In the previous installment, we explored high-performance S3 load balancing of Dell ObjectScale with F5 BIG-IP. This follow-up builds on that foundation with BIG-IP v21.x's S3-focused profiles and how to apply them in the wild. We'll also put the external monitor to work, validating health with real PUT/GET/DELETE checks so your S3-compatible backends aren't just "up," they're truly dependable.

New S3 Profiles for the BIG-IP…..well kind of

A big part of why F5 BIG-IP excels is its advanced traffic profiles, like TCP and SSL/TLS. These profiles let you fine-tune connection behavior—optimizing throughput, reducing latency, and managing congestion—while enforcing strong encryption and protocol settings for secure, efficient data flow. Available with version 21.x, the BIG-IP now includes new S3-specific profiles (s3-tcp and s3-default-clientssl). These profiles are based on existing default parent profiles (tcp and clientssl, respectively) that have been customized or "tuned" to optimize S3 traffic. Let's take a closer look.

Anatomy of a TCP Profile

The BIG-IP includes a number of pre-defined TCP profiles that define how the system manages TCP traffic for virtual servers, controlling aspects like connection setup, data transfer, congestion control, and buffer tuning. These profiles allow administrators to optimize performance for different network conditions by adjusting parameters such as initial congestion window, retransmission timeout, and algorithms like Nagle's or Delayed ACK. The s3-tcp profile (see below) has been tweaked with respect to data transfer and congestion window sizes as well as memory management to optimize typical S3 traffic patterns (high-throughput data transfer, varying request sizes, large payloads, etc.).

Tweaking the Client SSL Profile for S3

Client SSL profiles on BIG-IP define how the system terminates and manages SSL/TLS sessions from clients at the virtual server.
They specify critical parameters such as certificates, private keys, cipher suites, and supported protocol versions, enabling secure decryption for advanced traffic handling like HTTP optimization, security policies, and iRules. The s3-default-clientssl profile has been modified (see below) from the default client SSL profile to optimize SSL/TLS settings for high-throughput object storage traffic, ensuring better performance and compatibility with S3-specific requirements.

Advanced S3-compatible health checking with EAV

Has anyone ever told you how cool BIG-IP Extended Application Verification (EAV), aka external monitors, are? Okay, I suppose "coolness" is subjective, but EAVs are objectively cool. Let me prove it to you. Health monitoring of backend S3-compatible servers typically involves making an HTTP GET request to either the exposed S3 ingest/egress API endpoint or a liveness probe. Get a 200 and all's good. Wouldn't it be cool if you could verify a backend server's health by verifying it can actually perform the operations as intended? Fortunately, we can do just that using an EAV monitor. Therefore, based on the transitive property, EAVs are cool. —mic drop

The bash script located at the bottom of the page performs health checks on S3-compatible storage by executing PUT, GET, and DELETE operations on a test object. The health check creates a temporary health check file with a timestamp, retrieves the file to verify read access, and removes the test file to clean up. If all three operations return the expected HTTP status code, the node is marked up; otherwise, the node is marked down.

Installing and using the EAV health check

Import the monitor script

Save the bash script (.sh extension), located at the bottom of this page, locally and import the file onto the BIG-IP. Log in to the BIG-IP Configuration Utility and navigate to System > File Management > External Monitor Program File List > Import. Use the file selector to navigate to and select the newly created
bash file, provide a name for the file and select 'Import'. Create a new external monitor Navigate to Local Traffic > Monitors > Create Provide a name for the monitor. Select 'External' for the type, and select the previously uploaded file for the 'External Program'. The 'Interval' and 'Timeout' settings can be modified or left at the default as desired. In addition to the backend host and port, the monitor must pass three (3) additional variables to the backend: bucket - The name of an existing bucket where the monitor can place a small text file. During the health check, the monitor will create a file, request the file and delete the file. access_key - S3-compatible access key with permissions to perform the above operations on the specified bucket. secret_key - corresponding S3-compatible secret key. Select 'Finished' to create the monitor. Associate the monitor with the pool Navigate to Local Traffic > Pools > Pool List and select the relevant backend S3 pool. Under 'Health Monitors' select the newly created monitor and move from 'Available' to the 'Active'. Select 'Update' to save the configuration. Additional Links How I did it - "High-Performance S3 Load Balancing of Dell ObjectScale with F5 BIG-IP" F5 BIG-IP v21.0 brings enhanced AI data delivery and ingestion for S3 workflows Overview of BIG-IP EAV external monitors EAV Bash Script #!/bin/bash ################################################################################ # S3 Health Check Monitor for F5 BIG-IP (External Monitor - EAV) ################################################################################ # # Description: # This script performs health checks on S3-compatible storage by # executing PUT, GET, and DELETE operations on a test object. It uses AWS # Signature Version 4 for authentication and is designed to run as a BIG-IP # External Application Verification (EAV) monitor. # # Usage: # This script is intended to be configured as an external monitor in BIG-IP. 
#   BIG-IP automatically provides the first two arguments:
#     $1 - Pool member IP address (may be IPv6-mapped format: ::ffff:x.x.x.x)
#     $2 - Pool member port number
#
#   Additional arguments must be configured in the monitor's "Variables" field:
#     bucket     - S3 bucket name
#     access_key - Access key for authentication
#     secret_key - Secret key for authentication
#
# BIG-IP Monitor Configuration:
#   Type: External
#   External Program: /path/to/this/script.sh
#   Variables:
#     bucket="your-bucket-name"
#     access_key="your-access-key"
#     secret_key="your-secret-key"
#
# Health Check Logic:
#   1. PUT    - Creates a temporary health check file with timestamp
#   2. GET    - Retrieves the file to verify read access
#   3. DELETE - Removes the test file to clean up
#   Success: All three operations return expected HTTP status codes
#   Failure: Any operation fails or times out
#
# Exit Behavior:
#   - Prints "UP" to stdout if all checks pass (BIG-IP marks pool member up)
#   - Silent exit if any check fails (BIG-IP marks pool member down)
#
# Requirements:
#   - openssl (for SHA256 hashing and HMAC signing)
#   - curl (for HTTP requests)
#   - xxd (for hex encoding)
#   - Standard bash utilities (date, cut, sed, awk)
#
# Notes:
#   - Handles IPv6-mapped IPv4 addresses from BIG-IP (::ffff:x.x.x.x)
#   - Uses AWS Signature Version 4 authentication
#   - Logs activity to syslog (local0.notice)
#   - Creates temporary files that are automatically cleaned up
#
# Author: [Gregory Coward/F5]
# Version: 1.0
# Last Modified: 12/2025
#
################################################################################

# ===== PARAMETER CONFIGURATION =====
# BIG-IP automatically provides these
HOST="$1"                       # Pool member IP (may include ::ffff: prefix for IPv4)
PORT="$2"                       # Pool member port
BUCKET="${bucket}"              # S3 bucket name
ACCESS_KEY="${access_key}"      # S3 access key
SECRET_KEY="${secret_key}"      # S3 secret key
OBJECT="${6:-healthcheck.txt}"  # Test object name (default: healthcheck.txt)

# Strip IPv6-mapped IPv4 prefix if present (::ffff:10.1.1.1 -> 10.1.1.1)
# BIG-IP may pass IPv4 addresses in IPv6-mapped format
if [[ "$HOST" =~ ^::ffff: ]]; then
    HOST="${HOST#::ffff:}"
fi

# ===== S3/AWS CONFIGURATION =====
ENDPOINT="http://$HOST:$PORT"   # S3 endpoint URL
SERVICE="s3"                    # AWS service identifier for signature
REGION=""                       # AWS region (leave empty for S3-compatible such as MinIO/Dell)

# ===== TEMPORARY FILE SETUP =====
# Create temporary file for health check upload
TMP_FILE=$(mktemp)
printf "Health check at %s\n" "$(date)" > "$TMP_FILE"

# Ensure temp file is deleted on script exit (success or failure)
trap "rm -f $TMP_FILE" EXIT

# ===== CRYPTOGRAPHIC HELPER FUNCTIONS =====

# Calculate SHA256 hash and return as hex string
# Input: stdin
# Output: hex-encoded SHA256 hash
hex_of_sha256() {
    openssl dgst -sha256 -hex | sed 's/^.* //'
}

# Sign data using HMAC-SHA256 and return hex signature
# Args: $1=hex-encoded key, $2=data to sign
# Output: hex-encoded signature
sign_hmac_sha256_hex() {
    local key_hex="$1"
    local data="$2"
    printf "%s" "$data" | openssl dgst -sha256 -mac HMAC -macopt "hexkey:$key_hex" | awk '{print $2}'
}

# Sign data using HMAC-SHA256 and return binary as hex
# Args: $1=hex-encoded key, $2=data to sign
# Output: hex-encoded binary signature (for key derivation chain)
sign_hmac_sha256_binary() {
    local key_hex="$1"
    local data="$2"
    printf "%s" "$data" | openssl dgst -sha256 -mac HMAC -macopt "hexkey:$key_hex" -binary | xxd -p -c 256
}

# ===== AWS SIGNATURE VERSION 4 IMPLEMENTATION =====

# Generate AWS Signature Version 4 for S3 requests
# Args:
#   $1 - HTTP method (PUT, GET, DELETE, etc.)
#   $2 - URI path (e.g., /bucket/object)
#   $3 - Payload hash (SHA256 of request body, or empty hash for GET/DELETE)
#   $4 - Content-Type header value (empty string if not applicable)
# Output: pipe-delimited string "Authorization|Timestamp|Host"
aws_sig_v4() {
    local method="$1"
    local uri="$2"
    local payload_hash="$3"
    local content_type="$4"

    # Generate timestamp in AWS format (YYYYMMDDTHHMMSSZ)
    local timestamp=$(date -u +"%Y%m%dT%H%M%SZ" 2>/dev/null || gdate -u +"%Y%m%dT%H%M%SZ")
    local datestamp=$(date -u +"%Y%m%d")

    # Build host header (include port if non-standard)
    local host_header="$HOST"
    if [ "$PORT" != "80" ] && [ "$PORT" != "443" ]; then
        host_header="$HOST:$PORT"
    fi

    # Build canonical headers and signed headers list
    local canonical_headers=""
    local signed_headers=""

    # Include Content-Type if provided (for PUT requests)
    if [ -n "$content_type" ]; then
        canonical_headers="content-type:${content_type}"$'\n'
        signed_headers="content-type;"
    fi

    # Add required headers (must be in alphabetical order)
    canonical_headers="${canonical_headers}host:${host_header}"$'\n'
    canonical_headers="${canonical_headers}x-amz-content-sha256:${payload_hash}"$'\n'
    canonical_headers="${canonical_headers}x-amz-date:${timestamp}"
    signed_headers="${signed_headers}host;x-amz-content-sha256;x-amz-date"

    # Build canonical request (AWS Signature V4 format)
    # Format: METHOD\nURI\nQUERY_STRING\nHEADERS\n\nSIGNED_HEADERS\nPAYLOAD_HASH
    local canonical_request="${method}"$'\n'
    canonical_request+="${uri}"$'\n\n'   # Empty query string (double newline)
    canonical_request+="${canonical_headers}"$'\n\n'
    canonical_request+="${signed_headers}"$'\n'
    canonical_request+="${payload_hash}"

    # Hash the canonical request
    local canonical_hash
    canonical_hash=$(printf "%s" "$canonical_request" | hex_of_sha256)

    # Build string to sign
    local algorithm="AWS4-HMAC-SHA256"
    local credential_scope="$datestamp/$REGION/$SERVICE/aws4_request"
    local string_to_sign="${algorithm}"$'\n'
    string_to_sign+="${timestamp}"$'\n'
    string_to_sign+="${credential_scope}"$'\n'
    string_to_sign+="${canonical_hash}"

    # Derive signing key using HMAC-SHA256 key derivation chain
    #   kSecret  = HMAC("AWS4" + secret_key, datestamp)
    #   kRegion  = HMAC(kSecret, region)
    #   kService = HMAC(kRegion, service)
    #   kSigning = HMAC(kService, "aws4_request")
    local k_secret
    k_secret=$(printf "AWS4%s" "$SECRET_KEY" | xxd -p -c 256)
    local k_date
    k_date=$(sign_hmac_sha256_binary "$k_secret" "$datestamp")
    local k_region
    k_region=$(sign_hmac_sha256_binary "$k_date" "$REGION")
    local k_service
    k_service=$(sign_hmac_sha256_binary "$k_region" "$SERVICE")
    local k_signing
    k_signing=$(sign_hmac_sha256_binary "$k_service" "aws4_request")

    # Calculate final signature
    local signature
    signature=$(sign_hmac_sha256_hex "$k_signing" "$string_to_sign")

    # Return authorization header, timestamp, and host header (pipe-delimited)
    printf "%s|%s|%s" \
        "${algorithm} Credential=${ACCESS_KEY}/${credential_scope}, SignedHeaders=${signed_headers}, Signature=${signature}" \
        "$timestamp" \
        "$host_header"
}

# ===== HTTP REQUEST FUNCTION =====

# Execute HTTP request using curl with AWS Signature V4 authentication
# Args:
#   $1 - HTTP method (PUT, GET, DELETE)
#   $2 - Full URL
#   $3 - Authorization header value
#   $4 - Timestamp (x-amz-date header)
#   $5 - Host header value
#   $6 - Payload hash (x-amz-content-sha256 header)
#   $7 - Content-Type (optional, empty for GET/DELETE)
#   $8 - Data file path (optional, for PUT with body)
# Output: HTTP status code (e.g., 200, 404, 500)
do_request() {
    local method="$1"
    local url="$2"
    local auth="$3"
    local timestamp="$4"
    local host_header="$5"
    local payload_hash="$6"
    local content_type="$7"
    local data_file="$8"

    # Build curl command with required headers
    local cmd="curl -s -o /dev/null --connect-timeout 5 --write-out %{http_code} \"$url\""
    cmd="$cmd -X $method"
    cmd="$cmd -H \"Host: $host_header\""
    cmd="$cmd -H \"x-amz-date: $timestamp\""
    cmd="$cmd -H \"x-amz-content-sha256: $payload_hash\""

    # Add optional headers
    [ -n "$content_type" ] && cmd="$cmd -H \"Content-Type: $content_type\""
    cmd="$cmd -H \"Authorization: $auth\""
    [ -n "$data_file" ] && cmd="$cmd --data-binary @\"$data_file\""

    # Execute request and return HTTP status code
    eval "$cmd"
}

# ===== MAIN HEALTH CHECK LOGIC =====

# ===== STEP 1: PUT (Upload Test Object) =====
# Calculate SHA256 hash of the temp file content
UPLOAD_HASH=$(openssl dgst -sha256 -binary "$TMP_FILE" | xxd -p -c 256)
CONTENT_TYPE="application/octet-stream"

# Generate AWS Signature V4 for PUT request
SIGN_OUTPUT=$(aws_sig_v4 "PUT" "/$BUCKET/$OBJECT" "$UPLOAD_HASH" "$CONTENT_TYPE")
AUTH_PUT=$(cut -d'|' -f1 <<< "$SIGN_OUTPUT")
DATE_PUT=$(cut -d'|' -f2 <<< "$SIGN_OUTPUT")
HOST_PUT=$(cut -d'|' -f3 <<< "$SIGN_OUTPUT")

# Execute PUT request (expect 200 OK)
PUT_STATUS=$(do_request "PUT" "$ENDPOINT/$BUCKET/$OBJECT" "$AUTH_PUT" "$DATE_PUT" "$HOST_PUT" "$UPLOAD_HASH" "$CONTENT_TYPE" "$TMP_FILE")

# ===== STEP 2: GET (Download Test Object) =====
# SHA256 hash of empty body (for GET requests with no payload)
EMPTY_HASH="e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

# Generate AWS Signature V4 for GET request
SIGN_OUTPUT=$(aws_sig_v4 "GET" "/$BUCKET/$OBJECT" "$EMPTY_HASH" "")
AUTH_GET=$(cut -d'|' -f1 <<< "$SIGN_OUTPUT")
DATE_GET=$(cut -d'|' -f2 <<< "$SIGN_OUTPUT")
HOST_GET=$(cut -d'|' -f3 <<< "$SIGN_OUTPUT")

# Execute GET request (expect 200 OK)
GET_STATUS=$(do_request "GET" "$ENDPOINT/$BUCKET/$OBJECT" "$AUTH_GET" "$DATE_GET" "$HOST_GET" "$EMPTY_HASH" "" "")

# ===== STEP 3: DELETE (Remove Test Object) =====
# Generate AWS Signature V4 for DELETE request
SIGN_OUTPUT=$(aws_sig_v4 "DELETE" "/$BUCKET/$OBJECT" "$EMPTY_HASH" "")
AUTH_DEL=$(cut -d'|' -f1 <<< "$SIGN_OUTPUT")
DATE_DEL=$(cut -d'|' -f2 <<< "$SIGN_OUTPUT")
HOST_DEL=$(cut -d'|' -f3 <<< "$SIGN_OUTPUT")

# Execute DELETE request (expect 204 No Content)
DEL_STATUS=$(do_request "DELETE" "$ENDPOINT/$BUCKET/$OBJECT" "$AUTH_DEL" "$DATE_DEL" "$HOST_DEL" "$EMPTY_HASH" "" "")

# ===== LOG RESULTS =====
# Log all operation results for troubleshooting
#logger -p local0.notice "S3 Monitor: PUT=$PUT_STATUS GET=$GET_STATUS DEL=$DEL_STATUS"

# ===== EVALUATE HEALTH CHECK RESULT =====
# BIG-IP considers the pool member "UP" only if this script prints "UP" to stdout
# Check if all operations returned expected status codes:
#   PUT:    200 (OK)
#   GET:    200 (OK)
#   DELETE: 204 (No Content)
if [ "$PUT_STATUS" -eq 200 ] && [ "$GET_STATUS" -eq 200 ] && [ "$DEL_STATUS" -eq 204 ]; then
    #logger -p local0.notice "S3 Monitor: UP"
    echo "UP"
fi

# If any check fails, script exits silently (no "UP" output)
# BIG-IP will mark the pool member as DOWN
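Before deploying, you can sanity-check the script's crypto helpers on any box with openssl installed, no BIG-IP or S3 endpoint required. The snippet below is a standalone sketch: the two functions are copied verbatim from the script, and the expected digests are well-known SHA-256 and HMAC-SHA-256 test values (the empty-payload hash is the same constant the script uses for EMPTY_HASH).

```shell
# Standalone sanity check for the monitor's crypto helpers.
# Functions copied verbatim from the monitor script above.
hex_of_sha256() {
    openssl dgst -sha256 -hex | sed 's/^.* //'
}

sign_hmac_sha256_hex() {
    local key_hex="$1"
    local data="$2"
    printf "%s" "$data" | openssl dgst -sha256 -mac HMAC -macopt "hexkey:$key_hex" | awk '{print $2}'
}

# SHA-256 of an empty payload: must match the script's EMPTY_HASH constant
printf "" | hex_of_sha256
# e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

# HMAC-SHA256 with key "key" (hex 6b6579) over the classic test sentence
sign_hmac_sha256_hex "6b6579" "The quick brown fox jumps over the lazy dog"
# f7bc83f430538424b13298e6aa6fb143ef4d59a14946175997479dbc2d1a3cd8
```

If either value differs, your openssl build is producing unexpected output formatting and the monitor's signatures will be rejected, so fix that before pointing the monitor at a pool.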