APIs First: Why AI Systems Are Still API Systems
AI and APIs

Over the past several years, the industry has seen an explosion of interest in large language models and AI-driven applications. Much of the discussion has focused on the models themselves: their size, their capabilities, and their apparent ability to reason, summarize, and generate content. In the process, it is easy to overlook a more fundamental reality: modern AI systems are still API systems.

Despite new abstractions and new terminology, the underlying mechanics of AI applications remain familiar. Requests are sent, responses are returned. Identities are authenticated, authorization decisions are made, data is retrieved, and actions are executed. These interactions happen over APIs, and the reliability, security, and scalability of AI systems are constrained by the same architectural principles that have always governed distributed systems.

What is new is not the presence of APIs, but the nature of the consumer calling them. In traditional systems, API consumers are deterministic: code written by engineers who read the documentation and invoke endpoints in predictable ways. In AI systems, the consumer is increasingly a model, a probabilistic component that infers behavior from schemas, chains calls dynamically, and produces traffic patterns that were not explicitly programmed. That single shift is what makes every downstream concern in this series, including MCP design, token budgets, authorization, and operations, behave differently than in traditional API platforms. Understanding this relationship is critical, not only for building AI systems, but for operating and securing them in production.

AI Applications as API Orchestration Platforms

At a high level, an AI application is best understood not as a single model invocation, but as an orchestration layer that coordinates multiple API interactions. A typical request may involve:

- A client calling an application API
- Authentication and authorization checks
- Retrieval of contextual data from internal or external services
- One or more calls to a model inference endpoint
- Follow-on tool or service calls triggered by the model's output
- Aggregation and formatting of the final response

From an architectural perspective, this is not fundamentally different from any other multi-service application. Routing, observability, traffic management, and trust boundaries remain as relevant here as in any traditional platform. What has changed is that the decision logic, meaning when to call which service and with what parameters, is increasingly driven by model output rather than static application code. That shift does not eliminate APIs. It increases their importance.

[Figure: AI Application as an Orchestration Platform]
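To make the orchestration framing concrete, here is a minimal Python sketch of the request flow above. All endpoint URLs, function names, and payload shapes are illustrative assumptions, not a real product API; the point is that every step is an ordinary HTTP call with the usual authentication, latency, and error-handling concerns.

```python
import requests

APP_TOKEN = "example-service-token"  # hypothetical credential, for illustration only

def handle_request(user_query: str) -> str:
    headers = {"Authorization": f"Bearer {APP_TOKEN}"}

    # 1. Retrieve contextual data from an internal service (an ordinary API call).
    ctx = requests.get(
        "https://internal.example.com/context",  # assumed endpoint
        params={"q": user_query}, headers=headers, timeout=5,
    ).json()

    # 2. Call the model inference endpoint. The model only "sees" this payload.
    completion = requests.post(
        "https://inference.example.com/v1/chat",  # assumed endpoint
        json={"prompt": user_query, "context": ctx}, headers=headers, timeout=60,
    ).json()

    # 3. If the model's output requests a tool, that is another API call,
    #    subject to the same authorization and error handling as any other.
    if completion.get("tool_call"):
        tool = completion["tool_call"]
        tool_result = requests.post(
            f"https://tools.example.com/{tool['name']}",  # assumed endpoint
            json=tool["arguments"], headers=headers, timeout=10,
        ).json()
        return f"{completion.get('text', '')}\n{tool_result}"

    # 4. Aggregate and format the final response.
    return completion.get("text", "")
```

Notice that the model never chooses how the calls are made; it only influences which calls happen, which is exactly why the surrounding API discipline still matters.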
Models as API Endpoints, Not Black Boxes

In production environments, models are consumed almost exclusively through APIs. Whether hosted by a third party or deployed internally, a model is exposed as an endpoint that accepts structured input and returns structured output. Treating models as API endpoints clarifies several important points. A model does not "see" your system. It receives a request payload, processes it, and returns a response. Everything the model knows about your environment arrives through an API boundary.

What distinguishes model endpoints from conventional APIs is not their interface, but their operational profile:

- Responses are frequently streamed rather than returned as a single payload, which changes how load balancers, proxies, and timeouts behave.
- Payload sizes are highly variable, with both requests and responses ranging from a few hundred bytes to many megabytes depending on context and output length.
- Rate limits are often expressed in tokens per minute rather than requests per second, which complicates capacity planning and quota enforcement.
- Self-hosted models introduce additional concerns around GPU scheduling, cold start latency, and memory pressure that do not exist for traditional stateless services.

These characteristics do not change the fundamental nature of a model as an API endpoint. They do mean that the operational assumptions built into the existing API infrastructure may not hold without adjustment.

Tools, Retrieval, and Data Access Are Still APIs

As AI systems evolve beyond simple prompt-and-response interactions, they increasingly rely on tools: databases, search systems, ticketing platforms, code repositories, and internal business services. These tools are almost always accessed through APIs.

Retrieval-augmented generation, for example, is often described as a novel AI pattern. In practice, it is a sequence of API calls:

- An embedding service is called to encode a query
- A vector database is queried for relevant results
- A document store is accessed to retrieve source material
- The retrieved data is passed to the model as context

Each step carries the usual concerns: latency, authorization, data exposure, and error handling. The model may influence when these calls occur, but it does not change their fundamental nature.
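As a minimal illustration, the RAG sequence above can be written as four plain API calls. Endpoint URLs and payload fields here are assumptions for the sketch, not any specific vendor's API.

```python
import requests

def rag_answer(query: str, auth: dict) -> str:
    # 1. Embedding service encodes the query (API call #1).
    emb = requests.post("https://embed.example.com/v1/embed",      # assumed endpoint
                        json={"input": query}, headers=auth, timeout=10).json()

    # 2. Vector database returns the nearest matches (API call #2).
    hits = requests.post("https://vectors.example.com/v1/search",  # assumed endpoint
                         json={"vector": emb["vector"], "top_k": 5},
                         headers=auth, timeout=10).json()

    # 3. Document store returns the source material (API call #3).
    docs = [requests.get(f"https://docs.example.com/v1/doc/{h['id']}",  # assumed
                         headers=auth, timeout=10).text
            for h in hits["matches"]]

    # 4. Retrieved data is passed to the model as context (API call #4).
    out = requests.post("https://inference.example.com/v1/chat",   # assumed endpoint
                        json={"prompt": query, "context": docs},
                        headers=auth, timeout=60).json()
    return out["text"]
```

Each of the four hops is an ordinary HTTP request, which is why latency budgets, authorization scopes, and response sizes at every step dominate the behavior of the overall pattern.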
Why API Design Matters More in AI Systems

If AI systems are built on APIs, why do they feel harder to manage? The answer lies in amplification. Model-driven systems tend to:

- Chain API calls dynamically
- Surface data in ways developers did not explicitly anticipate
- Expand the blast radius of a misconfigured authorization
- Increase sensitivity to payload size and response shape

A poorly designed API that returns excessive data may be tolerable in a traditional application. In an AI system, that same response can overflow context limits, leak sensitive information into prompts, or cascade into additional unintended tool calls. This amplification rarely stays within a single domain. A schema decision that looks like an application concern becomes a traffic and routing concern when responses grow unpredictably, and an authorization concern when a model uses that response to drive the next call. Design choices that were once contained within one team's scope now propagate across the stack. In this sense, AI does not introduce entirely new architectural risks. It magnifies existing ones.

Introducing MCP as an API Coordination Layer

As models gain the ability to invoke tools directly, the need for consistent, structured access to APIs becomes more pressing. This is where Model Context Protocol (MCP) enters the picture. At a conceptual level, MCP does not replace APIs. It standardizes how AI systems discover, describe, and invoke API-backed tools. MCP servers typically sit in front of existing services, exposing them in a model-friendly way while relying on the same underlying API infrastructure.

Seen through this lens, MCP is not a departure from established architecture patterns. It is an adaptation, one that acknowledges models as active participants in API-driven systems rather than passive consumers of text. But it is also the introduction of a new coordination layer, a tool plane, with its own operational, network, and security properties that do not map cleanly onto the API layer beneath it. The rest of this series examines what that means for the systems you build, run, and secure.

Looking Ahead

If AI systems are still API systems, then the familiar disciplines of API architecture, security, and operations remain essential. What changes is where decisions are made, how data flows, and how quickly small design flaws can propagate. The next article looks more closely at MCP itself, examining how it standardizes tool access on top of APIs and why treating it as a tool plane helps clarify both its power and its risks. From there, the series turns to tokens as a first-class design constraint that shapes tool schemas, response shaping, and traffic behavior. The fourth article addresses authorization and the security implications of letting models invoke tools directly, including identity, delegation, and the expanded blast radius MCP introduces. The series closes with a look at operating MCP-enabled systems in production, where reliability, cost, and safety have to be enforced rather than assumed.

Resources:

Article Series: MCP, APIs, and Tokens: Building and Securing the Tool Plane of AI Systems (Intro)
- MCP, APIs, and Tokens (Part 1 - APIs First: Why AI Systems Are Still API Systems)
- MCP, APIs, and Tokens (Part 2 - MCP as the Tool Plane: Standardizing Access Across APIs)
- MCP, APIs, and Tokens (Part 3 - Tokens as a Design Constraint for MCP and APIs)
- MCP, APIs, and Tokens (Part 4 - Securing the Tool Plane: MCP, APIs, and Authorization)
- MCP, APIs, and Tokens (Part 5 - Designing for the Inference Track: Safe, Scalable MCP Systems)

BIG-IP Cloud-Native Network Functions 2.3: What's New in CNF and BNK
Introduction

F5 BIG-IP continues to advance BIG-IP Next for Kubernetes (BNK) and Cloud-Native Network Functions (CNFs) to meet the growing demands of service providers and modern application environments. F5 provides the full stack required to make cloud-native networking work in a service provider environment. CNFs alone are not enough; you need functions, control, infrastructure, and observability working together as one system.

What is new in BIG-IP Cloud-Native Edition 2.3 for BNK and CNF?

Release 2.3 adds MPLS provider edge support (early access), native UDP/TCP load balancing, DPU-accelerated data plane offload on NVIDIA BlueField, and subscriber-aware Policy Enforcement Manager (PEM) with Gx interface integration. It also introduces VRF-aware AFM policies, GSLB with Sync Groups for multi-region deployments, BBRv2 congestion control, and crash diagnostics that operate without host-level Kubernetes access. This release targets service providers and telecom operators who need cloud-native networking without sacrificing the protocol support and policy control of traditional infrastructure.

Cloud-Native Network Functions (CNFs): What Changed in 2.3?

In release 2.3, CNF capabilities focus on strengthening the underlying network functions required for service provider deployments.

How does BIG-IP CNF 2.3 handle crash diagnostics in restricted Kubernetes clusters?

Operating CNFs in production environments requires strong observability, even in restricted clusters. Release 2.3 introduces improvements to the crash agent that allow core files to be collected directly from pods without requiring host-level access. This enables deployments in more secure Kubernetes environments and simplifies troubleshooting when issues occur.

How does BIG-IP CNF 2.3 operate in multi-tenant and multi-VRF environments?

Multi-tenant environments demand precise control over traffic behavior. In release 2.3, Advanced Firewall Manager (AFM) introduces VRF-aware ACL and NAT policies, allowing operators to apply firewall and translation rules within specific routing contexts. This enables better segmentation and supports overlapping address spaces while maintaining consistent policy enforcement. It aligns CNFs more closely with how service provider networks are designed and operated.

Can BIG-IP CNF 2.3 operate at the edge?

One of the most significant additions to this release is MPLS support within CNFs. This is currently an early-access feature and is expected to reach general availability in a future release. CNFs can now operate as provider edge nodes, supporting label-based forwarding and applying policies based on MPLS labels. This allows service providers to extend existing MPLS architectures into Kubernetes environments without requiring major redesigns. It also provides a path for replacing legacy systems with cloud-native alternatives while maintaining familiar networking constructs.

UDP and TCP Application Load Balancing

Release 2.3 introduces UDP and TCP application load balancing, expanding support beyond HTTP-based traffic management. This capability enables CNFs to handle a broader range of applications and telco protocols, including workloads that rely on Layer 4 traffic patterns. Traffic can be balanced across services both inside and outside the Kubernetes cluster, which is critical for hybrid deployments and incremental modernization efforts. These enhancements are especially important for service providers and large enterprises that operate in mixed environments.
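The defining constraint at Layer 4, as opposed to HTTP routing, is flow affinity: every packet of a UDP or TCP flow must land on the same backend. A minimal Python sketch illustrates the idea with a stable 5-tuple hash. This is purely conceptual; it is not how BNK or CNFs implement load balancing, and the backend addresses are placeholders.

```python
import hashlib

BACKENDS = ["10.0.0.11:5060", "10.0.0.12:5060", "10.0.0.13:5060"]  # placeholders

def pick_backend(src_ip: str, src_port: int,
                 dst_ip: str, dst_port: int, proto: str) -> str:
    """Stable 5-tuple hash: every packet of a given UDP/TCP flow maps to
    the same backend, the basic guarantee Layer 4 load balancing provides."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}/{proto}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return BACKENDS[digest % len(BACKENDS)]

# The same flow always resolves to the same backend.
print(pick_backend("203.0.113.7", 40000, "198.51.100.1", 5060, "udp"))
```

Because the decision is per-flow rather than per-request, telco protocols that never speak HTTP can still be balanced consistently across services inside and outside the cluster.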
These capabilities allow existing applications to continue functioning while new cloud-native components are introduced, without requiring immediate architectural changes.

Subscriber-Aware Policy Enforcement

Subscriber awareness remains a core requirement for service providers. Release 2.3 enhances Policy Enforcement Manager (PEM) with Gx interface integration, enabling real-time policy enforcement based on subscriber data. This allows traffic to be classified and controlled dynamically, supporting use cases such as QoS enforcement, traffic shaping, and content filtering. It also enables compliance with regulatory requirements and opens new opportunities for service differentiation.

Improved Observability and Aggregated Insights

As CNFs scale, visibility becomes more complex. Earlier approaches relied on per-pod metrics, which made it difficult to build a unified view of the system. Release 2.3 enhances PEM by introducing aggregation through TODA, allowing statistics and session data to be collected and presented as a single entity. Enhancements to MRFDB and PEM reporting further improve visibility into subscriber sessions and traffic behavior, giving operators a more complete and centralized view of network activity.

Building on this foundation, release 2.3 expands PEM capabilities with subscriber-aware policy enforcement. By integrating external policy systems and classification services, CNFs can now correlate traffic with subscriber identity and apply policies dynamically. This provides deeper insight into how individual subscribers and applications behave on the network, enabling more precise control and improved operational awareness.

Additional DNS visibility enhancements, such as adding dig support into netkvest, further strengthen troubleshooting capabilities. By enabling more detailed DNS query inspection and response analysis, operators can quickly diagnose resolution issues and better understand traffic patterns tied to application behavior. Together, these enhancements move CNFs beyond basic monitoring. They provide a richer, more contextual understanding of traffic, subscribers, and services, which simplifies operations and enables faster troubleshooting in large-scale environments.

DNS and Traffic Behavior Enhancements

Release 2.3 includes improvements that address real-world network behavior, particularly in how DNS and transport protocols operate at scale. One example is the handling of DNS requests in certain scenarios. Instead of silently dropping traffic, CNFs can now return NXDOMAIN responses, preventing upstream systems from interpreting the lack of response as a service failure. This improves reliability and ensures better interoperability with external DNS resolvers in distributed environments.
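The difference matters because resolvers and upstream services treat the two outcomes very differently. A minimal sketch using the dnspython library (an assumption for illustration; any resolver client behaves similarly) shows how a client distinguishes an authoritative NXDOMAIN from a silent drop that surfaces as a timeout:

```python
import dns.resolver
import dns.exception

def classify_lookup(name: str) -> str:
    try:
        dns.resolver.resolve(name, "A", lifetime=2.0)
        return "resolved"
    except dns.resolver.NXDOMAIN:
        # Authoritative answer: the name does not exist. Upstream systems
        # can fail fast and cache the negative response.
        return "nxdomain"
    except dns.exception.Timeout:
        # No response at all. Upstream systems typically retry, fail over,
        # or mark the resolver unhealthy, treating it as a service failure.
        return "timeout (looks like an outage)"

print(classify_lookup("does-not-exist.example.com"))
```

Returning NXDOMAIN keeps failure semantics explicit: the client learns the name is absent rather than inferring that the DNS service itself is down.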
In addition, support for BBRv2 congestion control improves TCP performance in challenging conditions. It provides better fairness across flows and adapts more effectively to latency and packet loss, improving overall user experience in mobile and distributed networks.

Extending to Multi-Region Traffic Management

Release 2.3 continues to expand DNS capabilities with early-access support for Global Server Load Balancing (GSLB), enabling traffic distribution across multiple locations such as data centers and cloud environments. This represents an important step toward multi-region and hybrid architectures, where applications are no longer tied to a single cluster or deployment location.

Building on this, the introduction of GSLB Sync Groups improves how configurations are managed across distributed deployments. Within a sync group, one instance is designated as the sync agent and is responsible for propagating configuration changes to other members. This approach ensures consistency across environments while preventing conflicting updates and reducing the risk of synchronization issues.

Release 2.3 also begins to introduce more intelligent traffic steering with topology-based load balancing. This capability allows traffic to be directed based on user-defined parameters such as location or network proximity. As a result, operators can optimize application delivery by sending users to the most appropriate endpoint, improving latency and overall service quality. Together, these enhancements move CNFs closer to providing a fully cloud-native, globally distributed traffic management solution that aligns with modern application deployment patterns.
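Conceptually, topology-based steering is a preference-ordered lookup with health-aware fallback. The sketch below is illustrative only; real GSLB steering is configured in the product, not coded, and the regions and endpoints are placeholders.

```python
# Illustrative topology map: client region -> preferred endpoints, in order.
TOPOLOGY = {
    "eu": ["fra.example.com", "lon.example.com"],
    "us": ["iad.example.com", "sjc.example.com"],
}
DEFAULT = ["iad.example.com"]  # global fallback

def steer(client_region: str, healthy: set[str]) -> str:
    """Pick the closest healthy endpoint, falling back across regions."""
    candidates = TOPOLOGY.get(client_region, []) + DEFAULT
    for endpoint in candidates:
        if endpoint in healthy:
            return endpoint
    raise RuntimeError("no healthy endpoints available")

# An EU client lands in London when Frankfurt is unhealthy.
print(steer("eu", {"lon.example.com", "iad.example.com"}))
```

The sync-group mechanism described above is what keeps a topology map like this consistent across every GSLB instance, so all regions answer queries from the same view of the world.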
BNK on NVIDIA BlueField DPU: What Performance Does Hardware Offload Deliver?

As Kubernetes environments scale, the limitations of CPU-based packet processing become more visible. Networking workloads compete directly with applications for resources, which can impact both performance and cost. Release 2.3 continues the expansion of BNK on NVIDIA BlueField DPUs, allowing key data plane functions to be offloaded from the host CPU. This change improves throughput and reduces latency while freeing compute resources for applications that generate business value.

The benefit is not just raw performance; it also brings predictability. With networking and security processing handled on the DPU, operators can achieve more consistent performance across distributed environments. This is especially important for AI infrastructure and high-throughput telco deployments, where even small inefficiencies can scale into significant costs.

From an operational perspective, this also simplifies infrastructure design. Separating application workloads from networking functions reduces contention and allows for more efficient scaling strategies. CNFs begin to behave less like shared software components and more like purpose-built networking systems, while still retaining the flexibility of Kubernetes.

BNK for Telco and Modern Applications

Modern environments rarely consist of purely cloud-native applications. Most organizations run a mix of legacy protocols, telco workloads, and newer microservices. Release 2.3 addresses this reality directly.

Can BNK 2.3 load balance non-HTTP protocols?

One of the most important additions to this release is TCP and UDP load balancing. This extends BNK beyond HTTP-based traffic management and enables support for telco protocols and other non-HTTP workloads. It also allows traffic to be balanced both inside and outside the Kubernetes cluster, which is critical for hybrid architectures and phased migrations.

This capability reflects a broader shift in BNK. It is no longer just an ingress layer; it is evolving into a unified traffic management platform that can handle diverse protocols and application types without forcing architectural changes. For service providers, this means they can modernize incrementally: existing applications can continue to operate while new components are introduced in Kubernetes. For enterprise environments, it provides a consistent way to manage traffic across distributed services without introducing additional tools or complexity.

Frequently Asked Questions

These questions represent the most common queries architects and operators ask when evaluating BIG-IP Cloud-Native Edition 2.3.

Q: What is new in BIG-IP Cloud-Native Edition 2.3?
A: BIG-IP CNF 2.3 adds MPLS provider edge support (early access), native UDP/TCP load balancing, DPU-accelerated data plane offload on NVIDIA BlueField, subscriber-aware PEM with Gx integration, VRF-aware AFM policies, GSLB with Sync Groups, BBRv2 congestion control, and crash diagnostics that operate without host-level Kubernetes access.

Q: What CPU savings does BNK on NVIDIA BlueField DPU deliver?
A: Validated testing (Tolly Report #226104, February 2026) showed approximately 80% host CPU reduction, 40% more output tokens per second versus HAProxy on Llama 70B, and 61% faster time to first token (TTFT). These results reflect BNK offloading data plane processing from the host CPU to the BlueField DPU, freeing host compute for application workloads.

Q: Does BIG-IP CNF 2.3 support protocols beyond HTTP and HTTPS?
A: Yes. Release 2.3 adds native UDP and TCP load balancing to both CNFs and BNK, extending traffic management beyond HTTP. This supports telco protocols such as GTP-U, Diameter, and RADIUS, with the ability to balance traffic across services inside and outside the Kubernetes cluster.

Implementing Risk-Based Actions with AI-Powered WAF: Customer Policy Paths
Why Custom Policy Is Where Risk-Based Actions Matter Most

The default policy is straightforward: it applies a broad mix of signatures, threat campaigns, and violations; "Enhance with AI" is an optional add-on. Custom policies are where customers can accidentally recreate the same problems Risk Scoring is designed to solve, usually by combining:

- Overly broad or noisy signature selection (especially low-accuracy signatures)
- Aggressive enforcement (blocking Medium too early)
- Disabling/excluding key signatures, unintentionally reducing ML invocation

So the rest of this blog is a tight, configuration-oriented walkthrough of the Custom path.

Custom Policy: Configuration Walkthrough (Decision Points → Operational Outcomes)

Baseline: navigate to the Custom controls.

1. LB Config → Web Application Firewall
2. Create/edit the WAF object (Metadata `Name`, etc.)
3. Set Security Policy = Custom
4. Choose Signature Selection by Accuracy
5. Optionally enable Enhance with AI (Risk Scoring)
6. If enabled, optionally configure Action by Risk Score (risk-based enforcement)

Step 1: Signature Selection by Accuracy (choose your baseline level)

Accuracy indicates susceptibility to false positives:

- Low: high likelihood of false positives
- Medium: some likelihood of false positives
- High: low likelihood of false positives

Note: This setting is foundational. It determines which signatures are active, and therefore the quality and volume of detection signals that feed into downstream risk evaluation. Operationally, high accuracy tends to support faster, safer enforcement, while medium/low accuracy can expand coverage but increases the chance you'll need exceptions, investigations, or staged rollout discipline.

Step 2: Enhance with AI (turn on Risk Scoring)

Enhance with AI = On enables AI-powered risk scoring and assigns each request a High/Medium/Low risk score using layered signals. Two implementation details are worth making explicit because they affect customer expectations:

- ML invocation depends on enabled signatures firing in the specified injection/execution categories.
- If teams disable or exclude those signatures, they may reduce when the model runs, changing the practical behavior of risk evaluation.

Step 3: Action by Risk Score (map risk levels to enforcement)

When Action by Risk Score is enabled:

- By default, high-risk requests are blocked
- Users can choose whether Medium-risk requests are blocked (via dropdown)

This is the primary knob that determines how quickly a user moves from "safe enforcement" to "broad enforcement." A minimal sketch of this mapping follows below.
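To make the decision logic concrete, here is a small Python sketch of the risk-to-action mapping described above. The function names and the policy structure are illustrative assumptions, not the product's actual implementation; the point is that enforcement becomes a function of the risk outcome rather than of individual signatures.

```python
from dataclasses import dataclass

@dataclass
class RiskPolicy:
    block_high: bool = True     # default when Action by Risk Score is enabled
    block_medium: bool = False  # the dropdown choice: widen enforcement later

def enforce(risk_score: str, policy: RiskPolicy) -> str:
    """Map a request's risk outcome (High/Medium/Low) to an action."""
    if risk_score == "High" and policy.block_high:
        return "block"
    if risk_score == "Medium" and policy.block_medium:
        return "block"
    return "allow"  # Low risk, or Medium while still in monitoring mode

# Day 0 posture: block High only.
day0 = RiskPolicy(block_high=True, block_medium=False)
# Steady state: block High + Medium.
steady = RiskPolicy(block_high=True, block_medium=True)

print(enforce("Medium", day0))    # allow: observe what lands in Medium first
print(enforce("Medium", steady))  # block: after confidence is established
```

Flipping `block_medium` is the entire difference between the Day 0 and steady-state postures described next, which is why the rollout path below stages that change deliberately.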
Recommended Rollout Path: Day 0 → Day 7 → Steady State

This is the most common and safest operational progression for customers.

Day 0 (safe enforcement baseline)

- Custom → Signature Selection by Accuracy = High (or High + Medium if you need broader coverage immediately)
- Enhance with AI = On
- Action by Risk Score = High

Outcome: gets you to blocking quickly while minimizing availability risk. High is blocked. This is the "prove safety while stopping obvious bad" posture.

Day 7 (controlled expansion)

- Keep Custom + Enhance with AI + Action by Risk Score
- Optionally widen Signature Selection from High → High + Medium if coverage is insufficient
- Enhance with AI = On
- Action by Risk Score = High + Medium

Outcome: expands detection inputs without immediately expanding enforcement. Teams focus on what's landing in Medium and whether exclusions or disabled signatures are reducing ML invocation in key categories.

Steady state (mature enforcement)

- Custom → signature selection set to the broadest set: widen Signature Selection from High + Medium → High + Medium + Low
- Enhance with AI = On
- Action by Risk Score = High + Medium

Outcome: risk outcomes become the enforcement interface. Broad, consistent blocking across apps/APIs with reduced per-app tuning and fewer signature-level decisions.

Common Pitfalls

- Avoid Block Medium on Day 0 when including low-accuracy signatures—this is the fastest way to recreate false-positive outages.
- If you disable/exclude signatures in the key injection/execution categories, you can reduce ML invocation and change risk evaluation behavior.

Summary

Custom policies traditionally scale poorly because every app ends up with bespoke signature decisions and exception handling. Risk Scoring is designed to invert that: keep signatures as key signals but standardize enforcement via risk outcomes. If you implement Custom with the Day 0 → Day 7 → Steady state progression above, you get a predictable path from "block safely now" to "enforce broadly later" without returning to signature-by-signature tuning as your primary operating model.

Automating F5 Application Delivery and Security Platform Deployments
The F5 ADSP Architecture Automation Project

The F5 ADSP reduces the complexity of modern applications by integrating operations, traffic management, performance optimization, and security controls into a single platform with multiple deployment options. This series outlines practical steps anyone can take to put these ideas into practice using the F5 ADSP Architectures GitHub repo. Each article highlights different deployment examples, which can be run locally or integrated into CI/CD pipelines following DevSecOps practices. The repository is community-supported and provides reference code that can be used for demos, workshops, or as a stepping stone for your own F5 ADSP deployments. If you find any bugs or have any enhancement requests, open an issue, or better yet, contribute.

The F5 Application Delivery and Security Platform (F5 ADSP)

The F5 ADSP addresses four core areas: how you operate day to day, how you deploy at scale, how you secure against evolving threats, and how you deliver reliably across environments. Each comes with its own challenges, but together they define the foundation for keeping systems fast, stable, and safe. Each architecture deployment example is designed to cover at least two of the four core areas: xOps, Deployment, Delivery, and Security. This ensures the examples demonstrate how multiple components of the platform work together in practice.

DevSecOps: Integrating security into the software delivery lifecycle is a necessary part of building and maintaining secure applications. This project incorporates DevSecOps practices by using supported APIs and tooling, with each use case including a GitHub repository containing IaC code, CI/CD integration examples, and telemetry options.

Demo: Use-Case 1: F5 Distributed Cloud WAF and BIG-IP Advanced WAF

Resources:

- F5 Application Delivery and Security Platform GitHub Repo and Automation Guide

ADSP Architecture Article Series:

- Automating F5 Application Delivery and Security Platform Deployments (Intro)
- F5 Hybrid Security Architectures (Part 1 - F5's Distributed Cloud WAF and BIG-IP Advanced WAF)
- F5 Hybrid Security Architectures (Part 2 - F5's Distributed Cloud WAF and NGINX App Protect WAF)
- F5 Hybrid Security Architectures (Part 3 - F5 XC API Protection and NGINX Ingress Controller)
- F5 Hybrid Security Architectures (Part 4 - F5 XC BOT and DDoS Defense and BIG-IP Advanced WAF)
- F5 Hybrid Security Architectures (Part 5 - F5 XC, BIG-IP APM, CIS, and NGINX Ingress Controller)
- Minimizing Security Complexity: Managing Distributed WAF Policies
The End of ClientAuth EKU…Oh Mercy…What to do?
If you've spent any time recently monitoring the cryptography and/or public key infrastructure (PKI) spaces…beyond that ever-present "post-quantum" thing, you may have read that, starting in May of 2026, Google Chrome Root Program Policy will require public certificate authorities (CAs) to stop issuing certificates with the Client Authentication Extended Key Usage (ClientAuth EKU) extension. While removing the ClientAuth EKU from TLS server certificates correctly narrows the scope of these certificates, some internal client-certificate-authenticated TLS machine-to-machine and API workloads could fail when new or renewed certificates are received from a public CA. Read more here for details and options.
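If you want to inventory which of your certificates would be affected, a quick check of the EKU extension is straightforward. A minimal sketch using Python's cryptography library (the file path is a placeholder):

```python
from cryptography import x509
from cryptography.x509.oid import ExtendedKeyUsageOID

def has_client_auth_eku(pem_path: str) -> bool:
    """Return True if the certificate carries the ClientAuth EKU."""
    with open(pem_path, "rb") as f:
        cert = x509.load_pem_x509_certificate(f.read())
    try:
        eku = cert.extensions.get_extension_for_class(x509.ExtendedKeyUsage).value
    except x509.ExtensionNotFound:
        return False  # no EKU extension at all
    return ExtendedKeyUsageOID.CLIENT_AUTH in eku

# Example: flag certs that rely on a public CA for client authentication.
print(has_client_auth_eku("server.pem"))  # placeholder path
```

Certificates where this returns True and that are issued by a public CA are the ones to plan around; one common option is moving machine-to-machine client authentication to a private or internal CA.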
Securing AI with F5 AI Guardrails and Nutanix Enterprise AI

Enterprises are rapidly moving from AI experimentation to production-grade deployments, using Kubernetes to deliver and scale AI services consistently across environments. As AI workloads become embedded in business-critical workflows, new risks emerge—particularly at runtime. Generative AI systems can expose sensitive data, generate responses to restricted topics, or be manipulated through prompt injection and jailbreak techniques during user interactions. The challenge is no longer just how to deploy AI models, but how to control AI behavior in real time—without adding operational complexity or slowing innovation.

F5 and Nutanix address this challenge together by combining Nutanix Enterprise AI (NAI) with F5 AI Guardrails. Nutanix Enterprise AI provides a Kubernetes-based platform to deploy and operate AI models and inference services at scale, while F5 AI Guardrails enforces real-time security and policy controls on AI interactions.

🎬 Watch this demo video to see how F5 and Nutanix work together to deliver secure, trustworthy, enterprise-ready AI—running side by side on Nutanix Kubernetes Platform (NKP) deployed on Nutanix infrastructure.

F5 AI Guardrails Overview

F5 AI Guardrails provides runtime security and governance for AI applications, models, and agents by enforcing policy-based controls on how AI interacts with users and data. It includes a broad set of out-of-the-box guardrails that enable organizations to immediately begin securing and governing AI interactions from day one. These built-in guardrails address common AI risks such as prompt injection, jailbreaks, PII exposure, restricted topics, and more.

In addition to built-in guardrails, F5 AI Guardrails supports custom guardrails, allowing teams to tailor policy enforcement to specific use cases. One option for defining custom guardrails is natural-language policy creation, guided by an integrated F5 AI Assistant. The AI Assistant reviews and refines guardrail definitions using F5 best practices, ensuring that descriptions are clear, concise, and consistently structured for optimal performance.

AI Guardrails simplifies AI observability by providing continuous visibility and traceability across AI interactions, with audit-ready observability, scanning, and logging to support governance and compliance. As part of the F5 Application Delivery and Security Platform (ADSP), AI Guardrails delivers runtime protection for AI models, agents, and connected data, extending F5's application delivery, security, and observability capabilities across applications and APIs.

https://www.f5.com/products/ai-guardrails

Nutanix Enterprise AI (NAI) Overview

Nutanix Enterprise AI (NAI) is a cloud-native, Kubernetes-based AI platform designed to help organizations run AI inference at scale. NAI is supported on CNCF-certified Kubernetes platforms, including Nutanix Kubernetes Platform (NKP). It enables enterprises to deploy, manage, and operate AI models, inference endpoints, and AI agents through a single, unified management experience, allowing organizations to run AI consistently wherever they operate Kubernetes while maintaining enterprise control, security, and visibility.

https://www.nutanix.com/products/nutanix-enterprise-ai

F5 and Nutanix: Better Together

Nutanix Enterprise AI provides a Kubernetes-based platform for deploying, managing, and operating AI models, inference endpoints, and AI agents, while F5 AI Guardrails provides runtime security and governance for AI interactions.
F5 AI Guardrails can run side by side with Nutanix Enterprise AI, with both solutions deployed on the same Nutanix Kubernetes Platform (NKP)-managed Kubernetes cluster, enabling them to operate on a common Kubernetes foundation. Together, F5 and Nutanix deliver secure, trustworthy, enterprise-ready AI deployments by combining an enterprise AI platform with runtime AI security—enabling organizations to move fast with AI without losing control over security, compliance, and governance.
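To illustrate the runtime-enforcement pattern conceptually (this is not F5 AI Guardrails code; the product's policies are configured through the platform rather than written by hand), a guardrail layer is essentially a policy check that sits inline between users and the model endpoint, inspecting both directions of the interaction. A simplified Python sketch, with the patterns and endpoint as assumptions:

```python
import re
import requests

# Illustrative patterns only; a real guardrail engine uses far richer detection.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US SSN-like strings
    re.compile(r"\b\d{16}\b"),             # bare card-number-like strings
]

def guarded_inference(prompt: str) -> str:
    # Inspect the prompt before it ever reaches the model.
    for pattern in PII_PATTERNS:
        if pattern.search(prompt):
            return "Request blocked by policy: possible PII in prompt."

    # Forward to the inference endpoint (placeholder URL).
    resp = requests.post("https://nai.example.com/v1/inference",  # assumed endpoint
                         json={"prompt": prompt}, timeout=60).json()
    answer = resp.get("text", "")

    # Inspect the response on the way back out as well.
    for pattern in PII_PATTERNS:
        if pattern.search(answer):
            return "Response redacted by policy: possible PII in output."
    return answer
```

The value of a product-grade guardrail layer is doing this kind of inspection at scale, on both prompts and responses, with centrally managed policies and audit-ready logging rather than ad hoc regexes.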
SSL Orchestrator Advanced Use Cases: Client Certificate Constrained Delegation (C3D) Support
Introduction

F5 BIG-IP is synonymous with "flexibility". You likely have few other devices in your architecture that provide the breadth of capabilities that come native with the BIG-IP platform. And for each and every BIG-IP product module, the opportunities to expand functionality are almost limitless. In this article series we examine the flexibility options of the F5 SSL Orchestrator in a set of "advanced" use cases.

If you haven't noticed, the world has been steadily moving toward encrypted communications. Everything from web, email, voice, video, chat, and IoT is now wrapped in TLS, and that's a good thing. The problem is, malware - that thing that creates havoc in your organization, that exfiltrates personnel records to the Dark Web - isn't stopped by encryption. TLS 1.3 and multi-factor authentication don't eradicate malware. The only reasonable way to defend against it is to catch it in the act, and an entire industry of security products is designed for just this task. But ironically, encryption makes this hard. You can't protect against what you can't see. F5 SSL Orchestrator simplifies traffic decryption and malware inspection, and dynamically orchestrates traffic to your security stack. But it does much more than that. SSL Orchestrator is built on top of F5's BIG-IP platform, and as stated earlier, abounds with flexibility.

SSL Orchestrator Use Case: Client Certificate Constrained Delegation (C3D)

Using certificates to authenticate is one of the oldest and most reliable forms of authentication. While not every application supports modern federated access or multi-factor schemes, you'll be hard-pressed to find something that doesn't minimally support authentication over TLS with certificates. And coupled with hardware tokens like smart cards, certificates can enable one of the most secure multi-factor methods available.

But certificate-based authentication has always presented a unique challenge to security architectures. Certificate "mutual TLS" authentication requires an end-to-end TLS handshake. When a server indicates a requirement for the client to submit its certificate, the client must send both its certificate and a digitally signed hash value. This hash value is signed (i.e. encrypted) with the client's private key. Should a device between the client and server attempt to decrypt and re-encrypt, it would be unable to satisfy the server's authentication request by virtue of not having access to the client's private key (to create the signed hash). This makes encrypted malware inspection complicated, often requiring a total bypass of inspection to sites that require mutual TLS authentication.
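The signed-hash mechanics above are worth seeing in miniature. The following Python sketch (using the cryptography library, with a stand-in byte string for the handshake transcript digest) shows why a middlebox cannot satisfy the server: the proof-of-possession signature can only be produced with the client's private key.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# The client proves possession of its private key by signing the handshake
# transcript. A middlebox that decrypts and re-encrypts cannot produce this
# signature, because it never holds the client's private key.
client_key = ec.generate_private_key(ec.SECP256R1())
transcript = b"example-handshake-transcript-digest"  # illustrative stand-in

signature = client_key.sign(transcript, ec.ECDSA(hashes.SHA256()))

# The server verifies against the public key in the client's certificate;
# verify() raises InvalidSignature if the signature is forged.
client_key.public_key().verify(signature, transcript,
                               ec.ECDSA(hashes.SHA256()))
print("signature verifies only with the real client's key")
```

C3D sidesteps this constraint not by forging the signature, but by having the BIG-IP mint a new client certificate, signed by a CA the local servers trust, as described next.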
Fortunately, F5 has an elegant solution to this challenge in the form of Client Certificate Constrained Delegation, affectionately referred to as "C3D". The concept is reasonably straightforward. In very much the same way that SSL forward proxy re-issues a remote server certificate to local clients, C3D can re-issue a remote client certificate to local servers. A local server can continue to enforce secure mutual TLS authentication, while allowing the BIG-IP to explicitly decrypt and re-encrypt in the middle. This presents an immediate advantage in basic load balancing, where access to the unencrypted data allows the BIG-IP greater control over persistence. In the absence of this, persistence would typically be limited to IP address affinity. But of course, access to the unencrypted data also allows the content to be scanned for malware.

C3D actually takes this concept of certificate re-signing to a higher level though. The "constrained delegation" portion of the name implies a characteristic much like Kerberos constrained delegation, where (arbitrary) attributes can be inserted into the re-signed token, like the PAC attributes in a Kerberos ticket, to inform the server about the client. Servers for their part can then simply filter on client certificates issued by the BIG-IP (to prevent direct access), and consume any additional attributes in the certificate to understand how better to handle the client.

With C3D you can maintain strong mutual TLS authentication all the way through to your servers, while allowing the BIG-IP to more effectively manage availability. And combined with SSL Orchestrator, C3D can enable decryption and inspection of content for malware inspection. This article describes how to configure SSL Orchestrator to enable C3D for inbound decrypted inspection. Arguably, most of what follows is the C3D configuration itself, as the integration with SSL Orchestrator is pretty simple. Note that Client Certificate Constrained Delegation (C3D) is included with Local Traffic Manager (LTM) 13.0 and beyond, but for integration with SSL Orchestrator you should be running 14.1 or later. To learn more about C3D, please see the following resources:

- K14065425: Configuring Client Certificate Constrained Delegation (C3D): https://support.f5.com/csp/article/K14065425
- Manual Chapter: SSL Traffic Management: https://techdocs.f5.com/en-us/bigip-15-1-0/big-ip-system-ssl-administration/ssl-traffic-management.html#GUID-B4D2529E-D1B0-4FE2-8C7F-C3774ADE1ED2
- SSL::c3d iRule reference (not required to use C3D, but adds powerful functionality): https://clouddocs.f5.com/api/irules/SSL__c3d.html

The integration of C3D with SSL Orchestrator involves effectively replacing the client and server SSL profiles that the SSL Orchestrator topology creates with C3D SSL profiles. This is done programmatically with an iRule, so no "non-strict" customization is required at the topology. Also note that an inbound (reverse proxy) SSL Orchestrator topology will take the form of a "gateway mode" deployment (a routed path to multiple applications) or an "application mode" deployment (a single application instance hosted at the BIG-IP). See section 2.5 of the SSL Orchestrator deployment guide for a deeper examination of gateway and application modes: https://clouddocs.f5.com/sslo-deployment-guide/. The C3D integration is only applicable to application mode deployments.

Configuration

C3D itself involves the creation of client and server SSL profiles.

Create a new Client SSL profile:

Configuration
- Certificate Key Chain: public-facing server certificate and private key. This will be the certificate and key presented to the client on the inbound request. It will likely be the same certificate and key defined in the SSL Orchestrator inbound topology.

Client Authentication
- Client Certificate: require
- Trusted Certificate Authorities: bundle that can validate the client certificate. This is a certificate bundle used to verify the client's certificate, and will contain all of the client certificate issuer CAs.
- Advertised Certificate Authorities: optional CA hints bundle. Not expressly required, but this certificate bundle is forwarded to the client during the TLS handshake to "hint" at the correct certificate, based on issuer.
Client Certificate Constrained Delegation
- Client Certificate Constrained Delegation: Enabled
- Client Fallback Certificate (new in 15.1): option to select a default client certificate if the client does not send one. This option was introduced in 15.1 and provides the means to select an alternate (local) certificate if the client does not present one. The primary use case here might be to select a "template" certificate, and use an iRule function to insert arbitrary attributes.
- OCSP: optional client certificate revocation control. This option defines an OCSP revocation provider for the client certificate.
- Unknown OCSP Response Control (new in 15.1): determines what happens when OCSP returns Unknown. If an OCSP revocation provider is selected, this option defines what to do if the response to the OCSP query is "unknown".

Create a new Server SSL profile:

Configuration
- Certificate: default.crt. The certificate and key here are used as "templates" for the re-signed client certificate.
- Key: default.key

Client Certificate Constrained Delegation
- Client Certificate Constrained Delegation: Enabled
- CA Certificate: local forging CA cert. This is the CA certificate used to re-sign the client certificate. This CA must be trusted by the local servers.
- CA Key: local forging CA key
- CA Passphrase: optional CA passphrase
- Certificate Extensions: extensions from the real client cert to be included in the forged cert. This is the list of certificate extensions to be copied from the original certificate to the re-issued certificate.
- Custom Extension: additional extensions to copy to the forged cert from the real cert (OID). This option allows you to insert additional extensions to be copied, as OID values.

Additional considerations: Under normal conditions, the BIG-IP and backend server attempt to resume existing SSL sessions, whereby the server doesn't send a Certificate Request message. The effect is that all connections to the backend server use the same forged client cert. There are two ways to get around this:

- Set a zero-length cache value in the server SSL profile, or
- Set server authentication frequency to 'always' in the server SSL profile

CA certificate considerations: A valid signing CA certificate should possess the following attributes. While it can work in some limited scenarios, a self-signed server certificate is generally not an adequate option for the signing CA.

- keyUsage: certificate extension containing "keyCertSign" and "digitalSignature" attributes
- basicConstraints: certificate extension containing "CA = true" (for Yes), marked as "Critical"

With the client and server SSL profiles built, the C3D configuration is basically done. To integrate with an inbound SSL Orchestrator topology, create a simple iRule and add it to the topology's Interception Rule configuration. Modify the SSL profile paths below to reflect the profiles you created earlier.

```
### Modify the SSL profile paths below to match real C3D SSL profiles
when CLIENT_ACCEPTED priority 250 {
    ## set clientssl
    set cmd1 "SSL::profile /Common/c3d-clientssl"
    eval $cmd1
}
when SERVER_CONNECTED priority 250 {
    ## set serverssl
    SSL::profile "/Common/c3d-serverssl"
}
```

In the SSL Orchestrator UI, either from the topology workflow or directly from the corresponding Interception Rule configuration, add the above iRule and deploy. The above iRule programmatically overrides the SSL profiles applied to the Interception Rule (virtual server), effectively enabling C3D support.
At this point, the virtual server will request a client certificate, perform revocation checks if defined, and then mint a local copy of the client certificate to pass to the backend server. Optionally, you can insert additional certificate attributes via the server SSL profile configuration, or more dynamically through additional iRule logic:

```
### Added in 15.1 - allows you to send a forged cert to the server
### irrespective of the client side authentication (ex. APM SSO),
### and insert arbitrary values
when SERVERSSL_SERVERCERT {
    ### The following options allow you to override/replace a submitted
    ### client cert. For example, a minted client certificate can be sent
    ### to the server irrespective of the client side authentication method.
    ### This certificate "template" could be defined locally in the iRule
    ### (Base64-encoded), pulled from an iFile, or some other certificate source.
    # set cert1 [b64decode "LS0tLS1a67f7e226f..."]
    # set cert1 [ifile get template-cert]

    ### In order to use a template cert, it must first be converted to DER format
    # SSL::c3d cert [X509::pem2der $cert1]

    ### Insert arbitrary attributes (OID:value)
    SSL::c3d extension 1.3.6.1.4.1.3375.3.1 "TEST"
}
```

If you've configured the above, a server behind SSL Orchestrator that requires mutual TLS authentication can receive minted client certificates from external users, and SSL Orchestrator can explicitly decrypt and pass traffic to the set of malware inspection tools. You can look at the certificate sent to the server by capturing a tcpdump between the BIG-IP and the server, then opening the capture in Wireshark:

```
tcpdump -lnni [VLAN] -Xs0 -w capture.pcap [additional filters]
```

Finally, you might be asking what to do with certificate attributes injected by C3D, and really it depends on what the server can support. The below is a basic example in an Apache config file to block a client certificate that doesn't contain your defined attribute:

```
<Directory />
    SSLRequire "HTTP/%h" in PeerExtList("1.3.6.1.4.1.3375.3.1")
    RewriteEngine on
    RewriteCond %{SSL:SSL_CLIENT_VERIFY} !=SUCCESS
    RewriteRule .? - [F]
    ErrorDocument 403 "Delegation to SPN HTTP/%h failed. Please pass a valid client certificate"
</Directory>
```

And there you have it. In just a few steps you've configured your SSL Orchestrator to integrate with Client Certificate Constrained Delegation to support mutual TLS authentication, and along the way you have hopefully recognized the immense flexibility at your command.

Updates

As of F5 BIG-IP 16.1.3, there are some new C3D capabilities:

- C3D has been updated to encode and return the commonName (CN) found in the client certificate subject field in printableString format if possible; otherwise the value will be encoded as UTF8.
- C3D has been updated to support inserting a subject commonName (CN) via the 'SSL::c3d subject commonName' command:

```
when CLIENTSSL_HANDSHAKE {
    if {[SSL::cert count] > 0} {
        SSL::c3d subject commonName [X509::subject [SSL::cert 0] commonName]
    }
}
```

- C3D has been updated to support inserting a Subject Alternative Name (SAN) and Authority Info Access (AIA) via 'SSL::c3d extension' commands:

```
when CLIENTSSL_HANDSHAKE {
    ## Insert Subject Alternative Name (SAN) value
    SSL::c3d extension SAN "DNS:*.test-client.com, IP:1.1.1.1"

    ### Insert Authority Info Access (AIA) value
    SSL::c3d extension AIA "ocsp,https://ocsp.entrust.net.com; caIssuer, https://aia.entrust.net/l1m-chain256.cer"
}
```

- C3D has been updated to add the Authority Key Identifier (AKI) extension to the client certificate if the CA certificate has a Subject Key Identifier (SKI) extension.

Another interesting use case is copying the real client certificate Subject Key Identifier (SKI) to the minted client certificate. By default, the minted client certificate will not contain an SKI value, but it's easy to configure C3D to copy the origin cert's SKI by modifying the C3D server SSL profile. In the "Custom extension" field of the C3D section, add 2.5.29.14 as an available extension.

As of F5 BIG-IP 17.1.0 (SSL Orchestrator 11.0), C3D has been integrated natively. Now, for a deployed inbound topology, the C3D SSL profiles are listed in the Protocol Settings section of the Interception Rules tab. You can replace the client and server SSL profiles created by SSL Orchestrator with C3D SSL profiles in the Interception Rules tab to support C3D. C3D support is now extended to both Gateway and Application modes.

Centralized Application Control for Distributed AI with Equinix and F5 Distributed Cloud
As AI adoption accelerates, I've been seeing a common architectural pattern emerge: centralized AI factories handling model training, with inference workloads pushed out to remote departments like public safety, healthcare, or logistics. While the execution is distributed, the operational requirements—security, performance, and policy consistency—remain very much centralized. The challenge isn't running inference at the edge; it's delivering centralized AI services to distributed consumers without introducing complex routing, fragmented security controls, or inconsistent performance between locations. This article outlines how you can address that problem using F5 Distributed Cloud (XC) Customer Edge deployed on Equinix Network Edge, with private connectivity provided by Equinix Fabric.

The Problem to Solve

From an infrastructure perspective, these environments tend to stress three things simultaneously:

- Scalability, as data volumes and inference demand grow rapidly
- Security, to protect models, APIs, and sensitive inference data
- Reliability, so performance remains consistent regardless of where requests originate

Traditional approaches often force tradeoffs—centralize everything and accept latency, or decentralize enforcement and deal with policy sprawl. What we need is centralized control with distributed execution.

Architectural Approach

Rather than building bespoke connectivity for each inference location, we'll focus on creating a repeatable edge pattern that can be deployed globally while still being governed centrally. The architecture breaks down into four core components:

Central AI Factory (Training Hub)
This is where model training and lifecycle management live. It connects to S3-compatible object storage for large-scale data ingestion and model artifacts. Importantly, it doesn't need direct exposure to every inference a consumer makes.

Equinix Fabric
Equinix Fabric provides private, low-latency connectivity between the AI factory and distributed inference locations. In this design, it effectively acts as a segment extender across regions, keeping AI traffic off the public internet while preserving predictable performance.

F5 Distributed Cloud (XC) Customer Edge
F5 XC Customer Edge (CE) instances are deployed close to inference consumers. These handle traffic management, API security, segmentation, and observability, while remaining under centralized policy control. This is where enforcement happens—consistently, everywhere.

Equinix Network Edge Marketplace
Equinix Network Edge enables rapid deployment of Customer Edge instances in new regions without waiting on physical infrastructure, which is critical when inference demand expands faster than traditional provisioning cycles.

How It Works

Inference requests are processed locally through CEs at each location. When access to centralized resources is required—such as model updates or validation—traffic traverses Equinix Fabric back to the AI factory. The key detail is that policy is defined centrally but enforced at the edge. Security controls, API protections, and segmentation rules are created once and applied uniformly, regardless of geography. That eliminates the need for custom routing logic or per-site security tuning.

Design Principles That Matter

A few principles guided the implementation:
- Centralized control, distributed execution — inference stays close to data, while governance stays centralized
- Zero Trust by default — all AI data flows are explicitly authenticated and authorized
- Elastic expansion — new regions can be brought online quickly through the Marketplace
- Integrated observability — traffic, performance, and security posture are visible across all endpoints
- Compliance-ready — isolation and segmentation support regulatory requirements like GDPR and HIPAA

When This Pattern Fits

This approach works well for organizations that need to scale AI inference across multiple regions or departments while maintaining tight operational control. It's particularly effective when inference demand grows incrementally and predictability, security, and governance matter more than ad hoc edge autonomy. If the goal is centralized governance with distributed execution, this pattern provides a clean and repeatable way to get there.

Additional Links

- F5 Distributed Cloud Services
- F5 Distributed Cloud (XC) Customer Edge
- Equinix Fabric
- Equinix Network Edge Marketplace

How I Did It - "Decoupling Access Points from CloudVision AGNI with a BIG-IP RADIUS Proxy"
Modern network access control platforms like Arista CloudVision AGNI are designed to be centralized policy engines, not edge-facing protocol endpoints. Introducing a BIG-IP as a RADIUS proxy between external access points (APs) and AGNI aligns the architecture with that design intent while solving several real-world operational, security, and scalability challenges. Read more here for details and options.
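To illustrate the proxy pattern at its simplest (in the real deployment this is a BIG-IP virtual server and pool, not custom code), a RADIUS proxy is conceptually a UDP relay: it terminates the APs' view of the server and forwards datagrams to the policy engine. A minimal single-flow Python sketch, with addresses as placeholders; a production proxy also multiplexes concurrent requests and handles RADIUS shared secrets per hop, which this sketch omits.

```python
import socket

LISTEN = ("0.0.0.0", 1812)            # where APs send RADIUS Access-Requests
AGNI = ("agni.example.com", 1812)     # placeholder for the real policy engine

def run_proxy() -> None:
    front = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    front.bind(LISTEN)
    while True:
        packet, ap_addr = front.recvfrom(4096)
        # Forward the AP's request upstream on a fresh socket, so the reply
        # can be matched back to the originating access point.
        back = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        back.settimeout(5.0)
        back.sendto(packet, AGNI)
        try:
            reply, _ = back.recvfrom(4096)
            front.sendto(reply, ap_addr)  # relay Access-Accept/Reject to the AP
        except socket.timeout:
            pass  # drop silently; the AP retransmits per normal RADIUS behavior
        finally:
            back.close()

if __name__ == "__main__":
    run_proxy()
```

The point of the decoupling is visible even in the sketch: the APs only ever see the proxy's address, so the policy engine can move, scale, or change without touching every edge device.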