application delivery
F5 AI Gateway to Strengthen LLM Security and Performance in Red Hat OpenShift AI
In my previous article, we explored how F5 Distributed Cloud (XC) API Security enhances the perimeter of AI model serving in Red Hat OpenShift AI on ROSA by protecting against threats such as DDoS attacks, schema misuse, and malicious bots. As organizations move from piloting to scaling GenAI applications, a new layer of complexity arises. Unlike traditional APIs, LLMs process free-form, unstructured inputs and return non-deterministic responses, introducing entirely new attack surfaces. Conventional web or API firewalls fall short in detecting prompt injection, data leakage, or misuse embedded within model interactions. Enter F5 AI Gateway, a solution designed to provide real-time, LLM-specific security and optimization within the OpenShift AI environment.

Understanding the AI Gateway

Industry leaders increasingly describe an AI gateway layer that sits between clients and LLM endpoints to handle dynamic prompt/response patterns, policy enforcement, and auditability. Inspired by these patterns, F5 AI Gateway brings enterprise-grade capabilities such as:

- Inspecting and Filtering Traffic: Analyzes both client requests and LLM responses to detect and mitigate threats such as prompt injection and sensitive data exposure.
- Implementing Traffic Steering Policies: Directs requests to appropriate LLM backends based on content, optimizing performance and resource utilization.
- Providing Comprehensive Logging: Maintains detailed records of all interactions for audit and compliance purposes.
- Generating Observability Data: Uses OpenTelemetry to offer insights into system performance and security events.

These capabilities ensure that AI applications are not only secure but also performant and compliant with organizational policies.

Integrated Architecture for Enhanced Security

The combined deployment of F5 Distributed Cloud API Security and F5 AI Gateway within Red Hat OpenShift AI creates a layered defense strategy:

- F5 Distributed Cloud API Security: Acts as the first line of defense, safeguarding exposed model APIs from external threats.
- F5 AI Gateway: Operates within the OpenShift AI cluster, providing real-time inspection and policy enforcement tailored to LLM traffic.

This layered design delivers multi-dimensional defense, aligning with enterprise needs for zero trust, data governance, and operational resilience.

Key Benefits of F5 AI Gateway

- Enhanced Security: Addresses the OWASP Top 10 for LLM Applications by mitigating risks such as prompt injection, model denial-of-service, and sensitive information disclosure.
- Performance Optimization: Employs semantic caching and intelligent routing to reduce latency and operational costs.
- Scalability and Flexibility: Supports deployment across public cloud, private cloud, and on-premises data centers.
- Comprehensive Observability: Provides detailed metrics and logs through OpenTelemetry, facilitating monitoring and compliance.

Conclusion

The rise of LLM applications requires a new architectural mindset. F5 AI Gateway complements existing security layers by focusing on content-level inspection, traffic governance, and compliance-grade visibility, and it is specifically tailored for AI inference traffic. Used with Red Hat OpenShift AI, this solution provides not just security, but also trust and control, helping organizations grow GenAI workloads responsibly. For a practical demonstration of this integration, please refer to the embedded demo video below.
If you’re planning to attend this year’s Red Hat Summit, please attend an F5 session and visit us in Booth #648.

Related Articles:
- Securing model serving in Red Hat OpenShift AI (on ROSA) with F5 Distributed Cloud API Security

Post-Quantum Cryptography: Building Resilience Against Tomorrow’s Threats
Modern cryptographic systems such as RSA, ECC (Elliptic Curve Cryptography), and DH (Diffie-Hellman) rely heavily on the mathematical difficulty of certain problems, like factoring large integers or computing discrete logarithms. However, with the rise of quantum computing, algorithms like Shor's and Grover's threaten to break these systems, rendering them insecure. Quantum computers are not yet at the scale required to break these encryption methods in practice, but their rapid development has pushed the cryptographic community to act now. This is where Post-Quantum Cryptography (PQC) comes in: a new wave of algorithms designed to remain secure against both classical and quantum attacks.

Why PQC Matters

Quantum computers exploit quantum-mechanical principles like superposition and entanglement to perform calculations that would take classical computers millennia. This threatens:

- Public-key cryptography: Algorithms like RSA rely on factoring large primes or solving discrete logarithms, problems quantum computers could crack using Shor's algorithm.
- Long-term data security: Attackers may already be harvesting encrypted data to decrypt later ("harvest now, decrypt later") once quantum computers mature.

Figure 1: Cryptography evolution

How PQC Works

The National Institute of Standards and Technology (NIST) has led a multi-year standardization effort. Here are the main algorithm families and notable examples.

Lattice-Based Cryptography

Lattice problems are believed to be hard for quantum computers, and most of the leading candidates come from this category. These schemes use complex geometric structures (lattices) where finding the shortest vector is computationally hard, even for quantum computers.

- CRYSTALS-Kyber (Key Encapsulation Mechanism)
- CRYSTALS-Dilithium (Digital Signatures)

Example: ML-KEM (formerly Kyber) establishes encryption keys using lattices but requires more data transfer (2,272 bytes vs. 64 bytes for elliptic curves).

The figure below illustrates how lattice-based cryptography works: imagine solving a maze with two maps, one public (twisted paths) and one private (shortest route). Only the private map holder can navigate efficiently.

Code-Based Cryptography

Based on the difficulty of decoding random linear codes, this family relies on error-correcting codes.

- Classic McEliece: Resistant to quantum attacks for decades.
- Pros: Very well studied and conservative.
- Cons: Very large public key sizes.

The Classic McEliece scheme hides messages by adding intentional errors only the recipient can fix. How it works:

- Key generation: Create a parity-check matrix (public key) and a secret decoder (private key).
- Encryption: Encode a message with random errors.
- Decryption: Use the private key to correct the errors and recover the message.

Figure 3: Code-Based Cryptography Illustration

Multivariate & Hash-Based

- Multivariate: Based on solving systems of multivariate quadratic equations over finite fields, a problem believed to be quantum-resistant.
- Hash-based: Uses hash functions to construct secure digital signatures. SPHINCS+ is stateless and hash-based, good for long-term digital signature security.

Challenges and Adoption

- Integration: PQC must work within existing TLS, VPN, and hardware stacks.
- Key sizes: PQC algorithms often require larger keys. For example, Classic McEliece public keys can exceed 1 MB.
- Hybrid schemes: Combining classical and post-quantum methods allows gradual adoption.
- Performance: Lattice-based methods are fast but increase bandwidth usage.
- Standardization: NIST has finalized three PQC standards (e.g., ML-KEM) and is testing others.

Organizations must start migrating now, as transitions can take decades.

Adopting PQC with BIG-IP

As of F5 BIG-IP 17.5, the BIG-IP supports the widely implemented X25519Kyber768Draft00 cipher group for client-side TLS negotiations (BIG-IP as a TLS server). Other cipher groups and capabilities will become available in subsequent releases.

Cipher walkthrough

Let's take the cipher supported in v17.5.0 (hybrid X25519_Kyber768) as an example and walk through it.

- X25519: a classical elliptic-curve Diffie-Hellman (ECDH) algorithm
- Kyber768: a post-quantum Key Encapsulation Mechanism (KEM)

The goal is to securely establish a shared secret key between the two parties using both classical and quantum-resistant cryptography.

Key exchange:

- X25519 exchange: Alice and Bob exchange X25519 public keys, and each computes a shared secret using their own private key plus the other's public key.
- Kyber768 exchange: Alice uses Bob's Kyber768 public key to encapsulate a secret, producing a ciphertext and a shared secret. Bob uses his Kyber768 private key to decapsulate the ciphertext and recover the same shared secret.

Both parties now have a classical shared secret and a post-quantum shared secret, and they combine them using a KDF (Key Derivation Function).

Why the hybrid approach is being followed:

- If quantum computers are not practical yet, X25519 provides strong classical security.
- If a quantum computer arrives, Kyber768 keeps communications secure.
- It helps organizations migrate gradually from classical to post-quantum systems.

Implementation guide

F5 published the article "Enabling Post-Quantum Cryptography in F5 BIG-IP TMOS" describing how to implement PQC on BIG-IP v17.5.

Create a new Cipher Rule

1. To create a new Cipher Rule, log in to the BIG-IP Configuration Utility and go to Local Traffic > Ciphers > Rules.
2. Select Create.
3. In the Name box, provide a name for the Cipher Rule.
4. For Cipher Suites, select any of the suites from the provided Cipher Suites list. Use ALL or DEFAULT to list all of the available suites.
5. For DH Groups, enter X25519KYBER768 to restrict negotiation to only this PQC group.
6. For Signature Algorithms, select an algorithm. For example: DEFAULT.
7. Select Finished.

Create a new Cipher Group

1. In the BIG-IP Configuration Utility, go to Local Traffic > Ciphers > Groups.
2. Select Create.
3. In the Name box, provide a name for the Cipher Group.
4. In Group Details, add the newly created Cipher Rule to the "Allow the following" box or to "Restrict the Allowed list to the following". All of the other details, including DH Group, Signature Algorithms, and Cipher Suites, will be reflected in the Group Audit as per the selected rule.
5. Select Finished.

Configure a Client SSL Profile

1. In the BIG-IP Configuration Utility, go to Local Traffic > Profiles > SSL > Client.
2. Create a new client SSL profile or edit an existing one.
3. For Ciphers, select the Cipher Group radio button and select the created group to enable Post-Quantum Cryptography for this client SSL profile.
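Once the profile is attached to a virtual server, a quick way to confirm the hybrid group is actually negotiated is an openssl s_client probe from a client machine. This is a minimal sketch only: the virtual server hostname is a placeholder, the accepted group name depends on the client's OpenSSL build (some builds accept X25519Kyber768Draft00, oqs-provider builds typically use x25519_kyber768), and only recent OpenSSL releases print the negotiated-group line.

```bash
# Probe the BIG-IP virtual server for the hybrid PQC group (group name varies by build).
# vip.example.com is a placeholder for your virtual server address.
openssl s_client -connect vip.example.com:443 -tls1_3 \
  -groups X25519Kyber768Draft00 </dev/null 2>/dev/null \
  | grep -i "Negotiated TLS1.3 group"
```

If the client build does not support the hybrid group, the handshake simply falls back to (or fails on) the classical groups permitted by the cipher group, which is itself a useful negative test.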
NGINX Support for PQC

We are pleased to announce support for Post-Quantum Cryptography (PQC) starting with NGINX Plus R33. NGINX provides PQC support using the Open Quantum Safe provider library for OpenSSL 3.x (oqs-provider). This library is available from the Open Quantum Safe (OQS) project. The oqs-provider library adds support for all post-quantum algorithms supported by the OQS project to network protocols like TLS in OpenSSL 3-reliant applications. All ciphers/algorithms provided by oqs-provider are supported by NGINX.

To configure NGINX with PQC support using oqs-provider, follow these steps:

1. Install the necessary dependencies

sudo apt update
sudo apt install -y build-essential git cmake ninja-build libssl-dev pkg-config

2. Download and install liboqs

git clone --branch main https://github.com/open-quantum-safe/liboqs.git
cd liboqs
mkdir build && cd build
cmake -GNinja -DCMAKE_INSTALL_PREFIX=/usr/local -DOQS_DIST_BUILD=ON ..
ninja
sudo ninja install

3. Download and install oqs-provider

git clone --branch main https://github.com/open-quantum-safe/oqs-provider.git
cd oqs-provider
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr/local -DOPENSSL_ROOT_DIR=/usr/local/ssl ..
make -j$(nproc)
sudo make install

4. Download and install OpenSSL with oqs-provider support

git clone https://github.com/openssl/openssl.git
cd openssl
./Configure --prefix=/usr/local/ssl --openssldir=/usr/local/ssl linux-x86_64
make -j$(nproc)
sudo make install_sw

5. Configure OpenSSL for oqs-provider in /usr/local/ssl/openssl.cnf:

openssl_conf = openssl_init

[openssl_init]
providers = provider_sect

[provider_sect]
default = default_sect
oqsprovider = oqsprovider_sect

[default_sect]
activate = 1

[oqsprovider_sect]
activate = 1

6. Generate post-quantum certificates

export OPENSSL_CONF=/usr/local/ssl/openssl.cnf

# Generate CA key and certificate
/usr/local/ssl/bin/openssl req -x509 -new -newkey dilithium3 -keyout ca.key -out ca.crt -nodes -subj "/CN=Post-Quantum CA" -days 365

# Generate server key and certificate signing request (CSR)
/usr/local/ssl/bin/openssl req -new -newkey dilithium3 -keyout server.key -out server.csr -nodes -subj "/CN=your.domain.com"

# Sign the server certificate with the CA
/usr/local/ssl/bin/openssl x509 -req -in server.csr -out server.crt -CA ca.crt -CAkey ca.key -CAcreateserial -days 365

7. Download and install NGINX Plus

8. Configure NGINX to use the post-quantum certificates

server {
    listen 0.0.0.0:443 ssl;

    ssl_certificate /path/to/server.crt;
    ssl_certificate_key /path/to/server.key;
    ssl_protocols TLSv1.3;
    ssl_ecdh_curve kyber768;

    location / {
        return 200 "$ssl_curve $ssl_curves";
    }
}
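After reloading NGINX, you can sanity-check the deployment from a client. This is a rough sketch under clear assumptions: the client tool must be linked against the same PQC-enabled OpenSSL/oqs-provider build (a stock curl or openssl will not recognize the kyber768 group or the Dilithium-signed CA), the hostname matches the illustrative certificate above, and -k skips validation of the self-signed chain.

```bash
export OPENSSL_CONF=/usr/local/ssl/openssl.cnf

# With the PQC-enabled openssl build: confirm the TLS 1.3 handshake completes using kyber768
/usr/local/ssl/bin/openssl s_client -connect your.domain.com:443 -groups kyber768 </dev/null

# If curl is built against the same OpenSSL, the test location echoes the negotiated curve
curl -vk --curves kyber768 https://your.domain.com/
```

Because the location block above returns "$ssl_curve $ssl_curves", seeing kyber768 in the response body confirms the post-quantum key exchange end to end.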
Conclusion

By adopting PQC, we can future-proof encryption against quantum threats while balancing security and practicality. While technical hurdles remain, collaborative efforts between researchers, engineers, and policymakers are accelerating the transition.

Related Content
- New Features in BIG-IP Version 17.5.0
- K000149577: Enabling Post-Quantum Cryptography in F5 BIG-IP TMOS
- F5 NGINX Plus R33 Release Now Available | DevCentral

BIG-IP APM integration with Open Policy Agent (OPA)

In this article, we explore a technical deployment in which BIG-IP APM integrates with Open Policy Agent (OPA) via the HTTP Connector to fetch client authorization information and enforce access.

Open Policy Agent (OPA)

OPA is a unified policy engine for cloud-native environments, enabling policy-as-code across infrastructure, APIs, and data. It is widely adopted across industries for cloud-native authorization and policy management:

- Used by around 50% of Fortune 500 companies (per OPA's creator) in sectors like finance, tech, and healthcare.
- Major adopters: Netflix, Goldman Sachs, Airbnb, Uber, Pinterest, and Cisco.
- Kubernetes: Integrated with Istio, Kubernetes Gatekeeper, and managed cloud platforms (EKS, AKS, GKE).

BIG-IP APM HTTP Connector

The HTTP Connector enables BIG-IP APM to post an HTTP request to an external HTTP server. This lets APM make HTTP calls from a per-request policy without the need for an iRule, for example. The typical use for an HTTP Connector is to provide access to an external API or service. For example, you can use the HTTP Connector to check a server against an external blocklist, or an external reputation engine, and then use the results in an Access Policy Manager per-request policy.

Lab environment and configurations

Lab setup:
- BIG-IP APM v15.1+
- OPA server
- Backend (API endpoint)

The BIG-IP APM HTTP Connector request shown below uses APM access session variables to fetch the client's authorization level from the OPA server.
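The original screenshot of the HTTP Connector request is not reproduced here, but the exchange it performs maps directly onto OPA's Data API: the connector POSTs a JSON input document built from APM session variables, and OPA returns the policy decision. The sketch below shows the equivalent call with curl; the policy path (authz/allow), the input fields, and the OPA hostname are illustrative and must match your own Rego package and the variables you map in the per-request policy.

```bash
# Equivalent of the HTTP Connector call, against OPA's default REST port 8181.
# Package path "authz/allow" and the input fields are illustrative only.
curl -s -X POST http://opa.example.internal:8181/v1/data/authz/allow \
  -H "Content-Type: application/json" \
  -d '{"input": {"user": "jdoe", "group": "finance", "path": "/payroll", "method": "GET"}}'
# A permit decision comes back as: {"result": true}
```

The per-request policy can then branch on the returned result to allow, deny, or step up authentication for that request.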
Related Content
- About the HTTP Connector
- Open Policy Agent Introduction

How To Secure Multi-Cloud Networking with Routing & Web Application and API Protection

Introduction

As intra-cloud networking requirements continue to proliferate, organizations are increasingly leveraging multi-cloud solutions to optimize performance, cost, and resilience. However, with this approach comes the challenge of ensuring robust security across diverse cloud and on-prem environments. Secure multi-cloud networking with advanced web application and API protection services is essential to safeguard digital assets, maintain compliance, and uphold operational integrity.

Understanding Multi-Cloud Networking

Multi-cloud networking involves orchestrating connectivity between multiple cloud platforms such as AWS, Microsoft Azure, and Google Cloud Platform, as well as on-prem and private data centers. This approach allows organizations to avoid vendor lock-in, enhance redundancy, and tailor services to specific workloads. However, managing networking and web application security across these platforms can be complex due to differing platform security models, configurations, and interfaces.

Key Components of Multi-Cloud Networking

- Inter-cloud Connectivity: Establishing secure connections between multiple cloud providers to ensure seamless data flow and application interoperability.
- Unified Management: Implementing centralized management tools to oversee network configurations, policies, and security protocols across all cloud environments.
- Automated Orchestration: Utilizing automation to provision, configure, and manage network resources dynamically, reducing manual intervention and potential errors.
- Compliance and Governance: Ensuring adherence to regulatory requirements and best practices for data protection and privacy across all cloud platforms.

Securing Multi-Cloud Environments

Security is paramount in multi-cloud networking. With multiple entry points and varying security measures across different cloud providers, organizations must adopt a comprehensive strategy to protect their assets.

Strategies for Secure Multi-Cloud Networking

- Zero Trust Architecture: Implementing a zero-trust model that continuously verifies and validates every request, irrespective of its source, to mitigate risks.
- Encryption: Utilizing advanced encryption methods for data in transit and at rest to protect against unauthorized access.
- Continuous Monitoring: Deploying monitoring tools to detect, analyze, and respond to threats in real time.
- Application Security: Using a common framework for web application and API security reduces the number of steps needed to identify and remediate security risks and misconfigurations across disparate infrastructures.

Web Application and API Protection Services

Web applications and APIs are critical components of modern digital ecosystems. Protecting these assets from cyber threats is crucial, especially in a multi-cloud environment where they may be distributed across various platforms.

Comprehensive Web Application Protection

Web Application Firewalls (WAFs) play a vital role in safeguarding web applications. They filter and monitor HTTP traffic between a web application and the internet, blocking malicious requests and safeguarding against common threats such as SQL injection, cross-site scripting (XSS), and DDoS attacks.

- Advanced Threat Detection: Employing machine learning and artificial intelligence to identify and block sophisticated attacks.
- Application Layer Defense: Providing protection at the application layer, where traditional network security measures may fall short.
- Scalability and Performance: Ensuring WAF solutions can scale and perform adequately in response to varying traffic loads and attack volumes.

Securing APIs in Multi-Cloud Environments

APIs are pivotal for integration and communication between services. Securing APIs involves protecting them from unauthorized access, misuse, and exploitation.

- Authentication and Authorization: Implementing strong authentication mechanisms such as OAuth and JWT to ensure only authorized users and applications can access APIs.
- Rate Limiting: Controlling the number of API calls to prevent abuse and ensure fair usage across consumers.
- Input Validation: Validating input data to prevent injection attacks and ensure data integrity.
- Threat Detection: Monitoring API traffic for anomalies and potential threats, and responding swiftly to mitigate risks.

Best Practices for Secure Multi-Cloud Networking

To effectively manage and secure multi-cloud networks, organizations should adhere to best practices that align with their operational and security objectives.

Adopt a Holistic Security Framework

A holistic security framework encompasses the entire multi-cloud environment, focusing on integration and coordination between different security measures across cloud platforms.

- Unified Policy Enforcement: Implementing consistent security policies across all cloud environments to ensure uniform protection.
- Regular Audits: Conducting frequent security audits to identify vulnerabilities, assess compliance, and improve security postures.
- Incident Response Planning: Developing and regularly updating incident response plans to handle potential breaches and disruptions efficiently.

Leverage Security Automation

Automation can significantly enhance security in multi-cloud environments by reducing human errors and ensuring timely responses to threats.

- Automated Compliance Checks: Using automation to continuously monitor and enforce compliance with security standards and regulations.
- Real-time Threat Mitigation: Implementing automated remediation processes to address security threats as they are detected.

Demo: Bringing it Together with F5 Distributed Cloud

Using services in Distributed Cloud, F5 brings everything together. By deploying and orchestrating connectivity between hybrid and multi-cloud environments, Distributed Cloud not only connects these environments, it also secures them with universal Web App and API Protection policies. The following solution uses Distributed Cloud Network Connect, App Connect, and Web App & API Protection to connect, deliver, and secure an application with services that exist in different cloud and on-prem environments. The accompanying video shows how each of the features in this solution comes together.

Bringing it together with Automation

Orchestrating this end to end, including the Distributed Cloud components themselves, is trivial using the combination of GitHub Workflow Actions and Terraform. The following automation workflow guide and companion article at DevCentral provide all the steps and modular code necessary to build a complete multicloud environment, manage Kubernetes clusters, and deploy the sample functional multi-site application, Arcadia Finance.

GitHub Repository: https://github.com/f5devcentral/f5-xc-terraform-examples/tree/main/workflow-guides/smcn/mcn-distributed-apps-l3

Conclusion

Secure multi-cloud networking, combined with robust web application and API protection services, is vital for organizations seeking to leverage the benefits of a multi-cloud strategy without compromising security.
By adopting comprehensive security measures, enforcing best practices, and leveraging advanced technologies, organizations can safeguard their digital assets, ensure compliance, and maintain operational integrity in a dynamic and ever-evolving cloud landscape.

Additional Resources
- Introducing Secure MCN features on F5 Distributed Cloud
- Driving Down Cost & Complexity: App Migration in the Cloud
- The App Delivery Fabric with Secure Multicloud Networking
- Scale Your DMZ with F5 Distributed Cloud Services
- Seamless Application Migration to OpenShift Virtualization with F5 Distributed Cloud
- Automate Multicloud Networking w/ Terraform: routing and app connect on F5 Distributed Cloud

BIG-IP BGP Routing Protocol Configuration And Use Cases
Is the F5 BIG-IP a router? Yes! No! Wait what? Can the BIG-IP run a routing protocol? Yes. But should it be deployed as a core router? An edge router? Stay tuned. We'll explore these questions and more through a series of common use cases using BGP on the BIG-IP... And oddly I just realized how close in typing BGP and BIG-IP are, so hopefully my editors will keep me honest. (squirrel!)

In part one we will explore the routing components on the BIG-IP and some basic configuration details to help you understand what the appliance is capable of. Please pay special attention to some of the gotchas along the way.

Can I Haz BGP?

Ok. So your BIG-IP comes with ZebOS in order to provide routing functionality, but what happens when you turn it on? What do you need to do to get routing updates into the BGP process? And does my licensing cover it? Starting with the last question…

tmsh show /sys license | grep "Routing Bundle"

The above command will help you determine if you're going to be able to proceed, or be stymied at the bridge like the Black Knight in the Holy Grail. Fear not! There are many licensing options that already come with the routing bundle.

Enabling Routing

First and foremost, the routing protocol configuration is tied to the route-domain. What's a route-domain? I'm so glad you asked! Route-domains are separate Layer 3 route tables within the BIG-IP. There is a concept of parent and child route-domains, so while they're similar to another routing concept you may be familiar with, VRFs, they're not quite the same, but in many ways they are. Just think of them that way for now; for this context we will just say they are. Therefore, you can enable routing protocols on the individual route-domains. Each route-domain can have its own set of routing protocols, or run no routing protocols at all. By default the BIG-IP starts with just route-domain 0. And because most router guys live on the CLI, we'll walk through the configuration examples that way on the BIG-IP.

tmsh modify net route-domain 0 routing-protocol add { BGP }

So great! Now we're off and running BGP. So the world knows we're here, right? Nope. Consider what you want to advertise. The most common advertisements sourced from the BIG-IP are the IP addresses for virtual servers. Now why would I want to do that? I can just put the BIG-IP on a large subnet and it will respond to ARP requests and send gratuitous ARPs (GARPs), so that I can reach the virtual servers just fine.

<rant> Author's opinion here: I consider this one of the worst BIG-IP implementation methods. Why? Well for starters, what if you want to expand the number of virtual servers on the BIG-IP? Then you need to re-IP the network interfaces of all the devices (routers, firewalls, servers) in order to expand the subnet mask. Yuck! Don't even talk to me about secondary subnets.

Second: ARP floods! Too many times I see issues where the BIG-IP has to send a flood of GARPs, and the infrastructure, in an attempt to protect its control plane, filters/rate-limits the number of incoming requests it will accept. So engineers are left to try and troubleshoot the case of the missing GARPs.

Third: Sometimes you need to migrate applications to another BIG-IP appliance because the application grew too big for the existing infrastructure. Having it tied to this interface just leads to confusion.

I'm sure there are some corner cases where this is the best route. But I would say it's probably in the minority.
</rant>

I can hear you all now… "So what do you propose, kind sir?" See? I can hear you... Treat the virtual servers as loopback interfaces. Then they're not tied to a specific interface. To move them you just need to start advertising the /32 from another spot. (Yes. You could statically route it too. I hear you out there wanting to show your routing chops.) Also, the only GARPs are those for the self-IPs. This allows you to statically route the entire /24 to the BIG-IP's self IP address, of course, but you can also use one of them fancy routing protocols to announce the routes either individually or through a summarization.

Announcing Routes

Hear ye hear ye! I want the world to know about my virtual servers. *ahem* So, quick little tangent on BIG-IP nomenclature. The virtual server does not get announced in the routing protocol. "Well then what does?" Eerie mind reading, isn't it? Remember from BIG-IP 101, a virtual server is an IP address and port combination, and routing protocols don't do well with carrying the port across our network. So what BIG-IP object is solely an IP address construct? The virtual-address! "Wait what?" Yeah… It's a menu item I often forget is there too. But here's where you let the BIG-IP know you want to advertise the virtual-address associated with the virtual server. But… but… but… you can have multiple virtual servers tied to a single IP address (http/https/etc.) and that's where the choices for when to advertise come in.

tmsh modify ltm virtual-address 10.99.99.100 route-advertisement all

There are four states a virtual address can be in: Unknown, Enabled, Disabled, and Offline. When the virtual address is in the Unknown or Enabled state, its route will be added to the kernel routing table. When the virtual address is in the Disabled or Offline state, its route will be removed if present and will not be added if not already present. But the best part is, you can use this to only advertise the route when the virtual server and its associated pool members are all up and functioning. In simple terms we call this route health injection: based on the health of the application we will conditionally announce the route into the routing protocol. At this point, if you've followed me this far, you're probably asking what controls those conditions. I'll let the K article expand on the options a bit: https://my.f5.com/manage/s/article/K15923612

"So what does BGP have to do with popcorn?" Popcorn? Ohhhhhhhhhhh….. kernel! I see what you did there! I'm talking about the operating system kernel, silly. So when a virtual-address is in an Unknown or Enabled state and it is healthy, the route gets put in the kernel routing table. But that doesn't get it into the BGP process. Kernel routes are represented in the routing table with a 'K'. This is where the fun begins! You guessed it! Route redistribution? Route redistribution!

And to take a step back, I guess we need to get you to the ZebOS interface. To enter the router configuration CLI from the bash command line, simply type imish. In a multi-route-domain configuration you would need to supply the route-domain number, but in this case, since we're just using the default route-domain 0, we're good. It's a very similar interface to many vendors' router and switch configuration, so many of you CCIEs should feel right at home. It even still lets you do a write memory or wr mem without having to create an alias. Clearly dating myself here...
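As a rough sketch of that first hop into ZebOS, the short session below is illustrative only: the route-domain flag is needed only for non-default route-domains, and output formatting varies by TMOS version.

```bash
# From the BIG-IP bash prompt (route-domain 0 is implied; use "imish -r <rd-id>" otherwise)
imish

# Inside the ZebOS shell, kernel (tmm) routes appear flagged with "K"
show ip route
show running-config
```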
I'm not going to get into the full BGP configuration at this point, but the simplest way to get the kernel routes into the BGP process is to go under the BGP process and redistribute the kernel routes. BUT WAIT! Thar be dragons in that configuration!

First landmine, and a note about kernel routes: if you manually configure a static route on the BIG-IP via tmsh or the TMUI, those will also show up as kernel routes. Why is that concerning? Well, an example is where engineers configure a static default route on the BIG-IP via tmsh. When you redistribute kernel routes, that default route is now being advertised into BGP. Congrats! And if the BIG-IP is NOT your default gateway, hilarity ensues. And by hilarity I mean the type of laugh that comes out as you're updating your resume. The lesson here: when doing route redistribution, ALWAYS use a route filter to ensure only your intended routes or IP range make it into the routing protocol. This goes for your neighbor statements too. In both directions! You should control what routes come in and leave the device. (A minimal sketch of that filtering is shown after this section.)

Another way to have some disastrous consequences with BIG-IP routing is through summarization. If you are doing summarization, keep in mind that BGP advertises based on reachability to the networks it wants to advertise. In this case, BGP is receiving them in the form of kernel routes from tmm. But those are /32 addresses, and lots of them! Say you want to advertise a /23 summary route, but the lone virtual-address configured for route advertisement, and the only one your BGP process knows about within that range, has a monitor that fails. The summary route will be withdrawn, leaving the whole /23 stranded. Be sure to configure all your virtual-addresses within that range for advertisement.
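To make the filtering advice concrete, here is a minimal ZebOS-style sketch entered from imish configuration mode. The AS numbers, neighbor address, and the 10.99.99.0/24 virtual-address range are illustrative only; adapt the prefix-list to the ranges you actually intend to announce.

```
! Illustrative only: permit just the virtual-address range, as host routes or a summary
ip prefix-list VIP-ROUTES seq 5 permit 10.99.99.0/24 le 32
!
route-map KERNEL-TO-BGP permit 10
 match ip address prefix-list VIP-ROUTES
!
router bgp 65010
 neighbor 10.1.1.1 remote-as 65000
 neighbor 10.1.1.1 prefix-list VIP-ROUTES out
 redistribute kernel route-map KERNEL-TO-BGP
```

Filtering both at redistribution and on the neighbor outbound policy gives two chances to catch an unintended kernel route, such as a stray static default, before it ever leaves the box.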
Next: BGP Behavior In High Availability Configurations

Setting up BIG-IP with AWS CloudHSM

Recently I was working on a project and there was a requirement for using AWS CloudHSM. F5 has documented the process to install the AWS CloudHSM client in the implementation guide, but I found it light on details of what a config should look like and short on examples. So let's pick up where the article leaves you: having installed the client software, what does a working configuration look like?

BIG-IP Next for Kubernetes, addressing today's enterprise challenges
Enterprises have started adopting Kubernetes (K8s), not just cloud service providers, as it offers strategic advantages in agility, cost efficiency, security, and future-proofing:

- Cloud-native functions account for around 60% TCO savings.
- Easier to deploy, manage, maintain, and scale.
- Easier to add and roll out new services.

Kubernetes complexities

With the move from traditional application deployments to microservices and containerized services, some complexities were introduced.

Networking Challenges with Kubernetes Default Deployments

Kubernetes networking has some problems when using default settings. These problems can affect performance, security, and reliability in production environments.

Core Networking Challenges

Flat Network Model
- All pods can communicate with all other pods by default (east-west traffic)
- No network segmentation between applications
- Potential security risks from excessive inter-pod communication

Service Discovery Limitations
- DNS-based service discovery has caching behaviors that can delay updates
- No built-in load-balancing awareness (can route to unhealthy pods during updates)
- Limited traffic shaping capabilities (all requests treated equally)

Ingress Challenges
- No default ingress controller installed
- Multiple ingress controllers can conflict if not properly configured
- SSL/TLS termination requires manual certificate management

Network Policy Absence
- No network policies applied by default (allow all traffic); a minimal default-deny policy is sketched at the end of this section
- Difficult to implement zero-trust networking principles
- No default segmentation between namespaces

DNS Issues
- CoreDNS default cache settings may not be optimal
- Pod DNS policies may not match application requirements
- NodeLocal DNS cache not enabled by default

Load-Balancing Problems
- Service type ClusterIP is the default (no external access)
- NodePort services can conflict on port allocations
- Cloud provider load balancers can be expensive if overused

CNI (Container Network Interface) Considerations
- Default CNI plugin may not support required features
- Network performance varies significantly between CNI choices
- IP address management challenges at scale

Performance-Specific Issues

kube-proxy inefficiencies
- Default iptables mode becomes slow with many services
- IPVS (IP Virtual Server) mode requires explicit configuration
- Service mesh sidecars can double latency

Pod Network Overhead
- Additional hops for cross-node communication
- Encapsulation overhead with some CNI plugins
- No QoS guarantees for network traffic

Multicluster Communication
- No default solution for cross-cluster networking
- Complex to establish secure connections between clusters
- Service discovery doesn't span clusters by default

Security Challenges
- No default encryption between pods
- No default authentication for service-to-service communication
- All namespaces are network-accessible to each other by default
- External traffic can bypass ingress controllers if misconfigured

These challenges highlight why most production Kubernetes deployments require significant, complex customization beyond the default configuration. Figure 1 shows those workarounds being implemented and how complicated our setup would be, with multiple add-ons required to overcome Kubernetes limitations.
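As a small illustration of closing the "no network policies by default" gap called out above, the sketch below applies a namespace-wide default-deny policy. The namespace name is illustrative, and enforcement still depends on running a CNI that actually implements NetworkPolicy.

```bash
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: my-app            # illustrative namespace
spec:
  podSelector: {}              # matches every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
EOF
```

With this in place, traffic to and from the namespace must be explicitly re-allowed per application, which is one of the many add-ons a default cluster needs before it is production-ready.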
In the following section, we explore how BIG-IP Next for Kubernetes simplifies and enhances application delivery and security within the Kubernetes environment.

BIG-IP Next for Kubernetes

Introducing BIG-IP Next for Kubernetes not only reduces complexity, it also moves the main networking functions into the TMM pods rather than relying on the host server. Think of where current network functions are applied: the host kernel. Whether you are doing NAT or firewalling services, this requires intervention by the host side, which impacts the zero-trust architecture, and traffic performance is limited by default kernel IP and routing capabilities.

Deployment overview

Among the features introduced in the 2.0.0 release:

- API GW CRs (Custom Resources).
- F5 IPAM Controller to manage IP addresses for the Gateway resource.
- Seamless firewall policy integration in Gateway API.
- Ingress DDoS protection in Gateway API.
- Enforced access control for Debug and QKView APIs with Admin Token.

In this section, we explore the steps to deploy BIG-IP Next for Kubernetes in your environment.

Infrastructure

Use the flavor that fits your needs and lab type (demo or production); for labs, microk8s, k8s, or kind, for example.

BIG-IP Next for Kubernetes

helm and docker are required packages for this installation. Follow the installation guide; the current 2.0.0 GA release of BIG-IP Next for Kubernetes is available. For the objective of this article, you may skip the NVIDIA DOCA portion (that's the focus of a coming article) and go directly to BIG-IP Next for Kubernetes.

Install additional CRDs

Once the licensing and core pods are ready, you can move on to adding the additional CRDs (Custom Resource Definitions):

- BIG-IP Next for Kubernetes CRDs: install the BIG-IP Next for Kubernetes CRDs
- Custom CRDs: install the F5 use-case Custom Resource Definitions

Related Content
- BIG-IP Next for Kubernetes v2.0.0 Release Notes
- System Requirements
- BIG-IP Next for Kubernetes CRDs
- BIG-IP Next for Kubernetes
- BIG-IP Next SPK: a Kubernetes native ingress and egress gateway for Telco workloads
- F5 BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs
- BIG-IP Next for Kubernetes running in Amazon EKS

Introducing the F5 Application Study Tool (AST)
In the ever-evolving world of application delivery and security, gaining actionable insights into your infrastructure and applications has become more critical than ever. The Application Study Tool (AST) is designed to help technical teams and administrators leverage the power of open-source telemetry and visualization tools to enhance their monitoring, diagnostics, and analysis workflows.

Feed Your On-Premises Data into Amazon Bedrock RAG using F5 Distributed Cloud and NetApp
Retrieval-Augmented Generation (RAG) solutions have tended to require a more sophisticated approach to AI, frequently involving Python scripts to interact with LLMs and hands-on experience with libraries like Streamlit to expose user-friendly chatbot-style interfaces. In short, the technical hurdles to operationalizing a RAG solution, especially with critical built-up on-prem volumes such as those hosted on NetApp ONTAP appliances, have been anything but trivial.

Today, a turnkey, end-to-end solution exists to offer RAG-infused AI outcomes to employees or other parties. It is largely SaaS-configured, mostly a series of intuitive mouse clicks, and achieves the goal of harnessing the latest AI LLMs to leverage your own existing data via RAG. The elements of the solution include Amazon Bedrock, Amazon FSx for NetApp ONTAP, and F5 Distributed Cloud to onboard existing on-prem data volumes into the RAG solution.

RAG AI Knowledge Bases: Vastly Reduced Time to Deployment

The solution described in this article chiefly leverages two elements to quickly harness corporate data, historically on-premises, into a working RAG solution:

- NetApp ONTAP (on-premises), F5 Distributed Cloud, and Amazon FSx for NetApp ONTAP (FSxN) together provide an infrastructure solution to incorporate existing data as easily accessible content to infuse RAG with.
- NetApp BlueXP Workload Factory for AWS Automation (Workload Factory) binds Amazon Bedrock LLMs to your data, including Active Directory awareness for RAG that produces responses for users in accordance with their user and group file permissions.

The ease of this solution is demonstrated with Amazon-provided models from industry leaders like Amazon, Anthropic, Cohere, DeepSeek, and Meta. The models cover the necessities of a modern RAG implementation, including embedding, text, and vision LLMs. The majority of models are now available to be applied to your own NetApp data volumes, securely brought into your own AWS VPC through F5 Distributed Cloud. An interesting aspect of the solution is that on-premises volumes are simply SnapMirror-ed to FSxN, in your own VPC, and remain in the standard SnapMirror volume type of Data Protection (DP). RAG, leveraging these leading LLMs, creates vector embeddings directly from the DP volume, automatically, and a chat interface is immediately made available to your users. The following is a description of the elements and a demonstration of a context-aware AI reply based upon on-prem NetApp volumes.

F5 Distributed Cloud for Secure On-Prem to Amazon Connectivity

F5 Distributed Cloud (XC) is a SaaS console-configured secure application and network delivery solution. The XC Network Connect module harnesses the built-out global F5 infrastructure to provide automatic reachability between customer edge (CE) sites on-prem, in private clouds, or in enterprise tenants within an array of public cloud providers. Where WAN facilities already exist, CE sites can also be directly interconnected with those data planes; otherwise the aggregate 14 Tbps+ XC infrastructure will be harnessed. A key deliverable is that routing is automatic, and the simple troubleshooting and analytic tools required by NetOps are uniform and available as a single-pane-of-glass experience in the XC SaaS console.
The capability exists to segment an enterprise's global network assets into dispersed, managed network entities, such as network interfaces or VLANs, that are treated as walled-off communities of interest. A previous article describes network segmentation, as provided by F5 XC Network Connect, in terms of securely connecting on-prem NetApp ONTAP volumes with Amazon FSxN volumes for key tasks like SnapMirror replication, which simplifies disaster recovery. Another high-value feature enabled by the secured F5 connectivity is CacheVolumes, where globally dispersed volumes are cached within the Amazon environment as they are accessed, allowing for rapid, high-QoS access times on subsequent reads. The figure below is a simple example of four secured and isolated segments simply labeled with color names, including a CE site in the lower right with attachments to multiple segments.

Amazon's Bedrock for the Latest AI Models to Optimize RAG upon your Data

Amazon's Bedrock is a fully managed service that simplifies building and scaling generative AI applications by providing access to a variety of leading foundational models. To utilize Bedrock models, a simple AWS console access-request check box exists, with chatbot-style support of granted models normally available within a minute or two. The screenshot below demonstrates the Bedrock portion of an AWS Console session and the spot where model access is configured. Note the highlighted models that have already been granted access and the arrow indicating how easily one can add many more. (A quick CLI check of which models have been granted is sketched at the end of this section.)

The value of using your own NetApp volumes with AI is seen in the following sequence, simply for illustrative purposes.

Background: The F5 Distributed Cloud CE (Customer Edge) is the component that customers will frequently run on virtual machines or bare metal in on-premises environments, or perhaps within their cloud instances, to publish applications or facilitate network reachability. Release notes from the second half of 2024 describe enhancements to the ease of CE deployments and registration with the XC global SaaS console. Without access to this data, an LLM, especially a foundational model with a training cut-off date too distant or void of proprietary or context-specific information, may struggle or be unable to answer very specific questions around these areas. Note the very limited depth in the response to our highlighted AI request in the screenshot below. The question: how to do this easily and securely?

NetApp BlueXP Workload Factory for AWS Automation

Amazon, direct from the AWS console, does offer some degree of RAG capabilities, but they may not always be aligned with what an existing NetApp on-premises storage user is seeking. Knowledge Bases with vector stores, an Amazon term synonymous with the popular understanding of RAG, can ingest data from your AWS S3 endpoints or by crawling publicly reachable HTTPS servers; other options are being added over time, such as tie-ins to SharePoint sites. The enthusiasm of an established NetApp ONTAP corporate customer for transferring large and potentially sensitive volumes to AWS S3, over the insecure Internet, is likely muted. Instead, a solution where NetApp volumes can be SnapMirror-ed directly to a customer's own VPC, and consumed only then as part of a knowledge base, is much more enticing. This is where Workload Factory shines.
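For readers who prefer to confirm Bedrock model access from the command line rather than the console screenshot described above, the sketch below lists the foundation models visible to the account. It assumes AWS CLI v2 with credentials for the account and region where access was requested; the region name is illustrative, and model access in Bedrock is granted per region.

```bash
# List the foundation models available to this account/region
aws bedrock list-foundation-models --region us-east-1 \
  --query 'modelSummaries[].modelId' --output table
```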
Phase One of Two: Set Up Your Storage for RAG

The first step is to log into the Workload Factory UI and, from the Storage menu (menu selections are arranged vertically on the left of the screen), simply "discover" all on-prem ONTAP appliances. For brevity, only one site has been discovered in the following screenshot, located in the Seattle area. Notice how it is being accessed via private RFC 1918-style addressing. This is due to the F5 Distributed Cloud Network Connect module, which securely ties together, through private Layer 3 reachability, the customer's AWS VPC and every on-premises site around the world with ONTAP appliances.

Workload Factory in the image has discovered, on the customer premises, a version 9.16 ONTAP appliance and all volumes configured there on storage virtual machine 0 (SVM0). By simply clicking the "Replicate" button next to each volume of interest, one can create a SnapMirror-ed volume on FSx for NetApp ONTAP (FSxN) in the local VPC. This is what will inform the RAG knowledge base. The Replicate button will be available for use as soon as an FSxN file system is created, something we will now do using the adjacent tab in Workload Factory. Simply click over to the "FSx for ONTAP" tab from the current "On-Premises ONTAP" tab.

Beyond configuring an existing FSxN file system, Workload Factory takes ease of use to the next level. It allows one to add an entirely new FSxN instance from the current portal, as opposed to requiring a transition to the AWS Console FSxN module. Here we see a sample screenshot of adding a new FSxN file system. The entire creation process, using AWS credentials that are customizable, takes minutes and is performed in this single web form. Only some features are called out in the provided image. Other features exist, such as opting for a "Scale Out" approach where HA pairs of nodes are created in large quantities for huge capacities. Scale-out, as described here, offers significantly higher performance and capacity than a "scale-up" approach by distributing workloads across multiple file servers, while scale-up is suitable for general-purpose workloads with lower performance demands. A scale-up deployment, which is used in this project, spreads storage across multiple AWS Availability Zones and is as simple as toggling a single checkbox.

To prepare the relationship between the FSxN file system and the on-premises ONTAP cluster, only a single aspect of the workflow briefly involved the ONTAP CLI; all other aspects were mouse-click-driven from Workload Factory. The one CLI use case involved peering the two clusters to each other; it is described within this guide. The cluster peering involved issuing this command from the FSx cluster and the on-premises cluster via SSH, providing the two Inter-Cluster LIF IP addresses used at each end, referred to as source_inter_[1,2] and dest_inter_[1,2]:

(From FSxN cluster SSH session)
#cluster peer create -address-family ipv4 -peer-addrs source_inter_1,source_inter_2

(From ONTAP on-premises SSH session)
#cluster peer create -address-family ipv4 -peer-addrs dest_inter_1,dest_inter_2

(From ONTAP on-premises SSH session)
#cluster peer show
Peer Cluster Name  Availability  Authentication
-----------------  ------------  --------------
FSx-Dest           Available     ok

The peer relationship came up within seconds (available and authentication "ok"), after which all further steps were conducted in Workload Factory.
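For reference, the Workload Factory "Replicate" button is automating roughly the following ONTAP steps. This is a hedged sketch only: the source SVM and volume names match this article's example, but the destination SVM name, peer-cluster name, and policy are illustrative, and Workload Factory normally creates the DP-type destination volume and handles these steps for you.

```
# Run from the FSxN (destination) cluster; assumes the DP-type destination volume already exists
vserver peer create -vserver fsxsvm -peer-vserver svm0 -peer-cluster OnPrem-Cluster -applications snapmirror
snapmirror create -source-path svm0:companyAstorage -destination-path fsxsvm:companyAstorage_copy -policy MirrorAllSnapshots
snapmirror initialize -destination-path fsxsvm:companyAstorage_copy
snapmirror show -destination-path fsxsvm:companyAstorage_copy -fields state,status,lag-time
```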
The following shows our source on-premises sample volume, filled with our RAG corpus of documents, SnapMirror-ed to FSxN after clicking the "Replicate" button in the GUI. The volume contains various files that will help us prove out that RAG is working, as it contains information not presently in the foundational model's knowledge base. The volume is titled "companyAstorage". An examination of the volume now available for RAG within our VPC on FSxN, named "companyAstorage_copy", indicates that, as expected, it is of type "DP" (Data Protection). With our data in place, we now move on to setting up an embedding and generative text LLM from Amazon's wide offering and conducting a RAG demonstration.
A key deliverable of this complete solution is “time to real, demonstrable AI value”, unlike self-hosted and often programmatic solutions requiring Python skills and knowledge around RAG, the default settings in this case will often be sufficient to start using the solution right away. When necessary, things like file types in your volumes can be pruned from RAG, but by default note the wide selection of textual formats, including pdf, docx, and parquet formats are available. This file parsing service is a critical part of the pre-processing data pipeline that all RAG implementations must deliver upon but which are easily included here as part of the offering. Graphical formats such as .jpeg and .png are also enabled out of the box. Outcome: A Test of RAG Against Our Data As seen in our original test of Amazon with Anthropic Haiku 3, the foundational model struggled with detailed questions about the F5 Distributed Cloud solution. To validate RAG against our own data, a volume on-premises (“companyAstorage”) was filled, using NFS as an access protocol, with recent F5 Distributed Cloud release notes. These notes are available from the product’s subscribers’ portal. The release note file names generally do not tip off what they refer to. In this case, all were in .pdf format and included: We can now ask the same question as earlier, “How has F5 Distributed Cloud Services streamlined CE registration?”. Double-click to enlarge image of successful RAG outcome below. We see the fruits of NetApp Workload Factory, F5 Distributed Cloud for secure access to on-premises volumes, and the AI-wide capabilities of Amazon’s Bedrock. The same LLM, which previously was not empowered to deliver specifics on our question, now has successfully done so through interpreting our private data, and gone ahead and provided tactical and usable feedback. Interesting power-user tip: clicking on the attributions does not simply open the source documents, but reveals the individual chunks from those documents that led to the AI chatbot response. Summary and Further Exploration Candidates In this demonstration, we have seen the ease and speed to incorporate existing NetApp ONTAP appliance-based volumes with the tremendous breadth of AI offered by Amazon’s Bedrock. The secure access of enterprise data was accomplished by F5 Distributed Cloud secure connectivity between enterprise AWS VPCs and a physical office in the Seattle area. Amazon’s FSx NetApp for ONTAP (FSxN) offering was configured through the new Workload Factory SaaS console, and volumes required for RAG were replicated securely from on-premises to FSxN. With the data made securely accessible and models subscribed to, NetApp BlueXP Workload Factory corralled the storage and AI elements to produce a simple chatbot interface that produced meaningful responses that RAG can now produce, with our data being the key ingredient. Other explorations beyond this initial setup would certainly include user-aware RAG. Incorporate an on-premises Active Directory, which traditionally allows an access protocol like server message block (SMB) and its inherent elements of RBAC, to consider user and group memberships prior to granting file access. This allows RAG to tie the data chunks that will augment customer AI inferences to Active Directory permissions. A user, in our example documented, not granted access to the sample .pdf release note documents, would incur a RAG result that would instead simply rely upon the other documents or even the model’s base knowledge. 
Another area to expand upon is multi-modal RAG. This is the ability to “chat” with the non-text elements of your documents, documents, which include images, video, or audio. Consider, purely as an example, a RAG response with this described solution that considers document images of bar graphs when showing, say, quarter-over-quarter sales results, and formulates richer AI replies with this content. Unlike a data science project in the past, where GitHub repositories, Jupiter Notebooks, or Python freeform coding exercises might all be required, with the solution offered by NetApp, F5, and Amazon, the RAG solution described simply needs your volumes containing your proprietary data and RAG will produce the results.102Views0likes0Comments