AI Gateway
Just Announced! Attend a lab and receive a Raspberry Pi
Have a Slice of AI from a Raspberry Pi

Services such as ChatGPT have made accessing Generative AI as simple as visiting a web page. Whether at work or at home, there are advantages to channeling your user base (or your family, in the at-home case) through a central point where you can apply safeguards to their usage.

In this lab, you will learn how to:
- Deliver centralized AI access through something as basic as a Raspberry Pi
- Apply basic methods for safeguarding AI
- Recognize how users might circumvent basic safeguards
- Deploy additional services from F5 to enforce broader enterprise policies

Register Here

This lab takes place in an F5 virtual lab environment. Participants who complete the lab will receive a Raspberry Pi* to build the solution in their own environment.

*Limited stock. Raspberry Pi is exclusive to this lab. To qualify, complete the lab and join a follow-up call with F5.

F5 AI Gateway to Strengthen LLM Security and Performance in Red Hat OpenShift AI
In my previous article, we explored how F5 Distributed Cloud (XC) API Security hardens the perimeter of AI model serving in Red Hat OpenShift AI on ROSA, protecting against threats such as DDoS attacks, schema misuse, and malicious bots. As organizations move from piloting to scaling GenAI applications, a new layer of complexity arises. Unlike traditional APIs, LLMs process free-form, unstructured inputs and return non-deterministic responses, introducing entirely new attack surfaces. Conventional web or API firewalls fall short in detecting prompt injection, data leakage, or misuse embedded within model interactions. Enter F5 AI Gateway: a solution designed to provide real-time, LLM-specific security and optimization within the OpenShift AI environment.

Understanding the AI Gateway

Industry leaders are converging on an AI Gateway layer that sits between clients and LLM endpoints, handling dynamic prompt/response patterns, policy enforcement, and auditability. Following these patterns, F5 AI Gateway brings enterprise-grade capabilities such as:

- Inspecting and filtering traffic: Analyzes both client requests and LLM responses to detect and mitigate threats such as prompt injection and sensitive data exposure.
- Implementing traffic-steering policies: Directs requests to appropriate LLM backends based on content, optimizing performance and resource utilization.
- Providing comprehensive logging: Maintains detailed records of all interactions for audit and compliance purposes.
- Generating observability data: Uses OpenTelemetry to offer insight into system performance and security events.

These capabilities ensure that AI applications are not only secure but also performant and compliant with organizational policies.
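To make the inspect-and-filter idea concrete, here is a minimal sketch of a request/response filter. The regex patterns, function names, and the naive SSN redaction are invented for illustration; a production gateway such as F5 AI Gateway uses trained processors rather than a static pattern list.

```python
import re

# Illustrative patterns only; a real gateway relies on trained
# classifiers, not a hand-written regex list.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?system prompt", re.IGNORECASE),
]

# Naive PII example: US Social Security number shape.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def inspect_prompt(prompt: str) -> tuple[bool, str]:
    """Return (allowed, reason) for an incoming client prompt."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(prompt):
            return False, "possible prompt injection"
    return True, "ok"


def redact_response(text: str) -> str:
    """Mask sensitive data before the LLM response reaches the client."""
    return SSN_PATTERN.sub("[REDACTED]", text)
```

The same two hooks (one on the request path, one on the response path) generalize to any inspection policy the gateway enforces.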
Integrated Architecture for Enhanced Security

The combined deployment of F5 Distributed Cloud API Security and F5 AI Gateway within Red Hat OpenShift AI creates a layered defense strategy:

- F5 Distributed Cloud API Security: Acts as the first line of defense, safeguarding exposed model APIs from external threats.
- F5 AI Gateway: Operates within the OpenShift AI cluster, providing real-time inspection and policy enforcement tailored to LLM traffic.

This layered design delivers multi-dimensional defense, aligning with enterprise needs for zero trust, data governance, and operational resilience.

Key Benefits of F5 AI Gateway

- Enhanced security: Mitigates risks outlined in the OWASP Top 10 for LLM Applications, such as prompt injection (LLM01), by detecting malicious prompts, enforcing system-prompt guardrails, and identifying repetition-based exploits, delivering contextual, Layer 8 protection.
- Performance optimization: Boosts efficiency through intelligent, context-aware routing and endpoint abstraction, simplifying integration across multiple LLMs.
- Scalability and flexibility: Supports deployment across public cloud, private cloud, and on-premises data centers.
- Comprehensive observability: Provides detailed metrics and logs through OpenTelemetry, facilitating monitoring and compliance.

Conclusion

The rise of LLM applications requires a new architectural mindset. F5 AI Gateway complements existing security layers with content-level inspection, traffic governance, and compliance-grade visibility, tailored specifically for AI inference traffic. Paired with Red Hat OpenShift AI, it delivers not only security but also trust and control, helping organizations scale GenAI workloads responsibly. For a practical demonstration of this integration, please refer to the embedded demo video below. If you're planning to attend this year's Red Hat Summit, please attend an F5 session and visit us at Booth #648.
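The context-aware routing mentioned above can be sketched in a few lines. The keyword rules, backend names, and URLs below are made up for the example; they stand in for the gateway's policy engine and do not reflect F5 AI Gateway's actual policy language.

```python
def route_request(prompt: str, backends: dict[str, str]) -> str:
    """Pick an LLM backend based on the prompt's content.

    Simple keyword and length heuristics stand in for a real
    content-classification policy.
    """
    lowered = prompt.lower()
    # Code-looking prompts go to a code-tuned model.
    if any(kw in lowered for kw in ("def ", "class ", "traceback", "compile")):
        return backends["code-model"]
    # Very long prompts go to a long-context model.
    if len(prompt) > 2000:
        return backends["long-context-model"]
    return backends["general-model"]


# Hypothetical in-cluster service endpoints.
backends = {
    "code-model": "http://llm-code.svc:8000",
    "long-context-model": "http://llm-32k.svc:8000",
    "general-model": "http://llm-chat.svc:8000",
}
```

Abstracting the endpoint choice behind one routing function is what lets clients talk to a single gateway address while multiple LLMs serve the traffic.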
Related Articles: Securing model serving in Red Hat OpenShift AI (on ROSA) with F5 Distributed Cloud API Security

Lab: Have a Slice of AI from a Raspberry Pi
Sun, Sand, and Security Set not included.

Configuring semantic caching on F5 AI Gateway
The semantic caching feature is mentioned on the F5 AI Gateway introduction page, but I couldn't find any documentation on how to use it. Is there a guide available? Also, will token-based rate limiting be supported in the future?