f5 distributed cloud
258 TopicsHow I Did it - Migrating Applications to Nutanix NC2 with F5 Distributed Cloud Secure Multicloud Networking
In this edition of "How I Did it", we will explore how F5 Distributed Cloud Services (XC) enables seamless application extension and migration from an on-premises environment to Nutanix NC2 clusters.752Views4likes0CommentsF5 XC vk8s open source nginx deployment on RE
Code is community submitted, community supported, and recognized as ‘Use At Your Own Risk’. Short Description This an example for F5 XC virtual kubernetes (vk8s) workload on Regional Edges for rewriting the URL requests and response body. Problem solved by this Code Snippet The XC Distributed Cloud rewrite option under the XC routes is sometimes limited in dynamically replacing a specific sting like for example to replace the string "ne" with "da" no matter where in the url the string is located. location ~ .*ne.* { rewrite ^(.*)ne(.*) $1da$2; } Other than that in XC there is no default option to replace a string in the payload like rewrite profile in F5 LTM or iRule stream option. sub_filter 'Example' 'NIKI'; sub_filter_types *; sub_filter_once off; Open source NGINX can also be used to return custom error based on the server error as well: error_page 404 /custom_404.html; location = /custom_404.html { return 404 'gangnam style!'; internal; } Now with proxy protocol support in XC the Nginx can see real client ip even for non-HTTP traffic that does not have XFF HTTP headers. log_format niki '$proxy_protocol_addr - $remote_addr - $remote_user [$time_local] - $proxy_add_x_forwarded_for' '"$request" $status $body_bytes_sent ' '"$http_referer" "$http_user_agent"'; #limit_req_zone $binary_remote_addr zone=mylimit:10m rate=1r/s; server { listen 8080 proxy_protocol; server_name localhost; How to use this Code Snippet Read the description readme file in the github link and modify the nginx default.conf file as per your needs. Code Snippet Meta Information Version: 1.25.4 Nginx Coding Language: nginx config Full Code Snippet https://github.com/Nikoolayy1/xc_nginx/tree/main172Views0likes3CommentsSeamless Application Migration to OpenShift Virtualization with F5 Distributed Cloud
As organizations endeavor to modernize their infrastructure, migrating applications to advanced virtualization platforms like Red Hat OpenShift Virtualization becomes a strategic imperative. However, they often encounter challenges such as minimizing downtime, maintaining seamless connectivity, ensuring consistent security, and reducing operational complexity. Addressing these challenges is crucial for a successful migration. This article explores howF5 Distributed Cloud (F5 XC), in collaboration with Red Hat's Migration Toolkit for Virtualization (MTV), provides a robust solution to facilitate a smooth, secure, and efficient migration to OpenShift Virtualization. The Joint Solution: F5 XC CE and Red Hat MTV Building upon our previous work ondeploying F5 Distributed Cloud Customer Edge (XC CE) in Red Hat OpenShift Virtualization, we delve into the next phase of our joint solution with Red Hat. By leveraging F5 XC CE in both VMware and OpenShift environments, alongside Red Hat’s MTV, organizations can achieve a seamless migration of virtual machines (VMs) from VMware NSX to OpenShift Virtualization. This integration not only streamlines the migration process but also ensures continuous application performance and security throughout the transition. Key Components: Red Hat Migration Toolkit for Virtualization (MTV): Facilitates the migration of VMs from VMware NSX to OpenShift Virtualization, an add-on to OpenShift Container Platform F5 Distributed Cloud Customer Edge (XC CE) in VMware: Manages and secures application traffic within the existing VMware NSX environment. F5 XC CE in OpenShift: Ensures consistent load balancing and security in the new OpenShift Virtualization environment. Demonstration Architecture To illustrate the effectiveness of this joint solution, let’s delve into the Demo Architecture employed in our demo: The architecture leverages F5 XC CE in both environments to provide a unified and secure load balancing mechanism. Red Hat MTV acts as the migration engine, seamlessly transferring VMs while F5 XC CE manages traffic distribution to ensure zero downtime and maintain application availability and security. Benefits of the Joint Solution 1. Seamless Migration: Minimal Downtime: The phased migration approach ensures that applications remain available to users throughout the process. IP Preservation: Maintaining the same IP addresses reduces the complexity of network reconfiguration and minimizes potential disruptions. 2. Enhanced Security: Consistent Policies: Security measures such as Web Application Firewalls (WAF), bot detection, and DoS protection are maintained across both environments. Centralized Management: F5 XC CE provides a unified interface for managing security policies, ensuring robust protection during and after migration. 3. Operational Efficiency: Unified Platform: Consolidating legacy and cloud-native workloads onto OpenShift Virtualization simplifies management and enhances operational workflows. Scalability: Leveraging Kubernetes and OpenShift’s orchestration capabilities allows for greater scalability and flexibility in application deployment. 4. Improved User Experience: Continuous Availability: Users experience uninterrupted access to applications, unaware of the backend migration activities. Performance Optimization: Intelligent load balancing ensures optimal application performance by efficiently distributing traffic across environments. Watch the Demo Video To see this joint solution in action, watch our detailed demo video on the F5 DevCentral YouTube channel. The video walks you through the migration process, showcasing how F5 XC CE and Red Hat MTV work together to facilitate a smooth and secure transition from VMware NSX to OpenShift Virtualization. Conclusion Migrating virtual machines (VMs) from VMware NSX to OpenShift Virtualization is a significant step towards modernizing your infrastructure. With the combined capabilities of F5 Distributed Cloud Customer Edge and Red Hat’s Migration Toolkit for Virtualization, organizations can achieve this migration with confidence, ensuring minimal disruption, enhanced security, and improved operational efficiency. Related Articles: Deploying F5 Distributed Cloud Customer Edge in Red Hat OpenShift Virtualization BIG-IP VE in Red Hat OpenShift Virtualization VMware to Red Hat OpenShift Virtualization Migration OpenShift Virtualization205Views1like0CommentsDeploying F5 Distributed Cloud Customer Edge in Red Hat OpenShift Virtualization
Introduction Red Hat OpenShift Virtualization is a feature that brings virtual machine (VM) workloads into the Kubernetes platform, allowing them to run alongside containerized applications in a seamless, unified environment. Built on the open-source KubeVirt project, OpenShift Virtualization enables organizations to manage VMs using the same tools and workflows they use for containers. Why OpenShift Virtualization? Organizations today face critical needs such as: Rapid Migration: "I want to migrate ASAP" from traditional virtualization platforms to more modern solutions. Infrastructure Modernization: Transitioning legacy VM environments to leverage the benefits of hybrid and cloud-native architectures. Unified Management: Running VMs alongside containerized applications to simplify operations and enhance resource utilization. OpenShift Virtualization addresses these challenges by consolidating legacy and cloud-native workloads onto a single platform. This consolidation simplifies management, enhances operational efficiency, and facilitates infrastructure modernization without disrupting existing services. Integrating F5 Distributed Cloud Customer Edge (XC CE) into OpenShift Virtualization further enhances this environment by providing advanced networking and security capabilities. This combination offers several benefits: Multi-Tenancy: Deploy multiple CE VMs, each dedicated to a specific tenant, enabling isolation and customization for different teams or departments within a secure, multi-tenant environment. Load Balancing: Efficiently manage and distribute application traffic to optimize performance and resource utilization. Enhanced Security: Implement advanced threat protection at the edge to strengthen your security posture against emerging threats. Microservices Management: Seamlessly integrate and manage microservices, enhancing agility and scalability. This guide provides a step-by-step approach to deploying XC CE within OpenShift Virtualization, detailing the technical considerations and configurations required. Technical Overview Deploying XC CE within OpenShift Virtualization involves several key technical steps: Preparation Cluster Setup: Ensure an operational OpenShift cluster with OpenShift Virtualization installed. Access Rights: Confirm administrative permissions to configure compute and network settings. F5 XC Account: Obtain access to generate node tokens and download the XC CE images. Resource Optimization: Enable CPU Manager: Configure the CPU Manager to allocate CPU resources effectively. Configure Topology Manager: Set the policy to single-numa-node for optimal NUMA performance. Network Configuration: Open vSwitch (OVS) Bridges: Set up OVS bridges on worker nodes to handle networking for the virtual machines. NetworkAttachmentDefinitions (NADs): Use Multus CNI to define how virtual machines attach to multiple networks, supporting both external and internal connectivity. Image Preparation: Obtain XC CE Image: Download the XC CE image in qcow2 format suitable for KubeVirt. Generate Node Token: Create a one-time node token from the F5 Distributed Cloud Console for node registration. User Data Configuration: Prepare cloud-init user data with the node token and network settings to automate the VM initialization process. Deployment: Create DataVolumes: Import the XC CE image into the cluster using the Containerized Data Importer (CDI). Deploy VirtualMachine Resources: Apply manifests to deploy XC CE instances in OpenShift. Network Configuration Setting up the network involves creating Open vSwitch (OVS) bridges and defining NetworkAttachmentDefinitions (NADs) to enable multiple network interfaces for the virtual machines. Open vSwitch (OVS) Bridges Create a NodeNetworkConfigurationPolicy to define OVS bridges on all worker nodes: apiVersion: nmstate.io/v1 kind: NodeNetworkConfigurationPolicy metadata: name: ovs-vms spec: nodeSelector: node-role.kubernetes.io/worker: '' desiredState: interfaces: - name: ovs-vms type: ovs-bridge state: up bridge: allow-extra-patch-ports: true options: stp: true port: - name: eno1 ovn: bridge-mappings: - localnet: ce2-slo bridge: ovs-vms state: present Replace eno1 with the appropriate physical network interface on your nodes. This policy sets up an OVS bridge named ovs-vms connected to the physical interface. NetworkAttachmentDefinitions (NADs) Define NADs using Multus CNI to attach networks to the virtual machines. External Network (ce2-slo): External Network (ce2-slo): Connects VMs to the physical network with a specific VLAN ID. This setup allows the VMs to communicate with external systems, services, or networks, which is essential for applications that require access to resources outside the cluster or need to expose services to external users. apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: name: ce2-slo namespace: f5-ce spec: config: | { "cniVersion": "0.4.0", "name": "ce2-slo", "type": "ovn-k8s-cni-overlay", "topology": "localnet", "netAttachDefName": "f5-ce/ce2-slo", "mtu": 1500, "vlanID": 3052, "ipam": {} } Internal Network (ce2-sli): Internal Network (ce2-sli): Provides an isolated Layer 2 network for internal communication. By setting the topology to "layer2", this network operates as an internal overlay network that is not directly connected to the physical network infrastructure. The mtu is set to 1400 bytes to accommodate any overhead introduced by encapsulation protocols used in the internal network overlay. apiVersion: k8s.cni.cncf.io/v1 kind: NetworkAttachmentDefinition metadata: name: ce2-sli namespace: f5-ce spec: config: | { "cniVersion": "0.4.0", "name": "ce2-sli", "type": "ovn-k8s-cni-overlay", "topology": "layer2", "netAttachDefName": "f5-ce/ce2-sli", "mtu": 1400, "ipam": {} } VirtualMachine Configuration Configuring the virtual machine involves preparing the image, creating cloud-init user data, and defining the VirtualMachine resource. Image Preparation Obtain XC CE Image: Download the qcow2 image from the F5 Distributed Cloud Console. Generate Node Token: Acquire a one-time node token for node registration. Cloud-Init User Data Create a user-data configuration containing the node token and network settings: #cloud-config write_files: - path: /etc/vpm/user_data content: | token: <your-node-token> slo_ip: <IP>/<prefix> slo_gateway: <Gateway IP> slo_dns: <DNS IP> owner: root permissions: '0644' Replace placeholders with actual network configurations. This file automates the VM's initial setup and registration. VirtualMachine Resource Definition Define the VirtualMachine resource, specifying CPU, memory, disks, network interfaces, and cloud-init configurations. Resources: Allocate sufficient CPU and memory. Disks: Reference the DataVolume containing the XC CE image. Interfaces: Attach NADs for network connectivity. Cloud-Init: Embed the user data for automatic configuration. Conclusion Deploying F5 Distributed Cloud CE in OpenShift Virtualization enables organizations to leverage advanced networking and security features within their existing Kubernetes infrastructure. This integration facilitates a more secure, efficient, and scalable environment for modern applications. For detailed deployment instructions and configuration examples, please refer to the attached PDF guide. Related Articles: BIG-IP VE in Red Hat OpenShift Virtualization VMware to Red Hat OpenShift Virtualization Migration OpenShift Virtualization535Views1like0CommentsSeamless App Connectivity with F5 and Nutanix Cloud Services
Modern enterprise networks face the challenge of connecting diverse cloud services and partner ecosystems, expanding routing domains and distributed application connections. As network complexity grows, the need to onboard applications quickly become paramount despite intricate networking operations. Automation tools like Ansible help manage network and security devices. But it is still hard to connect distributed applications across new systems like clouds and Kubernetes. This article explores F5 Distributed Cloud Services (XC), which facilitates intent-based application-to-application connectivity across varied infrastructures. Expanding and increasingly sophisticated enterprise networks Traditional data center networks typically rely on VLAN-isolated architectures with firewall ACLs for intra-network security. With digital transformation and cloud migrations, new systems work with existing networks. They move from VLANs to labels and namespaces like Kubernetes and Red Hat OpenShift. Future trends point to increased edge computing adoption, further complicating network infrastructures. Managing these complexities raises costs due to design, implementation, and testing impacts of network changes. Network infrastructure is becoming more complex due to factors such as increased NAT settings, routing adjustments, added ACLs, and the need to remove duplicate IP addresses when integrating new and existing networks. This growing complexity drives up the costs of design, implementation, and testing, amplifying the impact of network changes A typical network uses firewalls to control access between VLANs or isolated networks based on security policies. Access lists (ACLs) are organized by source and destination applications or clients. The order of ACLs is important because shorter prefixes can override longer ones. In some cases, ACLs exceed 10,000 lines, making it difficult to identify which rules impact application connectivity. As a result, ACLs for discontinued applications often remain in the firewall configuration, posing potential security risks. However, refactoring ACLs requires significant effort and testing. F5 XC addresses the ACL issue by managing all application connectivity across on-premises, data centers, and clouds. Its load balancer connects applications through VLANs, VPCs, and other networks. Each load balancer has its own application control and associated ACLs, ensuring changes to one load balancer don’t affect others. When an application is discontinued, its load balancer is deleted, preventing leftover configurations and reducing security risks. SNAT/DNAT is commonly used to resolve IP conflicts, but it reduces observability since each NAT table between the client and application must be referenced. ACL implementations vary across router/firewall vendors—some apply filters before NAT, others after. Logs may follow this behavior, making IP address normalization difficult. Pre-NAT IP addresses must be managed across the entire network, adding complexity and increasing costs. F5 XC simplifies distributed load balancing by terminating client sessions and forwarding them to endpoints. Clients and endpoints only need IP reachability to the CE interface. The load balancer can use a VIP for client access, but global management isn't required, as the client-side network sets it. It provides clear source/destination data, making application access easy to track without managing NAT IP addresses. F5 Distributed Cloud MCN/Network as a Service F5XC offers an all-in-one software package for networking and security, managed through a single console with intent-based configuration. This eliminates the need for separate appliance setups. Furthermore, firewall, load balancer, WAF changes, and function updates are done via SDN without causing downtime. The F5 XC Customer Edge (CE) works across physical appliances, VMs, containers, and clouds, creating overlay tunnels between CEs. Acting as an application gateway, it isolatesrouting domains and terminates L3 routing, while maintaining best practices for cloud, Kubernetes, and on-prem networks. This simplifies physical network configurations, speeding up provisioning and reducing costs. How CE works in enterprise network This design is straightforward: CEs are deployed in the cloud and the user’s data center. In this example, the data center CE connects four networks: Underlay network for CE-to-CE connectivity Cloud application network (VPC/VNET) Branch office network Data center application network The data center CE connects the branch office and application networks to their respective VRFs, while the cloud CE links the cloud network to a VRF. The application runs on VLAN 100 (192.168.1.1) in the data center, while clients are in the branch office. The load balancer configuration is as follows: Load balancer: Branch office VRF Domain: ent-app1.com Endpoint (Origin pool): Onpre-VLAN100 VRF, 192.168.1.1 Clients access "ent-app1.com," which resolves to a VIP or CE interface address. The CE identifies the application’s location, and since the endpoint is in the same CE within the Onpre-VLAN100 VRF, it sets the source IP as the CE and forwards traffic to 192.168.1.1. Next, the application is migrated to the cloud. The CE facilitates this transition from on-premises to the cloud by specifying the endpoint. The load balancer configuration is as follows: Load Balancer: Branch Office VRF Domain: ent-app1.com Endpoint (Origin Pool): AWS-VPC VRF, 10.0.0.1 When traffic arrives at the CE from a client in the branch office, the CE in the data center forwards the traffic to the CE on AWS through an overlay tunnel. The only configuration change required is updating the endpoint via the console. There is no need to modify the firewall, switch, or router in the underlay network. Kubernetes networking is unique because it abstracts IP addresses. Applications use FQDNs instead of IP addresses and are isolated by namespace or label. When accessing the external network, Kubernetes uses SNAT to map the IP address to that of the Kubernetes node, masking the source of the traffic. CE can run as a pod within a Kubernetes namespace. Combining a CE on Kubernetes with a CE on-premises offers flexible external access to Kubernetes applications without complex configurations like Multus. F5XC simplifies connectivity by providing a common configuration for linking Kubernetes to on-premises and cloud environments via a load balancer. The load balancer configuration is as follows: Load Balancer: Kubernetes Domain: ent-app1.com Endpoint (Origin Pool): Onprem-vlan100 VRF, 192.168.1.1 / AWS-VPC VRF, 10.0.0.1 * Related blog https://community.f5.com/kb/technicalarticles/multi-cluster-multi-cloud-networking-for-k8s-with-f5-distributed-cloud-%E2%80%93-archite/307125 Load balancer observability The Load Balancer in F5XC offers advanced observability features, including end-to-end network latency, L7 request logs, and API visualization. It automatically aggregates data from various logs and metrics, (see below). You can also generate API graphs from the aggregated data. This allows users to identify which APIs are in use and create policies to block unnecessary ones. Demo movie The video demonstration shows how F5 XC connects applications across different environments, including VM, Kubernetes, on-premises, and cloud. CE delivers both L3 and L7 connectivity across these environments. Conclusion F5 Distributed Cloud Services (F5XC) simplifies enterprise networking by integrating applications seamlessly across VMs, Kubernetes, on-premises, and cloud environments. By minimizing physical network modifications and leveraging best practices from diverse infrastructure systems, F5XC enables efficient, scalable, and secure application connectivity. Key Benefits: - Simplified physical network design - Network as a Service for seamless application communication and visibility - Integration with new networking paradigms like Kubernetes and SASE Looking Ahead As new networking systems emerge, F5XC remains adaptable, leveraging APIs for self-service network configurations and enabling future-ready enterprise networks.167Views1like0CommentsHow to Split DNS with Managed Namespace on F5 Distributed Cloud (XC) Part 2 – TCP & UDP
Re-Introduction In Part 1, we covered the deployment of the DNS workloads to our Managed Namespace and creating an HTTPS Load Balancer and Origin Pool for DNS over HTTPS. If you missed Part 1, feel free to jump over and give it a read. In Part 2, we will cover creating a TCP and UDP Load Balancer and Origin Pools for standard TCP & UDP DNS. TCP Origin Pool First, we need to create an origin pool. On the left menu, under Manage, Load Balancers, click Origin Pools. Let’s give our origin pool a name, and add some Origin Servers, so under Origin Servers, click Add Item. In the Origin Server settings, we want to select K8s Service Name of Origin Server on given Sites as our type, and enter our service name, which will be the service name from Part 1 and our namespace, so “servicename.namespace”. For the Site, we select one of the sites we deployed the workload to, and under Select Network on the Site, we want to seledt vK8s Networks on the Site, then click Apply. Do this for each site we deployed to so we have several servers in our Origin Pool. In Part 1, our Services defined the targetPort as 5553. So, we set Port to 5553 on the origin. This is all we need to configure for our TCP Origin, so click Save and Exit. TCP Load Balancer Next, we are going to make a TCP Load Balancer, since its less steps (and quicker) than a UDP Load Balancer (today). On the left menu under Manage, Load Balancers, select TCP Load Balancers. Let’s set a name for our TCP LB and set our listen port, 53 is a reserved port on Customer Edge Sites so we need to use something else, so let’s use 5553 again, under origin pools we set the origin that we created previously, and then we get to the important piece, which is Where to Advertise. In Part 1 we advertised to the internet with some extra steps on how to advertise to an internal network, in this part we will advertise internally. Select Advertise Custom, then click edit configuration. Then under Custom Advertise VIP Configuration, click Add Item. We want to select the Site where we are going to advertise, the network interface we will advertise.Click Apply, then Apply again. We don’t need to configure anything else, so click Save and Exit. UDP Load Balancer For UDP Load Balancers we need to jump to the Load Balancer section again, but instead of a load balancer, we are going to create a Virtual Host which are not listed in the Distributed Applications tile, so from the top drop down “Select Service” choose the Load Balancers tile. In the left menu under Manage, we go to Virtual Hosts instead of Load Balancers. The first thing we will configure is an Advertise Policy, so let’s select that. Advertise Policy Let’s give the policy a name, select the location we want to advertise on the Site Local Inside Network, and set the port to 5553. Save and Exit. Endpoints Now back to Manage, Virtual Hosts, and Endpoints so we can add an endpoint. Name the endpoint and specify based on the screenshot below. Endpoint Specifier: Service Selector Info Discovery: Kubernetes Service: Service Name Service Name: service-name.namespace Protocol: UDP Port: 5553 Virtual Site or Site or Network: Site Reference: Site Name Network Type: Site Local Service Network Save and Exit. Cluster The Cluster configuration will be simple, from Manage, Virtual Hosts, Clusters, add Cluster. We just need a name and select the Origin Servers / Endpoints and select the endpoint we just created. Save and Exit. Route The Route configuration will be simple as well, from Manage, Virtual Hosts, Routes, add Route. Name the route and under List of Routes click Configure, then Add Item. Leave most settings as they are, and under Actions, choose Destination List, then click Configure. Under Origin Pools and Weights, click Add Item. Under Cluster with Weight and Priority select the cluster we created previously, leave Weight as null for this configuration, then click Apply, apply again, apply again, Apply again, Save and Exit. Now we can Finally create a Virtual Host. Virtual Host Under Manage, Virtual Host, Select Virtual Host, then Click Add Virtual Host. There are a ton of options here, but we only care about a couple. Give the Virtual Host a name. Proxy Type: UDP Proxy Advertise Policy: previously created policy Moment of Truth, Again Now that we have our services published we can give them a test. Since they are currently on a non standard port, and most systems dont let us specify a port in default configurations we need to test with dig, nslookup, etc. To test TCP with nslookup: nslookup -port=5553 -vc google.com 192.168.125.229 Server: 192.168.125.229 Address: 192.168.125.229#5553 Non-authoritative answer: Name: google.com Address: 142.251.40.174 To test UDP with nslookup: nslookup -port=5553 google.com 192.168.125.229 Server: 192.168.125.229 Address: 192.168.125.229#5553 Non-authoritative answer: Name: google.com Address: 142.251.40.174 IP Tables for Non-Standard DNS Ports If we wanted to use the nonstandard port tcp/udp dns on Linux or MacOS, we can use IPTABLES to forward all the traffic for us. There isnt a way to set this up in Windows OS today, but as in Part 1, Windows Server 2022 supports encrypted DNS over HTTPS, and it can be pushed as policy through Group Policy as well. iptables -t nat -A PREROUTING -i eth0 -p udp –dport 53 -j DNAT –to XXXXXXXXXX:5553 iptables -t nat -A PREROUTING -i eth0 -p udp –dport 53 -j DNAT –to XXXXXXXXXX:5553 iptables -t nat -A PREROUTING -i eth0 -p tcp –dport 53 -j DNAT –to XXXXXXXXXX:5553 iptables -t nat -A PREROUTING -i eth0 -p tcp –dport 53 -j DNAT –to XXXXXXXXXX:5553 "Nature is a mutable cloud, which is always and never the same." - Ralph Waldo Emerson We might not wax that philosophically around here, but our heads are in the cloud nonetheless! Join the F5 Distributed Cloud user group today and learn more with your peers and other F5 experts. Conclusion I hope this helps with a common use-case we are hearing every day, and shows how simple it is to deploy workloads into our Managed Namespaces.1.4KViews2likes2CommentsCascading Configs Tool for F5 Distributed Cloud Managed Service Provider (MSP) and Delegated Access Customers
A new tool has been released that enables F5 Distributed Cloud Managed Service Provider customers or customers with Delegated Access to push and maintain shared configurations to any of their Child Tenants.101Views0likes0CommentsSecuring the LLM User Experience with an AI Firewall
As artificial intelligence (AI) seeps into the core day-to-day operations of enterprises, a need exists to exert control over the intersection point of AI-infused applications and the actual large language models (LLMs) that answer the generated prompts. This control point should serve to impose security rules to automatically prevent issues such as personally identifiable information (PII) inadvertently exposed to LLMs. The solution must also counteract motivated, intentional misuse such as jailbreak attempts, where the LLM can be manipulated to provide often ridiculous answers with the ensuing screenshotting attempting to discredit the service. Beyond the security aspect and the overwhelming concern of regulated industries, other drivers include basic fiscal prudence 101, ensuring the token consumption of each offered LLM model is not out of hand. This entire discussion around observability and policy enforcement for LLM consumption has given rise to a class of solutions most frequently referred to as AI Firewalls or AI Gateways (AI GW). An AI FW might be leveraged by a browser plugin, or perhaps applying a software development kit (SDK) during the coding process for AI applications. Arguably, the most scalable and most easily deployed approach to inserting AI FW functionality into live traffic to LLMs is to use a reverse proxy. A modern approach includes the F5 Distributed Cloud service, coupled with an AI FW/GW service, cloud-based or self-hosted, that can inspect traffic intended for LLMs like those of OpenAI, Azure OpenAI, or privately operated LLMs like those downloaded from Hugging Face. A key value offered by this topology, a reverse proxy handing off LLM traffic to an AI FW, which in turn can allow traffic to reach target LLMs, stems from the fact that traffic is seen, and thus controllable, in both directions. Should an issue be present in a user’s submitted prompt, also known as an “inference”, it can be flagged: PII (Personally Identifiable Information) leakage is a frequent concern at this point. In addition, any LLM responses to prompts are also seen in the reverse path: consider a corrupted LLM providing toxicity in its generated replies. Not good. To achieve a highly performant reverse proxy approach to secured LLM access, a solution that can span a global set of users, F5 worked with Prompt Security to deploy an end-to-end AI security layer. This article will explore the efficacy and performance of the live solution. Impose LLM Guardrails with the AI Firewall and Distributed Cloud An AI firewall such as the Prompt Security offering can get in-line with AI LLM flows through multiple means. API calls from Curl or Postman can be modified to transmit to Prompt Security when trying to reach targets such as OpenAI or Azure OpenAI Service. Simple firewall rules can prevent employee direct access to these well-known API endpoints, thus making the Prompt Security route the sanctioned method of engaging with LLMs. A number of other methods could be considered but have concerns. Browser plug-ins have the advantage of working outside the encryption of the TLS layer, in a manner similar to how users can use a browser’s developer tools to clearly see targets and HTTP headers of HTTPS transactions encrypted on the wire. Prompt Security supports plugins. A downside, however, of browser plug-ins is the manageability issue, how to enforce and maintain across-the-board usage, simply consider the headache non-corporate assets used in the work environment. Another approach, interesting for non-browser, thick applications on desktops, think of an IDE like VSCode, might be an agent approach, whereby outbound traffic is handled by an on-board local proxy. Again, Prompt can fit in this model however the complexity of enforcement of the agent, like the browser approach, may not always be easy and aligned with complete A-to-Z security of all endpoints. One of the simplest approaches is to ingest LLM traffic through a network-centric approach. An F5 Distributed Cloud HTTPS load balancer, for instance, can ingest LLM-bound traffic, and thoroughly secure the traffic at the API layer, things like WAF policy and DDoS mitigations, as examples. HTTP-based control plane security is the focus here, as opposed to the encapsulated requests a user is sending to an LLM. The HTTPS load balancer can in turn hand off traffic intended for the likes of OpenAI to the AI gateway for prompt-aware inspections. F5 Distributed Cloud (XC) is a good architectural fit for inserting a third-party AI firewall service in-line with an organization’s inferencing requests. Simply project a FQDN for the consumption of AI services; in this article we used the domain name “llmsec.busdevF5.net” into the global DNS, advertising one single IP address mapping to the name. This DNS advertisement can be done with XC. The IP address, through BGP-4 support for anycast, will direct any traffic to this address to the closest of 27 international points of presence of the XC global fabric. Traffic from a user in Asia may be attracted to Singapore or Mumbai F5 sites, whereas a user in Western Europe might enter the F5 network in Paris or Frankfurt. As depicted, a distributed HTTPS load balancer can be configured – “distributed” reflects the fact traffic ingressing in any of the global sites can be intercepted by the load balancer. Normally, the server name indicator (SNI) value in the TLS Client Hello can be easily used to pick the correct load balancer to process this traffic. The first step in AI security is traditional reverse proxy core security features, all imposed by the XC load balancer. These features, to name just a few, might include geo-IP service policies to preclude traffic from regions, automatic malicious user detection, and API rate limiting; there are many capabilities bundled together. Clean traffic can then be selected for forwarding to an origin pool member, which is the standard operation of any load balancer. In this case, the Prompt Security service is the exclusive member of our origin pool. For this article, it is a cloud instantiated service - options exist to forward to Prompt implemented on a Kubernetes cluster or running on a Distributed Cloud AppStack Customer Edge (CE) node. Block Sensitive Data with Prompt Security In-Line AI inferences, upon reaching Prompt’s security service, are subjected to a wide breadth of security inspections. Some of the more important categories would include: Sensitive data leakage, although potentially contained in LLM responses, intuitively the larger proportion of risk is within the requesting prompt, with user perhaps inadvertently disclosing data which should not reach an LLM Source code fragments within submissions to LLMs, various programming languages may be scanned for and blocked, and the code may be enterprise intellectual property OWASP LLM top 10 high risk violations, such as LLM jailbreaking where the intent is to make the LLM behave and generate content that is not aligned with the service intentions; the goal may be embarrassing “screenshots”, such as having a chatbot for automobile vendor A actually recommend a vehicle from vendor B OWASP Prompt Injection detection, considered one of the most dangerous threats as the intention is for rogue users to exfiltrate valuable data from sources the LLM may have privileged access to, such as backend databases Token layer attacks, such as unauthorized and excessive use of tokens for LLM tasks, the so-called “Denial of Wallet” threat Content moderation, ensuring a safe interaction with LLMs devoid of toxicity, racial and gender discriminatory language and overall curated AI experience aligned with those productivity gains that LLMs promise To demonstrate sensitive data leakage protection, a Prompt Security policy was active which blocked LLM requests with, among many PII fields, a mailing address exposed. To reach OpenAI GPT3.5-Turbo, one of the most popular and cost-effective models in the OpenAI model lineup, prompts were sent to an F5 XC HTTPS load balancer at address llmsec.busdevf5.net. Traffic not violating the comprehensive F5 WAF security rules were proxied to the Prompt Security SaaS offering. The prompt below clearly involves a mailing address in the data portion. The ensuing prompt is intercepted by both the F5 and Prompt Security solutions. The first interception, the distributed HTTPS load balancer offered by F5 offers rich details on the transaction, and since no WAF rules or other security policies are violated, the transaction is forwarded to Prompt Security. The following demonstrates some of the interesting details surrounding the transaction, when completed (double-click to enlarge). As highlighted, the transaction was successful at the HTTP layer, producing a 200 Okay outcome. The traffic originated in the municipality of Ashton, in Canada, and was received into Distributed Cloud in F5’s Toronto (tr2-tor) RE site. The full details around the targeted URL path, such as the OpenAI /v1/chat/completions target and the user-agent involved, vscode-restclient, are both provided. Although the HTTP transaction was successful, the actual AI prompt was rejected, as hoped for, by Prompt Security. Drilling into the Activity Monitor in the Prompt UI, one can get a detailed verdict on the transaction (double-click). Following the yellow highlights above, the prompt was blocked, and the violation is “Sensitive Data”. The specific offending content, the New York City street address, is flagged as a precluded entity type of “mailing address”. Other fields that might be potentially blocking candidates with Prompt’s solution include various international passports or driver’s license formats, credit card numbers, emails, and IP addresses, to name but a few. A nice, time saving feature offered by the Prompt Security user interface is to simply choose an individual security framework of interest, such as GDPR or PCI, and the solution will automatically invoke related sensitive data types to detect. An important idea to grasp: The solution from Prompt is much more nuanced and advanced than simple REGEX; it invokes the power of AI itself to secure customer journeys into safe AI usage. Machine learning models, often transformer-based, have been fine-tuned and orchestrated to interpret the overall tone and tenor of prompts, gaining a real semantic understanding of what is being conveyed in the prompt to counteract simple obfuscation attempts. For instance, using printed numbers, such as one, two, three to circumvent Regex rules predicated on numerals being present - this will not succeed. This AI infused ability to interpret context and intent allows for preset industry guidelines for safe LLM enforcement. For instance, simply indicating the business sector is financial will allow the Prompt Security solution to pass judgement, and block if desired, financial reports, investment strategy documents and revenue audits, to name just a few. Similar awareness for sectors such as healthcare or insurance is simply a pull-down menu item away with the policy builder. Source Code Detection A common use case for LLM security solutions is identification and, potentially, blocking submissions of enterprise source code to LLM services. In this scenario, this small snippet of Python is delivered to the Prompt service: def trial(): return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500 sum(trial() for i in range(10_000)) / 10_000 A policy is in place for Python and JavaScript detection and was invoked as hoped for. curl --request POST \ --url https://llmsec.busdevf5.net/v1/chat/completions \ --header 'authorization: Bearer sk-oZU66yhyN7qhUjEHfmR5T3BlbkFJ5RFOI***********' \ --header 'content-type: application/json' \ --header 'user-agent: vscode-restclient' \ --data '{"model": "gpt-3.5-turbo","messages": [{"role": "user","content": "def trial():\n return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500\n\nsum(trial() for i in range(10_000)) / 10_000"}]}' Content Moderation for Interactions with LLMs One common manner of preventing LLM responses from veering into undesirable territory is for the service provider to implement a detailed system prompt, a set of guidelines that the LLM should be governed by when responding to user prompts. For instance, the system prompt might instruct the LLM to serve as polite, helpful and succinct assistant for customers purchasing shoes in an online e-commerce portal. A request for help involving the trafficking of narcotics should, intuitively, be denied. Defense in depth has traditionally meant no single point of failure. In the above scenario, screening both the user prompt and ensuring LLM response for a wide range of topics leads to a more ironclad security outcome. The following demonstrates some of the topics Prompt Security can intelligently seek out; in this simple example, the topic of “News & Politics” has been singled out to block as a demonstration. Testing can be performed with this easy Curl command, asking for a prediction on a possible election result in Canadian politics: curl --request POST \ --url https://llmsec.busdevf5.net/v1/chat/completions \ --header 'authorization: Bearer sk-oZU66yhyN7qhUjEHfmR5T3Blbk*************' \ --header 'content-type: application/json' \ --header 'user-agent: vscode-restclient' \ --data '{"model": "gpt-3.5-turbo","messages": [{"role": "user","content": "Who will win the upcoming Canadian federal election expected in 2025"}],"max_tokens": 250,"temperature": 0.7}' The response, available in the Prompt Security console, is also presented to the user. In this case, a Curl user leveraging the VSCode IDE. The response has been largely truncated for brevity, fields that are of interest is an HTTP “X-header” indicating the transaction utilized the F5 site in Toronto, and the number of tokens consumed in the request and response are also included. Advanced LLM Security Features Many of the AI security concerns are given prominence by the OWASP Top Ten for LLMs, an evolving and curated list of potential concerns around LLM usage from subject matter experts. Among these are prompt injection attacks and malicious instructions often perceived as benign by the LLM. Prompt Security uses a layered approach to thwart prompt injection. For instance, during the uptick in interest in ChatGPT, DAN (Do Anything Now) prompt injection was widespread and a very disruptive force, as discussed here. User prompts will be closely analyzed for the presence of the various DAN templates that have evolved over the past 18 months. More significantly, the use of AI itself allows the Prompt solution to recognize zero-day bespoke prompts attempting to conduct mischief. The interpretative powers of fine-tuned, purpose-built security inspection models are likely the only way to stay one step ahead of bad actors. Another chief concern is protection of the system prompt, the guidelines that reel in unwanted behavior of the offered LLM service, what instructed our LLM earlier in its role as a shoe sales assistant. The system prompt, if somehow manipulated, would be a significant breach in AI security, havoc could be created with an LLM directed astray. As such, Prompt Security offers a policy to compare the user provided prompt, the configured system prompt in the API call, and the response generated by the LLM. In the event that a similarity threshold with the system prompt is exceeded in the other fields, the transaction can be immediately blocked. An interesting advanced safeguard is the support for a “canary” word - a specific value that a well behaved LLM should never present in any response, ever. The detection of the canary word by the Prompt solution will raise an immediate alert. One particularly broad and powerful feature in the AI firewall is the ability to find secrets, meaning tokens or passwords, frequently for cloud-hosted services, that are revealed within user prompts. Prompt Security offers the ability to scour LLM traffic for in excess of 200 meaningful values. Just as a small representative sample of the industry’s breadth of secrets, these can all be detected and acted upon: Azure Storage Keys Detector Artifactory Detector Databricks API tokens GitLab credentials NYTimes Access Tokens Atlassian API Tokens Besides simple blocking, a useful redaction option can be chosen. Rather than risk compromise of credentials and obfuscated value will instead be seen at the LLM. F5 Positive Security Models for AI Endpoints The AI traffic delivered and received from Prompt Security’s AI firewall is both discovered and subjected to API layer policies by the F5 load balancer. Consider the token awareness features of the AI firewall, excessive token consumption can trigger an alert and even transaction blocking. This behavior, a boon when LLMs like the OpenAI premium GPT-4 models may have substantial costs, allows organizations to automatically shut down a malicious actor who illegitimately got hold of an OPENAI_API key value and bombarded the LLM with prompts. This is often referred to as a “Denial of Wallet” situation. F5 Distributed Cloud, with its focus upon the API layer, has congruent safeguards. Each unique user of an API service is tracked to monitor transactional consumption. By setting safeguards for API rate limiting, an excessive load placed upon the API endpoint will result in HTTP 429 “Too Many Request” in response to abusive behavior. A key feature of F5 API Security is the fact that it is actionable in both directions, and also an in-line offering, unlike some API solutions which reside out of band and consume proxy logs for reporting and threat detection. With the automatic discovery of API endpoints, as seen in the following screenshot, the F5 administrator can see the full URL path which in this case exercises the familiar OpenAI /v1/chat/completions endpoint. As highlighted by the arrow, the schema of traffic to API endpoints is fully downloadable as an OpenAPI Specification (OAS), formerly known as a Swagger file. This layer of security means fields in API headers and bodies can be validated for syntax, such that a field whose schema expects a floating-point number can see any different encoding, such as a string, blocked in real-time in either direction. A possible and valuable use case: allow an initial unfettered access to a service such as OpenAI, by means of Prompt Security’s AI firewall service, for a matter of perhaps 48 hours. After a baseline of API endpoints has been observed, the API definition can be loaded from any saved Swagger files at the end of this “observation” period. The loaded version can be fully pruned of undesirable or disallowed endpoints, all future traffic must conform or be dropped. This is an example of a “positive security model”, considered a gold standard by many risk-adverse organizations. Simply put, a positive security model allows what has been agreed upon through and rejects everything else. This ability to learn and review your own traffic, and then only present Prompt Security with LLM endpoints that an organization wants exposed is an interesting example of complementing an AI security solution with rich API layer features. Summary The world of AI and LLMs is rapidly seeing investment, in time and money, from virtually all economic sectors; the promise of rapid dividends in the knowledge economy is hard to resist. As with any rapid deployment of new technology, safe consumption is not guaranteed, and it is not built in. Although LLMs often suggest guardrails are baked into offerings, a 30-second search of the Internet will expose firsthand experiences where unexpected outcomes when invoking AI are real. Brand reputation is at stake and false information can be hallucinated or coerced out of LLMs by determined parties. By combining the ability to ingest globally at high-speed dispersed users and apply a first level of security protections, F5 Distributed Cloud can be leveraged as an onboarding for LLM workloads. As depicted in this article, Prompt Security can in turn handle traffic egressing F5’s distributed HTTPS load balancers and provide state-of-the-art AI safeguards, including sensitive data detection, content moderation and other OWASP-aligned mechanisms like jailbreak and prompt injection mitigation. Other deployment models exist, including deploying Prompt Security’s solution on-premises, self-hosted in cloud tenants, and running the solution on Distributed Cloud CE nodes themselves is supported.976Views4likes1Comment