Mitigate OWASP LLM Security Risk: Sensitive Information Disclosure Using F5 NGINX App Protect
Introduction:
This article covers the rise of generative AI (Gen AI) and how F5 products can protect the AI backends that power it. The rise of Large Language Models (LLMs) has marked a transformative era in AI, enabling machines to produce and comprehend text with human-like proficiency. These sophisticated models are now integral to applications in customer support, content creation, and even scientific research. Their advanced capabilities, however, also raise serious security concerns, chief among them the accidental disclosure of sensitive information: a model can reproduce private data it absorbed during training or ingested at runtime, so strong protection mechanisms are needed to reduce this risk. To address these challenges, the OWASP LLM Top 10 project was created to identify and prioritize the most critical security threats associated with LLMs. Within that list, the LLM06 risk focuses specifically on sensitive information disclosure, emphasizing stringent data handling protocols and privacy safeguards that prevent unintended data leaks and ensure the secure and ethical use of LLM technology. In this article, we are going to see how F5 NGINX App Protect v5 can protect LLM backends from the LLM06: Sensitive Information Disclosure risk.
Use case:
We are going to deploy a Gen AI application that takes a URL hosting data and passes its contents to a backend LLM application. Once the LLM has analyzed the data, users can ask questions about it and the LLM will respond with the relevant answers. We have deployed this application inside an AWS EKS cluster, with two application services running in the cluster: a front-end service that serves the UI, and a backend service that hosts the LLM model. The internal workings of the application, its tooling, and its LLM model are not important here; many free tools like it can be found online. Since this article focuses on the LLM06: Sensitive Information Disclosure risk, we will pass a website URL containing dummy SSNs of random users. Once the website data is loaded into the LLM, we can ask for the SSN of a user and the LLM will return it from that data. SSNs are sensitive information and should always be protected, because their exposure can lead to personal data exploitation. In this case, the LLM model has no security rules to detect and protect this data, so the SSN is shown directly in the response as seen below.
To protect this LLM backend service, we are going to deploy and configure NGINX App Protect v5 as a Kubernetes workload in the data path. The latest release of NGINX App Protect v5 decouples WAF enforcement from NGINX into dedicated containers, making the WAF process more efficient and optimized. All data traffic will be validated by NGINX App Protect before being exposed in the response. In this use case, since we want to mask the SSN, we will configure the data guard feature, with its configuration files onboarded to this container; a policy sketch follows.
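For reference, here is a minimal sketch of what the data guard portion of such a policy file could look like, following the NGINX App Protect declarative policy format (the surrounding fields are illustrative defaults, not the full downloadable policy):

{
    "policy": {
        "name": "NAP_API_Policy",
        "template": { "name": "POLICY_TEMPLATE_NGINX_BASE" },
        "applicationLanguage": "utf-8",
        "enforcementMode": "blocking",
        "data-guard": {
            "enabled": true,
            "maskData": true,
            "creditCardNumbers": true,
            "usSocialSecurityNumbers": true
        }
    }
}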
The full configuration file can be downloaded from the NGINX App Protect WAF configuration guide.
Data guard is a WAF feature that detects and masks credit card numbers (CCNs), U.S. Social Security Numbers (SSNs), and/or custom patterns in HTTP responses. With the data guard feature enabled, the SSNs of users in the LLM backend response are detected and masked by NGINX App Protect, thereby protecting the personal data. For more info on the NGINX App Protect data guard feature, check this link.
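For illustration, assuming the backend would otherwise answer with a line such as "The SSN of the requested user is 123-45-6789", with data guard masking enabled the client instead receives the digits replaced by asterisks, along the lines of "The SSN of the requested user is ***********" (the exact masked rendering follows the data guard settings; this example is hypothetical).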
NOTE: Since this is just a demo focused on LLM workload protection, we are using standalone NGINX App Protect v5. Depending on their practices, customers can instead configure NGINX Ingress Controller, Secure Mesh, etc.
Deployment Steps:
- Check the service cluster IP of the backend LLM service and set it as the upstream server in the nginx.conf ConfigMap below (a lookup sketch follows the file):
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-conf-map-api
  namespace: default
data:
  nginx.conf: |
    user nginx;
    worker_processes auto;
    load_module modules/ngx_http_app_protect_module.so;
    error_log /var/log/nginx/error.log debug;
    events {
        worker_connections 10240;
    }
    http {
        include /etc/nginx/mime.types;
        default_type application/octet-stream;
        sendfile on;
        keepalive_timeout 65;
        # NGINX App Protect WAF
        app_protect_enforcer_address 127.0.0.1:50000;
        upstream main_DNS_name {
            server 172.20.41.242:8000;
        }
        server {
            listen 80;
            proxy_http_version 1.1;
            proxy_read_timeout 600;
            proxy_connect_timeout 600;
            proxy_send_timeout 600;
            app_protect_enable on;
            app_protect_policy_file "/etc/app_protect/bundles/NAP_API_Policy.tgz";
            app_protect_security_log_enable on;
            app_protect_security_log log_all /etc/app_protect/bundles/security.log;
            location / {
                client_max_body_size 0;
                default_type text/html;
                # set your backend here
                proxy_pass http://main_DNS_name;
                proxy_set_header Host $host;
            }
        }
    }
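As a quick sketch of that lookup and of applying the file (the manifest file name is an assumption):

# Find the CLUSTER-IP of the backend LLM service and put it in the
# upstream block of the ConfigMap above
kubectl get svc -n default
# Apply the ConfigMap (file name assumed)
kubectl apply -f nginx-conf-map-api.yaml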
- Build and push an NGINX Plus docker image to your private registry by following this link (a build sketch follows)
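A hedged sketch of that build-and-push step, assuming the Dockerfile and the nginx-repo.crt/nginx-repo.key certificate pair from the linked guide sit in the current directory (<registry-url> and tag-name are placeholders matching the deployment file below):

# Build the NGINX Plus + App Protect module image, passing the repo
# certificate and key as build secrets per the linked guide
docker build --no-cache \
  --secret id=nginx-crt,src=nginx-repo.crt \
  --secret id=nginx-key,src=nginx-repo.key \
  -t <registry-url>:tag-name .
# Push it to the private registry referenced by the deployment
docker push <registry-url>:tag-name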
- Copy your JWT token and run the below command to create a k8s secret
# kubectl create secret docker-registry regcred --docker-server=private-registry.nginx.com --docker-username=<JWT Token> --docker-password=none
- Check the below file, update the API policy bundle URL in the init container and the docker image info in the nginx container, then apply this file to install the nginx deployment and pods (an apply/verify sketch follows the file).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nap5-deployment
spec:
  selector:
    matchLabels:
      app: nap5
  replicas: 1
  template:
    metadata:
      labels:
        app: nap5
    spec:
      imagePullSecrets:
        - name: regcred
      initContainers:
        - name: init-fetchbundle
          image: curlimages/curl:8.9.1
          command:
            - sh
            - -c
            - |
              echo "Downloading file..."
              curl -vvv -L https://github.com/f5devcentral/f5-xc-terraform-examples/raw/main/workflow-guides/NAP_API_Policy.tgz -o /etc/app_protect/bundles/NAP_API_Policy.tgz
          volumeMounts:
            - name: app-protect-bundles
              mountPath: /etc/app_protect/bundles
      containers:
        - name: nginx
          image: <registry-url>:tag-name
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: app-protect-bd-config
              mountPath: /opt/app_protect/bd_config
            - name: app-protect-config
              mountPath: /opt/app_protect/config
            - name: nginx-conf-map-api-volume
              mountPath: /etc/nginx/nginx.conf
              subPath: nginx.conf
            - name: nap-api-policy-volume
              mountPath: /etc/nginx/NAP_API_Policy.json
              subPath: NAP_API_Policy.json
        - name: waf-enforcer
          image: private-registry.nginx.com/nap/waf-enforcer:5.2.0
          imagePullPolicy: IfNotPresent
          env:
            - name: ENFORCER_PORT
              value: "50000"
          volumeMounts:
            - name: app-protect-bd-config
              mountPath: /opt/app_protect/bd_config
        - name: waf-config-mgr
          image: private-registry.nginx.com/nap/waf-config-mgr:5.2.0
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - all
          volumeMounts:
            - name: app-protect-bd-config
              mountPath: /opt/app_protect/bd_config
            - name: app-protect-config
              mountPath: /opt/app_protect/config
            - name: app-protect-bundles
              mountPath: /etc/app_protect/bundles
      volumes:
        - name: app-protect-bd-config
          emptyDir: {}
        - name: app-protect-config
          emptyDir: {}
        - name: app-protect-bundles
          emptyDir: {}
        - name: nginx-conf-map-api-volume
          configMap:
            name: nginx-conf-map-api
        - name: nap-api-policy-volume
          configMap:
            name: nap-api-policy
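Once the bundle URL and image reference are updated, the deployment can be applied and checked with something like the following (the manifest file name is assumed):

kubectl apply -f nap5-deployment.yaml
# All three containers (nginx, waf-enforcer, waf-config-mgr) should become Ready
kubectl get pods -l app=nap5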
- Next, deploy the NGINX App Protect service using the below file
apiVersion: v1
kind: Service
metadata:
  name: nap5
  labels:
    app: nap5
    service: nap5
spec:
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  selector:
    app: nap5
  type: ClusterIP
- Check the cluster services and copy the NGINX App Protect service cluster IP, as sketched below
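A minimal sketch of those two steps (the manifest file name is an assumption):

# Create the Service from the manifest above
kubectl apply -f nap5-service.yaml
# Copy the CLUSTER-IP value shown for the nap5 service
kubectl get svc nap5 -n default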
- Update the App Protect cluster IP address in the openAI address of your Gen AI application frontend yaml file, then apply it to create the frontend deployment and load balancer service; an illustrative fragment follows
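For illustration, if the frontend reads its OpenAI-compatible endpoint from an environment variable, the relevant fragment of the frontend deployment might look like this (the variable name OPENAI_API_BASE and the image placeholder are assumptions; use whatever setting your frontend actually exposes):

containers:
  - name: frontend
    image: <frontend-image>
    env:
      # Point LLM traffic at the NGINX App Protect service cluster IP
      # so every response is inspected before reaching the user
      - name: OPENAI_API_BASE
        value: "http://<nap5-cluster-ip>:80"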
Testing:
- Once setup is complete, list the cluster services and open the Gen AI front-end load balancer service URL in a browser
- Enter the web page input as https://dlptest.com/sample-data/namessndob/ and in the query field ask "What is Robert Aragon's SSN?"
- After some time, validate that the SSN is masked in the response
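To confirm the masking came from App Protect rather than the model, you can also inspect the security log configured in nginx.conf; a hedged sketch (the pod name is a placeholder, and which container holds the file may vary, so adjust -c accordingly):

# security.log path matches app_protect_security_log in nginx.conf above
kubectl exec -it <nap5-pod-name> -c waf-config-mgr -- \
  tail /etc/app_protect/bundles/security.log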
Conclusion:
This article highlights the critical security gaps present in current Gen AI applications, emphasizing the urgent need for robust protection in LLM deployments. In the latter half, we demonstrated how F5 NGINX App Protect v5, with its advanced security features, offers an effective way to mitigate OWASP LLM Top 10 risks. By leveraging these capabilities, organizations can significantly enhance the security and resilience of their AI applications.
References:
- https://genai.owasp.org/llm-top-10/
- https://genai.owasp.org/llmrisk/llm06-sensitive-information-disclosure/
- https://docs.nginx.com/nginx-app-protect-waf/v5/admin-guide/deploy-on-kubernetes/
- https://docs.nginx.com/nginx-app-protect-waf/v5/admin-guide/compiler/
NOTE: This article covered only one risk; stay tuned for more articles on preventing the remaining OWASP LLM Top 10 risks using F5 products.