How I did it - "Securing NVIDIA's Morpheus AI Framework with NGINX Plus Ingress Controller"
Hello! In this installment of "How I Did It," we continue our journey into AI security. Below I have documented how I deployed an NVIDIA Morpheus AI infrastructure along with F5's NGINX Plus Ingress Controller to provide secure and scalable external access.
The NVIDIA Morpheus AI Framework is a cybersecurity framework designed to detect and mitigate threats in real time by leveraging AI and machine learning. It provides tools for deep packet inspection, anomaly detection, and automated response, enabling organizations to protect their data and infrastructure from advanced cyber threats. The framework is optimized for use with NVIDIA GPUs, ensuring high performance and scalability. Morpheus can run on Kubernetes, allowing for scalable, flexible, and efficient management of AI workloads across distributed environments. In addition to Morpheus, NVIDIA offers several NIM microservices that utilize the same AI engine (Triton Inference Server).
NVIDIA NIM (NVIDIA Inference Microservices) is a suite of containerized microservices designed to simplify and accelerate the deployment, management, and scaling of AI inference workloads on Kubernetes clusters. It enables seamless integration with NVIDIA GPUs, facilitating efficient resource allocation and optimization for AI workloads.
NGINX Plus Ingress Controller is architected to provide and manage external connectivity to applications running on Kubernetes. It enhances the standard Kubernetes Ingress capabilities by providing advanced features such as SSL/TLS termination, traffic throttling, and advanced routing based on request attributes. NGINX Plus can also integrate with external authentication services and provide enhanced security features like NGINX App Protect. The controller is designed to improve the performance, reliability, and security of applications deployed on Kubernetes, making it a popular choice for managing ingress traffic in production environments.
Okay, with the major players identified, let's see How I did it....
Prerequisites
- GPU-enabled Kubernetes cluster - I utilized a Standard_NC6s_v3 with a V100 GPU running on an Azure AKS cluster. To enable the cluster to utilize the GPU, I deployed the NVIDIA GPU Operator
- NVIDIA NGC Account - You will need an NGC login as well as a generated API key to pull the NVIDIA images. Additional NIMs, containers, models, etc. listed may require an NVIDIA Enterprise subscription. You can request a 90-day evaluation as well.
- NGINX Plus License - Log into or create an account on the MyF5 Portal, navigate to your subscription details, and download the relevant .JWT files. With respect to licensing, if needed, F5 provides a fully functional 30-day trial license.
- Service TLS Certificate(s) - The embedded Triton Inference Server exposes three (3) endpoints for HTTP, gRPC, and metrics. You will need to create Kubernetes TLS secret(s) referencing the cert/key pair(s). In the walkthrough below, I combined my three self-signed certificates and keys (PEM format) and generated a single TLS secret.
- Helm - Each of the pods (Morpheus AI Engine, Morpheus SDK Client, Morpheus MLflow, and NGINX Plus Ingress Controller) is deployed using a Helm chart.
Let's Get Started 😀
Download the Helm Charts
Prior to deploying any resources, I used VSCode to create an 'nvidia' folder and fetched/pulled down the Helm charts into individual subfolders for the Morpheus AI Engine, Morpheus SDK Client, Morpheus MLflow, and NGINX Plus Ingress Controller.
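While exact chart versions will change over time, the pulls looked roughly like the following (the chart version is a placeholder, and the NVIDIA charts authenticate with the NGC API key generated in the next step):

helm fetch https://helm.ngc.nvidia.com/nvidia/morpheus/charts/morpheus-ai-engine-<version>.tgz --username='$oauthtoken' --password=<ngc api key> --untar
helm fetch https://helm.ngc.nvidia.com/nvidia/morpheus/charts/morpheus-sdk-client-<version>.tgz --username='$oauthtoken' --password=<ngc api key> --untar
helm fetch https://helm.ngc.nvidia.com/nvidia/morpheus/charts/morpheus-mlflow-<version>.tgz --username='$oauthtoken' --password=<ngc api key> --untar
helm repo add nginx-stable https://helm.nginx.com/stable
helm pull nginx-stable/nginx-ingress --untar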
Create and export NGC API Key
From the NVIDIA NGC portal, I navigated to 'Setup' and 'Generate API Key'
From the upper right corner, I selected '+ Generate API Key' and confirmed the creation of the key. I copied the key from the screen and stored it for the next step.
From the command line, I created an environment variable, API_KEY, specifying the copied API key. This environment variable will be referenced in the NVIDIA Helm installs.
export API_KEY=<ngc apikey>
Create Kubernetes Secrets
From the MyF5 Portal, navigate to your subscription details and download the relevant .JWT file. With the JWT token in hand, I created the docker registry secret below. This will be used to pull the NGINX Plus Ingress Controller image from the private registry.
kubectl create secret docker-registry regcred --docker-server=private-registry.nginx.com --docker-username=<jwt token> --docker-password=none
I also needed to create a Kubernetes TLS secret holding the certificate/key pair for my NGINX-hosted endpoints. For ease, I concatenated the three certificates into a single PEM file and the three keys into another.
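Combining the files is simple concatenation; something like the below, where the per-endpoint file names are hypothetical:

cat triton-http-cert.pem triton-grpc-cert.pem triton-metrics-cert.pem > combined-cert.pem
cat triton-http-key.pem triton-grpc-key.pem triton-metrics-key.pem > combined-key.pem

I then used the command below to create the secret.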
kubectl create secret tls tls-secret --cert=/users/coward/certificates/combined-cert.pem --key=/users/coward/certificates/combined-key.pem
Deploy NGINX Plus Ingress Controller
Before deploying the NGINX Helm chart, I first needed to modify a few settings in the 'values.yaml' file. Since I wanted to deploy NGINX Plus Ingress Controller with NGINX App Protect (NAP), I specified the private registry repo image, set 'nginxplus' to true, and enabled NGINX App Protect.
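The relevant portion of my 'values.yaml' looked roughly like the following; treat this as a sketch, as the image repository/tag and pull-secret key names should be confirmed against the documentation for your chart version:

controller:
  nginxplus: true
  appprotect:
    enable: true
  image:
    repository: private-registry.nginx.com/nginx-ic-nap/nginx-plus-ingress
    tag: <version>
  serviceAccount:
    imagePullSecretName: regcred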
I used the below command to deploy the Helm chart.
helm install nginx-ingress ./nginx-ingress
I made note of the ingress controller's EXTERNAL IP and updated my domain's (f5demo.net) DNS records (triton-http.f5demo.net, triton-grpc.f5demo.net, triton-http-metrics.f5demo.net) to reflect the public address.
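The EXTERNAL IP can be pulled from the ingress controller's LoadBalancer service; the service name below reflects my Helm release and may differ in your environment.

kubectl get svc nginx-ingress-controller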
I used the below command to port-forward the ingress controller pod to my local machine and accessed the dashboard at localhost:8080/dashboard.html.
kubectl port-forward nginx-ingress-controller-6489bd6546 8080:8080
Deploy Morpheus AI Engine
Update the values.yaml file
I updated the 'values.yaml' file to include FQDN tags for the published services.
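In my case, the added tags looked similar to the below; the key names simply need to match what the virtual server templates reference.

tags:
  http_fqdn: triton-http.f5demo.net
  grpc_fqdn: triton-grpc.f5demo.net
  metrics_fqdn: triton-http-metrics.f5demo.net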
In addition, I created three separate NGINX virtual server resource objects corresponding to the three Triton server endpoints. The virtual server resources map the Morpheus AI Engine endpoints (8000, 8001, 8002) to the already deployed NGINX Ingress Controller. The contents of each file are included below.
apiVersion: k8s.nginx.org/v1
kind: VirtualServer
metadata:
  name: {{ .Values.tags.http_fqdn }}
spec:
  host: {{ .Values.tags.http_fqdn }}
  tls:
    secret: tls-secret
  gunzip: on
  upstreams:
    - name: triton-http
      service: ai-engine
      port: 8000
  routes:
    - path: /
      action:
        pass: triton-http
apiVersion: k8s.nginx.org/v1
kind: VirtualServer
metadata:
  name: {{ .Values.tags.grpc_fqdn }}
spec:
  host: {{ .Values.tags.grpc_fqdn }}
  tls:
    secret: tls-secret
  upstreams:
    - name: triton-grpc
      service: ai-engine
      port: 8001
      type: grpc
  routes:
    - path: /
      action:
        pass: triton-grpc
apiVersion: k8s.nginx.org/v1
kind: VirtualServer
metadata:
  name: {{ .Values.tags.metrics_fqdn }}
spec:
  host: {{ .Values.tags.metrics_fqdn }}
  tls:
    secret: tls-secret
  gunzip: on
  upstreams:
    - name: triton-metrics
      service: ai-engine
      port: 8002
  routes:
    - path: /
      action:
        pass: triton-metrics
With the above files created, I deployed the Helm chart using the following command.
helm install --set ngc.apiKey="$API_KEY" morpheus-ai ./morpheus-ai-engine
I returned to my NGINX dashboard to verify connectivity with the ingress controller and the upstream Morpheus service.
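As an additional check, the VirtualServer resources themselves can be queried; a STATE of 'Valid' indicates the ingress controller accepted the configuration.

kubectl get virtualservers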
Deploy Morpheus SDK Client
The SDK Client hosts shared models and data libraries. More importantly, the client provides a platform to deploy Morpheus pipelines. I deployed the Helm chart using the following command.
helm install --set ngc.apiKey="$API_KEY" morpheus-sdk ./morpheus-sdk-client
Once the sdk-cli-helper pod reached a running state, I used the below commands to connect to the pod and copy models/data to the shared volume. The shared volume will be accessed by the Morpheus MLflow pod to publish and deploy models to the AI engine.
kubectl exec sdk-cli-helper -- cp -RL /workspace/models /common
kubectl exec sdk-cli-helper -- cp -R /workspace/examples/data /common
Deploy Morpheus MLflow
The Morpheus MLflow service is used to publish and deploy models to the Morpheus AI Engine (Triton Inference Server). Once it was deployed, I exec'd into the pod to publish/deploy models.
helm install --set ngc.apiKey="$API_KEY" morpheus-mflow ./morpheus-mlflow
The Morpheus AI engine Helm chart deploys an instance of Kafka and Zookeeper to facilitate Morpheus pipelines. To enable pipelines, I created two (2) Kafka topics with the below commands.
kubectl exec deploy/broker -c broker -- kafka-topics.sh --create --bootstrap-server broker:9092 --replication-factor 1 --partitions 3 --topic mytopic
kubectl exec deploy/broker -c broker -- kafka-topics.sh --create --bootstrap-server broker:9092 --replication-factor 1 --partitions 3 --topic mytopic-out
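To confirm the topics were created, they can also be listed from the broker.

kubectl exec deploy/broker -c broker -- kafka-topics.sh --list --bootstrap-server broker:9092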
Deploy Models
The final step in the deployment was to connect to the MLflow pod and publish/deploy a couple of models to the Triton Inference Server.
kubectl exec -it deploy/mlflow -- bash
Now connected to the pod's command line, I took a quick look at the available models in the shared volume.
ls -lrt /common/models
I used the below commands to publish and deploy two models (sid-minibert-onnx, phishing-bert-onnx) from the shared volume.
python publish_model_to_mlflow.py --model_name sid-minibert-onnx --model_directory /common/models/triton-model-repo/sid-minibert-onnx --flavor triton
mlflow deployments create -t triton --flavor triton --name sid-minibert-onnx -m models:/sid-minibert-onnx/1 -C "version=1"
python publish_model_to_mlflow.py --model_name phishing-bert-onnx --model_directory /common/models/triton-model-repo/phishing-bert-onnx --flavor triton
mlflow deployments create -t triton --flavor triton --name phishing-bert-onnx -m models:/phishing-bert-onnx/1 -C "version=1"
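As a quick sanity check, the MLflow Triton plugin can list what it has deployed to the inference server.

mlflow deployments list -t triton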
Validate Remote Access to Models
With my Morpheus infrastructure deployed and sitting securely behind my NGINX Ingress Controller, there's nothing left to do but test with a couple of curl commands. I'll first use curl to hit the inference server's readiness endpoint.
curl -v -k https://triton-http.f5demo.net/v2/health/ready
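If the inference server is ready to accept requests, Triton returns an HTTP 200 with an empty body; any other response indicates the server is still starting up or is unhealthy.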
Finally, I'll use curl to validate that I can reach one of the hosted models.
curl -k https://triton-http.f5demo.net/v2/models/sid-minibert-onnx/config | jq
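This endpoint returns the model's Triton configuration (inputs, outputs, platform, batching settings) as JSON, which jq pretty-prints for readability.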
Check it Out
Want to get a feel for it before trying yourself? The video below provides a step-by-step walkthrough of the above deployment.
Additional Links
How I did it - "Securing Nvidia Triton Inference Server with NGINX Plus Ingress Controller”