BIG-IP Next Automation: AS3 Basics
I need a little Mr. Miyagi right now to grab my face and intently look me in the eye and give me a "Concentrate! Focus power!" For those of you youngins' who don't know who that is, he's the OG Karate Kid mentor. Anyway, I have a thousand things I want to say about AS3 but in this article, I'll attempt to cut this down to a narrow BIG-IP Next-specific context to get you started. It helps that last December I did a five-part streaming series on AS3 in the BIG-IP classic context. If you haven't seen that, you have my blessing to stop right now, take some time to digest AS3 conceptually and practice against workloads and configurations in BIG-IP classic that you know and understand, before returning here to embrace all the newness of BIG-IP Next. AS3 is FOUNDATIONAL in BIG-IP Next In classic BIG-IP, you could edit the bigip.conf file directly, use tmsh commands, or iControlREST commands to imperatively create/modify/delete BIG-IP objects. With the exception of system configuration and shared configuration objects, this is not the case with BIG-IP Next. All application configuration is AS3 at its lowest state level. This doesn't mean you have to work primarily in AS3 configuration. If you utilize the migration utility in Central Manager, it will generate the AS3 necessary to get your apps up and running. Another option is to use the built-in http FAST template (we'll cover FAST in later articles) to build out an application from scratch in the GUI. But if you use features outside the purview of that template, or you need to edit your migration output, you'll need to work in the AS3 configuration declaration, even if just a little bit. Apples to Apples It's a fun card game, no? My family takes it to snarky absurd levels of sarcasm, to the point that when we play with "outsiders" we get lots of blank looks and stares as we're all rolling on the floor laughing. Oh well, to each his own. But we're here to talk about AS3, right? Well, in BIG-IP Next, there is a compatibility interface for AS3, such that you can take a declaration from BIG-IP classic and as long as the features within that declaration are supported, it should "just work" via the Central Manager API. That's pretty cool, right? Let's start with a basic application declaration from the recent video posted by Mark_Dittmerexploring the API differences between classic and Next. { "class": "ADC", "schemaVersion": "3.0.0", "id": "generated-for-testing", "Tenant_1": { "class": "Tenant", "App_1": { "class": "Application", "Service_1": { "class": "Service_HTTP", "virtualAddresses": [ "10.0.0.1" ], "virtualPort": 80, "pool": "Pool_1" }, "Pool_1": { "class": "Pool", "members": [ { "servicePort": 80, "serverAddresses": [ "10.1.0.1", "10.1.0.2" ] } ] } } } } A simple VIP with a pool with two pool members. A toy config to be sure, but it is useful here to show the format (JSON) of an AS3 declaration and some of the schema as well. With the compatibility interface, this same declaration can be posted to a classic BIG-IP like this: POST https://<BIG-IP IP Address>/mgmt/shared/appsvcs/declare Or a BIG-IP Next instance like this: POST https://<Central Manager IP Address>/api/v1/spaces/default/appsvcs/declare?target_address=<BIG-IP Next instance IP Address> For those already embracing AS3, this compatibility interface in BIG-IP Next should make the transition easier. AS3 Workflow in BIG-IP Next With BIG-IP classic, you had to install the AS3 package (technically an iControl LX, or sometimes referenced as an iApps v2 package) onto each BIG-IP system you wanted to use the AS3 declarative configuration model on. Each BIG-IP was an island, and the configuration management of the overall system of BIG-IPs was reliant on an external system for source of truth. With BIG-IP Next, the Central Manager API has native AS3 support so there are no packages to install to prepare the environment. Also, Central Manager is the centralized AS3 interface for all Next instances. This has several benefits: A singular and centralized source of truth for your configuration management No external package management requirements Tremendous improvement in API performance management since most of the heavy lifting is offloaded from the instances and onto Central Manager and the control-plane functionality that remains on the instance is intentionally designed for API-first operations The general application deployment workflow introduced exclusively for Next is twofold: Create an application service First, you create the application service on Central Manager. You can use the same JSON declaration from the section above here, only the API endpoint is different: POST https://<Central Manager IP Address>/api/v1/spaces/default/appsvcs/documents A successful transaction will result in an application service document on Central Manager. A couple notes on this at time of writing: Documents created through the API are not validated against the journeys migration tool that is available for use in the Central Manager GUI. Documents are not schema validated at the attribute level of classes, so whereas a class used in classic might be supported in Next, some of the attributes might not be. This means that whereas the document creation process can appear successful, the deployment will fail if classes and/or class attributes supported in classic BIG-IP are present in the AS3 declarations when an attempt to apply to an instance occurs. Deploy the application service Assuming, however, all your AS3 work is accurate to the Next-supported schema, you post the specified document by ID to the target BIG-IP Next instance, here as a JSON payload versus a query parameter on the compatibility endpoint shown earlier. POST https://<Central Manager IP Address>/api/v1/spaces/default/appsvcs/documents/<Document ID>/deployments { "target": "<BIG-IP Next Instance IP Address>" } At this point, your service should be available to receive traffic on the instance it was deployed on. Next Up... Now that we have the theory in place, join me next time where we'll take a look at working with a couple application services through both approaches. Resources CM App Services Management AS3 Schema AS3 User Guide (classic, but useful) AS3 Reference Guide (classic, but useful) AS3 Foundations (streaming series)205Views0likes0CommentsGovernance and Automation - Distributed Apps for Hybrid Cloud Architecture
Overview This is the first of a three part series of demo and showcase created for Singapore Govware 2022 - Asia largest Cyber Security Conference. Part 1 - Governance and Automation - Distributed Apps for Hybrid Cloud Architecture Demonstrate how F5 Distributed Cloud integration with CI/CD to govern, control and automate deployment of a modern distributed apps across a hybrid cloud environment. Part 2 - F5 Distributed Cloud API Security and Discovery with Advance ML Demonstrate how F5 Distributed Cloud helps to discover and map APIs, block unwanted connection and prevent data leakage. Part 3 - F5 Distributed Cloud BOT Defense to prevent fraud and risk cause by bot automation Demonstrate how F5 Distributed Cloud helps in defense against malicious bots and ensure safe, fast and seamless user experiences for your digital assets. Learn the best ways to incorporate API security controls. Governance and Automation - Distributed Apps for Hybrid Cloud Architecture The important of a solid software release methodology Applications are essential to our daily life. It is a critical asset for all of us as well as for organisation. Hence, for organization, it is critical to ensure apps are robust, secure, fast and appealing to user. To maintain a robust, secure, fast and appealing apps, it is important to ensure those apps are frequently released in a timely manner. Apps releases need to be supported by a solid software release methodology. This demostration will showcase how an organization can leverage F5 Distributed Cloud Platform couple with various open source tools to deliver an application securely to production via Continious Integration and Continious Deployment (CI/CD) pipeline. The entire appliation delivery lifecycle follow the lifecyce from Day-0, Day-1 and Day2. Securing and operationalizing the following distributed application with hybrid edge is complex and difficult. Let’s see how F5 Distributed Cloud Platform can make it seamless and easy. Sample application - Arcadia Financial will be use for this demo. This application consist of many components/microservices that are distributed across many sites - AWS, Azure, Ruggadized hardware, Data Center1 and Data Center2. All those compoenents communicate with each other via a secure application mesh fabric (IPSEC/SSL tunnel). Security policies (e.g. Web Application Firewall, API Security and Bot management policy) are enforced and in position to secure, control and monitor traffic between all sites. Application developer can securely deploy apps onto the platform with the assurance that they are getting a consistant protections. Application Delivery Framework This framework encompasses BUILD, CHANGE and OPERATE methodology as shown below. Here is the full video demonstration. Here is the Software Development Lifecycle (SDLC) from code commit to secure image scanning to testing in stagging/dev to production releases. Pipeline will start upon code commit by software developer. CODESparks - ShipHats is a CI/CDtools that help shorten development life cycles and speed up software delivery while ensuring security and reliability. CODESparks will trigger the deployment of apps onto F5 Distributed Cloud Platform. Please refer to the video. Once application deployed, platform operator, application owner, netops, secops and developer can visualize and monitor those application from a centralized SaaS console. Below is a service graph observability obtained from F5 Console. It gave a clear visibility on where are those site located and the service interaction of those component of the apps Below shown the microservices interaction and it associated health. If any part of the microservices not healthy, color will change to amber or red. Summary of all 3 parts Part 1 - Governance and Automation - Distributed Apps for Hybrid Cloud Architecture Part 2 - API Security Strategy - Discover and map APIs, block unwanted connection and prevent data leakage Part 3 - Bot Management Strategy - Defense against malicious bots with F5 Distributed Cloud Bot Defense2.1KViews3likes0CommentsHow to Split DNS with Managed Namespace on F5 Distributed Cloud (XC) Part 2 – TCP & UDP
Re-Introduction In Part 1, we covered the deployment of the DNS workloads to our Managed Namespace and creating an HTTPS Load Balancer and Origin Pool for DNS over HTTPS. If you missed Part 1, feel free to jump over and give it a read. In Part 2, we will cover creating a TCP and UDP Load Balancer and Origin Pools for standard TCP & UDP DNS. TCP Origin Pool First, we need to create an origin pool. On the left menu, under Manage, Load Balancers, click Origin Pools. Let’s give our origin pool a name, and add some Origin Servers, so under Origin Servers, click Add Item. In the Origin Server settings, we want to select K8s Service Name of Origin Server on given Sites as our type, and enter our service name, which will be the service name from Part 1 and our namespace, so “servicename.namespace”. For the Site, we select one of the sites we deployed the workload to, and under Select Network on the Site, we want to seledt vK8s Networks on the Site, then click Apply. Do this for each site we deployed to so we have several servers in our Origin Pool. In Part 1, our Services defined the targetPort as 5553. So, we set Port to 5553 on the origin. This is all we need to configure for our TCP Origin, so click Save and Exit. TCP Load Balancer Next, we are going to make a TCP Load Balancer, since its less steps (and quicker) than a UDP Load Balancer (today). On the left menu under Manage, Load Balancers, select TCP Load Balancers. Let’s set a name for our TCP LB and set our listen port, 53 is a reserved port on Customer Edge Sites so we need to use something else, so let’s use 5553 again, under origin pools we set the origin that we created previously, and then we get to the important piece, which is Where to Advertise. In Part 1 we advertised to the internet with some extra steps on how to advertise to an internal network, in this part we will advertise internally. Select Advertise Custom, then click edit configuration. Then under Custom Advertise VIP Configuration, click Add Item. We want to select the Site where we are going to advertise, the network interface we will advertise.Click Apply, then Apply again. We don’t need to configure anything else, so click Save and Exit. UDP Load Balancer For UDP Load Balancers we need to jump to the Load Balancer section again, but instead of a load balancer, we are going to create a Virtual Host which are not listed in the Distributed Applications tile, so from the top drop down “Select Service” choose the Load Balancers tile. In the left menu under Manage, we go to Virtual Hosts instead of Load Balancers. The first thing we will configure is an Advertise Policy, so let’s select that. Advertise Policy Let’s give the policy a name, select the location we want to advertise on the Site Local Inside Network, and set the port to 5553. Save and Exit. Endpoints Now back to Manage, Virtual Hosts, and Endpoints so we can add an endpoint. Name the endpoint and specify based on the screenshot below. Endpoint Specifier: Service Selector Info Discovery: Kubernetes Service: Service Name Service Name: service-name.namespace Protocol: UDP Port: 5553 Virtual Site or Site or Network: Site Reference: Site Name Network Type: Site Local Service Network Save and Exit. Cluster The Cluster configuration will be simple, from Manage, Virtual Hosts, Clusters, add Cluster. We just need a name and select the Origin Servers / Endpoints and select the endpoint we just created. Save and Exit. Route The Route configuration will be simple as well, from Manage, Virtual Hosts, Routes, add Route. Name the route and under List of Routes click Configure, then Add Item. Leave most settings as they are, and under Actions, choose Destination List, then click Configure. Under Origin Pools and Weights, click Add Item. Under Cluster with Weight and Priority select the cluster we created previously, leave Weight as null for this configuration, then click Apply, apply again, apply again, Apply again, Save and Exit. Now we can Finally create a Virtual Host. Virtual Host Under Manage, Virtual Host, Select Virtual Host, then Click Add Virtual Host. There are a ton of options here, but we only care about a couple. Give the Virtual Host a name. Proxy Type: UDP Proxy Advertise Policy: previously created policy Moment of Truth, Again Now that we have our services published we can give them a test. Since they are currently on a non standard port, and most systems dont let us specify a port in default configurations we need to test with dig, nslookup, etc. To test TCP with nslookup: nslookup -port=5553 -vc google.com 192.168.125.229 Server: 192.168.125.229 Address: 192.168.125.229#5553 Non-authoritative answer: Name: google.com Address: 142.251.40.174 To test UDP with nslookup: nslookup -port=5553 google.com 192.168.125.229 Server: 192.168.125.229 Address: 192.168.125.229#5553 Non-authoritative answer: Name: google.com Address: 142.251.40.174 IP Tables for Non-Standard DNS Ports If we wanted to use the nonstandard port tcp/udp dns on Linux or MacOS, we can use IPTABLES to forward all the traffic for us. There isnt a way to set this up in Windows OS today, but as in Part 1, Windows Server 2022 supports encrypted DNS over HTTPS, and it can be pushed as policy through Group Policy as well. iptables -t nat -A PREROUTING -i eth0 -p udp –dport 53 -j DNAT –to XXXXXXXXXX:5553 iptables -t nat -A PREROUTING -i eth0 -p udp –dport 53 -j DNAT –to XXXXXXXXXX:5553 iptables -t nat -A PREROUTING -i eth0 -p tcp –dport 53 -j DNAT –to XXXXXXXXXX:5553 iptables -t nat -A PREROUTING -i eth0 -p tcp –dport 53 -j DNAT –to XXXXXXXXXX:5553 "Nature is a mutable cloud, which is always and never the same." - Ralph Waldo Emerson We might not wax that philosophically around here, but our heads are in the cloud nonetheless! Join the F5 Distributed Cloud user group today and learn more with your peers and other F5 experts. Conclusion I hope this helps with a common use-case we are hearing every day, and shows how simple it is to deploy workloads into our Managed Namespaces.1.1KViews1like1CommentEmbracing AS3: Foundations
(updated to remove the event-nature of this post) Last fall, a host of teams took to the road to support the launch of BIG-IP Next in the form of F5 Academy roadshows, where we shared the BIG-IP story: where we started, where we are, and where we're going with it; complete with hands-on LTM and WAF labs with the attendees. For this spring's roadshows, we added SSLO and Access labs. Across the fall and spring legs, I attended in person in Kansas City (twice!), St. Louis, Cincinnati, Columbus, Omaha, and (soon) Chicago and talked with customers at all stages of the automation journey. Some haven't automated much of anything. Some have been using a variety of on-/off-box scripts, and some are all-in, baby! That said, when I ask about AS3 as a tool in their tool belt, not that many have adopted or even investigated it yet. For classic BIG-IP AS3 is not a requirement, but in BIG-IP Next, AS3 is a critical component as it's THE underlying configuration language for all applications. Because of this, I did a five-part live stream series in December to get you started with AS3. Details below. Beyond Imperatives—What the heck is AS3? In this first episode, I covered the history of automation on BIG-IP, the differences between imperative and declarative models, and the basics of AS3 from data structure and systems architecture perspectives. Top 10 Features to Know in the VSCode F5 Extension Special guest and friend of the show Ben Novak, author of the F5 Extension for VSCode, joined me to dig into the features most important to know and learn to use for understanding how to take stock BIG-IP configurations and turn applications into declarations. Migrating and Deploying Applications in VSCode In this episode, armed with the knowledge I learned from Ben in the last episode, I dug into the brass tacks of AS3! I reviewed diagnostics and migrated applications from standard configuration to AS3 declarations and deployed as well. For migrating active workloads, I discussed the steps necessary to reduce transition impact. Creating New Apps and Using Shared Objects I've always tried to alter existing things before creating new things with tech, and this is no different. Now that I had a few migrations under my belt, I attacked a net-new application and looked at shared objects and how and when to use them. Best Practices Finally, I closed this series with a look at several best practices when working with AS3. The conclusion to this series, though, is hopefully just the stepping off point for everyone new to AS3, and we can continue the conversation right here on DevCentral.2.1KViews7likes0CommentsSecuring and Scaling Hybrid apps with F5 NGINX (Part 2)
If you attended a cybersecurity trade-show lately, you may have noticed the term “Zero Trust (ZT)” advertised on almost every booth. It may seem like most security companies are offering the same value proposition: ‘Securing apps with ZT’. Its commonality stems from the fact that ZT is a broad term that can span endless use cases. ZT is not a feature or capability, rather a philosophy embraced by IT security leaders based on the idea that all traffic entering and exiting a system is not trusted and must be scrutinized before passing through. Organizations are shifting to a zero-trust mindset due to the increased complexity of cyber-attacks. Perimeter based firewalls are no longer sufficient in securing digital resources. In Part 1 of our series, we configured NGINX Plus as an external load balancer to route and terminate TLS traffic to cluster nodes. In this part of the series, we leverage the same NGINX Plus deployment to enable ZT use cases that will improve the security posture of your hybrid applications. NOTE: Part 1 of the series is a prerequisite for enabling the ZT use cases in our examples. Please ensure that part 1 is completed before starting with part 2 Part 1 of the series ZT Use case #1: OIDC Authentication OIDC (OpenID Connect) is an authentication layer on top of the OAuth 2.0 framework. Many organizations will choose OIDC to authenticate digital identities and enable SSO (Single-Sign-On) for consumer applications. With Single-Sign-on technologies, users gain access to multiple applications with one set of user credentials by authenticating their identities through an IdP (Identity Provider). I can configure NGINX Plus to operate as an OIDC relaying party to exchange and validate ID tokens with the IdP (Identity Provider), in addition to basic reverse-proxy load balancing configured in part 1. I will extend the architecture in part 1 with an IdP and configure NGINX Plus as the identity aware proxy. Prerequisites for NGINX Plus Before configuring NGINX Plus as the OIDC identity aware proxy: 1. Installation of NJS is required. $ sudo apt-get install nginx-plus-module-njs 2. Load the NJS module into the NGINX configuration by adding the following line at the top of yournginx.conf file. load_module modules/ngx_http_js_module.so; 3. Clone the OIDC GitHub repository in your directory of choice cd /home/ubuntu && git clone --branch R28 https://github.com/nginxinc/nginx-openid-connect.git Setting up the IdP The IdP will manage and store digital identities to mitigate attackers from impersonating users to steal sensitive information. There are many IdP vendors to choose from; Okta, Ping Identity, Azure AD. We chose Okta as the IdP in our example moving forward. If you do not have access to an IdP, you can quickly get started with the Okta Command Line Interface (CLI) and run the okta register command to sign up for a new account. Once account creation is successful, we will use the Okta CLI to preconfigure Okta as the IdP, creating what Okta calls an app integration. Other IdPs will have different nomenclatures defining an application integration. For example, Azure AD will call them App registrations. If you are not using Okta, you can follow the documentation of the IdP you are using and skip to the next section (Configuring NGINX as the OpenID Connect relying party). 1. Run the okta login command to authenticate the Okta Cli with your Okta developer account. Enter your Okta domain and API token at the prompts $ okta login Okta Org URL: https://your-okta-domain Okta API token: your-api-token 2. Create the app integration $ okta apps create --app-name=mywebapp --redirect-uri=https://<nginx-plus-hostname>:443/_codexch where --app-name defines the application name (here, mywebapp) --redirect-uri defines the URI to which sign-ins are redirected to the NGINX Plus. <nginx-plus-hostname> should resolve to the NGINX Plus external IP configured in part 1. We use port 443 since TLS termination is configured on NGINX Plus from part 1. Recall we used self-signed certificates and keys to configure TLS on NGINX Plus. In a production environment, we recommend using certs/keys issued by a trusted certificate authority such as Let’s Encrypt. Once the command from step #2 is completed, the client ID and Secret generated from the app integration can be found in ${HOME}/.okta.env Configuring NGINX as the OpenID Connect relying party Now that we have finished setting up our IdP, we can now start configuring NGINX Plus as the OpenID Connect relying party. Once logged into the NGINX Plus instance, simply run the configuration script from your home directory. $ ./nginx-openid-connect/configure.sh -h <nginx-plus-hostname> -k request -i <YOURCLIENTID> -s <YOURCLIENTSECRET> –xhttps://dev-xxxxxxx.okta.com/.well-known/openid-configuration where -h defines the hostname of NGINX Plus -k defines how NGINX will retrieve JWK files to validate JWT signatures. The JWK file is retrieved from a subrequest to the IdP -i defines the Client ID generated from the IdP -s defines the Client Secret generated from the IdP -x defines the URL of the OpenID configuration endpoint. Using Okta as the example, the URL starts with your Okta organization domain, followed by the path URI /.well-known/openid-configuration The configure script will generate OIDC config files for NGINX Plus. We will copy the generated config files into the /etc/nginx/conf.d directory from part 1. $ sudo cp frontend.conf openid_connect.js openid_connect.server_conf openid_connect_configuration.conf /etc/nginx/conf.d/ You will notice by default that frontend.conf listens on port 8010 with clear text http. We need to merge kube_lb.conf into frontend.conf to enable both use cases from part 1 and 2. The resulting frontend.conf should look something like this: https://gist.github.com/nginx-gists/af067326734063da6a4ff42146873262 Finally, I will need to edit the openid_connect_configuration.conf file and modify my client secret to the one generated by my Okta IdP. Reload NGINX Plus for the new config to take effect. $ nginx -s reload Testing the Environment Now we are ready to test our environment in action. To summarize, we set up an IdP and configured NGINX Plus as the identity aware proxy to validate user ID tokens before the entering the Kubernetes cluster. To test the environment, we will open a browser and enter the hostname of NGINX Plus into the address field. You should be redirected to your IdP login page. Note: The host name should resolve to the Public IP of the NGINX Plus machine. Once you are prompted with the IdP login page from your browser, you can access the Kubernetes pods once the user credentials are validated. User credentials should be defined from the IdP. Once you are logged into your application, the ID token of the authenticated is stored in the NGINX Plus Key-Value Store. Enabling PKCE with OIDC In the previous section, we learned how to configure NGINX Plus as the OIDC relying party to authenticate user identities attempting connections to protected Kubernetes applications. However, there are few cases where attackers can intercept code exchange requests issued from the IdP and hijack your ID tokens and gain access to your sensitive applications. PKCE is an extension of the OIDC Authorization code flow designed to protect against authorization code interception and theft. PKCE provides an extra layer of security where the attacker will need to provide a code verifier in addition to the authorization code in exchange for the ID token from the IdP. In our current setup, NGINX Plus will send a random generated PKCE code verifier as a query parameter when redirecting users to the IdP login page. The IdP will use this PKCE code verifier as extra validation when the authorization code is exchanged for the ID token. PKCE needs to be enabled from both NGINX Plus and the IdP. To enable PKCE verification on NGINX Plus, edit the openid_connect_configuration.conf file and modify $oidc_pkce_enable to 1 and reload NGINX Plus. Depending on the IdP you are using, a checkbox should be available to enable PKCE. Testing the Environment To test that PKCE is working, we will open a browser and enter the NGINX Plus host name once again. You should be redirected to the login page, only this time you will notice the URL had slightly changed with additional query parameters: code_challenge_method – Method used to hash the plain code verifier (most likely SHA256) code_challenge – The hashed value of the plain code verifier NGINX Plus will provide this plain code verifier along with the authorization code in exchange for the ID token. NGINX Plus will then validate the ID token and store it in cache. Extending OIDC with 3rd party Systems Customers may need to integrate their OIDC workflow with proprietary Auth/Auth systems already in production. For example, additional metadata pertaining to a user may need to be collected from an external Redis Cache or JFrog Artifactory. We can fill this gap by extending our diagram from the previous section. In addition to token validation with NGINX Plus, I pull response data from JFrog Artifactory and pass it to the backend applications once users are authenticated. Note: I am using JFrog Artifactory as an example 3rd party endpoint here. I can technically use any endpoint I want. Testing the Environment To test our environment in action, I will make a few updates to my NGINX OIDC configuration. You can pull our updated GitHub repository and use it as reference for updating your NGINX configuration. Update #1: Adding the njs source code The first update is extending NGINX Plus with njs and retrieving response data from our 3rd party system. Add the KvOperations.js source file in your /etc/nginx/njs directory. Update #2: frontend.conf I am adding lines 37-39 to frontend.conf so that NGINX Plus will initiate the sub-request to our 3rd party system after users are authenticated with OIDC. We are setting the URI of this sub-request to /kvtest. More on this on the next update. Update #3: openid_connect.server_conf We are adding lines 35-48 to openid_connect.server_conf consisting of two internal redirects in NGINX: /kvtest; Internal redirect from sub-requests with URI /kvtest will functions in KvOperations.js /auth; Internal redirect for sub-requests with URI /authwill be proxied to the 3rd party endpoint. You can replace the <artifactory-url> in line 47 with your own endpoints Update #4: openid_connect_configuration.conf This update is optional and applies when passing dynamic variables to your 3rd party endpoints. You can dynamically update variables on the fly by sending POST requests to the NGINX Plus Key-Value store. $ curl -iX POST -d '{"admin", "<input-value>"}' http://localhost:9000/api/7/http/<keyval-zone-name> We are defining/instantiating the new Key-Value store in lines 102-104.Once the updates are complete, I can test the optimized OIDC environment by troubleshooting/verifying the application is on the receiving end of my dynamic input values. Wrapping it up I covered a subset of ZT use cases with NGINX in hybrid architectures. The use cases presented in this article center around authentication. In the next part of our series, I will cover more ZT use case that include: Alternative authentication methods (SAML) Encryption Authorization and Access Control Monitoring/Auditing218Views0likes0CommentsiRules Style Guide
This article (formatted here in collaboration with and from the notes of F5er Jim_Deucker) features an opinionated way to write iRules, which is an extension to the Tcl language. Tcl has its own style guide for reference, as do other languages like my personal favorite python. From the latter, Guido van Rossum quotes Ralph Waldo Emerson: "A foolish consistency is the hobgoblin of little minds..," or if you prefer, Morpheus from the Matrix: "What you must learn is that these rules are no different than the rules of a computer system. Some of them can be bent. Others can be broken." The point? This is a guide, and if there is a good reason to break a rule...by all means break it! Editor Settings Setting a standard is good for many reasons. It's easier to share code amongst colleagues, peers, or the larger community when code is consistent and professional looking. Settings for some tools are provided below, but if you're using other tools, here's the goal: indent 4 spaces (no tab characters) 100-column goal for line length (120 if you must) but avoid line continuations where possible file parameters ASCII Unix linefeeds (\n) trailing whitespace trimmed from the end of each line file ends with a linefeed Visual Studio Code If you aren't using VSCode, why the heck not? This tool is amazing, and with the F5 Networks iRules extension coupled with The F5 Extension, you get the functionality of a powerful editor along with the connectivity control of your F5 hosts. With code diagnostics and auto formatting based on this very guide, the F5 Networks iRules Extension will make your life easy. Seriously...stop reading and go set up VSCode now. EditorConfig For those with different tastes in text editing using an editor that supports EditorConfig: # 4 space indentation [*.{irule,irul}] indent_style = space indent_size = 4 end_of_line = lf insert_final_newline = true charset = ascii trim_trailing_whitespace = true Vim I'm a vi guy with sys-admin work, but I prefer a full-fledge IDE for development efforts. If you prefer the file editor, however, we've got you covered with these Vim settings: # in ~/.vimrc file set tabstop=4 set shiftwidth=4 set expandtab set fileencoding=ascii set fileformat=unix Sublime There are a couple tools for sublime, but all of them are a bit dated and might require some work to bring them up to speed. Unless you're already a Sublime apologist, I'd choose one of the other two options above. sublime-f5-irules (bitwisecook fork, billchurch origin) for editing Sublime Highlight for export to RTF/HTML Guidance Watch out for smart-quotes and non-breaking spaces inserted by applications like Microsoft Word as they can silently break code. The VSCode extension will highlight these occurrences and offer a fix automatically, so again, jump on that bandwagon! A single iRule has a 64KB limit. If you're reaching that limit it might be time to question your life choices, I mean, the wisdom of the solution. Break out your iRules into functional blocks. Try to separate (where possible) security from app functionality from stats from protocol nuances from mgmt access, etc. For example, when the DevCentral team managed the DevCentral servers and infrastructure, we had 13 iRules to handle maintenance pages, masking application error codes and data, inserting scripts for analytics, managing vanity links and other structural rewrites to name a few. With this strategy, priorities for your events are definitely your friend. Standardize on "{" placement at the end of a line and not the following line, this causes the least problems across all the BIG-IP versions. # ### THIS ### if { thing } { script } else { other_script } # ### NOT THIS ### if { thing } { script } else { other_script } 4-character indent carried into nested values as well, like in switch. # ### THIS ### switch -- ${thing} { "opt1" { command } default { command } } Comments (as image for this one to preserve line numbers) Always comment at the same indent-level as the code (lines 1, 4, 9-10) Avoid end-of-line comments (line 11) Always hash-space a comment (lines 1, 4, 9-10) Leave out the space when commenting out code (line 2) switch statements cannot have comments inline with options (line 6) Avoid multiple commands on a single line. # ### THIS ### set host [getfield [HTTP::host] 1] set port [getfield [HTTP::host] 2] # ### NOT THIS ### set host [getfield [HTTP::host] 1]; set port [getfield [HTTP::host] 2] Avoid single-line if statements, even for debug logs. # ### THIS ### if { ${debug} } { log local0. "a thing happened...." } # ### NOT THIS ### if { ${debug} } { log local0. "a thing happened..."} Even though Tcl allows a horrific number of ways to communicate truthiness, Always express or store state as 0 or 1 # ### THIS ### set f 0 set t 1 if { ${f} && ${t} } { ... } # ### NOT THIS ### # Valid false values set f_values "n no f fal fals false of off" # Valid true values set t_values "y ye yes t tr tru true on" # Set a single valid, but unpreferred, state set f [lindex ${f_values} [expr {int(rand()*[llength ${f_values}])}]] set t [lindex ${t_values} [expr {int(rand()*[llength ${t_values}])}]] if { ${f} && ${t} } { ... } Always use Tcl standard || and && boolean operators over the F5 special and and or operators in expressions, and use parentheses when you have multiple arguments to be explicitly clear on operations. # ### THIS ### if { ${state_active} && ${level_gold} } { if { (${state} == "IL") || (${state} == "MO") } { pool gold_pool } } # ### NOT THIS ### if { ${state_active} and ${level_gold} } { if { ${state} eq "IL" or ${state} eq "MO" } { pool gold_pool } } Always put a space between a closing curly bracket and an opening one. # ### THIS ### if { ${foo} } { log local0.info "something" } # ### NOT THIS ### if { ${foo} }{ log local0.info "something" } Always wrap expressions in curly brackets to avoid double expansion. (Check out a deep dive on the byte code between the two approaches shown in the picture below) # ### THIS ### set result [expr {3 * 4}] # ### NOT THIS ### set result [expr 3 * 4] Always use space separation around variables in expressions such as if statements or expr calls. Always wrap your variables in curly brackets when referencing them as well. # ### THIS ### if { ${host} } { # ### NOT THIS ### if { $host } { Terminate options on commands like switch and table with "--" to avoid argument injection if if you're 100% sure you don't need them. The VSCode iRules extension will throw diagnostics for this. SeeK15650046 for more details on the security exposure. # ### THIS ### switch -- [whereis [IP::client_addr] country] { "US" { table delete -subtable states -- ${state} } } # ### NOT THIS ### switch [whereis [IP::client_addr] country] { "US" { table delete -subtable states ${state} } } Always use a priority on an event, even if you're 100% sure you don't need them. The default is 500 so use that if you have no other starting point. Always put a timeout and/or lifetime on table contents. Make sure you really need the table space before settling on that solution, and consider abusing the static:: namespace instead. Avoid unexpected scope creep with static:: and table variables by assigning prefixes. Lacking a prefix means if multiple rules set or use the variable changing them becomes a race condition on load or rule update. when RULE_INIT priority 500 { # ### THIS ### set static::appname_confvar 1 # ### NOT THIS ### set static::confvar 1 } Avoid using static:: for things like debug configurations, it's a leading cause of unintentional log storms and performance hits. If you have to use them for a provable performance reason follow the prefix naming rule. # ### THIS ### when CLIENT_ACCEPTED priority 500 { set debug 1 } when HTTP_REQUEST priority 500 { if { ${debug} } { log local0.debug "some debug message" } } # ### NOT THIS ### when RULE_INIT priority 500 { set static::debug 1 } when HTTP_REQUEST priority 500 { if { ${static::debug} } { log local0.debug "some debug message" } } Comments are fine and encouraged, but don't leave commented-out code in the final version. Wrapping up that guidance with a final iRule putting it all into practice: when HTTP_REQUEST priority 500 { # block level comments with leading space #command commented out if { ${a} } { command } if { !${a} } { command } elseif { ${b} > 2 || ${c} < 3 } { command } else { command } switch -- ${b} { "thing1" - "thing2" { # thing1 and thing2 business reason } "thing3" { # something else } default { # default branch } } # make precedence explicit with parentheses set d [expr { (3 + ${c} ) / 4 }] foreach { f } ${e} { # always braces around the lists } foreach { g h i } { j k l m n o p q r } { # so the lists are easy to add to } for { set i 0 } { ${i} < 10 } { incr i } { # clarity of each parameter is good } } What standards do you follow for your iRules coding styles? Drop a comment below!6.6KViews23likes12CommentsHow to get a F5 BIG-IP VE Developer Lab License
(applies to BIG-IP TMOS Edition) To assist DevOps teams improve their development for the BIG-IP platform, F5 offers a low cost developer lab license.This license can be purchased from your authorized F5 vendor. If you do not have an F5 vendor, you can purchase a lab license online: CDW BIG-IP Virtual Edition Lab License CDW Canada BIG-IP Virtual Edition Lab License Once completed, the order is sent to F5 for fulfillment and your license will be delivered shortly after via e-mail. F5 is investigating ways to improve this process. To download the BIG-IP Virtual Edition, please log into downloads.f5.com (separate login from DevCentral), and navigate to your appropriate virtual edition, example: For VMware Fusion or Workstation or ESX/i:BIGIP-16.1.2-0.0.18.ALL-vmware.ova For Microsoft HyperV:BIGIP-16.1.2-0.0.18.ALL.vhd.zip KVM RHEL/CentoOS: BIGIP-16.1.2-0.0.18.ALL.qcow2.zip Note: There are also 1 Slot versions of the above images where a 2nd boot partition is not needed for in-place upgrades. These images include_1SLOT- to the image name instead of ALL. The below guides will help get you started with F5 BIG-IP Virtual Edition to develop for VMWare Fusion, AWS, Azure, VMware, or Microsoft Hyper-V. These guides follow standard practices for installing in production environments and performance recommendations change based on lower use/non-critical needs fo Dev/Lab environments. Similar to driving a tank, use your best judgement. DeployingF5 BIG-IP Virtual Edition on VMware Fusion Deploying F5 BIG-IP in Microsoft Azure for Developers Deploying F5 BIG-IP in AWS for Developers Deploying F5 BIG-IP in Windows Server Hyper-V for Developers Deploying F5 BIG-IP in VMware vCloud Director and ESX for Developers Note: F5 Support maintains authoritativeAzure, AWS, Hyper-V, and ESX/vCloud installation documentation. VMware Fusion is not an official F5-supported hypervisor so DevCentral publishes the Fusion guide with the help of our Field Systems Engineering teams.75KViews13likes143CommentsConverting a BIG-IP Maintenance Page iRule to Distributed Cloud using App Stack
If you are familiar with BIG-IP, you are probably also familiar with its flexible and robust iRule functionality. In fact, I would argue that iRules makes BIG-IP the swiss-army knife that it is. If there is ever a need for advanced traffic manipulation, you can usually come up with an iRule to solve the problem. F5 Distributed Cloud (XC) has its own suite of tools to help in this regard. If you need to do some sort of traffic manipulation/routing you can usually handle that with Service Policies or simply using Routes. Even with these features, however, there are going to be some cases where iRule functionality from the BIG-IP cannot be reproduced directly in XC. When this happens, we switch to using App Stack, which is XC’s version of a swiss army knife. In this article, I wanted to walk through an example of how you can leverage XC's App Stack for a specific iRule conversion use case: Displaying a Custom Maintenance Page when all pool members are down. For reference, here is the iRule: when LB_FAILED { if { [active_members [LB::server pool]] == 0 } { if { ([string tolower [HTTP::host]] contains "example.com")} { if { [HTTP::uri] ends_with "SystemMaintenance.jpg" } { HTTP::respond 200 content [ifile get "SystemMaintenance.jpg"] "Content-Type" "image/jpg" } else { HTTP::respond 200 content "<!DOCTYPE html> <html lang="en"> <head> <title>System Maintenance</title> <style type="text/css"> .base { font-family: 'Tahoma'; font-size: large; } </style> </head> <body> <br> <center><img alt="sad" height="200" src="SystemMaintenance.jpg" width="200" /></center><br> <center><span class="base">This application is currently under system maintenance.</span></center> <br> <center><span class="base">All services will be back online in a few mintues.</span> </body> </html>" } } } } When dissecting this iRule, you can see we have to solve for the following: Trigger the maintenance page when all pool members are down Serve local files (images, css, etc.) Display the static HTML page So, how do we do this? Well, App Stack allows us to deploy and host a container in Distributed Cloud. So we can easily create a simple container (using NGINX for bonus points!) that contains all these images, stylesheets, HTML files, etc. and manipulate our pools so that it uses this container when required! Let’s deep dive into the step-by-step process… Step by Step Walk-through: Container Creation First, we have to create our container. I'm not going to go too deep into how to create a container in this article, but I will highlight the main steps I took. To start, I simply extracted the HTML from the iRule above and saved all the required files (images, stylesheets, etc.) in one directory. Since I am adding NGINX to the container, I must also create and include a nginx.conf file in this directory. Below was my configuration: worker_processes 1; error_log /var/log/nginx/error.log warn; pid /tmp/nginx.pid; events { worker_connections 1024; } http { client_body_temp_path /tmp/client_temp; proxy_temp_path /tmp/proxy_temp_path; fastcgi_temp_path /tmp/fastcgi_temp; uwsgi_temp_path /tmp/uwsgi_temp; scgi_temp_path /tmp/scgi_temp; include /etc/nginx/mime.types; server { listen 8080; location / { root /usr/share/nginx/html/; index index.html; } location ~* \.(js|jpg|png|css)$ { root /usr/share/nginx/html/; } } sendfile on; keepalive_timeout 65; } There really isn’t much to the NGINX configuration for this example, but keep in mind that you can expand on this and make it much more robust for other use cases. (One note about the configuration above is that you will see /tmp paths mentioned. These are required since our container will run as a non-root user. For more information, see the NGINX documentation here: https://hub.docker.com/_/nginx) Finally, I included a Dockerfile with my requirements for NGINX and exposing port 8080. Once that was all set, I built my container and pushed it Docker Hub as a private repository. App Stack Deployment Now that we have the container created and uploaded to Docker Hub, we are ready to bring it to XC. Start by opening up the F5 XC Console and navigate to the Distributed Apps tile. Navigate to Applications -> Container Registries, then click Add Container Registry. Here we just have to add a name for the Container Registry, our Docker Hub Username, “docker.io” for the Server FQDN, and then blindfold our password for Docker Hub. After saving, we are now ready to configure our workload To do so, we have to navigate over to Applications -> Virtual K8s. I already had a Virtual Site and Virtual K8s created, but you'll need to create those if you don't already have them. For your reference, here are some links to a walk-through on each of these: Virtual Site Creation:https://docs.cloud.f5.com/docs/how-to/fleets-vsites/create-virtual-site Virtual K8s Creation:https://docs.cloud.f5.com/docs/how-to/app-management/create-vk8s-obj Select your Virtual K8s cluster: After selecting your cluster, navigate to theWorkloads tab.Under Workloads, click on Add VK8s Workload. Give your workload a name and then change the Type of Workload toService instead of Simple Service. Your configuration should look something like below: You'll notice we now have to configure the Service. ClickConfigure. The first step is to tell XC which container we want to deploy for this service. Under Containers, select Add Item: Give the container a name, and then input your Image Name. The format for the image name is "registry/image:tagname". If you leave the tag name blank, it defaults to “latest”. Under the Select Container Registry drop down, selectPrivate Registry. This will bring up another drop-down where we will select the container registry we created earlier. Your configuration should end up looking similar to below: For this simple use case, we can skip the Configuration Parameters and move to our Deploy Options. Here, we have some flexibility on where we want to deploy our workload. You can choose All Regional Edges (F5 PoPs), specific REs, or even custom CEs and Virtual Sites. In my basic example, I chose Regional Edge Sites and picked the ny8-nyc RE for now: Next, we have to configure where we want to advertise this workload. We have the option to keep it internal and only advertise in the vK8s Cluster or we could advertise this workload directly on the Internet. Since we only want this maintenance page to be seen when the pool members are all down, we are going to keep this to Advertise In Cluster. After selecting the advertisement, we have to configure our Port Information. Click Configure. Under the advertisement configuration, you’ll see we are simply choosing our ports. If you toggle “Show Advanced fields” you can see we have some flexibility on the port we want to advertise and the actual target port for the container. In my case, I am going to use 8080 for both, but you may want to have a different combination (i.e. 80:8080). Click Apply once finished. Now that we have the ports defined, we can simply hit Apply on the Service configuration and Save and Exit the workload to kick off the deployment. We should now see our new maintenance-page workload in the list. You’ll notice that after refreshing a couple times, the Running/Completed Pods and Total Pods fields will be populated with the number of REs/CEs you chose to deploy the workload to. After a few minutes, you should have a matching number of Running/Completed Pods to your Total Pods. This gives us an indication that the workload is ready to be used for our application. (Note: you can click on the pod numbers in this list to see a more detailed status of the pods. This helps when troubleshooting) Pool Creation With our workload live and advertised in the cluster, it is time to create our pool. In the top left of the platform we’ll need to Select Service and change to Mulitcloud App Connect: Under Mulit-Cloud App Connect, navigate to Manage -> Load Balancers -> Origin Pools and SelectAdd Origin Pool. Here, we’ll give our origin pool a name and then go directly to Origin Servers. Under Origin Servers, clickAdd Item. Change the Type of the Origin Server to be K8s Service Name of Origin Server on given Sites. Under Service Name, we have to use the format "servicename.namespace:cluster-id" to point to our workload. In my case, it was "maintenance-page.bohanson:bohanson-test" since I had the following: Service Name: maintenance-page Namespace: bo-hanson VK8s Cluster: bohanson-test Under Site or Virtual Site, I chose the Virtual Site I already had created. The last step is to change the network to vK8s Networks on Site and Click Apply. The result should look like the below: We now need to change our Origin Server port to be the port we defined in the workload advertisement configuration. In my case, I chose port 8080. The rest of the configuration of the origin server is up to you, but I chose to include a simple http health check to monitor the service. Once the configuration finished, click Save and Exit. The final pool configuration should look like this: Application Deployment: With our maintenance container up and running and our pool all set, it is time to finally deploy our solution. In this case, we can select any existing Load Balancer configuration where we want to add the maintenance page. You could also create a new Load Balancer from scratch, of course, but for this example I am deploying to an existing configuration. Under Manage -> Load Balancers, find the load balancer of your choosing and then select Manage Configuration. Once in the Load Balancer view, select Edit Configuration in the top right. To deploy the solution, we just need to navigate to our Origins section and add our new maintenance pool. SelectAdd Item. At this point, you may be thinking, “Well that is great, but how am I going to get the pool to only show when all other pool members are down?” That is the beauty of the F5 Distributed Cloud pool configuration. We have two options that we can set when adding a pool: Weight and Priority. Both of those options are pretty self-explanatory if you have used a load balancer before, but what is interesting here is when you give these options a value of zero. Giving a pool a weight of zero would disable the pool. For a maintenance pool use case, that could be helpful since we can manually go into the Load Balancer configuration during a maintenance window, disable the main pool, and then bring up the maintenance pool until our change window is closed when we could then reverse the weights and bring the main pool back online. That ALMOST solves our iRule use case, but it would be manual. Alternatively, we can give a pool a Priority of zero. Doing so would mean that all other pools take priority and will be used unless they go down. In the event of the main pool going down, it would default to the lowest priority pool (zero). Now that is more like it! This means we can set our maintenance pool to a Priority of zero and it will automatically be used when the health of all our other pool members go down – which completely fulfills the original iRule requirement. So in our configuration, let's add our new maintenance pool and set: Weight: 1 Priority: 0 After clicking save, the final pool configuration should look something like this: Testing To test, we can simply switch our health check on the main pool to something that would fail. In my case, I just changed the expected status code on the health check to something arbitrary that I knew would fail, but this could be different in your case. After changing the health check, we can navigate to our application in a browser, and see our maintenance page dynamically appear! Changing the health check on the main pool back to a working one should dynamically turn off the maintenance page as well: Summary This is just one example of how you can use App Stack to convert some more advanced/dynamic iRules over to F5 Distributed Cloud. I only used a basic NGINX configuration in this example, but you can start to see how leveraging NGINX in App Stack can give us even more flexibility. Hopefully this helps!132Views2likes0CommentsScalable AI Deployment: Harnessing OpenVINO and NGINX Plus for Efficient Inference
Introduction In the realm of artificial intelligence (AI) and machine learning (ML), the need for scalable and efficient AI inference solutions is paramount. As organizations deploy increasingly complex AI models to solve real-world problems, ensuring that these models can handle high volumes of inference requests becomes critical. NGINX Plus serves as a powerful ally in managing incoming traffic efficiently. As a high-performance web server and reverse proxy server, NGINX Plus is adept at load balancing and routing incoming HTTP and TCP traffic across multiple instances of AI model serving environments. The OpenVINO Model Server, powered by Intel's OpenVINO toolkit, is a versatile inference server supporting various deep learning frameworks and hardware acceleration technologies. It allows developers to deploy and serve AI models efficiently, optimizing performance and resource utilization. When combined with NGINX Plus capabilities, developers can create resilient and scalable AI inference solutions capable of handling high loads and ensuring high availability. Health checks allow NGINX Plus to continuously monitor the health of the upstream OVMS instances. If an OVMS instance becomes unhealthy or unresponsive, NGINX Plus can automatically route traffic away from it, ensuring that inference requests are processed only by healthy OVMS instances. Health checks provide real-time insights into the health status of OVMS instances. Administrators can monitor key metrics such as response time, error rate, and availability, allowing them to identify and address issues proactively before they impact service performance. In this article, we'll delve into the symbiotic relationship between the OpenVINO Model Server, and NGINX Plus to construct a robust and scalable AI inference solution. We'll explore setting up the environment, configuring the model server, harnessing NGINX Plus for load balancing, and conducting testing. By the end, readers will gain insights into how to leverage Docker, the OpenVINO Model Server, and NGINX Plus to build scalable AI inference systems tailored to their specific needs. Flow explanation: Now, let's walk through the flow of a typical inference request. When a user submits an image of a zebra for inference, the request first hits the NGINX load balancer. The load balancer then forwards the request to one of the available OpenVINO Model Server containers, distributing the workload evenly across multiple containers. The selected container processes the image using the optimized deep-learning model and returns the inference results to the user. In this case, the object is named zebra. OpenVINO™ Model Server is a scalable, high-performance solution for serving machine learning models optimized for Intel® architectures. The server provides an inference service via gRPC, REST API, or C API -- making it easy to deploy new algorithms and AI experiments. You can visit https://hub.docker.com/u/openvino for reference. Setting up: We'll begin by deploying model servers within containers. For this use case, I'm deploying the model server on a virtual machine (VM). Let's outline the steps to accomplish this: Get the docker image for OpenVINO ONNX run time docker pull openvino/onnxruntime_ep_ubuntu20 You can also visit https://docs.openvino.ai/nightly/ovms_docs_deploying_server.html for OpenVINO model server deployment in a container environment. Begin by creating a docker-compose file following the structure below: https://raw.githubusercontent.com/f5businessdevelopment/F5openVino/main/docker-compose.yml version: '3' services: resnet1: image: openvino/model_server:latest command: > --model_name=resnet --model_path=/models/resnet50 --layout=NHWC:NCHW --port=9001 volumes: - ./models:/models ports: - "9001:9001" resnet2: image: openvino/model_server:latest command: > --model_name=resnet --model_path=/models/resnet50 --layout=NHWC:NCHW --port=9002 volumes: - ./models:/models ports: - "9002:9002" # Add more services for additional containers resnet3: image: openvino/model_server:latest command: > --model_name=resnet --model_path=/models/resnet50 --layout=NHWC:NCHW --port=9003 volumes: - ./models:/models ports: - "9003:9003" resnet4: image: openvino/model_server:latest command: > --model_name=resnet --model_path=/models/resnet50 --layout=NHWC:NCHW --port=9004 volumes: - ./models:/models ports: - "9004:9004" resnet5: image: openvino/model_server:latest command: > --model_name=resnet --model_path=/models/resnet50 --layout=NHWC:NCHW --port=9005 volumes: - ./models:/models ports: - "9005:9005" resnet6: image: openvino/model_server:latest command: > --model_name=resnet --model_path=/models/resnet50 --layout=NHWC:NCHW --port=9006 volumes: - ./models:/models ports: - "9006:9006" resnet7: image: openvino/model_server:latest command: > --model_name=resnet --model_path=/models/resnet50 --layout=NHWC:NCHW --port=9007 volumes: - ./models:/models ports: - "9007:9007" resnet8: image: openvino/model_server:latest command: > --model_name=resnet --model_path=/models/resnet50 --layout=NHWC:NCHW --port=9008 volumes: - ./models:/models ports: - "9008:9008" Make sure you have Docker and Docker Compose installed on your system. Place your model files in the `./models/resnet50` directory on your local machine. Save the provided Docker Compose configuration to a file named `docker-compose.yml`. Run the following command in the directory containing the `docker-compose.yml` file to start the services: docker-compose up -d You can now access the OpenVINO Model Server instances using the specified ports (e.g., `http://localhost:9001` for `resnet1` and `http://localhost:9002` for `resnet2`). - Ensure that the model files are correctly placed in the `./models/resnet50` directory before starting the services. Set up an NGINX Plus proxy server. You can refer to https://docs.nginx.com/nginx/admin-guide/installing-nginx/installing-nginx-plus/ for NGINX Plus installation also You have the option to configure VMs with NGINX Plus on AWS by either: Utilizing the link provided below, which guides you through setting up NGINX Plus on AWS via the AWS Marketplace: NGINX Plus on AWS Marketplace or Following the instructions available on GitHub at the provided repository link. This repository facilitates spinning up VMs using Terraform on AWS and deploying VMs with NGINX Plus under the GitHub repository - F5 OpenVINO The NGINX Plus proxy server functions as a proxy for upstream model servers. Within the upstream block, backend servers (model_servers) are defined along with their respective IP addresses and ports. In the server block, NGINX listens on port 80 to handle incoming HTTP/2 requests targeting the specified server name or IP address. Requests directed to the root location (/) are then forwarded to the upstream model servers utilizing the gRPC protocol. The proxy_set_header directives are employed to maintain client information integrity while passing requests to the backend servers. Ensure to adjust the IP addresses, ports, and server names according to your specific setup. Here is an example configuration that is also available at GitHubhttps://github.com/f5businessdevelopment/F5openVino upstream model_servers { server 172.17.0.1:9001; server 172.17.0.1:9002; server 172.17.0.1:9003; server 172.17.0.1:9004; server 172.17.0.1:9005; server 172.17.0.1:9006; server 172.17.0.1:9007; server 172.17.0.1:9008; zone model_servers 64k; } server { listen 80 http2; server_name 10.0.0.19; # Replace with your domain or public IP location / { grpc_pass grpc://model_servers; health_check type=grpc grpc_status=12; # 12=unimplemented proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; } } If you are using gRPC with SSL please refer to the detailed configuration at NGINX Plus SSL Configuration Here is the explanation: upstream model_servers { server 172.17.0.1:9001; # Docker bridge network IP and port for your container server 172.17.0.1:9002; # Docker bridge network IP and port for your container .... .... } This section defines an upstream block named model_servers, which represents a group of backend servers. In this case, there are two backend servers defined, each with its IP address and port. These servers are typically the endpoints that NGINX will proxy requests to. server { listen 80 http2; server_name 10.1.1.7; # Replace with your domain or public IP This part starts with the main server block. It specifies that NGINX should listen for incoming connections on port 80 using the HTTP/2 protocol (http2), and it binds the server to the IP address 10.1.1.7. Replace this IP address with your domain name or public IP address. location / { grpc_pass grpc://model_servers; health_check type=grpc grpc_status=12; # 12=unimplemented proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; } Within the location/block, NGINX defines how to handle requests to the root location. In this case, it's using gRPC (grpc_pass grpc://model_servers;) to pass the requests to the upstream servers defined in the model_servers block. The proxy_set_header directives are used to set headers that preserve client information when passing requests to the backend servers. These headers include Host, X-Real-IP, and X-Forwarded-For. Health checks with type=grpc enable granular monitoring of individual gRPC services and endpoints. You can verify the health of specific gRPC methods or functionalities, ensuring each service component is functioning correctly. In summary, this NGINX configuration sets up a reverse proxy server that listens for HTTP/2 requests on port 80 and forwards them to backend servers (model_servers) using the gRPC protocol. It's commonly used for load balancing or routing requests to multiple backend servers. Inference Testing: This is how you can conduct testing. On the client side, we utilize a script named predict.py. Below is the script for reference # Import necessary libraries import numpy as np from classes import imagenet_classes from ovmsclient import make_grpc_client # Create a gRPC client to communicate with the server # Replace "10.1.1.7:80" with the IP address and port of your server client = make_grpc_client("10.1.1.7:80") # Open the image file "zebra.jpeg" in binary read mode with open("zebra.jpeg", "rb") as f: img = f.read() # Send the image data to the server for prediction using the "resnet" model output = client.predict({"0": img}, "resnet") # Extract the index of the predicted class with the highest probability result_index = np.argmax(output[0]) # Print the predicted class label using the imagenet_classes dictionary print(imagenet_classes[result_index]) This script imports necessary libraries, establishes a connection to the server at the specified IP address and port, reads an image file named "zebra.jpeg," sends the image data to the server for prediction using the "resnet" model, retrieves the predicted class index with the highest probability, and prints the corresponding class label. Results: Execute the following command from the client machine. Here, we are transmitting this image of Zebra to the model server. python3 predict.py zebra.jpg #run the Inference traffic zebra. The prediction output is 'zebra'. Let's now examine the NGINX Plus logs cat /var/log/nginx/access.log 10.1.1.7 - - [13/Apr/2024:00:18:52 +0000] "POST /tensorflow.serving.PredictionService/Predict HTTP/2.0" 200 4033 "-" "grpc-python/1.62.1 grpc-c/39.0.0 (linux; chttp2)" This log entry shows that a POST request was made to the NGINX server at the specified timestamp, and the server responded with a success status code (200). The request was made using gRPC, as indicated by the user agent string. Conclusion: Using NGINX Plus, organizations can achieve a scalable and efficient AI inference solution. NGINX Plus can address disruptions caused by connection timeouts/errors, sudden spikes in request rates, or changes in network topology. OpenVINO Model Server optimizes model performance and inference speed, utilizing Intel hardware acceleration for enhanced efficiency. NGINX Plus acts as a high-performance load balancer, distributing incoming requests across multiple model server instances for improved scalability and reliability. Together, this enables seamless scaling of AI inference workloads, ensuring optimal performance and resource utilization. You can look at this video for reference: https://youtu.be/Sd99woO9FmQ References: https://hub.docker.com/u/openvino https://docs.nginx.com/nginx/deployment-guides/amazon-web-services/high-availability-keepalived/ https://www.nginx.com/blog/nginx-1-13-10-grpc/ https://github.com/f5businessdevelopment/F5openVino.git https://docs.openvino.ai/nightly/ovms_docs_deploying_server.html414Views0likes0CommentsGetting Started with iRules LX, Part 4: NPM & Best Practices
So far in this series we've covered basic nomenclature and concepts, and in the last article actually dug into the code that makes it all work. At this point, I'm sure the wheels of possibilities are turning in your minds, cooking up all the nefarious interesting ways to extend your iRules repertoire. The great thing about iRules LX, as we'll discuss in the onset of this article, is that a lot of the heavy lifting has probably already been done for you. The Node.js package manager, or NPM, is a living, breathing, repository of 280,000+ modules you won't have to write yourself should you need them! Sooner or later you will find a deep desire or maybe even a need to install packages from NPM to help fulfill a use case. Installing packages with NPM NPM on BIG-IP works much of the same way you use it on a server. We recommend that you not install modules globally because when you export the workspace to another BIG-IP, a module installed globally won't be included in the workspace package. To install an NPM module, you will need to access the Bash shell of your BIG-IP. First, change directory to the extension directory that you need to install a module in. Note: F5 Development DOES NOT host any packages or provide any support for NPM packages in any way, nor do they provide security verification, code reviews, functionality checks, or installation guarantees for specific packages. They provide ONLY core Node.JS, which currently, is confined only to versions 0.12.15 and 6.9.1. The extension directory will be at /var/ilx/workspaces/<partition_name>/<workspace_name>/extensions/<extension_name>/ . Once there you can run NPM commands to install the modules as shown by this example (with a few ls commands to help make it more clear) - [root@test-ve:Active:Standalone] config # cd /var/ilx/workspaces/Common/DevCentralRocks/extensions/dc_extension/ [root@test-ve:Active:Standalone] dc_extension # ls index.js node_modules package.json [root@test-ve:Active:Standalone] dc_extension # npm install validator --save validator@5.3.0 node_modules/validator [root@test-ve:Active:Standalone] dc_extension # ls node_modules/ f5-nodejs validator The one caveat to installing NPM modules on the BIG-IP is that you can not install native modules. These are modules written in C++ and need to be complied. For obvious security reasons, TMOS does not have a complier. Best Practices Node Processes It would be great if you could spin up an unlimited amount of Node.js processes, but in reality there is a limit to what we want to run on the control plane of our BIG-IP. We recommend that you run no more than 50 active Node processes on your BIG-IP at one time (per appliance or per blade). Therefore you should size the usage of Node.js accordingly. In the settings for an extension of a LX plugin, you will notice there is one called concurrency - There are 2 possible concurrency settings that we will go over. Dedicated Mode This is the default mode for all extensions running in a LX Plugin. In this mode there is one Node.js process per TMM per extension in the plugin. Each process will be "dedicated" to a TMM. To know how many TMMs your BIG-IP has, you can run the following TMSH command - root@(test-ve)(cfg-sync Standalone)(Active)(/Common)(tmos) # show sys tmm-info | grep Sys::TMM Sys::TMM: 0.0 Sys::TMM: 0.1 This shows us we have 2 TMMs. As an example, if this BIG-IP had a LX plugin with 3 extensions, I would have a total of 6 Node.js processes. This mode is best for any type of CPU intensive operations, such as heavy parsing data or doing some type of lookup on every request, an application with massive traffic, etc. Single Mode In this mode, there is one Node.js process per extension in the plugin and all TMMs share this "single" process. For example, one LX plugin with 3 extensions will be 3 Node.js processes. This mode is ideal for light weight processes where you might have a low traffic application, only do a data lookup on the first connection and cache the result, etc. Node.js Process Information The best way to find out information about the Node.js processes on your BIG-IP is with the TMSH command show ilx plugin . Using this command you should be able to choose the best mode for your extension based upon the resource usage. Here is an example of the output - root@(test-ve)(cfg-sync Standalone)(Active)(/Common)(tmos) # show ilx plugin DC_Plugin --------------------------------- ILX::Plugin: DC_Plugin --------------------------------- State enabled Log Publisher local-db-publisher ------------------------------- | Extension: dc_extension ------------------------------- | Status running | CPU Utilization (%) 0 | Memory (bytes) | Total Virtual Size 1.1G | Resident Set Size 7.7K | Connections | Active 0 | Total 0 | RPC Info | Total 0 | Notifies 0 | Timeouts 0 | Errors 0 | Octets In 0 | Octets Out 0 | Average Latency 0 | Max Latency 0 | Restarts 0 | Failures 0 --------------------------------- | Extension Process: dc_extension --------------------------------- | Status running | PID 16139 | TMM 0 | CPU Utilization (%) 0 | Debug Port 1025 | Memory (bytes) | Total Virtual Size 607.1M | Resident Set Size 3.8K | Connections | Active 0 | Total 0 | RPC Info | Total 0 | Notifies 0 | Timeouts 0 | Errors 0 | Octets In 0 | Octets Out 0 | Average Latency 0 | Max Latency 0 From this you can get quite a bit of information, including which TMM the process is assigned to, PID, CPU, memory and connection stats. If you wanted to know the total number of Node.js processes, that same command will show you every process and it could get quite long. You can use this quick one-liner from the bash shell (not TMSH) to count the Node.js processes - [root@test-ve:Active:Standalone] config # tmsh show ilx plugin | grep PID | wc -l 16 File System Read/Writes Since Node.js on BIG-IP is pretty much stock Node, file system read/writes are possible but not recommended. If you would like to know more about this and other properties of Node.js on BIG-IP, please see AskF5 Solution ArticleSOL16221101. Note:NPMs with symlinks will no longer work in 14.1.0+ due to SELinux changes In the next article in this series we will cover troubleshooting and debugging.2.5KViews1like4Comments