Run Advanced Containers like OPSWAT Content Scrubbing Directly On F5 Distributed Cloud Nodes

A traditionalist’s view of applications and networking is of two discrete solutions: servers run apps, while devices such as proxies and server load balancers feed those servers with safe, low-latency data flows. What if you had the opportunity to converge the two? To run apps directly upon the same network appliances already invested in and running at five-nines or greater availability? How important to green initiatives is an overall reduction in appliance count, rack space and power consumption?

With the advent of containerized applications, this mixed use of existing platforms is now not only possible but quite easy for F5 Distributed Cloud (XC) customers, even those for whom Kubernetes projects have remained a distant target, never quite embraced to date. With the F5 Distributed Cloud virtual Kubernetes (vK8s) offering, most containers can be up and operational within minutes, with a simple GUI click-through approach for those looking to get effective and agile quickly.

Two of the many potential use cases for vK8s include:

  • Host the public-facing web component of critical applications on F5 Distributed Cloud’s global regional edge (RE) network; in other words, have the initial customer interaction with web apps take place in a global region actually near the user. Getting closer to your users reduces web-related TCP round-trip latency, opening TCP congestion windows briskly and making your front-end solution hyper-responsive in look and feel. Nearby users gravitate to nearby REs through anycast support within BGP-4, the key protocol that steers Internet traffic.
  • Run internal apps, for internal groups such as SecOps teams, on F5 customer edge (CE) nodes within data centers, both physical and in the cloud. This can also be attractive for edge locations, where compute resources are often scarce. Not only will this cut down on server sprawl, but the containerized applications will benefit from the same broad set of F5 security mechanisms afforded to servers. F5 is an industry leader in the blossoming web application and API protection (WAAP) market, which wraps services in A-to-Z security best practices.

This article introduces an example of the second bullet point; in this case we step through deploying a leading security application, OPSWAT MetaDefender Core, a tool a SecOps team can harness to detect advanced malware in submitted files, along with cutting-edge capabilities such as Content Disarm and Reconstruction (CDR). With the OPSWAT container running on CE nodes as a vK8s service, IT security groups can now scrub files; for instance, documents intended for C-suite executives or legal teams, which are often points of interest in targeted malware campaigns.

Scrubbing is a widely thrown-around metaphor, but an accurate one in this case, as any active or ancillary content can be removed by policy while still leaving the files usable at their destination. This includes removing macros, pulling out risky JavaScript, and suppressing hyperlinks a reader might inadvertently click. All without spreading a fleet of servers to support this advanced service; simply harness the F5 platform for compute, not exclusively for advanced network solutions.

How to Run a Container like OPSWAT MetaDefender Core on F5 Distributed Cloud

The first step to run containers on a CE node, and specifically the “mesh” node offering traditionally directed towards tasks like load balancing and multi-cloud networking, is to quickly define a “virtual site”: a set of one or more Distributed Cloud appliances (virtual or physical) that will help run the application. Simply log into the XC SaaS global console, choose the “Distributed Apps” module and select a namespace to work in. Clicking the “Virtual Sites” side menu, a user can click the Add Virtual Site link, name a virtual site and identify the fleet of CE nodes it should contain.
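For the curious, the GUI flow above simply builds a declarative object in the XC API. The sketch below is illustrative only; the field names approximate the XC virtual site schema, and the site names (ce-site-a, ce-site-b) and namespace are hypothetical stand-ins for your own fleet.

    # Illustrative sketch of a virtual site object; field names approximate
    # the XC API, and all names here are hypothetical.
    metadata:
      name: opswat-virtual-site
      namespace: my-namespace
    spec:
      site_type: CUSTOMER_EDGE            # CE nodes, not the RE network
      site_selector:
        expressions:
          - "ves.io/siteName in (ce-site-a, ce-site-b)"   # the fleet of CE nodes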

The second step is to create a Virtual Kubernetes (vK8s) offering by clicking the “Virtual K8s” side menu; the user simply names the service and ties it to the virtual site from the previous step. Easy enough. One vK8s offering is supported per namespace, and many containerized applications may be run per vK8s instance.

For advanced users, after the service is created, one can download a corresponding Kubeconfig file from the Actions menu (three dots to the right) to allow command-line control through Kubectl, including adding and starting containers. Only partially familiar with Kubeconfig, Kubectl, Kube-anything? No worries, that is exactly the point of this article. It is not a necessary prerequisite skill set for getting your favorite container up and running, as we will follow a purely GUI-driven, ClickOps approach to get familiarized and quickly productive.
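For those who do eventually take the Kubeconfig route, interaction is standard Kubernetes; a couple of illustrative commands follow (the Kubeconfig and manifest file names are assumptions):

    # Once the Kubeconfig is downloaded from the Actions menu, ordinary kubectl
    # commands work against the vK8s endpoint (file names are assumptions):
    #
    #   kubectl --kubeconfig ves-kubeconfig.yaml get pods
    #   kubectl --kubeconfig ves-kubeconfig.yaml apply -f opswat-workload.yaml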

We are now able to run our applications on the CEs, turning a once networking-specific solution into one that converges networking with compute. To get started and see a dashboard view where we can intuitively add our apps graphically, simply click on the vK8s entry just created.

Above is a typical Virtual K8s dashboard; in this case we see two sites make up the virtual site, with one online and one offline (healthy and unhealthy). The application to be deployed, in keeping with the tenets of Kubernetes, will run on any active member (site) of the virtual site defined and will make use of any members brought online, automatically of course.

Set Up the OPSWAT MetaDefender Core Container to Run on CE Nodes

As a starting point, from the dashboard, click on the “Workloads” menu to begin adding applications (containers) to be run on the members of the virtual site. Click on the “Add vK8s Workload” button.

In the above example, we will add a workload called “opswatvk8sworkload” of the type “Service”. Other types of workloads include:

  • those designed to run on the international RE network and bring applications closer to global audiences; selective geographies may be leveraged, or a truly global rollout pursued.
  • batch-style workloads, for processing tasks that are expected to run to completion and then end.

By clicking on “Configure” we move to enabling our workload, a service in the parlance of Kubernetes, to run on a set of our selected customer edge nodes. When we add a containerized application, we fundamentally need to follow just three steps to provide the container’s details (a manifest sketch equivalent to these steps follows the figures below):

  1. The container name and the registry where the code is stored.
  2. Environment variables to use when the container is launched, such as a license string the application might require to fully execute.
  3. (Optional) A volume to read/write to on the node, outside of the container, required only if the application needs existing, historical data to be available when the container restarts.

The following is an example of the OPSWAT container’s name and registry fields as entered (Step 1).

The startup parameters, in the form of environment variables, are entered below in Step 2.

Finally, in Step 3, if we wish to use a permanent volume outside of the container (since storage inside the container is ephemeral and resets in the event of a container restart), we do so as shown below. We could also set up storage off the node, using standard NAS volume mounts, if desired.
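For readers who prefer to see the three steps in one place, the following is a minimal, hedged sketch of what an equivalent stock-Kubernetes Deployment might look like if applied through Kubectl instead of the GUI. The image tag, environment variable name, mount path and the ves.io/virtual-sites annotation value are assumptions for illustration; consult OPSWAT’s and F5’s documentation for the authoritative values.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: opswatvk8sworkload
      annotations:
        # vK8s convention to pin a workload to a virtual site; value assumed
        ves.io/virtual-sites: my-namespace/opswat-virtual-site
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: opswat
      template:
        metadata:
          labels:
            app: opswat
        spec:
          containers:
          - name: metadefender-core
            image: opswat/metadefendercore-debian:latest  # Step 1: name and registry
            ports:
            - containerPort: 8008                         # OPSWAT's default port
            env:
            - name: LICENSE_KEY                           # Step 2: illustrative name only
              value: "<your-license-string>"
            volumeMounts:
            - name: md-data                               # Step 3: persistent data
              mountPath: /var/lib/metadefender            # path is an assumption
          volumes:
          - name: md-data
            persistentVolumeClaim:
              claimName: md-data-claim                    # closest stock analogue to
                                                          # the GUI's volume option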

We will now select where the container should run. Although containers can run on RE nodes, in our use case of empowering SecOps with a critical internal security tool, we will run containers on select CEs and access the application through a load balancer and the full web security stack F5 is well known for.

Run the OPSWAT Application While Also Providing Rigorous, Secure Access

To provide access to the OPSWAT application, or any other container you might run on Distributed Cloud nodes, we need to add two more pieces of information. Continuing down the main Workloads screen, we need to (a) choose the nodes to utilize for the application and (b) describe the TCP ports the application is expecting. The latter TCP port will be hidden from end users, who will securely reach the application through a Distributed Cloud HTTP(S) load balancer, typically on port 443.

Double-clicking into the configuration of the “Advertise Options”, we see that the load balancer we will shortly build will map HTTPS transactions arriving on the external side to a Kubernetes service which runs our OPSWAT container. Also, the internal traffic will use OPSWAT’s default port, TCP 8008.
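In stock Kubernetes terms, the advertise options shown above correspond roughly to a Service in front of the OPSWAT pods; a minimal sketch follows, assuming the pod label from the earlier Deployment sketch.

    apiVersion: v1
    kind: Service
    metadata:
      name: opswatvk8sworkload
    spec:
      selector:
        app: opswat          # matches the pod label from the Deployment sketch
      ports:
      - name: http
        port: 8008           # OPSWAT's default port, per the advertise options
        targetPort: 8008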

Ensure the OPSWAT Container is Running and Access It

At this point, our application will be up and running. The container becomes a reality within a Kubernetes pod, and the dashboard displays the status of the pod. With multiple CE nodes included in a virtual site, one can easily ask for multiple instances of pods running; we will touch upon desired replication counts a little later.

Access to the application is provided with a standard F5 Distributed Cloud HTTPS load balancer. The steps to set up a similar load balancer, using the AppConnect module, are best shown by this demonstration, which uses simulator.f5 to present load balancer setup as an interactive learning exercise.

In our setup, the HTTPS load balancer has been created, with the following key points highlighted in the diagram. First, we have used a delegated DNS name to direct any traffic for the service through the nearest RE nodes; anycast will ensure the closest RE is used. Although we are running containers on CE nodes, we harness the RE network to steer traffic towards the service and to implement security, such as an advanced WAF or anti-bot solution, immediately upon ingress. The load balancer type is HTTPS, with an automatic certificate provided; this ensures there are no browser pop-ups complaining about trust issues. The port exposed to users is the standard HTTPS port, 443.

The final highlighted portion of the screenshot is the origin pool definition: what the load balancer will map arriving traffic to. Often, an origin pool might map to servers, at lists of public or private IP addresses, or perhaps to a set of fully qualified domain names (FQDNs). In this case, though, the origin pool will map to our containerized OPSWAT application, which is exposed to the load balancer as a Kubernetes service.
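A hedged sketch of such an origin pool object follows; the field names approximate the XC API, and the <service>.<namespace> naming convention and vK8s network reference are assumptions based on how vK8s services are typically addressed.

    # Illustrative sketch of an origin pool mapping to a vK8s service;
    # field names approximate the XC API and all names are from this walkthrough.
    metadata:
      name: opswat-origin-pool
      namespace: my-namespace
    spec:
      port: 8008
      origin_servers:
        - k8s_service:
            service_name: opswatvk8sworkload.my-namespace   # <service>.<namespace>
            site_locator:
              virtual_site:
                name: opswat-virtual-site
            vk8s_networks: {}     # reach the service over the vK8s network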

Result – A Scalable and Secure Application Platform

With the container running on the CE node as a Kubernetes pod, all achieved through simple GUI menus with no Kubectl or other deep Kubernetes knowledge required, the organization can focus upon using the applications deployed. In this demonstration, OPSWAT MetaDefender Core is being run, a one-stop shop for various security inquiries and file sanitization. OPSWAT has numerous licensable technologies available within Core, two of which are MetaScan and DeepCDR.

With MetaScan running as a container on the F5 Distributed Cloud CE, the principal value is scanning files for known malware. The solution orchestrates as many as five different leading AV engines in the analysis process; think of it as a malware analysis aggregator for rich defense in depth. For instance, the figure shows a complete analysis of the Windows 64-bit .exe installer for Microsoft’s Visual Studio Code (VSCode) source-code editor, to ensure it has not been tampered with.

Engines like the five above inspect files using file-centric signatures that look for malicious content within the internal components of the files, such as the nebulously named file pointers and sections. The solution is impressive in its layers of independent, parallel inspection, but what about going deeper? For instance, what about files that are obfuscated, perhaps with an extension suggesting one file type when deeper analysis shows another true file type actually exists? Perhaps a document is fine, and will not trigger known AV signatures, but has rogue, malicious macros or dangerous JavaScript that can be unwittingly executed by a recipient’s host?
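To make the scanning workflow concrete, here is a hedged sketch of a one-off, batch-style vK8s Job that submits a file to the MetaDefender Core service from inside the cluster, assuming the service name and port from the earlier steps and MetaDefender Core’s documented POST /file REST endpoint; the file path, header values and rule name are illustrative.

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: scan-vscode-installer
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: submit
            image: curlimages/curl:8.5.0
            command: ["/bin/sh", "-c"]
            # Submit the installer to MetaDefender Core's REST API (POST /file);
            # the 'filename' and 'rule' headers and file path are illustrative.
            args:
            - >
              curl -s
              -H "filename: VSCodeSetup-x64.exe"
              -H "rule: Default"
              --data-binary @/work/VSCodeSetup-x64.exe
              http://opswatvk8sworkload:8008/file
            volumeMounts:
            - name: work
              mountPath: /work
          volumes:
          - name: work
            emptyDir: {}    # in practice, mount the file to be scanned here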

Those deeper questions are where a second OPSWAT feature, DeepCDR, comes into play.

Content Disarm and Reconstruction (CDR) Running on the Distributed Cloud CE Node

The advent of CDR technology takes a complementary approach to end-user security: it does not determine or detect malware’s functionality but rather removes all file components that are not approved within enterprise policy. In the following example, a file is analyzed by OPSWAT DeepCDR and found not to be a PDF, as suggested by the file extension, but in fact a .docx file. A frequent attempt to thwart security policy is to shield true file contents with layer upon layer of obfuscation, such as a .zip archive that contains an embedded .7z archive, in turn containing a tarball that purports to contain a .pdf, which is actually a .docx file. So it goes, to the degree that an entire approach to impeding traditional AV engines has evolved: the so-called zip bomb.

As seen in the preceding figure, a file demo.pdf can, with CDR, be “sanitized” or scrubbed of potential risk factors, to an end state where the resulting file can safely be distributed while still preserving its usability.

The result of this CDR policy is that the file demo.pdf, which is truly a .docx file, has had all JavaScript, active URL links, images and attachments fully removed, thereby allowing the file to be safely distributed. Drilling deeper, the solution, which can perform this sanitization for a broad spectrum of file types typical of office workflows, such as the Microsoft suite of productivity tools, can pull apart the JavaScript originally found within the files.

Although a block action, as opposed to the sanitize approach, can certainly be followed, some institutions, such as large financial firms, will have malware teams that keenly want to access the archived scripts and attachments originally included with blocked content.

Performance Adjustments and Conclusion

The ability to tweak Distributed Cloud CE nodes, to “right-size” the appliance to meet the demands that various applications make, is straightforward. In this article, which utilized CE nodes implemented as virtual appliances on a VMware ESXi platform, some parameters were adjusted to provide more generous resources to the containers. Prior to the initial power-up operation of the CE nodes, the assigned vCPU count was increased to 8 and the provisioned disk space was increased from 40 GB to 100 GB.

Once inside the Distributed Cloud global console, a “workload flavor” was designed to harness the resources available for the container.

As seen in the image, a workload flavor was created (and many can be set up), which in this particular case allotted 12000 MiB of memory, 8 vCPUs and 6000 MiB of container storage. The latter value is not heavily taxed in this OPSWAT MetaDefender Core example, as a permanent volume was mapped to the appliance to preserve historical results after a container restart event.

Finally, in terms of scaling out applications running upon the virtual Kubernetes service offered by Distributed Cloud, one can adjust replica counts per site. This has the net result of more pods running across the solution, increasing the overall performance of containerized software under higher loads.
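As a hedged illustration, both knobs surface as ordinary manifest fields for those using Kubectl; the annotation name ves.io/workload-flavor and the per-site interpretation of replicas reflect our understanding of vK8s conventions, and the flavor name is a hypothetical stand-in for the one defined above.

    # Excerpt of the workload's Deployment showing scale-out and flavor selection;
    # the annotation name and per-site replica behavior are assumptions.
    metadata:
      annotations:
        ves.io/workload-flavor: opswat-flavor   # the flavor defined earlier
    spec:
      replicas: 2     # in vK8s, understood to apply per site of the virtual site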

The result of the vK8s service is a new option for organizations looking, for a multitude of reasons, to converge their compute requirements onto fewer platforms, including a general effort to curtail server sprawl and aspire to a greener working environment. The solution of running code on network appliances can in some cases leverage the worldwide RE network, but in all cases can utilize CE deployments in data centers, in the cloud and, perhaps most promisingly, at the edge, where compute options can be at a premium. The intuitive GUI-driven approach to running containers means a fully matured Kubernetes command-line skill set is not a prerequisite to a rapid and effective application rollout.

Updated Oct 19, 2023
Version 2.0