Exploring BIG-IP VE capabilities on Dell PowerEdge R650 Servers - Part 1 of 2

This article describes how a readily available, rack-mountable Dell Technologies server can be configured as a BIG-IP appliance using BIG-IP Virtual Edition (VE) software with VMware ESXi™ as the hypervisor. The goal is to understand the expected performance of the resulting BIG-IP and to document the steps followed, as it was implemented quickly and easily on a mainstream compute platform from Dell: the widely deployed PowerEdge R650™ server. This hardware continues to be available for delivery within days of order placement, which lends itself to short deployment cycles; the performance to be expected is therefore a key aspect of this investigation.

Performance was established in a lab environment using a widely accepted hardware load generator from Keysight’s Ixia division, specifically an Ixia PerfectStorm. The following tests demonstrate that using Dell as the underlying compute platform for BIG-IP functionality not only reduces the time between initial planning and a fully deployed solution, but also offers full support for complex traffic loads commensurate with traditional F5 dedicated appliances.

In short, whether purchasing a new Dell server or re-purposing existing hardware, this document details how higher performance outcomes can be achieved when industry leading F5 software is executed upon a popular and widely available hardware platform.

Dell Server and BIG-IP Software Overview

The BIG-IP platform has many years of wide adoption in the industry, where one common deployment approach is to use a hardware solution from F5, such as the iSeries or rSeries platforms. To gauge whether commercial off-the-shelf (COTS) servers could match the performance of the BIG-IP system on F5 appliances, a popular server was chosen for rigorous testing. Tests were performed using widely accepted, industry-standard load generation and measurement tools. The server selected for these tests, and its key internal components, were specifically chosen to reflect a readily available, mainstream solution. The full parts list is available from F5, but the highlights include:

  • Dell PowerEdge R650 server (two, to allow for high availability [HA] deployments)
  • Intel Xeon Platinum 8362 2.8 GHz (32 physical cores; only one of the two motherboard CPU sockets populated, allowing for future scaling)
  • 128 gigabytes of memory (RDIMM, 3200 MT/sec) (8x16GB DIMM configuration to utilize all memory channels)
  • 2x 960 gigabytes of storage in RAID-1 (SSD SAS ISE Read Intensive) for storage redundancy
  • Intel E810-XXV Dual Port 10/25GbE OCP3.0 interface card equipped with 25 Gbps transceivers for BIG-IP traffic handling and HA configuration sync purposes
  • Motherboard 1 Gbps interfaces for ESXi management

See the "Dell Server Components - Selection Rationale" section below for our rationale in selecting these components over the alternatives.

The operating system of the server—ordered as a pre-installed, out-of-the-box solution from Dell—was VMware ESXi version 7.0.3 (Build 20036589), for those who want to exactly match the environment, including the precise hypervisor version.

After powering up the ESXi server and configuring its initial values (management, DNS, NTP, etc.), the next step was to create network port groups, each with its own VLAN and subnet (e.g., management, internal, external, and high availability [HA]). A single instance of BIG-IP VE was then deployed as a virtual machine on the ESXi host. To maximize performance and allow for repeatable deployments, no other virtual machines were deployed on the ESXi instance. BIG-IP VE was deployed from an OVA downloaded from downloads.f5.com and set up following the BIG-IP VE deployment guide (no SR-IOV), using the 8 vCPU deployment model and the previously created networks, from the ESXi/vSphere HTML5 console. This type of deployment usually takes less than 30 minutes to complete.
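The port groups in this article were created through the ESXi/vSphere console. For repeatable deployments, the same step could also be scripted. The following is a minimal sketch, not the procedure used in the testing: it only builds the esxcli command strings for a virtual standard switch so they can be reviewed before being run on the host. The port group names, VLAN IDs, and the vSwitch name "vSwitch0" are hypothetical placeholders.

```python
# Sketch: generate esxcli commands that create one port group per traffic type
# on a virtual standard switch. Names and VLAN IDs below are hypothetical.
PORT_GROUPS = {
    "bigip-mgmt": 10,
    "bigip-internal": 20,
    "bigip-external": 30,
    "bigip-ha": 40,
}


def portgroup_commands(vswitch: str, port_groups: dict) -> list:
    """Return the esxcli commands that add each port group and tag its VLAN."""
    commands = []
    for name, vlan_id in port_groups.items():
        if not 0 <= vlan_id <= 4095:
            raise ValueError(f"VLAN ID out of range for {name}: {vlan_id}")
        # Create the port group on the given standard vSwitch.
        commands.append(
            "esxcli network vswitch standard portgroup add "
            f"--portgroup-name={name} --vswitch-name={vswitch}"
        )
        # Assign the VLAN ID to the newly created port group.
        commands.append(
            "esxcli network vswitch standard portgroup set "
            f"--portgroup-name={name} --vlan-id={vlan_id}"
        )
    return commands


if __name__ == "__main__":
    for command in portgroup_commands("vSwitch0", PORT_GROUPS):
        print(command)
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere. These commands apply to a standard switch only; port groups on a virtual distributed switch are managed through vCenter instead.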

To facilitate enhanced performance reporting and follow common industry practice, a VMware vCenter™ instance was also used for management and for testing complementary features, such as virtual distributed switching (VDS), with the Dell ESXi host. The vCenter instance was hosted in a different cluster than the test environment. Virtual distributed switching is not critical for single-host testing, but it allows constructs such as VLANs to be created once and distributed across the hosts in a data center; it is considered the industry norm in real-world deployments and was therefore used in the test. The first tests, before optimization, used the default VMware virtual standard switch (VSS). The optimized tests were run using the virtual distributed switch, facilitated by vCenter.

Dell Server Components - Selection Rationale

The selection of elements internal to the Dell R650 server was guided by F5 best practice criteria. Highlights of the decision-making process included the following:

  • A Platinum-class Intel® Xeon® 8362 processor with a 2.8 GHz frequency, 32 physical cores, and up to 64 threads was chosen. Multiple classes of processors could have been selected (e.g., higher core count CPUs vs. higher frequency CPUs). The aim for this testing was to balance core count against CPU frequency, which is why this Intel Platinum processor was selected. As core counts increase, frequency is typically lowered because more heat must be dissipated from the denser arrangement of cores on the processor die.
  • Processor hyperthreading was not enabled. The optimal performance of virtual machines on a server such as the Dell R650 should occur when virtual cores map directly to physical cores. Hyperthreading enables two threads per core, effectively giving the appearance of a doubling in the underlying thread count; however, it does not double the performance of the processor. Our preference in this case is to guarantee that the virtual machine uses physical cores.
  • F5 co-develops enhanced drivers for Intel and Mellanox network adapters to allow for features such as SR-IOV support, where a virtual function (VF) of the adapter can be used for offloading and direct network access. This guide was initially tested with an Intel E810 25 GbE NIC without SR-IOV (VMXNET3). We intend to validate Mellanox adapters and the additional network features of these cards in the future.
  • The Dell server was equipped with durable and fast solid-state drives as opposed to spinning hard disk drives. We selected dual 960 GB drives running in RAID-1 to host the operating system (OS) and the BIG-IP VE, with ample room for expansion and redundancy. The use of solid-state disks improves data access times for both the OS and the VE.

Performance Test Setup in Lab

The Dell R650, with the factory installation of ESXi 7.0.3, was subjected to a series of test scenarios to see what benchmarks could be achieved with no manipulation of hardware or software, providing a first impression of what could be expected from the solution with no optimization. BIG-IP VE can be equipped with a range of license types that impose throughput ceilings at levels that reflect real-world scenarios. In this way, F5 delivers a variety of price/performance options to suit a range of users. Typical VE licenses are capped at 1, 3, 5, or 10 Gbps. For the purposes of our initial benchmark, a 10 Gbps license was installed. Full license details, including the physical cores each license can leverage, are available here.

A virtual server was configured on the external side of the BIG-IP deployment, and an internal pool of 72 configured nodes, representing internal servers, was set up. This is a standard setup within modern data centers. Per our intention to record a best-case scenario with the Dell server right out of the box, advanced features for which F5 is well known (e.g., detailed iRules and TLS encryption) were not used as part of the initial benchmark. The server profiles used were layer 4 TCP-centric for bandwidth measurements and layer 7 HTTP-centric for transaction rate tests.

The test bed is depicted in the following diagram.  The out-of-the-box setup included the standard VMware virtual standard switch (VSS) configured on the ESXi server.

Initial Test Results with Dell Server Out-of-the-Box Setup

The objective of the first set of tests was to determine:

1) The achievable throughput of the solution, established with massive numbers of clients downloading large 512KB (kilobyte) objects.

2) The maximum HTTP (HyperText Transfer Protocol) transaction rate measured, when users rapidly request many, smaller 128B (byte) pages.

In both scenarios, there were 1,000 concurrent simulated users (clients) with 100 requests per TCP (Transmission Control Protocol) connection.

Throughput Result:

With the Ixia PerfectStorm solution instructed to step up to 1,000 concurrent simulated users (clients) with 100 requests per TCP connection, each retrieving large 512KB (kilobyte) objects, the steady state achieved was just over 8.0 Gbps of sustained traffic across the Dell R650 and BIG-IP VE pairing. Interestingly, CPU availability provided by the Dell server was not a bottleneck, with ample CPU reported available even at full load. The corresponding transaction rate for these large downloads was measured to be approaching 2,000 transactions per second.
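The relationship between the two figures above can be checked with simple arithmetic. The sketch below assumes that 1 KB means 1,024 bytes, that 1 Gbps means 10^9 bits per second, and that HTTP and TCP header overhead is ignored; none of these assumptions are stated in the test description, so the result is only an approximation.

```python
def transactions_per_second(throughput_gbps: float, object_kib: float) -> float:
    """Estimate object downloads per second for a given payload throughput.

    Assumes 1 KB = 1,024 bytes and 1 Gbps = 1e9 bits/s, and ignores
    protocol header overhead.
    """
    bits_per_object = object_kib * 1024 * 8
    return throughput_gbps * 1e9 / bits_per_object


# 8.0 Gbps of 512 KB objects works out to roughly 1,900 downloads per second.
print(round(transactions_per_second(8.0, 512)))  # prints 1907
```

Under these assumptions the estimate is consistent with the measured rate approaching 2,000 transactions per second.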

Transaction Rate Result:

When layer 7 (HTTP) transactions (as opposed to bandwidth) became the focus, the smaller 128B (byte) page download meant many more potential transactions per second. Since the BIG-IP platform is by nature performing a full proxy function, the achievable transaction rate is just as important as the supported bandwidth. The measured, sustained maximum HTTP transaction rate was determined to be more than 510,000 transactions per second. It is important to note, as with throughput, this is simply out-of-the-box performance.

The transaction performance depicted above was achieved with a layer 7, HTTP-aware server profile. HTTP-centric features such as cookie-based server load balancing become available when using layer 7 profiles.

Optimized Test Results with Specific Adjustments to the Dell R650 and ESXi 7.0.3

The optimizations to both the underlying Dell server and the BIG-IP virtual machine significantly improved the measured throughput and transaction rates. The details of those optimizations are provided in part two of this article; this analysis focuses entirely on the new results.

The throughput-oriented tests once again had the Ixia PerfectStorm stepping up the number of active users to 1,000, with 100 requests per TCP connection and 512KB (kilobyte) objects retrieved over each TCP connection. An immediate increase of 2 Gbps over the originally measured 8 Gbps throughput was measured after implementing the optimizations.

The 2 Gbps of extra throughput corresponds to a remarkable 25 percent increase from out-of-the-box performance. Also of importance, testing was performed with a 10 Gbps virtual edition license, suggesting the underlying Dell server and interfaces could potentially have supported even more throughput.

To measure transactions per second, tests were performed using the same logic as the Ixia PerfectStorm throughput test, but the payload was much smaller at 128B (bytes). This resulted in more transactions attempted and even higher incremental gains. With the optimized server and virtual BIG-IP, the increase was almost 151 percent, with 1.28 million transactions per second achieved using a layer 7-aware server profile on BIG-IP.
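The percentage gains quoted for both tests follow from the before and after measurements. As a quick check, the sketch below uses the rounded figures from this article (8 and 10 Gbps; 510,000 and 1.28 million transactions per second). Because the out-of-the-box transaction rate was reported as "more than 510,000", the second result is approximate.

```python
def percent_increase(before: float, after: float) -> float:
    """Return the relative gain from before to after, as a percentage."""
    return (after - before) / before * 100


# Throughput: 8 Gbps out of the box, 10 Gbps after optimization.
print(f"{percent_increase(8.0, 10.0):.1f}%")

# HTTP transactions per second: 510,000 before, 1.28 million after.
print(f"{percent_increase(510_000, 1_280_000):.1f}%")
```

With these inputs the function returns 25 percent for throughput and just under 151 percent for transaction rate, matching the figures in the text.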

BIG-IP virtual edition licenses are presently bandwidth oriented in terms of selecting the appropriate trim level. Our use of a 10 Gbps license for this testing was reflected in the fact that bandwidth plateaued at the 10 Gbps mark, suggesting the license, not the server hardware, is the gating factor. As such, it is reasonable to suggest that even higher throughput and transaction rates could be achieved by the underlying Dell R650 and BIG-IP VE.

Please click the following link to be taken to part two of this article, where the specifics on what optimizations were conducted are discussed in detail:

Exploring BIG-IP VE capabilities on Dell PowerEdge R650 Servers - Part Two 

Updated Mar 19, 2024
Version 5.0