This article describes how a readily available, rack-mountable Dell Technologies server can be configured as a BIG-IP appliance, using BIG-IP Virtual Edition (VE) software and VMware ESXi™ as the hypervisor. The goal is to understand the expected performance of the resulting BIG-IP and to document the steps followed to implement it quickly and easily on a mainstream compute platform from Dell: specifically, the widely deployed PowerEdge R650™ server. This hardware continues to be available for delivery within days of order placement, lending itself to short deployment cycles, so the performance to be expected is a key aspect of this investigation.
Performance was established in a lab environment using a publicly accepted load generator hardware solution from Keysight’s Ixia division, specifically an Ixia PerfectStorm. The following tests demonstrate that using Dell as the underlying compute platform for BIG-IP functionality not only reduces the time between the envision stage and a fully deployed solution, but also offers full support for complex traffic loads commensurate with traditional F5 dedicated appliances.
In short, whether purchasing a new Dell server or re-purposing existing hardware, this document details how higher performance outcomes can be achieved when industry leading F5 software is executed upon a popular and widely available hardware platform.
Dell Server and BIG-IP Software Overview
The BIG-IP platform has many years of wide adoption in the industry, where one common deployment approach is to use a hardware solution from F5, such as the iSeries or rSeries platforms. To gauge whether commercial off-the-shelf (COTS) servers could match the performance of the BIG-IP system on F5 appliances, a popular server was chosen for rigorous testing. Tests were performed using widely accepted, industry-standard load generation and measurement tools. The server selected for these tests, and its key internal components, were specifically chosen to reflect a readily available, mainstream solution. The full parts list is available from F5, but the highlights include:
See the Dell Server Components - Selection Rationale section below for the reasoning behind these component choices.
The server was ordered from Dell with VMware ESXi version 7.0.3 (Build 20036589) pre-installed as its operating system. The exact build is noted for those who want to match the environment precisely, including the hypervisor version.
After powering up the ESXi server and configuring its initial values (management, DNS, NTP, etc.), the next step was to create network port groups, each with its own VLAN and subnet (e.g., management, internal, external, and high availability [HA]). Next, a single instance of BIG-IP VE was deployed as a virtual machine on the ESXi host. To maximize performance and allow for repeatable deployments, no other virtual machines were deployed within the ESXi instance. BIG-IP VE was deployed using an OVA downloaded from downloads.f5.com and set up by following the instructions in the BIG-IP VE deployment guide (No SR-IOV), using the 8 vCPU deployment model and the previously created networks from the ESXi/vSphere HTML5 console. This type of deployment usually takes less than 30 minutes to complete.
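The port groups in this test were created from the ESXi/vSphere HTML5 console. For readers who prefer to script the step, the following minimal Python sketch generates the equivalent `esxcli` commands for a standard vSwitch. The port group names and VLAN IDs are hypothetical placeholders (the lab values are not given in this article), and `vSwitch0` is assumed to be the target standard switch.

```python
from typing import Dict, List

# Hypothetical port group names and VLAN IDs, for illustration only.
# The values used in the lab are not stated in this article.
PORT_GROUPS: Dict[str, int] = {
    "bigip-mgmt": 10,
    "bigip-internal": 20,
    "bigip-external": 30,
    "bigip-ha": 40,
}


def portgroup_commands(port_groups: Dict[str, int],
                       vswitch: str = "vSwitch0") -> List[str]:
    """Build esxcli commands that add each port group and tag its VLAN."""
    commands: List[str] = []
    for name, vlan_id in port_groups.items():
        # Create the port group on the standard vSwitch.
        commands.append(
            "esxcli network vswitch standard portgroup add "
            f"--portgroup-name={name} --vswitch-name={vswitch}"
        )
        # Assign the VLAN ID to the newly created port group.
        commands.append(
            "esxcli network vswitch standard portgroup set "
            f"--portgroup-name={name} --vlan-id={vlan_id}"
        )
    return commands


if __name__ == "__main__":
    for command in portgroup_commands(PORT_GROUPS):
        print(command)
```

The script only prints the commands; they would still need to be reviewed and run in an ESXi shell, with names and VLAN IDs adjusted to the local network plan.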
To facilitate enhanced performance reporting and follow common industry practice, a VMware vCenter™ instance was also used for management and for testing complementary features such as virtual distributed switching (VDS) with the Dell ESXi host. The vCenter was hosted in a different cluster than the test environment. Virtual distributed switching is not critical when testing a single host, but it allows features such as VLANs to be created once and distributed across the hosts in a data center. Because it is considered the industry norm in real-world scenarios, it was used in the test. The first tests, before optimization, used the default VMware virtual standard switch (VSS). The optimized tests were run using the virtual distributed switch, facilitated by vCenter.
Dell Server Components - Selection Rationale
The selection of elements internal to the Dell R650 server was guided by F5 best practice criteria. Highlights of the decision-making process included the following:
Performance Test Setup in Lab
The Dell R650, with the factory installation of ESXi 7.0.3, was exposed to a series of test scenarios to see what benchmarks could be achieved with no manipulation of hardware or software, providing a first impression of what could be expected from the solution with no optimization. BIG-IP VE can be equipped with a range of license types that can impose throughput ceilings at levels that reflect real world scenarios. In this way, F5 delivers a variety of price/performance options to suit a range of users. Typical VE licenses can be capped at 1, 3, 5 and 10 Gbps. For the purposes of our initial benchmark, a 10 Gbps license was installed. Full license details, including the physical cores each license can leverage, are available here.
A virtual server was configured on the external side of the BIG-IP deployment, and a corresponding internal pool of 72 configured nodes, representing internal servers, was set up. This is a standard setup within modern data centers. Because the intention was to record a baseline with the Dell server right out of the box, advanced features for which F5 is well known (e.g., detailed iRules and TLS encryption) were not used as part of the initial benchmark. The server profiles used were layer 4 TCP-centric for bandwidth measurements and layer 7 HTTP-centric for transaction rate tests.
The test bed is depicted in the following diagram. The out-of-the-box setup included the standard VMware virtual standard switch (VSS) configured on the ESXi server.
Initial Test Results with Dell Server Out-of-the-Box Setup
The objective of the first set of tests was to determine:
1) The achievable throughput of the solution, established with massive numbers of clients downloading large 512KB (kilobyte) objects.
2) The maximum HTTP (HyperText Transfer Protocol) transaction rate measured, when users rapidly request many, smaller 128B (byte) pages.
In both scenarios, there were 1,000 concurrent simulated users (clients) with 100 requests per TCP (Transmission Control Protocol) connection.
With the Ixia PerfectStorm solution instructed to step up to 1,000 concurrent simulated users (clients) with 100 requests per TCP connection, each retrieving large 512KB (kilobyte) objects, the steady state achieved was just over 8.0 Gbps of sustained traffic across the Dell R650 and BIG-IP VE pairing. Interestingly, CPU availability provided by the Dell server was not a bottleneck, with ample CPU reported available even at full load. The corresponding transaction rate for these large downloads was measured to be approaching 2,000 transactions per second.
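As a rough sanity check, the transaction rate reported for the large-object test can be estimated from the measured throughput and the object size. The sketch below assumes that 512KB means 512 × 1,024 bytes and ignores HTTP and TCP/IP header overhead, neither of which is specified in the test description; the function name is illustrative.

```python
def estimated_tps(throughput_gbps: float, object_bytes: int) -> float:
    """Estimate transactions per second if all traffic were object payload."""
    bits_per_object = object_bytes * 8
    return throughput_gbps * 1e9 / bits_per_object


# Assumption: 512KB is taken as 512 * 1024 = 524,288 bytes.
OBJECT_BYTES = 512 * 1024

if __name__ == "__main__":
    tps = estimated_tps(8.0, OBJECT_BYTES)
    print(f"~{tps:,.0f} transactions per second")
```

Under these assumptions the estimate comes to roughly 1,900 transactions per second, which is in line with the measured figure approaching 2,000.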
Transaction Rate Result:
When layer 7 (HTTP) transactions (as opposed to bandwidth) became the focus, the smaller 128B (byte) page download meant many more potential transactions per second. Since the BIG-IP platform is by nature performing a full proxy function, the achievable transaction rate is just as important as the supported bandwidth. The measured, sustained maximum HTTP transaction rate was determined to be more than 510,000 transactions per second. It is important to note that, as with throughput, this is simply out-of-the-box performance.
The transaction performance depicted above was achieved with a layer 7, HTTP-aware server profile. HTTP-centric features such as cookie-based server load balancing become available when using layer 7 profiles.
Optimized Test Results with Specific Adjustments to the Dell R650 and ESXi 7.0.3
The optimizations to both the underlying Dell server and the BIG-IP virtual machine significantly improved the measured results for both throughput and transaction rates. The details of those optimizations are provided in a subsequent section, while this analysis is focused entirely on the new results.
The throughput-oriented tests once again had the Ixia PerfectStorm step up the number of active users to 1,000, with 100 requests per TCP connection and 512KB (kilobyte) objects retrieved over each TCP connection. An immediate increase of 2 Gbps over the originally measured 8 Gbps throughput was detected after implementing the optimizations.
The 2 Gbps of extra throughput corresponds to a remarkable 25 percent increase from out-of-the-box performance. Also of importance, testing was performed with a 10 Gbps virtual edition license, suggesting the underlying Dell server and interfaces could potentially have supported even more throughput.
To measure transactions per second, tests were performed using the same logic as the Ixia PerfectStorm throughput test, but the payload was much smaller at 128B (bytes). This resulted in more transactions attempted and even higher incremental gains. With the optimized server and virtual BIG-IP, the increase was almost 151 percent, with 1.28 million transactions per second achieved using a layer 7-aware server profile on BIG-IP.
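The percentage gains quoted for the optimized tests follow directly from the before and after measurements. The short sketch below reproduces the arithmetic, assuming the rounded baseline figures of 8 Gbps and 510,000 transactions per second (the measured baselines were reported as slightly above these values); the function name is illustrative.

```python
def percent_increase(before: float, after: float) -> float:
    """Return the relative increase from before to after, in percent."""
    return (after - before) / before * 100.0


# Rounded figures taken from the test results in this article.
throughput_gain = percent_increase(8.0, 10.0)               # Gbps
transaction_gain = percent_increase(510_000, 1_280_000)     # transactions/s

if __name__ == "__main__":
    print(f"Throughput gain: {throughput_gain:.1f}%")
    print(f"Transaction rate gain: {transaction_gain:.1f}%")
```

With these inputs the throughput gain is 25 percent and the transaction rate gain is just under 151 percent, matching the figures stated above.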
BIG-IP virtual edition licenses are presently bandwidth oriented in terms of selecting the appropriate trim level. Our use of a 10 Gbps license for this testing was reflected in the fact that bandwidth plateaued at the 10 Gbps mark, suggesting that the license, not the server hardware, is the gating factor. As such, it is reasonable to suggest that even higher throughput and transaction rates could be achieved by the underlying Dell R650 and BIG-IP VE.
Please click the following link to be taken to part two of this article, where the specifics on what optimizations were conducted are discussed in detail: