on 24-Jun-2011 03:06
No, not World of Warcraft “Damage per Second” - infrastructure “Decisions per second”.
Metrics are tricky. Period. Comparing metrics is even trickier. The purpose of performance metrics is, of course, to measure performance. But like most tests, before you can administer such a test you really need to know what it is you’re testing. Saying “performance” isn’t enough and never has been, as the term has a wide variety of meanings that are highly dependent on a number of factors.
The problem with measuring infrastructure performance today – and this will continue to be a major obstacle in metrics-based comparisons of cloud computing infrastructure services – is that we’re still relying on fairly simple measurements as a means to determine performance. We still focus on speeds and feeds, on wire and protocol processing. We look at throughput, packets per second (PPS) and connections per second (CPS) for network and transport layer protocols. While these are generally accurate for what they’re measuring, we start running into real problems when we evaluate the performance of any component – infrastructure or application – in which processing, i.e. decision making, must occur.
Consider the difference in performance metrics between a simple HTTP request / response in which the request is nothing more than a GET request paired with a 0-byte payload response and an HTTP POST request filled with data that requires processing not only on the application server, but on the database, and the serialization of a JSON response. The metrics that describe the performance of these two requests will almost certainly show that the former has a higher capacity and faster response time than the latter. Obviously those who wish to portray a high-performance solution are going to leverage the former test, knowing full well that those metrics are “best case” and will almost never be seen in a real environment because a real environment must perform processing, as per the latter test.
Suggestions that a standardized testing environment be used, similar to application performance comparisons using the Pet Shop Application, are generally met with a frown because using a standardized application to induce real processing delays doesn’t actually test the infrastructure component’s processing capabilities; it merely adds latency on the back-end and stresses the capacity of the infrastructure component. Such a yardstick would also fail to test what’s important – the speed and capacity of an infrastructure component to perform processing itself, to make decisions and apply them on the component – whether that processing is security, application routing or transformational in nature.
It’s an accepted fact that processing of any kind, at any point along the application delivery service chain, induces latency, which in turn impacts capacity. Performance numbers used in comparisons should reveal the capacity of a system including that processing impact. Complicating the matter is the fact that, because there are no accepted standards for performance measurement, different vendors can use the same term to discuss metrics measured in totally different ways.
Infrastructure components, especially those that operate at the higher layers of the networking stack, make decisions all the time. A firewall service may make a fairly simple decision: is this request for this port on this IP address allowed or denied at this time? An identity and access management solution must make similar decisions, taking into account other factors, to answer the question: is this user coming from this location on this device allowed to access this resource at this time? Application delivery controllers, a.k.a. load balancers, must also make decisions: which instance has the appropriate resources to respond to this user and this particular request within specified performance parameters at this time?
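To make the idea of a "decision" concrete, here is a minimal sketch of the simplest case above, the firewall's allow-or-deny question. The `Rule` type, the `RULES` list and the `firewall_decision` function are hypothetical names invented for illustration; they are not taken from any product, and the "at this time" factor is left out to keep the example short.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    network: str  # destination network in CIDR notation
    port: int     # destination port
    allow: bool   # verdict when the rule matches


# Hypothetical rule set, for illustration only.
RULES = [
    Rule("10.0.0.0/24", 443, True),
    Rule("10.0.0.0/24", 22, False),
]


def firewall_decision(dst_ip: str, dst_port: int, rules=RULES) -> bool:
    """Return True (allow) or False (deny).

    The first matching rule wins; anything unmatched is denied.
    """
    addr = ip_address(dst_ip)
    for rule in rules:
        if dst_port == rule.port and addr in ip_network(rule.network):
            return rule.allow
    return False  # default deny
```

Even this toy version does work that a packet-forwarding test never exercises: it parses an address, walks a rule list and compares fields. Each call is one decision, and the rate at which such calls complete is what a decisions-per-second figure would describe.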
We’re not just passing packets anymore, and therefore performance tests that measure only the surface ability to pass packets or open and close connections are simply not enough. Infrastructure today is making decisions, and because those decisions often require interception, inspection and processing of application data – not just individual packets – it becomes more important to compare solutions from the perspective of decisions per second rather than surface-layer protocol-per-second measurements.
Decision-based performance metrics are a more accurate gauge of how a solution will perform in a “real” environment, to be sure, because they portray the component’s ability to do what it was intended to do: make decisions and perform processing on data. Layer 4 or HTTP throughput metrics seldom come close to representing the performance impact that normal processing will have on a system and, while important, should be used with caution when considering performance.
Consider the metrics presented by Zeus Technologies in a recent performance test (Zeus Traffic Manager - VMware vSphere 4 Performance on Cisco UCS – 2010) and F5’s performance results from 2010 (F5 2010 Performance Report). While both show impressive throughput, they also show the performance impact that occurs when additional processing – decisions – is added into the mix.
The ability of any infrastructure component to pass packets or manage connections (TCP capacity) is all well and good, but these metrics are always negatively impacted once the component begins actually doing something, i.e. making decisions. Being able to handle almost 20 Gbps throughput is great but if that measurement wasn’t taken while decisions were being made at the same time, your mileage is not just likely to vary – it will vary wildly.
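The gap between the two kinds of measurement can be sketched in a few lines. The example below is an illustration under stated assumptions, not a benchmark methodology: `pass_through`, `inspect_and_decide` and `operations_per_second` are hypothetical names, the JSON payload and its fields are made up, and a single-threaded Python loop says nothing about how a real device would behave. It only shows that the same harness reports different rates once the handler has to inspect data and decide something.

```python
import json
import time


def pass_through(request: bytes) -> bytes:
    """Forward the request untouched: no inspection, no decision."""
    return request


def inspect_and_decide(request: bytes) -> bool:
    """Parse the payload and decide on its content (a stand-in for real processing)."""
    body = json.loads(request)
    return body.get("role") == "admin" and body.get("device") == "managed"


def operations_per_second(handler, request: bytes, iterations: int = 50_000) -> float:
    """Call handler `iterations` times and return completed calls per second."""
    start = time.perf_counter()
    for _ in range(iterations):
        handler(request)
    elapsed = time.perf_counter() - start
    return iterations / elapsed


if __name__ == "__main__":
    # Made-up request body, for illustration only.
    sample = json.dumps(
        {"role": "admin", "device": "managed", "path": "/reports"}
    ).encode()
    print(f"pass-through: {operations_per_second(pass_through, sample):,.0f} ops/s")
    print(f"decisions:    {operations_per_second(inspect_and_decide, sample):,.0f} decisions/s")
```

The absolute numbers will differ on every machine, which is rather the point: the first figure is the "best case" and the second is closer to what the component does when it is actually used.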
Throughput is important, don’t get me wrong. It’s part of – or should be part of – the equation used to determine what solution will best fit the business and operational needs of the organization. But it’s only part of the equation, and probably a minor part at that. Decision-based metrics should also be one of the primary means of evaluating the performance of an infrastructure component today. “High performance” cannot be measured effectively based on merely passing packets or making connections – high performance means being able to push packets, manage connections and make decisions, all at the same time.
This is increasingly a fact of data center life as infrastructure components continue to become more “intelligent”, as they become first-class citizens in the enterprise infrastructure architecture and are more integrated and relied upon to assist in providing the services required to support today’s highly motile data center models. Evaluating a simple load balancing service based on its ability to move HTTP packets from one interface to the other with no inspection or processing is nice, but if you’re ultimately planning on using it to support persistence-based routing, a.k.a. sticky sessions, then the rate at which the service executes the decisions necessary to support that routing should be as important – if not more so – to your decision-making process.
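For readers who have not looked inside persistence-based routing, a rough sketch of the per-request decision follows. `StickyBalancer` and the backend names are hypothetical, and the round-robin choice for new sessions is an assumption made for brevity. A real load balancer would first have to extract the session identifier from the request (for example from a cookie), which is exactly the inspection cost that a plain packet-forwarding test leaves out.

```python
import itertools


class StickyBalancer:
    """Route every request in a session to the same backend (persistence)."""

    def __init__(self, backends):
        # New sessions are assigned round-robin; an assumption for this sketch.
        self._round_robin = itertools.cycle(backends)
        self._sessions = {}  # session id -> backend

    def route(self, session_id: str) -> str:
        """Decide which backend serves this request."""
        backend = self._sessions.get(session_id)
        if backend is None:
            # First request in this session: pick a backend and remember it.
            backend = next(self._round_robin)
            self._sessions[session_id] = backend
        return backend
```

Every call to `route` is a lookup plus, for new sessions, a selection and a table update. That per-request work is what should be measured if sticky sessions are how the service will be used.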
There are very few pieces of infrastructure on which decisions are not made on a daily basis. Even the use of VLANs requires inspection and decision-making to occur on the simplest of switches. Identity and access management solutions must evaluate a broad spectrum of data in order to make a simple “deny” or “allow” decision and application delivery services make a variety of decisions across the security, acceleration and optimization demesne for every request they process.
And because every solution is architected differently and comprised of different components internally, the speed and accuracy with which such decisions are made vary and will certainly impact the ability of an architecture to meet or exceed business and operational service-level expectations. If you’re not testing that aspect of the delivery chain before you make a decision, you’re likely to be either pleasantly surprised or hopelessly disappointed in the decision-making performance of those solutions.
It’s time to start talking about decisions per second and performance of infrastructure in the context it’s actually used in data center architectures rather than as stand-alone, packet-processing, connection-oriented devices. And as we do, we need to remember that every network is different, carrying different amounts of traffic from different applications. That means any published performance numbers are simply guidelines and will not accurately represent the performance experienced in an actual implementation. However, the published numbers can be valuable tools in comparing products… as long as they are based on the same or very similar testing methodology. Before using any numbers from any vendor, understand how those numbers were generated, what they really mean and how much additional processing (if any) they include.
When looking at published performance measurements for a device that will be making decisions and processing traffic, make sure you are using metrics based on performing that processing.