Load Balancing on the Inside

Business-critical internal processing systems often require high availability and fault tolerance, too.

Load balancing and application delivery are almost always associated with scaling out interactive, web-based applications. Rarely does anyone think about load balancing and application delivery in batch processing systems, even when those systems are critical to the business they support. But an application delivery controller (ADC) can scale out a non-interactive processing system and provide high availability for it just as easily as it can for an interactive web-based application. Maybe more easily.

When that system also requires a bit more intelligence than just simple load balancing, it makes a lot of sense to look closer at a context-aware system that can support all the requirements in a single solution.


A batch document processing system uses a document ID to match all related documents to the same “case.” The first time a document ID is encountered, it creates a new “case” and subsequent documents bearing that ID are attached to the original case.

To ensure processing around the clock, a redundant set of application servers is configured to process the documents, and the vendor’s application server clustering solution is used to load balance documents (in simple round-robin fashion) across the two instances.

A load test is conducted, ramping up to 2500 documents per hour (about 42 per minute, fewer than 1 per second). During the test it is discovered that in some situations two documents with the same ID arrive at the clustering solution back to back and are load balanced to separate instances. There is no existing “case” for this document ID. Because of processing times and load on the servers, both documents result in the creation of separate “cases.”
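To make the failure concrete, here is a minimal sketch of that race, not the vendor's actual implementation. It assumes a shared case store in which a new “case” only becomes visible once processing finishes; the function name, the server names, and the two-second processing time are all hypothetical.

```python
from itertools import cycle


def count_cases(arrivals, servers, processing_seconds=2.0):
    """Round-robin documents across servers and count cases created per ID.

    arrivals: list of (arrival_time_seconds, doc_id), sorted by time.
    Assumption: a case becomes visible only when processing finishes.
    """
    free_at = {server: 0.0 for server in servers}  # when each server is idle
    commits = []                                   # (commit_time, doc_id)
    cases = {}                                     # doc_id -> cases created
    round_robin = cycle(servers)

    for arrival, doc_id in arrivals:
        server = next(round_robin)
        start = max(arrival, free_at[server])      # wait if the server is busy
        finish = start + processing_seconds
        free_at[server] = finish

        # The lookup at `start` sees only cases committed by that time.
        exists = any(d == doc_id and t <= start for t, d in commits)
        if not exists:
            commits.append((finish, doc_id))
            cases[doc_id] = cases.get(doc_id, 0) + 1
    return cases


burst = [(0.0, "DOC-1"), (0.1, "DOC-1")]          # same ID, back to back
print(count_cases(burst, ["app1", "app2"]))       # {'DOC-1': 2}
print(count_cases(burst, ["app1"]))               # {'DOC-1': 1}
```

With two active instances the second document starts before the first case is committed, so a duplicate case appears. With a single active instance the second document has to wait its turn, which is the process latency that keeps the result correct.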

The test is considered a failure because the system, while managing the load fine from a network perspective, executed incorrectly under load from a process perspective.

The solution? Reconfigure the clustering solution to an active-standby configuration, thus introducing the process latency needed to ensure that the scenario does not occur. Retest. Success.

The result? The investment in the second instance of the application server – hardware, software licenses, management, maintenance – is wasted. It is a “failover” node only and reduces the overall capacity – and ultimately performance at higher load levels – of the system.


This scenario is real; it was described to me by a program manager at a Fortune 500 company with a great deal of frustration. It seemed, to her anyway, that the architects could not come up with a working solution other than wasting a perfectly good set of resources. Instinctively she described a solution that leveraged persistence to force all documents with the same ID to the same server. It had been proven repeatedly that when all documents with the same ID were processed by the same application server, the system processed them correctly and associated them with the right “case” in all situations. But the application server clustering solution, which can provide server affinity (persistence) based on a few variables, was for some reason not able to support affinity based on the document ID.

After a few questions regarding the overall system and processing times it became clear that a context-aware application delivery controller could indeed solve this problem.

The solution is fairly simple, actually, and based on existing persistence-based load balancing solutions. It is a given that documents with the same ID are batch processed within minutes of each other. Thus, a persistence table with a life of an hour or even thirty minutes would provide the proper context in which documents could be processed and directed to the “right” application server. This requires context; it requires that the load balancing solution, the application delivery controller, be aware of not only what it is processing but also what it has processed already, and where that was sent.

Document ID Based Persistence Logic
  • Extract the document ID from the document
  • Check the persistence table for the document ID
    • If the document ID already exists, route the document to the same server as the previous document(s) with that ID
    • If the document ID does not exist, decide which server the document will be sent to for processing and create an entry in the persistence table
  • Wash. Rinse. Repeat.
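
The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than how any particular ADC implements persistence. The class name, the server names, and the round-robin choice for new IDs are hypothetical, and the document ID is assumed to have been extracted already. Refreshing the entry's lifetime on every hit is also an assumption; the scenario only calls for a table life of thirty minutes to an hour.

```python
import time
from itertools import cycle


class PersistenceTable:
    """Hypothetical document-ID persistence table with expiring entries."""

    def __init__(self, servers, ttl_seconds=3600, clock=time.monotonic):
        self._servers = cycle(servers)   # round robin for IDs not seen before
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}               # doc_id -> (server, expires_at)

    def route(self, doc_id):
        """Return the server that should process this document ID."""
        now = self._clock()
        entry = self._entries.get(doc_id)
        if entry is not None and entry[1] > now:
            server = entry[0]            # known ID: same server as before
        else:
            server = next(self._servers) # new or expired ID: pick a server
        # Assumption: every hit refreshes the entry's lifetime.
        self._entries[doc_id] = (server, now + self._ttl)
        return server


table = PersistenceTable(["app1", "app2"], ttl_seconds=1800)
for doc_id in ["DOC-1", "DOC-2", "DOC-1"]:
    print(doc_id, "->", table.route(doc_id))
```

Both instances stay active, yet the second DOC-1 follows the first to the same server. The sketch never purges expired entries; a real table would need to, or it would grow without bound.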

This problem is really about process-level execution; about enforcing a business requirement on the technological implementation. To comply with the expectations of the business process, it is necessary to view each request in the context of that process rather than as an individual request that needs to be executed. Thus each touch point in the architecture that manipulates, transforms, or otherwise acts on the request needs to take the process into consideration; it needs to be context-aware so that its decisions are made within the context of the entire process and not just the individual request.

Layer 7 switching, application load balancing, application delivery. Whatever you want to call it, it is the way in which load balancing becomes context-aware and becomes collaborative. It enables the business requirements to be not only taken into consideration but enforced, while ensuring that CapEx and OpEx investments in additional systems are not left to sit idle and wasted. It improves capacity essentially by introducing process latency into the equation. By forcing the process to follow a particular path, the application delivery controller helps the technological implementation meet the goals of the business.

In other words, it aligns IT with the business.

Sometimes the marketing fluff is more solid than it appears.


Published Sep 22, 2009
Version 1.0
