There are two major issues we have found with this.
1) A Rails instance is single-threaded, so if one client request takes 5 seconds to serve, all other HTTP requests sent to that instance queue up behind it until the first has been served. Most of the existing load balancing algorithms will queue up requests in this way. Ideally we would like a load balancing algorithm that makes only one connection at a time to each webserver, with extra traffic handled as an exception case that either goes to a specific pool, is redirected, or triggers some other action.
2) The least connections load balancing algorithm is the closest fit for Rails apps. The problem is that in our case we have dozens of servers, with scores of webserver instances running on each server. At low load, we see all the traffic going to one server, e.g. the first ten connections go to ports xxx1-10 on server one. Only when traffic has been sent to every webserver instance on server one do we see any traffic sent to server two.
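To make issue 2 concrete, here is a rough Python sketch of the selection behaviour we are after: least connections per instance, with ties broken in favour of the physical server that currently has the fewest connections overall. The host names, ports and the `pick_backend` function are made up for illustration; this is not the API or the algorithm of any particular load balancer.

```python
from collections import defaultdict

def pick_backend(backends, active):
    """Pick a (host, port) instance.

    backends: list of (host, port) tuples (hypothetical names).
    active:   dict mapping a backend to its number of open connections.
    """
    # Total open connections per physical server.
    host_load = defaultdict(int)
    for backend in backends:
        host_load[backend[0]] += active.get(backend, 0)
    # Fewest connections on the instance first; ties go to the
    # least-loaded server, so low traffic is striped across servers.
    return min(backends, key=lambda b: (active.get(b, 0), host_load[b[0]]))

# Three servers with three webserver instances each (illustrative values).
backends = [(host, port)
            for host in ("server1", "server2", "server3")
            for port in (8001, 8002, 8003)]
active = {}
picks = []
for _ in range(3):
    chosen = pick_backend(backends, active)
    active[chosen] = active.get(chosen, 0) + 1
    picks.append(chosen)
```

With this tie-break the first three connections land on three different servers, rather than on three ports of server one.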
Long term, it would be nice to fix (1), but for now the priority is (2): properly striping our traffic across our servers.
I would expect this to be the behaviour, but it is hard to know for certain without testing.
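For issue 1, the behaviour we have in mind could be sketched roughly as follows: each webserver instance gets at most one request in flight, and anything beyond that is sent to an exception pool. The `dispatch` and `release` functions, the backend names and the `"overflow-pool"` label are all hypothetical, and the sketch assumes a single dispatcher thread.

```python
def dispatch(backends, busy, overflow):
    """Return an idle backend and mark it busy, or the overflow target."""
    for backend in backends:
        if backend not in busy:
            busy.add(backend)
            return backend
    # Every instance already has a request in flight, so treat this
    # request as the exception case instead of queueing it.
    return overflow

def release(busy, backend):
    """Mark a backend idle again once its response has been sent."""
    busy.discard(backend)

backends = ["server1:8001", "server1:8002"]  # illustrative names
busy = set()
first = dispatch(backends, busy, "overflow-pool")
second = dispatch(backends, busy, "overflow-pool")
third = dispatch(backends, busy, "overflow-pool")   # both instances busy
release(busy, first)
fourth = dispatch(backends, busy, "overflow-pool")  # first instance is free again
```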