iRule::ology; Connection Limiting Take 2

Welcome to the series in which we tear apart an iRule line by line to see what’s really going on.  What is iRule::ology, you ask?

    i·Rule·ol·o·gy

     –noun, plural -gies.

1. The derivation, that is, the study of the origin of, an iRule

2. An account of the history, purpose and creation of a particular iRule

3. The study of an iRule, simple or advanced, in hopes of completely and totally understanding the functionality, intended purpose, and possible uses of said code.

In the continuation of the iRule::ology series I want to take a look at another, different way to look at rate limiting.  Last time we were looking at HTTP rate limiting, and were concerned with the number of HTTP requests in a given time period.  This time we’re looking at limiting the total concurrent connections from a particular user (IP address) to an application.  This is a slightly more difficult endeavor because you have to deal with both incrementing and decrementing a counter, not just tracking requests per second, but it makes use of some of the same concepts as you’ll see below.

First, let’s take a look at the entire iRule we’re going to be deconstructing so you can familiarize yourself and follow along:

  1: when CLIENT_ACCEPTED {
  2:   set tbl "connlimit:[IP::client_addr]"
  3:   set key "[TCP::client_port]"
  4:
  5:   table set -subtable $tbl $key "ignored" 180
  6:   if { [table keys -subtable $tbl -count] > 1000 } {
  7:     table delete -subtable $tbl $key
  8:     event CLIENT_CLOSED disable
  9:     reject
 10:   } else {
 11:     set timer [after 60000 -periodic { table lookup -subtable $tbl $key }]
 12:   }
 13: }
 14: when CLIENT_CLOSED {
 15:   after cancel $timer
 16:   table delete -subtable $tbl $key
 17: }

As you can tell the bulk of this iRule is happening in the CLIENT_ACCEPTED event, so that’s where we’re going to get started:

CLIENT_ACCEPTED

This iRule is counting connections, not HTTP requests.  It doesn’t care what kind of connections these are, what protocol they’re using, etc. It just needs to know that there is a client connecting to the VIP and initiating a new connection to be counted.  CLIENT_ACCEPTED is not only protocol agnostic, it’s also the very first iRule event to be executed. In addition, it’s only executed once per connection, rather than on a per request basis like HTTP_REQUEST which almost always fires multiple times before a user disconnects.  For all of these reasons, CLIENT_ACCEPTED makes the most sense for this particular iRule.

set tbl "connlimit:[IP::client_addr]"

Next we’re going to start setting up the structure for the table that we’ll be using to track how many connections are coming from each IP.  The easiest way to do this is to set up a subtable for each IP.  To ensure these entries don’t interfere with other records in the session table, and to make them easier to find, we’ll give each subtable a custom name. In this example we’re just concatenating a static string and the IP address of the inbound connection.  For example, a connection from 10.10.10.1 would result in a subtable named “connlimit:10.10.10.1”. Once we have this subtable, we’ll be able to much more easily track the number of connections from this “user”.

set key "[TCP::client_port]"

The “key” variable is there so that we can identify the individual row in the subtable for each new connection opened from a given IP address. We’re setting it to the client’s source port, which is normally distinct for each concurrent connection from the same IP to a given virtual server address and port.  Also, well, we need some kind of a key for the table set command that comes next.

table set -subtable $tbl $key "ignored" 180

As promised, the table set command.  This is what creates and inserts the rows in the subtable that are going to be used to count the number of concurrent connections we’re currently handling.  Each time a connection is opened to the VIP that this iRule is applied to, this command fires, adding a row to the subtable for the connecting client’s IP address and creating the subtable if necessary.  We’re using the variables set above, $tbl and $key, as well as a static “ignored” value, since the value of the entries in these subtables isn’t particularly important.  Also of note is the timeout of 180 seconds, which ensures that if an entry isn’t updated or looked up within 180 seconds, it’ll expire and be removed.  That becomes important in a few lines.

if { [table keys -subtable $tbl -count] > 1000 } {

So here it is, the comparison that you’ve been waiting for. This is how we keep the connections limited to a given number, 1000 in this case.  Note that this happens after the table set.  We need to ensure that we’re getting an accurate connection count, so we must wait until after the set occurs to count the number of rows and compare against whatever our limit is going to be.  The table keys -count command makes this comparison extremely easy, since it returns exactly what we’re looking for, a count of how many rows are in a given table (or subtable, in this case).  Since we’ve been adding a row to the subtable for the inbound IP address each time a connection is opened, this command tells us how many connections are currently open from that IP address to this VIP. The code inside this if statement is executed whenever a connection arrives from an IP address that already has 1000 concurrent connections open.

table delete -subtable $tbl $key

The first thing we’re going to do if we see a connection past the limit for a given IP is delete the subtable entry we just created.  Yes, I know this seems a little counterintuitive. Yes, I know you probably want to perform the check first and then only add the subtable entry if we’re not already over the limit for the IP in question, but take my word on this, you want to insert, check, then delete if necessary.  This is due to some very hairy, under the covers memory management stuff with sharing table info between TMMs and the like.  I won’t go into all the nitty gritty here but like I said, it’s important, which is why we’re doing it this way.  The table delete command is pretty straightforward: it takes the subtable name and the key of the row you’d like to delete as arguments.  Fortunately for us, we already have those stored in variables, so we just apply those variables here.

event CLIENT_CLOSED disable

Next, we’re going to disable the CLIENT_CLOSED event.  This is because the CLIENT_CLOSED event is only necessary in this iRule if the connection was accepted and its row was left in the subtable.  Since we just deleted the row we originally added, we can do without that chunk of code. Disabling it ensures we’re not wasting time executing unnecessary code. Efficiency is good, go with it.

 reject

The last and smallest line of code fired when the if statement returns true packs the most punch. We reject the connection outright.  There’s a connection limit we’re here to enforce, this IP is at that limit, so we’re not taking on any new ones.  Toss it and move on.  Since we’ve already disabled the CLIENT_CLOSED event, this is where the iRule processing in this case ends, effectively.  The rest of the code is only executed if we haven’t already hit the limit for the incoming IP.
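The insert, count, delete, reject sequence walked through so far can be sketched outside of a BIG-IP. What follows is a minimal model in Python, not iRule code: the class name ConnLimitModel, its method names and the limit of 3 are assumptions made up for illustration, and each subtable is modelled as a plain set of source ports.

```python
# Toy model of the per-IP subtables; not iRule code and not an F5 API.
class ConnLimitModel:
    def __init__(self, limit):
        self.limit = limit
        self.subtables = {}  # "connlimit:<ip>" -> set of client source ports

    def client_accepted(self, ip, port):
        """Insert the row first, then count, then delete if over the limit."""
        tbl = "connlimit:" + ip
        rows = self.subtables.setdefault(tbl, set())
        rows.add(port)                # like: table set -subtable $tbl $key ...
        if len(rows) > self.limit:    # like: table keys -subtable $tbl -count
            rows.discard(port)        # like: table delete -subtable $tbl $key
            return False              # like: reject
        return True

    def client_closed(self, ip, port):
        """Remove the row for a connection that closed normally."""
        self.subtables.get("connlimit:" + ip, set()).discard(port)

    def count(self, ip):
        return len(self.subtables.get("connlimit:" + ip, set()))


model = ConnLimitModel(limit=3)
results = [model.client_accepted("10.10.10.1", port)
           for port in (5001, 5002, 5003, 5004)]
print(results)                    # [True, True, True, False]
print(model.count("10.10.10.1"))  # 3
```

The fourth connection is counted while its own row is present, so it sees 4 rows against a limit of 3, removes its row again and is rejected, leaving the count at 3.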

set timer [after 60000 -periodic { table lookup -subtable $tbl $key }] 

I’m skipping the “else” because I trust you to understand what an else does. Instead, we’re diving right into the table refresh timer.  What is a table refresh timer, aside from a term I just completely made up? It’s a way for us to ensure that, as long as the connection remains active, the subtable entry we added back on line 5 never times out.

Now, the observant among you will notice that we specifically set a timeout of 180 seconds on each subtable entry we create.  You might be asking yourself why we don’t just set the timeout to indefinite and do away with the timer line here. The reason is that if we did, a table entry would never go away if its connection was lost unexpectedly without executing the code that removes its row.  Because we’re setting a relatively short timeout, even in the worst case scenario, in which the connection drops off suddenly without executing the CLIENT_CLOSED section (we’re getting there, hang on) that would normally remove its row, the row will eventually remove itself anyway.  That’s because the only thing keeping it from timing out after 180 seconds is this timer.

The after -periodic command above does a lookup on the specific subtable/key pair we just added once every 60 seconds (60000 milliseconds). Every time this lookup is performed it touches the row, resetting the countdown until the row times out. It does this indefinitely, which may sound like a bad idea (it did to me at first), because as long as this out of band loop of sorts keeps running, this row will never expire. What you have to keep in mind, though, is command scope.  Because the after command is tied to this connection’s flow, when the connection goes away, so does the after -periodic.  That means the out of band loop stops, the timeout on the row stops getting refreshed, and after 180 seconds the row just automagically goes away, even if the iRule isn’t able to explicitly delete it because something bad happened to the connection.  Slick, eh?  In an ideal world we’ll just delete the row ourselves, but having a backup is important since we all know it’s not always an ideal world.
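To see how the 180 second timeout and the 60 second refresh interact, here is a small simulation in Python, again a model rather than iRule code. The Row class, the simulate function and the one-second logical clock are assumptions for illustration; the only values taken from the iRule are the two intervals.

```python
# Toy model of one subtable row with an idle timeout; not iRule code.
TIMEOUT = 180   # seconds, as in: table set ... "ignored" 180
REFRESH = 60    # seconds, as in: after 60000 -periodic { ... }


class Row:
    def __init__(self, now):
        self.last_touched = now

    def touch(self, now):
        # A lookup resets the idle countdown on the row.
        self.last_touched = now

    def alive(self, now):
        return now - self.last_touched < TIMEOUT


def simulate(connection_lost_at, end):
    """Return the first second at which the row has expired, or None."""
    row = Row(now=0)
    for now in range(1, end + 1):
        if not row.alive(now):
            return now
        # The periodic timer only runs while the connection still exists.
        if now < connection_lost_at and now % REFRESH == 0:
            row.touch(now)
    return None


print(simulate(connection_lost_at=10_000, end=3600))  # None
print(simulate(connection_lost_at=400, end=3600))     # 540
```

In the second run the connection vanishes at second 400 without any cleanup. The last refresh happened at second 360, so the row disappears on its own 180 seconds later, at second 540.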

when CLIENT_CLOSED {

We’re now dealing with the CLIENT_CLOSED event. This is the event that fires when the connection is, as you may have guessed, closed.  If this event fires it means that the connection was shut down properly (usually) and that we can now remove the row we originally added to the subtable to count this active connection and keep us under our limit.

after cancel $timer

First of all, remember that fancy magical timer we created earlier with the after -periodic command? We need to cancel that.  It has served its purpose and ensured that the subtable entry didn’t time out, so our count stayed accurate, but since we’re about to delete that entry, let’s cancel the after loop that’s been running out of band.  This simply tells after to stop processing the periodic we initiated earlier.

table delete -subtable $tbl $key

Lastly, if all has gone according to plan and we’re in the CLIENT_CLOSED event, we want to manually remove the entry from the connlimit subtable ourselves, since this connection no longer needs to be counted.  We’ll use the table delete command in nearly the same way we used the table set command earlier, with the $tbl and $key variables.  This will effectively decrement the number of rows in the table by one. Since we’re using the count of the rows as the comparison for our counter, we’ve just decremented our counter by one, and the next connection is ready to fire through the iRule.

That’s it for this iRule, and this installment of the iRule::ology series.  I’m still refining the format and tone a bit, so if you have any comments or suggestions, please don’t be shy. I’m still very excited about this format, and can’t wait to break down more cool pieces of code. A big thanks to spark of F5 PD fame for the iRule I dissected this time.

Published Jan 25, 2011
Version 1.0
  • Paul_Szabo_9016
    Historic F5 Account
    This iRule will fail for very low connection limits because there's a race between two or more TMMs for these three statements:

    table set -subtable $tbl $key "ignored" 180
    if { [table keys -subtable $tbl -count] > 1000 }
    table delete -subtable $tbl $key

    If the table is at 999 and three TMMs simultaneously insert their subtable entries before any of them queries the subtable size, all three connections will be rejected. Correct behavior would be that 2 out of the 3 would be rejected. If you have 32 TMMs (e.g. a fully loaded Viprion) the potential error starts to get larger.
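The interleaving described in this comment can be replayed deterministically. The sketch below is a Python model, not iRule code; the run function, the step names "set", "count" and "decide", and the numbering of the TMMs are assumptions for illustration.

```python
# Toy model of the race; each TMM's work is split into three steps.
LIMIT = 1000


def run(schedule, start_rows=999):
    """Replay (tmm, step) pairs; return (accepted TMMs, final row count)."""
    rows = start_rows
    seen = {}          # the row count each TMM observed
    accepted = set()
    for tmm, step in schedule:
        if step == "set":          # table set adds this connection's row
            rows += 1
        elif step == "count":      # table keys -subtable $tbl -count
            seen[tmm] = rows
        elif step == "decide":     # compare, then delete and reject if over
            if seen[tmm] > LIMIT:
                rows -= 1
            else:
                accepted.add(tmm)
    return accepted, rows


steps = ("set", "count", "decide")
tmms = (1, 2, 3)
# All three TMMs insert, then all three count, then all three decide.
racy = [(t, s) for s in steps for t in tmms]
# Each TMM finishes all of its steps before the next one starts.
serial = [(t, s) for t in tmms for s in steps]

print(run(racy))    # (set(), 999)
print(run(serial))  # ({1}, 1000)
```

In the racy order every TMM observes 1002 rows and rejects its connection, so nothing is accepted even though one slot was free. In the serial order the first connection takes the last slot and only the other two are rejected.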

     

     

    Obviously for a limit of 1000 the iRule is mostly correct, but for a limit of 10 it is likely not.

    This is a classic "dining philosophers problem", and it can be solved by using a random collision avoidance mechanism (iRule to follow in later post), or by F5 implementing an "insert only if count is less than some value" command (also known as the "let the waiter decide" solution).

    BTW you can also write the iRule so that it is unnecessary to disable the CLIENT_CLOSED event. I think it's cleaner code to have cleanup code in only one place.

  • Paul_Szabo_9016
    Historic F5 Account
    Here's the snippet of an iRule that shows a random backoff-and-retry if the limit is exceeded by some amount. This allows low limits to work pretty accurately (might be off by 1 occasionally).

    set limit $static::cmcc_limit
    set max_backlog $static::cmcc_backlog
    set accept_it 0

    # Save the subtable name with the client IP address
    set tbl "connlimit:[IP::client_addr]"

    # Use the client's source port as the subtable key
    set key "[TCP::client_port]"

    # Add the client source port to the subtable with a 180 second timeout
    table set -subtable $tbl $key "ignored" 180

    # Check if the client IP has more than X connections
    set count [table keys -subtable $tbl -count]
    if { $count <= $limit } {
        set accept_it 1
        log local0.alert "accept-1 count=$count"
    } elseif { $count <= $limit + $max_backlog } {
        # we're close to the limit, randomly retry
        set tmout [expr { int(rand()*20) + 1 }]
        after $tmout
        set count [table keys -subtable $tbl -count]
        if { $count <= $limit } {
            log local0.alert "accept-2 count=$count"
            set accept_it 1
        }
    }