iRule::ology - Table Based Rate Limiting
What is iRule::ology, you ask? Well it’s a kind of catchy way of writing iRuleology, which is a term I’m coining right now, that sprouts from the vast and great mind of Jason Rahm, on the DevCentral team.
i·Rule·ol·o·gy
–noun, plural -gies.
1. The derivation, that is, the study of the origin of, an iRule
2. An account of the history, purpose and creation of a particular iRule
3. The study of an iRule, simple or advanced, in hopes of completely and totally understanding the functionality, intended purpose, and possible uses of said code.
So there you have it, iRule::ology will be a series of Tech Tips wherein each installment focuses on an iRule and truly breaks it down, line by line, to the roots of the iRule, thereby exposing (hopefully) every nuance of the code and removing any mysticism involved.
To start, we have an iRule developed by F5’s own Kirk Bauer, who based his logic on an example by Christian Koenning, another awesome F5 engineer. These two were working on a rate limiting iRule using the table command. They each came up with a pretty killer solution, one for HTTP and one for RADIUS requests. I featured the HTTP version in an installment of the 20LoL and got an almost immediate response saying, “You have a tech tip in entry #3. Explaining what the heck is going on here. I don't think very many people will look at that and have any clue what is happening.” I took another look and thought that maybe it did bear some explaining.
Combine that with the chat I had with Jason about the iRule::ology concept, and I figured: what better place to start? So let’s see how this format works out:
when HTTP_REQUEST {
set request_limit_reached [ table lookup "request_limit_reached" ]
Next we’re setting a variable called “request_limit_reached” to the value of a table lookup for an entry of the same name. This will be used later to help us count the number of requests that have gone through the rule in the defined amount of time (defined later). The reason that we’re setting it to the value of the lookup is so it can be acted upon before the value is updated again. Basically it’s a way for us to keep a running tally of something via the table structure. Why use tables for what seems like a simple variable? Why not just use the incr command to update the variable itself? Because there is a timeout system built into tables on the LTM, which allows us to very easily set the granularity of our rate limiting window, as you’ll see in a few lines. Also, it ensures that this iRule is fully CMP compliant which is important for future proofing this in regards to scaling and portability.
if { [expr [table incr "counter_all_requests"] % 15] == 0 } {
This is a busy one, and in fact it does most of the heavy lifting of the “counter” portion of this iRule all in one line. Working from the inside out, let’s start with the table incr command. table incr increments the value of the table entry matching the given key. You can specify a custom amount to increment by, but the default of one is all we’re looking for here. If the key didn’t already exist, this gives us an entry in the table called “counter_all_requests” with a value of 1. If it did exist, its value goes up by 1 (from 1 to 2, for example). Next out from there is the expr command. Here we’re taking the value returned by table incr modulo the static value 15, and the enclosing if checks whether the result is equal to 0.
Modulus operations are an interesting thing. Basically, they can be thought of as returning the remainder of a division operation. So if you have 5 % 4, you’re basically doing 5 / 4 and capturing the remainder, which in this case would be 1, since 4 goes into 5 once, with 1 left over. Going back to our use case, if the “counter_all_requests” key in the table were a 7, this line of code would make it an 8 (table incr), and then check whether 8 % 15 is equal to 0 (it isn’t; the result is 8). If you’re keeping up, you’ll realize that our "% 15" operation will equal 0 whenever the counter hits a multiple of 15: 15, 30, 45, and so on. That is what makes the modulus operator great for acting on every Nth pass without keeping a separate count. Pretty tricky. This does not, in this case, mean we’re allowing 15 requests through per time allocation though. You’ll see why in a few lines.
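To make the modulus behavior concrete, here is a minimal sketch in plain Tcl (runnable in tclsh, no BIG-IP required). The variable names interval and hits are made up for this illustration and are not part of the iRule; an ordinary loop counter stands in for the table entry.

```tcl
# Plain Tcl sketch: count from 1 to 45 and record where counter % 15 is 0.
set interval 15   ;# hypothetical name for the article's static value
set hits {}       ;# hypothetical list of counter values that pass the test

for {set counter 1} {$counter <= 45} {incr counter} {
    # Same test as the iRule: the remainder of counter / interval is zero
    if { $counter % $interval == 0 } {
        lappend hits $counter
    }
}

puts "5 % 4 = [expr {5 % 4}]"       ;# prints: 5 % 4 = 1
puts "8 % 15 = [expr {8 % 15}]"     ;# prints: 8 % 15 = 8
puts "multiples seen: $hits"        ;# prints: multiples seen: 15 30 45
```

The test fires once per 15 increments no matter how large the counter gets, which is all the iRule needs from it.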
if { $request_limit_reached < 2 } {
This is a simple yet important line. We use it to cap how many requests we allow per time allotment. If all we checked was the modulus, we’d never know how many times the counter had been incremented without doing a count; we’d just see that it hit 0 every so often. This value tracks how many times the modulus hits 0, each of which signals that the count has increased by our static 15. By the way, that static number could be anything (10, 100, 1000…); it’s just 15 for simplicity’s sake while testing.
set request_limit_reached [ table incr "request_limit_reached" ]
This line looks almost identical to the second line of the iRule above, but with one important difference. Here we’re incrementing the value of the “request_limit_reached” key inside the table, rather than just setting a value equal to it. Again, incr works simply by taking whatever value is in the variable you’re dealing with, our table entry in this case, and updating it by a default value of 1, with the option to increment by any value you like.
The value of this key in the table is important because it is effectively a placeholder for our counter. Our counter is going up by one for each request that comes in, but we’re not tracking a raw counter; we’re only tracking this placeholder value, which is incremented by 1 every time the modulus operation above returns 0. So effectively, this can be thought of as the total number of requests in the current window divided by the static value we set up earlier, in this case 15. The above if statement ensures that this, and the code to follow, is only executed if the “request_limit_reached” value is less than 2, i.e. only if there have been fewer than 30 requests (in this case) in whatever period of time we’re using to rate limit (here’s a hint, it’s one second!).
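The relationship between the two counters can be walked through with a small plain Tcl simulation. This is only a sketch under stated assumptions: ordinary variables stand in for the table entries, there is no timeout or lifetime, the request counter is assumed to start at zero at the beginning of the window, and the stand-in entry starts at 0 (a real table lookup returns an empty string for a missing key). The names limit_entry and rejected are invented for the example.

```tcl
# Plain Tcl simulation of 35 requests inside a single window.
set counter_all_requests 0   ;# stands in for the "counter_all_requests" key
set limit_entry 0            ;# stands in for the "request_limit_reached" key
set rejected {}              ;# hypothetical list of rejected request numbers

for {set req 1} {$req <= 35} {incr req} {
    # Equivalent of the table lookup at the top of the iRule
    set request_limit_reached $limit_entry

    # Every 15th request bumps the placeholder, but only while it is below 2
    if { [incr counter_all_requests] % 15 == 0 } {
        if { $request_limit_reached < 2 } {
            set request_limit_reached [incr limit_entry]
        }
    }

    # Equivalent of the final check that triggers the rejection
    if { $request_limit_reached >= 2 } {
        lappend rejected $req
    }
}

puts "first rejected request: [lindex $rejected 0]"   ;# prints 30
puts "rejected count: [llength $rejected]"            ;# prints 6
```

Under these assumptions the placeholder reaches 1 at request 15 and 2 at request 30, so request 30 and everything after it in the window is rejected. On a live system the request counter is not reset per window, so the exact request that trips the limit can vary.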
table timeout "request_limit_reached" 1
And here is how we enforce that one second rollover time limit. As I said earlier, the reason we’re using the table command at all is to take advantage of the built in timeout functionality it has. If we were to do this manually it would involve clock clicks commands and loops and … bleh. This approach is much smoother and more efficient. Setting the timeout to 1 second means that this entry will expire after not being updated in any fashion (including a lookup) for one second. That’s well and good, but it leaves the possibility open for this entry to be touched more often than once per second and thereby never technically expire. Fortunately there is a command for that, as well.
table lifetime "request_limit_reached" 1
The lifetime setting does exactly what it sounds like it does, it sets the lifetime of a table entry. What that means is that after this amount of time, no matter how often it has been used or whether or not the timeout was ever reached, this table entry is going away. This is what ensures that we are sticking to a strict one second window of time when counting connections. It’s important to keep in mind that you can get yourself into trouble when setting the timeout and lifetime separately, simply because the lifetime will always win out. Keep this in mind when setting your lifetime, and make sure it’s sufficient to allow your timeout to, well, time out. Also, keep in mind that you can set this in-line when using the set command. We would have done that here, and saved ourselves two more commands, but the incr command doesn’t offer setting these values in-line, only set does.
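As a rough illustration of the in-line form mentioned above, the fragment below contrasts table set, which accepts a timeout and a lifetime after the value, with table incr, which needs the two follow-up commands. It is a sketch only: "example_key" is a hypothetical key name, and the fragment is meant to be read alongside the article's iRule rather than deployed as is.

```tcl
when HTTP_REQUEST {
    # table set <key> <value> <timeout> <lifetime>
    # Creates (or overwrites) the entry and sets both timers in one command.
    # "example_key" is a hypothetical key used only for this illustration.
    table set "example_key" 1 1 1

    # table incr takes no timeout or lifetime arguments, so the timers are
    # applied with separate commands, as in the article's iRule.
    table incr "example_key"
    table timeout "example_key" 1
    table lifetime "example_key" 1
}
```

The trade-off is that table set writes a fixed value, so it cannot replace the increment itself; it only helps when an entry is being created or reset.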
if { $request_limit_reached >= 2 } {
This line is checking against the “request_limit_reached” counter, which remember, is effectively equal to the total count of requests sent through the iRule in the last second, divided by our constant, which is 15 in this case. So for this to ever evaluate as true, we would have to see roughly 30 requests in a given second, which is precisely the thing we’re trying to avoid. Note that the request that pushes the value to 2 is itself rejected, as is every request after it until the table entry expires. This is where you would put whatever code you want to execute when the rate of requests exceeds your desired limit.
HTTP::respond 500
The HTTP::respond command will respond to the client with whatever status and message you determine in-line. In this case the user will receive an HTTP response with a status of 500, more commonly known as an Internal Server Error, in most common web clients. We chose a very simple action to perform once the rate was exceeded, but there is a wealth of options in this arena. Whether you want to simply drop requests, send an error, do a redirect, send a “Please be patient” page, etc…iRules can do it all. Simply replace this command with whatever command(s) you want and you’re off and running.
Something that those of you following along keenly may notice is missing is the line where we reset the "counter_all_requests" row in the table back to 0, so that it doesn't continue to grow without bound and/or so we can keep an accurate count. Well, we aren't doing that because frankly, we don't have to. First, the count will always be accurate due to the nature of the modulus operator. We're only dealing with even sets of 15, so the secondary counter value "request_limit_reached" will always be updated appropriately no matter what the value of "counter_all_requests" grows to. Second, the value of "counter_all_requests" will eventually roll over, thanks to the way Tcl deals with integer values in memory, so we don't even have to worry about it breaking things if we let it grow forever. This buys us back one operation we don't need to execute inside the main logic clause. A cycle saved is a cycle earned, after all.
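For a sense of what those alternatives might look like, here is a hedged sketch of the final block with a few options. Only one action should be active at a time, so the alternatives are commented out. The 429 status, the message text, the Retry-After value, and the example.com URL are illustrative choices, not something from the original iRule.

```tcl
if { $request_limit_reached >= 2 } {
    # Option 1: answer with a descriptive status, a short body, and a header
    HTTP::respond 429 content "Too many requests, please retry shortly." "Retry-After" "1"

    # Option 2 (instead of option 1): redirect to a hypothetical "busy" page
    # HTTP::redirect "http://www.example.com/busy.html"

    # Option 3 (instead of option 1): silently drop the connection
    # drop
}
```

Which one fits depends on the clients involved: browsers handle a redirect or a friendly page well, while API clients usually cope better with an explicit status code and a retry hint.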
Well there you have it, there’s the first break down from top to bottom of an iRule in the iRule::ology series. Please leave a comment or rate this article to let us know what you think. I personally think this format is incredible and will be very useful in dissecting exactly what’s happening in iRules, but I’d like to know what you think. In closing, here is the iRule in its entirety:
when HTTP_REQUEST timing on {
    set request_limit_reached [ table lookup "request_limit_reached" ]
    if { [expr [table incr "counter_all_requests"] % 15] == 0 } {
        if { $request_limit_reached < 2 } {
            set request_limit_reached [ table incr "request_limit_reached" ]
            table timeout "request_limit_reached" 1
            table lifetime "request_limit_reached" 1
        }
    }

    if { $request_limit_reached >= 2 } {
        HTTP::respond 500
    }
}
- Karim (Cirrostratus)
Hi,
First thanks for this article !
I have a question however : why did you use the "request_limit_reached" variable ? would it work fine without it ?
Below is the same code you provide without the "request_limit_reached" variable ?
when HTTP_REQUEST timing on {
    if { [expr [table incr "counter_all_requests"] % 15] == 0 } {
        if { [ table lookup "request_limit_reached" ] < 2 } {
            table incr "request_limit_reached"
            table timeout "request_limit_reached" 1
            table lifetime "request_limit_reached" 1
        }
    }
    if { [ table lookup "request_limit_reached" ] ne "" && [ table lookup "request_limit_reached" ] >= 2 } {
        HTTP::respond 500
    }
}
Is this variable only for clarity ? or am I missing something ?
by the way, jschilen I hope that you solved your issue since then , but just in case you have "[" instead of a "]" right before the "%" 🙂
Many thanks ,
Karim
- jschilen (Nimbostratus)
When I try to copy and paste this rule I'm getting
- Karim (Cirrostratus)
another question would be; why do we even bother with the "%15" ? Isn't it possible just to do something like this :
when HTTP_REQUEST {
    if { [ table lookup "request_limit_reached" ] < 30 } {
        table incr "request_limit_reached"
        table timeout "request_limit_reached" 1
        table lifetime "request_limit_reached" 1
    } else {
        HTTP::respond 500
    }
}
The only benefit I see of the modulo is that it prevents the whole code from executing each time a request comes. And hence the iRule would be, I suppose, a bit faster. Am I wrong ? Am I missing something ?
many thanks ,
karim