Green IT: 404 Blacklisting

One of the premises of a greener IT is to reduce the number of servers necessary while maintaining performance levels and meeting capacity needs.

Chances are that many of the HTTP requests received that result in a 404 (not found) message are typos, bots, or bad guys attempting to find a way into your web applications. The thing is that the server must respond to these requests, and it often requires some disk I/O to discover the file doesn't exist. That's expensive in terms of resources and can increase the total power consumption of your servers.

If you're finding enough 404 errors in your logs, and you've verified that they're accurate (that is, the files don't, and shouldn't, exist), then you may want to consider blacklisting those requests. By blacklisting those pesky non-existent files you obviate the need for the server to look for them, reducing the overall burden (and power consumption) on your servers. This equates to a better performing server and ensures that real requests get the resources they need to be fulfilled in a timely manner.

To accomplish this task, you'll need an iRule that keeps track of URIs resulting in a 404 and ensures that subsequent requests for those URIs are immediately "kicked back" to the user instead of being passed on to the server. We'll do that dynamically, in real time, because it's nearly impossible to guess what kind of funky URIs will be requested by users, bots, and bad guys.

We'll also want to log additions to our blacklist so administrators can verify that the files in question really aren't valid files that have somehow been removed/lost.

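when RULE_INIT {
    # Global array of URIs that have already produced a 404
    array set ::unknown_pages { }
}

when HTTP_REQUEST {
    if { [info exists ::unknown_pages([HTTP::uri])] } {
        # Known-missing URI: answer from BIG-IP without touching a server
        HTTP::respond 404 content "The requested file could not be found"
    } else {
        # Remember the URI so HTTP_RESPONSE can blacklist it if needed
        set curr_uri [HTTP::uri]
        pool webpool
    }
}

when HTTP_RESPONSE {
    if { [HTTP::status] == 404 } {
        set ::unknown_pages($curr_uri) 1
        log local0. "Added $curr_uri to the 404 blacklist"
    }
}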

There are myriad other options you could employ to make this iRule even more flexible. You could add a timestamp to any URI added to the 404 blacklist, and revalidate that it is, in fact, still missing after a specified time interval. This allows you to support the case where the file should have existed, but didn't, giving IT time to resolve the problem and automatically removing the URI from the blacklist later. Or you can write another iRule that specifically removes a URI from the list, so you can manually manage the list when you need to.
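The timestamp idea might look something like the sketch below. It is only an illustration, not a tested implementation: the `::recheck_interval` variable and its value of 300 seconds are assumptions made up for this example, and the array now stores the time each 404 was observed instead of a simple flag. Once an entry is older than the interval, it is dropped and the request is passed to the server again, which either re-adds the URI (still missing) or serves the restored file.

```tcl
when RULE_INIT {
    # Maps URI -> time (epoch seconds) at which the 404 was last observed
    array set ::unknown_pages { }
    # Hypothetical revalidation interval, in seconds (an assumption)
    set ::recheck_interval 300
}

when HTTP_REQUEST {
    set curr_uri [HTTP::uri]
    if { [info exists ::unknown_pages($curr_uri)] } {
        set age [expr { [clock seconds] - $::unknown_pages($curr_uri) }]
        if { $age < $::recheck_interval } {
            # Entry is still fresh: answer from BIG-IP
            HTTP::respond 404 content "The requested file could not be found"
            return
        }
        # Entry has expired: remove it and let the server revalidate
        unset ::unknown_pages($curr_uri)
        log local0. "Revalidating $curr_uri"
    }
    pool webpool
}

when HTTP_RESPONSE {
    if { [HTTP::status] == 404 } {
        # Record (or refresh) the time this URI was found missing
        set ::unknown_pages($curr_uri) [clock seconds]
        log local0. "Added $curr_uri to the 404 blacklist"
    }
}
```

A shorter interval gets restored files back into service sooner, at the cost of sending more requests for genuinely missing files to the servers. On later BIG-IP versions, the session `table` command with a timeout may be a better fit than a global array, but check that against the documentation for your version.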

You could also redirect the user to a prettier "not found" page rather than responding with simple text. Just replace the HTTP::respond line with one that uses HTTP::redirect with the appropriate URL:

HTTP::redirect http://www.example.com/sorry_page.html
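In context, the HTTP_REQUEST event from the iRule above would then read roughly as follows. This is a sketch that assumes the RULE_INIT and HTTP_RESPONSE events stay as they are and that the sorry page itself exists, since a missing sorry page would end up on the blacklist and redirect to itself.

```tcl
when HTTP_REQUEST {
    if { [info exists ::unknown_pages([HTTP::uri])] } {
        # Send the client to a friendlier "not found" page
        HTTP::redirect "http://www.example.com/sorry_page.html"
    } else {
        set curr_uri [HTTP::uri]
        pool webpool
    }
}
```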

In general, removing unnecessary processing from your server infrastructure can reduce costs, improve performance, and increase/maintain capacity. With a flexible platform like BIG-IP Local Traffic Manager and iRules, you can reduce the burden on your servers and get "greener".

Imbibing: Mountain Dew


Published Jun 27, 2008
Version 1.0