Forum Discussion
Mike_62629
Nimbostratus
Jul 16, 2008

Rate limiting Search Spiders
We're currently having problems with some web spiders hammering our web servers: they use up the available sessions in our application and consume a large amount of our bandwidth. We're interested i...
Mike_62629
Nimbostratus
Jul 16, 2008

Here's the rate limiting approach:
when RULE_INIT {
    # Tracks the next allowed request time (epoch seconds) per crawler user agent
    array set ::active_crawlers { }
    # Minimum interval, in seconds, between requests from the same crawler
    set ::min_interval 1
}

when HTTP_REQUEST {
    set user_agent [string tolower [HTTP::header "User-Agent"]]

    # Logic only relevant for crawler user agents
    if { [matchclass $user_agent contains $::Crawlers] } {
        # Throttle crawlers.
        set curr_time [clock seconds]
        if { [info exists ::active_crawlers($user_agent)] } {
            if { $::active_crawlers($user_agent) < $curr_time } {
                # Enough time has passed; allow the request and set the next allowed time
                set ::active_crawlers($user_agent) [expr {$curr_time + $::min_interval}]
            } else {
                # Still inside the throttle window; drop the connection
                reject
            }
        } else {
            # First request seen from this crawler; record its next allowed time
            set ::active_crawlers($user_agent) [expr {$curr_time + $::min_interval}]
        }
    }
}
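The matchclass call above assumes a data group (class) named Crawlers whose entries are lowercase substrings of crawler User-Agent headers; the "contains" operator matches if any entry appears within the lowercased header. The original post doesn't show that class, so here is a minimal sketch of what it might look like in bigip.conf on the v9-era syntax in use at the time (the entries are illustrative, not from the post):

class Crawlers {
    # Substrings matched against the lowercased User-Agent header
    "googlebot"
    "msnbot"
    "slurp"
}

With min_interval set to 1, any matching crawler is limited to roughly one request per second per User-Agent string; requests arriving faster than that are rejected at the connection level.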