Forum Discussion
winifred_corbet
Nimbostratus
Aug 10, 2010
How to block Googlebot/Nutch-1.0 Bot
We have this Bot that is killing us:
Googlebot/Nutch-1.0 (Prototype; http://en.wikipedia.org/wiki/Web_crawler; donotreply at prototype dot com)
We would like to block it completely. I ...
hoolio
Cirrostratus
Aug 10, 2010
Hi Winifred,
Sure, you can use a simple iRule to send a TCP reset if the user agent header contains that string:
when HTTP_REQUEST {
    # Check the User-Agent header value, converted to lower case
    switch -glob [string tolower [HTTP::header User-Agent]] {
        "*googlebot/nutch*" {
            # Bad UA, send a TCP reset
            reject
        }
    }
}
If you know the IP address(es) the bot typically makes requests from, you could do this more efficiently by adding the IPs to an address datagroup and then using the matchclass (v9) or class (v10) command in CLIENT_ACCEPTED. The check then runs once per connection, which avoids inspecting the User-Agent header on every HTTP request.
http://devcentral.f5.com/wiki/default.aspx/iRules/class
http://devcentral.f5.com/wiki/default.aspx/iRules/matchclass
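As a rough sketch of the datagroup approach for v10, something like the following should work. The datagroup name blocked_crawler_ips is made up for this example; you would need to create an address datagroup with that name (or your own) and fill it with the bot's source IPs first.

```tcl
when CLIENT_ACCEPTED {
    # blocked_crawler_ips is a hypothetical address datagroup holding
    # the crawler's source IPs or subnets; create it before using this rule
    if { [class match [IP::client_addr] equals blocked_crawler_ips] } {
        # Client IP is in the datagroup, send a TCP reset
        reject
    }
    # On v9, the equivalent check would use matchclass instead:
    # if { [matchclass [IP::client_addr] equals $::blocked_crawler_ips] } { reject }
}
```

Keep in mind this only helps if the bot's source addresses are fairly stable. If it moves around, the User-Agent check is the safer option.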
Aaron
