Forum Discussion
winifred_corbet
Nimbostratus
Aug 10, 2010
How to block Googlebot/Nutch-1.0 Bot
We have this Bot that is killing us:
Googlebot/Nutch-1.0 (Prototype; http://en.wikipedia.org/wiki/Web_crawler; donotreply at prototype dot com)
We would like to block it completely. I see there are iRules to redirect bots, but is there a simple one that can block this altogether?
1 Reply
- hoolio
Cirrostratus
Hi Winifred,
Sure, you can use a simple iRule to send a TCP reset if the User-Agent header contains that string:

    when HTTP_REQUEST {

       # Check the UA header value, set to lower case
       switch -glob [string tolower [HTTP::header User-Agent]] {
          "*googlebot/nutch*" {
             # Bad UA, send a TCP reset
             reject
          }
       }
    }
If you know the IP address(es) they typically make requests from, you could do this more efficiently by adding the IPs to an address datagroup and then using the matchclass (v9) or class (v10) command in CLIENT_ACCEPTED. That check runs once per connection, so it avoids inspecting the User-Agent header on every HTTP request.
http://devcentral.f5.com/wiki/default.aspx/iRules/class
http://devcentral.f5.com/wiki/default.aspx/iRules/matchclass
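As a rough sketch of the datagroup approach for v10, assuming you have created an address datagroup and named it blocked_bot_ips (that name is just a placeholder, use whatever you call yours):

```tcl
when CLIENT_ACCEPTED {

   # blocked_bot_ips is a hypothetical address datagroup holding the bot's source IPs
   if { [class match [IP::client_addr] equals blocked_bot_ips] } {
      # Client IP is in the datagroup, send a TCP reset
      reject
   }
}
```

On v9 the condition would instead use matchclass with the global datagroup reference, along the lines of [matchclass [IP::client_addr] equals $::blocked_bot_ips]. Keep in mind this only helps if the bot's source addresses are reasonably stable. Otherwise the User-Agent check is the safer option.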
Aaron
