Forum Discussion
gnico
Nov 08, 2022
Nimbostratus
iRule to block requests from amazonaws.com
Hello, I have an iRule to block requests from amazonaws.com bad crawlers (millions of requests a day), but my iRule doesn't work. Total executions is 0. Here is the code: when HTTP_REQUES...
Kevin_Stewart
Nov 08, 2022
Employee
The issue here is that amazonaws.com is not the Host value. The amazonaws.com bot is making a request to your site, so the Host value is still your HTTP Host. To find a crawler bot, you'd want to use the User-Agent header.
when HTTP_REQUEST {
    # Compare the lowercased User-Agent header against the entries in the blacklist_host data group
    if { [matchclass [string tolower [HTTP::header User-Agent]] contains blacklist_host] } {
        # Drop the connection for blacklisted crawlers
        reject
    }
}
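On newer TMOS versions, the same check can also be written with the class match command instead of the older matchclass. This is a minimal sketch, assuming a string-type data group named blacklist_host that holds the User-Agent substrings you want to block:
when HTTP_REQUEST {
    # class match compares the lowercased User-Agent against the blacklist_host data group
    if { [class match [string tolower [HTTP::header User-Agent]] contains blacklist_host] } {
        reject
    }
}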
It may also be useful to consider a robots.txt file (https://developers.google.com/search/docs/crawling-indexing/robots/intro), which you could serve directly from an iRule:
when HTTP_REQUEST {
    # Answer robots.txt requests from the BIG-IP itself
    if { [HTTP::uri] == "/robots.txt" } {
        # [ifile get robots.txt] reads an iFile named robots.txt that must be uploaded to the BIG-IP
        HTTP::respond 200 content [ifile get robots.txt]
    }
}
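If you would rather not manage an iFile, a minimal variant (assuming a blanket disallow policy is acceptable) can embed the robots.txt content inline and set a text/plain Content-Type:
when HTTP_REQUEST {
    if { [HTTP::uri] == "/robots.txt" } {
        # Inline robots.txt content; adjust the rules to suit your site
        HTTP::respond 200 content "User-agent: *\r\nDisallow: /\r\n" "Content-Type" "text/plain"
    }
}
Keep in mind that well-behaved crawlers honor robots.txt, while abusive bots typically ignore it, so the User-Agent check above is still worth keeping.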