Forum Discussion
Irule to block request from amazonaws.com
Hello,
I have an iRule to block requests from amazonaws.com bad crawlers (millions of requests a day), but it doesn't work. Total executions is 0.
Here is the code:
when HTTP_REQUEST {
    if { [matchclass [string tolower [HTTP::header Host]] contains blacklist_host] } {
        reject
    }
}
My data group blacklist_host contains an amazonaws.com entry.
Does anyone have a solution? Thank you.
- gnico (Nimbostratus)
Thank you for your reply.
What I want is a property like Apache's remote_host. In my Apache logs, I have millions of hits from remote_host ec2-xx-xxx-xxx-xx.eu-west-3.compute.amazonaws.com
They are malicious bots with a classic User-Agent like Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78, coming from many different IPs.
So the only solution I have found is to block them by their remote host. I don't want to use Apache rules; I want to block them before they reach Apache.
Is there a way?
Thank you
- Kevin_Stewart (Employee)
Apache remote_host is essentially a reverse DNS lookup. You can do this in an iRule:
1. Create a resolver object:
list net dns-resolver my-resolver
net dns-resolver my-resolver {
    forward-zones {
        . {
            nameservers {
                10.1.20.1:domain { }
            }
        }
    }
    route-domain 0
}
2. Create an iRule that uses the resolver object.
Ref: https://clouddocs.f5.com/api/irules/RESOLVER__name_lookup.html
proc resolv_ptr_v4 { addr_v4 } {
    set ret [scan $addr_v4 {%d.%d.%d.%d} a b c d]
    if { $ret != 4 } {
        return
    }
    set ret [RESOLVER::name_lookup "/Common/my-resolver" "$d.$c.$b.$a.in-addr.arpa" PTR]
    set ret [lindex [DNSMSG::section $ret answer] 0]
    if { $ret eq "" } {
        return
    }
    return [lindex $ret end]
}
when CLIENT_ACCEPTED {
    set result [call resolv_ptr_v4 [IP::client_addr]]
    log local0. $result
    ## put your data group search here
}
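As a rough sketch of the data group search that the placeholder comment leaves open, the CLIENT_ACCEPTED event could look something like the following. This is not from the original reply. It assumes the resolv_ptr_v4 proc above is defined in the same iRule, and that blacklist_host is a string data group containing amazonaws.com, as described in the question. The variable name ptr_name and the trailing-dot trimming are illustrative precautions only. Using ends_with instead of contains is a design choice, so that only names under the listed domain match.

```tcl
# Sketch only: assumes resolv_ptr_v4 (above) lives in this same iRule and that
# blacklist_host is a string data group holding entries such as "amazonaws.com".
when CLIENT_ACCEPTED {
    set ptr_name [call resolv_ptr_v4 [IP::client_addr]]
    # An empty result means no PTR record was found; let the connection through.
    if { $ptr_name ne "" } {
        # Normalize: lowercase, and drop a trailing dot if the name is fully qualified.
        set ptr_name [string tolower [string trimright $ptr_name "."]]
        # True when the PTR name ends with any entry in the data group.
        if { [class match $ptr_name ends_with blacklist_host] } {
            log local0. "Rejecting [IP::client_addr] ($ptr_name)"
            reject
        }
    }
}
```

Because this performs a reverse DNS lookup for each new connection, it will likely add some latency, so it is worth testing under load before relying on it.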
Hi gnico ,
If you receive millions of hits and you have an Advanced WAF license on your F5 BIG-IP appliance, I think configuring Bot Defense would be a good workaround.
Please check this video:
https://www.youtube.com/watch?v=zSw4boZmNBA
Also monitor the traffic in the ASM event logs, and keep track of your CPU and memory usage.
- Kevin_Stewart (Employee)
The issue here is that amazonaws.com is not the Host value. The amazonaws.com bot is making a request to your site, so the Host value is still your HTTP Host. To find a crawler bot, you'd want to use the User-Agent header.
when HTTP_REQUEST {
    if { [matchclass [string tolower [HTTP::header User-Agent]] contains blacklist_host] } {
        reject
    }
}
But then it may also be useful to consider using a robots.txt file: https://developers.google.com/search/docs/crawling-indexing/robots/intro, which you could host directly from an iRule:
when HTTP_REQUEST {
    if { [HTTP::uri] == "/robots.txt" } {
        HTTP::respond 200 content [ifile get robots.txt]
    }
}