Irule to block request from amazonaws.com

gnico
Nimbostratus

Hello,

I have an iRule to block requests from bad amazonaws.com crawlers (millions of requests a day), but it doesn't work: its total executions count is 0.

Here is the code : 

 

 

when HTTP_REQUEST  {
    if { [matchclass [string tolower [HTTP::header Host]] contains blacklist_host] } {
        reject
    }
}

 

 

My data group blacklist_host contains an amazonaws.com entry.

If someone has a solution, I would appreciate it. Thank you.

4 REPLIES

Kevin_Stewart
F5 Employee

The issue here is that amazonaws.com is not the Host value. The amazonaws.com bot is making a request to your site, so the Host value is still your HTTP Host. To find a crawler bot, you'd want to use the User-Agent header.

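when HTTP_REQUEST {
    # class match replaces the deprecated matchclass command.
    # Compare the lowercased User-Agent against the data group entries.
    if { [class match [string tolower [HTTP::header User-Agent]] contains blacklist_host] } {
        reject
    }
}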

But then it may also be useful to consider using a robots.txt file: https://developers.google.com/search/docs/crawling-indexing/robots/intro, which you could host directly from an iRule:

when HTTP_REQUEST {
  if { [HTTP::uri] == "/robots.txt" } {
    HTTP::respond 200 content [ifile get robots.txt]
  }
}

gnico
Nimbostratus

Thank you for your reply.

What I want is a property like Apache's remote_host. In my Apache logs, I have millions of hits from remote_host ec2-xx-xxx-xxx-xx.eu-west-3.compute.amazonaws.com

They are malicious bots with an ordinary User-Agent such as Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78, coming from many different IP addresses.

So the only solution I have found is to block them by their remote host. I don't want to use Apache rules; I want to block them before they reach Apache.

Is there a way?

Thank you

Hi @gnico,

If you receive millions of hits and you have an Advanced WAF license on your F5 BIG-IP appliance, I think configuring Bot Defense on your F5 would be a good workaround.
Please check this video:
https://www.youtube.com/watch?v=zSw4boZmNBA

Also monitor the traffic in the ASM event logs, and keep track of your CPU and memory usage as well.

_______________________
Regards
Mohamed Kansoh

Apache remote_host is essentially a reverse DNS lookup. You can do this in an iRule:

1. Create a resolver object:

list net dns-resolver my-resolver 
net dns-resolver my-resolver {
    forward-zones {
        . {
            nameservers {
                10.1.20.1:domain { }
            }
        }
    }
    route-domain 0
}

2. Create an iRule that uses the resolver object.

Ref: https://clouddocs.f5.com/api/irules/RESOLVER__name_lookup.html

proc resolv_ptr_v4 { addr_v4 } {
    set ret [scan $addr_v4 {%d.%d.%d.%d} a b c d]
    if { $ret != 4 } {
        return
    }
    set ret [RESOLVER::name_lookup "/Common/my-resolver" "$d.$c.$b.$a.in-addr.arpa" PTR]
    set ret [lindex [DNSMSG::section $ret answer] 0]
    if { $ret eq "" } {
        return
    }
    return [lindex $ret end]
}
when CLIENT_ACCEPTED {
    set result [call resolv_ptr_v4 [IP::client_addr]]
    log local0. $result
    ## put your data group search here
}
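
One way to fill in the data group search is sketched below. It is a variant of the CLIENT_ACCEPTED block above and relies on the resolv_ptr_v4 proc from the same iRule. It assumes that blacklist_host is a string data group containing amazonaws.com, that the PTR answer may or may not carry a trailing dot, and that clients without a PTR record should be allowed through. It has not been tested on a BIG-IP, so treat it as a starting point rather than a finished rule.

```tcl
when CLIENT_ACCEPTED {
    # Reverse-resolve the client address using the proc defined above.
    set ptr_name [call resolv_ptr_v4 [IP::client_addr]]

    # No PTR record (or a non-IPv4 client): let the connection through.
    if { $ptr_name eq "" } {
        return
    }

    # Normalize: lowercase, and drop a trailing dot if the name has one.
    set ptr_name [string trimright [string tolower $ptr_name] "."]

    # Assumption: blacklist_host is a string data group with an
    # amazonaws.com entry, so any name ending in it is rejected.
    if { [class match -- $ptr_name ends_with blacklist_host] } {
        reject
    }
}
```

Using ends_with rather than contains keeps the match anchored to the end of the host name, so ec2-xx-xxx-xxx-xx.eu-west-3.compute.amazonaws.com matches while a name that merely embeds the string elsewhere does not. Because this runs a DNS lookup for each new connection, it is worth watching resolver load once it is in place.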