Forum Discussion

Jason_64846
Aug 14, 2012

Web Scraping detection and referrers

We are fairly new to the BIG-IP world and host several sites that are all highly susceptible to web scraping, so we have enabled ASM on these sites, and it has been working very well for us. One issue came up recently: our Google organic clicks dropped by about 40-50%. I quickly realized this was because our users were getting the ASM JS challenge page, which then posts a form and lets the user through without issue, but we lose the referrer data in the process. ASM does retain the original referrer in the form data, so to resolve this I created an iRule to fish it out and pass it through to our web server.

 

So is there already a built-in way to preserve this, or is an iRule currently the only way to go? If there is not, I would like to put in a feature request, as this seems like important data for a lot of other BIG-IP users. Also, retrieving this data from an iRule doesn't appear to be very clean: you have to look in the POST payload for a form value whose name ends in "_rf", which could conflict with other form values if you aren't careful.

 

We are running 3900s with 11.2.0 Build 2451.0 Hotfix HF1.

 

Thanks!

 

  • Hi Jason,

     

     

    Can you open a case with F5 Support to raise this issue? You can ask Support to refer to BZ366999 and C924687 to expedite things.

     

     

    Aaron
  • Here's the iRule I put together, in case anyone else needs a sample or has suggestions for improving it. I am checking the method and the content length twice because I wasn't sure whether the HTTP_REQUEST_DATA event could get fired by something else. It appears to be working for us so far.

    when HTTP_REQUEST {
        if { [HTTP::method] eq "POST" } {
            # Trigger collection for up to 1MB of data
            if { [HTTP::header "Content-Length"] ne "" && [HTTP::header "Content-Length"] <= 1048576 } {
                set content_length [HTTP::header "Content-Length"]
            } else {
                set content_length 1048576
            }
            # Check if $content_length is not set to 0
            if { $content_length > 0 } {
                HTTP::collect $content_length
            }
        }
    }
    when HTTP_REQUEST_DATA {
        if { [string tolower [HTTP::method]] eq "post" } {
            if { [HTTP::header exists "Content-Length"] } {
                log local0.info [HTTP::payload]
                foreach p [split [HTTP::payload] &] {
                    set name [getfield $p = 1]
                    if { $name ends_with "_rf" } {
                        set value [getfield $p = 2]
                        set value [URI::decode [getfield $p = 2]]
                        log local0.info $name
                        log local0.info "Retaining Referrer: $value"
                        HTTP::header replace Referer $value
                    }
                }
            }
        }
    }
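
    A possible tightening of the iRule above, offered only as a sketch and not as an F5-recommended fix. It tries to lower the chance of matching an unrelated "_rf" form field by collecting only form-encoded POSTs, accepting only values that look like absolute URLs, and stopping at the first match. The assumptions here are mine, not confirmed in this thread: that the ASM challenge form is posted as application/x-www-form-urlencoded, that the stored referrer is always an absolute http(s) URL, and that no other iRule on the virtual server calls HTTP::collect. Unlike the original, it skips bodies larger than 1MB instead of collecting the first 1MB.

    ```tcl
    when HTTP_REQUEST {
        # Assumption: the challenge form arrives as a URL-encoded POST.
        if { [HTTP::method] eq "POST" &&
             [string tolower [HTTP::header "Content-Type"]] starts_with "application/x-www-form-urlencoded" } {
            set len [HTTP::header "Content-Length"]
            # Collect only when Content-Length is a positive integer of at most 1MB.
            if { [string is integer -strict $len] && $len > 0 && $len <= 1048576 } {
                HTTP::collect $len
            }
        }
    }
    when HTTP_REQUEST_DATA {
        foreach p [split [HTTP::payload] &] {
            set name [getfield $p "=" 1]
            if { $name ends_with "_rf" } {
                set value [URI::decode [getfield $p "=" 2]]
                # Assumption: a genuine referrer is an absolute http(s) URL.
                if { $value starts_with "http://" || $value starts_with "https://" } {
                    log local0.info "Retaining Referrer: $value"
                    HTTP::header replace Referer $value
                    # Stop at the first matching field.
                    break
                }
            }
        }
    }
    ```

    Since the method and Content-Length are already checked before HTTP::collect is called, this version does not repeat those checks in HTTP_REQUEST_DATA. That only holds under the assumption above that nothing else on the virtual server triggers collection.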
  • Hi guys, was this ever implemented on BIG-IP, or is an iRule still needed?

     

      nathe
      No probs, glad I could help. I didn't realise this was an issue, to be honest, so I've learned something. Thanks.