Forum Discussion

Paul_Dawson_208
Apr 20, 2006

prevention of web scraping - plausible?

Hello,

Given the flexibility of iRules, would it be possible to prevent web/screen scraping?

Perhaps by defining certain criteria or behavior to look for and then applying a rate shaping class.

Being able to deal with this ever-increasing industry problem would be a fantastic selling point. As far as I am aware, nothing can prevent it effectively at the moment.

The difficulty I envisage is defining traffic behavior over time and basing the decision on heuristic data from previous connections. Another is identifying illegitimate traffic that comes from proxies.

3 Replies

  • Tom_Spector_50
    Historic F5 Account
    Hi,

    I would like to point out the difference between two kinds of scraping:

    - Scraping with malicious intent, such as scanners or scripts intended to detect or attack various parts of the site

    - Scraping by innocent users, such as scripts that consolidate and enhance the capabilities of applications (for example, scripts that consolidate information from different bank accounts or that monitor sites for specific information)

    Both kinds share common traffic characteristics with regard to the frequency of requests from a given source (provided the source can be distinguished).

    The iRules language is powerful enough to implement such logic, for example by using cookies to distinguish sources and by counting requests per source over a given period of time.
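
    To make the cookie-plus-counter idea concrete, here is a minimal sketch of the logic in plain Python rather than iRule code, so nothing in it is an F5 API. The names `RequestRateTracker`, `source_id`, and the `client_id` cookie are hypothetical, and the limit and window values are assumptions chosen only for illustration.

```python
from collections import defaultdict, deque


class RequestRateTracker:
    """Counts requests per source inside a sliding time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # source id -> timestamps (in seconds) of its recent requests
        self._hits = defaultdict(deque)

    def allow(self, source, now):
        """Record one request at time `now`; return False once over the limit."""
        hits = self._hits[source]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        # Rejected requests are recorded too, so a source that keeps
        # hammering the site stays over the limit.
        hits.append(now)
        return len(hits) <= self.max_requests


def source_id(cookies, client_ip):
    """Prefer a per-client cookie (hypothetical name); fall back to the IP."""
    return cookies.get("client_id") or client_ip


tracker = RequestRateTracker(max_requests=3, window_seconds=10.0)
who = source_id({"client_id": "abc"}, "10.0.0.1")
for t in (0.0, 1.0, 2.0, 3.0):
    print(t, tracker.allow(who, t))
```

    A source for which `allow` returns False could be handed to a rate shaping class rather than blocked outright. The cookie only helps with clients that return it; a scraper that discards cookies falls back to being counted by IP address, which is where the proxy problem raised in the question comes back in.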

     

    With regard to protecting applications against the malicious side of this issue, I would recommend using the F5 application security solution.

  • Colin_Walker_12
    Historic F5 Account
    There is actually some iRule code that helps with this sort of thing already written up in a Tech Tip (linked from the original post).

    That particular code was designed to deal with phishing attempts, but it should demonstrate the kinds of things that can be done to help deal with scraping attempts.

    You could also modify it to include a requests-per-time check, if you so desired.

    HTH,

    Colin
  • Ido_Breger_3805
    Historic F5 Account
    BIG-IP ASM 10.1 has a unique advanced feature that detects and blocks web scraping. I highly recommend you take a look at it.