Forum Discussion
Web scraping tuning issue
Hi, I have questions about how to apply web scraping detection. I read the great article "https://devcentral.f5.com/articles/more-web-scraping-bot-detection" written by John Wagnon and tried to implement this feature in my ASM policy. My homepage generates 125 requests on the first access to the site.
As I understand it, the "Grace Interval" should be greater than my number of initial requests. During that interval ASM tests whether the client is a robot (this is where my problem is) and, if it is, penalizes the following requests for the count configured in "Unsafe Interval". If no robot is detected, ASM allows navigation without checking the next N requests configured in "Safe Interval", and then returns to the validation flow.
My problem is that ASM performs a POST to test interactivity, which makes me lose navigation information in Google Analytics.
So my questions are: Do the counters use the request source IP or the trusted XFF? Are the counters kept per client connection or globally? What are the ideal values for a site with the characteristics of my homepage? What is the ideal number for the "Safe Interval" value? I think "2000" is too much for my case; am I wrong? Can anyone help me?
Thank you very much!
1 Reply
- Chris_Grant
Employee
It really varies by use. I don't think anyone can give you optimal settings. You might want to talk to your FSE or possibly your account manager and see if you can get some guidance on optimizing ASM for your environment. It is possible this may involve a PS engagement, but if you're just looking to have bot detection optimized the costs should be minimal. They will need to look at the traffic flow for your website to answer these questions.
Alternatively, you can experiment with different values and see which ones give you the result you need.
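For anyone experimenting with values, it can help to reason about the cycle on paper first. Below is a minimal Python sketch of the Grace / Safe / Unsafe cycle as the question describes it. It is not ASM code and not F5's implementation. The function name `simulate`, the `requests_to_pass` parameter (how many requests a real browser needs before it passes the check) and the values 150 and 100 are illustrative assumptions; only the 125-request homepage and the Safe Interval of 2000 come from the question. It also does not model whether counting is keyed by source IP, XFF, or session, since that remains unanswered in this thread.

```python
def simulate(total_requests, grace, safe, unsafe, is_human, requests_to_pass=1):
    """Return the phase ("grace", "safe" or "unsafe") each request falls in.

    Toy model of the cycle described in the question, not ASM's real logic:
    - grace:  the client is being tested; a human passes after
              `requests_to_pass` requests (hypothetical parameter), otherwise
              the client is treated as a bot once `grace` requests are used up.
    - safe:   the next `safe` requests are not checked, then back to grace.
    - unsafe: the next `unsafe` requests are penalized, then back to grace.
    """
    phases = []
    phase, seen = "grace", 0
    for _ in range(total_requests):
        phases.append(phase)
        seen += 1
        if phase == "grace":
            if is_human and seen >= requests_to_pass:
                phase, seen = "safe", 0
            elif seen >= grace:
                phase, seen = "unsafe", 0
        elif phase == "safe" and seen >= safe:
            phase, seen = "grace", 0
        elif phase == "unsafe" and seen >= unsafe:
            phase, seen = "grace", 0
    return phases


# Illustrative values: grace=150 and unsafe=100 are assumptions,
# safe=2000 is the value mentioned in the question.
human = simulate(4500, grace=150, safe=2000, unsafe=100, is_human=True)
bot = simulate(400, grace=150, safe=2000, unsafe=100, is_human=False)

print("validation requests seen by the human:", human.count("grace"))
print("penalized requests seen by the bot:", bot.count("unsafe"))
```

Under this model, with 125 requests per page view and a Safe Interval of 2000, a human visitor returns to validation roughly every 2000 / 125 = 16 page views. Lowering the Safe Interval makes validation (and the POST that affects Google Analytics) more frequent, so it is worth comparing a few values this way before changing the policy.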