Forum Discussion
Rate limiting Search Spiders
Mike_62629 · Jul 16, 2008 · Nimbostratus
We're currently having problems with web spiders hammering our web servers, tying up the available sessions in our application and consuming a large share of our bandwidth. We're interested i...
hooleylist · Jul 16, 2008 · Cirrostratus
You can still use the reject command to send a TCP reset (even from an HTTP_ event). I haven't delved into search engine optimization, but technically the appropriate thing would be to send back a 503 response.
Google seems to handle the 503 as you'd hope:
http://googlewebmastercentral.blogspot.com/2006/08/all-about-googlebot.html
If my site is down for maintenance, how can I tell Googlebot to come back later rather than to index the "down for maintenance" page?
You should configure your server to return a status of 503 (Service Unavailable) rather than 200 (successful). That lets Googlebot know to try the pages again later.
What should I do if Googlebot is crawling my site too much?
You can contact us -- we'll work with you to make sure we don't overwhelm your server's bandwidth. We're experimenting with a feature in our webmaster tools for you to provide input on your crawl rate, and have gotten great feedback so far, so we hope to offer it to everyone soon.
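For reference, here is a minimal iRule sketch of that approach. Treat it as an illustration under assumptions: the User-Agent substrings, response body, and Retry-After value are placeholders, not a tested configuration.

when HTTP_REQUEST {
    # Normalize the User-Agent header for case-insensitive matching
    set ua [string tolower [HTTP::header "User-Agent"]]
    # Illustrative spider patterns (adjust for the crawlers you actually see)
    if { ($ua contains "googlebot") || ($ua contains "slurp") || ($ua contains "msnbot") } {
        # A 503 plus Retry-After tells well-behaved crawlers to back off and retry later
        HTTP::respond 503 content "Service temporarily unavailable" "Retry-After" "3600"
        # Or, as noted above, drop the connection with a TCP reset instead:
        # reject
    }
}

In practice you'd probably gate this on an actual overload condition (connection counts, a datagroup of known spiders, etc.) rather than answering every crawler request with a 503.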
Aaron