Forum Discussion
Site Is Still Visible in Search Engines After Applying robots.txt
That's very interesting. So Google explains this:
While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results.
And they suggest adding a meta tag to each page to completely block indexing:
https://support.google.com/webmasters/answer/93710
I think the easiest thing would be to add the following meta tag to the head section of each page in the application:
    <meta name="robots" content="noindex">
If you absolutely had to do it with an iRule, it might look something like this:
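when HTTP_REQUEST {
    # Remove Accept-Encoding so the server replies uncompressed and the stream filter can match text
    HTTP::header remove Accept-Encoding
    # Keep the stream filter off by default; it is enabled per response below
    STREAM::disable
}
when HTTP_RESPONSE {
    # Only rewrite HTML responses
    if { [HTTP::header value Content-Type] starts_with "text/html" } {
        # Insert the robots meta tag just before the closing head tag
        STREAM::expression "@</head>@<meta name=\"robots\" content=\"noindex\">\r\n</head>@"
        STREAM::enable
    }
}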
Apply an empty stream profile to the VIP so the STREAM commands are available; the iRule supplies the expression at run time. There are two additional concerns:
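- Not all crawlers honor this tag
- Google will have to crawl your site again to see the tag and remove its listing, so the pages must not remain blocked in robots.txt, or the crawler will never fetch them and find the tag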
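If rewriting the HTML turns out to be fragile, another option (not part of the suggestion above) is to send the same noindex directive as an X-Robots-Tag response header, which Google documents alongside the meta tag. A minimal sketch, assuming every response from this virtual server should stay out of the index:

```tcl
when HTTP_RESPONSE {
    # Assumption: all content served by this virtual server should be excluded from indexing.
    # "replace" overwrites an existing X-Robots-Tag header, or inserts one if it is absent.
    HTTP::header replace X-Robots-Tag "noindex"
}
```

This variant needs no stream profile and also covers non-HTML responses such as PDFs, but the same two concerns apply: not every crawler honors the header, and Google has to fetch the page again to see it.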