Forum Discussion

Nimbostratus

Jan 31, 2014

Site Is Still Visible in Internet Search Engines After Applying robots.txt

Hi All, I applied the script below to our virtual server profile: when HTTP_REQUEST { ...

Employee

Jan 31, 2014

That's very interesting. So Google explains this:

While Google won't crawl or index the content of pages blocked by robots.txt, we may still index the URLs if we find them on other pages on the web. As a result, the URL of the page and, potentially, other publicly available information such as anchor text in links to the site, or the title from the Open Directory Project (www.dmoz.org), can appear in Google search results.

And they suggest adding a meta tag to each page to completely block indexing:

https://support.google.com/webmasters/answer/93710

I think the easiest thing would be to add the robots meta tag (<meta name="robots" content="noindex">) to each page in the application, but if you absolutely had to do it with an iRule, it might look something like this:

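when HTTP_REQUEST {
    # Remove Accept-Encoding so the server replies uncompressed;
    # the stream filter cannot match text inside a compressed body
    HTTP::header remove Accept-Encoding
    # Keep the stream filter off for the request side
    STREAM::disable
}
when HTTP_RESPONSE {
    # Replace the closing head tag with the robots meta tag plus the closing head tag
    STREAM::expression "@</head>@<meta name=\"robots\" content=\"noindex\">\r\n</head>@"
    STREAM::enable
}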
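
A slightly more defensive variation would only enable the stream filter for HTML responses, so images, scripts and other content pass through untouched. This is an untested sketch: the Content-Type check is my own assumption about what the application returns, not something from the original post.

```tcl
when HTTP_REQUEST {
    # Ask the server for an uncompressed response so the body can be matched
    HTTP::header remove Accept-Encoding
    # Stream filter stays off unless the response turns out to be HTML
    STREAM::disable
}
when HTTP_RESPONSE {
    # Assumption: HTML pages are served with a Content-Type containing "text/html"
    if { [HTTP::header value Content-Type] contains "text/html" } {
        # Insert the robots meta tag just before the closing head tag
        STREAM::expression "@</head>@<meta name=\"robots\" content=\"noindex\">\r\n</head>@"
        STREAM::enable
    }
}
```

Either version needs the same profile setup described next.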

Apply an empty STREAM profile to the VIP. There are two additional concerns: