Forum Discussion

Parveez_70209
Oct 12, 2013

robots metatag and nofollow attribute in Irule


 

How to use the robots meta tag and the nofollow attribute in an iRule

 

In the last topic, we discussed how to hide URLs (virtual servers) that are exposed to the Internet from search engines using robots.txt.

 

To keep robots out of an entire website or whole directories, we used a robots.txt file. Assuming we want to block all crawlers, we served the file contents from the following iRule, which we tested:

 

    when HTTP_REQUEST {
        if { [string tolower [HTTP::uri]] equals "/robots.txt" } {
            HTTP::respond 200 content "User-agent: *\nDisallow: /"
        }
    }

 

If we want to keep robots away from a single page, we use the robots meta tag, correct? And if we want to stop the spidering of a single link, we use the link's "nofollow" attribute, correct?

 

So how can we apply the robots meta tag and the nofollow attribute from an iRule?

 

1 Reply

  • You can do this in at least two ways:

    • You can specify the blocked pages/URIs in the robots.txt file. Example:

      when HTTP_REQUEST {     
          if { [string tolower [HTTP::uri]] equals "/robots.txt" } { 
              HTTP::respond 200 content "User-agent: *\nDisallow: /cgi-bin/\nDisallow: /tmp/"             
          }         
      }
      

    Ref: http://www.robotstxt.org/robotstxt.html
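
    If the list of blocked paths changes often, the robots.txt body could be built from a data group instead of being hard-coded. The following is only a rough sketch: "robots_disallow_dg" is a hypothetical string data group whose keys are the path prefixes to disallow, and the explicit Content-Type header is an addition that is not part of the example above.

    ```tcl
    when HTTP_REQUEST {
        # HTTP::path ignores any query string, so "/robots.txt?x=1" also matches
        if { [string tolower [HTTP::path]] equals "/robots.txt" } {
            set body "User-agent: *\n"
            # robots_disallow_dg is a hypothetical data group; its keys are path prefixes
            foreach prefix [class names robots_disallow_dg] {
                append body "Disallow: $prefix\n"
            }
            HTTP::respond 200 content $body "Content-Type" "text/plain"
        }
    }
    ```

    With keys "/cgi-bin/" and "/tmp/" in the data group, this should produce the same two Disallow lines as the hard-coded example.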

    • You can insert a meta tag into a page. Probably the easiest way to do that would be with a STREAM profile. Something like the following:

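      when HTTP_REQUEST {
          # Disable the stream filter by default; it is re-enabled per response below
          STREAM::disable
          # Ask the server for an uncompressed response so the stream filter can match text
          HTTP::header remove "Accept-Encoding"
          # Clear any flag left over from an earlier request on the same connection
          unset -nocomplain nofollow
          if { [class match [string tolower [HTTP::uri]] starts_with my_nofollow_dg] } {
              set nofollow 1
          }
      }
      when HTTP_RESPONSE {
          if { ( [info exists nofollow] ) and ( [HTTP::header Content-Type] contains "text" ) } {
              unset nofollow
              # Reconstructed expression: the content value is an assumption, adjust as needed
              STREAM::expression {@</head>@<meta name="robots" content="noindex, nofollow" /></head>@}
              STREAM::enable
          }
      }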
      

    where "my_nofollow_dg" is a string-based data group containing the no-follow URIs. Example:

        "/cgi-bin/" := 1
        "/tmp/" := 1
    

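    On each HTTP request, if the requested URI starts with a value in the data group, a temporary local variable ("nofollow") is set. On the HTTP response, if that variable exists and the content type is text, the STREAM profile replaces the closing </head> tag with a new robots meta tag followed by the closing </head> tag. A stream profile must be assigned to the virtual server for the STREAM commands to work.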
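
    For the link-level "nofollow" attribute, the same STREAM approach could in principle rewrite a specific anchor tag in the response body. This is an untested sketch under several assumptions: "/tmp/report.html" is a hypothetical link target, the page writes the link exactly as `<a href="/tmp/report.html"`, and a stream profile is assigned to the virtual server.

    ```tcl
    when HTTP_REQUEST {
        STREAM::disable
        # Request an uncompressed response so the stream filter can match text
        HTTP::header remove "Accept-Encoding"
    }
    when HTTP_RESPONSE {
        if { [HTTP::header Content-Type] starts_with "text/html" } {
            # Hypothetical target: add rel="nofollow" to links pointing at /tmp/report.html
            STREAM::expression {@<a href="/tmp/report.html"@<a rel="nofollow" href="/tmp/report.html"@}
            STREAM::enable
        }
    }
    ```

    Because the match is textual, links written with a different attribute order or quoting style would be missed, so fixing the markup at the origin server is usually the more reliable option when that is possible.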