Forum Discussion

Bob_10976
Feb 18, 2011

iRule to Block Google and other Search Engines

Hello all,

We would like to use an iRule to block Google and other search engines from crawling our sites, and I was hoping someone could point me in the right direction. In the past we would use robots.txt files, but with some major changes to our environment it seems it would be simpler to apply an iRule.

We did some research and found the code below, but I'm not sure how old it is or whether it will work with our LTM, version 10.2.x. The site is: http://blog.regisdonovan.org/2010/1...ic-by.html

 

 

 


when HTTP_REQUEST {
    # Drop the connection when the User-Agent contains the robot's name
    if { [HTTP::header "User-Agent"] contains "AnnoyingRobot" } {
        drop
        return
    }
}

 

 

 

Thanks,

 

Bob
  • Steve_Brown_882
    Historic F5 Account
    What you have will work for a single user agent, but you will probably want to block several. You could try something like this and add all the user agents you want to block. I found a good list of user agents here:

    http://www.user-agents.org/index.shtml?t_z

     

     

     

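    when HTTP_REQUEST {
        # Lowercase the User-Agent so the glob patterns match case-insensitively.
        # "*etc*" is a placeholder: replace it with the other agents to block.
        switch -glob [string tolower [HTTP::header "User-Agent"]] {
            "*googlebot*" -
            "*yahooseeker*" -
            "*etc*" { drop }
            default { return }
        }
    }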

     

  • You could also add the user-agent strings to a datagroup and then use the class (v10) or matchclass (v9) commands to look up the user-agent header value against the datagroup.
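
    As a minimal sketch of the v9 variant, assuming a string datagroup named bots_list (a hypothetical name, chosen with an underscore so that $::bots_list parses as a single variable) whose entries are lowercase user-agent substrings:

    ```tcl
    when HTTP_REQUEST {
        # v9 syntax: the datagroup is referenced as a global variable ($::name).
        # bots_list is an assumed datagroup name; it does not exist by default.
        if { [matchclass [string tolower [HTTP::header "User-Agent"]] contains $::bots_list] } {
            drop
        }
    }
    ```

    On v10 the class command takes the datagroup name directly, without the $:: prefix.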

     

     

    Aaron
  • Steve_Brown_882
    Historic F5 Account
    I agree with Aaron that a datagroup is another good way to do this, and it would let you keep a much larger text list that you simply upload and apply.

     

     

    Something like this would work.

     

     

    when HTTP_REQUEST {
        # Drop the request if the lowercased User-Agent contains any entry in the bots-list datagroup
        if { [class match [string tolower [HTTP::header User-Agent]] contains "bots-list"] } {
            drop
        } else {
            return
        }
    }

     

  • Thanks again for the input. I like the idea of the datagroup; I think it would be much easier to create and maintain. However, the question I have is where to store the list I create. Do I simply upload the txt file to the LTMs, and if so, would I need to put it in a specific directory?

     

     

    Thanks,

     

    Bob
  • Hi Bob,

    You can manually add the datagroup entries via the GUI under Local Traffic >> iRules >> Data Group List tab. Or you could reference an external file. The format for an external file in 10.0+ is:

    
    class namevalue {
      "name1" := "value",
      "name2" := "value",
    }
    

    Another option would be to use the bigip.conf format in a separate file and then merge the file into the bigip.conf using 'b merge file'.

    The format for an inbuilt datagroup in the bigip.conf is:

    
    class namevalue {
       {
          "name1" { "value" }
          "name2" { "value" }
       }
    }
    

    Lastly, you could use iControl to upload an external datagroup file and then use a Codeshare example from Joe to reload it:

    http://devcentral.f5.com/wiki/default.aspx/iControl/PingExternalClass.html

    Aaron