Forum Discussion

WK_98725
Nimbostratus
Sep 11, 2012

iRules and robots.txt question

Got a quick question for the F5 and iRules experts out there. I have been asked about "putting a robots.txt file on an LTM", and was wondering if that is possible, or even makes sense.

I understand that robots.txt is supposed to give web crawlers advice on which parts of the directory structure are off limits (advice they are free to ignore). I have looked at

https://devcentral.f5.com/wiki/iRules.Version_9_x_Robot_and_Request_Limiting_iRule.ashx

which restricts access to robots that respect robots.txt and puts limits on client requests.

However, the question is whether it is possible to put a robots.txt file on an F5, have the F5 parse it, and then restrict client request access accordingly.

I am not an F5/iRule expert (to say the least), so before I go out on a limb and say it can't be done, I'd like to get some expert opinions.

Thanks,
W.

  • I think you're really talking about two different things: 1) putting a robots.txt file on the BIG-IP for robots to consume, and 2) limiting robot access.

    2 is generally covered by the link you provided.

    1 is pretty straightforward:

    
    when HTTP_REQUEST {
        # Intercept requests for /robots.txt and answer them on the BIG-IP
        if { [string tolower [HTTP::uri]] equals "/robots.txt" } {
            HTTP::respond 200 content "content of robots.txt..."
        }
    }
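
    HTTP::respond builds and sends the response on the BIG-IP itself, so matching requests for /robots.txt are never forwarded to the pool members.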
    

  • Hi

    I am looking to do the same thing. Can someone elaborate on what "content of robots.txt..." is? Do we create a data group file or a string?

    I need to send a disallow response.

    Thanks,

    C
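
  • For what it's worth, a minimal sketch of what that content could look like for a disallow-all policy. The content argument to HTTP::respond is just an inline Tcl string, so a plain string is enough; a data group would only be needed if you want to manage the text outside the iRule:

    when HTTP_REQUEST {
        if { [string tolower [HTTP::uri]] equals "/robots.txt" } {
            # Serve a disallow-all robots.txt straight from the BIG-IP:
            # "User-agent: *" with "Disallow: /" asks all compliant
            # crawlers to stay out of the entire site.
            HTTP::respond 200 content "User-agent: *\r\nDisallow: /\r\n" "Content-Type" "text/plain"
        }
    }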