Spence_Faschin1
Mar 15, 2005

Rewrite a uri (not redirect)

I have a need to rewrite a uri, and make this look transparent to the end user. What I have come up with is this:

   
 when HTTP_REQUEST {  
   set loop 0  
   set max [llength $::shorturi]  
   while {$loop < $max}{  
     set tmpstr [lindex $::shorturi $loop]  
     if {[HTTP::uri] starts_with $tmpstr}{  
       set uri1 {findstr [http_uri] $tmpstr}  
       set uri2 {regexp -all -inline {$tmpstr::longuri}}  
       subst {[HTTP::uri] $uri2}  
     }  
   }  
 }  
 

I have two Data Groups - long_uris and short_uris:

short_uris:

someuri1

someuri2

long_uris:

rewrite/someuri1

rewrite/someuri2

The end of each long uri will always match the corresponding short uri.

I have not had a chance to test this yet - still building out the test environment, but it does not kick any errors on the LTM. I am just looking for some input - as I am not a developer (I'm a network guy).

So - to re-iterate - the long term goal is this:

user comes in to http://www.somesite.com/someuri

and the LTM rewrites the uri on the backside to:

http://www.somesite.com/rewrite/someuri (the rewrite portion is much much longer in the real data)

Thanks in advance

9 Replies

  • unRuleY_95363
    Historic F5 Account
    I think there are a number of things that can be improved in your example. But, let's not start there yet.

    First, let's gather more information about what you're trying to do.

    It sounds like you want to insert some path data before certain uris.

    So, the first question that comes to mind is whether the inserted rewrite text is different depending on the original uri?

    Second, I'm not quite sure how/why you are relating the two datagroups with each other. Perhaps you really just want one datagroup with two fields.

    For example, I might have a datagroup that looks like this:

     
     class my_uri_mappings { 
        "/index.html /some/place/for/html" 
        "/pretty.gif /some/place/for/images" 
        "/unknown.html /some/place/for/hackers" 
     } 
     

    I might then have a rule that looks up the uri in the datagroup and inserts the second portion of the matching entry. This example relies on the fact that when the findclass command is supplied with a separator (the 3rd argument), it returns the portion of the matching entry that follows the separator:

     
     when HTTP_REQUEST { 
        set rewrite [findclass [HTTP::uri] $::my_uri_mappings " "] 
        if { $rewrite ne "" } { 
           # Join the two parts directly; concat would put a space between them
           HTTP::uri "${rewrite}[HTTP::uri]" 
        } 
     } 
     

    The next example relies on the fact that the matchclass command returns the list index + 1 of a matching entry. Since it doesn't appear that you are actually rewriting the original portion of the uri, but only inserting a leading string, this approach could also be used:

     
     class my_uri_mappings { 
        "/some/place/for/html/index.html" 
        "/some/place/for/images/pretty.gif" 
        "/some/place/for/hackers/unknown.html" 
     } 
     

     
     when HTTP_REQUEST { 
        set rewrite [matchclass $::my_uri_mappings ends_with [HTTP::uri]] 
        if { $rewrite } { 
           HTTP::uri [lindex $::my_uri_mappings [expr {$rewrite - 1}]] 
        } 
     } 
     

  • Again, this just shows why I am not a programmer. I used the second example, and it works rather well in all respects except one: I don't want the client to see the new uri. I just want it to be passed on the back end. Is there any way we can do that?

    Thanks for the examples
  • I did a little more testing. It works great if the uri is equal to something in the data group. In my case, if the uri starts with something in the data group, I want to rewrite that portion of the uri. For example:

    http://www.somesite.com/uri/index.html

    would become

    http://www.somesite.com/rewrite/uri/index.html

    I always need to preserve and pass the "end" of the uri, if the beginning matches something in the data group.
  • Just an update - here is the final rule that I ended up going with:

    Assuming I have a data group called uris formatted like this:

    /uri1 /longuri/directoryone

    /uri2 /longuri/directorytwo

    /uri3 /longuri/someotherdirectory

       
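     when HTTP_REQUEST {  
        # Look up "/" plus the first path segment (e.g. "/uri1") in the data group.
        # The slash belongs inside the findclass argument; outside it, the test is always true.
        if { [getfield [findclass /[findstr [http_uri] "" 1 "/"] $::uris] " " 1] ne "" } {  
           HTTP::uri "[findclass /[findstr [http_uri] "" 1 "/"] $::uris " " ][HTTP::uri]"  
        }  
     }  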
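
    The rule above runs the same findclass lookup twice whenever a request matches. A minimal sketch of a single-lookup variant follows, assuming the same uris data group. The variable names segment and rewrite are illustrative, and pulling the first path segment out with getfield on HTTP::path is a substitution for the findstr call, not something tested in this thread:

    ```tcl
    when HTTP_REQUEST {
       # First path segment, e.g. "uri1" from "/uri1/index.html".
       # Field 1 is the empty string before the leading slash, so take field 2.
       set segment [getfield [HTTP::path] "/" 2]
       if { $segment ne "" } {
          # With a separator, findclass returns the part of the entry after it,
          # e.g. "/longuri/directoryone" for the entry "/uri1 /longuri/directoryone".
          set rewrite [findclass "/$segment" $::uris " "]
          if { $rewrite ne "" } {
             # Prepend the long path and keep the original uri intact
             HTTP::uri "${rewrite}[HTTP::uri]"
          }
       }
    }
    ```

    Whether saving the result in a variable is a net win depends on how often requests match, so it is worth measuring on your own traffic.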
     

  • For the non-coders on this forum, (ok, idiot ME!) can you explain why that is an optimized version of the previous example? Thanks.
  • So, setting a variable before a rule matches will not cause additional overhead when processing a rule? My thought was that I did not want to set a variable unless the traffic passed my if statement, and at that point I figured, why set a variable at all if I can just splat it with one line? I will try the latest example once my test environment is back online and let everyone know how it goes.

    Thanks again for all the pointers and help!
  • unRuleY_95363
    Historic F5 Account
    That is a very good point. You have obviously thought about this. Of course, it will all really depend on just how often you expect to match. If it does not match often, then you are completely correct. If it matches regularly, then you would likely want to save the result in a variable. Another factor to weigh is the number of elements in the class/datagroup.

    For those that are interested and paying attention, I'm now going to mention a YASF (yet another stealth feature):

    You can enable timing statistics in a rule which will allow you to see just how many cycles are spent evaluating a given rule event. The way you do this is with the "timing on" statement.

    An example that enables timing for all subsequent events in a rule is:

       
     rule my_fast_rule {   
        timing on   
        when HTTP_REQUEST {   
            # Do some stuff   
        }   
     }   
     

    An example of only timing a specific event is:

       
     rule my_slow_rule {   
        when HTTP_REQUEST timing on {   
            # Do some other stuff   
        }   
     }   
     

    This will then collect timing information each time the rule is evaluated and can be viewed with "b rule show all". You'll likely only want to look at the average and min numbers as max is often way, way out there due to the optimizations being performed on the first run of the rule. Additionally, enabling timing does have some overhead, though it should be negligible.

  • This rule will be used on a very busy e-comm type site. My understanding is that they peak at 52 million unique page clicks per day during the holiday season, with 20 million unique page clicks per day being the average.

    My understanding of HTTP_REQUEST is that it will be evaluated for every client header request, so pretty much every "hit" to these sites will use this rule. Is that correct?
  • unRuleY_95363
    Historic F5 Account
    Ok.

    Here's another way to look at that traffic:

    With the busiest traffic being 52 million page clicks per day (we will assume this actually refers to unique HTTP requests), that works out to an average of about 602 req/sec assuming around-the-clock clicking, about 1806 req/sec if concentrated into an 8 hour busy period, or about 14444 req/sec if all 52 million clicks were concentrated into 1 hour. All of these levels of traffic are well below the upper limits of v9 software on most of our platforms.
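
    For anyone who wants to redo that arithmetic with their own numbers, here is a small sketch in plain Tcl (run it with tclsh, not as an iRule). The variable names are illustrative. With 52 million requests it prints 601.9, 1805.6 and 14444.4 req/sec for the three windows:

    ```tcl
    # Average request rate if 52 million requests arrive within a given window.
    set requests 52000000
    foreach {label seconds} {{24 hours} 86400 {8 hours} 28800 {1 hour} 3600} {
        # double() forces floating-point division instead of integer truncation
        set rate [expr {double($requests) / $seconds}]
        puts [format "%-8s %.1f req/sec" $label $rate]
    }
    ```

    These are averages over each window, so real peaks within the window will be higher.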

    I'm not sure where we were going with this, but it doesn't sound like you've got a whole lot to be concerned about, short of writing a really, really crappy rule (which by no means have you done). The point of making an efficient rule is not always how much it costs the current connection, but how much it costs the entire system, which might be processing lots of different sites. In that case you obviously want to make it as efficient as possible.

    It sounds like you are the right person to make the decision, as you know the traffic patterns the best and also know the pros and cons of doing it either way. I was just trying to point out the subtle differences.

    To finish, your understanding of HTTP_REQUEST is correct. It is evaluated on every client request, whether it's the first request on a new connection or a subsequent request on an existing connection.

    On an off-topic side note: you could use "event HTTP_REQUEST disable" if you actually wanted to turn off this rule event for subsequent requests.
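
    A minimal sketch of where that command could sit, assuming the two-field my_uri_mappings data group from earlier in the thread. Only the placement of the event command is the point here; the lookup itself is illustrative:

    ```tcl
    when HTTP_REQUEST {
       # Handle the first request on this connection as usual
       set rewrite [findclass [HTTP::uri] $::my_uri_mappings " "]
       if { $rewrite ne "" } {
          HTTP::uri "${rewrite}[HTTP::uri]"
       }
       # Stop raising HTTP_REQUEST for later requests on this connection
       event HTTP_REQUEST disable
    }
    ```

    Keep in mind that once the event is disabled, later requests on the same keep-alive connection are no longer rewritten, so this only suits rules that need to act on the first request of a connection.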

    Good luck and let us know how things work out.