
nshelton85 (Altostratus)
Mar 09, 2020

Basic findstr question

So when using findstr, for instance set user [ findstr [HTTP::payload] "user=" 5 & ], I know that the 5 & means to return all data after "user=". If my user ID is 5555555555@domain.com, but has additional data after the domain.com, is there a way to tell the findstr command not to return the data after the domain? Is there a good resource that shows all of the characters/commands that can be used with findstr to return data in different ways? Sorry for the very basic questions, but I don't have a strong background in programming.

4 Replies

  • findstr is a simple string search function, and you can't really do anything complex with it.

    For more complex string extraction, you probably need a regular expression. See:

    iRules 101 - #10 - Regular Expressions

    However, be aware that using a regex is computationally expensive.

    If your URL is /test?user=5555555@my.domain.com&pass=.....

    user=(.*@([\w*\.]*))(\&|\#|$)

    With the above regex, the Group 1 match is 5555555@my.domain.com.

    You can be more selective about what you terminate on.

    To play with regex, go to regex101.com
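For anyone who wants to try the pattern outside regex101, here is a minimal sketch in Python (not an iRule, purely for illustration; an iRule would use Tcl's own regex support, and regex flavors differ in small ways). The helper name extract_user and the lang=en parameter in the sample URL are made up for this example.

```python
import re

# The pattern from the reply above, unchanged.
PATTERN = re.compile(r"user=(.*@([\w*\.]*))(\&|\#|$)")

def extract_user(text):
    """Return capture group 1 (the user ID) or None if nothing matches."""
    match = PATTERN.search(text)
    return match.group(1) if match else None

# Hypothetical request URI with one extra parameter after the user ID.
print(extract_user("/test?user=5555555@my.domain.com&lang=en"))
# prints: 5555555@my.domain.com
```

Group 1 stops at the end of the domain because the character class in group 2 cannot consume the "&" that follows it.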

  • Thanks, but this is all Greek to me, unfortunately. I checked the expression on the site you linked me to, and the first part makes sense. I don't entirely understand what the last part (\&|\#|$) is doing.

     

    /user=(.*@([\w*\.]*))(\&|\#|$)/gm

    user= matches the characters user= literally (case sensitive)

    1st Capturing Group (.*@([\w*\.]*))
      .* matches any character (except for line terminators)
        Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
      @ matches the character @ literally (case sensitive)

      2nd Capturing Group ([\w*\.]*)
        Match a single character present in the list below [\w*\.]*
          Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
          \w matches any word character (equal to [a-zA-Z0-9_])
          * matches the character * literally (case sensitive)
          \. matches the character . literally (case sensitive)

    3rd Capturing Group (\&|\#|$)
      1st Alternative \&
        \& matches the character & literally (case sensitive)
      2nd Alternative \#
        \# matches the character # literally (case sensitive)
      3rd Alternative $
        $ asserts position at the end of a line

    Global pattern flags
      g modifier: global. All matches (don't return after first match)
      m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)

     

    • gersbah (Cirrostratus)

      I think that last part is meant to "anchor" the expression. The idea would be to end the expression with either "&" (which means the next parameter follows), "#" (a URL fragment) or $ (end of the line - no other parameters).

      However, practically I don't believe it does a whole lot, because \w in the second capture group already limits the match to only alpha-numeric (which "&" isn't). It also doesn't really prevent overmatching of the first ".*" which would come into play if you have another "@" anywhere else in the query string after the user parameter.

      There's probably a million ways to do this and a lot of it depends on how robust you want or need to make it. What if there are multiple user parameters? What about a user parameter with no "@" or no value at all? Can there be special characters in the user name? If the domain is always the same you could even match that literally.

      I would probably start with something like this:

      user=([\w]+@[\w.]+)

      But as I said, a million ways...
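To make the overmatching point concrete, here is a small Python sketch (Python rather than an iRule, for illustration only) that runs both patterns from this thread over the same input. The sample query strings, the names ORIGINAL and SIMPLER, and the helper first_group are all invented for the example.

```python
import re

ORIGINAL = re.compile(r"user=(.*@([\w*\.]*))(\&|\#|$)")  # pattern from the first reply
SIMPLER = re.compile(r"user=([\w]+@[\w.]+)")             # pattern suggested just above

def first_group(pattern, text):
    """Return capture group 1 of the first match, or None if there is no match."""
    match = pattern.search(text)
    return match.group(1) if match else None

# Hypothetical query string with a second "@" after the user parameter.
two_at_signs = "user=5555555555@domain.com&email=other@example.org&x=1"

print(first_group(ORIGINAL, two_at_signs))
# prints: 5555555555@domain.com&email=other@example.org
print(first_group(SIMPLER, two_at_signs))
# prints: 5555555555@domain.com

# Hypothetical input where the trailing data is not "&", "#" or end of line.
trailing_slash = "user=5555555555@domain.com/extra"

print(first_group(ORIGINAL, trailing_slash))  # prints: None
print(first_group(SIMPLER, trailing_slash))   # prints: 5555555555@domain.com
```

The greedy ".*" backtracks only as far as the last "@", which is why the first pattern swallows the second parameter. The last two lines show one visible effect of the trailing group: when the domain is followed by anything other than "&", "#" or the end of the line, the first pattern matches nothing at all, while the simpler one still returns the user ID.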

      • Simon_Blakely (Employee)

        As you say, a million ways to die in the regex engine ...

         

        To be honest, I was trying to compensate for the requirement from the OP:

         

        > If my user ID is 5555555555@domain.com, but has additional data after the domain.com, is there a way to tell the findstr command not to return the data after the domain?

         

        I was guessing what data might come after the domain if there wasn't another &.

        In a correctly specified URI, once you are past the path and into the query parameters, the only valid element separators are the query parameter separators (& and ;), the fragment specifier (#), or the end of the line (and I missed the ;).

        Special characters should be %-encoded in the URL at this stage, but I haven't accounted for %-encoding in the domain portion, because there probably shouldn't be any (the only allowed non-alphanumeric character in an FQDN label is a dash, -, which probably needs to be included as well).

         

        It gets complicated, real quick ...
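One possible way to fold those last points together (";" as a separator, "-" allowed in the domain, no overmatching across a later "@") is sketched below in Python. This pattern is an assumption for illustration, not something posted in the thread, and it is only one of the "million ways". It does not handle %-encoded input, and the helper name extract_user and the sample strings are made up.

```python
import re

# Hypothetical pattern, not taken from the thread:
#   [^@&;#]+      local part: one or more characters that are not "@" or a separator
#   [\w.-]+       domain: word characters, dots and dashes
#   (?=[&;#]|$)   lookahead: a separator or the end of the string must follow
# re.ASCII keeps \w equal to [a-zA-Z0-9_], as in the regex101 explanation.
PATTERN = re.compile(r"user=([^@&;#]+@[\w.-]+)(?=[&;#]|$)", re.ASCII)

def extract_user(query):
    """Return the user ID from a query string, or None if nothing matches."""
    match = PATTERN.search(query)
    return match.group(1) if match else None

print(extract_user("user=5555555555@my-domain.com;lang=en"))
# prints: 5555555555@my-domain.com
print(extract_user("user=5555555555@domain.com&email=other@example.org"))
# prints: 5555555555@domain.com
```

Using a negated character class for the part before the "@" instead of ".*" is what stops the match from running into a later parameter, and the lookahead checks the separator without making it part of the match.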