Forum Discussion
Basic findstr question
So when using findstr, for instance set user [ findstr [HTTP::payload] "user=" 5 & ], I know that the 5 & means to return all data after "user=". If my user ID is 5555555555@domain.com, but has additional data after the domain.com, is there a way to tell the findstr command not to return the data after the domain? Is there a good resource that shows all of the characters/commands that can be used with findstr to return data in different ways? Sorry for the very basic questions, but I don't have a strong background in programming.
- Simon_Blakely (Employee)
findstr is a simple string search function, and you can't really do anything complex with it.
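For reference, my understanding of the iRules documentation is that findstr takes an optional skip count and an optional terminator (either a length or a string). Read that way, the 5 in the question skips over "user=" and the & is the terminator, so the result should already stop at the next "&". The Python sketch below only emulates that behavior so it can be tried outside a BIG-IP. findstr_like is a hypothetical helper, not an F5 API, and the sample payload is made up.

```python
def findstr_like(text, search, skip=0, terminator=None):
    """Rough emulation of iRules findstr (hypothetical helper, not an F5 API)."""
    start = text.find(search)
    if start == -1:
        return ""                     # search string not found
    rest = text[start + skip:]        # skip is counted from the start of the match
    if terminator is None:
        return rest                   # no terminator: everything to the end
    if isinstance(terminator, int):
        return rest[:terminator]      # numeric terminator: fixed length
    end = rest.find(terminator)       # string terminator: stop just before it
    return rest if end == -1 else rest[:end]


# Made-up payload for illustration only.
payload = "user=5555555555@domain.com&pass=example"
user = findstr_like(payload, "user=", 5, "&")
print(user)  # 5555555555@domain.com
```

If the extra data after the domain is not introduced by "&", a different terminator string (or a regex) would be needed.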
For more complex string extraction, you probably need a regular expression. See:
iRules 101 - #10 - Regular Expressions
However, be aware that a regex is computationally expensive compared with a simple string search.
If your URL is /test?user=5555555@my.domain.com&pass=..... then you could use:
user=(.*@([\w*\.]*))(\&|\#|$)
With that regex, capture group 1 is 5555555@my.domain.com.
You can be more selective about what you terminate on.
To play with regex, go to regex101.com
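The same check can be run locally. This is a minimal sketch using Python's re module purely as a test bench (an iRule would use Tcl's regexp instead, whose syntax may differ in places). The pattern is copied verbatim from above, and the value after pass= is made up.

```python
import re

# Pattern copied verbatim from the post above.
PATTERN = re.compile(r"user=(.*@([\w*\.]*))(\&|\#|$)")

# The value after "pass=" is invented for this example.
uri = "/test?user=5555555@my.domain.com&pass=example"

m = PATTERN.search(uri)
group1 = m.group(1) if m else None
print(group1)  # 5555555@my.domain.com
```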
- nshelton85 (Altostratus)
Thanks, but this is all Greek to me, unfortunately. I checked the expression on the site you linked, and the first part makes sense. I don't entirely understand what the last part, (\&|\#|$), is doing.
/user=(.*@([\w*\.]*))(\&|\#|$)/gm
user= matches the characters user= literally (case sensitive)
1st Capturing Group (.*@([\w*\.]*))
  .* matches any character (except for line terminators)
    "*" quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  @ matches the character @ literally (case sensitive)
  2nd Capturing Group ([\w*\.]*)
    Match a single character present in the list below [\w*\.]*
      "*" quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
      \w matches any word character (equal to [a-zA-Z0-9_])
      "*" matches the character * literally (case sensitive)
      \. matches the character . literally (case sensitive)
3rd Capturing Group (\&|\#|$)
  1st Alternative \&
    \& matches the character & literally (case sensitive)
  2nd Alternative \#
    \# matches the character # literally (case sensitive)
  3rd Alternative $
    $ asserts position at the end of a line
Global pattern flags
  g modifier: global. All matches (don't return after first match)
  m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
- gersbah (Cirrostratus)
I think that last part is meant to "anchor" the expression. The idea would be to end the expression with either "&" (which means the next parameter follows), "#" (a URL fragment) or $ (end of the line - no other parameters).
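To see the three alternatives in action, here is a small sketch in Python (used only as a regex test bench; the sample strings are made up). It prints what the third group captures for each ending, and None when the pattern does not match at all.

```python
import re

# Pattern copied verbatim from the earlier post.
PATTERN = re.compile(r"user=(.*@([\w*\.]*))(\&|\#|$)")

# Made-up sample strings, one per possible ending plus one that fits none.
samples = [
    "user=a@b.com&pass=x",   # another parameter follows
    "user=a@b.com#top",      # a fragment follows
    "user=a@b.com",          # end of the string
    "user=a@b.com/x",        # none of the three alternatives applies
]

results = {}
for s in samples:
    m = PATTERN.search(s)
    results[s] = m.group(3) if m else None
    print(repr(s), "->", repr(results[s]))
```

The first three samples give "&", "#" and an empty string for group 3. The last one gives None, because nothing after the domain satisfies the third group.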
However, in practice I don't believe it does a whole lot, because the character class in the second capture group already limits the match to word characters, dots and literal asterisks (which "&" isn't). It also doesn't prevent the first ".*" from overmatching, which would come into play if there is another "@" anywhere in the query string after the user parameter.
There's probably a million ways to do this and a lot of it depends on how robust you want or need to make it. What if there are multiple user parameters? What about a user parameter with no "@" or no value at all? Can there be special characters in the user name? If the domain is always the same you could even match that literally.
I would probably start with something like this:
user=([\w]+@[\w.]+)
But as I said, a million ways...
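To make the overmatching point concrete, here is a small comparison sketched in Python (again only as a regex test bench). The query string is made up and deliberately contains a second "@" in a later parameter.

```python
import re

# Both patterns are copied verbatim from the posts above.
ORIGINAL = re.compile(r"user=(.*@([\w*\.]*))(\&|\#|$)")
SIMPLER = re.compile(r"user=([\w]+@[\w.]+)")

# Made-up query string with a second "@" after the user parameter.
query = "user=5555555555@domain.com&note=a@b"

original_match = ORIGINAL.search(query).group(1)
simpler_match = SIMPLER.search(query).group(1)

print(original_match)  # 5555555555@domain.com&note=a@b
print(simpler_match)   # 5555555555@domain.com
```

The greedy ".*" in the original pattern runs on to the last "@" in the string, while the simpler pattern stops at the first character that is not a word character or a dot.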
- Simon_Blakely (Employee)
As you say, a million ways to die in the regex engine ...
To be honest, I was trying to compensate for the requirement from the OP:
> If my user ID is 5555555555@domain.com, but has additional data after the domain.com, is there a way to tell the findstr command not to return the data after the domain?
I was guessing what data might follow the domain if there wasn't another &.
In a correctly specified URI, once you are past the path and into the query parameters, the only valid element separators are the query parameter separators (& and ;), the fragment specifier (#), or the end of the line (and I missed the ;).
Special characters should be %-encoded in the URL at this stage, but I haven't accounted for %-encoding in the domain portion, because there probably shouldn't be any (the only non-alphanumeric character allowed within an FQDN label is a dash, -, which probably needs to be included as well).
It gets complicated, real quick ...
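As one possible way to fold those points together, the sketch below (Python, used only to test the regex) accepts ; as a separator and allows a dash in the domain. The pattern and the extract_user helper are my own illustration, not something from the posts above, and the sample inputs are made up.

```python
import re

# Illustrative pattern (an assumption, not from the thread):
#   [^&;#@]+    local part: anything up to a separator or "@"
#   [\w.-]+     domain: word characters, dots and dashes
#   (?:[&;#]|$) must be followed by a separator or the end of the string
REFINED = re.compile(r"user=([^&;#@]+@[\w.-]+)(?:[&;#]|$)", re.ASCII)


def extract_user(query):
    """Return the user ID from a query string, or None if nothing matches."""
    m = REFINED.search(query)
    return m.group(1) if m else None


# Made-up inputs covering each separator and a value with no "@".
print(extract_user("user=5555555555@my-domain.com;pass=x"))
print(extract_user("user=a%2Bb@example.com#frag"))
print(extract_user("user=a@b.com"))
print(extract_user("user=nobody&pass=x"))
```

A value without an "@" returns None here, so the caller still has to decide how to handle that case.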