string first replacement procedure for ID999881
Code is community submitted, community supported, and recognized as ‘Use At Your Own Risk’.
Short Description
Replacement for the string first command to handle Unicode characters
Problem solved by this Code Snippet
Bug ID 999881 causes the string first command to return incorrect results when the string being searched (for example, the HTTP payload) contains Unicode characters such as à.
string first is a standard Tcl string command that finds the index of a substring inside a larger string, optionally starting the search at a given index. See https://www.tcl.tk/man/tcl8.4/TclCmd/string.html#M7
string first needleString haystackString ?startIndex?
Search haystackString for a sequence of characters that exactly match the characters in needleString. If found, return the index of the first character in the first such match within haystackString. If not found, return -1. If startIndex is specified (in any of the forms accepted by the index method), then the search is constrained to start with the character in haystackString specified by the index. For example,
string first a 0a23456789abcdef 5
will return 10, but
string first a 0123456789abcdef 11
will return -1.
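The documented behaviour can be checked in a plain tclsh (not on a BIG-IP). This is a minimal sketch; the haystack variable name is only for illustration, and the end-5 line assumes the standard Tcl index forms mentioned above.

```tcl
set haystack "0a23456789abcdef"

# No start index: the first "a" is at index 1.
puts [string first a $haystack]

# Start at index 5: the "a" at index 1 is skipped, the next one is at index 10.
puts [string first a $haystack 5]

# end-5 is index 10 in this 16-character string, so the result is again 10.
puts [string first a $haystack end-5]

# Nothing matches at or after index 11, so the result is -1.
puts [string first a $haystack 11]
```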
Example in iRules:
when HTTP_REQUEST_DATA {
    set tagstart [string first "<" [HTTP::payload] ]
    set tagend [string first "/>" [HTTP::payload] $tagstart]
}
The above iRule finds the start of an XML tag and the end of the tag. Note that $tagstart is passed as the start index in the second command, so that the search for the end of the tag begins at the start of the tag.
In this case, with a payload of <xml><header/><body></body></xml>, $tagstart would be 0 and $tagend would be 12.
On a system affected by the bug, the second command does not work as expected when Unicode characters are present: it returns a higher number, such as 28.
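For reference, the expected values can be reproduced in a plain tclsh, where string first behaves correctly. This is only a sketch: string literals stand in for [HTTP::payload], and the second payload (with its name attribute) is a hypothetical example that is not taken from the bug report.

```tcl
# ASCII payload from the example above.
set payload {<xml><header/><body></body></xml>}
set tagstart [string first "<" $payload]
set tagend [string first "/>" $payload $tagstart]
puts "tagstart=$tagstart tagend=$tagend"   ;# tagstart=0 tagend=12

# Hypothetical payload with a non-ASCII character (U+00E0) before the tag end.
set payload2 "<xml><header name=\"\u00e0\"/><body></body></xml>"
set tagstart2 [string first "<" $payload2]
set tagend2 [string first "/>" $payload2 $tagstart2]
puts "tagstart2=$tagstart2 tagend2=$tagend2"   ;# a correct string first gives 0 and 21
```

Any value other than 21 for the second tagend on a BIG-IP would suggest that the system is affected.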
How to use this code snippet
To replace that command, use the procedure below and invoke it with call. For example:
proc stringfirst {needle haystack {start 0}} {
    # Find the first occurrence of needle in haystack, starting at index start.
    # Replaces string first because of https://cdn.f5.com/product/bugtracker/ID999881.html
    # Returns the index of the match, or -1 if there is no match.
    set pos -1
    # Split the haystack into a list of single characters
    set buffer [split $haystack {}]
    set matchlength [string length $needle]
    for {set i $start} {$i < [llength $buffer]} {incr i} {
        # Rebuild a substring of the same length as needle, starting at $i
        set chunk [join [lrange $buffer $i [expr {$i + $matchlength - 1}]] ""]
        if {$chunk eq $needle} {
            set pos $i
            break
        }
    }
    return $pos
}
when HTTP_REQUEST_DATA {
    if { ! [string is ascii [HTTP::payload] ] } {
        set tagstart [call stringfirst "<" [HTTP::payload] ]
        set tagend [call stringfirst "/>" [HTTP::payload] $tagstart]
    } else {
        set tagstart [string first "<" [HTTP::payload] ]
        set tagend [string first "/>" [HTTP::payload] $tagstart]
    }
}
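Before deploying, the procedure can be sanity-checked off-box. The sketch below assumes a standard tclsh, where string first is not affected by the bug and can serve as the reference; the two test payloads are hypothetical. In tclsh the proc is called directly, whereas an iRule needs call.

```tcl
# The proc from this snippet, repeated so the block runs on its own.
proc stringfirst {needle haystack {start 0}} {
    set pos -1
    set buffer [split $haystack {}]
    set matchlength [string length $needle]
    for {set i $start} {$i < [llength $buffer]} {incr i} {
        set chunk [join [lrange $buffer $i [expr {$i + $matchlength - 1}]] ""]
        if {$chunk eq $needle} {
            set pos $i
            break
        }
    }
    return $pos
}

# Hypothetical payloads: one pure ASCII, one containing U+00E0.
set payloads [list \
    {<xml><header/><body></body></xml>} \
    "<xml><note>\u00e0</note><header/></xml>"]

foreach payload $payloads {
    set start [stringfirst "<" $payload]
    set got   [stringfirst "/>" $payload $start]
    set want  [string first "/>" $payload $start]
    # The two values should be identical for every payload.
    puts "stringfirst=$got string_first=$want"
}
```

The procedure splits the payload into a list of characters on every call, so it is likely slower than string first on large payloads; that is why the iRule above only falls back to it when the payload is not pure ASCII.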
Code Snippet Meta Information
- Version: TMOS v11+
- Coding Language: iRules
Full Code Snippet
proc stringfirst {needle haystack {start 0}} {
    # Find the first occurrence of needle in haystack, starting at index start.
    # Replaces string first because of https://cdn.f5.com/product/bugtracker/ID999881.html
    # Returns the index of the match, or -1 if there is no match.
    set pos -1
    # Split the haystack into a list of single characters
    set buffer [split $haystack {}]
    set matchlength [string length $needle]
    for {set i $start} {$i < [llength $buffer]} {incr i} {
        # Rebuild a substring of the same length as needle, starting at $i
        set chunk [join [lrange $buffer $i [expr {$i + $matchlength - 1}]] ""]
        if {$chunk eq $needle} {
            set pos $i
            break
        }
    }
    return $pos
}