string first replacement procedure for ID999881

Code is community submitted, community supported, and recognized as ‘Use At Your Own Risk’.

Short Description

Replacement for the string first command to handle Unicode characters

Problem solved by this Code Snippet

Bug ID 999881 causes the string first command to return incorrect results when the string being searched contains Unicode (non-ASCII) characters, for example an à in the payload.

string first is a standard Tcl string command that returns the index of the first occurrence of a substring within a larger string, optionally starting the search at a given index. See https://www.tcl.tk/man/tcl8.4/TclCmd/string.html#M7

string first needleString haystackString ?startIndex?
Search haystackString for a sequence of characters that exactly match the characters in needleString. If found, return the index of the first character in the first such match within haystackString. If not found, return -1. If startIndex is specified (in any of the forms accepted by the index method), then the search is constrained to start with the character in haystackString specified by the index. For example,
string first a 0a23456789abcdef 5
will return 10, but
string first a 0123456789abcdef 11
will return -1.

Example in iRules:

when HTTP_REQUEST_DATA {
    set tagstart [string first "<" [HTTP::payload] ]
    set tagend [string first "/>" [HTTP::payload] $tagstart]
}

The above iRule finds the start of an XML tag and then the end of that tag. Note that $tagstart is passed as the start index in the second command, so the search for "/>" begins at the position of the opening "<".

In this case, with a payload of <xml><header/><body></body></xml>, $tagstart would be 0 and $tagend would be 12.

Because of the bug, when the payload contains Unicode characters the second command does not work as expected: it returns a higher number, such as 28.
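The expected indices for the sample payload can be checked in a plain tclsh session. This is a minimal sketch in standard Tcl rather than an iRule: the payload is held in an ordinary variable (the name payload is only illustrative) instead of being read with HTTP::payload, and string first is assumed to behave as documented.

```tcl
# Standard Tcl (tclsh) sketch, not an iRule: the payload is a plain variable
# here instead of the result of HTTP::payload.
set payload "<xml><header/><body></body></xml>"

# Index of the first "<" in the payload
set tagstart [string first "<" $payload]

# Index of the first "/>" found at or after $tagstart
set tagend [string first "/>" $payload $tagstart]

puts "tagstart=$tagstart tagend=$tagend"
```

For this ASCII-only payload the script prints tagstart=0 tagend=12, matching the values described above.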

How to use this code snippet

To work around the bug, replace string first with the procedure below. For example:

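proc stringfirst {needle haystack {start 0}} {
 # Find the index of the first occurrence of needle in haystack, searching from
 # character index start. Replaces string first because of
 # https://cdn.f5.com/product/bugtracker/ID999881.html
 set pos -1
 set matchlength [string length $needle]
 # Like string first, never match an empty needle
 if {$matchlength == 0} { return $pos }
 # Split into a list of single characters so that indexing is per character
 set buffer [split $haystack {}]
 # Last index at which a complete match could still begin
 set last [expr {[llength $buffer] - $matchlength}]

 for {set i $start} {$i <= $last} {incr i} {
   set chunk [join [lrange $buffer $i [expr {$i + $matchlength - 1}]] ""]
   if {$chunk eq $needle} {
     set pos $i
     break
   }
 }
 return $pos
}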
when HTTP_REQUEST_DATA {
  # Use the replacement procedure only when the payload contains non-ASCII
  # characters; otherwise the built-in string first is used
  if { ! [string is ascii [HTTP::payload] ] } {
    set tagstart [call stringfirst "<" [HTTP::payload] ]
    set tagend [call stringfirst "/>" [HTTP::payload] $tagstart]
  } else {
    set tagstart [string first "<" [HTTP::payload] ]
    set tagend [string first "/>" [HTTP::payload] $tagstart]
  }
}
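One way to sanity-check the procedure outside BIG-IP is to run it in tclsh against a string containing a non-ASCII character. The sketch below makes several assumptions: it is standard Tcl rather than an iRule (so the procedure is invoked directly instead of through call), the sample payload is hypothetical, and the built-in string first in standard Tcl is assumed not to be affected by ID999881, so both approaches should report the same indices.

```tcl
# Replacement procedure, reproduced here so the sketch is self-contained
proc stringfirst {needle haystack {start 0}} {
  set pos -1
  set buffer [split $haystack {}]
  set matchlength [string length $needle]
  for {set i $start} {$i < [llength $buffer]} {incr i} {
    set chunk [join [lrange $buffer $i [expr {$i + $matchlength - 1}]] ""]
    if {$chunk eq $needle} {
      set pos $i
      break
    }
  }
  return $pos
}

# Hypothetical payload with one non-ASCII character (\u00e0 is "à") at index 5
set payload "<xml>\u00e0<header/><body></body></xml>"

# Indices from the replacement procedure
set tagstart [stringfirst "<" $payload]
set tagend [stringfirst "/>" $payload $tagstart]

# Indices from the built-in command, for comparison
set builtinstart [string first "<" $payload]
set builtinend [string first "/>" $payload $builtinstart]

puts "stringfirst:  tagstart=$tagstart tagend=$tagend"
puts "string first: tagstart=$builtinstart tagend=$builtinend"
```

The à occupies a single character position, so the "/>" that closes the header tag sits at index 13, one higher than in the ASCII-only payload.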

Code Snippet Meta Information

  1. Version:  TMOS v11+
  2. Coding Language: iRules

Full Code Snippet

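proc stringfirst {needle haystack {start 0}} {
 # Find the index of the first occurrence of needle in haystack, searching from
 # character index start. Replaces string first because of
 # https://cdn.f5.com/product/bugtracker/ID999881.html
 set pos -1
 set matchlength [string length $needle]
 # Like string first, never match an empty needle
 if {$matchlength == 0} { return $pos }
 # Split into a list of single characters so that indexing is per character
 set buffer [split $haystack {}]
 # Last index at which a complete match could still begin
 set last [expr {[llength $buffer] - $matchlength}]

 for {set i $start} {$i <= $last} {incr i} {
   set chunk [join [lrange $buffer $i [expr {$i + $matchlength - 1}]] ""]
   if {$chunk eq $needle} {
     set pos $i
     break
   }
 }
 return $pos
}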
Published Oct 13, 2022
Version 1.0