Forum Discussion

mulot
Feb 09, 2011

Modify UTF-8 strings in Payload

Hi,

I would like to replace some strings in the HTTP payload, but the strings are UTF-8 encoded (the server response carries the header "Content-Type: text/html; charset=utf-8").

My iRule is:

when HTTP_REQUEST {
   if { [HTTP::version] eq "1.1" } {
      if { [HTTP::header is_keepalive] } {
         HTTP::header replace "Connection" "Keep-Alive"
      }
      HTTP::version "1.0"
   }
   if {[HTTP::header exists "Accept-Encoding"]} {
      HTTP::header replace "Accept-Encoding" ""
      log local0. "Accept-Encoding null"
   }
}

when HTTP_RESPONSE {
   if {[HTTP::header value Content-Type] contains "text"} {
      if { [HTTP::header exists "Content-Length"] } {
         set content_length [HTTP::header "Content-Length"]
      } else {
         set content_length 1048576
      }
      if { $content_length > 1048575 } {
         HTTP::collect $content_length
      }
      if { $content_length > 0 } {
         HTTP::collect $content_length
      }
   }
}

when HTTP_RESPONSE_DATA {
   HTTP::payload replace 0 $content_length [string map {"OLD STRING" "NEW STRING"} [HTTP::payload]]
   HTTP::release
}

Result:

The strings are replaced, but all accented characters are corrupted (the wrong characters are displayed).

If I disable the "HTTP::payload replace 0 $content_length [string map {"OLD STRING" "NEW STRING"} [HTTP::payload]]" command, all accents are correct.

If I try "HTTP::payload replace 0 $content_length [HTTP::payload]", everything is OK.

I think "string map" cannot handle UTF-8 strings.

What can I do?

  • As a test, can you try using regsub instead of string map?

    http://www.tcl.tk/man/tcl8.4/TclCmd/regsub.htm

    Also, you can simplify the collection size logic a bit and make sure you avoid collecting more than 1 MB of payload by modifying your HTTP_RESPONSE code:
    
    when HTTP_RESPONSE {
       if {[HTTP::header value Content-Type] contains "text"} {
          if { [HTTP::header exists "Content-Length"] and [HTTP::header "Content-Length"] < 1048575 } {
             set content_length [HTTP::header "Content-Length"]
          } else {
             set content_length 1048576
          }
          if { $content_length > 0 } {
             HTTP::collect $content_length
          }
       }
    }

    Aaron
  • mulot
    Same problem with regsub.

    I tried a Stream profile instead of HTTP::collect and it fixed the issue. All UTF-8 characters are displayed correctly.

  • It's great that you found something that works for you. That was going to be my next suggestion. A stream profile and STREAM::expression based iRule should be more efficient than buffering the response content with HTTP::collect.

    Aaron
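
For readers landing on this thread, here is a minimal sketch of what a Stream profile plus STREAM::expression iRule could look like for the replacement discussed above. It is not the configuration mulot actually used, which the thread does not show. It assumes a Stream profile (with empty source and target) and an HTTP profile are already attached to the virtual server, and it reuses the "OLD STRING" / "NEW STRING" placeholders from the original iRule.

```tcl
when HTTP_REQUEST {
   # Keep the stream filter off by default; it is enabled per response below.
   STREAM::disable

   # Assumption: compression cannot be turned off on the server, so remove
   # Accept-Encoding to keep the server from sending a compressed body
   # that the stream filter could not match against.
   HTTP::header remove "Accept-Encoding"
}

when HTTP_RESPONSE {
   # Only rewrite text responses, as in the original iRule.
   if { [HTTP::header value Content-Type] contains "text" } {
      # "@" is the delimiter: @search@replacement@
      STREAM::expression {@OLD STRING@NEW STRING@}

      # Enable the stream filter for this response only.
      STREAM::enable
   }
}
```

Because the stream filter rewrites the response as it passes through, there is no need to buffer the payload with HTTP::collect or to compute a collection size from Content-Length.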