HTML Comment Scrubber

Problem this snippet solves:

HTML comments are very useful for documenting your site content. Unfortunately, they can contain private information, and you may not want everyone viewing your website to be able to read them. This iRule performs a regular expression search on the HTTP response content, and if it finds an HTML comment of the form "<!--(ws)...(ws)-->" (ws = whitespace), it replaces every character of the comment with a space. I could have removed the characters from the response, thus shortening the payload, but keeping the size the same reduces the need to configure your virtual server to rechunk responses.
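The same-length replacement idea can be tried outside of BIG-IP with plain Tcl. The following is a minimal standalone sketch for tclsh, not the iRule itself; the proc name scrub_comments and the simple non-greedy pattern are assumptions chosen for illustration.

```tcl
# Hypothetical helper (not part of the iRule): blank out HTML comments in a
# string while keeping the string length unchanged.
proc scrub_comments {html} {
    # Non-greedy match. In Tcl regular expressions "." also matches newlines
    # by default, so multi-line comments are covered.
    set indices [regexp -all -inline -indices {<!--.*?-->} $html]
    foreach idx $indices {
        set start [lindex $idx 0]
        set end   [lindex $idx 1]
        set len   [expr {$end - $start + 1}]
        # The replacement has the same length as the match, so the indices
        # of the remaining matches stay valid.
        set html [string replace $html $start $end [string repeat " " $len]]
    }
    return $html
}

set page  "<p>Hi</p><!-- internal note --><p>Bye</p>"
set clean [scrub_comments $page]
# Prints 1: the scrubbed page has the same length as the original.
puts [expr {[string length $clean] == [string length $page]}]
```

Because the length never changes, a Content-Length header computed for the original body is still correct for the scrubbed one.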

How to use this snippet:

Keep in mind that this will blank out any and all content between "<!--" and "-->". That includes JavaScript code wrapped in a comment inside a script block. I'll leave it to you all out there to modify the regexp so that it skips JavaScript code blocks that are wrapped in comments...
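One possible way to spare script blocks, sketched here in plain Tcl for tclsh rather than as a tested iRule: locate the script blocks first, then skip any comment that lies entirely inside one. The proc name scrub_comments_outside_scripts and both patterns are assumptions for illustration, and real-world markup may need stricter matching.

```tcl
# Hypothetical helper (not part of the iRule): blank out HTML comments,
# except those that sit entirely inside a <script>...</script> block.
proc scrub_comments_outside_scripts {html} {
    # Index ranges of script blocks. \M (end of word) keeps the pattern from
    # matching tag names that merely start with "script".
    set scripts  [regexp -all -inline -indices -nocase {<script\M.*?</script>} $html]
    set comments [regexp -all -inline -indices {<!--.*?-->} $html]
    foreach idx $comments {
        set start [lindex $idx 0]
        set end   [lindex $idx 1]
        set inside 0
        foreach s $scripts {
            if {$start >= [lindex $s 0] && $end <= [lindex $s 1]} {
                set inside 1
                break
            }
        }
        if {$inside} { continue }
        # Same-length replacement, so the other indices stay valid.
        set len  [expr {$end - $start + 1}]
        set html [string replace $html $start $end [string repeat " " $len]]
    }
    return $html
}
```

A comment that opens inside a script block but closes after it is not treated as "inside" by this check, so it would still be blanked.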

Code :

when HTTP_REQUEST {
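  # Downgrade HTTP/1.1 requests to 1.0 so the server cannot send a chunked response
  if { [HTTP::version] eq "1.1" } {
    # HTTP/1.0 connections close by default, so ask for keep-alive explicitly
    if { [HTTP::header is_keepalive] } {
      HTTP::header replace "Connection" "Keep-Alive"
    }
    HTTP::version "1.0"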
  }
}
when HTTP_RESPONSE {
  if { [HTTP::header exists "Content-Length"] && [HTTP::header "Content-Length"] < 1000000} {
     set content_length [HTTP::header "Content-Length"]
  } else {
     set content_length 1000000
  }
  if { $content_length > 0 } {
     HTTP::collect $content_length
  }
}
when HTTP_RESPONSE_DATA {
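  # Find the HTML comments. The pattern is a non-greedy match on "<!-- ... -->";
  # tighten it if your pages need stricter comment handling.
  set indices [regexp -all -inline -indices {<!--.*?-->} [HTTP::payload]]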
  # Replace the comments with spaces in the response
  #log local0. "Indices: $indices"

  foreach idx $indices {
     set start [lindex $idx 0]
     set len [expr {[lindex $idx 1] - $start + 1}]
     log local0. "Start: $start, Len: $len"
     HTTP::payload replace $start $len [string repeat " " $len]
  }
}
Published Jan 30, 2015
Version 1.0