iRule Security 101 - #03 - HTML Comments
In this session of iRules Security 101, I'll walk you through a process to strip unnecessary content from your outbound application responses. Section 3.2.5 of RFC 1866 (Hypertext Markup Language - 2.0) allows for comments to be enclosed within HTML content. In certain cases, this could lead to unwanted information being exposed.
<body> ... <!-- Pull user info from Users table in database and format a list from that data --> ... </body>
In this article, I'll show you how to remove all those HTML comments from all HTTP traffic that's leaving your network. Other articles in the series:
- iRule Security 101 – #01 – HTTP Version
- iRule Security 101 – #02 – HTTP Methods and Cross Site Tracing
- iRule Security 101 – #03 – HTML Comments
- iRule Security 101 – #04 – Masking Application Platform
- iRule Security 101 – #05 – Avoiding Path Traversal
- iRule Security 101 – #06 – HTTP Referer
- iRule Security 101 – #07 – FTP Proxy
- iRule Security 101 – #08 – Limiting POST Data
- iRule Security 101 – #09 – Command Execution
An early part of every (respectable) developer's training is to thoroughly comment one's code. Typically this is a safe practice, as source code is either compiled or obfuscated before being made available to the client. But with the advent of web-based applications, situations can occur that cause your source code, comments included, to be leaked. While stripping out HTML comments will not prevent every kind of content breach, it does a small part in making sure that internal information included during ASP/JSP/HTML development does not reach the masses.
The following example will inspect all HTML responses for patterns matching HTML comments and replace those characters with spaces, effectively erasing them from the outside world.
when HTTP_REQUEST {
    # Don't allow data to be chunked. This ensures we don't get
    # a comment that is spread across two chunked boundaries.
    if { [HTTP::version] eq "1.1" } {
        if { [HTTP::header is_keepalive] } {
            HTTP::header replace "Connection" "Keep-Alive"
        }
        HTTP::version "1.0"
    }
}
when HTTP_RESPONSE {
    # Ensure all of the HTTP response is collected
    if { [HTTP::header exists "Content-Length"] } {
        set content_length [HTTP::header "Content-Length"]
    } else {
        set content_length 1000000
    }
    if { $content_length > 0 } {
        HTTP::collect $content_length
    }
}
when HTTP_RESPONSE_DATA {
    # Find the HTML comments
    set indices [regexp -all -inline -indices {<![ \r\n\t]*--([^\-]|[\r\n]|-[^\-])*[^/][^/]--[ \r\n\t]*>} [HTTP::payload]]
    # Replace the comments with spaces in the response
    #log local0. "Indices: $indices"
    foreach idx $indices {
        set start [lindex $idx 0]
        # The list also holds the capture group's indices, which are
        # {-1 -1} when the group matched nothing; skip those entries.
        if { $start < 0 } { continue }
        set len [expr {[lindex $idx 1] - $start + 1}]
        #log local0. "Start: $start, Len: $len"
        HTTP::payload replace $start $len [string repeat " " $len]
    }
}
The special sauce in here is the regular expression used to search for the comments. I'll leave it to you all to figure out how the regular expression works and possibly rehash it when I start the "iRules Ninja" series.
Bonus points to anyone who can comment on why I added the "[^/][^/]" towards the end of the regexp.
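If you want to experiment with the expression outside of an iRule, the following is a minimal sketch in plain Tcl that can be run in tclsh. It rests on a few assumptions that are not part of the iRule above: the proc name blank_html_comments and the sample strings are made up for illustration, string replace stands in for HTTP::payload replace, and the group in the pattern is written as non-capturing, (?:...), so that -indices returns only whole-match ranges.

```tcl
# Hypothetical helper (not part of the original iRule): overwrite every
# HTML comment in a string with spaces, keeping the overall length unchanged.
proc blank_html_comments {html} {
    # Same expression as the iRule, with the group made non-capturing so
    # that each list element is the index pair of one complete match.
    set pattern {<![ \r\n\t]*--(?:[^\-]|[\r\n]|-[^\-])*[^/][^/]--[ \r\n\t]*>}
    foreach idx [regexp -all -inline -indices $pattern $html] {
        set start [lindex $idx 0]
        set end   [lindex $idx 1]
        set len   [expr {$end - $start + 1}]
        # Same-length replacement, so the remaining indices stay valid.
        set html [string replace $html $start $end [string repeat " " $len]]
    }
    return $html
}

# Illustrative inputs (assumptions, not taken from real traffic).
set sample {<p>a</p><!-- Pull user info from Users table --><p>b</p>}
set script {<script><!-- var x = 1; //--></script>}

puts [blank_html_comments $sample]
puts [blank_html_comments $script]
```

The first sample comes back with the comment blanked out and the surrounding markup in place. The second sample, whose comment ends in "//-->", comes back untouched, which is a hint for the bonus question.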
- JRahm (Admin): Is the trailing negated match to disqualify javascript comments?
- Somehow I knew citizen_elah would be quick to answer...
- Mike_Lowell_108 (Historic F5 Account): Actually that part of the regex seems incorrect. My memory is telling me that bracket expressions contain a list of characters, not a string, so the 2nd slash is redundant. "man 7 regex" seems to confirm this:
- Mike_Lowell_108 (Historic F5 Account): Oh, and I almost forgot: the following PCRE has served me well...
s///msg;
- josh4534_106539 (Nimbostratus): Is there an existing method to strip out the C-style comments of CSS?