iRules 101 - #18 - Revisiting the TCL Scan Command

I covered the Tcl scan command back in iRules 101 – #16 – Parsing Strings with the TCL Scan Command, but this example (by Hoolio, who else?) was too good not to share with the community. The request involved parsing a log entry as efficiently as possible. The log entry is as follows:

Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}[FWNAT]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135

The goal is to pull out the origin IP address (10.1.1.1) and the translated IP address and port (172.16.4.32 and 1135, respectively). This could be done with a lot of lindex/split combinations, but this kind of requirement is exactly what the scan command is made for. Before moving on to the formatting, let’s break this string into chunks that we can match on. We want to parse as few fields as possible, so we match all the way to the first “/”, then to the next “:” to reach the IP address, and then to the “:” that ends it. From there we move along to the first “;”, on to the next digits, and finally to the last “:”. The chunks we’ll build our formatting around are thus:

  1. Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}[FWNAT]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0
  2. /0/0.21:
  3. 10.1.1.1:
  4. 33639 -> 192.168.22.254, creating forward or watch flow ;
  5. source address and identifier translate to
  6. 172.16.4.32:
  7. 1135

The only fields we really care about here are 3, 6, and 7. There are two ways to handle the fields you don’t want; I’ll show both below.
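For contrast, here is a rough sketch of the lindex/split route mentioned earlier. The index 6 was counted by hand for this one log entry and is an assumption that breaks as soon as the number of colons changes, which is a big part of why scan is the nicer tool here:

```tcl
# The log entry from above (brackets escaped so Tcl does not attempt command substitution).
set a "Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}\[FWNAT\]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135"

# Split on ":" and pick pieces by position (positions are specific to this entry).
set parts [split $a ":"]
set ip1  [lindex $parts 6]
set port [lindex $parts end]

# The translated address is the last space-separated word of the second-to-last piece.
set ip2  [lindex [split [lindex $parts end-1] " "] end]

puts "ip1=$ip1 ip2=$ip2 port=$port"
```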

Option 1 – Set Unnecessary Fields to 0

scan $str {%[^/]%[^:]:%[^:]:%[^;];%[^0-9]%[0-9.]:%[0-9]} 0 0 ip1 0 0 ip2 port

Let’s list these out individually to understand what’s occurring (<null> here stands for the throwaway variable named 0):

  1. %[^/] – store all data in the string up to the first occurrence of “/” to <null>
  2. %[^:] – store the remaining data up to the first occurrence of “:” to <null>
  3. :%[^:] – skip the colon and store the data up to the next occurrence of “:” to ip1
  4. :%[^;] – skip the colon and store the data up to the next occurrence of “;” to <null>
  5. ;%[^0-9] – skip the semicolon and store the data up to the next digit to <null>
  6. %[0-9.] – store the run of digits and “.” characters to ip2
  7. :%[0-9] – skip the colon and store the digits that follow to port
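To see how these pieces behave in isolation, here is a minimal sketch on a short made-up string. The sample value and the variable names are assumptions for illustration only. A %[^x] conversion consumes characters up to, but not including, the first x, and any literal character in the format must then match the input exactly:

```tcl
# Hypothetical sample string, not taken from the log entry above.
set sample "eth0/1:10.0.0.5:8080"

# %[^/]   captures everything before the first "/"
# %[^:]   captures from that "/" up to (not including) the first ":"
# :       literal colon, must match the input
# %[^:]   captures the address
# :       literal colon
# %[0-9]  captures the run of digits that follows
set count [scan $sample {%[^/]%[^:]:%[^:]:%[0-9]} before slashpart addr port]

puts "$count conversions stored"
puts "before=$before slashpart=$slashpart addr=$addr port=$port"
```

Note that the “/” itself lands at the start of the second field, just as it does in the full log entry.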

In practice, I’ll set those <null> fields to garbage variables so you can see what each one matches:

% set a "Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}\[FWNAT\]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135"
Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}[FWNAT]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135
% scan $a {%[^/]%[^:]:%[^:]:%[^;];%[^0-9]%[0-9.]:%[0-9]} g1 g2 ip1 g3 g4 ip2 port
7
% puts $g1
Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}[FWNAT]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0
% puts $g2
/0/0.21
% puts $ip1
10.1.1.1
% puts $g3
33639 -> 192.168.22.254, creating forward or watch flow
% puts $g4
source address and identifier translate to
% puts $ip2
172.16.4.32
% puts $port
1135

Option 2 – Skip Over Unnecessary Fields in the Scan Formatting

The scan formatting allows you to use "*" to skip data instead of forcing you to supply a throwaway variable for fields you don't need. I’ll just re-list the fields, with a slight change in the formatting of the fields we don’t care about (this alternate example was provided by F5er and DevCentral member MiLK_MaN):

  1. %*[^/] – skip over all the data in the string up to the first occurrence of “/”
  2. %*[^:] – skip over the remaining data up to the first occurrence of “:”
  3. :%[^:] – skip the colon and store the data up to the next occurrence of “:” to ip1
  4. :%*[^;] – skip the colon and skip over the data up to the next occurrence of “;”
  5. ;%*[^0-9] – skip the semicolon and skip over the data up to the next digit
  6. %[0-9.] – store the run of digits and “.” characters to ip2
  7. :%[0-9] – skip the colon and store the digits that follow to port

So the command altogether looks like this:

scan $a {%*[^/]%*[^:]:%[^:]:%*[^;];%*[^0-9]%[0-9.]:%[0-9]} ip1 ip2 port

And in practice:

% scan $a {%*[^/]%*[^:]:%[^:]:%*[^;];%*[^0-9]%[0-9.]:%[0-9]} ip1 ip2 port
3
% puts $ip1
10.1.1.1
% puts $ip2
172.16.4.32
% puts $port
1135
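A related variation, offered as a sketch rather than as part of the original example: the scan manual page describes an inline mode in which, when no variable names are given, scan returns the captured values as a list. Combined with "*", only the three fields we care about come back. The foreach/break idiom is used to unpack the list because lassign requires Tcl 8.5 or later:

```tcl
# Same log entry as in the session above.
set a "Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}\[FWNAT\]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135"

# No variable names are passed, so scan returns the captured fields as a list.
# Fields matched with %* are suppressed and do not appear in that list.
set fields [scan $a {%*[^/]%*[^:]:%[^:]:%*[^;];%*[^0-9]%[0-9.]:%[0-9]}]

# Unpack the list into named variables without relying on lassign.
foreach {ip1 ip2 port} $fields break

puts "ip1=$ip1 ip2=$ip2 port=$port"
```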

I was curious if there is a performance difference between the two approaches, so I put each scan command in a proc and timed it over one million iterations:

% proc test1 {arg} {
  scan $arg {%[^/]%[^:]:%[^:]:%[^;];%[^0-9]%[0-9.]:%[0-9]} 0 0 ip1 0 0 ip2 port
}
% proc test2 {arg} {
  scan $arg {%*[^/]%*[^:]:%[^:]:%*[^;];%*[^0-9]%[0-9.]:%[0-9]} ip1 ip2 port
}
% time {test1 $a} 1000000
8.183545 microseconds per iteration
% time {test2 $a} 1000000
7.20703 microseconds per iteration

Both are pretty lean and mean, but skipping the data rather than “storing” it to null is slightly more efficient (roughly 12% faster in this run).

Caveats

The scan command is amazing, but it is rigid: unless every log entry is formatted exactly the same way, a single format string won’t work 100% of the time. That is the case with this entry:

     

Aug 05 08:02:25 ethos-re0 (FPC Slot 1, PIC Slot 1) {sset2}[FWNAT]: ASP_SFW_DELETE_FLOW: proto 6 (TCP) application: any, (null)(null)10.1.1.1:8956 -> 192.168.22.254:80, deleting forward or watch flow ; source address and port translate to 172.16.4.32:1128

Notice that the “/” our first conversion scans up to is nowhere to be found! Not good. There is hope, however. If you know all the possible formats and can find a unique identifier among them, you could switch on the unique identifier and then have a custom scan format for each as necessary.
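Here is a minimal sketch of that idea, assuming the message name (such as ASP_SFW_CREATE_ACCEPT_FLOW) serves as the unique identifier. The proc name parse_nat_log is hypothetical, and only the format already worked out above is filled in; other message types would get their own branch. Checking the count that scan returns guards against entries that match the identifier but not the layout:

```tcl
# Hypothetical helper: returns {ip1 ip2 port} for a known format, or an empty list.
proc parse_nat_log {line} {
    switch -glob -- $line {
        "*ASP_SFW_CREATE_ACCEPT_FLOW*" {
            set n [scan $line {%*[^/]%*[^:]:%[^:]:%*[^;];%*[^0-9]%[0-9.]:%[0-9]} ip1 ip2 port]
            # scan returns the number of values stored; anything but 3 means the entry did not fit.
            if {$n == 3} {
                return [list $ip1 $ip2 $port]
            }
        }
        default {
            # Other message types would get their own scan format here.
        }
    }
    return {}
}

set a "Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}\[FWNAT\]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135"
puts [parse_nat_log $a]
```

An entry with an unrecognized identifier simply yields an empty list, so the caller can test the result with llength before using it.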

The Challenge

I have a DevCentral t-shirt (message in image below) for the first non-F5er to provide the scan syntax to pull out the highlighted fields in the string provided in the Caveats section. Happy coding!

     

     

     

Published Jun 10, 2011
Version 1.0