on 09-Jun-2011 19:42
I covered the Tcl scan command back in iRules 101 – #16 – Parsing Strings with the TCL Scan Command, but this example (by Hoolio, who else?) was too good not to share with the community. The request involved parsing a log entry as efficiently as possible. The log entry is as follows:
Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}[FWNAT]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135
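The goal is to pull out the origin IP address (10.1.1.1) and the translation IP and port (172.16.4.32 and 1135, respectively). This could be done with a lot of lindex/split combinations, but this is exactly the kind of job the scan command is made for. Before moving on to the formatting, let’s break this string into chunks that we can match on. We want to parse as few fields as possible, so we match everything up to the first “/”, then up to the next “:”, which puts us at the origin IP address. The IP address itself runs to the following “:”. From there we move along to the “;”, skip ahead to the next digit, take the translation IP, and finish with the port after the last “:”. The strings we’ll build our formatting around are thus:
1. Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}[FWNAT]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0
2. /0/0.21
3. 10.1.1.1
4. 33639 -> 192.168.22.254, creating forward or watch flow
5. source address and identifier translate to
6. 172.16.4.32
7. 1135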
The only fields we really care about here are 3, 6, and 7. There are two ways to handle the fields you don’t want in the format string; I’ll show both below.
scan $str {%[^/]%[^:]:%[^:]:%[^;];%[^0-9]%[0-9.]:%[0-9]} 0 0 ip1 0 0 ip2 port
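Let’s list these out individually to understand what’s occurring:
%[^/] matches everything up to (but not including) the first “/”. Stored in the throwaway variable 0.
%[^:]: matches from that “/” up to the next “:”, then consumes the “:”. Stored in 0.
%[^:]: matches up to the next “:”, then consumes it. Stored in ip1.
%[^;]; matches up to the “;”, then consumes it. Stored in 0.
%[^0-9] matches everything up to the next digit. Stored in 0.
%[0-9.]: matches a run of digits and dots, then consumes the “:”. Stored in ip2.
%[0-9] matches the final run of digits. Stored in port.
In practice, I’ll assign the fields we don’t need to garbage variables (g1 through g4) so you can see what each one matches: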
% set a "Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}\[FWNAT\]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135"
Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}[FWNAT]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135
% scan $a {%[^/]%[^:]:%[^:]:%[^;];%[^0-9]%[0-9.]:%[0-9]} g1 g2 ip1 g3 g4 ip2 port
7
% puts $g1
Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}[FWNAT]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0
% puts $g2
/0/0.21
% puts $ip1
10.1.1.1
% puts $g3
33639 -> 192.168.22.254, creating forward or watch flow
% puts $g4
source address and identifier translate to
% puts $ip2
172.16.4.32
% puts $port
1135
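Note the 7 that scan returned in the session above: it is the number of conversions that succeeded. A minimal sketch of using that count as a guard follows. The proc name parse_nat_log and the non-matching test string are hypothetical and not part of the original example.

```tcl
set a "Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}\[FWNAT\]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135"

# Hypothetical helper: returns {ip1 ip2 port} on a full match,
# or an empty list when the line does not fit the expected layout.
proc parse_nat_log {line} {
    set fmt {%[^/]%[^:]:%[^:]:%[^;];%[^0-9]%[0-9.]:%[0-9]}
    # scan returns the number of conversions it performed;
    # anything other than 7 means at least one field was missing.
    if {[scan $line $fmt g1 g2 ip1 g3 g4 ip2 port] == 7} {
        return [list $ip1 $ip2 $port]
    }
    return {}
}

puts [parse_nat_log $a]
puts [llength [parse_nat_log "no delimiters here"]]
```

With the log entry above, the first puts should print the three fields of interest and the second should print 0, since the made-up string stops matching after the first conversion.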
The scan format also lets you use "*" to skip data instead of forcing you to use a null variable for fields you don't need. Adding "*" right after the "%" of each conversion we don’t care about gives the following command (this alternate example was provided by F5er and DevCentral member MiLK_MaN):
scan $a {%*[^/]%*[^:]:%[^:]:%*[^;];%*[^0-9]%[0-9.]:%[0-9]} ip1 ip2 port
And in practice:
% scan $a {%*[^/]%*[^:]:%[^:]:%*[^;];%*[^0-9]%[0-9.]:%[0-9]} ip1 ip2 port
3
% puts $ip1
10.1.1.1
% puts $ip2
172.16.4.32
% puts $port
1135
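A related option, not used in the original example: when scan is called with no variable names at all, it returns the converted values as a list instead of a count, and suppressed ("*") conversions are left out of that list. A minimal sketch, assuming the same log entry (the variable name fields is just for illustration):

```tcl
set a "Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}\[FWNAT\]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135"

# No variable names given, so scan returns the three kept fields as a list.
set fields [scan $a {%*[^/]%*[^:]:%[^:]:%*[^;];%*[^0-9]%[0-9.]:%[0-9]}]

# foreach with an immediate break is a common idiom for
# unpacking a list into individual variables.
foreach {ip1 ip2 port} $fields break

puts "$ip1 $ip2 $port"
```

The trade-off is that you no longer get a conversion count back, so you would check the length and contents of the returned list instead.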
I was curious if there is a performance difference between the two approaches, so I put each scan command in a proc and timed it over one million iterations:
% proc test1 {arg} {
    scan $arg {%[^/]%[^:]:%[^:]:%[^;];%[^0-9]%[0-9.]:%[0-9]} 0 0 ip1 0 0 ip2 port
}
% proc test2 {arg} {
    scan $arg {%*[^/]%*[^:]:%[^:]:%*[^;];%*[^0-9]%[0-9.]:%[0-9]} ip1 ip2 port
}
% time {test1 $a} 1000000
8.183545 microseconds per iteration
% time {test2 $a} 1000000
7.20703 microseconds per iteration
Both are pretty lean and mean, but skipping the data rather than “storing” it in a throwaway variable is slightly more efficient: roughly 7.2 versus 8.2 microseconds per iteration in this run.
The scan command is amazing, but it is rigid: unless every log entry is formatted exactly the same way, a single format string won’t work 100% of the time. That is the case here:
Aug 05 08:02:25 ethos-re0 (FPC Slot 1, PIC Slot 1) {sset2}[FWNAT]: ASP_SFW_DELETE_FLOW: proto 6 (TCP) application: any, (null)(null)10.1.1.1:8956 -> 192.168.22.254:80, deleting forward or watch flow ; source address and port translate to 172.16.4.32:1128
Notice that the first delimiter our scan format relies on, the “/”, is nowhere to be found in this entry, so the very first conversion swallows the whole line. Not good. There is hope, however. If you know all the possible formats and can find a unique identifier among them, you could switch on the unique identifier and then have a custom scan format for each as necessary.
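Here is a rough sketch of that switch idea. It assumes the message type token (ASP_SFW_CREATE_ACCEPT_FLOW, ASP_SFW_DELETE_FLOW) is the unique identifier, which the two sample entries suggest but which I have not verified against every possible format. The proc name parse_flow_log is hypothetical, and only the format already worked out above is filled in.

```tcl
# Hypothetical dispatcher: choose a scan format from the message type.
proc parse_flow_log {line} {
    switch -glob -- $line {
        "*ASP_SFW_CREATE_ACCEPT_FLOW*" {
            set fmt {%*[^/]%*[^:]:%[^:]:%*[^;];%*[^0-9]%[0-9.]:%[0-9]}
        }
        default {
            # No format registered for this layout yet
            # (each additional message type would get its own branch).
            return {}
        }
    }
    # Three unsuppressed conversions are expected; fewer means a bad match.
    if {[scan $line $fmt ip1 ip2 port] == 3} {
        return [list $ip1 $ip2 $port]
    }
    return {}
}

set create "Aug 05 08:01:13 ethos-re0 (FPC Slot 1, PIC Slot 1) TEST-{ss-nat-2}\[FWNAT\]: ASP_SFW_CREATE_ACCEPT_FLOW: proto 1 (ICMP ECHO REQUEST) application: icmp, xe- 0/0/0.21:10.1.1.1:33639 -> 192.168.22.254, creating forward or watch flow ; source address and identifier translate to 172.16.4.32:1135"
set delete "Aug 05 08:02:25 ethos-re0 (FPC Slot 1, PIC Slot 1) {sset2}\[FWNAT\]: ASP_SFW_DELETE_FLOW: proto 6 (TCP) application: any, (null)(null)10.1.1.1:8956 -> 192.168.22.254:80, deleting forward or watch flow ; source address and port translate to 172.16.4.32:1128"

puts [parse_flow_log $create]
puts [llength [parse_flow_log $delete]]
```

Returning an empty list for unrecognized entries keeps a bad match from silently producing garbage fields; the caller can log or skip those lines.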
I have a DevCentral t-shirt (message in image below) for the first non-F5er to provide the scan syntax to pull out the highlighted fields in the string provided in the Caveats section. Happy coding!