Advanced iRules: Tables
We’ve covered quite a bit of ground in the Getting Started with iRules and Intermediate iRules series. In this series, we’ll dive even deeper down the rabbit hole, starting with the table command. This article is a primer for the power of tables, but we actually have an entire 9 part series on the table command alone, so after reading this overview, I highly recommend you dig in to the meat of what tables can really do. What is a table? A table is somewhat similar to a data-group in that it is, at its core, a memory structure to contain lists. That’s about where the similarities end, though. Tables are stored in memory alone, which means they don’t have the config object backing that a data-group does, but that also makes them inherently more flexible. A table is a read/write memory structure that is maintained across connections, making it a solid contender for storing data that needs to be persistent, as opposed to connection based like normal iRules variables. Tables also allow a few extremely powerful concepts, namely sub tables and a built in timeout/lifetime structure. Both of these things add possibilities to what tables are capable of. Tables are also extremely efficient, much like data-groups, though on a slightly smaller scale given their more flexible nature. Tables make use of the session table within LTM, which is designed to handle every connection into or out of the device, and as such is very high performance. Table queries share this performance and can be a quick way to construct simple, persistent memory structures or more complex near DB like sets of sub-tables, depending on your needs. What are the benefits of a table? Tables are a high performance, highly scalable, persistent (non-connection bound), read/write data structure. That fact alone makes them unique within iRules and extremely powerful. There simply aren’t other things that fill that role. There are only a couple of ways to have read/write variable data across connections, and tables are by far the best option in almost every case. The ability to create what amounts to a mini database in memory from within your iRule is massively useful in many scenarios also. This is easy to do via subtables. You can create not just a flat list, but named lists to segregate data however you’d like. By doing so you can create relations and a relatively complex schema for data storage and accounting. Now add in the fact that there are configurable timeout and lifetime options, which means you’ll never have to worry about memory management or programmatic cleanup of a table, as things will time out as designed, and you’ve got another layer of usability and ease of use that is unique to tables. The bottom line is that tables are one of the more powerful, flexible features to hit iRules in quite a while. What command(s) would I use to access data in a table? To access tables, you make use of the table command, much like the class command to access data-groups. This command can also get quite complex quite fast, so I won’t attempt to cover all the possible permutations here. I’ll list a couple of simple examples, and give you a link to the full documentation. # Limit each client IP address to 20 concurrent connections when CLIENT_ACCEPTED { # Check if the subtable has over 20 entries if { [table keys -subtable connlimit:[IP::client_addr] -count] >= 20 } { reject } else { # Add the client IP:port to the client IP-specific subtable # with a max lifetime of 180 seconds table set -subtable connlimit:[IP::client_addr] [TCP::client_port] "" 180 } } when CLIENT_CLOSED { # When the client connection is closed, remove the table entry table delete -subtable connlimit:[IP::client_addr] [TCP::client_port] } As you can see tables are extremely powerful, and can add a powerful tool to your quiver for your future iRule endeavors. There are plenty of examples of using these powerful commands out on DevCentral, so there is no shortage of information to be found and code to be lifted for re-use if you scour Q&A and the Codeshare. In an attempt to keep this consumable I've not gone through the full list of command permutations or anywhere near the full possibilities. I'll leave discovering those things to the more advanced and/or eager iRulers, but suffice to say the possibilities are vast.7.4KViews5likes5CommentsAdvanced iRules: Regular Expressions
A regular expression or regex for short is essentially a string that is used to describe or match a set of strings, according to certain syntax rules. Regular expressions ("REs") come in two basic flavors: extended REs ("ERE"s) and basic REs ("BREs"). For you unix heads out there, EREs are roughly the same as used by traditional egrep, while BREs are roughly those of the traditional ed. The TCL implementation of REs adds a third flavor, advanced REs ("AREs") which are basically EREs with some significant extensions. It is beyond the scope of this article to document the regular expression syntax. A great reference is included in the re_syntax document in the TCL Built-In Commands section of the TCL documentation. Regular Expression Examples So what does a regular expression look like? It can be as simple as a string of characters to search for an exact match to "abc" {abc} Or a builtin escape string that searches for all sequences of non-whitespace in a string {\S+} Or a set of ranges of characters that search for all three lowercase letter combinations {[a-z][a-z][a-z]} Or even a sequence of numbers representing a credit card number. {(?:3[4|7]\d{13})|(?:4\d{15})|(?:5[1-5]\d{14})|(?:6011\d{12})} For more information on the syntax, see the re_syntax manual page in the TCL documentation, you won't be sorry. Commands that support regular expressions In the TCL language, the following built in commands support regular expressions: regexp - Match a regular expression against a string regsub - Perform substitutions based on regular expression pattern matching lsearch - See if a list contains a particular element switch - Evaluate one of several scripts, depending on a given value iRules also has an operator to make regular expression comparisons in commands like "if", "matchclass" and "findclass" matches_regex - Tests if one string matches a regular expression. Think twice, no three times, about using Regular Expressions Regular expressions are fairly CPU intensive and in most cases there are faster, more efficient, alternatives available. There are the rare cases, such as the Credit Card scrubber iRule, that would be very difficult to implement with string searches. But, for most other cases, we highly suggest you search for alternate methods. The "switch -glob" and "string match" commands use a "glob style" matching that is a small subset of regular expressions, but allows for wildcards and sets of strings which in most cases will do exactly what you need. In every case, if you are thinking about using regular expressions to do straight string comparisons, please, please, please, make use of the "equals", "contains", "starts_with", and "ends_with" iRule operators, or the glob matching mentioned above. Not only will they perform significantly faster, they will do the exact same thing. Here's a comparative example: BAD: if { [regexp {bcd} "abcde"] } { BAD: if { "abcde" matches_regex "bcd" } { BETTER: if { [string match "*bcd*" "abcde"] } { BEST: if { "abcde" contains "bcd" } { A Performance Challenge: Scan vs Regex The scan and stringcommands will cover the majority of regex use cases, and as indicated above, will save significantly on system resources. Consider the following test where we use Tcl's time command to run a scanagainst an IP address 10,000 times, and then do that again only using the regexp command. % set ip "10.10.20.200" 10.10.20.200 % time { scan $ip {%d.%d.%d.%d} a b c d} 10000 2.1713 microseconds per iteration % time {regexp {([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})} $ip matched a b c d} 10000 34.2604 microseconds per iteration Two approaches, same result. The time to achieve that result? The scan command bests the regexp commandby far. I’ll save you the calculation…that’s a 93.7% reduction in processing time. 93.7 percent! Now, mind you, the difference between 2 and 34 microseconds will be negligible to an individual request’s response time, but in the context of a single system handling hundreds of thousands or even millions of request per second, the difference matters. A lot. Conclusion Regular expressions are available for those tricky situations where you just need to perform some crazy insane search like "string contains 123 but only if follows by "abc" or "def", or it contains a 5 digit number that is a valid US zip code". Also, if you need to find the exact location of a string within a string (for instance to replace one string for another), then regexp and regsubwill likely work for you (if a Stream Profile doesn't). So, in some cases, a regular expression is likely the only option. Just keep in mind that even multiple string and comparison tests are typically far more efficient than even the simplest regular expression, so use them wisely.4.1KViews1like1CommentAdvanced iRules: Binary Scan
The binary scan command, like the scan command covered in this Advanced iRules series, parses strings. Only, as the adjective indicates, it parses binary strings. In this article, I’ll highlight the command syntax and a few of the format string options below. Check the man page for the complete list of format options. binary scan string formatString ?varName varName … ? c - The data is turned into count 8-bit signed integers and stored in the corresponding variable as a list. If count is *, then all of the remaining bytes in string will be scanned. If count is omitted, then one 8-bit integer will be scanned. S - The data is interpreted as count 16-bit signed integers represented in big-endian byte order. The integers are stored in the corresponding variable as a list. If count is *, then all of the remaining bytes in string will be scanned. If count is omitted, then one 16-bit integer will be scanned. H - The data is turned into a string of count hexadecimal digits in high-to-low order represented as a sequence of characters in the set "0123456789abcdef". The data bytes are scanned in first to last order with the hex digits being taken in high-to-low order within each byte. Any extra bits in the last byte are ignored. If count is *, then all of the remaining hex digits in string will be scanned. If count is omitted, then one hex digit will be scanned. Do the Research First! Before you can start mapping fields, you need to know where the delimiters are, and that takes a little research. In this example, I want to list out the cipher suites presented to the BIG-IP LTM by the browser (more on that later), so I’m going to decode the fields in an SSLv3/TLSv1 client hello message. I took a capture with Wireshark to find the field delineations, but you could also pull this information from RFC 2246 if you are so inclined. The table below shows the fields I’ll extract or skip over in the course of the iRule code. So why do I want to look at the cipher suites? I was contacted by a friend who wanted to increase security to 256-bit ciphers for those browsers that supported it, but would still allow users that don’t to connect as well so they could be redirected to an error page requesting they upgrade. With the cipher settings as they were, the unsupported browser (in this case, IE6) could connect, but the newer browsers connected at 128-bit as well instead of 256-bit. I later learned (courtesy of none other than hoolio) that I could use @STRENGTH in my cipher settings to force the capable browsers up, but for the benefit of demonstrating the use the binary string command, I’ll proceed with the iRule. Standard Behaviors I set up a clientssl profile with these cipher settings: “DEFAULT:!ADH:!EXPORT40:!EXP:!LOW”. Those settings result in my test browsers (Chrome 5, FF 3.6.4, IE 😎 all using the RC4-SHA 128-bit cipher. I used a simple iRule to display the information back to the client: when HTTP_REQUEST { set cipher_info "[SSL::cipher name]:[SSL::cipher bits]" HTTP::respond 200 content "$cipher_info" } Again, the easy (and better) fix here is to add strength to the cipher settings in the SSL profile like so: “DEFAULT:!ADH:!EXPORT40:!EXP:!LOW:@STRENGTH”, but instead I created an additional clientssl profile, adding not medium to the settings: “DEFAULT:!ADH:!EXPORT40:!EXP:!LOW:!MEDIUM”. Once I had that profile in place, work on the iRule could begin. Using Binary Scan Enough prep talk already, let me get to the good stuff! The iRule starts in CLIENT_ACCEPTED, disabling SSL and doing a TCP collect: when CLIENT_ACCEPTED { set lsec 0 SSL::disable TCP::collect } when CLIENT_DATA { if { ! [info exists rlen] } { binary scan [TCP::payload] cSS rtype sslver rlen #log local0. "SSL Record Type $rtype, Version: $sslver, Record Length: $rlen" # SSLv3 / TLSv1.0 if { $sslver > 767 && $sslver < 770 } { if { $rtype != 22 } { log local0. "Client-Hello expected, Rejecting. \ Src: [IP::client_addr]:[TCP::remote_port] -> \ Dst: [IP::local_addr]:[TCP::local_port]" reject return } As shown in the binary scan syntax, “c” grabs one byte and returns a signed integer, and “S” grabs two-bytes and returns a signed integer. The rest of the payload is thrown away. From table 1, we know the first three fields of SSLv3/TLSv1 client hello’s are one-byte, two-byte, two-byte, so a format string of cSS followed by three variable names takes care of the parsing. The clientssl profile doesn’t yet support TLS 1.1 or 1.2, so I’m only testing for SSLv3/TLSv1. The SSLv2 and earlier record format is different altogether, so the initial fields wouldn’t map this way anyway. This is true of IE6, which by default issues a SSLv2 client hello but supports SSLv3/TLSv1. Confused yet? Anyway, once the test for SSLv3/TLSv1 is complete, I then make sure the record type is a client hello, if not, no reason to continue. Once the verification is complete that this really is a client hello, there is a string (pun intended!) of binary scans to sort the rest of the details out: #Collect rest of the record if necessary if { [TCP::payload length] < $rlen } { TCP::collect $rlen #log local0. "Length is $len" } #skip record header and random data set field_offset 43 #set the offset binary scan [TCP::payload] @${field_offset}c sessID_len set field_offset [expr {$field_offset + 1 + $sessID_len}] #Get cipherlist length binary scan [TCP::payload] @${field_offset}S cipherList_len #log local0. "Cipher list length: $cipherList_len" #Get ciphers, separate into a list of elements set field_offset [expr {$field_offset + 2}] set cipherList_len [expr {$cipherList_len * 2}] binary scan [TCP::payload] @${field_offset}H${cipherList_len} cipherlist #log local0. "Cipher list: $cipherlist" I’m using the “@” symbol to skip bytes in the binary string. Also, the curly brackets are necessary to surround the variable name to distinguish your variable from the significance of the format string characters. In the first binary scan above, I’m skipping the record header and all the random data as I don’t need that information, but I do need the session ID length so I can set my offset properly to get to the cipher list. In the second binary scan, again I’m skipping (from original payload) the new offset and then storing another two-byte signed integer that tells me how long the cipher list is. From that information, I set a new offset, set the cipher list length (number of ciphers * 2, each cipher is two-bytes) and then perform the final binary scan to store the entire cipher list (in hex high->low characters, hence the “H” in the string format). The rest of the rule is processing that cipher list (thanks to Colin for some tcl-fu in the for loop below), and sending users to a different clientssl profile if any of the ciphers in the client hello list match the ciphers in a string datagroup you define, in this case, accepted_ciphers (values were four hex character pairs, like c005, 0088, etc). For browsers incapable of the higher security profile, they would be redirected to an error page instructing them to upgrade. set profile_select 0 for { set x 0 } { $x < $cipherList_len } { incr x 4 } { set start $x set end [expr {$x+3}] #Check cipher against 256-bit ciphers in accepted_ciphers data group if { [matchclass [string range $cipherlist $start $end] equals accepted_ciphers] } { #If 256-bit capable, select High Security profile set profile_select "SSL::profile clientssl_highSecurity" break } } #Set New Profile with eval since SSL::profile errors in CLIENT_ACCEPTED event if { $profile_select != 0 } { eval $profile_select #log local0. "Profile changed to High Security" } else { set lsec 1 } event CLIENT_DATA disable SSL::enable TCP::release } else { set lsec 1 event CLIENT_DATA disable SSL::enable TCP::release } } } Conclusion The rule is useless in production, but is a good teaching example on the use of binary scan. There are several good examples in the codeshare. Check out this tftp file server iRule (companion article for the tftp iRule), this one throws in a binary format to boot. Happy coding!3.4KViews2likes0CommentsAdvanced iRules: Scan
Scan is used to parse out strings. It takes a string and based on your format parameters, stores the matches in one or more variables. It also returns the number of conversions performed, so it can be used in a conditional as well. For all the options available in the command, the scan man page is available here at http://tmml.sourceforge.net/doc/tcl/scan.html. I'll highlight a couple of the options I see used in iRules examples in Q&A below. scan string format ?varName varName ...? Options d - The input substring must be a decimal integer. s - The input substring consists of all the characters up to the next white-space character. n - No input is consumed from the input string. Instead, the total number of characters scanned from the input string so far is stored in the variable. [chars] - The input substring consist of one or more characters in chars. The matching string is stored in the variable. [^chars] - The input substring consists of one or more characters not in chars. Examples So how do we put scan to use in iRules? Consider this first example: when HTTP_REQUEST { if { [scan [HTTP::host] {%[^:]:%s} host port] == 2 } { log local0. "Parsed \$host:\$port: $host:$port } } Here we are scanning the host contents. The HTTP::host command only returns a port if it is not port 80 for http traffic and port 443 for ssl traffic, so if it is standard, the second conversion (%s) will not populate the port variable and the conditional will be false. In the scan commands format section, %[^:] tells the scan command to store all the characters in string from the beginning until the first occurrence of the colon. We then put a colon before the %s (which tells scan to store the remaining characters until the end or white space) so it is not included in the port variables contents. Also note that the format string is wrapped in curly braces so that the brackets are not evaluated as a command. Below is the functionality of the scan command in a tcl shell: % set httphost "www.test.com:8080" www.test.com:8080 % scan $httphost {%[^:]:%s} host port 2 % puts "$host $port" www.test.com 8080 Another use case--splitting up an IP address into multiple variables--is accomplished in one easy step below. % set ip 10.15.25.30 10.15.25.30 % scan $ip %d.%d.%d.%d ip1 ip2 ip3 ip4 4 % puts "$ip1 $ip2 $ip3 $ip4" 10 15 25 30 As with most things with iRules, there are many paths to the same result, even if they require more steps. Here's another way to arrive at the same split IP with variables for each octet. This method requires four sets of a nested split/lindex evaluation to achieve the same result. % set ip 10.15.20.25 10.15.20.25 % set ip1 [lindex [split $ip "."] 0] 10 % set ip2 [lindex [split $ip "."] 1] 15 % set ip3 [lindex [split $ip "."] 2] 20 % set ip4 [lindex [split $ip "."] 3] 25 % puts "$ip1 $ip2 $ip3 $ip4" 10 15 20 25 If you're wondering why you'd split the IP like this, the use case in the forums was to extract each octet so they could then do some bit shifiting to create a unique ID for their stores based on IP subnets. One final example before closing. The scan string is refined over a few steps to show the elimination of unwanted characters in the variables. % set sipinfo {} % scan $sipinfo {%[^:]%s} garbage sessid 2 % puts "$garbage $sessid" You can see that at the first colon it dumped the contents up to that character into the garbage variable. Everything else, including the colon, is dumped into the sessID variable. Close but we don't want that colon, so we need to include it in the scan format string. % scan $sipinfo {%[^:]:%s} garbage sessid 2 % puts "$garbage $sessid" Good. Now we need to break off the host and the port as well. We want all characters up until the @ sign for the session id, then all the characters between the @ sign and the colon for the host, and finally all the characters after the color for the port. % scan $sipinfo {%[^:]:%[^@]@%[^:]:%s} garbage sessid host port 4 % puts "$sessid $host $port" 214365981110 10.15.20.25 3232> OK, all looks good except the port. We definitely don't want the > on the port. So one final fix. % scan $sipinfo {%[^:]:%[^@]@%[^:]:%[^>]} garbage sessid host port 4 % puts "$sessid $host $port" 214365981110 10.15.20.25 3232 Scan is a great command to have available in your iRules arsenal. Thanks to Hoolio, cmbhatt, sre, and natty76 for some great examples. I've archived the article, but for another great scan example, check out my Revisiting the TCL Scan Command article from the original iRules 101 series.2.3KViews1like1CommentAdvanced iRules: SMTP Start TLS
F5er and DevCentral member natty76 wrote a few iRules a while back on interactive TLS session starting on the SMTP, IMAP, and POP3 protocols. A lot of the iRules can be understood from a flow perspective by reading the iRule top to bottom. This is not the case for these iRules. In this article, I’ll break down the SMTP communication context for the BIG-IP as middleman between client and server. I’ve saved the iRule as an image below so I reference line numbers as I go. The SMTP iRule as well as the IMAP and POP3 iRules are available in the iRules Codeshare. Before digging into the iRule, the usage example in section six of RFC 2487 is illustrated in the drawing below with the steps from our description to follow highlighted on each leg of the protocol exchange. The iRule The process starts with the standard TCP 3-way handshake, which results in the CLIENT_ACCEPTED event firing (line 2). At this point I don’t know if the client is requiring TLS yet, so ehlo is set to 0 and SSL is disabled (set by default in the virtual server profile.) With the SMTP protocol, the client initiates the connection but it’s the server that sends data first. This means that after the client connection occurs, we need to collect data on the server side of the connection, which is performed here when the SERVER_CONNECTED event fires (lines 6-7). SERVER_DATA fires when the server sends (and has been collected by the collect in line 7). The if will not match yet as the ehlo variable is still zero with no client data to match. The data is also released here (line 29) I do however, want to catch when the client sends data, so I do a clientside collect. (line 30) When the client does send data, the CLIENT_DATA event fires (line 9). The payload is de-cased and stored in the variable lcpayload (line 10) and then is checked for the existence of the ehlo command (line 11). If the ehlo was present, I collect on the serverside again (looking for TLS support messages) and make sure I set ehlo to true (lines 12-13). I release client data to continue flow, then collect again to look for the starttls command from the client. Now on the server side, if ehlo is set and the starttls is not in the message, replace the payload with a starttls message. (line 27) And again release the data (line 29) Now when client data arrives with starttls, LTM responds directly informing the client to start TLS communication (lines 16-17) and then swallows the payload (line 18). After responding to the client, I need to enable SSL to the virtual is ready for the client hello from the client (line 20). Finally, if there is no ehlo or starttls from the client, I’ll just release the payload. This is to allow clients not supporting starttls through. (line 22)2KViews1like1CommentAdvanced iRules: An Abstract View of iRules with the Tcl Bytecode Disassembler
In case you didn't already know, I'm a child of the 70's. As such, my formative years were in the 80's, where the music and movies were synthesized and super cheesy. Short Circuit was one of those cheesy movies, featuring Ally Sheedy, Steve Guttenberg, and Johnny Five, the tank-treaded laser-wielding robot with feelings and self awareness. Oh yeah! The plot...doesn't at all matter, but Johnny's big fear once reaching self actualization was being disassembled. Well, in this article, we won't dissamble Johnny Five, but we will take a look at disassembling some Tcl code and talk about optimizations. Tcl forms the foundation of several code environments on BIG-IP: iRules, iCall, tmsh, and iApps. The latter environments don't carry the burden of performance that iRules do, so efficiency isn't as big a concern. When we speak at conferences, we often commit some time to cover code optimization techniques due to the impactful nature of applying an iRule to live traffic. This isn't to say that the system isn't highly tuned and optimized already, it's just important not to introduce any more impact than is absolutely necessary to carry out purpose. In iRules, you can turn timing on to see the impact of an iRule, and in the Tcl shell (tclsh) you can use the time command. These are ultimately the best tools to see what the impact is going to be from a performance perspective. But if you want to see what the Tcl interpreter is actually doing from an instruction standpoint, well, you will need to disassemble the code. I've looked at bytecode in some of the python scripts I've written, but I wasn't aware of a way to do that in Tcl. I found a thread on stack that indicated it was possible, and after probing a little further was given a solution. This doesn't work in Tcl 8.4, which is what the BIG-IP uses, but it does work on 8.5+ so if you have a linux box with 8.5+ you're good to go. Note that there are variances from version to version that could absolutely change the way the interpreter works, so understand that this is an just an exercise in discovery. Solution 1 Fire up tclsh and then grab a piece of code. For simplicity, I'll use two forms of a simple math problem. The first is using the expr command to evaluate 3 times 4, and the second is the same math problem, but wraps the evaluation with curly brackets. The command that will show how the interpreter works its magic is tcl::unsupported::disassemble. ## ## unwrapped expression ## ## % tcl::unsupported::disassemble script { expr 3 * 4 } ByteCode 0x0x1e1ee20, refCt 1, epoch 16, interp 0x0x1d59670 (epoch 16) Source " expr 3 * 4 " Cmds 1, src 12, inst 14, litObjs 4, aux 0, stkDepth 5, code/src 0.00 Commands 1: 1: pc 0-12, src 1-11 Command 1: "expr 3 * 4 " (0) push1 0 # "3" (2) push1 1 # " " (4) push1 2 # "*" (6) push1 1 # " " (8) push1 3 # "4" (10) concat1 5 (12) exprStk (13) done ## ## wrapped expression ## ## % tcl::unsupported::disassemble script { expr { 3 * 4 } } ByteCode 0x0x1de7a40, refCt 1, epoch 16, interp 0x0x1d59670 (epoch 16) Source " expr { 3 * 4 } " Cmds 1, src 16, inst 3, litObjs 1, aux 0, stkDepth 1, code/src 0.00 Commands 1: 1: pc 0-1, src 1-15 Command 1: "expr { 3 * 4 } " (0) push1 0 # "12" (2) done Because the first expression is unwrapped, the interpreter has to build the expression and then call the runtime expression engine, resulting in 4 objects and a stack depth of 5. With the wrapped expression, the interpreter found a compile-time constant and used that directly, resulting in 1 object and a stack depth of 1 as well. Much thanks to Donal Fellows on Stack Overflow for the details. Using the time command in the shell, you can see that wrapping the expression results in a wildly more efficient experience. % time { expr 3 * 4 } 100000 1.02325 microseconds per iteration % time { expr {3*4} } 100000 0.07945 microseconds per iteration Solution 2 I was looking in earnest for some explanatory information for the bytecode fields displayed with tcl::unsupported::disassemble, and came across a couple pages on the Tcl wiki, one building on the other. Combining the pertinent sections of code from each page results in this script you can paste into tclsh: namespace eval tcl::unsupported {namespace export assemble} namespace import tcl::unsupported::assemble rename assemble asm interp alias {} disasm {} ::tcl::unsupported::disassemble proc aproc {name argl body args} { proc $name $argl $body set res [disasm proc $name] if {"-x" in $args} { set res [list proc $name $argl [list asm [dis2asm $res]]] eval $res } return $res } proc dis2asm body { set fstart " push -1; store @p; pop " set fstep " incrImm @p +1;load @l;load @p listIndex;store @i;pop load @l;listLength;lt " set res "" set wait "" set jumptargets {} set lines [split $body \n] foreach line $lines { ;#-- pass 1: collect jump targets if [regexp {\# pc (\d+)} $line -> pc] {lappend jumptargets $pc} } set lineno 0 foreach line $lines { ;#-- pass 2: do the rest incr lineno set line [string trim $line] if {$line eq ""} continue set code "" if {[regexp {slot (\d+), (.+)} $line -> number descr]} { set slot($number) $descr } elseif {[regexp {data=.+loop=%v(\d+)} $line -> ptr]} { #got ptr, carry on } elseif {[regexp {it%v(\d+).+\[%v(\d+)\]} $line -> copy number]} { set loopvar [lindex $slot($number) end] if {$wait ne ""} { set map [list @p $ptr @i $loopvar @l $copy] set code [string map $map $fstart] append res "\n $code ;# $wait" set wait "" } } elseif {[regexp {^ *\((\d+)\) (.+)} $line -> pc instr]} { if {$pc in $jumptargets} {append res "\n label L$pc;"} if {[regexp {(.+)#(.+)} $instr -> instr comment]} { set arg [list [lindex $comment end]] if [string match jump* $instr] {set arg L$arg} } else {set arg ""} set instr0 [normalize [lindex $instr 0]] switch -- $instr0 { concat - invokeStk {set arg [lindex $instr end]} incrImm {set arg [list $arg [lindex $instr end]]} } set code "$instr0 $arg" switch -- $instr0 { done { if {$lineno < [llength $lines]-2} { set code "jump Done" } else {set code ""} } startCommand {set code ""} foreach_start {set wait $line; continue} foreach_step {set code [string map $map $fstep]} } append res "\n [format %-24s $code] ;# $line" } } append res "\n label Done;\n" return $res } proc normalize instr { regsub {\d+$} $instr "" instr ;# strip off trailing length indicator set instr [string map { loadScalar load nop "" storeScalar store incrScalar1Imm incrImm } $instr] return $instr } Now that the script source is in place, you can test the two expressions we tested in solution 1. The output is very similar, however, there is less diagnostic information to go with the bytecode instructions. Still, the instructions are consistent between the two solutions. The difference here is that after "building" the proc, you can execute it, shown below each aproc expression. % aproc f x { expr 3 * 4 } -x proc f x {asm { push 3 ;# (0) push1 0 # "3" push { } ;# (2) push1 1 # " " push * ;# (4) push1 2 # "*" push { } ;# (6) push1 1 # " " push 4 ;# (8) push1 3 # "4" concat 5 ;# (10) concat1 5 exprStk ;# (12) exprStk ;# (13) done label Done; }} % f x 12 % aproc f x { expr { 3 * 4 } } -x proc f x {asm { push 12 ;# (0) push1 0 # "12" ;# (2) done label Done; }} % f x 12 Deeper Down the Rabbit Hole Will the internet explode if I switch metaphors from bad 80's movie to literary classic? I guess we'll find out. Simple comparisons are interesting, but now that we're peeling back the layers, let's look at something a little more complicated like a for loop and a list append. % tcl::unsupported::disassemble script { for { $x } { $x < 50 } { incr x } { lappend mylist $x } } ByteCode 0x0x2479d30, refCt 1, epoch 16, interp 0x0x23ef670 (epoch 16) Source " for { $x } { $x < 50 } { incr x } { lappend mylist $x " Cmds 4, src 57, inst 43, litObjs 5, aux 0, stkDepth 3, code/src 0.00 Exception ranges 2, depth 1: 0: level 0, loop, pc 8-16, continue 18, break 40 1: level 0, loop, pc 18-30, continue -1, break 40 Commands 4: 1: pc 0-41, src 1-56 2: pc 0-4, src 7-9 3: pc 8-16, src 37-54 4: pc 18-30, src 26-32 Command 1: "for { $x } { $x < 50 } { incr x } { lappend mylist $x }" Command 2: "$x " (0) push1 0 # "x" (2) loadStk (3) invokeStk1 1 (5) pop (6) jump1 +26 # pc 32 Command 3: "lappend mylist $x " (8) push1 1 # "lappend" (10) push1 2 # "mylist" (12) push1 0 # "x" (14) loadStk (15) invokeStk1 3 (17) pop Command 4: "incr x " (18) startCommand +13 1 # next cmd at pc 31 (27) push1 0 # "x" (29) incrStkImm +1 (31) pop (32) push1 0 # "x" (34) loadStk (35) push1 3 # "50" (37) lt (38) jumpTrue1 -30 # pc 8 (40) push1 4 # "" (42) done You'll notice that there are four commands in this code. The for loop, the x variable evaluation, the lappend operations, and the loop control with the incr command. There are a lot more instructions in this code, with jump pointers from the x interpretation to the incr statement, the less than comparison, then a jump to the list append. Wrapping Up I went through an exercise years ago to see how far I could minimize the Solaris kernel before it stopped working. I personally got down into the twenties before the system was unusable, but I think the record was somewhere south of 15. So...what's the point? Minimal for minimal's sake is not the point. Meet the functional objectives, that is job one. But then start tuning. Less is more. Less objects. Less stack depth. Less instantiation. Reviewing bytecode is good for that, and is possible with the native Tcl code. However, it is still important to test the code performance, as relying on bytecode objects and stack depth alone is not a good idea. For example, if we look at the bytecode differences with matching an IP address, there is no discernable difference from Tcl's perspective between the two regexp versions, and very little difference between the two regexp versions and the scan example. % dis script { regexp {([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})} 192.168.101.20 _ a b c d } ByteCode 0x0x24cfd30, refCt 1, epoch 15, interp 0x0x2446670 (epoch 15) Source " regexp {([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-" Cmds 1, src 90, inst 19, litObjs 8, aux 0, stkDepth 8, code/src 0.00 Commands 1: 1: pc 0-17, src 1-89 Command 1: "regexp {([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9" (0) push1 0 # "regexp" (2) push1 1 # "([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})" (4) push1 2 # "192.168.101.20" (6) push1 3 # "_" (8) push1 4 # "a" (10) push1 5 # "b" (12) push1 6 # "c" (14) push1 7 # "d" (16) invokeStk1 8 (18) done % dis script { regexp {^(\d+)\.(\d+)\.(\d+)\.(\d+)$} 192.168.101.20 _ a b c d } ByteCode 0x0x24d1730, refCt 1, epoch 15, interp 0x0x2446670 (epoch 15) Source " regexp {^(\d+)\.(\d+)\.(\d+)\.(\d+)$} 192.168.101.20 _" Cmds 1, src 64, inst 19, litObjs 8, aux 0, stkDepth 8, code/src 0.00 Commands 1: 1: pc 0-17, src 1-63 Command 1: "regexp {^(\d+)\.(\d+)\.(\d+)\.(\d+)$} 192.168.101.20 _ " (0) push1 0 # "regexp" (2) push1 1 # "^(\d+)\.(\d+)\.(\d+)\.(\d+)$" (4) push1 2 # "192.168.101.20" (6) push1 3 # "_" (8) push1 4 # "a" (10) push1 5 # "b" (12) push1 6 # "c" (14) push1 7 # "d" (16) invokeStk1 8 (18) done % dis script { scan 192.168.101.20 %d.%d.%d.%d a b c d } ByteCode 0x0x24d1930, refCt 1, epoch 15, interp 0x0x2446670 (epoch 15) Source " scan 192.168.101.20 %d.%d.%d.%d a b c d " Cmds 1, src 41, inst 17, litObjs 7, aux 0, stkDepth 7, code/src 0.00 Commands 1: 1: pc 0-15, src 1-40 Command 1: "scan 192.168.101.20 %d.%d.%d.%d a b c d " (0) push1 0 # "scan" (2) push1 1 # "192.168.101.20" (4) push1 2 # "%d.%d.%d.%d" (6) push1 3 # "a" (8) push1 4 # "b" (10) push1 5 # "c" (12) push1 6 # "d" (14) invokeStk1 7 (16) done However, if you look at the time results from these examples, they are very different. % time { regexp {([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})} 192.168.101.20 matched a b c d } 100000 11.29749 microseconds per iteration % time { regexp {^(\d+)\.(\d+)\.(\d+)\.(\d+)$} 192.168.101.20 _ a b c d } 100000 7.78696 microseconds per iteration % time { scan 192.168.101.20 %d.%d.%d.%d a b c d } 100000 1.03708 microseconds per iteration Why is that? Well, bytecode is a good indicator, but it doesn't address the inherent speed of the commands being invoked. Regex is a very slow operation comparatively. And within the regex engine, the second example is a simpler regex to evaluate, so it's faster (though less accurate, so make sure you are actually passing an IP address.) Then of course, scan shows off its optimized self in grand fashion. Was this a useful exercise in understanding Tcl under the hood? Drop some feedback in the comments if you'd like more tech tips like this that aren't directly covering a product feature or solution, but reveal some utility that assists in learning how things tick.1.3KViews1like3CommentsAdvanced iRules: Getting Started with iRules Procedures
As Colin so eloquently puts it in the procs overview on Clouddocs: "Ladies and gentlemen, procs are now supported in iRules!" Yes, the rumors are true. As of BIG-IP version 11.4, you can move all that repetitive code into functional blocks the Tcl language defines as procedures, or procs for short. If you program in any other languages, function support is not new. If you don't, maybe you've at some point built a macro in a Microsoft Office product to perform a repetitive task, such as substituting diapers for Dwight every time it was typed as Jim Halpert did in NBC's The Office. Anywho...now that we have them, what can you do with them? The biggest benefit is code re-use. If you have a particular set of commands that you use multiple times in one or more rules, it makes sense to code that once and just call that code block when necessary. Another benefit of using a procedure is having one source to document/troubleshoot/update when necessary. Where Do They Go, and How Do I Use Them? With few exceptions, code blocks must be place within the context of an event. Procedures join the short list of exceptions of code blocks or commands that live outside an event. There are two options for placing procs, in the iRule where it will be called, or in another iRule altogether. In this first example, the proc is local. rule encodeHTML { proc html_encode { str } { set encoded "" foreach char [split $str ""] { switch $char { "<" { append encoded "<" } ">" { append encoded ">" } "'" { append encoded "'" } {"} { append encoded """ } "&" { append encoded "&" } default { append encoded $char } } } return $encoded } when RULE_INIT { # iRule that calls the html_encode proc: set raw {some xss: < script > and sqli: ' or 1==1# "} log local0. "HTML encoded: [call html_encode $raw]" } } The proc definition is pretty basic, it receives a string and splits it up so it can encode characters matched in the switch statement, and otherwise leaves the characters alone. Notice, however, in the INIT event the new call command: [call html_encode $raw]. This is where the proc is invoked. Now, let's look at a remote proc. rule library { proc html_encode { str } { set encoded "" foreach char [split $str ""] { switch $char { "<" { append encoded "<" } ">" { append encoded ">" } "'" { append encoded "'" } {"} { append encoded """ } "&" { append encoded "&" } default { append encoded $char } } } return $encoded } } rule encodeHTML { when RULE_INIT { # iRule that calls the html_encode proc: set raw {some xss: < script > and sqli: ' or 1==1# "} log local0. "HTML encoded: [call library::html_encode $raw]" } } Notice the subtle change to the call command? Because the proc is in a remote irule, it is necessary to set that rule's name as the namespace of the proc being called. A couple of notes: I'd recommend starting a library of procedures and store that library in the common partition so all application owners from different partitions can use it. This works from partition x: [call library::proc_a], but I'd include the partition to be safe: [call /Common/library::proc_a]. If creating a library of procs, and one proc calls another proc in that library, make sure to explicitly define the namespace in that call. Example below. proc sequence {from to} { for {set lst {}} {$from <= $to} {incr from} { lappend lst $from } return $lst } proc knuth_shuffle lst { set j [llength $lst] for {set i 0} {$j > 1} {incr i;incr j -1} { set r [expr {$i+int(rand()*$j)}] set t [lindex $lst $i] lset lst $i [lindex $lst $r] lset lst $r $t } return $lst } proc shuffleIntSequence {x y} { return [call procs::knuth_shuffle [call procs::sequence $x $y]] } The shuffleIntSequence proc, which is called from another iRule, makes calls to two more procs, knuth_shuffle_lst and sequence. Without explicitly setting the namespace, the Tcl interpreter is expecting the call to sequence to be local to the iRule calling it, and it will fail. Use Cases There are already a couple procs out in the codeshare, and are referenced as well on a page on Clouddocs specifically for procedures. Logging, math functions, canned response pages, and more have been tossed around as ideas for procs. What else would you like to see?4.3KViews1like1CommentAdvanced iRules: Sideband Connections
With the release of BIG-IP version 11 there are many, many new features and capabilities to come up to sped on. Not the least of which are the additions and changes to the already powerful iRules infrastructure. iRules, as a whole, is an amazingly powerful, flexible technology. We've seen it used for thousands of things over the years, and it has solved problems that many people thought were beyond fixing. That being said, there are some new tricks being added that will, simply put, blow you away. One such feature is the concept of a sideband connection. This is one of the most exciting features in the entire v11 release for those of us that are true iRules zealots out there, as it's something that we've been asking to have for years now, and adds tremendous power to the already vast iRules toolbox. So what is a sideband connection you ask? Think of a sideband connection as a way to access the world outside of an iRule. With this functionality you can create a connection (TCP or UDP) to any outside resource you choose, send a custom formatted request, await a response if applicable, act on that response, etc. The sky is really the limit. If you've done any kind of socket programming this should start sounding pretty familiar. The concept is very similar, open, send, receive, process. By allowing you to reach out to surrounding applications, APIs, servers and even, if you're tricky enough, databases...the possibilities of what you can do within your iRule open up considerably. Curious what permissions someone is supposed to have but they're stored in your LDAP system? No problem. Want to query that custom HTTP based API with some information about the current connection before deciding where to load balance to? Sure, you can do that. So how does it work? What does it look like? Well, the commands are pretty simple, actually. In this article I'll briefly cover the four main commands that allow you to establish, communicate via, and manipulate sideband connections within iRules. Connect First is the connect command, which is the command used to initiate the connection for use. This is what reaches out, does whatever handshaking necessary and actually establishes the connection with whatever remote server you'll be interacting with. The connect command looks like this: connect [-protocol TCP|UDP] [-myport <port>] [-myaddr <addr>] [-tos <tos>] [-status <varname>] [-idle <s>] [-timeout <ms>] <destination> As you can see, there are a lot of options, but the structure itself is pretty logical. You're calling the connect command, obviously, and specifying how you want to connect (TCP vs UDP), where you're connecting to (IP & Port), and then giving some basic information about that connection, namely timeout, idle time, etc. Now that you have an established connection, you're probably going to want to send some data through it. When you're initiating the connection you'll normally also specify a variable that represents the resulting handle for later use. This looks like: set conn [connect -timeout 3000 -idle 30 -status conn_status vs_test] log local0. "Connect returns: <$conn> and conn status: <$conn_status>" Send The above will give you a variable named "$conn" that you can then pass traffic to, assuming the connection was initiated successfully. Once things are up and running and your variable is set you can make use of the send command to pass traffic to your new connection. Keep in mind that you're going to be sending raw data, which will require whatever protocol specific formatting may be necessary for whatever it is you're trying to send. Below you'll see the simple HTTP GET request we'll be sending, stored in the variable "$data" to keep things obvious, as well as the send command that is used to send that data through the connection we opened above. Note that the send command also allows you to specify a variable to contain the status of the send so that you can later verify that the send went through properly. set data "GET /mypage/myindex2.html HTTP/1.0\r\n\r\n" set send_info [send -timeout 3000 -status send_status $conn $data] Recv Now that you have a functioning connection and are sending data through it to some remote host, you're likely curious what that host is saying in response, I would imagine. There is a command for that too, naturally. With the recv command you're able to consume the data that is being returned from your open connection. There are also sub-commands like -peek and -eol. I'll get into more detail about these later, but here's the short version: The -peek flag allows you to instantly return any received data without unbuffering, to allow you to inspect a portion of the data before the recv command has completed collecting the response. The -eol command looks for an end-of-line character before terminating the recv. A basic recv looks like this, storing the data in a new variable: set recv_data [recv -timeout 3000 -status recv_status 393 $conn] Close At this point you'll be able to inspect whatever data was returned by parsing the $recv_data variable (though this variable name is up to you). Now that you've initiated a connection, sent data to it, and collected the response, the only thing left to do, unless you want to repeat the send/recv of course, is to close the connection. That is done quite simply with close command. No arguments or tricks here, just a simple termination of the connection: close $conn If you put it all together, a simple example iRule would look something like this: when HTTP_REQUEST { set conn [connect -timeout 3000 -idle 30 -status conn_status vs_test] log local0. "Connect returns: <$conn> and conn status: <$conn_status> " set conn_info [connect info -idle -status $conn] log local0. "Connect info: <$conn_info>" set data "GET /mypage/myindex2.html HTTP/1.0\r\n\r\n" set send_info [send -timeout 3000 -status send_status $conn $data] log local0. "Sent <$send_info> bytes and send status: <$send_status>" # This magically knows that we’re getting 393 bytes back set recv_data [recv -timeout 3000 -status recv_status 393 $conn] log local0. "Recv data: <$recv_data> and recv status: <$recv_status>" close $conn log local0. "Closed; conn info: <[connect info -status $conn]>" } So there you have it, a simple intro to how you can manipulate sideband connections via iRules. Keep in mind that the devil is, of course, in the details and it will take some practice to harness the power of these new commands, but the power is most definitely there for the taking. Feel free to ask any questions either here or in the iRules Q&A and also keep your eye on the wiki for the official command pages for the new v11 commands which will be coming soon.4.8KViews2likes11Comments