Published 27-Nov-2017 11:17, last edited 05-Jun-2023 22:07 by JimmyPackets
F5 introduced iRules to its BIG-IP product offerings a decade ago, and they have been a great success ever since. BIG-IP users write iRule scripts to extend TMOS functionality for protocol parsing, traffic manipulation, statistics collection, and all sorts of other application delivery tasks. Millions of iRule scripts run around the clock, around the world, for mission-critical functions. Several factors contribute to the success of iRules.
First, iRules are integrated directly into the TMOS microkernel architecture. There is no separate process running a virtual machine on the sideline; instead, a customized Tcl interpreter is fully integrated within the traffic management microkernel. All the traffic data processing, algorithm execution, result caching, and memory management happen directly within the TMOS core. This design provides a high-performance environment for iRule execution.
Second, iRules are a genuine part of the BIG-IP solution. When the product development team takes on a new BIG-IP feature, be it a new software module or hardware acceleration, how iRules can best be used in the overall solution is one of the first and most important design considerations. This makes iRules integral throughout the vertical system stack and the horizontal feature offerings. Users can rely on a rich Tcl extension to quickly develop powerful applications.
As of the BIG-IP v13.1.0 release, there are over 800 iRule commands covering more than 100 network applications.
Third, there is a vibrant and resourceful iRules community. DevCentral provides a key platform for F5 and users to exchange experiences and ideas. This quick feedback loop gives the product development team great insights into trending use cases.
This is great, definitely!
Yet we know that iRules as a development environment is far from perfect. Specifically, when the iRule infrastructure team surveys users, we often get asked two questions: how can iRule execution be debugged beyond inserting log commands, and where does an iRule actually spend its time? Being able to scope iRule execution is a long-sought-after feature among users, F5 Support, and product development. Primarily, the requests focus on two types of information: execution tracing and performance metrics.
iRule users have historically used the crudest of debug facilities, the log command, to dump data to log files to trace iRule execution. Each time users insert a debug log, they need to reload the configuration, rerun their test traffic, consult the log file, consider what it shows, insert further debug log statements, and do it all over again until they are content with the results. This is very time consuming. To make the problem worse, the BIG-IP logging facility can mangle the output: when debug logs run at high frequency, the output can be truncated.
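A hypothetical example of this workflow (the host value and the URI rewrite are made up for illustration):

```tcl
when HTTP_REQUEST {
    # Temporary debug statements sprinkled through the script; each new
    # statement means reloading the config and rerunning test traffic
    log local0.debug "HTTP_REQUEST fired: [HTTP::method] [HTTP::uri]"
    if { [HTTP::host] eq "www.example.com" } {
        log local0.debug "host matched, rewriting URI"
        HTTP::uri "/app[HTTP::uri]"
    }
}
```

When the debug pass is done, every one of these statements has to be found and removed again, which is another config reload.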
Because of this problem, users have been asking for a new iRules-specific debug facility so that they can conveniently inspect run-time execution data.

One example is proc resolution. The proc command provides a polymorphic language feature, and very often the proc callee is only finalized at run time when there are multiple procs with the same name. Again, because of the event-driven programming model, figuring out which proc is called, and when, is critical to iRules debugging.

On the performance side, while iRules timing stats provide metrics such as minimum, maximum, and average execution time at the script level, they are too coarse to provide much architectural insight. When users suspect an iRule script has performance issues, they normally have to rely on past experience to guess at the bottleneck; there is no data to guide them to the critical path. In other words, when an iRule is suspected of "running slow", is it really something users can improve? If so, what is the margin? Several aspects come into play.
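To make the proc ambiguity above concrete, here is a minimal sketch, assuming two iRules on the same virtual server each define a proc with the same name (the iRule and proc names are hypothetical):

```tcl
# iRule "rule_a"
proc normalize { uri } {
    return [string tolower $uri]
}
when HTTP_REQUEST {
    # Which "normalize" body runs here is only settled at run time;
    # without tracing, there is no easy way to see which one it was
    set u [call normalize [HTTP::uri]]
}

# iRule "rule_b", attached to the same virtual server
proc normalize { uri } {
    return [string trimright [string tolower $uri] "/"]
}
```

With several such definitions in play, a log file alone rarely tells you which body actually executed for a given request.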
Because of these issues, iRule performance tuning has been constrained by personal expertise rather than guided by systematic solutions. Users painstakingly fix all the bugs in their script; traffic management then executes correctly, but not as fast as they expected, and it normally entails a trial-and-error journey to tune up the performance.
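For reference, the script-level stats mentioned above come from the existing iRule timing facility, enabled per event (the event body here is hypothetical). The resulting minimum/average/maximum figures cover the whole event script as one unit, with no per-command breakdown:

```tcl
when HTTP_REQUEST timing on {
    # Timing stats are collected for this entire event script as a whole;
    # nothing tells you how the time divides between log and HTTP::host
    log local0. "Host: [HTTP::host]"
}
```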
Specifically, the following insight would greatly help in understanding a performance bottleneck: how much time execution spends in each event, each proc, and each command. (In Tcl, proc is itself a command, as this conforms to the Tcl convention.)

So we understand the problem, and we would love to solve it, because the same problems bother us (maybe even more so, because we are honestly stunned time and again by multi-thousand-line iRules that are escalated to us, where the case's subject indicates that it is an important situation, they need help with the iRules, and it is urgent. Ugrrr).
Yes, I know you are itching to learn how we will address this. But before that, we need a common ground on which to describe the solution.
Figure 1 iRule execution model
Figure 1, iRule execution model, illustrates an iRule execution from the perspective of its timing breakdown. Note that it is a timing breakdown, not necessarily the exact sequencing. Let us walk through the diagram.
You can see that along the execution there are various named points. These are the junctures between two execution phases. Let us call them "occurrences". We will run into this term again, because it is part of the naming convention.
To help illustrate the diagram, let us use the following simple iRule (List 1) as an example:
when HTTP_REQUEST {
log local0. "It is from [HTTP::host]"
}
First, after TMM finishes the initial packet processing, execution comes to "iRules land" by firing an iRule event; "Event Entry" is the occurrence that marks entering iRules land. In List 1's example, the execution starts at "when HTTP_REQUEST".
Because there could be multiple scripts for the same event (e.g. another HTTP_REQUEST script), locating the right script takes some time. After that, the execution comes to the rule script, "Rule Entry" in the diagram; figuratively, we can think of the "{" in List 1. This is the second occurrence.
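For instance, two iRules attached to the same virtual server can both handle HTTP_REQUEST; TMM must locate each script and fire them in priority order (a hypothetical sketch, with made-up log messages):

```tcl
# First iRule on the virtual server
when HTTP_REQUEST priority 100 {
    log local0. "first HTTP_REQUEST script"
}

# Second iRule: same event, located and fired after the first
when HTTP_REQUEST priority 500 {
    log local0. "second HTTP_REQUEST script"
}
```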
Now it is ready to execute the Tcl code, so the execution comes to the Tcl VM at "Rule VM Entry", the third occurrence.
We all know Tcl scripts are compiled to bytecode before execution. The Tcl virtual machine executes one bytecode instruction at a time, and so "Bytecode" is the fourth occurrence.
When a bytecode is "invoke", the VM calls the native implementation of the command. This fifth occurrence is "VM Entry".
In List 1's example, the invoked command is log or HTTP::host. Both of these are extension commands (provided by TMM and not part of the core Tcl runtime), so the execution leaves the Tcl virtual machine; this is the "Command Entry" occurrence.
OK, by this point we have walked through the whole timing hierarchy, from top to bottom. Execution continues, and the following occurrences walk the hierarchy back upwards: "Command Exit", "VM Exit", "Rule VM Exit", "Rule Exit" and "Event Exit". In order to preserve the clarity of the diagram, these occurrences are not labeled; hopefully it is straightforward to locate them.
So, except for "Bytecode", all occurrences come in pairs.
The last occurrence, which like the exit ones is not shown in the diagram, is "Variable Modification". It fires when a Tcl variable changes value.
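For example, each set below changes a Tcl variable's value, and each change would register as a "Variable Modification" occurrence (a hypothetical snippet):

```tcl
when HTTP_REQUEST {
    # Two variable modifications: the initial assignment and the update
    set host [HTTP::host]
    set host [string tolower $host]
}
```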
At this point we know what timing metrics are needed in order to tune up iRule performance. In the next article we will present the new iRules features introduced in the BIG-IP v13.1 release. We will use real examples to trace execution and digest timing data. Be prepared for a deep dive.
Stay tuned.
Authors: Jibin Han, Bonny Rais
Any chance you could re-do the diagram and change the color of 'VM entry'? The contrast between the chosen color and the background is quite low.
Is there any significance to 'bytecode' appearing twice in the TCL VM context?