iRules Concepts: Logging, a Deeper Understanding

Multiple times in recent iRules presentations, whether on the road or here within F5, there have been questions raised when the topic of logging within iRules gets brought up. Specifically people are curious about logging best practices, performance impact, when to log or not, and how to ensure they're not bogging down their device while still recording the necessary data. As such, it seems like a pertinent topic to demystify a bit by way of background and explanation.

First, to understand the possible performance and frankly functional implications of logging from within an iRule you need to understand a bit about the TMM and how it interacts with the *nix based host OS that maintains management level services while TMM does the heavy lifting on traffic.

What's a TMM?

TMM stands for Traffic Management Microkernel. This is the core of our TMOS (Traffic Management Operating System) infrastructure. Whenever anyone talks about TMOS they are talking about one or more TMMs running to perform the job of passing, inspecting and manipulating traffic, among other things. This is important to understand because the TMM performs a certain set of functions, but is not a complete OS in a traditional sense. Years ago, when engineering version 9 of BIG-IP, we needed a way to achieve not only high performance but also scalability, modularity and reliability while processing traffic. As such, TMM was born. Without going into nuts & bolts too deeply, the idea is that there is a custom, purpose built kernel designed to do nothing but process network traffic as effectively and efficiently as possible. Complete with custom TCP/IP stack, session management and much more, TMM handles all of the traffic being passed over the wire and through your BIG-IP.

It is also where all of the processing surrounding said traffic is handled. Profile parsing, most module influences, iRules...the magic really happens within the TMM. This is relevant and important because everything that you do inside an iRule takes place within the TMM. It handles the event structure, commands, memory management...all of it. Whenever you're invoking an iRule you're calling on TMM to do the work and output the result regardless of whether that's discarding an unsavory access attempt or writing new data to the session table.

So how does that affect logging?

So that's TMM, and understanding that is great, but how does it pertain to logging and performance? TMM is blazingly fast and does its job extremely well. It does not, however, do things like run sshd or Apache, or other generic "User" processes such as, say, syslogd. Those "User" processes aren't things that are directly in the path of traffic, generally speaking, and are handled by the host OS that is the support system that TMM relies on to handle those sort of things. The host OS operates with different memory and CPU restrictions, different performance limitations and generally behaves as a completely different entity than the TMM(s) that are processing traffic. This being the case, things that involve processes in user space are generally not put in the traffic path as it can, in some cases, slow down traffic processing given that the host OS doesn't run nearly as efficiently as TMM does.

That ceases to be the case, however, when you put basic, on box logging in place from within an iRule. When you set up a basic log statement within your iRule it will send that information out the user land and feed it to syslogd, which is the logging engine configured on the system. Syslogd is designed to log management events, errors on the system, etc. and does so quite well. It can, however, become overwhelmed if you throttle it with requests. One of the easiest ways to do that is to, say, put 20 log statements in an iRule that then receives 50,000 requests per second. Don't misunderstand, production iRules making use of the basic log command are common and there is nothing necessarily wrong with them as a rule, but it's important to understand the number of log requests that will be sent to syslogd and gauge whether or not that is acceptable. There is no hard and fast rule for this, so I can't tell you "only ever log x requests per second on this device", because that is completely subjective based on box utilization, application needs and about 100 other variables.

Are there other options?

There are indeed other options, a couple of them in fact. Both of the, however, involve shipping logs to a remote syslog server. The most reliable, simplest way to increase your logging throughput while ensuring you don't impact the performance of your BIG-IP is to ship those logs off box to a system dedicated to handle that type of load. This isn't necessary for many deployments as the logging rate required most times doesn't warrant it. When you need more logging headroom though, this is the best way to go about it. For the purposes of this article I'll assume you're running on a recent version (11.0 or later), though some of these features were available much earlier (10.0 in some cases).

First up is a simple change to the log statement that will signal the LTM to send the log message directly off box rather than to the host OS to be processed by syslog. You can see the simple difference here:

General, default logging statement:

   1: #log [-noname] <facility>.[<level>] <message>
   2: #E.G: 
   3: log local0. "Default log messages look like this"


Remote, in-line logging statement:

   1: #log [-noname] <remote_ip>[:<remote_port>] <facility>.[<level>] <message>
   2: #E.G: 
   3: log 172.27.122.54 local0.info "In line remote logging looks like this"

As you can see, it's quite an easy change to make in your log statements and, in the case of environments that require a high rate of logging, it will have a significant impact on performance and logging overhead.

The other option is the HSL commands, which allow you more granular control over exactly how and what you want to send your remote systems, but make use of the exact same exit point from the BIG-IP to transmit the remote logging data.

The syntax of the HSL (High Speed Logging) commands looks like this:

   1: HSL::open -proto <UDP|TCP> -pool <poolname>
   2: HSL::send <handle> <data>

And a simple, example of them in an iRule, straight from the wiki, is:

 

   1: when CLIENT_ACCEPTED {
   2:    set hsl [HSL::open -proto UDP -pool syslog_server_pool]
   3: }
   4: when HTTP_REQUEST {
   5:    # Log HTTP request as local7.info; see RFC 3164 Section 4.1.1 - "PRI Part" for more info
   6:    HSL::send $hsl "<190> [IP::local_addr] [HTTP::uri]\n"
   7: }

The above example shows a very simple request in which the outcome is easy to reproduce and doesn't stray far at all from the in line option shown earlier, but the HSL commands offer drastically more possibilities. While they can be used simply, they are also a ticket to complete customization of log messages being sent off box wherever and however you want them to be. There are many possibilities here and I won't try to enumerate them, but they are highly flexible commands and allow you much more granular control over off-box message shipping. If the simple, in-line option doesn't meet your needs, HSL almost certainly will.


Why is this so much faster?

What this small change signals to the LTM is that there is a remote logging system that this log message should be sent to, and rather than directing this message to the host OS it will instead send it out of a high-speed output specifically designed to efficiently send messages out of TMM without involving the host OS whatsoever. That last bit is very important. This is a way to allow TMM to directly fire messages out to your remote logging system, which allows for a much, much higher rate of logging while maintaining incredibly low performance implications. This means that rather than being affected by the host OS' performance (keep in mind that logging is one of the only ways that this happens within BIG-IP), that TMM is able to run unhindered, dishing log messages out as fast as it needs to without anything slowing it down.

Does this mean I should never log locally?

No, not at all. Local logging is absolutely applicable and has its place in many deployments. Whether it's debugging or production logging, there is no issue with logging locally from within an iRule unless you require an extremely high rate of logging either due to many log messages in a given iRule (or many iRules logging at once) combined with a high amount of request throughput. There's no golden rule as to the tipping point where you should move from local logging to remote, but it's not something that is run into lightly. If you're working in a high throughput environment you have likely run into this type of issue elsewhere, and are already on the lookout for workarounds such as this.

As I've told multiple people recently at both User Groups and training sessions, logging in an iRule is nothing to be afraid of. There are limitations and best practices, as with most things, but there is no reason to avoid it whatsoever. Rather, it's best to understand your options, the needs of your environment, and how those two things coalesce. Once you have that understanding, it's smooth sailing. Hopefully this helps clarify how iRules logging works and allows you to pick the appropriate option.

Published Jun 18, 2012
Version 1.0
  • Hi Andreas,

     

    Thank you for this iRule, it gave us a good start.

     

    I have a few additions to your iRule:

     

    1) You are overwriting $version on the response so I think it should be (as an example):

     

    in the request section set version_req [HTTP::version]

     

    in the response section set version_res [HTTP::version]

     

    then in syslog you need a custom rule to log them individually unless you don't really care then you should only log the version from the user since this is by default (in your case you have the version from the server on the response).

     

    2) We started encoding all user input because we hit an issue when you can send "\n" (new line) in a different encoding so the HSL function will send two lines which means syslog will "drop" the rest of the request (you will never see the full request).

     

    Best, Rafal.

     

  • It took me a while to get this going because I did not realize that HSL would not send anything out the management interface, but given how it bypasses the OS layer, this makes sense. Also, 'referer' is misspelled, which caught me as well. Here is my apache style logging rule.

     

    {code}

     

    when HTTP_REQUEST {

     

    set hsl [HSL::open -proto UDP -pool syslog_pool]

     

    set agent [HTTP::header "User-Agent"]

     

    set client [IP::client_addr]

     

    set method [HTTP::method]

     

    set referrer [HTTP::header "Referer"]

     

    set uri [HTTP::uri]

     

    set version [HTTP::version]

     

    set vip [HTTP::host]:[TCP::local_port]

     

    }

     

     

    when HTTP_RESPONSE {

     

    set time [clock format [clock seconds] -format "%d/%b/%Y:%H:%M:%S %z"]

     

    set bytes [HTTP::header "Content-Length"]

     

    set status [HTTP::status]

     

    set version [HTTP::version]

     

    log connection info

     

    Log HTTP request via syslog protocol as local7.info; see RFC 3164 for more info

     

    local3 = 19; info = 6; 8*19 + 6 = 158

     

    HSL doesn't actually talk 'syslog-ng' must be something else.

     

    HSL::send $hsl "<158> $vip $client - - \[$time\] \"$method $uri HTTP\/$version\" $status $bytes \"$referrer\" \"$agent\"\n"

     

    }

     

    {code}