Performance Logging iRule (Rule_http_log)
Problem this snippet solves:
Here's a logging iRule. You'll need a HSL syslog pool to log too. Various bits gathered from other posts on DevCentral. Sharing in case there is interest.
Make sure your rsyslogd is setup to use the newer syslog format like RFC-5424 including milliseconds and timezone info. Includes Country (co) and logs individual request times for each request on a HTTP/1.1 connection. To configure F5 logging to use milliseconds and timezone, disable logging in the gui and use tmsh edit sys syslog and something like:
include "
# short hostnames
options {
use_fqdn(no);
};
# Remote syslog in RFC5424 - Tim Riker <Tim@Rikers.org>
destination remotesyslog {
syslog(\"10.1.2.3\" transport(\"udp\") port(51443) ts_format(iso));
};
log {
source(s_syslog_pipe);
destination(remotesyslog);
};
"
Uses upvar and proc. Tested on 11.6 - 15.1
This tracks connection info in a table and then copies that down to the per-request log() to handle reporting on http2.
This version works around a BIG-IP bug where HTTP::version does not report 2 or higher for http2 and later requests.
With http2 profiles, subsequent requests using the same connection can generate this error in the logs if HTTP::respond HTTP::redirect or HTTP::retry is called from and earlier iRule. Reorder your iRules to avoid this.
<HTTP_REQUEST> - No HTTP header is cached - ERR_NOT_SUPPORTED (line 1) invoked from within "HTTP::method"
How to use this snippet:
Add this iRule to whatever virtual hosts you desire. I always add it as the first rule. If you have a rule that sets headers you want to track, you may want this after the rule that sets headers.
Interesting Splunk queries can be created like:
index=* perflog | timechart avg(cpu_5sec) by host limit=10
to show load across multiple F5s.
index=* perflog | timechart max(upstream_time) by http_host limit=10
to show long request times by http_host
Any other iRule may add things to the log() array and those will get added to the single hsl output.
If you create a dg_http_log datagroup, that will be used to filter what gets logged.
Tested on version:
13.0 - 15.1
# Rule_http_log
# http logging - Tim Riker <Tim@Rikers.org>
# bits taken from this post:
# https://devcentral.f5.com/questions/irule-for-getting-total-response-time-server-response-time-and-server-connection-time
# iRule performance tracking
# https://devcentral.f5.com/questions/Timing-iRules
timing on
# timing is on by default in 11.5.0+ to see stats:
# tmsh show ltm rule Rule_http_log
#
# if the dg_http_log datagroup exists then vips or hosts/paths in dg_http_log that start with
# "NONE" no logging (really anything other than empty)
# "INFO" normal logging
# "FINE" full request and response headers and CLIENT_CLOSED
#
# upstream_time := 15000 in the datagroup to log all requests over 15 seconds
#
# example:
# "/Common/vs_www.example.com_HTTPS" := "FINE" - logged including CLIENT_CLOSED
# "www.example.com/" := "INFO" - logged
# "www.example.com/somepath" := "FINE" - full headers
# "www.example.com/otherpath" := "NONE" - not logged
when RULE_INIT {
# hostname up to first dot
set static::hostname [getfield [info hostname] "." 1]
}
# not calling /Common/proc:hsllog as this logs when the request occurred
# instead of the time it calls hsllog at the end of the request
proc hsllog {time mylog} {
upvar 1 $mylog log
# https://tools.ietf.org/html/rfc5424 <local0.info>version rfc-3339time host procid msgid structured_data log
# should be able to use a "Z" here instead of "+00:00" but our splunk logs don't handle that
# 134 = local0.info
set output "<134>1 [clock format [string range $time 0 end-3] -gmt 1 -format %Y-%m-%dT%H:%M:%S.[string range $time end-2 end]+00:00] ${static::hostname} httplog [TMM::cmp_group].[TMM::cmp_unit] - -"
foreach key [lsort [array names log]] {
if { ($log($key) matches_regex {[\" ;,:]}) } {
append output " $key=\"[string map {\" "|"} $log($key)]\""
} else {
append output " $key=$log($key)"
}
}
# avoid marking virtual server up when hsl pool is up
# https://support.f5.com/csp/article/K14505
set hsl pool_syslog
HSL::send [HSL::open -proto UDP -pool $hsl] $output
}
when CLIENT_ACCEPTED {
# calculate and track milliseconds
# is this / 1000 guaranteed to be clock seconds? TCL docs say no, but it looks like on f5 it is.
set tcp_start_time [clock clicks -milliseconds]
set log(loglevel) 0
if { [class exists dg_http_log] } {
# virtual name entries need to be full path, ie: /Common/vs_www.example.com_HTTP
switch -- [string range [class match -value -- [virtual name] equals dg_http_log] 0 3] {
"FINE" {
set log(loglevel) 2
}
"INFO" {
set log(loglevel) 1
}
default {
set log(loglevel) 0
}
}
}
table set -subtable [IP::client_addr]:[TCP::client_port] loglevel $log(loglevel)
table set -subtable [IP::client_addr]:[TCP::client_port] tmm "[TMM::cmp_group].[TMM::cmp_unit]"
table set -subtable [IP::client_addr]:[TCP::client_port] client_addr [IP::client_addr]
table set -subtable [IP::client_addr]:[TCP::client_port] client_port [TCP::client_port]
table set -subtable [IP::client_addr]:[TCP::client_port] cpu_5sec [cpu usage 5secs]
table set -subtable [IP::client_addr]:[TCP::client_port] virtual_name [virtual name]
set co [whereis [IP::client_addr] country]
if { $co eq "" } {
set co unknown
}
table set -subtable [IP::client_addr]:[TCP::client_port] co $co
}
when HTTP_REQUEST {
set http_request_time [clock clicks -milliseconds]
set keys [table keys -subtable [IP::client_addr]:[TCP::client_port]]
foreach key $keys {
set log($key) "[table lookup -subtable "[IP::client_addr]:[TCP::client_port]" "$key"]"
}
if {[HTTP::has_responded]} {
# The rule should come BEFORE any rules that do things like redirects
set log(http_has_responded) [HTTP::has_responded]
set log(loglevel) 1
set log(event) HTTP_REQUEST
call hsllog $http_request_time log
return
}
if { [class exists dg_http_log] } {
set logsetting [class match -value -- [HTTP::host][HTTP::uri] starts_with dg_http_log]
if { $logsetting ne "" } {
# override log(loglevel) if we found something
switch -- [string range $logsetting 0 3] {
"FINE" {
set log(loglevel) 2
}
"INFO" {
set log(loglevel) 1
}
default {
set log(loglevel) 0
}
}
}
}
set log(http_host) [HTTP::host]
set log(http_uri) [HTTP::uri]
set log(http_method) [HTTP::method]
# request_num might not be accurate for HTTP2
set log(request_num) [HTTP::request_num]
set log(request_size) [string length [HTTP::request]]
# BUG http2 reported as http1 in pre 16.x
# https://cdn.f5.com/product/bugtracker/ID842053.html
set log(http_version) [HTTP::version]
if { [catch \[HTTP2::version\] result] == 1 } {
if { $result contains "Operation not supported" } {
#log local0. "HTTP version is: [HTTP::version]"
} else {
set h2ver [eval "\HTTP2::version"]
# we might have http2 support, but not be http2
if { $h2ver != 0 } {
set log(http_version) $h2ver
}
}
}
#log local0. "http_version = $log(http_version)"
if { $log(loglevel) > 1 } {
foreach {header} [HTTP::header names] {
set log(req-$header) [HTTP::header $header]
}
} else {
foreach {header} {"connection" "content-length" "keep-alive" "last-modified" "policy-cn" "referer" "transfer-encoding" "user-agent" "x-forwarded-for" "x-forwarded-proto" "x-forwarded-scheme"} {
if { [HTTP::header exists $header] } {
set log(req-$header) [HTTP::header $header]
}
}
}
}
when LB_SELECTED {
set lb_selected_time [clock clicks -milliseconds]
set log(server_addr) [LB::server addr]
set log(server_port) [LB::server port]
set log(pool) [LB::server pool]
}
when SERVER_CONNECTED {
set log(connection_time) [expr {[clock clicks -milliseconds] - $lb_selected_time}]
set log(snat_addr) [IP::local_addr]
set log(snat_port) [TCP::local_port]
}
when LB_FAILED {
set log(event_info) [event info]
}
when HTTP_REJECT {
set log(http_reject) [HTTP::reject_reason]
}
when HTTP_REQUEST_SEND {
set http_request_send_time [clock clicks -milliseconds]
}
when HTTP_RESPONSE {
set log(upstream_time) [expr {[clock clicks -milliseconds] - $http_request_send_time}]
set log(http_status) [HTTP::status]
if { $log(loglevel) > 1 } {
foreach {header} [HTTP::header names] {
set log(res-$header) [HTTP::header $header]
}
} else {
foreach {header} {"cache-control" "connection" "content-encoding" "content-length" "content-type" "content-security-policy" "keep-alive" "last-modified" "location" "server" "www-authenticate"} {
if { [HTTP::header exists $header] } {
set log(res-$header) [HTTP::header $header]
}
}
}
# if logging is off, but upstream_time is over threshold in datagroup, log anyway
if { ($log(loglevel) < 1) && [class exists dg_http_log] } {
set log_upstream_time [class match -value -- upstream_time equals dg_http_log]
if {$log_upstream_time ne "" && $log(upstream_time) >= $log_upstream_time} {
set log(over_upstream_time) $log_upstream_time
set log(loglevel) 1
}
}
}
when HTTP_RESPONSE_RELEASE {
if { [info exists http_request_time] } {
set log(http_time) "[expr {[clock clicks -milliseconds] - $http_request_time}]"
# push http_time into table so CLIENT_CLOSED can see it in HTTP/2
table set -subtable [IP::client_addr]:[TCP::client_port] http_time $log(http_time)
} else {
set http_request_time [clock clicks -milliseconds]
}
set log(event) HTTP_RESPONSE_RELEASE
if { $log(loglevel) > 0 } { call hsllog $http_request_time log }
}
when HTTP_DISABLED {
set log(http_passthrough_reason) [HTTP::passthrough_reason]
}
when CLIENT_CLOSED {
# grab log() values from table
set keys [table keys -subtable [IP::client_addr]:[TCP::client_port]]
foreach key $keys {
set log($key) "[table lookup -subtable "[IP::client_addr]:[TCP::client_port]" "$key"]"
}
set log(tcp_time) "[expr {[clock clicks -milliseconds] - $tcp_start_time}]"
set log(event) CLIENT_CLOSED
# http_time didn't get set, log here (HTTP_RESPONSE_RELEASE never called, catch redirects, aborted connections)
if { not ([info exists log(http_time)]) } {
if { [info exists http_request_time] } {
# called HTTP_REQUEST but not HTTP_RESPONSE_RELEASE using HTTP 1.0 or 1.1
set log(http_time) "[expr {[clock clicks -milliseconds] - $http_request_time}]"
}
call hsllog $tcp_start_time log
} elseif { $log(loglevel) > 1 } {
call hsllog $tcp_start_time log
}
# clean out table when client disconnects
table delete -subtable [IP::client_addr]:[TCP::client_port] -all
}
- H_W_Huang_21714Nimbostratus
so usefull iRulse , Thank you very much!
and I suggest make some change to help us to understanding:
total_time ----> total_http_time
tcp_time ----> total_tcp_time
- TimRikerCirrocumulus
Thanks for the suggestions! I renamed total_time to http_time as that's what it represents. I've also added logging of some relevant headers from both the request (req-) and response (res-). We add a few more to that list that are relevant inside our networks.
I updated the iRule here with my changes.
I added logging the co (country) of the upstream request using the f5 whereis IP geo tables.
- TimRikerCirrocumulus
Moved the HSL pool name to a static to avoid marking the virtual server up based on hsl pool status
https://support.f5.com/csp/article/K14505
- TimRikerCirrocumulus
added some support for http2 requests. Note: they are incorrectly logged as http 1.1 requests as HTTP::version returns 1.1 on http2 requests.
- antec42Altostratus
I found that this upvar statement doesn't work (it crashes later when entering the foreach loop):
upvar 1 $mylog log
From what I understand, you're passing "log" already as a parameter to the hsllog proc and then do a reference from $mylog (which should be a copy of log?) back to yourself? I can't wrap my head around that. Any way I changed the name of the "log" variable inside hsllog proc to "alog", i.e. "upvar 1 $mylog alog" and used alog in the foreach loop instead. That works as expected…
Also you can check http2 version by using this instead:
if {[HTTP2::version] != 0} {
set log(http_version) [HTTP2::version]
} else {
set log(http_version) [HTTP::version]
}
Kind Regards,
Marcus
- TimRikerCirrocumulus
What BIG-IP version are you using? I've tested from 11.x up through 14.x so far.
Note: if you reference HTTP2::version then you can ONLY apply the rule to VIPS with http2 enabled. I want to use the same iRule on any http vip whether it has compression, client / server encryption enabled, http2 enabled. ipv6 listen address, etc. I would prefer to be able to apply the rule to Layer 4 vips too, but I currently maintain 2 versions of the script. One for http vips and one for l4 vips.
The point is that when an http2 request is made, HTTP::version reports an incorrect version. This is clearly a bug. It's incorrect information.
- TimRikerCirrocumulus
I'm now using an eval with escaping to allow the iRule to be applied to both http2 and http1 vips and still report the correct version in http_version. I have not yet looked into http3/QIC vips.