Forum Discussion

Simon_83666's avatar
Simon_83666
Icon for Nimbostratus rankNimbostratus
Sep 22, 2008

syslog-ng & alertd CPU usage

We're having an issue with our new LTM 6400 box. With the following iRule enabled for logging, the syslog-ng & alertd processes are taking up almost 100% of one CPU during a performance testing. The second CPU which I think is dedicated to TMM is not doing much but we're not sure what can be done to reduce the CPU 00 usage without disabling logging. Is this something to do with the way the irules was written ?

 

 

 

when HTTP_REQUEST {

 

set client_ip [IP::client_addr]

 

set http_request [HTTP::request]

 

set http_request_time [clock clicks -milliseconds]

 

log local0. "Client connected, IP: $client_ip, HTTP request: $http_request "

 

}

 

 

when HTTP_RESPONSE {

 

set http_status [HTTP::status]

 

set lb_server [LB::server]

 

log local0. "Pool Info: $lb_server - HTTP Status: $http_status (request/response delta: [expr [clock clicks -milliseconds] - $http_request_time]ms)"

 

}

4 Replies

  • I'm curious to see Mike's response, but in the meantime...

     

     

    I think it's expected that alertd and syslog-ng will eat up CPU cycles when you're trying to log a lot of events. In general, I disable all but essential logging when putting a rule in production. You can either use a debug flag to only log when you need it, or, more efficiently, comment out the log statements.

     

     

    How many requests per second are being passed through the VIP with this rule? If you do want to log the request details, you could make the rule slightly more efficient by eliminating the intermediate variables:

     

     

    [code]

     

    when HTTP_REQUEST {

     

    set http_request_time [clock clicks -milliseconds]

     

    log local0. "Client connected, IP: [IP::client_addr], HTTP request: [HTTP::request]"

     

    }

     

    when HTTP_RESPONSE {

     

    log local0. "Pool Info: [LB::server] - HTTP Status: [HTTP::status] (request/response delta: [expr [clock clicks -milliseconds] - $http_request_time]ms)"

     

    }

     

    [/code]

     

     

    You could also remove the calculation of the delta and calculate that after the fact from the logs. Also, the HTTP::request could be very large. If you don't really need the full request logged, you could remove it.

     

     

    Aaron
  • logging and clock are really heavy for cpu. Do you have an idea about how much request/sec you have going through your BIGIP ?

     

     

    If you want to have an estimation you can use the timing command [url]http://devcentral.f5.com/wiki/default.aspx/iRules/timing.html[/url], [url]http://devcentral.f5.com/Default.aspx?tabid=63&articleType=ArticleView&articleId=123[/url]

     

     

    I did some testing on a 6400 Platform with v9.4.3 and here is what i found when i did my tests:

     

     

    [code]

     

    iRule:

     

    when HTTP_REQUEST timing on {

     

    }

     

     

    Result:

     

    +-> HTTP_REQUEST 25190 total 0 fail 0 abort

     

    | Cycles (min, avg, max) = (664, 683, 9969)

     

     

    when HTTP_REQUEST timing on {

     

    set actual_time [clock seconds]

     

    }

     

     

    Result:

     

    +-> HTTP_REQUEST 69031 total 69031 fail 0 abort

     

    | Cycles (min, avg, max) = (12461, 12601, 94735)

     

     

     

     

    Command name average cpu cycles CPU usage / request max requests / second to take (100% cpu with the iRule ONLY!)

     

    log 10064 0,000504871% 198070

     

    clock 12601 0,000632142% 158192

     

    [/code]

     

     

    HTH
  • Mike_Lowell_108's avatar
    Mike_Lowell_108
    Historic F5 Account
    I share hoolio's sentiment about only logging when you need it, and reducing what you log. For example, based on the calculation you're doing I'll bet you're wanting to track server responses that take too long. I'd consider changing the logic such that you *only* log requests that are past a certain threshold. i.e. if it takes longer than 5000ms, log it, otherwise do nothing.

     

     

    Logging at a high rate of speed requires either that you have fast disks, or that we be able to send log messages to a remote host that does. Since BIG-IP doesn't have fast disks, I suggest:

     

    a) Reduce how often you're logging

     

    b) Reduce how much you're logging

     

    c) Send the log messages elsewhere, directly with the log command

     

     

    Note the remote host syntax mentioned here:

     

    http://devcentral.f5.com/wiki/default.aspx/iRules/log.html

     

     

    This allows you to avoid syslog-ng on BIG-IP, and avoid the bottlenecks associated with it.

     

     

    Mike Lowell
  • Thanks guys, I actually contacted F5 support and managed to reduce the CPU usage significantly by doing the following:

     

     

    1) reduced the length of the log entry by only logging the HTTP response and not the entire HTTP request

     

     

    2) Since we also use a modified version of syslog-ng.conf to log things into different files, we've now remove all the regular expression from all the custom filters that we've created.

     

     

    3) By modifying the syslog-ng.conf again, I was able to filter out all of our log messages from the alertd process which by default examine all message logged.

     

     

    After the above 3 steps, the CPU 0 usage dropped from 100% to below 3% during load testing.