Forum Discussion
Leslie_South_55 (Nimbostratus)
Apr 03, 2008
Collecting website statistics
Greetings DevCentral,
So the challenge was to be able to gather website statistics (client IP, pages hit, spiders, crawlers, search text, etc, etc, etc). There is a seemingly useful tool called AWStats, available on SourceForge. Very nice interface.
AWStats builds its stats page from web server logs, recording all the METHODS and RESPONSES. So now the challenge: we have 8 web servers with log files wrapping every hour, PLUS we have some iRules in front of this pool to either redirect or block certain request types, so the web servers don't see all the traffic.
Enter another iRule... and here is the code (partially built from several examples found here):
when HTTP_REQUEST {
   # build an Apache-style log line for this request
   set curtime [clock seconds]
   set formattedtime [clock format $curtime -format {%d/%b/%Y:%T %z}]
   set referer [HTTP::header value Referer]
   set useragent [HTTP::header value User-Agent]
   set log_format "[IP::client_addr] - - \[$formattedtime\] \"[HTTP::method] [HTTP::uri] HTTP/[HTTP::version]\" \"$referer\" \"$useragent\""
}
when HTTP_RESPONSE {
   # append status and response size, then ship to syslog facility local1
   log local1.info "$log_format [HTTP::status] [HTTP::payload length]"
}
Notice that I am using local1, which I defined in my syslog-ng.conf to send to a remote server, so as not to log to local disk on the BIG-IP.
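For reference, the syslog-ng.conf stanza is something along these lines (the source name and the remote address here are placeholders; the source name in particular varies by BIG-IP version):

filter f_local1 { facility(local1); };
destination d_remote { udp("192.0.2.50" port(514)); };
log { source(local); filter(f_local1); destination(d_remote); };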
Testing went great, BUT when I put this in production... LOOK OUT... my 6400 went bonkers: the alertd and syslog-ng daemons consumed 99.5% of the 2nd CPU, and I/O on the remote UNIX syslog box was equally pegged.
In 5 minutes we accumulated a 17MB log file with over 53,000 lines (each line representing one HTTP_RESPONSE, roughly 175 per second). Needless to say, I had to remove the rule.
The VS that we are trying to gather stats on maintains an average of 1000 HTTP connections.
I am looking for any advice or words of wisdom that could help me complete my challenge.
Thanks
-L
- hoolio (Cirrostratus)
I'm not sure whether you're hitting a resource limit on the BIG-IP trying to log ~175 events/sec or if the rule could be streamlined to improve the situation. Two changes should cut the per-request cost: drop the two clock invocations (the receiving syslog server stamps each line anyway), and take the response size from the Content-Length header instead of [HTTP::payload length], which avoids touching the response payload.
when HTTP_REQUEST {
   set log_format "[IP::client_addr] - - - \"[HTTP::method] [HTTP::uri] HTTP/[HTTP::version]\" \"[HTTP::header value Referer]\" \"[HTTP::header value User-Agent]\""
}
when HTTP_RESPONSE {
   log local1.info "$log_format [HTTP::status] [HTTP::header value Content-Length]"
}
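One caveat: if HTTP_RESPONSE can fire on a connection where HTTP_REQUEST never ran under this rule (for example, connections that were already open when the rule was applied), referencing $log_format will throw a Tcl runtime error. A guard like this (untested) sidesteps that:

when HTTP_RESPONSE {
   # only log if the request event actually built the log line
   if { [info exists log_format] } {
      log local1.info "$log_format [HTTP::status] [HTTP::header value Content-Length]"
   }
}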
- Leslie_South_55 (Nimbostratus)
I'm going to try your suggestions, and yes, we will have to do additional post-processing on the log file without the date/time stamp. I guess I could pull the LTM time stamp, as you mention, and just insert it in the correct place?
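Something like this untested tclsh sketch might do as a first pass. It assumes the remote syslog lines look like "Mon day HH:MM:SS host tag: message" and that Tcl 8.5 is available for clock scan -format; the regexp would need adjusting to the real syslog template:

#!/usr/bin/tclsh
# Untested sketch: move the syslog timestamp at the front of each
# remote-logged line into the [dd/Mon/yyyy:HH:MM:SS zone] slot that
# AWStats expects in combined log format.
# Assumed input: Apr  3 14:22:01 bigip1 tmm[123]: 10.1.2.3 - - "GET / HTTP/1.1" ...
set year [clock format [clock seconds] -format %Y]  ;# syslog lines omit the year
set tz   [clock format [clock seconds] -format %z]

while {[gets stdin line] >= 0} {
    if {[regexp {^(\w{3}) +(\d+) ([\d:]+) \S+ [^:]*: (\S+ - -)(?: -)? (.*)$} \
            $line -> mon day hms client rest]} {
        set secs [clock scan "$mon $day $hms $year" -format {%b %d %H:%M:%S %Y}]
        puts "$client \[[clock format $secs -format {%d/%b/%Y:%T}] $tz\] $rest"
    } else {
        puts stderr "unparsed: $line"
    }
}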
- Leslie_South_55 (Nimbostratus)
OK... so I took your suggestions with one addition (see below) and I am currently monitoring in production (after I tested in stage). We are about half as busy as we were the other day, averaging around 600 HTTP requests per second. alertd is running at 20% and syslog-ng at 0.9%, which gives me an average of just under 50% total CPU (not TMM) utilization.
- hoolio (Cirrostratus)
I think the CPU usage should increase linearly, but I'd be happy to have someone with more knowledge on this comment.
- Leslie_South_55 (Nimbostratus)
I have not tested without the clock command... that is something I will do next.
- Leslie_South_55 (Nimbostratus)
Removing the clock command in the log statement