High speed logging and round robin load balancing issues

Pieter_72260
Nimbostratus

Hi,

 

We have a couple of F5 servers that have an iRule that uses HSL to send log messages to a pool of remote rsyslog daemons.

 

The rsyslog daemons are added to one pool, and the pool is configured to distribute log messages in a round robin way.

 

All the nodes in the pool are up and running and are in a green state.
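In simplified form, the iRule does something like this (the pool name and message format here are placeholders; the real rule logs more detail):

when CLIENT_ACCEPTED {
    # one HSL handle per client connection, pointed at the logging pool
    set hsl [HSL::open -proto TCP -pool hsl_pool]
}
when HTTP_REQUEST {
    # remember request details for logging later
    set uri [HTTP::uri]
}
when HTTP_RESPONSE {
    # one syslog-style line per response
    HSL::send $hsl "<190> [IP::client_addr] $uri\n"
}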

 

 

I would expect the log messages to get distributed evenly across all nodes in that logging pool.

 

However, it looks like round robin balancing is not being used.

 

All the messages sent with HSL are always sent to the same rsyslog daemon.

 

Only when log rotation briefly kills rsyslog is another rsyslog daemon selected.

 

Is there some other stickiness setting that we're forgetting? Or does HSL not support the load balancing configuration of the pool?

 

Or is there another setting that I'm forgetting?

 

Any ideas?

 

Thanks,

 

Pieter

 

17 REPLIES

hoolio
Cirrostratus
Hi Pieter,

 

 

Can you post the HSL pool config using 'tmsh list ltm pool hsl_pool' and reply with the anonymized output?

 

 

Thanks, Aaron

Pieter_72260
Nimbostratus

Hi Aaron,

 

Below is the pool config:

 

ltm pool hsl_pool {
    members {
        10.197.15.45:514 {
            session monitor-enabled
        }
        10.197.15.46:514 {
            session monitor-enabled
        }
    }
    monitor tcp
}

 

One colleague thinks it might have something to do with the persistence settings that are used in the iRule:

 

when HTTP_REQUEST priority 100 {
    if { "POST" eq [HTTP::method] } {
        persist none
    } else {
        persist carp [HTTP::uri]
    }
}

 

Thanks in advance,

 

Pieter

 

What_Lies_Bene1
Cirrostratus
The persistence method you mention applies to the HTTP traffic handled by the virtual server, not to the HSL traffic. The HSL pool configuration looks good. Do you know if the F5 is opening multiple connections to the logging pool, or just one?
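An easy way to check (just a sketch, run on the BIG-IP itself) is to capture the server-side traffic to port 514 and see whether all the log messages ride over a single long-lived connection:

tcpdump -nni 0.0 port 514

If every message comes from the same source port to the same pool member, the HSL connection is being reused rather than re-balanced per message.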

nitass
F5 Employee
what event is HSL::open in? is it CLIENT_ACCEPTED?

 

 

if so, have you tried moving HSL::open to HTTP_REQUEST (i just want to check whether my guess is correct)?
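i.e. something roughly like this (just an untested sketch; the pool name and message format are assumed):

when HTTP_REQUEST {
    # open per request instead of per client connection
    set hsl [HSL::open -proto TCP -pool hsl_pool]
    HSL::send $hsl "<190> [IP::client_addr] [HTTP::uri]\n"
}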

Pieter_72260
Nimbostratus
Hi nitass, Steve,

 

 

HSL::open is indeed in CLIENT_ACCEPTED, and we use HSL::send in CLIENT_CLOSED, HTTP_RESPONSE and LB_FAILED.
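So roughly this skeleton (simplified; the actual log messages are omitted here):

when CLIENT_ACCEPTED {
    set hsl [HSL::open -proto TCP -pool hsl_pool]
}
when HTTP_RESPONSE {
    HSL::send $hsl "..."
}
when LB_FAILED {
    HSL::send $hsl "..."
}
when CLIENT_CLOSED {
    HSL::send $hsl "..."
}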

 

I'll check with our SysEng team whether they can see connections or health checks being executed.

 

 

Thanks,

 

Pieter

Pieter_72260
Nimbostratus
And some more info.

 

 

We've run a tcpdump on the syslog daemons (using tcpdump -w f5.cap -s 0 port 514).

 

On both rsyslog daemons we can see the TCP health checks from both F5 load balancers (we have two for failover purposes).

 

So all of that seems right, but we still only get logs on one rsyslog daemon.

 

 

Thanks,

 

Pieter

nitass
F5 Employee
i did test on my 10.2.4 unit and got the same issue (i.e. the hsl log is sent to only one pool member; additionally, only the 3-way handshake is done). i think it could be a bug.

 

 

have you opened a support case? if not yet, would you mind logging a case to verify?

Pieter_72260
Nimbostratus
Hi Nitass,

 

 

We've created a support case: C1289376

 

 

I've seen this behavior on some of our load balancer pairs.

 

But not all of them seem to have an issue with it.

 

And once we restart the rsyslog daemons for these LB pairs, the issue happens as well.

 

 

Thanks,

 

Pieter

 

 

nitass
F5 Employee
"But not all of them seem to have an issue with it." are they using tcp syslog? i have not yet tested, but my understanding is that udp syslog could be okay.

Pieter_72260
Nimbostratus
They are all using TCP syslog. I'll see if I can try and test UDP syslog.

nitass
F5 Employee
udp syslog seems to be working fine.

 

 

[root@ve10:Active] config b virtual bar list
virtual bar {
   snat automap
   pool foo
   destination 172.28.19.252:80
   ip protocol 6
   rules myrule
   profiles {
      http {}
      tcp {}
   }
}
[root@ve10:Active] config b rule myrule list
rule myrule {
   when CLIENT_ACCEPTED {
      set hsl [HSL::open -proto UDP -pool loghost]
   }
   when HTTP_REQUEST {
      set host [HTTP::host]
      set uri [HTTP::uri]
   }
   when HTTP_RESPONSE {
      HSL::send $hsl "<190> [IP::client_addr]:[TCP::client_port] $host $uri\n"
   }
}
[root@ve10:Active] config b pool loghost list
pool loghost {
   members {
      200.200.200.101:514 {}
      200.200.200.111:514 {}
   }
}
[root@ve10:Active] config tcpdump -nni 0.0 port 514
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on 0.0, link-type EN10MB (Ethernet), capture size 108 bytes
07:47:58.467764 IP 200.200.200.10.13874 > 200.200.200.101.514: SYSLOG local7.info, length: 42
07:48:01.220714 IP 200.200.200.10.13875 > 200.200.200.111.514: SYSLOG local7.info, length: 42
07:48:05.041582 IP 200.200.200.10.13876 > 200.200.200.101.514: SYSLOG local7.info, length: 42
07:48:06.635777 IP 200.200.200.10.13877 > 200.200.200.111.514: SYSLOG local7.info, length: 42

 

 

this is tcp syslog.

 

 

[root@ve10:Active] config b virtual bar list
virtual bar {
   snat automap
   pool foo
   destination 172.28.19.252:80
   ip protocol 6
   rules myrule
   profiles {
      http {}
      tcp {}
   }
}
[root@ve10:Active] config b rule myrule list
rule myrule {
   when CLIENT_ACCEPTED {
      set hsl [HSL::open -proto TCP -pool loghost]
   }
   when HTTP_REQUEST {
      set host [HTTP::host]
      set uri [HTTP::uri]
   }
   when HTTP_RESPONSE {
      HSL::send $hsl "<190> [IP::client_addr]:[TCP::client_port] $host $uri\n"
   }
}
[root@ve10:Active] config b pool loghost list
pool loghost {
   members {
      200.200.200.101:514 {}
      200.200.200.111:514 {}
   }
}
[root@ve10:Active] config tcpdump -nni 0.0 port 514
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on 0.0, link-type EN10MB (Ethernet), capture size 108 bytes
07:50:15.016770 IP 200.200.200.10.13899 > 200.200.200.111.514: S 3784171141:3784171141(0) win 4380
07:50:15.017474 IP 200.200.200.111.514 > 200.200.200.10.13899: S 1437766692:1437766692(0) ack 3784171142 win 5840
07:50:15.017487 IP 200.200.200.10.13899 > 200.200.200.111.514: . ack 1 win 4380
07:50:15.017496 IP 200.200.200.10.13899 > 200.200.200.111.514: P 1:43(42) ack 1 win 4380
07:50:15.018565 IP 200.200.200.111.514 > 200.200.200.10.13899: . ack 43 win 46
07:50:16.746792 IP 200.200.200.10.13899 > 200.200.200.111.514: P 43:85(42) ack 1 win 4380
07:50:16.747472 IP 200.200.200.111.514 > 200.200.200.10.13899: . ack 85 win 46
07:50:17.442778 IP 200.200.200.10.13899 > 200.200.200.111.514: P 85:127(42) ack 1 win 4380
07:50:17.443470 IP 200.200.200.111.514 > 200.200.200.10.13899: . ack 127 win 46
07:50:18.040623 IP 200.200.200.10.13899 > 200.200.200.111.514: P 127:169(42) ack 1 win 4380
07:50:18.041676 IP 200.200.200.111.514 > 200.200.200.10.13899: . ack 169 win 46

Pieter_72260
Nimbostratus
Hi nitass,

 

 

I was partially able to get it working using UDP.

 

It works fine if I just manually launch requests using a browser.

 

But the moment we create a higher load (60 qps) using multiple threads, the load balancing stops.

 

Everything is going to one server again.

 

 

And this morning it seems to have worked in one of our DCs with TCP after log rotation kicked in on the rsyslog daemons.

 

But the distribution ratio of the requests is not equal: one node gets 25% of the requests and the other 75%.

 

 

Kind regards,

 

Pieter

nitass
F5 Employee
thanks Pieter. i understand the support engineer has found something. would you mind updating us here after he gets back to you?

Pieter_72260
Nimbostratus

Hey,

 

We've got feedback from F5. At first they provided a workaround, which unfortunately didn't work.

 

And after that they mentioned that the thing we're trying to do is not going to work.

 

 

I find it very strange that F5's HSL doesn't support load balancing of its own logs.

 

We're trying various other possible workarounds, but none of them has worked reliably yet.

 

 

Kind regards,

 

Pieter

 

 

nitass
F5 Employee
what about this one?

 

 

[root@ve10:Active] config b virtual bar list
virtual bar {
   snat automap
   pool foo
   destination 172.28.19.252:80
   ip protocol 6
   rules myrule
   profiles {
      http {}
      tcp {}
   }
}
[root@ve10:Active] config b rule myrule list
rule myrule {
   when RULE_INIT {
      set static::i 0
   }
   when CLIENT_ACCEPTED {
      if { $static::i == 0 } {
         set loghost loghostp1
         set static::i 1
      } else {
         set loghost loghostp2
         set static::i 0
      }
      set hsl [HSL::open -proto TCP -pool $loghost]
   }
   when HTTP_REQUEST {
      set host [HTTP::host]
      set uri [HTTP::uri]
   }
   when HTTP_RESPONSE {
      HSL::send $hsl "<190> [IP::client_addr]:[TCP::client_port] $host $uri\n"
   }
}
[root@ve10:Active] config b pool loghostp1 list
pool loghostp1 {
   members 200.200.200.101:514 {}
}
[root@ve10:Active] config b pool loghostp2 list
pool loghostp2 {
   members 200.200.200.111:514 {}
}

send 500 requests to virtual server

[root@centos251 ~] ab -n 500 http://172.28.19.252/
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking 172.28.19.252 (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Finished 500 requests

Server Software:        Apache/2.2.3
Server Hostname:        172.28.19.252
Server Port:            80
Document Path:          /
Document Length:        89 bytes
Concurrency Level:      1
Time taken for tests:   2.145340 seconds
Complete requests:      500
Failed requests:        0
Write errors:           0
Total transferred:      175500 bytes
HTML transferred:       44500 bytes
Requests per second:    233.06 [/sec] (mean)
Time per request:       4.291 [ms] (mean)
Time per request:       4.291 [ms] (mean, across all concurrent requests)
Transfer rate:          79.71 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.6      0       3
Processing:     2    3   1.2      3      14
Waiting:        2    2   1.3      3      13
Total:          3    3   1.3      3      14

Percentage of the requests served within a certain time (ms)
  50%      3
  66%      4
  75%      4
  80%      4
  90%      4
  95%      5
  98%      8
  99%     10
 100%     14 (longest request)

pool stats

[root@ve10:Active] config b pool loghostp1|grep -i request
|     requests (total) = 250
[root@ve10:Active] config b pool loghostp2|grep -i request
|     requests (total) = 250

nitass
F5 Employee
and this is on a cmp platform.

 

 

root@(ve11a)(cfg-sync Changes Pending)(Active)(/Common)(tmos) show sys tmm-info |grep -i sys::tmm
Sys::TMM: 0.0
Sys::TMM: 0.1
root@(ve11a)(cfg-sync Changes Pending)(Active)(/Common)(tmos) list ltm virtual bar
ltm virtual bar {
    destination 172.28.20.16:80
    ip-protocol tcp
    mask 255.255.255.255
    pool foo
    profiles {
        http { }
        tcp { }
    }
    rules {
        myrule
    }
    source 0.0.0.0/0
    source-address-translation {
        type automap
    }
    vlans-disabled
}
root@(ve11a)(cfg-sync Changes Pending)(Active)(/Common)(tmos) list ltm rule myrule
ltm rule myrule {
    when RULE_INIT {
        set static::i 0
    }
    when CLIENT_ACCEPTED {
        if { $static::i == 0 } {
            set loghost loghostp1
            set static::i 1
        } else {
            set loghost loghostp2
            set static::i 0
        }
        set hsl [HSL::open -proto TCP -pool $loghost]
    }
    when HTTP_REQUEST {
        set host [HTTP::host]
        set uri [HTTP::uri]
    }
    when HTTP_RESPONSE {
        HSL::send $hsl "<190> [IP::client_addr]:[TCP::client_port] $host $uri\n"
    }
}
root@(ve11a)(cfg-sync Changes Pending)(Active)(/Common)(tmos) list ltm pool loghostp1
ltm pool loghostp1 {
    members {
        200.200.200.101:514 {
            address 200.200.200.101
        }
    }
}
root@(ve11a)(cfg-sync Changes Pending)(Active)(/Common)(tmos) list ltm pool loghostp2
ltm pool loghostp2 {
    members {
        200.200.200.111:514 {
            address 200.200.200.111
        }
    }
}

pool stats

root@(ve11a)(cfg-sync Changes Pending)(Active)(/Common)(tmos) show ltm pool loghostp1|grep -i total\ requests
  Total Requests : 250
root@(ve11a)(cfg-sync Changes Pending)(Active)(/Common)(tmos) show ltm pool loghostp2|grep -i total\ requests
  Total Requests : 250

jimmythegeek_10
Historic F5 Account

SOL17398 covers this. Load balancing decisions are made in the logging profile, and HSL will stick to the chosen pool member until the server indicates it can't keep up or the connection is lost, at which point another pool member is selected. As of 12.0 you can change the distribution approach to "balanced" to spread the log messages evenly across pool members. This is done via the Distribution setting on the unformatted (remote high-speed log) log destination.
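For anyone on 12.0 or later, the tmsh equivalent should be something along these lines (the destination name here is just an example):

tmsh modify sys log-config destination remote-high-speed-log my-hsl-destination distribution balanced

As I understand it, the default ("adaptive") gives the sticky behaviour described above, while "balanced" spreads messages across the pool members.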

 

(old question but answer may help future searchers)