Technical Forum
Ask questions. Discover Answers.

UDP Datagram LB

cpt_ri_F5
Cirrostratus

Hello,

To get fair load balancing between the backend servers (5 syslog servers >> F5 >> 2 Splunk servers), I created a new UDP profile and enabled the "Datagram LB" option: https://support.f5.com/csp/article/K3605

With the default UDP profile, 100% of the logs are received, but not with the new UDP profile (all other parameters are identical).

Any idea?

Thanks!

1 ACCEPTED SOLUTION

cpt_ri_F5
Cirrostratus

hello,

FYI, I solved this problem with a simple stateless VS: K13675

Thank you all


8 REPLIES

CA_Valli
MVP

UDP Datagram LB forwards traffic packet by packet and no longer treats UDP packets from the same source and port as part of one connection. If a syslog message is split across multiple packets, part of the log may land on one server and part on the other, which could result in missing messages.

cpt_ri_F5
Cirrostratus

Hello @CA_Valli

Sorry, I don't understand your answer. The stats show approx. 1 million logs received with the default UDP profile and approx. 200k with "Datagram LB", so logs are being lost, aren't they?

Or do you mean that, with the Datagram LB profile, a single entry contains several logs?

Thanks!

(Edited by @Leslie_Hubertus to tag CA_Valli, to make sure he sees this reply)

 

Where are you checking the stats, on the F5 or on Splunk? Also, is this a live stat or a cumulative counter?

What I wanted to point out is that Datagram LB changes the load balancing behavior. The "standard" UDP profile keeps track of the UDP connection and forwards all packets of the same connection to the same destination. This is no longer true with Datagram LB: if a syslog message is split into multiple UDP packets, some of them may be balanced to one pool member and other packets of the same flow to the second pool member.

I thought that this could be a possible problem, as one splunk server might not be able to reconstruct the full syslog message due to part of it being sent to the other server -- and possibly discarding/not logging the "incomplete" packets. 
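To make the difference concrete, here is a minimal Python sketch of the two behaviors described above. It is only a toy model, not how BIG-IP is implemented: the pool member names, the round-robin choice, and the idea that message "B" spans two datagrams are all assumptions made for illustration.

```python
from itertools import cycle

POOL = ["splunk-1", "splunk-2"]  # hypothetical pool member names


def per_flow(datagrams):
    """Connection-style UDP: the first datagram of a flow picks a member,
    later datagrams from the same (source IP, source port) follow it."""
    members = cycle(POOL)
    table = {}
    out = []
    for flow, payload in datagrams:
        if flow not in table:
            table[flow] = next(members)
        out.append((table[flow], payload))
    return out


def per_datagram(datagrams):
    """Datagram LB style: every datagram gets its own round-robin decision."""
    members = cycle(POOL)
    return [(next(members), payload) for _flow, payload in datagrams]


# One source sends three messages; message "B" is assumed to span two datagrams.
flow = ("10.0.0.1", 51400)
traffic = [(flow, "A"), (flow, "B part 1"), (flow, "B part 2"), (flow, "C")]

flow_result = per_flow(traffic)
dgram_result = per_datagram(traffic)

print(flow_result)   # every datagram lands on the same member
print(dgram_result)  # the two parts of "B" land on different members
```

In this model the per-datagram variant spreads the load evenly, but only the per-flow variant keeps both parts of "B" on the same server.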

Thank you it's clear

I checked the stats on Splunk, using a cumulative counter (per minute); it is the same Splunk query in both cases.

In the Splunk graph, when I change the UDP profile, the number of logs drops by a factor of 5 or more.

If the integrity of the logs can't be preserved in this case, I don't understand the point of the "Datagram LB" option!

I am interested in another solution to share the logs fairly between the Splunk servers.

Thanks!

 

Hi Cpt_Rl_F5,

The loss could also be caused by a problem with SNAT port allocation.

A syslog originator may open a single UDP connection to the syslog collector and send multiple subsequent messages, each carried in a single UDP datagram.

With UDP Datagram LB enabled, your LTM treats each arriving UDP datagram as a new UDP connection flow and maintains the flow, the LB decision, and the SNAT allocation independently of previously received UDP datagrams.

This behavior may drain your SNAT pools very quickly, resulting in SNAT pool port allocation errors that cause packet loss (see K33355231 for more details).

To avoid such SNAT exhaustion, you may either deploy bigger SNAT pools or simply ignore the "Important" recommendation of K3605:

"Important: With UDP Datagram LB set to Enabled, if you also set the Timeout to Immediate, UDP response traffic is forwarded to the client using the origin server's source IP address and port. As a result, response traffic may not appear to have originated from the virtual server to which the request was sent, and traffic disruption may occur when a client and/or routing expects the UDP source IP address to be the address other than the origin server. You can avoid this issue by setting the timeout to a value other than Immediate whenever possible."

... and knowingly set the timeout to "Immediate", so that the individual UDP datagrams time out immediately and can basically share UDP sockets.

Syslog is a simplex protocol, so an originator will never get any answer back from the collector anyway. There is no need to maintain a timeout for responses with syslog...

Cheers, Kai
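A rough back-of-the-envelope check of the SNAT argument above, as a Python sketch. It assumes that every datagram holds one SNAT port until the idle timeout expires, so the steady-state number of ports in use is roughly rate times timeout. The rate, the 60-second timeout, the figure of about 64,000 usable ports per SNAT address, and treating "Immediate" as 0 seconds are illustrative assumptions, not values taken from this thread or from F5 documentation.

```python
def ports_in_use(datagrams_per_sec, idle_timeout_sec):
    """Steady-state count of flow entries (one SNAT port each) when every
    datagram opens a flow that lives for idle_timeout_sec seconds."""
    return datagrams_per_sec * idle_timeout_sec


def snat_exhausted(datagrams_per_sec, idle_timeout_sec,
                   snat_addresses, ports_per_address=64_000):
    """True if the estimated demand exceeds the assumed port budget."""
    budget = snat_addresses * ports_per_address
    return ports_in_use(datagrams_per_sec, idle_timeout_sec) > budget


RATE = 5_000  # assumed datagrams per second, for illustration only

print(ports_in_use(RATE, 60))                      # 300000
print(snat_exhausted(RATE, 60, snat_addresses=1))  # True
print(snat_exhausted(RATE, 0, snat_addresses=1))   # False
```

Under these assumptions a single SNAT address is far too small with a 60-second timeout, while an immediate timeout removes the pressure, which matches the two options suggested above.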


iRule can do… 😉


@cpt_ri_F5 wrote:

if we can't recover the integrity of the logs, in this case, I don't understand the point of the option "Datagram LB" !


If the previous assumptions are true, which you might want to confirm before going in other directions, then this might not be the right choice in this case. But that option might help people with other UDP load balancing challenges.

You can keep using the default profile, perhaps with a slightly lower timeout, and see which load balancing algorithm gets a similar job done. Least Connections would be an interesting one to check. Keep in mind that a full 50% / 50% split sounds nice, but is it really that bad if it's 60% / 40% in the long run?
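As a quick illustration of why 60% / 40% is a natural outcome here, the following Python sketch places the 5 syslog sources from the original question on 2 pool members with a simplified Least Connections rule. The source and member names are hypothetical, and the model assumes one long-lived flow per source with equal log volume, which may not hold in practice.

```python
from collections import Counter


def least_connections(flows, members):
    """Assign each new flow to the member with the fewest active flows
    (ties go to the member listed first)."""
    active = {member: 0 for member in members}
    assignment = {}
    for flow in flows:
        target = min(members, key=lambda member: active[member])
        active[target] += 1
        assignment[flow] = target
    return assignment


sources = [f"syslog-{i}" for i in range(1, 6)]  # 5 hypothetical sources
members = ["splunk-1", "splunk-2"]              # 2 hypothetical pool members

assignment = least_connections(sources, members)
counts = Counter(assignment.values())
shares = {member: counts[member] / len(sources) for member in members}

print(counts)
print(shares)
```

With an odd number of equally busy flows, the best a connection-based split can do is 3 to 2, so aiming for an exact 50% / 50% only makes sense with per-datagram balancing.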

cpt_ri_F5
Cirrostratus

hello,

FYI, I solved this problem with a simple stateless VS: K13675

Thank you all

Thank you for sharing that knowledge.