Forum Discussion
UDP Datagram LB
- Feb 24, 2023
hello,
FYI, I solved this problem with a simple stateless VS: K13675
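For anyone trying to reproduce this later, a stateless UDP virtual server per K13675 might look roughly like the tmsh sketch below; the object names and addresses are placeholders, and the exact option names should be verified against K13675 for your TMOS version.
```
# Rough sketch of a stateless UDP virtual server (placeholder names/addresses;
# verify the exact syntax against K13675 for your TMOS version).
tmsh create ltm pool pool_splunk members add { 10.1.1.11:514 10.1.1.12:514 }
tmsh create ltm virtual vs_syslog_stateless \
    destination 192.0.2.10:514 \
    ip-protocol udp \
    pool pool_splunk \
    translate-address enabled translate-port enabled \
    stateless
```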
Thank you all
Where are you checking the stats? F5 or Splunk? Also, is this a live stat or is it a cumulative counter?
What I wanted to point out is that Datagram LB changes the load-balancing behavior. While the "standard" UDP profile keeps track of a UDP connection and forwards all packets of that connection to the same destination, this is no longer true with Datagram LB: if a syslog message is split into multiple UDP packets, some of them might be balanced to one pool member while other packets of the same flow are balanced to the second pool member.
I thought that this could be a possible problem, as one Splunk server might not be able to reconstruct the full syslog message because part of it was sent to the other server, and it might then discard or not log the "incomplete" packets.
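To make the two behaviors concrete, a minimal tmsh comparison of the udp profile setting being discussed might look like this (profile names are placeholders, not from the thread):
```
# Default flow-based UDP: one flow per client IP:port, every datagram of that
# flow is forwarded to the same pool member until the idle timeout expires.
tmsh create ltm profile udp udp_flow_based defaults-from udp \
    datagram-load-balancing disabled idle-timeout 60

# Datagram LB: every arriving datagram gets its own load-balancing decision,
# so datagrams from the same client can land on different pool members.
tmsh create ltm profile udp udp_per_datagram defaults-from udp \
    datagram-load-balancing enabled
```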
Thank you, it's clear.
I checked the stats on Splunk, a cumulative counter (per minute), but it's the same Splunk query in both cases.
In the Splunk graph I can see that when I change the UDP profile the number of logs drops by a factor of 5 or more.
If we can't preserve the integrity of the logs in this case, I don't understand the point of the "Datagram LB" option!
I'd be interested in another solution to share the logs fairly across the Splunk servers.
Thanks!
- Kai_Wilke (MVP) - Jan 29, 2023
Hi Cpt_Rl_F5,
The loss could also be a problem with SNAT port allocations.
A syslog originator may open a single UDP connection to the syslog collector and send multiple subsequent messages, each carried in a single UDP datagram.
When UDP Datagram LB is enabled, your LTM treats each arriving UDP datagram as a new UDP connection flow and maintains the flow, LB decision, and SNAT allocation independently of previously received UDP datagrams.
This behavior may drain your SNAT pools very quickly, resulting in SNAT port allocation errors and packet loss (see K33355231 for more details).
To avoid such SNAT exhaustion you may either deploy bigger SNAT pools or simply ignore the "Important" recommendation of K3605:
"Important: With UDP Datagram LB set to Enabled, if you also set the Timeout to Immediate, UDP response traffic is forwarded to the client using the origin server's source IP address and port. As a result, response traffic may not appear to have originated from the virtual server to which the request was sent, and traffic disruption may occur when a client and/or routing expects the UDP source IP address to be the address other than the origin server. You can avoid this issue by setting the timeout to a value other than Immediate whenever possible."
... and knowingly set the timeout to "Immediate", so that the individual UDP datagram flows time out immediately and can basically share UDP sockets.
Syslog is a simplex protocol, so an originator will never get any answers back from the collector anyway. There is no need to maintain any timeouts for responses with syslog...
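A hypothetical tmsh rendering of the above, with placeholder names: it pairs Datagram LB with an immediate idle timeout so each datagram's flow (and its SNAT allocation) is released right away, which is only sensible for simplex traffic such as syslog.
```
# Datagram LB plus immediate timeout: each datagram is balanced on its own and
# its flow/SNAT allocation is torn down immediately, so the SNAT pool is not
# drained. Only appropriate when no response traffic is expected (e.g. syslog).
tmsh create ltm profile udp udp_dglb_immediate defaults-from udp \
    datagram-load-balancing enabled \
    idle-timeout immediate
tmsh modify ltm virtual vs_syslog profiles replace-all-with { udp_dglb_immediate }
```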
Cheers, Kai
- boneyard (MVP) - Jan 29, 2023
cpt_ri_F5 wrote:
If we can't preserve the integrity of the logs in this case, I don't understand the point of the "Datagram LB" option!
If the previous assumptions are true, which you might want to confirm before heading in other directions, then this might not be the right choice in this case. But the option might help people with other UDP load-balancing challenges.
You can keep using the default profile, perhaps with a slightly lower timeout, and see which load-balancing algorithm gets a similar job done. Least Connections would be an interesting one to check. Keep in mind that a full 50% / 50% split sounds nice, but is it really that bad when it's 60% / 40% in the long run?
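As a rough illustration of that suggestion (placeholder names again): keep the flow-based udp profile but shorten its idle timeout, and let Least Connections pick the member for each new flow.
```
# Flow-based UDP with a shorter idle timeout, so idle syslog flows age out
# faster and new flows get fresh load-balancing decisions.
tmsh create ltm profile udp udp_short_timeout defaults-from udp \
    datagram-load-balancing disabled idle-timeout 30
# Least Connections member selection on the Splunk pool.
tmsh modify ltm pool pool_splunk load-balancing-mode least-connections-member
```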