29-Sep-2022 02:35 - edited 29-Sep-2022 22:24
Hello community,
For quite a while we have been facing what look like DDoS attempts (I can't consider them an actual attack) against our F5 public DNS.
Initially we were seeing ~700K to 1M connections (IP connections), and after some analysis we decided to lower the UDP port 53 timeout from the default 40 sec to 10 sec (on the firewall that sits in front of the F5). That change had a positive impact on peak connections (when the DDoS was seen), lowering them to under 50% of the previous level, approximately 300-400K connections nowadays.
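As a rough illustration of why a shorter timeout shrinks the connection table, here is a minimal sketch. It assumes every query is its own single-packet UDP flow and that the firewall holds each entry for exactly the idle timeout; the function name and the 25,000 flows per second figure are hypothetical, not measured values.

```python
def peak_table_entries(new_flows_per_sec, burst_seconds, idle_timeout):
    """Peak number of tracked UDP flows during a burst of single-packet flows.

    Each flow keeps a table entry for idle_timeout seconds after its packet,
    so at most min(burst_seconds, idle_timeout) seconds' worth of flows are
    alive at the same time.
    """
    return new_flows_per_sec * min(burst_seconds, idle_timeout)


# Hypothetical burst: 25,000 new flows per second sustained for 60 seconds.
for timeout in (40, 10):
    print(timeout, peak_table_entries(25_000, 60, timeout))
```

This model is idealized. It predicts a larger drop than the roughly 50% we actually observed, so real traffic evidently does not match these assumptions exactly.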
At this point we started collecting data, and we've seen that there is a pattern to this DDoS (at least once a day, at almost the same time). We also got the IPs performing those queries (top 10 IPs by number of connections at that moment).
The missing piece was the actual query being performed during the high-connection moments. For that we enabled F5 DNS logging, but with so much traffic we could only see 10-15 sec in the F5 logs 😔 .
Still, the good part was that we were right on the spot on a couple of occasions and were able to get a snapshot of the queries:
Yesterday we managed to get the F5 DNS logging forwarded to Graylog, and overnight we captured a peak of ~1M logs (so I would say about 500K queries, considering that there are 2 log lines per query). After exporting the logs for the 5 minutes in which this event happened, we were able to get a list of the top 20 queries, which are legitimate DNS queries, nothing abnormal.
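For anyone wanting to repeat this kind of ranking, here is a minimal sketch of how the exported logs can be counted. It assumes the client IP and query name have already been parsed out of each log line into pairs; the parsing step depends on the log format and is not shown, and the function names and sample records are hypothetical.

```python
from collections import Counter


def top_queries(records, n=20):
    """Return the n most frequent query names as (name, count) pairs.

    records is an iterable of (client_ip, qname) tuples. Names are
    lower-cased and stripped of the trailing dot so that variants of the
    same FQDN are counted together.
    """
    counts = Counter(qname.lower().rstrip(".") for _client_ip, qname in records)
    return counts.most_common(n)


def top_clients(records, n=10):
    """Return the n client IPs that sent the most queries."""
    counts = Counter(client_ip for client_ip, _qname in records)
    return counts.most_common(n)


# Hypothetical sample using documentation-reserved addresses and names.
sample = [
    ("192.0.2.1", "www.example.com."),
    ("192.0.2.2", "WWW.example.com"),
    ("192.0.2.1", "mail.example.com"),
]
print(top_queries(sample, 1))
print(top_clients(sample, 1))
```

If the export still contains 2 log lines per query, either deduplicate first or halve the counts.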
So after all this monologue, my question to you is: has anyone faced a similarly high DNS query volume, and what did you do to prevent it?
While looking for ways to prevent it, we came across article K11005751, which might apply to us: with an iRule we can drop requests for FQDNs that don't exist in our environment.
Would that add extra load on resources, considering that the iRule will be executed each time a DNS query arrives at our F5?
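To make the idea concrete, the decision such a filter would take per query can be sketched as below. This is Python, not iRule code, and it is only an assumption about the logic: it checks whether the queried name falls under one of our hosted zones, which is a simplification of "FQDN exists". The function name and zone list are hypothetical.

```python
def is_hosted(qname, hosted_zones):
    """Return True if qname equals, or is a subdomain of, a hosted zone.

    hosted_zones is a set of lower-case zone names without trailing dots.
    The name itself and each parent domain are looked up in the set, so
    the cost per query is one set lookup per label.
    """
    labels = qname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in hosted_zones for i in range(len(labels)))


zones = frozenset({"example.com"})  # hypothetical hosted zone
for name in ("www.example.com.", "example.com", "notexample.com", "example.org"):
    action = "answer" if is_hosted(name, zones) else "drop"
    print(name, action)
```

Matching on whole labels matters here: a plain string suffix test would wrongly accept notexample.com for the zone example.com.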
Thank you,
30-Sep-2022 15:49
iRules are processed ahead of the other services, so while there is always an impact, it will be negligible per connection compared to the other services processing those requests. It is a well-known and helpful approach to managing the issues you are seeing.
That said, the local system is still going to get all those requests, even if you're dropping them. A service like F5 Distributed Cloud DDoS Mitigation could scrub these before the requests hit your local services if the volume becomes too great for your equipment or pipes to manage.
30-Sep-2022 19:21 - edited 30-Sep-2022 19:23
Having spent 22 years architecting backbone DNS solutions for global service providers, I would urge you to consider a few things immediately:
01-Oct-2022 02:09 - edited 01-Oct-2022 02:19
Thank you @JRahm and @AubreyKingF5 for your responses.
To clarify and answer/confirm @AubreyKingF5's questions and recommendations (in blue):
So, our takeaway from all this is to look into AFM and IP Intelligence, and in the end (if that doesn't give the expected results) we could go with the iRule we talked about.
If there are any other ideas, please share.
Thank you and have a great weekend,
04-Oct-2022 11:15
Awesome! To clarify #2, I would recommend setting your UDP timeout to "Immediate", not 10. 1 UDP packet takes less than a ms to complete. Like much less. I've personally watched a single UDP profile on F5 DNS handle 3M PPS.. ON A VE !! (High Performance VE, but still.)
Regarding the AFM, there's not a ton besides the manual, but this lab guide may help:
https://clouddocs.f5.com/training/community/firewall/html/class2/class2.html
Also, here is a snippet of tmsh from a service provider grade DNS AFM DoS policy that I worked on for an example:
tmsh modify security dos device-config dos-device-config dos-device-vector { sweep { detection-threshold-percent 500 detection-threshold-pps 250 per-source-ip-limit-pps 500 packet-types replace-all-with { dns-a-query dns-aaaa-query dns-any-query dns-axfr-query dns-cname-query dns-ixfr-query dns-mx-query dns-ns-query dns-other-query dns-oversize dns-ptr-query dns-response-flood dns-soa-query dns-srv-query dns-txt-query udp } } }
The idea is to build a policy that disallows all undesired query types, LIMITS all desired query types (flooding is dropped) and then defends against all UDP flooding. You may want to carve that up, too.. so UDP flooding cut off globally (like.. why not?) and the dns-specific dos profile attached to the VIP(s).
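To illustrate what a per-source limit means in practice, here is a minimal Python sketch of a per-source packets-per-second cap. It is not how AFM implements it; the class name and the fixed one-second window are assumptions for illustration, and only the default of 500 comes from the per-source-ip-limit-pps value in the snippet above.

```python
from collections import defaultdict


class PerSourceLimiter:
    """Fixed one-second-window packet counter per source IP."""

    def __init__(self, limit_pps=500):
        self.limit_pps = limit_pps
        self.window = None              # integer second currently being counted
        self.counts = defaultdict(int)  # packets seen per source in this window

    def allow(self, src_ip, now):
        """Return True if this packet is within the source's limit."""
        second = int(now)
        if second != self.window:
            # A new second has started: forget the previous window's counters.
            self.window = second
            self.counts.clear()
        self.counts[src_ip] += 1
        return self.counts[src_ip] <= self.limit_pps


# Hypothetical source sending 5 packets within one second against a limit of 3.
limiter = PerSourceLimiter(limit_pps=3)
print([limiter.allow("198.51.100.7", 10.0 + i * 0.1) for i in range(5)])
```

Packets over the limit are the "flooding is dropped" part; legitimate resolvers that stay under the limit are unaffected.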