reporting
49 TopicsPerformance Logging iRule (Rule_http_log)
Problem this snippet solves: Here's a logging iRule. You'll need a HSL syslog pool to log too. Various bits gathered from other posts on DevCentral. Sharing in case there is interest. Make sure your rsyslogd is setup to use the newer syslog format like RFC-5424 including milliseconds and timezone info.Includes Country (co) and logs individual request times for each request on a HTTP/1.1 connection. To configure F5 logging to use milliseconds and timezone, disable logging in the gui and use tmsh edit sys syslog and something like: include " # short hostnames options { use_fqdn(no); }; # Remote syslog in RFC5424 - Tim Riker <Tim@Rikers.org> destination remotesyslog { syslog(\"10.1.2.3\" transport(\"udp\") port(51443) ts_format(iso)); }; log { source(s_syslog_pipe); destination(remotesyslog); }; " Uses upvar and proc. Tested on 11.6 - 15.1 This tracks connection info in a table and then copies that down to the per-request log() to handle reporting on http2. This version works around a BIG-IP bug where HTTP::version does not report 2 or higher for http2 and later requests. With http2 profiles, subsequent requests using the same connection can generate this error in the logs if HTTP::respond HTTP::redirect or HTTP::retry is called from and earlier iRule. Reorder your iRules to avoid this. <HTTP_REQUEST> - No HTTP header is cached - ERR_NOT_SUPPORTED (line 1)invoked from within "HTTP::method" How to use this snippet: Add this iRule to whatever virtual hosts you desire. I always add it as the first rule. If you have a rule that sets headers you want to track, you may want this after the rule that sets headers. Interesting Splunk queries can be created like: index=* perflog | timechart avg(cpu_5sec) by host limit=10 to show load across multiple F5s. index=* perflog | timechart max(upstream_time) by http_host limit=10 to show long request times by http_host Any other iRule may add things to the log() array and those will get added to the single hsl output. If you create a dg_http_log datagroup, that will be used to filter what gets logged. Tested on version: 13.0 - 15.1 # Rule_http_log # http logging - Tim Riker <Tim@Rikers.org> # bits taken from this post: # https://devcentral.f5.com/questions/irule-for-getting-total-response-time-server-response-time-and-server-connection-time # iRule performance tracking # https://devcentral.f5.com/questions/Timing-iRules timing on # timing is on by default in 11.5.0+ to see stats: # tmsh show ltm rule Rule_http_log # # if the dg_http_log datagroup exists then vips or hosts/paths in dg_http_log that start with # "NONE" no logging (really anything other than empty) # "INFO" normal logging # "FINE" full request and response headers and CLIENT_CLOSED # # upstream_time := 15000 in the datagroup to log all requests over 15 seconds # # example: # "/Common/vs_www.example.com_HTTPS" := "FINE" - logged including CLIENT_CLOSED # "www.example.com/" := "INFO" - logged # "www.example.com/somepath" := "FINE" - full headers # "www.example.com/otherpath" := "NONE" - not logged when RULE_INIT { # hostname up to first dot set static::hostname [getfield [info hostname] "." 1] } # not calling /Common/proc:hsllog as this logs when the request occurred # instead of the time it calls hsllog at the end of the request proc hsllog {time mylog} { upvar 1 $mylog log # https://tools.ietf.org/html/rfc5424 <local0.info>version rfc-3339time host procid msgid structured_data log # should be able to use a "Z" here instead of "+00:00" but our splunk logs don't handle that # 134 = local0.info set output "<134>1 [clock format [string range $time 0 end-3] -gmt 1 -format %Y-%m-%dT%H:%M:%S.[string range $time end-2 end]+00:00] ${static::hostname} httplog [TMM::cmp_group].[TMM::cmp_unit] - -" foreach key [lsort [array names log]] { if { ($log($key) matches_regex {[\" ;,:]}) } { append output " $key=\"[string map {\" "|"} $log($key)]\"" } else { append output " $key=$log($key)" } } # avoid marking virtual server up when hsl pool is up # https://support.f5.com/csp/article/K14505 set hsl pool_syslog HSL::send [HSL::open -proto UDP -pool $hsl] $output } when CLIENT_ACCEPTED { # calculate and track milliseconds # is this / 1000 guaranteed to be clock seconds? TCL docs say no, but it looks like on f5 it is. set tcp_start_time [clock clicks -milliseconds] set log(loglevel) 0 if { [class exists dg_http_log] } { # virtual name entries need to be full path, ie: /Common/vs_www.example.com_HTTP switch -- [string range [class match -value -- [virtual name] equals dg_http_log] 0 3] { "FINE" { set log(loglevel) 2 } "INFO" { set log(loglevel) 1 } default { set log(loglevel) 0 } } } table set -subtable [IP::client_addr]:[TCP::client_port] loglevel $log(loglevel) table set -subtable [IP::client_addr]:[TCP::client_port] tmm "[TMM::cmp_group].[TMM::cmp_unit]" table set -subtable [IP::client_addr]:[TCP::client_port] client_addr [IP::client_addr] table set -subtable [IP::client_addr]:[TCP::client_port] client_port [TCP::client_port] table set -subtable [IP::client_addr]:[TCP::client_port] cpu_5sec [cpu usage 5secs] table set -subtable [IP::client_addr]:[TCP::client_port] virtual_name [virtual name] set co [whereis [IP::client_addr] country] if { $co eq "" } { set co unknown } table set -subtable [IP::client_addr]:[TCP::client_port] co $co } when HTTP_REQUEST { set http_request_time [clock clicks -milliseconds] set keys [table keys -subtable [IP::client_addr]:[TCP::client_port]] foreach key $keys { set log($key) "[table lookup -subtable "[IP::client_addr]:[TCP::client_port]" "$key"]" } if {[HTTP::has_responded]} { # The rule should come BEFORE any rules that do things like redirects set log(http_has_responded) [HTTP::has_responded] set log(loglevel) 1 set log(event) HTTP_REQUEST call hsllog $http_request_time log return } if { [class exists dg_http_log] } { set logsetting [class match -value -- [HTTP::host][HTTP::uri] starts_with dg_http_log] if { $logsetting ne "" } { # override log(loglevel) if we found something switch -- [string range $logsetting 0 3] { "FINE" { set log(loglevel) 2 } "INFO" { set log(loglevel) 1 } default { set log(loglevel) 0 } } } } set log(http_host) [HTTP::host] set log(http_uri) [HTTP::uri] set log(http_method) [HTTP::method] # request_num might not be accurate for HTTP2 set log(request_num) [HTTP::request_num] set log(request_size) [string length [HTTP::request]] # BUG http2 reported as http1 in pre 16.x # https://cdn.f5.com/product/bugtracker/ID842053.html set log(http_version) [HTTP::version] if { [catch \[HTTP2::version\] result] == 1 } { if { $result contains "Operation not supported" } { #log local0. "HTTP version is: [HTTP::version]" } else { set h2ver [eval "\HTTP2::version"] # we might have http2 support, but not be http2 if { $h2ver != 0 } { set log(http_version) $h2ver } } } #log local0. "http_version = $log(http_version)" if { $log(loglevel) > 1 } { foreach {header} [HTTP::header names] { set log(req-$header) [HTTP::header $header] } } else { foreach {header} {"connection" "content-length" "keep-alive" "last-modified" "policy-cn" "referer" "transfer-encoding" "user-agent" "x-forwarded-for" "x-forwarded-proto" "x-forwarded-scheme"} { if { [HTTP::header exists $header] } { set log(req-$header) [HTTP::header $header] } } } } when LB_SELECTED { set lb_selected_time [clock clicks -milliseconds] set log(server_addr) [LB::server addr] set log(server_port) [LB::server port] set log(pool) [LB::server pool] } when SERVER_CONNECTED { set log(connection_time) [expr {[clock clicks -milliseconds] - $lb_selected_time}] set log(snat_addr) [IP::local_addr] set log(snat_port) [TCP::local_port] } when LB_FAILED { set log(event_info) [event info] } when HTTP_REJECT { set log(http_reject) [HTTP::reject_reason] } when HTTP_REQUEST_SEND { set http_request_send_time [clock clicks -milliseconds] } when HTTP_RESPONSE { set log(upstream_time) [expr {[clock clicks -milliseconds] - $http_request_send_time}] set log(http_status) [HTTP::status] if { $log(loglevel) > 1 } { foreach {header} [HTTP::header names] { set log(res-$header) [HTTP::header $header] } } else { foreach {header} {"cache-control" "connection" "content-encoding" "content-length" "content-type" "content-security-policy" "keep-alive" "last-modified" "location" "server" "www-authenticate"} { if { [HTTP::header exists $header] } { set log(res-$header) [HTTP::header $header] } } } # if logging is off, but upstream_time is over threshold in datagroup, log anyway if { ($log(loglevel) < 1) && [class exists dg_http_log] } { set log_upstream_time [class match -value -- upstream_time equals dg_http_log] if {$log_upstream_time ne "" && $log(upstream_time) >= $log_upstream_time} { set log(over_upstream_time) $log_upstream_time set log(loglevel) 1 } } } when HTTP_RESPONSE_RELEASE { if { [info exists http_request_time] } { set log(http_time) "[expr {[clock clicks -milliseconds] - $http_request_time}]" # push http_time into table so CLIENT_CLOSED can see it in HTTP/2 table set -subtable [IP::client_addr]:[TCP::client_port] http_time $log(http_time) } else { set http_request_time [clock clicks -milliseconds] } set log(event) HTTP_RESPONSE_RELEASE if { $log(loglevel) > 0 } { call hsllog $http_request_time log } } when HTTP_DISABLED { set log(http_passthrough_reason) [HTTP::passthrough_reason] } when CLIENT_CLOSED { # grab log() values from table set keys [table keys -subtable [IP::client_addr]:[TCP::client_port]] foreach key $keys { set log($key) "[table lookup -subtable "[IP::client_addr]:[TCP::client_port]" "$key"]" } set log(tcp_time) "[expr {[clock clicks -milliseconds] - $tcp_start_time}]" set log(event) CLIENT_CLOSED # http_time didn't get set, log here (HTTP_RESPONSE_RELEASE never called, catch redirects, aborted connections) if { not ([info exists log(http_time)]) } { if { [info exists http_request_time] } { # called HTTP_REQUEST but not HTTP_RESPONSE_RELEASE using HTTP 1.0 or 1.1 set log(http_time) "[expr {[clock clicks -milliseconds] - $http_request_time}]" } call hsllog $tcp_start_time log } elseif { $log(loglevel) > 1 } { call hsllog $tcp_start_time log } # clean out table when client disconnects table delete -subtable [IP::client_addr]:[TCP::client_port] -all }3.5KViews3likes7CommentsL3/4/DNS DDoS Reporting with Elastic Search and Kibana
Dear Reader, In this article, I would like to, in collaboration with my colleague Mohamed Shaath, show you how to use DDoS reporting and visibility dashboards that we have created based on an ELK (Elastic Search Logstash and Kibana) stack. The goal is to give you templates based on Open-Source software to address typical questions DDoS operators have and need to answer when an incident happens. Another component we added is the visualization of incoming packets, dropped packets, detection, and mitigation thresholds per attack vector. The idea here is to give you insights into auto-calculated thresholds compared to incoming rates. It will also give you the possibility to see anomalies in traffic behavior. Hopefully, the visualization will also help you with fine-tuning the DoS vector configuration (a typical example of this is the floor value of a vector). This article will give you an introduction to some of the graphs we provide together with the templates. Feel free to arrange or modify them in the way you need when you use the solution. We are also very happy to get your feedback, so we can optimize the dashboards and graphs in a way that is most useful for DDoS operators. Fundamental understanding of log events All DDoS configuration relay basically on two thresholds, regardless of the chosen threshold (manual, fully automatic, multiplier, …): Detection and Mitigation Figure 1: Detection and Mitigation rate “Detection” means, inform the DDoS operator that the incoming rate is above the configured (or auto-calculated rate based on the history) rate. Do not block traffic, just send out specific log information. The “detection” value is usually set or calculated to a rate that is just within the expected “normal” rate. That also means, everything above that value is not “normal” and therefore suspicious, but not necessarily an attack. But the DDoS operator should be aware of that event. Exactly this is happening when a packet rate crosses the detection rate: BIG-IP will send out log messages to the log server (when configured). Within the ELK solution we are introducing, we use the “Splunk” logging format, which sends the information in key/value format. That makes the understanding of the fields much easier. Here is an example of a log message, which is sent out when the packet rate has crossed the detection threshold. Jun 17 23:08:46 172.30.107.11 action="Allow",hostname="lon-i5800-1.pme.itc.f5net.com",bigip_mgmt_ip="172.30.107.11",context_name="/Common/www_10_103_2_80_80",date_time="Jun 17 2021 22:58:12",dest_ip="10.103.2.80",dest_port="80",device_product="DDoS Hybrid Defender",device_vendor="F5",device_version="15.1.2.1.0.317.10",dos_attack_event="Attack Sampled",dos_attack_id="550542726",dos_attack_name="TCP Push Flood",dos_packets_dropped="0",dos_packets_received="117",errdefs_msgno="23003138",errdefs_msg_name="Network DoS Event",flow_id="0000000000000000",severity="4",dos_mode="Enforced",dos_src="Volumetric, Per-SrcIP, VS-specific attack, metric:PPS",partition_name="Common",route_domain="0",source_ip="10.103.6.10",source_port="39219",vlan="/Common/vlan3006_client" Explanation of the message content: Action = “Allow” indicates that BIG-IP is not dropping packets (from the DoS point of view), it’s just giving the operator the information that within the last second the protected context (here: /Common/www_10_103_80_80) has received 117 (dos_packets_received) push packets (dos_attack_name) from source IP 10.103.6.10 (source_ip) within the last second. Btw., because this is a “Volumetric, Per-SrcIP, VS-specific attack” (dos_src) log message, it also tells you that the source IP has been identified as a bad actor (Also see my article: Increasing accuracy using Bad Actor and Attacked Destination). Therefore, this event was triggered by the Bad Actor configuration of the TCP Push flood vector. Mitigation threshold Once the incoming packet rate has crossed the mitigation threshold of a DoS vector or an attack signature, then BIG-IP starts to drop (rate-limit) traffic above that value. This is when we declare being under an DDoS attack because the protected context (server, service, network, BIG-IP, etc.) will be negatively affected by this high number of packets per second. Now the BIG-IP DoS device (AFM/DHD) needs to lower the number of packets hitting the affected context and that’s why it starts to drop packets on the identified vector. Again, this mitigation threshold can be set manually or auto-calculated based on history or a multiplication of the detection threshold. (Explanation of the F5 DDoS threshold modes) Here is an example of a drop log message: Jun 17 23:05:03 172.30.107.11 action="Drop",hostname="lon-i5800-1.pme.itc.f5net.com",bigip_mgmt_ip="172.30.107.11",context_name="Device",date_time="Jun 17 2021 22:54:29",dest_ip="10.103.2.80",dest_port="0",device_product="DDoS Hybrid Defender",device_vendor="F5",device_version="15.1.2.1.0.317.10",dos_attack_event="Attack Sampled",dos_attack_id="3221546531",dos_attack_name="Bad TCP flags (all cleared)",dos_packets_dropped="152224",dos_packets_received="152224",errdefs_msgno="23003138",errdefs_msg_name="Network DoS Event",flow_id="0000000000000000",severity="4",dos_mode="Enforced",dos_src="Volumetric, Aggregated across all SrcIP's, Device-Wide attack, metric:PPS",partition_name="Common",route_domain="0",source_ip="10.103.6.10",source_port="12826",vlan="/Common/vlan3006_client" In this example, the message is an aggregation of all source IPs (dos_src="Volumetric, Aggregated across all SrcIP's, Device-Wide attack) of the dropped packets (dos_packets_dropped="152224") during the last second. Therefore, the source IP (source_ip="10.103.6.10”) is just a representer for all source IPs with dropped packets within the last second. This is because there was no “bad actor” identified. This is usually the case, when the bad actor functionality is not configured, or when every packet has a different source IP. Structure of the dashboards These two main logging events (allow and drop) are what we have adapted to the visualization of the DDoS dashboard. The DDoS operator needs to know when there is an anomaly in the network and which vectors are triggered by the anomaly. The operator also needs to know what destinations are involved and which sources cause the anomaly. But, it is also important to know when the network is under attack and again what mitigation has taken place on which destinations and sources. How many packets have been dropped etc.? When you open the “DDoS Dashboard” and choose the “Overview Dashboard” you will notice that the dashboard is divided into two halves. On the left side, you get the information when a DDoS device has dropped packets and on the right side, you get the information about “suspicious” packets, which means when traffic was above a detection threshold without being dropped (action “Allow”). Figure 2: Structure of the dashboard Within this dashboard, you will also find graphs or tables which do not split the dashboard into two sides. Here you find combined information from both events/areas (mitigation and suspicious). Explanation of some dashboards In the menu section Home/Analytics/Dashboard you will find all dashboards we created. Figure 3: Dashboard menu Let’s briefly explain what the main Dashboards are for. Figure 4: Dashboard overview DDoS_Dashboard is the board where you can see all events during a chosen timeframe, which you can select in the upper right corner within that dashboard. Figure 5: Period of time selection On the top of the page, you find the Dashboard Explorer. From here you can easily navigate between all the relevant dashboards without going through the Analytics section of the main menu. Figure 6: Dashboard Explorer DDOS STATS Dashboard: shows details of the rates and thresholds (packet rate, detection, and mitigation threshold, drop rate) for all vectors including bad actor and attacked destination thresholds. Here you need to select the relevant vector, and context to see the details. DDOS Network Vectors: show details of the incoming rate and drop rate per network vector on a one-pager. DDOS DNS Vectors: show details of the incoming rate and drop rate per DNS vector on a one-pager. DDOS Bad Header Vectors: show details of the incoming rate and drop rate per bad header vector on one page. DDOS SIP vector: show details of the incoming rate and drop rate per SIP vector on a one pager. Please note that all “stats” dashboards are based on the “dos_stats” table, which you need to you to your server. It is not done via the DoS logs. On the GitHub page, you will find instructions on how to do it. Next, you see the Stats Control Panel Figure 7: Stats Control Panel By default, it will show the events (drop/allow) for all vectors in all contexts (VS/PO, Device) on all DoS devices. But by using the drop-down menu you can filter on specific data. All filters you set can also easily be saved and used again. Kibana gives a lot of flexibility. Next, you get to the Top Attacks Timeline, which shows you the top 10 attack vectors, which have dropped packets. Figure 8: Attack Timeline When you mouseover then you get the number of dropped packets for that vector. To the right of this graph, you see the Attack Event Details. Figure 9: Attack Event Details This simply shows you how many logs you have received per log event. Remember every mechanism (for example per source event, per destination, aggregated, …) has its own logs. The next row shows on the left side how many packets had been dropped during the chosen time frame. Figure 10: Dropped vs. suspicious packets On the right side you see how many packets had been identified as suspicious because the rate was above the detection threshold, but not above the mitigation threshold. This event message has the action “Allow”. In the middle graph, you see the relation of suspicious packets vs. dropped packets vs. incoming packets (incoming packets is the summarization of dropped and suspicious packets). The next graph gives you also an overview of received packets vs. dropped packets. Figure 11: Incoming vs. dropped packets But here the data comes from the dos_stats table, so again it is only visible when you send the information. Keep in mind this is not done via the log messages. This is the part where you send the output of the “tmctl -c dos_stat” command to your log device. If you are not doing it, then you can remove this graph from the dashboard. The main difference to the graph in the middle above is, that you will see data also when there is no event (allow/drop) because depending on the configured frequency you send the “dos_stat” table, you get the data (snapshot). Graphs based on log events of course can only appear when there is an event and logs are sent. This graph shows all incoming packets counted by all enabled vectors, regardless of they are counted on bad actors, attacked destinations, or the global stats per vector. Same for the dropped packets. It gives an overall overview of incoming packets vs. dropped packets. To get more details on which vector or mechanism (BA, AD) did the mitigation, you need to go to the DDOS STATS Dashboard. A piece of important information for a DDoS Operator is to know which services (IPs) are under attack and which contexts or protected objects have been involved. Figure 12: Target information Of course, also which vectors are used by the attacker. This is what is shown in the next row. On the left two graphs you get this information for dropped packets. On the right two graphs you see it for packets above the detection threshold but below the mitigation. Attacked IP and Destination Port, shows you the attacked IPs including the destination ports. Attacked Protected Objects, shows you the Context (VS/PO, Device, Global) in relation to the attack vectors. Context “Global” is used for IPI (IP-Intelligence). In this example packets got dropped because source IPs were configured within the IPI policy “my_IPI” and the category “denial of service”. The mitigation was executed on the global level. IPI activities are shown as attack vectors. Figure 13: IP-Intelligence information When you mouseover you can get the full line. More details on the attack vectors and IPI activity you will see lower on the page. Attacked destination details In the next row, you find a table with information on IP addresses that have been identified as being attacked by “Attacked Destination Detection” configured on a vector. Figure 14: Attacked Destination Details Figure 15: Vector configuration What are the sources of an attack? The next graph gives you the information of the identified attackers. “Top AttackerIPs” shows you the top 10 attacks based on aggregated logs. When you have configured “Bad Actor Detection” then you will also get the information for the top 10 “bad actors” IPs. Identified “bad actor” IPs are certainly important information you want to keep an eye on. Figure 16: Source address information Bad Actor Details To get more information on “bad actors”, you can use the “Bad Actor Details” table, which will show you relevant information. Here is an example: Figure 17: Bad Actor details You can see that the UDF flood vector identified a flood for the bad actor IP “4.4.4.4” at 11:02 on the Device level. Most of the packets had been dropped (PPS vs. Dropped Packets). Within the next multiple 30 second intervals, you get again details for that bad actor IP. But at 11:06 you can see that the IP address got programmed into the “denial of service” category and after that, all traffic coming from that IP got dropped via the IP-Intelligence policy “my_IPI” on the “Global” level/context. BDoS Details The Dashboard will also give you information about BDoS signatures and their events. Figure 18: BDoS details In this example, you can see that the system generated (Signature Add) a BDoS signature at 11:23:30. Then this signature was used (Re-USED) for mitigation (Drop). Keep in mind you will only see the details of a signature when it gets created. If a signature is re-used and you want to see the details of the signatures which may have gotten created days or weeks before, then you need to filter for that signature within the timeframe it got created. Another view on attacked IPs The dashboards give you also another, comprehensive view on attacked and targeted (action allow) IP addresses. Here you probably best start to mouseover from the inner circle going outside and you will get information per attacked context. Figure 19: Combined view on sources, destination and vectors Details about DNS attacks Within the DNS section, you get details about DNS-related attacks. Figure 20: DNS attack overview per vector Figure 21: Detailed DNS attack overview Also, a different view on Bad Actor activities Figure 22: Bad Actor / attack vector / destination overview Since we hope the graphs are mostly self-explaining we don´t want to go through all of them. We also plan to add more or modify them based on your feedback. Now it’s time to talk about another component, which we already touched on multiple times within this article. Attack vector visualization A second component we have created is the visualization of the stats (incoming, detection, mitigation, etc.) per attack vector. This is an optional part and is not related to the DoS logging. It is based on the “dos_stat” table and gives a snapshot of the statistics based on the interval you have configured to send the data from BIG-IP into your ELK stack. In my article “Demonstration of Device DoS and Per-Service DoS protection,” I already introduced you to the “dos_stat” table, when I used it within my “show_DoS_stats_script”. Figure 23: DDoS stat table This script shows you the stats for all vectors and their threshold etc. By sending this data frequently into your ELK stack, you can visualize the data and get graphs for them. You then can easily see trends or anomalies within a defined time frame. You can also easily see what thresholds (detection/mitigation) the system has calculated. Figure 24: Activity (detection/mitigation)graph per vector In this example you can see, what the system has done during an attack. The green line shows the incoming packet rate for that vector. The yellow line shows the expected auto-calculated rate (detection rate). The blue line is the auto-calculated mitigation rate, which is at the beginning of this graph very high because the protected context has no stress. Then we can see that the packet rate increases massively and crossed the detection rate. This is when the DDoS operator needs to be informed because this rate is not “normal” (based on history) and therefore suspicious. This high packet rate has an impact on the stress of the protected context and the mitigation rate got adjusted below the incoming rate. At that point, the system started to defend and mitigate. But the incoming packet rate went down again for a short time. Here the mitigation stopped because the rate was below the mitigation threshold, which also got increased again because of no stress on the protected context anymore. Then the flood happened again. The mitigation threshold got adjusted, mitigation started. Later we can see the incoming rate sometimes climbed above the detection threshold but was not strong enough to affect the health of the protected context. Therefore, no mitigation took place. At around 11:53 we can see the flood increased again and enabled the mitigation. Please keep in mind that the granularity of this graph depends of course on the frequency you send the data into the ELK stack and the data is always a snapshot of the current stats. How to configure logging on BIG-IP tmsh create ltm pool pool_log_server members add { 1.1.1.1:5558 } tmsh create sys log-config destination remote-high-speed-log HSL_LOG_DEST { pool-name pool_log_server protocol udp } tmsh create sys log-config destination splunk SPLUNK_LOG_DEST forward-to HSL_LOG_DEST tmsh create sys log-config publisher KIBANA_LOG_PUBLISHER destinations add { SPLUNK_LOG_DEST } tmsh create security log profile LOG_PROFILE dos-network-publisher KIBANA_LOG_PUBLISHER protocol-dns-dos-publisher KIBANA_LOG_PUBLISHER protocol-sip-dos-publisher KIBANA_LOG_PUBLISHER ip-intelligence { log-translation-fields enabled log-publisher KIBANA_LOG_PUBLISHER } traffic-statistics { syncookies enabled log-publisher KIBANA_LOG_PUBLISHER } tmsh modify security log profile global-network dos-network-publisher KIBANA_LOG_PUBLISHER ip-intelligence { log-geo enabled log-rtbh enabled log-scrubber enabled log-shun enabled log-translation-fields enabled log-publisher KIBANA_LOG_PUBLISHER } protocol-dns-dos-publisher KIBANA_LOG_PUBLISHER protocol-sip-dos-publisher KIBANA_LOG_PUBLISHER traffic-statistics { log-publisher KIBANA_LOG_PUBLISHER syncookies enabled } tmsh modify security dos device-config dos-device-config log-publisher KIBANA_LOG_PUBLISHER Figure 25: Overview of logging configuration How to send the dos_stats table data modify (crontab -e) the crontab on BIG-IP and add: * * * * * nb_of_tmms=$(tmsh show sys tmm-info | grep Sys::TMM | wc -l);tmctl -c dos_stat -s context_name,vector_name,attack_detected,stats_rate,drops_rate,int_drops_rate,ba_stats_rate,ba_drops_rate,bd_stats_rate,bd_drops_rate,detection,mitigation_low,mitigation_high,detection_ba,mitigation_ba_low,mitigation_ba_high,detection_bd,mitigation_bd_low,mitigation_bd_high | grep -v "context_name" | sed '/^$/d' | sed "s/$/,$nb_of_tmms/g" | logger -n 1.1.1.1 --udp --port 5558 Modify IP and port appropriate. Better approach then using the crontab is to use an external monitor: https://support.f5.com/csp/article/K71282813 Anyhow, keep in mind more frequently logging generates more data on you logging device! Conclusion The DDoS dashboards based on an ELK stack give the DDoS operators visibility into their DDoS events. The dashboard consumes logs sent by BIG-IP based on L3/4/DNS DDoS events and visualizes them in graphs. These graphs provide relevant information on what kinds of attacks from which sources are going to which destinations. Based on your BIG-IP DoS config you get “bad actor” details or “attacked destinations” details listed. You will also see if IPs that have been blocked by certain IPI categories and more. In addition to other information shown, the ELK stack is also able to consume data from the dos_stats table, which gives you details about your network behaviors on a vector level. Further, you can see how “auto thresholds” calculate detection and mitigation thresholds. We hope that this article gives you an introduction to the DDoS ELK Dashboards. We also plan to publish another article on the explanation of the underlying architecture. Sven Mueller & Mohamed Shaat3.5KViews1like0CommentsLog Tcp And Http Request Response Info
Problem this snippet solves: This iRule logs a line for the following events: when a new TCP connection is established with a client when the HTTP headers of an HTTP request are received from the client when the HTTP headers of an HTTP response are received from the pool member when the TCP connection with a client is closed Code : # Here is a sample of the log output for a single TCP connection with three HTTP requests: : New TCP connection from 192.168.99.210:2675 to 192.168.101.41:80 : Client 192.168.99.210:2675 -> test_http_vip/test0.html?parameter=val (request) : Client 192.168.99.210:2675 -> test_http_vip/test0.html?parameter=val (response) - pool info http_pool 192.168.101.45 80 - status: 200 (request/response delta: 0ms) : Client 192.168.99.210:2675 -> test_http_vip/test1.html?parameter=val (request) : Client 192.168.99.210:2675 -> test_http_vip/test1.html?parameter=val (response) - pool info http_pool 192.168.101.45 80 - status: 200 (request/response delta: 0ms) : Client 192.168.99.210:2675 -> test_http_vip/test2.html?parameter=val (request) : Client 192.168.99.210:2675 -> test_http_vip/test2.html?parameter=val (response) - pool info http_pool 192.168.101.45 80 - status: 200 (request/response delta: 1ms) : Closed TCP connection from 192.168.99.210:2675 to 192.168.101.41:80 (open for: 1078ms) when CLIENT_ACCEPTED { # Get time for start of TCP connection in milleseconds set tcp_start_time [clock clicks -milliseconds] # Log the start of a new TCP connection log local0. "New TCP connection from [IP::client_addr]:[TCP::client_port] to [IP::local_addr]:[TCP::local_port]" } when HTTP_REQUEST { # Get time for start of HTTP request set http_request_time [clock clicks -milliseconds] # Log the start of a new HTTP request set LogString "Client [IP::client_addr]:[TCP::client_port] -> [HTTP::host][HTTP::uri]" log local0. "$LogString (request)" } when LB_SELECTED { log local0. "Client [IP::client_addr]:[TCP::client_port]: Selected [LB::server]" } when LB_FAILED { log local0. "Client [IP::client_addr]:[TCP::client_port]: Failed to [LB::server]" } when SERVER_CONNECTED { log local0. "Client [IP::client_addr]:[TCP::client_port]: Connected to [IP::server_addr]:[TCP::server_port]" } when HTTP_RESPONSE { # Received the response headers from the server. Log the pool name, IP and port, status and time delta log local0. "$LogString (response) - pool info: [LB::server] - status: [HTTP::status] (request/response delta: [expr {[clock clicks -milliseconds] - $http_request_time}] ms)" } when CLIENT_CLOSED { # Log the end time of the TCP connection log local0. "Closed TCP connection from [IP::client_addr]:[TCP::client_port] to [IP::local_addr]:[TCP::local_port] (open for: [expr {[clock clicks -milliseconds] - $tcp_start_time}] ms)" }2.2KViews1like3CommentsLog large HTTP payloads in chunks locally and remotely
Problem this snippet solves: Log HTTP POST request payloads remotely via High Speed Logging (HSL) to a syslog server and locally. Code : # Log POST request payloads remotely via HSL to a syslog server and locally. # Based on Steve Hillier's example and the HTTP::collect wiki page # https://devcentral.f5.com/s/wiki/iRules.http__collect.ashx # Note that although any size payload can theoretically be collected, the maximum size of a Tcl variable in v9 and v10 is 4MB # with a smaller functional maximum after charset expansion of approximately 1Mb. # In v11, the maximum variable size was increased to 32Mb. when RULE_INIT { # Log debug to /var/log/ltm? 1=yes, 0=no set static::payload_dbg 1 # Limit payload collection to 5Mb set static::max_collect_len 5242880 # HSL pool name set static::hsl_pool "my_hsl_tcp_pool" # Max characters to log locally (must be less than 1024 bytes) # https://devcentral.f5.com/s/wiki/iRules.log.ashx set static::max_chars 900 } when HTTP_REQUEST { # Only collect POST request payloads if {[HTTP::method] equals "POST"}{ if {$static::payload_dbg}{log local0. "POST request"} # Open HSL connection set hsl [HSL::open -proto TCP -pool $static::hsl_pool] # Get the content length so we can request the data to be processed in the HTTP_REQUEST_DATA event. if {[HTTP::header exists "Content-Length"]}{ set content_length [HTTP::header "Content-Length"] } else { set content_length 0 } # content_length of 0 indicates chunked data (of unknown size) if {$content_length > 0 && $content_length < $static::max_collect_len}{ set collect_length $content_length } else { set collect_length $static::max_collect_len } if {$static::payload_dbg}{log local0. "Content-Length: $content_length, Collect length: $collect_length"} } } when HTTP_REQUEST_DATA { # Log the bytes collected if {$static::payload_dbg}{log local0. "Collected [HTTP::payload length] bytes"} # Send all the collected payload to the remote syslog server HSL::send $hsl "<190>[HTTP::payload]\n" # Log the payload locally if {[HTTP::payload length] < $static::max_chars}{ log local0. "Payload=[HTTP::payload]" } else { # Initialize variables set remaining $payload set position 0 set count 1 set bytes_logged 0 # Loop through and log each chunk of the payload while {[string length $remaining] > $static::max_chars}{ # Get the current chunk to log (subtract 1 from the end as string range is 0 indexed) set current [string range $remaining $position [expr {$position + $static::max_chars -1}]] log local0. "chunk $count=$current" # Add the length of the current chunk to the position for the next chunk incr position [string length $current] # Get the next chunk to log set remaining [string range $remaining $position end] incr count incr bytes_logged $position log local0. "remaining bytes=[string length $remaining], \$position=$position, \$count=$count, \$bytes_logged=$bytes_logged" } if {[string length $remaining]}{ log local0. "chunk $count=$current" incr bytes_logged [string length $remaining] } log local0. "Logged $count chunks for a total of $bytes_logged bytes" } }704Views1like1CommentWhat is BIG-IQ?
tl;dr - BIG-IQ centralizes management, licensing, monitoring, and analytics for your dispersed BIG-IP infrastructure. If you have more than a few F5 BIG-IP's within your organization, managing devices as separate entities will become an administrative bottleneck and slow application deployments. Deploying cloud applications, you're potentially managing thousands of systems and having to deal with traditionallymonolithic administrative functions is a simple no-go. Enter BIG-IQ. BIG-IQ enables administrators to centrally manage BIG-IP infrastructure across the IT landscape. BIG-IQ discovers, tracks, manages, and monitors physical and virtual BIG-IP devices - in the cloud, on premise, or co-located at your preferred datacenter. BIG-IQ is a stand alone product available from F5 partners, or available through the AWS Marketplace. BIG-IQ consolidates common management requirements including but not limited to: Device discovery and monitoring: You can discovery, track, and monitor BIG-IP devices - including key metrics including CPU/memory, disk usage, and availability status Centralized Software Upgrades: Centrally manage BIG-IP upgrades (TMOS v10.20 and up) by uploading the release images to BIG-IQ and orchestrating the process for managed BIG-IPs. License Management: Manage BIG-IP virtual edition licenses, granting and revoking as you spin up/down resources. You can create license pools for applications or tenants for provisioning. BIG-IP Configuration Backup/Restore: Use BIG-IQ as a central repository of BIG-IP config files through ad-hoc or scheduled processes. Archive config to long term storage via automated SFTP/SCP. BIG-IP Device Cluster Support: Monitor high availability statuses and BIG-IP Device clusters. Integration to F5 iHealth Support Features: Upload and read detailed health reports of your BIG-IP's under management. Change Management: Evaluate, stage, and deploy configuration changes to BIG-IP. Create snapshots and config restore points and audit historical changes so you know who to blame. 😉 Certificate Management: Deploy, renew, or change SSL certs. Alerts allow you to plan ahead before certificates expire. Role-Based Access Control (RBAC): BIG-IQ controls access to it's managed services with role-based access controls (RBAC). You can create granular controls to create view, edit, and deploy provisioned services. Prebuilt roles within BIG-IQ easily allow multiple IT disciplines access to the areas of expertise they need without over provisioning permissions. Fig. 1 BIG-IQ 5.2 - Device Health Management BIG-IQ centralizes statistics and analytics visibility, extending BIG-IP's AVR engine. BIG-IQ collects and aggregates statistics from BIG-IP devices, locally and in the cloud. View metrics such as transactions per second, client latency, response throughput. You can create RBAC roles so security teams have private access to view DDoS attack mitigations, firewall rules triggered, or WebSafe and MobileSafe management dashboards. The reporting extends across all modules BIG-IQ manages, drastically easing the pane-of-glass view we all appreciate from management applications. For further reading on BIG-IQ please check out the following links: BIG-IQ Centralized Management @ F5.com Getting Started with BIG-IQ @ F5 University DevCentral BIG-IQ BIG-IQ @ Amazon Marketplace8.2KViews1like1Comment