elk
4 TopicsImplementing BIG-IP WAF logging and visibility with ELK
Scope This technical article is useful for BIG-IP users familiar with web application security and the implementation and use of the Elastic Stack.This includes, application security professionals, infrastructure management operators and SecDevOps/DevSecOps practitioners. The focus is for WAF logs exclusively.Firewall, Bot, or DoS mitigation logging into the Elastic Stack is the subject of a future article. Introduction This article focusses on the required configuration for sending Web Application Firewall (WAF) logs from the BIG-IP Advanced WAF (or BIG-IP ASM) module to an Elastic Stack (a.k.a. Elasticsearch-Logstash-Kibana or ELK). First, this article goes over the configuration of BIG-IP.It is configured with a security policy and a logging profile attached to the virtual server that is being protected. This can be configured via the BIG-IP user interface (TMUI) or through the BIG-IP declarative interface (AS3). The configuration of the Elastic Strack is discussed next.The configuration of filters adapted to processing BIP-IP WAF logs. Finally, the article provides some initial guidance to the metrics that can be taken into consideration for visibility.It discusses the use of dashboards and provides some recommendations with regards to the potentially useful visualizations. Pre-requisites and Initial Premise For the purposes of this article and to follow the steps outlined below, the user will need to have at least one BIG-IP Adv. WAF running TMOS version 15.1 or above (note that this may work with previous version but has not been tested).The target BIG-IP is already configured with: A virtual Server A WAF policy An operational Elastic Stack is also required. The administrator will need to have configuration and administrative privileges on both the BIG-IP and Elastic Stack infrastructure.They will also need to be familiar with the network topology linking the BIG-IP with the Elastic Search cluster/infrastructure. It is assumed that you want to use your Elastic Search (ELK) logging infrastructure to gain visibility into BIG-IP WAF events. Logging Profile Configuration An essential part of getting WAF logs to the proper destination(s) is the Logging Profile.The following will go over the configuration of the Logging Profile that sends data to the Elastic Stack. Overview of the steps: Create Logging Profile Associate Logging Profile with the Virtual Server After following the procedure below On the wire, logs lines sent from the BIG-IP are comma separated value pairs that look something like the sample below: Aug 25 03:07:19 localhost.localdomainASM:unit_hostname="bigip1",management_ip_address="192.168.41.200",management_ip_address_2="N/A",http_class_name="/Common/log_to_elk_policy",web_application_name="/Common/log_to_elk_policy",policy_name="/Common/log_to_elk_policy",policy_apply_date="2020-08-10 06:50:39",violations="HTTP protocol compliance failed",support_id="5666478231990524056",request_status="blocked",response_code="0",ip_client="10.43.0.86",route_domain="0",method="GET",protocol="HTTP",query_string="name='",x_forwarded_for_header_value="N/A",sig_ids="N/A",sig_names="N/A",date_time="2020-08-25 03:07:19",severity="Error",attack_type="Non-browser Client,HTTP Parser Attack",geo_location="N/A",ip_address_intelligence="N/A",username="N/A",session_id="0",src_port="39348",dest_port="80",dest_ip="10.43.0.201",sub_violations="HTTP protocol compliance failed:Bad HTTP version",virus_name="N/A",violation_rating="5",websocket_direction="N/A",websocket_message_type="N/A",device_id="N/A",staged_sig_ids="",staged_sig_names="",threat_campaign_names="N/A",staged_threat_campaign_names="N/A",blocking_exception_reason="N/A",captcha_result="not_received",microservice="N/A",tap_event_id="N/A",tap_vid="N/A",vs_name="/Common/adv_waf_vs",sig_cves="N/A",staged_sig_cves="N/A",uri="/random",fragment="",request="GET /random?name=' or 1 = 1' HTTP/1.1\r\n",response="Response logging disabled" Please choose one of the methods below.The configuration can be done through the web-based user interface (TMUI), the command line interface (TMSH), directly with a declarative AS3 REST API call, or with the BIG-IP native REST API.This last option is not discussed herein. TMUI Steps: Create Profile Connect to the BIG-IP web UI and login with administrative rights Navigate to Security >> Event Logs >> Logging Profiles Select “Create” Fill out the configuration fields as follows: Profile Name (mandatory) Enable Application Security Set Storage Destination to Remote Storage Set Logging Format to Key-Value Pairs (Splunk) In the Server Addresses field, enter an IP Address and Port then click on Add as shown below: Click on Create Add Logging Profile to virtual server with the policy Select target virtual server and click on the Security tab (Local Traffic >> Virtual Servers : Virtual Server List >> [target virtualserver] ) Highlight the Log Profile from the Available column and put it in the Selected column as shown in the example below (log profile is “log_all_to_elk”): Click on Update At this time the BIG-IP will forward logs Elastic Stack. TMSH Steps: Create profile ssh into the BIG-IP command line interface (CLI) from the tmsh prompt enter the following: create security log profile [name_of_profile] application add { [name_of_profile] { logger-type remote remote-storage splunk servers add { [IP_address_for_ELK]:[TCP_Port_for_ELK] { } } } } For example: create security log profile dc_show_creation_elk application add { dc_show_creation_elk { logger-type remote remote-storage splunk servers add { 10.45.0.79:5244 { } } } } 3. ensure that the changes are saved: save sys config partitions all Add Logging Profile to virtual server with the policy 1.From the tmsh prompt (assuming you are still logged in) enter the following: modify ltm virtual [VS_name] security-log-profiles add { [name_of_profile] } For example: modify ltm virtual adv_waf_vs security-log-profiles add { dc_show_creation_elk } 2.ensure that the changes are saved: save sys config partitions all At this time the BIG-IP sends logs to the Elastic Stack. AS3 Application Services 3 (AS3) is a BIG-IP configuration API endpoint that allows the user to create an application from the ground up.For more information on F5’s AS3, refer to link. In order to attach a security policy to a virtual server, the AS3 declaration can either refer to a policy present on the BIG-IP or refer to a policy stored in XML format and available via HTTP to the BIG-IP (ref. link). The logging profile can be created and associated to the virtual server directly as part of the AS3 declaration. For more information on the creation of a WAF logging profile, refer to the documentation found here. The following is an example of a pa rt of an AS3 declaration that will create security log profile that can be used to log to Elastic Stack: "secLogRemote": { "class": "Security_Log_Profile", "application": { "localStorage": false, "maxEntryLength": "10k", "protocol": "tcp", "remoteStorage": "splunk", "reportAnomaliesEnabled": true, "servers": [ { "address": "10.45.0.79", "port": "5244" } ] } In the sample above, the ELK stack IP address is 10.45.0.79 and listens on port 5244 for BIG-IP WAF logs.Note that the log format used in this instance is “Splunk”.There are no declared filters and thus, only the illegal requests will get logged to the Elastic Stack.A sample AS3 declaration can be found here. ELK Configuration The Elastic Stack configuration consists of creating a new input on Logstash.This is achieved by adding an input/filter/ output configuration to the Logstash configuration file.Optionally, the Logstash administrator might want to create a separate pipeline – for more information, refer to this link. The following is a Logstash configuration known to work with WAF logs coming from BIG-IP: input { syslog { port => 5244 } } filter { grok { match => { "message" => [ "attack_type=\"%{DATA:attack_type}\"", ",blocking_exception_reason=\"%{DATA:blocking_exception_reason}\"", ",date_time=\"%{DATA:date_time}\"", ",dest_port=\"%{DATA:dest_port}\"", ",ip_client=\"%{DATA:ip_client}\"", ",is_truncated=\"%{DATA:is_truncated}\"", ",method=\"%{DATA:method}\"", ",policy_name=\"%{DATA:policy_name}\"", ",protocol=\"%{DATA:protocol}\"", ",request_status=\"%{DATA:request_status}\"", ",response_code=\"%{DATA:response_code}\"", ",severity=\"%{DATA:severity}\"", ",sig_cves=\"%{DATA:sig_cves}\"", ",sig_ids=\"%{DATA:sig_ids}\"", ",sig_names=\"%{DATA:sig_names}\"", ",sig_set_names=\"%{DATA:sig_set_names}\"", ",src_port=\"%{DATA:src_port}\"", ",sub_violations=\"%{DATA:sub_violations}\"", ",support_id=\"%{DATA:support_id}\"", "unit_hostname=\"%{DATA:unit_hostname}\"", ",uri=\"%{DATA:uri}\"", ",violation_rating=\"%{DATA:violation_rating}\"", ",vs_name=\"%{DATA:vs_name}\"", ",x_forwarded_for_header_value=\"%{DATA:x_forwarded_for_header_value}\"", ",outcome=\"%{DATA:outcome}\"", ",outcome_reason=\"%{DATA:outcome_reason}\"", ",violations=\"%{DATA:violations}\"", ",violation_details=\"%{DATA:violation_details}\"", ",request=\"%{DATA:request}\"" ] } break_on_match => false } mutate { split => { "attack_type" => "," } split => { "sig_ids" => "," } split => { "sig_names" => "," } split => { "sig_cves" => "," } split => { "staged_sig_ids" => "," } split => { "staged_sig_names" => "," } split => { "staged_sig_cves" => "," } split => { "sig_set_names" => "," } split => { "threat_campaign_names" => "," } split => { "staged_threat_campaign_names" => "," } split => { "violations" => "," } split => { "sub_violations" => "," } } if [x_forwarded_for_header_value] != "N/A" { mutate { add_field => { "source_host" => "%{x_forwarded_for_header_value}"}} } else { mutate { add_field => { "source_host" => "%{ip_client}"}} } geoip { source => "source_host" } } output { elasticsearch { hosts => ['localhost:9200'] index => "big_ip-waf-logs-%{+YYY.MM.dd}" } } After adding the configuration above to the Logstash parameters, you will need to restart the Logstash instance to take the new logs into configuration.The sample above is also available here. The Elastic Stack is now ready to process the incoming logs.You can start sending traffic to your policy and start seeing logs populating the Elastic Stack. If you are looking for a test tool to generate traffic to your Virtual Server, F5 provides a simpleWAF tester tool that can be found here. At this point, you can start creating dashboards on the Elastic Stack that will satisfy your operational needs with the following overall steps: ·Ensure that the log index is being created (Stack Management >> Index Management) ·Create a Kibana Index Pattern (Stack Management>>Index patterns) ·You can now peruse the logs from the Kibana discover menu (Discover) ·And start creating visualizations that will be included in your Dashboards (Dashboards >> Editing Simple WAF Dashboard) A complete Elastic Stack configuration can be found here – note that this can be used with both BIG-IP WAF and NGINX App Protect. Conclusion You can now leverage the widely available Elastic Stack to log and visualize BIG-IP WAF logs.From dashboard perspective it may be useful to track the following metrics: -Request Rate -Response codes -The distribution of requests in term of clean, blocked or alerted status -Identify the top talkers making requests -Track the top URL’s being accessed -Top violator source IP An example or the dashboard might look like the following:13KViews5likes6CommentsL3/4/DNS DDoS Reporting with Elastic Search and Kibana
Dear Reader, In this article, I would like to, in collaboration with my colleague Mohamed Shaath, show you how to use DDoS reporting and visibility dashboards that we have created based on an ELK (Elastic Search Logstash and Kibana) stack. The goal is to give you templates based on Open-Source software to address typical questions DDoS operators have and need to answer when an incident happens. Another component we added is the visualization of incoming packets, dropped packets, detection, and mitigation thresholds per attack vector. The idea here is to give you insights into auto-calculated thresholds compared to incoming rates. It will also give you the possibility to see anomalies in traffic behavior. Hopefully, the visualization will also help you with fine-tuning the DoS vector configuration (a typical example of this is the floor value of a vector). This article will give you an introduction to some of the graphs we provide together with the templates. Feel free to arrange or modify them in the way you need when you use the solution. We are also very happy to get your feedback, so we can optimize the dashboards and graphs in a way that is most useful for DDoS operators. Fundamental understanding of log events All DDoS configuration relay basically on two thresholds, regardless of the chosen threshold (manual, fully automatic, multiplier, …): Detection and Mitigation Figure 1: Detection and Mitigation rate “Detection” means, inform the DDoS operator that the incoming rate is above the configured (or auto-calculated rate based on the history) rate. Do not block traffic, just send out specific log information. The “detection” value is usually set or calculated to a rate that is just within the expected “normal” rate. That also means, everything above that value is not “normal” and therefore suspicious, but not necessarily an attack. But the DDoS operator should be aware of that event. Exactly this is happening when a packet rate crosses the detection rate: BIG-IP will send out log messages to the log server (when configured). Within the ELK solution we are introducing, we use the “Splunk” logging format, which sends the information in key/value format. That makes the understanding of the fields much easier. Here is an example of a log message, which is sent out when the packet rate has crossed the detection threshold. Jun 17 23:08:46 172.30.107.11 action="Allow",hostname="lon-i5800-1.pme.itc.f5net.com",bigip_mgmt_ip="172.30.107.11",context_name="/Common/www_10_103_2_80_80",date_time="Jun 17 2021 22:58:12",dest_ip="10.103.2.80",dest_port="80",device_product="DDoS Hybrid Defender",device_vendor="F5",device_version="15.1.2.1.0.317.10",dos_attack_event="Attack Sampled",dos_attack_id="550542726",dos_attack_name="TCP Push Flood",dos_packets_dropped="0",dos_packets_received="117",errdefs_msgno="23003138",errdefs_msg_name="Network DoS Event",flow_id="0000000000000000",severity="4",dos_mode="Enforced",dos_src="Volumetric, Per-SrcIP, VS-specific attack, metric:PPS",partition_name="Common",route_domain="0",source_ip="10.103.6.10",source_port="39219",vlan="/Common/vlan3006_client" Explanation of the message content: Action = “Allow” indicates that BIG-IP is not dropping packets (from the DoS point of view), it’s just giving the operator the information that within the last second the protected context (here: /Common/www_10_103_80_80) has received 117 (dos_packets_received) push packets (dos_attack_name) from source IP 10.103.6.10 (source_ip) within the last second. Btw., because this is a “Volumetric, Per-SrcIP, VS-specific attack” (dos_src) log message, it also tells you that the source IP has been identified as a bad actor (Also see my article: Increasing accuracy using Bad Actor and Attacked Destination). Therefore, this event was triggered by the Bad Actor configuration of the TCP Push flood vector. Mitigation threshold Once the incoming packet rate has crossed the mitigation threshold of a DoS vector or an attack signature, then BIG-IP starts to drop (rate-limit) traffic above that value. This is when we declare being under an DDoS attack because the protected context (server, service, network, BIG-IP, etc.) will be negatively affected by this high number of packets per second. Now the BIG-IP DoS device (AFM/DHD) needs to lower the number of packets hitting the affected context and that’s why it starts to drop packets on the identified vector. Again, this mitigation threshold can be set manually or auto-calculated based on history or a multiplication of the detection threshold. (Explanation of the F5 DDoS threshold modes) Here is an example of a drop log message: Jun 17 23:05:03 172.30.107.11 action="Drop",hostname="lon-i5800-1.pme.itc.f5net.com",bigip_mgmt_ip="172.30.107.11",context_name="Device",date_time="Jun 17 2021 22:54:29",dest_ip="10.103.2.80",dest_port="0",device_product="DDoS Hybrid Defender",device_vendor="F5",device_version="15.1.2.1.0.317.10",dos_attack_event="Attack Sampled",dos_attack_id="3221546531",dos_attack_name="Bad TCP flags (all cleared)",dos_packets_dropped="152224",dos_packets_received="152224",errdefs_msgno="23003138",errdefs_msg_name="Network DoS Event",flow_id="0000000000000000",severity="4",dos_mode="Enforced",dos_src="Volumetric, Aggregated across all SrcIP's, Device-Wide attack, metric:PPS",partition_name="Common",route_domain="0",source_ip="10.103.6.10",source_port="12826",vlan="/Common/vlan3006_client" In this example, the message is an aggregation of all source IPs (dos_src="Volumetric, Aggregated across all SrcIP's, Device-Wide attack) of the dropped packets (dos_packets_dropped="152224") during the last second. Therefore, the source IP (source_ip="10.103.6.10”) is just a representer for all source IPs with dropped packets within the last second. This is because there was no “bad actor” identified. This is usually the case, when the bad actor functionality is not configured, or when every packet has a different source IP. Structure of the dashboards These two main logging events (allow and drop) are what we have adapted to the visualization of the DDoS dashboard. The DDoS operator needs to know when there is an anomaly in the network and which vectors are triggered by the anomaly. The operator also needs to know what destinations are involved and which sources cause the anomaly. But, it is also important to know when the network is under attack and again what mitigation has taken place on which destinations and sources. How many packets have been dropped etc.? When you open the “DDoS Dashboard” and choose the “Overview Dashboard” you will notice that the dashboard is divided into two halves. On the left side, you get the information when a DDoS device has dropped packets and on the right side, you get the information about “suspicious” packets, which means when traffic was above a detection threshold without being dropped (action “Allow”). Figure 2: Structure of the dashboard Within this dashboard, you will also find graphs or tables which do not split the dashboard into two sides. Here you find combined information from both events/areas (mitigation and suspicious). Explanation of some dashboards In the menu section Home/Analytics/Dashboard you will find all dashboards we created. Figure 3: Dashboard menu Let’s briefly explain what the main Dashboards are for. Figure 4: Dashboard overview DDoS_Dashboard is the board where you can see all events during a chosen timeframe, which you can select in the upper right corner within that dashboard. Figure 5: Period of time selection On the top of the page, you find the Dashboard Explorer. From here you can easily navigate between all the relevant dashboards without going through the Analytics section of the main menu. Figure 6: Dashboard Explorer DDOS STATS Dashboard: shows details of the rates and thresholds (packet rate, detection, and mitigation threshold, drop rate) for all vectors including bad actor and attacked destination thresholds. Here you need to select the relevant vector, and context to see the details. DDOS Network Vectors: show details of the incoming rate and drop rate per network vector on a one-pager. DDOS DNS Vectors: show details of the incoming rate and drop rate per DNS vector on a one-pager. DDOS Bad Header Vectors: show details of the incoming rate and drop rate per bad header vector on one page. DDOS SIP vector: show details of the incoming rate and drop rate per SIP vector on a one pager. Please note that all “stats” dashboards are based on the “dos_stats” table, which you need to you to your server. It is not done via the DoS logs. On the GitHub page, you will find instructions on how to do it. Next, you see the Stats Control Panel Figure 7: Stats Control Panel By default, it will show the events (drop/allow) for all vectors in all contexts (VS/PO, Device) on all DoS devices. But by using the drop-down menu you can filter on specific data. All filters you set can also easily be saved and used again. Kibana gives a lot of flexibility. Next, you get to the Top Attacks Timeline, which shows you the top 10 attack vectors, which have dropped packets. Figure 8: Attack Timeline When you mouseover then you get the number of dropped packets for that vector. To the right of this graph, you see the Attack Event Details. Figure 9: Attack Event Details This simply shows you how many logs you have received per log event. Remember every mechanism (for example per source event, per destination, aggregated, …) has its own logs. The next row shows on the left side how many packets had been dropped during the chosen time frame. Figure 10: Dropped vs. suspicious packets On the right side you see how many packets had been identified as suspicious because the rate was above the detection threshold, but not above the mitigation threshold. This event message has the action “Allow”. In the middle graph, you see the relation of suspicious packets vs. dropped packets vs. incoming packets (incoming packets is the summarization of dropped and suspicious packets). The next graph gives you also an overview of received packets vs. dropped packets. Figure 11: Incoming vs. dropped packets But here the data comes from the dos_stats table, so again it is only visible when you send the information. Keep in mind this is not done via the log messages. This is the part where you send the output of the “tmctl -c dos_stat” command to your log device. If you are not doing it, then you can remove this graph from the dashboard. The main difference to the graph in the middle above is, that you will see data also when there is no event (allow/drop) because depending on the configured frequency you send the “dos_stat” table, you get the data (snapshot). Graphs based on log events of course can only appear when there is an event and logs are sent. This graph shows all incoming packets counted by all enabled vectors, regardless of they are counted on bad actors, attacked destinations, or the global stats per vector. Same for the dropped packets. It gives an overall overview of incoming packets vs. dropped packets. To get more details on which vector or mechanism (BA, AD) did the mitigation, you need to go to the DDOS STATS Dashboard. A piece of important information for a DDoS Operator is to know which services (IPs) are under attack and which contexts or protected objects have been involved. Figure 12: Target information Of course, also which vectors are used by the attacker. This is what is shown in the next row. On the left two graphs you get this information for dropped packets. On the right two graphs you see it for packets above the detection threshold but below the mitigation. Attacked IP and Destination Port, shows you the attacked IPs including the destination ports. Attacked Protected Objects, shows you the Context (VS/PO, Device, Global) in relation to the attack vectors. Context “Global” is used for IPI (IP-Intelligence). In this example packets got dropped because source IPs were configured within the IPI policy “my_IPI” and the category “denial of service”. The mitigation was executed on the global level. IPI activities are shown as attack vectors. Figure 13: IP-Intelligence information When you mouseover you can get the full line. More details on the attack vectors and IPI activity you will see lower on the page. Attacked destination details In the next row, you find a table with information on IP addresses that have been identified as being attacked by “Attacked Destination Detection” configured on a vector. Figure 14: Attacked Destination Details Figure 15: Vector configuration What are the sources of an attack? The next graph gives you the information of the identified attackers. “Top AttackerIPs” shows you the top 10 attacks based on aggregated logs. When you have configured “Bad Actor Detection” then you will also get the information for the top 10 “bad actors” IPs. Identified “bad actor” IPs are certainly important information you want to keep an eye on. Figure 16: Source address information Bad Actor Details To get more information on “bad actors”, you can use the “Bad Actor Details” table, which will show you relevant information. Here is an example: Figure 17: Bad Actor details You can see that the UDF flood vector identified a flood for the bad actor IP “4.4.4.4” at 11:02 on the Device level. Most of the packets had been dropped (PPS vs. Dropped Packets). Within the next multiple 30 second intervals, you get again details for that bad actor IP. But at 11:06 you can see that the IP address got programmed into the “denial of service” category and after that, all traffic coming from that IP got dropped via the IP-Intelligence policy “my_IPI” on the “Global” level/context. BDoS Details The Dashboard will also give you information about BDoS signatures and their events. Figure 18: BDoS details In this example, you can see that the system generated (Signature Add) a BDoS signature at 11:23:30. Then this signature was used (Re-USED) for mitigation (Drop). Keep in mind you will only see the details of a signature when it gets created. If a signature is re-used and you want to see the details of the signatures which may have gotten created days or weeks before, then you need to filter for that signature within the timeframe it got created. Another view on attacked IPs The dashboards give you also another, comprehensive view on attacked and targeted (action allow) IP addresses. Here you probably best start to mouseover from the inner circle going outside and you will get information per attacked context. Figure 19: Combined view on sources, destination and vectors Details about DNS attacks Within the DNS section, you get details about DNS-related attacks. Figure 20: DNS attack overview per vector Figure 21: Detailed DNS attack overview Also, a different view on Bad Actor activities Figure 22: Bad Actor / attack vector / destination overview Since we hope the graphs are mostly self-explaining we don´t want to go through all of them. We also plan to add more or modify them based on your feedback. Now it’s time to talk about another component, which we already touched on multiple times within this article. Attack vector visualization A second component we have created is the visualization of the stats (incoming, detection, mitigation, etc.) per attack vector. This is an optional part and is not related to the DoS logging. It is based on the “dos_stat” table and gives a snapshot of the statistics based on the interval you have configured to send the data from BIG-IP into your ELK stack. In my article “Demonstration of Device DoS and Per-Service DoS protection,” I already introduced you to the “dos_stat” table, when I used it within my “show_DoS_stats_script”. Figure 23: DDoS stat table This script shows you the stats for all vectors and their threshold etc. By sending this data frequently into your ELK stack, you can visualize the data and get graphs for them. You then can easily see trends or anomalies within a defined time frame. You can also easily see what thresholds (detection/mitigation) the system has calculated. Figure 24: Activity (detection/mitigation)graph per vector In this example you can see, what the system has done during an attack. The green line shows the incoming packet rate for that vector. The yellow line shows the expected auto-calculated rate (detection rate). The blue line is the auto-calculated mitigation rate, which is at the beginning of this graph very high because the protected context has no stress. Then we can see that the packet rate increases massively and crossed the detection rate. This is when the DDoS operator needs to be informed because this rate is not “normal” (based on history) and therefore suspicious. This high packet rate has an impact on the stress of the protected context and the mitigation rate got adjusted below the incoming rate. At that point, the system started to defend and mitigate. But the incoming packet rate went down again for a short time. Here the mitigation stopped because the rate was below the mitigation threshold, which also got increased again because of no stress on the protected context anymore. Then the flood happened again. The mitigation threshold got adjusted, mitigation started. Later we can see the incoming rate sometimes climbed above the detection threshold but was not strong enough to affect the health of the protected context. Therefore, no mitigation took place. At around 11:53 we can see the flood increased again and enabled the mitigation. Please keep in mind that the granularity of this graph depends of course on the frequency you send the data into the ELK stack and the data is always a snapshot of the current stats. How to configure logging on BIG-IP tmsh create ltm pool pool_log_server members add { 1.1.1.1:5558 } tmsh create sys log-config destination remote-high-speed-log HSL_LOG_DEST { pool-name pool_log_server protocol udp } tmsh create sys log-config destination splunk SPLUNK_LOG_DEST forward-to HSL_LOG_DEST tmsh create sys log-config publisher KIBANA_LOG_PUBLISHER destinations add { SPLUNK_LOG_DEST } tmsh create security log profile LOG_PROFILE dos-network-publisher KIBANA_LOG_PUBLISHER protocol-dns-dos-publisher KIBANA_LOG_PUBLISHER protocol-sip-dos-publisher KIBANA_LOG_PUBLISHER ip-intelligence { log-translation-fields enabled log-publisher KIBANA_LOG_PUBLISHER } traffic-statistics { syncookies enabled log-publisher KIBANA_LOG_PUBLISHER } tmsh modify security log profile global-network dos-network-publisher KIBANA_LOG_PUBLISHER ip-intelligence { log-geo enabled log-rtbh enabled log-scrubber enabled log-shun enabled log-translation-fields enabled log-publisher KIBANA_LOG_PUBLISHER } protocol-dns-dos-publisher KIBANA_LOG_PUBLISHER protocol-sip-dos-publisher KIBANA_LOG_PUBLISHER traffic-statistics { log-publisher KIBANA_LOG_PUBLISHER syncookies enabled } tmsh modify security dos device-config dos-device-config log-publisher KIBANA_LOG_PUBLISHER Figure 25: Overview of logging configuration How to send the dos_stats table data modify (crontab -e) the crontab on BIG-IP and add: * * * * * nb_of_tmms=$(tmsh show sys tmm-info | grep Sys::TMM | wc -l);tmctl -c dos_stat -s context_name,vector_name,attack_detected,stats_rate,drops_rate,int_drops_rate,ba_stats_rate,ba_drops_rate,bd_stats_rate,bd_drops_rate,detection,mitigation_low,mitigation_high,detection_ba,mitigation_ba_low,mitigation_ba_high,detection_bd,mitigation_bd_low,mitigation_bd_high | grep -v "context_name" | sed '/^$/d' | sed "s/$/,$nb_of_tmms/g" | logger -n 1.1.1.1 --udp --port 5558 Modify IP and port appropriate. Better approach then using the crontab is to use an external monitor: https://support.f5.com/csp/article/K71282813 Anyhow, keep in mind more frequently logging generates more data on you logging device! Conclusion The DDoS dashboards based on an ELK stack give the DDoS operators visibility into their DDoS events. The dashboard consumes logs sent by BIG-IP based on L3/4/DNS DDoS events and visualizes them in graphs. These graphs provide relevant information on what kinds of attacks from which sources are going to which destinations. Based on your BIG-IP DoS config you get “bad actor” details or “attacked destinations” details listed. You will also see if IPs that have been blocked by certain IPI categories and more. In addition to other information shown, the ELK stack is also able to consume data from the dos_stats table, which gives you details about your network behaviors on a vector level. Further, you can see how “auto thresholds” calculate detection and mitigation thresholds. We hope that this article gives you an introduction to the DDoS ELK Dashboards. We also plan to publish another article on the explanation of the underlying architecture. Sven Mueller & Mohamed Shaat3.5KViews1like0CommentsHow I did it - "Visualizing Data with F5 TS and the Elastic ELK Stack"
With the F5 BIG-IP and Telemetry Streaming I have the ability to send BIG-IP metrics to a variety of third-party analytics vendors. One of the more popular of these is Elastic. Elastic's ELK Stack, (acronym for Elasticsearch, Logstash, Kibana) provides a platform where I can store, search, analyze and visualize my BIG-IP telemetry data. With said, here's an overview of "How I did it"; integrating and visualizing data with the ELK Stack. P.S. Make sure to stay for the movie. Application Services 3 Extension (AS3) There are several resources, (logging profiles, log publishers, iRules, etc.) that must be configured on the BIG-IP to enable remote logging. I utilized AS3 to deploy and manage these resources. I used Postman to apply a single REST API declaration to the AS3 endpoint. Telemetry Streaming (TS) F5's Telemetry Streaming, (TS) service enables the BIG-IP to stream telemetry data to a variety of third-party analytics providers. Aside from the aforementioned resources, configuring TS to stream to a consumer, (Logstash in this instance), is simply a REST call away. Just as I did for AS3, I utilized Postman to post a single declaration to the BIG-IP. Elastic (ELK) Stack Configuring the ELK stack to receive and ingest BIG-IP telemetry is a fairly simple process. Logstash, (the "L" in ELK) is the data processor I used to ingest data into the stack. To accomplish this, I applied the sample Logstash configuration file. The configuration file specifies, (among other items) the listener port, message format, and the Elasticsearch index naming format. Dashboards Getting telemetry data into Elasticsearch is great but only if you can make use of it. If I'm going to utilize the data, I need to visualize the data; (should probably trademark that). For visualization, i created a couple sample dashboards. The dashboards, (community-supported and perhaps not suitable for framing) report various relevant BIG-IP performance metrics and WAF incident information. F5 BIG-IP Advanced WAF Insights F5 BIG-IP Performance Metrics Check it Out Rather than walk you through the entire configuration, how about a movie? Click on the link (image) below for a brief walkthrough demo integrating F5's BIG-IP with Elastic's ELK stack using F5 Telemetry Streaming. Try it Out Liked what you saw? If that's the case, (as I hope it was) try it out for yourself. Checkout F5's CloudDocs for guidance on configuring your BIG-IP(s) with the F5 Automation Toolchain.The various configuration files, (including the above sample dashboards) used in the demo are available on the GitHub solution repository Enjoy!3.5KViews0likes0CommentsAdopting SRE practices with F5: Observability and beyond with ELK Stack
This article is a joint collaboration between Eric Ji and JC Kwon. Getting started In the previous article, we explained SRE (Site Reliability Engineering) and how F5 helps SRE deploy and secure modern applications. We already talked observability is essential for SRE to implement SLOs. Meanwhile, we havea wide range of monitoring tools and analytic applications, each assigned to special devices or runningonly for certain applications. In this article, we will explore one of the most commonly utilized logging tools, or the ELK stack. The ELK stack is a collection of three open-source projects, namely Elasticsearch, Logstash, and Kibana. It provides IT project stakeholders the capabilities of multi-system and multi-application log aggregation and analysis.Besides, the ELK stack provides data visualization at stakeholders' fingertips, which is essential for security analytics, system monitoring, and troubleshooting. A brief description of the three projects: Elasticsearch is an open-source, full-text analysis, and search engine. Logstash is a log aggregator that executes transformations on data derived from various input sources, before transferring it to output destinations. Kibana provides data analysis and visualization capabilities for end-users, complementary to Elasticsearch. In this article, the ELK is utilized to analyze and visualize application performance through a centralized dashboard. A dashboard enables end-users to easily correlate North-South traffic with East-West traffic, for end-to-end performance visibility. Overview This use case is built on top of targeted canary deployment. As shown in the diagram below, we are taking advantage of the iRule on BIG-IP, generated a UUID is and inserted it into the HTTP header for every HTTP request packet arriving at BIG-IP. All traffic access logs will contain the UUIDs when they are sent to the ELK server, for validation of information, like user location, the response time by user location, response time of BIG-IP and NGINX plus, etc. Setup and Configuration 1.Create HSL pool, iRule on BIG-IP First, we created a High-Speed Logging (HSL) pool on BIG-IP, to be used by the ELK Stack. The HSL pool is assigned to the sampleapplication. This pool member will be used by iRule to send access logs from BIG-IP to the ELK server. The ELK server is listening for incoming log analysis requests Below is the iRule that we created. when CLIENT_ACCEPTED { set timestamp [clock format [clock seconds] -format "%d/%h/%y:%T %Z" ] } when HTTP_REQUEST { # UUID injection if { [HTTP::cookie x-request-id] == "" } { append s [clock seconds] [IP::local_addr] [IP::client_addr] [expr { int(100000000 * rand()) }] [clock clicks] set s [md5 $s] binary scan $s c* s lset s 8 [expr {([lindex $s 8] & 0x7F) | 0x40}] lset s 6 [expr {([lindex $s 6] & 0x0F) | 0x40}] set s [binary format c* $s] binary scan $s H* s set myuuid $s unset s set inject_uuid_cookie 1 } else { set myuuid [HTTP::cookie x-request-id] set inject_uuid_cookie 0 } set xff_ip "[expr int(rand()*100)].[expr int(rand()*100)].[expr int(rand()*100)].[expr int(rand()*100)]" set hsl [HSL::open -proto UDP -pool pool_elk] set http_request "\"[HTTP::method] [HTTP::uri] HTTP/[HTTP::version]\"" set http_request_time [clock clicks -milliseconds] set http_user_agent "\"[HTTP::header User-Agent]]\"" set http_host [HTTP::host] set http_username [HTTP::username] set client_ip [IP::remote_addr] set client_port [TCP::remote_port] set http_request_uri [HTTP::uri] set http_method [HTTP::method] set referer "\"[HTTP::header value referer]\"" if { [HTTP::uri] contains "test" } { HTTP::header insert "x-request-id" "test-$myuuid" } else { HTTP::header insert "x-request-id" $myuuid } HTTP::header insert "X-Forwarded-For" $xff_ip } when HTTP_RESPONSE { set syslogtime [clock format [clock seconds] -format "%h %e %H:%M:%S"] set response_time [expr {double([clock clicks -milliseconds] - $http_request_time)/1000}] set virtual [virtual] set content_length 0 if { [HTTP::header exists "Content-Length"] } { set content_length \"[HTTP::header "Content-Length"]\" } else { set content_length \"-\" } set lb_server "[LB::server addr]:[LB::server port]" if { [string compare "$lb_server" ""] == 0 } { set lb_server "" } set status_code [HTTP::status] set content_type \"[HTTP::header "Content-type"]\" # construct log for elk, local6.info <182> set log_msg "<182>$syslogtime f5adc tmos: " #set log_msg "" append log_msg "time=\[$timestamp\] " append log_msg "client_ip=$client_ip " append log_msg "virtual=$virtual " append log_msg "client_port=$client_port " append log_msg "xff_ip=$xff_ip " append log_msg "lb_server=$lb_server " append log_msg "http_host=$http_host " append log_msg "http_method=$http_method " append log_msg "http_request_uri=$http_request_uri " append log_msg "status_code=$status_code " append log_msg "content_type=$content_type " append log_msg "content_length=$content_length " append log_msg "response_time=$response_time " append log_msg "referer=$referer " append log_msg "http_user_agent=$http_user_agent " append log_msg "x-request-id=$myuuid " if { $inject_uuid_cookie == 1} { HTTP::cookie insert name x-request-id value $myuuid path "/" set inject_uuid_cookie 0 } # log local2. sending log to elk via log publisher #log local2. $log_msg HSL::send $hsl $log_msg } Next, we added a new VIP for theHSL pool which was created earlier, and applied iRule for this VIP. Then all access logs containing the respective UUID for the HTTP datagram will be sent to the ELK server. Now, the ELK server is ready for the analysis of the BIG-IP access logs. 2.Configure NGINX plus Logging We configure logging for each NGINXplus deployed inside the OpenShift cluster through the respective configmap objects. Here is one example: 3.Customize Kibana Dashboard With all configurations in place, log information will be processed by the ELK server. We will be able to customize a dashboard containing useful, visualized data, like user location, response time by location, etc. When an end-user accesses the service, the VIP will be responded and iRule will apply. Next, the user’s HTTP header information will be checked by iRule, and logs are forwarded to the ELK server for analysis. As the user is accessing the app services, the app server’s logs are also forwarded to the ELK server based on the NGINX plus configmap setting. The list of key indicators available on the Kibana dashboard page is rather long, so we won't describe all of them here. You can check detail here 4.ELK Dashboard Samples We can easily customize the data for visualization in the centralized display, and the following is just a list of dashboard samples. We can look at user location counts and response time by user location: We can check theaverage response time and max response time for each endpoint: We can seethe correlation between BIG-IP (N-S traffic) and NGINX plus endpoint(E-W traffic): We can also checkthe response time for N-S traffic: Summary In this article, we showed how the ELK stack joined forces with F5 BIG-IP and NGINX plus to provide an observability solution for visualizing application performance with a centralized dashboard. Withcombined performance metrics, it opens up a lot of possibilities for SRE's to implement SLOs practically. F5 aims to provide the best solution to support your business success and continue with more use cases. If you want to learn more about this and other SRE use cases, please visit the F5 DevCentral GitHub link here.999Views1like0Comments