What is BIG-IQ?
tl;dr - BIG-IQ centralizes management, licensing, monitoring, and analytics for your dispersed BIG-IP infrastructure.

If you have more than a few F5 BIG-IPs within your organization, managing devices as separate entities will become an administrative bottleneck and slow application deployments. When deploying cloud applications, you're potentially managing thousands of systems, and having to deal with traditionally monolithic administrative functions is a simple no-go. Enter BIG-IQ.

BIG-IQ enables administrators to centrally manage BIG-IP infrastructure across the IT landscape. BIG-IQ discovers, tracks, manages, and monitors physical and virtual BIG-IP devices - in the cloud, on premises, or co-located at your preferred datacenter. BIG-IQ is a standalone product available from F5 partners, or through the AWS Marketplace. BIG-IQ consolidates common management requirements including, but not limited to:

Device discovery and monitoring: Discover, track, and monitor BIG-IP devices, including key metrics such as CPU/memory, disk usage, and availability status.
Centralized Software Upgrades: Centrally manage BIG-IP upgrades (TMOS v10.20 and up) by uploading the release images to BIG-IQ and orchestrating the process for managed BIG-IPs.
License Management: Manage BIG-IP virtual edition licenses, granting and revoking as you spin up/down resources. You can create license pools for applications or tenants for provisioning.
BIG-IP Configuration Backup/Restore: Use BIG-IQ as a central repository of BIG-IP config files through ad-hoc or scheduled processes. Archive configs to long-term storage via automated SFTP/SCP.
BIG-IP Device Cluster Support: Monitor high availability statuses and BIG-IP Device clusters.
Integration to F5 iHealth Support Features: Upload and read detailed health reports of the BIG-IPs under management.
Change Management: Evaluate, stage, and deploy configuration changes to BIG-IP. Create snapshots and config restore points, and audit historical changes so you know who to blame. 😉
Certificate Management: Deploy, renew, or change SSL certs. Alerts allow you to plan ahead before certificates expire.
Role-Based Access Control (RBAC): BIG-IQ controls access to its managed services with role-based access controls (RBAC). You can create granular controls to view, edit, and deploy provisioned services. Prebuilt roles within BIG-IQ easily allow multiple IT disciplines access to the areas of expertise they need without over-provisioning permissions.

Fig. 1 BIG-IQ 5.2 - Device Health Management

BIG-IQ centralizes statistics and analytics visibility, extending BIG-IP's AVR engine. BIG-IQ collects and aggregates statistics from BIG-IP devices, locally and in the cloud. View metrics such as transactions per second, client latency, and response throughput. You can create RBAC roles so security teams have private access to view DDoS attack mitigations, firewall rules triggered, or WebSafe and MobileSafe management dashboards. The reporting extends across all modules BIG-IQ manages, providing the single-pane-of-glass view we all appreciate from management applications.

For further reading on BIG-IQ please check out the following links:
BIG-IQ Centralized Management @ F5.com
Getting Started with BIG-IQ @ F5 University
DevCentral BIG-IQ
BIG-IQ @ Amazon Marketplace

L3/4/DNS DDoS Reporting with Elastic Search and Kibana
Dear Reader,

In this article, in collaboration with my colleague Mohamed Shaath, I would like to show you how to use the DDoS reporting and visibility dashboards that we have created based on an ELK (Elasticsearch, Logstash, and Kibana) stack. The goal is to give you templates based on open-source software to address typical questions DDoS operators have and need to answer when an incident happens. Another component we added is the visualization of incoming packets, dropped packets, and detection and mitigation thresholds per attack vector. The idea here is to give you insight into auto-calculated thresholds compared to incoming rates. It will also give you the possibility to see anomalies in traffic behavior. Hopefully, the visualization will also help you with fine-tuning the DoS vector configuration (a typical example of this is the floor value of a vector).

This article will give you an introduction to some of the graphs we provide together with the templates. Feel free to arrange or modify them in the way you need when you use the solution. We are also very happy to get your feedback, so we can optimize the dashboards and graphs in a way that is most useful for DDoS operators.

Fundamental understanding of log events

All DDoS configurations rely basically on two thresholds, regardless of the chosen threshold mode (manual, fully automatic, multiplier, …): Detection and Mitigation.

Figure 1: Detection and Mitigation rate

"Detection" means: inform the DDoS operator that the incoming rate is above the configured (or auto-calculated, history-based) rate. Do not block traffic, just send out specific log information. The "detection" value is usually set or calculated to a rate that is just within the expected "normal" rate. That also means everything above that value is not "normal" and therefore suspicious, but not necessarily an attack. But the DDoS operator should be aware of that event. This is exactly what happens when a packet rate crosses the detection rate: BIG-IP will send out log messages to the log server (when configured). Within the ELK solution we are introducing, we use the "Splunk" logging format, which sends the information in key/value format. That makes understanding the fields much easier. Here is an example of a log message, which is sent out when the packet rate has crossed the detection threshold:

Jun 17 23:08:46 172.30.107.11 action="Allow",hostname="lon-i5800-1.pme.itc.f5net.com",bigip_mgmt_ip="172.30.107.11",context_name="/Common/www_10_103_2_80_80",date_time="Jun 17 2021 22:58:12",dest_ip="10.103.2.80",dest_port="80",device_product="DDoS Hybrid Defender",device_vendor="F5",device_version="15.1.2.1.0.317.10",dos_attack_event="Attack Sampled",dos_attack_id="550542726",dos_attack_name="TCP Push Flood",dos_packets_dropped="0",dos_packets_received="117",errdefs_msgno="23003138",errdefs_msg_name="Network DoS Event",flow_id="0000000000000000",severity="4",dos_mode="Enforced",dos_src="Volumetric, Per-SrcIP, VS-specific attack, metric:PPS",partition_name="Common",route_domain="0",source_ip="10.103.6.10",source_port="39219",vlan="/Common/vlan3006_client"

Explanation of the message content: Action = "Allow" indicates that BIG-IP is not dropping packets (from the DoS point of view); it is just giving the operator the information that within the last second the protected context (here: /Common/www_10_103_2_80_80) has received 117 (dos_packets_received) push packets (dos_attack_name) from source IP 10.103.6.10 (source_ip).
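Because the "Splunk" format is plain key/value pairs, it is also easy to pick apart outside of the ELK stack. The following is a minimal, hypothetical Python sketch (not part of the dashboard templates) that pulls the pairs from a line like the one above into a dictionary; the sample line is truncated for readability.

import re

# Truncated copy of the detection log line shown above
line = ('action="Allow",hostname="lon-i5800-1.pme.itc.f5net.com",'
        'context_name="/Common/www_10_103_2_80_80",dos_attack_name="TCP Push Flood",'
        'dos_packets_dropped="0",dos_packets_received="117",source_ip="10.103.6.10"')

# Every field follows the same key="value" pattern, so one regex pass is enough
fields = dict(re.findall(r'(\w+)="(.*?)"', line))
print(fields["dos_attack_name"], fields["dos_packets_received"])   # TCP Push Flood 117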
Btw., because this is a "Volumetric, Per-SrcIP, VS-specific attack" (dos_src) log message, it also tells you that the source IP has been identified as a bad actor (also see my article: Increasing accuracy using Bad Actor and Attacked Destination). Therefore, this event was triggered by the Bad Actor configuration of the TCP Push Flood vector.

Mitigation threshold

Once the incoming packet rate has crossed the mitigation threshold of a DoS vector or an attack signature, BIG-IP starts to drop (rate-limit) traffic above that value. This is when we declare being under a DDoS attack, because the protected context (server, service, network, BIG-IP, etc.) will be negatively affected by this high number of packets per second. Now the BIG-IP DoS device (AFM/DHD) needs to lower the number of packets hitting the affected context, and that's why it starts to drop packets on the identified vector. Again, this mitigation threshold can be set manually or auto-calculated based on history or a multiplication of the detection threshold (explanation of the F5 DDoS threshold modes).

Here is an example of a drop log message:

Jun 17 23:05:03 172.30.107.11 action="Drop",hostname="lon-i5800-1.pme.itc.f5net.com",bigip_mgmt_ip="172.30.107.11",context_name="Device",date_time="Jun 17 2021 22:54:29",dest_ip="10.103.2.80",dest_port="0",device_product="DDoS Hybrid Defender",device_vendor="F5",device_version="15.1.2.1.0.317.10",dos_attack_event="Attack Sampled",dos_attack_id="3221546531",dos_attack_name="Bad TCP flags (all cleared)",dos_packets_dropped="152224",dos_packets_received="152224",errdefs_msgno="23003138",errdefs_msg_name="Network DoS Event",flow_id="0000000000000000",severity="4",dos_mode="Enforced",dos_src="Volumetric, Aggregated across all SrcIP's, Device-Wide attack, metric:PPS",partition_name="Common",route_domain="0",source_ip="10.103.6.10",source_port="12826",vlan="/Common/vlan3006_client"

In this example, the message is an aggregation of all source IPs (dos_src="Volumetric, Aggregated across all SrcIP's, Device-Wide attack") of the dropped packets (dos_packets_dropped="152224") during the last second. Therefore, the source IP (source_ip="10.103.6.10") is just a representative for all source IPs with dropped packets within the last second. This is because there was no "bad actor" identified, which is usually the case when the bad actor functionality is not configured, or when every packet has a different source IP.

Structure of the dashboards

These two main logging events (allow and drop) are what we have adapted to the visualization of the DDoS dashboard. The DDoS operator needs to know when there is an anomaly in the network and which vectors are triggered by the anomaly. The operator also needs to know what destinations are involved and which sources cause the anomaly. But it is also important to know when the network is under attack and, again, what mitigation has taken place on which destinations and sources, how many packets have been dropped, etc.

When you open the "DDoS Dashboard" and choose the "Overview Dashboard" you will notice that the dashboard is divided into two halves. On the left side, you get the information when a DDoS device has dropped packets, and on the right side, you get the information about "suspicious" packets, which means traffic that was above a detection threshold without being dropped (action "Allow").

Figure 2: Structure of the dashboard

Within this dashboard, you will also find graphs or tables which do not split the dashboard into two sides.
Here you find combined information from both events/areas (mitigation and suspicious).

Explanation of some dashboards

In the menu section Home/Analytics/Dashboard you will find all dashboards we created.

Figure 3: Dashboard menu

Let's briefly explain what the main dashboards are for.

Figure 4: Dashboard overview

DDoS_Dashboard is the board where you can see all events during a chosen timeframe, which you can select in the upper right corner within that dashboard.

Figure 5: Period of time selection

On the top of the page you find the Dashboard Explorer. From here you can easily navigate between all the relevant dashboards without going through the Analytics section of the main menu.

Figure 6: Dashboard Explorer

DDOS STATS Dashboard: shows details of the rates and thresholds (packet rate, detection and mitigation thresholds, drop rate) for all vectors, including bad actor and attacked destination thresholds. Here you need to select the relevant vector and context to see the details.
DDOS Network Vectors: shows details of the incoming rate and drop rate per network vector on a one-pager.
DDOS DNS Vectors: shows details of the incoming rate and drop rate per DNS vector on a one-pager.
DDOS Bad Header Vectors: shows details of the incoming rate and drop rate per bad header vector on one page.
DDOS SIP vector: shows details of the incoming rate and drop rate per SIP vector on a one-pager.

Please note that all "stats" dashboards are based on the "dos_stats" table, which you need to send to your server; it is not done via the DoS logs. On the GitHub page you will find instructions on how to do it.

Next, you see the Stats Control Panel.

Figure 7: Stats Control Panel

By default, it will show the events (drop/allow) for all vectors in all contexts (VS/PO, Device) on all DoS devices. But by using the drop-down menus you can filter on specific data. All filters you set can also easily be saved and used again. Kibana gives a lot of flexibility.

Next, you get to the Top Attacks Timeline, which shows you the top 10 attack vectors that have dropped packets.

Figure 8: Attack Timeline

When you mouse over, you get the number of dropped packets for that vector. To the right of this graph you see the Attack Event Details.

Figure 9: Attack Event Details

This simply shows you how many logs you have received per log event. Remember, every mechanism (for example per-source event, per-destination, aggregated, …) has its own logs.

The next row shows on the left side how many packets have been dropped during the chosen time frame.

Figure 10: Dropped vs. suspicious packets

On the right side you see how many packets have been identified as suspicious because the rate was above the detection threshold, but not above the mitigation threshold. This event message has the action "Allow". In the middle graph you see the relation of suspicious packets vs. dropped packets vs. incoming packets (incoming packets being the sum of dropped and suspicious packets).

The next graph also gives you an overview of received packets vs. dropped packets.

Figure 11: Incoming vs. dropped packets

But here the data comes from the dos_stats table, so again it is only visible when you send that information; keep in mind this is not done via the log messages. This is the part where you send the output of the "tmctl -c dos_stat" command to your log device. If you are not doing that, then you can remove this graph from the dashboard.
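As a purely illustrative sketch of that idea (the crontab one-liner later in this article is the approach we actually document), the following hypothetical Python snippet reads a few of the dos_stat columns with tmctl and forwards each row to the ELK listener over UDP. The collector address/port and the availability of Python and tmctl on the BIG-IP are assumptions; adjust everything to your environment.

import socket
import subprocess

COLLECTOR = ("1.1.1.1", 5558)   # placeholder address/port of the ELK UDP input
COLUMNS = ("context_name,vector_name,stats_rate,drops_rate,"
           "detection,mitigation_low,mitigation_high")

# Snapshot the dos_stat table, one CSV row per vector/context
output = subprocess.check_output(["tmctl", "-c", "dos_stat", "-s", COLUMNS])
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

for row in output.decode().splitlines():
    row = row.strip()
    if not row or row.startswith("context_name"):   # skip header and blank lines
        continue
    sock.sendto(row.encode(), COLLECTOR)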
The main difference from the middle graph above is that you will see data even when there is no event (allow/drop), because you get a snapshot of the data at whatever frequency you send the "dos_stat" table. Graphs based on log events, of course, can only appear when there is an event and logs are sent. This graph shows all incoming packets counted by all enabled vectors, regardless of whether they are counted on bad actors, attacked destinations, or the global stats per vector. Same for the dropped packets. It gives an overall overview of incoming packets vs. dropped packets. To get more details on which vector or mechanism (BA, AD) did the mitigation, you need to go to the DDOS STATS Dashboard.

An important piece of information for a DDoS operator is to know which services (IPs) are under attack and which contexts or protected objects have been involved.

Figure 12: Target information

Of course, also which vectors are used by the attacker. This is what is shown in the next row. On the left two graphs you get this information for dropped packets. On the right two graphs you see it for packets above the detection threshold but below the mitigation threshold.

Attacked IP and Destination Port shows you the attacked IPs, including the destination ports.
Attacked Protected Objects shows you the context (VS/PO, Device, Global) in relation to the attack vectors.

Context "Global" is used for IPI (IP-Intelligence). In this example packets got dropped because source IPs were configured within the IPI policy "my_IPI" with the category "denial of service". The mitigation was executed on the global level. IPI activities are shown as attack vectors.

Figure 13: IP-Intelligence information

When you mouse over, you can get the full line. You will find more details on the attack vectors and IPI activity lower on the page.

Attacked destination details

In the next row, you find a table with information on IP addresses that have been identified as being attacked by "Attacked Destination Detection" configured on a vector.

Figure 14: Attacked Destination Details

Figure 15: Vector configuration

What are the sources of an attack?

The next graph gives you the information on the identified attackers. "Top Attacker IPs" shows you the top 10 attackers based on aggregated logs. When you have configured "Bad Actor Detection" then you will also get the information for the top 10 "bad actor" IPs. Identified "bad actor" IPs are certainly important information you want to keep an eye on.

Figure 16: Source address information

Bad Actor Details

To get more information on "bad actors", you can use the "Bad Actor Details" table, which will show you relevant information. Here is an example:

Figure 17: Bad Actor details

You can see that the UDP flood vector identified a flood for the bad actor IP "4.4.4.4" at 11:02 on the Device level. Most of the packets had been dropped (PPS vs. Dropped Packets). Within the following multiple 30-second intervals, you again get details for that bad actor IP. But at 11:06 you can see that the IP address got programmed into the "denial of service" category, and after that all traffic coming from that IP got dropped via the IP-Intelligence policy "my_IPI" on the "Global" level/context.

BDoS Details

The dashboard will also give you information about BDoS signatures and their events.

Figure 18: BDoS details

In this example, you can see that the system generated (Signature Add) a BDoS signature at 11:23:30. Then this signature was used (Re-USED) for mitigation (Drop).
Keep in mind you will only see the details of a signature when it gets created. If a signature is re-used and you want to see the details of a signature which may have been created days or weeks before, then you need to filter for that signature within the timeframe it got created.

Another view on attacked IPs

The dashboards also give you another, comprehensive view on attacked and targeted (action allow) IP addresses. Here it is probably best to start mousing over from the inner circle going outward, and you will get information per attacked context.

Figure 19: Combined view on sources, destinations and vectors

Details about DNS attacks

Within the DNS section, you get details about DNS-related attacks.

Figure 20: DNS attack overview per vector

Figure 21: Detailed DNS attack overview

Also, a different view on Bad Actor activities:

Figure 22: Bad Actor / attack vector / destination overview

Since we hope the graphs are mostly self-explanatory, we don't want to go through all of them. We also plan to add more or modify them based on your feedback. Now it's time to talk about another component, which we have already touched on multiple times within this article.

Attack vector visualization

A second component we have created is the visualization of the stats (incoming, detection, mitigation, etc.) per attack vector. This is an optional part and is not related to the DoS logging. It is based on the "dos_stat" table and gives a snapshot of the statistics at the interval you have configured for sending the data from BIG-IP into your ELK stack. In my article "Demonstration of Device DoS and Per-Service DoS protection," I already introduced you to the "dos_stat" table, when I used it within my "show_DoS_stats_script".

Figure 23: DDoS stat table

This script shows you the stats for all vectors, their thresholds, etc. By sending this data frequently into your ELK stack, you can visualize the data and get graphs for it. You can then easily see trends or anomalies within a defined time frame. You can also easily see what thresholds (detection/mitigation) the system has calculated.

Figure 24: Activity (detection/mitigation) graph per vector

In this example you can see what the system has done during an attack. The green line shows the incoming packet rate for that vector. The yellow line shows the expected auto-calculated rate (detection rate). The blue line is the auto-calculated mitigation rate, which at the beginning of this graph is very high because the protected context has no stress. Then we can see that the packet rate increases massively and crosses the detection rate. This is when the DDoS operator needs to be informed, because this rate is not "normal" (based on history) and therefore suspicious. This high packet rate has an impact on the stress of the protected context, and the mitigation rate gets adjusted below the incoming rate. At that point, the system starts to defend and mitigate. But the incoming packet rate goes down again for a short time. Here the mitigation stops because the rate is below the mitigation threshold, which also gets increased again because there is no more stress on the protected context. Then the flood happens again, the mitigation threshold gets adjusted, and mitigation starts. Later we can see the incoming rate sometimes climbs above the detection threshold but is not strong enough to affect the health of the protected context; therefore, no mitigation takes place. At around 11:53 we can see the flood increases again and triggers the mitigation.
Please keep in mind that the granularity of this graph depends, of course, on the frequency at which you send the data into the ELK stack, and the data is always a snapshot of the current stats.

How to configure logging on BIG-IP

tmsh create ltm pool pool_log_server members add { 1.1.1.1:5558 }
tmsh create sys log-config destination remote-high-speed-log HSL_LOG_DEST { pool-name pool_log_server protocol udp }
tmsh create sys log-config destination splunk SPLUNK_LOG_DEST forward-to HSL_LOG_DEST
tmsh create sys log-config publisher KIBANA_LOG_PUBLISHER destinations add { SPLUNK_LOG_DEST }
tmsh create security log profile LOG_PROFILE dos-network-publisher KIBANA_LOG_PUBLISHER protocol-dns-dos-publisher KIBANA_LOG_PUBLISHER protocol-sip-dos-publisher KIBANA_LOG_PUBLISHER ip-intelligence { log-translation-fields enabled log-publisher KIBANA_LOG_PUBLISHER } traffic-statistics { syncookies enabled log-publisher KIBANA_LOG_PUBLISHER }
tmsh modify security log profile global-network dos-network-publisher KIBANA_LOG_PUBLISHER ip-intelligence { log-geo enabled log-rtbh enabled log-scrubber enabled log-shun enabled log-translation-fields enabled log-publisher KIBANA_LOG_PUBLISHER } protocol-dns-dos-publisher KIBANA_LOG_PUBLISHER protocol-sip-dos-publisher KIBANA_LOG_PUBLISHER traffic-statistics { log-publisher KIBANA_LOG_PUBLISHER syncookies enabled }
tmsh modify security dos device-config dos-device-config log-publisher KIBANA_LOG_PUBLISHER

Figure 25: Overview of logging configuration

How to send the dos_stats table data

Modify the crontab on BIG-IP (crontab -e) and add:

* * * * * nb_of_tmms=$(tmsh show sys tmm-info | grep Sys::TMM | wc -l);tmctl -c dos_stat -s context_name,vector_name,attack_detected,stats_rate,drops_rate,int_drops_rate,ba_stats_rate,ba_drops_rate,bd_stats_rate,bd_drops_rate,detection,mitigation_low,mitigation_high,detection_ba,mitigation_ba_low,mitigation_ba_high,detection_bd,mitigation_bd_low,mitigation_bd_high | grep -v "context_name" | sed '/^$/d' | sed "s/$/,$nb_of_tmms/g" | logger -n 1.1.1.1 --udp --port 5558

Modify the IP and port as appropriate. A better approach than using the crontab is to use an external monitor: https://support.f5.com/csp/article/K71282813

Anyhow, keep in mind that more frequent logging generates more data on your logging device!

Conclusion

The DDoS dashboards based on an ELK stack give DDoS operators visibility into their DDoS events. The dashboards consume logs sent by BIG-IP based on L3/4/DNS DDoS events and visualize them in graphs. These graphs provide relevant information on what kinds of attacks from which sources are going to which destinations. Based on your BIG-IP DoS config you get "bad actor" details or "attacked destinations" details listed. You will also see IPs that have been blocked by certain IPI categories, and more. In addition to other information shown, the ELK stack is also able to consume data from the dos_stats table, which gives you details about your network behavior on a vector level. Further, you can see how "auto thresholds" calculate detection and mitigation thresholds. We hope that this article gives you an introduction to the DDoS ELK dashboards. We also plan to publish another article explaining the underlying architecture.

Sven Mueller & Mohamed Shaath

BIG-IP Logging and Reporting Toolkit - part one
Joe Malek, one of the many awesome engineers here at F5, took it upon himself to delve deeply into a very interesting but often unsung part of the BIG-IP advanced configuration world: logging and reporting. It's my great pleasure to get to share with you his awesome study and the findings therein, along with (eventually) a toolkit to help you get started in the world of custom log manipulation. If you've ever questioned or been curious about your options when it comes to information gathering and reporting, this is definitely something you should read. There will be multiple parts, so stay tuned. This one is just the intro.

Logging & Reporting Toolkit - Part 1
Logging & Reporting Toolkit - Part 2
Logging & Reporting Toolkit - Part 3
Logging & Reporting Toolkit - Part 4

Description

F5 products occupy critical positions in application delivery infrastructure. They serve as gateways, proxies, accelerators and traffic flow arbiters. In these roles customer expectations vary for the degree and amount of event information recorded. Several opportunities exist within our current product capabilities for our customers and partners to produce and consume log messages from and via F5 products. Efforts to date include generating W3C style log messages on LTM via iRules, close integration with leading vendors and ASM (requires askf5 login), and creating relationships with leading vendors to best serve our customers. Significant capabilities exist for customers and partners to create their own logging and reporting solutions.

Problems and opportunity

In the many products offered by F5, there exists a variety of logging structures. The common log protocols used to emit messages by F5 products are Syslog (requires askf5 login) and SNMP (requires askf5 login), along with built-in iRules capabilities. Though syslog-ng is commonplace, software components tend to vary in transport, verbosity, message formatting and sometimes syslog facility. This can result in a high degree of data density in our logs, and messages our systems emit can vary from version to version.[i] The combination of these factors results in a challenge that requires a coordinated solution for customers who are compelled by regulation, industry practice, or by business process, to maintain log management infrastructure that consumes messages from F5 devices.[ii] By utilizing the unique product architecture TMOS employs, sharing its knowledge about networks and applications, as well as capabilities built into iRules, TMOS can provide much of this information to log management infrastructure in a simple and knowledgeable manner. In effect, we can emit messages about appliance state and offload many message logging tasks from application servers. Based on our connection knowledge we can also improve the utility and value of information obtained from vendor provided log management infrastructure.[iii]

Objectives and success criteria

The success criteria for including an item in the toolkit are:

1. A capability to deliver reports on select items using the leading platforms without requiring core development work on an F5 product.
2. An identified extensibility capability for future customization and report building.
Assumptions and dependencies

Vendors to include in the toolkit are Splunk, Q1 Labs and PresiNET.
ASM logging and reporting is sufficient and does not need further explanation.
Information to be included in sample reports should begin to assist in diagnostic activities, demonstrate ROI by including ROI in an infrastructure, and advise on when F5 devices are nearing capacity.
Vendor products must be able to accept event data emitted by F5 products. This means that some vendors might have more comprehensive support than others.
Products currently supported but not in active development are not eligible for inclusion in the toolkit. Examples are older versions of BIG-IP and FirePass, and all WANJet releases.
Some vendor products will require code modifications on the vendor's side to understand the data F5 products send them.

[i] As a piece of customer evidence, Microsoft implemented several logging practices around version 9.1. When they upgraded to version 9.4 their log volume increased several-fold because F5 added log messages and changed existing messages. As a result the existing message taxonomy needed to be deprecated, and we caused them to need to redesign filters and reports and create a new set of logging practices.

[ii] Regulations such as the Sarbanes-Oxley Act, Gramm-Leach-Bliley Act, Federal Information Security Management Act, PCI DSS, and HIPAA.

[iii] It is common for F5 products to manipulate connections via OneConnect, NATs and SNATs. These operations are unknown to external log collectors, and pose a challenge when assembling a complete view of the network connections between a client and a server via an F5 device for a single application transaction.

What's Next?

In the next installment we'll get into the details of the different vendors in question, their offerings, how they work and integrate with BIG-IP, and more.

Logging and Reporting Toolkit Series: Part Two | Part Three

Custom Reporting with iRules
In BIG-IP version 9.2, a new profile type was added that is very powerful but isn't well understood. The Statistics Profile enables the creation of custom statistics. What makes it interesting is that it's accessible from within iRules. What does this mean to you? Well, for starters, you can extract information from within a connection and store that data for later retrieval.

The Scenario:

Here's a typical situation: You want your web application to provide the best user experience possible. How do you measure that? Well, there are complex monitoring systems to simulate user actions, the time it takes for pages to refresh, the number of HTTP errors returned, etc. But let's say your needs aren't that great. You just want to know what the error rate is and how fast pages are getting served to the client. And to top that off, you'll want a way to control and view those statistics. Enter iRules and the Statistics Profile...

The setup:

This is assuming you already have a virtual server fronting your application and that you are running BIG-IP v9.2 or greater.

Create the Statistics Profile

The heart of this solution is the statistics profile. A statistics profile is a set of name=value pairs where the names are strings and the values are numbers.

1. Login to the BIG-IP Administrative GUI.
2. Select the Profiles option under Local Traffic and Virtual Servers.
3. Select Statistics from the Other menu.
4. Click the Create button.
5. Enter "user_experience" for the Profile name.
6. Add the following fields: count_20x, count_30x, count_40x, count_50x, num_requests, and total_time.
7. Click Update to create the profile.

Create the iRule

Now that the virtual server is setup with the statistics profile, you'll need to create an iRule to update the profile with the statistics.

1. Select Rules from the Local Traffic/Virtual Servers menu.
2. Click the Create button.
3. Enter "user_experience" for the iRule name and enter the iRule from below into the Definition text box.
4. Click Finished to save the iRule.

Apply the statistics profile to your virtual server

The statistics profile doesn't do much until you apply it to a virtual server's properties. Here's how:

1. Select Virtual Servers from the Local Traffic menu.
2. Click on your Virtual Server to enter its properties.
3. Make sure the Advanced Configuration option is selected, and scroll down to the Statistics Profile option.
4. Select the previously created profile user_experience.
5. Click the Update button.
6. Select the Resources menu.
7. Click the Manage Rules button.
8. Make sure the user_experience iRule is in the Enabled list box and click the Finished button.

The Fun

So now that the grunt work is done, here's the fun stuff - the iRule.

when RULE_INIT {
  # store the profile name in a variable for easy modification later.
  set ::PROFILE_NAME "user_experience"
}

when HTTP_REQUEST {
  # store the number of milliseconds since the epoch (you'll find out why later)
  set t0 [clock clicks -milliseconds]

  # add some secret control functions to manipulate the statistics
  switch [string tolower [HTTP::uri]] {
    "/getuserstats" {
      # Avoid divide by zero errors
      set time_per_req 0
      set total_time [STATS::get $::PROFILE_NAME total_time]
      set num_requests [STATS::get $::PROFILE_NAME num_requests]
      if { $num_requests > 0 } {
        set time_per_req [expr $total_time / $num_requests]
      }

      # Hand roll a HTTP response to serve up the statistics report
      HTTP::respond 200 content "<html>
        <head><center><title>HTTP Status Code Report</title>
        <style>body {font-family: Tahoma} td {text-align: center} </style>
        </head><body>
        <table border='1' cellpadding='5' cellspacing='0'>
        <tr><th>HTTP<br/>Status Code</th><th>Response<br/>Count</th></tr>
        <tr><td>20x</td><td>[STATS::get $::PROFILE_NAME count_20x]</td></tr>
        <tr><td>30x</td><td>[STATS::get $::PROFILE_NAME count_30x]</td></tr>
        <tr><td>40x</td><td>[STATS::get $::PROFILE_NAME count_40x]</td></tr>
        <tr><td>50x</td><td>[STATS::get $::PROFILE_NAME count_50x]</td></tr>
        <tr><td>Num Requests</td><td>[STATS::get $::PROFILE_NAME num_requests]</td></tr>
        <tr><td>Total Time</td><td>[STATS::get $::PROFILE_NAME total_time] ms.</td></tr>
        <tr><td>Avg Time/Req</td><td>$time_per_req ms.</td></tr>
        </table></center></body></html>"
    }
    "/resetuserstats" {
      # Reset all the statistics values to zero
      STATS::set $::PROFILE_NAME "count_20x" 0
      STATS::set $::PROFILE_NAME "count_30x" 0
      STATS::set $::PROFILE_NAME "count_40x" 0
      STATS::set $::PROFILE_NAME "count_50x" 0
      STATS::set $::PROFILE_NAME "num_requests" 0
      STATS::set $::PROFILE_NAME "total_time" 0

      # Hand roll a HTTP response indicating reset status
      HTTP::respond 200 content "<html>
        <head><center><title>HTTP Status Code Control</title>
        <body><h1>Statistics Successfully reset</h1>
        </body></html>"
    }
  }
}

when HTTP_RESPONSE {
  # use the clock command to get the delta in milliseconds between
  # the request and the response. This doesn't give the true
  # time the client waits, but it is pretty close to the server
  # processing time.
  set t1 [clock clicks -milliseconds]
  set total_time [expr $t1 - $t0]

  # Increment the statistics profile values
  switch -glob [HTTP::status] {
    "20*" { STATS::incr $::PROFILE_NAME "count_20x" 1 }
    "30*" { STATS::incr $::PROFILE_NAME "count_30x" 1 }
    "40*" { STATS::incr $::PROFILE_NAME "count_40x" 1 }
    "50*" { STATS::incr $::PROFILE_NAME "count_50x" 1 }
  }
  STATS::incr $::PROFILE_NAME "num_requests" 1
  STATS::incr $::PROFILE_NAME "total_time" $total_time
}

Conclusion

In this example we store the high level HTTP::status categories as well as response time. By adding additional fields to the statistics profile, you can easily extend the functionality of this example. Oh, and if browsers aren't your preferred method for pulling the stats, know that they are all available via iControl as well, so you can pull them down with your own preferred environment (perl, PowerShell, .Net, Java, ...).
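If you do want to script against it without touching iControl, the two control URIs defined in the iRule above are enough. Here is a small, hypothetical Python sketch; the virtual server address is a placeholder, and the parsing assumes the two-column table layout the iRule emits.

import re
import urllib.request

BASE = "http://www.example.com"   # placeholder: the virtual server running the iRule

# Fetch the report served by the /getuserstats branch of the iRule
html = urllib.request.urlopen(BASE + "/getuserstats").read().decode()

# Pull the label/value pairs out of the two-column table rows
for label, value in re.findall(r"<td>(.+?)</td>\s*<td>(.+?)</td>", html):
    print(label, "=", value)

# Zero the counters again via the /resetuserstats branch
urllib.request.urlopen(BASE + "/resetuserstats")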
BIG-IP Logging and Reporting Toolkit – part four

So far we've covered the initial problem, the players involved and one in-depth analysis of one of the options (Splunk). Next let's dig into Q1 Labs' QRadar offering. The first thing you'll need to do, just like last time, is configure your BIG-IP to pass syslog traffic off the box. Here's a simple example of how you can get that done in your config file. These are the same as last time, so nothing shockingly new here, though this bit is important.

Logging & Reporting Toolkit - Part 1
Logging & Reporting Toolkit - Part 2
Logging & Reporting Toolkit - Part 3
Logging & Reporting Toolkit - Part 4

Bigip v9

syslog {
   remote server 10.10.200.31
}

Bigip v10

syslog {
   remote server {
      qradar {
         host 10.11.100.31
      }
   }
}

This will send all syslog messages from the BIG-IP to the QRadar system; both BIG-IP system messages and any messages from iRules. If you're interested in having iRules log to the QRadar system directly you can use the HSL statements or the log statements with a destination host defined. Ex) RULE_INIT has set ::QRadarHost "10.10.200.31" and then in the iRules event you're interested in you assemble $log_message and then send it to the log with log $::QRadarHost $log_message. A good practice would be to also record it locally on something like local0 in case the message doesn't make it to the QRadar system.

In my testing I used a single QRadar system running the log collector and event processor. If you're using a more sophisticated deployment you'll need to use the Deployment Manager to ensure that the QRadar log collectors are forwarding messages onto the Event Processor you're going to work with. My QRadar system was already setup to receive syslog messages on port 514, so there wasn't anything more to do to get messages flowing. The key to working with QRadar is defining regular expressions to extract the message data you're interested in – once you have that done, most things are done using the same process. In this section I'll walk through all the tasks needed to extract custom data through building a report for the w3c case. Then I'll show a summary using NEDS and dashboard data.
Here are my regexes for QRadar for w3c, NEDS and the dashboard data script:

message source | attribute name | regex | capture group | sample message
dashboard script | Compression Deflate uses | deflate\.out\.uses='(\d+)' | 1 | in dc post
dashboard script | Compression LZO uses | lzo\.out\.uses='(\d+)' | 1 | in dc post
dashboard script | Compression Null uses | null\.out\.uses='(\d+)' | 1 | in dc post
dashboard script | Dashboard-messageType | message_type='(.+?)' | 1 | in dc post
dashboard script | Dashboard-reportingSystem | HostName='(.+?)' | 1 | in dc post
dashboard script | Dashboard-routingEnabled | routing='(.+?)' | 1 | in dc post
NEDS iRule | NEDSv1-Flow-clientside-http | "(neds\.f5\.conn\.start\.v1)",(\"[\w\.resp\.v1]+\"\,)+\"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}\-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}@\d+\.\d+)\" | 1 | in NEDS Spec
NEDS iRule | NEDSv1-clientIPaddress | "(neds\.f5\.conn\.start\.v1","[\w\.]+)","(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) | 2 | in NEDS Spec
NEDS iRule | NEDSv1-clientPort | "(neds\.f5\.conn\.start\.v1","[\w\.]+)","(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:)(\d{1,5}) | 3 | in NEDS Spec
NEDS iRule | NEDSv1-clientCloseBytesIn | (neds\.f5\.conn\.end\.v1)","([\w\.]+)","(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}@\d+\.\d+)",(\d+.\d+),(\d+),(\d+),(\d+) | 7 | in NEDS Spec
NEDS iRule | NEDSv1-clientCloseBytesOut | (neds\.f5\.conn\.end\.v1)","([\w\.]+)","(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}@\d+\.\d+)",(\d+.\d+),(\d+),(\d+),(\d+),(\d+) | 8 | in NEDS Spec
NEDS iRule | NEDSv1-clientClosePktsIn | (neds\.f5\.conn\.end\.v1)","([\w\.]+)","(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}@\d+\.\d+)",(\d+.\d+),(\d+), | 5 | in NEDS Spec
NEDS iRule | NEDSv1-clientClosePktsOut | (neds\.f5\.conn\.end\.v1)","([\w\.]+)","(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}@\d+\.\d+)",(\d+.\d+),(\d+),(\d+) | 6 | in NEDS Spec
NEDS iRule | NEDSv1-clientCloseTimestamp | (neds\.f5\.conn\.end\.v1)","([\w\.]+)","(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5}@\d+\.\d+)",(\d+.\d+), | 4 | in NEDS Spec
NEDS iRule | NEDSv1-clientConnectionIngressVlan | (neds[\w\.]+start\.v1\",\"[\w\.]+",")(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}@\d+\.\d+)\",(\d+\.\d+)\,"(\w+)" | 4 | in NEDS Spec
NEDS iRule | NEDSv1-clientConnectionPolicyName | (neds[\w\.]+start\.v1\",\"[\w\.]+",")(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}@\d+\.\d+)\",(\d+\.\d+)\,"(\w+)"\,(\d+),(\d+),(\d+),\"([\w\.]+)\" | 8 | in NEDS Spec
NEDS iRule | NEDSv1-clientConnectionStartTimestamp | (neds[\w\.]+start\.v1\",\"[\w\.]+",")(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}@\d+\.\d+)\",(\d+\.\d+) | 3 | in NEDS Spec
NEDS iRule | NEDSv1-clientIPProtocol | (neds[\w\.]+start\.v1\",\"[\w\.]+",")(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}@\d+\.\d+)\",(\d+\.\d+)\,"(\w+)",(\d+), | 5 | in NEDS Spec
NEDS iRule | NEDSv1-httpRequestHost | (neds\.f5\.http\.req\.v1)",("[\w\.\"\:\-\@]+)","([\w\.\:\-\@]+)",(\d+\.\d+,\d+),"([\w\.\_\-]+) | 5 | in NEDS Spec
NEDS iRule | NEDSv1-httpRequestServerPort | (neds\.f5\.http\.resp\.v1)","([\w\.]+)","([\d\.:]+)-([\d\.]+):(\d{1,5}) | 5 | in NEDS Spec
NEDS iRule | NEDSv1-httpRequestTCPReplyNumber | (neds\.f5\.http\.req\.v1)",("[\w\.\"\:\-\@]+)","([\w\.\:\-\@]+)",(\d+\.\d+),(\d+) | 5 | in NEDS Spec
NEDS iRule | NEDSv1-httpRequestUserAgent | (neds\.f5\.http\.req\.v1)",("[\w\.\"\,\:\-\@]+)","([\w/\._\%\@]+)",("[\w\@\.]*?"),"([\w/\.\s(;\-\:\)]+) | 5 | in NEDS Spec
NEDS iRule | NEDSv1-httpResponseContentLength | (neds\.f5\.http\.resp\.v1)","([\w\."\,\:\-\@]+)","([\w\/\;\s\=\-]+)","(\d+) | 4 | in NEDS Spec
NEDS iRule | NEDSv1-httpResponseContentType | (neds\.f5\.http\.resp\.v1)","([\w\."\,\:\-\@]+)","([\w\/\;\s\=\-]+) | 3 | in NEDS Spec
NEDS iRule | NEDSv1-httpResponseLBTarget | (neds\.f5\.http\.resp\.v1)","([\w\.\,:\-@/;\s\="]+),"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:\d{1,5})" | 3 | in NEDS Spec
NEDS iRule | NEDSv1-reportingSystem | (\"neds.+[\w]\.v1\"),\"([\w.]+)\" | 2 | in NEDS Spec
NEDS iRule | NEDSv1-responseHTTPContentLength | (neds[\w\.]+resp\.v1\",\"[\w\.]+",")(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}@\d+\.\d+)\",(\d+\.\d+),(\d+),"(\d{3})","([\w\/]+)","(\d+)" | 7 | in NEDS Spec
NEDS iRule | NEDSv1-responseHTTPServerResponseCode | (neds[\w\.]+resp\.v1\",\"[\w\.]+",")(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}-\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}@\d+\.\d+)\",(\d+\.\d+),(\d+),"(\d{3})" | 5 | in NEDS Spec
w3c iRule | W3C Client IP address | client_ip=(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) | 1 | in dc post
w3c iRule | W3C Client Port | client_port=(\d{1,5}) | 1 | in dc post
w3c iRule | W3C Client username | username=([\w]+) | 1 | in dc post
w3c iRule | W3C Content Length | content_length=(\d+) | 1 | in dc post
w3c iRule | W3C HTTP Request | request="(.*)"\ss | 1 | in dc post
w3c iRule | W3C HTTP version | HTTP/(\d\.\d)" | 1 | in dc post
w3c iRule | W3C Host header | host=(.+?) | 1 | in dc post
w3c iRule | W3C Member server | lb_server=(\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}:\d{1,5}) | 1 | in dc post
w3c iRule | W3C Server Response Code | server_status=(\d{3}) | 1 | in dc post
w3c iRule | W3C User Agent | user_agent="(.*)" | 1 | in dc post
w3c iRule | W3C Virtual Server name | virtual=(.*?)\s | 1 | in dc post
w3c iRule | w3c Server Port | lb_server=\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}:(\d{1,5}) | 1 | in dc post
w3c iRule | w3c referer | referer=([\w\./\:\-]+) | 1 | in dc post
w3c iRule | w3c resp_time | resp_time=(\d+) | 1 | in dc post

W3C offload case

Now that BIG-IP is setup to send messages to QRadar, double check that you've installed the w3c-client-logging iRule on a VIP, and we'll see what it looks like when everything is put together. Login to your QRadar management console and navigate to the Events tab. You should see events streaming into QRadar if everything is configured correctly. If you can't find what you're looking for in the normalized events view, change it to the raw view – I also opted to have my console autorefresh every minute. The last message is the one I'm looking for. If you find a similar message and double click on it, we can start to extract data from the message and build up some searches and reports. I've not fully customized my QRadar deployment, so I'm ignoring the fact that the Log Source for the iRule messages has been identified by the system's FastIronDsm. After your screen is showing the Event Viewer, click on the Extract Property button in the button bar. This will launch the Custom Event Property Definition tool, which will allow you to categorize event elements and write the regex for extracting the information you're interested in. Right now, I'm interested in the HTTP server status codes.
For your custom field extractions, here's a sample message from the W3C iRule:

Feb 9 14:23:21 tmm tmm[5088]: Rule w3c-client-logging : virtual=www.f5demo.com_http client_ip=65.197.145.92 client_port=37227 lb_server=10.10.200.1:80 host=www.f5demo.com username= request="GET /compression HTTP/1.0" server_status=301 content_length=322 resp_time=1 user_agent="check_http/1.96 (nagios-plugins 1.4.5)]" referer=

The regular expressions for key value pairings are pretty easy to create. In this window you can see that the regex has located the item in the log message I'm interested in – it's highlighted in yellow. Save the regex extraction and you'll be returned to the Event Viewer, where you can look for the new property listed on the page. Now the attribute shows up on the Event list page, down at the bottom. I've already entered several regular expressions for the NEDS data, and since I've assigned them all to the same Device Support Module (DSM) they're showing up on this page; this isn't a NEDS message, though, so they're not applicable to this stream.

With the extraction we just assigned to this DSM and log source we can return to the Event List and build a search. After the search is built it can be used to filter the events list, and we can build a report from it. In the Event Viewer click on the Search button and select the New Event Search option. I've also added extractions to my system for response time and member server. This helps to further illustrate what's happening in my environment – I can see what hosts are sending which response codes and get a rough idea of what the client and server performance is. Pick the fields you're interested in including in the search and click on the 'Filter' button. QRadar composes the search and saves it to the list of defined searches. I could also add a regex to extract the BIG-IP name from the message and group by that attribute to get an idea of what's happening across the various BIG-IPs in my environment.

Now the search runs – and I find that in the last 6 hours there have been 147 HTTP 304 response codes recorded by the system. Here's my search result:

I see that in the last 6 hours there have been 147 HTTP 304's recorded by the system. To turn this search into a report or make it available to the dashboard, click on the Save Criteria button in the toolbar. I've found that it helps to group searches together, so I've created a group called BIG-IP for all my BIG-IP related searches. For this search to appear on your dashboard you'll also need to click the "Add item…" button on the dashboard and locate your search. To generate the report, click the Reports tab and find the Actions dropdown in the toolbar – select Create. I'm building a manual report for this step. My report uses a single frame and Events/Logs as the information source. And here's my report:

Accounting for the date format, there were a lot of 304's returned to clients on March 5th, and I probably have data missing from March 10th onwards because my BIG-IP was sending log messages somewhere else.

NEDS case

While the w3c offload case used an iRule with key/value tuples, NEDS uses a comma delimited string to convey information in the message. I spent some time with the specification and wrote several regular expressions to extract the data. The process is identical to what's outlined in the w3c case, so I'll save the screen real estate and skip the screenshots of the process.
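To see the difference between the two formats outside of QRadar, here is a small, hypothetical Python sketch that applies a couple of the (lightly adapted) expressions from the table above to the sample W3C message and to a NEDS-style fragment taken from the sample message shown a little further down; the quoting is simplified for Python string literals.

import re

# W3C-style message: key=value pairs, so each property regex anchors on its key
w3c = ('Rule w3c-client-logging : virtual=www.f5demo.com_http '
       'client_ip=65.197.145.92 server_status=301 resp_time=1')
print(re.search(r'server_status=(\d{3})', w3c).group(1))                           # 301
print(re.search(r'client_ip=(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})', w3c).group(1))  # 65.197.145.92

# NEDS-style message: positional, comma-delimited values, so the regex walks the fields
neds = '"neds.f5.http.resp.v1","bigip9.f5demo.com","65.197.145.92:42709-65.197.145.93:80@1269971099.951082"'
print(re.search(r'("neds.+[\w]\.v1"),"([\w.]+)"', neds).group(2))                   # bigip9.f5demo.com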
You can find my regular expressions here – I'm fairly new to regular expressions, so I'm sure that there are improvements that can be made to make mine more efficient/maintainable. After defining the custom attributes, here's what I get when viewing a NEDS message in the Event Viewer. To syslog, a NEDS message looks like:

Mar 30 10:44:59 tmm tmm[5088]: Rule networkEventDataStream <HTTP_RESPONSE>: "neds.f5.http.resp.v1","bigip9.f5demo.com","65.197.145.92:42709-65.197.145.93:80@1269971099.951082",1269971099.952527,1,"301","text/html; charset=iso-8859-1","322","10.10.200.1:80","65.197.145.92:42709-10.10.200.1:80"

Here's the detailed view using the regular expressions for the http response, client close, http request and client accepted messages:

HTTP Response
Client close
HTTP Request
Client Accepted

Here's a select result of a search I composed for the connection close data. The "clientCloseBytesOut is not N/A" filter screens out all the non-client-close NEDS messages. The process to generate a report for NEDS data mirrors the process for the W3C case:

1. Save the search
2. Create a report template
3. Add the search to the report template
4. Save the template
5. Run the report

Dashboard data case

To get the dashboard data streaming into QRadar, I had to modify my base script to send the messages via syslog, instead of just printing the string. In addition to the QRadar and BIG-IP systems, you'll need another host with the requisite Perl modules installed to relay the data from the BIG-IP to the QRadar. Here's the script: Dashboard Syslog Perl Script

To use it you'll need to:

1. On line 66 configure the username the script should use to access the dashboard interface
2. On line 67 configure the password for the username from step 1
3. On line 48 configure the IP address or name for the QRadar event collector
4. Schedule the script to run periodically on your relay host – cron would do nicely.

Once you've got the data into QRadar you'll need to:

1. Find the endpoint-isession-stat data log message and write your regular expressions for the data you're interested in
2. Find the remote endpoint log message you're interested in and write your regular expressions for the data you're interested in
3. Build and save your searches
4. Build and run your report template

Here's a sample endpoint_isession_stat data log message in syslog:

Mar 15 17:05:46 127.0.0.1 10.11.100.73: device_timestamp='Wed Mar 15 00:05:46 2010 GMT' HostName='bigip3900c.demo.f5demo.com' version.version='10.1.0' message_type='endpoint_isession_stat' name='_tunnel_ctrl_10.20.50.103' peer_ref='00:00:00:00:00:00:00:00:00:00:ff:ff:0a:14:32:67' null.in.uses='292670' null.in.errors='0' null.in.bytes_opt='31371493' null.in.bytes_raw='29030133' null.out.uses='292670' null.out.errors='0' null.out.bytes_opt='31371493' null.out.bytes_raw='29030133' lzo.in.uses='1250' lzo.in.errors='0' lzo.in.bytes_opt='139167' lzo.in.bytes_raw='124178' lzo.out.uses='1250' lzo.out.errors='0' lzo.out.bytes_opt='139166' lzo.out.bytes_raw='124177' deflate.in.uses='0' deflate.in.errors='0' deflate.in.bytes_opt='0' deflate.in.bytes_raw='0' deflate.out.uses='0' deflate.out.errors='0' deflate.out.bytes_opt='0' deflate.out.bytes_raw='0' dedup.in.uses='0' dedup.in.errors='0' dedup.in.bytes_opt='0' dedup.in.bytes_raw='0' dedup.out.uses='0' dedup.out.errors='0' dedup.out.bytes_opt='0' dedup.out.bytes_raw='0' dedup_in.hit_bytes='0' dedup_in.hits='0' dedup_in.hit_hist.bucket_1k='0' dedup_in.hit_hist.bucket_2k='0' dedup_in.hit_hist.bucket_4k='0' dedup_in.hit_hist.bucket_8k='0' dedup_in.hit_hist.bucket_16k='0' dedup_in.hit_hist.bucket_32k='0' dedup_in.hit_hist.bucket_64k='0' dedup_in.hit_hist.bucket_128k='0' dedup_in.hit_hist.bucket_256k='0' dedup_in.hit_hist.bucket_512k='0' dedup_in.hit_hist.bucket_1m='0' dedup_in.hit_hist.bucket_large='0' dedup_in.miss_bytes='0' dedup_in.misses='0' dedup_in.miss_hist.bucket_1k='0' dedup_in.miss_hist.bucket_2k='0' dedup_in.miss_hist.bucket_4k='0' dedup_in.miss_hist.bucket_8k='0' dedup_in.miss_hist.bucket_16k='0' dedup_in.miss_hist.bucket_32k='0' dedup_in.miss_hist.bucket_64k='0' dedup_in.miss_hist.bucket_128k='0' dedup_in.miss_hist.bucket_256k='0' dedup_in.miss_hist.bucket_512k='0' dedup_in.miss_hist.bucket_1m='0' dedup_in.miss_hist.bucket_large='0' dedup_out.hit_bytes='0' dedup_out.hits='0' dedup_out.hit_hist.bucket_1k='0' dedup_out.hit_hist.bucket_2k='0' dedup_out.hit_hist.bucket_4k='0' dedup_out.hit_hist.bucket_8k='0' dedup_out.hit_hist.bucket_16k='0' dedup_out.hit_hist.bucket_32k='0' dedup_out.hit_hist.bucket_64k='0' dedup_out.hit_hist.bucket_128k='0' dedup_out.hit_hist.bucket_256k='0' dedup_out.hit_hist.bucket_512k='0' dedup_out.hit_hist.bucket_1m='0' dedup_out.hit_hist.bucket_large='0' dedup_out.miss_bytes='0' dedup_out.misses='0' dedup_out.miss_hist.bucket_1k='0' dedup_out.miss_hist.bucket_2k='0' dedup_out.miss_hist.bucket_4k='0' dedup_out.miss_hist.bucket_8k='0' dedup_out.miss_hist.bucket_16k='0' dedup_out.miss_hist.bucket_32k='0' dedup_out.miss_hist.bucket_64k='0' dedup_out.miss_hist.bucket_128k='0' dedup_out.miss_hist.bucket_256k='0' dedup_out.miss_hist.bucket_512k='0' dedup_out.miss_hist.bucket_1m='0' dedup_out.miss_hist.bucket_large='0' outgoing.conns_idle_cur='0' outgoing.conns_idle_max='0' outgoing.conns_idle_tot='0' outgoing.conns_active_cur='2' outgoing.conns_active_max='3' outgoing.conns_active_tot='3' outgoing.conns_errors='0' outgoing.conns_passthru_tot='0' incoming.conns_idle_cur='0' incoming.conns_idle_max='0' incoming.conns_idle_tot='0' incoming.conns_active_cur='2' incoming.conns_active_max='6' incoming.conns_active_tot='127' incoming.conns_errors='0' incoming.conns_passthru_tot='0' dedup_status_array='cccc '
Here's a sample remote endpoint log message in syslog:

Mar 15 17:05:50 127.0.0.1 10.11.100.73: device_timestamp='Wed Mar 15 00:05:46 2010 GMT' HostName='bigip3900b.demo.f5demo.com' version.version='10.1.0' message_type='woc_peer' peer_ref='10.20.50.103' name='bigip3900b.demo.f5demo.com' UUID='cd18:4840:f9e0:' mgmt_addr='10.11.100.72' version='10.1.0' dedup_cache='203588' dedup_action='DEDUP_ACTION_NONE' dedup_cache_refresh_flag='false' dedup_cache_refresh_count='0' state='WOC_PEER_STATE_READY' is_enabled='true' origin='MCP_ORIGIN_CONFIGURED' profile_serverssl='' tunnel_encrypt_data='true' tunnel_port='443' behind_nat='false' source_address='WOC_PEER_NAT_SOURCE_ADDRESS_NONE' config_status='none' routing='true' addr_list=''

And lastly, if you're an ASM or APM user, there's an F5 Networks DSM that will recognize your ASM logs; all you need to do is define the hostname or IP address of your QRadar system in your ASM logging profile.
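To close out this part, here's a quick, hypothetical Python sketch of how you might pick apart the key='value' dashboard messages shown above outside of QRadar, using the same fields the dashboard-script regexes in the table target; the sample line is heavily truncated.

import re

# Truncated copy of the endpoint_isession_stat line shown above
line = ("device_timestamp='Wed Mar 15 00:05:46 2010 GMT' "
        "HostName='bigip3900c.demo.f5demo.com' message_type='endpoint_isession_stat' "
        "null.out.uses='292670' lzo.out.uses='1250' deflate.out.uses='0'")

# Same idea as the table's key='value' regexes, applied in one pass
stats = dict(re.findall(r"([\w\.]+)='(.*?)'", line))

for key in ("HostName", "message_type", "null.out.uses", "lzo.out.uses", "deflate.out.uses"):
    print(key, "=", stats[key])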
2048-bit Infrastructure Impact Reporting Tool

A few weeks ago Lori nailed it with a post (The 2048-bit Keys to the Kingdom) on the coming forced migration to 2048-bit keys. A few days prior, I got a call from "THE" Matt Cauthorn, DevCentral resident stud contributor L4L7, about the very same issue. Not surprisingly, he was ahead of the game on this and has spent some time developing a tool that will take the mystery out of the licensing and infrastructure impact checklist items Lori mentioned. Well, what does this tool do?

Function

Generates a high-level report in pdf format on what 2048-bit keys will do to your infrastructure
Graphs the last seven days of TPS data by default (you can also run against 24 hour and 30 day data as well)
Highlights any platforms in your infrastructure that might be improperly sized for 2048-bit keys under existing loads

Details

Fetches some graph data, the license file, the platform ID, the TMOS version, and general system information. These are all read-only calls.

Assumptions

Using 1024-bit keys today. This may not be true for you. If you're using 2048-bit keys, the report will still generate useful information.
To estimate your maximum platform TPS, the tool simply takes the maximum 1024-bit TPS for your platforms and reports 20% of that value. Note that this is maximum platform TPS, not maximum licensed TPS.

Requirements

If you haven't taken the time to configure your environment for pyControl, you'll need to do so to use this tool. There are installation tutorials for Windows and Ubuntu. Here are the packages you'll need:

Python 2.5, 2.6, or 2.7 (avoid 2.6 if you can)
Setuptools
Suds (grab the GA version, which is currently 0.4)
pyControl
Reportlab (I grabbed the latest daily Windows installer)

These details and the reporting tool itself are ready for you here in the iControl codeshare. Enjoy!

Related Articles

WILS: SSL TPS versus HTTP TPS over SSL
SSL TPS license
SSL transaction (TPS) rate limit reached
Data Center Feng Shui: SSL
F5 Friday: The 2048-bit Keys to the Kingdom
Experimenting with pyControl on LTM VE
Getting Started with pyControl v2: Installing on Windows
Getting Started with pyControl v2: Installing on Ubuntu Desktop
Does pycontrol work in Linux?
pyControl v2.0
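Circling back to the sizing assumption listed above, the estimate the tool reports boils down to simple arithmetic; here is an illustrative Python one-liner of that assumption (the 25,000 TPS figure is made up):

def estimated_2048_tps(max_platform_1024_tps):
    # Stated assumption: 2048-bit capacity is reported as 20% of the platform's
    # maximum 1024-bit TPS (platform maximum, not licensed TPS).
    return 0.20 * max_platform_1024_tps

print(estimated_2048_tps(25000))   # -> 5000.0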
BIG-IP Logging and Reporting Toolkit – part three

In the first couple installments of this series we've talked about what we're trying to accomplish, the vendor options we have available to us, their offerings, some strengths and weaknesses of each, etc. In this installment, we're actually going to roll up our sleeves and get to getting. We'll look at how to get things working in a couple different use cases, including scripts, screenshots and config goodies. Before we get too far ahead of ourselves though, first things first – get the BIG-IP to send messages to syslog.

Logging & Reporting Toolkit - Part 1
Logging & Reporting Toolkit - Part 2
Logging & Reporting Toolkit - Part 3
Logging & Reporting Toolkit - Part 4

Bigip v9

syslog {
   remote server 10.10.200.30
}

Bigip v10

syslog {
   remote server {
      splunk {
         host 10.11.100.30
      }
   }
}

This will send all syslog messages from the BIG-IP to the Splunk server; both BIG-IP system messages and any messages from iRules. If you're interested in having iRules log to the Splunk server directly you can use the HSL statements or the log statements with a destination host defined. Ex) RULE_INIT has set ::SplunkHost "10.10.200.30" and then in the iRules event you're interested in you assemble $log_message and then send it to the log with log $::SplunkHost $log_message. A good practice would be to also record it locally on something like local0 in case the message doesn't make it to the Splunk host. For Splunk to receive the message you have to create a Data Input on udp:514 for the log statement. To cover HSL and log statements I'd recommend creating tcp:514 and udp:514 data inputs on Splunk. http://www.splunk.com/base/Documentation/4.0.2/Admin/Monitornetworkports covers this. We'll get to the scripts part in a bit, first…

W3C offload case

Now that BIG-IP is setup to send messages to Splunk and Splunk is setup to listen to them, let's see what it looks like when put together. Open the Splunk Search application and enter 'w3c-client-logging sourcetype=udp:514' into the search bar. Here's one of the things that makes Splunk really easy to work with: it recognized the key-value pairings in the log message without any configuration needed on my part. Next, I opened the Pick fields box and selected user_agent and added it to the list of fields I'm interested in; now it shows up alongside the log message and I can build a report on it by clicking on the arrow.

The engineer in us wants to use technical terms to accurately convey the precise information we want to distribute. Splunk makes it easy to bridge the gap from technical terms to terms that are meaningful to non-engineers. So, for example, a BIG-IP admin knows what this iRule is and what it's called (in this case w3c-client-logging) – but those could be foreign concepts to folks in the Creative Services department who only want to know what browsers people are using to access a website. So, let's employ some natural language too. The w3c-client-logging rule records a message when an HTTP transaction completes; a request and a response. So, let's call it what it is. On your Splunk system open up the $SPLUNKHOME/etc/system/local/eventtypes.conf file and add this:

[httpTransaction]
search = "Rule w3c-client-logging"

You might need to restart Splunk for this change to take effect. Now, let's go back to the search console and try out our new event type. This is a basic usage of event types in Splunk; you can learn more here: http://www.splunk.com/base/Documentation/4.0.2/Admin/Eventtypesconf .
With transforms.conf and props.conf you can also effectively rename the attributes, so lb_server could be called webServer instead. Now that we have a custom event based on our search string, all we have to do is click the dropdown arrow next to userAgent (in this case) and select the report option from the dropdown. Here's the output we'd see: Heh – lookit' that; nagios is the most frequent visitor…

Network Event Data Stream Case

Now that we've seen the W3C example, let's take a look at another example that's much richer: comma-delimited data. With no keys, just values, this changes things considerably. Let's look at the Network Event Data Stream specification and see how it's been implemented as an iRule.

iRule - http://devcentral.f5.com/s/wiki/default.aspx/iRules/NEDSRule.html
Doc - http://devcentral.f5.com/s/downloads/techtips/NedsF5v1.doc

Since this is an information-rich data source, conveyed from the BIG-IP to the Splunk server as comma-separated values, it takes a few more simple steps for Splunk to be able to extract the fields just like it did for the key-value pairs. Open up $SPLUNKHOME/etc/system/local/transforms.conf and insert this:

[extract_neds.f5.conn.start.v1_csv]
DELIMS = ","
FIELDS = "EventID","Device","Flow","DateTimeSecs","IngressInterface","Protocol","DiffServ","TTL","PolicyName","Direction"

[extract_neds.f5.conn.end.v1_csv]
DELIMS = ","
FIELDS = "EventID","Device","Flow","DateTimeSecs","PktsIn","PktsOut","BytesIn","BytesOut"

[extract_neds.f5.http.req.v1_csv]
DELIMS = ","
FIELDS = "EventID","Device","Flow","DateTimeSecs","Request","Host","URI","UserName","UserAgent"

[extract_neds.f5.http.resp.v1_csv]
DELIMS = ","
FIELDS = "EventID","Device","Flow","DateTimeSecs","Reply","ResponseCode","ContentType","ContentLength","LoadBalanceTarget","ServerFlow"

This names each list of information we're interested in, indicates that the fields in the message are comma delimited, and names the fields. You can name the fields however is appropriate for your environment. (A rough sketch of what one of these records looks like on the wire appears at the end of this section.) Save the file. Next, open $SPLUNKHOME/etc/system/local/props.conf and insert this:

[eventtype::F5connectionStartEvent]
REPORT-extrac = extract_neds.f5.conn.start.v1_csv

[eventtype::F5connectionEndEvent]
REPORT-extrac = extract_neds.f5.conn.end.v1_csv

[eventtype::F5httpRequestEvent]
REPORT-extrac = extract_neds.f5.http.req.v1_csv

[eventtype::F5httpResponseEvent]
REPORT-extrac = extract_neds.f5.http.resp.v1_csv

This instructs the Splunk system to extract the information from the named fields. Save the file. Next, open $SPLUNKHOME/etc/system/local/eventtypes.conf and insert this (the 'sourcetype=udp:514' part is optional; set it up for your environment or omit the search term):

[F5connectionStartEvent]
search = neds.f5.conn.start.v1 sourcetype=udp:514

[F5connectionEndEvent]
search = neds.f5.conn.end.v1 sourcetype=udp:514

[F5httpRequestEvent]
search = neds.f5.http.req.v1 sourcetype=udp:514

[F5httpResponseEvent]
search = neds.f5.http.resp.v1 sourcetype=udp:514

Lastly, this defines the events the data is extracted from. Save the file and restart Splunkd. There are a few processes you can restart to avoid a complete Splunkd restart, but my environment is a lab so I just restarted the whole thing. While Splunkd is restarting, you should attach the NEDS iRule to a BIG-IP virtual server you want to receive data from and send some traffic through the VIP so your Splunk servers will get some data. Now let's navigate back to the Search app in the web UI. In the search bar, enter eventtype=F5connectionEndEvent.
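Before digging into the results, it may help to picture what the NEDS iRule actually puts on the wire. The sketch below is not the published NEDS rule (use the iRule link above for the real implementation); it only illustrates how one record type, the HTTP request event, lines up with the extract_neds.f5.http.req.v1_csv stanza. The device name bigip1.example.com is a placeholder, and the connection start and end records follow the same comma-separated pattern.

when CLIENT_ACCEPTED {
   # Remember the flow tuple while the connection is being set up
   set neds_flow "[IP::client_addr]:[TCP::client_port]-[IP::local_addr]:[TCP::local_port]"
}
when HTTP_REQUEST {
   # Field order matches extract_neds.f5.http.req.v1_csv above:
   # EventID,Device,Flow,DateTimeSecs,Request,Host,URI,UserName,UserAgent
   # (bigip1.example.com is a placeholder device identifier)
   log local0. "neds.f5.http.req.v1,bigip1.example.com,$neds_flow,[clock seconds],[HTTP::method],[HTTP::host],[HTTP::uri],[HTTP::username],[HTTP::header User-Agent]"
}

Because the remote syslog configuration shown at the start of this article forwards local0 messages to the Splunk server, records logged this way arrive on the udp:514 input and match the sourcetype=udp:514 event types defined above.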
I opened the Pick fields box and selected BytesIn, BytesOut, Device, PktsIn and PktsOut. As another way to use Splunk search to report on traffic transiting a BIG-IP, enter eventtype=F5connectionEndEvent | timechart avg(PktsOut) avg(BytesOut) into the search bar. This will generate a table listing the average number of packets and the average number of bytes transmitted from the VIP for the default 10s time period.

At the top of the message I mentioned there was more to come about the script input. F5 recently added dashboards for WAN optimization and Access Policy management. One thing I wish the dashboards provided is a historic view of the data, so I can see how my infrastructure is changing over time as I upgrade applications and add more traffic to my network. Full disclosure: this BIG-IP interface isn't a supported interface for anything other than the dashboard. Using BIG-IP 10.1 with a full WAN Optimization license, Perl, and Splunk, here's how I did it:

1) Place this script (http://devcentral.f5.com/s/downloads/techtips/text-dashboard-log.pl) somewhere on your Splunk system and mark it executable – I put mine in $SPLUNKHOME/bin/scripts.
2) Ensure you have the proper Perl modules installed for the script to work.
3) Add BIG-IP logon data to lines 58 and 59 – the user must be able to access the dashboard.
4) Configure a data input for the Splunk server to get the dashboard data. My Splunk system retrieves a data set every 2 minutes. I've added two collectors, one for each side of my WOM iSession tunnel.

After getting all this set up and letting it run for a while, navigate back to your Search console and you should see a new sourcetype called BIG-IP_Acceleration_Dashboard. Clicking on the BIG-IP_Acceleration_Dashboard sourcetype displays the log entries sent to the Splunk system. Splunk recognizes the key-value pairings and has automatically extracted the data and created entries in the Pick fields list. That's a lot of data! Basically it's the contents of the endpoint_isession_stat table and the endpoint data – you can get this on the CLI via 'tmctl endpoint_isession_stat' and 'b endpoint remote'. Now I can easily see that from basically March 8 until now my WOM tunnels were only down for about 4 minutes. Another interesting report I've built from here is the efficacy of adaptive compression for the data transiting my simulated WAN, by charting lzo_out_uses, deflate_out_uses and null_out_uses over time.

Last, but certainly not least – there's the Splunk for F5 Networks application available via http://www.splunk.com/wiki/Apps:Splunk_for_F5. You should definitely install it if you're an ASM or PSM user.

Logging and Reporting Toolkit Series: Part One | Part Two

BIG-IP Logging and Reporting Toolkit - part two
In this second offering from Joe Malek's delve into some advanced configuration concepts, and more specifically the logging and reporting world, we take a look at the vendors he investigated, what they offer, and how they integrate with F5 products. He discusses some of the capabilities of each, their strengths and weaknesses, and some of the things you might use each for. If you've been wondering what your options are for more in-depth log analysis and reporting, take a look to see what his thoughts are on a couple of the leading solutions.

Logging & Reporting Toolkit - Part 1
Logging & Reporting Toolkit - Part 2
Logging & Reporting Toolkit - Part 3
Logging & Reporting Toolkit - Part 4

Vendor descriptions:

Splunk - http://www.splunk.com/
"IT Search" is Splunk's self-identified core functionality. Splunk's software contains multiple ways to obtain data from IT systems, indexes the data, and reports on it using a web interface. Splunk has invested in creating a Splunk for F5 application containing dashboard-style views into log data for F5 products. Currently included in the application are LTM, GTM, ASM, APM and FirePass. The application is able to consume log messages sent to Splunk servers via syslog, and by extension iRules using High Speed Logging. Splunk is deployed as software to be installed on a customer-provided system. Windows, Mac OS, Linux, AIX, and BSD variants are all supported host operating systems. Splunk can receive messages via files, syslog, SNMP, SCP, SFTP, FTP, generic network ports, FIFO queues, directory crawling and scripting. Splunk has a very intuitive, "Google-like" interface allowing users to easily navigate and report on data in the system. Users are able to define reports, indices, dashboards and applications to present data as an organization requires. Upon receipt of data, Splunk can process the data according to in-built training or according to a user-constructed taxonomy.

Q1 Labs - http://www.q1labs.com/
Q1 Labs brings a product called QRadar to market. QRadar combines functionality commonly found in SIEM, log management and network behavior analysis products. Q1 products are able to consume event messages as well as record information on a per-network-connection basis. QRadar is available as a pay-for appliance and as a no-charge edition in a virtual machine. The differences between the two editions are the SIEM and advanced correlation functionality; the no-charge edition is a log management tool only. QRadar can receive messages via syslog, SNMP, JDBC connectors, SFTP, FTP, SCP, and SDEE. Additionally, QRadar can obtain network flow information in a port mirror/span mode. Customizing data views and report building are based on regular expressions: customers can create their own regular expressions and build upon pre-configured expressions for reporting. In the SIEM module, QRadar includes approximately 250 events that can be sequenced together into complex "Offenses" in a manner similar to building a rule in Microsoft Outlook. "Universal Device Support Modules" can be created and shared among Q1 Labs customers.

PresiNET - http://www.presinet.com/
Whereas tcpdump is like an x-ray for your network, Total View One is like an MRI. Total View One enables customers to maximize the use of infrastructure resources and network performance. Total View One sensors collect protocol state information by tracking connections through a network. This is commonly done out-of-line from traffic streams via port mirroring or network tap technologies.
Currently PresiNET has implemented the NEDS specification, which enables Total View One to receive messages from BIG-IP products and process them as if they'd come from a PresiNET sensor. This integration started with the NEDS iRule and specification, and from this PresiNET created their own parser. PresiNET products are delivered as appliances in both a central unit and a sensor unit mode. Optionally, one may subscribe to PresiNET on a managed-service basis. After you install a Total View One product in your network, you get access to extensive views of the available state information with little or no additional work. If the included reporting capabilities aren't enough, you can export data from the system as a CSV file.

What's Next?

Now that you know who the players are and what they can do, be sure to check back next week to look at how the F5 products generate logs, how these technologies deal with them, and some testing results. To give you more of an idea of what's to come, I'll leave you with a look at the facts that will be delivered to the reporting systems from the F5 device(s) to see how they're handled: virtual server accessed, client IP address, client port, LB decision results, HTTP host, HTTP username, user-agent string, content encoding, requested URI, requested path, content type, content length, request time, server string, server port, status code, device identifier, referrer, host header, response time, VLAN id, IP protocol, IP type of service, connection end time, packets, bytes, anything sent to a dashboard, firewall messages, client source geography, extended application log data, health information for back-end filers, audit logs, SNMP trap information, dedup efficacy, compression codec efficacy, WOM error counters, link characteristics as known, system state.

Logging and Reporting Toolkit Series: Part One | Part Three