Concept of F5 Device DoS and DoS profiles
Dear Reader,

I'm about to write a series of DDoS articles, which will hopefully help you to better understand how F5 DDoS protection works and how to configure it. I plan to release them at a more or less regular frequency. Feel free to send me your feedback and let me know which topics are relevant for you and should get covered.

This first article is intended to explain the BIG-IP Device DoS and per-service DoS protection (DoS profile). It covers the concepts of both approaches and explains at a high level the threshold modes "Fully Manual", "Fully Automatic" and "Multiplier Based Mitigation", including the principles of stress measurement. In later articles I will describe these threshold modes in more detail. At the end of the article you will also find an explanation of the physical and logical data path of the BIG-IP.

Device DoS with static DoS vectors

The primary goal of Device DoS is to protect the BIG-IP itself. BIG-IP has to deal with all packets arriving on the device, regardless of whether there is an existing listener (Virtual Server (VS) / Protected Object (PO)), connection entry, NAT entry or anything else. As soon as a packet hits the BIG-IP, its CPU basically has to deal with it. Picking up each and every packet consumes CPU cycles, especially under heavy DoS conditions. The more operations each packet requires, depending on the configuration, the higher the CPU load gets, because more CPU cycles are consumed. The earlier the BIG-IP can drop a packet (if the packet needs to be dropped), the better it is for the CPU load. If the drop is done in hardware (FPGA), the packet gets dropped before it consumes any CPU cycles. Using static DoS vectors helps to lower the number of packets hitting the CPU when under DoS. If the number of packets is above a certain threshold (usually when a flood happens), BIG-IP rate-limits that specific vector, which keeps the CPU available because it sees fewer packets.

Figure 1: Principle of attack mitigation with static DoS vectors

The downside of this approach is that the BIG-IP cannot differentiate between "legitimate" and "attacking" packets when using a static DoS rate limit. BIG-IP (FPGA) simply drops based on the predicates it gets from the static DoS vector. Predicates are attributes of a packet, such as the protocol being "TCP" or the "ACK" flag being set.

Keep in mind that when BIG-IP runs stateful, "bad" packets will get dropped in software anyway by the operating system (TMOS) when it identifies them as not belonging to an existing connection or as invalid/broken (for example, a bad checksum). This is of course different for SYN packets, because they create connection entries. This is where SYN cookies play an important role under DoS conditions; they will be explained in a later article.

I usually recommend running Behavioral DoS mitigation in conjunction with static DoS vectors. It is far more precise and able to filter out only the attack traffic, but I will discuss this in more detail in one of the following articles.

Manual threshold vs. Fully Automatic

Before I start to explain the per-service DoS protection (DoS profiles), I would like to give you a brief overview of the threshold modes you can use per DoS vector. There are different ways to set thresholds that activate the detection and rate-limiting on a DoS vector.
The operator can do it manually by using the option "Fully Manual" and then fine-tune the pre-configured values on the DoS vectors. This can be challenging, because besides doing it all by hand, it is usually difficult to know the right thresholds, especially since they typically depend on the day of the week and the time of day.

Figure 2: Example of a manual DoS vector configuration

That's why most of the vectors have the option "Fully Automatic" available. It means BIG-IP will learn from history and "knows" how many packets it usually "sees" at that specific time for the specific vector. This is called baselining, and it is used to calculate the detection threshold.

Figure 3: Threshold Modes

As soon as a flood hits the BIG-IP and crosses the detection threshold for a vector, BIG-IP detects it as an attack for that vector, which basically means it identifies it as an anomaly. However, it does not start to mitigate (drop) yet. That will only happen once the TMM load (CPU load) is also above a certain utilization (mitigation sensitivity: Low: 78%, Medium: 68%, High: 51%). If both conditions are true (the packet rate and the TMM/CPU load are too high), the mitigation starts and rate-limits the number of packets for this vector going into the BIG-IP for the specific TMM. That means the DoS feature will only drop packets when necessary, in order to protect the BIG-IP CPU. This is a dynamic process which drops more packets when the attack becomes more aggressive and fewer when the attack becomes less aggressive. For TCP traffic, keep in mind that "invalid" traffic will not get forwarded anyway, which is a strong benefit of running the device stateful.

When a DoS vector is hardware supported, the FPGAs drop the packets basically at the switch level of the BIG-IP. If it is not hardware supported, then the packet is dropped at a very early stage of its life cycle inside the BIG-IP.

Device DoS counts ALL packets going to the BIG-IP. It does not matter whether their destination is behind the BIG-IP or the BIG-IP itself. Because Device DoS is there to protect the BIG-IP, the thresholds you can configure in the manual configuration mode on the DoS vectors are per TMM! This means you configure the maximum number of packets one TMM can handle. Operators often want to set an overall threshold of, let's say, 10k packets/sec; in that case they need to divide this limit by the number of TMMs. An exception is the Sweep and Flood vector: here the threshold is per BIG-IP and not per TMM.

DoS profile for Protected Objects

Now let's talk about the per-service DoS protection (DoS profile). The goal is to protect the service which runs behind the BIG-IP. It is important to know that when BIG-IP runs stateful, the service is already protected by the state handling of the BIG-IP (true for TCP traffic). That means a randomized ACK flood, for example, will never hit the service. For stateless traffic (UDP/ICMP) it is different; here the BIG-IP simply forwards packets. "Fully Automatic" used on a DoS profile discovers the health of the service, which works well for TCP or DNS traffic. (This is done by evaluating TCP behavior like window size, retransmissions and congestion, or by counting DNS requests vs. responses, and so on.) If the detection rate for that service is crossed (anomaly detected) and stress on the service is identified, the mitigation starts (see the short sketch below). Again, keep in mind that for TCP this will only happen for "legitimate" traffic, because out-of-state traffic will never reach the service. It is already protected by the BIG-IP being configured stateful.
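To illustrate the "anomaly plus stress" idea just described, here is a small, purely conceptual Python sketch. It is not how TMOS is implemented; the function name and the way the stress signal is represented are my own simplifications for illustration.

```python
def should_mitigate(current_eps: float,
                    detection_eps: float,
                    service_stressed: bool) -> bool:
    """Conceptual 'Fully Automatic' decision for a DoS profile vector.

    current_eps      - packets/sec currently seen for this vector
    detection_eps    - learned baseline plus sensitivity padding
    service_stressed - health signal derived from the protected service
                       (e.g. TCP retransmissions/window size, or the
                       DNS request/response ratio)
    """
    anomaly_detected = current_eps > detection_eps   # "attack detected" / logged
    return anomaly_detected and service_stressed     # drop (rate-limit) only if both are true


# Example: an anomaly alone does not trigger mitigation ...
print(should_mitigate(current_eps=300_000, detection_eps=120_000, service_stressed=False))  # False
# ... but an anomaly combined with a stressed service does.
print(should_mitigate(current_eps=300_000, detection_eps=120_000, service_stressed=True))   # True
```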
An example could be an HTTP flood. My recommendation: if you have the option to configure L7 BaDoS (Behavioral DoS, running on DHD/Advanced WAF), then go with that one. It gives much better results and better mitigation options for HTTP floods than L3/4 DoS alone, because it also operates on L7. In that case, please do not also configure the TCP DoS vectors "Push Flood", "Bad ACK Flood" and "SYN Flood" on that DoS profile, unless you use a TMOS version newer than 14.1. For other TCP services like POP3, SMTP, LDAP and so on you can always go with the L3/4 DoS vectors. Since TMOS version 15.0, L7 BaDoS and the L3/4 DoS vectors work in conjunction.

Stateless traffic (UDP/ICMP)

For UDP traffic I would like to separate UDP into DNS traffic and non-DNS UDP traffic. DNS is a very good example where the health detection mechanism works great. This is done by measuring the ratio of requests vs. responses. For example, if BIG-IP sees 100k queries/sec going to the DNS server and the server sends back 100k answers/sec, the server shows it can handle the load. If it sees 200k queries/sec going to the server and the server sends back only 150k answers/sec, then that is a good indication that the server cannot handle the load. In that case the BIG-IP would start to rate-limit, provided the current rate is also above the detection rate (the rate BIG-IP expects based on the historic rates). A small sketch of this ratio idea follows a bit further below.

The 'Device UDP' vector gives you the option to exclude UDP traffic on specific ports from the UDP vector (packet counting). For example, when you exclude port 53 and UDP traffic hits that port, it will not count into the UDP counter. In that case you would handle the DNS traffic with the DNS vectors.

Figure 4: UDP port exclusion. Here you see an example for ports 53 (DNS), 5060 (SIP) and 123 (NTP).

Auto Detection / Multiplier Based Mitigation

If the traffic is non-DNS UDP traffic or ICMP traffic, the stress measurement does not work very accurately, so I recommend going with the multiplier option on the DoS profile. Here BIG-IP does the baselining (calculating the detection rate) similar to the "Fully Automatic" mode, except that it will kick in (rate-limit) if it sees more than the defined multiple for the specific vector, regardless of the CPU load. For example, when the calculated detection rate is 250k packets/sec and the multiplier is set to 500 (which means 5 times), then the mitigation rate would be 250k x 5 = 1,250,000 packets/sec. The multiplier feature gives you the nice option to configure a threshold based on a multiple of an expected rate.

Figure 5: Multiplier based mitigation

On a DoS profile the threshold configuration is per service (all packets targeted at the protected service), which actually means per BIG-IP and NOT per TMM like on the device level. Here the goal is to set how many packets are allowed to pass the BIG-IP and reach the service. The distribution of these thresholds to the TMMs is done in a dynamic way: every TMM gets a percentage of the configured threshold, based on the EPS (Events Per Second, which in this context means Packets Per Second) the system has seen for the specific vector in the previous second on this TMM. This mechanism protects against hash-type attacks.
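As promised, here is a minimal, purely illustrative Python sketch of the DNS request/response ratio idea from the example above. The ratio threshold and the function name are my own assumptions for this illustration, not F5 settings or APIs.

```python
def dns_service_looks_stressed(queries_per_sec: float,
                               answers_per_sec: float,
                               healthy_ratio: float = 0.9) -> bool:
    """Conceptual health check: a DNS server that answers (almost) every
    query it receives looks healthy; a growing gap between queries and
    answers indicates it cannot keep up with the load."""
    if queries_per_sec == 0:
        return False                      # nothing to measure
    return (answers_per_sec / queries_per_sec) < healthy_ratio


print(dns_service_looks_stressed(100_000, 100_000))  # False - server keeps up
print(dns_service_looks_stressed(200_000, 150_000))  # True  - only 75% answered
```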
Physical and logical data path

There is a physical and a logical traffic path in BIG-IP. When a packet gets dropped via a DoS profile on a VS/PO, the BIG-IP will no longer count it on the device-level counter. If the threshold on the device level is reached, the packet gets dropped at the device level and will not get to the DoS profile.

So the physical path is device first and then the VS/PO, but the logical path is VS/PO first and then the device. That means, when a packet arrives at the device level (physical first), that counter gets incremented. If the threshold is not reached, the packet goes to the VS/PO (DoS profile) level. If the threshold is reached here, then the packet gets dropped and the counter for the device DoS vector is decremented. If the packet does not get dropped, then the packet counts into both counters (device and VS). This means that the VS/PO thresholds should always be set lower than the device thresholds (remember, on the device level you set the thresholds per TMM!). The device threshold should be the maximum the BIG-IP (TMM) can handle overall.

It is important to note that mitigation thresholds in manual mode are "hard" numbers and do not take the stress on the service into account. In "Fully Automatic" mode on the VS/PO level, mitigation kicks in when the detection threshold is reached AND the stress on the service is too high.

Now let's assume a protected TCP service behind the BIG-IP gets attacked with a TCP attack like a randomized Push/ACK flood. Because of the high packet rate hitting that vector on the DoS profile attached to that PO, it will go into detect mode. But because it is a TCP flood and BIG-IP is configured to run stateful, the attack packets will never reach the service behind the BIG-IP, and therefore the stress on that service will never go up in this case. The flood is then handled by the session state handling (CPU) of the BIG-IP until it comes under too much pressure, at which point the Device DoS will kick in "upfront" and mitigate the flood. If you use the multiplier option, then the mitigation kicks in when the detection rate multiplied by the configured factor is reached. This can happen on the VS/PO level and/or on the device level, because it is independent of stress.

"Fully Automatic" will not work properly with asymmetric traffic on a VS/PO, because in that case BIG-IP cannot identify whether the service is under stress, since it only sees half of the traffic (either the request or the response, depending on where it is initiated). For deployments with asymmetric traffic I recommend using the manual or the multiplier option on VS/PO configurations. The multiplier option keeps it dynamic with regard to the calculated detection and mitigation rates based on the history. "Fully Automatic" does work with asymmetric traffic on Device DoS, because there it measures the CPU (TMM) stress of the BIG-IP itself. Of course, when you run asymmetric traffic, BIG-IP cannot be stateful.

I recommend always starting with the Device DoS configuration to protect the BIG-IP itself against DoS floods and then focusing on the DoS protection for the services behind the BIG-IP. On Device DoS you will probably mostly use the "Fully Manual" and "Fully Automatic" threshold modes. To protect services, you will mostly go with "Fully Manual", "Fully Automatic" or "Auto Detection / Multiplier Based Mitigation". In one of the next articles I will describe in more detail when to use what and why.

As mentioned before, in addition to the static DoS vectors I recommend enabling Behavioral DoS, which is much smarter than the static DoS vectors and can filter out only the attack traffic, which tremendously decreases the chance of false positives. You can enable it, like the static DoS vectors, on the device and on the VS/PO level. This topic will be covered in another article as well.

With that said, I would like to finish my first article.
Let me know your feedback. Thank you, sVen Mueller

Explanation of F5 DDoS threshold modes
Dear Reader,

In my article "Concept of F5 Device DoS and DoS profiles", I recommended using the "Fully Automatic" or multiplier-based configuration option for some DoS vectors. In this article I would like to explain how these threshold modes work and what happens behind the scenes.

When you configure a DoS vector you have the option to choose between different threshold modes: "Fully Automatic", "Auto Detection / Multiplier Based", "Manual Detection / Auto Mitigation" and "Fully Manual".

Figure 1: Threshold Modes

The two options I normally use on many vectors are "Fully Automatic" and "Auto Detection / Multiplier Based". But what do these two options do for me? Setting thresholds manually is not an easy task for some vectors. I mean, who really knows how many PUSH/ACK packets/sec, for example, usually hit the device or a specific service? And when I do have an idea about a value, should this be a static value? Or should I rather take the maximum value I have seen so far? And how many packets per second should I put on top to make sure the system is not kicking in too early? When should I adjust it? Do I have increasing traffic?

Fully Automatic

In reality, the rate changes constantly, and most likely during the day I will have more PUSH/ACK packets/sec than during the night. What happens when there is a campaign or an event like Black Friday and far more users are visiting the webpage than usual? During these high-traffic events, my suggested thresholds might no longer be correct, which could lead to "good" traffic getting dropped. All of this should be taken into consideration when setting a threshold, and it ends up being very difficult to do manually. It's better to let the machine do it for you, and this is what "Fully Automatic" is about.

Figure 2: Expected EPS

As soon as you use this option, it leverages the learning it has done since traffic started passing through the BIG-IP, or since you enforced relearning, which resets everything learned so far and starts from scratch. The system continuously calculates the expected rates for all vectors based on the historic traffic rates. It takes information from up to one year back and weights it differently in order to know which packet rate should be expected at that time and day for that specific vector in the specific context (Device, Virtual Server/Protected Object). The system then calculates a padding on top of this expected rate. This rate is called the detection rate and depends on the "threshold sensitivity" you have configured:

Low Sensitivity means 66% padding
Medium Sensitivity means 40% padding
High Sensitivity means 0% padding

Figure 3: Detection EPS

As soon as the current rate is above the detection value, the BIG-IP will show the message "Attack detected", which actually means an anomaly was detected, because it sees more packets for that specific vector than expected plus the padding (the detection rate). But DoS mitigation does not start at that point!

Figure 4: Current EPS

Keep in mind that when you run the BIG-IP in stateful mode it will drop out-of-state packets anyway. This has nothing to do with the DoS functionality. But what happens when there is a serious flood and the BIG-IP CPU gets busy because of the massive number of packets it has to deal with? This is where the second part of the "Fully Automatic" approach comes into play. Again, depending on your threshold sensitivity, the DoS mitigation starts as soon as a certain level of stress is detected on the CPU of the BIG-IP:

Figure 5: Mitigation Threshold

Low Sensitivity means 78.3% TMM load
Medium Sensitivity means 68.3% TMM load
High Sensitivity means 51.6% TMM load
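Putting the two halves together, here is a small, purely conceptual Python sketch of the "Fully Automatic" logic. The padding and TMM-load values are the ones listed in this article and are used here only for illustration; this is not how TMOS implements the mechanism internally, and the names are my own.

```python
# Illustrative only: padding and TMM-load levels per threshold sensitivity,
# taken from the article (not an official API or configuration object).
SENSITIVITY = {
    "low":    {"padding": 0.66, "tmm_load": 0.783},
    "medium": {"padding": 0.40, "tmm_load": 0.683},
    "high":   {"padding": 0.00, "tmm_load": 0.516},
}

def detection_eps(expected_eps: float, sensitivity: str) -> float:
    """Detection rate = learned/expected rate plus the sensitivity padding."""
    return expected_eps * (1 + SENSITIVITY[sensitivity]["padding"])

def fully_automatic_mitigates(current_eps: float, expected_eps: float,
                              tmm_load: float, sensitivity: str) -> bool:
    """Mitigation only starts when the rate is anomalous AND the TMM is stressed."""
    anomaly = current_eps > detection_eps(expected_eps, sensitivity)
    stressed = tmm_load > SENSITIVITY[sensitivity]["tmm_load"]
    return anomaly and stressed

# Expected 100k pps at medium sensitivity -> detection rate of 140k pps.
print(detection_eps(100_000, "medium"))                             # 140000.0
print(fully_automatic_mitigates(200_000, 100_000, 0.50, "medium"))  # False: TMM not stressed
print(fully_automatic_mitigates(200_000, 100_000, 0.75, "medium"))  # True
```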
Note that the mitigation is per TMM, and therefore the stress and rate per TMM are what matter. When the traffic rate for that vector is above the detection rate and the CPU of the BIG-IP (Device DoS) is "too" busy, the mitigation kicks in and rate-limits that specific vector. When a DoS vector is hardware supported, the FPGAs drop the packets at the switch level of the BIG-IP. If that DoS vector is not hardware supported, then the packet is dropped at a very early stage of its life cycle inside the BIG-IP. The rate at which packets are dropped (the mitigation rate) is dynamic, depending on the incoming number of packets for that vector and the CPU (TMM) stress of the BIG-IP. This allows the stress on the CPU to go down, as it has to deal with fewer packets. Once the incoming rate is again below the detection rate, the system declares the attack as ended.

Note: When an attack is detected, the packet rate during that time will not go into the calculation of expected rates for the future. This ensures that the BIG-IP will not learn from attack traffic, which would skew the automatic thresholds. All traffic rates below the detection rate (or below the floor value, when configured) modify the expected rate for the future, and the BIG-IP will adjust the detection rate automatically.

For most of the vectors you can configure a floor and a ceiling value. Floor means that as long as the traffic rate is below that threshold, the mitigation for that vector will never kick in, even when the CPU is at 100%. Ceiling means that mitigation always kicks in at that rate, even when the CPU is idle. Between floor and ceiling, the dynamic and automatic process takes place: mitigation only gets executed when the rate is above both the Floor EPS and the Detection EPS AND stress on the particular context is measured.

Figure 6: Floor and Ceiling EPS

What is the difference when you use "Fully Automatic" on the device level compared to the VS/PO (DoS profile) level? Everything is the same, except that on the VS or Protected Object (PO) level the relevant stress is NOT the BIG-IP device stress, it is the stress of the service you are protecting (web server, DNS, IMAP server, network, ...). BIG-IP can measure the stress of the service by measuring TCP attributes like retransmissions, window size, congestion, etc. This gives a good indication of how busy a service is. It works very well for request/response protocols like TCP, HTTP and DNS. I recommend using this when the Protected Object is a single service and not a "wildcard" Protected Object covering, for example, a network or service range.

When the Protected Object is a "wildcard" service and/or a UDP service (except DNS), I recommend using "Auto Detection / Multiplier Based" mitigation. It works in the same way as "Fully Automatic" from the learning perspective, but the mitigation condition is not stress, it is a multiple of the detection rate. For example, say the detection rate for a specific vector is calculated to be 100k packets/sec. By default, the multiplier value is "500", which means 5x. Therefore, the mitigation rate is calculated as 500k packets/sec. If that particular vector sees more than 500k packets/sec, those packets would be dropped. The multiplier value can also be configured individually, like in the screenshot, where it is set to 8x (800).

Figure 7: Auto Detection / Multiplier Based Mitigation
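Expressed as a tiny, illustrative Python snippet, the multiplier arithmetic from the example above looks like this (the multiplier is stored as a percentage-style value, e.g. 500 = 5x; this is just the math, not an F5 API):

```python
def multiplier_mitigation_eps(detection_eps: float, multiplier: int = 500) -> float:
    """Mitigation rate for 'Auto Detection / Multiplier Based' mode:
    the learned detection rate multiplied by the configured factor
    (500 means 5x, 800 means 8x)."""
    return detection_eps * (multiplier / 100)

print(multiplier_mitigation_eps(100_000))        # 500000.0 -> default 5x
print(multiplier_mitigation_eps(100_000, 800))   # 800000.0 -> 8x, as in the screenshot
```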
The benefit of this mode is that the BIG-IP will automatically learn the baseline for that vector and will only start to mitigate on a large spike. The mitigation rate is always a multiple of the detection rate, which is 5x by default but is configurable.

When should I use "Fully Manual"?

When you want to rate-limit a specific vector to a certain number of packets/sec, then "Fully Manual" is the right choice. Very good examples for that type of vector are the "Bad Header" vector types. These types of packets will never get forwarded by the BIG-IP, so dropping them with a DoS vector saves the CPU, which is beneficial under DoS conditions. In the screenshot below a vector is configured as "Fully Manual". Next I'll describe what each of the options means.

Figure 8: Fully Manual

Detection Threshold EPS configures the packet rate (pps) at which you will get a log message (NO mitigation!).
Detection Threshold % compares the current pps rate for that vector with the configured percentage (in this example 500%, i.e. 5x) of the 1-minute average rate. If the current rate is higher, then you will get a log message.
Mitigation Threshold EPS rate-limits to the configured value (mitigation).

I recommend setting the Mitigation Threshold EPS to something relatively low like '10' or '100' on 'Bad Header' types of vectors. You can also set it to '0', which means all packets hitting this vector will get dropped by the DoS function, which is usually done in hardware (FPGA). With the Detection Threshold EPS you set the rate at which you want to get a log message for that vector. If you do it this way, then you get a warning message like this one to inform you about the logging behavior:

Warning generated: DOS attack data (bad-tcp-flags-all-set): Since drop limit is less than detection limit, packets dropped below the detection limit rate will not be logged.

Another use case for "Fully Manual" is when you know the maximum number of these packets the service can handle. But here my recommendation is to still use "Fully Automatic" and set the maximum rate with the ceiling threshold, because then the protected service will benefit from both threshold options.

Important: Please keep in mind that when you set manual thresholds for Device DoS, the thresholds are there to protect each TMM. Therefore the value you set is per TMM! An exception to this is the Sweep and Flood vector, where the threshold is per BIG-IP/service and not per TMM, like on DoS profiles. When using manual thresholds on a DoS profile of a Protected Object, the threshold configuration is per service (all packets targeted at the protected service), NOT per TMM like on the device level. Here the goal is to set how many packets are allowed to pass the BIG-IP and reach the service. The distribution of these thresholds to the TMMs is done in a dynamic way: every TMM gets a percentage of the configured threshold, based on the EPS (Events Per Second, which in this context means Packets Per Second) the system has seen for the specific vector in the previous second on this TMM (see the small sketch below). This mechanism protects against hash-type attacks.
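To make that dynamic distribution more tangible, here is a small illustrative Python sketch. The proportional weighting shown (each TMM's share of last second's packets for this vector) is my simplified reading of the description above, not the exact internal algorithm.

```python
def distribute_threshold(total_eps, last_second_eps_per_tmm):
    """Split a per-service threshold across TMMs in proportion to the
    packets each TMM saw for this vector in the previous second."""
    total_seen = sum(last_second_eps_per_tmm)
    if total_seen == 0:
        # nothing seen last second: fall back to an even split
        return [total_eps / len(last_second_eps_per_tmm)] * len(last_second_eps_per_tmm)
    return [total_eps * seen / total_seen for seen in last_second_eps_per_tmm]

# 100k pps allowed for the service; the TMM that saw most of last second's
# traffic receives the largest share of the threshold.
print(distribute_threshold(100_000, [10_000, 30_000, 60_000]))  # [10000.0, 30000.0, 60000.0]
```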
Ok, I hope this article gives you a better understanding of how "Fully Manual", "Fully Automatic" and "Auto Detection / Multiplier Based Mitigation" work. They are important concepts to understand, especially when they work in conjunction with stress measurement. This means the BIG-IP will only kick in with the DoS mitigation when the protected object (the BIG-IP or the service behind the BIG-IP) is under stress. Why risk false positives when it is not necessary?

In my next article I will demonstrate how Device DoS and the DoS profiles work together and how the stateful concept cooperates with the DoS mitigation. I will show you some DoS commands to test it and also commands to get details from the BIG-IP via the CLI.

Thank you, sVen Mueller

F5 Distributed Cloud - Regional Decryption with Virtual Sites
In this article we discuss how F5 Distributed Cloud can be configured to support regulatory demands for TLS termination of traffic in specific regions around the world. The article provides insight into the F5 Distributed Cloud global backbone and application delivery network (ADN). It goes on to examine how F5 Distributed Cloud is able to achieve these custom topologies in a multi-tenant architecture while adhering to the "rules of the internet" for route summarization. Read on to learn about the flexibility of F5's SaaS platform, providing application delivery and security solutions for your applications.

F5 402 Exam reading list and notes
Disclaimer: The collection of articles and documentation is credited to the original owners. This is not an official F5 402 exam guide.

I recently passed the F5 402 - Certified Solution Expert - Cloud exam. I am pleased that I finally achieved it. Many are asking what I used to prepare for the exam.

First, be familiar with the 402 - CLOUD SOLUTIONS EXAM BLUEPRINT. It is located at K29900360: F5 certification | Exams and blueprints. https://support.f5.com/csp/article/K29900360

The prerequisite to take the F5 402 exam is that you are currently an F5 CTS for LTM (301a and 301b) and DNS (302). These exams will already have exposed you to BIG-IP LTM and DNS. However, you should also read up on the other BIG-IP modules and their functionality. The F5 402 exam blueprint already gives you the topics you will need to be familiar with.

It really helps if you have hands-on experience working in cloud environments, such as AWS and Azure, and container environments such as Kubernetes. For me, it was a bit of AWS and Kubernetes. You will need to be familiar with cloud terminology - services, features, etc. - and how it relates to the cloud vendors. Familiarity with container orchestration terminology, such as in Kubernetes, will also help. Bundling these cloud/container terms and features, relating them to BIG-IP deployments in the cloud, and mapping them to the F5 402 exam blueprint will help you organize your knowledge and prepare for the exam.

Looking back at my preparation for the exam, here is the documentation I would start with to review and build a knowledge map. There are links in the articles that supplement the concepts described; my suggestion is to consult the F5 402 exam blueprint and see whether you need more familiarity with a topic after reading through the articles.

https://clouddocs.f5.com/cloud/public/v1/
https://clouddocs.f5.com/cloud/public/v1/aws_index.html
https://clouddocs.f5.com/cloud/public/v1/azure_index.html
https://clouddocs.f5.com/cloud/public/v1/matrix.html
https://clouddocs.f5.com/containers/latest/
https://aws.amazon.com/blogs/enterprise-strategy/6-strategies-for-migrating-applications-to-the-cloud/
https://www.f5.com/company/blog/networking-in-the-age-of-containers
https://aws.amazon.com/blogs/networking-and-content-delivery/deployment-models-for-aws-network-firewall/
https://docs.microsoft.com/en-us/azure/architecture/aws-professional/services

Good Luck!

F5 Distributed Cloud - Service Policy - Header Matching Logic & Processing
Learn about the F5 Distributed Cloud service policy feature and how to apply logic to your match criteria (and/or). This understanding of the logic structures within service policies unlocks endless combinations of application security services.

A Day in the Life of a Security Engineer from Tel Aviv
October 2022 is Cybersecurity Awareness Month, so we decided to focus on the human aspect of the F5 SIRT team and share some of our day-to-day work.

When I started writing this, I thought it would be trivial to capture what I do on an average day and write about it. But it turned out to be a challenging task, simply because we do so much. We interact with many groups and there is always a new top priority. So bouncing back and forth between tasks is the only way to execute when you are deeply involved with security in the organization. There is really no average day, as the next security emergency is right around the corner.

First, a little background info on me: I started working at F5 in 2006 as a New Products Introduction (NPI) engineer, representing the customer throughout the product life cycle. The job included attending design meetings on new features and their real-world implementation with Product Development (PD) and Product Management (PM). The deliverables were technical presentations, both online and in person at internal F5 conferences. The feedback that I got from the various departments was consolidated into an improvements list for PD and PM, acting as a feedback loop for new features. The product that I represented as subject matter expert was BIG-IP Application Security Manager (ASM), which evolved into BIG-IP Advanced WAF; it is my specialization and my favorite technical topic to this day. Then, at the end of 2016, I moved to the F5 SIRT team. The shift was beneficial as it started a new chapter in becoming a full-time security engineer. Let me describe to you what that looks like.

Morning: coffee & emergency catchup first

It is a 12-minute drive from my apartment to the office in Tel Aviv. Living close to the office is great, and you can see the sea from the 30th floor. Yes, I know I should/can bike, and I will do more biking now that the summer heat is going away. The first cortado is for reading the emails that piled up overnight. We are a "follow the sun" coverage group, so here's a quick time orientation: when I arrive at the office at 9AM, it's lunch time for the Singapore guys at 13:00 (5 hours ahead) and the US guys are getting ready to sleep at 11PM Seattle time (10 behind). This means that I usually have a long list of emails and messages to read. I catch up on all the ongoing emergency cases that have reached my time zone for monitoring or follow-up actions.

F5 SIRT is a unique group of top engineers with many years of experience with F5 security products and security in general. We are responsible for three main pillars, and the first one is assisting F5 customers when they are under attack. Since we are an emergency team, we are ready to act from the minute we come to work, and we like the excitement of solving emergency cases. The F5 security team is ready to help customers when they need us the most: when they are under attack. This is what I call the money time, as this is why people buy security products, to mitigate attacks using the F5 products. When that moment arrives, the first line of defense is the F5 SIRT specialist group, which handles the request from the customer and marks it as an emergency. If they need assistance from a security engineer, they will ping us. Working in collaboration with the SIRT specialist group always feels good. It's great to have someone to trust, especially working with the EMEA F5 SIRT specialists, who always set a high standard.
When I'm called in to help a customer under attack, verbal communication is always more effective and faster than written communication. This call ensures the technical issues, the risks, and the benefits involved in mitigation are the right ones, so that the customer can choose the best path forward. Common actions include understanding the customer environment, the attack indications they have seen, and the severity of the incident. Once we collect the information from the customer, we create a plan that lists all the possibilities to mitigate the attack. Sometimes we simply give good advice on the possible mitigation and how to proceed, but sometimes we need to have a full war room where we do deep traffic analysis and provide the specific mitigation to kill the attack. We have seen many attacks and each attack is different, but essentially we classify them into these main categories:

Graph: Distribution of the attack over time.

Working in SIRT requires an understanding of the different environments, the attack landscape and, above all, a deep understanding of F5 security products. These are your best friends in killing the attack. Finding the best mitigation strategies with our products, which leads to successful prevention, is what we do best. It is a very good feeling to lead the way to an incident win. At the end of each incident, we create a report with recommendations for the customer as well as an internal analysis, because if it's not documented it doesn't exist. We have a high success rate in mitigating attacks, mostly because the F5 product suite is one of the best in the industry for mitigating network and web application attacks. We usually get a lot of warm words from customers. And with that, it's now lunch time. Time flies when you are busy.

Noon: lunch and CVEs

Deciding what to eat should be evaluated carefully: too heavy and you fall asleep, too light and you will be hungry in 2-3 hours. If time permits, I'll walk 10 minutes in each direction to the local food market with the local F5 employees and have fun conversations over lunch. When I get back it is time for a black coffee, to review the additional work that needs to be done for the day and to decide which of the items I can delay. Most of the time we define our own deadlines, so we plan ahead. This means we have no one to blame if we are late. So don't be late. This is also a good time to read some of the security industry news. If there is something notable, I will paste the link into the team's group chat. If it's my turn to write This Week in Security (TWIS), then this is where I will mark topics to write about. Writing TWIS can be time consuming, but it provides the ability to express yourself and keep up to date with the security industry around the globe.

Now it is CVE work time, which is our second pillar of responsibility: vulnerability management for F5 products. F5 SIRT owns the vulnerability management and publishes public CVEs as part of F5's commitment to security best practices for F5 products. We have a public policy that we follow: K4602: Overview of the F5 security vulnerability response policy. CVEs can originate from internal or external sources, such as a security researcher who approaches the F5 SIRT team directly. We evaluate CVEs to make sure we understand the vulnerability, from both the exploitation aspect and the relevant fix introduced by Product Engineering (PE).
After interacting with PE, and once the software fix is in place, we start writing the security advisory, which is the actual article that will be published. All CVEs are under embargo until publication day, and just before we publish we provide a briefing to an internal audience to inform them of what to expect and which types of questions they might encounter. We work as a group to cover all the regions and keep everyone on the same page. Publication day is always a big event for us. This is where all the hard work comes to light. We constantly monitor customer inquiries about fresh CVEs and are ready to solve any challenges customers may face. We always invest a lot of time and effort, so we created a well-defined playbook and a common language so that we can publish well-documented CVEs. Vulnerabilities and their CVEs will never run out; this is the nature of software and hardware.

Time for a ristretto and the Zero Day (0day), aka the OMG scenario. Every now and then, a new high-profile 0day is published. This is the start of a race to mitigation, and our playbooks are ready for those situations. We start by collecting all the possible information available and evaluating the situation. If F5 products are affected by the 0day, a software fix will be issued ASAP and we will release a customer notification describing the actions that need to be taken. If we are not affected, then we want to find a mitigation to help our customers protect themselves. In both cases we will write a security article on AskF5, as well as internal communication and a briefing on our findings and remediations. These will include all possible mitigations such as WAF signatures, iRules, AFM IPS signatures, LTM configuration and more. The Log4j 0day was a good example of how a solid process works like magic, and we published mitigation list articles and email notifications very fast. In such cases we work with the Security Research Team from the local Tel Aviv office, a very talented group of people that assists and collaborates with us all the time, with full dedication on high-profile CVEs. This is where the power of F5 as a company shows its face. Once we have our mitigation plan for the 0day, F5 SIRT will send a notification email to our customers and publish information on the AskF5 site and on social media (of course). This typically increases customer inquiries about the level of exposure they have to this new 0day, so publishing articles and knowledge is critical to fast mitigation for our customers. And it is afternoon already.

Afternoon: tea, knowledge share and projects

Technology is constantly improving, and new features, products and services are being released to confront upcoming attacks. Therefore, learning and practicing new releases is mandatory. The more we learn, know and get our hands on, the better we can mitigate security challenges when dealing with customers under attack and with vulnerability management. This is also our third pillar: security advisor, which is about learning and building a security mindset by sharing knowledge and experience. We write knowledge base articles on AskF5, we mentor whenever we have good advice, and we answer security inquiries from both internal and external sources. This knowledge and experience translates into projects that we choose to do every quarter. My favorite project that I led (and it is still very alive and relevant today) is the Attack Matrix, which is used as battle cards for customers and F5 personnel.
The basic concept is to map attacks to their corresponding mitigations with F5 products. This is a very effective tool for customers and demonstrates the power of F5's security capabilities. I mostly liked doing the WAF section (remember, my favorite F5 product is BIG-IP Advanced WAF), which IMHO is the best WAF technology in the industry.

Late afternoon: meet the team

You probably already figured out that my time zone is EMEA. Together with AaronJB, we cover the three pillars of the F5 SIRT team for the EMEA region. We discuss new ideas often, and sometimes it feels like we could talk about security for weeks. So thank you, Aaron, for helping me and for being around. No matter how good you are as an individual, you must have a team to really succeed! As the day comes to an end and North America wakes up (8AM Seattle time is 6PM in Tel Aviv), we have sync calls for the core team and other teams. It always feels good to talk to the F5 SIRT core personnel from APCJ and NA whom I work with every day. With our fearless leader who established this security A-Team, it is such a pleasure working in this group.

Day report

This was a day in the life of an F5 SIRT team member, and it is totally subject to immediate changes; an emergency can arrive at any time of the day. There are days where everything becomes a war room, when a worldwide high-profile security incident is invoked. And there are days where I can have a cup of coffee and write an article like this. Security has become a necessity; every aspect of software and computer systems is directly affected by threats. So security mitigation is here to stay and is key to keeping it all going. There is much more to these organized and erratic workdays and I can talk and talk, but the day has ended, so until next time... Keep it up.

Automating ACMEv2 Certificate Management on BIG-IP
While we often associate and confuse Let's Encrypt with ACMEv2, the former is ultimately a consumer of the latter. The "Automated Certificate Management Environment" (ACME) protocol describes a system for automating the renewal of PKI certificates. The ACME protocol can be used with public services like Let's Encrypt, but also with internal certificate management services. In this article we explore the more generic support of ACME (version 2) on the F5 BIG-IP.

Security Best Practices for F5 Products
My colleagues previously wrote this article as security best practice guidance for BIG-IP and BIG-IQ. This is an updated overview of key recommendations and not an exhaustive list of steps for securing an F5 product. I've also included updates to keep it relevant, including newly published product hardening guides. These include K53108777: Hardening your F5 system and K45321906: Harden your BIG-IQ system, along with the two newly published K000156803: Hardening NGINX and NGINX Plus and K000156807: Secure the AOM subsystem.

Regarding BIG-IP, the F5 SIRT team recently collaborated with the F5 iHealth team to create new diagnostic heuristics that align with these hardening best practices. These heuristics are now included in the Security tab of QKViews under a new "Security Best Practices" panel. You can also filter the alerts in the Diagnostics tab to show "Security_best_practices".

Beyond these resources, there is extensive documentation available on MyF5 detailing specific steps for configuring functionality, though many are version-specific due to changes and enhancements across major releases. The most relevant links for configuration can usually be found within the hardening guides listed above.

Additionally, F5 documentation occasionally refers to the "control-plane" and "data-plane." The control-plane includes all methods for managing a device or installation, such as the Web UI (TMUI), iControl REST, iControl SOAP, SSH, and related daemons like big3d and bigd. The data-plane, on the other hand, refers to all constructs that handle user traffic, such as Virtual Servers, NATs, SNATs, and other similar components. Going forward, references in this context will pertain to these constructs.

Step 1: Minimize access to the control-plane

It is crucial to implement sound security practices for any system, especially those in privileged network positions like BIG-IP or edge firewalls. One fundamental principle is keeping the control-plane off the internet whenever possible, with limited exceptions such as big3d communications between BIG-IP DNS and BIG-IP LTM devices that may traverse the internet. Ideally, access to the control-plane should be restricted solely to authorized IT staff. Measures should be taken to control access to control-plane services (such as SSH, HTTP, and SNMP) to ensure traffic only comes from expected hosts, as outlined in K13092: Overview of securing access to the BIG-IP system and 10 Settings to Lock Down Your BIG-IP.

Adding pre-login and post-login banners is another effective security step, as they can help enforce security policies (such as informing users that activities are logged) or notify users of system updates like scheduled maintenance. Guidance for configuring banners can be found in K6068: Configuring a pre-login or post-login message banner for the BIG-IP or Enterprise Manager system and K71515276: Configuring a pre-login or post-login message banner for the BIG-IQ system.

Ideally, control-plane access should be managed via a management DMZ, and additional restrictions on lateral movement within the DMZ can be enforced through micro-segmentation or the use of on-device controls. For BIG-IP, these on-device controls were notably enhanced in version 14.1 and above with a robust management interface firewall. Access to the management DMZ itself should be through a jump box or VPN with 2FA enabled.
Jump boxes provide a dedicated and secure environment for administrative tasks, offering substantial protection against attacks like XSS and CSRF, because administrators will use them solely for device administration rather than general browsing or other activities. In the absence of this infrastructure, using a local virtual machine or dedicated browser for administrative duties is still recommended to mitigate risks from phishing-delivered XSS and CSRF attacks. While changes to network design to accommodate a management DMZ may take time, the on-device management interface firewall can be implemented independently, along with a mandate for more secure administrative environments.

Several articles provide guidance for minimizing access to the control-plane, including K5380: Specify allowable IP ranges for SSH access, K11719: Mitigating risk from SSH brute-force login attacks, K13309: Restricting access to the Configuration utility by source IP address (11.x–17.x), K9908: Configure an automatic logout for idle sessions, and K75211108: Configure automatic logout for idle sessions on the BIG-IQ system. Furthermore, articles like K80425458: Modifying the list of ciphers and MAC and key exchange algorithms used by the SSH service on BIG-IP or BIG-IQ systems, K92748202: Restrict access to the BIG-IQ management interface using network firewall rules, and K31401771: Restricting access to the BIG-IQ or F5 iWorkflow user interface by source IP address provide additional strategies for securing critical management interfaces.

Step 2: BIG-IP Management and Self IPs

To enhance security, ensure that all Self IPs are set to "Lockdown None" to prevent the exposure of control-plane services unless explicitly required. If a service such as big3d (port 4353) needs to be exposed, carefully restrict access to only the specific ports required. For dedicated management VLANs and non-routable HA VLANs, the "Allow Default" setting can be used, though it is recommended to allow only specific ports whenever possible for tighter access control. Relevant guidance can be found in K17333: Overview of port lockdown behavior (12.x–17.x), K39403510: Managing the port lockdown configuration on the BIG-IQ system, and K15612: Connectivity requirements for the BIG-IQ system.

Out-of-band management via a dedicated interface or VLAN is strongly recommended for optimal security. This can be implemented using the hardware platform's dedicated management interface or a dedicated management VLAN on production interfaces when a dedicated management interface is unavailable, such as in single-NIC cloud deployments.

Step 3: Hardening the BIG-IP

To improve security, consider using a Hardware Security Module (HSM) for storing sensitive information such as SSL keys. Options like an onboard FIPS HSM or NetHSM offer a high level of protection, while the built-in SecureVault functionality can provide additional security by making SSL key recovery more difficult for unauthorized users who gain access to the BIG-IP's control plane. For more details about SecureVault, F5 offers a knowledge base article: K73034260: Overview of the BIG-IP system Secure Vault feature. Additionally, reduce your attack surface by provisioning modules only as needed instead of upfront, which can also decrease the frequency of applicable Security Advisories.
For further access restriction, appliance mode is another option designed to limit BIG-IP administrative access, making it behave more like a typical network appliance rather than a multi-user UNIX device (K12815: Overview of Appliance mode).

For authentication, the BIG-IP control-plane should integrate with enterprise-grade AAA solutions such as RADIUS, TACACS+, or LDAP, as these bring administrative accounts under pre-existing enterprise security practices. However, note that root and admin passwords are available as fallback authentication, so these should be configured with strong, secure passwords. Guidance for setting up AAA solutions can be found in articles such as K8811: Configuring TACACS+ authentication for BIG-IP administrative users, K11072: Configuring LDAP remote authentication for Active Directory, K17403: Configuring RADIUS authentication for administrative users, and corresponding BIG-IQ articles like K31586420: Configuring the BIG-IQ system to use TACACS+ based authentication and authorization, K00153876: Enabling LDAP remote authentication for Advanced Shell access to the BIG-IQ system, and K51458353: Configuring the BIG-IQ system to use RADIUS authentication.

If remote authentication is not being used, it is essential to enforce a strong password policy for local accounts on the BIG-IP or BIG-IQ systems. Several articles on MyF5 provide detailed instructions for locking down authentication on F5 devices, including K15497: Configuring a secure password policy for the BIG-IP system, K13121: Changing system maintenance account passwords, K4139: Configuring the BIG-IP system to enforce the use of strict passwords, K32203233: The root and admin accounts are now subject to the enforcement restrictions of the secure password policy, K12173: Overview of BIG-IP administrative access controls, and K49507549: Configuring a secure password policy for the BIG-IQ system.

For systems running BIG-IP 15.0.0 or later, remote APM authentication can be used to manage control-plane access while also implementing two-factor or multi-factor authentication (2FA/MFA) using the APM system. For further details, see https://techdocs.f5.com/en-us/bigip-15-0-0/big-ip-local-traffic-manager-implementations/implementing-apm-system-authentication.html .

Step 4: Monitoring

To maintain comprehensive security and monitoring, it is recommended to configure off-box syslog, ideally directed to a SIEM, to ensure you have a reliable and immutable record of events such as configuration changes, potential indicators of compromise, and system issues. Alerts based on these logs can be set up to monitor critical events in real time. Additionally, consider utilizing SNMP traps and polling to keep track of system performance and load while monitoring for potential attack indicators against the data-plane, such as denial of service (DoS) attacks.

Regularly uploading qkviews to iHealth is another beneficial practice (unless restricted by enterprise security policies), as iHealth's built-in heuristics can identify potential device misconfigurations, vulnerabilities specific to your version, hardware, or configuration, and any indicators of compromise within your system. This process can be automated via BIG-IQ, which also has the capability to automate regular configuration snapshots. For enhanced awareness of system access, refer to resources such as K13426: Monitoring login attempts (11.x–17.x) and K08662997: Monitoring login attempts on the BIG-IQ system.
Step 5: Maintaining

It is highly recommended to run a recent software release, ideally within the last two LTS (Long-Term Support) branches, as F5 continuously enhances functionality to address new attack vectors and ensure rapid adoption of security fixes. While some customers opt for engineering hotfixes to resolve specific issues, it is advised to migrate back to a mainline branch as soon as the necessary fixes are incorporated to minimize time-to-patch for newly discovered defects or vulnerabilities. Useful references include K9957: Creating a custom RSS feed to view new and updated documents, K2200: Most recent versions of F5 software, K9502: BIG-IP hotfix and point release matrix, and K15113: BIG-IQ hotfix and point release matrix.

To stay informed about significant vulnerabilities, customers should subscribe to the F5 Security mailing list to receive alerts for critical vulnerabilities, including Quarterly Security Notifications (QSNs) and out-of-band notifications for high-impact third-party vulnerabilities. For more information about the QSN process and scheduling, consult K67091411: Guidance for Quarterly Security Notifications and K9970: Subscribe to email notifications regarding F5 products and security announcements.

Additionally, reporting software issues, whether security-related or not, ensures the continuous improvement of F5 software. Any issues reported to F5 allow developers to address them promptly, facilitating early fixes. Resources such as K4602: Overview of the F5 security vulnerability response policy and K4918: Overview of the F5 critical issue hotfix policy provide more insights into how F5 handles reported vulnerabilities.

Regular backups of your devices are another critical aspect of maintaining security and stability. Backups ensure you have a reliable, uncompromised configuration to restore in case a device needs reimaging. BIG-IQ can assist in automating this process, but it is crucial to thoroughly test and validate backup scripts to ensure they capture valid data and do not unintentionally delete necessary files during backup rotation.

Step 6: Recovery

Although compromise is relatively uncommon, adhering to the outlined security steps and best practices can significantly reduce the likelihood of it occurring. However, preparation is critical to ensuring a successful recovery should a compromise take place. Since recovery efforts often involve multiple departments within an organization, having a documented recovery plan is essential.

At a minimum, the plan should address key areas such as how to isolate the compromised device. For example, if a device pair is compromised, should a potentially compromised box remain online and serve customers despite serious implications like PCI or GDPR noncompliance? Does your application delivery design allow you to continue serving customers after losing a device pair, or should you activate Disaster Recovery? The plan should also define when and how devices can be reintroduced into service. If company policy requires devices to be held for forensic analysis, ensure you have spare devices available to maintain uninterrupted service. Include steps for reimaging devices from scratch and recovering configurations from backups, as well as revoking and replacing potentially compromised SSL keys. Additionally, consider other secrets that might need to be replaced, such as RADIUS, TACACS, or SNMP credentials.
Although this level of preparation may seem burdensome, having these discussions in advance is far easier than making critical, service-impacting decisions under pressure. Moreover, your recovery plan should not be limited to only your F5 systems but should account for broader infrastructure. For additional guidance, refer to K11438344: Considerations and guidance when you suspect a security compromise on a BIG-IP system.

Step 7: Secure Against Brute Force and Application Attacks

Protecting your F5 system is only part of securing your network; it is equally important to protect the applications and application servers that sit behind it. F5 systems can be configured in numerous ways to provide protection not only for the system itself but also for your applications.

Starting at the lower layers, protections can be implemented using TCP profiles or by adding additional modules like F5's Advanced Firewall Manager (AFM). AFM is a high-performance, stateful, full-proxy network firewall designed to safeguard data centers from incoming threats. It supports widely used protocols such as HTTP/S, SMTP, DNS, SIP, and FTP. For further guidance, consult resources such as K25301105: Mitigate HTTP SLOWRead attacks, K37718515: Investigating BIG-IP AFM attack vector logs and tuning the DoS Vector Attack Type, and K41305885: BIG-IP AFM DoS vectors.

At higher layers, HTTP applications can be protected using a Web Application Firewall (WAF). F5 offers several WAF solutions, including Distributed Cloud, NGINX App Protect WAF, and Advanced WAF/ASM. With the increasing complexity of web applications, adding a WAF has become essential. A WAF provides significant mitigation capabilities and can be configured to protect against emerging attacks, offering robust defenses against threats such as authentication attacks and brute-force attempts. For additional information, refer to K07359270: Succeeding with application security, K15405450: Overview of web scraping detection, K18650749: Configuring brute force attack protection (13.1.0 and later), and K14199: Determining if the BIG-IP ASM system has detected and prevented a Slow HTTP POST DDoS attack. Implementing these layers of protection ensures comprehensive security for both your F5 systems and the applications they support.

Step 8: Prevent Data Leakage

The BIG-IP system offers several HTTP protections even without utilizing a Web Application Firewall (WAF). For example, HTTP cookies can be encrypted to prevent the exposure of sensitive data, ensuring better security for client-server communication. Additionally, the BIG-IP system can be configured to remove sensitive HTTP response headers that might otherwise reveal information about the backend server, thereby reducing the risk of information leakage.
Summary

As noted earlier, this list is not exhaustive and should be considered within the context of your organization's existing guidelines for securing, monitoring, and maintaining systems, as well as any disaster recovery plans in place. While the technical details may evolve over time as F5's product offerings expand, whether with BIG-IP or the NGINX suite, the overarching principles of system security will largely remain constant. To assist with these efforts, there is a wealth of documentation available on MyF5 that outlines specific technical steps, additional resources, and best practices for securing systems. A few key references include K67091411: Guidance for Quarterly Security Notifications, K9970: Subscribing to email notifications regarding F5 products, K27404821: Using F5 iHealth to diagnose vulnerabilities, K11438344: Considerations and guidance when you suspect a security compromise on a BIG-IP system, K53108777: Hardening your F5 system, K45321906: Harden your BIG-IQ system, and K000156803: Hardening NGINX and NGINX Plus.

Deploying WAF in production using Azure Resource Manager template with F5 Nginx App Protect
Introduction:

In production-grade deployments, it is always a challenge to stand up a WAF quickly enough to demo it in your own environment. It usually takes at least a few weeks for an average team to design and implement a production-grade WAF in a cloud environment, because each cloud deployment requires detailed analysis of virtual networking, infrastructure security, virtual machine images, auto-scaling, logging, monitoring, automation, and many other topics. To reduce this time and effort, we concluded that a proper WAF deployment can be templatized and automated, so a team does not need to spend time on deployment and maintenance and can use a WAF from day zero. This article introduces a project that implements an Azure Resource Manager template to deploy a production-grade WAF in the Azure cloud in just a few clicks. The WAF uses the official F5 NGINX App Protect WAF image, which is available in the Azure Marketplace. This eliminates the need to manually prebuild the VM image for your WAF deployment: the image contains all the necessary code and packages on top of the OS of your choice, and it allows you to pay as you go for the NGINX App Protect WAF software instead of purchasing a year-long license.

Why Azure?

Globally, 90% of Fortune 500 companies are using Microsoft Azure to drive their business. Using deeply integrated Azure cloud services, enterprises can rapidly build, deploy, and manage simple to complex applications with ease. Azure supports a wide range of programming languages, frameworks, operating systems, databases, and devices, allowing enterprises to leverage tools and technologies they trust. Here are some of the reasons why customers deploy their applications on Azure:

- Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) capabilities
- Security combined with scalability and elasticity
- Integration with other Microsoft tools
- Cost efficiency and interoperability

Project:

This project implements an ARM (Azure Resource Manager) template that automatically deploys a production-grade WAF using NGINX App Protect WAF to the Azure cloud. ARM allows administrators to deploy, manage, and monitor Azure resources, and to apply access controls to all services in a resource group with role-based access control (RBAC).

Architecture:

The high-level architecture represents an Azure availability system that runs an application load balancer, a Virtual Machine Scale Set (VMSS), and a set of virtual machines running NGINX App Protect WAF software behind it. The load balancer manages TLS certificates, receives traffic, and distributes it across all Azure Virtual Machines (VMs). Each NGINX App Protect WAF VM instance inspects traffic and forwards it to the application backend. The VMSS scales the virtual machines up and down based on the rules configured.

Major components:

- ARM template (a Git repository which contains the source of the data plane and security policy configurations): the pipeline runs the ARM templates, which connect to the Azure portal and deploy the solution. Alternatively, a user can log in to the Azure portal directly and run the template under Template Specs, which deploys the solution directly.
- Auto-scaling (data plane based on the official NGINX App Protect WAF Azure VM images): the solution uses a Virtual Machine Scale Set configured to spin up new NGINX App Protect WAF VM instances based on incoming traffic volumes (see the CLI sketch following this section). This removes operational headaches and optimizes costs, as the Scale Set adjusts the amount of computing resources and charges a user on a pay-as-you-go basis.
- Visibility (dashboards displaying NGINX App Protect WAF health and security data): the template sets up a set of visibility dashboards in the Azure Dashboard service. Data plane VMs send logs and metrics to the Dashboard service, which visualizes the incoming data as a set of charts and graphs showing NGINX App Protect WAF health and security violations.

Example:

These three components form a complete NGINX App Protect WAF solution that is easy to deploy, does not impose any operational headache, and provides handy interfaces for NGINX App Protect WAF configuration and visibility right out of the box.
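As a rough illustration of the autoscaling mechanism referenced above, the Azure CLI commands below create a CPU-based scale-out rule for a VMSS. All resource names are placeholders, and the project defines its own rules (potentially keyed to traffic rather than CPU) inside the ARM template, so treat this only as a sketch of the concept.

# Create an autoscale setting for the scale set (hypothetical names)
az monitor autoscale create --resource-group my-waf-rg --resource nap-waf-vmss --resource-type Microsoft.Compute/virtualMachineScaleSets --name nap-waf-autoscale --min-count 2 --max-count 10 --count 2

# Add one instance whenever average CPU stays above 70% for 5 minutes
az monitor autoscale rule create --resource-group my-waf-rg --autoscale-name nap-waf-autoscale --condition "Percentage CPU > 70 avg 5m" --scale out 1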
Automation:

The following diagram represents the end-to-end automation solution. GitHub is used as the CI/CD platform, and the GitHub pipeline sets up and configures the entire system from the ground up. The first stage creates all necessary Azure resources such as Azure AS (Analysis Service), the VMSS, virtual machines, and the load balancer. The second stage sends test traffic (including malicious requests) and verifies the solution.

Project Repository: f5devcentral/azure-waf-solution-template (github.com)

Steps:

Pre-requisites:

- Azure account and credentials.
- Admin privileges to your Azure resource group.
- Service principal and password (follow this link to create the service principal: https://docs.microsoft.com/en-us/cli/azure/create-an-azure-service-principal-azure-cli); a CLI sketch is also included after the conclusion.
- Resource group created in the Azure portal.

1. Add the variables below under GitHub --> Secrets: AZURE_SP --> Azure service principal, AZURE_PWD --> Azure client password.
2. Add your resource group and other params in the Lib/azure-user-params file. Mandatory params: ResourceGroup, TenantId, SubscriptionId.
3. On GitHub.com, navigate to the main page of the repository and, below the repository name, click the Actions tab.
4. In the left sidebar, select the workflow named "Resource Manager Template Deployment in Azure".
5. Above the list of workflow runs, select Run workflow.

LOG:

Conclusion:

Using a template to deploy a cloud WAF significantly reduces the time spent on WAF deployment and maintenance. It also provides a complete and easy-to-use solution to deploy resources and verify the NGINX App Protect WAF security solution on the Azure platform in any location. Handy interfaces for configuration and visibility turn this project into a boxed solution, allowing a user to easily operate a WAF and focus on application security.
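For readers who want to see the service principal prerequisite and a manual deployment expressed as commands, the Azure CLI sketch below shows one way to do it. All names, scopes, and file paths are placeholders, the parameters file format is assumed rather than taken from the project, and the GitHub Actions workflow described above remains the intended deployment path.

# Create a service principal scoped to the target resource group (placeholder values)
az ad sp create-for-rbac --name nap-waf-deployer --role Contributor --scopes /subscriptions/<subscription-id>/resourceGroups/<resource-group>

# Deploy an ARM template into the same resource group by hand
az deployment group create --resource-group <resource-group> --template-file template.json --parameters @parameters.json

The appId and password returned by the first command presumably correspond to the AZURE_SP and AZURE_PWD secrets mentioned in the steps above.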