NGINX WAF
Separating False Positives from Legitimate Violations
Imagine you make yourself a cup of tea and now want to extract the tannins and caffeine from the homogeneous mixture. How do you do it? Similarly, when building and protecting applications, you brew a blend of legitimate violations and false positives (distinct substances with similar properties) that match the rules you have defined in your web application firewall (WAF). How do you separate the false positives from the legitimate violations?

Before you decaffeinate your tea, let's review some context around false positives and the challenges they pose in application security. False positives occur when legitimate requests are identified as attacks or violations. Due to the complex nature of web applications, false positives are a normal aspect of application security. It is preferable for a WAF to trigger false positives than to allow false negatives (attacks perceived as legitimate traffic). However, reducing the rate of false positives without compromising application security remains a significant challenge for security professionals. A high false positive rate has the following disadvantages:

- Obstructs legitimate traffic
- Increases maintenance
- Burdens compute resources

One way to extract the caffeine from your tea, chemically, is to introduce a solvent that neutralizes the caffeine. Similarly, separating false positives from legitimate violations requires the introduction of a solvent, of sorts, to neutralize the false positives among all the legitimate violations found in our WAF log. The challenge is designing the so-called solvent. Applications are unique and require a deep understanding of their architecture, business flow, and your configured WAF rules in order to define the necessary properties for each solvent. For example, using "square brackets in a parameter name" as a solvent is a common practice in Drupal application development, yet many default WAF rules will flag it as a violation and trigger a false positive. [1] Christian Folini, https://www.christian-folini.ch

You should also consider other volumetric and statistical factors such as violation type, volume, and density when you design your solvent. Generally, three components are taken into consideration:

- Traffic: What are the characteristics (origin, destination, frequency, payload, ...) of the traffic (HTTP request/response) that triggered the violation?
- Violation: What are the characteristics (type of attack, rule triggered, ...) of the violation?
- Application: What are the characteristics of the application that is processing the request?

Once you have subtracted the false positives, you can correct your WAF configuration and monitor for new violations, then repeat the process. The workflow looks like the following:

1. Design a solvent.
2. Sample web traffic and log all violations.
3. Apply all violations to the solvent.
4. Use analytical or manual extraction to subtract false positives.
5. Correct your WAF with rule exclusions for the identified false positives (see the sketch below).
6. Monitor both web traffic and application changes.
7. Repeat steps 1 through 6.

As solvent properties are specific to applications, in the next article we'll provide additional examples of solvent properties for the most popular applications.
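To illustrate step 5 with a concrete, hypothetical example: in NGINX App Protect, a rule exclusion for a false positive like the Drupal case above can be expressed as a signature override in the declarative JSON policy. The signature ID below is a placeholder; in practice you would take it from the violation details in your WAF log:

{
    "policy": {
        "name": "drupal_app_policy",
        "template": { "name": "POLICY_TEMPLATE_NGINX_BASE" },
        "signatures": [
            {
                "signatureId": 200000001,
                "enabled": false
            }
        ]
    }
}

Other WAFs expose the same idea under different names (for example, rule exclusions in BIG-IP ASM), but the principle is identical: neutralize only the rule that misfires, not the whole policy.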
Implementing BIG-IP WAF logging and visibility with ELK

Scope

This technical article is useful for BIG-IP users familiar with web application security and with the implementation and use of the Elastic Stack. This includes application security professionals, infrastructure management operators, and SecDevOps/DevSecOps practitioners. The focus is on WAF logs exclusively. Firewall, Bot, or DoS mitigation logging into the Elastic Stack is the subject of a future article.

Introduction

This article focuses on the required configuration for sending Web Application Firewall (WAF) logs from the BIG-IP Advanced WAF (or BIG-IP ASM) module to an Elastic Stack (a.k.a. Elasticsearch-Logstash-Kibana, or ELK).

First, this article goes over the configuration of the BIG-IP. It is configured with a security policy and a logging profile attached to the virtual server that is being protected. This can be configured via the BIG-IP user interface (TMUI) or through the BIG-IP declarative interface (AS3).

The configuration of the Elastic Stack is discussed next, including the configuration of filters adapted to processing BIG-IP WAF logs.

Finally, the article provides some initial guidance on the metrics that can be taken into consideration for visibility. It discusses the use of dashboards and provides some recommendations with regards to potentially useful visualizations.

Pre-requisites and Initial Premise

For the purposes of this article and to follow the steps outlined below, the user will need to have at least one BIG-IP Adv. WAF running TMOS version 15.1 or above (note that this may work with previous versions but has not been tested). The target BIG-IP is already configured with:

- A virtual server
- A WAF policy

An operational Elastic Stack is also required. The administrator will need to have configuration and administrative privileges on both the BIG-IP and the Elastic Stack infrastructure. They will also need to be familiar with the network topology linking the BIG-IP with the Elastic Search cluster/infrastructure. It is assumed that you want to use your Elastic Search (ELK) logging infrastructure to gain visibility into BIG-IP WAF events.

Logging Profile Configuration

An essential part of getting WAF logs to the proper destination(s) is the Logging Profile. The following will go over the configuration of the Logging Profile that sends data to the Elastic Stack.
Overview of the steps:

1. Create the Logging Profile
2. Associate the Logging Profile with the Virtual Server

After following the procedure below, log lines sent from the BIG-IP on the wire are comma-separated key/value pairs that look something like the sample below:

Aug 25 03:07:19 localhost.localdomain ASM:unit_hostname="bigip1",management_ip_address="192.168.41.200",management_ip_address_2="N/A",http_class_name="/Common/log_to_elk_policy",web_application_name="/Common/log_to_elk_policy",policy_name="/Common/log_to_elk_policy",policy_apply_date="2020-08-10 06:50:39",violations="HTTP protocol compliance failed",support_id="5666478231990524056",request_status="blocked",response_code="0",ip_client="10.43.0.86",route_domain="0",method="GET",protocol="HTTP",query_string="name='",x_forwarded_for_header_value="N/A",sig_ids="N/A",sig_names="N/A",date_time="2020-08-25 03:07:19",severity="Error",attack_type="Non-browser Client,HTTP Parser Attack",geo_location="N/A",ip_address_intelligence="N/A",username="N/A",session_id="0",src_port="39348",dest_port="80",dest_ip="10.43.0.201",sub_violations="HTTP protocol compliance failed:Bad HTTP version",virus_name="N/A",violation_rating="5",websocket_direction="N/A",websocket_message_type="N/A",device_id="N/A",staged_sig_ids="",staged_sig_names="",threat_campaign_names="N/A",staged_threat_campaign_names="N/A",blocking_exception_reason="N/A",captcha_result="not_received",microservice="N/A",tap_event_id="N/A",tap_vid="N/A",vs_name="/Common/adv_waf_vs",sig_cves="N/A",staged_sig_cves="N/A",uri="/random",fragment="",request="GET /random?name=' or 1 = 1' HTTP/1.1\r\n",response="Response logging disabled"

Please choose one of the methods below. The configuration can be done through the web-based user interface (TMUI), the command line interface (TMSH), or directly with a declarative AS3 REST API call. Configuration through the BIG-IP native REST API is also possible but is not discussed herein.

TMUI Steps:

Create Profile

1. Connect to the BIG-IP web UI and log in with administrative rights
2. Navigate to Security >> Event Logs >> Logging Profiles
3. Select "Create"
4. Fill out the configuration fields as follows:
   - Profile Name (mandatory)
   - Enable Application Security
   - Set Storage Destination to Remote Storage
   - Set Logging Format to Key-Value Pairs (Splunk)
5. In the Server Addresses field, enter an IP Address and Port, then click on Add as shown below:
6. Click on Create

Add Logging Profile to virtual server with the policy

1. Select the target virtual server and click on the Security tab (Local Traffic >> Virtual Servers : Virtual Server List >> [target virtual server])
2. Highlight the Log Profile from the Available column and put it in the Selected column as shown in the example below (the log profile is "log_all_to_elk"):
3. Click on Update

At this time the BIG-IP will forward logs to the Elastic Stack.

TMSH Steps:

Create profile

1. ssh into the BIG-IP command line interface (CLI)
2. From the tmsh prompt enter the following:

create security log profile [name_of_profile] application add { [name_of_profile] { logger-type remote remote-storage splunk servers add { [IP_address_for_ELK]:[TCP_Port_for_ELK] { } } } }

For example:

create security log profile dc_show_creation_elk application add { dc_show_creation_elk { logger-type remote remote-storage splunk servers add { 10.45.0.79:5244 { } } } }
3. Ensure that the changes are saved:

save sys config partitions all

Add Logging Profile to virtual server with the policy

1. From the tmsh prompt (assuming you are still logged in) enter the following:

modify ltm virtual [VS_name] security-log-profiles add { [name_of_profile] }

For example:

modify ltm virtual adv_waf_vs security-log-profiles add { dc_show_creation_elk }

2. Ensure that the changes are saved:

save sys config partitions all

At this time the BIG-IP sends logs to the Elastic Stack.

AS3

Application Services 3 (AS3) is a BIG-IP configuration API endpoint that allows the user to create an application from the ground up. For more information on F5's AS3, refer to link.

In order to attach a security policy to a virtual server, the AS3 declaration can either refer to a policy present on the BIG-IP or refer to a policy stored in XML format and available via HTTP to the BIG-IP (ref. link). The logging profile can be created and associated with the virtual server directly as part of the AS3 declaration. For more information on the creation of a WAF logging profile, refer to the documentation found here.

The following is an example of a part of an AS3 declaration that will create a security log profile that can be used to log to the Elastic Stack:

"secLogRemote": {
    "class": "Security_Log_Profile",
    "application": {
        "localStorage": false,
        "maxEntryLength": "10k",
        "protocol": "tcp",
        "remoteStorage": "splunk",
        "reportAnomaliesEnabled": true,
        "servers": [
            {
                "address": "10.45.0.79",
                "port": "5244"
            }
        ]
    }
}

In the sample above, the ELK stack IP address is 10.45.0.79 and it listens on port 5244 for BIG-IP WAF logs. Note that the log format used in this instance is "Splunk". There are no declared filters and thus only the illegal requests will get logged to the Elastic Stack. A sample AS3 declaration can be found here.

ELK Configuration

The Elastic Stack configuration consists of creating a new input on Logstash. This is achieved by adding an input/filter/output configuration to the Logstash configuration file. Optionally, the Logstash administrator might want to create a separate pipeline – for more information, refer to this link.
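As a sketch of the separate-pipeline option (the file paths and pipeline IDs below are assumptions), a dedicated entry in Logstash's pipelines.yml could look like this:

# /etc/logstash/pipelines.yml
- pipeline.id: main
  path.config: "/etc/logstash/conf.d/*.conf"
- pipeline.id: bigip-waf
  # keeps WAF log processing isolated from other inputs
  path.config: "/etc/logstash/conf.d/bigip-waf.conf"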
The following is a Logstash configuration known to work with WAF logs coming from BIG-IP:

input {
  syslog {
    port => 5244
  }
}
filter {
  grok {
    match => {
      "message" => [
        "attack_type=\"%{DATA:attack_type}\"",
        ",blocking_exception_reason=\"%{DATA:blocking_exception_reason}\"",
        ",date_time=\"%{DATA:date_time}\"",
        ",dest_port=\"%{DATA:dest_port}\"",
        ",ip_client=\"%{DATA:ip_client}\"",
        ",is_truncated=\"%{DATA:is_truncated}\"",
        ",method=\"%{DATA:method}\"",
        ",policy_name=\"%{DATA:policy_name}\"",
        ",protocol=\"%{DATA:protocol}\"",
        ",request_status=\"%{DATA:request_status}\"",
        ",response_code=\"%{DATA:response_code}\"",
        ",severity=\"%{DATA:severity}\"",
        ",sig_cves=\"%{DATA:sig_cves}\"",
        ",sig_ids=\"%{DATA:sig_ids}\"",
        ",sig_names=\"%{DATA:sig_names}\"",
        ",sig_set_names=\"%{DATA:sig_set_names}\"",
        ",src_port=\"%{DATA:src_port}\"",
        ",sub_violations=\"%{DATA:sub_violations}\"",
        ",support_id=\"%{DATA:support_id}\"",
        "unit_hostname=\"%{DATA:unit_hostname}\"",
        ",uri=\"%{DATA:uri}\"",
        ",violation_rating=\"%{DATA:violation_rating}\"",
        ",vs_name=\"%{DATA:vs_name}\"",
        ",x_forwarded_for_header_value=\"%{DATA:x_forwarded_for_header_value}\"",
        ",outcome=\"%{DATA:outcome}\"",
        ",outcome_reason=\"%{DATA:outcome_reason}\"",
        ",violations=\"%{DATA:violations}\"",
        ",violation_details=\"%{DATA:violation_details}\"",
        ",request=\"%{DATA:request}\""
      ]
    }
    break_on_match => false
  }
  mutate {
    split => { "attack_type" => "," }
    split => { "sig_ids" => "," }
    split => { "sig_names" => "," }
    split => { "sig_cves" => "," }
    split => { "staged_sig_ids" => "," }
    split => { "staged_sig_names" => "," }
    split => { "staged_sig_cves" => "," }
    split => { "sig_set_names" => "," }
    split => { "threat_campaign_names" => "," }
    split => { "staged_threat_campaign_names" => "," }
    split => { "violations" => "," }
    split => { "sub_violations" => "," }
  }
  if [x_forwarded_for_header_value] != "N/A" {
    mutate { add_field => { "source_host" => "%{x_forwarded_for_header_value}" } }
  } else {
    mutate { add_field => { "source_host" => "%{ip_client}" } }
  }
  geoip {
    source => "source_host"
  }
}
output {
  elasticsearch {
    hosts => ['localhost:9200']
    index => "big_ip-waf-logs-%{+YYY.MM.dd}"
  }
}

After adding the configuration above to the Logstash parameters, you will need to restart the Logstash instance to take the new configuration into account. The sample above is also available here.

The Elastic Stack is now ready to process the incoming logs. You can start sending traffic to your policy and start seeing logs populating the Elastic Stack. If you are looking for a test tool to generate traffic to your virtual server, F5 provides a simple WAF tester tool that can be found here.

At this point, you can start creating dashboards on the Elastic Stack that will satisfy your operational needs with the following overall steps:

- Ensure that the log index is being created (Stack Management >> Index Management)
- Create a Kibana Index Pattern (Stack Management >> Index patterns)
- You can now peruse the logs from the Kibana discover menu (Discover)
- And start creating visualizations that will be included in your Dashboards (Dashboards >> Editing Simple WAF Dashboard)

A complete Elastic Stack configuration can be found here – note that this can be used with both BIG-IP WAF and NGINX App Protect.
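For the first checkpoint above, the index can also be verified from the command line against the Elasticsearch API (the host and port assume a default local deployment):

# list WAF log indices and confirm documents are arriving
curl -s 'http://localhost:9200/_cat/indices/big_ip-waf-logs-*?v'

# fetch one parsed document to sanity-check the grok fields
curl -s 'http://localhost:9200/big_ip-waf-logs-*/_search?size=1&pretty'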
Conclusion

You can now leverage the widely available Elastic Stack to log and visualize BIG-IP WAF logs. From a dashboard perspective, it may be useful to track the following metrics:

- Request rate
- Response codes
- The distribution of requests in terms of clean, blocked, or alerted status
- The top talkers making requests
- The top URLs being accessed
- Top violator source IPs
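As a sketch of how one of these metrics maps onto the parsed fields, the top talkers can be pulled straight from Elasticsearch with a terms aggregation (the ".keyword" sub-field assumes Elasticsearch's default dynamic mapping):

curl -s 'http://localhost:9200/big_ip-waf-logs-*/_search?pretty' \
  -H 'Content-Type: application/json' -d '{
  "size": 0,
  "aggs": {
    "top_talkers": {
      "terms": { "field": "ip_client.keyword", "size": 10 }
    }
  }
}'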
Dashboards for NGINX App Protect

Introduction

NGINX App Protect is a new-generation WAF from F5 which is built in accordance with the UNIX philosophy: it does one thing well, and everything else comes from integrations. NGINX App Protect is extremely good at HTTP traffic security. It inherited a powerful WAF engine from BIG-IP, and a light footprint and high performance from NGINX. Therefore NGINX App Protect brings fine-grained security to all kinds of insertion points where NGINX is used, either on-premises or cloud-based. NGINX App Protect is thus a powerful and flexible security tool, but it lacks visualization capabilities, which are essential for a good security product.

As mentioned above, everything besides the primary functionality comes from integrations. In order to introduce visualization capabilities, I've developed an integration between NGINX App Protect and the ELK stack (Elasticsearch, Logstash, Kibana), one of the most widely adopted stacks for log collection and visualization. Based on logs from NGINX App Protect, ELK generates dashboards to clearly visualize what the WAF is doing.

Overview Dashboard

Currently, there are two dashboards available. The "Overview" dashboard provides a high-level view of the current situation and also allows you to discover historical trends. You are free to select any time period of interest and filter data simply by clicking on values right on the dashboard. The table at the bottom of the dashboard lists all requests within a time frame and allows you to see how exactly a request looked.

False Positives Dashboard

Another useful dashboard called "False Positives" helps to identify false positives and adjust the WAF policy based on the findings. For example, the chart below shows the number of unique IPs that hit a signature. Under normal conditions, when traffic is mostly legitimate, "per signature" graphs should fluctuate around zero because legitimate users are not supposed to hit any signatures. Therefore if there is a spike and the number of unique IPs which hit a signature is close to the total number of sources, then there is likely a false positive and the policy needs to be adjusted.

Conclusion

This is an open-source and community-driven project. The more people contribute, the better it becomes. Feel free to use it for your projects and contribute code or ideas directly to the repo. The plan is to make these dashboards suitable for all kinds of F5 WAF flavors including AWAF and EAP. This should be simple because it only requires a logstash pipeline adjustment to unify the log format stored in the elasticsearch index. If you have a project for AWAF or EAP going on and would like to use the dashboards, please feel free to develop and create a pull request with an adjusted logstash pipeline to normalize logs from other WAFs.

Github repo: https://github.com/464d41/f5-waf-elk-dashboards

Feel free to reach me with questions. Good luck!
Cloud Template for App Protect WAF

Introduction

Everybody needs a WAF. However, when it gets to the deployment stage, a team usually realizes that a production-grade deployment is going to be far more complex than a demo environment. In the case of a cloud deployment, VPC networking, infrastructure security, VM images, auto-scaling, logging, visibility, automation, and many more topics require detailed analysis. Usually, it takes at least a few weeks for an average team to design and implement a production-grade WAF in a cloud. That is one side of the problem. Additionally, cloud deployment best practices are the same for everyone, therefore most well-made WAF deployments follow a similar path and end up looking similar.

The statements above bring us to an obvious conclusion: proper WAF deployment can be templatized, so a team doesn't spend time on deployment and maintenance but starts to use a WAF from day zero. The following paragraphs introduce a project that implements a CloudFormation template to deploy a production-grade WAF in the AWS cloud in just a few clicks.

Project (GitHub)

At a high level, the project implements a CloudFormation template that automatically deploys a production-grade WAF to the AWS cloud. The template aims to follow cloud deployment best practices to set up a complete solution that is fully automated, requires minimum to no infrastructure management, and therefore allows a team to focus on application security. The following picture represents the overall solution structure.

The solution includes a definition of three main components:

1. An auto-scaling data plane based on official NGINX App Protect AWS AMI images.
2. A git repository as the source of data plane and security configuration.
3. Visibility dashboards displaying WAF health and security data.

Therefore it becomes a complete and easy-to-use solution to protect applications whether they run in AWS or in any other location.

Data Plane: The data plane auto-scales based on the amount of incoming traffic and uses official NGINX App Protect AWS AMIs to spin up new VM instances. That removes the operational headache and optimizes costs, since the WAF dynamically adjusts the amount of compute resources and charges a user on an as-you-go basis.

Configuration Plane: Solution configuration follows GitOps principles. The template creates an AWS CodeCommit git repository as the source of forwarding and security configuration. An AWS CodeDeploy pipeline automatically delivers configuration across all data plane VMs.

Visibility: Alongside the data plane and configuration repository, the template sets up a set of visibility dashboards in AWS CloudWatch. Data plane VMs send logs and metrics to the CloudWatch service, which visualizes the incoming data as a set of charts and tables showing WAF health and security violations.

Therefore these three components form a complete WAF solution that is easy to deploy, doesn't impose any operational headache, and provides handy interfaces for WAF configuration and visibility right out of the box.

Demo

As mentioned above, one of the main advantages of this project is the ease of WAF deployment. It only requires downloading the AWS CloudFormation template file from the project repository and deploying it either via the AWS Console or the AWS CLI. The template requests a number of parameters; however, all of them are optional. As soon as stack creation is complete, the WAF is ready to use. The template outputs contain the WAF URL and a pointer to the configuration repository. By default the WAF responds with a static page.
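For the CLI path, a deployment might look like the following sketch (the stack name and template file name are assumptions; check the project repository for the actual template):

# create the WAF stack from the downloaded template
aws cloudformation create-stack \
  --stack-name nap-waf \
  --template-body file://template.yaml \
  --capabilities CAPABILITY_IAM

# retrieve the WAF URL and configuration repo pointer once complete
aws cloudformation describe-stacks --stack-name nap-waf \
  --query 'Stacks[0].Outputs'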
As a next step, I'll put this cloud WAF instance in front of a web application. Similar to any other NGINX instance, I'll configure it to forward traffic to the app and inspect all requests with App Protect WAF. As mentioned before, all config lives in a git repo that resides in the AWS CodeCommit service. I'm adjusting the NGINX configuration to forward traffic to the protected application. Once committed to the repo, a pipeline delivers the change to all data plane VMs. Therefore all traffic redirects to the protected application (the screenshot below is not of a real company, and is used for demo purposes only).

Similar to the NGINX configuration, the App Protect policy resides in the same repository. Likewise, all changes reflect on the running VMs. Once the configuration is complete, a user can observe system health and security-related data via the pre-configured AWS CloudWatch dashboards.

Outline

As you can see, the use of a template to deploy a cloud WAF allows you to significantly reduce the time spent on WAF deployment and maintenance. Handy interfaces for configuration and visibility turn this project into a boxed solution, allowing a user to easily operate a WAF and focus on application security. Please comment if you find it useful to have this kind of solution in the major public cloud marketplaces. It is a community project so far, and we need as much feedback as possible to steer it properly. Feel free to give it a try and leave feedback here or at the project's git repository.

P.S.: Take a look at another community project that contributes to the F5 WAF ecosystem: WAF Policy Editor
Mitigate OWASP LLM Security Risk: Sensitive Information Disclosure Using F5 NGINX App Protect

This short WAF security article covered the critical security gaps present in current generative AI applications, emphasizing the urgent need for robust protection measures in LLM deployments. Finally, we demonstrated how F5 NGINX App Protect v5 offers an effective solution to mitigate the OWASP LLM Top 10 risks.
Protecting gRPC based APIs with NGINX App Protect

gRPC support on NGINX

Developed back in 2015, gRPC keeps attracting more and more adopters due to the use of HTTP/2.0 as an efficient transport, tight integration with an interface description language (IDL), bidirectional streaming, flow control, bandwidth-effective binary payloads, and many other benefits. About two years ago NGINX started to support gRPC (link) as a gateway. However, the market quickly realized that (like any other gateway) it is subject to cyber-attacks and requires strong defense. As a response to such challenges, App Protect WAF for NGINX just released a compelling set of security features to defend gRPC-based services.

gRPC Security

It is a fact that App Protect for NGINX provides much more advanced security and performance than any ModSecurity-based WAF (most of the WAF market). Hence, even before explicit gRPC support, the App Protect armory in conjunction with NGINX itself could protect web services from a wide variety of threats like:

- Injection attacks
- Sensitive data leakage
- OS command execution
- Buffer overflow
- Threat campaigns
- Authentication attacks
- Denial-of-service
- and more (link)

With gRPC support, App Protect provides an even deeper level of security. The newly added gRPC content profile allows it to parse binary payloads, make sure there is no malicious data, and ensure the structure conforms to the interface definition (protocol buffers) (link).

gRPC Profile

Similar to JSON and XML profiles, a gRPC profile attaches to a subset of URLs and serves to define and enforce a payload structure. The gRPC profile extracts application URLs and request/response structures from an Interface Definition Language (IDL) file. The IDL file is a mandatory part of every gRPC-based application. The following policy listing shows an example of referencing the IDL file from a gRPC profile.

{
    "policy": {
        "name": "online-boutique-policy",
        "grpc-profiles": [
            {
                "name": "hipstershop-grpc-profile",
                "defenseAttributes": {
                    "maximumDataLength": 100,
                    "allowUnknownFields": true
                },
                "idlFiles": [
                    {
                        "idlFile": {
                            "$ref": "file:///hipstershop/demo.proto"
                        },
                        "isPrimary": true
                    }
                ]
                ...omitted...
            }
        ],
        ...omitted...
    }
}

The gRPC profile references the IDL file to extract all the data required to instantiate a positive security model. This means that all URLs and payload formats from it will be considered valid and will pass. To catch anomalies in gRPC traffic, App Protect introduces three kinds of violations. Requests that don't match the IDL trigger "VIOL_GRPC_MALFORMED" or "VIOL_GRPC_METHOD". Requests with unknown or longer-than-allowed fields cause the "VIOL_GRPC_FORMAT" violation.

In addition to the above checks, App Protect looks for signatures or disallowed meta-characters in gRPC data. Because of this, it protects applications from the wide variety of attacks that worked for plain HTTP traffic. The following listing gives an example of signature and meta-character enforcement in a gRPC profile (docs).

"grpc-profiles": [
    {
        "name": "online-boutique-profile",
        "attackSignaturesCheck": true,
        "signatureOverrides": [
            {
                "signatureId": 200001213,
                "enabled": false
            },
            {
                "signatureId": 200089779,
                "enabled": false
            }
        ],
        "metacharCheck": true,
        ...omitted...
    }
],

Example Sandbox

Here is an example of how to configure NGINX as a gRPC gateway and defend it with App Protect. As a demo application, I use "Online Boutique" (link). This application consists of multiple micro-services that talk to each other in gRPC. The picture below represents the structure of the entire application.

Picture 1.
I slightly modified the application deployment such that the frontend doesn't talk to the micro-services directly but through an NGINX gateway that proxies all calls.

Picture 2.

Before I jump to the App Protect configuration, here is how the NGINX config looks for proxying gRPC services.

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
    include /etc/nginx/conf.d/upstreams.conf;

    server {
        server_name boutique.online;
        listen 443 http2 ssl;
        ssl_certificate /etc/nginx/ssl/nginx.crt;
        ssl_certificate_key /etc/nginx/ssl/nginx.key;
        ssl_protocols TLSv1.2 TLSv1.3;
        include conf.d/errors.grpc_conf;
        default_type application/grpc;
        app_protect_enable on;
        app_protect_policy_file "/etc/app_protect/conf/policies/online-boutique-policy.json";
        app_protect_security_log_enable on;
        app_protect_security_log "/opt/app_protect/share/defaults/log_grpc_all.json" stderr;
        include conf.d/locations.conf;
    }
}

The listing above is self-explanatory. The NGINX virtual server "boutique.online" listens on port 443 for HTTP/2 requests. Requests are routed to one of the micro-services from "upstreams.conf" based on the map defined in "locations.conf". Below are examples of these config files (the two listings were interleaved in the source and are shown here with location blocks in locations.conf and upstream blocks in upstreams.conf):

$ cat conf.d/locations.conf
location /hipstershop.AdService/ {
    grpc_pass grpc://adservice;
}
location /hipstershop.CartService/ {
    grpc_pass grpc://cartservice;
}
...omitted...

$ cat conf.d/upstreams.conf
upstream adservice {
    server 10.101.11.63:9555;
}
upstream cartservice {
    server 10.99.75.80:7070;
}
...omitted...

For instance, any call with a URL starting with "/hipstershop.AdService" is routed to the "adservice" upstream, and so on. For more details on NGINX configuration for gRPC, refer to this blog article or the official documentation. App Protect is enabled on the virtual server; hence, all requests are subject to inspection. Let's take a closer look at the security policy applied to the virtual server above.

{
    "policy": {
        "name": "online-boutique-policy",
        "grpc-profiles": [
            {
                "name": "online-boutique-profile",
                "idlFiles": [
                    {
                        "idlFile": {
                            "$ref": "https://raw.githubusercontent.com/GoogleCloudPlatform/microservices-demo/master/pb/demo.proto"
                        },
                        "isPrimary": true
                    }
                ],
                "associateUrls": true,
                "defenseAttributes": {
                    "maximumDataLength": 10000,
                    "allowUnknownFields": false
                },
                "attackSignaturesCheck": true,
                "signatureOverrides": [
                    {
                        "signatureId": 200001213,
                        "enabled": true
                    },
                    {
                        "signatureId": 200089779,
                        "enabled": true
                    }
                ],
                "metacharCheck": true
            }
        ],
        "urls": [
            {
                "name": "*",
                "type": "wildcard",
                "method": "*",
                "$action": "delete"
            }
        ]
    }
}

The policy has one gRPC profile called "online-boutique-profile". The profile references the IDL file for the demo application (similar to an Open API file reference) as the source of the application structure. The "associateUrls: true" directive instructs App Protect to extract all possible URLs from the IDL file and enforce the parent profile on them. Notice that the URL section removes the wildcard URL "*" from the policy to only allow URLs that are in the IDL, and therefore establishes a positive security model. The "defenseAttributes" directive enforces payload length and tolerance to unknown parameters. The "attackSignaturesCheck" and "metacharCheck" directives look for malicious patterns in the entire request. Now let's see what this policy blocks and passes.

Experiments

First of all, let's make sure that valid traffic passes. As an example, I construct a valid call to the "Ads" micro-service based on the IDL content below.
syntax = "proto3"; package hipstershop; service AdService { rpc GetAds(AdRequest) returns (AdResponse) {} } message AdRequest { repeated string context_keys = 1; } Based on the definition above a valid call to the service should go to "/hipstershop.AdService/GetAds" URL and contain "context_keys" identifiers in a payload. I use the "grpcurl" tool to construct and send the call that passes. $ grpcurl -proto ../microservices-demo/pb/demo.proto -d '{"context_keys": "example"}' boutique.online:8443 hipstershop.AdService/GetAds { "ads": [ { "redirectUrl": "/product/2ZYFJ3GM2N", "text": "Film camera for sale. 50% off." }, { "redirectUrl": "/product/0PUK6V6EV0", "text": "Vintage record player for sale. 30% off." } ] } As you may note, the URL constructs out of package name, service name, and method name from IDL. Therefore it is expected that all calls which don't comply IDL definition will be blocked. A call to invalid service. $ curl -X POST -k --http2 -H "Content-Type: application/grpc" -H "TE: trailers" https://boutique.online:8443/hipstershop.DoesNotExist/GetAds <html><head><title>Request Rejected</title></head><body>The requested URL was rejected. Please consult with your administrator.<br><br>Your support ID is: 16472380185462165521<br><br><a href='javascript:history.back();'>[Go Back]</a></body></html> A call to an unknown service. Notice that the previous call got "html" response page when this one got a special "grpc" response. This happened because only valid URLs considered as type gRPC others are "html" by default. A call to an invalid method. $ curl -v -X POST -k --http2 -H "Content-Type: application/grpc" -H "TE: trailers" https://boutique.online:8443/hipstershop.AdService/DoesNotExist < HTTP/2 200 < content-type: application/grpc; charset=utf-8 < cache-control: no-cache < grpc-message: Operation does not comply with the service requirements. Please contact you administrator with the following number: 16472380185462166031 < grpc-status: 3 < pragma: no-cache < content-length: 0 A call with junk in the payload. curl -v -X POST -k --http2 -H "Content-Type: application/grpc" -H "TE: trailers" https://boutique.online:8443/hipstershop.AdService/GetAds -d@trash_payload.bin < HTTP/2 200 < content-type: application/grpc; charset=utf-8 < cache-control: no-cache < grpc-message: Operation does not comply with the service requirements. Please contact you administrator with the following number: 13966876727165538516 < grpc-status: 3 < pragma: no-cache < content-length: 0 All gRPC wise invalid calls are blocked. In the same way attack signatures are caught in gRPC payload ("alert() (Parameter)" signature). $ grpcurl -proto ../microservices-demo/pb/demo.proto -d '{"context_keys": "example"}' boutique.online:8443 hipstershop.AdService/GetAds { "ads": [ { "redirectUrl": "/product/2ZYFJ3GM2N", "text": "Film camera for sale. 50% off." }, { "redirectUrl": "/product/0PUK6V6EV0", "text": "Vintage record player for sale. 30% off." } ] } Conclusion With gRPC support App Protect provides even deeper controls for gRPC traffic along with all existing security inventory available for plain HTTP traffic. Keep in mind that this release only supports Unary gRPC traffic and doesn't support the server reflection feature. Refer to the official documentation for detailed information on gRPC support (link).2KViews2likes0CommentsServerless NGINX App Protect
Serverless NGINX App Protect

Introduction

Advanced teams which develop serverless applications usually prefer to have the entire infrastructure running serverless, including security tools like a WAF. F5 can definitely accommodate such a need with Essential App Protect, which runs as SaaS. It doesn't require any infrastructure management and is as easy to set up as 1-2-3, however it is not as flexible as a managed WAF can be. Serverful NGINX App Protect gives full flexibility in data plane and WAF policy configuration, but requires a team, expertise, and time to manage the underlying servers. What if we combine both of these approaches by running full-featured NGINX App Protect on top of a public cloud's serverless compute engine (e.g., AWS Fargate, GCP Cloud Run, ...), which requires zero infrastructure maintenance to run workloads?

Note: If you are not familiar with serverless compute platforms, take a look at the official docs: AWS, GCP, Azure. Long story short, they are managed compute engines which take a container workload declaration (what?) as input and then automatically make sure that the workload runs exactly as specified in the declaration (how?). Compute resources, networking, security, everything... Literally zero infrastructure maintenance.

Solution Requirements

The idea sounds pretty attractive, so let's define requirements and try to make it happen:

- Flexibility. A fully featured WAF runs serverless (zero infrastructure maintenance, low cost due to pay-per-use CPU/memory billing)
- As easy to set up as a SaaS WAF
- Deployment is fully automated and repeatable
- All configuration is stored in git (version control, easy rollback, developer friendly)

Implementation

Clear requirements are half of success. Implementation is the other half. The last two solution requirements lead to the use of a CI/CD platform (like GitLab). Such a platform allows you to automate anything using CI/CD pipelines. Fortunately, there is no need to develop a pipeline from scratch. There is a template repo which can be reused to deploy your own serverless WAF. The plan is as simple as:

1. Clone the template repo to a new one
2. Supply cloud service account credentials and NGINX App Protect license keys to the new git repo
3. Let the pipeline do the rest for you
4. Enjoy your serverless WAF

The template repo contains all configuration files and the CI/CD pipeline definition to automate all deployment and day-to-day operation steps. The pipeline is designed to use GitLab as the CI/CD platform and AWS as the public cloud, and consists of three stages:

1. Terraform: Creates AWS resources (VPC, subnets, security groups, cluster, etc...)
2. Build: Builds the NGINX App Protect docker image and pushes it to the container registry (AWS ECR)
3. Deploy: Generates the NGINX App Protect configuration and launches it in AWS Elastic Container Service (AWS ECS, a.k.a. AWS Fargate)

Terraform Stage

The first stage uses terraform to create a "Security VPC" and its components as shown below:

- The ECS cluster runs NGINX App Protect containers to process traffic.
- ECR stores and supplies container images to ECS.
- The ALB accepts traffic from clients and distributes it between WAF containers.

Build Stage

The build stage builds an NGINX App Protect container image with a default configuration and pushes it to ECR. In order to get access to the NGINX App Protect repositories, a certificate and key should be supplied to the GitLab repo as base64-encoded environment variables.
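A minimal sketch of preparing those variables (the variable names are assumptions; match them to whatever the pipeline expects, and note that -w0 is the GNU coreutils flag for disabling line wrapping):

base64 -w0 nginx-repo.crt   # store as e.g. NGINX_REPO_CRT in GitLab CI/CD variables
base64 -w0 nginx-repo.key   # store as e.g. NGINX_REPO_KEY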
Deploy Stage

This stage generates the configuration for NGINX App Protect, deploys the containers to ECS, and publishes them through the ALB. Therefore the pipeline completely automates WAF deployment. All day-to-day operations are performed via code commits. Want to change the NGINX configuration? Commit changes to nginx.conf. Want to change the WAF policy? Commit changes to the WAF policy file. Want to roll back the configuration? Refer to a previous task definition.

Example

After the initial commit, the CI/CD pipeline creates an AWS VPC and an ECS cluster with a few NGINX App Protect instances running in it. Instances with the default configuration return a static hello page. This means deployment went successfully and the WAF is ready for use. The next step is to publish a web resource through the WAF. Remember, serverless NGINX App Protect has exactly the same NGINX under the hood, just running on top of a serverless platform. Therefore publishing a web resource happens exactly the same way as with serverful NGINX: all traffic which comes to the WAF simply redirects to a web server (see the sketch after this section).

After committing changes to nginx.conf, the pipeline automatically updates the running NGINX instances with the new configuration. Once the configuration is applied, traffic redirects to the protected web resource (see URL). Any other system configuration aspect, such as WAF policy, logging, scaling, or image, is modified similarly: update a configuration file and the pipeline does the rest.

Conclusion

This approach of running NGINX App Protect serverless brings a lot of benefits. It is easy to deploy and manage, doesn't require infrastructure management, is auto-scalable, and is cost-effective since AWS ECS charges only for consumed CPU/memory. It is developer friendly because all configuration and parameters are stored as code, and it is easy to roll back. In the current world of developers who want to only write code but still need their applications secured, such a WAF format seems right on time. This is a new project which IMO is worth developing further. Feel free to give it a try and post your feedback to the repo or directly to me (see profile).

Links

Template repo: link
Demo video: link
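Returning to the publishing step referenced above: the nginx.conf change committed to the repo can be as small as the following sketch (the backend address is an assumption):

# location block committed to the repo; the pipeline rolls it out to all containers
location / {
    # forward inspected traffic to the protected application
    proxy_pass http://10.0.0.10:8080;
}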
L7 DoS Protection with NGINX App Protect DoS

Intro

The NGINX security module ecosystem becomes more and more solid. The current App Protect WAF offering is now extended by the App Protect DoS protection module. App Protect DoS inherits and extends the state-of-the-art behavioral L7 DoS protection that was initially implemented on BIG-IP and now protects thousands of workloads around the world. In this article, I'll give a brief explanation of the underlying ML-based DDoS prevention technology and demonstrate a few examples of how precisely it stops various L7 DoS attacks.

Technology

It is important to emphasize the difference between the general volumetric-based protection approach that most of the market uses and the ML-based technology that powers App Protect DoS. Volumetric-based DDoS protection is an old and well-known mechanism to prevent DDoS attacks. As the name says, such a mechanism counts the number of requests sharing the same source or destination, then simply drops requests or applies rate-limiting after some threshold is crossed. For instance, requests sourced from the same IP are dropped after 100 RPS, requests going to the same URL after 200 RPS, and rate-limiting kicks in after 500 RPS for the entire site. Obviously, the major drawback of such an approach is that the selection criterion is too rough. It can cause erroneous drops of valid user requests and overall service degradation. The phenomenon when a security measure blocks good requests is called a "false positive".

App Protect DoS implements much more intelligent techniques to detect and fight off DDoS attacks. At a high level, it monitors all ongoing traffic and builds a statistical model, in other words a baseline, in a process called "learning". The learning process almost never stops, therefore the baseline automatically adjusts to the current web application layout, pattern of use, and traffic intensity. This is important because it drastically reduces the maintenance cost and reaction speed of the solution. There is no more need to manually customize the protection configuration for every application or traffic change.

Infinite learning produces a legitimate question: why can't the system learn attack traffic as a baseline, and how does it detect an attack then? To answer this question, let us define what a DDoS attack is. A DDoS attack is a traffic stream that intends to deny or degrade access to a service. Note that the definition above doesn't focus on the amount of traffic. "Low and slow" DDoS attacks can hurt a service as severely as volumetric ones do. Traffic is only considered malicious when the service level degrades. So, this means that attack traffic can become a baseline, but it is not a big deal since the protected service doesn't suffer.

Now only the "service degradation" term separates us from the answer. How does App Protect DoS measure service degradation? As humans, we usually measure the quality of a web service in delays: the longer it takes to get a response, the more we swear. App Protect DoS mimics human behavior by measuring latency for every single transaction and calculating the level of stress for a service. If the overall stress crosses a threshold, App Protect DoS declares an attack. Think of it: a service degradation triggers an attack signal, not a traffic volume. Volume is harmless if the application server manages to respond quickly. Nice! The attack is detected for a solid reason. What happens next? First of all, the learning process stops and rolls back to a moment when the stress level was low.
The statistical model of the traffic that was collected during peacetime becomes the baseline for anomaly detection. App Protect DoS keeps monitoring the traffic during an attack and uses machine learning to identify the exact request pattern that causes the service degradation. Opposed to old-school volumetric techniques, it doesn't use just a single parameter like source IP or URL, but actually builds as accurate as possible a signature of the entire request that causes harm. The overall number of parameters that App Protect DoS extracts from every request is in the dozens. A signature usually contains about a dozen, including source IP, method, path, headers, payload content structure, and others. Now you can see that the App Protect DoS accuracy level is insane compared to volumetric vectors.

The last part is mitigation. App Protect DoS has a whole inventory of mitigation tools including accurate signatures, bad actor detection, rate-limiting, and even slowing down traffic across the board, which it uses to restore the service. The strategy of using those is convoluted, but the main objective is to be as accurate as possible and do no harm to valid users. In most cases, App Protect DoS only mitigates requests that match specific signatures, and only when the stress threshold for a service is crossed. Therefore, the probability of false positives is vanishingly low. The additional beauty of this technology is that it requires almost no configuration. Once enabled on a virtual server, it does all the job "automagically" and reports back to your security operation center. The following lines present a couple of usage examples.

Demo

The demo topology is straightforward. On one end I have a couple of VMs. One of them continuously generates a steady traffic flow simulating legitimate users. The second one is supposed to generate various L7 DoS attacks, pretending to be an attacker. On the other end, one VM hosts a demo application and another one hosts NGINX with App Protect DoS as a protection tool. A VM on the side runs an ELK cluster to visualize App Protect DoS activity.

The workflow of the demo aims to showcase a basic deployment example and the overall App Protect DoS protection technology. First, I'll configure NGINX to forward traffic to the demo application and App Protect DoS to apply DDoS protection. Then the VM that simulates good users will send a continuous traffic flow to App Protect DoS to let it learn a baseline. Once a baseline is established, the attacker VM will hit the demo app with various DoS attacks. While all this battle is going on, our objective is to learn how App Protect DoS behaves, and to verify that the good users' experience remains unaffected.

Similar to App Protect WAF, App Protect DoS is implemented as a separate module for NGINX. It installs on a system as an apt/yum package, then hooks into the NGINX configuration via the standard "load_module" directive:

load_module modules/ngx_http_app_protect_dos_module.so;

Once loaded, protection is enabled under either the HTTP, server, or location section, depending on what you would like to protect:

app_protect_dos_enable [on|off]

By default, App Protect DoS takes its protection configuration from the local policy file "/etc/nginx/BADOSDefaultPolicy.json":

{
  "mitigation_mode" : "standard",
  "use_automation_tools_detection": "on",
  "signatures" : "on",
  "bad_actors" : "on"
}

As mentioned before, App Protect DoS doesn't require complex config and only takes four parameters.
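Putting the directives above together, a minimal sketch of an nginx.conf protecting one virtual server might look like this (the backend address is an assumption, and a real deployment may need additional directives from the official docs):

load_module modules/ngx_http_app_protect_dos_module.so;

events {}

http {
    server {
        listen 80;
        location / {
            # behavioral DoS protection; policy defaults to /etc/nginx/BADOSDefaultPolicy.json
            app_protect_dos_enable on;
            proxy_pass http://10.0.0.10:8080;
        }
    }
}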
Moreover, the default policy covers most of the use cases, therefore a user only needs to enable App Protect DoS on a protected object. The next step is to simulate good users' traffic to let App Protect DoS learn a good traffic pattern. I use a custom bash script that generates about 6-8 requests per second, like average surfing activity. While inspecting traffic and building a statistical model of good traffic, App Protect DoS sends logs and metrics to Elasticsearch so we can monitor all its activity.

The dashboard above represents traffic before/after App Protect DoS, the degree of application stress, and the mitigations in place. Note that the rate of client-side transactions matches the rate of server-side transactions, meaning that all requests are passing through App Protect DoS and no mitigations are applied. The stress value remains steady since the backend easily handles the current rate and latency does not increase.

Now I am launching an HTTP flood attack. It generates several thousand requests per second, which can easily overwhelm an unprotected web server. My server has App Protect DoS in front applying all its intelligence to fight off the DoS attack. After a few minutes of running the attack traffic, the dashboard shows the following situation. The attack tool generated roughly 1000 RPS. The two charts on the left-hand side show that all transactions went through App Protect DoS and were reaching the demo app for a couple of minutes, causing service degradation. Right after the service stress reached a threshold, an attack was declared (vertical red line on all charts).

As soon as the attack was declared, App Protect DoS started to apply mitigations to bring the service back to life. As I mentioned before, App Protect DoS tries its best not to harm legitimate traffic; therefore it iterates from less invasive mitigations to more invasive ones. During the first several seconds, when App Protect DoS has just detected the attack and a specific anomaly signature has not been calculated yet, App Protect DoS applies an HTTP redirect to all requests across the board. Such a measure only adds a tiny bit of latency for a web browser but allows it to quickly filter out all the not-so-intelligent attack tools that can't follow redirects.

In less than a minute, a specific anomaly signature gets generated. Note how detailed it is: the signature contains 11 attributes that cover all aspects (method, path, headers, and payload). Such a level of granularity and reaction time is feasible neither for volumetric vectors nor for a SOC operator armed with a regex engine. Once a signature is generated, App Protect DoS reduces the scope of mitigation to only requests that match the signature. This eliminates the chance of affecting good traffic at all. Matching traffic receives a redirect and then a challenge, in case the attacker is smart enough to follow redirects.

After a few minutes of observation, App Protect DoS identifies bad actors, since most of the requests come from the same IP addresses (right-bottom chart), then switches mitigation to a bad actor challenge. Despite this measure hitting all the same traffic, it allows App Protect DoS to protect itself: it takes far fewer CPU cycles to identify a target by IP address than to match requests against a signature with 11 attributes. From now on, App Protect DoS continues with the most efficient protection until the attack traffic stops and the server stress goes away.

The technology overview and the demo above expose only a tiny bit of the App Protect DoS protection logic.
A whole lot more of it engages for more complicated attacks. However, the results look impressive. None of the volumetric protection mechanisms, nor even a human SOC operator, can provide such accurate mitigation within such a short reaction time. It is only possible when a machine fights a machine.
Protecting APIs with NGINX App Protect

Recently NGINX App Protect learned how to ingest an Open API file. With that feature, protecting APIs with NGINX App Protect becomes much easier. App Protect automatically configures itself to pass only allowed paths, methods, and parameters based on the Open API file contents. Let's take a quick look at how it works and how it helps.

Configuring a WAF to protect APIs is a big mess. For example, the Httpbin service which I use for this article has 12 API endpoints. 12 endpoints multiplied by the 4 methods needed to implement CRUD give 48 combinations. Plus every endpoint has some parameters... You can imagine how much work it is to manually write a WAF policy to accurately allowlist all these combinations. Now what if the API layout changes? I think this is not something anyone wants to spend time on.

Fortunately, the Open API file format was developed to become the single source of truth for an API definition. It formally describes the entire API structure including endpoints, methods, parameters, etc. Any tool can use it in order to interact with the API. A WAF is not an exception. When an Open API file is referenced from a WAF policy, NGINX App Protect automatically configures itself to allowlist all API resources. A good side effect of this is that the list of application-specific resources becomes segregated from the WAF policy. This means you can update the policy or the resources independently, or even reuse the same policy to protect different APIs.

As a demo, I have built a simple topology: an Httpbin application instance as the API backend, NGINX App Protect as the WAF, and WAF dashboards to visualize activity. All components are deployed on top of serverless AWS platforms, including the WAF. Serverless NGINX App Protect receives traffic, filters it, and forwards it to the Httpbin backend. The WAF dashboards visualize WAF activity based on logs received from NGINX App Protect. If you are not aware of what serverless NGINX App Protect or WAF dashboards are, I encourage you to take a look at the introductory articles.

At the beginning, NGINX just forwards traffic to the Httpbin instance and App Protect has the default policy in place. The default policy is not aware of any API-specific URLs or parameters and only has some generic features enabled. Therefore any endpoint is accessible, and all invalid requests reach the application, putting unnecessary load on it and exposing breaches.

$ curl -k https://snap-alb-1333484792.us-east-2.elb.amazonaws.com/doesntexist
< HTTP/2 404
< date: Tue, 29 Sep 2020 00:12:47 GMT
< content-type: text/html
< content-length: 233
< access-control-allow-origin: *
< access-control-allow-credentials: true

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>404 Not Found</title>
<h1>Not Found</h1>
<p>The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again.</p>

To protect the application from invalid traffic, the WAF needs to allow only requests to expected URLs, methods, and parameters. Using an Open API file from inside the policy helps to define all allowed resources in one line. The documentation says there are two ways of referencing an Open API file:

- as a local file
- as a remote URL

I will use a local file in this example due to the serverless WAF nature, which follows the GitOps paradigm and stores all configuration files in a GitLab repo.
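For reference, the kind of definition App Protect extracts the allowlist from looks like the following (a minimal, hand-written OpenAPI fragment for one hypothetical httpbin endpoint; the real file is the one referenced in the policy below):

{
  "openapi": "3.0.0",
  "info": { "title": "httpbin", "version": "1.0" },
  "paths": {
    "/get": {
      "get": {
        "parameters": [
          { "name": "q", "in": "query", "schema": { "type": "string" } }
        ],
        "responses": { "200": { "description": "OK" } }
      }
    }
  }
}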
So the policy transforms to the following one:

{
    "name": "app_protect_default_policy",
    "template": {
        "name": "POLICY_TEMPLATE_NGINX_BASE"
    },
    "applicationLanguage": "utf-8",
    "enforcementMode": "blocking",
    "open-api-files": [
        {
            "filename": "file://httpbin-openapi.json"
        }
    ]
}

From this point, NGINX App Protect is aware of the entire API structure. However, the default policy doesn't have all the violations enabled that are needed to effectively block invalid requests based on the API structure. The documentation also gives a set of violations to enable:

- VIOL_FILE_UPLOAD_IN_BODY
- VIOL_MANDATORY_REQUEST_BODY
- VIOL_PARAMETER_LOCATION
- VIOL_MANDATORY_PARAMETER
- VIOL_JSON_SCHEMA
- VIOL_PARAMETER_ARRAY_VALUE
- VIOL_PARAMETER_VALUE_BASE64
- VIOL_FILE_UPLOAD
- VIOL_URL_CONTENT_TYPE
- VIOL_PARAMETER_STATIC_VALUE
- VIOL_PARAMETER_VALUE_LENGTH
- VIOL_PARAMETER_DATA_TYPE
- VIOL_PARAMETER_NUMERIC_VALUE
- VIOL_PARAMETER_VALUE_REGEXP
- VIOL_URL
- VIOL_PARAMETER
- VIOL_PARAMETER_EMPTY_VALUE
- VIOL_PARAMETER_REPEATED

{
    "name": "app_protect_default_policy",
    "template": {
        "name": "POLICY_TEMPLATE_NGINX_BASE"
    },
    "applicationLanguage": "utf-8",
    "enforcementMode": "blocking",
    "open-api-files": [
        {
            "filename": "file://httpbin-openapi.json"
        }
    ],
    "blocking-settings": {
        "violations": [
            {
                "block": true,
                "description": "Disallowed file upload content detected in body",
                "name": "VIOL_FILE_UPLOAD_IN_BODY"
            },
            ...omitted...
        ]
    }
}

Once the Open API file and the policy are committed to the serverless NGINX App Protect repo and applied to the traffic, invalid requests are no longer forwarded to the backend but receive a blocking response page directly from the WAF instead.

$ curl -kv https://snap-alb-1333484792.us-east-2.elb.amazonaws.com/doesntexist
{"supportID": "18105457996499725594"}

That is it. No more monstrous hand-written WAF policies. An Open API file reference makes the WAF API-aware in one line.

One more thing which a good WAF can't live without is visibility. NGINX App Protect is totally GUI-less, and the only way to get some insight into what it does is to integrate it with external visibility tools like WAF Dashboards for Kibana. The integration process is simple. Once the ELK stack is deployed, just two steps are left:

1. Configure NGINX App Protect to send logs to the log processor (logstash)
2. Import the dashboards to Kibana

Configuring the WAF to send logs requires committing two more directives to the serverless NGINX App Protect repo. Once applied, the WAF starts to send logs to logstash, and logstash in turn parses them and stores them in elasticsearch.

app_protect_security_log_enable on;
app_protect_security_log "/etc/app_protect/conf/log_default.json" syslog:server=logstash.example.com:5144;

The step of importing dashboards to Kibana isn't much harder. The documentation requires just a couple of commands:

KIBANA_URL=https://your.kibana:5601

jq -s . kibana/overview-dashboard.ndjson | jq '{"objects": . }' | \
curl -k --location --request POST "$KIBANA_URL/api/kibana/dashboards/import" \
--header 'kbn-xsrf: true' \
--header 'Content-Type: text/plain' -d @- \
| jq

jq -s . kibana/false-positives-dashboards.ndjson | jq '{"objects": . }' | \
curl -k --location --request POST "$KIBANA_URL/api/kibana/dashboards/import" \
--header 'kbn-xsrf: true' \
--header 'Content-Type: text/plain' -d @- \
| jq

Once done, you can observe all NGINX App Protect activity on the dashboards. There is lots of useful information like top talkers, frequently accessed URLs, blocking reasons, and so on. All this info helps to keep track of what the WAF is up to and to fine-tune the policy based on these insights.
NGINX App Protect gets new features quickly and is becoming a full-featured, flexible, and lightning-fast WAF. The community keeps up and produces an ecosystem of tools to make the WAF operator experience even better. Feel free to explore the tools I covered in this article and join the community to make life even better.

Tools:

- Serverless NGINX App Protect
- WAF Dashboards

Do not hesitate to contact me directly with questions of any kind about NGINX App Protect or any project used in this article. Good luck!
Introduction

The official AWS AMI image for NGINX App Protect has been released recently. This brings two big benefits to all users. The first is that an official image available on the AWS Marketplace eliminates the need to manually pre-build an AMI for your WAF deployment: it contains all the necessary code and packages on top of the OS of your choice. The other benefit, even more important from my perspective, is that using the official AMI from the AWS Marketplace allows you to pay as you go for the NGINX App Protect software instead of purchasing a year-long license. The pay-as-you-go licensing model is much more suitable for modern dynamic cloud environments.

The following article proposes an option for deploying and automating NGINX App Protect WAF, using the new AMIs as a motive to experiment. To make it slightly more useful, I'll try to simulate a production-like environment. Here are the requirements:

Flexibility. The number of instances scales up and down smoothly.
Redundancy. Loss of an instance or an entire datacenter doesn't cause a service outage.
Automation. All deployment and day-to-day operations are automated.

Architecture

The high-level architecture represents a common deployment pattern for a highly available system. An AWS VPC runs an application load balancer with a set of EC2 instances running the NGINX App Protect software behind it. The load balancer manages TLS certificates, receives traffic, and distributes it across all EC2 instances. The NGINX App Protect VMs inspect the traffic and forward it to the application backend. Everything is simple so far.

Diagram 1. High Level Architecture.

Since the system pretends to be production-like, redundancy is a must. A deeper dive into the AWS architecture on the diagram below reveals more system details.

Diagram 2. VPC Architecture.

The AWS VPC has two subnets distributed across two availability zones. Load balancer legs and WAF instances are present in each subnet. Such workload distribution provides geographical resiliency: even if an entire AWS datacenter in one zone goes down, the WAF instances in the other zone keep working, so the WAF deployment keeps handling traffic and applications remain available to the public. Such a scenario reveals the rule of thumb:

Rule: Always keep instance load below fifty percent to prevent overload in case of losing up to half of the instances.

Each tier lives in its own security group. The load balancer security group allows access from any IP to the HTTPS port for data traffic. The WAF security group allows HTTP access from the load balancer and SSH access from trusted hosts for administration purposes. Data traffic enters the load balancer's public IPs and then reaches one of the WAF instances via private IPs. Blocking response pages are served right from the WAF VMs. Clean traffic departs directly to the application backends, regardless of their location.

Automation

Automated deployment and operations is a de facto standard for modern systems. As with any other system, WAF automation should cover both deployment and configuration. Deployment automation sets up the underlying AWS infrastructure. Configuration automation takes care of WAF policy distribution across all WAF instances. The following diagram represents an option I used to automate the NGINX App Protect instances.

Diagram 3.

GitLab is used as the CI/CD platform. The GitLab pipeline sets up and configures the entire system from the ground up. The first stage uses Terraform to create all the necessary AWS resources, such as the VPC, subnets, load balancer, and EC2 instances built from the official NGINX App Protect AMI image.
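As an aside, once the Terraform stage has run, it is easy to sanity-check the result straight from the AWS CLI before provisioning starts. A minimal sketch, assuming the instance Name tag from the Terraform code shown below and a target group ARN you substitute with your own (both values are assumptions):

```bash
# List running NGINX App Protect instances created by Terraform
# (the Name tag pattern comes from the module below; a trailing wildcard
# covers the numeric suffix the module may add when instance_count > 1)
aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=nap.us-west-2c.int*" "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].[InstanceId,PublicIpAddress]" \
    --output table

# Verify the instances are healthy behind the application load balancer
# (replace the placeholder ARN with your own target group ARN)
aws elbv2 describe-target-health \
    --target-group-arn "arn:aws:elasticloadbalancing:us-west-2:111111111111:targetgroup/nap/0123456789abcdef" \
    --query "TargetHealthDescriptions[].[Target.Id,TargetHealth.State]" \
    --output table
```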
The second stage provisions the WAF policy across all instances.

CI/CD Pipeline

Let's take a closer look at the GitLab pipeline listing. The first stage simply uses Terraform to create the AWS resources shown in diagram 2.

```yaml
terraform:
  stage: terraform
  image:
    name: hashicorp/terraform:0.13.5
  before_script:
    - cd terraform
    - terraform init
  script:
    - terraform plan -out "planfile"
    - terraform apply -input=false "planfile"
  artifacts:
    paths:
      - terraform/hosts.cfg
```

The second stage applies the WAF policy across all NGINX App Protect instances created by Terraform.

```yaml
provision:
  stage: provision
  image:
    name: 464d41/ansible
  before_script:
    - eval $(ssh-agent -s)
    - echo $ANSIBLE_PRIVATE_KEY | base64 -d | ssh-add -
    - export ANSIBLE_REMOTE_USER=ubuntu
    - cd provision
    - ansible-galaxy install nginxinc.nginx_config
  script:
    - ansible-playbook -i ../terraform/hosts.cfg nap-playbook.yaml
  only:
    changes:
      - "terraform/*"
      - "provision/**/*"
      - ".gitlab-ci.yml"
```

WAF Deployment Automation. Terraform

There are a couple of important code snippets I would like to emphasize from the Terraform code.

```hcl
...omitted...
module "nap" {
  source = "terraform-aws-modules/ec2-instance/aws"
  providers = {
    aws = aws.us-west-2
  }
  version        = "~> 2.0"
  instance_count = 2

  name          = "nap.us-west-2c.int"
  ami           = "ami-045c0c07ba6b04fcc"
  instance_type = "t2.medium"
  root_block_device = [
    {
      volume_type = "gp2"
      volume_size = 8
    }
  ]
  associate_public_ip_address = true
  key_name                    = "aws-f5-nap"
  vpc_security_group_ids      = [module.nap_sg.this_security_group_id, data.aws_security_group.allow-traffic-from-trusted-sources.id]
  subnet_id                   = data.aws_subnet.public.id
}

resource "local_file" "hosts_cfg" {
  content = templatefile("hosts.tmpl",
    {
      nap_instances = module.nap.public_ip
    }
  )
  filename = "hosts.cfg"
}
...omitted...
```

A community module to create EC2 instances is in use. It saves some time on implementing my own and allows the deployment to scale up and down by simply changing the "instance_count" or "instance_type" values. The "ami" value refers to the official NGINX App Protect AMI, so there is no need to pre-bake custom images or buy per-instance licenses. All instances have a public IP address assigned to them for management purposes; only GitLab IPs are allowed to access those IPs. Data traffic comes from the load balancer through private IPs.

Notice that Terraform creates a local "hosts.cfg" file. This file contains the list of WAF VM IPs that Terraform manages, so Ansible in the next stage always knows which instances to provision.

WAF Configuration Automation. Ansible

Ansible generates the NGINX and App Protect configuration and applies it across all instances created by Terraform. The NGINX team developed a set of Ansible collections that wrap these operations into roles. This makes it possible to avoid dealing with complex Jinja templates and instead define the NGINX configuration right as Ansible playbook parameters. Ansible automatically compiles these parameters into an NGINX config file and spreads it across the hosts. The following listing gives an example of a playbook to configure NGINX. First, it copies a custom App Protect policy to all hosts.

```yaml
---
- name: Converge
  hosts: all
  gather_facts: false
  become: yes
  tasks:
    - name: Copy App Protect Policy
      copy:
        src: ./app-protect/custom-policy.json
        dest: /etc/nginx/custom-policy.json
```

The next task configures general NGINX daemon parameters.
```yaml
    - name: Configure NGINX and App Protect
      include_role:
        name: nginxinc.nginx_config
      vars:
        nginx_config_debug_output: true

        nginx_config_main_template_enable: true
        nginx_config_main_template:
          template_file: nginx.conf.j2
          conf_file_name: nginx.conf
          conf_file_location: /etc/nginx/
          modules:
            - modules/ngx_http_app_protect_module.so
          user: nginx
          worker_processes: auto
          pid: /var/run/nginx.pid
          error_log:
            location: /var/log/nginx/error.log
            level: warn
          worker_connections: 1024
          http_enable: true
          http_settings:
            default_type: application/octet-stream
            access_log_format:
              - name: main
                format: |
                  '$remote_addr - $remote_user [$time_local] "$request" '
                  '$status $body_bytes_sent "$http_referer" '
                  '"$http_user_agent" "$http_x_forwarded_for"'
            access_log_location:
              - name: main
                location: /var/log/nginx/access.log
            keepalive_timeout: 65
            cache: false
            rate_limit: false
            keyval: false
            server_tokens: "off"
          stream_enable: true
          http_custom_includes:
            - "/etc/nginx/sites-enabled/*.conf"
```

The last task of the playbook configures a virtual server with App Protect enabled on it:

```yaml
        nginx_config_http_template_enable: true
        nginx_config_http_template:
          app:
            template_file: http/default.conf.j2
            conf_file_name: default.conf
            conf_file_location: /etc/nginx/conf.d/
            servers:
              server1:
                listen:
                  listen_localhost:
                    ip: 0.0.0.0
                    port: 80
                    opts:
                      - default_server
                server_name: localhost
                access_log:
                  - name: main
                    location: /var/log/nginx/access.log
                locations:
                  frontend:
                    location: /
                    proxy_pass: http://app_servers
                    proxy_set_header:
                      header_host:
                        name: Host
                        value: $host
                    app_protect:
                      enable: true
                      policy_file: /etc/nginx/custom-policy.json
            upstreams:
              app_upstream:
                name: app_servers
                servers:
                  app_server_1:
                    address: 35.167.144.13
                    port: 80
```

Once the pipeline ends successfully, the NGINX App Protect WAF cluster is deployed, configured, and ready to inspect traffic.

Conclusion

This is an option for what a production-grade NGINX App Protect deployment could look like. A simple, redundant architecture automated from the ground up helps to effectively manage the WAF deployment and lets a team focus on application development and security instead of keeping a WAF up and running. The official AMIs let you use pay-as-you-go licensing to easily scale the deployment up and down without overpaying for static licenses.

Full listings of the configuration files are available in the repo. Feel free to reach out with questions and suggestions. Thanks for reading!