In search of a security incident response system for the masses
Introduction
Designing and deploying the best possible security architecture is almost always seen as a more exciting activity than planning a log management system or setting up an incident documentation policy. This tendency seems to be more pronounced in smaller organizations, which may lack the resources or the business focus to implement anything more than simple preventive security measures.
As the organization - and its attack footprint - grows, so does the number of incidents, and the enterprise will start to look for a more formal way to manage security events.
One of the first documents usually consulted is NIST SP 800-61 "Computer Security Incident Handling Guide", which describes the principles of establishing a security incident response capability. A good starting point is defining the scope of such a capability, as illustrated by the incident response life-cycle diagram.
Of particular interest for this article are two phases, "Detection & Analysis" and "Containment, Eradication & Recovery", which are usually less well covered in the planning of a typical resource-constrained organization that might choose to rely heavily on preventive measures.
The "Detection" sub-phase deals with the collection of incident indicators, reported by various security controls such as WAFs or IDPS, or even by end-users.
The next challenge, addressed in the "Analysis" sub-phase, is qualifying these incident indicators. Some of them are false positives, while others can be triggered by non-security-related incidents.
The sheer number of alerts poses a big challenge in processing them and this phase is the one in which the security teams will spend most of their time. Any automation that can speed up the analysis phase would have a huge impact on SecOps productivity.
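As a rough illustration of the kind of automation that helps in the analysis phase, the sketch below collapses duplicate alerts and drops those coming from sources already known to be benign. The alert shape (`rule`, `source_ip`) and the function name `triage` are assumptions made for this example, not part of any particular product.

```python
from collections import Counter


def triage(alerts, known_benign_sources):
    """Collapse duplicate alerts and skip sources already known to be benign."""
    counts = Counter()
    for alert in alerts:
        if alert["source_ip"] in known_benign_sources:
            continue  # e.g. an internal vulnerability scanner
        counts[(alert["rule"], alert["source_ip"])] += 1
    # Most frequent (rule, source) pairs first, so the noisiest items surface at the top
    return [
        {"rule": rule, "source_ip": ip, "count": n}
        for (rule, ip), n in counts.most_common()
    ]
```

Even a simple grouping step like this can shrink the queue an analyst has to read, which is where most of the time in this phase is spent.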
The "Containment, Eradication & Recovery" phase is concerned with selecting the appropriate threat response action, gathering all the required evidence, and blocking the source of the attack as an immediate action while the affected systems are remediated and returned to normal operation.
Speed of response is essential here to limit the blast radius, so automation again seems to be the best solution.
From a product perspective, the "Detection & Analysis" and "Containment, Eradication & Recovery" phases seem to be covered by solutions such as SIEM/SOAR, which can be integrated with a broad range of telemetry sources, or XDR for a narrower set of log generators.
There is a lot of overlap between the capabilities of these products, and drawing clear boundaries between them is made harder by the fact that their markets are fairly new (compared to other security products) and rapidly evolving.
Security Incident Response capabilities
For the purpose of identifying the main features of a Security Incident Response system, also referred to as a Security Incident Management system, we can use the SIEM as a more generic, mature and representative technology (as it sometimes includes SOAR capabilities) and check Gartner's SIEM critical capabilities list.
According to Gartner, a SIEM/SOAR should support:
- A multitude of deployment environments, ranging from on-premises to public/private cloud or as-a-service ("cloud SIEM")
- Collection of logs from a multitude of sources and in different formats
- Storing of data ensuring its integrity, confidentiality and availability
- Analytics to detect threats and compliance issues
- Tracking of incidents by creating cases based on indicators/alerts
- Enrichment of alert data by running automated analytics
- Automated threat response capabilities
The main barrier to widespread adoption of SIEM/SOAR technology is cost, both in terms of acquisition and operation. SIEM product licenses are expensive, although their list price might be obscured when they are bundled with other security products, and there is always the large cost of maintaining a fairly complex system.
In terms of market use-cases, Gartner identifies three categories:
The "Essential SIEM" category should cater for the more budget-minded customers, but adopting these products still requires a sizeable investment and involves a steep learning curve. As a result, SIEM/SOAR technology is still relatively uncommon in midsize enterprises, whose SecOps teams have to rely on time-consuming manual parsing of logs to investigate their security alerts.
There is a need for an easier on-ramp to the security incident response space, either for proof-of-concept testing in lab environments or even for adoption in production.
As mentioned in a previous article, "Detect and stop exfiltration attempts with F5 Distributed Cloud App Infrastructure Protection", as part of the InnovateF5 program we developed a project (STIR - Security Telemetry and Incident Response) that aims to package existing external Open-Source technologies in order to make them easier to deploy. The end goal was to deploy all the necessary infrastructure with a single command, made possible by packaging all the components as a Helm chart.
STIR components
STIR uses ELK as its data repository, enabling it to receive and parse any telemetry data that Logstash can parse. Tracking of incidents is done through TheHive, one of the most popular Open-Source security incident response platforms. Alerts from Elastic are pushed to TheHive with the help of ElastAlert2, which allows a second layer of parsing and formatting, after Logstash.
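To make the "second layer of parsing and formatting" more concrete, here is a minimal Python sketch of the kind of mapping involved: one Elasticsearch hit is turned into a TheHive-style alert. In STIR this mapping is done by ElastAlert2 through its own configuration, not by this code. The document field names (`rule_name`, `host`, `source_ip`, `destination_ip`) are hypothetical, and the alert keys are modeled loosely on TheHive's alert fields; the exact key names (for example `observables` versus `artifacts`) depend on the TheHive version and should be checked against your deployment.

```python
import hashlib


def hit_to_alert(hit):
    """Turn one Elasticsearch hit into a TheHive-style alert dictionary."""
    doc = hit["_source"]
    # A stable reference derived from index + id, so a retried push maps to the same alert
    ref = hashlib.sha256(f"{hit['_index']}/{hit['_id']}".encode()).hexdigest()[:16]
    observables = [
        {"dataType": "ip", "data": doc[field]}
        for field in ("source_ip", "destination_ip")  # hypothetical field names
        if field in doc
    ]
    return {
        "title": f"{doc.get('rule_name', 'unnamed rule')} on {doc.get('host', 'unknown host')}",
        "description": doc.get("message", ""),
        "type": "external",
        "source": "elastalert",
        "sourceRef": ref,
        "severity": int(doc.get("severity", 2)),
        "tags": list(doc.get("tags", [])),
        "observables": observables,
    }
```

Deriving the reference from the index and document id is a design choice for this sketch: it keeps the mapping deterministic, which makes duplicate pushes easy to spot.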
Once alerts have been imported into TheHive, one option is for a SecOps analyst to quickly triage them and select those that will be elevated as cases for further analysis.
A second option is to have an automated conversion from alerts to cases with the help of Node-RED, which is the component responsible for automation of the most common tasks of STIR.
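The decision behind such an automated conversion can be as simple as a severity threshold plus a list of tags that always deserve a case. The sketch below expresses that policy in Python purely for illustration; Node-RED flows are built differently, and the threshold, the `exfiltration` tag and the function names are assumptions of this example.

```python
def should_promote(alert, min_severity=3, watch_tags=frozenset({"exfiltration"})):
    """Return True when an alert should become a case without waiting for manual triage."""
    if alert.get("severity", 1) >= min_severity:
        return True
    # Low-severity alerts are still promoted when they carry a watched tag
    return bool(watch_tags & set(alert.get("tags", [])))


def split_alerts(alerts, **policy):
    """Partition alerts into (auto-promoted, left for an analyst)."""
    promoted, pending = [], []
    for alert in alerts:
        (promoted if should_promote(alert, **policy) else pending).append(alert)
    return promoted, pending
```

Keeping the policy in one small function makes it easy to tune: calling `split_alerts(alerts, min_severity=2)` widens the set of alerts that are promoted automatically.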
Once cases have been created in TheHive, SecOps analysts can start enriching the alert data by calling Cortex analysers. You can think of Cortex as a "docking station" or gateway that provides access to a series of threat intelligence sources, which can be queried for any information related to the event indicators or indicators of compromise (IoC).
TheHive also supports assigning tasks to the SecOps team and tagging cases, adding MITRE ATT&CK TTPs and building up a "dossier" in support of the final decision to classify the incident as a true or false positive.
To ensure threat intelligence sharing within a configurable security group or community, STIR uses MISP, a distributed platform that allows sharing of insights derived from security analysis.
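Enrichment starts from the indicators themselves, so it can help to see how they might be pulled out of raw alert text before being handed to an analyser. The following is a minimal sketch, assuming only IPv4 addresses and SHA-256 hashes are of interest; the function name and the returned structure are hypothetical and do not reflect the Cortex API.

```python
import ipaddress
import re

IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SHA256 = re.compile(r"\b[a-fA-F0-9]{64}\b")


def extract_observables(text):
    """Collect unique, valid IPv4 addresses and SHA-256 hashes from free text."""
    ips = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            ipaddress.ip_address(candidate)  # rejects strings such as 999.1.1.1
        except ValueError:
            continue
        if candidate not in ips:
            ips.append(candidate)  # keep first-seen order
    hashes = sorted({h.lower() for h in SHA256.findall(text)})
    return {"ip": ips, "hash": hashes}
```

Validating candidates with the standard `ipaddress` module, rather than trusting the regular expression alone, avoids sending obviously malformed values to external threat intelligence sources.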
For automated threat response, we again use Node-RED, which can, for example, use API calls to perform remedial actions such as reconfiguring network firewalls or WAFs.
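A remedial action of this kind usually boils down to building one API request, with a safety check in front of it. The sketch below, written in Python for illustration rather than as a Node-RED flow, prepares a block request for a hypothetical firewall endpoint (`/blocklist`) and refuses to block addresses in internal ranges. The endpoint, payload fields and protected ranges are all assumptions; a real firewall or WAF API will differ.

```python
import ipaddress

# Ranges that automation should never block (assumed internal networks for this sketch)
PROTECTED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8")
]


def plan_block(source_ip, firewall_url, ttl_seconds=3600):
    """Describe the API call that would block source_ip, or None if it is protected."""
    addr = ipaddress.ip_address(source_ip)
    if any(addr in net for net in PROTECTED_NETWORKS):
        return None  # leave internal addresses to a human decision
    return {
        "method": "POST",
        "url": f"{firewall_url.rstrip('/')}/blocklist",  # hypothetical endpoint
        "json": {"address": str(addr), "ttl": ttl_seconds, "reason": "STIR automated response"},
    }
```

Returning a description of the request instead of sending it keeps the decision logic testable on its own, and the time-to-live means an automated block expires unless someone confirms it.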
As mentioned before, the objective of wrapping all these components in a Helm chart is to deploy this infrastructure as a unit, in the simplest possible way, while also allowing its deployment in as many environments as possible, ranging from on-prem K8s distributions to public cloud environments such as AWS EKS. You can find the Helm chart Git repo here: link.
I will go over the installation and basic configuration of STIR in a future article.
Conclusion
Although security incident response technologies such as SIEM/SOAR are becoming more common, there is still a need for a low-cost alternative that allows new users to test the best way to integrate various security solutions.
The STIR project, developed within F5's InnovateF5 program, bundles some of the most widely used Open-Source security solutions that, together, cover all the critical capabilities of a SIEM/SOAR, in a package that is easy to deploy in any K8s-based environment.