Increase Security in AWS without Rearchitecting your Applications - Part 1: Tuesday
It is a random Tuesday, and your boss just left an urgent meeting with the company executives. The board has become concerned about the security of your IT infrastructure. Several companies in your segment have recently suffered security breaches that made front-page news, and management is being pressed on the steps it is taking to secure the application infrastructure. Lucky you, it is now your task to fix it (somewhere in all those courses on networking, nobody mentioned that Network Engineering is the team every company turns to when there is a problem and nobody else can help...). Your company's infrastructure is a maze of VPCs, route tables, Transit Gateways, Transit VPCs, Internet Gateways, Direct Connects, VPNs, and multiple accounts, with traffic flowing in every direction. No one seems to know what is talking to what, why, or what information is being shared, let alone the security posture. Your boss simply says that a solution needs to be found. Fast.
Evaluating the Options
After you hang up the phone, you start looking into the architectures and best practices being used to solve this problem. It is preferable to insert the security services you need without a major architecture effort, but what does that look like? What options do you have? How do you define what security should be applied to different traffic flows without major changes? What types of traffic do you need to inspect? What does that mean for your architecture options? What if you need more than one security service? What about cost and time to deploy? How do you stay resilient as security demands change?
Defining Network Flows (Who's the client and who's the server?)
I believe it is important that we start with a crisp definition of the network flows that we need to consider.
Flow Type | Description | Example
Ingress: North to South (N/S) | From an external client to an internal resource. | An Internet client is connecting to your company's website, API, etc.
Egress: South to North (S/N) | From an internal client to an external resource. | An internal client is connecting to a website, GitHub, API, etc. on the public Internet.
East to West (E/W) | From an internal resource to another internal resource. | An internal client is connecting to another internal client in a different subnet, which could be in the same VPC, a different VPC, or on premises.
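The three flow types can be sketched as a small classifier that only asks whether each side of the connection is internal. This is an illustrative sketch, not part of any AWS or F5 tooling: the `INTERNAL_NETWORKS` list and the function names are assumptions, and the RFC 1918 ranges stand in for whatever VPC and on-premises CIDRs you actually use.

```python
import ipaddress

# Hypothetical internal address space (assumption): replace with your
# own VPC and on-premises CIDR blocks.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal(address: str) -> bool:
    """Return True if the address falls inside any internal network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in INTERNAL_NETWORKS)


def classify_flow(client_ip: str, server_ip: str) -> str:
    """Classify a flow by which side (client or server) is internal."""
    client_internal = is_internal(client_ip)
    server_internal = is_internal(server_ip)
    if not client_internal and server_internal:
        return "ingress (N/S)"
    if client_internal and not server_internal:
        return "egress (S/N)"
    if client_internal and server_internal:
        return "east-west (E/W)"
    # Both sides external: outside the three flow types defined above.
    return "external-only"


if __name__ == "__main__":
    print(classify_flow("203.0.113.10", "10.0.1.5"))   # external client, internal server
    print(classify_flow("10.0.1.5", "198.51.100.20"))  # internal client, external server
    print(classify_flow("10.0.1.5", "10.1.2.7"))       # both internal
```

The point of the sketch is that the flow type depends on who initiates the connection, not on which direction an individual packet happens to travel.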
Solution v1 - Routed Sandwich
Let's look at an example architecture. While it does accomplish the technical goal, it does not solve the problem in a manner that is dynamic, elastic, and transparent. If you manually daisy-chain the services together, you will run into an array of issues that directly contradict our solution requirements.
- The architecture is based on route table constructs, and to use all security services you need to build a complex "club sandwich" pattern.
- Not all vendors integrate with the AWS API, which creates more complexity, forces NAT, and loses visibility.
- AWS does not support ECMP in a VPC route table, which creates single points of failure.
- Not every service is needed for every flow or app, but the route table forces all traffic through every service, increasing cost and complexity.
- The security stack has to be deployed per VPC, per AZ, and per account - a multiplicative lift in cost and toil.
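To make the per-VPC, per-AZ, per-account cost concrete, here is a minimal arithmetic sketch. The function names and the example figures (4 accounts, 5 VPCs each, 3 AZs each, 3 security services) are purely hypothetical and chosen only to show how the counts multiply.

```python
def security_stacks_required(accounts: int, vpcs_per_account: int, azs_per_vpc: int) -> int:
    """One security stack per AZ, in every VPC, in every account."""
    return accounts * vpcs_per_account * azs_per_vpc


def appliances_required(stacks: int, services_per_stack: int) -> int:
    """Each stack carries one appliance per security service."""
    return stacks * services_per_stack


if __name__ == "__main__":
    # Hypothetical environment (assumption, not taken from a real deployment).
    stacks = security_stacks_required(accounts=4, vpcs_per_account=5, azs_per_vpc=3)
    appliances = appliances_required(stacks, services_per_stack=3)
    print(stacks)      # 60
    print(appliances)  # 180
```

Adding one more service, account, or VPC scales every other dimension with it, which is why the routed sandwich gets expensive quickly.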
Solution v2 - LB-LB Sandwich
Most of us have seen this architecture, and many organizations have deployed it. Prior to AWS delivering GWLB, this was the most common way to get horizontal scale for ingress traffic security in the public cloud. While the pattern itself is simple, the deployment and management are complex and the cost is very high. At each layer you incur an address translation and a load balancer cost, multiplied by the number of layers and the number of VPCs. What you end up with is a feeling of security but near-zero visibility. With hindsight being 20/20 it is easy to point out its faults, but it only became a bad solution once a better one existed.
Solution v3 - Gateway Load Balancing Sandwich
What if we try to use a separate Gateway Load Balancing Endpoint for each service? Sounds good… right? Not so fast. While it does add end-to-end IP visibility to the topology, it still steers all traffic through all of the services, and it creates a complex topology that needs to be replicated in every application VPC. This pattern simply delivers a new type of complexity.
Don't Love the Complexity
The problem with all of these solutions is that while they add security functions, they also create massive complexity, decrease visibility, generate even greater sprawl, secure only a subset of traffic, and are static because they are based solely on routing. All of these are now problems that must be solved.
Luckily, there is a better way.
Defining Solution Requirements
The actual challenge can be stated as:
"You need an architecture that captures north-to-south, south-to-north, and east-to-west traffic flows and applies programmatic, dynamic security chains as appropriate, while minimizing environment complexity and appliance sprawl, supporting the agility gains that cloud provides, removing the need for VPC peering, and delivering security services across accounts."
Define your Security Use Case
Prior to building our solution suite we need to decide on our security use case and tool set. When I think about this challenge, the key distinction is whether we need a single security service deployed or an array of security services.
Security Use Case | Example | F5 Options | Comments
Simple (1 function) | Web Application Firewall, Network Firewall | Advanced WAF, Advanced Firewall Manager | Usually targets a single network flow pattern, such as North to South, or a single security function for all network flow patterns.
Complex (n functions) | WAF, Network Firewall, DLP | SSL Orchestrator, Advanced WAF, Advanced Firewall Manager, other 3rd party services | Targets a single network flow pattern with multiple services, or multiple network flow patterns with specific services based on pattern.
Building Blocks - Simple and Complex Use Case
If we are to solve for agility, security, minimizing sprawl, and dynamic security chains we will need to understand and implement a couple of building blocks.
Objects | Function | Details
AWS Gateway Load Balancer | The AWS Gateway Load Balancer allows us to present a transparent network service (bump in the wire) between the AWS edge (public IPs) and internal resources, or in between internal resources, without having to NAT the traffic. | GWLB uses a construct called an endpoint. Each endpoint exists in an Availability Zone and uses AWS PrivateLink to transport traffic to a provider service in another, disconnected, VPC.
AWS Gateway Load Balancer Service | A service is the array of appliances that a customer is allowed to consume through a Gateway Load Balancer endpoint. | Users can deploy one or more security services as a target of a GWLB. The GWLB service can then be shared across accounts and topologies.
F5 BIG-IP Advanced WAF and/or AFM | Provide security services for all traffic in a single "hop". | All traffic traverses this system, and no traffic is passed off to additional security services in a service chain.
F5 BIG-IP SSL Orchestrator | Creates dynamic security chains based on network flows or other traffic characteristics and orchestrates the traffic processing. | BIG-IP SSL Orchestrator acts as the service presented by GWLB. F5's traffic matching capabilities allow us to create security chains that are correct based on flow characteristics.
N types of security services | Security services that are deployed with BIG-IP SSL Orchestrator. | These services become the service chains that SSL Orchestrator can use. These would be the NGFW, DLP, IPS, WAF, etc.
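The idea of a dynamic security chain, where each flow is sent only to the services it needs, can be illustrated with a toy first-match rule table. This is a plain Python model and not SSL Orchestrator configuration or any F5 API: the `Flow` class, the `RULES` list, and the specific flow-to-chain pairings are all hypothetical, with the service names borrowed from the table above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Flow:
    direction: str  # "ingress", "egress", or "east-west"
    dest_port: int


# Each rule pairs a match predicate with an ordered chain of services.
ChainRule = Tuple[Callable[[Flow], bool], List[str]]

# Hypothetical policy (assumption): which services each kind of flow visits.
RULES: List[ChainRule] = [
    (lambda f: f.direction == "ingress" and f.dest_port == 443, ["WAF", "IPS"]),
    (lambda f: f.direction == "egress", ["NGFW", "DLP"]),
    (lambda f: f.direction == "east-west", ["NGFW"]),
]


def select_chain(flow: Flow) -> List[str]:
    """Return the chain of the first matching rule, or an empty chain."""
    for matches, chain in RULES:
        if matches(flow):
            return chain
    return []  # no rule matched in this toy model


if __name__ == "__main__":
    print(select_chain(Flow("ingress", 443)))     # ['WAF', 'IPS']
    print(select_chain(Flow("egress", 443)))      # ['NGFW', 'DLP']
    print(select_chain(Flow("east-west", 5432)))  # ['NGFW']
```

Contrast this with the routed sandwich, where the route table is the only decision point and every flow visits every service regardless of need.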
Correct State - Complex Security Service Insertion
Let's look at a correct state example. Here we have a North/South flow that is sent across the Internet to an Elastic IP (EIP). The EIP is NATed to the NLB internal IP address and passed into the VPC. The GWLB Endpoint (GWLB EP), one per AZ, intercepts the traffic and sends it to one or more SSL Orchestrator instances. SSL Orchestrator processes the traffic based on the interception rule and security policy and returns the traffic to the GWLB EP. At this point the traffic is passed to the NLB network interface and then to the backend pools. The return traffic reverses the path (for security you need to see both directions of the flow). This pattern provides the following benefits:
- End-to-end client IP visibility. The backend servers and the security services all see the true client IP in the L3 header.
- Simplified route tables. GWLB EPs are deployed once per AZ and traffic that needs to be inspected (N/S, S/N, E/W) has one route that points to one endpoint.
- SSL Orchestrator evaluates the traffic and steers it to the correct security services - no more having to send all traffic through all services.
- Simplified architecture - fewer route tables per VPC, and no dependency on Transit Gateway for security inspection.
- The security service does not rely on VPC peering, simplifying VPC topologies.
- Horizontal scale - multiple SSL Orchestrator blocks can be deployed with GWLB, allowing for scale and creating a resilient service.
- Insertion in front of ELB and ENI constructs, allowing for security inspection before traffic hits the application.
- Developers can still use their existing patterns.
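The "one route that points to one endpoint" idea from the list above can be sketched as the parameters for a single route entry. The sketch only builds the request dictionary, so it runs without AWS credentials. The parameter names follow the EC2 `CreateRoute` API as exposed by boto3's `create_route` (to the best of my knowledge; check the current boto3 documentation), while the helper name, the resource IDs, and the CIDR are hypothetical placeholders.

```python
from typing import Dict


def build_inspection_route(route_table_id: str,
                           gwlb_endpoint_id: str,
                           destination_cidr: str) -> Dict[str, str]:
    """Build the parameters for one route that sends traffic for
    destination_cidr to a Gateway Load Balancer endpoint."""
    return {
        "RouteTableId": route_table_id,
        "DestinationCidrBlock": destination_cidr,
        "VpcEndpointId": gwlb_endpoint_id,
    }


if __name__ == "__main__":
    # Hypothetical IDs and CIDR (assumption): substitute your own values.
    route = build_inspection_route(
        route_table_id="rtb-0123456789abcdef0",
        gwlb_endpoint_id="vpce-0123456789abcdef0",
        destination_cidr="10.0.1.0/24",
    )
    print(route)
    # With boto3 installed and credentials configured, this could be applied as:
    #   import boto3
    #   boto3.client("ec2").create_route(**route)
```

Because the endpoint is the only next hop, adding or removing security services behind the GWLB does not require touching the application VPC route tables again.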
Join me on Wednesday to look at a more detailed architecture and the configuration of SSL Orchestrator to capture the different traffic flows.