LTM
399 TopicsSNI Routing with BIG-IP
In the previous article, The Three HTTP Routing Patterns, Lori MacVittie covers 3 methods of routing. Today we will look at Server Name Indication (SNI) routing as an additional method of routing HTTPS or any protocol that uses TLS and SNI. Using SNI we can route traffic to a destination without having to terminate the SSL connection. This enables several benefits including: Reduced number of Public IPs Simplified configuration More intelligent routing of TLS traffic Terminating SSL Connections When you have a SSL certificate and key you can perform the cryptographic actions required to encrypt traffic using TLS. This is what I refer to as “terminating the SSL connection” throughout this article. When you want to route traffic this is a chicken and an egg problem, because for TLS traffic you want to be able to route the traffic by being able to inspect the contents, but this normally requires being able to “terminate the SSL connection”. The goal of this article is to layer in traffic routing for TLS traffic without having to require having/knowing the original SSL certificate and key. Server Name Indication (SNI) SNI is a TLS extension that makes it possible to "share" certificates on a single IP address. This is possible due to a client using a TLS extension that requests a specific name before the server responds with a SSL certificate. Prior to SNI, the other options would be a wildcard certificate or Subject Alternative Name (SAN) that allows you to specify multiple names with a single certificate. SNI with Virtual Servers It has been possible to use SNI on F5 BIG-IP since TMOS 11.3.0. The following KB13452 outlines how it can be configured. In this scenario (from the KB article) the BIG-IP is terminating the SSL connection. Not all clients support SNI and you will always need to specify a “fallback” profile that will be used if a SNI name is not used or matched. The next example will look at how to use SNI without terminating the SSL connection. SNI Routing Occasionally you may have the need to have a hybrid configuration of terminating SSL connections on the BIG-IP and sending connections without terminating SSL. One method is to create two separate virtual servers, one for SSL connections that the BIG-IP will handle (using clientssl profile) and one that it will not handle SSL (just TCP). This works OK for a small number of backends, but does not scale well if you have many backends (run out of Public IP addresses). Using SNI Routing we can handle everything on a single virtual server / Public IP address. There are 3 methods for performing SNI Routing with BIG-IP iRule with binary scan a. Article by Colin Walker code attribute to Joel Moses b. Code Share by Stanislas Piron iRule with SSL::extensions Local Traffic Policy Option #1 is for folks that prefer complete control of the TLS protocol. It only requires the use of a TCP profile. Options #2 and #3 only require the use of a SSL persistence profile and TCP profile. SNI Routing with Local Traffic Policy We will skip option #1 and #2 in this article and look at using a Local Traffic Policy for SNI Routing. For a review of Local Traffic Policies check out the following DevCentral articles: LTM Policy Jan 2015 Simplifying Local Traffic Policies in BIG-IP 12.1 June 2016 In previous articles about Local Traffic Policies the focus was on routing HTTP traffic, but today we will use it to route SSL connections using SNI. In the following example, using a Local Traffic Policy named “sni_routing”, we are setting a condition on the SSL Extension “servername” and sending the traffic to a pool without terminating the SSL connection. The pool member could be another server or another BIG-IP device. The next example will forward the traffic to another virtual server that is configured with a clientssl profile. This uses VIP targeting to send traffic to another virtual server on the same device. In both examples it is important to note that the “condition”/“action” has been changed from occurring on “request” (that maps to a HTTP L7 request) to “ssl client hello”. By performing the action prior to any L7 functions occurring, we can forward the traffic without terminating the SSL connection. The previous example policy, “sni_routing”, can be attached to a Virtual Server that only has a TCP profile and SSL persistence profile. No HTTP or clientssl profile is required! This method can also be used to solve the issue of how to consolidate multiple SSL virtual servers behind a single virtual server that have different APM and/or ASM policies. This is similar to the architecture that is used by the Container Connector for Cloud Foundry; in creating a two-tier load balancing solution on a single device. Routed Correctly? TLS 1.3 has interesting proposals on how to obscure the servername (TLS in TLS?), but for now this is a useful and practical method of handling multiple SSL certs on a single IP. In the future this may still be possible as well with TLS 1.3. For example the use of a HTTP Fronting service could be a tier 1 virtual server (this is just my personal speculation, I have not tried, at the time of publishing this was still a draft proposal). In other news it has been demonstrated that a combination of using SNI and a different host header can be used for “domain fronting”. A method to enforce consistent policy (prevent domain fronting) would be to layer in additional conditions that match requested SNI servername (TLS extension) with requested HOST header (L7 HTTP header). This would help enforce that a tenant is using a certificate that is associated with their application and not “borrowing” the name and certificate that is being used by an adjacent service. We don’t think of a TLS extension as an attribute that can be used to route application traffic, but it is useful and possible on BIG-IP.27KViews0likes17CommentsSSL Orchestrator Advanced Use Cases: Client Certificate Constrained Delegation (C3D) Support
Introduction F5 BIG-IP is synonymous with "flexibility". You likely have few other devices in your architecture that provide the breadth of capabilities that come native with the BIG-IP platform. And for each and every BIG-IP product module, the opportunities to expand functionality are almost limitless. In this article series we examine the flexibility options of the F5 SSL Orchestrator in a set of "advanced" use cases. If you haven't noticed, the world has been steadily moving toward encrypted communications. Everything from web, email, voice, video, chat, and IoT is now wrapped in TLS, and that's a good thing. The problem is, malware - that thing that creates havoc in your organization, that exfiltrates personnel records to the Dark Web - isn't stopped by encryption. TLS 1.3 and multi-factor authentication don't eradicate malware. The only reasonable way to defend against it is to catch it in the act, and an entire industry of security products are designed for just this task. But ironically, encryption makes this hard. You can't protect against what you can't see. F5 SSL Orchestrator simplifies traffic decryption and malware inspection, and dynamically orchestrates traffic to your security stack. But it does much more than that. SSL Orchestrator is built on top of F5's BIG-IP platform, and as stated earlier, is abound with flexibility. SSL Orchestrator Use Case: Client Certificate Constrained Delegation (C3D) Using certificates to authenticate is one of the oldest and most reliable forms of authentication. While not every application supports modern federated access or multi-factor schemes, you'll be hard-pressed to find something that doesn't minimally support authentication over TLS with certificates. And coupled with hardware tokens like smart cards, certificates can enable one of the most secure multi-factor methods available. But certificate-based authentication has always presented a unique challenge to security architectures. Certificate "mutual TLS" authentication requires an end-to-end TLS handshake. When a server indicates a requirement for the client to submit its certificate, the client must send both its certificate, and a digitally-signed hash value. This hash value is signed (i.e. encrypted) with the client's private key. Should a device between the client and server attempt to decrypt and re-encrypt, it would be unable to satisfy the server's authentication request by virtue of not having access to the client's private key (to create the signed hash). This makes encrypted malware inspection complicated, often requiring a total bypass of inspection to sites that require mutual TLS authentication. Fortunately, F5 has an elegant solution to this challenge in the form of Client Certificate Constrained Delegation, affectionally referred to as "C3D". The concept is reasonably straightforward. In very much the same way that SSL forward proxy re-issues a remote server certificate to local clients, C3D can re-issue a remote client certificate to local servers. A local server can continue to enforce secure mutual TLS authentication, while allowing the BIG-IP to explicitly decrypt and re-encrypt in the middle. This presents an immediate advantage in basic load balancing, where access to the unencrypted data allows the BIG-IP greater control over persistence. In the absence of this, persistence would typically be limited to IP address affinity. But of course, access to the unencrypted data also allows the content to be scanned for malicious malware. C3D actually takes this concept of certificate re-signing to a higher level though. The "constrained delegation" portion of the name implies a characteristic much like Kerberos constrained delegation, where (arbitrary) attributes can be inserted into the re-signed token, like the PAC attributes in a Kerberos ticket, to inform the server about the client. Servers for their part can then simply filter on client certificates issued by the BIG-IP (to prevent direct access), and consume any additional attributes in the certificate to understand how better to handle the client. With C3D you can maintain strong mutual TLS authentication all the way through to your servers, while allowing the BIG-IP to more effectively manage availability. And combined with SSL Orchestrator, C3D can enable decryption and inspection of content for malware inspection. This article describes how to configure SSL Orchestrator to enable C3D for inbound decrypted inspection. Arguably, most of what follows is the C3D configuration itself, as the integration with SSL Orchestrator is pretty simple. Note that Client Certificate Constrained Delegation (C3D) is included with Local Traffic Manager (LTM) 13.0 and beyond, but for integration with SSL Orchestrator you should be running 14.1 or later.To learn more about C3D, please see the following resources: K14065425: Configuring Client Certificate Constrained Delegation (C3D): https://support.f5.com/csp/article/K14065425 Manual Chapter: SSL Traffic Management: https://techdocs.f5.com/en-us/bigip-15-1-0/big-ip-system-ssl-administration/ssl-traffic-management.html#GUID-B4D2529E-D1B0-4FE2-8C7F-C3774ADE1ED2 SSL::c3d iRule reference - not required to use C3D, but adds powerful functionality https://clouddocs.f5.com/api/irules/SSL__c3d.html The integration of C3D with SSL Orchestrator involves effectively replacing the client and server SSL profiles that the SSL Orchestrator topology creates, with C3D SSL profiles. This is done programmatically with an iRule, so no "non-strict" customization is required at the topology. Also note that an inbound (reverse proxy) SSL Orchestrator topology will take the form of a "gateway mode" deployment (a routed path to multiple applications), or "application mode" deployment (a single application instance hosted at the BIG-IP). See section 2.5 of the SSL Orchestrator deployment guide for a deeper examination of gateway and application modes: https://clouddocs.f5.com/sslo-deployment-guide/ The C3D integration is only applicable to application mode deployments. Configuration C3D itself involves the creation of client and server SSL profiles: Create a new Client SSL profile: Configuration Certificate Key Chain: public-facing server certificate and private key. This will be the certificate and key presented to the client on inbound request. It will likely be the same certificate and key defined in the SSL Orchestrator inbound topology. Client Authentication Client Certificate: require Trusted Certificate Authorities: bundle that can validate client certificate. This is a certificate bundle used to verify the client's certificate, and will contain all of the client certificate issuer CAs. Advertised Certificate Authorities: optional CA hints bundle. Not expressly required, but this certificate bundle is forwarded to the client during the TLS handshake to "hint" at the correct certificate, based on issuer. Client Certificate Constrained Delegation Client Certificate Constrained Delegation: Enabled Client Fallback Certificate (new in 15.1): option to select a default client certificate if client does not send one. This option was introduced in 15.1 and provides the means to select an alternate (local) certificate if the client does not present one. The primary use case here might be to select a "template" certificate, and use an iRule function to insert arbitrary attributes. OCSP: optional client certificate revocation control. This option defines an OCSP revocation provider for the client certificate. Unknown OCSP Response Control (new in 15.1): determines what happens when OCSP returns Unknown. If an OCSP revocation provider is selected, this option defines what to do if the response to the OCSP query is "unknown". Create a new Server SSL profile: Configuration Certificate: default.crt. The certificate and key here are used as "templates" for the re-signed client certificate. Key: default.key Client Certificate Configuration Delegation Client Certificate Constrained Delegation: Enabled CA Certificate: local forging CA cert. This is the CA certificate used to re-sign the client certificate. This CA must be trusted by the local servers. CA Key: local forging CA key CA Passphrase: optional CA passphrase Certificate Extensions: extensions from the real client cert to be included in the forged cert. This is the list of certificate extensions to be copied from the original certificate to the re-issued certificate. Custom Extension: additional extensions to copy to forged cert from real cert (OID). This option allows you to insert additional extensions to be copied, as OID values. Additional considerations: Under normal conditions, the F5 and backend server attempt to resume existing SSL sessions, whereby the server doesn’t send a Certificate Request message. The effect is that all connections to the backend server use the same forged client cert. There are two ways to get around this: Set a zero-length cache value in the server SSL profile, or Set server authentication frequency to ‘always’ in the server SSL profile CA certificate considerations: A valid signing CA certificate should possess the following attributes. While it can work in some limited scenarios, a self-signed server certificate is generally not an adequate option for the signing CA. keyUsage: certificate extension containing "keyCertSign" and "digitalSignature" attributes basicConstraints: certificate extension containing "CA = true" (for Yes), marked as "Critical" With the client and server SSL profiles built, the C3D configuration is basically done. To integrate with an inbound SSL Orchestrator topology, create a simple iRule and add it to the topology's Interception Rule configuration. Modify the SSL profile paths below to reflect the profiles you created earlier. ### Modify the SSL profile paths below to match real C3D SSL profiles when CLIENT_ACCEPTED priority 250 { ## set clientssl set cmd1 "SSL::profile /Common/c3d-clientssl" ; eval $cmd1 } when SERVER_CONNECTED priority 250 { ## set serverssl SSL::profile "/Common/c3d-serverssl" } In the SSL Orchestrator UI, either from topology workflow, or directly from the corresponding Interception Rule configuration, add the above iRule and deploy. The above iRule programmatically overrides the SSL profiles applied to the Interception Rule (virtual server), effectively enabling C3D support. At this point, the virtual server will request a client certificate, perform revocation checks if defined, and then mint a local copy of the client certificate to pass to the backend server. Optionally, you can insert additional certificate attributes via the server SSL profile configuration, or more dynamically through additional iRule logic: ### Added in 15.1 - allows you to send a forged cert to the server ### irrespective of the client side authentication (ex. APM SSO), ### and insert arbitrary values when SERVERSSL_SERVERCERT { ### The following options allow you to override/replace a submitted ### client cert. For example, a minted client certificate can be sent ### to the server irrespective of the client side authentication method. ### This certificate "template" could be defined locally in the iRule ### (Base64-encoded), pulled from an iFile, or some other certificate source. # set cert1 [b64decode "LS0tLS1a67f7e226f..."] # set cert1 [ifile get template-cert] ### In order to use a template cert, it must first be converted to DER format # SSL::c3d cert [X509::pem2der $cert1] ### Insert arbitrary attributes (OID:value) SSL::c3d extension 1.3.6.1.4.1.3375.3.1 "TEST" } If you've configured the above, a server behind SSL Orchestrator that requires mutual TLS authentication can receive minted client certificates from external users, and SSL Orchestrator can explicitly decrypt and pass traffic to the set of malware inspection tools. You can look at the certificate sent to the server by injecting a tcpdump packet between the BIG-IP and server, then open in Wireshark. tcpdump -lnni [VLAN] -Xs0 -w capture.pcap [additional filters] Finally, you might be asking what to do with certificate attributes injected by C3D, and really it depends on what the server can support. The below is a basic example in an Apache config file to block a client certificate that doesn't contain your defined attribute. <Directory /> SSLRequire "HTTP/%h" in PeerExtList("1.3.6.1.4.1.3375.3.1") RewriteEngine on RewriteCond %{SSL::SSL_CLIENT_VERIFY} !=SUCCESS RewriteRule .? - [F] ErrorDocument 403 "Delegation to SPN HTTP/%h failed. Please pass a valid client certificate" </Directory> And there you have it. In just a few steps you've configured your SSL Orchestrator to integrate with Client Certificate Constrained Delegation to support mutual TLS authentication, and along the way you have hopefully recognized the immense flexibility at your command. Updates As of F5 BIG-IP 16.1.3, there are some new C3D capabilities: C3D has been updated to encode and return the commonName (CN) found in the client certificate subject field in printableString format if possible, otherwise the value will be encoded as UTF8. C3D has been updated to support inserting a subject commonName (CN) via 'SSL::c3d subject commonName' command: when CLIENTSSL_HANDSHAKE { if {[SSL::cert count] > 0} { SSL::c3d subject commonName [X509::subject [SSL::cert 0] commonName] } } C3D has been updated to support inserting a Subject Alternative Name (SAN) and Authority Info Access (AIA) via 'SSL::c3d extension' commands: when CLIENTSSL_HANDSHAKE { ## Insert Subject Alternative Name (SAN) value SSL::c3d extension SAN "DNS:*.test-client.com, IP:1.1.1.1" ### Insert Authority Info Access (AIA) value SSL::c3d extension AIA "ocsp,https://ocsp.entrust.net.com; caIssuer, https://aia.entrust.net/l1m-chain256.cer" } C3D has been updated to add the Authority Key Identifier (AKI) extension to the client certificate if the CA certificate has a Subject Key Identifier (SKI) extension. Another interesting use case is copying the real client certificate Subject Key Identifier (SKI) to the minted client certificate. By default, the minted client certificate will not contain an SKI value, but it's easy to configure C3D to copy the origin cert's SKI by modifying the C3D server SSL profile. In the "Custom extension" field of the C3D section, add 2.5.29.14 as an available extension. As of F5 BIG-IP 17.1.0 (SSL Orchestrator 11.0), C3D has been integrated natively. Now, for a deployed Inbound topology, the C3D SSL profiles are listed in the Protocol Settings section of the Interception Rules tab. You can replace the client and server SSL profiles created by SSL Orchestrator, with C3D SSL profiles in the Interception Rules tab to support C3D. The C3D support is now extended to both Gateway and Application modes.4KViews2likes2CommentsBIG-IP LTM VE: Transfer your iRules in style with the iRule Editor
The new LTM VE has opened up the possibilities for writing, testing and deploying iRules in a big way. It’s easier than ever to get a test environment set up in which you can break things develop to your heart’s content. This is fantastic news for us iRulers that want to be doing the newest, coolest stuff without having to worry about breaking a production system. That’s all well and good, but what the heck do you do to get all of your current stuff onto your test system? There are several options, ranging from copy and paste (shudder) to actual config copies and the like, which all work fine. Assuming all you’re looking for though is to transfer over your iRules, like me, the easiest way I’ve found is to use the iRule editor’s export and import features. It makes it literally a few clicks and super easy to get back up and running in the new environment. First, log into your existing LTM system with your iRule editor (you are using the editor, right? Of course you are…just making sure). You’ll see a screen something like this (right) with a list of a bagillionty iRules on the left and their cool, color coded awesomeness on the right. You can go through and select iRules and start moving them manually, but there’s really no need. All you need to do is go up to the File –> Archive –> Export option and let it do its magic. All it’s doing is saving text files to your local system to archive off all of your iRuley goodness. Once that’s done, you can then spin up your new LTM VE and get logged in via the iRule editor over there. Connect via the iRule editor, and go to File –> Archive –> Import, shown below. Once you choose the import option you’ll start seeing your iRules popping up in the left-hand column, just like you’re used to. This will take a minute depending on how many iRules you have archived (okay, so I may have more than a few iRules in my collection…) but it’s generally pretty snappy. One important thing to note at his point, however, is that all of your iRules are bolded with an asterisk next to them. This means they are not saved in their current state on the LTM. If you exit at this point, you’ll still be iRuleless, and no one wants that. Luckily Joe thought of that when building the iRule editor, so all you need to do is select File –> Save All, and you’ll be most of the way home. I say most of the way because there will undoubtedly be some errors that you’ll need to clean up. These will be config based errors, like pools that used to exist on your old system and don’t now, etc. You can either go create the pools in the config or comment out those lines. I tend to try and keep my iRules as config agnostic as possible while testing things, so there aren’t a ton of these but some of them always crop up. The editor makes these easy to spot and fix though. The name of the iRule that’s having a problem will stay bolded and any errors in that particular code will be called out (assuming you have that feature turned on) so you can pretty quickly spot them and fix them. This entire process took me about 15 minutes, including cleaning up the code in certain iRules to at least save properly on the new system, and I have a bunch of iRules, so that’s a pretty generous estimate. It really is quick, easy and painless to get your code onto an LTM VE and get hacking coding. An added side benefit, but a cool one, is that you now have your iRules backed up locally. Not only does this mean you’re double plus sure that they won’t be lost, but it means the next time you want to deploy them somewhere, all you have to do is import from the editor. So if you haven’t yet, go download your BIG-IP LTM VE and get started. I can’t recommend it enough. Also make sure to check out some of the really handy DC content that shows you how to tweak it for more interfaces or Joe’s supremely helpful guide on how to use a single VM to run an entire client/LTM/server setup. Wicked cool stuff. Happy iRuling. #Colin1.4KViews0likes2CommentsVirtual-wire Configuration and Troubleshooting
A virtual wire(vWire) logically connects two interfaces or trunks, in any combination, to each other, enabling the BIG-IP system to forward traffic from one interface to the other, in either direction. This type of configuration is typically used for security monitoring, where the BIG-IP system inspects ingress packets without modifying them in any way. To deploy a BIG-IP system without making changes to other devices on your network, you can configure the system to operate strictly at Layer 2. By deploying a virtual wire configuration, you transparently add the device to the network without having to create self IP addresses or change the configuration of other network devices that the BIG-IP device is connected to. Topology Before vWire Deployment After vWire Deployment Few points about virtual wire configurations in general: vWire works in transparent mode,which means there is no packet modification The system bridges both tagged and untagged packets Neither VLANs nor MAC addresses change in Symmetry mode propagate virtual wire link: When enabled, the BIG-IP system changes the peer port state to down when the corresponding interface is disabled or down. If disabled, the BIG-IPsystem does not change the peer port state Configuring vWire in UI BIG-IP Navigate to Network>>Virtual Wire Select Create (upper right) Enter the values for interfaces added to the virtual wire Enter VLAN information and click on Add for every VLAN object created Recommended- Enable propagate virtual wire link status for detecting link failure Once all the selections are made and you are ready to implement, click on "Commit Changes to System": The resulting screen will look like the following: The resulting VLAN configuration will look as follows: Note: Be sure to configure an untagged VLAN on the relevant virtual wire interface to enable the system to correctly handle untagged traffic. Note that many Layer 2 protocols, such as Spanning Tree Protocol (STP), employ untagged traffic in the form of BPDUs. Configuring vWire in cli mode Configure interfaces to support virtual wire: tmsh modify net interface 1.1 port-fwd-mode virtual-wire tmsh modify net interface 1.2 port-fwd-mode virtual-wire Create all VLAN tag VLAN objects: tmsh create net vlan Direct_all_vlan_4096_1 tag 4096 interfaces add { 1.1 { tagged } } tmsh create net vlan Direct_all_vlan_4096_2 tag 4096 interfaces add { 1.2 { tagged } } Create specific (802.1Q tag 512) VLAN objects: tmsh create net vlan Direct_vlan_512_1 tag 512 interfaces add { 1.1 { tagged } } tmsh create net vlan Direct_vlan_512_2 tag 512 interfaces add { 1.2 { tagged } } Create VLAN Groups: tmsh create net vlan-group Direct_all_vlan members add { Direct_all_vlan_4096_1 Direct_all_vlan_4096_2 } mode virtual-wire tmsh create net vlan-group Direct_vlan_512 members add { Direct_vlan_512_1 Direct_vlan_512_2 } mode virtual-wire config save: tmsh save sys config partitions all vWire config with trunk with LACP and LACP Pass Through LACP Pass through feature tunnels LACP packets through trunks between switches. Configure an untagged VLAN on the virtual wire interface to tunnel LACP packets. Note: Propagate virtual wire link status should be enabled for LACP pass through mode.LACP Pass through and Propagate virtual wire link status is supported from 16.1.x Configuring LACP Pass Through Configuring interface to support in vwire mode: tmsh modify net interface 1.1 port-fwd-mode virtual-wire tmsh modify net interface 2.1 port-fwd-mode virtual-wire tmsh modify net interface 1.2 port-fwd-mode virtual-wire tmsh modify net interface 2.2 port-fwd-mode virtual-wire Configure trunk : tmsh create net trunk left_trunk_1 interfaces add { 1.1 2.1 } qinq-ethertype 0x8100 link-select-policy auto tmsh create net trunk right_trunk_1 interfaces add { 1.2 2.2 } qinq-ethertype 0x8100 link-select-policy auto Configure VLAN tagged and untagged interface: tmsh create net vlan left_vlan_1_4k tag 4096 interfaces add {left_trunk_1 {tagged}} tmsh create net vlan left_vlan_1 tag 31 interfaces add {left_trunk_1 {tagged}} tmsh create net vlan left_vlan_333 tag 333 interfaces add {left_trunk_1 {untagged}} tmsh create net vlan right_vlan_1_4k tag 4096 interfaces add {right_trunk_1 {tagged}} tmsh create net vlan right_vlan_1 tag 31 interfaces add {right_trunk_1 {tagged}} tmsh create net vlan right_vlan_333 tag 333 interfaces add {right_trunk_1 {untagged}} Create VLAN Group and enabled propagate-linkstatus: tmsh create net vlan-group vg_1_4k bridge-traffic enabled mode virtual-wire members add { left_vlan_1_4k right_vlan_1_4k } vwire-propagate-linkstatus enabled tmsh create net vlan-group vg_untagged bridge-traffic enabled mode virtual-wire members add { left_vlan_333 right_vlan_333 } vwire-propagate-linkstatus enabled tmsh create net vlan-group vg_1 bridge-traffic enabled mode virtual-wire members add { left_vlan_1 right_vlan_1 } vwire-propagate-linkstatus enabled configuring LACP(Active-Active) mode Configuring interface to support in vwire mode: tmsh modify net interface 1.1 port-fwd-mode virtual-wire tmsh modify net interface 1.2 port-fwd-mode virtual-wire tmsh modify net interface 2.1 port-fwd-mode virtual-wire tmsh modify net interface 2.2 port-fwd-mode virtual-wire Configure trunk in LACP Active mode : tmsh create net trunk left_trunk_1 interfaces add { 1.1 1.2 } qinq-ethertype 0x8100 link-select-policy auto lacp enabled lacp-mode active tmsh create net trunk right_trunk_1 interfaces add { 2.1 2.2 } qinq-ethertype 0x8100 link-select-policy auto lacp enabled lacp-mode active Configure VLAN tagged interface: tmsh create net vlan left_vlan_1_4k tag 4096 interfaces add {left_trunk_1 {tagged}} tmsh create net vlan left_vlan_1 tag 31 interfaces add {left_trunk_1 {tagged}} tmsh create net vlan right_vlan_1_4k tag 4096 interfaces add {right_trunk_1 {tagged}} tmsh create net vlan right_vlan_1 tag 31 interfaces add {right_trunk_1 {tagged}} Create VLAN Group and enabled propagate-linkstatus: tmsh create net vlan-group vg_1_4k bridge-traffic enabled mode virtual-wire members add { left_vlan_1_4k right_vlan_1_4k } vwire-propagate-linkstatus enabled tmsh create net vlan-group vg_1 bridge-traffic enabled mode virtual-wire members add { left_vlan_1 right_vlan_1 } vwire-propagate-linkstatus enabled DB variables for vWire: Trouble shooting vWire : 1 . Verify that traffic flowing through default Virtual Server(_vlangroup) Tcpdump cmd: tcpdump -nne -s0 -i 0.0:nnn 22:00:53.398116 00:00:00:00:01:31 > 33:33:00:00:00:05, ethertype 802.1Q (0x8100), length 139: vlan 31, p 0, ethertype IPv6, fe80::200:ff:fe00:131 > ff02::5: OSPFv3, Hello, length 40 out slot1/tmm9 lis=_vlangroup 22:00:53.481645 00:00:5e:00:01:01 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 91: vlan 31, p 0, ethertype IPv4, 10.31.0.3 > 224.0.0.18: VRRPv3, Advertisement, vrid 1, prio 150, intvl 100cs, length 12 out slot1/tmm4 lis=_vlangroup 2. Now create Virtual Server based on requirements like TCP, UDP and ICMP with Virtual Server name as test.Verify traffic is hitting Virtual Server Tcpdump cmd: tcpdump -nne -s0 -i 0.0:nnn 22:04:54.161197 3c:41:0e:9b:01:31 > 00:00:00:00:03:31, ethertype 802.1Q (0x8100), length 145: vlan 31, p 0, ethertype IPv4, 10.20.0.10 > 10.13.0.10: ICMP echo request, id 30442, seq 2, length 64 out slot4/tmm2 lis=/Common/test 22:05:14.126544 3c:41:0e:9b:01:31 > 00:00:00:00:03:31, ethertype 802.1Q (0x8100), length 121: vlan 31, p 0, ethertype IPv4, 10.20.0.10.41692 > 10.13.0.10.80: Flags [S], seq 2716535389, win 64240, options [mss 1460,sackOK,TS val 685348731 ecr 0,nop,wscale 7], length 0 out slot3/tmm8 lis=/Common/test 22:05:14.126945 3c:41:0e:9b:03:31 > 00:00:00:00:01:31, ethertype 802.1Q (0x8100), length 121: vlan 31, p 0, ethertype IPv4, 10.13.0.10.80 > 10.20.0.10.41692: Flags [S.], seq 1173350299, ack 2716535390, win 65160, options [mss 1460,sackOK,TS val 4074187325 ecr 685348731,nop,wscale 7], length 0 in slot3/tmm8 lis=/Common/test 3 . Trouble Shooting steps Get the tcpdump and check the traffic hitting Virtual Server or not If traffic is dropped, enable “tmsh modify sys db vlangroup.forwarding.override value enable” with destination as catch all and check whether traffic is hitting _vlangroup and going out or not. If traffic is going without any issue, then there is an issue with created virtual server. Even after enabling vlangroup.forwarding.override db variable, then take the output of below commands tmctl ifc_stats - displays interface traffic statistics tmctl ip_stat - displays ip traffic statistics tmctl ip6_stat - displays ipv6 traffic statistics vWire Behavior This table describes how the BIG-IP system handles certain conditions when the relevant interfaces are configured to use a virtual wire. The table also shows what actions you can take, if possible Notable Effects-Caveats When deploying a pair of BIG-IP’s in HA mode, the virtual wire configuration will create objects with different names on each BIG-IP. So for example, the creation of vwire_lab01 will result in the creation of VLAN objects vwire_lab01_1_567 and vwire_lab01_2_567 on one BIG-IP, while the other BIG-IP will have vwire_lab01_1_000 and vwire_lab01_2_000 in its configuration. For modules like SSL Orchestrator, or in cases where a Virtual Server needs to be associated with a specific VLAN, the numbering is problematic. The administrator will not be able to associate the topology or Virtual Server to one VLAN object (vwire_lab01_2_567) on the first BIG-IP and the other VLAN object (vwire_lab01_2_000) on the peer BIG-IP. (this is not possible for a number of reasons, one of which is the way configurations are synchronized between BIG-IP devices) Q-in-Q is not supported in a virtual wire configuration Virtual wire feature is not supported on Virtual Clustered Multiprocessing (vCMP) Active/Active deployment is not supported vWire is not supported on virtual edition(VE) Conclusion BIG-IP in Virtual Wire can be deployed in any network without any network design or configuration changes, as it works in L2 transparent mode. L2Transparency Caveats There are few caveats with respect to L2 Transparency OSPF neighborship struck in exstart state. BGP neighborship won’t come with MD5 authentication OSPF neighborship struck in exstart state In transparent mode when standard Virtual server is configured, the VS will process the DBD packet with this the TTL value become zero and the OSPF neighborship will struck at Exstart state. To solve the above problem, we need to configure a profile to preserve the TTL value and attach the profile to the virtual server. Below are the steps to configure the profile and the virtual server. Same steps can be configured for both vwire and vlangroup Create a profile to preserve TTL Click on create and enter the profile name as TTL and select to preserve Now attach the profile under ipother After attaching the profile the OSPF neighborship will come up. BGP neighborship won’t come with MD5 authentication In transparent mode when standard Virtual server is configured, the VS will process the BGP packet and will reply back to the tcp connection without MD5 with BGP wont come up between two devices To solve the above problem, we need to configure a profile to support Md5 authentication and attach the profile to the virtual server. Below are the steps to configure the profile and the virtual server. Same steps can be configured for both Vwire and Vlan-group Create a profile to support md5 authentication Create a profile with name md5. Enable md5 authentication and provide md5 authentication password. Attach the MD5 profile to virtual server After attaching the md5 policy the BGP neighborship will up.4.3KViews6likes1Comment7. SYN Cookie: Troubleshooting Stats
Introduction In this article I will explain what SYN Cookie stats you can consult and their meaning. There are more complex stats than the explained in this article but they are intended for helping F5 engineers when cases become complex. SYN Cookie stats First at all, in order to troubleshoot SYN Cookie you need to know how you can check SYN Cookie stats easily and understand what you are reading. The easiest way to see these stats in a device running LTM module based SYN Cookie is by running below command and focusing in SYN Cookies section of the output: # tmsh show ltm virtual <virtual> SYN Cookies Status full-hardware Hardware SYN Cookie Instances 6 Software SYN Cookie Instances 0 Current SYN Cache 2.0K SYN Cache Overflow 24 Total Software 4.3M Total Software Accepted 0 Total Software Rejected 0 Total Hardware 21.7M Total Hardware Accepted 3 Let’s go through each field to know what they means: Status: [full-software/full-hardware|not-activated]. This value describes what type of SYN Cookie mode has been activated, software or hardware. Once an attack has finished it is normal that LTM takes some time to deactivate SYN Cookie mode after attack traffic stops, see the second article in this series (SYN Cookie Operation). Note that we do not want SYN Cookie entering in an activating/deactivating loop in case TCP SYN packets per second reaching device are near to the configured SYN cache threshold. So delay of 30-60 seconds before SYN Cookie being deactivated is in normal range. Hardware SYN Cookie Instances: How many TMMs are under Hardware SYN Cookie mode. Software SYN Cookie Instances: How many TMMs are under Software SYN Cookie mode. It indicates if software is currently generating SYN cookies. Current SYN Cache: Indicates how many embryonic connections are handled by BIG-IP (refreshed every 2 seconds). SYN cache is always counting embryonic connections, regardless if SYN Cookie is activated or not. So even if SYN Cookies is completely turned off, we still have embryonic flows that have not been promoted to full flows yet (SYNs which has not completed 3WHS), this means that cache counter will still be used regularly, so is normal having a value different than 0. Disabling SYN Cookie will not avoid this counter increase but thresholds will not be taken into account. Since SYN cache is no longer a cache, as explained in prior articles, and the stat is merely a counter of embryonic flows, it does not consume memory resources. As the embryonic flows are promoted or time out, this value will decrease. SYN Cache Overflow (per TMM): It is incremented whenever the SYN cache threshold is exceeded and SYN cookies need to be generated. It increments in one per each TMM instance and this value is a counter that only increases, so we can consult the value to know how many times SYN Cookie has been activated since last stats reset, remember, per TMM. Total Software: Number of challenges generated by Software. This is increased regardless if client sends a response, does not send any response or the response is not correct. Total Software Accepted: Number of TCP handshakes that were correctly handled with clients. Total Software Rejected: Number of wrong responses to challenges. Remember that ALL rejected SYN cookies are in software, there is no hardware rejected. Total Hardware: Number of challenges generated by Hardware. Total Hardware Accepted: Number of TCP handshakes that were correctly handled with clients. In case you have configured AFM based SYN Cookie then you can use two easy sources of information, the already explained LTM command above, but also you can check stats for TCP half open DoS vector, at device or virtual server context, as you would do with any other DoS vector. Let’s check the most important fields at device context. # tmsh show security dos device-config | grep -A 40 half Statistics Type Count Status Ready Attack Detected 1 *Attacked Dst Detected 0 *Bad Actor Attack Detected 0 Aggregate Attack Detected 1 Attack Count 2 *Stats 1h Samples 0 Stats 408 *Stats Rate 408 Stats 1m 104 Stats 1h 0 Drops 1063 *Drops Rate 1063 Drops 1m 187 Drops 1h 4 *Int Drops 0 *Int Drops Rate 0 *Int Drops 1m 0 *Int Drops 1h 0 Status: This field confirm if this specific DoS vector is ready to detect TCP SYN flood attacks. You have to take into account that you could decide to configure this DoS vector with Auto-threshold, in which case AFM would be in charge of deciding the best threshold. Note that by enabling auto-threshold AFM would need some time to learn the traffic pattern of your environment, until it has enough information to create a correct threshold you will not see this field as Ready but as Learning. If vector is configured manually you always see it as Ready. Attack detected: It informs you if currently there is an ongoing attack and detected by AFM. Aggregate attack detected: Since DoS stats shown above are for device context the detected attack is aggregate. Attack Count: Gives information about how many attacks have been detected since stats were reset last time. It does not decrease. Stats: Number of embryonic connections at this current second. Remember that since this is a snapshot, the counter could go increase or decrease. This is different to other DoS vectors where it only increases. Stats 1m: Average of the Stats in the last minute. This is the average number of embryonic connections that AFM has seen when taking sampling every 1s. Stats 1h: Average of the Stats 1m in the last hour. Drops: It counts the number of wrong ACKs received. In other words this is the current snapshot of: Number of SYN Cookies – Number of validated ACKs received Drops 1m: Average of the Drops in the last minute Drops 1h: Average of the Drops 1m in the last hour I have added an asterisk to some fields to point to values that has not real meaning for SYN Cookie DoS vector, or information is not really useful from my perspective, but they are included to have coherence with other DoS attack stats information. Note: Detection logic for this vector is not based on the Stats 1m, as other DoS vectors, instead it is based on the current number of embryonic connections, that is, value seen in Stats counter. Interpreting stats Once you know what each important stat mean I will give some advises when you interpret SYN Cookie stats. You should know some of them already if you have read all articles in this series: Hardware can offload TMM for validation, so you can see a number of software generated SYN Cookies much bigger than software validated SYN Cookie since software generates cookies that are validated by hardware as well. Since it is possible that a software generated SYN Cookie be accepted by hardware and vice versa, hardware generated SYN Cookies be accepted by software, you could think that value for Total Software is the same than the result of adding the Total Software Accepted plus Total Software rejected. But that is NOT true since it can be possible that a SYN Cookie sent by BIG-IP does not have a response (ACK), quite typical during a SYN flood attack indeed. In this case nothing adds up because generated SYN Cookie was not accepted nor rejected. Remember that SYN cookie’s responses discarded by hardware will be rejected by software, so all rejects are in software. This is why there is not counter for Total Hardware rejected. This means that Total Software Rejected can be increase by Total Hardware. When there is an attack vectors definition that match this attack will be increased. So, for example, during a TCP SYN flood attack you will see that Stats increases for TCP half open and TCP SYN flood (maybe others like low TTL,... as well). Also Stats will increase in all contexts (device and virtual server). The first vector in an specific context whose limit is exceeded it will start to mitigate. This is important because in cases where TCP SYN flood DoS vector has a lower threshold than TCP half open DoS vector, you will notice that traffic is dropped but you did not expect this behavior. Check article 5 for more information. Example In this part you will learn what stats changes you should expect to see when a TCP SYN flood attack is detected and SYN Cookie is activated, so it starts to mitigate the attack. Let’s interpret below stats: During attack Status full-hardware Hardware SYN Cookie Instances 6 Software SYN Cookie Instances 0 Current SYN Cache 1 SYN Cache Overflow 24 Total Software 10.1K Total Software Accepted 0 Total Software Rejected 0 Total Hardware 165.1M Total Hardware Accepted 3 After attack Status not-activated Hardware SYN Cookie Instances 0 Software SYN Cookie Instances 0 Current SYN Cache 15 SYN Cache Overflow 24 Total Software 10.1K Total Software Accepted 0 Total Software Rejected 0 Total Hardware 171.0M Total Hardware Accepted 3 Before the attack, before SYN Cookie is activated, you will see Current SYN Cache stats starts to increase quickly since TCP SYN packets are causing an increment of embryonic connections. Once SYN Cache threshold has been reached in a TMM then SYN Cache Overflow will increase attending to the number of TMMs that detected the attack. In above example you see 24 because this is a 6 CPUs device and 4 TCP SYN Flood attacks were detected by SYN Cookie. Remember this counter will never decrease unless we reset stats. During attack SYN cookie is activated so Current SYN Cache will start to decrease until reaching 0 because SYN Cookie Agent starts to handle TCP 3WHS, in other words, TCP stack stops to receive TCP SYN packets. We can check how many TMMs have SYN cookie activated currently looking at Hardware/Software SYN Cookie Instances counter. Note this match with what I explained for SYN Cache Overflow value. Legitimate connections are counted under Total Hardware/Software Accepted. In this case, we can see that although several millions of TCP SYN packets reached the device only 3 TCP 3WHS were correctly carried out. As you can see this device has Hardware SYN Cookie configured and working, but you can read that software also generates SYN Cookie challenges (Total Software). As commented in article number six of these series, this is expected and can be due to collisions or due to validations of first challenges when SYN Cookie is activated and TMMs handles TCP SYN packets until it enables hardware to do it. This does not affect device performance. In order to change from ‘Status full-hardware’ to ‘Status not-activated’ both HSBs must exit from SYN Cookie mode. Conclusion Now you know how to interpret stats, so you know can deduce information about the past and the present status of your device related to SYN Cookie. In next two article I will end up this series and this troubleshooting part talking about traffic captures and meaning of logs.3.1KViews1like5Comments2. SYN Cookie: Operation
Introduction As concluded in the last article, in order to avoid allocating space for TCB, the attacked device needs to reject TCP SYN packets sent by clients. In this article I will explain how a system can do this without causing service disruption, and more specifically I will explain how this work in BIG-IP. Implementation Since attacked device need to alter default TCP 3WHS behaviour the best option is implementing SYN Cookie countermeasure in an intermediate device, so you offload your servers from this task. In this way if connection is not legitimate then it is just never forwarded to the backend server and therefore it will not waste any kind of extra resourcess. Since BIG-IP is already in charge of handling application traffic directed towards servers it is the best place to implement SYN Cookie. What BIG-IP does is adding an extra layer, which we can call SYN Cookie Agent, that basically implement a stateless buffer between client and BIG-IP TCP stack. This agent is in charge of handling client’s TCP SYN packets by modifying slightly the standard behaviour of TCP 3WHS, this modification comes in two flavours depending on what type of routing role BIG-IP is playing between client and server. Symmetric routing This is the typical situation. In this environment BIG-IP is sited in the middle of the TCP conversation and all the traffic from/to server pass through it. The SYN Cookie operation in this case is briefly described below: Client sends a TCP SYN packet to BIG-IP. BIG-IP uses a stateless buffer for answering each client SYN with a SYN/ACK. BIG-IP generates the 32 bits sequence number which will be included in SYN/ACK packet sent to the client. BIG-IP encodes essential and mandatory information of the connection in 24 bits. BIG-IP hashes the previous encoded information. BIG-IP also encodes other values like MSS in the remaining bits. BIG-IP generates the SYN/ACK response packet and includes the hash and other encoded information as sequence number for the packet. This is the so called SYN Cookie. BIG-IP sends SYN/ACK to client. BIG-IP discards the SYN from the stateless buffer, in other words, BIG-IP removes all information related to this TCP connection. At this point no memory has been allocated for TCP connection (TCB). Client sends correct ACK to BIG-IP acknowledging BIG-IP’s sequence number. BIG-IP validates ACK. BIG-IP subtracts one to this ACK. BIG-IP runs the hash function using connection information as input (see point 3-b above). Then it compares it with the hash provided in the ACK, if they match then it means client sent a correct SYN Cookie response and that client is legitimate. BIG-IP uses connection information in the ACK TCP/IP headers to create an entry in connflow. At this time is when BIG-IP builds TCB entry, so it's the first time BIG-IP uses memory to save connection information. BIG-IP increase related SYN Cookie stats. BIG-IP starts and complete a TCP 3WHS with backend server on behalf of the client since we are sure that client is legitimate. Now traffic for the connection is proxied by BIG-IP as usual attending to configured L4 profile in the virtual server. *If ACK packet received by BIG-IP is a spurious ACK then BIG-IP will discard it and no entry will be created in connection table. Since attackers will never send a correct response to a SYN/ACK then you can be sure that TCB entries are never created for them. Only legitimate clients will use BIG-IP resources as shown in diagram: Fig4. TCP SYN Flood attack with SYN Cookie countermeasure Asymmetric routing In an asymmetric environment, also called nPath or DSR, you face a different problem because Big-IP cannot establish a direct TCP 3WHS with the server (point 11 in last section). As you can see in below diagram SYN/ACK packet sent from server to client would not traverse Big-IP, so method used for symmetric routing cannot be used in this case. You need another way to trust in clients and discard them as attackers, so clients then can complete TCP 3WHS directly with the server. Fig5. TCP state diagram section for asymmetric routing + FastL4 In order to circumvent this problem BIG-IP takes the advantage of the fact that applications will try to start a second TCP connection if a first TCP 3WHS is RST by the server. What BIG-IP does in asymmetric routing environments is completing the TCP three way handshake with client, issuing SYN Cookie as I described for symmetric routing, and once BIG-IP confirms that client is trustworthy it will add its IP to SYN Cookie Whitelist, it will send a RST to the client and it will close the TCP connection. At this point client will try to start a new connection and this time BIG-IP will let the client talk directly to the server as it would do under normal circumstances (see diagram above). The process is briefly described below: Client sends a TCP SYN packet to BIG-IP. BIG-IP generates the sequence number as explained for symmetric routing. BIG-IP generates the SYN/ACK response packet and includes the calculated sequence number for the packet. BIG-IP sends SYN/ACK to client and then remove all information related to this TCP connection. At this point no memory has been allocated for TCP connection (TCB). Client sends correct ACK to BIG-IP acknowledging sequence number. BIG-IP validates ACK. BIG-IP substract one to this ACK so it can decode needed information and check hash. BIG-IP increase related SYN Cookie stats. BIG-IP sends a RST to client and discards TCP connection. Big-IP adds client’s IP to Whitelist. Clients starts a new TCP connection. BIG-IP lets client to start this new TCP connection directly to the server since it knows that server is legitimate. *If ACK packet received by BIG-IP is a spurious ACK then BIG-IP will discard it and no entry will be created in connection table. Fig6. TCP 3WHS flow in asymmetric routing DSR whitelist has some important characteristics you need to take into account: By default IP is added 30 seconds to the whitelist (DB Key tm.flowstate.timeout). Client’s IP is added to whitelist for 30 seconds but only if there is no traffic from this client, if BIG-IP have already seen ACK from this client then its IP is removed from whitelist since connection has been already established between client and server. Whitelist is common to all virtual servers, so if a client is whitelisted it will be for all applications. Whitelist it is not mirrored. Whitelist it is not shared among blades. SYN Cookie Challenge When a TCP connection is initiated the TCP SYN packet sent by the client specify certain values that define the connection, some of these values are mandatory, like source and destination IP and port, otherwise you would not be able to identify the correct connection when packets arrive. Some other values are optional and they are used to improve performance, like TCP options or WAN optimizations. Under normal circumstances, upon TCP SYN reception the system creates a TCB entry where all this information is saved. BIG-IP will use this information to set up the TCP connection with the backend server. The problem we face is that when SYN Cookie is in play device does not create the TCB entry, only a limited piece of connection information is collected, so the information that BIG-IP has about the TCP connection is limited. Remember that SYN Cookie is not handled by TCP stack but by the stateless SYN Cookie Agent, so we cannot save the connection details. What BIG-IP does instead is encoding key information in 24 bits, this left only some bits for the rest of data. While there is no room for values like Window Scale information, however we have space for other values, for example 3 bits reserved for MSS value. This limits MSS possible values to 8, the most commonly used values, and the one chosen will the nearest to the original MSS value. However, note that MSS limitation has a workaround through configuration in FastL4 profile that it can help in some cases. The option SYN Cookie MSS in this profile specifies a value that overrides the SYN cookie maximum segment size (MSS) value in the SYN-ACK packet that is returned to the client. Valid values are 0, and values from 256 through 9162. The default is 0, which means no override. You might use this option if backend servers use a different MSS value for SYN cookies than the BIG-IP system does. If this is not enough for some customers, BIG-IP overcomes this problem by using TCP timestamp space to save extra information about TCP connection, in other words we create a second extra SYN Cookie. This space is used to record the client and server side Window Scale values, or the SACK info which are then made available to the TCP stack via this cookie if the connection is accepted. Note that TCP connection will only be accepted if both SYN Cookies (standard and Timestamp) are correct. So all this means that if the client starts a connection with TCP TS set then Big-IP will have more space to encode information and hence the performance will be improved when under TCP SYN flood attack. NOTE: SYN Cookie Timestamp extension (software or hardware) only work for standard virtual servers currently. FastL4 virtual servers only supports MMS option in SYN Cookie mode. Deactivating In order to deactivate SYN Cookie three conditions need to be met, these conditions are the same for LTM or AFM SYN Cookie, although stat names can vary: Embryonic connections must be below the configured SYN Cookie threshold. The ratio of invalid packets has not been exceeded. We define this ratio as a percentage saved in the DB key dos.syncookiedeactivate which by default is the 50% of the configured threshold (do not change this value if it is not requested by a F5 engineer). The second condition above has to be met three consecutive times as a method for avoiding SYN Cookie activation/deactivation flapping. BIG-IP will receive information about the number of invalid packets every 7 seconds, so this means that at least 21 seconds are required to deactivate SYN Cookie assuming that the second condition is constantly met. Note the difference between embryonic connections and invalid connections in the above conditions: Embryonic connections are connections pending to be confirmed with a last ACK. Invalid connections include any connection that did not finish the TCP 3WHS with a correct last ACK, that is, connections for which BIG-IP did not receive the last ACK, or for which BIG-IP did receive a wrong ACK. Finally remember that for Hardware SYN Cookie all HSBs connected to TMM must exit from SYN Cookie mode in order to consider that the TMM exit SYN Cookie. And also all TMMs must exit from SYN Cookie mode in order to deactivate SYN Cookie completely for that context. Conclusion Now you know how SYN Cookie works under the hoods. In next article I will describe when SYN Cookie is activated and I will give specific details of BIG-IP SYN Cookie operation. Note that limitations related to SYN Cookie commented in this article are not caused specifically by BIG-IP SYN Cookie implementation but by SYN Cookie standard itself. In fact, F5 Networks implements some improvements on SYN Cookie to get a better performance and provide some extra features.3.7KViews3likes3CommentsF5 Unveils New Built-In TCP Profiles
[Update 3/17: Some representative performance results are at the bottom] Longtime readers know that F5's built-in TCP profiles were in need of a refresh. I'm pleased to announce that in TMOS® version 13.0, available now, there are substantial improvements to the built-in profile scheme. Users expect defaults to reflect best common practice, and we've made a huge step towards that being true. New Built-in Profiles We've kept virtually all of the old built-in profiles, for those of you who are happy with them, or have built other profiles that derive from them. But there are four new ones to load directly into your virtual servers or use a basis for your own tuning. The first three are optimized for particular network use cases: f5-tcp-wan, f5-tcp-lan, and f5-tcp-mobile are updated versions of tcp-wan-optimized, tcp-lan-optimized, and tcp-mobile-optimized. These adapt all settings to the appropriate link types, except that they don't enable the very newest features. If the hosts you're communicating with tend to use one kind of link, these are a great choice. The fourth is f5-tcp-progressive. This is meant to be a general-use profile (like the tcp default), but it contains the very latest features for early adopters. In our benchmark testing, we had the following criteria: f5-tcp-wan, f5-tcp-lan, and f5-tcp-mobile achieved throughput at least as high, and often better, than the default tcp profile for that link type. f5-tcp-progressive had equal or higher throughput than default TCP across all representative network types. The relative performance of f5-tcp-wan/lan/mobile and progressive in each link type will vary given the new features that f5-tcp-progressive enables. Living, Read-Only Profiles These four new profiles, and the default 'tcp' profile, are now "living." This means that we'll continually update them with best practices as they evolve. Brand-new features, if they are generally applicable, will immediately appear in f5-tcp-progressive. For our more conservative users, these new features will appear in the other four living profiles after a couple of releases. The default tcp profile hasn't changed yet, but it will in future releases! These five profiles are also now read-only, meaning that to make modifications you'll have to create a new profile that descends from these. This will aid in troubleshooting. If there are any settings that you like so much that you never want them to change, simply click the "custom" button in the child profile and the changes we push out in the future won't affect your settings. How This Affects Your Existing Custom Profiles If you've put thought into your TCP profiles, we aren't going to mess with it. If your profile descends from any of the previous built-ins besides default 'tcp,' there is no change to settings whatsoever. Upgrades to 13.0 will automatically prevent disruptions to your configuration. We've copied all of the default tcp profile settings to tcp-legacy, which is not a "living" profile. All of the old built-in profiles (like tcp-wan-optimized), and any custom profiles descended from default tcp, will now descend instead from tcp-legacy, and never change due to upgrades from F5. tcp-legacy will also include any modifications you made to the default tcp profile, as this profile is not read-only. Our data shows that few, if any, users are using the current (TMOS 12.1 and before) tcp-legacy settings.If you are, it is wise to make a note of those settings before you upgrade. How This Affects Your Existing Virtual Servers As the section above describes, if your virtual server uses any profile other than default 'tcp' or tcp-legacy, there will be no settings change at all. Given the weaknesses of the current default settings, we believe most users who use virtuals with the TCP default are not carefully considering their settings. Those virtuals will continue to use the default profile, and therefore settings will begin to evolve as we modernize the default profile in 13.1 and later releases. If you very much like the default TCP profile, perhaps because you customized it when it wasn't read-only, you should manually change the virtual to use tcp-legacy with no change in behavior. Use the New Profiles for Better Performance The internet changes. Bandwidths increase, we develop better algorithms to automatically tune your settings, and the TCP standard itself evolves. If you use the new profile framework, you'll keep up with the state of the art and maximize the throughput your applications receive. Below, I've included some throughput measurements from our in-house testing. We used parameters representative of seven different link types and measured the throughput using some relevant built-in profiles. Obviously, the performance in your deployment may vary. Aside from LANs, where frankly tuning isn't all that hard, the benefits are pretty clear.5.2KViews1like10CommentsUsing BIG-IP GTM to Integrate with Amazon Web Services
This is the latest in a series of DNS articles that I've been writing over the past couple of months. This article is taken from a fantastic solution that Joe Cassidy developed. So, thanks to Joe for developing this solution, and thanks for the opportunity to write about it here on DevCentral. As a quick reminder, my first six articles are: Let's Talk DNS on DevCentral DNS The F5 Way: A Paradigm Shift DNS Express and Zone Transfers The BIG-IP GTM: Configuring DNSSEC DNS on the BIG-IP: IPv6 to IPv4 Translation DNS Caching The Scenario Let's say you are an F5 customer who has external GTMs and LTMs in your environment, but you are not leveraging them for your main website (example.com). Your website is a zone sitting on your windows DNS servers in your DMZ that round robin load balance to some backend webservers. You've heard all about the benefits of the cloud (and rightfully so), and you want to move your web content to the Amazon Cloud. Nice choice! As you were making the move to Amazon, you were given instructions by Amazon to just CNAME your domain to two unique Amazon Elastic Load Balanced (ELB) domains. Amazon’s requests were not feasible for a few reasons...one of which is that it breaks the RFC. So, you engage in a series of architecture meetings to figure all this stuff out. Amazon told your Active Directory/DNS team to CNAME www.example.com and example.com to two AWS clusters: us-east.elb.amazonaws.com and us-west.elb.amazonaws.com. You couldn't use Microsoft DNS to perform a basic CNAME of these records because of the BIND limitation of CNAME'ing a single A record to multiple aliases. Additionally, you couldn't point to IPs because Amazon said they will be using dynamic IPs for your platform. So, what to do, right? The Solution The good news is that you can use the functionality and flexibility of your F5 technology to easily solve this problem. Here are a few steps that will guide you through this specific scenario: Redirect requests for http://example.com to http://www.example.com and apply it to your Virtual Server (1.2.3.4:80). You can redirect using HTTP Class profiles (v11.3 and prior) or using a policy with Centralized Policy Matching (v11.4 and newer) or you can always write an iRule to redirect! Make www.example.com a CNAME record to example.lb.example.com; where *.lb.example.com is a sub-delegated zone of example.com that resides on your BIG-IP GTM. Create a global traffic pool “aws_us_east” that contains no members but rather a CNAME to us-east.elb.amazonaws.com. Create another global traffic pool “aws_us_west” that contains no members but rather a CNAME to us-west.elb.amazonaws.com. The following screenshot shows the details of creating the global traffic pools (using v11.5). Notice you have to select the "Advanced" configuration to add the CNAME. Create a global traffic Wide IP example.lb.example.com with two pool members “aws_us_east” and “aws_us_west”. The following screenshot shows the details. Create two global traffic regions: “eastern” and “western”. The screenshot below shows the details of creating the traffic regions. Create global traffic topology records using "Request Source: Region is eastern" and "Destination Pool is aws_us_east". Repeat this for the western region using the aws_us_west pool. The screenshot below shows the details of creating these records. Modify Pool settings under Wide IP www.example.com to use "Topology" as load balancing method. See the screenshot below for details. How it all works... Here's the flow of events that take place as a user types in the web address and ultimately receives the correct IP address. External client types http://example.com into their web browser Internet DNS resolution takes place and maps example.com to your Virtual Server address: IN A 1.2.3.4 An HTTP request is directed to 1.2.3.4:80 Your LTM checks for a profile, the HTTP profile is enabled, the redirect request is applied, and redirect user request with 301 response code is executed External client receives 301 response code and their browser makes a new request to http://www.example.com Internet DNS resolution takes place and maps www.example.com to IN CNAME example.lb.example.com Internet DNS resolution continues mapping example.lb.example.com to your GTM configured Wide IP The Wide IP load balances the request to one of the pools based on the configured logic: Round Robin, Global Availability, Topology or Ratio (we chose "Topology" for our solution) The GTM-configured pool contains a CNAME to either us_east or us_west AWS data centers Internet DNS resolution takes place mapping the request to the ELB hostname (i.e. us-west.elb.amazonaws.com) and gives two A records External client http request is mapped to one of the returned IP addresses And, there you have it. With this solution, you can integrate AWS using your existing LTM and GTM technology! I hope this helps, and I hope you can implement this and other solutions using all the flexibility and power of your F5 technology.3.2KViews1like14Comments9. SYN Cookie Troubleshooting: Logs
Introduction In this last article I will add the last piece of information you can check when troubleshooting TCP SYN Cookie attacks, logs. With this information together with all that you have learned until now you should be able to understand how SYN Cookie is behaving and decide if there is any change you should do in your configuration to improve it. Use cases LTM SYN Cookie at Global context Logs when Global SYN Check Threshold or Default Per Virtual Server SYN Check Threshold has been exceeded are similar, so in order to know in which context was SYN Cookie activated you need to focus on specific text in logs. For example, by having below config: turboflex profile feature => adc tmsh list sys db pvasyncookies.enabled => true tmsh list ltm global-settings connection default-vs-syn-challenge-threshold => 1500 <= tmsh list ltm global-settings connection global-syn-challenge-threshold => 2050 <= tmsh list ltm profile fastl4 syn-cookie-enable => enabled You will get logs similar to the ones below if Global SYN cache has been reached: Dec 7 03:03:02 B12050-R67-S8 warning tmm9[5507]: 01010055:4: Syncookie embryonic connection counter 2051 exceeded sys threshold 2050 Dec 7 03:03:02 B12050-R67-S8 warning tmm5[5507]: 01010055:4: Syncookie embryonic connection counter 2051 exceeded sys threshold 2050 Dec 7 03:03:02 B12050-R67-S8 notice tmm5[5507]: 01010240:5: Syncookie HW mode activated, server name = /Common/syncookie_test server IP = 10.10.20.212:80, HSB modId = 1 Dec 7 03:03:02 B12050-R67-S8 notice tmm9[5507]: 01010240:5: Syncookie HW mode activated, server name = /Common/syncookie_test server IP = 10.10.20.212:80, HSB modId = 2 As you can notice there are two different messages, the first one informs about Software SYN Cookie being activated at Global context, and the second one tells us that Hardware is offloading SYN Cookie from Software. Since there is a minimum delay before Hardware to start to offload SYN Cookie is expected to see a non zero value for the counter Current SYN Cache stats. See article in this SYN Cookie series for more information about stats. Global SYN cache value is configured per TMM, so you see in the log that 2050 threshold has been exceeded in the TMM, and therefore SYN Cookie is activated globally in the device. In this specific example the device has two HSBs and BIG-IP decided that tmm9 and tmm5 would activate each one of them this is why we see the logs repeated. LTM SYN Cookie at Virtual context For the same configuration example I showed above you will see log similar to one below if Virtual SYN cache has been reached: Oct 18 02:26:32 I7800-R68-S7 warning tmm[15666]: 01010038:4: Syncookie counter 251 exceeded vip threshold 250 for virtual = 10.10.20.212:80 Oct 18 02:26:32 I7800-R68-S7 notice tmm[15666]: 01010240:5: Syncookie HW mode activated, server name = /Common/wildcardCookie server IP = 10.10.20.212:80, HSB modId = 1 Oct 18 02:26:32 I7800-R68-S7 notice tmm[15666]: 01010240:5: Syncookie HW mode activated, server name = /Common/wildcardCookie server IP = 10.10.20.212:80, HSB modId = 2 Virtual SYN cache value is configured globally meaning that the configured value must be divided among TMMs to know when SYN cookie will be enabled on each TMM. Run below command to see physical number of cores: tmsh sho sys hard | grep core In this example device has 6 TMMs, so 1500/6 is 250. Note that you will see a warning message entry per TMM (I removed 3 log entries in above example order to summarize) and per HSB ID. Log does not always show the VIP’s IP, it depends on type of VIP. For example in below case: Oct 17 04:04:54 I7800-R68-S7 warning tmm2[22805]: 01010038:4: Syncookie counter 251 exceeded vip threshold 250 for virtual = 10.10.20.212:80 Oct 17 04:04:54 I7800-R68-S7 warning tmm3[22805]: 01010038:4: Syncookie counter 251 exceeded vip threshold 250 for virtual = 10.10.20.212:80 Oct 17 04:04:55 I7800-R68-S7 notice tmm2[22805]: 01010240:5: Syncookie HW mode activated, server name = /Common/wildcardCookie server IP = 10.10.20.212:80, HSB modId = 1 Oct 17 04:05:51 I7800-R68-S7 notice tmm2[22805]: 01010241:5: Syncookie HW mode exited, server name = /Common/wildcardCookie server IP = 10.10.20.212:80, HSB modId = 1 from HSB There is not any virtual configured with destination IP 10.10.20.212. In fact traffic is handled by a wildcard VIP listening on 0.0.0.0/0, this logged IP is the destination IP:Port in the request that triggered SYN Cookie. You can consider this IP as the most probable attacked IP since it was the one that exceeded the threshold, so you can assume there are more attacks to this IP, however attack could have a random destination IPs target. Important: Per-Virtual SYN Cookie threshold MUST be lower than Global threshold, if you configure Virtual Server threshold higher than Global, or 0, then internally BIG-IP will set SYN Cookie Global threshold equals to Per-Virtual SYN Cookie threshold. LTM SYN Cookie at VLAN context Configuration example for triggering LTM SYN Cookie at VLAN context: turboflex profile feature => adc tmsh list sys db pvasyncookies.enabled => true tmsh list ltm global-settings connection vlan-syn-cookie => enabled tmsh list net vlan hardware-syncookie => [vlan external: 2888] tmsh list ltm global-settings connection default-vs-syn-challenge-threshold => 0 tmsh list ltm global-settings connection global-syn-challenge-threshold => 2500 When SYN cookie is triggered you get log: Oct 17 10:27:23 I7800-R68-S7 notice tmm[15666]: 01010292:5: Hardware syncookie protection activated on VLAN 1160 (syncache:2916 syn flood pkt rate:0) In this case you will see that information related to virtual servers on this VLAN will show SYN cookie as ‘not activated’ because protection is at VLAN context: # tmsh show ltm virtual | grep ' status ' -i Status not-activated Status not-activated If you configure SYN Cookie per VLAN but Turboflex adc/security is not provisioned then you will get: Oct 17 04:39:52 I7800-R68-S7.sin.pslab.local warning mcpd[7643]: 01071859:4: Warning generated : This platform supports Neuron-based Syncookie protection on per VS basis (including wildcard virtual). Please use that feature instead AFM SYN Cookie at Global context Main different in AFM default log is that you will not get a message telling you the threshold it has been exceeded, instead log will inform you directly about the context that detected the attack. Configuration example for triggering AFM SYN Cookie at global context: turboflex profile feature => security tmsh list ltm global-settings connection vlan-syn-cookie => enabled tmsh list net vlan hardware-syncookie [not compatible with DoS device] tmsh list sys db pvasyncookies.enabled => true tmsh list ltm global-settings connection default-vs-syn-challenge-threshold. => 0 tmsh list ltm global-settings connection global-syn-challenge-threshold => 2500 tmsh list security dos device-config default-internal-rate-limit (tcp-half-open) => >2500 tmsh list security dos device-config detection-threshold-pps (tcp-half-open) => 2500 tmsh list ltm profile fastl4 syn-cookie-enable => enabled AFM Device DoS has preference over LTM Global SYN Cookie, so in above configuration AFM tcp half open vector will be triggered: Oct 19 02:23:41 I7800-R68-S7 err tmm[23288]: 01010252:3: A Enforced Device DOS attack start was detected for vector TCP half open, Attack ID 1213152658. Oct 19 02:29:23 I7800-R68-S7 notice tmm[23288]: 01010253:5: A Enforced Device DOS attack has stopped for vector TCP half open, Attack ID 1213152658. In the example above you can see that there are logs warning you about an attack that started and stopped, but there is not any log showing if attack is mitigated. This is because you have not configured AFM to log to local-syslog (/var/log/ltm). In this situation DoS logs are basic. If you want to see packets dropped or allowed you need to configure specific security log profile. Be aware that when SYN Cookie is active because Device TCP half open DoS vector’s threshold has been reached then you will not see any Virtual Server showing that SYN Cookie has been activated, as it happens when SYN Cookie VLAN is activated: SYN Cookies Status not-activated This is slightly different to LTM Global SYN Cookie, when LTM Global SYN Cookie is enabled BIG-IP will show which specific VIP has SYN Cookie activated (Status Full Hardware/Software). In case you have configured logging for DoS then you would get logs like these: Oct 23 03:56:15 I7800-R68-S7 err tmm[21638]: 01010252:3: A Enforced Device DOS attack start was detected for vector TCP half open, Attack ID 69679369. Oct 23 03:56:15 I7800-R68-S7 info tmm[21638]: 23003138 "Oct 23 2020 03:56:15","10.200.68.7","I7800-R68-S7.sin.pslab.local","Device","","","","","","","TCP half open","69679369","Attack Started","None","0","0","0000000000000000", "Enforced", "Volumetric, Aggregated across all SrcIP's, Device-Wide attack, metric:PPS" Oct 23 03:56:16 I7800-R68-S7 info tmm[21638]: 23003138 "Oct 23 2020 03:56:16","10.200.68.7","I7800-R68-S7.sin.pslab.local","Device","","","","","","","TCP half open","69679369","Attack Sampled","Drop","3023","43331","0000000000000000", "Enforced", "Volumetric, Aggregated across all SrcIP's, Device-Wide attack, metric:PPS" Oct 23 03:56:16 I7800-R68-S7 info tmm[21638]: 23003138 "Oct 23 2020 03:56:16","10.200.68.7","I7800-R68-S7.sin.pslab.local","Device","","","","","","","TCP half open","69679369","Attack Sampled","Drop","3017","69710","0000000000000000", "Enforced", "Volumetric, Aggregated across all SrcIP's, Device-Wide attack, metric:PPS” The meaning of below fields shown in above logs: "Drop","3023","43331","0000000000000000" "Drop","3017","69710","0000000000000000" Are as below: {action} {dos_packets_received} {dos_packets_dropped} {flow_id} Where: {dos_packets_received} - It counts the number of TCP SYNs received for which you have not received the ACK. Also called embryonic SYNs. {dos_packets_dropped} - It counts the number of TCP syncookies that you have sent for which you have not received valid ACK. If you see that {dos_packets_received} are high, but {dos_packets_dropped} are 0 or low, then it just means that AFM is responding to SYN packets with SYN cookies and it is receiving correct ACKs from client. Therefore, AFM is not dropping anything. So this could mean that this is not an attack but a traffic peak. It can happen that you configure a mitigation threshold lower than detection threshold, although you will get a message warning you, you could not realise about it. If this is the case you will not see any log informing you about that there is an attack. This will happen for example with below configuration: tmsh list ltm global-settings connection global-syn-challenge-threshold => 3400 tmsh list security dos device-config default-internal-rate-limit (tcp-half-open) => 3000 tmsh list security dos device-config detection-threshold-pps (tcp-half-open) => 3900 tmsh list ltm profile fastl4 syn-cookie-enable => disabled Due to this you will see in /var/log/ltm something like: Oct 23 03:38:12 I7800-R68-S7.sin.pslab.local warning mcpd[10516]: 01071859:4: Warning generated : DOS attack data (tcp-half-open): Since drop limit is less than detection limit, packets dropped below the detection limit rate will not be logged. AFM SYN Cookie at Virtual context All information provided in previous use case applies in here, so for below configuration example: tmsh list ltm global-settings connection global-syn-challenge-threshold => 3400 tmsh list security dos device-config default-internal-rate-limit (tcp-half-open) => 3000 tmsh list security dos device-config detection (tcp-half-open) => 3900 list security dos profile SYNCookie dos-network default-internal-rate-limit (tcp-half-open) => 2000 list security dos profile SYNCookie dos-network detection-threshold-pps (tcp-half-open) => 1900 tmsh list ltm profile fastl4 <name> => enabled AFM device SYN Cookie is activated for specific virtual server with security profile applied: Oct 23 04:10:26 I7800-R68-S7 notice tmm[21638]: 01010240:5: Syncookie HW mode activated, server = 0.0.0.0:0, HSB modId = 1 Oct 23 04:10:26 I7800-R68-S7 notice tmm5[21638]: 01010240:5: Syncookie HW mode activated, server = 0.0.0.0:0, HSB modId = 2 Oct 23 04:10:26 I7800-R68-S7 err tmm3[21638]: 01010252:3: A NETWORK /Common/SYNCookie_Test DOS attack start was detected for vector TCP half open, Attack ID 2147786126. Oct 23 04:10:28 I7800-R68-S7 info tmm[16005]: 23003156 "10.200.68.7","I7800-R68-S7.sin.pslab.local","Virtual Server","/Common/SYNCookie_Test","Cryptographic SYN Cookie","16973","0","0","0", Oct 23 04:10:57 I7800-R68-S7 notice tmm5[21638]: 01010253:5: A NETWORK /Common/SYNCookie_Test DOS attack has stopped for vector TCP half open, Attack ID 2147786126. Oct 23 04:12:46 I7800-R68-S7 notice tmm[21638]: 01010241:5: Syncookie HW mode exited, server = 0.0.0.0:0, HSB modId = 1 from HSB Oct 23 04:12:47 I7800-R68-S7 notice tmm5[21638]: 01010241:5: Syncookie HW mode exited, server = 0.0.0.0:0, HSB modId = 2 from HSB Troubleshooting steps When you need to troubleshoot how device is working against a SYN flood attack there are some steps you can follow. Check configuration to make a global idea of what should happen when SYN flood occurs: tmsh show sys turboflex profile feature tmsh list ltm global-settings connection vlan-syn-cookie tmsh list net vlan hardware-syncookie tmsh list sys db pvasyncookies.enabled tmsh list ltm global-settings connection default-vs-syn-challenge-threshold tmsh list ltm global-settings connection global-syn-challenge-threshold tmsh list ltm profile fastl4 syn-cookie-enable tmsh list ltm profile tcp all-properties | grep -E 'profile|syn-cookie' tmsh list ltm profile fastl4 all-properties| grep -E 'profile|syn-cookie' list security dos device-config syn-cookie-whitelist syn-cookie-dsr-flow-reset-by tscookie-vlans tmsh list security dos device-config dos-device-config | grep -A23 half tmsh list security dos profile dos-network {<profile> { network-attack-vector { tcp-half-open } } } *I can miss some commands since I cannot know specific configuration you are using, but above list can give you a good idea about what you have actually configured in your system. Are you using Hardware or Software SYN cookie? Are you using CMP or vCMP? Is device a Neuron platform? Is SYN cookie configured/working in AFM, in LTM or in both? Is SYN cookie enabled at Device, VLAN or Virtual Server context? If issue is at virtual server context, which virtual servers are affected? is the problem happening in a Standard or FastL4 VIP, …? Check logs (date/times) and stats to confirm what it has really happened and since when. Take captures to confirm your findings. Is this an attack? Were there other attacks at the same time (TCP BAD ACK, TCP RST maybe)? Are thresholds correctly configured attending to expected amount of traffic? If clients are hidden by a proxy maybe you could save resources by configuring Challenge and remember. If this is a Neuron platform, is there any error in /var/log/neurond? Check published IDs related to SYN Cookie for specific TMOS versions or/and platforms. Conclusion Now you have enough information to start troubleshooting your own BIG-IP devices if any issue happens, and also and maybe more important you have tools to create the most appropriate configuration attending to your specific platform and traffic patterns. So you can start to take the advantage of your knowledge to improve performance of your device when under TCP SYN flood attack.2.8KViews2likes1Comment