Return of Bleichenbacher - the ROBOT Attack CVE-2017-6168
Everything old is new again.
Security researcher Hanno Böck and friends built a neat little Python adaptive chosen-ciphertext attack proof-of-concept (POC) tool and ran it against many popular websites. They call their attack “ROBOT”, which stands for Return Of Bleichenbacher’s Oracle Threat, and they’ve published their results at robotattack.org. For this article I’ll be using the word Bleichenbacher to mean the general attack, and ROBOT to mean this latest, optimized version that includes a key signature attack.
F5’s SSL/TLS stack was one of the stacks that was found vulnerable. We issued knowledge base article K21905460 in November of 2017 with details for immediate consumption. K21905460 is the official F5 response, and this article is for those looking for a more detailed explanation of the attack, as well as the detection and mitigation strategies related to the vulnerability.
CVE-2017-6168 describes a Bleichenbacher attack against the F5 TLS stack. The theory of the attack isn’t new; primers on SSL/TLS mentioned it as early as 1998. Bleichenbacher is the original padding oracle attack against the RSA key exchange in TLS: the attacker sends thousands of variations of a ciphertext at a TLS server. The TLS server attempts to decrypt each one and sends back one of two error codes: either the decrypt failed or the padding was messed up. By trying many variations of a message, and differentiating between the two error codes, the attacker can eventually decipher an RSA-encrypted plaintext, one bit at a time.
Key Points
- Previously recorded RSA sessions will be vulnerable to decryption until mitigation is applied.
- At best, the attack decrypts one past TLS session at a time, and each attempt takes several hours and tens of thousands of failed handshakes.
- Only RSA-based TLS handshakes are vulnerable; elliptic curve (ECC) and Diffie-Hellman are not affected by Bleichenbacher.
- Recovery of the RSA private key is never a concern for this exploit. There is no need to rotate keys or get new certificates.
- There is a theoretical man-in-the-middle decryption attack, but it is highly unlikely to be successfully exploited.
The Bleichenbacher attack raises its head in Europe every now and then. Filippo Valsorda found one in the Python-RSA library in 2016. A German team found one in XML encryption in 2012. Another German team wrote about optimizations for Bleichenbacher in this 2014 paper. Hanno Böck lives in Berlin and writes mostly in German. Daniel Bleichenbacher himself, though, is Swiss.
There are two obvious uses for the ROBOT attack.
Threat vector #1: Use ROBOT to recover a TLS session
Attacker Eve records a TLS browser session between user Alice and website Bob. Eve extracts the encrypted session key material from Alice’s session. Eve then sends thousands of variations of that session key at server Bob, changing a bit here and there. Of course the vast majority of the variations fail, in one of two ways. Sometimes Bob replies “I couldn’t decrypt that.” And sometimes he replies “I decrypted it, but the message padding is messed up.” Eve uses the difference between these error codes to test the validity of each bit she changed. Eventually, she reconstructs Alice’s original session key. She decrypts Alice’s session, and in that session Eve finds Alice’s user credentials, and then the breach is on.
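For readers who want to see the two moving parts, here is a minimal Python sketch. It is not the ROBOT tool and not our TLS stack; the toy key (61 x 53) and the function names are made up for illustration. In Bleichenbacher's published attack the "variations" Eve sends are not random bit flips: each one is the recorded ciphertext multiplied by s^e mod n for a chosen s, which makes the server decrypt m*s mod n instead of m. The server's error response then tells Eve whether that product looks like a correctly padded PKCS#1 v1.5 block.

```python
# Toy illustration of the two building blocks of a Bleichenbacher-style attack.
# The key is absurdly small and hypothetical; real RSA keys are 2048 bits or more.

P, Q, E = 61, 53, 17
N = P * Q                              # public modulus
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def encrypt(m: int) -> int:
    """Textbook RSA encryption with the public key."""
    return pow(m, E, N)


def decrypt(c: int) -> int:
    """Textbook RSA decryption with the private key (what Bob does)."""
    return pow(c, D, N)


def blind(c: int, s: int) -> int:
    """What Eve sends: the recorded ciphertext multiplied by s^e mod n."""
    return (c * pow(s, E, N)) % N


def is_pkcs1_conforming(block: bytes) -> bool:
    """The yes/no answer a vulnerable server leaks through its error codes.

    A PKCS#1 v1.5 encryption block is 0x00 0x02, at least eight nonzero
    padding bytes, a 0x00 separator, then the message.
    """
    if len(block) < 11 or block[0] != 0x00 or block[1] != 0x02:
        return False
    separator = block.find(b"\x00", 2)
    return separator >= 10


# Eve never sees the plaintext 42, but she knows Bob will decrypt 42 * 2.
recorded = encrypt(42)
print(decrypt(blind(recorded, 2)))
```

Each "yes, the padding was fine" answer narrows the range of values the original plaintext can have, which is why the attack needs so many queries.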
Threat vector #2: Use ROBOT during a man-in-the-middle attack
In addition to decrypting a message, the Hanno Böck team claims that their variation of the Bleichenbacher attack can coerce a TLS server into signing a piece of data. Because signatures are used to verify the integrity of the TLS handshake, in theory, the ROBOT attack could be used to man-in-the-middle a new TLS session.
However, such an attack would be difficult to exploit due to the tens of thousands of messages that Eve would need to send to Bob to construct a signature. Even if Eve was wicked fast, there are only so many messages that Bob can try to decrypt. For example, the ROBOT proof of concept code took six hours to run on our powerful server harness. Alice is not going to wait six hours for her handshake to complete. Her browser would time out after a few minutes. And after Logjam, administrators figured they should bound the SSL handshake to avoid this kind of problem.
One of our TLS architects has done some math and determined that even if Eve was wicked fast, and Bob was wicked fast (and happy to try to decrypt all those messages), Eve would need to be within one kilometer of Bob or the slowness of the speed of light would make the MiTM attack infeasible.
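The shape of that calculation can be sketched in a few lines of Python. This does not try to reproduce the one-kilometer figure, whose inputs are not given here. The assumptions are mine: every query depends on the previous answer, so the queries run one after another, each costs at least one round trip, and Bob spends zero time on decryption. The function name, the query count, and the time budget are illustrative only.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # in vacuum; signals in fiber are slower


def max_distance_m(queries: int, budget_s: float) -> float:
    """Upper bound on the Eve-to-Bob distance, in meters.

    Assumes `queries` strictly sequential round trips must all fit inside
    `budget_s` seconds and that neither side spends any time computing.
    """
    round_trip_s = budget_s / queries
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2


# Hypothetical inputs: 50,000 queries squeezed into a 6-second handshake window.
print(f"{max_distance_m(50_000, 6.0) / 1000:.1f} km")
```

With those generous inputs the bound comes out at roughly 18 km. Any real decryption time on Bob's side, a slower medium, a shorter timeout, or a higher query count shrinks it quickly.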
What is the real impact of ROBOT?
The Bleichenbacher attack only affects RSA sessions not protected with the ephemeral keys offered by forward secrecy. All modern browsers and mobile clients have preferred ephemeral keys for several years. Google has been preferring it with their servers and software since 2012.
Two-thirds of F5 TLS servers prefer forward secrecy.
Not all websites, though, allow forward secrecy, usually because they have to support passive TLS monitoring. This includes many organizations in the financial sector.
The scope of ROBOT is therefore limited to legacy clients—think Windows XP users—and the clients of sites that do not offer forward secrecy. Customers using versions of F5 TMOS prior to 11.6 are not vulnerable.
So why is F5 vulnerable?
You might be asking, “Hey, F5, if Bleichenbacher-style attacks have been known about since 1998, why are you vulnerable to them?”
Let me be completely transparent here, because root-cause analysis is a thing. Of course we’ve known about the Bleichenbacher attack and we weren’t originally vulnerable to it. CVE-2017-6168 was a regression. In order to achieve the fastest possible TLS performance, we use crypto offload hardware where possible. In version 11.6 of our TMOS platform, we received a new firmware drop from one of our offload vendors. When integrating that firmware, we reverted to responding to padding errors with the TLS padding error alert, which in itself sounds reasonable, doesn’t it? The change we made to support the firmware also affected our software stack, which is why virtual editions are vulnerable as well.
As the researchers mentioned in their paper, there are no easy ways to test for Bleichenbacher attacks within crypto frameworks right now. We are working on updating our testing framework to ensure that the regression doesn’t happen again.
How to tell if you’ve been Bleichenbachered
There are two ways to check if an attacker has been running the Bleichenbacher tool against your virtual server. In your log (/var/log/ltm), you would see messages similar to these:
warning tmm:01260009:4: Connection error: ssl_hs_vfy_pms:10117: Invalid PMS (80)
These don’t necessarily indicate a direct attack, but high numbers of the messages should be considered suspicious. Depending on the hardware/software version and debug level enabled, the context around this message might be different, but this is the common log entry observed during our testing.
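If you have copied the log off the box, counting these entries is a few lines of Python. This is a rough sketch, not an F5 tool. The regular expression is an assumption based on the single sample line above: it accepts any number after ssl_hs_vfy_pms because that value may differ between versions, and the function name is made up.

```python
import re
from typing import Iterable

# Pattern derived from the sample log entry; adjust it if your entries differ.
INVALID_PMS = re.compile(r"ssl_hs_vfy_pms:\d+: Invalid PMS")


def count_invalid_pms(lines: Iterable[str]) -> int:
    """Count log lines that report an invalid pre-master secret (PMS)."""
    return sum(1 for line in lines if INVALID_PMS.search(line))


sample = [
    "warning tmm:01260009:4: Connection error: ssl_hs_vfy_pms:10117: Invalid PMS (80)",
    "notice mcpd: some unrelated message",
]
print(count_invalid_pms(sample))
```

To run it against a real file, pass the open file object, for example `count_invalid_pms(open("ltm"))` for a local copy of /var/log/ltm.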
The second way is observing rapidly increasing SSL profile statistics such as “Handshake Failures” and “Fatal Alerts.” These can be seen by running this command from the tmsh shell:
(tmos)# show ltm profile client-ssl <your_clientssl_profile>
…
Failures
Premature Disconnects 0
Handshake Failures 71.8K
Renegotiations Rejected 0
Aggregate Renegotiations Rejected 0
Fatal Alerts 71.8K
Active Handshakes Rejected 0
You can also see the same statistics in the GUI, under clientssl profile statistics.
Incident response
Check each of your BIG-IP devices for tens of thousands of handshake failures. In our testing, the attack generated about 50,000 handshake failures for each session cracking attempt. If you have only a few thousand handshake failures, then no one has tried the Bleichenbacher attack against that device.
However, if there are more than, say, 10,000 handshake failures, then you may consider that someone has tried (or is trying) to crack a previously recorded TLS session. If there are millions of handshake failures, then someone may be attempting a mass decryption of several sessions.
The appropriate response in that case would be to consider that user sessions may have been recently compromised and begin password reset procedures for the users in that application scope.
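If you are checking many devices, the rule of thumb above is easy to script. The sketch below is only my reading of those thresholds, not an official F5 classification. The function names are hypothetical, and the M and G suffixes are an assumption; only the K form ("71.8K") appears in the sample output earlier.

```python
_SUFFIXES = {"K": 1_000, "M": 1_000_000, "G": 1_000_000_000}


def parse_count(text: str) -> int:
    """Turn an abbreviated counter such as '71.8K' into an integer."""
    text = text.strip().upper()
    if text and text[-1] in _SUFFIXES:
        return round(float(text[:-1]) * _SUFFIXES[text[-1]])
    return int(text)


def triage(handshake_failures: int) -> str:
    """Map a Handshake Failures counter to the guidance in this section."""
    if handshake_failures >= 1_000_000:
        return "possible mass decryption attempt"
    if handshake_failures > 10_000:
        return "possible session recovery attempt"
    return "no sign of a Bleichenbacher attempt"


print(triage(parse_count("71.8K")))
```

Keep in mind that these counters accumulate since the last statistics reset, so a large number on a long-running device deserves a look at how fast it is growing.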
Mitigation strategy for CVE-2017-6168: Bleichenbacher 2017
The primary threat to protect against is threat vector #1 above: the session recovery problem.
Your end users are vulnerable to recovered sessions if you are using any version of F5 BIG-IP between 11.6 and 13.0 and terminating SSL/TLS on any virtual server. We have issued patches for these versions. See knowledge base article K21905460 for the relevant download pages and start on your staging tests.
Some more good news is that there are two ways you can mitigate the attack immediately.
Virtual patch #1: ssl_hx_rlimit iRule
In 2014, I wrote an iRule called ssl_hx_rlimit (handshake rate-limit) to prevent a class of denial-of-service (DoS) attacks. The rule is generic enough that it protects against Bleichenbacher attacks as well. Any IP address that fails five consecutive handshakes in a five-minute period will be ignored for the remainder of that period. The count and the period are both configurable within the iRule. If you need to allowlist a set of IP addresses (so that they are not caught by this rule), there are multiple ways to do that documented on DevCentral. If you are running all traffic through a NAT prior to your F5, then this solution is obviously not appropriate.
Pick up the 75-line ssl_hx_rlimit iRule from DevCentral’s Code Share.
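To make the logic concrete, here is a small Python model of the rate-limiting idea. It is not the iRule itself and does not use any iRule commands; the class and method names are invented. It also simplifies one detail: it uses a sliding window, whereas the description above talks about ignoring the client for the remainder of a fixed period.

```python
from collections import defaultdict
from typing import DefaultDict, List


class HandshakeRateLimiter:
    """Ignore clients with too many failed handshakes inside a time window."""

    def __init__(self, max_failures: int = 5, window_s: float = 300.0) -> None:
        self.max_failures = max_failures
        self.window_s = window_s
        self._failures: DefaultDict[str, List[float]] = defaultdict(list)

    def _recent(self, ip: str, now: float) -> List[float]:
        # Drop failures that have aged out of the window.
        recent = [t for t in self._failures[ip] if now - t < self.window_s]
        self._failures[ip] = recent
        return recent

    def record_failure(self, ip: str, now: float) -> None:
        self._recent(ip, now).append(now)

    def record_success(self, ip: str) -> None:
        # A good handshake breaks the run of consecutive failures.
        self._failures.pop(ip, None)

    def is_blocked(self, ip: str, now: float) -> bool:
        return len(self._recent(ip, now)) >= self.max_failures


limiter = HandshakeRateLimiter()
for second in range(5):
    limiter.record_failure("203.0.113.7", float(second))
print(limiter.is_blocked("203.0.113.7", now=10.0))
```

A Bleichenbacher run needs tens of thousands of failed handshakes, so a limit of five per five minutes per address stretches a single session recovery from hours to weeks unless the attacker spreads the queries over many source addresses.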
Virtual patch #2: Require forward secrecy
The second virtual patch option is to require only forward-secret (non-RSA) key exchange ciphers. Bleichenbacher only works against RSA handshakes, so elliptic curve and Diffie-Hellman handshakes are safe. Most of the Internet already supports (and prefers) forward secrecy, so some sites may opt for this solution.
Requiring forward secrecy on your F5 isn’t rocket surgery. Ultimately you just want to set your clientssl profile cipher string to include only ECDHE and DHE (but not ADH) ciphers and disallow RSA key exchanges. It can be as simple as setting your cipher string to “DEFAULT:!RSA” or “ECDHE:DHE” (without any trailing punctuation). Here are some relevant links about forward secrecy:
- Whitepaper: F5 SSL Recommended Practices
- Lightboard lesson: Perfect Forward Secrecy
- Knowledge base article K21905460
But should you disable RSA? What if you have end-users (or automated queries) still using RSA key exchanges? Fortunately it’s somewhat easy to tell. Here’s an article on DevCentral that talks about retrieving key exchange counters from your F5. You could also combine that article with this older one, that generates cool graphs from an iRule.
After you change your cipher string, you can test it by running the openssl s_client command (either on your F5 or any *nix distribution that can connect to the virtual server in question):
% openssl s_client -connect <your_virtual_server>:443 -cipher RSA
(should not succeed)
If it fails to connect then you did it right. While you’re there, make sure that forward secrecy is working:
% openssl s_client -connect <your_virtual_server>:443 -cipher ECDHE
(should succeed)
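If you would rather script the two checks, the same probe can be approximated with Python's standard ssl module. This is a sketch under a few assumptions: the function name is made up, certificate verification is switched off (as s_client does by default, so use it for testing only), and the connection is capped at TLS 1.2 because Python's set_ciphers does not control TLS 1.3 suites. I use the OpenSSL keyword kRSA to select RSA key exchange explicitly.

```python
import socket
import ssl


def handshake_succeeds(host: str, port: int, ciphers: str, timeout: float = 5.0) -> bool:
    """Return True if a TLS 1.2-or-lower handshake completes using `ciphers`."""
    try:
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
        ctx.check_hostname = False            # testing only: no certificate checks
        ctx.verify_mode = ssl.CERT_NONE
        ctx.maximum_version = ssl.TLSVersion.TLSv1_2
        ctx.set_ciphers(ciphers)              # OpenSSL cipher string, e.g. "kRSA"
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:                           # covers ssl.SSLError and timeouts
        return False
```

Against a correctly configured virtual server, `handshake_succeeds("<your_virtual_server>", 443, "kRSA")` should return False and the same call with "ECDHE" should return True. A False result can also mean a network problem or a local OpenSSL build without RSA key exchange suites, so always run the ECDHE probe as a sanity check.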
With either virtual patch applied you may buy yourself the time to apply the patches on your next scheduled patch cycle instead of on an accelerated schedule.
Set your handshake timeouts, too
You should also set your TLS handshake timeouts to some small, but still sane, value to avoid that theoretical MitM problem (threat vector #2). Six seconds should be sufficient. See the knowledge base article for instructions.
Will this be the last Bleichenbacher?
The forthcoming TLS 1.3 protocol requires forward-secret ciphers, so it will be safe from Bleichenbacher. However, adoption of TLS 1.3 could be somewhat slow, because forward secrecy isn’t free and introduces major visibility problems for enterprises with significant SSL inspection investments. So don’t expect TLS 1.3 to save you from Bleichenbachers for another decade.
I expect TLS 1.2 and its support of RSA handshakes to be around for several more years. Heck, according to our 2016 TLS Telemetry report, half the Internet is still stuck at TLS 1.0! So at the current rate of a new Bleichenbacher every couple of years, I guess we have a few more to see.
- Hannes_Rapp (Nimbostratus)
Nice summary. Another day another attack, but in the end it's that good old brute-force, with an edge.
To address the exploitation at hand, and pro-actively mitigate any similar ones that will come out in the future, why not try and hit multiple flies at once? Implement a TLS rate limiting feature (what your iRule does) in next LTM release as a native feature.
I see 3 benefits here. TLS rate limiting is a direct mitigation against this attack, but also a proactive mitigation of similar attacks that entirely or partly rely on primitive means of brute force. And lastly, it's also a neat DOS protection feature.
For short-term, maybe it can be implemented as a new attribute, configurable in clientssl profile? For long term, maybe there will eventually be a dedicated rate limiting policy which includes not only TLS rate limiting but also the same for network stack and HTTP?
- Fulmetal (Nimbostratus)
Many thanks for the summary.
I fully agree with what Hannes said: TLS rate limiting would be good to have as a native feature.
- MLB (Nimbostratus)
I was wondering if the iRule for handshake rate-limit still exists anywhere? We have an older version of F5 BIG-IP, and because we have customers using stand-alone terminals that continue to use RSA (which the terminal manufacturer will not address), we need to mitigate the risk of the TLS ROBOT attack in all ways possible. The iRule limiting the handshake rate seemed ideal, but I cannot find that iRule anywhere. Has another iRule containing that functionality replaced it? Are there other means of mitigation that make the iRule unnecessary? Forgive my ignorance, as I am not the network resource most familiar with and responsible for our F5; I'm just trying to research ways to mitigate. Thanks in advance for any guidance anyone can provide.