Making Android SSL Work Correctly.
Connecting via #SSL in #Android. Correctly and securely. Mimic browser functionality with untrusted certs and hostname mismatches. #f5
Okay, for those of you who’ve been waiting, this is the blog. For those who haven’t, welcome, we’re going to talk about https connections (specifically) on Android (TM) from within a native app and connecting to sites that Android views as possible security risks. Easy like learning to drive if your face was in your navel. But I’ve tried to make the issues and the code as clean and clear as possible for you. Hope it is of some assistance.
I’m going to take it slow and address all of the issues, even the seemingly unrelated ones. My goal with this blog is to help those who are just getting started on Android too, so I’ll use a lot of headings, giving you the chance to skip sections you are already familiar with. Otherwise, grab a cup of coffee (or a spot of tea, or a glass of water, whatever suits you), sit back, and read. Feedback is, as always, welcome. I’m not one of the Android designers, and had to admit long ago I wasn’t perfect, so anything to improve this blog is welcome.
Android Info
We all basically know what Android is, and if you’ve been developing for it for very long, you know the quirks. There are a few major ones that I’ll mention up-front so simple gotchas and oddities later on don’t catch you off guard when I discuss them. The first, and one that is very important to the general version of this solution, is pretty straight-forward. You cannot stop execution of code and wait for user input. Purists would argue with me, and I’ve even seen the core Android developers claim that the system supported modal dialogs because of some design features, but modal as in “stop and wait for input” Android does not even pretend to support. It is a cooperative multitasking environment, and you’re not allowed to hog a ton of time idling while other apps are crying for cycles. If your app takes too long to do anything, Android will close it. If you try things like (important for our solution) database queries or network connections, Android will close the app. Remember that, because it comes up again later.
Ignoring the “cannot stop execution” part, and focusing on the “long running processes” part, we then are forced to move all long-running things like queries and network connections out of the core application thread and into a separate one. Android did a great job of making this relatively painless. There is good documentation, everything works as expected, and there are several different ways to do it based upon your needs. You can just spawn a Java thread, and let it go, you can use an AsynchTask, which allows limited interaction with the UI thread (meaning you can directly post updates relevant to the background task), and there are services. The thing is that both AsynchTask and Thread are stopped when your application is stopped. For some uses, that’s okay. For most, it is not. If you want that long-running SSL communication task to complete before exiting, then a Service is your tool. Service has an interface you can use to send it commands, and when it finishes a command, it exits on its own. This gives you data consistency, even if the application is terminated for some reason (and trust me if you don’t already know, apps are “terminated” for some pretty superfluous reasons). In our code, this is the way we went, but the source I am introducing here works with any of these approaches, simply because it isn’t the execution mechanism that matters, it is the SSL code that matters.
SSL Info
SSL has been around forever, and Java support for SSL has been too. Every time a new platform comes along though, you get people asking how to implement SSL on the platform, and you get people suggesting various ways to disable SSL on the platform. This blog is not one of those. It is one way that you can make SSL act like it does in your browser, without compromising security. Note: Security geek friends, the following is far from a definitive treatise on SSL, it is aimed at helping developers use it more effectively. Do not pick at my generalizations, I’m aiming at more secure Android apps, not training security peeps.
When an SSL connection is made, one of the first things that occurs is the SSL handshake.
This includes the server coughing up its certificate for the client to determine if the server certificate is valid. This is the point in most Java developers’ careers that their hair starts to turn grey. The process of validating a certificate is baked into the java.net libraries, and at the level we need to care about consists of three steps. First is to get an instance of javax.net.ssl.HostNameVerifier, and use it to check if the name on the certificate matches the hostname advertised through DNS. A mismatch of these two items may imply a Man-In-The-Middle attack, so the check is there. The thing is, nearly every test environment on the planet also has a host name mismatch, because they’re temporary. In the browser, a dialog pops up and says “Host Name Mismatch, continue? (Y/N)” and you choose. In code, you cannot have the browser do that for you, so you’ll have to handle it – and we’ll see how in a bit.
Aside from the Host Name Verifier, the certificate has to be checked for validity. There is a “store” (generally an encrypted file) on the Android device (and really any device using Java for SSL) that has a list of valid “Root Certificates”. These are the certificates that are definitive and known good, there are relatively few of them in the world (compared to the pool of all certificates), and they are used to issue other certificates. It costs money to get a trusted cert for production use, so in test environments, most companies do not, so again, if you’re testing, you’ll see problems that might not exist in the real-world. A production certificate that you would use is “signed” by someone and theirs was “signed” by someone, etc, all the way up to one of the root certificates. If you can get back to a root by looking at the signatures, that’s called the certificate chain, and it means the certificate is good.
But there are two catches. First, Android has not consistently included the all of the worlds’ largest of the root servers’ certificates in the certificate store. “Android” devices being sold by many different companies – who can change the certificate store – doesn’t help. So sometimes, a valid certificate will trace up the chain correctly, but at the top of the chain, the root certificate is not in the store, so the entire chain – and by extension the server you’re connecting to – is “invalid” in Androids’ eyes. Second, as alluded to above, in a test environment, you can generate “self-signed” certificates. These are valid certificates that do not point to a Root, the chain ends with them. These are never deemed valid in Android, nor in most other SSL implementations, simply because anyone could self-sign and there is no way to track it back to a root that guarantees that signing. There are other things that are checked – is the certificate expired, has it been revoked by someone further up the chain, etc. – but the killer for Android developers is the “break in the chain”, either by lack of inclusion of a root certificate or self signing.
Android With SSL
So in the normal processing of SSL, you can implement the validation interfaces and change normal function – we’re going to stick with the two that we use in this source, but there are (can be) more. We’ll look at the problem in a couple of different ways, and in the end show you how to use the code developed to pop up a descriptive dialog, and ask the user if they want to continue. This is exactly what browsers (including the default Android browser) do for you. Looking around online there are several excellent examples of handling this problem with dialogs.
Android doesn’t support this model well at all. If you recall up above I pointed out that you can’t stop code execution on Android, meaning you can’t ask the user from within, say, HostNameVerifier.verify(), because you must return true or false from this routine. False says “don’t allow this connection”, and true say “yes, allow this connection. But in Android, since you cannot use a modal dialog to stop execution and wait for an answer, if you do throw up a dialog, the code continues, and whatever the last value of your return variable was, that’s what the answer will be. This is not likely to meet your design goals.
At this point, since we’ve hit the first fiddly bit we have to solve, I’ll say point blank that those people who are suggesting that you simply always return true from this routine are not the ones to listen to. This protection is there for a reason, do not lightly take it away from your users, or you will eventually regret it.
A very similar scenario exists with the certificate validation classes – in our case, X509Certificates – you either throw an exception, or the act of exiting the routine implies acceptance and allows the connection to go through. Needless to say, in Android, before the dialog to ask a users’ preference is drawn, the routine would have exited.
So What To Do?
There are several decent work-arounds for this problem. We’ll discuss three of them that cover common scenarios.
Solution #1: If you are only ever going to hit the same servers all the time (like you would in a VPN), and have access to the tablets/phones that will be doing the connecting, then you should look up how to add certs to the Android device. This will give you a solid solution with zero loss of security and essentially no code complexity. I will not show you how to achieve this one in this blog, simply because it is the best documented of the three solutions.
Solution #2: If you can ask the user ahead of time (this is my scenario) if they want to ignore errors for a given server, then you can save it in a database and use that information to tell the routine how to behave.
Solution #3: If you cannot ask the user ahead of time, and want to be able to dynamically go to many sites, then you’ll have a little more work to do – notably subclassing the verification classes you’ll need, and installing them as the defaults for your kind of communication (TLS in our case). I’ll detail this below, but the overview is (1) Return false or throw an exception. (2) Send a message to the UI loop to say “connection failed, bad cert, retry”. (3) In the main UI loop you can pop up a dialog, and while it will not be blocking, you will be notified when the user clicks okay or cancel (assuming you’re the listener), if the notification is “yes, proceed”, then you can tell your background thread to retry and return true from HostnameVerifier, while not throwing exceptions in X509 processing.
I loathe this solution, because it causes you to build the entire connection twice, which is always bad form, but doubly so on phones with data plans, which is a huge chunk of the Android market. But short of writing extensive code that could become a project on its own, this is the best solution for dynamic access to websites available at this time. It essentially adopts the blocking dialog model to the Android model.
Source – Solution #2
The key to this solution and the next is the over-ridden classes for host name verification and certificate validation. We’ll delve into them first, then how to install them, then into how to set the up to give the SSL processing code the correct answers. We’ll also need a database, but we’ll discuss that afterward, because in theory you could use other mechanisms to implement that bit, depending upon the rest of your architecture.
HostnameVerifier is the easier of the two. The Interface in the javax.net.ssl library has only one routine – verify. We’ll implement that routine with a few assumptions, and I’ll talk about the assumptions afterward.
Diagram of relevant program flow
The above diagram is simply for reference. Some people understand easier with a diagram, so I built you one with just the parts we’re going to care about in it.
So here is the source for our host name verifier. It is really pretty simple, we’re just implementing the interface and using our own criteria to determine if a host name mismatch should cause us to abandon the connection…
public class ModifiedHostNameVerifier implements HostnameVerifier {
boolean bAccepted;
public ModifiedHostNameVerifier(Context m) {
bAccepted=false;
}public void setAccepted(boolean acc) {
bAccepted = acc;
}
// verify is only called to compare hostname from DNS to SSL Certificate issued-to name. Do what the user wants and return.// NOTE: in the more generic case where the user is connecting all over the place, you’ll need to get the default behavior,
// same as we do in the X509 code below – otherwise they will be blocked on all calls not set to accepted, even ones with
// matching hostnames.
public boolean verify(String string,SSLSession ssls) {return bAccepted;
}
}
I mentioned caveats, well as you can see, we set whether this should be accepted or not outside of the class. That makes nothing to the class. But we have to have it or the standard rule of “host names don’t match, let’s get out of here” will apply. As mentioned in the comments, you’ll want to get a pointer to the default handler before this one is installed, so you can use it for most validations – except in cases (like ours) that only need to know what the user said and don’t have to care about the normal case where hostnames match. A call in the constructor that constructs an instance of DefaultHostNameVerifier and saves the created object to a global (just as we did in X509) will then give you a reference to use to check normal processing before returning the value of bAccepted.
The original iteration of this code had the determination if we should accept the mismatch or not right in the verify method. The reason it had to move out of here is simple – connections have timeouts. We were wasting so much time doing that determination inside this class and the X509 class that the SSL connection would time out and close. Not exactly what we wanted, so I did it this way, which conveniently for blogging purposes is also easier to follow. The determination is made and we set the accepted variable before the call to connect (which eventually calls this classes verify()), More on that after this next class…
The X509TrustManager handles validating certificate chains. It cares not what you’re doing, merely validates that the certificate is valid. We need to add a caveat that says “If the default Trust Manager doesn’t think this is valid, we should try to validate it with our specialized parameters.” And that is indeed what we do. Again, we use a boolean to tell us how to handle this scenario, which keeps the code both fast in execution and clean for learning.
public class ModifiedX509TrustManager implements X509TrustManager {
X509TrustManager defaultTrustMgr;
boolean accepted;public ModifiedX509TrustManager() {
TrustManager managers[]= {null};
try {
KeyStore ks = KeyStore.getInstance(KeyStore.getDefaultType());
String algo = TrustManagerFactory.getDefaultAlgorithm();
TrustManagerFactory tmf =TrustManagerFactory.getInstance(algo);
tmf.init(ks);
managers = tmf.getTrustManagers();
} catch (NoSuchAlgorithmException e) {
e.printStackTrace();
} catch (KeyStoreException e) {
e.printStackTrace();
}for (int i = 0; i < managers.length; i++) {
if (managers[i] instanceof X509TrustManager) {
defaultTrustMgr = (X509TrustManager) managers[i];
return;
}
}
}public void checkClientTrusted(X509Certificate[] chain, String authType)
throws CertificateException {
defaultTrustMgr.checkClientTrusted(chain, authType);
}public void setAccepted(boolean acc) {
accepted = acc;
}public void checkServerTrusted(X509Certificate[] chain, String authType)
throws CertificateException {
// First, check with the default.
try {
defaultTrustMgr.checkServerTrusted(chain, authType);
// if there is no exception, return happily.
} catch(CertificateException ce) {
if(accepted) {
// YES. They've accepted it in the setup dialog.
return;
} else {
Log.e("Device Not trusted!","Check Settings.");
throw ce;
}
}
}public X509Certificate[] getAcceptedIssuers() {
return defaultTrustMgr.getAcceptedIssuers();
}}
The first thing we do (in the constructor) is use the KeyStore (where the recognized trusted certs are stored) and TrustManagerFactory (the class that instantiates and manages Trust Managers for different types of keys. We seek out the default, which is TLS on most systems. Some would tell you that this is unsafe and you should explicitly ask for TLS, but I am of the opinion that we want the default for X509, since we don’t much care beyond that. So we grab it. We store a class level reference to the default trust manager for use later. That really is the whole point of overriding the constructor.
It is save to ignore the implementation of checkClientTrusted(). We have to implement it, but don’t need it (unless you’re doing client connections…), so it simply returns the default trust manager implementation of this class.
SetAccepted() is called from our source before the connect call is made, to tell this class whether the user has agreed to accept certificates that aren’t perfect. It simply sets a class-level variable that we use in the next, most important, routine.
checkServerTrusted() is the key routine in this class. It tries to validate with the default trust manager, and if that fails, it then uses the value set with accepted() to determine if it should let the connection go through. If the default trust manager threw an exception (finding fault with the cert or its chain), and the user said “No, don’t accept bad connections” when we asked, this routine throws a certificate exception that stops us from connecting. In all other scenarios, it returns normally and allows the connection to go through.
getAcceptedIssuers() is another routine that the Interface requires, but we didn’t want to modify, since our accepted/not accepted is not based on adding new certificates. You might want to modify this routine if you are going to manage the certificates you (or the user) have explicitly trusted.
Okay, that’s the worst of it. Now we have two implementations of the interfaces that will do what we want. But first, we have to tell the system to use our classes instead of the ones currently in use. That’s relatively easy, here’s the method to do it (this comes from our connection class, and is called before HttpsUrlConnection.connnect() is called).
private void trustOurHosts() throws Exception {
// Create a host name verifier
mhnv = new ModifiedHostNameVerifier(this);
// Create a trust manager that recognizes our acceptance criteria
mod509 = new ModifiedX509TrustManager();
TrustManager[] trustOurCerts ={ mod509 };
// Install the trust manager and host name verifier
try {
SSLContext sc = SSLContext.getInstance("TLS");
sc.init(null, trustOurCerts, new java.security.SecureRandom());
HttpsURLConnection
.setDefaultSSLSocketFactory(sc.getSocketFactory());
HttpsURLConnection.setDefaultHostnameVerifier( mhnv);
} catch (Exception e) {
/* No Such Algorithm Exception, KeyManagementException. In both cases, we can't continue. */
e.printStackTrace();
throw e;
}
}
This simply tells the SSL context associated with TLS that when its socketfactory creates a new socket, it should use our socketfactory (in sc,.init), then tells HttpsURLConneciton that when it opens a socket, it should use the modified socketfactory. It then tells HttpsURLConnection that it should use our hostname verifier. In the event of an exception, we’re unable to connect reliably and need to get out, so we pass the exception on.
Finally, we have to call this routine, tell our implementations whether or not to accept errors, and actually create an https connection. Almost there, here’s the code.
if(mod509 == null)
trustOurHosts();
What we don’t want to do is waste a bunch of time creating and destroying the handling classes, so we create them and keep a reference to them, only recreating them if that reference is null.
mod509.setAccepted(bigIp.isAccepted());
mhnv.setAccepted(bigIp.isAccepted());
Then each time we need to start a new connection with the device, we call setAccepted on both of our classes. For our purposes, we could lump both host name mismatch and invalid cert into a single question, you might want to have two different values for users to check.
HttpTransportBasicAuth httpt = new HttpTransportBasicAuth("https://" + host + "/iControl/iControlPortal.cgi", param1, param2);
And finally, in our case, we use HttpTransportBasicAuth from kSOAP2 Android to connect. We have also tested this solution with the more generic SSL connection mechanism:
HttpsURLConnection https = new HttpsURLConnection(ourURL);
https.connect();
And it worked just fine.
Last but not least
Okay, so this solution works if you already know how the user is going to respond, but what if you don’t, and you want your behavior to mimic the browser, but for whatever reason cannot use the browser with Intents to achieve your goals…?
Well, I have not implemented this, but knew looking into how to solve the problem that more people would be interested in this solution. Here’s my answer.
1. Do all of the steps above except set “accept” to false for both class implementations.
2. Make the connect call, in the cases where there are problems, it will throw an exception.
3. Return to the UI thread with an “Oh my, exception type X occurred” message.
4. In the UI thread, respond to this message by throwing up an “Allow connection?” dialogfragment. with your calling class as the onclicklistener
5. in OnClick, if they said “no, do not trust”, just move on with whatever the app was doing. If they said “Yes, do trust”, use their answers to set up a new call to the thread to connect(). You now have actual values for the setAccepted() calls.
As I said above, I don’t like it because all the overhead of establishing a connection occurs twice, but it is pretty clean code-wise, and short of stopping execution, this is the best solution we have… For now. One thing Android has been great at is addressing issues in a timely manner, this one just has to impact enough developers to raise it up as an issue.
And that, is that.
There’s a lot here, both for you to consume and me to write. Feel free to contact me with questions/corrections/issues, and I’ll update this or lend a hand as much as possible. It certainly works like a champ for our app, hopefully it does for yours also. The people out there offering all sorts of insecure and downright obfuscated answers to this problem could send you batty, hopefully this solution is a bit more concise and helps you not only implement the solution, but understand the problem domain also.
My caveat: None of us is an island any more. I wish I had a list of references to the bazillion websites and StackOverflow threads that eventually lead me to this implementation, but alas, I was focused on solving the problem, not documenting the path… So I’ll offer a hearty “thank you” to the entire Android community, and acknowledge that while this solution is my own, parts of it certainly came from others.