Building a Fraud Profile with Device ID+ (Part 1)
Overview End-to-end architecture for IT fraud and security systems is an opaque space and best practices are usually held within the silos of large corporate cybersecurity teams - and for good reason. Cyber vendors are often the only ones who can connect the dots across customers and find pain points that need to be solved. Luckily for me, I have been able to sit down with security experts across all major industry verticals to discuss those pain points. For years, I have assisted their usage of cybersecurity point solutions (e.g., WAF, Bot, Fraud, etc.) from the perspective of API security, server-side exploits, client-side vulnerabilities, and so on. One piece of technology that is common across all cybersecurity architectures is some form of end user or device identifier. It is the single thread that runs across the entire technology stack and each organization uses it to drive fraud prevention and critical business analytics. Creation of an identifier starts when users interact with an application and provide input to it. Normally this happens when the user logs in, creates a new account, or creates a post or comment. This identifier is typically a traditional cookie from a browser fingerprinting solution created in-house or supplied by a third-party service. It is the way organizations identify and track their users and ultimately how they improve their business. At F5, we help security teams across the world’s top organizations understand their users better. Are they lying about their identity? Are they a known good user? Are they committing fraud, or do they appear to be malicious? We have made a large investment (see Shape Security) in creating an identifier that is based on unique signals and, most importantly, trusted by the security and fraud teams who use it. This identifier is known as Device ID+ and it is now available as a free service to anyone who wants to use it. Device ID+ Device ID+ was created to address the following problems with existing web-based identifiers and fingerprinting solutions: Over 30% of users cannot be tracked due to cookie churn. Frequent changes due to the likelihood that one browser will create multiple identifiers. Identifiers are reset after users clear the cache or go into incognito mode. Device ID+ leverages JavaScript to create an identifier that solves the issues of traditional user tracking through cookies. Developers can include a simple JavaScript tag (as shown in the example code) and use it in their application to determine if a user account is good, bad, encountering a bad user experience, has been compromised, and more. One of the major strengths of Device ID+ is that it persists across users who clear or reset their browser and you’ll have an opportunity to see this in action below. The purpose of this article is to give you a quick rundown on what Device ID+ is, why it’s important, and how to use it within your application. As a demonstration, I am going to inject Device ID+ into an existing login form that uses Google’s reCAPTCHA service. Google reCAPTCHA Google reCAPTCHA is the service that shows you pictures of things to verify that you are human. I am not going to address some of the most critical shortcomings of the reCAPTCHAapproach but since it’s a free service and many websites use it to manage bots, I thought it would make a great example on how Device ID+ can be used to strengthen any existing bot or WAF solution. In later articles we’ll trace Device ID+ from its creation to its consumption in fraud analytics. Preventing Application Abuse Since all users are born or recognized at login, I’m going to start with a simple login form. Login is where most of the fraud and malicious activity start and that’s why reCAPTCHA has been used over the years as a free service to try to prevent abuse. Today we are going to create what’s known as a Fraud Profile with Device ID+ and we’ll use it in later articles to super charge our fraud analytics and gain visibility into things like: Fraudulent behavior of automated bots Fraudulent or malicious posts and commenting Fraudulent user account creation Good user friction and unnecessary CAPTCHA challenges About the Demo Application This is a very simple demo application that shows how to layer Device ID+ into an existing application. See it in action at https://deviceid.dev/v3 If you wish to run this example locally as a Docker container, you can deploy it with the following command after installing Docker: docker run -d -p 80:8000 wesleyhales/deviceid.dev Open a browser and visit: http://localhost/v3 Demo Walkthrough For starters, go ahead and login to the application with your email address or any made-up value for the username. There’s no need to enter a password. Fig. 1-1 After you click Submit, you will see a description of the data that was captured. This is our Fraud Profile (Fig 1-2) that we have created for our users. It uses Device ID+ to encapsulate the reCAPTCHA score along with a timestamp of when the transaction took place. Fig. 1-2 Fraud Profiles are viewed differently across the cybersecurity industry. Some security teams build Fraud Profiles around credit card transaction data and others build them throughout specific flows across web pages. Device ID+ can be applied to any Fraud Profile and is built to be used on every page of the application. The more you use it, the more you can enhance good user experiences and/or eliminate actual fraud. The following JSON shows how the example app adds a reCAPTCHA signal to our Device ID+ Fraud Profile: Example of Device ID+ based Fraud Profile Fig. 1-3 Normally, developers would simply capture the score returned from the server side reCAPTCHA API and take action (0.9 in Fig 1-3 above). This score might be used in the authentication logic within the application, simply allowing the user to login if it’s above 0.7. It might also be sent downstream with additional user data to be recorded in a SIEM. The Device ID+ based Fraud Profile provides a structure around existing “scores” or data. This gives us an extendable framework that is decoupled from existing solutions and makes the identifier technology abstract. In our Fraud Profile, the Device ID+ information is located immediately following the username for a couple of reasons: Now we can identify how many different devices a single username is using. Is this account being shared? Is it compromised? Does it violate our terms of service? All of this can be answered by using Device ID+ under the system wide unique identifier (usually this is the username, or an email address as seen in the example). It also brings visibility to important user experience unknowns. Is this a good user who spends money regularly but is encountering too many reCAPTCHA challenges? It is a way to keep your current bot or fraud verification system in check to ensure friction is removed for your good user interactions. The Differentiator As users log in, they will acquire a new Device ID+ cookie which contains the following values. Fig. 1-4 diA is known as the “residue-based identifier”. It is the main identifier used directly after the username in our example. This value is stored locally on the device and may be deleted if the user clears their local storage or cookies. diB is known as the “attribute-based identifier”. This value will remain the same even when the user clears local storage. Keep in mind, it can change if the user upgrades their browser version as it is based on environment signals that remain consistent across browser versions. One easy way to test this feature is to log into the demo application with the same email address twice but using two different browser sessions. Login once in your regular browser and login again with the same browser in incognito mode. Fig. 1-5 In Figure 1-5, we see that the Device ID+ residue values are different for a single username, but the Device ID+ attribute is the same. Conclusion and Next Steps Now that we understand what makes the Device ID+ identification service unique, we can begin to craft ways to take advantage of it in our business analytics. In part 2 of this article series, we are going to analyze the user data from the live demo at https://deviceid.dev/v3 to visualize anomalies and areas where user friction might be occurring. Device ID+ usage spans a broad set of use cases across the enterprise and is complementary to any existing fraud or bot solutions. If you have input or ideas on how you’d like me to extend this article series, please mention them in the comments below. For more information regarding the technical details around Device ID+, see the documentation here. If you’d like to add Device ID+ to your own application, you can sign up for a free account here.1.2KViews1like0CommentsBuilding a Fraud Profile with Device ID+ (Part 2 - Analytics and Reporting)
Overview Today there are at least 4.9 million websites using reCAPTCHA, including 28% of the Alexa top 10,000 sites. Google’s reCAPTCHA service is a free offering and developers have been using it for years to try and defend against automation. Many cyber security vendors embed it in their core offering where customers pay subscription service fees for these vendors to “manage” reCAPTCHA and the data it produces. TL;DR - reCAPTCHA is probably causing revenue leakage, false positives and allowing abuse of the web properties that it’s deployed on. How do I know? Watch this demo video I put together the other day. The link is time bookmarked to start at 5:13. The video shows me logging in across 2 different browser sessions and reCAPTCHA returning false positives. Additionally, a simple search on Google or Github reveals the problems that developers face when using reCAPTCHA. reCAPTCHA is embedded into the web’s top sites because it’s free and seems to work. Or does it? What do I mean by “seems to work”? It depends on the business and the type of website, but “works” in this context typically means making the bot or automation problem go away. While developers have been laser focused on solving problems around bot nets, fraudulent users, and overall noise hitting the system, they’ve forgotten about user friction and revenue. In fact, revenue and user friction may be an afterthought for most developers because they don’t see the opportunity to remove friction and again, they have tunnel vision on fixing one particular problem. (For the record I’m not blaming developers, just stating the reality of most engineering organizations and how tasks are managed.) Let’s take a step back. Fraud detection is a framework within web applications and the creation and ongoing maintenance warrants a good bit of architecture and thought. That’s why I’ve been on a pursuit to research and expose the development of a “Fraud Framework” or “Fraud Profile” across the cybersecurity industry. I want to give developers a resource for greenfield projects and rewrites. A place that is open and information flows freely to make the web a safer place. That’s why I started the conversation in Part 1 and I plan on writing as many articles as possible to open up this well-kept secret across the industry. Experienced security engineers are highly sought after because they have been through the pains of developing these systems and architectures. Not only creating them, but making use of the fraud data that they generate. My goal is to make this knowledge freely accessible through this series of articles. This article and video serve as a stepping-stone in my research. In Part 1, we reviewed a simple NodeJS web application that implements a device identifier service to keep the existing fraud system in check, remove user friction and defend against fraudulent activity. The article demonstrates how to add F5’s Device ID+ to an existing application that already uses a basic bot defense or fraud scoring system – in this case reCAPTCHA. Let’s take a look at the live data from our example application deployed at deviceid.dev/v3 to: Analyze user login and transactional scoring data. Understand how well our fraud scoring system is working by looking at good user data. Gain insight into how Device ID+ improves Fraud analytics. Find areas to make changes to our application to remove user friction. Conclusion and Next Steps The example analysis of our Fraud Profile is just the beginning of what can be accomplished with a trustable device identifier. Analytics around user behavior and malicious intent can now be uncovered in new ways in fraud reporting. Device ID+ usage spans a broad set of use cases across the enterprise and is complementary to any existing fraud or bot solutions. If you have input or ideas on how you’d like me to extend this article series, please mention them in the comments below. For more information regarding the technical details around Device ID+, see the documentationhere. If you’d like to add Device ID+ to your own application, you can sign up for a free accounthere.444Views1like0Comments