Scott_Swensen
Legacy Employee

While identifying and stopping bots is an important part of securing our customers' web traffic, many enterprises are also at risk from human actors manually committing costly fraud. We now regularly read in the news about website login credentials, credit card numbers, and personally identifiable information (PII) being compromised and disseminated across the internet, leading to billions of dollars in losses every year for financial institutions, retailers, government agencies, and others. Shape AI Fraud Engine (SAFE) was developed to combat this manual fraud.

Recognizing and mitigating manual fraud can be difficult. Unlike automated web traffic, manual fraud is committed by humans who often behave very similarly to good users. And while manual fraud may be very costly to an enterprise, the volume of manual fraud traffic is tiny compared to good traffic, typically comprising less than 0.1% of all traffic. This makes it difficult to separate fraudsters' online behavior from unusual (but not nefarious) actions by good users. While some fraudster behavior patterns recur from one enterprise to another, vulnerabilities specific to an enterprise will lead to observed fraudster behavior that may differ substantially from that seen at other, similar enterprises. And because we do not see chargebacks from customer transactions, it is generally not possible to confirm with absolute certainty that a specific transaction is fraudulent, though we can have a high degree of confidence when certain combinations of signals are observed.

There are three types of transaction features that can be highly indicative of manual fraud: (1) user behavioral signals; (2) environmental features; and (3) user journey features.

User Behavior Patterns in Manual Fraud

Fraudsters often interact with a website differently than the majority of manual users. This includes behavior such as copying and pasting login credentials or personal information when signing up for a new online account, not using autofill during login, typing in an irregular fashion, and navigating away from the current browser while on the login page. As an example, the following figure shows the mean time between printable keystrokes for a login page.

While most good users have a small mean time between keystrokes (less than 0.5 seconds on average), fraudsters often type more irregularly, with both the mean time between keystrokes and the variation in time between keystrokes being larger. This occurs because a fraudster may navigate to a different page or browser to look up stolen credentials while on the login page, or may be copying credentials or typing slowly while checking their spelling. While these behavioral signals can be strong, a handful of behavioral features alone is usually not enough to make a high-confidence fraud prediction.
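As a rough illustration of the kind of feature this comparison relies on, the sketch below derives the mean and variation of inter-keystroke time from captured key-event timestamps. The function and field names are illustrative only and do not reflect SAFE's actual API.

```python
# Minimal sketch: deriving keystroke-timing features from captured key-down
# timestamps (in seconds since the page loaded). Names are illustrative.
import statistics

def keystroke_timing_features(key_down_times):
    """Return mean and standard deviation of time between printable keystrokes."""
    gaps = [t2 - t1 for t1, t2 in zip(key_down_times, key_down_times[1:])]
    if not gaps:
        return {"mean_gap_s": None, "std_gap_s": None}
    return {
        "mean_gap_s": statistics.mean(gaps),
        "std_gap_s": statistics.pstdev(gaps),
    }

# A steady typist produces small, regular gaps; a fraudster pausing to look up
# stolen credentials produces larger, more variable gaps.
print(keystroke_timing_features([0.00, 0.18, 0.33, 0.51, 0.70]))   # steady
print(keystroke_timing_features([0.00, 0.25, 4.80, 5.10, 12.40]))  # irregular
```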

Behavioral signals can also be compared to past behavioral signals for the same username. For example, copying and pasting credentials can be a good indicator of suspicious login behavior; the signal is even stronger if previous logins to that same account show that this is not normal behavior for that user.
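A minimal sketch of that idea, assuming a hypothetical per-account history store that records whether each past login pasted its credentials:

```python
# Minimal sketch: comparing the current login's behaviour to the account's
# history. The history structure and field names are hypothetical.
def credential_paste_is_anomalous(username, pasted_credentials, history):
    """Flag a pasted credential as suspicious only if past logins for this
    account were typically typed rather than pasted."""
    past_logins = history.get(username, [])
    if not pasted_credentials or not past_logins:
        return False
    past_paste_rate = sum(login["pasted"] for login in past_logins) / len(past_logins)
    # Pasting is anomalous for this account if it almost never happened before.
    return past_paste_rate < 0.1

history = {"alice": [{"pasted": False}, {"pasted": False}, {"pasted": False}]}
print(credential_paste_is_anomalous("alice", True, history))  # True: unusual for alice
```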

Environmental Patterns in Manual Fraud

Environmental indicators such as new cookies, use of a VPN, incognito mode, environmental spoofing, and access from known hosting ASNs are generally more common for fraudsters than for good users. The following figures show the proportion of fraud logins and good logins to an enterprise that used a VPN and environmental spoofing.

[Figure: proportion of fraud vs. good logins that used a VPN]

[Figure: proportion of fraud vs. good logins with environmental spoofing]
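A minimal sketch of how environmental indicators like these could be assembled into features for a single login; the request fields and the example hosting ASNs are assumptions, not SAFE's real schema:

```python
# Minimal sketch: building environmental risk features for one login event.
# Field names and the ASN list are illustrative placeholders.
HOSTING_ASNS = {16509, 14618, 15169}  # example ASNs of known hosting providers

def environmental_features(request):
    return {
        "new_cookie": request.get("cookie_age_days", 0) == 0,
        "vpn": request.get("is_vpn", False),
        "incognito": request.get("private_browsing", False),
        # Simple spoofing check: reported environment disagrees with fingerprint.
        "env_spoofing": request.get("reported_os") != request.get("fingerprinted_os"),
        "hosting_asn": request.get("asn") in HOSTING_ASNS,
    }

login = {"cookie_age_days": 0, "is_vpn": True, "asn": 16509,
         "reported_os": "Windows", "fingerprinted_os": "Linux"}
print(environmental_features(login))
```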

 

User Journey Patterns

Finally, the journey of the user through the website can be indicative of fraudulent behavior. High access diversity, defined as a single device accessing or attempting to access multiple accounts, is usually a very good indicator of manual fraud. The following figure shows the number of login emails attempted over a month-long period by devices associated with fraud and devices associated with good behavior. While this plot shows an extreme example, the pattern of high access diversity is common in account takeover (ATO) fraud instances.

[Figure: number of login emails attempted per device over one month, fraud vs. good devices]
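A minimal sketch of computing access diversity per device over a 30-day window, assuming a hypothetical event schema with device_id, email, and timestamp fields:

```python
# Minimal sketch: counting distinct login identities attempted per device
# within a 30-day window. The event schema is an assumption.
from collections import defaultdict
from datetime import datetime, timedelta

def access_diversity(events, window_days=30, now=None):
    """Return {device_id: number of distinct emails attempted in the window}."""
    now = now or datetime.utcnow()
    cutoff = now - timedelta(days=window_days)
    emails_per_device = defaultdict(set)
    for event in events:
        if event["ts"] >= cutoff:
            emails_per_device[event["device_id"]].add(event["email"].lower())
    return {device: len(emails) for device, emails in emails_per_device.items()}

# A device attempting logins against dozens of different accounts in a month
# is a strong ATO signal.
```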

 

In addition to high access diversity, fraudsters often access high-risk pages more frequently than good users. The following figure shows the distribution of manual fraud devices and good devices that accessed the password reset page of a financial institution in the previous month.

[Figure: manual fraud vs. good devices that accessed the password reset page in the previous month]

 

In addition to the password reset page, fraudsters logging into other people's financial accounts may visit the profile page to attempt to change the contact information associated with the account. Fraudsters are also more likely than good users to visit pages where funds can be sent to an external recipient, such as money transfer and bill-pay pages, and they tend to visit those pages more frequently. For retail fraud, a device may quickly add items to the cart, pay with a stolen credit card, and then quickly try to purchase more items again, behavior that is unusual for normal users.
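A minimal sketch of counting a device's visits to high-risk pages like those described above; the page paths and event fields are illustrative assumptions:

```python
# Minimal sketch: per-device visit counts for high-risk pages over a recent
# window of page-view events. Paths and field names are illustrative.
from collections import Counter

HIGH_RISK_PAGES = {"/password-reset", "/profile/contact", "/transfer", "/bill-pay"}

def high_risk_page_counts(page_views, device_id):
    """Return visit counts per high-risk page for one device."""
    return Counter(
        view["path"] for view in page_views
        if view["device_id"] == device_id and view["path"] in HIGH_RISK_PAGES
    )
```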

Fraud Prediction

Tree-based machine learning (ML) models are well suited to predicting manual fraud. These models can find complex patterns in large sets of features where there is no simple parametric relationship between the individual signals and the fraud recommendation. Additionally, tree-based models are interpretable: the important features contributing to a prediction can be extracted and used to explain it. Rules-based models are sometimes used as well, or combined with ML models, to capture well-known fraud patterns.
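As an illustration only, the sketch below trains a tree-based classifier, with scikit-learn's GradientBoostingClassifier standing in for whatever model SAFE actually uses, and shows how feature importances can be read off for interpretability. X, y, and feature_names are assumed to be pre-built labeled feature data.

```python
# Minimal sketch: training a tree-based fraud classifier and extracting
# feature importances. This is not SAFE's implementation.
from sklearn.ensemble import GradientBoostingClassifier

def train_fraud_model(X, y, feature_names):
    model = GradientBoostingClassifier(n_estimators=200, max_depth=3)
    model.fit(X, y)  # y: 1 = known/likely fraud, 0 = good transaction
    # Tree ensembles expose per-feature importances, which helps explain which
    # behavioral, environmental, or journey signals drive the predictions.
    importances = sorted(zip(feature_names, model.feature_importances_),
                         key=lambda pair: pair[1], reverse=True)
    return model, importances
```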

The following figure shows an example model performance curve for a SAFE customer. The line plots the true positive rate (the recall, or share of known fraud captured) against the false positive rate (the proportion of good transactions mistakenly classified as fraud by the model). Any threshold along the line may be chosen to take action for the customer. We may set a very low false-positive threshold for blocking transactions so that we do not block good users whose behavior happens to resemble that of fraudsters. In the example below, the model captures 70% of the manual fraud at a false positive rate of 0.02%, or approximately one in 5,000 good transactions. This false positive rate is quite low, but keep in mind that most enterprises see far more good transactions than fraudulent ones, so even false positive rates this low may produce tens or hundreds of false positives per day, depending on the customer's traffic volume. At a lower confidence threshold, we may instead recommend that the customer review those transactions or trigger multi-factor authentication (MFA) to ensure the users are who they claim to be.

[Figure: example SAFE model performance curve, true positive rate vs. false positive rate]
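A minimal sketch of choosing an operating threshold from such a curve, assuming labeled scores are available; this illustrates the threshold trade-off, not SAFE's implementation:

```python
# Minimal sketch: pick the score threshold that maximizes recall while keeping
# the false positive rate at or below a target (e.g. 0.02%).
# Assumes y_true (0/1 labels) and y_score (model probabilities) are available.
import numpy as np
from sklearn.metrics import roc_curve

def threshold_for_fpr(y_true, y_score, max_fpr=0.0002):
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    acceptable = np.where(fpr <= max_fpr)[0]
    best = acceptable[np.argmax(tpr[acceptable])]  # best recall within budget
    return thresholds[best], tpr[best], fpr[best]

# Transactions scoring above the blocking threshold are blocked; a second,
# lower-confidence threshold can instead trigger MFA or manual review.
```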

Conclusion

Behavioral, environmental, and user journey features can be strong signals for predicting manual fraud in web traffic for many of the customers F5 protects. SAFE harnesses these signals through machine learning to take action on suspicious online traffic: blocking high-confidence fraud transactions and sending signals to trigger a challenge or review for lower-confidence fraud predictions. As enterprises provide more examples of the manual fraud they have experienced, indicators from those transactions can be used to train SAFE to capture more fraud without introducing unnecessary friction for good users.

Comments
MichaelOLeary
F5 Employee

Great read, thanks!

Makoto_Onodera
F5 Employee

This precision recall curve is extremely ideal. That's a great solution.
