Equifax Insights Blog

Identity & Fraud

What Types of Analytics Intelligence Can Power Identity Trust?

December 09, 2020 | Sriram Tirunellayi

Consider Performance Criteria, Algorithms and Labeled Data

Fraud prevention is a challenging task, given the mercurial nature of the problem. Last week’s modus operandi may no longer be relevant today since fraudsters continually change and adapt. In my previous blog article, I described the data-to-decisions journey as it pertains to identity and fraud. Now, let’s dive into identity trust solutions that are increasingly powered by advanced analytics like artificial intelligence (AI) and machine learning (ML).

Machine learning methods are well suited to identity trust because they can work with huge volumes of data from multiple, diverse sources. The models can be trained to learn, adapt and detect evolving patterns. When designed properly and trained correctly, machine learning models can continue to learn as new data arrives in the form of feedback outcomes. They can isolate data points that deviate from known safe patterns and so help uncover new fraud patterns.

There are several key considerations in the optimal design of any machine learning model. Here, we will focus on the following three points to illustrate the relevance of AI and ML models for identity and fraud.

The Economics of Identity Trust Decisions

As described in an earlier blog article, identity trust decisions made during an interaction can affect the customer experience and influence the customer’s perception of the brand and of future interactions. In practical terms, this translates into three costs: operational cost, fraud losses and opportunity cost. The time and effort required to validate information count as operational cost. Fraud losses are incurred when a fraudulent interaction is trusted. Opportunity cost arises from customer friction, such as false positives or extra authentication steps during sign-up.

Given the complex, multi-faceted nature of trust decisions, machine learning models leverage a mix of intelligent data, behavior patterns and signals to make the best identity trust decision for every transaction. The goal of that decision is to minimize fraud losses while balancing operational costs against opportunity costs.
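One way to picture this trade-off is as an expected-cost comparison across the available actions. The sketch below is a minimal illustration, not a description of any production system: the three actions, the `Costs` fields, the dollar figures and the simplifying assumption that a step-up check stops all fraud are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Costs:
    """Hypothetical per-transaction cost assumptions."""
    fraud_loss: float        # loss if a fraudulent transaction is approved
    review_cost: float       # operational cost of a step-up check or review
    lost_value: float        # opportunity cost of losing a genuine customer
    friction_dropoff: float  # share of genuine customers who abandon a step-up


def expected_costs(p_fraud: float, c: Costs) -> dict[str, float]:
    """Expected cost of each action given the model's fraud probability."""
    p_good = 1.0 - p_fraud
    return {
        # Approving risks the fraud loss.
        "approve": p_fraud * c.fraud_loss,
        # Assumption: the step-up stops all fraud but costs effort and
        # drives away a share of genuine customers.
        "step_up": c.review_cost + p_good * c.friction_dropoff * c.lost_value,
        # Declining loses the value of every genuine customer declined.
        "decline": p_good * c.lost_value,
    }


def best_action(p_fraud: float, c: Costs) -> str:
    """Pick the action with the lowest expected cost."""
    costs = expected_costs(p_fraud, c)
    return min(costs, key=costs.get)


# Illustrative numbers only.
costs = Costs(fraud_loss=1000.0, review_cost=5.0,
              lost_value=200.0, friction_dropoff=0.1)
for p in (0.001, 0.05, 0.99):
    print(p, best_action(p, costs))
```

With these illustrative numbers, a very low fraud probability favors approval, a moderate one favors a step-up check and a very high one favors a decline. Changing any of the cost assumptions moves those boundaries, which is the point of framing the decision economically.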

The Dynamic Nature of Fraud

Fraud is a rare event, and it is not uncommon to see fraud rates of less than 50 basis points (0.5 percent). In some situations, fraud labels may be messy, costly and time consuming to obtain. In other situations, there may be only a handful of confirmed cases or anecdotal examples to work with. Fraud patterns also vary over time as fraudsters change their attack vectors and adapt to new defenses that institutions deploy. Machine learning models can adapt to these changing patterns by assessing a continuous feed of data and signals about a consumer’s past interactions, present context and predicted intent, and by adjusting intelligently based on feedback outcomes.
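To see why rarity matters, consider the arithmetic at a fraud rate of 50 basis points. The sketch below uses a hypothetical population of one million transactions and a deliberately useless "never flag anything" baseline. The function name and the population size are assumptions for illustration.

```python
def naive_baseline(population: int, fraud_bps: float) -> dict[str, float]:
    """Metrics for a baseline that never flags any transaction as fraud."""
    fraud_rate = fraud_bps / 10_000           # 1 basis point = 0.01 percent
    fraud_cases = round(population * fraud_rate)
    genuine_cases = population - fraud_cases
    return {
        "fraud_cases": fraud_cases,
        # Every genuine case is "correct", every fraud case is missed.
        "accuracy": genuine_cases / population,
        "recall": 0.0 if fraud_cases else float("nan"),
    }


result = naive_baseline(population=1_000_000, fraud_bps=50)
print(result)
```

At 50 basis points there are 5,000 fraud cases in a million transactions, so the do-nothing baseline is 99.5 percent accurate while catching no fraud at all. This is why overall accuracy is a poor yardstick for fraud models.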

The Importance of Problem Framing

Machine learning models do not exist in a vacuum. They exist to serve a specific business problem; therefore, an accurate articulation of the business problem is a fundamental requirement toward the correct formulation of the machine learning model. It’s important to answer critical questions upfront so that relevant design parameters such as training population, sampling and weighting schemes, segmentation, and even appropriate algorithms, can be chosen wisely. For example, you might pose the following questions:

Are you looking to replace an existing fraud model or augment the model as an additional layer of defense?
Would you like to predict behaviors of a fraudster or capture patterns of fraud victims?
Are there any biases in the label definition that need to be accounted for?

Distinctive Characteristics of Machine Learning Models for Fraud

These unique design considerations further dictate how machine learning for identity and fraud is different from other use cases.

Performance Criteria

Standard supervised models are evaluated against precision and recall. Due to the economics of fraud, it is not enough for fraud models to have high precision (low false positives) and high recall (high fraud capture rate). They must also achieve both within a fairly small fraction of the population. Put simply, the population that can be alerted for additional fraud review has to fall well within operational constraints. This is dictated by several factors:

impact to customer future value
number of investigators
time taken per investigation
cost of investigation

Thus, unlike credit risk or marketing models, where metrics like the Kolmogorov–Smirnov (KS) and Gini statistics evaluate model performance across the entire distribution of the scored data, fraud models are judged on false positives (precision) and fraud capture rate (recall), measured at alert volumes of 1 to 5 percent of the population.
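Measuring precision and recall at a fixed alert volume can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the toy scores and the toy labels are hypothetical, and only the highest-scoring share of the population is treated as alerted.

```python
def precision_recall_at_alert_rate(scores, labels, alert_rate):
    """Precision and recall when only the top `alert_rate` share is alerted.

    scores: model scores, higher means more suspicious
    labels: 1 for confirmed fraud, 0 otherwise
    alert_rate: fraction of the population that can be reviewed, e.g. 0.05
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_alerts = max(1, round(len(scores) * alert_rate))
    # Rank by score, most suspicious first, and keep only the alerted slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    caught = sum(label for _, label in ranked[:n_alerts])
    total_fraud = sum(labels)
    precision = caught / n_alerts
    recall = caught / total_fraud if total_fraud else 0.0
    return precision, recall


# Toy data: 100 transactions, 2 of them fraud (indices 90 and 99).
scores = [i / 100 for i in range(100)]
labels = [1 if i in (90, 99) else 0 for i in range(100)]

for rate in (0.01, 0.05, 0.10):
    print(rate, precision_recall_at_alert_rate(scores, labels, rate))
```

In this toy data, widening the alert volume from 5 to 10 percent captures the second fraud case but doubles the number of cases investigators must review, which is exactly the tension the operational constraints impose.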

Algorithms

The requirement of high precision at low alert volumes means that fraud models need specialized algorithms that are effective at finding a needle in a haystack. Ensemble classifiers, rule induction methods and artificial neural networks are some of the most common and successful supervised AI techniques used to predict fraud when adequate fraud labels are available.

The dynamic, varying nature of fraud patterns and the lack of strong fraud labels require that we also use unsupervised algorithms to help uncover patterns. Novelty detection, clustering, graph anomaly detection and social network analytics are some of the commonly used AI techniques.
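As a very small stand-in for novelty detection, the sketch below flags values that sit far from the bulk of the data using a robust z-score based on the median and the median absolute deviation. It is an illustration of the idea only, not one of the production techniques named above. The feature (applications per device per day), the data and the 3.5 threshold are assumptions.

```python
from statistics import median


def robust_z_scores(values):
    """Robust z-scores using the median and median absolute deviation (MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Simplification: with no spread, treat nothing as unusual.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD so scores are comparable to standard z-scores
    # when the data is roughly normal.
    return [0.6745 * (v - med) / mad for v in values]


def flag_novel(values, threshold=3.5):
    """Indices of observations that deviate strongly from the typical pattern."""
    return [i for i, z in enumerate(robust_z_scores(values)) if abs(z) > threshold]


# Hypothetical feature: applications submitted per device per day.
applications_per_device = [2, 3, 2, 4, 3, 2, 3, 40, 3, 2]
print(flag_novel(applications_per_device))
```

No fraud labels are needed here: the device at index 7 stands out simply because it departs from the known safe pattern, and it would then be a candidate for analyst review.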

Labeled Data

Machine learning models are known to suffer from cold start problems. Typically, this happens when there are no known cases of fraud or too few of them. Overcoming the problem and iterating on the model usually requires some creative thinking and domain knowledge. Below is an example of one such situation, a customer problem that we overcame.

We used the principles of active learning, which involves working with human beings — in this case, fraud analysts — to get feedback. We were able to iterate on the machine learning model by querying analysts for fraud feedback selectively and expanding on the features to improve overall performance with every generation. Each new version was produced on a biweekly schedule.
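One round of such a loop might look like the sketch below. The article does not say which query strategy was used, so uncertainty sampling (asking about the cases the model is least sure of) is an assumption here, as are the function names, the case IDs and the `ask_analyst` callback.

```python
def select_for_review(scores: dict[str, float], labeled: set[str],
                      batch_size: int) -> list[str]:
    """Pick unlabeled cases whose scores are closest to 0.5 (most uncertain)."""
    candidates = [
        (abs(score - 0.5), case_id)
        for case_id, score in scores.items()
        if case_id not in labeled
    ]
    candidates.sort()
    return [case_id for _, case_id in candidates[:batch_size]]


def run_round(scores, labels, ask_analyst, batch_size):
    """Query analysts for a small batch and fold their answers into the labels."""
    for case_id in select_for_review(scores, set(labels), batch_size):
        labels[case_id] = ask_analyst(case_id)
    return labels


# Hypothetical model scores and one case that is already labeled.
scores = {"a": 0.02, "b": 0.48, "c": 0.91, "d": 0.55, "e": 0.50}
labels = {"e": False}

# Stand-in for a human analyst: here, only case "d" is confirmed as fraud.
labels = run_round(scores, labels, ask_analyst=lambda case_id: case_id == "d",
                   batch_size=2)
print(labels)
```

After each round, the model would be retrained on the enlarged label set and rescored before the next batch is selected, so that analyst time is spent where it improves the model most.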

Unlike the patterns in credit risk and marketing use cases, fraud patterns are sophisticated, dynamic and constantly changing. This means the analytic models created to fight fraud today must be equally innovative and iterative, if not more so. When appropriately designed and trained, machine learning models will continually learn and adapt to fast-moving fraud patterns. Businesses can then make more precise, accurate decisions over time without compromising the customer experience. To learn more, visit our website or read the prior articles in this blog series.
