
Ensemble Modeling and AML Optimization

11/04/2019 by Calvin Crase Financial Crimes, Modernization - Analytics

Ensemble modeling, or ‘ensemble learning,’ means using more than one modeling technique in conjunction to gain better predictive performance. Learning how to combine modeling techniques can significantly increase the efficiency of your models, decrease your misclassification rate, and improve other metrics that evaluate model performance.

With ensemble learning, you can use the output from one model both to inform the choice of your second model and to feed into it.

Transaction Monitoring and Model Building

Model 1: Clustering

Ensemble learning has many potential applications, including financial crimes, marketing, advertising, and spam detection. The scientific community has also leveraged ensemble learning in domains such as bioinformatics. We are going to approach the subject with a focus on anti-money laundering and financial crimes, with the idea in mind that we can hit multiple birds with one stone. The areas we will cover include:

  • Customer segmentation
  • Scenario design + risk evaluation
  • Scenario tuning + threshold setting
  • Reduction in false positives

Our first machine learning technique is motivated by customer segmentation. We want to segment our customer base using a method called k-means clustering. We choose k-means as our first model because it is good at creating groups of similar things. Once we have groups of things that are by definition similar, we can develop a predictive model more tailored to each subgroup. The idea here is that there are certain characteristics we care about: characteristics we have reason to believe may be associated with either low-risk or high-risk individuals. Since these groups tend to exhibit different aggregate financial behavior, grouping customers before running a predictive algorithm means that when we give the data from these clusters to our next modeling technique, it can find characteristics that are predictive for each customer segment.
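As a rough illustration of the segmentation step, here is a minimal sketch in Python using scikit-learn rather than SAS. The two party-level features (monthly transaction count and average transaction amount), the three synthetic customer groups, and the choice of three clusters are all assumptions made for the example, not details from an actual implementation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical party-level features per customer:
# [monthly transaction count, average transaction amount in dollars]
retail = rng.normal(loc=[20, 150], scale=[5, 40], size=(200, 2))
small_business = rng.normal(loc=[120, 2_000], scale=[20, 400], size=(200, 2))
high_net_worth = rng.normal(loc=[15, 50_000], scale=[4, 8_000], size=(200, 2))
customers = np.vstack([retail, small_business, high_net_worth])

# Standardize first so the dollar-amount column does not dominate the
# Euclidean distances that k-means relies on.
scaled = StandardScaler().fit_transform(customers)

# The number of clusters is a modeling choice; 3 is assumed here.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
segment = kmeans.fit_predict(scaled)

# Summarize each segment in the original units.
for k in range(3):
    members = customers[segment == k]
    print(f"segment {k}: {len(members)} customers, "
          f"mean features = {members.mean(axis=0).round(1)}")
```

The resulting `segment` label is what gets carried forward, either as a feature for the next model or as a key for fitting one model per segment.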

This is important because when you are trying to determine whether financial behavior is potentially illicit, you are looking for behavior that is anomalous. But what is anomalous might depend on who you are. What characterizes odd financial behavior for someone like you might not be odd for a small business owner or Jeff Bezos. Finding the characteristics by which we can group people will allow us to optimize our transaction monitoring to minimize false positives, which in turn minimizes cost.

At the beginning I said we’re hoping to essentially get ‘multiple birds with one stone’ in our ensemble modeling. So far, the clustering method has given us the first of these: customer groups that segment our customer base.

The other important piece we can address at this point is scenario design and scenario tuning. Financial institutions will often write business rules called scenarios, which flag potentially illicit behavior. It makes sense to write these rules to be relevant to the specific typologies within different clusters. Once we have scenarios that are specific to our segments, we can set the thresholds for these rules, e.g. a scenario will trigger an alert on a customer if currency_amount is greater than $1,000, $5,000, or $100,000, etc.
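A segment-specific threshold rule could look something like the sketch below. The segment names and the mapping of the dollar figures above to particular segments are hypothetical; in practice the thresholds would come out of a tuning exercise on each segment's own data.

```python
# Hypothetical per-segment thresholds for a single scenario.
# The dollar figures are the ones mentioned above; which segment
# gets which figure is an assumption for illustration.
THRESHOLDS = {
    "retail": 1_000,
    "small_business": 5_000,
    "high_net_worth": 100_000,
}


def triggers_alert(segment: str, currency_amount: float) -> bool:
    """Return True when the amount exceeds the threshold for the customer's segment."""
    return currency_amount > THRESHOLDS[segment]


# The same transaction amount evaluated against each segment's threshold.
transactions = [
    ("retail", 4_200.0),
    ("small_business", 4_200.0),
    ("high_net_worth", 4_200.0),
]
alerts = [(seg, amt) for seg, amt in transactions if triggers_alert(seg, amt)]
print(alerts)  # [('retail', 4200.0)]
```

Only the retail customer alerts here, which is the point: a $4,200 transaction is unusual for one segment and routine for the others.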

Model 2: Logistic Regression or Decision Trees

What we want to use now is a predictive modeling technique leveraging the output data from our previous k-means clustering model. This step of using the output from the previous model is what constitutes ‘ensemble learning.’ The data we will leverage at this point will be at the transaction level as opposed to the party-level data we were using to cluster above.

It’s up to us to determine which model makes the most sense. We could leverage a decision tree, logistic regression, or any number of other options. Then we would compare these models to see which one performs the best and offers the greatest interpretability.
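One way to run that comparison is sketched below, again in Python with scikit-learn rather than SAS. Everything about the data is synthetic and assumed: the `segment` column stands in for the cluster label from the first model, `amount_z` and `prior_alerts` are made-up transaction-level features, and `label` stands in for whether an alert led to a SAR/STR. Cross-validated ROC AUC is just one reasonable metric to compare on.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
n = 1_000

# Hypothetical transaction-level data.
segment = rng.integers(0, 3, size=n)      # cluster label from the first model
amount_z = rng.normal(size=n)             # amount relative to the segment's norm
prior_alerts = rng.poisson(1.0, size=n)   # past alerts on the customer

# Synthetic target: 1 if the alert led to a SAR/STR, 0 if it was a false positive.
logit = -2.0 + 1.5 * amount_z + 0.8 * prior_alerts
label = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# One-hot encode the segment so it enters the model as a categorical feature.
X = np.column_stack([np.eye(3)[segment], amount_z, prior_alerts])

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

# Compare the candidates with 5-fold cross-validated ROC AUC.
scores = {
    name: cross_val_score(model, X, label, cv=5, scoring="roc_auc").mean()
    for name, model in models.items()
}
for name, auc in scores.items():
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Here the cluster label is passed in as a feature. Fitting a separate model per segment is an equally valid design, and both the logistic regression coefficients and a shallow tree remain easy to explain to an investigator or regulator.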

The purpose of this second model is to reduce false positives by leveraging historical data we have about our customer base. Have they alerted in the past, and if so, on which scenarios? In which cases did the bank file a SAR/STR on this person? We want to be able to predict the answers to those questions more effectively. Is the transaction-level behavior known to be associated with true positive illicit behavior, i.e. have SARs/STRs been filed that are associated with this transaction-level behavior for other customers?

As has been understood for a while in the AML world, false positives are an ongoing problem. Being able to effectively leverage historical data at the customer, transaction, and alert level can cut down significantly on the time an analyst has to spend investigating alerts. Combining machine learning techniques effectively is a reliable way to do this. This can save real money while also automating a lot of risk assessment work.

All of these different modeling techniques are available to use in SAS VDMML. This method of embedding one model within another is a proven method for improving the predictive performance of your models. Its applications are wide-ranging, from fraud and AML to marketing to biology.