A Better Approach to Anomalous Detection for AML
04/02/2020 by Craig Willis Financial Crimes
Most financial institutions today have implemented some type of automated Anti-Money Laundering (AML) transaction monitoring system. These systems collect and aggregate large amounts of transaction data for customers, accounts, external parties, and other relevant entities, and then apply rules to this data to trigger alerts when the observed behavior is deemed abnormal.
The problem with this approach is that ‘normal’ behavior can be defined in many ways across an organization. Different products and transaction types can have vastly different definitions of normal, depending on who is looking at the behavior and how they are viewing it. Most institutions use percentiles or simple expert opinion to set the levels of behavior that may be suspicious. These measures are typically based on what is known about their customers, or on specific types of behavior they have caught in the past.
The inconsistent nature of this approach gives rise to multiple rules all trying to cover the same risk, resulting in a significant increase in alert volumes without an accompanying increase in the detection of truly anomalous behavior.
There is a better approach.
Anomaly or outlier detection seeks to identify data points, in our case things like parties or transactions, that differ from most of the data being analyzed. Often, when observing only one or two variables, we can use data visualization to identify outliers. Scatter plots are a great way to view two-dimensional data, and in the graph below our outlier is easily identifiable.
A graphical approach is insufficient for the complex problem of detecting suspicious financial transactions. There are simply too many variables to consider, such as transaction timing, location, and the direction of the transaction. The large number of variables and the fast-moving, dynamic nature of the financial industry today call for a more sophisticated approach.
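As a numerical counterpart to spotting an outlier on a scatter plot, here is a minimal sketch in Python. The data points, the median-based center, and the cutoff of three times the median distance are all illustrative assumptions, not a standard rule and not part of any monitoring product.

```python
import math
import statistics

# Hypothetical two-dimensional observations (for example, two behavior
# measures per customer). The last point is deliberately unusual.
points = [(1.0, 1.2), (1.1, 0.9), (0.9, 1.0), (1.2, 1.1), (1.0, 0.8), (5.0, 6.0)]

# Use the per-axis median as a center that the outlier itself barely moves.
center = (statistics.median(p[0] for p in points),
          statistics.median(p[1] for p in points))

# Euclidean distance of every point from that center.
distances = [math.dist(p, center) for p in points]

# Illustrative rule: flag points more than 3x the typical (median) distance.
cutoff = 3 * statistics.median(distances)
outliers = [p for p, d in zip(points, distances) if d > cutoff]

print(outliers)
```

With only two variables this agrees with what the eye picks out on a plot; the difficulty described next is that real monitoring data has far more than two dimensions.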
In the following section and subsequent posts, we will discuss some of the anomaly detection methodologies available and how they can be used to successfully augment your BSA/AML transaction monitoring program.
A single transaction type approach seeks to isolate all behavior within a single type of transaction (e.g., outgoing wires or ATM cash deposits) and identify the customers who use these transactions in a manner different from most other customers. This is usually one of the simpler approaches to anomaly detection because it considers only part of a customer’s overall behavior. Depending on the use case, this can be a great starting point in developing a more sophisticated approach to transaction monitoring.
This type of analysis requires 6 to 12 months of historical transaction data. The amount of data collected depends on how much is available and how efficiently it can be processed. Next, behavior measures are gathered. Examples of behavior measures include:
These measures are aggregated by customer, with one set of measures for all months except the current month and one set for the current month. Since the dollar amount of transaction data is often highly right-skewed, the log of this measure is calculated to produce a more normal distribution. Principal component analysis is then performed on all the measures to determine which measures are the most independent. These measures are then used with the SAS procedure PROC FASTCLUS to create the clusters of customers. These clusters represent normal behavior patterns within this single transaction type, meaning there are many customers using this transaction type in a similar pattern. Finally, we compare the current month’s data against these original clusters to determine which customers in the current month are the farthest away from the cluster centroids.
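The flow described above can be sketched in plain Python. This is an illustration under several assumptions, not the SAS implementation: a small k-means routine stands in for PROC FASTCLUS, the principal component step is omitted for brevity, and the two behavior measures (total dollar amount and transaction count), the synthetic history, and the customer IDs are all hypothetical.

```python
import math
import random

def features(total_amount, txn_count):
    """Log-transform the skewed dollar amount; keep the count as-is."""
    return [math.log(total_amount), float(txn_count)]

def fit_scaler(rows):
    """Column means and standard deviations learned from historical data."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds

def scale(row, means, stds):
    """Standardize one row so no single measure dominates the distance."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the final centroids."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            groups[nearest].append(p)
        updated = [[sum(c) / len(g) for c in zip(*g)] if g else centroids[i]
                   for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return centroids

# Synthetic history (assumption): two typical usage patterns for one
# transaction type, aggregated per customer over the prior months.
rng = random.Random(42)
history = [features(rng.lognormvariate(math.log(1_000), 0.3), rng.randint(1, 5))
           for _ in range(100)]
history += [features(rng.lognormvariate(math.log(50_000), 0.3), rng.randint(15, 25))
            for _ in range(100)]

# Build clusters of "normal" behavior from the historical measures only.
means, stds = fit_scaler(history)
centroids = kmeans([scale(r, means, stds) for r in history], k=2)

# Current-month measures for two hypothetical customers.
current = {"C1": features(1_200, 3), "C2": features(5_000_000, 150)}

# Score each customer by distance to the nearest historical centroid.
scores = {cust: min(dist(scale(row, means, stds), c) for c in centroids)
          for cust, row in current.items()}

for cust, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(cust, round(score, 2))
```

The scaler and centroids are fitted on history alone, so the current month is measured against past behavior rather than against itself. Customers at the top of the printed list are the ones farthest from any established pattern.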
Once this scenario is developed, there are multiple ways to refine it to produce more relevant results. The initial dataset can be filtered to look only at transactions above a certain threshold or to only include certain products in order to produce more relevant clusters. The distance from the centroids required to produce an alert can be increased or decreased in order to modify the number of alerts produced.
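Both refinements can be illustrated with a short Python sketch. The transactions, distance values, and cutoffs below are hypothetical and chosen only to show how each adjustment changes the output.

```python
# Hypothetical (customer, amount) transactions before any filtering.
transactions = [("C1", 250.0), ("C1", 12_000.0), ("C2", 9_800.0), ("C3", 40.0)]

def filter_transactions(transactions, min_amount):
    """Keep only transactions at or above a dollar threshold."""
    return [t for t in transactions if t[1] >= min_amount]

# Hypothetical distances from each customer's nearest cluster centroid.
distances = {"C1": 0.4, "C2": 3.9, "C3": 1.7, "C4": 2.6, "C5": 0.9}

def alerts(distances, cutoff):
    """Return customers whose distance exceeds the cutoff, farthest first."""
    flagged = [c for c, d in distances.items() if d > cutoff]
    return sorted(flagged, key=distances.get, reverse=True)

print(filter_transactions(transactions, 1_000.0))

# Raising the cutoff shrinks the alert list; lowering it grows the list.
for cutoff in (1.5, 2.5, 3.5):
    print(cutoff, alerts(distances, cutoff))
```

In practice the cutoff would be tuned against alert volumes and investigation outcomes rather than picked by hand as it is here.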
In this article, we’ve provided an introductory overview of our recommended anomaly detection process, but there is much more to learn. In future posts, we will write more extensively about the details of coding this approach, some of the underlying statistical concepts, and some more complex approaches to the same problem.