As institutions explore generative AI for transaction monitoring, sanctions screening and KYC processes, questions about model risk management loom large. Can hallucinations be controlled? How do you explain AI-generated alert narratives? What about bias? Kevin Lee of Cygnus Compliance Consulting argues that these concerns are manageable within existing MRM frameworks, demonstrating how AI can be retrofitted into current assurance practices while keeping human accountability intact
Today’s compliance model risk management (MRM) policies are based on regular assurance of transaction monitoring (TM), sanctions screening and Know Your Customer (KYC) processes, and they follow a standard approach of assessing model input and model output.
This assurance takes the shape of real-time quality control (L2 alert clearing), post-event quality assurance, ongoing data governance and regular model validation and tuning exercises. Regulations in the US (including those from the Office of the Comptroller of the Currency, Federal Reserve and New York Department of Financial Services) require that model input, output and governance meet industry standards.
The proliferation of AI-based compliance solutions has put the spotlight on the model risk management that governs them and highlights the importance of retrofitting AI-based compliance models into today’s MRM frameworks.
The most well-known model risk is hallucination, or inaccurate model output. As with today’s models, model error is managed with a human in the loop (HitL) who has the requisite expertise to determine whether a hallucination has occurred. In the compliance context, the most granular level of model output is the alert, and it follows that the HitL is accountable for assuring the efficacy of each alert. This assurance happens in real time, before any waive/escalate decisions are finalized. In today’s operating model, this is commonly referred to as Level 1. Level 2 and post-decision quality assurance (QA) would remain unchanged (for now).
The previous point of HitL accountability at L1 is an important one to reinforce. In essence, AI’s role at the early stages of adoption is to assist the HitL. The HitL is still the individual accountable for the alert narrative, though the path taken to the final narrative has been streamlined and improved. That is, the accountability model for an AI-native alert-clearing process remains unchanged from today.
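To make that accountability point concrete, below is a minimal sketch of a HitL gate for AI-assisted alert clearing. The class and function names (Alert, Decision, finalize) are illustrative assumptions, not any vendor’s API; the point is simply that no waive/escalate decision is recorded without a named human reviewer attached.

```python
# Minimal sketch of a human-in-the-loop (HitL) gate for AI-assisted alert clearing.
# All names here are illustrative assumptions, not a specific product's interface.
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PENDING = "pending"
    WAIVE = "waive"
    ESCALATE = "escalate"


@dataclass
class Alert:
    alert_id: str
    ai_draft_narrative: str          # drafted by the AI assistant
    decision: Decision = Decision.PENDING
    reviewed_by: str | None = None   # the accountable L1 analyst


def finalize(alert: Alert, analyst_id: str, decision: Decision) -> Alert:
    """Record a waive/escalate decision only with a named analyst attached.

    The AI drafts the narrative, but accountability stays with the human:
    no decision is persisted without a reviewer of record.
    """
    if decision == Decision.PENDING:
        raise ValueError("A final decision (waive or escalate) is required.")
    alert.decision = decision
    alert.reviewed_by = analyst_id
    return alert
```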
Another key risk is explainability, of which there are two interconnected themes: explainability of the AI model itself and explainability of the alert narrative. The former is best addressed at a more macro level, though it is not unlike explaining the Jaro-Winkler, Levenshtein and Soundex algorithms that power today’s sanctions models. The latter, our subject at the moment, concerns how transparency can be introduced into the alert narrative itself. The answer is both simple and visible in real time: The alert narrative will carry a citation for each point it makes. The gen AI’s narrative will be sourced from a bank policy, a regulatory policy, bank data, external intelligence, etc. Even logical reasoning (“this looks like structuring”) can be grounded in a source and cited for HitL verification. The AI model can even break down the “weighting” of the various factors it is considering. In this way, the explainability an AI model can generate far surpasses the precision with which a human adjudicator can cite their thought process.
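As a rough illustration, a cited, weighted narrative could be represented with a structure like the sketch below; the field names and source types are assumptions for the example, not a standard schema.

```python
# Illustrative structure for a cited, weighted alert narrative. Field names and
# source types are assumptions, not an established industry schema.
from dataclasses import dataclass


@dataclass
class NarrativePoint:
    statement: str      # e.g. "Repeated deposits just under the reporting threshold"
    source_type: str    # "bank_policy", "regulatory_guidance", "bank_data", "external_intel"
    source_ref: str     # pointer the HitL can verify, e.g. a policy section or query ID
    weight: float       # model's stated contribution of this factor to the assessment


@dataclass
class AlertNarrative:
    alert_id: str
    points: list[NarrativePoint]

    def summary(self) -> str:
        """Render the narrative with its citations, highest-weighted factors first."""
        lines = [f"Alert {self.alert_id}:"]
        for p in sorted(self.points, key=lambda x: x.weight, reverse=True):
            lines.append(
                f"  - {p.statement} [{p.source_type}: {p.source_ref}, weight={p.weight:.2f}]"
            )
        return "\n".join(lines)
```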
Bias is often pointed to as a risk of an AI approach. However, it is straightforward to remove data points (name, gender, age, etc.) that may lead to bias and have the AI model operate on the scrubbed record. Calls to adverse media sources can be made in a separate container, and the results can be scrubbed of sensitive data. This is a much more controlled approach than today’s human equivalent; after all, the risk of bias resides in each individual adjudicator. In this way, the risk footprint of bias in alert clearing is centralized and drastically reduced compared with today’s human-driven process.
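A simple sketch of that scrub step follows; the field list and record shape are assumed for illustration, and a production implementation would define the sensitive attributes with legal and compliance stakeholders.

```python
# Sketch of stripping potentially bias-inducing attributes before the model sees
# the record. The field list and example record are illustrative assumptions.
SENSITIVE_FIELDS = {"name", "gender", "age", "date_of_birth", "nationality"}


def scrub(record: dict) -> dict:
    """Return a copy of the record with sensitive attributes removed.

    The scrubbed record is what the alert-clearing model receives; adverse media
    lookups would run in a separate container and pass through the same scrub.
    """
    return {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}


customer = {
    "customer_id": "C-1043",
    "name": "Jane Doe",
    "gender": "F",
    "age": 42,
    "txn_count_30d": 87,
    "cash_deposit_total_30d": 48_500,
}
model_input = scrub(customer)  # keeps only behavioral/transactional fields
```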
Lastly, assurance of model input and model output is retrofitted into today’s model validation process. Measuring the movement of bank data into the AI’s pre-processing layers remains a priority; even as regulatory AI products grow more complex, the adage “garbage in, garbage out” remains true. Similarly, challenger models, whether programmatic or human, should serve to assess whether the model output is functioning as expected. Post-output data analysis should be used as an assurance layer to measure model drift, bias across protected characteristics and other such risks.
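A toy example of a challenger-style check on output drift is sketched below; the metric (escalation rate) and the tolerance are illustrative choices, not regulatory thresholds or a prescribed validation method.

```python
# Sketch of a challenger-model check on output drift. Metric and tolerance are
# illustrative assumptions; a real validation cycle would use richer statistics.
def escalation_rate(decisions: list[str]) -> float:
    """Share of alerts a model recommends escalating."""
    return sum(d == "escalate" for d in decisions) / len(decisions)


def drift_check(champion: list[str], challenger: list[str], tolerance: float = 0.05) -> bool:
    """Return True if champion and challenger escalation rates stay within tolerance.

    A gap beyond tolerance is a simple proxy for output drift and would flag
    the model for deeper review during validation and tuning.
    """
    gap = abs(escalation_rate(champion) - escalation_rate(challenger))
    return gap <= tolerance
```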
The risk footprint of AI models in regulatory compliance is largely risk-neutral or risk-negative. As an industry, we are reducing the risk by taking an AI-assisted approach while leaving the current accountability model untouched. That footprint is further managed by retrofitting assurance of AI models into the model risk management framework that has served the current generation of compliance technology relatively well.


Kevin Lee is a managing director at Cygnus Compliance Consulting. He formerly served in a variety of roles at AML RightSource, Navigant and Exiger.