Regulators won’t simply look at your AI outputs when scrutinizing your organization’s compliance, Jonny Frank, Nathan Gibson, Michael Costa and Kashif Sheikh of StoneTurn write. Your decision-making is being evaluated, and relying on the defense that AI told you to make a certain decision is a recipe for disaster.
AI is changing the speed and scale of compliance work, identifying anomalies in real time and surfacing risks that would have taken weeks to find manually.
But speed without structure creates different risks. Regulators are not evaluating your algorithms. They are evaluating your decisions — how you reached them, whether they are supported, and whether they hold up under scrutiny. Using AI isn’t the risk; in fact, failing to use AI where it clearly fits can create a different risk entirely. Organizations land in the hot seat when they treat AI output as the defense for every conclusion it helps produce.
The defensibility gap
Recent allegations involving a provider of AI-automated auditing and compliance reviews underscore a growing problem: the defensibility gap. Whistleblowers allege the firm bypassed authentic reviews and used “certification mills” to rubber-stamp compliance reports.
Whether or not those specific allegations are proven, the lesson is clear: If you cannot explain and defend the result, the process does not matter. In today’s environment, where AI missteps are scrutinized and amplified, a weak or opaque process risks not only regulatory exposure but immediate reputational damage.
The allegations also highlight a broader structural issue: Independence required for credible assurance breaks down when a single platform implements compliance measures and evaluates their effectiveness. For companies facing DOJ or SEC scrutiny, “check-the-box” automation is a liability.
What regulators expect is straightforward: a clear, auditable trail; evidence of independent judgment; and conclusions that can be explained and defended.
Where AI helps and where it doesn’t
AI is powerful where scale and pattern recognition matter. This can include identifying anomalies across large datasets, scanning contracts or transactions for outliers and surfacing potential misconduct signals. These capabilities are not theoretical. Many compliance teams use AI to accelerate reviews that previously took weeks. The technology works.
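To make the outlier-screening idea concrete, here is a minimal sketch in Python using scikit-learn’s IsolationForest over a toy transaction table. The column names, sample values and contamination rate are illustrative assumptions, not a description of any particular compliance platform:

```python
# Minimal sketch: surfacing outlier transactions for human review.
# Columns, values and the contamination rate are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import IsolationForest

transactions = pd.DataFrame({
    "amount": [120.0, 95.5, 101.2, 98.7, 15000.0, 110.3, 99.9],
    "payments_last_30d": [3, 2, 4, 3, 27, 2, 3],
})

# contamination is the assumed share of anomalies; it must be tuned per dataset
model = IsolationForest(contamination=0.15, random_state=42)
transactions["flag"] = model.fit_predict(transactions[["amount", "payments_last_30d"]])

# -1 marks an outlier worth a closer look; the model only surfaces candidates,
# it concludes nothing about misconduct
print(transactions[transactions["flag"] == -1])
```

The point of the sketch is the division of labor: the model ranks and surfaces, and a person still decides what the flag means.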
But the genesis of a compliance failure is rarely found in data alone. Uncovering it requires an understanding of why people acted, whether controls were bypassed and whether leadership reinforced — or undermined — the program.
AI cannot interview a whistleblower effectively. It cannot assess tone at the top. It cannot distinguish a technical failure from a cultural one. These require human judgment. That is where experienced professionals add irreplaceable value — not as a check on AI but as the judgment layer that makes AI outputs meaningful.
The shift in skillset
As AI becomes embedded in compliance, the job is changing. Less time gathering data. More time interpreting it and acting on it. Four capabilities matter most:
- Model validation: Understanding what the AI is doing, where it is most effective and how to spot when its outputs need a closer look.
- Data integrity: Ensuring inputs are reliable, knowing that bad data can produce confident but wrong answers.
- Contextual judgment: Recognizing when controls exist on paper but fail in practice.
- Defensible conclusions: Translating outputs into a clear, supportable narrative that explains why a conclusion was reached (a minimal record along these lines is sketched below).
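As a concrete illustration of what a defensible trail might capture, here is a minimal Python sketch of a per-conclusion audit record. Every field name, and the idea of hashing the exact inputs, is an assumption about what “defensible” could require in practice, not a prescribed schema:

```python
# Minimal sketch of an auditable record behind one AI-assisted conclusion.
# All field names are illustrative assumptions, not a prescribed schema.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import hashlib
import json

@dataclass
class ConclusionRecord:
    model_version: str   # which model produced the output (model validation)
    input_digest: str    # hash of the exact inputs reviewed (data integrity)
    ai_output: str       # what the tool said
    reviewer: str        # the human who stands behind the conclusion
    rationale: str       # why the reviewer agreed with or overrode the tool
    decided_at: str      # timestamp for the audit trail

def record_conclusion(model_version: str, inputs: dict, ai_output: str,
                      reviewer: str, rationale: str) -> ConclusionRecord:
    digest = hashlib.sha256(json.dumps(inputs, sort_keys=True).encode()).hexdigest()
    return ConclusionRecord(model_version, digest, ai_output, reviewer,
                            rationale, datetime.now(timezone.utc).isoformat())

record = record_conclusion(
    "risk-screen-v3",                                # hypothetical model name
    {"txn_id": 4711, "amount": 15000.0},
    "flagged: amount far above vendor baseline",
    "j.doe",
    "Confirmed anomaly; escalated to investigations.",
)
print(json.dumps(asdict(record), indent=2))
```

Note that the record ties each of the four capabilities above to a field a regulator could actually inspect.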
Questions to ask before deploying AI
AI compliance-testing products are sure to proliferate. Ask your vendor these questions to assess whether the output will stand up to scrutiny:
- Who is accountable if the tool flags or clears an issue?
- What evidence supports the conclusion?
- Does the same system design, implement and test controls?
- Does the tool produce auditable documentation or just outputs?
- What are the limits of the model?
- What data is excluded?
- How are false positives and false negatives handled? (One way to check is sketched after this list.)
- Would you rely on this output in front of the DOJ or SEC?
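On the false-positive and false-negative question in particular, one concrete way to hold a vendor’s answer to account is to score the tool’s flags against a human-labeled review sample. The labels and counts below are invented for illustration:

```python
# Sketch: measuring a tool's false positives and false negatives against a
# human-labeled review sample. All values here are invented for illustration.
tool_flags   = [1, 0, 1, 1, 0, 0, 1, 0]  # 1 = tool flagged the item
human_labels = [1, 0, 0, 1, 0, 1, 1, 0]  # 1 = reviewer confirmed a real issue

tp = sum(t == 1 and h == 1 for t, h in zip(tool_flags, human_labels))
fp = sum(t == 1 and h == 0 for t, h in zip(tool_flags, human_labels))
fn = sum(t == 0 and h == 1 for t, h in zip(tool_flags, human_labels))

precision = tp / (tp + fp)  # share of flags that were real issues
recall = tp / (tp + fn)     # share of real issues the tool caught
print(f"precision={precision:.2f} recall={recall:.2f} missed_issues={fn}")
```

A vendor that cannot support its answer with measurements like these is asking you to take the output on faith.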
AI will continue to transform how compliance work gets done, but it will not lower the bar. Prosecutors and regulators evaluate whether a program is adequately resourced, empowered and effective. Technology is but one piece of the puzzle.
In other words: Do not expect credit for automation alone. Every automated insight still needs a human who can stand behind it. In the end, compliance is not about what your system can detect. It is about what your organization can defend.
AI has permanently raised the bar for what counts as “defensible” compliance. The leaders of this new era will not be those who avoid AI, nor those who automate blindly. They will be the firms that pair powerful technology with the professional judgment required to explain, defend and stand behind every conclusion.








