We’re piloting continuous monitoring for journal entry testing and user access reviews using an ML-based anomaly detection layer on top of our ERP transaction logs. The system flags exceptions in near real-time and routes them to control owners with recommended actions. Internally, this has been a huge win—we’re catching issues weeks earlier than we did with quarterly sampling, and the finance team is spending way less time chasing screenshots in Q4.
The problem is our external auditors. They’re asking how the model decides what’s an anomaly, what data it’s trained on, and whether we can prove the logic is consistent period-over-period. We’ve been positioning this as a copilot—humans review every flagged transaction and document their decisions—but they want to see an audit trail that shows not just what the system flagged, but why it flagged it. They’re also concerned about model drift and whether the thresholds we set in Q1 are still valid in Q4.
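To make the ask concrete, here's roughly the per-flag evidence record we're trying to produce. The field names and the simple z-score stand-in are illustrative only, not our actual model:

```python
from statistics import mean, pstdev

# Hypothetical sketch: the per-flag evidence record we want the system to emit.
# The z-score logic stands in for the real model; field names are illustrative.
MODEL_VERSION = "je-anomaly-2024.1"   # pinned so each flag ties to one model build
THRESHOLD = 3.0                       # cutoff set (and documented) in Q1

def build_audit_record(entry, history):
    """Score one journal-entry amount against its historical baseline and
    record not just the flag, but the inputs and logic behind it."""
    mu, sigma = mean(history), pstdev(history)
    z = (entry["amount"] - mu) / sigma if sigma else 0.0
    return {
        "entry_id": entry["id"],
        "model_version": MODEL_VERSION,
        "threshold": THRESHOLD,
        "baseline_mean": round(mu, 2),
        "baseline_stdev": round(sigma, 2),
        "score": round(z, 2),
        "flagged": abs(z) >= THRESHOLD,
        # the "why" auditors want: which rule fired, in plain language
        "reason": f"amount deviates {abs(z):.1f} sigma from trailing baseline",
    }

record = build_audit_record(
    {"id": "JE-10482", "amount": 98_000},
    history=[9_500, 10_200, 11_000, 9_800, 10_500, 10_100],
)
print(record["flagged"], record["reason"])
```

The point is that every flag carries its model version, threshold, and baseline inputs, so period-over-period consistency is something you can show rather than assert.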
Has anyone successfully walked auditors through an AI-driven control environment? What kind of documentation or explainability framework made them comfortable? And how do you handle the governance piece—do you register every model change, maintain version-controlled decision logic, or something else?
From an internal audit perspective, I’d recommend involving your auditors earlier in the design phase rather than at year-end. We piloted our first AI control with both internal audit and our external firm in the room from day one. They helped shape the control design, the evidence requirements, and the monitoring approach. That upfront collaboration eliminated most of the surprises later and built trust that we weren’t trying to hide complexity behind a black box.
One thing that helped us was using simpler, more interpretable models for high-risk controls. We switched from a complex ensemble model to a decision-tree-based approach for segregation of duties violations. The trade-off in accuracy was minimal, but the explainability jump was huge. Auditors could literally follow the decision path in the tree and understand why a specific user access pattern was flagged.
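To illustrate what a followable decision path looks like in practice (the specific SoD rules below are invented for the example, not our real control logic), a minimal sketch:

```python
# Illustrative sketch: a shallow, rule-style tree whose decision path can be
# read back verbatim by an auditor. The SoD conditions here are made up.
def check_sod(user):
    path = []
    path.append("can_create_vendor? " + str(user["can_create_vendor"]))
    if not user["can_create_vendor"]:
        return False, path
    path.append("can_approve_payment? " + str(user["can_approve_payment"]))
    if not user["can_approve_payment"]:
        return False, path
    path.append("compensating_control? " + str(user["compensating_control"]))
    if user["compensating_control"]:
        return False, path
    return True, path  # both conflicting rights, no mitigation -> flag

flag, path = check_sod({
    "can_create_vendor": True,
    "can_approve_payment": True,
    "compensating_control": False,
})
print(flag)
for step in path:
    print(" ->", step)
```

A trained tree (e.g. scikit-learn's, which can be dumped to text) gives you the same property: each flag maps to one explicit branch an auditor can walk.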
We went through this exact conversation last year. What worked for us was creating a control narrative document specifically for the AI layer. It included: the business risk being addressed, the data sources feeding the model, the precision thresholds we configured, how we handle false positives, and our monthly model performance reviews. We also built a dashboard that shows auditors the volume of alerts, override rates, and remediation timelines. That transparency helped a lot—they could see we weren’t blindly trusting the system.
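For what it's worth, the metrics behind that kind of dashboard are simple to compute; here's a rough sketch (the alert schema is illustrative):

```python
from statistics import median

# Illustrative alert records; "overridden" means the reviewer dismissed
# the flag as a false positive, "confirmed" means it led to remediation.
alerts = [
    {"status": "confirmed", "days_to_close": 3},
    {"status": "overridden", "days_to_close": 1},
    {"status": "confirmed", "days_to_close": 7},
    {"status": "overridden", "days_to_close": 2},
    {"status": "confirmed", "days_to_close": 4},
]

volume = len(alerts)
override_rate = sum(a["status"] == "overridden" for a in alerts) / volume
median_remediation_days = median(a["days_to_close"] for a in alerts)
print(volume, override_rate, median_remediation_days)
```

A rising override rate is also a cheap early-warning signal for the drift question: if reviewers are dismissing more flags each month, the Q1 thresholds probably need revisiting.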
Have you formalized your AI inventory yet? We maintain a register of every AI model used in SOX controls, including purpose, data sources, training approach, limitations, and monitoring plan. Each model also has a risk score (low, medium, high) based on its impact and complexity. High-risk models get deeper reviews and more frequent audits. That structure gave our auditors confidence that we were managing this systematically, not ad hoc.
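A stripped-down sketch of what one register row can look like; the fields mirror what I described above, but the impact-times-complexity tiering rule is an illustrative assumption, not a standard:

```python
from dataclasses import dataclass

# Sketch of one AI-inventory register entry. The risk tiering below
# (impact x complexity) is a made-up example rule.
@dataclass
class ModelRegisterEntry:
    name: str
    purpose: str
    data_sources: list
    training_approach: str
    limitations: str
    monitoring_plan: str
    impact: int       # 1 (low) .. 3 (high)
    complexity: int   # 1 (low) .. 3 (high)

    @property
    def risk_tier(self):
        score = self.impact * self.complexity
        return "high" if score >= 6 else "medium" if score >= 3 else "low"

entry = ModelRegisterEntry(
    name="je-anomaly-detector",
    purpose="journal entry anomaly flagging",
    data_sources=["ERP GL transaction log"],
    training_approach="unsupervised, retrained quarterly",
    limitations="numeric amount patterns only; no free-text analysis",
    monitoring_plan="monthly drift review; threshold re-validation",
    impact=3,
    complexity=2,
)
print(entry.risk_tier)
```

However you score it, the useful part is that the tier drives a documented cadence: high-risk models get deeper reviews and more frequent audits, and you can show auditors the rule rather than argue each model case by case.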