Excellent questions that get to the technical heart of production ML systems. We evaluated LSTM approaches but found gradient boosting offered better explainability for our audit requirements while maintaining strong performance. Financial auditors need to understand why matches were made, and tree-based models provide clear feature importance rankings.
For database architecture, we use a hybrid approach: ML predictions and confidence scores are stored in custom Z-tables in SAP HANA for real-time integration with Cash Management transactions. The raw training data, model artifacts, and retraining pipeline live in our Azure data lake. This separation keeps SAP performance optimal while maintaining ML infrastructure flexibility.
Model monitoring runs as weekly automated checks that track confidence score distributions, match accuracy on validation sets, and feature drift metrics. When we add new bank accounts, a cold-start protocol applies the global baseline model until enough transaction history accumulates for regional model adaptation.
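To make the monitoring and cold-start ideas concrete, here is a minimal sketch in Python. It is illustrative only: the population stability index (PSI) is one common way to compare score distributions and is an assumption here, not necessarily the drift metric used in this system. The names `population_stability_index`, `select_model`, and the `MIN_HISTORY` value are hypothetical.

```python
import math
from typing import Sequence

def population_stability_index(expected: Sequence[float],
                               actual: Sequence[float],
                               bins: int = 10) -> float:
    """PSI between two samples of confidence scores, assumed to lie in [0, 1]."""
    def bucket_shares(scores: Sequence[float]) -> list:
        counts = [0] * bins
        for s in scores:
            # Clamp so that 1.0 lands in the last bin and stray negatives in the first.
            idx = max(0, min(int(s * bins), bins - 1))
            counts[idx] += 1
        total = len(scores)
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / total, 1e-6) for c in counts]

    e = bucket_shares(expected)
    a = bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical cutoff: how many transactions an account needs before
# the regional model is trusted over the global baseline.
MIN_HISTORY = 500

def select_model(account_txn_count: int, min_history: int = MIN_HISTORY) -> str:
    """Cold-start rule: use the global baseline until enough history exists."""
    return "regional" if account_txn_count >= min_history else "global_baseline"
```

A weekly job could compare last week's confidence scores against a reference window and raise an alert when the PSI exceeds a chosen limit. That limit, like the history cutoff, would need tuning against real data.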
For audit transparency, our Fiori app displays match reasoning with top contributing features for each transaction. Auditors can see that a match was made based on 98% name similarity, exact amount match, and 2-day date proximity. We also maintain complete audit trails of all manual overrides and model version history.
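As a rough illustration of how match reasoning like this can be assembled, the Python sketch below picks the top contributing features for one match and formats them as a readable line. It assumes per-feature contribution scores are already available from whatever attribution method the model uses. The function names, feature names, and numbers are hypothetical, not taken from the actual Fiori app.

```python
from typing import Dict, List, Tuple, Union

Number = Union[int, float]

def top_contributions(contributions: Dict[str, float],
                      k: int = 3) -> List[Tuple[str, float]]:
    """Return the k features with the largest absolute contribution."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:k]

def format_reasoning(features: Dict[str, Number],
                     contributions: Dict[str, float],
                     k: int = 3) -> str:
    """Build a one-line explanation from the top-k contributing features."""
    parts = [f"{name}={features[name]}" for name, _ in top_contributions(contributions, k)]
    return "Matched on: " + ", ".join(parts)

# Hypothetical feature values and contribution scores for a single match.
features = {"name_similarity": 0.98, "amount_delta": 0.0,
            "date_gap_days": 2, "reference_overlap": 0.1}
contributions = {"name_similarity": 0.41, "amount_delta": 0.35,
                 "date_gap_days": 0.12, "reference_overlap": 0.02}

print(format_reasoning(features, contributions))
```

Storing the formatted line alongside the model version used for the match would keep the explanation reproducible for later audits.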
The system handles approximately 12,000 transactions monthly across our 50+ accounts. Processing time averages 15 minutes for the full reconciliation cycle, with HANA’s columnar storage enabling fast filtering and aggregation. We partition data by fiscal period and bank country to optimize query performance.
Key implementation insight: start with high-confidence automation (>95%) and gradually lower the threshold as trust builds. Our initial rollout automated only matches above 97% confidence, which handled 60% of volume. After three months of validation, we lowered the threshold to 95% and captured an additional 25% of transactions. The remaining 15% goes to exception handling, where the model still saves significant time by pre-ranking likely matches for analyst review.
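A minimal sketch of this kind of threshold routing, in Python. The `Candidate` and `Decision` types and the `route` function are hypothetical names for illustration. Only the 97% and 95% thresholds come from the rollout described above.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Candidate:
    open_item_id: str   # hypothetical identifier of the open item to clear
    confidence: float   # model confidence in [0, 1]

@dataclass
class Decision:
    auto_matched: Optional[Candidate]  # set when the best candidate clears the threshold
    review_queue: List[Candidate]      # ranked candidates for analyst review otherwise

def route(candidates: List[Candidate], threshold: float = 0.97) -> Decision:
    """Auto-match the best candidate above the threshold, else queue all for review."""
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    if ranked and ranked[0].confidence > threshold:
        return Decision(auto_matched=ranked[0], review_queue=[])
    # Below the threshold nothing is posted automatically; the analyst
    # sees the candidates pre-ranked by confidence.
    return Decision(auto_matched=None, review_queue=ranked)

candidates = [Candidate("B", 0.80), Candidate("A", 0.96)]
strict = route(candidates, threshold=0.97)   # initial rollout setting
relaxed = route(candidates, threshold=0.95)  # after the validation period
```

With the stricter setting, the 0.96 candidate goes to the review queue at the top of the ranking. With the relaxed setting, it is matched automatically. Keeping the threshold as a parameter means it can be lowered without touching the routing logic.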
For teams considering similar implementations: invest heavily in data quality upfront, design for explainability from day one, and build feedback loops that continuously improve the model. The ROI extends beyond time savings: we’ve also improved cash visibility by cutting reconciliation lag from days to hours.