We implemented a Mendix RPA bot to automate invoice data entry into our accounts receivable system, and the results have been remarkable. Previously, our AP team manually entered data from 200-300 invoices daily, taking approximately 3-4 minutes per invoice and generating frequent data entry errors that required corrections and customer follow-ups.
The RPA solution monitors incoming email for invoice attachments, extracts data using OCR, validates the information against our customer database, and posts entries directly to our AR system. Within three months of deployment, we achieved a 95% reduction in errors, processing time dropped to under 30 seconds per invoice, and our AP team shifted its focus to exception handling and customer relationship management.
The implementation covered OCR data extraction for multiple invoice formats, email monitoring with intelligent filtering, and automated AR posting with validation checks. I’m sharing this to help others considering similar automation projects.
This is impressive! What OCR solution did you integrate with Mendix RPA? We’re looking at similar automation but struggling to find an OCR engine that handles varied invoice formats reliably. Did you need to train the OCR for different vendor invoice layouts?
How did you handle the email monitoring piece? Are you using Mendix’s email integration, or did the RPA bot directly monitor an inbox? We’ve had issues with email polling frequency causing delays in invoice processing.
Let me provide a comprehensive breakdown of the implementation covering all three key areas:
OCR Data Extraction Implementation:
We structured the OCR extraction in three tiers to balance accuracy and flexibility:
Template-Based Extraction (Tier 1 - 70% of volume):
- Created extraction templates for our top 20 vendors using Mendix RPA’s document automation capabilities
- Each template defines specific field coordinates for invoice number, date, amount, line items
- Templates include validation rules (e.g., invoice number format, date ranges, amount reasonableness)
- Accuracy: 98-99% for templated vendors
- Processing time: 15-20 seconds per invoice
AI-Based OCR (Tier 2 - 25% of volume):
- Integrated cloud OCR service (we used Azure Form Recognizer, but AWS Textract works similarly)
- Service trained on invoice document type to recognize common invoice patterns
- Extracts key-value pairs: invoice_number, invoice_date, due_date, total_amount, vendor_name, line_items
- Accuracy: 85-90% for variable format invoices
- Processing time: 30-45 seconds per invoice
Manual Review Queue (Tier 3 - 5% of volume):
- Invoices that fail validation or have low OCR confidence scores route to manual review
- RPA bot presents extracted data with original invoice image for verification
- Reviewers correct data, and corrections feed back to improve template accuracy
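A minimal Python sketch of the three-tier routing described above. It is only an illustration, not the actual Mendix RPA configuration: the vendor domains, the `ExtractionResult` type, and the function names are hypothetical, and the 0.85 cutoff is borrowed from the tuned threshold mentioned under Lessons Learned.

```python
from dataclasses import dataclass

# Hypothetical set of vendors that have a dedicated extraction template (Tier 1).
TEMPLATED_VENDORS = {"acme-supplies.example", "globex.example"}

# Assumed confidence cutoff (0.0 - 1.0); tune against your own review data.
CONFIDENCE_THRESHOLD = 0.85


@dataclass
class ExtractionResult:
    fields: dict       # extracted key-value pairs
    confidence: float  # overall confidence reported by the extraction step
    valid: bool        # outcome of the validation rules


def choose_extractor(vendor_domain: str) -> str:
    """Use the template path for known vendors, otherwise the AI OCR path."""
    return "template" if vendor_domain in TEMPLATED_VENDORS else "ai_ocr"


def needs_manual_review(result: ExtractionResult) -> bool:
    """Tier 3: anything that fails validation or scores below the cutoff."""
    return (not result.valid) or result.confidence < CONFIDENCE_THRESHOLD
```

Keeping the threshold in one named constant makes it easy to adjust as review data accumulates.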
Email Monitoring Architecture:
The email automation uses a robust monitoring and processing pipeline:
Inbox Setup:
- Dedicated email address: invoices@company.com
- Server-side rules filter incoming emails:
- Subject contains: “invoice”, “bill”, “statement”
- Has PDF attachment
- From: approved vendor domains
- Filtered emails move to “Processing” folder
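The rules above run on the mail server, but the same checks can be expressed in Python, which is handy for testing the filter logic or re-checking messages inside the bot. This is a sketch under assumptions: `APPROVED_DOMAINS` and `matches_invoice_rules` are hypothetical names, and the domains are placeholders.

```python
from email.message import EmailMessage
from email.utils import parseaddr

SUBJECT_KEYWORDS = ("invoice", "bill", "statement")
# Placeholder vendor domains; the real approved list is not given in the post.
APPROVED_DOMAINS = {"vendor-a.example", "vendor-b.example"}


def matches_invoice_rules(msg: EmailMessage) -> bool:
    """Return True if the message passes all three filter rules."""
    # Rule 1: subject contains one of the keywords (case-insensitive).
    subject = (msg["Subject"] or "").lower()
    if not any(keyword in subject for keyword in SUBJECT_KEYWORDS):
        return False

    # Rule 2: sender domain is on the approved vendor list.
    _, address = parseaddr(msg["From"] or "")
    domain = address.rpartition("@")[2].lower()
    if domain not in APPROVED_DOMAINS:
        return False

    # Rule 3: at least one PDF attachment.
    return any(
        part.get_content_type() == "application/pdf"
        or (part.get_filename() or "").lower().endswith(".pdf")
        for part in msg.iter_attachments()
    )
```

Note that a plain substring match on "bill" also matches words like "billing"; whether that is desirable depends on your inbox.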
RPA Bot Email Processing:
// Pseudocode for email monitoring workflow:
1. Connect to IMAP server every 5 minutes
2. Check "Processing" folder for new messages
3. For each email:
- Download PDF attachments
- Extract sender info and email metadata
- Move email to "InProgress" folder
4. Process attachments through OCR pipeline
5. On success: move email to "Completed" folder
6. On failure: move to "ManualReview" folder with error details
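Steps 1-3 of the pseudocode could look roughly like this with Python's standard `imaplib` and `email` modules. This is a sketch, not the Mendix RPA bot's actual implementation: the function names are hypothetical, and the move is emulated with COPY plus delete because not every server supports the IMAP MOVE extension.

```python
import email
import imaplib
from email import policy


def connect(host: str, user: str, password: str) -> imaplib.IMAP4_SSL:
    """Open an authenticated IMAP connection (credentials are placeholders)."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    return conn


def move_message(conn, uid: bytes, dest_folder: str) -> None:
    """Emulate a move: copy to the destination, then delete the original."""
    conn.uid("COPY", uid, dest_folder)
    conn.uid("STORE", uid, "+FLAGS", r"(\Deleted)")
    conn.expunge()


def fetch_pdf_attachments(conn, source="Processing", in_progress="InProgress"):
    """Return (sender, filename, pdf_bytes) for each PDF found in `source`.

    Every message that is read gets moved to `in_progress`.
    UIDs are used instead of sequence numbers because they stay stable
    while messages are expunged during the loop.
    """
    conn.select(source)
    _, data = conn.uid("SEARCH", None, "ALL")
    results = []
    for uid in data[0].split():
        _, fetched = conn.uid("FETCH", uid, "(RFC822)")
        msg = email.message_from_bytes(fetched[0][1], policy=policy.default)
        for part in msg.iter_attachments():
            if part.get_content_type() == "application/pdf":
                results.append(
                    (str(msg["From"]), part.get_filename(), part.get_content())
                )
        move_message(conn, uid, in_progress)
    return results
```

A scheduler or a simple loop with `time.sleep(300)` around `fetch_pdf_attachments` would give the 5-minute polling interval; the OCR and final folder moves (steps 4-6) are left out here.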
Error Handling:
- Duplicate detection: Check invoice number against last 90 days of processed invoices
- Invalid sender: Alert AP team if email from unknown vendor
- Attachment issues: Flag emails with no PDF or corrupted files
- Processing failures: Retry up to 3 times with exponential backoff before manual review
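The retry rule in the last bullet can be sketched as below. Assumptions are labeled: "up to 3 times" is read here as three attempts in total, the 1-second base delay is invented for illustration, and `process_with_retry` is a hypothetical helper name.

```python
import time


def process_with_retry(process, invoice, max_attempts=3, base_delay=1.0,
                       sleep=time.sleep):
    """Call process(invoice), retrying on failure with exponential backoff.

    Returns ("completed", result) on success, or
    ("manual_review", last_error) once all attempts are used up.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return "completed", process(invoice)
        except Exception as exc:  # a real bot would catch narrower error types
            last_error = exc
            if attempt < max_attempts - 1:
                # Delay doubles each time: base, 2 * base, 4 * base, ...
                sleep(base_delay * 2 ** attempt)
    return "manual_review", last_error
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.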
Performance Optimization:
- Parallel processing: Bot handles up to 5 invoices simultaneously
- Peak load handling: During month-end, increase polling frequency to 2 minutes
- Monitoring dashboard: Real-time view of processing queue, success rate, and error types
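For readers building something similar outside Mendix RPA, the "up to 5 invoices simultaneously" limit maps naturally onto a bounded worker pool. This is a generic Python sketch; `process_batch` and `MAX_PARALLEL` are hypothetical names, and it says nothing about how the Mendix bot schedules work internally.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 5  # mirrors the concurrency limit quoted above


def process_batch(invoices, process_one):
    """Run process_one over invoices with at most MAX_PARALLEL in flight.

    Returns (invoice, result_or_exception) pairs in input order, so one
    failing invoice does not abort the rest of the batch.
    """
    def safe(invoice):
        try:
            return invoice, process_one(invoice)
        except Exception as exc:
            return invoice, exc

    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        return list(pool.map(safe, invoices))
```

Threads suit this workload because most of the time is spent waiting on OCR and API calls rather than on local computation.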
Automated AR Posting with Validation:
The AR posting process includes comprehensive validation before committing transactions:
Pre-Posting Validation:
// Pseudocode for validation checks:
1. Verify customer_id exists in AR master data
2. Validate invoice_amount is numeric and positive
3. Check invoice_date is not future-dated
4. Confirm GL_code is in approved list for invoice type
5. Verify invoice_number not already posted (duplicate check)
6. Validate line item amounts sum to invoice total
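The six checks above translate fairly directly into code. The following Python sketch assumes a hypothetical invoice dictionary (keys such as `invoice_type` and `line_items` are illustrative) and in-memory lookups for customers, GL codes, and posted invoice numbers; a real implementation would query the AR master data instead.

```python
from datetime import date
from decimal import Decimal, InvalidOperation


def to_decimal(value):
    """Parse a value as a finite Decimal, or return None if it is not numeric."""
    try:
        number = Decimal(str(value))
    except InvalidOperation:
        return None
    return number if number.is_finite() else None


def validate_invoice(inv, known_customers, approved_gl, posted_numbers, today=None):
    """Return a list of validation errors; an empty list means safe to post."""
    today = today or date.today()
    errors = []

    if inv.get("customer_id") not in known_customers:
        errors.append("customer_id not found in AR master data")

    total = to_decimal(inv.get("invoice_amount"))
    if total is None or total <= 0:
        errors.append("invoice_amount must be numeric and positive")

    invoice_date = inv.get("invoice_date")
    if not isinstance(invoice_date, date) or invoice_date > today:
        errors.append("invoice_date is missing or future-dated")

    # approved_gl maps invoice type -> set of allowed GL codes.
    if inv.get("gl_code") not in approved_gl.get(inv.get("invoice_type"), set()):
        errors.append("gl_code not approved for this invoice type")

    if inv.get("invoice_number") in posted_numbers:
        errors.append("invoice_number already posted")

    # Decimal arithmetic avoids float rounding when comparing totals.
    lines = [to_decimal(amount) for amount in inv.get("line_items", [])]
    if total is not None and (None in lines or sum(lines, Decimal("0")) != total):
        errors.append("line items do not sum to invoice total")

    return errors
```

Returning every error rather than stopping at the first gives the manual reviewer the full picture in one pass.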
Data Transformation:
- Map extracted vendor name to customer_id using lookup table
- Convert invoice date to AR system format
- Split line items into individual AR transaction lines
- Calculate due date based on payment terms (extracted or default)
- Assign GL codes based on invoice type and line item descriptions
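A few of these transformation steps, sketched in Python. Everything named here is an assumption for illustration: the lookup table contents, the 30-day default, and the "Net N" terms format are not stated in the post.

```python
import re
from datetime import date, timedelta

DEFAULT_TERMS_DAYS = 30  # assumed default payment terms
# Hypothetical lookup table: normalized vendor name -> customer_id.
VENDOR_TO_CUSTOMER = {"acme supplies": "C-1001"}


def normalize(name: str) -> str:
    """Collapse whitespace and lowercase so OCR spacing quirks still match."""
    return re.sub(r"\s+", " ", name).strip().lower()


def lookup_customer_id(vendor_name: str):
    """Map an extracted vendor name to a customer_id, or None if unknown."""
    return VENDOR_TO_CUSTOMER.get(normalize(vendor_name))


def due_date(invoice_date: date, terms) -> date:
    """Parse terms such as 'Net 45'; fall back to the default when absent."""
    match = re.search(r"net\s*(\d+)", terms or "", re.IGNORECASE)
    days = int(match.group(1)) if match else DEFAULT_TERMS_DAYS
    return invoice_date + timedelta(days=days)


def to_ar_lines(invoice_number: str, line_items: list) -> list:
    """Split extracted line items into one AR transaction line each."""
    return [
        {
            "invoice_number": invoice_number,
            "line_no": index,
            "description": item["description"],
            "amount": item["amount"],
        }
        for index, item in enumerate(line_items, start=1)
    ]
```

An unmatched vendor name returns `None`, which is a natural trigger for routing the invoice to manual review.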
Posting Process:
- Create AR transaction in Mendix database
- Call AR system API to post invoice (we use REST API)
- Capture transaction ID from AR system response
- Update Mendix transaction record with AR system reference
- Generate posting confirmation and attach to email thread
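The REST call in the posting step might look something like the following, using only the Python standard library. The endpoint path (`/invoices`), the bearer-token header, and the `transaction_id` response field are all hypothetical; substitute whatever your AR system's API actually defines.

```python
import json
import urllib.request


def build_post_request(base_url: str, invoice: dict, token: str):
    """Build a JSON POST request for the (hypothetical) invoice endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/invoices",
        data=json.dumps(invoice).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def post_invoice(base_url, invoice, token, opener=urllib.request.urlopen):
    """Post the invoice and return the AR system's transaction reference.

    Assumes the response body is JSON containing a "transaction_id" field.
    """
    request = build_post_request(base_url, invoice, token)
    with opener(request, timeout=30) as response:
        payload = json.load(response)
    return payload["transaction_id"]
```

The returned reference is what would be written back to the local transaction record for later reconciliation.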
Post-Posting Reconciliation:
- Daily batch job compares Mendix transaction log with AR system
- Flags any discrepancies for investigation
- Generates daily processing report: total invoices, success rate, error breakdown
- Sends summary email to AP manager with exception queue
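The core of the daily comparison is a set difference between the two transaction logs. A minimal Python sketch, assuming both sides can be exported as a mapping from invoice number to posted amount (the `reconcile` name and the input shape are hypothetical):

```python
def reconcile(mendix_log: dict, ar_records: dict) -> dict:
    """Compare two {invoice_number: amount} mappings and report discrepancies."""
    mendix_keys, ar_keys = set(mendix_log), set(ar_records)
    return {
        # Logged locally but never arrived in the AR system.
        "missing_in_ar": sorted(mendix_keys - ar_keys),
        # Present in the AR system with no local record.
        "missing_in_mendix": sorted(ar_keys - mendix_keys),
        # Present on both sides but with different amounts.
        "amount_mismatch": sorted(
            number for number in mendix_keys & ar_keys
            if mendix_log[number] != ar_records[number]
        ),
    }
```

Any non-empty list in the result would feed the exception queue in the daily summary.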
Results and Metrics:
After 6 months of operation:
- Volume: Processing 250-300 invoices daily (peak: 450 during month-end)
- Accuracy: 95% processed without manual intervention (about 5% still route to manual review)
- Speed: Average 28 seconds per invoice (vs. 3-4 minutes manual)
- Error Reduction: 95% fewer posting errors requiring correction
- Cost Savings: Equivalent of 2.5 FTE redirected to higher-value activities
- ROI: 18-month payback period including development, licensing, and maintenance
Lessons Learned:
- Start with template-based extraction for high-volume vendors before investing in AI OCR
- Build comprehensive validation rules - better to flag for review than post incorrect data
- Maintain manual review queue - some invoices will always require human judgment
- Include feedback loop where manual corrections improve future automation accuracy
- Monitor and tune OCR confidence thresholds - we started at 90% and adjusted to 85% after analyzing false positives
The key to our success was not trying to automate 100% of invoices immediately. We targeted 80% automation with high accuracy, which delivered significant value while maintaining quality. The remaining 20% in manual review ensures we catch edge cases and continuously improve the system.
For organizations considering similar automation, I recommend starting with a pilot of your top 5-10 vendors, proving the concept, then expanding gradually. The Mendix RPA platform made this iterative approach feasible with its visual bot development and easy integration with Mendix applications.
The RPA bot monitors a dedicated inbox over IMAP with a 5-minute polling interval. We set up email rules to filter invoices into a specific folder based on sender domain and subject line patterns (like “Invoice” or “Bill”). The bot processes any new emails in that folder, downloads PDF attachments, and moves processed emails to an archive folder. This keeps the active queue clean and provides an audit trail.
We used a combination approach. For our top 20 vendors (which represent about 70% of invoice volume), we created specific extraction templates in the RPA bot that target known field positions. For the remaining invoices, we integrated with an AI-based OCR service that uses machine learning to identify invoice fields. The templates handle structured invoices with near-perfect accuracy, while the AI OCR manages the variable formats with about 85-90% accuracy.
The most common manual entry errors were transposed digits in amounts, incorrect customer account matching, and wrong GL code assignments. Our validation layer checks extracted amounts against invoice totals, verifies customer IDs exist in our system before posting, and validates GL codes against a predefined list. If any validation fails, the bot flags the invoice for manual review rather than posting incorrect data.
The 95% error reduction is the most interesting part to me. What types of errors were most common with manual entry, and how does the automated posting validation catch them? We have similar error rates with manual entry and need to build a business case for automation.