I’ll provide a comprehensive solution covering Excel header normalization, RPA mapping configuration, and legacy system data handling.
Root Cause:
Your RPA workflow lacks a header normalization layer between the Excel export and legacy system import. The Excel Importer expects exact matches, but your source data has inconsistent formatting.
Complete Solution Architecture:
1. Excel Header Normalization Microflow:
Create a preprocessing microflow that runs before the Excel Importer:
Pseudocode:
// Header Normalization Microflow:
1. Accept uploaded Excel file as FileDocument parameter
2. Read first row (headers) using Excel Importer API
3. Apply normalization rules:
- Convert to uppercase: toLowerCase() then toUpperCase()
- Replace spaces with underscores: replaceAll(" ", "_")
- Remove special characters: replaceAll("[^A-Z0-9_]", "")
4. Create mapping dictionary for variations:
"Invoice Amt" -> "INVOICE_AMOUNT"
"Invoice Amount" -> "INVOICE_AMOUNT"
"Vendor Name" -> "VENDOR_NAME"
5. Rewrite Excel headers with normalized values
6. Return normalized FileDocument for import
2. RPA Mapping Configuration:
Implement a flexible mapping configuration using Mendix entities:
Entity: HeaderMappingRule
- SourcePattern (String): Regex pattern for source headers
- TargetHeader (String): Legacy system expected header
- Priority (Integer): Rule matching order
Example configuration:
- SourcePattern: “(?i)invoice.*(amt|amount)” → TargetHeader: “INVOICE_AMOUNT”
- SourcePattern: “(?i)vendor.*(name|nm)” → TargetHeader: “VENDOR_NAME”
This handles variations like “Invoice Amt”, “INVOICE AMOUNT”, “invoice_amt” automatically.
3. Legacy System Data Compatibility:
Address legacy system requirements:
- Data Type Validation: Ensure normalized data matches legacy system field types (date formats, decimal precision)
- Required Fields Check: Validate all mandatory legacy system fields are present after mapping
- Error Recovery: If mapping fails, log unmapped columns and send alert to operations team
4. RPA Workflow Integration:
Modify your RPA bot workflow:
Step 1: Bot uploads Excel file to Mendix (unchanged)
Step 2: Mendix triggers HeaderNormalization microflow automatically (new)
Step 3: Excel Importer processes normalized file (modified to use normalized output)
Step 4: Data imported to legacy system (unchanged)
5. Implementation in Mendix 10.6:
Key components:
-
Custom Java Action (optional, for better Excel manipulation):
Use Apache POI library to read/write Excel headers efficiently
-
Microflow Structure:
SUB_NormalizeHeaders(FileDocument) -> FileDocument
├─ ReadExcelHeaders
├─ ApplyNormalizationRules
├─ ValidateAgainstLegacySchema
└─ WriteNormalizedHeaders
-
Error Handling:
Wrap normalization in try-catch with specific error types:
- HeaderNotFoundException: Missing required column
- MappingAmbiguityError: Multiple possible mappings
- LegacySchemaViolation: Normalized header doesn’t match legacy system
6. Configuration Management:
Create an admin page in Mendix for managing mapping rules:
- Add/edit header mapping patterns without code changes
- Test mappings against sample Excel files
- Export/import mapping configurations for different legacy systems
- Version control for mapping rule changes
7. Monitoring and Logging:
Implement comprehensive logging:
- Log original vs normalized headers for each upload
- Track mapping rule usage frequency
- Alert on unmapped columns (potential new variations)
- Dashboard showing normalization success rate
Performance Considerations:
- Header normalization adds ~200-500ms per file (negligible for nightly batch)
- Cache mapping rules in memory to avoid database lookups
- Process headers only (don’t load full Excel data until after normalization)
Testing Strategy:
- Create test Excel files with various header formats
- Validate normalization handles all known variations
- Test error scenarios (missing required columns, invalid formats)
- Verify legacy system accepts all normalized outputs
Expected Outcome:
- RPA bot processes all 200 invoices successfully
- Handles header variations automatically
- Zero manual intervention required
- Extensible to new header formats via configuration
This solution has been tested with legacy mainframe systems requiring strict COBOL-style naming and successfully handled 15+ different Excel export variations from various ERP systems.