Our organization needs to extract customer account data from a legacy mainframe system for case processing in Pega. We’re debating between using RPA bots to automate screen scraping versus investing in developing direct API integration.
The mainframe team says building APIs would take 6-8 months due to security reviews and testing requirements. RPA could be deployed in 4-6 weeks. However, I’m concerned about the maintenance overhead of bots breaking when screen layouts change.
The data extraction needs to happen multiple times per case (initial lookup, validation checks, final updates). We process about 500 cases daily. What factors should we consider when choosing between RPA automation and API development for legacy system integration?
Yes, bot maintenance was a challenge. We had dedicated RPA support staff who monitored bot health daily and fixed breaks quickly. For error recovery, we implemented a three-tier strategy: automatic retry for transient errors, alert-based manual intervention for bot failures, and a manual terminal access process as the ultimate fallback. The error handling added complexity but was necessary. We also built comprehensive logging so we could audit all data extractions and identify patterns in failures. About 15% of our support time went to RPA maintenance, versus maybe 2% for our API-based integrations now.
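For anyone who wants to picture the three-tier strategy, here is a minimal Python sketch. It is an illustration only, not the code we ran: the exception classes, the `extract` and `alert` callables, and the retry settings are all hypothetical names made up for this example.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("extraction")


class TransientError(Exception):
    """Hypothetical: a temporary condition such as a screen timeout."""


class BotFailure(Exception):
    """Hypothetical: the bot itself is broken, e.g. a screen layout changed."""


def extract_with_recovery(
    extract: Callable[[str], dict],
    account_id: str,
    alert: Callable[[str], None],
    max_retries: int = 3,
    delay_seconds: float = 2.0,
) -> Optional[dict]:
    """Return the extracted record, or None if the case needs a manual lookup."""
    for attempt in range(1, max_retries + 1):
        try:
            record = extract(account_id)
            # Log every successful extraction so it can be audited later.
            logger.info("extracted %s on attempt %d", account_id, attempt)
            return record
        except TransientError as exc:
            # Tier 1: retry transient errors automatically.
            logger.warning("transient error for %s (attempt %d): %s",
                           account_id, attempt, exc)
            if attempt < max_retries:
                time.sleep(delay_seconds)
        except BotFailure as exc:
            # Tier 2: a broken bot will not recover by retrying, so alert support.
            logger.error("bot failure for %s: %s", account_id, exc)
            alert(f"Bot failure for {account_id}: {exc}")
            return None
    # Retries exhausted: alert as well, then fall through to tier 3.
    alert(f"Retries exhausted for {account_id}")
    # Tier 3: returning None tells the caller to route the case to manual
    # terminal access.
    return None
```

The caller treats `None` as the signal to queue the case for a person with terminal access, and the log lines are what make failure patterns visible afterwards.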
That hybrid approach is interesting. Did you face issues with bot maintenance during the transition period? Also, how did you handle error recovery when the bots failed - did you have manual fallback processes?
This decision requires balancing three critical factors based on your specific context.
API Availability Assessment:
Before choosing RPA, thoroughly investigate existing integration options. Many mainframes have integration capabilities that aren’t widely known outside the mainframe team. Check for:
CICS Transaction Gateway or CICS Web Services
IBM MQ or other message queue infrastructure
File transfer protocols (FTP/SFTP) for batch data exchange
Database replication tools that could expose mainframe data
Existing APIs used by other systems that you could leverage
Contact your mainframe team and ask specifically about programmatic access methods, not just “APIs.” The terminology matters - they might have transaction interfaces they don’t call APIs. If any of these exist, they’re almost always better than RPA for reliability and performance.
Bot Maintenance Reality:
RPA maintenance overhead is real and often underestimated. Based on implementations I’ve seen:
Screen-based bots break 2-4 times per year on average due to UI changes
Each break causes 2-8 hours of downtime while fixes are developed and tested
You need dedicated RPA support staff, or your integration team will spend 10-20% of its time on bot maintenance
Bot performance degrades over time as screen response times vary
Debugging bot failures is harder than API integration issues because you’re dealing with visual elements and timing
For 500 cases daily with multiple lookups, a bot failure impacts significant business volume. Calculate the cost of downtime and maintenance resources before committing to RPA.
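As a rough way to size the downtime exposure, the figures above can be plugged into a quick calculation. This is a back-of-the-envelope sketch: the three extractions per case (initial lookup, validation, final update) and the 8-hour processing day with evenly spread volume are my assumptions, so substitute your own numbers.

```python
CASES_PER_DAY = 500
EXTRACTIONS_PER_CASE = 3    # assumption: initial lookup, validation, final update
WORKING_HOURS_PER_DAY = 8   # assumption: cases arrive evenly over an 8-hour day


def cases_affected_per_year(breaks_per_year: int, hours_per_break: int) -> float:
    """Cases that arrive while the bot is down, summed over a year."""
    cases_per_hour = CASES_PER_DAY / WORKING_HOURS_PER_DAY
    return breaks_per_year * hours_per_break * cases_per_hour


# Best case: 2 breaks of 2 hours each. Worst case: 4 breaks of 8 hours each.
low = cases_affected_per_year(2, 2)
high = cases_affected_per_year(4, 8)

print(f"Cases affected per year: {low:.0f} to {high:.0f}")
print(f"Extractions affected per year: "
      f"{low * EXTRACTIONS_PER_CASE:.0f} to {high * EXTRACTIONS_PER_CASE:.0f}")
```

Under those assumptions that is roughly 250 to 2,000 cases per year delayed or pushed to manual handling. Multiply by your per-case cost of delay, then add the maintenance staffing, to compare against the API build cost.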
Error Recovery Considerations:
This is where API integration shines. APIs provide:
Immediate error responses with specific error codes
Retry logic that’s straightforward to implement
Transaction rollback capabilities
More predictable performance, since calls don’t depend on screen rendering or timing
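To illustrate why specific error codes make retry logic simple, here is a small Python sketch. It assumes an HTTP-style API that returns a status code and a JSON body. The `call` argument, the `ApiError` class, and the set of retryable codes are hypothetical choices for this example, not anything your mainframe team has defined.

```python
import time
from typing import Callable, Tuple

# Assumption: these status codes indicate a temporary problem worth retrying.
RETRYABLE_STATUS = {429, 502, 503, 504}


class ApiError(Exception):
    """Hypothetical error type carrying the status code that caused it."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status


def call_with_retry(
    call: Callable[[], Tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
) -> dict:
    """Invoke `call` until it succeeds, retrying only on transient statuses."""
    for attempt in range(max_attempts):
        status, body = call()
        if status == 200:
            return body
        last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE_STATUS or last_attempt:
            # Permanent errors (e.g. 404) fail immediately with a clear reason.
            raise ApiError(status, body.get("error", "unknown error"))
        # Exponential backoff: base_delay, then 2x, 4x, ...
        time.sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The point is that the decision to retry or give up comes directly from the error code. A bot has to infer the same thing from what a screen looks like.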
RPA error recovery is more complex:
You need to handle screen timeout scenarios
You must detect when screens don’t load correctly
You need screenshot capture for debugging
You need a fallback to manual processes when bots fail
My recommendation: Use RPA as a bridge only if:
No existing integration methods are available
Business urgency justifies the technical debt
You commit to replacing it with proper APIs within 12-18 months
You budget for ongoing RPA maintenance resources
If the mainframe team can deliver even basic APIs (read-only data access) in 3-4 months rather than 6-8, that’s worth waiting for. The 6-8 month timeline might include unnecessary scope - negotiate for MVP API functionality first, then enhance iteratively. A simple read API for customer data is far less complex than full CRUD operations and could be delivered faster.
For your 500 cases daily, API integration will provide better performance, reliability, and lower total cost of ownership despite higher upfront investment.