Your issue is a classic post-patching TCP keepalive misconfiguration combined with connection pool tuning needs. Here’s the complete solution addressing all three focus areas:
OS TCP Keepalive Configuration: OS patches reset /etc/sysctl.conf to defaults. You need to reconfigure and persist these settings:
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 9
Apply with ‘sysctl -p’. These values send keepalive probes every 5 minutes (300s) which is well within your firewall’s 600s timeout. The 9 probes with 60s intervals provide 540 seconds of retry time before declaring connection dead.
Database Listener Diagnostics: Enable listener tracing to confirm where connections are failing. Check listener.ora for SQLNET.EXPIRE_TIME parameter - set it to 10 minutes to have Oracle send its own keepalive packets independent of OS settings. This provides defense-in-depth. Review listener logs with timestamps correlated to ORA-03113 errors to determine if termination is client-side, network-side, or server-side.
Connection Pool Tuning: Your UCP configuration needs three critical changes:
pds.setValidateConnectionOnBorrow(true);
pds.setConnectionValidationTimeout(5);
pds.setInactiveConnectionTimeout(300);
ValidateConnectionOnBorrow ensures dead connections are detected before being handed to application. Set validation timeout to 5 seconds for quick detection. InactiveConnectionTimeout of 300 seconds recycles idle connections before firewall timeout.
Additionally, configure SQLNET.RECV_TIMEOUT and SQLNET.SEND_TIMEOUT in sqlnet.ora on the database server to 300 seconds. This ensures the database server itself detects dead clients.
Root Cause: Your ORA-03113 errors occur because the OS patch reset TCP keepalive to 7200 seconds (2 hours), but your firewall drops connections at 600 seconds (10 minutes). Idle pooled connections are terminated by the firewall, but neither the OS nor the application detects this until attempting to use the connection, resulting in protocol adapter errors.
Verification: After implementing these changes, monitor for 48 hours. You should see no more ORA-03113/ORA-03114 errors. Use ‘netstat -anp | grep ESTABLISHED’ to verify connections remain stable. The combination of OS keepalive (300s), Oracle SQLNET.EXPIRE_TIME (600s), and UCP validation ensures connections are either kept alive or properly recycled before becoming stale.