I’ve dealt with this exact scenario for SAP BW to Snowflake pipelines. Here’s a comprehensive solution addressing all the factors:
Snowpipe Configuration Optimization:
First, upgrade your architecture to use an SNS-to-SQS fan-out pattern instead of direct S3-to-SQS notifications. Create an SNS topic in the same region as your S3 bucket, configure S3 event notifications to publish to it, and reference the topic ARN when you create the pipe — Snowflake then subscribes its own SQS queue to the topic automatically. This decouples producers from consumers, adds delivery retries, lets other subscribers share the same events, and handles high-volume bursts from SAP extracts more gracefully.
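A minimal sketch of the pipe definition, with placeholder names and a placeholder topic ARN (table, stage, and format are assumptions — substitute your own):

```sql
-- Hypothetical names/ARN; AUTO_INGEST with AWS_SNS_TOPIC makes Snowflake
-- subscribe its managed SQS queue to your SNS topic automatically.
CREATE OR REPLACE PIPE sap_transactional_pipe
  AUTO_INGEST = TRUE
  AWS_SNS_TOPIC = 'arn:aws:sns:eu-west-1:123456789012:sap-extract-events'
AS
  COPY INTO sap_transactions
  FROM @sap_stage/sap/bw/transactional/
  FILE_FORMAT = (TYPE = CSV, SKIP_HEADER = 1);
```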
For S3 event notifications, ensure you’re using these settings:
- Event type: s3:ObjectCreated:*
- Prefix filter: Match your SAP data path exactly (e.g., sap/bw/transactional/)
- Suffix filter: Add .csv or .parquet to avoid triggering on metadata files
Verify the whole permission chain: the S3 bucket must be allowed to publish to the SNS topic, the topic policy must allow Snowflake's SQS queue to subscribe, and the IAM role behind your storage integration needs s3:GetObject, s3:GetObjectVersion, and s3:ListBucket on the stage path.
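For the Snowflake side of that chain, a storage integration sketch (role ARN and bucket are placeholders):

```sql
-- Placeholder role ARN and bucket. After creating the integration, run
-- DESC INTEGRATION and add the generated STORAGE_AWS_IAM_USER_ARN /
-- external ID to the AWS role's trust policy.
CREATE OR REPLACE STORAGE INTEGRATION sap_s3_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-sap-loader'
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-sap-bucket/sap/bw/');

DESC INTEGRATION sap_s3_int;
```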
Warehouse Sizing Strategy:
A nuance that matters here: auto-ingest Snowpipe runs on Snowflake-managed serverless compute, so the pipe itself does not use your X-Small warehouse, and resizing it will not directly change ingestion latency. Warehouse size does matter for the downstream transformations and dashboard queries that consume the loaded data. Rough starting points based on typical SAP BW extract volumes:
- 100-500 files/hour: Small warehouse
- 500-2,000 files/hour: Medium warehouse
- Above 2,000 files/hour: Large, or split the load across multiple pipes
For the pipe itself, compare FILES_INSERTED in PIPE_USAGE_HISTORY against the pendingFileCount reported by SYSTEM$PIPE_STATUS. A backlog that keeps growing during business hours means files are arriving faster than they can be ingested — usually a file-sizing problem rather than a compute problem.
Set the downstream warehouse to auto-suspend after 60 seconds to control costs when file arrival is sporadic (common with scheduled SAP extracts).
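The auto-suspend setting is a one-liner (warehouse name is a placeholder):

```sql
-- Hypothetical warehouse name; suspend after 60 idle seconds,
-- wake automatically on the next query.
ALTER WAREHOUSE transform_wh SET
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;
```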
Monitoring and Troubleshooting:
Set up comprehensive monitoring with these queries:
```sql
-- Check pipe lag and pending files
SELECT SYSTEM$PIPE_STATUS('SAP_TRANSACTIONAL_PIPE');

-- Monitor load performance over the last 24 hours
SELECT
    TO_DATE(LAST_LOAD_TIME)   AS load_date,
    COUNT(*)                  AS files_loaded,
    AVG(ROW_COUNT)            AS avg_rows,
    AVG(FILE_SIZE) / 1048576  AS avg_size_mb  -- FILE_SIZE is in bytes
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
    TABLE_NAME => 'SAP_TRANSACTIONS',
    START_TIME => DATEADD(hour, -24, CURRENT_TIMESTAMP())))
GROUP BY 1
ORDER BY 1 DESC;
```
Create a monitoring task that runs every 5 minutes:
- Check if pending file count > 100 (indicates backlog)
- Verify last successful load was within last 10 minutes
- Alert if error count increases
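A sketch of such a task, assuming a pipe_health_log table and a monitor_wh warehouse (both hypothetical names) already exist; alerting would then query this log:

```sql
-- Hypothetical names throughout. Records pending-file count and the most
-- recent successful load time every 5 minutes.
CREATE OR REPLACE TASK monitor_sap_pipe
  WAREHOUSE = monitor_wh
  SCHEDULE = '5 MINUTE'
AS
  INSERT INTO pipe_health_log (checked_at, pending_files, last_load_time)
  SELECT
    CURRENT_TIMESTAMP(),
    PARSE_JSON(SYSTEM$PIPE_STATUS('SAP_TRANSACTIONAL_PIPE')):pendingFileCount::INT,
    (SELECT MAX(LAST_LOAD_TIME)
       FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
         TABLE_NAME => 'SAP_TRANSACTIONS',
         START_TIME => DATEADD(hour, -1, CURRENT_TIMESTAMP()))));

ALTER TASK monitor_sap_pipe RESUME;  -- tasks are created suspended
```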
SAP-Specific Optimizations:
For SAP BW integration, coordinate with your ABAP team to:
- Schedule extracts during off-peak hours when possible
- Split large extracts into chunks of roughly 100-250 MB compressed per file (Snowflake's general loading recommendation); avoid both multi-gigabyte files and swarms of tiny files, since Snowpipe charges a per-file overhead
- Use incremental delta loads rather than full refreshes
- Implement file-level checksums in naming convention for validation
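If the extract filename carries a checksum token — say sales_20240101_<md5>.csv, a hypothetical convention — you can capture it at load time via the file metadata column and validate afterwards:

```sql
-- Hypothetical table/stage names and filename convention. METADATA$FILENAME
-- exposes the staged file path inside a COPY transformation.
COPY INTO sap_transactions (txn_data, source_file, file_checksum)
FROM (
  SELECT
    $1,
    METADATA$FILENAME,
    SPLIT_PART(SPLIT_PART(METADATA$FILENAME, '_', 3), '.', 1)  -- checksum token
  FROM @sap_stage/sap/bw/transactional/
)
FILE_FORMAT = (TYPE = CSV);
```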
Expected Results:
After implementing these changes, you should see:
- Ingestion latency reduced to 2-5 minutes for most files
- Consistent processing even during peak SAP extract windows
- Better visibility into pipeline health through monitoring
- More predictable dashboard refresh cycles
The combination of proper SNS/SQS configuration, right-sized extract files, and comprehensive monitoring should resolve your delays. Start with the event-notification fix and file sizing — those are the levers that act on Snowpipe latency directly; warehouse changes mainly speed up the downstream queries that feed your dashboards.