We’re having critical issues uploading large PDF files (>50MB) to the document-control module via the file sync integration in MC 2022.1. The API calls time out after about 2 minutes, resulting in file loss and incomplete document records.
Timeout error:
HTTPConnectionPool: Read timed out (read timeout=120)
Document ID: DOC-2024-1156 created but file attachment missing
We’ve checked server-side timeout settings in the application server config but they seem adequate (300s). The files upload fine through the web UI, so it’s specifically an API integration issue. Could this be related to chunked file upload support, or do we need PDF compression before upload? Large technical drawings are common in our workflow.
Yes, MC 2022.1 has a default API payload size limit of 50MB for single-request uploads. For files larger than 50MB, you must use chunked file upload. The API supports multipart upload where you split the file into chunks (typically 10-20MB each) and upload sequentially. Each chunk gets a part number, and after all chunks are uploaded, you call a finalize endpoint to assemble them. Check the document-control API documentation section on ‘Multipart File Upload’.
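To make the splitting step concrete, here is a minimal Python sketch that works out part numbers and byte ranges for a file of a given size. The function name and the tuple layout are my own for illustration, not part of the MC API; it only shows the arithmetic behind "split the file into chunks, each with a part number".

```python
from typing import List, Tuple

MB = 1024 * 1024


def plan_chunks(file_size: int, chunk_size: int = 10 * MB) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) per chunk; parts are numbered from 1."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    parts = []
    offset = 0
    part_number = 1
    while offset < file_size:
        # The last chunk is usually shorter than chunk_size.
        length = min(chunk_size, file_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts
```

For example, a 25MB file with 10MB chunks yields three parts, the last one 5MB long.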
Chunks must be uploaded sequentially in order (part 1, then part 2, etc.) because MC validates each chunk against the previous one’s checksum. If a chunk fails, you can retry just that specific chunk - you don’t need to restart. The upload session remains active for 24 hours. After all chunks succeed, call the finalize endpoint which validates checksums and assembles the complete file server-side.
Good point. I increased the client timeout to 600s, but now we’re hitting a different issue - the uploads still fail but after 5 minutes instead of 2. The error message changed to ‘Request Entity Too Large’. I think we’re hitting some size limit. Does MC 2022.1 have a maximum file size for API uploads?
Here’s a comprehensive solution for handling large file uploads in MC 2022.1 document-control:
1. Server-Side Timeout Settings:
Verify and configure all timeout layers:
- Application Server: Already set to 300s - that’s good
- Load Balancer: Check if you have ALB/nginx in front. Default is often 60s. Increase to 600s:
  proxy_read_timeout 600s;
  proxy_send_timeout 600s;
- HTTP Client: Increase client-side timeout in your integration code to 600s minimum
- Database Connection: Verify DB connection timeout allows long-running file operations
2. Chunked File Upload:
Implement multipart upload for files >50MB:
Pseudocode for chunked upload (key implementation steps):
1. Initiate upload session: POST /api/v1/documents/{docId}/files/initiate
   Response includes uploadId and chunkSize (typically 10MB)
2. Split file into chunks based on returned chunkSize
3. For each chunk (sequential order required):
   - POST /api/v1/documents/{docId}/files/upload?uploadId=X&partNumber=N
   - Include chunk data and MD5 checksum in request
   - Retry on failure (up to 3 attempts per chunk)
4. After all chunks uploaded successfully:
   - POST /api/v1/documents/{docId}/files/finalize?uploadId=X
   - Server validates checksums and assembles complete file
// Upload session valid for 24 hours
Key implementation details:
- Chunk size: Use 10MB chunks for optimal balance of reliability and performance
- Checksums: Calculate MD5 for each chunk to detect corruption
- Retry logic: Implement exponential backoff for failed chunks
- Progress tracking: Store completed chunk numbers so you can resume after interruption
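The steps and details above could be sketched in Python roughly as follows. Treat this as a sketch under assumptions, not the documented client: the endpoint paths, uploadId and chunkSize come from the pseudocode, but `post` is a hypothetical transport callable you supply (it decides how the chunk bytes and MD5 are actually encoded in the request, which this thread does not specify), chunkSize is assumed to be in bytes, and failures are assumed to surface as OSError subclasses such as ConnectionError or TimeoutError.

```python
import hashlib
import time
from typing import BinaryIO, Callable, Optional, Set

# Hypothetical transport: post(path, data, md5_hex) -> parsed JSON response.
# How chunk bytes and checksum go on the wire is left to this callable.
Post = Callable[[str, Optional[bytes], Optional[str]], dict]


def upload_parts(post: Post, doc_id: str, upload_id: str, chunk_size: int,
                 fileobj: BinaryIO, completed: Set[int],
                 max_attempts: int = 3, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> None:
    """Upload chunks in order, skipping parts already recorded in `completed`."""
    base = f"/api/v1/documents/{doc_id}/files"
    size = fileobj.seek(0, 2)  # seeking to the end returns the file size
    total_parts = (size + chunk_size - 1) // chunk_size
    for part in range(1, total_parts + 1):
        if part in completed:
            continue  # resume: this part succeeded in an earlier run
        fileobj.seek((part - 1) * chunk_size)
        chunk = fileobj.read(chunk_size)
        md5_hex = hashlib.md5(chunk).hexdigest()
        path = f"{base}/upload?uploadId={upload_id}&partNumber={part}"
        for attempt in range(1, max_attempts + 1):
            try:
                post(path, chunk, md5_hex)
                completed.add(part)  # persist this set to resume later
                break
            except OSError:
                if attempt == max_attempts:
                    raise  # give up on this part; earlier parts stay recorded
                sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff


def upload_document_file(post: Post, doc_id: str, fileobj: BinaryIO,
                         **retry_opts) -> dict:
    """Initiate a session, upload all parts sequentially, then finalize."""
    base = f"/api/v1/documents/{doc_id}/files"
    session = post(f"{base}/initiate", None, None)
    upload_id = session["uploadId"]
    completed: Set[int] = set()
    upload_parts(post, doc_id, upload_id, session["chunkSize"],
                 fileobj, completed, **retry_opts)
    return post(f"{base}/finalize?uploadId={upload_id}", None, None)
```

Passing the transport in as a callable keeps the retry and ordering logic testable without a live MC instance. To resume after an interruption, keep the uploadId and the completed set somewhere durable and call upload_parts again with them.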
3. PDF Compression:
Pre-process large PDFs before upload:
- Use PDF optimization libraries (e.g., Ghostscript, Adobe PDF Library)
- Target compression settings:
  - Image DPI: Reduce to 150-200 DPI for technical drawings (sufficient for screen viewing)
  - Image compression: Use JPEG compression at 85% quality
  - Font subset: Embed only used characters
  - Remove metadata: Strip creation history and unused objects
- Expected results: 50-70% size reduction for typical engineering PDFs
- Quality validation: Always verify compressed PDF readability before upload
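As one possible way to script the compression step, here is a small Python wrapper around the Ghostscript command line. It assumes the `gs` executable is on the PATH; the function names are mine. The `/ebook` preset downsamples images to about 150 DPI, which is at the low end of the range suggested above, so check fine line work on drawings before relying on it.

```python
import subprocess
from typing import List


def ghostscript_command(src: str, dst: str, preset: str = "/ebook") -> List[str]:
    """Build a Ghostscript command that rewrites src as an optimized PDF at dst."""
    return [
        "gs",
        "-sDEVICE=pdfwrite",          # write a new PDF
        "-dCompatibilityLevel=1.4",
        f"-dPDFSETTINGS={preset}",    # /ebook: images downsampled to ~150 DPI
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        src,
    ]


def compress_pdf(src: str, dst: str, preset: str = "/ebook") -> None:
    """Run Ghostscript; raises CalledProcessError if gs exits with an error."""
    subprocess.run(ghostscript_command(src, dst, preset), check=True)
```

Writing to a separate output file keeps the original available for the quality check.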
Complete Workflow:
- Check file size: if <50MB, use standard single-request upload
- If >50MB: compress PDF first, then check size again
- If still >50MB: use chunked upload
- Implement progress monitoring and resume capability
- Validate document record and file attachment after upload completes
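The size checks in this workflow can be captured in a small decision helper. This is only a sketch: the function name and return values are illustrative, and since the list above leaves a file of exactly 50MB ambiguous, I am assuming it goes down the chunked path to stay on the safe side.

```python
from typing import Optional

MB = 1024 * 1024
SINGLE_REQUEST_LIMIT = 50 * MB  # limit quoted in this thread for MC 2022.1


def choose_upload_path(original_size: int,
                       compressed_size: Optional[int] = None) -> str:
    """Return 'single', 'compress', or 'chunked' following the workflow above."""
    if original_size < SINGLE_REQUEST_LIMIT:
        return "single"      # small enough for a standard single-request upload
    if compressed_size is None:
        return "compress"    # too large and not yet compressed: compress first
    if compressed_size < SINGLE_REQUEST_LIMIT:
        return "single"      # compression brought it under the limit
    return "chunked"         # still too large: use chunked upload
```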
Error Handling:
- 408 Request Timeout: Increase client timeout or implement chunking
- 413 Payload Too Large: File exceeds 50MB, must use chunked upload
- 500 Server Error during finalize: Check server logs for checksum mismatch
- Connection reset: Typically load balancer timeout - increase proxy timeout
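If you want the integration to log a hint alongside each failure, the list above maps naturally onto a lookup. A minimal sketch, with a hypothetical function name and with None standing in for "connection reset, no HTTP response received":

```python
from typing import Optional


def suggested_action(status: Optional[int]) -> str:
    """Map an HTTP status (None = connection reset) to the advice listed above."""
    if status is None:
        return "connection reset: likely load balancer timeout, increase proxy timeout"
    actions = {
        408: "request timeout: increase client timeout or implement chunking",
        413: "payload too large: file exceeds 50MB, use chunked upload",
        500: "server error: if during finalize, check server logs for checksum mismatch",
    }
    # Anything else is outside the cases covered in this thread.
    return actions.get(status, "unlisted status: inspect response body and server logs")
```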
Performance Tips:
- Upload during off-peak hours for large batches
- Monitor network bandwidth - sustained 50MB uploads require ~5Mbps minimum
- Consider parallel document uploads (different documents), but chunks within same document must be sequential
- Enable compression at HTTP level (gzip) for API metadata, but not for file binary data
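For the bandwidth tip, a quick back-of-the-envelope calculation shows where a figure like ~5Mbps comes from. This sketch assumes decimal units (1MB = 8 megabits) and ignores protocol overhead, so real transfers will be somewhat slower.

```python
def transfer_seconds(size_mb: float, link_mbps: float) -> float:
    """Ideal transfer time: megabytes * 8 bits per byte / megabits per second."""
    return size_mb * 8 / link_mbps


def min_mbps_for_timeout(size_mb: float, timeout_s: float) -> float:
    """Slowest sustained link speed that still finishes before the timeout."""
    return size_mb * 8 / timeout_s
```

A 50MB file at 5Mbps takes about 80 seconds in the ideal case, which fits inside a 120s read timeout with some headroom; the bare minimum to finish within 120s works out to roughly 3.3Mbps.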
If you implement chunked uploads and still experience issues, verify your MC 2022.1 instance has the latest patches. Version 2022.1.3 fixed several chunked upload bugs related to checksum validation and session timeout.
Also consider PDF compression before upload. Large technical drawings often have uncompressed images embedded. Use PDF optimization tools to compress images and remove unnecessary metadata. We routinely see 50-70% size reduction on engineering PDFs without quality loss. This reduces upload time and storage costs even if you implement chunked uploads.
I found the multipart upload documentation, but it’s not clear if the chunks need to be uploaded in a specific sequence or if they can be parallel. Also, what happens if one chunk fails midway? Do we need to restart from the beginning or can we retry just the failed chunk?
The 120-second timeout you’re seeing is likely your HTTP client timeout, not the server timeout: the "read timeout=120" in the error is reported by the client side. Check your integration code for where that value is set, either explicitly or as a default in a wrapper around the HTTP library. You need to increase the client-side timeout setting to match or exceed the server’s 300s timeout. Also verify your load balancer timeout if you have one in front of MC.
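As an illustration of raising the client-side timeout, here is a minimal sketch using only Python’s standard library. The host name is a placeholder and the function name is mine. If your integration uses the requests library (the error format in the question looks like it comes from requests/urllib3), the equivalent is the timeout argument, for example timeout=(10, 300) for a 10s connect and 300s read timeout.

```python
import http.client


def make_connection(host: str, timeout_s: float = 300.0) -> http.client.HTTPSConnection:
    """Create an HTTPS connection whose socket operations time out after timeout_s.

    Nothing is sent yet; the connection is opened on the first request.
    """
    return http.client.HTTPSConnection(host, timeout=timeout_s)


# Placeholder host: replace with your MC server.
conn = make_connection("mc.example.com")
```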