Let me walk through our complete implementation covering all the key components:
OCR and VLM Integration:
We use a two-stage approach:
// Stage 1: PDF to image conversion
const images = await pdfToImages('contract.pdf', { dpi: 300 });

// Stage 2: VLM extraction with a structured prompt
const prompt = `Extract supplier terms as JSON:
{leadTime, moq, pricing, paymentTerms}`;
const terms = await vlmExtract(images, prompt);
The VLM handles complex layouts, multi-column text, and table extraction far better than pure OCR: we achieve 94% first-pass accuracy versus 67% with Tesseract OCR.
JSON Schema Mapping:
Our standardized schema normalizes supplier variations:
{
  "supplierId": "SUP-12345",
  "leadTimeDays": 21,
  "moqUnits": 500,
  "pricingTiers": [
    {"quantity": 500, "unitPrice": 12.50},
    {"quantity": 1000, "unitPrice": 11.75}
  ],
  "paymentTerms": "NET30"
}
The VLM is prompted to map variations (“delivery time”, “turnaround”, “lead time”) to the standard “leadTimeDays” field. We provide 10-15 example mappings in the prompt to guide the model.
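As a safety net behind the prompt-driven mapping, the same synonym idea can run as a deterministic post-processing step. This is a minimal sketch with a hypothetical synonym table and helper name; the actual mapping lives in the VLM prompt:

```javascript
// Hypothetical fallback normalizer: maps common contract phrasings to the
// standard schema field names when the VLM returns a raw label instead.
const FIELD_SYNONYMS = {
  leadTimeDays: ['lead time', 'delivery time', 'turnaround'],
  moqUnits: ['moq', 'minimum order quantity', 'min. order'],
  paymentTerms: ['payment terms', 'terms of payment', 'net terms'],
};

function normalizeFieldName(rawLabel) {
  const needle = rawLabel.trim().toLowerCase();
  for (const [field, synonyms] of Object.entries(FIELD_SYNONYMS)) {
    if (synonyms.some((s) => needle.includes(s))) return field;
  }
  return null; // unknown label -> route to manual review
}
```

Unrecognized labels return null rather than guessing, so ambiguous contracts fall through to the review queue instead of silently landing in the wrong field.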
Data Validation Rules:
Multi-layer validation catches errors before API submission:
- Type validation: leadTimeDays must be integer, prices must be decimal
- Range validation: leadTime 1-180 days, MOQ > 0, prices > 0
- Business logic: pricing tiers must be ascending by quantity
- Referential integrity: supplierId must exist in master data
The validation failure rate is about 6%, mostly due to ambiguous contract language. Failed validations route to the manual review queue.
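The rules above can be sketched as a single validator over the schema shown earlier. Function and parameter names here are illustrative, not our production code:

```javascript
// Sketch of the multi-layer validation for the supplier-terms schema.
// Returns an array of error strings; empty means the record can be submitted.
function validateTerms(terms, knownSupplierIds) {
  const errors = [];

  // Type and range validation
  if (!Number.isInteger(terms.leadTimeDays) || terms.leadTimeDays < 1 || terms.leadTimeDays > 180) {
    errors.push('leadTimeDays must be an integer between 1 and 180');
  }
  if (!Number.isInteger(terms.moqUnits) || terms.moqUnits <= 0) {
    errors.push('moqUnits must be a positive integer');
  }

  // Business logic: positive prices, tiers strictly ascending by quantity
  const tiers = terms.pricingTiers ?? [];
  for (let i = 0; i < tiers.length; i++) {
    if (!(tiers[i].unitPrice > 0)) errors.push(`tier ${i}: unitPrice must be > 0`);
    if (i > 0 && tiers[i].quantity <= tiers[i - 1].quantity) {
      errors.push(`tier ${i}: quantities must be strictly ascending`);
    }
  }

  // Referential integrity against master data
  if (!knownSupplierIds.has(terms.supplierId)) {
    errors.push(`unknown supplierId ${terms.supplierId}`);
  }

  return errors;
}
```

Collecting all errors in one pass (rather than failing fast) lets the review queue highlight every issue in a contract at once.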
API Integration:
We use Infor’s standard supply planning APIs with custom error handling:
POST /api/v1/suppliers/{id}/terms
// Payload: validated JSON schema
// Response: confirmation or detailed error
The integration includes retry logic for transient failures and detailed logging for audit trails. We batch updates during off-peak hours to minimize impact on planning operations.
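A minimal sketch of the terms-update call: the endpoint path follows the one above, but the injected transport function (`httpPost` resolving to a `{ status, body }` shape) is an assumption for illustration, which also makes the helper testable without a live ERP:

```javascript
// Hypothetical submission helper; the transport is injected so it can be
// swapped for Infor's client, a logging wrapper, or a test stub.
async function submitTerms(httpPost, supplierId, payload) {
  const path = `/api/v1/suppliers/${supplierId}/terms`;
  const res = await httpPost(path, payload);
  if (res.status >= 200 && res.status < 300) {
    return res.body; // confirmation from the planning API
  }
  // Detailed error for the audit log and the retry layer above this call
  throw new Error(`POST ${path} failed with ${res.status}: ${JSON.stringify(res.body)}`);
}
```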
Error Handling:
Comprehensive error handling at each stage:
- PDF corruption: Alert procurement team for manual processing
- VLM extraction confidence < 85%: Route to manual review
- Validation failure: Queue for correction with highlighted issues
- API failure: Retry with exponential backoff, alert if persistent
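The retry policy in the last bullet can be sketched as a small wrapper; the attempt count and base delay here are illustrative defaults, not our production settings:

```javascript
// Retry with exponential backoff for transient API failures.
async function withRetry(fn, { attempts = 5, baseDelayMs = 500 } = {}) {
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === attempts - 1) throw err; // persistent failure -> alert upstream
      const delayMs = baseDelayMs * 2 ** attempt; // 500, 1000, 2000, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

The final rethrow is what triggers the "alert if persistent" path: once retries are exhausted, the error propagates to the caller's alerting logic instead of being swallowed.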
We maintain a manual review queue that typically holds 6-8% of contracts requiring human intervention. Reviewers see the extracted data side by side with the original PDF and can correct any errors before submission.
Results:
After 6 months in production:
- Processing time: 2-3 hours → 5 minutes per contract
- Accuracy: 94% fully automated, 6% requiring minor corrections
- ROI: System paid for itself in 4 months through time savings
- Dashboard data freshness: Updated within hours vs weeks
- Planning accuracy improved: better lead-time data drove a 12% reduction in stockouts
The key success factors were: (1) using VLM instead of pure OCR for better accuracy, (2) comprehensive validation to catch errors early, (3) keeping humans in the loop for edge cases, and (4) robust error handling throughout the pipeline.