We’re deploying a large-scale IoT infrastructure and need to onboard approximately 15,000 devices into our ThingWorx 9.7 registry. The Bulk Importer works fine for small batches (100-200 devices), but when we scale up to 2,000+ devices per batch, the import process slows to a crawl and eventually times out.
Our current approach uses the REST API to create device entities in batches. Initial batches complete in 5-8 minutes, but subsequent batches take 20+ minutes and often fail with timeout errors. We’ve noticed memory usage climbing steadily during imports, and the application server becomes unresponsive.
We suspect issues with thread pool configuration and memory allocation for import jobs, but we’re not sure where to start tuning. Has anyone successfully optimized bulk device imports at this scale? What REST API batch size limits should we be aware of?
The REST API has implicit rate limiting that kicks in with large payloads. I’d recommend breaking your 2,000-device batches down to at most 500 devices per API call. This reduces the transaction size and eases memory pressure. Also, add a pause between batches; even 30 seconds gives the garbage collector time to catch up. For thread pools, I’ve found that setting the bulk operation workers to 20-30 threads works well, but the right number depends on your server specs. What’s your current heap allocation?
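To make the batching and pause concrete, here is a minimal Python sketch of a sequential import loop. It is not ThingWorx-specific: `send_batch` is a hypothetical callable you would write yourself to wrap whatever REST call you already use to create devices, and `import_in_batches` and `chunked` are names I made up for illustration. The 500 and 30-second defaults are just the numbers suggested above, not platform limits.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def import_in_batches(
    device_names: Sequence[str],
    send_batch: Callable[[List[str]], None],  # hypothetical: wraps your own REST call
    batch_size: int = 500,
    pause_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> int:
    """Send devices sequentially in capped batches, pausing between batches.

    Returns the number of batches sent.
    """
    batches_sent = 0
    for batch in chunked(device_names, batch_size):
        if batches_sent:
            # Pause only between batches, not before the first one.
            sleep(pause_seconds)
        send_batch(batch)
        batches_sent += 1
    return batches_sent
```

For the 15,000 devices in the original question, 500 per call works out to 30 calls and 29 pauses, so roughly 14.5 minutes of idle time at 30 seconds each, on top of the import time itself.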
We have an 8 GB heap allocated on a 16-core server. The 500-device batch size recommendation is helpful; that’s much smaller than what we’ve been attempting. I’ll adjust the thread pool settings and test with reduced batch sizes. Should we also consider parallel import streams, or would that make the memory issues worse?
I’ve dealt with similar scaling issues. First thing to check is your platform-settings.json configuration for the import subsystem. The default thread pool for bulk operations is usually set too low for large-scale imports. You’ll want to increase the worker threads dedicated to device creation. Also, are you monitoring heap usage during imports? Memory allocation is critical here because each device entity instantiation consumes heap space before persistence.
Parallel streams can work but need careful tuning. With 8GB heap, I’d stay sequential for now or limit parallelism to 2-3 concurrent streams max. More important is optimizing the device template itself - remove unnecessary properties or subscriptions that execute on creation. Each device instantiation triggers property bindings and event handlers, which compounds at scale.
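If you do try parallel streams, one way to hard-cap the concurrency on the client side is a small thread pool. This is only a sketch under the same assumptions as any client-side script: `send_batch` is a hypothetical function wrapping your own REST call, `import_with_limited_streams` is an illustrative name, and the default of 2 streams is the conservative end of the 2-3 range mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def import_with_limited_streams(
    batches: Sequence[List[str]],
    send_batch: Callable[[List[str]], None],  # hypothetical: wraps your own REST call
    max_streams: int = 2,
) -> List[int]:
    """Run batches with at most `max_streams` in flight at once.

    Returns the indices of batches whose send raised an exception,
    so they can be retried later.
    """
    failed: List[int] = []
    with ThreadPoolExecutor(max_workers=max_streams) as pool:
        futures = [pool.submit(send_batch, batch) for batch in batches]
        for index, future in enumerate(futures):
            # exception() blocks until that batch has finished.
            if future.exception() is not None:
                failed.append(index)
    return failed
```

Setting `max_streams=1` gives you the sequential behaviour back, which makes it easy to compare heap usage between the two modes before committing to parallel imports.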