Work order status updates delayed 15-20 minutes in reporting dashboards

Our production supervisors are complaining about significant delays in work order status visibility on their monitoring dashboards. When operators complete a work order on the shop floor terminals, the status change is recorded immediately in the MES database (we can verify this by querying directly), but the reporting dashboards don’t reflect the change for 15-20 minutes.

This delay is causing confusion because supervisors are dispatching new orders based on outdated information, leading to resource allocation issues. We’re running AVEVA MES 2022.1 with the standard cache configuration as far as I know.


// Current behavior observed:
Shop Floor Terminal: WO-12345 marked COMPLETE at 09:15:32
Database Query: Status=COMPLETE, updated_at=09:15:33
Dashboard Display: Still shows IN_PROGRESS until 09:32:18

I suspect this is related to cache refresh intervals or how the dashboard subscribes to data updates. Has anyone optimized the cache invalidation strategy to get near real-time visibility?

One thing to watch out for: if you have multiple dashboard instances or a load-balanced reporting environment, you need to ensure cache invalidation messages are broadcast to all instances. We initially configured event-driven invalidation but only the primary dashboard server was receiving the messages. Secondary servers still had 15-minute delays until we fixed the topic subscription configuration.

Thanks! Where exactly would I find the CacheManager configuration? Is this in the application server settings or the reporting module config? Also, what’s the performance impact of switching to event-driven invalidation - will it increase load on the database or message broker significantly?

CacheManager config is in ApplicationServer/config/cache-config.xml. Look for the <cache-region name="WorkOrderStatus"> section. Event-driven invalidation does add some overhead to the message queue, but it’s minimal compared to the benefit. The real question is whether your message broker can handle the additional throughput - if you’re processing hundreds of work order updates per minute, you might see some latency there too.
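If you want to verify what your current region is set to before changing anything, a quick sketch that parses the file described above (path and element names per this thread; adjust to your install):

```python
import xml.etree.ElementTree as ET

def work_order_cache_settings(path="ApplicationServer/config/cache-config.xml"):
    """Return the expiration policy and refresh interval for the
    WorkOrderStatus cache region, or None if the region is missing."""
    root = ET.parse(path).getroot()
    for region in root.iter("cache-region"):
        if region.get("name") == "WorkOrderStatus":
            policy = region.find("expiration-policy")
            interval = region.find("refresh-interval")
            return {
                "policy": policy.get("type") if policy is not None else None,
                "refresh_interval_s": int(interval.text) if interval is not None else None,
            }
    return None
```

Handy for confirming whether you're still on time-based expiration before and after the change.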

I’ll walk you through the complete solution for achieving near real-time work order status visibility. This involves configuring multiple components to work together efficiently.

1. Cache Invalidation Strategy and Timing:

First, locate and modify ApplicationServer/config/cache-config.xml:

<cache-region name="WorkOrderStatus">
  <expiration-policy type="event-driven"/>
  <refresh-interval>300</refresh-interval> <!-- Fallback: 5 min -->
  <invalidation-source>message-queue</invalidation-source>
</cache-region>

The key change is switching from time-based to event-driven invalidation while keeping a fallback refresh interval for safety.

2. Event-Driven vs Scheduled Refresh Mechanisms:

Configure the event publisher in WorkOrderService/config/event-config.xml:

<event-publisher>
  <event-type>WorkOrderStatusChange</event-type>
  <target-topic>jms/cache/invalidation</target-topic>
  <publish-mode>immediate</publish-mode>
  <batch-size>1</batch-size>
</event-publisher>

This ensures status change events are published immediately rather than batched. Batching can introduce additional 30-60 second delays.
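The effect of batch-size on latency is easy to see with a toy publisher (a sketch, not the actual MES event pipeline) that only flushes when the buffer fills:

```python
class EventPublisher:
    """Toy publisher: events are sent only when the buffer reaches batch_size."""

    def __init__(self, batch_size=1):
        self.batch_size = batch_size
        self.buffer = []
        self.published = []           # (event, publish_time, delay) tuples

    def publish(self, event, now):
        self.buffer.append((event, now))
        if len(self.buffer) >= self.batch_size:
            for ev, enqueued_at in self.buffer:
                self.published.append((ev, now, now - enqueued_at))
            self.buffer.clear()
```

With batch_size=1 every event ships with zero queueing delay; with a larger batch, the first event in a quiet period sits in the buffer until enough others arrive - which is exactly where those extra 30-60 seconds come from on a slow shift.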

3. Message Queue Throughput and Latency:

Your message broker configuration is critical. In MessageBroker/config/broker.xml, verify these settings:

<topic name="cache.invalidation">
  <max-messages>10000</max-messages>
  <message-ttl>60000</message-ttl>
  <delivery-mode>non-persistent</delivery-mode>
</topic>

Key points:

  • Use non-persistent delivery for cache invalidation messages - losing one just means the affected entry waits for the fallback refresh interval instead
  • Set appropriate max-messages based on your work order completion rate
  • Keep TTL short (60 seconds) since stale invalidation messages are useless

Monitor message queue metrics:


jms.topic.cache.invalidation.depth < 100 (healthy)
jms.topic.cache.invalidation.enqueue_rate ≈ work_order_update_rate
jms.topic.cache.invalidation.latency < 500ms

If queue depth grows consistently, you have a throughput problem. Increase consumer threads or optimize message processing.
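How you pull these metrics is broker-specific (JMX, REST metrics endpoint, etc.), but the health check itself is simple. A sketch applying the thresholds above:

```python
def queue_health(depth, enqueue_rate, update_rate, latency_ms):
    """Evaluate invalidation-topic metrics against the thresholds above.

    Returns a list of problem descriptions; an empty list means healthy.
    """
    problems = []
    if depth >= 100:
        problems.append(f"queue depth {depth} >= 100: consumers falling behind")
    # enqueue rate should roughly track the work order update rate
    if update_rate and abs(enqueue_rate - update_rate) / update_rate > 0.10:
        problems.append("enqueue rate diverges >10% from work order update rate")
    if latency_ms >= 500:
        problems.append(f"latency {latency_ms} ms >= 500 ms")
    return problems
```

The 10% divergence tolerance is my own assumption - tighten or loosen it to match how bursty your completions are. Wire this into whatever monitoring you already run against the broker.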

4. Dashboard Subscription Model Configuration:

This is where many implementations fail. Configure push-based updates in ReportingDashboard/config/subscription-config.xml:

<dashboard-subscription>
  <data-source>WorkOrderStatus</data-source>
  <update-mode>push</update-mode>
  <subscription-topic>jms/cache/invalidation</subscription-topic>
  <filter>event_type='WorkOrderStatusChange'</filter>
  <refresh-on-invalidation>true</refresh-on-invalidation>
</dashboard-subscription>

For load-balanced environments with multiple dashboard servers, ensure each instance subscribes to the topic:

<topic-subscriber>
  <client-id>dashboard-${server.instance.id}</client-id>
  <durable>false</durable>
  <shared-subscription>false</shared-subscription>
</topic-subscriber>

The per-instance client-id is the crucial part - it gives each dashboard server its own subscription, so every instance receives its own copy of each invalidation message. Note that shared-subscription should be false here: under JMS 2.0 semantics, a shared subscription load-balances messages across its consumers, so only one server would see each invalidation - exactly the failure mode where secondary servers keep their 15-minute delays.
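A toy pub/sub model (not JMS itself) shows why every instance must hold its own subscription - the earlier story about secondary servers staying stale is exactly the unsubscribed case:

```python
class Topic:
    """Toy topic: every registered subscriber gets a copy of each message."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        for cb in self.subscribers:
            cb(message)

class Dashboard:
    def __init__(self, name):
        self.name = name
        self.cache_valid = True

    def on_invalidation(self, message):
        self.cache_valid = False      # next render re-reads the database
```

An instance that never subscribed simply keeps serving its cached (stale) status until the fallback refresh fires - there is no error, which is what makes this misconfiguration easy to miss.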

Performance Impact Analysis:

Based on implementing this across multiple plants:

  • Message broker CPU increase: 5-8%
  • Message broker memory increase: 15-25% (depends on message volume)
  • Database load: Minimal increase (<2%) - fewer full cache refresh queries
  • Network bandwidth: Increase of ~50-100 KB/s per dashboard instance
  • Dashboard response time: Improved by 40-60% (less polling)

Implementation Steps:

  1. Baseline current performance:

    • Measure current message queue depth and latency
    • Document current cache refresh intervals
    • Record dashboard update latency
  2. Upgrade message broker capacity:

    • Increase heap size by 20-25%
    • Configure topic for cache invalidation messages
    • Test message throughput under peak load
  3. Configure event-driven cache invalidation:

    • Update cache-config.xml
    • Configure event publisher
    • Test with single cache region first
  4. Update dashboard subscriptions:

    • Configure push-based updates
    • Ensure all instances subscribe correctly
    • Test invalidation propagation
  5. Monitor and tune:

    • Watch message queue depth and latency
    • Monitor dashboard update times
    • Adjust consumer threads if needed

Validation:

After implementation, your timeline should look like:


09:15:32 - Operator completes WO-12345 on shop floor terminal
09:15:33 - Database updated
09:15:33 - Status change event published to message queue
09:15:34 - Cache invalidation message received by all dashboard instances
09:15:34 - Dashboard cache refreshes from database
09:15:35 - UI updates with new status (3 second total latency)
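To check your own numbers against this target, a trivial helper that measures the gap between shop-floor completion and dashboard display:

```python
from datetime import datetime

def end_to_end_latency(completed_at, displayed_at, fmt="%H:%M:%S"):
    """Seconds between shop-floor completion and dashboard display,
    given same-day timestamps as strings."""
    t0 = datetime.strptime(completed_at, fmt)
    t1 = datetime.strptime(displayed_at, fmt)
    return (t1 - t0).total_seconds()
```

Feed it the terminal timestamp and the time the dashboard actually repainted; anything over 5 seconds points at the troubleshooting list below.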

If you’re still seeing delays >5 seconds after implementation:

  • Check message queue consumer thread count
  • Verify network latency between components
  • Review database query performance for cache refresh
  • Confirm all dashboard instances are receiving messages

This configuration should reduce your 15-20 minute delay to under 5 seconds in most cases, giving supervisors near real-time visibility into work order status changes.

We implemented event-driven cache invalidation last year and it made a huge difference. Our dashboard latency dropped from 12-15 minutes to under 30 seconds. The key is configuring the message queue properly - make sure your JMS topic for cache invalidation events has sufficient capacity and that the dashboard subscription model is using push notifications rather than polling. We had to increase our message broker heap size by about 20% to handle the additional load, but it was worth it for the real-time visibility.