IoT policy denies device connection after certificate rotation, causing edge device offline status

We implemented quarterly certificate rotation for our IoT device fleet as part of security hardening. After rotating certificates for 200 edge devices, they all started receiving policy denial errors when attempting to connect to AWS IoT Core.

Error from device logs:


Connection denied: Not authorized
Policy evaluation failed for certificate
Certificate ARN: arn:aws:iot:us-east-1:123456:cert/new-cert-id

The certificate rotation process completed successfully according to the AWS IoT console: the new certificates show as ACTIVE and the old ones as INACTIVE. However, the IoT policy attachments still seem to reference the old certificate ARNs, and the Thing-to-certificate mapping may not have been updated during rotation. Has anyone encountered policy attachment issues post-rotation? We need the devices back online quickly.

Nina makes a great point about multiple attachments. Also check your IoT policy itself: if it has explicit certificate ARN references in Condition blocks, those will break after rotation. For rotation-friendly policies, use Thing policy variables such as ${iot:Connection.Thing.ThingName} instead of hardcoded certificate ARNs. Keep in mind that these variables only resolve when the connecting certificate is attached to the Thing and the MQTT client ID matches the Thing name.

We assumed the policy would automatically attach to new certificates since they’re associated with the same Thing names. So we need to manually attach policies to all 200 new certificates? Is there a batch operation for this or do we script it?

You’ll need to script it using AWS CLI or SDK. The policy attachment is to the certificate principal, not the Thing. When you rotate certificates, the new certificate gets a new ARN, so the policy attachment breaks. I’ve built automation for this using Python and boto3. Loop through your Things, get the new certificate ARN for each, then attach the policy. Takes about 10 minutes to process 200 devices with proper error handling and rate limiting.
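A minimal sketch of that kind of loop, not the poster's actual script. It assumes a boto3 IoT client is passed in as `iot`, that the new certificates are already attached to their Things, and that a fixed pause between calls is enough rate limiting for your account. The function names and the `pause` parameter are hypothetical.

```python
import time


def active_cert_arns(iot, thing_name):
    """Return ARNs of ACTIVE certificates attached to a Thing."""
    arns = []
    # Reads only the first page of principals, which is enough for a handful of certs.
    for principal in iot.list_thing_principals(thingName=thing_name)["principals"]:
        if ":cert/" not in principal:
            continue  # skip non-certificate principals
        cert_id = principal.split("/")[-1]
        desc = iot.describe_certificate(certificateId=cert_id)
        if desc["certificateDescription"]["status"] == "ACTIVE":
            arns.append(principal)
    return arns


def attach_policy_to_fleet(iot, thing_names, policy_name, pause=0.2):
    """Attach policy_name to every ACTIVE cert of each Thing; return failures."""
    failures = []
    for thing_name in thing_names:
        try:
            for arn in active_cert_arns(iot, thing_name):
                iot.attach_policy(policyName=policy_name, target=arn)
                time.sleep(pause)  # crude rate limiting between API calls
        except Exception as exc:  # record the error and move on to the next Thing
            failures.append((thing_name, str(exc)))
    return failures
```

With real credentials this would be called as `attach_policy_to_fleet(boto3.client("iot"), thing_names, "DevicePolicy")`, and the returned list shows which Things need a second look.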

Here’s the comprehensive solution addressing all three focus areas:

Certificate Rotation Process: Proper certificate rotation requires these sequential steps:

  1. Create new certificate and keep old one ACTIVE initially
  2. Attach new certificate to Thing while old cert remains attached
  3. Attach IoT policy to new certificate
  4. Update device with new certificate credentials
  5. Verify device connects successfully with new cert
  6. Only then detach old certificate and mark INACTIVE

The script for step 3 (policy attachment):

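# Attach the policy to each new certificate ARN (one ARN per line)
while IFS= read -r cert_arn || [ -n "$cert_arn" ]; do
  [ -n "$cert_arn" ] || continue  # skip blank lines
  aws iot attach-policy \
    --policy-name DevicePolicy \
    --target "$cert_arn"
done < new_cert_arns.txt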

IoT Policy Attachment: Your policy must use dynamic references, not static certificate ARNs. Update your policy to use Thing-based conditions:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "iot:Connect",
    "Resource": "arn:aws:iot:region:account:client/${iot:Connection.Thing.ThingName}"
  }]
}
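To check whether an existing policy still contains static certificate ARNs, a small scan of the policy JSON can help. This is a rough sketch: the helper name is hypothetical, and it simply flags any string value containing `:cert/`, wherever it appears in the document.

```python
import json


def find_hardcoded_cert_arns(policy_document):
    """Return every string value in the policy JSON that looks like a certificate ARN."""
    hits = []

    def walk(node):
        # Recurse through dicts and lists, collecting matching string values.
        if isinstance(node, dict):
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)
        elif isinstance(node, str) and ":cert/" in node:
            hits.append(node)

    walk(json.loads(policy_document))
    return hits
```

The input is the policy document as a JSON string, for example the `policyDocument` field returned by boto3's `get_policy`. An empty result means no certificate ARNs were found in the policy.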

To fix your current situation, attach the policy to all new certificates:

aws iot list-thing-principals --thing-name device-001
# Identify new certificate ARN
aws iot attach-policy --policy-name YourPolicyName \
  --target arn:aws:iot:region:account:cert/new-cert-id

Device ARN Mapping: Verify and fix Thing-to-certificate mappings. List all principals for a Thing:

aws iot list-thing-principals --thing-name device-001

If the old certificate is still attached and the device has already connected successfully with the new one, detach the old certificate:

aws iot detach-thing-principal \
  --thing-name device-001 \
  --principal arn:aws:iot:region:account:cert/old-cert-id

Immediate Recovery Steps:

  1. List all Things and their current certificate attachments
  2. For each Thing, identify the new certificate ARN
  3. Attach your IoT policy to each new certificate ARN
  4. Verify policy attachment: `aws iot list-attached-policies --target cert-arn`
  5. Test device connection with new certificate
  6. Once confirmed working, detach and deactivate old certificates

Automation for Future Rotations: Create a rotation script that performs every step in order, so none is skipped:

  • Generates new certificate
  • Attaches to Thing (keeping old cert attached)
  • Copies all policy attachments from old to new cert
  • Updates device configuration
  • Validates connection
  • Only then removes old cert

This prevents the connection outage you’re experiencing. The key insight is that policy attachments are certificate-specific, not Thing-specific, so rotation must explicitly transfer them.
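The policy-transfer and cleanup steps could look roughly like this. It is a sketch under assumptions: `iot` is a boto3 IoT client, the function names are hypothetical, and device-side validation (confirming the device really connects with the new cert) has to happen between the two calls and is not shown.

```python
def attached_policy_names(iot, cert_arn):
    """List names of all IoT policies attached to a certificate (paginated)."""
    names, marker = [], None
    while True:
        kwargs = {"target": cert_arn}
        if marker:
            kwargs["marker"] = marker
        page = iot.list_attached_policies(**kwargs)
        names += [p["policyName"] for p in page.get("policies", [])]
        marker = page.get("nextMarker")
        if not marker:
            return names


def copy_policies(iot, old_cert_arn, new_cert_arn):
    """Attach every policy found on the old certificate to the new one."""
    copied = []
    for name in attached_policy_names(iot, old_cert_arn):
        iot.attach_policy(policyName=name, target=new_cert_arn)
        copied.append(name)
    return copied


def retire_old_cert(iot, thing_name, old_cert_arn, new_cert_arn):
    """Detach and deactivate the old cert, but only if the new one has a policy."""
    if not attached_policy_names(iot, new_cert_arn):
        raise RuntimeError("new certificate has no policy attached; keeping old cert")
    iot.detach_thing_principal(thingName=thing_name, principal=old_cert_arn)
    cert_id = old_cert_arn.split("/")[-1]  # certificate ID is the last ARN segment
    iot.update_certificate(certificateId=cert_id, newStatus="INACTIVE")
```

Keeping `retire_old_cert` separate from `copy_policies` is deliberate: it leaves room to verify the device connection before anything is removed, and the guard refuses to retire the old cert if the new one would be left without a policy.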

I’ve seen this exact scenario multiple times. The issue is that certificate rotation in AWS IoT is not a single atomic operation - it’s a multi-step process that requires careful orchestration.

There’s also the device ARN mapping issue to consider. Creating a new certificate does not change the Thing-to-certificate attachment; the new certificate has to be attached to the Thing explicitly. Check whether your Things have the old certificate, the new one, or both attached. AWS IoT allows multiple certificates per Thing, and policy evaluation uses the certificate presented during connection, with no priority among the attached certificates. If the device presents the new cert but only the old one is attached to the Thing, Thing policy variables such as ${iot:Connection.Thing.ThingName} will not resolve and you’ll get policy denials. Verify the attachment status for both old and new certificates on a sample device.
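For the sample-device check, something like the following could summarize what is attached to one Thing. It is only a sketch: `iot` is assumed to be a boto3 IoT client, both function names are hypothetical, and only the first page of each listing is read.

```python
def thing_cert_report(iot, thing_name):
    """Return one row per certificate attached to the Thing: ARN, status, policies."""
    rows = []
    for principal in iot.list_thing_principals(thingName=thing_name)["principals"]:
        if ":cert/" not in principal:
            continue  # skip non-certificate principals
        cert_id = principal.split("/")[-1]
        desc = iot.describe_certificate(certificateId=cert_id)
        status = desc["certificateDescription"]["status"]
        page = iot.list_attached_policies(target=principal)
        policies = [p["policyName"] for p in page.get("policies", [])]
        rows.append({"certificateArn": principal, "status": status, "policies": policies})
    return rows


def certs_missing_policy(rows):
    """ACTIVE certificates with no policy attached: the likely cause of denials."""
    return [r["certificateArn"] for r in rows if r["status"] == "ACTIVE" and not r["policies"]]
```

If the new certificate does not appear in the report at all, it was never attached to the Thing, which is a separate problem from a missing policy attachment.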

Certificate rotation doesn’t automatically transfer policy attachments in AWS IoT. You need to explicitly attach the IoT policy to each new certificate. The old certificate’s policy attachment remains with the old cert even after it’s marked INACTIVE. Did you run attach-policy commands for all new certificates after rotation?