RAM policy blocks ECS snapshot restore for disaster recovery testing

ronalddev · July 6, 2025, 8:04am

Attempting to restore ECS snapshots during our quarterly disaster recovery validation is failing with ‘Unauthorized’ errors. Our DR team has a RAM policy that should allow snapshot operations, but when they try to restore snapshots to test recovery procedures, the operation is denied.

Current RAM policy attached to DR team role:

{
  "Statement": [{
    "Effect": "Allow",
    "Action": ["ecs:CreateSnapshot", "ecs:DescribeSnapshots"],
    "Resource": "*"
  }],
  "Version": "1"
}

The error occurs when executing restore operations through the console or API. Snapshot creation and viewing work fine, but any restore attempt fails immediately. This is blocking our DR validation process required for compliance audits.

I believe the issue is related to RAM policy permissions not including the snapshot restore action, but I’m uncertain about the correct action name and whether the resource scope needs to be more specific. Any guidance on proper RAM policies for disaster recovery workflows?

georgecoder · July 6, 2025, 10:08am

Your RAM policy is missing the restore action. Creating snapshots and restoring from snapshots are separate permissions in Alibaba Cloud. You need to add the restore action explicitly.

The action you’re looking for is probably ecs:CreateDiskFromSnapshot or ecs:CreateInstanceFromSnapshot depending on whether you’re restoring individual disks or entire instances. Check the ECS API documentation for the exact action names.

bettytech · July 10, 2025, 4:19am

Thanks, that makes sense. We’re trying to restore entire ECS instances from snapshots for DR testing. Should I add ecs:CreateInstanceFromSnapshot to the policy? Are there any other related permissions needed for a complete restore operation?

emma_analyst · July 14, 2025, 9:19am

Don’t forget about the snapshot resource itself. Your current policy allows operations on all resources ("Resource": "*"), but depending on your organization’s security policies, you might need to explicitly grant access to the specific snapshots used for DR.

Also check if there are any deny policies at the account or resource group level that might be overriding your allow policy. Explicit denies always win in RAM policy evaluation.
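
The "explicit deny wins" rule can be sketched locally as follows. This is a deliberately simplified model, not RAM's actual evaluator: it ignores conditions and principals, matches action and resource strings with shell-style wildcards in exact case, and the statements and ARN are made up for illustration.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    # Policy fields may be a single string or a list of strings.
    return value if isinstance(value, list) else [value]


def _matches(patterns, value):
    # Shell-style wildcard match; a simplification of RAM's matching rules.
    return any(fnmatchcase(value, p) for p in _as_list(patterns))


def evaluate(statements, action, resource):
    """Return 'Deny', 'Allow', or 'ImplicitDeny' for one request."""
    allowed = False
    for st in statements:
        if not (_matches(st["Action"], action) and _matches(st["Resource"], resource)):
            continue
        if st["Effect"] == "Deny":
            return "Deny"  # an explicit deny ends evaluation immediately
        allowed = True
    return "Allow" if allowed else "ImplicitDeny"


# Hypothetical statements: the team's allow plus an account-level deny.
team_allow = {"Effect": "Allow", "Action": ["ecs:RunInstances", "ecs:CreateDisk"], "Resource": "*"}
account_deny = {"Effect": "Deny", "Action": "ecs:RunInstances", "Resource": "*"}
ARN = "acs:ecs:cn-shanghai:1234567890123456:instance/i-example"  # made-up ARN

print(evaluate([team_allow], "ecs:RunInstances", ARN))
print(evaluate([team_allow, account_deny], "ecs:RunInstances", ARN))
print(evaluate([team_allow], "ecs:AttachDisk", ARN))
```

The second call is the case to look for when a correct-looking allow policy still fails: the deny may live in a policy the DR team never sees.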

maryexpert · July 27, 2025, 3:22am

Good point about deny policies. I’ll check with our security team if there are any account-level restrictions. For now, I need to update the RAM policy with the correct restore permissions to unblock DR testing this week.

bettytech · July 27, 2025, 7:25am

Here’s the complete solution addressing all three areas:

RAM Policy Permissions: Your current policy only grants snapshot creation and viewing, not restoration. For full DR testing capability, update the policy to include all necessary restore actions:

{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:CreateSnapshot",
        "ecs:DescribeSnapshots",
        "ecs:RunInstances",
        "ecs:CreateDisk",
        "ecs:AttachDisk",
        "ecs:DescribeInstances",
        "ecs:DescribeDisks",
        "ecs:DescribeInstanceTypes",
        "ecs:DescribeImages"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "vpc:DescribeVpcs",
        "vpc:DescribeVSwitches"
      ],
      "Resource": "*"
    }
  ],
  "Version": "1"
}

Key permissions explained:

ecs:RunInstances: Creates new ECS instances (required for restore)
ecs:CreateDisk: Creates disks from snapshots
ecs:AttachDisk: Attaches restored disks to instances
Describe actions: Required for validation and instance configuration during restore
VPC actions: Needed if restoring into VPC networks (typical for production DR)
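
Before attaching an updated policy, a quick local check can show which restore actions it still lacks. The sketch below is an illustration only: the helper names are hypothetical, the required-action list is taken from this post, and wildcard matching is a simplification of how RAM matches actions.

```python
from fnmatch import fnmatchcase


def allowed_patterns(policy):
    """Collect every action pattern granted by Allow statements."""
    patterns = []
    for st in policy["Statement"]:
        if st["Effect"] != "Allow":
            continue
        actions = st["Action"]
        patterns.extend(actions if isinstance(actions, list) else [actions])
    return patterns


def missing_actions(policy, required):
    """Return the required actions that no Allow pattern covers."""
    patterns = allowed_patterns(policy)
    return [a for a in required if not any(fnmatchcase(a, p) for p in patterns)]


# The policy from the original question.
original_policy = {
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ecs:CreateSnapshot", "ecs:DescribeSnapshots"],
        "Resource": "*",
    }],
    "Version": "1",
}

# Actions this post lists as needed for a restore.
RESTORE_ACTIONS = [
    "ecs:RunInstances",
    "ecs:CreateDisk",
    "ecs:AttachDisk",
    "ecs:DescribeInstances",
    "ecs:DescribeDisks",
]

print(missing_actions(original_policy, RESTORE_ACTIONS))
```

Run against the original policy, every restore action is reported as missing, which matches the 'Unauthorized' behaviour described in the question.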

Snapshot Restore Action: The specific action for snapshot restore depends on your workflow:

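Full Instance Restore (recommended for DR): Use ecs:RunInstances. Data disks are restored by passing DataDisk.N.SnapshotId; the system disk comes from the image named in ImageId, so restoring a system-disk snapshot means first creating a custom image from it (ecs:CreateImage) and then launching from that image.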
Disk-level Restore:
- ecs:CreateDisk with the SnapshotId parameter
- Then ecs:AttachDisk to attach to existing or new instance
- More granular but requires multiple steps

For DR testing, ecs:RunInstances is the primary action you need. It creates the instance and restores any data disks from the snapshots you pass in the same call.
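
RunInstances takes snapshot IDs per data disk, as numbered DataDisk.N.SnapshotId parameters. A minimal sketch of building the argument list for several snapshots is below; the helper name is hypothetical, the IDs are placeholders, and the assumption that the CLI flags mirror the API parameter names should be checked against the CLI's built-in help.

```python
def run_instances_args(image_id, instance_type, security_group_id, data_disk_snapshot_ids):
    """Build an argument list for `aliyun ecs RunInstances` (nothing is executed)."""
    args = [
        "aliyun", "ecs", "RunInstances",
        "--ImageId", image_id,
        "--InstanceType", instance_type,
        "--SecurityGroupId", security_group_id,
    ]
    # Data disks are numbered from 1: DataDisk.1.SnapshotId, DataDisk.2.SnapshotId, ...
    for n, snapshot_id in enumerate(data_disk_snapshot_ids, start=1):
        args += [f"--DataDisk.{n}.SnapshotId", snapshot_id]
    return args


# Placeholder IDs only; substitute real ones before running the command.
args = run_instances_args("img-xxx", "ecs.g6.large", "sg-xxx", ["s-dr-xxx", "s-dr-yyy"])
print(" ".join(args))
```

Returning a list rather than a string makes it straightforward to pass the result to subprocess.run without shell quoting issues.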

Additional Required Actions:

If using security groups: ecs:DescribeSecurityGroups, ecs:AuthorizeSecurityGroup
If using EIP: ecs:AllocateEipAddress, ecs:AssociateEipAddress
If tagging restored instances: ecs:TagResources
If in resource groups: resourcemanager:ListResourceGroups

Resource Scope in Policy: Your current "Resource": "*" is overly permissive. Apply least privilege by scoping to DR resources:

{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:RunInstances",
        "ecs:CreateDisk",
        "ecs:AttachDisk"
      ],
      "Resource": [
        "acs:ecs:cn-shanghai:*:instance/*",
        "acs:ecs:cn-shanghai:*:disk/*",
        "acs:ecs:cn-shanghai:*:snapshot/dr-*"
      ],
      "Condition": {
        "StringEquals": {
          "ecs:ResourceGroup": "rg-dr-testing"
        }
      }
    },
    {
      "Effect": "Allow",
      "Action": [
        "ecs:DescribeSnapshots",
        "ecs:DescribeInstances",
        "ecs:DescribeDisks"
      ],
      "Resource": "*"
    }
  ],
  "Version": "1"
}

This scopes restore actions to:

Specific region (cn-shanghai - adjust to your DR region)
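Snapshots whose identifier in the ARN starts with ‘dr-’ (the pattern is matched against the last segment of the ARN, so confirm your snapshot identifiers really begin with dr-; an ID such as s-dr-xxx would not match)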
Resources in ‘rg-dr-testing’ resource group
Read-only describe actions remain unrestricted for convenience
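
To see what the scoped Resource patterns do and do not cover, they can be tried against sample ARNs locally. This is a sketch under assumptions: the account ID and resource IDs are made up, and shell-style wildcard matching only approximates RAM's resource matching.

```python
from fnmatch import fnmatchcase

# Resource patterns copied from the scoped policy above.
SCOPED_RESOURCES = [
    "acs:ecs:cn-shanghai:*:instance/*",
    "acs:ecs:cn-shanghai:*:disk/*",
    "acs:ecs:cn-shanghai:*:snapshot/dr-*",
]


def in_scope(arn, patterns=SCOPED_RESOURCES):
    """True if any pattern matches the ARN (simplified wildcard matching)."""
    return any(fnmatchcase(arn, p) for p in patterns)


ACCOUNT = "1234567890123456"  # hypothetical account ID

samples = [
    f"acs:ecs:cn-shanghai:{ACCOUNT}:snapshot/dr-2025q3",  # region and prefix match
    f"acs:ecs:cn-hangzhou:{ACCOUNT}:snapshot/dr-2025q3",  # wrong region
    f"acs:ecs:cn-shanghai:{ACCOUNT}:snapshot/s-dr-xxx",   # starts with s-, not dr-
]
for arn in samples:
    print(arn, in_scope(arn))
```

The third sample is worth noting: an identifier that merely contains dr- falls outside a dr-* pattern, so the restore would still be denied.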

Best Practices for DR RAM Policies:

Separate Policies by Environment:
- Production restore: Highly restricted, requires approval workflow
- DR testing: More permissive, scoped to test resource groups
- Create separate RAM roles for each

Time-based Access: Add condition to limit restore permissions to DR testing windows:

"Condition": {
  "DateGreaterThan": {"acs:CurrentTime": "2025-01-26T00:00:00Z"},
  "DateLessThan": {"acs:CurrentTime": "2025-01-27T23:59:59Z"}
}
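
A quick way to sanity-check a window before attaching it is to evaluate it against a few timestamps. The sketch below only mirrors the two comparisons in the condition; it is not how RAM evaluates conditions internally, and the function names are illustrative.

```python
from datetime import datetime, timezone


def parse_utc(ts):
    # Convert a trailing "Z" to an explicit offset so fromisoformat accepts it
    # on Python versions older than 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def in_window(condition, now):
    """True if `now` falls strictly inside the DateGreaterThan/DateLessThan window."""
    start = parse_utc(condition["DateGreaterThan"]["acs:CurrentTime"])
    end = parse_utc(condition["DateLessThan"]["acs:CurrentTime"])
    return start < now < end


# The condition block from the post.
window = {
    "DateGreaterThan": {"acs:CurrentTime": "2025-01-26T00:00:00Z"},
    "DateLessThan": {"acs:CurrentTime": "2025-01-27T23:59:59Z"},
}

during = datetime(2025, 1, 26, 12, 0, tzinfo=timezone.utc)
after = datetime(2025, 1, 28, 0, 0, tzinfo=timezone.utc)
print(in_window(window, during))
print(in_window(window, after))
```

A window that has already passed makes the statement match nothing, so the dates need updating for each DR test cycle.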

Audit Trail: Enable ActionTrail to log all snapshot restore operations for compliance.
MFA Requirement: For production restores, add MFA condition:
```
"Condition": {
  "Bool": {"acs:MFAPresent": "true"}
}
```

Validation Steps:

Update RAM policy with required actions
Wait 2-3 minutes for policy propagation

Test restore using RAM user/role:


# Snapshots are passed per data disk; the system disk comes from the image
aliyun ecs RunInstances --ImageId img-xxx \
  --DataDisk.1.SnapshotId s-dr-xxx \
  --InstanceType ecs.g6.large \
  --SecurityGroupId sg-xxx

Verify instance creates successfully from snapshot
Check ActionTrail logs to confirm proper authorization

Troubleshooting:

If still getting ‘Unauthorized’: Check for explicit Deny statements in other attached policies or in Resource Directory control policies (the Alibaba Cloud counterpart of SCPs)
Verify RAM role trust policy allows your DR team to assume the role
Confirm snapshots exist in the same region as restore target
Check snapshot status is ‘accomplished’ (completed snapshots only)

The core issue is that snapshot creation and restoration are separate permission domains in RAM. Your policy granted read/create snapshot permissions but not the execute permissions needed for restore operations. Adding ecs:RunInstances and related actions, properly scoped to DR resources, will enable your team to perform disaster recovery validation while maintaining security boundaries.

marywizard · July 10, 2025, 6:17pm

Restoring an ECS instance from snapshot actually involves multiple actions behind the scenes. You need permissions for:

Creating the instance (ecs:CreateInstance or ecs:RunInstances)
Attaching the restored disk (ecs:AttachDisk)
Potentially network operations if creating in a VPC
Describing instance types and regions

The resource scope also matters. Using "Resource": "*" works but violates least privilege. You should scope it to specific regions or resource groups for DR environments.
