Fix RDS Automated Backups Disabled (Retention 0)

TL;DR

An RDS instance with backup retention set to 0 days has automated backups and point-in-time recovery disabled, so a bad migration or fat-fingered delete can mean permanent data loss. Set retention to at least 7 days with aws rds modify-db-instance --db-instance-identifier my-prod-db --backup-retention-period 7 --apply-immediately.

Automated backups are one of those features that feel invisible until the day you need them. When an RDS instance has its backup retention period set to 0 days, automated backups are turned off entirely. No daily snapshots, no transaction log capture, and no point-in-time recovery. If something corrupts your data or someone runs a DELETE without a WHERE clause, there is no clean way back.

This Lensix check (rds_backupretention in the rds_checks module) flags any RDS instance where the backup retention period is zero. It is a small setting with an outsized blast radius.

What this check detects

Amazon RDS supports automated backups controlled by the BackupRetentionPeriod attribute. This value tells RDS how many days to keep automated backups, and it can range from 0 to 35 days. A value of 0 disables automated backups completely.

When automated backups are enabled, RDS does two things:

Takes a daily storage volume snapshot during your configured backup window.
Continuously captures transaction logs so you can restore to any second within the retention window, up to the latest restorable time (typically within the last five minutes).

The second part is what people call point-in-time recovery (PITR). Setting retention to 0 throws both away.
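
To see which instances in a region would trip this check, a short sketch along these lines may help. It assumes AWS CLI v2 with credentials that allow rds:DescribeDBInstances, and the function name list_unprotected_instances is made up for illustration.

```shell
# Print the identifier of every RDS instance in the current region whose
# BackupRetentionPeriod is 0, one per line. Hypothetical helper, not part
# of any AWS tooling.
list_unprotected_instances() {
  aws rds describe-db-instances \
    --query 'DBInstances[?BackupRetentionPeriod==`0`].DBInstanceIdentifier' \
    --output text | tr '\t' '\n' | sed '/^$/d'
}
```

Calling list_unprotected_instances with no arguments prints nothing when every instance has backups enabled, which makes it easy to use in a scheduled job that alerts on any output.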

Note: Manual snapshots are separate from automated backups and are not deleted when you set retention to 0. Existing automated backups, by contrast, are deleted as soon as retention drops to 0. Manual snapshots also only capture a moment in time. They do not give you the second-by-second recovery that transaction log capture provides, so they are not a substitute for automated backups.

Why it matters

Databases fail in messy ways. The threat is rarely a dramatic outage. It is usually one of these everyday scenarios:

Human error. An engineer runs a migration against the wrong environment, or a script truncates a table. With PITR you rewind to one minute before the mistake. Without it, you are reconstructing data from application logs and apologies.
Application bugs. A code deploy starts writing corrupt records. By the time anyone notices, hours of bad data are in the table. PITR lets you pinpoint when the corruption began and restore to just before it.
Ransomware and malicious deletion. A compromised credential can drop tables or encrypt rows. If automated backups are off, the attacker has effectively removed your recovery option along with your data.
Compliance gaps. Frameworks like SOC 2, PCI DSS, and HIPAA expect demonstrable recovery capabilities. An instance with no backups is a finding waiting to happen during an audit.

There is also a quieter operational cost. RDS requires automated backups on the source instance before it will create a read replica for engines such as MySQL and MariaDB, and certain maintenance operations behave differently without them. So the setting reaches beyond disaster recovery into day-to-day capabilities.

Warning: A common way instances end up with retention 0 is at creation time. Terraform's aws_db_instance defaults backup_retention_period to 0 when the argument is left unset, and Terraform or CloudFormation templates sometimes zero it explicitly to "save on storage." The storage savings are trivial. The recovery exposure is not.

How to fix it

The fix is to set a non-zero retention period. Seven days is a sensible floor for most workloads, and production databases often warrant 14 to 35 days depending on your recovery objectives.

Using the AWS CLI

First, check the current retention period on your instance:

aws rds describe-db-instances \
  --db-instance-identifier my-prod-db \
  --query 'DBInstances[0].BackupRetentionPeriod'

If it returns 0, enable backups by setting a retention period:

aws rds modify-db-instance \
  --db-instance-identifier my-prod-db \
  --backup-retention-period 7 \
  --apply-immediately

Warning: Changing the backup retention period from 0 to a positive value triggers a brief I/O suspension while RDS takes the first backup. For Single-AZ instances this can mean a short period of unavailability. Schedule the change during a maintenance window, or omit --apply-immediately to defer it to the next window.
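
If you defer the change, it helps to confirm that it is actually queued. The sketch below reads both the active value and the pending one from PendingModifiedValues; show_retention_state is a hypothetical helper name, and the exact text rendering of a null value (usually None) is an assumption about the CLI's text output.

```shell
# Print "<active retention> <pending retention>" for one instance,
# tab-separated. The second column is normally "None" when no retention
# change is waiting for the next maintenance window.
show_retention_state() {
  aws rds describe-db-instances \
    --db-instance-identifier "$1" \
    --query 'DBInstances[0].[BackupRetentionPeriod, PendingModifiedValues.BackupRetentionPeriod]' \
    --output text
}
```

For example, show_retention_state my-prod-db on an instance with a deferred change would show 0 in the first column and the new retention value in the second.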

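While you are at it, set a sensible backup window so backups run during low-traffic hours. The window is expressed in UTC, must be at least 30 minutes long, and must not overlap the instance's maintenance window: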

aws rds modify-db-instance \
  --db-instance-identifier my-prod-db \
  --backup-retention-period 7 \
  --preferred-backup-window "03:00-04:00" \
  --apply-immediately
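
The commands above can be wrapped into one step that applies the change, waits, and then confirms that point-in-time recovery is usable. This is a minimal sketch: enable_backups is a hypothetical function name, and the waiter may return early if the instance has not yet left the available state, so treat the final output as a check to repeat rather than a guarantee.

```shell
# Enable automated backups on one instance and report the result.
# Usage: enable_backups <db-instance-identifier> [retention-days]
enable_backups() {
  db_id="$1"
  days="${2:-7}"
  # RDS accepts 1 to 35 days when backups are enabled; 0 would disable them.
  if [ "$days" -lt 1 ] || [ "$days" -gt 35 ]; then
    echo "retention must be between 1 and 35 days" >&2
    return 1
  fi
  aws rds modify-db-instance \
    --db-instance-identifier "$db_id" \
    --backup-retention-period "$days" \
    --apply-immediately >/dev/null || return 1
  # Block until the instance reports the "available" state again.
  aws rds wait db-instance-available --db-instance-identifier "$db_id" || return 1
  # LatestRestorableTime is populated once a restore point exists.
  aws rds describe-db-instances \
    --db-instance-identifier "$db_id" \
    --query 'DBInstances[0].[BackupRetentionPeriod, LatestRestorableTime]' \
    --output text
}
```

Running enable_backups my-prod-db 14 would print the new retention value followed by the latest restorable timestamp once the first backup has completed.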

Using the AWS Console

Open the RDS console and select your database instance.
Click Modify.
Scroll to the Backup section.
Set Backup retention period to 7 days or more.
Optionally set the backup window to an off-peak time.
Click Continue, choose when to apply, and confirm.

Using Terraform

In Terraform, the relevant argument is backup_retention_period on the aws_db_instance resource:

resource "aws_db_instance" "prod" {
  identifier              = "my-prod-db"
  engine                  = "postgres"
  instance_class          = "db.t3.medium"
  allocated_storage       = 50
  # Credentials and networking arguments are omitted here for brevity.

  backup_retention_period = 7             # days; 0 disables automated backups
  backup_window           = "03:00-04:00" # UTC
  copy_tags_to_snapshot   = true
  deletion_protection     = true
}

For Aurora clusters the setting lives on aws_rds_cluster instead, where the minimum is 1 day, so backups cannot be switched off entirely:

resource "aws_rds_cluster" "prod" {
  cluster_identifier      = "my-prod-cluster"
  engine                  = "aurora-postgresql"
  backup_retention_period = 14
  preferred_backup_window = "03:00-04:00"
}

How to prevent it from happening again

Fixing one instance is easy. The harder part is making sure the next instance someone spins up does not repeat the mistake. Bake the requirement into the pipeline so it cannot slip through.

Catch it in CI with policy-as-code

If you use Terraform, add a Checkov or OPA check to your CI pipeline. Checkov has a built-in rule for this:

checkov -d . --check CKV_AWS_133

CKV_AWS_133 fails any aws_db_instance where backup_retention_period is missing or set to 0. Wire it into your pull request checks so the plan never merges with backups disabled.

For a custom OPA/Rego policy that enforces a minimum of 7 days:

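package terraform.rds

import rego.v1

deny contains msg if {
  some resource in input.resource_changes
  resource.type == "aws_db_instance"
  # Skip pure deletions, where change.after is null.
  not resource.change.actions == ["delete"]
  # Treat a missing value as 0, matching the provider's documented default.
  retention := object.get(resource.change.after, "backup_retention_period", 0)
  retention < 7
  msg := sprintf("RDS instance %s must have backup_retention_period >= 7", [resource.address])
}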

Enforce it at runtime with AWS Config

AWS Config ships a managed rule, db-instance-backup-enabled, that continuously evaluates whether RDS instances have backups enabled and meet a minimum retention. Deploy it across accounts with a conformance pack so drift gets flagged even when someone changes a setting outside your IaC.
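
For a single account, enabling the managed rule might look like the sketch below. It assumes a Config recorder is already running and recording RDS instances, and that the rule's backupRetentionMinimum parameter is the one you want to set. The helper names config_rule_json and deploy_backup_rule are hypothetical, and rolling the rule out through a conformance pack is not shown.

```shell
# Emit the ConfigRule JSON for the managed rule, with a minimum retention
# in days (default 7). InputParameters is a JSON string, hence the escapes.
config_rule_json() {
  cat <<EOF
{
  "ConfigRuleName": "db-instance-backup-enabled",
  "Source": {"Owner": "AWS", "SourceIdentifier": "DB_INSTANCE_BACKUP_ENABLED"},
  "InputParameters": "{\"backupRetentionMinimum\":\"${1:-7}\"}"
}
EOF
}

# Create or update the rule in the current account and region.
deploy_backup_rule() {
  aws configservice put-config-rule --config-rule "$(config_rule_json "$1")"
}
```

Calling deploy_backup_rule 14 would register the rule with a 14-day minimum. Printing config_rule_json 14 on its own first is a cheap way to review the payload before it is sent.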

Tip: Pair the AWS Config rule with an automatic remediation action through SSM Automation. When the rule detects a non-compliant instance, it can call modify-db-instance to set a compliant retention period without waiting for a human to notice.

Use a Service Control Policy as a backstop

SCPs cannot directly enforce a retention value, so the backstop has to come from detection. Combine continuous monitoring through Lensix or AWS Config with alerting so that any instance with retention 0 surfaces immediately. The goal is layered defense: prevention in CI, detection at runtime, and alerting when both are bypassed.

Best practices

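Set retention based on your recovery needs. With PITR enabled, your recovery point objective is typically a few minutes regardless of how long the retention period is. What retention decides is how far back you can rewind, so size it to how long a problem could go unnoticed: 7 days is a reasonable minimum, and critical systems warrant 14 to 35.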
Enable deletion protection too. Backups protect against data loss, but deletion_protection stops someone from deleting the whole instance by accident. They solve different problems.
Copy automated backups to a second region. Use cross-region automated backup replication for production so a regional event does not take your backups with it.
Test your restores. A backup you have never restored is a hope, not a plan. Periodically restore to a throwaway instance and verify the data is intact.
Set copy_tags_to_snapshot. This carries your tags onto snapshots, which keeps cost allocation and ownership clear when you have hundreds of automated backups.
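
A restore test can be scripted so it actually gets run. The sketch below restores the latest restorable point into a throwaway instance and prints its endpoint. restore_drill and the -restore-drill suffix are hypothetical names, and without extra arguments the restored instance takes default networking and parameter settings, which may not suit your environment.

```shell
# Restore the most recent restorable point of a source instance into a
# temporary instance, then print the new endpoint address.
# Usage: restore_drill <source-db-identifier> [target-db-identifier]
restore_drill() {
  source_id="$1"
  target_id="${2:-${1}-restore-drill}"
  aws rds restore-db-instance-to-point-in-time \
    --source-db-instance-identifier "$source_id" \
    --target-db-instance-identifier "$target_id" \
    --use-latest-restorable-time >/dev/null || return 1
  # Restores can take a while; wait until the new instance is usable.
  aws rds wait db-instance-available --db-instance-identifier "$target_id" || return 1
  aws rds describe-db-instances \
    --db-instance-identifier "$target_id" \
    --query 'DBInstances[0].Endpoint.Address' \
    --output text
}
```

After restore_drill my-prod-db prints an endpoint, connect to it, run your integrity checks, and delete the throwaway instance so it does not keep accruing cost.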

Danger: Deleting an RDS instance without taking a final snapshot, or with automated backups disabled, destroys your data permanently. Always confirm --final-snapshot-identifier is set before running aws rds delete-db-instance on anything that matters.
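
One way to make the final snapshot hard to forget is a small wrapper that always passes it. This is only a sketch: safe_delete_instance and the snapshot naming scheme are hypothetical, and the call will still fail if deletion protection is enabled on the instance.

```shell
# Delete an instance while always taking a final snapshot.
# Usage: safe_delete_instance <db-instance-identifier> [snapshot-identifier]
safe_delete_instance() {
  if [ -z "$1" ]; then
    echo "usage: safe_delete_instance <db-id> [snapshot-id]" >&2
    return 1
  fi
  db_id="$1"
  # Default to a timestamped snapshot name derived from the instance id.
  snapshot_id="${2:-${1}-final-$(date +%Y%m%d%H%M%S)}"
  aws rds delete-db-instance \
    --db-instance-identifier "$db_id" \
    --no-skip-final-snapshot \
    --final-snapshot-identifier "$snapshot_id"
}
```

With this in place, safe_delete_instance my-old-db leaves behind a snapshot named after the instance and the time of deletion.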

Automated backups cost very little and ask almost nothing of you once configured. Turning them off to shave a few cents off the storage bill is a trade no team regrets until the moment they do. Set a sane retention period, enforce it in your pipeline, and move on to harder problems.
