Cloud SQL Automated Backups Disabled: Why It Matters and How to Fix It

Learn why disabled automated backups on GCP Cloud SQL are a critical risk, how to enable them via gcloud, the console, and Terraform, and how to enforce them in CI/CD.

TL;DR

This check flags Cloud SQL instances running without automated backups, which leaves you with no clean recovery point after data corruption, accidental deletion, or a ransomware event. Turn backups on with gcloud sql instances patch INSTANCE --backup-start-time=HH:MM and require it in your IaC.

Automated backups are one of those settings that nobody thinks about until the moment they desperately need them. A developer runs a migration against the wrong database, a bad deploy truncates a table, or an attacker drops your schema, and suddenly the only question that matters is: how far back can we recover? If automated backups were never enabled, the answer is often "not at all."

The Lensix check sql_noautomatedbackups looks at each Cloud SQL instance in your GCP projects and reports any instance where automated backups are switched off. It is a low-effort, high-impact fix, and it is one of the most common gaps we see in real environments.


What this check detects

Every Cloud SQL instance has a backup configuration. The relevant field is settings.backupConfiguration.enabled. When this is false, GCP does not take scheduled daily backups of the instance, and depending on the engine you also lose the ability to enable point-in-time recovery, which relies on backups plus write-ahead logs or binary logs.

The check fails when an instance has backupConfiguration.enabled = false. You can confirm the state of any single instance directly:

gcloud sql instances describe my-instance \
  --format="value(settings.backupConfiguration.enabled)"

If that returns False, the instance has no automated backups and will trip this check.
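To run the same test across a whole project rather than one instance at a time, a short script can read the JSON printed by gcloud sql instances list --format=json. The following is a minimal sketch, not the Lensix implementation: the function name instances_without_backups and the sample instance names are invented for illustration, and treating a missing backupConfiguration block as "disabled" is an assumption of this example.

```python
from typing import Any


def instances_without_backups(instances: list[dict[str, Any]]) -> list[str]:
    """Return names of instances whose automated backups are not enabled.

    `instances` is the parsed output of:
        gcloud sql instances list --format=json
    A missing settings or backupConfiguration block counts as disabled
    (an assumption of this sketch).
    """
    flagged = []
    for instance in instances:
        settings = instance.get("settings") or {}
        backup_config = settings.get("backupConfiguration") or {}
        if not backup_config.get("enabled", False):
            flagged.append(instance.get("name", "<unnamed>"))
    return flagged


# Hypothetical sample shaped like the gcloud JSON output.
sample = [
    {"name": "orders-db", "settings": {"backupConfiguration": {"enabled": True}}},
    {"name": "legacy-db", "settings": {"backupConfiguration": {"enabled": False}}},
]
for name in instances_without_backups(sample):
    print(f"automated backups disabled: {name}")
```

In practice you would load the real output with json.load instead of the inline sample.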

Note: Automated backups are different from on-demand backups. An on-demand backup is a one-time snapshot you trigger manually. Automated backups run on a daily schedule and are what point-in-time recovery and most disaster recovery plans depend on. This check is specifically about the scheduled, automated kind.


Why it matters

A database without backups is a single point of failure for the most important asset most companies have: their data. Here are the concrete ways this bites people.

Human error

The most common cause of data loss is not an attacker but a person: a DELETE without a WHERE clause, a migration script run against production instead of staging, or an automation job with a bug. With automated backups and point-in-time recovery, you can restore to a moment seconds before the mistake. Without them, you are reconstructing data from application logs and luck.

Ransomware and destructive attacks

Attackers who gain write access to a database increasingly go for impact rather than just exfiltration. They drop tables, encrypt data, or leave a ransom note in a renamed table. If your only copy of the data is the live instance, you are negotiating. Backups stored separately by GCP give you a recovery path that does not depend on the compromised instance.

Danger: By default, deleting an instance also deletes its automated backups. Backups protect you from data corruption, but not from losing the instance itself. Enable deletion protection to block accidental deletion, and export critical data off the instance so a copy survives either way. Plan for both.

Compliance and contractual obligations

Frameworks like SOC 2, ISO 27001, PCI DSS, and HIPAA all expect a defined backup and recovery process. An auditor asking "show me your recovery point objective for the customer database" does not want to hear that backups were never turned on. This check maps directly to those controls.

Regional outages

GCP regions are reliable but not immune to incidents. Automated backups, especially when stored in a multi-region location, give you a recovery option if a zone or region has a bad day.


How to fix it

You have three paths depending on how you manage infrastructure: the console, the gcloud CLI, or infrastructure as code. Use the one that matches how the instance was created so your fix does not get reverted on the next deploy.

Warning: Enabling backups on an existing instance does not require downtime, but the first backup runs at the next scheduled window. If you need an immediate recovery point, trigger an on-demand backup as well. Stored backups do incur storage costs, billed at the rate for the region, so factor that into budgeting for large instances.

Option 1: gcloud CLI

Enable automated backups and set a backup window. The --backup-start-time value is in UTC and marks the start of a 4-hour window during which GCP picks a moment to run the backup. Choose a low-traffic period.

gcloud sql instances patch my-instance \
  --backup-start-time=03:00 \
  --retained-backups-count=14 \
  --retained-transaction-log-days=7
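When several instances need the same fix, it can help to generate the patch commands and review them before running anything. The sketch below is one way to do that under stated assumptions: build_enable_backups_command and the instance names are hypothetical, and it only assembles the flags shown above rather than calling gcloud itself.

```python
import re
import shlex


def build_enable_backups_command(
    instance: str,
    start_time: str = "03:00",
    retained_backups: int = 14,
) -> list[str]:
    """Build the gcloud argv that enables automated backups on one instance."""
    # --backup-start-time expects a 24-hour HH:MM value in UTC.
    if not re.fullmatch(r"([01]\d|2[0-3]):[0-5]\d", start_time):
        raise ValueError(f"start_time must be HH:MM (24-hour, UTC), got {start_time!r}")
    if retained_backups < 1:
        raise ValueError("retained_backups must be at least 1")
    return [
        "gcloud", "sql", "instances", "patch", instance,
        f"--backup-start-time={start_time}",
        f"--retained-backups-count={retained_backups}",
    ]


# Dry run: print the commands for review instead of executing them.
for name in ["orders-db", "billing-db"]:  # hypothetical instance names
    print(shlex.join(build_enable_backups_command(name)))
```

Once the printed commands look right, each argv list can be passed to subprocess.run(..., check=True).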

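For MySQL and PostgreSQL you can also turn on point-in-time recovery, which lets you restore to any second within the retention window. PITR requires backups to be enabled first, and the flag differs by engine: PostgreSQL uses --enable-point-in-time-recovery, while MySQL enables PITR through binary logging with --enable-bin-log. Turning either on for an existing instance can restart it, so schedule the change.

# PostgreSQL
gcloud sql instances patch my-instance \
  --enable-point-in-time-recovery

# MySQL (PITR relies on binary logging)
gcloud sql instances patch my-instance \
  --enable-bin-log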

Take an immediate on-demand backup so you have a recovery point right now rather than waiting for the next window:

gcloud sql backups create \
  --instance=my-instance \
  --description="manual baseline after enabling automated backups"

Option 2: Google Cloud Console

  1. Open SQL in the Cloud Console and select the instance.
  2. Go to Backups in the left navigation.
  3. Click Edit next to the backup settings, or open the instance's Edit page and expand Data Protection.
  4. Check Automate backups and set a backup window during off-peak hours.
  5. Set the number of backups to retain (7 is the default, consider more for critical data).
  6. Where available, enable Point-in-time recovery and Deletion protection.
  7. Click Save. Enabling backups alone does not restart the instance, but turning on point-in-time recovery can.

Option 3: Terraform

If the instance is managed by Terraform, the console or CLI fix will be overwritten on the next apply. Fix it in code instead:

resource "google_sql_database_instance" "main" {
  name             = "my-instance"
  database_version = "POSTGRES_15"
  region           = "us-central1"

  settings {
    tier              = "db-custom-2-7680"
    availability_type = "REGIONAL"

    backup_configuration {
      enabled                        = true
      start_time                     = "03:00"
      point_in_time_recovery_enabled = true
      transaction_log_retention_days = 7

      backup_retention_settings {
        retained_backups = 14
        retention_unit   = "COUNT"
      }
    }
  }

  deletion_protection = true
}

Tip: For MySQL instances, point-in-time recovery uses binary logging. In Terraform, set binary_log_enabled = true inside backup_configuration instead of point_in_time_recovery_enabled, which the Google provider accepts only for PostgreSQL and SQL Server. For PostgreSQL, point_in_time_recovery_enabled turns on write-ahead log archiving, as in the example above.


How to prevent it from happening again

Fixing one instance is easy. Making sure no future instance ships without backups is the part that actually keeps you safe. Push the control to the left so it is caught before deployment.

Enforce it with Organization Policy

GCP does not currently offer a built-in constraint that forces backups on, but you can enforce backup configuration through a custom Organization Policy or, more reliably, through policy-as-code on your IaC.

Gate it in CI/CD with OPA or Conftest

Run a policy check against your Terraform plan before apply. This Rego rule rejects any Cloud SQL instance that does not enable backups:

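package sql.backups

import rego.v1

# Reject any planned Cloud SQL instance whose settings do not enable backups.
# Uses rego.v1 syntax; older OPA releases wrote this rule as deny[msg] { ... }.
deny contains msg if {
  resource := input.resource_changes[_]
  resource.type == "google_sql_database_instance"
  config := resource.change.after.settings[_]
  not config.backup_configuration[_].enabled
  msg := sprintf("Cloud SQL instance '%s' must have automated backups enabled", [resource.name])
}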

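Wire it into the pipeline so a missing backup config fails the build. Conftest evaluates only the main namespace by default, so pass the policy's package name explicitly; otherwise the rule is silently skipped:

terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json
conftest test tfplan.json --policy policy/ --namespace sql.backups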
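To make the logic of the plan gate easier to follow, or for pipelines that do not run OPA, the same rule can be sketched in Python against tfplan.json. This is an illustrative sketch rather than a replacement for Conftest: backup_violations is a hypothetical helper, and it assumes the plan JSON layout used above, where nested blocks such as settings and backup_configuration appear as lists.

```python
from typing import Any


def backup_violations(plan: dict[str, Any]) -> list[str]:
    """Return one message per planned Cloud SQL instance without backups enabled."""
    messages = []
    for resource in plan.get("resource_changes", []):
        if resource.get("type") != "google_sql_database_instance":
            continue
        after = (resource.get("change") or {}).get("after")
        if not after:
            # `after` is null when the resource is being deleted.
            continue
        # Nested Terraform blocks appear as lists in the plan JSON.
        for settings in after.get("settings") or []:
            backup_blocks = settings.get("backup_configuration") or []
            if not any(block.get("enabled") for block in backup_blocks):
                messages.append(
                    f"Cloud SQL instance '{resource.get('name')}' "
                    "must have automated backups enabled"
                )
    return messages


# Hypothetical, trimmed-down plan with one non-compliant instance.
sample_plan = {
    "resource_changes": [
        {
            "type": "google_sql_database_instance",
            "name": "main",
            "change": {"after": {"settings": [{"backup_configuration": []}]}},
        }
    ]
}
for message in backup_violations(sample_plan):
    print(message)
```

A CI step would load the real plan with json.load and exit non-zero when the returned list is not empty.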

Tip: Pair the CI gate with continuous monitoring. A pipeline check only covers resources created through that pipeline. Lensix scans live infrastructure on a schedule, so it catches instances created by hand, by other teams, or by automation that bypassed the gate. The two together close the loop.

Use a hardened module

Wrap Cloud SQL in an internal Terraform module that sets backups, PITR, deletion protection, and retention as defaults. Teams consuming the module get a safe baseline without having to remember every flag, and you change the default in one place.


Best practices

  • Set a retention window that matches your RPO. The default of 7 retained backups may be fine for low-value data but too short for systems where you need to recover from a problem discovered weeks later. Align retention with your recovery point objective and any compliance requirement.
  • Enable point-in-time recovery for transactional databases. Daily backups alone can lose up to 24 hours of data. PITR lets you recover to a specific second, which is the difference between losing a minute of orders and losing a day of them.
  • Turn on deletion protection. By default, automated backups do not survive instance deletion. Deletion protection stops an accidental or malicious terraform destroy or console click from taking the instance and its backups with it.
  • Export critical data off the instance. For your most important databases, schedule exports to a Cloud Storage bucket in a separate project with restricted access. This protects you against scenarios where the entire instance, backups included, is lost.
  • Test your restores. A backup you have never restored is a hope, not a plan. Periodically restore a backup into a throwaway instance and verify the data is intact and usable.
  • Schedule backups during low-traffic windows. Backups have minimal performance impact, but on heavily loaded instances it is still good practice to run them when traffic is lowest.
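The claim that daily backups alone can lose up to 24 hours of data is simple arithmetic, and a small calculation makes it concrete when you discuss RPO with stakeholders. The sketch below assumes restores go to the newest backup taken before the incident and that no point-in-time recovery is available; data_loss_window and the timestamps are invented for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def data_loss_window(
    backup_times: list[datetime], incident: datetime
) -> Optional[timedelta]:
    """Return the data lost when restoring the newest backup before the incident.

    Returns None when no backup predates the incident.
    """
    earlier = [taken for taken in backup_times if taken <= incident]
    if not earlier:
        return None
    return incident - max(earlier)


# Hypothetical daily backups at 03:00 UTC on three consecutive days.
backups = [datetime(2024, 6, day, 3, 0, tzinfo=timezone.utc) for day in (1, 2, 3)]

# An incident just before the third backup loses almost a full day.
incident = datetime(2024, 6, 3, 2, 30, tzinfo=timezone.utc)
print(data_loss_window(backups, incident))  # 23:30:00
```

With point-in-time recovery enabled, the comparable window shrinks to the gap between the incident and the restore target you choose.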

Automated backups cost very little and take minutes to enable, yet they are the foundation of nearly every database recovery story. Turn them on everywhere, enforce them in code, and verify they are still on with continuous scanning. The day you need them, you will be glad they were already there.