GCP VM Boot Disk Auto-Delete: Risks and Fix

TL;DR

This check flags Compute Engine VMs whose boot disk has auto-delete enabled, meaning the disk is destroyed the moment the VM is deleted. If that disk holds data you cannot easily rebuild, disable auto-delete with gcloud compute instances set-disk-auto-delete so the disk survives the instance.

Deleting a VM in GCP feels like a clean, reversible action until you realize the boot disk went with it. By default, Compute Engine attaches a boot disk with auto-delete turned on, so the disk and everything on it disappear when the instance is deleted. For stateless workers that is exactly what you want. For a VM that doubles as a database host or a build cache, or one you never got around to snapshotting, it is a quiet way to lose data permanently.

The Lensix check compute_autodeletedisk identifies Compute Engine instances where the boot disk auto-delete flag is set to true, so you can decide deliberately rather than by default.

What this check detects

Every disk attached to a Compute Engine VM has an autoDelete property. When it is true, GCP deletes that disk automatically as part of deleting the instance. This check looks specifically at the boot disk and reports any VM where auto-delete is enabled.

You can see the current setting for any instance with the GCP CLI:

gcloud compute instances describe my-instance \
  --zone=us-central1-a \
  --format="value(disks[0].autoDelete)"

A return value of True means the boot disk, which Compute Engine lists at index 0, will be deleted with the VM. To see the full disk layout, including any data disks, inspect it as JSON:

gcloud compute instances describe my-instance \
  --zone=us-central1-a \
  --format="json(disks)"

Note: Auto-delete is a property of the attachment between a VM and a disk, not of the disk itself. The same persistent disk can be attached to one VM with auto-delete on, then detached and reattached to another VM with auto-delete off. The flag describes what happens to the disk when that particular VM is deleted, not the disk's own lifecycle.
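Because the flag lives on each attachment, it can help to see every attached disk side by side. The following is a minimal sketch, assuming the gcloud-wide --flatten flag to emit one row per disk; the function name show_disk_autodelete is hypothetical, and the instance and zone are placeholders you pass in.

```shell
# Print one row per attached disk: device name, boot flag, auto-delete flag.
# Sketch only: show_disk_autodelete is an illustrative name, not a gcloud command.
show_disk_autodelete() {
  local instance="$1" zone="$2"
  gcloud compute instances describe "$instance" \
    --zone="$zone" \
    --flatten="disks[]" \
    --format="table(disks.deviceName, disks.boot, disks.autoDelete)"
}
```

Call it as show_disk_autodelete my-instance us-central1-a. The row where the boot column is True is the one this check evaluates.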

Why it matters

The risk here is data loss, and it tends to hit in ways that are easy to dismiss until they happen.

Accidental deletion takes the data with it. An engineer deletes a VM thinking it is disposable, or a Terraform destroy tears down an environment, and the boot disk goes too. If that disk held application state, local databases, or unsynced files, there is no undo.
Automation amplifies the blast radius. A cleanup script or autoscaler that deletes instances will silently delete their boot disks. One misconfigured selector can wipe dozens of disks in seconds.
Forensics and incident response suffer. If a VM is compromised and you delete it to stop the bleeding, an auto-deleting boot disk destroys the evidence you would need to understand the breach. Disk artifacts are often the only record of what an attacker did on the host.
It hides missing backups. Teams sometimes treat the boot disk as the backup. Auto-delete plus no snapshot policy means a single VM deletion erases the only copy.

Warning: Auto-delete fires only when the VM is deleted, not when it is stopped or reset; a stopped instance keeps its disk. The danger is specifically around gcloud compute instances delete, Terraform destroys, MIG scale-in, and preemptible or Spot VMs that are configured to be deleted on preemption rather than stopped.
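For the incident-response case, one option is to switch auto-delete off on every attached disk before anyone deletes the instance. This is a rough sketch under that assumption, not a full forensics procedure; preserve_all_disks is a hypothetical helper name, and it relies on --flatten to list device names one per line.

```shell
# Turn auto-delete off for every disk attached to one instance, so that a
# later delete leaves all of its disks behind for analysis.
# Sketch only: preserve_all_disks is an illustrative name.
preserve_all_disks() {
  local instance="$1" zone="$2"
  gcloud compute instances describe "$instance" \
    --zone="$zone" \
    --flatten="disks[]" \
    --format="value(disks.deviceName)" |
  while IFS= read -r device; do
    [ -n "$device" ] || continue
    # </dev/null keeps gcloud from consuming the loop's input.
    gcloud compute instances set-disk-auto-delete "$instance" \
      --zone="$zone" \
      --device-name="$device" \
      --no-auto-delete </dev/null
  done
}
```

Run it as preserve_all_disks compromised-vm us-central1-a, then take snapshots before touching the instance any further.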

None of this means auto-delete is wrong. For an immutable, stateless fleet behind a managed instance group, leaving auto-delete on is correct, because the disk is just a copy of an image and carries no unique state. The check exists to make sure that decision was made on purpose, not inherited from a default.

How to fix it

If a VM's boot disk holds data you would not want to lose, disable auto-delete. You do not need to restart or recreate the instance.

Option 1: gcloud CLI

Disable auto-delete on the boot disk of a running instance:

gcloud compute instances set-disk-auto-delete my-instance \
  --zone=us-central1-a \
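  --disk=my-instance \
  --no-auto-delete

The --disk value is the name of the persistent disk, not the instance. A boot disk created along with the instance usually shares the instance name, as in the example above. If you are not sure of the disk name, confirm with: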

gcloud compute instances describe my-instance \
  --zone=us-central1-a \
  --format="value(disks[0].source.basename())"

Verify the change took effect:

gcloud compute instances describe my-instance \
  --zone=us-central1-a \
  --format="value(disks[0].autoDelete)"

You should now see False.
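The lookup, change, and verification steps can be rolled into one helper. This is a sketch rather than a hardened script: disable_boot_autodelete is a hypothetical name, and it assumes the boot disk is the first entry (index 0) in the instance's disks list.

```shell
# Disable auto-delete on an instance's boot disk, then confirm the new state.
# Returns 0 only if the flag reads False afterwards.
# Sketch only: disable_boot_autodelete is an illustrative name.
disable_boot_autodelete() {
  local instance="$1" zone="$2" disk state
  # Assumption: index 0 in the disks list is the boot disk.
  disk="$(gcloud compute instances describe "$instance" \
    --zone="$zone" \
    --format="value(disks[0].source.basename())")" || return 1
  gcloud compute instances set-disk-auto-delete "$instance" \
    --zone="$zone" \
    --disk="$disk" \
    --no-auto-delete || return 1
  state="$(gcloud compute instances describe "$instance" \
    --zone="$zone" \
    --format="value(disks[0].autoDelete)")"
  [ "$state" = "False" ]
}
```

Usage: disable_boot_autodelete my-instance us-central1-a && echo "boot disk will survive deletion".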

Option 2: Google Cloud Console

1. Go to Compute Engine > VM instances.
2. Click the instance name, then click Edit.
3. Scroll to the Boot disk section.
4. Uncheck Delete boot disk when instance is deleted.
5. Click Save.

Option 3: Terraform

Within google_compute_instance, set auto_delete = false on the boot disk block:

resource "google_compute_instance" "db_host" {
  name         = "db-host"
  machine_type = "e2-standard-4"
  zone         = "us-central1-a"

  boot_disk {
    auto_delete = false

    initialize_params {
      image = "debian-cloud/debian-12"
      size  = 50
    }
  }

  network_interface {
    network = "default"
  }
}

Danger: In recent versions of the Google provider, changing auto_delete on an existing instance should apply as an in-place update, but always run terraform plan and confirm that it shows an update rather than a replacement before applying. Never run terraform destroy against a stateful environment to "reapply" config: destroy deletes the VM, and any disk that still has auto-delete on is deleted with it.

Tip: Even with auto-delete disabled, a detached orphan boot disk is not a backup. Pair this fix with a scheduled snapshot policy so you have point-in-time recovery, not just a disk that happens to survive a delete. Use gcloud compute resource-policies create snapshot-schedule to set one up.
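As a starting point for the snapshot side, the sketch below creates a daily schedule and attaches it to one disk. The policy name daily-boot-snapshots, the 04:00 start time, and the 14-day retention are illustrative assumptions, not recommendations, and attach_daily_snapshots is a hypothetical helper name. The region is derived from the zone by dropping the trailing zone letter.

```shell
# Create a daily snapshot schedule and attach it to a disk.
# Sketch only: the policy name, start time, and retention are placeholder values.
attach_daily_snapshots() {
  local disk="$1" zone="$2" policy="${3:-daily-boot-snapshots}"
  local region="${zone%-*}"   # us-central1-a -> us-central1
  gcloud compute resource-policies create snapshot-schedule "$policy" \
    --region="$region" \
    --daily-schedule \
    --start-time=04:00 \
    --max-retention-days=14 || return 1
  gcloud compute disks add-resource-policies "$disk" \
    --zone="$zone" \
    --resource-policies="$policy"
}
```

Usage: attach_daily_snapshots db-host us-central1-a. The create step fails if a policy with that name already exists in the region, so for a second disk you would run only the add-resource-policies command.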

How to prevent it from happening again

Fixing instances one at a time does not scale. Push the decision into your provisioning pipeline so disks are created with the right setting from the start.

Enforce it in Terraform with policy-as-code

Use OPA or Sentinel to reject plans that create stateful instances with auto-delete on. Here is a Rego rule that flags boot disks where auto-delete is enabled and the instance carries a stateful label:

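package terraform.compute

# rego.v1 enables the contains and if keywords, which OPA 1.0 and later require.
import rego.v1

# Deny any planned instance that is labeled stateful=true but still has
# boot disk auto-delete enabled.
deny contains msg if {
  resource := input.resource_changes[_]
  resource.type == "google_compute_instance"
  resource.change.after.labels.stateful == "true"
  boot := resource.change.after.boot_disk[_]
  boot.auto_delete == true
  msg := sprintf("Instance '%s' is labeled stateful but boot disk auto_delete is true", [resource.change.after.name])
}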

Gate it in CI/CD

Run the policy check on every pull request that touches infrastructure, before the apply step:

terraform plan -out=tfplan
terraform show -json tfplan > tfplan.json
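opa eval --data policy/ --input tfplan.json \
  --fail-defined "data.terraform.compute.deny[_]"

The query iterates over deny with [_] so that it is undefined when the set is empty; querying data.terraform.compute.deny directly returns an empty set, which still counts as a defined result. If the deny set is non-empty, --fail-defined makes opa exit non-zero, the pipeline fails, and the change never reaches production.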

Sweep existing instances on a schedule

Run a scheduled audit to catch instances created outside your IaC, for example manually in the console. This lists every instance and its boot disk auto-delete state:

gcloud compute instances list \
  --format="table(name, zone.basename(), disks[0].autoDelete)"
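To narrow that table down to the instances that actually need attention, you can filter on a label and keep only the rows where the flag is True. This sketch assumes your stateful VMs carry a stateful=true label and that the boot disk is at index 0; list_stateful_autodelete is a hypothetical name.

```shell
# Print "name zone" for each stateful-labeled instance whose boot disk
# still has auto-delete enabled.
# Sketch only: assumes a stateful=true label convention.
list_stateful_autodelete() {
  gcloud compute instances list \
    --filter="labels.stateful=true" \
    --format="csv[no-heading](name, zone.basename(), disks[0].autoDelete)" |
  awk -F, '$3 == "True" { print $1 " " $2 }'
}
```

An empty result means no labeled instance is at risk. Unlabeled instances are not covered, so the full table above is still worth reviewing periodically.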

Tip: Lensix runs compute_autodeletedisk continuously across your projects, so you do not have to write and maintain sweep scripts yourself. Wire the findings into your alerting so a newly created stateful VM with auto-delete on surfaces within minutes instead of at the next manual review.

Best practices

Decide auto-delete based on statefulness, not habit. Stateless, image-backed VMs in managed instance groups should keep auto-delete on. Anything holding unique state should have it off.
Keep state off the boot disk where you can. The cleanest pattern is stateless boot disks plus separate data disks for anything you care about. That way auto-delete on the boot disk is harmless because nothing important lives there.
Always have snapshots regardless of the auto-delete setting. Auto-delete protection is not backup. Scheduled snapshots with a retention policy are. Treat them as separate controls.
Apply deletion protection for truly critical VMs. Set deletion_protection = true in Terraform or run gcloud compute instances update INSTANCE_NAME --deletion-protection so the instance cannot be deleted until someone explicitly turns protection off.
Label instances by lifecycle. A consistent stateful=true or environment=prod labeling scheme lets your policy rules reason about which VMs need protection, which makes enforcement reliable instead of guesswork.
Tighten IAM around deletion. Restrict compute.instances.delete to a small set of principals. Most engineers do not need the ability to delete production VMs directly.
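The labeling and deletion-protection practices combine naturally. As a sketch, assuming the stateful=true label convention described above, the following loop turns on deletion protection for every labeled instance; protect_stateful_instances is a hypothetical helper name.

```shell
# Enable deletion protection on every instance labeled stateful=true.
# Sketch only: assumes a stateful=true label convention.
protect_stateful_instances() {
  gcloud compute instances list \
    --filter="labels.stateful=true" \
    --format="csv[no-heading](name, zone.basename())" |
  while IFS=, read -r name zone; do
    [ -n "$name" ] || continue
    # </dev/null keeps gcloud from consuming the loop's input.
    gcloud compute instances update "$name" \
      --zone="$zone" \
      --deletion-protection </dev/null
  done
}
```

Deletion protection blocks the delete itself, so it complements the auto-delete setting rather than replacing it.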

Auto-delete is a sensible default for the workloads it was designed for and a footgun for everything else. The goal is not to turn it off everywhere. It is to make sure every disk that survives a delete does so because someone chose that, and every disk that does not is genuinely safe to lose.
