VM Deletion Protection Not Enabled on GCP Compute Engine

Learn why Compute Engine VMs need deletion protection, how to enable it with gcloud and Terraform, and how to enforce it with OPA and org policy.

TL;DR

This check flags Compute Engine VMs running without deletion protection, which means a single misclick or rogue script can permanently remove a production instance. Turn it on with gcloud compute instances update VM_NAME --deletion-protection and enforce it through policy-as-code.

Accidental deletion is one of those failure modes that feels avoidable right up until it happens to you. Someone runs a cleanup script with a loose name filter, a Terraform plan gets applied against the wrong workspace, or an engineer fat-fingers a delete in the console while reaching for the stop button. Compute Engine gives you a simple guardrail against all of these scenarios, and this check exists to tell you when that guardrail is missing.

The VM Deletion Protection Not Enabled check looks at your Compute Engine instances and reports any that have deletion protection turned off. It is a low-effort, high-value setting that too many teams overlook until they lose something that mattered.


What this check detects

Every Compute Engine instance has a boolean property called deletionProtection. When it is set to false (the default), the VM can be deleted by anyone with the compute.instances.delete permission, through the console, the API, the CLI, or any tool that wraps them.

This check flags any VM where deletionProtection is false. It does not care about the workload running on the instance, so you will see flags across production databases, jump hosts, and short-lived test boxes alike. Part of remediating this check is deciding which instances genuinely need the protection.
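
To preview what the check would flag in a single project, a listing along these lines can help. This is a minimal sketch, assuming gcloud is installed and authenticated; the function name list_unprotected_vms is hypothetical, and the filter expression assumes your gcloud version matches the boolean field as written.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print NAME and ZONE (tab-separated) for every VM in a
# project whose deletionProtection field is false.
list_unprotected_vms() {
  local project="$1"
  gcloud compute instances list \
    --project="$project" \
    --filter="deletionProtection=false" \
    --format="value(name,zone.basename())"
}
```

Calling list_unprotected_vms my-project (a placeholder project ID) prints one instance per line, which is a convenient starting point for deciding which instances genuinely need the protection.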

Note: Deletion protection only blocks the delete operation. It does not stop someone from stopping the VM, detaching its disks, or changing its configuration. It is a safety latch against destruction, not an access control mechanism.


Why it matters

The risk here is straightforward: an unprotected VM can vanish in a single API call, and Compute Engine instance deletion is not reversible. By default the boot disk is deleted along with the instance, and once both are gone there is no undo button. If you were not capturing snapshots or images, the data goes with it.

A few real situations where this bites teams:

  • Scripted cleanup gone wrong. A nightly job deletes instances matching *-temp-*, but a production box was named billing-temp-fix-02 back during an incident and never renamed.
  • Terraform workspace mix-ups. An engineer runs terraform apply against the wrong workspace, and the plan proposes destroying the live instance because it exists in that workspace's state but not in the configuration being applied.
  • Console accidents. The delete and stop actions sit close together in the instance list, so a bulk selection meant for stopping can end in a delete instead.
  • Compromised credentials. An attacker with delete permissions on your project can wipe out infrastructure as a destructive or extortion move. Deletion protection forces an extra, logged step that buys time and can trip alerts if you monitor for it.

Warning: Deletion protection is not a backup. It prevents accidental deletion but does nothing if a disk corrupts, a region has an outage, or someone disables protection and then deletes. Pair it with regular snapshots or machine images for anything stateful.

The business impact scales with what the VM does. A stateless web node behind a managed instance group is cheap to lose. A self-managed database, a license server, or a host holding the only copy of some artifact can mean hours of recovery and real data loss.


How to fix it

Enabling deletion protection takes one command per instance. Below are the CLI, console, and IaC paths.

Option 1: gcloud CLI

Enable protection on an existing instance:

gcloud compute instances update VM_NAME \
  --zone=us-central1-a \
  --deletion-protection

To verify it took effect:

gcloud compute instances describe VM_NAME \
  --zone=us-central1-a \
  --format="value(deletionProtection)"

That should return True. If you need to enable it across many instances, list them and loop:

# zone.basename() yields the short zone name; read -r keeps backslashes literal
gcloud compute instances list \
  --format="value(name,zone.basename())" | while read -r NAME ZONE; do
  gcloud compute instances update "$NAME" \
    --zone="$ZONE" \
    --deletion-protection
done

Tip: Do not blindly protect every VM. Instances in managed instance groups are meant to be created and destroyed automatically, and Compute Engine does not support deletion protection on MIG members, so the update fails for them. Filter your loop to exclude MIG members and ephemeral workloads.
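
One way to apply that filtering is to skip instances by name before calling gcloud. The sketch below is illustrative only: protect_vms and its exclusion regex are hypothetical, and it assumes your ephemeral workloads follow a naming convention such as containing "ephemeral". MIG members would need their own exclusion, for example by matching the group's base instance name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "NAME ZONE" pairs on stdin and enable deletion
# protection on each, skipping names that match an exclusion regex.
# The regex is an assumed naming convention, not a GCP feature.
protect_vms() {
  local exclude_regex="${1:-ephemeral}"
  local name zone
  while read -r name zone; do
    # Ignore blank lines in the input
    [ -z "$name" ] && continue
    if printf '%s\n' "$name" | grep -Eq -- "$exclude_regex"; then
      echo "skipping $name" >&2
      continue
    fi
    # Redirect stdin so gcloud cannot consume the remaining input lines
    gcloud compute instances update "$name" \
      --zone="$zone" \
      --deletion-protection < /dev/null
  done
}
```

A possible invocation pipes the instance list into the helper: gcloud compute instances list --format="value(name,zone.basename())" | protect_vms 'ephemeral|^ci-'. Skipped names go to stderr so they can be reviewed afterwards.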

Option 2: Google Cloud Console

  1. Go to Compute Engine → VM instances.
  2. Click the instance name, then click Edit.
  3. Under the instance settings, check Enable deletion protection.
  4. Click Save.

Option 3: Terraform

For new and managed instances, set the field directly in your resource definition:

resource "google_compute_instance" "db" {
  name         = "billing-db-01"
  machine_type = "e2-standard-4"
  zone         = "us-central1-a"

  deletion_protection = true

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12"
    }
  }

  network_interface {
    network = "default"
  }
}

Danger: When deletion protection is enabled, terraform destroy and any plan that proposes replacing the instance will fail until you disable it. To intentionally delete a protected VM you must first set deletion_protection = false and apply, or run gcloud compute instances update VM_NAME --no-deletion-protection. Double-check the instance name before doing this in production.


How to prevent it from happening again

Fixing the existing VMs is the easy half. Keeping new instances from drifting back to the default requires enforcement at the point of creation.

Default it in your modules

If teams provision VMs through a shared Terraform module, set deletion_protection = true as the default and let callers opt out explicitly for ephemeral workloads. Defaults shape behavior far more reliably than documentation.

variable "deletion_protection" {
  type    = bool
  default = true
}

Gate it in CI with Open Policy Agent

Run a policy check against your Terraform plan in CI so an unprotected production instance never merges. A Conftest/OPA rule looks like this:

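package main

# rego.v1 enables the `contains` and `if` keywords on older OPA releases;
# OPA 1.0 and later use this syntax by default.
import rego.v1

deny contains msg if {
  resource := input.resource_changes[_]
  resource.type == "google_compute_instance"
  resource.change.after.deletion_protection == false
  # Instances with "ephemeral" in the name are exempt by convention
  not contains(resource.change.after.name, "ephemeral")
  msg := sprintf("Instance %s must enable deletion_protection", [resource.change.after.name])
}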

Wire that into your pipeline after terraform plan -out tfplan and terraform show -json tfplan, and the build fails before anything reaches GCP.
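
As a rough illustration of that wiring, the CI step could look like the sketch below. It assumes terraform and conftest are on the runner's PATH and that the Rego rule is stored in ./policy, which is Conftest's default policy directory; the function name check_plan is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical CI step: save the plan, render it as JSON, and evaluate it
# with Conftest. Any failing command makes the function return non-zero.
check_plan() {
  local plan_file="${1:-tfplan}"
  terraform plan -out="$plan_file" || return 1
  terraform show -json "$plan_file" > "${plan_file}.json" || return 1
  # conftest exits non-zero when a deny rule fires, which fails the build
  conftest test "${plan_file}.json"
}
```

Running check_plan as the last command of the job lets its exit status decide whether the pipeline continues to terraform apply.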

Enforce with an Organization Policy

For broader coverage you can mandate deletion protection through a custom organization policy constraint, which applies even to resources created outside your IaC.

name: organizations/123456789/customConstraints/custom.requireVmDeletionProtection
resourceTypes:
  - compute.googleapis.com/Instance
methodTypes:
  - CREATE
condition: "resource.deletionProtection == true"
actionType: ALLOW
displayName: "Require VM deletion protection"

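Note: Custom org policy constraints are evaluated by the API itself, so they catch console clicks and ad-hoc CLI calls that bypass your CI pipeline entirely. The constraint above lists only CREATE under methodTypes, so it applies when instances are created; consider adding UPDATE if changes to existing instances should be evaluated too. Test it in dry-run mode against a sandbox folder before rolling out org-wide.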
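
A dry-run rollout could be sketched as follows. The helper dry_run_policy_yaml, the file names, and the folder ID are hypothetical; the sketch assumes the constraint definition above is saved as constraint.yaml and that your gcloud release supports dryRunSpec in org policies.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit an org policy that evaluates the custom constraint
# in dry-run mode on one folder, so violations are logged rather than blocked.
dry_run_policy_yaml() {
  local folder_id="$1"
  cat <<EOF
name: folders/${folder_id}/policies/custom.requireVmDeletionProtection
dryRunSpec:
  rules:
  - enforce: true
EOF
}

# Usage (not executed here; 987654321 is a placeholder sandbox folder ID):
#   gcloud org-policies set-custom-constraint constraint.yaml
#   dry_run_policy_yaml 987654321 > policy.yaml
#   gcloud org-policies set-policy policy.yaml --update-mask=dryRunSpec
```

Once the dry-run results look right, the same policy can be promoted by moving the rules under spec and setting the policy again.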

Catch drift with continuous scanning

Policies and CI gates cover the creation path, but settings still get changed after the fact. Lensix continuously scans your projects and re-flags any VM where protection has been turned off, so you find out within a scan cycle rather than during a post-incident review.


Best practices

  • Protect stateful and singleton VMs first. Databases, license servers, and any host with non-reproducible data deserve protection immediately.
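  • Leave MIG members and ephemeral instances unprotected. Compute Engine does not allow deletion protection on managed instance group members or in instance templates, and on short-lived instances it only gets in the way of the automation that tears them down. Tag or name these clearly so policy rules can exclude them.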
  • Treat protection as one layer, not the whole strategy. Combine it with scheduled snapshots, machine images, and tight IAM on compute.instances.delete.
  • Scope delete permissions narrowly. Most engineers never need to delete instances directly. Granting that permission only to break-glass roles reduces both accidents and blast radius.
  • Log and alert on protection changes. Send a Cloud Audit Logs alert when deletionProtection flips to false, since that often precedes an intentional or malicious delete.
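
For the logging item above, a starting point is a Cloud Logging filter on the call that toggles the setting. This is a sketch under assumptions: the helper name is hypothetical, and the methodName value should be confirmed against an audit log entry from your own project before building an alert on it. The filter matches both enabling and disabling, so an alert would still need to inspect the request to tell them apart.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a Cloud Logging filter for audit log entries that
# record a change to a VM's deletion protection.
# Assumption: the audit log methodName is as written below.
deletion_protection_log_filter() {
  printf '%s' 'resource.type="gce_instance" AND protoPayload.methodName="v1.compute.instances.setDeletionProtection"'
}

# Usage (not executed here):
#   gcloud logging read "$(deletion_protection_log_filter)" --freshness=1d --limit=20
```

The same filter string can be pasted into a log-based alerting policy once it returns the entries you expect.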

Deletion protection costs nothing and takes seconds to enable. The asymmetry between that effort and the cost of losing a production instance is exactly why this check is worth clearing across every project you run.