
Azure VM Has No Backup Policy: Detection, Fix, and Prevention

Learn why Azure VMs without a backup policy are a ransomware and data-loss risk, and how to enable Azure Backup, enforce it with policy, and test restores.

TL;DR

This check flags Azure VMs that are not protected by any backup policy, leaving them with no recovery point after ransomware, accidental deletion, or disk corruption. Fix it by enabling Azure Backup through a Recovery Services vault and assigning the VM to a backup policy.

Most teams discover their backup gap at the worst possible moment: right after a VM is gone. A finance app server gets hit by ransomware, an engineer runs a Terraform destroy against the wrong workspace, or a managed disk silently corrupts after a botched OS patch. If there is no backup policy attached to that VM, there is nothing to restore from. You are rebuilding from scratch and hoping someone documented the setup.

The vm_nobackup check looks for exactly this gap. It tells you which Azure virtual machines have no backup protection configured before that gap turns into an outage.


What this check detects

The check inspects each Azure VM in your subscription and verifies whether it is registered as a protected item in a Recovery Services vault with an active backup policy. A VM passes when Azure Backup is enabled for it and a policy controls the schedule and retention. It fails when the VM has no associated backup item at all.

In Azure, backups for VMs are not automatic. Creating a VM does not create backups. You have to explicitly enable Azure Backup, which involves three pieces:

  • A Recovery Services vault that stores the recovery points
  • A backup policy that defines schedule and retention
  • An association between the VM and that policy

Note: Azure Backup for VMs takes app-consistent snapshots on Windows (via VSS) and, by default, file-system-consistent snapshots on Linux (app-consistent if you supply pre- and post-scripts). This is different from a managed disk snapshot, which is a one-off point-in-time copy with no schedule, retention, or vault-level protection.
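
The pass/fail rule above boils down to a set difference between the VMs that exist and the VMs that appear as backup items. The following is a minimal Python sketch of that idea, not the check's actual implementation. It assumes you have already exported two lists of resource IDs yourself (for example from the CLI's VM and backup item listings); the function name and sample IDs are hypothetical.

```python
def find_unprotected_vms(vm_ids, protected_source_ids):
    """Return the VM resource IDs that have no matching backup item."""
    # Azure resource IDs are case-insensitive, so compare them lowercased.
    protected = {rid.lower() for rid in protected_source_ids}
    return [vm for vm in vm_ids if vm.lower() not in protected]


# Hypothetical sample data: two VMs, only one of which has a backup item.
vms = [
    "/subscriptions/s1/resourceGroups/prod-rg/providers/Microsoft.Compute/virtualMachines/app-server-01",
    "/subscriptions/s1/resourceGroups/prod-rg/providers/Microsoft.Compute/virtualMachines/app-server-02",
]
backed_up = [
    "/subscriptions/s1/resourcegroups/prod-rg/providers/microsoft.compute/virtualmachines/APP-SERVER-01",
]

print(find_unprotected_vms(vms, backed_up))
```

Lowercasing both sides matters in practice, because the same VM can be reported with different casing by different Azure APIs.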


Why it matters

A VM with no backup policy has a recovery point objective of "whatever you can salvage." That is fine for a stateless, ephemeral worker that rebuilds from a pipeline. It is a serious problem for anything holding state, configuration, or data that lives only on that machine.

Ransomware

Ransomware is the scenario backups exist for. When an attacker encrypts the disks on a VM, your only clean path back is a recovery point from before the encryption. Without a backup policy, you are negotiating with the attacker or rebuilding. With one, you restore to a point in time and move on.

Accidental deletion and operator error

Deletes happen. Someone runs a script against the wrong resource group, a Terraform plan removes a VM that drifted out of state, or a cleanup job is too aggressive. A real backup gives you the data itself, and soft delete in the vault adds a grace window if the backup data gets deleted as well.

Corruption and failed updates

A kernel update that won't boot, a corrupted database file after an unclean shutdown, a bad application deploy that overwrites config. These are not security incidents, but they are exactly when you want to roll a VM back to yesterday.

Warning: Compliance frameworks including ISO 27001, SOC 2, and PCI DSS expect documented, tested backup procedures. An auditor finding production VMs with no backup configured is a common cause of failed controls, even when nothing has actually gone wrong yet.


How to fix it

You need a Recovery Services vault, a backup policy, and the VM enrolled against it. Here is the full path with the Azure CLI.

1. Create a Recovery Services vault

If you already have a Recovery Services vault in the same region as the VM, skip this. A Recovery Services vault must be in the same region as the VMs it protects.

az backup vault create \
  --resource-group prod-rg \
  --name prod-backup-vault \
  --location eastus

2. Set storage redundancy on the vault

Decide on redundancy before you protect anything. You can only change it while the vault has no backup items.

az backup vault backup-properties set \
  --resource-group prod-rg \
  --name prod-backup-vault \
  --backup-storage-redundancy GeoRedundant

3. Enable backup for the VM

This associates the VM with a backup policy. Azure provides a built-in DefaultPolicy (daily backup, 30-day retention) to get you started, but you should define your own to match your RPO and retention needs.

az backup protection enable-for-vm \
  --resource-group prod-rg \
  --vault-name prod-backup-vault \
  --vm app-server-01 \
  --policy-name DefaultPolicy

4. Define a custom backup policy (recommended)

Write the policy as JSON so the schedule and retention are explicit and version-controlled. Below is a daily policy with 90-day retention.

{
  "properties": {
    "backupManagementType": "AzureIaasVM",
    "schedulePolicy": {
      "schedulePolicyType": "SimpleSchedulePolicy",
      "scheduleRunFrequency": "Daily",
      "scheduleRunTimes": ["2023-01-01T02:00:00Z"]
    },
    "retentionPolicy": {
      "retentionPolicyType": "LongTermRetentionPolicy",
      "dailySchedule": {
        "retentionTimes": ["2023-01-01T02:00:00Z"],
        "retentionDuration": { "count": 90, "durationType": "Days" }
      }
    }
  }
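}

Save the JSON as daily-90day.json and create the policy from it. To put a VM on this policy, pass daily-90day instead of DefaultPolicy when you enable protection.

az backup policy create \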
  --resource-group prod-rg \
  --vault-name prod-backup-vault \
  --name daily-90day \
  --backup-management-type AzureIaasVM \
  --policy @daily-90day.json
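
Because the policy lives in version control, it can be linted before it is applied. Below is a hedged Python sketch of such a lint, using the same JSON shape as the policy above. The function name, the minimum-retention threshold, and the rule that retentionTimes should equal scheduleRunTimes are assumptions made for illustration, not a statement of what Azure itself validates.

```python
import json

POLICY_JSON = """
{
  "properties": {
    "backupManagementType": "AzureIaasVM",
    "schedulePolicy": {
      "schedulePolicyType": "SimpleSchedulePolicy",
      "scheduleRunFrequency": "Daily",
      "scheduleRunTimes": ["2023-01-01T02:00:00Z"]
    },
    "retentionPolicy": {
      "retentionPolicyType": "LongTermRetentionPolicy",
      "dailySchedule": {
        "retentionTimes": ["2023-01-01T02:00:00Z"],
        "retentionDuration": { "count": 90, "durationType": "Days" }
      }
    }
  }
}
"""


def policy_problems(policy, min_retention_days):
    """Return a list of human-readable problems found in a daily policy."""
    props = policy["properties"]
    schedule = props["schedulePolicy"]
    daily = props["retentionPolicy"]["dailySchedule"]
    problems = []
    # Assumed convention: the retention time mirrors the schedule run time.
    if daily["retentionTimes"] != schedule["scheduleRunTimes"]:
        problems.append("retentionTimes does not match scheduleRunTimes")
    duration = daily["retentionDuration"]
    if duration["durationType"] != "Days":
        problems.append("daily retention is not expressed in days")
    elif duration["count"] < min_retention_days:
        problems.append(
            f"daily retention {duration['count']} is below the required {min_retention_days}"
        )
    return problems


policy = json.loads(POLICY_JSON)
print(policy_problems(policy, min_retention_days=90))
```

Running this in a pull-request pipeline would surface a shortened retention period as a review comment rather than as a surprise during a restore.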

5. Trigger an initial backup

The first scheduled run may be hours away. Kick one off now so you have a recovery point immediately. Set --retain-until to a future date in dd-mm-yyyy format.

az backup protection backup-now \
  --resource-group prod-rg \
  --vault-name prod-backup-vault \
  --backup-management-type AzureIaasVM \
  --container-name app-server-01 \
  --item-name app-server-01 \
  --retain-until <dd-mm-yyyy>
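
The --retain-until value is a day-month-year date, and a hard-coded date goes stale quickly. One way to avoid that is to compute it at run time. This is a small Python sketch under that assumption; the helper name and the 30-day window are illustrative choices, not values from Azure.

```python
from datetime import date, timedelta


def retain_until(days, today=None):
    """Format a retention date `days` from today as dd-mm-yyyy."""
    today = today or date.today()
    return (today + timedelta(days=days)).strftime("%d-%m-%Y")


# Fixed "today" so the example is reproducible.
print(retain_until(30, today=date(2025, 1, 15)))  # 14-02-2025
```

In a script, the returned string can be passed straight to the --retain-until argument.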

Tip: Define the same protection in Terraform so it travels with the VM. The azurerm_backup_protected_vm resource ties a VM to a policy, which means a new VM provisioned from your module is backed up from day one.

resource "azurerm_recovery_services_vault" "main" {
  name                = "prod-backup-vault"
  location            = "eastus"
  resource_group_name = "prod-rg"
  sku                 = "Standard"
  soft_delete_enabled = true
}

resource "azurerm_backup_policy_vm" "daily" {
  name                = "daily-90day"
  resource_group_name = "prod-rg"
  recovery_vault_name = azurerm_recovery_services_vault.main.name

  backup {
    frequency = "Daily"
    time      = "02:00"
  }

  retention_daily {
    count = 90
  }
}

resource "azurerm_backup_protected_vm" "app" {
  resource_group_name = "prod-rg"
  recovery_vault_name = azurerm_recovery_services_vault.main.name
  source_vm_id        = azurerm_linux_virtual_machine.app.id
  backup_policy_id    = azurerm_backup_policy_vm.daily.id
}

How to prevent it from happening again

Fixing one VM is easy. Making sure the next ten VMs ship with backups is the real work. Two layers handle this well.

Azure Policy enforcement

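Azure ships a built-in policy definition, "Configure backup on virtual machines without a given tag to an existing recovery services vault." Assigned with a DeployIfNotExists effect, it automatically enrolls any new VM into a vault unless it carries an opt-out tag. VMs that existed before the assignment need a remediation task. This is the strongest control because it remediates without anyone remembering to act.

This is a single policy definition rather than an initiative, so assign it with --policy, using the definition's name or ID, and supply the parameters it requires, such as the target backup policy and the exclusion tag, through --params.

az policy assignment create \
  --name enforce-vm-backup \
  --scope "/subscriptions/<sub-id>" \
  --policy "<policy-definition-name-or-id>" \
  --params "<assignment-parameters-json>" \
  --location eastus \
  --mi-system-assigned \
  --role Contributor \
  --identity-scope "/subscriptions/<sub-id>"

Note: A DeployIfNotExists policy needs a managed identity with permission to write the backup configuration. The --mi-system-assigned, --role, and --identity-scope flags above create that identity and grant it a role at the given scope. Without it, the policy evaluates but cannot remediate. Contributor is broader than this task needs, so prefer the narrower roles listed in the policy definition where you can.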

CI/CD gates

For teams provisioning through pipelines, scan IaC before it applies. A pull request that adds an azurerm_*_virtual_machine resource without a matching azurerm_backup_protected_vm should fail the build. Tools like Checkov and tfsec can be extended with custom policies to catch this, and you can run the Lensix check against the deployed state to confirm reality matches intent.
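
As an illustration of such a gate, here is a minimal Python sketch that reads the JSON form of a Terraform plan and lists VM resources that no azurerm_backup_protected_vm refers to. It is a simplified example, not how Checkov, tfsec, or Lensix work: it only looks at the root module, and it assumes source_vm_id references the VM resource directly. The function name and sample plan are hypothetical.

```python
VM_TYPES = {
    "azurerm_linux_virtual_machine",
    "azurerm_windows_virtual_machine",
    "azurerm_virtual_machine",
}


def vms_without_backup(plan):
    """Return addresses of VM resources with no backup_protected_vm pointing at them."""
    resources = (
        plan.get("configuration", {}).get("root_module", {}).get("resources", [])
    )
    vms = {r["address"] for r in resources if r["type"] in VM_TYPES}
    protected = set()
    for r in resources:
        if r["type"] != "azurerm_backup_protected_vm":
            continue
        # The plan JSON records which resources an expression refers to.
        refs = r.get("expressions", {}).get("source_vm_id", {}).get("references", [])
        protected.update(ref for ref in refs if ref in vms)
    return sorted(vms - protected)


# Hypothetical plan excerpt: "app" is enrolled in backup, "worker" is not.
plan = {
    "configuration": {
        "root_module": {
            "resources": [
                {"address": "azurerm_linux_virtual_machine.app",
                 "type": "azurerm_linux_virtual_machine", "expressions": {}},
                {"address": "azurerm_linux_virtual_machine.worker",
                 "type": "azurerm_linux_virtual_machine", "expressions": {}},
                {"address": "azurerm_backup_protected_vm.app",
                 "type": "azurerm_backup_protected_vm",
                 "expressions": {"source_vm_id": {"references": [
                     "azurerm_linux_virtual_machine.app.id",
                     "azurerm_linux_virtual_machine.app",
                 ]}}},
            ]
        }
    }
}

print(vms_without_backup(plan))
```

A pipeline step could exit non-zero whenever the returned list is not empty. VMs declared in child modules would need the same walk applied recursively.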


Best practices

  • Match retention to the data, not a default. A 30-day default rarely satisfies compliance. Map daily, weekly, monthly, and yearly retention to your actual recovery and audit requirements.
  • Turn on soft delete and enhanced security. Soft delete keeps deleted backup data for 14 additional days by default, which blocks an attacker who gains vault access from immediately wiping your backups.
  • Use immutable vaults for critical workloads. Once locked, immutability prevents anyone, including admins, from shortening retention or deleting recovery points before they expire. This is your ransomware insurance.
  • Test restores, don't assume them. A backup you have never restored is a guess. Run a periodic restore drill to a sandbox and confirm the VM boots and the data is intact.
  • Keep vaults in a separate resource group. Separating backup infrastructure from the workloads it protects reduces the chance that a single bad delete takes out both the VM and its recovery points.
  • Tag exclusions explicitly. Some VMs genuinely don't need backups. Use a clear, documented tag for those so the absence of a backup is a deliberate decision, not an oversight.
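
The tag-exclusion practice can be made auditable by sorting every VM into one of three buckets: protected, deliberately exempt, or an unexplained gap. The Python sketch below shows one way to do that. The tag name backup-exempt, the record shape, and the function name are all hypothetical, so substitute whatever convention your team documents.

```python
def triage(vms, protected_ids, exempt_tag="backup-exempt"):
    """Sort VM records into protected, exempt, and gap buckets by name."""
    protected = {rid.lower() for rid in protected_ids}
    result = {"protected": [], "exempt": [], "gap": []}
    for vm in vms:
        tags = vm.get("tags") or {}
        if vm["id"].lower() in protected:
            result["protected"].append(vm["name"])
        elif (tags.get(exempt_tag) or "").strip():
            # A non-empty tag value is treated as the documented reason.
            result["exempt"].append(vm["name"])
        else:
            result["gap"].append(vm["name"])
    return result


# Hypothetical inventory records.
inventory = [
    {"id": "/vm/app-server-01", "name": "app-server-01", "tags": {}},
    {"id": "/vm/worker-01", "name": "worker-01",
     "tags": {"backup-exempt": "stateless, rebuilt by pipeline"}},
    {"id": "/vm/legacy-db", "name": "legacy-db", "tags": None},
]

print(triage(inventory, protected_ids=["/vm/app-server-01"]))
```

Only the gap bucket needs action. The exempt bucket becomes a reviewable list of decisions instead of a set of silent omissions.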

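Danger: Stopping protection with the --delete-backup-data true flag deletes all recovery points for that VM. With soft delete enabled, the data stays recoverable for the soft-delete window (14 days by default). Once that window passes, or if soft delete is off, there is no undo. Only do this when you are certain the VM is decommissioned and its data is no longer needed.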

Backups are one of those controls that cost almost nothing to set up and everything to skip. Run this check across your subscriptions, fix the gaps, and put a policy in place so the gap never reopens.