
VM Boot Diagnostics Disabled: Why It Matters and How to Fix It

Learn why Azure VM boot diagnostics matters for incident response, how to enable it via CLI, portal, and Terraform, and how to enforce it with policy.

TL;DR

This check flags Azure VMs that have boot diagnostics turned off, which leaves you blind when a VM fails to boot or hangs during startup. Enable boot diagnostics with a managed storage account so you can capture serial console output and screenshots when things go wrong.

When an Azure VM refuses to boot, you usually find out the hard way: SSH or RDP times out, your app is down, and the portal shows a running instance that is anything but healthy. Without boot diagnostics, you have almost nothing to work with. You cannot see the boot screen, you cannot read kernel panics, and you cannot tell whether the OS got stuck on an fsck check, a misconfigured fstab entry, or a failed cloud-init script.

The vm_bootdiagnostics check looks for exactly this gap. It identifies Azure virtual machines where boot diagnostics is disabled, so you can turn the feature on before you need it.


What this check detects

Boot diagnostics is an Azure VM feature that captures two things during the boot process:

  • Serial console output — the text log streamed during OS startup, equivalent to plugging a monitor into the serial port of a physical machine.
  • Screenshots — a snapshot of the VM's display, useful for spotting a Windows blue screen or a Linux boot prompt waiting for input.

The check inspects each VM's diagnosticsProfile.bootDiagnostics configuration. If the enabled flag is false or the profile is missing entirely, the VM is reported as non-compliant.
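To get a feel for what the check sees, you can run a similar test yourself. The sketch below assumes the Azure CLI is installed and logged in; the function name is hypothetical, and this illustrates the logic rather than how the check itself is implemented.

```shell
# Hypothetical helper: print "name<TAB>resourceGroup" for every VM in the
# current subscription where boot diagnostics is off or the profile is absent.
list_vms_without_boot_diagnostics() {
  # In JMESPath a missing field evaluates to null, and null != true,
  # so VMs with no diagnosticsProfile at all are matched as well.
  az vm list \
    --query '[?diagnosticsProfile.bootDiagnostics.enabled != `true`].[name, resourceGroup]' \
    --output tsv
}
```

The query is wrapped in single quotes so the shell does not treat the JMESPath backticks as command substitution.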

Note: Azure supports two storage modes for boot diagnostics. The older model writes logs to a storage account you specify. The newer managed model lets Azure handle storage for you, with no account to provision or pay for directly. Both satisfy this check, but managed storage is simpler and recommended for new VMs.
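If you want to tell the two modes apart on existing VMs, one approach is to look at the enabled flag together with the storageUri field under diagnosticsProfile.bootDiagnostics. The helper below is a minimal sketch: its name is hypothetical, and it assumes that an empty value (or the literal None that the CLI may print for a null field) means no storage account is configured.

```shell
# Hypothetical helper: classify a VM's boot diagnostics mode.
#   $1 = value of diagnosticsProfile.bootDiagnostics.enabled
#   $2 = value of diagnosticsProfile.bootDiagnostics.storageUri
boot_diagnostics_mode() {
  case "$1" in
    true|True) ;;                      # enabled, keep going
    *) echo "disabled"; return 0 ;;    # false, empty, or missing
  esac
  if [ -z "$2" ] || [ "$2" = "None" ]; then
    # Assumption: no storage URI recorded means Azure-managed storage.
    echo "managed"
  else
    echo "custom"
  fi
}

# Example: boot_diagnostics_mode true "https://mydiagstorage.blob.core.windows.net/"
```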


Why it matters

Boot diagnostics is not a security control in the traditional sense, but it is a critical part of operational resilience and incident response. Here is where its absence bites.

You lose your only window into boot failures

If a VM hangs before the network stack comes up, remote access tools are useless. The OS is running but unreachable. With boot diagnostics enabled, you can open the serial console in the portal and watch the boot log in real time, or read the captured screenshot to see exactly where startup stalled.

Common scenarios where this is the difference between a five-minute fix and a multi-hour outage:

  • A bad entry in /etc/fstab drops the boot into emergency mode and waits for a root password.
  • A kernel update leaves the VM unable to find its root device.
  • A Windows update loops on a failed configuration step and never reaches the login screen.
  • A misconfigured disk encryption setup blocks the OS from unlocking the system volume.
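Once the serial log is saved to a file, a quick text search can point you at scenarios like these. The sketch below is illustrative only: the function name is hypothetical and the pattern list is a small, non-exhaustive assumption about what such failures typically print.

```shell
# Hypothetical helper: scan a saved serial log for a few well-known boot
# failure signatures and print matching lines with their line numbers.
# Returns non-zero when nothing matches (standard grep behavior).
scan_boot_log() {
  grep -n -i -E \
    'emergency mode|kernel panic|unable to mount root|give root password' \
    "$1"
}

# Assumed usage: save the log with the Azure CLI, then scan it.
#   az vm boot-diagnostics get-boot-log \
#     --name myVM --resource-group myResourceGroup > boot.log
#   scan_boot_log boot.log
```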

Incident response gets slower

When a VM is part of a security investigation, the serial console can reveal whether unexpected processes are running at boot, whether a malicious init script was injected, or whether the system was tampered with at a low level. No diagnostics means no record, and you are left guessing.

Warning: If you use the legacy storage account mode, the boot diagnostics account itself becomes a small operational dependency. A storage account that is deleted, firewalled off, or has its keys rotated incorrectly will silently break diagnostics collection. Managed storage avoids this class of problem.

Compliance frameworks expect it

Several Azure security baselines, including the CIS Microsoft Azure Foundations Benchmark and the Microsoft cloud security benchmark, recommend boot diagnostics as part of the logging and monitoring posture for compute resources. Leaving it off can show up as a finding in audits even when nothing is technically broken.


How to fix it

You can enable boot diagnostics on a running VM with no reboot required. Pick the method that matches how you manage infrastructure.

Azure CLI (managed storage, recommended)

The simplest fix uses Azure managed storage, so there is no account to create:

az vm boot-diagnostics enable \
  --name myVM \
  --resource-group myResourceGroup

Omitting the --storage flag tells Azure to use managed storage automatically.

Azure CLI (specific storage account)

If your policy requires logs to land in a storage account you control, pass its blob endpoint:

az vm boot-diagnostics enable \
  --name myVM \
  --resource-group myResourceGroup \
  --storage "https://mydiagstorage.blob.core.windows.net/"
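Whichever variant you run, it is worth confirming that the setting took effect. A minimal sketch, assuming the same VM and resource group names as above (the function name is hypothetical):

```shell
# Hypothetical helper: print the boot diagnostics settings for one VM.
#   $1 = VM name, $2 = resource group
show_boot_diagnostics() {
  az vm show \
    --name "$1" \
    --resource-group "$2" \
    --query "diagnosticsProfile.bootDiagnostics" \
    --output json
}

# Example: show_boot_diagnostics myVM myResourceGroup
```

The output should show the enabled flag set to true; with a specific storage account, the storage URI should appear alongside it.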

Azure Portal

  1. Open the VM in the Azure portal.
  2. Under Help in the left menu, select Boot diagnostics.
  3. Click Settings.
  4. Choose Enable with managed storage account.
  5. Click Save.

Terraform

For VMs managed as code, add a boot_diagnostics block. Setting storage_account_uri to null, or omitting it, selects managed storage:

resource "azurerm_linux_virtual_machine" "example" {
  name                = "myVM"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  size                = "Standard_B2s"
  admin_username      = "azureuser"

  # ... network_interface_ids, os_disk, source_image_reference, etc.

  boot_diagnostics {
    storage_account_uri = null  # managed storage
  }
}

Tip: Need to enable diagnostics across dozens of VMs at once? Loop through them with a quick script rather than clicking through the portal one at a time.

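# Current subscription only. A list projection keeps the column order predictable.
az vm list --query "[].[name, resourceGroup]" -o tsv | \
while read -r name rg; do
  echo "Enabling boot diagnostics on $name in $rg"
  # Redirect stdin so the inner command cannot consume the loop's input.
  az vm boot-diagnostics enable --name "$name" --resource-group "$rg" < /dev/null
done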
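On a large estate you may prefer to touch only the VMs that actually need the change, and to preview the result first. The following is a sketch under those assumptions; the function name and the DRY_RUN variable are hypothetical conventions, not part of the Azure CLI.

```shell
# Hypothetical helper: enable boot diagnostics only where it is not already on.
# Set DRY_RUN=1 to print what would change instead of changing it.
remediate_boot_diagnostics() {
  az vm list \
    --query '[?diagnosticsProfile.bootDiagnostics.enabled != `true`].[name, resourceGroup]' \
    --output tsv |
  while read -r name rg; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "would enable: $name ($rg)"
    else
      # Redirect stdin so this call cannot consume the loop's input.
      az vm boot-diagnostics enable \
        --name "$name" --resource-group "$rg" < /dev/null
    fi
  done
}

# Example: DRY_RUN=1 remediate_boot_diagnostics
```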

Note: Enabling boot diagnostics does not restart the VM and has no impact on running workloads. The change applies on the next boot cycle for full serial log capture, but screenshots and console access become available right away.


How to prevent it from happening again

Fixing existing VMs is only half the job. New VMs created without diagnostics will reintroduce the finding. Close the loop with policy and pipeline controls.

Azure Policy

Azure provides a built-in policy definition to audit VMs without boot diagnostics. Assign it at the subscription or management group level so every new VM is evaluated automatically:

az policy assignment create \
  --name "audit-vm-boot-diagnostics" \
  --display-name "Audit VMs without boot diagnostics" \
  --scope "/subscriptions/<subscription-id>" \
  --policy "<policy-definition-name-or-id>"

Start with an Audit effect to measure your exposure, then move to a stricter effect once you understand the blast radius.
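To measure that exposure from the command line, you can ask Azure Policy which resources the assignment currently reports as non-compliant. This is a minimal sketch: the function name is hypothetical, and it assumes the assignment name used above and that policy evaluation has already run.

```shell
# Hypothetical helper: list resource IDs that a policy assignment currently
# reports as non-compliant.
#   $1 = policy assignment name
list_noncompliant_resources() {
  az policy state list \
    --policy-assignment "$1" \
    --filter "complianceState eq 'NonCompliant'" \
    --query "[].resourceId" \
    --output tsv
}

# Example: list_noncompliant_resources audit-vm-boot-diagnostics
```

Counting the lines of output gives a rough number for how many VMs a stricter effect would affect.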

Enforce in Terraform with a required block

Make the boot_diagnostics block mandatory in your shared VM module so teams cannot opt out by omission. Pair it with a policy-as-code tool such as Checkov or tfsec to catch drift in CI:

# Run in your CI pipeline before terraform apply
checkov -d . --check CKV_AZURE_ --compact

Tip: Wire the same scan into a pre-commit hook so engineers get feedback on their laptop before a pull request is ever opened. Catching it locally is cheaper than catching it in review.
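One way to wire that up is a small function called from .git/hooks/pre-commit. The sketch below is an assumption about how you might structure it, not a prescribed setup: the function name is hypothetical, and it runs Checkov's full Terraform check set rather than a single check ID.

```shell
# Hypothetical pre-commit helper: run Checkov only when Terraform files
# are staged, and let its exit status decide whether the commit proceeds.
precommit_terraform_scan() {
  # Skip the scan when no staged file ends in .tf.
  if ! git diff --cached --name-only | grep -q '\.tf$'; then
    return 0
  fi
  # --quiet limits output to failed checks; --compact hides code blocks.
  checkov -d . --framework terraform --compact --quiet
}

# Example .git/hooks/pre-commit body (after defining or sourcing the function):
#   precommit_terraform_scan || exit 1
```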

Continuous monitoring with Lensix

The vm_bootdiagnostics check runs as part of the Lensix Azure VM module, so newly created VMs that miss this setting surface on your next scan rather than waiting for an audit. Combine the policy gate at deploy time with continuous scanning so nothing slips through the gaps between deployments.


Best practices

  • Prefer managed storage for new VMs. It removes the storage account dependency, which leaves one less thing to misconfigure or pay for directly.
  • Standardize on a single approach across your estate. Mixing legacy storage accounts and managed storage makes troubleshooting harder and complicates audits.
  • Lock down legacy diagnostics storage if you must use it. Restrict access to the account, but make sure the Azure platform can still write to it.
  • Test serial console access before an incident. Confirm that your team has the permissions to open the serial console, and that the OS is configured to emit to the serial port, so the data is actually there when you reach for it.
  • Bake it into golden images and modules so diagnostics is on by default, not an afterthought added during firefighting.
  • Treat it as part of observability, not a standalone toggle. Boot diagnostics complements VM metrics, the Azure Monitor agent, and centralized logging to give you full visibility from boot to runtime.

Boot diagnostics is cheap to enable, has no impact on running workloads, and pays for itself the first time a VM refuses to start. Turn it on everywhere, enforce it in policy, and move on to the next finding.