
AKS Node Pool Host Encryption Disabled: What It Means and How to Fix It

Learn why AKS host encryption matters, how to enable encryption at host on node pools with CLI and Terraform, and how to enforce it with policy-as-code.

TL;DR

This check flags AKS node pools running without host-based encryption, which leaves temp disks, ephemeral OS disks, and disk caches on the underlying VM hosts unencrypted at rest. Fix it by recreating the node pool with --enable-encryption-at-host after registering the feature on your subscription.

Encryption at rest is one of those controls that everyone assumes is already on. With Azure managed disks, server-side encryption is enabled by default, so most teams check the box and move on. But AKS nodes have more than just managed disks. They have temp disks, OS and data disk caches, and ephemeral OS disks that live on the physical host running your VMs. Without host-based encryption, that data sits unencrypted on the host hardware.

The AKS Node Pool Host Encryption Disabled check catches node pools that have not opted into encryption at host. Below is what it means, why it is worth fixing, and how to remediate it cleanly.


What this check detects

The check inspects each node pool in your AKS clusters and verifies whether enableEncryptionAtHost is set to true. When it is disabled, the following data is not encrypted on the Azure host:

  • The VM temp disk (the local SSD exposed as a device such as /dev/sdb on Linux nodes)
  • The OS and data disk caches
  • Ephemeral OS disks, when your node pool uses them

Note: Encryption at host is different from server-side encryption of managed disks. Managed disks are encrypted in Azure Storage by default. Encryption at host extends that protection to the data stored locally on the physical machine, encrypting it before it leaves the host and is persisted to the Azure storage backend.

This is a per node pool setting. A cluster can have its system node pool encrypted and a user node pool left exposed, so the check evaluates each pool independently.
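To see the same per-pool view yourself, you can query the property with the Azure CLI. The sketch below assumes the CLI is installed and logged in. The function name unencrypted_pools and the resource group and cluster names are placeholders for illustration.

```shell
#!/usr/bin/env bash
# Print the names of node pools in one cluster that do not have
# encryption at host enabled. Pools where the property is null or
# false both match, because neither equals the JSON literal true.
unencrypted_pools() {
  local resource_group="$1" cluster="$2"
  az aks nodepool list \
    --resource-group "$resource_group" \
    --cluster-name "$cluster" \
    --query '[?enableEncryptionAtHost!=`true`].name' \
    --output tsv
}

# Usage (placeholder names):
#   unencrypted_pools myResourceGroup myAKSCluster
```

Empty output means every pool in that cluster has the setting enabled.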


Why it matters

The risk here is data exposure at the infrastructure layer. Kubernetes workloads write all kinds of sensitive material to local disk without thinking about it: secrets mounted as files, cached database pages, scratch space for ETL jobs, temporary copies of customer data, and container layers pulled from your registry. Much of this lands on the temp disk or the disk cache, which is exactly the surface that host encryption protects.

Concretely, here is what host encryption guards against:

  • Physical and host-level access. If data on the host is ever exposed through a hardware fault, a decommissioning gap, or a hypervisor-level issue, host encryption ensures it stays unreadable.
  • Defense in depth. Server-side disk encryption covers the persisted managed disk, but the cache and temp disk are separate. Host encryption closes that gap so there is no plaintext copy of your data anywhere in the storage path.
  • Compliance requirements. Frameworks like PCI DSS, HIPAA, FedRAMP, and SOC 2 expect encryption of data at rest across the full stack. An auditor may not accept "the managed disks are encrypted" when the temp disk is not.

Warning: If you are running workloads that handle regulated data, leaving host encryption off can put you out of compliance even if every other encryption control is in place. The temp disk is a real gap, not a theoretical one.


How to fix it

Encryption at host cannot be toggled on an existing node pool. You enable it at node pool creation time, which means remediation involves creating a new node pool and migrating workloads. Before you can use the feature, you have to register it on the subscription.

Step 1: Register the EncryptionAtHost feature

az feature register --namespace Microsoft.Compute --name EncryptionAtHost

# Wait until the state shows "Registered" (can take a few minutes)
az feature show --namespace Microsoft.Compute --name EncryptionAtHost \
  --query properties.state -o tsv

# Propagate the registration to the provider
az provider register --namespace Microsoft.Compute

Note: Encryption at host is generally available, but registration is still required because the underlying VM capability is gated behind this flag at the subscription level.
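Registration is asynchronous, so a bootstrap script usually needs to wait for it. The following is a minimal polling sketch, not an official helper: wait_for_feature, its retry count, and its delay are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Poll a subscription feature flag until it reports "Registered".
# Returns 0 on success, 1 if the state is still different after
# max_tries checks.
wait_for_feature() {
  local namespace="$1" feature="$2" max_tries="${3:-30}" delay="${4:-20}"
  local try=1 state
  while [ "$try" -le "$max_tries" ]; do
    state="$(az feature show --namespace "$namespace" --name "$feature" \
      --query properties.state --output tsv)"
    if [ "$state" = "Registered" ]; then
      return 0
    fi
    echo "Feature state is '${state}', retrying (${try}/${max_tries})" >&2
    sleep "$delay"
    try=$((try + 1))
  done
  return 1
}
```

Call it with the namespace and feature name you passed to az feature register, then run az provider register once it returns successfully.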

Step 2: Create a new node pool with encryption at host

az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name encrypted1 \
  --node-count 3 \
  --node-vm-size Standard_DS2_v2 \
  --enable-encryption-at-host

Warning: Not every VM size supports encryption at host. You need a size that offers the capability (most DS, ES, and F series do). Check support before you pick a size:

az vm list-skus --location <region> \
  --query "[?capabilities[?name=='EncryptionAtHostSupported' && value=='True']].name"
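If you already have a size in mind, a narrower check is quicker than scanning the full list. This sketch assumes the Azure CLI is logged in. The function name sku_supports_host_encryption is hypothetical, and the region and size in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Succeed (exit 0) if the given VM size advertises the
# EncryptionAtHostSupported capability in the given region.
# --size accepts partial names, so the query also matches the
# exact SKU name to avoid false positives from similar sizes.
sku_supports_host_encryption() {
  local region="$1" size="$2" match
  match="$(az vm list-skus \
    --location "$region" \
    --size "$size" \
    --resource-type virtualMachines \
    --query "[?name=='${size}' && capabilities[?name=='EncryptionAtHostSupported' && value=='True']].name" \
    --output tsv)"
  [ -n "$match" ]
}

# Usage (placeholder values):
#   sku_supports_host_encryption eastus Standard_DS2_v2 && echo "supported"
```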

Step 3: Migrate workloads off the old node pool

Cordon and drain the old pool so the scheduler moves pods onto the new encrypted nodes.

# Cordon every node in the old pool so nothing new schedules there
kubectl cordon -l agentpool=oldpool

# Drain them one at a time, respecting pod disruption budgets
for node in $(kubectl get nodes -l agentpool=oldpool -o name); do
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=300s
done

Danger: Draining evicts running pods. Make sure you have PodDisruptionBudgets in place and enough capacity on the new pool before you start, or you will cause an outage. Validate that workloads are healthy on the new nodes before deleting anything.
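Before moving on, it helps to confirm that nothing but DaemonSet pods is left on the old nodes. The sketch below is one way to do that, assuming kubectl points at the right cluster and that nodes carry the agentpool label used above. The function name pods_left_on_pool is made up for this example.

```shell
#!/usr/bin/env bash
# List pods still scheduled on nodes of a given pool, skipping
# DaemonSet pods (kubectl drain leaves those in place by design).
# Each output line is "<namespace>/<pod> <owner kind>".
pods_left_on_pool() {
  local pool="$1" node
  for node in $(kubectl get nodes -l "agentpool=${pool}" -o name); do
    kubectl get pods --all-namespaces \
      --field-selector "spec.nodeName=${node#node/}" \
      -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{" "}{.metadata.ownerReferences[0].kind}{"\n"}{end}'
  done | grep -v ' DaemonSet$' || true
}

# Usage (placeholder pool name):
#   pods_left_on_pool oldpool
```

Empty output suggests the drain finished. Anything listed is still running on the old pool and would be lost if the pool were deleted.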

Step 4: Delete the old node pool

Danger: This permanently removes the nodes and any data on their local disks. Confirm all workloads have rescheduled and are running on the encrypted pool first.

az aks nodepool delete \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name oldpool

Terraform example

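If you manage AKS with Terraform, set enable_host_encryption on the node pool. The default node pool block and any additional azurerm_kubernetes_cluster_node_pool resources both support it. In azurerm provider 4.x the argument is named host_encryption_enabled, so use that name if you have upgraded.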

resource "azurerm_kubernetes_cluster" "this" {
  name                = "myAKSCluster"
  location            = azurerm_resource_group.this.location
  resource_group_name = azurerm_resource_group.this.name
  dns_prefix          = "myaks"

  default_node_pool {
    name                   = "system"
    node_count             = 3
    vm_size                = "Standard_DS2_v2"
    enable_host_encryption = true
  }

  identity {
    type = "SystemAssigned"
  }
}

resource "azurerm_kubernetes_cluster_node_pool" "user" {
  name                   = "user1"
  kubernetes_cluster_id  = azurerm_kubernetes_cluster.this.id
  vm_size                = "Standard_DS2_v2"
  node_count             = 3
  enable_host_encryption = true
}

Tip: Because the setting is immutable, changing enable_host_encryption on an existing pool forces Terraform to replace it. Plan the change during a maintenance window. If you use create_before_destroy, give the replacement pool a different name, because node pool names must be unique within a cluster, and make sure you have quota for both pools while they coexist.


How to prevent it from happening again

The fix is annoying because it requires a node pool recreation. The way to avoid ever doing it reactively is to make encryption at host the default for every new pool. A few mechanisms work well together.

Azure Policy

Use a built-in or custom Azure Policy to audit or deny AKS node pools without host encryption. The deny effect stops a non-compliant pool from being created in the first place.

{
  "if": {
    "allOf": [
      {
        "field": "type",
        "equals": "Microsoft.ContainerService/managedClusters/agentPools"
      },
      {
        "field": "Microsoft.ContainerService/managedClusters/agentPools/enableEncryptionAtHost",
        "notEquals": "true"
      }
    ]
  },
  "then": {
    "effect": "deny"
  }
}
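One way to roll out a custom rule like this is with the Azure CLI. The sketch below assumes the rule above is saved to a local JSON file. The function name, the definition name deny-aks-no-host-enc, and the choice of mode All are assumptions for illustration, so adjust them to your own conventions.

```shell
#!/usr/bin/env bash
# Create a custom policy definition from a rules file and assign it
# at the given scope (for example a subscription or resource group ID).
# Stops before the assignment if the definition cannot be created.
deploy_host_encryption_policy() {
  local rules_file="$1" scope="$2"
  local name="deny-aks-no-host-enc"   # placeholder name

  az policy definition create \
    --name "$name" \
    --display-name "AKS node pools must enable encryption at host" \
    --mode All \
    --rules "$rules_file" || return 1

  az policy assignment create \
    --name "$name" \
    --policy "$name" \
    --scope "$scope"
}

# Usage (placeholder values):
#   deploy_host_encryption_policy ./rule.json "/subscriptions/<subscription-id>"
```

Consider starting with the audit effect to see what would be blocked, then switching to deny once existing pools are remediated.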

CI/CD gates with policy-as-code

If your clusters are provisioned through Terraform, catch the misconfiguration before it ever reaches Azure. Tools like Checkov, tfsec, and OPA Conftest can scan plans in your pipeline.

# Run Checkov against your Terraform
checkov -d ./infra --framework terraform

# Or fail a pipeline on a specific OPA policy
conftest test plan.json --policy ./policies
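Conftest evaluates the JSON form of a plan, so the pipeline has to render one first. The following sketch shows one possible wiring. The function name gate_terraform_plan, the file names, and the directory layout are assumptions, not requirements of either tool.

```shell
#!/usr/bin/env bash
# Render a Terraform plan as JSON and evaluate it with Conftest.
# A failure in any step returns non-zero so the pipeline job fails.
gate_terraform_plan() {
  local infra_dir="$1" policy_dir="$2"
  local plan_bin="tfplan.binary" plan_json="plan.json"

  # The binary plan is written inside infra_dir because of -chdir.
  terraform -chdir="$infra_dir" plan -out="$plan_bin" -input=false || return 1
  # The JSON rendering is written to the current working directory.
  terraform -chdir="$infra_dir" show -json "$plan_bin" > "$plan_json" || return 1
  conftest test "$plan_json" --policy "$policy_dir"
}

# Usage (placeholder paths):
#   gate_terraform_plan ./infra ./policies
```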

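A minimal Rego rule to reject node pools without host encryption. It only inspects standalone azurerm_kubernetes_cluster_node_pool resources, so the default_node_pool block inside azurerm_kubernetes_cluster needs a separate rule: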

package main

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "azurerm_kubernetes_cluster_node_pool"
  resource.change.after.enable_host_encryption != true
  msg := sprintf("Node pool %s must enable host encryption", [resource.name])
}

Tip: Pair the pipeline gate with continuous monitoring in Lensix. The pipeline catches new misconfigurations at deploy time, and Lensix flags drift or anything created outside your IaC, like a node pool added manually in the portal during an incident.


Best practices

  • Enable encryption at host on every node pool, system and user alike. Do not assume system pools are exempt. They run kube-system workloads that handle cluster secrets.
  • Register the feature once per subscription. Bake az feature register into your subscription bootstrap so new clusters can use the setting immediately.
  • Choose supported VM sizes from the start. Standardize on a list of approved sizes that all support encryption at host so you never get blocked at pool creation time.
  • Combine it with customer-managed keys. For higher assurance, point your disk encryption set at a key in Azure Key Vault so you control the key lifecycle alongside host encryption.
  • Treat node pool settings as immutable. Because many security settings cannot be changed in place, get them right at creation. Use blue-green node pool swaps for upgrades rather than trying to mutate live pools.
  • Audit regularly. Encryption at host is easy to forget on a one-off pool spun up for a specific workload. A scheduled check keeps those exceptions from going unnoticed.

Encryption at host is a small flag with an outsized compliance and security payoff. It costs nothing extra in Azure billing, it closes a real gap in your at-rest encryption story, and once it is the default in your IaC you never have to think about it again.