GKE Node Secure Boot Not Enabled: Why It Matters and How to Fix It

Learn why GKE nodes without Secure Boot are vulnerable to boot-level rootkits, and how to enable Shielded Nodes with gcloud, Terraform, and policy-as-code.

TL;DR

This check flags GKE node pools running without Secure Boot, which leaves nodes open to boot-level rootkits and unsigned kernel modules. Recreate the node pool with --shielded-secure-boot to enforce verified boot.

Most container security work focuses on the layers people can see: image scanning, RBAC, network policies, admission controllers. The boot process underneath each node tends to get ignored, partly because it sits below the abstraction most Kubernetes users ever touch. That gap is exactly where Secure Boot lives, and it is why this check matters.

The GKE Node Secure Boot Not Enabled check looks at your Google Kubernetes Engine node pools and reports any pool where Secure Boot is turned off. Below we cover what Secure Boot actually does, the risk of running without it, and how to fix and prevent the misconfiguration.


What this check detects

Secure Boot is part of GKE's Shielded GKE Nodes feature set. When enabled, it uses UEFI firmware to verify that every boot component (the bootloader, the kernel, and kernel drivers) is signed by a trusted authority before it loads. If a component fails signature verification, the node refuses to boot it.

This check fails when a node pool has the enableSecureBoot property set to false (or absent) in its shieldedInstanceConfig. In practice that means the node VMs in the pool boot without verifying the integrity of their boot chain.

Note: Shielded GKE Nodes bundle three protections: Secure Boot, virtual Trusted Platform Module (vTPM), and integrity monitoring. Secure Boot validates the boot chain at startup, the vTPM measures and attests boot state, and integrity monitoring compares that state against a known-good baseline. This check is specifically about the Secure Boot component.

You can confirm the current state of a node pool with gcloud:

gcloud container node-pools describe NODE_POOL_NAME \
  --cluster CLUSTER_NAME \
  --region REGION \
  --format="value(config.shieldedInstanceConfig.enableSecureBoot)"

If that returns False or nothing at all, Secure Boot is not enabled on the pool.
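To check every pool in a cluster at once, the same field can be read from the list command. This is a minimal sketch rather than an official tool: the function name list_insecure_pools is made up for this example, and it assumes gcloud prints the boolean as True and an unset field as an empty string in CSV output.

```shell
# Print the name of every node pool in a cluster whose Secure Boot flag is not True.
# Usage: list_insecure_pools CLUSTER_NAME REGION
list_insecure_pools() {
  gcloud container node-pools list \
    --cluster "$1" \
    --region "$2" \
    --format="csv[no-heading](name,config.shieldedInstanceConfig.enableSecureBoot)" |
  while IFS=, read -r pool secure_boot; do
    # An unset field prints as an empty string, which also counts as "not enabled".
    if [ "$secure_boot" != "True" ]; then
      echo "$pool"
    fi
  done
}
```

Calling list_insecure_pools CLUSTER_NAME REGION prints one pool name per line. Empty output means every pool in that cluster reported Secure Boot as enabled.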


Why it matters

A node without Secure Boot will happily load an unsigned or tampered bootloader, kernel, or kernel module. That is a meaningful attack surface for a few reasons.

Bootkits and kernel rootkits survive reboots

Malware that lodges itself in the boot chain or installs a malicious kernel module is among the hardest to detect and remove. It loads before your runtime security tooling, before the kubelet, and before any agent that might catch it. Without Secure Boot, nothing stops an attacker who has gained code execution on a node from installing a persistent rootkit that survives reboots and outlives any redeployment of the workloads on that node.

Container escapes become more dangerous

Container escapes are not rare. A privileged container, a vulnerable kernel, or a misconfigured hostPath mount can give an attacker node-level access. Once they are on the host, Secure Boot is one of the few controls that limits what they can persist. Without it, the escape escalates from "compromised pod" to "compromised node with permanent foothold."

Warning: If your nodes run any privileged DaemonSets, custom CNI plugins, or third-party agents that load kernel modules, verify those modules are signed before enforcing Secure Boot. Unsigned out-of-tree modules can fail to load on a Secure Boot enabled node and break the workload.
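One way to build that inventory is to ask modinfo for the signer of each loaded module. The sketch below is an assumption-heavy starting point, not a definitive audit: list_unsigned_modules is a hypothetical helper name, and it assumes you have a shell on the node with modinfo available and able to resolve each module by name.

```shell
# Read kernel module names on stdin and print those with no signer recorded.
# Assumption: modinfo is installed and can resolve each module by name.
list_unsigned_modules() {
  while read -r module; do
    # "modinfo -F signer" prints the signer field, or nothing for an unsigned module.
    signer="$(modinfo -F signer "$module" 2>/dev/null < /dev/null)"
    if [ -z "$signer" ]; then
      echo "$module"
    fi
  done
}
```

On a node, feed it the loaded modules with lsmod | awk 'NR > 1 { print $1 }' | list_unsigned_modules. A module that modinfo cannot find is printed too, so treat the output as a list to investigate rather than a verdict.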

Compliance and attestation

Standards like the CIS GKE Benchmark explicitly recommend Shielded GKE Nodes with Secure Boot. If you are pursuing SOC 2, PCI DSS, or FedRAMP, an auditor will ask why production nodes lack a documented, available boot integrity control. "We didn't enable the checkbox" is a weak answer.


How to fix it

Here is the catch worth knowing up front: you cannot enable Secure Boot on an existing node pool in place. Shielded node configuration is set at pool creation. To fix an existing pool you create a new pool with Secure Boot enabled, migrate workloads, and delete the old one.

Option 1: Create a replacement node pool (gcloud)

Create a new pool with Secure Boot turned on:

gcloud container node-pools create secure-pool \
  --cluster CLUSTER_NAME \
  --region REGION \
  --shielded-secure-boot \
  --shielded-integrity-monitoring \
  --machine-type e2-standard-4 \
  --num-nodes 3

Cordon and drain the old nodes so the scheduler moves workloads onto the new pool:

# List nodes in the old pool
kubectl get nodes -l cloud.google.com/gke-nodepool=OLD_POOL_NAME

# Cordon and drain each one
kubectl cordon NODE_NAME
kubectl drain NODE_NAME --ignore-daemonsets --delete-emptydir-data

Warning: Draining moves pods to new nodes and can cause brief disruption for workloads without a PodDisruptionBudget or sufficient replicas. Run migrations during a maintenance window, and set PDBs on critical services so the drain respects availability requirements.
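For pools with more than a few nodes, the cordon and drain steps can be wrapped in a loop. The following is a sketch under stated assumptions: drain_pool is a hypothetical helper name, and it uses only the kubectl flags already shown above plus a jsonpath query for the node names.

```shell
# Cordon every node in a pool, then drain them one at a time.
# Usage: drain_pool OLD_POOL_NAME
drain_pool() {
  nodes="$(kubectl get nodes -l "cloud.google.com/gke-nodepool=$1" \
    -o jsonpath='{.items[*].metadata.name}')" || return 1

  # Cordon first so evicted pods cannot be rescheduled onto another old node.
  for node in $nodes; do
    kubectl cordon "$node" || return 1
  done

  # Drain sequentially; kubectl drain blocks until evictions finish or fail.
  for node in $nodes; do
    kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data || return 1
  done
}
```

Cordoning the whole pool before draining any node keeps pods from bouncing between old nodes. The function stops at the first failed drain, which is usually a PodDisruptionBudget doing its job.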

Once workloads are running on the new pool and verified healthy, delete the old one:

Danger: Deleting a node pool destroys its nodes and any data on local SSDs or emptyDir volumes. Confirm all stateful workloads have been rescheduled and that no pods are pending before you run this. There is no undo.

gcloud container node-pools delete OLD_POOL_NAME \
  --cluster CLUSTER_NAME \
  --region REGION
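The "no pods are pending" condition from the Danger note can be scripted as a gate in front of the delete. This is a minimal sketch: no_pending_pods is a hypothetical helper name, and it only looks at the Pending phase, so it does not replace checking your stateful workloads by hand.

```shell
# Succeed only when no pod in the cluster is in the Pending phase.
no_pending_pods() {
  pending="$(kubectl get pods --all-namespaces \
    --field-selector=status.phase=Pending -o name)" || return 1
  if [ -n "$pending" ]; then
    echo "Pending pods found:" >&2
    echo "$pending" >&2
    return 1
  fi
}
```

Chaining it as no_pending_pods && gcloud container node-pools delete OLD_POOL_NAME --cluster CLUSTER_NAME --region REGION keeps the delete from running while anything is still waiting for a node.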

Option 2: Terraform

For managed infrastructure, set shielded_instance_config on the node pool. Changing this forces replacement of the node pool, so plan accordingly.

resource "google_container_node_pool" "secure_pool" {
  name     = "secure-pool"
  cluster  = google_container_cluster.primary.id
  location = "us-central1"

  node_config {
    machine_type = "e2-standard-4"

    shielded_instance_config {
      enable_secure_boot          = true
      enable_integrity_monitoring = true
    }
  }

  node_count = 3
}

Tip: Enable Shielded Nodes at the cluster level too with enable_shielded_nodes = true on the google_container_cluster resource. That makes new pools run as Shielded VMs by default, but it does not turn on Secure Boot by itself: GKE leaves Secure Boot off unless a pool asks for it, so keep enable_secure_boot = true in every pool's shielded_instance_config.
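To see whether an existing cluster already has the cluster-level setting, you can read it back with gcloud. This is a sketch, assuming the cluster resource exposes the setting as shieldedNodes.enabled and that gcloud prints it as True. The function name shielded_nodes_enabled is invented for this example.

```shell
# Succeed when Shielded GKE Nodes is enabled at the cluster level.
# Usage: shielded_nodes_enabled CLUSTER_NAME REGION
shielded_nodes_enabled() {
  enabled="$(gcloud container clusters describe "$1" --region "$2" \
    --format="value(shieldedNodes.enabled)")" || return 1
  [ "$enabled" = "True" ]
}
```

If it fails, gcloud container clusters update CLUSTER_NAME --region REGION --enable-shielded-nodes turns the setting on. Expect that update to roll the cluster's nodes, so schedule it like any other disruptive change.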


How to prevent it from happening again

One-off remediation does not hold up over time. New pools get created during incidents, scaling experiments, and by people who do not know the standard. Bake the requirement into your pipeline.

Policy-as-code with OPA and Conftest

If you provision clusters through Terraform, run Conftest against your plans to reject node pools without Secure Boot:

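package gke.secureboot

# True when any node_config block on the pool enables Secure Boot.
secure_boot_enabled(resource) {
  resource.change.after.node_config[_].shielded_instance_config[_].enable_secure_boot == true
}

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "google_container_node_pool"
  # Skip pools that are being destroyed (no planned "after" state).
  resource.change.after != null
  # Also fires when node_config or shielded_instance_config is missing entirely.
  not secure_boot_enabled(resource)
  msg := sprintf("Node pool '%s' must enable Secure Boot", [resource.address])
}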
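Wiring the policy above into a pipeline takes three commands: write a plan, convert it to JSON, and hand the JSON to Conftest. The wrapper below is a sketch with assumptions: check_plan is a hypothetical name, the policy is assumed to live in a policy/ directory, and the --namespace flag is passed because Conftest evaluates the main package by default while this policy declares gke.secureboot.

```shell
# Plan, export the plan as JSON, and evaluate it with Conftest.
# Usage: check_plan [PLAN_FILE]   (defaults to "tfplan")
check_plan() {
  plan_file="${1:-tfplan}"
  terraform plan -out="$plan_file" &&
  terraform show -json "$plan_file" > "$plan_file.json" &&
  # Assumption: the Rego file sits under ./policy and declares package gke.secureboot.
  conftest test "$plan_file.json" --policy policy/ --namespace gke.secureboot
}
```

Run it from the Terraform working directory. A non-zero exit status from Conftest fails the function, which is what lets a CI job block the apply step.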

Organization Policy constraint

GCP offers a built-in org policy that requires Shielded VMs across the organization or folder:

gcloud resource-manager org-policies enable-enforce \
  compute.requireShieldedVm \
  --organization ORGANIZATION_ID

This enforces Shielded VM features, including Secure Boot, on Compute Engine instances, which covers GKE nodes. Test it against a non-production folder first, since it can block VM creation for images that are not Shielded-compatible.
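The folder-first rollout might look like the following. Treat it as a sketch: enforce_shielded_vm_on_folder is a hypothetical helper name, FOLDER_ID is a placeholder for your own test folder, and the read-back assumes the effective policy exposes booleanPolicy.enforced.

```shell
# Enforce compute.requireShieldedVm on one folder, then print the effective setting.
# Usage: enforce_shielded_vm_on_folder FOLDER_ID
enforce_shielded_vm_on_folder() {
  gcloud resource-manager org-policies enable-enforce \
    compute.requireShieldedVm --folder "$1" > /dev/null || return 1

  # Read the policy back, including anything inherited from parent resources.
  gcloud resource-manager org-policies describe \
    compute.requireShieldedVm --folder "$1" --effective \
    --format="value(booleanPolicy.enforced)"
}
```

Once the function prints True, try creating a node pool in a project under that folder before widening the scope to the organization.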

Continuous scanning

Catch drift after the fact with continuous checks. Lensix runs the gke_nosecureboot check on a schedule across your GCP projects, so a pool created outside your IaC pipeline still gets flagged rather than sitting unnoticed until an audit.
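To illustrate what such a scan does, here is a rough single-project equivalent built from gcloud list commands. It is not how any particular scanner is implemented: scan_project is a hypothetical name, and the sketch assumes CSV output renders the boolean as True.

```shell
# Print "cluster/pool" for every node pool in a project without Secure Boot.
# Usage: scan_project PROJECT_ID
scan_project() {
  gcloud container clusters list --project "$1" \
    --format="csv[no-heading](name,location)" |
  while IFS=, read -r cluster location; do
    # stdin is redirected so the inner command cannot consume the outer loop's input.
    gcloud container node-pools list --project "$1" \
      --cluster "$cluster" --location "$location" \
      --format="csv[no-heading](name,config.shieldedInstanceConfig.enableSecureBoot)" \
      < /dev/null |
    while IFS=, read -r pool secure_boot; do
      [ "$secure_boot" = "True" ] || echo "$cluster/$pool"
    done
  done
}
```

Run on a schedule, non-empty output from scan_project PROJECT_ID is the signal to raise an alert or open a ticket.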


Best practices

  • Enable all three Shielded Nodes features together. Secure Boot, vTPM, and integrity monitoring complement each other. Secure Boot blocks untrusted boot components, while integrity monitoring alerts you when a node's measured boot state drifts from baseline.
  • Set the secure baseline at the cluster level. Turning on Shielded Nodes for the cluster means new pools run as Shielded VMs by default. Secure Boot itself is still a per-pool setting, so pair the cluster default with a policy check rather than relying on individual operators.
  • Use Container-Optimized OS (COS). COS images are signed and designed to work with Secure Boot out of the box. Custom or third-party node images may require signing your own kernel modules.
  • Validate kernel modules before enforcing. Inventory any DaemonSets or agents that load out-of-tree modules and confirm they are signed, so Secure Boot does not break them.
  • Treat node pools as immutable. Because Secure Boot cannot be toggled in place, get into the habit of recreating pools to change node-level configuration rather than mutating them.
  • Map the control to your compliance framework. Document Secure Boot as your boot integrity control for CIS GKE 5.5.x and reference it in your audit evidence.

Secure Boot is a small flag with an outsized effect on how much an attacker can persist after a node compromise. It costs nothing extra, it does not slow workloads down, and on a fresh cluster it is one line of configuration. The only real friction is migrating existing pools, which is a one-time cost worth paying for boot integrity you can rely on.