BigQuery Dataset Not Using CMEK: Why It Matters and How to Fix It

Learn why BigQuery datasets should use customer-managed KMS keys (CMEK), the risks of default encryption, and step-by-step CLI and Terraform fixes.

TL;DR

This check flags BigQuery datasets that rely on Google's default encryption instead of a customer-managed encryption key (CMEK). Without CMEK you lose direct control over key rotation, revocation, and audit visibility. Fix it by creating a Cloud KMS key and setting it as the dataset's default encryption configuration.

BigQuery encrypts everything at rest by default, so it is easy to assume the encryption box is checked and move on. That default encryption uses keys that Google generates, owns, and rotates for you. For most workloads that is fine. But once you handle regulated data, customer PII, or anything covered by a contractual key-control requirement, "Google manages the key" stops being good enough. That is exactly what the bigquery_nocmk check looks for: datasets that have no customer-managed KMS key attached.


What this check detects

The check inspects each BigQuery dataset in your GCP projects and looks at its defaultEncryptionConfiguration. If that field is empty, the dataset is using Google-managed encryption keys (the default) rather than a customer-managed encryption key from Cloud KMS.

A dataset's default encryption configuration controls the key used for any new table created in that dataset that does not specify its own key. When the field is unset, BigQuery falls back to Google-managed keys, and the check returns a failure.
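As a rough illustration of that logic, the sketch below applies the same test to dataset metadata that has already been fetched (for example, the JSON printed by bq show --format=prettyjson). The function name and the sample dictionaries are hypothetical; only the defaultEncryptionConfiguration and kmsKeyName field names come from the BigQuery dataset resource.

```python
def dataset_uses_cmek(dataset: dict) -> bool:
    """Return True if the dataset resource names a default Cloud KMS key.

    `dataset` is assumed to be a dict shaped like a BigQuery dataset resource.
    """
    config = dataset.get("defaultEncryptionConfiguration") or {}
    return bool(config.get("kmsKeyName"))


# Hypothetical sample metadata for illustration only.
protected = {
    "id": "my-project:my_dataset",
    "defaultEncryptionConfiguration": {
        "kmsKeyName": "projects/my-project/locations/us/keyRings/bq-keyring/cryptoKeys/bq-cmek"
    },
}
unprotected = {"id": "my-project:legacy_dataset"}

for ds in (protected, unprotected):
    status = "pass" if dataset_uses_cmek(ds) else "fail"
    print(f"{ds['id']}: {status}")
```

An empty or missing field is treated the same way, which mirrors how the check treats a dataset that falls back to Google-managed keys.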

Note: CMEK does not change how data is encrypted (it is still AES-256 under the hood). It changes who controls the key. With CMEK, the key lives in Cloud KMS under your project, and BigQuery has to call KMS to wrap and unwrap the data encryption keys. Disable or destroy that KMS key and the data becomes unreadable.


Why it matters

The difference between default encryption and CMEK shows up in three concrete situations.

Key revocation and the kill switch

With CMEK you can disable a key version and make a dataset unreadable, typically within minutes. Cloud KMS treats key-state changes as eventually consistent, so allow for some propagation delay. This is the cryptographic equivalent of pulling the plug. If you detect a breach, suspect credential compromise, or need to cut off access during an investigation, you have a direct control. Google-managed keys give you no such lever.

Compliance and audit requirements

Frameworks like PCI DSS, HIPAA, FedRAMP, and many internal data-governance policies require demonstrable control over encryption keys, including documented rotation schedules and access logging. Default encryption cannot satisfy a requirement that says "the data owner must control the key material." CMEK can, because key administration is recorded in Cloud Audit Logs, and encrypt and decrypt calls are recorded as well once Data Access audit logs are enabled for Cloud KMS.

Key rotation on your terms

Cloud KMS lets you set automatic rotation periods and trigger manual rotations when an incident calls for it. Google rotates its managed keys too, but you neither control the schedule nor see the events. For an auditor, "trust us, it rotates" rarely passes.

Warning: CMEK adds operational dependencies. If the KMS key is in a different location from the dataset, or if its IAM bindings are wrong (easy to miss when the key lives in another project), BigQuery jobs will fail with permission errors. A destroyed key version means permanent data loss. Treat key lifecycle as carefully as the data itself.


How to fix it

Remediation has three parts: create (or pick) a Cloud KMS key, grant BigQuery permission to use it, then set it as the dataset's default encryption key.

Step 1: Create a KMS key ring and key

The key must live in the same location as your BigQuery dataset. BigQuery cannot use a key from a mismatched location.

gcloud kms keyrings create bq-keyring \
  --location=us \
  --project=my-project

gcloud kms keys create bq-cmek \
  --location=us \
  --keyring=bq-keyring \
  --purpose=encryption \
  --rotation-period=90d \
  --next-rotation-time=$(date -u -d "+90 days" +%Y-%m-%dT%H:%M:%SZ) \
  --project=my-project
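Because a location mismatch only surfaces later as a failed job, it can help to validate the pairing up front. The sketch below parses a Cloud KMS key resource name and compares its location with the dataset's. The helper names are hypothetical, and the multi-region mapping (US to us, EU to europe) is an assumption to confirm against the current BigQuery CMEK documentation.

```python
import re

# Cloud KMS key resource name:
# projects/<project>/locations/<location>/keyRings/<ring>/cryptoKeys/<key>
KEY_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<location>[^/]+)"
    r"/keyRings/(?P<ring>[^/]+)/cryptoKeys/(?P<key>[^/]+)$"
)

# Assumed mapping from BigQuery multi-regions to Cloud KMS multi-regions.
MULTI_REGION_TO_KMS = {"us": "us", "eu": "europe"}


def expected_kms_location(dataset_location: str) -> str:
    """Return the KMS location a dataset in `dataset_location` should use."""
    loc = dataset_location.lower()
    return MULTI_REGION_TO_KMS.get(loc, loc)


def key_matches_dataset(key_name: str, dataset_location: str) -> bool:
    """Return True if the key's location suits the dataset's location."""
    match = KEY_NAME.match(key_name)
    if match is None:
        raise ValueError(f"not a Cloud KMS key resource name: {key_name!r}")
    return match.group("location").lower() == expected_kms_location(dataset_location)


key = "projects/my-project/locations/us/keyRings/bq-keyring/cryptoKeys/bq-cmek"
print(key_matches_dataset(key, "US"))
print(key_matches_dataset(key, "europe-west1"))
```

Running a check like this before bq update turns a confusing runtime failure into an early, readable error.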

Step 2: Grant the BigQuery service account access to the key

Every project has a dedicated BigQuery service account that performs encryption operations. Find it and give it the encrypter/decrypter role on the key.

# Get the BigQuery encryption service account for the project
SA=$(bq show --encryption_service_account --project_id=my-project \
  --format=prettyjson | python3 -c "import sys,json; print(json.load(sys.stdin)['ServiceAccountID'])")

gcloud kms keys add-iam-policy-binding bq-cmek \
  --location=us \
  --keyring=bq-keyring \
  --member="serviceAccount:${SA}" \
  --role="roles/cloudkms.cryptoKeyEncrypterDecrypter" \
  --project=my-project
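To confirm the grant landed before moving on, one option is to read the key's IAM policy back (for instance with gcloud kms keys get-iam-policy and --format=json) and look for the binding. The sketch below assumes the policy is already loaded as a dict; the function name, project number, and sample policy are hypothetical.

```python
ROLE = "roles/cloudkms.cryptoKeyEncrypterDecrypter"


def has_key_access(policy: dict, service_account: str) -> bool:
    """Return True if `service_account` holds the encrypter/decrypter role.

    `policy` is assumed to be an IAM policy dict with a "bindings" list,
    each binding carrying a "role" and a list of "members".
    """
    member = f"serviceAccount:{service_account}"
    return any(
        binding.get("role") == ROLE and member in binding.get("members", [])
        for binding in policy.get("bindings", [])
    )


# Hypothetical policy and service account for illustration only.
sa = "bq-123456789012@bigquery-encryption.iam.gserviceaccount.com"
policy = {
    "bindings": [
        {"role": ROLE, "members": [f"serviceAccount:{sa}"]},
    ]
}
print(has_key_access(policy, sa))
```

This only inspects bindings set directly on the key. A role inherited from the key ring or project would not appear here, which is usually what you want when verifying a least-privilege grant.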

Step 3: Set the key as the dataset default

Danger: Setting a default CMEK on a dataset does not re-encrypt existing tables. It only applies to tables created afterward. To bring old tables under CMEK you must rewrite them (for example, with a CREATE OR REPLACE TABLE or a copy job that specifies the key). Test this on a non-production dataset first, and confirm the key IAM binding is in place before you start, or jobs will fail mid-flight.

Using the bq CLI:

bq update \
  --default_kms_key=projects/my-project/locations/us/keyRings/bq-keyring/cryptoKeys/bq-cmek \
  my-project:my_dataset

To bring an existing table under CMEK, copy it onto itself with the key specified. A copy job keeps the table's schema and partitioning intact, and the -f flag overwrites the destination without prompting:

bq cp -f \
  --destination_kms_key=projects/my-project/locations/us/keyRings/bq-keyring/cryptoKeys/bq-cmek \
  my_dataset.my_table my_dataset.my_table
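Since the dataset default leaves older tables untouched, it is worth listing which tables still need a rewrite. The sketch below assumes you have already collected full table resources (the shape returned by bq show --format=prettyjson on each table) and filters for those without an encryptionConfiguration. The function name and sample data are hypothetical.

```python
def tables_missing_cmek(tables: list[dict]) -> list[str]:
    """Return "dataset.table" names for tables with no Cloud KMS key set.

    Each item is assumed to be a full BigQuery table resource dict.
    """
    missing = []
    for table in tables:
        config = table.get("encryptionConfiguration") or {}
        if not config.get("kmsKeyName"):
            ref = table["tableReference"]
            missing.append(f'{ref["datasetId"]}.{ref["tableId"]}')
    return missing


# Hypothetical table metadata for illustration only.
tables = [
    {
        "tableReference": {"projectId": "my-project", "datasetId": "my_dataset", "tableId": "orders"},
        "encryptionConfiguration": {
            "kmsKeyName": "projects/my-project/locations/us/keyRings/bq-keyring/cryptoKeys/bq-cmek"
        },
    },
    {
        "tableReference": {"projectId": "my-project", "datasetId": "my_dataset", "tableId": "legacy_events"},
    },
]
print(tables_missing_cmek(tables))
```

The resulting list is the backlog of tables to rewrite with the key specified.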

Console steps

  1. Open BigQuery in the Google Cloud console and select the dataset.
  2. Click Edit details.
  3. Under Encryption, choose Customer-managed key.
  4. Select the key ring and key you created, then save.

Terraform

If you manage BigQuery as code, declare the key directly on the dataset resource so drift is caught on every plan.

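resource "google_kms_key_ring" "bq" {
  name     = "bq-keyring"
  location = "us" # must match the dataset location
}

resource "google_kms_crypto_key" "bq_cmek" {
  name            = "bq-cmek"
  key_ring        = google_kms_key_ring.bq.id
  rotation_period = "7776000s" # 90 days

  lifecycle {
    prevent_destroy = true # losing the key makes the data unreadable
  }
}

resource "google_bigquery_dataset" "secure" {
  dataset_id = "my_dataset"
  location   = "US"

  default_encryption_configuration {
    kms_key_name = google_kms_crypto_key.bq_cmek.id
  }
}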

Tip: Wire the IAM binding for the BigQuery service account into the same Terraform module using the google_kms_crypto_key_iam_member resource. That keeps the key, the dataset, and the permission grant in one place, so nobody can apply a dataset without the access it needs to function.


How to prevent it from happening again

Manual fixes drift. The reliable approach is to make CMEK the only way a dataset can be created in your environment.

Organization policy constraint

GCP ships a built-in org policy that denies creation of resources without CMEK. Apply constraints/gcp.restrictNonCmekServices and list BigQuery as a denied value, so any attempt to create a dataset or table without a customer-managed key is rejected at the API layer. This constraint only accepts deny rules: the services you list are the ones that may no longer create resources without CMEK.

cat > cmek-policy.yaml <<'EOF'
name: organizations/123456789012/policies/gcp.restrictNonCmekServices
spec:
  rules:
    - values:
        deniedValues:
          - "bigquery.googleapis.com"
EOF

gcloud org-policies set-policy cmek-policy.yaml

You can pair this with constraints/gcp.restrictCmekCryptoKeyProjects to limit which projects KMS keys may come from, which prevents someone from satisfying the requirement with an unmanaged key in a random project.

Policy-as-code in CI/CD

Catch missing CMEK before anything reaches GCP. A Conftest/OPA policy against Terraform plans works well:

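package bigquery

deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "google_bigquery_dataset"

  # Skip datasets that are being destroyed (after is null for deletes).
  after := resource.change.after
  after != null

  # Plan JSON renders an absent block as an empty list, so test its length.
  count(object.get(after, "default_encryption_configuration", [])) == 0
  msg := sprintf("BigQuery dataset '%s' must set default_encryption_configuration (CMEK)", [resource.address])
}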

Run it as a required check in your pipeline so a pull request that omits CMEK never merges.
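For pipelines that do not run OPA, the same rule can be approximated in plain Python against the output of terraform show -json. This is a minimal sketch, not a replacement for a policy engine: the function name and sample plan are hypothetical, and it assumes the plan JSON represents an absent default_encryption_configuration block as an empty list or a missing key.

```python
def datasets_missing_cmek(plan: dict) -> list[str]:
    """Return addresses of planned BigQuery datasets with no default CMEK."""
    offenders = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "google_bigquery_dataset":
            continue
        after = change.get("change", {}).get("after")
        if after is None:
            # The resource is being destroyed, so there is nothing to check.
            continue
        if not after.get("default_encryption_configuration"):
            offenders.append(change["address"])
    return offenders


# Hypothetical plan fragment for illustration only.
plan = {
    "resource_changes": [
        {
            "address": "google_bigquery_dataset.secure",
            "type": "google_bigquery_dataset",
            "change": {"after": {"default_encryption_configuration": [{"kms_key_name": "k"}]}},
        },
        {
            "address": "google_bigquery_dataset.scratch",
            "type": "google_bigquery_dataset",
            "change": {"after": {"default_encryption_configuration": []}},
        },
    ]
}
print(datasets_missing_cmek(plan))
```

A CI step could exit non-zero whenever the returned list is not empty.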

Tip: Combine the preventive org policy with continuous detection in Lensix. The org policy stops new violations, and the bigquery_nocmk check catches anything created before the policy existed or in projects outside its scope.


Best practices

  • Match key and dataset location. A regional key only works with a dataset in the same region. For multi-region datasets, use a key ring in the matching Cloud KMS multi-region (us for US, europe for EU).
  • Enable automatic rotation. A 90-day rotation period is a common baseline. Older key versions remain available to decrypt data already encrypted with them, so rotation does not break existing tables.
  • Never grant broad KMS roles. The BigQuery service account needs only cryptoKeyEncrypterDecrypter on the specific key, not project-wide KMS admin.
  • Separate duties. Keep KMS key administration in a dedicated security project with its own IAM, away from the teams that consume the data. The key admin can revoke; the data team cannot.
  • Monitor key usage. Cloud Audit Logs records encrypt and decrypt calls once Data Access audit logs are enabled for Cloud KMS, so turn them on for the key's project. Alert on unexpected denials, which usually signal a misconfigured IAM binding or a job using the wrong key.
  • Document the recovery path. Because destroying a key version destroys the data, write down who can disable versus destroy keys, and protect destroy operations behind an approval workflow.
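The monitoring practice above can be sketched as a small filter over exported audit log entries. This assumes entries are dicts in the Cloud Logging LogEntry JSON shape with a protoPayload audit record, and that a denial shows up as status code 7 (PERMISSION_DENIED in google.rpc.Code). The function name and sample entries are hypothetical.

```python
PERMISSION_DENIED = 7  # google.rpc.Code.PERMISSION_DENIED


def kms_denials(entries: list[dict]) -> list[tuple]:
    """Return (principal, resource) pairs for denied Cloud KMS calls."""
    denied = []
    for entry in entries:
        payload = entry.get("protoPayload", {})
        if payload.get("serviceName") != "cloudkms.googleapis.com":
            continue
        if payload.get("status", {}).get("code") == PERMISSION_DENIED:
            principal = payload.get("authenticationInfo", {}).get("principalEmail")
            denied.append((principal, payload.get("resourceName")))
    return denied


# Hypothetical log entries for illustration only.
entries = [
    {
        "protoPayload": {
            "serviceName": "cloudkms.googleapis.com",
            "status": {"code": 7, "message": "Permission denied"},
            "authenticationInfo": {"principalEmail": "bq-123456789012@bigquery-encryption.iam.gserviceaccount.com"},
            "resourceName": "projects/my-project/locations/us/keyRings/bq-keyring/cryptoKeys/bq-cmek",
        }
    },
    {"protoPayload": {"serviceName": "cloudkms.googleapis.com", "status": {}}},
    {"protoPayload": {"serviceName": "bigquery.googleapis.com", "status": {"code": 7}}},
]
for principal, resource in kms_denials(entries):
    print(f"denied: {principal} on {resource}")
```

In practice the same condition would more likely live in a log-based alert than in a script, but the fields to match on are the same.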

Default encryption protects data from someone walking off with a disk. CMEK adds what default encryption cannot: a way to revoke access yourself, an audit trail of key use, and evidence for an auditor's checklist. They solve different problems, and for sensitive datasets you want both layers working together.

Once CMEK is in place and enforced by org policy, the bigquery_nocmk check should stay green. If it flips back to failing, that is a strong signal that a dataset was created outside your guardrails, which is worth investigating on its own.