Fix EMR In-Transit Encryption Disabled on AWS

TL;DR

This check flags EMR clusters whose security configuration leaves in-transit encryption off, meaning data moving between cluster nodes and frameworks travels in plaintext. Create a security configuration with in-transit encryption enabled and attach it to your clusters at launch.

EMR clusters chew through large volumes of data, and a lot of that data moves around the network constantly: between core and task nodes during a shuffle, between the master and worker daemons, and across framework-specific channels in Spark, Hadoop, Presto, and Hive. If in-transit encryption is disabled, all of that traffic crosses the wire unencrypted. This check, emr_notls, looks at the security configuration attached to your EMR clusters and fails when in-transit encryption is not turned on.

What this check detects

Amazon EMR uses a reusable object called a security configuration to control encryption settings. A security configuration can enable encryption at rest, encryption in transit, or both. This check inspects the security configuration associated with each cluster and reports a finding when the in-transit encryption setting is absent or disabled.
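The decision described above can be sketched in a few lines of Python. This is an illustrative approximation, not the actual implementation of emr_notls: the function names and the shape of the returned finding are hypothetical, and the input is assumed to be the security configuration JSON that EMR stores for the cluster (or nothing, when no configuration is attached).

```python
import json
from typing import Any, Mapping, Optional


def in_transit_encryption_enabled(security_config: Optional[Mapping[str, Any]]) -> bool:
    """Return True only when the configuration explicitly enables in-transit encryption."""
    if not security_config:
        return False  # no security configuration attached at all
    encryption = security_config.get("EncryptionConfiguration") or {}
    # An absent key and an explicit false are both treated as "disabled".
    return encryption.get("EnableInTransitEncryption") is True


def evaluate_cluster(cluster_name: str, security_config_json: Optional[str]) -> dict:
    """Build a pass/fail finding for one cluster (hypothetical finding shape)."""
    config = json.loads(security_config_json) if security_config_json else None
    passed = in_transit_encryption_enabled(config)
    return {
        "cluster": cluster_name,
        "check": "emr_notls",
        "status": "PASS" if passed else "FAIL",
    }
```

Note that a configuration with only at-rest encryption enabled still fails under this logic, because the in-transit flag is evaluated on its own.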

In-transit encryption in EMR covers several distinct channels depending on the applications you run:

Hadoop MapReduce shuffle and RPC traffic between nodes
Spark internal communication, including block transfer and shuffle
Presto / Trino internal node-to-node communication (on supported EMR releases)
HDFS data transfer between DataNodes
TLS for the EMR-internal control channels

Note: In-transit encryption requires TLS certificates. EMR can either fetch certificates from an S3 zip bundle (PEM mode) or generate them via a custom certificate provider you supply as a JAR. You configure this once in the security configuration, then reuse it across many clusters.

Why it matters

The risk here is straightforward: anyone who can observe traffic inside the cluster's network can read the data being processed. That sounds unlikely until you remember how EMR clusters actually run.

EMR nodes live in a VPC subnet. If an attacker compromises a single instance in that subnet, or if a misconfigured security group exposes node ports, or if a malicious workload runs alongside your jobs in a shared account, plaintext traffic becomes an easy target. Data that you carefully encrypted at rest in S3 gets decrypted, pulled into the cluster, and then shuffled between nodes in the clear. The protection you paid for at rest evaporates the moment processing starts.

The business impact is concrete for regulated workloads. PCI DSS, HIPAA, and many internal data-handling policies explicitly require encryption of sensitive data in transit, not just at rest. An EMR cluster processing cardholder data or PHI without in-transit encryption is a compliance gap that an auditor will find, and a breach involving that data carries reporting obligations and penalties.

Warning: Encryption at rest and encryption in transit are independent settings. A cluster can have at-rest encryption enabled and still fail this check. Do not assume one implies the other.

How to fix it

You cannot toggle in-transit encryption on a running cluster. It is baked into the security configuration the cluster was launched with. The fix is to create a security configuration that enables in-transit encryption, then launch clusters with it attached.

Step 1: Prepare TLS certificates

EMR needs certificates for the encrypted channels. The simplest approach for getting started is the PEM artifact method: bundle a private key and certificate chain into a zip in S3. For production, generate certificates from your own CA or a custom certificate provider.

# Generate a self-signed cert for testing (use a real CA in production)
openssl req -x509 -newkey rsa:2048 -keyout privateKey.pem \
  -out certificateChain.pem -days 365 -nodes \
  -subj "/CN=*.ec2.internal"

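# For a self-signed cert, the certificate also serves as the trusted certificate
cp certificateChain.pem trustedCertificates.pem

# Bundle and upload to S3
zip my-certs.zip privateKey.pem certificateChain.pem trustedCertificates.pem
aws s3 cp my-certs.zip s3://my-emr-config-bucket/certs/my-certs.zip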

Note: The certificate common name must match the private DNS domain of the EMR nodes. In us-east-1 that domain is usually *.ec2.internal; in other regions it follows the pattern *.<region>.compute.internal (for example, *.us-west-2.compute.internal). A VPC with a custom DHCP options set may use a different domain. Mismatched CN values cause cluster provisioning to fail.
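A small helper can derive the wildcard CN to pass to openssl. This is a minimal sketch that assumes the default AWS private DNS naming (ec2.internal in us-east-1, <region>.compute.internal elsewhere); the function name is hypothetical, and a VPC with a custom domain name would need its own value.

```python
def emr_wildcard_common_name(region: str) -> str:
    """Return the wildcard CN for default EC2 private DNS names in a region.

    Assumption: the VPC uses the AWS-provided private DNS domain.
    """
    region = region.strip().lower()
    if not region:
        raise ValueError("region must not be empty")
    if region == "us-east-1":
        # us-east-1 uses a different private DNS domain than other regions.
        return "*.ec2.internal"
    return f"*.{region}.compute.internal"
```

The result slots into the -subj argument, for example "/CN=" followed by the value returned for your region.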

Step 2: Create a security configuration with in-transit encryption

Define the security configuration JSON. The block below enables in-transit encryption using the PEM artifact you uploaded.

{
  "EncryptionConfiguration": {
    "EnableInTransitEncryption": true,
    "EnableAtRestEncryption": false,
    "InTransitEncryptionConfiguration": {
      "TLSCertificateConfiguration": {
        "CertificateProviderType": "PEM",
        "S3Object": "s3://my-emr-config-bucket/certs/my-certs.zip"
      }
    }
  }
}

Create it with the CLI:

aws emr create-security-configuration \
  --name "emr-intransit-encryption" \
  --security-configuration file://security-config.json

Tip: Enable at-rest encryption in the same security configuration while you are here. Setting EnableAtRestEncryption to true with an S3 and local disk encryption block closes both gaps in one object and saves you a second remediation cycle.

Step 3: Launch clusters with the security configuration

aws emr create-cluster \
  --name "analytics-cluster" \
  --release-label emr-7.1.0 \
  --applications Name=Spark Name=Hadoop \
  --security-configuration "emr-intransit-encryption" \
  --instance-type m5.xlarge \
  --instance-count 3 \
  --use-default-roles

Console steps

Open the EMR console and go to Security configurations in the left navigation.
Choose Create, give it a name, and check Enable in-transit encryption.
Select your certificate provider (PEM with the S3 path, or a custom provider JAR) and save.
When creating a cluster, expand Security configuration and select the one you just made.

Danger: You cannot retrofit a running cluster. Remediating an existing non-compliant cluster means launching a replacement with the secure configuration and terminating the old one. Drain or checkpoint running jobs first, because terminating an EMR cluster permanently destroys all instance storage and any data on local HDFS that has not been written to S3.

How to prevent it from happening again

The reliable way to keep this from recurring is to make in-transit encryption part of how clusters get provisioned, not a setting someone remembers to check.

Define the security configuration in Terraform

resource "aws_emr_security_configuration" "secure" {
  name = "emr-intransit-encryption"

  configuration = jsonencode({
    EncryptionConfiguration = {
      EnableInTransitEncryption = true
      EnableAtRestEncryption    = true
      InTransitEncryptionConfiguration = {
        TLSCertificateConfiguration = {
          CertificateProviderType = "PEM"
          S3Object                = "s3://my-emr-config-bucket/certs/my-certs.zip"
        }
      }
      AtRestEncryptionConfiguration = {
        S3EncryptionConfiguration = {
          EncryptionMode = "SSE-KMS"
          AwsKmsKey      = aws_kms_key.emr.arn
        }
        LocalDiskEncryptionConfiguration = {
          EncryptionKeyProviderType = "AwsKms"
          AwsKmsKey                 = aws_kms_key.emr.arn
        }
      }
    }
  })
}

resource "aws_emr_cluster" "analytics" {
  name                   = "analytics-cluster"
  release_label          = "emr-7.1.0"
  applications           = ["Spark", "Hadoop"]
  security_configuration = aws_emr_security_configuration.secure.name
  service_role           = aws_iam_role.emr_service.arn
  # ... instance groups, etc.
}

With the security configuration referenced directly in the cluster resource, a Terraform plan that omits it becomes a visible diff in review.

Gate it in CI/CD with policy-as-code

Add a check that fails the pipeline when an EMR cluster has no security configuration or one without in-transit encryption. The Conftest / OPA Rego rule in this section covers the first case, a cluster with no security configuration at all, and runs against a Terraform plan:

# Save the plan, export it as JSON, and run the policy against it
terraform plan -out=plan.tfplan
terraform show -json plan.tfplan > plan.json
conftest test plan.json --policy policy/

package main

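# Deny EMR clusters that are created or updated without a security configuration.
# Rego v0 syntax; on OPA 1.0+ the rule head is written `deny contains msg if {`.
deny[msg] {
  resource := input.resource_changes[_]
  resource.type == "aws_emr_cluster"
  resource.change.after != null  # skip clusters that are being destroyed
  not resource.change.after.security_configuration
  msg := sprintf("EMR cluster '%s' has no security configuration", [resource.address])
}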
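The second condition, a security configuration that exists but leaves in-transit encryption off, can be checked against the same plan JSON. The sketch below is one hedged way to do it in Python. The function name is hypothetical, and it assumes the `configuration` attribute of `aws_emr_security_configuration` appears in the plan as a JSON string. When that string depends on values unknown at plan time (such as the ARN of a KMS key created in the same plan), it is absent from the plan, and the sketch reports it as unverifiable instead of guessing.

```python
import json
from typing import Any, Dict, List


def find_in_transit_gaps(plan: Dict[str, Any]) -> List[str]:
    """Return one message per planned security configuration that cannot be
    shown to enable in-transit encryption."""
    problems: List[str] = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "aws_emr_security_configuration":
            continue
        address = change.get("address", "<unknown>")
        after = (change.get("change") or {}).get("after")
        if after is None:
            continue  # resource is being destroyed, nothing to check
        raw = after.get("configuration")
        if not raw:
            problems.append(f"{address}: configuration not known at plan time")
            continue
        try:
            config = json.loads(raw)
        except ValueError:
            problems.append(f"{address}: configuration is not valid JSON")
            continue
        encryption = config.get("EncryptionConfiguration") if isinstance(config, dict) else None
        if not isinstance(encryption, dict) or encryption.get("EnableInTransitEncryption") is not True:
            problems.append(f"{address}: in-transit encryption is not enabled")
    return problems
```

In a pipeline, load plan.json with json.load, call the function, and exit non-zero when the returned list is not empty.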

Tip: Pair the pipeline gate with continuous detection. Lensix runs emr_notls against your live accounts, so clusters launched outside of CI (manual console launches, one-off scripts) still get caught instead of slipping through.

Enforce with an SCP

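For environments where you want a hard boundary, a Service Control Policy can deny elasticmapreduce:RunJobFlow for principals outside your approved provisioning roles, pushing every team toward the pipeline that attaches the security configuration. Whether an SCP can inspect the security configuration on the request itself depends on the condition keys EMR exposes, so confirm that in the Service Authorization Reference before relying on it.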

Best practices

Enable encryption in transit and at rest together. Treat them as a single baseline. A security configuration that does both removes ambiguity and matches what most compliance frameworks expect.
Use a real CA, not self-signed certs, in production. Self-signed PEM bundles are fine for proving the setup works, but production should use certificates from your internal CA or AWS Private CA (formerly AWS Certificate Manager Private CA), with a rotation plan.
Rotate certificates before they expire. Expired certificates in the S3 bundle will break new cluster launches. Track expiry and update the bundle ahead of time.
Standardize on a small number of named security configurations. One or two well-defined, secure configurations are easier to audit than dozens of ad hoc ones.
Lock down the network too. Encryption in transit is one layer. Keep EMR nodes in private subnets, restrict security group ingress, and avoid placing untrusted workloads in the same subnet.
Pin recent EMR releases. Application-level in-transit encryption coverage (especially for Spark and Presto) improves across releases, so newer release labels give you broader protection.

Getting in-transit encryption right on EMR is mostly a one-time setup that pays off every time a cluster spins up. Define the security configuration once, reference it everywhere through IaC, gate it in CI, and let continuous detection catch the stragglers.
