SageMaker Notebook Publicly Accessible: Why Direct Internet Access Is a Risk and How to Fix It

Learn why SageMaker notebooks with direct internet access enabled are a security risk, and how to disable it, route through a VPC, and enforce it in CI/CD.

TL;DR

This check flags SageMaker notebook instances created with direct internet access enabled. That setting gives code running in the notebook an egress path to the public internet that bypasses your VPC controls, which makes it easier to exfiltrate data and execution role credentials. Disable direct internet access and route traffic through a VPC with a NAT gateway and VPC endpoints instead.

SageMaker notebook instances are convenient. You spin one up, open Jupyter in your browser, and start training models against your data lake within minutes. That convenience comes with a default that surprises a lot of teams: direct internet access is enabled unless you turn it off. Even when you attach the notebook instance to a VPC, you can leave it on, which gives the instance a route to the public internet through an AWS-managed network interface rather than through your own VPC controls.

This Lensix check, sagemaker_public, detects exactly that configuration. If a notebook instance has direct internet access enabled, it shows up as a finding.


What this check detects

SageMaker notebook instances have a setting called DirectInternetAccess. It has two values:

  • Enabled — the instance reaches the internet through an AWS-managed elastic network interface that sits outside your VPC. Your route tables, NAT configuration, and security groups do not gate that path.
  • Disabled — the instance only reaches the internet through the VPC and subnet you attach it to, meaning a NAT gateway, VPC endpoints, and your own security groups control everything.

The check fails when DirectInternetAccess is set to Enabled. You can confirm the setting on any instance with the CLI:

aws sagemaker describe-notebook-instance \
  --notebook-instance-name my-notebook \
  --query 'DirectInternetAccess'

A return value of "Enabled" is the condition this check flags.
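
To run the same check across a whole region, a small wrapper can describe each instance in turn. The following is a minimal sketch, assuming the AWS CLI is already configured for the target account and region; the function name find_public_notebooks is made up for this example.

```shell
# Print the name of every notebook instance in the current region whose
# DirectInternetAccess setting is "Enabled".
# The body runs in a subshell, so its variables do not leak into the caller.
find_public_notebooks() (
  # The list call does not return DirectInternetAccess, so each
  # instance is described individually.
  names=$(aws sagemaker list-notebook-instances \
    --query 'NotebookInstances[].NotebookInstanceName' \
    --output text)
  for name in $names; do
    access=$(aws sagemaker describe-notebook-instance \
      --notebook-instance-name "$name" \
      --query 'DirectInternetAccess' \
      --output text)
    if [ "$access" = "Enabled" ]; then
      echo "$name"
    fi
  done
)

# Usage: find_public_notebooks
```

The function prints one instance name per line and prints nothing when every instance is already VPC-only.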

Note: "Direct internet access" does not mean the Jupyter UI itself is open to the world. Access to the notebook UI is still gated by IAM and presigned URLs. What it changes is the egress path for code running inside the notebook, and it can be combined with an open security group to widen exposure further. The risk is about what the notebook can reach and what can reach back, not just the login page.


Why it matters

A SageMaker notebook is rarely an isolated toy. It usually holds an IAM execution role with permissions to read S3 buckets, query Athena, pull from RDS, or write to feature stores. It often has data scientists' code, API keys pasted into cells, and cached datasets sitting on its attached EBS volume. That makes it a high-value target.

When direct internet access is enabled, you lose the ability to inspect and control the notebook's traffic with your own VPC tooling. A few concrete consequences:

  • Data exfiltration becomes easier. If an attacker gains code execution inside the notebook (through a malicious pip package, a compromised dependency, or a leaked presigned URL), they have an unmonitored egress path straight to the internet. With a VPC-only setup, you can force traffic through endpoints and inspect or block it.
  • Credential theft scales fast. The notebook's execution role credentials are available from the instance metadata service. Combined with open egress, exfiltrating those credentials to an attacker-controlled server is trivial.
  • You bypass your own guardrails. Teams invest in VPC flow logs, egress firewalls, and DNS filtering. Direct internet access routes around all of it because the traffic never touches your VPC.

Warning: Pairing direct internet access with a security group that allows inbound 0.0.0.0/0 is the worst case. The notebook becomes both reachable and able to talk back out freely. Always check the attached security groups when you find this misconfiguration.
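
One way to check the attached security groups from the CLI is sketched below. It assumes the caller can describe both the notebook instance and its security groups, and it only looks at IPv4 rules; the function name is illustrative.

```shell
# Print the IDs of security groups attached to a notebook instance that
# contain at least one inbound rule open to 0.0.0.0/0.
# The body runs in a subshell, so its variables do not leak into the caller.
open_notebook_security_groups() (
  notebook="$1"
  groups=$(aws sagemaker describe-notebook-instance \
    --notebook-instance-name "$notebook" \
    --query 'SecurityGroups' \
    --output text)
  # Instances created without a VPC have no security groups to inspect.
  if [ -n "$groups" ] && [ "$groups" != "None" ]; then
    # The ip-permission.cidr filter matches inbound IPv4 rules only.
    # $groups is left unquoted on purpose so each ID becomes its own argument.
    aws ec2 describe-security-groups \
      --group-ids $groups \
      --filters Name=ip-permission.cidr,Values=0.0.0.0/0 \
      --query 'SecurityGroups[].GroupId' \
      --output text
  fi
)

# Usage: open_notebook_security_groups my-notebook
```

Empty output means none of the attached groups has an inbound rule open to the world.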


How to fix it

The core fix is to set DirectInternetAccess to Disabled and route the notebook through a VPC with a controlled egress path. There is one important catch: you cannot change this setting on an existing notebook instance in place. You have to recreate the instance.

Step 1: Prepare the network

Make sure the target VPC and subnet have a way to reach the internet without direct access, typically a NAT gateway in a public subnet, plus VPC endpoints for the AWS services the notebook needs (S3, SageMaker API, STS, CloudWatch Logs). VPC endpoints keep AWS API traffic off the public internet entirely.

# Example: create an S3 gateway endpoint so the notebook reaches S3 privately
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0abc123 \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-0def456 \
  --vpc-endpoint-type Gateway
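
The other services use interface endpoints rather than gateway endpoints. The sketch below creates one per service; the service list is an assumption about what a typical notebook calls, all IDs are placeholders, and private DNS only works if the VPC has DNS support and DNS hostnames enabled. The endpoint security group also needs to allow inbound HTTPS from the notebook.

```shell
# Create interface endpoints for the AWS APIs a notebook commonly calls.
# The body runs in a subshell, so its variables do not leak into the caller.
create_notebook_interface_endpoints() (
  region="$1" vpc_id="$2" subnet_id="$3" sg_id="$4"
  # Adjust this list to the services your notebooks actually use.
  for service in sagemaker.api sagemaker.runtime sts logs; do
    aws ec2 create-vpc-endpoint \
      --vpc-id "$vpc_id" \
      --vpc-endpoint-type Interface \
      --service-name "com.amazonaws.${region}.${service}" \
      --subnet-ids "$subnet_id" \
      --security-group-ids "$sg_id" \
      --private-dns-enabled
  done
)

# Usage:
#   create_notebook_interface_endpoints us-east-1 vpc-0abc123 \
#     subnet-0a1b2c3d sg-0123456789abcdef0
```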

Step 2: Back up anything on the instance

Notebook files live on the attached EBS volume. Before deleting the instance, push your notebooks to a Git repository or copy them to S3. SageMaker's lifecycle configurations and Git integration make this straightforward.

Danger: Deleting a notebook instance destroys its EBS volume and any unsaved work on it. Confirm your notebooks are committed to Git or copied to S3 before running the delete command below.
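
If Git is not an option, a copy to S3 from a terminal on the notebook itself is a reasonable fallback. This is a sketch, assuming the execution role can write to the target bucket; the bucket and prefix are placeholders.

```shell
# Copy the notebook's persistent working directory to S3.
# Run this from a terminal on the notebook instance.
# The body runs in a subshell, so its variables do not leak into the caller.
backup_notebook_dir() (
  bucket="$1" prefix="$2"
  # /home/ec2-user/SageMaker is the directory stored on the attached EBS volume.
  # Jupyter checkpoint folders are skipped because they only hold autosave copies.
  aws s3 sync /home/ec2-user/SageMaker "s3://${bucket}/${prefix}/" \
    --exclude '*.ipynb_checkpoints/*'
)

# Usage: backup_notebook_dir my-backup-bucket notebooks/my-notebook
```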

Step 3: Stop and delete the old instance

aws sagemaker stop-notebook-instance \
  --notebook-instance-name my-notebook

# Wait until status is "Stopped"
aws sagemaker wait notebook-instance-stopped \
  --notebook-instance-name my-notebook

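aws sagemaker delete-notebook-instance \
  --notebook-instance-name my-notebook

# Wait until deletion finishes so the name can be reused in Step 4
aws sagemaker wait notebook-instance-deleted \
  --notebook-instance-name my-notebook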

Step 4: Recreate with direct internet access disabled

aws sagemaker create-notebook-instance \
  --notebook-instance-name my-notebook \
  --instance-type ml.t3.medium \
  --role-arn arn:aws:iam::111122223333:role/SageMakerExecutionRole \
  --subnet-id subnet-0a1b2c3d \
  --security-group-ids sg-0123456789abcdef0 \
  --direct-internet-access Disabled

The key flag is --direct-internet-access Disabled. The instance now reaches the internet only through the subnet you provided, which means your NAT gateway and security groups are in the path.
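
It is worth confirming the setting once the new instance is up. A minimal sketch of that verification, assuming the same instance name as above; the function name is illustrative.

```shell
# Wait for the recreated instance to start, then confirm that
# direct internet access is disabled.
# The body runs in a subshell, so "exit" only ends the function.
verify_notebook_vpc_only() (
  notebook="$1"
  aws sagemaker wait notebook-instance-in-service \
    --notebook-instance-name "$notebook" || exit 1
  access=$(aws sagemaker describe-notebook-instance \
    --notebook-instance-name "$notebook" \
    --query 'DirectInternetAccess' \
    --output text)
  if [ "$access" = "Disabled" ]; then
    echo "OK: $notebook has direct internet access disabled"
  else
    echo "FAIL: $notebook reports DirectInternetAccess=$access" >&2
    exit 1
  fi
)

# Usage: verify_notebook_vpc_only my-notebook
```

The non-zero exit status on failure makes the function usable as a gate in a deployment script.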

Console steps

  1. Open the SageMaker console and go to Notebook instances.
  2. Stop and delete the offending instance (after backing up your notebooks).
  3. Choose Create notebook instance.
  4. Under Network, select your VPC and a private subnet.
  5. For Direct internet access, choose Disable — Access the internet through a VPC.
  6. Attach a security group with no open inbound rules and finish creation.

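Tip: If your team relies on these notebooks daily, consider migrating to SageMaker Studio instead of standalone notebook instances. A Studio domain can run in VpcOnly network mode, which routes Studio traffic through your VPC and centralizes user management in one place. Studio's default network mode is PublicInternetOnly, so VpcOnly has to be chosen explicitly when the domain is created.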


Fixing it with Infrastructure as Code

If you manage SageMaker through Terraform, set direct_internet_access explicitly and never rely on the default, which is Enabled:

resource "aws_sagemaker_notebook_instance" "ml" {
  name                   = "my-notebook"
  instance_type          = "ml.t3.medium"
  role_arn               = aws_iam_role.sagemaker.arn
  subnet_id              = aws_subnet.private.id
  security_groups        = [aws_security_group.notebook.id]
  direct_internet_access = "Disabled"
}

For CloudFormation, the equivalent property is DirectInternetAccess: Disabled on the AWS::SageMaker::NotebookInstance resource, and you must also supply SubnetId and SecurityGroupIds for the VPC-only path to work.


How to prevent it from happening again

One-time fixes drift. Bake the rule into the places where notebooks get created.

Service Control Policy to block the bad setting

An SCP can deny creating notebook instances unless direct internet access is disabled. Apply it at the organization or OU level:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyPublicSageMakerNotebooks",
      "Effect": "Deny",
      "Action": "sagemaker:CreateNotebookInstance",
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "sagemaker:DirectInternetAccess": "Disabled"
        }
      }
    }
  ]
}
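
Rolling the policy out from the CLI could look like the sketch below. It assumes the JSON above is saved to a local file, that the command runs with credentials for the organization's management account, and that SCPs are enabled for the organization; the file name and target ID are placeholders.

```shell
# Create the SCP from a local JSON file and attach it to an OU or account.
# The body runs in a subshell, so "exit" only ends the function.
apply_notebook_scp() (
  policy_file="$1" target_id="$2"
  policy_id=$(aws organizations create-policy \
    --name DenyPublicSageMakerNotebooks \
    --description "Deny notebook instances with direct internet access" \
    --type SERVICE_CONTROL_POLICY \
    --content "file://${policy_file}" \
    --query 'Policy.PolicySummary.Id' \
    --output text) || exit 1
  aws organizations attach-policy \
    --policy-id "$policy_id" \
    --target-id "$target_id"
)

# Usage: apply_notebook_scp deny-public-notebooks.json <ou-or-account-id>
```

Keep in mind that SCPs do not restrict the management account itself, so notebooks created there still need the detective check.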

Policy-as-code in CI/CD

Catch it before it ships. A Checkov scan in your pipeline will fail the build for any Terraform configuration in the scanned directory that leaves direct internet access enabled:

checkov -d ./infra --check CKV_AWS_122

The corresponding OPA/Rego rule is simple to write if you prefer Conftest:

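package sagemaker

# A missing attribute is treated as "Enabled", because that is the provider
# default. Without this, a resource that omits the setting would pass.
deny[msg] {
  resource := input.resource.aws_sagemaker_notebook_instance[name]
  object.get(resource, "direct_internet_access", "Enabled") != "Disabled"
  msg := sprintf("Notebook '%s' must set direct_internet_access = Disabled", [name])
}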

Tip: Run the Lensix sagemaker_public check on a schedule so any notebook created outside your pipeline, for example through the console by a data scientist in a hurry, gets caught within minutes rather than at the next audit.


Best practices

  • Default to VPC-only. Make disabled direct internet access the standard in every module, template, and runbook. Engineers should have to justify enabling it, not the other way around.
  • Use VPC endpoints for AWS services. Gateway endpoints for S3 and DynamoDB and interface endpoints for SageMaker, STS, and CloudWatch keep API traffic private and often remove the need for a NAT gateway for AWS-only workloads.
  • Scope execution roles tightly. A notebook role should only reach the specific buckets and services it needs. Least privilege limits the blast radius if the instance is compromised.
  • Lock down security groups. No inbound 0.0.0.0/0 rules, and egress restricted to what the workload actually requires.
  • Enable VPC flow logs. Once traffic flows through your VPC, you can actually see and alert on unexpected egress.
  • Prefer SageMaker Studio in VpcOnly mode for teams, since it centralizes the network controls and reduces per-instance drift.
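
For the flow logs item, a sketch of enabling them on the notebook's VPC with CloudWatch Logs as the destination. It assumes the log group and an IAM role that lets the flow logs service write to it already exist; all three arguments are placeholders.

```shell
# Enable flow logs for all traffic in a VPC, delivered to CloudWatch Logs.
# The body runs in a subshell, so its variables do not leak into the caller.
enable_vpc_flow_logs() (
  vpc_id="$1" log_group="$2" role_arn="$3"
  aws ec2 create-flow-logs \
    --resource-type VPC \
    --resource-ids "$vpc_id" \
    --traffic-type ALL \
    --log-destination-type cloud-watch-logs \
    --log-group-name "$log_group" \
    --deliver-logs-permission-arn "$role_arn"
)

# Usage:
#   enable_vpc_flow_logs vpc-0abc123 /vpc/notebook-flow-logs \
#     arn:aws:iam::111122223333:role/FlowLogsRole
```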

Direct internet access on a notebook is one of those settings that looks harmless during setup and turns into an exfiltration path during an incident. Disable it, route through your VPC, and enforce the rule in code so it never comes back.