Fix Expired API Gateway Custom Domain Certificates

TL;DR

This check flags API Gateway custom domains whose SSL/TLS certificate has expired, which means clients get TLS errors and your API is effectively down. Reissue or reimport a valid certificate (ideally via ACM with auto-renewal) and reassociate it with the custom domain.

A custom domain like api.yourcompany.com in front of API Gateway needs a valid TLS certificate to terminate HTTPS. When that certificate expires, every client that respects certificate validation, which is basically every browser, SDK, and mobile app, refuses to connect. The result is a hard outage that looks identical to the API being down, except nothing in your application logs explains why.

The apigw_certexpired check looks at the certificates attached to your API Gateway custom domains and reports any that are past their NotAfter date. It is a simple condition with an outsized blast radius.

What this check detects

API Gateway lets you map a friendly custom domain name to your APIs instead of exposing the default execute-api endpoint. To serve traffic over HTTPS on that domain, the custom domain has an associated ACM (AWS Certificate Manager) certificate. For edge-optimized domains the certificate lives in us-east-1; for regional domains it lives in the same region as the API.

The check inspects each custom domain's certificate and compares the expiry date against the current time. If the certificate's validity period has already ended, the check fails. This covers both REST APIs (v1) and HTTP/WebSocket APIs (v2), and it applies whether the certificate was issued by ACM or imported from a third-party CA.
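
As a rough illustration of that comparison (not the check's actual implementation), the sketch below tests a NotAfter timestamp against the clock. It assumes GNU date and an ISO 8601 timestamp of the kind AWS CLI v2 prints; is_expired is a hypothetical helper name.

```bash
# Sketch only: is_expired is a hypothetical helper, not part of any AWS tool.
# Assumes GNU date (Linux); BSD/macOS date does not understand -d.

# Succeeds (exit 0) when the NotAfter timestamp is at or before "now".
# $1: NotAfter as an ISO 8601 string, e.g. 2030-01-01T00:00:00+00:00
# $2: optional current time in epoch seconds (defaults to the real clock)
is_expired() {
  local not_after="$1"
  local now_epoch="${2:-$(date +%s)}"
  local exp_epoch
  exp_epoch=$(date -d "$not_after" +%s) || return 2
  [ "$now_epoch" -ge "$exp_epoch" ]
}
```

Taking the current time as an optional argument is a design choice for testability; in normal use you would call is_expired with the NotAfter value alone.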

Note: ACM-issued public certificates that pass validation usually renew themselves automatically. The most common reason a certificate expires anyway is that it was imported from an external CA, or the domain validation lapsed (more on that below). So a failure here often points at a certificate ACM does not manage for you.

Why it matters

An expired certificate is not a slow degradation; it is a cliff. The moment the clock passes NotAfter, well-behaved clients start rejecting the TLS handshake. Here is what that looks like in practice:

Total outage for your API consumers. Mobile apps, partner integrations, and browser clients all fail with errors like CERT_HAS_EXPIRED or NET::ERR_CERT_DATE_INVALID.
No application error to chase. Your Lambda functions and backends are healthy. CloudWatch shows normal latency. The failure happens at TLS termination, before requests ever reach your code, so on-call engineers waste time looking in the wrong place.
Cascading failures. If internal services call your API over the custom domain, a single expired cert can take down a chain of dependent systems.
Loss of trust. Customers who see a browser security warning on your API or login domain remember it. Security teams treat expired certs as a sign of weak operational hygiene.

There is also a subtler risk. Teams under pressure to restore service sometimes tell clients to disable certificate validation or set verify=False as a "temporary" workaround. That workaround tends to outlive the incident and quietly opens the door to man-in-the-middle attacks for months afterward.

Warning: Certificate expiry incidents often land outside business hours, because expiry timestamps have nothing to do with your work calendar. Treat this check as something to fix proactively, not something to react to at 2am.

How to fix it

The fix is to get a valid certificate associated with the custom domain again. The exact steps depend on whether you can let ACM manage the certificate (strongly preferred) or you must reimport an external one.

Step 1: Identify the affected custom domain and certificate

List your custom domains and find which certificate each one uses.

# REST APIs (API Gateway v1)
aws apigateway get-domain-names \
  --query "items[].{Domain:domainName,Cert:certificateArn,RegionalCert:regionalCertificateArn}" \
  --output table

# HTTP / WebSocket APIs (API Gateway v2)
aws apigatewayv2 get-domain-names \
  --query "Items[].{Domain:DomainName,Cert:DomainNameConfigurations[0].CertificateArn}" \
  --output table

Then check the expiry on the certificate ARN you found. Query the region the certificate lives in: us-east-1 for edge-optimized domains, the API's own region for regional ones.

aws acm describe-certificate \
  --certificate-arn arn:aws:acm:us-east-1:111122223333:certificate/abc123 \
  --query "Certificate.{Status:Status,NotAfter:NotAfter,InUse:InUseBy,Renewal:RenewalEligibility}" \
  --output json
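
The region a certificate lives in is the fourth colon-separated field of its ARN, so it can be read from the ARN rather than remembered. A minimal sketch, with arn_region and describe_cert as illustrative helper names:

```bash
# Sketch only: arn_region and describe_cert are hypothetical helper names.

# Print the region embedded in an ARN (4th colon-separated field).
arn_region() {
  printf '%s\n' "$1" | cut -d: -f4
}

# Describe a certificate in whichever region its ARN names.
# Requires the AWS CLI and credentials; nothing runs until this is called.
describe_cert() {
  local arn="$1"
  aws acm describe-certificate \
    --certificate-arn "$arn" \
    --region "$(arn_region "$arn")" \
    --query "Certificate.{Status:Status,NotAfter:NotAfter}" \
    --output json
}
```

For example, describe_cert arn:aws:acm:us-east-1:111122223333:certificate/abc123 would query us-east-1 regardless of the CLI's default region.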

Step 2a: Request a fresh ACM-managed certificate (preferred)

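If the domain is publicly resolvable, request a new public certificate through ACM and validate it with DNS. ACM will then auto-renew it going forward. Request it in us-east-1 for an edge-optimized domain (as shown here) or in the API's region for a regional domain.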

aws acm request-certificate \
  --domain-name api.yourcompany.com \
  --validation-method DNS \
  --region us-east-1

ACM returns a new certificate ARN. Fetch the DNS validation record it needs and add the CNAME to your hosted zone.

aws acm describe-certificate \
  --certificate-arn arn:aws:acm:us-east-1:111122223333:certificate/NEW-ARN \
  --query "Certificate.DomainValidationOptions[].ResourceRecord" \
  --region us-east-1

Tip: If your DNS is in Route 53, ACM can write the validation record for you in the console with one click ("Create records in Route 53"). Leave that record in place afterward: a deleted validation CNAME is the single biggest cause of certificates failing to auto-renew.

Step 2b: Reimport an external certificate

If you must use a certificate from an external CA, import the renewed bundle. You can reuse the same ARN with --certificate-arn so you do not have to reassociate the domain.

aws acm import-certificate \
  --certificate-arn arn:aws:acm:us-east-1:111122223333:certificate/abc123 \
  --certificate fileb://cert.pem \
  --certificate-chain fileb://chain.pem \
  --private-key fileb://privatekey.pem \
  --region us-east-1
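
Before importing, it can be worth confirming locally that the renewed certificate is not itself already expired and that the private key belongs to it. The sketch below is one way to do that with openssl, assuming unencrypted PEM files; preflight_import is a hypothetical helper name.

```bash
# Sketch only: preflight_import is a hypothetical helper name.
# $1: path to the PEM certificate, $2: path to the unencrypted PEM private key

preflight_import() {
  local cert="$1"
  local key="$2"
  local cert_pub
  local key_pub

  # -checkend 0 exits non-zero if the certificate has already expired.
  if ! openssl x509 -in "$cert" -noout -checkend 0 >/dev/null; then
    echo "certificate in $cert is already expired" >&2
    return 1
  fi

  # The public key derived from the private key must equal the one in the cert.
  cert_pub=$(openssl x509 -in "$cert" -noout -pubkey) || return 1
  key_pub=$(openssl pkey -in "$key" -pubout) || return 1
  if [ "$cert_pub" != "$key_pub" ]; then
    echo "private key $key does not match certificate $cert" >&2
    return 1
  fi

  echo "ok"
}
```

Running preflight_import cert.pem privatekey.pem before the import catches the two mistakes that are easiest to make under pressure: uploading the old bundle again, or pairing the new certificate with the old key.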

Step 3: Associate the new certificate with the custom domain

If you reimported to the same ARN, the domain already points at it and you are done. If you requested a brand new certificate, update the custom domain to use the new ARN.

Danger: Updating a custom domain certificate affects live traffic. Make the change during a maintenance window if possible, confirm the new certificate is ISSUED first, and have a rollback ARN ready. A typo in the ARN here will break the domain just as surely as an expired cert.

# REST API edge-optimized
aws apigateway update-domain-name \
  --domain-name api.yourcompany.com \
  --patch-operations op=replace,path=/certificateArn,value=arn:aws:acm:us-east-1:111122223333:certificate/NEW-ARN

# HTTP / WebSocket API
aws apigatewayv2 update-domain-name \
  --domain-name api.yourcompany.com \
  --domain-name-configurations CertificateArn=arn:aws:acm:us-east-1:111122223333:certificate/NEW-ARN
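
The REST example above targets an edge-optimized domain. For a regional REST domain the certificate sits under /regionalCertificateArn instead. A small sketch that picks the path from the endpoint type; cert_patch_path and swap_rest_cert are illustrative names:

```bash
# Sketch only: cert_patch_path and swap_rest_cert are hypothetical helper names.

# Print the patch path that holds the certificate ARN for a REST custom domain.
# $1: endpoint type, EDGE or REGIONAL
cert_patch_path() {
  case "$1" in
    EDGE)     echo "/certificateArn" ;;
    REGIONAL) echo "/regionalCertificateArn" ;;
    *)        echo "unknown endpoint type: $1" >&2; return 1 ;;
  esac
}

# Point a REST custom domain at a new certificate ARN.
# Requires the AWS CLI and credentials; nothing runs until this is called.
swap_rest_cert() {
  local domain="$1"
  local endpoint_type="$2"
  local new_arn="$3"
  local path
  path=$(cert_patch_path "$endpoint_type") || return 1
  aws apigateway update-domain-name \
    --domain-name "$domain" \
    --patch-operations "op=replace,path=${path},value=${new_arn}"
}
```

The endpoint type can be read with aws apigateway get-domain-name --domain-name api.yourcompany.com --query "endpointConfiguration.types[0]" --output text.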

Step 4: Verify

# Confirm the served certificate and its expiry
echo | openssl s_client -servername api.yourcompany.com \
  -connect api.yourcompany.com:443 2>/dev/null \
  | openssl x509 -noout -dates -subject

You should see a notAfter date comfortably in the future.
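
To turn "comfortably in the future" into a number, the notAfter line can be converted to days remaining. A minimal sketch, assuming GNU date; days_left_from_openssl is a hypothetical helper name.

```bash
# Sketch only: days_left_from_openssl is a hypothetical helper name.
# Assumes GNU date (Linux); BSD/macOS date does not understand -d.

# Print whole days between "now" and the expiry in an openssl notAfter line.
# $1: a line such as "notAfter=Jan  1 00:00:00 2030 GMT"
# $2: optional current time in epoch seconds (defaults to the real clock)
days_left_from_openssl() {
  local line="$1"
  local now_epoch="${2:-$(date +%s)}"
  local not_after="${line#notAfter=}"   # strip the "notAfter=" prefix
  local exp_epoch
  exp_epoch=$(date -u -d "$not_after" +%s) || return 1
  echo $(( (exp_epoch - now_epoch) / 86400 ))
}
```

Feed it the output of the verification command with -enddate in place of -dates -subject, so that only the notAfter line is printed.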

How to prevent it from happening again

Manual certificate management does not scale and humans forget renewal dates. Build the renewal and the alerting into your platform.

Prefer ACM-issued certificates with DNS validation

Public certificates that ACM issues and validates via DNS renew automatically with no action required, as long as the validation CNAME stays in place. Avoid imported certificates unless a compliance requirement forces a specific CA. Defining the certificate in Terraform keeps the validation record alongside it:

resource "aws_acm_certificate" "api" {
  domain_name       = "api.yourcompany.com"
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_route53_record" "api_validation" {
  for_each = {
    for dvo in aws_acm_certificate.api.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      type   = dvo.resource_record_type
      record = dvo.resource_record_value
    }
  }

  zone_id = aws_route53_zone.main.zone_id
  name    = each.value.name
  type    = each.value.type
  records = [each.value.record]
  ttl     = 60
}

resource "aws_acm_certificate_validation" "api" {
  certificate_arn         = aws_acm_certificate.api.arn
  validation_record_fqdns = [for r in aws_route53_record.api_validation : r.fqdn]
}

Alert well before expiry

ACM publishes a DaysToExpiry metric to CloudWatch and, by default, starts sending a daily ACM Certificate Approaching Expiration event through EventBridge 45 days before expiry. Route that event to your alerting channel so someone sees it weeks ahead, not after the outage.

{
  "source": ["aws.acm"],
  "detail-type": ["ACM Certificate Approaching Expiration"]
}

Wire this EventBridge rule to an SNS topic or a chatops integration so the warning lands where your team actually looks.
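
One way to do that wiring from the CLI is sketched below. The rule name, target Id, and topic ARN are placeholders, and the helper names are hypothetical; the SNS topic is assumed to exist already with a policy that lets EventBridge publish to it.

```bash
# Sketch only: acm_expiry_pattern and create_expiry_rule are hypothetical helpers.

# Print the event pattern shown above as a single-line JSON string.
acm_expiry_pattern() {
  printf '%s' '{"source":["aws.acm"],"detail-type":["ACM Certificate Approaching Expiration"]}'
}

# Create the rule and attach an SNS topic as its target.
# $1: rule name (placeholder), $2: ARN of an existing SNS topic (assumed)
# Requires the AWS CLI and credentials; nothing runs until this is called.
create_expiry_rule() {
  local rule_name="$1"
  local topic_arn="$2"
  aws events put-rule \
    --name "$rule_name" \
    --event-pattern "$(acm_expiry_pattern)"
  aws events put-targets \
    --rule "$rule_name" \
    --targets "Id=sns-alert,Arn=${topic_arn}"
}
```

A call might look like create_expiry_rule acm-cert-expiring arn:aws:sns:us-east-1:111122223333:cert-alerts, run once in each region that holds certificates.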

Tip: For imported certificates, the 45-day event still fires but ACM cannot renew them for you. Treat those alerts as an action item, not an FYI, and consider raising the lead time so there is room to chase an external CA.

Gate it in CI/CD and policy-as-code

Catch risky configurations before they ship. A simple pre-deploy check can fail the pipeline if a custom domain points at a certificate expiring within your safety window:

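#!/usr/bin/env bash
# Fail the pipeline if the certificate expires within THRESHOLD_DAYS.
set -euo pipefail
THRESHOLD_DAYS=30
# Exit with a usage message if no certificate ARN was passed.
ARN="${1:?usage: $0 <certificate-arn>}"
# AWS CLI v2 prints NotAfter as an ISO 8601 timestamp by default.
NOT_AFTER=$(aws acm describe-certificate --certificate-arn "$ARN" \
  --query "Certificate.NotAfter" --output text)
# date -d requires GNU date (Linux runners); BSD/macOS date needs different flags.
EXP_EPOCH=$(date -d "$NOT_AFTER" +%s)
NOW_EPOCH=$(date +%s)
DAYS_LEFT=$(( (EXP_EPOCH - NOW_EPOCH) / 86400 ))
if (( DAYS_LEFT < THRESHOLD_DAYS )); then
  echo "FAIL: certificate $ARN expires in $DAYS_LEFT days" >&2
  exit 1
fi
echo "OK: $DAYS_LEFT days remaining"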

Run the Lensix apigw_certexpired check continuously so you get coverage across every account and region, including the domains nobody remembers owning.

Best practices

Standardize on ACM with DNS validation. Auto-renewal removes the most common failure mode entirely.
Never delete validation CNAME records. They look like clutter but they are what keeps auto-renewal working. Manage them in IaC so nobody removes them by hand.
Keep certificates in the correct region. Edge-optimized domains require us-east-1, and regional domains require the API's own region. A valid certificate in the wrong region cannot be attached to the domain, which is a bad thing to discover mid-incident.
Inventory every custom domain. Shadow domains created during experiments are the ones most likely to expire unnoticed. Continuous scanning beats tribal memory.
Alert early and to the right place. A 45-day warning that lands in an unread inbox is no better than no warning at all.
Avoid imported certificates where you can. Every imported cert is a manual renewal you have committed to forever. Reserve them for cases where a specific CA is mandated.

An expired API Gateway certificate is one of the most preventable outages in cloud operations. The fix takes minutes, but the prevention, managed certificates plus early alerting plus continuous checks, is what keeps it from ever paging you.
