This check flags API Gateway stages that don't have detailed CloudWatch metrics enabled, which leaves you blind to per-method latency, error rates, and traffic patterns. Turn on detailed metrics with a single update-stage call or a one-line setting in your IaC.
When an API Gateway stage misbehaves at 2 a.m., the difference between a five-minute fix and a two-hour incident usually comes down to one thing: whether you can see which method is failing. Without detailed metrics, all you get is a coarse, stage-wide view that hides the method actually causing the pain. This check catches that gap before it bites you during an outage.
What this check detects
The apigw_nometrics check looks at each deployed API Gateway stage and verifies whether detailed CloudWatch metrics are enabled. In AWS terms, this is the metricsEnabled setting on a stage's method settings.
API Gateway emits two tiers of metrics:
- Basic metrics (always on): aggregated at the stage level. You see total Count, 4XXError, 5XXError, Latency, and IntegrationLatency across the entire stage.
- Detailed metrics (off by default): the same metrics broken down per method and resource path, so you can isolate GET /orders from POST /payments.
When detailed metrics are disabled, you can confirm an API is throwing 500s but you cannot tell from CloudWatch which endpoint is responsible. The check fails for any stage where this finer-grained visibility is missing.
Note: This setting can be applied to every method on a stage with the */* wildcard, or to an individual method with a specific {resource}/{httpMethod} path. The check evaluates the effective configuration applied to your methods.
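How the check resolves the "effective configuration" is not spelled out here, so the following is only a sketch of one plausible interpretation. It assumes the methodSettings map returned by get-stage, keyed by "*/*" for the stage default and by a per-method key for overrides. The function name metrics_coverage and the example override key are hypothetical.

```python
from typing import Any, Dict

WILDCARD = "*/*"  # stage-wide default entry in methodSettings


def metrics_coverage(method_settings: Dict[str, Dict[str, Any]]) -> str:
    """Classify a stage as 'all', 'partial', or 'none'.

    Assumption: a per-method entry overrides the "*/*" default for that
    method. Methods with no entry of their own inherit the default.
    """
    default_on = bool(method_settings.get(WILDCARD, {}).get("metricsEnabled", False))
    overrides = [
        bool(settings.get("metricsEnabled", False))
        for key, settings in method_settings.items()
        if key != WILDCARD
    ]
    if default_on:
        # Default is on; any override that turns it off leaves a blind spot.
        return "all" if all(overrides) else "partial"
    # Default is off; only explicitly enabled methods are visible.
    return "partial" if any(overrides) else "none"
```

Whether a "partial" stage should pass or fail is a policy decision this sketch leaves to the caller. It also cannot tell if overrides happen to cover every method, because the stage settings alone do not list the API's methods.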
Why it matters
API Gateway sits at the front door of your services. It's frequently the first AWS component a request touches and the last one a response leaves through. Losing observability here means losing it everywhere downstream.
Slow incident response
Stage-level metrics tell you something is broken but not where. Imagine a stage serving forty methods. A spike in 5XXError shows up, but the aggregate metric blends every route together. Your on-call engineer ends up grepping logs or guessing instead of pulling up a per-method graph and immediately seeing that POST /checkout is the only failing endpoint.
Missed performance regressions
A single slow method can drag up the aggregate latency just enough to look like noise. With detailed metrics, a regression on one endpoint is obvious. Without them, gradual degradation hides inside the average until customers complain.
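To make the averaging effect concrete, here is a small worked example with made-up numbers: forty methods with equal traffic, one of which regresses from 100 ms to 500 ms. The figures and the aggregate_latency helper are purely illustrative and are not taken from any real API.

```python
def aggregate_latency(latency_ms, requests):
    """Request-weighted mean latency across all methods on a stage."""
    total_requests = sum(requests)
    weighted = sum(l * r for l, r in zip(latency_ms, requests))
    return weighted / total_requests


requests = [1000] * 40    # equal traffic on every method (assumed)
baseline = [100.0] * 40   # every method at 100 ms

regressed = list(baseline)
regressed[0] = 500.0      # one method becomes five times slower

print(aggregate_latency(baseline, requests))   # 100.0
print(aggregate_latency(regressed, requests))  # 110.0
```

A fivefold slowdown on one endpoint moves the stage-wide average by only 10 percent, which is easy to dismiss as noise on an aggregate graph.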
Weaker security signal
Per-method error rates are a useful early signal for abuse. A sudden burst of 4XXError concentrated on an auth endpoint can indicate credential stuffing or enumeration attempts. Aggregated metrics smear that signal across all traffic and make it far easier to miss.
Warning: Detailed metrics are not free. Each method generates additional CloudWatch custom metrics, and CloudWatch bills per metric per month. On an API with hundreds of methods across multiple stages, this adds up. Enable it where visibility matters most, and read the cost note further down before flipping it on everywhere.
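A back-of-the-envelope estimate helps when deciding where to enable the setting. The sketch below assumes five metrics per method (the five listed earlier) and takes the per-metric price as an input, because pricing varies by region and tier. The function names and the example price are placeholders, not quoted AWS figures.

```python
METRICS_PER_METHOD = 5  # Count, 4XXError, 5XXError, Latency, IntegrationLatency


def detailed_metric_count(methods_per_stage, stages, metrics_per_method=METRICS_PER_METHOD):
    """Rough number of per-method metrics produced across all stages."""
    return methods_per_stage * stages * metrics_per_method


def monthly_cost(metric_count, price_per_metric):
    """Cost for one month at a caller-supplied price per metric."""
    return metric_count * price_per_metric


count = detailed_metric_count(methods_per_stage=200, stages=3)
print(count)  # 3000

# 0.30 is a placeholder price; look up the current CloudWatch rate for your region.
print(round(monthly_cost(count, price_per_metric=0.30), 2))
```

Treat the result as a rough planning figure: actual billing depends on how CloudWatch counts metrics on your account.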
How to fix it
You can enable detailed metrics through the console, the CLI, or infrastructure as code. The IaC route is the only one that sticks, so treat the console and CLI options as quick fixes during an incident.
Option 1: AWS Console
- Open the API Gateway console and select your API.
- In the left nav, choose Stages and select the target stage.
- Open the Logs and tracing section and choose Edit.
- Toggle Detailed metrics on.
- Save changes. CloudWatch begins emitting per-method metrics within a few minutes.
Option 2: AWS CLI
For a REST API, patch the stage's method settings. The /*/* path applies the setting to every method on the stage.
aws apigateway update-stage \
  --rest-api-id abc123def4 \
  --stage-name prod \
  --patch-operations \
    'op=replace,path=/*/*/metrics/enabled,value=true'
Verify it took effect:
aws apigateway get-stage \
  --rest-api-id abc123def4 \
  --stage-name prod \
  --query 'methodSettings."*/*".metricsEnabled'
For HTTP APIs (API Gateway v2), detailed metrics work differently. v2 emits metrics at the route level through the DetailedMetricsEnabled route setting:
aws apigatewayv2 update-stage \
  --api-id abc123def4 \
  --stage-name prod \
  --default-route-settings DetailedMetricsEnabled=true
Note: The update-stage call only changes configuration, not behavior, so it is safe to run against production. There is no redeploy and no request disruption. Metrics simply start flowing.
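If you would rather script the REST API change from Python, the same patch operation can be sent through a boto3 apigateway client. This is a minimal sketch: the helper names are hypothetical, the client is passed in so the code can be exercised without AWS credentials, and the per-method path form assumes the JSON Pointer style escaping (/ written as ~1) that API Gateway patch paths use.

```python
def metrics_patch_path(resource_path="*", http_method="*"):
    """Build the patch path for the metrics setting.

    "*" / "*" targets every method on the stage. A concrete resource path
    such as "/orders" is escaped JSON-Pointer style ("~" -> "~0", "/" -> "~1").
    """
    if resource_path == "*":
        encoded = "*"
    else:
        encoded = resource_path.replace("~", "~0").replace("/", "~1")
    return f"/{encoded}/{http_method}/metrics/enabled"


def enable_detailed_metrics(client, rest_api_id, stage_name,
                            resource_path="*", http_method="*"):
    """Send the update-stage patch using a boto3 'apigateway' client."""
    return client.update_stage(
        restApiId=rest_api_id,
        stageName=stage_name,
        patchOperations=[{
            "op": "replace",
            "path": metrics_patch_path(resource_path, http_method),
            "value": "true",  # patch values are sent as strings
        }],
    )


# Usage (requires boto3 and credentials):
#   import boto3
#   enable_detailed_metrics(boto3.client("apigateway"), "abc123def4", "prod")
```

Passing a concrete resource_path and http_method limits the change to one method, which is one way to follow the cost advice for very wide APIs.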
Option 3: Terraform
For a REST API, set metrics_enabled on the method settings resource:
resource "aws_api_gateway_method_settings" "prod" {
  rest_api_id = aws_api_gateway_rest_api.example.id
  stage_name  = aws_api_gateway_stage.prod.stage_name
  method_path = "*/*"

  settings {
    metrics_enabled    = true
    logging_level      = "INFO"
    data_trace_enabled = false
  }
}
For an HTTP API stage:
resource "aws_apigatewayv2_stage" "prod" {
  api_id      = aws_apigatewayv2_api.example.id
  name        = "prod"
  auto_deploy = true

  default_route_settings {
    detailed_metrics_enabled = true
  }
}
Option 4: CloudFormation
ProdStage:
  Type: AWS::ApiGateway::Stage
  Properties:
    RestApiId: !Ref MyRestApi
    StageName: prod
    DeploymentId: !Ref MyDeployment
    MethodSettings:
      - ResourcePath: "/*"
        HttpMethod: "*"
        MetricsEnabled: true
Tip: Pair detailed metrics with execution logging at the INFO level on non-production stages and ERROR on production. Metrics tell you which method is failing, and logs tell you why. Together they cut mean time to resolution far more than either does alone.
How to prevent it from happening again
Manually toggling a setting fixes one stage today. To stop it drifting back tomorrow, push the requirement into the pipeline.
Enforce it in module defaults
If teams provision API Gateways through a shared Terraform module, bake metrics_enabled = true into the module so every new API inherits it. Make the value a variable that defaults to true so opting out is a deliberate, reviewable choice rather than the silent default.
Gate it with policy as code
Add an OPA or Conftest policy to your CI that rejects any plan creating a stage without metrics. A rough Rego check against a Terraform plan looks like this:
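package apigateway

deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_api_gateway_method_settings"
    # Bind each settings block before negating, so `not` applies to one
    # concrete block (wildcards inside `not` tend to trip unsafe-variable errors).
    settings := resource.change.after.settings[_]
    not settings.metrics_enabled
    msg := sprintf("API Gateway method settings %s must enable detailed metrics", [resource.address])
}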
Run it as a required step before terraform apply:
terraform plan -out=tfplan
terraform show -json tfplan > plan.json
conftest test plan.json --policy ./policies
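For teams that do not run OPA, roughly the same gate can be approximated with a short script over plan.json. This is a sketch under the same assumption as the Rego rule: it only inspects aws_api_gateway_method_settings resources in the plan's resource_changes list. The function names are hypothetical.

```python
import json


def find_violations(plan):
    """Return addresses of method settings resources without metrics enabled."""
    violations = []
    for change in plan.get("resource_changes", []):
        if change.get("type") != "aws_api_gateway_method_settings":
            continue
        after = (change.get("change") or {}).get("after")
        if after is None:
            # Resource is being deleted; nothing to enforce.
            continue
        settings = after.get("settings") or []
        enabled = bool(settings) and all(
            block.get("metrics_enabled") is True for block in settings
        )
        if not enabled:
            violations.append(change.get("address", "<unknown>"))
    return violations


def check_plan_file(path):
    """Load a `terraform show -json` output file and return violations."""
    with open(path, encoding="utf-8") as handle:
        return find_violations(json.load(handle))
```

A CI step could call check_plan_file("plan.json") and exit non-zero when the returned list is not empty. Like the Rego rule, this does not catch a stage that has no method settings resource at all.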
Catch drift continuously
Policy gates only cover changes that go through your pipeline. Someone with console access can still disable metrics by hand. Continuous scanning closes that gap. Lensix runs the apigw_nometrics check on a schedule across every account and region, so a manual change surfaces as a finding instead of waiting for the next incident to reveal it.
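As a rough picture of what a drift scan does, the sketch below walks the REST APIs visible to one boto3 apigateway client and lists stages whose "*/*" default does not enable metrics. It is not how Lensix implements the check. It covers a single account and region, ignores per-method overrides, and skips pagination beyond the first 500 APIs. The function name is hypothetical.

```python
def stages_without_detailed_metrics(client):
    """Return (rest_api_id, stage_name) pairs lacking the "*/*" metrics default.

    `client` is expected to behave like boto3.client("apigateway").
    """
    findings = []
    apis = client.get_rest_apis(limit=500).get("items", [])
    for api in apis:
        stages = client.get_stages(restApiId=api["id"]).get("item", [])
        for stage in stages:
            default = stage.get("methodSettings", {}).get("*/*", {})
            if not default.get("metricsEnabled", False):
                findings.append((api["id"], stage["stageName"]))
    return findings


# Usage (requires boto3 and credentials):
#   import boto3
#   for api_id, stage in stages_without_detailed_metrics(boto3.client("apigateway")):
#       print(f"{api_id}/{stage}: detailed metrics disabled")
```

Running something like this on a schedule per region gives a basic safety net, though a managed scanner saves you from handling accounts, regions, and HTTP APIs yourself.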
Tip: Wire your scan findings into the same alerting channel your on-call team already watches. A config drift finding that lands in Slack next to your other alerts gets fixed in hours. One buried in a weekly report gets fixed never.
Best practices
- Enable detailed metrics on every production stage. The visibility is worth the cost where uptime matters. Treat it as a baseline, not an upgrade.
- Build dashboards per method, not per stage. Once detailed metrics flow, create CloudWatch dashboards that graph latency and error rate for your highest-traffic and highest-risk endpoints.
- Alarm on per-method 5XX rates. Set CloudWatch alarms on the 5XXError metric scoped to critical methods so you are paged before the aggregate budget is blown.
- Be deliberate about cost on wide APIs. If an API has hundreds of methods, consider enabling detailed metrics per critical method instead of the */* wildcard to control the number of custom metrics CloudWatch bills you for.
- Combine metrics, access logs, and X-Ray. Metrics show the trend, access logs show individual requests, and X-Ray traces show the path through your integrations. The three together give you a complete picture.
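The per-method alarm from the list above could be defined along these lines. This is a sketch: it builds the keyword arguments for a CloudWatch put_metric_alarm call, scoped with the ApiName, Stage, Resource, and Method dimensions that detailed metrics are published under. The threshold, period, and naming scheme are placeholder choices, and the helper name is hypothetical.

```python
def method_5xx_alarm(api_name, stage, resource, method,
                     threshold=5.0, period_seconds=60, evaluation_periods=3):
    """Build put_metric_alarm arguments for one method's 5XXError count."""
    slug = resource.strip("/").replace("/", "-") or "root"
    return {
        "AlarmName": f"{api_name}-{stage}-{method}-{slug}-5xx",
        "Namespace": "AWS/ApiGateway",
        "MetricName": "5XXError",
        "Dimensions": [
            {"Name": "ApiName", "Value": api_name},
            {"Name": "Stage", "Value": stage},
            {"Name": "Resource", "Value": resource},
            {"Name": "Method", "Value": method},
        ],
        "Statistic": "Sum",             # count of 5XX responses per period
        "Period": period_seconds,
        "EvaluationPeriods": evaluation_periods,
        "Threshold": threshold,         # placeholder; tune per endpoint
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no traffic is not an outage
    }


# Usage (requires boto3 and credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **method_5xx_alarm("orders-api", "prod", "/checkout", "POST"))
```

To have the alarm page someone, add an AlarmActions entry pointing at your own notification target before calling put_metric_alarm.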
Note: Detailed metrics are a monitoring control, not a security boundary. They make problems visible faster, but they do not prevent abuse. Keep them alongside throttling, WAF rules, and authorizers as part of a layered API defense.
Observability is cheapest to add before you need it and most expensive to be missing during an outage. Enabling detailed metrics is a small, low-risk change that pays off the first time something breaks at the worst possible hour.

