GCP Instance Group Missing Autoscaling: Fix Guide

TL;DR

This check flags GCP managed instance groups that run with a fixed number of VMs because autoscaling is disabled. Without it, you either overpay for idle capacity or fall over under load. Attach an autoscaler with gcloud compute instance-groups managed set-autoscaling to fix it.

A managed instance group (MIG) is GCP's way of running a fleet of identical VMs from a single template. The whole point of a MIG is elasticity: it can grow and shrink based on demand, replace unhealthy instances, and spread load across zones. When autoscaling is turned off, you lose half of that value. The group becomes a static pool of machines that sits at one size no matter what your traffic does.

The Lensix check compute_noautoscale looks at every managed instance group in your GCP projects and reports the ones that have no autoscaler attached. It is a low-severity finding in security terms, but it has direct consequences for reliability and cost, the two areas where a fixed-size fleet tends to cause incidents and budget overruns.

What this check detects

The check inspects each managed instance group and verifies whether an autoscaling policy is configured. A MIG can run in one of two modes:

Manual scaling: the group holds a fixed target size that you set by hand. It stays there until you change it.
Autoscaling: an autoscaler adjusts the number of instances automatically based on signals like CPU utilization, load balancer serving capacity, Cloud Monitoring metrics, or a custom schedule.

When the check finds a MIG with no autoscaler, it raises compute_noautoscale. The finding tells you the group is locked to whatever size it was last set to, with no ability to respond to changing demand.

Note: This check applies to zonal and regional managed instance groups. It does not apply to unmanaged instance groups, which are just collections of arbitrary VMs and do not support autoscaling at all.
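
The detection logic can be sketched in a few lines. This is a minimal illustration, not the Lensix implementation: it assumes each MIG has already been fetched as a dictionary shaped loosely like the Compute Engine API's instance group manager resource, where status.autoscaler holds the URL of the attached autoscaler when one exists. The function name and the sample inventory are hypothetical.

```python
def find_migs_without_autoscaler(migs):
    """Return the names of MIGs that have no autoscaler attached.

    Assumes each item is a dict with a "name" and an optional
    "status" -> "autoscaler" URL (a simplified view of the API resource).
    """
    flagged = []
    for mig in migs:
        autoscaler_url = mig.get("status", {}).get("autoscaler")
        if not autoscaler_url:
            flagged.append(mig["name"])
    return flagged


# Hypothetical inventory: one autoscaled MIG and two static ones.
inventory = [
    {"name": "web-mig", "status": {"autoscaler": "zones/us-central1-a/autoscalers/web-as"}},
    {"name": "batch-mig", "status": {}},
    {"name": "legacy-mig"},
]

print(find_migs_without_autoscaler(inventory))
```

In this sample, the two groups without an autoscaler URL are the ones that would be reported.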

Why it matters

The risk here is not a breach; it is operational. A MIG without autoscaling fails in two opposite directions, and most teams hit both over the life of a service.

Under-provisioning causes outages

If you fix the group at, say, three instances and a traffic spike arrives, those three machines absorb everything. CPU climbs, latency rises, requests start timing out, and eventually instances become unresponsive. A load balancer in front of a static MIG cannot conjure more capacity. It just keeps routing traffic to overloaded backends until they fall over. This is the classic failure mode during product launches, marketing pushes, and the kind of traffic you cannot predict.

Over-provisioning wastes money

The opposite is just as common. To avoid the outage scenario, teams size the group for peak load and leave it there. Now you are paying for peak capacity twenty-four hours a day, including the overnight hours when the group is doing almost nothing. For a fleet running n2-standard-8 instances, the difference between scaling down at night and running flat out can be thousands of dollars a month.
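
A rough back-of-the-envelope calculation shows where the gap comes from. Every number here is an assumption chosen for illustration: the hourly price is a placeholder rather than a real n2-standard-8 rate, and the fleet sizes and the 12-hour peak window are hypothetical. Prices are kept in whole cents to avoid floating-point rounding.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30

# Assumed inputs, not real pricing or traffic data.
PRICE_CENTS_PER_HOUR = 40   # placeholder hourly price per instance
PEAK_INSTANCES = 20         # size needed during busy hours
OFF_PEAK_INSTANCES = 4      # size needed overnight
PEAK_HOURS = 12             # busy hours per day


def monthly_cost_cents(instance_hours_per_day):
    """Monthly cost in cents for a given number of instance-hours per day."""
    return instance_hours_per_day * DAYS_PER_MONTH * PRICE_CENTS_PER_HOUR


# Static group: peak size around the clock.
static_hours = PEAK_INSTANCES * HOURS_PER_DAY
# Autoscaled group: peak size during busy hours, small size otherwise.
scaled_hours = (PEAK_INSTANCES * PEAK_HOURS
                + OFF_PEAK_INSTANCES * (HOURS_PER_DAY - PEAK_HOURS))

static_cost = monthly_cost_cents(static_hours)
scaled_cost = monthly_cost_cents(scaled_hours)
savings = static_cost - scaled_cost

print(f"static:  ${static_cost / 100:.2f}")
print(f"scaled:  ${scaled_cost / 100:.2f}")
print(f"savings: ${savings / 100:.2f}")
```

Substitute your own instance price and traffic shape; the point is that the savings scale with the gap between peak and off-peak size.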

Warning: A static MIG sized for peak can easily become one of the largest line items in a compute bill. If you have several of these across environments, the wasted spend compounds fast. The autoscaler itself carries no extra charge, so any reduction in instance-hours shows up as savings from the first billing cycle.

Slower recovery from regional pressure

Regional MIGs with autoscaling can rebalance and add capacity in a healthy zone when another zone is degraded. A static group cannot react to that pressure on its own, so a partial zone issue turns into a partial outage for your users.

How to fix it

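Attaching an autoscaler does not recreate your existing instances or interrupt traffic by itself. You are adding a controller that manages the group's size from now on. Once attached, it starts resizing the group right away, so set the minimum at or above the capacity you currently need to avoid an unexpected scale-in.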

Option 1: gcloud CLI

The most common starting point is CPU-based autoscaling. This example targets 60 percent average CPU utilization, allows the group to range from 2 to 10 instances, and waits 90 seconds for new instances to warm up before counting their metrics.

gcloud compute instance-groups managed set-autoscaling my-mig \
  --zone=us-central1-a \
  --min-num-replicas=2 \
  --max-num-replicas=10 \
  --target-cpu-utilization=0.6 \
  --cool-down-period=90

For a regional MIG, swap --zone for --region:

gcloud compute instance-groups managed set-autoscaling my-mig \
  --region=us-central1 \
  --min-num-replicas=3 \
  --max-num-replicas=15 \
  --target-cpu-utilization=0.6 \
  --cool-down-period=90
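
To build intuition for what the 0.6 CPU target and the replica range in the commands above do, here is a simplified model of target tracking. It is a sketch under stated assumptions, not Google's implementation: the real autoscaler also applies stabilization and initialization logic that is ignored here. The function name is hypothetical, and utilization is expressed in whole percent so the arithmetic stays exact.

```python
def recommended_size(current_size, avg_cpu_pct, target_cpu_pct,
                     min_replicas, max_replicas):
    """Estimate the group size that would bring average CPU back to target.

    Simplified model: total load is current_size * avg_cpu_pct, and the
    group needs enough instances for each to sit at or below the target.
    """
    total_load = current_size * avg_cpu_pct
    # Integer ceiling division: round up to a whole instance.
    needed = -(-total_load // target_cpu_pct)
    # The min/max replica settings always bound the result.
    return max(min_replicas, min(max_replicas, needed))


# 4 instances at 90% CPU with a 60% target: grow to 6.
print(recommended_size(4, 90, 60, min_replicas=2, max_replicas=10))
# 4 instances at 15% CPU: one would do, but the minimum of 2 applies.
print(recommended_size(4, 15, 60, min_replicas=2, max_replicas=10))
# 8 instances at 95% CPU: 13 would be needed, but the maximum of 10 applies.
print(recommended_size(8, 95, 60, min_replicas=2, max_replicas=10))
```

The third case is worth noticing: when the maximum is too low for the load, the group stays saturated even with autoscaling enabled.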

Note: CPU is a fine default, but it is rarely the best signal for request-driven services. If your MIG sits behind an HTTP(S) load balancer, scaling on serving capacity (requests per second or backend utilization) tracks real demand more closely than CPU does. Use --scale-based-on-load-balancing together with --target-load-balancing-utilization, and make sure the backend service defines the capacity the autoscaler should measure against.

Option 2: Google Cloud Console

Go to Compute Engine, Instance groups.
Click the name of the managed instance group.
Click Edit.
Under Autoscaling, set the mode to On: add and remove instances to the group.
Choose a scaling metric (CPU utilization, load balancing utilization, or a Cloud Monitoring metric).
Set minimum and maximum instance counts and the cooldown period.
Click Save.

Option 3: Terraform

If you manage infrastructure as code, the autoscaler is a separate resource that references the instance group. This is the version you want in your repo so the configuration does not drift.

resource "google_compute_autoscaler" "default" {
  name   = "my-mig-autoscaler"
  zone   = "us-central1-a"
  target = google_compute_instance_group_manager.default.id

  autoscaling_policy {
    min_replicas    = 2
    max_replicas    = 10
    cooldown_period = 90

    cpu_utilization {
      target = 0.6
    }
  }
}

For a regional MIG, use google_compute_region_autoscaler with a region argument and point target at a google_compute_region_instance_group_manager.

Tip: Set min_replicas to the smallest number of instances that can serve your baseline traffic without falling over, not to 1. Going too low means cold-start latency every time traffic returns, and a single instance gives you no redundancy if it fails.
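
One way to turn that tip into a number is to divide baseline traffic by what a single instance can serve, then add a spare. This is only a sketch: the function name, the requests-per-second figures, and the choice of one spare instance are all assumptions to replace with your own measurements.

```python
def baseline_min_replicas(baseline_rps, rps_per_instance, spare_instances=1):
    """Suggest a min_replicas value from baseline load plus spare capacity."""
    # Integer ceiling division: instances needed to carry the baseline.
    needed = -(-baseline_rps // rps_per_instance)
    # Keep at least one serving instance, then add spares for redundancy.
    return max(needed, 1) + spare_instances


# Hypothetical service: 450 requests/s at baseline, 200 requests/s per instance.
print(baseline_min_replicas(450, 200))
```

With these sample figures, three instances carry the baseline and the fourth covers the loss of any one of them.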

How to prevent it from happening again

Fixing one MIG by hand is easy. Stopping new ones from shipping without autoscaling is the part that actually moves the needle. A few layers work well together.

Gate it in CI with policy-as-code

If your MIGs are defined in Terraform, you can block any plan that creates an instance group manager without an associated autoscaler. Here is the shape of an OPA/Conftest policy that flags region instance group managers lacking an autoscaler in the same plan:

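package main

# Flag every regional MIG in the plan that no regional autoscaler targets.
deny[msg] {
  some r
  mig := input.resource_changes[r]
  mig.type == "google_compute_region_instance_group_manager"

  not has_autoscaler(mig.address)

  msg := sprintf("MIG '%s' has no autoscaler attached", [mig.change.after.name])
}

# True when some autoscaler's target expression references this MIG.
# Reads the plan's configuration block because the target ID is usually
# unknown until apply. Covers root-module resources without count/for_each.
has_autoscaler(mig_address) {
  some a, i
  res := input.configuration.root_module.resources[a]
  res.type == "google_compute_region_autoscaler"
  res.expressions.target.references[i] == mig_address
}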

Wire this into your pipeline so a failing policy stops the merge. The change request never reaches production without scaling configured.
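
For teams that do not run OPA, a similar gate can be sketched as a small script over the JSON form of a plan (the output of terraform show -json). This is an illustrative sketch, not a complete policy engine: it only inspects root-module resources, it assumes the autoscaler's target is written as a direct reference to the MIG, and the function name and sample plan are hypothetical.

```python
# Map each MIG resource type to the autoscaler type that can target it.
MIG_TO_AUTOSCALER = {
    "google_compute_instance_group_manager": "google_compute_autoscaler",
    "google_compute_region_instance_group_manager": "google_compute_region_autoscaler",
}


def migs_without_autoscaler(plan):
    """Return addresses of MIGs that no autoscaler in the plan references."""
    resources = (plan.get("configuration", {})
                     .get("root_module", {})
                     .get("resources", []))

    # Collect every address referenced by an autoscaler's target expression.
    targeted = set()
    for res in resources:
        if res.get("type") in MIG_TO_AUTOSCALER.values():
            refs = res.get("expressions", {}).get("target", {}).get("references", [])
            targeted.update(refs)

    return sorted(
        res["address"]
        for res in resources
        if res.get("type") in MIG_TO_AUTOSCALER and res["address"] not in targeted
    )


# Hypothetical, heavily trimmed plan: "web" has an autoscaler, "batch" does not.
sample_plan = {"configuration": {"root_module": {"resources": [
    {"address": "google_compute_instance_group_manager.web",
     "type": "google_compute_instance_group_manager"},
    {"address": "google_compute_instance_group_manager.batch",
     "type": "google_compute_instance_group_manager"},
    {"address": "google_compute_autoscaler.web",
     "type": "google_compute_autoscaler",
     "expressions": {"target": {"references": [
         "google_compute_instance_group_manager.web.id",
         "google_compute_instance_group_manager.web",
     ]}}},
]}}}

print(migs_without_autoscaler(sample_plan))
```

In a pipeline, you would load the real plan JSON with json.load and exit with a non-zero status when the returned list is not empty.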

Enforce with Organization Policy where possible

GCP Organization Policy does not have a built-in constraint specifically for "MIG must autoscale," but you can use custom constraints to enforce patterns on instance group resources at the org or folder level. Combine that with a deny-by-default approach in your module library so the path of least resistance includes autoscaling.

Standardize on a module

The most reliable prevention is removing the choice. Wrap your MIG definition in a shared Terraform module that always creates an autoscaler with sane defaults. Teams consume the module and get autoscaling for free, which means nobody has to remember to add it.

Tip: Run the Lensix compute_noautoscale check on a schedule across all projects so any MIG that slips through, including ones created outside your IaC pipeline, surfaces within a day instead of at the next incident review.

Best practices

Scale on the right signal. Use load balancing utilization for request-driven web services, custom Cloud Monitoring metrics for queue workers, and CPU only as a fallback. Scaling on the wrong metric is almost as bad as not scaling at all.
Always set a sensible minimum. Keep enough baseline instances to survive a single-instance failure and serve normal traffic without cold starts.
Set a maximum you can actually pay for. The max replica count is your cost ceiling and your blast-radius limit. Pick a number that handles real spikes but will not silently scale to a five-figure bill if a metric goes haywire.
Tune the cooldown period to your boot time. If instances take two minutes to become useful, a 30-second cooldown will cause the autoscaler to over-react and thrash. Match the cooldown to how long your app actually takes to warm up.
Use predictive autoscaling for known patterns. If your traffic follows a daily curve, GCP's predictive autoscaling can add capacity ahead of the spike instead of chasing it, which reduces latency during ramp-up.
Pair autoscaling with health checks. Autoscaling decides how many instances to run; health checks and auto-healing keep those instances actually serving. You want both on every production MIG.
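
Several of these practices reduce to simple checks on the policy values, which can be sketched as a review function. The thresholds, the price, the budget, and the function name are all assumptions for illustration; the policy keys mirror the Terraform arguments shown earlier, and amounts are in whole cents.

```python
HOURS_PER_MONTH = 24 * 30


def review_policy(policy, boot_seconds, price_cents_per_hour, monthly_budget_cents):
    """Return a list of findings for an autoscaling policy dict."""
    findings = []
    if policy["min_replicas"] < 2:
        findings.append("min_replicas below 2: no redundancy if an instance fails")
    if policy["max_replicas"] < policy["min_replicas"]:
        findings.append("max_replicas is lower than min_replicas")
    if policy["cooldown_period"] < boot_seconds:
        findings.append("cooldown_period shorter than boot time: risk of thrashing")
    # Worst case: the group sits at max_replicas for the whole month.
    ceiling = policy["max_replicas"] * HOURS_PER_MONTH * price_cents_per_hour
    if ceiling > monthly_budget_cents:
        findings.append("max_replicas could exceed the monthly budget")
    return findings


# Hypothetical inputs: 120-second boot, 40 cents/hour, $10,000 monthly budget.
risky = {"min_replicas": 1, "max_replicas": 50, "cooldown_period": 30}
sound = {"min_replicas": 2, "max_replicas": 10, "cooldown_period": 120}

for name, policy in [("risky", risky), ("sound", sound)]:
    print(name, review_policy(policy, 120, 40, 1_000_000))
```

The first policy trips three of the checks; the second passes all four.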

Autoscaling is one of those settings that costs nothing to enable and quietly prevents both outages and waste. The check is easy to clear, and once you standardize it through a shared module and a CI gate, it stops being something anyone has to think about.
