SAA-C03

Auto Scaling — Concept

What it is

Amazon EC2 Auto Scaling = automatically launches and terminates EC2 instances to match demand, while keeping a desired count of healthy instances across multiple AZs.

(Beyond EC2, Application Auto Scaling scales ECS services, DynamoDB tables, Aurora replicas, etc., typically via target tracking on those services.)

Why it exists

Manual sizing wastes money when over-provisioned and risks outages when under-provisioned. Auto Scaling provides high availability, elasticity, and cost optimization: you pay only for the capacity you need now.

Key building blocks

  • Auto Scaling Group (ASG): defines min/max/desired counts and the AZs/subnets to run in.
  • Launch Template (preferred) or Launch Configuration (legacy): describes the AMI, instance type, key pair, security groups, user data, and storage.
  • Target Group (when behind a load balancer): the ASG registers/deregisters instances with the target group.
  • Health checks: EC2 status checks (default) and, optionally, ELB health checks; an unhealthy instance is terminated and replaced.
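The building blocks above map fairly directly onto the parameters of a create-ASG request. The following is a minimal sketch that only assembles the keyword arguments; the keys follow boto3's `create_auto_scaling_group` call, while the function name `build_asg_request` and all resource names are hypothetical.

```python
def build_asg_request(name, launch_template_name, subnet_ids,
                      target_group_arns, min_size, max_size, desired):
    """Assemble keyword arguments shaped like boto3's create_auto_scaling_group.

    Nothing is sent to AWS here; the function only builds and validates a dict.
    """
    if not (min_size <= desired <= max_size):
        raise ValueError("desired capacity must lie between min and max")
    return {
        "AutoScalingGroupName": name,
        # Launch template: AMI, instance type, key pair, security groups, user data.
        "LaunchTemplate": {
            "LaunchTemplateName": launch_template_name,
            "Version": "$Latest",
        },
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets (one per AZ) are passed as a single comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        # Instances are registered with these target groups automatically.
        "TargetGroupARNs": list(target_group_arns),
        # Use load balancer health checks in addition to EC2 status checks.
        "HealthCheckType": "ELB",
        # Assumed grace period (seconds) before health checks count.
        "HealthCheckGracePeriod": 300,
    }
```

With boto3 installed and credentials configured, the resulting dict could be passed as `client.create_auto_scaling_group(**params)`; here it is kept as plain data so the shape is easy to inspect.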

Scaling policies

Policy                   | How
Manual                   | Adjust desired count by hand
Scheduled                | Up/down at fixed times (e.g. business hours)
Dynamic: Target Tracking | "Keep CPU at 50 %"; AWS calculates the deltas
Dynamic: Step Scaling    | Add N instances when alarm at X, more if X+Y
Dynamic: Simple Scaling  | Single adjustment per alarm; cooldown
Predictive Scaling       | ML predicts daily pattern, pre-scales
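To make the difference between target tracking and step scaling concrete, here is a simplified model of each. It is an illustration under stated assumptions, not AWS's actual algorithm: target tracking is approximated as proportional scaling, and the function names and the `steps` tuple layout are hypothetical.

```python
import math


def target_tracking_desired(current_capacity, metric_value, target_value,
                            min_size, max_size):
    """Approximate target tracking: scale capacity in proportion to how far
    the metric is from its target, then clamp to the group's min/max."""
    raw = math.ceil(current_capacity * metric_value / target_value)
    return max(min_size, min(max_size, raw))


def step_scaling_adjustment(metric_value, threshold, steps):
    """Pick a capacity change from a list of steps.

    Each step is (lower_offset, upper_offset_or_None, adjustment), where the
    offsets are relative to the alarm threshold; the lower bound is inclusive
    and the upper bound exclusive. Returns 0 when no step matches.
    """
    breach = metric_value - threshold
    for lower, upper, adjustment in steps:
        if breach >= lower and (upper is None or breach < upper):
            return adjustment
    return 0
```

For example, with 4 instances at 75 % CPU and a 50 % target, the proportional model asks for 6 instances. With a threshold of 70 % and steps `[(0, 10, 1), (10, None, 3)]`, a reading of 75 % adds 1 instance and a reading of 85 % adds 3.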

Lifecycle hooks

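  • Pause an instance in Pending:Wait or Terminating:Wait to run custom logic (warm up, drain cache, save logs) until the action is completed or the heartbeat timeout expires.
  • Hook notifications go to EventBridge, SNS, or SQS (Lambda is typically triggered via EventBridge or SNS); the instance can also complete the action itself.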
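A launch-time hook can be sketched as two pieces of data: the hook definition and the completion call that releases the instance from Pending:Wait. The keys below follow boto3's `put_lifecycle_hook` and `complete_lifecycle_action` parameters; the function names, and the assumption that the handler receives the `detail` section of an EventBridge lifecycle event, are illustrative.

```python
def build_launching_hook(asg_name, hook_name, heartbeat_seconds=300):
    """Keyword arguments shaped like boto3's put_lifecycle_hook."""
    return {
        "AutoScalingGroupName": asg_name,
        "LifecycleHookName": hook_name,
        # Pause instances as they launch (Pending:Wait).
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
        # How long the instance may wait before the default result applies.
        "HeartbeatTimeout": heartbeat_seconds,
        # If nobody completes the action in time, give up on the instance.
        "DefaultResult": "ABANDON",
    }


def build_completion(event_detail, warmed_up_ok):
    """Keyword arguments shaped like boto3's complete_lifecycle_action.

    event_detail is assumed to be the "detail" dict of the lifecycle event.
    """
    return {
        "AutoScalingGroupName": event_detail["AutoScalingGroupName"],
        "LifecycleHookName": event_detail["LifecycleHookName"],
        "LifecycleActionToken": event_detail["LifecycleActionToken"],
        "InstanceId": event_detail["EC2InstanceId"],
        # CONTINUE puts the instance in service; ABANDON terminates it.
        "LifecycleActionResult": "CONTINUE" if warmed_up_ok else "ABANDON",
    }
```

A Lambda function triggered by the event would run the custom warm-up logic and then pass the second dict to `complete_lifecycle_action`.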

Termination policies & instance refresh

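  • Default: first pick the AZ with the most instances (to keep AZs balanced), then the instance on the oldest launch configuration or launch template, then the one closest to the next billing hour, then random.
  • Instance Refresh rolls a new launch template version through the ASG in batches, governed by a minimum healthy percentage and an instance warm-up.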
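The effect of the healthy-percentage setting on an Instance Refresh can be estimated with a little arithmetic. The sketch below is a simplified model (AWS's exact rounding may differ): it assumes the required healthy count is rounded up and that at least one instance is always replaced at a time. The `Preferences` keys follow boto3's `start_instance_refresh`; the function names are hypothetical.

```python
def refresh_batch_size(desired_capacity, min_healthy_percentage):
    """Estimate how many instances can be replaced at once."""
    # Ceiling division with integers: instances that must stay in service.
    must_stay_healthy = -(-desired_capacity * min_healthy_percentage // 100)
    # Assumption: a refresh always makes progress, one instance at minimum.
    return max(1, desired_capacity - must_stay_healthy)


def build_instance_refresh_request(asg_name, min_healthy_percentage=90,
                                   warmup_seconds=300):
    """Keyword arguments shaped like boto3's start_instance_refresh."""
    return {
        "AutoScalingGroupName": asg_name,
        "Strategy": "Rolling",
        "Preferences": {
            "MinHealthyPercentage": min_healthy_percentage,
            # Time a new instance needs before it counts as healthy capacity.
            "InstanceWarmup": warmup_seconds,
        },
    }
```

Under this model a 10-instance group at 90 % replaces one instance per batch, while the same group at 50 % replaces five at a time: faster, but with less capacity in service during the rollout.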

Mixed instances policy

  • Mix On-Demand + Spot in one ASG.
  • Allocate across multiple instance types and purchase options.
  • Spreading across several instance types and AZs improves the odds of obtaining Spot capacity.
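How a mixed instances policy divides capacity between On-Demand and Spot can be sketched as follows. The dictionary keys follow the `MixedInstancesPolicy` structure used by boto3's `create_auto_scaling_group`; the split function is an illustrative model that assumes the On-Demand share above the base is rounded up, and the function names and instance types are example choices.

```python
import math


def split_capacity(desired, on_demand_base, on_demand_pct_above_base):
    """Return (on_demand, spot) counts for a given desired capacity."""
    # The base is filled with On-Demand first.
    base = min(desired, on_demand_base)
    above_base = desired - base
    # Assumption: the On-Demand share above the base rounds up.
    on_demand_above = math.ceil(above_base * on_demand_pct_above_base / 100)
    on_demand = base + on_demand_above
    return on_demand, desired - on_demand


def build_mixed_instances_policy(launch_template_name, instance_types,
                                 on_demand_base, on_demand_pct_above_base):
    """Dict shaped like the MixedInstancesPolicy request parameter."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": launch_template_name,
                "Version": "$Latest",
            },
            # Several instance types widen the pool of usable Spot capacity.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": on_demand_pct_above_base,
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }
```

For instance, a desired capacity of 10 with a base of 2 and 25 % On-Demand above the base yields 4 On-Demand and 6 Spot instances under this model.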

Warm pools

  • Pre-initialized instances kept stopped (or, optionally, running or hibernated) and ready to enter service quickly; useful when app boot time is long.
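Warm pool sizing can be estimated with simple arithmetic. As a sketch, assuming the pool defaults to the gap between the group's maximum and desired capacity, that an optional prepared-capacity ceiling replaces the maximum in that calculation, and that the pool never drops below its own minimum size (the function name is hypothetical):

```python
def warm_pool_size(max_size, desired, max_prepared_capacity=None,
                   pool_min_size=0):
    """Estimate how many pre-initialized instances the warm pool holds."""
    # Without an explicit ceiling, the group's max size is the ceiling.
    ceiling = max_size if max_prepared_capacity is None else max_prepared_capacity
    # The pool covers the gap between the ceiling and what is already in
    # service, but never shrinks below its configured minimum.
    return max(pool_min_size, ceiling - desired)
```

With a maximum of 10 and a desired capacity of 4, the pool holds 6 instances by default; capping prepared capacity at 6 shrinks it to 2, which trades some scale-out headroom for lower storage cost.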

ELB / ALB integration

  • ASG associates with one or more target groups.
  • ELB health checks can be the source of truth once enabled on the ASG; they test the application itself, whereas EC2 status checks only detect instance-level failures.

When to use vs alternatives

Use ...            | Instead of ... | When ...
ASG + ALB          | Single EC2     | Always for production web tier
ASG with Spot      | All On-Demand  | Workload is fault-tolerant and stateless
Predictive scaling | Reactive       | Predictable daily pattern
Lifecycle hooks    | None           | Need warm-up scripts or graceful drain
Warm pool          | Cold launches  | Boot time > a couple of minutes

Common exam scenarios

  1. "Web tier must survive AZ failure and scale to load" → ASG across 2+ AZs behind ALB.
  2. "Save cost by mixing On-Demand + Spot in ASG" → mixed instances policy.
  3. "Stop scaling thrash on metric spikes" → use target tracking or step scaling with a reasonable instance warm-up, or simple scaling with a cooldown.
  4. "Drain in-flight requests before terminating" → deregistration delay on TG + lifecycle hook on terminate.
  5. "Predictable lunchtime spike each day" → scheduled or predictive scaling.
  6. "App takes 5 min to boot" → warm pool to reduce start latency.
  7. "Roll a new AMI through ASG safely" → Instance Refresh with a minimum healthy percentage.

Exam tip

"Highly available + scalable" almost always = ASG across multiple AZs behind an ELB. If a question shows a single EC2 instance with an Elastic IP, that is a wrong design unless the question is testing limits.
