Auto Scaling — Concept

What it is

Amazon EC2 Auto Scaling automatically launches and terminates EC2 instances to match demand, while maintaining a desired count of healthy instances across multiple AZs.

(The broader AWS Auto Scaling service, built on Application Auto Scaling, can scale ECS services, DynamoDB throughput, Aurora replicas, etc., typically via target tracking on those services.)

Why it exists

Manual sizing wastes money when over-provisioned and risks outages when under-provisioned. Auto Scaling provides HA, elasticity, and cost optimization: you pay only for the capacity you need now.

Key building blocks

Auto Scaling Group (ASG) — defines min/max/desired counts and the AZs/subnets to run in.
Launch Template (preferred) or Launch Configuration (legacy) — describes the AMI, instance type, key, SGs, user-data, storage.
Target Group (when behind an LB) — ASG registers/deregisters instances with the TG.
Health checks — EC2 status (default) and/or ELB health checks; unhealthy → terminate + replace.
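
A minimal sketch of how these building blocks fit together, assuming boto3 is the client library. The function only assembles keyword arguments in the shape accepted by the create_auto_scaling_group call; all names, subnet IDs, and the ARN are hypothetical placeholders, and the two sanity checks are illustrative rather than anything AWS enforces this way.

```python
def build_asg_request(name, template_name, subnet_ids, target_group_arns,
                      min_size, max_size, desired):
    """Assemble keyword arguments for an ASG creation request."""
    if not (min_size <= desired <= max_size):
        raise ValueError("desired capacity must lie between min and max")
    # Illustrative check only: two subnets do not guarantee two AZs.
    if len(subnet_ids) < 2:
        raise ValueError("use subnets in at least two AZs for high availability")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": template_name,
                           "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "TargetGroupARNs": list(target_group_arns),
        # Add ELB health checks on top of the default EC2 status checks.
        "HealthCheckType": "ELB",
        "HealthCheckGracePeriod": 300,  # seconds; hypothetical value
    }


# Hypothetical identifiers, for illustration only.
params = build_asg_request(
    name="web-asg",
    template_name="web-template",
    subnet_ids=["subnet-aaa", "subnet-bbb"],
    target_group_arns=["arn:aws:elasticloadbalancing:example"],
    min_size=2, max_size=6, desired=2,
)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("autoscaling").create_auto_scaling_group(**params)
```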

Scaling policies

Policy	How
Manual	Adjust desired count by hand
Scheduled	Up/down at fixed times (e.g. business hours)
Dynamic — Target Tracking	"Keep CPU at 50 %" — AWS calculates the deltas
Dynamic — Step Scaling	Add N instances when alarm at X, more if X+Y
Dynamic — Simple Scaling	Single adjustment per alarm; cooldown
Predictive Scaling	ML predicts daily pattern, pre-scales
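
To make the difference between target tracking and step scaling concrete, here is a rough sketch. The proportional formula is only an approximation of the idea behind target tracking (AWS does not publish its exact algorithm), and the step table is a hypothetical policy. Step bounds are expressed relative to the alarm threshold, with the lower bound inclusive and the upper bound exclusive, as for a scale-out policy.

```python
import math


def target_tracking_estimate(current_capacity, metric_value, target_value):
    """Approximate capacity needed to bring the metric back to target.

    Assumption: the metric scales inversely with capacity (e.g. average CPU).
    """
    return max(1, math.ceil(current_capacity * metric_value / target_value))


def pick_step_adjustment(metric_value, threshold, steps):
    """Return the capacity change for the step that contains the breach.

    steps: list of (lower_bound, upper_bound, adjustment); bounds are offsets
    from the alarm threshold, and None means unbounded.
    """
    diff = metric_value - threshold
    for lower, upper, adjustment in steps:
        if (lower is None or diff >= lower) and (upper is None or diff < upper):
            return adjustment
    return 0  # metric is not inside any step: no change


# Hypothetical policy: alarm at 60 % CPU; +1 instance for 60-80 %, +3 above 80 %.
SCALE_OUT_STEPS = [(0, 20, 1), (20, None, 3)]

estimate = target_tracking_estimate(current_capacity=4, metric_value=75, target_value=50)
small_breach = pick_step_adjustment(65, 60, SCALE_OUT_STEPS)
large_breach = pick_step_adjustment(85, 60, SCALE_OUT_STEPS)
```

With target tracking you state only the target; with step scaling you own the thresholds and the size of each step.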

Lifecycle hooks

Pause an instance in Pending:Wait or Terminating:Wait to run custom logic (warm up, drain cache, save logs).
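Hook notifies via EventBridge (which can invoke Lambda), SNS, or SQS; the instance can also complete the hook itself (e.g. from user data).
If nothing completes the hook, the heartbeat timeout (default 1 hour) expires and the default result applies (ABANDON unless configured otherwise).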
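
The wait states can be pictured as a small state machine. The sketch below is a simplified model, not an AWS API: it lists the states an instance passes through for a launch or terminate hook, depending on whether the hook completes with CONTINUE or ABANDON. The ABANDON-on-launch path is collapsed to a single Terminated state for brevity.

```python
LAUNCH_PATH = ["Pending", "Pending:Wait", "Pending:Proceed", "InService"]
TERMINATE_PATH = ["Terminating", "Terminating:Wait",
                  "Terminating:Proceed", "Terminated"]


def lifecycle_path(transition, hook_result):
    """Return the sequence of states for one hook outcome (simplified)."""
    if hook_result not in ("CONTINUE", "ABANDON"):
        raise ValueError("hook_result must be CONTINUE or ABANDON")
    if transition == "launch":
        if hook_result == "CONTINUE":
            return list(LAUNCH_PATH)
        # ABANDON on launch: the launch is treated as failed and the
        # instance is terminated (intermediate states omitted here).
        return ["Pending", "Pending:Wait", "Terminated"]
    if transition == "terminate":
        # Termination proceeds either way; ABANDON only skips remaining hooks.
        return list(TERMINATE_PATH)
    raise ValueError("transition must be 'launch' or 'terminate'")


launch_ok = lifecycle_path("launch", "CONTINUE")
launch_failed = lifecycle_path("launch", "ABANDON")
drain = lifecycle_path("terminate", "ABANDON")
```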

Termination policies & instance refresh

Default: first pick the AZ with the most instances (to keep AZs balanced), then the instance on the oldest LC/LT, then the one closest to the next billing hour, then random.
Instance Refresh rolls a new launch template through the ASG (controlled %, warm-up).
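
A simplified sketch of both ideas. The first function imitates the default termination order (largest AZ, then oldest template, then closest to the billing hour); the final random tiebreak is replaced by the instance ID so the result is reproducible. The second estimates how many instances an Instance Refresh could replace per batch for a given minimum healthy percentage; the rounding is an assumption. The instance records and field names are hypothetical.

```python
import math
from collections import Counter


def pick_instance_to_terminate(instances):
    """Pick a scale-in victim following a simplified default policy."""
    az_counts = Counter(i["az"] for i in instances)
    # Largest AZ first; ties resolved alphabetically for determinism.
    busiest_az = max(sorted(az_counts), key=lambda az: az_counts[az])
    candidates = [i for i in instances if i["az"] == busiest_az]
    oldest = min(i["template_version"] for i in candidates)
    candidates = [i for i in candidates if i["template_version"] == oldest]
    # Real policy ends with a random pick; the ID is used here instead.
    return min(candidates, key=lambda i: (i["minutes_to_billing_hour"], i["id"]))


def refresh_batch_size(desired, min_healthy_pct):
    """Instances replaceable at once while keeping min_healthy_pct in service."""
    must_stay_healthy = math.ceil(desired * min_healthy_pct / 100)
    return max(1, desired - must_stay_healthy)  # at least one at a time


fleet = [
    {"id": "i-1", "az": "a", "template_version": 1, "minutes_to_billing_hour": 40},
    {"id": "i-2", "az": "a", "template_version": 2, "minutes_to_billing_hour": 5},
    {"id": "i-3", "az": "a", "template_version": 1, "minutes_to_billing_hour": 10},
    {"id": "i-4", "az": "b", "template_version": 1, "minutes_to_billing_hour": 1},
    {"id": "i-5", "az": "b", "template_version": 1, "minutes_to_billing_hour": 2},
]
victim = pick_instance_to_terminate(fleet)
batch = refresh_batch_size(desired=10, min_healthy_pct=90)
```

In this example AZ "a" holds the most instances, so the victim comes from there even though instances in AZ "b" are closer to their billing hour.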

Mixed instances policy

Mix On-Demand + Spot in one ASG.
Allocate across multiple instance types and purchase options.
Maximize availability of Spot capacity.
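
As a rough illustration, the dictionary below follows the shape of the MixedInstancesPolicy parameter used when creating an ASG with boto3; the template name and instance types are hypothetical. The helper shows how an On-Demand base plus a percentage above the base splits the desired capacity. Rounding the On-Demand share up is an assumption made for this sketch.

```python
import math

mixed_instances_policy = {
    "LaunchTemplate": {
        "LaunchTemplateSpecification": {
            "LaunchTemplateName": "web-template",  # hypothetical name
            "Version": "$Latest",
        },
        # Several instance types widen the pool of available Spot capacity.
        "Overrides": [
            {"InstanceType": "m5.large"},
            {"InstanceType": "m5a.large"},
            {"InstanceType": "m6i.large"},
        ],
    },
    "InstancesDistribution": {
        "OnDemandBaseCapacity": 2,
        "OnDemandPercentageAboveBaseCapacity": 50,
        "SpotAllocationStrategy": "price-capacity-optimized",
    },
}


def split_capacity(desired, on_demand_base, on_demand_pct_above_base):
    """Return (on_demand, spot) instance counts for a desired capacity."""
    base = min(desired, on_demand_base)
    above_base = desired - base
    # Assumption: the On-Demand share above the base is rounded up.
    on_demand_above = math.ceil(above_base * on_demand_pct_above_base / 100)
    on_demand = base + on_demand_above
    return on_demand, desired - on_demand


dist = mixed_instances_policy["InstancesDistribution"]
on_demand, spot = split_capacity(
    10, dist["OnDemandBaseCapacity"], dist["OnDemandPercentageAboveBaseCapacity"]
)
```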

Warm pools

Pre-initialized instances kept stopped (default), hibernated, or running, ready to join the ASG quickly; useful when app boot time is long.
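
A small sketch of how warm pool sizing works out, assuming the documented rule that the pool holds the difference between the maximum prepared capacity (the ASG max size unless overridden) and the current desired capacity, but never fewer than the pool's minimum size. The request dictionary mirrors the parameters of the put_warm_pool call in boto3; the group name and numbers are hypothetical.

```python
def warm_pool_size(desired, asg_max, pool_min=0, max_group_prepared_capacity=None):
    """Estimate how many instances the warm pool keeps ready."""
    prepared = asg_max if max_group_prepared_capacity is None else max_group_prepared_capacity
    return max(pool_min, prepared - desired, 0)


warm_pool_request = {
    "AutoScalingGroupName": "web-asg",  # hypothetical name
    "MinSize": 2,
    "MaxGroupPreparedCapacity": 5,
    "PoolState": "Stopped",  # alternatives: "Running", "Hibernated"
}

default_pool = warm_pool_size(desired=4, asg_max=10)
capped_pool = warm_pool_size(
    desired=4, asg_max=10,
    pool_min=warm_pool_request["MinSize"],
    max_group_prepared_capacity=warm_pool_request["MaxGroupPreparedCapacity"],
)
```

Capping the prepared capacity keeps the pool small when paying for storage on many stopped instances is not worth it.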

ELB / ALB integration

ASG associates with one or more target groups.
ELB health checks must be enabled on the ASG explicitly; they catch application-level failures that the default EC2 status checks (instance/host level only) miss.
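
The replacement decision can be sketched as below. This is a simplified model, not AWS code: it assumes health results are ignored during the health check grace period, that a failed EC2 status check always marks the instance unhealthy, and that ELB results count only when ELB health checks are enabled on the ASG. All parameter names are hypothetical.

```python
def is_unhealthy(ec2_ok, elb_ok, elb_checks_enabled,
                 seconds_since_launch, grace_period=300):
    """Decide whether the ASG would mark an instance for replacement."""
    if seconds_since_launch < grace_period:
        return False  # still booting: do not act on health results yet
    if not ec2_ok:
        return True   # EC2 status checks always count
    return elb_checks_enabled and not elb_ok


# The app is down but the VM itself is fine:
missed = is_unhealthy(ec2_ok=True, elb_ok=False, elb_checks_enabled=False,
                      seconds_since_launch=900)
caught = is_unhealthy(ec2_ok=True, elb_ok=False, elb_checks_enabled=True,
                      seconds_since_launch=900)
```

The first case shows why EC2 checks alone are not enough for a web tier: the instance passes status checks while serving errors.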

When to use vs alternatives

Use ...	Instead of ...	When ...
ASG + ALB	Single EC2	Always for production web tier
ASG with Spot	All On-Demand	Workload is fault-tolerant and stateless
Predictive scaling	Reactive	Predictable daily pattern
Lifecycle hooks	None	Need warm-up scripts or graceful drain
Warm pool	Cold launches	Boot time > a couple of minutes

Common exam scenarios

"Web tier must survive AZ failure and scale to load" → ASG across 2+ AZs behind ALB.
"Save cost by mixing On-Demand + Spot in ASG" → mixed instances policy.
"Stop scaling thrash on metric spikes" → target tracking or step scaling with a realistic instance warm-up; with simple scaling, lengthen the cooldown.
"Drain in-flight requests before terminating" → deregistration delay on TG + lifecycle hook on terminate.
"Predictable lunchtime spike each day" → scheduled or predictive scaling.
"App takes 5 min to boot" → warm pool to reduce start latency.
"Roll a new AMI through ASG safely" → Instance Refresh with a minimum healthy percentage (MinHealthyPercentage) and instance warm-up.

Exam tip

"Highly available + scalable" almost always = ASG across multi-AZ behind ELB. If a question shows a single EC2 instance with an Elastic IP, that is a wrong design unless the question is testing limits.

References

https://docs.aws.amazon.com/autoscaling/