☁︎SAA-C03

Step Functions — Concept

What it is

AWS Step Functions is a serverless workflow orchestrator. You define a state machine in JSON using the Amazon States Language (ASL), and Step Functions runs it, handling tasks, choices, parallel branches, retries, error handling, and waits.
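As a rough illustration of what "a state machine in JSON" looks like, the sketch below builds a two-state definition as a Python dict and serializes it. The Lambda ARN, account ID, and state names are hypothetical placeholders, not values from these notes.

```python
import json

# Hypothetical Lambda ARN, used only for illustration.
CHARGE_FN_ARN = "arn:aws:lambda:us-east-1:123456789012:function:ChargeCard"

# A minimal Amazon States Language definition: one Task, then Succeed.
state_machine = {
    "Comment": "Minimal two-step workflow (illustrative)",
    "StartAt": "ChargeCard",
    "States": {
        "ChargeCard": {
            "Type": "Task",
            "Resource": CHARGE_FN_ARN,
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
    },
}

# This JSON string is what you would supply as the state machine definition.
definition_json = json.dumps(state_machine, indent=2)
print(definition_json)
```

Every definition needs a StartAt state and at least one terminal state (End: true, Succeed, or Fail).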

Why it exists

Chaining Lambda functions together with SNS and SQS quickly becomes spaghetti: there is no central view of progress, retry behavior is scattered across services, and human approval steps are hard to add. Step Functions gives you visual workflows, managed state, error handling, long-running waits, and integrations with more than 200 AWS services.

Two workflow types

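|                     | Standard                                  | Express                                                    |
|---------------------|-------------------------------------------|------------------------------------------------------------|
| Max duration        | 1 year                                    | 5 minutes                                                  |
| Execution rate      | 2,000 / s (start rate)                    | 100,000+ / s                                               |
| Pricing             | per state transition                      | per request + duration (cheaper at high volume)            |
| Execution semantics | exactly-once                              | at-least-once (asynchronous) or at-most-once (synchronous) |
| Use                 | long, durable, human approval, infrequent | high-volume event-driven                                   |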
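The choice between the two types can be reduced to a rule of thumb. The helper below is a hypothetical sketch, not an AWS API: it picks Standard when the workflow may exceed the 5-minute Express limit or needs a .sync or .waitForTaskToken integration (these patterns are not available in Express workflows), and Express otherwise.

```python
EXPRESS_MAX_SECONDS = 5 * 60  # Express workflows are capped at 5 minutes


def pick_workflow_type(max_duration_seconds: int,
                       needs_callback_or_sync: bool = False) -> str:
    """Return "STANDARD" or "EXPRESS" using the rules of thumb above.

    Hypothetical helper for study purposes; it ignores cost and volume.
    """
    if max_duration_seconds > EXPRESS_MAX_SECONDS or needs_callback_or_sync:
        return "STANDARD"
    return "EXPRESS"


# A 24-hour approval flow versus a 30-second event handler.
print(pick_workflow_type(24 * 3600, needs_callback_or_sync=True))
print(pick_workflow_type(30))
```

The returned strings match the values accepted for the state machine type when creating one (STANDARD or EXPRESS).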

State types

  • Task: invoke a service (Lambda, ECS, SNS, SQS, DynamoDB, …).
  • Choice: branch based on the input.
  • Wait: pause for a number of seconds or until a timestamp.
  • Parallel: run a fixed set of branches at the same time.
  • Map: run the same steps once per item in a list.
  • Pass / Succeed / Fail: control flow (pass input through, or end the execution as a success or a failure).
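To make the state types concrete, here is a small sketch that combines Choice, Wait, Map, Succeed, and Fail in one definition, plus a hypothetical helper that checks every transition points at a state that exists. The Lambda ARN, field names such as $.imageCount, and all state names are assumptions made for the example.

```python
import json

# Hypothetical Lambda ARN, for illustration only.
RESIZE_FN_ARN = "arn:aws:lambda:us-east-1:123456789012:function:ResizeImage"

definition = {
    "StartAt": "HasImages",
    "States": {
        # Choice: branch on a field of the input.
        "HasImages": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.imageCount", "NumericGreaterThan": 0,
                 "Next": "CoolDown"}
            ],
            "Default": "NothingToDo",
        },
        # Wait: pause for a fixed number of seconds.
        "CoolDown": {"Type": "Wait", "Seconds": 10, "Next": "ResizeEach"},
        # Map: run the inner workflow once per element of $.images.
        "ResizeEach": {
            "Type": "Map",
            "ItemsPath": "$.images",
            "MaxConcurrency": 5,
            "ItemProcessor": {
                "StartAt": "Resize",
                "States": {
                    "Resize": {"Type": "Task", "Resource": RESIZE_FN_ARN,
                               "End": True}
                },
            },
            "Next": "Finished",
        },
        "Finished": {"Type": "Succeed"},
        "NothingToDo": {"Type": "Fail", "Error": "NoImages",
                        "Cause": "Input had no images"},
    },
}


def transition_targets(states: dict) -> set:
    """Collect every top-level state name referenced by Next or Default."""
    targets = set()
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for rule in state.get("Choices", []):
            targets.add(rule["Next"])
    return targets


missing = transition_targets(definition["States"]) - set(definition["States"])
print("undefined targets:", sorted(missing))
print(json.dumps(definition, indent=2))
```

A Parallel state has the same shape as Map, except that it takes a fixed "Branches" list of sub-workflows instead of iterating over input items.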

Integrations

  • Direct integrations with AWS services: no Lambda glue needed (e.g., arn:aws:states:::dynamodb:putItem).
  • .sync suffix: wait for the service job to complete (Glue job, ECS task, EMR step, Batch job).
  • .waitForTaskToken suffix: pause until a callback returns the task token (great for human approval).
  • Both suffixes are supported in Standard workflows only, not Express.
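The three integration patterns above differ only in the Resource ARN suffix. The sketch below shows one Task state for each; the table name, Glue job name, queue URL, and input fields are hypothetical placeholders.

```python
import json

states = {
    # Plain request/response: call DynamoDB directly, no Lambda in between.
    "SaveOrder": {
        "Type": "Task",
        "Resource": "arn:aws:states:::dynamodb:putItem",
        "Parameters": {
            "TableName": "Orders",  # hypothetical table
            "Item": {"orderId": {"S.$": "$.orderId"}},
        },
        "ResultPath": None,  # serialized as null: keep the original input
        "Next": "RunEtl",
    },
    # .sync: start the Glue job and wait until it finishes.
    "RunEtl": {
        "Type": "Task",
        "Resource": "arn:aws:states:::glue:startJobRun.sync",
        "Parameters": {"JobName": "nightly-etl"},  # hypothetical job
        "ResultPath": None,
        "Next": "AskApprover",
    },
    # .waitForTaskToken: send the token out, pause until it is returned.
    "AskApprover": {
        "Type": "Task",
        "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
        "Parameters": {
            # Hypothetical queue URL.
            "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/approvals",
            "MessageBody": {
                "orderId.$": "$.orderId",
                "taskToken.$": "$$.Task.Token",  # token from the context object
            },
        },
        "TimeoutSeconds": 86400,  # give up if nobody answers within 24 h
        "End": True,
    },
}

print(json.dumps(states, indent=2))
```

Setting ResultPath to null discards the service response so later states still see the original input, including $.orderId.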

Error handling

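  • Per-task Retry with interval, max attempts, and exponential backoff rate.
  • Catch blocks route specific errors to a fallback state.
  • Failed executions are visible in the console with full history (Standard; Express execution history is sent to CloudWatch Logs).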
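A Task with Retry and Catch might look like the sketch below. The helper retry_delays is a hypothetical function that shows how the wait before each retry grows as IntervalSeconds × BackoffRate^(attempt − 1); it ignores optional settings such as MaxDelaySeconds and jitter. The Lambda ARN and state names are placeholders.

```python
task_with_error_handling = {
    "Type": "Task",
    # Hypothetical Lambda ARN.
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:ChargeCard",
    "Retry": [
        {
            "ErrorEquals": ["States.Timeout", "Lambda.ServiceException"],
            "IntervalSeconds": 2,   # wait before the first retry
            "MaxAttempts": 3,       # number of retries
            "BackoffRate": 2.0,     # multiplier applied after each retry
        }
    ],
    "Catch": [
        # Anything still failing after the retries goes to a fallback state.
        {"ErrorEquals": ["States.ALL"], "ResultPath": "$.error",
         "Next": "NotifyFailure"}
    ],
    "Next": "Done",
}


def retry_delays(retrier: dict) -> list:
    """Seconds waited before each retry attempt for one Retry entry."""
    interval = retrier.get("IntervalSeconds", 1)  # ASL default
    rate = retrier.get("BackoffRate", 2.0)        # ASL default
    attempts = retrier.get("MaxAttempts", 3)      # ASL default
    return [interval * rate ** n for n in range(attempts)]


print(retry_delays(task_with_error_handling["Retry"][0]))  # [2.0, 4.0, 8.0]
```

Retriers are evaluated in order, and Catch is consulted only once the matching retrier is exhausted or no retrier matches the error.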

When to use vs alternatives

| Use ...                 | Instead of ...  | When ...                                                         |
|-------------------------|-----------------|------------------------------------------------------------------|
| Step Functions Standard | chained Lambdas | Long workflow (hours/days), visibility, retries, human approval  |
| Step Functions Express  | chained Lambdas | High-volume short workflows, event processing                    |
| EventBridge             | Step Functions  | Simple "event → fanout" routing, no orchestration                |
| SQS                     | Step Functions  | Decoupling and buffering only                                    |
| AWS Batch               | Step Functions  | Long-running compute jobs (vs orchestration)                     |

Common exam scenarios

  1. "Multi-step order workflow with approval after 24 h" → Standard workflow with .waitForTaskToken.
  2. "Image processing pipeline: thumbnail → label → DB update with retries" → Express (or Standard).
  3. "Run hundreds of parallel data tasks with map state" → Step Functions Map (distributed).
  4. "Coordinate Glue job → wait for it → run Lambda" → Task .sync on Glue job.
  5. "Need visual diagram of the workflow + error history" → Step Functions console.
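For the approval scenario in item 1, the paused execution resumes when something returns the task token. A minimal sketch of that callback side follows, assuming a boto3 Step Functions client is passed in; resolve_approval and the output payload are hypothetical, while send_task_success and send_task_failure are the client methods used to answer a token.

```python
import json


def resolve_approval(sfn_client, task_token: str, approved: bool) -> None:
    """Resume a workflow paused on .waitForTaskToken (hypothetical helper).

    sfn_client is expected to behave like boto3.client("stepfunctions").
    """
    if approved:
        # The JSON output becomes the result of the waiting Task state.
        sfn_client.send_task_success(
            taskToken=task_token,
            output=json.dumps({"approved": True}),
        )
    else:
        # The waiting Task fails with this error, which a Catch block can handle.
        sfn_client.send_task_failure(
            taskToken=task_token,
            error="Rejected",
            cause="Approver declined the request",
        )
```

In practice this would typically run in whatever handles the approver's click, for example a Lambda function behind an API endpoint, called as resolve_approval(boto3.client("stepfunctions"), token, True).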

Exam tip

"Orchestrate" / "workflow" / "human approval" / "retry & catch" → Step Functions. For "event routing" → EventBridge. For "simple decoupling" → SQS.
