S3 — Concept
What it is
Amazon Simple Storage Service (S3) is AWS's highly durable object storage with virtually unlimited capacity. Objects (up to 5 TB each) live in buckets; each bucket has a globally unique name and is created in a specific region.
Why it exists
S3 is the bedrock of AWS: data lake, static website hosting, backups, archive, big-data input/output, content distribution origin. S3 Standard is designed for 11 nines (99.999999999 %) of durability and 99.99 % availability.
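To get a feel for what those two percentages mean, the arithmetic can be sketched as below. This is a simplified model, not an AWS formula: it assumes each object has an independent annual loss probability of 1 minus the durability figure, and that the availability figure applies evenly over a 365-day year. The function names are made up for illustration.

```python
from fractions import Fraction

DURABILITY = Fraction("0.99999999999")   # 11 nines, from the text above
AVAILABILITY = Fraction("0.9999")        # 99.99 %, from the text above


def years_per_lost_object(object_count):
    """Average years between single-object losses under the simplified model."""
    expected_losses_per_year = object_count * (1 - DURABILITY)
    return 1 / expected_losses_per_year


def downtime_minutes_per_year():
    """Minutes of unavailability per 365-day year implied by the availability figure."""
    return float((1 - AVAILABILITY) * 365 * 24 * 60)


print(years_per_lost_object(10_000_000))
print(downtime_minutes_per_year())
```

Under these assumptions, storing 10 million objects works out to roughly one lost object every 10,000 years, and 99.99 % availability allows a little under an hour of downtime per year.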
Core concepts
- Bucket — top-level container (region-scoped, globally unique name).
- Object — file + metadata + version ID (if versioning).
- Key — full path inside the bucket (e.g. 2026/05/photo.jpg).
- Prefix — virtual folder (no real hierarchy under the hood).
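The point that prefixes are only virtual folders can be illustrated with a small sketch. It keeps keys in a flat list and groups them by a delimiter at listing time, which is the idea behind delimiter-based listings. The function `list_level` and the sample keys (other than 2026/05/photo.jpg) are hypothetical; this is plain Python, not a call to the S3 API.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Group flat keys into direct objects and 'folder' prefixes under `prefix`."""
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Keep everything up to and including the first delimiter after the prefix.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(common_prefixes)


keys = ["2026/05/photo.jpg", "2026/05/notes.txt", "2026/06/scan.pdf", "readme.txt"]

top_objects, top_folders = list_level(keys)
may_objects, may_folders = list_level(keys, prefix="2026/05/")
print(top_objects, top_folders)
print(may_objects, may_folders)
```

The "folders" exist only as a by-product of how the keys are grouped; the storage itself is a flat key space.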
Storage classes (must memorize)
| Class | Latency | Min storage | Use |
|---|---|---|---|
| S3 Standard | ms | — | Frequent access, hot data |
| S3 Intelligent-Tiering | ms | — | Unknown / changing access pattern (auto moves between Frequent/Infrequent/Archive tiers) |
| S3 Standard-IA | ms | 30 d | Infrequent but quick access; retrieval fee |
| S3 One Zone-IA | ms | 30 d | Single-AZ, ~20 % cheaper than Std-IA, reproducible data |
| S3 Glacier Instant Retrieval | ms | 90 d | Archive that needs occasional fast access |
| S3 Glacier Flexible Retrieval | minutes–hours (1–5 min Expedited, 3–5 h Std, 5–12 h Bulk) | 90 d | Backup / DR |
| S3 Glacier Deep Archive | 12 h Std / 48 h Bulk | 180 d | Long-term compliance, cheapest |
| S3 Reduced Redundancy (RRS) | ms | — | Deprecated, don't pick on exam |
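The "Min storage" column matters for cost questions: deleting or transitioning an object early still incurs the minimum duration. A minimal sketch of that rule, using the day counts from the table above, might look like this. The helper `billable_days` is hypothetical and ignores retrieval fees, minimum object sizes, and how the charge is actually prorated on a bill.

```python
# Minimum storage duration in days, taken from the table above (0 = none).
MIN_STORAGE_DAYS = {
    "S3 Standard": 0,
    "S3 Intelligent-Tiering": 0,
    "S3 Standard-IA": 30,
    "S3 One Zone-IA": 30,
    "S3 Glacier Instant Retrieval": 90,
    "S3 Glacier Flexible Retrieval": 90,
    "S3 Glacier Deep Archive": 180,
}


def billable_days(storage_class, days_stored):
    """Days of storage charged: actual days, but never less than the class minimum."""
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])


print(billable_days("S3 Standard-IA", 10))
print(billable_days("S3 Glacier Deep Archive", 200))
```

For example, an object removed from Standard-IA after 10 days is charged as if it had stayed 30 days, which is why short-lived data usually belongs in S3 Standard.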
Lifecycle policy
- Transition objects between storage classes by age (e.g. Standard → Standard-IA after 30 d → Glacier after 90 d).
- Expire (delete) objects after N days.
- Can also expire incomplete multipart uploads (clean up costs).
- Filter by prefix or tag.
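The rules above could be written as one lifecycle configuration. The sketch below builds it as a plain dictionary in the shape accepted by boto3's `put_bucket_lifecycle_configuration` (as its `LifecycleConfiguration` argument); the rule ID, the `logs/` prefix, and the 365-day and 7-day values are assumptions chosen for illustration. `STANDARD_IA` and `GLACIER` are the API identifiers for Standard-IA and Glacier Flexible Retrieval.

```python
import json

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",          # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},             # hypothetical prefix filter
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},               # delete after one year (assumed)
            # Clean up parts of uploads that were started but never completed.
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

Only the dictionary is built here; applying it would require AWS credentials and a real bucket.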
Versioning & MFA Delete
- Versioning (bucket-level) keeps all versions of objects; deletes create a delete marker.
- Once enabled cannot be disabled (only suspended).
- MFA Delete requires MFA to permanently delete object versions or to change the bucket's versioning state; only the bucket owner's root account can enable it.
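The delete-marker behaviour is easier to remember with a toy model. The class below is an in-memory sketch, not the S3 API: `VersionedBucket`, its methods, and the `v1`, `v2` style version IDs are all invented for illustration. A delete without a version ID only adds a marker; a delete with a version ID permanently removes that one version.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket."""

    def __init__(self):
        self._versions = {}               # key -> list of (version_id, body or None)
        self._ids = itertools.count(1)

    def put(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key, version_id=None):
        if version_id is None:
            # Simple delete: nothing is removed, a delete marker becomes current.
            return self.put(key, None)
        # Versioned delete: permanently removes exactly that version.
        self._versions[key] = [v for v in self._versions[key] if v[0] != version_id]
        return version_id

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1][1] is None:
            raise KeyError(key)           # the real service would answer "not found"
        return versions[-1][1]


bucket = VersionedBucket()
bucket.put("a.txt", "one")
bucket.put("a.txt", "two")
marker = bucket.delete("a.txt")            # object now looks deleted
bucket.delete("a.txt", version_id=marker)  # removing the marker "undeletes" it
print(bucket.get("a.txt"))
```

Removing the delete marker restores the latest real version, which is the usual way to recover from an accidental delete.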
Replication
- Cross-Region Replication (CRR) — DR, latency.
- Same-Region Replication (SRR) — log aggregation, compliance.
- Requires versioning enabled on source and destination.
- Asynchronous; new objects only (use Batch Replication for backfill).
- Cross-account works.
Encryption (very common exam topic)
| Mode | Who manages key | Notes |
|---|---|---|
| SSE-S3 (default) | AWS-managed AES-256 | Free; transparent |
| SSE-KMS | KMS-managed CMK | Audit + key rotation; counts toward KMS API limits |
| SSE-C | Customer provides key per request | AWS doesn't store the key |
| DSSE-KMS | KMS-managed key | Two independent layers of server-side encryption applied with a KMS key; higher-assurance compliance |
| Client-side | You encrypt before upload | E2E control |
Buckets can require encryption via bucket policy with s3:x-amz-server-side-encryption condition.
Block Public Access (BPA) overrides any public ACLs or bucket policies. It can be set at account and bucket level and is enabled by default on new buckets.
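A bucket policy using the s3:x-amz-server-side-encryption condition mentioned above might be sketched as follows. The bucket name, the `Sid`, and the helper `require_sse_kms_policy` are hypothetical; the statement denies any `s3:PutObject` request that does not ask for SSE-KMS, and because the condition is a negated match, requests that omit the header altogether are denied as well.

```python
import json

BUCKET = "example-bucket"  # hypothetical bucket name


def require_sse_kms_policy(bucket):
    """Build a bucket policy that rejects uploads not requesting SSE-KMS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUploadsWithoutSseKms",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            }
        ],
    }


policy = require_sse_kms_policy(BUCKET)
print(json.dumps(policy, indent=2))
```

The JSON printed here is what would be attached to the bucket as its policy document; whether to pair it with default encryption is a design choice that depends on the clients uploading to the bucket.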
Security & access
- Bucket policy (resource-based) — fine-grained including conditions like IP, VPC, encryption header, MFA.
- IAM policy — identity-based.
- ACLs — legacy; AWS recommends disabling object ACLs (Bucket Owner Enforced).
- Pre-signed URL — time-limited URL for uploading or downloading one object, signed with the creator's IAM credentials and limited to that identity's permissions.
- Block Public Access — master switch, defaults to ON for new buckets.
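The Block Public Access "master switch" is really four separate settings. The sketch below writes them out in the shape boto3's `put_public_access_block` takes as `PublicAccessBlockConfiguration`, plus a small check that all four are on. The helper `is_fully_blocked` is hypothetical and only inspects a dictionary; it does not query AWS.

```python
# The four Block Public Access settings, all switched on.
public_access_block = {
    "BlockPublicAcls": True,        # reject new public ACLs
    "IgnorePublicAcls": True,       # ignore public ACLs that already exist
    "BlockPublicPolicy": True,      # reject new public bucket policies
    "RestrictPublicBuckets": True,  # restrict access to buckets with public policies
}

REQUIRED_SETTINGS = tuple(public_access_block)


def is_fully_blocked(config):
    """True only if every Block Public Access setting is present and enabled."""
    return all(config.get(name, False) for name in REQUIRED_SETTINGS)


print(is_fully_blocked(public_access_block))
print(is_fully_blocked({"BlockPublicAcls": True}))
```

A check like this is the kind of thing an audit script might run over exported bucket settings; a partially enabled configuration counts as not blocked.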
Object Lock & Glacier Vault Lock
- Object Lock = WORM (write-once-read-many), legal hold or retention period (Governance/Compliance mode).
- Bucket versioning required.
Performance features
- Multipart upload — required > 5 GB, recommended > 100 MB; parallel parts; resumable.
- Byte-range fetches — parallel download.
- S3 Transfer Acceleration — uploads via CloudFront edge POPs (extra fee).
- S3 Select / Glacier Select — server-side SQL on individual objects (CSV/JSON/Parquet).
- Request rate: at least 5,500 GET / 3,500 PUT per prefix per second (scales).
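Multipart upload and byte-range fetches both come down to splitting an object into byte ranges that can be handled in parallel. A minimal planning sketch, assuming the S3 multipart limits of 5 MiB minimum part size (except the last part), 5 GiB maximum part size, and 10,000 parts per upload, could look like this. `plan_parts` and its 100 MiB default are hypothetical; the range strings use the HTTP `Range` header format used for byte-range GETs.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART = 5 * MIB     # smallest allowed part, except the last one
MAX_PART = 5 * GIB     # largest allowed part
MAX_PARTS = 10_000     # most parts allowed in one multipart upload


def plan_parts(object_size, part_size=100 * MIB):
    """Return (part_number, 'bytes=start-end') pairs covering the whole object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size if needed so the object fits in MAX_PARTS parts.
    smallest_allowed = -(-object_size // MAX_PARTS)  # ceiling division
    part_size = max(part_size, MIN_PART, smallest_allowed)
    if part_size > MAX_PART:
        raise ValueError("object too large for a single multipart upload")
    parts, start, number = [], 0, 1
    while start < object_size:
        end = min(start + part_size, object_size) - 1  # inclusive end offset
        parts.append((number, f"bytes={start}-{end}"))
        start = end + 1
        number += 1
    return parts


for number, byte_range in plan_parts(250 * MIB):
    print(number, byte_range)
```

Each range can be uploaded as one part or fetched with one ranged GET, so a failed transfer only needs to retry the affected range rather than the whole object.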
Storage Lens, Inventory, Analytics
- S3 Inventory = daily or weekly CSV/ORC/Parquet listing of objects (for batch processing).
- Storage Lens = org-wide usage & best-practice dashboard.
- S3 Storage Class Analysis = recommends Std → IA transitions.
When to use vs alternatives
| Use ... | Instead of ... | When ... |
|---|---|---|
| S3 | EFS / EBS | Object storage, no POSIX, web-scale |
| Intelligent-Tiering | Manual lifecycle | Unknown / variable access patterns |
| Glacier Deep Archive | On-prem tape | Cheapest long-term archive |
| S3 Transfer Acceleration | Direct upload | Far-from-region clients pushing big files |
Common exam scenarios
- "Cheapest storage for 7-year compliance archive, retrieval rare" → Glacier Deep Archive.
- "Auto-optimize cost without knowing access pattern" → Intelligent-Tiering.
- "Static website hosted on S3 with HTTPS and custom domain" → S3 + CloudFront (S3 website is HTTP only).
- "Allow upload to bucket from a mobile app without exposing keys" → Pre-signed PUT URL.
- "Ensure all objects encrypted with KMS" → bucket policy with s3:x-amz-server-side-encryption-aws-kms-key-id condition + default encryption.
- "DR replica in another region" → CRR with versioning.
- "Legal hold / immutable storage for audit logs" → Object Lock (Compliance mode) + versioning.
- "Block accidental public exposure" → Block Public Access ON at account level.
Exam tip
S3 = object storage, never mount as a file system in real solutions. CloudFront is the standard companion for HTTPS / latency. Encryption + Block Public Access are almost always part of "best practice" answers.