| Section | Topics |
|---|---|
| AWS Global Infrastructure | Regions, AZs, Edge Locations |
| IAM | Users, Groups, Policies, Roles |
| EC2 | Instance Types, Purchasing, Security Groups |
| AMI | Custom AMIs, Cross-region |
| EBS | Volume Types, Snapshots, Multi-Attach |
| EFS | Performance Modes, Storage Tiers |
| Storage Comparison | EBS vs EFS vs Instance Store |
| ELB & ASG | ALB, NLB, GLB, Health Checks, Scaling |
| RDS & Aurora | Read Replicas, Multi-AZ, Aurora, Proxy |
| ElastiCache | Redis vs Memcached, Caching Strategies |
| S3 | Storage Classes, Security, Replication, Performance |
| Lambda | Invocations, Concurrency, Layers, VPC |
| API Gateway | Endpoints, Integrations, Security, Caching |
| DynamoDB | Capacity, Indexes, Streams, DAX |
| SQS | Standard/FIFO, Visibility, DLQ |
| SNS | Fan-out, Filtering, FIFO |
| Kinesis | Streams, Firehose, Analytics |
| Step Functions | State Machines, Workflows |
| Containers | ECS, Fargate, ECR |
| CloudFormation | Templates, Functions, Stacks |
| SAM | Serverless Templates, CLI |
| CI/CD | CodeCommit, CodeBuild, CodeDeploy, CodePipeline |
| CloudWatch | Metrics, Logs, Alarms |
| X-Ray | Distributed Tracing, Sampling |
| Cognito | User Pools, Identity Pools |
| KMS | Encryption, Key Types, Envelope Encryption |
| Secrets & Parameters | Secrets Manager, SSM Parameter Store |
| EventBridge | Event Bus, Rules, Targets |
| Elastic Beanstalk | Deployment Policies, Extensions |
| Self-Exam Questions | 100+ questions across all DVA-C02 topics |
A Region is a geographic area with multiple data centers (e.g., us-east-1, eu-west-2).
| Consideration | Description |
|---|---|
| Compliance | Data may need to stay in specific countries |
| Latency | Deploy closer to users for better performance |
| Service availability | Not all services available in all regions |
| Pricing | Varies by region |
Each Region has at least 3 AZs, typically 3 to 6 (e.g., us-east-1a, us-east-1b). Each AZ = one or more discrete data centers with independent power, networking, and connectivity.
| Key Point | Detail |
|---|---|
| Isolation | AZs are physically separated (disaster protection) |
| Low latency | Connected via high-bandwidth, low-latency networking |
| HA design | Distribute resources across AZs for fault tolerance |
| Concept | Description |
|---|---|
| Edge Locations | CDN endpoints for CloudFront (200+ worldwide) |
| Global services | IAM, Route 53, CloudFront, WAF (not region-specific) |
| Regional services | EC2, RDS, EBS, etc. (bound to a region) |
💡 Exam tip: Know which services are global vs regional. IAM is global; EC2 and RDS are regional; EBS volumes are AZ-specific.
Global service — not region-specific
| Concept | Description |
|---|---|
| Users | Individual identities, can belong to multiple groups |
| Groups | Collections of users (cannot nest groups) |
| Policies | JSON documents defining permissions |
| Inline Policy | Policy attached directly to a user (no group needed) |
{
"Version": "2012-10-17", // Policy language version
"Statement": [{
"Sid": "StatementId", // Optional identifier
"Effect": "Allow|Deny",
"Principal": "arn:...", // Account/user/role this applies to
"Action": ["s3:Get*"], // API actions
"Resource": ["arn:..."] // Target resources
}]
}
- Permissions for services, not users
- Attach policies to roles → assign roles to services (e.g., EC2)
- The service receiving the role = trusted entity
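To make the "trusted entity" idea concrete, here is a minimal sketch of the trust policy a role would carry so that EC2 can assume it. The helper name `ec2_trust_policy` is hypothetical; the policy only names who may assume the role, and the permissions themselves would come from separately attached policies.

```python
import json


def ec2_trust_policy() -> dict:
    """Build a trust policy that lets the EC2 service assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The trusted entity: the service allowed to assume the role.
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


policy = ec2_trust_policy()
print(json.dumps(policy, indent=2))
```

Swapping the service principal (for example to `lambda.amazonaws.com`) would trust a different service under the same structure.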
| Tool | Purpose |
|---|---|
| Credentials Report | CSV of all users + credential status |
| Access Advisor | Shows service access history per user |
EC2 encompasses: Instances • EBS (drives) • ELB (load balancing) • ASG (auto-scaling)
- OS: Linux, Windows, macOS
- Compute: CPU cores, RAM
- Storage: Network (EBS/EFS) or Hardware (Instance Store)
- Network: Card speed, public/private IP
User Data — Bootstrap script that runs at launch with root privileges. Use for updates, software installation, config.
Acts as a firewall for EC2 instances.
| Rule Type | Default | Description |
|---|---|---|
| Inbound | Blocked | Controls incoming traffic |
| Outbound | Allowed | Controls outgoing traffic |
⚠️ Timeout = Security Group issue — If you can't connect (SSH/HTTP/HTTPS), check SG first
Key points:
- Locked to region + VPC
- SGs can reference other SGs (e.g., allow traffic from instances with SG2)
| Port | Protocol |
|---|---|
| 22 | SSH / SFTP |
| 21 | FTP |
| 80 | HTTP |
| 443 | HTTPS |
| 3389 | RDP (Windows) |
Naming: m5.2xlarge → m (class) + 5 (generation) + 2xlarge (size)
| Type | Prefix | Use Case |
|---|---|---|
| General Purpose | t3, m5 | Balanced workloads |
| Compute Optimized | c5 | Batch processing, high-performance computing |
| Memory Optimized | r5 | In-memory databases, caching |
| Storage Optimized | i3 | High IOPS, data warehousing |
| GPU | p3 | ML/AI, graphics |
| FPGA | f1 | Custom hardware acceleration |
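The naming scheme (class + generation + size) can be illustrated with a small parser. This is a sketch only: the regular expression is an assumption that covers common names such as `m5.2xlarge` or `c6gn.large`, not every family AWS offers, and the `options` field (extra capability letters after the generation) goes beyond the three parts described above.

```python
import re
from typing import NamedTuple


class InstanceType(NamedTuple):
    instance_class: str  # e.g. "m"
    generation: int      # e.g. 5
    options: str         # extra letters after the generation, e.g. "gn" (may be empty)
    size: str            # e.g. "2xlarge"


# Assumed pattern: letters, digits, optional extra letters, a dot, then the size.
_PATTERN = re.compile(r"^([a-z]+?)(\d+)([a-z0-9-]*)\.([a-z0-9-]+)$")


def parse_instance_type(name: str) -> InstanceType:
    """Split an instance type name into class, generation, options, and size."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type: {name!r}")
    instance_class, generation, options, size = match.groups()
    return InstanceType(instance_class, int(generation), options, size)


print(parse_instance_type("m5.2xlarge"))
```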
| Option | Discount | Commitment | Best For |
|---|---|---|---|
| On-Demand | None | None | Short, unpredictable workloads |
| Reserved | Up to 72% | 1 or 3 years | Steady-state workloads |
| Savings Plan | Up to 72% | $/hour for 1 or 3 years | Flexible long workloads |
| Spot | Up to 90% | None (can be interrupted) | Batch jobs, fault-tolerant |
| Dedicated Host | Varies | Optional 1 or 3 years | Licensing, compliance |
| Dedicated Instance | Varies | None | Compliance, isolation |
| Capacity Reservation | None | Pay regardless of use | Guaranteed availability |
On-Demand: Linux/Windows billed per second (after 1st min), other OS per hour
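Reserved Instances: Reserve specific attributes (instance type, region, OS). Paying upfront gives a larger discount. Unused Standard RIs can be sold on the Reserved Instance Marketplace.
Savings Plan: Commit to a $/hour spend for 1 or 3 years. An EC2 Instance Savings Plan is locked to an instance family + region (e.g., m5 in us-east-1); a Compute Savings Plan is not. Usage beyond the commitment is billed at On-Demand pricing.
Spot: Cheapest option. AWS can reclaim the instance (with a 2-minute warning) when the Spot price exceeds your maximum price or the capacity is needed. Never use for critical workloads.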
Dedicated Host vs Instance:
- Host: full server control, visibility into sockets/cores (for BYOL licensing)
- Instance: dedicated hardware with no host visibility; the hardware may be shared with other instances from the same account
Region-specific — must copy to use in another region
AMI = Pre-configured EC2 template (OS + software + config)
| AMI Type | Description |
|---|---|
| Public | AWS-provided (Amazon Linux, Ubuntu, etc.) |
| AWS Marketplace | Third-party, often pre-configured software |
| Custom | Your own, built from an EC2 instance |
Creating a Custom AMI:
- Launch EC2 → configure/install software
- Stop instance (for data integrity)
- Create AMI → creates EBS snapshots automatically
- Launch new instances from your AMI
💡 AMIs speed up boot time since software is pre-baked, not installed via User Data
Network-attached storage for EC2 — like a USB stick over the network.
Key characteristics:
- Bound to one AZ
- Attached to one instance at a time (except io1/io2 Multi-Attach)
- Network drive = some latency
- Persists independently of instance lifecycle
Backup mechanism for EBS volumes — can restore to any AZ.
| Feature | Description |
|---|---|
| Cross-AZ restore | Snapshot in us-east-1a → restore in us-east-1b |
| Recycle Bin | Deleted snapshots recoverable (configurable retention) |
| Fast Snapshot Restore | No latency on first use, but expensive |
Hardware-attached storage (physically on the host) — not network-based.
| Pros | Cons |
|---|---|
| Extremely high IOPS (millions) | Ephemeral — data lost on stop/terminate/hardware failure |
| Low latency (direct attached) | Cannot detach and reattach |
| Included in instance cost | Size tied to instance type |
Use cases: Buffer, cache, scratch data, temporary content
⚠️ You are responsible for backups/replication — AWS won't recover this data
| Volume | Default Behavior |
|---|---|
| Root EBS | Deleted on termination |
| Additional EBS | Preserved on termination |
The DeleteOnTermination attribute controls this behavior and can be changed via the console or CLI at launch time.
| Type | Category | IOPS | Throughput | Size | Boot? |
|---|---|---|---|---|---|
| gp3 | General SSD | 3,000–16,000 | 125–1,000 MiB/s | 1 GiB–16 TiB | ✅ |
| gp2 | General SSD | 3 IOPS/GiB (max 16,000) | Linked to IOPS | 1 GiB–16 TiB | ✅ |
| io2 Block Express | Provisioned IOPS | Up to 256,000 | 4,000 MiB/s | 4 GiB–64 TiB | ✅ |
| io1 | Provisioned IOPS | Up to 64,000 | 1,000 MiB/s | 4 GiB–16 TiB | ✅ |
| st1 | Throughput HDD | Max 500 | 500 MiB/s | 125 GiB–16 TiB | ❌ |
| sc1 | Cold HDD | Max 250 | 250 MiB/s | 125 GiB–16 TiB | ❌ |
💡 Only SSD types (gp2/gp3/io1/io2) can be boot volumes
gp3 vs gp2: gp3 allows independent IOPS/throughput scaling; gp2 links IOPS to size
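Because gp2 ties IOPS to volume size, the baseline can be computed directly. The sketch below uses the 3 IOPS/GiB rate and the 16,000 cap from the table; the 100 IOPS floor for small volumes is not in the table and is taken from the AWS documentation. The function name is hypothetical.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume: 3 IOPS per GiB, floor 100, cap 16,000."""
    if not 1 <= size_gib <= 16 * 1024:
        raise ValueError("gp2 volumes range from 1 GiB to 16 TiB")
    return min(16_000, max(100, 3 * size_gib))


for size in (10, 100, 1_000, 6_000):
    print(size, "GiB ->", gp2_baseline_iops(size), "IOPS")
```

With gp3 no such calculation is needed, since IOPS and throughput are provisioned independently of size.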
Provisioned IOPS (io1/io2): For sustained IOPS needs — databases, critical apps. io2 Block Express offers sub-millisecond latency.
- io1/io2 only — attach same volume to multiple EC2 in same AZ
- Up to 16 instances simultaneously
- Use case: clustered applications requiring shared storage
Managed NFS that can be mounted on multiple EC2 across multiple AZs.
| Feature | Value |
|---|---|
| Compatibility | Linux only (POSIX) |
| Scaling | Automatic, up to petabytes |
| Throughput | Up to 10+ GB/s |
| Pricing | Pay per GB used |
| Mode | Use Case |
|---|---|
| General Purpose | Latency-sensitive (web servers, CMS) |
| Max I/O | Higher latency, highly parallel (big data) |
| Mode | Description |
|---|---|
| Bursting | Scales with storage size |
| Provisioned | Fixed throughput regardless of size |
| Elastic | Auto-scales based on workload (recommended) |
| Tier | Cost | Access |
|---|---|---|
| Standard | Higher | Frequent |
| Infrequent Access (IA) | Lower storage, pay per retrieval | Occasional |
| Archive | ~50% cheaper | Rare |
💡 Use lifecycle policies to auto-move files between tiers
| Option | Description |
|---|---|
| Standard (Multi-AZ) | Production, HA |
| One Zone | Dev/backup, cheaper, single AZ |
| Feature | EBS | EFS | Instance Store |
|---|---|---|---|
| Attach to | 1 instance (io1/io2: multi) | 100s of instances | 1 instance |
| AZ scope | Single AZ | Multi-AZ | Single AZ |
| Persistence | Persists | Persists | Ephemeral |
| Use case | Boot volumes, databases | Shared content, web serving | Cache, temp data |
| Cost | Per provisioned GB | Per used GB | Included |
Terminology: ELB (Elastic Load Balancing) is the service name, not a load balancer type. The actual LB types are ALB, NLB, GLB, and CLB.
| Layer | Name | Protocol/Example | AWS LB |
|---|---|---|---|
| 7 | Application | HTTP, HTTPS, WebSocket | ALB |
| 4 | Transport | TCP, UDP, TLS | NLB |
| 3 | Network | IP, ICMP | GLB |
| 2 | Data Link | Ethernet, MAC | — |
| 1 | Physical | Cables, signals | — |
| Type | Layer | Protocols | Use Case |
|---|---|---|---|
| ALB | 7 | HTTP, HTTPS, WebSocket | Web apps, microservices |
| NLB | 4 | TCP, UDP, TLS | Extreme performance, static IP |
| GLB | 3 | IP (GENEVE) | Firewalls, packet inspection |
| CLB | 4/7 | HTTP, HTTPS, TCP, SSL | Legacy (avoid) |
⚠️ CLB = Classic Load Balancer, sometimes called "Classic ELB" — adds to the ELB naming confusion. Avoid for new projects.
LB periodically pings targets to verify they're healthy.
| Setting | Description |
|---|---|
| Protocol | HTTP, HTTPS, TCP |
| Path | e.g., /health (HTTP/HTTPS only) |
| Interval | Time between checks (default: 30s) |
| Threshold | Consecutive successes/failures to change state |
| Timeout | Time to wait for response |
⚠️ ELB does NOT terminate unhealthy targets — it only stops routing traffic to them
ASG can use ELB health status to decide when to terminate/replace instances.
| Health Check Type | Default | Termination Trigger |
|---|---|---|
| EC2 | ✅ Yes | Instance stopped, impaired, or terminated |
| ELB | ❌ No | Target fails LB health check |
💡 Enable ELB health checks on ASG for automatic replacement of app-level failures
Layer 7 (HTTP) — routes to target groups:
| Target Type | Example |
|---|---|
| EC2 instances | i-0123... |
| Lambda functions | my-function |
| Private IPs | On-prem servers |
Routing rules based on:
- URL path (`/api/*`, `/images/*`)
- Hostname (`api.example.com`)
- Query strings (`?platform=mobile`)
- HTTP headers
Key points:
- Fixed DNS hostname (no static IP)
- Client IP in `X-Forwarded-For` header
- WebSocket support
Layer 4 (TCP/UDP) — highest performance LB.
| Feature | Value |
|---|---|
| Performance | Millions of requests/sec |
| Latency | ~100ms (vs ~400ms ALB) |
| Static IP | One per AZ |
Target groups: EC2 instances, Private IPs, ALB (NLB → ALB combo)
NLB provides:
- Static hostname
- Static IP (one per AZ)
- Elastic IP support
💡 When to use NLB: Gaming servers, IoT backends, financial trading platforms — anywhere you need ultra-low latency, millions of requests/sec, or must whitelist a static IP for clients/firewalls.
| Routing By | ALB | NLB |
|---|---|---|
| URL path | ✅ | ❌ |
| Hostname | ✅ | ❌ |
| Query strings | ✅ | ❌ |
| HTTP headers | ✅ | ❌ |
| Port | ✅ | ✅ |
NLB = Layer 4 (sees packets, not HTTP). ALB = Layer 7 (sees HTTP content). Content-based routing → ALB. Static IP + performance → NLB.
Layer 3 (IP) — for network appliances (firewalls, IDS, packet inspection).
Flow: Traffic → GLB → Security appliances → GLB → Your app
| Feature | Detail |
|---|---|
| Protocol | GENEVE (port 6081) |
| Use case | Third-party virtual appliances |
| Layer | 3 (Network) |
GENEVE encapsulates packets in UDP for cross-host VM/container communication
Same client always routed to same target instance.
| Cookie Type | Who Creates | Cookie Name |
|---|---|---|
| Duration-based | ALB | AWSALB (reserved) |
| Application-based (LB) | ALB | AWSALBAPP (reserved) |
| Application-based (App) | Your app | Custom (e.g., SESSIONID) |
⚠️ `AWSALB*` names are AWS-reserved and cannot be used by your app
💡 Use for stateful apps; avoid if possible (prefer stateless + external session store)
Distributes traffic evenly across all targets in all AZs, regardless of AZ distribution.
| LB Type | Default | Cost |
|---|---|---|
| ALB | Enabled | Free |
| NLB | Disabled | Charged |
| GLB | Disabled | Charged |
Without cross-zone: If AZ-1 has 2 instances and AZ-2 has 8, each AZ gets 50% of traffic (unfair distribution)
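The arithmetic behind the 2-vs-8 example can be sketched as follows. The model is deliberately simplified: it assumes traffic is split evenly between AZs when cross-zone is off and evenly between all targets when it is on. The function and AZ names are illustrative.

```python
def per_instance_share(instances_per_az: dict[str, int], cross_zone: bool) -> dict[str, float]:
    """Return the fraction of total traffic that each instance in an AZ receives."""
    total_instances = sum(instances_per_az.values())
    az_count = len(instances_per_az)
    shares = {}
    for az, count in instances_per_az.items():
        if cross_zone:
            # Every target gets an equal slice of all traffic.
            shares[az] = 1 / total_instances
        else:
            # Each AZ gets an equal slice, divided among its own targets.
            shares[az] = (1 / az_count) / count
    return shares


fleet = {"az-1": 2, "az-2": 8}
print(per_instance_share(fleet, cross_zone=False))
print(per_instance_share(fleet, cross_zone=True))
```

Without cross-zone, each AZ-1 instance carries 25% of the traffic while each AZ-2 instance carries 6.25%; with cross-zone every instance carries 10%.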
SSL Termination: LB decrypts HTTPS traffic, forwards HTTP to targets (offloads CPU from instances).
| Concept | Description |
|---|---|
| SSL Certificate | Loaded on LB via ACM (AWS Certificate Manager) |
| SNI (Server Name Indication) | Allows multiple SSL certs on one LB — client indicates hostname, LB selects correct cert |
SNI Support:
- ✅ ALB, NLB (multiple certs)
- ❌ CLB (one cert only)
💡 Use ACM for free, auto-renewing public certificates
Time allowed for in-flight requests to complete when a target is deregistering or unhealthy.
| Setting | Default | Range |
|---|---|---|
| Deregistration Delay | 300s (5 min) | 0–3600s |
💡 Set to 0 for short-lived requests; increase for long uploads/connections
Automatically adjusts EC2 capacity to match demand. ASG is free — you pay only for instances.
| Setting | Description |
|---|---|
| Minimum | Never go below this |
| Desired | Target number of instances |
| Maximum | Never exceed this |
Defines what to launch:
| Setting | Example |
|---|---|
| AMI | ami-0123456789 |
| Instance Type | t3.micro |
| IAM Role | MyEC2Role |
| Security Groups | sg-web |
| User Data | Bootstrap script |
| Key Pair | my-key |
| EBS Volumes | gp3, 20 GiB |
💡 CloudWatch alarms can trigger ASG scale-out/in based on metrics (CPU, or custom metrics such as RAM, which requires the CloudWatch agent)
| Policy | Description | Example |
|---|---|---|
| Target Tracking | Maintain a target metric value | Keep avg CPU at 40% |
| Step Scaling | Scale based on threshold ranges | CPU > 70% → +2, CPU < 30% → -1 |
| Scheduled | Scale at specific times | Add 3 instances every Friday 5PM |
| Predictive | ML-based forecasting | Pre-scale for predicted daily peaks |
- Predictive Scaling — ML analyzes historical load patterns, pre-provisions capacity ahead of predicted spikes. Great for recurring patterns (daily/weekly cycles).
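The step scaling row can be read as a small decision function. This sketch hard-codes the example thresholds from the table (CPU > 70% adds 2, CPU < 30% removes 1) and clamps the result to the group's minimum and maximum; it models the decision only, not how an ASG is actually configured.

```python
def step_scaling_desired(current: int, cpu_percent: float, minimum: int, maximum: int) -> int:
    """Return the new desired capacity after applying the example step policy."""
    if cpu_percent > 70:
        change = 2       # scale out
    elif cpu_percent < 30:
        change = -1      # scale in
    else:
        change = 0       # within the comfortable band, do nothing
    # The desired capacity can never leave the [minimum, maximum] range.
    return max(minimum, min(maximum, current + change))


print(step_scaling_desired(current=4, cpu_percent=85, minimum=2, maximum=10))
```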
| Metric | Best For |
|---|---|
| CPUUtilization | Compute-bound apps |
| RequestCountPerTarget | Web servers behind ALB |
| NetworkIn/Out | Network-bound apps |
| Custom (CloudWatch) | App-specific (queue depth, etc.) |
Prevents rapid successive scaling actions. Default: 300 seconds.
💡 Use shorter cooldown with faster-booting AMIs; longer for slow startup apps
Rolling update when you change Launch Template — replaces instances gradually.
| Setting | Description |
|---|---|
| Min Healthy % | % of instances that must stay running (e.g., 90%) |
| Warm-up | Time before new instance counts as healthy |
💡 Enables zero-downtime deployments for Launch Template changes
Managed relational database — AWS handles patching, backups, scaling, HA, monitoring.
| Feature | Included |
|---|---|
| OS/DB patching | ✅ |
| Automated backups | ✅ |
| Multi-AZ failover | ✅ |
| Read replicas | ✅ (up to 15) |
| Encryption (at-rest & in-flight) | ✅ |
| Performance Insights | ✅ |
Supported engines: MySQL, PostgreSQL, Oracle, MariaDB, Microsoft SQL Server, Aurora.
⚠️ No SSH access to the underlying instance
With storage auto scaling enabled, RDS automatically increases storage when free space runs low. Set the Maximum Storage Threshold (`MaxAllocatedStorage` in the API) to cap it.
| Feature | Read Replicas | Multi-AZ |
|---|---|---|
| Purpose | Read scaling | Disaster recovery |
| Replication | ASYNC (eventually consistent) | SYNC (immediate) |
| Readable? | ✅ Yes | ❌ Standby only |
| Cross-region? | ✅ Yes (with cost) | ❌ Same region |
| Failover | Manual (promote to standalone) | Automatic |
| Max count | 15 | 1 standby |
Cost: Same-region RR replication = free. Cross-region = network charges.
Multi-AZ setup: Enable in console → snapshot taken → restored to standby AZ → sync begins. Zero downtime.
💡 Read replicas can also be Multi-AZ (common exam question)
AWS-built relational DB, compatible with MySQL and PostgreSQL.
| Feature | Value |
|---|---|
| Performance | 5x MySQL, 3x PostgreSQL |
| Storage | Auto-scales 10 GB → 128 TiB |
| Replicas | Up to 15 (faster replication than RDS) |
| Failover | < 30 seconds |
| Copies | 6 copies across 3 AZs |
| Cost | ~20% more than RDS |
Self-healing: Corrupted data blocks repaired via peer-to-peer replication.
| Endpoint | Purpose |
|---|---|
| Writer Endpoint | Always points to current master (for writes) |
| Reader Endpoint | Load-balanced across all read replicas |
| Custom Endpoint | Route to specific subset of instances |
💡 Use Writer for writes, Reader for reads — endpoints auto-update on failover
| Layer | Implementation |
|---|---|
| At-rest encryption | KMS key at launch (encrypts master + replicas + snapshots) |
| In-flight encryption | TLS by default (use AWS TLS root certs) |
| Authentication | Username/password OR IAM DB authentication |
| Network | Security groups control access |
To encrypt an unencrypted DB: snapshot → copy with encryption → restore
Serverless connection pooler in front of RDS/Aurora.
| Benefit | Description |
|---|---|
| Connection pooling | Reduces DB load from many connections |
| Failover | Reduces failover time by 66% |
| IAM auth | Enforce IAM authentication |
| VPC only | Never publicly accessible |
💡 Great for Lambda → RDS (Lambda opens many short-lived connections)
Managed in-memory caching — Redis or Memcached.
| Feature | Redis | Memcached |
|---|---|---|
| Multi-AZ | ✅ | ❌ |
| Auto Failover | ✅ | ❌ |
| Replication | ✅ | ❌ |
| Persistence | ✅ | ❌ |
| Backup & Restore | ✅ | ✅ (Serverless only) |
| Data structures | Complex (lists, sets, sorted sets) | Simple key-value |
| Sharding | Cluster mode | Multi-node |
💡 Exam tip: Use Redis for HA, persistence, complex data. Use Memcached for simple caching, multi-threaded, horizontal scaling.
| Question | Consider |
|---|---|
| Safe to cache? | What if stale data causes security/business issues? |
| Effective? | Best for slow-changing, frequently-read data |
| Structure fit? | Key-value lookups work best; complex joins may not |
| TTL strategy? | How long before data expires? |
App → Cache (miss?) → DB → Cache → App
| Pros | Cons |
|---|---|
| Only requested data cached | Cache miss = 3 network calls |
| Node failure not fatal | Stale data possible |
| Simple to implement | Must handle cache invalidation |
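A minimal sketch of lazy loading (cache-aside), using plain dictionaries as stand-ins for the cache and the database. The class name and the `db_reads` counter are illustrative; a real setup would use a Redis or Memcached client and add a TTL.

```python
from typing import Callable, Optional


class LazyLoadingCache:
    """Read path: check the cache first, fall back to the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]):
        self._load_from_db = load_from_db
        self._cache: dict[str, str] = {}
        self.db_reads = 0

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            return self._cache[key]          # cache hit: one call
        self.db_reads += 1
        value = self._load_from_db(key)      # cache miss: read the database
        if value is not None:
            self._cache[key] = value         # then populate the cache
        return value


database = {"user:1": "Ada"}
cache = LazyLoadingCache(database.get)
first = cache.get("user:1")    # miss, loads from the database
second = cache.get("user:1")   # hit, served from the cache
print(first, second, cache.db_reads)
```

If the database row changes afterwards, the cached copy keeps being served until it is invalidated or expires, which is the stale-data risk listed in the table.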
App → DB + Cache (write both)
| Pros | Cons |
|---|---|
| Cache always current | Write penalty (2 writes) |
| No stale data | Cache churn (data may never be read) |
| | Missing data until first write |
💡 Combine Write-Through + Lazy Loading for best results
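One way to picture the combination: writes go to both the database and the cache, and reads fall back to lazy loading for rows that were never written through. The sketch below uses dictionaries as stand-ins for both stores; the class name is hypothetical.

```python
from typing import Optional


class WriteThroughStore:
    """Write-through on put, lazy loading on get."""

    def __init__(self) -> None:
        self.database: dict[str, str] = {}
        self.cache: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        # Two writes per update: this is the write penalty.
        self.database[key] = value
        self.cache[key] = value

    def get(self, key: str) -> Optional[str]:
        if key in self.cache:
            return self.cache[key]
        # Lazy-loading fallback covers data that predates the cache.
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value


store = WriteThroughStore()
store.database["legacy"] = "old row"   # written before the cache existed
store.put("fresh", "new row")
print(store.get("fresh"), store.get("legacy"))
```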
Set expiration on cached items. Balance between:
- Short TTL — Fresh data, more cache misses
- Long TTL — Fewer misses, risk of stale data
App → Cache → (async) → DB
| Pros | Cons |
|---|---|
| Fast writes (async to DB) | Data loss risk if cache fails |
| Reduces DB load | Complex to implement |
| Good for write-heavy workloads | Eventually consistent |
App → Cache (auto-fetches from DB on miss)
Cache sits between app and DB. On miss, cache itself fetches from DB and stores. Simpler app logic, but requires cache to understand DB.
| Use Case | Pattern |
|---|---|
| Session storage | Redis with TTL |
| Database query caching | Lazy Loading + TTL |
| Real-time leaderboards | Redis Sorted Sets |
| Pub/Sub messaging | Redis Pub/Sub |
| Rate limiting | Redis counters with TTL |
Object storage with unlimited storage, highly durable (99.999999999% — 11 9s).
| Concept | Description |
|---|---|
| Bucket | Container for objects, globally unique name |
| Object | File + metadata, identified by key (full path) |
| Key | Full path including "folders" (e.g., images/2024/photo.jpg) |
| Max object size | 5 TB (use multipart upload for >100 MB, required >5 GB) |
| Class | Durability | Availability | Use Case |
|---|---|---|---|
| S3 Standard | 11 9s | 99.99% | Frequently accessed data |
| S3 Intelligent-Tiering | 11 9s | 99.9% | Unknown/changing access patterns |
| S3 Standard-IA | 11 9s | 99.9% | Infrequent access, rapid retrieval |
| S3 One Zone-IA | 11 9s | 99.5% | Infrequent, non-critical, reproducible |
| S3 Glacier Instant | 11 9s | 99.9% | Archive, millisecond retrieval |
| S3 Glacier Flexible | 11 9s | 99.99% | Archive, minutes to hours retrieval |
| S3 Glacier Deep Archive | 11 9s | 99.99% | Long-term archive, 12-48 hour retrieval |
💡 Use Lifecycle Policies to automatically transition objects between classes
| Layer | Mechanism |
|---|---|
| User-based | IAM policies |
| Resource-based | Bucket policies (JSON), Object ACLs, Bucket ACLs |
| Encryption | SSE-S3, SSE-KMS, SSE-C, client-side |
Bucket Policy Structure:
{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::my-bucket/*"
}]
}
| Type | Key Management | Use Case |
|---|---|---|
| SSE-S3 | AWS managed | Default encryption |
| SSE-KMS | KMS key | Audit trail, fine control |
| SSE-C | Customer-provided | Full key control |
| Client-side | Encrypt before upload | Maximum control |
💡 SSE-KMS has API call limits (quota). For high throughput, consider SSE-S3.
- Enable at bucket level
- Protects against unintentional deletes (delete marker, not actual delete)
- Once enabled, can only be suspended (not disabled)
- `null` version ID for objects uploaded before versioning was enabled
| Type | Description |
|---|---|
| CRR (Cross-Region) | Compliance, lower latency, replication across accounts |
| SRR (Same-Region) | Log aggregation, live replication between prod/test |
Requirements: Versioning enabled on both buckets, proper IAM permissions
⚠️ Only new objects replicated after enabling. Use S3 Batch Replication for existing objects.
Trigger actions on bucket events (PUT, DELETE, etc.):
| Target | Use Case |
|---|---|
| SNS | Fan-out to multiple subscribers |
| SQS | Queue for processing |
| Lambda | Real-time processing |
| EventBridge | Advanced filtering, multiple destinations |
| Feature | Description |
|---|---|
| Multi-part upload | Parallelize uploads, recommended >100 MB |
| Transfer Acceleration | Use CloudFront edge locations for faster uploads |
| Byte-range fetches | Parallelize downloads by requesting byte ranges |
| S3 Select / Glacier Select | Retrieve subset of data using SQL |
Baseline: 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix.
Temporary access to private objects without changing bucket policy.
aws s3 presign s3://bucket/object --expires-in 3600
| Parameter | Description |
|---|---|
| Expires | 1 second to 7 days (default: 1 hour) |
| Permissions | Inherits permissions of the user who generated it |
Serverless compute — run code without managing servers.
| Limit | Value |
|---|---|
| Memory | 128 MB – 10,240 MB (10 GB) |
| Timeout | Max 15 minutes (900 seconds) |
| Environment variables | 4 KB total |
| /tmp storage | 512 MB – 10,240 MB |
| Deployment package | 50 MB zipped, 250 MB unzipped |
| Concurrent executions | 1,000 default (can increase) |
| Layers | Up to 5 per function |
💡 CPU scales proportionally with memory. More memory = more CPU = faster execution.
| Type | Behavior | Retries | Examples |
|---|---|---|---|
| Synchronous | Caller waits for response | None (caller handles) | API Gateway, SDK |
| Asynchronous | Fire and forget | 2 retries (3 total) | S3, SNS, EventBridge |
| Event Source Mapping | Lambda polls source | Depends on source | SQS, Kinesis, DynamoDB Streams |
Event → Lambda (internal queue) → [Retry 1] → [Retry 2] → DLQ/Destination
| Setting | Description |
|---|---|
| Retries | 2 retries with exponential backoff |
| DLQ | Dead Letter Queue (SQS or SNS) for failed events |
| Destinations | Route success/failure to SQS, SNS, Lambda, or EventBridge |
💡 Destinations are preferred over DLQ — more features, supports success events
Lambda polls from:
| Source | Behavior |
|---|---|
| SQS | Batch processing, long polling |
| SQS FIFO | Lambda scales to # of message groups |
| Kinesis/DynamoDB Streams | Process in order per shard |
Error Handling:
- Entire batch fails if one record fails
- Options: discard, retry, split batch, send to DLQ/destination
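The "split batch" option can be pictured as a bisection: when a batch fails, it is halved and each half is retried, so a single bad record ends up isolated instead of blocking the rest. The sketch below is a local simulation of that idea under simplified assumptions, not Lambda's actual retry machinery; `process_batch` is a hypothetical callback that raises on failure.

```python
from typing import Callable, Sequence


def process_with_bisect(batch: Sequence, process_batch: Callable[[Sequence], None]) -> list:
    """Process a batch, halving it on failure. Returns the records that still fail alone."""
    try:
        process_batch(batch)
        return []
    except Exception:
        if len(batch) == 1:
            # Isolated the failing record: a candidate for the DLQ/destination.
            return list(batch)
        middle = len(batch) // 2
        return (process_with_bisect(batch[:middle], process_batch)
                + process_with_bisect(batch[middle:], process_batch))


def fail_on_bad_record(records: Sequence) -> None:
    """Illustrative processor that rejects any batch containing 'bad'."""
    if "bad" in records:
        raise ValueError("poison record in batch")


poison = process_with_bisect([1, 2, 3, "bad", 5], fail_on_bad_record)
print(poison)
```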
By default, Lambda runs in AWS-managed VPC (has internet). To access private resources:
- Configure VPC, subnets, security groups
- Lambda creates ENIs in your subnets
- Use NAT Gateway for internet access from private subnet
⚠️ Lambda in VPC has no internet unless you have NAT Gateway
| Type | Description |
|---|---|
| Unreserved | Shared pool, up to account limit |
| Reserved | Guaranteed capacity for a function; also acts as its maximum concurrency |
| Provisioned | Pre-initialized instances, no cold start |
Cold Start: First invocation initializes execution environment (can add seconds). Provisioned concurrency eliminates cold starts.
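To judge whether a workload fits within the concurrency limit, a common rule of thumb is concurrency ≈ requests per second × average duration in seconds. The sketch below applies that estimate; the function name is illustrative and the 1,000 figure is the default account limit listed earlier.

```python
import math


def estimated_concurrency(requests_per_second: float, avg_duration_seconds: float) -> int:
    """Estimate concurrent executions as request rate times average duration."""
    return math.ceil(requests_per_second * avg_duration_seconds)


DEFAULT_ACCOUNT_LIMIT = 1_000

for rps, duration in ((100, 0.5), (2_000, 0.75)):
    needed = estimated_concurrency(rps, duration)
    status = "fits" if needed <= DEFAULT_ACCOUNT_LIMIT else "exceeds the default limit"
    print(f"{rps} req/s x {duration}s -> {needed} concurrent ({status})")
```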
Share code/dependencies across functions:
Function → Layer 1 (libs) → Layer 2 (common code)
- Up to 5 layers per function
- Total unzipped size < 250 MB
- Use for: common libraries, custom runtimes
| Type | Location | Max Duration | Use Case |
|---|---|---|---|
| CloudFront Functions | Edge locations | < 1 ms | Simple request/response manipulation |
| Lambda@Edge | Regional edge cache | 5-30 seconds | Complex logic, external calls |
Managed API service — create, publish, secure, and monitor APIs.
| Type | Description |
|---|---|
| Edge-optimized | Routed through CloudFront (default) |
| Regional | For clients in same region |
| Private | Accessible only from VPC via VPC endpoint |
| Type | Features | Cost |
|---|---|---|
| REST API | Full features (caching, API keys, usage plans, request validation) | Higher |
| HTTP API | Simpler, faster; JWT, IAM, and Lambda authorizers; no caching, API keys, or usage plans | ~70% cheaper |
| WebSocket API | Real-time two-way communication | Per message |
| Type | Description |
|---|---|
| Lambda Proxy | Request passed as-is to Lambda, Lambda returns full response |
| Lambda Custom | Transform request/response with mapping templates |
| HTTP Proxy | Pass through to HTTP endpoint |
| HTTP Custom | Transform with mapping templates |
| AWS Service | Direct integration with AWS services |
| Mock | Return response without backend |
💡 Lambda Proxy is most common — simplest setup, Lambda controls response format
| Method | Description |
|---|---|
| IAM | AWS Sig v4, good for internal/AWS clients |
| Lambda Authorizer | Custom auth logic (JWT, OAuth, etc.) |
| Cognito User Pools | JWT validation with Cognito |
| API Keys + Usage Plans | Rate limiting per client |
| Concept | Description |
|---|---|
| Stage | Named reference to deployment (dev, prod, v1) |
| Stage Variables | Key-value pairs, like environment variables |
| Canary Deployment | Route % of traffic to new deployment |
| Limit | Value |
|---|---|
| Account limit | 10,000 requests/second |
| Per-stage limit | Configurable |
| Per-client (Usage Plans) | API key-based throttling |
429 Too Many Requests when throttled. Client should retry with exponential backoff.
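A minimal sketch of client-side retry with exponential backoff. `ThrottledError` is a stand-in for however your HTTP client surfaces a 429, the delay values are arbitrary, and the random jitter is a common addition rather than something the text above requires. The `sleep` parameter exists only so the example can run without actually waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Stand-in for an HTTP 429 Too Many Requests response."""


def call_with_backoff(request: Callable[[], T], max_attempts: int = 5,
                      base_delay: float = 0.1, max_delay: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry a throttled request, doubling the delay ceiling after each failure."""
    for attempt in range(max_attempts):
        try:
            return request()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))  # jitter spreads out retries


calls = {"count": 0}


def flaky_request() -> str:
    """Illustrative request that is throttled twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ThrottledError()
    return "ok"


delays: list[float] = []
result = call_with_backoff(flaky_request, sleep=delays.append)
print(result, calls["count"], len(delays))
```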
- Cache responses at stage level
- TTL: 0-3600 seconds (default: 300)
- Cache size: 0.5 GB – 237 GB
- Cache key: method + resource path (can include headers/query params)
💡 Reduce backend calls, improve latency. Clients can invalidate a cached entry by sending the `Cache-Control: max-age=0` header.
Fully managed NoSQL database — millisecond latency at any scale.
| Concept | Description |
|---|---|
| Table | Collection of items |
| Item | Row (max 400 KB) |
| Attribute | Column (nested up to 32 levels) |
| Primary Key | Partition key (required) + optional sort key |
| Type | Components | Use Case |
|---|---|---|
| Partition key | Single attribute | Unique identifier |
| Composite | Partition + Sort key | One-to-many relationships |
💡 Choose partition key with high cardinality for even distribution
| Mode | Description | Use Case |
|---|---|---|
| Provisioned | Set RCU/WCU, auto-scaling available | Predictable workloads |
| On-Demand | Pay per request | Unpredictable, new tables |
Throughput units:
| Unit | Capacity |
|---|---|
| 1 RCU | 1 strongly consistent read/sec (4 KB) OR 2 eventually consistent |
| 1 WCU | 1 write/sec (1 KB) |
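The unit definitions translate into a short calculation: item sizes round up to the next 4 KB for reads and the next 1 KB for writes, and eventually consistent reads cost half. The helper names below are illustrative.

```python
import math


def required_rcu(reads_per_second: int, item_size_kb: float, strongly_consistent: bool = True) -> int:
    """RCUs needed: each read costs ceil(size / 4 KB) units, halved if eventually consistent."""
    units_per_read = math.ceil(item_size_kb / 4)
    rcu = reads_per_second * units_per_read
    return rcu if strongly_consistent else math.ceil(rcu / 2)


def required_wcu(writes_per_second: int, item_size_kb: float) -> int:
    """WCUs needed: each write costs ceil(size / 1 KB) units."""
    return writes_per_second * math.ceil(item_size_kb)


print(required_rcu(10, 6))                             # strongly consistent reads of 6 KB items
print(required_rcu(10, 6, strongly_consistent=False))  # eventually consistent
print(required_wcu(5, 2.5))                            # writes of 2.5 KB items
```

For example, 10 strongly consistent reads per second of 6 KB items need 20 RCU (each read rounds up to 8 KB, i.e. 2 units), or 10 RCU if eventual consistency is acceptable.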
| Type | Description |
|---|---|
| Eventually consistent | Default, might return stale data |
| Strongly consistent | Returns most recent data, uses 2x RCU |
| Type | Partition Key | Sort Key | When Created | Throughput |
|---|---|---|---|---|
| LSI | Same as table | Different | Table creation only | Shares table's |
| GSI | Different | Different | Anytime | Separate (provision separately) |
⚠️ GSI throttling can throttle main table writes. Provision GSI capacity carefully.
Ordered stream of item modifications (insert, update, delete).
| View Type | Content |
|---|---|
| KEYS_ONLY | Just the key attributes |
| NEW_IMAGE | Item after modification |
| OLD_IMAGE | Item before modification |
| NEW_AND_OLD_IMAGES | Both images |
Use cases: Trigger Lambda, replicate to other tables, analytics
| Operation | Description | Cost |
|---|---|---|
| GetItem | Single item by primary key | Uses RCU |
| Query | Items by partition key + optional sort key | Efficient, uses RCU |
| Scan | Entire table | Expensive, avoid in production |
| BatchGetItem | Up to 100 items | Parallel GetItem |
| BatchWriteItem | Up to 25 PutItem/DeleteItem | Parallel writes |
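# Optimistic locking example ('my-table' and the 'value' attribute are placeholder names)
import boto3
table = boto3.resource('dynamodb').Table('my-table')
response = table.update_item(
    Key={'pk': 'item1'},
    # 'value' is a DynamoDB reserved word, so it is aliased as #v
    UpdateExpression='SET #v = :newval, version = version + :inc',
    # Raises ConditionalCheckFailedException if another writer already bumped the version
    ConditionExpression='version = :expectedVersion',
    ExpressionAttributeNames={'#v': 'value'},
    ExpressionAttributeValues={':expectedVersion': 1, ':newval': 'updated', ':inc': 1}
)
💡 Use for optimistic concurrency control: no locking overhead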
In-memory cache for DynamoDB — microsecond latency.
| Feature | Value |
|---|---|
| Latency | Microseconds (vs milliseconds) |
| Cache | Item cache + query cache |
| Compatibility | Drop-in replacement (same API) |
Use case: Read-heavy workloads, hot keys
Multi-region, multi-active replication.
| Feature | Description |
|---|---|
| Active-Active | Read/write in any region |
| Replication | Sub-second across regions |
| Requirement | DynamoDB Streams must be enabled |
Auto-delete expired items (no WCU cost).
Set TTL attribute → Store expiry timestamp (epoch seconds) → DynamoDB deletes the item in the background after expiry (not instantly, so filter out expired items on read)
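The TTL value is just a Unix timestamp in seconds stored as a number on the item. A minimal sketch of computing it follows; the attribute name `expires_at` and the item shape are illustrative, and the fixed `now` value is only there to make the output reproducible.

```python
import time
from typing import Optional


def ttl_epoch(days_from_now: int, now: Optional[float] = None) -> int:
    """Return the epoch-seconds timestamp at which an item should expire."""
    current = time.time() if now is None else now
    return int(current) + days_from_now * 24 * 60 * 60


# Hypothetical session item that should expire 7 days after a fixed reference time.
item = {"pk": "session#123", "expires_at": ttl_epoch(7, now=1_700_000_000)}
print(item)
```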
Fully managed message queue — decouple applications.
| Type | Throughput | Ordering | Delivery |
|---|---|---|---|
| Standard | Unlimited | Best-effort | At-least-once |
| FIFO | 300 msg/s (3000 batched) | Strict | Exactly-once |
| Setting | Default | Description |
|---|---|---|
| Visibility Timeout | 30 seconds | Time message is hidden after receive |
| Message Retention | 4 days | Max: 14 days |
| Max Message Size | 256 KB | Use S3 for larger payloads |
| Delay Queue | 0 seconds | Delay before message is visible |
| Long Polling | Disabled | Wait for messages (reduces API calls) |
Receive → Message hidden → Process → Delete
↓
(If timeout expires before delete)
↓
Message reappears in queue
💡 If processing takes longer than the visibility timeout, call ChangeMessageVisibility to extend it before the message reappears.
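The receive, hide, reappear cycle above can be modelled with a toy in-memory queue. This is only a sketch of the behaviour, not the SQS API; `TinyQueue` and its methods are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
class TinyQueue:
    """Toy in-memory model of the SQS visibility timeout (not the real service)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # message id -> (body, time at which it becomes visible)

    def send(self, msg_id, body):
        self.messages[msg_id] = (body, 0)

    def receive(self, now):
        """Return the first visible message and hide it for the timeout period."""
        for msg_id, (body, visible_at) in self.messages.items():
            if visible_at <= now:
                self.messages[msg_id] = (body, now + self.visibility_timeout)
                return msg_id, body
        return None

    def change_visibility(self, msg_id, now, timeout):
        """Extend (or shorten) how long an in-flight message stays hidden."""
        body, _ = self.messages[msg_id]
        self.messages[msg_id] = (body, now + timeout)

    def delete(self, msg_id):
        del self.messages[msg_id]


q = TinyQueue(visibility_timeout=30)
q.send("m1", "hello")
first = q.receive(now=0)          # delivered, hidden until t=30
hidden = q.receive(now=10)        # nothing visible yet
again = q.receive(now=31)         # not deleted in time, so it reappears
q.change_visibility("m1", now=40, timeout=60)  # consumer needs longer: hidden until t=100
still_hidden = q.receive(now=70)  # still hidden thanks to the extension
q.delete("m1")                    # processing finished
```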
Messages that fail processing after maxReceiveCount go to DLQ.
| Setting | Description |
|---|---|
| maxReceiveCount | # of receives before sending to DLQ |
| Redrive | Move DLQ messages back to main queue |
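On the source queue, the DLQ settings live in the `RedrivePolicy` queue attribute, which is a JSON string. The sketch below only builds that string; the ARN and the helper name `redrive_policy` are made up for illustration.

```python
import json


def redrive_policy(dlq_arn, max_receive_count=5):
    """Build the JSON string stored in the source queue's RedrivePolicy attribute."""
    return json.dumps({
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": max_receive_count,
    })


# Hypothetical DLQ ARN; after 3 failed receives the message moves to the DLQ
attributes = {
    "RedrivePolicy": redrive_policy("arn:aws:sqs:us-east-1:123456789012:orders-dlq", 3)
}
```

A dictionary like `attributes` is the kind of value one would pass as queue attributes when creating or updating the source queue.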
| Feature | Description |
|---|---|
| MessageGroupId | Messages in same group processed in order |
| MessageDeduplicationId | Prevent duplicates within 5-minute window |
| Naming | Queue name must end with .fifo |
Lambda polls SQS and processes batches:
| Setting | Description |
|---|---|
| Batch size | 1-10 messages per invocation for FIFO queues; up to 10,000 for standard queues |
| Batch window | Time to wait for batch to fill |
| Concurrency | One invocation per message group (FIFO) |
Pub/sub messaging — push to multiple subscribers.
| Type | Use Case |
|---|---|
| SQS | Queue for processing |
| Lambda | Serverless processing |
| HTTP/S | Webhook endpoints |
| Email/SMS | User notifications |
| Kinesis Data Firehose | Stream to S3, Redshift |
Producer → SNS Topic → SQS Queue 1 → Consumer 1
→ SQS Queue 2 → Consumer 2
→ Lambda → Process
💡 Decouple, parallel processing, different consumption rates
Filter messages per subscriber using filter policies:
{
  "eventType": ["order_placed"],
  "store": [{"prefix": "us-"}]
}
SNS FIFO topics:
- Order guaranteed per message group
- Subscribers must be SQS FIFO queues
- Topic name must end with .fifo
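To make the filter policy shown earlier more concrete, here is a simplified matcher covering only the two operators that appear in it: exact string match and `prefix`. It is a sketch of the idea, not SNS's actual matching engine, and `matches_filter_policy` is a hypothetical name.

```python
def matches_filter_policy(policy, attributes):
    """True if every key in the policy is satisfied by the message attributes."""
    for key, allowed in policy.items():
        value = attributes.get(key)
        if value is None:
            return False  # a required attribute is missing
        satisfied = False
        for rule in allowed:
            if isinstance(rule, dict) and "prefix" in rule:
                satisfied = satisfied or (
                    isinstance(value, str) and value.startswith(rule["prefix"])
                )
            else:
                satisfied = satisfied or value == rule
        if not satisfied:
            return False
    return True


policy = {"eventType": ["order_placed"], "store": [{"prefix": "us-"}]}

delivered = matches_filter_policy(policy, {"eventType": "order_placed", "store": "us-east"})
filtered_out = matches_filter_policy(policy, {"eventType": "order_placed", "store": "eu-west"})
```

Keys are combined with AND and the values listed for a key with OR, which is why both attributes must match but any one listed value is enough.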
Real-time streaming data at scale.
| Service | Purpose |
|---|---|
| Kinesis Data Streams | Collect and process real-time data |
| Kinesis Data Firehose | Load streams into AWS data stores |
| Kinesis Data Analytics | SQL/Flink analytics on streams |
| Kinesis Video Streams | Stream video for analytics |
| Concept | Description |
|---|---|
| Shard | Unit of capacity (1 MB/s in, 2 MB/s out) |
| Partition Key | Determines which shard receives record |
| Sequence Number | Unique ID per record within shard |
| Retention | 1-365 days (default: 24 hours) |
Capacity:
| Direction | Per Shard |
|---|---|
| Write | 1 MB/s or 1,000 records/s |
| Read | 2 MB/s (shared by all consumers) |
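The per-shard limits above translate directly into a shard estimate for provisioned streams with shared-throughput consumers. A minimal sketch, assuming the three limits are the only constraints; the function name and the sample workload are invented for this example.

```python
import math


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    by_write_bytes = math.ceil(write_mb_per_s / 1.0)   # 1 MB/s in per shard
    by_write_records = math.ceil(records_per_s / 1000)  # 1,000 records/s in per shard
    by_read = math.ceil(read_mb_per_s / 2.0)            # 2 MB/s out per shard (shared)
    return max(1, by_write_bytes, by_write_records, by_read)


# Hypothetical workload: 3.5 MB/s in, 2,500 records/s, 9 MB/s read by all consumers
estimate = shards_needed(3.5, 2500, 9)
```

Here the read side is the bottleneck (9 MB/s needs 5 shards), which is the situation where Enhanced Fan-Out becomes attractive.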
| Type | Description |
|---|---|
| Shared | Multiple consumers share 2 MB/s per shard |
| Enhanced Fan-Out | 2 MB/s per consumer per shard (push model) |
Near real-time delivery (60-900 second buffer) to:
| Destination | Description |
|---|---|
| S3 | Most common |
| Redshift | Via S3 copy |
| OpenSearch | Search/analytics |
| HTTP endpoint | Custom destinations |
💡 Firehose = managed, auto-scaling, no capacity planning. Streams = more control, real-time.
| Feature | Data Streams | Data Firehose |
|---|---|---|
| Latency | ~200 ms | 60-900 seconds |
| Capacity | Provision shards | Auto-scaling |
| Data retention | 1-365 days | No storage |
| Consumer | Custom (Lambda, apps) | Built-in destinations |
| Data transformation | External | Built-in Lambda |
Orchestrate Lambda functions and AWS services with visual workflows.
| Concept | Description |
|---|---|
| State Machine | Workflow definition (JSON/YAML) |
| State | Individual step in workflow |
| Execution | Running instance of state machine |
| Task | Unit of work (Lambda, AWS service, HTTP) |
| State | Description |
|---|---|
| Task | Execute work (Lambda, AWS API) |
| Choice | Branch based on condition |
| Parallel | Execute branches in parallel |
| Map | Iterate over array |
| Wait | Delay execution |
| Pass | Pass input to output, inject data |
| Succeed/Fail | End execution |
| Type | Max Duration | Pricing | Use Case |
|---|---|---|---|
| Standard | 1 year | Per state transition | Long-running, auditing |
| Express | 5 minutes | Per execution + duration | High-volume, event processing |
| Mechanism | Description |
|---|---|
| Retry | Retry failed states with backoff |
| Catch | Handle errors, transition to fallback |
"Retry": [{
"ErrorEquals": ["States.TaskFailed"],
"MaxAttempts": 3,
"IntervalSeconds": 1,
"BackoffRate": 2.0
}],
"Catch": [{
"ErrorEquals": ["States.ALL"],
"Next": "HandleError"
}]
| Pattern | Description |
|---|---|
| Request Response | Call service, wait for response |
| Run a Job (.sync) | Wait for job completion (Batch, ECS, Glue) |
| Wait for Callback | Pause until external callback (Human approval) |
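For the Retry settings shown earlier (IntervalSeconds 1, BackoffRate 2.0, MaxAttempts 3), the wait before each retry grows geometrically. The helper below is a sketch of that arithmetic only; it ignores optional settings such as a maximum delay or jitter, and `retry_delays` is a hypothetical name.

```python
def retry_delays(interval_seconds, backoff_rate, max_attempts):
    """Seconds waited before each retry: interval * backoff_rate ** (retry index)."""
    return [interval_seconds * backoff_rate ** attempt for attempt in range(max_attempts)]


delays = retry_delays(1, 2.0, 3)  # [1.0, 2.0, 4.0]
total_wait = sum(delays)          # 7.0 seconds of waiting before Catch takes over
```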
Container orchestration on AWS.
| Launch Type | Description |
|---|---|
| EC2 | You manage EC2 instances, more control |
| Fargate | Serverless, AWS manages infrastructure |
| Concept | Description |
|---|---|
| Task Definition | Blueprint for containers (image, CPU, memory, ports) |
| Task | Running instance of Task Definition |
| Service | Maintains desired count of tasks, load balancing |
| Cluster | Logical grouping of tasks/services |
| Setting | Description |
|---|---|
| Image | Docker image (from ECR or public) |
| CPU/Memory | Resource allocation |
| Port Mappings | Container port to host port |
| Environment | Variables, secrets from SSM/Secrets Manager |
| IAM Role | Task role (permissions for containers) |
| Logging | CloudWatch Logs integration |
| Feature | Description |
|---|---|
| Serverless | No EC2 management |
| Pricing | Per vCPU + memory per second |
| Scaling | Auto-scaling on CPU/memory metrics |
Private Docker registry:
| Feature | Description |
|---|---|
| Encryption | Images encrypted at rest |
| Scanning | Vulnerability scanning |
| Lifecycle Policies | Auto-delete old images |
| Cross-region | Replicate to other regions |
| Role | Purpose |
|---|---|
| Task Execution Role | Pulls images from ECR, sends logs to CloudWatch |
| Task Role | Permissions for the application running in container |
💡 Task Role = what container can do. Execution Role = what ECS agent can do.
| Feature | Description |
|---|---|
| ALB | Dynamic port mapping, path-based routing |
| NLB | High throughput, static IP |
| Service Discovery | Route 53 DNS for service-to-service |
Infrastructure as Code — define AWS resources in templates.
AWSTemplateFormatVersion: "2010-09-09"
Description: String
Parameters: # Input values
Resources: # AWS resources (REQUIRED)
Outputs: # Export values
Mappings: # Static variables
Conditions: # Conditional resource creation
| Function | Purpose | Example |
|---|---|---|
| !Ref | Reference resource/parameter | !Ref MyBucket |
| !GetAtt | Get resource attribute | !GetAtt MyBucket.Arn |
| !Sub | String substitution | !Sub "arn:aws:s3:::${BucketName}" |
| !Join | Join strings | !Join ["-", [a, b, c]] → "a-b-c" |
| !If | Conditional value | !If [Prod, m5.large, t3.micro] |
| !ImportValue | Import from another stack | !ImportValue VPCId |
| !FindInMap | Lookup in Mappings | !FindInMap [RegionMap, !Ref 'AWS::Region', AMI] |
| Parameter | Value |
|---|---|
| AWS::AccountId | Account ID |
| AWS::Region | Current region |
| AWS::StackName | Stack name |
| AWS::StackId | Stack ID |
| AWS::NoValue | Remove property conditionally |
Stack A (export):
Outputs:
  VPCId:
    Value: !Ref MyVPC
    Export:
      Name: SharedVPC
Stack B (import):
VpcId: !ImportValue SharedVPC
Reusable components embedded in parent stack:
Resources:
  NetworkStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: https://s3.amazonaws.com/mybucket/network.yaml
💡 Nested = component reuse. Cross-stack = share values between independent stacks.
Preview changes before executing:
aws cloudformation create-change-set --stack-name MyStack --change-set-name MyChangeSet --template-body file://template.yaml
aws cloudformation describe-change-set --stack-name MyStack --change-set-name MyChangeSet
aws cloudformation execute-change-set --stack-name MyStack --change-set-name MyChangeSet
Drift detection: detect if actual resources differ from the template definition.
Simplified CloudFormation for serverless.
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31 # SAM transform
Globals:
  Function:
    Timeout: 30
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: python3.9
      CodeUri: ./src
      Events:
        Api:
          Type: Api
          Properties:
            Path: /hello
            Method: GET
| Type | Creates |
|---|---|
| AWS::Serverless::Function | Lambda + execution role |
| AWS::Serverless::Api | API Gateway REST API |
| AWS::Serverless::HttpApi | API Gateway HTTP API |
| AWS::Serverless::SimpleTable | DynamoDB table |
| AWS::Serverless::LayerVersion | Lambda Layer |
| Command | Description |
|---|---|
| sam init | Initialize new project |
| sam build | Build and package |
| sam local invoke | Test locally |
| sam local start-api | Local API Gateway |
| sam deploy --guided | Interactive deployment |
| sam sync | Fast sync for development |
Built-in policies for common patterns:
Policies:
  - S3ReadPolicy:
      BucketName: !Ref MyBucket
  - DynamoDBCrudPolicy:
      TableName: !Ref MyTable
AWS Git repository hosting.
| Feature | Description |
|---|---|
| Auth | HTTPS (Git credentials), SSH (keys), IAM roles |
| Triggers | Lambda, SNS on repository events |
| Notifications | CloudWatch Events/EventBridge |
Managed build service — compile, test, produce artifacts.
buildspec.yml:
version: 0.2
phases:
  install:
    runtime-versions:
      python: 3.9
  pre_build:
    commands:
      - pip install -r requirements.txt
  build:
    commands:
      - python -m pytest
      - sam build
  post_build:
    commands:
      - sam package --s3-bucket $BUCKET
artifacts:
  files:
    - template.yaml
    - '**/*'
cache:
  paths:
    - '/root/.cache/pip/**/*'
| Section | Purpose |
|---|---|
| phases | install, pre_build, build, post_build |
| artifacts | Files to output |
| cache | Speed up builds |
| env | Environment variables |
Automated deployment to EC2, Lambda, ECS.
appspec.yml (EC2):
version: 0.0
os: linux
files:
  - source: /
    destination: /var/www/html
hooks:
  BeforeInstall:
    - location: scripts/install_dependencies.sh
  AfterInstall:
    - location: scripts/start_server.sh
Lifecycle Hooks (EC2):
ApplicationStop → DownloadBundle → BeforeInstall → Install → AfterInstall → ApplicationStart → ValidateService
| Platform | Types | Description |
|---|---|---|
| EC2 | In-Place, Blue/Green | Rolling update or swap target groups |
| Lambda | AllAtOnce, Canary, Linear | Traffic shifting |
| ECS | Blue/Green | Traffic shifting with ALB |
Lambda deployment:
| Type | Description |
|---|---|
| AllAtOnce | Immediate shift to new version |
| Canary | x% for n minutes, then 100% |
| Linear | x% every n minutes |
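The difference between Canary and Linear is easiest to see as a timeline of traffic percentages. The sketch below assumes the first increment is applied at minute 0, which is a simplification; both function names are hypothetical and no CodeDeploy API is involved.

```python
def linear_schedule(step_percent, interval_minutes):
    """(minute, percent on new version) pairs for a linear shift of x% every n minutes."""
    schedule = []
    percent, minute = 0, 0
    while percent < 100:
        percent = min(100, percent + step_percent)
        schedule.append((minute, percent))
        minute += interval_minutes
    return schedule


def canary_schedule(canary_percent, wait_minutes):
    """(minute, percent on new version) pairs for x% first, then 100% after n minutes."""
    return [(0, canary_percent), (wait_minutes, 100)]


linear = linear_schedule(25, 10)  # 25% every 10 minutes
canary = canary_schedule(10, 5)   # 10% for 5 minutes, then everything
```

Canary exposes a small slice once and then jumps to 100%, while Linear keeps adding equal slices, so a bad version is caught at different levels of exposure.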
Orchestrate CI/CD workflow:
Source → Build → Test → Deploy
↓
[Manual Approval]
| Feature | Description |
|---|---|
| Stages | Sequential groups of actions |
| Actions | Individual tasks (source, build, deploy) |
| Artifacts | Files passed between stages (stored in S3) |
| Manual Approval | Human gate between stages |
Monitoring, logging, and alarms.
| Concept | Description |
|---|---|
| Namespace | Container for metrics (e.g., AWS/EC2) |
| Dimension | Attribute of metric (InstanceId, AutoScalingGroupName) |
| Resolution | Standard (1 min) or High-res (1 sec) |
| Custom Metrics | Your own metrics via PutMetricData API |
EC2 Default Metrics:
- CPU, Network, Disk (read/write operations)
- NOT included: Memory, disk space (need CloudWatch Agent)
| State | Description |
|---|---|
| OK | Metric within threshold |
| ALARM | Metric breached threshold |
| INSUFFICIENT_DATA | Not enough data points |
Actions: SNS notification, Auto Scaling, EC2 actions (stop, terminate, reboot)
| Concept | Description |
|---|---|
| Log Group | Collection of log streams (e.g., per application) |
| Log Stream | Sequence of events from same source |
| Retention | Never expire by default, configure 1 day to 10 years |
| Metric Filters | Extract metrics from log data |
| Subscription Filters | Stream logs to Lambda, Kinesis, OpenSearch |
Query logs with SQL-like syntax:
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 20
Install on EC2/on-premises for:
- Custom metrics: Memory, disk, swap, custom
- Log collection: Push logs to CloudWatch Logs
Container Insights: monitoring for ECS, EKS, and Kubernetes, with metrics per container, task, and service.
Distributed tracing for debugging and performance analysis.
| Concept | Description |
|---|---|
| Trace | End-to-end request journey |
| Segment | Work done by one service |
| Subsegment | Granular breakdown (HTTP calls, DB queries) |
| Annotations | Indexed key-value pairs (searchable) |
| Metadata | Non-indexed key-value pairs |
| Service | Setup |
|---|---|
| Lambda | Enable active tracing |
| API Gateway | Enable tracing in stage settings |
| EC2/ECS | Install X-Ray daemon + SDK |
| Elastic Beanstalk | Extension configuration |
X-Ray daemon: runs on EC2/ECS, buffers trace data, and sends it to the X-Ray API.
App (X-Ray SDK) → UDP port 2000 → X-Ray Daemon → X-Ray API
Control volume of requests traced:
| Setting | Description |
|---|---|
| Reservoir | Fixed # requests per second traced |
| Rate | Percentage of additional requests traced |
Default: 1 request/sec + 5% additional
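The default rule (reservoir of 1 request per second plus 5% of the rest) gives a simple expected value for traced traffic. This is a back-of-the-envelope sketch that assumes steady traffic; `expected_traced_per_second` is a hypothetical helper.

```python
def expected_traced_per_second(requests_per_second, reservoir=1, rate=0.05):
    """Expected traced requests/s: the reservoir first, then a fixed rate of the remainder."""
    from_reservoir = min(requests_per_second, reservoir)
    beyond_reservoir = max(0, requests_per_second - reservoir)
    return from_reservoir + rate * beyond_reservoir


# At 101 requests/s: 1 from the reservoir + 5% of the remaining 100
traced = expected_traced_per_second(101)
```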
| API | Used By |
|---|---|
| PutTraceSegments | App/SDK uploads segments |
| GetTraceSummaries | Get list of traces |
| BatchGetTraces | Get full trace details |
User identity and access management.
User Pools: authentication. They handle sign-up and sign-in and return JWT tokens.
| Feature | Description |
|---|---|
| Sign-up/Sign-in | Email, phone, username |
| MFA | SMS, TOTP |
| Social login | Google, Facebook, SAML, OIDC |
| Hosted UI | Pre-built login pages |
| Triggers | Lambda on auth events |
JWT Tokens:
- ID Token: User identity/attributes
- Access Token: API authorization
- Refresh Token: Get new tokens
Identity Pools: authorization. They exchange tokens from a Cognito User Pool (CUP) or another identity provider for temporary AWS credentials.
[User] → [CUP/Social] → [ID Token] → [Identity Pool] → [Temp AWS Credentials]
| Feature | Description |
|---|---|
| Federation | CUP, Google, Facebook, SAML, OpenID |
| IAM Roles | Map users to authenticated/unauthenticated roles |
| Fine-grained | Policy variables for row-level access |
| Feature | User Pools | Identity Pools |
|---|---|---|
| Purpose | Authentication | Authorization |
| Returns | JWT tokens | AWS credentials |
| Use with | API Gateway, ALB | AWS SDK (S3, DynamoDB) |
Managed encryption keys.
| Type | Managed By | Cost | Rotation |
|---|---|---|---|
| AWS Owned | AWS | Free | Varies |
| AWS Managed | AWS | Free | Auto yearly |
| Customer Managed | You | $1/month | Optional/yearly |
| API | Purpose |
|---|---|
| Encrypt | Encrypt data up to 4 KB |
| Decrypt | Decrypt data |
| GenerateDataKey | Returns plaintext + encrypted data key |
| GenerateDataKeyWithoutPlaintext | Returns only encrypted data key |
For data > 4 KB:
1. GenerateDataKey → plaintext DEK + encrypted DEK
2. Encrypt data with plaintext DEK
3. Store encrypted DEK with encrypted data
4. Decrypt: Use KMS to decrypt DEK → use DEK to decrypt data
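The four steps can be walked through with a stand-in for KMS. Everything here is an assumption made for illustration: `fake_generate_data_key` and `fake_kms_decrypt` imitate the KMS calls locally, and `_xor_stream` is a toy cipher that is NOT secure. A real implementation would call KMS and use an authenticated cipher such as AES-GCM.

```python
import hashlib
import os

MASTER_KEY = os.urandom(32)  # stands in for a KMS key, which never leaves KMS in reality


def _xor_stream(key, data):
    """Toy cipher for illustration only (NOT secure): XOR with a SHA-256 keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def fake_generate_data_key():
    """Imitates GenerateDataKey: returns (plaintext DEK, DEK encrypted under the master key)."""
    plaintext_dek = os.urandom(32)
    return plaintext_dek, _xor_stream(MASTER_KEY, plaintext_dek)


def fake_kms_decrypt(encrypted_dek):
    """Imitates Decrypt on the small encrypted DEK."""
    return _xor_stream(MASTER_KEY, encrypted_dek)


def envelope_encrypt(data):
    dek, encrypted_dek = fake_generate_data_key()        # step 1
    ciphertext = _xor_stream(dek, data)                   # step 2: encrypt locally with the DEK
    return {"ciphertext": ciphertext, "encrypted_dek": encrypted_dek}  # step 3


def envelope_decrypt(blob):
    dek = fake_kms_decrypt(blob["encrypted_dek"])         # step 4: recover the DEK
    return _xor_stream(dek, blob["ciphertext"])           # then decrypt the data locally


payload = b"x" * 10_000  # larger than the 4 KB limit of a direct Encrypt call
blob = envelope_encrypt(payload)
roundtrip = envelope_decrypt(blob)
```

Only the 32-byte data key ever goes through the key service; the large payload is encrypted locally, which is the point of the pattern.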
| Policy Type | Description |
|---|---|
| Default | Created automatically, grants access to root user |
| Custom | Define who can access key, required for cross-account |
Additional authenticated data for extra security:
kms.encrypt(
    KeyId='alias/my-key',
    Plaintext=data,
    EncryptionContext={'department': 'engineering'}
)
Decryption must include the same encryption context.
| Feature | Description |
|---|---|
| Purpose | Store secrets (passwords, API keys, tokens) |
| Rotation | Automatic rotation with Lambda |
| Integration | RDS, Redshift, DocumentDB automatic rotation |
| Cost | Per secret + per API call |
| Feature | Description |
|---|---|
| Purpose | Configuration and secrets |
| Types | String, StringList, SecureString (encrypted) |
| Hierarchy | /app/prod/db-connection |
| Cost | Free (standard) or paid (advanced) |
| Use Case | Service |
|---|---|
| Secrets with rotation | Secrets Manager |
| RDS/database credentials | Secrets Manager |
| Configuration values | Parameter Store |
| Cost-sensitive | Parameter Store |
| Simple secrets without rotation | Parameter Store (SecureString) |
Serverless event bus — route events to targets.
| Source | Examples |
|---|---|
| AWS Services | EC2, S3, CodePipeline state changes |
| Custom Apps | Your applications via PutEvents API |
| SaaS Partners | Zendesk, Datadog, Auth0 |
| Scheduled | Cron expressions |
| Type | Description |
|---|---|
| Event Pattern | Match events by pattern (source, detail-type, etc.) |
| Schedule | Cron or rate expression |
Lambda, SQS, SNS, Step Functions, Kinesis, ECS Tasks, CodePipeline, EC2 Actions, API Gateway, EventBridge in another account/region...
{
  "source": ["aws.ec2"],
  "detail-type": ["EC2 Instance State-change Notification"],
  "detail": {
    "state": ["stopped", "terminated"]
  }
}
- Discover/store event schemas
- Generate code bindings
- Versioning
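To show how an event pattern like the EC2 state-change example earlier selects events, here is a simplified matcher that supports only exact-value lists and nested objects. It is a sketch of the matching idea, not EventBridge's full pattern language, and `matches_pattern` and the sample events are hypothetical.

```python
def matches_pattern(pattern, event):
    """True if every field in the pattern is present in the event with an allowed value."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the matching part of the event
            if not isinstance(actual, dict) or not matches_pattern(expected, actual):
                return False
        elif actual not in expected:
            # Leaf: the event value must be one of the listed values
            return False
    return True


pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}

stopped_event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}
running_event = {**stopped_event, "detail": {"state": "running"}}

stopped_matches = matches_pattern(pattern, stopped_event)
running_matches = matches_pattern(pattern, running_event)
```

Fields that the pattern does not mention, such as the instance ID, are ignored, so a pattern only needs to name what it cares about.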
PaaS for deploying web applications.
| Policy | Downtime | Description |
|---|---|---|
| All at once | Yes | Fastest, brief outage |
| Rolling | No | Deploy batch by batch |
| Rolling with additional batch | No | Maintain capacity during deployment |
| Immutable | No | New ASG, swap when healthy |
| Blue/Green | No | Create new environment, swap URL |
.ebextensions/*.config files customize environment:
option_settings:
  aws:elasticbeanstalk:application:environment:
    MY_ENV_VAR: value
packages:
  yum:
    git: []
container_commands:
  01_migrate:
    command: "python manage.py migrate"
    leader_only: true
Limit stored application versions (max 1000):
- Delete based on age or count
- Option to preserve source bundle in S3
Includes key DVA-C02 topics beyond the notes above.
Is IAM a global or regional service?
✅ Global — IAM users, groups, roles, and policies are not region-specific.
Is EBS regional or AZ-specific?
✅ AZ-specific — EBS volumes are bound to a single Availability Zone.
How many AZs does a Region typically have?
✅ 2-6 AZs per Region.
Can an IAM group contain another group?
✅ No — Groups can only contain users, not other groups.
What are IAM Roles used for?
✅ Mainly services. Roles grant temporary permissions to AWS services (e.g., EC2, Lambda) to perform actions; they can also be assumed by federated or cross-account users.
You're trying to SSH into your EC2 and getting a timeout. What's the most likely issue?
✅ Security Group — A timeout almost always points to a security group (or other network path) issue. Check inbound rules for port 22.
Which EC2 purchasing option offers up to 90% discount but can be interrupted?
✅ Spot Instances — Cheapest option, but AWS can reclaim the capacity at any time with a 2-minute warning.
What's the difference between Dedicated Host and Dedicated Instance?
✅ Dedicated Host — Full server control, see sockets/cores (for BYOL licensing)
✅ Dedicated Instance — Dedicated hardware, no host visibility
What happens to Instance Store data when you stop an EC2 instance?
✅ Data is lost — Instance Store is ephemeral. Data is lost on stop, terminate, or hardware failure.
Which EBS volume types can be used as boot volumes?
✅ SSD types only — gp2, gp3, io1, io2. HDD types (st1, sc1) cannot be boot volumes.
What is the max IOPS for gp3?
✅ 16,000 IOPS — Can be provisioned independently of volume size.
Can you attach an EBS volume to multiple EC2 instances?
✅ Only io1/io2 with Multi-Attach — up to 16 instances, same AZ only.
EFS is compatible with which operating systems?
✅ Linux only — EFS is POSIX-compliant, not compatible with Windows.
Are AMIs region-specific or global?
✅ Region-specific — Must copy an AMI to use it in another region.
What does ELB stand for and is it a load balancer type?
✅ Elastic Load Balancing — It's the service name, not a LB type. Actual types are ALB, NLB, GLB, CLB.
Which load balancer provides a static IP address?
✅ NLB — Network Load Balancer provides one static IP per AZ. ALB only provides a static DNS hostname.
NLB operates at which OSI layer? ALB?
✅ NLB — Layer 4 (Transport: TCP, UDP)
✅ ALB — Layer 7 (Application: HTTP, HTTPS)
Will ELB terminate an unhealthy target?
✅ No — ELB only stops routing traffic. ASG with ELB health checks enabled will terminate/replace unhealthy instances.
Is Cross-Zone Load Balancing enabled by default for ALB? NLB?
✅ ALB — Enabled by default (free)
✅ NLB — Disabled by default (charged if enabled)
What is the default ASG cooldown period?
✅ 300 seconds (5 minutes) — Prevents rapid successive scaling actions.
What scaling policy uses ML to predict load patterns?
✅ Predictive Scaling — Analyzes historical patterns and pre-provisions capacity.
Read Replicas use sync or async replication?
✅ ASYNC — Data is eventually consistent across read replicas.
Multi-AZ uses sync or async replication?
✅ SYNC — Changes are immediately replicated to standby for disaster recovery.
Can you read from a Multi-AZ standby database?
✅ No — Standby is only for failover. Use Read Replicas for read scaling.
How many Read Replicas can RDS have? Aurora?
✅ Both can have up to 15 Read Replicas.
What's the failover time for Aurora?
✅ Less than 30 seconds.
How do you encrypt an existing unencrypted RDS database?
✅ Snapshot → Copy with encryption → Restore from encrypted snapshot.
What is RDS Proxy and when should you use it?
✅ Serverless connection pooler. Use with Lambda to reduce DB connections (Lambda opens many short-lived connections).
Is RDS Proxy publicly accessible?
✅ No — It lives inside your VPC only, never publicly accessible.
What is the maximum Lambda execution timeout?
✅ 15 minutes (900 seconds).
What is the maximum Lambda memory allocation?
✅ 10,240 MB (10 GB). CPU scales proportionally with memory.
What is the /tmp directory size limit in Lambda?
✅ Up to 10,240 MB (10 GB), configurable; the default is 512 MB. Use for temporary file processing.
What happens if Lambda runs out of memory?
✅ Execution fails with "Process exited before completing request" or OutOfMemoryError.
What are Lambda Layers used for?
✅ Share code/dependencies across multiple functions. Up to 5 layers per function.
How do you give Lambda access to resources in a VPC?
✅ Configure VPC settings (subnets + security groups). Lambda creates ENIs in your VPC.
What's the difference between synchronous and asynchronous Lambda invocation?
✅ Sync — Caller waits for response (API Gateway, SDK invoke)
✅ Async — Caller doesn't wait, Lambda handles retries (S3, SNS, EventBridge)
How many retries does Lambda do for async invocations?
✅ 2 retries (3 total attempts). Failed events can go to DLQ or on-failure destination.
What are the three API Gateway endpoint types?
✅ Edge-optimized (CloudFront), Regional, Private (VPC only)
What is the API Gateway default timeout?
✅ 29 seconds — Cannot exceed this even if Lambda timeout is higher.
How do you handle CORS in API Gateway?
✅ Enable CORS on the resource/method. API Gateway adds Access-Control-Allow-Origin headers. With Lambda proxy integration, the function must return the CORS headers itself.
What's the difference between REST API and HTTP API in API Gateway?
✅ HTTP API — Cheaper, faster, simpler (JWT auth, Lambda proxy)
✅ REST API — Full features (caching, request validation, usage plans, API keys)
How do you implement rate limiting in API Gateway?
✅ Usage Plans + API Keys — Set throttling limits per client.
What are the two capacity modes in DynamoDB?
✅ Provisioned (set RCU/WCU) and On-Demand (pay per request).
What is the maximum item size in DynamoDB?
✅ 400 KB per item.
What's the difference between Query and Scan?
✅ Query — Efficient, uses partition key (and optionally sort key)
✅ Scan — Reads entire table, expensive, use sparingly
What are DynamoDB Streams used for?
✅ Capture item-level changes (insert, update, delete). Trigger Lambda, replicate data, etc.
What is a GSI vs LSI in DynamoDB?
✅ GSI — Different partition key, can be added anytime, has own throughput
✅ LSI — Same partition key, must be created at table creation, shares table throughput
How do you implement optimistic locking in DynamoDB?
✅ Use conditional writes with a version attribute. Write fails if version doesn't match.
What is the maximum object size in S3?
✅ 5 TB. Use multipart upload for objects > 100 MB (required > 5 GB).
What is S3 Transfer Acceleration?
✅ Uses CloudFront edge locations to speed up uploads over long distances.
What's the difference between S3 Standard-IA and S3 One Zone-IA?
✅ Standard-IA — Multi-AZ, for infrequent access
✅ One Zone-IA — Single AZ, cheaper, data lost if AZ fails
What is S3 Object Lock?
✅ WORM model (Write Once Read Many). Prevents object deletion/modification for retention period.
How do you enable versioning on an S3 bucket?
✅ Enable at bucket level. Once enabled, can only be suspended (not disabled). Protects against accidental deletes.
What is the default visibility timeout for SQS?
✅ 30 seconds — Time a message is hidden after being read.
What is the maximum retention period for SQS messages?
✅ 14 days (default: 4 days).
What's the difference between Standard and FIFO SQS queues?
✅ Standard — Unlimited throughput, at-least-once delivery, best-effort ordering
✅ FIFO — 300 msg/s (3000 with batching), exactly-once, strict ordering
What is a Dead Letter Queue (DLQ)?
✅ Queue for messages that failed processing after max retries. Helps debug failures.
What's the difference between SQS and SNS?
✅ SQS — Queue, pull-based, messages persist until processed
✅ SNS — Pub/sub, push-based, messages sent immediately to all subscribers
What is the SNS + SQS fan-out pattern?
✅ SNS topic pushes to multiple SQS queues. Decouples publishers from consumers, enables parallel processing.
What is the buildspec.yml file?
✅ CodeBuild configuration file. Defines build phases (install, pre_build, build, post_build) and artifacts.
What is the appspec.yml/appspec.yaml file?
✅ CodeDeploy configuration. Defines deployment lifecycle hooks and file mappings.
What deployment types does CodeDeploy support for EC2?
✅ In-place (rolling) and Blue/Green (traffic shift to new instances).
What deployment types does CodeDeploy support for Lambda?
✅ AllAtOnce, Canary (x% then 100%), Linear (x% every n minutes).
What is the intrinsic function to reference another resource in CloudFormation?
✅ !Ref (short form) or Ref: (full form). Returns the physical ID of the resource.
What does !GetAtt do in CloudFormation?
✅ Gets an attribute from a resource (e.g., !GetAtt MyBucket.Arn).
What is AWS SAM?
✅ Serverless Application Model — Simplified CloudFormation for serverless (Lambda, API Gateway, DynamoDB).
What command packages and deploys a SAM application?
✅ sam build → sam deploy (or sam deploy --guided for interactive).
What is the minimum resolution for CloudWatch custom metrics?
✅ 1 second (high-resolution). Standard is 1 minute.
How long are CloudWatch Logs retained by default?
✅ Forever (never expire). Must set retention policy to auto-delete.
What is X-Ray used for?
✅ Distributed tracing — Visualize requests as they travel through your application. Debug latency issues.
What is the X-Ray daemon?
✅ Runs on EC2/ECS, collects trace data from SDK and sends to X-Ray service. Lambda has it built-in.
What are X-Ray segments and subsegments?
✅ Segment — Work done by a service/resource
✅ Subsegment — Granular breakdown (e.g., external HTTP call, DB query)
What's the difference between Cognito User Pools and Identity Pools?
✅ User Pools — Authentication (sign-up, sign-in, get JWT tokens)
✅ Identity Pools — Authorization (exchange tokens for temporary AWS credentials)
How do you authenticate API Gateway with Cognito?
✅ Use Cognito User Pool Authorizer — Validates JWT tokens from User Pool.
What are the two types of KMS keys?
✅ AWS managed (aws/service-name, free) and Customer managed (you control rotation, policies).
What is envelope encryption?
✅ Data encrypted with data key, data key encrypted with KMS key. Used for large data.
What is the GenerateDataKey API?
✅ Returns a plaintext data key + encrypted copy. Use plaintext to encrypt data, store encrypted key with data.
What is EventBridge (formerly CloudWatch Events)?
✅ Serverless event bus. Route events from AWS services, SaaS, custom apps to targets (Lambda, SQS, etc.).
What is an EventBridge rule?
✅ Matches incoming events (by pattern or schedule) and routes to target(s).
What's the difference between Redis and Memcached in ElastiCache?
✅ Redis — Multi-AZ, replication, persistence, complex data types
✅ Memcached — Simple key-value, multi-threaded, no persistence, horizontal scaling
What is Lazy Loading (Cache-Aside) pattern?
✅ App checks cache first → on miss, fetches from DB → stores in cache → returns. Only requested data is cached.
What is Write-Through caching?
✅ Write to cache AND DB on every update. Cache always current, but write penalty and cache churn.
What is the main drawback of Lazy Loading?
✅ Cache miss = 3 network calls (check cache, query DB, write cache). Also, data can become stale.
When would you use Redis over Memcached?
✅ When you need: Multi-AZ, persistence, complex data structures (sorted sets, lists), pub/sub, or backup/restore.
What is TTL in caching?
✅ Time-To-Live — Automatic expiration of cached items. Balance freshness vs cache hit rate.
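Lazy Loading and TTL from the last few answers fit together in a few lines. This is a minimal in-process sketch, not ElastiCache client code; `LazyCache`, `fake_db`, and the injectable clock are assumptions made so the example runs deterministically.

```python
import time


class LazyCache:
    """Cache-aside with TTL: load on miss, serve from cache until the entry expires."""

    def __init__(self, load_from_db, ttl_seconds=60, clock=time.monotonic):
        self.load_from_db = load_from_db
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0]                     # cache hit
        value = self.load_from_db(key)          # cache miss: query the database
        self.entries[key] = (value, self.clock() + self.ttl_seconds)  # then populate the cache
        return value


db_calls = []


def fake_db(key):
    """Hypothetical database lookup that records how often it is called."""
    db_calls.append(key)
    return f"row-for-{key}"


now = [0.0]  # manually advanced clock for the demo
cache = LazyCache(fake_db, ttl_seconds=60, clock=lambda: now[0])

cache.get("user1")  # miss: goes to the database
cache.get("user1")  # hit: served from the cache
now[0] = 61.0
cache.get("user1")  # TTL expired: goes to the database again
```

A shorter TTL keeps data fresher at the cost of more misses, which is the freshness versus hit-rate trade-off mentioned above.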