AWS Security Automation Platform

An event-driven, fully automated AWS cloud security platform that detects threats across your AWS environment and responds within seconds — no human intervention required for containment. Built on GuardDuty, Security Hub, Inspector, Macie, IAM Access Analyzer, and AWS Config, with Lambda functions that handle the full incident response lifecycle: detect → contain → forensics → alert → audit.

Architecture Overview

┌──────────────────────────────────────────────────────────────────────────────────┐
│                    AWS SECURITY AUTOMATION PLATFORM                              │
│                                                                                  │
│  DETECTION LAYER                                                                 │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐           │
│  │  GuardDuty  │  │  Inspector  │  │    Macie     │  │  IAM Access │           │
│  │             │  │    v2       │  │              │  │  Analyzer   │           │
│  │ ML-powered  │  │ CVE scan:   │  │ S3 sensitive │  │ External    │           │
│  │ threat      │  │ EC2/ECR/    │  │ data (PII,   │  │ access to   │           │
│  │ detection   │  │ Lambda      │  │ credentials) │  │ resources   │           │
│  └──────┬──────┘  └──────┬──────┘  └──────┬───────┘  └──────┬──────┘           │
│         │                │                │                  │                  │
│  ┌──────▼────────────────▼────────────────▼──────────────────▼──────┐           │
│  │                  AWS Security Hub                                 │           │
│  │  Aggregates all findings, CIS Benchmark + PCI DSS standards       │           │
│  │  Single pane of glass for security posture                        │           │
│  └──────────────────────────────┬────────────────────────────────────┘           │
│                                 │                                                │
│  ROUTING LAYER                  │                                                │
│  ┌──────────────────────────────▼────────────────────────────────────┐           │
│  │                    Amazon EventBridge                             │           │
│  │                                                                   │           │
│  │  Rule: EC2 findings ──────────────────────────────────────────►  │           │
│  │  Rule: IAM findings ──────────────────────────────────────────►  │           │
│  │  Rule: Macie findings ─────────────────────────────────────────► │           │
│  │  Rule: IP-based findings ─────────────────────────────────────►  │           │
│  │  Rule: Schedule (every 6h) ────────────────────────────────────► │           │
│  └──────────────────────────────┬────────────────────────────────────┘           │
│                                 │                                                │
│  RESPONSE LAYER                 │                                                │
│  ┌──────────┐  ┌──────────┐  ┌─┴────────┐  ┌──────────┐  ┌──────────────────┐  │
│  │isolate-  │  │revoke-   │  │ block-ip │  │   s3-    │  │findings-         │  │
│  │ec2       │  │iam-keys  │  │          │  │remediat- │  │aggregator        │  │
│  │          │  │          │  │          │  │ion       │  │(scheduled)       │  │
│  │1. Attach │  │1. Disable│  │1. WAFv2  │  │1. Block  │  │• GuardDuty       │  │
│  │   quarant│  │   key    │  │   IP set │  │   public │  │• Security Hub    │  │
│  │   ine SG │  │2. Deny   │  │2. NACL   │  │   access │  │• Inspector       │  │
│  │2. Remove │  │   policy │  │   rule   │  │2. Enable │  │• Access Analyzer │  │
│  │   all SGs│  │3. Tag    │  │3. Threat │  │   KMS    │  │• CloudWatch      │  │
│  │3. Tag    │  │   user   │  │   intel  │  │   encrypt│  │  metrics         │  │
│  │4. Snap-  │  │4. Revoke │  │   table  │  │3. Enable │  │• Slack digest    │  │
│  │   shot   │  │   session│  │4. Alert  │  │   version│  │• S3 report       │  │
│  │5. Alert  │  │5. Alert  │  │          │  │4. Bucket │  │                  │  │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  │   policy │  └──────────────────┘  │
│       │             │             │         │5. Alert  │                         │
│  AUDIT / NOTIFICATION LAYER      │         └──────────┘                         │
│  ┌────▼─────────────▼─────────────▼──────────────────────────────────────────┐   │
│  │  DynamoDB Audit Trail      SNS → Slack / PagerDuty / Email               │   │
│  │  - Every action logged     - Alert on every CRITICAL/HIGH finding        │   │
│  │  - 90-day retention        - Daily digest summary                        │   │
│  │  - KMS encrypted           - Lambda DLQ for failed remediations          │   │
│  └────────────────────────────────────────────────────────────────────────────┘   │
│                                                                                  │
│  COMPLIANCE / AUDIT LAYER                                                        │
│  ┌──────────────┐  ┌──────────────────────────────────────────────────────────┐  │
│  │  CloudTrail  │  │  AWS Config                                              │  │
│  │  Multi-region│  │  Rules: s3-public-access, s3-encryption, root-mfa,       │  │
│  │  KMS encrypt │  │  iam-password-policy, cloudtrail-enabled                  │  │
│  │  S3 + CW logs│  │  Continuous compliance evaluation                        │  │
│  └──────────────┘  └──────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────────────┘

Why Automated Response

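Industry breach studies have put the average time to detect a breach at around 207 days and the average time to contain it at a further 73 days. Automated response targets the containment window, shrinking it from days to seconds for the scenarios this platform covers.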

A compromised EC2 instance running crypto mining software can cost thousands of dollars per day. A leaked IAM key can be exploited within minutes of being exposed (attackers scan GitHub for leaked keys in near real time). A publicly exposed S3 bucket with PII can constitute a personal data breach, which under GDPR must be reported to the regulator within 72 hours of becoming aware of it.

This platform responds to all three scenarios in seconds:

Compromised EC2: GuardDuty fires → EventBridge routes → Lambda attaches quarantine security group (zero egress/ingress) and creates forensic EBS snapshots → all within 30 seconds of detection
Leaked IAM key: GuardDuty fires → EventBridge routes → Lambda disables the access key and attaches explicit-deny policy → credential is useless within 30 seconds
Public S3 bucket: Macie fires → EventBridge routes → Lambda enables block-public-access, KMS encryption, versioning, and access logging → remediated within 60 seconds

Every action is logged to a DynamoDB audit table with TTL=90 days, and every response function has a Lambda Dead Letter Queue (DLQ) so failed remediations are not silently dropped.

Repository Structure

aws-security-automation/
│
├── README.md
├── pytest.ini                              # Test configuration (80% coverage gate)
│
├── lambda/
│   ├── requirements.txt                    # boto3, pytest, moto, coverage
│   │
│   ├── layers/
│   │   └── common/
│   │       └── utils.py                    # Shared layer: logging, SNS, DynamoDB audit,
│   │                                       # AWS client factory, finding parsers, decorators
│   │
│   ├── functions/
│   │   ├── isolate-ec2/
│   │   │   └── handler.py                  # EC2 containment: quarantine SG, EBS snapshots
│   │   │
│   │   ├── revoke-iam-keys/
│   │   │   └── handler.py                  # IAM containment: disable key, deny policy,
│   │   │                                   # revoke sessions, handle root/assumed roles
│   │   │
│   │   ├── block-ip/
│   │   │   └── handler.py                  # Network containment: WAFv2, NACL, threat intel
│   │   │
│   │   ├── s3-remediation/
│   │   │   └── handler.py                  # S3 hardening: public access, KMS, versioning,
│   │   │                                   # access logging, restrictive bucket policy
│   │   │
│   │   └── findings-aggregator/
│   │       └── handler.py                  # Scheduled: aggregate all sources, CloudWatch
│   │                                       # metrics, daily Slack digest, S3 report
│   │
│   └── tests/
│       ├── conftest.py                     # Shared fixtures, env vars, boto3 mocking
│       ├── test_isolate_ec2.py             # 12 tests: routing, isolation, SG, snapshots
│       ├── test_revoke_iam_keys.py         # 9 tests: disable key, deny policy, root, sessions
│       └── test_s3_remediation.py          # 15 tests: block-ip (7) + S3 remediation (8)
│
└── terraform/
    └── environments/
        └── prod/
            ├── main.tf                     # All security services + Lambda + EventBridge
            ├── variables.tf
            └── outputs.tf

Security Services

GuardDuty — Threat Detection

GuardDuty uses machine learning to analyze CloudTrail API logs, VPC Flow Logs, and DNS query logs continuously. It detects anomalies that rule-based tools miss — for example, an IAM user that has never called ec2:RunInstances suddenly launching 50 instances at 3am is anomalous behavior that GuardDuty catches even without a specific rule for it.

What it monitors:

Data Source	Detections
CloudTrail management events	Unusual API calls, policy changes, new IAM users
CloudTrail S3 data events	Unusual S3 access patterns, mass data exfiltration
VPC Flow Logs	Port scanning, brute force, C2 communication
DNS logs	Malware C2 domain communication, DNS data exfiltration
EKS audit logs	Privilege escalation, container escapes
EBS volumes	Malware scanning via malware protection feature

Finding types this platform responds to:

Finding Type	MITRE ATT&CK	Response
`UnauthorizedAccess:EC2/SSHBruteForce`	T1110.001	EC2 isolation
`UnauthorizedAccess:EC2/RDPBruteForce`	T1110.001	EC2 isolation
`CryptoCurrency:EC2/BitcoinTool.B!DNS`	T1496	EC2 isolation
`Backdoor:EC2/C&CActivity.B!DNS`	T1071	EC2 isolation
`Trojan:EC2/BlackholeTraffic`	T1071	EC2 isolation
`UnauthorizedAccess:EC2/TorClient`	T1090	EC2 isolation
`UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration`	T1552.005	Revoke credentials
`UnauthorizedAccess:IAMUser/ConsoleLoginSuccess.B`	T1078	Revoke credentials
`Policy:IAMUser/RootCredentialUsage`	T1078.004	Critical alert
`Stealth:IAMUser/CloudTrailLoggingDisabled`	T1562.008	Re-enable CloudTrail
`Impact:IAMUser/AnomalousBehavior`	T1486	Revoke credentials
`Recon:EC2/PortProbing`	T1595	Block IP (WAF + NACL)

Terraform configuration:

resource "aws_guardduty_detector" "main" {
  enable = true
  datasources {
    s3_logs { enable = true }
    kubernetes { audit_logs { enable = true } }
    malware_protection {
      scan_ec2_instance_with_findings {
        ebs_volumes { enable = true }
      }
    }
  }
}

Security Hub — Finding Aggregation

Security Hub is the single pane of glass. It receives findings from GuardDuty, Inspector, Macie, IAM Access Analyzer, and Config, normalizes them into a standard format (ASFF — Amazon Security Finding Format), and evaluates them against security standards.

Standards enabled:

Standard	Coverage
CIS AWS Foundations Benchmark v1.4	58 controls — IAM, logging, monitoring, networking
AWS Foundational Security Best Practices	200+ controls across all AWS services
PCI DSS v3.2.1	Payment card data protection controls

Why Security Hub matters for this platform:

Every Lambda function can call securityhub:BatchUpdateFindings to mark findings as RESOLVED after remediation. This keeps the Security Hub dashboard clean and provides an audit trail showing which findings were auto-remediated vs. which required human intervention.
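As a minimal sketch of that call, the helper below builds the request for `batch_update_findings`. The function name `build_resolve_request` and the `UpdatedBy` value are assumptions for illustration, not names taken from this repository.

```python
def build_resolve_request(finding_id: str, product_arn: str, note: str) -> dict:
    """Build keyword arguments for securityhub.batch_update_findings.

    Hypothetical helper: sets the workflow status to RESOLVED and records
    a note so the dashboard shows the finding was auto-remediated.
    """
    return {
        "FindingIdentifiers": [{"Id": finding_id, "ProductArn": product_arn}],
        "Workflow": {"Status": "RESOLVED"},
        # "security-automation" is an assumed label for the updating actor.
        "Note": {"Text": note, "UpdatedBy": "security-automation"},
    }


# Usage with boto3 (not executed here):
#   import boto3
#   boto3.client("securityhub").batch_update_findings(**build_resolve_request(...))
```

Keeping the request construction separate from the API call makes it easy to unit test without mocking Security Hub.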

Inspector v2 — Vulnerability Scanning

Inspector v2 continuously scans:

EC2 instances (OS packages, software CVEs via SSM Agent)
ECR container images (layer-by-layer CVE scanning — runs on push AND continuously on stored images)
Lambda functions (package dependencies)

Unlike a one-time pipeline scan (such as a Trivy stage in a Jenkins pipeline), Inspector monitors running workloads. If a new CVE is published for a package already installed on your EC2 fleet, Inspector fires a finding within hours.

Integration with the platform:

New CRITICAL CVEs appear in Security Hub → findings-aggregator picks them up → included in daily digest
Findings-aggregator tracks critical_cves list with CVE IDs and CVSS scores
CloudWatch metric FindingsBySource/inspector tracks CVE count over time

Macie — Sensitive Data Discovery

Macie scans S3 buckets for sensitive data using ML classifiers:

Personally Identifiable Information (PII) — names, addresses, SSNs, passport numbers
Financial data — credit card numbers, bank account numbers
Credentials — AWS access keys, private keys, passwords in files
Healthcare data — PHI under HIPAA

Why this matters: A developer accidentally uploads a CSV with 10,000 customer email addresses to an S3 bucket they are using for testing. Without Macie, this goes undetected. Once a Macie classification job discovers the file, the finding is published within 15 minutes (the FIFTEEN_MINUTES publishing frequency) and triggers the s3-remediation Lambda, which blocks public access and applies a restrictive bucket policy to cut off access from outside the account.

Scheduled classification job:

resource "aws_macie2_classification_job" "s3_scan" {
  job_type = "SCHEDULED"
  schedule_frequency { weekly_schedule = "MONDAY" }
  s3_job_definition {
    bucket_definitions {
      account_id = local.account_id
      buckets    = var.macie_bucket_names  # Bucket names, not ARNs; var.macie_bucket_names is a hypothetical list variable
    }
  }
}

IAM Access Analyzer

IAM Access Analyzer continuously evaluates resource policies to identify resources accessible from outside your AWS account. It analyzes:

S3 bucket policies
IAM role trust policies
KMS key policies
Lambda function policies
SQS queue policies
Secrets Manager secrets

An active Access Analyzer finding means an external entity (another AWS account, the public, or a third-party service) has been granted access to one of your resources. This is sometimes intentional (a cross-account role for a vendor) but is often a misconfiguration.

Findings-aggregator tracks the count of active Access Analyzer findings and includes them in the daily digest. Any new findings are surfaced in the CloudWatch dashboard.

AWS Config — Compliance Monitoring

AWS Config continuously evaluates resource configurations against rules. Unlike GuardDuty (which detects active threats), Config detects drift — when a resource was correctly configured and then changed.

Config rules deployed:

Rule	What It Checks	Severity
`s3-bucket-public-access-prohibited`	S3 Block Public Access enabled	HIGH
`s3-bucket-server-side-encryption-enabled`	Default encryption on all S3 buckets	HIGH
`root-account-mfa-enabled`	Root account has MFA	CRITICAL
`iam-password-policy`	Password policy meets complexity requirements	MEDIUM
`cloud-trail-enabled`	CloudTrail is active in all regions	CRITICAL

Config + Lambda = automatic remediation: Config violations flow to Security Hub → EventBridge → s3-remediation Lambda (for S3 rules). CloudTrail disabling is handled by revoke-iam-keys Lambda which re-enables it immediately.

CloudTrail — Audit Logging

CloudTrail records every API call made in the account. This is the raw material for forensic investigation after an incident. Configuration:

Multi-region trail (catches API calls in every region, including ones you're not using)
S3 and Lambda data events enabled (records object-level access)
KMS encrypted logs
CloudWatch Logs delivery (enables CloudWatch Insights queries on API activity)
Log file validation enabled (detects if logs have been tampered with)

Why log file validation matters: If an attacker compromises your account and stops CloudTrail logging to cover their tracks, the Stealth:IAMUser/CloudTrailLoggingDisabled GuardDuty finding fires and the revoke-iam-keys Lambda re-enables CloudTrail. If they also delete or alter log files in S3, the SHA-256 hash chain in the digest files (log file validation) shows which logs are missing or modified.

Lambda Remediation Functions

All Lambda functions share:

A common layer (lambda/layers/common/utils.py) with logging, SNS notifications, DynamoDB audit writer, AWS client factory
The @remediation_handler decorator which provides execution timing, error handling, and structured logging
reserved_concurrent_executions = 10 — prevents runaway remediation loops
X-Ray tracing enabled
Dead Letter Queue (SQS) for failed invocations
Python 3.12 runtime
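A decorator like @remediation_handler could look roughly like the sketch below. This is an illustration of the pattern (timing, error handling, structured logging) and not the code in utils.py; the log field names are assumptions.

```python
import functools
import json
import logging
import time

logger = logging.getLogger("remediation")


def remediation_handler(func):
    """Wrap a Lambda handler with timing, error handling, and structured logs."""

    @functools.wraps(func)
    def wrapper(event, context):
        started = time.monotonic()
        record = {
            "function": func.__name__,
            "finding_id": event.get("detail", {}).get("id"),
        }
        try:
            result = func(event, context)
            record["success"] = True
            return result
        except Exception as exc:
            record["success"] = False
            record["error"] = repr(exc)
            # Re-raise so the failed asynchronous invocation is retried and
            # can eventually reach the Dead Letter Queue.
            raise
        finally:
            record["duration_ms"] = round((time.monotonic() - started) * 1000, 1)
            logger.info(json.dumps(record))

    return wrapper


@remediation_handler
def handler(event, context):
    """Example handler: echoes the finding ID it was invoked with."""
    return {"status": "ok", "finding_id": event["detail"]["id"]}
```

Re-raising inside the decorator matters: swallowing the exception would make the invocation look successful and the event would never reach the DLQ.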

isolate-ec2

Trigger: GuardDuty EC2 findings (SSH/RDP brute force, crypto mining, C2 communication, Tor)

Decision logic:

Finding type in ISOLATION_FINDING_TYPES OR severity HIGH/CRITICAL → full isolation
Finding type in ALERT_ONLY_TYPES → notification only, no isolation
Everything else → notification only
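The decision logic above can be sketched as a small pure function. The contents of the two sets are illustrative assumptions (the real lists live in handler.py), the rules are assumed to be evaluated in the order listed, and 7.0 is used as the cutoff because GuardDuty documents numeric severities of 7.0 and above as High.

```python
# Illustrative values only; the real sets are defined in the handler.
ISOLATION_FINDING_TYPES = {
    "UnauthorizedAccess:EC2/SSHBruteForce",
    "CryptoCurrency:EC2/BitcoinTool.B!DNS",
}
ALERT_ONLY_TYPES = {"Recon:EC2/PortProbing"}

HIGH_SEVERITY_THRESHOLD = 7.0  # GuardDuty: 7.0 and above is High


def decide_action(finding_type: str, severity: float) -> str:
    """Return "isolate" or "notify" for a GuardDuty EC2 finding."""
    # Rule 1: known isolation type OR high severity means full isolation.
    if finding_type in ISOLATION_FINDING_TYPES or severity >= HIGH_SEVERITY_THRESHOLD:
        return "isolate"
    # Rule 2: explicitly alert-only types never trigger isolation on their own.
    if finding_type in ALERT_ONLY_TYPES:
        return "notify"
    # Rule 3: anything unrecognized is surfaced to humans, not acted on.
    return "notify"
```

Under this ordering a high-severity finding of an alert-only type is still isolated; if the handler gives alert-only types precedence instead, the first two checks would be swapped.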

Isolation steps:

Describe the instance — get current security groups, VPC ID, EBS volumes, state
Skip if already terminated
Get or create quarantine security group (no inbound, no outbound rules, tagged with security:purpose=quarantine)
modify_instance_attribute — replace all security groups with quarantine SG only
create_tags — mark instance as quarantined with finding ID and timestamp
create_snapshot — forensic EBS snapshot of every attached volume
Write audit record to DynamoDB
Send SNS notification (→ Slack)

The quarantine SG design: The quarantine SG has no rules at all, not even the default outbound allow. This is different from a "deny all" rule: there simply are no rules, so no traffic is permitted. AWS evaluates security groups as allow lists: if nothing is explicitly allowed, nothing is allowed. The key detail is that revoke_security_group_egress removes the default outbound allow-all rule that AWS creates automatically on new SGs.
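A minimal sketch of the quarantine step, assuming an EC2 client (for example `boto3.client("ec2")`) is passed in. The function names mirror those listed in the test table, but the bodies and the group name format are assumptions, not the repository's implementation.

```python
QUARANTINE_TAG = {"Key": "security:purpose", "Value": "quarantine"}


def get_or_create_quarantine_sg(ec2, vpc_id: str) -> str:
    """Return the ID of a rule-free quarantine security group in the VPC."""
    existing = ec2.describe_security_groups(
        Filters=[
            {"Name": "vpc-id", "Values": [vpc_id]},
            {"Name": "tag:security:purpose", "Values": ["quarantine"]},
        ]
    )["SecurityGroups"]
    if existing:
        return existing[0]["GroupId"]

    group_id = ec2.create_security_group(
        GroupName=f"quarantine-{vpc_id}",  # assumed naming scheme
        Description="Quarantine: no inbound or outbound traffic",
        VpcId=vpc_id,
        TagSpecifications=[
            {"ResourceType": "security-group", "Tags": [QUARANTINE_TAG]}
        ],
    )["GroupId"]
    # New security groups start with an allow-all egress rule; remove it so
    # the group has no rules at all.
    ec2.revoke_security_group_egress(
        GroupId=group_id,
        IpPermissions=[{"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}],
    )
    return group_id


def quarantine_instance(ec2, instance_id: str, quarantine_sg_id: str) -> None:
    """Replace every security group on the instance with the quarantine SG."""
    ec2.modify_instance_attribute(InstanceId=instance_id, Groups=[quarantine_sg_id])
```

Passing the client in, rather than creating it inside the function, keeps the logic testable with a stub or moto.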

revoke-iam-keys

Trigger: GuardDuty IAM findings (credential exfiltration, anomalous behavior, root usage, policy changes)

Decision tree:

Root credential usage?
  → Cannot disable root → critical SNS alert with escalation instructions
  → Manual investigation required

CloudTrail disabled?
  → Re-enable CloudTrail for all trails in the region

IAM finding + user principal?
  → Disable access key (Status: Inactive)
  → Attach explicit-deny inline policy (blocks ALL actions)
  → Tag user as quarantined
  → Force password reset at next sign-in (active sessions are blocked by the deny policy)

Assumed role?
  → Attach deny policy with DateLessThan condition
    (revokes all tokens issued before this timestamp)

Why explicit-deny + disable key: Disabling an access key is not enough. The key might have been used to create other credentials or to establish long-term sessions. The explicit-deny policy (Effect: Deny, Action: *, Resource: *) blocks all actions regardless of what other policies allow. Belt and suspenders.
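The two policy documents from the decision tree can be sketched as follows. The helper names and the Sid values are assumptions; the `aws:TokenIssueTime` condition key with `DateLessThan` is the standard IAM mechanism for invalidating tokens issued before a given time.

```python
import json
from datetime import datetime, timezone


def deny_all_policy() -> dict:
    """Explicit deny of every action on every resource (for IAM users)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "SecurityAutomationDenyAll",  # assumed Sid
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
            }
        ],
    }


def revoke_older_sessions_policy(cutoff: datetime) -> dict:
    """Deny all actions for role sessions whose token predates `cutoff`.

    `cutoff` must be timezone-aware; it is rendered in UTC.
    """
    issued_before = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "RevokeOlderSessions",  # assumed Sid
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                "Condition": {"DateLessThan": {"aws:TokenIssueTime": issued_before}},
            }
        ],
    }


# Usage with boto3 (not executed here):
#   iam.put_user_policy(UserName=user, PolicyName="SecurityAutomationExplicitDeny",
#                       PolicyDocument=json.dumps(deny_all_policy()))
#   iam.put_role_policy(RoleName=role, PolicyName="SecurityAutomationExplicitDeny",
#                       PolicyDocument=json.dumps(revoke_older_sessions_policy(now)))
```

The role variant leaves sessions issued after the cutoff untouched, so legitimate workloads that re-assume the role keep working.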

Root credential usage: Root credentials cannot be disabled programmatically. The function sends a CRITICAL alert with specific investigation instructions and escalates to PagerDuty via SNS message attributes.

block-ip

Trigger: GuardDuty findings with a remote IP (port probing, brute force, C2)

Safety check — private IP detection: Before blocking any IP, the function checks if it falls in RFC 1918 private ranges (10/8, 172.16/12, 192.168/16). A private IP in a GuardDuty finding usually means lateral movement — blocking it in the NACL would break internal connectivity. The function alerts with "private-ip-lateral-movement-possible" instead.
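The private-range check can be done with the standard library alone. This is a sketch of the idea; the function name `is_rfc1918` is an assumption and may not match the handler's helper.

```python
import ipaddress

# The three RFC 1918 private IPv4 ranges.
RFC1918_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(ip: str) -> bool:
    """Return True if `ip` is an IPv4 address inside an RFC 1918 range."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    return addr.version == 4 and any(addr in net for net in RFC1918_NETWORKS)
```

Checking the explicit ranges, rather than relying on `is_private`, keeps the behavior aligned with the three ranges named above, since `is_private` also covers loopback, link-local, and other reserved blocks.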

Blocking mechanism:

WAFv2 — adds IP to a managed IP set that blocks traffic at CloudFront/ALB before it reaches EC2. This is the most effective layer — traffic is dropped at the edge.
NACL — adds a DENY rule to the default VPC NACL. NACLs are stateless and operate at the subnet level. Uses rule numbers 200-299 (reserved for automated blocks) to avoid conflict with manual rules.
DynamoDB threat intel table — records the IP with 30-day TTL. Can be queried by other Lambda functions to check if an IP is known-bad before establishing connections.
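One detail worth sketching is how a free NACL rule number might be chosen inside the reserved 200-299 range. The helper below is hypothetical and only illustrates the allocation; the handler may pick numbers differently.

```python
# Rule numbers reserved for automated blocks (200 through 299 inclusive).
AUTOMATED_RULE_RANGE = range(200, 300)


def next_free_rule_number(existing_rule_numbers):
    """Return the lowest unused rule number in the automated range.

    Returns None when all 100 slots are taken, so the caller can alert
    instead of overwriting an existing rule.
    """
    used = set(existing_rule_numbers)
    for number in AUTOMATED_RULE_RANGE:
        if number not in used:
            return number
    return None


# Usage with boto3 (not executed here):
#   ec2.create_network_acl_entry(NetworkAclId=nacl_id, RuleNumber=number,
#                                Protocol="-1", RuleAction="deny",
#                                Egress=False, CidrBlock=f"{ip}/32")
```

Because the range holds only 100 entries, returning None when it is full gives the caller a clear signal to prune expired blocks before adding more.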

s3-remediation

Triggers: Macie findings (sensitive data), Security Hub S3 controls, Config violations

Remediation actions:

Condition	Action
Public access enabled	`put_public_access_block` (BlockPublicAcls, IgnorePublicAcls, BlockPublicPolicy, RestrictPublicBuckets all = true)
No default encryption	`put_bucket_encryption` with SSE-KMS
No versioning	`put_bucket_versioning` (Status: Enabled)
No access logging	`put_bucket_logging` → central security logging bucket
Macie PII finding	`put_bucket_policy` with DenyNonHTTPS + DenyPublicAccess statements

The restrictive bucket policy: Applied only when Macie finds PII/sensitive data. Forces HTTPS (prevents eavesdropping on bucket traffic) and restricts access to the account's own principals only (no cross-account reads of PII data).
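A policy with those two statements might be built as in the sketch below. The statement bodies are an assumption about what DenyNonHTTPS and DenyPublicAccess contain; in particular, using `aws:PrincipalAccount` to restrict access to the owning account is one possible approach, and a real policy may need exceptions for AWS service principals that write to the bucket.

```python
def restrictive_bucket_policy(bucket: str, account_id: str) -> dict:
    """Build a bucket policy that forces HTTPS and blocks other accounts."""
    resources = [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject any request that did not arrive over TLS.
                "Sid": "DenyNonHTTPS",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": resources,
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                # Reject any principal that does not belong to this account.
                "Sid": "DenyPublicAccess",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": resources,
                "Condition": {
                    "StringNotEquals": {"aws:PrincipalAccount": account_id}
                },
            },
        ],
    }


# Usage with boto3 (not executed here):
#   s3.put_bucket_policy(Bucket=bucket,
#                        Policy=json.dumps(restrictive_bucket_policy(bucket, account_id)))
```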

findings-aggregator

Trigger: EventBridge scheduled rule — every 6 hours

What it aggregates:

GuardDuty: all active findings in the last 24 hours, grouped by severity and type
Security Hub: all NEW/ACTIVE findings, grouped by severity and product
IAM Access Analyzer: all active external access findings
Inspector v2: CRITICAL CVEs across EC2/ECR/Lambda

Outputs:

CloudWatch metrics — FindingsBySeverity/{CRITICAL,HIGH,MEDIUM,LOW} and FindingsBySource/{guardduty,security_hub,inspector,iam_access_analyzer}. These power CloudWatch dashboards and alarms.
Slack daily digest — summary of finding counts with color-coded severity (red = critical, yellow = medium, green = clean). Sent via SNS.
S3 JSON report — full aggregated report stored at s3://findings-bucket/findings-summaries/YYYY/MM/DD/HHMMSS.json. Lifecycle rule transitions to Glacier after 90 days, expires after 365 days.
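Two of these outputs are easy to sketch: the severity counts behind the FindingsBySeverity metrics and the S3 key layout for the report. The helper names and the shape of the input findings (a dict with a `severity` label) are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timezone

SEVERITY_LEVELS = ("CRITICAL", "HIGH", "MEDIUM", "LOW")


def count_by_severity(findings) -> dict:
    """Count findings per severity label, always returning all four levels."""
    counts = Counter(f.get("severity", "UNKNOWN") for f in findings)
    return {level: counts.get(level, 0) for level in SEVERITY_LEVELS}


def report_key(now: datetime) -> str:
    """Build the report key: findings-summaries/YYYY/MM/DD/HHMMSS.json."""
    return now.strftime("findings-summaries/%Y/%m/%d/%H%M%S.json")


if __name__ == "__main__":
    sample = [{"severity": "HIGH"}, {"severity": "HIGH"}, {"severity": "LOW"}]
    print(count_by_severity(sample))
    print(report_key(datetime.now(timezone.utc)))
```

Emitting a zero for every level, even when nothing was found, keeps the CloudWatch metric series continuous so alarms do not go into an insufficient-data state.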

EventBridge Routing

EventBridge is the routing layer between security services and Lambda functions. Each rule uses event pattern matching to route specific finding types to the correct Lambda.

Rule patterns:

// EC2 findings → isolate-ec2
{
  "source": ["aws.guardduty"],
  "detail-type": ["GuardDuty Finding"],
  "detail": {
    "type": [
      {"prefix": "UnauthorizedAccess:EC2/"},
      {"prefix": "CryptoCurrency:EC2/"},
      {"prefix": "Backdoor:EC2/"},
      {"prefix": "Trojan:EC2/"},
      {"prefix": "Recon:EC2/"}
    ]
  }
}

The prefix matching is deliberate — it catches new GuardDuty finding subtypes in these categories without requiring rule updates. When AWS adds a new CryptoCurrency:EC2/MoneroTool finding type, the existing rule routes it to the right Lambda automatically.
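To see how the prefix rule behaves, the sketch below reproduces the matching locally in Python. It is a simplified stand-in for EventBridge's pattern engine that covers only this one rule; the function name is an assumption.

```python
# Prefixes copied from the EC2 rule pattern.
EC2_RULE_PREFIXES = (
    "UnauthorizedAccess:EC2/",
    "CryptoCurrency:EC2/",
    "Backdoor:EC2/",
    "Trojan:EC2/",
    "Recon:EC2/",
)


def matches_ec2_rule(event: dict) -> bool:
    """Return True if the event would match the EC2 findings rule."""
    if event.get("source") != "aws.guardduty":
        return False
    if event.get("detail-type") != "GuardDuty Finding":
        return False
    finding_type = event.get("detail", {}).get("type", "")
    # str.startswith accepts a tuple and matches if any prefix fits.
    return isinstance(finding_type, str) and finding_type.startswith(EC2_RULE_PREFIXES)
```

A finding type that does not exist today, such as the CryptoCurrency:EC2/MoneroTool example, matches without any change to the prefix list, while IAM findings fall through to their own rule.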

Infrastructure

Everything is provisioned by Terraform. Key design decisions:

Single KMS key, multiple uses: One KMS key encrypts CloudTrail logs, DynamoDB tables, S3 buckets, SQS DLQ, and Lambda environment variables. The key policy explicitly allows CloudTrail to use it for log encryption. Annual rotation is enabled.

Lambda DLQ: Every Lambda has a Dead Letter Queue (SQS). If an asynchronous invocation fails (network error, throttle, unhandled exception), Lambda retries it twice and then delivers the event to the DLQ. A CloudWatch alarm monitors DLQ message count and alerts if any messages arrive, so failed remediations are not silently dropped.

Reserved concurrency = 10: Each Lambda has reserved_concurrent_executions = 10. This prevents a scenario where a flood of GuardDuty findings (e.g., a large-scale brute force attack) spawns thousands of simultaneous Lambda invocations that exhaust the account's concurrency limit and starve other Lambda functions.

S3 lifecycle policy: The findings reports bucket has a lifecycle rule: reports stay in standard storage for 90 days, then transition to Glacier (cost optimization), and expire after 365 days (compliance data retention).

Testing

The test suite uses pytest with moto for mocking AWS services. Coverage requirement is 80% (enforced in pytest.ini).

Test structure:

Test File	Functions Under Test	Test Count
`test_isolate_ec2.py`	handler, isolate_instance, get_or_create_quarantine_sg, create_forensic_snapshots	12
`test_revoke_iam_keys.py`	handler, revoke_credentials, handle_root_credential_usage, handle_policy_change	9
`test_s3_remediation.py`	block-ip handler + helpers, s3-remediation handler + helpers	15

Key test patterns:

All tests mock boto3 via conftest.py autouse fixture — no real AWS calls are ever made in unit tests. The mock_boto3_session fixture patches boto3.client and boto3.resource globally.

Happy path + failure handling — every function has tests for both success and graceful failure (API errors, missing resources, unexpected input).

Security invariants tested:

Private IPs are never blocked (would break internal connectivity)
Root credential usage never attempts to disable root (impossible)
Terminated instances are skipped (cannot isolate a terminated instance)
Explicit deny policy is a true Deny-All (not just a deny of specific actions)

Run tests:

cd aws-security-automation

# Install dependencies
pip install -r lambda/requirements.txt

# Run all tests with coverage
pytest

# Run specific test file
pytest lambda/tests/test_isolate_ec2.py -v

# Run with coverage report
pytest --cov=lambda --cov-report=html
open htmlcov/index.html

# Run only fast unit tests
pytest -m "not integration" -v

Compliance Mapping

Control	NIST 800-53	CIS AWS	PCI DSS	Implemented By
Threat detection	SI-3, SI-4	3.x	11.5	GuardDuty
Vulnerability management	RA-5, SI-2	—	6.3	Inspector v2
Sensitive data discovery	SC-28, MP-4	—	3.4	Macie
Audit logging	AU-2, AU-12	2.1	10.2	CloudTrail
Compliance monitoring	CA-7	1.x, 2.x	2.2	AWS Config
Incident response	IR-4, IR-5	—	12.9	Lambda (all)
IAM access review	AC-2, AC-6	1.x	7.1	IAM Access Analyzer
Credential management	IA-5	1.14	8.x	revoke-iam-keys
Network protection	SC-7	4.x	1.x	block-ip
Encryption at rest	SC-28	2.1.1	3.4	s3-remediation
Automated remediation	SI-7, IR-4	—	12.10	All Lambda functions

Prerequisites

AWS account with admin permissions
Terraform 1.7+ (terraform version)
Python 3.12+ (python3 --version)
AWS CLI v2 configured (aws sts get-caller-identity)
S3 bucket for Terraform state
DynamoDB table for Terraform lock

Deployment Guide

# 1. Clone repository
git clone https://github.com/YOUR_USERNAME/aws-security-automation.git
cd aws-security-automation

# 2. Create Terraform state backend (one-time)
aws s3 mb s3://enterprise-security-tfstate --region us-east-1
aws s3api put-bucket-versioning \
  --bucket enterprise-security-tfstate \
  --versioning-configuration Status=Enabled

aws dynamodb create-table \
  --table-name enterprise-security-tflock \
  --attribute-definitions AttributeName=LockID,AttributeType=S \
  --key-schema AttributeName=LockID,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST

# 3. Install Python dependencies and run tests
pip install -r lambda/requirements.txt
pytest
# All tests must pass before deploying

# 4. Initialize and deploy Terraform
cd terraform/environments/prod
terraform init
terraform plan -var="alert_email=security@company.com" -out=tfplan
# Review the plan carefully
terraform apply tfplan

# 5. Verify services are active
aws guardduty list-detectors
aws securityhub describe-hub
aws macie2 get-macie-session
aws accessanalyzer list-analyzers

# 6. Test the pipeline with a GuardDuty sample finding
aws guardduty create-sample-findings \
  --detector-id $(aws guardduty list-detectors --query 'DetectorIds[0]' --output text) \
  --finding-types "UnauthorizedAccess:EC2/SSHBruteForce"

# Watch Lambda logs
aws logs tail /aws/lambda/security-auto-prod-isolate-ec2 --follow

Runbook — Common Operations

Investigating a quarantined EC2 instance

# Find quarantined instances
aws ec2 describe-instances \
  --filters "Name=tag:security:quarantined,Values=true" \
  --query 'Reservations[].Instances[].{ID:InstanceId,State:State.Name,Reason:Tags[?Key==`security:quarantine-reason`].Value|[0]}'

# Get the finding that triggered isolation
INSTANCE_ID="i-0abc123"
aws ec2 describe-tags --filters "Name=resource-id,Values=${INSTANCE_ID}" \
  --query 'Tags[?Key==`security:quarantine-finding-id`].Value' --output text

# View DynamoDB audit record
aws dynamodb get-item \
  --table-name security-remediation-audit \
  --key '{"finding_id": {"S": "FINDING_ID_HERE"}, "timestamp": {"S": "TIMESTAMP_HERE"}}'

# Access forensic snapshot
aws ec2 describe-snapshots \
  --filters "Name=tag:security:source-instance,Values=${INSTANCE_ID}"

# Once investigation is complete — restore instance
# 1. Remove quarantine tag
aws ec2 delete-tags --resources ${INSTANCE_ID} \
  --tags Key=security:quarantined

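# 2. Restore original security groups (stored in tag)
ORIGINAL_SGS=$(aws ec2 describe-tags --filters "Name=resource-id,Values=${INSTANCE_ID}" \
  --query 'Tags[?Key==`security:original-security-groups`].Value' --output text)
# Assumes the tag value is a comma-separated list of security group IDs (bash syntax)
aws ec2 modify-instance-attribute --instance-id ${INSTANCE_ID} --groups ${ORIGINAL_SGS//,/ }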

Restoring an IAM user after investigation

IAM_USER="compromised-user"

# View all inline policies (find the deny policy)
aws iam list-user-policies --user-name ${IAM_USER}

# Remove the security quarantine policy after investigation
aws iam delete-user-policy \
  --user-name ${IAM_USER} \
  --policy-name "SecurityAutomationExplicitDeny"

# Re-enable access key if cleared
aws iam update-access-key \
  --user-name ${IAM_USER} \
  --access-key-id AKIAIOSFODNN7EXAMPLE \
  --status Active

# Remove quarantine tags
aws iam untag-user \
  --user-name ${IAM_USER} \
  --tag-keys "security:quarantined" "security:quarantine-reason"

Checking the threat intelligence table

# Query all blocked IPs
aws dynamodb scan \
  --table-name security-threat-intel \
  --query 'Items[].{IP:ip_address.S, Type:finding_type.S, Severity:severity.S, Seen:first_seen.S}'

# Remove an IP that was false-positive
aws dynamodb delete-item \
  --table-name security-threat-intel \
  --key '{"ip_address": {"S": "1.2.3.4"}}'

# Also remove from WAF
IP_SET_ID=$(aws wafv2 list-ip-sets --scope REGIONAL \
  --query 'IPSets[?Name==`security-auto-prod-blocked-ips`].Id' --output text)

aws wafv2 get-ip-set --name security-auto-prod-blocked-ips \
  --scope REGIONAL --id ${IP_SET_ID}
# Then update-ip-set removing the IP

Manually triggering a remediation

# Trigger isolate-ec2 manually
aws lambda invoke \
  --function-name security-auto-prod-isolate-ec2 \
  --cli-binary-format raw-in-base64-out \
  --payload '{"source":"aws.guardduty","detail-type":"GuardDuty Finding","detail":{"id":"manual-test-001","type":"UnauthorizedAccess:EC2/SSHBruteForce","severity":8.0,"accountId":"123456789012","region":"us-east-1","title":"Manual test","description":"Manual trigger","resource":{"instanceDetails":{"instanceId":"i-YOUR_INSTANCE_ID","networkInterfaces":[]}},"service":{"action":{}}}}' \
  output.json

cat output.json | jq

Reviewing the daily security digest

# Manually trigger the findings aggregator
aws lambda invoke \
  --function-name security-auto-prod-findings-aggregator \
  --cli-binary-format raw-in-base64-out \
  --payload '{}' \
  aggregator-output.json

cat aggregator-output.json | jq

# Check the S3 reports
TODAY=$(date +%Y/%m/%d)
aws s3 ls s3://security-auto-prod-findings-ACCOUNT_ID/findings-summaries/${TODAY}/
aws s3 cp s3://security-auto-prod-findings-ACCOUNT_ID/findings-summaries/${TODAY}/LATEST.json - | jq

Troubleshooting

Lambda is not triggering on GuardDuty findings:

# Verify EventBridge rule is enabled
aws events describe-rule --name security-auto-prod-guardduty-ec2

# Check Lambda permissions
aws lambda get-policy --function-name security-auto-prod-isolate-ec2

# Test EventBridge rule with a sample event
aws events test-event-pattern \
  --event-pattern '{"source":["aws.guardduty"],"detail":{"type":[{"prefix":"UnauthorizedAccess:EC2/"}]}}' \
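  --event '{"id":"test-event-001","account":"123456789012","source":"aws.guardduty","time":"2024-01-01T00:00:00Z","region":"us-east-1","resources":[],"detail-type":"GuardDuty Finding","detail":{"type":"UnauthorizedAccess:EC2/SSHBruteForce"}}'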

Lambda is failing — check DLQ:

# Get DLQ URL
DLQ_URL=$(aws sqs get-queue-url --queue-name security-auto-prod-lambda-dlq --query QueueUrl --output text)

# Receive failed messages
aws sqs receive-message --queue-url ${DLQ_URL} --max-number-of-messages 10

Lambda cannot modify EC2 instance:

Verify the Lambda execution role has ec2:ModifyInstanceAttribute permission
Check whether a service control policy (SCP) or permissions boundary denies the modification (EC2 instances do not have resource-based policies)
Verify the Lambda is in a VPC/region that can reach the EC2 API endpoint

GuardDuty not generating findings in test:

# Generate sample findings for all types
aws guardduty create-sample-findings \
  --detector-id DETECTOR_ID \
  --finding-types \
    "UnauthorizedAccess:EC2/SSHBruteForce" \
    "CryptoCurrency:EC2/BitcoinTool.B!DNS" \
    "UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration.OutsideAWS"

Security Considerations

Principle of least privilege — Lambda IAM role: The Lambda execution role grants only the specific actions needed for each remediation type. For example, it allows ec2:ModifyInstanceAttribute (to change security groups) but not ec2:TerminateInstances (intentional — terminating an instance destroys forensic evidence). Review the IAM policy in main.tf before deploying.

Reserved concurrency as a safety control: reserved_concurrent_executions = 10 prevents automated runaway. If 1000 GuardDuty findings fire simultaneously (e.g., during a large-scale attack), only 10 Lambda invocations run concurrently. The rest wait in Lambda's asynchronous event queue and are retried as capacity frees up. This prevents both AWS account concurrency exhaustion and inadvertent mass remediation.

Private IP safety check: The block-ip function refuses to block RFC 1918 addresses. A private source IP in a GuardDuty finding often indicates lateral movement (an attacker who already has a foothold in your network) — blocking it in the NACL would block legitimate internal traffic and potentially prevent your own incident response team from accessing affected systems.

Root credential handling: Root credentials cannot be disabled via API. The revoke-iam-keys function never attempts to call iam:UpdateAccessKey for root — doing so would throw an exception that could mask the critical alert. Instead, it sends a maximum-urgency SNS notification and explicitly lists manual investigation steps.

Audit trail integrity: Every DynamoDB audit record includes lambda_function (function name), timestamp, finding_id, actions_taken, and success. The table has point-in-time recovery enabled. Records have a 90-day TTL — long enough for compliance audits but not forever (cost control).
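An audit item with those fields and a 90-day TTL could be assembled as in this sketch. The helper name and the TTL attribute name `expires_at` are assumptions; DynamoDB TTL only requires that the chosen attribute hold a Unix epoch time in seconds.

```python
import time

AUDIT_TTL_DAYS = 90
SECONDS_PER_DAY = 86400


def build_audit_record(function_name, finding_id, actions_taken, success, now=None):
    """Build the audit item for one remediation run.

    `now` is a Unix timestamp in seconds; it defaults to the current time
    and exists mainly so tests can pin the clock.
    """
    now = int(time.time()) if now is None else int(now)
    return {
        "finding_id": finding_id,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now)),
        "lambda_function": function_name,
        "actions_taken": list(actions_taken),
        "success": success,
        # Hypothetical TTL attribute: epoch seconds, 90 days from now.
        "expires_at": now + AUDIT_TTL_DAYS * SECONDS_PER_DAY,
    }


# Usage with boto3 (not executed here):
#   table = boto3.resource("dynamodb").Table("security-remediation-audit")
#   table.put_item(Item=build_audit_record("isolate-ec2", finding_id, actions, True))
```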

License

MIT
