Short version: Stand up networking, identity, logging, and a few platform services with explicit defaults and basic alarms.
The minimal stack
Service | Use it for | Baseline settings | Alerting |
---|---|---|---|
IAM & Identity Center | Human access | SSO groups → permission sets; no IAM users; MFA on; short admin sessions | Alarm on root usage; failed console auth bursts |
VPC | Networking | Private subnets for apps/DB; public only for ALBs/bastions; one NAT GW per AZ for prod | N/A |
Security Groups | Primary traffic policy | Tiered SGs (web→app→db); SG-to-SG references; restrict outbound where feasible | Fail closed in templates; review changes weekly |
NACLs | Subnet guardrail | Leave the default allow-all rules; optional denylist or emergency block | N/A |
ALB/NLB | Ingress/load-balancing | ALB for HTTP(S) with TLS at the edge; NLB for TCP; access logs → S3 | ALB 5xx rate and target health |
EC2 & Auto Scaling | Compute | Latest AMIs; SSM agent; IMDSv2 required; instance profiles only (no keys) | Status checks; CPU/network extremes |
EBS | Block storage | Encryption by default; gp3; snapshot lifecycle policies | Low free space on critical volumes (reported by the CloudWatch agent; EBS publishes no filesystem metric itself) |
S3 | Object storage | Block Public Access on; default encryption (SSE-KMS); lifecycle to cheaper tiers | Bucket public ACL detected; replication failures (if used) |
RDS | Managed DB | Encryption on; backups retained; Multi-AZ for prod; automatic minor version upgrades | FreeStorageSpace, CPUCreditBalance (burstable instance classes), failovers |
Route 53 | DNS/health checks | Split-horizon if needed; alias records to ALB/NLB; health checks for externals | Health check alarms to incident channel |
CloudTrail | Audit log | Organization trail, multi-region, S3 + log validation, to log-archive account | Root usage, unauthorized API calls, console login without MFA |
CloudWatch | Metrics/logs/alerts | Set log retention (30–90d); ship app logs; a few good alarms only | EC2 system checks, ALB 5xx, RDS free storage |
AWS Config | Inventory/compliance | Record all resources in all regions; a small set of managed rules | Non-compliance routed to a ticket queue |
KMS | Key management | Customer-managed keys for S3/RDS/EBS; limited key admins; rotation policy | Key disable/delete attempts (CloudTrail→alarm) |
Systems Manager (SSM) | Access/ops | Session Manager instead of SSH where possible; inventory and patch baselines | SSM connection failures on critical fleets |
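
The "Tiered SGs (web→app→db); SG-to-SG references" row can be sketched in a few lines of Python. This is a minimal illustration, not a definitive template: the tier ports, the `TIERS` list, and the `tiered_ingress` helper are assumptions made up for this example, and the group IDs are placeholders. The dictionaries follow the `IpPermissions` shape that the EC2 API accepts for ingress rules.

```python
from typing import Dict, List

# Hypothetical tier layout (tier name, listening port); adjust to your stack.
TIERS = [("web", 443), ("app", 8080), ("db", 5432)]


def tiered_ingress(sg_ids: Dict[str, str],
                   public_cidr: str = "0.0.0.0/0") -> Dict[str, List[dict]]:
    """Build ingress permissions keyed by security group ID.

    Only the first tier accepts traffic from a CIDR range. Every later
    tier accepts traffic solely from the security group of the tier
    directly in front of it (an SG-to-SG reference).
    """
    rules: Dict[str, List[dict]] = {}
    upstream = None
    for name, port in TIERS:
        permission = {"IpProtocol": "tcp", "FromPort": port, "ToPort": port}
        if upstream is None:
            # Edge tier: reachable from the given CIDR.
            permission["IpRanges"] = [{"CidrIp": public_cidr}]
        else:
            # Inner tier: reachable only from the upstream tier's group.
            permission["UserIdGroupPairs"] = [{"GroupId": sg_ids[upstream]}]
        rules[sg_ids[name]] = [permission]
        upstream = name
    return rules


if __name__ == "__main__":
    example_ids = {"web": "sg-web", "app": "sg-app", "db": "sg-db"}
    for group_id, permissions in tiered_ingress(example_ids).items():
        print(group_id, permissions)
```

Each list could then be passed as `IpPermissions` to boto3's `authorize_security_group_ingress` for the matching `GroupId`. Because inner tiers reference groups rather than IP ranges, membership changes (scaling, replacement instances) need no rule edits.
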
Quick wins
- Turn on encryption by default (EBS, S3, RDS) and make it part of your templates.
- Send ALB and VPC Flow Logs to S3/CloudWatch with retention set. Don’t keep logs forever by accident.
- Require IMDSv2 on EC2 so that every metadata request must carry a session token; this blocks the simple SSRF-style requests that IMDSv1 would answer.
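
Two of these quick wins are easy to audit. The sketch below assumes input shaped like the EC2 `DescribeInstances` response (`Reservations`) and the CloudWatch Logs `DescribeLogGroups` response (`logGroups`); the function names and the 90-day ceiling (taken from the 30–90d range above) are illustrative choices, not a fixed recommendation.

```python
from typing import Iterable, List


def instances_without_imdsv2(reservations: Iterable[dict]) -> List[str]:
    """Return IDs of instances that do not require IMDSv2 session tokens."""
    flagged = []
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            tokens = instance.get("MetadataOptions", {}).get("HttpTokens")
            if tokens != "required":  # "optional" still permits IMDSv1
                flagged.append(instance["InstanceId"])
    return flagged


def log_groups_without_retention(log_groups: Iterable[dict],
                                 max_days: int = 90) -> List[str]:
    """Return log groups that never expire or keep data beyond max_days."""
    flagged = []
    for group in log_groups:
        # A missing retentionInDays key means the group never expires.
        days = group.get("retentionInDays")
        if days is None or days > max_days:
            flagged.append(group["logGroupName"])
    return flagged


if __name__ == "__main__":
    sample_reservations = [{"Instances": [
        {"InstanceId": "i-aaa", "MetadataOptions": {"HttpTokens": "required"}},
        {"InstanceId": "i-bbb", "MetadataOptions": {"HttpTokens": "optional"}},
    ]}]
    sample_groups = [
        {"logGroupName": "/app/api", "retentionInDays": 30},
        {"logGroupName": "/app/legacy"},
    ]
    print(instances_without_imdsv2(sample_reservations))
    print(log_groups_without_retention(sample_groups))
```

In practice the inputs would come from paginated boto3 calls (`describe_instances`, `describe_log_groups`); keeping the checks as pure functions makes them easy to run in CI against saved responses.
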
What to skip (for now)
- Do not enable every managed service “just because”. Establish your baseline first; then layer GuardDuty/Inspector/Security Hub where they make sense.
- Avoid bespoke NACL rule sets; keep traffic policy in SGs.
- Don’t create IAM users. Use Identity Center.