← Week 3: Log Aggregation & Analysis

Day 16: CloudWatch Logs

Phase 6 · Sep 17, 2026

Agenda (2–3 hours)

  • Read (45 min): CloudWatch Logs documentation — log groups, streams, retention, Logs Insights query language; CloudWatch Logs Subscriptions
  • Study (45 min): What is the difference between a metric filter and a Logs Insights query? When would you use each?
  • Practice (45 min): Write Logs Insights queries to find: all ERROR events in the last hour, P95 latency from JSON logs, and the top 5 users by request count
  • Challenge (30 min): CloudWatch Logs costs $0.50/GB ingested. A service logs 10GB/day. Design a log filtering strategy at the ECS task level to stay under $50/month
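
The arithmetic behind the challenge can be sketched as follows. The $0.50/GB price, the 10 GB/day volume, and the $50 budget come from the challenge text. The 30-day month and all function names are assumptions made for illustration.

```python
PRICE_PER_GB = 0.50      # USD per GB ingested (from the challenge)
DAILY_GB = 10.0          # GB logged per day (from the challenge)
BUDGET = 50.0            # USD per month (from the challenge)
DAYS_PER_MONTH = 30      # assumption: a 30-day billing month


def monthly_cost(daily_gb, price_per_gb=PRICE_PER_GB, days=DAYS_PER_MONTH):
    """Ingestion cost in USD for one month at a steady daily volume."""
    return daily_gb * days * price_per_gb


def max_daily_gb(budget, price_per_gb=PRICE_PER_GB, days=DAYS_PER_MONTH):
    """Largest steady daily volume that keeps ingestion within the budget."""
    return budget / (price_per_gb * days)


def required_drop_fraction(daily_gb, budget):
    """Fraction of log volume that must be filtered out before ingestion."""
    allowed = max_daily_gb(budget)
    return max(0.0, 1.0 - allowed / daily_gb)


if __name__ == "__main__":
    print(f"unfiltered: ${monthly_cost(DAILY_GB):.2f}/month")
    print(f"allowed:    {max_daily_gb(BUDGET):.2f} GB/day")
    print(f"must drop:  {required_drop_fraction(DAILY_GB, BUDGET):.0%} of volume")
```

Under these assumptions the unfiltered service costs about $150/month, so at least two thirds of the volume has to be dropped before it reaches CloudWatch Logs.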

CloudWatch Logs Structure

Log Group: /ecs/task-svc          (retention: 30 days)
├── Log Stream: task/task-svc/abc123  ← one stream per ECS task
├── Log Stream: task/task-svc/def456
└── Log Stream: task/task-svc/ghi789

When a task definition configures the awslogs log driver, ECS sends each container's stdout and stderr to CloudWatch Logs.
Retention is set per log group (from 1 day up to 10 years, or never expire). The default is never expire, so storage cost grows without bound unless you change it.
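
A minimal sketch of both settings, assuming boto3 is available and AWS credentials are configured. The helper names and the region value are hypothetical. The option keys and the `put_retention_policy` call are the standard ones for the awslogs driver and the CloudWatch Logs API.

```python
# Retention values (in days) that CloudWatch Logs accepts.
VALID_RETENTION_DAYS = {
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
}


def awslogs_configuration(group, region, stream_prefix):
    """Build the logConfiguration block for one container definition."""
    return {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": group,
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }


def stream_name(stream_prefix, container_name, task_id):
    """Stream name the awslogs driver produces when a prefix is set."""
    return f"{stream_prefix}/{container_name}/{task_id}"


def set_retention(group, days):
    """Apply a retention policy to an existing log group."""
    if days not in VALID_RETENTION_DAYS:
        raise ValueError(f"unsupported retention: {days} days")
    import boto3  # lazy import so the helpers above work without AWS access
    boto3.client("logs").put_retention_policy(
        logGroupName=group, retentionInDays=days
    )


if __name__ == "__main__":
    # "us-east-1" is a placeholder region, not taken from the lesson.
    print(awslogs_configuration("/ecs/task-svc", "us-east-1", "task"))
    print(stream_name("task", "task-svc", "abc123"))
```

With the prefix `task` and a container named `task-svc`, the stream names match the tree shown above.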

Logs Insights Queries

# All errors (set the query time range to the last hour)
fields @timestamp, message, trace_id, user_id
| filter level = "ERROR"
| sort @timestamp desc
| limit 50

# P95 latency from structured JSON
fields @timestamp, duration_ms
| filter ispresent(duration_ms)
| stats pct(duration_ms, 95) as p95,
        avg(duration_ms) as avg_ms,
        count() as n
    by bin(5m)

# Top 5 users by request count
stats count() as requests by user_id
| sort requests desc
| limit 5
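
The "last hour" part of the first query is not expressed in the query text. The time range is supplied separately, in the console's time picker or as `startTime`/`endTime` in the API. The sketch below shows one way to run such a query from Python, assuming boto3 and configured AWS credentials. The helper names are hypothetical. `start_query` and `get_query_results` are the CloudWatch Logs API calls.

```python
import time


def last_hour_window(now=None):
    """Return (start, end) epoch seconds covering the hour before `now`."""
    end = int(time.time() if now is None else now)
    return end - 3600, end


def rows_to_dicts(results):
    """Flatten get_query_results rows ([{field, value}, ...]) into dicts."""
    return [{cell["field"]: cell["value"] for cell in row} for row in results]


ERRORS_QUERY = """
fields @timestamp, message, trace_id, user_id
| filter level = "ERROR"
| sort @timestamp desc
| limit 50
"""


def run_query(log_group, query, poll_seconds=1.0):
    """Start a Logs Insights query and poll until it finishes."""
    import boto3  # lazy import: the helpers above need no AWS access
    logs = boto3.client("logs")
    start, end = last_hour_window()
    query_id = logs.start_query(
        logGroupName=log_group, startTime=start, endTime=end, queryString=query
    )["queryId"]
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(poll_seconds)
    if response["status"] != "Complete":
        raise RuntimeError(f"query ended with status {response['status']}")
    return rows_to_dicts(response["results"])
```

A call such as `run_query("/ecs/task-svc", ERRORS_QUERY)` would return one dict per matching event, with all values as strings.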

Metric Filters

Turn matching log events into CloudWatch metric data points as they are ingested, with no query to run:

{
  "FilterPattern": "{ $.level = \"ERROR\" }",
  "MetricName": "ErrorCount",
  "MetricNamespace": "TaskService",
  "MetricValue": "1",
  "DefaultValue": 0
}

Metric filters are evaluated as logs are ingested, so they can feed CloudWatch Alarms without a separate Prometheus scrape. They are not retroactive: only events ingested after the filter is created are counted.
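
As a rough illustration, the JSON above maps onto the `put_metric_filter` API call as sketched below, assuming boto3 and configured credentials. The filter name and helper names are hypothetical. `matches_error` is only a local approximation of the `{ $.level = "ERROR" }` pattern for experimenting with sample lines, not the service's own matcher.

```python
import json


def matches_error(line):
    """Local approximation of the pattern { $.level = "ERROR" }."""
    try:
        event = json.loads(line)
    except ValueError:
        return False  # non-JSON lines never match a JSON pattern
    return isinstance(event, dict) and event.get("level") == "ERROR"


def metric_filter_request(log_group):
    """Keyword arguments for put_metric_filter, mirroring the JSON above."""
    return {
        "logGroupName": log_group,
        "filterName": "error-count",  # hypothetical name
        "filterPattern": '{ $.level = "ERROR" }',
        "metricTransformations": [
            {
                "metricName": "ErrorCount",
                "metricNamespace": "TaskService",
                "metricValue": "1",   # each matching event adds 1
                "defaultValue": 0,    # reported when nothing matches
            }
        ],
    }


def create_metric_filter(log_group):
    """Create or update the metric filter on the given log group."""
    import boto3  # lazy import: the helpers above need no AWS access
    boto3.client("logs").put_metric_filter(**metric_filter_request(log_group))


if __name__ == "__main__":
    sample = [
        '{"level": "ERROR", "message": "db timeout"}',
        '{"level": "INFO", "message": "ok"}',
        "plain text line",
    ]
    print(sum(matches_error(line) for line in sample))
```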

Key Takeaways

  • Set log group retention on creation — unbounded retention is a common cost trap
  • Logs Insights runs ad hoc, pipe-delimited queries over JSON fields it discovers automatically, with no pre-aggregation
  • Metric filters extract counters from log patterns for alarms without querying full log data
  • The awslogs log driver, configured in the ECS task definition, routes container stdout/stderr to CloudWatch Logs

Tomorrow: OpenSearch, covering log indexing for full-text search and OpenSearch Dashboards (the Kibana fork).