← Week 3: Testing & Deployment

Day 20: Retrospective

Phase 7 · Oct 12, 2026


Agenda (2–3 hours)

  • Review (60 min): Walk through each phase of the course; for each, identify one concept you now understand deeply and one you would revisit
  • Write (60 min): Document the 5 most important lessons learned from building the integration project
  • Reflect (60 min): Compare your understanding on Day 1 vs today; what would you do differently on the next distributed system you build?

Course Summary

| Phase | Topics | Integration Project Usage |
|-------|--------|---------------------------|
| 1 | Consistency, Consensus, Data Structures | DynamoDB conditional updates (optimistic concurrency); consistent hashing concepts |
| 2 | Async Rust, Tokio, Tower, Axum | Worker loop, JoinSet, CancellationToken, Tower middleware |
| 3 | gRPC, Custom Protocols, Service Mesh | tonic gRPC API, mTLS, OTel trace propagation |
| 4 | Reliability, Transactions, Event Sourcing | Event store, idempotency key, DLQ saga, backoff/retry |
| 5 | SQS/SNS, DynamoDB, ECS | Core infrastructure of the project |
| 6 | Tracing, Metrics, Logs | Full OTel stack, Prometheus SLOs, structured JSON logs |
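
As a reminder of what the Phase 4 idempotency key does, here is a minimal sketch using only the standard library. The `Ledger` type, the `credit` method, and the in-memory `HashMap` are hypothetical stand-ins for illustration; they are not the integration project's actual event store.

```rust
use std::collections::HashMap;

/// Result of handling a write request.
#[derive(Debug, Clone, PartialEq)]
enum Outcome {
    /// First time this key was seen: the write was applied.
    Applied(u64),
    /// Duplicate key: the stored result is returned, nothing is re-applied.
    Replayed(u64),
}

/// Hypothetical in-memory stand-in for a durable store.
struct Ledger {
    balance: u64,
    /// Idempotency key -> balance recorded when the key was first applied.
    seen: HashMap<String, u64>,
}

impl Ledger {
    fn new() -> Self {
        Ledger { balance: 0, seen: HashMap::new() }
    }

    /// Credits `amount` at most once per idempotency key.
    fn credit(&mut self, key: &str, amount: u64) -> Outcome {
        if let Some(&result) = self.seen.get(key) {
            return Outcome::Replayed(result);
        }
        self.balance += amount;
        self.seen.insert(key.to_string(), self.balance);
        Outcome::Applied(self.balance)
    }
}

fn main() {
    let mut ledger = Ledger::new();
    println!("{:?}", ledger.credit("req-1", 50)); // Applied(50)
    println!("{:?}", ledger.credit("req-1", 50)); // Replayed(50): the retry is a no-op
    println!("{:?}", ledger.credit("req-2", 25)); // Applied(75)
}
```

In a real store, the key lookup and the write would have to happen atomically (for example through a conditional update); the sketch sidesteps this by being single-threaded.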

Key Lessons

  1. Design before code: the architecture review checklist (Day 7) caught real design gaps before implementation
  2. Idempotency is load-bearing: every write path must be safe to retry; this is non-negotiable at scale
  3. Observability is a first-class feature: trace_id in logs was the single highest-value debugging investment
  4. Chaos before confidence: the worker crash test revealed a state machine bug that would have caused silent data loss in production
  5. SLOs frame all trade-offs: every design decision can be evaluated against the error budget — is it worth 10% of this month's budget?
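
The error-budget question in lesson 5 comes down to simple arithmetic. The sketch below works it through with assumed numbers: a 99.9% availability SLO (1,000 failures allowed per million requests) and 10,000,000 requests per month. Both figures and the function names are illustrative, not taken from the project.

```rust
/// Failed requests the SLO tolerates in a window.
/// `allowed_failure_ppm` is the permitted failure rate in parts per million
/// (a 99.9% SLO allows 1_000 ppm).
fn error_budget(total_requests: u64, allowed_failure_ppm: u64) -> u64 {
    total_requests * allowed_failure_ppm / 1_000_000
}

/// Percentage of the budget consumed by `failed` requests.
fn budget_spent_percent(failed: u64, budget: u64) -> f64 {
    failed as f64 * 100.0 / budget as f64
}

fn main() {
    // Assumed traffic and SLO, for illustration only.
    let budget = error_budget(10_000_000, 1_000);
    // A change expected to fail about 1_000 requests.
    let cost = budget_spent_percent(1_000, budget);
    println!("budget: {budget} failed requests");
    println!("cost of change: {cost:.1}% of budget");
}
```

With these assumptions the monthly budget is 10,000 failed requests, so a change expected to fail about 1,000 requests costs 10% of it.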

What to Revisit

  • Raft: implement a simplified Raft log in Rust — understanding consensus unlocks everything else
  • Single-table DynamoDB: read Alex DeBrie's book cover-to-cover; the access-pattern-first discipline changes how you think about all data modeling
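  • OTel sampling: tail-based sampling is complex, but it is the only approach that can reliably keep every error or slow trace at production scale, because the keep-or-drop decision is made after the trace completes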
  • Tower middleware: build a custom Tower layer from scratch — the Service trait abstraction is powerful once it clicks
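
To make the tail-based sampling item concrete, here is a minimal sketch of the core idea: buffer spans per trace, then decide once the trace is finished. `Span`, `TailSampler`, and the 500 ms threshold are hypothetical names and values chosen for illustration; this is not the OpenTelemetry API.

```rust
use std::collections::HashMap;

/// Simplified span record (hypothetical, not an OTel type).
#[derive(Debug, Clone)]
struct Span {
    trace_id: u64,
    duration_ms: u64,
    is_error: bool,
}

/// Buffers spans per trace and decides only after the trace is complete.
struct TailSampler {
    pending: HashMap<u64, Vec<Span>>,
    slow_threshold_ms: u64,
}

impl TailSampler {
    fn new(slow_threshold_ms: u64) -> Self {
        TailSampler { pending: HashMap::new(), slow_threshold_ms }
    }

    /// Buffers a span until its trace is finished.
    fn record(&mut self, span: Span) {
        self.pending.entry(span.trace_id).or_default().push(span);
    }

    /// Returns the buffered spans if any span errored or was slow;
    /// otherwise drops the whole trace and returns None.
    fn finish(&mut self, trace_id: u64) -> Option<Vec<Span>> {
        let spans = self.pending.remove(&trace_id)?;
        let keep = spans
            .iter()
            .any(|s| s.is_error || s.duration_ms >= self.slow_threshold_ms);
        if keep { Some(spans) } else { None }
    }
}

fn main() {
    let mut sampler = TailSampler::new(500);
    sampler.record(Span { trace_id: 1, duration_ms: 12, is_error: false });
    sampler.record(Span { trace_id: 1, duration_ms: 30, is_error: true });
    sampler.record(Span { trace_id: 2, duration_ms: 8, is_error: false });

    println!("trace 1 kept: {}", sampler.finish(1).is_some()); // true: contains an error
    println!("trace 2 kept: {}", sampler.finish(2).is_some()); // false: fast and healthy
}
```

The complexity mentioned above comes from what the sketch leaves out: a production setup also needs a timeout for deciding when a trace is complete, bounded memory for the buffer, and a way to route all spans of one trace to the same sampler instance.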

Tomorrow: final challenge — the complete end-to-end system review.