← Week 3: Container Orchestration

Day 18: Service Discovery

Phase 5 · Aug 29, 2026

Agenda (2–3 hours)

  • Read (45 min): AWS Cloud Map documentation; ECS service discovery; Route 53 private hosted zones
  • Study (45 min): Compare DNS-based vs API-based service discovery. What are the trade-offs for cache TTL vs real-time registration?
  • Practice (45 min): Enable Cloud Map service discovery on two ECS services; verify that service A can resolve service B by name
  • Challenge (30 min): A service has 5 replicas; 2 are unhealthy. DNS returns all 5 IPs until TTL expires. Design a pattern to exclude unhealthy instances faster than the DNS TTL allows

Cloud Map Service Discovery

ECS Service → Cloud Map namespace (my-app.local)
                └── Service (api)
                    ├── Instance: 10.0.1.5:8080 (healthy)
                    ├── Instance: 10.0.1.6:8080 (healthy)
                    └── Instance: 10.0.1.7:8080 (unhealthy → deregistered)

ECS automatically registers/deregisters task IPs in Cloud Map when tasks start/stop.

DNS query: api.my-app.local → returns healthy instance IPs (A records with TTL 60s).
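
To check resolution from inside a task, a minimal Python sketch along these lines may help. It only asks the local resolver for A records; the function name resolve_ipv4 and the default port are illustrative assumptions, and the call does not expose the record TTL.

```python
import socket


def resolve_ipv4(hostname: str, port: int = 8080) -> list[str]:
    """Return the distinct IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (ip, port) tuple.
    return sorted({info[4][0] for info in infos})
```

Run from service A inside the VPC, resolve_ipv4("api.my-app.local") should list the task IPs currently registered for service B. Outside the VPC the name does not resolve and the call raises socket.gaierror.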

DNS vs API-Based Discovery

Approach                            Latency              Staleness                        Client complexity
DNS A records                       ~0ms (cached)        Up to TTL (60s)                  None (standard DNS)
Cloud Map API (DiscoverInstances)   ~10ms                Real-time (filtered by health)   SDK call required
Service mesh (Envoy)                ~0ms (local proxy)   xDS push (~1s)                   Transparent

For latency-sensitive paths: DNS with short TTL (10–30s) or service mesh sidecar.
For accuracy: DiscoverInstances API with health filter.
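
A health-filtered lookup could be sketched as follows. The helper healthy_endpoints is hypothetical; it expects a boto3 servicediscovery client to be passed in and assumes instances carry the AWS_INSTANCE_IPV4 and (optionally) AWS_INSTANCE_PORT attributes.

```python
from typing import Any


def healthy_endpoints(client: Any, namespace: str, service: str) -> list[str]:
    """Return "ip:port" (or bare "ip") strings for instances reported as healthy."""
    response = client.discover_instances(
        NamespaceName=namespace,
        ServiceName=service,
        HealthStatus="HEALTHY",  # server-side filter applied by Cloud Map
    )
    endpoints = []
    for instance in response.get("Instances", []):
        attrs = instance.get("Attributes", {})
        ip = attrs.get("AWS_INSTANCE_IPV4")
        port = attrs.get("AWS_INSTANCE_PORT")
        if not ip:
            continue  # skip instances registered without an IPv4 attribute
        endpoints.append(f"{ip}:{port}" if port else ip)
    return endpoints
```

With boto3 installed and credentials configured, usage would look like healthy_endpoints(boto3.client("servicediscovery"), "my-app.local", "api"). Each call is a network round trip, so callers on hot paths may want to cache the result for a second or two.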

Route 53 Private Hosted Zones

Internal service names without Cloud Map:

Route 53 Private Zone: internal.mycompany.com (associated with VPC)
├── api.internal.mycompany.com → ALB DNS name
└── cache.internal.mycompany.com → ElastiCache endpoint

Trade-off vs Cloud Map:

  • Private zones require manual registration (or automation)
  • Cloud Map integrates natively with ECS: tasks are registered and health-tracked automatically (on EKS this typically needs an additional controller)
  • Private zones support complex routing policies (weighted, failover)
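
Automating private-zone registration with a weighted policy might look like the sketch below. It only builds the ChangeBatch payload for Route 53's ChangeResourceRecordSets call; the function name weighted_cname_change and the target hostnames in the usage note are hypothetical, and CNAME records are used for simplicity (alias records are a common alternative for ALB targets).

```python
def weighted_cname_change(name: str, targets: dict[str, int], ttl: int = 60) -> dict:
    """Build a ChangeBatch that UPSERTs one weighted CNAME record per target."""
    changes = []
    for target, weight in sorted(targets.items()):
        changes.append({
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": name,
                "Type": "CNAME",
                # SetIdentifier must be unique among records sharing Name and Type.
                "SetIdentifier": target,
                # Weight is 0-255; traffic share is weight / sum of all weights.
                "Weight": weight,
                "TTL": ttl,
                "ResourceRecords": [{"Value": target}],
            },
        })
    return {"Comment": "weighted internal record", "Changes": changes}
```

The result would be passed as ChangeBatch to boto3.client("route53").change_resource_record_sets, together with the HostedZoneId of the private zone. For example, weighted_cname_change("api.internal.mycompany.com", {"blue.example.internal": 90, "green.example.internal": 10}) describes a 90/10 split between two hypothetical targets.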

Key Takeaways

  • Cloud Map integrates with ECS to auto-register task IPs; DNS TTL controls staleness
  • DiscoverInstances API bypasses DNS cache for real-time, health-filtered discovery
  • Service mesh sidecars combine both: near-zero lookup latency through a local proxy, plus fast convergence via xDS push (~1s)
  • A short DNS TTL (10–30s) shrinks the window during which clients can still resolve a failed instance

Tomorrow: load balancing with ALB and NLB — target groups, health checks, and routing rules.