← Week 3: Distributed Data Structures

Day 20: Distributed Hash Tables

Phase 1 · Jun 8, 2026

← Week 3: Distributed Data Structures

Agenda (2–3 hours)

  • Read (45 min): Stoica et al. "Chord: A Scalable Peer-to-Peer Lookup Service" (SIGCOMM 2001) §1–4; Maymounkov & Mazières "Kademlia" (2002) §1–3
  • Study (45 min): Trace a Chord lookup; compare finger table routing to Kademlia's XOR metric
  • Practice (45 min): Simulate 8 Chord nodes (power-of-2 IDs); build finger tables and trace a lookup path
  • Challenge (30 min): How does BitTorrent use Kademlia? How does IPFS build on Kademlia for content routing?
← Week 3: Distributed Data Structures

The DHT Problem

We have N nodes and K key-value pairs. Given a key, how do we find the node that stores its value — without a central directory?

Requirements:

  • Decentralized: no single point of failure
  • Scalable: O(log n) per-node state and O(log n) lookup hops
  • Resilient: handles node churn (join/leave)
← Week 3: Distributed Data Structures

Chord

Maps both nodes and keys to a circular ID space (SHA-1 mod 2^160).

Routing: each node maintains:

  • Successor: next node clockwise on the ring
  • Finger table: m entries where finger[i] = successor of (node + 2^i) mod 2^m

Lookup: route toward the key using finger table jumps. Converges in O(log n) hops.

Consistent hashing (from Phase 1 Week 3 Day 16): Chord is the original consistent-hashing paper's routing layer. Each key is stored at its successor.

← Week 3: Distributed Data Structures

Kademlia

Uses XOR as the distance metric between node IDs.

Key properties:

  • XOR is symmetric: distance(A,B) = distance(B,A)
  • Routing table: k-buckets organized by XOR distance; bucket i contains nodes with XOR distance in [2^i, 2^(i+1))
  • Lookup: query k closest nodes iteratively; each step gets closer (XOR monotonicity)

Why XOR?:

  • Symmetric routing: node A and node B have the same picture of their neighborhood
  • Each step reduces XOR distance; convergence is guaranteed
← Week 3: Distributed Data Structures

Applications

System DHT Use case
BitTorrent Kademlia (mainline DHT) Peer discovery without central tracker
IPFS Kademlia-based Content routing: find who has a CID
Ethereum Kademlia (discv4/discv5) Peer discovery for the devp2p network
Amazon Dynamo Consistent hashing + finger-table variant Key-to-node routing for internal cluster
Cassandra Consistent hashing Partition key → replica set
← Week 3: Distributed Data Structures

Key Takeaways

  • DHTs provide decentralized, O(log n) key lookup without a central directory
  • Chord: ring + finger tables; simple and well-analyzed
  • Kademlia: XOR metric enables symmetric routing and parallel lookups; dominant in practice
  • DHTs are the foundation of peer-to-peer file sharing, content-addressed storage (IPFS), and blockchain peer discovery

Tomorrow: Phase 1 Challenge — implement a consistent hash ring in Rust.