Why Distributed Locks Are Hard
A distributed lock protects a resource from concurrent modification across multiple processes.
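Problem: a distributed lock is usually a lease with a timeout, and the lock holder can be paused (GC pause, swapping under memory pressure, VM migration) after acquiring the lock but before the protected operation completes. If the pause outlasts the lease, the lock expires while the holder still believes it owns it.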
Process A: acquires lock
Process A: [pauses for 30 seconds — GC pause]
Process B: lock expires; acquires lock; modifies resource
Process A: [resumes]; modifies resource — DATA CORRUPTION
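Solution: fencing tokens. Each lock grant comes with a monotonically increasing number, and the client sends that number with every operation on the protected resource. The resource remembers the highest token it has seen and rejects any operation carrying a lower (stale) token, so a holder that resumes after its lock expired cannot overwrite newer data.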
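
The timeline above can be replayed with fencing in place. The following is a minimal in-memory sketch, not a real lock service: LockService, FencedResource, StaleTokenError, and replay_timeline are hypothetical names introduced only for illustration, and lease expiry is simulated by letting the second acquire succeed.

```python
import itertools


class StaleTokenError(Exception):
    """Raised when an operation carries a token older than one already seen."""


class LockService:
    """Hypothetical lock service that issues a fencing token with each grant."""

    def __init__(self):
        self._tokens = itertools.count(1)  # monotonically increasing counter
        self.holder = None

    def acquire(self, client):
        # Assumption: every call succeeds, as if the previous lease had expired.
        self.holder = client
        return next(self._tokens)


class FencedResource:
    """Hypothetical resource that tracks the highest token it has accepted."""

    def __init__(self):
        self.value = None
        self.highest_token = 0

    def write(self, token, value):
        # Reject writes from a holder whose grant has been superseded.
        if token < self.highest_token:
            raise StaleTokenError(f"token {token} < {self.highest_token}")
        self.highest_token = token
        self.value = value


def replay_timeline():
    locks = LockService()
    resource = FencedResource()

    token_a = locks.acquire("A")  # Process A acquires the lock
    # Process A pauses here; its lease expires in the meantime.
    token_b = locks.acquire("B")  # Process B acquires the lock
    resource.write(token_b, "written by B")

    # Process A resumes and tries to write with its old token.
    try:
        resource.write(token_a, "written by A")
        rejected = False
    except StaleTokenError:
        rejected = True

    return token_a, token_b, rejected, resource.value


if __name__ == "__main__":
    print(replay_timeline())
```

The check lives in the resource rather than in the client because a paused client cannot know its lock has expired. The comparison is a strict less-than so that the current holder can issue several writes under the same token.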