Read Path and Bloom Filters
Reading from an LSM tree is more expensive than writing, because a lookup may have to consult several structures, newest first:
- Check MemTable
- Check L0 SSTables (may need to check all)
- Check L1, L2, ... (only one file per level due to non-overlapping invariant)
Without optimization, a lookup for a missing key costs one disk read per L0 file plus one per deeper level, i.e. O(levels) disk reads per miss.
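The lookup order above can be sketched as follows. This is a minimal illustration, not any particular engine's implementation: SSTable, lsm_get, and the reads counter are hypothetical names, the tables live in memory, and deletions (tombstones) are ignored.

```python
from bisect import bisect_right


class SSTable:
    """Hypothetical in-memory stand-in for an on-disk sorted file."""

    def __init__(self, items):
        self.items = dict(items)
        self.min_key = min(self.items)
        self.max_key = max(self.items)
        self.reads = 0  # counts simulated disk reads

    def get(self, key):
        self.reads += 1
        return self.items.get(key)


def lsm_get(key, memtable, l0, levels):
    """Look up key: MemTable first, then L0 (newest first), then L1, L2, ..."""
    # 1. MemTable: in memory, no disk read.
    if key in memtable:
        return memtable[key]

    # 2. L0: key ranges may overlap, so every file may need checking.
    for table in l0:
        value = table.get(key)
        if value is not None:
            return value

    # 3. Deeper levels: files are sorted and non-overlapping, so a binary
    #    search on the minimum keys finds the single candidate file.
    for level in levels:
        min_keys = [t.min_key for t in level]
        i = bisect_right(min_keys, key) - 1
        if i >= 0 and key <= level[i].max_key:
            value = level[i].get(key)
            if value is not None:
                return value

    return None
```

Here l0 is a list of tables ordered newest first, and levels is a list of levels, each a list of tables sorted by min_key. A missing key that falls inside some file's range at every level triggers one simulated read per L0 file and one per deeper level.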
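Bloom filters: each SSTable has a Bloom filter. Before reading an SSTable block, check the filter. "Not present" → skip entirely; a Bloom filter never gives a false negative, so this is safe. "Maybe present" → read the block, accepting an occasional false positive. Reduces the average point lookup to ~1 disk I/O for most workloads.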
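A minimal Bloom filter sketch, to make the "not present" check concrete. The class name, the default sizes, and the use of SHA-256 with double hashing are assumptions chosen for illustration; real engines typically use faster non-cryptographic hashes and size the filter per key count.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter over string keys (hypothetical parameters)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive all bit positions from two 64-bit values
        # taken from a single digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd, never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means definitely absent; True means "maybe present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )
```

A reader would call might_contain(key) before touching an SSTable and skip the disk read whenever it returns False. A True result still requires the read, since it may be a false positive.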