← Week 2: ML-KEM and ML-DSA

Day 8: ML-KEM Conceptual Walkthrough

Phase 3 · July 15, 2026


Agenda (2–3 hours)

  • Read (75 min): NIST FIPS 203 §4–5 (auxiliary algorithms and the K-PKE component scheme), plus §8 (parameter sets)
  • Study (45 min): Kyber's three-step construction — key generation, encapsulation, decapsulation
  • Challenge (30 min): Trace through ML-KEM at a concrete level

ML-KEM in One Sentence

ML-KEM (formerly CRYSTALS-Kyber) is a Key Encapsulation Mechanism based on
the hardness of the Module Learning With Errors (MLWE) problem.

It provides:

  • IND-CCA2 security: an adversary who can adaptively query a decapsulation oracle
    cannot distinguish the real shared secret from a random value. This is the
    standard security target for KEMs.
  • Forward secrecy when ephemeral: if the key pair is freshly generated per session
    and erased afterward, a later compromise of long-term keys (for example, the
    server's authentication key) does not reveal past session secrets.
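
For reference, the MLWE assumption behind this claim can be stated informally as below. The notation (R_q, the rank k, and the small-coefficient distribution χ) is introduced here for illustration and simplifies the formal definition.

```latex
R_q = \mathbb{Z}_q[X]/(X^n + 1), \qquad
A \leftarrow R_q^{k \times k} \ \text{(uniform)}, \qquad
s, e \leftarrow \chi^{k} \ \text{(small coefficients)}

\text{Decision MLWE: } (A,\ t = A s + e)
\ \text{is computationally indistinguishable from}\ (A,\ u),
\quad u \leftarrow R_q^{k} \ \text{(uniform)}.
```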

High-Level Construction

ML-KEM is built from an IND-CPA-secure PKE (Public Key Encryption) scheme,
transformed into an IND-CCA2-secure KEM via the Fujisaki-Okamoto (FO) transform.

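[KeyGen]
  A ← uniform k×k matrix over R_q = Z_q[X]/(X^n+1), n = 256   (public, expanded from A_seed)
  s, e ← vectors of k polynomials with small coefficients       (secret, error)
  t = A·s + e                                                   (public key component)
  pk = (A_seed, t),  sk = (s, pk, pk_hash, z)                   (z: random rejection secret)

[Encaps(pk)]
  message ← 32 fresh random bytes
  r, e1, e2 ← small polynomials derived from (message, pk_hash)
  u = Aᵀ·r + e1
  v = tᵀ·r + e2 + encode(message)
  ciphertext = (u, v)
  shared_secret = H(message, pk_hash)

[Decaps(sk, ciphertext)]
  recover message' = decode(v - sᵀ·u)
  re-encrypt message' under pk and compare with ciphertext
  shared_secret = H(message', pk_hash) on match, else the implicit-rejection value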
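
The algebra in the construction above can be tried out with a toy version of the underlying PKE. The sketch below is a minimal illustration, not ML-KEM: it assumes a ring degree of 16 instead of 256, samples small coefficients uniformly from [-2, 2] instead of using a centered binomial distribution, uses Python's non-cryptographic `random` module, and omits the NTT, seed expansion, and ciphertext compression. All function names are hypothetical.

```python
import random

Q = 3329   # ML-KEM modulus
N = 16     # toy ring degree (ML-KEM uses 256)
K = 2      # module rank
ETA = 2    # bound on "small" coefficients

def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def poly_sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]

def poly_mul(a, b):
    """Schoolbook product in Z_Q[X]/(X^N + 1): X^N wraps around to -1."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] = (out[i + j] + ai * bj) % Q
            else:
                out[i + j - N] = (out[i + j - N] - ai * bj) % Q
    return out

def dot(u, v):
    """Inner product of two vectors of polynomials."""
    acc = [0] * N
    for a, b in zip(u, v):
        acc = poly_add(acc, poly_mul(a, b))
    return acc

def small_poly(rng):
    return [rng.randint(-ETA, ETA) % Q for _ in range(N)]

def uniform_poly(rng):
    return [rng.randrange(Q) for _ in range(N)]

def keygen(rng):
    A = [[uniform_poly(rng) for _ in range(K)] for _ in range(K)]
    s = [small_poly(rng) for _ in range(K)]
    e = [small_poly(rng) for _ in range(K)]
    t = [poly_add(dot(A[i], s), e[i]) for i in range(K)]   # t = A·s + e
    return (A, t), s

def encrypt(pk, bits, rng):
    A, t = pk
    r = [small_poly(rng) for _ in range(K)]
    e1 = [small_poly(rng) for _ in range(K)]
    e2 = small_poly(rng)
    # u = Aᵀ·r + e1: entry j uses column j of A
    u = [poly_add(dot([A[i][j] for i in range(K)], r), e1[j]) for j in range(K)]
    encoded = [b * ((Q + 1) // 2) for b in bits]           # bit 1 -> about q/2
    v = poly_add(poly_add(dot(t, r), e2), encoded)         # v = tᵀ·r + e2 + encode(m)
    return u, v

def decrypt(s, ciphertext):
    u, v = ciphertext
    # v - sᵀ·u = encode(m) + (eᵀ·r + e2 - sᵀ·e1); the bracket is small noise
    noisy = poly_sub(v, dot(s, u))
    return [1 if Q // 4 < c < 3 * Q // 4 else 0 for c in noisy]

rng = random.Random(2026)
pk, sk = keygen(rng)
message_bits = [rng.randint(0, 1) for _ in range(N)]
recovered = decrypt(sk, encrypt(pk, message_bits, rng))
print(recovered == message_bits)
```

With these toy parameters the noise term is bounded by 2·K·N·ETA² + ETA = 258, well below q/4 ≈ 832, so decoding always succeeds. In ML-KEM itself the bound is probabilistic and the failure rate is negligible rather than zero.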

Why the Re-Encapsulation Check?

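The FO transform adds a crucial step: a re-encryption check with implicit rejection.

After decrypting to recover the message, Decaps re-encrypts that message, using
randomness re-derived from the message itself, and verifies that the result matches
the received ciphertext. If it does not, the ciphertext was malformed (possibly part
of a chosen-ciphertext attack). In that case Decaps does not return an error; it
outputs a pseudorandom value derived from a secret and the ciphertext instead.
FIPS 203 computes this as J(z ‖ ciphertext), where z is a random 32-byte value
generated at KeyGen and stored in the private key.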

This prevents chosen-ciphertext attacks. An attacker who submits a modified
ciphertext gets a useless random value, learning nothing about the secret key.
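
The control flow of the check can be sketched as follows. The hash roles follow FIPS 203 (H = SHA3-256, G = SHA3-512, J = SHAKE256), but `toy_encrypt` and `toy_decrypt` are hypothetical stand-ins with no security at all (anyone holding pk can decrypt). They exist only so the wrapper runs; in ML-KEM their place is taken by the lattice-based PKE.

```python
import hashlib
import hmac
import os

def H(data):
    return hashlib.sha3_256(data).digest()

def G(data):
    digest = hashlib.sha3_512(data).digest()
    return digest[:32], digest[32:]          # (shared key, encryption coins)

def J(data):
    return hashlib.shake_256(data).digest(32)

# Stand-in PKE: deterministic in (pk, m, coins), which is all the FO
# wrapper needs for re-encryption. NOT secure, illustration only.
def toy_encrypt(pk, m, coins):
    pad = hashlib.shake_256(pk + coins).digest(len(m))
    return coins + bytes(a ^ b for a, b in zip(m, pad))

def toy_decrypt(pk, c):
    coins, body = c[:32], c[32:]
    pad = hashlib.shake_256(pk + coins).digest(len(body))
    return bytes(a ^ b for a, b in zip(body, pad))

def kem_keygen():
    pk = os.urandom(32)
    z = os.urandom(32)                       # implicit-rejection secret
    return pk, (pk, H(pk), z)

def encaps(pk):
    m = os.urandom(32)
    key, coins = G(m + H(pk))                # coins are derived, not fresh
    return key, toy_encrypt(pk, m, coins)

def decaps(sk, c):
    pk, pk_hash, z = sk
    m2 = toy_decrypt(pk, c)
    key2, coins2 = G(m2 + pk_hash)
    c2 = toy_encrypt(pk, m2, coins2)         # re-encrypt the recovered message
    reject_key = J(z + c)                    # computed on every call
    # Real implementations select between the two keys in constant time.
    return key2 if hmac.compare_digest(c, c2) else reject_key

pk, sk = kem_keygen()
key, ct = encaps(pk)
tampered = bytes([ct[0] ^ 1]) + ct[1:]
print(decaps(sk, ct) == key)        # honest ciphertext: same key
print(decaps(sk, tampered) == key)  # modified ciphertext: unrelated key
```

Because the rejection value depends on the secret z, a caller cannot tell from the output alone whether a ciphertext was accepted or rejected.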


ML-KEM Parameters (FIPS 203)

                 ML-KEM-512   ML-KEM-768   ML-KEM-1024
Module rank k    2            3            4
Modulus q        3329         3329         3329
Public key       800 bytes    1184 bytes   1568 bytes
Private key      1632 bytes   2400 bytes   3168 bytes
Ciphertext       768 bytes    1088 bytes   1568 bytes
Shared secret    32 bytes     32 bytes     32 bytes

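Compare with X25519, where each side sends a 32-byte public key and the shared secret is 32 bytes.
The ML-KEM-768 public key is ~37× larger (1184 bytes), and its ciphertext, which takes the place of
the peer's X25519 public key, is ~34× larger (1088 bytes). The exchange happens only once per TLS
handshake, so the extra bytes are not paid per record.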
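
The ratios can be checked with a few lines of arithmetic. This sketch assumes ML-KEM-768 replaces X25519 outright; in a hybrid handshake both sets of bytes would be sent.

```python
# Sizes in bytes, taken from the parameter table above.
X25519_SHARE = 32
MLKEM768_PUBLIC_KEY = 1184
MLKEM768_CIPHERTEXT = 1088

pk_ratio = MLKEM768_PUBLIC_KEY / X25519_SHARE
ct_ratio = MLKEM768_CIPHERTEXT / X25519_SHARE
# One public key and one ciphertext versus two X25519 key shares.
extra_bytes = (MLKEM768_PUBLIC_KEY + MLKEM768_CIPHERTEXT) - 2 * X25519_SHARE

print(f"public key: {pk_ratio:.0f}x, ciphertext: {ct_ratio:.0f}x")
print(f"extra key-exchange bytes per handshake: {extra_bytes}")
# prints: public key: 37x, ciphertext: 34x
# prints: extra key-exchange bytes per handshake: 2208
```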


Performance

On modern hardware (AWS Graviton3):

  • ML-KEM-768 KeyGen: ~25 µs
  • ML-KEM-768 Encaps: ~28 µs
  • ML-KEM-768 Decaps: ~32 µs
  • X25519 KeyGen+Exchange: ~20 µs total

Performance difference: negligible for server workloads.
Network latency dominates TLS handshake time far more than KEM computation.
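
To put the figures above in proportion, the sketch below compares the combined ML-KEM-768 cost with a single network round trip. The 20 ms round-trip time is an assumption chosen for illustration, not a measurement.

```python
# Timings in microseconds, as quoted above for ML-KEM-768.
KEYGEN_US, ENCAPS_US, DECAPS_US = 25, 28, 32
ASSUMED_RTT_MS = 20.0   # hypothetical round-trip time, for illustration only

# Total across both parties: one side runs KeyGen and Decaps, the other Encaps.
kem_total_us = KEYGEN_US + ENCAPS_US + DECAPS_US
share_of_rtt = kem_total_us / (ASSUMED_RTT_MS * 1000)

print(f"KEM work per handshake: {kem_total_us} µs")
print(f"share of one {ASSUMED_RTT_MS:.0f} ms round trip: {share_of_rtt:.2%}")
```

Under this assumption the KEM computation is well under 1% of a single round trip.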


Challenge Assignment

Trace through ML-KEM-768 key generation and encapsulation conceptually:

  1. Explain what A (the matrix) represents and why it can be public
  2. Why must s (the secret) have small coefficients? What breaks if its coefficients are uniformly random in Z_q?
  3. Why does Encaps generate its own randomness r rather than using the sender's key?
  4. Why is the shared secret H(message, pk_hash) rather than just message?
  5. What does IND-CCA2 mean for a provisioning service? When would an adversary
    be able to run a chosen-ciphertext attack against your service?

Resources

  • NIST FIPS 203: §4 (auxiliary algorithms), §5 (K-PKE component scheme), §8 (parameter sets); focus on §5.1–5.3 (K-PKE key generation, encryption, decryption)
  • CRYSTALS-Kyber submission paper: pq-crystals.org/kyber — more mathematical detail
  • "Understanding Kyber": blog.cloudflare.com — accessible technical walkthrough