← Week 1: Protocol Buffers & gRPC

Day 1: Protocol Buffers — Encoding and Design

Phase 3 · Jul 1, 2026

Agenda (2–3 hours)

  • Read (45 min): Protocol Buffers language guide (proto3); Google's encoding documentation
  • Study (45 min): Manually encode and decode a simple message; understand varint, length-delimited, and fixed-width wire types
  • Practice (45 min): Write a .proto file with nested messages, enums, oneof, and repeated fields; use protoc to inspect the wire format
  • Challenge (30 min): Compare JSON vs protobuf for a 1,000-record batch: measure serialization time and bytes on the wire

Why Protocol Buffers?

Property           JSON                      Protobuf
Human-readable     Yes                       No (binary)
Schema             Optional (JSON Schema)    Required (.proto)
Size               ~3-10× larger             Compact
Parse speed        Slow                      ~3-10× faster
Schema evolution   Manual versioning         Built-in field numbering
Cross-language     Universal                 Generated stubs for 10+ languages

Protobuf is the default serialization format for gRPC and is used pervasively at Google and at many other companies that run large service fleets.

proto3 Syntax

syntax = "proto3";
package myapp.v1;

message User {
  uint64 id = 1;
  string name = 2;
  string email = 3;
  repeated string roles = 4;
  UserStatus status = 5;
  oneof contact {
    string phone = 6;
    string slack_id = 7;
  }
}

enum UserStatus {
  USER_STATUS_UNSPECIFIED = 0; // proto3: the first enum value must be 0
  USER_STATUS_ACTIVE = 1;
  USER_STATUS_SUSPENDED = 2;
}

Wire Encoding

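Each field on the wire is a tag followed by its value. The tag is (field_number << 3) | wire_type, itself encoded as a varint: the low 3 bits carry the wire type and the remaining bits carry the field number.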

Wire types:

  • 0: Varint (int32, int64, uint32, uint64, sint32, sint64, bool, enum)
  • 1: 64-bit fixed (fixed64, sfixed64, double)
  • 2: Length-delimited (string, bytes, nested message, packed repeated fields)
  • 3, 4: Group start/end (deprecated)
  • 5: 32-bit fixed (fixed32, sfixed32, float)
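
To check the tag formula against the wire types listed above, here is a minimal Python sketch that computes tags for a few fields of the User message. The helper name make_tag and the constant names are hypothetical, chosen only for this illustration. For field numbers 1 through 15 the tag fits in a single byte.

```python
# Wire type numbers as listed above (constant names are illustrative).
VARINT, FIXED64, LEN, FIXED32 = 0, 1, 2, 5

def make_tag(field_number: int, wire_type: int) -> int:
    """Combine a field number and a wire type into a tag value."""
    return (field_number << 3) | wire_type

# Fields from the User message: id = 1 (uint64), name = 2 (string),
# status = 5 (enum).
print(hex(make_tag(1, VARINT)))  # 0x8
print(hex(make_tag(2, LEN)))     # 0x12
print(hex(make_tag(5, VARINT)))  # 0x28
```

Field number 16 already produces a tag of 128, which needs two bytes as a varint. That is one reason to give the most frequently set fields the lowest numbers.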

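Varint encoding: each byte carries 7 bits of payload, least-significant group first; MSB = 1 means more bytes follow. Small non-negative integers take a single byte, but a negative int32 or int64 always takes 10 bytes, so prefer sint32/sint64 (ZigZag encoding) for values that are often negative.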
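
The varint rule can be sketched in a few lines of Python. This is an illustration for unsigned values only, not the code any protobuf library uses; the function names encode_varint and decode_varint are assumptions made for this example.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("this sketch handles unsigned values only")
    out = bytearray()
    while n > 0x7F:
        out.append((n & 0x7F) | 0x80)  # low 7 bits, MSB set: more bytes follow
        n >>= 7
    out.append(n)                      # final byte, MSB clear
    return bytes(out)

def decode_varint(buf: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7

print(encode_varint(1).hex())                # 01
print(encode_varint(150).hex())              # 9601
print(decode_varint(bytes.fromhex("9601")))  # (150, 2)
```

The value 150 is 10010110 in binary. The low seven bits (0010110) go out first with the continuation bit set, giving 0x96, and the remaining bit follows as 0x01.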

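Default values are not encoded: for fields without explicit presence, proto3 omits any value equal to the default (0, empty string, false, the zero enum value) from the wire format, which keeps messages compact. Forward compatibility is a separate property: because every tag carries its wire type, old code can work out the length of a field it does not recognize and skip it.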
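
To see what omitting defaults looks like in bytes, the following Python sketch hand-encodes only the id and name fields of the User message. The function encode_user is a hypothetical helper written for this page, and it mimics the proto3 rule rather than calling a real protobuf library.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while n > 0x7F:
        out.append((n & 0x7F) | 0x80)
        n >>= 7
    out.append(n)
    return bytes(out)

def encode_user(user_id: int = 0, name: str = "") -> bytes:
    """Hand-encode field 1 (uint64 id) and field 2 (string name) of User."""
    out = bytearray()
    if user_id != 0:              # default 0 is left off the wire
        out += b"\x08"            # tag: field 1, wire type 0 (varint)
        out += encode_varint(user_id)
    if name != "":                # default "" is left off the wire
        data = name.encode("utf-8")
        out += b"\x12"            # tag: field 2, wire type 2 (length-delimited)
        out += encode_varint(len(data)) + data
    return bytes(out)

print(encode_user(user_id=150, name="Ann").hex())  # 0896011203416e6e
print(encode_user().hex() == "")                   # True: all defaults, zero bytes
print(encode_user(user_id=0, name="Ann") == encode_user(name="Ann"))  # True
```

The last line shows the ambiguity this creates: a receiver cannot tell an id explicitly set to 0 from an id that was never set.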

Field Numbering and Evolution

Rules for backward-compatible evolution:

  1. Never reuse field numbers — the number is the identity, not the name
  2. New fields — old parsers skip them; new parsers see default value if absent
  3. Reserved — mark removed field numbers as reserved 3, 7; to prevent reuse
  4. Never change field types incompatibly (e.g., int32 → int64 is OK because both are varints; int32 → string is not)
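
Rule 2 above can be illustrated with a small Python sketch of a reader that keeps only the field numbers it knows and skips the rest by wire type. The function read_known and the extra field number 8 are hypothetical; a real generated parser also preserves unknown fields and handles nested messages, which this sketch does not attempt.

```python
def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7

def read_known(buf: bytes, known: set[int]) -> dict[int, object]:
    """Walk top-level fields, keeping only field numbers listed in known."""
    fields = {}
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x7
        if wire_type == 0:    # varint
            value, pos = decode_varint(buf, pos)
        elif wire_type == 1:  # 64-bit fixed
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited
            length, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # 32-bit fixed
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if number in known:   # unknown numbers were consumed but not stored
            fields[number] = value
    return fields

# id = 150, name = "Ann", plus a hypothetical newer string field 8 = "x".
message = bytes.fromhex("089601" "1203416e6e" "420178")
print(read_known(message, {1, 2}))     # {1: 150, 2: b'Ann'}
print(read_known(message, {1, 2, 8}))  # {1: 150, 2: b'Ann', 8: b'x'}
```

The older reader never needs the schema for field 8. The wire type in the tag tells it how many bytes to step over.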

Key Takeaways

  • Protobuf wire format is compact binary: varints for integers, length-prefix for strings/nested
  • Field numbers (not names) are the identity: renaming a field is safe for the binary wire format (though it changes generated code and the JSON mapping); changing a number is not
  • Zero/default values are not written to the wire — be careful with "field not set" vs "field is zero"
  • proto3 drops required; scalar fields carry no presence information unless declared optional; use google.protobuf.FieldMask for partial updates
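
The idea behind a field mask can be sketched with plain Python dictionaries. This is not the google.protobuf.FieldMask API; apply_update, DEFAULTS, and the sample values are hypothetical stand-ins that show why naming the fields to change removes the "not set" versus "default" ambiguity.

```python
# Defaults for three User fields, standing in for proto3 default values.
DEFAULTS = {"id": 0, "name": "", "email": ""}

def apply_update(stored: dict, patch: dict, mask_paths: list[str]) -> dict:
    """Copy only the fields named in mask_paths from patch onto stored."""
    updated = dict(stored)
    for path in mask_paths:
        if path not in DEFAULTS:
            raise KeyError(f"unknown field in mask: {path}")
        # A masked field missing from the patch means "reset to default".
        updated[path] = patch.get(path, DEFAULTS[path])
    return updated

stored = {"id": 7, "name": "Ann", "email": "ann@example.com"}
patch = {"name": ""}  # on the wire, an empty name looks the same as no name

print(apply_update(stored, patch, ["name"]))
# {'id': 7, 'name': '', 'email': 'ann@example.com'}
```

Because the mask lists name, the empty string is applied as a deliberate clear, while email is left alone even though the patch does not carry it.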

Tomorrow: using protobuf in Rust with prost and build.rs.