← Week 1: Protocol Buffers & gRPC

Day 6: gRPC Error Handling — Status, Deadlines, and Metadata

Phase 3 · Jul 6, 2026

Agenda (2–3 hours)

  • Read (45 min): gRPC status codes documentation; google.rpc.Status proto; gRPC deadline propagation spec
  • Study (45 min): Map each gRPC status code to when you would use it; understand how deadlines propagate across services
  • Practice (45 min): Implement a service that propagates deadlines downstream; test what happens when the deadline expires mid-call
  • Challenge (30 min): Design error handling for a three-service call chain: A → B → C. What status code should B return when C returns UNAVAILABLE?

gRPC Status Codes

Code  Name                When to use
0     OK                  Success
1     CANCELLED           Client cancelled the RPC
2     UNKNOWN             Catch-all when no more specific code applies
3     INVALID_ARGUMENT    Bad request data (client bug)
4     DEADLINE_EXCEEDED   Deadline expired before the call completed
5     NOT_FOUND           Resource doesn't exist
7     PERMISSION_DENIED   Authenticated but not allowed
8     RESOURCE_EXHAUSTED  Rate limited, quota exceeded
14    UNAVAILABLE         Transient; safe to retry with backoff
16    UNAUTHENTICATED     Missing or invalid credentials
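
To make the "when to use" column concrete, here is a minimal sketch that maps application failures onto the codes in the table. AppError and its variants are hypothetical, and the Code enum is a local stand-in for the one a gRPC library such as tonic provides; only the numeric values come from the table.

```rust
/// Local stand-in for the status codes listed in the table (wire values).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
pub enum Code {
    Ok = 0,
    Cancelled = 1,
    Unknown = 2,
    InvalidArgument = 3,
    DeadlineExceeded = 4,
    NotFound = 5,
    PermissionDenied = 7,
    ResourceExhausted = 8,
    Unavailable = 14,
    Unauthenticated = 16,
}

/// Hypothetical failures of an illustrative user service.
#[derive(Debug)]
pub enum AppError {
    MalformedEmail,
    UserMissing,
    TokenMissing,
    NotOwner,
    QuotaExceeded,
    BackendDown,
    Other(String),
}

/// Pick the most specific code; fall back to UNKNOWN only when nothing fits.
pub fn status_code_for(err: &AppError) -> Code {
    match err {
        AppError::MalformedEmail => Code::InvalidArgument,
        AppError::UserMissing => Code::NotFound,
        AppError::TokenMissing => Code::Unauthenticated,
        AppError::NotOwner => Code::PermissionDenied,
        AppError::QuotaExceeded => Code::ResourceExhausted,
        AppError::BackendDown => Code::Unavailable,
        AppError::Other(_) => Code::Unknown,
    }
}
```

Keeping the mapping in one function makes it easy to review against the API contract and keeps handlers free of ad hoc code choices.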

Rich Error Details

// google/rpc/error_details.proto
message BadRequest {
  message FieldViolation {
    string field = 1;
    string description = 2;
  }
  repeated FieldViolation field_violations = 1;
}

message RetryInfo {
  google.protobuf.Duration retry_delay = 1;
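}

In Rust, the tonic-types crate attaches these messages to a Status:

use std::time::Duration;
use tonic::{Code, Status};
use tonic_types::{ErrorDetails, StatusExt};

// The builder methods take `&mut self`, so bind the value before chaining.
let mut details = ErrorDetails::new();
details
    .add_bad_request_violation("email", "must be a valid email address")
    // RetryInfo normally accompanies retryable codes; shown here for the API only.
    .set_retry_info(Some(Duration::from_secs(5)));

return Err(Status::with_error_details(Code::InvalidArgument, "validation failed", details));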

Deadline Propagation

Deadlines should propagate down the call chain:

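use std::time::Instant;
use tonic::{Request, Response, Status};

// Reading the deadline from an incoming request.
// Assumption: tonic's Request has no deadline() accessor, so the client's
// timeout is read from the grpc-timeout header (for example "500m" = 500 ms).
async fn handle(request: Request<SomeReq>) -> Result<Response<SomeRes>, Status> {
    let started = Instant::now();

    // Extract the timeout; parse_grpc_timeout is a hypothetical helper that
    // turns "<digits><unit>" into a std::time::Duration.
    let timeout = request
        .metadata()
        .get("grpc-timeout")
        .and_then(|v| v.to_str().ok())
        .and_then(parse_grpc_timeout);

    // Pass the time that is still left to downstream calls
    let mut outgoing = Request::new(DownstreamReq { ... });
    if let Some(t) = timeout {
        outgoing.set_timeout(t.saturating_sub(started.elapsed()));
    }
    downstream_client.call(outgoing).await
}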

If you don't propagate deadlines, service B can time out locally while service C keeps working on a result nobody will read: wasted work and leaked resources.
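
On the wire, gRPC carries the caller's timeout in the grpc-timeout header: at most 8 ASCII digits followed by one unit character (H, M, S, m, u, n). The sketch below parses that value and computes the budget left for a downstream call. The function names parse_grpc_timeout and downstream_budget, and the safety margin, are illustrative assumptions rather than part of any library.

```rust
use std::time::Duration;

/// Parse a grpc-timeout header value such as "500m" or "2S".
/// Returns None for anything that does not match "<1-8 digits><unit>".
pub fn parse_grpc_timeout(value: &str) -> Option<Duration> {
    if value.len() < 2 || !value.is_ascii() {
        return None;
    }
    let (digits, unit) = value.split_at(value.len() - 1);
    if digits.len() > 8 || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let n: u64 = digits.parse().ok()?;
    match unit {
        "H" => Some(Duration::from_secs(n * 3600)),
        "M" => Some(Duration::from_secs(n * 60)),
        "S" => Some(Duration::from_secs(n)),
        "m" => Some(Duration::from_millis(n)),
        "u" => Some(Duration::from_micros(n)),
        "n" => Some(Duration::from_nanos(n)),
        _ => None,
    }
}

/// Time left for downstream work: the incoming timeout minus the time already
/// spent in this service, minus a margin reserved for building the reply.
/// None means the deadline has effectively passed and the call should fail
/// fast with DEADLINE_EXCEEDED instead of contacting the downstream service.
pub fn downstream_budget(
    timeout: Duration,
    elapsed: Duration,
    margin: Duration,
) -> Option<Duration> {
    let left = timeout.checked_sub(elapsed)?.checked_sub(margin)?;
    if left.is_zero() {
        None
    } else {
        Some(left)
    }
}
```

For example, a 500 ms incoming timeout with 120 ms already spent and a 30 ms margin leaves 350 ms for the call to the next service. Shrinking the budget at every hop is what lets the last service in the chain stop at roughly the same moment the original caller gives up.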

Retryable vs Non-Retryable Errors

Safe to retry (transient):

  • UNAVAILABLE: service is temporarily down; retry with backoff
  • RESOURCE_EXHAUSTED: rate limited; retry after RetryInfo.retry_delay

NOT safe to retry:

  • INVALID_ARGUMENT: request is malformed; retry won't help
  • NOT_FOUND: the resource doesn't exist
  • PERMISSION_DENIED: the caller lacks permission; retrying with the same credentials won't help (a missing or expired token is UNAUTHENTICATED, where refreshing credentials may fix it)

Conditionally retryable:

  • DEADLINE_EXCEEDED: only if the operation is idempotent, because the first attempt may have completed on the server
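
The classification above can be sketched as a small client-side policy function. Everything here is an illustrative assumption: the local Code enum stands in for a library's status code type, and the attempt limit (4) and backoff (100 ms doubling, capped at 2 s, no jitter) are arbitrary example values.

```rust
use std::time::Duration;

/// Local stand-in for the status codes discussed in this section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Code {
    InvalidArgument,
    DeadlineExceeded,
    NotFound,
    PermissionDenied,
    ResourceExhausted,
    Unavailable,
}

#[derive(Debug, PartialEq, Eq)]
pub enum RetryDecision {
    RetryAfter(Duration),
    GiveUp,
}

/// Decide whether to retry a failed call.
/// `attempt` is the zero-based index of the attempt that just failed;
/// `server_delay` is RetryInfo.retry_delay when the server supplied one.
pub fn retry_decision(
    code: Code,
    idempotent: bool,
    attempt: u32,
    server_delay: Option<Duration>,
) -> RetryDecision {
    const MAX_ATTEMPTS: u32 = 4;
    if attempt >= MAX_ATTEMPTS {
        return RetryDecision::GiveUp;
    }
    // Exponential backoff: 100 ms * 2^attempt, capped at 2 s.
    let backoff = Duration::from_millis(100u64 << attempt.min(10))
        .min(Duration::from_secs(2));
    match code {
        Code::Unavailable => RetryDecision::RetryAfter(backoff),
        // Prefer the delay the server asked for over the local backoff.
        Code::ResourceExhausted => {
            RetryDecision::RetryAfter(server_delay.unwrap_or(backoff))
        }
        // The first attempt may have succeeded server-side, so only
        // idempotent operations are retried.
        Code::DeadlineExceeded if idempotent => RetryDecision::RetryAfter(backoff),
        _ => RetryDecision::GiveUp,
    }
}
```

A production policy would normally add random jitter to the backoff and check that the caller's own deadline still leaves room for another attempt.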

Key Takeaways

  • Use specific status codes: UNAVAILABLE is transient; INVALID_ARGUMENT is a client bug
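  • google.rpc.Status carries structured, machine-readable error context in its details field, using messages from google/rpc/error_details.proto such as BadRequest and RetryInfo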
  • Always propagate deadlines: pass remaining time to downstream gRPC calls
  • Document which status codes are retryable in your service's API contract

Tomorrow: Challenge — bidirectional gRPC chat service.