BUZZ AI Gateway

Zero Retention

Zero retention is a precise commitment about a precise thing: request and response bodies. BUZZ does not store the prompts you send or the completions Claude returns. It does store the metadata required to bill you, route around outages, and respond to abuse, but never the message content itself.

Why "zero" needs a definition

"We don't log requests" is a common claim. It is also vague enough to be meaningless. A useful zero-retention statement has to answer four questions: what is dropped, what is kept, where the line is drawn, and how long anything that is kept lives.

BUZZ draws the line at the boundary of the message body. Anything inside the body of a request to POST /v1/messages, and anything inside the body of the response, is treated as customer data and is not persisted to disk anywhere in our infrastructure. Counters, identifiers, status codes, and routing decisions about that request are persisted, because we have to bill you and we have to be able to investigate failures.

Data flow at a glance

YOUR APP
   |
   |  TLS (your prompt)
   v
BUZZ EDGE  -- read auth header, route channel
   |       -- build upstream connection
   |       -- pipe bytes (no body buffer to disk)
   v
ANTHROPIC / VENDOR
   |
   |  TLS (model output)
   v
BUZZ EDGE  -- pipe bytes back (no body buffer to disk)
   |       -- read trailing usage from final SSE event
   |       -- emit billing record { user_id, model, tokens, status, ms }
   v
YOUR APP

PERSISTED:      billing record only (no prompt, no completion)
NOT PERSISTED:  request body, response body, tool inputs, tool results

Bytes flow through a streaming pipe. The only thing extracted on the way out is the trailing usage object, emitted in the final SSE message_delta event for streaming requests or in the JSON response for non-streaming requests. That object becomes a billing record. The body bytes themselves are never written to disk and never copied into a queue.
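
The pipe described above can be sketched in a few lines. This is an illustration under stated assumptions, not BUZZ's actual implementation: the names pipe_and_meter and on_usage are hypothetical, and the event shape (a JSON data line whose type is message_delta and which carries a usage object) is assumed to follow Anthropic's streaming format.

```python
import json
from typing import Callable, Iterable, Iterator


def pipe_and_meter(
    upstream: Iterable[bytes],
    on_usage: Callable[[dict], None],
) -> Iterator[bytes]:
    """Forward SSE bytes unchanged and report usage from message_delta events.

    Hypothetical helper: nothing here writes to disk or to a queue.
    """
    pending = b""  # holds at most one incomplete line, never the whole body
    for chunk in upstream:
        pending += chunk
        # Everything before the last newline is a complete line; the rest waits.
        *lines, pending = pending.split(b"\n")
        for line in lines:
            line = line.strip()
            if not line.startswith(b"data:"):
                continue  # "event:" lines and blank separators carry no usage
            try:
                event = json.loads(line[len(b"data:"):])
            except ValueError:
                continue  # not JSON; nothing to meter
            if (
                isinstance(event, dict)
                and event.get("type") == "message_delta"
                and "usage" in event
            ):
                on_usage(event["usage"])  # only the counters leave the pipe
        yield chunk  # the caller receives exactly the bytes upstream sent
```

A caller would wrap the upstream byte iterator with pipe_and_meter and pass a callback that builds the billing record. The only state kept between chunks is the tail of an unfinished line, which is discarded as soon as the line completes.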

What is stored

For every successful or failed request, BUZZ stores a small record that looks like this:

{
  "request_id":   "202605260713594...",
  "user_id":      "usr_...",
  "key_id":       "key_...",
  "endpoint":     "/v1/messages",
  "model":        "claude-haiku-4-5-20251001",
  "channel":      "anthropic-direct-1",
  "status":       200,
  "input_tokens": 1240,
  "output_tokens": 380,
  "cache_read_tokens": 800,
  "latency_ms":   2410,
  "ts":           "2026-05-26T07:13:59Z"
}

That is the entire shape of what we keep about your traffic. It is enough to compute your bill, expose accurate usage in your dashboard, debug a 5xx surge by request ID, and detect abuse patterns without reading message text.
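
To make "debug a 5xx surge without reading message text" concrete, here is a minimal sketch that works only from records of the shape shown above. The function names are hypothetical, and the allowlist simply mirrors the fields in the sample record; it is an assumption that a real check would look like this.

```python
from collections import Counter
from typing import Iterable

# Field names copied from the sample record; anything else is rejected.
ALLOWED_FIELDS = {
    "request_id", "user_id", "key_id", "endpoint", "model", "channel",
    "status", "input_tokens", "output_tokens", "cache_read_tokens",
    "latency_ms", "ts",
}


def assert_metadata_only(record: dict) -> None:
    """Raise if a record carries any field outside the metadata allowlist."""
    extra = set(record) - ALLOWED_FIELDS
    if extra:
        raise ValueError(f"unexpected fields in record: {sorted(extra)}")


def server_errors_by_channel(records: Iterable[dict]) -> Counter:
    """Count 5xx responses per routing channel using metadata alone."""
    counts: Counter = Counter()
    for record in records:
        assert_metadata_only(record)
        if 500 <= record["status"] <= 599:
            counts[record["channel"]] += 1
    return counts
```

Because the allowlist contains no prompt or completion field, a record that somehow carried message content would fail the check instead of being silently aggregated.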

What is not stored

Request bodies, response bodies, tool inputs, and tool results are not persisted. No prompt text and no completion text is written to disk or copied into a queue.

The honest exceptions

Three narrow situations involve content briefly leaving the streaming pipe, and you should know about them:

Error envelopes. When the upstream returns a non-2xx response, the small JSON error envelope (a few hundred bytes describing the error type) is captured into the structured log so support can reproduce the failure. The original request body is not.

Abuse triage. If your account is flagged for suspected abuse (e.g., a sudden spike from a leaked key), an operator can enable temporary per-key sampling that captures request hashes (not bodies) for review. Enabling it requires explicit operator action, and that action is itself logged.

OS-level memory. Bytes pass through process memory while in flight. They are not flushed to disk, but they exist in RAM for the duration of the request. This is fundamental to TLS termination on any proxy and applies equally to Anthropic, Cloudflare, or any other intermediary.
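
The first two exceptions can be sketched together as a function that decides what may enter a log line. Everything here is an assumption made for illustration: the function names, the 1024-byte cap (the text only says "a few hundred bytes"), the choice of SHA-256, and the error envelope shape, which is assumed to follow Anthropic's {"type": "error", "error": {...}} format.

```python
import hashlib
import json
from typing import Optional

MAX_ENVELOPE_BYTES = 1024  # assumed cap, not a documented BUZZ value


def error_envelope(status: int, response_body: bytes) -> Optional[dict]:
    """Return the upstream error object for non-2xx responses, else None."""
    if 200 <= status < 300:
        return None  # successful response bodies are never captured
    if len(response_body) > MAX_ENVELOPE_BYTES:
        return None  # too large to be a plain error envelope; drop it
    try:
        parsed = json.loads(response_body)
    except ValueError:
        return None  # not JSON; drop it rather than log raw bytes
    return parsed.get("error") if isinstance(parsed, dict) else None


def request_fingerprint(request_body: bytes) -> str:
    """Return a SHA-256 hex digest; the body itself is not retained."""
    return hashlib.sha256(request_body).hexdigest()


def loggable_fields(
    status: int,
    request_body: bytes,
    response_body: bytes,
    sampling_enabled: bool,
) -> dict:
    """Build the only content-derived fields allowed into a log line."""
    fields = {"status": status, "error": error_envelope(status, response_body)}
    if sampling_enabled:  # set per key by an operator, per the text above
        fields["request_sha256"] = request_fingerprint(request_body)
    return fields
```

A digest lets an operator see that the same request is being replayed without seeing what it says. A plain hash of a short or predictable body can still be confirmed by guessing, so a keyed hash (HMAC with a secret) is a common hardening step; whether BUZZ uses one is not stated in this document.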

Compliance and enterprise meaning

Zero retention turns a number of hard procurement questions into easy ones. If your security review asks "where does customer prompt content go", the answer is "into Anthropic's TLS connection, then nowhere else", and that answer is verifiable from BUZZ's own logging surface (which does not contain a prompt field at all).

For regulated workloads (healthcare, legal, financial), this is the difference between a gateway that requires a Data Processing Agreement covering message content and one that does not, because there is no message content for an agreement to cover. It also collapses the blast radius of a hypothetical BUZZ breach: an attacker who fully compromised our database would obtain billing counters and routing logs, not your conversations.

The architectural foundation is transparent forwarding. The simplest way to not retain something is to never have it. BUZZ is built so that the most natural code path is also the most private one.
