Transparent Forwarding
A gateway should be a wire, not a translator. Transparent forwarding is BUZZ's commitment to proxy the Anthropic Messages API byte-for-byte: same request shape, same response shape, same SSE event order, same error envelopes. Your code does not need to know BUZZ exists.
What it means
Most "AI gateways" are application servers wearing a proxy costume. They re-parse your JSON, re-shape it into their internal model, call the upstream LLM with their own client, and re-emit a translated response. Every step is an opportunity for behavior to drift.
BUZZ takes the opposite approach. The principle is simple:
If Anthropic accepts a field, BUZZ forwards it. If Anthropic returns a field, BUZZ returns it. If Anthropic emits an SSE event, BUZZ emits the same event in the same order.
The bytes you send arrive at Anthropic almost unchanged, modulo headers BUZZ rewrites for routing and authentication. The bytes Anthropic sends back arrive at you almost unchanged, modulo accounting fields BUZZ appends without altering anything Anthropic produced.
Why it matters
Three reasons.
Forward compatibility. When Anthropic ships a new feature like extended thinking, structured tool input, or a new stop_reason value, you do not have to wait for the gateway to "support" it. The first request you send with the new field works the same day Anthropic enables it on their side.
Debuggability. If your code breaks, you can compare a BUZZ response with an Anthropic-direct response field by field. They match. The gateway is no longer a suspect when you triage a bug, which collapses your debugging surface.
Portability. Code written against the BUZZ base URL runs against api.anthropic.com with a one-line URL swap. There is no vendor lock-in at the protocol layer because there is no protocol layer to lock in.
How BUZZ implements it
The forwarding pipeline is intentionally minimal:
- Auth rewrite. Your BUZZ key is validated, then replaced with the upstream channel's credentials before egress. Your key never reaches Anthropic.
- Streaming pass-through. SSE chunks are forwarded as soon as they arrive, byte-aligned. Event order,
event:/data:framing, and inter-event timing are preserved. - Cache directives untouched. Every
cache_controlblock is forwarded verbatim. Anthropic decides what hits cache, not BUZZ. - Tool blocks untouched.
tool_use,tool_result, andtool_choiceshapes pass through without transformation. - Unknown fields pass through. If you send a field BUZZ has never seen, it goes upstream. If Anthropic returns a field BUZZ has never seen, it comes back to you.
The few places where BUZZ does intervene are deliberate and small: it strips the optional sk- prefix on Bearer tokens, defaults anthropic-version to 2023-06-01 when omitted, and drops a short list of channel-gated parameters (inference_geo, speed, service_tier) when the upstream channel does not allow them. None of these change the meaning of any field Anthropic returns. Fields you may see in the response such as usage.iterations[] originate from the upstream provider — BUZZ does not synthesize them.
Compared with traditional relay stations
Traditional Claude relay stations typically run a Node or Python server that owns its own SDK call to Anthropic. They convert your request into an internal type, call anthropic.messages.create(), then convert back. The implications are significant:
| Concern | Traditional relay | BUZZ transparent forwarding |
|---|---|---|
| New Anthropic features | Wait for relay author to update | Available immediately |
| Streaming behavior | Often re-buffered or re-encoded | Byte-aligned SSE pass-through |
| Cache hit rate | Frequently degraded by request rewrites | Same as Anthropic-direct |
| Field-level fidelity | Risk of drift on every release | Tested against captured fixtures |
| Code portability | SDK is custom, not Anthropic-shape | Anthropic SDK works as-is |
The relay approach is fine if your only goal is to expose a chat completion endpoint. It breaks down the moment you care about prompt caching, tool use round-trips, extended thinking, or any feature Anthropic adds in the next twelve months.
A note on trust
Transparent forwarding is also a trust contract. A gateway that re-parses your request can also log it, modify it, replay it, or store it. A gateway that pipes bytes from your TLS connection to Anthropic's TLS connection has no opportunity to do any of those things to message bodies. This is the architectural foundation of zero retention: the simplest way to not retain data is to never have it.
You should always ask of any gateway: can you produce a request fixture that, sent through your gateway, returns a response byte-identical to one sent direct to Anthropic? If the answer is yes, transparent forwarding is real. If the answer involves caveats, it is marketing.
See also
POST /v1/messages— the primary forwarded endpoint- Zero Retention — the privacy implication of transparent forwarding
- Multi-Vendor Routing — how BUZZ chooses upstream while staying transparent
- BUZZ vs Claude relay stations — architectural comparison with traditional relays
- Zero retention LLM gateway — the design rationale