Migrate from Direct Anthropic API to BUZZ
If your code talks to https://api.anthropic.com/v1/messages today, you can move to BUZZ by changing one URL and one API key. The request shape, the response shape, streaming, tool use, and prompt caching all behave the same; the few exceptions are listed under Differences worth knowing.
Set base_url (Python) or baseURL (Node) to https://buzzai.cc and use a BUZZ API key. That is the entire change for the common path. Everything else in this guide is what you don't have to do.
The one-line change
Python SDK
Before — Anthropic direct
import os
from anthropic import Anthropic
client = Anthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"],
)
msg = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=200,
    messages=[
        {"role": "user", "content": "ping"}
    ],
)
After — BUZZ
import os
from anthropic import Anthropic
client = Anthropic(
    base_url="https://buzzai.cc",
    api_key=os.environ["BUZZ_API_KEY"],
)
msg = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=200,
    messages=[
        {"role": "user", "content": "ping"}
    ],
)
Node.js SDK
Before — Anthropic direct
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});
const msg = await client.messages.create({
  model: "claude-haiku-4-5-20251001",
  max_tokens: 200,
  messages: [
    { role: "user", content: "ping" },
  ],
});
After — BUZZ
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({
  baseURL: "https://buzzai.cc",
  apiKey: process.env.BUZZ_API_KEY,
});
const msg = await client.messages.create({
  model: "claude-haiku-4-5-20251001",
  max_tokens: 200,
  messages: [
    { role: "user", content: "ping" },
  ],
});
curl
Before
curl https://api.anthropic.com/v1/messages \
-H "x-api-key: $ANTHROPIC_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d '{...}'
After
curl https://buzzai.cc/v1/messages \
-H "Authorization: Bearer $BUZZ_API_KEY" \
-H "anthropic-version: 2023-06-01" \
-H "content-type: application/json" \
-d '{...}'
Three accepted auth header forms
Anthropic accepts only x-api-key. BUZZ accepts three forms — anything your existing code is already sending will work.
| Header | Notes |
|---|---|
| Authorization: Bearer <KEY> | Recommended. Aligns with the OpenAI SDK convention and the rest of BUZZ's REST surface. |
| Authorization: Bearer sk-<KEY> | Accepted. The sk- prefix is auto-stripped server-side. |
| x-api-key: <KEY> | Drop-in compatible with the Anthropic SDK default. If your code uses the Anthropic SDK without overriding headers, this is what gets sent. |
anthropic-version is required by Anthropic upstream but optional on BUZZ — BUZZ defaults it to 2023-06-01 when omitted. Send it explicitly anyway so the same code keeps working against direct Anthropic.
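For callers that build raw HTTP requests instead of using an SDK, the three header forms can be sketched as a small helper. This is an illustrative sketch only: the function name auth_headers and the style labels are hypothetical, and the helper always sends anthropic-version so the same headers stay portable to direct Anthropic.

```python
def auth_headers(key: str, style: str = "bearer") -> dict[str, str]:
    """Build request headers for one of the three auth forms in the table above.

    `style` is a hypothetical label used only by this sketch:
    "bearer", "bearer-sk", or "x-api-key".
    """
    headers = {
        # Optional on BUZZ, required on Anthropic; sent explicitly for portability.
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    if style == "bearer":
        headers["Authorization"] = f"Bearer {key}"
    elif style == "bearer-sk":
        # Per the table, the sk- prefix is stripped server-side.
        headers["Authorization"] = f"Bearer sk-{key}"
    elif style == "x-api-key":
        headers["x-api-key"] = key
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    return headers
```

Pass the returned dict to whichever HTTP client you already use; only one of Authorization or x-api-key is set per call.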
What passes through unchanged
BUZZ does transparent forwarding. The bytes you send are the bytes the upstream receives, and the bytes it returns are the bytes you get back. The fields below behave identically against BUZZ and against direct Anthropic.
Request fields
| Field | Status |
|---|---|
| model | Identical. Use the same model id you used against Anthropic. claude-opus-4-7, claude-sonnet-4-6, and dated aliases like claude-haiku-4-5-20251001, claude-sonnet-4-5-20250929, claude-opus-4-5-20251101 are all accepted. |
| messages | Identical. Same role alternation, same content blocks (string or array of text/image/tool_use/tool_result). |
| system | Identical, including the array form with per-block cache_control. |
| max_tokens / temperature / top_p / top_k / stop_sequences | Identical pass-through. |
| stream | Identical. Same SSE wire format. See the streaming guide. |
| tools / tool_choice | Identical. Verified end-to-end: a Tokyo-weather tool definition and a multi-turn tool_use/tool_result round-trip both work without modification. |
| thinking | Identical pass-through (Opus 4.7 extended thinking). |
| cache_control | Identical. The byte-level transparency means prompt cache works end-to-end. Verified: same payload sent twice produced cold cache_creation_input_tokens=1200, then warm cache_read_input_tokens=1200. |
| metadata | Pass-through. Anthropic primarily uses {"user_id": "..."}; BUZZ does not interpret it. |
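To make the request-field table concrete, here is a minimal sketch of a payload that exercises the array form of system with cache_control, a tool definition, and the follow-up tool_result turn. The helper names, the get_weather tool, and its schema are hypothetical examples, not part of BUZZ or the Anthropic SDK; the block shapes follow the Messages API fields named in the table.

```python
import json


def build_payload(system_text: str, user_text: str) -> dict:
    """Assemble a Messages request body as a plain dict (no SDK required)."""
    return {
        "model": "claude-haiku-4-5-20251001",
        "max_tokens": 200,
        # Array form of `system`, with a per-block cache_control marker.
        "system": [
            {"type": "text", "text": system_text,
             "cache_control": {"type": "ephemeral"}},
        ],
        "tools": [
            {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up the current weather for a city.",
                "input_schema": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }
        ],
        "messages": [{"role": "user", "content": user_text}],
    }


def tool_result_turn(assistant_content: list[dict], results: dict[str, str]) -> dict:
    """Build the user turn answering every tool_use block in an assistant reply.

    `results` maps each tool_use id to the string output of that tool call.
    """
    blocks = [
        {"type": "tool_result", "tool_use_id": block["id"],
         "content": results[block["id"]]}
        for block in assistant_content
        if block.get("type") == "tool_use"
    ]
    return {"role": "user", "content": blocks}


if __name__ == "__main__":
    print(json.dumps(build_payload("You are terse.", "Weather in Tokyo?"), indent=2))
```

The same dict can be sent unchanged to either endpoint, which is what makes it usable for a parity check.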
Response fields
| Field | Status |
|---|---|
| id, type, role, model, content, stop_reason, stop_sequence | Identical to Anthropic. Same enum values for stop_reason: end_turn, max_tokens, stop_sequence, tool_use, pause_turn, refusal. |
| usage.input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens, cache_creation, service_tier | Identical schema and semantics. |
| SSE event sequence | Identical. message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop, ping, error. |
Differences worth knowing
These are the only deltas in practice. None of them require code changes if your client tolerates extra fields and unknown statuses (which any robust JSON parser does).
Extra response fields BUZZ may add
BUZZ may include accounting-transparency fields in the response that Anthropic does not emit:
- usage.iterations[]: per-iteration token breakdown
- context_management.applied_edits: usually an empty array
BUZZ never removes fields. Your parser should accept unknown keys and ignore them.
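If you parse raw JSON yourself, one way to stay tolerant of the extra fields is to copy only the keys you know about and ignore the rest. The sketch below assumes you care about the four token counters; the Usage dataclass and parse_usage are hypothetical names for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0
    cache_creation_input_tokens: int = 0
    cache_read_input_tokens: int = 0


def parse_usage(raw: dict) -> Usage:
    """Pick out known counters; unknown keys such as `iterations` are skipped."""
    known = {f.name for f in fields(Usage)}
    # A null counter is treated as zero.
    return Usage(**{k: (v or 0) for k, v in raw.items() if k in known})
```

Because the filter is an allow-list, any field BUZZ adds later is ignored without a code change.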
HTTP 503 buzz_error · model_not_found
Anthropic uses 404 for an unknown model. BUZZ uses 503 with error.type: "buzz_error" and a model_not_found message when no upstream channel can serve the model under your group. This is a routing decision, not an outage. Add a 503 handler that hits GET /v1/models for the live list. See the error-handling guide.
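A 503 handler first has to tell a routing miss apart from a genuine outage. The following is a minimal sketch under one stated assumption: that the literal text model_not_found appears in error.message, as the heading above suggests. The function name is hypothetical.

```python
def is_model_not_found(status: int, body: dict) -> bool:
    """Return True when a 503 looks like a BUZZ routing miss, not an outage.

    Assumes the substring "model_not_found" appears in error.message.
    """
    err = body.get("error") or {}
    return (
        status == 503
        and err.get("type") == "buzz_error"
        and "model_not_found" in (err.get("message") or "")
    )
```

When this returns True, retrying the same model id is unlikely to help; refreshing the model list from GET /v1/models is the more useful next step.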
BUZZ-side error envelope shape
Anthropic-passthrough errors keep the official {"type":"error","error":{...},"request_id":"req_..."} shape. Errors that BUZZ generates itself (auth, schema validation, channel routing) use:
{"error":{"type":"buzz_error","message":"... (request id: 202605260713...)"}}
If your error handler only branches on Anthropic's error.type enum, add a fallback for buzz_error. The BUZZ request id is appended to error.message rather than living in a separate field.
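A handler that copes with both envelopes might look like the sketch below. It assumes the request id always appears in the "(request id: ...)" form shown in the example above; if BUZZ changes that wording the regex simply finds nothing and request_id is None. The function name and the returned dict layout are hypothetical.

```python
import re

# Matches the "(request id: ...)" suffix shown in the BUZZ example above.
REQUEST_ID_RE = re.compile(r"\(request id: ([^)]+)\)")


def parse_error(body: dict) -> dict:
    """Normalize an Anthropic-passthrough or BUZZ-generated error body."""
    err = body.get("error") or {}
    message = err.get("message") or ""
    if err.get("type") == "buzz_error":
        match = REQUEST_ID_RE.search(message)
        request_id = match.group(1) if match else None
        source = "buzz"
    else:
        # Anthropic's envelope carries the id in a top-level field.
        request_id = body.get("request_id")
        source = "anthropic"
    return {
        "source": source,
        "type": err.get("type"),
        "message": message,
        "request_id": request_id,
    }
```

Logging the normalized request_id from either source gives support one value to search on.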
Channel-gated request fields
Three optional fields are silently dropped unless your channel allows them: inference_geo, speed, service_tier. If your code sends them and depends on them, talk to support to enable the corresponding allow-flag. Most callers do not use these.
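Because the drop is silent, a client-side check is the only place you would notice it. A minimal sketch, with a hypothetical helper name, that reports which gated fields a payload carries so you can log a warning before sending:

```python
GATED_FIELDS = ("inference_geo", "speed", "service_tier")


def gated_fields_in(payload: dict) -> list[str]:
    """Return the channel-gated fields present at the top level of a request."""
    return [name for name in GATED_FIELDS if name in payload]
```

An empty list means the request does not depend on any channel allow-flag.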
Switching via environment variables
The cleanest pattern is to keep your code untouched and switch at the environment-variable level. The Anthropic SDKs honor ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY automatically.
# .env (was direct-Anthropic)
ANTHROPIC_API_KEY=sk-ant-...
# .env (now BUZZ)
ANTHROPIC_BASE_URL=https://buzzai.cc
ANTHROPIC_API_KEY=<YOUR_BUZZ_KEY>
Same code, different env. Same pattern works for Claude Code: point the CLI at BUZZ via env vars and your existing prompts and tooling keep working.
Side-by-side parallel test
Before flipping production, run both endpoints in parallel for a representative sample. The point is to confirm the response bytes match, not to A/B benchmark.
import os
from anthropic import Anthropic
direct = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
buzz = Anthropic(base_url="https://buzzai.cc", api_key=os.environ["BUZZ_API_KEY"])
prompt = [{"role": "user", "content": "Say the word 'hello' and nothing else."}]
common = dict(model="claude-haiku-4-5-20251001", max_tokens=20, messages=prompt)
a = direct.messages.create(**common)
b = buzz.messages.create(**common)
print("direct text:", a.content[0].text)
print("buzz text:", b.content[0].text)
print("direct stop:", a.stop_reason)
print("buzz stop:", b.stop_reason)
# Expect: same content text, same stop_reason. Usage tokens may differ
# slightly because each call is independent, but the schema is identical.
Run the same comparison with stream=True, with a tool-use round-trip, and with a long cache_control-marked system prompt. If the field shapes match for those four cases, the rest of your surface area will work too.
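Comparing field shapes by eye gets tedious across four cases. One possible approach is a recursive diff that reports keys present in the direct response but missing or differently typed in the BUZZ response, while tolerating extra keys on the BUZZ side. The shape_diff helper below is a hypothetical sketch that works on plain dicts; converting SDK response objects to dicts (for example via model_dump(), if your SDK version exposes it) is left as an assumption.

```python
def shape_diff(direct, buzz, path: str = "$") -> list[str]:
    """List structural mismatches where `buzz` lacks something `direct` has.

    Values are not compared, only key presence and value types.
    Extra keys that appear only in `buzz` are ignored on purpose.
    """
    diffs: list[str] = []
    if isinstance(direct, dict) and isinstance(buzz, dict):
        for key, value in direct.items():
            if key not in buzz:
                diffs.append(f"{path}.{key}: missing")
            else:
                diffs.extend(shape_diff(value, buzz[key], f"{path}.{key}"))
    elif isinstance(direct, list) and isinstance(buzz, list):
        # Compare element by element up to the shorter list's length.
        for i, (a, b) in enumerate(zip(direct, buzz)):
            diffs.extend(shape_diff(a, b, f"{path}[{i}]"))
    elif type(direct) is not type(buzz):
        diffs.append(f"{path}: {type(direct).__name__} != {type(buzz).__name__}")
    return diffs
```

An empty list means every field the direct response carries is also present, with the same type, in the BUZZ response.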
Switch-day checklist
- Issue a BUZZ API key. Use a separate key per environment (dev / staging / prod) so you can revoke selectively.
- Set the base URL via env var. Either the SDK constructor argument or ANTHROPIC_BASE_URL=https://buzzai.cc. Avoid hardcoding the URL in source.
- Send anthropic-version: 2023-06-01 explicitly. Optional on BUZZ, required on Anthropic. Sending it keeps your code portable both ways.
- Run a parallel-call check on your hottest prompt. Compare content[0].text and stop_reason for the non-stream case. They should match.
- Run a streaming check. Confirm you receive the same six-event sequence ending in message_stop. See the streaming guide.
- Run a tool-use check. Send a tool-defined prompt, confirm the response carries a tool_use content block, then complete the round-trip with tool_result.
- Run a prompt-cache check. Send the same payload twice with cache_control: {type: "ephemeral"} on the system block. First call should report non-zero cache_creation_input_tokens, second call non-zero cache_read_input_tokens.
- Update error handling for 503 and BUZZ envelope. See the error-handling guide.
- Pull the live model list once. GET /v1/models against your new key. Use this list as the source of truth for the model ids your group can route to.
- Keep the Anthropic key around. Useful as a fallback target during the first week and for the parallel-call test harness.
- Monitor: HTTP status distribution, latency p50/p95, cache_read ratio. All three should be at parity (or better, on the cache ratio) after the switch.
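The streaming and prompt-cache items in the checklist can be turned into two small assertions. This is a sketch under stated assumptions: you have already collected the SSE event names into a list and the usage objects into dicts, ping events may appear anywhere, and message_delta immediately precedes message_stop. Both function names are hypothetical.

```python
def stream_is_well_formed(events: list[str]) -> bool:
    """Check that event names follow message_start, blocks, message_delta, message_stop."""
    names = [name for name in events if name != "ping"]  # pings carry no content
    if len(names) < 3 or names[0] != "message_start":
        return False
    if names[-2:] != ["message_delta", "message_stop"]:
        return False
    in_block = False
    for name in names[1:-2]:
        if name == "content_block_start" and not in_block:
            in_block = True
        elif name == "content_block_delta" and in_block:
            continue
        elif name == "content_block_stop" and in_block:
            in_block = False
        else:
            # Covers `error` events and any out-of-order block event.
            return False
    return not in_block


def cache_warmed(first_usage: dict, second_usage: dict) -> bool:
    """True when call one wrote the cache and call two read from it."""
    wrote = (first_usage.get("cache_creation_input_tokens") or 0) > 0
    read = (second_usage.get("cache_read_input_tokens") or 0) > 0
    return wrote and read
```

Running both checks against direct Anthropic and against BUZZ with the same payload gives a like-for-like result for the checklist.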
Rolling back
Because the change is one URL plus one key, rollback is also one URL plus one key. Keep both BUZZ_API_KEY and ANTHROPIC_API_KEY in your secret store; flipping the active ANTHROPIC_BASE_URL reverts traffic without a code deploy.
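One way to make that flip safe is to pick the API key from whichever base URL is active, so the URL and key can never get out of sync. The helper below is a hypothetical sketch: it assumes both keys are present in the environment under the names used in this guide, and that an unset ANTHROPIC_BASE_URL means direct Anthropic.

```python
BUZZ_BASE_URL = "https://buzzai.cc"
DIRECT_BASE_URL = "https://api.anthropic.com"


def active_target(env: dict) -> dict:
    """Return the base_url and api_key pair that matches the active base URL."""
    base_url = env.get("ANTHROPIC_BASE_URL") or DIRECT_BASE_URL
    on_buzz = base_url.rstrip("/") == BUZZ_BASE_URL
    key_var = "BUZZ_API_KEY" if on_buzz else "ANTHROPIC_API_KEY"
    return {"base_url": base_url, "api_key": env[key_var]}
```

With the SDK this could be used as Anthropic(**active_target(os.environ)), since both base_url and api_key are constructor arguments shown earlier in this guide.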