BUZZ AI Gateway

Migrate from Direct Anthropic API to BUZZ

If your code talks to https://api.anthropic.com/v1/messages today, you can move to BUZZ by changing one URL and one API key. The request shape, the response shape, streaming, tool use, and prompt caching all stay the same; the few exceptions are listed under "Differences worth knowing".

POST https://buzzai.cc/v1/messages
The migration is one line. Set base_url (Python) or baseURL (Node) to https://buzzai.cc and use a BUZZ API key. That is the entire change for the common path. Everything else in this guide is what you don't have to do.

The one-line change

Python SDK

Before — Anthropic direct
import os

from anthropic import Anthropic

client = Anthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"],
)

msg = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=200,
    messages=[
        {"role": "user", "content": "ping"}
    ],
)
After — BUZZ
import os

from anthropic import Anthropic

client = Anthropic(
    base_url="https://buzzai.cc",
    api_key=os.environ["BUZZ_API_KEY"],
)

msg = client.messages.create(
    model="claude-haiku-4-5-20251001",
    max_tokens=200,
    messages=[
        {"role": "user", "content": "ping"}
    ],
)

Node.js SDK

Before — Anthropic direct
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

const msg = await client.messages.create({
  model: "claude-haiku-4-5-20251001",
  max_tokens: 200,
  messages: [
    { role: "user", content: "ping" },
  ],
});
After — BUZZ
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "https://buzzai.cc",
  apiKey: process.env.BUZZ_API_KEY,
});

const msg = await client.messages.create({
  model: "claude-haiku-4-5-20251001",
  max_tokens: 200,
  messages: [
    { role: "user", content: "ping" },
  ],
});

curl

Before
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{...}'
After
curl https://buzzai.cc/v1/messages \
  -H "Authorization: Bearer $BUZZ_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{...}'

Three accepted auth header forms

Anthropic accepts only x-api-key. BUZZ accepts three forms — anything your existing code is already sending will work.

| Header | Notes |
| --- | --- |
| Authorization: Bearer <KEY> | Recommended. Aligns with the OpenAI SDK convention and the rest of BUZZ's REST surface. |
| Authorization: Bearer sk-<KEY> | Accepted. The sk- prefix is auto-stripped server-side. |
| x-api-key: <KEY> | Drop-in compatible with the Anthropic SDK default. If your code uses the Anthropic SDK without overriding headers, this is what gets sent. |

anthropic-version is required by Anthropic upstream but optional on BUZZ — BUZZ defaults it to 2023-06-01 when omitted. Send it explicitly anyway so the same code keeps working against direct Anthropic.
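As a minimal sketch of the three auth forms above, the helper below builds the headers for each one and always sends anthropic-version explicitly. The function name buzz_headers and its style argument are hypothetical, chosen only for illustration; nothing here touches the network.

```python
def buzz_headers(key: str, style: str = "bearer") -> dict[str, str]:
    """Build request headers for one of the three auth forms listed above."""
    headers = {
        # Optional on BUZZ, but sent explicitly so the same code
        # keeps working against direct Anthropic.
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    if style == "bearer":
        headers["Authorization"] = f"Bearer {key}"
    elif style == "bearer-sk":
        # The sk- prefix is described above as stripped server-side.
        headers["Authorization"] = f"Bearer sk-{key}"
    elif style == "x-api-key":
        headers["x-api-key"] = key
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    return headers
```

If you use the Anthropic SDK without overriding headers, you do not need a helper like this; the SDK sends x-api-key on its own.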

What passes through unchanged

BUZZ does transparent forwarding. The bytes you send are the bytes the upstream receives, and the bytes it returns are the bytes you get back. The fields below behave identically against BUZZ and against direct Anthropic.

Request fields

| Field | Status |
| --- | --- |
| model | Identical. Use the same model id you used against Anthropic. claude-opus-4-7, claude-sonnet-4-6, and dated aliases like claude-haiku-4-5-20251001, claude-sonnet-4-5-20250929, claude-opus-4-5-20251101 are all accepted. |
| messages | Identical. Same role alternation, same content blocks (string or array of text/image/tool_use/tool_result). |
| system | Identical, including the array form with per-block cache_control. |
| max_tokens / temperature / top_p / top_k / stop_sequences | Identical pass-through. |
| stream | Identical. Same SSE wire format. See the streaming guide. |
| tools / tool_choice | Identical. Verified end-to-end: a Tokyo-weather tool definition and a multi-turn tool_use/tool_result round-trip both work without modification. |
| thinking | Identical pass-through (Opus 4.7 extended thinking). |
| cache_control | Identical. The byte-level transparency means prompt cache works end-to-end. Verified: same payload sent twice produced cold cache_creation_input_tokens=1200, then warm cache_read_input_tokens=1200. |
| metadata | Pass-through. Anthropic primarily uses {"user_id": "..."}; BUZZ does not interpret it. |
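The cache check described for cache_control can be repeated against your own payloads. The sketch below assumes you have the usage objects from two identical requests as plain dicts; the helper name cache_round_trip_ok is hypothetical, and the only numbers used are the 1200-token figures quoted above.

```python
def cache_round_trip_ok(cold_usage: dict, warm_usage: dict) -> bool:
    """True when the second call read back what the first call cached."""
    created = cold_usage.get("cache_creation_input_tokens") or 0
    read = warm_usage.get("cache_read_input_tokens") or 0
    return created > 0 and read == created


# Usage values taken from the verification figures quoted above.
cold = {"cache_creation_input_tokens": 1200, "cache_read_input_tokens": 0}
warm = {"cache_creation_input_tokens": 0, "cache_read_input_tokens": 1200}
print(cache_round_trip_ok(cold, warm))
```

A False result on your own traffic is worth investigating before switch-over, since it would suggest the cached prefix differs between the two requests.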

Response fields

| Field | Status |
| --- | --- |
| id, type, role, model, content, stop_reason, stop_sequence | Identical to Anthropic. Same enum values for stop_reason: end_turn, max_tokens, stop_sequence, tool_use, pause_turn, refusal. |
| usage.input_tokens, output_tokens, cache_creation_input_tokens, cache_read_input_tokens, cache_creation, service_tier | Identical schema and semantics. |
| SSE event sequence | Identical. message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop, ping, error. |
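One way to spot-check the SSE event sequence is to pull the event names out of a captured stream and flag any that are not in the list above. This is a rough sketch, assuming you have saved the raw stream body as text; the sample below is an abbreviated, hand-written stream for illustration, not output captured from BUZZ, and the helper names are hypothetical.

```python
KNOWN_EVENTS = {
    "message_start", "content_block_start", "content_block_delta",
    "content_block_stop", "message_delta", "message_stop", "ping", "error",
}


def sse_event_names(raw: str) -> list[str]:
    """Return the value of every 'event:' line, in order of appearance."""
    names = []
    for line in raw.splitlines():
        if line.startswith("event:"):
            names.append(line[len("event:"):].strip())
    return names


def unknown_events(raw: str) -> list[str]:
    """Event names in the stream that are not in the documented list."""
    return [name for name in sse_event_names(raw) if name not in KNOWN_EVENTS]


# Hand-written, abbreviated sample for illustration only.
sample = (
    "event: message_start\n"
    'data: {"type": "message_start"}\n'
    "\n"
    "event: ping\n"
    'data: {"type": "ping"}\n'
    "\n"
    "event: message_stop\n"
    'data: {"type": "message_stop"}\n'
)
print(sse_event_names(sample))
```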

Differences worth knowing

These are the only deltas in practice. Most of them require no code changes if your client ignores unknown JSON fields and treats unfamiliar HTTP statuses and error types as generic failures; the two error-handling additions worth making are called out below.

Extra response fields BUZZ may add

BUZZ may include accounting-transparency fields in the response that Anthropic does not emit.

BUZZ never removes fields. Your parser should accept unknown keys and ignore them.
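A parser that tolerates unknown keys can be as simple as reading only the keys you know about. The sketch below does that for the usage object; read_usage is a hypothetical helper, and example_extra_field is a made-up key that stands in for whatever BUZZ adds, not a real BUZZ field name.

```python
KNOWN_USAGE_KEYS = (
    "input_tokens",
    "output_tokens",
    "cache_creation_input_tokens",
    "cache_read_input_tokens",
)


def read_usage(usage: dict) -> dict:
    """Pick out the known token counters and ignore everything else."""
    return {key: usage.get(key) or 0 for key in KNOWN_USAGE_KEYS}


# 'example_extra_field' is a placeholder for an unknown key, not a real field.
usage = {"input_tokens": 8, "output_tokens": 3, "example_extra_field": "ignored"}
print(read_usage(usage))
```

The Anthropic SDK response objects already tolerate extra fields, so this only matters if you parse the JSON yourself with a strict schema.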

HTTP 503 buzz_error · model_not_found

Anthropic uses 404 for an unknown model. BUZZ uses 503 with error.type: "buzz_error" and a model_not_found message when no upstream channel can serve the model under your group. This is a routing decision, not an outage. Add a 503 handler that hits GET /v1/models for the live list. See the error-handling guide.
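The 503 branch could be detected with a check like the one below. It is a sketch under one assumption: that the literal text model_not_found appears in error.message, which the description above implies but does not spell out. The function name is hypothetical, and what you do after detection (for example calling GET /v1/models) is left to your client.

```python
def is_model_not_found(status: int, body: dict) -> bool:
    """True for the BUZZ routing error described above, not for generic 503s."""
    error = body.get("error") or {}
    return (
        status == 503
        and error.get("type") == "buzz_error"
        # Assumption: the message contains the literal text 'model_not_found'.
        and "model_not_found" in str(error.get("message", ""))
    )
```

Keeping this separate from your generic 503 retry logic avoids retrying a request that will keep failing until the model id or group changes.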

BUZZ-side error envelope shape

Anthropic-passthrough errors keep the official {"type":"error","error":{...},"request_id":"req_..."} shape. Errors that BUZZ generates itself (auth, schema validation, channel routing) use:

{"error":{"type":"buzz_error","message":"... (request id: 202605260713...)"}}

If your error handler only branches on Anthropic's error.type enum, add a fallback for buzz_error. The BUZZ request id is appended to error.message rather than living in a separate field.
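The two envelope shapes can be normalized in one place. The sketch below assumes the request id always appears in the "(request id: ...)" form shown in the sample above; describe_error is a hypothetical helper, and the ids used in the test are placeholders.

```python
import re

# Matches the "(request id: ...)" suffix shown in the BUZZ sample above.
_REQUEST_ID = re.compile(r"\(request id: ([^)]+)\)")


def describe_error(body: dict) -> dict:
    """Normalize an Anthropic-passthrough or BUZZ-generated error body."""
    error = body.get("error") or {}
    if body.get("type") == "error":
        # Official Anthropic shape: request_id is a top-level field.
        return {
            "source": "anthropic",
            "type": error.get("type"),
            "request_id": body.get("request_id"),
        }
    # BUZZ shape: the request id is embedded in error.message.
    match = _REQUEST_ID.search(str(error.get("message", "")))
    return {
        "source": "buzz",
        "type": error.get("type"),
        "request_id": match.group(1) if match else None,
    }
```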

Channel-gated request fields

Three optional fields are silently dropped unless your channel allows them: inference_geo, speed, service_tier. If your code sends them and depends on them, talk to support to enable the corresponding allow-flag. Most callers do not use these.
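Because these fields are dropped silently, it can help to log when a request carries one of them. The sketch below is a client-side check only; gated_fields_in is a hypothetical helper and the payload values are placeholders, not documented values.

```python
GATED_FIELDS = ("inference_geo", "speed", "service_tier")


def gated_fields_in(payload: dict) -> list[str]:
    """List the channel-gated fields present in a request payload."""
    return [field for field in GATED_FIELDS if field in payload]


# Placeholder values for illustration; only the key names come from this guide.
payload = {"model": "claude-haiku-4-5-20251001", "max_tokens": 20, "speed": "<value>"}
print(gated_fields_in(payload))
```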

Switching via environment variables

The cleanest pattern is to keep your code untouched and switch at the environment-variable level. The Anthropic SDKs honor ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY automatically.

# .env (was direct-Anthropic)
ANTHROPIC_API_KEY=sk-ant-...

# .env (now BUZZ)
ANTHROPIC_BASE_URL=https://buzzai.cc
ANTHROPIC_API_KEY=<YOUR_BUZZ_KEY>

Same code, different env. Same pattern works for Claude Code: point the CLI at BUZZ via env vars and your existing prompts and tooling keep working.
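When the switch happens purely in the environment, a startup log line stating which backend is active makes mistakes easier to catch. The sketch below only inspects ANTHROPIC_BASE_URL; the function name and the labels it returns are hypothetical, and https://api.anthropic.com is used as the default because that is where the SDK points when no base URL is set.

```python
import os

ANTHROPIC_DEFAULT_BASE_URL = "https://api.anthropic.com"
BUZZ_BASE_URL = "https://buzzai.cc"


def active_backend(environ=None) -> tuple[str, str]:
    """Return (label, base_url) for the backend the environment selects."""
    env = os.environ if environ is None else environ
    base = (env.get("ANTHROPIC_BASE_URL") or ANTHROPIC_DEFAULT_BASE_URL).rstrip("/")
    if base == BUZZ_BASE_URL:
        label = "buzz"
    elif base == ANTHROPIC_DEFAULT_BASE_URL:
        label = "anthropic-direct"
    else:
        label = "other"
    return label, base


label, base = active_backend()
print(f"LLM backend: {label} ({base})")
```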

Side-by-side parallel test

Before flipping production, run both endpoints in parallel for a representative sample. The point is to confirm that the response shapes and semantics match, not to A/B benchmark; ids and sampled text will naturally differ between two independent calls.

import os

from anthropic import Anthropic

direct = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
buzz = Anthropic(base_url="https://buzzai.cc", api_key=os.environ["BUZZ_API_KEY"])

prompt = [{"role": "user", "content": "Say the word 'hello' and nothing else."}]
common = dict(model="claude-haiku-4-5-20251001", max_tokens=20, messages=prompt)

a = direct.messages.create(**common)
b = buzz.messages.create(**common)

print("direct text:", a.content[0].text)
print("buzz   text:", b.content[0].text)
print("direct stop:", a.stop_reason)
print("buzz   stop:", b.stop_reason)
# Expect: same stop_reason and, for a prompt this constrained, the same text.
# Output token counts may differ slightly because each call is sampled
# independently, but the schema is identical.

Run the same comparison with stream=True, with a tool-use round-trip, and with a long cache_control-marked system prompt. If the field shapes match for all four cases (the basic call above plus these three), the rest of your surface area will work too.
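Comparing field shapes by eye gets tedious, so a small recursive check can help. The sketch below reports every key that the direct response has and the BUZZ response lacks, plus leaf values whose types differ; extra keys on the BUZZ side are deliberately ignored because BUZZ may add fields. It works on plain dicts, and missing_paths is a hypothetical helper name.

```python
def missing_paths(reference, candidate, path="$") -> list[str]:
    """Paths present in reference but missing or differently typed in candidate."""
    problems = []
    if isinstance(reference, dict):
        if not isinstance(candidate, dict):
            return [f"{path}: expected object"]
        for key, ref_value in reference.items():
            if key not in candidate:
                problems.append(f"{path}.{key}: missing")
            else:
                problems.extend(
                    missing_paths(ref_value, candidate[key], f"{path}.{key}")
                )
    elif isinstance(reference, list):
        if not isinstance(candidate, list):
            return [f"{path}: expected array"]
        # Compare element by element up to the shorter length.
        for i, (ref_item, cand_item) in enumerate(zip(reference, candidate)):
            problems.extend(missing_paths(ref_item, cand_item, f"{path}[{i}]"))
    elif type(reference) is not type(candidate):
        # Note: a null on one side and a string on the other is reported here.
        problems.append(
            f"{path}: {type(reference).__name__} != {type(candidate).__name__}"
        )
    return problems
```

With the SDK you would first convert each response object to a dict (the response models expose a to_dict() method in recent SDK versions; treat that as an assumption and check your installed version), then call missing_paths(direct_dict, buzz_dict) and expect an empty list.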

Switch-day checklist

Rolling back

Because the change is one URL plus one key, rollback is also one URL plus one key. Keep both BUZZ_API_KEY and ANTHROPIC_API_KEY in your secret store; flipping the active ANTHROPIC_BASE_URL reverts traffic without a code deploy.
