Drop-in OpenAI to Claude Migration: A Per-Field Compatibility Matrix
You have an OpenAI project. You want to add Claude. Can you really change one base_url and call it done? The honest answer is mostly yes, with a short list of caveats. Here is the per-field reference.
The pitch sounds too clean. Keep the OpenAI SDK, change base_url to a gateway, pass model="claude-opus-4-8", ship. Most posts stop there. Production code does not. Production code uses stop, response_format, seed, presence_penalty, parallel tool calls, and a JSON parser that asserts on the response shape. Some of those survive the translation cleanly. Some change meaning. Some are silently dropped and your code never finds out.
This piece is the field-by-field reference. Every parameter on chat.completions.create, what happens when the upstream is Claude, and what to do when the mapping is not exact. The framing assumes you are routing through BUZZ AI Gateway at https://buzzai.cc/v1, which is the OpenAI-compatible endpoint. The same key also speaks the Anthropic Messages protocol at https://buzzai.cc when you need it.
1. Why not just use the Anthropic SDK directly
It is a fair question. The Anthropic SDK exists, it is well maintained, and for greenfield Claude code it is the obvious choice. Migration is a different problem. You already have OpenAI code.
Existing wrappers and middleware. A real codebase has a client factory that sets timeouts, attaches a custom httpx client with TLS pins, registers an OpenTelemetry tracer, wires a retry policy that knows OpenAI error codes, attaches a Langfuse callback, and adds a circuit breaker. Adopting the Anthropic SDK means duplicating that scaffolding for a second SDK or extracting an abstraction that bridges both. Both options are real engineering effort that does not move product forward.
Frameworks already speak OpenAI. LangChain’s ChatOpenAI, LlamaIndex’s OpenAI integration, the Vercel AI SDK openai provider, Mastra, Haystack, AutoGen, CrewAI, and most agent frameworks accept an openai_api_base or baseURL override. They do not all have first-class Anthropic providers, and the ones that do are sometimes one or two features behind.
Observability is calibrated for it. The OpenTelemetry semantic conventions for GenAI are written around the chat.completions shape. Most internal homegrown loggers parse messages, choices[0].message.content, tool_calls, and usage.prompt_tokens. Switching SDKs invalidates those parsers.
Mental model uniformity. Engineers reading the code in a year do not need to remember which SDK uses messages versus contents, or which one wants system at the top level versus inline. client.chat.completions.create(model="...", messages=[...]) is the lingua franca.
The migration cost calculation is concrete. Adopting the Anthropic SDK means: a new dependency, a parallel client factory, a parallel retry policy, parallel tracing instrumentation, parallel mocks in tests, and two code paths in every function that picks a model at runtime. Pointing the OpenAI SDK at an OpenAI-compatible Claude endpoint means: one config string. The first path is correct work. The second path is correct work that finishes by lunch.
The right rule of thumb is to keep the OpenAI SDK as the default and reach for the Anthropic SDK only at the specific call sites where a Claude-native feature actually matters. Those call sites exist. They are also rare in proportion to the rest of the code.
2. How the OpenAI-compatible path works under the hood
The gateway exposes two protocols on the same key:
- Anthropic Messages at https://buzzai.cc for clients that speak Anthropic.
- OpenAI chat.completions at https://buzzai.cc/v1 for clients that speak OpenAI.
When you POST to /v1/chat/completions with a Claude model name, the gateway runs a deterministic translation:
- Pulls messages[role=="system"] out of the array and lifts it to the Anthropic top-level system field.
- Maps the remaining messages into the Anthropic messages array, converting OpenAI content (string or list of parts) into Anthropic content blocks (text, image, tool_use, tool_result).
- Translates tools from OpenAI’s {type:"function", function:{...}} shape into Anthropic’s {name, description, input_schema} shape.
- Maps tool_choice: "auto" → {type:"auto"}, "none" → {type:"none"}, "required" → {type:"any"}, {type:"function", function:{name}} → {type:"tool", name}.
- Forwards sampling parameters (temperature, top_p, max_tokens, and stop as stop_sequences) one to one. Drops parameters with no Anthropic equivalent.
- Calls Anthropic, awaits the response, and inverts the translation: content_block entries become choices[0].message.content and tool_calls; stop_reason becomes finish_reason; usage.input_tokens and usage.output_tokens become usage.prompt_tokens and usage.completion_tokens.
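The request half of the steps above can be sketched in a few lines. This is a hypothetical illustration, not the gateway's actual code: the function names translate_request and translate_tool_choice are invented here, system content is assumed to be a plain string, and the conversion of message content into Anthropic content blocks is left out for brevity.

```python
from typing import Any


def translate_tool_choice(choice: Any) -> dict:
    """Map an OpenAI tool_choice value to the Anthropic shape listed above."""
    if isinstance(choice, str):
        mapping = {"auto": "auto", "none": "none", "required": "any"}
        return {"type": mapping[choice]}
    # Explicit selection: {"type": "function", "function": {"name": ...}}
    return {"type": "tool", "name": choice["function"]["name"]}


def translate_request(body: dict) -> dict:
    """Hypothetical sketch: chat.completions body -> Anthropic Messages body."""
    # Assumption: system content is a plain string, not a list of parts.
    system_parts = [m["content"] for m in body["messages"] if m["role"] == "system"]
    out: dict = {
        "model": body["model"],
        "messages": [m for m in body["messages"] if m["role"] != "system"],
    }
    if system_parts:
        out["system"] = "\n".join(system_parts)
    if "tools" in body:
        out["tools"] = [
            {
                "name": t["function"]["name"],
                "description": t["function"].get("description", ""),
                "input_schema": t["function"]["parameters"],
            }
            for t in body["tools"]
        ]
    if "tool_choice" in body:
        out["tool_choice"] = translate_tool_choice(body["tool_choice"])
    # Sampling parameters with a direct counterpart are copied across.
    for key in ("temperature", "top_p", "max_tokens"):
        if key in body:
            out[key] = body[key]
    if "stop" in body:
        stop = body["stop"]
        out["stop_sequences"] = [stop] if isinstance(stop, str) else list(stop)
    # Anything else (seed, n, logprobs, ...) is simply not copied.
    return out
```

Because the sketch only copies known keys, the dropped fields disappear without any explicit handling.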
For streaming, Anthropic emits typed events (message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop). The gateway collapses these into a single chat.completion.chunk SSE stream where each chunk has delta.content or delta.tool_calls. The text bytes are forwarded as they arrive. Nothing is buffered. The typed event taxonomy is the only thing lost.
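A rough sketch of that collapse, for text-only streams, might look like the following. The function name collapse_events and the simplified chunk dicts are assumptions made for illustration; only the stop_sequence to stop mapping is stated in this article, and the other finish_reason mappings are plausible guesses. Tool-call deltas are omitted.

```python
from typing import Iterable, Iterator

# Assumed mapping; only "stop_sequence" -> "stop" is confirmed by the article.
FINISH_REASONS = {
    "end_turn": "stop",
    "stop_sequence": "stop",
    "max_tokens": "length",
    "tool_use": "tool_calls",
}


def collapse_events(events: Iterable[dict]) -> Iterator[dict]:
    """Hypothetical sketch: Anthropic typed events -> OpenAI-style chunk deltas."""
    for ev in events:
        kind = ev["type"]
        if kind == "message_start":
            # OpenAI streams open with a role-only delta.
            yield {"delta": {"role": "assistant"}, "finish_reason": None}
        elif kind == "content_block_delta" and ev["delta"]["type"] == "text_delta":
            # Text is forwarded as it arrives, one chunk per delta.
            yield {"delta": {"content": ev["delta"]["text"]}, "finish_reason": None}
        elif kind == "message_delta":
            reason = ev["delta"].get("stop_reason")
            yield {"delta": {}, "finish_reason": FINISH_REASONS.get(reason)}
        # content_block_start, content_block_stop and message_stop carry no
        # text, so this sketch emits nothing for them.
```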
3. Complete field compatibility matrix
The table below covers every chat.completions.create parameter you are likely to use, what Claude calls the equivalent (if any), whether the mapping is exact, and any notes you need before relying on the field in a Claude code path.
| OpenAI field | Claude equivalent | Compatible | Notes |
|---|---|---|---|
| model | model | Yes | Routing key. Pass any supported identifier (e.g. claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5). See /models. |
| messages | messages + system | Yes | role: "system" entries are lifted out and merged into Anthropic’s top-level system. Multimodal parts (image_url) are translated to Anthropic image blocks. |
| max_tokens | max_tokens | Yes | Required by Anthropic. The gateway supplies a sensible default if you omit it, but always set it explicitly in production. |
| max_completion_tokens | max_tokens | Yes | Newer OpenAI alias; treated equivalently by the gateway. |
| temperature | temperature | Mostly | Forwarded one to one. Anthropic accepts 0.0 to 1.0; values above 1.0 are clamped. Distributions differ from OpenAI even at the same value. |
| top_p | top_p | Yes | Forwarded as-is. Anthropic recommends tuning either temperature or top_p, not both. |
| top_k | top_k | Yes | OpenAI does not expose top_k, but the gateway passes it through to Anthropic if you include it via extra_body. |
| stop | stop_sequences | Mostly | String or list of strings is mapped to stop_sequences. Anthropic limits to four sequences. Whether the stop string is included in output differs from OpenAI in edge cases; trim defensively. |
| stream | stream | Mostly | Works. Chunks arrive in OpenAI chat.completion.chunk shape. Anthropic typed events are collapsed into deltas. |
| stream_options.include_usage | message_delta.usage | Yes | The final chunk carries usage when include_usage: true is set, populated from Anthropic’s terminal message_delta. |
| tools | tools | Yes | OpenAI function tools map to Anthropic tool definitions. Parameters JSON Schema is forwarded. Parallel tool calls supported. |
| tool_choice | tool_choice | Yes | auto, none, required, and explicit function selection all map to Anthropic equivalents. |
| parallel_tool_calls | disable_parallel_tool_use (inverse) | Yes | OpenAI’s parallel_tool_calls: false maps to Anthropic’s tool_choice.disable_parallel_tool_use: true. |
| response_format | tool-forced JSON | Best-effort | Claude has no native structured outputs. {type:"json_object"} is honored as a hint. {type:"json_schema"} is best-effort. Use a forced-tool pattern for guaranteed schema validity. |
| seed | (none) | Dropped | Anthropic does not expose deterministic seeding. The field is accepted and ignored. Do not rely on reproducibility. |
| n | (none) | Dropped | Anthropic returns one completion per call. To get N samples, issue N parallel calls. |
| logprobs | (none) | Dropped | Anthropic does not return per-token log probabilities. The field is accepted and ignored. |
| top_logprobs | (none) | Dropped | Same as logprobs. Not available. |
| presence_penalty | (none) | Dropped | Anthropic does not expose presence penalty. Dropped silently. |
| frequency_penalty | (none) | Dropped | Anthropic does not expose frequency penalty. Dropped silently. |
| logit_bias | (none) | Dropped | Anthropic does not expose token-level bias. Dropped silently. |
| user | (metadata) | Accepted | Accepted at the gateway for your own logging and abuse tagging. Not forwarded to Anthropic; Anthropic uses metadata.user_id instead. |
| service_tier | (none) | Dropped | OpenAI scale-tier flag has no Anthropic equivalent. |
| store | (none) | Dropped | OpenAI request-storage flag does not apply. The gateway is zero-retention regardless. |
| metadata | metadata | Mostly | Forwarded as Anthropic metadata.user_id if it contains a user_id key. |
The pattern is clean. Structural fields (messages, tools, tool_choice, stream, max_tokens) translate cleanly. Sampling fields (temperature, top_p, stop) translate but with model-specific behavior. OpenAI-only knobs (n, logprobs, presence/frequency_penalty, logit_bias, seed) are silently dropped because Anthropic has no equivalent. Structured output (response_format) is best-effort and benefits from the tool-forced pattern.
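If you would rather see the dropped fields in your own logs than rely on the gateway's, a small client-side filter is one option. The sketch below is hypothetical: the names DROPPED_ON_CLAUDE and split_params are invented here, and the set simply mirrors the "Dropped" rows of the table above.

```python
# Fields the matrix above marks as "Dropped" when the upstream is Claude.
DROPPED_ON_CLAUDE = {
    "n", "logprobs", "top_logprobs", "presence_penalty",
    "frequency_penalty", "logit_bias", "seed", "service_tier", "store",
}


def split_params(params: dict) -> tuple[dict, dict]:
    """Separate request kwargs into (kept, dropped) for a Claude code path."""
    kept = {k: v for k, v in params.items() if k not in DROPPED_ON_CLAUDE}
    dropped = {k: v for k, v in params.items() if k in DROPPED_ON_CLAUDE}
    return kept, dropped
```

A call site could log the dropped dict as a warning and pass only the kept dict to client.chat.completions.create, so that dead parameters surface in review rather than in production.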
4. Fully compatible fields
These move across with no behavioral surprises. You can leave them in your OpenAI code untouched.
- model: routing key.
- messages: including system messages, multimodal image parts, and assistant turns with prior tool calls.
- max_tokens and max_completion_tokens.
- top_p.
- tools: function definitions with JSON Schema parameters.
- tool_choice: auto, none, required, and explicit function selection.
- parallel_tool_calls.
- stream with stream_options.include_usage.
A two-minute verification script confirms the basics. Save this as verify.py and run it against a Claude model name through the gateway.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["BUZZ_API_KEY"],
    base_url="https://buzzai.cc/v1",
)

# 1. Basic completion with system prompt and max_tokens.
r = client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[
        {"role": "system", "content": "Reply with exactly one word."},
        {"role": "user", "content": "Continent of Tokyo?"},
    ],
    max_tokens=10,
    temperature=0,
)
assert r.choices[0].message.content.strip().lower().startswith("asia")
assert r.usage.prompt_tokens > 0
assert r.usage.completion_tokens > 0
print("ok: basic")

# 2. Streaming with usage.
total = ""
stream = client.chat.completions.create(
    model="claude-haiku-4-5",
    messages=[{"role": "user", "content": "Count from 1 to 5."}],
    stream=True,
    stream_options={"include_usage": True},
    max_tokens=50,
)
final_usage = None
for chunk in stream:
    # The usage-only chunk has an empty choices list, so guard before indexing.
    if chunk.choices and chunk.choices[0].delta.content:
        total += chunk.choices[0].delta.content
    if chunk.usage:
        final_usage = chunk.usage
assert "1" in total and "5" in total
assert final_usage is not None
print("ok: streaming + usage")

# 3. Tool call.
tools = [{
    "type": "function",
    "function": {
        "name": "add",
        "description": "Add two integers.",
        "parameters": {
            "type": "object",
            "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
            "required": ["a", "b"],
        },
    },
}]
r = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Use the add tool to compute 17 + 25."}],
    tools=tools,
    tool_choice="auto",
    max_tokens=200,
)
call = r.choices[0].message.tool_calls[0]
assert call.function.name == "add"
print("ok: tool call ->", call.function.arguments)
If all three checks pass, the structural translation is healthy. Anything that fails here is a real incompatibility worth reporting.
5. Partially compatible fields
These work but the behavior is subtly different. Read this section before relying on them in tight assertions or test fixtures.
stop versus stop_sequences
OpenAI accepts stop as a string or a list of up to four strings. Anthropic accepts stop_sequences as a list of up to four strings. The gateway maps your value across. Where they differ:
- Output inclusion. OpenAI typically excludes the stop string from the returned content. Anthropic stops generation when the sequence is produced and may or may not include partial bytes depending on tokenization. The safe pattern is to remove a trailing stop sequence from the response after the call. Use suffix removal such as str.removesuffix() for this, not rstrip(), which strips a set of characters and not a suffix.
- Tokenization boundaries. A stop sequence that crosses a token boundary on one model may not on another. "\n\n" is reliable. "END_OF_OUTPUT" is reliable. Short sequences like "a" are unreliable on any model.
- Finish reason. When generation halts on a stop sequence, OpenAI sets finish_reason: "stop". The gateway maps Anthropic’s stop_reason: "stop_sequence" to the same OpenAI value, so your downstream branching does not need to change.
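Defensive trimming is a few lines. The helper below is a minimal sketch (the name trim_stop is made up here). It removes one trailing stop sequence as a whole suffix. It avoids str.rstrip, which treats its argument as a set of characters and can eat legitimate trailing text.

```python
def trim_stop(text: str, stops: list[str]) -> str:
    """Remove one trailing stop sequence if the response ends with it."""
    for stop in stops:
        # Skip empty strings: every text "ends with" "", and text[:-0] is "".
        if stop and text.endswith(stop):
            return text[: -len(stop)]
    return text
```

For example, trim_stop("DEND", ["END"]) returns "D", whereas "DEND".rstrip("END") returns an empty string because every character belongs to the set {E, N, D}.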
stream chunk format
Streaming works. The chunk shape is OpenAI’s, not Anthropic’s. If your code only reads chunk.choices[0].delta.content, you will not notice. If your code expects Anthropic’s typed events, it will not see them. Specifically:
- The first chunk carries delta.role: "assistant" and no content. This matches the OpenAI shape.
- Content chunks carry delta.content. Tool-call chunks carry delta.tool_calls with incremental function.arguments deltas, exactly as OpenAI streams parallel tool calls.
- The terminal chunk has finish_reason set and an empty delta. If stream_options.include_usage is true, a follow-up chunk contains usage.
- Anthropic-specific events like content_block_start with type: "thinking" do not appear in OpenAI chunks. The thinking content, if generated, is not exposed via the OpenAI stream.
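On the client side, the incremental function.arguments deltas have to be stitched back together before they can be parsed. Here is a minimal sketch that works on plain dicts shaped like OpenAI's streamed tool_calls entries. The helper name accumulate_tool_calls is an assumption, and with the real SDK you would read the same fields as attributes on chunk.choices[0].delta.tool_calls.

```python
def accumulate_tool_calls(deltas: list[dict]) -> list[dict]:
    """Merge streamed tool-call fragments into complete calls, keyed by index."""
    calls: dict[int, dict] = {}
    for tc in deltas:
        slot = calls.setdefault(tc["index"], {"id": None, "name": None, "arguments": ""})
        # id and name normally arrive once, on the first fragment of a call.
        if tc.get("id"):
            slot["id"] = tc["id"]
        fn = tc.get("function") or {}
        if fn.get("name"):
            slot["name"] = fn["name"]
        # arguments arrive as string fragments that concatenate into JSON.
        slot["arguments"] += fn.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]
```

Only call json.loads on arguments once the stream has finished. Intermediate fragments are not valid JSON on their own.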
temperature distribution
temperature=0.7 against Claude does not produce the same distribution as temperature=0.7 against GPT-5. The gateway forwards your value untouched, but Anthropic’s sampler is not OpenAI’s. Tune temperature per model. If your code has temperature constants that were calibrated against GPT, expect to recalibrate against Claude.
response_format
OpenAI offers response_format: {type: "json_object"} and {type: "json_schema", json_schema: {...}}. The first asks for valid JSON. The second is constrained decoding that guarantees schema validity. Anthropic does not have an equivalent of constrained decoding. The gateway accepts both forms and uses them as hints. The reliable pattern for guaranteed schema validity on Claude is to define a tool whose parameters are your schema and force the call:
import json

schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
        "rating": {"type": "integer", "minimum": 1, "maximum": 5},
    },
    "required": ["title", "tags", "rating"],
}

# `client` is the OpenAI client configured with the gateway base_url.
resp = client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[{"role": "user", "content": "Categorize this article: ..."}],
    tools=[{
        "type": "function",
        "function": {
            "name": "emit_categorization",
            "description": "Emit the structured categorization.",
            "parameters": schema,
        },
    }],
    # Force the model to call this specific tool.
    tool_choice={"type": "function", "function": {"name": "emit_categorization"}},
    max_tokens=500,
)

result = json.loads(resp.choices[0].message.tool_calls[0].function.arguments)
# With the tool forced, result is expected to match `schema`;
# validate it where correctness is critical.
This pattern works on every Claude model the gateway exposes and is also forward-compatible with OpenAI: the same code path runs against gpt-5 without modification.
6. Incompatible fields (silently dropped)
These are accepted by the gateway and ignored. The call returns 200, your response shape is correct, but the field had no effect. The list:
- n: Anthropic returns one completion per call. To produce N samples, fan out N parallel calls.
- logprobs and top_logprobs: Anthropic does not return per-token log probabilities.
- presence_penalty and frequency_penalty: no Anthropic equivalent. If you have repetition problems, the right tool on Claude is the system prompt or stop_sequences.
- logit_bias: no token-level bias on Anthropic.
- seed: Anthropic does not expose a deterministic seed. temperature=0 reduces variance but does not eliminate it.
- service_tier and store: OpenAI-only. The gateway is zero-retention by default; store has no meaning here.
The gateway logs at request time when a dropped field is present so you can audit. The application response does not change. If your code depended on, say, n=5 producing five completions, the Claude code path will silently produce one. This is the most common migration footgun. Grep your codebase before flipping the URL:
rg --pcre2 '(\bn\s*[=:]\s*[2-9]|logprobs|presence_penalty|frequency_penalty|logit_bias|\bseed\s*[=:])' \
  --type py --type ts --type js
Triage each hit. Most are dead defaults from OpenAI templates and can be removed.
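For the hits where n > 1 was doing real work, the replacement is a fan-out. The sketch below is one way to do it with a thread pool. The helper name sample_n is invented, and make_call stands in for whatever zero-argument function issues one request and returns its text.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def sample_n(make_call: Callable[[], str], n: int) -> list[str]:
    """Issue n independent calls in parallel and return results in submit order."""
    if n < 1:
        raise ValueError("n must be at least 1")
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(make_call) for _ in range(n)]
        # result() re-raises any exception from the worker thread.
        return [f.result() for f in futures]
```

In application code, make_call would typically be a lambda wrapping client.chat.completions.create(...) and returning choices[0].message.content. Keep in mind that each sample is billed as a separate request with its own prompt tokens.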
7. Function calling and tools field mapping
Tools are the most important non-trivial mapping. The OpenAI shape and the Anthropic shape look different on the wire, but the gateway makes them isomorphic for normal use.
Definition mapping
OpenAI:
{
  "type": "function",
  "function": {
    "name": "search_orders",
    "description": "Search orders by customer email and date range.",
    "parameters": {
      "type": "object",
      "properties": {
        "email": {"type": "string", "format": "email"},
        "since": {"type": "string", "format": "date"},
        "until": {"type": "string", "format": "date"}
      },
      "required": ["email"]
    }
  }
}
Anthropic equivalent that the gateway emits internally:
{
  "name": "search_orders",
  "description": "Search orders by customer email and date range.",
  "input_schema": {
    "type": "object",
    "properties": {
      "email": {"type": "string", "format": "email"},
      "since": {"type": "string", "format": "date"},
      "until": {"type": "string", "format": "date"}
    },
    "required": ["email"]
  }
}
The translation is mechanical: function.parameters → input_schema, with the rest preserved. JSON Schema features supported by both providers (types, enum, required, nested objects, arrays, oneOf in limited form) round-trip. Schema features only OpenAI parses (the strict mode used in response_format) are passed through but enforced only at OpenAI; Anthropic interprets them as hints.
tool_choice mapping
| OpenAI | Anthropic |
|---|---|
| "auto" | {"type": "auto"} |
| "none" | {"type": "none"} |
| "required" | {"type": "any"} |
| {"type":"function","function":{"name":"X"}} | {"type":"tool","name":"X"} |
Tool call response
Claude returns tool calls as content_block entries of type tool_use. The gateway converts these into OpenAI tool_calls:
// OpenAI shape returned by the gateway
{
  "id": "chatcmpl-...",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "toolu_01ABC...", // Anthropic's tool_use id, preserved
        "type": "function",
        "function": {
          "name": "search_orders",
          "arguments": "{\"email\":\"a@b.co\",\"since\":\"2026-01-01\"}"
        }
      }]
    },
    "finish_reason": "tool_calls"
  }]
}
The id field is Claude’s native toolu_ identifier, preserved end-to-end. When you append the tool result back to the conversation, use that same id:
messages.append({
    "role": "tool",
    "tool_call_id": call.id,  # toolu_01ABC...
    "content": json.dumps(orders),
})
The gateway converts the OpenAI {role: "tool", tool_call_id, content} message back into an Anthropic {type: "tool_result", tool_use_id, content} block on the next turn. Round-trip preserved.
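The two conversions involved in that round trip are small enough to sketch. The function names below are hypothetical and this is not the gateway's source. It only illustrates the field renames described above, plus the one structural difference: OpenAI carries arguments as a JSON string while Anthropic's tool_use block carries a parsed input object.

```python
import json


def tool_call_to_block(call: dict) -> dict:
    """OpenAI tool_calls entry -> Anthropic tool_use content block."""
    return {
        "type": "tool_use",
        "id": call["id"],
        "name": call["function"]["name"],
        # OpenAI: JSON string. Anthropic: parsed object.
        "input": json.loads(call["function"]["arguments"]),
    }


def tool_message_to_block(msg: dict) -> dict:
    """OpenAI {role: "tool"} message -> Anthropic tool_result content block."""
    return {
        "type": "tool_result",
        "tool_use_id": msg["tool_call_id"],  # same toolu_ id, renamed field
        "content": msg["content"],
    }
```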
Parallel tool calls
Claude can return multiple tool_use blocks in a single message. The gateway emits them as multiple entries in tool_calls, in order. Your existing OpenAI parallel-tool-call loop continues to work. To disable parallel calls, set parallel_tool_calls: false; the gateway maps that to Anthropic’s tool_choice.disable_parallel_tool_use: true.
8. Code migration examples
Python (OpenAI SDK)
Before:
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
resp = client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the news."},
    ],
    temperature=0.4,
    max_tokens=500,
)
print(resp.choices[0].message.content)
After:
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["BUZZ_API_KEY"],
    base_url="https://buzzai.cc/v1",  # only this line is new
)
resp = client.chat.completions.create(
    model="claude-opus-4-8",  # and this string
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the news."},
    ],
    temperature=0.4,
    max_tokens=500,
)
print(resp.choices[0].message.content)
Node.js (openai package)
Before:
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const resp = await client.chat.completions.create({
  model: "gpt-5",
  messages: [{ role: "user", content: "Hello" }],
  max_tokens: 200,
});
console.log(resp.choices[0].message.content);
After:
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.BUZZ_API_KEY,
  baseURL: "https://buzzai.cc/v1",
});
const resp = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [{ role: "user", content: "Hello" }],
  max_tokens: 200,
});
console.log(resp.choices[0].message.content);
LangChain (Python)
Before:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-5",
    temperature=0.2,
    max_tokens=800,
)
print(llm.invoke("Plan a refactor of the auth module."))
After:
import os
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="claude-opus-4-8",
    temperature=0.2,
    max_tokens=800,
    api_key=os.environ["BUZZ_API_KEY"],
    base_url="https://buzzai.cc/v1",
)
print(llm.invoke("Plan a refactor of the auth module."))
Vercel AI SDK (TypeScript)
import { createOpenAI } from "@ai-sdk/openai";
import { generateText } from "ai";

const buzz = createOpenAI({
  apiKey: process.env.BUZZ_API_KEY!,
  baseURL: "https://buzzai.cc/v1",
});
const { text } = await generateText({
  model: buzz("claude-haiku-4-5"),
  prompt: "Write a one-line release note for v2.4.",
});
cURL (for sanity checks)
curl https://buzzai.cc/v1/chat/completions \
  -H "Authorization: Bearer $BUZZ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-8",
    "max_tokens": 200,
    "messages": [
      {"role": "user", "content": "Reply with the word READY only."}
    ]
  }'
Every example above shares the same diff: base_url changes, the model name changes, and the API key changes. Nothing else.
9. Common pitfalls
Five issues come up repeatedly during the OpenAI to Claude migration. None of them are unfixable. All of them are easier to spot when you know what to look for.
Pitfall 1: Forgetting max_tokens
OpenAI chat.completions happily accepts no max_tokens and uses a generous default. Anthropic requires max_tokens. The gateway supplies a default to keep your code working, but the default is intentionally conservative. If your prompt expects a 4000-token response and you never set max_tokens, you may see truncated output. Always set it explicitly in production.
# BAD
client.chat.completions.create(model="claude-opus-4-8", messages=[...])

# GOOD
client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[...],
    max_tokens=4000,
)
Pitfall 2: Multiple system messages
OpenAI tolerates multiple system messages interleaved anywhere in the array. Anthropic has a single top-level system field. The gateway concatenates all role: "system" entries with newlines and lifts them out. If your code prepends a system message in one place and appends another in middleware, the resulting concatenation order is the order they appear in messages. If you depended on a specific ordering or on a system message appearing mid-conversation, refactor to a single system block.
Pitfall 3: Assuming response_format guarantees JSON
On OpenAI, response_format: {type: "json_schema", json_schema: {strict: true, ...}} guarantees a parseable response that validates against the schema. On Claude, the same field is a hint. Most of the time the response is valid JSON. Some of the time it is not. If your downstream code calls json.loads() without a try/except, it will eventually crash. Use the forced-tool pattern from section 5 if validity is load-bearing.
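Where you keep response_format as a hint, wrap the parse. The helper below is a minimal sketch (the name parse_json_object is made up). It returns None without raising, so the caller can decide whether to retry, fall back, or surface an error.

```python
import json
from typing import Optional


def parse_json_object(text: Optional[str]) -> Optional[dict]:
    """Return the parsed object, or None when the reply is not a JSON object."""
    try:
        value = json.loads(text)
    except (json.JSONDecodeError, TypeError):
        # JSONDecodeError: malformed or prose-wrapped output.
        # TypeError: content was None (for example, a tool-call-only message).
        return None
    return value if isinstance(value, dict) else None
```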
Pitfall 4: Counting on seed for reproducibility
OpenAI exposes seed and a system_fingerprint for best-effort reproducibility. Anthropic exposes neither. If your test fixtures are recorded with seed=42 and you replay them against Claude expecting the same output, they will diverge. Drop the assertion or replay against a recorded mock.
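A recorded mock can be as simple as a cache keyed by a hash of the request. The class below is a hypothetical in-memory sketch (ReplayCache and get_or_record are invented names). A real fixture would persist the store to disk, but the keying idea is the same.

```python
import hashlib
import json
from typing import Any, Callable


class ReplayCache:
    """Record a response once per distinct request, then replay it."""

    def __init__(self) -> None:
        self._store: dict[str, Any] = {}

    @staticmethod
    def key(request: dict) -> str:
        # sort_keys makes the hash independent of dict insertion order.
        blob = json.dumps(request, sort_keys=True).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()

    def get_or_record(self, request: dict, call: Callable[[dict], Any]) -> Any:
        k = self.key(request)
        if k not in self._store:
            self._store[k] = call(request)  # live call happens only once
        return self._store[k]
```

Tests then assert against the replayed value, which stays stable regardless of whether the upstream model is deterministic.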
Pitfall 5: Tool result message format
The OpenAI tool-result message is {role: "tool", tool_call_id: "...", content: "..."}. The gateway maps this onto Anthropic’s {type: "tool_result", tool_use_id: "...", content: "..."}. The mistake is to send the result as a regular user message with the tool name baked into the text. That works on Claude (the model is smart enough) but breaks the structured agent loop and confuses log parsers. Always use the canonical tool message shape.
# GOOD
messages.append({
"role": "tool",
"tool_call_id": call.id,
"content": json.dumps(result),
})
# AVOID
messages.append({
"role": "user",
"content": f"Tool result: {result}",
})
10. FAQ
Is it really enough to change base_url and the model name to migrate OpenAI code to Claude?
For most application code, yes. Set base_url="https://buzzai.cc/v1", set api_key to your BUZZ key, and pass model="claude-opus-4-8" (or any other supported Claude identifier) to client.chat.completions.create. The OpenAI-compatible gateway translates chat.completions to Anthropic Messages and the response back. Streaming, tool calling, system prompts, and usage accounting all survive. Provider-native features like extended thinking and prompt caching do not, but those are opt-in and only matter at specific call sites.
Which OpenAI parameters are silently dropped when targeting Claude?
n, logprobs, top_logprobs, frequency_penalty, presence_penalty, and logit_bias have no Anthropic equivalent and are dropped during translation. response_format with json_schema is best-effort. user is accepted but not forwarded to the upstream provider. seed is not honored. The gateway returns a successful response, but those flags do not change behavior. Treat their presence as dead code in a Claude code path.
How does the stop parameter behave on Claude versus OpenAI?
Both accept a string or list of strings. The semantics are close but not identical. Anthropic’s stop_sequences match against the model’s generated text and stop generation when produced. OpenAI’s stop has the same intent but historical edge cases differ on tokenization and on whether the stop sequence is included in output. The gateway forwards your stop value to stop_sequences; trim defensively on the client if exact byte-level behavior matters.
Does streaming work the same way?
Yes. stream=True works. The chunk shape is the OpenAI chat.completion.chunk format with delta.content. Under the hood, Anthropic emits typed events (message_start, content_block_start, content_block_delta, content_block_stop, message_delta, message_stop). The gateway collapses these into OpenAI deltas. Token text is preserved exactly. The typed event taxonomy is not visible. If your code relies on those typed events, use the Anthropic SDK against the same gateway instead.
Do tool calls and function calling survive the migration?
Yes. The OpenAI tools and tool_choice parameters map cleanly onto Anthropic tool_use blocks. The gateway translates JSON Schema parameters, forwards tool definitions, and converts Claude’s tool_use response back into OpenAI tool_calls with id, type, function.name, and function.arguments fields. The full agent loop works. Parallel tool calls also map across.
Will my LangChain or LlamaIndex code keep working?
Yes. LangChain’s ChatOpenAI, LlamaIndex’s OpenAI integration, the Vercel AI SDK openai provider, and most agent frameworks accept a base_url override. Pointing them at https://buzzai.cc/v1 and changing the model name is enough. Tracing wrappers (Langfuse, Helicone, OpenTelemetry GenAI conventions) also work unchanged because the wire format is the same.
How do I handle response_format with json_schema?
Claude does not have a one-to-one equivalent of OpenAI’s structured outputs. The gateway accepts response_format and uses it as a hint, but the strict guarantee that OpenAI provides via constrained decoding is not enforced. The reliable pattern on Claude is to define a tool whose parameters match your target schema and call it with tool_choice forcing that tool. The result comes back as tool_calls[0].function.arguments and is guaranteed to validate against the schema.
Can the same key talk to both protocols?
Yes. Your BUZZ key works against the OpenAI-compatible endpoint at https://buzzai.cc/v1 and against the Anthropic Messages endpoint at https://buzzai.cc. Keep the bulk of your code on the OpenAI SDK and add the Anthropic SDK as a second client only where Claude-native features (extended thinking, prompt caching, typed streaming events) actually matter. One key, one billing surface, two protocols.
11. Where to go from here
The migration matrix is the technical answer. The practical answer is even shorter: change the URL, change the model name, run your tests, ship. The fields that translate cleanly cover the entirety of normal application code. The fields that are silently dropped were almost always cargo-culted from a tutorial and have no behavioral effect even on OpenAI for the typical workload. The handful of fields with subtle behavior shifts (stop, response_format, temperature) need a five-minute read of section 5 and a small refactor when they matter.
If you want the same key to also drive Claude Code as your terminal coding agent, one install command does it:
curl -fsSL https://buzzai.cc/sh/claudecode.sh | bash
That points the CLI at the gateway over the Anthropic protocol. Application code keeps using the OpenAI SDK against https://buzzai.cc/v1. Both bills land on the same key, the same dashboard, and the same zero-retention forwarding policy.
To get started, set base_url="https://buzzai.cc/v1" on your existing OpenAI client and pass model="claude-opus-4-8". Live model list at /models. Live pricing at /api/pricing.