Claude Opus 4.8 Is Live on BUZZ: Model ID, Pricing, and Migration Notes
Claude Opus 4.8 is now routable through BUZZ AI Gateway. You do not need a new account, a new endpoint, or a new key — change the model string to claude-opus-4-8 and you are on the latest Opus. This post covers the exact identifier, the pay-per-token price, how prompt caching behaves across the version bump, and the one thing to watch when you switch.
The identifier
The model name is claude-opus-4-8. As with every Claude model, Anthropic uses hyphens, not dots — claude-opus-4.8 is not a valid identifier and will return a 404 model not found. If you see that error after switching, check for a stray dot first.
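A stray dot can be caught before the request leaves your code. The following is a minimal sketch, assuming the only rule that matters is "hyphens between version digits"; the helper names `normalize_model_id` and `check_model_id` are hypothetical and not part of any SDK.

```python
import re

# Matches a dot that sits between two digits, e.g. the "." in "4.8".
# Assumption: this is an illustrative check, not an official identifier grammar.
_VERSION_DOT = re.compile(r"(?<=\d)\.(?=\d)")


def normalize_model_id(name: str) -> str:
    """Return the name with dots between version digits replaced by hyphens."""
    return _VERSION_DOT.sub("-", name)


def check_model_id(name: str) -> str:
    """Return the name unchanged, or raise if it contains a version dot."""
    fixed = normalize_model_id(name)
    if fixed != name:
        raise ValueError(f"{name!r} looks wrong; did you mean {fixed!r}?")
    return name
```

Calling `check_model_id` on the configured model string at startup turns a runtime 404 into an immediate, readable error.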
BUZZ exposes Opus 4.8 on both interfaces it already serves:
- Anthropic Messages at https://buzzai.cc/v1/messages, used by Claude Code and the official anthropic SDK.
- OpenAI-compatible at https://buzzai.cc/v1/chat/completions, used by the openai SDK and anything that speaks the chat.completions schema.
The full live model list is always published at https://buzzai.cc/models. If a name appears there, your key can reach it.
Calling it
Anthropic Messages (curl):
curl https://buzzai.cc/v1/messages \
  -H "x-api-key: $BUZZ_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-8",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain a B-tree in two sentences."}]
  }'
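The same Messages request can be built from Python without any SDK. This is a sketch using only the standard library; the URL, headers, and body mirror the curl call above, while the function names `build_messages_request` and `send` are hypothetical. It assumes BUZZ returns the standard Anthropic Messages response shape, which is not confirmed here.

```python
import json
import urllib.request

BUZZ_MESSAGES_URL = "https://buzzai.cc/v1/messages"


def build_messages_request(prompt: str, api_key: str,
                           max_tokens: int = 1024) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = {
        "model": "claude-opus-4-8",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        BUZZ_MESSAGES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


# Usage (performs a real, billed request):
#   reply = send(build_messages_request("Explain a B-tree in two sentences.", key))
#   print(reply["content"][0]["text"])  # assumes the standard Messages response shape
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any tokens are billed.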
OpenAI SDK (Python) — only base_url and model differ from a stock OpenAI call:
from openai import OpenAI

client = OpenAI(
    base_url="https://buzzai.cc/v1",
    api_key="YOUR_BUZZ_KEY",
)
resp = client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[{"role": "user", "content": "Explain a B-tree in two sentences."}],
)
print(resp.choices[0].message.content)
Claude Code — point it at BUZZ once and ask for Opus 4.8 like any other model:
export ANTHROPIC_BASE_URL=https://buzzai.cc
export ANTHROPIC_AUTH_TOKEN=YOUR_BUZZ_KEY
# then select claude-opus-4-8 in Claude Code's model picker
Pricing
Opus 4.8 is billed purely per token, with no monthly fee and no minimum spend. This is the same billing scheme that applies to every other model on BUZZ.
| Model | Input (USD per 1M tokens) | Output (USD per 1M tokens) |
|---|---|---|
| Claude Opus 4.8 | $0.20 | $1.00 |
| Claude Sonnet 4.6 | $0.12 | $0.60 |
| Claude Haiku 4.5 | $0.04 | $0.20 |
Prompt cache reads and writes are billed automatically using Anthropic's official discount multipliers. On a workload with a stable, reused prefix — long system prompts, large tool schemas, retrieved documents — the effective input cost lands far below the headline rate, because most of the prefix is served as cache reads rather than fresh input. See the prompt caching playbook for how to structure requests so the cache actually hits.
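To see how far a cached prefix moves the bill, here is a rough estimator. The per-token rates come from the pricing table above. The cache multipliers (0.10x for reads, 1.25x for writes) are assumptions based on Anthropic's published figures for the default short-lived cache; verify them against current documentation before relying on the numbers. `estimate_cost` is a hypothetical helper.

```python
# USD per 1M tokens (input, output), copied from the pricing table above.
PRICES = {
    "Claude Opus 4.8": (0.20, 1.00),
    "Claude Sonnet 4.6": (0.12, 0.60),
    "Claude Haiku 4.5": (0.04, 0.20),
}

# Assumed multipliers relative to the base input rate; not stated in this post.
CACHE_READ_MULT = 0.10
CACHE_WRITE_MULT = 1.25


def estimate_cost(model: str, fresh_input: int, output: int,
                  cache_read: int = 0, cache_write: int = 0) -> float:
    """Estimate the USD cost of one request from its token counts."""
    in_rate, out_rate = PRICES[model]
    # Convert cached tokens into "input-token equivalents" via the multipliers.
    input_equiv = (fresh_input
                   + cache_read * CACHE_READ_MULT
                   + cache_write * CACHE_WRITE_MULT)
    return (input_equiv * in_rate + output * out_rate) / 1_000_000


uncached = estimate_cost("Claude Opus 4.8", fresh_input=50_500, output=1_000)
cached = estimate_cost("Claude Opus 4.8", fresh_input=500, output=1_000,
                       cache_read=50_000)
print(f"uncached: ${uncached:.4f}  cached: ${cached:.4f}")
```

Under these assumed multipliers, a request with a 50,000-token cached prefix, 500 fresh input tokens, and 1,000 output tokens comes to about $0.0021 on Opus 4.8, against about $0.0111 with no cache.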
Upgrading from 4.7: what changes (almost nothing)
The version bump is intentionally boring. Here is the full list of what you touch:
| Concern | Change required |
|---|---|
| Model string | claude-opus-4-7 → claude-opus-4-8 |
| Endpoint / base URL | None |
| API key | None |
| Request / response schema | None |
| Prompt cache markers (cache_control) | None |
| Tool-use / function-calling blocks | None |
| Streaming (SSE) handling | None |
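In code, the table above reduces to a one-field change. The sketch below assumes requests are plain dicts in the Messages schema; `upgrade_request` is a hypothetical helper, and the example payload is illustrative only.

```python
import copy

OLD_MODEL = "claude-opus-4-7"
NEW_MODEL = "claude-opus-4-8"


def upgrade_request(payload: dict) -> dict:
    """Return a copy of the payload with only the model string swapped."""
    upgraded = copy.deepcopy(payload)  # leave the caller's dict untouched
    if upgraded.get("model") == OLD_MODEL:
        upgraded["model"] = NEW_MODEL
    return upgraded


old = {
    "model": "claude-opus-4-7",
    "max_tokens": 1024,
    # cache_control markers pass through unchanged
    "system": [{"type": "text", "text": "You are a reviewer.",
                "cache_control": {"type": "ephemeral"}}],
    "messages": [{"role": "user", "content": "Review this diff."}],
}
new = upgrade_request(old)
```

Every key other than `model` is identical between `old` and `new`, which is the whole migration.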
The one thing to watch: cache warm-up
Prompt cache entries are scoped per model. A prefix you cached under claude-opus-4-7 does not carry over to claude-opus-4-8 — the first request on the new model is a cache miss and pays the full input rate for that prefix. After that first warm-up pass, cache reads resume normally on the new model.
Practical implication: if you run an A/B between 4.7 and 4.8, each side maintains its own cache, so budget for two warm-up passes rather than one. For a clean cutover, just switch the model string everywhere at once and accept a single warm-up cycle.
If you swap models mid-session and notice cache_read_input_tokens drop to zero for a request or two, that is expected. It climbs back as the new model's cache fills.
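A small check on the response's usage block makes the warm-up visible in logs. This sketch assumes BUZZ passes through the standard Anthropic usage fields (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`) unchanged; the function names are hypothetical.

```python
def cache_read_share(usage: dict) -> float:
    """Fraction of input-side tokens that were served from the cache."""
    read = usage.get("cache_read_input_tokens") or 0
    written = usage.get("cache_creation_input_tokens") or 0
    fresh = usage.get("input_tokens") or 0
    total = read + written + fresh
    return read / total if total else 0.0


def is_warming_up(usage: dict) -> bool:
    """True when the request wrote to the cache but read nothing from it."""
    read = usage.get("cache_read_input_tokens") or 0
    written = usage.get("cache_creation_input_tokens") or 0
    return written > 0 and read == 0


# Illustrative usage blocks: first request on the new model, then a later one.
first = {"input_tokens": 500, "cache_creation_input_tokens": 50_000,
         "cache_read_input_tokens": 0}
later = {"input_tokens": 500, "cache_creation_input_tokens": 0,
         "cache_read_input_tokens": 49_500}
```

If `is_warming_up` stays true for more than a request or two after a cutover, the prefix is probably changing between calls rather than the cache still filling.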
When to reach for Opus 4.8 vs Sonnet 4.6
Opus is the heavyweight: deeper reasoning, better at long multi-step agentic loops, stronger on hard code and analysis. Per the pricing table, Sonnet 4.6 costs 60% of the Opus 4.8 rate on both input and output ($0.12 vs $0.20 per million input tokens), and it is the right default for most chat, drafting, and routine coding. A common production pattern is to route the bulk of traffic to Sonnet and escalate only the hard requests to Opus 4.8. Because BUZZ uses one key and one endpoint for both, that routing is a single string in your own code, not a second integration.
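The escalation pattern can be as small as a function that returns a model string. In this sketch, the task labels and the `pick_model` helper are hypothetical, and `claude-sonnet-4-6` is an assumed identifier that follows the hyphen convention; check https://buzzai.cc/models for the exact name.

```python
DEFAULT_MODEL = "claude-sonnet-4-6"  # assumed identifier, verify on the models page
HARD_MODEL = "claude-opus-4-8"

# Hypothetical task labels your application would assign upstream.
HARD_TASKS = {"agentic_loop", "hard_code", "deep_analysis"}


def pick_model(task_type: str, escalate: bool = False) -> str:
    """Route to Opus for hard tasks or explicit escalation, else to Sonnet."""
    if escalate or task_type in HARD_TASKS:
        return HARD_MODEL
    return DEFAULT_MODEL
```

The returned string goes straight into the `model` field of either interface, so the same client object serves both tiers.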
Q1: What is the model identifier for Claude Opus 4.8?
It is claude-opus-4-8. Use hyphens, not dots: claude-opus-4.8 returns a 404. Pass it as the model parameter with the base URL set to https://buzzai.cc/v1 (OpenAI-compatible) or https://buzzai.cc (Anthropic Messages).
Q2: Do I need to change anything to upgrade from 4.7?
Only the model string. Endpoint, key, request schema, prompt cache markers, and tool-use blocks are all unchanged. Replace claude-opus-4-7 with claude-opus-4-8 and you are done.
Q3: How much does Opus 4.8 cost?
Pay-per-token, no monthly fee, no minimum. Input is $0.20 per million tokens, output is $1.00 per million. Prompt cache hits are billed automatically at Anthropic's official discount multipliers.
Q4: Does prompt caching work with Opus 4.8?
Yes, with the same cache_control markers as the rest of the current Claude family. Cache is scoped per model, so expect one warm-up pass after switching from 4.7.
Q5: Can I call Opus 4.8 with the OpenAI SDK?
Yes. Set base_url="https://buzzai.cc/v1", set api_key to your BUZZ key, and pass model="claude-opus-4-8". The same openai client that calls gpt-5 can call Opus 4.8 by changing two strings.
Try Opus 4.8 now
Sign up, top up the balance, copy your key, and send the first request to claude-opus-4-8 in under a minute. Pay only for the tokens you use.
Last reviewed: 2026-05-29