Claude Opus 4.8 Is Live on BUZZ: Model ID, Pricing, and Migration Notes

Claude Opus 4.8 is now routable through BUZZ AI Gateway. You do not need a new account, a new endpoint, or a new key — change the model string to claude-opus-4-8 and you are on the latest Opus. This post covers the exact identifier, the pay-per-token price, how prompt caching behaves across the version bump, and the one thing to watch when you switch.

Published 2026-05-29 · Reading time ~6 min

Model identifier: claude-opus-4-8
Input: $0.20 / 1M tokens
Output: $1.00 / 1M tokens
Endpoint & key: 0 changes

The identifier

The model name is claude-opus-4-8. As with every Claude model, Anthropic uses hyphens, not dots — claude-opus-4.8 is not a valid identifier and will return a 404 model not found. If you see that error after switching, check for a stray dot first.
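A small pre-flight check can catch the stray-dot mistake before a request ever leaves your code. The following is a minimal sketch, not part of BUZZ or any SDK: the helper name normalize_model_id and the regular expression are assumptions made for illustration.

```python
import re

# Hypothetical helper: matches a trailing version written with a dot, e.g. "4.8".
_DOTTED_VERSION = re.compile(r"(\d+)\.(\d+)$")


def normalize_model_id(model: str) -> str:
    """Return the model id with a trailing dotted version rewritten to hyphens."""
    return _DOTTED_VERSION.sub(r"\1-\2", model.strip())


print(normalize_model_id("claude-opus-4.8"))  # claude-opus-4-8
print(normalize_model_id("claude-opus-4-8"))  # claude-opus-4-8
```

Whether to silently rewrite the identifier or raise an error instead is a design choice; failing loudly is usually safer in configuration code.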

BUZZ exposes Opus 4.8 on both interfaces it already serves:

Anthropic Messages at https://buzzai.cc/v1/messages — used by Claude Code and the official anthropic SDK.
OpenAI-compatible at https://buzzai.cc/v1/chat/completions — used by the openai SDK and anything that speaks the chat.completions schema.

The full live model list is always published at https://buzzai.cc/models. If a name appears there, your key can reach it.

Calling it

Anthropic Messages (curl):

curl https://buzzai.cc/v1/messages \
  -H "x-api-key: $BUZZ_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-opus-4-8",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain a B-tree in two sentences."}]
  }'

OpenAI SDK (Python). Only base_url, api_key, and model differ from a stock OpenAI call:

from openai import OpenAI

client = OpenAI(
    base_url="https://buzzai.cc/v1",
    api_key="YOUR_BUZZ_KEY",
)

resp = client.chat.completions.create(
    model="claude-opus-4-8",
    messages=[{"role": "user", "content": "Explain a B-tree in two sentences."}],
)
print(resp.choices[0].message.content)

Claude Code — point it at BUZZ once and ask for Opus 4.8 like any other model:

export ANTHROPIC_BASE_URL=https://buzzai.cc
export ANTHROPIC_AUTH_TOKEN=YOUR_BUZZ_KEY
# then select claude-opus-4-8 in Claude Code's model picker

Pricing

Opus 4.8 is billed purely per token, with no monthly fee and no minimum spend. The same billing scheme applies to every other model on BUZZ.

Model	Input / 1M	Output / 1M
Claude Opus 4.8	$0.20	$1.00
Claude Sonnet 4.6	$0.12	$0.60
Claude Haiku 4.5	$0.04	$0.20

Prompt cache reads and writes are billed automatically using Anthropic's official discount multipliers. On a workload with a stable, reused prefix — long system prompts, large tool schemas, retrieved documents — the effective input cost lands far below the headline rate, because most of the prefix is served as cache reads rather than fresh input. See the prompt caching playbook for how to structure requests so the cache actually hits.
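To get a feel for how much a reused prefix changes the bill, the arithmetic can be sketched as below. The per-token prices come from the table in this section. The cache multipliers (0.10x the input rate for reads, 1.25x for writes) are assumptions based on Anthropic's published figures for the default cache lifetime and should be checked against current pricing. The token counts are invented for illustration.

```python
# Prices from the pricing table, in USD per million tokens.
INPUT_PER_M = 0.20
OUTPUT_PER_M = 1.00

# Assumed multipliers relative to the base input rate; verify before relying on them.
CACHE_READ_MULT = 0.10
CACHE_WRITE_MULT = 1.25


def request_cost(fresh_in: int, cache_read: int, cache_write: int, out: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    input_units = fresh_in + cache_read * CACHE_READ_MULT + cache_write * CACHE_WRITE_MULT
    return (input_units * INPUT_PER_M + out * OUTPUT_PER_M) / 1_000_000


# Illustrative numbers: a 50,000-token reused prefix plus a 500-token question.
cold = request_cost(fresh_in=500, cache_read=0, cache_write=50_000, out=1_000)
warm = request_cost(fresh_in=500, cache_read=50_000, cache_write=0, out=1_000)
print(f"cold: ${cold:.4f}  warm: ${warm:.4f}")
```

Under these assumptions the warm request costs roughly a sixth of the cold one, with the output tokens making up about half of the warm total.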

Upgrading from 4.7: what changes (almost nothing)

The version bump is intentionally boring. Here is the full list of what you touch:

Concern	Change required
Model string	`claude-opus-4-7` → `claude-opus-4-8`
Endpoint / base URL	None
API key	None
Request / response schema	None
Prompt cache markers (`cache_control`)	None
Tool-use / function-calling blocks	None
Streaming (SSE) handling	None
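One way to keep the upgrade to a single string is to read the model name from configuration rather than hard-coding it at each call site. The sketch below assumes a hypothetical environment variable named BUZZ_MODEL and a hypothetical build_request helper; neither is defined by BUZZ.

```python
import os

# Hypothetical environment variable name; any config mechanism works.
MODEL = os.environ.get("BUZZ_MODEL", "claude-opus-4-8")


def build_request(prompt: str, model: str = MODEL) -> dict:
    """Build a Messages-style request body; only `model` changes across versions."""
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }


old = build_request("Explain a B-tree in two sentences.", model="claude-opus-4-7")
new = build_request("Explain a B-tree in two sentences.", model="claude-opus-4-8")
changed = [key for key in new if new[key] != old[key]]
print(changed)  # ['model']
```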

The one thing to watch: cache warm-up

Prompt cache entries are scoped per model. A prefix you cached under claude-opus-4-7 does not carry over to claude-opus-4-8 — the first request on the new model is a cache miss and pays the full input rate for that prefix. After that first warm-up pass, cache reads resume normally on the new model.
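The per-model scoping can be pictured with a toy lookup keyed on both the model name and a hash of the prefix. This is only a mental model to show why the same prefix misses once per model; it makes no claim about how the cache is actually implemented, and the send function is hypothetical.

```python
import hashlib

# Toy model only: a set of (model, prefix-hash) keys standing in for the real cache.
_cache: set[tuple[str, str]] = set()


def send(model: str, prefix: str) -> str:
    """Return 'hit' if this model has seen the prefix before, else store it and return 'miss'."""
    key = (model, hashlib.sha256(prefix.encode()).hexdigest())
    if key in _cache:
        return "hit"
    _cache.add(key)
    return "miss"


prefix = "You are a helpful assistant. <long system prompt>"
results = [
    send("claude-opus-4-7", prefix),  # first ever request
    send("claude-opus-4-7", prefix),  # warm on 4.7
    send("claude-opus-4-8", prefix),  # same prefix, new model
    send("claude-opus-4-8", prefix),  # warm on 4.8
]
print(results)  # ['miss', 'hit', 'miss', 'hit']
```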

Practical implication: if you run an A/B between 4.7 and 4.8, each side maintains its own cache, so budget for two warm-up passes rather than one. For a clean cutover, just switch the model string everywhere at once and accept a single warm-up cycle.

If you swap models mid-session and notice cache_read_input_tokens drop to zero for a request or two, that is expected. It climbs back as the new model's cache fills.
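To watch the warm-up happen, you can compute the share of input tokens served from cache on each response. The sketch below assumes the usage object follows the Anthropic Messages shape (input_tokens, cache_read_input_tokens, cache_creation_input_tokens); responses from the OpenAI-compatible interface may name these fields differently. The sample payloads are made up.

```python
def cache_read_share(usage: dict) -> float:
    """Fraction of input tokens served from cache, given an Anthropic-style usage dict."""
    read = usage.get("cache_read_input_tokens", 0)
    written = usage.get("cache_creation_input_tokens", 0)
    fresh = usage.get("input_tokens", 0)
    total = read + written + fresh
    return read / total if total else 0.0


# Made-up usage payloads for illustration.
before_switch = {"input_tokens": 500, "cache_read_input_tokens": 49_500, "cache_creation_input_tokens": 0}
first_on_new = {"input_tokens": 500, "cache_read_input_tokens": 0, "cache_creation_input_tokens": 49_500}

print(cache_read_share(before_switch))  # 0.99
print(cache_read_share(first_on_new))   # 0.0
```

A share that stays at zero for more than a request or two after the switch suggests the prefix itself is changing between calls, not just the model.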

When to reach for Opus 4.8 vs Sonnet 4.6

Opus is the heavyweight: deeper reasoning, better at long multi-step agentic loops, stronger on hard code and analysis. Sonnet 4.6 costs 60% of the Opus rate ($0.12 versus $0.20 per million input tokens, $0.60 versus $1.00 per million output tokens) and is the right default for most chat, drafting, and routine coding. A common production pattern is to route the bulk of traffic to Sonnet and escalate only the hard requests to Opus 4.8. Because BUZZ uses one key and one endpoint for both, that routing is a single string in your own code, not a second integration.
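The routing pattern can be as small as one function that returns a model string. The sketch below is a hypothetical heuristic, not a recommendation: the Sonnet identifier is assumed from the hyphenated naming convention and should be confirmed at https://buzzai.cc/models, and the length threshold and needs_tools flag are invented for illustration.

```python
OPUS = "claude-opus-4-8"
SONNET = "claude-sonnet-4-6"  # assumed identifier; confirm on the live model list


def pick_model(prompt: str, needs_tools: bool = False, max_easy_chars: int = 4_000) -> str:
    """Hypothetical heuristic: escalate long or tool-heavy requests to Opus."""
    if needs_tools or len(prompt) > max_easy_chars:
        return OPUS
    return SONNET


print(pick_model("Draft a two-line commit message."))
print(pick_model("Refactor this module across five files.", needs_tools=True))
```

In practice the escalation signal is more likely to come from task type or a failed first attempt than from prompt length alone.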

Q1: What is the model identifier for Claude Opus 4.8?

It is claude-opus-4-8. Hyphens, not dots — claude-opus-4.8 returns a 404. Pass it as the model parameter against https://buzzai.cc/v1 (OpenAI-compatible) or https://buzzai.cc (Anthropic Messages).

Q2: Do I need to change anything to upgrade from 4.7?

Only the model string. Endpoint, key, request schema, prompt cache markers, and tool-use blocks are all unchanged. Replace claude-opus-4-7 with claude-opus-4-8 and you are done.

Q3: How much does Opus 4.8 cost?

Pay-per-token, no monthly fee, no minimum. Input is $0.20 per million tokens, output is $1.00 per million. Prompt cache hits are billed automatically at Anthropic's official discount multipliers.

Q4: Does prompt caching work with Opus 4.8?

Yes, with the same cache_control markers as the rest of the current Claude family. Cache is scoped per model, so expect one warm-up pass after switching from 4.7.

Q5: Can I call Opus 4.8 with the OpenAI SDK?

Yes. Set base_url="https://buzzai.cc/v1", set api_key to your BUZZ key, and pass model="claude-opus-4-8". The same openai client that calls gpt-5 can call Opus 4.8 by changing two strings.

Try Opus 4.8 now

Sign up, top up the balance, copy your key, and send the first request to claude-opus-4-8 in under a minute. Pay only for the tokens you use.

Last reviewed: 2026-05-29