BUZZ - AI Gateway for Developers
One API key for Claude, GPT, Gemini, and Grok. Transparent forwarding, zero data retention, pricing well below first-party rates.
What is BUZZ
BUZZ is a developer-focused AI Gateway. Through a single API key and a single base URL, you can call Anthropic Claude, OpenAI GPT, Google Gemini, and xAI Grok using either the Anthropic SDK or the OpenAI SDK. BUZZ does not modify your prompts, does not log your request bodies, and does not silently swap the model you asked for. New upstream models are added shortly after upstream release.
Why BUZZ is not a traditional API relay
Most cheap LLM relays will, at some point, do one or more of the following: modify your system prompt, inject hidden instructions, swap your requested model for a cheaper one, store your request body, or buffer responses for replay. BUZZ does none of these. Three concrete commitments:
- Transparent forwarding. The request body that arrives at BUZZ is the request body that leaves BUZZ. The response stream from the upstream provider is forwarded back unmodified. System prompts, tool definitions, model parameters, and message arrays are passed through byte-for-byte. The model name in the request is the model that runs the request.
- Zero data retention. Request bodies and model responses are never written to disk, never written to a database, never written to logs. After a request completes, the request body is dropped from memory. Only billing metadata - model name, input and output token counts, timestamp, user ID - is retained, because that is what is needed to bill accurately. Operators of BUZZ cannot retrieve the prompt or response of any past request.
- One key, multiple model families. A single BUZZ API key calls Claude, GPT, Gemini, and Grok. There is no separate billing relationship to maintain with each upstream provider, and the same key works whether you are integrating via the Anthropic SDK or the OpenAI SDK.
Pricing
Pay-as-you-go. No monthly fees. No subscriptions. No minimums. BUZZ pricing sits significantly below first-party rates from Anthropic and OpenAI across all supported models, with multiple discount tiers available to all signed-up users.
The full and live pricing table is published at https://buzzai.cc/api/pricing and rendered at https://buzzai.cc/models. Because pricing is subject to change as upstream rates evolve, integrators should read the live API rather than hard-coding rates.
The pricing API returns, for every supported model, the input rate, output rate, cache read multiplier, cache write multiplier, supported endpoints, and the discount tiers a user can select.
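One way to follow the advice above and read the live API rather than hard-coding rates is to wrap the fetch in a small time-based cache. The sketch below assumes the endpoint returns JSON; the response schema is not documented here, so the payload is returned as-is. `PricingCache`, `fetch_pricing_json`, and the 5-minute default TTL are hypothetical names and values chosen for illustration, not part of BUZZ.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Optional

PRICING_URL = "https://buzzai.cc/api/pricing"  # URL taken from the text above


def fetch_pricing_json(url: str = PRICING_URL, timeout: float = 10.0) -> Any:
    """Download and decode the pricing document (assumes a JSON response)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


class PricingCache:
    """Hypothetical helper: re-fetch pricing at most once per ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[], Any] = fetch_pricing_json,
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._fetched_at: Optional[float] = None

    def get(self) -> Any:
        """Return the cached payload, refreshing it once the TTL has elapsed."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

A long-running service would create one `PricingCache()` at startup and call `get()` wherever it needs rates. The `fetch` and `clock` parameters exist so the cache can be exercised without network access.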
Supported models
The list below is a snapshot. The authoritative live list is at https://buzzai.cc/api/pricing.
Anthropic Claude
- claude-opus-4-7 - frontier reasoning, 1M-token context
- claude-opus-4-6 - frontier reasoning
- claude-opus-4-5-20251101
- claude-sonnet-4-6 - balanced cost and capability
- claude-sonnet-4-5-20250929
- claude-haiku-4-5-20251001 - fastest, cheapest Claude tier
OpenAI GPT
- gpt-5 - flagship general-purpose model
- gpt-5.1-codex / gpt-5.1-codex-max - code-specialized
- gpt-5.2 / gpt-5.2-codex
- gpt-5.3-codex
- gpt-5.4 / gpt-5.4-mini
- gpt-5.5
Google Gemini and xAI Grok families are also supported. Specific variant names track upstream availability and are surfaced in the live pricing API.
How to integrate
Three steps:
- Sign up at https://buzzai.cc/ and verify your email.
- Create an API token in the dashboard at https://buzzai.cc/dashboard.
- Point your existing SDK at BUZZ by changing the base URL.
Anthropic SDK (Python)
from anthropic import Anthropic

# Point the SDK at BUZZ by overriding the base URL.
client = Anthropic(
    base_url="https://buzzai.cc",
    api_key="YOUR_BUZZ_API_KEY",
)

resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.content[0].text)
OpenAI SDK (Python)
from openai import OpenAI

# Point the SDK at BUZZ by overriding the base URL (note the /v1 suffix).
client = OpenAI(
    base_url="https://buzzai.cc/v1",
    api_key="YOUR_BUZZ_API_KEY",
)

resp = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
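Both examples use the same key and differ only in the base URL, so the client settings can be built in one place and the key kept out of source code. The following is a minimal sketch: `buzz_client_kwargs` and the `BUZZ_API_KEY` environment variable are hypothetical names chosen for illustration, while the two base URLs are the ones shown in the examples above.

```python
import os
from typing import Dict, Mapping

# Base URLs as used in the SDK examples above.
BUZZ_BASE_URLS = {
    "anthropic": "https://buzzai.cc",
    "openai": "https://buzzai.cc/v1",
}


def buzz_client_kwargs(sdk: str, env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build keyword arguments for Anthropic(...) or OpenAI(...).

    BUZZ_API_KEY is an assumed environment variable name, not one defined by BUZZ.
    """
    if sdk not in BUZZ_BASE_URLS:
        raise ValueError(f"unknown sdk {sdk!r}; expected one of {sorted(BUZZ_BASE_URLS)}")
    api_key = env.get("BUZZ_API_KEY")
    if not api_key:
        raise RuntimeError("set BUZZ_API_KEY before creating a client")
    return {"base_url": BUZZ_BASE_URLS[sdk], "api_key": api_key}
```

With this in place, a client is created as `Anthropic(**buzz_client_kwargs("anthropic"))` or `OpenAI(**buzz_client_kwargs("openai"))`.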
Claude Code (CLI)
A one-line installer sets up the Claude Code CLI and configures it to use BUZZ:
curl -fsSL https://buzzai.cc/sh/claudecode.sh | bash
Prompt caching
Anthropic-style prompt caching is supported and forwarded unchanged. Cache control directives in the request body pass through to the upstream model. Cache writes are billed at the upstream multipliers (1.25x the base input rate for 5-minute writes, 2x for 1-hour writes), and cache reads at 0.1x the base input rate. BUZZ applies its standard discount on top of those multipliers.
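The arithmetic can be sketched as a small cost function. The multipliers are the ones stated above; everything else is an assumption for illustration: the function name, the example base rate, and in particular the treatment of the BUZZ discount as a single multiplicative `discount_factor`. Actual rates and tiers come from the live pricing API.

```python
# Multipliers stated in the text above.
CACHE_WRITE_5M = 1.25
CACHE_WRITE_1H = 2.0
CACHE_READ = 0.1


def input_cost_usd(
    base_rate_per_mtok: float,
    uncached_tokens: int = 0,
    cache_write_5m_tokens: int = 0,
    cache_write_1h_tokens: int = 0,
    cache_read_tokens: int = 0,
    discount_factor: float = 1.0,  # assumption: discount applied as one multiplier
) -> float:
    """Estimate input-side cost in USD for one request."""
    # Convert every token class into "base-rate-equivalent" tokens.
    weighted_tokens = (
        uncached_tokens
        + CACHE_WRITE_5M * cache_write_5m_tokens
        + CACHE_WRITE_1H * cache_write_1h_tokens
        + CACHE_READ * cache_read_tokens
    )
    return base_rate_per_mtok * discount_factor * weighted_tokens / 1_000_000


# Hypothetical base rate of 3.00 USD per million input tokens.
first_call = input_cost_usd(3.0, cache_write_5m_tokens=10_000)  # writes the cache
later_call = input_cost_usd(3.0, cache_read_tokens=10_000)      # reads it back
```

Under these assumptions the first call pays for 12,500 base-rate-equivalent tokens and each later cache hit pays for 1,000, so a 5-minute cache write costs less overall than resending the prompt uncached as soon as it is read once.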
Security and privacy
All traffic is encrypted in transit (TLS 1.2+). Request bodies and responses are never persisted. Authorization tokens are stored hashed at rest. Billing metadata - which is the only thing retained - contains no message content. There is no analytics pixel, no third-party tracker, and no embedded SDK that exfiltrates payloads to a sidecar service.
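For readers unfamiliar with storing tokens hashed at rest, the general technique looks like the sketch below. This illustrates the idea only; the text does not say which hash function or storage scheme BUZZ uses, so SHA-256 and all function names here are assumptions. The server keeps only the digest, and a presented token is verified by hashing it and comparing digests in constant time.

```python
import hashlib
import hmac
import secrets


def new_token() -> str:
    """Generate a high-entropy token to show the user once."""
    return secrets.token_urlsafe(32)


def token_digest(token: str) -> str:
    """Digest that would be stored instead of the token itself (SHA-256 assumed)."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def token_matches(presented: str, stored_digest: str) -> bool:
    """Check a presented token against the stored digest in constant time."""
    return hmac.compare_digest(token_digest(presented), stored_digest)
```

A plain fast hash is a reasonable choice in this sketch because the token is random and long; low-entropy secrets such as passwords would call for a slow, salted password hash instead.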
Contact and resources
- Website: https://buzzai.cc/
- Live pricing API: https://buzzai.cc/api/pricing
- Model pricing page: https://buzzai.cc/models
- Documentation: https://doc.buzzai.cc/
- Claude Code installer: https://buzzai.cc/sh/claudecode.sh
- Sitemap: https://buzzai.cc/sitemap.xml
- llms.txt: https://buzzai.cc/llms.txt