BUZZ AI Gateway

Coding Assistant

A working pattern for code generation, refactoring, and explanation with Claude. It covers a system prompt template, file-system tool use (read_file / write_file / run_tests), and prompt caching of your project's README so every turn stays cheap.

POST https://buzzai.cc/v1/messages
What you get. A loop that lets Claude read your repo, propose edits, and run your test suite. Sonnet 4.6 is the default workhorse here; switch to Opus 4.7 for hard refactors and Haiku 4.5 for low-stakes pair programming.

Pick a model

| Model | When | Notes |
|---|---|---|
| claude-haiku-4-5-20251001 | Inline completions, quick rewrites, lint-style fixes | Fastest, cheapest. Pairs well with editor integrations. |
| claude-sonnet-4-6 | Default coding agent | Best price/quality balance. Use this unless you have a reason not to. |
| claude-opus-4-7 | Architectural refactors, gnarly bug hunts | Enable thinking for the hardest cases. temperature/top_p/top_k are ignored when thinking is on. |
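One way to encode this choice in code is a small lookup keyed by task type. This is only a sketch: the task labels (`inline`, `agent`, `refactor`) and the `pick_model` helper are hypothetical names chosen for illustration; only the model IDs come from the table.

```python
# Hypothetical task labels mapped to the model IDs listed in the table.
MODEL_BY_TASK = {
    "inline": "claude-haiku-4-5-20251001",  # completions, quick rewrites, lint-style fixes
    "agent": "claude-sonnet-4-6",           # default coding agent
    "refactor": "claude-opus-4-7",          # architectural refactors, hard bug hunts
}

DEFAULT_MODEL = "claude-sonnet-4-6"


def pick_model(task: str) -> str:
    """Return the model ID for a task label, falling back to the default."""
    return MODEL_BY_TASK.get(task, DEFAULT_MODEL)


print(pick_model("refactor"))  # claude-opus-4-7
print(pick_model("unknown"))   # claude-sonnet-4-6
```

Falling back to Sonnet for unknown labels mirrors the "use this unless you have a reason not to" guidance.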

System prompt template

Three things worth getting right: role (what kind of engineer), style (formatting and verbosity), boundaries (what tools to use, what not to touch).

You are a senior software engineer working inside the user's repository.

ROLE
- Write production-quality code that fits the existing style.
- Read before you write. Use the read_file tool to see surrounding code first.
- Run tests after meaningful edits. Use run_tests with the smallest scope that covers the change.

STYLE
- Match the project's formatter, naming, and import order.
- Keep diffs minimal. Do not reformat untouched code.
- Comment only when intent is non-obvious.

BOUNDARIES
- Never edit files outside the repository root.
- Never delete tests to make a build green.
- If a change touches public API, surface a one-line migration note in your final reply.
- Stop and ask if the user's request is ambiguous about behavior.

OUTPUT
- Final messages should explain WHAT you changed and WHY, not paste full files.
- Cite file paths with line ranges when discussing existing code.

Define the tools

The minimum useful set is three tools: a reader, a writer, and a test runner. Tool inputs are described with JSON Schema; Claude fills in the arguments and you implement the side effects.

{
  "tools": [
    {
      "name": "read_file",
      "description": "Read a UTF-8 text file from the project, optionally a line range. Returns the raw content.",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {"type": "string", "description": "Path relative to the repo root."},
          "start_line": {"type": "integer", "minimum": 1},
          "end_line": {"type": "integer", "minimum": 1}
        },
        "required": ["path"]
      }
    },
    {
      "name": "write_file",
      "description": "Overwrite a UTF-8 file. Create parent directories if needed. Returns the new byte size.",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {"type": "string"},
          "content": {"type": "string"}
        },
        "required": ["path", "content"]
      }
    },
    {
      "name": "run_tests",
      "description": "Run the project's test suite. Pattern selects a subset (e.g. 'tests/auth/').",
      "input_schema": {
        "type": "object",
        "properties": {
          "pattern": {"type": "string"}
        }
      }
    }
  ]
}
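Because your code performs the side effects, it can be worth checking the arguments Claude supplies before acting on them. The sketch below is a deliberately small checker, not a full JSON Schema validator: it handles only `required`, the `string` and `integer` types, and `minimum`, which is all the schemas above use. The `check_input` name is hypothetical.

```python
# Minimal argument check for the tool schemas above.
# Covers only "required", primitive "type", and numeric "minimum".
PY_TYPES = {"string": str, "integer": int}


def check_input(schema: dict, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the input passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required field: {key}")
    for key, value in args.items():
        spec = schema.get("properties", {}).get(key)
        if spec is None:
            continue  # JSON Schema allows extra properties by default
        expected = PY_TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected and (not isinstance(value, expected) or isinstance(value, bool)):
            problems.append(f"{key}: expected {spec['type']}")
            continue
        if "minimum" in spec and isinstance(value, (int, float)) and value < spec["minimum"]:
            problems.append(f"{key}: must be >= {spec['minimum']}")
    return problems


READ_FILE_SCHEMA = {
    "type": "object",
    "properties": {
        "path": {"type": "string"},
        "start_line": {"type": "integer", "minimum": 1},
        "end_line": {"type": "integer", "minimum": 1},
    },
    "required": ["path"],
}

print(check_input(READ_FILE_SCHEMA, {"path": "app/server.py", "start_line": 1}))  # []
print(check_input(READ_FILE_SCHEMA, {"start_line": 0}))
# ['missing required field: path', 'start_line: must be >= 1']
```

Returning the problems as a tool_result with is_error set gives Claude a chance to correct the call instead of crashing the loop.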

Cache the project README

Your README, architecture notes, and coding conventions don't change between turns. Put them in the system array as a cached block: the first call writes the cache, and every subsequent turn that hits the cache reads those tokens at 1/10 the input price.

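# Python: readme_text and arch_text are strings you load from the repo.
system = [
    {"type": "text", "text": "You are a senior software engineer..."},
    {
        "type": "text",
        "text": "<PROJECT_README>\n" + readme_text + "\n</PROJECT_README>\n" +
                "<ARCHITECTURE_NOTES>\n" + arch_text + "\n</ARCHITECTURE_NOTES>",
        # Marks the end of the cached prefix; everything up to here is reused.
        "cache_control": {"type": "ephemeral"},
    },
]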

Verified BUZZ behavior on a 20K-token system block:

| Call | input_tokens | cache_creation | cache_read |
|---|---|---|---|
| 1 (cold) | ~20 | ~20000 | 0 |
| 2+ (warm) | ~20 | 0 | ~20000 |
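To see what these numbers mean for the bill, the sketch below converts a usage triple into "uncached input token" units. The read multiplier is the 1/10 figure stated above. The write multiplier of 1.25 is an assumption based on Anthropic's published rate for the default 5-minute cache, so check it against your BUZZ price list. In the API response these counters appear as `input_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens` on the usage object.

```python
# Relative input cost of one call, measured in uncached-input-token units.
READ_MULTIPLIER = 0.1    # from the 1/10 read price stated above
WRITE_MULTIPLIER = 1.25  # ASSUMPTION: 5-minute cache-write rate; verify for your account


def input_cost_units(input_tokens: int, cache_creation: int, cache_read: int) -> float:
    """Weight each token bucket by its price multiplier and sum them."""
    return (
        input_tokens
        + cache_creation * WRITE_MULTIPLIER
        + cache_read * READ_MULTIPLIER
    )


cold = input_cost_units(20, 20_000, 0)         # first call writes the cache
warm = input_cost_units(20, 0, 20_000)         # later calls read it
uncached = input_cost_units(20_020, 0, 0)      # same request with no caching at all
print(round(cold), round(warm), round(uncached))  # 25020 2020 20020
```

Under these assumptions the cold call costs somewhat more than an uncached one, and the cache pays for itself from the second call onward.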

Full working example

"""
Coding assistant: Claude reads files, writes files, and runs tests.
Requires: pip install anthropic
"""
import os, subprocess, pathlib
from anthropic import Anthropic

REPO_ROOT = pathlib.Path(".").resolve()

client = Anthropic(
    base_url="https://buzzai.cc",
    api_key=os.environ["BUZZ_API_KEY"],
)

SYSTEM_PROMPT = """You are a senior software engineer working inside the user's repository.
Read before you write. Run tests after meaningful edits. Keep diffs minimal."""

TOOLS = [
    {
        "name": "read_file",
        "description": "Read a UTF-8 text file from the project.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "start_line": {"type": "integer", "minimum": 1},
                "end_line": {"type": "integer", "minimum": 1},
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Overwrite a UTF-8 file. Returns new byte size.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}, "content": {"type": "string"}},
            "required": ["path", "content"],
        },
    },
    {
        "name": "run_tests",
        "description": "Run the project's test suite. Optional pattern to scope.",
        "input_schema": {
            "type": "object",
            "properties": {"pattern": {"type": "string"}},
        },
    },
]


def safe_path(p: str) -> pathlib.Path:
    full = (REPO_ROOT / p).resolve()
    if REPO_ROOT not in full.parents and full != REPO_ROOT:
        raise ValueError(f"path escapes repo: {p}")
    return full


def execute_tool(name: str, args: dict) -> str:
    if name == "read_file":
        # Decode explicitly as UTF-8 rather than relying on the platform default.
        text = safe_path(args["path"]).read_text(encoding="utf-8")
        if "start_line" in args or "end_line" in args:
            lines = text.splitlines()
            s = args.get("start_line", 1) - 1
            e = args.get("end_line", len(lines))
            text = "\n".join(lines[s:e])
        return text
    if name == "write_file":
        path = safe_path(args["path"])
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(args["content"], encoding="utf-8")
        return f"wrote {path.stat().st_size} bytes"
    if name == "run_tests":
        cmd = ["pytest", "-x", "-q"]
        if args.get("pattern"):
            cmd.append(args["pattern"])
        out = subprocess.run(cmd, capture_output=True, text=True, cwd=REPO_ROOT)
        return f"exit={out.returncode}\n{out.stdout}\n{out.stderr}"
    raise ValueError(f"unknown tool: {name}")


def build_system():
    readme = (REPO_ROOT / "README.md").read_text() if (REPO_ROOT / "README.md").exists() else ""
    return [
        {"type": "text", "text": SYSTEM_PROMPT},
        {
            "type": "text",
            "text": f"<PROJECT_README>\n{readme}\n</PROJECT_README>",
            "cache_control": {"type": "ephemeral"},
        },
    ]


def chat(user_request: str, max_turns: int = 12):
    messages = [{"role": "user", "content": user_request}]
    for turn in range(max_turns):
        resp = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            system=build_system(),
            tools=TOOLS,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": resp.content})

        if resp.stop_reason != "tool_use":
            for block in resp.content:
                if block.type == "text":
                    print(block.text)
            return

        tool_results = []
        for block in resp.content:
            if block.type == "tool_use":
                try:
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })
                except Exception as e:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"ERROR: {e}",
                        "is_error": True,
                    })
        messages.append({"role": "user", "content": tool_results})
    print("hit max_turns without finishing")


if __name__ == "__main__":
    chat("Add a /healthz route to app/server.py and a unit test for it.")

The same loop in Node.js (run it as an ES module, since it uses import and top-level await):

// Coding assistant: Claude reads files, writes files, runs tests.
// Requires: npm i @anthropic-ai/sdk
import Anthropic from "@anthropic-ai/sdk";
import { readFile, writeFile, mkdir, stat } from "node:fs/promises";
import { existsSync } from "node:fs";
import { spawn } from "node:child_process";
import path from "node:path";

const REPO_ROOT = path.resolve(".");

const client = new Anthropic({
  baseURL: "https://buzzai.cc",
  apiKey: process.env.BUZZ_API_KEY,
});

const SYSTEM_PROMPT = `You are a senior software engineer working inside the user's repository.
Read before you write. Run tests after meaningful edits. Keep diffs minimal.`;

const TOOLS = [
  {
    name: "read_file",
    description: "Read a UTF-8 text file from the project.",
    input_schema: {
      type: "object",
      properties: {
        path: { type: "string" },
        start_line: { type: "integer", minimum: 1 },
        end_line: { type: "integer", minimum: 1 },
      },
      required: ["path"],
    },
  },
  {
    name: "write_file",
    description: "Overwrite a UTF-8 file. Returns new byte size.",
    input_schema: {
      type: "object",
      properties: { path: { type: "string" }, content: { type: "string" } },
      required: ["path", "content"],
    },
  },
  {
    name: "run_tests",
    description: "Run the project's test suite. Optional pattern to scope.",
    input_schema: {
      type: "object",
      properties: { pattern: { type: "string" } },
    },
  },
];

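function safePath(p) {
  const full = path.resolve(REPO_ROOT, p);
  // Compare against the root plus a separator so a sibling directory such as
  // "<root>-backup" does not slip through a bare prefix check.
  if (full !== REPO_ROOT && !full.startsWith(REPO_ROOT + path.sep)) {
    throw new Error(`path escapes repo: ${p}`);
  }
  return full;
}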

async function executeTool(name, args) {
  if (name === "read_file") {
    let text = await readFile(safePath(args.path), "utf8");
    if (args.start_line || args.end_line) {
      const lines = text.split("\n");
      const s = (args.start_line ?? 1) - 1;
      const e = args.end_line ?? lines.length;
      text = lines.slice(s, e).join("\n");
    }
    return text;
  }
  if (name === "write_file") {
    const full = safePath(args.path);
    await mkdir(path.dirname(full), { recursive: true });
    await writeFile(full, args.content);
    const s = await stat(full);
    return `wrote ${s.size} bytes`;
  }
  if (name === "run_tests") {
    const cmd = ["npm", ["test", "--", "--bail"]];
    if (args.pattern) cmd[1].push(args.pattern);
    return await new Promise((resolve) => {
      const child = spawn(cmd[0], cmd[1], { cwd: REPO_ROOT });
      let out = "";
      child.stdout.on("data", (d) => (out += d));
      child.stderr.on("data", (d) => (out += d));
      child.on("close", (code) => resolve(`exit=${code}\n${out}`));
    });
  }
  throw new Error(`unknown tool: ${name}`);
}

async function buildSystem() {
  const readmePath = path.join(REPO_ROOT, "README.md");
  const readme = existsSync(readmePath) ? await readFile(readmePath, "utf8") : "";
  return [
    { type: "text", text: SYSTEM_PROMPT },
    {
      type: "text",
      text: `<PROJECT_README>\n${readme}\n</PROJECT_README>`,
      cache_control: { type: "ephemeral" },
    },
  ];
}

async function chat(userRequest, maxTurns = 12) {
  const messages = [{ role: "user", content: userRequest }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const resp = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 4096,
      system: await buildSystem(),
      tools: TOOLS,
      messages,
    });
    messages.push({ role: "assistant", content: resp.content });

    if (resp.stop_reason !== "tool_use") {
      for (const b of resp.content) if (b.type === "text") console.log(b.text);
      return;
    }

    const toolResults = [];
    for (const b of resp.content) {
      if (b.type !== "tool_use") continue;
      try {
        const r = await executeTool(b.name, b.input);
        toolResults.push({ type: "tool_result", tool_use_id: b.id, content: r });
      } catch (e) {
        toolResults.push({
          type: "tool_result",
          tool_use_id: b.id,
          content: `ERROR: ${e.message}`,
          is_error: true,
        });
      }
    }
    messages.push({ role: "user", content: toolResults });
  }
  console.log("hit maxTurns without finishing");
}

await chat("Add a /healthz route to app/server.js and a unit test for it.");

Caching strategy in practice

Where you place cache_control directly controls what gets reused across turns. Order matters: Anthropic caches a prefix, so anything that changes between turns must come after the cached blocks.

Cache invalidation. Any byte change in a cached block invalidates the prefix from that block onward. To iterate quickly on the system prompt while keeping the README cached, put the README block (with its cache_control) before the prompt you're tweaking. Reordering the blocks changes the prefix, so you pay the cache-creation cost again.
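The ordering rule can be sketched as follows. `build_system_blocks` and `cached_prefix` are hypothetical helpers: the first places the stable README before the prompt being tuned, and the second is a local stand-in that shows which text sits in front of the last cache breakpoint. It does not call the API, and it assumes the tools array is also unchanged, since tools precede the system blocks in the cached prefix.

```python
def build_system_blocks(readme_text: str, prompt_under_test: str) -> list[dict]:
    """Stable content first with the cache breakpoint, volatile prompt after it."""
    return [
        {
            "type": "text",
            "text": f"<PROJECT_README>\n{readme_text}\n</PROJECT_README>",
            "cache_control": {"type": "ephemeral"},  # prefix up to here is cached
        },
        # No cache_control here: edits to this block leave the README prefix intact.
        {"type": "text", "text": prompt_under_test},
    ]


def cached_prefix(blocks: list[dict]) -> str:
    """Concatenate the text covered by the last block carrying cache_control."""
    last = max((i for i, b in enumerate(blocks) if "cache_control" in b), default=-1)
    return "".join(b["text"] for b in blocks[: last + 1])


v1 = build_system_blocks("# My project", "Be terse.")
v2 = build_system_blocks("# My project", "Be terse. Prefer small diffs.")
print(cached_prefix(v1) == cached_prefix(v2))  # True
```

With the blocks in the opposite order, the same prompt edit would change the text in front of the breakpoint and force a new cache write.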

Parallel tool calls

Claude can emit multiple tool_use blocks in a single response. Run them in parallel and return one tool_result per tool_use_id in the next user message.

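# Python: run one response's tool calls concurrently with asyncio.gather
import asyncio

async def execute_parallel(blocks):
    async def one(b):
        result = {"type": "tool_result", "tool_use_id": b.id}
        try:
            # execute_tool is blocking, so run it in a worker thread.
            result["content"] = await asyncio.to_thread(execute_tool, b.name, b.input)
        except Exception as e:
            # Report the failure for this call only; the other calls still return.
            result["content"] = f"ERROR: {e}"
            result["is_error"] = True
        return result
    # gather preserves input order, so results line up with the tool_use blocks.
    return await asyncio.gather(*(one(b) for b in blocks if b.type == "tool_use"))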
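Before sending the next user message, it can help to confirm that every tool_use id received an answer. The check below is a sketch that works on plain dicts rather than SDK objects; `missing_results` and the example ids are hypothetical.

```python
def missing_results(assistant_content: list[dict], tool_results: list[dict]) -> set[str]:
    """Return the tool_use ids that have no matching tool_result."""
    wanted = {b["id"] for b in assistant_content if b.get("type") == "tool_use"}
    answered = {
        r["tool_use_id"] for r in tool_results if r.get("type") == "tool_result"
    }
    return wanted - answered


# Example: the assistant asked for two reads but only one result was produced.
assistant_content = [
    {"type": "text", "text": "Let me look at both files."},
    {"type": "tool_use", "id": "toolu_01", "name": "read_file", "input": {"path": "a.py"}},
    {"type": "tool_use", "id": "toolu_02", "name": "read_file", "input": {"path": "b.py"}},
]
tool_results = [
    {"type": "tool_result", "tool_use_id": "toolu_01", "content": "print('a')"},
]

print(sorted(missing_results(assistant_content, tool_results)))  # ['toolu_02']
```

An empty set means the tool results cover every call and the message is safe to append.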

If you want to force parallelism off (some agents are easier to debug serially):

"tool_choice": {"type": "auto", "disable_parallel_tool_use": true}

Error handling

Three failure modes you'll hit in production:

| Failure | Recovery |
|---|---|
| Tool raises exception | Return {is_error: true, content: "ERROR: ..."}. Claude will retry, ask, or back off on its own. |
| HTTP 429 rate_limit_error | Honor the retry-after header. Exponential backoff with jitter, cap at ~60s. |
| HTTP 503 model_not_found | BUZZ-specific: no upstream channel for the model+group. Fall back to a less-restricted model (often claude-sonnet-4-6) or contact support. |
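The 429 recovery can be sketched as a small retry wrapper. This is an illustration under stated assumptions, not the gateway's or the SDK's own retry logic: `RateLimited` is a hypothetical stand-in for whatever exception your client raises on a 429, and `backoff_delay` and `call_with_retry` are made-up helper names. The delay uses a server-provided retry-after value when present, and otherwise exponential backoff with full jitter capped at 60 seconds.

```python
import random
import time


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before the next try; `attempt` is 0-based."""
    if retry_after is not None:
        return float(retry_after)          # honor the server's hint as given
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling                 # full jitter: uniform in [0, ceiling)


class RateLimited(Exception):
    """Hypothetical stand-in for a client's 429 error; carries retry-after."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_retry(fn, max_attempts=5, sleep=time.sleep):
    """Call fn(), retrying on RateLimited until max_attempts is reached."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise                      # out of attempts: surface the error
            sleep(backoff_delay(attempt, err.retry_after))
```

In the chat loop you would wrap the `client.messages.create(...)` call in `call_with_retry` and translate your client's rate-limit exception into the retry-after value. Passing `sleep` and `rng` as parameters keeps the timing logic testable without real waits.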

See also