Coding Assistant
A working pattern for code generation, refactoring, and explanation with Claude. It covers a system prompt template, file-system tool use (read_file / write_file / run_tests), and prompt caching of your project's README so every turn stays cheap.
Pick a model
| Model | When | Notes |
|---|---|---|
| claude-haiku-4-5-20251001 | Inline completions, quick rewrites, lint-style fixes | Fastest, cheapest. Pairs well with editor integrations. |
| claude-sonnet-4-6 | Default coding agent | Best price/quality balance. Use this unless you have a reason not to. |
| claude-opus-4-7 | Architectural refactors, gnarly bug hunts | Enable thinking for the hardest cases. temperature/top_p/top_k are ignored when thinking is on. |
System prompt template
Three things worth getting right: role (what kind of engineer), style (formatting and verbosity), boundaries (what tools to use, what not to touch).
You are a senior software engineer working inside the user's repository.
ROLE
- Write production-quality code that fits the existing style.
- Read before you write. Use the read_file tool to see surrounding code first.
- Run tests after meaningful edits. Use run_tests with the smallest scope that covers the change.
STYLE
- Match the project's formatter, naming, and import order.
- Keep diffs minimal. Do not reformat untouched code.
- Comment only when intent is non-obvious.
BOUNDARIES
- Never edit files outside the repository root.
- Never delete tests to make a build green.
- If a change touches public API, surface a one-line migration note in your final reply.
- Stop and ask if the user's request is ambiguous about behavior.
OUTPUT
- Final messages should explain WHAT you changed and WHY, not paste full files.
- Cite file paths with line ranges when discussing existing code.
Define the tools
The minimum useful set is three tools: a reader, a writer, and a test runner. Each tool's input is described with JSON Schema; Claude fills in the arguments and you implement the side effects.
{
  "tools": [
    {
      "name": "read_file",
      "description": "Read a UTF-8 text file from the project, optionally a line range. Returns the raw content.",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {"type": "string", "description": "Path relative to the repo root."},
          "start_line": {"type": "integer", "minimum": 1},
          "end_line": {"type": "integer", "minimum": 1}
        },
        "required": ["path"]
      }
    },
    {
      "name": "write_file",
      "description": "Overwrite a UTF-8 file. Create parent directories if needed. Returns the new byte size.",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {"type": "string"},
          "content": {"type": "string"}
        },
        "required": ["path", "content"]
      }
    },
    {
      "name": "run_tests",
      "description": "Run the project's test suite. Pattern selects a subset (e.g. 'tests/auth/').",
      "input_schema": {
        "type": "object",
        "properties": {
          "pattern": {"type": "string"}
        }
      }
    }
  ]
}
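Because Claude fills in the tool arguments, it can help to check them against the schema before running any side effects. The following is a minimal sketch, not a full JSON Schema validator: `validate_tool_input` and `READ_FILE_SCHEMA` are hypothetical names, and the code only understands the `required`, `type` (string or integer), and `minimum` keywords that appear in the definitions above.

```python
# Hypothetical copy of the read_file input schema, for illustration only.
READ_FILE_SCHEMA = {
    "type": "object",
    "properties": {
        "path": {"type": "string"},
        "start_line": {"type": "integer", "minimum": 1},
        "end_line": {"type": "integer", "minimum": 1},
    },
    "required": ["path"],
}

_PY_TYPES = {"string": str, "integer": int}


def validate_tool_input(schema: dict, args: dict) -> list:
    """Return a list of problems; an empty list means the input passed these checks."""
    problems = []
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required field: {key}")
    for key, value in args.items():
        spec = schema.get("properties", {}).get(key)
        if spec is None:
            continue  # unknown fields are ignored in this sketch
        expected = _PY_TYPES.get(spec.get("type"))
        if expected is None:
            continue  # schema type not covered by this sketch
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"{key}: expected {spec['type']}")
        elif expected is int and "minimum" in spec and value < spec["minimum"]:
            problems.append(f"{key}: must be >= {spec['minimum']}")
    return problems
```

A non-empty result could be sent back to Claude as a tool_result with is_error set, so the model can correct its own call.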
Cache the project README
Your README, architecture notes, and coding conventions don't change between turns. Put them in the system array as a cached block: the first call writes the cache, and every subsequent turn that arrives while the cache entry is still live reads it at 1/10 the input price.
"system": [
  {"type": "text", "text": "You are a senior software engineer..."},
  {
    "type": "text",
    "text": "<PROJECT_README>\n" + readme_text + "\n</PROJECT_README>\n" +
            "<ARCHITECTURE_NOTES>\n" + arch_text + "\n</ARCHITECTURE_NOTES>",
    "cache_control": {"type": "ephemeral"}
  }
]
Verified BUZZ behavior on a 20K-token system block:
| Call | input_tokens | cache_creation_input_tokens | cache_read_input_tokens |
|---|---|---|---|
| 1 (cold) | ~20 | ~20000 | 0 |
| 2+ (warm) | ~20 | 0 | ~20000 |
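To put the table in cost terms, the usage counters can be turned into a rough input cost. This is a sketch under stated assumptions: `estimate_input_cost` is a hypothetical helper, the price of 3.00 per million input tokens is a placeholder, and the 1.25x cache-write multiplier is an assumption taken from Anthropic's published pricing for the default cache lifetime, which BUZZ may or may not mirror. Only the 0.10x read multiplier comes from the text above.

```python
def estimate_input_cost(usage: dict, price_per_mtok: float,
                        write_multiplier: float = 1.25,  # assumption, see lead-in
                        read_multiplier: float = 0.10) -> float:
    """Estimate the input-side cost of one call from its usage counters."""
    uncached = usage.get("input_tokens", 0)
    written = usage.get("cache_creation_input_tokens", 0)
    read = usage.get("cache_read_input_tokens", 0)
    # Weight each token class by its price relative to a normal input token.
    weighted = uncached + written * write_multiplier + read * read_multiplier
    return weighted * price_per_mtok / 1_000_000


# Usage counters mirroring the cold and warm rows of the table.
cold = {"input_tokens": 20, "cache_creation_input_tokens": 20000, "cache_read_input_tokens": 0}
warm = {"input_tokens": 20, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 20000}

PLACEHOLDER_PRICE = 3.00  # hypothetical price per million input tokens
cold_cost = estimate_input_cost(cold, PLACEHOLDER_PRICE)  # about 0.075
warm_cost = estimate_input_cost(warm, PLACEHOLDER_PRICE)  # about 0.006
```

Under these assumptions the cold call costs slightly more than an uncached call would, and the saving shows up from the second turn onward.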
Full working example
"""
Coding assistant: Claude reads files, writes files, and runs tests.
Requires: pip install anthropic
"""
import os
import pathlib
import subprocess

from anthropic import Anthropic

REPO_ROOT = pathlib.Path(".").resolve()

client = Anthropic(
    base_url="https://buzzai.cc",
    api_key=os.environ["BUZZ_API_KEY"],
)

SYSTEM_PROMPT = """You are a senior software engineer working inside the user's repository.
Read before you write. Run tests after meaningful edits. Keep diffs minimal."""

TOOLS = [
    {
        "name": "read_file",
        "description": "Read a UTF-8 text file from the project.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "start_line": {"type": "integer", "minimum": 1},
                "end_line": {"type": "integer", "minimum": 1},
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Overwrite a UTF-8 file. Returns new byte size.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}, "content": {"type": "string"}},
            "required": ["path", "content"],
        },
    },
    {
        "name": "run_tests",
        "description": "Run the project's test suite. Optional pattern to scope.",
        "input_schema": {
            "type": "object",
            "properties": {"pattern": {"type": "string"}},
        },
    },
]

def safe_path(p: str) -> pathlib.Path:
    # Resolve symlinks and ".." first, then require the result to stay inside the repo.
    full = (REPO_ROOT / p).resolve()
    if REPO_ROOT not in full.parents and full != REPO_ROOT:
        raise ValueError(f"path escapes repo: {p}")
    return full

def execute_tool(name: str, args: dict) -> str:
    if name == "read_file":
        # The tool contract says UTF-8, so don't rely on the platform default encoding.
        text = safe_path(args["path"]).read_text(encoding="utf-8")
        if "start_line" in args or "end_line" in args:
            lines = text.splitlines()
            s = args.get("start_line", 1) - 1  # 1-based inclusive -> 0-based slice start
            e = args.get("end_line", len(lines))
            text = "\n".join(lines[s:e])
        return text
    if name == "write_file":
        path = safe_path(args["path"])
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(args["content"], encoding="utf-8")
        return f"wrote {path.stat().st_size} bytes"
    if name == "run_tests":
        cmd = ["pytest", "-x", "-q"]
        if args.get("pattern"):
            cmd.append(args["pattern"])
        out = subprocess.run(cmd, capture_output=True, text=True, cwd=REPO_ROOT)
        return f"exit={out.returncode}\n{out.stdout}\n{out.stderr}"
    raise ValueError(f"unknown tool: {name}")

def build_system():
    readme = (REPO_ROOT / "README.md").read_text() if (REPO_ROOT / "README.md").exists() else ""
    return [
        {"type": "text", "text": SYSTEM_PROMPT},
        {
            "type": "text",
            "text": f"<PROJECT_README>\n{readme}\n</PROJECT_README>",
            "cache_control": {"type": "ephemeral"},
        },
    ]

def chat(user_request: str, max_turns: int = 12):
    messages = [{"role": "user", "content": user_request}]
    for turn in range(max_turns):
        resp = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            system=build_system(),
            tools=TOOLS,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": resp.content})
        if resp.stop_reason != "tool_use":
            # No more tool calls: print the final answer and stop.
            for block in resp.content:
                if block.type == "text":
                    print(block.text)
            return
        tool_results = []
        for block in resp.content:
            if block.type == "tool_use":
                try:
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })
                except Exception as e:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"ERROR: {e}",
                        "is_error": True,
                    })
        # Every tool_use block gets exactly one tool_result in the next user message.
        messages.append({"role": "user", "content": tool_results})
    print("hit max_turns without finishing")

if __name__ == "__main__":
    chat("Add a /healthz route to app/server.py and a unit test for it.")
// Coding assistant: Claude reads files, writes files, runs tests.
// Requires: npm i @anthropic-ai/sdk
import Anthropic from "@anthropic-ai/sdk";
import { readFile, writeFile, mkdir, stat } from "node:fs/promises";
import { existsSync } from "node:fs";
import { spawn } from "node:child_process";
import path from "node:path";
const REPO_ROOT = path.resolve(".");
const client = new Anthropic({
  baseURL: "https://buzzai.cc",
  apiKey: process.env.BUZZ_API_KEY,
});
const SYSTEM_PROMPT = `You are a senior software engineer working inside the user's repository.
Read before you write. Run tests after meaningful edits. Keep diffs minimal.`;
const TOOLS = [
  {
    name: "read_file",
    description: "Read a UTF-8 text file from the project.",
    input_schema: {
      type: "object",
      properties: {
        path: { type: "string" },
        start_line: { type: "integer", minimum: 1 },
        end_line: { type: "integer", minimum: 1 },
      },
      required: ["path"],
    },
  },
  {
    name: "write_file",
    description: "Overwrite a UTF-8 file. Returns new byte size.",
    input_schema: {
      type: "object",
      properties: { path: { type: "string" }, content: { type: "string" } },
      required: ["path", "content"],
    },
  },
  {
    name: "run_tests",
    description: "Run the project's test suite. Optional pattern to scope.",
    input_schema: {
      type: "object",
      properties: { pattern: { type: "string" } },
    },
  },
];
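function safePath(p) {
  const full = path.resolve(REPO_ROOT, p);
  // Compare against REPO_ROOT plus a separator: a bare startsWith(REPO_ROOT)
  // would also accept sibling directories such as "<repo>-backup".
  if (full !== REPO_ROOT && !full.startsWith(REPO_ROOT + path.sep)) {
    throw new Error(`path escapes repo: ${p}`);
  }
  return full;
}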
async function executeTool(name, args) {
  if (name === "read_file") {
    let text = await readFile(safePath(args.path), "utf8");
    if (args.start_line || args.end_line) {
      const lines = text.split("\n");
      const s = (args.start_line ?? 1) - 1; // 1-based inclusive -> 0-based slice start
      const e = args.end_line ?? lines.length;
      text = lines.slice(s, e).join("\n");
    }
    return text;
  }
  if (name === "write_file") {
    const full = safePath(args.path);
    await mkdir(path.dirname(full), { recursive: true });
    await writeFile(full, args.content);
    const s = await stat(full);
    return `wrote ${s.size} bytes`;
  }
  if (name === "run_tests") {
    const cmd = ["npm", ["test", "--", "--bail"]];
    if (args.pattern) cmd[1].push(args.pattern);
    return await new Promise((resolve, reject) => {
      const child = spawn(cmd[0], cmd[1], { cwd: REPO_ROOT });
      let out = "";
      child.stdout.on("data", (d) => (out += d));
      child.stderr.on("data", (d) => (out += d));
      // Surface spawn failures (e.g. npm not on PATH) as a tool error
      // instead of an unhandled 'error' event.
      child.on("error", reject);
      child.on("close", (code) => resolve(`exit=${code}\n${out}`));
    });
  }
  throw new Error(`unknown tool: ${name}`);
}
async function buildSystem() {
  const readmePath = path.join(REPO_ROOT, "README.md");
  const readme = existsSync(readmePath) ? await readFile(readmePath, "utf8") : "";
  return [
    { type: "text", text: SYSTEM_PROMPT },
    {
      type: "text",
      text: `<PROJECT_README>\n${readme}\n</PROJECT_README>`,
      cache_control: { type: "ephemeral" },
    },
  ];
}
async function chat(userRequest, maxTurns = 12) {
  const messages = [{ role: "user", content: userRequest }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const resp = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 4096,
      system: await buildSystem(),
      tools: TOOLS,
      messages,
    });
    messages.push({ role: "assistant", content: resp.content });
    if (resp.stop_reason !== "tool_use") {
      // No more tool calls: print the final answer and stop.
      for (const b of resp.content) if (b.type === "text") console.log(b.text);
      return;
    }
    const toolResults = [];
    for (const b of resp.content) {
      if (b.type !== "tool_use") continue;
      try {
        const r = await executeTool(b.name, b.input);
        toolResults.push({ type: "tool_result", tool_use_id: b.id, content: r });
      } catch (e) {
        toolResults.push({
          type: "tool_result",
          tool_use_id: b.id,
          content: `ERROR: ${e.message}`,
          is_error: true,
        });
      }
    }
    messages.push({ role: "user", content: toolResults });
  }
  console.log("hit maxTurns without finishing");
}

await chat("Add a /healthz route to app/server.js and a unit test for it.");
Caching strategy in practice
Where you place cache_control directly controls what gets reused across turns. Order matters: Anthropic caches a prefix, so anything that changes between turns must come after the cached blocks.
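- System block 1: stable instructions (role, style, boundaries). Cached as part of the prefix that ends at the cache_control breakpoint on block 2.
- System block 2: project context (README, architecture notes). Carries the cache_control breakpoint.
- Tools array: stable across turns. Tools come before the system blocks in the cached prefix, so the same breakpoint covers them; changing any tool definition invalidates the cache.
- Messages: per-turn user input + tool results. Do not cache.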
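One way to catch accidental cache-busting is to fingerprint the stable prefix on each turn and compare it with the previous one. The sketch below assumes that any change in the serialized tools or system blocks invalidates the cached prefix; `prefix_fingerprint` is a hypothetical helper, not part of the SDK, and it is a local debugging aid only. Whether the server actually served from cache is still reported by the usage counters in the response.

```python
import hashlib
import json


def prefix_fingerprint(tools: list, system: list) -> str:
    """Hash the parts of the request that are meant to stay identical across turns."""
    # Key order is left untouched on purpose, so a reordering also shows up as a change.
    payload = json.dumps({"tools": tools, "system": system}, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Illustrative inputs; the real values would be the tools and system arrays you send.
example_tools = [{"name": "read_file", "input_schema": {"type": "object"}}]
system_v1 = [
    {"type": "text", "text": "You are a senior software engineer..."},
    {"type": "text", "text": "<PROJECT_README>\nv1\n</PROJECT_README>",
     "cache_control": {"type": "ephemeral"}},
]
system_v2 = [
    system_v1[0],
    {"type": "text", "text": "<PROJECT_README>\nv2\n</PROJECT_README>",
     "cache_control": {"type": "ephemeral"}},
]

same = prefix_fingerprint(example_tools, system_v1) == prefix_fingerprint(example_tools, system_v1)
changed = prefix_fingerprint(example_tools, system_v1) != prefix_fingerprint(example_tools, system_v2)
```

Logging a warning whenever the fingerprint changes mid-conversation makes it easier to spot a timestamp or other per-turn value that slipped into a cached block.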
Parallel tool calls
Claude can emit multiple tool_use blocks in a single response. Run them in parallel and return one tool_result per tool_use_id in the next user message.
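# Python: parallel tool calls with asyncio.gather
import asyncio

async def execute_parallel(blocks):
    async def one(b):
        result = {"type": "tool_result", "tool_use_id": b.id}
        try:
            # execute_tool is synchronous, so run it in a worker thread.
            result["content"] = await asyncio.to_thread(execute_tool, b.name, b.input)
        except Exception as e:
            # Report the failure for this call only; the other calls still complete.
            result["content"] = f"ERROR: {e}"
            result["is_error"] = True
        return result
    # gather preserves input order, so results line up with the tool_use blocks.
    return await asyncio.gather(*(one(b) for b in blocks if b.type == "tool_use"))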
If you want to force parallelism off (some agents are easier to debug serially):
"tool_choice": {"type": "auto", "disable_parallel_tool_use": true}
Error handling
Three failure modes you'll hit in production:
| Failure | Recovery |
|---|---|
| Tool raises exception | Return {is_error: true, content: "ERROR: ..."}. Claude will retry, ask, or back off on its own. |
| HTTP 429 rate_limit_error | Honor the retry-after header. Exponential backoff with jitter, cap at ~60s. |
| HTTP 503 model_not_found | BUZZ-specific: no upstream channel for the model+group. Fall back to a less-restricted model (often claude-sonnet-4-6) or contact support. |
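The 429 row can be sketched as a small retry helper. This is an illustration under assumptions, not the SDK's built-in retry logic: `backoff_delay`, `call_with_retry`, and the `classify` callback are hypothetical names, and the caller is expected to supply `classify` so that it returns whether an exception is retryable along with any retry-after value (in seconds) taken from the response headers.

```python
import random
import time
from typing import Callable, Optional, Tuple


def backoff_delay(attempt: int, retry_after: Optional[float] = None,
                  base: float = 1.0, cap: float = 60.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # The server's hint wins over our own schedule.
        return max(float(retry_after), 0.0)
    # Exponential growth capped at `cap`, with full jitter in [0, ceiling).
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


def call_with_retry(fn: Callable[[], object],
                    classify: Callable[[Exception], Tuple[bool, Optional[float]]],
                    max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep):
    """Call fn(), retrying while classify(exc) reports the failure as retryable."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            retryable, retry_after = classify(exc)
            if not retryable or attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt, retry_after))
```

Passing `sleep` and `rng` as parameters keeps the helper easy to test without real delays; in production the defaults apply.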
See also
- Recipe: Agent Loop — the multi-turn loop in depth, with iteration and budget controls
- Prompt Caching concept
- Tool Use concept
- POST /v1/messages reference