BUZZ AI Gateway

Coding Assistant

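Working patterns for using Claude to generate, refactor, and explain code. This page covers a system prompt template, file-system Tool Use (read_file / write_file / run_tests), and caching the project README in system so that every turn is cheaper.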

POST https://buzzai.cc/v1/messages
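What you get. A loop in which Claude reads your repository, edits files, and runs tests. Sonnet 4.6 is the everyday workhorse; switch to Opus 4.7 for complex refactors and to Haiku 4.5 for lightweight pair programming.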

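Choosing a model

Model | Best for | Notes
claude-haiku-4-5-20251001 | Inline editor completion, quick rewrites, style fixes | Fastest and cheapest. A good fit for editor integrations.
claude-sonnet-4-6 | Default coding agent | The balanced price/performance workhorse. Pick it unless you have a specific reason not to.
claude-opus-4-7 | Architecture-level refactors, tricky bugs | Enable thinking for the hardest cases. With thinking enabled, temperature/top_p/top_k are ignored.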

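System prompt template

Three things must be spelled out: the role (what kind of engineer), the style (format and level of detail), and the boundaries (which tools to use and what not to touch).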

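You are a senior software engineer working inside the user's repository.

ROLE
- Write production-grade code that matches the existing style.
- Read before you write. Use the read_file tool to look at the surrounding code before changing anything.
- Run the tests after every meaningful edit. Keep the run_tests scope as narrow as you can.

STYLE
- Stay consistent with the project's formatter, naming, and import order.
- Keep diffs as small as possible. Do not reformat code you did not change.
- Comment only where the intent is not obvious.

BOUNDARIES
- Never modify files outside the repository root.
- Never delete tests just to make the build pass.
- When a change touches a public API, include a one-line migration note in the final reply.
- When the user's request is ambiguous, stop and ask.

OUTPUT
- In the final reply, say what changed and why. Do not paste whole files.
- When discussing existing code, cite the file path and line range.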

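Defining tools

The minimal useful set is three tools: read, write, and run tests. Tool inputs are described with JSON Schema; Claude fills in the values, and you execute the side effects.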

{
  "tools": [
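    {
      "name": "read_file",
      "description": "Read a UTF-8 text file inside the project, optionally restricted to a line range. Returns the raw content.",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {"type": "string", "description": "Path relative to the repository root."},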
          "start_line": {"type": "integer", "minimum": 1},
          "end_line": {"type": "integer", "minimum": 1}
        },
        "required": ["path"]
      }
    },
    {
      "name": "write_file",
      "description": "Overwrite a UTF-8 file, creating parent directories if they do not exist. Returns the new size in bytes.",
      "input_schema": {
        "type": "object",
        "properties": {
          "path": {"type": "string"},
          "content": {"type": "string"}
        },
        "required": ["path", "content"]
      }
    },
    {
      "name": "run_tests",
      "description": "Run the project's test suite. The optional pattern narrows the scope (for example 'tests/auth/').",
      "input_schema": {
        "type": "object",
        "properties": {
          "pattern": {"type": "string"}
        }
      }
    }
  ]
}
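
Because Claude fills in the values and your code performs the side effects, it can help to check each tool_use input against the tool's input_schema before executing it. The sketch below is a deliberately small checker that covers only the schema features used on this page (required, type, minimum). validate_input and READ_FILE_SCHEMA are hypothetical names, not part of any SDK, and a full JSON Schema validator would be the more complete choice.

```python
# Minimal input check for the schema features used above: required, type, minimum.
# validate_input is a hypothetical helper, not part of the Anthropic SDK.
JSON_TYPES = {"string": str, "integer": int}

READ_FILE_SCHEMA = {
    "type": "object",
    "properties": {
        "path": {"type": "string"},
        "start_line": {"type": "integer", "minimum": 1},
        "end_line": {"type": "integer", "minimum": 1},
    },
    "required": ["path"],
}


def validate_input(schema, args):
    """Return a list of problems; an empty list means the input is acceptable."""
    problems = []
    properties = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required field: {key}")
    for key, value in args.items():
        spec = properties.get(key)
        if spec is None:
            continue  # unknown fields are tolerated in this sketch
        expected = JSON_TYPES.get(spec.get("type"))
        if expected is not None:
            wrong_type = not isinstance(value, expected)
            # bool is a subclass of int in Python, so reject it explicitly.
            if expected is int and isinstance(value, bool):
                wrong_type = True
            if wrong_type:
                problems.append(f"{key}: expected {spec['type']}")
                continue
        if "minimum" in spec and isinstance(value, (int, float)) and value < spec["minimum"]:
            problems.append(f"{key}: must be >= {spec['minimum']}")
    return problems
```

If the list is non-empty, one option is to skip execution and send the problems back as a tool_result with is_error set to true, so Claude can correct the call.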

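Caching the README in system

READMEs, architecture notes, and coding conventions do not change between turns. Put them in the system array and add cache_control: the first call writes the cache, and every later turn reads from it at 1/10 the price of regular input tokens.

"system": [
  {"type": "text", "text": "You are a senior software engineer..."},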
  {
    "type": "text",
    "text": "<PROJECT_README>\n" + readme_text + "\n</PROJECT_README>\n" +
            "<ARCHITECTURE_NOTES>\n" + arch_text + "\n</ARCHITECTURE_NOTES>",
    "cache_control": {"type": "ephemeral"}
  }
]

Measured on BUZZ with a 20K-token system block:

Call | input_tokens | cache_creation | cache_read
1 (cold) | ~20 | ~20000 | 0
2+ (warm) | ~20 | 0 | ~20000
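
To see what those numbers mean for cost, the arithmetic can be sketched as follows. It uses the 1/10 read price stated above. As additional assumptions not stated on this page, it bills the cache write at 1.25x the regular input price and assumes every call after the first hits the cache; substitute the multipliers from your own price list. relative_input_cost is a hypothetical helper.

```python
def relative_input_cost(system_tokens, turns, write_mult=1.25, read_mult=0.1):
    """Cost of the cached system block over `turns` calls,
    relative to sending the same block uncached on every call."""
    if turns < 1:
        raise ValueError("turns must be >= 1")
    # Uncached: the full block is billed as regular input on every call.
    uncached = system_tokens * turns
    # Cached: one write on the cold call, then one read per warm call.
    cached = system_tokens * write_mult + system_tokens * read_mult * (turns - 1)
    return cached / uncached
```

Under those assumptions, a 20K-token block over 12 turns costs roughly 0.2 of the uncached input price, while a single cold call costs slightly more than not caching at all.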

Complete runnable example

"""
Coding assistant: Claude reads files, writes files, and runs tests.
Dependencies: pip install anthropic
"""
import os, subprocess, pathlib
from anthropic import Anthropic

REPO_ROOT = pathlib.Path(".").resolve()

client = Anthropic(
    base_url="https://buzzai.cc",
    api_key=os.environ["BUZZ_API_KEY"],
)

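SYSTEM_PROMPT = """You are a senior software engineer working inside the user's repository.
Read before you write. Run the tests after every meaningful edit. Keep diffs as small as possible."""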

TOOLS = [
    {
        "name": "read_file",
        "description": "Read a UTF-8 text file inside the project.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "start_line": {"type": "integer", "minimum": 1},
                "end_line": {"type": "integer", "minimum": 1},
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Overwrite a UTF-8 file and return its size in bytes.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}, "content": {"type": "string"}},
            "required": ["path", "content"],
        },
    },
    {
        "name": "run_tests",
        "description": "Run the project's tests; the optional pattern narrows the scope.",
        "input_schema": {
            "type": "object",
            "properties": {"pattern": {"type": "string"}},
        },
    },
]


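def safe_path(p: str) -> pathlib.Path:
    # Resolve symlinks and "..", then reject anything outside the repository root.
    full = (REPO_ROOT / p).resolve()
    if REPO_ROOT not in full.parents and full != REPO_ROOT:
        raise ValueError(f"path escapes the repository root: {p}")
    return full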


def execute_tool(name: str, args: dict) -> str:
    if name == "read_file":
        text = safe_path(args["path"]).read_text(encoding="utf-8")
        if "start_line" in args or "end_line" in args:
            lines = text.splitlines()
            s = args.get("start_line", 1) - 1
            e = args.get("end_line", len(lines))
            text = "\n".join(lines[s:e])
        return text
    if name == "write_file":
        path = safe_path(args["path"])
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(args["content"], encoding="utf-8")
        return f"wrote {path.stat().st_size} bytes"
    if name == "run_tests":
        cmd = ["pytest", "-x", "-q"]
        if args.get("pattern"):
            cmd.append(args["pattern"])
        out = subprocess.run(cmd, capture_output=True, text=True, cwd=REPO_ROOT)
        return f"exit={out.returncode}\n{out.stdout}\n{out.stderr}"
    raise ValueError(f"unknown tool: {name}")


def build_system():
    readme = (REPO_ROOT / "README.md").read_text() if (REPO_ROOT / "README.md").exists() else ""
    return [
        {"type": "text", "text": SYSTEM_PROMPT},
        {
            "type": "text",
            "text": f"<PROJECT_README>\n{readme}\n</PROJECT_README>",
            "cache_control": {"type": "ephemeral"},
        },
    ]


def chat(user_request: str, max_turns: int = 12):
    messages = [{"role": "user", "content": user_request}]
    for turn in range(max_turns):
        resp = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            system=build_system(),
            tools=TOOLS,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": resp.content})

        if resp.stop_reason != "tool_use":
            for block in resp.content:
                if block.type == "text":
                    print(block.text)
            return

        tool_results = []
        for block in resp.content:
            if block.type == "tool_use":
                try:
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })
                except Exception as e:
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": f"ERROR: {e}",
                        "is_error": True,
                    })
        messages.append({"role": "user", "content": tool_results})
    print("Reached max_turns without finishing")


if __name__ == "__main__":
    chat("Add a /healthz route to app/server.py and a unit test for it.")

// Coding assistant (Node.js): Claude reads files, writes files, and runs tests.
// Dependencies: npm i @anthropic-ai/sdk
import Anthropic from "@anthropic-ai/sdk";
import { readFile, writeFile, mkdir, stat } from "node:fs/promises";
import { existsSync } from "node:fs";
import { spawn } from "node:child_process";
import path from "node:path";

const REPO_ROOT = path.resolve(".");

const client = new Anthropic({
  baseURL: "https://buzzai.cc",
  apiKey: process.env.BUZZ_API_KEY,
});

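const SYSTEM_PROMPT = `You are a senior software engineer working inside the user's repository.
Read before you write. Run the tests after every meaningful edit. Keep diffs as small as possible.`;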

const TOOLS = [
  {
    name: "read_file",
    description: "Read a UTF-8 text file inside the project.",
    input_schema: {
      type: "object",
      properties: {
        path: { type: "string" },
        start_line: { type: "integer", minimum: 1 },
        end_line: { type: "integer", minimum: 1 },
      },
      required: ["path"],
    },
  },
  {
    name: "write_file",
    description: "Overwrite a UTF-8 file and return its size in bytes.",
    input_schema: {
      type: "object",
      properties: { path: { type: "string" }, content: { type: "string" } },
      required: ["path", "content"],
    },
  },
  {
    name: "run_tests",
    description: "Run the project's tests; the optional pattern narrows the scope.",
    input_schema: {
      type: "object",
      properties: { pattern: { type: "string" } },
    },
  },
];

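function safePath(p) {
  const full = path.resolve(REPO_ROOT, p);
  // A plain startsWith(REPO_ROOT) would also accept a sibling such as "<root>-evil",
  // so check the relative path instead.
  const rel = path.relative(REPO_ROOT, full);
  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
    throw new Error(`path escapes the repository root: ${p}`);
  }
  return full;
}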

async function executeTool(name, args) {
  if (name === "read_file") {
    let text = await readFile(safePath(args.path), "utf8");
    if (args.start_line || args.end_line) {
      const lines = text.split("\n");
      const s = (args.start_line ?? 1) - 1;
      const e = args.end_line ?? lines.length;
      text = lines.slice(s, e).join("\n");
    }
    return text;
  }
  if (name === "write_file") {
    const full = safePath(args.path);
    await mkdir(path.dirname(full), { recursive: true });
    await writeFile(full, args.content);
    const s = await stat(full);
    return `wrote ${s.size} bytes`;
  }
  if (name === "run_tests") {
    const cmd = ["npm", ["test", "--", "--bail"]];
    if (args.pattern) cmd[1].push(args.pattern);
    return await new Promise((resolve) => {
      const child = spawn(cmd[0], cmd[1], { cwd: REPO_ROOT });
      let out = "";
      child.stdout.on("data", (d) => (out += d));
      child.stderr.on("data", (d) => (out += d));
      child.on("close", (code) => resolve(`exit=${code}\n${out}`));
    });
  }
  throw new Error(`unknown tool: ${name}`);
}

async function buildSystem() {
  const readmePath = path.join(REPO_ROOT, "README.md");
  const readme = existsSync(readmePath) ? await readFile(readmePath, "utf8") : "";
  return [
    { type: "text", text: SYSTEM_PROMPT },
    {
      type: "text",
      text: `<PROJECT_README>\n${readme}\n</PROJECT_README>`,
      cache_control: { type: "ephemeral" },
    },
  ];
}

async function chat(userRequest, maxTurns = 12) {
  const messages = [{ role: "user", content: userRequest }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const resp = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 4096,
      system: await buildSystem(),
      tools: TOOLS,
      messages,
    });
    messages.push({ role: "assistant", content: resp.content });

    if (resp.stop_reason !== "tool_use") {
      for (const b of resp.content) if (b.type === "text") console.log(b.text);
      return;
    }

    const toolResults = [];
    for (const b of resp.content) {
      if (b.type !== "tool_use") continue;
      try {
        const r = await executeTool(b.name, b.input);
        toolResults.push({ type: "tool_result", tool_use_id: b.id, content: r });
      } catch (e) {
        toolResults.push({
          type: "tool_result",
          tool_use_id: b.id,
          content: `ERROR: ${e.message}`,
          is_error: true,
        });
      }
    }
    messages.push({ role: "user", content: toolResults });
  }
  console.log("Reached maxTurns without finishing");
}

await chat("Add a /healthz route to app/server.js and a unit test for it.");

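Caching strategy in practice

Where you place cache_control determines which content can be reused across turns. Order matters: Anthropic caches a prefix, so anything that changes must come after the cached block.

Cache invalidation. Changing a single byte inside a cached block invalidates the prefix. If you want to keep the README cached while iterating on the system prompt, put the README block before the prompt you are editing. Any reordering costs one more cache-creation charge.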
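
The ordering rule can be sketched like this. The README block comes first and carries cache_control; the prompt that is still being edited follows it, outside the cached prefix. build_system_for_prompt_iteration is a hypothetical name, and the sketch assumes that nothing earlier in the request (for example the tools list, which precedes system in the cached prefix) changes between calls.

```python
def build_system_for_prompt_iteration(readme_text, prompt_text):
    """Build a system array in which prompt edits do not invalidate the README cache."""
    return [
        # Stable content first: the cached prefix ends at this block.
        {
            "type": "text",
            "text": f"<PROJECT_README>\n{readme_text}\n</PROJECT_README>",
            "cache_control": {"type": "ephemeral"},
        },
        # Volatile content last: edits here leave the cached prefix untouched.
        {"type": "text", "text": prompt_text},
    ]


# Two prompt drafts share a byte-identical first block, so the second call can reuse the cache.
draft_a = build_system_for_prompt_iteration("# My project", "You are a senior engineer.")
draft_b = build_system_for_prompt_iteration("# My project", "You are a careful senior engineer.")
```

Once the prompt has settled, the order used in the complete example (prompt first, README second) works equally well, since both blocks are then stable.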

Parallel tool calls

Claude may emit several tool_use blocks in a single response. Execute them in parallel, then return one tool_result per tool_use_id in the next user message.

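# Python: parallel execution with asyncio.gather
import asyncio

async def execute_parallel(blocks):
    async def one(b):
        result = {"type": "tool_result", "tool_use_id": b.id}
        try:
            # execute_tool is blocking, so run it in a worker thread.
            result["content"] = await asyncio.to_thread(execute_tool, b.name, b.input)
        except Exception as e:
            # Report failures per tool call so one error does not discard the other results.
            result["content"] = f"ERROR: {e}"
            result["is_error"] = True
        return result
    return await asyncio.gather(*(one(b) for b in blocks if b.type == "tool_use"))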

To force serial execution (some agents are easier to debug that way):

"tool_choice": {"type": "auto", "disable_parallel_tool_use": true}

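Error handling

Three typical failures in production:

Failure | Recovery
Tool raises an exception | Return {is_error: true, content: "ERROR: ..."}. Claude will retry, ask a follow-up question, or back off on its own.
HTTP 429 rate_limit_error | Honor retry-after. Use exponential backoff with jitter, capped at ~60 seconds.
HTTP 503 model_not_found | BUZZ-specific: no channel is available for that model in your group. Fall back to a model with broader availability (usually claude-sonnet-4-6) or contact support.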
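
The 429 recovery rule can be sketched as a small delay calculator. This is one possible reading of "exponential backoff with jitter", using the full-jitter variant. backoff_delay and its parameters are hypothetical, and retry_after is assumed to be the header value already parsed as a number of seconds.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # A server-provided retry-after takes priority over our own schedule.
        return float(retry_after)
    # Exponential ceiling: base, 2*base, 4*base, ... bounded by the cap.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick a uniformly random point below the ceiling.
    return rng() * ceiling
```

A caller would sleep for backoff_delay(attempt, retry_after) between attempts and give up after a fixed number of retries. The rng parameter exists only so the schedule can be tested deterministically.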
