Agent Loop
A multi-turn agent that calls tools until it's done. The loop body is small. The hard parts are the four guard rails: stop conditions, max iterations, token budget, and tool-failure recovery.
The shape of the loop
Stripped down, the agent loop is seven lines:
while True:
    response = call_claude(messages)
    if response.stop_reason != "tool_use":
        return response
    tool_results = run_all_tool_uses(response)
    messages.append(assistant_turn(response))
    messages.append(user_turn(tool_results))
That much will work for happy-path demos. Production needs four extra guards.
Stop conditions
Claude tells you why it stopped via stop_reason. Treat each one explicitly:
| stop_reason | Meaning | Loop action |
|---|---|---|
| tool_use | Wants to call one or more tools | Execute tools, append results, continue |
| end_turn | Done, returning a final answer | Exit loop, return text content |
| max_tokens | Hit max_tokens mid-response | Continue with the partial assistant message in history; append a user nudge or raise to the user |
| stop_sequence | Custom stop string matched | Exit; treat output as truncated at the marker |
| pause_turn | Server-side pause (rare; long thinking, server tools) | Send the same conversation back to resume |
| refusal | Model declined to continue | Surface to user; do not retry blindly |
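The table can also be read as a dispatch function. The sketch below is one way to encode it; `LoopAction` and `loop_action` are hypothetical names, not SDK types, and raising on an unknown stop reason is a conservative choice of this sketch rather than documented behaviour.

```python
from enum import Enum


class LoopAction(Enum):
    """What the loop should do next (hypothetical names, not SDK types)."""
    RUN_TOOLS = "run_tools"  # execute tools, append results, continue
    RETURN = "return"        # exit the loop and return the text content
    NUDGE = "nudge"          # keep the partial turn, ask the model to continue
    RESUME = "resume"        # send the same conversation back
    SURFACE = "surface"      # show to the user, do not retry blindly


_ACTIONS = {
    "tool_use": LoopAction.RUN_TOOLS,
    "end_turn": LoopAction.RETURN,
    "max_tokens": LoopAction.NUDGE,
    "stop_sequence": LoopAction.RETURN,
    "pause_turn": LoopAction.RESUME,
    "refusal": LoopAction.SURFACE,
}


def loop_action(stop_reason: str) -> LoopAction:
    """Map a stop_reason to a loop action; fail loudly on anything unknown."""
    try:
        return _ACTIONS[stop_reason]
    except KeyError:
        raise ValueError(f"unhandled stop_reason: {stop_reason!r}") from None
```

Keeping the mapping in one place makes it harder for a new stop reason to fall silently into a catch-all branch.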
Max iterations safety valve
Some tasks legitimately take 20+ turns. Some get stuck in a loop calling the same tool with slightly different arguments. Pick a hard ceiling:
- Simple tasks: 10 iterations.
- Coding agents over a real repo: 30-50.
- Research / multi-source synthesis: 50-100.
When you hit the ceiling, don't just abort. Inject a final user message that asks Claude to summarise progress and stop:
"You've hit the iteration limit. Stop calling tools. In your final message,
summarise what you accomplished, what's left, and any blockers."
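A minimal sketch of the ceiling plus wrap-up message, assuming a hypothetical `step` callable that performs one model turn and returns True when the agent is finished. The ceiling values take the upper end of the ranges above; `ITERATION_CEILINGS` and `run_with_ceiling` are illustrative names only.

```python
from typing import Callable

# Upper ends of the ranges suggested above; tune per workload.
ITERATION_CEILINGS = {"simple": 10, "coding": 50, "research": 100}

WRAP_UP = (
    "You've hit the iteration limit. Stop calling tools. In your final message, "
    "summarise what you accomplished, what's left, and any blockers."
)


def run_with_ceiling(step: Callable[[list], bool], messages: list,
                     task_kind: str = "simple") -> int:
    """Call step(messages) until it returns True or the ceiling is reached.

    Returns the number of iterations used. On hitting the ceiling, the
    wrap-up instruction is appended so the caller can make one final call.
    """
    ceiling = ITERATION_CEILINGS[task_kind]
    for used in range(1, ceiling + 1):
        if step(messages):
            return used
    messages.append({"role": "user", "content": WRAP_UP})
    return ceiling
```

The caller is still responsible for making the final model call after the wrap-up message has been appended.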
Token budget
Read response.usage after every turn and accumulate. When you cross a threshold, force a wrap-up:
budget_input = 1_000_000  # tokens
budget_output = 200_000
total_in = total_out = 0

for turn in range(MAX_TURNS):
    resp = client.messages.create(...)
    # cache_read_input_tokens can be None when nothing was read from cache
    total_in += resp.usage.input_tokens + (resp.usage.cache_read_input_tokens or 0)
    total_out += resp.usage.output_tokens
    if total_in > budget_input or total_out > budget_output:
        # Inject "wrap up" instruction and break after one more call
        ...
Cache reads are billed at about 10% of the normal input price, but each cached token still counts as a token. Track cache reads separately if you bill end users by token; track only the dollar-equivalent if you bill by cost.
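One way to keep both views is to count the three token classes separately and derive cost from them. This is a sketch under stated assumptions: `UsageTracker` is a hypothetical helper, and the per-million-token prices are placeholders, not real rates for any model.

```python
from dataclasses import dataclass
from types import SimpleNamespace

# Placeholder prices in dollars per million tokens; substitute real rates.
PRICE_INPUT_PER_MTOK = 3.0
PRICE_OUTPUT_PER_MTOK = 15.0
CACHE_READ_DISCOUNT = 0.10  # cache reads billed at 10% of the input price


@dataclass
class UsageTracker:
    input_tokens: int = 0
    cache_read_tokens: int = 0
    output_tokens: int = 0

    def add(self, usage) -> None:
        """Accumulate one response's usage; missing or None fields count as 0."""
        self.input_tokens += getattr(usage, "input_tokens", 0) or 0
        self.cache_read_tokens += getattr(usage, "cache_read_input_tokens", 0) or 0
        self.output_tokens += getattr(usage, "output_tokens", 0) or 0

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.cache_read_tokens + self.output_tokens

    @property
    def cost_usd(self) -> float:
        per_tok_in = PRICE_INPUT_PER_MTOK / 1_000_000
        per_tok_out = PRICE_OUTPUT_PER_MTOK / 1_000_000
        return (
            self.input_tokens * per_tok_in
            + self.cache_read_tokens * per_tok_in * CACHE_READ_DISCOUNT
            + self.output_tokens * per_tok_out
        )


# Stand-ins for resp.usage objects from two turns.
tracker = UsageTracker()
tracker.add(SimpleNamespace(input_tokens=1_000, cache_read_input_tokens=9_000, output_tokens=500))
tracker.add(SimpleNamespace(input_tokens=2_000, cache_read_input_tokens=None, output_tokens=100))
```

Billing by token would compare `tracker.total_tokens` against the budget, while billing by cost would compare `tracker.cost_usd`.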
Tool-failure recovery
Tools throw. Network calls time out. The model can request invalid arguments. Don't crash the loop — return the failure as a structured tool result and let Claude react.
{
  "type": "tool_result",
  "tool_use_id": "toolu_...",
  "content": "ERROR: file not found at path 'src/foo.py'. Try listing the directory first.",
  "is_error": true
}
Three failure tiers, with different handling:
| Tier | Example | Recovery |
|---|---|---|
| Recoverable, model's fault | Bad path, wrong arg type | Return is_error: true with a hint. Claude usually retries with a fix. |
| Recoverable, transient | HTTP 503 from a downstream API, network blip | Retry inside the tool with backoff before returning. Limit to 2-3 attempts. |
| Unrecoverable | Permission denied, dependency missing, invalid credentials | Return error and abort the loop. Don't burn tokens watching Claude fail repeatedly. |
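The three tiers can be sketched as a single wrapper that turns any tool call into a tool_result block plus an abort flag. The exception classes and `run_tool_safely` are hypothetical names for illustration; a real tool would raise whichever exceptions its own dependencies produce, and you would map those onto the tiers.

```python
import time
from typing import Callable


class TransientToolError(Exception):
    """Hypothetical marker for failures worth retrying (503, network blip)."""


class UnrecoverableToolError(Exception):
    """Hypothetical marker for failures that should abort the loop."""


def run_tool_safely(tool_use_id: str, tool: Callable[[], str],
                    max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep):
    """Run `tool` and return (tool_result_block, abort_loop)."""
    def error_block(prefix: str, exc: Exception) -> dict:
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": f"{prefix}: {exc}", "is_error": True}

    for attempt in range(max_attempts):
        try:
            output = tool()
            return {"type": "tool_result", "tool_use_id": tool_use_id,
                    "content": output}, False
        except TransientToolError as exc:
            if attempt == max_attempts - 1:
                # Retries exhausted: report it and let the model decide
                return error_block("ERROR (transient, retries exhausted)", exc), False
            sleep(2 ** attempt)  # exponential backoff: 1s, 2s, ...
        except UnrecoverableToolError as exc:
            # Tier 3: report and tell the caller to stop the loop
            return error_block("FATAL", exc), True
        except Exception as exc:
            # Tier 1 (or unknown): return the error so the model can fix it
            return error_block("ERROR", exc), False
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter is only there so the backoff can be stubbed out in tests.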
Full working example
"""
Agent loop with iteration cap, token budget, and tool failure recovery.
Requires: pip install anthropic
"""
import os, time, random
from anthropic import Anthropic
from anthropic import APIStatusError

client = Anthropic(
    base_url="https://buzzai.cc",
    api_key=os.environ["BUZZ_API_KEY"],
)

# === Configurable guards ===
MAX_ITERATIONS = 30
BUDGET_INPUT_TOKENS = 1_000_000
BUDGET_OUTPUT_TOKENS = 200_000

# === Tools ===
TOOLS = [
    {
        "name": "search",
        "description": "Search a knowledge base. Returns top matches.",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "fetch_url",
        "description": "Fetch a URL and return its text content.",
        "input_schema": {
            "type": "object",
            "properties": {"url": {"type": "string", "format": "uri"}},
            "required": ["url"],
        },
    },
]


class UnrecoverableToolError(Exception):
    pass
def execute_tool(name, args):
    if name == "search":
        # ... do real search ...
        return f"Top 3 results for {args['query']!r}: ..."
    if name == "fetch_url":
        import urllib.error
        import urllib.request
        for attempt in range(3):
            try:
                with urllib.request.urlopen(args["url"], timeout=10) as r:
                    return r.read().decode("utf-8", errors="replace")[:8000]
            except urllib.error.HTTPError as e:
                if e.code in (403, 404):
                    raise UnrecoverableToolError(f"HTTP {e.code} for {args['url']}")
            except (urllib.error.URLError, TimeoutError):
                pass  # network blip or timeout: transient, retry below
            time.sleep(2 ** attempt + random.random())
        raise RuntimeError(f"fetch_url failed after retries: {args['url']}")
    raise UnrecoverableToolError(f"unknown tool: {name}")
def call_with_retry(**kwargs):
    """Outer retry for transient API errors (429/500/503/529)."""
    for attempt in range(5):
        try:
            return client.messages.create(**kwargs)
        except APIStatusError as e:
            if e.status_code in (429, 500, 503, 529):
                wait = (2 ** attempt) + random.random()
                time.sleep(min(wait, 60))
                continue
            raise
    raise RuntimeError("API failed after retries")
def run_agent(user_request: str):
    messages = [{"role": "user", "content": user_request}]
    total_in = total_out = 0
    iteration = 0
    abort = False
    wrap_up_sent = False

    while iteration < MAX_ITERATIONS:
        iteration += 1

        # Inject the wrap-up instruction once when we run out of budget
        over_budget = total_in > BUDGET_INPUT_TOKENS or total_out > BUDGET_OUTPUT_TOKENS
        if over_budget and not wrap_up_sent:
            messages.append({
                "role": "user",
                "content": "Token budget exhausted. Stop calling tools. "
                           "Reply now with a summary of progress and what remains.",
            })
            wrap_up_sent = True

        resp = call_with_retry(
            model="claude-sonnet-4-6",
            max_tokens=4096,
            tools=TOOLS,
            # Once the wrap-up has been sent, forbid further tool calls
            tool_choice={"type": "none"} if wrap_up_sent else {"type": "auto"},
            messages=messages,
        )
        total_in += resp.usage.input_tokens + (resp.usage.cache_read_input_tokens or 0)
        total_out += resp.usage.output_tokens
        messages.append({"role": "assistant", "content": resp.content})

        if resp.stop_reason == "end_turn":
            return _final_text(resp), {"iters": iteration, "in": total_in, "out": total_out}
        if resp.stop_reason == "refusal":
            return "[refused]", {"iters": iteration, "in": total_in, "out": total_out}
        if resp.stop_reason == "stop_sequence":
            # Custom stop string matched: exit with the output truncated at the marker
            return _final_text(resp), {"iters": iteration, "in": total_in, "out": total_out}
        if resp.stop_reason == "max_tokens":
            messages.append({"role": "user", "content": "Continue from where you stopped."})
            continue
        if resp.stop_reason != "tool_use":
            # pause_turn or unknown: the assistant turn is already in history,
            # so sending the same conversation back resumes it
            continue

        tool_results = []
        for block in resp.content:
            if block.type != "tool_use":
                continue
            try:
                output = execute_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": output,
                })
            except UnrecoverableToolError as e:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"FATAL: {e}",
                    "is_error": True,
                })
                abort = True
            except Exception as e:
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": f"ERROR (recoverable): {e}",
                    "is_error": True,
                })
        messages.append({"role": "user", "content": tool_results})
        if abort:
            return "[aborted: unrecoverable tool error]", {
                "iters": iteration, "in": total_in, "out": total_out,
            }

    # Hit MAX_ITERATIONS
    messages.append({
        "role": "user",
        "content": "You hit the iteration limit. Stop calling tools and "
                   "summarise what you accomplished, what's left, and any blockers.",
    })
    resp = call_with_retry(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        tools=TOOLS,  # history contains tool_use blocks, so keep tools defined
        tool_choice={"type": "none"},  # but force a text response
        messages=messages,
    )
    # Count the final call against the totals as well
    total_in += resp.usage.input_tokens + (resp.usage.cache_read_input_tokens or 0)
    total_out += resp.usage.output_tokens
    return _final_text(resp), {"iters": iteration + 1, "in": total_in, "out": total_out}
def _final_text(resp):
    return "\n".join(b.text for b in resp.content if b.type == "text")


if __name__ == "__main__":
    text, stats = run_agent("Find the three most cited 2025 papers on prompt caching and summarise their findings.")
    print(text)
    print(f"\n[stats] iterations={stats['iters']} input_tokens={stats['in']} output_tokens={stats['out']}")
// Agent loop with iteration cap, token budget, and tool failure recovery.
// Requires: npm i @anthropic-ai/sdk
import Anthropic, { APIError } from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "https://buzzai.cc",
  apiKey: process.env.BUZZ_API_KEY,
});

const MAX_ITERATIONS = 30;
const BUDGET_INPUT_TOKENS = 1_000_000;
const BUDGET_OUTPUT_TOKENS = 200_000;

const TOOLS = [
  {
    name: "search",
    description: "Search a knowledge base. Returns top matches.",
    input_schema: {
      type: "object",
      properties: { query: { type: "string" } },
      required: ["query"],
    },
  },
  {
    name: "fetch_url",
    description: "Fetch a URL and return its text content.",
    input_schema: {
      type: "object",
      properties: { url: { type: "string", format: "uri" } },
      required: ["url"],
    },
  },
];

class UnrecoverableToolError extends Error {}

function sleep(ms) { return new Promise((r) => setTimeout(r, ms)); }
async function executeTool(name, args) {
  if (name === "search") {
    return `Top 3 results for "${args.query}": ...`;
  }
  if (name === "fetch_url") {
    for (let attempt = 0; attempt < 3; attempt++) {
      try {
        const r = await fetch(args.url, { signal: AbortSignal.timeout(10000) });
        if (r.status === 403 || r.status === 404) {
          throw new UnrecoverableToolError(`HTTP ${r.status} for ${args.url}`);
        }
        if (!r.ok) throw new Error(`HTTP ${r.status}`);
        const text = await r.text();
        return text.slice(0, 8000);
      } catch (e) {
        if (e instanceof UnrecoverableToolError) throw e;
        await sleep(1000 * 2 ** attempt + Math.random() * 1000);
      }
    }
    throw new Error(`fetch_url failed after retries: ${args.url}`);
  }
  throw new UnrecoverableToolError(`unknown tool: ${name}`);
}
async function callWithRetry(params) {
  for (let attempt = 0; attempt < 5; attempt++) {
    try {
      return await client.messages.create(params);
    } catch (e) {
      if (e instanceof APIError && [429, 500, 503, 529].includes(e.status)) {
        await sleep(Math.min(60000, 1000 * 2 ** attempt + Math.random() * 1000));
        continue;
      }
      throw e;
    }
  }
  throw new Error("API failed after retries");
}

function finalText(resp) {
  return resp.content.filter((b) => b.type === "text").map((b) => b.text).join("\n");
}
export async function runAgent(userRequest) {
  const messages = [{ role: "user", content: userRequest }];
  let totalIn = 0, totalOut = 0, iteration = 0, abort = false, wrapUpSent = false;

  while (iteration < MAX_ITERATIONS) {
    iteration++;

    // Inject the wrap-up instruction once when we run out of budget
    const overBudget = totalIn > BUDGET_INPUT_TOKENS || totalOut > BUDGET_OUTPUT_TOKENS;
    if (overBudget && !wrapUpSent) {
      messages.push({
        role: "user",
        content:
          "Token budget exhausted. Stop calling tools. " +
          "Reply now with a summary of progress and what remains.",
      });
      wrapUpSent = true;
    }

    const resp = await callWithRetry({
      model: "claude-sonnet-4-6",
      max_tokens: 4096,
      tools: TOOLS,
      // Once the wrap-up has been sent, forbid further tool calls
      tool_choice: wrapUpSent ? { type: "none" } : { type: "auto" },
      messages,
    });
    totalIn += (resp.usage.input_tokens || 0) + (resp.usage.cache_read_input_tokens || 0);
    totalOut += resp.usage.output_tokens || 0;
    messages.push({ role: "assistant", content: resp.content });

    if (resp.stop_reason === "end_turn") {
      return { text: finalText(resp), stats: { iters: iteration, in: totalIn, out: totalOut } };
    }
    if (resp.stop_reason === "refusal") {
      return { text: "[refused]", stats: { iters: iteration, in: totalIn, out: totalOut } };
    }
    if (resp.stop_reason === "stop_sequence") {
      // Custom stop string matched: exit with the output truncated at the marker
      return { text: finalText(resp), stats: { iters: iteration, in: totalIn, out: totalOut } };
    }
    if (resp.stop_reason === "max_tokens") {
      messages.push({ role: "user", content: "Continue from where you stopped." });
      continue;
    }
    // pause_turn or unknown: the assistant turn is already in history,
    // so sending the same conversation back resumes it
    if (resp.stop_reason !== "tool_use") continue;

    const toolResults = [];
    for (const block of resp.content) {
      if (block.type !== "tool_use") continue;
      try {
        const output = await executeTool(block.name, block.input);
        toolResults.push({ type: "tool_result", tool_use_id: block.id, content: output });
      } catch (e) {
        const isFatal = e instanceof UnrecoverableToolError;
        toolResults.push({
          type: "tool_result",
          tool_use_id: block.id,
          content: `${isFatal ? "FATAL" : "ERROR (recoverable)"}: ${e.message}`,
          is_error: true,
        });
        if (isFatal) abort = true;
      }
    }
    messages.push({ role: "user", content: toolResults });
    if (abort) {
      return {
        text: "[aborted: unrecoverable tool error]",
        stats: { iters: iteration, in: totalIn, out: totalOut },
      };
    }
  }

  // Hit MAX_ITERATIONS
  messages.push({
    role: "user",
    content:
      "You hit the iteration limit. Stop calling tools and " +
      "summarise what you accomplished, what's left, and any blockers.",
  });
  const resp = await callWithRetry({
    model: "claude-sonnet-4-6",
    max_tokens: 2048,
    tools: TOOLS, // history contains tool_use blocks, so keep tools defined
    tool_choice: { type: "none" }, // but force a text response
    messages,
  });
  // Count the final call against the totals as well
  totalIn += (resp.usage.input_tokens || 0) + (resp.usage.cache_read_input_tokens || 0);
  totalOut += resp.usage.output_tokens || 0;
  return {
    text: finalText(resp),
    stats: { iters: iteration + 1, in: totalIn, out: totalOut },
  };
}

const { text, stats } = await runAgent(
  "Find the three most cited 2025 papers on prompt caching and summarise their findings."
);
console.log(text);
console.log(`\n[stats] iterations=${stats.iters} input=${stats.in} output=${stats.out}`);
Managing message history
Long-running agents accumulate hundreds of tool results, which inflates input cost on every turn. Three tactics:
1. Cache the prefix
Stable system prompt + tool definitions sit in a cached system block. Doesn't help with the growing message tail, but eliminates the fixed cost.
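A minimal sketch of building such a request, assuming a hypothetical helper `build_cached_request` that only assembles keyword arguments for `client.messages.create`. The `cache_control` marker on the system block is the prompt-caching breakpoint; very short prompts may fall below the minimum cacheable length, in which case nothing is cached.

```python
def build_cached_request(system_prompt: str, tools: list, messages: list,
                         model: str, max_tokens: int = 4096) -> dict:
    """Return kwargs for client.messages.create with a cache breakpoint
    on the system block (hypothetical helper, not part of the SDK)."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "tools": tools,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Cache breakpoint: the prefix up to and including this block
                # (tool definitions, then the system prompt) can be cached.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        # The growing message tail is sent uncached on every turn.
        "messages": messages,
    }
```

Usage would look like `client.messages.create(**build_cached_request(SYSTEM, TOOLS, messages, model="claude-sonnet-4-6"))`, where `SYSTEM` is whatever stable system prompt the agent uses.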
2. Trim old tool results
After N turns, shorten the content of old tool results (here by truncating them), keeping the structure intact:
def trim_old_results(messages, keep_last=8):
    # Walk every message except the last `keep_last` (both roles count)
    for msg in messages[:-keep_last]:
        if msg["role"] != "user" or not isinstance(msg["content"], list):
            continue
        for block in msg["content"]:
            if isinstance(block, dict) and block.get("type") == "tool_result":
                content = block.get("content", "")
                # Only trim plain-string results; leave block lists untouched
                if isinstance(content, str) and len(content) > 200:
                    block["content"] = content[:180] + "... [trimmed]"
3. Summarise and restart
Periodically, ask Claude to summarise the conversation, then start a new conversation seeded with that summary. Heaviest tactic; useful for very long sessions (50+ turns).
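A sketch of the restart step, assuming a hypothetical `summarise` callable that sends a message list to Claude and returns the reply text. The prompt wording and the seed format are illustrative choices, not a prescribed template.

```python
from typing import Callable

SUMMARY_PROMPT = (
    "Summarise the conversation so far: the goal, what has been done, "
    "key findings, and what remains."
)


def restart_with_summary(messages: list, original_request: str,
                         summarise: Callable[[list], str]) -> list:
    """Return a fresh message list seeded with a summary of `messages`.

    `summarise` is a hypothetical callable wrapping one model call.
    The original list is not modified.
    """
    summary = summarise(messages + [{"role": "user", "content": SUMMARY_PROMPT}])
    seed = (
        f"Original request:\n{original_request}\n\n"
        f"Summary of progress so far:\n{summary}\n\n"
        "Continue from here."
    )
    return [{"role": "user", "content": seed}]
```

Because the new conversation contains no tool_use or tool_result blocks, the per-turn input cost drops back to roughly the size of the summary.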
Opus thinking inside the loop
For Opus 4.7 you can enable extended thinking. Two things to know:
- Thinking blocks must be preserved. When you append the assistant turn back into messages, include the thinking blocks too. Stripping them breaks signature verification on the next turn.
- temperature, top_p, top_k are ignored when thinking is on. Configure behaviour with the system prompt instead.
resp = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=8192,
    thinking={"type": "enabled", "budget_tokens": 4096},
    tools=TOOLS,
    messages=messages,
)
# Append resp.content unchanged; thinking blocks ride along.
messages.append({"role": "assistant", "content": resp.content})
See also
- Recipe: Coding Assistant — concrete tools (read_file / write_file / run_tests) for an agent loop
- Tool Use concept
- POST /v1/messages reference
- Claude API error code reference