Rerank
Rerank a list of documents by relevance to a query. The BUZZ Rerank API uses the Jina / Cohere request shape so SDKs and tooling for either provider work as drop-in clients. Behind the scenes BUZZ routes to whichever rerank channel your group has access to.
The request body accepts query, documents, model, top_n, and return_documents, plus Jina's max_chunk_per_doc and overlap_tokens. The response shape mirrors Cohere's: a results array of { index, relevance_score, document }.
Call GET /v1/models first to confirm a model is live for you before integrating against it. If the model is unavailable, you'll get HTTP 503 with the error code model_not_found (see Errors).
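A minimal sketch of that pre-flight check follows. It assumes GET /v1/models returns an OpenAI-style body of the form {"data": [{"id": ...}]}; this page does not specify that shape, so adjust the parsing if your response differs. The helper names fetch_models and model_is_live are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://buzzai.cc"


def fetch_models(token: str) -> dict:
    """GET /v1/models with the same Bearer header used for /v1/rerank."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/models",
        headers={"Authorization": f"Bearer sk-{token}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def model_is_live(models_payload: dict, model_id: str) -> bool:
    """Return True if model_id appears in the listing.

    ASSUMPTION: the listing is OpenAI-style, i.e. {"data": [{"id": "..."}]}.
    """
    entries = models_payload.get("data", [])
    return any(isinstance(e, dict) and e.get("id") == model_id for e in entries)


# Offline illustration with a hand-written payload (not real API output).
sample = {"data": [{"id": "rerank-english-v3.0"}, {"id": "jina-reranker-m0"}]}
print(model_is_live(sample, "rerank-english-v3.0"))
print(model_is_live(sample, "rerank-multilingual-v2.0"))
```

In a real client you would pass the result of fetch_models(os.environ["BUZZ_TOKEN"]) instead of the hand-written sample.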
Authentication
Same token-auth chain as /v1/messages and /v1/chat/completions. Use a user API token with the sk- prefix, created from the BUZZ console.
| Header | Notes |
|---|---|
| Authorization: Bearer sk-<TOKEN> | The sk- prefix is automatically stripped server-side. |
| Content-Type: application/json | Standard JSON body. |
Example request
curl -X POST https://buzzai.cc/v1/rerank \
-H "Authorization: Bearer sk-$BUZZ_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "rerank-english-v3.0",
"query": "What is the capital of France?",
"documents": [
"Paris is the capital and largest city of France.",
"London is the capital of the United Kingdom.",
"Berlin has been the capital of Germany since 1990.",
"Madrid is the capital of Spain."
],
"top_n": 3,
"return_documents": true
}'

import os, requests
resp = requests.post(
"https://buzzai.cc/v1/rerank",
headers={
"Authorization": f"Bearer sk-{os.environ['BUZZ_TOKEN']}",
"Content-Type": "application/json",
},
json={
"model": "rerank-english-v3.0",
"query": "What is the capital of France?",
"documents": [
"Paris is the capital and largest city of France.",
"London is the capital of the United Kingdom.",
"Berlin has been the capital of Germany since 1990.",
"Madrid is the capital of Spain.",
],
"top_n": 3,
"return_documents": True,
},
timeout=30,
)
print(resp.json())

const resp = await fetch("https://buzzai.cc/v1/rerank", {
method: "POST",
headers: {
Authorization: `Bearer sk-${process.env.BUZZ_TOKEN}`,
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "rerank-english-v3.0",
query: "What is the capital of France?",
documents: [
"Paris is the capital and largest city of France.",
"London is the capital of the United Kingdom.",
"Berlin has been the capital of Germany since 1990.",
"Madrid is the capital of Spain.",
],
top_n: 3,
return_documents: true,
}),
});
console.log(await resp.json());

Response shape
{
"results": [
{"index": 0, "relevance_score": 0.987, "document": "Paris is the capital and largest city of France."},
{"index": 3, "relevance_score": 0.412, "document": "Madrid is the capital of Spain."},
{"index": 1, "relevance_score": 0.288, "document": "London is the capital of the United Kingdom."}
],
"usage": {"prompt_tokens": 64, "completion_tokens": 0, "total_tokens": 64}
}
Numeric scores depend on the underlying reranker. The document field is included only when return_documents: true.
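When return_documents is false or omitted, the index field is the only link back to your input. A small sketch of that lookup, using the sample request and response values from this page (attach_documents is a hypothetical helper name):

```python
documents = [
    "Paris is the capital and largest city of France.",
    "London is the capital of the United Kingdom.",
    "Berlin has been the capital of Germany since 1990.",
    "Madrid is the capital of Spain.",
]

# Results as in the sample response, with the document field left out
# (as when return_documents is false or omitted).
results = [
    {"index": 0, "relevance_score": 0.987},
    {"index": 3, "relevance_score": 0.412},
    {"index": 1, "relevance_score": 0.288},
]


def attach_documents(documents, results):
    """Pair each score with the input document found at the result's index."""
    return [(r["relevance_score"], documents[r["index"]]) for r in results]


ranked = attach_documents(documents, results)
for score, text in ranked:
    print(f"{score:.3f}  {text}")
```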
Body parameters
model (required)
string — Model id. If empty, the relay returns "Model name not specified, model name cannot be empty".
query (required)
string — The search query that documents are scored against.
documents (required)
array — The candidate set to rerank. Each item may be either:
- a plain string, or
- an object with a text field (Cohere-style structured documents)
top_n (optional)
integer — Return only the top N results (sorted by relevance_score descending). When omitted the upstream provider's default applies.
return_documents (optional)
boolean — When true, each result includes the original document text. When false or omitted, only index and relevance_score are returned.
max_chunk_per_doc (optional)
integer — Jina-style: cap the number of chunks created from each long document. Use to bound token cost on very long inputs.
overlap_tokens (optional)
integer — Jina-style: token overlap between adjacent chunks within the same document. Useful to preserve context across chunk boundaries.
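The parameters above can be assembled with a small builder that omits optional fields you did not set, so the upstream defaults apply. This is a sketch, not part of any BUZZ SDK: build_rerank_payload is a hypothetical helper, the client-side checks are an added convenience rather than documented server behaviour, and the chunking values are illustrative only.

```python
from typing import Any, Dict, List, Optional, Union

Document = Union[str, Dict[str, Any]]


def build_rerank_payload(
    model: str,
    query: str,
    documents: List[Document],
    top_n: Optional[int] = None,
    return_documents: Optional[bool] = None,
    max_chunk_per_doc: Optional[int] = None,
    overlap_tokens: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble a /v1/rerank body, leaving out optional fields that are None."""
    if not model:
        raise ValueError("model is required")
    if not query:
        raise ValueError("query is required")
    for i, doc in enumerate(documents):
        is_text_object = isinstance(doc, dict) and isinstance(doc.get("text"), str)
        if not (isinstance(doc, str) or is_text_object):
            raise ValueError(
                f"documents[{i}] must be a string or an object with a text field"
            )

    payload: Dict[str, Any] = {
        "model": model,
        "query": query,
        "documents": list(documents),
    }
    optional = {
        "top_n": top_n,
        "return_documents": return_documents,
        "max_chunk_per_doc": max_chunk_per_doc,
        "overlap_tokens": overlap_tokens,
    }
    # Only send the optional fields that were explicitly set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_rerank_payload(
    "jina-reranker-v2-base-multilingual",
    "What is the capital of France?",
    [
        "Paris is the capital and largest city of France.",
        {"text": "Madrid is the capital of Spain."},
    ],
    top_n=1,
    max_chunk_per_doc=4,  # illustrative value only
    overlap_tokens=32,    # illustrative value only
)
print(payload)
```

The resulting dict can be passed as the json= argument of requests.post, as in the example request.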
Response
| Field | Type | Description |
|---|---|---|
| results | array | Array of result objects, sorted by relevance_score descending. Length capped to top_n when provided. |
| results[].index | integer | Zero-based position in the original documents array. |
| results[].relevance_score | number | Provider-defined relevance score. Compare scores within a single response only — they are not normalised across calls or models. |
| results[].document | string or object | Original document content, present only when return_documents: true. Echoes the input shape (string or {text: ...}). |
| usage | object | Token-usage record (same shape as completions usage) used by BUZZ for billing. |
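Because results[].document may be absent, a string, or a {text: ...} object, client code usually wants one uniform shape. A sketch of such a parser follows; RerankResult and parse_results are hypothetical names, and the sample body is hand-written to exercise all three cases in one pass rather than taken from a real response.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class RerankResult:
    index: int
    relevance_score: float
    text: Optional[str]  # None when the response carries no document field


def parse_results(body: Dict[str, Any]) -> List[RerankResult]:
    """Flatten the results array into RerankResult records."""
    parsed = []
    for item in body.get("results", []):
        doc = item.get("document")
        if isinstance(doc, dict):
            text = doc.get("text")  # echoed {text: ...} object
        else:
            text = doc  # plain string, or None if the field is absent
        parsed.append(RerankResult(item["index"], item["relevance_score"], text))
    return parsed


# Hand-written sample mixing the three possible document shapes.
body = {
    "results": [
        {"index": 0, "relevance_score": 0.987,
         "document": {"text": "Paris is the capital and largest city of France."}},
        {"index": 3, "relevance_score": 0.412,
         "document": "Madrid is the capital of Spain."},
        {"index": 1, "relevance_score": 0.288},
    ],
    "usage": {"prompt_tokens": 64, "completion_tokens": 0, "total_tokens": 64},
}

parsed = parse_results(body)
for r in parsed:
    print(r)
```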
Supported model IDs
The following model IDs are recognised by the corresponding rerank channels in the BUZZ source. Whether a given model is reachable for your token depends on which channels are enabled for your group; always reconcile against GET /v1/models at runtime.
Cohere channel
- rerank-english-v3.0
- rerank-multilingual-v3.0
- rerank-english-v2.0
- rerank-multilingual-v2.0
Jina channel
- jina-reranker-v2-base-multilingual
- jina-reranker-m0
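One way to reconcile this list against what is live for your token is to walk a preference order and take the first id that is actually available. The sketch below assumes you have already collected the live ids from GET /v1/models; pick_model is a hypothetical helper, the preference order is arbitrary, and the live_ids set is hand-written for illustration.

```python
from typing import Iterable, Optional, Sequence

COHERE_MODELS = (
    "rerank-english-v3.0",
    "rerank-multilingual-v3.0",
    "rerank-english-v2.0",
    "rerank-multilingual-v2.0",
)
JINA_MODELS = (
    "jina-reranker-v2-base-multilingual",
    "jina-reranker-m0",
)


def pick_model(preferred: Sequence[str], live_ids: Iterable[str]) -> Optional[str]:
    """Return the first preferred id that is live, or None if none are."""
    live = set(live_ids)
    for model_id in preferred:
        if model_id in live:
            return model_id
    return None


# Hand-written example of ids reported as live (not real API output).
live_ids = {"jina-reranker-m0", "rerank-multilingual-v3.0"}
choice = pick_model(COHERE_MODELS + JINA_MODELS, live_ids)
print(choice)
```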
Errors
| HTTP | error envelope | Cause |
|---|---|---|
| 400 | {"error":{"type":"buzz_error","message":"Model name not specified, ..."}} | model field empty or missing. |
| 401 | {"error":{"type":"buzz_error", ...}} | Missing or invalid sk- token. |
| 429 | {"error":{"type":"buzz_error", ...}} | Rate limit hit; respect retry-after. |
| 503 | {"error":{"code":"model_not_found","type":"buzz_error","message":"No available channel for model <X> under group <G> (distributor)"}} | The requested model has no enabled channel for your token's group. Confirm the model is live via GET /v1/models, or pick another supported id. |
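The table above maps naturally onto a small dispatch function. The sketch below is one possible client-side policy, not documented BUZZ behaviour: plan_recovery and the action labels are hypothetical, retry-after is assumed to carry a number of seconds, and the fallback delay is an arbitrary choice.

```python
from typing import Any, Dict, Mapping, Optional, Tuple

DEFAULT_BACKOFF_SECONDS = 5.0  # ASSUMPTION: arbitrary fallback, not from the BUZZ docs


def plan_recovery(
    status: int, headers: Mapping[str, str], body: Dict[str, Any]
) -> Tuple[str, Optional[float]]:
    """Map an error response to (action, seconds_to_wait)."""
    err = body.get("error") or {}
    if status == 429:
        # Header names are case-insensitive, so normalise before lookup.
        lowered = {k.lower(): v for k, v in headers.items()}
        try:
            wait = float(lowered.get("retry-after", ""))
        except (TypeError, ValueError):
            # Missing or non-numeric header (e.g. an HTTP date).
            wait = DEFAULT_BACKOFF_SECONDS
        return ("retry", wait)
    if status == 503 and err.get("code") == "model_not_found":
        return ("choose_another_model", None)
    if status == 401:
        return ("check_token", None)
    if status == 400:
        return ("fix_request", None)
    return ("raise", None)


print(plan_recovery(429, {"Retry-After": "2"}, {"error": {"type": "buzz_error"}}))
print(plan_recovery(
    503, {}, {"error": {"code": "model_not_found", "type": "buzz_error"}}
))
```

Keeping the decision separate from the HTTP call makes the policy easy to unit-test without hitting the API.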
See also
- GET /v1/models: query live model availability for your token
- POST /v1/messages: Anthropic-compatible chat
- POST /v1/chat/completions: OpenAI-compatible chat