BUZZ AI Gateway

Rerank

Rerank a list of documents by relevance to a query. The BUZZ Rerank API uses the Jina / Cohere request shape so SDKs and tooling for either provider work as drop-in clients. Behind the scenes BUZZ routes to whichever rerank channel your group has access to.

POST https://buzzai.cc/v1/rerank
Drop-in compatible with Cohere & Jina rerankers. The accepted body matches the Jina / Cohere rerank schema (query, documents, model, top_n, return_documents, plus Jina's max_chunk_per_doc and overlap_tokens). The response shape mirrors Cohere's: a results array of { index, relevance_score, document }.
Channel availability. The endpoint, request schema, and response schema are all wired up at the relay layer. Whether a specific rerank model is reachable from your token depends on which channels your group has enabled. Always query GET /v1/models first to confirm a model is live for you before integrating against it. If the model is unavailable you'll get HTTP 503 model_not_found (see Errors).
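
The pre-flight check described above can be sketched as follows. This is a minimal sketch, not part of the BUZZ reference. It assumes GET /v1/models returns an OpenAI-style list ({"data": [{"id": ...}]}), which this page does not specify, and that the token is passed without its sk- prefix. The helper names listed_model_ids and fetch_model_ids are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://buzzai.cc"


def listed_model_ids(payload: dict) -> set:
    """Collect model ids from a /v1/models payload.

    Assumption: the payload is OpenAI-style, {"data": [{"id": "..."}, ...]}.
    Entries without an "id" key are skipped.
    """
    return {
        item["id"]
        for item in payload.get("data", [])
        if isinstance(item, dict) and "id" in item
    }


def fetch_model_ids(token: str) -> set:
    """GET /v1/models with the same bearer token used for /v1/rerank."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/models",
        headers={"Authorization": f"Bearer sk-{token}"},
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(req, timeout=30) as resp:
        return listed_model_ids(json.load(resp))
```

Testing "rerank-english-v3.0" in fetch_model_ids(token) before the first rerank call surfaces a missing channel up front instead of as a 503 at request time.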

Authentication

Same token-auth chain as /v1/messages and /v1/chat/completions. Use a user API token with the sk- prefix, created from the BUZZ console.

| Header | Notes |
| --- | --- |
| Authorization: Bearer sk-<TOKEN> | The sk- prefix is automatically stripped server-side. |
| Content-Type: application/json | Standard JSON body. |
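
One way to build these two headers in client code is sketched below. The helper name auth_headers is hypothetical, and normalising the token so the sk- prefix appears exactly once is a defensive choice made for this sketch, not documented BUZZ behaviour.

```python
def auth_headers(token: str) -> dict:
    """Build the Authorization and Content-Type headers for a BUZZ call.

    Accepts the token with or without its sk- prefix and emits the
    prefix exactly once (a convenience assumed for this sketch).
    """
    token = token.strip()
    if not token:
        raise ValueError("token must not be empty")
    if not token.startswith("sk-"):
        token = f"sk-{token}"
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

With this helper, an environment variable that stores the full sk-... token and one that stores only the suffix both produce the same header.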

Example request

cURL

curl -X POST https://buzzai.cc/v1/rerank \
  -H "Authorization: Bearer sk-$BUZZ_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rerank-english-v3.0",
    "query": "What is the capital of France?",
    "documents": [
      "Paris is the capital and largest city of France.",
      "London is the capital of the United Kingdom.",
      "Berlin has been the capital of Germany since 1990.",
      "Madrid is the capital of Spain."
    ],
    "top_n": 3,
    "return_documents": true
  }'

Python

import os
import requests

resp = requests.post(
    "https://buzzai.cc/v1/rerank",
    headers={
        "Authorization": f"Bearer sk-{os.environ['BUZZ_TOKEN']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "rerank-english-v3.0",
        "query": "What is the capital of France?",
        "documents": [
            "Paris is the capital and largest city of France.",
            "London is the capital of the United Kingdom.",
            "Berlin has been the capital of Germany since 1990.",
            "Madrid is the capital of Spain.",
        ],
        "top_n": 3,
        "return_documents": True,
    },
    timeout=30,
)
print(resp.json())

JavaScript

const resp = await fetch("https://buzzai.cc/v1/rerank", {
  method: "POST",
  headers: {
    Authorization: `Bearer sk-${process.env.BUZZ_TOKEN}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "rerank-english-v3.0",
    query: "What is the capital of France?",
    documents: [
      "Paris is the capital and largest city of France.",
      "London is the capital of the United Kingdom.",
      "Berlin has been the capital of Germany since 1990.",
      "Madrid is the capital of Spain.",
    ],
    top_n: 3,
    return_documents: true,
  }),
});
console.log(await resp.json());

Response shape

{
  "results": [
    {"index": 0, "relevance_score": 0.987, "document": "Paris is the capital and largest city of France."},
    {"index": 3, "relevance_score": 0.412, "document": "Madrid is the capital of Spain."},
    {"index": 1, "relevance_score": 0.288, "document": "London is the capital of the United Kingdom."}
  ],
  "usage": {"prompt_tokens": 64, "completion_tokens": 0, "total_tokens": 64}
}

Numeric scores depend on the underlying reranker. The document field is included only when return_documents: true.
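
Because every result carries its index into the original documents array, the ranked documents can be recovered client-side even when return_documents is false. The sketch below illustrates this with the sample response above; attach_documents is a hypothetical helper name.

```python
def attach_documents(results: list, documents: list) -> list:
    """Pair each relevance_score with the original input document.

    Uses results[].index, so it works when the response carries no
    document field (return_documents false or omitted).
    """
    return [(r["relevance_score"], documents[r["index"]]) for r in results]


documents = [
    "Paris is the capital and largest city of France.",
    "London is the capital of the United Kingdom.",
    "Berlin has been the capital of Germany since 1990.",
    "Madrid is the capital of Spain.",
]
# Results as in the sample response, without the document field.
results = [
    {"index": 0, "relevance_score": 0.987},
    {"index": 3, "relevance_score": 0.412},
    {"index": 1, "relevance_score": 0.288},
]
ranked = attach_documents(results, documents)
# ranked[0] == (0.987, "Paris is the capital and largest city of France.")
```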

Body parameters

model (required)

string: Model id. If empty or missing, the relay returns HTTP 400 with the message "Model name not specified, model name cannot be empty".

query (required)

string: The search query that documents are scored against.

documents (required)

array: The candidate set to rerank. Each item may be either a plain string or an object with a text field ({"text": "..."}).

top_n (optional)

integer: Return only the top N results (sorted by relevance_score descending). When omitted, the upstream provider's default applies.

return_documents (optional)

boolean: When true, each result includes the original document text. When false or omitted, only index and relevance_score are returned.

max_chunk_per_doc (optional)

integer: Jina-style. Caps the number of chunks created from each long document. Use it to bound token cost on very long inputs.

overlap_tokens (optional)

integer: Jina-style. Token overlap between adjacent chunks within the same document. Useful to preserve context across chunk boundaries.
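
The parameters above can be assembled into a request body as sketched below. The function name build_rerank_body is hypothetical, and the client-side checks on query and documents are an assumption of this sketch (the reference only documents the relay's error for an empty model). Optional fields are left out when unset so that the upstream provider's defaults apply.

```python
from typing import Optional, Sequence, Union

Document = Union[str, dict]  # plain string or {"text": ...}


def build_rerank_body(
    model: str,
    query: str,
    documents: Sequence[Document],
    top_n: Optional[int] = None,
    return_documents: Optional[bool] = None,
    max_chunk_per_doc: Optional[int] = None,
    overlap_tokens: Optional[int] = None,
) -> dict:
    """Assemble a /v1/rerank JSON body, omitting optional fields left unset."""
    if not model:
        raise ValueError("model is required")
    if not query:
        raise ValueError("query is required")
    if not documents:
        raise ValueError("documents is required")
    body = {"model": model, "query": query, "documents": list(documents)}
    optional = {
        "top_n": top_n,
        "return_documents": return_documents,
        "max_chunk_per_doc": max_chunk_per_doc,
        "overlap_tokens": overlap_tokens,
    }
    # Keep explicit False/0 values; drop only fields that were never set.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```

The returned dict can be passed directly as the json= argument of requests.post, as in the Python example request.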

Response

| Field | Type | Description |
| --- | --- | --- |
| results | array | Array of result objects, sorted by relevance_score descending. Length capped to top_n when provided. |
| results[].index | integer | Zero-based position in the original documents array. |
| results[].relevance_score | number | Provider-defined relevance score. Compare scores within a single response only; they are not normalised across calls or models. |
| results[].document | string or object | Original document content, present only when return_documents: true. Echoes the input shape (string or {text: ...}). |
| usage | object | Token-usage record (same shape as completions usage) used by BUZZ for billing. |
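
Since results[].document may be absent, a string, or an object, client code often benefits from normalising it once. The following is a minimal sketch under that reading of the table; RerankResult and parse_results are hypothetical names, not part of any BUZZ SDK.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RerankResult:
    index: int
    relevance_score: float
    text: Optional[str]  # None when the response has no document field


def parse_results(response: dict) -> List[RerankResult]:
    """Convert the results array into typed records, preserving order."""
    parsed = []
    for item in response.get("results", []):
        doc = item.get("document")
        # document echoes the input shape: a plain string or {"text": ...}.
        text = doc.get("text") if isinstance(doc, dict) else doc
        parsed.append(
            RerankResult(item["index"], float(item["relevance_score"]), text)
        )
    return parsed
```

The order of the returned list follows the response, which is already sorted by relevance_score descending.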

Supported model IDs

The following model IDs are recognised by the corresponding rerank channels in the BUZZ source. Whether a given model is reachable for your token depends on which channels are enabled for your group; always reconcile against GET /v1/models at runtime.
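
Reconciling at runtime can be as simple as walking a preference list against the ids that GET /v1/models reports for the token. The sketch below is pure set logic with no network call; pick_live_model is a hypothetical helper, and "example-fallback-reranker" is a placeholder id, not a real model.

```python
from typing import Iterable, Optional


def pick_live_model(preferred: Iterable[str], live_ids: Iterable[str]) -> Optional[str]:
    """Return the first preferred model id that is live, or None if none is."""
    live = set(live_ids)
    return next((model for model in preferred if model in live), None)


# Placeholder ids for illustration only.
choice = pick_live_model(
    ["rerank-english-v3.0", "example-fallback-reranker"],
    {"example-fallback-reranker"},
)
# choice == "example-fallback-reranker"
```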

Cohere channel

Jina channel

Errors

| HTTP | Error envelope | Cause |
| --- | --- | --- |
| 400 | {"error":{"type":"buzz_error","message":"Model name not specified, ..."}} | model field empty or missing. |
| 401 | {"error":{"type":"buzz_error", ...}} | Missing or invalid sk- token. |
| 429 | {"error":{"type":"buzz_error", ...}} | Rate limit hit; respect retry-after. |
| 503 | {"error":{"code":"model_not_found","type":"buzz_error","message":"No available channel for model <X> under group <G> (distributor)"}} | The requested model has no enabled channel for your token's group. Confirm the model is live via GET /v1/models, or pick another supported id. |
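
A client can turn the table above into a small decision function. The sketch below is one possible mapping, not BUZZ-provided code: classify_error and retry_delay are hypothetical names, the category labels are invented for illustration, and the retry-after header is assumed to carry a plain number of seconds (a fixed default is used otherwise).

```python
from typing import Mapping, Optional


def classify_error(status: int, body: dict) -> str:
    """Map an HTTP status and error envelope to a coarse category."""
    error = body.get("error", {}) if isinstance(body, dict) else {}
    if status == 503 and error.get("code") == "model_not_found":
        return "model_unavailable"  # no enabled channel; check GET /v1/models
    if status == 400:
        return "bad_request"
    if status == 401:
        return "unauthorized"
    if status == 429:
        return "rate_limited"
    return "unexpected"


def retry_delay(status: int, headers: Mapping[str, str],
                default_delay: float = 1.0) -> Optional[float]:
    """Seconds to wait before retrying, or None when a retry will not help."""
    if status != 429:
        return None
    lowered = {k.lower(): v for k, v in headers.items()}
    try:
        return max(0.0, float(lowered.get("retry-after")))
    except (TypeError, ValueError):
        # Header missing or not a number of seconds: fall back to the default.
        return default_delay
```

Only 429 is treated as retryable here; a 503 model_not_found will not resolve by waiting, so the caller should switch models or fix channel access instead.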
