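Rerank

Reranks a set of documents by relevance to a query. The BUZZ Rerank API uses the Jina / Cohere request body format, so the SDKs and toolchains of both providers can connect directly. The backend routes each request to a rerank channel available to your group.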

POST https://buzzai.cc/v1/rerank

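Compatible with both Cohere and Jina rerank. The accepted request body follows the Jina / Cohere rerank schema (query, documents, model, top_n, return_documents, plus Jina's max_chunk_per_doc and overlap_tokens). The response shape follows Cohere: a results array in which each item is { index, relevance_score, document }.

Channel availability. The endpoint, request schema, and response schema are all wired up at the relay layer. Whether a specific rerank model is available to your token, however, depends on which channels are enabled for your group. Before integrating, query GET /v1/models to check in real time that the model is available to you. If the model is unreachable, you get HTTP 503 model_not_found (see Error codes).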
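
The availability check can be sketched as below. This page does not specify the response format of GET /v1/models, so the sketch assumes an OpenAI-style list (`{"data": [{"id": ...}]}`); the helper names `extract_model_ids` and `fetch_model_ids` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://buzzai.cc"


def extract_model_ids(payload: dict) -> set:
    """Collect model ids from a /v1/models payload.

    Assumption: the payload is OpenAI-style, {"data": [{"id": "..."}, ...]}.
    Entries without an "id" key are skipped.
    """
    return {item["id"] for item in payload.get("data", []) if "id" in item}


def fetch_model_ids(token: str) -> set:
    """Call GET /v1/models with the same Bearer token used for /v1/rerank."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/models",
        headers={"Authorization": f"Bearer sk-{token}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return extract_model_ids(json.load(resp))
```

A caller would then test something like `"rerank-english-v3.0" in fetch_model_ids(token)` before sending rerank traffic.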

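Authentication

Uses the same token authentication chain as /v1/messages and /v1/chat/completions. Use a user API token with the sk- prefix, created in the BUZZ console.

Header	Description
`Authorization: Bearer sk-<TOKEN>`	The server strips the `sk-` prefix automatically.
`Content-Type: application/json`	Standard JSON request body.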
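
A small helper can keep the two headers consistent across calls. The sketch below is illustrative only: `build_headers` is a hypothetical name, and it assumes the stored token may or may not already carry the `sk-` prefix, so it adds the prefix only when it is missing.

```python
def build_headers(token: str) -> dict:
    """Build the headers for a /v1/rerank call.

    Assumption: the token may be stored with or without the sk- prefix,
    so the prefix is added only if it is not already there.
    """
    token = token.strip()
    if not token.startswith("sk-"):
        token = f"sk-{token}"
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```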

Request examples

curl -X POST https://buzzai.cc/v1/rerank \
  -H "Authorization: Bearer sk-$BUZZ_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rerank-english-v3.0",
    "query": "What is the capital of France?",
    "documents": [
      "Paris is the capital and largest city of France.",
      "London is the capital of the United Kingdom.",
      "Berlin has been the capital of Germany since 1990.",
      "Madrid is the capital of Spain."
    ],
    "top_n": 3,
    "return_documents": true
  }'

import os, requests

resp = requests.post(
    "https://buzzai.cc/v1/rerank",
    headers={
        "Authorization": f"Bearer sk-{os.environ['BUZZ_TOKEN']}",
        "Content-Type": "application/json",
    },
    json={
        "model": "rerank-english-v3.0",
        "query": "What is the capital of France?",
        "documents": [
            "Paris is the capital and largest city of France.",
            "London is the capital of the United Kingdom.",
            "Berlin has been the capital of Germany since 1990.",
            "Madrid is the capital of Spain.",
        ],
        "top_n": 3,
        "return_documents": True,
    },
    timeout=30,
)
print(resp.json())

const resp = await fetch("https://buzzai.cc/v1/rerank", {
  method: "POST",
  headers: {
    Authorization: `Bearer sk-${process.env.BUZZ_TOKEN}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "rerank-english-v3.0",
    query: "What is the capital of France?",
    documents: [
      "Paris is the capital and largest city of France.",
      "London is the capital of the United Kingdom.",
      "Berlin has been the capital of Germany since 1990.",
      "Madrid is the capital of Spain.",
    ],
    top_n: 3,
    return_documents: true,
  }),
});
console.log(await resp.json());

Response shape

{
  "results": [
    {"index": 0, "relevance_score": 0.987, "document": "Paris is the capital and largest city of France."},
    {"index": 3, "relevance_score": 0.412, "document": "Madrid is the capital of Spain."},
    {"index": 1, "relevance_score": 0.288, "document": "London is the capital of the United Kingdom."}
  ],
  "usage": {"prompt_tokens": 64, "completion_tokens": 0, "total_tokens": 64}
}

The exact scores are determined by the underlying reranker. The document field is returned only when return_documents is true.

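Request body parameters

model (required)

string: the model id. If it is empty, the relay immediately returns "Model name not specified, model name cannot be empty".

query (required)

string: the query text that documents are scored against.

documents (required)

array: the candidate documents. Each item can be:

a plain string, or
an object with a text field (Cohere's structured document form)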
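
Because `documents` may mix both forms, client code that needs the raw text has to handle each one. A minimal sketch, with `document_text` as a hypothetical helper name:

```python
def document_text(doc) -> str:
    """Return the text of one `documents` item.

    Accepts the two forms listed above: a plain string, or an object
    (dict) with a string `text` field. Anything else is rejected.
    """
    if isinstance(doc, str):
        return doc
    if isinstance(doc, dict) and isinstance(doc.get("text"), str):
        return doc["text"]
    raise TypeError(f"unsupported document item: {doc!r}")
```

Running this over the list before sending the request catches malformed items on the client side instead of relying on an upstream error.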

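top_n (optional)

integer: return only the top N results (sorted by relevance_score, descending). If omitted, the upstream provider's default applies.

return_documents (optional)

boolean: when true, each result includes the original document text; when false or omitted, only index and relevance_score are returned.

max_chunk_per_doc (optional)

integer: Jina-style. The maximum number of chunks each long document is split into. Use it to control token cost on long inputs.

overlap_tokens (optional)

integer: Jina-style. The number of tokens shared between adjacent chunks of the same document, used to preserve context across chunks.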
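
One way to assemble a request body from these parameters is sketched below. `build_rerank_body` is a hypothetical helper, not part of any SDK. It leaves out optional fields that were not set, so the upstream defaults described above still apply, and it rejects an empty `model` locally rather than waiting for the relay's error.

```python
def build_rerank_body(model, query, documents, *, top_n=None,
                      return_documents=None, max_chunk_per_doc=None,
                      overlap_tokens=None) -> dict:
    """Build a JSON-serializable /v1/rerank body, omitting unset options."""
    if not model:
        # Mirrors the relay's "Model name not specified" rejection.
        raise ValueError("model must be a non-empty string")
    body = {"model": model, "query": query, "documents": list(documents)}
    optional = {
        "top_n": top_n,
        "return_documents": return_documents,
        "max_chunk_per_doc": max_chunk_per_doc,  # Jina-style
        "overlap_tokens": overlap_tokens,        # Jina-style
    }
    # Only send options the caller actually set.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```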

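Response

Field	Type	Description
results	array	Result array, sorted by `relevance_score` in descending order. When `top_n` is set, its length is capped at that value.
results[].index	integer	Zero-based index of this result in the original `documents` array.
results[].relevance_score	number	Provider-defined relevance score. Comparable only within a single response; not normalized across calls or models.
results[].document	string \| object	The original document content. Present only when `return_documents: true`, in the same form as the input (string or `{text: ...}`).
usage	object	Token usage record (same as the completions `usage`), which BUZZ uses for billing.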
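
When `return_documents` is false or omitted, `results[].index` is the only link back to the input. The sketch below shows one way to rejoin results with the original list; `attach_documents` is a hypothetical helper name.

```python
def attach_documents(documents: list, results: list) -> list:
    """Pair each result's relevance_score with its original document.

    `results[].index` is a zero-based position in the `documents` list
    that was sent with the request, so a plain lookup is enough.
    """
    return [(r["relevance_score"], documents[r["index"]]) for r in results]
```

The output keeps the order of `results`, so it stays sorted by descending score.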

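Supported model IDs

The model IDs below are the ones recognized by each rerank channel in the BUZZ source code. Whether a call actually succeeds depends on which channels are enabled for your group; at runtime, treat GET /v1/models as authoritative.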

Cohere channel

rerank-english-v3.0
rerank-multilingual-v3.0
rerank-english-v2.0
rerank-multilingual-v2.0

Jina channel

jina-reranker-v2-base-multilingual
jina-reranker-m0

Error codes

HTTP	Error response envelope	Typical scenario
400	`{"error":{"type":"buzz_error","message":"Model name not specified, ..."}}`	The `model` field is empty or missing.
401	`{"error":{"type":"buzz_error", ...}}`	The `sk-` token is missing or invalid.
429	`{"error":{"type":"buzz_error", ...}}`	Rate limit hit; honor `retry-after`.
503	`{"error":{"code":"model_not_found","type":"buzz_error","message":"No available channel for model <X> under group <G> (distributor)"}}`	The model has no enabled channel in the token's group. Confirm reachability with `GET /v1/models` first, or switch to another supported model ID.
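
A client can turn the table above into a small decision function. The sketch below is one possible mapping, not BUZZ's prescribed handling: the action names and the 1-second fallback delay are assumptions, and it parses `retry-after` only as a number of seconds (an HTTP-date value falls back to the default).

```python
from typing import Optional, Tuple


def classify_error(status: int, body: dict, headers: dict) -> Tuple[str, Optional[float]]:
    """Map an error response to (action, wait_seconds).

    Actions (hypothetical labels): "retry", "switch_model",
    "fix_token", "fix_request", "fail".
    """
    error = body.get("error", {}) if isinstance(body, dict) else {}
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {str(k).lower(): v for k, v in headers.items()}
    if status == 429:
        try:
            wait = float(lowered.get("retry-after"))
        except (TypeError, ValueError):
            wait = 1.0  # assumed fallback when the header is absent or not numeric
        return ("retry", wait)
    if status == 503 and error.get("code") == "model_not_found":
        return ("switch_model", None)
    if status == 401:
        return ("fix_token", None)
    if status == 400:
        return ("fix_request", None)
    return ("fail", None)
```

On "switch_model", the caller would re-check GET /v1/models or fall back to another supported model ID, as the 503 row suggests.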
