# Query Collection

Execute a natural language query against a collection. When `inference=true`, returns an AI-generated response with relevant documents. When `inference=false`, returns raw search results with content and metadata.

## Streaming (SSE)

When `stream: true` and `inference: true`, the response is a Server-Sent Events stream. Every `data:` field is a JSON object with a `type` discriminator.

### SSE Event Types

| `type` value | Schema | Description |
|---|---|---|
| `text` | `QueryStreamTextEvent` | Incremental text chunk of the AI response. |
| `tool.start` | `QueryStreamToolStartEvent` | The agent is performing a knowledge-base search. |
| `tool.end` | `QueryStreamToolEndEvent` | A tool call completed. `tool_call_id` correlates with the preceding `tool.start`. |
| `stream_complete` | `QueryStreamCompleteEvent` | Stream finished successfully. Close the connection. |
| `stream_error` | `QueryStreamErrorEvent` | An error occurred. Close the connection. |

### Example SSE Stream

```
data: {"type":"tool.start","seq":1,"run_id":"run_abc","tool_call_id":"tc_1","name":"searchKnowledgeBase","args":{"query":"revenue projections Q4"}}

data: {"type":"tool.end","seq":2,"run_id":"run_abc","tool_call_id":"tc_1","name":"searchKnowledgeBase","ok":true,"result_summary":{"resultCount":12}}

data: {"type":"text","content":"Based on the documents"}

data: {"type":"text","content":" provided, the revenue"}

data: {"type":"text","content":" projections for Q4 show"}

data: {"type":"text","content":" a 15% increase over Q3."}

data: {"type":"stream_complete","metadata":{"totalResults":12},"stats":{"totalTokens":150}}
```

### Notes

- The agent may perform multiple searches per query. Each search produces a `tool.start` / `tool.end` pair.
- Text chunks are interleaved between tool events: text arrives after the agent has gathered results from a search.
- Connect with `Accept: text/event-stream` and set a generous timeout (120s+) for long responses.
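As a minimal sketch of consuming the wire format above (the helper name is ours, not part of any SDK), each `data:` line can be decoded into an event dict and the `text` chunks concatenated:

```python
import json


def parse_sse_events(raw: str) -> list[dict]:
    """Parse a raw SSE response body into a list of event dicts.

    Each `data:` line carries one JSON object with a `type`
    discriminator, per the event table above.
    """
    events = []
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("data:"):
            events.append(json.loads(line[len("data:"):].strip()))
    return events


# Sample wire data in the shape of the example stream above.
raw = (
    'data: {"type":"text","content":"Based on the documents"}\n'
    "\n"
    'data: {"type":"text","content":" provided."}\n'
    "\n"
    'data: {"type":"stream_complete","metadata":{"totalResults":12},"stats":{"totalTokens":150}}\n'
)

events = parse_sse_events(raw)
answer = "".join(e["content"] for e in events if e["type"] == "text")
print(answer)  # Based on the documents provided.
```

A production client would instead read the stream incrementally from the HTTP response rather than buffering the whole body.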

## Authentication

`Authorization` (Bearer)

Bearer authentication of the form `Bearer <token>`, where `token` is your auth token.

## Path parameters

`collection_name` (string, Required)

## Headers

`X-Organization-ID` (string, Required)

## Request

This endpoint expects an object.
`query` (string, Required)

The natural language query to search for.
`stream` (true, Required)

Enable real-time streaming of the response.

`inference` (boolean, Optional, defaults to `false`)

Enable LLM-generated answers based on the relevant sections retrieved. When `false`, returns raw search results.

`top_k` (integer, Optional, defaults to `10`)

Number of results to return. Only valid when `inference=false`; not supported when `inference=true` (the agent controls its own search strategy).

`rerank` (boolean, Optional, defaults to `false`)

Enable Voyage AI rerank-2.5 reranking for improved relevance ordering. Adds ~100-300ms of latency.

`metadata_filter` (map from strings to any, Optional)

Filter expression for vector search. Supported operators: `$eq`, `$ne`, `$gt`, `$gte`, `$lt`, `$lte`, `$in`, `$nin`, `$and`, `$or`.
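As an illustration of the operators above, a filter restricting results to recent finance documents might be built like this (the document fields `department` and `year` are hypothetical, chosen only for the example):

```python
import json

# Hypothetical metadata fields ("department", "year"); the operators
# ($and, $in, $gte) are the ones the endpoint supports.
metadata_filter = {
    "$and": [
        {"department": {"$in": ["finance", "accounting"]}},
        {"year": {"$gte": 2024}},
    ]
}

# The filter is sent as ordinary JSON inside the request body.
print(json.dumps(metadata_filter))
```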
`custom_prompt` (string, Optional)

Custom system prompt that overrides the default RAG prompt when `inference=true`. Allows customizing how the LLM processes and responds to the query with the retrieved context.

## Response

`text` (object)

Incremental text chunk of the AI response.
OR
`tool.start` (object)

Emitted when the AI agent begins a knowledge-base search.

OR
`tool.end` (object)

Emitted when a tool call completes.
OR
`stream_complete` (object)

Emitted when the stream finishes successfully. Close the connection after receiving this.
OR
`stream_error` (object)

Emitted when an error occurs during generation. Close the connection after receiving this.