Query Collection | Captain Docs

Execute a natural language query against a collection.

When inference=true, returns an AI-generated response with relevant documents. When inference=false, returns raw search results with content and metadata.

Path parameters

collection_name: string, required

Name of the collection to query

Request

This endpoint expects an object.

query: string, required

The natural language query to search for

inference: boolean, optional, defaults to false

Enable LLM-generated answers based on the relevant sections retrieved. When false, returns raw search results.

stream: boolean, optional, defaults to false

Enable real-time streaming of the response

top_k: integer, optional, defaults to 80

Number of results to return

model_id: enum, optional

Model to use for inference. If not specified, defaults to gpt-oss-120b, with claude-sonnet-4.5 as the fallback.

Allowed values: gpt-oss-120b, claude-sonnet-4.5

metadata_filter: map from strings to any, optional

Filter expression for vector search. Supports: $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, $and, $or
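As an illustration, a filter combining several of these operators might look like the sketch below. The field names (`document_type`, `year`, `department`) are hypothetical, and the nesting follows the MongoDB-style convention that the operator names suggest; this page does not spell out the exact syntax, so treat the shape as an assumption.

```python
import json

# Hypothetical metadata fields. The nesting is an assumption modeled on
# MongoDB-style filters, not something this reference confirms.
metadata_filter = {
    "$and": [
        {"document_type": {"$eq": "pdf"}},
        {"year": {"$gte": 2023}},
        {"department": {"$in": ["legal", "finance"]}},
    ]
}

# The filter travels in the request body next to the query text.
request_body = {
    "query": "What are the key terms in the contract?",
    "metadata_filter": metadata_filter,
}

print(json.dumps(request_body, indent=2))
```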

custom_prompt: string, optional

Custom system prompt to override the default RAG prompt when inference=true. Allows customizing how the LLM processes and responds to the query with the retrieved context.
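The request fields above can be assembled into a request object roughly as follows. This is a minimal sketch using only the Python standard library: the helper name `build_query_request` is made up for illustration, the base URL is taken from the curl example on this page, and authentication headers are omitted because this section does not describe them.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL as it appears in the curl example on this page.
BASE_URL = "https://api.runcaptain.com/v2"


def build_query_request(collection_name, query, inference=False,
                        stream=False, top_k=80, **optional):
    """Build (but do not send) a POST request for the query endpoint.

    `optional` may carry model_id, metadata_filter, or custom_prompt.
    No authentication header is set; this section does not document one.
    """
    body = {"query": query, "inference": inference,
            "stream": stream, "top_k": top_k}
    # Only include optional fields that were actually provided.
    body.update({k: v for k, v in optional.items() if v is not None})
    return urllib.request.Request(
        url=f"{BASE_URL}/collections/{quote(collection_name, safe='')}/query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_query_request(
    "my_documents",
    "What are the key terms in the contract?",
    inference=True,
    model_id="claude-sonnet-4.5",
    custom_prompt="Answer in three bullet points.",
)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```

Passing the result to `urllib.request.urlopen` would send it, but that call would presumably need whatever credentials the API requires.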

Response

Successful Response

success: boolean or null

Whether the query was successful

summary: string or null

AI-generated summary/response (when inference=true)

response: string or null

Alias for summary (v1 compatibility)

relevant_documents: list of objects or null

List of relevant documents (when inference=true)

inference: boolean or null

Whether inference mode was used

search_results: list of objects or null

Raw search results with content (when inference=false)

total_results: integer or null

Total number of search results found

top_k: integer or null

Number of results returned

query: string or null

The original query

tokens_used: map from strings to integers, or null

Token usage breakdown by category

execution_time_ms: integer or null

Query execution time in milliseconds

request_id: string or null

Unique request identifier (used for streaming)

streaming: object or null

Streaming configuration (when stream=true)

token_balance: object or null

Current token balance after this request
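Because every response field may be null and the populated fields depend on the inference mode, client code might branch on what is present. The sketch below assumes the field semantics listed above; `extract_result` is a hypothetical helper, and the sample bodies (including the keys inside each search result) are invented for illustration.

```python
def extract_result(body):
    """Return (kind, payload) from a parsed query response.

    Assumes `summary` (or its v1 alias `response`) is populated when
    inference=true, and `search_results` when inference=false.
    """
    answer = body.get("summary") or body.get("response")
    if answer is not None:
        return "answer", {
            "text": answer,
            "documents": body.get("relevant_documents") or [],
        }
    # No generated text: fall back to the raw search results.
    return "results", body.get("search_results") or []


# Invented sample bodies, shaped after the field list above.
inference_body = {"summary": "The key terms include...",
                  "relevant_documents": [{"document_id": "doc_abc123"}]}
search_body = {"summary": None, "search_results": [{"content": "Clause 1"}]}

print(extract_result(inference_body))
print(extract_result(search_body))
```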

curl -X POST https://api.runcaptain.com/v2/collections/my_documents/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the key terms in the contract?",
    "inference": false,
    "stream": false,
    "top_k": 80
  }'

{
  "response": "Based on the contract, the key terms include...",
  "relevant_documents": [
    {
      "relevancy_score": 0.92,
      "document_id": "doc_abc123",
      "document_name": "contract.pdf",
      "document_type": "pdf"
    }
  ],
  "query": "What are the key terms in the contract?",
  "status": "success",
  "collection_name": "my_documents"
}
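A response like the one above could be consumed along these lines. This is a sketch that parses the sample JSON from this page and lists the cited documents by descending `relevancy_score`; whether the API already returns them in ranked order is not stated here, so the sort is a precaution.

```python
import json

# The sample response from this page, embedded as a string.
example = """
{
  "response": "Based on the contract, the key terms include...",
  "relevant_documents": [
    {
      "relevancy_score": 0.92,
      "document_id": "doc_abc123",
      "document_name": "contract.pdf",
      "document_type": "pdf"
    }
  ],
  "query": "What are the key terms in the contract?",
  "status": "success",
  "collection_name": "my_documents"
}
"""

body = json.loads(example)

# Sort defensively; relevant_documents may be null in other responses.
ranked = sorted(body.get("relevant_documents") or [],
                key=lambda doc: doc["relevancy_score"], reverse=True)

lines = [f'{doc["document_name"]} ({doc["document_type"]}): '
         f'{doc["relevancy_score"]:.2f}' for doc in ranked]
print("\n".join(lines))  # prints "contract.pdf (pdf): 0.92"
```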
