Query - v3
Search indexed files and return source chunks with optional document, metadata, region (bounding boxes), relation, and related chunk context.
Compared with v2: the v3 response returns chunk text in results[].text, supports include controls for document, chunk, region, and relation context, and returns structured rerank details.
Result identity: results[].chunk_id is the stable identifier for a retrieved chunk. Use it when reading chunk details, updating chunk metadata, creating chunk relations, or storing a reference to a search result. Chunk IDs include the parent document identifier and the chunk index.
Scores: score is the final retrieval score for the result. metadata.vectorScore, metadata.bm25Score, metadata.rrfScore, and metadata.crossModalRrfScore are ranking signals used to produce the final result order. These values are useful for debugging retrieval behavior and should not be compared across unrelated collections.
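When debugging ranking, it can help to print the final score next to whichever signals are present. The following is a minimal sketch, assuming a result has already been parsed into a Python dict with score at the top level and the signals under metadata, as described above. The sample values and the chunk ID are invented for illustration, and the sketch assumes that a signal may be absent or null for a given result.

```python
# Ranking signals named in this reference; any of them may be missing on a result.
SIGNAL_KEYS = ("vectorScore", "bm25Score", "rrfScore", "crossModalRrfScore")


def score_breakdown(result: dict) -> dict:
    """Return the final score plus every ranking signal that is present."""
    metadata = result.get("metadata") or {}
    breakdown = {"score": result.get("score")}
    for key in SIGNAL_KEYS:
        if metadata.get(key) is not None:
            breakdown[key] = metadata[key]
    return breakdown


# Hypothetical result payload; the values are made up for illustration.
result = {
    "chunk_id": "example-chunk",
    "score": 0.82,
    "metadata": {"vectorScore": 0.74, "bm25Score": 11.3},
}
print(score_breakdown(result))
```

Because the signals come from different scoring methods, the breakdown is best read per result or within one query, not as values on a shared scale.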
Modality and match sources: modality describes the indexed content type that matched the query, such as pdf, document, image, video, spreadsheet, or text. match_sources lists the retrieval signals that contributed to the match, such as content_embedding, keyword, ocr, table, transcript, metadata, or summary.
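A client might use these two fields to slice a result list, for example to see which chunks were found through OCR. This is a rough sketch, assuming modality and match_sources sit directly on each results[] entry and that match_sources is a list of strings. The sample results and chunk IDs are hypothetical.

```python
from collections import defaultdict


def group_by_modality(results: list) -> dict:
    """Map each modality to the chunk IDs that matched with it."""
    groups = defaultdict(list)
    for item in results:
        groups[item["modality"]].append(item["chunk_id"])
    return dict(groups)


def matched_by(results: list, source: str) -> list:
    """Return the results whose match_sources include the given signal."""
    return [item for item in results if source in (item.get("match_sources") or [])]


# Hypothetical results; IDs and values are invented for illustration.
results = [
    {"chunk_id": "a", "modality": "pdf", "match_sources": ["content_embedding", "ocr"]},
    {"chunk_id": "b", "modality": "video", "match_sources": ["transcript"]},
    {"chunk_id": "c", "modality": "pdf", "match_sources": ["keyword"]},
]
print(group_by_modality(results))
print([item["chunk_id"] for item in matched_by(results, "ocr")])
```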
Document and source: document identifies the parent file for the chunk. document.source.type describes how the file was ingested, and document.source.uri identifies the original file location or Captain-managed file URI.
Location: location identifies where the chunk appears inside the source file. PDF and document results use page fields, media results use time fields, and spreadsheet results use sheet, row, and column fields. Fields that do not apply are returned as null.
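Since fields that do not apply come back as null, a client can drop them before display. The sketch below assumes the location object has been parsed into a Python dict. The field names in the sample (page, start_time, end_time, sheet, row, column) are illustrative stand-ins, not the documented schema.

```python
def applicable_location(location: dict) -> dict:
    """Keep only the location fields that apply to this result (non-null)."""
    return {key: value for key, value in location.items() if value is not None}


# Hypothetical location for a PDF result; field names are assumptions.
pdf_location = {
    "page": 3,
    "start_time": None,
    "end_time": None,
    "sheet": None,
    "row": None,
    "column": None,
}
print(applicable_location(pdf_location))  # {'page': 3}
```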
Regions: regions contains extracted layout regions when requested. Regions cover OCR text, form fields, headings, tables, charts, and image areas. Bounding boxes are relative to the rendered page or image, with the origin at the top-left corner. To draw a region in pixels, compute x = left * renderedWidth, y = top * renderedHeight, pixel width = width * renderedWidth, and pixel height = height * renderedHeight.
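The scaling rule for bounding boxes can be sketched as a small helper. This assumes the box has been parsed into a dict with left, top, width, and height keys. Where exactly the box sits inside each regions entry is not specified here, so the input shape is an assumption, and PixelRect and region_to_pixels are hypothetical names.

```python
from typing import NamedTuple


class PixelRect(NamedTuple):
    x: float
    y: float
    width: float
    height: float


def region_to_pixels(box: dict, rendered_width: float, rendered_height: float) -> PixelRect:
    """Scale a relative bounding box to pixel coordinates (top-left origin)."""
    return PixelRect(
        x=box["left"] * rendered_width,
        y=box["top"] * rendered_height,
        width=box["width"] * rendered_width,
        height=box["height"] * rendered_height,
    )


# Hypothetical box, with values relative to the rendered page.
box = {"left": 0.25, "top": 0.5, "width": 0.5, "height": 0.125}
rect = region_to_pixels(box, rendered_width=800, rendered_height=1000)
print(rect)  # PixelRect(x=200.0, y=500.0, width=400.0, height=125.0)
```

Pass the dimensions of the page or image as it is actually rendered on screen, so overlays stay aligned when the view is zoomed or resized.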
Media: media contains media-specific context for audio and video results, such as transcript data for the retrieved segment. If no media-specific context is returned, the field is null.
Metadata: metadata contains Captain-generated fields about retrieval, ranking, source location, and indexed content. custom_metadata contains metadata supplied by your application during indexing or through metadata update endpoints.
Reranking: rerank_score is the score assigned by the reranker for this result. The top-level rerank object reports whether reranking was applied and why. Multimodal collections require reranking to combine text and non-text results into one ranked list.
Relations: relations contains graph edges connected to the retrieved chunk. related_chunks contains the linked chunks when relation context is requested, including their text, location, and metadata.