Advanced Search & Relations

Advanced search in v3 File Search is built around three concepts:

  • Document metadata: stable file-level fields used for query filters and source context.
  • Chunk custom metadata: custom annotations attached to a specific chunk.
  • Chunk relations: typed links between chunks for graph-aware retrieval and review workflows.

Start with the File Search API guide for the basic v3 query flow. This guide covers the knobs that make retrieval more precise.

Document Metadata Filters

Use filter on v3 Query to restrict retrieval to documents that match file-level metadata.

{
  "query": "approved safety claims",
  "limit": 10,
  "filter": {
    "review_state": "approved",
    "year": { "$gte": 2024 },
    "$or": [
      { "department": "medical" },
      { "department": "regulatory" }
    ]
  }
}

Document metadata is best for stable fields such as access tier, source system, review status, file owner, product line, year, or jurisdiction.

Supported operators include bare equality, $eq, $ne, $gt, $gte, $lt, $lte, $in, $nin, $and, and $or.
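To reason about which documents a filter would select, it can help to evaluate the filter locally against sample metadata. The sketch below is not part of the API: matches is a hypothetical helper, and it assumes the operators behave like their MongoDB-style namesakes and that top-level conditions are combined with AND. The service's handling of missing fields or mixed types may differ.

```python
# Comparison operators named in this guide, mapped to plain Python checks.
# Assumption: they behave like their MongoDB-style namesakes.
_COMPARATORS = {
    "$eq": lambda value, arg: value == arg,
    "$ne": lambda value, arg: value != arg,
    "$gt": lambda value, arg: value is not None and value > arg,
    "$gte": lambda value, arg: value is not None and value >= arg,
    "$lt": lambda value, arg: value is not None and value < arg,
    "$lte": lambda value, arg: value is not None and value <= arg,
    "$in": lambda value, arg: value in arg,
    "$nin": lambda value, arg: value not in arg,
}


def matches(metadata: dict, flt: dict) -> bool:
    """Return True if metadata satisfies every condition in flt (hypothetical helper)."""
    for key, condition in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            # Operator form, e.g. {"year": {"$gte": 2024}}.
            value = metadata.get(key)
            for op, arg in condition.items():
                if not _COMPARATORS[op](value, arg):
                    return False
        elif metadata.get(key) != condition:
            # Bare equality, e.g. {"review_state": "approved"}.
            return False
    return True


example_filter = {
    "review_state": "approved",
    "year": {"$gte": 2024},
    "$or": [{"department": "medical"}, {"department": "regulatory"}],
}

# Illustrative document metadata, not real data.
doc_a = {"review_state": "approved", "year": 2025, "department": "medical"}
doc_b = {"review_state": "approved", "year": 2023, "department": "medical"}
doc_c = {"review_state": "draft", "year": 2025, "department": "regulatory"}

print(matches(doc_a, example_filter))  # True
print(matches(doc_b, example_filter))  # False: year is below 2024
print(matches(doc_c, example_filter))  # False: review_state is not approved
```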

Include Controls

Use include on v3 Query to request only the extra objects your application needs.

{
  "query": "contract renewal obligations",
  "limit": 10,
  "include": {
    "document": true,
    "metadata": true,
    "regions": true,
    "relations": true,
    "related_chunks": true
  }
}

Keep regions, relations, and related_chunks off unless the caller needs them. The base query response is easier for agents to consume when it returns only source text and core metadata.
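One way to keep heavy objects off by default is to build the request body from an explicit list of flags. The following is a minimal sketch: build_query is a hypothetical helper, and it assumes that a flag omitted from include is treated as off by the service.

```python
# Include flags shown in this guide's example request.
KNOWN_INCLUDE_FLAGS = ("document", "metadata", "regions", "relations", "related_chunks")


def build_query(query: str, limit: int = 10, *, include: tuple = ()) -> dict:
    """Build a v3 Query body that enables only the requested include flags."""
    unknown = set(include) - set(KNOWN_INCLUDE_FLAGS)
    if unknown:
        raise ValueError(f"unknown include flags: {sorted(unknown)}")
    body = {"query": query, "limit": limit}
    if include:
        # Assumption: flags left out of "include" stay off.
        body["include"] = {flag: True for flag in include}
    return body


# A simple agent flow asks only for source context.
agent_body = build_query("contract renewal obligations", include=("document", "metadata"))

# A review UI opts in to layout regions as well.
review_body = build_query(
    "contract renewal obligations",
    include=("document", "metadata", "regions"),
)
```

Validating flag names up front surfaces typos in application code instead of sending a request whose include block is silently incomplete.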

Chunk Custom Metadata

Use chunk metadata when your application needs annotations below the document level, such as claim type, reviewer status, extraction confidence, entity IDs, or a workflow state for one chunk.

For example, this request sets custom metadata on one chunk with a PATCH call to the chunk metadata endpoint:

import requests

BASE_URL = "https://api.runcaptain.com"
API_KEY = "your_api_key"
COLLECTION = "medical_claims"
CHUNK_ID = "chk_abc123_004"

# Set custom metadata on a single chunk.
response = requests.patch(
    f"{BASE_URL}/v3/collections/{COLLECTION}/chunks/{CHUNK_ID}/metadata",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "metadata": {
            "claim_type": "safety",
            "review_state": "approved",
        }
    },
)
response.raise_for_status()  # Raise on 4xx/5xx responses.

Query and document responses expose these annotations as custom_metadata on chunks. The request and response bodies of the metadata endpoint use metadata instead, because the URL is already scoped to one chunk.
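On the read side, an application can select chunks by their custom_metadata after a query returns. The sketch below assumes each chunk arrives as a dict; chunks_with is a hypothetical helper, and every key other than custom_metadata (such as id) is illustrative rather than taken from the API.

```python
def chunks_with(chunks: list, **expected) -> list:
    """Keep chunks whose custom_metadata matches every expected key/value."""
    selected = []
    for chunk in chunks:
        # Chunks without annotations are treated as having empty metadata.
        annotations = chunk.get("custom_metadata") or {}
        if all(annotations.get(key) == value for key, value in expected.items()):
            selected.append(chunk)
    return selected


# Hypothetical chunk objects; only custom_metadata is named by this guide.
sample_chunks = [
    {"id": "chk_abc123_004",
     "custom_metadata": {"claim_type": "safety", "review_state": "approved"}},
    {"id": "chk_abc123_005",
     "custom_metadata": {"claim_type": "efficacy", "review_state": "approved"}},
    {"id": "chk_abc123_006"},
]

approved_safety = chunks_with(sample_chunks, claim_type="safety", review_state="approved")
print([chunk["id"] for chunk in approved_safety])  # ['chk_abc123_004']
```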

Documents, Chunks, and Regions

Use Get Document, List Chunks, and Get Chunk when you need source inspection without running a new search.

Set include_regions=true when the UI or downstream workflow needs extracted layout regions. Regions are useful for PDF and document review flows, but should not be requested by default in simple agent search flows.

Chunk Relations

Relations link one chunk to another with a typed edge. Use them when your application needs source graph behavior, such as cited-by links, contradiction links, section-to-table links, or claim-to-evidence links.

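For example, this request creates a supports relation from one chunk to another with a POST call to the chunk relations endpoint: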

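import requests

# Reuses BASE_URL, API_KEY, COLLECTION, and CHUNK_ID from the chunk metadata example.
# Create a typed edge from CHUNK_ID to the target chunk.
response = requests.post(
    f"{BASE_URL}/v3/collections/{COLLECTION}/chunks/{CHUNK_ID}/relations",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "target_chunk_id": "chk_abc123_009",
        "relation_type": "supports",
        "metadata": {
            "reviewer": "clinical-review",
        },
    },
)
response.raise_for_status()  # Raise on 4xx/5xx responses.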
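Claim-to-evidence workflows often link one claim chunk to several evidence chunks. A rough sketch of preparing those calls follows. relation_request is a hypothetical helper, the evidence chunk IDs are illustrative, and treating metadata as optional is an assumption. The helper only builds the URL and body so that the pairs can be inspected or logged before anything is sent.

```python
BASE_URL = "https://api.runcaptain.com"


def relation_request(collection, source_chunk_id, target_chunk_id,
                     relation_type, metadata=None):
    """Return (url, body) for one create-relation call on the source chunk."""
    url = f"{BASE_URL}/v3/collections/{collection}/chunks/{source_chunk_id}/relations"
    body = {"target_chunk_id": target_chunk_id, "relation_type": relation_type}
    if metadata:
        # Assumption: metadata may be omitted when there is nothing to record.
        body["metadata"] = metadata
    return url, body


# Link one claim chunk to each evidence chunk that supports it.
claim_id = "chk_abc123_004"
evidence_ids = ["chk_abc123_009", "chk_abc123_012"]  # illustrative IDs

pending = [
    relation_request("medical_claims", claim_id, evidence_id, "supports",
                     {"reviewer": "clinical-review"})
    for evidence_id in evidence_ids
]

for url, body in pending:
    print(url, body)
```

Each (url, body) pair can then be sent with the same headers as the POST request above.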

Relation-Aware Queries

Use include.relations, include.related_chunks, and relation_direction on v3 Query when a search result should carry graph context.

{
  "query": "claims supported by clinical evidence",
  "limit": 10,
  "include": {
    "relations": true,
    "related_chunks": true
  },
  "relation_direction": "outgoing"
}
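The effect of relation_direction can be pictured with a small local edge list. This sketch rests on assumptions: that "outgoing" means relations whose source is the matched chunk, that an "incoming" counterpart exists (only "outgoing" appears in this guide), and that the edge data and the contradicts relation type are illustrative. neighbors is a hypothetical helper, not an API call.

```python
# (source_chunk_id, relation_type, target_chunk_id); illustrative edges only.
EDGES = [
    ("chk_abc123_004", "supports", "chk_abc123_009"),
    ("chk_abc123_007", "contradicts", "chk_abc123_004"),
]


def neighbors(chunk_id: str, direction: str = "outgoing") -> list:
    """Return (relation_type, other_chunk_id) pairs for one chunk."""
    if direction == "outgoing":
        # Edges that start at this chunk.
        return [(rel, dst) for src, rel, dst in EDGES if src == chunk_id]
    if direction == "incoming":
        # Edges that point at this chunk (assumed counterpart value).
        return [(rel, src) for src, rel, dst in EDGES if dst == chunk_id]
    raise ValueError(f"unsupported direction: {direction!r}")


print(neighbors("chk_abc123_004", "outgoing"))  # [('supports', 'chk_abc123_009')]
print(neighbors("chk_abc123_004", "incoming"))  # [('contradicts', 'chk_abc123_007')]
```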

Use relation-aware queries when the graph should affect what the caller can inspect after retrieval. Use plain queries when the caller only needs the best matching source chunks.