Index R2 File

Index a single file from a Cloudflare R2 bucket into a collection.

Headers:
- Authorization: Bearer {api_key} - Captain API key for authentication
- X-Organization-ID: Organization UUID

Args:
- collection_name: Name of the collection (path parameter)
- body: R2 file configuration with file_uri (r2://bucket/path/to/file.pdf)

Returns: { job_id, status: "pending" }

Path parameters

collection_name: string (Required)

Headers

authorization: string or null (Optional)

Request

This endpoint expects an object.
access_key_id: string (Required)
account_id: string (Required)
bucket_name: string (Required)
file_uri: string (Required)

R2 object URI in the format r2://bucket-name/path/to/file.pdf

processing_type: enum (Required)

Document processing type. ‘advanced’ uses agentic OCR with AI-enhanced extraction for complex layouts, tables, figures, charts, and documents containing images. ‘basic’ provides reliable OCR optimized for general document indexing and high-volume processing.

secret_access_key: string (Required)
custom_metadata: map from strings to strings, integers, doubles, booleans, or lists of strings; or null (Optional)

Custom metadata to attach to all chunks from this file. Keys must be strings. Values: str, int, float, bool, or List[str].

jurisdiction: enum or null (Optional, defaults to default)
overwrite_existing: boolean (Optional, defaults to false)

When true, files that already exist in the collection are deleted and re-indexed with the latest changes. Requires skip_existing=false; because skip_existing defaults to true, it must be set to false explicitly. Setting both to true returns a 400 error.

parsing_script: string or null (Optional)

Relative path to a JS parsing script for JSON files (e.g. ‘research/paper-parser’). When provided, .json files are processed through a sandboxed V8 isolate. Without this, .json files are indexed as raw text.

skip_existing: boolean (Optional, defaults to true)

When true, files already indexed in the collection are skipped and will not be re-indexed with incoming changes. When false, all incoming files are indexed regardless of whether they already exist.
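
The request fields above can be assembled and checked client-side before sending. The following is a minimal sketch, not part of the Captain API: the helper name `build_index_r2_body` and all of its local checks are hypothetical, and the placeholder credentials are illustrative only. The endpoint URL is not stated in this reference, so the sketch stops at producing the JSON body. The optional `jurisdiction` and `parsing_script` fields are left out for brevity.

```python
import json
from urllib.parse import urlsplit

# Enum values taken from the processing_type description above.
ALLOWED_PROCESSING_TYPES = {"basic", "advanced"}


def _is_valid_metadata_value(value):
    """Allowed value types per the custom_metadata description."""
    if isinstance(value, (str, int, float, bool)):
        return True
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


def build_index_r2_body(
    *,
    account_id,
    access_key_id,
    secret_access_key,
    bucket_name,
    file_uri,
    processing_type,
    custom_metadata=None,
    skip_existing=True,        # documented default
    overwrite_existing=False,  # documented default
):
    """Hypothetical helper: validate inputs locally and return the body dict."""
    parts = urlsplit(file_uri)
    if parts.scheme != "r2" or not parts.netloc or not parts.path.strip("/"):
        raise ValueError("file_uri must look like r2://bucket-name/path/to/file.pdf")
    if processing_type not in ALLOWED_PROCESSING_TYPES:
        raise ValueError("processing_type must be 'basic' or 'advanced'")
    # The reference says setting both flags to true returns a 400 error,
    # so reject that combination before making the call.
    if skip_existing and overwrite_existing:
        raise ValueError("overwrite_existing=True requires skip_existing=False")

    body = {
        "account_id": account_id,
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
        "bucket_name": bucket_name,
        "file_uri": file_uri,
        "processing_type": processing_type,
        "skip_existing": skip_existing,
        "overwrite_existing": overwrite_existing,
    }
    if custom_metadata is not None:
        for key, value in custom_metadata.items():
            if not isinstance(key, str) or not _is_valid_metadata_value(value):
                raise ValueError(f"unsupported custom_metadata entry: {key!r}")
        body["custom_metadata"] = custom_metadata
    return body


# Example: re-index a file that already exists in the collection.
example_body = build_index_r2_body(
    account_id="example-account-id",
    access_key_id="example-access-key-id",
    secret_access_key="example-secret",
    bucket_name="my-bucket",
    file_uri="r2://my-bucket/reports/q1.pdf",
    processing_type="advanced",
    custom_metadata={"year": 2024, "tags": ["finance", "quarterly"]},
    skip_existing=False,
    overwrite_existing=True,
)
print(json.dumps(example_body, indent=2))
```

Rejecting the conflicting flag combination locally avoids a round trip that the reference says would end in a 400 error. Whether the bucket named in `file_uri` must match `bucket_name` is not stated here, so the sketch does not check it.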

Response

Successful Response
job_id: string
status: string (defaults to pending)
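
A small sketch of reading the successful response, assuming it arrives as a JSON object with the two fields listed above. The names `IndexJob` and `parse_index_response` are hypothetical, and the sample payload is illustrative rather than captured from the API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexJob:
    """Hypothetical container for the documented response fields."""
    job_id: str
    status: str = "pending"  # documented default


def parse_index_response(raw: str) -> IndexJob:
    """Parse a JSON response body into an IndexJob."""
    data = json.loads(raw)
    # job_id is required; fall back to the documented default for status.
    return IndexJob(job_id=data["job_id"], status=data.get("status", "pending"))


job = parse_index_response('{"job_id": "example-job-id", "status": "pending"}')
print(job)
```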

Errors

400
Bad Request Error