Index GCS File

Index a single file from a GCS bucket into a collection.

Headers:
- Authorization: Bearer {api_key} (Captain API key for authentication)
- X-Organization-ID: Organization UUID

Args:
- collection_name: Name of the collection (path parameter)
- body: GCS file configuration with file_uri

Returns: { job_id, status: "pending" }

Path parameters

collection_name string Required

Headers

authorization string Optional
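As a minimal sketch, the two headers described at the top of this page could be assembled as below. The function name `build_headers` and its argument names are hypothetical; only the header names and the `Bearer {api_key}` format come from this page.

```python
from typing import Dict


def build_headers(api_key: str, organization_id: str) -> Dict[str, str]:
    """Assemble the Authorization and X-Organization-ID headers.

    Hypothetical helper: the header names and the Bearer format are taken
    from this page; everything else is illustrative.
    """
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Organization-ID": organization_id,
    }
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.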

Request

This endpoint expects an object.
bucket_name string Required
file_uri string Required
processing_type enum Required

Document processing type. ‘advanced’ uses agentic OCR with AI-enhanced extraction for complex layouts, tables, figures, charts, and documents containing images. ‘basic’ provides reliable OCR optimized for general document indexing and high-volume processing.

Allowed values: advanced, basic

service_account_json string Required
custom_metadata map from strings to optional strings, integers, doubles, booleans, or lists of strings Optional

Custom metadata to attach to all chunks from this file. Keys must be strings. Values: str, int, float, bool, or List[str].
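The value rules above can be checked on the client before sending a request. The following is a sketch only: `validate_custom_metadata` is a hypothetical helper, not part of the API, and it assumes that `None` is acceptable because the type is described as a map to optional values.

```python
from typing import Any, Mapping


def validate_custom_metadata(metadata: Mapping[Any, Any]) -> None:
    """Raise TypeError if metadata breaks the documented key/value rules.

    Assumption: None is accepted as a value, since the field is described
    as a map to *optional* values.
    """
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} must be a string")
        # Scalars: str, int, float, bool (bool is a subclass of int in Python).
        if value is None or isinstance(value, (str, int, float, bool)):
            continue
        # Lists are allowed only when every element is a string.
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            continue
        raise TypeError(
            f"metadata value for {key!r} must be str, int, float, bool, "
            "or a list of str"
        )
```

Running this check locally surfaces type problems before the indexing job is submitted.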

overwrite_existing boolean Optional Defaults to false

When true, files that already exist in the collection will be deleted and re-indexed with the latest changes. Requires skip_existing=false. Setting both to true returns a 400 error.

parsing_script string Optional

Relative path to a JS parsing script for JSON files (e.g. ‘research/paper-parser’). When provided, .json files are processed through a sandboxed V8 isolate. Without this, .json files are indexed as raw text.

skip_existing boolean Optional Defaults to true

When true, files already indexed in the collection are skipped and will not be re-indexed with incoming changes. When false, all incoming files are indexed regardless of whether they already exist.
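To illustrate how the request fields fit together, here is a hedged sketch that builds the request object and mirrors, on the client side, the documented rule that setting both overwrite_existing and skip_existing to true returns a 400 error. The function `build_index_body` is hypothetical, and the local check on processing_type assumes that 'advanced' and 'basic' are the only allowed values.

```python
from typing import Any, Dict, Optional


def build_index_body(
    bucket_name: str,
    file_uri: str,
    processing_type: str,
    service_account_json: str,
    *,
    custom_metadata: Optional[Dict[str, Any]] = None,
    overwrite_existing: bool = False,  # documented default
    parsing_script: Optional[str] = None,
    skip_existing: bool = True,  # documented default
) -> Dict[str, Any]:
    """Build the JSON-serializable request object (illustrative only)."""
    # Assumption: these two values are the full set of allowed values.
    if processing_type not in ("advanced", "basic"):
        raise ValueError("processing_type must be 'advanced' or 'basic'")
    # Mirror the documented server rule: both flags true is a 400 error.
    if overwrite_existing and skip_existing:
        raise ValueError(
            "overwrite_existing=True requires skip_existing=False"
        )
    body: Dict[str, Any] = {
        "bucket_name": bucket_name,
        "file_uri": file_uri,
        "processing_type": processing_type,
        "service_account_json": service_account_json,
        "overwrite_existing": overwrite_existing,
        "skip_existing": skip_existing,
    }
    # Optional fields are included only when the caller supplies them.
    if custom_metadata is not None:
        body["custom_metadata"] = custom_metadata
    if parsing_script is not None:
        body["parsing_script"] = parsing_script
    return body
```

Because skip_existing defaults to true, a caller who wants to overwrite must pass both `overwrite_existing=True` and `skip_existing=False`; passing only the former trips the check above.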

Response

Successful Response
job_id string
status string Defaults to pending
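A small sketch of reading the successful response, assuming it has already been decoded from JSON into a dictionary. The `IndexJob` class and `parse_index_response` function are hypothetical names; only the `job_id` and `status` fields and the `pending` default come from this page.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class IndexJob:
    """Hypothetical container for the two documented response fields."""
    job_id: str
    status: str = "pending"  # documented default


def parse_index_response(payload: Mapping[str, Any]) -> IndexJob:
    """Convert a decoded JSON response into an IndexJob.

    Raises KeyError if job_id is missing; falls back to the documented
    default when status is absent.
    """
    return IndexJob(
        job_id=str(payload["job_id"]),
        status=str(payload.get("status", "pending")),
    )
```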

Errors

400
Bad Request Error (Index Gcs File V2 request)