Index GCS File
Index a single file from a GCS bucket into a collection. Returns a job_id for tracking progress.
Authentication
Path parameters
Headers
Request
GCS URI format: gs://bucket-name/path/to/file.pdf
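As a rough illustration of the URI format above, a client could check and split a `gs://` URI before submitting the request. This is a minimal sketch using plain string handling; the helper name `split_gcs_uri` is hypothetical and not part of the API, and the server may apply stricter validation.

```python
from typing import Tuple


def split_gcs_uri(uri: str) -> Tuple[str, str]:
    """Split a gs:// URI into (bucket_name, object_path).

    Hypothetical client-side helper, not part of the API.
    """
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a URI starting with {prefix!r}, got {uri!r}")

    # Everything up to the first "/" is the bucket; the rest is the file path.
    bucket, _, object_path = uri[len(prefix):].partition("/")
    if not bucket or not object_path:
        raise ValueError(f"URI must name both a bucket and a file: {uri!r}")
    return bucket, object_path
```

String splitting is used instead of a general URL parser so that object names containing characters such as `?` or `#` stay intact.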
Document processing type. ‘advanced’ uses agentic OCR with AI-enhanced extraction for complex layouts, tables, figures, charts, and documents containing images. ‘basic’ provides reliable OCR optimized for general document indexing and high-volume processing.
Custom metadata to attach to all chunks from this file. Keys must be strings. Each value must be a string, integer, float, boolean, or array of strings.
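The metadata constraints can be checked on the client before sending a request. The sketch below assumes the metadata is passed as a JSON object built from a Python dict; the function name `validate_metadata` is hypothetical, and the server's own validation is authoritative.

```python
def validate_metadata(metadata: dict) -> None:
    """Raise TypeError if metadata breaks the documented key/value rules.

    Hypothetical client-side check, not part of the API.
    """
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} must be a string")

        # Scalars: str, int, float, bool (bool is a subclass of int in Python).
        if isinstance(value, (str, int, float, bool)):
            continue

        # Arrays are allowed only when every element is a string.
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            continue

        raise TypeError(
            f"metadata value for {key!r} must be str, int, float, bool, "
            f"or a list of strings, got {type(value).__name__}"
        )
```

For example, `{"author": "Ada", "year": 2024, "tags": ["draft", "internal"]}` passes, while `{"tags": [1, 2]}` or a nested object is rejected.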
Relative path to a JavaScript parsing script for JSON files (e.g. ‘research/paper-parser’). When provided, .json files are processed through a sandboxed V8 isolate that executes the script to extract text and metadata. Without this parameter, .json files are indexed as raw text. Scripts are org-scoped and managed in the Parser Studio.
When true, files already indexed in the collection are skipped, so incoming changes to those files are not indexed. When false, all incoming files are indexed regardless of whether they already exist.
When true, files that already exist in the collection will be deleted and re-indexed with the latest changes. Requires skip_existing=false. Setting both to true returns a 400 error.
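The interaction between the two flags can be summarized in a small client-side check. This is a sketch only: `skip_existing` is named in the text above, but the name of the re-index flag is not shown here, so `reindex_existing` is a placeholder, as are the function name and the returned labels.

```python
def resolve_existing_file_mode(skip_existing: bool, reindex_existing: bool) -> str:
    """Return a label for how already-indexed files would be handled.

    `reindex_existing` is a placeholder name for the delete-and-re-index
    flag; the labels returned here are illustrative, not API values.
    """
    if skip_existing and reindex_existing:
        # The API is documented to reject this combination with a 400 error.
        raise ValueError("skip_existing and the re-index flag cannot both be true")
    if skip_existing:
        return "skip"                # existing files are left untouched
    if reindex_existing:
        return "delete_and_reindex"  # existing files are deleted, then re-indexed
    return "index_all"               # every incoming file is indexed
```

Checking the combination before sending the request avoids a round trip that would end in a 400 response.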