Index R2 Directory
Index all files from a specific directory (prefix) in a Cloudflare R2 bucket into a collection. Uses prefix-based filtering to index only objects within the specified path. Returns a job_id for tracking progress via GET /v2/jobs/{job_id}.
Authentication
Bearer authentication. Send the header Authorization: Bearer <token>, where <token> is your auth token.
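As a minimal sketch of how the bearer token and the job-tracking endpoint fit together, the snippet below builds (but does not send) a GET request for /v2/jobs/{job_id}. The base URL, the helper name build_job_status_request, and the sample job ID and token are placeholders invented for illustration; only the path and the Bearer scheme come from this page.

```python
from urllib.request import Request

# Hypothetical base URL (an assumption): substitute the real API host.
BASE_URL = "https://api.example.com"


def build_job_status_request(job_id: str, token: str) -> Request:
    """Build, without sending, a GET request for /v2/jobs/{job_id}."""
    return Request(
        f"{BASE_URL}/v2/jobs/{job_id}",
        # Bearer authentication: "Authorization: Bearer <token>"
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Placeholder job ID and token, for illustration only.
req = build_job_status_request("job_123", "my-token")
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would send the request. The shape of the response is not described on this page, so the sketch stops before parsing it.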
Path parameters
Request
Path to the directory (prefix) within the bucket. Accepts either a relative path (e.g., ‘reports/2024/january’) or a full R2 URI (e.g., ‘r2://my-bucket/reports/2024/january’). All objects within this prefix will be indexed.
Cloudflare account ID, found in your R2 dashboard URL.
Document processing type. ‘advanced’ uses agentic OCR with AI-enhanced extraction for complex layouts, tables, figures, charts, and documents containing images. ‘basic’ provides reliable OCR optimized for general document indexing and high-volume processing.
R2 jurisdiction. ‘default’ for global, ‘eu’ for EU-only storage, ‘fedramp’ for FedRAMP-compliant storage.
Maximum number of files to index (optional)
Skip files that are already indexed in the collection. When true, only new files will be indexed. Set to false to re-index all files.
Custom metadata to attach to all indexed chunks. Keys must be strings. Values must be a string, integer, float, boolean, or array of strings.
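The directory path and metadata rules above can be checked on the client before a request is sent. The following is a sketch under stated assumptions, not the service's own validation: split_directory_path and validate_metadata are hypothetical helper names, and the server may apply additional or different checks.

```python
from typing import Optional, Tuple


def split_directory_path(path: str) -> Tuple[Optional[str], str]:
    """Split a directory path into (bucket, prefix).

    A full R2 URI such as 'r2://my-bucket/reports/2024/january' yields
    the bucket name and the prefix. A relative path yields (None, path).
    """
    scheme = "r2://"
    if path.startswith(scheme):
        bucket, _, prefix = path[len(scheme):].partition("/")
        if not bucket:
            raise ValueError("R2 URI is missing a bucket name")
        return bucket, prefix
    return None, path


def validate_metadata(metadata: dict) -> None:
    """Raise TypeError if the metadata breaks the documented rules.

    Keys must be strings. Values must be str, int, float, bool,
    or a list of strings.
    """
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} is not a string")
        if isinstance(value, (str, int, float, bool)):
            continue
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            continue
        raise TypeError(f"metadata value for {key!r} has an unsupported type")


print(split_directory_path("r2://my-bucket/reports/2024/january"))
validate_metadata({"team": "finance", "year": 2024, "tags": ["q1", "audit"]})
```

Nested objects and lists that mix types are rejected here because the description allows only scalar values and arrays of strings.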