Index URL
Index documents from one or more public URLs into a collection.
Accepts either a single url string or a urls array of strings pointing to hosted documents (PDF, TXT, DOCX, CSV, XLSX, etc.).
Documents are downloaded and processed through the same pipeline as cloud storage indexing.
Returns a job_id for tracking progress via GET /v2/jobs/{job_id}.
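As a minimal sketch of tracking the returned job_id, the following Python polls GET /v2/jobs/{job_id} until the job finishes. The host name, the "status" field, and the terminal status values are assumptions made for illustration; none of them is specified on this page. The HTTP call is passed in as a function so the loop works with any client.

```python
import time
from typing import Any, Callable, Dict
from urllib.parse import quote

BASE_URL = "https://api.example.com"  # hypothetical host, not given on this page
TERMINAL_STATUSES = {"completed", "failed"}  # assumed values, check the jobs reference


def job_status_url(job_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /v2/jobs/{job_id}."""
    return f"{base_url.rstrip('/')}/v2/jobs/{quote(job_id, safe='')}"


def wait_for_job(
    job_id: str,
    fetch_json: Callable[[str], Dict[str, Any]],
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> Dict[str, Any]:
    """Poll the job endpoint until an assumed terminal status appears.

    fetch_json is any callable that performs an authenticated GET on the
    given URL and returns the decoded JSON body as a dict.
    """
    for attempt in range(max_attempts):
        job = fetch_json(job_status_url(job_id))
        if job.get("status") in TERMINAL_STATUSES:  # "status" key is an assumption
            return job
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} attempts")
```

In practice fetch_json would wrap an HTTP client call that sends the bearer token; keeping it injectable also makes the loop easy to exercise without network access.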
Authentication
Bearer authentication: send an Authorization header of the form Bearer <token>, where <token> is your auth token.
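For illustration, a small Python helper that builds the bearer header might look like this. The function name auth_headers is hypothetical; the header format follows the standard Bearer scheme.

```python
from typing import Dict


def auth_headers(token: str) -> Dict[str, str]:
    """Return the Authorization header for bearer authentication."""
    token = token.strip()
    if not token:
        raise ValueError("token must be a non-empty string")
    return {"Authorization": f"Bearer {token}"}
```

The returned dict can be passed as the headers argument of most Python HTTP clients.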
Path parameters
Request
Document processing type. ‘advanced’ uses agentic OCR with AI-enhanced extraction for complex layouts, tables, figures, charts, and documents containing images. ‘basic’ provides reliable OCR optimized for general document indexing and high-volume processing.
A single public URL to a hosted document (PDF, TXT, DOCX, etc.). Provide either ‘url’ or ‘urls’, not both.
Custom metadata to attach to all indexed chunks. Keys must be strings. Each value must be a string, integer, float, boolean, or array of strings.
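The request rules above (exactly one of url or urls, and the allowed metadata types) can be checked on the client before sending. The sketch below is one way to do that in Python. The field names url and urls come from this page; the "metadata" key and the function name build_index_body are assumptions. The processing type field is left out because its name is not shown here.

```python
from typing import Any, Dict, List, Optional


def validate_metadata(metadata: Dict[str, Any]) -> None:
    """Check keys are strings and values are str, int, float, bool, or list of str."""
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} must be a string")
        if isinstance(value, (str, int, float, bool)):
            continue
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            continue
        raise TypeError(f"metadata value for {key!r} has an unsupported type")


def build_index_body(
    url: Optional[str] = None,
    urls: Optional[List[str]] = None,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a request body with either 'url' or 'urls', never both."""
    if (url is None) == (urls is None):
        raise ValueError("provide exactly one of 'url' or 'urls'")
    body: Dict[str, Any] = {}
    if url is not None:
        body["url"] = url
    else:
        if not urls or not all(isinstance(u, str) for u in urls):
            raise ValueError("'urls' must be a non-empty list of strings")
        body["urls"] = list(urls)
    if metadata is not None:
        validate_metadata(metadata)
        body["metadata"] = dict(metadata)  # key name is an assumption
    return body
```

Validating locally is optional, since the API presumably enforces the same rules, but it surfaces mistakes before a job is created.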