Index URL

Index documents from public URL(s) into a collection.

Accepts either a single url string or a urls array of strings pointing to hosted documents (PDF, TXT, DOCX, CSV, XLSX, etc.).

Documents are downloaded and processed through the same pipeline as cloud storage indexing.

Returns a job_id for tracking progress via GET /v2/jobs/{job_id}.
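A minimal sketch of polling the job endpoint mentioned above. Only the path GET /v2/jobs/{job_id} comes from this page. The names job_status_path, poll_job, fetch, and is_finished are hypothetical. The HTTP call and the "finished" check are supplied by the caller because this page lists neither the base URL nor the job status values.

```python
import time
from typing import Any, Callable, Dict
from urllib.parse import quote


def job_status_path(job_id: str) -> str:
    """Build the documented status path: GET /v2/jobs/{job_id}."""
    # Percent-encode the id so unusual characters cannot alter the path.
    return f"/v2/jobs/{quote(job_id, safe='')}"


def poll_job(
    job_id: str,
    fetch: Callable[[str], Dict[str, Any]],
    is_finished: Callable[[Dict[str, Any]], bool],
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> Dict[str, Any]:
    """Call fetch(path) until is_finished(job) is true or attempts run out.

    fetch is a caller-supplied function (an assumption, not part of the API)
    that performs the authenticated GET and returns the parsed JSON body.
    is_finished is also caller-supplied, since the status values are not
    listed on this page.
    """
    path = job_status_path(job_id)
    for _ in range(max_attempts):
        job = fetch(path)
        if is_finished(job):
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")
```

Passing the fetch function in keeps the loop independent of any particular HTTP client and makes it easy to exercise with a stub.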

Authentication

Authorization Bearer

Bearer authentication of the form Bearer <token>, where token is your auth token.

X-Organization-ID string
API Key authentication via header
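The two headers described above could be assembled as follows. This is a sketch: build_auth_headers is a hypothetical helper, and treating X-Organization-ID as optional is an assumption, since this page does not say whether it is required.

```python
from typing import Dict, Optional


def build_auth_headers(token: str, organization_id: Optional[str] = None) -> Dict[str, str]:
    """Return request headers using the documented header names."""
    if not token:
        raise ValueError("token must be a non-empty string")
    # Documented form: "Bearer <token>".
    headers = {"Authorization": f"Bearer {token}"}
    # Assumption: the organization header is sent only when a value is given.
    if organization_id is not None:
        headers["X-Organization-ID"] = organization_id
    return headers
```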

Path parameters

collection_name string Required

Request

This endpoint expects an object.
processing_type enum Required

Document processing type. ‘advanced’ uses agentic OCR with AI-enhanced extraction for complex layouts, tables, figures, charts, and documents containing images. ‘basic’ provides reliable OCR optimized for general document indexing and high-volume processing.

Allowed values: advanced, basic

url string Optional

A single public URL to a hosted document (PDF, TXT, DOCX, etc.). Provide either ‘url’ or ‘urls’, not both.

urls list of strings Optional

An array of public URLs to hosted documents. Provide either 'url' or 'urls', not both.

custom_metadata map from strings to any Optional

Custom metadata to attach to all indexed chunks. Keys must be strings. Each value must be a str, int, float, bool, or an array of strings.
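The request rules above can be checked on the client before sending. The sketch below is illustrative only: build_index_request is a hypothetical helper, and it assumes that exactly one of 'url' or 'urls' must be present (the page marks both as optional but says to provide one or the other). The endpoint's own path and HTTP method are not shown on this page, so the function only builds the JSON body.

```python
from typing import Any, Dict, List, Optional

# The two processing types described in the field documentation.
ALLOWED_PROCESSING_TYPES = {"advanced", "basic"}


def _is_valid_metadata_value(value: Any) -> bool:
    """Accept str, int, float, bool, or a list containing only strings."""
    if isinstance(value, (str, int, float, bool)):
        return True
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def build_index_request(
    processing_type: str,
    url: Optional[str] = None,
    urls: Optional[List[str]] = None,
    custom_metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the request body, enforcing the documented field rules."""
    if processing_type not in ALLOWED_PROCESSING_TYPES:
        raise ValueError(f"unsupported processing_type: {processing_type!r}")
    # Assumption: exactly one of 'url' / 'urls' is required.
    if (url is None) == (urls is None):
        raise ValueError("provide exactly one of 'url' or 'urls'")

    body: Dict[str, Any] = {"processing_type": processing_type}
    if url is not None:
        body["url"] = url
    else:
        body["urls"] = list(urls)

    if custom_metadata is not None:
        for key, value in custom_metadata.items():
            if not isinstance(key, str):
                raise TypeError("custom_metadata keys must be strings")
            if not _is_valid_metadata_value(value):
                raise TypeError(f"unsupported metadata value for {key!r}")
        body["custom_metadata"] = dict(custom_metadata)
    return body
```

For example, build_index_request("basic", url="https://example.com/report.pdf") yields a body containing only processing_type and url. The example URL is a placeholder.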

Response

Indexing job started
job_id string
status enum
Allowed values: