Index URLs

Index documents from public URLs into a collection. No cloud storage credentials are required.

You can provide either:
- `url`: a single URL string for one document
- `urls`: an array of URL strings for multiple documents

Supported file types: PDF, DOCX, DOC, XLSX, XLS, CSV, TSV, TXT, MD, JSON, YAML, YML, PNG, JPG, JPEG, GIF, BMP, TIFF.

Documents are downloaded and processed through the same pipeline as cloud storage indexing. Returns a `job_id` for tracking progress via `GET /v2/jobs/{job_id}`.
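The job tracking mentioned above could be sketched as a small polling helper. This is only an illustration: `wait_for_job`, `fetch_status`, and `terminal_statuses` are hypothetical names, the HTTP call is left to a function you supply, and the set of final status values is passed in by the caller because this page does not list them.

```python
import time
from typing import Callable, Collection


def job_status_path(job_id: str) -> str:
    """Return the job-tracking path given in this section."""
    return f"/v2/jobs/{job_id}"


def wait_for_job(
    job_id: str,
    fetch_status: Callable[[str], str],
    terminal_statuses: Collection[str],
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
) -> str:
    """Poll until the job reports one of the caller-supplied final statuses.

    fetch_status is a hypothetical callable: it receives the job path,
    performs the authenticated GET, and returns the `status` field.
    """
    path = job_status_path(job_id)
    for attempt in range(max_attempts):
        status = fetch_status(path)
        if status in terminal_statuses:
            return status
        # Sleep between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} attempts")
```

Keeping the HTTP call behind `fetch_status` lets the loop be tested without network access and leaves the choice of HTTP client open.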

Authentication

Authorization Bearer

Bearer authentication of the form Bearer <token>, where token is your auth token.

X-Organization-ID string

API Key authentication via header
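
As a rough sketch, the two headers described above could be assembled like this. The function name `build_auth_headers` is hypothetical, and treating `X-Organization-ID` as optional is an assumption, since this page does not say whether both headers must be sent together.

```python
from typing import Dict, Optional


def build_auth_headers(token: str, organization_id: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for the authentication schemes listed above."""
    if not token:
        raise ValueError("token must be a non-empty string")
    # Bearer authentication of the form "Bearer <token>".
    headers = {"Authorization": f"Bearer {token}"}
    # Assumption: the organization header is only sent when a value is provided.
    if organization_id is not None:
        headers["X-Organization-ID"] = organization_id
    return headers
```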

Path parameters

collection_name string Required

Request

This endpoint expects an object.

processing_type enum Required

Document processing type. ‘advanced’ uses agentic OCR with AI-enhanced extraction for complex layouts, tables, figures, charts, and documents containing images. ‘basic’ provides reliable OCR optimized for general document indexing and high-volume processing.

Allowed values: advanced, basic

url string Optional

A single public URL to a hosted document. Supported types: PDF, DOCX, DOC, XLSX, XLS, CSV, TSV, TXT, MD, JSON, YAML, YML, PNG, JPG, JPEG, GIF, BMP, TIFF. Provide either `url` or `urls`, not both.

urls list of strings Optional

An array of public URLs to hosted documents. Provide either `url` or `urls`, not both.

custom_metadata map from strings to any Optional

Custom metadata to attach to all indexed chunks. Keys must be strings. Each value must be a string, integer, float, boolean, or array of strings.
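
The request rules above (one of the two processing types, exactly one of `url` or `urls`, and restricted metadata value types) can be checked client-side before sending. The sketch below is one possible way to do that; `build_index_request` and its helper are hypothetical names, and rejecting a request with neither `url` nor `urls` is an assumption drawn from the "provide either" wording.

```python
from typing import Any, Dict, List, Optional

# The two processing types described for processing_type.
PROCESSING_TYPES = ("advanced", "basic")


def _is_valid_metadata_value(value: Any) -> bool:
    """True for str, int, float, bool, or a list containing only strings."""
    if isinstance(value, (str, int, float, bool)):
        return True
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


def build_index_request(
    processing_type: str,
    url: Optional[str] = None,
    urls: Optional[List[str]] = None,
    custom_metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble the request object, raising ValueError on rule violations."""
    if processing_type not in PROCESSING_TYPES:
        raise ValueError(f"processing_type must be one of {PROCESSING_TYPES}")
    # Exactly one of url / urls may be set (assumption: sending neither is an error).
    if (url is None) == (urls is None):
        raise ValueError("provide exactly one of 'url' or 'urls'")

    body: Dict[str, Any] = {"processing_type": processing_type}
    if url is not None:
        body["url"] = url
    else:
        body["urls"] = list(urls)

    if custom_metadata is not None:
        for key, value in custom_metadata.items():
            if not isinstance(key, str):
                raise ValueError(f"metadata key {key!r} must be a string")
            if not _is_valid_metadata_value(value):
                raise ValueError(f"unsupported metadata value for key {key!r}")
        body["custom_metadata"] = dict(custom_metadata)
    return body
```

The returned dictionary can be serialized with `json.dumps` and sent as the request body; the server remains the authority on validation, so this only catches obvious mistakes early.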

Response

Indexing job started

job_id string

status enum
Allowed values: