Index Azure Directory

Index all files from a specific directory (prefix) in an Azure Blob Storage container into a collection. Uses prefix-based filtering to index only blobs within the specified path. Returns a job_id for tracking progress via GET /v2/jobs/{job_id}.

Authentication

Authorization: Bearer

Bearer authentication of the form Bearer <token>, where token is your auth token.

Path parameters

collection_name: string (Required)

Headers

X-Organization-ID: string (Optional)
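As a minimal sketch of the authentication and header sections above, the helper below assembles the request headers. The function name `build_headers` is hypothetical, and the `Content-Type: application/json` header is an assumption based on the endpoint expecting an object; neither is stated by this reference.

```python
from typing import Dict, Optional


def build_headers(token: str, organization_id: Optional[str] = None) -> Dict[str, str]:
    """Build headers for the indexing request (illustrative helper)."""
    if not token:
        raise ValueError("token must be a non-empty string")
    headers = {
        # Documented form: "Bearer <token>".
        "Authorization": f"Bearer {token}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
    # X-Organization-ID is optional, so it is added only when supplied.
    if organization_id is not None:
        headers["X-Organization-ID"] = organization_id
    return headers
```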

Request

This endpoint expects an object.
account_key: string (Required)
account_name: string (Required)
container_name: string (Required)
directory_path: string (Required)

Path to the directory within the container. Accepts either a relative path (e.g., ‘reports/2024/january’) or a full Azure Blob URI (e.g., ‘https://account.blob.core.windows.net/container/reports/2024/january’). All blobs within this prefix will be indexed.
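To illustrate the two accepted forms of directory_path, the sketch below splits a value into account, container, and prefix on the client side. It only shows how the two forms relate; the service does its own parsing, and `split_directory_path`, `BlobDirectory`, and the trimming of leading and trailing slashes are choices of this example rather than documented behavior.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote, urlparse


class BlobDirectory(NamedTuple):
    account_name: Optional[str]    # None when a relative path was given
    container_name: Optional[str]  # None when a relative path was given
    prefix: str


def split_directory_path(directory_path: str) -> BlobDirectory:
    """Split a relative path or a full Azure Blob URI into its parts."""
    parsed = urlparse(directory_path)
    if parsed.scheme in ("http", "https"):
        # Host looks like "<account>.blob.core.windows.net".
        account = parsed.netloc.split(".")[0]
        # Path looks like "/<container>/<prefix...>".
        container, _, prefix = parsed.path.lstrip("/").partition("/")
        return BlobDirectory(account, container, unquote(prefix).strip("/"))
    # Relative form: the whole value is the prefix inside the container.
    return BlobDirectory(None, None, directory_path.strip("/"))
```

For example, the full URI from the description above splits into account `account`, container `container`, and prefix `reports/2024/january`, which is the same prefix as the relative form.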

processing_type: enum (Required)

Document processing type. ‘advanced’ uses agentic OCR with AI-enhanced extraction for complex layouts, tables, figures, charts, and documents containing images. ‘basic’ provides reliable OCR optimized for general document indexing and high-volume processing.

custom_metadata: map from strings to strings, integers, doubles, booleans, or lists of strings; or null (Optional)

Custom metadata to attach to all indexed chunks. Keys must be strings. Values: str, int, float, bool, or List[str].
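The value rules above can be checked before sending a request. The following is a sketch of a client-side check; `validate_custom_metadata` is a hypothetical helper, and the server remains the authority on what it accepts. This sketch reads "or null" as applying to the whole map, so `None` is allowed for the map but not for individual values.

```python
from typing import Any, Mapping, Optional


def validate_custom_metadata(metadata: Optional[Mapping[Any, Any]]) -> None:
    """Raise TypeError if metadata does not match the documented shape."""
    if metadata is None:
        return  # The field itself is optional and nullable.
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} must be a string")
        # Scalars: str, int, float, bool (bool is a subclass of int).
        if isinstance(value, (str, int, float, bool)):
            continue
        # Lists are allowed only when every item is a string.
        if isinstance(value, list) and all(isinstance(item, str) for item in value):
            continue
        raise TypeError(
            f"metadata value for {key!r} must be str, int, float, bool, or List[str]"
        )
```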

max_files: integer or null (Optional)
overwrite_existing: boolean (Optional, defaults to false)

When true, files that already exist in the collection will be deleted and re-indexed with the latest changes. Requires skip_existing=false. Setting both to true returns a 400 error.

parsing_script: string or null (Optional)

Relative path to a JS parsing script for JSON files (e.g. ‘research/paper-parser’). When provided, .json files are processed through a sandboxed V8 isolate. Without this, .json files are indexed as raw text.

skip_existing: boolean (Optional, defaults to true)

When true, files already indexed in the collection are skipped and will not be re-indexed with incoming changes. When false, all incoming files are indexed regardless of whether they already exist.
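The request fields above can be assembled as in the sketch below, which also rejects the skip_existing/overwrite_existing combination that the API answers with a 400. `build_index_request` is a hypothetical helper. Field names and the two processing_type values come from this reference; omitting unset optional fields from the body is this sketch's choice.

```python
from typing import Any, Dict, Optional


def build_index_request(
    account_name: str,
    account_key: str,
    container_name: str,
    directory_path: str,
    processing_type: str,
    *,
    skip_existing: bool = True,
    overwrite_existing: bool = False,
    max_files: Optional[int] = None,
    parsing_script: Optional[str] = None,
    custom_metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Return a request body dict for the indexing endpoint."""
    if processing_type not in ("basic", "advanced"):
        raise ValueError("processing_type must be 'basic' or 'advanced'")
    # overwrite_existing requires skip_existing=false; both true is a 400.
    if skip_existing and overwrite_existing:
        raise ValueError("set skip_existing=False when overwrite_existing=True")
    body: Dict[str, Any] = {
        "account_name": account_name,
        "account_key": account_key,
        "container_name": container_name,
        "directory_path": directory_path,
        "processing_type": processing_type,
        "skip_existing": skip_existing,
        "overwrite_existing": overwrite_existing,
    }
    optional = {
        "max_files": max_files,
        "parsing_script": parsing_script,
        "custom_metadata": custom_metadata,
    }
    # Leave out optional fields that were not provided.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body
```

Because skip_existing defaults to true in this helper, as it does in the API, a caller who wants re-indexing passes both `skip_existing=False` and `overwrite_existing=True`.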

Response

Successful Response
job_id: string
status: string (defaults to pending)
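Since the response only returns a job_id and an initial status, a client typically polls GET /v2/jobs/{job_id} until the job leaves the pending state. The sketch below shows one way to structure that loop. `wait_for_job` and its parameters are hypothetical, `fetch_job` is a caller-supplied function that performs the GET and returns the decoded response, and the default "anything other than pending is finished" rule is an assumption, because this page does not list the other status values.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    *,
    is_finished: Callable[[Dict[str, Any]], bool] = lambda job: job.get("status") != "pending",
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a job until is_finished(job) is true or attempts run out."""
    for attempt in range(max_attempts):
        job = fetch_job(job_id)  # e.g. GET /v2/jobs/{job_id}
        if is_finished(job):
            return job
        # Wait between polls, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} attempts")
```

Passing a custom `is_finished` predicate lets the caller adapt the loop once the job endpoint's actual status values are known.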

Errors

400
Bad Request Error
422
Unprocessable Entity Error