Index YouTube Transcript

Index YouTube video transcripts into a collection. Fetches transcripts from YouTube videos using auto-generated or manual captions, formats them with inline timestamps, and indexes the text for semantic search.

You can provide either:

- `url`: a single YouTube video URL
- `urls`: an array of YouTube video URLs (max 20)

Transcripts are always processed as basic text (no OCR needed). Each transcript is formatted with `[HH:MM:SS]` timestamp markers so search results can reference specific moments in the video.

## Supported URL Formats

- `youtube.com/watch?v=VIDEO_ID`
- `youtu.be/VIDEO_ID`
- `youtube.com/shorts/VIDEO_ID`

## Auto-Injected Metadata

The following metadata is automatically added to indexed chunks:

- `youtube_video_id`: the video ID
- `youtube_url`: the original video URL
- `youtube_language`: transcript language
- `youtube_duration_seconds`: total video duration

Returns a `job_id` for tracking progress via `GET /v2/jobs/{job_id}`.
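As a rough client-side illustration of the URL formats and timestamp markers described above, the following Python sketch extracts a video ID from the three supported URL shapes and converts `[HH:MM:SS]` markers found in a search result into second offsets. The function names `extract_video_id` and `timestamp_offsets` are hypothetical helpers, not part of the API, and tolerating a missing scheme or a `www.` prefix is an assumption about what the service accepts.

```python
import re
from typing import List, Optional
from urllib.parse import parse_qs, urlparse

# Matches the [HH:MM:SS] markers the endpoint inserts into transcripts.
TIMESTAMP_RE = re.compile(r"\[(\d{2}):(\d{2}):(\d{2})\]")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for a supported URL format, else None (hypothetical helper)."""
    if "://" not in url:
        url = "https://" + url  # assumption: scheme-less URLs are tolerated
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # assumption: a www. prefix is tolerated
    if host == "youtu.be":
        return parsed.path.strip("/").split("/")[0] or None
    if host == "youtube.com":
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        if parsed.path.startswith("/shorts/"):
            return parsed.path[len("/shorts/"):].split("/")[0] or None
    return None


def timestamp_offsets(text: str) -> List[int]:
    """Convert every [HH:MM:SS] marker in text to an offset in seconds."""
    return [
        int(h) * 3600 + int(m) * 60 + int(s)
        for h, m, s in TIMESTAMP_RE.findall(text)
    ]
```

A check like this can reject unsupported URLs before a request is sent, and the second offsets can be used to jump to the matching moment in a video player.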

Authentication

Authorization Bearer
Bearer token authentication using API key

Path parameters

collection_name string Required
Name of the collection to index into

Headers

X-Organization-ID string Required
Idempotency-Key string Optional
UUID for request deduplication
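The headers above could be assembled as in the following Python sketch. The header names come from this page; the `build_headers` function, the `Content-Type: application/json` header, and the choice to generate a UUID v4 when no key is supplied are assumptions made for illustration.

```python
import uuid
from typing import Dict, Optional


def build_headers(
    api_key: str,
    organization_id: str,
    idempotency_key: Optional[str] = None,
) -> Dict[str, str]:
    """Build request headers for the indexing call (hypothetical helper)."""
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Organization-ID": organization_id,
        # Assumption: the request object is sent as JSON.
        "Content-Type": "application/json",
        # Reuse the same key when retrying so the request can be deduplicated.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }
```

When retrying a failed request, pass the key from the first attempt back in through `idempotency_key` rather than letting a fresh one be generated.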

Request

This endpoint expects an object.
url string Optional

A single YouTube video URL. Supported formats: `youtube.com/watch?v=`, `youtu.be/`, `youtube.com/shorts/`. Provide either `url` or `urls`, not both.

urls list of strings Optional

An array of YouTube video URLs to index (max 20). Provide either `url` or `urls`, not both.
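The "either `url` or `urls`, not both" rule and the 20-URL limit can be checked before sending a request. The sketch below is one way to do that in Python; `build_source_fields` is a hypothetical helper, and rejecting an empty `urls` list is an assumption, since this page does not say how the service treats one.

```python
from typing import Dict, List, Optional, Union

MAX_URLS = 20  # limit stated for the `urls` field


def build_source_fields(
    url: Optional[str] = None,
    urls: Optional[List[str]] = None,
) -> Dict[str, Union[str, List[str]]]:
    """Return the `url` or `urls` part of the request body (hypothetical helper)."""
    if (url is None) == (urls is None):
        raise ValueError("provide exactly one of 'url' or 'urls'")
    if url is not None:
        return {"url": url}
    if not urls:
        # Assumption: an empty list is treated as invalid.
        raise ValueError("'urls' must contain at least one URL")
    if len(urls) > MAX_URLS:
        raise ValueError(f"'urls' accepts at most {MAX_URLS} entries")
    return {"urls": list(urls)}
```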

languages list of strings Optional

Preferred transcript languages in priority order (ISO 639-1 codes). Defaults to English. Only specify this if you need a non-English transcript (e.g., `["fr", "de"]`). Falls back to auto-generated captions if no manual transcript is available.

custom_metadata map from strings to any Optional

Custom metadata to attach to all indexed chunks. Keys must be strings. Each value must be a string, integer, float, boolean, or array of strings.
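The value-type rule for `custom_metadata` can be enforced on the client side before the request is sent. The Python sketch below is one possible check; `validate_custom_metadata` is a hypothetical helper, and the server may apply additional rules not described on this page.

```python
from typing import Any, Dict


def validate_custom_metadata(metadata: Dict[Any, Any]) -> Dict[str, Any]:
    """Raise TypeError if a key or value has a type not listed for custom_metadata."""
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"metadata key {key!r} must be a string")
        # bool is a subclass of int in Python, so it is covered here as well.
        if isinstance(value, (str, int, float)):
            continue
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            continue
        raise TypeError(f"metadata value for {key!r} has an unsupported type")
    return metadata
```

For example, `{"topic": "ml", "tags": ["talk", "2024"], "reviewed": True}` passes, while a nested dictionary or a list of numbers raises `TypeError`.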

Response

Indexing job started
job_id string
status enum
Allowed values: