Index R2 File | Captain Docs

import requests

BASE_URL = "https://api.runcaptain.com"
API_KEY = "your_api_key"
ORG_ID = "your_organization_id"

# Every request carries the API key as a Bearer token plus the organization ID.
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "X-Organization-ID": ORG_ID,
    "Content-Type": "application/json",
}

# Start an indexing job for one R2 object in the "my_documents" collection.
response = requests.post(
    f"{BASE_URL}/v2/collections/my_documents/index/r2/file",
    headers=headers,
    json={
        "bucket_name": "my-r2-bucket",
        "file_uri": "r2://my-r2-bucket/reports/annual-review.pdf",
        "account_id": "your_cloudflare_account_id",
        "access_key_id": "your_r2_access_key_id",
        "secret_access_key": "your_r2_secret_access_key",
        "processing_type": "advanced",
    },
    timeout=60.0,
)

# On success the body contains the job_id used to track the indexing job.
if response.status_code in [200, 201]:
    data = response.json()
    print(f"Job started! ID: {data['job_id']}")
else:
    # Include the response body so the cause of the failure is visible.
    print(f"Error: {response.status_code} {response.text}")

{
  "job_id": "job_r2file_abc123",
  "status": "pending"
}

Index a single file from a Cloudflare R2 bucket into a collection. Returns a job_id for tracking progress.

Authentication

Authorization: Bearer

Bearer token authentication using API key
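
A minimal sketch of building the request headers, assuming the same three headers used in the example at the top of this page. The helper names and the environment variable names CAPTAIN_API_KEY and CAPTAIN_ORG_ID are hypothetical, chosen only to keep credentials out of source code.

```python
import os
from typing import Mapping, Optional


def build_headers(api_key: str, org_id: str) -> dict[str, str]:
    """Return the headers used by the request example on this page."""
    if not api_key or not org_id:
        raise ValueError("Both api_key and org_id are required")
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Organization-ID": org_id,
        "Content-Type": "application/json",
    }


def headers_from_env(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Read credentials from environment-style variables.

    CAPTAIN_API_KEY and CAPTAIN_ORG_ID are hypothetical names, not ones
    defined by the API.
    """
    env = os.environ if env is None else env
    return build_headers(
        env.get("CAPTAIN_API_KEY", ""),
        env.get("CAPTAIN_ORG_ID", ""),
    )
```

Passing a plain dict as env makes the helper easy to exercise without touching the real environment.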

Path parameters

collection_name (string, required)

Request

This endpoint expects an object.

bucket_name (string, required)

Name of the R2 bucket

file_uri (string, required)

R2 URI format: r2://bucket-name/path/to/file.pdf
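
The URI format above can be sketched as a pair of small helpers: one that assembles an r2:// URI from a bucket name and object key, and one that splits it back apart for a local sanity check. Both function names are hypothetical, and the checks reflect only the format shown here; the API may apply further validation of its own.

```python
def build_r2_uri(bucket_name: str, object_key: str) -> str:
    """Assemble an r2://bucket-name/path/to/file URI."""
    key = object_key.lstrip("/")  # tolerate keys written with a leading slash
    if not bucket_name or "/" in bucket_name:
        raise ValueError(f"Invalid bucket name: {bucket_name!r}")
    if not key:
        raise ValueError("Object key must not be empty")
    return f"r2://{bucket_name}/{key}"


def split_r2_uri(file_uri: str) -> tuple[str, str]:
    """Split an r2:// URI into (bucket_name, object_key)."""
    prefix = "r2://"
    if not file_uri.startswith(prefix):
        raise ValueError(f"Expected an r2:// URI, got {file_uri!r}")
    bucket, _, key = file_uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"URI needs both a bucket and a path: {file_uri!r}")
    return bucket, key
```

Building file_uri from the same bucket_name value that goes into the request body keeps the two fields from drifting apart.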

account_id (string, required)

Cloudflare account ID (found in your R2 dashboard URL)

access_key_id (string, required)

R2 S3 API token Access Key ID

secret_access_key (string, required)

R2 S3 API token Secret Access Key

processing_type (enum, required)

Document processing type. ‘advanced’ uses agentic OCR with AI-enhanced extraction for complex layouts, tables, figures, charts, and documents containing images. ‘basic’ provides reliable OCR optimized for general document indexing and high-volume processing.

Allowed values: advanced, basic

jurisdiction (enum, optional, defaults to "default")

R2 jurisdiction. ‘default’ for global, ‘eu’ for EU-only storage, ‘fedramp’ for FedRAMP-compliant storage.

Allowed values: default, eu, fedramp
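
As an illustration of how the required fields and the optional jurisdiction fit together, the sketch below assembles a request body and rejects enum values other than those named in the descriptions above. The function name build_index_payload is hypothetical, and leaving jurisdiction out of the body when it is not set is an assumption based on the documented default.

```python
from typing import Any, Optional

# Values taken from the parameter descriptions on this page.
PROCESSING_TYPES = {"advanced", "basic"}
JURISDICTIONS = {"default", "eu", "fedramp"}


def build_index_payload(
    bucket_name: str,
    file_uri: str,
    account_id: str,
    access_key_id: str,
    secret_access_key: str,
    processing_type: str,
    jurisdiction: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble the JSON body for the index/r2/file request."""
    if processing_type not in PROCESSING_TYPES:
        raise ValueError(f"processing_type must be one of {sorted(PROCESSING_TYPES)}")
    payload: dict[str, Any] = {
        "bucket_name": bucket_name,
        "file_uri": file_uri,
        "account_id": account_id,
        "access_key_id": access_key_id,
        "secret_access_key": secret_access_key,
        "processing_type": processing_type,
    }
    # Assumption: omitting the field lets the documented default apply.
    if jurisdiction is not None:
        if jurisdiction not in JURISDICTIONS:
            raise ValueError(f"jurisdiction must be one of {sorted(JURISDICTIONS)}")
        payload["jurisdiction"] = jurisdiction
    return payload
```

The returned dict can be passed directly as the json argument of requests.post.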

custom_metadata (map from strings to any, optional)

Custom metadata to attach to all chunks from this file. Keys must be strings. Values must be str, int, float, bool, or an array of strings.
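
The type rules for custom_metadata can be checked locally before sending a request. The sketch below is one way to do that; validate_custom_metadata is a hypothetical helper, and it mirrors only the rules stated above, so the API may still reject input for other reasons.

```python
from typing import Any, Mapping


def validate_custom_metadata(metadata: Mapping[Any, Any]) -> None:
    """Raise TypeError if metadata breaks the documented key/value rules."""
    for key, value in metadata.items():
        if not isinstance(key, str):
            raise TypeError(f"Metadata key {key!r} is not a string")
        # Scalars: str, int, float, or bool.
        if isinstance(value, (str, int, float, bool)):
            continue
        # Arrays are allowed only when every element is a string.
        if isinstance(value, list) and all(isinstance(v, str) for v in value):
            continue
        raise TypeError(
            f"Metadata value for {key!r} must be str, int, float, bool, "
            f"or a list of strings, got {type(value).__name__}"
        )


# Example metadata that passes the check.
validate_custom_metadata({
    "department": "finance",
    "fiscal_year": 2024,
    "confidence": 0.95,
    "reviewed": True,
    "tags": ["annual", "report"],
})
```

Nested objects and lists of numbers fail this check, since the description allows only scalars and arrays of strings.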

Response

Indexing Job Started

job_id (string)

status (enum)

Allowed values:
