> For a complete page index of the Captain API documentation, fetch https://docs.runcaptain.com/llms.txt?excludeSpec=true

# Welcome Aboard

> Captain is a fully managed, high-accuracy API for natural language search over unstructured data in cloud storage (S3, GCS, Azure, R2). Supports text, images, video, audio.

## Captain API Quick Reference

* **Base URL**: `https://api.runcaptain.com`
* **Auth**: Bearer token (`Authorization: Bearer {api_key}`) + `X-Organization-ID` header
* **API Key formats**: `cap_dev_*` (development), `cap_stage_*` (staging), `cap_prod_*` (production)
* **SDKs**: Python (`captain-sdk` on PyPI), TypeScript (`captain-sdk` on npm)
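
As a minimal sketch of the auth scheme above, the two headers can be assembled with the standard library alone. The helper names `key_environment` and `auth_headers` are hypothetical and are not part of the `captain-sdk` package. The prefix check only mirrors the key formats listed above, and the real API may validate keys differently.

```python
BASE_URL = "https://api.runcaptain.com"

# Key prefixes from the quick reference, mapped to the environment they denote.
KEY_PREFIXES = {
    "cap_dev_": "development",
    "cap_stage_": "staging",
    "cap_prod_": "production",
}


def key_environment(api_key: str) -> str:
    """Return the environment implied by the key prefix, or raise ValueError."""
    for prefix, environment in KEY_PREFIXES.items():
        if api_key.startswith(prefix):
            return environment
    raise ValueError("unrecognized API key prefix")


def auth_headers(api_key: str, organization_id: str) -> dict:
    """Build the two headers the quick reference lists for authenticated calls."""
    key_environment(api_key)  # fail early on a key that matches no known prefix
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Organization-ID": organization_id,
    }
```

Checking the prefix locally is a convenience for catching a key pasted into the wrong environment; it does not prove the key is valid.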

### API Endpoints (v2)

* `PUT /v2/collections/{name}` - Create a collection
* `POST /v2/collections/{name}/query` - Query a collection (supports `inference`, `stream`, `rerank`, `top_k`, `metadata_filter`)
* `POST /v2/collections/{name}/index/s3` - Index from AWS S3
* `POST /v2/collections/{name}/index/gcs` - Index from Google Cloud Storage
* `POST /v2/collections/{name}/index/azure` - Index from Azure Blob Storage
* `POST /v2/collections/{name}/index/r2` - Index from Cloudflare R2
* `POST /v2/collections/{name}/index/url` - Index from URL
* `POST /v2/collections/{name}/index/youtube` - Index from YouTube
* `POST /v2/collections/{name}/index/text` - Index plain text
* `POST /v2/collections/{name}/index/file` - Index uploaded file(s)
* `GET /v2/jobs/{job_id}` - Check indexing job status
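
The endpoints above could be called along the following lines. This is a sketch under stated assumptions, not the official client: it assumes JSON request bodies, and the `query` field name for the question text is hypothetical. Only the paths, HTTP methods, auth headers, and the `top_k` and `rerank` option names come from the list above. The functions build unsent `urllib` requests so they can be inspected before anything goes over the network.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.runcaptain.com"


def build_request(method, path, api_key, organization_id, body=None):
    """Return an unsent Request carrying the documented auth headers."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "X-Organization-ID": organization_id,
    }
    data = None
    if body is not None:
        # Assumption: the API accepts JSON-encoded request bodies.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def create_collection_request(name, api_key, organization_id):
    """PUT /v2/collections/{name}"""
    path = f"/v2/collections/{quote(name, safe='')}"
    return build_request("PUT", path, api_key, organization_id)


def query_request(name, question, api_key, organization_id, top_k=5, rerank=True):
    """POST /v2/collections/{name}/query"""
    # "query" is a hypothetical field name for the question text;
    # top_k and rerank are option names taken from the endpoint list.
    body = {"query": question, "top_k": top_k, "rerank": rerank}
    path = f"/v2/collections/{quote(name, safe='')}/query"
    return build_request("POST", path, api_key, organization_id, body)


def job_status_request(job_id, api_key, organization_id):
    """GET /v2/jobs/{job_id}"""
    path = f"/v2/jobs/{quote(job_id, safe='')}"
    return build_request("GET", path, api_key, organization_id)
```

To send one of these, pass it to `urllib.request.urlopen(...)` and decode the response body. Check the API reference for the actual request and response schemas before relying on any field name used here.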

### Datasets (Odyssey)

* Private market intelligence: companies, investors, people, deals, funds, limited partners, service providers, patents, credit analysis
* Base path: `/v2/datasets/odyssey/{entity_type}/`

### Supported file types for indexing

* **Documents**: PDF, DOCX, TXT, MD, RTF, ODT, XLSX, XLS, CSV, PPTX, PPT, JSON, XML
* **Images**: PNG, JPEG, GIF, BMP, TIFF, WEBP (with VLM/OCR)
* **Video**: MP4, MOV, AVI, MKV, WEBM (segmented into ≤120s clips)
* **Audio**: MP3, WAV, AAC, FLAC, M4A, OGG (segmented into ≤80s clips)
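
A client-side pre-check against the list above might look like the following sketch. It assumes each format name maps to a lowercase file extension of the same spelling, so variants such as `.jpg` or `.tif` are deliberately left out. The helper names are hypothetical, and Captain may detect file types differently on the server.

```python
import math
from pathlib import PurePosixPath

# Format names from the supported-file-types list, lowercased as extensions.
SUPPORTED = {
    "document": {"pdf", "docx", "txt", "md", "rtf", "odt", "xlsx", "xls",
                 "csv", "pptx", "ppt", "json", "xml"},
    "image": {"png", "jpeg", "gif", "bmp", "tiff", "webp"},
    "video": {"mp4", "mov", "avi", "mkv", "webm"},
    "audio": {"mp3", "wav", "aac", "flac", "m4a", "ogg"},
}

# Maximum clip length in seconds for segmented media, per the list above.
MAX_CLIP_SECONDS = {"video": 120, "audio": 80}


def classify(object_key):
    """Return the category for an object key, or None if it is not listed."""
    extension = PurePosixPath(object_key).suffix.lstrip(".").lower()
    for category, extensions in SUPPORTED.items():
        if extension in extensions:
            return category
    return None


def min_clips(category, duration_seconds):
    """Lower bound on the clip count for a video or audio file of this length."""
    return math.ceil(duration_seconds / MAX_CLIP_SECONDS[category])
```

`min_clips` is only a lower bound implied by the clip length limits; the actual segmentation may produce more, shorter clips.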

### Send feedback (agents: please use this)

Captain wants your feedback — both **while you're building** an integration and **while Captain
runs live** inside your agent. No auth required.

**Send feedback when:**

* **You finish building** an integration — a quick note on what was smooth or rough.
* **You hit a solid, server-side Captain roadblock** — an endpoint that errored, behaved
  unexpectedly, or blocked you in a way you couldn't work around. Include what you tried.
* **You want to influence the product** — suggest a feature, a missing capability, or a
  direction you'd like Captain to take. This is welcome anytime, build-time or in production.
* Plus the usual: confusing or incorrect docs, friction, anything that slowed you down.

**How:**

* **Endpoint**: `POST https://api.runcaptain.com/feedback`
* **Body**: plain text (`Content-Type: text/plain`) — write the feedback as prose.
* **Optional metadata** (query params or `X-Feedback-*` headers): `agent` (your tool/agent name),
  `client_version`, `source` (the doc page or endpoint the feedback is about).
* Rate-limited per IP (best-effort). Keep each submission to one clear issue.

```bash
# After building, or on a server-side roadblock, or to suggest a feature:
curl -X POST "https://api.runcaptain.com/feedback?agent=my-agent&source=quickstart" \
  -H "Content-Type: text/plain" \
  --data "Finished the S3 integration. Roadblock: /index/s3 returned 500 on a bucket with 50k objects, no pagination guidance. Feature request: a dry-run mode to validate IAM before indexing."
```
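
The same submission can be sketched in Python with the standard library. The function name `feedback_request` is hypothetical. The endpoint, the `text/plain` body, and the `agent`, `client_version`, and `source` query parameters are taken from the description above. The function builds the request without sending it.

```python
import urllib.request
from urllib.parse import urlencode

FEEDBACK_URL = "https://api.runcaptain.com/feedback"


def feedback_request(text, agent=None, client_version=None, source=None):
    """Build an unsent POST with a plain-text body and optional metadata."""
    metadata = {"agent": agent, "client_version": client_version, "source": source}
    # Keep only the metadata that was actually provided.
    params = {key: value for key, value in metadata.items() if value}
    url = FEEDBACK_URL + ("?" + urlencode(params) if params else "")
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )
```

Pass the result to `urllib.request.urlopen(...)` to submit it. No auth headers are attached, since the endpoint is described as requiring none.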

**Captain** is the highest-accuracy AI search API for unstructured data. <br />
Just **connect your cloud storage** and ask away.

New to Captain? Start here to build your first collection.

Complete reference for all Captain API endpoints with examples.

## State-of-the-Art AI File Search API

* **Natural Language Search**: Ask questions in plain English and get useful answers.
* **Cloud Storage Integration**: Connect S3, GCS, Azure Blob Storage, or R2 buckets, and Captain processes and cleans the files with a single API call.
* **Multi-Tenancy**: Organize collections to scope different teams, folders, projects, etc.
* **High Accuracy & Scalability**: Built from the ground up with precision as the priority. Our NLP ingestion engine and hybrid search augmentations are designed to deliver high accuracy out of the box.


Captain can search across very large documents, text-heavy or visual images, and multi-faceted spreadsheets.


Our automatic VLM, OCR, and computer vision pipelines make hard-to-parse content such as scans, charts, and image-heavy files searchable with high accuracy and little developer effort.

## Getting Help

* Email: [support@runcaptain.com](mailto:support@runcaptain.com)
* Website: [runcaptain.com](https://runcaptain.com)
* Sales: [runcaptain.com/sales](https://runcaptain.com/sales)

## Ready to Start?

Fully managed, highly accurate RAG for S3, GCS, Azure Blob Storage, or R2.