
API Reference

Base URL: https://api.runcaptain.com

Authentication

All API requests require authentication using your API key and organization ID.

Authorization: Bearer YOUR_API_KEY
X-Organization-ID: YOUR_ORG_UUID

Most endpoints also require api_key and organization_id in the request body.
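
Because the same credentials travel both in headers and in the request body, it can help to build them in one place. The following is a minimal sketch, assuming a hypothetical helper named captain_auth; the header names and body fields come from the text above, and the placeholder values must be replaced with your own.

```python
def captain_auth(api_key: str, organization_id: str) -> tuple[dict, dict]:
    """Return (headers, body_fields) carrying the same credentials."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "X-Organization-ID": organization_id,
    }
    # Most endpoints also expect the credentials in the form body.
    body_fields = {"api_key": api_key, "organization_id": organization_id}
    return headers, body_fields


headers, body_fields = captain_auth("YOUR_API_KEY", "YOUR_ORG_UUID")
print(headers["Authorization"])
```

The returned dictionaries can be passed as headers= and merged into data= when calling requests.post.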

Captain API (Chat Completions with Infinite Context)

For OpenAI-compatible chat completions with infinite context support, see the Infinite Responses API Documentation.

Context Passing Methods

Captain supports multiple ways to pass context with your requests:

  1. OpenAI SDK (⭐ Recommended): Use extra_body parameter

    extra_body: {
      captain: {
        context: "your context here"
      }
    }
    

  2. Vercel AI SDK: Use base64-encoded custom header

    headers: {
      'X-Captain-Context': Buffer.from(context).toString('base64')
    }
    

  3. Direct HTTP: Use captain parameter in request body

    {
      "captain": {
        "context": "your context here"
      }
    }
    

For complete examples and detailed documentation, see:

  • JavaScript/TypeScript SDK Guide
  • Infinite Responses API
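
For readers working in Python, methods 2 and 3 can be approximated as follows. This is a sketch only: the function names context_header and context_body are hypothetical, and it assumes the header expects standard base64 over UTF-8 text, which is what the Buffer.from(...).toString('base64') call in the example above produces.

```python
import base64
import json


def context_header(context: str) -> dict:
    """Base64-encode the context for the X-Captain-Context header."""
    encoded = base64.b64encode(context.encode("utf-8")).decode("ascii")
    return {"X-Captain-Context": encoded}


def context_body(context: str) -> dict:
    """Build the `captain` object used in a direct HTTP request body."""
    return {"captain": {"context": context}}


context = "your context here"
print(context_header(context))
print(json.dumps(context_body(context)))
```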


Data Lake Integration APIs

The following endpoints are for managing databases and querying indexed cloud storage.

Environment Scoping

Important: Each API key is scoped to a single environment and can only access data in that environment.

  • Development keys (prefix: cap_dev_*) can only access development databases and data
  • Staging keys (prefix: cap_stage_*) can only access staging databases and data
  • Production keys (prefix: cap_prod_*) can only access production databases and data

When you create a database with a development key, it becomes a development database. When you query, list files, or delete files, you can only interact with databases and data in the same environment as your API key.

Example: If you create a database called contracts_2024 using a development key (cap_dev_*), you cannot access it using a production key (cap_prod_*), even if both keys belong to the same organization. You would need to create a separate contracts_2024 database using the production key.

This environment isolation ensures that development, staging, and production data remain completely separate.
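
Since the environment is visible in the key prefix, a client can check which environment it is about to touch before sending a request. A minimal sketch, assuming the three prefixes listed above are the only ones in use; environment_for_key is a hypothetical helper, not part of the API.

```python
KEY_PREFIXES = {
    "cap_dev_": "development",
    "cap_stage_": "staging",
    "cap_prod_": "production",
}


def environment_for_key(api_key: str) -> str:
    """Infer the environment from the key prefix listed above."""
    for prefix, environment in KEY_PREFIXES.items():
        if api_key.startswith(prefix):
            return environment
    raise ValueError("Unrecognized API key prefix")


print(environment_for_key("cap_dev_example"))  # development
```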


Create Database

Creates a new database for indexing files.

Endpoint

POST /v1/create-database

Parameters

Parameter | Type | Required | Description
database_name | string | Yes | Name for your database. Must be unique within your organization and environment.
api_key | string | Yes | Your Captain API key
organization_id | string | Yes | Your organization UUID

Request Example

import requests

response = requests.post(
    "https://api.runcaptain.com/v1/create-database",
    data={
        'organization_id': '01999eb7-8554-5c7b-6321-066454166af2',
        'api_key': 'cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd',
        'database_name': 'my_documents'
    }
)

print(response.json())

Response

Success (200 OK)

{
  "database_name": "my_documents",
  "database_id": "db_abc123",
  "status": "success",
  "message": "Database created successfully"
}

Error (400 Bad Request)

{
  "status": "error",
  "message": "Database name already exists"
}

Notes

  • Database names are scoped to your organization and environment
  • Two different organizations can use the same database name, and the same organization can create databases with the same name in different environments (dev, staging, prod)
  • Only one database with a given name can exist per organization per environment
  • Special characters and spaces should be avoided in database names
  • The database will be created in the environment matching your API key (development keys create development databases, production keys create production databases, etc.)
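
The notes above advise against special characters and spaces but do not define an exact character set. The sketch below assumes a conservative rule of letters, digits, and underscores; the API may well accept more, so treat is_safe_database_name as a hypothetical client-side guard rather than the server's validation logic.

```python
import re

# Assumption: letters, digits, and underscores only. The API may accept more.
SAFE_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_safe_database_name(name: str) -> bool:
    """Return True if the name avoids spaces and special characters."""
    return SAFE_NAME.fullmatch(name) is not None


print(is_safe_database_name("contracts_2024"))
print(is_safe_database_name("my documents"))
```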

Delete Database

Deletes a database. This cannot be undone.

Endpoint

POST /v1/delete-database

Parameters

Parameter | Type | Required | Description
database_name | string | Yes | Name of the database to delete
api_key | string | Yes | Your Captain API key
organization_id | string | Yes | Your organization UUID

Request Example

import requests

response = requests.post(
    "https://api.runcaptain.com/v1/delete-database",
    data={
        'organization_id': '01999eb7-8554-5c7b-6321-066454166af2',
        'api_key': 'cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd',
        'database_name': 'my_documents'
    }
)

print(response.json())

Response

Success (200 OK)

{
  "status": "success",
  "message": "Database deleted successfully",
  "database_name": "my_documents"
}

Error (404 Not Found)

{
  "status": "error",
  "message": "Database not found"
}

Error (401 Unauthorized)

{
  "status": "error",
  "message": "Invalid API key"
}

Notes

  • This action cannot be undone
  • All indexed files in the database are deleted as well
  • You may create a new database with the same name after deletion
  • Only databases in the same environment as your API key can be deleted (development keys can only delete development databases, etc.)

List Databases

Retrieves all databases associated with your account.

Endpoint

POST /v1/list-databases

Parameters

Parameter | Type | Required | Description
api_key | string | Yes | Your Captain API key
organization_id | string | Yes | Your organization UUID

Request Example

import requests

headers = {
    "Authorization": "Bearer cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd",
    "Content-Type": "application/x-www-form-urlencoded"
}

response = requests.post(
    "https://api.runcaptain.com/v1/list-databases",
    headers=headers,
    data={
        'api_key': 'cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd',
        'organization_id': '01999eb7-8554-5c7b-6321-066454166af2'
    }
)

print(response.json())

Response

Success (200 OK)

[
  {
    "database_id": "db_abc123",
    "database_name": "contracts_2024",
    "environment": "production",
    "is_active": true,
    "created_at": "2024-01-15T10:30:00Z",
    "request_count": 1250
  },
  {
    "database_id": "db_def456",
    "database_name": "research_docs",
    "environment": "production",
    "is_active": true,
    "created_at": "2024-02-01T14:20:00Z",
    "request_count": 487
  }
]

Empty Response (200 OK)

[]

Response Fields

Field | Type | Description
database_id | string | Unique identifier for the database
database_name | string | Name of the database
environment | string | Environment scope of the database
is_active | boolean | Whether the database is active
created_at | string | ISO 8601 timestamp of creation
request_count | integer | Number of queries made to this database

Notes

  • This list is limited to 1,000 databases. If you need a higher count, please contact us at support@runcaptain.com or call us at +1 (260) CAP-TAIN
  • Only databases in the same environment as your API key are returned (development keys only see development databases, etc.)

List Files

Retrieves all indexed files in a specific database, with pagination support.

Endpoint

POST /v1/list-files

Parameters

Parameter | Type | Required | Description
database_name | string | Yes | Name of the database to list files from
api_key | string | Yes | Your Captain API key
organization_id | string | Yes | Your organization UUID
limit | integer | No | Maximum number of files to return (default: 100)
offset | integer | No | Offset for pagination (default: 0)

Request Example

import requests

response = requests.post(
    "https://api.runcaptain.com/v1/list-files",
    data={
        'organization_id': '01999eb7-8554-5c7b-6321-066454166af2',
        'api_key': 'cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd',
        'database_name': 'contracts_2024',
        'limit': 50,
        'offset': 0
    }
)

print(response.json())

Response

Success (200 OK)

[
  {
    "file_id": "0199bc97-212f-729a-9c0b-cc23f21e0995",
    "file_name": "Acme_Corp_Contract.pdf",
    "chunk_id": "chunk_abc123",
    "chunk_index": 0,
    "created_at": "2024-01-15T10:30:00Z",
    "updated_at": "2024-01-15T10:30:00Z"
  },
  {
    "file_id": "0199bc97-212f-729a-9c0b-cc23f21e0995",
    "file_name": "Acme_Corp_Contract.pdf",
    "chunk_id": "chunk_def456",
    "chunk_index": 1,
    "created_at": "2024-01-15T10:30:00Z",
    "updated_at": "2024-01-15T10:30:00Z"
  },
  {
    "file_id": "0199bc97-20c9-770e-95f9-3fee32ab9b14",
    "file_name": "Beta_Industries_Contract.pdf",
    "chunk_id": "chunk_ghi789",
    "chunk_index": 0,
    "created_at": "2024-01-16T14:20:00Z",
    "updated_at": "2024-01-16T14:20:00Z"
  }
]

Empty Response (200 OK)

[]

Response Fields

Field | Type | Description
file_id | string | Unique identifier for the file
file_name | string | Name of the file
chunk_id | string | Unique identifier for this chunk
chunk_index | integer | Index of the chunk within the file (0-based)
created_at | string | ISO 8601 timestamp when the file was indexed
updated_at | string | ISO 8601 timestamp when the file was last updated

Notes

  • Files are returned ordered by file name (ascending), then by chunk index
  • Each chunk of a file appears as a separate entry in the response
  • Large files may be split into multiple chunks, which is why the same file may appear multiple times with different chunk_index values
  • Only non-deleted files are returned
  • Results are scoped to your organization and the environment of your API key
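
Because each chunk is a separate entry, clients usually need to page through the results and then collapse chunks back into files. The following is a sketch under stated assumptions: fetch_page is a hypothetical callable that performs the POST to /v1/list-files and returns the parsed JSON list, and a page shorter than limit is assumed to signal the end of the results (the API does not document an explicit end marker).

```python
from collections import defaultdict
from typing import Callable, Iterator


def iter_entries(fetch_page: Callable[[int, int], list], limit: int = 100) -> Iterator[dict]:
    """Yield every chunk entry by advancing `offset` one page at a time."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        # Assumption: a page shorter than `limit` means there is nothing left.
        if len(page) < limit:
            break
        offset += limit


def group_by_file(entries) -> dict:
    """Map file_id -> {"file_name": ..., "chunks": [chunk_index, ...]}."""
    files = defaultdict(lambda: {"file_name": None, "chunks": []})
    for entry in entries:
        record = files[entry["file_id"]]
        record["file_name"] = entry["file_name"]
        record["chunks"].append(entry["chunk_index"])
    return dict(files)
```

Passing the output of iter_entries to group_by_file yields one record per file, with the chunk indexes in the order the API returned them.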

Delete File

Soft-deletes a specific file from a database. The file is marked as deleted, but its data is preserved.

Endpoint

POST /v1/delete-file

Parameters

Parameter | Type | Required | Description
database_name | string | Yes | Name of the database containing the file
file_id | string | Yes | ID of the file to delete
api_key | string | Yes | Your Captain API key
organization_id | string | Yes | Your organization UUID

Request Example

import requests

response = requests.post(
    "https://api.runcaptain.com/v1/delete-file",
    data={
        'organization_id': '01999eb7-8554-5c7b-6321-066454166af2',
        'api_key': 'cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd',
        'database_name': 'contracts_2024',
        'file_id': '0199bc97-212f-729a-9c0b-cc23f21e0995'
    }
)

print(response.json())

Response

Success (200 OK)

{
  "success": true,
  "message": "File deleted successfully"
}

Error (404 Not Found)

{
  "success": false,
  "message": "File not found or already deleted"
}

Error (401 Unauthorized)

{
  "success": false,
  "message": "Invalid API key"
}

Notes

  • This is a soft delete operation - the file is marked as is_deleted=true but the data is preserved
  • Deleted files will not appear in /list-files responses
  • Deleted files will not be included in query results
  • The file cannot be undeleted through the API (contact support if needed)
  • All chunks associated with the file are deleted
  • Results are scoped to your organization and the environment of your API key

Index S3 Bucket

Start indexing all files from an S3 bucket into your database.

Note: You'll need AWS credentials to access your S3 bucket. See the Cloud Storage Credentials Guide for step-by-step instructions on obtaining AWS Access Keys.

Endpoint

POST /v1/index-s3

Parameters

Parameter | Type | Required | Description
database_name | string | Yes | Target database name
bucket_name | string | Yes | S3 bucket name
aws_access_key_id | string | Yes | AWS access key ID
aws_secret_access_key | string | Yes | AWS secret access key (URL-encoded)
bucket_region | string | Yes | S3 bucket region (e.g., us-east-1)
api_key | string | Yes | Your Captain API key
organization_id | string | Yes | Your organization UUID

Request Example

import requests
from urllib.parse import quote

aws_secret = "your_aws_secret_key"
aws_secret_encoded = quote(aws_secret, safe='')

response = requests.post(
    "https://api.runcaptain.com/v1/index-s3",
    data={
        'database_name': 'contracts_2024',
        'bucket_name': 'my-company-docs',
        'aws_access_key_id': 'AKIAIOSFODNN7EXAMPLE',
        'aws_secret_access_key': aws_secret_encoded,
        'bucket_region': 'us-east-1',
        'api_key': 'cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd',
        'organization_id': '01999eb7-8554-5c7b-6321-066454166af2'
    },
    timeout=30.0
)

print(response.json())

Response

Success (200/201)

{
  "job_id": "job_abc123xyz",
  "status": "processing",
  "message": "Indexing job started successfully",
  "timestamp": "2024-01-15T10:30:00Z"
}

Error (400 Bad Request)

{
  "status": "error",
  "message": "Unsupported file type detected",
  "details": "File 'video.mp4' is not supported"
}

Indexing Behavior

Important: This endpoint wipes all previously indexed files from the database and then indexes all files from the bucket.

Environment Scoping

  • Only databases in the same environment as your API key can be indexed (development keys can only index into development databases, etc.)
  • Indexed files are scoped to the environment of the API key used

Supported File Types

Captain uses an allow-list for supported file types:

Documents

  • PDF (.pdf) - Up to 512MB with automatic page chunking
  • Microsoft Word (.docx)
  • Text files (.txt)
  • Markdown (.md)

Spreadsheets & Data

  • Microsoft Excel (.xlsx, .xls) - With intelligent row-based chunking
  • CSV (.csv) - With header preservation across chunks
  • JSON (.json)

Presentations

  • Microsoft PowerPoint (.pptx, .ppt)

Images (with OCR and Computer Vision support)

  • JPEG (.jpg, .jpeg)
  • PNG (.png)
  • BMP (.bmp) (Experimental)
  • GIF (.gif) (Experimental)
  • TIFF (.tiff) (Experimental)

Code

  • Python (.py)
  • TypeScript (.ts)
  • JavaScript (.js)
  • HTML (.html)
  • CSS (.css)
  • PHP (.php)
  • Java (.java)

Files of unsupported types (.mov, .mp4, .avi, etc.) fail individually; the remaining files in the bucket are still indexed.
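
To anticipate which objects will fail before starting a job, the allow-list above can be mirrored on the client. This is a minimal sketch: the extension set is copied from the lists above, while is_supported is a hypothetical helper and the case-insensitive extension match is an assumption about how the service classifies files.

```python
from pathlib import PurePosixPath

SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".txt", ".md",                     # documents
    ".xlsx", ".xls", ".csv", ".json",                   # spreadsheets and data
    ".pptx", ".ppt",                                    # presentations
    ".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tiff",   # images
    ".py", ".ts", ".js", ".html", ".css", ".php", ".java",  # code
}


def is_supported(object_key: str) -> bool:
    """Return True if the object key ends in an allow-listed extension."""
    # Assumption: matching is by case-insensitive file extension.
    return PurePosixPath(object_key).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported("contracts/acme_contract.pdf"))
print(is_supported("media/video.mp4"))
```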


Index S3 File

Index a single file from an S3 bucket into your database.

Note: You'll need AWS credentials to access your S3 bucket. See the Cloud Storage Credentials Guide for step-by-step instructions on obtaining AWS Access Keys.

Endpoint

POST /v1/index-s3-file

Parameters

Parameter | Type | Required | Description
database_name | string | Yes | Target database name
bucket_name | string | Yes | S3 bucket name
file_uri | string | Yes | S3 URI of the file (format: s3://bucket-name/path/to/file.pdf)
aws_access_key_id | string | Yes | AWS access key ID
aws_secret_access_key | string | Yes | AWS secret access key (URL-encoded)
bucket_region | string | Yes | S3 bucket region (e.g., us-east-1)
api_key | string | Yes | Your Captain API key
organization_id | string | Yes | Your organization UUID

Request Example

import requests
from urllib.parse import quote

aws_secret = "your_aws_secret_key"
aws_secret_encoded = quote(aws_secret, safe='')

response = requests.post(
    "https://api.runcaptain.com/v1/index-s3-file",
    data={
        'database_name': 'contracts_2024',
        'bucket_name': 'my-company-docs',
        'file_uri': 's3://my-company-docs/contracts/acme_contract.pdf',
        'aws_access_key_id': 'AKIAIOSFODNN7EXAMPLE',
        'aws_secret_access_key': aws_secret_encoded,
        'bucket_region': 'us-east-1',
        'api_key': 'cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd',
        'organization_id': '01999eb7-8554-5c7b-6321-066454166af2'
    },
    timeout=30.0
)

print(response.json())

Response

Success (200/201)

{
  "job_id": "job_xyz789abc",
  "status": "processing",
  "message": "Indexing job started successfully",
  "timestamp": "2024-01-15T10:30:00Z"
}

Error (400 Bad Request)

{
  "status": "error",
  "message": "Invalid S3 URI format",
  "details": "S3 URI must start with 's3://' and include bucket and file path"
}

Notes

  • The file_uri must be a valid S3 URI in the format s3://bucket-name/path/to/file.ext
  • The bucket name in the URI must match the bucket_name parameter
  • If the file already exists in the database, it will be replaced
  • Supported file types are the same as the Index S3 Bucket endpoint
  • Only databases in the same environment as your API key can be indexed (development keys can only index into development databases, etc.)
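
The first two notes can be checked locally before a request is sent. The sketch below assumes a hypothetical helper named check_storage_uri; it parses the URI by hand so that object keys containing characters such as ? or # are kept intact. Passing scheme="gs" applies the same check to the GCS single-file endpoint.

```python
def check_storage_uri(file_uri: str, bucket_name: str, scheme: str = "s3") -> str:
    """Return the object key, or raise ValueError if the URI is unusable."""
    prefix = f"{scheme}://"
    if not file_uri.startswith(prefix):
        raise ValueError(f"URI must start with '{prefix}'")
    # Everything up to the first slash is the bucket; the rest is the key.
    bucket, _, key = file_uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError("URI must include both a bucket and a file path")
    if bucket != bucket_name:
        raise ValueError("Bucket in file_uri does not match bucket_name")
    return key


print(check_storage_uri("s3://my-company-docs/contracts/acme_contract.pdf", "my-company-docs"))
```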

Index GCS Bucket

Start indexing all files from a Google Cloud Storage bucket into your database.

Note: You'll need a GCS Service Account JSON key to access your bucket. See the Cloud Storage Credentials Guide for step-by-step instructions on obtaining service account credentials.

Endpoint

POST /v1/index-gcs

Parameters

Parameter | Type | Required | Description
database_name | string | Yes | Target database name
bucket_name | string | Yes | GCS bucket name
service_account_json | string | Yes | Service Account JSON credentials (as string)
api_key | string | Yes | Your Captain API key
organization_id | string | Yes | Your organization UUID

Request Example

import requests

# Load service account JSON
with open('path/to/service-account-key.json', 'r') as f:
    service_account_json = f.read()

response = requests.post(
    "https://api.runcaptain.com/v1/index-gcs",
    data={
        'database_name': 'contracts_2024',
        'bucket_name': 'my-company-docs',
        'service_account_json': service_account_json,
        'api_key': 'cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd',
        'organization_id': '01999eb7-8554-5c7b-6321-066454166af2'
    },
    timeout=30.0
)

print(response.json())

Response

Success (200/201)

{
  "job_id": "job_gcs_abc123xyz",
  "status": "processing",
  "message": "Indexing job started for GCS bucket 'my-company-docs'",
  "timestamp": "2024-01-15T10:30:00Z"
}

Error (400 Bad Request)

{
  "status": "error",
  "message": "Invalid service account JSON",
  "details": "Missing required fields: type, project_id, private_key"
}
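
The error example above names three fields that the service account JSON must contain. A minimal sketch of checking for them locally before calling the endpoint follows; missing_service_account_fields is a hypothetical helper, and the API may validate additional fields that are not listed here.

```python
import json

REQUIRED_FIELDS = ("type", "project_id", "private_key")


def missing_service_account_fields(service_account_json: str) -> list:
    """Return the required fields absent from the service account JSON string."""
    data = json.loads(service_account_json)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        return list(REQUIRED_FIELDS)
    # Treat empty values as missing, since an empty key cannot authenticate.
    return [field for field in REQUIRED_FIELDS if not data.get(field)]


print(missing_service_account_fields('{"type": "service_account"}'))
```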

Indexing Behavior

Important: This endpoint wipes all previously indexed files from the database and then indexes all files from the bucket.

Service Account Requirements

Your service account needs the following minimum IAM permission:

  • Storage Object Viewer (roles/storage.objectViewer)

This role grants:

  • storage.objects.list - List objects in the bucket
  • storage.objects.get - Read object data

Environment Scoping

  • Only databases in the same environment as your API key can be indexed (development keys can only index into development databases, etc.)
  • Indexed files are scoped to the environment of the API key used

Supported File Types

Same file types as the S3 indexing endpoints (see Index S3 Bucket for the full list).


Index GCS File

Index a single file from a Google Cloud Storage bucket into your database.

Note: You'll need a GCS Service Account JSON key to access your bucket. See the Cloud Storage Credentials Guide for step-by-step instructions on obtaining service account credentials.

Endpoint

POST /v1/index-gcs-file

Parameters

Parameter | Type | Required | Description
database_name | string | Yes | Target database name
bucket_name | string | Yes | GCS bucket name
file_uri | string | Yes | GCS URI of the file (format: gs://bucket-name/path/to/file.pdf)
service_account_json | string | Yes | Service Account JSON credentials (as string)
api_key | string | Yes | Your Captain API key
organization_id | string | Yes | Your organization UUID

Request Example

import requests

# Load service account JSON
with open('path/to/service-account-key.json', 'r') as f:
    service_account_json = f.read()

response = requests.post(
    "https://api.runcaptain.com/v1/index-gcs-file",
    data={
        'database_name': 'contracts_2024',
        'bucket_name': 'my-company-docs',
        'file_uri': 'gs://my-company-docs/contracts/acme_contract.pdf',
        'service_account_json': service_account_json,
        'api_key': 'cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd',
        'organization_id': '01999eb7-8554-5c7b-6321-066454166af2'
    },
    timeout=30.0
)

print(response.json())

Response

Success (200/201)

{
  "job_id": "job_gcs_file_xyz789abc",
  "status": "processing",
  "message": "Single file indexing job started for GCS file 'contracts/acme_contract.pdf'",
  "timestamp": "2024-01-15T10:30:00Z"
}

Error (400 Bad Request)

{
  "status": "error",
  "message": "Invalid GCS URI format",
  "details": "GCS URI must start with 'gs://' and include bucket and file path"
}

Notes

  • The file_uri must be a valid GCS URI in the format gs://bucket-name/path/to/file.ext
  • The bucket name in the URI must match the bucket_name parameter
  • If the file already exists in the database, it will be replaced
  • Supported file types are the same as the Index GCS Bucket endpoint
  • Only databases in the same environment as your API key can be indexed (development keys can only index into development databases, etc.)

Check Indexing Status

Retrieves the status of an indexing job. Polling this endpoint is the recommended way to track a job's progress.

Endpoint

GET /v1/indexing-status/{job_id}

Parameters

Path Parameters:

Parameter | Type | Required | Description
job_id | string | Yes | Job ID returned by one of the indexing endpoints (e.g., /v1/index-s3 or /v1/index-gcs)

Request Example

import requests

job_id = "job_abc123xyz"

headers = {
    "Authorization": "Bearer cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd",
    "Content-Type": "application/json"
}

response = requests.get(
    f"https://api.runcaptain.com/v1/indexing-status/{job_id}",
    headers=headers
)

print(response.json())

Response

In Progress

{
  "job_id": "job_abc123xyz",
  "status": "processing",
  "completed": false,
  "active_file_processing_workers": 8,
  "timestamp": "2024-01-15T10:35:00Z",
  "job_details": {
    "job_name": "index_contracts_2024",
    "total_files": 1000,
    "indexed_files": 342,
    "failed_files": 3
  }
}

Completed

{
  "job_id": "job_abc123xyz",
  "status": "completed",
  "completed": true,
  "active_file_processing_workers": 0,
  "timestamp": "2024-01-15T11:05:00Z",
  "job_details": {
    "job_name": "index_contracts_2024",
    "total_files": 1000,
    "indexed_files": 997,
    "failed_files": 3
  }
}

Error

{
  "job_id": "job_abc123xyz",
  "status": "error",
  "completed": true,
  "error": "AWS credentials invalid",
  "timestamp": "2024-01-15T10:32:00Z"
}

Polling Recommendations

  • Poll every 3 seconds
  • Stop polling when completed is true or status is "completed", "error", or "failed"
  • Monitor active_file_processing_workers to gauge progress, although in practice it often reports 0 for most of the job and changes only near the end
  • Calculate progress: (indexed_files + failed_files) / total_files * 100
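
The recommendations above can be combined into a small polling loop. This is a sketch, not an official client: fetch_status is a hypothetical callable that performs the GET to /v1/indexing-status/{job_id} and returns the parsed JSON, and the max_polls budget is an assumption added so the loop cannot run forever.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "error", "failed"}


def progress_percent(job_details: dict) -> float:
    """Apply (indexed_files + failed_files) / total_files * 100."""
    total = job_details.get("total_files", 0)
    if not total:
        return 0.0
    done = job_details.get("indexed_files", 0) + job_details.get("failed_files", 0)
    return done / total * 100


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval: float = 3.0,
    max_polls: int = 1200,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal state, then return the last payload."""
    for _ in range(max_polls):
        payload = fetch_status()
        if payload.get("completed") or payload.get("status") in TERMINAL_STATUSES:
            return payload
        print(f"{progress_percent(payload.get('job_details', {})):.1f}% done")
        sleep(interval)
    raise TimeoutError("Indexing job did not finish within the polling budget")
```

The sleep parameter is injectable so the loop can be exercised in tests without real delays.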

Query Database

Query your indexed data using natural language.

Endpoint

POST /v1/query

Headers

Header | Required | Description
Authorization | Yes | Bearer token: Bearer {api_key}
Content-Type | Yes | Must be application/x-www-form-urlencoded
X-Organization-ID | Yes | Your organization UUID
Idempotency-Key | No | Unique UUID for request deduplication (recommended)

Parameters

Parameter | Type | Required | Description
query | string | Yes | Natural language query (URL-encoded)
database_name | string | Yes | Database to query
include_files | boolean | No | Include file metadata in response (default: false)

Request Example

import requests
import uuid
from urllib.parse import quote

query = "find Q3 contracts mentioning 'termination for convenience'"
idempotency_key = str(uuid.uuid4())  # uuid.uuid7() is only available in Python 3.14+

headers = {
    "Authorization": "Bearer cap_dev_NvXocMo6ZrqsVgAKR6ofIB8TtwbdSBfd",
    "Content-Type": "application/x-www-form-urlencoded",
    "X-Organization-ID": "01999eb7-8554-5c7b-6321-066454166af2",
    "Idempotency-Key": idempotency_key,
}

response = requests.post(
    "https://api.runcaptain.com/v1/query",
    headers=headers,
    data={
        'query': quote(query),
        'database_name': 'contracts_2024',
        'include_files': 'true'
    },
    timeout=120.0
)

print(response.json())

Response

Success (200 OK)

{
  "status": "success",
  "response": "Based on your Q3 contracts, three documents mention 'termination for convenience' clauses. The Acme Corp contract (Section 12.3) allows either party to terminate with 30 days notice. The Beta Industries agreement (Section 8.1) specifies 60 days notice for convenience termination...",
  "relevant_files": [
    {
      "file_name": "Acme_Corp_Q3_2024.pdf",
      "relevancy_score": 0.92,
      "file_type": "pdf",
      "file_id": "0199bc97-212f-729a-9c0b-cc23f21e0995"
    },
    {
      "file_name": "Beta_Industries_Contract.pdf",
      "relevancy_score": 0.87,
      "file_type": "pdf",
      "file_id": "0199bc97-20c9-770e-95f9-3fee32ab9b14"
    }
  ],
  "query": "find Q3 contracts mentioning 'termination for convenience'",
  "database_name": "contracts_2024",
  "processing_metrics": {
    "total_files_processed": 4,
    "total_tokens": 16308,
    "execution_time_ms": 1250
  }
}

Note: When include_files=false (default), the relevant_files array is omitted from the response to reduce payload size.

Response Fields

Field | Type | Description
status | string | Request status
response | string | Natural language answer to your query
relevant_files | array | Array of relevant file objects (if include_files: true)
query | string | Echo of your original query
database_name | string | Database that was queried
processing_metrics | object | Processing statistics for the query (see Processing Metrics Object below)

Relevant Files Object:

Field | Type | Description
file_name | string | Name of the file
relevancy_score | float | Relevancy score (0.0 - 1.0)
file_type | string | File extension/type (e.g., "pdf", "py", "docx")
file_id | string | Unique identifier for the file

Processing Metrics Object:

Field | Type | Description
total_files_processed | integer | Number of files analyzed
total_tokens | integer | Total tokens processed
execution_time_ms | integer | Query execution time in milliseconds

Notes

  • Query timeout is 120 seconds
  • The response field contains the answer with inline references
  • Idempotency-Key prevents duplicate processing of the same request (minimizing costs)
  • Only databases in the same environment as your API key can be queried (development keys can only query development databases, etc.)
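
The Idempotency-Key is only useful if a retry reuses the key of the original attempt. A minimal sketch of that pattern follows, assuming a hypothetical callable send(headers) that performs the POST to /v1/query and returns the parsed JSON. It retries only on the built-in ConnectionError; with the requests library you would catch requests.exceptions.ConnectionError or requests.exceptions.Timeout instead.

```python
import uuid
from typing import Callable


def query_with_retry(send: Callable[[dict], dict], max_attempts: int = 3) -> dict:
    """Call `send(headers)` up to `max_attempts` times with one Idempotency-Key."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Generate the key once so every retry is recognizable as the same request.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error = None
    for _ in range(max_attempts):
        try:
            return send(headers)
        except ConnectionError as error:  # assumption: only transport failures are retried
            last_error = error
    raise last_error
```

Generating a fresh key inside the loop would defeat deduplication, because each retry would look like a new request.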

Error Responses

All endpoints follow consistent error response formats:

400 Bad Request

{
  "status": "error",
  "message": "Invalid parameter: database_name is required"
}

401 Unauthorized

{
  "status": "error",
  "message": "Invalid or expired API key"
}

403 Forbidden

{
  "status": "error",
  "message": "API key does not belong to this organization"
}

404 Not Found

{
  "status": "error",
  "message": "Database not found"
}

500 Internal Server Error

{
  "status": "error",
  "message": "Internal server error",
  "details": "Contact support if this persists"
}
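
A client can turn these payloads into exceptions in one place. The sketch below is illustrative only: CaptainAPIError and raise_for_error are hypothetical names. It also accepts the "success": false shape shown in the Delete File examples and falls back to an "error" field when "message" is absent.

```python
class CaptainAPIError(Exception):
    """Raised when a response signals failure."""

    def __init__(self, status_code, message, details=None):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.message = message
        self.details = details


def raise_for_error(status_code: int, body) -> None:
    """Raise CaptainAPIError if the status code or body signals failure."""
    # List endpoints return a bare JSON array on success, so only inspect dicts.
    fields = body if isinstance(body, dict) else {}
    failed = (
        status_code >= 400
        or fields.get("status") == "error"
        or fields.get("success") is False  # shape used by /v1/delete-file
    )
    if failed:
        message = fields.get("message") or fields.get("error") or "Unknown error"
        raise CaptainAPIError(status_code, message, fields.get("details"))
```

Typical use is raise_for_error(response.status_code, response.json()) immediately after each request.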


Rate Limits

Rate limits are applied per API key. Contact support for specific limits on your account.

Support

For API support, contact: support@runcaptain.com or call us at +1 (260) CAP-TAIN.