Cloud Storage Config

Captain supports indexing from Amazon S3, Google Cloud Storage, Azure Blob Storage, and Cloudflare R2.

This guide walks through connecting each provider to Captain.



AWS S3 Bucket Setup

Step 1: Create the Bucket

  1. Log in to the AWS Console

  2. Navigate to S3 → Buckets → Create bucket

    Captain recommends the following standard configuration:
    1. Select a region for the bucket
    2. Set a bucket name
    3. Keep General purpose selected as the bucket type, if prompted
    4. Keep ACLs disabled
    5. Keep Block all public access checked
    6. (Optional) Enable Bucket Versioning (useful for SOC 2 compliance)

  3. Once these settings are configured, scroll down and click Create bucket
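The console steps above can also be done with the AWS CLI. The commands below are a sketch using placeholder names; adjust the bucket name and region to your own values.

```shell
# Create the bucket (bucket name and region are placeholders).
# Note: for us-east-1, omit --create-bucket-configuration entirely.
aws s3api create-bucket \
  --bucket your-bucket-name \
  --region us-west-2 \
  --create-bucket-configuration LocationConstraint=us-west-2

# Block all public access, matching the console defaults above
aws s3api put-public-access-block \
  --bucket your-bucket-name \
  --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

# (Optional) Enable versioning, useful for SOC 2 compliance
aws s3api put-bucket-versioning \
  --bucket your-bucket-name \
  --versioning-configuration Status=Enabled
```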


AWS S3 Access Keys

To index files from Amazon S3 buckets, you’ll need an AWS Access Key ID and Secret Access Key.

Step 1: Navigate to IAM Security Credentials

  1. Log in to the AWS Console
  2. Click on your account name in the top-right corner
  3. Select Security credentials from the dropdown menu

Step 2: Create Access Key

  1. Scroll down to the Access keys section
  2. Click the Create access key button

Step 3: Retrieve Your Credentials

  1. Your Access Key ID and Secret Access Key will be displayed
  2. Important: This is the only time you can view the Secret Access Key
  3. Click Show to reveal the Secret Access Key
  4. Copy both the Access Key ID and Secret Access Key to a secure location
  5. Optionally, download the .csv file for safekeeping

Required IAM Permissions (AWS)

Your AWS access key needs the following permissions to work with Captain:

For read-only access to S3 buckets:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}

Replace your-bucket-name with your actual S3 bucket name.
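Once the policy is attached, you can index the bucket with your credentials. The exact S3 endpoint path and field names below are illustrative — they mirror the Azure and R2 examples later in this guide, and the `bucket_region` parameter is referenced in Troubleshooting — so check the API reference for the authoritative request shape.

```shell
# Index an S3 bucket (endpoint path and field names are assumptions
# modeled on the Azure/R2 examples; values are placeholders)
curl -X POST https://api.runcaptain.com/v2/collections/my_collection/index/s3 \
  -H "Authorization: Bearer $CAPTAIN_API_KEY" \
  -H "X-Organization-ID: $CAPTAIN_ORG_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket_name": "your-bucket-name",
    "bucket_region": "us-east-1",
    "access_key_id": "your_aws_access_key_id",
    "secret_access_key": "your_aws_secret_access_key"
  }'
```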


Google Cloud Storage Bucket Setup

Step 1: Create the Bucket

  1. Go to Google Cloud Console
  2. Navigate to Cloud Storage → Buckets → Create
  3. Enter a unique bucket name (e.g., company-captain-documents)
  4. Select a Location type (Region, Dual-region, or Multi-region)
  5. Select Standard storage class for general use
  6. Under Access control, select “Uniform” (recommended)
  7. Click Create
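If you prefer the command line, the same bucket can be created with the gcloud CLI; the name, location, and storage class below are placeholders matching the console steps above.

```shell
# Create the bucket with uniform access control (recommended above)
gcloud storage buckets create gs://company-captain-documents \
  --location=us-central1 \
  --default-storage-class=STANDARD \
  --uniform-bucket-level-access
```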

Google Cloud Storage Credentials

To index files from Google Cloud Storage buckets, you’ll need a Service Account JSON key.

Step 1: Navigate to Service Accounts

  1. Go to Google Cloud Console
  2. Navigate to IAM & Admin → Service Accounts
  3. Click Create Service Account

Step 2: Create Service Account

  1. Enter a Service account name (e.g., captain-storage-access)
  2. Add a Service account description (optional but recommended)
  3. The Service account ID will be auto-generated
  4. Click Create and continue

Step 3: Grant Permissions

Under Grant this service account access to project, choose the appropriate role based on your needs:

For read-only access to buckets/objects:

  • Role: Storage Object Viewer (roles/storage.objectViewer)

For read/write access:

  • Role: Storage Object Admin (roles/storage.objectAdmin)

For full bucket management:

  • Role: Storage Admin (roles/storage.admin)

Click Continue → Done

Step 4: Create and Download JSON Key

  1. You’ll now see your new service account in the list
  2. Click on the service account name
  3. Navigate to the Keys tab
  4. Click Add Key → Create new key
  5. Select JSON as the key type
  6. Click Create

The JSON key file will automatically download to your computer. This file contains your service account credentials.

JSON Key File Format

Your downloaded JSON key will look like this:

{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "abc123...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "captain-storage-access@your-project.iam.gserviceaccount.com",
  "client_id": "123456789",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
}

Using Google Cloud Service Accounts with Captain

Store the JSON key file securely (e.g., in a secret management service or as an environment variable).
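A sketch of passing the key to the indexing endpoint is shown below. The `/index/gcs` path and the `service_account_json` field name are assumptions modeled on the Azure and R2 examples later in this guide; confirm them against the API reference.

```shell
# Load the downloaded key from disk into a variable rather than
# pasting it inline (the path is a placeholder)
GCS_KEY_JSON="$(cat /path/to/key.json)"

# Index a GCS bucket (endpoint path and field names are assumptions)
curl -X POST https://api.runcaptain.com/v2/collections/my_collection/index/gcs \
  -H "Authorization: Bearer $CAPTAIN_API_KEY" \
  -H "X-Organization-ID: $CAPTAIN_ORG_ID" \
  -H "Content-Type: application/json" \
  -d "{
    \"bucket_name\": \"company-captain-documents\",
    \"service_account_json\": $GCS_KEY_JSON
  }"
```

Because the key file is itself a JSON object, it can be embedded directly as the field's value without extra escaping.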


Azure Blob Storage Setup

Step 1: Create a Storage Account

  1. Log in to the Azure Portal
  2. Navigate to Storage accounts → Create
  3. Select your Subscription and Resource group (or create a new one)
  4. Enter a Storage account name (e.g., captaindocuments)
  5. Select a Region closest to your operations
  6. Select Standard performance and LRS (Locally-redundant storage) for general use
  7. Click Review + create → Create

Step 2: Create a Container

  1. Open your new storage account
  2. Navigate to Data storage → Containers
  3. Click + Container
  4. Enter a container name (e.g., documents)
  5. Set Private access level (no anonymous access)
  6. Click Create
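The same storage account and container can be created with the Azure CLI; names, resource group, and region below are placeholders matching the portal steps above.

```shell
# Create the storage account (Standard performance, locally-redundant)
az storage account create \
  --name captaindocuments \
  --resource-group my-resource-group \
  --location eastus \
  --sku Standard_LRS

# Create a private container (no anonymous access)
az storage container create \
  --name documents \
  --account-name captaindocuments \
  --public-access off
```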

Azure Blob Storage Credentials

To index files from Azure Blob Storage, you’ll need your Storage Account Name and Account Key.

Step 1: Get Your Account Name and Key

  1. Open your storage account in the Azure Portal
  2. Navigate to Security + networking → Access keys
  3. Your Storage account name is displayed at the top
  4. Click Show next to either key to reveal the Account Key (base64-encoded)
  5. Copy both values

Important: Treat the account key like a password. Store it securely (e.g., in a secret management service or as an environment variable).

Step 2: Use with Captain

Pass the account name, account key, and container name when calling the indexing endpoint:

curl -X POST https://api.runcaptain.com/v2/collections/my_collection/index/azure \
  -H "Authorization: Bearer $CAPTAIN_API_KEY" \
  -H "X-Organization-ID: $CAPTAIN_ORG_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "container_name": "documents",
    "account_name": "captaindocuments",
    "account_key": "your_account_key_base64"
  }'

Required Azure Permissions

The account key provides full access to all containers in the storage account. For more granular control, you can use Shared Access Signatures (SAS) with the following minimum permissions:

  • Read — to access blob contents
  • List — to enumerate blobs in the container

You can generate a SAS token from the Azure Portal under your storage account’s Shared access signature settings.
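A SAS token with just these two permissions can also be generated from the Azure CLI; the account name, container name, and expiry below are placeholders.

```shell
# Generate a read (r) + list (l) SAS token for the container,
# valid until the given expiry; requires the account key from Step 1
az storage container generate-sas \
  --account-name captaindocuments \
  --account-key "$AZURE_ACCOUNT_KEY" \
  --name documents \
  --permissions rl \
  --expiry 2026-01-01T00:00:00Z \
  --output tsv
```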


Cloudflare R2 Setup

Step 1: Create an R2 Bucket

  1. Log in to the Cloudflare Dashboard
  2. Navigate to R2 Object Storage → Create bucket
  3. Enter a bucket name (e.g., captain-documents)
  4. Select a location hint (optional — R2 automatically distributes globally)
  5. Click Create bucket
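If you already use Wrangler, the bucket can be created from the command line instead of the dashboard (requires an authenticated Wrangler session; the bucket name is a placeholder).

```shell
# Create the R2 bucket (run `npx wrangler login` first)
npx wrangler r2 bucket create captain-documents
```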

Cloudflare R2 Credentials

To index files from Cloudflare R2, you’ll need your Account ID, an Access Key ID, and a Secret Access Key.

Step 1: Find Your Account ID

  1. Log in to the Cloudflare Dashboard
  2. Your Account ID is visible in the URL: https://dash.cloudflare.com/<account_id>
  3. You can also find it on the R2 Overview page in the right sidebar

Step 2: Create an R2 API Token

  1. Navigate to R2 Object Storage → Manage R2 API Tokens
  2. Click Create API token
  3. Enter a token name (e.g., captain-read-access)
  4. Under Permissions, select Object Read only
  5. Under Specify bucket(s), select your bucket or allow access to all buckets
  6. Click Create API Token

Step 3: Retrieve Your Credentials

  1. Your Access Key ID and Secret Access Key will be displayed
  2. Important: This is the only time you can view the Secret Access Key
  3. Copy both values to a secure location

Step 4: Use with Captain

Pass the account ID, access key ID, secret access key, and bucket name when calling the indexing endpoint:

curl -X POST https://api.runcaptain.com/v2/collections/my_collection/index/r2 \
  -H "Authorization: Bearer $CAPTAIN_API_KEY" \
  -H "X-Organization-ID: $CAPTAIN_ORG_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "bucket_name": "captain-documents",
    "account_id": "your_cloudflare_account_id",
    "access_key_id": "your_r2_access_key_id",
    "secret_access_key": "your_r2_secret_access_key"
  }'

R2 Jurisdictions

R2 supports jurisdiction-restricted storage. You can optionally specify a jurisdiction parameter:

  • default — Global (no restriction). This is the default.
  • eu — EU-only data residency
  • fedramp — FedRAMP-compliant storage

Required R2 Permissions

Your R2 API token needs the following minimum permissions:

  • Object Read — to access object contents
  • List — to enumerate objects in the bucket

For more granular control, create a token scoped to a specific bucket rather than all buckets.


Troubleshooting

AWS Issues

Error: “Invalid AWS credentials”

  • Verify your Access Key ID and Secret Access Key are correct
  • Check that the access key is active in the IAM console
  • Ensure your IAM user/role has the necessary S3 permissions

Error: “Access Denied”

  • Verify your IAM permissions include s3:GetObject and s3:ListBucket
  • Check bucket policies and ensure they allow your IAM user/role
  • Verify the bucket region matches the bucket_region parameter

Google Cloud Issues

Error: “Invalid service account credentials”

  • Verify the JSON key file is valid and not corrupted
  • Check that the service account is enabled
  • Ensure the service account has the necessary Storage permissions

Error: “Permission denied”

  • Verify the service account has the appropriate Storage role
  • Check that the bucket exists and the service account has access
  • Review IAM permissions in the Google Cloud Console

Azure Issues

Error: “Invalid credentials”

  • Verify the account name matches your storage account exactly
  • Check that the account key is the full base64-encoded key (not truncated)
  • Ensure the storage account exists and is active

Error: “Container not found”

  • Verify the container name matches exactly (case-sensitive)
  • Check that the container exists in the storage account
  • Ensure the storage account is in the correct subscription

Cloudflare R2 Issues

Error: “Invalid credentials”

  • Verify your Account ID matches the one in your Cloudflare dashboard URL
  • Check that the Access Key ID and Secret Access Key are from an R2 API token (not a general Cloudflare API token)
  • Ensure the API token has not been revoked

Error: “Access Denied”

  • Verify the R2 API token has Object Read permissions
  • Check that the token is scoped to the correct bucket (or all buckets)
  • Ensure the bucket exists and is in the correct jurisdiction

Need Help?

If you encounter issues obtaining or using your cloud storage credentials, contact Captain support: