Cloud Storage Credentials Guide

This guide will help you create cloud storage buckets with proper compliance settings and obtain the necessary credentials to use Captain with your cloud storage provider.


Creating Your Cloud Storage Bucket

Before obtaining credentials, you’ll need a bucket to store your documents. Follow these steps to create a properly configured bucket.

AWS S3 Bucket Setup

Step 1: Create the Bucket

  1. Log in to the AWS Console
  2. Navigate to S3 → Buckets → Create bucket
  3. Enter a unique bucket name (e.g., company-captain-documents)
  4. Select your preferred AWS Region (recommendation: same region as your application)
  5. Under Object Ownership, keep “ACLs disabled” (recommended)
  6. Under Block Public Access settings, keep all options checked (block all public access)
  7. Click Create bucket
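
If you script your infrastructure, the console steps above can be sketched with the AWS CLI. The bucket name and region below are placeholders; this assumes the AWS CLI is installed and configured with credentials:

```shell
# Create the bucket (us-east-1 shown; other regions additionally require
# --create-bucket-configuration LocationConstraint=<region>)
aws s3api create-bucket \
  --bucket company-captain-documents \
  --region us-east-1

# Keep all public access blocked (mirrors step 6)
aws s3api put-public-access-block \
  --bucket company-captain-documents \
  --public-access-block-configuration \
    BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
```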

Step 2: Enable Object Versioning (SOC 2 Compliance)

Required for SOC 2 compliance: Object versioning maintains a complete audit trail of all document changes, enabling recovery from accidental deletions and supporting data retention policies.

  1. Click on your newly created bucket
  2. Navigate to the Properties tab
  3. Find the Bucket Versioning section
  4. Click Edit
  5. Select Enable versioning
  6. Click Save changes
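
The same setting can be applied from the AWS CLI, as a sketch (the bucket name is a placeholder and the CLI must be configured with credentials):

```shell
# Enable versioning on an existing bucket
aws s3api put-bucket-versioning \
  --bucket company-captain-documents \
  --versioning-configuration Status=Enabled

# Verify the setting took effect
aws s3api get-bucket-versioning --bucket company-captain-documents
```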

Step 3: (Optional) Enable Server-Side Encryption

For additional security, enable default encryption:

  1. In the Properties tab, find Default encryption
  2. Click Edit
  3. Select Server-side encryption with Amazon S3 managed keys (SSE-S3) or your preferred encryption method
  4. Click Save changes
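
The CLI equivalent for SSE-S3 default encryption, as a sketch (bucket name is a placeholder):

```shell
# Set SSE-S3 (AES256) as the bucket's default encryption
aws s3api put-bucket-encryption \
  --bucket company-captain-documents \
  --server-side-encryption-configuration \
    '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
```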

Step 4: (Optional) Configure Lifecycle Rules

For cost optimization with versioned objects:

  1. Navigate to the Management tab
  2. Click Create lifecycle rule
  3. Configure rules to transition old versions to cheaper storage classes or delete them after a retention period
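
As a sketch, a lifecycle configuration of the kind step 3 describes might look like the following; the rule ID, storage class, and day counts are illustrative choices, not values Captain requires. The resulting JSON can be applied with `aws s3api put-bucket-lifecycle-configuration`:

```python
import json

# Illustrative lifecycle configuration for a versioned bucket.
lifecycle_config = {
    "Rules": [
        {
            "ID": "manage-noncurrent-versions",
            "Status": "Enabled",
            "Filter": {},  # empty filter applies the rule to the whole bucket
            # Move old (noncurrent) versions to a cheaper storage class
            # after 30 days...
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 30, "StorageClass": "GLACIER"}
            ],
            # ...and delete them after a 365-day retention period.
            "NoncurrentVersionExpiration": {"NoncurrentDays": 365},
        }
    ]
}

print(json.dumps(lifecycle_config, indent=2))
```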

Google Cloud Storage Bucket Setup

Step 1: Create the Bucket

  1. Go to Google Cloud Console
  2. Navigate to Cloud Storage → Buckets → Create
  3. Enter a unique bucket name (e.g., company-captain-documents)
  4. Select a Location type (Region, Dual-region, or Multi-region)
  5. Select Standard storage class for general use
  6. Under Access control, select “Uniform” (recommended)
  7. Click Create
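
The console steps above can be sketched with the gcloud CLI; the bucket name and location below are placeholder choices:

```shell
# Create a single-region Standard bucket with uniform access control
gcloud storage buckets create gs://company-captain-documents \
  --location=us-central1 \
  --default-storage-class=STANDARD \
  --uniform-bucket-level-access
```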

Step 2: Enable Object Versioning (SOC 2 Compliance)

Required for SOC 2 compliance: Object versioning maintains a complete audit trail of all document changes.

Via Google Cloud Console:

  1. Click on your bucket name
  2. Navigate to the Configuration tab
  3. Find Object versioning and click Edit
  4. Toggle versioning to Enabled
  5. Click Save

Via gsutil command line:

$ gsutil versioning set on gs://your-bucket-name

Via gcloud CLI:

$ gcloud storage buckets update gs://your-bucket-name --versioning

Step 3: (Optional) Configure Lifecycle Rules

To manage versioned object costs:

  1. In your bucket, go to the Lifecycle tab
  2. Click Add a rule
  3. Configure rules to delete old versions after your retention period
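
As a sketch, a lifecycle policy of the kind step 3 describes can be written as a JSON file and applied with `gcloud storage buckets update gs://your-bucket-name --lifecycle-file=lifecycle.json`. The version count below is an illustrative retention choice:

```python
import json

# Illustrative GCS lifecycle policy: delete object versions that are no
# longer live and have at least 5 newer versions, i.e. keep the 5 most
# recent versions of each object.
gcs_lifecycle = {
    "rule": [
        {
            "action": {"type": "Delete"},
            "condition": {"isLive": False, "numNewerVersions": 5},
        }
    ]
}

with open("lifecycle.json", "w") as f:
    json.dump(gcs_lifecycle, f, indent=2)
```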

SOC 2 Compliance Checklist

Ensure your buckets meet SOC 2 requirements:

| Requirement | AWS S3 | Google Cloud Storage |
| --- | --- | --- |
| Object Versioning | ✓ Enable Bucket Versioning | ✓ Enable Object Versioning |
| Encryption at Rest | ✓ SSE-S3 or SSE-KMS | ✓ Google-managed or CMEK |
| Access Logging | ✓ Enable Server Access Logging | ✓ Enable Data Access Logs |
| Block Public Access | ✓ Block all public access | ✓ Uniform access control |
| IAM Policies | ✓ Least-privilege permissions | ✓ Least-privilege permissions |

AWS S3 Credentials

To index files from Amazon S3 buckets, you’ll need an AWS Access Key ID and Secret Access Key.

Step 1: Navigate to IAM Security Credentials

  1. Log in to the AWS Console
  2. Click on your account name in the top-right corner
  3. Select Security credentials from the dropdown menu

AWS Console - Security Credentials

Step 2: Create Access Key

  1. Scroll down to the Access keys section
  2. Click the Create access key button

AWS IAM - Create Access Key

Step 3: Retrieve Your Credentials

  1. Your Access Key ID and Secret Access Key will be displayed
  2. Important: This is the only time you can view the Secret Access Key
  3. Click Show to reveal the Secret Access Key
  4. Copy both the Access Key ID and Secret Access Key to a secure location
  5. Optionally, download the .csv file for safekeeping

AWS - Retrieve Access Key

Best Practices for AWS Access Keys

  • Never store your access keys in plain text, in code, or in a code repository
  • Disable or delete access keys when no longer needed
  • Enable least-privilege permissions
  • Rotate access keys regularly

Required IAM Permissions

Your AWS access key needs the following permissions to work with Captain:

For read-only access to S3 buckets:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    }
  ]
}

Replace your-bucket-name with your actual S3 bucket name.
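
If you generate this policy in code, a small helper like the hypothetical `captain_s3_policy` below keeps the two resource ARNs in sync with the bucket name:

```python
import json

def captain_s3_policy(bucket_name: str) -> str:
    """Return the read-only IAM policy JSON for the given bucket."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                # Both ARNs are needed: the bucket ARN for ListBucket,
                # and bucket/* for GetObject on the objects inside it.
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }
    return json.dumps(policy, indent=2)

print(captain_s3_policy("company-captain-documents"))
```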


Google Cloud Storage Credentials

To index files from Google Cloud Storage buckets, you’ll need a Service Account JSON key.

Step 1: Navigate to Service Accounts

  1. Go to Google Cloud Console
  2. Navigate to IAM & AdminService Accounts
  3. Click Create Service Account

Google Cloud - Service Accounts

Step 2: Create Service Account

  1. Enter a Service account name (e.g., captain-storage-access)
  2. Add a Service account description (optional but recommended)
  3. The Service account ID will be auto-generated
  4. Click Create and continue

Google Cloud - Create Service Account

Step 3: Grant Permissions

Under Grant this service account access to project, choose the appropriate role based on your needs:

For read-only access to buckets/objects:

  • Role: Storage Object Viewer (roles/storage.objectViewer)

For read/write access:

  • Role: Storage Object Admin (roles/storage.objectAdmin)

For full bucket management:

  • Role: Storage Admin (roles/storage.admin)

Click Continue → Done
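
The role grant can also be sketched from the gcloud CLI; the project ID and service-account email below are placeholders:

```shell
# Grant read-only object access at the project level
gcloud projects add-iam-policy-binding your-project-id \
  --member="serviceAccount:captain-storage-access@your-project-id.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"
```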

Step 4: Create and Download JSON Key

  1. You’ll now see your new service account in the list
  2. Click on the service account name
  3. Navigate to the Keys tab
  4. Click Add Key → Create New Key
  5. Select JSON as the key type
  6. Click Create

Google Cloud - Create JSON Key

The JSON key file will automatically download to your computer. This file contains your service account credentials.
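
The same key can be created from the gcloud CLI, as a sketch (output filename and service-account email are placeholders):

```shell
# Create and download a JSON key for the service account
gcloud iam service-accounts keys create captain-key.json \
  --iam-account=captain-storage-access@your-project-id.iam.gserviceaccount.com
```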

JSON Key File Format

Your downloaded JSON key will look like this:

{
  "type": "service_account",
  "project_id": "your-project-id",
  "private_key_id": "abc123...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "captain-storage-access@your-project.iam.gserviceaccount.com",
  "client_id": "123456789",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
}
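
A quick sanity check of the key file catches a corrupted or truncated download before it causes an opaque API error later. The helper below is a hypothetical sketch, not part of Captain:

```python
import json

# Fields a service-account key file is expected to contain.
REQUIRED_FIELDS = {
    "type", "project_id", "private_key_id", "private_key",
    "client_email", "token_uri",
}

def validate_service_account_key(path: str) -> dict:
    """Load a service-account JSON key and check its basic shape."""
    with open(path) as f:
        key = json.load(f)
    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError(f"Key file is missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"Expected type 'service_account', got {key['type']!r}")
    return key
```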

Using the Service Account with Captain

When using Captain with Google Cloud Storage, you’ll typically need to:

  1. Store the JSON key file securely on your server
  2. Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to the JSON file path:
$ export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your-service-account-key.json"

Or load it programmatically in your application.
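
Loading it programmatically can be sketched as follows, resolving the key file through the same GOOGLE_APPLICATION_CREDENTIALS variable; the helper name is hypothetical:

```python
import json
import os

def load_service_account_json() -> str:
    """Read the service-account key pointed to by the environment variable."""
    path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError(
            "Set GOOGLE_APPLICATION_CREDENTIALS to your key file path"
        )
    with open(path) as f:
        raw = f.read()
    json.loads(raw)  # fail fast if the file is not valid JSON
    return raw
```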

Best Practices for Service Account Keys

  • Service account keys pose a serious security risk if compromised — treat them like passwords
  • Never commit service account keys to version control
  • Store keys securely using secret management services
  • Rotate keys regularly
  • Delete unused service accounts and keys
  • Consider using Workload Identity Federation for more secure authentication (advanced)
  • Google automatically disables service account keys detected in public repositories

Using Credentials with Captain

Once you have your credentials, you can use them with Captain’s indexing endpoints:

AWS S3 Example

import requests
from urllib.parse import quote

aws_secret = "your_aws_secret_key"
aws_secret_encoded = quote(aws_secret, safe='')

headers = {
    "Authorization": "Bearer cap_dev_...",
    "X-Organization-ID": "01999eb7-...",
    "Content-Type": "application/x-www-form-urlencoded"
}

response = requests.post(
    "https://api.runcaptain.com/v1/index-s3",
    headers=headers,
    data={
        'database_name': 'my_database',
        'bucket_name': 'my-s3-bucket',
        'aws_access_key_id': 'AKIAIOSFODNN7EXAMPLE',
        'aws_secret_access_key': aws_secret_encoded,
        'bucket_region': 'us-east-1'
    }
)
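
The `quote(..., safe='')` step matters because AWS secret keys can contain characters like `/` and `+` that have special meaning in form-encoded bodies. The secret below is AWS's documented example value, not a real credential:

```python
from urllib.parse import quote, unquote

# Percent-encode a secret that contains '/' characters.
secret = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
encoded = quote(secret, safe='')

assert '/' not in encoded          # special characters are escaped
assert unquote(encoded) == secret  # encoding round-trips losslessly
print(encoded)
```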

Google Cloud Storage Example

import requests

# Load service account JSON
with open('path/to/service-account-key.json', 'r') as f:
    service_account_json = f.read()

headers = {
    "Authorization": "Bearer cap_dev_...",
    "X-Organization-ID": "01999eb7-...",
    "Content-Type": "application/x-www-form-urlencoded"
}

response = requests.post(
    "https://api.runcaptain.com/v1/index-gcs",
    headers=headers,
    data={
        'database_name': 'my_database',
        'bucket_name': 'my-gcs-bucket',
        'service_account_json': service_account_json
    }
)

Security Reminders

  • Never share your credentials in public forums, chat messages, or email
  • Never commit credentials to version control systems (Git, SVN, etc.)
  • Use environment variables or secure secret management services in production
  • Rotate credentials regularly to minimize security risks
  • Apply least-privilege permissions - only grant the minimum access needed
  • Monitor credential usage through AWS CloudTrail or Google Cloud Audit Logs

Troubleshooting

AWS Issues

Error: “Invalid AWS credentials”

  • Verify your Access Key ID and Secret Access Key are correct
  • Check that the access key is active in the IAM console
  • Ensure your IAM user/role has the necessary S3 permissions

Error: “Access Denied”

  • Verify your IAM permissions include s3:GetObject and s3:ListBucket
  • Check bucket policies and ensure they allow your IAM user/role
  • Verify the bucket region matches the bucket_region parameter

Google Cloud Issues

Error: “Invalid service account credentials”

  • Verify the JSON key file is valid and not corrupted
  • Check that the service account is enabled
  • Ensure the service account has the necessary Storage permissions

Error: “Permission denied”

  • Verify the service account has the appropriate Storage role
  • Check that the bucket exists and the service account has access
  • Review IAM permissions in the Google Cloud Console

Need Help?

If you encounter issues obtaining or using your cloud storage credentials, contact Captain support: