Export to S3
This guide will walk you through the process of setting up S3 exports for your Usermaven workspace. Usermaven can deliver batched exports of your ClickHouse events directly into an S3 bucket or any S3-compatible object storage that you control. Each export contains only the rows that are new since the previous delivery, ensuring incremental updates without duplicates.
What data is included?
- Events table – All event properties that are visible in your workspace will be present in the export. Columns that are internal to our platform are filtered out automatically.
- Workspace scoping – Only data belonging to your workspace identifier is exported.
- Timestamps – Rows are ordered by the event ingestion timestamp so downstream systems can process them chronologically.
Prerequisites
Before setting up your S3 export, make sure you have:
- An active AWS account (or an S3-compatible storage provider account)
- Permissions to create and manage S3 buckets
- Access to IAM for creating service accounts or roles
- Access to your Usermaven workspace settings
What to prepare
Please have the following details ready before configuring your S3 export:
| Information | Description |
| --- | --- |
| Bucket name | The S3 bucket where files should be delivered. |
| AWS region | Region of the target bucket (for example `us-east-1`). |
| Folder/prefix | Optional subfolder inside the bucket where files should be stored. |
| Authentication | Access key and secret with write permissions. |
| Optional endpoint | Required only for S3-compatible providers such as MinIO, Cloudflare R2, Wasabi, or DigitalOcean Spaces. |
Note: Minimal permissions should include `s3:PutObject`, `s3:DeleteObject`, and `s3:ListBucket` for the bucket or prefix you specify.
Step-by-Step Guide
1. Set Up Your S3 Bucket
- Log in to your AWS Console and navigate to the S3 service.
- Create a new bucket or select an existing bucket where you want to receive your Usermaven event exports.
- Note the bucket name and region – you’ll need these details later.
- (Optional) Create a specific folder/prefix within the bucket for organizing your Usermaven exports.
2. Configure IAM Permissions
- Navigate to the IAM service in your AWS Console.
- Create a new IAM user or select an existing user.
- Attach a policy that grants the following permissions on your target bucket:
  - `s3:PutObject`
  - `s3:DeleteObject`
  - `s3:ListBucket`
- Generate an access key and secret for this user.
- Keep these credentials secure – you’ll need to provide them to Usermaven.
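The minimal policy from step 2 can be sketched as a policy document. The bucket name `my-usermaven-exports` and the `usermaven` prefix below are placeholders; substitute your own values before attaching the policy.

```python
import json

BUCKET = "my-usermaven-exports"  # placeholder bucket name
PREFIX = "usermaven"             # placeholder folder/prefix

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # ListBucket is granted on the bucket itself, optionally
            # restricted to the export prefix via a condition.
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{BUCKET}",
            "Condition": {"StringLike": {"s3:prefix": [f"{PREFIX}/*"]}},
        },
        {
            # Object-level actions are granted on keys under the prefix.
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:DeleteObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/{PREFIX}/*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```

Scoping the object-level actions to the prefix keeps the exporter from touching anything else in the bucket.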
3. Access Your Usermaven Integration Settings
- Log into your Usermaven account.
- Navigate to Workspace Settings from the main menu.
- Click on Integrations to view available integrations.
4. Locate and Configure S3 Export
- Scroll through the integrations until you find the S3 card.
- Click on the Connect S3 button to start the configuration process.
- Fill out the integration form with the following details:
- Sync Name: Give this export a memorable name, like “Production Events Export”.
- Bucket Name: Enter the name of your S3 bucket.
- AWS Region: Select or enter the region of your bucket (e.g., `us-east-1`).
- Folder/Prefix (Optional): Specify a subfolder path if you want files organized in a specific location.
- Access Key ID: Enter your AWS Access Key ID.
- Secret Access Key: Enter your AWS Secret Access Key.
- Endpoint (Optional): Only required for S3-compatible providers. Leave blank for standard S3.
- Click Test Connection to verify your credentials and bucket access are configured correctly. This will help you identify any issues before saving.
- Once the connection test is successful, double-check all your entries and click Save to complete the setup.
After saving, your S3 export will be listed on the Integrations page with its sync status and next scheduled delivery time.
Delivery cadence
Exports run automatically every 24 hours. Each run picks up rows with an ingestion timestamp greater than the last successful upload. If a delivery fails, the next run automatically retries the same window so no data is lost.
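The incremental window described above can be illustrated with a small sketch. The function and variable names here are hypothetical, not Usermaven's actual exporter code; the point is that the checkpoint only advances after a successful upload, so a failed run naturally retries the same window.

```python
from datetime import datetime, timezone

def select_window(rows, checkpoint):
    """Return rows ingested after `checkpoint`, plus the new checkpoint.

    `rows` is a list of (ingested_at, payload) tuples. The caller should
    only persist the returned checkpoint once the upload succeeds.
    """
    window = [r for r in rows if r[0] > checkpoint]
    new_checkpoint = max((r[0] for r in window), default=checkpoint)
    return window, new_checkpoint

rows = [
    (datetime(2024, 5, 1, 10, tzinfo=timezone.utc), "evt-1"),
    (datetime(2024, 5, 2, 11, tzinfo=timezone.utc), "evt-2"),
    (datetime(2024, 5, 3, 12, tzinfo=timezone.utc), "evt-3"),
]
checkpoint = datetime(2024, 5, 1, 23, tzinfo=timezone.utc)

window, new_cp = select_window(rows, checkpoint)
# Only evt-2 and evt-3 fall after the checkpoint.
```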
File structure in your bucket
Files are organized using a partitioned folder layout so you can load them efficiently into downstream warehouses or lakehouses:
s3://<bucket>/<prefix>/<workspace_id>/
├── usermaven_events/
│ ├── _dlt_loads/ # metadata files created by the exporter
│ ├── _dlt_pipeline_state/ # checkpoints used for incremental loads
│ ├── _dlt_version/ # exporter version marker
│ └── events/
│ └── year=YYYY/
│ └── month=MM/
│ └── day=DD/
│ └── *.parquet files
└── usermaven_events_staging/ # staging data for processing
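Given that layout, the key for a delivered Parquet file can be derived from the export run time. The prefix, workspace ID, and filename below are placeholder values:

```python
from datetime import datetime, timezone

def parquet_key(prefix, workspace_id, run_time, filename):
    """Build the partitioned S3 key for a Parquet file delivered at run_time (UTC)."""
    return (
        f"{prefix}/{workspace_id}/usermaven_events/events/"
        f"year={run_time.year:04d}/month={run_time.month:02d}/day={run_time.day:02d}/"
        f"{filename}"
    )

key = parquet_key(
    "usermaven", "ws_123",
    datetime(2024, 5, 7, 3, 0, tzinfo=timezone.utc),
    "part-000.parquet",
)
# -> usermaven/ws_123/usermaven_events/events/year=2024/month=05/day=07/part-000.parquet
```

Because partitions follow the `year=/month=/day=` Hive-style convention, most query engines can prune them automatically for time-bounded queries.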
Important notes:
- Data is organized hierarchically with your workspace identifier, followed by dataset folders.
- Event data is partitioned by year, month, and day inside the `events/` folder for easy time-based queries.
- Data files are delivered in Apache Parquet format for optimal compression and analytics support.
- The exporter writes one or more Parquet files per run depending on volume.
- Directories prefixed with `_dlt_` are internal metadata used by the data loader. They should not be deleted, as they ensure incremental deliveries work correctly.
File contents
- Schema – Each Parquet file mirrors the event schema visible in your workspace, including nested JSON properties.
- Compression – We use columnar compression (Snappy by default) to keep file sizes manageable while preserving query speed. Alternative codecs such as Zstandard can be enabled on request.
- Timestamps – All timestamps are in UTC. Partition folders are based on the time the export job ran, not necessarily the original event time.
Working with S3-compatible providers
We support any storage provider that exposes an S3 API. In addition to the standard details above, please supply the HTTPS endpoint for your provider and confirm whether path-style addressing is required. All other behavior (file format, folder layout, metadata directories) remains the same.
Supported providers include:
- MinIO
- Cloudflare R2
- Wasabi
- DigitalOcean Spaces
- And other S3-compatible services
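The difference between path-style and virtual-hosted addressing, which determines how requests reach your provider, can be sketched as follows. The endpoints and bucket names are examples only:

```python
def object_url(endpoint, bucket, key, path_style):
    """Build the request URL for an object under either addressing scheme."""
    host = endpoint.removeprefix("https://")
    if path_style:
        # Path-style: the bucket appears in the URL path. Some providers
        # (e.g. MinIO in its default setup) require this form.
        return f"https://{host}/{bucket}/{key}"
    # Virtual-hosted style: the bucket appears in the hostname, which is
    # the default for standard Amazon S3.
    return f"https://{bucket}.{host}/{key}"

u1 = object_url("https://minio.example.com", "exports", "events/a.parquet", True)
u2 = object_url("https://s3.us-east-1.amazonaws.com", "exports", "events/a.parquet", False)
```

This is why the endpoint field is only needed for S3-compatible providers: for standard S3, the hostname can be derived from the bucket and region alone.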
Monitoring and issue resolution
No new files appear
Confirm that new events were received in the application during the export window and that the workspace identifier matches what was provided to Usermaven.
Permissions errors
Ensure the IAM user has write access to the bucket/prefix and, if using a KMS-encrypted bucket, that the policy allows encryption and decryption from the exporting account.
Multiple files per run
Large exports are automatically split into several Parquet files to optimize performance and manageability.
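The splitting behavior amounts to chunking the export's rows so no single file grows too large. This sketch is illustrative only; the actual cap used by the exporter is not documented here, so the value below is invented:

```python
def chunk_rows(rows, max_rows_per_file):
    """Split rows into successive chunks, one per output Parquet file."""
    return [rows[i:i + max_rows_per_file] for i in range(0, len(rows), max_rows_per_file)]

# 25 rows with a (hypothetical) cap of 10 rows per file -> 3 files.
chunks = chunk_rows(list(range(25)), 10)
```

When loading into a warehouse, treat all files within a day partition as one logical batch rather than assuming one file per run.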
Reprocessing history
If you need a historical backfill, contact your Usermaven representative so we can temporarily extend the lookback window or reset the incremental checkpoint on your behalf.
Support
For any questions or issues, reach out to Usermaven support and include the path to the affected files (bucket, prefix, date) so we can investigate promptly.