DuckDB extensions for AWS Lambda

Run DuckDB serverlessly to query remote and local data, with access to all currently available extensions.

Getting started

1. Deploy DuckDB on AWS Lambda

You can leverage DuckDB for your data applications on AWS in multiple ways. If you'd like to expose an API that provides dynamic query capabilities for data in S3 or other remote locations, the serverless-duckdb project could come in handy.

Repartitioning data in S3 data lakes is another common use case that DuckDB supports (even better than AWS Athena). The serverless-parquet-repartitioner project was created for that purpose.

Or just build something completely custom by leveraging the pre-built Lambda layers.

2. Add custom extension source

DuckDB doesn't currently build its extensions for older Linux variants such as Amazon Linux 2, which AWS Lambda uses, so trying to install or load the official builds fails with GLIBC incompatibilities. The extensions therefore had to be built separately. Luckily, DuckDB lets you specify a different extension repository URL as the source:

SET custom_extension_repository = 'http://extensions.quacking.cloud';

3. Install & Load extensions

Once you have set the custom extension repository, you can dynamically install and load the custom DuckDB extensions for different use cases. For an overview, have a look at the table of usable extensions below.
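Steps 2 and 3 can be combined into one small helper. The following is a minimal Python sketch, not part of this project: the function names `extension_setup_statements` and `setup_extensions` are hypothetical, and the code only assumes that the connection object exposes an `execute(sql)` method, as connections from the `duckdb` Python package do.

```python
# Repository URL taken from step 2 above.
DEFAULT_REPOSITORY = "http://extensions.quacking.cloud"


def extension_setup_statements(extensions, repository=DEFAULT_REPOSITORY):
    """Build the SQL statements that set the repository, then install and load each extension.

    Hypothetical helper for illustration only.
    """
    statements = [f"SET custom_extension_repository = '{repository}';"]
    for name in extensions:
        # Extension names are plain identifiers such as "httpfs" or "mysql_scanner".
        if not name.replace("_", "").isalnum():
            raise ValueError(f"Unexpected extension name: {name!r}")
        statements.append(f"INSTALL {name};")
        statements.append(f"LOAD {name};")
    return statements


def setup_extensions(connection, extensions, repository=DEFAULT_REPOSITORY):
    """Run the setup statements on any connection object that has an execute(sql) method."""
    for statement in extension_setup_statements(extensions, repository):
        connection.execute(statement)
```

With the `duckdb` package available in the Lambda function, a call such as `setup_extensions(duckdb.connect(), ["httpfs", "parquet"])` would run the statements in order. Installing at runtime downloads the extension, so the function needs outbound network access for this to work.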

Available DuckDB extensions

| Extension Name | Description | Install | Load | Repository |
| --- | --- | --- | --- | --- |
| arrow | Use Apache Arrow within DuckDB | INSTALL arrow; | LOAD arrow; | Link |
| aws | Use AWS credentials from the credential chain | INSTALL aws; | LOAD aws; | Link |
| azure | Use Azure credentials | INSTALL azure; | LOAD azure; | Link |
| fts | Use Full Text Search within DuckDB | INSTALL fts; | LOAD fts; | Link |
| httpfs | Use HTTP file system within DuckDB | INSTALL httpfs; | LOAD httpfs; | Link |
| iceberg | Use Apache Iceberg tables with DuckDB | INSTALL iceberg; | LOAD iceberg; | Link |
| json | Use JSON data with DuckDB | INSTALL json; | LOAD json; | Link |
| mysql_scanner | Access MySQL databases from DuckDB | INSTALL mysql_scanner; | LOAD mysql_scanner; | Link |
| parquet | Access Parquet files from DuckDB | INSTALL parquet; | LOAD parquet; | Link |
| postgres_scanner | Access Postgres databases from DuckDB | INSTALL postgres_scanner; | LOAD postgres_scanner; | Link |
| spatial | Use spatial features from GDAL and GEOS | INSTALL spatial; | LOAD spatial; | Link |
| sqlite_scanner | Access SQLite databases from DuckDB | INSTALL sqlite_scanner; | LOAD sqlite_scanner; | Link |
| sqlsmith | Generate random SQL queries for testing DuckDB | INSTALL sqlsmith; | LOAD sqlsmith; | Link |
| substrait | Use and generate Substrait query plans | INSTALL substrait; | LOAD substrait; | Link |
| vss | Use Vector Similarity Search with DuckDB | INSTALL vss; | LOAD vss; | Link |