Run DuckDB serverlessly to query remote and local data, with access to all currently available extensions.
There are multiple ways to leverage DuckDB for your data applications on AWS. If you'd like to expose an API that provides dynamic query capabilities for data in S3 or other remote locations, the serverless-duckdb project could come in handy.
Repartitioning data in S3 data lakes is another common use case that DuckDB supports (even better than AWS Athena), which is why serverless-parquet-repartitioner was created.
Or just build something completely custom by leveraging the prebuilt Lambda layers.
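As a rough illustration of the repartitioning use case, the sketch below composes a DuckDB `COPY ... PARTITION_BY` statement that rewrites Parquet files into a partitioned layout. The helper name `build_repartition_sql`, the bucket paths, and the column names are hypothetical, and this is not necessarily how serverless-parquet-repartitioner works internally. It only builds the SQL string; running it would need a DuckDB connection with S3 access configured.

```python
def build_repartition_sql(source_glob, target_prefix, partition_columns):
    """Compose a DuckDB COPY statement that rewrites Parquet data
    partitioned by the given columns (hypothetical helper)."""
    if not partition_columns:
        raise ValueError("at least one partition column is required")
    for name in partition_columns:
        # Only allow plain identifiers, since they are interpolated into SQL.
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unsafe column name: {name!r}")
    columns = ", ".join(partition_columns)
    # Escape single quotes in the path literals.
    src = source_glob.replace("'", "''")
    dst = target_prefix.replace("'", "''")
    return (
        f"COPY (SELECT * FROM read_parquet('{src}')) "
        f"TO '{dst}' (FORMAT PARQUET, PARTITION_BY ({columns}));"
    )


# Example with made-up bucket names and columns.
sql = build_repartition_sql(
    "s3://example-bucket/raw/*.parquet",
    "s3://example-bucket/partitioned",
    ["year", "month"],
)
print(sql)
```

The resulting statement can be passed to whichever DuckDB client you use inside the Lambda function.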
DuckDB doesn't currently build its extensions for older Linux variants such as Amazon Linux 2, which AWS Lambda uses, so trying to install or load the official builds results in GLIBC incompatibilities. It was therefore necessary to build them separately. Luckily, DuckDB offers a way to specify another extension repository URL as the source:
SET custom_extension_repository = 'http://extensions.quacking.cloud';
Once you have set the custom extension repository, you can dynamically install and load the custom DuckDB extensions for different use cases. For an overview, have a look at the table of usable extensions below.
Extension Name | Description | Install | Load | Repository |
---|---|---|---|---|
arrow | Use Apache Arrow within DuckDB | `INSTALL arrow;` | `LOAD arrow;` | Link |
aws | Use AWS credentials from the credential chain | `INSTALL aws;` | `LOAD aws;` | Link |
azure | Use Azure credentials | `INSTALL azure;` | `LOAD azure;` | Link |
fts | Use Full Text Search within DuckDB | `INSTALL fts;` | `LOAD fts;` | Link |
httpfs | Use HTTP file system within DuckDB | `INSTALL httpfs;` | `LOAD httpfs;` | Link |
iceberg | Use Apache Iceberg tables with DuckDB | `INSTALL iceberg;` | `LOAD iceberg;` | Link |
json | Use JSON data with DuckDB | `INSTALL json;` | `LOAD json;` | Link |
mysql_scanner | Access MySQL databases from DuckDB | `INSTALL mysql_scanner;` | `LOAD mysql_scanner;` | Link |
parquet | Access Parquet files from DuckDB | `INSTALL parquet;` | `LOAD parquet;` | Link |
postgres_scanner | Access Postgres databases from DuckDB | `INSTALL postgres_scanner;` | `LOAD postgres_scanner;` | Link |
spatial | Use spatial features from GDAL and GEOS | `INSTALL spatial;` | `LOAD spatial;` | Link |
sqlite_scanner | Access SQLite databases from DuckDB | `INSTALL sqlite_scanner;` | `LOAD sqlite_scanner;` | Link |
sqlsmith | Generate random SQL queries for testing DuckDB | `INSTALL sqlsmith;` | `LOAD sqlsmith;` | Link |
substrait | Use and generate Substrait query plans | `INSTALL substrait;` | `LOAD substrait;` | Link |
vss | Use Vector Similarity Search with DuckDB | `INSTALL vss;` | `LOAD vss;` | Link |
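
Putting the pieces together, the following minimal sketch generates the statements a Lambda function might run on startup: first the `SET custom_extension_repository` statement from above, then an `INSTALL` and `LOAD` pair for each extension needed. The helper name `bootstrap_statements` is hypothetical, and the choice of `httpfs` and `parquet` is only an example. The code builds the SQL strings without executing them.

```python
# Repository URL as given in the text above.
CUSTOM_REPOSITORY = "http://extensions.quacking.cloud"


def bootstrap_statements(extensions, repository=CUSTOM_REPOSITORY):
    """Return the SQL statements that point DuckDB at the custom
    repository and then install and load each extension (hypothetical helper)."""
    statements = [f"SET custom_extension_repository = '{repository}';"]
    for name in extensions:
        # Extension names are interpolated into SQL, so accept identifiers only.
        if not name.replace("_", "").isalnum():
            raise ValueError(f"unexpected extension name: {name!r}")
        statements.append(f"INSTALL {name};")
        statements.append(f"LOAD {name};")
    return statements


statements = bootstrap_statements(["httpfs", "parquet"])
for statement in statements:
    print(statement)
```

Each statement can then be executed in order on a DuckDB connection. The repository must be set before the first `INSTALL`, which is why it comes first in the list.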