Metadata-Version: 2.3
Name: duckbricks-utils
Version: 0.1.2
Summary: DuckLake connection utilities for DuckBricks notebooks and pipelines
Author: DuckBricks Team
Requires-Python: >=3.11,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: duckdb (>=1.3.0)
Requires-Dist: python-dotenv (>=1.0)
Description-Content-Type: text/markdown

# duckbricks-utils

DuckLake connection utilities for DuckBricks notebooks and pipelines.

Provides a consistent interface for connecting to a [DuckLake](https://ducklake.select/) catalog backed by PostgreSQL from any environment: a local IDE, Marimo notebooks, or job executors. Supported storage backends are the local filesystem, Amazon S3, MinIO, Cloudflare R2, Google Cloud Storage, and Azure Blob Storage.

## Installation

```bash
pip install duckbricks-utils
```

## Quick start

```python
from duckbricks_utils import connect

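# connect() returns a DuckDB connection with the DuckLake catalog attached
conn = connect()
# my_table is a placeholder; .df() requires pandas to be installed
result = conn.execute("SELECT * FROM my_table LIMIT 10").df()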
```

## Configuration

All settings are read from environment variables (a `.env` file is supported via `python-dotenv`).

### PostgreSQL catalog

| Variable | Default | Description |
|---|---|---|
| `DUCKLAKE_PG_HOST` | `localhost` | PostgreSQL host |
| `DUCKLAKE_PG_PORT` | `5432` | PostgreSQL port |
| `DUCKLAKE_PG_DATABASE` | `duckbricks` | PostgreSQL database name |
| `DUCKLAKE_PG_USER` | `duckbricks` | PostgreSQL user |
| `DUCKLAKE_PG_PASSWORD` | `duckbricks` | PostgreSQL password |
| `DUCKBRICKS_DUCKLAKE_NAME` | `duckbricks` | DuckLake catalog name |
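
To make the defaults above concrete, here is a minimal sketch of how such settings could be resolved from the environment. The helpers `pg_settings` and `pg_conninfo` are hypothetical and are not part of `duckbricks-utils`; the package's own resolution logic may differ.

```python
import os

# Documented defaults from the table above.
PG_DEFAULTS = {
    "DUCKLAKE_PG_HOST": "localhost",
    "DUCKLAKE_PG_PORT": "5432",
    "DUCKLAKE_PG_DATABASE": "duckbricks",
    "DUCKLAKE_PG_USER": "duckbricks",
    "DUCKLAKE_PG_PASSWORD": "duckbricks",
    "DUCKBRICKS_DUCKLAKE_NAME": "duckbricks",
}


def pg_settings(env=None):
    """Hypothetical helper: read each variable, falling back to its documented default."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in PG_DEFAULTS.items()}


def pg_conninfo(settings):
    """Hypothetical helper: build a libpq-style key=value string (password left out)."""
    return (
        f"host={settings['DUCKLAKE_PG_HOST']} "
        f"port={settings['DUCKLAKE_PG_PORT']} "
        f"dbname={settings['DUCKLAKE_PG_DATABASE']} "
        f"user={settings['DUCKLAKE_PG_USER']}"
    )


if __name__ == "__main__":
    # Print the effective, non-secret settings for the current environment.
    print(pg_conninfo(pg_settings()))
```

Running the sketch before opening a connection is a quick way to check which host and database the current environment points at. The default password is intended for local development and should be overridden elsewhere.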

### Storage backend

Set `DUCKBRICKS_STORAGE_BACKEND` to one of the supported values (default: `local`).

| Backend | `DUCKBRICKS_STORAGE_BACKEND` |
|---|---|
| Local filesystem | `local` |
| Amazon S3 | `s3` |
| MinIO | `minio` |
| Cloudflare R2 | `r2` |
| Google Cloud Storage | `gcs` |
| Azure Blob Storage | `azure` |

Each backend reads its own credentials from environment variables (e.g. `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` for S3).
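
The selection rule above can be sketched as a small validation step. This is an illustration only: `resolve_backend_name` is a hypothetical helper, and the case-insensitive matching is an assumption of this sketch, not documented behaviour of `StorageBackendFactory`.

```python
import os

# Backend names listed in the table above.
SUPPORTED_BACKENDS = ("local", "s3", "minio", "r2", "gcs", "azure")


def resolve_backend_name(env=None):
    """Hypothetical helper: return the configured backend name, defaulting to 'local'."""
    env = os.environ if env is None else env
    # Assumption of this sketch: surrounding whitespace and letter case are ignored.
    name = env.get("DUCKBRICKS_STORAGE_BACKEND", "local").strip().lower()
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"Unsupported storage backend {name!r}; "
            f"expected one of: {', '.join(SUPPORTED_BACKENDS)}"
        )
    return name
```

Failing early with the list of accepted values makes a mistyped backend name easier to diagnose than a later credentials or path error.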

### Local backend

| Variable | Default | Description |
|---|---|---|
| `DUCKBRICKS_DATA_PATH` | `/data/parquet/` | Parquet storage path |

## API reference

```python
from duckbricks_utils import connect, catalog_name, data_path
from duckbricks_utils import StorageBackend, StorageBackendFactory

# Open a DuckDB connection with the DuckLake catalog attached
conn = connect()

# Open a connection with a custom data path override
conn = connect(override_data_path="/tmp/my_data/")

# Get the configured catalog name
name = catalog_name()

# Get the active backend's data path
path = data_path()

# Resolve the backend from environment
backend = StorageBackendFactory.from_env()

# List all supported backend names
backends = StorageBackendFactory.supported_backends()
```

## Requirements

- Python ≥ 3.11, < 4.0
- DuckDB ≥ 1.3.0
- python-dotenv ≥ 1.0

