Metadata-Version: 2.4
Name: forja-lakehouse
Version: 0.1.0
Summary: forja plugin: Lakehouse architecture with DuckDB, Delta Lake, and Polars
Author-email: Juan <juanch.21@gmail.com>
Requires-Python: >=3.10
Requires-Dist: forja>=1.0.1
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.7; extra == 'dev'
Provides-Extra: lakehouse
Requires-Dist: deltalake>=0.17.0; extra == 'lakehouse'
Requires-Dist: duckdb>=0.10.0; extra == 'lakehouse'
Requires-Dist: fsspec>=2024.2.0; extra == 'lakehouse'
Requires-Dist: pandera[polars]>=0.18.0; extra == 'lakehouse'
Requires-Dist: polars>=0.20.0; extra == 'lakehouse'
Requires-Dist: pyarrow>=15.0.0; extra == 'lakehouse'
Requires-Dist: s3fs>=2024.2.0; extra == 'lakehouse'
Description-Content-Type: text/markdown

# forja-lakehouse

A [forja](https://pypi.org/project/forja/) plugin that adds the **medallion Lakehouse** architecture (bronze / silver / gold).

```bash
pip install forja forja-lakehouse
dfg init mi_proyecto --arch lakehouse
```

## Stack

- **DuckDB**: local analytical engine; reads Delta Lake and Parquet natively
- **Polars**: fast transformations with lazy evaluation
- **Delta Lake**: ACID table format for the silver layer
- **PyArrow**: Parquet serialization for the bronze layer
- **fsspec**: storage abstraction (local, S3, GCS, Cloudflare R2)

## Generated architecture

```
src/<project>/
├── bronze/     # raw ingestion → Parquet
├── silver/     # cleaning with Polars → Delta Lake
├── gold/       # aggregations with DuckDB → serving
├── pipelines/
│   ├── ingest.py     # dfg run ingest
│   ├── transform.py  # dfg run transform
│   └── serve.py      # dfg run serve
└── config/
    └── storage.py    # STORAGE_ROOT (local or s3://)
```
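
To illustrate how a single `STORAGE_ROOT` can serve both local and `s3://` storage, here is a minimal sketch of what `config/storage.py` might contain. It is not the generated file itself: the `STORAGE_ROOT` environment variable, the `./data` default, and the helpers `layer_path` and `is_remote` are assumptions made for this example.

```python
import os
import posixpath

# Assumption: the root is read from an environment variable of the same name,
# falling back to a local directory. The generated file may do this differently.
STORAGE_ROOT = os.environ.get("STORAGE_ROOT", "./data")

# The three medallion layers described above.
LAYERS = ("bronze", "silver", "gold")


def layer_path(layer: str, name: str, root: str = STORAGE_ROOT) -> str:
    """Build the location of a dataset inside a medallion layer (hypothetical helper)."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    # posixpath.join always uses forward slashes, so the same code handles
    # local paths and object-store URLs such as s3://bucket/prefix.
    return posixpath.join(root, layer, name)


def is_remote(root: str = STORAGE_ROOT) -> bool:
    """Return True when the root looks like a URL (s3://, gs://, ...) rather than a local path."""
    return "://" in root


if __name__ == "__main__":
    print(layer_path("bronze", "orders.parquet", root="s3://my-bucket/lake"))
    print(layer_path("silver", "orders", root="./data"))
```

Because the helpers only build strings, the resulting paths can be handed to whichever library opens the data (for example, a filesystem obtained through fsspec) without the pipeline code needing to know where the data lives.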

## License

MIT
