Metadata-Version: 2.4
Name: ratebook-data
Version: 0.1.0
Summary: Ratebook data plant: source ingestion (URDB, EIA), DuckDB warehouse, dataset releases.
Project-URL: Homepage, https://github.com/cbetz/ratebook
Project-URL: Repository, https://github.com/cbetz/ratebook
Project-URL: Issues, https://github.com/cbetz/ratebook/issues
Author: Chris Betz
License-Expression: Apache-2.0
License-File: LICENSE
License-File: LICENSE-DATA
Keywords: dataset,duckdb,eia,electricity,energy,tariff,urdb,utility
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Database
Requires-Python: >=3.12
Requires-Dist: duckdb>=1.0
Requires-Dist: httpx>=0.27
Requires-Dist: pypdf>=4.0
Requires-Dist: ratebook
Provides-Extra: extract
Requires-Dist: anthropic>=0.40; extra == 'extract'
Description-Content-Type: text/markdown

# ratebook-data

The Ratebook data plant: source ingestion, DuckDB warehouse, and (eventually) versioned
Parquet/JSON dataset releases with full per-record provenance.

```sh
uv run ratebook-data urdb   # download usurdb.csv.gz → data/raw/, load into data/ratebook.duckdb
```

Every download is kept verbatim under `data/raw/` alongside a `.meta.json` provenance sidecar
recording the source URL, sha256, ETag, and Last-Modified. Every column in the DuckDB `raw`
schema is VARCHAR; typing happens in transformations, never at ingest.
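
As a rough illustration of the sidecar idea, the sketch below hashes an already-downloaded file and writes a JSON file next to it. It is not the package's actual code: the function name `write_sidecar`, the JSON key names, and the `<filename>.meta.json` naming are all assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def write_sidecar(
    raw_path: Path,
    source_url: str,
    etag: Optional[str],
    last_modified: Optional[str],
) -> Path:
    """Write a provenance sidecar next to a downloaded file (hypothetical helper)."""
    # Hash in 1 MiB chunks so large archives are never held in memory at once.
    digest = hashlib.sha256()
    with raw_path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)

    # Key names are assumed; the real sidecar layout may differ.
    meta = {
        "source_url": source_url,
        "sha256": digest.hexdigest(),
        "etag": etag,
        "last_modified": last_modified,
    }

    # Assumed naming: usurdb.csv.gz -> usurdb.csv.gz.meta.json
    sidecar = raw_path.with_name(raw_path.name + ".meta.json")
    sidecar.write_text(json.dumps(meta, indent=2) + "\n", encoding="utf-8")
    return sidecar
```

The helper only reads the raw file, so the download itself stays byte-for-byte unchanged. The ETag and Last-Modified values would come from the HTTP response headers and may be absent, which is why both are optional here.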
