Metadata-Version: 2.4
Name: turbine-data
Version: 0.2.0
Summary: A CLI tool for working with data products and contracts
Author-email: "Chibrani - Derks, Yassin" <yassin.chibrani-derks@enexis.nl>
Classifier: Development Status :: 4 - Beta
Requires-Python: <3.15,>=3.13
Requires-Dist: alembic>=1.18.4
Requires-Dist: cyclopts>=4.8.0
Requires-Dist: fastapi-pagination>=0.15.12
Requires-Dist: fastapi>=0.135.1
Requires-Dist: httpx>=0.28.1
Requires-Dist: jinja2>=3.1.6
Requires-Dist: jsonschema-rs>=0.46.0
Requires-Dist: logfire>=4.29.0
Requires-Dist: loguru>=0.7.3
Requires-Dist: networkx>=3.6.1
Requires-Dist: numpy>=2.4.3
Requires-Dist: open-data-contract-standard>=3.1.2
Requires-Dist: opentelemetry-instrumentation-psycopg>=0.60b1
Requires-Dist: opentelemetry-instrumentation-sqlalchemy>=0.60b1
Requires-Dist: prompt-toolkit>=3.0.52
Requires-Dist: pydantic-settings>=2.13.1
Requires-Dist: python-dotenv>=1.2.2
Requires-Dist: rich>=14.3.3
Requires-Dist: ruamel-yaml<0.18.0,>=0.17.0
Requires-Dist: scipy>=1.17.1
Requires-Dist: sqlakeyset>=2.0.1775222100
Requires-Dist: sqlalchemy>=2.0.48
Requires-Dist: sqlglot>=29.0.1
Requires-Dist: sqlmodel>=0.0.37
Requires-Dist: tomli-w>=1.2.0
Requires-Dist: tree-sitter-yaml>=0.7.2
Requires-Dist: tree-sitter>=0.25.2
Requires-Dist: uvicorn>=0.43.0
Provides-Extra: all
Requires-Dist: connectorx>=0.4.5; extra == 'all'
Requires-Dist: duckdb-engine>=0.13; extra == 'all'
Requires-Dist: duckdb>=0.9; extra == 'all'
Requires-Dist: fastapi-pagination>=0.15.12; extra == 'all'
Requires-Dist: fastapi>=0.135.1; extra == 'all'
Requires-Dist: httpx>=0.28.1; extra == 'all'
Requires-Dist: logfire>=4.29.0; extra == 'all'
Requires-Dist: lsprotocol>=2025.0.0; extra == 'all'
Requires-Dist: openpyxl>=3.1.5; extra == 'all'
Requires-Dist: pandas>=2.0; extra == 'all'
Requires-Dist: plotly>=6.6.0; extra == 'all'
Requires-Dist: polars>=1.0; extra == 'all'
Requires-Dist: psycopg[binary]>=3.1; extra == 'all'
Requires-Dist: pyarrow>=15.0; extra == 'all'
Requires-Dist: pygls>=2.0.1; extra == 'all'
Requires-Dist: snowflake-connector-python>=3.0; extra == 'all'
Requires-Dist: snowflake-sqlalchemy>=1.7; extra == 'all'
Requires-Dist: soda-duckdb>=4.2.0; extra == 'all'
Requires-Dist: soda-postgres>=4.2.0; extra == 'all'
Requires-Dist: soda-snowflake>=4.2.0; extra == 'all'
Requires-Dist: sqlakeyset>=2.0.1775222100; extra == 'all'
Requires-Dist: sqlmodel>=0.0.37; extra == 'all'
Requires-Dist: streamlit-antd-components>=0.3.2; extra == 'all'
Requires-Dist: streamlit-echarts>=0.6.0; extra == 'all'
Requires-Dist: streamlit-extras>=1.3.0; extra == 'all'
Requires-Dist: streamlit>=1.56.0; extra == 'all'
Requires-Dist: uvicorn>=0.43.0; extra == 'all'
Provides-Extra: api
Requires-Dist: fastapi-pagination>=0.15.12; extra == 'api'
Requires-Dist: fastapi>=0.135.1; extra == 'api'
Requires-Dist: httpx>=0.28.1; extra == 'api'
Requires-Dist: sqlakeyset>=2.0.1775222100; extra == 'api'
Requires-Dist: sqlmodel>=0.0.37; extra == 'api'
Requires-Dist: uvicorn>=0.43.0; extra == 'api'
Provides-Extra: dashboard
Requires-Dist: httpx>=0.28.1; extra == 'dashboard'
Requires-Dist: openpyxl>=3.1.5; extra == 'dashboard'
Requires-Dist: plotly>=6.6.0; extra == 'dashboard'
Requires-Dist: streamlit-antd-components>=0.3.2; extra == 'dashboard'
Requires-Dist: streamlit-echarts>=0.6.0; extra == 'dashboard'
Requires-Dist: streamlit-extras>=1.3.0; extra == 'dashboard'
Requires-Dist: streamlit>=1.56.0; extra == 'dashboard'
Provides-Extra: duckdb
Requires-Dist: connectorx>=0.4.5; extra == 'duckdb'
Requires-Dist: duckdb-engine>=0.13; extra == 'duckdb'
Requires-Dist: duckdb>=0.9; extra == 'duckdb'
Requires-Dist: fastapi-pagination>=0.15.12; extra == 'duckdb'
Requires-Dist: fastapi>=0.135.1; extra == 'duckdb'
Requires-Dist: httpx>=0.28.1; extra == 'duckdb'
Requires-Dist: logfire>=4.29.0; extra == 'duckdb'
Requires-Dist: lsprotocol>=2025.0.0; extra == 'duckdb'
Requires-Dist: pandas>=2.0; extra == 'duckdb'
Requires-Dist: polars>=1.0; extra == 'duckdb'
Requires-Dist: pyarrow>=15.0; extra == 'duckdb'
Requires-Dist: pygls>=2.0.1; extra == 'duckdb'
Requires-Dist: soda-duckdb>=4.2.0; extra == 'duckdb'
Requires-Dist: sqlakeyset>=2.0.1775222100; extra == 'duckdb'
Requires-Dist: sqlmodel>=0.0.37; extra == 'duckdb'
Requires-Dist: uvicorn>=0.43.0; extra == 'duckdb'
Provides-Extra: duckdb-minimal
Requires-Dist: duckdb-engine>=0.13; extra == 'duckdb-minimal'
Requires-Dist: duckdb>=0.9; extra == 'duckdb-minimal'
Requires-Dist: soda-duckdb>=4.2.0; extra == 'duckdb-minimal'
Provides-Extra: graph
Requires-Dist: networkx>=3.6.1; extra == 'graph'
Requires-Dist: scipy>=1.17.1; extra == 'graph'
Provides-Extra: lsp
Requires-Dist: lsprotocol>=2025.0.0; extra == 'lsp'
Requires-Dist: pygls>=2.0.1; extra == 'lsp'
Provides-Extra: postgres
Requires-Dist: connectorx>=0.4.5; extra == 'postgres'
Requires-Dist: fastapi-pagination>=0.15.12; extra == 'postgres'
Requires-Dist: fastapi>=0.135.1; extra == 'postgres'
Requires-Dist: httpx>=0.28.1; extra == 'postgres'
Requires-Dist: logfire>=4.29.0; extra == 'postgres'
Requires-Dist: lsprotocol>=2025.0.0; extra == 'postgres'
Requires-Dist: pandas>=2.0; extra == 'postgres'
Requires-Dist: polars>=1.0; extra == 'postgres'
Requires-Dist: psycopg[binary]>=3.1; extra == 'postgres'
Requires-Dist: pyarrow>=15.0; extra == 'postgres'
Requires-Dist: pygls>=2.0.1; extra == 'postgres'
Requires-Dist: soda-postgres>=4.2.0; extra == 'postgres'
Requires-Dist: sqlakeyset>=2.0.1775222100; extra == 'postgres'
Requires-Dist: sqlmodel>=0.0.37; extra == 'postgres'
Requires-Dist: uvicorn>=0.43.0; extra == 'postgres'
Provides-Extra: postgres-minimal
Requires-Dist: psycopg[binary]>=3.1; extra == 'postgres-minimal'
Requires-Dist: soda-postgres>=4.2.0; extra == 'postgres-minimal'
Provides-Extra: python-checks
Requires-Dist: connectorx>=0.4.5; extra == 'python-checks'
Requires-Dist: pandas>=2.0; extra == 'python-checks'
Requires-Dist: polars>=1.0; extra == 'python-checks'
Requires-Dist: pyarrow>=15.0; extra == 'python-checks'
Provides-Extra: snowflake
Requires-Dist: connectorx>=0.4.5; extra == 'snowflake'
Requires-Dist: fastapi-pagination>=0.15.12; extra == 'snowflake'
Requires-Dist: fastapi>=0.135.1; extra == 'snowflake'
Requires-Dist: httpx>=0.28.1; extra == 'snowflake'
Requires-Dist: logfire>=4.29.0; extra == 'snowflake'
Requires-Dist: lsprotocol>=2025.0.0; extra == 'snowflake'
Requires-Dist: pandas>=2.0; extra == 'snowflake'
Requires-Dist: polars>=1.0; extra == 'snowflake'
Requires-Dist: pyarrow>=15.0; extra == 'snowflake'
Requires-Dist: pygls>=2.0.1; extra == 'snowflake'
Requires-Dist: snowflake-connector-python>=3.0; extra == 'snowflake'
Requires-Dist: snowflake-sqlalchemy>=1.7; extra == 'snowflake'
Requires-Dist: soda-snowflake>=4.2.0; extra == 'snowflake'
Requires-Dist: sqlakeyset>=2.0.1775222100; extra == 'snowflake'
Requires-Dist: sqlmodel>=0.0.37; extra == 'snowflake'
Requires-Dist: uvicorn>=0.43.0; extra == 'snowflake'
Provides-Extra: snowflake-minimal
Requires-Dist: snowflake-connector-python>=3.0; extra == 'snowflake-minimal'
Requires-Dist: snowflake-sqlalchemy>=1.7; extra == 'snowflake-minimal'
Requires-Dist: soda-snowflake>=4.2.0; extra == 'snowflake-minimal'
Provides-Extra: telemetry
Requires-Dist: logfire>=4.29.0; extra == 'telemetry'
Description-Content-Type: text/markdown

# Turbine

Contract-driven data quality for data products — powered by [ODCS](https://github.com/bitol-io/open-data-contract-standard) and [Soda Core](https://github.com/sodadata/soda-core).

[![pipeline](https://enx.gitlab.schubergphilis.com/1000078-app-keten-shared/applications/turbine/badges/main/pipeline.svg)](https://enx.gitlab.schubergphilis.com/1000078-app-keten-shared/applications/turbine/-/pipelines)
[![coverage](https://enx.gitlab.schubergphilis.com/1000078-app-keten-shared/applications/turbine/badges/main/coverage.svg)](https://enx.gitlab.schubergphilis.com/1000078-app-keten-shared/applications/turbine/-/pipelines)
[![python](https://img.shields.io/badge/python-3.13+-blue.svg)](https://www.python.org/)
[![ODCS](https://img.shields.io/badge/ODCS-v3.1.0-green.svg)](https://github.com/bitol-io/open-data-contract-standard)

![turbine check](docs/assets/turbine-check.png)

## How It Works

```
Contract ➜ Lint ➜ Check ➜ Score ➜ Flag ➜ Observe
```

1. **Contract** — Define expectations in YAML using [ODCS v3.1.0](https://github.com/bitol-io/open-data-contract-standard)
2. **Lint** — Validate contract schema before anything touches a database
3. **Check** — Run quality checks against live data (SodaCL, SQL, Python, window, group)
4. **Score** — Calculate a dimension-aware quality score
5. **Flag** — Tag failing rows with bitmask flags for downstream filtering
6. **Observe** — Export traces and metrics via OpenTelemetry
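
The Flag step can be pictured with a small sketch. This is not Turbine's implementation: the `CheckFlag` names, the `flag_rows` helper, and the sample rows are hypothetical, and the checks are hard-coded rather than read from a contract. It only shows how one bit per check lets downstream code filter rows by which checks they failed.

```python
from enum import IntFlag


class CheckFlag(IntFlag):
    """Hypothetical flags: one bit per check (not Turbine's real names)."""

    MISSING_ID = 1
    DUPLICATE_ID = 2
    STALE = 4


def flag_rows(rows, max_age_days=7):
    """Return one bitmask per row; a mask of 0 means the row passed every check."""
    seen_ids = set()
    masks = []
    for row in rows:
        mask = CheckFlag(0)
        row_id = row.get("id")
        if row_id is None:
            mask |= CheckFlag.MISSING_ID
        elif row_id in seen_ids:
            mask |= CheckFlag.DUPLICATE_ID
        if row.get("age_days", 0) > max_age_days:
            mask |= CheckFlag.STALE
        seen_ids.add(row_id)
        masks.append(mask)
    return masks


# Illustrative data only.
rows = [
    {"id": 1, "age_days": 1},
    {"id": None, "age_days": 2},
    {"id": 1, "age_days": 30},
]
masks = flag_rows(rows)

# Downstream filtering: keep rows that failed nothing, or select by a single bit.
clean_rows = [row for row, mask in zip(rows, masks) if mask == 0]
stale_rows = [row for row, mask in zip(rows, masks) if CheckFlag.STALE in mask]
```

A row can carry several flags at once (the third row is both a duplicate and stale), which is the property that makes a bitmask a convenient encoding for per-row check results.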

## Features

- **YAML contracts** — ODCS v3.1.0 with Soda extensions for quality checks
- **13 check types** — missing, duplicate, invalid, freshness, row_count, SQL, Python, typed multi-table Python, group, and window (zscore, spike, flatline)
- **Schema drift detection** — Compare live database schemas against your contract
- **Dimension-aware scoring** — Weight quality dimensions (completeness, accuracy, …) per check
- **Row-level flagging** — Per-cell Roaring-bitmap matrix tracks which rows failed which checks across runs
- **Code generation** — Scaffold SQLModel models and FastAPI routers from contracts
- **Dependency management** — Lockfile-based contract dependency resolution
- **IDE support** — Language server (LSP) with extensions for VSCode and JetBrains
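
As a rough illustration of dimension-aware scoring, the sketch below computes a weighted average of per-dimension pass rates. The formula, the `quality_score` function, and the weights are assumptions made for this example; Turbine's actual scoring rules are described in the docs and may differ.

```python
from collections import defaultdict


def quality_score(results, weights):
    """Weighted average of per-dimension pass rates.

    results: iterable of (dimension, passed) pairs, one per check.
    weights: mapping of dimension name to a non-negative weight.
    Both shapes are hypothetical and chosen for this sketch.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for dimension, ok in results:
        total[dimension] += 1
        passed[dimension] += int(ok)

    weight_sum = sum(weights.get(d, 0.0) for d in total)
    if weight_sum == 0:
        return 0.0  # no weighted checks were run

    weighted = sum(weights.get(d, 0.0) * passed[d] / total[d] for d in total)
    return weighted / weight_sum


# Illustrative data only: completeness passes 1 of 2 checks, accuracy 1 of 1.
results = [
    ("completeness", True),
    ("completeness", False),
    ("accuracy", True),
]
weights = {"completeness": 1.0, "accuracy": 3.0}
score = quality_score(results, weights)  # (1.0 * 0.5 + 3.0 * 1.0) / 4.0
```

Weighting by dimension rather than by raw check count means a dimension with many checks does not dominate the score simply because it has more checks.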

## Quick Start

> **Prerequisites:** Python 3.13 or 3.14 (the package requires `>=3.13,<3.15`) and [uv](https://docs.astral.sh/uv/)

```bash
# Install with your database driver
uv add "turbine-data[snowflake]"    # or: postgres, duckdb

# Initialize the recommended src/{project_name} layout
uv run turbine init --defaults

# Copy .env.example to .env and fill in your warehouse credentials
cp .env.example .env

# Validate the starter Contract
uv run turbine lint src/{project_name}/contracts/example.yml

# Run quality checks against the starter datasource named "default"
uv run turbine check --datasource default src/{project_name}/contracts/example.yml
```

## Supported Databases

| Database   | Install extra             |
| ---------- | ------------------------- |
| PostgreSQL | `turbine-data[postgres]`  |
| Snowflake  | `turbine-data[snowflake]` |
| DuckDB     | `turbine-data[duckdb]`    |

## Documentation

Full docs live in [`docs/`](docs/) — covering [getting started](docs/getting-started/), [guides](docs/guides/), [concepts](docs/concepts/), and [CLI reference](docs/reference/cli.md).

## Contributing

See the [contributing guide](docs/contributing/) for dev setup, testing, and code style.
