Metadata-Version: 2.4
Name: dqtlib
Version: 0.9.3
Summary: Data quality, observability, semantic, and causality library
Project-URL: Homepage, https://github.com/antonbarr-data/dqt
Project-URL: Repository, https://github.com/antonbarr-data/dqt
Author-email: Anton Barr <antonbar@gmail.com>
License: MIT
Keywords: anomaly-detection,causal-inference,causality,data-drift,data-quality,observability
Requires-Python: >=3.12
Requires-Dist: diptest>=0.6
Requires-Dist: duckdb>=0.9
Requires-Dist: ibis-framework>=9.0
Requires-Dist: jsonschema>=4.22
Requires-Dist: numpy>=1.26
Requires-Dist: pandas>=2.2
Requires-Dist: pydantic>=2.7
Requires-Dist: pyod>=1.1
Requires-Dist: pyyaml>=6.0
Requires-Dist: river>=0.21
Requires-Dist: scikit-learn>=1.5
Requires-Dist: scipy>=1.13
Requires-Dist: statsmodels>=0.14
Requires-Dist: structlog>=24.0
Provides-Extra: bigquery
Requires-Dist: db-dtypes>=1.2; extra == 'bigquery'
Requires-Dist: google-cloud-bigquery>=3.19; extra == 'bigquery'
Provides-Extra: causal
Requires-Dist: causal-learn>=0.1; extra == 'causal'
Requires-Dist: dowhy>=0.11; extra == 'causal'
Requires-Dist: tigramite>=0.7; extra == 'causal'
Provides-Extra: clickhouse
Requires-Dist: clickhouse-connect>=0.7; extra == 'clickhouse'
Provides-Extra: dashboard
Requires-Dist: fastapi>=0.111; extra == 'dashboard'
Requires-Dist: jinja2>=3.1; extra == 'dashboard'
Requires-Dist: python-multipart>=0.0.9; extra == 'dashboard'
Requires-Dist: uvicorn[standard]>=0.29; extra == 'dashboard'
Provides-Extra: databricks
Requires-Dist: databricks-sql-connector>=3.1; extra == 'databricks'
Provides-Extra: deep
Requires-Dist: pyod[deep]>=1.1; extra == 'deep'
Requires-Dist: torch>=2.3; extra == 'deep'
Provides-Extra: explain
Requires-Dist: pgmpy>=0.1; extra == 'explain'
Requires-Dist: shap>=0.45; extra == 'explain'
Provides-Extra: files
Requires-Dist: openpyxl>=3.0; extra == 'files'
Requires-Dist: pyarrow>=14.0; extra == 'files'
Provides-Extra: forecast
Requires-Dist: prophet>=1.1; extra == 'forecast'
Requires-Dist: stumpy>=1.4; extra == 'forecast'
Provides-Extra: lineage
Requires-Dist: sqlglot>=23.0; extra == 'lineage'
Provides-Extra: postgres
Requires-Dist: ibis-framework[postgres]>=9.0; extra == 'postgres'
Requires-Dist: psycopg2-binary>=2.9; extra == 'postgres'
Provides-Extra: reports
Requires-Dist: matplotlib>=3.8; extra == 'reports'
Provides-Extra: snowflake
Requires-Dist: snowflake-connector-python>=3.9; extra == 'snowflake'
Provides-Extra: wiki
Requires-Dist: anthropic>=0.26; extra == 'wiki'
Description-Content-Type: text/markdown

# dqtlib

**Open-source data quality, lineage, semantic layer & causality — for dbt, warehouses and data lakes.**

A pip-installable Python library that watches dbt-built warehouses, and any other SQL warehouse, for statistical drift, anomalies, and silent regressions, and explains *why* metrics moved.

```bash
pip install dqtlib
```

The import name is `dqt`:

```python
from dqt import Check, Runner, MemoryStore
```

Full documentation and examples: https://github.com/antonbarr-data/dqt

## Detector documentation

64 statistical detectors across 10 groups — drift, outliers, time series, distribution, information theory, pattern, referential, schema, basic, and custom.

Every detector has a structured page at [`docs/algorithms/<group>/<slug>.md`](docs/algorithms/README.md) covering:

- What it computes and its parameters
- When it works well and when it fails (with a concrete failure-mode table)
- Default-threshold calibration: the empirical false-positive rate (FPR) across six canonical data shapes (Normal, Lognormal, Poisson, Beta, Pareto, Exponential)
- Recommended thresholds per data shape
- A canonical citation and a runnable Python API example
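
To illustrate what an empirical FPR calibration measures, the sketch below draws anomaly-free data from each of the six shapes, fits a plain z-score rule on a reference sample, and counts how often clean test points are flagged. This is only an assumption-laden illustration: the z-score rule is a stand-in rather than one of the dqt detectors, it does not use the dqt API, and the distribution parameters, sample sizes, and the name `empirical_fpr` are all chosen here for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Samplers for the six data shapes; all parameters are illustrative assumptions.
SHAPES = {
    "Normal": lambda n: rng.normal(0.0, 1.0, n),
    "Lognormal": lambda n: rng.lognormal(0.0, 1.0, n),
    "Poisson": lambda n: rng.poisson(5.0, n).astype(float),
    "Beta": lambda n: rng.beta(2.0, 5.0, n),
    "Pareto": lambda n: rng.pareto(3.0, n),
    "Exponential": lambda n: rng.exponential(1.0, n),
}


def empirical_fpr(sample, threshold=3.0, n_ref=1_000, n_test=1_000, trials=200):
    """Fraction of clean test points flagged by a z-score rule.

    The rule is fitted on a clean reference sample and applied to a clean
    test sample from the same distribution, so every flag is a false positive.
    """
    flagged = 0
    for _ in range(trials):
        ref = sample(n_ref)
        test = sample(n_test)
        z = np.abs(test - ref.mean()) / ref.std(ddof=1)
        flagged += int((z > threshold).sum())
    return flagged / (trials * n_test)


for name, sample in SHAPES.items():
    print(f"{name:<12} FPR at |z| > 3: {empirical_fpr(sample):.4f}")
```

A threshold tuned for roughly Gaussian data tends to fire more often on heavy-tailed shapes such as Lognormal or Pareto, which is why a per-shape threshold recommendation is useful.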

Browse the full catalog: [docs/algorithms/README.md](docs/algorithms/README.md)
