Metadata-Version: 2.4
Name: exadata-validator
Version: 0.0.1
Summary: Validate data-model folders using DuckDB.
Author: Exaflow Team
Requires-Python: >=3.10,<3.11
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Requires-Dist: click (>=8.1,<8.2)
Requires-Dist: duckdb (>=1.1,<1.2)
Description-Content-Type: text/markdown

# exadata-validator

`exadata-validator` validates data-model folders using DuckDB.

## Install With pip

```bash
python -m venv .venv
source .venv/bin/activate
pip install exadata-validator
```

Then validate a data-model folder:

```bash
exadata-validator validate-data-model /path/to/data_model_folder
```
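When the validator has to be driven from a script, the same command can be launched with Python's standard `subprocess` module. The following is a minimal sketch, not part of the package: `build_command` and `run_validator` are hypothetical helper names, and because this README does not describe the CLI's exit codes or output format, the sketch returns the raw result instead of interpreting it.

```python
import subprocess


def build_command(folder):
    """Build the argv list for the documented validate-data-model command."""
    return ["exadata-validator", "validate-data-model", str(folder)]


def run_validator(folder):
    """Run the CLI and return the CompletedProcess for the caller to inspect.

    check=False leaves the handling of a non-zero exit code to the caller,
    since the meaning of the exit code is an assumption not documented here.
    """
    return subprocess.run(
        build_command(folder),
        capture_output=True,
        text=True,
        check=False,
    )
```

Calling `run_validator("/path/to/data_model_folder")` requires `exadata-validator` to be installed and on `PATH`, for example inside the activated virtual environment.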

## Develop With Poetry

From the repository root:

```bash
cd data-validator/exaflow-data-validator
poetry install
```

Then run the same CLI through Poetry:

```bash
poetry run exadata-validator validate-data-model /path/to/data_model_folder
```

Run `exadata-validator validate-data-model --help` to list the reporting, output, and threading options.

## Folder Layout

A data-model folder holds a `CDEsMetadata.json` file alongside its dataset CSV files:

```text
/path/to/data_model_folder/
  CDEsMetadata.json
  dataset1.csv
  dataset2.csv
```
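A quick pre-flight check can confirm that a folder matches this layout before the validator runs. The sketch below uses only the standard library; `check_layout` is a hypothetical helper, and the rule that at least one `.csv` file must be present is an assumption, since the README shows the layout but does not state the validator's exact requirements.

```python
from pathlib import Path

METADATA_FILE = "CDEsMetadata.json"


def check_layout(folder):
    """Return a list of layout problems; an empty list means none were found."""
    root = Path(folder)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    if not (root / METADATA_FILE).is_file():
        problems.append(f"missing {METADATA_FILE}")
    # Assumption: a data-model folder needs at least one dataset CSV.
    if not list(root.glob("*.csv")):
        problems.append("no .csv dataset files found")
    return problems
```

This only inspects file names. It does not parse the metadata or the CSV contents, which remains the validator's job.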

## Validation Notes

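- CSV validation queries the files directly with DuckDB and fuses aggregate checks into combined queries to reduce scan overhead.
- Dataset uniqueness is enforced at the folder level, across all CSV files, in SQL. Codes are compared after normalization (`trim + lower`), so values that differ only in case or surrounding whitespace are treated as the same code.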
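The two notes above can be illustrated with a small, self-contained sketch. It is not the package's implementation. To stay runnable without extra dependencies it uses Python's built-in `sqlite3` as a stand-in for DuckDB, and the table, the column names (`source_file`, `dataset`, `age`), the sample rows, and the 0 to 120 age range are all assumptions made for illustration. It also assumes that "unique" means a normalized dataset code may appear in only one CSV file.

```python
import sqlite3

# Hypothetical rows, as if read from two CSV files: (source_file, dataset, age).
ROWS = [
    ("dataset1.csv", "CohortA", 71),
    ("dataset1.csv", "CohortA", None),
    ("dataset1.csv", "cohorta ", 64),
    ("dataset2.csv", " COHORTA", 58),
    ("dataset2.csv", "CohortB", 140),
]


def load(rows):
    """Load the sample rows into an in-memory table (stand-in for CSV scans)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE records (source_file TEXT, dataset TEXT, age INTEGER)")
    con.executemany("INSERT INTO records VALUES (?, ?, ?)", rows)
    return con


def fused_checks(con):
    """Compute several checks in one aggregate query, so the data is scanned once."""
    total, null_ages, out_of_range = con.execute(
        """
        SELECT
            COUNT(*),
            SUM(CASE WHEN age IS NULL THEN 1 ELSE 0 END),
            SUM(CASE WHEN age < 0 OR age > 120 THEN 1 ELSE 0 END)
        FROM records
        """
    ).fetchone()
    return {"total_rows": total, "null_ages": null_ages, "out_of_range_ages": out_of_range}


def duplicated_dataset_codes(con):
    """Return normalized dataset codes that appear in more than one file."""
    rows = con.execute(
        """
        SELECT lower(trim(dataset)), COUNT(DISTINCT source_file)
        FROM records
        GROUP BY lower(trim(dataset))
        HAVING COUNT(DISTINCT source_file) > 1
        ORDER BY 1
        """
    ).fetchall()
    return [code for code, _ in rows]
```

Running three separate `SELECT` statements would give the same counts as `fused_checks` but would scan the data three times. Folding the checks into `SUM(CASE ...)` columns of a single query is one common way to get that saving.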

