Metadata-Version: 2.4
Name: dagster-soda
Version: 0.29.4
Summary: Package for running Soda Core data quality scans in Dagster.
Project-URL: Homepage, https://github.com/dagster-io/dagster/tree/master/python_modules/libraries/dagster-soda
Author-email: Dagster Labs <hello@dagsterlabs.com>
License-Expression: Apache-2.0
License-File: LICENSE
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: <3.15,>=3.10
Requires-Dist: dagster==1.13.4
Requires-Dist: pyyaml
Requires-Dist: soda-core<4,>=3.0
Provides-Extra: test
Requires-Dist: dagster-dg-cli==1.13.4; extra == 'test'
Description-Content-Type: text/markdown

# dagster-soda

[![PyPI version](https://badge.fury.io/py/dagster-soda.svg)](https://badge.fury.io/py/dagster-soda)

**dagster-soda** integrates [Soda Core](https://docs.soda.io/soda-core/) data quality checks with Dagster. It provides the **SodaScanComponent**, a Dagster component that runs Soda Core scans and maps SodaCL check results to Dagster asset checks.

## Installation

```bash
pip install dagster-soda
```

**Note:** `dagster-soda` requires **soda-core 3.x** (the `soda.scan` API). Its package metadata pins `soda-core>=3.0,<4`, so installing it will not resolve to soda-core 4.x.
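
To confirm which soda-core version an environment actually resolved, a small standard-library check can help. This is a minimal sketch: the helper names `is_supported_soda_core` and `installed_soda_core_version` are hypothetical and not part of `dagster-soda`, and the check only compares the major version against the `>=3.0,<4` pin.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def is_supported_soda_core(version_str: str) -> bool:
    """Return True when the major version is 3, matching the >=3.0,<4 pin."""
    major = int(version_str.split(".")[0])
    return major == 3


def installed_soda_core_version() -> Optional[str]:
    """Look up the installed soda-core version, or None if it is absent."""
    try:
        return version("soda-core")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_soda_core_version()
    if found is None:
        print("soda-core is not installed")
    else:
        print(f"soda-core {found}: supported={is_supported_soda_core(found)}")
```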

## Usage

### Component: SodaScanComponent

Configure a `SodaScanComponent` in your Dagster project to:

- Point at SodaCL YAML check files and a Soda `configuration.yml`
- Map Soda dataset names to Dagster asset keys
- Run scans and report pass/fail as Dagster asset check results
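
The mapping idea in the list above can be sketched in plain Python. This is an illustration only, not the component's implementation: `SodaCheckOutcome`, `AssetCheckSummary`, and `map_outcomes` are hypothetical names, and the sketch assumes that only a `pass` outcome counts as passing and that datasets missing from the map are skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SodaCheckOutcome:
    """Hypothetical stand-in for one SodaCL check result."""

    dataset: str     # Soda dataset (table) name
    check_name: str  # e.g. "row_count > 0"
    outcome: str     # "pass", "warn", or "fail"


@dataclass(frozen=True)
class AssetCheckSummary:
    """Hypothetical stand-in for a pass/fail result attached to an asset."""

    asset_key: str
    check_name: str
    passed: bool


def map_outcomes(outcomes, asset_key_map):
    """Translate Soda outcomes into per-asset pass/fail summaries."""
    summaries = []
    for item in outcomes:
        asset_key = asset_key_map.get(item.dataset)
        if asset_key is None:
            # Assumption for this sketch: unmapped datasets are skipped.
            continue
        # Assumption for this sketch: "warn" and "fail" both count as not passed.
        passed = item.outcome == "pass"
        summaries.append(AssetCheckSummary(asset_key, item.check_name, passed))
    return summaries


if __name__ == "__main__":
    sample = [
        SodaCheckOutcome("my_table", "row_count > 0", "pass"),
        SodaCheckOutcome("other_table", "row_count > 0", "fail"),
    ]
    for summary in map_outcomes(sample, {"my_table": "my_table"}):
        print(summary)
```

Keeping the translation a pure function of the results and the map makes it easy to reason about which Soda dataset ends up on which asset.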

### Scaffolding with the CLI

Use the Dagster CLI to scaffold a new Soda scan component in your project (requires [dagster-dg-cli](https://docs.dagster.io/concepts/components#scaffolding-components)):

```bash
dg scaffold defs dagster_soda.SodaScanComponent <path>
```

Example (scaffold into a folder named `soda_checks` under your defs directory):

```bash
dg scaffold defs dagster_soda.SodaScanComponent soda_checks
```

This generates:

- A **defs.yaml** with default attributes (`checks_paths`, `configuration_path`, `data_source_name`, `asset_key_map`)
- A **checks.yml** template with example SodaCL (e.g. a `checks for my_table:` block containing the check `- row_count > 0`)

Edit the generated files to match your data source and checks, then load your definitions as usual.

### Minimal defs.yaml example

```yaml
type: dagster_soda.SodaScanComponent
attributes:
  checks_paths:
    - checks.yml
  configuration_path: configuration.yml
  data_source_name: my_datasource
  asset_key_map:
    my_table: my_table
```
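
Before loading definitions, it can be useful to confirm that a `defs.yaml` carries the four attributes the scaffold emits. The sketch below assumes the YAML has already been parsed into a dict (for example with `yaml.safe_load` from PyYAML); `missing_attributes` is a hypothetical helper, and whether the component treats every one of these attributes as required is not stated here.

```python
# Attribute names taken from the scaffolded defs.yaml.
EXPECTED_ATTRIBUTES = (
    "checks_paths",
    "configuration_path",
    "data_source_name",
    "asset_key_map",
)


def missing_attributes(defs: dict) -> list[str]:
    """Return the expected attribute names that are absent from defs."""
    attributes = defs.get("attributes") or {}
    return [name for name in EXPECTED_ATTRIBUTES if name not in attributes]


# Dict form of the minimal defs.yaml example.
minimal_defs = {
    "type": "dagster_soda.SodaScanComponent",
    "attributes": {
        "checks_paths": ["checks.yml"],
        "configuration_path": "configuration.yml",
        "data_source_name": "my_datasource",
        "asset_key_map": {"my_table": "my_table"},
    },
}

if __name__ == "__main__":
    print(missing_attributes(minimal_defs))
```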

## Documentation

See the [dagster-soda documentation](https://docs.dagster.io/integrations/libraries/soda/dagster-soda) for the full reference.
