Metadata-Version: 2.4
Name: dqm-ml
Version: 2.0.0rc1
Summary: Python library designed to compute data quality metrics for Machine Learning
Author-email: Safenai <support@safenai.io>
License-Expression: Apache-2.0
Project-URL: Homepage, https://irt-systemx.github.io/dqm-ml
Project-URL: Documentation, https://irt-systemx.github.io/dqm-ml
Project-URL: Repository, https://github.com/IRT-SystemX/dqm-ml
Keywords: ml,metrics,data
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: dqm-ml-core>=1.1.4
Provides-Extra: notebooks
Requires-Dist: jupyter>=1.0.0; extra == "notebooks"
Requires-Dist: plotly>=5.0.0; extra == "notebooks"
Provides-Extra: job
Requires-Dist: dqm-ml-job>=2.0.0rc0; extra == "job"
Provides-Extra: pytorch
Requires-Dist: dqm-ml-images>=2.0.0rc0; extra == "pytorch"
Provides-Extra: images
Requires-Dist: dqm-ml-images>=2.0.0rc0; extra == "images"
Provides-Extra: all
Requires-Dist: dqm-ml[images,job,notebooks,pytorch]; extra == "all"

# DQM-ML CLI Wrapper

`dqm-ml` is the main CLI entry point for DQM-ML. It consolidates the modular DQM-ML packages into a single command-line interface.

## Installation

```bash
# Basic installation (core only)
pip install dqm-ml

# Installation with optional components
pip install "dqm-ml[all]"        # all extras listed below
pip install "dqm-ml[job]"        # core + dqm-ml-job
pip install "dqm-ml[pytorch]"    # core + dqm-ml-images (same dependency as the images extra)
pip install "dqm-ml[images]"     # core + dqm-ml-images
pip install "dqm-ml[notebooks]"  # core + jupyter and plotly
```

## Quick Start

### Process a Dataset

Run a data quality pipeline from a configuration file:

```bash
dqm-ml process -p config.yaml
```
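If the pipeline needs to be launched from a Python script (for example in a CI job), the same command can be run through `subprocess`. This is a minimal sketch: only the `process` subcommand and the `-p` flag come from this README, while the helper names `build_process_command` and `run_pipeline` are hypothetical and not part of DQM-ML.

```python
import shutil
import subprocess
from pathlib import Path


def build_process_command(config_path: str) -> list[str]:
    """Return the argv list for `dqm-ml process -p <config_path>`."""
    return ["dqm-ml", "process", "-p", str(Path(config_path))]


def run_pipeline(config_path: str) -> int:
    """Run the pipeline and return the exit code reported by the CLI."""
    # Fail early with a clear message if the CLI is not installed.
    if shutil.which("dqm-ml") is None:
        raise RuntimeError("dqm-ml executable not found on PATH")
    completed = subprocess.run(build_process_command(config_path), check=False)
    return completed.returncode


print(build_process_command("config.yaml"))
```

Passing the arguments as a list avoids shell quoting problems when the configuration path contains spaces.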

### List Available Plugins

Show all registered metrics and data loaders:

```bash
dqm-ml list
```

### Check Version

```bash
dqm-ml version
```

## Commands

| Command | Description |
|---------|-------------|
| **process** | Execute a data quality pipeline from a YAML config |
| **list** | Show all available plugins (metrics, loaders) |
| **version** | Display version information |

## Configuration

DQM-ML uses YAML configuration files to define:

- Data sources (`dataloaders`)
- Metrics to compute (`metrics_processor`)
- Output settings (`outputs`)
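A quick structural check can catch typos in a configuration before a pipeline run. The sketch below works on an already parsed configuration (a plain `dict`) and assumes, based only on the examples in this README, that `dataloaders` and `metrics_processor` are mappings whose entries each carry a `type` key. The function `check_config` is hypothetical and is not DQM-ML's own validation.

```python
from typing import Any, Mapping

# Assumption: these two sections appear in every example of this README.
REQUIRED_SECTIONS = ("dataloaders", "metrics_processor")


def check_config(config: Mapping[str, Any]) -> list[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for section in REQUIRED_SECTIONS:
        entries = config.get(section)
        if not isinstance(entries, Mapping) or not entries:
            problems.append(f"missing or empty section: {section}")
            continue
        for name, entry in entries.items():
            if not isinstance(entry, Mapping) or "type" not in entry:
                problems.append(f"{section}.{name} has no 'type' key")
    return problems


config = {
    "dataloaders": {"train": {"type": "parquet", "path": "data/train.parquet"}},
    "metrics_processor": {
        "completeness": {"type": "completeness", "input_columns": ["col_a", "col_b"]}
    },
}
print(check_config(config))  # []
```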

### Completeness Example

```yaml
dataloaders:
  train:
    type: parquet
    path: data/train.parquet

metrics_processor:
  completeness:
    type: completeness
    input_columns: [col_a, col_b]
```
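For intuition, completeness is commonly defined as the fraction of non-missing values in each column. The following sketch illustrates that common definition in plain Python. It is an assumption about what the metric measures, not DQM-ML's implementation, and the function name `completeness` is hypothetical.

```python
import math
from typing import Any, Mapping, Sequence


def _is_missing(value: Any) -> bool:
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def completeness(
    rows: Sequence[Mapping[str, Any]], columns: Sequence[str]
) -> dict[str, float]:
    """Return the fraction of rows with a non-missing value, per column."""
    if not rows:
        return {col: 0.0 for col in columns}
    scores = {}
    for col in columns:
        present = sum(1 for row in rows if not _is_missing(row.get(col)))
        scores[col] = present / len(rows)
    return scores


rows = [
    {"col_a": 1, "col_b": "x"},
    {"col_a": None, "col_b": "y"},
    {"col_a": 3, "col_b": float("nan")},
    {"col_a": 4, "col_b": "z"},
]
print(completeness(rows, ["col_a", "col_b"]))  # {'col_a': 0.75, 'col_b': 0.75}
```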

### Representativeness Example

```yaml
dataloaders:
  train:
    type: parquet
    path: data/train.parquet

metrics_processor:
  representativeness:
    type: representativeness
    input_columns: [feature_x, feature_y]
    distribution: "normal"
    metrics: ["chi-square", "kolmogorov-smirnov"]
```
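To illustrate what a Kolmogorov-Smirnov comparison against a normal distribution measures, the sketch below computes the one-sample KS statistic with the standard library only. It assumes the reference normal is fitted to the sample's own mean and standard deviation; whether DQM-ML fits the parameters this way is not stated in this README, and the function name `ks_statistic_normal` is hypothetical.

```python
from statistics import NormalDist, fmean, stdev


def ks_statistic_normal(sample: list[float]) -> float:
    """One-sample KS statistic against a normal fitted to the sample."""
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    # Assumption: reference distribution uses the sample mean and std dev.
    dist = NormalDist(mu=fmean(sample), sigma=stdev(sample))
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered):
        cdf = dist.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


print(ks_statistic_normal([-1.0, 0.0, 1.0]))
```

The statistic lies between 0 and 1, and smaller values mean the sample is closer to the reference distribution. A constant sample has zero standard deviation, in which case `NormalDist.cdf` raises `StatisticsError`.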

### Domain Gap Example

```yaml
dataloaders:
  source:
    type: parquet
    path: data/source.parquet
  target:
    type: parquet
    path: data/target.parquet

metrics_processor:
  domain_gap:
    type: domain_gap
    INPUT:
      embedding_col: "features"
    DELTA:
      metric: "mmd_linear"
```
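The name `mmd_linear` suggests Maximum Mean Discrepancy with a linear kernel. Under that assumption, the biased estimate of the squared MMD reduces to the squared Euclidean distance between the mean embeddings of the two datasets. The sketch below shows that textbook formula in plain Python; it is not DQM-ML's implementation, and the function names are hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def mean_vector(embeddings: Sequence[Vector]) -> list[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sum(vec[k] for vec in embeddings) / n for k in range(dim)]


def mmd_linear(source: Sequence[Vector], target: Sequence[Vector]) -> float:
    """Biased squared MMD with a linear kernel: ||mean(source) - mean(target)||^2."""
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


source = [[0.0, 0.0], [2.0, 0.0]]
target = [[1.0, 1.0], [1.0, 3.0]]
print(mmd_linear(source, target))  # 4.0
```

A value of 0 means the two sets of embeddings have the same mean. With a linear kernel, differences in higher-order statistics such as variance are not detected.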

### Visual Features Example

```yaml
dataloaders:
  images:
    type: parquet
    path: data/images.parquet

metrics_processor:
  visual:
    type: visual_metric
    input_columns: ["image_data"]
    grayscale: true
```

### Multiple Metrics Example

```yaml
dataloaders:
  train:
    type: parquet
    path: data/train.parquet

metrics_processor:
  completeness:
    type: completeness
    input_columns: [col_a, col_b]

  representativeness:
    type: representativeness
    input_columns: [feature_x]
    distribution: "normal"
```

## See Also

- [Documentation](https://safenai.github.io/dqm-ml-workspace/)
- [Metrics Guide](https://safenai.github.io/dqm-ml-workspace/docs/metrics/)
- [Configuration Guide](https://safenai.github.io/dqm-ml-workspace/docs/configuration/)
