Metadata-Version: 2.4
Name: diffino-cli
Version: 0.3.1
Summary: Declarative data diff engine for tables, powered by Polars. Output Excel, HTML, or Typst PDF.
License-Expression: MIT
Project-URL: Repository, https://codeberg.org/songwupei/diffino
Keywords: diff,excel,csv,comparison,polars,openpyxl
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Office/Business :: Financial :: Spreadsheet
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: polars>=0.20.0
Requires-Dist: openpyxl>=3.1.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: typer>=0.9.0
Requires-Dist: duckdb>=0.10.0
Dynamic: license-file

# diffino

Declarative data diff engine for tables, powered by Polars. Compare Excel, CSV, or Parquet files and generate detailed reports with character-level inline diffs.

Supports cumulative changelog generation across multiple versions.

## Installation

```bash
pip install diffino-cli
```

## Quick Start

1. Prepare a config file:

```yaml
sources:
  left:
    type: excel
    path: data/v0.2.5.xlsx
    version: "0.2.5"      # optional — auto-parsed from filename
  right:
    type: excel
    path: data/v0.2.6.xlsx
    version: "0.2.6"

compare:
  - left_sheet: Sheet1
    key_columns:
      - ID
    ignore_columns:
      - Notes

output:
  project: 我的项目           # "My Project"; cover page title: "我的项目对比报告" ("My Project Comparison Report")
  formats:
    - excel
    - html
    - typst
  save_report: true           # persist DiffReport JSON for changelog
  report_dir: ./diffs
```

2. Run diffino:

```bash
diffino run config.yaml
```

3. Generate a cumulative changelog from the saved reports:

```bash
diffino changelog generate --input-dir ./diffs --releases releases.yaml
```
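
The config comments above say that `version` is optional and auto-parsed from the filename. The following is a minimal sketch of how such a fallback could work, not diffino's actual implementation. The function names `parse_version` and `resolve_version` and the exact regular expression are assumptions made for illustration.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed pattern: "v" followed by three dot-separated integers,
# anywhere in the file stem (e.g. "v0.2.5" or "report-v1.10.0").
_VERSION_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def parse_version(path: str) -> Optional[str]:
    """Return 'X.Y.Z' parsed from the filename, or None if nothing matches."""
    match = _VERSION_RE.search(Path(path).stem)
    if match is None:
        return None
    return ".".join(match.groups())


def resolve_version(path: str, override: Optional[str] = None) -> Optional[str]:
    """Prefer an explicit override (the YAML `version` field) over the filename."""
    return override if override is not None else parse_version(path)
```

With this sketch, `resolve_version("data/v0.2.5.xlsx")` yields `"0.2.5"`, while passing `override="0.3.0"` returns the override unchanged.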

## Features

- **Multi-format sources**: Excel, CSV, Parquet, DuckDB
- **Key-based or fingerprint matching**: Compare by composite keys or full-row hashes
- **Column preprocessing**: Decimal rounding, text normalization, case sensitivity control
- **Character-level inline diff**: Red strikethrough for deleted text, green bold for inserted text
- **Three output formats**:
  - **Excel**: Side-by-side old/new rows, yellow-highlighted changed cells with rich-text inline diffs
  - **HTML**: Self-contained report with `<del>`/`<ins>` tags and JS filtering
  - **Typst**: PDF rendered via Typst, with a cover page, colored tables, and character-level inline diffs. Uses the Zhuque Fangsong CJK font.
- **Typst cover page**: Configurable project name, rendered as `{{PROJECT}}对比报告` ("{{PROJECT}} Comparison Report")
- **DiffReport persistence**: Auto-save JSON reports (`{old}__{new}.json`) for changelog accumulation
- **Version auto-detection**: Parses `name-vX.Y.Z.ext` patterns, with manual override in YAML
- **Changelog generation**: `diffino changelog generate` — version summary table + detailed per-version diffs
- **Release date config**: `releases.yaml` maps versions to release dates for changelog display
- **Excel styles**: `track`, `final`, `side_by_side`
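
To illustrate the character-level inline diff listed above, here is a minimal sketch built on the standard library's `difflib.SequenceMatcher`. It shows the general technique only. The helper name `inline_diff_html` is hypothetical, and diffino's own diff algorithm and markup may differ.

```python
from difflib import SequenceMatcher
from html import escape


def inline_diff_html(old: str, new: str) -> str:
    """Mark deleted characters with <del> and inserted characters with <ins>."""
    parts = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(escape(old[i1:i2]))
            continue
        # "delete" and "replace" both remove old[i1:i2].
        if i1 != i2:
            parts.append(f"<del>{escape(old[i1:i2])}</del>")
        # "insert" and "replace" both add new[j1:j2].
        if j1 != j2:
            parts.append(f"<ins>{escape(new[j1:j2])}</ins>")
    return "".join(parts)


print(inline_diff_html("cat", "cart"))  # ca<ins>r</ins>t
```

The same opcodes could drive red strikethrough and green bold runs in an Excel or Typst renderer instead of HTML tags.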

## CLI Commands

```bash
diffino run config.yaml                  # Run comparison
diffino validate config.yaml             # Validate config only

# Generate changelog.typ from saved diff JSONs:
#   --input-dir     Directory of diff JSONs
#   --output        Output file
#   --releases      Release dates
#   --summary-keep  Versions in summary
diffino changelog generate \
  --input-dir ./diffs \
  --output changelog.typ \
  --releases releases.yaml \
  --summary-keep 3

## Configuration Reference

See `config.example.yaml` for a complete example. Key options:

| Section | Field | Description |
|---|---|---|
| `sources.left/right` | `version` | Manual version override; if omitted, the version is auto-parsed from the filename |
| `compare[]` | `key_columns` | Column names used for row matching |
| `compare[]` | `fingerprint` | Use full-row hash instead of key columns |
| `compare[]` | `ignore_columns` | Columns to exclude from comparison |
| `compare[]` | `column_rules` | Preprocessing rules (`decimal`, `text`) |
| `output` | `project` | Project name for Typst cover (default: `数据`, "Data") |
| `output` | `formats` | List of `excel`, `html`, `typst` |
| `output` | `save_report` | Persist DiffReport as JSON |
| `output` | `report_dir` | Directory for saved reports (default: `./diffs`) |
| `output.excel` | `style` | `final`, `side_by_side`, or `track` |
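
The difference between `key_columns` and `fingerprint` matching can be sketched in plain Python. This is an illustration under assumptions, not diffino's Polars-based implementation: the helpers `row_key`, `row_fingerprint`, and `classify` are hypothetical names, and rows are modeled as simple dicts.

```python
import hashlib


def row_key(row, key_columns):
    """Identity of a row under key-based matching: the tuple of key values."""
    return tuple(row[c] for c in key_columns)


def row_fingerprint(row, ignore_columns=()):
    """Identity under fingerprint matching: a hash of all non-ignored columns."""
    items = [(c, row[c]) for c in sorted(row) if c not in ignore_columns]
    payload = "\x1f".join(f"{c}={v!r}" for c, v in items)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def classify(left, right, identity):
    """Split rows into (added, removed, common) using an identity function."""
    left_ids = {identity(r): r for r in left}
    right_ids = {identity(r): r for r in right}
    added = [right_ids[k] for k in right_ids if k not in left_ids]
    removed = [left_ids[k] for k in left_ids if k not in right_ids]
    common = [(left_ids[k], right_ids[k]) for k in left_ids if k in right_ids]
    return added, removed, common


left = [{"ID": 1, "Name": "a", "Notes": "x"}, {"ID": 2, "Name": "b", "Notes": "y"}]
right = [{"ID": 1, "Name": "a", "Notes": "edited"}, {"ID": 3, "Name": "c", "Notes": "z"}]

by_key = classify(left, right, lambda r: row_key(r, ["ID"]))
by_hash = classify(left, right, lambda r: row_fingerprint(r, ignore_columns=("Notes",)))
```

In this sketch, key-based matching pairs rows that share an `ID`, so their cells can then be compared for modifications. Fingerprint matching has no stable identity, so any change in a non-ignored column shows up as one removed row plus one added row.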

### Releases config (`releases.yaml`)

```yaml
releases:
  - version: "0.2.5"
    date: 2026-05-19
  - version: "0.2.6"
    date: 2026-05-20
```
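
As a rough sketch of how the release dates might be consumed, the snippet below mirrors the YAML structure above as a Python literal and builds a version-to-date lookup. The helper names are hypothetical, and treating `--summary-keep` as "the newest N versions" is an assumption. Note that a YAML 1.1 parser such as PyYAML loads an unquoted `2026-05-19` as a date object, and would load an unquoted two-part version such as `0.3` as a float, which is one reason to keep versions quoted.

```python
from datetime import date

# Stand-in for the parsed releases.yaml shown above.
RELEASES = {
    "releases": [
        {"version": "0.2.5", "date": date(2026, 5, 19)},
        {"version": "0.2.6", "date": date(2026, 5, 20)},
    ]
}


def release_dates(config):
    """Map each version string to its release date."""
    return {str(entry["version"]): entry["date"] for entry in config["releases"]}


def version_tuple(version):
    """Turn 'X.Y.Z' into (X, Y, Z) so that 0.10.0 sorts after 0.9.0."""
    return tuple(int(part) for part in version.split("."))


def summary_versions(config, keep):
    """Return the newest `keep` versions, newest first (assumed semantics)."""
    versions = sorted(release_dates(config), key=version_tuple, reverse=True)
    return versions[:keep]
```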

## License

MIT
