Metadata-Version: 2.4
Name: sniffdiff
Version: 0.0.1
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Rust
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Version Control :: Git
License-File: LICENSE
Summary: Semantic commit analysis for Python: blast radius, symbol diffs and complexity changes.
Keywords: python,git,code-review,static-analysis,cli
Author-email: William Norman <vvilliamnorman@gmail.com>
License-Expression: MIT
Requires-Python: >=3.8
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Documentation, https://github.com/wlamnorman/sniffdiff#readme
Project-URL: Repository, https://github.com/wlamnorman/sniffdiff

# sniffdiff

`sniffdiff` is a Rust CLI for reviewing Python diffs by symbol instead of by
line count.

It compares two local Git refs, parses the Python code before and after the
range, and prints a compact report of review facts:

- changed functions, methods, and classes;
- body versus signature changes;
- structural complexity movement;
- changed and unchanged callers;
- changed and unchanged tests that reference changed production symbols;
- implementation changes with no direct test references.
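
To make the difference between a signature change and a body change concrete, here is a small Python sketch. The two signatures mirror the `build_features` entry in the example output further down; the function bodies and the `classify_change` helper are invented for illustration. It inspects live functions with the standard `inspect` module, whereas `sniffdiff` itself parses source statically, so treat it as an analogy rather than the tool's method.

```python
import inspect


def build_features_before(rows):
    return [row["name"] for row in rows]


def build_features_after(rows, *, strict=False, source="unknown"):
    features = []
    for row in rows:
        if "name" not in row:
            if strict:
                raise ValueError(f"row from {source} has no name")
            continue
        features.append(row["name"])
    return features


def classify_change(before, after):
    """Return illustrative change labels for two versions of a function."""
    labels = []
    if inspect.signature(before) != inspect.signature(after):
        labels.append("public signature")
    # Comparing compiled bytecode is a rough stand-in for "the body changed".
    if before.__code__.co_code != after.__code__.co_code:
        labels.append("implementation")
    return labels


print(classify_change(build_features_before, build_features_after))
```

Here both labels apply: callers may need updating because of the new keyword-only parameters, and the new control flow needs review on its own.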

The goal is not to replace `git diff` but to answer one question:

```text
Which changed symbols deserve attention first, and why?
```

`sniffdiff` is intentionally not a repo knowledge graph, hosted service,
dashboard, AI reviewer, or broad multi-language analyzer.

## Status

Early MVP. Python only. Local Git only.

The crate is versioned as `0.0.1` while the fact model and output format settle.

## Support Expectations

`sniffdiff` does not run Python code or depend on the Python interpreter in the
target repository. It parses Python source statically with `tree-sitter-python`,
so syntax support follows the bundled parser grammar rather than a local
`python` executable version. The current lockfile resolves `tree-sitter-python`
to `0.23.6`, whose grammar includes Python 3-era constructs such as structural
pattern matching (`match`/`case`), exception groups (`except*`), positional-only
and keyword-only parameter separators, f-string interpolation, and Python
3.12-style `type` alias statements and generic type parameters. It also retains
some legacy Python grammar support, but `sniffdiff` is tested and positioned as
a Python 3 source analyzer. Newer Python syntax should be treated as supported
only after it is accepted by the bundled parser and covered by fixtures.

Runtime is currently proportional to the amount of Python source in the compared
refs, not only to the number of changed files. The analyzer parses non-test
Python files at both `base` and `head`, then parses test files at `head` to
attach test-reference facts. It is intended to be fast enough for local review
on typical Python packages, but large-repo performance is not yet benchmarked or
optimized.
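
To get a rough feel for that workload before running the tool, one option is to count the Python files at each ref. The sketch below is a standalone estimate and not part of `sniffdiff`. It relies on `git ls-tree -r --name-only`, and the `looks_like_test` rule is a guess, since this README does not specify how test files are recognized.

```python
import subprocess
from pathlib import PurePosixPath


def looks_like_test(path):
    # Hypothetical rule: pytest-style file names or a tests/ directory.
    p = PurePosixPath(path)
    return (
        p.name.startswith("test_")
        or p.name.endswith("_test.py")
        or "tests" in p.parts
    )


def split_python_files(paths):
    """Split repo paths into (non_test, test) lists of Python files."""
    python_files = [p for p in paths if p.endswith(".py")]
    tests = [p for p in python_files if looks_like_test(p)]
    non_tests = [p for p in python_files if not looks_like_test(p)]
    return non_tests, tests


def list_files(repo, ref):
    # Lists every tracked file at the given ref without checking it out.
    out = subprocess.run(
        ["git", "-C", repo, "ls-tree", "-r", "--name-only", ref],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.splitlines()


def estimated_parse_count(repo, base, head):
    # Non-test files at both refs plus test files at head, as described above.
    base_src, _ = split_python_files(list_files(repo, base))
    head_src, head_tests = split_python_files(list_files(repo, head))
    return len(base_src) + len(head_src) + len(head_tests)
```

Calling `estimated_parse_count(".", "main", "HEAD")` inside a repository would give an approximate number of files parsed per run under these assumptions.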

Monorepos work as long as the repository is local and both refs are available,
but nothing is optimized for them yet. Future monorepo work should add
repeatable `--path` scopes, `--exclude` and config support, smarter changed-file
scoping, parallel parsing, batched Git object reads, optional blob-based
caching, and timing/file-count fields in JSON output.

## Install

From a local checkout:

```sh
cargo install --path .
```

Then run from any Git repo:

```sh
sniffdiff main..HEAD
```

From crates.io:

```sh
cargo install sniffdiff
```

From PyPI, once the Python package is published (use either command):

```sh
uv tool install sniffdiff
pipx install sniffdiff
```

Or install directly from the repository:

```sh
cargo install --git https://github.com/wlamnorman/sniffdiff
```

## Usage

Compare your working tree against a base ref:

```sh
sniffdiff main
```

Compare two committed refs:

```sh
sniffdiff main..HEAD
```

Analyze another repository:

```sh
sniffdiff --repo ../some-python-repo main..HEAD
```

Use explicit refs instead of `base..head`:

```sh
sniffdiff --repo ../some-python-repo --base main --head HEAD
```

Show more report items:

```sh
sniffdiff main..HEAD --limit 10
sniffdiff main..HEAD --limit all
```

Show more caller/test references inside each report item:

```sh
sniffdiff main..HEAD --caller-preview-limit 8
```

Choose report detail:

```sh
sniffdiff main..HEAD --verbose
sniffdiff main..HEAD --detail verbose
sniffdiff main..HEAD --detail full
```

The default output format is YAML. Emit the same report model as JSON:

```sh
sniffdiff main..HEAD --json
sniffdiff main..HEAD --format json --detail full
```
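
The JSON form is the natural input for scripts, so a short consumer may help. This sketch assumes the JSON keys match the YAML keys shown under Example Output (`inspect`, `symbol`, `tests`) and that the missing-test fact is emitted as the exact string used there; check real output before relying on either. `untested_symbols` and `load_report` are hypothetical helper names.

```python
import json
import subprocess

# Assumed to match the wording shown in the YAML example output.
NO_TESTS = "no direct test references found"


def untested_symbols(report):
    """Return symbols whose report item carries the no-test-reference fact."""
    return [
        item["symbol"]
        for item in report.get("inspect", [])
        if item.get("tests") == NO_TESTS
    ]


def load_report(rev_range, repo="."):
    # Combines flags documented in this section: --repo, --json, --limit all.
    result = subprocess.run(
        ["sniffdiff", "--repo", repo, rev_range, "--json", "--limit", "all"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

With `sniffdiff` on `PATH`, `untested_symbols(load_report("main..HEAD"))` would list the changed symbols that have no direct test references, provided those assumptions hold.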

When installed as a Python package, `python -m sniffdiff main..HEAD` delegates
to the same Rust binary. For local wrapper testing, set `SNIFFDIFF_BIN` to an
explicit binary path.
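
The override can be pictured with a small sketch. This is not the packaged wrapper's code; it only illustrates one plausible lookup order (an explicit `SNIFFDIFF_BIN` first, then `PATH`), and `resolve_binary` and `main` are hypothetical names.

```python
import os
import shutil
import subprocess
import sys


def resolve_binary(env=None):
    """Pick the sniffdiff executable: explicit override first, then PATH."""
    env = os.environ if env is None else env
    override = env.get("SNIFFDIFF_BIN")
    if override:
        return override
    found = shutil.which("sniffdiff", path=env.get("PATH"))
    if found is None:
        raise FileNotFoundError("sniffdiff binary not found; set SNIFFDIFF_BIN")
    return found


def main(argv=None):
    # Forward all CLI arguments unchanged and propagate the exit code.
    args = sys.argv[1:] if argv is None else argv
    return subprocess.call([resolve_binary(), *args])
```

Checking the override first is what makes it possible to test a wrapper against a freshly built debug binary without installing it.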

## Example Output

```yaml
schema_version: 1
detail: normal
scope:
  changed_files: 11
  changed_symbols: 16
  changed_test_files: 2
inspect:
- symbol: src/features.py::build_features
  changes:
  - public signature
  - implementation
  signature:
    before: build_features(rows)
    after: build_features(rows, *, strict=False, source="unknown")
  complexity:
    status: increased
    metrics:
    - name: branches
      before: 0
      after: 2
  changed_tests:
  - tests/test_features.py::test_build_features (1 callsite)
  - tests/test_features.py::test_build_features_skips_missing_names (1 callsite)
  unchanged_callers:
  - src/api.py::preview (1 callsite)
  - src/batch.py::build_batch (1 callsite)
  - src/pipeline.py::run_pipeline (1 callsite)
  - src/predict.py::predict (1 callsite)
  changed_callers:
  - src/reporting.py::summarize (1 callsite)
  - src/train.py::train (1 callsite)
- symbol: src/validators.py::validate_row
  changes:
  - public signature
  - implementation
  signature:
    before: validate_row(row)
    after: validate_row(row, *, strict=False)
  complexity:
    status: increased
    metrics:
    - name: branches
      before: 1
      after: 4
  tests: no direct test references found
  changed_callers:
  - src/validators.py::is_ready (1 callsite)
- symbol: src/scoring.py::score_features
  changes:
  - implementation
  complexity:
    status: increased
    metrics:
    - name: branches
      before: 0
      after: 2
    - name: nesting
      before: 0
      after: 2
  tests: no direct test references found
  changed_callers:
  - src/train.py::train (1 callsite)
- symbol: src/features.py::Formatter.format_many
  changes:
  - public signature
  signature:
    before: format_many(self, names)
    after: format_many(self, names, *, uppercase=False)
- symbol: src/features.py::Formatter.format_name
  changes:
  - public signature
  signature:
    before: format_name(self, name)
    after: format_name(self, name, *, uppercase=False, fallback="unknown")
  changed_callers:
  - src/features.py::Formatter.format_many (1 callsite)
omitted:
  symbol_changes: 11
  high_signal: 9
  low_signal: 2
  hint: use --limit 14 to show all high-signal items, --detail full for 2 low-signal facts
```
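
The `branches` and `nesting` metrics in this report are structural counts. Their exact definitions are not spelled out here, so the following is only a rough approximation built on Python's standard `ast` module (`sniffdiff` itself uses `tree-sitter-python`). It treats `if`, `for`, `while`, and `try` statements as branch points and reports the deepest nesting of those statements. The two function bodies are invented.

```python
import ast

# Assumed branch-point node types; the real metric may count differently.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try)


def structural_metrics(source):
    """Return (branches, max_nesting) for a snippet of Python source."""
    tree = ast.parse(source)
    branches = 0
    deepest = 0

    def visit(node, depth):
        nonlocal branches, deepest
        if isinstance(node, BRANCH_NODES):
            branches += 1
            depth += 1
            deepest = max(deepest, depth)
        for child in ast.iter_child_nodes(node):
            visit(child, depth)

    visit(tree, 0)
    return branches, deepest


before = "def score_features(xs):\n    return sum(xs)\n"
after = (
    "def score_features(xs):\n"
    "    total = 0\n"
    "    for x in xs:\n"
    "        if x > 0:\n"
    "            total += x\n"
    "    return total\n"
)
print(structural_metrics(before), structural_metrics(after))
```

Under this definition the invented change moves both metrics from 0 to 2, the same kind of movement the report flags as `status: increased`.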

## Examples

Generate and analyze a throwaway Python repo:

```sh
make example
```

`make demo` is kept as an alias. The built-in example is deterministic and
offline. It includes signature changes, body changes, alias imports,
module-alias callers, changed tests, missing test movement, added/deleted
symbols, and a Git-detected file rename.

Run against real commits from well-known Python packages:

```sh
make example-requests
make example-click
make examples          # all real-world examples
```

Real-world examples live under `examples/real-world/`. They clone upstream
repositories into `target/` at run time instead of vendoring package source into
this repo, so they require network access the first time they run.

## GitHub Action

This repository includes `.github/workflows/sniff.yml`, which runs `sniffdiff`
against pull requests and writes the report to the GitHub Actions step summary.
It is intentionally log-first for now: no bot comments, no tokens beyond
read-only repository access, and no hosted service dependency.
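
For readers wiring up something similar by hand, GitHub Actions exposes the step summary as a file whose path is in the `GITHUB_STEP_SUMMARY` environment variable. The sketch below is not taken from `sniff.yml`; it shows one way a step could append a report as a fenced YAML block, and `append_to_step_summary` is a hypothetical helper.

```python
import os

FENCE = "`" * 3  # built at run time so this example nests cleanly in Markdown


def append_to_step_summary(report_text, summary_path=None):
    """Append a fenced YAML report to the GitHub Actions step summary file."""
    path = summary_path or os.environ["GITHUB_STEP_SUMMARY"]
    block = f"## sniffdiff\n\n{FENCE}yaml\n{report_text.rstrip()}\n{FENCE}\n"
    # Append mode keeps whatever earlier steps already wrote to the summary.
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(block)
    return block
```

Writing to the summary file needs no API token, which fits the log-first approach described above.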

## Parse Errors

By default, `sniffdiff` fails if either side of the range contains Python syntax
errors. That keeps the review facts honest.

For partial output while debugging a broken branch:

```sh
sniffdiff main..HEAD --allow-parse-errors
```

The report includes a `parse_errors:` line when partial facts were produced.
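
A script can try the strict mode first and fall back to partial output only when that fails. The sketch below assumes a failed run exits with a non-zero status; this README says the tool fails on syntax errors but does not document exit codes. `run_with_fallback` and its injectable `runner` parameter are illustrative.

```python
import subprocess


def run_with_fallback(rev_range, runner=subprocess.run):
    """Return (stdout, partial); partial is True if the fallback ran."""
    base_cmd = ["sniffdiff", rev_range]
    strict = runner(base_cmd, capture_output=True, text=True)
    if strict.returncode == 0:
        return strict.stdout, False
    # The retry only helps for parse failures; other errors will fail again
    # here and raise CalledProcessError because of check=True.
    relaxed = runner(
        base_cmd + ["--allow-parse-errors"],
        capture_output=True, text=True, check=True,
    )
    return relaxed.stdout, True
```

Returning the `partial` flag alongside the text lets the caller label the report as incomplete instead of silently mixing strict and partial results.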

## Current Limits

- Python only.
- Local Git refs only.
- Working-tree comparison is supported with `sniffdiff <base-ref>`.
- No hosted forge APIs.
- No persistent index.
- No full Python call graph.
- Import and call matching are static heuristics.
- Tests are parsed only to support production-symbol facts, not as primary
  review targets.
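
To illustrate why static import and call matching is heuristic, the Python below reaches the same function three ways. Which of these forms `sniffdiff` resolves is not documented here. The point is only that a direct call and a module alias are visible in the source text, while a name assembled at run time is invisible to any purely static matcher.

```python
import math
import math as m


def direct(x):
    return math.sqrt(x)            # call through the imported module name


def aliased(x):
    return m.sqrt(x)               # same function through a module alias


def dynamic(x, name="sq" + "rt"):
    return getattr(math, name)(x)  # target is only known at run time


print(direct(9.0), aliased(9.0), dynamic(9.0))
```

All three calls return the same value at run time, yet only the first two name their target in a form a parser can read off the syntax tree.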

## Development

Run the normal checks:

```sh
cargo fmt --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test
```

Check package contents before publishing:

```sh
cargo package --list
cargo publish --dry-run
```

See [docs/releasing.md](docs/releasing.md) for the crates.io and PyPI release
checklists.

Build an optimized release binary:

```sh
cargo build --release
```

Check the Python package wrapper:

```sh
PYTHONPATH=python SNIFFDIFF_BIN=target/debug/sniffdiff python -m sniffdiff --help
```

Build a Python wheel locally when `maturin` is available:

```sh
maturin build
```

Run `sniffdiff --help` for the complete CLI surface.

