Metadata-Version: 2.4
Name: llm-model-diff
Version: 0.1.0
Summary: Compare LLM model outputs side-by-side with rich diff visualization
Author: model-diff contributors
License: MIT
Keywords: llm,ai,diff,comparison,openai,anthropic,cli
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Utilities
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: click>=8.0
Requires-Dist: rich>=13.0
Requires-Dist: anthropic>=0.20.0
Requires-Dist: openai>=1.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: isort>=5.0; extra == "dev"
Requires-Dist: mypy>=1.0; extra == "dev"
Requires-Dist: flake8>=6.0; extra == "dev"

# model-diff

Compare LLM model outputs side-by-side with rich diff visualization.

Run the same prompt against multiple models at once and see exactly where their responses differ.

## Installation

```bash
pip install llm-model-diff
```

Or install from source:

```bash
git clone https://github.com/yourname/model-diff
cd model-diff
pip install -e .
```

## Requirements

Set the API keys for the providers you want to use:

```bash
export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
```

You only need keys for the providers you actually use. Any model whose key is not set is skipped with a warning, and the remaining models still run.
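
Skipping models whose key is missing could look roughly like the sketch below. The `MODEL_KEYS` mapping and the `usable_models` helper are hypothetical names used for illustration, not taken from the package source; the mapping simply mirrors the Supported Models table.

```python
import os
import warnings

# Hypothetical mapping of model ID -> environment variable holding its API key.
MODEL_KEYS = {
    "gpt-4o": "OPENAI_API_KEY",
    "gpt-4o-mini": "OPENAI_API_KEY",
    "claude-opus-4-6": "ANTHROPIC_API_KEY",
    "claude-sonnet-4-6": "ANTHROPIC_API_KEY",
    "claude-haiku-4-5-20251001": "ANTHROPIC_API_KEY",
}


def usable_models(requested, env=None):
    """Return the requested models whose API key is set, warning about the rest."""
    env = os.environ if env is None else env
    usable = []
    for model in requested:
        key_name = MODEL_KEYS.get(model)
        if key_name is None:
            warnings.warn(f"Unknown model {model!r}; skipping")
        elif not env.get(key_name):
            warnings.warn(f"{key_name} is not set; skipping {model}")
        else:
            usable.append(model)
    return usable
```

Passing `env` explicitly keeps the helper easy to test; in normal use it falls back to `os.environ`.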

## Usage

```bash
# Default: compare GPT-4o vs Claude Sonnet
model-diff "What is the best way to handle errors in Python?"

# Specify models explicitly
model-diff "Explain recursion" --models gpt-4o,claude-sonnet-4-6

# Use a prompt file
model-diff --prompt prompt.txt --models gpt-4o,claude-haiku-4-5-20251001,claude-sonnet-4-6

# Word-level diff
model-diff "Explain recursion" --diff words

# Show only differences (hide matching sections)
model-diff "Explain recursion" --only-diff

# Reduce sampling randomness (outputs can still vary slightly between runs)
model-diff "Explain recursion" --temperature 0.0

# Save results to JSON
model-diff "Explain recursion" --output results.json
```
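
To give a feel for what `--diff words` and `--only-diff` do, here is a minimal sketch of a word-level diff built on the standard library's `difflib`. The `word_diff` function and its `(tag, text)` output format are assumptions made for this example; the actual differ may tokenize and render differently.

```python
import difflib


def word_diff(a, b, only_diff=False):
    """Return (tag, text) pairs describing how `a` becomes `b`, word by word."""
    a_words, b_words = a.split(), b.split()
    matcher = difflib.SequenceMatcher(None, a_words, b_words, autojunk=False)
    chunks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Matching runs are dropped when only differences are requested.
            if not only_diff:
                chunks.append(("equal", " ".join(a_words[i1:i2])))
            continue
        if tag in ("delete", "replace"):
            chunks.append(("removed", " ".join(a_words[i1:i2])))
        if tag in ("insert", "replace"):
            chunks.append(("added", " ".join(b_words[j1:j2])))
    return chunks
```

Each chunk could then be styled by tag (for example, red for `removed` and green for `added`) before printing.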

## Supported Models

| Model ID | Provider | API Key |
|---|---|---|
| `gpt-4o` | OpenAI | `OPENAI_API_KEY` |
| `gpt-4o-mini` | OpenAI | `OPENAI_API_KEY` |
| `claude-opus-4-6` | Anthropic | `ANTHROPIC_API_KEY` |
| `claude-sonnet-4-6` | Anthropic | `ANTHROPIC_API_KEY` |
| `claude-haiku-4-5-20251001` | Anthropic | `ANTHROPIC_API_KEY` |

## Architecture

```
src/model_diff/
├── cli.py      # Click-based CLI entry point
├── models.py   # Provider-specific API callers, run concurrently via threading
└── differ.py   # difflib-based diff engine + Rich output formatter
```

Model calls are issued concurrently using `threading`, so total wall time is roughly that of the slowest model rather than the sum of all models.
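
The concurrency pattern can be sketched as follows. This is an illustration under assumptions, not the code in `models.py`: `run_concurrently` and `fake_model` are hypothetical names, and the fake callers sleep instead of hitting a real API.

```python
import threading
import time


def run_concurrently(callers, prompt):
    """Call each model in its own thread and collect results by model ID."""
    results = {}
    lock = threading.Lock()

    def worker(model_id, call):
        try:
            outcome = call(prompt)
        except Exception as exc:  # one failing model should not lose the others
            outcome = exc
        with lock:
            results[model_id] = outcome

    threads = [
        threading.Thread(target=worker, args=(model_id, call))
        for model_id, call in callers.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # returns once the slowest call has finished
    return results


def fake_model(delay, reply):
    """Stand-in for a real API call: sleeps, then returns a canned reply."""
    def call(prompt):
        time.sleep(delay)
        return f"{reply}: {prompt}"
    return call
```

Threads suit this workload because each call spends nearly all of its time waiting on network I/O, during which Python releases the GIL.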

## License

MIT
