Metadata-Version: 2.4
Name: runner-view
Version: 0.1.0
Summary: Local-first, event-driven evaluation dashboard for DeepEval
Requires-Python: >=3.10
Requires-Dist: gradio>=6.4.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

# Runner View

Local-first, event-driven evaluation dashboard for DeepEval. Inspect trace runs and click through to any node's detail in under 100 ms. No account, no upload, nothing leaves your machine.

## Launch (one command)

```bash
uvx runner-view
```

Opens the most recent stored run in the dashboard. If no runs are stored yet, it prints an actionable message and exits cleanly.

Open a specific portable trace file:

```bash
uvx runner-view path/to/trace.rvtrace
```

The installed-tool form is equivalent:

```bash
uv tool install runner-view
runner-view [target]        # no arg = latest stored run; or a .rvtrace path
```

Runs are stored append-only at `~/.runner-view/runs.jsonl` (one JSON line per run).
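
As a rough illustration of the append-only, one-JSON-line-per-run layout, the sketch below appends a record and reads back the most recent one. It is not Runner View's actual implementation: the helper names `append_run` and `latest_run` are hypothetical, the record fields are placeholders, and it assumes that "latest" means the last non-empty line in the file.

```python
import json
from pathlib import Path
from typing import Optional


def append_run(path: Path, record: dict) -> None:
    """Append one run as a single JSON line (hypothetical helper)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def latest_run(path: Path) -> Optional[dict]:
    """Return the last record in the file, or None if there are no runs.

    Assumption: because the file is append-only, the newest run is the
    last non-empty line.
    """
    if not path.exists():
        return None
    last = None
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                last = json.loads(line)
    return last
```

Calling `latest_run(Path.home() / ".runner-view" / "runs.jsonl")` would then return either a dict or `None`, which mirrors the "no stored runs yet" case described above.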

## Develop

```bash
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 PYTHONPATH=. pytest -q
```

`PYTEST_DISABLE_PLUGIN_AUTOLOAD=1` keeps the suite isolated from the project's own pytest11 entry point and sidesteps the `pytest-rerunfailures` socket-bind issue in sandboxed shells.

## Status

Epic 2 (one-command offline launch), Epic 3 (pytest debug launcher for zero-click failure triage), and Epic 7 (Analyst Curation Layer: regression diff views and flag-to-dataset functionality) have shipped.

### Analyst Curation Layer Features
- **Regression Diff View**: Appears automatically when two or more runs share the same `test_id`
- **Flag-to-Dataset Mapping**: Flag spans and save them to a local dataset file
- **Visual Highlighting**: Marks each metric as 🟢 Improved, 🔴 Regressed, or ⚪ Unchanged
- **Per-Metric Comparison**: Compares scores across runs, metric by metric
- **Group by Test ID**: Runs with the same `test_id` are grouped automatically
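
The grouping and per-metric comparison described in this list can be sketched roughly as follows. This is an illustrative sketch only, not the project's code: the function names, the record shape (`test_id` plus a `metrics` mapping of name to score), the tolerance, and the assumption that a higher score is better are all hypothetical.

```python
from collections import defaultdict


def group_by_test_id(runs):
    """Group run records by their test_id, preserving input order."""
    groups = defaultdict(list)
    for run in runs:
        groups[run["test_id"]].append(run)
    return dict(groups)


def diff_metrics(baseline, candidate, tolerance=1e-9):
    """Label each metric present in both runs.

    Assumption: higher scores are better, so a positive delta counts
    as an improvement.
    """
    labels = {}
    for name in sorted(baseline.keys() & candidate.keys()):
        delta = candidate[name] - baseline[name]
        if delta > tolerance:
            labels[name] = "improved"
        elif delta < -tolerance:
            labels[name] = "regressed"
        else:
            labels[name] = "unchanged"
    return labels
```

A diff view would only apply to groups with at least two runs, for example by comparing the first and last run in each group.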

Decisions and context live in `_notes/`.
