Metadata-Version: 2.4
Name: medeval-harness
Version: 0.1.0
Summary: Evaluation harness for the MedEval triage agent. Scores accuracy, safety, hallucination, cost, and latency.
Project-URL: Homepage, https://github.com/Rajkumar2002-Rk/MedEval
Project-URL: Repository, https://github.com/Rajkumar2002-Rk/MedEval
Project-URL: Issues, https://github.com/Rajkumar2002-Rk/MedEval/issues
Author-email: Raja <rajkumarn2002@gmail.com>
License-Expression: MIT
Keywords: ai,esi,evaluation,healthcare,llm,medical,triage
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Healthcare Industry
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Requires-Python: >=3.10
Requires-Dist: click>=8.1
Requires-Dist: httpx>=0.27
Requires-Dist: pydantic>=2.7
Requires-Dist: rich>=13
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Description-Content-Type: text/markdown

# medeval-harness

Evaluation harness for the [MedEval](https://github.com/Rajkumar2002-Rk/MedEval) triage agent.

Scores a running MedEval agent against a 50-case ESI dataset on:

- Exact and adjacent ESI level accuracy
- Under-triage and over-triage rates (the safety metrics)
- Hallucination rate (LLM-extracted facts not supported by the complaint text)
- Decision-path consistency
- Cost per case and per evaluation run
- Latency (p50 / p95)
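
As a rough illustration of how the accuracy, safety, and latency metrics above could be computed, here is a minimal sketch. It is not the harness's actual implementation: `CaseResult`, `percentile`, and `summarize` are hypothetical names, and the sample values are made up. It assumes that "adjacent" means within one ESI level of the reference, that under-triage means the agent assigned a less urgent level than the reference (a higher ESI number, since ESI 1 is the most urgent), and that percentiles use the nearest-rank method.

```python
import math
from dataclasses import dataclass


@dataclass
class CaseResult:
    """One scored case (hypothetical record shape, not the harness's own)."""
    expected_esi: int   # reference ESI level, 1 (most urgent) to 5 (least urgent)
    predicted_esi: int  # ESI level returned by the agent
    latency_ms: float   # wall-clock time for the case


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(results):
    """Aggregate per-case results into rates and latency percentiles."""
    n = len(results)
    exact = sum(r.predicted_esi == r.expected_esi for r in results)
    # Assumption: "adjacent" counts predictions within one level, exact matches included.
    adjacent = sum(abs(r.predicted_esi - r.expected_esi) <= 1 for r in results)
    # Under-triage: predicted level is less urgent (higher number) than the reference.
    under = sum(r.predicted_esi > r.expected_esi for r in results)
    # Over-triage: predicted level is more urgent (lower number) than the reference.
    over = sum(r.predicted_esi < r.expected_esi for r in results)
    latencies = [r.latency_ms for r in results]
    return {
        "exact_accuracy": exact / n,
        "adjacent_accuracy": adjacent / n,
        "under_triage_rate": under / n,
        "over_triage_rate": over / n,
        "latency_p50_ms": percentile(latencies, 50),
        "latency_p95_ms": percentile(latencies, 95),
    }


# Made-up sample data, for illustration only.
sample = [
    CaseResult(expected_esi=2, predicted_esi=2, latency_ms=800.0),
    CaseResult(expected_esi=3, predicted_esi=4, latency_ms=1200.0),
    CaseResult(expected_esi=1, predicted_esi=3, latency_ms=950.0),
    CaseResult(expected_esi=4, predicted_esi=3, latency_ms=700.0),
]
print(summarize(sample))
```

Under-triage is tracked separately from overall accuracy because the two error directions carry different risk: a prediction that is one level too urgent and one that is one level too relaxed both count as "adjacent", but only the latter delays care.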

## Install

```bash
pip install medeval-harness