Metadata-Version: 2.4
Name: traumatrial-match
Version: 0.0.2
Summary: Open infrastructure for matching trauma patients to clinical trials in real time. Pure-Python rule engine with clause-level reasoning trace and NEMSIS v3.5 ePCR adapter.
Author: traumatrial contributors
License: MIT
Project-URL: Homepage, https://github.com/jajjer/traumatrial
Project-URL: Repository, https://github.com/jajjer/traumatrial
Project-URL: Issues, https://github.com/jajjer/traumatrial/issues
Project-URL: Demo, https://traumatrial.vercel.app
Keywords: trauma,clinical-trials,EMS,NEMSIS,ePCR,eligibility,matching,healthcare
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Healthcare Industry
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: pydantic>=2.0
Provides-Extra: test
Requires-Dist: pytest>=7.0; extra == "test"
Provides-Extra: parse
Requires-Dist: anthropic>=0.40.0; extra == "parse"
Requires-Dist: httpx>=0.27.0; extra == "parse"

# traumatrial-match

Open infrastructure for trauma trial eligibility matching. Evaluates structured trauma trial inclusion/exclusion rules against patient records in <100ms and returns a clause-level reasoning trace.

MIT licensed. Synthetic data only — never PHI. Pure-Python core; pydantic for validation.

## 60-second example

```python
from traumatrial_match import Patient, Trial, Rule, match

patient = Patient(
    patient_id="P-001",
    age_years=34,
    sex="M",
    gcs=7,
    sbp_mmhg=82,
    hr_bpm=128,
    mechanism="blunt_mvc",
    trauma_activation_level=1,
    eta_minutes=4,
    pregnancy_status="not_applicable",
    anticoagulant_use=False,
    presumed_tbi=True,
    presumed_hemorrhage=True,
    presumed_intracranial_hemorrhage=False,
    spinal_injury_suspected=False,
)

trial = Trial(
    trial_id="NCT05638581",
    short_name="TROOP",
    title="Trauma Resuscitation With Low-Titer Group O Whole Blood",
    requires_efic=True,
    inclusion=[
        Rule(field="age_years", op="gte", value=15, hard=True),
        Rule(field="presumed_hemorrhage", op="eq", value=True, hard=True),
        Rule(field="trauma_activation_level", op="lte", value=1, hard=False),
    ],
    exclusion=[
        Rule(field="pregnancy_status", op="in",
             value=["pregnant", "unknown_could_be_pregnant"], hard=True),
    ],
)

result = match(patient, trial)
print(result.eligible, result.confidence)
# True 1.0
for clause in result.trace:
    mark = "HIT" if clause.hit else "MISS"
    print(f"  [{mark}] {clause.clause}  patient={clause.patient_value}")
```

## What's in the box

- **`Patient`** — pydantic model for a trauma bay patient snapshot.
- **`Trial`** — pydantic model for a trial's structured eligibility rules.
- **`Rule`** — a single inclusion or exclusion clause: field + operator + value + hard/soft.
- **`MatchResult`** — eligibility, confidence, and a complete clause-level reasoning trace.
- **`match(patient, trial)`** — evaluate one patient against one trial.
- **`match_all(patient, trials)`** — evaluate against many; sorted eligible-first, confidence desc.
- **`from_nemsis_xml(xml_str)`** — convert a NEMSIS v3.5 ePCR XML into `(Patient, NemsisConversionTrace)`. The trace records each Patient field as `extracted` / `inferred` / `defaulted` / `skipped` with a one-line reason. v0 mapping covers ~10 high-signal eFields; everything else is honestly defaulted. See `traumatrial_match/nemsis.py`.
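
The ordering described for `match_all` (eligible first, then confidence descending) can be sketched with a plain sort key. This is an illustration only, not the package's implementation: `Result` is a hypothetical stand-in for `MatchResult`, and the trial IDs are placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Result:
    """Hypothetical stand-in for MatchResult (only the fields needed here)."""
    trial_id: str
    eligible: bool
    confidence: float


def rank(results: List[Result]) -> List[Result]:
    # False sorts before True, so `not eligible` puts eligible results first;
    # negating confidence gives descending order within each group.
    return sorted(results, key=lambda r: (not r.eligible, -r.confidence))


results = [
    Result("T-A", eligible=False, confidence=0.0),
    Result("T-B", eligible=True, confidence=0.5),
    Result("T-C", eligible=True, confidence=1.0),
]
ranked_ids = [r.trial_id for r in rank(results)]
print(ranked_ids)
# ['T-C', 'T-B', 'T-A']
```

Because `sorted` is stable, results that tie on both keys keep their input order in this sketch.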

## Operators (8)

`eq`, `ne`, `gte`, `lte`, `gt`, `lt`, `in`, `not_in`.

`in` and `not_in` require a list value; the others require a scalar.
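
As a rough illustration of the list-versus-scalar rule, the eight operators can be modelled as a dispatch table over the standard-library `operator` module. The names `OPS`, `LIST_OPS`, and `evaluate` are assumptions made for this sketch, not the package's API.

```python
import operator
from typing import Any, Callable, Dict

# Hypothetical dispatch table: operator name -> predicate(patient_value, rule_value).
OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gte": operator.ge,
    "lte": operator.le,
    "gt": operator.gt,
    "lt": operator.lt,
    "in": lambda patient_value, allowed: patient_value in allowed,
    "not_in": lambda patient_value, blocked: patient_value not in blocked,
}

# Operators whose rule value must be a list; all others take a scalar.
LIST_OPS = {"in", "not_in"}


def evaluate(op: str, patient_value: Any, rule_value: Any) -> bool:
    """Apply one operator, rejecting a list/scalar mismatch up front."""
    if op not in OPS:
        raise ValueError(f"unknown operator: {op!r}")
    is_list = isinstance(rule_value, list)
    if op in LIST_OPS and not is_list:
        raise TypeError(f"{op!r} requires a list value")
    if op not in LIST_OPS and is_list:
        raise TypeError(f"{op!r} requires a scalar value")
    return OPS[op](patient_value, rule_value)


print(evaluate("gte", 34, 15))
# True
print(evaluate("in", "not_applicable", ["pregnant", "unknown_could_be_pregnant"]))
# False
```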

## Confidence rubric

- Any **hard inclusion missed** OR any **hard exclusion hit** → `eligible=False`, `confidence=0.0`.
- Otherwise → `eligible=True`. `confidence = soft_inclusion_hits / soft_inclusion_total`, or `1.0` if no soft inclusions.

The categorical signal is `eligible: bool`; magnitude is `confidence: float`. There is no `HIGH/MEDIUM/LOW` enum.
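
The rubric above can be written out as a small function. This is a minimal sketch under stated assumptions: `ClauseOutcome` and `score` are hypothetical names, and soft exclusions are ignored because the rubric does not mention them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ClauseOutcome:
    """Hypothetical record of one evaluated clause."""
    kind: str   # "inclusion" or "exclusion"
    hard: bool
    hit: bool   # True when the patient satisfies the clause


def score(outcomes: List[ClauseOutcome]) -> Tuple[bool, float]:
    """Return (eligible, confidence) following the rubric."""
    for o in outcomes:
        hard_inclusion_missed = o.hard and o.kind == "inclusion" and not o.hit
        hard_exclusion_hit = o.hard and o.kind == "exclusion" and o.hit
        if hard_inclusion_missed or hard_exclusion_hit:
            return False, 0.0
    soft = [o for o in outcomes if o.kind == "inclusion" and not o.hard]
    if not soft:
        return True, 1.0
    return True, sum(o.hit for o in soft) / len(soft)


# Two hard inclusions hit, one of two soft inclusions hit, hard exclusion not hit.
outcomes = [
    ClauseOutcome("inclusion", hard=True, hit=True),
    ClauseOutcome("inclusion", hard=True, hit=True),
    ClauseOutcome("inclusion", hard=False, hit=True),
    ClauseOutcome("inclusion", hard=False, hit=False),
    ClauseOutcome("exclusion", hard=True, hit=False),
]
print(score(outcomes))
# (True, 0.5)
```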

## Bundled corpus

The repo ships with 10 verified active trauma trials (TROOP, SWiFTCanada, ICECAP, SELECT-TBI, AEDH-MT, BOOST3, WEBSTER, FIT-BRAIN, Ketamine-TBI, ELASTIC) and 8 patient personas covering hemorrhage, TBI, anticoagulation, pregnancy exclusion, cardiac arrest, and pediatric exclusion. See `trials/` and `patients/`.

These are hand-written from the public clinicaltrials.gov criteria. **They are an approximation, not a clinical decision system.** PRs that improve fidelity, add trials, or extend the operator vocabulary are welcome.

## Install (dev)

```bash
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[test]"
pytest
```

## Run the precompute script

This generates the static match payloads consumed by the Next.js demo in `../demo/`.

```bash
python scripts/precompute.py
```

## Auto-import a trial from clinicaltrials.gov

Watch a real trial become structured rules in 10 seconds. The script fetches the trial from clinicaltrials.gov, sends the inclusion/exclusion text to Claude with our schema as the contract, validates the response with pydantic, and writes an `engine/trials/NCT….json` file. If a criterion can't be expressed in our 8-operator vocabulary, it goes into `_metadata.skipped_criteria` instead of being silently dropped.

```bash
pip install -e ".[parse]"
export ANTHROPIC_API_KEY=sk-ant-...   # or put it in a .env at the repo root
python scripts/parse_trial.py NCT05638581
python scripts/parse_trial.py NCT05638581 NCT04217551 NCT04995068 --overwrite
```

**The schema is the constraint that keeps the LLM honest.** Field types (int / bool / enum) and value ranges are injected into the system prompt, AND validated at load time by `Rule._value_must_match_field_type`. A hallucinated value like `trauma_activation_level eq "massive_hemorrhage_protocol"` (the LLM's first attempt at TROOP) fails pydantic validation, which feeds the error back to the model for a retry. Up to 3 attempts before giving up.
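
The validate-and-retry loop described above might look roughly like the following. It is a stdlib-only sketch, not the code in `scripts/parse_trial.py`: `validate_rule`, `parse_with_retries`, and the stubbed `ask` callable are hypothetical, and a plain type check stands in for the pydantic validator.

```python
from typing import Callable, List, Optional


def validate_rule(rule: dict) -> None:
    """Stand-in for schema validation: raise ValueError on a type mismatch."""
    if rule["field"] == "trauma_activation_level":
        value = rule["value"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(
                f"trauma_activation_level expects an int, got {value!r}"
            )


def parse_with_retries(
    ask_model: Callable[[Optional[str]], dict], max_attempts: int = 3
) -> dict:
    """Call the model, feeding each validation error back, up to max_attempts."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        rule = ask_model(feedback)
        try:
            validate_rule(rule)
        except ValueError as exc:
            feedback = str(exc)  # passed to the next attempt
            continue
        return rule
    raise RuntimeError(f"no valid rule after {max_attempts} attempts: {feedback}")


# Stubbed model: the first reply repeats the hallucination quoted above,
# the second reply is valid.
replies = iter([
    {"field": "trauma_activation_level", "op": "eq",
     "value": "massive_hemorrhage_protocol"},
    {"field": "trauma_activation_level", "op": "lte", "value": 1},
])
feedback_seen: List[Optional[str]] = []


def ask(feedback: Optional[str]) -> dict:
    feedback_seen.append(feedback)
    return next(replies)


rule = parse_with_retries(ask)
print(rule["op"], rule["value"], len(feedback_seen))
# lte 1 2
```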

Always hand-review the generated JSON before committing — the LLM is good but not perfect. Look for `gte` vs `gt` off-by-ones, soft-vs-hard misclassifications, and the `_metadata.skipped_criteria` list for things that need a schema extension.
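
As one way to start that review, a short script can surface the skipped criteria from a generated file. In this sketch only the `_metadata.skipped_criteria` key comes from the description above. The sample document, the placeholder entry, and the `skipped_criteria` helper are assumptions, and entries are treated as opaque values because their shape is not specified here.

```python
import json

# Hypothetical excerpt of a generated trial file. In practice this would be
# loaded from the JSON file that the parse script wrote.
generated = json.loads("""
{
  "trial_id": "NCT05638581",
  "_metadata": {
    "skipped_criteria": ["<criterion text that could not be expressed>"]
  }
}
""")


def skipped_criteria(trial_doc: dict) -> list:
    """Return the skipped criteria, or an empty list when the key is absent."""
    return trial_doc.get("_metadata", {}).get("skipped_criteria", [])


for item in skipped_criteria(generated):
    print("needs review:", item)
```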

## What this is NOT

- Not a clinical decision-support system.
- Not validated against real patient data.
- Not regulated, not certified, not BAA-able.
- Not a substitute for a research coordinator's clinical judgment.

It is a structured, testable, transparent **starting point** for talking about how trauma trial matching could be automated. Treat it that way.

## Contributing

See [`../CONTRIBUTING.md`](../CONTRIBUTING.md). PRs and issues welcome — particularly from trauma research coordinators, EMS data SMEs, and clinical trial operations folks who can tell us where the schema is wrong.

## License

MIT. See `../LICENSE`.
