Metadata-Version: 2.4
Name: praevius
Version: 0.1.0
Summary: Hospitalisation-risk predictor for elderly patients — an open, honestly-evaluated clinical decision-support research prototype.
Author-email: Rafael Zanarino <rafael@nodnex.com.br>
License: GPL-3.0-or-later
Project-URL: Homepage, https://praevius.nodnex.com.br
Project-URL: Repository, https://github.com/Zanarino/praevius
Keywords: machine-learning,clinical-decision-support,geriatrics,hospitalisation,frailty,scikit-learn,healthcare
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Healthcare Industry
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Scientific/Engineering :: Medical Science Apps.
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.23
Requires-Dist: pandas>=1.5
Requires-Dist: scikit-learn<1.5,>=1.4
Requires-Dist: imbalanced-learn>=0.10
Requires-Dist: matplotlib>=3.6
Requires-Dist: seaborn>=0.12
Requires-Dist: shap>=0.41
Provides-Extra: app
Requires-Dist: streamlit>=1.28; extra == "app"
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: ruff>=0.5; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Requires-Dist: streamlit>=1.28; extra == "dev"
Provides-Extra: publish
Requires-Dist: skops>=0.10; extra == "publish"
Requires-Dist: huggingface-hub>=0.16; extra == "publish"
Requires-Dist: twine>=5.0; extra == "publish"
Requires-Dist: build>=1.0; extra == "publish"
Dynamic: license-file

# Praevius — Hospitalization Risk Predictor for Elderly Patients

> A machine learning tool that helps health professionals identify elderly patients at higher risk of hospitalization — up to 1 and 3 years in advance.

**Language / Idioma:** [🇧🇷 Português](README.pt-BR.md) | 🇬🇧 English (you are here)

🤗 The trained 1-year model is published on [Hugging Face](https://huggingface.co/Zanarino/praevius-hospitalization-1year) · ▶️ **[Try the interactive demo](https://huggingface.co/spaces/Zanarino/praevius-demo)** (Gradio Space — no install, no data stored).

---

## What is this tool?

Praevius is an open-source clinical decision-support tool. It analyses routine clinical data collected from elderly patients — things like their walking speed, number of medications, cognitive scores, and history of falls — and produces a **hospitalization risk score**: a number between 0% and 100% that estimates how likely a patient is to need hospitalization within the next **1 year** or **3 years**.

Think of it as an early-warning system. The goal is not to replace clinical judgment, but to give clinicians an extra layer of information — a systematic way to flag patients who may need more attention before a crisis happens.

---

## Who is this for?

This tool is designed for **health professionals who work with elderly patients**: geriatricians, general practitioners, nurses, physiotherapists, and care coordinators. You do not need any background in data science or machine learning to use it.

If you are a **researcher or developer**, the codebase is fully open and documented — contributions are very welcome (see [Contributing](#contributing)).

---

## Why does this matter?

Hospitalizations in the elderly are often preventable. Studies show that many hospitalization events are preceded by a gradual decline in physical function, cognitive capacity, and social engagement — signals that are often present in clinical records but are hard to synthesize manually when managing a large caseload.

This tool automates that synthesis. By processing multiple clinical variables at once and comparing a patient's profile against patterns learned from historical data, it can surface patients who are drifting toward higher risk — before the situation becomes urgent.

---

## How does it work? (No technical background required)

Here is a plain-language explanation of the process:

**1. Learning from historical data**

The tool was trained on records from elderly patients, each of which included dozens of clinical measurements plus a record of whether that patient was hospitalized within 1 or 3 years. The algorithm studied these records (currently 117 visits from 30 unique patients — see [Limitations](#limitations)) and learned which combinations of factors tend to appear before a hospitalization.

This is called "supervised machine learning" — we showed the algorithm many examples with known outcomes, and it identified the patterns that distinguish high-risk from low-risk profiles.

**2. Scoring new patients**

When you provide the tool with a new patient's data, it compares that patient's clinical profile against the patterns it learned, and outputs a probability score. A score of 0.78 (78%) would mean: *"based on patients with similar profiles in our training data, there is approximately a 78% chance this patient will be hospitalized within the next year."*

**Important:** with the current amount of training data, these probabilities are **not yet calibrated** — the exact percentage must not be read literally (this was measured; see [Limitations](#limitations)). The tool therefore reports **risk bands** (low / moderate / high), and the band — not the exact number — is the reliable reading.

**3. Explaining the score**

The tool also explains *why* a patient received a particular score — which factors raised the risk and which ones lowered it. This is shown as a chart called a SHAP explanation (more on this in the [Understanding Your Results](#understanding-your-results) section). You can point to specific clinical findings that drove the score and discuss them with your team.

**4. What the tool does not do**

- It does not diagnose conditions
- It does not prescribe treatment
- It does not replace clinical assessment
- It cannot tell you *why* a patient will be hospitalized — only that the pattern of their data resembles patients who historically were

---

## Current status

This project is under active development. Here is where we are:

| Phase | Description | Status |
|-------|-------------|--------|
| **Phase 1** | Foundation — robust model training and evaluation | ✅ Complete |
| **Phase 2** | Data strategy — augment dataset with public health data | ⏳ Planned |
| **Phase 3** | Clinical interface — Streamlit web app for direct use | ✅ Complete |
| **Phase 4** | Open source packaging — installable, citable, CI/CD | 🔨 In progress |

**Phase 1 checklist:**
- [x] Core ML pipeline (data loading, feature engineering, model training, evaluation)
- [x] Resolve repository merge conflicts
- [x] **Fix target binarization** — hospitalization columns stored counts (0–3); converted to binary (0 = never, 1 = at least once)
- [x] **Fix data leakage** — switched from random row-level split to patient-level split; all visits from one patient now stay in the same set
- [x] **Patient-level cross-validation** — `StratifiedGroupKFold` ensures all visits from the same patient stay together; all 30 patients contribute to evaluation; k chosen automatically based on dataset size
- [x] **SMOTE for class imbalance** — applied inside each CV training fold only; never on test data; auto-disabled when minority class ≥ 40%
- [x] **Remove Decision Tree** — AUC near 0.5 on small datasets; unstable and not suitable for clinical use
- [x] **Two-phase training pipeline** — Phase A: patient-level cross-validation (honest evaluation, source of the committed charts) → Phase B: final model retrained on all data and saved as a Pipeline
- [x] **`predict.py`** — interactive script to run a risk prediction for a single new patient; outputs a risk summary chart + SHAP explanation
- [x] **SHAP explanations per patient** — waterfall chart showing which clinical factors increased or decreased each individual patient's risk score
- [x] **Hyperparameter tuning with nested cross-validation** — `RandomizedSearchCV` with patient-level CV selects hyperparameters; a nested (outer) CV then evaluates the tuned model on patients the search never saw. Honest tuned 1-year AUC: **0.739 ± 0.277** (3-year: 0.462 — below chance, do not use)
- [x] **3-year model limitations documented** — clearly communicated in Limitations section and performance table
- [x] **End-to-end Pipeline (training = prediction)** — all preprocessing (imputation, feature engineering, encoding, scaling) and the model are saved together as a single scikit-learn Pipeline. Preprocessing is now fitted only on training data inside every CV fold (removing preprocessing leakage), and `predict.py` applies exactly the same transformations as training — fixing a bug where hand-rolled preprocessing produced invalid scores
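
To make the patient-level split from the checklist concrete, here is a minimal, dependency-free sketch. The project itself relies on scikit-learn's `StratifiedGroupKFold`; the helper `patient_level_folds` below is hypothetical, groups visits by patient without any stratification, and runs on invented visit rows.

```python
from collections import defaultdict

def patient_level_folds(patient_ids, n_folds):
    """Assign every visit to a fold so that all visits of one patient share a fold.

    Hypothetical helper for illustration: it groups only, with no
    stratification by outcome (the project uses StratifiedGroupKFold for that).
    """
    visits_by_patient = defaultdict(list)
    for row_index, pid in enumerate(patient_ids):
        visits_by_patient[pid].append(row_index)

    fold_of_row = [None] * len(patient_ids)
    # Deal patients out round-robin, patients with most visits first,
    # so that fold sizes stay roughly balanced.
    ordered = sorted(visits_by_patient, key=lambda p: (-len(visits_by_patient[p]), p))
    for position, pid in enumerate(ordered):
        for row_index in visits_by_patient[pid]:
            fold_of_row[row_index] = position % n_folds
    return fold_of_row

# Invented example: 5 patients, 11 visits (one entry per visit).
patient_ids = [101, 101, 101, 102, 102, 103, 104, 104, 104, 105, 105]
folds = patient_level_folds(patient_ids, n_folds=2)

for test_fold in range(2):
    train_patients = {p for p, f in zip(patient_ids, folds) if f != test_fold}
    test_patients = {p for p, f in zip(patient_ids, folds) if f == test_fold}
    # No patient may appear on both sides: that overlap is the leakage being fixed.
    assert train_patients.isdisjoint(test_patients)
    print(f"fold {test_fold}: test patients {sorted(test_patients)}")
```

A row-level random split would fail the `isdisjoint` check as soon as two visits of the same patient landed on opposite sides, which is the leakage the checklist item describes.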

---

## Getting started

### What you need

- Python 3.10 or higher ([download here](https://www.python.org/downloads/))
- About 5 minutes for setup
- A dataset is needed **only if you want to retrain** the model — Praevius ships with pre-trained models, so you can score patients right after installing.

### Step 1 — Install Praevius

```bash
git clone https://github.com/Zanarino/praevius.git
cd praevius
pip install ".[app]"
```

This installs the `praevius` package, pulls every dependency, and creates two terminal commands: **`praevius`** (scoring) and **`praevius-app`** (the clinical interface). `pip install .` alone installs the core tool and the `praevius` command; the `[app]` extra adds [Streamlit](https://streamlit.io) and the web interface. (Once the package is published, `pip install praevius` will work directly.)

If you are not familiar with git, you can instead click the green **Code** button on GitHub, select **Download ZIP**, unzip the archive, and run `pip install ".[app]"` from inside the extracted folder. If you see errors, make sure your Python version is 3.10 or higher (`python --version`).

### Step 2 — Score a patient

Try it immediately with a fictional example patient — **no dataset required**, since the pre-trained models are bundled:

```bash
praevius --example
```

Run `praevius` with no arguments for an interactive prompt that asks for the patient's clinical values; anything you leave blank is filled with a typical training value. You get the 1-year risk band, the inter-model agreement indicator, a SHAP explanation of the factors, and charts saved to an `outputs/` folder.

### Step 3 — Open the clinical interface (optional)

```bash
praevius-app
```

Your browser opens a bilingual (Portuguese/English) form where you enter the patient's data — fields you don't have are filled automatically with typical training values. After acknowledging the research-prototype notice, you get the **1-year risk band**, an indicator of how much the three models agree on this patient, the 3-year panel (currently "in development" — no score is shown, by design), and a SHAP chart explaining which factors drove the result. A one-page **PDF report** of the assessment (generated in memory) can be downloaded for the patient's record. The interface runs entirely on your machine: **no patient data ever leaves it or is stored**.

### Step 4 — Retrain on your own data (advanced, optional)

Praevius ships with pre-trained models, so this is only needed to rebuild them on your own dataset. Place your data at `raw_data/Virtual_Patient_Models_Dataset.csv` (see [The Dataset](#the-dataset) for the expected format), then run:

```bash
python -m praevius.predictive_model
```

This loads and cleans the data, evaluates the models via honest cross-validation, retrains them on all data, and saves the pipelines + model card (`model_card.json`) **into the package** (so they ship with it), with performance charts and reports in the `outputs/` folder. It takes around 5 minutes on a standard laptop.

---

## Understanding your results

After retraining (Step 4), you will find the following files in the `outputs/` folder. Here is what each one means:

### `cv_summary_1year.csv` and `cv_summary_3years.csv` ← **Start here**

These are the most important files. They show the real model performance, measured honestly through cross-validation — a method that uses **all 30 patients** for evaluation, not just a small subset.

| Column | What it means |
|--------|---------------|
| `roc_auc_mean` | Mean AUC across all folds — this is the primary performance metric |
| `roc_auc_std` | Standard deviation — shows how stable performance is across different patient groups |
| `pr_auc_mean` | Mean Precision-Recall AUC — complements ROC-AUC, especially useful for imbalanced classes |
| `recall_mean` | Of the patients who were actually hospitalized, what proportion did the model correctly identify? |
| `f1_mean` | Harmonic mean of precision and recall at the decision threshold; complements AUC, which is threshold-free |

### `cv_fold_results_1year.csv` and `cv_fold_results_3years.csv`

These files show the results for each individual fold — useful for diagnosing which patient groups the model consistently gets right or wrong. High variability across folds indicates we need more data.

### `nested_cv_tuned_1year.csv` and `nested_cv_tuned_3years.csv`

These files contain the **nested cross-validation** results for the hyperparameter-tuned Gradient Boosting: for each outer fold, the column `inner_best_auc` is the search's internal selection score and `outer_test_auc` is the honest evaluation on patients the search never saw. Use the mean of `outer_test_auc` when discussing tuned-model performance.
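
As a rough sketch of how these two columns can be summarised, the snippet below reads a nested-CV file with Python's standard library. The column names `inner_best_auc` and `outer_test_auc` come from the description above; the `fold` column, the helper name `summarize_nested_cv` and every number in the sample are assumptions made for illustration.

```python
import csv
import io
import statistics

# Invented rows for illustration only; real values are written to
# outputs/nested_cv_tuned_1year.csv by training. The `fold` column is an assumption.
sample_csv = """fold,inner_best_auc,outer_test_auc
1,0.80,0.70
2,0.78,0.55
3,0.82,0.90
"""

def summarize_nested_cv(csv_file):
    """Return (mean inner selection score, mean outer honest score)."""
    rows = list(csv.DictReader(csv_file))
    inner = [float(r["inner_best_auc"]) for r in rows]
    outer = [float(r["outer_test_auc"]) for r in rows]
    return statistics.mean(inner), statistics.mean(outer)

inner_mean, outer_mean = summarize_nested_cv(io.StringIO(sample_csv))
print(f"selection score (optimistic): {inner_mean:.3f}")
print(f"outer test AUC (report this): {outer_mean:.3f}")
```

To run it on a real file, pass an open file handle instead of the in-memory sample, for example `open("outputs/nested_cv_tuned_1year.csv", newline="")`.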

### `calibration_1year.csv` / `calibration_curve_1year.png` (and 3-year equivalents)

The **calibration evaluation**: Brier score and expected calibration error (ECE) for the deployed model and for two recalibrated variants (sigmoid and isotonic), measured on out-of-fold predictions, plus the no-information baseline (`base_rate_brier`) and the resulting `display_decision` (`bands_only` or `percentage`) that the interfaces obey. The PNG is the reliability diagram — the closer a curve is to the diagonal, the more literally its percentages can be read.

**What is ROC-AUC?**

ROC-AUC (Area Under the Receiver Operating Characteristic Curve) is a standard way to measure how good a predictive model is. It tells you how well the model separates high-risk patients from low-risk ones — across all possible risk thresholds.

Think of it like this: if you randomly pick one patient who was hospitalized and one who was not, the AUC is the probability that the model will correctly rank the hospitalized one as higher-risk.

In clinical terms, AUC is similar to the combined sensitivity/specificity performance of a diagnostic test. A model with AUC 0.82 would correctly rank 82% of randomly chosen high-risk/low-risk pairs.
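
The pair-ranking interpretation can be checked directly. The sketch below counts correctly ranked pairs in plain Python; the function name `pairwise_auc` and the six patient scores are invented for illustration, and ties are counted as half a correct pair.

```python
def pairwise_auc(scores, labels):
    """AUC as the fraction of (hospitalized, not hospitalized) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one patient of each class")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))

# Invented scores for six patients; 1 = hospitalized, 0 = not.
scores = [0.9, 0.4, 0.7, 0.2, 0.6, 0.1]
labels = [1,   1,   0,   0,   0,   0]
print(pairwise_auc(scores, labels))  # 6 of 8 pairs ranked correctly -> 0.75
```

Here the patient scored 0.9 outranks all four non-hospitalized patients, while the one scored 0.4 outranks only two of them, giving 6 correct pairs out of 8.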

| AUC value | What it means |
|-----------|---------------|
| 1.00 | Perfect — never makes a mistake |
| 0.90–0.99 | Excellent discrimination |
| 0.80–0.89 | Very good discrimination |
| 0.70–0.79 | Good discrimination |
| 0.60–0.69 | Fair — use with caution |
| 0.50 | No better than chance (equivalent to a coin flip) |
| < 0.50 | Performing worse than chance — something is wrong |

**Current model performance:**

> ⚠️ **How to read these numbers**
>
> All evaluation is by **patient-level cross-validation** — every patient is scored by a model that never saw them, and the mean ± std across folds is the honest estimate of real-world performance. There is no separate train/test split; the charts further down are drawn from the same honest cross-validation (ROC) or the final model (feature importance).
>
> The previous, inflated AUC of 0.816 (before the Phase 1 fixes) had two causes: row-level splitting, which leaked visits from the same patient into both training and test sets, and hospitalization counts being treated as separate categories instead of a binary target. Lower, honest numbers are better than higher, misleading ones.

**Current cross-validated performance (30 patients, 5-fold, mean ± std):**

| Model | 1-year ROC-AUC | 3-year ROC-AUC |
|-------|----------------|----------------|
| Logistic Regression | 0.389 ± 0.171 | 0.476 ± 0.251 |
| Random Forest | 0.541 ± 0.256 | 0.502 ± 0.203 |
| Gradient Boosting (default) | 0.659 ± 0.197 | 0.548 ± 0.206 |
| **Gradient Boosting (tuned, nested CV)** ¹ | **0.739 ± 0.277** | **0.462 ± 0.159** |

> ¹ The "tuned" row comes from **nested cross-validation**: the hyperparameter search runs inside each outer training fold, and the chosen model is evaluated on outer-fold patients the search never saw. This is the honest estimate of the *tuning procedure* — unlike the search's own best score (0.776 for 1 year, saved in `best_params_*.csv`), which is a selection score and optimistically biased by construction.
>
> Two honest observations from the nested CV: the 1-year result has very high fold-to-fold variance (one fold scored 0.32 while the others scored 0.61–1.00 — see `nested_cv_tuned_1year.csv`), and the **3-year tuned model performs below chance (0.462)** — tuning does not generalise at this horizon, reinforcing that the 3-year score must not be used.
>
> These numbers are slightly lower than previously reported because a subtle form of preprocessing leakage was removed: imputation, feature engineering and scaling are now fitted **only on the training data of each fold**, instead of on the full dataset.

**Probability calibration (decides what the interfaces display):**

Calibration — whether a "70%" score really means a 70% chance — was measured with patient-level cross-validation, comparing the deployed model against Platt (sigmoid) and isotonic recalibration. Result for the 1-year champion: Brier score 0.160, *worse* than the no-information baseline of 0.153 (always predicting the prevalence), with an expected calibration error of 0.123; recalibration does not fix this at the current sample size. **Decision, recorded in the model card and obeyed by all interfaces: display risk bands (low / moderate / high) with the percentage de-emphasised.**

![Reliability diagram for the 1-year model](docs/img/calibration.png)

*The reliability diagram above: the closer a curve sits to the diagonal, the more literally its percentages can be read. (The full numbers regenerate to `outputs/calibration_1year.csv` when you run training.)*
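
For readers who want to see what the Brier comparison involves, here is a minimal sketch in plain Python. The Brier score and the prevalence baseline are standard definitions; the `display_decision` rule shown is a hypothetical stand-in (the project's actual rule is recorded in the model card), and the ten predictions are invented.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)

def base_rate_brier(outcomes):
    """Brier score of always predicting the prevalence; equals p * (1 - p)."""
    prevalence = sum(outcomes) / len(outcomes)
    return brier_score([prevalence] * len(outcomes), outcomes)

def display_decision(model_brier, baseline_brier):
    """Hypothetical rule: show percentages only if the model beats the baseline."""
    return "percentage" if model_brier < baseline_brier else "bands_only"

# Invented out-of-fold predictions for ten visits (2 hospitalized, 8 not).
outcomes = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probabilities = [0.6, 0.2, 0.5, 0.1, 0.1, 0.3, 0.2, 0.1, 0.4, 0.1]

print(brier_score(probabilities, outcomes), base_rate_brier(outcomes))
# Measured 1-year values quoted above: 0.160 (model) vs 0.153 (baseline).
print(display_decision(0.160, 0.153))  # -> bands_only
```

Lower Brier scores are better, so a model that scores above the prevalence baseline adds no usable probability information, which is why the measured values lead to bands.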

> **Why is the variance so high (± 0.20+)?** With only ~6 patients per test fold, one difficult patient can swing the AUC by 0.2 or more. This is expected and honest: it means the model's estimates are not yet stable enough for clinical decisions. The variance should shrink as more patients are added to the dataset. Our cross-validation code already scales automatically: it selects 5-fold for 30–99 patients, 10-fold for 100–299, and leave-one-out (LOO) for fewer than 20.
>
> **Do not use either model for clinical decisions at this stage.** This is a research prototype under active development.
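
The size of those swings follows from simple counting. The sketch below lists every AUC value a tiny test fold can produce, under the simplifying assumption that each patient contributes one scored row and no scores are tied (the real folds are evaluated on visits, so this is only an illustration). The function name `possible_aucs` is hypothetical.

```python
from itertools import combinations

def possible_aucs(n_patients, n_positive):
    """All AUC values a tie-free ranking of n_patients can produce."""
    n_negative = n_patients - n_positive
    values = set()
    # Choose which rank positions (0 = lowest score) the positives occupy.
    for positive_ranks in combinations(range(n_patients), n_positive):
        # A positive at rank r outranks every negative below it; `index`
        # positives sit below it, so the negatives below number r - index.
        correct_pairs = sum(rank - index for index, rank in enumerate(positive_ranks))
        values.add(correct_pairs / (n_positive * n_negative))
    return sorted(values)

# A fold of 6 patients with a single hospitalized patient:
print(possible_aucs(6, 1))  # -> [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
```

With one hospitalized patient among six, the fold AUC can only move in steps of 0.2, so shifting that one patient by a single rank position changes the result by 0.2.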

---

### ROC curve — cross-validated (honest)

![Cross-validated ROC curve for the 1-year horizon](docs/img/roc_cross_validated.png)

The ROC curve plots True Positive Rate (the share of hospitalized patients we correctly flag) against False Positive Rate (the share of non-hospitalized patients we incorrectly flag) at every possible risk threshold. This curve is **honest**: it is built from the pooled out-of-fold cross-validation predictions, so every point comes from a model scoring a patient it never trained on.

- A curve that hugs the **top-left corner** = excellent model
- A curve that follows the **diagonal dotted line** = model is no better than chance
- The **AUC** is the area under the curve — larger area means better performance
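
As an illustration of how such a curve is assembled, the sketch below turns a list of pooled scores into ROC points and integrates the area with the trapezoid rule. It is a plain-Python approximation of what a plotting library does, not the project's plotting code; the function name `roc_points` and the six scores are invented.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; each distinct score is one threshold.
    for threshold in sorted(set(scores), reverse=True):
        flagged = [y for s, y in zip(scores, labels) if s >= threshold]
        true_pos = sum(flagged)
        false_pos = len(flagged) - true_pos
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points

# Invented pooled out-of-fold scores; 1 = hospitalized, 0 = not.
scores = [0.9, 0.7, 0.6, 0.4, 0.2, 0.1]
labels = [1,   0,   0,   1,   0,   0]
points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")

# Area under the curve by the trapezoid rule.
area = sum((x2 - x1) * (y1 + y2) / 2 for (x1, y1), (x2, y2) in zip(points, points[1:]))
print(f"AUC={area:.2f}")
```

Each time the threshold drops past a hospitalized patient the curve steps up, and each time it drops past a non-hospitalized patient the curve steps right.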

---

### Feature importance — final model

![Feature importance for the 1-year final model](docs/img/feature_importance.png)

This chart shows which clinical variables had the most influence on the **final model** (the champion Gradient Boosting pipeline trained on all patients). The longer the bar, the more important that variable was.

This is clinically useful for two reasons:
1. **Sanity check:** The top variables should make clinical sense. If something unexpected appears at the top (like patient ID), it signals a data problem.
2. **Clinical insight:** The chart may confirm or reveal which factors in your patient population most strongly predict hospitalization.

Variables you are likely to see at the top: frailty status (Fried criteria), number of comorbidities, gait speed, number of medications, and MMSE score.

---

## The dataset

### Format

The model expects a CSV file at `raw_data/Virtual_Patient_Models_Dataset.csv` with one row per patient visit. The key variables expected are listed below. All names are case-sensitive.

| Variable | Type | Description |
|----------|------|-------------|
| `part_id` | Integer | Patient identifier |
| `age` | Integer | Age in years |
| `gender` | String | Patient gender |
| `fried` | String | Frailty status: `Non frail`, `Pre-frail`, or `Frail` |
| `katz_index` | Integer | Katz Index of Independence in Activities of Daily Living (0–6) |
| `iadl_grade` | Integer | Instrumental Activities of Daily Living score |
| `gait_speed_4m` | Float | Gait speed over 4 metres (m/s) |
| `raise_chair_time` | Float | Time to rise from chair 5 times (seconds) |
| `falls_one_year` | Integer | Number of falls in the past year |
| `comorbidities_count` | Integer | Total number of comorbidities |
| `medication_count` | Integer | Number of medications |
| `mmse_total_score` | Integer | Mini-Mental State Examination score (0–30) |
| `depression_total_score` | Integer | Depression scale score |
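| `hospitalization_one_year` | Integer | **Target:** number of hospitalizations within 1 year (0–3 in the source data); binarized by the pipeline to 0 = never, 1 = at least once |
| `hospitalization_three_years` | Integer | **Target:** number of hospitalizations within 3 years (0–3 in the source data); binarized by the pipeline to 0 = never, 1 = at least once |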

Values coded as `999` are treated as missing and imputed automatically.
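
A minimal sketch of what this cleaning might look like, using only the standard library. The column names come from the table above; the helper name `load_visits`, the sample rows, and the exact string comparison against `999` are assumptions, and the real pipeline goes on to impute the missing values rather than leaving them as `None`.

```python
import csv
import io

MISSING_CODE = "999"  # sentinel described above as meaning "missing"
TARGETS = ("hospitalization_one_year", "hospitalization_three_years")

# Invented rows using column names from the table above.
sample_csv = """part_id,age,gait_speed_4m,hospitalization_one_year,hospitalization_three_years
1,78,0.9,0,2
1,79,999,1,3
2,83,0.6,0,0
"""

def load_visits(csv_file):
    """Read visits, map the 999 sentinel to None and binarize the target counts."""
    visits = []
    for row in csv.DictReader(csv_file):
        cleaned = {k: (None if v == MISSING_CODE else v) for k, v in row.items()}
        for target in TARGETS:
            if cleaned[target] is not None:
                # Counts (0-3) become 0 = never, 1 = at least once.
                cleaned[target] = int(int(cleaned[target]) > 0)
        visits.append(cleaned)
    return visits

visits = load_visits(io.StringIO(sample_csv))
print(visits[1])
```

In the second sample row the gait speed becomes `None` and the 3-year count of 3 becomes 1, mirroring the binarization described in the Phase 1 checklist.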

### Privacy

**Patient data must never be committed to this repository.** The `raw_data/` folder is excluded by `.gitignore`. When contributing, always verify that your commit does not contain real patient records.

In practice, this policy means:

- The exploratory notebook (`exploratory_analysis_dataset.ipynb`) is committed **with its outputs cleared**, so no data rows appear in the repository. If you run it locally, clear the outputs before committing (`Cell → All Output → Clear` in Jupyter, or `jupyter nbconvert --clear-output --inplace <notebook>`).
- The trained pipeline files in `models/` contain only **aggregate statistics** of the training data (per-column medians, modes, means and standard deviations used for imputation and scaling) — never individual patient records.
- The `outputs/` folder (training charts, CSVs, reports) is **git-ignored** and regenerated locally by `python -m praevius.predictive_model`. Only a few curated illustrative charts are committed, under `docs/img/`; these are either aggregate (cross-validated ROC, feature importance, calibration) or based on a **fictional example patient** — never on a real record.

### Sample / synthetic data

A synthetic dataset generated to match the statistical properties of real data (without containing any real patient records) will be provided in Phase 2 of this project. This will allow anyone to run and test the tool without access to clinical data.

---

## Limitations

Being transparent about what this tool cannot do is as important as explaining what it can do.

**1. Small training dataset**
The current model was trained on 117 records from 30 unique patients. This is a small sample by machine learning standards. The models may not generalise well to patient populations that differ from the training group, and performance estimates have wide uncertainty margins — as evidenced by the high standard deviation (± 0.20+) in cross-validation.

**2. The 3-year model is not yet reliable**
The 3-year hospitalization prediction model achieves a mean cross-validated AUC of approximately 0.55, marginally above random chance, with high fold-to-fold variance (± 0.21). Worse, the hyperparameter-tuned version evaluated by nested cross-validation scores **below chance (0.462)** at this horizon. Predicting events 3 years ahead from this sample size is not yet feasible. We include the model for completeness and future development, but **do not use the 3-year score for clinical decisions at this stage.**

**3. Overfitting is visible**
The small dataset makes the models prone to overfitting, and the evidence is in cross-validation: individual folds swing widely (one 1-year fold scores 0.32 while others reach 1.00). Part of this fold-to-fold variance comes from the small test folds themselves, but it is also what a model that memorises rather than generalises looks like. It is expected to shrink as more patients are added.

**4. Correlation, not causation**
The model finds statistical patterns. It cannot tell you *why* a patient is at high risk — only that their data profile resembles patients who were hospitalized. Always interpret the score in the context of your full clinical assessment.

**5. Population specificity**
Models trained on one population may not perform equally well on another. Before relying on this tool in a new clinical setting, validate its performance on your own data.

**6. This is a decision-support tool, not a decision-making tool**
Risk scores should inform — not replace — clinical judgment. A patient scored at 30% may still warrant intervention based on factors not captured in the data. A patient scored at 80% may have circumstances that make hospitalization unlikely.

**7. The probabilities are not calibrated**
A formal calibration evaluation (patient-level cross-validation; see `outputs/calibration_1year.csv`) showed that the exact percentages cannot be read literally: the deployed model's Brier score is worse than a no-information baseline, and recalibration (Platt/isotonic) does not fix this at the current sample size. All interfaces therefore display **risk bands** (low / moderate / high) and de-emphasise the percentage. This is expected to improve as more patients are added (Phase 2).

---

## Project roadmap

```
Phase 1 — Foundation (complete)
├── ✅ Core ML pipeline
├── ✅ Merge conflict resolution
├── ✅ Fix target binarization (counts → binary)
├── ✅ Fix data leakage (patient-level split)
├── ✅ Patient-level cross-validation (StratifiedGroupKFold, auto k-selection)
├── ✅ SMOTE for class imbalance (inside CV folds only)
├── ✅ Remove Decision Tree (AUC ~0.5 on small N)
├── ✅ Two-phase pipeline (CV eval → final all-data model)
├── ✅ Single-patient prediction script (predict.py)
├── ✅ SHAP explanation per patient
├── ✅ Hyperparameter tuning (honest tuned 1-year AUC via nested CV: 0.739 ± 0.277)
├── ✅ Nested cross-validation — honest evaluation of the tuning procedure
├── ✅ 3-year model limitations clearly documented
└── ✅ End-to-end Pipeline — training and prediction share identical preprocessing

Phase 2 — Data strategy
├── ⬜ Augment with public datasets (NHANES, ELSA-Brasil, SHARE, InCHIANTI)
├── ⬜ Synthetic data generation (CTGAN) for open distribution
└── ⬜ Federated learning design for multi-institution contribution

Phase 3 — Clinical interface
├── ✅ Shared scoring engine, model card and inter-model agreement indicator
├── ✅ Probability calibration assessed — decision: interfaces show risk bands
├── ✅ Streamlit web application (local-only; disclaimer gate; no data stored)
├── ✅ Single-patient risk assessment form (blank fields imputed by the Pipeline)
├── ✅ SHAP explanation chart per patient
├── ✅ PDF report generation (in-memory, downloadable — nothing written to disk)
└── ✅ Portuguese / English bilingual interface

Phase 4 — Open source packaging
├── ✅ Installable Python package (`pip install`) with `praevius` / `praevius-app` commands
├── ✅ Automated tests (pytest + GitHub Actions CI)
├── ✅ CONTRIBUTING.md + CODE_OF_CONDUCT.md
├── ✅ GPL v3 license
├── ✅ CITATION.cff for academic citation
├── ✅ Ethics statement (ETHICS.md) + design rationale (docs/technical-decisions.md)
├── ✅ CHANGELOG.md
├── ✅ 1-year model published to Hugging Face
└── ⬜ Publish to PyPI · tag a citable release (Zenodo DOI)
```

---

## Contributing

Contributions are welcome from both **clinicians** and **developers**.

**If you are a clinician:**
- Share feedback on whether the outputs make clinical sense
- Report variables that are commonly collected in your setting but missing from the model
- Help validate the tool on new patient populations

**If you are a developer or data scientist:**
- See the Phase 1–4 roadmap above for what needs to be built
- Open an issue to discuss your idea before opening a pull request
- Follow the existing code style and document everything for a non-technical audience

**Getting started:**

```bash
git clone https://github.com/Zanarino/praevius.git
cd praevius
pip install -e ".[dev]"   # editable install + test/build tooling
python -m pytest tests/   # run the test suite
```

See the [CONTRIBUTING.md](CONTRIBUTING.md) guide for detailed contribution instructions, and please follow our [Code of Conduct](CODE_OF_CONDUCT.md). For the rationale behind the modelling choices (patient-level CV, the leakage fixes, why probabilities are shown as bands), see [docs/technical-decisions.md](docs/technical-decisions.md). Notable changes are tracked in the [CHANGELOG](CHANGELOG.md).

---

## Ethics statement

This tool is designed to **support** clinical decision-making, not to automate it. We believe that:

- Risk scores must always be explained, not just reported
- Clinicians must retain full authority over care decisions
- Patient data must be handled in accordance with applicable privacy law (LGPD in Brazil, GDPR in Europe, HIPAA in the US)
- Model limitations must be communicated clearly and honestly to all users
- The tool should never be used as a basis to withhold care from a patient

The full, bilingual ethics statement — covering intended use, data governance, fairness, accountability and our honesty commitments — is in **[ETHICS.md](ETHICS.md)**.

---

## License

This project is licensed under the **GNU General Public License v3.0 or later (GPL-3.0-or-later)**. You are free to use, modify, and distribute it, including for commercial purposes, but any derivative work you distribute must also be released under the GPL, with its source code. This ensures the tool always remains free and open for the clinical community.

See the [LICENSE](LICENSE) file for details.

---

## Citation

If you use this tool in research or clinical work, please cite it as:

```
Zanarino, R. (2026). Praevius: Hospitalization Risk Predictor for Elderly Patients.
GitHub. https://github.com/Zanarino/praevius
```

A [`CITATION.cff`](CITATION.cff) file is included in the repository for automated citation (GitHub's "Cite this repository" button).

---

## Author

**Rafael Zanarino**
Data Science for Healthcare | 2026

*Built to improve the care of elderly patients.*

---

> **Disclaimer:** This tool is intended for research and clinical decision-support purposes only. It is not a certified medical device and must not be used as a sole basis for clinical decisions. Always consult applicable regulations before deploying in a clinical environment.
