Metadata-Version: 2.4
Name: clinikit
Version: 0.1.0
Summary: A lightweight, sklearn-compatible Python toolkit for tabular machine learning.
Project-URL: Homepage, https://github.com/clinikit/clinikit
Project-URL: Repository, https://github.com/clinikit/clinikit
Project-URL: Issues, https://github.com/clinikit/clinikit/issues
Project-URL: Changelog, https://github.com/clinikit/clinikit/blob/main/CHANGELOG.md
Project-URL: Documentation, https://clinikit.readthedocs.io
Author: Berat Kaan SEVEN
Author-email: beratkaanseven@gmail.com
License: MIT
License-File: LICENSE
Keywords: calibration,classification,fairness,label-noise,machine-learning,scikit-learn,selective-classification,tabular-data
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Typing :: Typed
Requires-Python: <3.14,>=3.10
Requires-Dist: catboost>=1.2
Requires-Dist: imbalanced-learn>=0.12
Requires-Dist: jinja2>=3.1
Requires-Dist: joblib>=1.4
Requires-Dist: lightgbm>=4.0
Requires-Dist: matplotlib>=3.8
Requires-Dist: numpy>=1.26
Requires-Dist: pandas>=2.2
Requires-Dist: pyyaml>=6.0
Requires-Dist: scikit-learn>=1.4
Requires-Dist: scipy>=1.11
Requires-Dist: typer>=0.9
Requires-Dist: xgboost>=2.0
Provides-Extra: active
Requires-Dist: modal-python>=0.4; extra == 'active'
Provides-Extra: advanced
Requires-Dist: feature-engine>=1.8; extra == 'advanced'
Requires-Dist: interpret>=0.6; extra == 'advanced'
Requires-Dist: ngboost>=0.5; extra == 'advanced'
Provides-Extra: all
Requires-Dist: autogluon-tabular>=1.1; extra == 'all'
Requires-Dist: cleanlab>=2.7; extra == 'all'
Requires-Dist: feature-engine>=1.8; extra == 'all'
Requires-Dist: flaml>=2.2; extra == 'all'
Requires-Dist: furo>=2024.0; extra == 'all'
Requires-Dist: hatch>=1.13; extra == 'all'
Requires-Dist: interpret>=0.6; extra == 'all'
Requires-Dist: lime>=0.2.0; extra == 'all'
Requires-Dist: mapie>=0.9; extra == 'all'
Requires-Dist: modal-python>=0.4; extra == 'all'
Requires-Dist: mypy>=1.10; extra == 'all'
Requires-Dist: myst-parser>=2.0; extra == 'all'
Requires-Dist: nbsphinx>=0.9; extra == 'all'
Requires-Dist: ngboost>=0.5; extra == 'all'
Requires-Dist: pre-commit>=3.7; extra == 'all'
Requires-Dist: pytest-cov>=5.0; extra == 'all'
Requires-Dist: pytest-xdist>=3.5; extra == 'all'
Requires-Dist: pytest>=8.0; extra == 'all'
Requires-Dist: ruff>=0.6; extra == 'all'
Requires-Dist: sdv>=1.17; extra == 'all'
Requires-Dist: shap>=0.46; extra == 'all'
Requires-Dist: sphinx-copybutton>=0.5; extra == 'all'
Requires-Dist: sphinx>=7.0; extra == 'all'
Requires-Dist: tabpfn>=2.0; extra == 'all'
Provides-Extra: automl
Requires-Dist: autogluon-tabular>=1.1; extra == 'automl'
Requires-Dist: flaml>=2.2; extra == 'automl'
Requires-Dist: tabpfn>=2.0; extra == 'automl'
Provides-Extra: conformal
Requires-Dist: mapie>=0.9; extra == 'conformal'
Provides-Extra: dev
Requires-Dist: hatch>=1.13; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pre-commit>=3.7; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest-xdist>=3.5; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.6; extra == 'dev'
Provides-Extra: diagnostics
Requires-Dist: cleanlab>=2.7; extra == 'diagnostics'
Provides-Extra: docs
Requires-Dist: furo>=2024.0; extra == 'docs'
Requires-Dist: myst-parser>=2.0; extra == 'docs'
Requires-Dist: nbsphinx>=0.9; extra == 'docs'
Requires-Dist: sphinx-copybutton>=0.5; extra == 'docs'
Requires-Dist: sphinx>=7.0; extra == 'docs'
Provides-Extra: explain
Requires-Dist: lime>=0.2.0; extra == 'explain'
Requires-Dist: shap>=0.46; extra == 'explain'
Provides-Extra: synthetic
Requires-Dist: sdv>=1.17; extra == 'synthetic'
Description-Content-Type: text/markdown

# clinikit

Prepared by Berat Kaan SEVEN

A lightweight, sklearn-compatible Python toolkit for tabular machine
learning. `clinikit` bundles 14 hybrid classifiers, 5 experiment
protocols, calibration utilities, label-noise diagnostics, fairness
audits, and structured HTML reports behind a single drop-in package.

For research and development use only. `clinikit` is an integration
toolkit: it is not a regulated product and does not introduce original
methods. See [`CITATIONS.md`](CITATIONS.md) for source-method references.

[![CI](https://github.com/clinikit/clinikit/actions/workflows/ci.yml/badge.svg)](https://github.com/clinikit/clinikit/actions/workflows/ci.yml)
[![codecov](https://codecov.io/gh/clinikit/clinikit/branch/main/graph/badge.svg)](https://codecov.io/gh/clinikit/clinikit)
[![PyPI version](https://img.shields.io/pypi/v/clinikit.svg)](https://pypi.org/project/clinikit/)
[![Python versions](https://img.shields.io/pypi/pyversions/clinikit.svg)](https://pypi.org/project/clinikit/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Documentation Status](https://readthedocs.org/projects/clinikit/badge/?version=latest)](https://clinikit.readthedocs.io/en/latest/?badge=latest)

---

## Why clinikit

`clinikit` is a complement to existing libraries, not a competitor.

| Library              | Focus                                      | Why clinikit is different                                                              |
| -------------------- | ------------------------------------------ | -------------------------------------------------------------------------------------- |
| scikit-learn         | General-purpose ML                         | Adds curated experiment protocols, audit utilities, and structured reporting           |
| Cleanlab             | Label noise only                           | Combines Cleanlab, neighborhood conflict, and leave-one-out in one diagnostics module  |
| MAPIE                | Conformal prediction only                  | Includes selective classification as one of 14 bundled models                          |
| Fairlearn / AIF360   | Fairness only                              | The `audit` module bundles fairness, leakage, and documentation helpers                |
| AutoGluon            | AutoML                                     | Library-first; thin AutoML wrappers exist but no auto-magic by default                 |
| PyHealth             | Deep learning for sequence / multimodal    | Tabular-only, classical ML focused, lightweight                                        |

---

## Installation

```bash
pip install clinikit
```

Optional dependency groups:

```bash
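pip install "clinikit[diagnostics]"   # Cleanlab-based label-noise tools
pip install "clinikit[explain]"       # SHAP and LIME wrappers
pip install "clinikit[automl]"        # TabPFN, FLAML, AutoGluon wrappers
pip install "clinikit[synthetic]"     # CTGAN / TVAE wrappers
pip install "clinikit[conformal]"     # MAPIE conformal prediction
pip install "clinikit[active]"        # modAL active learning
pip install "clinikit[advanced]"      # feature-engine, interpret, NGBoost
pip install "clinikit[dev]"           # pytest, ruff, mypy, pre-commit, hatch
pip install "clinikit[docs]"          # Sphinx, Furo, MyST
pip install "clinikit[all]"           # Everything, including dev and docs tooling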
```

Supported Python versions: 3.10, 3.11, 3.12, 3.13.
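
To see which optional groups are usable in the current environment, a small standard-library check can help. The sketch below is illustrative only: the `EXTRA_MODULES` mapping and both helper functions are hypothetical, not part of `clinikit`, and the import names are assumptions inferred from the PyPI package names.

```python
from importlib.util import find_spec

# Hypothetical mapping from extras to top-level import names.
# The import names are assumptions inferred from the PyPI package names.
EXTRA_MODULES = {
    "diagnostics": ["cleanlab"],
    "explain": ["shap", "lime"],
    "automl": ["tabpfn", "flaml", "autogluon"],
    "synthetic": ["sdv"],
    "conformal": ["mapie"],
}


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        return False


def extras_status(extra_modules: dict) -> dict:
    """Map each extra to True only when every module it needs is available."""
    return {
        extra: all(is_installed(name) for name in modules)
        for extra, modules in extra_modules.items()
    }


if __name__ == "__main__":
    for extra, ok in extras_status(EXTRA_MODULES).items():
        print(f"clinikit[{extra}]: {'installed' if ok else 'missing'}")
```

An extra reported as missing can be added later with the matching `pip install "clinikit[...]"` command, without reinstalling the base package.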

---

## Quickstart

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from clinikit.datasets import load_pima
from clinikit.metrics import sensitivity, specificity
from clinikit.models import RuleAugmentedClassifier

X, y = load_pima(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = RuleAugmentedClassifier(base_estimator=LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Sensitivity:", sensitivity(y_test, y_pred))
print("Specificity:", specificity(y_test, y_pred))
```

For a complete walkthrough, see [`examples/quickstart.ipynb`](examples/quickstart.ipynb)
or open it in
[Colab](https://colab.research.google.com/github/clinikit/clinikit/blob/main/examples/quickstart.ipynb).
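
The two metrics printed in the Quickstart follow the standard confusion-matrix definitions: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). The pure-Python sketch below illustrates those definitions. It is not the `clinikit.metrics` implementation, and the function names and the `positive` parameter are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count (TP, FP, TN, FN) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_sketch(y_true, y_pred, positive=1):
    """True-positive rate: TP / (TP + FN); NaN when there are no positives."""
    tp, _, _, fn = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def specificity_sketch(y_true, y_pred, positive=1):
    """True-negative rate: TN / (TN + FP); NaN when there are no negatives."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred, positive)
    return tn / (tn + fp) if (tn + fp) else float("nan")


if __name__ == "__main__":
    # Toy labels: 4 positives (3 found), 6 negatives (4 correctly rejected).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
    print("Sensitivity:", sensitivity_sketch(y_true, y_pred))
    print("Specificity:", specificity_sketch(y_true, y_pred))
```

On the toy labels, sensitivity is 3/4 and specificity is 4/6, so a model can look strong on one metric and weaker on the other.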

---

## What is in the box

### 14 hybrid classifiers (`clinikit.models`)

All are sklearn-compatible and pass `sklearn.utils.estimator_checks.check_estimator`.

- `RuleAugmentedClassifier`
- `BoundaryRefineClassifier`
- `SubgroupThresholdClassifier`
- `ErrorAwareCalibrator`
- `MonotonicBooster`
- `HardSampleWeightedEnsemble`
- `ClassConditionalImputer`
- `CrossDistributionDistiller`
- `SelectiveClassifier`
- `InstanceAdaptiveThreshold`
- `DialecticalEnsemble`
- `LatentSubtypeRouter`
- `IterativeLabelRefiner`
- `DualViewCoTrainer`

### Supporting modules

- `preprocessing` — imputers, scalers, outlier flags, missing indicators
- `metrics` — sensitivity, specificity, NPV, PPV, F2, MCC, Brier, ECE
- `curves` — ROC, PR, calibration, Decision Curve Analysis
- `protocols` — 5 experiment protocols (Defensible, MaxScore, OriginalOnly, Deployment, Audit)
- `leaderboard` — experiment tracking CSV with 38 columns
- `report` — HTML structured report generator (Jinja2 templates)
- `audit` — leakage detection, subgroup fairness, documentation checks
- `governance` — audit-trail manifest templates (documentation only)
- `reproducibility` — manifest files (data hash + config + library versions)
- `datasets` — UCI benchmarks (PIMA, Wisconsin, UCI Heart, Frankfurt)
- `cli` — Typer-based CLI: `train`, `benchmark`, `audit`, `validate`, `report`
- `thresholds`, `calibration`, `statistics`, `diagnostics`, `cost_sensitive`,
  `monitor`, `modelcard`, `cross_val`, `explainability`, `automl`,
  `external_val`, `time_split`, `active_learning`, `synthetic`
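
As a rough illustration of what a reproducibility manifest (data hash + config + library versions) might contain, the sketch below builds one with the standard library alone. The field names and helper functions are hypothetical, and the actual format written by the `reproducibility` module may differ. Recording the Python version is an extra assumption beyond the three items listed above.

```python
import hashlib
import json
import platform
from importlib import metadata


def library_versions(packages):
    """Look up installed versions; None marks a package that is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def build_manifest(data: bytes, config: dict, packages) -> dict:
    """Assemble a JSON-serializable manifest (hypothetical field names)."""
    return {
        "data_sha256": hashlib.sha256(data).hexdigest(),
        "config": config,
        "python": platform.python_version(),  # extra field, an assumption
        "libraries": library_versions(packages),
    }


if __name__ == "__main__":
    raw = b"age,glucose,label\n50,148,1\n"  # stand-in for a dataset file's bytes
    manifest = build_manifest(
        raw,
        {"test_size": 0.2, "random_state": 42},
        ["numpy", "scikit-learn"],
    )
    print(json.dumps(manifest, indent=2, sort_keys=True))
```

Hashing the raw bytes, rather than a parsed DataFrame, keeps the hash independent of parsing options. The trade-off is that a byte-level change such as different line endings alters the hash even when the parsed data is identical.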

---

## Command-line interface

```bash
clinikit train      --config config.yaml
clinikit benchmark  --dataset pima --models all
clinikit audit      --data data.csv --report audit.html
clinikit validate   --model model.joblib --data data.csv
clinikit report     --leaderboard runs.csv --out report.html
```

---

## Project notes

`clinikit` is an integration toolkit. The methods it bundles are
adaptations of techniques published in the academic literature, not
original contributions; see [`CITATIONS.md`](CITATIONS.md) for
source-method references. It is not a regulated product and is
intended for research and development use only.

---

## Contributing

Contributions are welcome. Please read [`CONTRIBUTING.md`](CONTRIBUTING.md)
for the development workflow, coding standards, and pull-request
process. By participating, you agree to abide by the
[`Code of Conduct`](CODE_OF_CONDUCT.md).

---

## Citation

If you use `clinikit` in academic work, please cite it via the
[`CITATION.cff`](CITATION.cff) file, or use:

```bibtex
@software{clinikit,
  author  = {SEVEN, Berat Kaan},
  title   = {clinikit: a tabular machine-learning toolkit},
  year    = {2026},
  url     = {https://github.com/clinikit/clinikit},
  version = {0.1.0}
}
```

---

## License

Distributed under the MIT License. See [`LICENSE`](LICENSE) for the full text.
