Metadata-Version: 2.4
Name: okekeclean
Version: 1.1.0
Summary: Artifact detection for physiological signals (ABP and ECG)
License-Expression: AGPL-3.0-only
License-File: LICENSE
Requires-Dist: huggingface-hub>=0.20.0
Requires-Dist: numpy>=1.26.4
Requires-Dist: pandas>=2.1.4
Requires-Dist: scipy>=1.10.0
Requires-Dist: torch>=2.0.0
Requires-Dist: torchaudio>=2.0.0
Requires-Dist: torchvision>=0.15.0
Requires-Dist: tqdm>=4.60.0
Requires-Python: >=3.12
Description-Content-Type: text/markdown

# OkekeClean

[![PyPI](https://img.shields.io/pypi/v/okekeclean)](https://pypi.org/project/okekeclean/)
[![Python 3.12+](https://img.shields.io/badge/python-3.12%2B-blue)](https://pypi.org/project/okekeclean/)
[![License: AGPL v3](https://img.shields.io/badge/license-AGPL%20v3-blue)](./LICENSE)

OkekeClean is a Python package for detecting artifacts in physiological waveforms. It
ships inference code for the released arterial blood pressure (ABP) and
electrocardiogram (ECG) models, and downloads the corresponding model weights from
Hugging Face on first use.

## Installation

```bash
pip install okekeclean
```

Or with `uv`:

```bash
uv add okekeclean
```

## Quick Start

### ABP

```python
import pandas as pd
import torch

from okekeclean import ABPParams, detect_artifacts, load_model

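# Use a GPU when available; weights are downloaded from Hugging Face on first use.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = load_model("okekeclean-abp-ensemble", device=device)

# Synthetic ABP trace in mmHg: 4000 samples at 125 Hz (32 s) on a DatetimeIndex.
waveform = pd.Series(
    [80.0, 83.2, 90.5, 102.1, 110.4, 105.0, 96.3, 88.7] * 500,
    index=pd.Timestamp("2025-01-01")
    + pd.to_timedelta(range(4000), unit="s") / 125,
)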

artifact_flags = detect_artifacts(
    waveform=waveform,
    modality="ABP",
    model=model,
    params=ABPParams(),
    device=device,
)
```

### ECG

```python
import numpy as np
import pandas as pd
import torch

from okekeclean import ECGParams, detect_artifacts, load_model

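# Use a GPU when available; weights are downloaded from Hugging Face on first use.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = load_model("okekeclean-ecg-efficientnet_b0", device=device)

# Synthetic ECG-like trace in mV: 5000 samples at 500 Hz (10 s),
# a 1.2 Hz sine plus Gaussian noise, on a DatetimeIndex.
samples = 5000
time = np.arange(samples) / 500.0
waveform = pd.Series(
    0.9 * np.sin(2 * np.pi * 1.2 * time) + 0.05 * np.random.randn(samples),
    index=pd.Timestamp("2025-01-01") + pd.to_timedelta(time, unit="s"),
)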

artifact_probs = detect_artifacts(
    waveform=waveform,
    modality="ECG",
    model=model,
    params=ECGParams(),
    device=device,
    return_type="probability",
)
```

Runnable examples:

- [`examples/quickstart_abp.py`](./examples/quickstart_abp.py)
- [`examples/quickstart_ecg.py`](./examples/quickstart_ecg.py)

## Released Models

| Model | Signal | Architecture | Threshold | Test performance |
| --- | --- | --- | --- | --- |
| `okekeclean-abp-ensemble` | ABP | ResNet-18 (full FT) + EfficientNet-B0 (shallow FT) | `0.184` | AU-ROC `0.958`, sensitivity `0.952`, specificity `0.730`, accuracy `0.795` |
| `okekeclean-abp-resnet18` | ABP | ResNet-18 (full FT) | `0.17785164713859558` | AU-ROC `0.951`, sensitivity `0.915`, specificity `0.824`, accuracy `0.851` |
| `okekeclean-abp-efficientnet_b0` | ABP | EfficientNet-B0 (shallow FT) | `0.04994076117873192` | AU-ROC `0.945`, sensitivity `0.799`, specificity `0.939`, accuracy `0.898` |
| `okekeclean-ecg-efficientnet_b0` | ECG | EfficientNet-B0 (full FT) | `0.20802117884159088` | AU-ROC `0.970`, sensitivity `0.858`, specificity `0.950`, accuracy `0.922` |
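
The thresholds in the table can be applied to probability output to obtain binary flags. The following is a minimal sketch, not the package's own implementation: the probabilities are made up for illustration, and it assumes a segment counts as artifact when its probability meets or exceeds the threshold (the README does not state whether the comparison is inclusive).

```python
import pandas as pd

# Threshold listed for okekeclean-ecg-efficientnet_b0 in the table above.
ECG_THRESHOLD = 0.20802117884159088

# Made-up per-segment artifact probabilities, for illustration only.
probs = pd.Series([0.02, 0.15, 0.21, 0.87, 0.05])

# Assumption: probability >= threshold means "artifact".
flags = probs >= ECG_THRESHOLD

print(flags.tolist())
```

Lowering the threshold flags more segments as artifact, and raising it flags fewer.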

## Input Requirements

- ABP: `pd.Series` with a `DatetimeIndex`, sampled at 125 Hz, units in mmHg.
- ECG: `pd.Series` with a `DatetimeIndex`, units in mV. The expected sampling rate
  is 500 Hz; ECG segments at other rates are resampled to 500 Hz internally.
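
Before running detection, it can help to confirm that a waveform meets these requirements. The sketch below uses only pandas; `estimate_sampling_rate_hz` is a hypothetical helper written for this example and is not part of the okekeclean API.

```python
import pandas as pd


def estimate_sampling_rate_hz(waveform: pd.Series) -> float:
    """Estimate the sampling rate from the median spacing of a DatetimeIndex.

    Hypothetical helper for illustration; not part of okekeclean.
    """
    if not isinstance(waveform.index, pd.DatetimeIndex):
        raise TypeError("waveform must be indexed by a pandas DatetimeIndex")
    if len(waveform) < 2:
        raise ValueError("need at least two samples to estimate a rate")
    # The median spacing is robust to occasional gaps in the recording.
    median_step = waveform.index.to_series().diff().median()
    return 1.0 / median_step.total_seconds()


# Eight seconds of constant synthetic ABP data at 125 Hz (8 ms between samples).
index = pd.date_range("2025-01-01", periods=1000, freq="8ms")
abp = pd.Series(90.0, index=index)

rate = estimate_sampling_rate_hz(abp)
print(f"Estimated sampling rate: {rate:.1f} Hz")
```

For ABP input, a result far from 125 Hz suggests the signal should be resampled before it is passed to `detect_artifacts`.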

## Model Weights

- ABP weights: <https://huggingface.co/moberg-analytics/okekeclean-abp>
- ECG weights: <https://huggingface.co/moberg-analytics/okekeclean-ecg>

## License

GNU Affero General Public License v3.0 only (AGPL-3.0-only). See [`LICENSE`](./LICENSE).

## Citation

```bibtex
@misc{okekeclean,
  title = {OkekeClean},
  author = {Tony Kabilan Okeke},
  year = {2026},
  howpublished = {\url{https://github.com/moberg-analytics/oss-models}}
}
```
