Metadata-Version: 2.4
Name: okekeclean
Version: 1.0.1
Summary: Artifact detection for physiological signals (ABP and ECG)
License-Expression: AGPL-3.0-only
License-File: LICENSE
Requires-Dist: huggingface-hub>=0.20.0
Requires-Dist: numpy>=1.26.4
Requires-Dist: pandas>=2.1.4
Requires-Dist: scipy>=1.10.0
Requires-Dist: torch>=2.0.0
Requires-Dist: torchaudio>=2.0.0
Requires-Dist: torchvision>=0.15.0
Requires-Dist: tqdm>=4.60.0
Requires-Python: >=3.12
Description-Content-Type: text/markdown

# OkekeClean

OkekeClean provides pre-trained artifact detection models for arterial blood pressure
(ABP) and electrocardiogram (ECG) waveforms. The package ships inference code for the
released models and downloads checkpoint weights from Hugging Face on first use.

## Installation

```bash
pip install okekeclean
```

Or with `uv`:

```bash
uv add okekeclean
```

## Quick Start

### ABP

```python
import pandas as pd
import torch

from okekeclean import ABPParams, detect_artifacts, load_model

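# Run on the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Checkpoint weights are downloaded from Hugging Face on first use.
model = load_model("okekeclean-abp-ensemble", device=device)

# Synthetic ABP-like trace: 4000 samples at 125 Hz (32 s), values in mmHg.
waveform = pd.Series(
    [80.0, 83.2, 90.5, 102.1, 110.4, 105.0, 96.3, 88.7] * 500,
    index=pd.Timestamp("2025-01-01")
    + pd.to_timedelta(range(4000), unit="s") / 125,
)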

artifact_flags = detect_artifacts(
    waveform=waveform,
    modality="ABP",
    model=model,
    params=ABPParams(),
    device=device,
)
```

### ECG

```python
import numpy as np
import pandas as pd
import torch

from okekeclean import ECGParams, detect_artifacts, load_model

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = load_model("okekeclean-ecg-efficientnet_b0", device=device)

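# Synthetic ECG-like trace: 5000 samples at 500 Hz (10 s),
# a 1.2 Hz sine wave plus Gaussian noise, values in mV.
samples = 5000
time = np.arange(samples) / 500.0
waveform = pd.Series(
    0.9 * np.sin(2 * np.pi * 1.2 * time) + 0.05 * np.random.randn(samples),
    index=pd.Timestamp("2025-01-01") + pd.to_timedelta(time, unit="s"),
)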

artifact_probs = detect_artifacts(
    waveform=waveform,
    modality="ECG",
    model=model,
    params=ECGParams(),
    device=device,
    return_type="probability",
)
```

See [`quickstart_abp.py`](examples/quickstart_abp.py)
and [`quickstart_ecg.py`](examples/quickstart_ecg.py)
for runnable examples with synthetic inputs.

## Released Models

| Model | Signal | Architecture | Threshold | Test performance |
| --- | --- | --- | --- | --- |
| `okekeclean-abp-ensemble` | ABP | ResNet-18 (full FT) + EfficientNet-B0 (shallow FT) | `0.184` | AU-ROC `0.958`, sensitivity `0.952`, specificity `0.730`, accuracy `0.795` |
| `okekeclean-abp-resnet18` | ABP | ResNet-18 (full FT) | `0.17785164713859558` | AU-ROC `0.951`, sensitivity `0.915`, specificity `0.824`, accuracy `0.851` |
| `okekeclean-abp-efficientnet_b0` | ABP | EfficientNet-B0 (shallow FT) | `0.04994076117873192` | AU-ROC `0.945`, sensitivity `0.799`, specificity `0.939`, accuracy `0.898` |
| `okekeclean-ecg-efficientnet_b0` | ECG | EfficientNet-B0 (full FT) | `0.20802117884159088` | AU-ROC `0.970`, sensitivity `0.858`, specificity `0.950`, accuracy `0.922` |
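
The sketch below shows one way the thresholds in the table could be applied to probability outputs. It assumes the Threshold column is the probability cutoff at or above which a segment is flagged as an artifact; the `>=` comparison, the `probabilities_to_flags` helper, and the example probabilities are illustrative assumptions, not part of the package.

```python
import numpy as np

# Thresholds copied from the Released Models table.
THRESHOLDS = {
    "okekeclean-abp-ensemble": 0.184,
    "okekeclean-abp-resnet18": 0.17785164713859558,
    "okekeclean-abp-efficientnet_b0": 0.04994076117873192,
    "okekeclean-ecg-efficientnet_b0": 0.20802117884159088,
}


def probabilities_to_flags(probs, model_name):
    """Hypothetical helper: True where a probability meets the model's threshold.

    Assumes a `>=` comparison; the package's own rule may differ.
    """
    return np.asarray(probs, dtype=float) >= THRESHOLDS[model_name]


# Made-up probabilities, for illustration only.
example_probs = [0.02, 0.19, 0.21, 0.75]
ecg_flags = probabilities_to_flags(example_probs, "okekeclean-ecg-efficientnet_b0")
abp_flags = probabilities_to_flags(example_probs, "okekeclean-abp-ensemble")
```

The same probability of `0.19` falls below the ECG threshold but above the ABP ensemble threshold, so a cutoff from one model should not be reused with another.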

## Input Requirements

- ABP: `pd.Series` with a `DatetimeIndex`, sampled at 125 Hz, units in mmHg.
- ECG: `pd.Series` with a `DatetimeIndex`, nominally sampled at 500 Hz, units in mV.
  Segments at other sampling rates are resampled to 500 Hz internally.
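
As a minimal sketch of preparing input that meets these requirements, the helpers below attach an evenly spaced `DatetimeIndex` to raw samples and estimate the sampling rate from that index. Both `make_series` and `estimate_sample_rate_hz` are hypothetical names introduced for illustration; they use only pandas and NumPy and are not part of the okekeclean API.

```python
import numpy as np
import pandas as pd


def make_series(values, sample_rate_hz, start="2025-01-01"):
    """Hypothetical helper: attach an evenly spaced DatetimeIndex to samples."""
    values = np.asarray(values, dtype=float)
    # Offset of each sample from the start time, in seconds.
    offsets = pd.to_timedelta(np.arange(len(values)) / sample_rate_hz, unit="s")
    return pd.Series(values, index=pd.Timestamp(start) + offsets)


def estimate_sample_rate_hz(series):
    """Hypothetical helper: estimate the rate from the median sample spacing."""
    if not isinstance(series.index, pd.DatetimeIndex):
        raise TypeError("expected a pd.Series with a DatetimeIndex")
    spacing = series.index.to_series().diff().median()
    return 1.0 / spacing.total_seconds()


# Synthetic ABP-like values in mmHg, 1000 samples at 125 Hz.
abp = make_series([80.0, 83.2, 90.5, 102.1] * 250, sample_rate_hz=125)
abp_rate = estimate_sample_rate_hz(abp)
```

Checking the estimated rate before inference is a cheap way to catch ABP recordings that are not at 125 Hz.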

## Model Weights

- ABP weights: <https://huggingface.co/moberg-analytics/okekeclean-abp>
- ECG weights: <https://huggingface.co/moberg-analytics/okekeclean-ecg>

## Papers

- [ABP Paper](TODO)
- [ECG Paper](TODO)

## License

GNU Affero General Public License v3 (AGPL-3.0-only). See [`LICENSE`](LICENSE).

## Citation

```bibtex
@misc{okekeclean,
  title = {OkekeClean},
  author = {Tony Kabilan Okeke},
  year = {2026},
  howpublished = {\url{https://github.com/moberg-analytics/oss-models}}
}
```
