Metadata-Version: 2.4
Name: missoutlier
Version: 0.1.2
Summary: Outlier detection using the MISS (MAD-IQR-SD Simultaneous) method
Author-email: Guillaume Pech <guillaumepech.cog@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/GuillaumePech/missOutlierPy
Project-URL: Paper, https://osf.io/preprints/psyarxiv/2r9yw_v2
Keywords: outlier,detection,statistics,MAD,IQR,MISS
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.20
Requires-Dist: scipy>=1.7
Dynamic: license-file

<div align="center">

# 🎯 missoutlier

### **Outlier Detection Using the MISS Method**

*A weighted composite of MAD, IQR, and SD for robust univariate outlier detection*

[![Python](https://img.shields.io/badge/Python-%3E%3D%203.8-blue?logo=python&logoColor=white)](https://www.python.org/)
[![PyPI](https://img.shields.io/pypi/v/missoutlier?color=orange&logo=pypi&logoColor=white)](https://pypi.org/project/missoutlier/)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
[![PsyArXiv](https://img.shields.io/badge/PsyArXiv-10.31234%2Fosf.io%2F2r9yw__v2-red)](https://osf.io/preprints/psyarxiv/2r9yw_v2)

---

</div>

## Overview

**missoutlier** implements the **MISS** (MAD–IQR–SD Simultaneous) method, a new approach for univariate outlier detection that combines three classical techniques into a single robust threshold:

| Method | Bounds | Weight |
|--------|--------|--------|
| **MAD** (Median Absolute Deviation) | `median ± 1.5 × MAD` | 87.8% |
| **IQR** (Interquartile Range) | `Q25 − 1 × IQR` (lower), `Q75 + 1 × IQR` (upper) | 1.2% |
| **SD** (Standard Deviation) | `mean ± 5 × SD` | 11.0% |

The composite threshold is a weighted sum of the three bounds above, where MAD, IQR, and SD each stand for the bound given by that method:

$$\text{MISS} = 0.878 \times \text{MAD} + 0.012 \times \text{IQR} + 0.11 \times \text{SD}$$

By heavily weighting the robust MAD while retaining sensitivity from IQR and SD, MISS aims to handle skewed and heavy-tailed distributions better than any of the three methods used on its own. The paper listed under Citation reports the comparison.
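
To make the weighting concrete, here is a minimal sketch of how the composite bounds could be computed from the table above. It is not the package's source code: the helper name `miss_bounds`, the normal-consistency scaling of the MAD (`scale="normal"`, a factor of about 1.4826), and the use of the sample SD (`ddof=1`) are all assumptions made for illustration.

```python
import numpy as np
from scipy import stats

# Weights taken from the table above; they sum to 1.
W_MAD, W_IQR, W_SD = 0.878, 0.012, 0.11


def miss_bounds(x):
    """Hypothetical helper: weighted lower and upper bounds for 1-D data."""
    x = np.asarray(x, dtype=float)

    med = np.median(x)
    # Assumption: MAD rescaled to be comparable to the SD under normality.
    mad = stats.median_abs_deviation(x, scale="normal")

    q25, q75 = np.percentile(x, [25, 75])
    iqr = q75 - q25

    mean = np.mean(x)
    sd = np.std(x, ddof=1)  # assumption: sample standard deviation

    # Weighted sum of the three per-method bounds, one side at a time.
    lower = W_MAD * (med - 1.5 * mad) + W_IQR * (q25 - iqr) + W_SD * (mean - 5 * sd)
    upper = W_MAD * (med + 1.5 * mad) + W_IQR * (q75 + iqr) + W_SD * (mean + 5 * sd)
    return lower, upper


lower, upper = miss_bounds([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])
print(lower, upper)
```

In this toy sample the value `100.0` falls above the upper bound while `1.0` through `5.0` stay inside, even though the single extreme value inflates the mean and SD considerably.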

---

## Installation

```bash
# Install from PyPI
pip install missoutlier
```

Or install directly from GitHub:

```bash
pip install git+https://github.com/GuillaumePech/missOutlierPy.git
```

**Dependencies:** `numpy >= 1.20`, `scipy >= 1.7`

---

## Quick Start

```python
import numpy as np
from missoutlier import detect_outliers_miss

# Generate data with outliers
x = np.concatenate([np.random.randn(100), [50, -40]])

# Default: replace outliers with NaN
x_clean = detect_outliers_miss(x)
# Message: Detected 2 outliers (1.96% of data) using MISS method.

# Drop outliers entirely
x_dropped = detect_outliers_miss(x, drop=True)
# Message: Detected 2 outliers (1.96% of data) using MISS method.

# Handle existing NaNs
x_na = np.concatenate([np.random.randn(100), [np.nan, 50]])
x_clean = detect_outliers_miss(x_na, na_rm=True)

# Silent mode (no messages)
x_clean = detect_outliers_miss(x, silent=True)
```

---

## Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `data` | array-like | — | Input data (must be one-dimensional) |
| `drop` | bool | `False` | If `True`, removes outliers. If `False`, replaces them with `NaN` |
| `na_rm` | bool | `False` | If `True`, ignores `NaN` values when computing thresholds |
| `silent` | bool | `False` | If `True`, suppresses the detection message |
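
The reason `na_rm` matters can be seen with plain NumPy: ordinary reductions propagate `NaN`, so any threshold built from them would itself be `NaN`. The snippet below only illustrates that general NumPy behaviour; whether the package uses these particular functions internally is an assumption.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, np.nan])

# Plain reductions propagate NaN into the result.
plain_median = np.median(x)
plain_mean = np.mean(x)

# NaN-aware reductions skip the missing value instead.
robust_median = np.nanmedian(x)
robust_mean = np.nanmean(x)

print(plain_median, plain_mean)    # both are nan
print(robust_median, robust_mean)  # both are 2.0
```

Comparisons against a `NaN` bound always evaluate to `False`, so without NaN-aware statistics no value could be flagged.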

---

## How It Works

```
                ┌──────────────┐
                │  Input Data  │
                └──────┬───────┘
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
     ┌─────────┐ ┌─────────┐ ┌─────────┐
     │ 1.5 MAD │ │  1 IQR  │ │  5 SD   │
     │  ×0.878 │ │  ×0.012 │ │  ×0.11  │
     └────┬────┘ └────┬────┘ └────┬────┘
          │            │            │
          └────────────┼────────────┘
                       ▼
              ┌────────────────┐
              │ MISS Threshold │
              └────────┬───────┘
                       ▼
              ┌────────────────┐
              │ Flag Outliers  │
              └────────────────┘
```
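
The last step of the diagram, flagging, can be sketched as follows once a lower and upper threshold are known. The helper `apply_bounds` is hypothetical and only mirrors the documented `drop` behaviour; treating values strictly outside the interval as outliers is an assumption.

```python
import numpy as np


def apply_bounds(x, lower, upper, drop=False):
    """Hypothetical helper: handle values outside [lower, upper]."""
    x = np.asarray(x, dtype=float)
    # Assumption: a value is an outlier when strictly outside the bounds.
    is_outlier = (x < lower) | (x > upper)
    if drop:
        # Shorter array with the outliers removed.
        return x[~is_outlier]
    # Same length as the input, outliers replaced with NaN.
    return np.where(is_outlier, np.nan, x)


x = [1.0, 2.0, 3.0, 50.0]
masked = apply_bounds(x, lower=-5.0, upper=10.0)
dropped = apply_bounds(x, lower=-5.0, upper=10.0, drop=True)
print(masked)   # four values, the last one is NaN
print(dropped)  # three values
```

Replacing with `NaN` keeps the output aligned with the input, which is convenient when the data are a column of a larger table; dropping is simpler when only the cleaned values are needed.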

---

## Citation

If you use this package in your research, please cite:

> Pech, G., Vaccaro, N., Caspar, E. A., Amerio, P., Cleeremans, A., Leys, C., & Ley, C. (2026). How not to MISS an outlier: comparing three classic univariate methods and introducing a new one, the MAD–IQR–SD Simultaneous (MISS). *PsyArXiv*. https://doi.org/10.31234/osf.io/2r9yw_v2


---

## License

MIT © [Guillaume Pech](https://github.com/GuillaumePech)
