Metadata-Version: 2.4
Name: pyaugur
Version: 0.1.0
Summary: Python port of Augur: cell type prioritization in high-dimensional single-cell data
Author: Augur Python Port
License: GPL-3.0-or-later
Project-URL: Homepage, https://github.com/omicverse/py-augur
Project-URL: Repository, https://github.com/omicverse/py-augur
Project-URL: Issues, https://github.com/omicverse/py-augur/issues
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.22
Requires-Dist: scipy>=1.9
Requires-Dist: pandas>=1.5
Requires-Dist: scikit-learn>=1.1
Requires-Dist: statsmodels>=0.13
Provides-Extra: anndata
Requires-Dist: anndata>=0.8; extra == "anndata"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: matplotlib>=3.5; extra == "dev"

# pyaugur

Python port of [Augur](https://github.com/neurorestore/Augur): cell type prioritization in high-dimensional single-cell data.

## Install

```bash
pip install pyaugur
```

## Quick Start

```python
import pandas as pd
from pyaugur import calculate_auc

# Expression matrix: genes x cells (first CSV column is read as the row index)
expr = pd.read_csv("expression.csv", index_col=0).values
# Per-cell metadata with columns: cell_type, label
meta = pd.read_csv("metadata.csv")

result = calculate_auc(expr, meta=meta)
print(result["AUC"])  # mean AUC per cell type, ranked
```

## API

### `calculate_auc(input, meta=None, ...)`

Trains a classifier (random forest or logistic regression) within each cell type to predict condition labels, and evaluates its AUC by cross-validation.

**Returns**: dict with `AUC` (DataFrame), `results`, `feature_importance`, `parameters`.
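
The idea of a per-cell-type, cross-validated AUC can be sketched with scikit-learn alone. This is a minimal illustration on synthetic data, not pyaugur's implementation: the helper name `per_cell_type_auc`, the fold count, and the forest size are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score


def per_cell_type_auc(expr, cell_types, labels, n_splits=3, seed=0):
    """Mean cross-validated ROC AUC per cell type (hypothetical helper).

    expr: genes x cells array; cell_types and labels: one entry per cell.
    """
    expr = np.asarray(expr)
    cell_types = np.asarray(cell_types)
    labels = np.asarray(labels)
    aucs = {}
    for ct in np.unique(cell_types):
        mask = cell_types == ct
        X = expr[:, mask].T  # scikit-learn expects cells in rows
        y = labels[mask]
        clf = RandomForestClassifier(n_estimators=50, random_state=seed)
        cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
        scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
        aucs[str(ct)] = float(scores.mean())
    return aucs


# Synthetic data: 20 genes x 120 cells, two cell types, two conditions (0/1).
rng = np.random.default_rng(0)
n_genes, n_cells = 20, 120
expr = rng.normal(size=(n_genes, n_cells))
cell_types = np.repeat(["A", "B"], n_cells // 2)
labels = np.tile([0, 1], n_cells // 2)

# Only cell type A responds to the condition: shift five genes in its label-1 cells.
responding = (cell_types == "A") & (labels == 1)
expr[:5, responding] += 2.0

aucs = per_cell_type_auc(expr, cell_types, labels)
print(aucs)
```

In this toy setup the responding cell type should score a markedly higher AUC than the unaffected one, which is the signal used to rank cell types.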

### `calculate_differential_prioritization(augur1, augur2, permuted1, permuted2, ...)`

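Permutation test for differential prioritization between two conditions, comparing the observed results (`augur1`, `augur2`) with their permuted counterparts (`permuted1`, `permuted2`).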
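
As a rough illustration of the permutation-test idea, the sketch below computes a two-sided p-value for an observed AUC difference against a null distribution of differences. It is a generic permutation test, not necessarily the statistic pyaugur uses; the helper name `permutation_pvalue` and all numbers are placeholders chosen for the example.

```python
import numpy as np


def permutation_pvalue(observed_delta, null_deltas):
    """Two-sided permutation p-value with the usual +1 correction.

    observed_delta: observed AUC difference for one cell type.
    null_deltas: AUC differences obtained from permuted labels.
    """
    null_deltas = np.asarray(null_deltas, dtype=float)
    n_extreme = int(np.sum(np.abs(null_deltas) >= abs(observed_delta)))
    return (n_extreme + 1) / (null_deltas.size + 1)


# Placeholder null distribution (nine permuted AUC differences).
null = [-0.02, 0.01, 0.03, -0.01, 0.00, 0.02, -0.03, 0.01, -0.02]

p_large = permutation_pvalue(0.10, null)  # no null value is as extreme
p_small = permutation_pvalue(0.02, null)  # five null values are as extreme
print(p_large, p_small)
```

The +1 in numerator and denominator keeps the p-value strictly positive, so its resolution is bounded by the number of permutations.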

### `select_variance(mat, var_quantile=0.5)`

Feature selection by variance: a loess curve is fitted to the coefficient of variation (CV) as a function of mean expression.
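
A sketch of this kind of selection, using the LOWESS smoother from statsmodels, might look as follows. The helper name `select_by_cv_residual`, the log transform, the smoothing span `frac=0.5`, and the residual-quantile rule are assumptions for illustration and may differ from what `select_variance` actually does.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess


def select_by_cv_residual(mat, var_quantile=0.5):
    """Return row indices of features whose CV exceeds the loess trend.

    mat: features x cells array of non-negative expression values.
    """
    mat = np.asarray(mat, dtype=float)
    mean = mat.mean(axis=1)
    sd = mat.std(axis=1, ddof=1)
    # Features with zero mean or zero variance have no usable log CV.
    keep = np.flatnonzero((mean > 0) & (sd > 0))
    log_mean = np.log(mean[keep])
    log_cv = np.log(sd[keep] / mean[keep])
    # Fitted values are returned in the input order when return_sorted=False.
    fitted = lowess(log_cv, log_mean, frac=0.5, return_sorted=False)
    resid = log_cv - fitted
    cutoff = np.quantile(resid, var_quantile)
    return keep[resid >= cutoff]


# Synthetic positive data: 200 features x 50 cells with varying mean levels.
rng = np.random.default_rng(1)
scale = rng.uniform(0.5, 5.0, size=(200, 1))
mat = rng.gamma(shape=2.0, scale=scale, size=(200, 50))

selected = select_by_cv_residual(mat, var_quantile=0.5)
print(len(selected))
```

Ranking by the residual rather than by raw variance avoids simply picking the most highly expressed features, since variability tends to depend on the mean.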

### `select_random(mat, feature_perc=0.5)`

Random feature subsampling: keeps a randomly chosen fraction `feature_perc` of the features.
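
Random subsampling of rows can be sketched with NumPy as below. The helper name `subsample_features`, the `seed` argument, and the returned index array are hypothetical; the actual return value of `select_random` may differ.

```python
import numpy as np


def subsample_features(mat, feature_perc=0.5, seed=None):
    """Keep a random fraction of rows (features) without replacement."""
    mat = np.asarray(mat)
    n_features = mat.shape[0]
    n_keep = max(1, int(round(n_features * feature_perc)))
    rng = np.random.default_rng(seed)
    # Sort the sampled indices so the original feature order is preserved.
    idx = np.sort(rng.choice(n_features, size=n_keep, replace=False))
    return mat[idx, :], idx


mat = np.arange(40).reshape(10, 4)  # 10 features x 4 cells
sub, idx = subsample_features(mat, feature_perc=0.5, seed=0)
sub_again, idx_again = subsample_features(mat, feature_perc=0.5, seed=0)
print(sub.shape, idx)
```

Fixing the seed makes the subsample reproducible, which matters when results are compared across runs.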

## Performance

Comparison with R Augur on the `sc_sim` dataset (15,697 genes x 600 cells):

| Metric | Value |
|---|---|
| Pearson r (AUC parity) | 0.9977 |
| Speedup | 4.0x |
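
One way such a parity figure could be computed is as the Pearson correlation between the per-cell-type AUCs of the two implementations. The sketch below assumes that interpretation; the helper name `auc_parity` is hypothetical and the AUC values are placeholders, not the benchmark results.

```python
import numpy as np
from scipy.stats import pearsonr


def auc_parity(auc_a, auc_b):
    """Pearson correlation between two aligned vectors of per-cell-type AUCs."""
    auc_a = np.asarray(auc_a, dtype=float)
    auc_b = np.asarray(auc_b, dtype=float)
    if auc_a.shape != auc_b.shape:
        raise ValueError("AUC vectors must be aligned and of equal length")
    r, _ = pearsonr(auc_a, auc_b)
    return float(r)


# Placeholder AUCs for four cell types, listed in the same order for both tools.
auc_python = [0.55, 0.62, 0.71, 0.80]
auc_r = [0.56, 0.61, 0.72, 0.79]

r = auc_parity(auc_python, auc_r)
print(round(r, 4))
```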

## License

GPL-3.0 (matching the upstream R package).
