Metadata-Version: 2.4
Name: isovae
Version: 0.1.0
Summary: IsoVAE: isoform-usage prediction and long-read isoform-usage denoising for single-cell RNA-seq
Author: IsoVAE developers
License-Expression: MIT
Project-URL: Homepage, https://github.com/your-username/IsoVAE
Project-URL: Documentation, https://your-username.github.io/IsoVAE/
Project-URL: Repository, https://github.com/your-username/IsoVAE
Project-URL: Issues, https://github.com/your-username/IsoVAE/issues
Keywords: single-cell RNA-seq,isoform usage,long-read RNA-seq,variational autoencoder,denoising
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.23
Requires-Dist: pandas>=1.5
Requires-Dist: scipy>=1.9
Requires-Dist: scikit-learn>=1.2
Requires-Dist: anndata>=0.9
Requires-Dist: torch>=2.0
Requires-Dist: matplotlib>=3.6
Requires-Dist: seaborn>=0.12
Provides-Extra: docs
Requires-Dist: mkdocs>=1.5; extra == "docs"
Requires-Dist: mkdocs-material>=9.5; extra == "docs"
Requires-Dist: mkdocstrings[python]>=0.24; extra == "docs"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Dynamic: license-file

# IsoVAE

IsoVAE is a Python package for single-cell isoform-usage analysis. It supports:

1. **Isoform-usage prediction** from short-read single-cell gene-expression profiles.
2. **Long-read isoform-usage denoising** from sparse long-read isoform count matrices.

IsoVAE models **within-gene isoform usage proportions**, not absolute transcript abundance.
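
To make the distinction concrete, the following is a minimal sketch of how within-gene usage proportions can be computed from raw isoform counts. It is not IsoVAE's internal preprocessing: the function `within_gene_usage`, the `gene_ids` argument and the toy counts are hypothetical and used only for illustration.

```python
import numpy as np


def within_gene_usage(counts, gene_ids):
    """Convert a cells x isoforms count matrix into within-gene usage proportions.

    `gene_ids[j]` names the gene that isoform column j belongs to (hypothetical layout).
    """
    counts = np.asarray(counts, dtype=float)
    gene_ids = np.asarray(gene_ids)
    usage = np.full(counts.shape, np.nan)
    for gene in np.unique(gene_ids):
        cols = gene_ids == gene
        block = counts[:, cols]
        totals = block.sum(axis=1, keepdims=True)
        # Usage is undefined (NaN) where a gene has no reads in a cell.
        with np.errstate(divide="ignore", invalid="ignore"):
            usage[:, cols] = np.where(totals > 0, block / totals, np.nan)
    return usage


# Toy data: two cells, gene A with two isoforms, gene B with one isoform.
gene_ids = ["A", "A", "B"]
counts = [
    [8, 2, 5],
    [0, 0, 3],
]
usage = within_gene_usage(counts, gene_ids)
print(usage)
```

In the first cell, the first isoform of gene A has more reads than the single isoform of gene B (8 versus 5), yet its usage is lower (0.8 versus 1.0), because usage is normalised within each gene rather than across the whole cell.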

## Installation

```bash
pip install isovae
```

For local development:

```bash
git clone https://github.com/your-username/IsoVAE.git
cd IsoVAE
pip install -e ".[dev]"  # editable install with the dev extras (pytest, ruff)
```

## Quick start

```python
import anndata as ad  # installed with isovae; scanpy is not a declared dependency
from isovae import (
    load_artifact,
    reconstruct_preprocessor_from_training_data,
    predict_isoform_usage,
    denoise_isoform_usage,
)

# Trained model checkpoint.
model_path = "path/to/vae_xda_model.pt"

# Training matrices, needed to rebuild the preprocessor.
gene_train = ad.read_h5ad("path/to/training_gene_matrix.h5ad")
iso_train = ad.read_h5ad("path/to/training_isoform_matrix.h5ad")

preprocessor = reconstruct_preprocessor_from_training_data(
    model_path,
    adata_gene_train=gene_train,
    adata_iso_train=iso_train,
    seed=42,
)

artifact = load_artifact(model_path, preprocessor=preprocessor, device="cpu")

# Predict isoform usage from short-read data.
gene_query = ad.read_h5ad("path/to/query_gene_matrix.h5ad")
pred_usage, pred_meta = predict_isoform_usage(artifact, gene_query)
pred_usage.to_csv("predicted_isoform_usage.csv")

# Denoise long-read isoform usage.
iso_query = ad.read_h5ad("path/to/query_isoform_matrix.h5ad")
denoised_usage, noisy_usage, denoise_meta = denoise_isoform_usage(artifact, iso_query)
denoised_usage.to_csv("denoised_isoform_usage.csv")
```
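
Because IsoVAE models within-gene usage proportions, a quick sanity check on any usage table is that values lie in [0, 1] and sum to 1 within each gene. The sketch below is not part of the IsoVAE API. It assumes a cells x isoforms matrix and a matching list of gene labels, and the helper `check_usage` and the toy matrices are hypothetical. If the returned usage tables are pandas DataFrames, something like `pred_usage.to_numpy()` together with your own isoform-to-gene mapping could be passed in.

```python
import numpy as np


def check_usage(usage, gene_ids, atol=1e-6):
    """Return a list of problems found in a cells x isoforms usage matrix."""
    usage = np.asarray(usage, dtype=float)
    gene_ids = np.asarray(gene_ids)
    problems = []
    # Every defined value should be a proportion.
    finite = usage[np.isfinite(usage)]
    if ((finite < -atol) | (finite > 1 + atol)).any():
        problems.append("values outside [0, 1]")
    # Within each gene, usage should sum to 1 in every cell.
    for gene in np.unique(gene_ids):
        sums = usage[:, gene_ids == gene].sum(axis=1)
        sums = sums[np.isfinite(sums)]  # skip cells where usage is undefined
        if not np.allclose(sums, 1.0, atol=atol):
            problems.append(f"gene {gene}: usage does not sum to 1")
    return problems


# Toy matrices: gene A has two isoforms, gene B has one.
gene_ids = ["A", "A", "B"]
good = [[0.8, 0.2, 1.0], [0.5, 0.5, 1.0]]
bad = [[0.8, 0.4, 1.0], [0.5, 0.5, 1.0]]
print(check_usage(good, gene_ids))
print(check_usage(bad, gene_ids))
```

An empty list means no problems were detected. The tolerance `atol` is an arbitrary choice and may need loosening for float32 model outputs.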

## Documentation

The documentation source is in `docs/` and can be built with MkDocs:

```bash
pip install -e ".[docs]"
mkdocs serve
```

To deploy to GitHub Pages:

```bash
mkdocs gh-deploy
```

See `docs/deployment.md` for deployment instructions for GitHub Pages, Read the Docs, Netlify and Vercel.

## Repository layout

```text
.
├── src/isovae/        # Python package
├── docs/              # Documentation source
├── mkdocs.yml         # Documentation configuration
├── pyproject.toml     # Package metadata
├── requirements.txt
├── LICENSE
└── README.md
```

Large data files, AnnData objects, model checkpoints and manuscript outputs are not included in the package.

## Citation

If you use IsoVAE, please cite the accompanying manuscript once it is published.
