Metadata-Version: 2.1
Name: scsplice
Version: 2.0.0
Summary: Single-cell alternative-splicing analysis on AnnData (Python port of R splikit)
Keywords: single-cell,splicing,scverse,anndata,scrna-seq
Author-Email: Arsham Mikaeili Namini <arsham.mikaeilinamini@mail.mcgill.ca>
License: MIT License
         
         Copyright (c) 2026 Arsham Mikaeili Namini and the Computational and Statistical Genomics Laboratory, McGill University
         
         Permission is hereby granted, free of charge, to any person obtaining a copy
         of this software and associated documentation files (the "Software"), to deal
         in the Software without restriction, including without limitation the rights
         to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
         copies of the Software, and to permit persons to whom the Software is
         furnished to do so, subject to the following conditions:
         
         The above copyright notice and this permission notice shall be included in all
         copies or substantial portions of the Software.
         
         THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
         IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
         FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
         AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
         LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
         OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
         SOFTWARE.
         
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: C++
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Project-URL: Documentation, https://arshammik.github.io/scsplice/
Project-URL: Homepage, https://github.com/Arshammik/scsplice
Project-URL: Issues, https://github.com/Arshammik/scsplice/issues
Project-URL: Changelog, https://github.com/Arshammik/scsplice/blob/main/CHANGELOG.md
Project-URL: Source, https://github.com/Arshammik/scsplice
Requires-Python: >=3.10
Requires-Dist: anndata>=0.10
Requires-Dist: numpy>=2.0
Requires-Dist: scipy>=1.13
Requires-Dist: pandas>=2.2
Requires-Dist: h5py>=3.10
Provides-Extra: test
Requires-Dist: pytest>=8.0; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Requires-Dist: pytest-xdist; extra == "test"
Provides-Extra: docs
Requires-Dist: mkdocs>=1.6; extra == "docs"
Requires-Dist: mkdocs-material>=9.5; extra == "docs"
Requires-Dist: mkdocstrings[python]>=0.26; extra == "docs"
Requires-Dist: mkdocs-jupyter>=0.25; extra == "docs"
Requires-Dist: pymdown-extensions>=10.7; extra == "docs"
Requires-Dist: mike>=2.1; extra == "docs"
Provides-Extra: dev
Requires-Dist: scsplice[docs,test]; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"
Description-Content-Type: text/markdown

# scsplice

Single-cell alternative-splicing analysis for the [scverse](https://scverse.org) ecosystem.

`scsplice` is the Python port of the R package [splikit](https://github.com/csglab/splikit). It analyses splice-junction count data in single-cell RNA-seq, treating each event as a pair of inclusion (`M1`) and exclusion (`M2`) counts derived from local junction variants (LJVs). The package is AnnData-native — junctions live on the `var` axis, M1 and M2 sit in `layers`, and downstream analysis composes naturally with `scanpy`.

## Status

v2.0 (beta). The scope is intentionally narrow:

- `scs.io.read_starsolo` — ingest STARsolo `Solo.out/SJ/` for one or more samples.
- `scs.tl.make_m2` — build the exclusion matrix from M1 + LJV grouping.
- `scs.pp.highly_variable_events` — per-library binomial-deviance HVE selection.
- `scs.tl.pseudo_correlation` — beta-binomial Cox-Snell / Nagelkerke pseudo-R² against an external matrix.

HVG, plotting, and silhouette utilities from the R package are intentionally omitted — `scanpy`, `pyranges`, and `sklearn` already cover those.
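As an illustration of what "build the exclusion matrix from M1 + LJV grouping" could mean, here is a minimal NumPy sketch. It assumes that the exclusion count of a junction is the summed inclusion count of the *other* junctions in the same LJV group; that reading, the function name `make_m2_sketch`, and the dense inputs are assumptions for illustration, not the actual `scs.tl.make_m2` implementation.

```python
import numpy as np


def make_m2_sketch(m1: np.ndarray, groups: np.ndarray) -> np.ndarray:
    """Hypothetical helper: derive M2 from M1 and per-junction group labels.

    m1     : (n_cells, n_junctions) inclusion counts
    groups : (n_junctions,) LJV group label of each junction
    """
    # Encode group labels as integer codes 0..G-1.
    _, codes = np.unique(groups, return_inverse=True)
    n_groups = int(codes.max()) + 1

    # Junction-by-group indicator matrix (1 where the junction belongs to the group).
    indicator = (codes[:, None] == np.arange(n_groups)[None, :]).astype(m1.dtype)

    # Per-cell total of M1 within each group, broadcast back to junctions.
    group_totals = m1 @ indicator          # (n_cells, n_groups)
    per_junction_total = group_totals[:, codes]

    # Assumed definition: everything in the group except the junction itself.
    return per_junction_total - m1


# Toy data: 2 cells, 4 junctions in two groups.
m1 = np.array([[3, 1, 0, 2],
               [0, 4, 5, 5]])
groups = np.array(["g1", "g1", "g2", "g2"])
m2 = make_m2_sketch(m1, groups)
print(m2)
```

A real implementation would presumably work on sparse matrices and in parallel (the package exposes `n_threads`); the dense indicator matrix here is only for readability.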

## Installation

`scsplice` ships a C++ extension built via `scikit-build-core` + `pybind11`.
Eigen3 (header-only) is required at install time; OpenMP is optional but
strongly recommended for multi-threaded kernels.

### From PyPI (once v2.0 is published)

```bash
pip install scsplice
```

### From source

```bash
git clone https://github.com/Arshammik/scsplice
cd scsplice
pip install .
```

Install the system dependencies before running `pip install`:

**Ubuntu / Debian**

```bash
sudo apt install libeigen3-dev libomp-dev
```

**macOS** (Homebrew)

```bash
brew install eigen libomp
# Apple Clang ships without OpenMP; point CMake at Homebrew's libomp
export OpenMP_ROOT="$(brew --prefix libomp)"
export LDFLAGS="-L${OpenMP_ROOT}/lib"
export CPPFLAGS="-I${OpenMP_ROOT}/include"
```

**HPC cluster (Compute Canada / Sharcnet pattern)**

```bash
module load eigen/3.4.0
# any modern GCC with OpenMP (gcc/12+) on the system module path
```

### Editable install (development)

```bash
pip install -e ".[dev]"
```

This installs the package, all test dependencies, the docs toolchain
(`mkdocs-material`, `mkdocstrings[python]`, `mkdocs-jupyter`), and `ruff` /
`pre-commit`. C++ edits require re-running `pip install -e .`; pure-Python
edits take effect immediately.
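
After any of the install routes above, a quick way to confirm which version ended up in the active environment is to query the installed distribution metadata. This is a small standard-library sketch; the helper name `installed_version` is hypothetical, and it only reports the distribution version, not whether the C++ extension was built with OpenMP.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string if scsplice is installed, otherwise None.
print(installed_version("scsplice"))
```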

## Quick start

```python
import scsplice as scs
import scanpy as sc  # optional: not a scsplice dependency, used only for downstream embedding

# (1) Ingest STARsolo splice-junction counts (M1) and LJV grouping
adata = scs.io.read_starsolo(
    sj_dirs=["sample1/Solo.out/SJ", "sample2/Solo.out/SJ"],
    sample_ids=["s1", "s2"],
)

# (2) Build exclusion matrix (M2) from inclusion counts + junction grouping
scs.tl.make_m2(adata, n_threads=8)

# (3) Identify highly variable events per library using binomial deviance
scs.pp.highly_variable_events(adata, min_row_sum=50, n_threads=8)

# Optional: compose with scanpy on the splicing embedding
# (PCA / neighbors / leiden over logit(M1 / (M1 + M2)))
```
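
The optional last step refers to `logit(M1 / (M1 + M2))`. A minimal sketch of that transform is shown below, using plain NumPy arrays. The pseudocount (which keeps the logit finite when either count is zero), the helper name `logit_psi`, and the layer keys mentioned in the comments are assumptions for illustration, not part of the documented API.

```python
import numpy as np


def logit_psi(m1, m2, pseudocount: float = 0.5) -> np.ndarray:
    """Hypothetical helper: logit of the inclusion fraction M1 / (M1 + M2).

    A symmetric pseudocount is added so that events with zero inclusion
    or zero exclusion counts still give a finite value.
    """
    m1 = np.asarray(m1, dtype=float)
    m2 = np.asarray(m2, dtype=float)
    psi = (m1 + pseudocount) / (m1 + m2 + 2.0 * pseudocount)
    # logit(psi) = log(psi) - log(1 - psi)
    return np.log(psi) - np.log1p(-psi)


# Toy counts: 2 cells x 3 events.
m1 = np.array([[5, 3, 0],
               [0, 10, 2]])
m2 = np.array([[5, 0, 0],
               [4, 10, 6]])
x = logit_psi(m1, m2)
print(x)

# With an AnnData object, the inputs might come from the layers, e.g.
# adata.layers["M1"] and adata.layers["M2"] (assumed key names); sparse
# layers would need to be densified or handled blockwise first.
```

The resulting matrix could then be stored in a layer or in `adata.X` and passed to the usual `scanpy` PCA, neighbors, and Leiden steps.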

## Numerical equivalence

`scsplice` reproduces R `splikit` results to a documented tolerance on a
fixed reference dataset (M2 bit-exact; HVE deviance `rtol=1e-10`;
pseudo-correlation `rtol=1e-7`). The cross-language regression suite,
R reference fixtures, and end-to-end M1/M2 validation pipeline live on the
[`validation` branch](https://github.com/Arshammik/scsplice/tree/validation).
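
To make the tolerances above concrete, the sketch below shows how such checks are commonly expressed with NumPy: exact equality for M2 and relative tolerances for the floating-point outputs. The function name and the toy arrays are hypothetical; the actual regression suite on the `validation` branch may be organised differently.

```python
import numpy as np


def check_equivalence(py_m2, r_m2, py_dev, r_dev, py_r2, r_r2) -> bool:
    """Hypothetical check mirroring the stated tolerances.

    Raises AssertionError on the first mismatch, returns True otherwise.
    """
    # M2 holds integer counts, so it is compared exactly.
    if not np.array_equal(py_m2, r_m2):
        raise AssertionError("M2 is not bit-exact")
    # HVE deviance: relative tolerance 1e-10.
    np.testing.assert_allclose(py_dev, r_dev, rtol=1e-10)
    # Pseudo-correlation: relative tolerance 1e-7.
    np.testing.assert_allclose(py_r2, r_r2, rtol=1e-7)
    return True


# Toy "R reference" values and Python values that differ only by rounding noise.
r_m2 = np.array([[1, 3], [4, 0]])
r_dev = np.array([10.0, 250.5])
r_r2 = np.array([0.12, 0.87])

ok = check_equivalence(
    r_m2.copy(), r_m2,
    r_dev * (1 + 1e-12), r_dev,
    r_r2 * (1 + 1e-9), r_r2,
)
print(ok)
```

`rtol` bounds the difference relative to the magnitude of the reference value, so the same threshold applies to small and large deviances alike.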

## Documentation

Full documentation is available at https://arshammik.github.io/scsplice/.

Topics include:

- [Getting Started](https://arshammik.github.io/scsplice/getting-started/) — installation and first workflow
- [Tutorials](https://arshammik.github.io/scsplice/tutorials/) — step-by-step notebooks
- [How-to Guides](https://arshammik.github.io/scsplice/how-to-guides/) — recipes for common tasks
- [Reference](https://arshammik.github.io/scsplice/reference/) — complete API documentation
- [Explanation](https://arshammik.github.io/scsplice/explanation/) — conceptual background and design

## License

MIT.
