Metadata-Version: 2.4
Name: revise-svc
Version: 0.0.31
Summary: Vision-integrated spatial transcriptomics SVC reconstruction and Sim2Real-ST benchmarking.
Author: Yushuai Wu and Yifeng Jiao
License: MIT License
        
        Copyright (c) 2025 Yushuai Wu
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/wuys13/REVISE
Project-URL: Documentation, https://revise-svc.readthedocs.io/en/latest/
Project-URL: Repository, https://github.com/wuys13/REVISE
Project-URL: Issues, https://github.com/wuys13/REVISE/issues
Project-URL: Data, https://zenodo.org/records/17705737
Keywords: spatial transcriptomics,single-cell,optimal transport,benchmark,bioinformatics,SVC
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: anndata>=0.9
Requires-Dist: matplotlib>=3.6
Requires-Dist: numba>=0.57
Requires-Dist: numpy>=1.22
Requires-Dist: pandas>=1.5
Requires-Dist: POT==0.9.5
Requires-Dist: scanpy>=1.9
Requires-Dist: statsmodels>=0.14.1
Requires-Dist: scikit-image>=0.20
Requires-Dist: scikit-learn>=1.2
Requires-Dist: scipy>=1.9
Requires-Dist: seaborn>=0.12
Requires-Dist: squidpy>=1.2
Requires-Dist: tqdm>=4.64
Requires-Dist: igraph
Requires-Dist: leidenalg
Requires-Dist: scikit-misc
Requires-Dist: gseapy
Provides-Extra: annotation
Requires-Dist: tacco>=0.2; extra == "annotation"
Provides-Extra: dev
Requires-Dist: black; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Dynamic: license-file

# REVISE

[![PyPI](https://img.shields.io/pypi/v/revise-svc.svg)](https://pypi.org/project/revise-svc/)
[![Documentation Status](https://readthedocs.org/projects/revise-svc/badge/?version=latest)](https://revise-svc.readthedocs.io/en/latest/?badge=latest)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

REVISE (REconstruction via Vision-integrated Spatial Estimation) reconstructs
**Spatially-inferred Virtual Cells (SVCs)** from spatial transcriptomics data
by integrating ST measurements, histological images, and matched single-cell
RNA-seq references.

The current codebase is organized around one configuration-driven engine,
`REVISEPipeline`, and two user-facing modes:

| Mode | Goal | Main entry points | Primary outputs |
| --- | --- | --- | --- |
| `benchmark` | Reproduce Sim2Real-ST evaluations across six confounding factors | `benchmark_main.py`, `benchmark_main.sh`, `reproduce/benchmark/*.ipynb` | `metrics_normalized.csv` with PCC, SSIM, MSE, and NRMSE |
| `application` | Reconstruct SVCs and run downstream real-data analysis | `application_sp_SVC_recon.py`, `application_sc_SVC_recon.py`, `reproduce/case/*.ipynb` | `sp_SVC.h5ad`, `sc_SVC_expr.h5ad`, `sc_SVC_spatial.h5ad`, notebook figures |

Documentation: <https://revise-svc.readthedocs.io/en/latest/>

Dataset and reproduced results: <https://zenodo.org/records/17705737>

## What REVISE Covers

Sim2Real-ST benchmarks six confounding factors across three spatial
transcriptomics platform types:

- Spatially heterogeneous factors: image segmentation artifacts and bin-to-cell
  assignment errors.
- Spatially homogeneous factors: spot size, batch effect, gene panel
  limitation, and gene dropout.

REVISE reconstructs two complementary SVC types:

- `sp-SVC`: spatial refinement for hST platforms such as Visium HD.
- `sc-SVC`: molecular completion and cell-state refinement for iST/sST
  platforms such as Xenium and Visium.

## Architecture

Runs on the current (non-legacy) code path pass through these layers, in order:

1. `revise.framework.REVISEPipeline`
2. `revise/revise.yaml` profiles and runtime/io overrides
3. `revise.recon.pipeline.UnifiedReconstructionPipeline`
4. backend strategy and plugin registries in `revise/backend/`

`UnifiedReconstructionPipeline` owns the fixed lifecycle: input validation,
global anchoring, local unit preparation, graph construction, OT problem
construction, OT solving, expression update, SVC finalization, and optional
benchmark evaluation.
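
As a rough illustration of the "fixed lifecycle, pluggable stages" idea, the
sketch below runs the stages named above in a fixed order and lets the caller
supply the handler for each one. It is not the actual
`UnifiedReconstructionPipeline` code: `STAGES`, `run_lifecycle`, and the no-op
handlers are hypothetical names chosen for this example.

```python
from typing import Callable, Dict, List, Tuple

# Stage names mirror the lifecycle listed above; the identifiers are hypothetical.
STAGES: Tuple[str, ...] = (
    "input_validation",
    "global_anchoring",
    "local_unit_preparation",
    "graph_construction",
    "ot_problem_construction",
    "ot_solving",
    "expression_update",
    "svc_finalization",
)

Stage = Callable[[Dict], Dict]


def run_lifecycle(context: Dict, handlers: Dict[str, Stage], evaluate: bool = False) -> Dict:
    """Run every stage in a fixed order; only the handlers are swappable."""
    order: List[str] = list(STAGES)
    if evaluate:
        order.append("benchmark_evaluation")  # optional final stage
    for name in order:
        context = handlers[name](context)
        context.setdefault("completed", []).append(name)
    return context


# No-op handlers stand in for real backend strategies.
handlers = {name: (lambda ctx: ctx) for name in STAGES + ("benchmark_evaluation",)}
result = run_lifecycle({}, handlers, evaluate=True)
print(result["completed"])
```

Keeping the order in one place means a backend can replace a single stage, for
example the OT solver, without changing how the stages are sequenced.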

Legacy-style runner classes are kept under `revise/backend/runners/` for
notebook compatibility and parity checks. New code should prefer
`REVISEPipeline` or the root wrapper scripts (`benchmark_main.py`,
`application_sp_SVC_recon.py`, and `application_sc_SVC_recon.py`).

## Installation

Install the package from PyPI:

```bash
pip install revise-svc
```

Optional annotation support:

```bash
pip install "revise-svc[annotation]"
```

Development install:

```bash
git clone https://github.com/wuys13/REVISE.git
cd REVISE
pip install -e ".[dev]"
```

To reproduce the paper results, download the Sim2Real-ST benchmark data and the
real application data from [Zenodo](https://zenodo.org/records/17705737) and
place them under `raw_data/`. The Quick Start commands below read from
`raw_data/Sim2Real-ST` and `raw_data/Real_application`.

## Quick Start

### Benchmark Mode

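`benchmark_main.py` runs Sim2Real-ST cases and writes per-gene benchmark metrics
to `metrics_normalized.csv`. The metrics reported in the paper are the Pearson
correlation coefficient (PCC), structural similarity index (SSIM), and mean
squared error (MSE); the normalized root mean squared error (NRMSE) is also kept
in the CSV for legacy compatibility.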

```bash
python benchmark_main.py \
  --cf segmentation \
  --raw_data_path raw_data/Sim2Real-ST \
  --sample_name P2CRC/cut_part1 \
  --task segmentation \
  --save_path output/benchmark
```

Supported `--cf` values:

- `segmentation`
- `bin2cell`
- `batch_effect`
- `spot_size`
- `gene_panel`
- `gene_dropout`
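
When launching runs from Python rather than the shell, a small helper can
validate the `--cf` value before building the command. The sketch below uses
only the flags shown in the example above; `build_benchmark_command` and
`SUPPORTED_CF` are hypothetical names, and the default paths are copied from
that example rather than from the script itself.

```python
import shlex
from typing import List

# The six values listed above.
SUPPORTED_CF = (
    "segmentation",
    "bin2cell",
    "batch_effect",
    "spot_size",
    "gene_panel",
    "gene_dropout",
)


def build_benchmark_command(
    cf: str,
    sample_name: str,
    task: str,
    raw_data_path: str = "raw_data/Sim2Real-ST",
    save_path: str = "output/benchmark",
) -> List[str]:
    """Return an argv list for benchmark_main.py, rejecting unknown --cf values."""
    if cf not in SUPPORTED_CF:
        raise ValueError(f"unsupported --cf value {cf!r}; expected one of {SUPPORTED_CF}")
    return [
        "python", "benchmark_main.py",
        "--cf", cf,
        "--raw_data_path", raw_data_path,
        "--sample_name", sample_name,
        "--task", task,
        "--save_path", save_path,
    ]


cmd = build_benchmark_command("segmentation", "P2CRC/cut_part1", "segmentation")
print(shlex.join(cmd))  # shell-quoted form of the command, for logging
```

The list can be passed to `subprocess.run(cmd, check=True)` from the repository
root. Only the `segmentation` pairing of `--cf` and `--task` appears in this
README, so check the script for the `--task` values valid for other factors.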

Use the merged launcher for multi-case reproduction:

```bash
bash benchmark_main.sh
```
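
After a run, the per-gene values in `metrics_normalized.csv` can be condensed
into one number per metric. The sketch below makes no assumption about the
column names in that file: it takes the median of every column whose values all
parse as numbers and skips the rest. `summarize_metrics` is a hypothetical
helper, not part of the package.

```python
import csv
import statistics
from typing import Dict


def summarize_metrics(csv_path: str) -> Dict[str, float]:
    """Return the median of every column whose values all parse as floats."""
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    summary: Dict[str, float] = {}
    if not rows:
        return summary
    for column in rows[0]:
        try:
            values = [float(row[column]) for row in rows]
        except (TypeError, ValueError):
            # Non-numeric column (for example gene identifiers) or a column
            # with empty cells: skipped in this sketch.
            continue
        summary[column] = statistics.median(values)
    return summary
```

Call it with the path to the CSV written under your `--save_path`. The median
is used here only because it is robust to a few outlier genes; it is not
necessarily the aggregation used in the paper.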

### Application Mode

Application scripts default to `output/` subdirectories so notebook analysis can
load the reconstructed SVC files directly.

For hST / Visium HD style sp-SVC reconstruction:

```bash
python application_sp_SVC_recon.py \
  --raw_data_path raw_data/Real_application \
  --sample_name P1CRC \
  --st_file HD.h5ad \
  --sc_ref_file adata_sc_all_reanno.h5ad
```

Default output path, as expected by the published notebooks:

```text
output/sp_SVC_case/<sample_name>/sp_SVC.h5ad
```

For iST / Xenium style sc-SVC reconstruction:

```bash
python application_sc_SVC_recon.py \
  --sample_name P2CRC \
  --data_type Xenium \
  --raw_data_path raw_data/Real_application \
  --sc_ref_file adata_sc_all_reanno.h5ad \
  --select_ct T
```

Default output paths, as expected by the published notebooks:

```text
output/sc_SVC_case/<sample_name>_<data_type>/<select_ct>/sc_SVC_expr.h5ad
output/sc_SVC_case/<sample_name>_<data_type>/<select_ct>/sc_SVC_spatial.h5ad
```
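
Downstream scripts can rebuild these default locations instead of hard-coding
them. The helpers below follow the path templates shown in this section;
`sp_svc_output` and `sc_svc_outputs` are hypothetical names, and the
`output_root` default assumes the scripts were run with their default output
settings.

```python
from pathlib import Path
from typing import Dict


def sp_svc_output(sample_name: str, output_root: str = "output") -> Path:
    """Default sp-SVC result path: output/sp_SVC_case/<sample_name>/sp_SVC.h5ad."""
    return Path(output_root) / "sp_SVC_case" / sample_name / "sp_SVC.h5ad"


def sc_svc_outputs(
    sample_name: str, data_type: str, select_ct: str, output_root: str = "output"
) -> Dict[str, Path]:
    """Default sc-SVC result paths for one sample, platform, and cell type."""
    case_dir = Path(output_root) / "sc_SVC_case" / f"{sample_name}_{data_type}" / select_ct
    return {
        "expr": case_dir / "sc_SVC_expr.h5ad",
        "spatial": case_dir / "sc_SVC_spatial.h5ad",
    }


print(sp_svc_output("P1CRC"))
for kind, path in sc_svc_outputs("P2CRC", "Xenium", "T").items():
    print(kind, path)
```

Each path can then be opened with `anndata.read_h5ad(path)` once the
corresponding reconstruction has finished.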

### Python API

```python
from revise.framework import REVISEPipeline

pipeline = REVISEPipeline(config_path="revise/revise.yaml")
svc = pipeline.run(
    profile="application_sc",
    runtime_overrides={"platform": "iST", "confounding": "segmentation"},
    io_overrides={
        "data_root": "raw_data/Real_application",
        "output_root": "output/sc_SVC_case",
        "sample_name": "P2CRC",
        "st_file": "Xenium.h5ad",
        "sc_ref_file": "adata_sc_all_reanno.h5ad",
        "patient_key": "Patient",
    },
    set_overrides=["sc.select_ct=T"],
)
```
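
The `set_overrides` entries use a dotted `key=value` form. To show how such a
string can address a nested configuration value, here is a minimal sketch that
applies dotted overrides to a plain dictionary. It is an illustration only:
`apply_set_overrides` is a hypothetical function, and REVISE's own loader may
coerce types or validate keys differently.

```python
import copy
from typing import Any, Dict, Iterable


def apply_set_overrides(config: Dict[str, Any], overrides: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of config with each 'a.b.c=value' override applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, sep, value = item.partition("=")
        if not sep or not dotted_key:
            raise ValueError(f"expected 'key=value', got {item!r}")
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})  # create missing sections
        node[leaf] = value  # kept as a string; no type coercion in this sketch
    return result


config = {"sc": {"select_ct": None}}
updated = apply_set_overrides(config, ["sc.select_ct=T"])
print(updated)
```

The input dictionary is left untouched, which keeps a loaded profile reusable
across several runs with different overrides.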

## Notebooks

| Area | Files | Purpose |
| --- | --- | --- |
| Benchmark | `reproduce/benchmark/seg_benchmark.ipynb`, `spot_benchmark.ipynb`, `batch_benchmark.ipynb`, `imputation_benchmark.ipynb` | Inspect Sim2Real-ST benchmark outputs and PCC/SSIM/MSE trends |
| Application reconstruction | `reproduce/case/*_recon.ipynb`, `reproduce/case/sp_SVC_case.ipynb` | Rebuild paper application cases from raw inputs |
| Application analysis | `reproduce/case/*_analysis.ipynb`, `application_sc_SVC_analysis_case.ipynb` | Analyze SVC outputs, cell states, pathways, spatial patterns, and downstream figures |
| SMI case | `SMI/CosMx-SMI-REVISE_spSVC.ipynb` | CosMx SMI sp-SVC application example |

ReadTheDocs links the maintained benchmark and case notebooks through
`docs/benchmark/` and `docs/case/`.

## Repository Layout

- `revise/framework.py`: public `REVISEPipeline` entry point.
- `revise/revise.yaml`: routing profiles and default configuration.
- `revise/recon/`: unified pipeline context and lifecycle orchestration.
- `revise/backend/`: strategies, platform adapters, plugin registries, kernels,
  and lower-level operations.
- `revise/config/`: config loader and internal runner configuration contracts.
- `revise/analysis/`: benchmark metric and downstream analysis helpers.
- `reproduce/benchmark/`: benchmark launchers and analysis notebooks.
- `reproduce/case/`: real application reconstruction and analysis notebooks.
- `docs/`: ReadTheDocs / Sphinx source.

## License

REVISE is released under the MIT License.
