Metadata-Version: 2.4
Name: mossn
Version: 0.1.0
Summary: MOSSN: sample-specific protein network inference from gene expression and multi-omics data
Author: Zihao Chen
License: Copyright (c) 2026 Zihao Chen
        
        All rights reserved.
        
        This source distribution is provided for research and evaluation unless a
        separate license is granted by the author.
        
Keywords: bioinformatics,network,ppi,multi-omics,gene-expression
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: networkx>=3.0
Requires-Dist: numpy>=1.23
Requires-Dist: pandas>=1.5
Requires-Dist: scipy>=1.10
Provides-Extra: test
Requires-Dist: pytest>=7.0; extra == "test"
Dynamic: license-file

# mossn

`mossn` packages the MOSSN algorithm for constructing sample-specific protein
interaction networks from gene expression data, together with ablation variants
and multi-omics extensions.

## Features

- Sample-specific edge reweighting using gene-expression-derived correction
  scores.
- Random walk with restart (RWR) to estimate node importance per sample.
- Ablation variants:
  - `no_prior`
  - `uniform`
  - `no_seed`
  - `no_rwr`
  - `no_corr`
- Multi-omics extensions:
  - coupled restart
  - direct cross-layer graph
  - multilayer graph
  - late fusion
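
As a rough illustration of the random walk with restart step listed above, the sketch below iterates the standard RWR update on a toy graph using NumPy. It is not the implementation shipped in `mossn`: the function name `random_walk_with_restart`, the `restart_prob` default, and the convergence settings are all assumptions chosen for the example.

```python
import numpy as np


def random_walk_with_restart(adjacency, restart, restart_prob=0.5,
                             tol=1e-10, max_iter=1000):
    """Illustrative RWR; names and defaults are hypothetical, not mossn's."""
    adjacency = np.asarray(adjacency, dtype=float)

    # Column-normalize so each column sums to 1 (isolated nodes stay at 0).
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    transition = adjacency / col_sums

    # Normalize the restart vector into a probability distribution.
    restart = np.asarray(restart, dtype=float)
    restart = restart / restart.sum()

    p = restart.copy()
    for _ in range(max_iter):
        p_next = (1 - restart_prob) * (transition @ p) + restart_prob * restart
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Toy path graph 0 - 1 - 2 with the walker restarting at node 0.
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]])
scores = random_walk_with_restart(adjacency, restart=[1, 0, 0])
print(scores)
```

Nodes closer to the restart node receive higher scores, which is the sense in which RWR estimates node importance relative to a set of seeds.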

## Installation

```bash
pip install mossn
```

For local development:

```bash
pip install -e .
```

## Quick Start

```python
from mossn import prepare_data_no_prior, run_no_prior_single_sample
from mossn.example_data import load_example_expression, load_example_links

links = load_example_links()
expression_data = load_example_expression()

graph, base_weights, expression_data = prepare_data_no_prior(links, expression_data)
edge_table = run_no_prior_single_sample(
    sample_id=expression_data.columns[0],
    graph=graph,
    base_weights=base_weights,
    expression_data=expression_data,
)

print(edge_table.head())
```

## Bundled Example Data

The package includes the following example datasets:

- TCGA BLCA expression matrix
- STRING-derived PPI links

You can access them with:

```python
from mossn.example_data import (
    get_example_expression_path,
    get_example_links_path,
    load_example_expression,
    load_example_links,
)
```

## Input Format

### PPI links

The `links` table must contain the following columns:

- `protein1`
- `protein2`
- `score`
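
A minimal sketch of a table with this layout, built with pandas, may help when preparing custom input. The protein identifiers and score values are made up for illustration, and the identifier type and score scale that `mossn` expects are not stated here, so treat both as assumptions.

```python
import pandas as pd

# Hypothetical identifiers and scores, for illustration only.
links = pd.DataFrame(
    {
        "protein1": ["TP53", "TP53", "MDM2"],
        "protein2": ["MDM2", "CDKN1A", "CDKN1A"],
        "score": [999, 950, 700],
    }
)

# Check that the three required columns are present before using the table.
required = {"protein1", "protein2", "score"}
missing = required - set(links.columns)
if missing:
    raise ValueError(f"links table is missing columns: {sorted(missing)}")

print(links)
```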

### Expression matrix

The expression matrix must be oriented with:

- genes or proteins as rows
- sample IDs as columns
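
The orientation above can be sketched with a small pandas `DataFrame`. The gene names, sample IDs, and values are hypothetical, and the identifier check is only one possible way to confirm that the matrix and the network share identifiers. It is not a step the package is known to require you to run yourself.

```python
import pandas as pd

# Rows are genes/proteins, columns are sample IDs (all values are made up).
expression_data = pd.DataFrame(
    {"sample_A": [5.1, 2.3, 7.8], "sample_B": [4.9, 3.1, 6.5]},
    index=["TP53", "MDM2", "EGFR"],
)

# Hypothetical set of node identifiers taken from a links table.
network_nodes = {"TP53", "MDM2", "CDKN1A"}

# Keep only the rows whose identifiers also appear in the network.
shared = sorted(network_nodes & set(expression_data.index))
matched_expression = expression_data.loc[shared]

print(matched_expression)
```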

## Main API

### Single-omics

- `prepare_data_no_prior`
- `run_no_prior_single_sample`
- `prepare_data_uniform`
- `run_uniform_single_sample`
- `prepare_data_no_seed`
- `run_no_seed_single_sample`
- `prepare_data_no_rwr`
- `run_no_rwr_single_sample`
- `prepare_data_no_corr`
- `run_no_corr_single_sample`

### Data-driven extension

- `prepare_data_driven`
- `build_graph_from_correlation`
- `run_driven_single_sample`
- `infer_driven_network`

### Multi-omics

- `prepare_data_coupled`
- `run_coupled_single_sample`
- `prepare_data_direct`
- `run_direct_single_sample`
- `prepare_data_multilayer`
- `run_multilayer_single_sample`
- `run_late_fusion_single_sample`

## Notes

- Identifiers in the network must match the row identifiers of the omics
  tables.
- Sample-specific normalization uses median and interquartile range (IQR).
- Node importance is rank-normalized before computing final edge weights.
- The data-driven mode first infers a background graph from expression
  correlations when an external reference network is unavailable.
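
The two normalization steps mentioned above can be sketched as follows. This is a generic illustration of median/IQR scaling and rank normalization on a one-dimensional vector; the helper names `robust_scale` and `rank_normalize`, the choice of axis, and the handling of ties and zero IQR are assumptions, not the package's documented behavior.

```python
import numpy as np
from scipy.stats import rankdata


def robust_scale(values):
    """Center on the median and scale by the IQR (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    if iqr == 0:
        # Assumption: treat a zero-spread vector as having no deviation.
        return np.zeros_like(values)
    return (values - median) / iqr


def rank_normalize(scores):
    """Map scores to ranks in (0, 1]; ties share their average rank."""
    scores = np.asarray(scores, dtype=float)
    return rankdata(scores) / len(scores)


scaled = robust_scale([1, 2, 3, 4, 5])
ranked = rank_normalize([0.1, 0.7, 0.2])
print(scaled)
print(ranked)
```

Scaling by the median and IQR is less sensitive to outliers than scaling by the mean and standard deviation, and rank normalization puts node scores on a common scale regardless of their original distribution.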

## Data-Driven Example

```python
from mossn import infer_driven_network, run_driven_single_sample
from mossn.example_data import load_example_expression

expression_data = load_example_expression()
graph, base_weights, expression_data = infer_driven_network(
    expression_data=expression_data,
    cor_threshold=0.9,
)

edge_table = run_driven_single_sample(
    sample_id=expression_data.columns[0],
    graph=graph,
    base_weights=base_weights,
    expression_data=expression_data,
)
```
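
To give a sense of what inferring a background graph from expression correlations can look like, the sketch below thresholds a gene-by-gene Pearson correlation matrix and stores the result in a `networkx` graph. It is not the code behind `infer_driven_network` or `build_graph_from_correlation`: the function name `correlation_graph`, the use of Pearson correlation, and the use of absolute correlation as the edge weight are all assumptions.

```python
import networkx as nx
import pandas as pd


def correlation_graph(expression, threshold=0.9):
    """Connect genes whose |Pearson r| meets the threshold (hypothetical)."""
    # Rows are genes, so transpose before computing gene-by-gene correlation.
    corr = expression.T.corr()
    genes = list(corr.index)

    graph = nx.Graph()
    graph.add_nodes_from(genes)
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = corr.loc[gene_a, gene_b]
            # NaN correlations (e.g. constant rows) fail this comparison.
            if abs(r) >= threshold:
                graph.add_edge(gene_a, gene_b, weight=abs(r))
    return graph


# Toy matrix: g1 and g2 are perfectly correlated, g3 is not.
toy_expression = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0],
     [2.0, 4.0, 6.0, 8.0],
     [4.0, 1.0, 3.0, 2.0]],
    index=["g1", "g2", "g3"],
    columns=["s1", "s2", "s3", "s4"],
)
toy_graph = correlation_graph(toy_expression, threshold=0.9)
print(list(toy_graph.edges(data=True)))
```

A high threshold such as 0.9 keeps the inferred graph sparse. The pairwise loop is quadratic in the number of genes, so a vectorized approach would be preferable for large matrices.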
