Metadata-Version: 2.4
Name: scpclm
Version: 0.1.0
Summary: Hybrid parametric modeling of single-cell gene expression dynamics along pseudotime
Author: Yuan
License: MIT
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: pandas
Requires-Dist: matplotlib
Dynamic: author
Dynamic: description
Dynamic: description-content-type
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# scPCLM

scPCLM is a hybrid parametric framework for modeling gene expression dynamics along pseudotime in single-cell RNA-seq data.

It is designed to identify differentiation-associated genes by fitting biologically interpretable hill-shaped and valley-shaped trajectories, while providing robust estimation of temporal features such as turning points.

---

## Features

- Supports multiple count distributions:
  - Poisson  
  - Negative Binomial (NB)  
  - Zero-Inflated Poisson (ZIP)  
  - Zero-Inflated Negative Binomial (ZINB)

- Models asymmetric expression patterns:
  - Hill-shaped trajectories  
  - Valley-shaped trajectories  

- Hybrid optimization strategy:
  - Global search via Differential Evolution  
  - Local refinement via L-BFGS-B  

- Interpretable parameters:
  - Turning point (t0)  
  - Activation and decay rates (k1, k2)  
  - Expression magnitude (mu)  

- Uncertainty estimation:
  - Fisher information-based confidence intervals  
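
The hybrid strategy listed above can be sketched with SciPy's `differential_evolution` and `minimize`. This is a minimal illustration, not scPCLM's actual implementation: the `hill_mean` curve, the Poisson likelihood, the parameter bounds, and the simulated data are all assumptions chosen for the example, and the package's own parameterization of `mu`, `k1`, `k2`, and `t0` may differ.

```python
import numpy as np
from scipy.optimize import differential_evolution, minimize
from scipy.special import gammaln


def hill_mean(t, mu, k1, k2, t0):
    """Hypothetical asymmetric hill: rate k1 before the turning point t0, k2 after it."""
    rate = np.where(t <= t0, k1, k2)
    return mu * np.exp(-rate * (t - t0) ** 2)


def poisson_nll(params, t, y):
    """Negative Poisson log-likelihood of counts y under the hill-shaped mean."""
    mu, k1, k2, t0 = params
    lam = np.clip(hill_mean(t, mu, k1, k2, t0), 1e-10, None)
    return float(np.sum(lam - y * np.log(lam) + gammaln(y + 1)))


def fit_hybrid(t, y, bounds, seed=0):
    """Global search with differential evolution, then local L-BFGS-B refinement."""
    coarse = differential_evolution(
        poisson_nll, bounds, args=(t, y), seed=seed, polish=False, maxiter=200
    )
    refined = minimize(
        poisson_nll, coarse.x, args=(t, y), method="L-BFGS-B", bounds=bounds
    )
    return coarse, refined


# Simulated data (assumed values, for illustration only)
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 300)
y = rng.poisson(hill_mean(t, mu=20.0, k1=30.0, k2=8.0, t0=0.4))

# Box constraints for (mu, k1, k2, t0)
bounds = [(0.1, 100.0), (0.1, 100.0), (0.1, 100.0), (0.0, 1.0)]
coarse, refined = fit_hybrid(t, y, bounds)
mu_hat, k1_hat, k2_hat, t0_hat = refined.x
print(f"mu={mu_hat:.2f}, k1={k1_hat:.2f}, k2={k2_hat:.2f}, t0={t0_hat:.3f}")
```

Passing `polish=False` to the global stage keeps the two stages separate, so the explicit L-BFGS-B call is the only local refinement.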

---

## Installation

Install from PyPI:

```bash
pip install scpclm
```

Or install from source:

```bash
git clone https://github.com/jesseslight9-jpg/scpclm
cd scpclm
pip install -e .
```

---

## Input Data Format

Input data should be a cell-by-gene table with one row per cell, where:

- The first column after the row index is pseudotime  
- The remaining columns are gene expression values, one column per gene  

Example:

| Index | Pseudotime | Gene1 | Gene2 |
|------|------------|-------|-------|
| 1    | t1         | y11   | y12   |
| 2    | t2         | y21   | y22   |

---

## Python API Usage

### Automatic mode

```python
from scpclm import scpclm_auto

# `pseudotime` and `expression` are placeholders for your own data; define them before this call.
result = scpclm_auto(
    gene_index=100,
    t=pseudotime,
    y1=expression,
    marginal="ZIP",
    iter_num=100,
    save_dir="Results/"
)
```

---

### Hill-only mode

```python
from scpclm import scpclm_hill

result = scpclm_hill(
    gene_index=100,
    t=pseudotime,
    y1=expression,
    marginal="NB"
)
```

---

### Valley-only mode

```python
from scpclm import scpclm_valley

result = scpclm_valley(
    gene_index=100,
    t=pseudotime,
    y1=expression,
    marginal="ZIP"
)
```

---

## Output

The model returns:

- Estimated parameters: mu, k1, k2, t0  
- Model selection criterion: AIC  
- Confidence intervals  
- Fitted expression values  
- Visualization plots (.png)  
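
To show how Fisher information-based intervals and AIC are typically computed, here is a generic sketch. It is not taken from the scPCLM source: the helper names (`numerical_hessian`, `wald_intervals`, `aic`) are hypothetical, the intervals are Wald intervals from the observed information (the Hessian of the negative log-likelihood at the estimate), and the constant-rate Poisson model is used only because its answer is known analytically.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import norm


def numerical_hessian(f, x, eps=1e-4):
    """Central-difference Hessian of a scalar function f at point x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    steps = np.eye(n) * eps
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = steps[i], steps[j]
            hess[i, j] = (
                f(x + ei + ej) - f(x + ei - ej) - f(x - ei + ej) + f(x - ei - ej)
            ) / (4 * eps**2)
    return hess


def wald_intervals(nll, theta_hat, level=0.95):
    """Wald intervals: estimate +/- z * sqrt(diag(inverse observed information))."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    cov = np.linalg.inv(numerical_hessian(nll, theta_hat))
    se = np.sqrt(np.diag(cov))
    z = norm.ppf(0.5 + level / 2)
    return np.column_stack([theta_hat - z * se, theta_hat + z * se])


def aic(nll_value, n_params):
    """AIC = 2k - 2 log L, written in terms of the negative log-likelihood."""
    return 2 * n_params + 2 * nll_value


# Toy model: Poisson counts with one constant rate, whose MLE is the sample mean
rng = np.random.default_rng(1)
y = rng.poisson(5.0, size=200)


def nll(theta):
    lam = theta[0]
    return float(np.sum(lam - y * np.log(lam) + gammaln(y + 1)))


theta_hat = np.array([y.mean()])
ci = wald_intervals(nll, theta_hat)
print("95% CI for the rate:", ci[0])
print("AIC:", aic(nll(theta_hat), n_params=1))
```

When comparing candidate fits of the same data, the one with the lower AIC is preferred.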

---

## Command-line Usage (Optional)

```bash
python scripts/run_scPCLM.py \
    --model.iter 100 \
    --model.marginal ZIP \
    --data.dir your_data.csv \
    --model.save_dir Results/
```

---

## Project Structure

```
scpclm/
    scpclm/      core implementation
    scripts/     command-line scripts
    examples/    demos
```

---

## License

MIT License

---

## Citation

If you use this method, please cite:

scPCLM: A hybrid parametric approach for identifying differentiation-associated genes from single-cell RNA-seq data
