Metadata-Version: 2.4
Name: dgscrna
Version: 1.1
Summary: A Python package for single-cell RNA-seq cell type annotation using marker-based scoring and deep learning
Author-email: Yimin Liu <yiminliu.career@gmail.com>
License: GPL-3.0
Project-URL: Homepage, https://github.com/yourusername/DGscRNA
Project-URL: Repository, https://github.com/yourusername/DGscRNA.git
Project-URL: Bug Tracker, https://github.com/yourusername/DGscRNA/issues
Keywords: single-cell,RNA-seq,cell-type,annotation,bioinformatics
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21.0
Requires-Dist: scipy>=1.7.0
Requires-Dist: pandas>=1.3.0
Requires-Dist: scanpy>=1.9.0
Requires-Dist: anndata>=0.8.0
Requires-Dist: scikit-learn>=1.0.0
Requires-Dist: torch>=1.9.0
Requires-Dist: torchmetrics>=0.7.0
Requires-Dist: hdbscan>=0.8.0
Requires-Dist: leidenalg>=0.8.0
Requires-Dist: harmonypy>=0.0.5
Requires-Dist: matplotlib>=3.5.0
Requires-Dist: seaborn>=0.11.0
Requires-Dist: tqdm>=4.62.0
Requires-Dist: joblib>=1.1.0
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: pytest-cov>=2.0; extra == "dev"
Requires-Dist: black>=21.0; extra == "dev"
Requires-Dist: flake8>=3.8; extra == "dev"
Requires-Dist: mypy>=0.800; extra == "dev"
Requires-Dist: bump2version>=1.0.0; extra == "dev"
Dynamic: license-file

# DGscRNA

A Python package for single-cell RNA-seq cell type annotation using marker-based scoring and deep learning refinement.

## Overview

DGscRNA combines traditional marker-based cell type scoring with deep learning to resolve ambiguous cell type assignments in single-cell RNA-seq data. The workflow includes:

1. **Preprocessing**: Quality control, normalization, and dimensionality reduction
2. **Clustering**: Multiple clustering algorithms (Leiden, HDBSCAN, K-means)
3. **Marker Scoring**: Density-based scoring using known cell type markers
4. **Deep Learning**: Neural network refinement of ambiguous annotations
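
To make the marker scoring and "ambiguous annotation" ideas concrete, here is a deliberately simplified sketch. It is not DGscRNA's density-based scoring: it uses the plain mean expression of each cell type's markers and flags a cell as ambiguous when the top two scores are close. The function names, the `margin` threshold, and the `"Undecided"` label are all hypothetical and chosen only for illustration.

```python
def score_cell(expression, markers):
    """Mean expression of each cell type's marker genes in one cell."""
    scores = {}
    for cell_type, genes in markers.items():
        values = [expression.get(gene, 0.0) for gene in genes]
        scores[cell_type] = sum(values) / len(values) if values else 0.0
    return scores


def assign(scores, margin=0.5):
    """Return the top-scoring cell type, or 'Undecided' if its lead is below margin."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return "Undecided"
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < margin:
        return "Undecided"
    return ranked[0][0]


# Toy marker sets and two toy cells (gene -> normalized expression).
markers = {"T cell": ["CD3D", "CD3E"], "B cell": ["CD79A", "MS4A1"]}
clear_cell = {"CD3D": 2.0, "CD3E": 3.0, "CD79A": 0.0}
mixed_cell = {"CD3D": 1.0, "CD3E": 1.0, "CD79A": 1.2, "MS4A1": 0.6}

clear_label = assign(score_cell(clear_cell, markers))
mixed_label = assign(score_cell(mixed_cell, markers))
print(clear_label, mixed_label)
```

Cells flagged like `mixed_cell` are the kind of ambiguous assignment that the deep learning step is meant to refine.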

## Installation

```bash
pip install dgscrna
```

Or install from source:

```bash
git clone https://github.com/yourusername/DGscRNA.git
cd DGscRNA
pip install -e .
```

## Quick Start

```python
import scanpy as sc
import dgscrna as dg

# Load your data
adata = sc.read_h5ad('your_data.h5ad')

# Run the complete pipeline
results = dg.run_dgscrna_pipeline(
    adata=adata,
    marker_folder='path/to/marker/sets/',
    clustering_methods=['leiden', 'hdbscan'],
    deep_learning=True
)

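# View results. The annotation column name below is only an example;
# inspect adata.obs.columns to find the columns produced for your marker sets.
sc.pl.umap(adata, color=['leiden', 'CellMarker_Thyroid_mean_DGscRNA'])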
```

## Input Data Format

### Single-cell Data
- **Format**: AnnData object (scanpy/anndata)
- **Requirements**: Preprocessed and normalized gene expression matrix

### Marker Sets
- **Format**: CSV files in a folder
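- **Structure**: One column per cell type, with the cell type name in the header row; each column lists that cell type's marker genes, one per row. The first column is an unnamed row index.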
- **Example**:
```csv
,CellType1,CellType2,CellType3
0,Gene1,Gene4,Gene7
1,Gene2,Gene5,Gene8
2,Gene3,Gene6,Gene9
```
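
As a quick sanity check before running the pipeline, a marker file in this layout can be parsed with the standard library. The sketch below is not part of the dgscrna API; `read_marker_sets` is a hypothetical helper, and it assumes that shorter marker lists are padded with blank cells.

```python
import csv
import io


def read_marker_sets(handle):
    """Parse a marker CSV (one column per cell type) into {cell_type: [genes]}."""
    reader = csv.reader(handle)
    header = next(reader)
    cell_types = header[1:]  # the first column is the unnamed row index
    markers = {name: [] for name in cell_types}
    for row in reader:
        # Skip the index cell; blank cells are treated as padding (assumption).
        for name, gene in zip(cell_types, row[1:]):
            gene = gene.strip()
            if gene:
                markers[name].append(gene)
    return markers


example = """,CellType1,CellType2,CellType3
0,Gene1,Gene4,Gene7
1,Gene2,Gene5,Gene8
2,Gene3,Gene6,
"""

markers = read_marker_sets(io.StringIO(example))
print(markers)
```

For a file on disk, pass an open file object instead, for example `open(path, newline="")`.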

## Output

- **AnnData object**: With added annotation columns
- **Results dictionary**: Training scores and metrics
- **Visualization**: UMAP plots with annotations
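
To locate the added annotation columns, filtering column names by suffix is usually enough. The sketch below assumes the annotation columns end in `_DGscRNA`; that suffix is inferred only from the column name used in the Quick Start example, and `annotation_columns` is a hypothetical helper rather than a dgscrna function.

```python
def annotation_columns(columns, suffix="_DGscRNA"):
    """Return the column names that end with the given suffix."""
    return [name for name in columns if str(name).endswith(suffix)]


# Stand-in for adata.obs.columns after a pipeline run.
obs_columns = ["n_genes", "leiden", "CellMarker_Thyroid_mean_DGscRNA"]
found = annotation_columns(obs_columns)
print(found)
```

With a real AnnData object, call `annotation_columns(adata.obs.columns)`.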

## Documentation

- [API Reference](docs/api.md)
- [Installation Guide](docs/installation.md)
- [Tutorial](docs/tutorial.md)
- [Examples](examples/)

## License

Released under the GPL-3.0 license; see the LICENSE file for details.

## Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests.

## Support

For questions and support, please open an issue on GitHub or contact the maintainers.
