Metadata-Version: 2.4
Name: xener
Version: 0.1.4
Summary: Single-cell cross-species cell type annotation tool using knowledge graph.
Author: Xener Team
Author-email: Shuai Liu <liushuai6@genomics.cn>, Huan Zhang <zhanghuan4@genomics.cn>, Lei Cao <caolei2@genomics.cn>, Shuangsang Fang <fangshuangsang@genomics.cn>
License-Expression: MIT
Project-URL: Homepage, https://xenor.dcs.cloud/
Project-URL: GitHub, https://github.com/liushuai6bgi/Xener
Keywords: single-cell,cell-type-annotation,knowledge-graph,bioinformatics
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=2.2
Requires-Dist: scanpy>=1.11
Requires-Dist: pandas>=2.3
Requires-Dist: networkx>=3.5
Requires-Dist: scipy>=1.16
Requires-Dist: psutil>=5.9
Requires-Dist: biopython>=1.85
Requires-Dist: openai>=1.102
Requires-Dist: neo4j>=5.28
Requires-Dist: anndata>=0.12
Requires-Dist: h5py>=3.14
Requires-Dist: langchain-openai>=0.3.32
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=3.0.0; extra == "dev"
Requires-Dist: black>=22.0.0; extra == "dev"
Requires-Dist: flake8>=4.0.0; extra == "dev"
Requires-Dist: mypy>=0.950; extra == "dev"
Dynamic: license-file

# Xener
> This is the public version, containing only the necessary code.

A cross-species, single-cell cell type annotation tool that uses a knowledge graph.

## Installation

```bash
# Install from a local checkout of the source
pip install .
# or install the published package from PyPI
pip install xener
```

## Quick Start

```python
from xener import Xener

# Initialize
annor = Xener()

# Run full pipeline
cluster2celltype, _ = annor.run_from_yaml('config.yaml')
```

Example `config.yaml`:

```yaml
cluster_key: leiden
model_species:
- Brassica_rapa
non_model_fasta: Arabidopsis_thaliana.fasta
non_model_h5ad: ERP132245.h5ad
organ: leaf
outdir: output/ERP132245
```
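
Before running the pipeline, it can help to confirm that a configuration contains every key shown in the example above. The following is a minimal sketch, not part of Xener: the helper name `missing_config_keys` is hypothetical, and treating all six keys as required is an assumption, since the README does not say which ones are optional.

```python
# Keys taken from the example config.yaml above.
# Assumption: all of them are required; the README does not say which are optional.
EXAMPLE_KEYS = (
    "cluster_key",
    "model_species",
    "non_model_fasta",
    "non_model_h5ad",
    "organ",
    "outdir",
)


def missing_config_keys(config: dict) -> list[str]:
    """Return the example keys that are absent from `config`, in example order."""
    return [key for key in EXAMPLE_KEYS if key not in config]


# The same values as the example config.yaml, written as a Python dict.
example_config = {
    "cluster_key": "leiden",
    "model_species": ["Brassica_rapa"],
    "non_model_fasta": "Arabidopsis_thaliana.fasta",
    "non_model_h5ad": "ERP132245.h5ad",
    "organ": "leaf",
    "outdir": "output/ERP132245",
}

missing = missing_config_keys(example_config)
if missing:
    print("Missing keys:", ", ".join(missing))
else:
    print("All example keys are present.")
```

The dict would normally come from parsing `config.yaml` with a YAML library of your choice; it is written inline here so the sketch runs with the standard library alone.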

## Step-by-step

```python
from xener import Xener
import scanpy as sc

annor = Xener()
adata = sc.read('ERP132245.h5ad')
cluster_key = 'leiden'
non_model_fasta = 'Arabidopsis_thaliana.fasta'
model_species = ['Brassica_rapa']
organ = 'leaf'
outdir = 'output/ERP132245'

marker_gene = annor.get_markers(adata, cluster_key)

marker_weight = annor.get_gene_weight(marker_gene)

gene_homolo_weight = annor.mapping(marker_weight, non_model_fasta, model_species, outdir)

# Keep only the top 30 genes for the subsequent steps.
topk_markers = annor.get_topk_gene(gene_homolo_weight, k=30)

# NOTE: annotation_info_path is not defined in this snippet; set it before this call.
cluster2celltype, _, celltype_weight, _, _ = annor.cell_annotation(
    topk_markers, annotation_info_path, organ)
```

## Sub-cluster refinement

```python
# Continues from the step-by-step example above
# (uses adata, topk_markers, cluster_key, organ and celltype_weight).
cluster_id = 0
# Only values that appear in
# celltype_weight[celltype_weight['cluster'] == cluster_id]['celltype'].unique()
# are supported.
candidate_celltype = ['type1', 'type2']
key_added = 'xener_refine'
# moranI_threshold is used for gene screening. Valid values lie in [-1, 1];
# the closer to 1, the stricter the screening. If an invalid value is given,
# the screening step is skipped.
moranI_threshold = 0.5

annor.refine_single_cluster(adata, topk_markers,
                            cluster_key, cluster_id, candidate_celltype,
                            key_added, organ, moranI_threshold)
# The results can be found in adata.obs[key_added].
```
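
Because an invalid `moranI_threshold` causes the screening step to be skipped, and only certain candidate cell types are supported, it may be worth checking both inputs before calling `refine_single_cluster`. The sketch below is not part of Xener: the helper names are hypothetical, and "valid" is taken to mean a finite real number in [-1, 1], which is an assumption based on the range stated above.

```python
import math
from numbers import Real


def moran_threshold_is_valid(threshold) -> bool:
    """Return True if `threshold` is a finite real number in [-1, 1].

    Assumption: this mirrors the range stated in the README; Xener's own
    validity check is not documented here and may differ.
    """
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(threshold, bool) or not isinstance(threshold, Real):
        return False
    return math.isfinite(threshold) and -1.0 <= threshold <= 1.0


def unsupported_candidates(candidates, supported):
    """Return the candidates that are not in `supported`, keeping their order."""
    supported_set = set(supported)
    return [name for name in candidates if name not in supported_set]


# Illustrative values only; 'type3' stands in for a mistyped cell type name.
print(moran_threshold_is_valid(0.5))
print(unsupported_candidates(['type1', 'type3'], ['type1', 'type2']))
```

In practice, `supported` would be the array returned by `celltype_weight[celltype_weight['cluster'] == cluster_id]['celltype'].unique()`.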

## Links

[Homepage](https://xenor.dcs.cloud/): https://xenor.dcs.cloud/

[PyPI](https://pypi.org/project/xener/): https://pypi.org/project/xener

[GitHub](https://github.com/liushuai6bgi/Xener): https://github.com/liushuai6bgi/Xener
