sciduck.basic_qc module

sciduck.basic_qc.filter_on_counts_genes(adata: AnnData, min_counts: int = 2000, max_counts: int = 100000, min_genes: int = 1000, max_genes: int = 13000, inplace: bool = False) AnnData | None

Filter samples based on the number of counts (UMIs) and the number of genes detected.

Parameters:
  • adata – AnnData object.

  • min_counts – Minimum counts detected.

  • max_counts – Maximum counts detected.

  • min_genes – Minimum genes detected.

  • max_genes – Maximum genes detected.

  • inplace – If True, update the adata object in place; if False, return the object without removing cells, with keeper_cells flagged.

Returns:

AnnData or None, depending on inplace.
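The thresholding described above can be sketched in plain Python. This is a minimal illustration, not sciduck's implementation: the helper name flag_keeper_cells and the sample values are hypothetical, and the bounds are assumed to be inclusive, which the documentation does not state.

```python
from typing import List, Sequence


def flag_keeper_cells(
    total_counts: Sequence[int],
    genes_detected: Sequence[int],
    min_counts: int = 2000,
    max_counts: int = 100000,
    min_genes: int = 1000,
    max_genes: int = 13000,
) -> List[bool]:
    """Return one flag per cell: True if both metrics fall within their bounds.

    Hypothetical helper for illustration; bounds are assumed inclusive.
    """
    if len(total_counts) != len(genes_detected):
        raise ValueError("total_counts and genes_detected must have the same length")
    return [
        min_counts <= counts <= max_counts and min_genes <= genes <= max_genes
        for counts, genes in zip(total_counts, genes_detected)
    ]


# Four made-up cells: passing, too few counts, too many genes, too few genes.
counts = [5000, 1500, 90000, 2500]
genes = [2500, 1200, 14000, 900]
keeper_cells = flag_keeper_cells(counts, genes)
print(keeper_cells)  # [True, False, False, False]
```

The defaults mirror those in the signature above, so only the first cell is flagged as a keeper.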

sciduck.basic_qc.filter_on_precomputed_metrics(adata: AnnData, doublet_score: float = 0.3, pct_counts_mt: float = 3.0, GEX_Reads_mapped_confidently_to_genome: float = 0.0, GEX_Reads_mapped_to_genome: float = 0.0, GEX_Reads_with_TSO: float = 1.0, inplace: bool = False) AnnData | None

Filter samples based on precomputed quality control metrics.

Parameters:
  • adata – AnnData object.

  • doublet_score – Maximum doublet score.

  • pct_counts_mt – Maximum percentage of counts in mitochondrial genes.

  • GEX_Reads_mapped_confidently_to_genome – Minimum percentage of confidently mapped reads. There is no established best-practice threshold; the user must specify one.

  • GEX_Reads_mapped_to_genome – Minimum percentage of reads mapped to the genome. There is no established best-practice threshold; the user must specify one.

  • GEX_Reads_with_TSO – Maximum percentage of reads with TSO, per cell. There is no established best-practice threshold; the user must specify one.

  • inplace – If True, update the adata object in place; if False, return the object without removing cells, with keeper_cells flagged.

Returns:

AnnData or None, depending on inplace.
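As a rough sketch of how maximum-type and minimum-type thresholds combine, the following plain-Python example checks each cell against the defaults from the signature above. It is not sciduck's implementation: the helper name passes_precomputed_qc, the per-cell dictionaries, and the inclusive comparisons are assumptions made for illustration, and the metric values are invented.

```python
from typing import Dict, List

# Maximum-type thresholds: a cell passes when its value is at or below the limit.
MAX_THRESHOLDS: Dict[str, float] = {
    "doublet_score": 0.3,
    "pct_counts_mt": 3.0,
    "GEX_Reads_with_TSO": 1.0,
}

# Minimum-type thresholds: a cell passes when its value is at or above the limit.
MIN_THRESHOLDS: Dict[str, float] = {
    "GEX_Reads_mapped_confidently_to_genome": 0.0,
    "GEX_Reads_mapped_to_genome": 0.0,
}


def passes_precomputed_qc(cell: Dict[str, float]) -> bool:
    """Hypothetical helper: True if the cell satisfies every threshold."""
    below_max = all(cell[name] <= limit for name, limit in MAX_THRESHOLDS.items())
    above_min = all(cell[name] >= limit for name, limit in MIN_THRESHOLDS.items())
    return below_max and above_min


# Three made-up cells: passing, likely doublet, high mitochondrial content.
base = {
    "doublet_score": 0.05,
    "pct_counts_mt": 1.2,
    "GEX_Reads_mapped_confidently_to_genome": 0.92,
    "GEX_Reads_mapped_to_genome": 0.95,
    "GEX_Reads_with_TSO": 0.4,
}
cells: List[Dict[str, float]] = [
    base,
    {**base, "doublet_score": 0.45},
    {**base, "pct_counts_mt": 7.5},
]
qc_flags = [passes_precomputed_qc(cell) for cell in cells]
print(qc_flags)  # [True, False, False]
```

With the default minimums of 0.0, the two mapping metrics never exclude a cell, which is consistent with the note that the user must choose those thresholds.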

sciduck.basic_qc.filter_utilizing_coarse_labels(adata: AnnData, coarse_label_column: str, coarse_label_map: dict = {'Neurons': ['Excitatory', 'Inhibitory'], 'Non-Neurons': ['Astrocytes', 'Oligodendrocytes', 'Microglia', 'Endothelial', 'Pericytes']}, coarse_label_gene_threshold: dict = {'Neurons': 2000, 'Non-Neurons': 1000}, inplace: bool = False) AnnData | None

Filter samples based on coarse-label-specific thresholds for the number of genes detected.

Parameters:
  • adata – AnnData object.

  • coarse_label_column – Column name in adata.obs containing coarse labels that identify neuronal and non-neuronal cell types.

  • coarse_label_map – Mapping from each coarse group to the cell type labels it contains; thresholds are applied per group.

  • coarse_label_gene_threshold – Minimum number of genes detected for each coarse group.

  • inplace – If True, update the adata object in place; if False, return the object without removing cells, with keeper_cells flagged.

Returns:

AnnData or None, depending on inplace.
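The group-specific thresholding can be sketched in plain Python by inverting coarse_label_map and looking up each cell's threshold. This is an illustration under stated assumptions, not sciduck's implementation: the helper name flag_by_coarse_label is hypothetical, the comparison is assumed inclusive, and cells whose label is missing from the map are kept here, which may differ from the library's behavior.

```python
from typing import Dict, List, Sequence

# Defaults copied from the signature above.
COARSE_LABEL_MAP: Dict[str, List[str]] = {
    "Neurons": ["Excitatory", "Inhibitory"],
    "Non-Neurons": ["Astrocytes", "Oligodendrocytes", "Microglia", "Endothelial", "Pericytes"],
}
COARSE_LABEL_GENE_THRESHOLD: Dict[str, int] = {"Neurons": 2000, "Non-Neurons": 1000}


def flag_by_coarse_label(
    labels: Sequence[str],
    genes_detected: Sequence[int],
    label_map: Dict[str, List[str]] = COARSE_LABEL_MAP,
    gene_threshold: Dict[str, int] = COARSE_LABEL_GENE_THRESHOLD,
) -> List[bool]:
    """Hypothetical helper: flag cells that meet their group's minimum gene count."""
    # Invert the map so each cell type label points at its coarse group.
    label_to_group = {
        label: group for group, members in label_map.items() for label in members
    }
    flags = []
    for label, genes in zip(labels, genes_detected):
        group = label_to_group.get(label)
        if group is None:
            # Assumption: labels absent from the map are not filtered.
            flags.append(True)
        else:
            flags.append(genes >= gene_threshold[group])
    return flags


# Five made-up cells, including one label that is not in the map.
labels = ["Excitatory", "Inhibitory", "Astrocytes", "Microglia", "Unknown"]
genes = [2500, 1500, 1500, 800, 500]
label_flags = flag_by_coarse_label(labels, genes)
print(label_flags)  # [True, False, True, False, True]
```

Note that 1500 genes fails for a neuron but passes for an astrocyte, which is the point of using separate thresholds per coarse group.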