dandelion.preprocessing.assign_isotype

dandelion.preprocessing.assign_isotype(fasta, fileformat='blast', org='human', correct_c_call=True, correction_dict=None, plot=True, figsize=(4, 4), blastdb=None, allele=False, parallel=True, ncpu=None, verbose=False)

Annotate contigs with constant region calls using blastn.

Parameters
  • fasta (str) – path to fasta file.

  • fileformat (str) – format of V(D)J file/objects. Default is ‘blast’. Also accepts ‘changeo’ (same behaviour as ‘blast’) and ‘airr’.

  • org (str) – organism of reference folder. Default is ‘human’.

  • correct_c_call (bool) – whether or not to adjust the c_calls after blast, based on the primer sequences supplied through the correction_dict option. Default is True.

  • correction_dict (Dict, optional) – a nested dictionary containing isotypes/c_genes as keys and primer sequences as records, used for correcting annotated c_calls. Defaults to a curated dictionary for human sequences if left as None.

  • plot (bool) – whether or not to plot reassignment summary metrics. Default is True.

  • figsize (Tuple[Union[int,float], Union[int,float]]) – size of figure. Default is (4, 4).

  • blastdb (str, optional) – path to blast database. Defaults to the $BLASTDB environment variable.

  • allele (bool) – whether or not to return allele calls. Default is False.

  • parallel (bool) – whether or not to use parallelization. Default is True.

  • ncpu (int, optional) – number of cores to use if parallel is True. Default is all available cores minus 1.

  • verbose (bool) – whether or not to print the blast command in terminal. Default is False.
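
The fallbacks described for blastdb and ncpu can be pictured with the short sketch below. It is not dandelion's implementation: resolve_blastdb and resolve_ncpu are hypothetical helpers written only to illustrate the documented defaults, and the choice to use a single core when parallel is False is an assumption.

```python
import os
from multiprocessing import cpu_count


def resolve_blastdb(blastdb=None):
    """Hypothetical helper: an explicit path wins, otherwise fall back to $BLASTDB."""
    if blastdb is not None:
        return blastdb
    # Returns None when the environment variable is not set.
    return os.environ.get("BLASTDB")


def resolve_ncpu(parallel=True, ncpu=None):
    """Hypothetical helper: all available cores minus one when ncpu is left as None."""
    if not parallel:
        # Assumption: without parallelization a single core is used.
        return 1
    if ncpu is None:
        # Never go below one core, even on a single-core machine.
        return max(1, cpu_count() - 1)
    return ncpu
```

Passing blastdb or ncpu explicitly bypasses these fallbacks, which is useful on shared machines where $BLASTDB may point somewhere unexpected or where not all cores should be claimed.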

Returns
  V(D)J tsv files with constant genes annotated.
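
Example

A minimal sketch of a call, with the defaults from the signature above written out explicitly. DOCUMENTED_DEFAULTS, build_options and annotate_constant_regions are hypothetical names introduced for illustration. The import is done lazily inside the wrapper, and the call itself assumes that dandelion and blastn are installed and that a blast database is available.

```python
# Keyword defaults copied from the documented signature.
DOCUMENTED_DEFAULTS = {
    "fileformat": "blast",
    "org": "human",
    "correct_c_call": True,
    "correction_dict": None,  # None selects the curated human dictionary
    "plot": True,
    "figsize": (4, 4),
    "blastdb": None,          # None falls back to $BLASTDB
    "allele": False,
    "parallel": True,
    "ncpu": None,             # None means all available cores minus 1
    "verbose": False,
}


def build_options(**overrides):
    """Hypothetical helper: merge user overrides into the documented defaults."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise TypeError(f"unexpected option(s): {sorted(unknown)}")
    options = dict(DOCUMENTED_DEFAULTS)
    options.update(overrides)
    return options


def annotate_constant_regions(fasta, **overrides):
    """Hypothetical wrapper around assign_isotype."""
    # Imported here so the sketch can be loaded without dandelion installed.
    from dandelion.preprocessing import assign_isotype

    return assign_isotype(fasta, **build_options(**overrides))
```

A call such as annotate_constant_regions("path/to/contigs.fasta", plot=False, ncpu=4) would then run the annotation without the summary plot on four cores; the fasta path is a placeholder.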