dandelion.preprocessing.reassign_alleles
dandelion.preprocessing.reassign_alleles(data, combined_folder, v_germline=None, germline=None, org='human', v_field='v_call_genotyped', germ_types='dmask', novel=True, cloned=False, plot=True, figsize=(4, 3), sample_id_dictionary=None, verbose=False)
Correct allele calls based on a personalized genotype using tigger-reassignAlleles. The function uses a subject-specific genotype to correct the preliminary allele assignments of a set of sequences derived from a single subject.
- Parameters
data (Sequence) – list of data folders containing the .tsv files. If provided as a single string, it is first converted to a list, so the function can be run on a single sample or on multiple samples.
combined_folder (str, PathLike) – name of folder for concatenated data file and genotyped files.
v_germline (str, optional) – path to heavy chain V germline fasta. Defaults to the IGHV fasta in the $GERMLINE environment variable.
germline (str, optional) – path to germline database folder. Defaults to the $GERMLINE environment variable.
org (str) – organism of germline database. Default is ‘human’.
v_field (str) – name of column containing the germline V segment call. Default is ‘v_call_genotyped’ (airr), the column available after running tigger.
germ_types (str) – type of germline to use for reconstruction. Accepts one of: ‘full’, ‘dmask’, ‘vonly’, ‘region’. Default is ‘dmask’.
novel (bool) – whether or not to run novel allele discovery during tigger-genotyping. Default is True (yes).
cloned (bool) – whether or not to run CreateGermlines.py with --cloned. Default is False.
plot (bool) – whether or not to plot reassignment summary metrics. Default is True.
figsize (Tuple[Union[int,float], Union[int,float]]) – size of figure. Default is (4, 3).
sample_id_dictionary (dict, optional) – dictionary for creating a sample_id column in the concatenated file.
verbose (bool) – Whether or not to print the command used in the terminal. Default is False.
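A minimal sketch of a call, assuming the function is importable from dandelion.preprocessing as the page title suggests. The folder names `sample_A`, `sample_B` and `combined_run` are hypothetical, and the helper `run_reassignment` is not part of dandelion; it only guards the call so the snippet does nothing when the folders are missing or the package is not installed.

```python
from pathlib import Path

# Hypothetical sample folders; each is assumed to contain the .tsv files.
samples = ["sample_A", "sample_B"]

# Keyword arguments taken from the documented signature.
call_kwargs = dict(
    combined_folder="combined_run",  # hypothetical output folder name
    org="human",
    v_field="v_call_genotyped",
    germ_types="dmask",
    novel=True,
    cloned=False,
    plot=False,      # skip the summary plot in non-interactive runs
    verbose=True,    # print the command used in the terminal
)


def run_reassignment(data, **kwargs):
    """Call reassign_alleles only when inputs and the package are available.

    Returns True if the call was made, False if it was skipped.
    """
    # Skip early if any of the (hypothetical) input folders is missing.
    if not all(Path(folder).is_dir() for folder in data):
        return False
    try:
        from dandelion.preprocessing import reassign_alleles
    except ImportError:
        return False
    reassign_alleles(data, **kwargs)
    return True


ran = run_reassignment(samples, **call_kwargs)
print("reassign_alleles was called:", ran)
```

Passing a single folder name as a string instead of a list should also work, since the data parameter is documented as accepting either form.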
- Returns
Individual V(D)J data files with a v_call_genotyped column containing the reassigned heavy chain V calls.
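One way to inspect the result is to compare the original and reassigned calls in an output file. The sketch below is an illustration under assumptions: it uses only the standard library, reads a small in-memory stand-in for an output .tsv (the rows are made up for demonstration, not real results), and assumes the original call is stored in a `v_call` column next to `v_call_genotyped`. The helper `changed_v_calls` is hypothetical and not part of dandelion.

```python
import csv
import io

# Toy stand-in for one output .tsv; rows are illustrative only.
toy_tsv = (
    "sequence_id\tv_call\tv_call_genotyped\n"
    "seq1\tIGHV1-69*01,IGHV1-69D*01\tIGHV1-69*01\n"
    "seq2\tIGHV3-23*01\tIGHV3-23*01\n"
)


def changed_v_calls(handle, v_field="v_call_genotyped"):
    """Return (sequence_id, original call, reassigned call) for changed rows."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Fail clearly if the file was not produced by the reassignment step.
    if v_field not in (reader.fieldnames or []):
        raise KeyError(f"column {v_field!r} not found")
    return [
        (row["sequence_id"], row["v_call"], row[v_field])
        for row in reader
        if row["v_call"] != row[v_field]
    ]


changes = changed_v_calls(io.StringIO(toy_tsv))
for seq_id, before, after in changes:
    print(f"{seq_id}: {before} -> {after}")
```

For a real file, the `io.StringIO` object would be replaced by an open file handle to one of the .tsv files in the sample folder.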