dandelion.preprocessing.create_germlines
dandelion.preprocessing.create_germlines(self, germline=None, org='human', seq_field='sequence_alignment', v_field='v_call', d_field='d_call', j_field='j_call', germ_types='dmask', fileformat='airr', initialize_metadata=False)
Runs CreateGermlines.py to reconstruct the germline V(D)J sequence, from which the Ig lineage and mutations can be inferred.
- Parameters
self (Dandelion, pd.DataFrame, str) – Dandelion object, pandas DataFrame in changeo/airr format, or file path to changeo/airr file after clones have been determined.
germline (str, optional) – path to the germline database folder. Defaults to the $GERMLINE environment variable.
org (str) – organism of germline database. Default is ‘human’.
seq_field (str) – name of column containing the aligned sequence. Default is ‘sequence_alignment’ (airr).
v_field (str) – name of column containing the germline V segment call. Default is ‘v_call’ (airr).
d_field (str) – name of column containing the germline D segment call. Default is ‘d_call’ (airr).
j_field (str) – name of column containing the germline J segment call. Default is ‘j_call’ (airr).
germ_types (str) – type(s) of germlines to include: full germline, germline with D segment masked, or germline for V segment only. Default is ‘dmask’.
fileformat (str) – format of V(D)J file/objects. Default is ‘airr’. Also accepts ‘changeo’.
- Returns
V(D)J data file with reconstructed germline sequences.
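The parameter defaults listed above can be collected in one place before calling the function. The following is a minimal sketch, not dandelion's own code: `resolve_germline_dir` and `build_kwargs` are hypothetical helpers introduced here for illustration, and the path shown is a placeholder. It assumes only what the parameter list states, namely that `germline` falls back to the `$GERMLINE` environment variable and that the AIRR column names are the defaults.

```python
import os

# Documented AIRR defaults for the column-name parameters.
AIRR_FIELDS = {
    "seq_field": "sequence_alignment",
    "v_field": "v_call",
    "d_field": "d_call",
    "j_field": "j_call",
}


def resolve_germline_dir(germline=None, env=None):
    """Hypothetical helper: use the explicit path, else fall back to $GERMLINE."""
    env = os.environ if env is None else env
    if germline is not None:
        return germline
    if "GERMLINE" not in env:
        raise KeyError("pass germline=... or set the GERMLINE environment variable")
    return env["GERMLINE"]


def build_kwargs(germline=None, org="human", germ_types="dmask",
                 fileformat="airr", env=None):
    """Hypothetical helper: assemble keyword arguments using the documented defaults."""
    kwargs = dict(AIRR_FIELDS)
    kwargs.update(
        germline=resolve_germline_dir(germline, env),
        org=org,
        germ_types=germ_types,
        fileformat=fileformat,
    )
    return kwargs


kwargs = build_kwargs(germline="/path/to/germline_db")  # placeholder path

# With dandelion installed, the call would then look like this
# (vdj is a Dandelion object, a DataFrame, or a file path):
# from dandelion.preprocessing import create_germlines
# create_germlines(vdj, **kwargs)
```

Resolving the germline folder up front means a missing `$GERMLINE` fails with a clear message before any data is processed. The column names here apply to AIRR input only; for ‘changeo’ files the corresponding field names would need to be passed explicitly.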