dandelion.tools.find_clones

dandelion.tools.find_clones(self, identity=0.85, key=None, locus=None, by_alleles=False, key_added=None, recalculate_length=True, productive_only=True)

Find clones based on the Hamming distance between heavy chain and light chain CDR3 junctions.

Parameters
  • self (Dandelion, DataFrame, str) – Dandelion object, pandas DataFrame in changeo/airr format, or file path to changeo/airr file after clones have been determined.

  • identity (float) – Junction similarity parameter. Default is 0.85.

  • key (str, optional) – Column name to use for clone clustering. None defaults to ‘junction_aa’.

  • locus (str, optional) – Placeholder to allow for a future TCR analysis mode for clustering. None defaults to ‘bcr’.

  • by_alleles (bool) – Whether or not to collapse alleles to genes. Default is False.

  • key_added (str, optional) – If specified, this will be the column name for clones. None defaults to ‘clone_id’.

  • recalculate_length (bool) – Whether or not to re-calculate junction length rather than rely on the parsed assignment (which is occasionally wrong). Default is True.

  • productive_only (bool) – Whether or not to perform clone clustering only on productive clones. Default is True.

Returns
  Dandelion object with clone_id annotated in .data slot and .metadata initialized.

Return type
  Dandelion
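
Example

To illustrate how a junction similarity threshold such as identity=0.85 can be read, here is a minimal sketch. It is not dandelion's implementation. It assumes that similarity is the fraction of matching positions between two junctions (1 minus the normalized Hamming distance) and that junctions of different length are never grouped. The helper names hamming_identity and same_clone are hypothetical and are not part of the dandelion API.

```python
def hamming_identity(a: str, b: str) -> float:
    """Return the fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("Hamming distance is only defined for equal-length sequences")
    if not a:
        return 1.0  # two empty sequences are trivially identical
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def same_clone(junction_a: str, junction_b: str, identity: float = 0.85) -> bool:
    """Hypothetical pairwise test: do two junctions meet the similarity threshold?"""
    # Assumption: junctions of different length are never placed in the same clone.
    if len(junction_a) != len(junction_b):
        return False
    return hamming_identity(junction_a, junction_b) >= identity


ref = "CARDGYSSGWYFDYWGQGTL"    # 20 residues
close = "CARDGYSSGWYFDYWGQGAA"  # 2 mismatches against ref
far = "GGGGGYSSGWYFDYWGQGTL"    # 4 mismatches against ref

print(same_clone(ref, close))   # 18/20 matching positions
print(same_clone(ref, far))     # 16/20 matching positions
```

Under these assumptions, a 20-residue junction tolerates at most 3 mismatches at identity=0.85. Raising identity makes grouping stricter, and lowering it makes grouping more permissive.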