dandelion.tools.clone_diversity

dandelion.tools.clone_diversity(self, groupby, method='gini', metric=None, clone_key=None, update_obs_meta=True, diversity_key=None, resample=False, downsample=None, n_resample=50, normalize=True, reconstruct_network=True, expanded_only=False, use_contracted=False, key_added=None)

Compute B cell clone diversity: Gini indices, Chao1 estimates, or Shannon entropy.
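To make the three measures concrete, here is a minimal NumPy sketch that computes each one from a plain list of clone sizes. It is an illustration under stated assumptions, not dandelion's implementation: the helper names (`gini_index`, `chao1`, `shannon_entropy`) are hypothetical, the Gini index is taken over raw clone sizes rather than network metrics, Chao1 uses the common bias-corrected form, and normalization divides the entropy by the log of the number of observed clones.

```python
import numpy as np


def gini_index(sizes):
    """Gini index of non-negative values: 0 means perfectly even sizes."""
    x = np.sort(np.asarray(sizes, dtype=float))
    n = x.size
    if n == 0 or x.sum() == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    # Rank-weighted form of the Gini index on sorted values.
    return float(2 * np.sum(ranks * x) / (n * x.sum()) - (n + 1) / n)


def chao1(sizes):
    """Bias-corrected Chao1 richness estimate (assumed variant)."""
    x = np.asarray(sizes)
    s_obs = np.count_nonzero(x)      # clones observed at least once
    f1 = np.count_nonzero(x == 1)    # singletons
    f2 = np.count_nonzero(x == 2)    # doubletons
    return float(s_obs + f1 * (f1 - 1) / (2 * (f2 + 1)))


def shannon_entropy(sizes, normalize=True):
    """Shannon entropy (natural log); optionally divided by log(n clones)."""
    x = np.asarray(sizes, dtype=float)
    x = x[x > 0]
    p = x / x.sum()
    h = float(-np.sum(p * np.log(p)))
    if normalize:
        # A single clone has zero entropy; avoid dividing by log(1) = 0.
        return h / float(np.log(p.size)) if p.size > 1 else 0.0
    return h


clone_sizes = [5, 3, 1, 1, 2]  # made-up clone sizes for illustration
gini_value = gini_index(clone_sizes)
chao1_value = chao1(clone_sizes)
shannon_value = shannon_entropy(clone_sizes)
```

A low Gini index or a normalized entropy near 1 indicates evenly sized clones, while a Chao1 estimate above the observed clone count suggests unsampled clones.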

Parameters
  • self (Dandelion, AnnData) – Dandelion or AnnData object.

  • groupby (str) – Column name to calculate the diversity indices on, e.g. sample or patient.

  • method (str) – Method for diversity estimation. One of [‘gini’, ‘chao1’, ‘shannon’].

  • metric (str, optional) – Metric to use for calculating Gini indices of clones. Accepts one of [‘clone_vertexsize’, ‘clone_degree’, ‘clone_centrality’]. None defaults to ‘clone_vertexsize’.

  • clone_key (str, optional) – Column name specifying the clone_id column in metadata.

  • update_obs_meta (bool) – If True, a pandas DataFrame is returned. If False, the function tries to populate the input object’s metadata/obs slot.

  • diversity_key (str, optional) – Key for ‘diversity’ results in .uns.

  • downsample (int, optional) – Number of cells to downsample to. If None, defaults to the size of the smallest group.

  • resample (bool) – Whether or not to randomly sample cells without replacement to the minimum size of groups for the diversity calculation. Default is False.

  • n_resample (int) – Number of times to perform resampling. Default is 50.

  • normalize (bool) – Whether or not to return normalized Shannon entropy according to https://math.stackexchange.com/a/945172. Default is True.

  • reconstruct_network (bool) – Whether or not to reconstruct the network for Gini index-based measures. Default is True, which reconstructs the network for each group specified by the groupby option.

  • expanded_only (bool) – Whether or not to calculate Gini indices using expanded clones only. Default is False, i.e. use all cells/clones.

  • use_contracted (bool) – Whether or not to perform the Gini calculation after contraction of the clone network. Only applies to calculation of the clone size Gini index. Default is False. This is to try and preserve the single-cell properties of the network.

  • key_added (str, list, optional) – Column names for output.
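The interplay of resample, downsample and n_resample can be sketched as follows. This is a hedged illustration rather than dandelion's code: the helper names (`richness`, `resampled_diversity`) and the toy `groups` data are hypothetical, and averaging the statistic over the resamples is an assumption about how repeated draws are summarized.

```python
import numpy as np


def richness(sizes):
    """Toy statistic: number of distinct clones in a sample."""
    return float(len(sizes))


def resampled_diversity(clone_ids, stat, downsample, n_resample=50, seed=0):
    """Average `stat` over repeated draws of `downsample` cells without replacement."""
    clone_ids = np.asarray(clone_ids)
    if downsample > clone_ids.size:
        raise ValueError("downsample exceeds the number of cells in the group")
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(n_resample):
        subset = rng.choice(clone_ids, size=downsample, replace=False)
        # Clone sizes within the downsampled subset.
        _, sizes = np.unique(subset, return_counts=True)
        values.append(stat(sizes))
    return float(np.mean(values))


# Made-up clone assignments per group, for illustration only.
groups = {
    "sample1": ["a", "a", "b", "c"],
    "sample2": ["a", "b", "c", "d", "e", "f"],
}

# Mirrors downsample=None: fall back to the size of the smallest group.
target = min(len(cells) for cells in groups.values())

results = {
    name: resampled_diversity(cells, richness, downsample=target)
    for name, cells in groups.items()
}
```

Downsampling every group to the same number of cells keeps groups with more cells from looking more diverse simply because they were sampled more deeply.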

Returns

Return type

A pandas DataFrame, a Dandelion object with updated .metadata slot, or an AnnData object with updated .obs slot.