biom-format.org

biom.table.Table.subsample

Table.subsample(n, axis='sample')

Randomly subsample without replacement.

Parameters:

n : int

Number of items to subsample from counts.

axis : {‘sample’, ‘observation’}, optional

The axis to subsample over. Defaults to 'sample'.

Returns:

biom.Table

A subsampled version of self

Raises:

ValueError

If n is less than zero.

Notes

Subsampling is performed without replacement. If n is greater than the sum of a given vector, that vector is omitted from the result.

Adapted from skbio.math.subsample; see biom-format/licenses for more information about scikit-bio.

This code assumes absolute abundance, that is, the table holds counts rather than relative proportions.
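The behavior described in these notes can be illustrated on a single count vector. The following is a minimal sketch using only the standard library, not the code biom or scikit-bio actually runs. The helper name `subsample_counts` and its `seed` parameter are hypothetical, and it assumes integer counts.

```python
import random


def subsample_counts(counts, n, seed=None):
    """Draw n items without replacement from a vector of integer counts.

    Hypothetical helper for illustration only; not biom's implementation.
    Returns None when the vector holds fewer than n items, mirroring the
    note that such a vector is omitted from the result.
    """
    if n < 0:
        raise ValueError("n cannot be negative")
    if n > sum(counts):
        return None
    rng = random.Random(seed)
    # Expand the counts into one entry per individual item,
    # e.g. [3, 2] -> [0, 0, 0, 1, 1], then draw n of them.
    pool = [idx for idx, count in enumerate(counts) for _ in range(count)]
    picked = rng.sample(pool, n)
    # Collapse the drawn items back into a count vector.
    result = [0] * len(counts)
    for idx in picked:
        result[idx] += 1
    return result


# Sample columns of the table used in the examples: S1, S2, S3
columns = {"S1": [0, 1], "S2": [2, 0], "S3": [3, 2]}
subsampled = {name: subsample_counts(col, 2, seed=0)
              for name, col in columns.items()}
kept = {name: col for name, col in subsampled.items() if col is not None}
```

Here `kept` contains only 'S2' and 'S3', each summing to 2, because 'S1' holds a single item and cannot supply two draws.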

Examples

>>> import numpy as np
>>> from biom.table import Table
>>> table = Table(np.array([[0, 2, 3], [1, 0, 2]]), ['O1', 'O2'],
...               ['S1', 'S2', 'S3'])

Subsample 1 item over the sample axis:

>>> print(table.subsample(1).sum(axis='sample'))
[ 1.  1.  1.]

Subsample 2 items over the sample axis. Note that ‘S1’ is filtered out because its total count (1) is less than 2:

>>> ss = table.subsample(2)
>>> print(ss.sum(axis='sample'))
[ 2.  2.]
>>> print(ss.sample_ids)
['S2' 'S3']
