biom.table.Table.to_hdf5¶
- Table.to_hdf5(h5grp, generated_by, compress=True)¶
Store CSC and CSR in place
The resulting structure of this group is below. A few basic definitions, N is the number of observations and M is the number of samples. Data are stored in both compressed sparse row [R10] (CSR, for observation oriented operations) and compressed sparse column [R11] (CSC, for sample oriented operations).
Parameters: h5grp : {h5py.Group, h5py.File}
generated_by : str
A description of what generated the table
compress : bool, optional
Defaults to True means fields will be compressed with gzip, False means no compression
See also
Notes
The expected HDF5 group structure is below. An example of an HDF5 file in DDL can be found here [R12].
./id : str, an arbitrary ID ./type : str, the table type (e.g, OTU table) ./format-url : str, a URL that describes the format ./format-version : two element tuple of int32, major and minor ./generated-by : str, what generated this file ./creation-date : str, ISO format ./shape : two element tuple of int32, N by M ./nnz : int32 or int64, number of non zero elems ./observation : Group ./observation/ids : (N,) dataset of str or vlen str ./observation/matrix : Group ./observation/matrix/data : (nnz,) dataset of float64 ./observation/matrix/indices : (nnz,) dataset of int32 ./observation/matrix/indptr : (M+1,) dataset of int32 [./observation/metadata] : Optional, JSON str, in index order with ids. See below for added detail. ./sample : Group ./sample/ids : (M,) dataset of str or vlen str ./sample/matrix : Group ./sample/matrix/data : (nnz,) dataset of float64 ./sample/matrix/indices : (nnz,) dataset of int32 ./sample/matrix/indptr : (N+1,) dataset of int32 [./sample/metadata] : Optional, JSON str, in index order with ids. See below for added detail.
The expected structure (in JSON) for the optional metadata is a list of objects, where the index order of the list corresponds to the index order of the relevant axis IDs. The metadata are parsed directly by JSON, and there are no constraints on the contained metadata with the exception of the outer list, and that the order of the list matters. Below is an example of observational metadata for two observations:
[{“taxonomy”: [“foo”, “bar”]}, {“taxonomy”: [“foo”, “foobar”]}]
References
[R10] (1, 2) http://docs.scipy.org/doc/scipy-0.13.0/reference/generated/scipy.sparse.csr_matrix.html [R11] (1, 2) http://docs.scipy.org/doc/scipy-0.13.0/reference/generated/scipy.sparse.csc_matrix.html [R12] (1, 2) http://biom-format.org/documentation/format_versions/biom-2.0.html Examples
>>> from h5py import File >>> from biom.table import Table >>> from numpy import array >>> t = Table(array([[1, 2], [3, 4]]), ['a', 'b'], ['x', 'y']) >>> with File('foo.biom', 'w') as f: ... t.to_hdf5(f, "example")