corpora.mmcorpus – Corpus in Matrix Market format

Corpus in the Matrix Market format.

class gensim.corpora.mmcorpus.MmCorpus(input)

Corpus in the Matrix Market format.

Initialize the matrix reader.

The input refers to a file on the local filesystem, which is expected to be in the sparse (coordinate) Matrix Market format. Documents are assumed to be rows of the matrix, and document features are its columns.

input is either a string (file path) or a file-like object that supports seek(0) (e.g. gzip.GzipFile, bz2.BZ2File).
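To make the layout concrete, the sketch below parses a tiny coordinate Matrix Market stream into one list of (feature id, value) pairs per document. It is an illustration using only the standard library, not the MmCorpus reader itself: the name read_mm_documents and the SAMPLE data are hypothetical, and it builds the whole result in memory for simplicity.

```python
import io

# Hypothetical sample: 2 documents (rows), 3 features (columns), 3 non-zero entries.
SAMPLE = (
    "%%MatrixMarket matrix coordinate real general\n"
    "% comment lines start with a percent sign\n"
    "2 3 3\n"
    "1 1 1.0\n"
    "1 3 2.0\n"
    "2 2 0.5\n"
)


def read_mm_documents(stream):
    """Return one list of (feature_id, value) pairs per matrix row (document).

    Illustrative helper, not part of gensim. Matrix Market indices are
    1-based; the ids returned here are shifted to be 0-based.
    """
    stream.seek(0)  # the same rewind that file-like inputs must support
    header = stream.readline()
    if not header.lower().startswith("%%matrixmarket matrix coordinate"):
        raise ValueError("expected a sparse (coordinate) Matrix Market file")

    # The first non-comment, non-blank line holds: rows, columns, non-zeros.
    size_line = next(
        line for line in stream if line.strip() and not line.startswith("%")
    )
    num_docs, num_features, num_nonzero = (int(x) for x in size_line.split())

    docs = [[] for _ in range(num_docs)]
    for line in stream:
        if not line.strip():
            continue
        row, col, value = line.split()
        docs[int(row) - 1].append((int(col) - 1, float(value)))
    return docs


documents = read_mm_documents(io.StringIO(SAMPLE))
```

Here each row of the matrix becomes one document, and the column index of each entry becomes the feature id.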

classmethod load(fname)

Load a previously saved object from file (also see save).

save(fname)

Save the object to file via pickling (also see load).

static saveCorpus(fname, corpus, id2word=None, progressCnt=1000)

Save a corpus in the Matrix Market format to disk.
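As a rough picture of what a corpus stored in this format looks like, the sketch below writes a small bag-of-words corpus as coordinate Matrix Market text. It is not the saveCorpus implementation: write_mm is a hypothetical helper, it ignores id2word and progressCnt, and it assumes each document is a list of (feature id, weight) pairs with 0-based ids.

```python
import io


def write_mm(stream, corpus):
    """Write corpus as coordinate Matrix Market text (illustrative only).

    Assumes each document is an iterable of (feature_id, weight) pairs
    with 0-based ids; the file format itself uses 1-based indices.
    """
    docs = [list(doc) for doc in corpus]
    num_features = max(
        (feature_id for doc in docs for feature_id, _ in doc), default=-1
    ) + 1
    num_nonzero = sum(len(doc) for doc in docs)

    stream.write("%%MatrixMarket matrix coordinate real general\n")
    stream.write(f"{len(docs)} {num_features} {num_nonzero}\n")
    for doc_no, doc in enumerate(docs, start=1):
        for feature_id, weight in sorted(doc):
            # One line per non-zero entry: row (document), column (feature), value.
            stream.write(f"{doc_no} {feature_id + 1} {weight}\n")


buffer = io.StringIO()
write_mm(buffer, [[(0, 1.0), (2, 2.0)], [(1, 0.5)]])
mm_lines = buffer.getvalue().splitlines()
```

The size line has to state the number of documents, features, and non-zero entries before any entry is written, which is why this sketch materializes the corpus first.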