Metadata-Version: 2.3
Name: gismo
Version: 0.5.0
Summary: GISMO is an NLP tool to rank and organize a corpus of documents according to a query.
License: MIT
Keywords: NLP,Random Walk,D-Iteration,TF-I[D]TF
Author: Fabien Mathieu
Author-email: fabien.mathieu@normalesup.org
Maintainer: Fabien Mathieu
Maintainer-email: fabien.mathieu@normalesup.org
Requires-Python: >=3.10,<3.13
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Provides-Extra: spacy
Requires-Dist: beautifulsoup4 (>=4.13.3)
Requires-Dist: dill (>=0.3.9)
Requires-Dist: en-core-web-sm ; extra == "spacy"
Requires-Dist: lxml (>=5.3.1)
Requires-Dist: numba (>=0.61.0)
Requires-Dist: requests (>=2.32.3)
Requires-Dist: scikit-learn (>=1.6.1)
Requires-Dist: scipy (>=1.15.2)
Requires-Dist: spacy (>=3.8.4,<3.9) ; extra == "spacy"
Project-URL: Documentation, https://balouf.github.io/gismo/
Project-URL: Repository, https://github.com/balouf/gismo/
Description-Content-Type: text/markdown

[![Gismo logo](https://github.com/balouf/gismo/raw/master/docs/logo-line.png)](https://balouf.github.io/gismo/)

# A Generic Information Search... With a Mind of its Own!


[![Pypi badge](https://img.shields.io/pypi/v/gismo.svg)](https://pypi.python.org/pypi/gismo)
[![Build badge](https://github.com/balouf/gismo/actions/workflows/build.yml/badge.svg?branch=master)](https://github.com/balouf/gismo/actions?query=workflow%3Abuild)
[![Documentation badge](https://github.com/balouf/gismo/actions/workflows/docs.yml/badge.svg?branch=master)](https://github.com/balouf/gismo/actions?query=workflow%3Adocs)
[![codecov](https://codecov.io/gh/balouf/gismo/graph/badge.svg?token=TTMRW7XYS5)](https://codecov.io/gh/balouf/gismo)

GISMO is an NLP tool to rank and organize a corpus of documents according to a query.

Gismo stands for Generic Information Search... with a Mind of its Own.

- Free software: MIT License
- GitHub: <https://github.com/balouf/gismo/>
- Documentation: <https://balouf.github.io/gismo/>

## Features

Gismo combines three main ideas:

- **TF-IDTF**: a symmetric version of the TF-IDF embedding.
- **DIteration**: a fast, push-based variant of the PageRank algorithm.
- **Fuzzy dendrogram**: a variant of the Louvain clustering algorithm.
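
To give an intuition of what "push-based" means for a PageRank-style computation, here is a minimal, generic sketch. It is not Gismo's implementation: the function name `push_pagerank`, its parameters, and the greedy choice of the next node are assumptions made for illustration only. Each node holds some residual "fluid"; pushing a node settles its fluid into its score and forwards a damped share to its out-neighbors.

```python
import numpy as np


def push_pagerank(adjacency, damping=0.85, tol=1e-8):
    """Approximate PageRank by pushing residual fluid node by node.

    Illustrative sketch only (hypothetical helper, not part of gismo).
    adjacency[i][j] is non-zero when node i links to node j.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    n = adjacency.shape[0]
    out_degree = adjacency.sum(axis=1)
    fluid = np.full(n, (1.0 - damping) / n)  # residual still to be diffused
    history = np.zeros(n)                    # score settled so far
    while fluid.sum() > tol:
        i = int(np.argmax(fluid))            # assumption: push the largest residual first
        amount = fluid[i]
        history[i] += amount                 # settle the fluid of node i
        fluid[i] = 0.0
        if out_degree[i] > 0:
            # forward a damped share along the out-links of node i
            fluid += damping * amount * adjacency[i] / out_degree[i]
        else:
            # dangling node: spread the damped share uniformly
            fluid += damping * amount / n
    return history / history.sum()


# Three pages linking in a cycle: 0 -> 1 -> 2 -> 0.
links = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
scores = push_pagerank(links)
print(scores)
```

Unlike power iteration, which updates every node at each step, a push-based scheme touches one node at a time and can stop as soon as the remaining fluid is negligible.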

## Quickstart

Install gismo:

```console
$ pip install gismo
```
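
The metadata of this release declares `Requires-Python: >=3.10,<3.13`, so `pip` may refuse to install it on other interpreters. A minimal sketch to check the running interpreter beforehand (the helper name `python_is_supported` is hypothetical and not part of gismo):

```python
import sys


def python_is_supported(version_info=sys.version_info):
    """Return True when (major, minor) falls in the range >=3.10,<3.13."""
    return (3, 10) <= tuple(version_info[:2]) < (3, 13)


if not python_is_supported():
    print("This gismo release declares support for Python 3.10 to 3.12 only.")
```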

Import gismo in a Python project:

```python
import gismo as gs
```

To get the hang of a typical Gismo workflow, check the [Toy Example] notebook. For more advanced uses,
look at the other [tutorials] or go directly to the [reference] section.

## Credits

Thomas Bonald, Anne Bouillard, Marc-Olivier Buob, Dohy Hong.

This package was created with [Cookiecutter] and the [francois-durand/package_helper] project template.

## Coverage

[![codecov](https://codecov.io/gh/balouf/gismo/graphs/tree.svg?token=TTMRW7XYS5)](https://codecov.io/gh/balouf/gismo)

[cookiecutter]: https://github.com/audreyr/cookiecutter
[francois-durand/package_helper]: https://github.com/francois-durand/package_helper
[reference]: https://balouf.github.io/gismo/reference.html
[toy example]: https://balouf.github.io/gismo/tutorials/tutorial_toy_example.html
[tutorials]: https://balouf.github.io/gismo/tutorials/index.html#

