Metadata-Version: 2.3
Name: noloox
Version: 0.1.0
Summary: Unsupervised learning algorithms you will need one day.
License: MIT
Author: Márton Kardos
Author-email: power.up1163@gmail.com
Requires-Python: >=3.11
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Provides-Extra: docs
Requires-Dist: griffe (==0.40.0) ; extra == "docs"
Requires-Dist: jax (>=0.8.0,<0.9.0)
Requires-Dist: mkdocs (==1.6.1) ; extra == "docs"
Requires-Dist: mkdocs-autorefs (==0.5.0) ; extra == "docs"
Requires-Dist: mkdocs-material (==9.6.19) ; extra == "docs"
Requires-Dist: mkdocs-material-extensions (==1.3.1) ; extra == "docs"
Requires-Dist: mkdocstrings (==0.22.0) ; extra == "docs"
Requires-Dist: mkdocstrings-python (==1.8.0) ; extra == "docs"
Requires-Dist: numpy (>=2.0.0)
Requires-Dist: scikit-learn (>=1.3.0,<2.0.0)
Requires-Dist: scipy (>=1.16.0,<2.0.0)
Requires-Dist: tqdm (>=4.0.0,<5.0.0)
Description-Content-Type: text/markdown


<img src="docs/images/icon.png" width="75px"></img> is a Python library of reference implementations of useful unsupervised learning algorithms that you probably won't find elsewhere.

#### What <img src="docs/images/icon.png" width="75px"></img> is:

- A collection of unsupervised machine learning algorithms
- A scikit-learn compatible library
- An educational resource containing worked examples and reference implementations

#### What <img src="docs/images/icon.png" width="75px"></img> isn't:

- The most feature-complete or efficient implementation of these algorithms
- A replacement for scikit-learn
- An all-in-one machine learning framework
- A library for complete Bayesian inference. For that, use a probabilistic programming language (PPL) like NumPyro, PyMC or Stan.

## Basic usage

Install noloox from PyPI:

```bash
pip install noloox
```

Then load models from the library and use them the same way you would use scikit-learn estimators.

```python
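from noloox.mixture import StudentsTMixture

# X: your data, assumed here to be an array of shape (n_samples, n_features)
model = StudentsTMixture(n_components=10)
# Fit the mixture and get one cluster label per row of X
cluster_labels = model.fit_predict(X)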
```
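
The snippet above assumes that `X` already exists. Here is a minimal sketch of building a toy input with NumPy, assuming the models accept a dense array of shape `(n_samples, n_features)` as scikit-learn estimators do. The blob locations, sizes and outlier count are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated 2D blobs plus a few far-away outliers (all synthetic).
blob_a = rng.normal(loc=(-3.0, 0.0), scale=0.5, size=(100, 2))
blob_b = rng.normal(loc=(3.0, 0.0), scale=0.5, size=(100, 2))
outliers = rng.uniform(low=-20.0, high=20.0, size=(5, 2))

# Stack everything into a single (n_samples, n_features) array.
X = np.vstack([blob_a, blob_b, outliers])
print(X.shape)  # (205, 2)
```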

## Models

| Model | What do I use it for? | JAX or NumPy? | What algorithm? |
| - | - | - | - |
| Peax | Cluster 2D data where the number of clusters is unknown. | NumPy | Expectation-Maximization |
| SNMF | Factor data where you expect the factors to be non-negative, but the data is unbounded. | JAX | Iterative updates |
| WNMF | NMF, but you don't want to weight all observations equally. | NumPy | Iterative updates |
| StudentsTMixture and CauchyMixture | Cluster continuous data in a way that is robust to outliers. | JAX | Expectation-Maximization |
| DirichletMultinomialMixture | Cluster count data, or do short-text topic modelling. | JAX | Collapsed Gibbs Sampling |
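
To give a feel for what "iterative updates" means for weighted NMF, here is a minimal NumPy sketch of the classic multiplicative update rules for a weighted squared-error objective. It is not noloox's implementation: the function names `weighted_nmf` and `weighted_loss`, the per-entry `weights` array, and all default values are assumptions made for illustration.

```python
import numpy as np


def weighted_loss(X, weights, W, H):
    """Weighted squared reconstruction error: sum(weights * (X - W @ H) ** 2)."""
    return float(np.sum(weights * (X - W @ H) ** 2))


def weighted_nmf(X, weights, n_components, n_iter=200, eps=1e-10, seed=0):
    """Factor non-negative X as W @ H (W, H >= 0) under non-negative weights."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    # Strictly positive initialization so the multiplicative updates can move.
    W = rng.uniform(0.1, 1.0, size=(n_samples, n_components))
    H = rng.uniform(0.1, 1.0, size=(n_components, n_features))
    weighted_X = weights * X
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ weighted_X) / (W.T @ (weights * (W @ H)) + eps)
        W *= (weighted_X @ H.T) / ((weights * (W @ H)) @ H.T + eps)
    return W, H


rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(30, 8))
weights = rng.uniform(0.0, 1.0, size=X.shape)  # hypothetical per-entry weights

W_start, H_start = weighted_nmf(X, weights, n_components=3, n_iter=1)
W, H = weighted_nmf(X, weights, n_components=3, n_iter=200)
print(weighted_loss(X, weights, W_start, H_start), weighted_loss(X, weights, W, H))
```

Per-observation weights are a special case: passing `weights` with shape `(n_samples, 1)` broadcasts one weight across each row.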

## Our philosophy and goals

 - Keep implementations simple and minimal, with minimal dependencies.
 - Everything should be implemented in either NumPy or JAX, preferably in JAX wherever possible.
 - Library structure should match sklearn standards, and all algorithms should be drop-in replacements for scikit-learn equivalents.
 - Under these restrictions, algorithms should be as fast as humanly possible.

## The <img src="docs/images/icon.png" width="90px"></img> wishlist:

There are a number of algorithms that would be nice to implement in the library.
Contributions are very welcome.

 - ProdLDA, and amortized ProdLDA (CTMs) (without Flax)
 - Parametric-TSNE, possibly also Multi-scale Parametric-TSNE
 - DiRE
 - Infinite NMF
 - Latent Dirichlet Allocation with Gibbs Sampling
 - Gaussian LDA


