Metadata-Version: 2.4
Name: magma-py-minimal
Version: 0.1.0a0
Summary: A Python implementation of the MagmaClust algorithm, coded in JAX.
Author-email: "S. Lejoly" <simon.lejoly@unamur.be>
Maintainer-email: "S. Lejoly" <simon.lejoly@unamur.be>
License: MIT
Project-URL: Homepage, https://github.com/SimLej18/MagmaClustPy
Project-URL: Repository, https://github.com/SimLej18/MagmaClustPy
Project-URL: Bug Tracker, https://github.com/SimLej18/MagmaClustPy/issues
Keywords: gaussian-processes,kernels,jax,machine-learning,covariance-functions,bayesian-optimization
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.12
Description-Content-Type: text/markdown
Requires-Dist: jax>=0.6.2
Requires-Dist: jaxlib>=0.6.2
Requires-Dist: equinox>=0.13.2
Requires-Dist: chex~=0.1.89
Requires-Dist: optax~=0.2.5
Requires-Dist: pandas~=2.3.0
Requires-Dist: numpy~=2.3.1
Requires-Dist: matplotlib~=3.10.0
Requires-Dist: kernax-ml>=0.4.4a0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: allure-pytest>=2.15; extra == "dev"
Requires-Dist: pytest-benchmark>=5.2; extra == "dev"
Requires-Dist: gpjax>=0.13; extra == "dev"
Requires-Dist: gpytorch>=1.15; extra == "dev"
Requires-Dist: scikit-learn>=1.8; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=0.990; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=5.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=1.0; extra == "docs"
Requires-Dist: nbsphinx>=0.8; extra == "docs"
Provides-Extra: benchmarks
Requires-Dist: pandas>=1.5.0; extra == "benchmarks"
Requires-Dist: matplotlib>=3.5.0; extra == "benchmarks"
Requires-Dist: jupyter>=1.0.0; extra == "benchmarks"

# MagmaClustPy
---

MagmaClustPy is a probabilistic learning framework based on MagmaClust, a multi-task Gaussian Process framework.

The original implementation of MagmaClust is available as [an R package](https://github.com/ArthurLeroy/MagmaClustR).

The R implementation has many limitations:
* it doesn't run computations in parallel and doesn't run on GPU
* it doesn't support non-Gaussian likelihoods (for classification, for example)
* it only models single-output GPs
* it trains on all the data at once, so it scales poorly to larger datasets

This Python package aims to alleviate these limitations through several design choices:
* We use JAX along with mapping/padding methods to allow fast and parallel computations on CPU/GPU/TPU
* We develop a multi-output algorithm based on Process Convolution for joint optimisation of correlated outputs
* We use Laplace-Matching to adapt to non-Gaussian data and problems
* We support data points coming with known, heteroskedastic uncertainty estimates
* We train on data in batches to reduce training time
* We explore sparse covariance matrix computations to speed up inference
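
The mapping/padding idea from the first design choice can be sketched as follows. This is only an illustration under stated assumptions, not MagmaClustPy's actual code: `pad_tasks`, `masked_mean` and the toy data are hypothetical. Tasks with different numbers of observations are padded to a common length, a boolean mask records which entries are real, and `jax.vmap` maps a single-task function over all tasks at once.

```python
import jax
import jax.numpy as jnp


def pad_tasks(tasks, pad_value=0.0):
    """Pad variable-length 1-D sequences to a common length and build a validity mask."""
    max_len = max(len(t) for t in tasks)
    padded = jnp.stack([
        jnp.pad(
            jnp.asarray(t, dtype=jnp.float32),
            (0, max_len - len(t)),
            constant_values=pad_value,
        )
        for t in tasks
    ])
    # True where an entry is a real observation, False where it is padding.
    mask = jnp.stack([jnp.arange(max_len) < len(t) for t in tasks])
    return padded, mask


def masked_mean(values, mask):
    """Mean of one task's values, ignoring padded entries."""
    return jnp.sum(jnp.where(mask, values, 0.0)) / jnp.sum(mask)


# vmap maps the single-task function over the leading (task) axis;
# jit compiles the batched function for the available backend.
batched_masked_mean = jax.jit(jax.vmap(masked_mean))

# Toy data: three tasks with 3, 1 and 2 observations respectively.
tasks = [[1.0, 2.0, 3.0], [10.0], [4.0, 6.0]]
padded, mask = pad_tasks(tasks)
task_means = batched_masked_mean(padded, mask)
```

Here `task_means` holds one value per task, computed without padded entries leaking into the result. Giving every task the same shape is what lets `jax.jit` compile a single batched computation instead of one per task length.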

---

## Installation

--- TODO ---

## Main differences with the original MagmaClustR library

* This is a module written in Python instead of a package coded in R (obviously)
* The package runs on JAX and can therefore leverage various backends (CPU, GPU, TPU). 
* We use **custom classes for kernels** rather than string identifiers. These kernels can be composed (à la GPyTorch). 
You can find them in `kernels.py`. Therefore, *signatures of functions that use kernels might be different*. A common 
example of this is the initialisation of kernel hyperparameters (HPs). Rather than sending the kernel class and HPs as 
separate arguments, **the user can initialise the kernel with the desired HPs and then pass it as a single argument**.
* We use **matplotlib** for plotting instead of **ggplot2**
* Files, classes, functions, variables and parameters might have different names, either for clarity or to follow
Python conventions.
* This library sticks with the default precision of the linear algebra backend (or the one specified by the user). No 
implicit rounding of numbers is performed by the library itself.
* This library uses `logging` instead of `cat`. You can configure the logging level like this: 
`logging.basicConfig(level=logging.INFO)`.
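
To illustrate the kernel-object convention described in the list above, here is a minimal sketch. The classes (`Kernel`, `SquaredExponential`, `WhiteNoise`, `SumKernel`) and the `covariance` function are hypothetical stand-ins written for this example. They are not the classes shipped with the library, whose names and signatures may differ.

```python
import jax.numpy as jnp


class Kernel:
    """Hypothetical base class: kernels are callables that can be added together."""

    def __add__(self, other):
        return SumKernel(self, other)


class SquaredExponential(Kernel):
    def __init__(self, length_scale=1.0, variance=1.0):
        # Hyperparameters live on the kernel object itself.
        self.length_scale = length_scale
        self.variance = variance

    def __call__(self, x1, x2):
        # Pairwise squared distances between two 1-D input vectors.
        sq_dist = (x1[:, None] - x2[None, :]) ** 2
        return self.variance * jnp.exp(-0.5 * sq_dist / self.length_scale**2)


class WhiteNoise(Kernel):
    def __init__(self, noise=0.1):
        self.noise = noise

    def __call__(self, x1, x2):
        # Noise contributes only where the two inputs coincide.
        return self.noise * (x1[:, None] == x2[None, :])


class SumKernel(Kernel):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def __call__(self, x1, x2):
        return self.left(x1, x2) + self.right(x1, x2)


def covariance(kernel, x):
    """Hypothetical consumer: takes one kernel object, not a name plus separate HPs."""
    return kernel(x, x)


# The kernel is initialised with its HPs, composed, then passed as a single argument.
kernel = SquaredExponential(length_scale=0.5, variance=2.0) + WhiteNoise(noise=0.1)
x = jnp.array([0.0, 0.5, 1.0])
K = covariance(kernel, x)
```

Because the hyperparameters travel with the kernel object, a function such as `covariance` needs no extra HP arguments, and composed kernels are handled the same way as simple ones.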

---

## Development roadmap

- [x] Cluster mixture init and update
- [x] Cluster hyperpost and HP optim
- [ ] Model classes
- [x] Cluster prediction
- [x] Plot utilities
- [ ] Initializers
- [ ] Prior means modules
- [ ] Likelihood modules
- [ ] Minimal documentation (guides and API)
- [ ] PyPI package and deployment setup

🚀 Alpha release!

- [ ] Bug test - issue management
- [ ] Unit test
- [ ] Multi-output GPs
- [ ] Complete documentation
- [ ] Contribution guides
- [ ] Dev pipeline tools for testing/coverage/...

🚀 1.0.0 release

- [ ] Laplace-Matching likelihoods
- [ ] Continued development

---

## Help, feedback, contributions

Any feedback, issue or contribution is more than welcome! 
Don't hesitate to open an issue/discussion on GitHub, or get in touch with [Arthur Leroy](https://arthur-leroy.netlify.app/) if you have any questions.

