Metadata-Version: 2.4
Name: multimin
Version: 0.9.5
Summary: MultiMin: Multivariate Gaussian fitting
Home-page: https://github.com/seap-udea/multimin
Author: Jorge I. Zuluaga
Author-email: jorge.zuluaga@udea.edu.co
License: AGPL-3.0-only
Keywords: fitting multivariate-normal statistics optimization
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Astronomy
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/x-rst
License-File: LICENSE
Requires-Dist: numpy>=1.20.0
Requires-Dist: scipy>=1.7.0
Requires-Dist: matplotlib>=3.3.0
Requires-Dist: spiceypy>=5.0.0
Requires-Dist: pandas>=1.0.0
Requires-Dist: plotly>=5.0.0
Requires-Dist: pypandoc
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

|MultiMin Logo|

|version| |license| |pythonver| |downloads| |Powered by SciPy|

Introducing ``MultiMin``
------------------------

``MultiMin`` is a ``Python`` package designed to provide numerical tools for **fitting composed multivariate distributions** to data. It is particularly useful for modelling complex multimodal distributions in N-dimensions.

These are the main features of ``MultiMin``:

- **Multivariate Fitting**: Tools for fitting composed multivariate normal distributions (CMND).
- **Visualization**: Density plots and specific visualization utilities.
- **Statistical Analysis**: Tools for handling covariance matrices and correlations.

Resources
---------

- Documentation including examples and full API documentation: https://multimin.readthedocs.io.
- PyPI project page: https://pypi.org/project/multimin/.
- Github repo: https://github.com/seap-udea/multimin

Installation
------------

From PyPI
~~~~~~~~~

``MultiMin`` is available on PyPI at https://pypi.org/project/multimin/. You can install it with:

.. code:: bash

   pip install -U multimin

If you prefer, you can install the latest version of the developers taking it from the github repo:

.. code:: bash

   pip install -U git+https://github.com/seap-udea/multimin

From Sources
~~~~~~~~~~~~

You can also install from the `GitHub repository <https://github.com/seap-udea/multimin>`__:

.. code:: bash

   git clone https://github.com/seap-udea/multimin
   cd multimin
   pip install .

For development, use an editable installation:

.. code:: bash

   cd multimin
   pip install -e .

In Google Colab
~~~~~~~~~~~~~~~

If you use Google Colab, you can install ``MultiMin`` by executing:

.. code:: python

   !pip install -Uq multimin

or

.. code:: bash

   pip install -Uq git+https://github.com/seap-udea/multimin

Theoretical Background
----------------------

The core of ``MultiMin`` is the **Composed Multivariate Normal Distribution (CMND)**. The theory behind it posits that any multivariate distribution function :math:`p(\tilde U):\Re^{N}\rightarrow\Re`, where :math:`\tilde U:(u_1,u_2,u_3,\ldots,u_N)` are random variables, can be approximated with arbitrary precision by a normalized linear combination of :math:`M` Multivariate Normal Distributions (MND):

.. math::


   p(\tilde U) \approx \mathcal{C}_M(\tilde U; \{w_k\}_M, \{\mu_k\}_M, \{\Sigma_k\}_M) \equiv \sum_{i=1}^{M} w_i\mathcal{N}(\tilde U; \tilde \mu_i, \Sigma_i)

where the multivariate normal :math:`\mathcal{N}(\tilde U; \tilde \mu, \Sigma)` with mean vector :math:`\tilde \mu` and covariance matrix :math:`\Sigma` is given by:

.. math::


   \mathcal{N}(\tilde U; \tilde \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^{k} \det \Sigma}} \exp\left[-\frac{1}{2}(\tilde U - \tilde \mu)^{\rm T} \Sigma^{-1} (\tilde U - \tilde \mu)\right]

The covariance matrix :math:`\Sigma` elements are defined as :math:`\Sigma_{ij} = \rho_{ij}\sigma_{i}\sigma_{j}`, where :math:`\sigma_i` is the standard deviation of :math:`u_i` and :math:`\rho_{ij}` is the correlation coefficient between variable :math:`u_i` and :math:`u_j` (:math:`-1<\rho_{ij}<1`, :math:`\rho_{ii}=1`).

The normalization condition on :math:`p(\tilde U)` implies that the set of weights :math:`\{w_k\}_M` are also normalized, i.e., :math:`\sum_i w_i=1`.

Fitting procedure
~~~~~~~~~~~~~~~~~

To estimate the parameters of the CMND that best describe a given dataset , we use the **Likelihood Statistics** method.

Given a dataset of :math:`S` objects with state vectors :math:`\{\tilde U_k\}_{k=1}^S`, the likelihood :math:`\mathcal{L}` of the CMND parameters is defined as the product of the probability densities evaluated at each data point:

.. math::


   \mathcal{L} = \prod_{i=1}^{S} \mathcal{C}_M(\tilde U_i)

The goal is to find the set of parameters (weights, means, and covariances) that maximize this likelihood. In practice, it is numerically more stable to minimize the **negative normalized log-likelihood**:

.. math::


   -\frac{\log \mathcal{L}}{S} = -\frac{1}{S} \sum_{i=1}^{S} \log \mathcal{C}_M(\tilde U_i)

This approach allows us to fit the distribution without making strong assumptions about the underlying normality of the data, effectively treating the CMND as a series expansion of the true probability density function.

In ``MultiMin``, we use the ``scipy.optimize.minimize`` function to find the set of parameters that minimize the negative normalized log-likelihood.

Quickstart
----------

Getting started with ``MultiMin`` is straightforward. Import the package:

.. code:: python

   import multimin as mn

..

   **NOTE**: If you are working in Google Colab, load the matplotlib backend before producing plots:

   .. code:: python

      %matplotlib inline

Here is a basic example of how to use ``MultiMin`` to fit a 3D distribution composed of 2 Multivariate Normals.

1. Define a true distribution
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

First, we define a distribution from which we will generate synthetic data. We use a **Composed Multivariate Normal Distribution (CMND)** with 2 Gaussian components (``ngauss=2``) in 3 dimensions (``nvars=3``).

.. code:: python

   import numpy as np
   import multimin as mn

   # Define parameters for 2 Gaussian components
   weights = [0.5, 0.5]
   mus = [[1.0, 0.5, -0.5], [1.0, -0.5, +0.5]]
   sigmas = [[1, 1.2, 2.3], [0.8, 0.2, 3.3]]
   deg = np.pi/180
   angles = [
       [10*deg, 30*deg, 20*deg],
       [-20*deg, 0*deg, 30*deg],
   ] 

   # Calculate covariance matrices from rotation angles
   Sigmas = mn.Stats.calc_covariance_from_rotation(sigmas, angles)

   # Create the CMND object
   CMND = mn.ComposedMultiVariateNormal(mus=mus, weights=weights, Sigmas=Sigmas)

2. Generate sample data
~~~~~~~~~~~~~~~~~~~~~~~

We generate 5000 random samples from this distribution to serve as our “observed” data.

.. code:: python

   np.random.seed(1)
   sample = CMND.rvs(5000)

3. Visualize the data
~~~~~~~~~~~~~~~~~~~~~

We can check the distribution of the generated data using ``DensityPlot``.

.. code:: python

   import matplotlib.pyplot as plt

   # Define properties labels
   properties = dict(
       x=dict(label=r"$x$", range=None),
       y=dict(label=r"$y$", range=None),
       z=dict(label=r"$z$", range=None),
   )

   # Plot the density plot
   G = mn.DensityPlot(properties, figsize=3)

   hargs=dict(bins=30,cmap='Spectral_r')
   histogram=G.plot_hist(sample,**hargs)

   sargs=dict(s=0.5,edgecolor='None',color='r')
   scatter=G.scatter_plot(sample,**sargs)

The same ``properties`` dict can be passed to ``CMND.plot_sample`` and ``F.plot_fit`` via the ``properties`` argument for consistent axis labels. You can also pass a simple list of names (e.g. ``properties=["x","y","z"]``); then each name is used as the axis label and ``range=None``.

.. figure:: https://raw.githubusercontent.com/seap-udea/multimin/master/examples/gallery/cmnd_data_density_scatter.png
   :width: 600
   :alt: Data Scatter Plot

   Data Scatter Plot

4. Initialize the Fitter and Run the Fit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We initialize the ``FitCMND`` handler with the expected number of Gaussians (2) and variables (3). We then run the fitting procedure.

.. code:: python

   # Initialize the fitter
   F = mn.FitCMND(ngauss=2, nvars=3)

   # Run the fit (using advance=True for better convergence on complex models)
   F.fit_data(sample, advance=True)

5. Check and Plot Results
~~~~~~~~~~~~~~~~~~~~~~~~~

Finally, we visualize the fitted distribution compared to the data.

.. code:: python

   # Plot the fit result (properties accepts the same dict as DensityPlot, or a list of names)
   G = F.plot_fit(
       properties=properties,
       hargs=dict(bins=30, cmap='YlGn'),
       sargs=dict(s=0.2, edgecolor='None', color='r'),
       figsize=3
   )

.. figure:: https://raw.githubusercontent.com/seap-udea/multimin/master/examples/gallery/cmnd_fit_result_3d.png
   :width: 600
   :alt: Fit Result

   Fit Result

6. Inspect Parameters and Get Explicit PDF Function
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can tabulate the fitted parameters and obtain an explicit Python function that evaluates the fitted PDF. Below, each step is shown with its output.

**Stage 1: Tabulate the fitted CMND**

.. code:: python

   F.cmnd.tabulate(sort_by='weight')

Output:

::

                     w      mu_1      mu_2      mu_3   sigma_1   sigma_2   sigma_3    rho_12    rho_13    rho_23
   component                                                                                                    
   2          0.509108  1.019245 -0.480997  0.618821  0.794906  0.245786  3.327537  0.539417 -0.008936 -0.017769
   1          0.490892  0.957687  0.517584 -0.463392  1.039489  1.538029  2.116544 -0.209695  0.121184 -0.527142

**Stage 2: Get the source code and a callable function**

.. code:: python

   code, cmnd = F.cmnd.get_function()

Output (the printed code, which you can copy):

::

   from multimin import nmd

   def cmnd(X):

       mu1_1 = 0.957687
       mu1_2 = 0.517584
       mu1_3 = -0.463392
       mu1 = [mu1_1, mu1_2, mu1_3]
       Sigma1 = [[1.080538, -0.335252, 0.266619], [-0.335252, 2.365532, -1.716008], [0.266619, -1.716008, 4.479757]]
       n1 = nmd(X, mu1, Sigma1)

       mu2_1 = 1.019245
       mu2_2 = -0.480997
       mu2_3 = 0.618821
       mu2 = [mu2_1, mu2_2, mu2_3]
       Sigma2 = [[0.631876, 0.10539, -0.023637], [0.10539, 0.060411, -0.014533], [-0.023637, -0.014533, 11.072504]]
       n2 = nmd(X, mu2, Sigma2)

       w1 = 0.490892
       w2 = 0.509108

       return (
           w1*n1
           + w2*n2
       )

**Stage 3: Evaluate the PDF at a point**

.. code:: python

   cmnd([1.0, 0.5, -0.5])

Output:

::

   0.011073778538439395

**Stage 4: LaTeX output for papers**

You can get the fitted PDF as a LaTeX string (suitable for inclusion in papers) with parameter values and the definition of the normal distribution:

.. code:: python

   latex_str, _ = F.cmnd.get_function(print_code=False, type='latex', decimals=4)
   print(latex_str)

Output:

.. math:: f(\mathbf{x}) = w_1 \, \mathcal{N}(\mathbf{x}; \boldsymbol{\mu}_1, \mathbf{\Sigma}_1) + w_2 \, \mathcal{N}(\mathbf{x}; \boldsymbol{\mu}_2, \mathbf{\Sigma}_2)

where

.. math:: w_1 = 0.4909

.. math:: \boldsymbol{\mu}_1 = \left( \begin{array}{c} 0.9577 \\ 0.5176 \\ -0.4634 \end{array}\right)

.. math:: \mathbf{\Sigma}_1 = \left( \begin{array}{ccc} 1.0805 & -0.3353 & 0.2666 \\ -0.3353 & 2.3655 & -1.716 \\ 0.2666 & -1.716 & 4.4798 \end{array}\right)

.. math:: w_2 = 0.5091

.. math:: \boldsymbol{\mu}_2 = \left( \begin{array}{c} 1.0192 \\ -0.481 \\ 0.6188 \end{array}\right)

.. math:: \mathbf{\Sigma}_2 = \left( \begin{array}{ccc} 0.6319 & 0.1054 & -0.0236 \\ 0.1054 & 0.0604 & -0.0145 \\ -0.0236 & -0.0145 & 11.0725 \end{array}\right)

Here the normal distribution is defined as:

.. math:: \mathcal{N}(\mathbf{x}; \boldsymbol{\mu}, \mathbf{\Sigma}) = \frac{1}{\sqrt{(2\pi)^{{k}} \det \mathbf{\Sigma}}} \exp\left[-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top} \mathbf{\Sigma}^{{-1}} (\mathbf{x}-\boldsymbol{\mu})\right]

A parameter table in LaTeX is also available via ``F.cmnd.tabulate(sort_by='weight', type='latex')``.

Truncated multivariate distributions.
-------------------------------------

In real problems the domain of the variables is not infinite but bounded into a semi-finite region.

If we start from the unbounded multivariate normal distribution:

.. math::


   \mathcal{N}_k(\tilde U; \tilde \mu, \Sigma) = \frac{1}{\sqrt{(2\pi)^{k}\det \Sigma}} \exp\left[ -\frac{1}{2}(\tilde U - \tilde \mu)^{\rm T}\Sigma^{-1}(\tilde U - \tilde \mu) \right]

Let :math:`T\subset\{l,\dots,m\}`, where :math:`l\leq k` and :math:`m\leq k` be the set of indices of the truncated variables, and let :math:`a_i<b_i` be the truncation bounds for :math:`i\in S`. Define the truncation region:

.. math::


   A_S : \{\tilde U\in\mathbb{R}^k:\ a_i \le \tilde U_i \le b_i \ \ \forall\, i\in T \}

with the remaining coordinates :math:`i\notin T` unbounded. The partially-truncated multivariate normal distribution is defined by

.. math::


   \mathcal{TN}_T(\tilde U;\tilde\mu,\Sigma,\mathbf{a}_T,\mathbf{b}_T) = \frac{\mathcal{N}_k(\tilde U;\tilde\mu,\Sigma)\,\mathbf{1}_{A_T}(\tilde U)}{Z_ (\tilde\mu,\Sigma,\mathbf{a}_T,\mathbf{b}_T)},

where :math:`\mathbf{1}_{A_T}` is the indicator function of :math:`A_T` and the normalization constant is

.. math::


   Z_T(\tilde\mu,\Sigma,\mathbf{a}_T,\mathbf{b}_T)= \int_{A_T}\mathcal{N}_k(\tilde T;\tilde\mu,\Sigma)\,d\tilde T = \mathbb{P}_{\tilde T\sim\mathcal{N}_k(\tilde\mu,\Sigma)}\left(\tilde T\in A_T\right).

Example: univariate truncated mixture
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Define a mixture of two Gaussians on the interval :math:`[0, 1]` with the **domain** parameter, generate data, and fit with ``FitCMND(..., domain=[[0, 1]])``:

.. code:: python

   import numpy as np
   import multimin as mn

   # Truncated mixture of 2 Gaussians on [0, 1]
   CMND_1d = mn.ComposedMultiVariateNormal(
       mus=[0.2, 0.8],
       weights=[0.5, 0.5],
       Sigmas=[0.01, 0.03],
       domain=[[0, 1]],
   )
   np.random.seed(1)
   data_1d = CMND_1d.rvs(5000)

   # Fit with same domain so likelihood and means respect [0, 1]
   F_1d = mn.FitCMND(ngauss=2, nvars=1, domain=[[0, 1]])
   F_1d.fit_data(data_1d, advance=True)
   G = F_1d.plot_fit(hargs=dict(bins=40), sargs=dict(s=0.5, alpha=0.6))

.. figure:: https://raw.githubusercontent.com/seap-udea/multimin/master/examples/gallery/truncated_1d_fit.png
   :width: 600
   :alt: Truncated 1D fit

   Truncated 1D fit

You can also extract an explicit callable function for the fitted *truncated* PDF (including the bounds) and evaluate it safely outside the interval.

.. code:: python

   function, cmnd = F_1d.cmnd.get_function()

Output (the printed code, which you can copy):

::

   import numpy as np
   from multimin import tnmd

   def cmnd(X):

       a = 0.0
       b = 1.0

       mu1_1 = 0.200467
       sigma1_1 = 0.009683
       n1 = tnmd(X, mu1_1, sigma1_1, a, b)

       mu2_1 = 0.801063
       sigma2_1 = 0.030392
       n2 = tnmd(X, mu2_1, sigma2_1, a, b)

       w1 = 0.504151
       w2 = 0.495849

       return (
           w1*n1
           + w2*n2
       )

Evaluate the fitted PDF at a point inside the domain and outside the domain:

.. code:: python

   cmnd(0.5), cmnd(-0.2)

Output:

::

   (0.3128645172339761, 0.0)

For papers, you can also generate a LaTeX/Markdown description that includes the truncation information:

.. code:: python

   function_str, _ = F_1d.cmnd.get_function(print_code=False, type='latex', decimals=4)
   print(function_str)

Output:

Finite domain. The following variables are truncated (the rest are unbounded):

- Variable :math:`x_{1}` (index 1): domain :math:`[0.0, 1.0]`.

Truncation region: :math:`A_T = \{\tilde{U} \in \mathbb{R}^k : a_i \le \tilde{U}_i \le b_i \;\forall i \in T\}`, with :math:`T` the set of truncated indices.

.. math:: f(x) = w_1 \, \mathcal{TN}(x; \mu_{1}, \sigma_{1}, a, b) + w_2 \, \mathcal{TN}(x; \mu_{2}, \sigma_{2}, a, b)

where

.. math:: w_1 = 0.5042,\quad \mu_{1} = 0.2005,\quad \sigma_{1}^2 = 0.0097,\quad a = 0.0,\quad b = 1.0

.. math:: w_2 = 0.4958,\quad \mu_{2} = 0.8011,\quad \sigma_{2}^2 = 0.0304,\quad a = 0.0,\quad b = 1.0

Truncated normal. The unbounded normal is

.. math:: \mathcal{N}(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right).

The truncation region is :math:`A_T = \{\tilde{U} \in \mathbb{R}^k : a_i \le \tilde{U}_i \le b_i \;\forall i \in T\}`. The partially truncated normal is

.. math:: \mathcal{TN}_T(\tilde{U}; \tilde{\mu}, \Sigma, \mathbf{a}_T, \mathbf{b}_T) = \frac{\mathcal{N}(\tilde{U}; \tilde{\mu}, \Sigma) \, \mathbf{1}_{A_T}(\tilde{U})}{Z_T(\tilde{\mu}, \Sigma, \mathbf{a}_T, \mathbf{b}_T)},

where :math:`\mathbf{1}_{A_T}` is the indicator of :math:`A_T` and the normalization constant is

.. math:: Z_T(\tilde{\mu}, \Sigma, \mathbf{a}_T, \mathbf{b}_T) = \int_{A_T} \mathcal{N}(\tilde{T}; \tilde{\mu}, \Sigma) \, d\tilde{T} = \mathbb{P}_{\tilde{T} \sim \mathcal{N}(\tilde{\mu},\Sigma)}(\tilde{T} \in A_T).

See `examples/multimin_truncated_tutorial.ipynb <examples/multimin_truncated_tutorial.ipynb>`__ for 3D truncated examples and more detail.

Citation
--------

The numerical tools and codes provided in this package have been developed and tested over several years of scientific research.

If you use ``MultiMin`` in your research, please cite:

.. code:: bibtex

   @software{multimin2026,
     author = {Zuluaga, Jorge I.},
     title = {MultiMin: Multivariate Gaussian fitting},
     year = {2026},
     url = {https://github.com/seap-udea/multimin}
   }

What’s New
----------

For a detailed list of changes and new features, see `WHATSNEW.md <WHATSNEW.md>`__.

Authors and Licensing
---------------------

This project is developed by the Solar, Earth and Planetary Physics Group (SEAP) at Universidad de Antioquia, Medellín, Colombia. The main developer is Prof. **Jorge I. Zuluaga** - jorge.zuluaga@udea.edu.co.

Other beta testers and contributions from:

- **Juanita A. Agudelo** - juanita.agudelo@udea.edu.co. Testing of the initial versions of the package in the context of NEAs research.

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0) - see the `LICENSE <LICENSE>`__ file for details.

Contributing
------------

We welcome contributions! If you’re interested in contributing to MultiMin, please:

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request

Please read the `CONTRIBUTING.md <CONTRIBUTING.md>`__ file for more information.

.. |MultiMin Logo| image:: https://raw.githubusercontent.com/seap-udea/multimin/master/docs/multimin-logo-white.webp
   :target: https://multimin.readthedocs.io/
.. |version| image:: https://img.shields.io/pypi/v/multimin?color=blue
   :target: https://pypi.org/project/multimin/
.. |license| image:: https://img.shields.io/badge/License-AGPL%20v3-blue.svg
   :target: https://github.com/seap-udea/multimin/blob/master/LICENSE
.. |pythonver| image:: https://img.shields.io/pypi/pyversions/multimin
   :target: https://pypi.org/project/multimin/
.. |downloads| image:: https://img.shields.io/pypi/dw/multimin
   :target: https://pypi.org/project/multimin/
.. |Powered by SciPy| image:: https://img.shields.io/badge/Powered%20by-SciPy-blue
   :target: https://scipy.org/
