Metadata-Version: 2.4
Name: wnet
Version: 0.9.16
Summary: Tools for calculation of Wasserstein metric between distributions based on Network Flow algorithm
Keywords: Wasserstein,Optimal Transport,Network Flow,Earth Mover's Distance
Author-Email: =?utf-8?q?Micha=C5=82_Startek?= <michal.startek@mimuw.edu.pl>
Maintainer-Email: =?utf-8?q?Micha=C5=82_Startek?= <michal.startek@mimuw.edu.pl>
License-Expression: MIT
License-File: LICENCE
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Development Status :: 4 - Beta
Project-URL: Homepage, https://github.com/michalsta/wnet
Project-URL: Repository, https://github.com/michalsta/wnet.git
Requires-Python: >=3.9
Requires-Dist: pylmcf>=0.9.13
Requires-Dist: numpy
Provides-Extra: extras
Requires-Dist: networkx; extra == "extras"
Requires-Dist: matplotlib; extra == "extras"
Provides-Extra: pytest
Requires-Dist: pytest; extra == "pytest"
Description-Content-Type: text/markdown

# wnet

Wasserstein Network (wnet) is a Python/C++ library for working with Wasserstein distances. It uses the Min Cost Flow algorithm as implemented by the [LEMON library](https://lemon.cs.elte.hu/trac/lemon), exposed to Python via the [pylmcf module](https://github.com/michalsta/pylmcf), enabling efficient computation and manipulation of Wasserstein distances between multidimensional distributions.

## Features
- Wasserstein and Truncated Wasserstein distance between multidimensional distributions (dimensions 1–20)
- Three distance metrics: L1, L2, L∞
- Derivatives with respect to peak intensities and spectrum mixture proportions
- Position gradients (∂cost/∂position) with warm-restart re-solving after peak position updates
- Support for distribution mixtures and efficient recalculation with changed mixture proportions
- Picklable `Distribution` objects

## Installation

You can install the Python package using pip:

```bash
pip install wnet
```

## Usage

### Basic distance

```python
import numpy as np
from wnet import WassersteinDistance, Distribution
from wnet.distances import DistanceMetric

positions1 = np.array([[0, 1, 5, 10], [0, 0, 0, 3]])
intensities1 = np.array([10, 5, 5, 5])

positions2 = np.array([[1, 10], [0, 0]])
intensities2 = np.array([20, 5])

S1 = Distribution(positions1, intensities1)
S2 = Distribution(positions2, intensities2)

print(WassersteinDistance(S1, S2, DistanceMetric.L1))
# 45
```
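
As an independent cross-check of the value above, the same transport problem can be written as a small linear program. This is only a sketch, not how wnet computes the distance (wnet uses LEMON's Min Cost Flow). It assumes SciPy is installed, which wnet itself does not require, and that each column of the position arrays is one point, as in the example above. The name `reference_distance` is introduced here for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Same data as the basic example: rows are coordinates, columns are points.
positions1 = np.array([[0, 1, 5, 10], [0, 0, 0, 3]], dtype=float)
intensities1 = np.array([10, 5, 5, 5], dtype=float)
positions2 = np.array([[1, 10], [0, 0]], dtype=float)
intensities2 = np.array([20, 5], dtype=float)

# Pairwise L1 costs: cost[i, j] = sum_k |positions1[k, i] - positions2[k, j]|
cost = np.abs(positions1[:, :, None] - positions2[:, None, :]).sum(axis=0)
n, m = cost.shape

# The transport plan is flattened row-major (index i * m + j).
# Each source point must ship out exactly its intensity,
# and each target point must receive exactly its intensity.
row_sums = np.kron(np.eye(n), np.ones((1, m)))  # shape (n, n * m)
col_sums = np.kron(np.ones((1, n)), np.eye(m))  # shape (m, n * m)
A_eq = np.vstack([row_sums, col_sums])
b_eq = np.concatenate([intensities1, intensities2])

result = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
reference_distance = round(result.fun, 6)
print(reference_distance)
# 45.0
```

The dense formulation has `n * m` variables, so it is practical only for small inputs like this one.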

### Truncated Wasserstein

Mass that cannot be matched within `max_distance` is discarded at a fixed cost rather than transported arbitrarily far:

```python
from wnet import TruncatedWassersteinDistance

print(TruncatedWassersteinDistance(S1, S2, DistanceMetric.L2, max_distance=3.0))
```

### Derivatives w.r.t. peak intensities

`signal_part_derivatives()` returns the marginal cost of increasing each theoretical peak's intensity by 1 — useful for scoring how well each peak is explained:

```python
from wnet import WassersteinNetwork

W = WassersteinNetwork(S1, [S2], DistanceMetric.L2, max_distance=10.0)
W.build()
W.solve()

derivs = W.signal_part_derivatives()   # np.ndarray, one value per peak in S1
```

### Optimising peak positions

After an initial solve, positions can be updated and the problem re-solved cheaply via a warm restart. `update_positions_and_get_gradient()` returns `∂cost/∂position` for all peaks, so the gradients can be fed directly into a gradient-based optimiser:

```python
W = WassersteinNetwork(S1, [S2], DistanceMetric.L2, max_distance=10.0)
W.build()
W.solve()

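# Assumption: positions use the same (dimension, n_peaks) layout as positions1.
# A float copy is needed so that the in-place update below works.
new_positions = positions1.astype(float)

for _ in range(100):
    grad_empirical, grad_theoretical = W.update_positions_and_get_gradient(new_positions)
    new_positions -= 0.01 * grad_empirical  # plain gradient-descent step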
```
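
To illustrate what a position gradient means, the sketch below computes it with plain NumPy for a fixed transport plan under the L2 metric. For a fixed plan, the cost is `sum_ij plan[i, j] * ||x_i - y_j||`, so its derivative with respect to `x_i` is the flow-weighted sum of unit vectors pointing from `y_j` to `x_i`. This is not wnet's implementation, and its conventions may differ. The function `l2_position_gradient` and the hand-written `plan` are hypothetical names used only for this example.

```python
import numpy as np

def l2_position_gradient(src, dst, plan):
    """Gradient of sum_ij plan[i, j] * ||src[:, i] - dst[:, j]||_2 w.r.t. src.

    The plan is treated as fixed. Shapes: src (dim, n), dst (dim, m), plan (n, m).
    """
    diff = src[:, :, None] - dst[:, None, :]   # (dim, n, m)
    dist = np.linalg.norm(diff, axis=0)        # (n, m)
    # Where two points coincide the norm is not differentiable;
    # this sketch picks the zero subgradient there.
    safe_dist = np.where(dist > 0, dist, 1.0)
    unit = diff / safe_dist                    # unit vectors from dst to src
    return (unit * plan).sum(axis=2)           # (dim, n)

src = np.array([[0, 1, 5, 10], [0, 0, 0, 3]], dtype=float)
dst = np.array([[1, 10], [0, 0]], dtype=float)
# A feasible plan written by hand: plan[i, j] is the mass moved from src i to dst j.
plan = np.array([[10, 0], [5, 0], [5, 0], [0, 5]], dtype=float)

grad = l2_position_gradient(src, dst, plan)
print(grad)
```

Moving a point against its gradient shortens the flow-weighted distance to the points it is matched with, which is what a gradient-descent step on positions exploits.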

## Licence
MIT Licence

## Related Projects

- [pylmcf](https://github.com/michalsta/pylmcf) - Python bindings for Min Cost Flow algorithms from the LEMON library
- [wnetalign](https://github.com/michalsta/wnetalign) - Alignment of MS/NMR spectra using Truncated Wasserstein Distance
