Metadata-Version: 2.4
Name: baryx
Version: 0.1.0
Summary: A JAX recombination library
Project-URL: repository, https://github.com/datasig-ac-uk/baryx
Author-email: The Baryx Authors <info@datasig.ac.uk>
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: jax,recombination
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.10
Requires-Dist: equinox>=0.11.5
Requires-Dist: jax>=0.4.35
Requires-Dist: jaxtyping>=0.2.24
Requires-Dist: typing-extensions>=4.4.0
Description-Content-Type: text/markdown

# Baryx
> **Bary**center preserving measure reduction in JA**X**

A JAX implementation of the recombination (barycenter preserving discrete-measure reduction) algorithms.

## Installation
Requires Python 3.10+
```bash
pip install baryx
```

## Example
```python
import jax
import jax.random as jr

from baryx import recombine

# Required for good results with the default solvers.
jax.config.update("jax_enable_x64", True)

# Nodes and weights define a discrete measure $\mu$.
n, m = 1024, 32
key = jr.key(0)
nodes_key, weights_key = jr.split(key)
nodes = jr.normal(nodes_key, (n, m))
weights = jr.uniform(weights_key, (n,))


# Test functions $f$ used to define the push-forward $f_{\#}\mu$.
def f(x):
    return x


# Apply the test functions $f$ to yield the push-forward $f_{\#}\mu$.
pushed_forward_nodes = f(nodes)

# Solve the recombination problem for the push-forward $f_{\#}{\mu}$.
solution = recombine(pushed_forward_nodes, weights)
err = solution.error(norm=2)
print(err)

# Construct the recombined measure $\hat{\mu}$.
recombined_nodes = nodes[solution.indices]
recombined_weights = solution.weights
```

## What is recombination
Recombination refers to both a discrete measure reduction problem and the algorithm(s) used to solve it.
### The recombination problem
Given a discrete measure $\mu$, whose support is a subset of some arbitrary space $X$ with cardinality $n$, and a set of test functions $`f = \{f_1, \dots, f_m \mid f_i \colon X \to \mathbb{R}\}`$, construct a discrete measure $`\hat{\mu}`$ whose support has cardinality at most $`r+1`$ and is a subset of the support of $`\mu`$ such that

$$\int_\Omega f(x)\mu(x) dx = \int_\Omega f(x)\hat{\mu}(x) dx,$$

where $`f(x) = (f_1(x), f_2(x), \dots, f_m(x))`$ is an embedding $`X \to \mathbb{R}^m`$. **Note:** The number of retained support points is at most $`r + 1`$, where $`r`$ is the rank of the embedded node matrix (i.e., the number of linearly independent test functions), satisfying $`r \le m`$.

**A linear programming interpretation:**
The recombination problem can also be formulated as finding a *basic feasible solution* $`\hat{w}`$ to the following linear program
$$\bar{A}^Tw = \bar{b},\\ w \ge 0,$$
with trivial (zero) objective and constraints $`\bar{A} = [1 \mid A]`$ and $`\bar{b} = [\sum w \mid b ]`$, where
$`A = (f(x_1), \dots, f(x_n))^T`$ is an $n \times m$ matrix representing the points for the *push-forward measure* $`f_{\#}\mu`$,
and $`b = \int_\Omega f(x)\mu(x)`$ represents the integral to be preserved.

In this interpretation, a given (not necessarily basic) feasible solution $w$—the weights of the initial measure $\mu$—is recombined into a basic feasible solution $`\hat{w}`$ that satisfies the same linear constraints with minimal support.

### The recombination algorithm(s)
The process of applying a numerical algorithm which solves the *recombination problem*, as defined above, may be referred to as *recombination*, e.g., "we perform recombination to obtain a reduced measure...".

This library implements the deterministic 1-Tree and m-Tree measure reduction algorithms introduced by [Litterer and Lyons 2012](https://doi.org/10.1214/11-AAP786) and [Tchernychova 2016](https://ora.ox.ac.uk/objects/uuid:a3a10980-d35d-467b-b3c0-d10d2e491f2d) respectively, in addition to providing a new generalization referred to as $\alpha$--Tree measure reduction. The tree reduction algorithms can be accessed by setting the `solver` keyword argument of `recombine(nodes, weights, solver=...)` to the following:
- **1-Tree:** `Caratheodory(...)` or `TreeReduce(Caratheodory(...), tree_reduction_factor=n/(m+1))`,
- **m-Tree reduction:** `TreeReduce(Caratheodory(...), tree_reduction_factor=2)`,
- **$\alpha$-Tree reduction (default):** `TreeReduce(Caratheodory(...), tree_reduction_factor=1+alpha/(m+1))`.

Notice that 1-Tree and m-Tree reduction are special cases of $\alpha$-Tree, where $\alpha = n - (m + 1)$ and $\alpha = m + 1$ respectively.

The randomized and hybrid algorithms introduced by [Cossentino et al 2020](https://arxiv.org/abs/2006.01757) are not yet implemented in this library; a JAX-free implementation can be found [here](https://github.com/FraCose/Recombination_Random_Algos).
Please open an issue if there is another recombination-like algorithm you would like to see implemented in baryx.

## Notice
Baryx loosely forks and extends the recombination algorithms implemented in the [Coreax](https://github.com/gchq/coreax) coreset library (licensed under [Apache-2.0](https://github.com/gchq/coreax?tab=Apache-2.0-1-ov-file#readme)). Baryx provides a variety of algorithmic, numerical, and quality-of-life improvements in addition to having a simpler user API, and fewer dependencies.

## See also
Some other recombination related works that you may find interesting:

**Packages**
- [PyRecombine](https://github.com/datasig-ac-uk/PyRecombine/) - the original (CPU only) CPP implementation with python bindings.
- [RoughPy](https://github.com/datasig-ac-uk/RoughPy/) - a toolbox for working with streaming data as rough paths in Python.
