Metadata-Version: 2.4
Name: sparse-approx-gsm
Version: 1.1.0
Summary: Sparse Approximation by the Generalized Soft-Min Penalty
Keywords: sparse-approximation,compressed-sensing,CUDA,optimization,trimmed-lasso
Author: Tal Amir, Ronen Basri, Boaz Nadler
License-Expression: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: C++
Classifier: Operating System :: POSIX :: Linux
Project-URL: Homepage, https://pypi.org/project/sparse-approx-gsm/
Requires-Python: >=3.8
Requires-Dist: numpy>=1.21
Provides-Extra: gpu
Requires-Dist: cupy-cuda12x; extra == "gpu"
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: scipy; extra == "dev"
Requires-Dist: mpmath; extra == "dev"
Description-Content-Type: text/markdown

# sparse-approx-gsm

Python implementation, with an optional CUDA extension, of **Sparse Approximation by the Generalized Soft-Min (GSM) Penalty**, a sparse-recovery solver based on the Trimmed Lasso.

Solves: **min ||Ax - y||_2  subject to  ||x||_0 <= k**

> Tal Amir, Ronen Basri, Boaz Nadler — *The Trimmed Lasso: Sparse Recovery Guarantees and Practical Optimization by the Generalized Soft-Min Penalty*. Weizmann Institute of Science.
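
For intuition, the Trimmed Lasso penalty that the solver builds on is the sum of the `d - k` smallest magnitudes of `x`, so it is zero exactly when `x` has at most `k` nonzeros. The following is a minimal NumPy sketch of that penalty. The helper names `trimmed_lasso` and `project_k_sparse` are illustrative and are not part of this package's API.

```python
import numpy as np


def trimmed_lasso(x, k):
    """Sum of the d - k smallest magnitudes of x (zero iff x is k-sparse)."""
    mags = np.sort(np.abs(np.asarray(x, dtype=float)))
    n_small = max(mags.size - k, 0)
    return float(mags[:n_small].sum())


def project_k_sparse(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    if k > 0:
        keep = np.argsort(np.abs(x))[-k:]
        out[keep] = x[keep]
    return out


x = np.array([3.0, -1.0, 0.5, 0.0])
penalty = trimmed_lasso(x, 2)                          # 0.5 + 0.0
penalty_sparse = trimmed_lasso(project_k_sparse(x, 2), 2)
```

A vector with a small penalty is close to being `k`-sparse, which is why driving this penalty to zero enforces the constraint `||x||_0 <= k`.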

## Install

**From PyPI:**
```bash
pip install sparse-approx-gsm
```

**From source (with CUDA extension):**
```bash
# Build requirements must be installed up front, because build isolation is disabled below
pip install pybind11 scikit-build-core cython numpy
# Set NVCC to the path of your CUDA compiler
NVCC=/path/to/nvcc
PYBIND11_DIR=$(python -c "import pybind11; print(pybind11.get_cmake_dir())")
CMAKE_ARGS="-DCMAKE_CUDA_COMPILER=$NVCC -Dpybind11_DIR=$PYBIND11_DIR -DCMAKE_CXX_COMPILER=/usr/bin/g++" \
  pip install --no-build-isolation . -v
```

**From source (CPU only):**
```bash
# --no-build-isolation expects the build requirements to be installed already
pip install --no-build-isolation .
```

## Usage

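```python
import numpy as np
from sparse_approx_gsm import sparse_approx_gsm, SparseApproxGSM, gsm

# Illustrative data (assumed shapes: A is n-by-d, y has length n)
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))
k = 5
y = A[:, :k] @ rng.standard_normal(k)

# High-level solver: min ||Ax - y||_2  s.t.  ||x||_0 <= k
x = sparse_approx_gsm(A, y, k, profile='normal')

# GSM function (mu) and its gradient (theta); z and gamma are illustrative values
z, gamma = np.abs(rng.standard_normal(200)), 1.0
mu, theta = gsm(z, k, gamma)
```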
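
To make the meaning of `mu` and `theta` concrete, here is a brute-force reference for the generalized soft-min as it is commonly defined: `mu = (1/gamma) * log(mean over all size-k subsets S of exp(gamma * sum(z[S])))`, with `theta` its gradient with respect to `z`. This is a sketch under that assumed definition. The function name `gsm_bruteforce` is hypothetical, the package's `gsm()` may use a different sign or normalization convention for `gamma`, and the enumeration is only feasible for small `d`.

```python
import itertools
import math

import numpy as np


def gsm_bruteforce(z, k, gamma):
    """Reference GSM value and gradient by enumerating all size-k subsets.

    Assumes gamma != 0. Cost grows as C(d, k), so use only for tiny inputs.
    """
    z = np.asarray(z, dtype=float)
    d = z.size
    subsets = list(itertools.combinations(range(d), k))
    sums = np.array([z[list(s)].sum() for s in subsets])

    # Log-sum-exp with the maximum subtracted for numerical stability.
    a = gamma * sums
    m = a.max()
    w = np.exp(a - m)
    mu = (m + math.log(w.mean())) / gamma

    # Gradient: theta[i] is the total softmax weight of subsets containing i.
    p = w / w.sum()
    theta = np.zeros(d)
    for s, ps in zip(subsets, p):
        theta[list(s)] += ps
    return mu, theta


z = np.array([3.0, 1.0, 2.0, 0.5])
mu_min, theta_min = gsm_bruteforce(z, 2, -50.0)  # close to the sum of the 2 smallest
mu_max, theta_max = gsm_bruteforce(z, 2, 50.0)   # close to the sum of the 2 largest
```

Under this definition, the entries of `theta` lie in `[0, 1]` and sum to `k`, so `theta` acts as a soft indicator of which `k` entries are selected.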

## GPU Acceleration

Install the CuPy build that matches your CUDA version (e.g. `pip install cupy-cuda12x`, or `pip install "sparse-approx-gsm[gpu]"`, which pulls in `cupy-cuda12x`), then pass
a CuPy array to `gsm()` or use `SparseApproxGSM(..., use_gpu=True)`.

When `use_gpu=True`, arrays are cached on the GPU and kept there throughout the solver loop, minimizing CPU-GPU data transfers.
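
The caching pattern described above can be sketched as follows: upload the fixed arrays once, keep the iterate on the device inside the loop, and copy the result back once at the end. This is an illustration of the general pattern, not the package's internal code. The helper names are hypothetical, plain gradient descent on a least-squares objective stands in for the real solver loop, and `G = A^T A` is an assumption about what the cached Gram matrix holds. The sketch falls back to NumPy when CuPy or a CUDA device is unavailable.

```python
import numpy as np


def get_array_module():
    """Return CuPy if it imports and a CUDA device is visible, else NumPy."""
    try:
        import cupy as cp
        if cp.cuda.runtime.getDeviceCount() > 0:
            return cp
    except Exception:
        pass
    return np


def to_host(a):
    """Copy a CuPy array back to the host; NumPy arrays pass through."""
    return a.get() if hasattr(a, "get") else a


def least_squares_on_device(A, y, n_iter=500):
    xp = get_array_module()
    # Step size 1/L, computed once on the host from the spectral norm of A.
    step = 1.0 / float(np.linalg.norm(A, 2) ** 2)

    # Upload the immutable arrays once and derive G and Aty on the device.
    A_d = xp.asarray(A)
    y_d = xp.asarray(y)
    G = A_d.T @ A_d
    Aty = A_d.T @ y_d

    x = xp.zeros(A.shape[1], dtype=A_d.dtype)
    for _ in range(n_iter):
        # Every operand already lives on the device: no transfers in the loop.
        x = x - step * (G @ x - Aty)
    return to_host(x)  # single device-to-host copy


rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
y = rng.standard_normal(50)
x_hat = least_squares_on_device(A, y)
```

Precomputing `G` and `Aty` means each iteration touches only `d`-by-`d` and length-`d` arrays, which is what makes keeping them resident on the GPU worthwhile.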

## What's New in v1.1.0

- **GPU transfer optimization**: Immutable arrays (A, G, y, Aty) cached on GPU; solution vector and weights stay on GPU through the MM loop
- **Redundant computation elimination**: GSM mu is cached from weight computation, avoiding duplicate GPU kernel launches
- **Cython acceleration**: Fused FISTA kernels and OpenMP-parallelized utilities for the CPU fallback path
- **Module renamed**: Import path is now `sparse_approx_gsm` (was `gsm_python`)
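
For readers unfamiliar with FISTA, the sketch below shows a textbook FISTA iteration for a weighted l1-regularized least-squares problem in plain NumPy. It is meant only to illustrate the kind of inner loop that the Cython kernels accelerate. The objective, the function names, and the parameters `w` and `lam` are assumptions made for illustration and are not taken from the package source.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * |.|."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def fista_weighted_l1(A, y, w, lam, n_iter=200):
    """Minimize 0.5 * ||Ax - y||^2 + lam * sum_i w[i] * |x[i]| with FISTA."""
    G = A.T @ A
    Aty = A.T @ y
    L = np.linalg.norm(A, 2) ** 2  # Lipschitz constant of the smooth gradient

    x = np.zeros(A.shape[1])
    v = x.copy()                   # extrapolated point
    t = 1.0
    for _ in range(n_iter):
        grad = G @ v - Aty
        x_new = soft_threshold(v - grad / L, lam * w / L)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        v = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum step
        x, t = x_new, t_new
    return x


# With A = I the minimizer is the soft-thresholded y, which gives a quick check.
y_demo = np.array([3.0, -0.5, 1.5, 0.0])
x_demo = fista_weighted_l1(np.eye(4), y_demo, np.ones(4), lam=1.0)
```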
