Metadata-Version: 2.4
Name: sparho
Version: 0.2.0
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Dist: numpy>=1.24
Requires-Dist: scipy>=1.10
Requires-Dist: scikit-learn>=1.3
Requires-Dist: libsvmdata>=0.4 ; extra == 'bench'
Requires-Dist: pandas>=2.1 ; extra == 'bench'
Requires-Dist: matplotlib>=3.8 ; extra == 'bench'
Requires-Dist: celer>=0.7 ; extra == 'celer'
Requires-Dist: pytest>=7 ; extra == 'dev'
Requires-Dist: pytest-cov ; extra == 'dev'
Requires-Dist: maturin>=1.5 ; extra == 'dev'
Requires-Dist: ruff ; extra == 'dev'
Requires-Dist: mypy ; extra == 'dev'
Requires-Dist: pre-commit ; extra == 'dev'
Requires-Dist: sphinx>=7.4 ; extra == 'docs'
Requires-Dist: furo>=2024.5 ; extra == 'docs'
Requires-Dist: sphinx-gallery>=0.16 ; extra == 'docs'
Requires-Dist: matplotlib>=3.8 ; extra == 'docs'
Requires-Dist: numpydoc>=1.7 ; extra == 'docs'
Requires-Dist: myst-parser>=4 ; extra == 'docs'
Provides-Extra: bench
Provides-Extra: celer
Provides-Extra: dev
Provides-Extra: docs
License-File: LICENSE
Summary: Nonsmooth bilevel hyperparameter optimization via implicit differentiation
Keywords: hyperparameter-optimization,bilevel,implicit-differentiation,sparse,lasso
Author-email: David Villacis <david@villacis.net>
License: BSD-3-Clause
Requires-Python: >=3.11
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Repository, https://github.com/dvillacis/sparho

# sparho

Nonsmooth bilevel hyperparameter optimization via implicit differentiation.

A maintained, performant successor to [`sparse-ho`](https://github.com/QB3/sparse-ho)
(ICML 2020, dormant since 2022). Tunes hyperparameters of non-smooth estimators
(Lasso, ElasticNet, weighted Lasso, sparse logistic regression) by computing
the hypergradient via implicit differentiation rather than grid/random search.

**Status:** alpha. Public API may change between minor versions until v1.0.

## Why

Hyperparameter optimization by implicit differentiation can be orders of
magnitude faster than `LassoCV`-style grid search when you have a held-out
criterion, but the existing libraries are dormant (`sparse-ho`) or no longer
maintained (`JAXopt`). `sparho` is a clean-break, scipy-stack-native
Rust+Python implementation built for the same target audience.
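
To make the idea concrete, here is a minimal NumPy-only sketch of an implicit-differentiation hypergradient for the Lasso with a held-out squared-error criterion, checked against a central finite difference. It is an illustration under stated assumptions, not sparho's implementation: `lasso_cd`, `held_out_loss` and `implicit_hypergrad` are hypothetical helper names, the data are synthetic, and the derivation assumes the support and signs of the solution stay fixed in a neighbourhood of `alpha`.

```python
import numpy as np


def lasso_cd(X, y, alpha, n_sweeps=300):
    """Coordinate descent for (1/2n)*||y - X b||^2 + alpha*||b||_1 (toy solver)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()  # invariant: resid == y - X @ beta
    for _ in range(n_sweeps):
        for j in range(p):
            resid += X[:, j] * beta[j]  # remove feature j's contribution
            rho = X[:, j] @ resid / n
            # Soft-thresholding update for coordinate j.
            beta[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            resid -= X[:, j] * beta[j]
    return beta


def held_out_loss(beta, X_val, y_val):
    """Held-out criterion C(beta) = (1/2m)*||X_val beta - y_val||^2."""
    r = X_val @ beta - y_val
    return 0.5 * np.mean(r ** 2)


def implicit_hypergrad(X, y, X_val, y_val, alpha):
    """dC/dalpha via implicit differentiation on the support of the solution."""
    n = X.shape[0]
    beta = lasso_cd(X, y, alpha)
    support = np.flatnonzero(beta)
    signs = np.sign(beta[support])
    XS = X[:, support]
    # Optimality on the support: (1/n) XS^T (XS b_S - y) + alpha * signs = 0.
    # Differentiating in alpha gives d b_S / d alpha = -n (XS^T XS)^{-1} signs.
    dbeta = -n * np.linalg.solve(XS.T @ XS, signs)
    grad_c = X_val[:, support].T @ (X_val @ beta - y_val) / X_val.shape[0]
    return float(grad_c @ dbeta)


# Synthetic data: 3 informative features out of 10.
rng = np.random.default_rng(0)
beta_true = np.zeros(10)
beta_true[:3] = [2.0, -1.5, 1.0]
X = rng.standard_normal((80, 10))
y = X @ beta_true + 0.1 * rng.standard_normal(80)
X_val = rng.standard_normal((40, 10))
y_val = X_val @ beta_true + 0.1 * rng.standard_normal(40)

alpha, eps = 0.1, 1e-5
hypergrad = implicit_hypergrad(X, y, X_val, y_val, alpha)
fd = (
    held_out_loss(lasso_cd(X, y, alpha + eps), X_val, y_val)
    - held_out_loss(lasso_cd(X, y, alpha - eps), X_val, y_val)
) / (2 * eps)
print(f"implicit: {hypergrad:.6f}  finite difference: {fd:.6f}")
```

The implicit route needs one inner solve plus one small linear system on the support, whereas a grid search pays one inner solve per grid point. That is the source of the speedups discussed in this README.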

## Perf summary

v0.2 numbers (HOAG + warm-start + celer inner solver), single-threaded on
an Apple M-series machine; see `benchmarks/README.md` for the methodology
and the v0.1 historical row.

| dataset | shape | sparho | `LassoCV` | notes |
|---|---|---|---|---|
| `breast-cancer` | 683×10 | 0.26 s | 0.02 s | overhead-bound; both finish instantly |
| `leukemia` | 38×7129 | **0.58 s** | 19.0 s | **32.8× faster** (was 1.3× at v0.1) |
| `rcv1.binary` | 20242×47236 sparse | 211 s, MSE **0.194** | 22.6 s, MSE 0.225 | **better MSE** (see below); 2× faster wall time than v0.1 |

What v0.2 delivers on top of v0.1:

- `hoag_search` outer loop (Pedregosa 2016): adaptive step from a
  Lipschitz proxy, `+C·tol` slack acceptance, exponentially-decreasing
  inner-tol schedule. Replaces `LineSearch`.
- Inner-solver warm-starting threaded through the `Solver` Protocol +
  every adapter + `CrossVal(warm_start=True)`.
- celer adapter, recommended for the high-d regime: it compounds the
  HOAG/warm-start win into a 32.8× speedup over `LassoCV` on `leukemia`
  and 1.65× faster than sklearn on `rcv1.binary`.
- Dense-matvec fix in `implicit_forward` (no `coo_tocsr` round-trip on
  dense designs): 8.4× faster hypergradient solve on `leukemia`.
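
The three outer-loop ingredients named in the first bullet can be sketched on a toy one-dimensional problem. This is a simplified illustration loosely modelled on that description, not sparho's `hoag_search` and not a faithful reproduction of Pedregosa's algorithm: the function name `hoag_like_search`, its parameters, the step-adaptation factors (0.9 and 2.0) and the toy objective are all assumptions made for the example.

```python
from typing import Callable


def hoag_like_search(
    value_and_grad: Callable[[float, float], tuple[float, float]],
    hp0: float,
    n_outer: int = 40,
    tol0: float = 1e-2,
    tol_decay: float = 0.5,
    slack_c: float = 1.0,
    lipschitz0: float = 1.0,
) -> tuple[float, list[tuple[float, float]]]:
    """Gradient descent with an adaptive 1/L step and a tolerance-based slack."""
    hp, tol, lipschitz = hp0, tol0, lipschitz0
    val, grad = value_and_grad(hp, tol)
    history = [(hp, val)]
    for _ in range(n_outer):
        step = 1.0 / lipschitz  # step size from the current Lipschitz proxy
        cand = hp - step * grad
        cand_val, cand_grad = value_and_grad(cand, tol)
        if cand_val <= val + slack_c * tol:
            # Accept: the criterion did not increase by more than C * tol.
            hp, val, grad = cand, cand_val, cand_grad
            lipschitz *= 0.9  # be slightly bolder next time
        else:
            lipschitz *= 2.0  # reject and shrink the step
        tol *= tol_decay  # exponentially decreasing inner tolerance
        history.append((hp, val))
    return hp, history


def toy_value_and_grad(hp: float, tol: float) -> tuple[float, float]:
    """Toy criterion (hp - 2)^2 whose gradient is inexact by an O(tol) bias,
    standing in for a hypergradient computed from an inexact inner solve."""
    value = (hp - 2.0) ** 2
    grad = 2.0 * (hp - 2.0) + 0.5 * tol
    return value, grad


best_hp, history = hoag_like_search(toy_value_and_grad, hp0=0.0)
print(f"best hp: {best_hp:.4f} after {len(history) - 1} outer iterations")
```

The slack term tolerates small increases caused by inexact inner solves early on, and it vanishes as the tolerance schedule tightens.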

Everything v0.1 delivered still holds: gradient-based outer loop with
full finite-difference (FD) parity, vector-α (`WeightedL1`) which
`LassoCV` cannot do, ridge-stabilized hypergradient-CG on ill-conditioned
sparse designs, clean Protocol-based API, mypy strict + clippy clean,
single wheel via maturin.

**The rcv1 story.** Implicit differentiation lets sparho search past
`LassoCV`'s discrete grid: on `rcv1.binary`, sparho's outer loop drives
α down to `2.1·10⁻⁵`, well below `LassoCV`'s grid floor of `1·10⁻⁴`,
and lands on a **14 % better held-out MSE** (0.194 vs 0.225). sparho's
wall time halved at v0.2 (433 s → 211 s) thanks to warm-start + celer;
the remaining gap to `LassoCV` (22.6 s) is irreducible inner-solver work
at very small α / large active set. sparho's win on this dataset is
*quality per outer iteration*, not raw wall time.

## Install (planned)

```bash
pip install sparho               # release wheel, no Rust toolchain needed
pip install "sparho[celer]"      # add celer as a fast Lasso adapter
```

## Quickstart (planned API)

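```python
import numpy as np
from sklearn.datasets import make_regression

from sparho import Problem, grad_search
from sparho.adapters import SklearnLasso
from sparho.criteria import held_out_mse
from sparho.optimizer import grad_descent
from sparho.hypergrad import implicit_forward

# Toy regression data and a train/validation split (illustrative only).
X, y = make_regression(n_samples=200, n_features=50, noise=1.0, random_state=0)
idx = np.random.default_rng(0).permutation(len(y))
idx_train, idx_val = idx[:150], idx[150:]

problem = Problem.lasso(X, y)
result = grad_search(
    problem,
    hp0=1e-3,                      # initial hyperparameter value
    solver=SklearnLasso(),         # inner solver adapter
    hypergrad=implicit_forward,    # hypergradient via implicit differentiation
    criterion=held_out_mse(idx_train, idx_val, X, y),
    optimizer=grad_descent(lr=1.0),
)
print(result.best_hyperparam, result.best_coef)
```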

See `docs/migration_from_sparse_ho.md` for translation from sparse-ho's API.

## Design

- **One `Problem` dataclass.** No abstract base class tower. Algorithms are
  free functions over `Problem`. Typing via `typing.Protocol`.
- **Implicit-only at v0.1.** ImplicitForward is the only hypergradient mode.
- **Sparse-X first class.** CSC iterated directly in Rust; no densification.
- **Rust kernels via PyO3 + maturin + ABI3.** Single binary wheel, no numba.
- **Clean break from sparse-ho.** Migration guide rather than compat shim.

## Roadmap

See `ROADMAP.md`. v0.1 shipped sklearn + celer + callable adapters with
verified correctness and dense-high-d parity vs `LassoCV`. v0.2 closes
the inner-solver warm-starting and hypergradient-stability gaps and
lands the HOAG outer loop: 32.8× over `LassoCV` on `leukemia`, and a 2×
wall-time reduction over v0.1 on `rcv1.binary`. v0.3 lands:

- the sklearn-ecosystem wrappers (`LassoHO`, `ElasticNetHO`,
  `LogisticRegressionHO`) so sparho slots into `Pipeline` / EconML / MLflow;
- the `SURE` / `GSURE` criterion for unsupervised tuning;
- a `MultiTaskLasso` / Group-L1 penalty;
- adapters for `skein` (nonconvex weighted/group) and `skglm` (MCP / SCAD /
  SLOPE / Group / Huber / Poisson).

See `docs/feature_research.md` for the 2026-05-20 landscape synthesis
behind these picks.

## License

BSD 3-Clause.

