HumpDay's
LBFGSB is a faithful pure-Python port of scipy's _minimize_lbfgsb with four elements that distinguish it from a textbook two-loop L-BFGS:
- Bound-aware direction projection — direction components that would push x past an active bound are zeroed (the simple-bounds reduction of the Cauchy-point step).
- Projected-gradient sup-norm convergence test (
pgtol). - f-tolerance termination — stop when
(f_old − f_new) < factr·eps_mach·max(|f|, 1), scipy's headline criterion. - Feasibility-capped initial step — cap step length so
x + step·directionstays in[0,1]nbefore backtracking.
f(x), never an analytical gradient. Where scipy's L-BFGS-B uses analytic gradients (or a tighter Fortran-backed FD), HumpDay computes a central finite-difference gradient (2·n_dim evaluations) per L-BFGS iteration. That's the only structural deviation from the reference algorithm.
Interactive 3D Visualization
See L-BFGS-B in action on 3D optimization surfaces:
Loading 3D visualization...
Requires WebGL support
Instructions: Choose a test function and algorithm, then click Start to watch the step-by-step optimization process.
Implementation Details
| Component | Details | Links |
|---|---|---|
| Original Algorithm |
Byrd, Lu, Nocedal & Zhu Limited-memory BFGS with bound constraints Quasi-Newton method for large-scale problems Published: 1995 |
Original Paper |
| Reference Implementation |
SciPy optimize.minimize Method: 'L-BFGS-B' scipy.optimize.minimize(method='L-BFGS-B') Package: scipy |
SciPy Docs Source |
| HumpDay Python |
HumpDay LBFGSBPure-Python port of scipy's _minimize_lbfgsb: two-loop recursion + bound-aware direction projection + projected-gradient pgtol + factr·eps_mach termination + feasibility-capped Armijo line search.FD gradient (2·n_dim evals per iteration); memory m = min(10, n_dim).File: humpday/optimizers/scipy_algorithms.py
|
Source |
| HumpDay JavaScript |
Browser LBFGSBMirrors the Python port line-for-line. Also shared via BaseOptimizer._lbfgsPolish as the polish stage for DE / SA / BayesianOpt / PSO / CMA-ES / FA.Class: LBFGSB
|
JS Port |
Performance characteristics
- Best for: Smooth, near-quadratic basins where the L-BFGS curvature estimate is informative. Routinely the polish stage of choice at the end of population-based methods.
- Worst for: Multimodal landscapes — L-BFGS-B is local, not global. The bound-aware direction projection keeps it from wandering off the cube but it has no escape from a wrong basin.
- Per iteration: central FD gradient (2·n_dim evaluations) + one or more line-search trials. Memory
m = min(10, n_dim). - Relative to scipy.optimize L-BFGS-B at
n_trials=200, n_dim=2: ties on sphere and Ackley; a small residual on Rosenbrock from the FD-gradient cost. See SOTA status for the live numbers. - Reused as a polish stage by DE, SA, BayesianOpt, PSO, CMA-ES, and FireflyAlgorithm via
BaseOptimizer._lbfgs_polish.
Educational Resources
Algorithm Components
- Limited Memory: Store only m recent (s,y) pairs instead of full Hessian
- BFGS Update: Quasi-Newton Hessian approximation using gradients
- Two-Loop Recursion: Efficient computation of search direction
- Bound Constraints: Active set method for box constraints
- Line Search: Wolfe conditions for step length selection
Mathematical Foundation
- BFGS Formula: H_{k+1} = H_k + (1 + (y^T H_k y)/(s^T y)) * (s s^T)/(s^T y) - (s y^T H_k + H_k y s^T)/(s^T y)
- Two-Loop: q = ∇f_k, then backwards and forwards loops over stored vectors
- Initial Hessian: H_0 = γI where γ = (s^T y)/(y^T y)
- Bound Projection: Project gradient onto feasible cone
- Cauchy Point: Piecewise linear minimizer along projected gradient
️ Key Parameters
- Memory Size (m): Usually 5-20 (s,y) pairs stored
- Line Search: Wolfe conditions c1=1e-4, c2=0.9
- Gradient Tolerance: ||∇f|| < gtol (typically 1e-5)
- Function Tolerance: Relative change in f < ftol
- Max Iterations: Prevent infinite loops
L-BFGS vs L-BFGS-B
- L-BFGS: Unconstrained optimization only
- L-BFGS-B: Extends to simple bound constraints
- Active Set: L-BFGS-B identifies and handles active bounds
- Subspace: Optimization in subspace of free variables
- Complexity: Similar O(mn) complexity per iteration
Applications
- Machine Learning: Training large neural networks
- Parameter Estimation: Maximum likelihood estimation
- Signal Processing: Sparse signal reconstruction
- Engineering Design: Structural optimization with bounds
- Portfolio Optimization: Asset allocation with constraints
- Image Processing: Image restoration and deconvolution
Why L-BFGS-B is Popular
- Memory Efficient: O(mn) storage vs O(n²) for full Newton
- Fast Convergence: Superlinear on smooth problems
- Handles Bounds: Natural box constraint support
- Robust Implementation: Well-tested in SciPy
- Scalable: Works well on high-dimensional problems
- Default Choice: Often first choice for smooth bounded optimization