Bayesian Optimization (scikit-optimize)

Sequential Model-Based Optimization

Bayesian Optimization fits a Gaussian-process surrogate to the observed (x, f(x)) pairs and proposes the next evaluation point by maximising an acquisition function. HumpDay's BayesianOpt uses an RBF kernel and the Expected Improvement acquisition. After the GP-EI loop exhausts its share of the budget, HumpDay finishes with a short L-BFGS-B polish from the best observed point — the same pattern scikit-optimize's gp_minimize uses, and the reason HumpDay's BayesianOpt reaches machine precision on smooth basins where the GP itself would plateau at the kernel's noise floor.

HumpDay's BayesianOpt at a glance

  • Initial design: n_initial = min(5, max(2, n_dim)) uniform random samples.
  • Surrogate: Gaussian process with an RBF kernel; jitter on the kernel matrix for numerical stability.
  • Acquisition: Expected Improvement, optimised by random sampling + a short local refinement of the best candidate.
  • Polish: reserve min(20·n_dim, n_trials/2) evaluations for an L-BFGS-B polish from the best observed point at the end.

Interactive 3D Visualization

See Bayesian Optimization in action on 3D optimization surfaces:

Loading 3D visualization...

Requires WebGL support

Instructions: Choose a test function and algorithm, then click Start to watch the step-by-step optimization process.

Implementation Details

Component Details Links
Algorithmic Foundation Sequential Model-Based Optimization
Gaussian Process regression + acquisition functions
Expected Improvement, Upper Confidence Bound
Developed: 1960s-70s (Kushner, Močkus)
Tutorial Paper
Reference Implementation scikit-optimize (skopt)
Sequential model-based optimization library
Gaussian Process surrogate with various acquisition functions
Package: scikit-optimize
Skopt Docs GitHub
HumpDay Python HumpDay BayesianOpt
RBF-kernel GP surrogate + Expected Improvement acquisition + L-BFGS-B polish reserved at the end.
No required dependencies (numpy is used transparently when present for kernel construction; a pure-Python loop path covers the no-numpy install).
File: humpday/optimizers/evolutionary_algorithms.py
Source
HumpDay JavaScript Browser BayesianOpt
Mirrors the Python port (RBF + EI + L-BFGS-B polish). Used in the contest and the in-browser visualizer.
Class: BayesianOpt
JS Port

Performance characteristics

  • Best for: Expensive black-box objectives where every evaluation matters. The GP-EI loop is sample-efficient; the L-BFGS-B polish closes the residual on smooth basins to machine precision.
  • Worst for: High dimensions — the GP kernel cost grows as O(n_samples3) and the RBF surrogate degenerates past a dozen or so dimensions. Like grid search, BayesianOpt is impractical past n_dim ≈ 10.
  • Per outer iteration: rebuild the kernel matrix, optimise EI, evaluate one point. Then at the end, min(20·n_dim, n_trials/2) evaluations on an L-BFGS-B polish.
  • Relative to scikit-optimize gp_minimize at n_trials=200, n_dim=2: HumpDay wins by ~1021× on sphere (machine precision vs ref's 1e-8), and matches on Rosenbrock. The polish is the difference. See SOTA status for the live numbers.

Educational Resources

Core Components

  • Surrogate Model: Gaussian Process regression with uncertainty
  • Acquisition Function: Expected Improvement, Upper Confidence Bound, Entropy Search
  • Kernel Functions: RBF, Matérn, polynomial kernels
  • Hyperparameter Learning: Maximum likelihood estimation
  • Multi-Point Selection: Batch acquisition for parallel evaluation

Acquisition Functions

  • Expected Improvement (EI): E[max(f(x) - f_best, 0)]
  • Upper Confidence Bound (UCB): μ(x) + β*σ(x)
  • Probability of Improvement (PI): P[f(x) > f_best + ξ]
  • Entropy Search: Information gain about global optimum
  • Knowledge Gradient: Expected value of perfect information

Applications

  • Hyperparameter Optimization: Machine learning model tuning
  • Experimental Design: Scientific experiments and A/B testing
  • Engineering Optimization: CAD parameter tuning
  • Drug Discovery: Molecular property optimization
  • Robotics: Control parameter learning
  • AutoML: Architecture search and pipeline optimization

Advanced Techniques

  • Multi-Task GP: Transfer learning between related problems
  • Multi-Objective BO: Pareto optimization with GP models
  • Constrained BO: Handling feasibility constraints
  • High-Dimensional BO: Random embeddings, additive models
  • Distributed BO: Asynchronous parallel optimization

Mathematical Foundation

  • Gaussian Processes: Non-parametric Bayesian regression
  • Regret Bounds: Theoretical convergence guarantees
  • Information Theory: Mutual information and entropy
  • Decision Theory: Optimal stopping and value of information