Rechenberg's (1+1)-ES is the canonical
Evolution Strategy from Rechenberg (1973). A single parent
proposes one Gaussian-perturbed offspring per generation;
the offspring is accepted if it beats the parent. The step
size σ is adapted by the 1/5-success-rule: grow σ by 1.5×
when more than 1/5 of recent trials improved, shrink by
1.5⁻¹ otherwise. Was formerly named
AdaptiveRandomSearch in HumpDay; the new name
matches the canonical literature.
Interactive 3D Visualization
See Rechenberg (1+1)-ES in action on 3D optimization surfaces:
Loading 3D visualization...
Requires WebGL support
Instructions: Choose a test function and algorithm, then click Start to watch the step-by-step optimization process.
Implementation Details
| Component | Details | Links |
|---|---|---|
| Algorithm Concept |
Enhanced Random Search Random sampling with adaptive parameter adjustment Learning from optimization history Developed: Various researchers, 1970s-80s |
ARS Survey |
| Reference Implementation |
Research Literature Implementations Various academic implementations and variants No single standard package due to adaptation diversity Reference: Optimization literature and custom implementations |
GitHub Examples |
| Humpday Python Implementation |
HumpDay Rechenberg Custom implementation with success-rate adaptation Dynamic step size and region focusing File: humpday/optimizers/search_algorithms.py |
Source |
| Humpday JavaScript |
Browser Implementation Pure JavaScript adaptive random search Success tracking and parameter adaptation Class: Rechenberg |
JS Implementation |
Performance characteristics
- Best for: Smooth basins where the 1/5 rule keeps σ near the right exploration scale.
- Worst for: Multimodal landscapes (Ackley). Rechenberg has no escape from a local basin; once trapped, it refines into the wrong optimum.
- Step distribution: componentwise Gaussian. Each component of the offspring is drawn from N(0, σ2), so the step magnitude varies stochastically (mean ≈ σ·√n with heavier tails). The previous HumpDay variant projected to a unit direction, which damped the stochastic spread; the current implementation matches the canonical (1+1)-ES.
- Dimensions: Inexpensive per iteration; reasonable into the dozens.
- Relative to a canonical (1+1)-ES 1/5-rule reference at
n_trials=200, n_dim=2: algorithmically identical to the reference. Per-cell ratios drift with the seed (both implementations trap on Ackley about half the time at 4-seed samples). See SOTA status for the live numbers.
Educational Resources
- Adaptive Random Search Methods (Spall)
- Random Search Wikipedia
- Random Search for Hyper-Parameter Optimization
- Simple random search provides competitive approach
Adaptation Mechanisms
- Step Size Adaptation: σ(t+1) = σ(t) × factor based on success rate
- Success Rate Tracking: Monitor fraction of improvements over recent history
- Region Focusing: Bias sampling toward successful regions
- Direction Learning: Learn promising search directions
- Population Size: Adapt number of samples per iteration
- Distribution Shape: Modify sampling distribution parameters
Mathematical Foundation
- Success Rate: p_s(t) = (# improvements) / (# recent samples)
- Step Size Update: σ(t+1) = σ(t) × exp(α × (p_s(t) - p_target))
- Sampling: x(t+1) ~ N(x_best, σ(t)²I) or other adaptive distributions
- Region Weight: w_i ∝ exp(β × improvement_i) for region focusing
️ Key Parameters
- Initial Step Size: σ₀ typically 0.1-0.5 of search range
- Target Success Rate: p_target usually 0.1-0.3
- Adaptation Rate: α controls speed of step size changes
- Memory Window: How many recent samples to consider
- Min/Max Step Size: Bounds to prevent degenerate behavior
ARS Algorithm Pseudocode
function adaptiveRandomSearch(f, x0, maxEvals, targetSuccessRate):
x_best = x0
f_best = f(x0)
stepSize = 0.3 // Initial step size
successCount = 0
windowSize = 50
evaluations = []
for eval in 1 to maxEvals:
// Generate candidate with current step size
x_candidate = x_best + stepSize * randomGaussian(n)
x_candidate = clip(x_candidate, [0,1]) // Bounds
f_candidate = f(x_candidate)
// Track success/failure
improved = (f_candidate < f_best)
evaluations.append(improved)
if improved:
x_best = x_candidate
f_best = f_candidate
successCount += 1
// Adapt step size every windowSize evaluations
if eval % windowSize == 0:
successRate = successCount / windowSize
if successRate > targetSuccessRate:
stepSize *= 1.2 // Increase exploration
else:
stepSize *= 0.8 // Focus search
stepSize = clip(stepSize, [0.01, 1.0])
successCount = 0
return x_best, f_best
ARS Variants
- 1/5-th Rule ARS: Adapt step size based on success rate ≈ 1/5
- Multi-Point ARS: Generate multiple candidates per iteration
- Directional ARS: Learn and exploit successful search directions
- Region-Based ARS: Partition search space and adapt per region
- Covariance ARS: Adapt full covariance matrix like CMA-ES
- Hybrid ARS: Combine with local search or other methods
Adaptation Strategies
- Success-Based: Adapt based on improvement frequency
- Progress-Based: Adapt based on magnitude of improvements
- Diversity-Based: Maintain population diversity
- Distance-Based: Adapt based on distance between improvements
- Gradient-Free: Estimate search direction without gradients
Applications
- Hyperparameter Optimization: Machine learning model tuning
- Black-Box Optimization: When gradient information unavailable
- Noisy Functions: Robust to evaluation noise
- Expensive Functions: Efficient use of limited evaluations
- Multi-Modal Functions: Balance exploration and exploitation
- Simulation Optimization: Parameter tuning for simulations
Advantages of ARS
- Simplicity: Easy to implement and understand
- Robustness: Inherits random search reliability
- Efficiency: Better than pure random search
- Parameter-Light: Few hyperparameters to tune
- Parallelizable: Easy to distribute evaluations
- No Assumptions: Works on any objective function
️ Limitations
- Slower than Specialized Methods: Less efficient than problem-specific algorithms
- Parameter Sensitivity: Performance depends on adaptation parameters
- High Dimensions: Still suffers from curse of dimensionality
- Local Structure: Doesn't exploit smooth function structure
- Memory Requirements: Needs to track success history
Relationship to Other Methods
- vs Random Search: Adaptive step size and region focusing
- vs Evolution Strategies: Simpler adaptation, no population
- vs Simulated Annealing: No temperature schedule, success-based adaptation
- vs CMA-ES: Simpler covariance adaptation
- vs Bayesian Optimization: No surrogate model, direct adaptation