Metadata-Version: 2.1
Name: cpsat-autotune
Version: 0.0.3
Summary: A tool to automatically tune the hyperparameters of the OR-Tools' CP-SAT solver.
Author-email: Dominik Krupke <krupked@gmail.com>
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Development Status :: 3 - Alpha
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: optuna
Requires-Dist: ortools
Requires-Dist: numpy
Requires-Dist: scipy

# cpsat-autotune: A Hyperparameter Tuning Tool for Google's OR-Tools CP-SAT Solver

**cpsat-autotune** is a Python library designed to optimize the hyperparameters
of Google's OR-Tools CP-SAT solver for specific problem instances. While CP-SAT
is already highly optimized for a broad range of generic problems, fine-tuning
its parameters for particular problem sets can yield significant performance
gains. This tool leverages the `optuna` optimization library to systematically
explore and suggest optimal hyperparameter configurations tailored to your
needs.

Also check out our other projects:

- [The CP-SAT Primer](https://d-krupke.github.io/cpsat-primer/): A comprehensive
  guide to Google's OR-Tools CP-SAT solver.
- [CP-SAT Log Analyzer](https://github.com/d-krupke/CP-SAT-Log-Analyzer): A tool
  to analyze the logs generated by Google's OR-Tools CP-SAT solver.

## Use Case

**cpsat-autotune** is not a universal solution that guarantees a performance
boost for all uses of the CP-SAT solver. Instead, it is specifically designed to
enhance solver efficiency in targeted scenarios, particularly within the context
of Adaptive Large Neighborhood Search (ALNS).

### Adaptive Large Neighborhood Search (ALNS) Context

In ALNS, the CP-SAT solver is frequently invoked with a strict time limit to
solve similar problem instances as part of a larger iterative optimization
process. The goal is to incrementally improve a solution by exploring different
neighborhoods of the problem space. In this context, achieving even modest
performance gains on average instances can significantly impact the overall
efficiency of the search process, even if it results in occasional performance
drops on outlier instances.

### Benefits of Tuning in ALNS

- **Average Performance Gains:** By tuning the solver’s hyperparameters to
  optimize performance on typical instances, **cpsat-autotune** can reduce the
  average time per iteration. This is particularly valuable in ALNS, where a
  large number of solver calls are made.
- **Tolerance for Outliers:** In an ALNS framework, occasional slower iterations
  due to deteriorated performance on outlier instances are generally acceptable,
  as the search process can recover in subsequent iterations. Thus, the focus
  can be on enhancing the solver's average performance rather than ensuring
  consistent performance across all instances.
- **Augmented Solver Strategies:** Instead of completely replacing the CP-SAT
  solver with a single tuned configuration, **cpsat-autotune** allows you to
  tune hyperparameters for one or more specific instance sets and incorporate
  these as additional strategies within ALNS. This means you can maintain the
  default CP-SAT parameters while augmenting the solver's capability with
  tailored configurations. ALNS can then automatically select the most effective
  strategy for each iteration, leveraging the diverse set of tuned
  hyperparameters alongside the default configuration for optimal performance.

## Installation

You can install **cpsat-autotune** using `pip`:

```bash
pip install -U cpsat-autotune
```

Make sure to update the package before every use to ensure you have the latest
version, as this project is still a prototype.

## Basic Usage

Here is a basic example of how to use **cpsat-autotune** to optimize the time
required to find an optimal solution for a CP-SAT model:

```python
from cpsat_autotune import import_model, tune_time_to_optimal

# Load your model from a protobuf file
model = import_model("models/medium_hg.pb")

# Tune the model to minimize the time to reach an optimal solution
best = tune_time_to_optimal(
    model,
    timelimit_in_s=6,  # Enter a time limit slightly above what the solver with default parameters needs
    n_samples_per_param=5,  # The higher the number, the more accurate the tuning, but the longer it takes
    max_samples_per_param=10,  # After this number of samples for a parameter set, we just take the average from the history
    n_trials=50,  # Number of trials to run with Optuna
)
```

Sample output:

```plaintext
============================================================
                 OPTIMIZED PARAMETERS
============================================================

1. max_presolve_iterations: 1
   Contribution: 9.71%
   Default Value: 3
   Description:
       	Sets the maximum number of iterations that the presolve phase will execute.
		Presolve simplifies the problem by eliminating redundant constraints and variables before the main search begins.
		More iterations can lead to a more simplified problem but at the cost of longer presolve times.

2. cp_model_probing_level: 0
   Contribution: 51.99%
   Default Value: 2
   Description:
       	Defines the intensity of probing during presolve, where variables are temporarily fixed to infer more information about the problem.
		Higher levels of probing can result in a more simplified problem but require more computation time during presolve.

3. binary_minimization_algorithm: 2
   Contribution: 24.42%
   Default Value: 1
   Description:
       	Specifies the algorithm used for binary clause minimization during conflict analysis. The options are:
		- `0` NO_BINARY_MINIMIZATION.
		- `1` BINARY_MINIMIZATION_FIRST
		- `2` BINARY_MINIMIZATION_WITH_REACHABILITY
		- `3` EXPERIMENTAL_BINARY_MINIMIZATION
		- `4` BINARY_MINIMIZATION_FIRST_WITH_TRANSITIVE_REDUCTION

4. max_all_diff_cut_size: 32
   Contribution: 13.88%
   Default Value: 64
   Description:
       	Limits the size of "all different" constraints used when generating cuts.
		All-different constraints ensure that a set of variables takes distinct values. This parameter controls the balance between reducing the search space and the computational cost of generating cuts.
------------------------------------------------------------
Default Metric Value: 1.5331676
Optimized Metric Value: 0.544175
------------------------------------------------------------

============================================================
*** WARNING ***
============================================================
The optimized parameters listed above were obtained based on a sampling approach and may not fully capture the complexities of the entire problem space. While statistical reasoning has been applied, these results should be considered as a suggestion for further evaluation rather than definitive settings.
It is strongly recommended to validate these parameters in larger, more comprehensive experiments before adopting them in critical applications.
============================================================
```

## Available Tuning Methods

**cpsat-autotune** provides two primary methods for tuning:

### 1. `tune_time_to_optimal`

This method tunes the CP-SAT solver's hyperparameters to minimize the time
required to find an optimal solution. This is useful if you need a guaranteed
solution without a fixed time limit.

#### Parameters:

- `model`: The CP-SAT model you wish to tune.
- `timelimit_in_s`: Time limit for each solver operation in seconds.
- `opt_gap`: (Optional) The relative optimality gap to determine when a solution
  is considered optimal. Default is `0.0` (exact optimality).
- `n_samples_per_param`: (Optional) Number of samples per parameter in each
  trial. Default is `10`.
- `max_samples_per_param`: (Optional) Maximum number of samples per parameter
  before using the mean to improve runtime. Default is `30`.
- `n_trials`: (Optional) Number of trials to run. Default is `100`.

### 2. `tune_for_quality_within_timelimit`

This method tunes hyperparameters to maximize or minimize the objective value
within a specified time limit. This is useful when you need to find a good
solution within a fixed time frame, but do not require any guarantees.

#### Parameters:

- `model`: The CP-SAT model to be tuned.
- `timelimit_in_s`: Time limit for each solver operation in seconds.
- `obj_for_timeout`: Objective value applied if the solver times out.
- `direction`: Specify 'maximize' or 'minimize' depending on whether you want to
  optimize for the best or worst solution quality.
- `n_samples_per_param`: (Optional) Number of samples per parameter in each
  trial. Default is `10`.
- `max_samples_per_param`: (Optional) Maximum number of samples per parameter
  before using the mean to improve runtime. Default is `30`.
- `n_trials`: (Optional) Number of trials to run. Default is `100`.

## The Importance of Avoiding Overfitting

While tuning hyperparameters can improve solver performance for specific
instances, it also increases the risk of overfitting. Overfitting occurs when
the solver's performance is significantly improved on the training set of
problems but deteriorates on new, slightly different instances. For example,
tuning may reduce solve times on a set of similar problems but could result in
excessive solve times or failure on problems that deviate from the training set.

## How does the Tuning Work?

**cpsat-autotune** uses the `optuna` library to perform hyperparameter tuning
on a preselected set of parameters. The output of optuna is then further refined
and the significance of certain parameters is evaluated. Based on the assumption
that the default parameters are already well-tuned for a broad range of problems,
**cpsat-autotune** identifies the most significant changes to the default
configuration and suggests these as potential improvements. It does take a few
shortcuts to speed things up, while collecting more samples for important
values.

### Recommendations:

- **Robust Performance:** If consistent performance across a variety of
  instances is crucial, stick with the default CP-SAT parameters.
- **Targeted Performance:** If you are solving a large number of similar
  problems and can tolerate potential performance drops on outliers, use the
  suggested parameters after careful consideration.

## Contributing

Contributions are welcome. Please ensure that your code adheres to the project's
style guidelines and includes appropriate tests.

## License

This project is licensed under the MIT License.
