MAX_TIME = 1
INIT_SIZE = 5
PREFIX = "10-river"
K = .1
11 river
Hyperparameter Tuning: HATR with Friedman Drift Data
11.1 Setup
Before we consider the detailed experimental setup, we select the parameters that affect run time, initial design size, and the experiment prefix.
Caution: Run time and initial design size should be increased for real experiments
- MAX_TIME is set to one minute for demonstration purposes. For real experiments, this should be increased to at least 1 hour.
- INIT_SIZE is set to 5 for demonstration purposes. For real experiments, this should be increased to at least 10.
- K is set to 0.1 for demonstration purposes. For real experiments, this should be increased to at least 1.
10-river_bartz09_2023-07-08_13-48-02
- This notebook exemplifies hyperparameter tuning with SPOT (spotPython and spotRiver).
- The hyperparameter software SPOT was developed in R (statistical programming language), see the Open Access book "Hyperparameter Tuning for Machine and Deep Learning with R - A Practical Guide", available here: https://link.springer.com/book/10.1007/978-981-19-5170-1.
- This notebook demonstrates hyperparameter tuning for `river`. It is based on the notebook "Incremental decision trees in river: the Hoeffding Tree case", see: https://riverml.xyz/0.15.0/recipes/on-hoeffding-trees/#42-regression-tree-splitters.
- Here we will use the river `HTR` and `HATR` functions as in "Incremental decision trees in river: the Hoeffding Tree case", see: https://riverml.xyz/0.15.0/recipes/on-hoeffding-trees/#42-regression-tree-splitters.
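Both `HTR` and `HATR` are trained and evaluated online, one sample at a time. The test-then-train (prequential) protocol they follow can be sketched without river; `RunningMean` below is a hypothetical stand-in that only mimics the `learn_one`/`predict_one` interface of the river regressors:

```python
# Prequential (test-then-train) loop: predict first, then learn.
# RunningMean is a hypothetical stand-in for HTR/HATR with the same interface.

class RunningMean:
    def __init__(self):
        self.n = 0
        self.total = 0.0

    def predict_one(self, x):
        return self.total / self.n if self.n else 0.0

    def learn_one(self, x, y):
        self.n += 1
        self.total += y
        return self

def prequential_mae(model, stream):
    abs_err, n = 0.0, 0
    for x, y in stream:
        y_pred = model.predict_one(x)  # test first ...
        model.learn_one(x, y)          # ... then train
        abs_err += abs(y - y_pred)
        n += 1
    return abs_err / n

stream = [({"x1": i}, float(i % 3)) for i in range(10)]
print(round(prequential_mae(RunningMean(), stream), 3))  # → 0.822
```

The same loop structure underlies the horizon-based evaluation used later in this chapter; only the model and the metric bookkeeping differ.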
11.2 Initialization of the Empty fun_control Dictionary
from spotPython.utils.init import fun_control_init
from spotPython.utils.file import get_experiment_name, get_spot_tensorboard_path

experiment_name = get_experiment_name(prefix=PREFIX)
fun_control = fun_control_init(
    spot_tensorboard_path=get_spot_tensorboard_path(experiment_name))
11.3 Load Data: The Friedman Drift Data
horizon = 7*24
k = K
n_total = int(k*100_000)
n_samples = n_total
p_1 = int(k*25_000)
p_2 = int(k*50_000)
position = (p_1, p_2)
n_train = 1_000
a = n_train + p_1 - 12
b = a + 12
- Since we also need a `river` version of the data below for plotting the model, the corresponding data set is generated here. Note: `spotRiver` uses the `train` and `test` data sets, while `river` uses the `X` and `y` data sets.
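For illustration, the tabular-to-stream direction can be sketched in a few lines, assuming only the column layout used in this notebook (x1 to x10 plus y). This is a hypothetical helper, not spotRiver's `convert_to_df`, which goes the opposite direction:

```python
# Convert tabular rows into river-style (x, y) pairs:
# x is a dict of named features, y the scalar target.
feature_names = [f"x{i}" for i in range(1, 11)]

def to_river_stream(rows):
    for row in rows:
        x = dict(zip(feature_names, row[:-1]))  # first 10 columns -> features
        y = row[-1]                             # last column -> target "y"
        yield x, y

rows = [list(range(1, 11)) + [42.0]]
x, y = next(to_river_stream(rows))
print(x["x1"], x["x10"], y)  # → 1 10 42.0
```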
from river.datasets import synth
import pandas as pd
dataset = synth.FriedmanDrift(
    drift_type='gra',
    position=position,
    seed=123
)
from spotRiver.utils.data_conversion import convert_to_df

target_column = "y"
df = convert_to_df(dataset, target_column=target_column, n_total=n_total)
# Add column names x1 until x10 to the first 10 columns of the dataframe
# and the column name y to the last column
df.columns = [f"x{i}" for i in range(1, 11)] + ["y"]
train = df[:n_train]
test = df[n_train:]
fun_control.update({"data": None,  # dataset,
                    "train": train,
                    "test": test,
                    "n_samples": n_samples,
                    "target_column": target_column})
11.4 Specification of the Preprocessing Model
from river import preprocessing

prep_model = preprocessing.StandardScaler()
fun_control.update({"prep_model": prep_model})
11.5 Select algorithm and core_model_hyper_dict
- The `river` model (`HATR`) is selected.
- Furthermore, the corresponding hyperparameters (incl. type information, names, and bounds) are selected, see: https://riverml.xyz/0.15.0/api/tree/HoeffdingTreeRegressor/.
- The corresponding hyperparameter dictionary is added to the `fun_control` dictionary.
- Alternatively, you can load a local hyper_dict. Simply set `river_hyper_dict.json` as the filename. If `filename` is set to `None`, the hyper_dict is loaded from the `spotRiver` package.
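To illustrate the structure such a hyper_dict encodes, here is a minimal hand-written fragment; the field values mirror the design table shown later in this section, but the fragment is illustrative and not the literal spotRiver file:

```python
# Illustrative hyper_dict fragment (structure assumed, not spotRiver's file):
# each hyperparameter carries its type, default, bounds, and transform.
hyper_dict = {
    "HoeffdingAdaptiveTreeRegressor": {
        "grace_period": {"type": "int", "default": 200,
                         "lower": 10, "upper": 1000, "transform": "None"},
        "delta": {"type": "float", "default": 1e-07,
                  "lower": 1e-10, "upper": 1e-06, "transform": "None"},
    }
}

entry = hyper_dict["HoeffdingAdaptiveTreeRegressor"]["grace_period"]
print(entry["default"], entry["lower"], entry["upper"])  # → 200 10 1000
```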
from river.tree import HoeffdingAdaptiveTreeRegressor
from spotRiver.data.river_hyper_dict import RiverHyperDict
from spotPython.hyperparameters.values import add_core_model_to_fun_control

core_model = HoeffdingAdaptiveTreeRegressor
add_core_model_to_fun_control(core_model=core_model,
                              fun_control=fun_control,
                              hyper_dict=RiverHyperDict,
                              filename=None)
11.6 Modify hyper_dict Hyperparameters for the Selected Algorithm aka core_model
11.6.1 Modify hyperparameter of type factor
# modify_hyper_parameter_levels(fun_control, "leaf_model", ["LinearRegression"])
# fun_control["core_model_hyper_dict"]
11.6.2 Modify hyperparameters of type numeric and integer (boolean)
from spotPython.hyperparameters.values import modify_hyper_parameter_bounds

modify_hyper_parameter_bounds(fun_control, "delta", bounds=[1e-10, 1e-6])
# modify_hyper_parameter_bounds(fun_control, "min_samples_split", bounds=[3, 20])
modify_hyper_parameter_bounds(fun_control, "merit_preprune", [0, 0])
11.7 Selection of the Objective (Loss) Function
There are two metrics:
1. `metric` is used for the river based evaluation via `eval_oml_iter_progressive`.
2. `metric_sklearn` is used for the sklearn based evaluation via `eval_oml_horizon`.
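The difference between the two styles is how they consume predictions: a river metric is updated incrementally, one pair at a time, while a sklearn metric scores complete arrays in a single call. A minimal sketch (with a hypothetical `IncrementalMAE`, not river's `metrics.MAE`) shows that both agree on the final value:

```python
class IncrementalMAE:
    """Incremental mean absolute error, updated one (y_true, y_pred) pair at a time."""
    def __init__(self):
        self.n = 0
        self.total = 0.0

    def update(self, y_true, y_pred):
        self.n += 1
        self.total += abs(y_true - y_pred)
        return self

    def get(self):
        return self.total / self.n if self.n else 0.0

def batch_mae(y_true, y_pred):
    """Batch MAE over complete arrays, sklearn-style."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
m = IncrementalMAE()
for yt, yp in zip(y_true, y_pred):
    m.update(yt, yp)
print(m.get(), batch_mae(y_true, y_pred))  # → 0.5 0.5
```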
import numpy as np
from river import metrics
from sklearn.metrics import mean_absolute_error
weights = np.array([1, 1/1000, 1/1000])*10_000.0
horizon = 7*24
oml_grace_period = 2
step = 100
weight_coeff = 1.0
fun_control.update({"horizon": horizon,
"oml_grace_period": oml_grace_period,
"weights": weights,
"step": step,
"log_level": 50,
"weight_coeff": weight_coeff,
"metric": metrics.MAE(),
"metric_sklearn": mean_absolute_error
})
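The `weights` vector combines prediction error, run time, and memory consumption into a single objective value; the exact aggregation is internal to `HyperRiver`, but the idea can be sketched as a weighted sum (an assumption for illustration, not spotRiver's literal code):

```python
# Hypothetical aggregation: error dominates, time and memory enter at 1/1000 weight.
# weights = [1, 1/1000, 1/1000] * 10_000.0 -> [10000.0, 10.0, 10.0]
weights = [1 * 10_000.0, (1/1000) * 10_000.0, (1/1000) * 10_000.0]

def weighted_objective(mae, seconds, megabytes, weights=weights):
    """Collapse (error, time, memory) into one scalar to be minimized."""
    return weights[0]*mae + weights[1]*seconds + weights[2]*megabytes

print(weighted_objective(2.0, 30.0, 50.0))  # → 20800.0
```

With these weights, one unit of MAE costs as much as 1000 seconds of run time or 1000 MB of memory, so the tuner optimizes accuracy first and resources second.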
11.8 Calling the SPOT Function
11.8.1 Prepare the SPOT Parameters
- Get types and variable names as well as lower and upper bounds for the hyperparameters.
from spotPython.hyperparameters.values import (
    get_var_type,
    get_var_name,
    get_bound_values
)

var_type = get_var_type(fun_control)
var_name = get_var_name(fun_control)
lower = get_bound_values(fun_control, "lower")
upper = get_bound_values(fun_control, "upper")
from spotPython.utils.eda import gen_design_table
print(gen_design_table(fun_control))
| name | type | default | lower | upper | transform |
|------------------------|--------|------------------|------------|----------|-----------------------|
| grace_period | int | 200 | 10 | 1000 | None |
| max_depth | int | 20 | 2 | 20 | transform_power_2_int |
| delta | float | 1e-07 | 1e-10 | 1e-06 | None |
| tau | float | 0.05 | 0.01 | 0.1 | None |
| leaf_prediction | factor | mean | 0 | 2 | None |
| leaf_model | factor | LinearRegression | 0 | 2 | None |
| model_selector_decay | float | 0.95 | 0.9 | 0.99 | None |
| splitter | factor | EBSTSplitter | 0 | 2 | None |
| min_samples_split | int | 5 | 2 | 10 | None |
| bootstrap_sampling | factor | 0 | 0 | 1 | None |
| drift_window_threshold | int | 300 | 100 | 500 | None |
| switch_significance | float | 0.05 | 0.01 | 0.1 | None |
| binary_split | factor | 0 | 0 | 1 | None |
| max_size | float | 500.0 | 100 | 1000 | None |
| memory_estimate_period | int    | 1000000          | 100000     | 1000000  | None                  |
| stop_mem_management | factor | 0 | 0 | 1 | None |
| remove_poor_attrs | factor | 0 | 0 | 1 | None |
| merit_preprune | factor | 0 | 0 | 0 | None |
11.8.2 Run the Spot Optimizer
from spotRiver.fun.hyperriver import HyperRiver
fun = HyperRiver().fun_oml_horizon

from spotPython.hyperparameters.values import get_default_hyperparameters_as_array
X_start = get_default_hyperparameters_as_array(fun_control)
- Run SPOT for approximately MAX_TIME minutes (`max_time`).
- Note: the run takes longer, because the evaluation time of the initial design (here: `init_size` = INIT_SIZE, as specified above) is not counted toward `max_time`.
from spotPython.spot import spot
from math import inf
import numpy as np

spot_tuner = spot.Spot(fun=fun,
                       lower=lower,
                       upper=upper,
                       fun_evals=inf,
                       infill_criterion="y",
                       max_time=MAX_TIME,
                       tolerance_x=np.sqrt(np.spacing(1)),
                       var_type=var_type,
                       var_name=var_name,
                       show_progress=True,
                       fun_control=fun_control,
                       design_control={"init_size": INIT_SIZE},
                       surrogate_control={"noise": False,
                                          "cod_type": "norm",
                                          "min_theta": -4,
                                          "max_theta": 3,
                                          "n_theta": len(var_name),
                                          "model_fun_evals": 10_000})
spot_tuner.run(X_start=X_start)
spotPython tuning: 2.199711145531071 [##--------] 19.27%
spotPython tuning: 2.199711145531071 [####------] 41.74%
spotPython tuning: 2.199711145531071 [######----] 55.56%
spotPython tuning: 2.199711145531071 [#######---] 66.45%
spotPython tuning: 2.199711145531071 [########--] 78.33%
spotPython tuning: 2.199711145531071 [#########-] 89.60%
spotPython tuning: 2.1505935752849936 [##########] 100.00% Done...
<spotPython.spot.spot.Spot at 0x16992e1a0>
11.8.3 Results
from spotPython.utils.file import save_pickle
save_pickle(spot_tuner, experiment_name)
from spotPython.utils.file import load_pickle
spot_tuner = load_pickle(experiment_name)
- Show the Progress of the hyperparameter tuning:
spot_tuner.plot_progress(log_y=True, filename="./figures/" + experiment_name + "_progress.pdf")
- Print the Results
print(gen_design_table(fun_control=fun_control, spot=spot_tuner))
| name | type | default | lower | upper | tuned | transform | importance | stars |
|------------------------|--------|------------------|----------|-----------|----------|-----------------------|--------------|---------|
| grace_period | int | 200 | 10.0 | 1000.0 | 998.0 | None | 0.00 | |
| max_depth | int | 20 | 2.0 | 20.0 | 16.0 | transform_power_2_int | 0.00 | |
| delta | float | 1e-07 | 1e-10 | 1e-06 | 1e-06 | None | 0.00 | |
| tau | float | 0.05 | 0.01 | 0.1 | 0.01 | None | 0.00 | |
| leaf_prediction | factor | mean | 0.0 | 2.0 | 2.0 | None | 5.44 | * |
| leaf_model | factor | LinearRegression | 0.0 | 2.0 | 0.0 | None | 5.45 | * |
| model_selector_decay | float | 0.95 | 0.9 | 0.99 | 0.9 | None | 0.00 | |
| splitter | factor | EBSTSplitter | 0.0 | 2.0 | 2.0 | None | 100.00 | *** |
| min_samples_split | int | 5 | 2.0 | 10.0 | 6.0 | None | 0.00 | |
| bootstrap_sampling | factor | 0 | 0.0 | 1.0 | 0.0 | None | 0.00 | |
| drift_window_threshold | int | 300 | 100.0 | 500.0 | 155.0 | None | 0.00 | |
| switch_significance | float | 0.05 | 0.01 | 0.1 | 0.01 | None | 0.00 | |
| binary_split | factor | 0 | 0.0 | 1.0 | 1.0 | None | 0.00 | |
| max_size | float | 500.0 | 100.0 | 1000.0 | 100.0 | None | 0.00 | |
| memory_estimate_period | int | 1000000 | 100000.0 | 1000000.0 | 162938.0 | None | 0.00 | |
| stop_mem_management | factor | 0 | 0.0 | 1.0 | 0.0 | None | 0.00 | |
| remove_poor_attrs | factor | 0 | 0.0 | 1.0 | 1.0 | None | 0.04 | |
| merit_preprune | factor | 0 | 0.0 | 0.0 | 0.0 | None | 0.00 | |
11.9 Show variable importance
spot_tuner.plot_importance(threshold=0.0025, filename="./figures/" + experiment_name + "_importance.pdf")
11.10 Build and Evaluate HTR Model with Tuned Hyperparameters
m = test.shape[0]
a = int(m/2) - 50
b = int(m/2)
11.11 The Large Data Set
Caution: Increased Friedman-Drift Data Set
- The Friedman-Drift Data Set is increased by a factor of two to show the transferability of the hyperparameter tuning results.
- Larger values of
k
lead to a longer run time.
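The number of samples and both drift positions scale linearly with `k`, so doubling `k` doubles the data set and shifts the drift points accordingly. A small sketch of this scaling, using the formulas from the data-loading code:

```python
# Drift-data sizing as used in this notebook: everything scales linearly with k.
def drift_positions(k):
    n_total = int(k * 100_000)   # total number of samples
    p_1 = int(k * 25_000)        # first drift position
    p_2 = int(k * 50_000)        # second drift position
    return n_total, (p_1, p_2)

print(drift_positions(0.1))  # demo run:         (10000, (2500, 5000))
print(drift_positions(0.2))  # doubled data set: (20000, (5000, 10000))
```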
horizon = 7*24
k = 0.2
n_total = int(k*100_000)
n_samples = n_total
p_1 = int(k*25_000)
p_2 = int(k*50_000)
position = (p_1, p_2)
n_train = 1_000
a = n_train + p_1 - 12
b = a + 12
from river.datasets import synth

dataset = synth.FriedmanDrift(
    drift_type='gra',
    position=position,
    seed=123
)
from spotRiver.utils.data_conversion import convert_to_df

target_column = "y"
df = convert_to_df(dataset, target_column=target_column, n_total=n_total)
# Add column names x1 until x10 to the first 10 columns of the dataframe
# and the column name y to the last column
df.columns = [f"x{i}" for i in range(1, 11)] + ["y"]
train = df[:n_train]
test = df[n_train:]
fun_control.update({"data": None,  # dataset,
                    "train": train,
                    "test": test,
                    "n_samples": n_samples,
                    "target_column": target_column})
11.12 Get Default Hyperparameters
# fun_control was modified; we generate a new one with the original
# default hyperparameters
from spotPython.hyperparameters.values import get_one_core_model_from_X
from spotPython.hyperparameters.values import get_default_hyperparameters_as_array

X_start = get_default_hyperparameters_as_array(fun_control)
model_default = get_one_core_model_from_X(X_start, fun_control)
model_default
HoeffdingAdaptiveTreeRegressor (
grace_period=200
max_depth=1048576
delta=1e-07
tau=0.05
leaf_prediction="mean"
leaf_model=LinearRegression (
optimizer=SGD (
lr=Constant (
learning_rate=0.01
)
)
loss=Squared ()
l2=0.
l1=0.
intercept_init=0.
intercept_lr=Constant (
learning_rate=0.01
)
clip_gradient=1e+12
initializer=Zeros ()
)
model_selector_decay=0.95
nominal_attributes=None
splitter=EBSTSplitter ()
min_samples_split=5
bootstrap_sampling=0
drift_window_threshold=300
drift_detector=ADWIN (
delta=0.002
clock=32
max_buckets=5
min_window_length=5
grace_period=10
)
switch_significance=0.05
binary_split=0
max_size=500.
memory_estimate_period=1000000
stop_mem_management=0
remove_poor_attrs=0
merit_preprune=0
seed=None
)
from spotRiver.evaluation.eval_bml import eval_oml_horizon
df_eval_default, df_true_default = eval_oml_horizon(
    model=model_default,
    train=fun_control["train"],
    test=fun_control["test"],
    target_column=fun_control["target_column"],
    horizon=fun_control["horizon"],
    oml_grace_period=fun_control["oml_grace_period"],
    metric=fun_control["metric_sklearn"],
)
from spotRiver.evaluation.eval_bml import plot_bml_oml_horizon_metrics, plot_bml_oml_horizon_predictions
df_labels = ["default"]
plot_bml_oml_horizon_metrics(df_eval=[df_eval_default], log_y=False,
                             df_labels=df_labels, metric=fun_control["metric_sklearn"])
plot_bml_oml_horizon_predictions(df_true=[df_true_default[a:b]],
                                 target_column=target_column, df_labels=df_labels)
11.13 Get SPOT Results
from spotPython.hyperparameters.values import get_one_core_model_from_X
X = spot_tuner.to_all_dim(spot_tuner.min_X.reshape(1, -1))
model_spot = get_one_core_model_from_X(X, fun_control)
model_spot
HoeffdingAdaptiveTreeRegressor (
grace_period=998
max_depth=65536
delta=1e-06
tau=0.01
leaf_prediction="adaptive"
leaf_model=LinearRegression (
optimizer=SGD (
lr=Constant (
learning_rate=0.01
)
)
loss=Squared ()
l2=0.
l1=0.
intercept_init=0.
intercept_lr=Constant (
learning_rate=0.01
)
clip_gradient=1e+12
initializer=Zeros ()
)
model_selector_decay=0.9
nominal_attributes=None
splitter=QOSplitter (
radius=0.25
allow_multiway_splits=False
)
min_samples_split=6
bootstrap_sampling=0
drift_window_threshold=155
drift_detector=ADWIN (
delta=0.002
clock=32
max_buckets=5
min_window_length=5
grace_period=10
)
switch_significance=0.01
binary_split=1
max_size=100.
memory_estimate_period=162938
stop_mem_management=0
remove_poor_attrs=1
merit_preprune=0
seed=None
)
df_eval_spot, df_true_spot = eval_oml_horizon(
    model=model_spot,
    train=fun_control["train"],
    test=fun_control["test"],
    target_column=fun_control["target_column"],
    horizon=fun_control["horizon"],
    oml_grace_period=fun_control["oml_grace_period"],
    metric=fun_control["metric_sklearn"],
)
df_labels = ["default", "spot"]
plot_bml_oml_horizon_metrics(df_eval=[df_eval_default, df_eval_spot], log_y=False,
                             df_labels=df_labels, metric=fun_control["metric_sklearn"],
                             filename="./figures/" + experiment_name + "_metrics.pdf")
a = int(m/2) + 20
b = int(m/2) + 50
plot_bml_oml_horizon_predictions(df_true=[df_true_default[a:b], df_true_spot[a:b]],
                                 target_column=target_column, df_labels=df_labels,
                                 filename="./figures/" + experiment_name + "_predictions.pdf")
from spotPython.plot.validation import plot_actual_vs_predicted
plot_actual_vs_predicted(y_test=df_true_default["y"], y_pred=df_true_default["Prediction"],
                         title="Default")
plot_actual_vs_predicted(y_test=df_true_spot["y"], y_pred=df_true_spot["Prediction"],
                         title="SPOT")
11.14 Visualize Regression Trees
dataset_f = dataset.take(n_total)
for x, y in dataset_f:
    model_default.learn_one(x, y)
Caution: Large Trees
- Since the trees are large, the visualization is suppressed by default.
- To visualize the trees, uncomment the following line.
# model_default.draw()
model_default.summary
{'n_nodes': 35,
'n_branches': 17,
'n_leaves': 18,
'n_active_leaves': 96,
'n_inactive_leaves': 0,
'height': 6,
'total_observed_weight': 39002.0,
'n_alternate_trees': 21,
'n_pruned_alternate_trees': 6,
'n_switch_alternate_trees': 2}
11.14.1 Spot Model
dataset_f = dataset.take(n_total)
for x, y in dataset_f:
    model_spot.learn_one(x, y)
Caution: Large Trees
- Since the trees are large, the visualization is suppressed by default.
- To visualize the trees, uncomment the following line.
# model_spot.draw()
model_spot.summary
{'n_nodes': 11,
'n_branches': 5,
'n_leaves': 6,
'n_active_leaves': 3,
'n_inactive_leaves': 0,
'height': 5,
'total_observed_weight': 39002.0,
'n_alternate_trees': 24,
'n_pruned_alternate_trees': 15,
'n_switch_alternate_trees': 3}
from spotPython.utils.eda import compare_two_tree_models
print(compare_two_tree_models(model_default, model_spot))
| Parameter | Default | Spot |
|--------------------------|-----------|--------|
| n_nodes | 35 | 11 |
| n_branches | 17 | 5 |
| n_leaves | 18 | 6 |
| n_active_leaves | 96 | 3 |
| n_inactive_leaves | 0 | 0 |
| height | 6 | 5 |
| total_observed_weight | 39002 | 39002 |
| n_alternate_trees | 21 | 24 |
| n_pruned_alternate_trees | 6 | 15 |
| n_switch_alternate_trees | 2 | 3 |
11.15 Detailed Hyperparameter Plots
filename = "./figures/" + experiment_name
spot_tuner.plot_important_hyperparameter_contour(filename=filename)
leaf_prediction: 5.438195067257364
leaf_model: 5.4478753170354315
splitter: 100.0
remove_poor_attrs: 0.039746553711951044
11.16 Parallel Coordinates Plots
spot_tuner.parallel_plot()
11.17 Plot all Combinations of Hyperparameters
- Warning: this may take a while.
PLOT_ALL = False
if PLOT_ALL:
    n = spot_tuner.k
    for i in range(n-1):
        for j in range(i+1, n):
            spot_tuner.plot_contour(i=i, j=j, min_z=min_z, max_z=max_z)