!pip list | grep "spot[RiverPython]"
spotPython 0.2.33
spotRiver 0.0.93
Note: you may need to restart the kernel to use updated packages.
In this tutorial, we will show how spotPython
can be integrated into the PyTorch
training workflow.
This document refers to the following software versions:
python: 3.10.10
torch: 2.0.1
torchvision: 0.15.0
spotPython can be installed via pip. Alternatively, the source code can be downloaded from GitHub: https://github.com/sequential-parameter-optimization/spotPython.
!pip install spotPython
Uncomment the following lines if you want to install spotPython from GitHub:
# import sys
# !{sys.executable} -m pip install --upgrade build
# !{sys.executable} -m pip install --upgrade --force-reinstall spotPython
Before we consider the detailed experimental setup, we select the parameters that affect the run time (MAX_TIME), the initial design size (INIT_SIZE), and the device that is used (DEVICE).
MAX_TIME = 1
INIT_SIZE = 5
DEVICE = "cpu" # "cuda:0"
from spotPython.utils.device import getDevice
DEVICE = getDevice(DEVICE)
print(DEVICE)
cpu
import os
import copy
import socket
from datetime import datetime
from dateutil.tz import tzlocal
start_time = datetime.now(tzlocal())
HOSTNAME = socket.gethostname().split(".")[0]
experiment_name = '11-torch' + "_" + HOSTNAME + "_" + str(MAX_TIME) + "min_" + str(INIT_SIZE) + "init_" + str(start_time).split(".", 1)[0].replace(' ', '_')
experiment_name = experiment_name.replace(':', '-')
print(experiment_name)
if not os.path.exists('./figures'):
    os.makedirs('./figures')
11-torch_p040025_1min_5init_2023-06-16_20-20-27
The fun_control Dictionary
spotPython uses a Python dictionary for storing the information required for the hyperparameter tuning process, which was described in Section 13.2.
from spotPython.utils.init import fun_control_init
fun_control = fun_control_init(task="classification",
    tensorboard_path="runs/11_spot_hpt_torch_fashion_mnist",
    device=DEVICE)
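Since fun_control is a plain Python dictionary, its entries can be inspected directly; a quick check of two entries set by the call above:
# illustrative only: print two entries stored by fun_control_init
print(fun_control["task"])    # classification
print(fun_control["device"])  # cpu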
from torchvision import datasets, transforms
from torchvision.transforms import ToTensor
def load_data(data_dir="./data"):
    # Download training data from open datasets.
    training_data = datasets.FashionMNIST(
        root=data_dir,
        train=True,
        download=True,
        transform=ToTensor(),
    )
    # Download test data from open datasets.
    test_data = datasets.FashionMNIST(
        root=data_dir,
        train=False,
        download=True,
        transform=ToTensor(),
    )
    return training_data, test_data

train, test = load_data()
train.data.shape, test.data.shape
(torch.Size([60000, 28, 28]), torch.Size([10000, 28, 28]))
n_samples = len(train)
# add the dataset to the fun_control
fun_control.update({"data": None,
    "train": train,
    "test": test,
    "n_samples": n_samples,
    "target_column": None})
After the training and test data are specified and added to the fun_control dictionary, spotPython allows the specification of a data preprocessing pipeline, e.g., for the scaling of the data or for the one-hot encoding of categorical variables, see Section 13.4.1. This feature is not used here, so we do not change the default value (which is None).
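For illustration, such a pipeline could be attached as follows. This is a minimal sketch, assuming a scikit-learn transformer and assuming the pipeline is stored under the key "prep_model"; it is not used in this tutorial, and the key name may differ between spotPython versions:
# hypothetical example: attach a scaler as the preprocessing model
# from sklearn.preprocessing import StandardScaler
# fun_control.update({"prep_model": StandardScaler()})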
algorithm and core_model_hyper_dict
spotPython implements a class which is similar to the class described in the PyTorch tutorial. The class is called Net_fashionMNIST and is implemented in the file netfashionMNIST.py. The class is imported here.
from torch import nn
import spotPython.torch.netcore as netcore
class Net_fashionMNIST(netcore.Net_Core):
    def __init__(self, l1, l2, lr_mult, batch_size, epochs, k_folds, patience, optimizer, sgd_momentum):
        super(Net_fashionMNIST, self).__init__(
            lr_mult=lr_mult,
            batch_size=batch_size,
            epochs=epochs,
            k_folds=k_folds,
            patience=patience,
            optimizer=optimizer,
            sgd_momentum=sgd_momentum,
        )
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, l1),
            nn.ReLU(),
            nn.Linear(l1, l2),
            nn.ReLU(),
            nn.Linear(l2, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
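A quick smoke test of the architecture; the training-control arguments are placeholders, since only the forward pass is exercised (this assumes, as in the class above, that Net_Core merely stores these arguments):
import torch
net = Net_fashionMNIST(l1=32, l2=32, lr_mult=1.0, batch_size=16, epochs=3,
                       k_folds=1, patience=5, optimizer="SGD", sgd_momentum=0.0)
x = torch.rand(8, 1, 28, 28)  # a batch of eight 28x28 grayscale images
print(net(x).shape)           # expected: torch.Size([8, 10]), one logit per class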
This class inherits from the class Net_Core, which is implemented in the file netcore.py, see ?sec-the-net-core-class-24.
from spotPython.data.torch_hyper_dict import TorchHyperDict
from spotPython.torch.netfashionMNIST import Net_fashionMNIST
from spotPython.hyperparameters.values import add_core_model_to_fun_control
fun_control = add_core_model_to_fun_control(core_model=Net_fashionMNIST,
    fun_control=fun_control,
    hyper_dict=TorchHyperDict,
    filename=None)
hyper_dict Hyperparameters for the Selected Algorithm
spotPython uses JSON files for the specification of the hyperparameters, which were described in Section 13.5.2. The corresponding entries for the Net_fashionMNIST class are shown below.
"Net_fashionMNIST":
{
"l1": {
"type": "int",
"default": 5,
"transform": "transform_power_2_int",
"lower": 2,
"upper": 9},
"l2": {
"type": "int",
"default": 5,
"transform": "transform_power_2_int",
"lower": 2,
"upper": 9},
"lr_mult": {
"type": "float",
"default": 1.0,
"transform": "None",
"lower": 0.1,
"upper": 10.0},
"batch_size": {
"type": "int",
"default": 4,
"transform": "transform_power_2_int",
"lower": 1,
"upper": 4},
"epochs": {
"type": "int",
"default": 3,
"transform": "transform_power_2_int",
"lower": 3,
"upper": 4},
"k_folds": {
"type": "int",
"default": 1,
"transform": "None",
"lower": 1,
"upper": 1},
"patience": {
"type": "int",
"default": 5,
"transform": "None",
"lower": 2,
"upper": 10
},
"optimizer": {
"levels": ["Adadelta",
"Adagrad",
"Adam",
"AdamW",
"SparseAdam",
"Adamax",
"ASGD",
"NAdam",
"RAdam",
"RMSprop",
"Rprop",
"SGD"],
"type": "factor",
"default": "SGD",
"transform": "None",
"core_model_parameter_type": "str",
"lower": 0,
"upper": 12},
"sgd_momentum": {
"type": "float",
"default": 0.0,
"transform": "None",
"lower": 0.0,
"upper": 1.0}
},
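The transform entries decouple the value seen by the tuner from the value passed to the model: for l1, the tuner searches integers in [2, 9], and transform_power_2_int maps them to powers of two, so the actual layer width lies in {4, 8, ..., 512}. A minimal sketch of this transform (the actual implementation ships with spotPython):
def transform_power_2_int(x: int) -> int:
    # map the tuned integer to a power of two, e.g., the default 5 -> 32
    return 2 ** x

print(transform_power_2_int(5))  # 32 hidden units for the default l1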
hyper_dict Hyperparameters for the Selected Algorithm aka core_model
spotPython provides functions for modifying the hyperparameters, their bounds and factors, as well as for activating and de-activating hyperparameters without re-compilation of the Python source code. These functions were described in Section 13.5.3. Note that setting identical lower and upper bounds, as is done for k_folds and patience below, fixes a hyperparameter to a single value and thereby effectively de-activates it.
from spotPython.hyperparameters.values import modify_hyper_parameter_bounds
# fun_control = modify_hyper_parameter_bounds(fun_control, "delta", bounds=[1e-10, 1e-6])
# fun_control = modify_hyper_parameter_bounds(fun_control, "min_samples_split", bounds=[3, 20])
#fun_control = modify_hyper_parameter_bounds(fun_control, "merit_preprune", bounds=[0, 0])
# fun_control["core_model_hyper_dict"]
fun_control = modify_hyper_parameter_bounds(fun_control, "k_folds", bounds=[0, 0])
fun_control = modify_hyper_parameter_bounds(fun_control, "patience", bounds=[2, 2])
fun_control = modify_hyper_parameter_bounds(fun_control, "epochs", bounds=[2, 3])
fun_control
from spotPython.hyperparameters.values import modify_hyper_parameter_levels
# fun_control = modify_hyper_parameter_levels(fun_control, "leaf_model", ["LinearRegression"])
# fun_control["core_model_hyper_dict"]
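The same mechanism applies to this chapter's factor hyperparameter; a sketch (kept commented out, since the full set of optimizer levels is tuned here), using a hypothetical restriction to two levels:
# fun_control = modify_hyper_parameter_levels(fun_control, "optimizer", ["SGD", "Adam"])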
Optimizers are described in Section 13.6.
The evaluation procedure requires the specification of two elements: the way the data is split into training and validation data, and the loss function (together with an optional metric). These are described in Section 18.9.
The key "loss_function" specifies the loss function which is used during the optimization, see Section 13.8. We will use CrossEntropyLoss for the multi-class classification task.
from torch.nn import CrossEntropyLoss
loss_function = CrossEntropyLoss()
fun_control.update({"loss_function": loss_function,
    "shuffle": True,
    "eval": "train_hold_out"
    })
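CrossEntropyLoss expects raw logits and integer class labels; a tiny self-contained check:
import torch
from torch.nn import CrossEntropyLoss
logits = torch.randn(4, 10)           # raw model outputs for a batch of 4
targets = torch.tensor([0, 3, 9, 1])  # integer class labels in [0, 9]
print(CrossEntropyLoss()(logits, targets))  # scalar loss tensor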
from torchmetrics import Accuracy
metric_torch = Accuracy(task="multiclass", num_classes=10).to(fun_control["device"])
fun_control.update({"metric_torch": metric_torch})
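The torchmetrics Accuracy object is callable on prediction and target tensors; a small check:
import torch
from torchmetrics import Accuracy
metric = Accuracy(task="multiclass", num_classes=10)
preds = torch.tensor([0, 2, 1, 3])   # predicted class indices
target = torch.tensor([0, 1, 1, 3])  # true class indices
print(metric(preds, target))         # tensor(0.7500), 3 of 4 correct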
The following code passes the information about the parameter ranges and bounds to spot.
# extract the variable types, names, and bounds
from spotPython.hyperparameters.values import (get_bound_values,
get_var_name,
    get_var_type,)
var_type = get_var_type(fun_control)
var_name = get_var_name(fun_control)
fun_control.update({"var_type": var_type,
    "var_name": var_name})
lower = get_bound_values(fun_control, "lower")
upper = get_bound_values(fun_control, "upper")
from spotPython.utils.eda import gen_design_table
print(gen_design_table(fun_control))
| name | type | default | lower | upper | transform |
|--------------|--------|-----------|---------|---------|-----------------------|
| l1 | int | 5 | 2 | 9 | transform_power_2_int |
| l2 | int | 5 | 2 | 9 | transform_power_2_int |
| lr_mult | float | 1.0 | 0.1 | 10 | None |
| batch_size | int | 4 | 1 | 4 | transform_power_2_int |
| epochs | int | 3 | 2 | 3 | transform_power_2_int |
| k_folds | int | 1 | 0 | 0 | None |
| patience | int | 5 | 2 | 2 | None |
| optimizer | factor | SGD | 0 | 12 | None |
| sgd_momentum | float | 0.0 | 0 | 1 | None |
fun_torch
The objective function fun_torch is selected next. It implements an interface from PyTorch’s training, validation, and testing methods to spotPython.
from spotPython.fun.hypertorch import HyperTorch
fun = HyperTorch().fun_torch
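The call to spot_tuner.run() below starts the search at X_start, the default hyperparameter configuration. A sketch of how X_start can be obtained, assuming the helper get_default_hyperparameters_as_array that other spotPython chapters use:
from spotPython.hyperparameters.values import get_default_hyperparameters_as_array
# encode the defaults from the hyper_dict as a numeric array
X_start = get_default_hyperparameters_as_array(fun_control)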
import numpy as np
from spotPython.spot import spot
from math import inf
spot_tuner = spot.Spot(fun=fun,
    lower=lower,
    upper=upper,
    fun_evals=inf,
    fun_repeats=1,
    max_time=MAX_TIME,
    noise=False,
    tolerance_x=np.sqrt(np.spacing(1)),
    var_type=var_type,
    var_name=var_name,
    infill_criterion="y",
    n_points=1,
    seed=123,
    log_level=50,
    show_models=False,
    show_progress=True,
    fun_control=fun_control,
    design_control={"init_size": INIT_SIZE,
                    "repeats": 1},
    surrogate_control={"noise": True,
                       "cod_type": "norm",
                       "min_theta": -4,
                       "max_theta": 3,
                       "n_theta": len(var_name),
                       "model_fun_evals": 10_000,
                       "log_level": 50
                       })
spot_tuner.run(X_start=X_start)
config: {'l1': 16, 'l2': 32, 'lr_mult': 9.563687451910228, 'batch_size': 8, 'epochs': 8, 'k_folds': 0, 'patience': 2, 'optimizer': 'AdamW', 'sgd_momentum': 0.41533100039458876}
Epoch: 1
Loss on hold-out set: 0.7280306033538655
Accuracy on hold-out set: 0.76575
MulticlassAccuracy value on hold-out data: 0.765749990940094
Epoch: 2
Loss on hold-out set: 0.6369808959594617
Accuracy on hold-out set: 0.8016666666666666
MulticlassAccuracy value on hold-out data: 0.8016666769981384
Epoch: 3
Loss on hold-out set: 0.61605288642024
Accuracy on hold-out set: 0.80025
MulticlassAccuracy value on hold-out data: 0.8002499938011169
Epoch: 4
Loss on hold-out set: 0.5958177099404857
Accuracy on hold-out set: 0.8069583333333333
MulticlassAccuracy value on hold-out data: 0.8069583177566528
Epoch: 5
Loss on hold-out set: 0.641254554610583
Accuracy on hold-out set: 0.7777916666666667
MulticlassAccuracy value on hold-out data: 0.7777916789054871
Epoch: 6
Loss on hold-out set: 0.5700600726404227
Accuracy on hold-out set: 0.821
MulticlassAccuracy value on hold-out data: 0.8209999799728394
Epoch: 7
Loss on hold-out set: 0.6348543479736739
Accuracy on hold-out set: 0.814125
MulticlassAccuracy value on hold-out data: 0.8141250014305115
Epoch: 8
Loss on hold-out set: 0.6070657135549312
Accuracy on hold-out set: 0.82325
MulticlassAccuracy value on hold-out data: 0.8232499957084656
Early stopping at epoch 7
Returned to Spot: Validation loss: 0.6070657135549312
----------------------------------------------
config: {'l1': 128, 'l2': 32, 'lr_mult': 6.258012467639852, 'batch_size': 2, 'epochs': 4, 'k_folds': 0, 'patience': 2, 'optimizer': 'RAdam', 'sgd_momentum': 0.9572474073249809}
Epoch: 1
Loss on hold-out set: 0.6485239004243807
Accuracy on hold-out set: 0.831375
MulticlassAccuracy value on hold-out data: 0.831375002861023
Epoch: 2
Loss on hold-out set: 0.5155696155135445
Accuracy on hold-out set: 0.8497083333333333
MulticlassAccuracy value on hold-out data: 0.8497083187103271
Epoch: 3
Loss on hold-out set: 0.5956713381450979
Accuracy on hold-out set: 0.8692083333333334
MulticlassAccuracy value on hold-out data: 0.8692083358764648
Epoch: 4
Loss on hold-out set: 0.5416228853709973
Accuracy on hold-out set: 0.8683333333333333
MulticlassAccuracy value on hold-out data: 0.8683333396911621
Early stopping at epoch 3
Returned to Spot: Validation loss: 0.5416228853709973
----------------------------------------------
config: {'l1': 256, 'l2': 256, 'lr_mult': 0.2437336281201693, 'batch_size': 16, 'epochs': 8, 'k_folds': 0, 'patience': 2, 'optimizer': 'Adagrad', 'sgd_momentum': 0.15368887503658651}
Epoch: 1
Loss on hold-out set: 0.47869857964664697
Accuracy on hold-out set: 0.8354166666666667
MulticlassAccuracy value on hold-out data: 0.8354166746139526
Epoch: 2
Loss on hold-out set: 0.46281473950048285
Accuracy on hold-out set: 0.83275
MulticlassAccuracy value on hold-out data: 0.8327500224113464
Epoch: 3
Loss on hold-out set: 0.43123161408429345
Accuracy on hold-out set: 0.8472916666666667
MulticlassAccuracy value on hold-out data: 0.8472916483879089
Epoch: 4
Loss on hold-out set: 0.4169509609391292
Accuracy on hold-out set: 0.8529583333333334
MulticlassAccuracy value on hold-out data: 0.8529583215713501
Epoch: 5
Loss on hold-out set: 0.40455256626755
Accuracy on hold-out set: 0.8567916666666666
MulticlassAccuracy value on hold-out data: 0.8567916750907898
Epoch: 6
Loss on hold-out set: 0.4030191301231583
Accuracy on hold-out set: 0.8567916666666666
MulticlassAccuracy value on hold-out data: 0.8567916750907898
Epoch: 7
Loss on hold-out set: 0.39274805945530533
Accuracy on hold-out set: 0.8615833333333334
MulticlassAccuracy value on hold-out data: 0.8615833520889282
Epoch: 8
Loss on hold-out set: 0.3904207593202591
Accuracy on hold-out set: 0.8610833333333333
MulticlassAccuracy value on hold-out data: 0.8610833287239075
Returned to Spot: Validation loss: 0.3904207593202591
----------------------------------------------
config: {'l1': 64, 'l2': 8, 'lr_mult': 2.906205211581667, 'batch_size': 8, 'epochs': 4, 'k_folds': 0, 'patience': 2, 'optimizer': 'SGD', 'sgd_momentum': 0.25435133436334767}
Epoch: 1
Loss on hold-out set: 0.935462615609169
Accuracy on hold-out set: 0.6845833333333333
MulticlassAccuracy value on hold-out data: 0.684583306312561
Epoch: 2
Loss on hold-out set: 0.7853849367375175
Accuracy on hold-out set: 0.7312916666666667
MulticlassAccuracy value on hold-out data: 0.731291651725769
Epoch: 3
Loss on hold-out set: 0.7176219247840345
Accuracy on hold-out set: 0.7505833333333334
MulticlassAccuracy value on hold-out data: 0.7505833506584167
Epoch: 4
Loss on hold-out set: 0.6663412794545293
Accuracy on hold-out set: 0.7709583333333333
MulticlassAccuracy value on hold-out data: 0.7709583044052124
Returned to Spot: Validation loss: 0.6663412794545293
----------------------------------------------
config: {'l1': 4, 'l2': 128, 'lr_mult': 4.224097306355747, 'batch_size': 4, 'epochs': 8, 'k_folds': 0, 'patience': 2, 'optimizer': 'Adamax', 'sgd_momentum': 0.6538496127257492}
Epoch: 1
Loss on hold-out set: 0.7946200037089121
Accuracy on hold-out set: 0.7546666666666667
MulticlassAccuracy value on hold-out data: 0.7546666860580444
Epoch: 2
Loss on hold-out set: 0.7079592275182834
Accuracy on hold-out set: 0.7716666666666666
MulticlassAccuracy value on hold-out data: 0.7716666460037231
Epoch: 3
Loss on hold-out set: 0.6923255297316016
Accuracy on hold-out set: 0.7735
MulticlassAccuracy value on hold-out data: 0.7735000252723694
Epoch: 4
Loss on hold-out set: 0.7026066383594913
Accuracy on hold-out set: 0.761625
MulticlassAccuracy value on hold-out data: 0.7616249918937683
Epoch: 5
Loss on hold-out set: 0.651759386166268
Accuracy on hold-out set: 0.7841666666666667
MulticlassAccuracy value on hold-out data: 0.784166693687439
Epoch: 6
Loss on hold-out set: 0.7141814457841417
Accuracy on hold-out set: 0.781125
MulticlassAccuracy value on hold-out data: 0.781125009059906
Epoch: 7
Loss on hold-out set: 0.7420809289485314
Accuracy on hold-out set: 0.76625
MulticlassAccuracy value on hold-out data: 0.7662500143051147
Early stopping at epoch 6
Returned to Spot: Validation loss: 0.7420809289485314
----------------------------------------------
config: {'l1': 256, 'l2': 512, 'lr_mult': 0.22685311581308953, 'batch_size': 16, 'epochs': 8, 'k_folds': 0, 'patience': 2, 'optimizer': 'Adagrad', 'sgd_momentum': 0.19048454477708274}
Epoch: 1
Loss on hold-out set: 0.49062527282039325
Accuracy on hold-out set: 0.827875
MulticlassAccuracy value on hold-out data: 0.827875018119812
Epoch: 2
Loss on hold-out set: 0.44438711123913527
Accuracy on hold-out set: 0.840875
MulticlassAccuracy value on hold-out data: 0.8408750295639038
Epoch: 3
Loss on hold-out set: 0.4247303451299667
Accuracy on hold-out set: 0.8503333333333334
MulticlassAccuracy value on hold-out data: 0.8503333330154419
Epoch: 4
Loss on hold-out set: 0.41463254447778064
Accuracy on hold-out set: 0.8544583333333333
MulticlassAccuracy value on hold-out data: 0.8544583320617676
Epoch: 5
Loss on hold-out set: 0.40354599794745444
Accuracy on hold-out set: 0.8578333333333333
MulticlassAccuracy value on hold-out data: 0.8578333258628845
Epoch: 6
Loss on hold-out set: 0.3949941752279798
Accuracy on hold-out set: 0.860375
MulticlassAccuracy value on hold-out data: 0.8603749871253967
Epoch: 7
Loss on hold-out set: 0.3897688213456422
Accuracy on hold-out set: 0.8624583333333333
MulticlassAccuracy value on hold-out data: 0.862458348274231
Epoch: 8
Loss on hold-out set: 0.38786087719723583
Accuracy on hold-out set: 0.8639583333333334
MulticlassAccuracy value on hold-out data: 0.8639583587646484
Returned to Spot: Validation loss: 0.38786087719723583
----------------------------------------------
spotPython tuning: 0.38786087719723583 [#####-----] 52.13%
config: {'l1': 16, 'l2': 512, 'lr_mult': 0.1, 'batch_size': 16, 'epochs': 8, 'k_folds': 0, 'patience': 2, 'optimizer': 'Adagrad', 'sgd_momentum': 1.0}
Epoch: 1
Loss on hold-out set: 0.7161393035451571
Accuracy on hold-out set: 0.7407916666666666
MulticlassAccuracy value on hold-out data: 0.7407916784286499
Epoch: 2
Loss on hold-out set: 0.6461528285543124
Accuracy on hold-out set: 0.7752083333333334
MulticlassAccuracy value on hold-out data: 0.7752083539962769
Epoch: 3
Loss on hold-out set: 0.6088597588439782
Accuracy on hold-out set: 0.7881666666666667
MulticlassAccuracy value on hold-out data: 0.7881666421890259
Epoch: 4
Loss on hold-out set: 0.5854975271821022
Accuracy on hold-out set: 0.7969166666666667
MulticlassAccuracy value on hold-out data: 0.796916663646698
Epoch: 5
Loss on hold-out set: 0.5698832663496335
Accuracy on hold-out set: 0.804875
MulticlassAccuracy value on hold-out data: 0.8048750162124634
Epoch: 6
Loss on hold-out set: 0.5556277097364267
Accuracy on hold-out set: 0.8075833333333333
MulticlassAccuracy value on hold-out data: 0.8075833320617676
Epoch: 7
Loss on hold-out set: 0.5470832998951276
Accuracy on hold-out set: 0.8116666666666666
MulticlassAccuracy value on hold-out data: 0.8116666674613953
Epoch: 8
Loss on hold-out set: 0.5371010897705952
Accuracy on hold-out set: 0.814
MulticlassAccuracy value on hold-out data: 0.8140000104904175
Returned to Spot: Validation loss: 0.5371010897705952
----------------------------------------------
spotPython tuning: 0.38786087719723583 [#########-] 85.45%
config: {'l1': 256, 'l2': 512, 'lr_mult': 0.5705075457510752, 'batch_size': 16, 'epochs': 8, 'k_folds': 0, 'patience': 2, 'optimizer': 'Adagrad', 'sgd_momentum': 0.971336668097433}
Epoch: 1
Loss on hold-out set: 0.42686400876442593
Accuracy on hold-out set: 0.846125
MulticlassAccuracy value on hold-out data: 0.8461250066757202
Epoch: 2
Loss on hold-out set: 0.3839760159129898
Accuracy on hold-out set: 0.8611666666666666
MulticlassAccuracy value on hold-out data: 0.8611666560173035
Epoch: 3
Loss on hold-out set: 0.37075403837859633
Accuracy on hold-out set: 0.867
MulticlassAccuracy value on hold-out data: 0.8669999837875366
Epoch: 4
Loss on hold-out set: 0.36395191453769804
Accuracy on hold-out set: 0.8660416666666667
MulticlassAccuracy value on hold-out data: 0.8660416603088379
Epoch: 5
Loss on hold-out set: 0.34839614111681777
Accuracy on hold-out set: 0.8745416666666667
MulticlassAccuracy value on hold-out data: 0.8745416402816772
Epoch: 6
Loss on hold-out set: 0.34904215663547317
Accuracy on hold-out set: 0.8744166666666666
MulticlassAccuracy value on hold-out data: 0.8744166493415833
Epoch: 7
Loss on hold-out set: 0.342254511300164
Accuracy on hold-out set: 0.8787916666666666
MulticlassAccuracy value on hold-out data: 0.8787916898727417
Epoch: 8
Loss on hold-out set: 0.33139804435366144
Accuracy on hold-out set: 0.8809166666666667
MulticlassAccuracy value on hold-out data: 0.8809166550636292
Returned to Spot: Validation loss: 0.33139804435366144
----------------------------------------------
spotPython tuning: 0.33139804435366144 [##########] 100.00% Done...
<spotPython.spot.spot.Spot at 0x28f2eec80>
The textual output shown in the console (or code cell) can be visualized with TensorBoard as described in Section 13.13.
After the hyperparameter tuning run is finished, the results can be analyzed as described in Section 13.14.
import pickle

SAVE = False
LOAD = False

if SAVE:
    result_file_name = "res_" + experiment_name + ".pkl"
    with open(result_file_name, 'wb') as f:
        pickle.dump(spot_tuner, f)

if LOAD:
    result_file_name = "ADD THE NAME here, e.g.: res_ch10-friedman-hpt-0_maans03_60min_20init_1K_2023-04-14_10-11-19.pkl"
    with open(result_file_name, 'rb') as f:
        spot_tuner = pickle.load(f)
After the hyperparameter tuning run is finished, the progress of the hyperparameter tuning can be visualized. The following code generates the progress plot from ?fig-progress.
spot_tuner.plot_progress(log_y=False,
    filename="./figures/" + experiment_name + "_progress.png")
print(gen_design_table(fun_control=fun_control,
    spot=spot_tuner))
| name | type | default | lower | upper | tuned | transform | importance | stars |
|--------------|--------|-----------|---------|---------|--------------------|-----------------------|--------------|---------|
| l1 | int | 5 | 2.0 | 9.0 | 8.0 | transform_power_2_int | 100.00 | *** |
| l2 | int | 5 | 2.0 | 9.0 | 9.0 | transform_power_2_int | 0.00 | |
| lr_mult | float | 1.0 | 0.1 | 10.0 | 0.5705075457510752 | None | 0.31 | . |
| batch_size | int | 4 | 1.0 | 4.0 | 4.0 | transform_power_2_int | 1.31 | * |
| epochs | int | 3 | 2.0 | 3.0 | 3.0 | transform_power_2_int | 39.69 | * |
| k_folds | int | 1 | 0.0 | 0.0 | 0.0 | None | 0.00 | |
| patience | int | 5 | 2.0 | 2.0 | 2.0 | None | 0.00 | |
| optimizer | factor | SGD | 0.0 | 12.0 | 1.0 | None | 1.93 | * |
| sgd_momentum | float | 0.0 | 0.0 | 1.0 | 0.971336668097433 | None | 1.40 | * |
spot_tuner.plot_importance(threshold=0.025, filename="./figures/" + experiment_name + "_importance.png")
The architecture of the spotPython
model can be obtained by the following code:
from spotPython.hyperparameters.values import get_one_core_model_from_X
X = spot_tuner.to_all_dim(spot_tuner.min_X.reshape(1,-1))
model_spot = get_one_core_model_from_X(X, fun_control)
model_spot
Net_fashionMNIST(
(flatten): Flatten(start_dim=1, end_dim=-1)
(linear_relu_stack): Sequential(
(0): Linear(in_features=784, out_features=256, bias=True)
(1): ReLU()
(2): Linear(in_features=256, out_features=512, bias=True)
(3): ReLU()
(4): Linear(in_features=512, out_features=10, bias=True)
)
)
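For comparison, the default architecture is reconstructed from X_start below. The cell uses a variable hyper_dict holding the raw hyperparameter dictionary; a sketch of how it can be obtained, assuming TorchHyperDict provides a load() method as in other spotPython chapters:
# load the raw hyperparameter dictionary that supplies the default values
hyper_dict = TorchHyperDict().load()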
fc = fun_control
fc.update({"core_model_hyper_dict":
    hyper_dict[fun_control["core_model"].__name__]})
model_default = get_one_core_model_from_X(X_start, fun_control=fc)
model_default
Net_fashionMNIST(
(flatten): Flatten(start_dim=1, end_dim=-1)
(linear_relu_stack): Sequential(
(0): Linear(in_features=784, out_features=32, bias=True)
(1): ReLU()
(2): Linear(in_features=32, out_features=32, bias=True)
(3): ReLU()
(4): Linear(in_features=32, out_features=10, bias=True)
)
)
The method train_tuned takes a model architecture without trained weights and trains this model with the train data. The train data is split into train and validation data. The validation data is used for early stopping. The trained model weights are saved as a dictionary.
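For reference, saving and restoring such a weight dictionary follows the standard PyTorch state_dict pattern; a minimal sketch with an illustrative file name:
import torch
# save only the weights, not the full module
torch.save(model_default.state_dict(), "model_default_trained.pt")
# restore them into a model with the same architecture
model_default.load_state_dict(torch.load("model_default_trained.pt"))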
from spotPython.torch.traintest import train_tuned
train_tuned(net=model_default, train_dataset=train, shuffle=True,
    loss_function=fun_control["loss_function"],
    metric=fun_control["metric_torch"],
    device=fun_control["device"],
    show_batch_interval=1_000_000,
    path=None,
    task=fun_control["task"])
Epoch: 1
Loss on hold-out set: 2.012274356206258
Accuracy on hold-out set: 0.38379166666666664
MulticlassAccuracy value on hold-out data: 0.38379165530204773
Epoch: 2
Loss on hold-out set: 1.4650455430348714
Accuracy on hold-out set: 0.6065833333333334
MulticlassAccuracy value on hold-out data: 0.6065833568572998
Epoch: 3
Loss on hold-out set: 1.1860688355763753
Accuracy on hold-out set: 0.61525
MulticlassAccuracy value on hold-out data: 0.6152499914169312
Epoch: 4
Loss on hold-out set: 1.0487525905768076
Accuracy on hold-out set: 0.6245
MulticlassAccuracy value on hold-out data: 0.6244999766349792
Epoch: 5
Loss on hold-out set: 0.9639142131010692
Accuracy on hold-out set: 0.6572083333333333
MulticlassAccuracy value on hold-out data: 0.6572083234786987
Epoch: 6
Loss on hold-out set: 0.9053418484131495
Accuracy on hold-out set: 0.6725416666666667
MulticlassAccuracy value on hold-out data: 0.6725416779518127
Epoch: 7
Loss on hold-out set: 0.8607923109730085
Accuracy on hold-out set: 0.688
MulticlassAccuracy value on hold-out data: 0.6880000233650208
Epoch: 8
Loss on hold-out set: 0.8262420001626015
Accuracy on hold-out set: 0.7021666666666667
MulticlassAccuracy value on hold-out data: 0.7021666765213013
Returned to Spot: Validation loss: 0.8262420001626015
----------------------------------------------
from spotPython.torch.traintest import test_tuned
test_tuned(net=model_default, test_dataset=test,
    loss_function=fun_control["loss_function"],
    metric=fun_control["metric_torch"],
    shuffle=False,
    device=fun_control["device"],
    task=fun_control["task"])
Loss on hold-out set: 0.8516067682743073
Accuracy on hold-out set: 0.6899
MulticlassAccuracy value on hold-out data: 0.6898999810218811
Final evaluation: Validation loss: 0.8516067682743073
Final evaluation: Validation metric: 0.6898999810218811
----------------------------------------------
(0.8516067682743073, nan, tensor(0.6899))
The following code trains the model model_spot. If path is set to a filename, e.g., path = "model_spot_trained.pt", the weights of the trained model will be saved to this file.
train_tuned(net=model_spot, train_dataset=train,
    loss_function=fun_control["loss_function"],
    metric=fun_control["metric_torch"],
    shuffle=True,
    device=fun_control["device"],
    path=None,
    task=fun_control["task"])
Epoch: 1
Loss on hold-out set: 0.42698511441797016
Accuracy on hold-out set: 0.84525
MulticlassAccuracy value on hold-out data: 0.8452500104904175
Epoch: 2
Loss on hold-out set: 0.38963056811069446
Accuracy on hold-out set: 0.8589166666666667
MulticlassAccuracy value on hold-out data: 0.8589166402816772
Epoch: 3
Loss on hold-out set: 0.3643972325067346
Accuracy on hold-out set: 0.86675
MulticlassAccuracy value on hold-out data: 0.8667500019073486
Epoch: 4
Loss on hold-out set: 0.3652195796892047
Accuracy on hold-out set: 0.8677083333333333
MulticlassAccuracy value on hold-out data: 0.8677083253860474
Epoch: 5
Loss on hold-out set: 0.34622170781406264
Accuracy on hold-out set: 0.875375
MulticlassAccuracy value on hold-out data: 0.875374972820282
Epoch: 6
Loss on hold-out set: 0.34396518403043347
Accuracy on hold-out set: 0.8763333333333333
MulticlassAccuracy value on hold-out data: 0.8763333559036255
Epoch: 7
Loss on hold-out set: 0.3456016379551341
Accuracy on hold-out set: 0.8724166666666666
MulticlassAccuracy value on hold-out data: 0.8724166750907898
Epoch: 8
Loss on hold-out set: 0.3309829086779306
Accuracy on hold-out set: 0.8797916666666666
MulticlassAccuracy value on hold-out data: 0.8797916769981384
Returned to Spot: Validation loss: 0.3309829086779306
----------------------------------------------
test_tuned(net=model_spot, test_dataset=test,
    shuffle=False,
    loss_function=fun_control["loss_function"],
    metric=fun_control["metric_torch"],
    device=fun_control["device"],
    task=fun_control["task"])
Loss on hold-out set: 0.3626138243019581
Accuracy on hold-out set: 0.873
MulticlassAccuracy value on hold-out data: 0.8730000257492065
Final evaluation: Validation loss: 0.3626138243019581
Final evaluation: Validation metric: 0.8730000257492065
----------------------------------------------
(0.3626138243019581, nan, tensor(0.8730))
= "./figures/" + experiment_name
filename =filename) spot_tuner.plot_important_hyperparameter_contour(filename
l1: 100.00000000000001
lr_mult: 0.3128906751657627
batch_size: 1.3101481981869258
epochs: 39.68960740467837
optimizer: 1.9266098374955214
sgd_momentum: 1.3953255795042632
spot_tuner.parallel_plot()
Parallel coordinates plots
PLOT_ALL = False
if PLOT_ALL:
    n = spot_tuner.k
    # use the range of observed objective values as a common z-scale
    min_z = min(spot_tuner.y)
    max_z = max(spot_tuner.y)
    for i in range(n-1):
        for j in range(i+1, n):
            spot_tuner.plot_contour(i=i, j=j, min_z=min_z, max_z=max_z)