20  HPT: PyTorch With VBDP

In this tutorial, we show how spotPython can be integrated into the PyTorch training workflow for a classification task.

Caution: Data must be downloaded manually
  • Ensure that the corresponding data is available as ./data/VBDP/train.csv.

This document refers to the following software versions:

pip list | grep "spot[RiverPython]"
spotPython               0.2.51
spotRiver                0.0.94
Note: you may need to restart the kernel to use updated packages.

spotPython can be installed via pip. Alternatively, the source code can be downloaded from GitHub: https://github.com/sequential-parameter-optimization/spotPython.

!pip install spotPython
# import sys
# !{sys.executable} -m pip install --upgrade build
# !{sys.executable} -m pip install --upgrade --force-reinstall spotPython

20.1 Step 1: Setup

Before we consider the detailed experimental setup, we select the parameters that affect the run time, the initial design size, and the device to be used.

Caution: Run time and initial design size should be increased for real experiments
  • MAX_TIME is set to one minute for demonstration purposes. For real experiments, this should be increased to at least 1 hour.
  • INIT_SIZE is set to 5 for demonstration purposes. For real experiments, this should be increased to at least 10.
Note: Device selection
  • The device can be selected by setting the variable DEVICE.
  • Since we are using a simple neural net, the setting "cpu" is preferred (on Mac).
  • If you have a GPU, you can use "cuda:0" instead.
  • If DEVICE is set to None, spotPython will automatically select the device.
    • This might result in "mps" on Macs, which is not the best choice for simple neural nets.
MAX_TIME = 1
INIT_SIZE = 5
DEVICE = None # "cpu" # "cuda:0"
from spotPython.utils.device import getDevice
DEVICE = getDevice(DEVICE)
print(DEVICE)
mps
import os
import copy
import socket
from datetime import datetime
from dateutil.tz import tzlocal
start_time = datetime.now(tzlocal())
HOSTNAME = socket.gethostname().split(".")[0]
experiment_name = '25-torch' + "_" + HOSTNAME + "_" + str(MAX_TIME) + "min_" + str(INIT_SIZE) + "init_" + str(start_time).split(".", 1)[0].replace(' ', '_')
experiment_name = experiment_name.replace(':', '-')
print(experiment_name)
if not os.path.exists('./figures'):
    os.makedirs('./figures')
25-torch_maans05_1min_5init_2023-06-28_17-42-19

20.2 Step 2: Initialization of the fun_control Dictionary

Caution: Tensorboard does not work under Windows
  • Since Tensorboard does not work under Windows, we recommend setting the parameter tensorboard_path to None when working on this platform.

spotPython uses a Python dictionary to store the information required for the hyperparameter tuning process, which was described in Section 14.2; see also Initialization of the fun_control Dictionary in the documentation.

from spotPython.utils.init import fun_control_init
fun_control = fun_control_init(task="classification",
    tensorboard_path="runs/25_spot_torch_vbdp",
    device=DEVICE)
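
The returned dictionary can be inspected like any Python dictionary. As a quick sanity check, the entries set above can be printed (a minimal sketch; the exact set of keys depends on the spotPython version, but "task" and "device" are used again later in this tutorial):

print(fun_control["task"])    # "classification"
print(fun_control["device"])  # e.g. "mps"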

20.3 Step 3: PyTorch Data Loading

20.3.1 Load VBDP Data

import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
train_df = pd.read_csv('./data/VBDP/train.csv')
# remove the id column
train_df = train_df.drop(columns=['id'])
n_samples = train_df.shape[0]
n_features = train_df.shape[1] - 1
target_column = "prognosis"
# Encode our prognosis labels as integers for easier decoding later
enc = OrdinalEncoder()
train_df[target_column] = enc.fit_transform(train_df[[target_column]])
# convert all entries to int for faster processing
train_df = train_df.astype(int)
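
Because the prognosis labels are ordinal-encoded, integer predictions can later be mapped back to the original disease names with the fitted encoder. A minimal sketch (the values in preds are hypothetical):

import numpy as np
# hypothetical integer predictions; enc is the fitted OrdinalEncoder from above
preds = np.array([3, 7, 10])
print(enc.inverse_transform(preds.reshape(-1, 1)).ravel())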
  • Add logical combinations (AND, OR, XOR) of the features to the data set:
from spotPython.utils.convert import add_logical_columns
df_new = train_df.copy()
# save the target column using "target_column" as the column name
target = train_df[target_column]
# remove the target column
df_new = df_new.drop(columns=[target_column])
train_df = add_logical_columns(df_new)
# add the target column back
train_df[target_column] = target
train_df = train_df.astype(int)
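
With 64 binary symptom features, adding AND, OR, and XOR for every pair of columns yields 64 + 3 · C(64, 2) = 64 + 3 · 2016 = 6112 features, which matches the shapes printed below. A minimal sketch of such an expansion (the actual add_logical_columns implementation in spotPython may differ):

from itertools import combinations
import pandas as pd

def add_logical_columns_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Append AND, OR, and XOR combinations of all column pairs."""
    new_cols = {}
    for a, b in combinations(df.columns, 2):
        new_cols[f"{a}_and_{b}"] = df[a] & df[b]
        new_cols[f"{a}_or_{b}"] = df[a] | df[b]
        new_cols[f"{a}_xor_{b}"] = df[a] ^ df[b]
    return pd.concat([df, pd.DataFrame(new_cols, index=df.index)], axis=1)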
from sklearn.model_selection import train_test_split
import numpy as np

n_samples = train_df.shape[0]
n_features = train_df.shape[1] - 1
train_df.columns = [f"x{i}" for i in range(1, n_features+1)] + [target_column]

20.3.2 Check the Content of the Target Column

train_df[target_column].head()
0     3
1     7
2     3
3    10
4     6
Name: prognosis, dtype: int64
X_train, X_test, y_train, y_test = train_test_split(train_df.drop(target_column, axis=1), train_df[target_column],
                                                    random_state=42,
                                                    test_size=0.25,
                                                    stratify=train_df[target_column])
trainset = pd.DataFrame(np.hstack((X_train, np.array(y_train).reshape(-1, 1))))
testset = pd.DataFrame(np.hstack((X_test, np.array(y_test).reshape(-1, 1))))
trainset.columns = [f"x{i}" for i in range(1, n_features+1)] + [target_column]
testset.columns = [f"x{i}" for i in range(1, n_features+1)] + [target_column]
print(train_df.shape)
print(trainset.shape)
print(testset.shape)
(707, 6113)
(530, 6113)
(177, 6113)
import torch
from spotPython.torch.dataframedataset import DataFrameDataset
dtype_x = torch.float32
dtype_y = torch.long
train_df = DataFrameDataset(train_df, target_column=target_column, dtype_x=dtype_x, dtype_y=dtype_y)
train = DataFrameDataset(trainset, target_column=target_column, dtype_x=dtype_x, dtype_y=dtype_y)
test = DataFrameDataset(testset, target_column=target_column, dtype_x=dtype_x, dtype_y=dtype_y)
n_samples = len(train)
# add the dataset to the fun_control
fun_control.update({"data": train_df, # full dataset,
               "train": train,
               "test": test,
               "n_samples": n_samples,
               "target_column": target_column})

20.4 Step 4: Specification of the Preprocessing Model

After the training and test data are specified and added to the fun_control dictionary, spotPython allows the specification of a data preprocessing pipeline, e.g., for the scaling of the data or for the one-hot encoding of categorical variables, see Section 14.4. This feature is not used here, so we do not change the default value (which is None).

20.5 Step 5: Select algorithm and core_model_hyper_dict

20.5.1 Implementing a Configurable Neural Network With spotPython

spotPython includes the Net_vbdp class, which is implemented in the file netvbdp.py and imported here.

This class inherits from the class Net_Core which is implemented in the file netcore.py, see Section 14.5.1.

20.5.2 Add the NN Model to the fun_control Dictionary

from spotPython.torch.netvbdp import Net_vbdp
from spotPython.data.torch_hyper_dict import TorchHyperDict
from spotPython.hyperparameters.values import add_core_model_to_fun_control
fun_control = add_core_model_to_fun_control(core_model=Net_vbdp,
                              fun_control=fun_control,
                              hyper_dict=TorchHyperDict)

The corresponding entries for the core_model class are shown below.

fun_control['core_model_hyper_dict']
{'_L0': {'type': 'int',
  'default': 64,
  'transform': 'None',
  'lower': 64,
  'upper': 64},
 'l1': {'type': 'int',
  'default': 8,
  'transform': 'transform_power_2_int',
  'lower': 8,
  'upper': 16},
 'dropout_prob': {'type': 'float',
  'default': 0.01,
  'transform': 'None',
  'lower': 0.0,
  'upper': 0.9},
 'lr_mult': {'type': 'float',
  'default': 1.0,
  'transform': 'None',
  'lower': 0.1,
  'upper': 10.0},
 'batch_size': {'type': 'int',
  'default': 4,
  'transform': 'transform_power_2_int',
  'lower': 1,
  'upper': 4},
 'epochs': {'type': 'int',
  'default': 4,
  'transform': 'transform_power_2_int',
  'lower': 4,
  'upper': 9},
 'k_folds': {'type': 'int',
  'default': 1,
  'transform': 'None',
  'lower': 1,
  'upper': 1},
 'patience': {'type': 'int',
  'default': 2,
  'transform': 'transform_power_2_int',
  'lower': 1,
  'upper': 5},
 'optimizer': {'levels': ['Adadelta',
   'Adagrad',
   'Adam',
   'AdamW',
   'SparseAdam',
   'Adamax',
   'ASGD',
   'NAdam',
   'RAdam',
   'RMSprop',
   'Rprop',
   'SGD'],
  'type': 'factor',
  'default': 'SGD',
  'transform': 'None',
  'class_name': 'torch.optim',
  'core_model_parameter_type': 'str',
  'lower': 0,
  'upper': 12},
 'sgd_momentum': {'type': 'float',
  'default': 0.0,
  'transform': 'None',
  'lower': 0.0,
  'upper': 1.0}}
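
The transform entries describe how the raw values sampled by the tuner are mapped to the values passed to the model. For example, transform_power_2_int maps an integer x to 2**x, so the bounds [8, 16] for l1 correspond to layer widths between 256 and 65536 (a minimal sketch of the transform named in the hyper_dict):

def transform_power_2_int(x: int) -> int:
    return 2 ** x

print(transform_power_2_int(8), transform_power_2_int(16))  # 256 65536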

20.6 Step 6: Modify hyper_dict Hyperparameters for the Selected Algorithm aka core_model

spotPython provides functions for modifying the hyperparameters, their bounds and factors, as well as for activating and deactivating hyperparameters without recompiling the Python source code. These functions were described in Section 14.6.

Caution: Small number of epochs for demonstration purposes
  • epochs and patience are set to small values for demonstration purposes. These values are too small for a real application.
  • More reasonable values are, e.g.:
    • fun_control = modify_hyper_parameter_bounds(fun_control, "epochs", bounds=[7, 9]) and
    • fun_control = modify_hyper_parameter_bounds(fun_control, "patience", bounds=[2, 7])
from spotPython.hyperparameters.values import modify_hyper_parameter_bounds

fun_control = modify_hyper_parameter_bounds(fun_control, "_L0", bounds=[n_features, n_features])
fun_control = modify_hyper_parameter_bounds(fun_control, "l1", bounds=[6, 13])
fun_control = modify_hyper_parameter_bounds(fun_control, "epochs", bounds=[2, 3])
fun_control = modify_hyper_parameter_bounds(fun_control, "patience", bounds=[2, 2])
from spotPython.hyperparameters.values import modify_hyper_parameter_levels
fun_control = modify_hyper_parameter_levels(fun_control, "optimizer", ["Adam", "AdamW", "Adamax", "NAdam"])
# fun_control = modify_hyper_parameter_levels(fun_control, "optimizer", ["Adam"])
# fun_control["core_model_hyper_dict"]

20.6.1 Optimizers

Optimizers are described in Section 14.6.1. Note that setting identical lower and upper bounds, as done below for lr_mult and sgd_momentum, fixes a hyperparameter at that value and effectively removes it from the tuning.

fun_control = modify_hyper_parameter_bounds(fun_control,
    "lr_mult", bounds=[1e-3, 1e-3])
fun_control = modify_hyper_parameter_bounds(fun_control,
    "sgd_momentum", bounds=[0.9, 0.9])

20.7 Step 7: Selection of the Objective (Loss) Function

20.7.1 Evaluation

The evaluation procedure requires the specification of two elements:

  1. how the data is split into a train and a test set (see Section 14.7.1), and
  2. the loss function (and a metric).

20.7.2 Loss Functions and Metrics

The loss function is specified by the key "loss_function". We will use the CrossEntropyLoss for this multi-class classification task.

from torch.nn import CrossEntropyLoss
loss_function = CrossEntropyLoss()
fun_control.update({"loss_function": loss_function})

20.7.3 Metric

  • We will use the MAP@k metric for the evaluation of the model. Here is an example of how this metric is calculated.
from spotPython.torch.mapk import MAPK
import torch
mapk = MAPK(k=2)
target = torch.tensor([0, 1, 2, 2])
preds = torch.tensor(
    [
        [0.5, 0.2, 0.2],  # 0 is in top 2
        [0.3, 0.4, 0.2],  # 1 is in top 2
        [0.2, 0.4, 0.3],  # 2 is in top 2
        [0.7, 0.2, 0.1],  # 2 isn't in top 2
    ]
)
mapk.update(preds, target)
print(mapk.compute()) # tensor(0.6250)
tensor(0.6250)
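
For these four samples, the average precision at k = 2 is 1.0, 1.0, 0.5 (class 2 appears only at rank 2 of the top-2 predictions), and 0.0 (class 2 is not among the top 2), so MAP@2 = (1.0 + 1.0 + 0.5 + 0.0)/4 = 0.625, which matches the computed value.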
from spotPython.torch.mapk import MAPK
metric_torch = MAPK(k=3)
fun_control.update({"metric_torch": metric_torch})

20.8 Step 8: Calling the SPOT Function

20.8.1 Preparing the SPOT Call

The following code passes the information about the parameter ranges and bounds to spot.

# extract the variable types, names, and bounds
from spotPython.hyperparameters.values import (get_bound_values,
    get_var_name,
    get_var_type,)
var_type = get_var_type(fun_control)
var_name = get_var_name(fun_control)
fun_control.update({"var_type": var_type,
                    "var_name": var_name})
lower = get_bound_values(fun_control, "lower")
upper = get_bound_values(fun_control, "upper")

Now, the dictionary fun_control contains all the information needed for the hyperparameter tuning. Before the tuning is started, it is recommended to take a look at the experimental design. The function gen_design_table generates a design table as follows:

from spotPython.utils.eda import gen_design_table
print(gen_design_table(fun_control))
| name         | type   | default   |    lower |    upper | transform             |
|--------------|--------|-----------|----------|----------|-----------------------|
| _L0          | int    | 64        | 6112     | 6112     | None                  |
| l1           | int    | 8         |    6     |   13     | transform_power_2_int |
| dropout_prob | float  | 0.01      |    0     |    0.9   | None                  |
| lr_mult      | float  | 1.0       |    0.001 |    0.001 | None                  |
| batch_size   | int    | 4         |    1     |    4     | transform_power_2_int |
| epochs       | int    | 4         |    2     |    3     | transform_power_2_int |
| k_folds      | int    | 1         |    1     |    1     | None                  |
| patience     | int    | 2         |    2     |    2     | transform_power_2_int |
| optimizer    | factor | SGD       |    0     |    3     | None                  |
| sgd_momentum | float  | 0.0       |    0.9   |    0.9   | None                  |

This allows us to check whether all the information is available and correct.

20.8.2 The Objective Function fun_torch

The objective function fun_torch is selected next. It implements an interface from PyTorch’s training, validation, and testing methods to spotPython. In addition, the default hyperparameter configuration is converted into a numeric array X_start, which is passed to the tuner below as a starting point.

from spotPython.fun.hypertorch import HyperTorch
fun = HyperTorch().fun_torch
from spotPython.hyperparameters.values import get_default_hyperparameters_as_array
hyper_dict = TorchHyperDict().load()
X_start = get_default_hyperparameters_as_array(fun_control, hyper_dict)

20.8.3 Starting the Hyperparameter Tuning

The spotPython hyperparameter tuning is started by calling the Spot function as described in Section 14.8.4.

import numpy as np
from spotPython.spot import spot
from math import inf
spot_tuner = spot.Spot(fun=fun,
                       lower=lower,
                       upper=upper,
                       fun_evals=inf,
                       fun_repeats=1,
                       max_time=MAX_TIME,
                       noise=False,
                       tolerance_x=np.sqrt(np.spacing(1)),
                       var_type=var_type,
                       var_name=var_name,
                       infill_criterion="y",
                       n_points=1,
                       seed=123,
                       log_level=50,
                       show_models=False,
                       show_progress=True,
                       fun_control=fun_control,
                       design_control={"init_size": INIT_SIZE,
                                       "repeats": 1},
                       surrogate_control={"noise": True,
                                          "cod_type": "norm",
                                          "min_theta": -4,
                                          "max_theta": 3,
                                          "n_theta": len(var_name),
                                          "model_fun_evals": 10_000,
                                          "log_level": 50})
spot_tuner.run(X_start=X_start)

config: {'_L0': 6112, 'l1': 2048, 'dropout_prob': 0.17031221661559992, 'lr_mult': 0.001, 'batch_size': 16, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'AdamW', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.2001488208770752 | Loss: 2.3973311356135776 | Acc: 0.1226415094339623.
Epoch: 2 | 
MAPK: 0.2008928805589676 | Loss: 2.3972574642726352 | Acc: 0.1179245283018868.
Epoch: 3 | 
MAPK: 0.1971726268529892 | Loss: 2.3972156729016985 | Acc: 0.1179245283018868.
Epoch: 4 | 
MAPK: 0.2001488059759140 | Loss: 2.3971562215260098 | Acc: 0.1132075471698113.
Epoch: 5 | 
MAPK: 0.2105654776096344 | Loss: 2.3971127441951205 | Acc: 0.1226415094339623.
Epoch: 6 | 
MAPK: 0.2016369104385376 | Loss: 2.3970473834446500 | Acc: 0.1132075471698113.
Epoch: 7 | 
MAPK: 0.2008928507566452 | Loss: 2.3970134939466203 | Acc: 0.1132075471698113.
Epoch: 8 | 
MAPK: 0.2053571641445160 | Loss: 2.3970162017004832 | Acc: 0.1179245283018868.
Returned to Spot: Validation loss: 2.3970162017004832

config: {'_L0': 6112, 'l1': 256, 'dropout_prob': 0.19379790035512987, 'lr_mult': 0.001, 'batch_size': 8, 'epochs': 4, 'k_folds': 1, 'patience': 4, 'optimizer': 'Adamax', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1728395223617554 | Loss: 2.3983015572583235 | Acc: 0.0990566037735849.
Epoch: 2 | 
MAPK: 0.1720679104328156 | Loss: 2.3982979456583657 | Acc: 0.0990566037735849.
Epoch: 3 | 
MAPK: 0.1736111044883728 | Loss: 2.3982781745769359 | Acc: 0.0990566037735849.
Epoch: 4 | 
MAPK: 0.1728395223617554 | Loss: 2.3982225700660988 | Acc: 0.0990566037735849.
Returned to Spot: Validation loss: 2.398222570066099

config: {'_L0': 6112, 'l1': 4096, 'dropout_prob': 0.6759063718076167, 'lr_mult': 0.001, 'batch_size': 2, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'NAdam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1761006414890289 | Loss: 2.3977689180734023 | Acc: 0.0943396226415094.
Epoch: 2 | 
MAPK: 0.1886792480945587 | Loss: 2.3975801175495364 | Acc: 0.0990566037735849.
Epoch: 3 | 
MAPK: 0.1949685662984848 | Loss: 2.3976262178061143 | Acc: 0.1037735849056604.
Epoch: 4 | 
MAPK: 0.1918238997459412 | Loss: 2.3973903970898323 | Acc: 0.0990566037735849.
Epoch: 5 | 
MAPK: 0.1855345964431763 | Loss: 2.3970521845907533 | Acc: 0.1037735849056604.
Epoch: 6 | 
MAPK: 0.1973270326852798 | Loss: 2.3970791371363513 | Acc: 0.1084905660377359.
Epoch: 7 | 
MAPK: 0.1965408623218536 | Loss: 2.3965954218270644 | Acc: 0.1037735849056604.
Epoch: 8 | 
MAPK: 0.2044025063514709 | Loss: 2.3960597582583159 | Acc: 0.1320754716981132.
Returned to Spot: Validation loss: 2.396059758258316

config: {'_L0': 6112, 'l1': 128, 'dropout_prob': 0.37306669346546995, 'lr_mult': 0.001, 'batch_size': 4, 'epochs': 4, 'k_folds': 1, 'patience': 4, 'optimizer': 'AdamW', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1808176040649414 | Loss: 2.3977439583472484 | Acc: 0.0990566037735849.
Epoch: 2 | 
MAPK: 0.1823899298906326 | Loss: 2.3977595770134115 | Acc: 0.0990566037735849.
Epoch: 3 | 
MAPK: 0.1808176040649414 | Loss: 2.3977082360465571 | Acc: 0.0990566037735849.
Epoch: 4 | 
MAPK: 0.1808176040649414 | Loss: 2.3977157259887121 | Acc: 0.0990566037735849.
Returned to Spot: Validation loss: 2.397715725988712

config: {'_L0': 6112, 'l1': 1024, 'dropout_prob': 0.870137281216666, 'lr_mult': 0.001, 'batch_size': 8, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'Adam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1712963283061981 | Loss: 2.3980832718036793 | Acc: 0.0896226415094340.
Epoch: 2 | 
MAPK: 0.1697530895471573 | Loss: 2.3978549197868064 | Acc: 0.0754716981132075.
Epoch: 3 | 
MAPK: 0.1759259104728699 | Loss: 2.3980140509428800 | Acc: 0.0707547169811321.
Epoch: 4 | 
MAPK: 0.1751543283462524 | Loss: 2.3978794592398183 | Acc: 0.0801886792452830.
Epoch: 5 | 
MAPK: 0.1751543134450912 | Loss: 2.3978811281698720 | Acc: 0.0849056603773585.
Epoch: 6 | 
MAPK: 0.1604938358068466 | Loss: 2.3980553061873824 | Acc: 0.0849056603773585.
Early stopping at epoch 5
Returned to Spot: Validation loss: 2.3980553061873824

config: {'_L0': 6112, 'l1': 4096, 'dropout_prob': 0.8368584385444511, 'lr_mult': 0.001, 'batch_size': 2, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'NAdam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1635220199823380 | Loss: 2.3979257457661181 | Acc: 0.0801886792452830.
Epoch: 2 | 
MAPK: 0.1643081754446030 | Loss: 2.3975095546470500 | Acc: 0.0849056603773585.
Epoch: 3 | 
MAPK: 0.1957547217607498 | Loss: 2.3974573747167049 | Acc: 0.1226415094339623.
Epoch: 4 | 
MAPK: 0.1729559749364853 | Loss: 2.3971775590248829 | Acc: 0.0943396226415094.
Epoch: 5 | 
MAPK: 0.1745283007621765 | Loss: 2.3972414597025455 | Acc: 0.0990566037735849.
Epoch: 6 | 
MAPK: 0.1753144711256027 | Loss: 2.3970018130428388 | Acc: 0.1084905660377359.
Epoch: 7 | 
MAPK: 0.1894654184579849 | Loss: 2.3966686725616455 | Acc: 0.1179245283018868.
Epoch: 8 | 
MAPK: 0.1902515888214111 | Loss: 2.3962006141554633 | Acc: 0.1226415094339623.
Returned to Spot: Validation loss: 2.3962006141554633
spotPython tuning: 2.396059758258316 [########--] 75.35% 

config: {'_L0': 6112, 'l1': 4096, 'dropout_prob': 0.4132005099912892, 'lr_mult': 0.001, 'batch_size': 2, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'NAdam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1698113083839417 | Loss: 2.3977420060139782 | Acc: 0.0849056603773585.
Epoch: 2 | 
MAPK: 0.1768867969512939 | Loss: 2.3973733389152669 | Acc: 0.0896226415094340.
Epoch: 3 | 
MAPK: 0.1713836640119553 | Loss: 2.3967043318838441 | Acc: 0.0849056603773585.
Epoch: 4 | 
MAPK: 0.1658805310726166 | Loss: 2.3961260048848279 | Acc: 0.0849056603773585.
Epoch: 5 | 
MAPK: 0.1713836491107941 | Loss: 2.3948960731614313 | Acc: 0.0849056603773585.
Epoch: 6 | 
MAPK: 0.1737421751022339 | Loss: 2.3936951182923227 | Acc: 0.0849056603773585.
Epoch: 7 | 
MAPK: 0.1729560047388077 | Loss: 2.3924435624536478 | Acc: 0.0849056603773585.
Epoch: 8 | 
MAPK: 0.1745283156633377 | Loss: 2.3913613400369322 | Acc: 0.0849056603773585.
Returned to Spot: Validation loss: 2.3913613400369322
spotPython tuning: 2.3913613400369322 [##########] 100.00% Done...
<spotPython.spot.spot.Spot at 0x1637effd0>

20.9 Step 9: Tensorboard

The textual output shown in the console (or code cell) can be visualized with Tensorboard as described in Section 14.9; see also the description in the documentation: Tensorboard.
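
Tensorboard reads the logs that were written to the tensorboard_path configured in Step 2. It can be started from the command line, for example:

tensorboard --logdir="runs/"

and opened in a browser at http://localhost:6006.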

20.10 Step 10: Results

After the hyperparameter tuning run is finished, the results can be analyzed as described in Section 14.10.

spot_tuner.plot_progress(log_y=False, 
    filename="./figures/" + experiment_name+"_progress.png")

Progress plot. Black dots denote results from the initial design. Red dots illustrate the improvement found by the surrogate-model-based optimization.
from spotPython.utils.eda import gen_design_table
print(gen_design_table(fun_control=fun_control, spot=spot_tuner))
| name         | type   | default   |   lower |   upper |              tuned | transform             |   importance | stars   |
|--------------|--------|-----------|---------|---------|--------------------|-----------------------|--------------|---------|
| _L0          | int    | 64        |  6112.0 |  6112.0 |             6112.0 | None                  |         0.00 |         |
| l1           | int    | 8         |     6.0 |    13.0 |               12.0 | transform_power_2_int |         0.00 |         |
| dropout_prob | float  | 0.01      |     0.0 |     0.9 | 0.4132005099912892 | None                  |         7.50 | *       |
| lr_mult      | float  | 1.0       |   0.001 |   0.001 |              0.001 | None                  |         0.00 |         |
| batch_size   | int    | 4         |     1.0 |     4.0 |                1.0 | transform_power_2_int |         1.78 | *       |
| epochs       | int    | 4         |     2.0 |     3.0 |                3.0 | transform_power_2_int |       100.00 | ***     |
| k_folds      | int    | 1         |     1.0 |     1.0 |                1.0 | None                  |         0.00 |         |
| patience     | int    | 2         |     2.0 |     2.0 |                2.0 | transform_power_2_int |         0.00 |         |
| optimizer    | factor | SGD       |     0.0 |     3.0 |                3.0 | None                  |        52.40 | **      |
| sgd_momentum | float  | 0.0       |     0.9 |     0.9 |                0.9 | None                  |         0.00 |         |
spot_tuner.plot_importance(threshold=0.025,
    filename="./figures/" + experiment_name+"_importance.png")

Variable importance plot, threshold 0.025.

20.10.1 Get the Tuned Architecture

from spotPython.hyperparameters.values import get_one_core_model_from_X
X = spot_tuner.to_all_dim(spot_tuner.min_X.reshape(1,-1))
model_spot = get_one_core_model_from_X(X, fun_control)
model_spot
Net_vbdp(
  (fc1): Linear(in_features=6112, out_features=4096, bias=True)
  (fc2): Linear(in_features=4096, out_features=2048, bias=True)
  (fc3): Linear(in_features=2048, out_features=1024, bias=True)
  (fc4): Linear(in_features=1024, out_features=512, bias=True)
  (fc5): Linear(in_features=512, out_features=11, bias=True)
  (relu): ReLU()
  (softmax): Softmax(dim=1)
  (dropout1): Dropout(p=0.4132005099912892, inplace=False)
  (dropout2): Dropout(p=0.2066002549956446, inplace=False)
)
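
The printed modules suggest a funnel-shaped fully connected network whose widths are halved from layer to layer, starting at l1 = 4096. A minimal sketch of a forward pass consistent with this output (the actual Net_vbdp implementation, including the exact placement of the dropout layers, may differ):

import torch
import torch.nn as nn

class NetVbdpSketch(nn.Module):
    def __init__(self, _L0=6112, l1=4096, dropout_prob=0.4132, n_classes=11):
        super().__init__()
        self.fc1 = nn.Linear(_L0, l1)
        self.fc2 = nn.Linear(l1, l1 // 2)
        self.fc3 = nn.Linear(l1 // 2, l1 // 4)
        self.fc4 = nn.Linear(l1 // 4, l1 // 8)
        self.fc5 = nn.Linear(l1 // 8, n_classes)
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=1)
        self.dropout1 = nn.Dropout(p=dropout_prob)       # p = dropout_prob
        self.dropout2 = nn.Dropout(p=dropout_prob / 2)   # p = dropout_prob / 2

    def forward(self, x):
        x = self.dropout1(self.relu(self.fc1(x)))
        x = self.dropout2(self.relu(self.fc2(x)))
        x = self.relu(self.fc3(x))
        x = self.relu(self.fc4(x))
        return self.softmax(self.fc5(x))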

20.10.2 Evaluation of the Tuned Architecture

from spotPython.torch.traintest import (
    train_tuned,
    test_tuned,
    )
train_tuned(net=model_spot, train_dataset=train,
        loss_function=fun_control["loss_function"],
        metric=fun_control["metric_torch"],
        shuffle=True,
        device=fun_control["device"],
        path=None,
        task=fun_control["task"])
Epoch: 1 | 
MAPK: 0.1745283156633377 | Loss: 2.3978872636579118 | Acc: 0.1132075471698113.
Epoch: 2 | 
MAPK: 0.1745283007621765 | Loss: 2.3974965653329527 | Acc: 0.1132075471698113.
Epoch: 3 | 
MAPK: 0.1666666716337204 | Loss: 2.3970298384720423 | Acc: 0.1132075471698113.
Epoch: 4 | 
MAPK: 0.1863207519054413 | Loss: 2.3964918109605895 | Acc: 0.1132075471698113.
Epoch: 5 | 
MAPK: 0.1941823661327362 | Loss: 2.3956689452225306 | Acc: 0.1132075471698113.
Epoch: 6 | 
MAPK: 0.1926100552082062 | Loss: 2.3943297233221665 | Acc: 0.1132075471698113.
Epoch: 7 | 
MAPK: 0.1839622706174850 | Loss: 2.3928304865675152 | Acc: 0.1132075471698113.
Epoch: 8 | 
MAPK: 0.1981131583452225 | Loss: 2.3907967873339384 | Acc: 0.1132075471698113.
Returned to Spot: Validation loss: 2.3907967873339384

If path is set to a filename, e.g., path = "model_spot_trained.pt", the weights of the trained model will be loaded from this file by test_tuned.

test_tuned(net=model_spot, test_dataset=test,
            shuffle=False,
            loss_function=fun_control["loss_function"],
            metric=fun_control["metric_torch"],
            device=fun_control["device"],
            task=fun_control["task"])
MAPK: 0.2172284871339798 | Loss: 2.3866183945302213 | Acc: 0.1186440677966102.
Final evaluation: Validation loss: 2.3866183945302213
Final evaluation: Validation metric: 0.2172284871339798
----------------------------------------------
(2.3866183945302213, nan, tensor(0.2172))

20.10.3 Cross-validated Evaluations

  • This is the evaluation that will be used in the comparison.
Caution: Cross-validated Evaluations
  • The number of folds is set to 1 by default.
  • Here it was changed to 3 for demonstration purposes.
  • Set the number of folds to a reasonable value, e.g., 10.
  • This can be done by setting the k_folds attribute of the model as follows:
  • setattr(model_spot, "k_folds", 10)
from spotPython.torch.traintest import evaluate_cv
# modify k_folds:
setattr(model_spot, "k_folds", 3)
df_eval, df_preds, df_metrics = evaluate_cv(net=model_spot,
    dataset=fun_control["data"],
    loss_function=fun_control["loss_function"],
    metric=fun_control["metric_torch"],
    task=fun_control["task"],
    writer=fun_control["writer"],
    writerId="model_spot_cv",
    device=fun_control["device"])
Fold: 1
Epoch: 1 | 
MAPK: 0.1497175395488739 | Loss: 2.3975370192931869 | Acc: 0.0889830508474576.
Epoch: 2 | 
MAPK: 0.2252824753522873 | Loss: 2.3966445619777099 | Acc: 0.1398305084745763.
Epoch: 3 | 
MAPK: 0.2457627058029175 | Loss: 2.3954627150196139 | Acc: 0.1355932203389831.
Epoch: 4 | 
MAPK: 0.2365819066762924 | Loss: 2.3934934280686457 | Acc: 0.1271186440677966.
Epoch: 5 | 
MAPK: 0.2351694107055664 | Loss: 2.3900440325171259 | Acc: 0.1271186440677966.
Epoch: 6 | 
MAPK: 0.2422316074371338 | Loss: 2.3860418493464843 | Acc: 0.1271186440677966.
Epoch: 7 | 
MAPK: 0.2683615386486053 | Loss: 2.3816144062300859 | Acc: 0.1440677966101695.
Epoch: 8 | 
MAPK: 0.3319208920001984 | Loss: 2.3781753859277499 | Acc: 0.2118644067796610.
Fold: 2
Epoch: 1 | 
MAPK: 0.2153954654932022 | Loss: 2.3970630310349543 | Acc: 0.1186440677966102.
Epoch: 2 | 
MAPK: 0.2415253371000290 | Loss: 2.3965759337958641 | Acc: 0.1186440677966102.
Epoch: 3 | 
MAPK: 0.2655366361141205 | Loss: 2.3957776841470753 | Acc: 0.1186440677966102.
Epoch: 4 | 
MAPK: 0.2782485485076904 | Loss: 2.3944688045372398 | Acc: 0.1186440677966102.
Epoch: 5 | 
MAPK: 0.2747174799442291 | Loss: 2.3923674983493353 | Acc: 0.1186440677966102.
Epoch: 6 | 
MAPK: 0.2711863815784454 | Loss: 2.3891136828115429 | Acc: 0.1186440677966102.
Epoch: 7 | 
MAPK: 0.2824858725070953 | Loss: 2.3850725545721541 | Acc: 0.1610169491525424.
Epoch: 8 | 
MAPK: 0.2923728823661804 | Loss: 2.3807574025655196 | Acc: 0.1779661016949153.
Fold: 3
Epoch: 1 | 
MAPK: 0.1899717450141907 | Loss: 2.3972855685120922 | Acc: 0.0978723404255319.
Epoch: 2 | 
MAPK: 0.1970338672399521 | Loss: 2.3966946763507391 | Acc: 0.1191489361702128.
Epoch: 3 | 
MAPK: 0.2012711614370346 | Loss: 2.3957006123106357 | Acc: 0.1276595744680851.
Epoch: 4 | 
MAPK: 0.1984463036060333 | Loss: 2.3942144765692244 | Acc: 0.1234042553191489.
Epoch: 5 | 
MAPK: 0.1949152201414108 | Loss: 2.3920408083220659 | Acc: 0.1148936170212766.
Epoch: 6 | 
MAPK: 0.1920903772115707 | Loss: 2.3891599542003568 | Acc: 0.1148936170212766.
Epoch: 7 | 
MAPK: 0.1963277012109756 | Loss: 2.3857416581299344 | Acc: 0.1148936170212766.
Epoch: 8 | 
MAPK: 0.2224575877189636 | Loss: 2.3822010573694263 | Acc: 0.1404255319148936.
metric_name = type(fun_control["metric_torch"]).__name__
print(f"loss: {df_eval}, Cross-validated {metric_name}: {df_metrics}")
loss: 2.3803779486208985, Cross-validated MAPK: 0.28225046396255493

20.10.4 Detailed Hyperparameter Plots

filename = "./figures/" + experiment_name
spot_tuner.plot_important_hyperparameter_contour(filename=filename)
dropout_prob:  7.503274030422584
batch_size:  1.7777788119741342
epochs:  100.0
optimizer:  52.39842520546456

Contour plots.

20.10.5 Parallel Coordinates Plot

spot_tuner.parallel_plot()

Parallel coordinates plot.

# close the tensorboard writer
if fun_control["writer"] is not None:
    fun_control["writer"].close()

20.10.6 Plot all Combinations of Hyperparameters

  • Warning: this may take a while.
PLOT_ALL = False
if PLOT_ALL:
    n = spot_tuner.k
    # min_z and max_z set the color scale; here they are derived from the
    # observed objective values (an assumption; any fixed range works as well)
    min_z = min(spot_tuner.y)
    max_z = max(spot_tuner.y)
    for i in range(n-1):
        for j in range(i+1, n):
            spot_tuner.plot_contour(i=i, j=j, min_z=min_z, max_z=max_z)