20  HPT: PyTorch With VBDP

In this tutorial, we show how spotPython can be integrated into the PyTorch training workflow for a classification task.

Caution: Data must be downloaded manually
  • Ensure that the corresponding data is available as ./data/VBDP/train.csv.

This document refers to the following software versions:

pip list | grep "spot[RiverPython]"
spotPython                 0.2.38
spotRiver                  0.0.93
Note: you may need to restart the kernel to use updated packages.

spotPython can be installed via pip. Alternatively, the source code can be downloaded from GitHub: https://github.com/sequential-parameter-optimization/spotPython.

!pip install spotPython
# import sys
# !{sys.executable} -m pip install --upgrade build
# !{sys.executable} -m pip install --upgrade --force-reinstall spotPython

20.1 Step 1: Setup

Before we consider the detailed experimental setup, we select the parameters that affect the run time, the initial design size, and the device to be used.

Caution: Run time and initial design size should be increased for real experiments
  • MAX_TIME is set to one minute for demonstration purposes. For real experiments, this should be increased to at least 1 hour.
  • INIT_SIZE is set to 5 for demonstration purposes. For real experiments, this should be increased to at least 10.
Note: Device selection
  • The device can be selected by setting the variable DEVICE.
  • Since we are using a simple neural net, the setting "cpu" is preferred (on a Mac).
  • If you have a GPU, you can use "cuda:0" instead.
  • If DEVICE is set to None, spotPython will automatically select the device.
    • This might result in "mps" on Macs, which is not the best choice for simple neural nets.
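The automatic selection follows roughly the fallback logic sketched below; this is a hedged illustration, not the actual getDevice implementation:

import torch
# hypothetical sketch of a device-selection fallback: prefer an explicit
# choice, otherwise "cuda:0", then "mps", then "cpu"
def pick_device(device=None):
    if device is not None:
        return device
    if torch.cuda.is_available():
        return "cuda:0"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"
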
MAX_TIME = 1
INIT_SIZE = 5
DEVICE = None # "cpu" # "cuda:0"
from spotPython.utils.device import getDevice
DEVICE = getDevice(DEVICE)
print(DEVICE)
mps
import os
import copy
import socket
from datetime import datetime
from dateutil.tz import tzlocal
start_time = datetime.now(tzlocal())
HOSTNAME = socket.gethostname().split(".")[0]
experiment_name = '25-torch' + "_" + HOSTNAME + "_" + str(MAX_TIME) + "min_" + str(INIT_SIZE) + "init_" + str(start_time).split(".", 1)[0].replace(' ', '_')
experiment_name = experiment_name.replace(':', '-')
print(experiment_name)
if not os.path.exists('./figures'):
    os.makedirs('./figures')
25-torch_bartz09_1min_5init_2023-06-19_04-27-28

20.2 Step 2: Initialization of the fun_control Dictionary

Caution: Tensorboard does not work under Windows
  • Since tensorboard does not work under Windows, we recommend that Windows users set the parameter tensorboard_path to None.

spotPython uses a Python dictionary for storing the information required for the hyperparameter tuning process, which was described in Section 14.2; see also Initialization of the fun_control Dictionary in the documentation.

from spotPython.utils.init import fun_control_init
fun_control = fun_control_init(task="classification",
    tensorboard_path="runs/25_spot_torch_vbdp",
    device=DEVICE)

20.3 Step 3: PyTorch Data Loading

20.3.1 Load VBDP Data

import pandas as pd
from sklearn.preprocessing import OrdinalEncoder
train_df = pd.read_csv('./data/VBDP/train.csv')
# remove the id column
train_df = train_df.drop(columns=['id'])
n_samples = train_df.shape[0]
n_features = train_df.shape[1] - 1
target_column = "prognosis"
# encode the prognosis labels as integers for easier decoding later
enc = OrdinalEncoder()
train_df[target_column] = enc.fit_transform(train_df[[target_column]])
train_df.head()

# convert all entries to int for faster processing
train_df = train_df.astype(int)
  • Add logical combinations (AND, OR, XOR) of the features to the data set:
from spotPython.utils.convert import add_logical_columns
df_new = train_df.copy()
# save the target column using "target_column" as the column name
target = train_df[target_column]
# remove the target column
df_new = df_new.drop(columns=[target_column])
train_df = add_logical_columns(df_new)
# add the target column back
train_df[target_column] = target
train_df = train_df.astype(int)
train_df.head()
sudden_fever headache mouth_bleed nose_bleed muscle_pain joint_pain vomiting rash diarrhea hypotension ... 6039 6040 6041 6042 6043 6044 6045 6046 6047 prognosis
0 1 1 0 1 1 1 1 0 1 1 ... 0 0 0 0 0 0 0 0 0 3
1 0 0 0 0 0 0 1 0 1 0 ... 0 0 0 0 0 0 0 0 0 7
2 0 1 1 1 0 1 1 1 1 1 ... 1 1 0 1 1 0 1 1 0 3
3 0 0 1 1 1 1 0 1 0 1 ... 0 0 0 0 0 0 0 0 0 10
4 0 0 0 0 0 0 0 0 1 0 ... 0 1 1 0 1 1 0 0 0 6

5 rows × 6113 columns
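The column count can be explained as follows: the 64 original binary features are combined pairwise, and for each of the C(64, 2) = 2016 pairs the AND, OR, and XOR combinations are appended, giving 64 + 3 · 2016 = 6112 feature columns (plus the target). A minimal sketch of this idea, as a hypothetical re-implementation rather than the spotPython source of add_logical_columns:

from itertools import combinations
import pandas as pd

# hypothetical sketch: for every pair of binary columns, append their
# AND, OR, and XOR combinations
def add_pairwise_logic(df: pd.DataFrame) -> pd.DataFrame:
    new_cols = {}
    for a, b in combinations(df.columns, 2):
        new_cols[f"{a}_and_{b}"] = df[a] & df[b]
        new_cols[f"{a}_or_{b}"] = df[a] | df[b]
        new_cols[f"{a}_xor_{b}"] = df[a] ^ df[b]
    return pd.concat([df, pd.DataFrame(new_cols)], axis=1)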

from sklearn.model_selection import train_test_split
import numpy as np

n_samples = train_df.shape[0]
n_features = train_df.shape[1] - 1
train_df.columns = [f"x{i}" for i in range(1, n_features+1)] + [target_column]
train_df.head()
x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 ... x6104 x6105 x6106 x6107 x6108 x6109 x6110 x6111 x6112 prognosis
0 1 1 0 1 1 1 1 0 1 1 ... 0 0 0 0 0 0 0 0 0 3
1 0 0 0 0 0 0 1 0 1 0 ... 0 0 0 0 0 0 0 0 0 7
2 0 1 1 1 0 1 1 1 1 1 ... 1 1 0 1 1 0 1 1 0 3
3 0 0 1 1 1 1 0 1 0 1 ... 0 0 0 0 0 0 0 0 0 10
4 0 0 0 0 0 0 0 0 1 0 ... 0 1 1 0 1 1 0 0 0 6

5 rows × 6113 columns

20.3.2 Check content of the target column

train_df[target_column].head()
0     3
1     7
2     3
3    10
4     6
Name: prognosis, dtype: int64
X_train, X_test, y_train, y_test = train_test_split(train_df.drop(target_column, axis=1), train_df[target_column],
                                                    random_state=42,
                                                    test_size=0.25,
                                                    stratify=train_df[target_column])
trainset = pd.DataFrame(np.hstack((X_train, np.array(y_train).reshape(-1, 1))))
testset = pd.DataFrame(np.hstack((X_test, np.array(y_test).reshape(-1, 1))))
trainset.columns = [f"x{i}" for i in range(1, n_features+1)] + [target_column]
testset.columns = [f"x{i}" for i in range(1, n_features+1)] + [target_column]
print(train_df.shape)
print(trainset.shape)
print(testset.shape)
(707, 6113)
(530, 6113)
(177, 6113)
import torch
from sklearn.model_selection import train_test_split
from spotPython.torch.dataframedataset import DataFrameDataset
dtype_x = torch.float32
dtype_y = torch.long
train_df = DataFrameDataset(train_df, target_column=target_column, dtype_x=dtype_x, dtype_y=dtype_y)
train = DataFrameDataset(trainset, target_column=target_column, dtype_x=dtype_x, dtype_y=dtype_y)
test = DataFrameDataset(testset, target_column=target_column, dtype_x=dtype_x, dtype_y=dtype_y)
n_samples = len(train)
# add the dataset to the fun_control
fun_control.update({"data": train_df, # full dataset,
               "train": train,
               "test": test,
               "n_samples": n_samples,
               "target_column": target_column})

20.4 Step 4: Specification of the Preprocessing Model

After the training and test data are specified and added to the fun_control dictionary, spotPython allows the specification of a data preprocessing pipeline, e.g., for the scaling of the data or for the one-hot encoding of categorical variables, see Section 14.4. This feature is not used here, so we do not change the default value (which is None).
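
For illustration, a preprocessing model could be specified as follows; this is a hedged sketch that assumes the fun_control key "prep_model" is used for this purpose:

# not used in this chapter: sketch of specifying a preprocessing model,
# assuming the fun_control entry "prep_model" (hypothetical here)
from sklearn.preprocessing import StandardScaler
prep_model = StandardScaler()
fun_control.update({"prep_model": prep_model})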

20.5 Step 5: Select algorithm and core_model_hyper_dict

20.5.1 Implementing a Configurable Neural Network With spotPython

spotPython includes the Net_vbdp class, which is implemented in the file netvbdp.py and imported here.

This class inherits from the class Net_Core which is implemented in the file netcore.py, see Section 14.5.1.
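
To illustrate the pattern, a strongly simplified configurable network might look as follows. This is a hypothetical sketch, not the actual Net_vbdp implementation; it only assumes that the tuned hyperparameters _L0 (input width), l1 (hidden width), and dropout_prob are passed to the constructor:

import torch.nn as nn

class TinyConfigurableNet(nn.Module):
    # hypothetical sketch of a configurable net: the architecture is
    # shaped by the tuned hyperparameters
    def __init__(self, _L0: int, l1: int, dropout_prob: float, num_classes: int = 11):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(_L0, l1),
            nn.ReLU(),
            nn.Dropout(p=dropout_prob),
            nn.Linear(l1, num_classes),
        )

    def forward(self, x):
        return self.layers(x)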

20.5.2 Add the NN Model to the fun_control Dictionary

from spotPython.torch.netvbdp import Net_vbdp
from spotPython.data.torch_hyper_dict import TorchHyperDict
from spotPython.hyperparameters.values import add_core_model_to_fun_control
fun_control = add_core_model_to_fun_control(core_model=Net_vbdp,
                              fun_control=fun_control,
                              hyper_dict=TorchHyperDict)

The corresponding entries for the core_model class are shown below.

fun_control['core_model_hyper_dict']
{'_L0': {'type': 'int',
  'default': 64,
  'transform': 'None',
  'lower': 64,
  'upper': 64},
 'l1': {'type': 'int',
  'default': 8,
  'transform': 'transform_power_2_int',
  'lower': 8,
  'upper': 16},
 'dropout_prob': {'type': 'float',
  'default': 0.01,
  'transform': 'None',
  'lower': 0.0,
  'upper': 0.9},
 'lr_mult': {'type': 'float',
  'default': 1.0,
  'transform': 'None',
  'lower': 0.1,
  'upper': 10.0},
 'batch_size': {'type': 'int',
  'default': 4,
  'transform': 'transform_power_2_int',
  'lower': 1,
  'upper': 4},
 'epochs': {'type': 'int',
  'default': 4,
  'transform': 'transform_power_2_int',
  'lower': 4,
  'upper': 9},
 'k_folds': {'type': 'int',
  'default': 1,
  'transform': 'None',
  'lower': 1,
  'upper': 1},
 'patience': {'type': 'int',
  'default': 2,
  'transform': 'transform_power_2_int',
  'lower': 1,
  'upper': 5},
 'optimizer': {'levels': ['Adadelta',
   'Adagrad',
   'Adam',
   'AdamW',
   'SparseAdam',
   'Adamax',
   'ASGD',
   'NAdam',
   'RAdam',
   'RMSprop',
   'Rprop',
   'SGD'],
  'type': 'factor',
  'default': 'SGD',
  'transform': 'None',
  'class_name': 'torch.optim',
  'core_model_parameter_type': 'str',
  'lower': 0,
  'upper': 12},
 'sgd_momentum': {'type': 'float',
  'default': 0.0,
  'transform': 'None',
  'lower': 0.0,
  'upper': 1.0}}
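
Note the transform_power_2_int entries: these hyperparameters are tuned on a log2 scale, and a value v is expanded to 2^v before the model is constructed. For example, the default l1 = 8 corresponds to 2^8 = 256 neurons, and the default batch_size = 4 to 2^4 = 16 samples per batch. Expressed as a small illustration of this convention (using a local helper, not the spotPython function itself):

# illustration of the power-of-two transform convention
def power_2_int(v: int) -> int:
    return 2 ** v

assert power_2_int(8) == 256  # default l1
assert power_2_int(4) == 16   # default batch_size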

20.6 Step 6: Modify hyper_dict Hyperparameters for the Selected Algorithm aka core_model

spotPython provides functions for modifying the hyperparameters, their bounds, and their factor levels, as well as for activating and deactivating hyperparameters, without recompiling the Python source code. These functions were described in Section 14.6.

Caution: Small number of epochs for demonstration purposes
  • epochs and patience are set to small values for demonstration purposes. These values are too small for a real application.
  • More reasonable values are, e.g.:
    • fun_control = modify_hyper_parameter_bounds(fun_control, "epochs", bounds=[7, 9]) and
    • fun_control = modify_hyper_parameter_bounds(fun_control, "patience", bounds=[2, 7])
from spotPython.hyperparameters.values import modify_hyper_parameter_bounds

fun_control = modify_hyper_parameter_bounds(fun_control, "_L0", bounds=[n_features, n_features])
fun_control = modify_hyper_parameter_bounds(fun_control, "l1", bounds=[6, 13])
fun_control = modify_hyper_parameter_bounds(fun_control, "epochs", bounds=[2, 3])
fun_control = modify_hyper_parameter_bounds(fun_control, "patience", bounds=[2, 2])
from spotPython.hyperparameters.values import modify_hyper_parameter_levels
fun_control = modify_hyper_parameter_levels(fun_control, "optimizer",["Adam", "AdamW", "Adamax", "NAdam"])
# fun_control = modify_hyper_parameter_levels(fun_control, "optimizer", ["Adam"])
# fun_control["core_model_hyper_dict"]

20.6.1 Optimizers

Optimizers are described in Section 14.6.1. Here, lr_mult and sgd_momentum are fixed by setting identical lower and upper bounds, which effectively removes these hyperparameters from the tuning process.

fun_control = modify_hyper_parameter_bounds(fun_control,
    "lr_mult", bounds=[1e-3, 1e-3])
fun_control = modify_hyper_parameter_bounds(fun_control,
    "sgd_momentum", bounds=[0.9, 0.9])

20.7 Step 7: Selection of the Objective (Loss) Function

20.7.1 Evaluation

The evaluation procedure requires the specification of two elements:

  1. how the data is split into a train and a test set (see Section 14.7.1), and
  2. the loss function (and a metric).

20.7.2 Loss Functions and Metrics

The loss function is specified by the key "loss_function". We will use CrossEntropyLoss for the multi-class classification task.

from torch.nn import CrossEntropyLoss
loss_function = CrossEntropyLoss()
fun_control.update({"loss_function": loss_function})

20.7.3 Metric

  • We will use the MAP@k metric for the evaluation of the model. Here is an example of how this metric is calculated.
from spotPython.torch.mapk import MAPK
import torch
mapk = MAPK(k=2)
target = torch.tensor([0, 1, 2, 2])
preds = torch.tensor(
    [
        [0.5, 0.2, 0.2],  # 0 is in top 2
        [0.3, 0.4, 0.2],  # 1 is in top 2
        [0.2, 0.4, 0.3],  # 2 is in top 2
        [0.7, 0.2, 0.1],  # 2 isn't in top 2
    ]
)
mapk.update(preds, target)
print(mapk.compute()) # tensor(0.6250)
tensor(0.6250)
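
The value tensor(0.6250) can be verified by hand: for a single-label target, the average precision at k is 1/rank if the true class appears at position rank within the top-k predictions, and 0 otherwise. A plain-NumPy re-computation (illustrative only, not the MAPK source):

import numpy as np

def apk(actual, ranked, k=2):
    # 1/rank if the true label is among the top-k predictions, else 0
    top_k = list(ranked)[:k]
    return 1.0 / (top_k.index(actual) + 1) if actual in top_k else 0.0

ranked = np.argsort(-preds.numpy(), axis=1)  # class indices by descending score
print(np.mean([apk(t, r) for t, r in zip(target.tolist(), ranked)]))  # 0.625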
from spotPython.torch.mapk import MAPK
import torchmetrics
metric_torch = MAPK(k=3)
fun_control.update({"metric_torch": metric_torch})

20.8 Step 8: Calling the SPOT Function

20.8.1 Preparing the SPOT Call

The following code passes the information about the parameter ranges and bounds to spot.

# extract the variable types, names, and bounds
from spotPython.hyperparameters.values import (get_bound_values,
    get_var_name,
    get_var_type,)
var_type = get_var_type(fun_control)
var_name = get_var_name(fun_control)
fun_control.update({"var_type": var_type,
                    "var_name": var_name})
lower = get_bound_values(fun_control, "lower")
upper = get_bound_values(fun_control, "upper")

Now the dictionary fun_control contains all the information needed for the hyperparameter tuning. Before the tuning is started, it is recommended to take a look at the experimental design. The function gen_design_table generates a design table as follows:

from spotPython.utils.eda import gen_design_table
print(gen_design_table(fun_control))
| name         | type   | default   |    lower |    upper | transform             |
|--------------|--------|-----------|----------|----------|-----------------------|
| _L0          | int    | 64        | 6112     | 6112     | None                  |
| l1           | int    | 8         |    6     |   13     | transform_power_2_int |
| dropout_prob | float  | 0.01      |    0     |    0.9   | None                  |
| lr_mult      | float  | 1.0       |    0.001 |    0.001 | None                  |
| batch_size   | int    | 4         |    1     |    4     | transform_power_2_int |
| epochs       | int    | 4         |    2     |    3     | transform_power_2_int |
| k_folds      | int    | 1         |    1     |    1     | None                  |
| patience     | int    | 2         |    2     |    2     | transform_power_2_int |
| optimizer    | factor | SGD       |    0     |    3     | None                  |
| sgd_momentum | float  | 0.0       |    0.9   |    0.9   | None                  |

This allows us to check whether all the information is available and correct.

20.8.2 The Objective Function fun_torch

The objective function fun_torch is selected next. It implements an interface from PyTorch’s training, validation, and testing methods to spotPython.

from spotPython.fun.hypertorch import HyperTorch
fun = HyperTorch().fun_torch
from spotPython.hyperparameters.values import get_default_hyperparameters_as_array
hyper_dict=TorchHyperDict().load()
X_start = get_default_hyperparameters_as_array(fun_control, hyper_dict)
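
Conceptually, an objective function for Spot maps an (n, k) array of n hyperparameter configurations to n objective values; fun_torch additionally trains and validates one PyTorch model per row. A toy sketch of this interface, assuming the fun(X, fun_control) calling convention and not reflecting the HyperTorch implementation:

import numpy as np

# toy objective (hypothetical sketch): one objective value per row of X
def toy_fun(X, fun_control=None):
    return np.array([np.sum(x ** 2) for x in X])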

20.8.3 Starting the Hyperparameter Tuning

The spotPython hyperparameter tuning is started by calling the Spot function as described in Section 14.8.4.

import numpy as np
from spotPython.spot import spot
from math import inf
spot_tuner = spot.Spot(fun=fun,
                   lower = lower,
                   upper = upper,
                   fun_evals = inf,
                   fun_repeats = 1,
                   max_time = MAX_TIME,
                   noise = False,
                   tolerance_x = np.sqrt(np.spacing(1)),
                   var_type = var_type,
                   var_name = var_name,
                   infill_criterion = "y",
                   n_points = 1,
                   seed=123,
                   log_level = 50,
                   show_models= False,
                   show_progress= True,
                   fun_control = fun_control,
                   design_control={"init_size": INIT_SIZE,
                                   "repeats": 1},
                   surrogate_control={"noise": True,
                                      "cod_type": "norm",
                                      "min_theta": -4,
                                      "max_theta": 3,
                                      "n_theta": len(var_name),
                                      "model_fun_evals": 10_000,
                                      "log_level": 50
                                      })
spot_tuner.run(X_start=X_start)

config: {'_L0': 6112, 'l1': 2048, 'dropout_prob': 0.17031221661559992, 'lr_mult': 0.001, 'batch_size': 16, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'AdamW', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1562500000000000 | Loss: 2.3980517898287093 | Acc: 0.0754716981132075.
Epoch: 2 | 
MAPK: 0.1666666716337204 | Loss: 2.3980109351021901 | Acc: 0.0849056603773585.
Epoch: 3 | 
MAPK: 0.1793154776096344 | Loss: 2.3979426962988719 | Acc: 0.0849056603773585.
Epoch: 4 | 
MAPK: 0.1644345223903656 | Loss: 2.3978416068213328 | Acc: 0.0660377358490566.
Epoch: 5 | 
MAPK: 0.1622023731470108 | Loss: 2.3978676114763533 | Acc: 0.0707547169811321.
Epoch: 6 | 
MAPK: 0.1889880895614624 | Loss: 2.3977864980697632 | Acc: 0.0896226415094340.
Epoch: 7 | 
MAPK: 0.1971726119518280 | Loss: 2.3977029834474837 | Acc: 0.0990566037735849.
Epoch: 8 | 
MAPK: 0.1941964179277420 | Loss: 2.3976612942559377 | Acc: 0.0754716981132075.
Returned to Spot: Validation loss: 2.3976612942559377

config: {'_L0': 6112, 'l1': 256, 'dropout_prob': 0.19379790035512987, 'lr_mult': 0.001, 'batch_size': 8, 'epochs': 4, 'k_folds': 1, 'patience': 4, 'optimizer': 'Adamax', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1566357910633087 | Loss: 2.3983427153693304 | Acc: 0.0754716981132075.
Epoch: 2 | 
MAPK: 0.1520061790943146 | Loss: 2.3983575767940946 | Acc: 0.0754716981132075.
Epoch: 3 | 
MAPK: 0.1520061790943146 | Loss: 2.3982980251312256 | Acc: 0.0754716981132075.
Epoch: 4 | 
MAPK: 0.1566357910633087 | Loss: 2.3982902456212929 | Acc: 0.0754716981132075.
Returned to Spot: Validation loss: 2.398290245621293

config: {'_L0': 6112, 'l1': 4096, 'dropout_prob': 0.6759063718076167, 'lr_mult': 0.001, 'batch_size': 2, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'NAdam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1619497090578079 | Loss: 2.3976024569205516 | Acc: 0.0896226415094340.
Epoch: 2 | 
MAPK: 0.1894654035568237 | Loss: 2.3973152052681401 | Acc: 0.1084905660377359.
Epoch: 3 | 
MAPK: 0.2028301954269409 | Loss: 2.3971720214159982 | Acc: 0.1273584905660377.
Epoch: 4 | 
MAPK: 0.1863207668066025 | Loss: 2.3970150430247470 | Acc: 0.1037735849056604.
Epoch: 5 | 
MAPK: 0.1831761300563812 | Loss: 2.3967633472298675 | Acc: 0.1132075471698113.
Epoch: 6 | 
MAPK: 0.1776729822158813 | Loss: 2.3963150438272729 | Acc: 0.1084905660377359.
Epoch: 7 | 
MAPK: 0.1792452782392502 | Loss: 2.3955174841970766 | Acc: 0.1084905660377359.
Epoch: 8 | 
MAPK: 0.1902515888214111 | Loss: 2.3944931412642858 | Acc: 0.1084905660377359.
Returned to Spot: Validation loss: 2.394493141264286

config: {'_L0': 6112, 'l1': 128, 'dropout_prob': 0.37306669346546995, 'lr_mult': 0.001, 'batch_size': 4, 'epochs': 4, 'k_folds': 1, 'patience': 4, 'optimizer': 'AdamW', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1242138519883156 | Loss: 2.3990250443512537 | Acc: 0.0707547169811321.
Epoch: 2 | 
MAPK: 0.1257861852645874 | Loss: 2.3990953148535961 | Acc: 0.0707547169811321.
Epoch: 3 | 
MAPK: 0.1250000149011612 | Loss: 2.3990971007437074 | Acc: 0.0707547169811321.
Epoch: 4 | 
MAPK: 0.1305031478404999 | Loss: 2.3990252917667605 | Acc: 0.0707547169811321.
Returned to Spot: Validation loss: 2.3990252917667605

config: {'_L0': 6112, 'l1': 1024, 'dropout_prob': 0.870137281216666, 'lr_mult': 0.001, 'batch_size': 8, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'Adam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1419753283262253 | Loss: 2.3984423390141241 | Acc: 0.0849056603773585.
Epoch: 2 | 
MAPK: 0.1381172835826874 | Loss: 2.3985650362791837 | Acc: 0.0801886792452830.
Epoch: 3 | 
MAPK: 0.1597222238779068 | Loss: 2.3986238638559976 | Acc: 0.0943396226415094.
Epoch: 4 | 
MAPK: 0.1427469253540039 | Loss: 2.3987307813432484 | Acc: 0.0990566037735849.
Epoch: 5 | 
MAPK: 0.1358024775981903 | Loss: 2.3985595438215466 | Acc: 0.0707547169811321.
Early stopping at epoch 4
Returned to Spot: Validation loss: 2.3985595438215466

config: {'_L0': 6112, 'l1': 4096, 'dropout_prob': 0.8368584385444511, 'lr_mult': 0.001, 'batch_size': 2, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'NAdam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.2051886767148972 | Loss: 2.3978285384628006 | Acc: 0.1273584905660377.
Epoch: 2 | 
MAPK: 0.1886792331933975 | Loss: 2.3978993217900113 | Acc: 0.1084905660377359.
Epoch: 3 | 
MAPK: 0.1926100701093674 | Loss: 2.3978448206523679 | Acc: 0.1226415094339623.
Epoch: 4 | 
MAPK: 0.1918238848447800 | Loss: 2.3975205219016886 | Acc: 0.1179245283018868.
Epoch: 5 | 
MAPK: 0.2146226167678833 | Loss: 2.3977170620324477 | Acc: 0.1320754716981132.
Epoch: 6 | 
MAPK: 0.2051886767148972 | Loss: 2.3974914955642990 | Acc: 0.1084905660377359.
Epoch: 7 | 
MAPK: 0.1800314486026764 | Loss: 2.3975813658732288 | Acc: 0.0990566037735849.
Epoch: 8 | 
MAPK: 0.2020439952611923 | Loss: 2.3975597934902839 | Acc: 0.1132075471698113.
Returned to Spot: Validation loss: 2.397559793490284
spotPython tuning: 2.394493141264286 [####------] 36.23% 

config: {'_L0': 6112, 'l1': 512, 'dropout_prob': 0.6750896771619408, 'lr_mult': 0.001, 'batch_size': 8, 'epochs': 4, 'k_folds': 1, 'patience': 4, 'optimizer': 'NAdam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1967592835426331 | Loss: 2.3976494029716209 | Acc: 0.1179245283018868.
Epoch: 2 | 
MAPK: 0.1890432089567184 | Loss: 2.3975939485761852 | Acc: 0.1037735849056604.
Epoch: 3 | 
MAPK: 0.1836419701576233 | Loss: 2.3976017369164362 | Acc: 0.1037735849056604.
Epoch: 4 | 
MAPK: 0.1836419701576233 | Loss: 2.3975458939870200 | Acc: 0.0943396226415094.
Returned to Spot: Validation loss: 2.39754589398702
spotPython tuning: 2.394493141264286 [####------] 39.27% 

config: {'_L0': 6112, 'l1': 512, 'dropout_prob': 0.6750395091714576, 'lr_mult': 0.001, 'batch_size': 8, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'NAdam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1365740895271301 | Loss: 2.3987020209983543 | Acc: 0.0613207547169811.
Epoch: 2 | 
MAPK: 0.1419753134250641 | Loss: 2.3986908771373607 | Acc: 0.0707547169811321.
Epoch: 3 | 
MAPK: 0.1373456865549088 | Loss: 2.3986740730426930 | Acc: 0.0471698113207547.
Epoch: 4 | 
MAPK: 0.1435185223817825 | Loss: 2.3987063478540489 | Acc: 0.0707547169811321.
Epoch: 5 | 
MAPK: 0.1458333432674408 | Loss: 2.3986359172397189 | Acc: 0.0849056603773585.
Epoch: 6 | 
MAPK: 0.1396604925394058 | Loss: 2.3986406237990767 | Acc: 0.0613207547169811.
Epoch: 7 | 
MAPK: 0.1412037163972855 | Loss: 2.3987360707035772 | Acc: 0.0660377358490566.
Epoch: 8 | 
MAPK: 0.1342592537403107 | Loss: 2.3986145743617304 | Acc: 0.0518867924528302.
Returned to Spot: Validation loss: 2.3986145743617304
spotPython tuning: 2.394493141264286 [####------] 44.72% 

config: {'_L0': 6112, 'l1': 1024, 'dropout_prob': 0.6755726413592649, 'lr_mult': 0.001, 'batch_size': 2, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'NAdam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1816037744283676 | Loss: 2.3978371170331849 | Acc: 0.1084905660377359.
Epoch: 2 | 
MAPK: 0.1784591078758240 | Loss: 2.3976157498809525 | Acc: 0.1084905660377359.
Epoch: 3 | 
MAPK: 0.1816037744283676 | Loss: 2.3976708403173483 | Acc: 0.1084905660377359.
Epoch: 4 | 
MAPK: 0.1831761151552200 | Loss: 2.3976594592040441 | Acc: 0.1084905660377359.
Epoch: 5 | 
MAPK: 0.1886792480945587 | Loss: 2.3974868868881800 | Acc: 0.1037735849056604.
Epoch: 6 | 
MAPK: 0.1839622408151627 | Loss: 2.3975845327917136 | Acc: 0.1084905660377359.
Epoch: 7 | 
MAPK: 0.1761006265878677 | Loss: 2.3976148839266793 | Acc: 0.1084905660377359.
Epoch: 8 | 
MAPK: 0.1784591078758240 | Loss: 2.3973647715910427 | Acc: 0.1084905660377359.
Returned to Spot: Validation loss: 2.3973647715910427
spotPython tuning: 2.394493141264286 [######----] 63.52% 

config: {'_L0': 6112, 'l1': 4096, 'dropout_prob': 0.6751038025047295, 'lr_mult': 0.001, 'batch_size': 16, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'NAdam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1465773880481720 | Loss: 2.3977736915860857 | Acc: 0.0754716981132075.
Epoch: 2 | 
MAPK: 0.1837797313928604 | Loss: 2.3977653809956143 | Acc: 0.1179245283018868.
Epoch: 3 | 
MAPK: 0.2172619104385376 | Loss: 2.3974139009203230 | Acc: 0.1273584905660377.
Epoch: 4 | 
MAPK: 0.2127976268529892 | Loss: 2.3977286815643311 | Acc: 0.1509433962264151.
Epoch: 5 | 
MAPK: 0.2343750149011612 | Loss: 2.3974812711988176 | Acc: 0.1556603773584906.
Epoch: 6 | 
MAPK: 0.2492559254169464 | Loss: 2.3975140367235457 | Acc: 0.1509433962264151.
Epoch: 7 | 
MAPK: 0.2373511940240860 | Loss: 2.3973816462925504 | Acc: 0.1320754716981132.
Epoch: 8 | 
MAPK: 0.2604166567325592 | Loss: 2.3973024402345930 | Acc: 0.1698113207547170.
Returned to Spot: Validation loss: 2.397302440234593
spotPython tuning: 2.394493141264286 [#######---] 68.28% 

config: {'_L0': 6112, 'l1': 8192, 'dropout_prob': 0.6751701824474365, 'lr_mult': 0.001, 'batch_size': 8, 'epochs': 4, 'k_folds': 1, 'patience': 4, 'optimizer': 'NAdam', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1566357910633087 | Loss: 2.3978495774445712 | Acc: 0.0660377358490566.
Epoch: 2 | 
MAPK: 0.1728395223617554 | Loss: 2.3977185090382895 | Acc: 0.0943396226415094.
Epoch: 3 | 
MAPK: 0.1875000149011612 | Loss: 2.3976249871430575 | Acc: 0.1273584905660377.
Epoch: 4 | 
MAPK: 0.1921296268701553 | Loss: 2.3973713450961642 | Acc: 0.1132075471698113.
Returned to Spot: Validation loss: 2.3973713450961642
spotPython tuning: 2.394493141264286 [########--] 78.97% 

config: {'_L0': 6112, 'l1': 4096, 'dropout_prob': 0.6753757927471769, 'lr_mult': 0.001, 'batch_size': 2, 'epochs': 8, 'k_folds': 1, 'patience': 4, 'optimizer': 'Adamax', 'sgd_momentum': 0.9}
Epoch: 1 | 
MAPK: 0.1816037744283676 | Loss: 2.3979451701326191 | Acc: 0.1037735849056604.
Epoch: 2 | 
MAPK: 0.1713836491107941 | Loss: 2.3980691792829982 | Acc: 0.0801886792452830.
Epoch: 3 | 
MAPK: 0.1933962106704712 | Loss: 2.3978647578437373 | Acc: 0.1037735849056604.
Epoch: 4 | 
MAPK: 0.1957546919584274 | Loss: 2.3979510001416475 | Acc: 0.0990566037735849.
Epoch: 5 | 
MAPK: 0.1910377144813538 | Loss: 2.3978540739923155 | Acc: 0.1037735849056604.
Epoch: 6 | 
MAPK: 0.1926100403070450 | Loss: 2.3977304247190370 | Acc: 0.1037735849056604.
Epoch: 7 | 
MAPK: 0.1886792480945587 | Loss: 2.3979027788594083 | Acc: 0.1084905660377359.
Epoch: 8 | 
MAPK: 0.1705974936485291 | Loss: 2.3978197777046346 | Acc: 0.0990566037735849.
Returned to Spot: Validation loss: 2.3978197777046346
spotPython tuning: 2.394493141264286 [##########] 100.00% Done...
<spotPython.spot.spot.Spot at 0x2c3587220>

20.9 Step 9: Tensorboard

The textual output shown in the console (or code cell) can be visualized with TensorBoard, as described in Section 14.9; see also the description in the documentation: Tensorboard.
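
TensorBoard can be started from the command line by pointing it at the directory that was passed as tensorboard_path in Step 2:

tensorboard --logdir="runs/25_spot_torch_vbdp"

The dashboard is then available in the browser, by default at http://localhost:6006.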

20.10 Step 10: Results

After the hyperparameter tuning run is finished, the results can be analyzed as described in Section 14.10.

spot_tuner.plot_progress(log_y=False, 
    filename="./figures/" + experiment_name+"_progress.png")

Progress plot. Black dots denote results from the initial design. Red dots illustrate the improvement found by the surrogate-model-based optimization.
from spotPython.utils.eda import gen_design_table
print(gen_design_table(fun_control=fun_control, spot=spot_tuner))
| name         | type   | default   |   lower |   upper |              tuned | transform             |   importance | stars   |
|--------------|--------|-----------|---------|---------|--------------------|-----------------------|--------------|---------|
| _L0          | int    | 64        |  6112.0 |  6112.0 |             6112.0 | None                  |         0.00 |         |
| l1           | int    | 8         |     6.0 |    13.0 |               12.0 | transform_power_2_int |        37.18 | *       |
| dropout_prob | float  | 0.01      |     0.0 |     0.9 | 0.6759063718076167 | None                  |       100.00 | ***     |
| lr_mult      | float  | 1.0       |   0.001 |   0.001 |              0.001 | None                  |         0.00 |         |
| batch_size   | int    | 4         |     1.0 |     4.0 |                1.0 | transform_power_2_int |         3.41 | *       |
| epochs       | int    | 4         |     2.0 |     3.0 |                3.0 | transform_power_2_int |         0.57 | .       |
| k_folds      | int    | 1         |     1.0 |     1.0 |                1.0 | None                  |         0.00 |         |
| patience     | int    | 2         |     2.0 |     2.0 |                2.0 | transform_power_2_int |         0.00 |         |
| optimizer    | factor | SGD       |     0.0 |     3.0 |                3.0 | None                  |         0.00 |         |
| sgd_momentum | float  | 0.0       |     0.9 |     0.9 |                0.9 | None                  |         0.00 |         |
spot_tuner.plot_importance(threshold=0.025,
    filename="./figures/" + experiment_name+"_importance.png")

Variable importance plot, threshold 0.025.

20.10.1 Get the Tuned Architecture

from spotPython.hyperparameters.values import get_one_core_model_from_X
X = spot_tuner.to_all_dim(spot_tuner.min_X.reshape(1,-1))
model_spot = get_one_core_model_from_X(X, fun_control)
model_spot
Net_vbdp(
  (fc1): Linear(in_features=6112, out_features=4096, bias=True)
  (fc2): Linear(in_features=4096, out_features=2048, bias=True)
  (fc3): Linear(in_features=2048, out_features=1024, bias=True)
  (fc4): Linear(in_features=1024, out_features=512, bias=True)
  (fc5): Linear(in_features=512, out_features=11, bias=True)
  (relu): ReLU()
  (softmax): Softmax(dim=1)
  (dropout1): Dropout(p=0.6759063718076167, inplace=False)
  (dropout2): Dropout(p=0.33795318590380835, inplace=False)
)

20.10.2 Evaluation of the Tuned Architecture

from spotPython.torch.traintest import (
    train_tuned,
    test_tuned,
    )
train_tuned(net=model_spot, train_dataset=train,
        loss_function=fun_control["loss_function"],
        metric=fun_control["metric_torch"],
        shuffle=True,
        device = fun_control["device"],
        path=None,
        task=fun_control["task"],)
Epoch: 1 | 
MAPK: 0.1344339847564697 | Loss: 2.3980706120437048 | Acc: 0.0660377358490566.
Epoch: 2 | 
MAPK: 0.1477987617254257 | Loss: 2.3979494211808690 | Acc: 0.0943396226415094.
Epoch: 3 | 
MAPK: 0.1650943458080292 | Loss: 2.3977398265082881 | Acc: 0.1037735849056604.
Epoch: 4 | 
MAPK: 0.1580188870429993 | Loss: 2.3977477100660218 | Acc: 0.0849056603773585.
Epoch: 5 | 
MAPK: 0.1635220199823380 | Loss: 2.3975086054711974 | Acc: 0.0943396226415094.
Epoch: 6 | 
MAPK: 0.1611635237932205 | Loss: 2.3971170052042545 | Acc: 0.0896226415094340.
Epoch: 7 | 
MAPK: 0.1509433984756470 | Loss: 2.3968254485220277 | Acc: 0.0754716981132075.
Epoch: 8 | 
MAPK: 0.1595912128686905 | Loss: 2.3966822736668139 | Acc: 0.0990566037735849.
Returned to Spot: Validation loss: 2.396682273666814

If path is set to a filename, e.g., path = "model_spot_trained.pt", the weights of the trained model will be loaded from this file by test_tuned.

test_tuned(net=model_spot, test_dataset=test,
            shuffle=False,
            loss_function=fun_control["loss_function"],
            metric=fun_control["metric_torch"],
            device = fun_control["device"],
            task=fun_control["task"],)
MAPK: 0.2097378373146057 | Loss: 2.3919104083200518 | Acc: 0.1299435028248588.
Final evaluation: Validation loss: 2.3919104083200518
Final evaluation: Validation metric: 0.2097378373146057
----------------------------------------------
(2.3919104083200518, nan, tensor(0.2097))

20.10.3 Cross-validated Evaluations

  • This is the evaluation that will be used in the comparison.
Caution: Cross-validated Evaluations
  • The number of folds is set to 1 by default.
  • Here it was changed to 3 for demonstration purposes.
  • Set the number of folds to a reasonable value, e.g., 10.
  • This can be done by setting the k_folds attribute of the model as follows:
  • setattr(model_spot, "k_folds", 10)
from spotPython.torch.traintest import evaluate_cv
# modify k_folds:
setattr(model_spot, "k_folds",  3)
df_eval, df_preds, df_metrics = evaluate_cv(net=model_spot,
    dataset=fun_control["data"],
    loss_function=fun_control["loss_function"],
    metric=fun_control["metric_torch"],
    task=fun_control["task"],
    writer=fun_control["writer"],
    writerId="model_spot_cv",
    device = fun_control["device"])
Fold: 1
Epoch: 1 | 
MAPK: 0.1935028135776520 | Loss: 2.3974632145994801 | Acc: 0.1186440677966102.
Epoch: 2 | 
MAPK: 0.2026835978031158 | Loss: 2.3970745519056158 | Acc: 0.1228813559322034.
Epoch: 3 | 
MAPK: 0.2196327447891235 | Loss: 2.3967983924736411 | Acc: 0.1271186440677966.
Epoch: 4 | 
MAPK: 0.2132768332958221 | Loss: 2.3961138503026156 | Acc: 0.1271186440677966.
Epoch: 5 | 
MAPK: 0.2076271176338196 | Loss: 2.3953938080092607 | Acc: 0.1271186440677966.
Epoch: 6 | 
MAPK: 0.2097457647323608 | Loss: 2.3942873639575506 | Acc: 0.1271186440677966.
Epoch: 7 | 
MAPK: 0.2210451364517212 | Loss: 2.3928181680582337 | Acc: 0.1271186440677966.
Epoch: 8 | 
MAPK: 0.2182203084230423 | Loss: 2.3907344017998646 | Acc: 0.1271186440677966.
Fold: 2
Epoch: 1 | 
MAPK: 0.2542372643947601 | Loss: 2.3972244323310203 | Acc: 0.1652542372881356.
Epoch: 2 | 
MAPK: 0.2768361270427704 | Loss: 2.3968893794690147 | Acc: 0.1652542372881356.
Epoch: 3 | 
MAPK: 0.2782485783100128 | Loss: 2.3962075871936346 | Acc: 0.1610169491525424.
Epoch: 4 | 
MAPK: 0.2485875487327576 | Loss: 2.3954777313491045 | Acc: 0.1355932203389831.
Epoch: 5 | 
MAPK: 0.2288135439157486 | Loss: 2.3939730575529197 | Acc: 0.1355932203389831.
Epoch: 6 | 
MAPK: 0.2111581712961197 | Loss: 2.3925903914338451 | Acc: 0.1228813559322034.
Epoch: 7 | 
MAPK: 0.2083333283662796 | Loss: 2.3901078296920000 | Acc: 0.1186440677966102.
Epoch: 8 | 
MAPK: 0.2040960341691971 | Loss: 2.3878219046835172 | Acc: 0.1186440677966102.
Fold: 3
Epoch: 1 | 
MAPK: 0.1334745883941650 | Loss: 2.3979532718658447 | Acc: 0.0808510638297872.
Epoch: 2 | 
MAPK: 0.1518361717462540 | Loss: 2.3976832507020336 | Acc: 0.0936170212765957.
Epoch: 3 | 
MAPK: 0.1645480394363403 | Loss: 2.3974377219959839 | Acc: 0.1063829787234043.
Epoch: 4 | 
MAPK: 0.1751412302255630 | Loss: 2.3969538413872153 | Acc: 0.1191489361702128.
Epoch: 5 | 
MAPK: 0.1836158335208893 | Loss: 2.3961471581863143 | Acc: 0.1106382978723404.
Epoch: 6 | 
MAPK: 0.1807909458875656 | Loss: 2.3950486769110468 | Acc: 0.1148936170212766.
Epoch: 7 | 
MAPK: 0.1836158186197281 | Loss: 2.3933664641137851 | Acc: 0.1191489361702128.
Epoch: 8 | 
MAPK: 0.1850282549858093 | Loss: 2.3913705490403254 | Acc: 0.1148936170212766.
metric_name = type(fun_control["metric_torch"]).__name__
print(f"loss: {df_eval}, Cross-validated {metric_name}: {df_metrics}")
loss: 2.3899756185079024, Cross-validated MAPK: 0.20244820415973663

20.10.4 Detailed Hyperparameter Plots

filename = "./figures/" + experiment_name
spot_tuner.plot_important_hyperparameter_contour(filename=filename)
l1:  37.184099489346636
dropout_prob:  100.0
batch_size:  3.411653699188973
epochs:  0.5670309499600713

Contour plots.

20.10.5 Parallel Coordinates Plot

spot_tuner.parallel_plot()

Parallel coordinates plot.

# close the tensorboard writer
if fun_control["writer"] is not None:
    fun_control["writer"].close()

20.10.6 Plot all Combinations of Hyperparameters

  • Warning: this may take a while.
PLOT_ALL = False
if PLOT_ALL:
    n = spot_tuner.k
    # fix a common color scale for all contour plots
    min_z = min(spot_tuner.y)
    max_z = max(spot_tuner.y)
    for i in range(n-1):
        for j in range(i+1, n):
            spot_tuner.plot_contour(i=i, j=j, min_z=min_z, max_z=max_z)