In this tutorial, we will show how spotPython can be integrated into the PyTorch Lightning training workflow for a classification task.
The data set is read from ./data/VBDP/train.csv.
This document refers to the following software versions:
python: 3.10.10
torch: 2.0.1
torchvision: 0.15.0
pip list | grep "spot[RiverPython]"
spotPython 0.2.51
spotRiver 0.0.94
Note: you may need to restart the kernel to use updated packages.
spotPython can be installed via pip. Alternatively, the source code can be downloaded from GitHub: https://github.com/sequential-parameter-optimization/spotPython.
!pip install spotPython
Uncomment the following lines to (re-)install the latest version of spotPython from GitHub.
# import sys
# !{sys.executable} -m pip install --upgrade build
# !{sys.executable} -m pip install --upgrade --force-reinstall spotPython
Before we consider the detailed experimental setup, we select the parameters that affect run time, initial design size and the device that is used.
The device can be selected by setting the variable DEVICE.
Since a simple neural net is used, "cpu" is preferred (on Mac).
If a GPU is available, "cuda:0" can be used instead.
If DEVICE is set to "auto" or None, spotPython will automatically select the device.
This might result in "mps" on Macs, which is not the best choice for simple neural nets.
PREFIX is used for the experiment name and the name of the log file.
MAX_TIME = 1
INIT_SIZE = 5
DEVICE = "auto" # "cpu" # "cuda:0"
WORKERS = 0
PREFIX = "30"
from spotPython.utils.device import getDevice
DEVICE = getDevice(DEVICE)
print(DEVICE)
mps
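The device resolution described above can be sketched without any framework. This is a minimal sketch, not the getDevice implementation: select_device is a hypothetical helper, the availability flags are passed in explicitly, and the preference order (CUDA before MPS before CPU) is an assumption. In practice, the flags could be obtained from torch.cuda.is_available() and torch.backends.mps.is_available().

```python
from typing import Optional


def select_device(requested: Optional[str], cuda_available: bool, mps_available: bool) -> str:
    """Hypothetical helper: resolve a device string from a request and availability flags."""
    if requested is not None and requested != "auto":
        # An explicit choice such as "cpu" or "cuda:0" is returned unchanged.
        return requested
    if cuda_available:
        return "cuda:0"
    if mps_available:
        return "mps"
    return "cpu"


# On a Mac without CUDA, automatic selection ends up on "mps".
print(select_device("auto", cuda_available=False, mps_available=True))
# An explicit "cpu" request overrides the automatic choice.
print(select_device("cpu", cuda_available=True, mps_available=True))
```

Passing "cpu" explicitly is the way to avoid the "mps" result on Macs when the network is small.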
import os
if not os.path.exists('./figures'):
    os.makedirs('./figures')
Initialization of the fun_control Dictionary
Set tensorboard_path to None if you are working under Windows.
spotPython uses a Python dictionary for storing the information required for the hyperparameter tuning process, which was described in Section 14.2, see Initialization of the fun_control Dictionary in the documentation.
from spotPython.utils.init import fun_control_init
from spotPython.utils.file import get_experiment_name
experiment_name = get_experiment_name(prefix=PREFIX)
fun_control = fun_control_init(task="classification",
    tensorboard_path="./runs/" + experiment_name,
    num_workers=WORKERS,
    device=DEVICE)
The data loading and preprocessing is handled by Lightning and PyTorch. It comprises the following classes:
CSVDataset: A class that loads the data from a CSV file. [SOURCE]
CSVDataModule: A class that prepares the data for training and testing. [SOURCE]
import torch
from spotPython.light.csvdataset import CSVDataset
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor
# Create an instance of CSVDataset
dataset = CSVDataset(csv_file="./data/VBDP/train.csv", train=True)
# show the dimensions of the input data
print(dataset[0][0].shape)
# show the first element of the input data
print(dataset[0][0])
# show the size of the dataset
print(f"Dataset Size: {len(dataset)}")
torch.Size([64])
tensor([1., 1., 0., 1., 1., 1., 1., 0., 1., 1., 1., 1., 0., 0., 1., 1., 0., 0.,
1., 0., 1., 0., 1., 1., 1., 1., 1., 1., 1., 0., 0., 1., 0., 0., 0., 0.,
1., 0., 0., 0., 0., 0., 1., 0., 1., 0., 1., 0., 0., 0., 0., 1., 0., 1.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
Dataset Size: 707
# Set batch size for DataLoader
batch_size = 3
# Create DataLoader
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

# Iterate over the data in the DataLoader
for batch in dataloader:
    inputs, targets = batch
    print(f"Batch Size: {inputs.size(0)}")
    print("---------------")
    print(f"Inputs: {inputs}")
    print(f"Targets: {targets}")
    break
Batch Size: 3
---------------
Inputs: tensor([[1., 0., 0., 1., 1., 0., 1., 1., 1., 1., 0., 1., 0., 1., 1., 1., 1., 1.,
0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 0., 0., 0., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 0., 1., 1., 1., 0., 0., 1., 1., 1., 1.,
0., 1., 1., 1., 1., 0., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
0., 1., 1., 1., 1., 1., 0., 0., 1., 1.],
[0., 1., 1., 0., 1., 1., 1., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
Targets: tensor([8, 3, 1])
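The dataset and loader roles shown above can be illustrated with the standard library alone. This is a minimal sketch under stated assumptions: TinyCSVDataset and iterate_batches are hypothetical stand-ins, the CSV text is invented for illustration, and treating the last column as the target is an assumption, not a statement about the layout of train.csv or about how CSVDataset works.

```python
import csv
import io
import random


class TinyCSVDataset:
    """Hypothetical stand-in for a CSV-backed dataset (last column = target)."""

    def __init__(self, csv_text: str):
        rows = list(csv.reader(io.StringIO(csv_text)))
        body = rows[1:]  # skip the header row
        self.features = [[float(v) for v in row[:-1]] for row in body]
        self.targets = [int(row[-1]) for row in body]

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, idx):
        return self.features[idx], self.targets[idx]


def iterate_batches(dataset, batch_size, shuffle=True, seed=0):
    """Yield (inputs, targets) batches, mimicking the role of a data loader."""
    indices = list(range(len(dataset)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        chunk = [dataset[i] for i in indices[start:start + batch_size]]
        yield [x for x, _ in chunk], [y for _, y in chunk]


csv_text = "f1,f2,f3,label\n1,0,1,8\n0,1,1,3\n0,0,0,1\n1,1,0,3\n"
dataset = TinyCSVDataset(csv_text)
print(f"Dataset Size: {len(dataset)}")
for inputs, targets in iterate_batches(dataset, batch_size=3):
    print(f"Batch Size: {len(inputs)}")
    break
```

The dataset only needs to report its length and return one (input, target) pair per index; batching and shuffling are left to the loader.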
The data loading is handled by Lightning and PyTorch. In contrast to spotPython with torch, river and sklearn, the data sets are not added to the fun_control dictionary.
With the fun_control dictionary, the torch, sklearn and river versions of spotPython allow the specification of a data preprocessing pipeline, e.g., for the scaling of the data or for the one-hot encoding of categorical variables, see Section 14.4. This feature is not used in the Lightning version.
Lightning allows the data preprocessing to be specified in the LightningDataModule class. It is not considered here, because it should be computed at one location only.
Select the Model (algorithm) and core_model_hyper_dict
spotPython includes the NetLightBase class [SOURCE] for configurable neural networks. The class is imported here. It inherits from the class Lightning.LightningModule, which is the base class for all models in Lightning. Lightning.LightningModule is a subclass of torch.nn.Module and provides additional functionality for the training and testing of neural networks. The class Lightning.LightningModule is described in the Lightning documentation.
from spotPython.light.netlightbase import NetLightBase
from spotPython.data.light_hyper_dict import LightHyperDict
from spotPython.hyperparameters.values import add_core_model_to_fun_control
fun_control = add_core_model_to_fun_control(core_model=NetLightBase,
    fun_control=fun_control,
    hyper_dict=LightHyperDict)
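The model printouts in the tuning log of this chapter list five linear layers whose widths appear to follow the pattern l1, l1/2, l1/2, l1/4, followed by the 11 outputs. The sketch below reproduces that pattern. It is read off the log output only; layer_shapes is a hypothetical helper and not part of NetLightBase.

```python
def layer_shapes(n_inputs: int, l1: int, n_outputs: int):
    """Return (in_features, out_features) pairs for the widths l1, l1/2, l1/2, l1/4."""
    hidden = [l1, l1 // 2, l1 // 2, l1 // 4]
    sizes = [n_inputs] + hidden + [n_outputs]
    return list(zip(sizes[:-1], sizes[1:]))


# 64 input features and 11 classes, as in the printed models.
for shape in layer_shapes(n_inputs=64, l1=64, n_outputs=11):
    print(shape)
```

With l1 = 4096 the same helper yields the widths 4096, 2048, 2048 and 1024, which matches the largest model in the log.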
The default entries for the core_model class are shown below.
fun_control['core_model_hyper_dict']
{'l1': {'type': 'int',
'default': 3,
'transform': 'transform_power_2_int',
'lower': 3,
'upper': 8},
'epochs': {'type': 'int',
'default': 4,
'transform': 'transform_power_2_int',
'lower': 4,
'upper': 9},
'batch_size': {'type': 'int',
'default': 4,
'transform': 'transform_power_2_int',
'lower': 1,
'upper': 4},
'act_fn': {'levels': ['Sigmoid', 'Tanh', 'ReLU', 'LeakyReLU', 'ELU', 'Swish'],
'type': 'factor',
'default': 'ReLU',
'transform': 'None',
'class_name': 'spotPython.torch.activation',
'core_model_parameter_type': 'instance()',
'lower': 0,
'upper': 5},
'optimizer': {'levels': ['Adadelta',
'Adagrad',
'Adam',
'AdamW',
'SparseAdam',
'Adamax',
'ASGD',
'NAdam',
'RAdam',
'RMSprop',
'Rprop',
'SGD'],
'type': 'factor',
'default': 'SGD',
'transform': 'None',
'class_name': 'torch.optim',
'core_model_parameter_type': 'str',
'lower': 0,
'upper': 11},
'dropout_prob': {'type': 'float',
'default': 0.01,
'transform': 'None',
'lower': 0.0,
'upper': 0.1},
'lr_mult': {'type': 'float',
'default': 1.0,
'transform': 'None',
'lower': 0.1,
'upper': 10.0},
'patience': {'type': 'int',
'default': 2,
'transform': 'transform_power_2_int',
'lower': 2,
'upper': 6},
'initialization': {'levels': ['Default', 'Kaiming', 'Xavier'],
'type': 'factor',
'default': 'Default',
'transform': 'None',
'core_model_parameter_type': 'str',
'lower': 0,
'upper': 2}}
The NetLightBase is a configurable neural network. The hyperparameters of the model are specified in the core_model_hyper_dict dictionary [SOURCE].
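Several entries in the dictionary carry the transform transform_power_2_int. The name, together with the configurations printed in the tuning log (for example l1 = 4096 and batch_size = 256), suggests that the tuner searches over integer exponents and the model receives 2 raised to that exponent. The sketch below illustrates this reading; power_2_int is a hypothetical helper, not the library function.

```python
def power_2_int(exponent: int) -> int:
    """Map a tuned integer exponent to the value the model would receive."""
    return int(2 ** exponent)


# Default (lower, upper) exponents as listed in the dictionary above.
ranges = {"l1": (3, 8), "epochs": (4, 9), "batch_size": (1, 4), "patience": (2, 6)}
for name, (lower, upper) in ranges.items():
    print(f"{name}: {power_2_int(lower)} .. {power_2_int(upper)}")
```

Under this reading, the default bounds for l1 correspond to 8 up to 256 neurons in the first hidden layer.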
Modify hyper_dict Hyperparameters for the Selected Algorithm aka core_model
spotPython provides functions for modifying the hyperparameters, their bounds and factors as well as for activating and de-activating hyperparameters without re-compilation of the Python source code. These functions were described in Section 14.6.
epochs and patience are set to small values for demonstration purposes. These values are too small for a real application. More reasonable values are, e.g.:
fun_control = modify_hyper_parameter_bounds(fun_control, "epochs", bounds=[7, 9])
and
fun_control = modify_hyper_parameter_bounds(fun_control, "patience", bounds=[2, 7])
from spotPython.hyperparameters.values import modify_hyper_parameter_bounds
fun_control = modify_hyper_parameter_bounds(fun_control, "l1", bounds=[6,13])
fun_control = modify_hyper_parameter_bounds(fun_control, "epochs", bounds=[6,13])
fun_control = modify_hyper_parameter_bounds(fun_control, "batch_size", bounds=[2, 8])
from spotPython.hyperparameters.values import modify_hyper_parameter_levels
fun_control = modify_hyper_parameter_levels(fun_control, "optimizer",["Adam", "AdamW", "Adamax", "NAdam"])
# fun_control = modify_hyper_parameter_levels(fun_control, "optimizer", ["Adam"])
The updated fun_control dictionary is shown below.
fun_control["core_model_hyper_dict"]
{'l1': {'type': 'int',
'default': 3,
'transform': 'transform_power_2_int',
'lower': 6,
'upper': 13},
'epochs': {'type': 'int',
'default': 4,
'transform': 'transform_power_2_int',
'lower': 6,
'upper': 13},
'batch_size': {'type': 'int',
'default': 4,
'transform': 'transform_power_2_int',
'lower': 2,
'upper': 8},
'act_fn': {'levels': ['Sigmoid', 'Tanh', 'ReLU', 'LeakyReLU', 'ELU', 'Swish'],
'type': 'factor',
'default': 'ReLU',
'transform': 'None',
'class_name': 'spotPython.torch.activation',
'core_model_parameter_type': 'instance()',
'lower': 0,
'upper': 5},
'optimizer': {'levels': ['Adam', 'AdamW', 'Adamax', 'NAdam'],
'type': 'factor',
'default': 'SGD',
'transform': 'None',
'class_name': 'torch.optim',
'core_model_parameter_type': 'str',
'lower': 0,
'upper': 3},
'dropout_prob': {'type': 'float',
'default': 0.01,
'transform': 'None',
'lower': 0.0,
'upper': 0.1},
'lr_mult': {'type': 'float',
'default': 1.0,
'transform': 'None',
'lower': 0.1,
'upper': 10.0},
'patience': {'type': 'int',
'default': 2,
'transform': 'transform_power_2_int',
'lower': 2,
'upper': 6},
'initialization': {'levels': ['Default', 'Kaiming', 'Xavier'],
'type': 'factor',
'default': 'Default',
'transform': 'None',
'core_model_parameter_type': 'str',
'lower': 0,
'upper': 2}}
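Comparing the two dictionary listings shows what the modifications change: new bounds replace lower and upper, and a new list of factor levels also resets the bounds to the index range of the levels (upper becomes 3 for four optimizers). The sketch below mimics that effect on a plain nested dictionary. The helpers set_bounds and set_levels are hypothetical and only illustrate the observed behavior; they are not the spotPython functions.

```python
import copy


def set_bounds(hyper_dict, name, bounds):
    """Return a copy of hyper_dict with new lower/upper bounds for one entry."""
    updated = copy.deepcopy(hyper_dict)
    updated[name]["lower"], updated[name]["upper"] = bounds
    return updated


def set_levels(hyper_dict, name, levels):
    """Return a copy with new factor levels; bounds become the level index range."""
    updated = copy.deepcopy(hyper_dict)
    updated[name]["levels"] = list(levels)
    updated[name]["lower"] = 0
    updated[name]["upper"] = len(levels) - 1
    return updated


hyper_dict = {
    "l1": {"type": "int", "lower": 3, "upper": 8},
    "optimizer": {"type": "factor", "levels": ["Adam", "SGD", "RMSprop"],
                  "lower": 0, "upper": 2},
}
hyper_dict = set_bounds(hyper_dict, "l1", [6, 13])
hyper_dict = set_levels(hyper_dict, "optimizer", ["Adam", "AdamW", "Adamax", "NAdam"])
print(hyper_dict["l1"])
print(hyper_dict["optimizer"])
```

Returning a modified copy keeps the original dictionary available for comparison, which is a design choice of this sketch only.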
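The evaluation procedure requires the specification of two elements:
the way the data is split into a training and a test set, and
the loss function (and a metric).
The data splitting is handled by Lightning.
The loss function is specified in the configurable network class [SOURCE]. We will use CrossEntropy loss for the multiclass-classification task.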
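To make the cross-entropy loss concrete, the following sketch computes it for a single sample in plain Python. It is a worked illustration, not the loss code of the network class; cross_entropy is a hypothetical helper. The number of classes, 11, is taken from the output layer shown in the model printouts of this chapter.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy of one sample: -log(softmax(logits)[target])."""
    shift = max(logits)  # subtract the maximum for numerical stability
    log_sum = math.log(sum(math.exp(z - shift) for z in logits))
    return -(logits[target] - shift - log_sum)


n_classes = 11
uniform_logits = [0.0] * n_classes
# A model that assigns equal probability to all 11 classes has loss ln(11).
print(round(cross_entropy(uniform_logits, target=3), 4))  # 2.3979
```

The value ln(11), about 2.398, is a useful reference point: the validation losses reported in the tuning log (roughly 2.23 to 2.46) lie close to what an uninformed uniform prediction would give.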
from spotPython.torch.mapk import MAPK
import torch
mapk = MAPK(k=2)
target = torch.tensor([0, 1, 2, 2])
preds = torch.tensor(
    [
        [0.5, 0.2, 0.2],  # 0 is in top 2
        [0.3, 0.4, 0.2],  # 1 is in top 2
        [0.2, 0.4, 0.3],  # 2 is in top 2
        [0.7, 0.2, 0.1],  # 2 isn't in top 2
    ]
)
mapk.update(preds, target)
print(mapk.compute()) # tensor(0.6250)
tensor(0.6250)
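The value 0.625 can be verified by hand. The sketch below assumes the usual definition of mean average precision at k for single-label targets, where each sample contributes the reciprocal rank of the true class if it appears among the top k predictions and 0 otherwise. map_at_k is a hypothetical helper written for this check; it is not the MAPK class.

```python
def map_at_k(preds, targets, k):
    """Mean average precision at k for single-label targets."""
    total = 0.0
    for scores, target in zip(preds, targets):
        # Indices of the k highest scores, best first.
        top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        if target in top_k:
            total += 1.0 / (top_k.index(target) + 1)  # reciprocal rank of the hit
    return total / len(targets)


preds = [
    [0.5, 0.2, 0.2],  # target 0 is ranked first  -> 1.0
    [0.3, 0.4, 0.2],  # target 1 is ranked first  -> 1.0
    [0.2, 0.4, 0.3],  # target 2 is ranked second -> 0.5
    [0.7, 0.2, 0.1],  # target 2 is not in top 2  -> 0.0
]
targets = [0, 1, 2, 2]
print(map_at_k(preds, targets, k=2))  # 0.625
```

The four contributions 1.0, 1.0, 0.5 and 0.0 average to 0.625, which agrees with the tensor printed above.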
Similar to the loss function, the metric is specified in the configurable network class [SOURCE].
The loss function and the metric are not hyperparameters that can be tuned with spotPython. They are handled by Lightning.
The following code passes the information about the parameter ranges and bounds to spot. It extracts the variable types, names, and bounds.
from spotPython.hyperparameters.values import (get_bound_values,
    get_var_name,
    get_var_type,)
var_type = get_var_type(fun_control)
var_name = get_var_name(fun_control)
fun_control.update({"var_type": var_type,
                    "var_name": var_name})
lower = get_bound_values(fun_control, "lower")
upper = get_bound_values(fun_control, "upper")
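What this extraction amounts to can be sketched on a reduced dictionary. The helpers below are hypothetical and return the raw entries; the library functions may encode the variable types differently, which is not shown in this chapter.

```python
hyper_dict = {
    "l1": {"type": "int", "lower": 6, "upper": 13},
    "act_fn": {"type": "factor", "lower": 0, "upper": 5},
    "dropout_prob": {"type": "float", "lower": 0.0, "upper": 0.1},
}


def var_names(hyper_dict):
    """Names in dictionary order; this order defines the columns of a design point."""
    return list(hyper_dict)


def var_types(hyper_dict):
    """Raw type strings, one per hyperparameter."""
    return [entry["type"] for entry in hyper_dict.values()]


def bound_values(hyper_dict, key):
    """Collect the 'lower' or 'upper' bound of every hyperparameter."""
    return [entry[key] for entry in hyper_dict.values()]


names = var_names(hyper_dict)
types = var_types(hyper_dict)
lower = bound_values(hyper_dict, "lower")
upper = bound_values(hyper_dict, "upper")
for row in zip(names, types, lower, upper):
    print(row)
```

All four lists share the same order, so position i in lower, upper, types and names always refers to the same hyperparameter.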
Now, the dictionary fun_control contains all information needed for the hyperparameter tuning. Before the hyperparameter tuning is started, it is recommended to take a look at the experimental design. The method gen_design_table [SOURCE] generates a design table as follows:
from spotPython.utils.eda import gen_design_table
print(gen_design_table(fun_control))
| name | type | default | lower | upper | transform |
|----------------|--------|-----------|---------|---------|-----------------------|
| l1 | int | 3 | 6 | 13 | transform_power_2_int |
| epochs | int | 4 | 6 | 13 | transform_power_2_int |
| batch_size | int | 4 | 2 | 8 | transform_power_2_int |
| act_fn | factor | ReLU | 0 | 5 | None |
| optimizer | factor | SGD | 0 | 3 | None |
| dropout_prob | float | 0.01 | 0 | 0.1 | None |
| lr_mult | float | 1.0 | 0.1 | 10 | None |
| patience | int | 2 | 2 | 6 | transform_power_2_int |
| initialization | factor | Default | 0 | 2 | None |
This allows checking whether all information is available and correct.
The Objective Function fun
The objective function fun from the class HyperLight [SOURCE] is selected next. It implements an interface from PyTorch’s training, validation, and testing methods to spotPython.
from spotPython.light.hyperlight import HyperLight
fun = HyperLight().fun
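The role of such an objective function can be sketched as follows: it receives a set of hyperparameter vectors and returns one loss value per vector. This is a sketch of the general idea under stated assumptions, not the HyperLight implementation. make_objective and toy_train_and_validate are hypothetical, and the toy loss stands in for a real training and validation run.

```python
def make_objective(train_and_validate, var_names):
    """Build fun(X): each row of X is one hyperparameter vector, one loss per row."""
    def fun(X):
        losses = []
        for row in X:
            config = dict(zip(var_names, row))  # name the entries of the vector
            losses.append(train_and_validate(config))
        return losses
    return fun


def toy_train_and_validate(config):
    """Stand-in for training a network and returning its validation loss."""
    return (config["lr_mult"] - 1.0) ** 2 + config["dropout_prob"]


fun = make_objective(toy_train_and_validate, ["lr_mult", "dropout_prob"])
print(fun([[1.0, 0.0], [3.0, 0.5]]))  # [0.0, 4.5]
```

The tuner only sees vectors and loss values; everything specific to the model stays behind the function boundary.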
The spotPython hyperparameter tuning is started by calling the Spot function [SOURCE] as described in Section 14.8.4.
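The general idea of a surrogate-guided search, an initial design followed by sequential evaluations of promising points, can be illustrated with a toy loop. This is a sketch under loud assumptions: it uses a nearest-neighbor lookup as a stand-in surrogate, which is not the surrogate model used by spotPython, and it stops after a fixed number of evaluations (max_evals) instead of a time budget so that the result is reproducible. All names are hypothetical.

```python
import random


def nearest_value(x, X, y):
    """Toy surrogate: predict the observed value of the closest evaluated point."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(x, X[i]))
    return y[min(range(len(X)), key=sq_dist)]


def tune(fun, lower, upper, init_size, max_evals, n_candidates=50, seed=123):
    """Initial design followed by surrogate-guided sequential evaluations."""
    rng = random.Random(seed)

    def sample():
        return [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]

    # Step 1: evaluate the initial design.
    X = [sample() for _ in range(init_size)]
    y = [fun(x) for x in X]
    # Step 2: repeatedly evaluate the candidate that the surrogate rates best.
    while len(X) < max_evals:
        candidates = [sample() for _ in range(n_candidates)]
        x_new = min(candidates, key=lambda c: nearest_value(c, X, y))
        X.append(x_new)
        y.append(fun(x_new))
    best = min(range(len(y)), key=lambda i: y[i])
    return X[best], y[best], len(y)


def sphere(x):
    """Simple test function with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_y, n_evals = tune(sphere, lower=[-1.0, -1.0], upper=[1.0, 1.0],
                               init_size=5, max_evals=20)
print(n_evals, best_y)
```

Only the expensive function fun is called once per step; choosing among the candidates relies on the cheap surrogate, which is what makes this kind of search attractive when each evaluation is a full training run.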
import numpy as np
from spotPython.spot import spot
from math import inf
spot_tuner = spot.Spot(fun=fun,
    lower = lower,
    upper = upper,
    fun_evals = inf,
    fun_repeats = 1,
    max_time = MAX_TIME,
    noise = False,
    tolerance_x = np.sqrt(np.spacing(1)),
    var_type = var_type,
    var_name = var_name,
    infill_criterion = "y",
    n_points = 1,
    seed=123,
    log_level = 50,
    show_models= False,
    show_progress= True,
    fun_control = fun_control,
    design_control={"init_size": INIT_SIZE,
                    "repeats": 1},
    surrogate_control={"noise": True,
                       "cod_type": "norm",
                       "min_theta": -4,
                       "max_theta": 3,
                       "n_theta": len(var_name),
                       "model_fun_evals": 10_000,
                       "log_level": 50
                       })
spot_tuner.run()
config: {'l1': 4096, 'epochs': 4096, 'batch_size': 32, 'act_fn': ReLU(), 'optimizer': 'AdamW', 'dropout_prob': 0.04375810986688453, 'lr_mult': 4.211776903906428, 'patience': 16, 'initialization': 'Default'}
model: NetLightBase(
(act_fn): ReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=4096, bias=True)
(1): ReLU()
(2): Dropout(p=0.04375810986688453, inplace=False)
(3): Linear(in_features=4096, out_features=2048, bias=True)
(4): ReLU()
(5): Dropout(p=0.04375810986688453, inplace=False)
(6): Linear(in_features=2048, out_features=2048, bias=True)
(7): ReLU()
(8): Dropout(p=0.04375810986688453, inplace=False)
(9): Linear(in_features=2048, out_features=1024, bias=True)
(10): ReLU()
(11): Dropout(p=0.04375810986688453, inplace=False)
(12): Linear(in_features=1024, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │     2.458235025405884     │
│          val_acc          │    0.08480565249919891    │
│         val_loss          │     2.458235025405884     │
│        valid_mapk         │    0.1751328706741333     │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.1751328706741333, 'val_loss': 2.458235025405884, 'val_acc': 0.08480565249919891, 'hp_metric': 2.458235025405884}
config: {'l1': 64, 'epochs': 128, 'batch_size': 256, 'act_fn': LeakyReLU(), 'optimizer': 'Adamax', 'dropout_prob': 0.005170658955305807, 'lr_mult': 0.832718394912432, 'patience': 8, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │    2.2329816818237305     │
│          val_acc          │    0.30035334825515747    │
│         val_loss          │    2.2329816818237305     │
│        valid_mapk         │    0.4217664897441864     │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.4217664897441864, 'val_loss': 2.2329816818237305, 'val_acc': 0.30035334825515747, 'hp_metric': 2.2329816818237305}
config: {'l1': 1024, 'epochs': 256, 'batch_size': 8, 'act_fn': Swish(), 'optimizer': 'NAdam', 'dropout_prob': 0.08834550718769361, 'lr_mult': 7.65501078489161, 'patience': 64, 'initialization': 'Xavier'}
model: NetLightBase(
(act_fn): Swish()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=1024, bias=True)
(1): Swish()
(2): Dropout(p=0.08834550718769361, inplace=False)
(3): Linear(in_features=1024, out_features=512, bias=True)
(4): Swish()
(5): Dropout(p=0.08834550718769361, inplace=False)
(6): Linear(in_features=512, out_features=512, bias=True)
(7): Swish()
(8): Dropout(p=0.08834550718769361, inplace=False)
(9): Linear(in_features=512, out_features=256, bias=True)
(10): Swish()
(11): Dropout(p=0.08834550718769361, inplace=False)
(12): Linear(in_features=256, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │    2.4547009468078613     │
│          val_acc          │    0.08833922445774078    │
│         val_loss          │    2.4547009468078613     │
│        valid_mapk         │    0.16724537312984467    │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.16724537312984467, 'val_loss': 2.4547009468078613, 'val_acc': 0.08833922445774078, 'hp_metric': 2.4547009468078613}
config: {'l1': 512, 'epochs': 512, 'batch_size': 16, 'act_fn': Sigmoid(), 'optimizer': 'Adam', 'dropout_prob': 0.07563714253500024, 'lr_mult': 2.3450676871382794, 'patience': 32, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): Sigmoid()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=512, bias=True)
(1): Sigmoid()
(2): Dropout(p=0.07563714253500024, inplace=False)
(3): Linear(in_features=512, out_features=256, bias=True)
(4): Sigmoid()
(5): Dropout(p=0.07563714253500024, inplace=False)
(6): Linear(in_features=256, out_features=256, bias=True)
(7): Sigmoid()
(8): Dropout(p=0.07563714253500024, inplace=False)
(9): Linear(in_features=256, out_features=128, bias=True)
(10): Sigmoid()
(11): Dropout(p=0.07563714253500024, inplace=False)
(12): Linear(in_features=128, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │    2.2733662128448486     │
│          val_acc          │     0.268551230430603     │
│         val_loss          │    2.2733662128448486     │
│        valid_mapk         │    0.33696338534355164    │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.33696338534355164, 'val_loss': 2.2733662128448486, 'val_acc': 0.268551230430603, 'hp_metric': 2.2733662128448486}
config: {'l1': 256, 'epochs': 4096, 'batch_size': 64, 'act_fn': ReLU(), 'optimizer': 'Adamax', 'dropout_prob': 0.02833523179697884, 'lr_mult': 9.528945328733357, 'patience': 4, 'initialization': 'Xavier'}
model: NetLightBase(
(act_fn): ReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=256, bias=True)
(1): ReLU()
(2): Dropout(p=0.02833523179697884, inplace=False)
(3): Linear(in_features=256, out_features=128, bias=True)
(4): ReLU()
(5): Dropout(p=0.02833523179697884, inplace=False)
(6): Linear(in_features=128, out_features=128, bias=True)
(7): ReLU()
(8): Dropout(p=0.02833523179697884, inplace=False)
(9): Linear(in_features=128, out_features=64, bias=True)
(10): ReLU()
(11): Dropout(p=0.02833523179697884, inplace=False)
(12): Linear(in_features=64, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │    2.3004534244537354     │
│          val_acc          │    0.23674911260604858    │
│         val_loss          │    2.3004534244537354     │
│        valid_mapk         │    0.2987654209136963     │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.2987654209136963, 'val_loss': 2.3004534244537354, 'val_acc': 0.23674911260604858, 'hp_metric': 2.3004534244537354}
config: {'l1': 64, 'epochs': 64, 'batch_size': 256, 'act_fn': ReLU(), 'optimizer': 'Adamax', 'dropout_prob': 0.0057308353644435995, 'lr_mult': 0.12746123603776952, 'patience': 8, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): ReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): ReLU()
(2): Dropout(p=0.0057308353644435995, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): ReLU()
(5): Dropout(p=0.0057308353644435995, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): ReLU()
(8): Dropout(p=0.0057308353644435995, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): ReLU()
(11): Dropout(p=0.0057308353644435995, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │    2.3844985961914062     │
│          val_acc          │    0.1236749142408371     │
│         val_loss          │    2.3844985961914062     │
│        valid_mapk         │    0.21562740206718445    │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.21562740206718445, 'val_loss': 2.3844985961914062, 'val_acc': 0.1236749142408371, 'hp_metric': 2.3844985961914062}
spotPython tuning: 2.2329816818237305 [----------] 2.80%
config: {'l1': 128, 'epochs': 512, 'batch_size': 16, 'act_fn': ELU(), 'optimizer': 'Adamax', 'dropout_prob': 0.0756380926179772, 'lr_mult': 2.3450649482576313, 'patience': 32, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): ELU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=128, bias=True)
(1): ELU()
(2): Dropout(p=0.0756380926179772, inplace=False)
(3): Linear(in_features=128, out_features=64, bias=True)
(4): ELU()
(5): Dropout(p=0.0756380926179772, inplace=False)
(6): Linear(in_features=64, out_features=64, bias=True)
(7): ELU()
(8): Dropout(p=0.0756380926179772, inplace=False)
(9): Linear(in_features=64, out_features=32, bias=True)
(10): ELU()
(11): Dropout(p=0.0756380926179772, inplace=False)
(12): Linear(in_features=32, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │     2.263234853744507     │
│          val_acc          │     0.268551230430603     │
│         val_loss          │     2.263234853744507     │
│        valid_mapk         │    0.3856271207332611     │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.3856271207332611, 'val_loss': 2.263234853744507, 'val_acc': 0.268551230430603, 'hp_metric': 2.263234853744507}
spotPython tuning: 2.2329816818237305 [#---------] 11.18%
config: {'l1': 512, 'epochs': 128, 'batch_size': 8, 'act_fn': Tanh(), 'optimizer': 'NAdam', 'dropout_prob': 0.005170536012710651, 'lr_mult': 0.8327083710883477, 'patience': 8, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): Tanh()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=512, bias=True)
(1): Tanh()
(2): Dropout(p=0.005170536012710651, inplace=False)
(3): Linear(in_features=512, out_features=256, bias=True)
(4): Tanh()
(5): Dropout(p=0.005170536012710651, inplace=False)
(6): Linear(in_features=256, out_features=256, bias=True)
(7): Tanh()
(8): Dropout(p=0.005170536012710651, inplace=False)
(9): Linear(in_features=256, out_features=128, bias=True)
(10): Tanh()
(11): Dropout(p=0.005170536012710651, inplace=False)
(12): Linear(in_features=128, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │    2.2661986351013184     │
│          val_acc          │    0.2720848023891449     │
│         val_loss          │    2.2661986351013184     │
│        valid_mapk         │    0.36516207456588745    │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.36516207456588745, 'val_loss': 2.2661986351013184, 'val_acc': 0.2720848023891449, 'hp_metric': 2.2661986351013184}
spotPython tuning: 2.2329816818237305 [###-------] 26.67%
config: {'l1': 256, 'epochs': 4096, 'batch_size': 128, 'act_fn': ReLU(), 'optimizer': 'AdamW', 'dropout_prob': 0.02833522259075095, 'lr_mult': 9.5289448927627, 'patience': 8, 'initialization': 'Xavier'}
model: NetLightBase(
(act_fn): ReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=256, bias=True)
(1): ReLU()
(2): Dropout(p=0.02833522259075095, inplace=False)
(3): Linear(in_features=256, out_features=128, bias=True)
(4): ReLU()
(5): Dropout(p=0.02833522259075095, inplace=False)
(6): Linear(in_features=128, out_features=128, bias=True)
(7): ReLU()
(8): Dropout(p=0.02833522259075095, inplace=False)
(9): Linear(in_features=128, out_features=64, bias=True)
(10): ReLU()
(11): Dropout(p=0.02833522259075095, inplace=False)
(12): Linear(in_features=64, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │    2.2677416801452637     │
│          val_acc          │    0.26501765847206116    │
│         val_loss          │    2.2677416801452637     │
│        valid_mapk         │    0.34780094027519226    │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.34780094027519226, 'val_loss': 2.2677416801452637, 'val_acc': 0.26501765847206116, 'hp_metric': 2.2677416801452637}
spotPython tuning: 2.2329816818237305 [###-------] 30.19%
config: {'l1': 512, 'epochs': 128, 'batch_size': 8, 'act_fn': Tanh(), 'optimizer': 'NAdam', 'dropout_prob': 0.005170926446269222, 'lr_mult': 0.8327400103396447, 'patience': 8, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): Tanh()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=512, bias=True)
(1): Tanh()
(2): Dropout(p=0.005170926446269222, inplace=False)
(3): Linear(in_features=512, out_features=256, bias=True)
(4): Tanh()
(5): Dropout(p=0.005170926446269222, inplace=False)
(6): Linear(in_features=256, out_features=256, bias=True)
(7): Tanh()
(8): Dropout(p=0.005170926446269222, inplace=False)
(9): Linear(in_features=256, out_features=128, bias=True)
(10): Tanh()
(11): Dropout(p=0.005170926446269222, inplace=False)
(12): Linear(in_features=128, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │     2.31839656829834      │
│          val_acc          │    0.21201413869857788    │
│         val_loss          │     2.31839656829834      │
│        valid_mapk         │    0.3350694179534912     │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.3350694179534912, 'val_loss': 2.31839656829834, 'val_acc': 0.21201413869857788, 'hp_metric': 2.31839656829834}
spotPython tuning: 2.2329816818237305 [####------] 37.67%
config: {'l1': 64, 'epochs': 128, 'batch_size': 16, 'act_fn': Tanh(), 'optimizer': 'NAdam', 'dropout_prob': 0.0051691451807639194, 'lr_mult': 0.8325972770237184, 'patience': 32, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): Tanh()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): Tanh()
(2): Dropout(p=0.0051691451807639194, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): Tanh()
(5): Dropout(p=0.0051691451807639194, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): Tanh()
(8): Dropout(p=0.0051691451807639194, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): Tanh()
(11): Dropout(p=0.0051691451807639194, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │    2.2918200492858887     │
│          val_acc          │    0.23674911260604858    │
│         val_loss          │    2.2918200492858887     │
│        valid_mapk         │    0.3379629850387573     │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.3379629850387573, 'val_loss': 2.2918200492858887, 'val_acc': 0.23674911260604858, 'hp_metric': 2.2918200492858887}
spotPython tuning: 2.2329816818237305 [#####-----] 45.83%
config: {'l1': 64, 'epochs': 128, 'batch_size': 16, 'act_fn': Tanh(), 'optimizer': 'NAdam', 'dropout_prob': 0.005171440031826116, 'lr_mult': 0.8327818027362548, 'patience': 32, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): Tanh()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): Tanh()
(2): Dropout(p=0.005171440031826116, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): Tanh()
(5): Dropout(p=0.005171440031826116, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): Tanh()
(8): Dropout(p=0.005171440031826116, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): Tanh()
(11): Dropout(p=0.005171440031826116, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │     2.295330047607422     │
│          val_acc          │    0.2473498284816742     │
│         val_loss          │     2.295330047607422     │
│        valid_mapk         │    0.33101850748062134    │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.33101850748062134, 'val_loss': 2.295330047607422, 'val_acc': 0.2473498284816742, 'hp_metric': 2.295330047607422}
spotPython tuning: 2.2329816818237305 [######----] 59.07%
config: {'l1': 128, 'epochs': 128, 'batch_size': 8, 'act_fn': ReLU(), 'optimizer': 'AdamW', 'dropout_prob': 0.005168205606205038, 'lr_mult': 0.8325345237459943, 'patience': 8, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): ReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=128, bias=True)
(1): ReLU()
(2): Dropout(p=0.005168205606205038, inplace=False)
(3): Linear(in_features=128, out_features=64, bias=True)
(4): ReLU()
(5): Dropout(p=0.005168205606205038, inplace=False)
(6): Linear(in_features=64, out_features=64, bias=True)
(7): ReLU()
(8): Dropout(p=0.005168205606205038, inplace=False)
(9): Linear(in_features=64, out_features=32, bias=True)
(10): ReLU()
(11): Dropout(p=0.005168205606205038, inplace=False)
(12): Linear(in_features=32, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃      Validate metric      ┃       DataLoader 0        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│         hp_metric         │    2.2576189041137695     │
│          val_acc          │     0.268551230430603     │
│         val_loss          │    2.2576189041137695     │
│        valid_mapk         │    0.36342597007751465    │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.36342597007751465, 'val_loss': 2.2576189041137695, 'val_acc': 0.268551230430603, 'hp_metric': 2.2576189041137695}
spotPython tuning: 2.2329816818237305 [#######---] 71.19%
config: {'l1': 512, 'epochs': 128, 'batch_size': 256, 'act_fn': ReLU(), 'optimizer': 'NAdam', 'dropout_prob': 0.005165563024496982, 'lr_mult': 0.8323238173327993, 'patience': 8, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): ReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=512, bias=True)
(1): ReLU()
(2): Dropout(p=0.005165563024496982, inplace=False)
(3): Linear(in_features=512, out_features=256, bias=True)
(4): ReLU()
(5): Dropout(p=0.005165563024496982, inplace=False)
(6): Linear(in_features=256, out_features=256, bias=True)
(7): ReLU()
(8): Dropout(p=0.005165563024496982, inplace=False)
(9): Linear(in_features=256, out_features=128, bias=True)
(10): ReLU()
(11): Dropout(p=0.005165563024496982, inplace=False)
(12): Linear(in_features=128, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.2586112022399902        │
│ val_acc                   │ 0.2826855182647705        │
│ val_loss                  │ 2.2586112022399902        │
│ valid_mapk                │ 0.3413628339767456        │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.3413628339767456, 'val_loss': 2.2586112022399902, 'val_acc': 0.2826855182647705, 'hp_metric': 2.2586112022399902}
spotPython tuning: 2.2329816818237305 [#######---] 73.82%
config: {'l1': 4096, 'epochs': 512, 'batch_size': 32, 'act_fn': Sigmoid(), 'optimizer': 'Adamax', 'dropout_prob': 0.07563837577587414, 'lr_mult': 2.345074960681534, 'patience': 16, 'initialization': 'Kaiming'}
model: NetLightBase(
(act_fn): Sigmoid()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=4096, bias=True)
(1): Sigmoid()
(2): Dropout(p=0.07563837577587414, inplace=False)
(3): Linear(in_features=4096, out_features=2048, bias=True)
(4): Sigmoid()
(5): Dropout(p=0.07563837577587414, inplace=False)
(6): Linear(in_features=2048, out_features=2048, bias=True)
(7): Sigmoid()
(8): Dropout(p=0.07563837577587414, inplace=False)
(9): Linear(in_features=2048, out_features=1024, bias=True)
(10): Sigmoid()
(11): Dropout(p=0.07563837577587414, inplace=False)
(12): Linear(in_features=1024, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.388317346572876         │
│ val_acc                   │ 0.12720848619937897       │
│ val_loss                  │ 2.388317346572876         │
│ valid_mapk                │ 0.2222650647163391        │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.2222650647163391, 'val_loss': 2.388317346572876, 'val_acc': 0.12720848619937897, 'hp_metric': 2.388317346572876}
spotPython tuning: 2.2329816818237305 [##########] 100.00% Done...
<spotPython.spot.spot.Spot at 0x16ce50e50>
The textual output shown in the console (or code cell) can be visualized with Tensorboard as described in Section 14.9; see also the Tensorboard description in the documentation.
After the hyperparameter tuning run is finished, the results can be analyzed as described in Section 14.10.
spot_tuner.plot_progress(log_y=False,
    filename="./figures/" + experiment_name+"_progress.png")
from spotPython.utils.eda import gen_design_table
print(gen_design_table(fun_control=fun_control, spot=spot_tuner))
| name | type | default | lower | upper | tuned | transform | importance | stars |
|----------------|--------|-----------|---------|---------|----------------------|-----------------------|--------------|---------|
| l1 | int | 3 | 6.0 | 13.0 | 6.0 | transform_power_2_int | 0.12 | . |
| epochs | int | 4 | 6.0 | 13.0 | 7.0 | transform_power_2_int | 100.00 | *** |
| batch_size | int | 4 | 2.0 | 8.0 | 8.0 | transform_power_2_int | 0.00 | |
| act_fn | factor | ReLU | 0.0 | 5.0 | 3.0 | None | 0.00 | |
| optimizer | factor | SGD | 0.0 | 3.0 | 2.0 | None | 0.00 | |
| dropout_prob | float | 0.01 | 0.0 | 0.1 | 0.005170658955305807 | None | 0.35 | . |
| lr_mult | float | 1.0 | 0.1 | 10.0 | 0.832718394912432 | None | 98.68 | *** |
| patience | int | 2 | 2.0 | 6.0 | 3.0 | transform_power_2_int | 0.07 | |
| initialization | factor | Default | 0.0 | 2.0 | 1.0 | None | 100.00 | *** |
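For rows with the transform transform_power_2_int, the lower, upper, and tuned columns hold exponents rather than final values. The following is a minimal sketch, assuming the transform computes 2**x and casts the result to an integer; the helper power_2_int is a hypothetical stand-in and not spotPython's own implementation. It maps the tuned exponents from the table to the values that appear in the tuned configuration.

```python
# Hypothetical stand-in for the power-of-two transform named in the table.
# Assumption: the transform maps an exponent x to int(2**x).
def power_2_int(x: float) -> int:
    return int(2 ** x)


# Tuned exponents copied from the design table above.
tuned_exponents = {"l1": 6.0, "epochs": 7.0, "batch_size": 8.0, "patience": 3.0}

# Apply the transform to every exponent.
tuned_values = {name: power_2_int(x) for name, x in tuned_exponents.items()}
print(tuned_values)
```

Under this assumption, the exponent 6.0 for l1 corresponds to 64 neurons, and the exponent 8.0 for batch_size corresponds to a batch size of 256.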
spot_tuner.plot_importance(threshold=0.025,
    filename="./figures/" + experiment_name+"_importance.png")
from spotPython.hyperparameters.values import get_one_config_from_X
X = spot_tuner.to_all_dim(spot_tuner.min_X.reshape(1,-1))
config = get_one_config_from_X(X, fun_control)
config
{'l1': 64,
'epochs': 128,
'batch_size': 256,
'act_fn': LeakyReLU(),
'optimizer': 'Adamax',
'dropout_prob': 0.005170658955305807,
'lr_mult': 0.832718394912432,
'patience': 8,
'initialization': 'Kaiming'}
from spotPython.light.traintest import test_model
test_model(config, fun_control)
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Test metric               ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.1616928577423096        │
│ test_mapk_epoch           │ 0.4196580946445465        │
│ val_acc                   │ 0.38048091530799866       │
│ val_loss                  │ 2.1616928577423096        │
└───────────────────────────┴───────────────────────────┘
test_model result: {'test_mapk_epoch': 0.4196580946445465, 'val_loss': 2.1616928577423096, 'val_acc': 0.38048091530799866, 'hp_metric': 2.1616928577423096}
(2.1616928577423096, 0.38048091530799866)
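The models printed in this section all show the same pattern of layer widths: the first hidden layer has l1 units, the next two have l1/2, and the last hidden layer has l1/4. The sketch below reproduces the (in_features, out_features) pairs under that assumption. The function layer_dims is a hypothetical helper inferred only from the printed output, not taken from the NetLightBase source.

```python
# Assumption: hidden widths follow l1, l1/2, l1/2, l1/4, as read off the
# printed models in this section (not taken from the NetLightBase source).
def layer_dims(l_in: int, l1: int, l_out: int) -> list[tuple[int, int]]:
    hidden = [l1, l1 // 2, l1 // 2, l1 // 4]
    sizes = [l_in] + hidden + [l_out]
    # Pair consecutive sizes into (in_features, out_features) per Linear layer.
    return list(zip(sizes[:-1], sizes[1:]))


# Tuned configuration: 64 input features, l1 = 64, 11 output classes.
print(layer_dims(64, 64, 11))
```

For l1 = 64 this yields the five Linear layers shown in the test_model output, ending in the 16-to-11 output layer.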
The KFold class from sklearn.model_selection is used to generate the folds for cross-validation.
The CrossValidationDataModule class [SOURCE] is used to generate the folds for the hyperparameter tuning process.
The cross-validation is run with the cv_model function [SOURCE]:
from spotPython.light.traintest import cv_model
cv_model(config, fun_control)
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
k: 0
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.2102842330932617        │
│ val_acc                   │ 0.3239436745643616        │
│ val_loss                  │ 2.2102842330932617        │
│ valid_mapk                │ 0.40140846371650696       │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.40140846371650696, 'val_loss': 2.2102842330932617, 'val_acc': 0.3239436745643616, 'hp_metric': 2.2102842330932617}
k: 1
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.1147525310516357        │
│ val_acc                   │ 0.4507042169570923        │
│ val_loss                  │ 2.1147525310516357        │
│ valid_mapk                │ 0.5023474097251892        │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.5023474097251892, 'val_loss': 2.1147525310516357, 'val_acc': 0.4507042169570923, 'hp_metric': 2.1147525310516357}
k: 2
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.2411983013153076        │
│ val_acc                   │ 0.30985915660858154       │
│ val_loss                  │ 2.2411983013153076        │
│ valid_mapk                │ 0.32863849401474          │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.32863849401474, 'val_loss': 2.2411983013153076, 'val_acc': 0.30985915660858154, 'hp_metric': 2.2411983013153076}
k: 3
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.002826452255249         │
│ val_acc                   │ 0.5492957830429077        │
│ val_loss                  │ 2.002826452255249         │
│ valid_mapk                │ 0.5892018675804138        │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.5892018675804138, 'val_loss': 2.002826452255249, 'val_acc': 0.5492957830429077, 'hp_metric': 2.002826452255249}
k: 4
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.095752239227295         │
│ val_acc                   │ 0.49295774102211          │
│ val_loss                  │ 2.095752239227295         │
│ valid_mapk                │ 0.51408451795578          │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.51408451795578, 'val_loss': 2.095752239227295, 'val_acc': 0.49295774102211, 'hp_metric': 2.095752239227295}
k: 5
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.072810173034668         │
│ val_acc                   │ 0.5070422291755676        │
│ val_loss                  │ 2.072810173034668         │
│ valid_mapk                │ 0.5727699398994446        │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.5727699398994446, 'val_loss': 2.072810173034668, 'val_acc': 0.5070422291755676, 'hp_metric': 2.072810173034668}
k: 6
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.1038923263549805        │
│ val_acc                   │ 0.43661972880363464       │
│ val_loss                  │ 2.1038923263549805        │
│ valid_mapk                │ 0.4906103312969208        │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.4906103312969208, 'val_loss': 2.1038923263549805, 'val_acc': 0.43661972880363464, 'hp_metric': 2.1038923263549805}
k: 7
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 1.988251805305481         │
│ val_acc                   │ 0.5857142806053162        │
│ val_loss                  │ 1.988251805305481         │
│ valid_mapk                │ 0.6095238327980042        │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.6095238327980042, 'val_loss': 1.988251805305481, 'val_acc': 0.5857142806053162, 'hp_metric': 1.988251805305481}
k: 8
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 2.028038263320923         │
│ val_acc                   │ 0.5428571701049805        │
│ val_loss                  │ 2.028038263320923         │
│ valid_mapk                │ 0.5809524059295654        │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.5809524059295654, 'val_loss': 2.028038263320923, 'val_acc': 0.5428571701049805, 'hp_metric': 2.028038263320923}
k: 9
model: NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric           ┃ DataLoader 0              ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ hp_metric                 │ 1.971924901008606         │
│ val_acc                   │ 0.6000000238418579        │
│ val_loss                  │ 1.971924901008606         │
│ valid_mapk                │ 0.6309523582458496        │
└───────────────────────────┴───────────────────────────┘
train_model result: {'valid_mapk': 0.6309523582458496, 'val_loss': 1.971924901008606, 'val_acc': 0.6000000238418579, 'hp_metric': 1.971924901008606}
cv_model mapk result: 0.5220489621162414
0.5220489621162414
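The reported cv_model mapk result appears to be the arithmetic mean of the ten per-fold valid_mapk values. The sketch below checks this with the numbers copied from the train_model results for k = 0 to 9. It makes no claim about how cv_model aggregates the folds internally; it only shows that the printed numbers are consistent with a plain average.

```python
from statistics import fmean

# valid_mapk of folds k = 0..9, copied from the train_model results above.
fold_mapk = [
    0.40140846371650696,
    0.5023474097251892,
    0.32863849401474,
    0.5892018675804138,
    0.51408451795578,
    0.5727699398994446,
    0.4906103312969208,
    0.6095238327980042,
    0.5809524059295654,
    0.6309523582458496,
]

# Plain average over the folds.
cv_mapk = fmean(fold_mapk)
print(round(cv_mapk, 6))
```

The per-fold scores range from about 0.33 to 0.63, so the mean alone hides a considerable spread between folds.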
filename = "./figures/" + experiment_name
spot_tuner.plot_important_hyperparameter_contour(filename=filename)
l1: 0.11841349539420895
epochs: 100.0
dropout_prob: 0.35410320559765884
lr_mult: 98.6792989166949
patience: 0.07429547243917399
initialization: 100.0
spot_tuner.parallel_plot()
Parallel coordinates plots
PLOT_ALL = False
if PLOT_ALL:
    n = spot_tuner.k
    for i in range(n-1):
        for j in range(i+1, n):
            spot_tuner.plot_contour(i=i, j=j, min_z=min_z, max_z=max_z)
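The nested loops above visit every unordered pair of hyperparameter indices exactly once, so the number of contour plots grows quadratically with the number of hyperparameters. The following sketch counts the pairs and shows an equivalent formulation with itertools.combinations. The value n = 9 is only an illustrative assumption, chosen to match the nine rows of the design table.

```python
from itertools import combinations

n = 9  # illustrative value: the design table lists nine hyperparameters

# Nested loops as in the snippet above, collecting only the index pairs.
pairs_loops = [(i, j) for i in range(n - 1) for j in range(i + 1, n)]

# Equivalent formulation with itertools.combinations.
pairs_comb = list(combinations(range(n), 2))

# Number of contour plots that PLOT_ALL = True would request.
print(len(pairs_comb))
```

With nine hyperparameters this gives n(n-1)/2 = 36 plots, which is one reason PLOT_ALL is switched off by default here.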
After we have trained the models, we can look at the actual activation values that we find inside the model. For instance, how many neurons are set to zero in ReLU? Where do we find most values in Tanh? To answer these questions, we can write a simple function which takes a trained model, applies it to a batch of inputs, and plots the histogram of the activations inside the network:
from spotPython.torch.activation import Sigmoid, Tanh, ReLU, LeakyReLU, ELU, Swish
act_fn_by_name = {"sigmoid": Sigmoid, "tanh": Tanh, "relu": ReLU, "leakyrelu": LeakyReLU, "elu": ELU, "swish": Swish}
from spotPython.hyperparameters.values import get_one_config_from_X
X = spot_tuner.to_all_dim(spot_tuner.min_X.reshape(1,-1))
config = get_one_config_from_X(X, fun_control)
model = fun_control["core_model"](**config, _L_in=64, _L_out=11)
model
NetLightBase(
(act_fn): LeakyReLU()
(train_mapk): MAPK()
(valid_mapk): MAPK()
(test_mapk): MAPK()
(layers): Sequential(
(0): Linear(in_features=64, out_features=64, bias=True)
(1): LeakyReLU()
(2): Dropout(p=0.005170658955305807, inplace=False)
(3): Linear(in_features=64, out_features=32, bias=True)
(4): LeakyReLU()
(5): Dropout(p=0.005170658955305807, inplace=False)
(6): Linear(in_features=32, out_features=32, bias=True)
(7): LeakyReLU()
(8): Dropout(p=0.005170658955305807, inplace=False)
(9): Linear(in_features=32, out_features=16, bias=True)
(10): LeakyReLU()
(11): Dropout(p=0.005170658955305807, inplace=False)
(12): Linear(in_features=16, out_features=11, bias=True)
)
)
from spotPython.utils.eda import visualize_activations
visualize_activations(model, device="cpu", color=f"C{0}")
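The two questions raised above (how many ReLU outputs are zero, and where Tanh values concentrate) can also be explored without the trained model. The sketch below is a simplified illustration on synthetic data: it assumes standard-normal pre-activations, which is not what the tuned network produces, and it does not use visualize_activations or any other spotPython function. The 0.9 saturation threshold is an arbitrary choice for illustration.

```python
import math
import random

random.seed(0)

# Synthetic pre-activations drawn from a standard normal distribution;
# these stand in for real layer inputs and are purely illustrative.
pre_activations = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# Apply the two activation functions element-wise.
relu_out = [max(0.0, z) for z in pre_activations]
tanh_out = [math.tanh(z) for z in pre_activations]

# Fraction of ReLU outputs that are exactly zero for this batch.
relu_zero_fraction = sum(1 for a in relu_out if a == 0.0) / len(relu_out)

# Fraction of Tanh outputs close to saturation (|a| > 0.9).
tanh_saturated_fraction = sum(1 for a in tanh_out if abs(a) > 0.9) / len(tanh_out)

print(f"ReLU zeros: {relu_zero_fraction:.2f}")
print(f"Tanh saturated: {tanh_saturated_fraction:.2f}")
```

For symmetric inputs roughly half of the ReLU outputs are zero, which is the kind of pattern the activation histograms make visible for the real network.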