Module src.PyOghma_ML.Tuning

This module provides functionality for hyperparameter tuning of machine learning models using Keras Tuner.

It includes tools for building and optimizing model architectures, allowing users to explore different hyperparameter configurations. The module supports both single-input and multi-input models, enabling flexible experimentation with various network designs. Results from the tuning process, including the best hyperparameters, can be saved for further analysis.
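
As a quick orientation, the sketch below drives the tuner end to end. It is a minimal illustration, not taken from the source: the synthetic data, sample counts, and output directory are assumptions, and the import path is inferred from the module name. The feature matrix is split in half internally for the dual-input model, so its width should be twice the branch input size (100) expected by the builder.

import os
import numpy as np
from src.PyOghma_ML.Tuning import Tuning

rng = np.random.default_rng(0)
X_train = rng.random((4096, 200))  # 200 features -> two 100-feature branches
y_train = rng.random((4096, 1))    # single regression target
X_val = rng.random((512, 200))
y_val = rng.random((512, 1))

out_dir = "tuning_results"
os.makedirs(out_dir, exist_ok=True)

# Constructing Tuning runs the Hyperband search and writes best_hps.csv to out_dir.
Tuning(X_train, y_train, X_val, y_val, out_dir)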

Functions

def residual_block(inputs: tensorflow.python.framework.tensor.Tensor,
                   n_layers: int,
                   nodes: int,
                   activation: str,
                   dropout: float | None = None) ‑> tensorflow.python.framework.tensor.Tensor
def residual_block(inputs: tf.Tensor, n_layers: int, nodes: int, activation: str, dropout: Optional[float] = None) -> tf.Tensor:
    """
    Create a residual block for the model.

    A residual block implements the residual connection concept from ResNet,
    where the input is added to the output of a series of dense layers. This
    helps with gradient flow and enables training of deeper networks.

    Architecture:
        input -> BatchNorm -> Dense -> [Dense + Dropout]*(n_layers-1) -> Add(input, output)

    Args:
        inputs (tensorflow.Tensor): Input tensor for the block. Shape should be
            compatible with the dense layer node count.
        n_layers (int): Number of dense layers in the residual block. Must be >= 1.
        nodes (int): Number of nodes/neurons in each dense layer.
        activation (str): Activation function to use (e.g., 'relu', 'tanh', 'sigmoid').
        dropout (float, optional): Dropout rate applied after each layer except the first.
            If None, no dropout is applied. Should be between 0 and 1.

    Returns:
        tensorflow.Tensor: Output tensor with residual connection applied.
            Shape matches the input tensor.

    Note:
        The input tensor must have the same number of features as the nodes parameter
        for the residual connection to work properly.

    Example:
        >>> x = tf.keras.Input(shape=(64,))
        >>> y = residual_block(x, n_layers=3, nodes=64, activation='relu', dropout=0.2)
    """
    x = layers.BatchNormalization()(inputs)
    x = layers.Dense(nodes, activation=activation)(x)

    for idx in range(n_layers - 1):
        x = layers.Dense(nodes, activation=activation)(x)
        if dropout is not None:
            x = layers.Dropout(dropout)(x)

    x = layers.Add()([inputs, x])
    return x

Create a residual block for the model.

A residual block implements the residual connection concept from ResNet, where the input is added to the output of a series of dense layers. This helps with gradient flow and enables training of deeper networks.

Architecture

input -> BatchNorm -> Dense -> [Dense + Dropout]*(n_layers-1) -> Add(input, output)

Args

inputs : tensorflow.Tensor
Input tensor for the block. Shape should be compatible with the dense layer node count.
n_layers : int
Number of dense layers in the residual block. Must be >= 1.
nodes : int
Number of nodes/neurons in each dense layer.
activation : str
Activation function to use (e.g., 'relu', 'tanh', 'sigmoid').
dropout : float, optional
Dropout rate applied after each layer except the first. If None, no dropout is applied. Should be between 0 and 1.

Returns

tensorflow.Tensor
Output tensor with residual connection applied. Shape matches the input tensor.

Note

The input tensor must have the same number of features as the nodes parameter for the residual connection to work properly.

Example

>>> x = tf.keras.Input(shape=(64,))
>>> y = residual_block(x, n_layers=3, nodes=64, activation='relu', dropout=0.2)

Classes

class Tuning (training_features: numpy.ndarray,
              training_targets: numpy.ndarray,
              validation_features: numpy.ndarray,
              validation_targets: numpy.ndarray,
              dir: str)
class Tuning(kt.HyperModel):
    """
    A class for hyperparameter tuning of machine learning models using Keras Tuner.

    This class extends the Keras Tuner HyperModel to provide automated
    hyperparameter optimization for neural networks. It supports both
    single-input and multi-input model architectures with configurable
    hyperparameter search spaces.

    Features:
        - Automated hyperparameter search for layer sizes, learning rates, etc.
        - Support for single and dual input models
        - Residual block architectures
        - Custom learning rate scheduling
        - Results export and analysis

    Note:
        The tuning process uses Keras Tuner's optimization algorithms to find
        the best hyperparameter combinations based on validation performance.
    """

    def __init__(self, training_features: np.ndarray, training_targets: np.ndarray, validation_features: np.ndarray, validation_targets: np.ndarray, dir: str) -> None:
        """
        Initialize a Tuning instance and perform hyperparameter tuning.

        This method sets up the hyperparameter tuning process, configures the
        search space, and executes the optimization to find the best model
        configuration. Results are automatically saved for later analysis.

        Args:
            training_features (numpy.ndarray): Input features for model training.
                Shape should be (n_samples, n_features).
            training_targets (numpy.ndarray): Target values for training.
                Shape should be (n_samples, n_targets).
            validation_features (numpy.ndarray): Input features for model validation.
                Used to evaluate hyperparameter combinations.
            validation_targets (numpy.ndarray): Target values for validation.
                Used to assess model performance during tuning.
            dir (str): Directory path where tuning results and best hyperparameters
                will be saved.

        Note:
            The tuning process may take significant time depending on the search
            space size and number of trials. Progress is displayed during execution.
        """
        self.hp = kt.HyperParameters()
        self.model = tf.keras.models.Sequential()
        self.input_dim = len(training_features[0])
        self.output_dim = np.shape(training_targets)[1]

        self.validation_features = validation_features.astype(float)
        self.validation_targets = validation_targets.astype(float)
        best_hps = self.tuning('mmm', training_features, training_targets, validation_features, validation_targets)
        df = pd.DataFrame(best_hps)
        df.to_csv(os.path.join(dir, 'best_hps.csv'))

    @staticmethod
    def builder_2(hp: kt.HyperParameters) -> keras.Model:
        """
        Build a model with two input branches for hyperparameter tuning.

        This static method creates a dual-input neural network architecture
        with hyperparameter optimization capabilities. The model combines
        two separate input streams through dense layers before merging.

        Args:
            hp (keras_tuner.HyperParameters): Hyperparameter object containing
                tunable values for the model architecture and training.

        Returns:
            keras.Model: A compiled Keras model with two input branches,
                ready for training and hyperparameter optimization.

        Architecture:
            - Two separate input branches built from stacks of residual blocks
            - Fixed layer widths (64 nodes), SiLU activation, and 0.05 dropout
            - Branches merged by addition, followed by further residual blocks
            - Single-node linear output layer for the final prediction
            - Only the learning-rate schedule parameters are exposed for tuning
        """
        input_1 = keras.Input(shape=(100,))
        input_2 = keras.Input(shape=(100,))
        
        layer_nodes = [64, 64, 64, 64]
        activation = 'silu'
        dropout = [0.05, 0.05, 0.05, 0.05]

        x_1 = layers.Dense(layer_nodes[0], activation=None)(input_1)
        x_2 = layers.Dense(layer_nodes[0], activation=None)(input_2)

        x_1 = residual_block(x_1, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
        x_1 = residual_block(x_1, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
        x_1 = residual_block(x_1, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
        x_1 = residual_block(x_1, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])

        x_2 = residual_block(x_2, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
        x_2 = residual_block(x_2, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
        x_2 = residual_block(x_2, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
        x_2 = residual_block(x_2, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])

        x = layers.Add()([x_1, x_2])

        x = residual_block(x, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
        x = residual_block(x, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
        x = residual_block(x, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
        x = residual_block(x, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])

        outputs = layers.Dense(1, activation=None)(x)

        model = keras.Model(inputs=[input_1, input_2], outputs=outputs)

        initial_learning_rate = hp.Float(name='initial_learning_rate', min_value=1e-6, max_value=1, sampling='log', step=10)
        gamma_learning_rate = hp.Float(name='gamma_learning_rate', min_value=1e-6, max_value=1, sampling='log', step=10)
        power_learning_rate = hp.Float(name='power_learning_rate', min_value=1, max_value=5, step=1)

        model_optimiser = tf.keras.optimizers.Adam(learning_rate=lr(initial_learning_rate, gamma_learning_rate, power_learning_rate))
        model.compile(optimizer=model_optimiser, loss='mse', metrics=['mse'])
        return model

    @staticmethod
    def builder(hp: kt.HyperParameters) -> keras.Model:
        """
        Build a sequential model for hyperparameter tuning.

        This static method creates a standard sequential neural network
        with hyperparameter optimization for architecture and training
        configuration. The model uses tunable layer sizes, activation
        functions, and learning parameters.

        Args:
            hp (keras_tuner.HyperParameters): Hyperparameter object containing
                tunable values for model configuration.

        Returns:
            keras.Model: A compiled sequential Keras model optimized for
                hyperparameter tuning.

        Hyperparameters Tuned:
            - Initial learning rate
            - Learning rate decay parameters  
            - Activation functions
            - Batch normalization usage
            - Layer sizes and dropout rates
        """
        model = tf.keras.models.Sequential()
        init_learning_rate = hp.Float(name='Initial_learning_rate', min_value=1e-6, max_value=1, sampling="log", step=10)
        gamma_learning_rate = hp.Float(name='Gamma_learning_rate', min_value=1e-6, max_value=1, sampling="log", step=10)
        power_learning_rate = hp.Int(name='Power_learning_rate', min_value=1, max_value=5, step=1)
        activation = hp.Choice('activation', ['relu', 'tanh', 'sigmoid','gelu', 'selu'])
        batch_norm = hp.Boolean('batch_norm')
        n_hidden_layers = 4
        nodes = np.zeros(n_hidden_layers, dtype=int)  # layer widths must be integers
        dropout = np.zeros(n_hidden_layers)
        for i in range(n_hidden_layers):
            nodes[i] = hp.Choice(f'Nodes_{i}', [64, 128, 256, 512, 1024, 2048])
            dropout[i] = hp.Float(f'Dropout_{i}', min_value=0.0, max_value=0.8, step=0.1)

        for i, n in enumerate(nodes):
            if i == 0:
                # The first hidden layer width is fixed at 200; the sampled Nodes_0 value is not used here.
                model.add(tf.keras.layers.Dense(200, kernel_initializer='glorot_uniform', kernel_regularizer=None))
                model.add(tf.keras.layers.Activation(activation))
                if batch_norm:
                    model.add(tf.keras.layers.BatchNormalization())
                model.add(tf.keras.layers.Dropout(dropout[i]))
            else:
                model.add(tf.keras.layers.Dense(n, kernel_initializer='glorot_uniform', kernel_regularizer=None))
                model.add(tf.keras.layers.Activation(activation))
                if batch_norm:
                    model.add(tf.keras.layers.BatchNormalization())
                model.add(tf.keras.layers.Dropout(dropout[i]))

        model.add(tf.keras.layers.Dense(1, activation=None))

        model_optimiser = tf.keras.optimizers.Adam(learning_rate=lr(init_learning_rate, gamma_learning_rate, power_learning_rate))
        model.compile(optimizer=model_optimiser, loss='mse', metrics=['mse'])
        return model

    def tuning(self, model: Any, training_features: np.ndarray, training_targets: np.ndarray, validation_features: np.ndarray, validation_targets: np.ndarray) -> List[kt.HyperParameters]:
        """
        Perform hyperparameter tuning using Keras Tuner.

        Args:
            model: Placeholder identifier for the run; it is not used (the builder_2 hypermodel is always searched).
            training_features (numpy.ndarray): Features for training.
            training_targets (numpy.ndarray): Targets for training.
            validation_features (numpy.ndarray): Features for validation.
            validation_targets (numpy.ndarray): Targets for validation.

        Returns:
            list: Best hyperparameters found during tuning.
        """
        if not os.path.isdir(os.path.join(os.getcwd(), 'tuning')):
            os.mkdir(os.path.join(os.getcwd(), 'tuning'))
        tuner = kt.Hyperband(hypermodel=self.builder_2, objective='val_mse', project_name='tuning')
        callbacks = tf.keras.callbacks.EarlyStopping(patience=5, monitor='val_loss')
        tuner.search(
            x=[ training_features[:, :int(len(training_features[0])/2)], training_features[:, int(len(training_features[0])/2):]],
            y=training_targets,
            validation_data=([ validation_features[:, :int(len(validation_features[0])/2)], validation_features[:, int(len(validation_features[0])/2):]], validation_targets),
            epochs=2,
            callbacks=[callbacks],
            shuffle=True,
            batch_size=1024,
        )
        best_hps = tuner.get_best_hyperparameters()
        return best_hps

A class for hyperparameter tuning of machine learning models using Keras Tuner.

This class extends the Keras Tuner HyperModel to provide automated hyperparameter optimization for neural networks. It supports both single-input and multi-input model architectures with configurable hyperparameter search spaces.

Features

  • Automated hyperparameter search for layer sizes, learning rates, etc.
  • Support for single and dual input models
  • Residual block architectures
  • Custom learning rate scheduling
  • Results export and analysis

Note

The tuning process uses Keras Tuner's optimization algorithms to find the best hyperparameter combinations based on validation performance.

Initialize a Tuning instance and perform hyperparameter tuning.

This method sets up the hyperparameter tuning process, configures the search space, and executes the optimization to find the best model configuration. Results are automatically saved for later analysis.

Args

training_features : numpy.ndarray
Input features for model training. Shape should be (n_samples, n_features).
training_targets : numpy.ndarray
Target values for training. Shape should be (n_samples, n_targets).
validation_features : numpy.ndarray
Input features for model validation. Used to evaluate hyperparameter combinations.
validation_targets : numpy.ndarray
Target values for validation. Used to assess model performance during tuning.
dir : str
Directory path where tuning results and best hyperparameters will be saved.

Note

The tuning process may take significant time depending on the search space size and number of trials. Progress is displayed during execution.

Ancestors

  • keras_tuner.src.engine.hypermodel.HyperModel

Static methods

def builder(hp: keras_tuner.src.engine.hyperparameters.hyperparameters.HyperParameters) ‑> keras.src.models.model.Model
@staticmethod
def builder(hp: kt.HyperParameters) -> keras.Model:
    """
    Build a sequential model for hyperparameter tuning.

    This static method creates a standard sequential neural network
    with hyperparameter optimization for architecture and training
    configuration. The model uses tunable layer sizes, activation
    functions, and learning parameters.

    Args:
        hp (keras_tuner.HyperParameters): Hyperparameter object containing
            tunable values for model configuration.

    Returns:
        keras.Model: A compiled sequential Keras model optimized for
            hyperparameter tuning.

    Hyperparameters Tuned:
        - Initial learning rate
        - Learning rate decay parameters  
        - Activation functions
        - Batch normalization usage
        - Layer sizes and dropout rates
    """
    model = tf.keras.models.Sequential()
    init_learning_rate = hp.Float(name='Initial_learning_rate', min_value=1e-6, max_value=1, sampling="log", step=10)
    gamma_learning_rate = hp.Float(name='Gamma_learning_rate', min_value=1e-6, max_value=1, sampling="log", step=10)
    power_learning_rate = hp.Int(name='Power_learning_rate', min_value=1, max_value=5, step=1)
    activation = hp.Choice('activation', ['relu', 'tanh', 'sigmoid','gelu', 'selu'])
    batch_norm = hp.Boolean('batch_norm')
    n_hidden_layers = 4
    nodes = np.zeros(n_hidden_layers, dtype=int)  # layer widths must be integers
    dropout = np.zeros(n_hidden_layers)
    for i in range(n_hidden_layers):
        nodes[i] = hp.Choice(f'Nodes_{i}', [64, 128, 256, 512, 1024, 2048])
        dropout[i] = hp.Float(f'Dropout_{i}', min_value=0.0, max_value=0.8, step=0.1)

    for i, n in enumerate(nodes):
        if i == 0:
            # The first hidden layer width is fixed at 200; the sampled Nodes_0 value is not used here.
            model.add(tf.keras.layers.Dense(200, kernel_initializer='glorot_uniform', kernel_regularizer=None))
            model.add(tf.keras.layers.Activation(activation))
            if batch_norm:
                model.add(tf.keras.layers.BatchNormalization())
            model.add(tf.keras.layers.Dropout(dropout[i]))
        else:
            model.add(tf.keras.layers.Dense(n, kernel_initializer='glorot_uniform', kernel_regularizer=None))
            model.add(tf.keras.layers.Activation(activation))
            if batch_norm:
                model.add(tf.keras.layers.BatchNormalization())
            model.add(tf.keras.layers.Dropout(dropout[i]))

    model.add(tf.keras.layers.Dense(1, activation=None))

    model_optimiser = tf.keras.optimizers.Adam(learning_rate=lr(init_learning_rate, gamma_learning_rate, power_learning_rate))
    model.compile(optimizer=model_optimiser, loss='mse', metrics=['mse'])
    return model

Build a sequential model for hyperparameter tuning.

This static method creates a standard sequential neural network with hyperparameter optimization for architecture and training configuration. The model uses tunable layer sizes, activation functions, and learning parameters.

Args

hp : keras_tuner.HyperParameters
Hyperparameter object containing tunable values for model configuration.

Returns

keras.Model
A compiled sequential Keras model optimized for hyperparameter tuning.

Hyperparameters Tuned

  • Initial learning rate
  • Learning rate decay parameters
  • Activation functions
  • Batch normalization usage
  • Layer sizes and dropout rates
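
As a hedged illustration of exercising the hypermodel on its own, the snippet below builds one concrete configuration from a fresh HyperParameters object; every registered hyperparameter takes its default value, and the printed dictionary is only indicative.

import keras_tuner as kt
from src.PyOghma_ML.Tuning import Tuning  # import path inferred from the module name

hp = kt.HyperParameters()
model = Tuning.builder(hp)  # one concrete compiled model, using each hyperparameter's default
print(hp.values)            # e.g. {'Initial_learning_rate': 1e-06, 'activation': 'relu', ...}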

def builder_2(hp: keras_tuner.src.engine.hyperparameters.hyperparameters.HyperParameters) ‑> keras.src.models.model.Model
@staticmethod
def builder_2(hp: kt.HyperParameters) -> keras.Model:
    """
    Build a model with two input branches for hyperparameter tuning.

    This static method creates a dual-input neural network architecture
    with hyperparameter optimization capabilities. The model combines
    two separate input streams through dense layers before merging.

    Args:
        hp (keras_tuner.HyperParameters): Hyperparameter object containing
            tunable values for the model architecture and training.

    Returns:
        keras.Model: A compiled Keras model with two input branches,
            ready for training and hyperparameter optimization.

    Architecture:
        - Two separate input branches built from stacks of residual blocks
        - Fixed layer widths (64 nodes), SiLU activation, and 0.05 dropout
        - Branches merged by addition, followed by further residual blocks
        - Single-node linear output layer for the final prediction
        - Only the learning-rate schedule parameters are exposed for tuning
    """
    input_1 = keras.Input(shape=(100,))
    input_2 = keras.Input(shape=(100,))
    
    layer_nodes = [64, 64, 64, 64]
    activation = 'silu'
    dropout = [0.05, 0.05, 0.05, 0.05]

    x_1 = layers.Dense(layer_nodes[0], activation=None)(input_1)
    x_2 = layers.Dense(layer_nodes[0], activation=None)(input_2)

    x_1 = residual_block(x_1, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
    x_1 = residual_block(x_1, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
    x_1 = residual_block(x_1, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
    x_1 = residual_block(x_1, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])

    x_2 = residual_block(x_2, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
    x_2 = residual_block(x_2, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
    x_2 = residual_block(x_2, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
    x_2 = residual_block(x_2, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])

    x = layers.Add()([x_1, x_2])

    x = residual_block(x, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
    x = residual_block(x, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
    x = residual_block(x, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])
    x = residual_block(x, nodes=layer_nodes[0], n_layers=len(layer_nodes), activation=activation, dropout=dropout[0])

    outputs = layers.Dense(1, activation=None)(x)

    model = keras.Model(inputs=[input_1, input_2], outputs=outputs)

    initial_learning_rate = hp.Float(name='initial_learning_rate', min_value=1e-6, max_value=1, sampling='log', step=10)
    gamma_learning_rate = hp.Float(name='gamma_learning_rate', min_value=1e-6, max_value=1, sampling='log', step=10)
    power_learning_rate = hp.Float(name='power_learning_rate', min_value=1, max_value=5, step=1)

    model_optimiser = tf.keras.optimizers.Adam(learning_rate=lr(initial_learning_rate, gamma_learning_rate, power_learning_rate))
    model.compile(optimizer=model_optimiser, loss='mse', metrics=['mse'])
    return model

Build a model with two input branches for hyperparameter tuning.

This static method creates a dual-input neural network architecture with hyperparameter optimization capabilities. The model combines two separate input streams through dense layers before merging.

Args

hp : keras_tuner.HyperParameters
Hyperparameter object containing tunable values for the model architecture and training.

Returns

keras.Model
A compiled Keras model with two input branches, ready for training and hyperparameter optimization.

Architecture

  • Two separate input branches built from stacks of residual blocks
  • Fixed layer widths (64 nodes), SiLU activation, and 0.05 dropout
  • Branches merged by addition, followed by further residual blocks
  • Single-node linear output layer for the final prediction
  • Only the learning-rate schedule parameters are exposed for tuning
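
A short, hedged sketch of calling builder_2 directly: each branch input must have exactly 100 features to match the fixed Input shapes above, and the batch size of 8 is arbitrary.

import numpy as np
import keras_tuner as kt
from src.PyOghma_ML.Tuning import Tuning  # import path inferred from the module name

model = Tuning.builder_2(kt.HyperParameters())
x1 = np.random.rand(8, 100).astype('float32')  # first branch input
x2 = np.random.rand(8, 100).astype('float32')  # second branch input
preds = model.predict([x1, x2])                # predictions of shape (8, 1)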

Methods

def tuning(self,
           model: Any,
           training_features: numpy.ndarray,
           training_targets: numpy.ndarray,
           validation_features: numpy.ndarray,
           validation_targets: numpy.ndarray) ‑> List[keras_tuner.src.engine.hyperparameters.hyperparameters.HyperParameters]
def tuning(self, model: Any, training_features: np.ndarray, training_targets: np.ndarray, validation_features: np.ndarray, validation_targets: np.ndarray) -> List[kt.HyperParameters]:
    """
    Perform hyperparameter tuning using Keras Tuner.

    Args:
        model: Placeholder identifier for the run; it is not used (the builder_2 hypermodel is always searched).
        training_features (numpy.ndarray): Features for training.
        training_targets (numpy.ndarray): Targets for training.
        validation_features (numpy.ndarray): Features for validation.
        validation_targets (numpy.ndarray): Targets for validation.

    Returns:
        list: Best hyperparameters found during tuning.
    """
    if not os.path.isdir(os.path.join(os.getcwd(), 'tuning')):
        os.mkdir(os.path.join(os.getcwd(), 'tuning'))
    tuner = kt.Hyperband(hypermodel=self.builder_2, objective='val_mse', project_name='tuning')
    callbacks = tf.keras.callbacks.EarlyStopping(patience=5, monitor='val_loss')
    tuner.search(
        x=[ training_features[:, :int(len(training_features[0])/2)], training_features[:, int(len(training_features[0])/2):]],
        y=training_targets,
        validation_data=([ validation_features[:, :int(len(validation_features[0])/2)], validation_features[:, int(len(validation_features[0])/2):]], validation_targets),
        epochs=2,
        callbacks=[callbacks],
        shuffle=True,
        batch_size=1024,
    )
    best_hps = tuner.get_best_hyperparameters()
    return best_hps

Perform hyperparameter tuning using Keras Tuner.

Args

model
Placeholder identifier for the run; it is not used (the builder_2 hypermodel is always searched).
training_features : numpy.ndarray
Features for training.
training_targets : numpy.ndarray
Targets for training.
validation_features : numpy.ndarray
Features for validation.
validation_targets : numpy.ndarray
Targets for validation.

Returns

list
Best hyperparameters found during tuning.
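
A hedged sketch of using the return value: the best HyperParameters can be fed back into the hypermodel to rebuild the winning model. The array names and output directory are carried over from the module-level sketch and are assumptions.

tuner_obj = Tuning(X_train, y_train, X_val, y_val, out_dir)         # the constructor already runs one search
best_hps = tuner_obj.tuning('mmm', X_train, y_train, X_val, y_val)  # the first argument is an unused label
best_model = Tuning.builder_2(best_hps[0])                          # rebuild the best dual-input model
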
class lr (initial_learning_rate: float, gamma: float, power: float)
class lr(tf.keras.optimizers.schedules.LearningRateSchedule):
    """
    A custom learning rate schedule for TensorFlow/Keras.

    This schedule adjusts the learning rate based on the training step using
    a power decay formula.
    """

    def __init__(self, initial_learning_rate: float, gamma: float, power: float) -> None:
        """
        Initialize the learning rate schedule.

        Args:
            initial_learning_rate (float): Initial learning rate.
            gamma (float): Decay rate.
            power (float): Power for the decay formula.
        """
        self.initial_learning_rate = initial_learning_rate
        self.gamma = gamma
        self.power = power

    def __call__(self, step: int) -> float:
        """
        Calculate the learning rate for a given training step.

        Args:
            step (int): The current training step.

        Returns:
            float: The calculated learning rate.
        """
        step = tf.cast(step, tf.float32)  # the optimizer passes an integer step tensor
        return self.initial_learning_rate * tf.pow(step * self.gamma + 1, -self.power)

    def get_config(self) -> Dict[str, float]:
        """
        Get the configuration of the learning rate schedule.

        Returns:
            dict: A dictionary containing the configuration.
        """
        config = {
            'initial_learning_rate': self.initial_learning_rate,
            'gamma': self.gamma,
            'power': self.power
        }
        return config

A custom learning rate schedule for TensorFlow/Keras.

This schedule adjusts the learning rate based on the training step using a power decay formula.

Initialize the learning rate schedule.

Args

initial_learning_rate : float
Initial learning rate.
gamma : float
Decay rate.
power : float
Power for the decay formula.
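
In other words, the schedule evaluates initial_learning_rate * (1 + gamma * step) ** (-power). A short illustrative check with arbitrary parameter values:

schedule = lr(initial_learning_rate=1e-3, gamma=1e-3, power=2)
schedule(0)     # -> 1.0e-3  (no decay at step 0)
schedule(1000)  # -> 2.5e-4  (1e-3 * (1 + 1.0) ** -2)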

Ancestors

  • keras.src.optimizers.schedules.learning_rate_schedule.LearningRateSchedule

Methods

def get_config(self) ‑> Dict[str, float]
def get_config(self) -> Dict[str, float]:
    """
    Get the configuration of the learning rate schedule.

    Returns:
        dict: A dictionary containing the configuration.
    """
    config = {
        'initial_learning_rate': self.initial_learning_rate,
        'gamma': self.gamma,
        'power': self.power
    }
    return config

Get the configuration of the learning rate schedule.

Returns

dict
A dictionary containing the configuration.
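
A brief, hedged round-trip sketch: because get_config() returns plain numbers, the schedule can be reconstructed directly from the returned dictionary.

schedule = lr(initial_learning_rate=1e-3, gamma=1e-3, power=2)
config = schedule.get_config()  # {'initial_learning_rate': 0.001, 'gamma': 0.001, 'power': 2}
restored = lr(**config)         # an equivalent schedule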