Module src.PyOghma_ML.Networks

Neural Network Management and Training Framework for PyOghma_ML

This module provides a comprehensive framework for managing and training machine learning networks specifically designed for organic photovoltaic device analysis. It implements multiple network architectures, training strategies, and evaluation methods tailored for scientific data analysis.

Key Components

Networks (Base Class): Core functionality for network management, data loading,
    and training coordination. Provides factory methods and subclass
    registration for different network types.

Point Networks: Single-point prediction models for device parameter estimation.
    Optimized for individual device characteristic prediction.

Ensemble Networks: Multi-model aggregation systems for improved prediction accuracy.
    Combines multiple models to reduce variance and improve generalization.

Difference Networks: Specialized models for comparative analysis between experimental
    and predicted data. Useful for error analysis and model validation.

Features

  • Automatic data loading and preprocessing from simulation directories
  • Support for multiple input architectures (single, dual, quad, octal inputs)
  • Integrated hyperparameter tuning with Keras Tuner
  • Comprehensive model evaluation and statistical analysis
  • Automatic report generation with performance metrics
  • Support for both new training and continued training from existing models
  • Memory-efficient data handling for large datasets

Architecture Support (see the sketch below)

  • Sequential dense networks with configurable depth
  • Residual networks with skip connections
  • Multi-input networks for complex feature interactions
  • Ensemble methods with model aggregation
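
As an illustration of the first of these, a sequential dense network of configurable depth can be sketched in Keras; the builder name, layer widths, and dropout rate below are assumptions for this example, not the module's actual defaults:

    import tensorflow as tf

    def build_dense_network(num_inputs: int, num_outputs: int,
                            depth: int = 4, width: int = 256) -> tf.keras.Model:
        """Sequential dense network of configurable depth (illustrative only)."""
        model = tf.keras.Sequential()
        model.add(tf.keras.Input(shape=(num_inputs,)))
        for _ in range(depth):
            model.add(tf.keras.layers.Dense(width, activation='relu'))
            model.add(tf.keras.layers.BatchNormalization())
            model.add(tf.keras.layers.Dropout(0.1))
        model.add(tf.keras.layers.Dense(num_outputs, activation='linear'))
        model.compile(optimizer='adam', loss='mae')
        return model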

Training Features

  • Adaptive learning rate scheduling
  • Early stopping with patience
  • Batch normalization and dropout regularization
  • Custom loss functions and metrics
  • Data augmentation and permutation strategies
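
For instance, adaptive learning-rate scheduling and early stopping with patience are typically wired up through Keras callbacks; the monitor and patience values below are placeholders, not the module's settings:

    import tensorflow as tf

    callbacks = [
        # Halve the learning rate when the validation loss plateaus
        tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5),
        # Stop once the validation loss stops improving, keeping the best weights
        tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=15,
                                         restore_best_weights=True),
    ]
    # model.fit(x, y, validation_data=(x_val, y_val), callbacks=callbacks)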

Integration

  • Seamless integration with Input module for data loading
  • Automatic interfacing with Training module for model fitting
  • Built-in Predicting capabilities for inference
  • Direct connection to Output module for result visualization

Performance Optimization

  • GPU acceleration support
  • Efficient memory management for large datasets
  • Parallel training for ensemble methods
  • Optimized data pipelines for fast iteration

Example Usage

>>> # Initialize and train a Point network
>>> networks = Networks.initialise('simulation_data/', 'Point', model_settings)
>>> networks.train_networks()

>>> # Perform hyperparameter tuning
>>> networks.tune_networks()

>>> # Generate predictions and analysis
>>> predictions = networks.predict_experimental_data(experimental_inputs)

Note

This module requires TensorFlow/Keras for neural network operations and integrates closely with the OghmaNano simulation framework for data compatibility.

Classes

class Difference (networks_dir: str,
model_settings: Model_Settings | None = None)
class Difference(Networks):
    """
    Subclass of Networks for difference-based models.

    This class provides methods specific to difference-based models, including
    training and generating permutations.
    """
    _network_type = 'difference'

    def __init__(self, networks_dir: str, model_settings: Optional[Model_Settings] = None):
        """
        Initialize a Difference instance.

        Args:
            networks_dir (str): Directory containing network configurations.
            model_settings (Model_Settings, optional): Settings for the model.
        """
        self.networks_dir = networks_dir
        self.model_settings = model_settings
        self.rng = np.random.default_rng()

    def train(self, idx=None):
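        """
        Train the difference-based networks.

        Args:
            idx (int, optional): Index of a single network to train. If None,
                every configured network is trained in turn.
        """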
        self.setup_network_directories()
        
        if idx is None:
            for idx in range(self.total_networks):
                self.working_network = idx
                self.load_training_dataset()
                self.permutations_lowRAM()
                self.separate_training_dataset()
                self.train_networks()
        else:
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks('Difference')
    
    def train_existing(self, idx=None):
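        """
        Continue training previously saved networks.

        Args:
            idx (int, optional): Index of a single network to train. If None,
                every configured network is trained in turn.
        """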
        self.setup_network_directories()
        
        if idx is None:
            for idx in range(self.total_networks):
                self.working_network = idx
                self.load_training_dataset()
                self.permutations_lowRAM()
                self.separate_training_dataset()
                self.train_networks_existing()
        else:
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks_existing('Difference')

    def tune(self) -> None:
        """
        Tune the difference-based networks.

        This method performs hyperparameter optimization for the networks.
        """
        self.setup_network_directories()

        for idx in range(self.total_networks):
            self.working_network = idx

            features, targets = self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.tune_networks()

    def permutations(self) -> Tuple[np.ndarray, np.ndarray]:
        """
        Generate permutations of features and targets for training.

        Returns:
            tuple: The original (pre-permutation) features and targets; the
                generated difference features and targets are stored on the
                instance.
        """
        features = self.features
        targets = self.targets
        if self.model_settings.permutations_limit is None:
            permutations_limit = len(targets)
        else:
            permutations_limit = self.model_settings.permutations_limit

        indices = random.choices(range(len(targets)), k=permutations_limit)
        self.indices = indices
        permutations = list(itertools.permutations(indices, r=2))
        permutations = np.array([list(f) for f in permutations], dtype=np.uint32)
        self.permutations_list = permutations

        keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        keys = list(keys)
        l = 0
        for key in keys:
            l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))

        num_inputs = l

        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)

        x = np.zeros((len(permutations), 2 * num_inputs))
        y = np.zeros((len(permutations), num_outputs))
        for idx, p in enumerate(permutations):
            x[idx, :num_inputs] = features[p[0]]
            x[idx, num_inputs:] = features[p[1]]
            y[idx] = targets[p[0]] - targets[p[1]]

        self.features = x  # copy.deepcopy(x)
        self.targets = y  # copy.deepcopy(y)
        self.population = len(permutations)
        return features, targets
    
    def permutations_lowRAM(self):
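        """
        Memory-efficient variant of permutations().

        Builds pairwise index permutations in fixed blocks of 1000 samples
        rather than materialising every pair at once, then assembles the
        difference features and targets in place.

        Returns:
            tuple: The difference features and targets.
        """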
        indices = random.choices(range(1000), k=1000)
        self.indices = indices

        permutations = list(itertools.permutations(indices, r=2))
        permutations = np.array([list(f) for f in permutations], dtype=np.uint32)

        # Tile the base permutation block across consecutive groups of 1000 samples
        thousands = 40
        p = np.zeros((len(permutations) * thousands, 2), dtype=np.uint32)
        step = np.shape(permutations)[0]
        for idx in range(thousands):
            p[step * idx:step * (idx + 1), :] = permutations + (1000 * idx)
        permutations = p

        if (self.model_settings.permutations_limit is not None
                and self.model_settings.permutations_limit < np.shape(permutations)[0]):
            permutations = permutations[:self.model_settings.permutations_limit]
        self.permutations_list = permutations

        keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        keys = list(keys)
        l = 0
        for key in keys:
            l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))

        num_inputs = l

        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)

        x = np.zeros((len(permutations), 2 * num_inputs))
        y = np.zeros((len(permutations), num_outputs))
        for idx, p in enumerate(permutations):
            x[idx, :num_inputs] = self.features[p[0]]
            x[idx, num_inputs:] = self.features[p[1]]
            y[idx] = self.targets[p[0]] - self.targets[p[1]]

        self.features = x  # copy.deepcopy(x)
        self.targets = y  # copy.deepcopy(y)
        self.population = len(permutations)
        print(self.population)

        return self.features, self.targets
    
    def combinations(self) -> None:
        """
        Generate combinations of features and targets for training.
        """
        if self.model_settings.permutations_limit is None:
            permutations_limit = self.population
        else:
            permutations_limit = self.model_settings.permutations_limit

        rng = np.random.default_rng()
        indices = rng.choice(range(self.population), permutations_limit)
        permutations = list(itertools.combinations(indices, r=2))
        permutations = [list(f) for f in permutations]

        x = np.zeros((len(self.networks_configured), len(permutations), 2 * self.points))
        y = np.zeros((len(self.networks_configured), len(permutations), 1))

        for idx, p in enumerate(permutations):
            x[:, idx, :self.points] = self.features[:, p[0], :]
            x[:, idx, self.points:] = self.features[:, p[1], :]
            y[:, idx] = self.targets[p[0]] - self.targets[p[1]]

        self.features = copy.deepcopy(x)
        self.targets = copy.deepcopy(y)
        self.population = len(permutations)

    def predict(self, absolute_dir, experimental_feature):
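        """
        Predict device parameters for an experimental feature.

        The feature is paired with a secondary (absolute) dataset, passed
        through each trained difference network, and the resulting predictions
        are denormalised and plotted.

        Args:
            absolute_dir (str): Directory containing the secondary dataset.
            experimental_feature: Experimental feature object to predict from.
        """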
        experimental_feature = copy.deepcopy(experimental_feature)

        self.setup_network_directories()
        # Gather Input Vectors Used
        self.load_input_vectors()

        # Samples Experimental Feature
        experimental_feature = self.sample_experimental_features(experimental_feature)

        # Normalises Experimental Feature
        experimental_feature = self.normalise_experimental_features(experimental_feature)

        for idx in range(self.total_networks):
            self.working_network = idx
            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])

            # Gathers and sets up secondary dataset
            self.setup_absolute_dataset(absolute_dir)

            # Gathers number of outputs
            outputs = len(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs'])

            s = np.shape(self.absolute_features)

            # Setup Prediction Array
            self.normalised_predicitons = np.zeros((s[0], outputs), dtype=object)

            # Pairs experimental feature with secondary dataset
            feature = self.setup_experimental_feature(experimental_feature)

            # Sets up Network
            P = Predicting(dir, feature)

            # Generates Normalised Predictions
            self.normalised_predicitons = P.predict()

            self.turn_predictions_absolute()

            self.predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))

            prediction = self.denormalise_prediction_single(absolute_dir)

            self.predictions[:, idx, :] = prediction[:]
            self.distribution_plot_single(absolute_dir, idx, prediction, self.normalised_predicitons)

    def renormalise(self, absolute_v, absolute_dir):
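        """
        Map vectors normalised against absolute_dir onto this network's scaling.

        The output column and all light vector columns are denormalised using
        the absolute dataset's min_max.csv ranges, then renormalised using the
        network's own ranges.

        Args:
            absolute_v (pandas.DataFrame): Vectors normalised against absolute_dir.
            absolute_dir (str): Directory containing the absolute min_max.csv.

        Returns:
            pandas.DataFrame: The renormalised vectors.
        """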
        network_min_max_log_dir = os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv')
        network_min_max_log = pd.read_csv(network_min_max_log_dir, header=None, sep=' ',
                                          names=['param', 'min', 'max', 'log'])

        absolute_min_max_log_dir = os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv')
        absolute_min_max_log = pd.read_csv(absolute_min_max_log_dir, header=None, sep=' ',
                                           names=['param', 'min', 'max', 'log'])

        temp = absolute_min_max_log[absolute_min_max_log['param'] == self.outputs[0]]

        absolute_min = temp['min'].to_numpy()[0]
        absolute_max = temp['max'].to_numpy()[0]
        absolute_log = temp['log'].to_numpy()[0]

        if absolute_log == 1:
            absolute_v[self.outputs[0]] = self.denormalise_log(absolute_v[self.outputs[0]], absolute_min, absolute_max)
        else:
            absolute_v[self.outputs[0]] = self.denormalise_linear(absolute_v[self.outputs[0]], absolute_min,
                                                                  absolute_max)

        temp = network_min_max_log[network_min_max_log['param'] == self.outputs[0]]
        network_min = temp['min'].to_numpy()[0]
        network_max = temp['max'].to_numpy()[0]
        network_log = temp['log'].to_numpy()[0]

        if network_log == 1:
            absolute_v[self.outputs[0]] = self.normalise_log(absolute_v[self.outputs[0]], network_min, network_max)
        else:
            absolute_v[self.outputs[0]] = self.normalise_linear(absolute_v[self.outputs[0]], network_min, network_max)

        vecs = []
        for f in absolute_v.columns:
            if 'vec' in f:
                if 'light' in f:
                    vecs.append(f)

        for vec in vecs:
            temp = absolute_min_max_log[absolute_min_max_log['param'] == vec]
            absolute_min = temp['min'].to_numpy()[0]
            absolute_max = temp['max'].to_numpy()[0]
            absolute_log = temp['log'].to_numpy()[0]

            if absolute_log == 1:
                absolute_v[vec] = self.denormalise_log(absolute_v[vec], absolute_min, absolute_max)
            else:
                absolute_v[vec] = self.denormalise_linear(absolute_v[vec], absolute_min, absolute_max)

            temp = network_min_max_log[network_min_max_log['param'] == vec]
            network_min = temp['min'].to_numpy()[0]
            network_max = temp['max'].to_numpy()[0]
            network_log = temp['log'].to_numpy()[0]

            if network_log == 1:
                absolute_v[vec] = self.normalise_log(absolute_v[vec], network_min, network_max)
            else:
                absolute_v[vec] = self.normalise_linear(absolute_v[vec], network_min, network_max)

        return absolute_v

    def setup_absolute_dataset(self, absolute_dir):
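        """
        Load and renormalise the secondary (absolute) dataset.

        Args:
            absolute_dir (str): Directory containing 'faster/vectors.csv'.
        """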
        absolute_vectors_dir = os.path.join(absolute_dir, 'faster', 'vectors.csv')

        inputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']

        self.inputs = inputs
        self.outputs = outputs

        f = open(absolute_vectors_dir, 'r')
        v = pd.read_csv(f, delimiter=' ')
        self.vectors = self.renormalise(v, absolute_dir)
        f.close()

        col = self.vectors.columns.to_list()
        vecs = np.where(np.char.find(np.char.lower(col), 'light_1.0.vec') > -1)[0]
        col = [col[x] for x in vecs]
        self.absolute_targets = self.vectors[outputs].to_numpy()
        self.absolute_features = self.vectors[col].to_numpy()

        #self.generate_uniform_distribution()

    def generate_uniform_distribution(self):
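        """
        Resample the absolute dataset towards a uniform target distribution by
        weighting samples inversely to their kernel density estimate.
        """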
        kde = stats.gaussian_kde(self.absolute_targets.ravel())
        density = kde(self.absolute_targets.ravel())
        density = self.normalise_linear(density, np.min(density), np.max(density))
        density = 1 - density
        density = density / np.sum(density)
        indicies = np.arange(len(density))
        normal = random.choices(indicies, weights=density, k=int(len(self.absolute_targets) / 2))
        self.absolute_targets = self.absolute_targets[normal]
        self.absolute_features = self.absolute_features[normal]

    def setup_experimental_feature(self, experimental_feature):
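        """
        Tile the experimental feature against the absolute dataset, returning
        paired feature vectors in both orderings.
        """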
        absolute_features = self.absolute_features
        y = experimental_feature.y
        y = np.tile(y, len(absolute_features))
        y = np.reshape(y, np.shape(absolute_features))

        features1 = np.concatenate((absolute_features, y), axis=1)
        features2 = np.concatenate((y, absolute_features), axis=1)
        features = np.concatenate((features1, features2), axis=0)
        return features

    def turn_predictions_absolute(self):
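        """
        Convert normalised difference predictions into absolute values using
        the known targets of the paired absolute dataset.
        """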
        origin = self.absolute_targets
        l = len(self.absolute_targets)
        self.normalised_predicitons[:l] = origin - self.normalised_predicitons[:l]
        self.normalised_predicitons[l:] = self.normalised_predicitons[l:] - origin

    def denormalise_predictions(self, absolute_dir):
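        """
        Denormalise predictions for every network using the absolute dataset's
        min_max.csv scaling metadata.
        """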
        min_max_log = pd.read_csv(os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None,
                                  sep=' ', names=['param', 'min', 'max', 'log'])
        predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))
        normalised_predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))

        for idx in range(self.total_networks):
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            num_outputs = len(outputs)
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values
            max = network_min_max_log['max'].values
            log = network_min_max_log['log'].values
            for jdx in range(num_outputs):
                for kdx in range(len(self.absolute_targets)):
                    if not log[jdx]:
                        predictions[kdx, idx, jdx] = self.normalised_predicitons[kdx, jdx] * (max[jdx] - min[jdx]) + \
                                                     min[jdx]

                    else:
                        predictions[kdx, idx, jdx] = self.normalised_predicitons[kdx, jdx] * (
                                    np.log10(max[jdx]) - np.log10(min[jdx])) + np.log10(min[jdx])
                        predictions[kdx, idx, jdx] = 10 ** predictions[kdx, idx, jdx]
            self.predictions = predictions

    def denormalise_prediction_single(self, absolute_dir):
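        """
        Denormalise the working network's predictions using the network's
        min_max.csv scaling metadata.
        """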
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        predictions = np.zeros((len(self.absolute_targets), 10))

        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values
        max = network_min_max_log['max'].values
        log = network_min_max_log['log'].values[0]

        for jdx in range(num_outputs):
            for kdx in range(len(self.absolute_targets)):
                if not log:
                    predictions[kdx, jdx] = self.denormalise_linear(self.normalised_predicitons[kdx, jdx], min[jdx],
                                                                    max[jdx])
                else:
                    predictions[kdx, jdx] = self.denormalise_log(self.normalised_predicitons[kdx, jdx], min[jdx],
                                                                 max[jdx])

        return predictions

    def denormalise_target_single(self, absolute_dir):
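        """
        Denormalise the stored absolute targets using the network's
        min_max.csv scaling metadata.
        """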
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        predictions = np.zeros((len(self.absolute_targets), 10))

        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values
        max = network_min_max_log['max'].values
        log = network_min_max_log['log'].values[0]

        for jdx in range(num_outputs):
            for kdx in range(len(self.absolute_targets)):
                if not log:
                    predictions[kdx, jdx] = self.denormalise_linear(self.absolute_targets[kdx, jdx], min[jdx], max[jdx])
                else:
                    predictions[kdx, jdx] = self.denormalise_log(self.absolute_targets[kdx, jdx], min[jdx], max[jdx])

        return predictions

    def permutations_normal_distribution(self, features, targets):
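        """
        Generate pairwise difference features and targets from a random
        subsample of the dataset.

        Returns:
            tuple: The difference features and flattened targets.
        """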
        if self.model_settings.permutations_limit is None:
            permutations_limit = len(targets)
        else:
            permutations_limit = self.model_settings.permutations_limit

        indices = self.rng.choice(range(len(targets)), 300)  # , p=targets_weights)

        permutations = list(itertools.permutations(indices, r=2))
        permutations = np.array([list(f) for f in permutations], dtype=np.uint32)

        keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        keys = list(keys)
        l = 0
        for key in keys:
            l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))

        num_inputs = l

        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)

        x = np.zeros((len(permutations), 2 * num_inputs))
        y = np.zeros((len(permutations), num_outputs))
        for idx, p in enumerate(permutations):
            x[idx, :num_inputs] = features[p[0]]
            x[idx, num_inputs:] = features[p[1]]
            y[idx] = targets[p[0]] - targets[p[1]]

        features = x  # copy.deepcopy(x)
        targets = y.ravel()  # copy.deepcopy(y)

        self.population = len(permutations)
        return features, targets

    def setup_confusion_matrix_features(self):
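        """
        Build paired feature and target arrays from the absolute dataset for
        confusion-matrix evaluation.
        """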
        absolute_features = self.absolute_features
        absolute_targets = self.absolute_targets
        x_features = absolute_features
        x_targets = absolute_targets[::-1]
        y_features = absolute_features
        y_targets = absolute_targets[::-1]

        features1 = np.concatenate((x_features, y_features), axis=1)
        features2 = np.concatenate((y_features, x_features), axis=1)
        features = np.concatenate((features1, features2), axis=0)

        targets = np.concatenate((x_targets, y_targets), axis=0)
        return targets, features

    def confusion_matrix(self, absolute_dir=None):
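        """
        Evaluate each network against the absolute dataset, saving 2D
        target-vs-prediction histograms and CSV data to a local temp directory.

        Returns:
            numpy.ndarray: Mean absolute percentage error (MAPE) per network.
        """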
        self.setup_network_directories()
        # Gather Input Vectors Used
        self.load_input_vectors()
        MAPE = np.zeros(len(self.networks_configured))
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])

        for idx in range(self.total_networks):
            self.working_network = idx
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values[0]
            max = network_min_max_log['max'].values[0]
            log = network_min_max_log['log'].values[0]
            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])

            # Gathers and sets up secondary dataset
            self.setup_absolute_dataset(absolute_dir)

            # Gathers number of outputs
            outputs = len(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs'])

            s = np.shape(self.absolute_features)

            # Setup Prediction Array
            self.normalised_predicitons = np.zeros((s[0]*2, outputs), dtype=object)

            # Pairs experimental feature with secondary dataset
            targets, feature = self.setup_confusion_matrix_features()

            # Sets up Network
            P = Predicting(dir, feature)

            # Generates Normalised Predictions
            self.normalised_predicitons = P.predict()

            self.absolute_targets = targets
            self.normalised_predicitons = targets - self.normalised_predicitons

            prediction = self.denormalise_prediction_single(absolute_dir)
            targets = self.denormalise_target_single(absolute_dir)

            fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
            plt.xlabel('Target')
            plt.ylabel('Prediction')
            if log == 0:
                plt.hist2d(targets[:,0].ravel(), prediction[:,0].ravel(), bins=np.linspace(min,max,150), cmap='inferno')
            else:
                plt.hist2d(targets[:,0].ravel(), prediction[:,0].ravel(), bins=10**np.linspace(np.log10(min),np.log10(max),150), cmap='inferno')
                plt.xscale('log')
                plt.yscale('log')
            figname = 'tempCF' + str(self.working_network)
            if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
                os.mkdir(os.path.join(os.getcwd(), 'temp'))
            figname = os.path.join(os.getcwd(), 'temp', figname + '.png')
            data = pd.DataFrame()
            data['Target'] = targets[:,0].ravel()
            data['Predicted'] = prediction[:,0].ravel()
            data.to_csv(figname + '.csv', index=False)
            plt.savefig(figname)

            MAPE[idx] = np.abs(np.mean(np.abs(targets[:,0].ravel() - prediction[:,0].ravel()) / targets[:,0].ravel()) * 100)
            self.MAPE = MAPE
        return MAPE

    def distribution_plot_single(self, absolute_dir: str, idx: int, predictions: np.ndarray, norm_predictions: np.ndarray) -> None:
        """
        Generate a distribution plot for a single network.

        Args:
            absolute_dir (str): Directory containing absolute data.
            idx (int): Index of the network.
            predictions (numpy.ndarray): Denormalised predictions to plot.
            norm_predictions (numpy.ndarray): Normalised predictions saved
                alongside the plot data.
        """
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])

        self.working_network = idx
        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values[0]
        max = network_min_max_log['max'].values[0]
        log = network_min_max_log['log'].values[0]

        p = predictions[:, 0]

        if self.working_network == 0:
            self.mean = np.zeros(self.total_networks)
            self.std = np.zeros(self.total_networks)

        fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
        if log == 0:
            m = np.mean(p)
            s = np.std(p)

            self.mean[idx] = np.abs(m)
            self.std[idx] = s 
            count, bins = np.histogram(p, bins=np.linspace(min, max, 1000) )
            ax.hist(np.abs(p), bins=np.linspace(min, max, 1000))
        else:
            m = np.mean(p)
            s = np.std(p)
            self.mean[idx] = np.abs(m)
            self.std[idx] = s
            count, bins = np.histogram(p, bins=10 ** np.linspace(np.log10(min), np.log10(max), 1000))
            ax.hist(p, bins=10 ** np.linspace(np.log10(min), np.log10(max), 1000))
            ax.set_xscale('log')
        ax.axvline(m, color='tab:orange')
        L = Label(outputs[0])
        ax.set_xlabel(L.english + ' (' + L.units + ')')
        ax.set_ylabel('Count')

        hist = pd.DataFrame()
        count = np.append(count, 0)
        hist['bins'] = predictions[:, 0]

        hist_n = pd.DataFrame()
        hist_n['bins'] = norm_predictions[:, 0]

        figname = 'tempDF' + str(self.working_network) + '.png'
        figname_hist = 'tempDF' + str(self.working_network) + '.csv'
        figname_hist_n = 'tempDF' + str(self.working_network) + '_norm.csv'
        if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
            os.mkdir(os.path.join(os.getcwd(), 'temp'))

        figname = os.path.join(os.getcwd(), 'temp', figname)
        figname_hist = os.path.join(os.getcwd(), 'temp', figname_hist)
        plt.savefig(figname)
        hist.to_csv(figname_hist, index=False)
        hist_n.to_csv(os.path.join(os.getcwd(), 'temp', figname_hist_n), index=False)

    def distribution_plot(self, absolute_dir):
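        """
        Plot histograms of denormalised predictions for every configured
        network, marking the mean of each.
        """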
        min_max_log = pd.read_csv(os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])

        for idx in range(self.total_networks):
            self.working_network = idx
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values[0]
            max = network_min_max_log['max'].values[0]
            log = network_min_max_log['log'].values[0]

            p = self.predictions[:, idx, 0]
            m = np.mean(p)
            fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
            ax.hist(p, bins=10000, range=(min, max))
            ax.axvline(m, color='tab:orange')
            # ax.set_xlim(left=min, right=max)
            ax.set_xlabel('Prediction')
            ax.set_ylabel('Count')

            figname = 'tempDF' + str(self.working_network) + '.png'
            if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
                os.mkdir(os.path.join(os.getcwd(), 'temp'))

            figname = os.path.join(os.getcwd(), 'temp', figname)
            plt.show()
            # plt.savefig(figname)

Subclass of Networks for difference-based models.

This class provides methods specific to difference-based models, including training and generating permutations.

Initialize a Difference instance.

Args

networks_dir : str
Directory containing network configurations.
model_settings : Model_Settings, optional
Settings for the model.

Ancestors

  • Networks

Methods

def combinations(self) ‑> None

Generate combinations of features and targets for training.

def confusion_matrix(self, absolute_dir=None)

Evaluate each network against the absolute dataset, saving 2D target-vs-prediction histograms and CSV data to a local temp directory, and return the mean absolute percentage error (MAPE) per network.

def denormalise_prediction_single(self, absolute_dir)

Denormalise the working network's predictions using the network's min_max.csv scaling metadata, applying linear or logarithmic inverse scaling per output.

def denormalise_target_single(self, absolute_dir)

Denormalise the stored absolute targets using the network's min_max.csv scaling metadata, applying linear or logarithmic inverse scaling per output.

def distribution_plot(self, absolute_dir)

Plot histograms of denormalised predictions for every configured network, marking the mean of each.

def distribution_plot_single(self,
absolute_dir: str,
idx: int,
predictions: numpy.ndarray,
norm_predictions: numpy.ndarray) ‑> None

Generate a distribution plot for a single network.

Args

absolute_dir : str
Directory containing absolute data.
idx : int
Index of the network.
predictions : numpy.ndarray
Denormalised predictions to plot.
norm_predictions : numpy.ndarray
Normalised predictions saved alongside the plot data.
def generate_uniform_distribution(self)

Resample the absolute dataset towards a uniform target distribution by weighting samples inversely to their kernel density estimate.

def permutations(self) ‑> Tuple[numpy.ndarray, numpy.ndarray]

Generate permutations of features and targets for training.

Returns

tuple
The original (pre-permutation) features and targets; the generated difference features and targets are stored on the instance.
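
Conceptually, each ordered pair (i, j) of samples becomes one training row: the two feature vectors are concatenated and the target is the difference of the two original targets. A minimal standalone NumPy sketch of this pairing scheme (illustrative shapes, not the class's configuration):

    import itertools
    import numpy as np

    features = np.random.rand(4, 8)   # 4 samples, 8 input points each
    targets = np.random.rand(4, 1)

    # All ordered pairs of distinct sample indices: 4 * 3 = 12 pairs
    pairs = np.array(list(itertools.permutations(range(4), r=2)), dtype=np.uint32)

    x = np.concatenate((features[pairs[:, 0]], features[pairs[:, 1]]), axis=1)  # (12, 16)
    y = targets[pairs[:, 0]] - targets[pairs[:, 1]]                             # (12, 1)
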
def permutations_lowRAM(self)

Memory-efficient variant of permutations(): builds pairwise index permutations in fixed blocks of 1000 samples rather than materialising every pair at once, then assembles the difference features and targets in place.

def permutations_normal_distribution(self, features, targets)

Generate pairwise difference features and targets from a random subsample of the dataset, returning the new features and flattened targets.

def predict(self, absolute_dir, experimental_feature)

Predict device parameters for an experimental feature by pairing it with a secondary (absolute) dataset, running each trained difference network, and denormalising and plotting the resulting predictions.

def renormalise(self, absolute_v, absolute_dir)

Map a dataset normalised against one directory's min_max.csv onto this network's normalisation: the output column and all light vector columns are denormalised with the absolute dataset's ranges, then renormalised with the network's ranges.

def setup_absolute_dataset(self, absolute_dir)

Load and renormalise the secondary (absolute) dataset from absolute_dir, storing its feature vectors and targets for difference-based prediction.

def setup_confusion_matrix_features(self)

Build paired feature and target arrays from the absolute dataset for confusion-matrix evaluation.

def setup_experimental_feature(self, experimental_feature)

Tile the experimental feature against the absolute dataset, producing paired feature vectors in both orderings for difference prediction.

def train(self, idx=None)

Train the difference-based networks. If idx is None, every configured network is trained in turn; otherwise only the network at the given index is trained.

def train_existing(self, idx=None)

Continue training previously saved networks, following the same per-index behaviour as train().

def tune(self) ‑> None

Tune the difference-based networks.

This method performs hyperparameter optimization for the networks.

def turn_predictions_absolute(self)

Convert normalised difference predictions into absolute values using the known targets of the paired absolute dataset.

Inherited members

class Ensemble (networks_dir: str,
model_settings: Model_Settings | None = None)
Expand source code
class Ensemble(Networks):
    """
    Subclass of Networks for ensemble-based models.

    This class provides methods specific to ensemble-based models, including
    training and feature augmentation.
    """
    _network_type = 'ensemble'

    def __init__(self, networks_dir: str, model_settings: Optional[Model_Settings] = None):
        """
        Initialize an Ensemble instance.

        Args:
            networks_dir (str): Directory containing network configurations.
            model_settings (Model_Settings, optional): Settings for the model.
                If None, default Model_Settings are created.
        """
        self.networks_dir = networks_dir
        self.model_settings = model_settings if model_settings is not None else Model_Settings()
        self.rng = np.random.default_rng()

    def train(self) -> None:
        """
        Train the ensemble-based networks.

        This method sets up directories, loads datasets, and trains the
        networks using ensemble techniques.
        """
        self.setup_network_directories()
        for idx in range(self.total_networks):
            self.working_network = idx

            self.load_training_dataset()
            training_features, training_targets, validation_features, validation_targets = self.separate_training_dataset()
            self.train_networks_ensemble(training_features, training_targets, validation_features, validation_targets)

    def train_networks_ensemble(self, training_features: np.ndarray, training_targets: np.ndarray, validation_features: np.ndarray, validation_targets: np.ndarray) -> None:
        """
        Train ensemble networks with augmented features.

        Args:
            training_features (numpy.ndarray): Training features.
            training_targets (numpy.ndarray): Training targets.
            validation_features (numpy.ndarray): Validation features.
            validation_targets (numpy.ndarray): Validation targets.
        """
        # Directory for the network currently being trained
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])

        network_metric = np.zeros((self.model_settings.ensemble_presample))

        presample_num = 0

        # Presample a few training runs to establish a baseline metric
        while presample_num < self.model_settings.ensemble_presample:
            augmented_training_features = self.augment_features(training_features)
            network_metric[presample_num] = Training(augmented_training_features, training_targets, validation_features,
                                                     validation_targets, dir)
            presample_num += 1

        mean_network_metric = np.mean(network_metric)

        ensemble_population = 0
        metric = 0
        patience = 0
        while ensemble_population < self.model_settings.ensemble_maximum and patience < self.model_settings.ensemble_patience:
            network = copy.deepcopy(self.networks[self.working_network])
            augmented_training_features = self.augment_features(training_features)
            candidate_network = network.fit(augmented_training_features[:], training_targets[:])

            if metric == 0:
                temp_metric = candidate_network
            else:
                temp_metric = np.mean([metric, candidate_network])
            metric = temp_metric  # track the running metric across candidates

            percentage_change = (temp_metric - mean_network_metric) / mean_network_metric

            # Keep the candidate only if it improves on the baseline by at
            # least the configured tolerance; otherwise spend patience.
            if percentage_change >= self.model_settings.ensemble_tollerance:
                path = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network],
                                    str(int(ensemble_population)))
                os.mkdir(path)
                network.save(path)
                self.save_rotation(path)
                ensemble_population += 1
            else:
                patience += 1

    def augment_features(self, training_features: np.ndarray, rotation: Optional[float] = None) -> np.ndarray:
        """
        Augment training features with rotation-based transformations.

        Args:
            training_features (numpy.ndarray): Training features to augment.
            rotation (float, optional): Rotation angle for augmentation.

        Returns:
            numpy.ndarray: Augmented training features.
        """
        devices = np.shape(training_features)[0]
        total_points = np.shape(training_features)[1]

        if rotation is None:
            rotation = np.random.uniform(low=-np.pi / 8, high=np.pi / 8)

        self.rotation = rotation

        input_characterisations = int(total_points / self.points)
        false_x_values = np.linspace(-1, 1, self.points)
        false_x_values = np.tile(false_x_values, input_characterisations)

        cos = np.cos(rotation)
        sin = np.sin(rotation)

        indexed_features = np.zeros((devices, total_points, 2))
        indexed_features[:, :, 0] = false_x_values
        indexed_features[:, :, 1] = training_features[:, :]

        x_offset = 0
        y_offset = 0.5
        if input_characterisations != 1:
            characterisation_boundaries = [self.points * idx for idx in range(1, input_characterisations + 1)]
        else:
            characterisation_boundaries = [self.points]

        for idx, boundary in enumerate(characterisation_boundaries):
            start = 0 if idx == 0 else characterisation_boundaries[idx - 1]
            # Shift each characterisation to the rotation origin, rotate, and
            # shift back. Copies of x and y are taken first so that the
            # already-rotated x values do not feed into the y rotation.
            x = indexed_features[:, start:boundary, 0] - x_offset
            y = indexed_features[:, start:boundary, 1] - y_offset
            indexed_features[:, start:boundary, 0] = (x * cos) - (y * sin) + x_offset
            indexed_features[:, start:boundary, 1] = (x * sin) + (y * cos) + y_offset

        indexed_features_subset = indexed_features[:, :, :]
        for idx, feature in enumerate(indexed_features_subset):
            x = feature[:, 0]
            y = feature[:, 1]
            for jdx, boundary in enumerate(characterisation_boundaries):
                if jdx == 0:
                    function = interpolate.interp1d(x[:boundary], y[:boundary], fill_value='extrapolate')
                    indexed_features[idx, :boundary, 1] = function(false_x_values[:boundary])
                else:
                    previous_boundary = characterisation_boundaries[jdx - 1]
                    function = interpolate.interp1d(x[previous_boundary:boundary], y[previous_boundary:boundary],
                                                    fill_value='extrapolate')
                    indexed_features[idx, previous_boundary:boundary, 1] = function(
                        false_x_values[previous_boundary:boundary])

        training_features = indexed_features[:, :, 1]
        return training_features

    def save_rotation(self, path: str) -> None:
        """
        Save rotation data to a file.

        Args:
            path (str): Path to save the rotation data.
        """
        path = os.path.join(path, 'data.csv')
        data = pd.DataFrame(data={'rotation': [self.rotation]})
        data.to_csv(path, index=False)

    def predict(self, experimental_feature: Any) -> None:
        """
        Predict outputs for given experimental features using ensemble models.

        Args:
            experimental_feature: Experimental features to predict outputs for.
        """
        self.setup_network_directories()
        self.load_input_vectors()
        self.interpret_input_vectors()
        self.normalised_predicitons = np.zeros(len(self.networks_configured))
        self.predictions = np.zeros(len(self.networks_configured))
        for idx in range(len(self.networks_configured)):
            self.working_network = idx

            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
            members = os.listdir(dir)
            predictions_population = np.zeros(len(members))
            for jdx, member in enumerate(members):
                member_dir = os.path.join(dir, member)
                rotation = pd.read_csv(os.path.join(member_dir, 'data.csv'))
                rotation = rotation['rotation']

                # Work on a copy so that sampling, rotation and normalisation
                # are not applied cumulatively across ensemble members.
                member_feature = copy.deepcopy(experimental_feature)
                member_feature = self.sample_experimental_features(member_feature)
                member_feature = self.augment_experimental_features(member_feature, rotation.values[0])
                member_feature = self.normalise_experimental_features(member_feature)
                P = Predicting(member_dir, np.array([member_feature.y]))
                predictions_population[jdx] = P.predict()
            self.normalised_predicitons[self.working_network] = np.mean(predictions_population)
        self.denormalise_predictions()

    def augment_experimental_features(self, experimental_features: Any, rotation: float) -> Any:
        """
        Augment experimental features with rotation-based transformations.

        Args:
            experimental_features: Experimental features to augment.
            rotation (float): Rotation angle for augmentation.

        Returns:
            Augmented experimental features.
        """
        devices = 1
        total_points = len(experimental_features.y)

        if rotation is None:
            rotation = np.random.uniform(low=-np.pi / 8, high=np.pi / 8)

        self.rotation = rotation
        input_characterisations = int(total_points / self.points)
        false_x_values = np.linspace(-1, 1, self.points)
        false_x_values = np.tile(false_x_values, input_characterisations)

        cos = np.cos(rotation)
        sin = np.sin(rotation)

        indexed_features = np.zeros((total_points, 2))
        indexed_features[:, 0] = false_x_values
        indexed_features[:, 1] = experimental_features.y

        x_offset = 0
        y_offset = 0.5
        if input_characterisations != 1:
            characterisation_boundaries = [self.points * idx for idx in range(1, input_characterisations + 1)]
        else:
            characterisation_boundaries = [self.points]

        for idx, boundary in enumerate(characterisation_boundaries):
            start = 0 if idx == 0 else characterisation_boundaries[idx - 1]
            # Rotate each characterisation about (x_offset, y_offset); copies
            # of x and y are taken first so that the already-rotated x values
            # do not feed into the y rotation.
            x = indexed_features[start:boundary, 0] - x_offset
            y = indexed_features[start:boundary, 1] - y_offset
            indexed_features[start:boundary, 0] = (x * cos) - (y * sin) + x_offset
            indexed_features[start:boundary, 1] = (x * sin) + (y * cos) + y_offset

        x = indexed_features[:, 0]
        y = indexed_features[:, 1]
        for jdx, boundary in enumerate(characterisation_boundaries):
            if jdx == 0:
                function = interpolate.interp1d(x[:boundary], y[:boundary], fill_value='extrapolate')
                indexed_features[:boundary, 1] = function(false_x_values[:boundary])
            else:
                previous_boundary = characterisation_boundaries[jdx - 1]
                function = interpolate.interp1d(x[previous_boundary:boundary], y[previous_boundary:boundary],
                                                fill_value='extrapolate')
                indexed_features[previous_boundary:boundary, 1] = function(false_x_values[previous_boundary:boundary])

        experimental_features.y = indexed_features[:, 1]
        return experimental_features

Subclass of Networks for ensemble-based models.

This class provides methods specific to ensemble-based models, including training and feature augmentation.

Initialize an Ensemble instance.

Args

networks_dir : str
Directory containing network configurations.
model_settings : Model_Settings, optional
Settings for the model. If None, default Model_Settings are created.

Ancestors

Methods

def augment_experimental_features(self, experimental_features: Any, rotation: float) ‑> Any
Expand source code
def augment_experimental_features(self, experimental_features: Any, rotation: float) -> Any:
    """
    Augment experimental features with rotation-based transformations.

    Args:
        experimental_features: Experimental features to augment.
        rotation (float): Rotation angle for augmentation.

    Returns:
        Augmented experimental features.
    """
    devices = 1
    total_points = len(experimental_features.y)

    if rotation is None:
        rotation = np.random.uniform(low=-np.pi / 8, high=np.pi / 8)

    self.rotation = rotation
    input_characterisations = int(total_points / self.points)
    false_x_values = np.linspace(-1, 1, self.points)
    false_x_values = np.tile(false_x_values, input_characterisations)

    cos = np.cos(rotation)
    sin = np.sin(rotation)

    indexed_features = np.zeros((total_points, 2))
    indexed_features[:, 0] = false_x_values
    indexed_features[:, 1] = experimental_features.y

    x_offset = 0
    y_offset = 0.5
    if input_characterisations != 1:
        characterisation_boundaries = [self.points * idx for idx in range(1, input_characterisations + 1)]
    else:
        characterisation_boundaries = [self.points]

    for idx, boundary in enumerate(characterisation_boundaries):
        start = 0 if idx == 0 else characterisation_boundaries[idx - 1]
        # Rotate each characterisation about (x_offset, y_offset); copies
        # of x and y are taken first so that the already-rotated x values
        # do not feed into the y rotation.
        x = indexed_features[start:boundary, 0] - x_offset
        y = indexed_features[start:boundary, 1] - y_offset
        indexed_features[start:boundary, 0] = (x * cos) - (y * sin) + x_offset
        indexed_features[start:boundary, 1] = (x * sin) + (y * cos) + y_offset

    x = indexed_features[:, 0]
    y = indexed_features[:, 1]
    for jdx, boundary in enumerate(characterisation_boundaries):
        if jdx == 0:
            function = interpolate.interp1d(x[:boundary], y[:boundary], fill_value='extrapolate')
            indexed_features[:boundary, 1] = function(false_x_values[:boundary])
        else:
            previous_boundary = characterisation_boundaries[jdx - 1]
            function = interpolate.interp1d(x[previous_boundary:boundary], y[previous_boundary:boundary],
                                            fill_value='extrapolate')
            indexed_features[previous_boundary:boundary, 1] = function(false_x_values[previous_boundary:boundary])

    experimental_features.y = indexed_features[:, 1]
    return experimental_features

Augment experimental features with rotation-based transformations.

Args

experimental_features
Experimental features to augment.
rotation : float
Rotation angle for augmentation.

Returns

Augmented experimental features.
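
A minimal usage sketch, assuming ens is an initialised Ensemble with ens.points set and feature is an object exposing a y array of length ens.points (the rotation value would normally come from a saved data.csv):

>>> feature = ens.augment_experimental_features(feature, 0.1)
>>> len(feature.y) == ens.points
True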

def augment_features(self, training_features: numpy.ndarray, rotation: float | None = None) ‑> numpy.ndarray
Expand source code
def augment_features(self, training_features: np.ndarray, rotation: Optional[float] = None) -> np.ndarray:
    """
    Augment training features with rotation-based transformations.

    Args:
        training_features (numpy.ndarray): Training features to augment.
        rotation (float, optional): Rotation angle for augmentation.

    Returns:
        numpy.ndarray: Augmented training features.
    """
    devices = np.shape(training_features)[0]
    total_points = np.shape(training_features)[1]

    if rotation is None:
        rotation = np.random.uniform(low=-np.pi / 8, high=np.pi / 8)

    self.rotation = rotation

    input_characterisations = int(total_points / self.points)
    false_x_values = np.linspace(-1, 1, self.points)
    false_x_values = np.tile(false_x_values, input_characterisations)

    cos = np.cos(rotation)
    sin = np.sin(rotation)

    indexed_features = np.zeros((devices, total_points, 2))
    indexed_features[:, :, 0] = false_x_values
    indexed_features[:, :, 1] = training_features[:, :]

    x_offset = 0
    y_offset = 0.5
    if input_characterisations != 1:
        characterisation_boundaries = [self.points * idx for idx in range(1, input_characterisations + 1)]
    else:
        characterisation_boundaries = [self.points]

    for idx, boundary in enumerate(characterisation_boundaries):
        start = 0 if idx == 0 else characterisation_boundaries[idx - 1]
        # Shift each characterisation to the rotation origin, rotate, and
        # shift back. Copies of x and y are taken first so that the
        # already-rotated x values do not feed into the y rotation.
        x = indexed_features[:, start:boundary, 0] - x_offset
        y = indexed_features[:, start:boundary, 1] - y_offset
        indexed_features[:, start:boundary, 0] = (x * cos) - (y * sin) + x_offset
        indexed_features[:, start:boundary, 1] = (x * sin) + (y * cos) + y_offset

    indexed_features_subset = indexed_features[:, :, :]
    for idx, feature in enumerate(indexed_features_subset):
        x = feature[:, 0]
        y = feature[:, 1]
        for jdx, boundary in enumerate(characterisation_boundaries):
            if jdx == 0:
                function = interpolate.interp1d(x[:boundary], y[:boundary], fill_value='extrapolate')
                indexed_features[idx, :boundary, 1] = function(false_x_values[:boundary])
            else:
                previous_boundary = characterisation_boundaries[jdx - 1]
                function = interpolate.interp1d(x[previous_boundary:boundary], y[previous_boundary:boundary],
                                                fill_value='extrapolate')
                indexed_features[idx, previous_boundary:boundary, 1] = function(
                    false_x_values[previous_boundary:boundary])

    training_features = indexed_features[:, :, 1]
    return training_features

Augment training features with rotation-based transformations.

Args

training_features : numpy.ndarray
Training features to augment.
rotation : float, optional
Rotation angle for augmentation.

Returns

numpy.ndarray
Augmented training features.
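
A minimal sketch, assuming ens is an initialised Ensemble with ens.points set and features is a (devices, ens.points) array of normalised curves:

>>> features = np.random.default_rng(0).uniform(size=(4, ens.points))
>>> augmented = ens.augment_features(features)                 # random rotation in [-pi/8, pi/8]
>>> reproduced = ens.augment_features(features, rotation=0.1)  # fixed, reproducible rotation
>>> augmented.shape == features.shape
True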
def predict(self, experimental_feature: Any) ‑> None
Expand source code
def predict(self, experimental_feature: Any) -> None:
    """
    Predict outputs for given experimental features using ensemble models.

    Args:
        experimental_feature: Experimental features to predict outputs for.
    """
    self.setup_network_directories()
    self.load_input_vectors()
    self.interpret_input_vectors()
    self.normalised_predicitons = np.zeros(len(self.networks_configured))
    self.predictions = np.zeros(len(self.networks_configured))
    for idx in range(len(self.networks_configured)):
        self.working_network = idx

        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
        members = os.listdir(dir)
        predictions_population = np.zeros(len(members))
        for jdx, member in enumerate(members):
            member_dir = os.path.join(dir, member)
            rotation = pd.read_csv(os.path.join(member_dir, 'data.csv'))
            rotation = rotation['rotation']

            # Work on a copy so that sampling, rotation and normalisation are
            # not applied cumulatively across ensemble members.
            member_feature = copy.deepcopy(experimental_feature)
            member_feature = self.sample_experimental_features(member_feature)
            member_feature = self.augment_experimental_features(member_feature, rotation.values[0])
            member_feature = self.normalise_experimental_features(member_feature)
            P = Predicting(member_dir, np.array([member_feature.y]))
            predictions_population[jdx] = P.predict()
        self.normalised_predicitons[self.working_network] = np.mean(predictions_population)
    self.denormalise_predictions()

Predict outputs for given experimental features using ensemble models.

Args

experimental_feature
Experimental features to predict outputs for.
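
A minimal usage sketch (the directory and experimental_feature object are hypothetical; the feature must expose x and y arrays):

>>> ens = Networks.initialise('networks/', 'Ensemble')
>>> ens.predict(experimental_feature)
>>> ens.mean  # per-network means of the denormalised ensemble predictions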
def save_rotation(self, path: str) ‑> None
Expand source code
def save_rotation(self, path: str) -> None:
    """
    Save rotation data to a file.

    Args:
        path (str): Path to save the rotation data.
    """
    path = os.path.join(path, 'data.csv')
    data = pd.DataFrame(data={'rotation': [self.rotation]})
    data.to_csv(path, index=False)

Save rotation data to a file.

Args

path : str
Path to save the rotation data.
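
For example (the member directory is hypothetical and must already exist):

>>> ens.rotation = 0.1
>>> ens.save_rotation('networks/faster/net_0/0')  # writes a data.csv with a 'rotation' column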
def train(self) ‑> None
Expand source code
def train(self) -> None:
    """
    Train the ensemble-based networks.

    This method sets up directories, loads datasets, and trains the
    networks using ensemble techniques.
    """
    self.setup_network_directories()
    for idx in range(self.total_networks):
        self.working_network = idx

        self.load_training_dataset()
        training_features, training_targets, validation_features, validation_targets = self.separate_training_dataset()
        self.train_networks_ensemble(training_features, training_targets, validation_features, validation_targets)

Train the ensemble-based networks.

This method sets up directories, loads datasets, and trains the networks using ensemble techniques.

def train_networks_ensemble(self,
training_features: numpy.ndarray,
training_targets: numpy.ndarray,
validation_features: numpy.ndarray,
validation_targets: numpy.ndarray) ‑> None
Expand source code
def train_networks_ensemble(self, training_features: np.ndarray, training_targets: np.ndarray, validation_features: np.ndarray, validation_targets: np.ndarray) -> None:
    """
    Train ensemble networks with augmented features.

    Args:
        training_features (numpy.ndarray): Training features.
        training_targets (numpy.ndarray): Training targets.
        validation_features (numpy.ndarray): Validation features.
        validation_targets (numpy.ndarray): Validation targets.
    """
    # Directory for the network currently being trained
    dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])

    network_metric = np.zeros((self.model_settings.ensemble_presample))

    presample_num = 0

    # Presample a few training runs to establish a baseline metric
    while presample_num < self.model_settings.ensemble_presample:
        augmented_training_features = self.augment_features(training_features)
        network_metric[presample_num] = Training(augmented_training_features, training_targets, validation_features,
                                                 validation_targets, dir)
        presample_num += 1

    mean_network_metric = np.mean(network_metric)

    ensemble_population = 0
    metric = 0
    patience = 0
    while ensemble_population < self.model_settings.ensemble_maximum and patience < self.model_settings.ensemble_patience:
        network = copy.deepcopy(self.networks[self.working_network])
        augmented_training_features = self.augment_features(training_features)
        candidate_network = network.fit(augmented_training_features[:], training_targets[:])

        if metric == 0:
            temp_metric = candidate_network
        else:
            temp_metric = np.mean([metric, candidate_network])
        metric = temp_metric  # track the running metric across candidates

        percentage_change = (temp_metric - mean_network_metric) / mean_network_metric

        # Keep the candidate only if it improves on the baseline by at least
        # the configured tolerance; otherwise spend patience.
        if percentage_change >= self.model_settings.ensemble_tollerance:
            path = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network],
                                str(int(ensemble_population)))
            os.mkdir(path)
            network.save(path)
            self.save_rotation(path)
            ensemble_population += 1
        else:
            patience += 1

Train ensemble networks with augmented features.

Args

training_features : numpy.ndarray
Training features.
training_targets : numpy.ndarray
Training targets.
validation_features : numpy.ndarray
Validation features.
validation_targets : numpy.ndarray
Validation targets.
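
The loop is governed by four Model_Settings attributes: ensemble_presample (baseline runs averaged into mean_network_metric), ensemble_maximum (largest ensemble size), ensemble_patience (consecutive rejected candidates before stopping), and ensemble_tollerance (minimum relative improvement required to keep a candidate). A hypothetical configuration sketch (the values shown are illustrative, not defaults):

>>> settings = Model_Settings()
>>> settings.ensemble_presample = 5
>>> settings.ensemble_maximum = 20
>>> settings.ensemble_patience = 3
>>> settings.ensemble_tollerance = 0.01
>>> ens = Networks.initialise('networks/', 'Ensemble', settings)
>>> ens.train()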

Inherited members

class Networks
Expand source code
class Networks:
    """
    Base class for managing and training machine learning networks.

    This class provides methods for initializing, configuring, and training
    networks, as well as handling input and output data for machine learning
    tasks. It also supports subclassing for different types of networks.
    """
    subclasses = {}

    def __init_subclass__(cls, **kwargs):
        """
        Automatically register subclasses for different network types.

        Args:
            **kwargs: Additional keyword arguments for subclass initialization.
        """
        super().__init_subclass__(**kwargs)
        cls.subclasses[cls.__name__] = cls

    @classmethod
    def initialise(cls, networks_dir: str, network_type: Optional[str] = None, model_settings = None) -> 'Networks':
        """
        Factory method to create an instance of a specific network subclass.

        This method instantiates the appropriate network subclass based on the 
        specified network type. If no model settings are provided, default 
        settings are used.

        Args:
            networks_dir (str): Directory containing network configurations and data.
            network_type (str, optional): Type of network to create. Available types
                depend on registered subclasses (e.g., 'Point', 'Ensemble', 'Difference').
            model_settings (Model_Settings, optional): Configuration settings for the model.
                If None, default Model_Settings will be created.

        Returns:
            Networks: An instance of the appropriate network subclass.

        Raises:
            ValueError: If the specified network type is not recognized or registered.

        Example:
            >>> networks = Networks.initialise('path/to/networks', 'Point')
        """
        if network_type not in cls.subclasses:
            raise ValueError('Network Type: {} Not recognized'.format(network_type))
        if model_settings is None:
            model_settings = Model_Settings()   
        return cls.subclasses[network_type](networks_dir, model_settings=model_settings)

    def setup_network_directories(self) -> None:
        """
        Set up directories for network configurations and ensure required files exist.

        This method validates the network configuration, loads the network settings
        from the nets.json file, and creates necessary directories for each configured
        network. It initializes tracking variables for network management.

        Raises:
            ValueError: If the network configuration file (nets.json) is not found
                in the expected location (networks_dir/faster/nets.json).

        Sets:
            self.networks_configured (list): List of configured network names
            self.working_network (int): Index of currently active network (starts at 0)
            self.total_networks (int): Total number of configured networks
            self.networks (numpy.ndarray): Array to store network instances
            self.oghma_network_config (dict): Loaded network configuration data
        """
        oghma_network_config = os.path.join(self.networks_dir, 'faster', 'nets.json')

        if not os.path.isfile(oghma_network_config):
            raise ValueError('Network Config File Not Found')
        with open(oghma_network_config, 'r') as f:
            oghma_network_config = json.load(f)

        self.networks_configured = list(oghma_network_config['sims'].keys())
        self.working_network = 0
        self.total_networks = len(self.networks_configured)
        self.networks = np.zeros(len(self.networks_configured), dtype=object)

        self.oghma_network_config = oghma_network_config

        for network in self.networks_configured:
            network_dir = os.path.join(self.networks_dir, 'faster', network)
            if not os.path.isdir(network_dir):
                os.mkdir(network_dir)

    def load_input_vectors(self) -> None:
        """
        Load input vectors from the network configuration file.

        This method parses experimental input vectors and stores them for use
        in training and prediction.
        """
        input_vectors = {}
        input_experiments = self.oghma_network_config['experimental']
        self.networks_configured = list(self.oghma_network_config['sims'].keys())
        for experiment in input_experiments:
            vector = self.oghma_network_config['experimental'][experiment]['vec']['points'].split(',')
            vector = np.asarray(vector).astype(float)
            input_vectors[experiment] = vector
        self.input_vectors = input_vectors
        # All experiments are assumed to share the same vector length, so the
        # point count is taken from the last experiment parsed.
        self.points = len(vector)

    def load_training_dataset(self) -> Tuple[np.ndarray, np.ndarray]:
        """
        Load the training dataset from the network configuration file.

        This method reads the dataset, extracts input and output features, and
        prepares them for training.

        Returns:
            tuple: A tuple containing features and targets as numpy arrays.
        """
        training_dataset = pd.read_csv(self.oghma_network_config['csv_file'], sep=" ")

        inputs_vectors = {}
        input_experiments = self.oghma_network_config['experimental']
        for experiment in input_experiments:
            points = len(self.oghma_network_config['experimental'][experiment]['vec']['points'].split(','))
            inputs_vectors[experiment] = points
            self.points = points
            self.population = len(training_dataset)

        self.inputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        self.outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']

        self.input_points = 0
        for input in self.inputs:
            points = len(self.oghma_network_config['experimental'][input]['vec']['points'].split(','))
            self.input_points += points

        self.output_points = len(self.outputs)

        inputs = np.empty(self.input_points, dtype=object)
        previous_end = 0
        for idx in range(len(self.inputs)):
            vector_points = inputs_vectors[self.inputs[idx]]
            # Build the column names for this input's vector points
            inputs[previous_end:previous_end + vector_points] = np.asarray(
                [self.inputs[idx] + '.vec' + str(x) for x in range(vector_points)])
            previous_end = previous_end + vector_points
        self.features = training_dataset[inputs].to_numpy().astype(np.float32)
        self.targets = training_dataset[self.outputs].to_numpy().astype(np.float32)

        return self.features, self.targets

    def separate_training_dataset(self) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
        """
        Separate the training dataset into training and validation sets.

        This method splits the dataset based on the training percentage defined
        in the model settings.

        Returns:
            tuple: A tuple containing training features, training targets,
                   validation features, and validation targets.
        """

        training_population = int(self.model_settings.training_percentage * self.population)

        indices = np.linspace(0, self.population - 1, self.population, dtype=int)

        # random.choices samples with replacement, so training rows may repeat;
        # np.delete removes each selected index once to form the validation set.
        training_indices = random.choices(indices, k=training_population)
        validation_indices = np.delete(indices, training_indices)

        training_features = self.features[training_indices]
        training_targets = self.targets[training_indices]

        validation_features = self.features[validation_indices]
        validation_targets = self.targets[validation_indices]
        self.validation_indices = validation_indices
        self.training_features = training_features
        self.training_targets = training_targets
        self.validation_features = validation_features
        self.validation_targets = validation_targets

        return training_features, training_targets, validation_features, validation_targets

    def get_uniform_distribution(self, features: np.ndarray, targets: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        """
        Generate a uniform distribution of features and targets.

        Args:
            features (numpy.ndarray): Input features.
            targets (numpy.ndarray): Target values.

        Returns:
            tuple: Validation features and validation targets.
        """
        features = np.asarray(features)
        targets = np.asarray(targets)

        training_population = int(self.model_settings.training_percentage * self.population)

        indices = np.linspace(0, self.population - 1, self.population, dtype=int)
        rng = np.random.default_rng()
        training_indices = rng.choice(indices, training_population, replace=False)
        validation_indices = np.array([i not in training_indices for i in indices])

        training_features = features[training_indices]
        training_targets = targets[training_indices]

        validation_features = features[validation_indices]
        validation_targets = targets[validation_indices]

        return validation_features, validation_targets

    def train_networks(self, network_type=None) -> None:
        """
        Train the current working network on the training dataset.

        Args:
            network_type (str, optional): Model type passed through to
                Training (e.g. 'Difference'). Defaults to None.
        """
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
        print('Learning Rate:', self.model_settings.inital_learning_rate)
        print('Decay Rate:', self.model_settings.decay_rate)
        Training(self.training_features, self.training_targets, self.validation_features, self.validation_targets, dir, model_settings=self.model_settings, model_type=network_type)

    def train_networks_existing(self, network_type=None) -> None:
        """
        Continue training a previously saved network on the training dataset.

        Args:
            network_type (str, optional): Model type passed through to
                Training (e.g. 'Difference'). Defaults to None.
        """
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
        print('Learning Rate:', self.model_settings.inital_learning_rate)
        print('Decay Rate:', self.model_settings.decay_rate)
        Training(self.training_features, self.training_targets, self.validation_features, self.validation_targets, dir, model_settings=self.model_settings, model_type=network_type, existing=True)
    
    def tune_networks(self) -> None:
        """
        Tune the networks using hyperparameter optimization.

        This method performs tuning to optimize the network's performance.
        """
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
        bhp = Tuning(self.training_features, self.training_targets, self.validation_features, self.validation_targets, dir)

    @staticmethod
    def mape(x, y):
        """
        Calculate the Mean Absolute Percentage Error (MAPE) between two arrays.

        Args:
            x (numpy.ndarray): Actual values.
            y (numpy.ndarray): Predicted values.

        Returns:
            float: Mean Absolute Percentage Error.
        """
        return np.mean(np.abs((x - y) / y)) * 100

    def interpret_input_vectors(self) -> None:
        """
        Interpret input vectors to extract experimental conditions.

        This method processes input vectors to determine experimental
        intensities or other relevant parameters.
        """
        intensity = np.zeros(len(self.input_vectors))
        for idx, experiment in enumerate(self.input_vectors):
            match experiment:
                case x if 'light' in x:
                    intensity[idx] = float(experiment.split('_')[-1])
                case x if 'dark' in x:
                    intensity[idx] = 0
        self.intensity = intensity

    def sample_experimental_features(self, features: Any) -> Any:
        """
        Sample experimental features to match input vectors.

        Args:
            features: Experimental features to sample.

        Returns:
            Updated experimental features.
        """
        if not isinstance(features, list):
            keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
            keys = list(keys)
            if len(keys) > 1:
                keys = keys[:1]  # a single curve can only cover the first input
            l = 0
            for key in keys:
                l = l + len(self.input_vectors[key])
            filter = np.zeros(l)
            input = []
            for key in keys:
                input.append(self.input_vectors[key])

            input = np.array(input).ravel()
            for idx, i in enumerate(input):
                exp = features.x
                diff = exp - i
                diff = np.abs(diff)
                diff = np.argmin(diff)
                filter[idx] = np.abs(diff)
            filter = np.array(filter).astype(int)
            features.x = features.x[filter]
            features.y = features.y[filter]
        else:
            keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
            keys = list(keys)
            l = 0
            for key in keys:
                l = l + len(self.input_vectors[key])
            filter = np.zeros(l)
            input = []
            for key in keys:
                input.append(self.input_vectors[key])
            input = np.array(input).ravel()
            for jdx in range(len(features)):
                for idx, i in enumerate(input):
                    exp = features.x[jdx]
                    diff = exp - i
                    diff = np.abs(diff)
                    diff = np.argmin(diff)
                    filter[idx] = np.abs(diff)
                filter = np.array(filter).astype(int)
                features.x[jdx] = features.x[jdx][filter]
                features.y[jdx] = features.y[jdx][filter]

        return features

    def normalise_experimental_features(self, features: Any, dir: Optional[str] = None, idx = None) -> Any:
        """
        Normalize experimental features based on configuration settings.

        Args:
            features: Experimental features to normalize.
            dir (str, optional): Directory containing normalization settings.

        Returns:
            Normalized experimental features.

        """

        if idx is None:
            idx = 0

        if dir is None:
            min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                      header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        else:
            min_max_log = pd.read_csv(os.path.join(dir, 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ',
                                      names=['param', 'min', 'max', 'log'])
        prefix = [self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs'][idx]]

        if len(prefix) > 1:
            vecs = []
            for idx in range(len(prefix)):
                temp_vecs = [prefix[idx] + '.vec' + str(i) for i in range(self.points)]
                vecs.append(temp_vecs)
        else:
            vecs = [prefix[0] + '.vec' + str(i) for i in range(self.points)]

        min_max_log = min_max_log[min_max_log['param'].isin(vecs)]
        dir_min = min_max_log['min'].values
        dir_max = min_max_log['max'].values
        dir_log = min_max_log['log'].values
        y = features.y
        if len(y) != self.points:
            for idx in range(len(y)):
                if dir_log[0] == 0:
                    y[idx] = self.normalise_linear(y[idx], dir_min[idx], dir_max[idx])
                else:
                    y[idx] = self.normalise_log(y[idx], dir_min[idx], dir_max[idx])
        else:
            if dir_log[0] == 0:
                y = self.normalise_linear(y, dir_min, dir_max)
            else:
                y = self.normalise_log(y, dir_min, dir_max)

        features.y = y
        return features

    def denormalise_predictions(self) -> None:
        """
        Denormalize predictions to their original scale.

        This method converts normalized predictions back to their original
        scale using configuration settings.
        """
        # 100 is an upper bound on the number of outputs per network
        predictions = np.zeros((self.Device_Population, self.total_networks, 100))
        self.mean = np.zeros((self.total_networks, 100))
        for idx in range(self.total_networks):
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            num_outputs = len(outputs)
            network_min_max_log = self.min_max_log[self.min_max_log['param'].isin(outputs)]
            mins = network_min_max_log['min'].values.ravel()
            maxs = network_min_max_log['max'].values.ravel()
            logs = network_min_max_log['log'].values.ravel()

            for jdx in range(num_outputs):
                if num_outputs > 1:
                    if logs[jdx] == 1:
                        predictions[:, idx, jdx] = self.denormalise_log(self.normalised_predicitons[idx][0][0][jdx], mins[jdx], maxs[jdx])
                    else:
                        predictions[:, idx, jdx] = self.denormalise_linear(self.normalised_predicitons[idx][0][0][jdx], mins[jdx], maxs[jdx])
                else:
                    if logs == 1:
                        predictions[:, idx, jdx] = self.denormalise_log(self.normalised_predicitons[idx, jdx], mins, maxs)
                    else:
                        predictions[:, idx, jdx] = self.denormalise_linear(self.normalised_predicitons[idx, jdx], mins, maxs)
        self.predicitons = predictions

        for idx in range(self.total_networks):
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            num_outputs = len(outputs)
            for jdx in range(num_outputs):
                if num_outputs > 1:
                    self.mean[idx, jdx] = np.mean(self.predicitons[:, idx, jdx])
                else:
                    self.mean[idx,0] = np.mean(self.predicitons[:, idx, jdx])

    def normalise_linear(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
        """
        Normalize values linearly.

        Args:
            x (numpy.ndarray): Values to normalize.
            x_min (float): Minimum value for normalization.
            x_max (float): Maximum value for normalization.

        Returns:
            numpy.ndarray: Normalized values.
        """
        return (x - x_min) / (x_max - x_min)

    def normalise_log(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
        """
        Normalize values logarithmically.

        Args:
            x (numpy.ndarray): Values to normalize.
            x_min (float): Minimum value for normalization.
            x_max (float): Maximum value for normalization.

        Returns:
            numpy.ndarray: Normalized values.
        """
        return (np.log10(x) - np.log10(x_min)) / (np.log10(x_max) - np.log10(x_min))

    def denormalise_linear(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
        """
        Denormalize values linearly.

        Args:
            x (numpy.ndarray): Values to denormalize.
            x_min (float): Minimum value for denormalization.
            x_max (float): Maximum value for denormalization.

        Returns:
            numpy.ndarray: Denormalized values.
        """
        return x * (x_max - x_min) + x_min

    def denormalise_log(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
        """
        Denormalize values logarithmically.

        Args:
            x (numpy.ndarray): Values to denormalize.
            x_min (float): Minimum value for denormalization.
            x_max (float): Maximum value for denormalization.

        Returns:
            numpy.ndarray: Denormalized values.
        """
        return 10 ** (x * (np.log10(x_max) - np.log10(x_min)) + np.log10(x_min))

Base class for managing and training machine learning networks.

This class provides methods for initializing, configuring, and training networks, as well as handling input and output data for machine learning tasks. It also supports subclassing for different types of networks.

Subclasses

Class variables

var subclasses

Static methods

def initialise(networks_dir: str, network_type: str | None = None, model_settings=None) ‑> Networks

Factory method to create an instance of a specific network subclass.

This method instantiates the appropriate network subclass based on the specified network type. If no model settings are provided, default settings are used.

Args

networks_dir : str
Directory containing network configurations and data.
network_type : str, optional
Type of network to create. Available types depend on registered subclasses (e.g., 'Point', 'Ensemble', 'Difference').
model_settings : Model_Settings, optional
Configuration settings for the model. If None, default Model_Settings will be created.

Returns

Networks
An instance of the appropriate network subclass.

Raises

ValueError
If the specified network type is not recognized or registered.

Example

>>> networks = Networks.initialise('path/to/networks', 'Point')
def mape(x, y)
Expand source code
@staticmethod
def mape(x, y):
    """
    Calculate the Mean Absolute Percentage Error (MAPE) between two arrays.

    Args:
        x (numpy.ndarray): Actual values.
        y (numpy.ndarray): Predicted values.

    Returns:
        float: Mean Absolute Percentage Error.
    """
    return np.mean(np.abs((x - y) / y)) * 100

Calculate the Mean Absolute Percentage Error (MAPE) between two arrays.

Args

x : numpy.ndarray
Actual values.
y : numpy.ndarray
Predicted values.

Returns

float
Mean Absolute Percentage Error.
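
A quick worked example (the error is taken relative to y, so y must be nonzero):

>>> Networks.mape(np.array([90.0, 110.0]), np.array([100.0, 100.0]))  # -> 10.0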

Methods

def denormalise_linear(self, x: numpy.ndarray, x_min: float, x_max: float) ‑> numpy.ndarray
Expand source code
def denormalise_linear(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
    """
    Denormalize values linearly.

    Args:
        x (numpy.ndarray): Values to denormalize.
        x_min (float): Minimum value for denormalization.
        x_max (float): Maximum value for denormalization.

    Returns:
        numpy.ndarray: Denormalized values.
    """
    return x * (x_max - x_min) + x_min

Denormalize values linearly.

Args

x : numpy.ndarray
Values to denormalize.
x_min : float
Minimum value for denormalization.
x_max : float
Maximum value for denormalization.

Returns

numpy.ndarray
Denormalized values.
def denormalise_log(self, x: numpy.ndarray, x_min: float, x_max: float) ‑> numpy.ndarray
Expand source code
def denormalise_log(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
    """
    Denormalize values logarithmically.

    Args:
        x (numpy.ndarray): Values to denormalize.
        x_min (float): Minimum value for denormalization.
        x_max (float): Maximum value for denormalization.

    Returns:
        numpy.ndarray: Denormalized values.
    """
    return 10 ** (x * (np.log10(x_max) - np.log10(x_min)) + np.log10(x_min))

Denormalize values logarithmically.

Args

x : numpy.ndarray
Values to denormalize.
x_min : float
Minimum value for denormalization.
x_max : float
Maximum value for denormalization.

Returns

numpy.ndarray
Denormalized values.
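
The linear and logarithmic pairs are exact inverses of one another. A quick sketch (these helpers use no instance state):

>>> n = Networks.__new__(Networks)
>>> z = n.normalise_log(1e-5, 1e-8, 1e-2)   # -> 0.5
>>> n.denormalise_log(z, 1e-8, 1e-2)        # -> 1e-05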
def denormalise_predictions(self) ‑> None
Expand source code
def denormalise_predictions(self) -> None:
    """
    Denormalize predictions to their original scale.

    This method converts normalized predictions back to their original
    scale using configuration settings.
    """
    # 100 is an upper bound on the number of outputs per network
    predictions = np.zeros((self.Device_Population, self.total_networks, 100))
    self.mean = np.zeros((self.total_networks, 100))
    for idx in range(self.total_networks):
        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        num_outputs = len(outputs)
        network_min_max_log = self.min_max_log[self.min_max_log['param'].isin(outputs)]
        mins = network_min_max_log['min'].values.ravel()
        maxs = network_min_max_log['max'].values.ravel()
        logs = network_min_max_log['log'].values.ravel()

        for jdx in range(num_outputs):
            if num_outputs > 1:
                if logs[jdx] == 1:
                    predictions[:, idx, jdx] = self.denormalise_log(self.normalised_predicitons[idx][0][0][jdx], mins[jdx], maxs[jdx])
                else:
                    predictions[:, idx, jdx] = self.denormalise_linear(self.normalised_predicitons[idx][0][0][jdx], mins[jdx], maxs[jdx])
            else:
                if logs == 1:
                    predictions[:, idx, jdx] = self.denormalise_log(self.normalised_predicitons[idx, jdx], mins, maxs)
                else:
                    predictions[:, idx, jdx] = self.denormalise_linear(self.normalised_predicitons[idx, jdx], mins, maxs)
    self.predicitons = predictions

    for idx in range(self.total_networks):
        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        num_outputs = len(outputs)
        for jdx in range(num_outputs):
            if num_outputs > 1:
                self.mean[idx, jdx] = np.mean(self.predicitons[:, idx, jdx])
            else:
                self.mean[idx,0] = np.mean(self.predicitons[:, idx, jdx])

Denormalize predictions to their original scale.

This method converts normalized predictions back to their original scale using configuration settings.
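
Predictions are collected into a (Device_Population, total_networks, 100) buffer, where 100 is an upper bound on outputs per network; self.mean then stores the per-network mean of each denormalised output.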

def get_uniform_distribution(self, features: numpy.ndarray, targets: numpy.ndarray) ‑> Tuple[numpy.ndarray, numpy.ndarray]
Expand source code
def get_uniform_distribution(self, features: np.ndarray, targets: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    """
    Generate a uniform distribution of features and targets.

    Args:
        features (numpy.ndarray): Input features.
        targets (numpy.ndarray): Target values.

    Returns:
        tuple: Validation features and validation targets.
    """
    features = np.asarray(features)
    targets = np.asarray(targets)

    training_population = int(self.model_settings.training_percentage * self.population)

    indices = np.linspace(0, self.population - 1, self.population, dtype=int)
    rng = np.random.default_rng()
    training_indices = rng.choice(indices, training_population, replace=False)
    validation_indices = np.array([i not in training_indices for i in indices])

    training_features = features[training_indices]
    training_targets = targets[training_indices]

    validation_features = features[validation_indices]
    validation_targets = targets[validation_indices]

    return validation_features, validation_targets

Generate a uniform distribution of features and targets.

Args

features : numpy.ndarray
    Input features.
targets : numpy.ndarray
    Target values.

Returns

tuple
    Validation features and validation targets.
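
The split reduces to drawing training indices without replacement and keeping the complement for validation. A standalone sketch of the same logic (sizes are illustrative):

>>> rng = np.random.default_rng()
>>> population, training_percentage = 10, 0.8
>>> indices = np.arange(population)
>>> training_idx = rng.choice(indices, int(training_percentage * population), replace=False)
>>> validation_mask = ~np.isin(indices, training_idx)
>>> int(validation_mask.sum())
2
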
def interpret_input_vectors(self) ‑> None
Expand source code
def interpret_input_vectors(self) -> None:
    """
    Interpret input vectors to extract experimental conditions.

    This method processes input vectors to determine experimental
    intensities or other relevant parameters.
    """
    intensity = np.zeros(len(self.input_vectors))
    for idx, experiment in enumerate(self.input_vectors):
        match experiment:
            case x if 'light' in x:
                intensity[idx] = float(experiment.split('_')[-1])
            case x if 'dark' in x:
                intensity[idx] = 0
    # Retain the parsed intensities for downstream use.
    self.intensity = intensity

Interpret input vectors to extract experimental conditions.

This method processes input vectors to determine experimental intensities or other relevant parameters.

def load_input_vectors(self) ‑> None
Expand source code
def load_input_vectors(self) -> None:
    """
    Load input vectors from the network configuration file.

    This method parses experimental input vectors and stores them for use
    in training and prediction.
    """
    input_vectors = {}
    input_experiments = self.oghma_network_config['experimental']
    self.networks_configured = list(self.oghma_network_config['sims'].keys())
    for experiment in input_experiments:
        vector = self.oghma_network_config['experimental'][experiment]['vec']['points'].split(',')
        vector = np.asarray(vector).astype(float)
        input_vectors[experiment] = vector
    self.input_vectors = input_vectors
    self.points = len(vector)

Load input vectors from the network configuration file.

This method parses experimental input vectors and stores them for use in training and prediction.
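
Each experimental vector is stored in nets.json as a comma-separated 'points' string, so parsing reduces to a split and a cast. A sketch with an illustrative three-point vector:

>>> points = '0.0,0.5,1.0'
>>> np.asarray(points.split(',')).astype(float)
array([0. , 0.5, 1. ])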

def load_training_dataset(self) ‑> Tuple[numpy.ndarray, numpy.ndarray]
Expand source code
def load_training_dataset(self) -> Tuple[np.ndarray, np.ndarray]:
    """
    Load the training dataset from the network configuration file.

    This method reads the dataset, extracts input and output features, and
    prepares them for training.

    Returns:
        tuple: A tuple containing features and targets as numpy arrays.
    """
    training_dataset = pd.read_csv(self.oghma_network_config['csv_file'], sep=" ")

    inputs_vectors = {}
    input_experiments = self.oghma_network_config['experimental']
    for experiment in input_experiments:
        points = len(self.oghma_network_config['experimental'][experiment]['vec']['points'].split(','))
        inputs_vectors[experiment] = points
        self.points = points
        self.population = len(training_dataset)

    self.inputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
    self.outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']

    self.input_points = 0
    for input in self.inputs:
        points = len(self.oghma_network_config['experimental'][input]['vec']['points'].split(','))
        self.input_points += points

    self.output_points = len(self.outputs)

    inputs = np.empty(self.input_points, dtype=object)
    previous_end = 0
    for idx in range(len(self.inputs)):
        vector_points = inputs_vectors[self.inputs[idx]]
        # Feature columns follow the '<input>.vec<n>' naming convention.
        inputs[previous_end:previous_end + vector_points] = np.asarray(
            [self.inputs[idx] + '.vec' + str(x) for x in range(vector_points)])
        previous_end = previous_end + vector_points

    self.features = training_dataset[inputs].to_numpy().astype(np.float32)
    self.targets = training_dataset[self.outputs].to_numpy().astype(np.float32)

    return self.features, self.targets

Load the training dataset from the network configuration file.

This method reads the dataset, extracts input and output features, and prepares them for training.

Returns

tuple
    A tuple containing features and targets as numpy arrays.
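
Feature columns in the training CSV are expected to follow the '<input>.vec<n>' naming convention. A sketch of how the column list is assembled, using hypothetical experiment names:

>>> inputs, points = ['light_1.0', 'dark'], 2
>>> [name + '.vec' + str(i) for name in inputs for i in range(points)]
['light_1.0.vec0', 'light_1.0.vec1', 'dark.vec0', 'dark.vec1']
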
def normalise_experimental_features(self, features: Any, dir: str | None = None, idx=None) ‑> Any
Expand source code
def normalise_experimental_features(self, features: Any, dir: Optional[str] = None, idx = None) -> Any:
    """
    Normalize experimental features based on configuration settings.

    Args:
        features: Experimental features to normalize.
        dir (str, optional): Directory containing normalization settings.
        idx (int, optional): Index of the input whose bounds are used.
            Defaults to 0.

    Returns:
        Normalized experimental features.
    """

    if idx is None:
        idx = 0

    if dir is None:
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])
    else:
        min_max_log = pd.read_csv(os.path.join(dir, 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ',
                                  names=['param', 'min', 'max', 'log'])
    prefix = [self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs'][idx]]
    #prefix = prefix[idx]

    if len(prefix) > 1:
        vecs = []
        for idx in range(len(prefix)):
            temp_vecs = [prefix[idx] + '.vec' + str(i) for i in range(self.points)]
            vecs.append(temp_vecs)
    else:
        vecs = [prefix[0] + '.vec' + str(i) for i in range(self.points)]

    min_max_log = min_max_log[min_max_log['param'].isin(vecs)]
    dir_min = min_max_log['min'].values
    dir_max = min_max_log['max'].values
    dir_log = min_max_log['log'].values
    y = features.y
    if len(y) != self.points:
        for idx in range(len(y)):
            if dir_log[0] == 0:
                y[idx] = self.normalise_linear(y[idx], dir_min[idx], dir_max[idx])
            else:
                y[idx] = self.normalise_log(y[idx], dir_min[idx], dir_max[idx])
    else:
        if dir_log[0] == 0:
            y = self.normalise_linear(y, dir_min, dir_max)
        else:
            y = self.normalise_log(y, dir_min, dir_max)

    features.y = y
    return features

Normalize experimental features based on configuration settings.

Args

features
    Experimental features to normalize.
dir : str, optional
    Directory containing normalization settings.
idx : int, optional
    Index of the input whose bounds are used. Defaults to 0.

Returns

Normalized experimental features.
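
Normalization bounds come from min_max.csv, a header-less, space-separated file with one row per parameter. A sketch of the read, mirroring the call in the source above (the path and parameter name are illustrative):

>>> min_max_log = pd.read_csv('min_max.csv', header=None, sep=' ',
...                           names=['param', 'min', 'max', 'log'])
>>> bounds = min_max_log[min_max_log['param'] == 'light_1.0.vec0']  # hypothetical param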

def normalise_linear(self, x: numpy.ndarray, x_min: float, x_max: float) ‑> numpy.ndarray
Expand source code
def normalise_linear(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
    """
    Normalize values linearly.

    Args:
        x (numpy.ndarray): Values to normalize.
        x_min (float): Minimum value for normalization.
        x_max (float): Maximum value for normalization.

    Returns:
        numpy.ndarray: Normalized values.
    """
    return (x - x_min) / (x_max - x_min)

Normalize values linearly.

Args

x : numpy.ndarray
    Values to normalize.
x_min : float
    Minimum value for normalization.
x_max : float
    Maximum value for normalization.

Returns

numpy.ndarray
    Normalized values.
def normalise_log(self, x: numpy.ndarray, x_min: float, x_max: float) ‑> numpy.ndarray
Expand source code
def normalise_log(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
    """
    Normalize values logarithmically.

    Args:
        x (numpy.ndarray): Values to normalize.
        x_min (float): Minimum value for normalization.
        x_max (float): Maximum value for normalization.

    Returns:
        numpy.ndarray: Normalized values.
    """
    return (np.log10(x) - np.log10(x_min)) / (np.log10(x_max) - np.log10(x_min))

Normalize values logarithmically.

Args

x : numpy.ndarray
    Values to normalize.
x_min : float
    Minimum value for normalization.
x_max : float
    Maximum value for normalization.

Returns

numpy.ndarray
    Normalized values.
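
Linear and logarithmic normalization both map [x_min, x_max] onto [0, 1]; the log variant works in decades, which suits quantities spanning orders of magnitude. A quick comparison with illustrative values:

>>> x, x_min, x_max = 1e-6, 1e-9, 1e-3
>>> round((x - x_min) / (x_max - x_min), 6)        # normalise_linear: crushed near 0
0.000999
>>> round(float((np.log10(x) - np.log10(x_min)) / (np.log10(x_max) - np.log10(x_min))), 6)
0.5
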
def sample_experimental_features(self, features: Any) ‑> Any
Expand source code
def sample_experimental_features(self, features: Any) -> Any:
    """
    Sample experimental features to match input vectors.

    Args:
        features: Experimental features to sample.

    Returns:
        Updated experimental features.
    """
    if not isinstance(features, list):
        keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        keys = list(keys)
        if len(keys) > 1:
            keys = keys[0]
        l = 0
        for key in keys:
            l = l + len(self.input_vectors[key])
        filter = np.zeros(l)
        input = []
        for key in keys:
            input.append(self.input_vectors[key])

        input = np.array(input).ravel()
        for idx, i in enumerate(input):
            exp = features.x
            diff = exp - i
            diff = np.abs(diff)
            diff = np.argmin(diff)
            filter[idx] = np.abs(diff)
        filter = np.array(filter).astype(int)
        features.x = features.x[filter]
        features.y = features.y[filter]
    else:
        keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        keys = list(keys)
        l = 0
        for key in keys:
            l = l + len(self.input_vectors[key])
        filter = np.zeros(l)
        input = []
        for key in keys:
            input.append(self.input_vectors[key])
        input = np.array(input).ravel()
        for jdx in range(len(features)):
            for idx, i in enumerate(input):
                exp = features.x[jdx]
                diff = exp - i
                diff = np.abs(diff)
                diff = np.argmin(diff)
                filter[idx] = np.abs(diff)
            filter = np.array(filter).astype(int)
            features.x[jdx] = features.x[jdx][filter]
            features.y[jdx] = features.y[jdx][filter]

    return features

Sample experimental features to match input vectors.

Args

features
    Experimental features to sample.

Returns

Updated experimental features.
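
Sampling is nearest-neighbour matching: for each point the network was trained on, the index of the closest experimental x value is kept. The core of that filter as a standalone sketch (values are illustrative):

>>> exp_x = np.array([0.0, 0.3, 0.7, 1.0])
>>> wanted = np.array([0.25, 0.95])
>>> filt = np.array([np.argmin(np.abs(exp_x - w)) for w in wanted])
>>> exp_x[filt]
array([0.3, 1. ])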

def separate_training_dataset(self) ‑> Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray]
Expand source code
def separate_training_dataset(self) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
    """
    Separate the training dataset into training and validation sets.

    This method splits the dataset based on the training percentage defined
    in the model settings.

    Returns:
        tuple: A tuple containing training features, training targets,
               validation features, and validation targets.
    """

    training_population = int(self.model_settings.training_percentage * self.population)

    indices = np.linspace(0, self.population - 1, self.population, dtype=int)

    # Sample without replacement so the training set contains no duplicate
    # rows and the validation set is its exact complement.
    rng = np.random.default_rng()
    training_indices = rng.choice(indices, size=training_population, replace=False)
    validation_indices = np.delete(indices, training_indices)

    training_features = self.features[training_indices]
    training_targets = self.targets[training_indices]

    validation_features = self.features[validation_indices]
    validation_targets = self.targets[validation_indices]
    self.validation_indices = validation_indices
    self.training_features = training_features
    self.training_targets = training_targets
    self.validation_features = validation_features
    self.validation_targets = validation_targets

    return training_features, training_targets, validation_features, validation_targets

Separate the training dataset into training and validation sets.

This method splits the dataset based on the training percentage defined in the model settings.

Returns

tuple
    A tuple containing training features, training targets, validation features, and validation targets.
def setup_network_directories(self) ‑> None
Expand source code
def setup_network_directories(self) -> None:
    """
    Set up directories for network configurations and ensure required files exist.

    This method validates the network configuration, loads the network settings
    from the nets.json file, and creates necessary directories for each configured
    network. It initializes tracking variables for network management.

    Raises:
        ValueError: If the network configuration file (nets.json) is not found
            in the expected location (networks_dir/faster/nets.json).

    Sets:
        self.networks_configured (list): List of configured network names
        self.working_network (int): Index of currently active network (starts at 0)
        self.total_networks (int): Total number of configured networks
        self.networks (numpy.ndarray): Array to store network instances
        self.oghma_network_config (dict): Loaded network configuration data
    """
    oghma_network_config = os.path.join(self.networks_dir, 'faster', 'nets.json')

    if not os.path.isfile(oghma_network_config):
        raise ValueError('Network Config File Not Found')
    with open(oghma_network_config, 'r') as f:
        oghma_network_config = json.load(f)

    self.networks_configured = list(oghma_network_config['sims'].keys())
    self.working_network = 0
    self.total_networks = len(self.networks_configured)
    self.networks = np.zeros(len(self.networks_configured), dtype=object)

    self.oghma_network_config = oghma_network_config

    for network in self.networks_configured:
        network_dir = os.path.join(self.networks_dir, 'faster', network)
        if not os.path.isdir(network_dir):
            os.mkdir(network_dir)

Set up directories for network configurations and ensure required files exist.

This method validates the network configuration, loads the network settings from the nets.json file, and creates necessary directories for each configured network. It initializes tracking variables for network management.

Raises

ValueError
    If the network configuration file (nets.json) is not found in the expected location (networks_dir/faster/nets.json).

Sets

self.networks_configured (list): List of configured network names
self.working_network (int): Index of currently active network (starts at 0)
self.total_networks (int): Total number of configured networks
self.networks (numpy.ndarray): Array to store network instances
self.oghma_network_config (dict): Loaded network configuration data
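
For orientation, the configuration accessed above can be pictured as the following dictionary once loaded from nets.json; the keys mirror the accesses in this module, while the network, experiment, and output names are purely illustrative:

>>> oghma_network_config = {
...     'csv_file': 'vectors.csv',
...     'experimental': {'light_1.0': {'vec': {'points': '0.0,0.5,1.0'}}},
...     'sims': {'net0': {'inputs': ['light_1.0'], 'outputs': ['mobility']}},
... }
>>> list(oghma_network_config['sims'].keys())
['net0']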

def train_networks(self, network_type=None) ‑> None
Expand source code
def train_networks(self, network_type=None) -> None:
    """
    Train the networks using the training dataset.

    This method initializes the training process for the current working
    network.
    """
    dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
    print('Learning Rate:', self.model_settings.inital_learning_rate)
    print('Decay Rate:', self.model_settings.decay_rate)
    Training(self.training_features, self.training_targets, self.validation_features, self.validation_targets, dir, model_settings=self.model_settings, model_type=network_type)

Train the networks using the training dataset.

This method initializes the training process for the current working network.
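
In practice each network is trained in sequence: select a working network, load and split its dataset, then hand the arrays to Training. A usage sketch, assuming an initialized subclass instance named networks:

>>> networks.setup_network_directories()
>>> networks.working_network = 0
>>> networks.load_training_dataset()
>>> networks.separate_training_dataset()
>>> networks.train_networks()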

def train_networks_existing(self, network_type=None) ‑> None
Expand source code
def train_networks_existing(self, network_type=None) -> None:
    """
    Train the networks using the training dataset.

    This method initializes the training process for the current working
    network.
    """
    dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
    print('Learning Rate:', self.model_settings.inital_learning_rate)
    print('Decay Rate:', self.model_settings.decay_rate)
    Training(self.training_features, self.training_targets, self.validation_features, self.validation_targets, dir, model_settings=self.model_settings, model_type=network_type, existing=True)

Continue training the networks from an existing saved model.

This method resumes training for the current working network from previously saved weights.

def tune_networks(self) ‑> None
Expand source code
def tune_networks(self) -> None:
    """
    Tune the networks using hyperparameter optimization.

    This method performs tuning to optimize the network's performance.
    """
    dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
    bhp = Tuning(self.training_features, self.training_targets, self.validation_features, self.validation_targets, dir)

Tune the networks using hyperparameter optimization.

This method performs tuning to optimize the network's performance.

class Point (networks_dir: str, model_settings=None)
Expand source code
class Point(Networks):
    """
    Subclass of Networks for point-based models.

    This class provides methods specific to point-based models, including
    training and confusion matrix generation.
    """
    _network_type = 'point'

    def __init__(self, networks_dir: str, model_settings = None):
        """
        Initialize a Point instance.

        Args:
            networks_dir (str): Directory containing network configurations.
            model_settings (Model_Settings, optional): Settings for the model.
        """
        self.networks_dir = networks_dir
        self.model_settings = model_settings

        self.rng = np.random.default_rng()

        self.min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])

    def train(self, idx=None) -> None:
        """
        Train the point-based networks.

        This method sets up directories, loads datasets, and trains the
        networks.
        """
        self.setup_network_directories()

        if idx is None:

            for jdx in range(0, self.total_networks):
                self.working_network = jdx

                self.load_training_dataset()
                self.separate_training_dataset()
                self.train_networks_existing()
        else:
            self.working_network = idx
            self.load_training_dataset()
            self.separate_training_dataset()
            self.train_networks_existing('Point')

    def train_existing(self, idx=None) -> None:
        """
        Train the point-based networks.

        This method sets up directories, loads datasets, and trains the
        networks.
        """
        self.setup_network_directories()

        if idx is None:

            for jdx in range(0, self.total_networks):
                self.working_network = jdx

                self.load_training_dataset()
                self.separate_training_dataset()
                self.train_networks_existing()
        else:
            self.working_network = idx
            self.load_training_dataset()
            self.separate_training_dataset()
            self.train_networks_existing()

    def confusion_matrix(self, abs_dir: Optional[str] = None) -> np.ndarray:
        """
        Generate a confusion matrix for the point-based networks.

        Args:
            abs_dir (str, optional): Directory containing absolute data.

        Returns:
            numpy.ndarray: Mean Absolute Percentage Error (MAPE) for the networks.
        """
        self.setup_network_directories()
        self.normalised_predicitons = np.zeros((len(self.networks_configured), 1), dtype=object)
        self.predictions = np.zeros((len(self.networks_configured), 10))
        self.MAPE = np.zeros((len(self.networks_configured),10))

        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])

        for idx in range(len(self.networks_configured)):
            self.working_network = idx

            self.load_training_dataset()
            self.separate_training_dataset()

            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values[0]
            max = network_min_max_log['max'].values[0]
            log = network_min_max_log['log'].values[0]

            figname = 'tempCF' + str(idx)
            fig, ax = plt.subplots(figsize=(6,6), dpi=300)
            ax.set_xlabel('Target')
            ax.set_ylabel('Predicted')
            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
            validation_features = np.array([i.astype(float) for i in self.validation_features])
            P = Predicting(dir, validation_features)
            self.normalised_predicitons[idx,0] = P.predict()
            if log == 1:
                vt = self.denormalise_log(self.validation_targets, min, max)
                predictions = self.denormalise_log(self.normalised_predicitons[idx][:], min, max)
            else:
                vt = self.denormalise_linear(self.validation_targets, min, max)
                predictions = self.denormalise_linear(self.normalised_predicitons[idx][:], min, max)

            if np.shape(self.normalised_predicitons[idx][0])[1] > 1:
                for jdx in range(np.shape(self.normalised_predicitons[idx][0])[1]):
                    plt.hist2d(self.validation_targets[:,jdx].ravel(), self.normalised_predicitons[idx][0][:,jdx].ravel(), bins=np.linspace(0,1, 100), range=[[0, 1], [0, 1]], cmap='inferno')
                    self.MAPE[idx,jdx] = np.abs(np.mean(np.abs(self.validation_targets[:,jdx].ravel() - self.normalised_predicitons[idx][0][:,jdx].ravel()) /self.normalised_predicitons[idx][0][:,jdx].ravel()) * 100)
                    #bs = stats.bootstrap((self.validation_targets[:,jdx].ravel(), self.normalised_predicitons[idx][0][:,jdx].ravel()), self.mape, confidence_level=0.95, n_resamples=100)
                    #self.MAPE[idx,jdx] = np.mean((bs.confidence_interval.low, bs.confidence_interval.high))
            else:
                plt.hist2d(self.validation_targets[:].ravel(), self.normalised_predicitons[idx][0].ravel(), bins=np.linspace(0, 1, 100), range=[[0, 1], [0, 1]], cmap='inferno')

                self.MAPE[idx,0] = np.abs(np.mean(np.abs(self.validation_targets.ravel() - self.normalised_predicitons[idx][0].ravel()) / self.normalised_predicitons[idx][0].ravel()) * 100)
                #bs = stats.bootstrap((self.validation_targets.ravel(), self.normalised_predicitons[idx][0].ravel()), self.mape, confidence_level=0.95, n_resamples=100)
                #self.MAPE[idx, 0] = np.mean((bs.confidence_interval.low, bs.confidence_interval.high))

            # Save the confusion data and figure to a temporary directory.
            if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
                os.mkdir(os.path.join(os.getcwd(), 'temp'))
            figname = os.path.join(os.getcwd(), 'temp', 'tempCF' + str(self.working_network) + '.png')
            data = pd.DataFrame()
            data['Target'] = vt[:].ravel()[:20000]
            data['Predicted'] = predictions[0].ravel()[:20000]
            data.to_csv(figname + '.csv', index=False)
            plt.savefig(figname)

        return self.MAPE

    def predict(self, experimental_feature: Any) -> None:
        """
        Predict outputs for given experimental features.

        Args:
            experimental_feature: Experimental features to predict outputs for.
        """
        self.Device_Population = experimental_feature.Device_Population
        self.setup_network_directories()
        self.normalised_predicitons = np.zeros((len(self.networks_configured), 1), dtype=object)
        self.predictions = np.zeros((len(self.networks_configured), 10))
        for idx in range(len(self.networks_configured)):
            ef = copy.deepcopy(experimental_feature)
            self.working_network = idx

            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])

            self.load_input_vectors()
            self.interpret_input_vectors()

            ef = self.sample_experimental_features(ef)
            ef = self.normalise_experimental_features(ef)

            if ef.Device_Population > 1:
                ef.y = np.array([f for f in ef.y])
            else:
                ef.y = np.array([ef.y])

            P = Predicting(dir, ef.y)

            self.normalised_predicitons[idx,0] = P.predict()
        self.denormalise_predictions()

Subclass of Networks for point-based models.

This class provides methods specific to point-based models, including training and confusion matrix generation.

Initialize a Point instance.

Args

networks_dir : str
    Directory containing network configurations.
model_settings : Model_Settings, optional
    Settings for the model.
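
A typical Point workflow, with a placeholder directory and settings object:

>>> point = Point('simulation_data/', model_settings)
>>> point.train()
>>> mape = point.confusion_matrix()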

Ancestors

Networks

Methods

def confusion_matrix(self, abs_dir: str | None = None) ‑> numpy.ndarray
Expand source code
def confusion_matrix(self, abs_dir: Optional[str] = None) -> np.ndarray:
    """
    Generate a confusion matrix for the point-based networks.

    Args:
        abs_dir (str, optional): Directory containing absolute data.

    Returns:
        numpy.ndarray: Mean Absolute Percentage Error (MAPE) for the networks.
    """
    self.setup_network_directories()
    self.normalised_predicitons = np.zeros((len(self.networks_configured), 1), dtype=object)
    self.predictions = np.zeros((len(self.networks_configured), 10))
    self.MAPE = np.zeros((len(self.networks_configured),10))

    min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])

    for idx in range(len(self.networks_configured)):
        self.working_network = idx

        self.load_training_dataset()
        self.separate_training_dataset()

        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values[0]
        max = network_min_max_log['max'].values[0]
        log = network_min_max_log['log'].values[0]

        figname = 'tempCF' + str(idx)
        fig, ax = plt.subplots(figsize=(6,6), dpi=300)
        ax.set_xlabel('Target')
        ax.set_ylabel('Predicted')
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
        validation_features = np.array([i.astype(float) for i in self.validation_features])
        P = Predicting(dir, validation_features)
        self.normalised_predicitons[idx,0] = P.predict()
        if log == 1:
            vt = self.denormalise_log(self.validation_targets, min, max)
            predictions = self.denormalise_log(self.normalised_predicitons[idx][:], min, max)
        else:
            vt = self.denormalise_linear(self.validation_targets, min, max)
            predictions = self.denormalise_linear(self.normalised_predicitons[idx][:], min, max)

        if np.shape(self.normalised_predicitons[idx][0])[1] > 1:
            for jdx in range(np.shape(self.normalised_predicitons[idx][0])[1]):
                plt.hist2d(self.validation_targets[:,jdx].ravel(), self.normalised_predicitons[idx][0][:,jdx].ravel(), bins=np.linspace(0,1, 100), range=[[0, 1], [0, 1]], cmap='inferno')
                self.MAPE[idx,jdx] = np.abs(np.mean(np.abs(self.validation_targets[:,jdx].ravel() - self.normalised_predicitons[idx][0][:,jdx].ravel()) /self.normalised_predicitons[idx][0][:,jdx].ravel()) * 100)
                #bs = stats.bootstrap((self.validation_targets[:,jdx].ravel(), self.normalised_predicitons[idx][0][:,jdx].ravel()), self.mape, confidence_level=0.95, n_resamples=100)
                #self.MAPE[idx,jdx] = np.mean((bs.confidence_interval.low, bs.confidence_interval.high))
        else:
            plt.hist2d(self.validation_targets[:].ravel(), self.normalised_predicitons[idx][0].ravel(), bins=np.linspace(0, 1, 100), range=[[0, 1], [0, 1]], cmap='inferno')

            self.MAPE[idx,0] = np.abs(np.mean(np.abs(self.validation_targets.ravel() - self.normalised_predicitons[idx][0].ravel()) / self.normalised_predicitons[idx][0].ravel()) * 100)
            #bs = stats.bootstrap((self.validation_targets.ravel(), self.normalised_predicitons[idx][0].ravel()), self.mape, confidence_level=0.95, n_resamples=100)
            #self.MAPE[idx, 0] = np.mean((bs.confidence_interval.low, bs.confidence_interval.high))

        # Save the confusion data and figure to a temporary directory.
        if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
            os.mkdir(os.path.join(os.getcwd(), 'temp'))
        figname = os.path.join(os.getcwd(), 'temp', 'tempCF' + str(self.working_network) + '.png')
        data = pd.DataFrame()
        data['Target'] = vt[:].ravel()[:20000]
        data['Predicted'] = predictions[0].ravel()[:20000]
        data.to_csv(figname + '.csv', index=False)
        plt.savefig(figname)

    return self.MAPE

Generate a confusion matrix for the point-based networks.

Args

abs_dir : str, optional
    Directory containing absolute data.

Returns

numpy.ndarray
    Mean Absolute Percentage Error (MAPE) for the networks.
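
Note that the metric is evaluated on the normalized scale and divides by the predictions rather than the targets. The expression reduces to the following sketch with illustrative values:

>>> targets = np.array([0.5, 0.4])
>>> preds = np.array([0.45, 0.5])
>>> round(float(np.abs(np.mean(np.abs(targets - preds) / preds)) * 100), 2)
15.56
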
def predict(self, experimental_feature: Any) ‑> None
Expand source code
def predict(self, experimental_feature: Any) -> None:
    """
    Predict outputs for given experimental features.

    Args:
        experimental_feature: Experimental features to predict outputs for.
    """
    self.Device_Population = experimental_feature.Device_Population
    self.setup_network_directories()
    self.normalised_predicitons = np.zeros((len(self.networks_configured), 1), dtype=object)
    self.predictions = np.zeros((len(self.networks_configured), 10))
    for idx in range(len(self.networks_configured)):
        ef = copy.deepcopy(experimental_feature)
        self.working_network = idx

        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])

        self.load_input_vectors()
        self.interpret_input_vectors()

        ef = self.sample_experimental_features(ef)
        ef = self.normalise_experimental_features(ef)

        if ef.Device_Population > 1:
            ef.y = np.array([f for f in ef.y])
        else:
            ef.y = np.array([ef.y])

        P = Predicting(dir, ef.y)

        self.normalised_predicitons[idx,0] = P.predict()
    self.denormalise_predictions()

Predict outputs for given experimental features.

Args

experimental_feature
    Experimental features to predict outputs for.
def train(self, idx=None) ‑> None
Expand source code
def train(self, idx=None) -> None:
    """
    Train the point-based networks.

    This method sets up directories, loads datasets, and trains the
    networks.
    """
    self.setup_network_directories()

    if idx is None:

        for jdx in range(0, self.total_networks):
            self.working_network = jdx

            self.load_training_dataset()
            self.separate_training_dataset()
            self.train_networks_existing()
    else:
        self.working_network = idx
        self.load_training_dataset()
        self.separate_training_dataset()
        self.train_networks_existing('Point')

Train the point-based networks.

This method sets up directories, loads datasets, and trains the networks.

def train_existing(self, idx=None) ‑> None
Expand source code
def train_existing(self, idx=None) -> None:
    """
    Train the point-based networks.

    This method sets up directories, loads datasets, and trains the
    networks.
    """
    self.setup_network_directories()

    if idx is None:

        for jdx in range(0, self.total_networks):
            self.working_network = jdx

            self.load_training_dataset()
            self.separate_training_dataset()
            self.train_networks_existing()
    else:
        self.working_network = idx
        self.load_training_dataset()
        self.separate_training_dataset()
        self.train_networks_existing()

Continue training the point-based networks from existing models.

This method sets up directories, loads datasets, and resumes training from previously saved weights.

Inherited members

class Residual (networks_dir, model_settings=None)
Expand source code
class Residual(Networks):
    """
    Subclass of Networks for residual (difference-pair) models.

    Trains on permuted pairs of samples: features are the two samples'
    inputs stacked side by side, and targets are the difference between
    their outputs.
    """
    _network_type = 'residual'

    def __init__(self, networks_dir, model_settings=None):
        self.networks_dir = networks_dir

        self.model_settings = model_settings
        self.rng = np.random.default_rng()

    def train(self, idx=None):
        self.setup_network_directories()
        
        if idx is None:
            for idx in range(self.total_networks):
                self.working_network = idx
                self.load_training_dataset()
                self.permutations_lowRAM()
                self.separate_training_dataset()
                self.train_networks()
        else:
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks('Residual')
    
    def train_existing(self, idx=None):
        self.setup_network_directories()
        
        if idx is None:
            for idx in range(self.total_networks):
                self.working_network = idx
                self.load_training_dataset()
                self.permutations_lowRAM()
                self.separate_training_dataset()
                self.train_networks_existing()
        else:
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks_existing('Residual')

    def tune(self):
        self.setup_network_directories()

        for idx in range(self.total_networks):
            self.working_network = idx

            self.load_training_dataset()
            self.permutations()
            self.separate_training_dataset()
            self.tune_networks()

    def permutations(self):
        features = self.features
        targets = self.targets
        if self.model_settings.permutations_limit is None:
            permutations_limit = len(targets)
        else:
            permutations_limit = self.model_settings.permutations_limit

        indices = random.choices(range(len(targets)), k=permutations_limit)
        self.indices = indices
        permutations = list(itertools.permutations(indices, r=2))
        permutations = np.array([list(f) for f in permutations], dtype=np.uint32)
        self.permutations_list = permutations

        keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        keys = list(keys)
        l = 0
        for key in keys:
            l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))

        num_inputs = l

        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)

        x = np.zeros((len(permutations), 2 * num_inputs))
        y = np.zeros((len(permutations), num_outputs))
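        # Each permuted sample pairs two dataset rows: their input vectors
        # are stacked side by side (2 * num_inputs features) and the new
        # target is the difference targets[a] - targets[b].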
        for idx, p in enumerate(permutations):
            x[idx, :num_inputs] = features[p[0]]
            x[idx, num_inputs:] = features[p[1]]
            y[idx] = targets[p[0]] - targets[p[1]]

        self.features = x  # copy.deepcopy(x)
        self.targets = y  # copy.deepcopy(y)
        self.population = len(permutations)
        return features, targets

    def permutations_lowRAM(self):
        permutations_limit = self.model_settings.permutations_limit

        indices = random.choices(range(1000), k=1000)
        self.indices = indices
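        # Low-RAM strategy: index pairs are generated within a single block
        # of 1000 rows and tiled across the dataset below, which assumes at
        # least Thousands * 1000 rows are available.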

        permutations = list(itertools.permutations(indices, r=2))
        permutations = np.array([list(f) for f in permutations], dtype=np.uint32)

        Thousands = 40
        p = np.zeros((len(permutations) * Thousands, 2), dtype=np.uint32)
        step = np.shape(permutations)[0]
        # Tile the base block of index pairs across consecutive groups of
        # 1000 rows so every block of the dataset is covered exactly once.
        for idx in range(Thousands):
            p[step * idx:step * (idx + 1), :] = permutations + (1000 * idx)
        permutations = p

        if permutations_limit is not None and permutations_limit < np.shape(permutations)[0]:
            permutations = permutations[:permutations_limit]
        self.permutations_list = permutations

        keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        keys = list(keys)
        l = 0
        for key in keys:
            l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))

        num_inputs = l

        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)

        x = np.zeros((len(permutations), 2 * num_inputs))
        y = np.zeros((len(permutations), num_outputs))
        for idx, p in enumerate(permutations):
            x[idx, :num_inputs] = self.features[p[0]]
            x[idx, num_inputs:] = self.features[p[1]]
            y[idx] = self.targets[p[0]] - self.targets[p[1]]

        self.features = x  # copy.deepcopy(x)
        self.targets = y  # copy.deepcopy(y)
        self.population = len(permutations)

        return self.features, self.targets

    def combinations(self):
        if self.model_settings.permutations_limit is None:
            permutations_limit = self.population
        else:
            permutations_limit = self.model_settings.permutations_limit

        rng = np.random.default_rng()
        indices = rng.choice(range(self.population), permutations_limit)
        permutations = list(itertools.combinations(indices, r=2))
        permutations = [list(f) for f in permutations]

        x = np.zeros((len(self.networks_configured), len(permutations), 2 * self.points))
        y = np.zeros((len(self.networks_configured), len(permutations), 1))

        for idx, p in enumerate(permutations):
            x[:, idx, :self.points] = self.features[:, p[0], :]
            x[:, idx, self.points:] = self.features[:, p[1], :]
            y[:, idx] = self.targets[p[0]] - self.targets[p[1]]

        self.features = copy.deepcopy(x)
        self.targets = copy.deepcopy(y)
        self.population = len(permutations)

    def predict(self, absolute_dir: str, *experimental_features) -> None:
        """
        Predict outputs for given experimental features using difference-based models.

        Args:
            absolute_dir (str): Directory containing absolute data.
            experimental_feature (np.ndarray): Experimental features to predict outputs for.
        """
        experimental_features = experimental_features[0]
        self.setup_network_directories()
        # Gather Input Vectors Used
        self.load_input_vectors()

        exps = []
        if type(experimental_features) is list:
            for idx, experimental_feature in enumerate(experimental_features):
                experimental_feature = copy.deepcopy(experimental_feature)
                # Samples Experimental Feature
                experimental_feature = self.sample_experimental_features(experimental_feature)

                # Normalises Experimental Feature
                experimental_feature = self.normalise_experimental_features(experimental_feature, dir=None, idx=idx)
                exps.append(experimental_feature)
        else:
            experimental_features = copy.deepcopy(experimental_features)
            # Samples Experimental Feature
            experimental_features = self.sample_experimental_features(experimental_features)

            # Normalises Experimental Feature
            experimental_features = self.normalise_experimental_features(experimental_features, dir=None, idx=None)
            exps.append(experimental_features)
            exps = exps[0]

        for idx in range(self.total_networks):
            self.working_network = idx
            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])

            # Gathers and sets up secondary dataset
            self.setup_absolute_dataset(absolute_dir)

            # Gathers number of outputs
            outputs = len(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs'])

            s = np.shape(self.absolute_features)

            # Setup Prediction Array
            self.normalised_predicitons = np.zeros((s[0], outputs), dtype=object)

            # Pairs experimental feature with secondary dataset
            feature = self.setup_experimental_feature(exps)

            # Sets up Network
            P = Predicting(dir, feature, inputs=self.len_exp_features*2)

            # Generates Normalised Predictions
            self.normalised_predicitons = P.predict()

            self.turn_predictions_absolute()

            self.predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))

            prediction = self.denormalise_prediction_single(absolute_dir)

            self.predictions[:, idx, :] = prediction[:]
            self.distribution_plot_single(absolute_dir, idx, prediction, self.normalised_predicitons)

    def renormalise(self, absolute_v: pd.DataFrame, absolute_dir: str) -> pd.DataFrame:
        """
        Renormalize absolute values to match the network's configuration.

        Args:
            absolute_v (pandas.DataFrame): Absolute values to renormalize.
            absolute_dir (str): Directory containing absolute data.

        Returns:
            pandas.DataFrame: Renormalized values.
        """
        network_min_max_log_dir = os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv')
        network_min_max_log = pd.read_csv(network_min_max_log_dir, header=None, sep=' ',
                                          names=['param', 'min', 'max', 'log'])

        absolute_min_max_log_dir = os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv')
        absolute_min_max_log = pd.read_csv(absolute_min_max_log_dir, header=None, sep=' ',
                                           names=['param', 'min', 'max', 'log'])

        temp = absolute_min_max_log[absolute_min_max_log['param'] == self.outputs[0]]

        absolute_min = temp['min'].to_numpy()[0]
        absolute_max = temp['max'].to_numpy()[0]
        absolute_log = temp['log'].to_numpy()[0]

        if absolute_log == 1:
            absolute_v[self.outputs[0]] = self.denormalise_log(absolute_v[self.outputs[0]], absolute_min, absolute_max)
        else:
            absolute_v[self.outputs[0]] = self.denormalise_linear(absolute_v[self.outputs[0]], absolute_min, absolute_max)

        temp = network_min_max_log[network_min_max_log['param'] == self.outputs[0]]
        network_min = temp['min'].to_numpy()[0]
        network_max = temp['max'].to_numpy()[0]
        network_log = temp['log'].to_numpy()[0]

        if network_log == 1:
            absolute_v[self.outputs[0]] = self.normalise_log(absolute_v[self.outputs[0]], network_min, network_max)
        else:
            absolute_v[self.outputs[0]] = self.normalise_linear(absolute_v[self.outputs[0]], network_min, network_max)

        vecs = []
        for f in absolute_v.columns:
            if 'vec' in f:
                if 'light' in f:
                    vecs.append(f)

        for vec in vecs:
            temp = absolute_min_max_log[absolute_min_max_log['param'] == vec]
            absolute_min = temp['min'].to_numpy()[0]
            absolute_max = temp['max'].to_numpy()[0]
            absolute_log = temp['log'].to_numpy()[0]

            if absolute_log == 1:
                absolute_v[vec] = self.denormalise_log(absolute_v[vec], absolute_min, absolute_max)
            else:
                absolute_v[vec] = self.denormalise_linear(absolute_v[vec], absolute_min, absolute_max)

            temp = network_min_max_log[network_min_max_log['param'] == vec]
            network_min = temp['min'].to_numpy()[0]
            network_max = temp['max'].to_numpy()[0]
            network_log = temp['log'].to_numpy()[0]

            if network_log == 1:
                absolute_v[vec] = self.normalise_log(absolute_v[vec], network_min, network_max)
            else:
                absolute_v[vec] = self.normalise_linear(absolute_v[vec], network_min, network_max)

        return absolute_v

    def setup_absolute_dataset(self, absolute_dir: str) -> None:
        """
        Set up the absolute dataset for predictions.

        Args:
            absolute_dir (str): Directory containing absolute data.
        """
        absolute_vectors_dir = os.path.join(absolute_dir, 'faster', 'vectors.csv')

        inputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']

        self.inputs = inputs
        self.outputs = outputs

        f = open(absolute_vectors_dir, 'r')
        v = pd.read_csv(f, delimiter=' ')
        self.vectors = self.renormalise(v, absolute_dir)
        f.close()

        col = self.vectors.columns.to_list()
        inp = [i + '.vec' for i in self.inputs]
        V = []
        for i in inp:
            vecs = np.where(np.char.find(np.char.lower(col), i) > -1)[0]
            V = np.append(V, vecs)
        col = [col[int(x)] for x in V]
        self.absolute_targets = self.vectors[outputs].to_numpy()
        self.absolute_features = self.vectors[col].to_numpy()
        self.absolute_features = self.absolute_features[:]
        self.absolute_targets = self.absolute_targets[:]

        #self.generate_uniform_distribution()

    def generate_uniform_distribution(self) -> None:
        """
        Generate a uniform distribution of the absolute dataset.
        """
        kde = stats.gaussian_kde(self.absolute_targets.ravel())
        density = kde(self.absolute_targets.ravel())
        density = self.normalise_linear(density, np.min(density), np.max(density))
        density = 1 - density
        density = density / np.sum(density)
        indicies = np.arange(len(density))
        normal = random.choices(indicies, weights=density, k=int(len(self.absolute_targets) / 2))
        self.absolute_targets = self.absolute_targets[normal]
        self.absolute_features = self.absolute_features[normal]

    def setup_experimental_feature(self, experimental_features) -> np.ndarray:

        absolute_features = self.absolute_features
        try:
            len_exp_features = len(experimental_features)
            self.len_exp_features = int(len_exp_features / 2)
            len_exp0 = len(experimental_features[0].y)
        except TypeError:
            # A single feature object was passed rather than a list of them.
            len_exp_features = 1
            self.len_exp_features = 1
            len_exp0 = len(experimental_features.y)
        match len_exp_features:
            case 1:
                y = np.tile(experimental_features.y, (len(absolute_features),1))
                y = np.reshape(y, np.shape(absolute_features))

                features1 = np.concatenate((absolute_features, y), axis=1)
                features2 = np.concatenate((y, absolute_features), axis=1)
                features = np.concatenate((features1, features2), axis=0)
            case 2:
                abf_1 = absolute_features[:, 0:len_exp0]
                abf_2 = absolute_features[:, len_exp0:len_exp0*2]

                y1 = np.tile(experimental_features[0].y, (len(abf_1), 1))
                y1 = np.reshape(y1, np.shape(abf_1))
                y2 = np.tile(experimental_features[1].y, (len(abf_2), 1))
                y2 = np.reshape(y2, np.shape(abf_2))

                abf = np.concatenate((abf_1, abf_2), axis=1)
                abf_r = np.concatenate((abf_2, abf_1), axis=1)
                abf = np.concatenate((abf, abf_r), axis=0)

                y = np.concatenate((y1, y2), axis=1)
                y_r = np.concatenate((y2, y1), axis=1)
                y = np.concatenate((y, y_r), axis=0)

                features = np.concatenate((abf, y), axis=1)

            case 4:
                abf_1 = absolute_features[:, 0:len_exp0]
                abf_2 = absolute_features[:, len_exp0:len_exp0*2]
                abf_3 = absolute_features[:, len_exp0*2:len_exp0*3]
                abf_4 = absolute_features[:, len_exp0*3:len_exp0*4]

                y1 = np.tile(experimental_features[0].y, (len(abf_1), 1))
                y1 = np.reshape(y1, np.shape(abf_1))
                y2 = np.tile(experimental_features[1].y, (len(abf_2), 1))
                y2 = np.reshape(y2, np.shape(abf_2))
                y3 = np.tile(experimental_features[2].y, (len(abf_3), 1))
                y3 = np.reshape(y3, np.shape(abf_3))
                y4 = np.tile(experimental_features[3].y, (len(abf_4), 1))
                y4 = np.reshape(y4, np.shape(abf_4))

                features1 = np.concatenate((abf_1, y1), axis=1)
                features1_r = np.concatenate((y1, abf_1), axis=1)
                features1 = np.concatenate((features1, features1_r), axis=0)
                features2 = np.concatenate((abf_2, y2), axis=1)
                features2_r = np.concatenate((y2, abf_2), axis=1)
                features2 = np.concatenate((features2, features2_r), axis=0)
                features3 = np.concatenate((abf_3, y3), axis=1)
                features3_r = np.concatenate((y3, abf_3), axis=1)
                features3 = np.concatenate((features3, features3_r), axis=0)
                features4 = np.concatenate((abf_4, y4), axis=1)
                features4_r = np.concatenate((y4, abf_4), axis=1)
                features4 = np.concatenate((features4, features4_r), axis=0)

                features = np.concatenate((features1, features2, features3, features4), axis=1)
        return features

    def turn_predictions_absolute(self) -> None:
        """
        Convert predictions to absolute values.
        """
        origin = self.absolute_targets
        l = len(self.absolute_targets)
        self.normalised_predicitons[:l] = origin - self.normalised_predicitons[:l]
        self.normalised_predicitons[l:] = self.normalised_predicitons[l:] - origin

    def denormalise_predictions(self, absolute_dir: str) -> None:
        """
        Denormalize predictions to their original scale.

        Args:
            absolute_dir (str): Directory containing absolute data.
        """
        min_max_log = pd.read_csv(os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None,
                                  sep=' ', names=['param', 'min', 'max', 'log'])
        predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))
        normalised_predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))

        for idx in range(self.total_networks):
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            num_outputs = len(outputs)
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values
            max = network_min_max_log['max'].values
            log = network_min_max_log['log'].values

            for jdx in range(num_outputs):
                for kdx in range(len(self.absolute_targets)):
                    if not log[jdx]:
                        predictions[kdx, idx, jdx] = self.normalised_predicitons[kdx, jdx] * (max[jdx] - min[jdx]) + \
                                                     min[jdx]
                    else:
                        predictions[kdx, idx, jdx] = self.normalised_predicitons[kdx, jdx] * (
                                    np.log10(max[jdx]) - np.log10(min[jdx])) + np.log10(min[jdx])
                        predictions[kdx, idx, jdx] = 10 ** predictions[kdx, idx, jdx]
        self.predictions = predictions

    def denormalise_prediction_single(self, absolute_dir: str) -> np.ndarray:
        """
        Denormalize a single prediction to its original scale.

        Args:
            absolute_dir (str): Directory containing absolute data.

        Returns:
            numpy.ndarray: Denormalized prediction.
        """
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        predictions = np.zeros((len(self.absolute_targets), 10))

        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values
        max = network_min_max_log['max'].values
        log = network_min_max_log['log'].values[0]

        for jdx in range(num_outputs):
            for kdx in range(len(self.absolute_targets)):
                if not log:
                    predictions[kdx, jdx] = self.denormalise_linear(self.normalised_predicitons[kdx, jdx], min[jdx],
                                                                    max[jdx])
                else:
                    predictions[kdx, jdx] = self.denormalise_log(self.normalised_predicitons[kdx, jdx], min[jdx],
                                                                 max[jdx])

        return predictions

    def denormalise_target_single(self, absolute_dir: str) -> np.ndarray:
        """
        Denormalize a single target to its original scale.

        Args:
            absolute_dir (str): Directory containing absolute data.

        Returns:
            numpy.ndarray: Denormalized target.
        """
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        predictions = np.zeros((len(self.absolute_targets), 10))

        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values
        max = network_min_max_log['max'].values
        log = network_min_max_log['log'].values[0]

        for jdx in range(num_outputs):
            for kdx in range(len(self.absolute_targets)):
                if not log:
                    predictions[kdx, jdx] = self.denormalise_linear(self.absolute_targets[kdx, jdx], min[jdx], max[jdx])
                else:
                    predictions[kdx, jdx] = self.denormalise_log(self.absolute_targets[kdx, jdx], min[jdx], max[jdx])

        return predictions

    def permutations_normal_distribution(self, features: np.ndarray, targets: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        """
        Generate permutations of features and targets with a normal distribution.

        Args:
            features (numpy.ndarray): Input features.
            targets (numpy.ndarray): Target values.

        Returns:
            tuple: Features and targets with generated permutations.
        """
        if self.model_settings.permutations_limit is None:
            permutations_limit = len(targets)
        else:
            permutations_limit = self.model_settings.permutations_limit

        # Note: the draw below uses a fixed sample of 300 indices; the
        # permutations_limit computed above is not applied at this step.
        indices = self.rng.choice(range(len(targets)), 300)

        permutations = list(itertools.permutations(indices, r=2))
        permutations = np.array([list(f) for f in permutations], dtype=np.uint32)

        keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        keys = list(keys)
        l = 0
        for key in keys:
            l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))

        num_inputs = l

        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)

        x = np.zeros((len(permutations), 2 * num_inputs))
        y = np.zeros((len(permutations), num_outputs))
        for idx, p in enumerate(permutations):
            x[idx, :num_inputs] = features[p[0]]
            x[idx, num_inputs:] = features[p[1]]
            y[idx] = targets[p[0]] - targets[p[1]]

        features = x
        targets = y.ravel()

        self.population = len(permutations)
        return features, targets

    def setup_confusion_matrix_features(self) -> Tuple[np.ndarray, np.ndarray]:
        """
        Set up features for generating a confusion matrix.

        Returns:
            tuple: Targets and features for the confusion matrix.
        """
        absolute_features = self.absolute_features
        absolute_targets = self.absolute_targets
        x_features = absolute_features
        x_targets = absolute_targets[::-1]
        y_features = absolute_features
        y_targets = absolute_targets[::-1]

        features1 = np.concatenate((x_features, y_features), axis=1)
        features2 = np.concatenate((y_features, x_features), axis=1)
        features = np.concatenate((features1, features2), axis=0)

        targets = np.concatenate((x_targets, y_targets), axis=0)
        return targets, features

    def confusion_matrix(self, absolute_dir: Optional[str] = None) -> np.ndarray:
        """
        Generate a confusion matrix for the difference-based networks.

        Args:
            absolute_dir (str, optional): Directory containing absolute data.

        Returns:
            numpy.ndarray: Mean Absolute Percentage Error (MAPE) for the networks.
        """
        self.setup_network_directories()
        self.load_input_vectors()
        MAPE = np.zeros(len(self.networks_configured))
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])

        for idx in range(self.total_networks):
            self.working_network = idx
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values[0]
            max = network_min_max_log['max'].values[0]
            log = network_min_max_log['log'].values[0]
            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])

            self.setup_absolute_dataset(absolute_dir)

            outputs = len(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs'])

            s = np.shape(self.absolute_features)

            self.normalised_predicitons = np.zeros((s[0]*2, outputs), dtype=object)

            targets, feature = self.setup_confusion_matrix_features()

            P = Predicting(dir, feature, inputs=2)

            self.normalised_predicitons = P.predict()

            self.absolute_targets = targets
            self.normalised_predicitons = targets - self.normalised_predicitons

            prediction = self.denormalise_prediction_single(absolute_dir)
            targets = self.denormalise_target_single(absolute_dir)

            fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
            plt.xlabel('Target')
            plt.ylabel('Prediction')
            if log == 0:
                plt.hist2d(targets[:,0].ravel(), prediction[:,0].ravel(), bins=np.linspace(min,max,150), cmap='inferno')
            else:
                plt.hist2d(targets[:,0].ravel(), prediction[:,0].ravel(), bins=10**np.linspace(np.log10(min),np.log10(max),150), cmap='inferno')
                plt.xscale('log')
                plt.yscale('log')
            figname = 'tempCF' + str(self.working_network) 
            if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
                os.mkdir(os.path.join(os.getcwd(), 'temp'))
            figname = os.path.join(os.getcwd(), 'temp', figname)
            data = pd.DataFrame()
            data['Target'] = targets[:,0].ravel()
            data['Predicted'] = prediction[:,0].ravel()
            data.to_csv(figname + '.csv', index=False)
            plt.savefig(figname+ '.png')

            MAPE[idx] = np.abs(np.mean(np.abs(targets[:,0].ravel() - prediction[:,0].ravel()) / targets[:,0].ravel()) * 100)
            # diff = np.abs(targets[:, 0].ravel() - prediction[:, 0].ravel())
            # bs = stats.bootstrap((diff, ), np.mean, confidence_level=0.95, n_resamples=100)
            # MAPE[idx] = np.mean((bs.confidence_interval.low, bs.confidence_interval.high)) * 100
            self.MAPE = MAPE
        return MAPE

    def distribution_plot_single(self, absolute_dir: str, idx: int, predictions: np.ndarray, norm_predictions: np.ndarray) -> None:
        """
        Generate a distribution plot for a single network.

        Args:
            absolute_dir (str): Directory containing absolute data.
            idx (int): Index of the network.
            predictions (numpy.ndarray): Denormalized predictions to plot.
            norm_predictions (numpy.ndarray): Normalized predictions saved alongside the plot data.
        """
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])

        self.working_network = idx
        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values[0]
        max = network_min_max_log['max'].values[0]
        log = network_min_max_log['log'].values[0]

        p = predictions[:, 0]

        if self.working_network == 0:
            self.mean = np.zeros(self.total_networks)
            self.std = np.zeros(self.total_networks)

        fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
        if log == 0:
            m = np.mean(p)
            s = np.std(p)

            self.mean[idx] = np.abs(m)
            self.std[idx] = s 
            count, bins = np.histogram(p, bins=np.linspace(min, max, 1000) )
            ax.hist(np.abs(p), bins=np.linspace(min, max, 1000))
        else:
            m = np.mean(p)
            s = np.std(p)
            self.mean[idx] = np.abs(m)
            self.std[idx] = s
            count, bins = np.histogram(p, bins=10 ** np.linspace(np.log10(min), np.log10(max), 1000))
            ax.hist(p, bins=10 ** np.linspace(np.log10(min), np.log10(max), 1000))
            ax.set_xscale('log')
        ax.axvline(m, color='tab:orange')
        L = Label(outputs[0])
        ax.set_xlabel(L.english + ' (' + L.units + ')')
        ax.set_ylabel('Count')

        hist = pd.DataFrame()
        count = np.append(count, 0)
        hist['bins'] = predictions[:, 0]

        hist_n = pd.DataFrame()
        hist_n['bins'] = norm_predictions[:, 0]

        figname = 'tempDF' + str(self.working_network) + '.png'
        figname_hist = 'tempDF' + str(self.working_network) + '.csv'
        figname_hist_n = 'tempDF' + str(self.working_network) + '_norm.csv'
        if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
            os.mkdir(os.path.join(os.getcwd(), 'temp'))

        figname = os.path.join(os.getcwd(), 'temp', figname)
        figname_hist = os.path.join(os.getcwd(), 'temp', figname_hist)
        plt.savefig(figname)
        hist.to_csv(figname_hist, index=False)
        hist_n.to_csv(os.path.join(os.getcwd(), 'temp', figname_hist_n), index=False)

        plt.close()

    def distribution_plot(self, absolute_dir: str) -> None:
        """
        Generate distribution plots for all networks.

        Args:
            absolute_dir (str): Directory containing absolute data.
        """
        min_max_log = pd.read_csv(os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])

        for idx in range(self.total_networks):
            self.working_network = idx
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values[0]
            max = network_min_max_log['max'].values[0]
            log = network_min_max_log['log'].values[0]

            p = self.predictions[:, idx, 0]
            m = np.mean(p)
            fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
            # ax.hist returns (counts, bins, patches); unpack all three
            counts, bins, _ = ax.hist(p, bins=10000, range=(min, max))
            ax.axvline(m, color='tab:orange')
            ax.set_xlabel('Prediction')
            ax.set_ylabel('Count')

            hist = pd.DataFrame()
            hist['bins'] = p

            figname = 'tempDF' + str(self.working_network) + '.png'
            hist_figname = 'tempDF' + str(self.working_network) + '.csv'
            if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
                os.mkdir(os.path.join(os.getcwd(), 'temp'))

            figname = os.path.join(os.getcwd(), 'temp', figname)
            hist_figname = os.path.join(os.getcwd(), 'temp', hist_figname)
            plt.savefig(figname)
            hist.to_csv(hist_figname, index=False)
            plt.close()

Base class for managing and training machine learning networks.

This class provides methods for initializing, configuring, and training networks, as well as handling input and output data for machine learning tasks. It also supports subclassing for different types of networks.

Ancestors

Networks
Methods

def combinations(self)
Expand source code
def combinations(self):
    if self.model_settings.permutations_limit is None:
        permutations_limit = self.population
    else:
        permutations_limit = self.model_settings.permutations_limit

    rng = np.random.default_rng()
    indices = rng.choice(range(self.population), permutations_limit)
    permutations = list(itertools.combinations(indices, r=2))
    permutations = [list(f) for f in permutations]

    x = np.zeros((len(self.networks_configured), len(permutations), 2 * self.points))
    y = np.zeros((len(self.networks_configured), len(permutations), 1))

    for idx, p in enumerate(permutations):
        x[:, idx, :self.points] = self.features[:, p[0], :]
        x[:, idx, self.points:] = self.features[:, p[1], :]
        y[:, idx] = self.targets[p[0]] - self.targets[p[1]]

    self.features = copy.deepcopy(x)
    self.targets = copy.deepcopy(y)
    self.population = len(permutations)
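
Unlike the permutation builders further down this page, itertools.combinations yields each unordered index pair once, so this variant produces half as many training rows. A quick check of that ratio, standard library only:

>>> import itertools
>>> len(list(itertools.combinations(range(4), r=2)))
6
>>> len(list(itertools.permutations(range(4), r=2)))
12
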
def confusion_matrix(self, absolute_dir: str | None = None) ‑> numpy.ndarray
Expand source code
def confusion_matrix(self, absolute_dir: Optional[str] = None) -> np.ndarray:
    """
    Generate a confusion matrix for the difference-based networks.

    Args:
        absolute_dir (str, optional): Directory containing absolute data.

    Returns:
        numpy.ndarray: Mean Absolute Percentage Error (MAPE) for the networks.
    """
    self.setup_network_directories()
    self.load_input_vectors()
    MAPE = np.zeros(len(self.networks_configured))
    min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])

    for idx in range(self.total_networks):
        self.working_network = idx
        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values[0]
        max = network_min_max_log['max'].values[0]
        log = network_min_max_log['log'].values[0]
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])

        self.setup_absolute_dataset(absolute_dir)

        outputs = len(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs'])

        s = np.shape(self.absolute_features)

        self.normalised_predicitons = np.zeros((s[0]*2, outputs), dtype=object)

        targets, feature = self.setup_confusion_matrix_features()

        P = Predicting(dir, feature, inputs=2)

        self.normalised_predicitons = P.predict()

        self.absolute_targets = targets
        self.normalised_predicitons = targets - self.normalised_predicitons

        prediction = self.denormalise_prediction_single(absolute_dir)
        targets = self.denormalise_target_single(absolute_dir)

        fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
        plt.xlabel('Target')
        plt.ylabel('Prediction')
        if log == 0:
            plt.hist2d(targets[:,0].ravel(), prediction[:,0].ravel(), bins=np.linspace(min,max,150), cmap='inferno')
        else:
            plt.hist2d(targets[:,0].ravel(), prediction[:,0].ravel(), bins=10**np.linspace(np.log10(min),np.log10(max),150), cmap='inferno')
            plt.xscale('log')
            plt.yscale('log')
        figname = 'tempCF' + str(self.working_network) 
        if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
            os.mkdir(os.path.join(os.getcwd(), 'temp'))
        figname = os.path.join(os.getcwd(), 'temp', figname)
        data = pd.DataFrame()
        data['Target'] = targets[:,0].ravel()
        data['Predicted'] = prediction[:,0].ravel()
        data.to_csv(figname + '.csv', index=False)
        plt.savefig(figname+ '.png')

        MAPE[idx] = np.abs(np.mean(np.abs(targets[:,0].ravel() - prediction[:,0].ravel()) / targets[:,0].ravel()) * 100)
        # diff = np.abs(targets[:, 0].ravel() - prediction[:, 0].ravel())
        # bs = stats.bootstrap((diff, ), np.mean, confidence_level=0.95, n_resamples=100)
        # MAPE[idx] = np.mean((bs.confidence_interval.low, bs.confidence_interval.high)) * 100
        self.MAPE = MAPE
    return MAPE

Generate a confusion matrix for the difference-based networks.

Args

absolute_dir : str, optional
Directory containing absolute data.

Returns

numpy.ndarray
Mean Absolute Percentage Error (MAPE) for the networks.
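
The returned score is a plain mean absolute percentage error between the denormalized targets and predictions. A minimal sketch of the same formula on illustrative arrays (the values here are made up, not from the library):

import numpy as np

# Illustrative target/prediction pairs (hypothetical values)
targets = np.array([1.0e-5, 2.0e-5, 4.0e-5])
predictions = np.array([1.1e-5, 1.8e-5, 4.4e-5])

# Same expression as in confusion_matrix: mean of |error| / target, in percent
mape = np.abs(np.mean(np.abs(targets - predictions) / targets) * 100)
print(round(mape, 1))  # 10.0
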
def denormalise_prediction_single(self, absolute_dir: str) ‑> numpy.ndarray
Expand source code
def denormalise_prediction_single(self, absolute_dir: str) -> np.ndarray:
    """
    Denormalize a single prediction to its original scale.

    Args:
        absolute_dir (str): Directory containing absolute data.

    Returns:
        numpy.ndarray: Denormalized prediction.
    """
    min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])
    predictions = np.zeros((len(self.absolute_targets), 10))

    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    num_outputs = len(outputs)
    network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
    min = network_min_max_log['min'].values
    max = network_min_max_log['max'].values
    log = network_min_max_log['log'].values[0]

    for jdx in range(num_outputs):
        for kdx in range(len(self.absolute_targets)):
            if not log:
                predictions[kdx, jdx] = self.denormalise_linear(self.normalised_predicitons[kdx, jdx], min[jdx],
                                                                max[jdx])
            else:
                predictions[kdx, jdx] = self.denormalise_log(self.normalised_predicitons[kdx, jdx], min[jdx],
                                                             max[jdx])

    return predictions

Denormalize a single prediction to its original scale.

Args

absolute_dir : str
Directory containing absolute data.

Returns

numpy.ndarray
Denormalized prediction.
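
The denormalise_linear and denormalise_log helpers invert the min-max scaling recorded in min_max.csv. Their bodies are not reproduced on this page, but denormalise_predictions applies the same transforms inline, so a consistent sketch looks like this (an illustration, not the canonical implementation):

import numpy as np

def denormalise_linear(x, vmin, vmax):
    # Invert linear min-max scaling: x in [0, 1] -> [vmin, vmax]
    return x * (vmax - vmin) + vmin

def denormalise_log(x, vmin, vmax):
    # Invert min-max scaling applied in log10 space
    return 10 ** (x * (np.log10(vmax) - np.log10(vmin)) + np.log10(vmin))

print(denormalise_log(0.5, 1e-6, 1e-4))  # 1e-05, halfway across two decades
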
def denormalise_predictions(self, absolute_dir: str) ‑> None
Expand source code
def denormalise_predictions(self, absolute_dir: str) -> None:
    """
    Denormalize predictions to their original scale.

    Args:
        absolute_dir (str): Directory containing absolute data.
    """
    min_max_log = pd.read_csv(os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None,
                              sep=' ', names=['param', 'min', 'max', 'log'])
    predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))
    normalised_predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))

    for idx in range(self.total_networks):
        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        num_outputs = len(outputs)
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values
        max = network_min_max_log['max'].values
        log = network_min_max_log['log'].values

        for jdx in range(num_outputs):
            for kdx in range(len(self.absolute_targets)):
                # Index the per-output log flag; truth-testing the whole
                # array is ambiguous when there is more than one output.
                if not log[jdx]:
                    predictions[kdx, idx, jdx] = self.normalised_predicitons[kdx, jdx] * (max[jdx] - min[jdx]) + \
                                                 min[jdx]

                else:
                    predictions[kdx, idx, jdx] = self.normalised_predicitons[kdx, jdx] * (
                                np.log10(max[jdx]) - np.log10(min[jdx])) + np.log10(min[jdx])
                    predictions[kdx, idx, jdx] = 10 ** predictions[kdx, idx, jdx]
        self.predictions = predictions

Denormalize predictions to their original scale.

Args

absolute_dir : str
Directory containing absolute data.
def denormalise_target_single(self, absolute_dir: str) ‑> numpy.ndarray
Expand source code
def denormalise_target_single(self, absolute_dir: str) -> np.ndarray:
    """
    Denormalize a single target to its original scale.

    Args:
        absolute_dir (str): Directory containing absolute data.

    Returns:
        numpy.ndarray: Denormalized target.
    """
    min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])
    predictions = np.zeros((len(self.absolute_targets), 10))

    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    num_outputs = len(outputs)
    network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
    min = network_min_max_log['min'].values
    max = network_min_max_log['max'].values
    log = network_min_max_log['log'].values[0]

    for jdx in range(num_outputs):
        for kdx in range(len(self.absolute_targets)):
            if not log:
                predictions[kdx, jdx] = self.denormalise_linear(self.absolute_targets[kdx, jdx], min[jdx], max[jdx])
            else:
                predictions[kdx, jdx] = self.denormalise_log(self.absolute_targets[kdx, jdx], min[jdx], max[jdx])

    return predictions

Denormalize a single target to its original scale.

Args

absolute_dir : str
Directory containing absolute data.

Returns

numpy.ndarray
Denormalized target.
def distribution_plot(self, absolute_dir: str) ‑> None
Expand source code
def distribution_plot(self, absolute_dir: str) -> None:
    """
    Generate distribution plots for all networks.

    Args:
        absolute_dir (str): Directory containing absolute data.
    """
    min_max_log = pd.read_csv(os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])

    for idx in range(self.total_networks):
        self.working_network = idx
        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values[0]
        max = network_min_max_log['max'].values[0]
        log = network_min_max_log['log'].values[0]

        p = self.predictions[:, idx, 0]
        m = np.mean(p)
        fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
        # ax.hist returns (counts, bins, patches); unpack all three
        counts, bins, _ = ax.hist(p, bins=10000, range=(min, max))
        ax.axvline(m, color='tab:orange')
        ax.set_xlabel('Prediction')
        ax.set_ylabel('Count')

        hist = pd.DataFrame()
        hist['bins'] = p

        figname = 'tempDF' + str(self.working_network) + '.png'
        hist_figname = 'tempDF' + str(self.working_network) + '.csv'
        if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
            os.mkdir(os.path.join(os.getcwd(), 'temp'))

        figname = os.path.join(os.getcwd(), 'temp', figname)
        hist_figname = os.path.join(os.getcwd(), 'temp', hist_figname)
        plt.savefig(figname)
        hist.to_csv(hist_figname, index=False)
        plt.close()

Generate distribution plots for all networks.

Args

absolute_dir : str
Directory containing absolute data.
def distribution_plot_single(self, absolute_dir: str, idx: int, predictions: numpy.ndarray, norm_predictions: numpy.ndarray) ‑> None
Expand source code
def distribution_plot_single(self, absolute_dir: str, idx: int, predictions: np.ndarray, norm_predictions: np.ndarray) -> None:
    """
    Generate a distribution plot for a single network.

    Args:
        absolute_dir (str): Directory containing absolute data.
        idx (int): Index of the network.
        predictions (numpy.ndarray): Denormalized predictions to plot.
        norm_predictions (numpy.ndarray): Normalized predictions saved alongside the plot data.
    """
    min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])

    self.working_network = idx
    outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
    network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
    min = network_min_max_log['min'].values[0]
    max = network_min_max_log['max'].values[0]
    log = network_min_max_log['log'].values[0]

    p = predictions[:, 0]

    if self.working_network == 0:
        self.mean = np.zeros(self.total_networks)
        self.std = np.zeros(self.total_networks)

    fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
    if log == 0:
        m = np.mean(p)
        s = np.std(p)

        self.mean[idx] = np.abs(m)
        self.std[idx] = s 
        count, bins = np.histogram(p, bins=np.linspace(min, max, 1000) )
        ax.hist(np.abs(p), bins=np.linspace(min, max, 1000))
    else:
        m = np.mean(p)
        s = np.std(p)
        self.mean[idx] = np.abs(m)
        self.std[idx] = s
        count, bins = np.histogram(p, bins=10 ** np.linspace(np.log10(min), np.log10(max), 1000))
        ax.hist(p, bins=10 ** np.linspace(np.log10(min), np.log10(max), 1000))
        ax.set_xscale('log')
    ax.axvline(m, color='tab:orange')
    L = Label(outputs[0])
    ax.set_xlabel(L.english + ' (' + L.units + ')')
    ax.set_ylabel('Count')

    hist = pd.DataFrame()
    count = np.append(count, 0)
    hist['bins'] = predictions[:, 0]

    hist_n = pd.DataFrame()
    hist_n['bins'] = norm_predictions[:, 0]

    figname = 'tempDF' + str(self.working_network) + '.png'
    figname_hist = 'tempDF' + str(self.working_network) + '.csv'
    figname_hist_n = 'tempDF' + str(self.working_network) + '_norm.csv'
    if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
        os.mkdir(os.path.join(os.getcwd(), 'temp'))

    figname = os.path.join(os.getcwd(), 'temp', figname)
    figname_hist = os.path.join(os.getcwd(), 'temp', figname_hist)
    plt.savefig(figname)
    hist.to_csv(figname_hist, index=False)
    hist_n.to_csv(os.path.join(os.getcwd(), 'temp', figname_hist_n), index=False)

    plt.close()

Generate a distribution plot for a single network.

Args

absolute_dir : str
Directory containing absolute data.
idx : int
Index of the network.
predictions : numpy.ndarray
Denormalized predictions to plot.
norm_predictions : numpy.ndarray
Normalized predictions saved alongside the plot data.
def generate_uniform_distribution(self) ‑> None
Expand source code
def generate_uniform_distribution(self) -> None:
    """
    Generate a uniform distribution of the absolute dataset.
    """
    kde = stats.gaussian_kde(self.absolute_targets.ravel())
    density = kde(self.absolute_targets.ravel())
    density = self.normalise_linear(density, np.min(density), np.max(density))
    density = 1 - density
    density = density / np.sum(density)
    indices = np.arange(len(density))
    normal = random.choices(indices, weights=density, k=int(len(self.absolute_targets) / 2))
    self.absolute_targets = self.absolute_targets[normal]
    self.absolute_features = self.absolute_features[normal]

Generate a uniform distribution of the absolute dataset.
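
The resampling idea: estimate the density of the targets with a Gaussian KDE, invert and renormalize it into sampling weights, then redraw half the dataset so over-represented target values are down-weighted. A standalone sketch of the same scheme, where the inline min-max expression stands in for the class's normalise_linear helper:

import random
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
targets = rng.normal(0.5, 0.1, size=1000)            # strongly clustered targets

kde = stats.gaussian_kde(targets)
density = kde(targets)
density = (density - density.min()) / (density.max() - density.min())
weights = 1 - density                                 # rare values get high weight
weights = weights / weights.sum()

picked = random.choices(range(len(targets)), weights=weights, k=len(targets) // 2)
flatter = targets[picked]                             # distribution is now flatter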

def permutations(self)
Expand source code
def permutations(self):
    features = self.features
    targets = self.targets
    if self.model_settings.permutations_limit is None:
        permutations_limit = len(targets)
    else:
        permutations_limit = self.model_settings.permutations_limit

    indices = random.choices(range(len(targets)), k=permutations_limit)
    self.indices = indices
    permutations = list(itertools.permutations(indices, r=2))
    permutations = np.array([list(f) for f in permutations], dtype=np.uint32)
    self.permutations_list = permutations

    keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
    keys = list(keys)
    l = 0
    for key in keys:
        l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))

    num_inputs = l

    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    num_outputs = len(outputs)

    x = np.zeros((len(permutations), 2 * num_inputs))
    y = np.zeros((len(permutations), num_outputs))
    for idx, p in enumerate(permutations):
        x[idx, :num_inputs] = features[p[0]]
        x[idx, num_inputs:] = features[p[1]]
        y[idx] = targets[p[0]] - targets[p[1]]

    self.features = x  # copy.deepcopy(x)
    self.targets = y  # copy.deepcopy(y)
    self.population = len(permutations)
    return features, targets
def permutations_lowRAM(self)
Expand source code
def permutations_lowRAM(self):
    # features = self.features
    # targets = self.targets
    permutations_limit = self.model_settings.permutations_limit

    indices = random.choices(range(1000), k=1000)
    self.indices = indices

    permutations = list(itertools.permutations(indices, r=2))
    permutations = np.array([list(f) for f in permutations], dtype=np.uint32)

    # Tile the within-block index pairs across 40 blocks of 1000 samples,
    # offsetting each copy so every chunk of the dataset contributes pairs.
    # (The original loop started at 1 and left the final block zero-filled.)
    Thousands = 40
    p = np.zeros((len(permutations) * Thousands, 2), dtype=np.uint32)
    step = np.shape(permutations)[0]
    for idx in range(Thousands):
        p[step * idx:step * (idx + 1), :] = permutations + (1000 * idx)
    permutations = p

    if permutations_limit is not None and permutations_limit < np.shape(permutations)[0]:
        permutations = permutations[:permutations_limit]
    self.permutations_list = permutations

    keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
    keys = list(keys)
    l = 0
    for key in keys:
        l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))

    num_inputs = l

    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    num_outputs = len(outputs)

    x = np.zeros((len(permutations), 2 * num_inputs))
    y = np.zeros((len(permutations), num_outputs))
    for idx, p in enumerate(permutations):
        x[idx, :num_inputs] = self.features[p[0]]
        x[idx, num_inputs:] = self.features[p[1]]
        y[idx] = self.targets[p[0]] - self.targets[p[1]]

    self.features = x  # copy.deepcopy(x)
    self.targets = y  # copy.deepcopy(y)
    self.population = len(permutations)

    return self.features, self.targets
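
The low-RAM variant keeps memory bounded by pairing indices only within 1000-sample blocks and then shifting the same pair template across blocks, instead of materializing ordered pairs over the whole dataset at once. The offset scheme in miniature:

>>> import numpy as np
>>> base = np.array([[0, 1], [1, 0]], dtype=np.uint32)  # pairs within one block
>>> np.concatenate([base + 1000 * i for i in range(3)])
array([[   0,    1],
       [   1,    0],
       [1000, 1001],
       [1001, 1000],
       [2000, 2001],
       [2001, 2000]], dtype=uint32)
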
def permutations_normal_distribution(self, features: numpy.ndarray, targets: numpy.ndarray) ‑> Tuple[numpy.ndarray, numpy.ndarray]
Expand source code
def permutations_normal_distribution(self, features: np.ndarray, targets: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    """
    Generate permutations of features and targets with a normal distribution.

    Args:
        features (numpy.ndarray): Input features.
        targets (numpy.ndarray): Target values.

    Returns:
        tuple: Features and targets with generated permutations.
    """
    if self.model_settings.permutations_limit is None:
        permutations_limit = len(targets)
    else:
        permutations_limit = self.model_settings.permutations_limit

    # Note: the draw below uses a fixed sample of 300 indices; the
    # permutations_limit computed above is not applied at this step.
    indices = self.rng.choice(range(len(targets)), 300)

    permutations = list(itertools.permutations(indices, r=2))
    permutations = np.array([list(f) for f in permutations], dtype=np.uint32)

    keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
    keys = list(keys)
    l = 0
    for key in keys:
        l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))

    num_inputs = l

    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    num_outputs = len(outputs)

    x = np.zeros((len(permutations), 2 * num_inputs))
    y = np.zeros((len(permutations), num_outputs))
    for idx, p in enumerate(permutations):
        x[idx, :num_inputs] = features[p[0]]
        x[idx, num_inputs:] = features[p[1]]
        y[idx] = targets[p[0]] - targets[p[1]]

    features = x
    targets = y.ravel()

    self.population = len(permutations)
    return features, targets

Generate permutations of features and targets with a normal distribution.

Args

features : numpy.ndarray
Input features.
targets : numpy.ndarray
Target values.

Returns

tuple
Features and targets with generated permutations.
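
All of the permutation builders share the same pairing step: for each ordered index pair (i, j), the two feature vectors are concatenated and the target becomes targets[i] - targets[j]. On toy arrays (values are illustrative only):

import itertools
import numpy as np

features = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
targets = np.array([1.0, 2.0, 4.0])

pairs = list(itertools.permutations(range(len(targets)), r=2))

num_inputs = features.shape[1]
x = np.zeros((len(pairs), 2 * num_inputs))
y = np.zeros(len(pairs))
for idx, (i, j) in enumerate(pairs):
    x[idx, :num_inputs] = features[i]   # first half of the row: sample i
    x[idx, num_inputs:] = features[j]   # second half of the row: sample j
    y[idx] = targets[i] - targets[j]    # difference target

# 3 samples give 6 ordered pairs: x has shape (6, 4), y has shape (6,)
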
def predict(self, absolute_dir: str, *experimental_features) ‑> None
Expand source code
def predict(self, absolute_dir: str, *experimental_features) -> None:
    """
    Predict outputs for given experimental features using difference-based models.

    Args:
        absolute_dir (str): Directory containing absolute data.
        *experimental_features: One experimental feature, or a list of experimental features, to predict outputs for.
    """
    experimental_features = experimental_features[0]
    self.setup_network_directories()
    # Gather Input Vectors Used
    self.load_input_vectors()

    exps = []
    if type(experimental_features) is list:
        for idx, experimental_feature in enumerate(experimental_features):
            experimental_feature = copy.deepcopy(experimental_feature)
            # Samples Experimental Feature
            experimental_feature = self.sample_experimental_features(experimental_feature)

            # Normalises Experimental Feature
            experimental_feature = self.normalise_experimental_features(experimental_feature, dir=None, idx=idx)
            exps.append(experimental_feature)
    else:
        experimental_features = copy.deepcopy(experimental_features)
        # Samples Experimental Feature
        experimental_features = self.sample_experimental_features(experimental_features)

        # Normalises Experimental Feature
        experimental_features = self.normalise_experimental_features(experimental_features, dir=None, idx=None)
        exps.append(experimental_features)
        exps = exps[0]

    for idx in range(self.total_networks):
        self.working_network = idx
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])

        # Gathers and sets up secondary dataset
        self.setup_absolute_dataset(absolute_dir)

        # Gathers the number of outputs
        outputs = len(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs'])

        s = np.shape(self.absolute_features)

        # Setup Prediction Array
        self.normalised_predicitons = np.zeros((s[0], outputs), dtype=object)

        # Pairs experimental feature with secondary dataset
        feature = self.setup_experimental_feature(exps)

        # Sets up Network
        P = Predicting(dir, feature, inputs=self.len_exp_features*2)

        # Generates Normalised Predictions
        self.normalised_predicitons = P.predict()

        self.turn_predictions_absolute()

        # Allocate the prediction array once, on the first network, so
        # earlier networks' predictions are not overwritten on later passes
        if idx == 0:
            self.predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))

        prediction = self.denormalise_prediction_single(absolute_dir)

        self.predictions[:, idx, :] = prediction[:]
        self.distribution_plot_single(absolute_dir, idx, prediction, self.normalised_predicitons)

Predict outputs for given experimental features using difference-based models.

Args

absolute_dir : str
Directory containing absolute data.
*experimental_features
One experimental feature, or a list of experimental features, to predict outputs for.
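
A hedged usage sketch in the style of the module-level examples, assuming the factory registers this subclass by its network type and that the experimental feature object exposes the .y vector the pairing step expects (the names here are illustrative):

>>> networks = Networks.initialise('networks_dir/', 'Difference', model_settings)
>>> networks.predict('absolute_dir/', experimental_curve)
>>> networks.predictions[:, 0, 0]  # denormalized predictions for the first network
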
def renormalise(self, absolute_v: pandas.core.frame.DataFrame, absolute_dir: str) ‑> pandas.core.frame.DataFrame
Expand source code
def renormalise(self, absolute_v: pd.DataFrame, absolute_dir: str) -> pd.DataFrame:
    """
    Renormalize absolute values to match the network's configuration.

    Args:
        absolute_v (pandas.DataFrame): Absolute values to renormalize.
        absolute_dir (str): Directory containing absolute data.

    Returns:
        pandas.DataFrame: Renormalized values.
    """
    network_min_max_log_dir = os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv')
    network_min_max_log = pd.read_csv(network_min_max_log_dir, header=None, sep=' ',
                                      names=['param', 'min', 'max', 'log'])

    absolute_min_max_log_dir = os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv')
    absolute_min_max_log = pd.read_csv(absolute_min_max_log_dir, header=None, sep=' ',
                                       names=['param', 'min', 'max', 'log'])

    temp = absolute_min_max_log[absolute_min_max_log['param'] == self.outputs[0]]

    absolute_min = temp['min'].to_numpy()[0]
    absolute_max = temp['max'].to_numpy()[0]
    absolute_log = temp['log'].to_numpy()[0]

    if absolute_log == 1:
        absolute_v[self.outputs[0]] = self.denormalise_log(absolute_v[self.outputs[0]], absolute_min, absolute_max)
    else:
        absolute_v[self.outputs[0]] = self.denormalise_linear(absolute_v[self.outputs[0]], absolute_min, absolute_max)

    temp = network_min_max_log[network_min_max_log['param'] == self.outputs[0]]
    network_min = temp['min'].to_numpy()[0]
    network_max = temp['max'].to_numpy()[0]
    network_log = temp['log'].to_numpy()[0]

    if network_log == 1:
        absolute_v[self.outputs[0]] = self.normalise_log(absolute_v[self.outputs[0]], network_min, network_max)
    else:
        absolute_v[self.outputs[0]] = self.normalise_linear(absolute_v[self.outputs[0]], network_min, network_max)

    vecs = []
    for f in absolute_v.columns:
        if 'vec' in f:
            if 'light' in f:
                vecs.append(f)

    for vec in vecs:
        temp = absolute_min_max_log[absolute_min_max_log['param'] == vec]
        absolute_min = temp['min'].to_numpy()[0]
        absolute_max = temp['max'].to_numpy()[0]
        absolute_log = temp['log'].to_numpy()[0]

        if absolute_log == 1:
            absolute_v[vec] = self.denormalise_log(absolute_v[vec], absolute_min, absolute_max)
        else:
            absolute_v[vec] = self.denormalise_linear(absolute_v[vec], absolute_min, absolute_max)

        temp = network_min_max_log[network_min_max_log['param'] == vec]
        network_min = temp['min'].to_numpy()[0]
        network_max = temp['max'].to_numpy()[0]
        network_log = temp['log'].to_numpy()[0]

        if network_log == 1:
            absolute_v[vec] = self.normalise_log(absolute_v[vec], network_min, network_max)
        else:
            absolute_v[vec] = self.normalise_linear(absolute_v[vec], network_min, network_max)

    return absolute_v

Renormalize absolute values to match the network's configuration.

Args

absolute_v : pandas.DataFrame
Absolute values to renormalize.
absolute_dir : str
Directory containing absolute data.

Returns

pandas.DataFrame
Renormalized values.
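
Renormalisation is a two-step chain: undo the absolute dataset's scaling to recover physical units, then re-apply the network's own scaling. A self-contained sketch for one log-scaled column, where denormalise_log mirrors the transform shown in denormalise_predictions and normalise_log is its assumed inverse:

import numpy as np

def denormalise_log(x, vmin, vmax):
    return 10 ** (x * (np.log10(vmax) - np.log10(vmin)) + np.log10(vmin))

def normalise_log(x, vmin, vmax):
    return (np.log10(x) - np.log10(vmin)) / (np.log10(vmax) - np.log10(vmin))

v = 0.5                                  # normalized under the absolute ranges
raw = denormalise_log(v, 1e-6, 1e-4)     # back to physical units: 1e-05
v_net = normalise_log(raw, 1e-7, 1e-3)   # rescaled to the network's ranges
print(v_net)                             # 0.5 (the two ranges happen to agree here)
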
def setup_absolute_dataset(self, absolute_dir: str) ‑> None
Expand source code
def setup_absolute_dataset(self, absolute_dir: str) -> None:
    """
    Set up the absolute dataset for predictions.

    Args:
        absolute_dir (str): Directory containing absolute data.
    """
    absolute_vectors_dir = os.path.join(absolute_dir, 'faster', 'vectors.csv')

    inputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']

    self.inputs = inputs
    self.outputs = outputs

    f = open(absolute_vectors_dir, 'r')
    v = pd.read_csv(f, delimiter=' ')
    self.vectors = self.renormalise(v, absolute_dir)
    f.close()

    col = self.vectors.columns.to_list()
    inp = [i + '.vec' for i in self.inputs]
    V = []
    for i in inp:
        vecs = np.where(np.char.find(np.char.lower(col), i) > -1)[0]
        V = np.append(V, vecs)
    col = [col[int(x)] for x in V]
    self.absolute_targets = self.vectors[outputs].to_numpy()
    self.absolute_features = self.vectors[col].to_numpy()
    self.absolute_features = self.absolute_features[:]
    self.absolute_targets = self.absolute_targets[:]

    #self.generate_uniform_distribution()

Set up the absolute dataset for predictions.

Args

absolute_dir : str
Directory containing absolute data.
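
Feature columns are picked out of vectors.csv by substring matching on '<input>.vec'. A toy illustration of that selection logic (the column names are hypothetical):

import numpy as np
import pandas as pd

vectors = pd.DataFrame({
    'mobility': [0.1, 0.2],                 # output column
    'light.vec0': [0.3, 0.4],               # input vector columns
    'light.vec1': [0.5, 0.6],
})
inputs, outputs = ['light'], ['mobility']

col = vectors.columns.to_list()
inp = [i + '.vec' for i in inputs]
V = []
for i in inp:
    # indices of columns whose lower-cased name contains '<input>.vec'
    vecs = np.where(np.char.find(np.char.lower(col), i) > -1)[0]
    V = np.append(V, vecs)
col = [col[int(x)] for x in V]

absolute_targets = vectors[outputs].to_numpy()   # shape (2, 1)
absolute_features = vectors[col].to_numpy()      # shape (2, 2)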
def setup_confusion_matrix_features(self) ‑> Tuple[numpy.ndarray, numpy.ndarray]
Expand source code
def setup_confusion_matrix_features(self) -> Tuple[np.ndarray, np.ndarray]:
    """
    Set up features for generating a confusion matrix.

    Returns:
        tuple: Targets and features for the confusion matrix.
    """
    absolute_features = self.absolute_features
    absolute_targets = self.absolute_targets
    x_features = absolute_features
    x_targets = absolute_targets[::-1]
    y_features = absolute_features
    y_targets = absolute_targets[::-1]

    features1 = np.concatenate((x_features, y_features), axis=1)
    features2 = np.concatenate((y_features, x_features), axis=1)
    features = np.concatenate((features1, features2), axis=0)

    targets = np.concatenate((x_targets, y_targets), axis=0)
    return targets, features

Set up features for generating a confusion matrix.

Returns

tuple
Targets and features for the confusion matrix.
def setup_experimental_feature(self, experimental_features) ‑> numpy.ndarray
Expand source code
def setup_experimental_feature(self, experimental_features) -> np.ndarray:
    """
    Pair experimental features with the absolute dataset.

    Tiles each experimental curve against every row of the absolute feature
    set and stacks both orderings, so the difference network sees each
    (simulation, experiment) pair in both directions.
    """
    absolute_features = self.absolute_features
    try:
        len_exp_features = len(experimental_features)
        self.len_exp_features = int(len_exp_features / 2)
        len_exp0 = len(experimental_features[0].y)
    except (TypeError, AttributeError):  # a single feature rather than a list
        len_exp_features = 1
        self.len_exp_features = 1
        len_exp0 = len(experimental_features.y)
    match len_exp_features:
        case 1:
            y = np.tile(experimental_features.y, (len(absolute_features),1))
            y = np.reshape(y, np.shape(absolute_features))

            features1 = np.concatenate((absolute_features, y), axis=1)
            features2 = np.concatenate((y, absolute_features), axis=1)
            features = np.concatenate((features1, features2), axis=0)
        case 2:
            abf_1 = absolute_features[:, 0:len_exp0]
            abf_2 = absolute_features[:, len_exp0:len_exp0*2]

            y1 = np.tile(experimental_features[0].y, (len(abf_1), 1))
            y1 = np.reshape(y1, np.shape(abf_1))
            y2 = np.tile(experimental_features[1].y, (len(abf_2), 1))
            y2 = np.reshape(y2, np.shape(abf_2))

            abf = np.concatenate((abf_1, abf_2), axis=1)
            abf_r = np.concatenate((abf_2, abf_1), axis=1)
            abf = np.concatenate((abf, abf_r), axis=0)

            y = np.concatenate((y1, y2), axis=1)
            y_r = np.concatenate((y2, y1), axis=1)
            y = np.concatenate((y, y_r), axis=0)

            features = np.concatenate((abf, y), axis=1)

            # features1 = np.concatenate((abf_1, y1), axis=1)
            # features1_r = np.concatenate((y1, abf_1), axis=1)
            # features1 = np.concatenate((features1, features1_r), axis=0)
            # features2 = np.concatenate((abf_2, y2), axis=1)
            # features2_r = np.concatenate((y2, abf_2), axis=1)
            # features2 = np.concatenate((features2, features2_r), axis=0)
            # features = np.concatenate((features1, features2), axis=1)
        case 4:
            abf_1 = absolute_features[:, 0:len_exp0]
            abf_2 = absolute_features[:, len_exp0:len_exp0*2]
            abf_3 = absolute_features[:, len_exp0*2:len_exp0*3]
            abf_4 = absolute_features[:, len_exp0*3:len_exp0*4]

            y1 = np.tile(experimental_features[0].y, (len(abf_1), 1))
            y1 = np.reshape(y1, np.shape(abf_1))
            y2 = np.tile(experimental_features[1].y, (len(abf_2), 1))
            y2 = np.reshape(y2, np.shape(abf_2))
            y3 = np.tile(experimental_features[2].y, (len(abf_3), 1))
            y3 = np.reshape(y3, np.shape(abf_3))
            y4 = np.tile(experimental_features[3].y, (len(abf_4), 1))
            y4 = np.reshape(y4, np.shape(abf_4))

            features1 = np.concatenate((abf_1, y1), axis=1)
            features1_r = np.concatenate((y1, abf_1), axis=1)
            features1 = np.concatenate((features1, features1_r), axis=0)
            features2 = np.concatenate((abf_2, y2), axis=1)
            features2_r = np.concatenate((y2, abf_2), axis=1)
            features2 = np.concatenate((features2, features2_r), axis=0)
            features3 = np.concatenate((abf_3, y3), axis=1)
            features3_r = np.concatenate((y3, abf_3), axis=1)
            features3 = np.concatenate((features3, features3_r), axis=0)
            features4 = np.concatenate((abf_4, y4), axis=1)
            features4_r = np.concatenate((y4, abf_4), axis=1)
            features4 = np.concatenate((features4, features4_r), axis=0)

            features = np.concatenate((features1, features2, features3, features4), axis=1)
    return features
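
For a single experimental input, the pairing reduces to tiling the experimental curve against every absolute row and stacking both orderings. A toy illustration of the shape logic:

import numpy as np

absolute_features = np.arange(12.0).reshape(4, 3)   # 4 samples, 3 points each
exp_y = np.array([9.0, 9.5, 10.0])                  # one experimental curve

y = np.tile(exp_y, (len(absolute_features), 1))     # repeat the curve per sample

features1 = np.concatenate((absolute_features, y), axis=1)  # (simulation, experiment)
features2 = np.concatenate((y, absolute_features), axis=1)  # (experiment, simulation)
features = np.concatenate((features1, features2), axis=0)   # shape (8, 6)
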
def train(self, idx=None)
Expand source code
def train(self, idx=None):
    self.setup_network_directories()
    
    if idx is None:
        for idx in range(self.total_networks):
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks()
    else:
        self.working_network = idx
        self.load_training_dataset()
        self.permutations_lowRAM()
        self.separate_training_dataset()
        self.train_networks('Residual')
def train_existing(self, idx=None)
Expand source code
def train_existing(self, idx=None):
    self.setup_network_directories()
    
    if idx is None:
        for idx in range(self.total_networks):
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks_existing()
    else:
        self.working_network = idx
        self.load_training_dataset()
        self.permutations_lowRAM()
        self.separate_training_dataset()
        self.train_networks_existing('Residual')
def tune(self)
Expand source code
def tune(self):
    self.setup_network_directories()

    for idx in range(self.total_networks):
        self.working_network = idx

        self.load_training_dataset()
        self.permutations()
        self.separate_training_dataset()
        self.tune_networks()
def turn_predictions_absolute(self) ‑> None
Expand source code
def turn_predictions_absolute(self) -> None:
    """
    Convert predictions to absolute values.
    """
    origin = self.absolute_targets
    l = len(self.absolute_targets)
    self.normalised_predicitons[:l] = origin - self.normalised_predicitons[:l]
    self.normalised_predicitons[l:] = self.normalised_predicitons[l:] - origin

Convert predictions to absolute values.
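
Since the network predicts differences, absolute values are recovered against the known targets: the first half of the prediction array becomes target - difference and the mirrored half difference - target. The arithmetic on toy numbers (the pair-ordering comments are an interpretation based on setup_experimental_feature):

import numpy as np

origin = np.array([0.6, 0.7])             # known normalized targets
preds = np.array([0.1, 0.2, 0.5, 0.4])    # predicted differences, both orderings

l = len(origin)
preds[:l] = origin - preds[:l]            # first ordering: t - d
preds[l:] = preds[l:] - origin            # mirrored ordering: d - t
print(preds)                              # [ 0.5  0.5 -0.1 -0.3]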

Inherited members