Module src.PyOghma_ML.Networks
Neural Network Management and Training Framework for PyOghma_ML
This module provides a comprehensive framework for managing and training machine learning networks specifically designed for organic photovoltaic device analysis. It implements multiple network architectures, training strategies, and evaluation methods tailored for scientific data analysis.
Key Components
Networks (Base Class): Core functionality for network management, data loading, and training coordination. Provides factory methods and subclass registration for different network types.
Point Networks: Single-point prediction models for device parameter estimation, optimized for predicting individual device characteristics.
Ensemble Networks: Multi-model aggregation systems that combine several models to reduce variance and improve generalization and prediction accuracy.
Difference Networks: Specialized models for comparative analysis between experimental and predicted data, useful for error analysis and model validation.
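To make the difference approach concrete, the sketch below (toy data only; nothing here is the module's actual pipeline) builds training pairs the way the Difference class's permutations method does: two feature vectors are concatenated, and the target is the difference of their labels.

import itertools
import numpy as np

# Toy stand-ins for the real simulation features/targets.
features = np.random.rand(4, 3)   # 4 devices, 3 input points each
targets = np.random.rand(4, 1)    # 1 output parameter per device

pairs = list(itertools.permutations(range(len(targets)), r=2))
x = np.zeros((len(pairs), 2 * features.shape[1]))
y = np.zeros((len(pairs), targets.shape[1]))
for idx, (a, b) in enumerate(pairs):
    x[idx, :features.shape[1]] = features[a]   # first device's curve
    x[idx, features.shape[1]:] = features[b]   # second device's curve
    y[idx] = targets[a] - targets[b]           # the model learns the difference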
Features
- Automatic data loading and preprocessing from simulation directories
- Support for multiple input architectures (single, dual, quad, octal inputs)
- Integrated hyperparameter tuning with Keras Tuner
- Comprehensive model evaluation and statistical analysis
- Automatic report generation with performance metrics
- Support for both new training and continued training from existing models
- Memory-efficient data handling for large datasets
Architecture Support
- Sequential dense networks with configurable depth
- Residual networks with skip connections
- Multi-input networks for complex feature interactions
- Ensemble methods with model aggregation
Training Features
- Adaptive learning rate scheduling
- Early stopping with patience
- Batch normalization and dropout regularization
- Custom loss functions and metrics
- Data augmentation and permutation strategies
A minimal sketch of these building blocks follows.
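The concrete architectures are assembled by the Training module; purely as a rough illustration of the building blocks listed above (dense layers, batch normalization, dropout, early stopping), a minimal Keras sketch might look like this. All layer sizes and hyperparameters are placeholders, not the module's defaults.

import tensorflow as tf

def build_dense_network(num_inputs: int, num_outputs: int) -> tf.keras.Model:
    """Placeholder dense regressor; real architectures come from the Training module."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(num_inputs,)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(num_outputs),
    ])
    model.compile(optimizer='adam', loss='mae')
    return model

# Early stopping with patience, as listed above.
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10,
                                              restore_best_weights=True)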
Integration
- Seamless integration with Input module for data loading
- Automatic interfacing with Training module for model fitting
- Built-in Predicting capabilities for inference
- Direct connection to Output module for result visualization
Performance Optimization
- GPU acceleration support
- Efficient memory management for large datasets
- Parallel training for ensemble methods
- Optimized data pipelines for fast iteration
Example Usage
>>> # Initialize and train a Point network
>>> networks = Networks.initialise('simulation_data/', 'Point', model_settings)
>>> networks.train_networks()
>>>
>>> # Perform hyperparameter tuning
>>> networks.tune_networks()
>>>
>>> # Generate predictions and analysis
>>> predictions = networks.predict_experimental_data(experimental_inputs)
Note
This module requires TensorFlow/Keras for neural network operations and integrates closely with the OghmaNano simulation framework for data compatibility.
Classes
class Difference (networks_dir: str, model_settings: Model_Settings | None = None)
-
Expand source code
class Difference(Networks):
    """
    Subclass of Networks for difference-based models.

    This class provides methods specific to difference-based models,
    including training and generating permutations.
    """

    _network_type = 'difference'

    def __init__(self, networks_dir: str, model_settings: Optional[Model_Settings] = None):
        """
        Initialize a Difference instance.

        Args:
            networks_dir (str): Directory containing network configurations.
            model_settings (Model_Settings, optional): Settings for the model.
        """
        self.networks_dir = networks_dir
        self.model_settings = model_settings
        self.rng = np.random.default_rng()

    def denormalise_predictions(self, absolute_dir):
        min_max_log = pd.read_csv(os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                                  header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))
        normalised_predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))
        for idx in range(self.total_networks):
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            num_outputs = len(outputs)
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values
            max = network_min_max_log['max'].values
            log = network_min_max_log['log'].values
            for jdx in range(num_outputs):
                for kdx in range(len(self.absolute_targets)):
                    if not log:
                        predictions[kdx, idx, jdx] = (self.normalised_predicitons[kdx, jdx]
                                                      * (max[jdx] - min[jdx]) + min[jdx])
                    else:
                        predictions[kdx, idx, jdx] = (self.normalised_predicitons[kdx, jdx]
                                                      * (np.log10(max[jdx]) - np.log10(min[jdx]))
                                                      + np.log10(min[jdx]))
                        predictions[kdx, idx, jdx] = 10 ** predictions[kdx, idx, jdx]
        self.predictions = predictions

    # The remaining methods are shown individually under "Methods" below.
Subclass of Networks for difference-based models.
This class provides methods specific to difference-based models, including training and generating permutations.
Initialize a Difference instance.
Args
networks_dir : str
- Directory containing network configurations.
model_settings : Model_Settings, optional
- Settings for the model.
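A minimal usage sketch, assuming a prepared network directory and a populated Model_Settings instance (the paths and the settings object are hypothetical):

# Hypothetical directory and settings, for illustration only.
networks = Networks.initialise('networks/', 'Difference', my_model_settings)
networks.train()                                          # train every configured network
networks.predict('absolute_data/', experimental_feature)  # compare against a reference dataset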
Ancestors
Networks
Methods
def combinations(self) ‑> None
-
Expand source code
def combinations(self) -> None:
    """
    Generate combinations of features and targets for training.
    """
    if self.model_settings.permutations_limit is None:
        permutations_limit = self.population
    else:
        permutations_limit = self.model_settings.permutations_limit
    rng = np.random.default_rng()
    indices = rng.choice(range(self.population), permutations_limit)
    permutations = list(itertools.combinations(indices, r=2))
    permutations = [list(f) for f in permutations]
    x = np.zeros((len(self.networks_configured), len(permutations), 2 * self.points))
    y = np.zeros((len(self.networks_configured), len(permutations), 1))
    for idx, p in enumerate(permutations):
        x[:, idx, :self.points] = self.features[:, p[0], :]
        x[:, idx, self.points:] = self.features[:, p[1], :]
        y[:, idx] = self.targets[p[0]] - self.targets[p[1]]
    self.features = copy.deepcopy(x)
    self.targets = copy.deepcopy(y)
    self.population = len(permutations)
Generate combinations of features and targets for training.
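Unlike permutations, which treats (a, b) and (b, a) as distinct pairs, itertools.combinations yields each unordered pair once, so this variant produces roughly half as many training rows. A quick comparison:

import itertools

indices = [0, 1, 2]
print(list(itertools.permutations(indices, r=2)))
# [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
print(list(itertools.combinations(indices, r=2)))
# [(0, 1), (0, 2), (1, 2)]  -- each unordered pair appears once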
def confusion_matrix(self, absolute_dir=None)
-
Expand source code
def confusion_matrix(self, absolute_dir=None):
    self.setup_network_directories()
    # Gather input vectors used
    self.load_input_vectors()
    MAPE = np.zeros(len(self.networks_configured))
    min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])
    for idx in range(self.total_networks):
        self.working_network = idx
        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values[0]
        max = network_min_max_log['max'].values[0]
        log = network_min_max_log['log'].values[0]
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
        # Gathers and sets up the secondary dataset
        self.setup_absolute_dataset(absolute_dir)
        # Gathers the number of outputs
        outputs = len(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs'])
        s = np.shape(self.absolute_features)
        # Set up the prediction array
        self.normalised_predicitons = np.zeros((s[0] * 2, outputs), dtype=object)
        # Pairs every absolute sample with its reversed counterpart
        targets, feature = self.setup_confusion_matrix_features()
        # Sets up the network
        P = Predicting(dir, feature)
        # Generates normalised predictions
        self.normalised_predicitons = P.predict()
        self.absolute_targets = targets
        self.normalised_predicitons = targets - self.normalised_predicitons
        prediction = self.denormalise_prediction_single(absolute_dir)
        targets = self.denormalise_target_single(absolute_dir)
        fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
        plt.xlabel('Target')
        plt.ylabel('Prediction')
        if log == 0:
            plt.hist2d(targets[:, 0].ravel(), prediction[:, 0].ravel(),
                       bins=np.linspace(min, max, 150), cmap='inferno')
        else:
            plt.hist2d(targets[:, 0].ravel(), prediction[:, 0].ravel(),
                       bins=10 ** np.linspace(np.log10(min), np.log10(max), 150), cmap='inferno')
            plt.xscale('log')
            plt.yscale('log')
        figname = 'tempCF' + str(self.working_network)
        if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
            os.mkdir(os.path.join(os.getcwd(), 'temp'))
        figname = os.path.join(os.getcwd(), 'temp', figname + '.png')
        data = pd.DataFrame()
        data['Target'] = targets[:, 0].ravel()
        data['Predicted'] = prediction[:, 0].ravel()
        data.to_csv(figname + '.csv', index=False)
        plt.savefig(figname)
        MAPE[idx] = np.abs(np.mean(np.abs(targets[:, 0].ravel() - prediction[:, 0].ravel())
                                   / targets[:, 0].ravel()) * 100)
    self.MAPE = MAPE
    return MAPE
def denormalise_prediction_single(self, absolute_dir)
-
Expand source code
def denormalise_prediction_single(self, absolute_dir):
    min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])
    predictions = np.zeros((len(self.absolute_targets), 10))
    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    num_outputs = len(outputs)
    network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
    min = network_min_max_log['min'].values
    max = network_min_max_log['max'].values
    log = network_min_max_log['log'].values[0]
    for jdx in range(num_outputs):
        for kdx in range(len(self.absolute_targets)):
            if not log:
                predictions[kdx, jdx] = self.denormalise_linear(self.normalised_predicitons[kdx, jdx],
                                                                min[jdx], max[jdx])
            else:
                predictions[kdx, jdx] = self.denormalise_log(self.normalised_predicitons[kdx, jdx],
                                                             min[jdx], max[jdx])
    return predictions
def denormalise_target_single(self, absolute_dir)
-
Expand source code
def denormalise_target_single(self, absolute_dir):
    min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])
    predictions = np.zeros((len(self.absolute_targets), 10))
    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    num_outputs = len(outputs)
    network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
    min = network_min_max_log['min'].values
    max = network_min_max_log['max'].values
    log = network_min_max_log['log'].values[0]
    for jdx in range(num_outputs):
        for kdx in range(len(self.absolute_targets)):
            if not log:
                predictions[kdx, jdx] = self.denormalise_linear(self.absolute_targets[kdx, jdx],
                                                                min[jdx], max[jdx])
            else:
                predictions[kdx, jdx] = self.denormalise_log(self.absolute_targets[kdx, jdx],
                                                             min[jdx], max[jdx])
    return predictions
def distribution_plot(self, absolute_dir)
-
Expand source code
def distribution_plot(self, absolute_dir):
    min_max_log = pd.read_csv(os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])
    for idx in range(self.total_networks):
        self.working_network = idx
        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values[0]
        max = network_min_max_log['max'].values[0]
        log = network_min_max_log['log'].values[0]
        p = self.predictions[:, idx, 0]
        m = np.mean(p)
        fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
        ax.hist(p, bins=10000, range=(min, max))
        ax.axvline(m, color='tab:orange')
        ax.set_xlabel('Prediction')
        ax.set_ylabel('Count')
        figname = 'tempDF' + str(self.working_network) + '.png'
        if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
            os.mkdir(os.path.join(os.getcwd(), 'temp'))
        figname = os.path.join(os.getcwd(), 'temp', figname)
        plt.show()
def distribution_plot_single(self, absolute_dir: str, idx: int, predictions: numpy.ndarray, norm_predictions: numpy.ndarray) ‑> None
-
Expand source code
def distribution_plot_single(self, absolute_dir: str, idx: int, predictions: np.ndarray,
                             norm_predictions: np.ndarray) -> None:
    """
    Generate a distribution plot for a single network.

    Args:
        absolute_dir (str): Directory containing absolute data.
        idx (int): Index of the network.
        predictions (numpy.ndarray): Denormalised predictions to plot.
        norm_predictions (numpy.ndarray): Normalised predictions, saved alongside the plot data.
    """
    min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'),
                              header=None, sep=' ', names=['param', 'min', 'max', 'log'])
    self.working_network = idx
    outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
    network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
    min = network_min_max_log['min'].values[0]
    max = network_min_max_log['max'].values[0]
    log = network_min_max_log['log'].values[0]
    p = predictions[:, 0]
    if self.working_network == 0:
        self.mean = np.zeros(self.total_networks)
        self.std = np.zeros(self.total_networks)
    fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
    if log == 0:
        m = np.mean(p)
        s = np.std(p)
        self.mean[idx] = np.abs(m)
        self.std[idx] = s
        count, bins = np.histogram(p, bins=np.linspace(min, max, 1000))
        ax.hist(np.abs(p), bins=np.linspace(min, max, 1000))
    else:
        m = np.mean(p)
        s = np.std(p)
        self.mean[idx] = np.abs(m)
        self.std[idx] = s
        count, bins = np.histogram(p, bins=10 ** np.linspace(np.log10(min), np.log10(max), 1000))
        ax.hist(p, bins=10 ** np.linspace(np.log10(min), np.log10(max), 1000))
        ax.set_xscale('log')
    ax.axvline(m, color='tab:orange')
    L = Label(outputs[0])
    ax.set_xlabel(L.english + ' (' + L.units + ')')
    ax.set_ylabel('Count')
    hist = pd.DataFrame()
    count = np.append(count, 0)
    hist['bins'] = predictions[:, 0]
    hist_n = pd.DataFrame()
    hist_n['bins'] = norm_predictions[:, 0]
    figname = 'tempDF' + str(self.working_network) + '.png'
    figname_hist = 'tempDF' + str(self.working_network) + '.csv'
    figname_hist_n = 'tempDF' + str(self.working_network) + '_norm.csv'
    if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
        os.mkdir(os.path.join(os.getcwd(), 'temp'))
    figname = os.path.join(os.getcwd(), 'temp', figname)
    figname_hist = os.path.join(os.getcwd(), 'temp', figname_hist)
    plt.savefig(figname)
    hist.to_csv(figname_hist, index=False)
    hist_n.to_csv(os.path.join(os.getcwd(), 'temp', figname_hist_n), index=False)
Generate a distribution plot for a single network.
Args
absolute_dir : str
- Directory containing absolute data.
idx : int
- Index of the network.
predictions : numpy.ndarray
- Denormalised predictions to plot.
norm_predictions : numpy.ndarray
- Normalised predictions, saved alongside the plot data.
def generate_uniform_distribution(self)
-
Expand source code
def generate_uniform_distribution(self):
    # Resample the dataset with inverse-density weights so that the
    # target distribution becomes closer to uniform.
    kde = stats.gaussian_kde(self.absolute_targets.ravel())
    density = kde(self.absolute_targets.ravel())
    density = self.normalise_linear(density, np.min(density), np.max(density))
    density = 1 - density
    density = density / np.sum(density)
    indicies = np.arange(len(density))
    normal = random.choices(indicies, weights=density, k=int(len(self.absolute_targets) / 2))
    self.absolute_targets = self.absolute_targets[normal]
    self.absolute_features = self.absolute_features[normal]
def permutations(self) ‑> Tuple[numpy.ndarray, numpy.ndarray]
-
Expand source code
def permutations(self) -> Tuple[np.ndarray, np.ndarray]:
    """
    Generate permutations of features and targets for training.

    Returns:
        tuple: Features and targets with generated permutations.
    """
    features = self.features
    targets = self.targets
    if self.model_settings.permutations_limit is None:
        permutations_limit = len(targets)
    else:
        permutations_limit = self.model_settings.permutations_limit
    indices = random.choices(range(len(targets)), k=permutations_limit)
    self.indices = indices
    permutations = list(itertools.permutations(indices, r=2))
    permutations = np.array([list(f) for f in permutations], dtype=np.uint32)
    self.permutations_list = permutations
    keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
    keys = list(keys)
    l = 0
    for key in keys:
        l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))
    num_inputs = l
    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    num_outputs = len(outputs)
    x = np.zeros((len(permutations), 2 * num_inputs))
    y = np.zeros((len(permutations), num_outputs))
    for idx, p in enumerate(permutations):
        x[idx, :num_inputs] = features[p[0]]
        x[idx, num_inputs:] = features[p[1]]
        y[idx] = targets[p[0]] - targets[p[1]]
    self.features = x
    self.targets = y
    self.population = len(permutations)
    return features, targets
Generate permutations of features and targets for training.
Returns
tuple
- Features and targets with generated permutations.
def permutations_lowRAM(self)
-
Expand source code
def permutations_lowRAM(self):
    if self.model_settings.permutations_limit is not None:
        permutations_limit = self.model_settings.permutations_limit
    # Draw pairs within a single 1000-sample block, then replicate the
    # block's pair pattern across the dataset to keep memory usage low.
    indices = random.choices(range(1000), k=1000)
    self.indices = indices
    permutations = list(itertools.permutations(indices, r=2))
    permutations = np.array([list(f) for f in permutations], dtype=np.uint32)
    Thousands = 40
    p = np.zeros((len(permutations) * Thousands, 2), dtype=np.uint32)
    step = np.shape(permutations)[0]
    for idx in range(Thousands):
        # Offset the block pattern into the idx-th group of 1000 samples.
        p[step * idx:step * (idx + 1), :] = permutations + (1000 * idx)
    permutations = p
    if self.model_settings.permutations_limit < np.shape(permutations)[0]:
        permutations = permutations[:self.model_settings.permutations_limit]
    self.permutations_list = permutations
    keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
    keys = list(keys)
    l = 0
    for key in keys:
        l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))
    num_inputs = l
    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    num_outputs = len(outputs)
    x = np.zeros((len(permutations), 2 * num_inputs))
    y = np.zeros((len(permutations), num_outputs))
    for idx, p in enumerate(permutations):
        x[idx, :num_inputs] = self.features[p[0]]
        x[idx, num_inputs:] = self.features[p[1]]
        y[idx] = self.targets[p[0]] - self.targets[p[1]]
    self.features = x
    self.targets = y
    self.population = len(permutations)
    print(self.population)
    return self.features, self.targets
def permutations_normal_distribution(self, features, targets)
-
Expand source code
def permutations_normal_distribution(self, features, targets):
    if self.model_settings.permutations_limit is None:
        permutations_limit = len(targets)
    else:
        permutations_limit = self.model_settings.permutations_limit
    indices = self.rng.choice(range(len(targets)), 300)
    permutations = list(itertools.permutations(indices, r=2))
    permutations = np.array([list(f) for f in permutations], dtype=np.uint32)
    keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
    keys = list(keys)
    l = 0
    for key in keys:
        l = l + len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))
    num_inputs = l
    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    num_outputs = len(outputs)
    x = np.zeros((len(permutations), 2 * num_inputs))
    y = np.zeros((len(permutations), num_outputs))
    for idx, p in enumerate(permutations):
        x[idx, :num_inputs] = features[p[0]]
        x[idx, num_inputs:] = features[p[1]]
        y[idx] = targets[p[0]] - targets[p[1]]
    features = x
    targets = y.ravel()
    self.population = len(permutations)
    return features, targets
def predict(self, absolute_dir, experimental_feature)
-
Expand source code
def predict(self, absolute_dir, experimental_feature):
    experimental_feature = copy.deepcopy(experimental_feature)
    self.setup_network_directories()
    # Gather input vectors used
    self.load_input_vectors()
    # Samples the experimental feature
    experimental_feature = self.sample_experimental_features(experimental_feature)
    # Normalises the experimental feature
    experimental_feature = self.normalise_experimental_features(experimental_feature)
    for idx in range(self.total_networks):
        self.working_network = idx
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
        # Gathers and sets up the secondary dataset
        self.setup_absolute_dataset(absolute_dir)
        # Gathers the number of outputs
        outputs = len(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs'])
        s = np.shape(self.absolute_features)
        # Set up the prediction array
        self.normalised_predicitons = np.zeros((s[0], outputs), dtype=object)
        # Pairs the experimental feature with the secondary dataset
        feature = self.setup_experimental_feature(experimental_feature)
        # Sets up the network
        P = Predicting(dir, feature)
        # Generates normalised predictions
        self.normalised_predicitons = P.predict()
        self.turn_predictions_absolute()
        self.predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))
        prediction = self.denormalise_prediction_single(absolute_dir)
        self.predictions[:, idx, :] = prediction[:]
        self.distribution_plot_single(absolute_dir, idx, prediction, self.normalised_predicitons)
def renormalise(self, absolute_v, absolute_dir)
-
Expand source code
def renormalise(self, absolute_v, absolute_dir):
    network_min_max_log_dir = os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv')
    network_min_max_log = pd.read_csv(network_min_max_log_dir, header=None, sep=' ',
                                      names=['param', 'min', 'max', 'log'])
    absolute_min_max_log_dir = os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv')
    absolute_min_max_log = pd.read_csv(absolute_min_max_log_dir, header=None, sep=' ',
                                       names=['param', 'min', 'max', 'log'])
    temp = absolute_min_max_log[absolute_min_max_log['param'] == self.outputs[0]]
    absolute_min = temp['min'].to_numpy()[0]
    absolute_max = temp['max'].to_numpy()[0]
    absolute_log = temp['log'].to_numpy()[0]
    if absolute_log == 1:
        absolute_v[self.outputs[0]] = self.denormalise_log(absolute_v[self.outputs[0]], absolute_min, absolute_max)
    else:
        absolute_v[self.outputs[0]] = self.denormalise_linear(absolute_v[self.outputs[0]], absolute_min, absolute_max)
    temp = network_min_max_log[network_min_max_log['param'] == self.outputs[0]]
    network_min = temp['min'].to_numpy()[0]
    network_max = temp['max'].to_numpy()[0]
    network_log = temp['log'].to_numpy()[0]
    if network_log == 1:
        absolute_v[self.outputs[0]] = self.normalise_log(absolute_v[self.outputs[0]], network_min, network_max)
    else:
        absolute_v[self.outputs[0]] = self.normalise_linear(absolute_v[self.outputs[0]], network_min, network_max)
    vecs = []
    for f in absolute_v.columns:
        if 'vec' in f and 'light' in f:
            vecs.append(f)
    for vec in vecs:
        temp = absolute_min_max_log[absolute_min_max_log['param'] == vec]
        absolute_min = temp['min'].to_numpy()[0]
        absolute_max = temp['max'].to_numpy()[0]
        absolute_log = temp['log'].to_numpy()[0]
        if absolute_log == 1:
            absolute_v[vec] = self.denormalise_log(absolute_v[vec], absolute_min, absolute_max)
        else:
            absolute_v[vec] = self.denormalise_linear(absolute_v[vec], absolute_min, absolute_max)
        temp = network_min_max_log[network_min_max_log['param'] == vec]
        network_min = temp['min'].to_numpy()[0]
        network_max = temp['max'].to_numpy()[0]
        network_log = temp['log'].to_numpy()[0]
        if network_log == 1:
            absolute_v[vec] = self.normalise_log(absolute_v[vec], network_min, network_max)
        else:
            absolute_v[vec] = self.normalise_linear(absolute_v[vec], network_min, network_max)
    return absolute_v
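The renormalisation is a two-step mapping: undo the absolute dataset's scaling, then reapply the network's own. Assuming the inherited helpers implement conventional min-max scaling, the linear case reduces to the following (standalone re-implementations for illustration, not the inherited methods themselves):

def denormalise_linear(v, vmin, vmax):
    # Map a [0, 1] value back to physical units.
    return v * (vmax - vmin) + vmin

def normalise_linear(v, vmin, vmax):
    # Map a physical value into [0, 1].
    return (v - vmin) / (vmax - vmin)

# A value normalised against [0, 10] re-expressed against [0, 20]:
v = 0.5                                                       # i.e. 5.0 in absolute units
print(normalise_linear(denormalise_linear(v, 0, 10), 0, 20))  # 0.25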
def setup_absolute_dataset(self, absolute_dir)
-
Expand source code
def setup_absolute_dataset(self, absolute_dir):
    absolute_vectors_dir = os.path.join(absolute_dir, 'faster', 'vectors.csv')
    inputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    self.inputs = inputs
    self.outputs = outputs
    with open(absolute_vectors_dir, 'r') as f:
        v = pd.read_csv(f, delimiter=' ')
    self.vectors = self.renormalise(v, absolute_dir)
    col = self.vectors.columns.to_list()
    vecs = np.where(np.char.find(np.char.lower(col), 'light_1.0.vec') > -1)[0]
    col = [col[x] for x in vecs]
    self.absolute_targets = self.vectors[outputs].to_numpy()
    self.absolute_features = self.vectors[col].to_numpy()
def setup_confusion_matrix_features(self)
-
Expand source code
def setup_confusion_matrix_features(self):
    absolute_features = self.absolute_features
    absolute_targets = self.absolute_targets
    x_features = absolute_features
    x_targets = absolute_targets[::-1]
    y_features = absolute_features
    y_targets = absolute_targets[::-1]
    features1 = np.concatenate((x_features, y_features), axis=1)
    features2 = np.concatenate((y_features, x_features), axis=1)
    features = np.concatenate((features1, features2), axis=0)
    targets = np.concatenate((x_targets, y_targets), axis=0)
    return targets, features
def setup_experimental_feature(self, experimental_feature)
-
Expand source code
def setup_experimental_feature(self, experimental_feature):
    absolute_features = self.absolute_features
    y = experimental_feature.y
    y = np.tile(y, len(absolute_features))
    y = np.reshape(y, np.shape(absolute_features))
    features1 = np.concatenate((absolute_features, y), axis=1)
    features2 = np.concatenate((y, absolute_features), axis=1)
    features = np.concatenate((features1, features2), axis=0)
    return features
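The pairing tiles the single experimental curve against every row of the absolute dataset, in both orders, so the feature array fed to the network has twice as many rows as the absolute dataset. A shape-only illustration with toy arrays:

import numpy as np

absolute_features = np.random.rand(5, 3)         # 5 absolute devices, 3 points each
y = np.tile(np.random.rand(3), 5).reshape(5, 3)  # experimental curve repeated per row
features = np.concatenate(
    (np.concatenate((absolute_features, y), axis=1),   # (absolute, experimental) order
     np.concatenate((y, absolute_features), axis=1)),  # (experimental, absolute) order
    axis=0)
print(features.shape)  # (10, 6)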
def train(self, idx=None)
-
Expand source code
def train(self, idx=None):
    self.setup_network_directories()
    if idx is None:
        for idx in range(self.total_networks):
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks()
    else:
        self.working_network = idx
        self.load_training_dataset()
        self.permutations_lowRAM()
        self.separate_training_dataset()
        self.train_networks('Difference')
def train_existing(self, idx=None)
-
Expand source code
def train_existing(self, idx=None):
    self.setup_network_directories()
    if idx is None:
        for idx in range(self.total_networks):
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks_existing()
    else:
        self.working_network = idx
        self.load_training_dataset()
        self.permutations_lowRAM()
        self.separate_training_dataset()
        self.train_networks_existing('Difference')
def tune(self) ‑> None
-
Expand source code
def tune(self) -> None:
    """
    Tune the difference-based networks.

    This method performs hyperparameter optimization for the networks.
    """
    self.setup_network_directories()
    for idx in range(self.total_networks):
        self.working_network = idx
        features, targets = self.load_training_dataset()
        self.permutations_lowRAM()
        self.separate_training_dataset()
        self.tune_networks()
Tune the difference-based networks.
This method performs hyperparameter optimization for the networks.
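The search itself is delegated to the inherited tune_networks, which the module docstring attributes to Keras Tuner. Purely as an illustration of what such a search looks like (the model builder, input width, ranges, and objective here are placeholders, not the module's actual search space):

import keras_tuner as kt
import tensorflow as tf

def build_model(hp):
    # Placeholder search space for illustration only.
    inputs = tf.keras.Input(shape=(64,))  # hypothetical input width
    x = inputs
    for i in range(hp.Int('layers', 1, 4)):
        x = tf.keras.layers.Dense(hp.Int(f'units_{i}', 32, 256, step=32),
                                  activation='relu')(x)
    outputs = tf.keras.layers.Dense(1)(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer='adam', loss='mae')
    return model

tuner = kt.RandomSearch(build_model, objective='val_loss', max_trials=10)
# tuner.search(x_train, y_train, validation_data=(x_val, y_val))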
def turn_predictions_absolute(self)
-
Expand source code
def turn_predictions_absolute(self):
    origin = self.absolute_targets
    l = len(self.absolute_targets)
    # First half: (absolute, experimental) pairs; second half: reversed order.
    self.normalised_predicitons[:l] = origin - self.normalised_predicitons[:l]
    self.normalised_predicitons[l:] = self.normalised_predicitons[l:] - origin
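A toy numeric check of the folding logic for the first half of the array, where the absolute device occupies the network's first input slot and the prediction approximates t_abs − t_exp:

import numpy as np

origin = np.array([0.8])               # known normalised target of the absolute device
t_exp = 0.3                            # experimental value we want to recover
predicted_difference = origin - t_exp  # idealised network output for this pair
print(origin - predicted_difference)   # [0.3] -- the experimental value is recovered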
Inherited members
Networks:
denormalise_linear
denormalise_log
denormalise_predictions
get_uniform_distribution
initialise
interpret_input_vectors
load_input_vectors
load_training_dataset
mape
normalise_experimental_features
normalise_linear
normalise_log
sample_experimental_features
separate_training_dataset
setup_network_directories
train_networks
train_networks_existing
tune_networks
class Ensemble (networks_dir: str)
-
Expand source code
class Ensemble(Networks):
    """
    Subclass of Networks for ensemble-based models.

    This class provides methods specific to ensemble-based models,
    including training and feature augmentation.
    """

    _network_type = 'ensemble'

    def __init__(self, networks_dir: str):
        """
        Initialize an Ensemble instance.

        Args:
            networks_dir (str): Directory containing network configurations.
        """
        self.networks_dir = networks_dir
        self.rng = np.random.default_rng()

    def train(self) -> None:
        """
        Train the ensemble-based networks.

        This method sets up directories, loads datasets, and trains the
        networks using ensemble techniques.
        """
        self.setup_network_directories()
        for idx in range(self.total_networks):
            self.working_network = idx
            self.load_training_dataset()
            training_features, training_targets, validation_features, validation_targets = self.separate_training_dataset()
            self.train_networks_ensemble(training_features, training_targets,
                                         validation_features, validation_targets)

    def train_networks_ensemble(self, training_features: np.ndarray, training_targets: np.ndarray,
                                validation_features: np.ndarray, validation_targets: np.ndarray) -> None:
        """
        Train ensemble networks with augmented features.

        Args:
            training_features (numpy.ndarray): Training features.
            training_targets (numpy.ndarray): Training targets.
            validation_features (numpy.ndarray): Validation features.
            validation_targets (numpy.ndarray): Validation targets.
        """
        # Presample a few training runs to establish a baseline metric.
        network_metric = np.zeros((self.model_settings.ensemble_presample))
        presample_num = 0
        while presample_num < self.model_settings.ensemble_presample:
            augmented_training_features = self.augment_features(training_features)
            network = copy.deepcopy(self.networks[self.working_network])
            network_metric[presample_num] = Training(augmented_training_features, training_targets,
                                                     validation_features, validation_targets, dir)
            presample_num += 1
        mean_network_metric = np.mean(network_metric)
        ensemble_population = 0
        metric = 0
        patience = 0
        # Keep candidate models that beat the baseline by the tolerance.
        while (ensemble_population < self.model_settings.ensemble_maximum
               and patience < self.model_settings.ensemble_patience):
            network = copy.deepcopy(self.networks[self.working_network])
            augmented_training_features = self.augment_features(training_features)
            candidate_network = network.fit(augmented_training_features[:], training_targets[:])
            if metric == 0:
                temp_metric = candidate_network
            else:
                temp_metric = np.mean([metric, candidate_network])
            percentage_change = (temp_metric - mean_network_metric) / mean_network_metric
            if percentage_change >= self.model_settings.ensemble_tollerance:
                path = os.path.join(self.networks_dir, 'faster',
                                    self.networks_configured[self.working_network],
                                    str(int(ensemble_population)))
                os.mkdir(path)
                network.save(path)
                self.save_rotation(path)
                ensemble_population += 1
            else:
                patience += 1

    def augment_features(self, training_features: np.ndarray, rotation: Optional[float] = None) -> np.ndarray:
        """
        Augment training features with rotation-based transformations.

        Args:
            training_features (numpy.ndarray): Training features to augment.
            rotation (float, optional): Rotation angle for augmentation.

        Returns:
            numpy.ndarray: Augmented training features.
        """
        devices = np.shape(training_features)[0]
        total_points = np.shape(training_features)[1]
        if rotation is None:
            rotation = np.random.uniform(low=-np.pi / 8, high=np.pi / 8)
        self.rotation = rotation
        input_characterisations = int(total_points / self.points)
        false_x_values = np.linspace(-1, 1, self.points)
        false_x_values = np.tile(false_x_values, input_characterisations)
        cos = np.cos(rotation)
        sin = np.sin(rotation)
        indexed_features = np.zeros((devices, total_points, 2))
        indexed_features[:, :, 0] = false_x_values
        indexed_features[:, :, 1] = training_features[:, :]
        x_offset = 0
        y_offset = 0.5
        if input_characterisations != 1:
            characterisation_boundaries = [self.points * idx for idx in range(1, input_characterisations + 1)]
        else:
            characterisation_boundaries = [self.points]
        for idx, boundary in enumerate(characterisation_boundaries):
            start = 0 if idx == 0 else characterisation_boundaries[idx - 1]
            # Rotate each characterisation segment about (x_offset, y_offset).
            x = indexed_features[:, start:boundary, 0] - x_offset
            y = indexed_features[:, start:boundary, 1] - y_offset
            indexed_features[:, start:boundary, 0] = (x * cos) - (y * sin) + x_offset
            indexed_features[:, start:boundary, 1] = (y * cos) + (x * sin) + y_offset
        indexed_features_subset = indexed_features[:, :, :]
        for idx, feature in enumerate(indexed_features_subset):
            x = feature[:, 0]
            y = feature[:, 1]
            for jdx, boundary in enumerate(characterisation_boundaries):
                start = 0 if jdx == 0 else characterisation_boundaries[jdx - 1]
                # Re-sample the rotated segment back onto the fixed x-grid.
                function = interpolate.interp1d(x[start:boundary], y[start:boundary],
                                                fill_value='extrapolate')
                indexed_features[idx, start:boundary, 1] = function(false_x_values[start:boundary])
        training_features = indexed_features[:, :, 1]
        return training_features

    def save_rotation(self, path: str) -> None:
        """
        Save rotation data to a file.

        Args:
            path (str): Path to save the rotation data.
        """
        path = os.path.join(path, 'data.csv')
        data = pd.DataFrame(data={'rotation': [self.rotation]})
        data.to_csv(path, index=False)

    def predict(self, experimental_feature: Any) -> None:
        """
        Predict outputs for given experimental features using ensemble models.

        Args:
            experimental_feature: Experimental features to predict outputs for.
        """
        self.setup_network_directories()
        self.normalised_predicitons = np.zeros(len(self.networks_configured))
        self.predictions = np.zeros(len(self.networks_configured))
        for idx in range(len(self.networks_configured)):
            self.working_network = idx
            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
            members = os.listdir(dir)
            predictions_population = np.zeros(len(members))
            for jdx, member in enumerate(members):
                member_dir = os.path.join(dir, member)
                self.load_input_vectors()
                self.interpret_input_vectors()
                rotation = pd.read_csv(os.path.join(member_dir, 'data.csv'))
                rotation = rotation['rotation']
                experimental_feature = self.sample_experimental_features(experimental_feature)
                experimental_feature = self.augment_experimental_features(experimental_feature, rotation.values[0])
                experimental_feature = self.normalise_experimental_features(experimental_feature)
                P = Predicting(member_dir, np.array([experimental_feature.y]))
                predictions_population[jdx] = P.predict()
            # Average the ensemble members' predictions for this network.
            self.normalised_predicitons[self.working_network] = np.mean(predictions_population)
        self.denormalise_predictions()

    # augment_experimental_features is shown individually under "Methods" below.
Subclass of Networks for ensemble-based models.
This class provides methods specific to ensemble-based models, including training and feature augmentation.
Initialize an Ensemble instance.
Args
    networks_dir : str
        Directory containing network configurations.
Ancestors
    Networks
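A minimal usage sketch, assuming networks_dir is already laid out as the module expects (a faster/nets.json file plus trained-member subdirectories) and that experimental_feature comes from the Input module with .x and .y arrays; the path is illustrative only:

>>> ensemble = Networks.initialise('simulation_data/', 'Ensemble')
>>> ensemble.train()                        # grows each ensemble until the population or patience limit
>>> ensemble.predict(experimental_feature)  # reloads each member's rotation from its data.csv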
Methods
def augment_experimental_features(self, experimental_features: Any, rotation: float) ‑> Any
-
Augment experimental features with rotation-based transformations.
Args
    experimental_features
        Experimental features to augment.
    rotation : float
        Rotation angle for augmentation.
Returns
    Augmented experimental features.
def augment_features(self, training_features: numpy.ndarray, rotation: float | None = None) ‑> numpy.ndarray
-
Augment training features with rotation-based transformations.
Args
    training_features : numpy.ndarray
        Training features to augment.
    rotation : float, optional
        Rotation angle for augmentation; a random angle in (-π/8, π/8) is drawn when omitted.
Returns
    numpy.ndarray
        Augmented training features.
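The transformation amounts to rotating each (false-x, y) pair of a normalised curve about the pivot (0, 0.5) and re-interpolating the rotated curve back onto the fixed x grid. A stripped-down sketch of that core step for a single characterisation, assuming y is already normalised to roughly [0, 1]; note the module updates its arrays in place, so its y component sees the already-rotated x, whereas this sketch uses the textbook rotation for clarity:

import numpy as np
from scipy import interpolate

def rotate_curve(y: np.ndarray, rotation: float, points: int) -> np.ndarray:
    """Rotate one normalised curve about the pivot (0, 0.5), then
    re-sample it on the fixed false-x grid (illustrative sketch)."""
    x = np.linspace(-1, 1, points)
    cos, sin = np.cos(rotation), np.sin(rotation)
    ys = y - 0.5                      # shift the pivot to the origin
    xr = x * cos - ys * sin           # textbook 2-D rotation
    yr = ys * cos + x * sin
    yr = yr + 0.5                     # shift back
    # Re-interpolate so the augmented curve shares the original x grid.
    f = interpolate.interp1d(xr, yr, fill_value='extrapolate')
    return f(x)

y_aug = rotate_curve(np.linspace(0.0, 1.0, 64), rotation=np.pi / 16, points=64)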
def predict(self, experimental_feature: Any) ‑> None
-
Predict outputs for given experimental features using ensemble models.
Args
experimental_feature
- Experimental features to predict outputs for.
def save_rotation(self, path: str) ‑> None
-
Save rotation data to a file.
Args
    path : str
        Path to save the rotation data.
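The saved file is a one-row CSV, which predict later reads back for each ensemble member. A minimal round trip, assuming net is an Ensemble instance with its rotation set and the member path is illustrative:

>>> net.rotation = 0.12
>>> net.save_rotation('faster/network_0/0')                            # writes faster/network_0/0/data.csv
>>> pd.read_csv('faster/network_0/0/data.csv')['rotation'].values[0]   # -> 0.12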
def train(self) ‑> None
-
Train the ensemble-based networks.
This method sets up directories, loads datasets, and trains the networks using ensemble techniques.
def train_networks_ensemble(self, training_features: numpy.ndarray, training_targets: numpy.ndarray, validation_features: numpy.ndarray, validation_targets: numpy.ndarray) ‑> None
-
Train ensemble networks with augmented features.
Args
    training_features : numpy.ndarray
        Training features.
    training_targets : numpy.ndarray
        Training targets.
    validation_features : numpy.ndarray
        Validation features.
    validation_targets : numpy.ndarray
        Validation targets.
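The acceptance rule reduces to: presample a baseline metric over several throwaway candidates, then admit a new member only while the running metric improves on that baseline by at least the configured tolerance, giving up after the patience limit. A condensed sketch of that control flow, where train_candidate is a hypothetical callable returning a candidate's validation metric:

import numpy as np

def grow_ensemble(train_candidate, presample, maximum, patience_limit, tolerance):
    """Condensed control flow of the ensemble-growth loop (sketch only)."""
    # Baseline: mean metric over a handful of presampled candidates.
    baseline = np.mean([train_candidate() for _ in range(presample)])
    accepted, metric, patience = 0, 0.0, 0
    while accepted < maximum and patience < patience_limit:
        candidate = train_candidate()
        running = candidate if metric == 0 else np.mean([metric, candidate])
        change = (running - baseline) / baseline
        if change >= tolerance:
            accepted += 1          # in the module: save the model and its rotation to disk
            metric = running
        else:
            patience += 1
    return accepted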
Inherited members
Networks:
denormalise_linear
denormalise_log
denormalise_predictions
get_uniform_distribution
initialise
interpret_input_vectors
load_input_vectors
load_training_dataset
mape
normalise_experimental_features
normalise_linear
normalise_log
sample_experimental_features
separate_training_dataset
setup_network_directories
train_networks
train_networks_existing
tune_networks
class Networks
-
Expand source code
class Networks:
    """
    Base class for managing and training machine learning networks.

    This class provides methods for initializing, configuring, and training networks,
    as well as handling input and output data for machine learning tasks.
    It also supports subclassing for different types of networks.
    """

    subclasses = {}

    def __init_subclass__(cls, **kwargs):
        """
        Automatically register subclasses for different network types.

        Args:
            **kwargs: Additional keyword arguments for subclass initialization.
        """
        super().__init_subclass__(**kwargs)
        cls.subclasses[cls.__name__] = cls

    @classmethod
    def initialise(cls, networks_dir: str, network_type: Optional[str] = None, model_settings=None) -> 'Networks':
        """
        Factory method to create an instance of a specific network subclass.

        This method instantiates the appropriate network subclass based on the
        specified network type. If no model settings are provided, default
        settings are used.

        Args:
            networks_dir (str): Directory containing network configurations and data.
            network_type (str, optional): Type of network to create. Available types
                depend on registered subclasses (e.g., 'Point', 'Ensemble', 'Difference').
            model_settings (Model_Settings, optional): Configuration settings for the
                model. If None, default Model_Settings will be created.

        Returns:
            Networks: An instance of the appropriate network subclass.

        Raises:
            ValueError: If the specified network type is not recognized or registered.

        Example:
            >>> networks = Networks.initialise('path/to/networks', 'Point')
        """
        if network_type not in cls.subclasses:
            raise ValueError('Network Type: {} Not recognized'.format(network_type))
        if model_settings is None:
            model_settings = Model_Settings()
        return cls.subclasses[network_type](networks_dir, model_settings=model_settings)

    def setup_network_directories(self) -> None:
        """
        Set up directories for network configurations and ensure required files exist.

        This method validates the network configuration, loads the network settings
        from the nets.json file, and creates necessary directories for each configured
        network. It initializes tracking variables for network management.

        Raises:
            ValueError: If the network configuration file (nets.json) is not found
                in the expected location (networks_dir/faster/nets.json).

        Sets:
            self.networks_configured (list): List of configured network names
            self.working_network (int): Index of currently active network (starts at 0)
            self.total_networks (int): Total number of configured networks
            self.networks (numpy.ndarray): Array to store network instances
            self.oghma_network_config (dict): Loaded network configuration data
        """
        oghma_network_config = os.path.join(self.networks_dir, 'faster', 'nets.json')
        if os.path.isfile(oghma_network_config) == False:
            raise ValueError('Network Config File Not Found')
        else:
            f = open(oghma_network_config, 'r')
            oghma_network_config = json.load(f)
            f.close()
        self.networks_configured = list(oghma_network_config['sims'].keys())
        self.working_network = 0
        self.total_networks = len(self.networks_configured)
        self.networks = np.zeros(len(self.networks_configured), dtype=object)
        self.oghma_network_config = oghma_network_config
        for network in self.networks_configured:
            network_dir = os.path.join(self.networks_dir, 'faster', network)
            if os.path.isdir(network_dir) == False:
                os.mkdir(network_dir)

    def load_input_vectors(self) -> None:
        """
        Load input vectors from the network configuration file.

        This method parses experimental input vectors and stores them for use
        in training and prediction.
        """
        input_vectors = {}
        input_experiments = self.oghma_network_config['experimental']
        self.networks_configured = list(self.oghma_network_config['sims'].keys())
        for experiment in input_experiments:
            vector = self.oghma_network_config['experimental'][experiment]['vec']['points'].split(',')
            vector = np.asarray(vector).astype(float)
            input_vectors[experiment] = vector
        self.input_vectors = input_vectors
        self.points = len(vector)

    def load_training_dataset(self) -> Tuple[np.ndarray, np.ndarray]:
        """
        Load the training dataset from the network configuration file.

        This method reads the dataset, extracts input and output features,
        and prepares them for training.

        Returns:
            tuple: A tuple containing features and targets as numpy arrays.
        """
        training_dataset = pd.read_csv(self.oghma_network_config['csv_file'], sep=" ")
        inputs_vectors = {}
        input_experiments = self.oghma_network_config['experimental']
        for experiment in input_experiments:
            points = len(self.oghma_network_config['experimental'][experiment]['vec']['points'].split(','))
            inputs_vectors[experiment] = points
            self.points = points
        self.population = len(training_dataset)
        self.inputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        self.outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        self.input_points = 0
        for input in self.inputs:
            points = len(self.oghma_network_config['experimental'][input]['vec']['points'].split(','))
            self.input_points += points
        self.output_points = len(self.outputs)
        # feature = np.zeros(self.input_points)
        features = np.zeros((self.population, self.input_points))
        # target = np.zeros((len(self.outputs)))
        targets = np.zeros((self.population, self.output_points))
        inputs = np.empty(self.input_points, dtype=object)
        previous_end = 0
        for idx in range(len(self.inputs)):
            vector_points = inputs_vectors[self.inputs[idx]]
            inputs[previous_end:previous_end + vector_points] = np.asarray(
                [input + '.vec' + str(x) for x in range(vector_points)])
            previous_end = previous_end + vector_points
        #print(inputs)
        self.features = training_dataset[inputs].to_numpy().astype(np.float32)
        self.targets = training_dataset[self.outputs].to_numpy().astype(np.float32)
        return features, targets

    def separate_training_dataset(self) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
        """
        Separate the training dataset into training and validation sets.

        This method splits the dataset based on the training percentage defined
        in the model settings.

        Returns:
            tuple: A tuple containing training features, training targets,
                validation features, and validation targets.
        """
        training_population = int(self.model_settings.training_percentage * self.population)
        indices = np.linspace(0, self.population - 1, self.population, dtype=int)
        training_indices = random.choices(indices, k=training_population)
        validation_indices = np.delete(indices, training_indices)
        training_features = self.features[training_indices]
        training_targets = self.targets[training_indices]
        validation_features = self.features[validation_indices]
        validation_targets = self.targets[validation_indices]
        self.validation_indices = validation_indices
        self.training_features = training_features
        self.training_targets = training_targets
        self.validation_features = validation_features
        self.validation_targets = validation_targets
        return training_features, training_targets, validation_features, validation_targets

    def get_uniform_distribution(self, features: np.ndarray, targets: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        """
        Generate a uniform distribution of features and targets.

        Args:
            features (numpy.ndarray): Input features.
            targets (numpy.ndarray): Target values.

        Returns:
            tuple: Validation features and validation targets.
        """
        features = np.asarray(features)
        targets = np.asarray(targets)
        training_population = int(self.model_settings.training_percentage * self.population)
        indices = np.linspace(0, self.population - 1, self.population, dtype=int)
        rng = np.random.default_rng()
        training_indices = rng.choice(indices, training_population, replace=False)
        validation_indices = np.array([i not in training_indices for i in indices])
        training_features = features[training_indices]
        training_targets = targets[training_indices]
        validation_features = features[validation_indices]
        validation_targets = targets[validation_indices]
        return validation_features, validation_targets

    def train_networks(self, network_type=None) -> None:
        """
        Train the networks using the training dataset.

        This method initializes the training process for the current working network.
        """
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
        print('Learning Rate:', self.model_settings.inital_learning_rate)
        print('Decay Rate:', self.model_settings.decay_rate)
        Training(self.training_features, self.training_targets, self.validation_features, self.validation_targets, dir, model_settings=self.model_settings, model_type=network_type)

    def train_networks_existing(self, network_type=None) -> None:
        """
        Continue training the networks from an existing saved model.

        This method resumes the training process for the current working network.
        """
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
        print('Learning Rate:', self.model_settings.inital_learning_rate)
        print('Decay Rate:', self.model_settings.decay_rate)
        Training(self.training_features, self.training_targets, self.validation_features, self.validation_targets, dir, model_settings=self.model_settings, model_type=network_type, existing=True)

    def tune_networks(self) -> None:
        """
        Tune the networks using hyperparameter optimization.

        This method performs tuning to optimize the network's performance.
        """
        dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
        bhp = Tuning(self.training_features, self.training_targets, self.validation_features, self.validation_targets, dir)

    @staticmethod
    def mape(x, y):
        """
        Calculate the Mean Absolute Percentage Error (MAPE) between two arrays.

        Args:
            x (numpy.ndarray): Actual values.
            y (numpy.ndarray): Predicted values.

        Returns:
            float: Mean Absolute Percentage Error.
        """
        return np.mean(np.abs((x - y) / y)) * 100

    def interpret_input_vectors(self) -> None:
        """
        Interpret input vectors to extract experimental conditions.

        This method processes input vectors to determine experimental intensities
        or other relevant parameters.
        """
        intensity = np.zeros(len(self.input_vectors))
        for idx, experiment in enumerate(self.input_vectors):
            match experiment:
                case x if 'light' in x:
                    experiment.split('_')
                    intensity[idx] = float(experiment.split('_')[-1])
                case x if 'dark' in x:
                    intensity[idx] = 0

    def sample_experimental_features(self, features: Any) -> Any:
        """
        Sample experimental features to match input vectors.

        Args:
            features: Experimental features to sample.

        Returns:
            Updated experimental features.
        """
        if type(features) != list:
            keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
            keys = list(keys)
            if len(keys) > 1:
                keys = keys[0]
            l = 0
            for key in keys:
                l = l + len(self.input_vectors[key])
            filter = np.zeros(l)
            input = []
            for key in keys:
                input.append(self.input_vectors[key])
            input = np.array(input).ravel()
            for idx, i in enumerate(input):
                exp = features.x
                diff = exp - i
                diff = np.abs(diff)
                diff = np.argmin(diff)
                filter[idx] = np.abs(diff)
            filter = np.array(filter).astype(int)
            features.x = features.x[filter]
            features.y = features.y[filter]
        else:
            keys = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
            keys = list(keys)
            l = 0
            for key in keys:
                l = l + len(self.input_vectors[key])
            filter = np.zeros(l)
            input = []
            for key in keys:
                input.append(self.input_vectors[key])
            input = np.array(input).ravel()
            for jdx in range(len(features)):
                for idx, i in enumerate(input):
                    exp = features.x[jdx]
                    diff = exp - i
                    diff = np.abs(diff)
                    diff = np.argmin(diff)
                    filter[idx] = np.abs(diff)
                filter = np.array(filter).astype(int)
                features.x[jdx] = features.x[jdx][filter]
                features.y[jdx] = features.y[jdx][filter]
        return features

    def normalise_experimental_features(self, features: Any, dir: Optional[str] = None, idx=None) -> Any:
        """
        Normalize experimental features based on configuration settings.

        Args:
            features: Experimental features to normalize.
            dir (str, optional): Directory containing normalization settings.

        Returns:
            Normalized experimental features.
        """
        if idx == None:
            idx = 0
        if dir == None:
            min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        else:
            min_max_log = pd.read_csv(os.path.join(dir, 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        prefix = [self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs'][idx]]
        #prefix = prefix[idx]
        if len(prefix) > 1:
            vecs = []
            for idx in range(len(prefix)):
                temp_vecs = [prefix[idx] + '.vec' + str(i) for i in range(self.points)]
                vecs.append(temp_vecs)
        else:
            vecs = [prefix[0] + '.vec' + str(i) for i in range(self.points)]
        min_max_log = min_max_log[min_max_log['param'].isin(vecs)]
        dir_min = min_max_log['min'].values
        dir_max = min_max_log['max'].values
        dir_log = min_max_log['log'].values
        y = features.y
        if len(y) != self.points:
            for idx in range(len(y)):
                if dir_log[0] == 0:
                    y[idx] = self.normalise_linear(y[idx], dir_min[idx], dir_max[idx])
                else:
                    y[idx] = self.normalise_log(y[idx], dir_min[idx], dir_max[idx])
        else:
            if dir_log[0] == 0:
                y = self.normalise_linear(y, dir_min, dir_max)
            else:
                y = self.normalise_log(y, dir_min, dir_max)
        features.y = y
        return features

    def denormalise_predictions(self) -> None:
        """
        Denormalize predictions to their original scale.

        This method converts normalized predictions back to their original scale
        using configuration settings.
        """
        predictions = np.zeros((self.Device_Population, self.total_networks, 100))
        self.mean = np.zeros((self.total_networks, 100))
        for idx in range(self.total_networks):
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            num_outputs = len(outputs)
            network_min_max_log = self.min_max_log[self.min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values.ravel()
            max = network_min_max_log['max'].values.ravel()
            log = network_min_max_log['log'].values.ravel()
            for jdx in range(num_outputs):
                if num_outputs > 1:
                    if log[jdx] == 1:
                        predictions[:, idx, jdx] = self.denormalise_log(self.normalised_predicitons[idx][0][0][jdx], min[jdx], max[jdx])
                    else:
                        predictions[:, idx, jdx] = self.denormalise_linear(self.normalised_predicitons[idx][0][0][jdx], min[jdx], max[jdx])
                else:
                    if log == 1:
                        predictions[:, idx, jdx] = self.denormalise_log(self.normalised_predicitons[idx, jdx], min, max)
                    else:
                        predictions[:, idx, jdx] = self.denormalise_linear(self.normalised_predicitons[idx, jdx], min, max)
        self.predicitons = predictions
        for idx in range(self.total_networks):
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            num_outputs = len(outputs)
            for jdx in range(num_outputs):
                if num_outputs > 1:
                    self.mean[idx, jdx] = np.mean(self.predicitons[:, idx, jdx])
                else:
                    self.mean[idx, 0] = np.mean(self.predicitons[:, idx, jdx])

    def normalise_linear(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
        """
        Normalize values linearly.

        Args:
            x (numpy.ndarray): Values to normalize.
            x_min (float): Minimum value for normalization.
            x_max (float): Maximum value for normalization.

        Returns:
            numpy.ndarray: Normalized values.
        """
        return (x - x_min) / (x_max - x_min)

    def normalise_log(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
        """
        Normalize values logarithmically.

        Args:
            x (numpy.ndarray): Values to normalize.
            x_min (float): Minimum value for normalization.
            x_max (float): Maximum value for normalization.

        Returns:
            numpy.ndarray: Normalized values.
        """
        return (np.log10(x) - np.log10(x_min)) / (np.log10(x_max) - np.log10(x_min))

    def denormalise_linear(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
        """
        Denormalize values linearly.

        Args:
            x (numpy.ndarray): Values to denormalize.
            x_min (float): Minimum value for denormalization.
            x_max (float): Maximum value for denormalization.

        Returns:
            numpy.ndarray: Denormalized values.
        """
        return x * (x_max - x_min) + x_min

    def denormalise_log(self, x: np.ndarray, x_min: float, x_max: float) -> np.ndarray:
        """
        Denormalize values logarithmically.

        Args:
            x (numpy.ndarray): Values to denormalize.
            x_min (float): Minimum value for denormalization.
            x_max (float): Maximum value for denormalization.

        Returns:
            numpy.ndarray: Denormalized values.
        """
        return 10 ** (x * (np.log10(x_max) - np.log10(x_min)) + np.log10(x_min))
Base class for managing and training machine learning networks.
This class provides methods for initializing, configuring, and training networks, as well as handling input and output data for machine learning tasks. It also supports subclassing for different types of networks.
Subclasses
    Difference
    Ensemble
    Point
Class variables
var subclasses
Static methods
def initialise(networks_dir: str, network_type: str | None = None, model_settings=None) ‑> Networks
-
Factory method to create an instance of a specific network subclass.
This method instantiates the appropriate network subclass based on the specified network type. If no model settings are provided, default settings are used.
Args
    networks_dir : str
        Directory containing network configurations and data.
    network_type : str, optional
        Type of network to create. Available types depend on registered subclasses (e.g., 'Point', 'Ensemble', 'Difference').
    model_settings : Model_Settings, optional
        Configuration settings for the model. If None, default Model_Settings will be created.
Returns
    Networks
        An instance of the appropriate network subclass.
Raises
    ValueError
        If the specified network type is not recognized or registered.
Example
>>> networks = Networks.initialise('path/to/networks', 'Point')
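The dispatch itself is ordinary dictionary lookup over the registry that __init_subclass__ fills in. A standalone sketch of the same registration pattern (names illustrative):

class Base:
    subclasses = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Base.subclasses[cls.__name__] = cls   # register each subclass by name

    @classmethod
    def initialise(cls, name, *args, **kwargs):
        if name not in cls.subclasses:
            raise ValueError('Network Type: {} Not recognized'.format(name))
        return cls.subclasses[name](*args, **kwargs)

class Point(Base):
    def __init__(self, networks_dir):
        self.networks_dir = networks_dir

p = Base.initialise('Point', 'simulation_data/')   # -> Point instance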
def mape(x, y)
-
Calculate the Mean Absolute Percentage Error (MAPE) between two arrays.
Args
    x : numpy.ndarray
        Actual values.
    y : numpy.ndarray
        Predicted values.
Returns
    float
        Mean Absolute Percentage Error.
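A quick worked example; note the denominator is the second argument y, so the result is a percentage error relative to the values passed as y:

>>> import numpy as np
>>> Networks.mape(np.array([75., 125.]), np.array([100., 100.]))
25.0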
Methods
def denormalise_linear(self, x: numpy.ndarray, x_min: float, x_max: float) ‑> numpy.ndarray
-
Denormalize values linearly.
Args
    x : numpy.ndarray
        Values to denormalize.
    x_min : float
        Minimum value for denormalization.
    x_max : float
        Maximum value for denormalization.
Returns
    numpy.ndarray
        Denormalized values.
def denormalise_log(self, x: numpy.ndarray, x_min: float, x_max: float) ‑> numpy.ndarray
-
Denormalize values logarithmically.
Args
    x : numpy.ndarray
        Values to denormalize.
    x_min : float
        Minimum value for denormalization.
    x_max : float
        Maximum value for denormalization.
Returns
    numpy.ndarray
        Denormalized values.
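The linear and logarithmic pairs are exact inverses of one another. A worked round trip in log space, where net is any Networks instance and the bounds are illustrative:

>>> x_min, x_max = 1e-9, 1e-3
>>> n = net.normalise_log(1e-6, x_min, x_max)   # (-6 - (-9)) / (-3 - (-9)) = 0.5
>>> net.denormalise_log(n, x_min, x_max)        # recovers ~1e-6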
def denormalise_predictions(self) ‑> None
-
Denormalize predictions to their original scale.
This method converts normalized predictions back to their original scale using configuration settings.
def get_uniform_distribution(self, features: numpy.ndarray, targets: numpy.ndarray) ‑> Tuple[numpy.ndarray, numpy.ndarray]
-
Generate a uniform distribution of features and targets.
Args
    features : numpy.ndarray
        Input features.
    targets : numpy.ndarray
        Target values.
Returns
    tuple
        Validation features and validation targets.
def interpret_input_vectors(self) ‑> None
-
Interpret input vectors to extract experimental conditions.
This method processes input vectors to determine experimental intensities or other relevant parameters.
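The experiment key itself encodes the condition: keys containing 'light' are parsed for a numeric suffix after the final underscore, while keys containing 'dark' map to zero intensity. For instance (key names hypothetical, but following that convention):

>>> 'jv_light_1.0'.split('_')[-1]
'1.0'
>>> float('jv_light_1.0'.split('_')[-1])
1.0
>>> # a key such as 'jv_dark' would be assigned an intensity of 0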
def load_input_vectors(self) ‑> None
-
Load input vectors from the network configuration file.
This method parses experimental input vectors and stores them for use in training and prediction.
def load_training_dataset(self) ‑> Tuple[numpy.ndarray, numpy.ndarray]
-
Load the training dataset from the network configuration file.
This method reads the dataset, extracts input and output features, and prepares them for training.
Returns
    tuple
        A tuple containing features and targets as numpy arrays.
def normalise_experimental_features(self, features: Any, dir: str | None = None, idx=None) ‑> Any
-
Normalize experimental features based on configuration settings.
Args
    features
        Experimental features to normalize.
    dir : str, optional
        Directory containing normalization settings.
    idx : int, optional
        Index of the network input whose vector bounds are used; defaults to 0.
Returns
    Normalized experimental features.
def normalise_linear(self, x: numpy.ndarray, x_min: float, x_max: float) ‑> numpy.ndarray
-
Normalize values linearly.
Args
    x : numpy.ndarray
        Values to normalize.
    x_min : float
        Minimum value for normalization.
    x_max : float
        Maximum value for normalization.
Returns
    numpy.ndarray
        Normalized values.
def normalise_log(self, x: numpy.ndarray, x_min: float, x_max: float) ‑> numpy.ndarray
-
Normalize values logarithmically.
Args
    x : numpy.ndarray
        Values to normalize.
    x_min : float
        Minimum value for normalization.
    x_max : float
        Maximum value for normalization.
Returns
    numpy.ndarray
        Normalized values.
def sample_experimental_features(self, features: Any) ‑> Any
-
Sample experimental features to match input vectors.
Args
    features
        Experimental features to sample.
Returns
    Updated experimental features.
def separate_training_dataset(self) ‑> Tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray]
-
Separate the training dataset into training and validation sets.
This method splits the dataset based on the training percentage defined in the model settings.
Returns
    tuple
        A tuple containing training features, training targets, validation features, and validation targets.
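Because random.choices samples with replacement, the training split can contain repeated rows, and np.delete then removes every drawn index from the validation pool. A duplicate-free alternative sketch, if sampling without replacement is preferred (sizes illustrative):

import numpy as np

rng = np.random.default_rng()
population = 1000                               # illustrative dataset size
k = int(0.8 * population)                       # illustrative training fraction
training_idx = rng.choice(population, size=k, replace=False)
validation_idx = np.setdiff1d(np.arange(population), training_idx)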
def setup_network_directories(self) ‑> None
-
Set up directories for network configurations and ensure required files exist.
This method validates the network configuration, loads the network settings from the nets.json file, and creates necessary directories for each configured network. It initializes tracking variables for network management.
Raises
    ValueError
        If the network configuration file (nets.json) is not found in the expected location (networks_dir/faster/nets.json).
Sets
    self.networks_configured : list
        List of configured network names.
    self.working_network : int
        Index of the currently active network (starts at 0).
    self.total_networks : int
        Total number of configured networks.
    self.networks : numpy.ndarray
        Array to store network instances.
    self.oghma_network_config : dict
        Loaded network configuration data.
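Putting together the keys the loaders access, nets.json is expected to look roughly as follows; the key names 'csv_file', 'experimental', 'vec', 'points', 'sims', 'inputs' and 'outputs' are taken from the code, while every concrete value is illustrative only:

# Rough shape of faster/nets.json as read by setup_network_directories,
# load_input_vectors and load_training_dataset (values illustrative):
nets = {
    "csv_file": "training.csv",                  # space-separated data table
    "experimental": {
        "jv_light_1.0": {"vec": {"points": "-0.2,0.0,0.2,0.4"}},  # comma list
    },
    "sims": {
        "network_0": {
            "inputs": ["jv_light_1.0"],          # keys into "experimental"
            "outputs": ["mobility"],             # target column names in the csv
        },
    },
}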
def train_networks(self, network_type=None) ‑> None
-
Train the networks using the training dataset.
This method initializes the training process for the current working network.
def train_networks_existing(self, network_type=None) ‑> None
-
Continue training the networks from an existing saved model.
This method resumes the training process for the current working network.
def tune_networks(self) ‑> None
-
Tune the networks using hyperparameter optimization.
This method performs tuning to optimize the network's performance.
class Point (networks_dir: str, model_settings=None)
-
Expand source code
class Point(Networks):
    """
    Subclass of Networks for point-based models.

    This class provides methods specific to point-based models, including
    training and confusion matrix generation.
    """

    _network_type = 'point'

    def __init__(self, networks_dir: str, model_settings=None):
        """
        Initialize a Point instance.

        Args:
            networks_dir (str): Directory containing network configurations.
            model_settings (Model_Settings, optional): Settings for the model.
        """
        self.networks_dir = networks_dir
        # if model_settings == None:
        #     self.model_settings = Model_Settings()
        # else:
        self.model_settings = model_settings
        self.rng = np.random.default_rng()
        self.min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])

    def train(self, idx=None) -> None:
        """
        Train the point-based networks.

        This method sets up directories, loads datasets, and trains the networks.
        """
        self.setup_network_directories()
        if idx is None:
            for jdx in range(0, self.total_networks):
                self.working_network = jdx
                self.load_training_dataset()
                self.separate_training_dataset()
                self.train_networks_existing()
        else:
            self.working_network = idx
            self.load_training_dataset()
            self.separate_training_dataset()
            self.train_networks_existing('Point')

    def train_existing(self, idx=None) -> None:
        """
        Train the point-based networks.

        This method sets up directories, loads datasets, and trains the networks.
        """
        self.setup_network_directories()
        if idx is None:
            for jdx in range(0, self.total_networks):
                self.working_network = jdx
                self.load_training_dataset()
                self.separate_training_dataset()
                self.train_networks_existing()
        else:
            self.working_network = idx
            self.load_training_dataset()
            self.separate_training_dataset()
            self.train_networks_existing()

    def confusion_matrix(self, abs_dir: Optional[str] = None) -> np.ndarray:
        """
        Generate a confusion matrix for the point-based networks.

        Args:
            abs_dir (str, optional): Directory containing absolute data.

        Returns:
            numpy.ndarray: Mean Absolute Percentage Error (MAPE) for the networks.
        """
        self.setup_network_directories()
        self.normalised_predicitons = np.zeros((len(self.networks_configured), 1), dtype=object)
        self.predictions = np.zeros((len(self.networks_configured), 10))
        self.MAPE = np.zeros((len(self.networks_configured), 10))
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        for idx in range(len(self.networks_configured)):
            self.working_network = idx
            self.load_training_dataset()
            self.separate_training_dataset()
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values[0]
            max = network_min_max_log['max'].values[0]
            log = network_min_max_log['log'].values[0]
            figname = 'tempCF' + str(idx)
            fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
            ax.set_xlabel('Target')
            ax.set_ylabel('Predicted')
            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
            validation_features = np.array([i.astype(float) for i in self.validation_features])
            P = Predicting(dir, validation_features)
            self.normalised_predicitons[idx, 0] = P.predict()
            if log == 1:
                vt = self.denormalise_log(self.validation_targets, min, max)
                predictions = self.denormalise_log(self.normalised_predicitons[idx][:], min, max)
            else:
                vt = self.denormalise_linear(self.validation_targets, min, max)
                predictions = self.denormalise_linear(self.normalised_predicitons[idx][:], min, max)
            if np.shape(self.normalised_predicitons[idx][0])[1] > 1:
                for jdx in range(np.shape(self.normalised_predicitons[idx][0])[1]):
                    plt.hist2d(self.validation_targets[:, jdx].ravel(), self.normalised_predicitons[idx][0][:, jdx].ravel(), bins=np.linspace(0, 1, 100), range=[[0, 1], [0, 1]], cmap='inferno')
                    self.MAPE[idx, jdx] = np.abs(np.mean(np.abs(self.validation_targets[:, jdx].ravel() - self.normalised_predicitons[idx][0][:, jdx].ravel()) / self.normalised_predicitons[idx][0][:, jdx].ravel()) * 100)
                    #bs = stats.bootstrap((self.validation_targets[:,jdx].ravel(), self.normalised_predicitons[idx][0][:,jdx].ravel()), self.mape, confidence_level=0.95, n_resamples=100)
                    #self.MAPE[idx,jdx] = np.mean((bs.confidence_interval.low, bs.confidence_interval.high))
            else:
                plt.hist2d(self.validation_targets[:].ravel(), self.normalised_predicitons[idx][0].ravel(), bins=np.linspace(0, 1, 100), range=[[0, 1], [0, 1]], cmap='inferno')
                self.MAPE[idx, 0] = np.abs(np.mean(np.abs(self.validation_targets.ravel() - self.normalised_predicitons[idx][0].ravel()) / self.normalised_predicitons[idx][0].ravel()) * 100)
                #bs = stats.bootstrap((self.validation_targets.ravel(), self.normalised_predicitons[idx][0].ravel()), self.mape, confidence_level=0.95, n_resamples=100)
                #self.MAPE[idx, 0] = np.mean((bs.confidence_interval.low, bs.confidence_interval.high))
            if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
                os.mkdir(os.path.join(os.getcwd(), 'temp'))
            figname = 'tempCF' + str(self.working_network)
            if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
                os.mkdir(os.path.join(os.getcwd(), 'temp'))
            figname = os.path.join(os.getcwd(), 'temp', figname + '.png')
            data = pd.DataFrame()
            data['Target'] = vt[:].ravel()[:20000]
            print(predictions[0].ravel())
            data['Predicted'] = predictions[0].ravel()[:20000]
            data.to_csv(figname + '.csv', index=False)
            plt.savefig(figname)
        return self.MAPE

    def predict(self, experimental_feature: Any) -> None:
        """
        Predict outputs for given experimental features.

        Args:
            experimental_feature: Experimental features to predict outputs for.
        """
        self.Device_Population = experimental_feature.Device_Population
        self.setup_network_directories()
        self.normalised_predicitons = np.zeros((len(self.networks_configured), 1), dtype=object)
        self.predictions = np.zeros((len(self.networks_configured), 10))
        for idx in range(len(self.networks_configured)):
            ef = copy.deepcopy(experimental_feature)
            self.working_network = idx
            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
            self.load_input_vectors()
            self.interpret_input_vectors()
            ef = self.sample_experimental_features(ef)
            ef = self.normalise_experimental_features(ef)
            if ef.Device_Population > 1:
                ef.y = np.array([f for f in ef.y])
            else:
                ef.y = np.array([ef.y])
            P = Predicting(dir, ef.y)
            self.normalised_predicitons[idx, 0] = P.predict()
        self.denormalise_predictions()
Subclass of Networks for point-based models.
This class provides methods specific to point-based models, including training and confusion matrix generation.
Initialize a Point instance.
Args
networks_dir : str
    Directory containing network configurations.
model_settings : Model_Settings, optional
    Settings for the model.
Ancestors
Networks
Methods
def confusion_matrix(self, abs_dir: str | None = None) -> numpy.ndarray
Generate a confusion matrix for the point-based networks.
Args
abs_dir : str, optional
    Directory containing absolute data.
Returns
numpy.ndarray
    Mean Absolute Percentage Error (MAPE) for the networks.
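Note that the percentage error is computed with the normalised prediction, not the target, in the denominator, so values can differ from textbook MAPE. A minimal standalone sketch of the per-output quantity stored in self.MAPE (function name illustrative):

import numpy as np

def mape_as_implemented(targets: np.ndarray, predictions: np.ndarray) -> float:
    # Percentage error with the prediction in the denominator,
    # matching the expression used inside confusion_matrix.
    return float(np.abs(np.mean(np.abs(targets - predictions) / predictions) * 100))

t = np.array([0.2, 0.5, 0.8])
p = np.array([0.25, 0.45, 0.85])
print(mape_as_implemented(t, p))  # ~= 12.3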
def predict(self, experimental_feature: Any) -> None
Predict outputs for given experimental features.
Args
experimental_feature
    Experimental features to predict outputs for.
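A hypothetical call sequence (names illustrative; experimental is assumed to be an Input-module feature object exposing Device_Population and a sampled y vector):

>>> point = Point('networks/', model_settings)
>>> point.predict(experimental)
>>> point.predictions  # filled in by the final denormalise_predictions() call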
def train(self, idx=None) -> None
Expand source code
def train(self, idx=None) -> None:
    """
    Train the point-based networks.

    This method sets up directories, loads datasets, and trains the networks.
    """
    self.setup_network_directories()
    if idx is None:
        for jdx in range(0, self.total_networks):
            self.working_network = jdx
            self.load_training_dataset()
            self.separate_training_dataset()
            # fresh training; continued training lives in train_existing
            self.train_networks()
    else:
        self.working_network = idx
        self.load_training_dataset()
        self.separate_training_dataset()
        self.train_networks('Point')
Train the point-based networks.
This method sets up directories, loads datasets, and trains the networks.
def train_existing(self, idx=None) -> None
Expand source code
def train_existing(self, idx=None) -> None:
    """
    Continue training the point-based networks from previously saved models.

    This method sets up directories, loads datasets, and resumes training of the existing networks.
    """
    self.setup_network_directories()
    if idx is None:
        for jdx in range(0, self.total_networks):
            self.working_network = jdx
            self.load_training_dataset()
            self.separate_training_dataset()
            self.train_networks_existing()
    else:
        self.working_network = idx
        self.load_training_dataset()
        self.separate_training_dataset()
        self.train_networks_existing()
Continue training the point-based networks from previously saved models.
This method sets up directories, loads datasets, and resumes training of the existing networks.
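Both entry points accept an optional network index; a brief hedged sketch (paths illustrative):

>>> point = Point('networks/', model_settings)
>>> point.train()            # fresh training for every configured network
>>> point.train_existing(3)  # resume training for network index 3 only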
Inherited members
Networks:
    denormalise_linear
    denormalise_log
    denormalise_predictions
    get_uniform_distribution
    initialise
    interpret_input_vectors
    load_input_vectors
    load_training_dataset
    mape
    normalise_experimental_features
    normalise_linear
    normalise_log
    sample_experimental_features
    separate_training_dataset
    setup_network_directories
    train_networks
    train_networks_existing
    tune_networks
class Residual (networks_dir, model_settings=None)
Expand source code
class Residual(Networks):
    _network_type = 'residual'

    def __init__(self, networks_dir, model_settings=None):
        self.networks_dir = networks_dir
        self.model_settings = model_settings
        self.rng = np.random.default_rng()

    def train(self, idx=None):
        self.setup_network_directories()
        if idx is None:
            for idx in range(self.total_networks):
                self.working_network = idx
                self.load_training_dataset()
                self.permutations_lowRAM()
                self.separate_training_dataset()
                self.train_networks()
        else:
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks('Residual')

    def train_existing(self, idx=None):
        self.setup_network_directories()
        if idx is None:
            for idx in range(self.total_networks):
                self.working_network = idx
                self.load_training_dataset()
                self.permutations_lowRAM()
                self.separate_training_dataset()
                self.train_networks_existing()
        else:
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks_existing('Residual')

    def tune(self):
        self.setup_network_directories()
        for idx in range(self.total_networks):
            self.working_network = idx
            self.load_training_dataset()
            self.permutations()
            self.separate_training_dataset()
            self.tune_networks()

    def permutations(self):
        features = self.features
        targets = self.targets
        if self.model_settings.permutations_limit is None:
            permutations_limit = len(targets)
        else:
            permutations_limit = self.model_settings.permutations_limit
        indices = random.choices(range(len(targets)), k=permutations_limit)
        self.indices = indices
        permutations = list(itertools.permutations(indices, r=2))
        permutations = np.array([list(f) for f in permutations], dtype=np.uint32)
        self.permutations_list = permutations
        keys = list(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs'])
        num_inputs = 0
        for key in keys:
            num_inputs += len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))
        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)
        x = np.zeros((len(permutations), 2 * num_inputs))
        y = np.zeros((len(permutations), num_outputs))
        for idx, p in enumerate(permutations):
            x[idx, :num_inputs] = features[p[0]]
            x[idx, num_inputs:] = features[p[1]]
            y[idx] = targets[p[0]] - targets[p[1]]
        self.features = x
        self.targets = y
        self.population = len(permutations)
        return features, targets

    def permutations_lowRAM(self):
        # Build pairwise permutations of a 1000-sample block, then tile the
        # block across the dataset instead of permuting every sample at once.
        indices = random.choices(range(1000), k=1000)
        self.indices = indices
        permutations = list(itertools.permutations(indices, r=2))
        permutations = np.array([list(f) for f in permutations], dtype=np.uint32)
        Thousands = 40
        p = np.zeros((len(permutations) * Thousands, 2), dtype=np.uint32)
        step = np.shape(permutations)[0]
        for idx in range(Thousands):
            # offset each block by 1000 so every block draws from fresh samples
            p[step * idx:step * (idx + 1), :] = permutations + (1000 * idx)
        permutations = p
        if self.model_settings.permutations_limit is not None and self.model_settings.permutations_limit < np.shape(permutations)[0]:
            permutations = permutations[:self.model_settings.permutations_limit]
        self.permutations_list = permutations
        keys = list(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs'])
        num_inputs = 0
        for key in keys:
            num_inputs += len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))
        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)
        x = np.zeros((len(permutations), 2 * num_inputs))
        y = np.zeros((len(permutations), num_outputs))
        for idx, p in enumerate(permutations):
            x[idx, :num_inputs] = self.features[p[0]]
            x[idx, num_inputs:] = self.features[p[1]]
            y[idx] = self.targets[p[0]] - self.targets[p[1]]
        self.features = x
        self.targets = y
        self.population = len(permutations)
        return self.features, self.targets

    def combinations(self):
        if self.model_settings.permutations_limit is None:
            permutations_limit = self.population
        else:
            permutations_limit = self.model_settings.permutations_limit
        rng = np.random.default_rng()
        indices = rng.choice(range(self.population), permutations_limit)
        permutations = list(itertools.combinations(indices, r=2))
        permutations = [list(f) for f in permutations]
        x = np.zeros((len(self.networks_configured), len(permutations), 2 * self.points))
        y = np.zeros((len(self.networks_configured), len(permutations), 1))
        for idx, p in enumerate(permutations):
            x[:, idx, :self.points] = self.features[:, p[0], :]
            x[:, idx, self.points:] = self.features[:, p[1], :]
            y[:, idx] = self.targets[p[0]] - self.targets[p[1]]
        self.features = copy.deepcopy(x)
        self.targets = copy.deepcopy(y)
        self.population = len(permutations)

    def predict(self, absolute_dir: str, *experimental_features) -> None:
        """
        Predict outputs for given experimental features using difference-based models.

        Args:
            absolute_dir (str): Directory containing absolute data.
            *experimental_features: One or more experimental feature objects to predict outputs for.
        """
        experimental_features = experimental_features[0]
        self.setup_network_directories()
        # Gather the input vectors used
        self.load_input_vectors()
        exps = []
        if type(experimental_features) is list:
            for idx, experimental_feature in enumerate(experimental_features):
                experimental_feature = copy.deepcopy(experimental_feature)
                # Sample and normalise each experimental feature
                experimental_feature = self.sample_experimental_features(experimental_feature)
                experimental_feature = self.normalise_experimental_features(experimental_feature, dir=None, idx=idx)
                exps.append(experimental_feature)
        else:
            experimental_features = copy.deepcopy(experimental_features)
            # Sample and normalise the single experimental feature
            experimental_features = self.sample_experimental_features(experimental_features)
            experimental_features = self.normalise_experimental_features(experimental_features, dir=None, idx=None)
            exps.append(experimental_features)
        # unwrap only when a single feature was supplied
        exps = exps[0] if len(exps) == 1 else exps
        for idx in range(self.total_networks):
            self.working_network = idx
            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
            # Gather and set up the secondary (absolute) dataset
            self.setup_absolute_dataset(absolute_dir)
            # Number of outputs for this network
            outputs = len(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs'])
            s = np.shape(self.absolute_features)
            # Set up the prediction array
            self.normalised_predicitons = np.zeros((s[0], outputs), dtype=object)
            # Pair the experimental feature with the secondary dataset
            feature = self.setup_experimental_feature(exps)
            # Set up the network
            P = Predicting(dir, feature, inputs=self.len_exp_features * 2)
            # Generate normalised predictions
            self.normalised_predicitons = P.predict()
            self.turn_predictions_absolute()
            self.predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))
            prediction = self.denormalise_prediction_single(absolute_dir)
            self.predictions[:, idx, :] = prediction[:]
            self.distribution_plot_single(absolute_dir, idx, prediction, self.normalised_predicitons)

    def renormalise(self, absolute_v: pd.DataFrame, absolute_dir: str) -> pd.DataFrame:
        """
        Renormalise absolute values to match the network's configuration.

        Args:
            absolute_v (pandas.DataFrame): Absolute values to renormalise.
            absolute_dir (str): Directory containing absolute data.

        Returns:
            pandas.DataFrame: Renormalised values.
        """
        network_min_max_log_dir = os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv')
        network_min_max_log = pd.read_csv(network_min_max_log_dir, header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        absolute_min_max_log_dir = os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv')
        absolute_min_max_log = pd.read_csv(absolute_min_max_log_dir, header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        temp = absolute_min_max_log[absolute_min_max_log['param'] == self.outputs[0]]
        absolute_min = temp['min'].to_numpy()[0]
        absolute_max = temp['max'].to_numpy()[0]
        absolute_log = temp['log'].to_numpy()[0]
        if absolute_log == 1:
            absolute_v[self.outputs[0]] = self.denormalise_log(absolute_v[self.outputs[0]], absolute_min, absolute_max)
        else:
            absolute_v[self.outputs[0]] = self.denormalise_linear(absolute_v[self.outputs[0]], absolute_min, absolute_max)
        temp = network_min_max_log[network_min_max_log['param'] == self.outputs[0]]
        network_min = temp['min'].to_numpy()[0]
        network_max = temp['max'].to_numpy()[0]
        network_log = temp['log'].to_numpy()[0]
        if network_log == 1:
            absolute_v[self.outputs[0]] = self.normalise_log(absolute_v[self.outputs[0]], network_min, network_max)
        else:
            absolute_v[self.outputs[0]] = self.normalise_linear(absolute_v[self.outputs[0]], network_min, network_max)
        vecs = [f for f in absolute_v.columns if 'vec' in f and 'light' in f]
        for vec in vecs:
            temp = absolute_min_max_log[absolute_min_max_log['param'] == vec]
            absolute_min = temp['min'].to_numpy()[0]
            absolute_max = temp['max'].to_numpy()[0]
            absolute_log = temp['log'].to_numpy()[0]
            if absolute_log == 1:
                absolute_v[vec] = self.denormalise_log(absolute_v[vec], absolute_min, absolute_max)
            else:
                absolute_v[vec] = self.denormalise_linear(absolute_v[vec], absolute_min, absolute_max)
            temp = network_min_max_log[network_min_max_log['param'] == vec]
            network_min = temp['min'].to_numpy()[0]
            network_max = temp['max'].to_numpy()[0]
            network_log = temp['log'].to_numpy()[0]
            if network_log == 1:
                absolute_v[vec] = self.normalise_log(absolute_v[vec], network_min, network_max)
            else:
                absolute_v[vec] = self.normalise_linear(absolute_v[vec], network_min, network_max)
        return absolute_v

    def setup_absolute_dataset(self, absolute_dir: str) -> None:
        """
        Set up the absolute dataset for predictions.

        Args:
            absolute_dir (str): Directory containing absolute data.
        """
        absolute_vectors_dir = os.path.join(absolute_dir, 'faster', 'vectors.csv')
        inputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        self.inputs = inputs
        self.outputs = outputs
        with open(absolute_vectors_dir, 'r') as f:
            v = pd.read_csv(f, delimiter=' ')
        self.vectors = self.renormalise(v, absolute_dir)
        col = self.vectors.columns.to_list()
        inp = [i + '.vec' for i in self.inputs]
        V = []
        for i in inp:
            vecs = np.where(np.char.find(np.char.lower(col), i) > -1)[0]
            V = np.append(V, vecs)
        col = [col[int(x)] for x in V]
        self.absolute_targets = self.vectors[outputs].to_numpy()
        self.absolute_features = self.vectors[col].to_numpy()
        # self.generate_uniform_distribution()  # optional resampling, disabled

    def generate_uniform_distribution(self) -> None:
        """
        Generate a uniform distribution of the absolute dataset.
        """
        kde = stats.gaussian_kde(self.absolute_targets.ravel())
        density = kde(self.absolute_targets.ravel())
        density = self.normalise_linear(density, np.min(density), np.max(density))
        density = 1 - density
        density = density / np.sum(density)
        indices = np.arange(len(density))
        normal = random.choices(indices, weights=density, k=int(len(self.absolute_targets) / 2))
        self.absolute_targets = self.absolute_targets[normal]
        self.absolute_features = self.absolute_features[normal]

    def setup_experimental_feature(self, experimental_features) -> np.ndarray:
        absolute_features = self.absolute_features
        try:
            len_exp_features = len(experimental_features)
            self.len_exp_features = int(len_exp_features / 2)
            len_exp0 = len(experimental_features[0].y)
        except TypeError:
            len_exp_features = 1
            self.len_exp_features = 1
            len_exp0 = len(experimental_features.y)
        match len_exp_features:
            case 1:
                y = np.tile(experimental_features.y, (len(absolute_features), 1))
                y = np.reshape(y, np.shape(absolute_features))
                features1 = np.concatenate((absolute_features, y), axis=1)
                features2 = np.concatenate((y, absolute_features), axis=1)
                features = np.concatenate((features1, features2), axis=0)
            case 2:
                abf_1 = absolute_features[:, 0:len_exp0]
                abf_2 = absolute_features[:, len_exp0:len_exp0 * 2]
                y1 = np.tile(experimental_features[0].y, (len(abf_1), 1))
                y1 = np.reshape(y1, np.shape(abf_1))
                y2 = np.tile(experimental_features[1].y, (len(abf_2), 1))
                y2 = np.reshape(y2, np.shape(abf_2))
                abf = np.concatenate((abf_1, abf_2), axis=1)
                abf_r = np.concatenate((abf_2, abf_1), axis=1)
                abf = np.concatenate((abf, abf_r), axis=0)
                y = np.concatenate((y1, y2), axis=1)
                y_r = np.concatenate((y2, y1), axis=1)
                y = np.concatenate((y, y_r), axis=0)
                features = np.concatenate((abf, y), axis=1)
            case 4:
                abf_1 = absolute_features[:, 0:len_exp0]
                abf_2 = absolute_features[:, len_exp0:len_exp0 * 2]
                abf_3 = absolute_features[:, len_exp0 * 2:len_exp0 * 3]
                abf_4 = absolute_features[:, len_exp0 * 3:len_exp0 * 4]
                y1 = np.tile(experimental_features[0].y, (len(abf_1), 1))
                y1 = np.reshape(y1, np.shape(abf_1))
                y2 = np.tile(experimental_features[1].y, (len(abf_2), 1))
                y2 = np.reshape(y2, np.shape(abf_2))
                y3 = np.tile(experimental_features[2].y, (len(abf_3), 1))
                y3 = np.reshape(y3, np.shape(abf_3))
                y4 = np.tile(experimental_features[3].y, (len(abf_4), 1))
                y4 = np.reshape(y4, np.shape(abf_4))
                features1 = np.concatenate((abf_1, y1), axis=1)
                features1_r = np.concatenate((y1, abf_1), axis=1)
                features1 = np.concatenate((features1, features1_r), axis=0)
                features2 = np.concatenate((abf_2, y2), axis=1)
                features2_r = np.concatenate((y2, abf_2), axis=1)
                features2 = np.concatenate((features2, features2_r), axis=0)
                features3 = np.concatenate((abf_3, y3), axis=1)
                features3_r = np.concatenate((y3, abf_3), axis=1)
                features3 = np.concatenate((features3, features3_r), axis=0)
                features4 = np.concatenate((abf_4, y4), axis=1)
                features4_r = np.concatenate((y4, abf_4), axis=1)
                features4 = np.concatenate((features4, features4_r), axis=0)
                features = np.concatenate((features1, features2, features3, features4), axis=1)
        return features

    def turn_predictions_absolute(self) -> None:
        """
        Convert predictions to absolute values.
        """
        origin = self.absolute_targets
        l = len(self.absolute_targets)
        # first half pairs (absolute, experimental); second half is reversed
        self.normalised_predicitons[:l] = origin - self.normalised_predicitons[:l]
        self.normalised_predicitons[l:] = self.normalised_predicitons[l:] - origin

    def denormalise_predictions(self, absolute_dir: str) -> None:
        """
        Denormalise predictions to their original scale.

        Args:
            absolute_dir (str): Directory containing absolute data.
        """
        min_max_log = pd.read_csv(os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        predictions = np.zeros((len(self.absolute_targets), self.total_networks, 10))
        for idx in range(self.total_networks):
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            num_outputs = len(outputs)
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values
            max = network_min_max_log['max'].values
            log = network_min_max_log['log'].values
            for jdx in range(num_outputs):
                for kdx in range(len(self.absolute_targets)):
                    if not log[jdx]:
                        predictions[kdx, idx, jdx] = self.normalised_predicitons[kdx, jdx] * (max[jdx] - min[jdx]) + min[jdx]
                    else:
                        predictions[kdx, idx, jdx] = self.normalised_predicitons[kdx, jdx] * (np.log10(max[jdx]) - np.log10(min[jdx])) + np.log10(min[jdx])
                        predictions[kdx, idx, jdx] = 10 ** predictions[kdx, idx, jdx]
        self.predictions = predictions

    def denormalise_prediction_single(self, absolute_dir: str) -> np.ndarray:
        """
        Denormalise a single prediction to its original scale.

        Args:
            absolute_dir (str): Directory containing absolute data.

        Returns:
            numpy.ndarray: Denormalised prediction.
        """
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        predictions = np.zeros((len(self.absolute_targets), 10))
        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values
        max = network_min_max_log['max'].values
        log = network_min_max_log['log'].values[0]
        for jdx in range(num_outputs):
            for kdx in range(len(self.absolute_targets)):
                if not log:
                    predictions[kdx, jdx] = self.denormalise_linear(self.normalised_predicitons[kdx, jdx], min[jdx], max[jdx])
                else:
                    predictions[kdx, jdx] = self.denormalise_log(self.normalised_predicitons[kdx, jdx], min[jdx], max[jdx])
        return predictions

    def denormalise_target_single(self, absolute_dir: str) -> np.ndarray:
        """
        Denormalise a single target to its original scale.

        Args:
            absolute_dir (str): Directory containing absolute data.

        Returns:
            numpy.ndarray: Denormalised target.
        """
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        predictions = np.zeros((len(self.absolute_targets), 10))
        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values
        max = network_min_max_log['max'].values
        log = network_min_max_log['log'].values[0]
        for jdx in range(num_outputs):
            for kdx in range(len(self.absolute_targets)):
                if not log:
                    predictions[kdx, jdx] = self.denormalise_linear(self.absolute_targets[kdx, jdx], min[jdx], max[jdx])
                else:
                    predictions[kdx, jdx] = self.denormalise_log(self.absolute_targets[kdx, jdx], min[jdx], max[jdx])
        return predictions

    def permutations_normal_distribution(self, features: np.ndarray, targets: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
        """
        Generate permutations of features and targets with a normal distribution.

        Args:
            features (numpy.ndarray): Input features.
            targets (numpy.ndarray): Target values.

        Returns:
            tuple: Features and targets with generated permutations.
        """
        if self.model_settings.permutations_limit is None:
            permutations_limit = len(targets)
        else:
            permutations_limit = self.model_settings.permutations_limit
        indices = self.rng.choice(range(len(targets)), 300)
        permutations = list(itertools.permutations(indices, r=2))
        permutations = np.array([list(f) for f in permutations], dtype=np.uint32)
        keys = list(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs'])
        num_inputs = 0
        for key in keys:
            num_inputs += len(self.oghma_network_config['experimental'][key]['vec']['points'].split(','))
        outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
        num_outputs = len(outputs)
        x = np.zeros((len(permutations), 2 * num_inputs))
        y = np.zeros((len(permutations), num_outputs))
        for idx, p in enumerate(permutations):
            x[idx, :num_inputs] = features[p[0]]
            x[idx, num_inputs:] = features[p[1]]
            y[idx] = targets[p[0]] - targets[p[1]]
        features = x
        targets = y.ravel()
        self.population = len(permutations)
        return features, targets

    def setup_confusion_matrix_features(self) -> Tuple[np.ndarray, np.ndarray]:
        """
        Set up features for generating a confusion matrix.

        Returns:
            tuple: Targets and features for the confusion matrix.
        """
        absolute_features = self.absolute_features
        absolute_targets = self.absolute_targets
        x_features = absolute_features
        x_targets = absolute_targets[::-1]
        y_features = absolute_features
        y_targets = absolute_targets[::-1]
        features1 = np.concatenate((x_features, y_features), axis=1)
        features2 = np.concatenate((y_features, x_features), axis=1)
        features = np.concatenate((features1, features2), axis=0)
        targets = np.concatenate((x_targets, y_targets), axis=0)
        return targets, features

    def confusion_matrix(self, absolute_dir: Optional[str] = None) -> np.ndarray:
        """
        Generate a confusion matrix for the difference-based networks.

        Args:
            absolute_dir (str, optional): Directory containing absolute data.

        Returns:
            numpy.ndarray: Mean Absolute Percentage Error (MAPE) for the networks.
        """
        self.setup_network_directories()
        self.load_input_vectors()
        MAPE = np.zeros(len(self.networks_configured))
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        for idx in range(self.total_networks):
            self.working_network = idx
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values[0]
            max = network_min_max_log['max'].values[0]
            log = network_min_max_log['log'].values[0]
            dir = os.path.join(self.networks_dir, 'faster', self.networks_configured[self.working_network])
            self.setup_absolute_dataset(absolute_dir)
            outputs = len(self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs'])
            s = np.shape(self.absolute_features)
            self.normalised_predicitons = np.zeros((s[0] * 2, outputs), dtype=object)
            targets, feature = self.setup_confusion_matrix_features()
            P = Predicting(dir, feature, inputs=2)
            self.normalised_predicitons = P.predict()
            self.absolute_targets = targets
            self.normalised_predicitons = targets - self.normalised_predicitons
            prediction = self.denormalise_prediction_single(absolute_dir)
            targets = self.denormalise_target_single(absolute_dir)
            fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
            plt.xlabel('Target')
            plt.ylabel('Prediction')
            if log == 0:
                plt.hist2d(targets[:, 0].ravel(), prediction[:, 0].ravel(), bins=np.linspace(min, max, 150), cmap='inferno')
            else:
                plt.hist2d(targets[:, 0].ravel(), prediction[:, 0].ravel(), bins=10 ** np.linspace(np.log10(min), np.log10(max), 150), cmap='inferno')
                plt.xscale('log')
                plt.yscale('log')
            figname = 'tempCF' + str(self.working_network)
            if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
                os.mkdir(os.path.join(os.getcwd(), 'temp'))
            figname = os.path.join(os.getcwd(), 'temp', figname)
            data = pd.DataFrame()
            data['Target'] = targets[:, 0].ravel()
            data['Predicted'] = prediction[:, 0].ravel()
            data.to_csv(figname + '.csv', index=False)
            plt.savefig(figname + '.png')
            MAPE[idx] = np.abs(np.mean(np.abs(targets[:, 0].ravel() - prediction[:, 0].ravel()) / targets[:, 0].ravel()) * 100)
        self.MAPE = MAPE
        return MAPE

    def distribution_plot_single(self, absolute_dir: str, idx: int, predictions: np.ndarray, norm_predictions: np.ndarray) -> None:
        """
        Generate a distribution plot for a single network.

        Args:
            absolute_dir (str): Directory containing absolute data.
            idx (int): Index of the network.
            predictions (numpy.ndarray): Denormalised predictions to plot.
            norm_predictions (numpy.ndarray): Normalised predictions, saved alongside the plot data.
        """
        min_max_log = pd.read_csv(os.path.join(self.networks_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        self.working_network = idx
        outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
        network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
        min = network_min_max_log['min'].values[0]
        max = network_min_max_log['max'].values[0]
        log = network_min_max_log['log'].values[0]
        p = predictions[:, 0]
        if self.working_network == 0:
            self.mean = np.zeros(self.total_networks)
            self.std = np.zeros(self.total_networks)
        fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
        m = np.mean(p)
        s = np.std(p)
        self.mean[idx] = np.abs(m)
        self.std[idx] = s
        if log == 0:
            ax.hist(np.abs(p), bins=np.linspace(min, max, 1000))
        else:
            ax.hist(p, bins=10 ** np.linspace(np.log10(min), np.log10(max), 1000))
            ax.set_xscale('log')
        ax.axvline(m, color='tab:orange')
        L = Label(outputs[0])
        ax.set_xlabel(L.english + ' (' + L.units + ')')
        ax.set_ylabel('Count')
        hist = pd.DataFrame()
        hist['bins'] = predictions[:, 0]
        hist_n = pd.DataFrame()
        hist_n['bins'] = norm_predictions[:, 0]
        figname = 'tempDF' + str(self.working_network) + '.png'
        figname_hist = 'tempDF' + str(self.working_network) + '.csv'
        figname_hist_n = 'tempDF' + str(self.working_network) + '_norm.csv'
        if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
            os.mkdir(os.path.join(os.getcwd(), 'temp'))
        figname = os.path.join(os.getcwd(), 'temp', figname)
        figname_hist = os.path.join(os.getcwd(), 'temp', figname_hist)
        plt.savefig(figname)
        hist.to_csv(figname_hist, index=False)
        hist_n.to_csv(os.path.join(os.getcwd(), 'temp', figname_hist_n), index=False)
        plt.close()

    def distribution_plot(self, absolute_dir: str) -> None:
        """
        Generate distribution plots for all networks.

        Args:
            absolute_dir (str): Directory containing absolute data.
        """
        min_max_log = pd.read_csv(os.path.join(absolute_dir, 'faster', 'vectors', 'vec', 'min_max.csv'), header=None, sep=' ', names=['param', 'min', 'max', 'log'])
        for idx in range(self.total_networks):
            self.working_network = idx
            outputs = self.oghma_network_config['sims'][self.networks_configured[idx]]['outputs']
            network_min_max_log = min_max_log[min_max_log['param'].isin(outputs)]
            min = network_min_max_log['min'].values[0]
            max = network_min_max_log['max'].values[0]
            log = network_min_max_log['log'].values[0]
            p = self.predictions[:, idx, 0]
            m = np.mean(p)
            fig, ax = plt.subplots(figsize=(6, 6), dpi=300)
            counts, bins, _ = ax.hist(p, bins=10000, range=(min, max))
            ax.axvline(m, color='tab:orange')
            ax.set_xlabel('Prediction')
            ax.set_ylabel('Count')
            hist = pd.DataFrame()
            hist['bins'] = p
            figname = 'tempDF' + str(self.working_network) + '.png'
            hist_figname = 'tempDF' + str(self.working_network) + '.csv'
            if not os.path.isdir(os.path.join(os.getcwd(), 'temp')):
                os.mkdir(os.path.join(os.getcwd(), 'temp'))
            # persist the figure and histogram data under ./temp
            plt.savefig(os.path.join(os.getcwd(), 'temp', figname))
            hist.to_csv(os.path.join(os.getcwd(), 'temp', hist_figname), index=False)
            plt.close()
Base class for managing and training machine learning networks.
This class provides methods for initializing, configuring, and training networks, as well as handling input and output data for machine learning tasks. It also supports subclassing for different types of networks.
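A plausible end-to-end workflow for this class, assuming a trained network directory and a secondary 'absolute' dataset laid out like the training data (paths and names illustrative):

>>> residual = Residual('networks/', model_settings)
>>> residual.train()                          # pairwise-difference training
>>> residual.predict('absolute_data/', experimental)
>>> residual.mean, residual.std               # per-network distribution statistics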
Ancestors
Networks
Methods
def combinations(self)
def confusion_matrix(self, absolute_dir: str | None = None) -> numpy.ndarray
Generate a confusion matrix for the difference-based networks.
Args
absolute_dir : str, optional
    Directory containing absolute data.
Returns
numpy.ndarray
    Mean Absolute Percentage Error (MAPE) for the networks.
def denormalise_prediction_single(self, absolute_dir: str) -> numpy.ndarray
Denormalise a single prediction to its original scale.
Args
absolute_dir : str
    Directory containing absolute data.
Returns
numpy.ndarray
    Denormalised prediction.
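The inverse mappings used here are not shown in this listing, but their presumed forms follow the arithmetic visible in denormalise_predictions; a standalone sketch:

import numpy as np

def denorm_linear(n, lo, hi):
    # invert min-max scaling: n in [0, 1] -> original units
    return n * (hi - lo) + lo

def denorm_log(n, lo, hi):
    # invert scaling performed in log10 space, then return to linear units
    return 10 ** (n * (np.log10(hi) - np.log10(lo)) + np.log10(lo))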
def denormalise_predictions(self, absolute_dir: str) -> None
Denormalise predictions to their original scale.
Args
absolute_dir : str
    Directory containing absolute data.
def denormalise_target_single(self, absolute_dir: str) -> numpy.ndarray
Denormalise a single target to its original scale.
Args
absolute_dir : str
    Directory containing absolute data.
Returns
numpy.ndarray
    Denormalised target.
def distribution_plot(self, absolute_dir: str) -> None
Generate distribution plots for all networks.
Args
absolute_dir : str
    Directory containing absolute data.
def distribution_plot_single(self, absolute_dir: str, idx: int, predictions: numpy.ndarray, norm_predictions: numpy.ndarray) -> None
Generate a distribution plot for a single network.
Args
absolute_dir : str
    Directory containing absolute data.
idx : int
    Index of the network.
predictions : numpy.ndarray
    Denormalised predictions to plot.
norm_predictions : numpy.ndarray
    Normalised predictions, saved alongside the plot data.
def generate_uniform_distribution(self) -> None
Generate a uniform distribution of the absolute dataset.
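The same inverse-density idea in a self-contained form: each sample is kept with probability proportional to one minus its normalised KDE density, flattening a skewed distribution (all names illustrative):

import random
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
targets = rng.normal(0.5, 0.1, size=2000)           # samples bunched around 0.5

density = stats.gaussian_kde(targets)(targets)       # estimated density per sample
weights = 1 - (density - density.min()) / (density.max() - density.min())
weights = weights / weights.sum()                    # valid probability vector

picked = random.choices(range(len(targets)), weights=weights, k=len(targets) // 2)
flattened = targets[picked]                          # distribution closer to uniform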
def permutations(self)
def permutations_lowRAM(self)
def permutations_normal_distribution(self, features: numpy.ndarray, targets: numpy.ndarray) -> Tuple[numpy.ndarray, numpy.ndarray]
Generate permutations of features and targets with a normal distribution.
Args
features : numpy.ndarray
    Input features.
targets : numpy.ndarray
    Target values.
Returns
tuple
    Features and targets with generated permutations.
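The pairing scheme in isolation: every training row concatenates two feature vectors, and its target is the difference of their labels. A small worked example (values illustrative):

import itertools
import numpy as np

features = np.arange(12.0).reshape(4, 3)    # 4 samples, 3 input points each
targets = np.array([0.1, 0.4, 0.2, 0.9])

pairs = np.array(list(itertools.permutations(range(4), r=2)))   # 12 ordered pairs
x = np.hstack([features[pairs[:, 0]], features[pairs[:, 1]]])   # shape (12, 6)
y = targets[pairs[:, 0]] - targets[pairs[:, 1]]                 # shape (12,)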
def predict(self, absolute_dir: str, *experimental_features) -> None
Predict outputs for given experimental features using difference-based models.
Args
absolute_dir : str
    Directory containing absolute data.
*experimental_features
    One or more experimental feature objects to predict outputs for.
def renormalise(self, absolute_v: pandas.core.frame.DataFrame, absolute_dir: str) -> pandas.core.frame.DataFrame
Renormalise absolute values to match the network's configuration.
Args
absolute_v : pandas.DataFrame - Absolute values to renormalise.
absolute_dir : str - Directory containing absolute data.
Returns
pandas.DataFrame - Renormalised values.
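The round trip amounts to undoing one min-max scaling and applying another. A minimal sketch of the linear case, assuming the usual convention x_norm = (x - min) / (max - min); the exact formulas live in the inherited denormalise_linear/normalise_linear helpers and their log counterparts, so the numbers here are purely illustrative:

import numpy as np

# Illustrative min-max round trip (assumed convention, not the real helpers).
def denorm_linear(x, lo, hi):
    return x * (hi - lo) + lo            # [0, 1] -> physical units

def norm_linear(x, lo, hi):
    return (x - lo) / (hi - lo)          # physical units -> [0, 1]

abs_lo, abs_hi = 1e15, 1e18              # hypothetical bounds of the absolute dataset
net_lo, net_hi = 1e14, 1e19              # hypothetical bounds the network was trained on

x_abs = np.array([0.0, 0.5, 1.0])        # values normalised in the absolute frame
x_phys = denorm_linear(x_abs, abs_lo, abs_hi)
x_net = norm_linear(x_phys, net_lo, net_hi)   # same values, re-expressed in the network frame
print(x_net)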
def setup_absolute_dataset(self, absolute_dir: str) ‑> None
-
Expand source code
def setup_absolute_dataset(self, absolute_dir: str) -> None:
    """
    Set up the absolute dataset for predictions.

    Args:
        absolute_dir (str): Directory containing absolute data.
    """
    absolute_vectors_dir = os.path.join(absolute_dir, 'faster', 'vectors.csv')

    inputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['inputs']
    outputs = self.oghma_network_config['sims'][self.networks_configured[self.working_network]]['outputs']
    self.inputs = inputs
    self.outputs = outputs

    with open(absolute_vectors_dir, 'r') as f:
        v = pd.read_csv(f, delimiter=' ')
    self.vectors = self.renormalise(v, absolute_dir)

    # Keep only the feature columns whose names contain one of the input stems
    col = self.vectors.columns.to_list()
    inp = [i + '.vec' for i in self.inputs]
    V = []
    for i in inp:
        vecs = np.where(np.char.find(np.char.lower(col), i) > -1)[0]
        V = np.append(V, vecs)
    col = [col[int(x)] for x in V]

    self.absolute_targets = self.vectors[outputs].to_numpy()
    self.absolute_features = self.vectors[col].to_numpy()
    # self.generate_uniform_distribution()
Set up the absolute dataset for predictions.
Args
absolute_dir : str - Directory containing absolute data.
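The column selection relies on substring matching against lower-cased column names. A self-contained sketch with hypothetical column names:

import numpy as np

# How the input columns are selected (column names are hypothetical).
col = ['light_1.0.vec.0', 'light_1.0.vec.1', 'mobility', 'trap_density']
inp = ['light_1.0.vec']  # stems built as input_name + '.vec'

V = []
for i in inp:
    # np.char.find returns the index of the substring, or -1 if absent
    hits = np.where(np.char.find(np.char.lower(col), i) > -1)[0]
    V = np.append(V, hits)

print([col[int(x)] for x in V])  # -> ['light_1.0.vec.0', 'light_1.0.vec.1']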
def setup_confusion_matrix_features(self) ‑> Tuple[numpy.ndarray, numpy.ndarray]
-
Expand source code
def setup_confusion_matrix_features(self) -> Tuple[np.ndarray, np.ndarray]:
    """
    Set up features for generating a confusion matrix.

    Returns:
        tuple: Targets and features for the confusion matrix.
    """
    absolute_features = self.absolute_features
    absolute_targets = self.absolute_targets

    x_features = absolute_features
    x_targets = absolute_targets[::-1]

    y_features = absolute_features
    y_targets = absolute_targets[::-1]

    # Pair every feature row in both (x, y) and (y, x) orders so the
    # difference network sees each pairing from both sides
    features1 = np.concatenate((x_features, y_features), axis=1)
    features2 = np.concatenate((y_features, x_features), axis=1)
    features = np.concatenate((features1, features2), axis=0)
    targets = np.concatenate((x_targets, y_targets), axis=0)

    return targets, features
Set up features for generating a confusion matrix.
Returns
tuple - Targets and features for the confusion matrix.
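A quick shape check of the symmetric pairing on toy data (names and values are illustrative):

import numpy as np

absolute_features = np.arange(6).reshape(3, 2)   # 3 samples, 2 feature points each
absolute_targets = np.array([0.1, 0.2, 0.3])

features1 = np.concatenate((absolute_features, absolute_features), axis=1)
features2 = np.concatenate((absolute_features, absolute_features), axis=1)
features = np.concatenate((features1, features2), axis=0)
targets = np.concatenate((absolute_targets[::-1], absolute_targets[::-1]), axis=0)

print(features.shape)  # (6, 4): each sample paired in both orders
print(targets.shape)   # (6,)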
def setup_experimental_feature(self, experimental_features) ‑> numpy.ndarray
-
Expand source code
def setup_experimental_feature(self, experimental_features) -> np.ndarray:
    """
    Pair experimental curves with the absolute dataset for difference prediction.
    """
    absolute_features = self.absolute_features

    try:
        len_exp_features = len(experimental_features)
        self.len_exp_features = int(len_exp_features / 2)
        len_exp0 = len(experimental_features[0].y)
    except (TypeError, AttributeError):
        # A single feature object was passed rather than a list
        len_exp_features = 1
        self.len_exp_features = 1
        len_exp0 = len(experimental_features.y)

    match len_exp_features:
        case 1:
            # Tile the single experimental curve against every absolute row,
            # in both (absolute, experimental) and (experimental, absolute) order
            y = np.tile(experimental_features.y, (len(absolute_features), 1))
            y = np.reshape(y, np.shape(absolute_features))
            features1 = np.concatenate((absolute_features, y), axis=1)
            features2 = np.concatenate((y, absolute_features), axis=1)
            features = np.concatenate((features1, features2), axis=0)
        case 2:
            abf_1 = absolute_features[:, 0:len_exp0]
            abf_2 = absolute_features[:, len_exp0:len_exp0 * 2]

            y1 = np.tile(experimental_features[0].y, (len(abf_1), 1))
            y1 = np.reshape(y1, np.shape(abf_1))
            y2 = np.tile(experimental_features[1].y, (len(abf_2), 1))
            y2 = np.reshape(y2, np.shape(abf_2))

            abf = np.concatenate((abf_1, abf_2), axis=1)
            abf_r = np.concatenate((abf_2, abf_1), axis=1)
            abf = np.concatenate((abf, abf_r), axis=0)

            y = np.concatenate((y1, y2), axis=1)
            y_r = np.concatenate((y2, y1), axis=1)
            y = np.concatenate((y, y_r), axis=0)

            features = np.concatenate((abf, y), axis=1)
        case 4:
            abf_1 = absolute_features[:, 0:len_exp0]
            abf_2 = absolute_features[:, len_exp0:len_exp0 * 2]
            abf_3 = absolute_features[:, len_exp0 * 2:len_exp0 * 3]
            abf_4 = absolute_features[:, len_exp0 * 3:len_exp0 * 4]

            y1 = np.tile(experimental_features[0].y, (len(abf_1), 1))
            y1 = np.reshape(y1, np.shape(abf_1))
            y2 = np.tile(experimental_features[1].y, (len(abf_2), 1))
            y2 = np.reshape(y2, np.shape(abf_2))
            y3 = np.tile(experimental_features[2].y, (len(abf_3), 1))
            y3 = np.reshape(y3, np.shape(abf_3))
            y4 = np.tile(experimental_features[3].y, (len(abf_4), 1))
            y4 = np.reshape(y4, np.shape(abf_4))

            features1 = np.concatenate((abf_1, y1), axis=1)
            features1_r = np.concatenate((y1, abf_1), axis=1)
            features1 = np.concatenate((features1, features1_r), axis=0)

            features2 = np.concatenate((abf_2, y2), axis=1)
            features2_r = np.concatenate((y2, abf_2), axis=1)
            features2 = np.concatenate((features2, features2_r), axis=0)

            features3 = np.concatenate((abf_3, y3), axis=1)
            features3_r = np.concatenate((y3, abf_3), axis=1)
            features3 = np.concatenate((features3, features3_r), axis=0)

            features4 = np.concatenate((abf_4, y4), axis=1)
            features4_r = np.concatenate((y4, abf_4), axis=1)
            features4 = np.concatenate((features4, features4_r), axis=0)

            features = np.concatenate((features1, features2, features3, features4), axis=1)
        case _:
            # Previously fell through silently with 'features' unbound
            raise ValueError(f'Unsupported number of experimental features: {len_exp_features}')

    return features
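For the single-curve case, the pairing can be demonstrated with a toy stand-in for the experimental feature object (SimpleNamespace here mimics the .y attribute the method expects):

import numpy as np
from types import SimpleNamespace

absolute_features = np.arange(8.0).reshape(4, 2)        # 4 simulated rows, 2 points each
experimental = SimpleNamespace(y=np.array([0.5, 0.7]))  # stand-in for an Input curve

y = np.tile(experimental.y, (len(absolute_features), 1))
features1 = np.concatenate((absolute_features, y), axis=1)  # (simulated, experimental)
features2 = np.concatenate((y, absolute_features), axis=1)  # (experimental, simulated)
features = np.concatenate((features1, features2), axis=0)

print(features.shape)  # (8, 4): both orderings for all 4 rows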
def train(self, idx=None)
-
Expand source code
def train(self, idx=None):
    self.setup_network_directories()
    if idx is None:
        for idx in range(self.total_networks):
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks()
    else:
        self.working_network = idx
        self.load_training_dataset()
        self.permutations_lowRAM()
        self.separate_training_dataset()
        self.train_networks('Residual')
def train_existing(self, idx=None)
-
Expand source code
def train_existing(self, idx=None):
    self.setup_network_directories()
    if idx is None:
        for idx in range(self.total_networks):
            self.working_network = idx
            self.load_training_dataset()
            self.permutations_lowRAM()
            self.separate_training_dataset()
            self.train_networks_existing()
    else:
        self.working_network = idx
        self.load_training_dataset()
        self.permutations_lowRAM()
        self.separate_training_dataset()
        self.train_networks_existing('Residual')
def tune(self)
-
Expand source code
def tune(self):
    self.setup_network_directories()
    for idx in range(self.total_networks):
        self.working_network = idx
        self.load_training_dataset()
        self.permutations()
        self.separate_training_dataset()
        self.tune_networks()
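A minimal training sketch, following the factory pattern used elsewhere in this module; the Model_Settings import path is a guess and should be adapted to the actual package layout:

from PyOghma_ML.Networks import Networks              # import path assumed
from PyOghma_ML.Model_Settings import Model_Settings  # hypothetical import path

settings = Model_Settings()
networks = Networks.initialise('simulation_data/', 'Difference', settings)
networks.train()        # train every configured network in turn
networks.train(idx=0)   # or (re)train a single network by index
networks.tune()         # optional hyperparameter search over the same data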
def turn_predictions_absolute(self) ‑> None
-
Expand source code
def turn_predictions_absolute(self) -> None:
    """
    Convert predictions to absolute values.
    """
    origin = self.absolute_targets
    n = len(self.absolute_targets)
    # The two halves of the prediction array correspond to the two pairing
    # orders, so the predicted difference is applied with opposite sign
    self.normalised_predicitons[:n] = origin - self.normalised_predicitons[:n]
    self.normalised_predicitons[n:] = self.normalised_predicitons[n:] - origin
Convert predictions to absolute values.
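A numeric sketch of the sign convention on toy values (the two halves of the prediction array correspond to the two pairing orders built in setup_experimental_feature):

import numpy as np

origin = np.array([1.0, 2.0])             # known absolute targets
preds = np.array([0.1, 0.2, -0.1, -0.2])  # network difference outputs, both orders
n = len(origin)

preds[:n] = origin - preds[:n]            # (absolute, experimental) ordering
preds[n:] = preds[n:] - origin            # (experimental, absolute) ordering
print(preds)                               # [ 0.9  1.8 -1.1 -2.2]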
Inherited members
Networks:
denormalise_linear
denormalise_log
get_uniform_distribution
initialise
interpret_input_vectors
load_input_vectors
load_training_dataset
mape
normalise_experimental_features
normalise_linear
normalise_log
sample_experimental_features
separate_training_dataset
setup_network_directories
train_networks
train_networks_existing
tune_networks