abcgan package
Submodules
abcgan.constants module
File for global constants used in the program.
abcgan.interface module
Code for top level interface.
This code is added to the main package level in __init__.py
- abcgan.interface.anomaly_estimation_1d(fakes, data)
Compute an unbounded anomaly score for a new data sample using the logsumexp computation method.
- Parameters
fakes (torch.Tensor) – n_samples x n_alt x n_features background variables
data (torch.Tensor) – 1 x n_alt x n_features broadcast n_samples times to match fakes data shape
- Returns
anomalies – 1 x n_alt x n_feat output of anomaly scores (unbounded).
- Return type
np.ndarray
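The logsumexp computation named above can be sketched as follows. This is a minimal illustration, not the abcgan implementation: the Gaussian kernel, the `bandwidth` parameter, and the names `logsumexp` and `kde_anomaly_1d` are assumptions, and NumPy arrays stand in for torch tensors. The score is the negative log of a kernel density estimate built from the generated samples, so a larger value means the observation is less typical.

```python
import numpy as np

def logsumexp(a, axis=0):
    """Numerically stable log(sum(exp(a))) along one axis."""
    a_max = np.max(a, axis=axis, keepdims=True)
    out = a_max + np.log(np.sum(np.exp(a - a_max), axis=axis, keepdims=True))
    return np.squeeze(out, axis=axis)

def kde_anomaly_1d(fakes, data, bandwidth=1.0):
    """Hypothetical score: negative log of a Gaussian kernel density estimate.

    fakes: n_samples x n_alt x n_features generated samples
    data:  1 x n_alt x n_features observation, broadcast against fakes
    Returns a 1 x n_alt x n_features array; larger means more anomalous.
    """
    n_samples = fakes.shape[0]
    # Log of each Gaussian kernel, evaluated at the observation.
    log_kernel = (-0.5 * ((data - fakes) / bandwidth) ** 2
                  - np.log(bandwidth * np.sqrt(2.0 * np.pi)))
    # Log density = logsumexp over the samples minus log(n_samples).
    log_density = logsumexp(log_kernel, axis=0) - np.log(n_samples)
    return -log_density[np.newaxis]

rng = np.random.default_rng(0)
fakes = rng.normal(size=(500, 3, 2))          # toy stand-in for GAN output
typical_score = kde_anomaly_1d(fakes, np.zeros((1, 3, 2)))
outlier_score = kde_anomaly_1d(fakes, np.full((1, 3, 2), 6.0))
```

Working in log space keeps the sum stable when individual kernel values underflow, which is the usual reason to prefer logsumexp over a direct sum of exponentials.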
- abcgan.interface.anomaly_estimation_nd(fakes, data)
Compute an unbounded anomaly score for a new data sample using the logsumexp computation method (N-dimensional).
- Parameters
fakes (torch.Tensor) – n_samples x n_alt x n_features background variables
data (torch.Tensor) – 1 x n_alt x n_features broadcast n_samples times to match fakes data shape
- Returns
anomalies – 1 x n_alt x n_feat output of anomaly scores (unbounded).
- Return type
np.ndarray
- abcgan.interface.anomaly_score(drivers, data=None, model='mm_gan_radar', bv_type='radar')
Return an unbounded anomaly score for a given set of driver parameters and data. More positive scores indicate greater confidence.
- Parameters
drivers (np.ndarray) – 1 x n_drivers input driving parameters (not z-scaled), one sample at a time
data (np.ndarray) – 1 x n_alt_in x n_meas
model (str, optional) – name of model to use
bv_type (str, optional) – name of the type of background variables to use (lidar or radar)
- Returns
anomalies – 1 x n_alt x n_feat output of anomaly scores (unbounded).
- Return type
np.ndarray
- abcgan.interface.discriminate(drivers, measurements, driver_names=['Ap', 'F10.7', 'F10.7avg', 'MLT', 'SLT', 'SZA', 'ap', 'MEI', 'RMM1', 'RMM2', 'TCI', 'moon_phase', 'moon_x', 'moon_y', 'moon_z'], model='mm_gan_radar', bv_type='radar')
Score how well the measurements match with historical observations.
- Parameters
drivers (np.ndarray) – n_samples x n_drivers input list of driving parameters (not z-scaled).
driver_names (list) – list of names of driving parameters
measurements (np.ndarray) – n_samples x n_alt_in x n_meas input list of altitude measurements, n_alt_in should be less than max_alt.
model (str, optional) – name of model to use
bv_type (str, optional) – name of the type of background variables to use (lidar or radar)
- Returns
scores – n_samples x n_alt output normalcy scores in the range [0, 1.0].
- Return type
np.ndarray
- abcgan.interface.estimate_drivers(drivers, model='dr_gan')
Predict drivers 2 hours into the future using the driver GAN model. Used for real-time background predictions using drivers from 2 hours ago.
- Parameters
drivers (np.ndarray) – n_samples x n_drivers input list of driving parameters (not z-scaled).
model (str, optional) – name of model to use
- Returns
predicted_drivers – estimate of the driver features two hours after the input drivers
- Return type
np.ndarray
- abcgan.interface.gen_stats(drivers, data=None, model='mm_gan_radar', bv_type='radar')
Statistical distribution of 10,000 upper altitude data points conditioned on driver parameters and lower altitude measurements.
- Parameters
drivers (np.ndarray) – n_samples x n_drivers input list of driving parameters (not z-scaled).
data (np.ndarray) – n_samples x n_alt_in x n_meas
model (str, optional) – name of model to use
bv_type (str, optional) – name of the type of background variables to use (lidar or radar)
- Returns
samples – 2 x (n_avg * n_samples) x n_alt x n_feat output anomaly scores (unbounded). The first element is the fake output. The second element contains the scaled background variables with repeats.
- Return type
[np.ndarray, np.ndarray]
- abcgan.interface.generate(drivers, driver_names=['Ap', 'F10.7', 'F10.7avg', 'MLT', 'SLT', 'SZA', 'ap', 'MEI', 'RMM1', 'RMM2', 'TCI', 'moon_phase', 'moon_x', 'moon_y', 'moon_z'], measurements=None, n_alt=30, model='mm_gan_radar', bv_type='radar')
Generate synthetic data consistent with the historical distribution.
- Parameters
drivers (np.ndarray) – n_samples x n_drivers input list of driving parameters (not z-scaled).
driver_names (list) – list of names of driving parameters
measurements (np.ndarray, optional) – n_samples x n_alt_in x n_meas input list of altitude measurements, n_alt_in should be less than n_alt. These represent fixed measurements for the lowest altitudes to condition on. Usually left as default (None)
n_alt (int, optional) – number of altitude measurements to draw, defaults to max_alt
model (str, optional) – name of model to use
bv_type (str, optional) – name of the type of background variables to use (lidar or radar)
- Returns
samples – n_samples x n_alt x n_meas output measurements at each requested altitude. If measurements is not None then the measurements for the first n_alt_in will be copied over from the input.
- Return type
np.ndarray
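The output contract described above, where supplied measurements occupy the first n_alt_in altitudes of the result, can be illustrated with a short sketch. The helper name `apply_conditioning` is hypothetical, and the code only shows the shape bookkeeping; it says nothing about how the GAN itself conditions on the measurements.

```python
import numpy as np

def apply_conditioning(samples, measurements=None):
    """Copy fixed low-altitude measurements into generated samples.

    samples:      n_samples x n_alt x n_meas generated values
    measurements: n_samples x n_alt_in x n_meas fixed values, or None
    """
    out = samples.copy()
    if measurements is not None:
        n_alt_in = measurements.shape[1]
        if n_alt_in >= samples.shape[1]:
            raise ValueError("n_alt_in should be less than n_alt")
        # The lowest n_alt_in altitudes are taken from the input unchanged.
        out[:, :n_alt_in, :] = measurements
    return out

samples = np.zeros((2, 5, 3))        # toy generated output, n_alt = 5
measurements = np.ones((2, 2, 3))    # toy fixed input, n_alt_in = 2
conditioned = apply_conditioning(samples, measurements)
```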
- abcgan.interface.stack_bvs(bv_dict, bv_type='radar')
Stacks background variables in the appropriate format.
This function is provided for convenience.
- Parameters
bv_dict (dict) – Dictionary mapping names of background variables to numpy arrays with values for those bvs. Each array should have shape n_samples x n_altitudes. Can also use an h5py.Group.
bv_type (str) – string specifying whether to stack radar or lidar data
Valid names for background variables can be found at abcgan.bv_names.
- Raises
ValueError: – If the input shape of the bv dict values is not correct.
KeyError: – If one of the required bvs is missing.
- abcgan.interface.stack_drivers(driver_dict, driver_names=['Ap', 'F10.7', 'F10.7avg', 'MLT', 'SLT', 'SZA', 'ap', 'MEI', 'RMM1', 'RMM2', 'TCI', 'moon_phase', 'moon_x', 'moon_y', 'moon_z'])
Stacks drivers in appropriate format.
This function is provided for convenience.
- Parameters
driver_dict (dict) – Dictionary mapping names of drivers to the numpy arrays with values for those drivers. Each array has a single dimension of the same length n_samples. Can also use an h5py.Group.
driver_names (list) – names of the drivers to load
Valid names for drivers can be found at abcgan.driver_names.
- Raises
ValueError: – If the driver values have the wrong type or shape.
KeyError: – If one of the required drivers is missing.
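What stacking amounts to can be sketched in plain NumPy. The function below is a hypothetical stand-in, not the library code: it assumes the result is an n_samples x n_drivers array with one column per requested name, and it mirrors the documented KeyError and ValueError conditions.

```python
import numpy as np

def stack_drivers_sketch(driver_dict, driver_names):
    """Stack 1-D driver arrays into an n_samples x n_drivers array."""
    columns = []
    for name in driver_names:
        # A missing name raises KeyError from the dictionary lookup.
        values = np.asarray(driver_dict[name])
        if values.ndim != 1:
            raise ValueError(f"driver {name!r} must be one-dimensional")
        columns.append(values)
    if len({len(col) for col in columns}) > 1:
        raise ValueError("all drivers must share the same length n_samples")
    return np.stack(columns, axis=-1)

# Toy values for two of the documented driver names.
driver_dict = {"Ap": [4.0, 7.0, 12.0], "F10.7": [70.1, 71.3, 69.8]}
drivers = stack_drivers_sketch(driver_dict, ["Ap", "F10.7"])
print(drivers.shape)  # (3, 2)
```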
abcgan.mask module
- abcgan.mask.mask_altitude(bv_feat)
Creates an altitude mask for nans in bvs.
Also replaces nans with numbers.
- Parameters
bv_feat (torch.Tensor) – background variables
- Returns
bv_feat (torch.Tensor) – bv_feat with nans replaced, done in place but returned for clarity
alt_mask (torch.Tensor) – Mask that is true for valid altitudes
- Raises
ValueError: – If valid values are not contiguous.
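A minimal NumPy sketch of the masking behaviour described above. It is an analogue for illustration only: the real function takes a torch.Tensor, and the fill value of zero, the n_batch x n_alt x n_feat layout, and the reading of "contiguous" as a single unbroken run of valid altitudes are assumptions.

```python
import numpy as np

def mask_altitude_sketch(bv_feat):
    """Build an altitude mask and replace NaNs in place.

    bv_feat: n_batch x n_alt x n_feat array, NaN where data is missing.
    Returns (bv_feat, alt_mask) with alt_mask True for valid altitudes.
    """
    # An altitude is valid only if none of its features is NaN.
    alt_mask = ~np.isnan(bv_feat).any(axis=-1)
    # Count where a valid run starts in each row; more than one start
    # means the valid altitudes are split into separate runs.
    padded = np.pad(alt_mask, ((0, 0), (1, 0)), constant_values=False)
    run_starts = (~padded[:, :-1] & padded[:, 1:]).sum(axis=1)
    if (run_starts > 1).any():
        raise ValueError("valid altitudes are not contiguous")
    # Replace NaNs in place; zero is an assumed fill value.
    bv_feat[np.isnan(bv_feat)] = 0.0
    return bv_feat, alt_mask

bv = np.ones((1, 4, 2))
bv[0, 2:, :] = np.nan                 # the top two altitudes are missing
bv, mask = mask_altitude_sketch(bv)
print(mask.tolist())  # [[True, True, False, False]]
```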
- abcgan.mask.prev_driver_mask(unix_time)
Creates a driver mask of samples that have a previous sample and a mapping vector to the previous sample.
- Parameters
unix_time (np.array) – time stamp of driver samples
- Returns
prev_dr_map (np.array) – vector mapping each sample to its delayed sample
dr_mask (torch.Tensor) – Mask of valid driver samples that have a delayed sample
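One way to build such a mapping is a sorted lookup, sketched below. Several details are assumptions rather than documented behaviour: the 2-hour delay (borrowed from the description of estimate_drivers), the requirement that time stamps are sorted and match exactly, and the choice to return a full-length index vector that is only meaningful where the mask is True.

```python
import numpy as np

def prev_driver_mask_sketch(unix_time, delay=7200):
    """Map each sample to the sample recorded `delay` seconds earlier.

    unix_time: 1-D array of time stamps, sorted in ascending order.
    Returns (prev_map, mask); prev_map[i] is valid only where mask[i] is True.
    """
    unix_time = np.asarray(unix_time)
    target = unix_time - delay
    # Position of the first time stamp that is >= the delayed target.
    prev_map = np.searchsorted(unix_time, target)
    prev_map = np.clip(prev_map, 0, len(unix_time) - 1)
    # A sample is valid only if a time stamp matches its target exactly.
    mask = unix_time[prev_map] == target
    return prev_map, mask

times = np.array([0, 3600, 7200, 10800, 20000])
prev_map, mask = prev_driver_mask_sketch(times)
print(mask.tolist())            # [False, False, True, True, False]
print(prev_map[mask].tolist())  # [0, 1]
```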
abcgan.mean_estimation module
- class abcgan.mean_estimation.Transformer(d_dr: int = 18, d_bv: int = 12, n_alt: int = 30, d_model: int = 64, nhead: int = 1, num_encoder_layers: int = 1, dim_feedforward: int = 64, dropout: float = 0.0, activation: str = 'relu')
Bases:
torch.nn.modules.module.Module
Transformer Class with only the encoder
- Parameters
d_model (int) – the number of expected features in the encoder/decoder inputs
d_stack (int) – the number of features to stack to output
nhead (int) – the number of heads in the multi-head attention models
num_encoder_layers (int) – the number of sub-encoder-layers in the encoder
dim_feedforward (int) – the dimension of the feedforward network model
dropout (float) – the dropout value
activation (str) – the activation function of encoder/decoder intermediate layer
- forward(driver_src: torch.Tensor, bv_src: torch.Tensor, src_key_padding_mask: Optional[torch.Tensor] = None)
Take in and process masked source/target sequences.
- Parameters
driver_src (torch.Tensor) – (n_batch, d_dr) the sequence to the encoder (required).
bv_src (torch.Tensor) – (n_batch, n_alt, d_bv) the sequence to the decoder (required).
src_key_padding_mask (torch.Tensor, optional) – the ByteTensor mask for src keys per batch.
- generate_square_subsequent_mask(sz: int) torch.Tensor
Generate a square mask for the sequence. The masked positions are filled with float(‘-inf’). Unmasked positions are filled with float(0.0).
- Parameters
sz (int) – the size of the square mask (sz x sz)
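The mask layout can be sketched in NumPy. This assumes the method follows the causal convention of torch.nn.Transformer.generate_square_subsequent_mask, where position i may attend only to positions at or before i, so entries above the diagonal are masked. The function name is illustrative.

```python
import numpy as np

def square_subsequent_mask(sz):
    """Return an sz x sz mask: -inf above the diagonal, 0.0 elsewhere."""
    rows, cols = np.indices((sz, sz))
    # A column index greater than the row index marks a future position.
    return np.where(cols > rows, -np.inf, 0.0)

mask = square_subsequent_mask(3)
# Row 0 can see only position 0; row 2 can see positions 0, 1 and 2.
```

Adding this mask to attention logits before the softmax drives the weight on masked positions to zero.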
- training: bool
abcgan.model module
- class abcgan.model.Critic(transformer: torch.nn.modules.module.Module, n_layers=4, img_dim=12, hidden_dim=128)
Bases:
torch.nn.modules.module.Module
Critic Class
- Parameters
transformer (torch.nn.Module) – transformer for the critic
n_layers (int) – number of layers in MLP
img_dim (int) – the dimension of the images, fitted for the dataset used, a scalar
hidden_dim (int) – the inner dimension, a scalar
- forward(bv_features, driver_src, real, src_key_mask=None)
Function for completing a forward pass of the critic: given an image tensor, returns a 1-dimensional tensor representing a fake/real prediction.
- Parameters
bv_features (torch.Tensor) – a flattened image tensor with dimension (n_batch, max_alt, n_bv_feat)
driver_src (torch.Tensor) – tensor of driver features from data loader (n_batch, n_dr_feat)
real (torch.Tensor) – tensor of bv features from data loader (n_batch, n_alt, n_bv_feat)
src_key_mask (torch.Tensor, optional) – mask for bv features from data loader (n_batch, n_alt)
- training: bool
- class abcgan.model.Driver_Critic(n_layers=2, img_dim=18, hidden_dim=64)
Bases:
torch.nn.modules.module.Module
Critic Class
- Parameters
n_layers (int) – number of layers in MLP
img_dim (int) – the dimension of the images, fitted for the dataset used, a scalar
hidden_dim (int) – the inner dimension, a scalar
- forward(dr_src, dr_prev)
Forward pass of the critic for driver augmentation: given an image tensor, returns a 1-dimensional tensor representing a fake/real prediction.
- Parameters
dr_src (torch.Tensor) – tensor of driver features (n_batch, n_dr_feat)
dr_prev (torch.Tensor) – tensor of past driver features (n_batch, n_dr_feat)
- training: bool
- class abcgan.model.Driver_Generator(n_layers=2, latent_dim=16, img_dim=18, hidden_dim=64)
Bases:
torch.nn.modules.module.Module
Generator Class
- Parameters
n_layers (int) – number of MLP layers
latent_dim (int) – the dimension of the input latent vector
img_dim (int) – the dimension of the images, fitted for the dataset used, a scalar
hidden_dim (int) – the inner dimension, a scalar
- forward(dr_prev, noise=None)
forward pass of the generator for driver augmentation: Given driver sample from the past and noise tensor, returns generated driver sample.
- Parameters
dr_prev (torch.Tensor) – tensor of past driver features from data loader (n_batch, n_dr_feat)
noise (torch.Tensor, optional) – a noise tensor with dimensions (n_batch, latent_dim)
- training: bool
- class abcgan.model.Generator(transformer: torch.nn.modules.module.Module, n_layers=4, latent_dim=16, img_dim=12, hidden_dim=128)
Bases:
torch.nn.modules.module.Module
Generator Class
- Parameters
transformer (torch.nn.Module) – transformer for the generator
n_layers (int) – number of MLP layers
latent_dim (int) – the dimension of the input latent vector
img_dim (int) – the dimension of the images, fitted for the dataset used, a scalar
hidden_dim (int) – the inner dimension, a scalar
- forward(driver_src, bv_src, src_key_mask=None, noise=None)
Function for completing a forward pass of the generator: Given a noise tensor, returns generated images.
- Parameters
driver_src (torch.Tensor) – tensor of driver features from data loader (n_batch, n_dr_feat)
bv_src (torch.Tensor) – tensor of bv features from data loader (n_batch, n_alt, n_bv_feat)
src_key_mask (torch.Tensor, optional) – mask for bv features from data loader (n_alt, n_batch)
noise (torch.Tensor, optional) – a noise tensor with dimensions (n_batch, latent_dim)
- training: bool
abcgan.persist module
This module supports persistence of the generator and discriminator.
It saves two files: a parameters file and a configuration file.
It also supports persisting of multiple modules.
To be persistable in this way, the module must have a property containing a JSON-serializable input dictionary, available as mdl.input_args.
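The two-file pattern described above can be sketched without PyTorch. Everything here is illustrative: `ToyModule`, the `.json` and `.npz` file names, and the helper functions are assumptions, and the real module stores torch parameters rather than NumPy arrays. The point is that the configuration file carries the constructor arguments, so the object can be rebuilt before its parameters are loaded.

```python
import json
import os
import tempfile

import numpy as np

class ToyModule:
    """Hypothetical module exposing JSON-serializable constructor arguments."""
    def __init__(self, n_in=3, n_out=2):
        self.input_args = {"n_in": n_in, "n_out": n_out}
        self.weight = np.zeros((n_out, n_in))

def persist_sketch(mdl, name, dir_path):
    """Write <name>.json (configuration) and <name>.npz (parameters)."""
    with open(os.path.join(dir_path, name + ".json"), "w") as f:
        json.dump(mdl.input_args, f)
    np.savez(os.path.join(dir_path, name + ".npz"), weight=mdl.weight)

def recreate_sketch(name, dir_path):
    """Rebuild the module from its configuration, then load its parameters."""
    with open(os.path.join(dir_path, name + ".json")) as f:
        mdl = ToyModule(**json.load(f))
    with np.load(os.path.join(dir_path, name + ".npz")) as params:
        mdl.weight = params["weight"]
    return mdl

with tempfile.TemporaryDirectory() as tmp:
    original = ToyModule(n_in=4, n_out=1)
    original.weight += 0.5
    persist_sketch(original, "toy", tmp)
    restored = recreate_sketch("toy", tmp)
```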
- abcgan.persist.fullname(inst)
- abcgan.persist.persist(generator, critic, name='wgan_gp', dir_path='/home/valentic/sandbox/atmosense/test/lib/python3.9/site-packages/abcgan/models')
Persists abcgan generator and critic modules.
Persists both input arguments and parameters.
- Parameters
generator – torch.nn.Module module for the generator
critic – torch.nn.Module module for the critic
name – str, optional name of the saved configuration
dir_path – str, optional default is the models directory. None assumes file is in local directory.
The generator, critic and any transformers passed in as arguments to these must be registered in persist.py and must have a parameter ‘input_args’ that specifies their input arguments as a dictionary
- abcgan.persist.recreate(name='wgan_gp', dir_path='/home/valentic/sandbox/atmosense/test/lib/python3.9/site-packages/abcgan/models')
Load a pre-trained generator and discriminator.
- Parameters
name (str, optional) – name of the configuration to load, as saved by persist. default: ‘wgan_gp’
dir_path (str, optional) – default is the models directory. None assumes file is in local directory.
- Returns
generator (torch.nn.Module) – the loaded generator
critic (torch.nn.Module) – the loaded critic
Modules must have previously been saved. All modules are loaded on the CPU; they can subsequently be moved.
abcgan.transforms module
Transforms to and from z-scaled variables.
Uses numpy only (no pytorch)
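A z-scaling round trip can be sketched as follows. This is a generic illustration under stated assumptions: the statistics are computed from toy data rather than from the stored historical values the library would use, the function names are hypothetical, and any per-variable encoding applied by the library before scaling is left out.

```python
import numpy as np

def z_scale(x, mean, std):
    """Map raw values to zero mean and unit variance per column."""
    return (x - mean) / std

def z_unscale(z, mean, std):
    """Invert z_scale to recover the raw values."""
    return z * std + mean

# Toy data: three samples of two drivers.
drivers = np.array([[4.0, 70.0], [10.0, 90.0], [16.0, 110.0]])
mean = drivers.mean(axis=0)
std = drivers.std(axis=0)

driver_feat = z_scale(drivers, mean, std)
recovered = z_unscale(driver_feat, mean, std)
```

The pairing mirrors the scale_* and get_* functions in this module: one direction produces features, the other inverts the featurization.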
- abcgan.transforms.compute_valid(bvs, bv_thresholds=array([[-1.00000000e+00, 2.88214929e+14], [1.00000000e+00, 1.89264686e+12], [-1.00000000e+00, 5.00000000e+05], [-1.00000000e+00, 4.39506857e+09], [-1.00000000e+00, 1.00247000e+05], [-1.00000000e+00, 8.45428636e+06], [-2.00000000e+03, 2.00000000e+03], [1.00000000e-06, 2.00000000e+03], [-2.00000000e+03, 2.00000000e+03], [1.00000000e-06, 2.00000000e+03], [-2.00000000e+03, 2.00000000e+03], [1.00000000e-06, 2.00000000e+03]]))
- abcgan.transforms.decode(data, driver_names)
Encode variables, or just add extra dimension
- Parameters
data (np.ndarray) – array of feature values.
driver_names (list: str) – list of driver names in data
- Returns
enc – array of encoded variables
- Return type
np.ndarray
- abcgan.transforms.encode(data, name)
Encode variables, or just add extra dimension
- Parameters
data (np.ndarray) – array of variable values.
name (str) – name of the variable.
- Returns
enc – array of encoded variables (with an extra dimension in all cases)
- Return type
np.ndarray
- abcgan.transforms.get_bv(bv_feat, bv_type='radar')
Invert featurization to recover bvs.
- Parameters
bv_feat (np.ndarray) – n_samples x n_bv_feat
bv_type (str) – radar or lidar bvs
- Returns
scaled_feat – n_samples x n_bv
- Return type
np.ndarray
- abcgan.transforms.get_driver(driver_feat, driver_names=['Ap', 'F10.7', 'F10.7avg', 'MLT', 'SLT', 'SZA', 'ap', 'MEI', 'RMM1', 'RMM2', 'TCI', 'moon_phase', 'moon_x', 'moon_y', 'moon_z'])
Invert featurization to recover driving parameters.
- Parameters
driver_feat (np.ndarray) – n_samples x n_driver_feat
driver_names (list: str) – list of driver names in driver_feat
- Returns
original driver – n_samples x n_driver
- Return type
np.ndarray
- abcgan.transforms.scale_bv(bvs, bv_type='radar')
Return a scaled version of the background variables.
- Parameters
bvs (np.ndarray) – n_samples x n_bv
bv_type (str) – string specifying whether to scale radar or lidar data
- Returns
bv_feat – n_samples x n_bv_feat
- Return type
np.ndarray
- abcgan.transforms.scale_driver(drivers, driver_names=['Ap', 'F10.7', 'F10.7avg', 'MLT', 'SLT', 'SZA', 'ap', 'MEI', 'RMM1', 'RMM2', 'TCI', 'moon_phase', 'moon_x', 'moon_y', 'moon_z'])
Return a scaled version of the drivers.
- Parameters
drivers (np.ndarray) – n_samples x n_driver
driver_names (list: str) – list of driver names
- Returns
driver_feat – n_samples x n_driver_feat
- Return type
np.ndarray
Module contents
- abcgan.anomaly_score(drivers, data=None, model='mm_gan_radar', bv_type='radar')
Return an unbounded anomaly score for a given set of driver parameters and data. More positive scores indicate greater confidence.
- Parameters
drivers (np.ndarray) – 1 x n_drivers input driving parameters (not z-scaled), one sample at a time
data (np.ndarray) – 1 x n_alt_in x n_meas
model (str, optional) – name of model to use
bv_type (str, optional) – name of the type of background variables to use (lidar or radar)
- Returns
anomalies – 1 x n_alt x n_feat output of anomaly scores (unbounded).
- Return type
np.ndarray
- abcgan.discriminate(drivers, measurements, driver_names=['Ap', 'F10.7', 'F10.7avg', 'MLT', 'SLT', 'SZA', 'ap', 'MEI', 'RMM1', 'RMM2', 'TCI', 'moon_phase', 'moon_x', 'moon_y', 'moon_z'], model='mm_gan_radar', bv_type='radar')
Score how well the measurements match with historical observations.
- Parameters
drivers (np.ndarray) – n_samples x n_drivers input list of driving parameters (not z-scaled).
driver_names (list) – list of names of driving parameters
measurements (np.ndarray) – n_samples x n_alt_in x n_meas input list of altitude measurements, n_alt_in should be less than max_alt.
model (str, optional) – name of model to use
bv_type (str, optional) – name of the type of background variables to use (lidar or radar)
- Returns
scores – n_samples x n_alt output normalcy scores in the range [0, 1.0].
- Return type
np.ndarray
- abcgan.estimate_drivers(drivers, model='dr_gan')
Predict drivers 2 hours into the future using the driver GAN model. Used for real-time background predictions using drivers from 2 hours ago.
- Parameters
drivers (np.ndarray) – n_samples x n_drivers input list of driving parameters (not z-scaled).
model (str, optional) – name of model to use
- Returns
predicted_drivers – estimate of the driver features two hours after the input drivers
- Return type
np.ndarray
- abcgan.gen_stats(drivers, data=None, model='mm_gan_radar', bv_type='radar')
Statistical distribution of 10,000 upper altitude data points conditioned on driver parameters and lower altitude measurements.
- Parameters
drivers (np.ndarray) – n_samples x n_drivers input list of driving parameters (not z-scaled).
data (np.ndarray) – n_samples x n_alt_in x n_meas
model (str, optional) – name of model to use
bv_type (str, optional) – name of the type of background variables to use (lidar or radar)
- Returns
samples – 2 x (n_avg * n_samples) x n_alt x n_feat output anomaly scores (unbounded). The first element is the fake output. The second element contains the scaled background variables with repeats.
- Return type
[np.ndarray, np.ndarray]
- abcgan.generate(drivers, driver_names=['Ap', 'F10.7', 'F10.7avg', 'MLT', 'SLT', 'SZA', 'ap', 'MEI', 'RMM1', 'RMM2', 'TCI', 'moon_phase', 'moon_x', 'moon_y', 'moon_z'], measurements=None, n_alt=30, model='mm_gan_radar', bv_type='radar')
Generate synthetic data consistent with the historical distribution.
- Parameters
drivers (np.ndarray) – n_samples x n_drivers input list of driving parameters (not z-scaled).
driver_names (list) – list of names of driving parameters
measurements (np.ndarray, optional) – n_samples x n_alt_in x n_meas input list of altitude measurements, n_alt_in should be less than n_alt. These represent fixed measurements for the lowest altitudes to condition on. Usually left as default (None)
n_alt (int, optional) – number of altitude measurements to draw, defaults to max_alt
model (str, optional) – name of model to use
bv_type (str, optional) – name of the type of background variables to use (lidar or radar)
- Returns
samples – n_samples x n_alt x n_meas output measurements at each requested altitude. If measurements is not None then the measurements for the first n_alt_in will be copied over from the input.
- Return type
np.ndarray
- abcgan.stack_bvs(bv_dict, bv_type='radar')
Stacks background variables in the appropriate format.
This function is provided for convenience.
- Parameters
bv_dict (dict) – Dictionary mapping names of background variables to numpy arrays with values for those bvs. Each array should have shape n_samples x n_altitudes. Can also use an h5py.Group.
bv_type (str) – string specifying whether to stack radar or lidar data
Valid names for background variables can be found at abcgan.bv_names.
- Raises
ValueError: – If the input shape of the bv dict values is not correct.
KeyError: – If one of the required bvs is missing.
- abcgan.stack_drivers(driver_dict, driver_names=['Ap', 'F10.7', 'F10.7avg', 'MLT', 'SLT', 'SZA', 'ap', 'MEI', 'RMM1', 'RMM2', 'TCI', 'moon_phase', 'moon_x', 'moon_y', 'moon_z'])
Stacks drivers in appropriate format.
This function is provided for convenience.
- Parameters
driver_dict (dict) – Dictionary mapping names of drivers to the numpy arrays with values for those drivers. Each array has a single dimension of the same length n_samples. Can also use an h5py.Group.
driver_names (list) – names of the drivers to load
Valid names for drivers can be found at abcgan.driver_names.
- Raises
ValueError: – If the driver values have the wrong type or shape.
KeyError: – If one of the required drivers is missing.