ise.data

ise.data.dataclasses

class ise.data.dataclasses.EmulatorDataset(X, y, sequence_length=5, projection_length=86)[source]

Bases: Dataset

A PyTorch dataset for loading emulator data, designed to handle sequence-based inputs and projections.

Parameters:
  • X (pandas.DataFrame, numpy.ndarray, or torch.Tensor) - The input data.

  • y (pandas.DataFrame, numpy.ndarray, or torch.Tensor) - The target data.

  • sequence_length (int, optional) - The length of the input sequence. Default is 5.

  • projection_length (int or tuple, optional) - The length of the projection period. Default is 86.

X

The input data converted to a PyTorch tensor.

Type:

torch.Tensor

y

The target data converted to a PyTorch tensor.

Type:

torch.Tensor

sequence_length

The length of the input sequence.

Type:

int

xdim

The number of dimensions in X.

Type:

int

num_projections

The number of projections in the dataset.

Type:

int

num_timesteps

The number of timesteps per projection.

Type:

int

num_features

The number of features in the dataset.

Type:

int

_to_tensor(x)[source]

Converts input data to a PyTorch tensor.

__len__()[source]

Returns the total number of samples.

__getitem__(i)[source]

Retrieves the i-th sample from the dataset, including proper padding.
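The documentation says only that samples are retrieved "including proper padding." As an illustration, here is a minimal numpy sketch of one common padding strategy for sequence datasets (left-padding short windows by repeating the first row); the helper name and the repeat-first-row choice are assumptions, not the package's confirmed implementation:

```python
import numpy as np

def get_sequence(X, i, sequence_length=5):
    """Return the window of rows ending at index i, left-padded by
    repeating the first row when fewer than sequence_length rows exist."""
    start = i - sequence_length + 1
    if start < 0:
        pad = np.repeat(X[:1], -start, axis=0)  # repeat the first row
        return np.concatenate([pad, X[: i + 1]], axis=0)
    return X[start : i + 1]

X = np.arange(12, dtype=float).reshape(6, 2)  # 6 timesteps, 2 features
w0 = get_sequence(X, 0)   # mostly padding: only the last row is real data
w5 = get_sequence(X, 5)   # a full window, no padding needed
```

Every window has shape `(sequence_length, num_features)`, which is what a recurrent emulator expects as input.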

class ise.data.dataclasses.PyTorchDataset(X, y)[source]

Bases: Dataset

A PyTorch dataset for general-purpose data loading.

Parameters:
  • X (torch.Tensor) - The input data.

  • y (torch.Tensor) - The target data.

__getitem__(index)[source]

Retrieves the sample at the specified index.

__len__()[source]

Returns the total dataset length.

class ise.data.dataclasses.ScenarioDataset(features, labels)[source]

Bases: Dataset

A PyTorch dataset designed for scenario-based data loading.

Parameters:
  • features (torch.Tensor) - The input features.

  • labels (torch.Tensor) - The target labels.

features

The input features.

Type:

torch.Tensor

labels

The target labels.

Type:

torch.Tensor

__len__()[source]

Returns the dataset length.

__getitem__(idx)[source]

Retrieves the sample at the given index.

class ise.data.dataclasses.TSDataset(X, y, sequence_length=5)[source]

Bases: Dataset

A PyTorch dataset for handling time series data with sequence-based input.

Parameters:
  • X (torch.Tensor) - The input data.

  • y (torch.Tensor) - The target data.

  • sequence_length (int, optional) - The length of the input sequence. Default is 5.

X

The input data.

Type:

torch.Tensor

y

The target data.

Type:

torch.Tensor

sequence_length

The sequence length.

Type:

int

__len__()[source]

Returns the dataset length.

__getitem__(i)[source]

Retrieves the i-th time series sample.

ise.data.feature_engineer

class ise.data.feature_engineer.FeatureEngineer(ice_sheet, data: DataFrame, fill_mrro_nans: bool = False, split_dataset: bool = False, train_size: float = 0.7, val_size: float = 0.15, test_size: float = 0.15, output_directory: str = None)[source]

Bases: object

A class for performing feature engineering on a given dataset, including preprocessing, scaling, dataset splitting, and outlier handling.

Parameters:
  • ice_sheet (str) - The name of the ice sheet being analyzed.

  • data (pd.DataFrame) - The input dataset.

  • fill_mrro_nans (bool, optional) - Whether to fill missing values in the ‘mrro’ column. Defaults to False.

  • split_dataset (bool, optional) - Whether to split the dataset into training, validation, and test sets. Defaults to False.

  • train_size (float, optional) - Proportion of data to use for training. Defaults to 0.7.

  • val_size (float, optional) - Proportion of data to use for validation. Defaults to 0.15.

  • test_size (float, optional) - Proportion of data to use for testing. Defaults to 0.15.

  • output_directory (str, optional) - Directory to save the split datasets. Defaults to None.

data

The input dataset.

Type:

pd.DataFrame

train_size

Proportion of training data.

Type:

float

val_size

Proportion of validation data.

Type:

float

test_size

Proportion of testing data.

Type:

float

output_directory

Directory to save datasets.

Type:

str

scaler_X_path

Path to the saved input feature scaler.

Type:

str

scaler_y_path

Path to the saved target variable scaler.

Type:

str

scaler_X

Scaler for input features.

Type:

scaler object

scaler_y

Scaler for target variables.

Type:

scaler object

train

Training dataset.

Type:

pd.DataFrame

val

Validation dataset.

Type:

pd.DataFrame

test

Test dataset.

Type:

pd.DataFrame

_including_model_characteristics

Whether model characteristics have been included.

Type:

bool

split_data()[source]

Splits dataset into train, validation, and test sets.

fill_mrro_nans()[source]

Fills missing values in the ‘mrro’ column.

scale_data()[source]

Scales input and target variables using a specified method.

unscale_data()[source]

Reverses the scaling transformation.

add_lag_variables()[source]

Adds lag features to the dataset.

backfill_outliers()[source]

Replaces extreme values in target variables.

drop_outliers()[source]

Removes outliers based on specified criteria.

add_model_characteristics()[source]

Merges model characteristics into the dataset.

add_lag_variables(lag, data=None)[source]

Adds lagged versions of predictor variables to the dataset.

Parameters:
  • lag (int) - Number of time steps to lag the variables.

  • data (pd.DataFrame, optional) - The dataset. If not provided, the class attribute ‘data’ is used.

Returns:

The modified instance with lag variables added.

Return type:

FeatureEngineer

add_model_characteristics(data=None, model_char_path=None, encode=True, ids_path=None)[source]

Merges model characteristic data with the dataset.

Parameters:
  • data (pd.DataFrame, optional) - The dataset. If not provided, the class attribute ‘data’ is used.

  • model_char_path (str, optional) - Path to the model characteristics file. Defaults to the internal path.

  • encode (bool, optional) - Whether to one-hot encode categorical characteristics. Defaults to True.

  • ids_path (str, optional) - Path to an additional ID mapping file. Defaults to None.

Returns:

The modified instance with model characteristics added.

Return type:

FeatureEngineer

backfill_outliers(percentile=99.999, data=None)[source]

Replaces extreme values in target variables with the previous row’s value.

Parameters:
  • percentile (float, optional) - Percentile threshold for identifying outliers. Defaults to 99.999.

  • data (pd.DataFrame, optional) - The dataset. If not provided, the class attribute ‘data’ is used.

Returns:

The modified instance with outliers handled.

Return type:

FeatureEngineer

drop_outliers(method, column, expression=None, quantiles=[0.01, 0.99], data=None)[source]

Drops simulations that are outliers based on the provided method.

Parameters:
  • method (str) - Method of outlier deletion (‘quantile’ or ‘explicit’).

  • column (str) - Column used for detecting outliers.

  • expression (list[tuple], optional) - List of filtering expressions in the form [(column, operator, value)]. Defaults to None.

  • quantiles (list[float], optional) - Quantiles for ‘quantile’ method. Defaults to [0.01, 0.99].

  • data (pd.DataFrame, optional) - The dataset. If not provided, the class attribute ‘data’ is used.

Returns:

The modified instance with outliers removed.

Return type:

FeatureEngineer

fill_mrro_nans(method, data=None)[source]

Fills missing values in the ‘mrro’ column.

Parameters:
  • method (str) - The method used to fill missing values.

  • data (pd.DataFrame, optional) - The dataset. Defaults to None.

Returns:

The dataset with missing values filled.

Return type:

pd.DataFrame

scale_data(X=None, y=None, method='standard', save_dir=None)[source]

Scales input (X) and target (y) variables using a specified scaling method.

Parameters:
  • X (pd.DataFrame or np.ndarray, optional) - Input data. Defaults to None.

  • y (pd.DataFrame or np.ndarray, optional) - Target data. Defaults to None.

  • method (str, optional) - Scaling method (‘standard’, ‘minmax’, ‘robust’). Defaults to ‘standard’.

  • save_dir (str, optional) - Directory to save scalers. Defaults to None.

Returns:

Scaled X and y values.

Return type:

tuple
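The three scaling methods correspond to standard formulas. A framework-agnostic numpy sketch of the column-wise math behind each option (the function name and return convention are illustrative, not the class's actual API):

```python
import numpy as np

def scale(X, method="standard"):
    """Column-wise scaling; returns the scaled array and the fitted params."""
    X = np.asarray(X, dtype=float)
    if method == "standard":                      # (x - mean) / std
        params = X.mean(axis=0), X.std(axis=0)
    elif method == "minmax":                      # (x - min) / (max - min)
        params = X.min(axis=0), X.max(axis=0) - X.min(axis=0)
    elif method == "robust":                      # (x - median) / IQR
        q1, med, q3 = np.percentile(X, [25, 50, 75], axis=0)
        params = med, q3 - q1
    else:
        raise ValueError(f"unknown method: {method}")
    center, spread = params
    return (X - center) / spread, params

Xs, (mu, sd) = scale([[1.0], [2.0], [3.0]], method="standard")
```

Keeping the fitted parameters (as the class does via `scaler_X` / `scaler_y`) is what makes `unscale_data` possible later.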

split_data(data=None, train_size=None, val_size=None, test_size=None, output_directory=None, random_state=42)[source]

Splits the dataset into training, validation, and test sets.

Parameters:
  • data (pd.DataFrame, optional) - The input dataset. Defaults to None.

  • train_size (float, optional) - Proportion of training data. Defaults to None.

  • val_size (float, optional) - Proportion of validation data. Defaults to None.

  • test_size (float, optional) - Proportion of testing data. Defaults to None.

  • output_directory (str, optional) - Directory to save split datasets. Defaults to None.

  • random_state (int, optional) - Random seed for reproducibility. Defaults to 42.

Returns:

Training, validation, and test datasets as pandas DataFrames.

Return type:

tuple

unscale_data(X=None, y=None, scaler_X_path=None, scaler_y_path=None)[source]

Reverses the scaling transformation for input (X) and target (y) variables.

Parameters:
  • X (pd.DataFrame or np.ndarray, optional) - The input data to be unscaled. Defaults to None.

  • y (pd.DataFrame, np.ndarray, or torch.Tensor, optional) - The target data to be unscaled. Defaults to None.

  • scaler_X_path (str, optional) - Path to the stored input scaler. Defaults to None.

  • scaler_y_path (str, optional) - Path to the stored target scaler. Defaults to None.

Returns:

Unscaled X and y data.

Return type:

tuple

ise.data.feature_engineer.add_lag_variables(data: DataFrame, lag: int, verbose=True) DataFrame[source]

Adds lagged variables to the input dataset, creating time-shifted versions of the predictor variables.

Parameters:
  • data (pd.DataFrame) - The dataset containing time series data.

  • lag (int) - The number of time steps to lag the variables.

  • verbose (bool, optional) - Whether to display a progress bar. Defaults to True.

Returns:

The dataset with lagged variables added.

Return type:

pd.DataFrame
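As a sketch of what lagging typically looks like for this kind of per-simulation time series data: shift each predictor within its simulation group so lags never bleed across runs. The `id` grouping column, the `.lagK` naming, and the excluded columns are assumptions for illustration:

```python
import pandas as pd

def add_lag_variables(data: pd.DataFrame, lag: int, cols=None) -> pd.DataFrame:
    """Append time-shifted copies of predictor columns, shifting within
    each simulation (grouped by 'id') so lags never cross runs."""
    out = data.copy()
    cols = cols or [c for c in data.columns if c not in ("id", "y")]
    for k in range(1, lag + 1):
        for c in cols:
            out[f"{c}.lag{k}"] = data.groupby("id")[c].shift(k)
    return out

df = pd.DataFrame({"id": [1, 1, 1, 2, 2, 2],
                   "tas": [0.1, 0.2, 0.3, 1.1, 1.2, 1.3],
                   "y": [0, 0, 0, 1, 1, 1]})
lagged = add_lag_variables(df, lag=2)
```

The first `lag` rows of each simulation receive NaN lags, which is why lagging is usually done before any NaN handling or splitting.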

ise.data.feature_engineer.add_model_characteristics(data, model_char_path='./ise/utils/model_characteristics.csv', encode=True, ids_path=None) DataFrame[source]

Adds model characteristics to the dataset.

Parameters:
  • data (pd.DataFrame) - The input dataset.

  • model_char_path (str, optional) - Path to the model characteristics file. Defaults to internal path.

  • encode (bool, optional) - Whether to one-hot encode categorical characteristics. Defaults to True.

  • ids_path (str, optional) - Path to an additional ID mapping file. Defaults to None.

Returns:

The dataset with model characteristics added.

Return type:

pd.DataFrame

ise.data.feature_engineer.backfill_outliers(data, percentile=99.999)[source]

Replaces extreme y-values (above the specified percentile, or below the complementary lower percentile, computed across all y-values) with the value from the previous row.

Parameters:
  • data (pd.DataFrame) - The dataset containing y-values.

  • percentile (float, optional) - The percentile threshold to define upper extreme values. Defaults to 99.999.

Returns:

The dataset with extreme values replaced using backfill.

Return type:

pd.DataFrame
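A pandas sketch of the mask-then-backfill idea: compute percentile bounds across all target columns, mask values outside them, and fill each masked cell from the previous row. Passing the target columns explicitly and interpreting the lower bound as the complementary percentile are assumptions made for this sketch:

```python
import numpy as np
import pandas as pd

def backfill_outliers(data: pd.DataFrame, cols, percentile=99.999) -> pd.DataFrame:
    """Mask values outside the [100 - percentile, percentile] bounds
    (computed across all listed columns) and fill from the previous row."""
    out = data.copy()
    values = out[cols].to_numpy().ravel()
    upper = np.nanpercentile(values, percentile)
    lower = np.nanpercentile(values, 100 - percentile)
    for c in cols:
        extreme = (out[c] > upper) | (out[c] < lower)
        out[c] = out[c].mask(extreme).ffill()
    return out

df = pd.DataFrame({"sle": [1.0, 1.0, 500.0, 1.2]})
clean = backfill_outliers(df, cols=["sle"], percentile=99.999)
```

Because the fill comes from the preceding row, a spike in the middle of a projection is smoothed without shifting the rest of the series.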

ise.data.feature_engineer.drop_outliers(data: DataFrame, column: str, method: str, expression: List[tuple] = None, quantiles: List[float] = [0.01, 0.99])[source]

Removes outliers from the dataset based on a specified method.

Parameters:
  • data (pd.DataFrame) - The dataset containing the column with potential outliers.

  • column (str) - The column to assess for outliers.

  • method (str) - The method of outlier detection (‘quantile’ or ‘explicit’).

  • expression (list of tuples, optional) - A list of conditions in the format [(column, operator, value)] for explicit filtering. Defaults to None.

  • quantiles (list of float, optional) - Quantiles for filtering when using the ‘quantile’ method. Defaults to [0.01, 0.99].

Returns:

The dataset with outliers removed.

Return type:

pd.DataFrame

Raises:
  • AttributeError - If the method is ‘quantile’ but no quantiles are provided.

  • AttributeError - If the method is ‘explicit’ but no expression is provided.

  • ValueError - If the operator in the expression is not recognized.
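A sketch of the two documented modes, including the documented error cases. Dropping the entire simulation (by `id`) once any of its rows is flagged, and AND-combining explicit expressions, are assumptions inferred from "drops simulations":

```python
import operator
import pandas as pd

_OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
        ">=": operator.ge, "==": operator.eq}

def drop_outliers(data, column, method, expression=None, quantiles=(0.01, 0.99)):
    """Flag outlier rows, then drop every simulation ('id') containing one."""
    if method == "quantile":
        lo = data[column].quantile(quantiles[0])
        hi = data[column].quantile(quantiles[1])
        bad = (data[column] < lo) | (data[column] > hi)
    elif method == "explicit":
        if expression is None:
            raise AttributeError("explicit method requires an expression")
        bad = pd.Series(True, index=data.index)
        for col, op, value in expression:
            if op not in _OPS:
                raise ValueError(f"unrecognized operator: {op}")
            bad &= _OPS[op](data[col], value)
    else:
        raise ValueError(f"unknown method: {method}")
    bad_ids = data.loc[bad, "id"].unique()
    return data[~data["id"].isin(bad_ids)]

df = pd.DataFrame({"id": [1, 1, 2, 2], "sle": [0.1, 0.2, 9.0, 0.3]})
kept = drop_outliers(df, column="sle", method="explicit",
                     expression=[("sle", ">", 5.0)])
```

Removing whole simulations rather than individual rows keeps every remaining projection at its full length.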

ise.data.feature_engineer.fill_mrro_nans(data: DataFrame, method) DataFrame[source]

Fills the NaN values in the specified columns with the given method.

Parameters:
  • data (pd.DataFrame) - The input DataFrame.

  • method (str or int) - The method to fill NaN values. Must be one of ‘zero’, ‘mean’, ‘median’, or ‘drop’.

Returns:

The DataFrame with NaN values filled according to the specified method.

Return type:

pd.DataFrame

Raises:

ValueError - If the method is not one of ‘zero’, ‘mean’, ‘median’, or ‘drop’.
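The four documented methods map directly onto standard pandas operations; a minimal sketch (the handling of an integer `method` value is an assumption based on the "str or int" type hint):

```python
import pandas as pd

def fill_mrro_nans(data: pd.DataFrame, method) -> pd.DataFrame:
    """Fill (or drop) NaNs in the 'mrro' runoff column."""
    data = data.copy()
    if method in ("zero", 0):
        data["mrro"] = data["mrro"].fillna(0)
    elif method == "mean":
        data["mrro"] = data["mrro"].fillna(data["mrro"].mean())
    elif method == "median":
        data["mrro"] = data["mrro"].fillna(data["mrro"].median())
    elif method == "drop":
        data = data.dropna(subset=["mrro"])
    else:
        raise ValueError("method must be one of 'zero', 'mean', 'median', or 'drop'")
    return data

df = pd.DataFrame({"mrro": [1.0, None, 3.0]})
filled = fill_mrro_nans(df, "mean")
```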

ise.data.feature_engineer.scale_data(data, scaler_path)[source]

Scales the provided dataset using a pre-trained scaler.

Parameters:
  • data (pd.DataFrame) - The dataset to be scaled.

  • scaler_path (str) - Path to the saved scaler.

Returns:

The scaled dataset.

Return type:

pd.DataFrame

ise.data.feature_engineer.split_training_data(data, train_size, val_size, test_size=None, output_directory=None, random_state=42)[source]

Splits the dataset into training, validation, and test sets.

Parameters:
  • data (str or pd.DataFrame) - The dataset or path to the dataset to be split.

  • train_size (float) - Proportion of data to use for training.

  • val_size (float) - Proportion of data to use for validation.

  • test_size (float, optional) - Proportion of data to use for testing. Defaults to the remainder.

  • output_directory (str, optional) - Directory to save the split datasets as CSV files. Defaults to None.

  • random_state (int, optional) - Seed for reproducibility. Defaults to 42.

Returns:

Training, validation, and test datasets as pandas DataFrames.

Return type:

tuple

Raises:
  • ValueError - If the dataset length is not divisible by 86, indicating incomplete projections.

  • ValueError - If the dataset does not contain an ‘id’ column.
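The two raised errors imply that splitting happens at the simulation level: each 86-year projection (keyed by `id`) must land entirely in one split. A sketch of that grouped split, with both documented checks (the exact shuffling and rounding behavior are assumptions):

```python
import numpy as np
import pandas as pd

def split_by_simulation(data, train_size=0.7, val_size=0.15, random_state=42):
    """Split whole simulations (grouped by 'id') so no 86-year projection
    is shared across train/val/test."""
    if "id" not in data.columns:
        raise ValueError("dataset must contain an 'id' column")
    if len(data) % 86 != 0:
        raise ValueError("dataset length must be divisible by 86")
    ids = data["id"].unique()
    rng = np.random.default_rng(random_state)
    rng.shuffle(ids)
    n_train = int(len(ids) * train_size)
    n_val = int(len(ids) * val_size)
    train_ids, val_ids = ids[:n_train], ids[n_train:n_train + n_val]
    train = data[data["id"].isin(train_ids)]
    val = data[data["id"].isin(val_ids)]
    test = data[~data["id"].isin(np.concatenate([train_ids, val_ids]))]
    return train, val, test

df = pd.DataFrame({"id": np.repeat(np.arange(10), 86),
                   "x": np.zeros(10 * 86)})
train, val, test = split_by_simulation(df)
```

Splitting by simulation rather than by row prevents leakage: rows from the same projection are strongly autocorrelated, so a random row-level split would inflate validation scores.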

ise.data.process

class ise.data.process.DatasetMerger(ice_sheet, forcings, projections, experiment_file, output_dir)[source]

Bases: object

A class for merging datasets from forcing and projection files to create a unified dataset for analysis.

Parameters:
  • ice_sheet (str) - The ice sheet name (‘AIS’ or ‘GrIS’).

  • forcings (str) - The directory path containing forcing files.

  • projections (str) - The directory path containing projection files.

  • experiment_file (str) - The file path to the experiment metadata (CSV or JSON).

  • output_dir (str) - The directory path to save the merged dataset.

experiments

The experiment metadata loaded from the provided file.

Type:

pd.DataFrame

forcing_paths

List of file paths for forcing datasets.

Type:

list

projection_paths

List of file paths for projection datasets.

Type:

list

forcing_metadata

Metadata about forcing files, including CMIP model and pathway.

Type:

pd.DataFrame

merge_dataset()[source]

Merges the forcing and projection datasets into a single structured dataset.

merge_sectors()[source]

(Placeholder) Method to merge sectors based on specified criteria.

_get_forcing_metadata()[source]

Extracts metadata from forcing files.

merge_dataset()[source]

Merges forcing and projection datasets based on CMIP model and pathway metadata.

Returns:

Returns 0 upon successful merging and saving of the dataset.

Return type:

int

merge_sectors(forcings_file=None, projections_file=None, save_dir=None)[source]

(Placeholder) Merges sectors based on specified criteria.

class ise.data.process.ProjectionProcessor(ice_sheet, forcings_directory, projections_directory, scalefac_path=None, densities_path=None)[source]

Bases: object

A class for processing ISMIP6 projections (outputs) for ice sheet models, specifically for calculating Ice Volume Above Flotation (IVAF), handling control projections, and processing experimental projections.

Parameters:
  • ice_sheet (str) - The ice sheet being analyzed (‘AIS’ or ‘GIS’).

  • forcings_directory (str) - Path to the directory containing forcing datasets.

  • projections_directory (str) - Path to the directory containing projection datasets.

  • scalefac_path (str, optional) - Path to the NetCDF file containing scaling factors for each grid cell. Defaults to None.

  • densities_path (str, optional) - Path to the CSV file containing density data for models. Defaults to None.

forcings_directory

Path to forcing data.

Type:

str

projections_directory

Path to projection data.

Type:

str

densities_path

Path to density dataset.

Type:

str

scalefac_path

Path to scaling factor dataset.

Type:

str

ice_sheet

Ice sheet identifier (‘AIS’ or ‘GIS’).

Type:

str

resolution

Resolution of the dataset (5 for GIS, 8 for AIS).

Type:

int

process()[source]

Processes ISMIP6 projections by calculating IVAF and subtracting control projections.

_calculate_ivaf_minus_control()[source]

Computes IVAF and subtracts control values for experimental projections.

_calculate_ivaf_single_file()[source]

Computes IVAF for a single model run, accounting for control projections.

process()[source]

Processes ISMIP6 projections by calculating IVAF for control and experiment projections, subtracting out control IVAF from experiments, and exporting IVAF files.

Returns:

1 if processing is successful.

Return type:

int

Raises:

ValueError - If projections_directory is not specified.

ise.data.process.combine_gris_forcings(forcing_dir)[source]

Combines GrIS forcings from multiple CMIP model directories into consolidated NetCDF files.

Parameters:

forcing_dir (str) - Directory containing the GrIS forcing files.

Returns:

0 upon successful processing.

Return type:

int

ise.data.process.convert_and_subset_times(dataset)[source]

Converts time variables in an xarray dataset to a uniform format and subsets time to the range 2015-2100.

Parameters:

dataset (xarray.Dataset) - The dataset with time values to be converted and subset.

Returns:

The dataset with standardized time format and subset to the correct time range.

Return type:

xarray.Dataset

Raises:

ValueError - If time values are not in a recognizable format.

ise.data.process.get_model_densities(zenodo_directory: str, output_path: str = None)[source]

Extracts density values (rhoi and rhow) from NetCDF files in the specified directory and returns them in a pandas DataFrame.

Parameters:
  • zenodo_directory (str) - Path to the directory containing the NetCDF files.

  • output_path (str, optional) - Path to save the extracted density values as a CSV file. Defaults to None.

Returns:

A DataFrame containing the group, model, rhoi, and rhow values for each model run.

Return type:

pandas.DataFrame

ise.data.process.get_xarray_data(dataset_fp, var_name=None, ice_sheet='AIS', convert_and_subset=False)[source]

Retrieves and processes data from an xarray dataset.

Parameters:
  • dataset_fp (str) - The file path to the xarray dataset.

  • var_name (str, optional) - The name of the variable to retrieve from the dataset. Defaults to None.

  • ice_sheet (str, optional) - The ice sheet type (‘AIS’ or ‘GrIS’). Defaults to ‘AIS’.

  • convert_and_subset (bool, optional) - If True, converts and subsets the dataset for the target time range. Defaults to False.

Returns:

The extracted variable as a NumPy array or the entire processed dataset.

Return type:

np.ndarray or xarray.Dataset

ise.data.process.interpolate_values(data)[source]

Interpolates missing values in the x and y dimensions of the input dataset using linear interpolation. Ensures that first and last values are properly adjusted to maintain consistency.

Parameters:

data (xarray.Dataset) - A dataset containing x and y dimensions with potential missing values.

Returns:

A tuple containing the interpolated x and y arrays.

Return type:

tuple
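For the interior points this is ordinary linear interpolation over indices; the documented "first and last values are properly adjusted" detail matters because plain interpolation clamps at the edges. A 1-D numpy sketch of one way to handle both cases (edge extrapolation from the mean spacing assumes a regular grid, which is an assumption, not confirmed behavior):

```python
import numpy as np

def interpolate_missing(coords):
    """Fill NaNs in a 1-D coordinate array by linear interpolation,
    extrapolating edge NaNs from the mean spacing of valid points."""
    coords = np.asarray(coords, dtype=float)
    idx = np.arange(len(coords))
    valid = ~np.isnan(coords)
    filled = np.interp(idx, idx[valid], coords[valid])
    # np.interp clamps outside the valid range, so fix the edges by
    # extrapolating with the mean spacing (regular-grid assumption).
    step = np.mean(np.diff(coords[valid]) / np.diff(idx[valid]))
    first, last = idx[valid][0], idx[valid][-1]
    filled[:first] = coords[valid][0] - step * (first - idx[:first])
    filled[last + 1:] = coords[valid][-1] + step * (idx[last + 1:] - last)
    return filled

x = interpolate_missing([np.nan, 8.0, np.nan, 24.0, np.nan])
```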

ise.data.process.merge_datasets(forcings, projections, experiments_file, ice_sheet='AIS', export_directory=None)[source]

Merges forcing and projection datasets using experiment metadata.

Parameters:
  • forcings (pd.DataFrame) - Forcing dataset.

  • projections (pd.DataFrame) - Projection dataset.

  • experiments_file (str or pd.DataFrame) - Path to the experiment metadata file or a DataFrame.

  • ice_sheet (str, optional) - The ice sheet type (‘AIS’ or ‘GrIS’). Defaults to ‘AIS’.

  • export_directory (str, optional) - Directory to save the merged dataset. Defaults to None.

Returns:

The merged dataset containing forcing, projection, and metadata.

Return type:

pandas.DataFrame

ise.data.process.process_AIS_atmospheric_sectors(forcing_directory, grid_file)[source]

Processes atmospheric forcing data for AIS sectors, aggregating sector-level data.

Parameters:
  • forcing_directory (str) - Directory containing atmospheric forcing data.

  • grid_file (str or xarray.Dataset) - Grid file defining sector boundaries.

Returns:

DataFrame containing processed atmospheric forcing data for AIS sectors.

Return type:

pandas.DataFrame

ise.data.process.process_AIS_oceanic_sectors(forcing_directory, grid_file)[source]

Processes oceanic forcing data for AIS sectors, aggregating sector-level data for thermal forcing, salinity, and temperature.

Parameters:
  • forcing_directory (str) - Directory containing oceanic forcing data.

  • grid_file (str or xarray.Dataset) - Grid file defining sector boundaries.

Returns:

DataFrame containing processed oceanic forcing data for AIS sectors.

Return type:

pandas.DataFrame

ise.data.process.process_AIS_outputs(zenodo_directory, with_ctrl=False)[source]

Processes AIS model outputs by extracting Ice Volume Above Flotation (IVAF) data and computing sea-level equivalents.

Parameters:
  • zenodo_directory (str) - Directory containing AIS output files.

  • with_ctrl (bool, optional) - If True, includes control projections. Defaults to False.

Returns:

DataFrame containing processed AIS output data.

Return type:

pandas.DataFrame

ise.data.process.process_GrIS_atmospheric_sectors(forcing_directory, grid_file)[source]

Processes atmospheric forcing data for GrIS sectors, aggregating sector-level data.

Parameters:
  • forcing_directory (str) - Directory containing atmospheric forcing data.

  • grid_file (str or xarray.Dataset) - Grid file defining sector boundaries.

Returns:

DataFrame containing processed atmospheric forcing data for GrIS sectors.

Return type:

pandas.DataFrame

ise.data.process.process_GrIS_oceanic_sectors(forcing_directory, grid_file)[source]

Processes oceanic forcing data for GrIS sectors, aggregating sector-level data for thermal forcing and basin runoff.

Parameters:
  • forcing_directory (str) - Directory containing oceanic forcing data.

  • grid_file (str or xarray.Dataset) - Grid file defining sector boundaries.

Returns:

DataFrame containing processed oceanic forcing data for GrIS sectors.

Return type:

pandas.DataFrame

ise.data.process.process_GrIS_outputs(zenodo_directory)[source]

Processes GrIS model outputs by extracting Ice Volume Above Flotation (IVAF) data and computing sea-level equivalents.

Parameters:

zenodo_directory (str) - Directory containing GrIS output files.

Returns:

DataFrame containing processed GrIS output data.

Return type:

pandas.DataFrame

ise.data.process.process_sectors(ice_sheet, forcing_directory, grid_file, zenodo_directory, experiments_file, export_directory=None, overwrite=False, with_ctrl=False)[source]

Processes sector-based datasets by merging atmospheric, oceanic, and projection data for the given ice sheet.

Parameters:
  • ice_sheet (str) - The ice sheet being processed (‘AIS’ or ‘GrIS’).

  • forcing_directory (str) - Directory containing forcing data.

  • grid_file (str) - Path to the grid file defining sectors.

  • zenodo_directory (str) - Directory containing projection data.

  • experiments_file (str) - Path to the experiment metadata file.

  • export_directory (str, optional) - Directory to save processed datasets. Defaults to None.

  • overwrite (bool, optional) - If True, overwrites existing datasets. Defaults to False.

  • with_ctrl (bool, optional) - If True, includes control projections. Defaults to False.

Returns:

The final merged dataset.

Return type:

pandas.DataFrame

ise.data.scaler

class ise.data.scaler.LogScaler(epsilon=1e-08)[source]

Bases: Module

A class for scaling input data using a logarithmic transformation, ensuring all values are positive by applying a shift.

Parameters:

epsilon (float, optional) - A small constant to avoid log(0) errors. Defaults to 1e-8.

epsilon

A small constant to avoid log(0) errors.

Type:

float

min_value

The minimum value in the dataset used for shifting.

Type:

float

device

The device (CPU or GPU) on which calculations are performed.

Type:

torch.device

fit(X)[source]

Computes the minimum value of the input data for shifting.

transform(X)[source]

Applies the logarithmic transformation.

inverse_transform(X)[source]

Reverses the log transformation.

save(path)[source]

Saves the scaler parameters to a file.

load(path)[source]

Loads the scaler parameters from a file.

fit(X)[source]

Computes the minimum value in the dataset to ensure all values remain positive during transformation.

Parameters:

X (torch.Tensor) - The input data to be scaled.

inverse_transform(X)[source]

Reverses the log transformation to recover the original scale of the data.

Parameters:

X (torch.Tensor) - The log-transformed input data.

Returns:

The transformed input data in its original scale.

Return type:

torch.Tensor

static load(path)[source]

Loads the scaler parameters from a file.

save(path)[source]

Saves the scaler parameters to a file.

transform(X)[source]

Applies the logarithmic transformation to the input data.

Parameters:

X (torch.Tensor) - The input data to be transformed.

Returns:

The log-transformed input data.

Return type:

torch.Tensor
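The shift-then-log math is the interesting part of this class: subtracting the fitted minimum and adding `epsilon` guarantees a strictly positive argument to the log. A numpy analogue of the documented fit/transform/inverse_transform cycle (the torch class works on tensors; this sketch only mirrors the arithmetic):

```python
import numpy as np

class LogScalerSketch:
    """Numpy analogue of the documented shift-then-log transform."""
    def __init__(self, epsilon=1e-8):
        self.epsilon = epsilon
        self.min_value = None

    def fit(self, X):
        self.min_value = np.min(X)

    def transform(self, X):
        # Shift so the minimum maps to epsilon, keeping log's argument > 0.
        return np.log(np.asarray(X) - self.min_value + self.epsilon)

    def inverse_transform(self, X):
        return np.exp(X) + self.min_value - self.epsilon

scaler = LogScalerSketch()
X = np.array([-3.0, 0.0, 7.0])
scaler.fit(X)
Z = scaler.transform(X)
```

Because the shift depends on the fitted `min_value`, the same fitted scaler must be used for `inverse_transform`, which is why the class persists its parameters via `save`/`load`.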

class ise.data.scaler.RobustScaler[source]

Bases: Module

A class for scaling input data using the median and interquartile range (IQR), making it robust to outliers.


median_

The median values of the input data.

Type:

torch.Tensor

iqr_

The interquartile range (IQR) values of the input data.

Type:

torch.Tensor

device

The device (CPU or GPU) on which the calculations are performed.

Type:

torch.device

fit(X)[source]

Computes the median and IQR of the input data.

transform(X)[source]

Scales the input data using the computed median and IQR.

inverse_transform(X)[source]

Reverses the scaling operation on the input data.

save(path)[source]

Saves the median and IQR to a file.

load(path)[source]

Loads the median and IQR from a file.

fit(X)[source]

Computes the median and interquartile range (IQR) of the input data.

Parameters:

X (torch.Tensor) - The input data to be scaled.

inverse_transform(X)[source]

Reverses the scaling operation on the input data.

Parameters:

X (torch.Tensor) - The scaled input data to be transformed back.

Returns:

The transformed input data.

Return type:

torch.Tensor

Raises:

RuntimeError - If the RobustScaler instance is not fitted yet.

static load(path)[source]

Loads the median and IQR from a file.

save(path)[source]

Saves the median and IQR to a file.

transform(X)[source]

Scales the input data using the computed median and IQR.

Parameters:

X (torch.Tensor) - The input data to be scaled.

Returns:

The scaled input data.

Return type:

torch.Tensor

Raises:

RuntimeError - If the RobustScaler instance is not fitted yet.

class ise.data.scaler.StandardScaler[source]

Bases: Module

A class for scaling input data using mean and standard deviation.


mean_

The mean values of the input data.

Type:

torch.Tensor

scale_

The standard deviation values of the input data.

Type:

torch.Tensor

device

The device (CPU or GPU) on which the calculations are performed.

Type:

torch.device

fit(X)[source]

Computes the mean and standard deviation of the input data.

transform(X)[source]

Scales the input data using the computed mean and standard deviation.

inverse_transform(X)[source]

Reverses the scaling operation on the input data.

save(path)[source]

Saves the mean and standard deviation to a file.

load(path)[source]

Loads the mean and standard deviation from a file.

fit(X)[source]

Computes the mean and standard deviation of the input data.

Parameters:

X (torch.Tensor) - The input data to be scaled.

inverse_transform(X)[source]

Reverses the scaling operation on the input data.

Parameters:

X (torch.Tensor) - The scaled input data to be transformed back.

Returns:

The transformed input data.

Return type:

torch.Tensor

Raises:

RuntimeError - If the Scaler instance is not fitted yet.

static load(path)[source]

Loads the mean and standard deviation from a file.

Parameters:

path (str) - The path to load the file from.

Returns:

A Scaler instance with the loaded mean and standard deviation.

Return type:

Scaler

save(path)[source]

Saves the mean and standard deviation to a file.

Parameters:

path (str) - The path to save the file.

transform(X)[source]

Scales the input data using the computed mean and standard deviation.

Parameters:

X (torch.Tensor) - The input data to be scaled.

Returns:

The scaled input data.

Return type:

torch.Tensor

Raises:

RuntimeError - If the Scaler instance is not fitted yet.
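A compact numpy sketch of the fit/transform/inverse_transform contract described above, including the documented not-fitted error (the torch class operates on tensors and persists `mean_`/`scale_` to disk; this sketch only mirrors the arithmetic and the error behavior):

```python
import numpy as np

class StandardScalerSketch:
    """Numpy sketch of the documented fit/transform/inverse cycle."""
    def fit(self, X):
        self.mean_ = np.mean(X, axis=0)
        self.scale_ = np.std(X, axis=0)
        return self

    def transform(self, X):
        if not hasattr(self, "mean_"):
            raise RuntimeError("Scaler instance is not fitted yet")
        return (X - self.mean_) / self.scale_

    def inverse_transform(self, X):
        if not hasattr(self, "mean_"):
            raise RuntimeError("Scaler instance is not fitted yet")
        return X * self.scale_ + self.mean_

X = np.array([[1.0, 10.0], [3.0, 30.0]])
scaler = StandardScalerSketch().fit(X)
Z = scaler.transform(X)
```

After fitting, each column of the transformed data has zero mean and unit standard deviation, and the round trip through `inverse_transform` recovers the original values.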