Medfl.NetManager package
Submodules
Medfl.NetManager.dataset module
- class Medfl.NetManager.dataset.DataSet(name: str, path: str, engine=None)[source]
Bases:
object
- __init__(name: str, path: str, engine=None)[source]
Initialize a DataSet object.
- Parameters:
name (str) – The name of the dataset.
path (str) – The file path of the dataset CSV file.
engine (optional) – The database engine or connection to use. Default is None.
- delete_dataset()[source]
Delete the dataset from the database.
Notes: Assumes the dataset name is unique in the ‘DataSets’ table.
- static list_alldatasets(engine)[source]
List all dataset names from the ‘DataSets’ table.
- Returns:
A DataFrame containing the names of all datasets in the ‘DataSets’ table.
- Return type:
pd.DataFrame
Medfl.NetManager.flsetup module
- class Medfl.NetManager.flsetup.FLsetup(name: str, description: str, network: Network)[source]
Bases:
object
- __init__(name: str, description: str, network: Network)[source]
Initialize a Federated Learning (FL) setup.
- Parameters:
name (str) – The name of the FL setup.
description (str) – A description of the FL setup.
network (Network) – An instance of the Network class representing the network architecture.
- create_dataloader_from_node(node: Node, output, fill_strategy='mean', fit_encode=[], to_drop=[], train_batch_size: int = 32, test_batch_size: int = 1, split_frac: float = 0.2, dataset: Optional[Dataset] = None)[source]
Create DataLoader from a Node.
- Parameters:
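node (Node) – The node from which to create the DataLoader.
output (str) – The name of the output column.
fill_strategy (str, optional) – Imputation strategy for missing values. Default is “mean”.
fit_encode (list, optional) – List of columns to be label-encoded. Default is an empty list.
to_drop (list, optional) – List of columns to be dropped. Default is an empty list.
train_batch_size (int) – The batch size for training data.
test_batch_size (int) – The batch size for test data.
split_frac (float) – The fraction of data to be used for training.
dataset (Dataset, optional) – The dataset to use. If None, the method reads the dataset from the node.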
- Returns:
The DataLoader instances for training and testing.
- Return type:
DataLoader
- create_federated_dataset(output, fill_strategy='mean', fit_encode=[], to_drop=[], val_frac=0.1, test_frac=0.2) FederatedDataset [source]
Create a federated dataset.
- Parameters:
output (str) – The output feature of the dataset.
val_frac (float) – The fraction of data to be used for validation.
test_frac (float) – The fraction of data to be used for testing.
- Returns:
The FederatedDataset instance containing train, validation, and test data.
- Return type:
FederatedDataset
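To make the two fractions concrete, the sketch below partitions row indices into train, validation, and test subsets. It assumes that both fractions are taken from the full dataset and that the split is a seeded shuffle; the splitting logic used by create_federated_dataset is not shown in this reference and may differ. `split_indices` is a hypothetical helper, not part of Medfl.

```python
import random


def split_indices(n_rows, val_frac=0.1, test_frac=0.2, seed=0):
    """Hypothetical helper: partition range(n_rows) into train/val/test lists."""
    if not 0 <= val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be in [0, 1)")
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = int(n_rows * test_frac)
    n_val = int(n_rows * val_frac)
    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]  # everything left over is training data
    return train, val, test


train, val, test = split_indices(100)
print(len(train), len(val), len(test))
```

With the default fractions, whatever is not reserved for validation or testing remains available for training.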
- create_nodes_from_master_dataset(params_dict: dict)[source]
Create nodes from the master dataset.
- Parameters:
params_dict (dict) – A dictionary containing parameters for node creation:
- column_name (str): The name of the column in the MasterDataset used to create nodes.
- train_nodes (list): A list of node names that will be used for training.
- test_nodes (list): A list of node names that will be used for testing.
- Returns:
A list of Node instances created from the master dataset.
- Return type:
list
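As an illustration of the expected shape of params_dict, the sketch below builds one and shows how rows of a master table could be grouped into nodes by the chosen column. The column name `site_id`, the node names, and the helper `group_rows_by_node` are hypothetical; the real method reads from the MasterDataset table and returns Node instances.

```python
from collections import defaultdict

# Hypothetical values; only the three keys come from the documentation above.
params_dict = {
    "column_name": "site_id",
    "train_nodes": ["site_a", "site_b"],
    "test_nodes": ["site_c"],
}


def group_rows_by_node(rows, params):
    """Group row dicts by params['column_name'], keeping only the listed nodes.

    Returns {node_name: {"train": 1 or 0, "rows": [...]}}.
    """
    # Mirror the Node.train flag: 1 for training nodes, 0 for testing nodes.
    wanted = {name: 1 for name in params["train_nodes"]}
    wanted.update({name: 0 for name in params["test_nodes"]})
    grouped = defaultdict(list)
    for row in rows:
        node_name = row[params["column_name"]]
        if node_name in wanted:
            grouped[node_name].append(row)
    return {name: {"train": wanted[name], "rows": grouped[name]} for name in wanted}


rows = [
    {"site_id": "site_a", "age": 61},
    {"site_id": "site_b", "age": 47},
    {"site_id": "site_c", "age": 70},
    {"site_id": "site_a", "age": 55},
]
nodes = group_rows_by_node(rows, params_dict)
```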
- get_flDataSet()[source]
Retrieve the federated dataset associated with the FL setup using the FL setup’s name.
- Returns:
DataFrame containing the federated dataset information.
- Return type:
pandas.DataFrame
- static list_allsetups()[source]
List all the FL setups.
- Returns:
A DataFrame containing information about all the FL setups.
- Return type:
DataFrame
Medfl.NetManager.net_helper module
- Medfl.NetManager.net_helper.get_feddataset_id_from_name(name)[source]
Get the Federated dataset Id from the FedDatasets table based on the federated dataset name.
- Parameters:
name (str) – Federated dataset name.
- Returns:
FedId or None if not found.
- Return type:
int or None
- Medfl.NetManager.net_helper.get_flpipeline_from_name(name)[source]
Get the FLpipeline Id from the FLpipeline table based on the FL pipeline name.
- Parameters:
name (str) – FL pipeline name.
- Returns:
FLpipelineId or None if not found.
- Return type:
int or None
- Medfl.NetManager.net_helper.get_flsetupid_from_name(name)[source]
Get the FLsetupId from the FLsetup table based on the FL setup name.
- Parameters:
name (str) – FL setup name.
- Returns:
FLsetupId or None if not found.
- Return type:
int or None
- Medfl.NetManager.net_helper.get_netid_from_name(name)[source]
Get the Network Id from the Networks table based on the NetName.
- Parameters:
name (str) – Network name.
- Returns:
NetId or None if not found.
- Return type:
int or None
- Medfl.NetManager.net_helper.get_nodeid_from_name(name)[source]
Get the NodeId from the Nodes table based on the NodeName.
- Parameters:
name (str) – Node name.
- Returns:
NodeId or None if not found.
- Return type:
int or None
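The lookup helpers above share one pattern: query a table by name and return the matching id, or None when no row matches. A minimal sketch of that pattern follows, using an in-memory SQLite database so that it runs standalone. The table and column names (`Nodes`, `NodeId`, `NodeName`) follow the description above, but the schema, the `get_id_from_name` helper, and the use of SQLite are assumptions for illustration; the library's own queries and database backend may differ.

```python
import sqlite3


def get_id_from_name(conn, name):
    """Return the NodeId for `name`, or None if no such node exists."""
    row = conn.execute(
        "SELECT NodeId FROM Nodes WHERE NodeName = ?",  # parameterized query
        (name,),
    ).fetchone()
    return row[0] if row is not None else None


# Assumed schema, created here only so the example is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Nodes (NodeId INTEGER PRIMARY KEY, NodeName TEXT UNIQUE)")
conn.execute("INSERT INTO Nodes (NodeName) VALUES ('hospital_1')")

found = get_id_from_name(conn, "hospital_1")
missing = get_id_from_name(conn, "unknown")
```

Callers should check for None before using the result, since a missing name is reported through the return value rather than an exception.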
- Medfl.NetManager.net_helper.is_str(data_df, row, x)[source]
Check if a column in a DataFrame is of type ‘object’ and convert the value accordingly.
- Parameters:
data_df (pandas.DataFrame) – DataFrame containing the data.
row (pandas.Series) – Data row.
x (str) – Column name.
- Returns:
Processed value based on the column type.
- Return type:
str or float
- Medfl.NetManager.net_helper.master_table_exists()[source]
Check if the MasterDataset table exists in the database.
- Returns:
True if the table exists, False otherwise.
- Return type:
bool
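A table-existence check of this kind can be sketched as follows. This version queries SQLite's `sqlite_master` catalog so that it runs standalone; that choice and the `table_exists` helper are assumptions, and the library may check its own database differently.

```python
import sqlite3


def table_exists(conn, table_name):
    """Return True if a table named `table_name` exists in the SQLite database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
before = table_exists(conn, "MasterDataset")  # table not created yet
conn.execute("CREATE TABLE MasterDataset (id INTEGER PRIMARY KEY)")
after = table_exists(conn, "MasterDataset")
```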
- Medfl.NetManager.net_helper.process_data_after_reading(data, output, fill_strategy='mean', fit_encode=[], to_drop=[])[source]
Process data after reading from the database, including encoding, dropping columns, and creating a PyTorch TensorDataset.
- Parameters:
data (pandas.DataFrame) – Input data.
output (str) – Output column name.
fill_strategy (str, optional) – Imputation strategy for missing values. Default is “mean”.
fit_encode (list, optional) – List of columns to be label-encoded. Default is an empty list.
to_drop (list, optional) – List of columns to be dropped from the DataFrame. Default is an empty list.
- Returns:
Processed data as a PyTorch TensorDataset.
- Return type:
torch.utils.data.TensorDataset
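The preprocessing steps named above (dropping columns, label encoding, mean imputation, separating the output column) can be illustrated in plain Python. This is a sketch of the ideas only: it works on a list of dicts instead of a pandas DataFrame, stops before the conversion to a TensorDataset, and the `preprocess` helper and sample columns are hypothetical. The order of steps inside the library function is an assumption.

```python
from statistics import mean


def preprocess(rows, output, fit_encode=(), to_drop=()):
    """Plain-Python sketch: drop columns, label-encode, mean-impute, split X/y."""
    # 1. Drop unwanted columns (builds new dicts, so the input is not mutated).
    rows = [{k: v for k, v in row.items() if k not in to_drop} for row in rows]
    columns = list(rows[0].keys())

    # 2. Label-encode the requested columns (sorted unique values -> 0..n-1).
    #    Encoded columns are assumed to have no missing values.
    for col in fit_encode:
        codes = {v: i for i, v in enumerate(sorted({row[col] for row in rows}))}
        for row in rows:
            row[col] = codes[row[col]]

    # 3. Replace missing values (None) with the column mean.
    for col in columns:
        col_mean = mean(row[col] for row in rows if row[col] is not None)
        for row in rows:
            if row[col] is None:
                row[col] = col_mean

    # 4. Separate the features from the output column.
    features = [[float(row[c]) for c in columns if c != output] for row in rows]
    target = [float(row[output]) for row in rows]
    return features, target


rows = [
    {"id": 1, "sex": "F", "age": 60, "died": 0},
    {"id": 2, "sex": "M", "age": None, "died": 1},
    {"id": 3, "sex": "F", "age": 40, "died": 0},
]
X, y = preprocess(rows, output="died", fit_encode=["sex"], to_drop=["id"])
```

Here the missing age is filled with the mean of the two known ages, and `sex` is mapped to integer codes before the features and target are separated.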
Medfl.NetManager.net_manager_queries module
Medfl.NetManager.network module
- class Medfl.NetManager.network.Network(name: str = '')[source]
Bases:
object
A class representing a network.
- name
The name of the network.
- Type:
str
- mtable_exists
An integer flag indicating whether the MasterDataset table exists (1) or not (0).
- Type:
int
- __init__(name: str = '')[source]
Initialize a Network instance.
- Parameters:
name (str) – The name of the network.
- create_master_dataset(path_to_csv: str = '/home/local/USHERBROOKE/saho6810/MEDfl/code/MEDfl/notebooks/eicu_test.csv')[source]
Create the MasterDataset table and insert dataset values.
- Parameters:
path_to_csv (str) – Path to the CSV file containing the dataset.
- static list_allnetworks()[source]
List all networks in the database.
- Returns:
A DataFrame containing information about all networks in the database.
- Return type:
DataFrame
- list_allnodes()[source]
List all nodes in the network.
- Returns:
A DataFrame containing information about all nodes in the network.
- Return type:
DataFrame
- update_network(FLsetupId: int)[source]
Update the network’s FLsetupId in the database.
- Parameters:
FLsetupId (int) – The FLsetupId to update.
Medfl.NetManager.node module
- class Medfl.NetManager.node.Node(name: str, train: int, test_fraction: float = 0.2, engine=<sqlalchemy.engine.base.Connection object>)[source]
Bases:
object
A class representing a node in the network.
- name
The name of the node.
- Type:
str
- train
An integer flag representing whether the node is used for training (1) or testing (0).
- Type:
int
- test_fraction
The fraction of data used for testing when train=1. Default is 0.2.
- Type:
float, optional
- __init__(name: str, train: int, test_fraction: float = 0.2, engine=<sqlalchemy.engine.base.Connection object>)[source]
Initialize a Node instance.
- Parameters:
name (str) – The name of the node.
train (int) – An integer flag representing whether the node is used for training (1) or testing (0).
test_fraction (float, optional) – The fraction of data used for testing when train=1. Default is 0.2.
- assign_dataset(dataset_name: str)[source]
Assign an existing dataset to the node.
- Parameters:
dataset_name (str) – The name of the dataset to assign.
- Returns:
None
- check_dataset_compatibility(data_df)[source]
Check if the dataset is compatible with the master dataset.
- Parameters:
data_df (DataFrame) – The dataset to check.
- Returns:
None
- create_node(NetId: int)[source]
Create a node in the database.
- Parameters:
NetId (int) – The ID of the network to which the node belongs.
- Returns:
None
- get_dataset(column_name: Optional[str] = None)[source]
Get the dataset for the node based on the given column name.
- Parameters:
column_name (str, optional) – The column name to filter the dataset. Default is None.
- Returns:
The dataset associated with the node.
- Return type:
DataFrame
- list_alldatasets()[source]
List all datasets associated with the node.
- Returns:
A DataFrame containing information about all datasets associated with the node.
- Return type:
DataFrame
- static list_allnodes()[source]
List all nodes in the database.
- Returns:
A DataFrame containing information about all nodes in the database.
- Return type:
DataFrame
- unassign_dataset(dataset_name: str)[source]
Unassign an existing dataset from the node.
- Parameters:
dataset_name (str) – The name of the dataset to unassign.
- Returns:
None
- upload_dataset(dataset_name: str, path_to_csv: str = '/home/local/USHERBROOKE/saho6810/MEDfl/code/MEDfl/notebooks/eicu_test.csv')[source]
Upload the dataset to the database for the node.
- Parameters:
dataset_name (str) – The name of the dataset.
path_to_csv (str, optional) – Path to the CSV file containing the dataset. Default is the path in params.
- Returns:
None