synthetic_aia_mia.fetch_data package

Submodules

synthetic_aia_mia.fetch_data.adult module

Load Adult dataset and manage cross validation.

synthetic_aia_mia.fetch_data.adult.load(sensitive=[], k=0, p=1)[source]

Download the folktables Adult dataset if necessary, then split it and return the train and test sets.

Parameters:
  • sensitive (list of str) – (Optional, default=[]) List of sensitive attributes to include in the features. The sensitive attributes are “sex” and “race”.

  • k (int) – (Optional, default=0) Cross-validation step in {0,1,2,3,4}.

  • p (float) – Proportion (0<p<=1) of data used.

Returns:

Train and test split dataframes in a dictionary.

Return type:

Dictionary
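
The effect of the sensitive and p parameters can be illustrated on a toy dataframe. This is only a sketch: toy_load is a hypothetical stand-in, not the package's implementation. It assumes that sensitive columns are dropped unless listed in sensitive and that p keeps the first share of rows; the real function may select rows differently.

```python
import pandas as pd

SENSITIVE = ("sex", "race")  # sensitive attributes named in the documentation


def toy_load(df, sensitive=(), p=1.0):
    """Hypothetical stand-in for load(): filter columns, then keep a share p of rows."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    unknown = set(sensitive) - set(SENSITIVE)
    if unknown:
        raise ValueError(f"unknown sensitive attributes: {sorted(unknown)}")
    # Drop the sensitive columns that were not explicitly requested.
    dropped = [c for c in SENSITIVE if c not in sensitive and c in df.columns]
    kept = df.drop(columns=dropped)
    # Assumption: the proportion p is applied by keeping the first rows.
    n = max(1, int(len(kept) * p))
    return kept.iloc[:n]


toy = pd.DataFrame({
    "age": [25, 38, 52, 41],
    "sex": ["F", "M", "F", "M"],
    "race": ["A", "B", "A", "B"],
    "income": [0, 1, 1, 0],
})
subset = toy_load(toy, sensitive=["sex"], p=0.5)
```

Here subset keeps the “sex” column, drops “race”, and contains half of the rows.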

synthetic_aia_mia.fetch_data.split module

Split data into train / test using 5-fold cross-validation.

synthetic_aia_mia.fetch_data.split.split_numpy(data, k=0)[source]

5-fold split of a dataset stored as a dictionary of numpy arrays.

Parameters:
  • data (Dictionary) – Dataset where each key maps to a numpy array.

  • k (int) – (Optional) Index of the fold; can be 0, 1, 2, 3 or 4.

Returns:

Dataset with train and test.

Return type:

Dictionary
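
A 5-fold split over a dictionary of arrays could look like the sketch below. The function five_fold_split and the “train” / “test” keys of the result are assumptions made for illustration; the package's actual layout and any shuffling it applies are not documented here.

```python
import numpy as np

N_FOLDS = 5


def five_fold_split(data, k=0):
    """Hypothetical sketch: fold k becomes the test set, the other folds the train set."""
    if k not in range(N_FOLDS):
        raise ValueError("k must be 0, 1, 2, 3 or 4")
    n = len(next(iter(data.values())))
    # np.array_split tolerates a length that is not divisible by the fold count.
    folds = np.array_split(np.arange(n), N_FOLDS)
    test_idx = folds[k]
    train_idx = np.concatenate([f for i, f in enumerate(folds) if i != k])
    return {
        "train": {key: arr[train_idx] for key, arr in data.items()},
        "test": {key: arr[test_idx] for key, arr in data.items()},
    }


data = {"x": np.arange(20).reshape(10, 2), "y": np.arange(10)}
split = five_fold_split(data, k=1)
```

Every array in the dictionary is indexed with the same row indices, so features and labels stay aligned across the split.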

synthetic_aia_mia.fetch_data.split.split_pandas(data, k=0)[source]

5-fold split of a dataset stored as a pandas dataframe.

Parameters:
  • data (pandas.DataFrame) – Dataset in the form of a dataframe.

  • k (int) – (Optional) Index of the fold; can be 0, 1, 2, 3 or 4.

Returns:

Dataset with train and test.

Return type:

Dictionary
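
For a dataframe, the same idea can be sketched with a boolean row mask. The name five_fold_split_frame and the “train” / “test” keys are hypothetical, and the sketch assumes contiguous, unshuffled folds.

```python
import numpy as np
import pandas as pd


def five_fold_split_frame(df, k=0):
    """Hypothetical sketch: rows of fold k form the test set, all others the train set."""
    if k not in range(5):
        raise ValueError("k must be 0, 1, 2, 3 or 4")
    folds = np.array_split(np.arange(len(df)), 5)
    # A positional mask avoids relying on the dataframe index being unique.
    mask = np.zeros(len(df), dtype=bool)
    mask[folds[k]] = True
    return {"train": df[~mask], "test": df[mask]}


frame = pd.DataFrame({"a": range(10), "b": list("abcdefghij")})
parts = five_fold_split_frame(frame, k=4)
```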

synthetic_aia_mia.fetch_data.utk module

Download and manage the train / test split for the UTKFaces dataset.

class synthetic_aia_mia.fetch_data.utk.StorageData[source]

Bases: object

A dataset structure that loads images from storage. On initialisation

extraction(i)[source]

Create a new smaller StorageDataset.

Parameters:

i (list of int) – List of indices.

Returns:

Extracted StorageDataset.

Return type:

StorageDataset
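
The pattern of a storage-backed dataset with an extraction method can be sketched as follows. LazyFiles is a hypothetical class written for illustration: it reads raw bytes instead of decoding images, and it is not the package's StorageData implementation.

```python
import tempfile
from pathlib import Path


class LazyFiles:
    """Hypothetical stand-in for a storage-backed dataset: keeps paths, reads on access."""

    def __init__(self, paths):
        self.paths = list(paths)

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        # Content is read from storage only when an item is requested.
        return Path(self.paths[idx]).read_bytes()

    def extraction(self, i):
        """Return a new, smaller dataset restricted to the indices in i."""
        return LazyFiles([self.paths[j] for j in i])


# Create a few small files to stand in for images on disk.
tmp = tempfile.TemporaryDirectory()
paths = []
for n in range(4):
    path = Path(tmp.name) / f"img_{n}.bin"
    path.write_bytes(bytes([n]))
    paths.append(path)

full = LazyFiles(paths)
small = full.extraction([0, 2])
```

Because only paths are copied, extracting a subset is cheap and no file is read until an item is indexed.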

synthetic_aia_mia.fetch_data.utk.load(k=0, p=1)[source]

Load UTK in a dictionary with train and test.

Parameters:
  • k (int) – Validation step in {0,1,2,3,4}.

  • p (float) – Proportion of data used in [0,1].

Returns:

Dictionary containing train and test.

Return type:

Dictionary of StorageDataset

synthetic_aia_mia.fetch_data.utk.pyrDown(src[, dst[, dstsize[, borderType]]]) → dst

Blurs an image and downsamples it.

By default, the size of the output image is computed as Size((src.cols+1)/2, (src.rows+1)/2), but in any case the following conditions should be satisfied:

\[
\begin{array}{l}
|\texttt{dstsize.width} * 2 - \texttt{src.cols}| \leq 2 \\
|\texttt{dstsize.height} * 2 - \texttt{src.rows}| \leq 2
\end{array}
\]

The function performs the downsampling step of the Gaussian pyramid construction. First, it convolves the source image with the kernel:

\[
\frac{1}{256}
\begin{bmatrix}
1 & 4 & 6 & 4 & 1 \\
4 & 16 & 24 & 16 & 4 \\
6 & 24 & 36 & 24 & 6 \\
4 & 16 & 24 & 16 & 4 \\
1 & 4 & 6 & 4 & 1
\end{bmatrix}
\]

Then, it downsamples the image by rejecting even rows and columns.

Parameters:
  • src – Input image.

  • dst – Output image; it has the specified size and the same type as src.

  • dstsize – Size of the output image.

  • borderType – Pixel extrapolation method, see BorderTypes (BORDER_CONSTANT isn’t supported).

synthetic_aia_mia.fetch_data.utk.pyrUp(src[, dst[, dstsize[, borderType]]]) → dst

Upsamples an image and then blurs it.

By default, the size of the output image is computed as Size(src.cols*2, src.rows*2), but in any case the following conditions should be satisfied:

\[
\begin{array}{l}
|\texttt{dstsize.width} - \texttt{src.cols} * 2| \leq (\texttt{dstsize.width} \bmod 2) \\
|\texttt{dstsize.height} - \texttt{src.rows} * 2| \leq (\texttt{dstsize.height} \bmod 2)
\end{array}
\]

The function performs the upsampling step of the Gaussian pyramid construction, though it can actually be used to construct the Laplacian pyramid. First, it upsamples the source image by injecting even zero rows and columns, and then convolves the result with the same kernel as in pyrDown multiplied by 4.

Parameters:
  • src – Input image.

  • dst – Output image; it has the specified size and the same type as src.

  • dstsize – Size of the output image.

  • borderType – Pixel extrapolation method, see BorderTypes (only BORDER_DEFAULT is supported).

Module contents

Download datasets and split them into train / test.

class synthetic_aia_mia.fetch_data.Dataset[source]

Bases: object

Manages a dataset through the high-level interface.

load()[source]

Return the dataset loaded into memory.

Returns:

Previously updated dataset.

Return type:

pandas.DataFrame for Adult or dictionary of numpy.ndarray for UTKFaces

save(path)[source]

Save the underlying pandas object to a permanent file.

Parameters:

path (Valid Unix path) – Where to save the pandas object using pickle.

update(data)[source]

Update the content of the dataset.
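
The update / load / save cycle described above can be sketched with a small container class. ToyDataset is hypothetical and only mirrors the documented method names; it assumes save writes the stored object with pickle, as the path parameter description suggests.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd


class ToyDataset:
    """Hypothetical container mirroring the update / load / save methods."""

    def __init__(self):
        self._data = None

    def update(self, data):
        # Replace the stored content with the new data.
        self._data = data

    def load(self):
        # Return whatever was stored by the last update.
        return self._data

    def save(self, path):
        # Assumption: the object is serialised with pickle.
        with open(path, "wb") as handle:
            pickle.dump(self._data, handle)


dataset = ToyDataset()
dataset.update(pd.DataFrame({"age": [25, 38], "income": [0, 1]}))

tmp = tempfile.TemporaryDirectory()
target = Path(tmp.name) / "dataset.pickle"
dataset.save(target)

with open(target, "rb") as handle:
    restored = pickle.load(handle)
```

The restored object is equal to the dataframe returned by dataset.load().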