Function Reference
Somoclu Class
class somoclu.Somoclu(n_columns, n_rows, data=None, initialcodebook=None, kerneltype=0, maptype='planar', gridtype='rectangular', compactsupport=False, neighborhood='gaussian', initialization=None)
Class for training and visualizing a self-organizing map.
Parameters:
- n_columns (int.) – The number of columns in the map.
- n_rows (int.) – The number of rows in the map.
- data (2D numpy.array of float32.) – Optional parameter to provide training data. It is not necessary if the map is otherwise trained outside Python, e.g., on a GPU cluster.
- initialcodebook (2D numpy.array of float32.) – Optional parameter to start the training with a given codebook.
- kerneltype (int.) –
Optional parameter to specify which kernel to use:
- 0: dense CPU kernel (default)
- 1: dense GPU kernel (if compiled with it)
- maptype (str.) –
Optional parameter to specify the map topology:
- “planar”: Planar map (default)
- “toroid”: Toroid map
- gridtype (str.) –
Optional parameter to specify the grid form of the nodes:
- “rectangular”: rectangular neurons (default)
- “hexagonal”: hexagonal neurons
- compactsupport (bool.) – Optional parameter to cut off map updates beyond the training radius when the Gaussian neighborhood is used. Default: False.
- neighborhood (str.) –
Optional parameter to specify the neighborhood:
- “gaussian”: Gaussian neighborhood (default)
- “bubble”: bubble neighborhood function
- initialization (str.) –
Optional parameter to specify the initialization:
- “random”: random weights in the codebook
- “pca”: the codebook is initialized from the subspace spanned by the first two eigenvectors of the correlation matrix
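A minimal construction sketch, assuming the somoclu package is installed and using only the parameters listed above. The helper names make_training_data and build_som are hypothetical, and the map size and random data are placeholders rather than recommended values.

```python
import numpy as np


def make_training_data(n_samples=200, n_features=3, seed=0):
    """Return random placeholder data as a 2D float32 array (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return rng.random((n_samples, n_features)).astype(np.float32)


def build_som(data, n_columns=30, n_rows=20):
    """Construct a Somoclu object; assumes the somoclu package is installed."""
    import somoclu  # imported lazily so the data helper works without it

    return somoclu.Somoclu(
        n_columns,
        n_rows,
        data=data,
        maptype="toroid",
        gridtype="hexagonal",
        initialization="pca",
    )


data = make_training_data()
# som = build_som(data)    # uncomment when somoclu is available
# som.train(epochs=10)
```

The data array is cast to float32 up front because that is the type the data parameter is documented to expect.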
cluster(algorithm=None)
Cluster the codebook. The clusters of the data instances can be assigned based on the BMUs. The method populates the class variable Somoclu.clusters. If viewing methods are called after clustering, but without colors for best matching units, colors will be automatically assigned based on cluster membership.
Parameters: algorithm – Optional parameter to specify a scikit-learn clustering algorithm. The default is K-means with eight clusters.
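How data instances can inherit clusters from their BMUs can be sketched with plain NumPy. This is an illustration, not the library's code: it assumes the per-node cluster labels form an array of shape (n_rows, n_columns) and that each BMU is stored as a (column, row) pair. The function name clusters_of_instances is hypothetical.

```python
import numpy as np


def clusters_of_instances(node_clusters, bmus):
    """Look up the cluster label of each instance's best matching unit.

    node_clusters: int array of shape (n_rows, n_columns), one label per node.
    bmus: int array of shape (n_samples, 2), assumed to hold (column, row).
    """
    node_clusters = np.asarray(node_clusters)
    bmus = np.asarray(bmus)
    # Index rows with the second BMU coordinate and columns with the first.
    return node_clusters[bmus[:, 1], bmus[:, 0]]


# Toy map with 2 rows and 3 columns, already clustered into labels 0..2.
node_clusters = np.array([[0, 0, 1],
                          [2, 2, 1]])
# Three instances whose BMUs are given as (column, row).
bmus = np.array([[0, 0], [2, 1], [1, 1]])
instance_clusters = clusters_of_instances(node_clusters, bmus)
```

If the coordinate order in a given installation turns out to be (row, column), the two index expressions swap.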
load_bmus(filename)
Load the best matching units from a file into the Somoclu object.
Parameters: filename (str.) – The name of the file.
load_codebook(filename)
Load the codebook from a file into the Somoclu object.
Parameters: filename (str.) – The name of the file.
load_umatrix(filename)
Load the U-matrix from a file into the Somoclu object.
Parameters: filename (str.) – The name of the file.
train(epochs=10, radius0=0, radiusN=1, radiuscooling='linear', scale0=0.1, scaleN=0.01, scalecooling='linear')
Train the map on the current data in the Somoclu object.
Parameters:
- epochs (int.) – The number of epochs to train the map for.
- radius0 (int.) – The initial radius on the map where the update happens around a best matching unit. The default value of 0 selects a radius of min(n_columns, n_rows)/2.
- radiusN (int.) – The radius on the map where the update happens around a best matching unit in the final epoch. Default: 1.
- radiuscooling (str.) –
The cooling strategy between radius0 and radiusN:
- “linear”: Linear interpolation (default)
- “exponential”: Exponential decay
- scale0 (float.) – The initial learning scale. Default: 0.1.
- scaleN (float.) – The learning scale in the final epoch. Default: 0.01.
- scalecooling (str.) –
The cooling strategy between scale0 and scaleN:
- “linear”: Linear interpolation (default)
- “exponential”: Exponential decay
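The two cooling strategies and the radius0 default can be sketched in pure Python. The formulas below are one common way to interpolate between a start and an end value over the epochs; they are an assumption for illustration, and the library's exact schedule may differ. All function names are hypothetical.

```python
def initial_radius(radius0, n_columns, n_rows):
    """Resolve the documented default: radius0 == 0 means min(n_columns, n_rows)/2."""
    return min(n_columns, n_rows) / 2 if radius0 == 0 else radius0


def linear_cooling(start, end, n_epochs, epoch):
    """Linearly interpolate from start (epoch 0) to end (last epoch)."""
    if n_epochs <= 1:
        return float(start)
    return start + (end - start) * epoch / (n_epochs - 1)


def exponential_cooling(start, end, n_epochs, epoch):
    """Decay geometrically from start to end; both values must be positive."""
    if n_epochs <= 1:
        return float(start)
    return start * (end / start) ** (epoch / (n_epochs - 1))


# Radius schedule for a 30 x 20 map trained for 10 epochs with the defaults.
radius0 = initial_radius(0, n_columns=30, n_rows=20)
radii = [linear_cooling(radius0, 1, 10, e) for e in range(10)]
# Learning-scale schedule from scale0=0.1 down to scaleN=0.01.
scales = [exponential_cooling(0.1, 0.01, 10, e) for e in range(10)]
```

Under these assumed formulas, the linear schedule decreases by the same amount every epoch, while the exponential one shrinks by the same factor every epoch, so it drops quickly at first and slowly near the end.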
update_data(data)
Change the data set in the Somoclu object. This is useful when the data is updated and the training should continue on the new data.
Parameters: data (2D numpy.array of float32.) – The training data.
view_component_planes(dimensions=None, figsize=None, colormap=cm.Spectral_r, colorbar=False, bestmatches=False, bestmatchcolors=None, labels=None, zoom=None, filename=None)
Observe the component planes in the codebook of the SOM.
Parameters:
- dimensions – Optional parameter to specify which dimension or dimensions to plot. By default, each dimension is plotted in a sequence of plots.
- figsize ((int, int)) – Optional parameter to specify the size of the figure.
- colormap (matplotlib.colors.Colormap) – Optional parameter to specify the color map to be used.
- colorbar (bool.) – Optional parameter to include a color bar as a legend.
- bestmatches (bool.) – Optional parameter to plot best matching units.
- bestmatchcolors (list of int.) – Optional parameter to specify the color of each best matching unit.
- labels (list of str.) – Optional parameter to specify the label of each point.
- zoom (((int, int), (int, int))) – Optional parameter to zoom into a region on the map. The first tuple contains the row limits, the second tuple contains the column limits.
- filename (str.) – If specified, the plot will not be shown but saved to this file.
view_umatrix(figsize=None, colormap=cm.Spectral_r, colorbar=False, bestmatches=False, bestmatchcolors=None, labels=None, zoom=None, filename=None)
Plot the U-matrix of the trained map.
Parameters:
- figsize ((int, int)) – Optional parameter to specify the size of the figure.
- colormap (matplotlib.colors.Colormap) – Optional parameter to specify the color map to be used.
- colorbar (bool.) – Optional parameter to include a color bar as a legend.
- bestmatches (bool.) – Optional parameter to plot best matching units.
- bestmatchcolors (list of int.) – Optional parameter to specify the color of each best matching unit.
- labels (list of str.) – Optional parameter to specify the label of each point.
- zoom (((int, int), (int, int))) – Optional parameter to zoom into a region on the map. The first tuple contains the row limits, the second tuple contains the column limits.
- filename (str.) – If specified, the plot will not be shown but saved to this file.
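The zoom parameter of both viewing methods takes row limits first and column limits second. The NumPy sketch below shows how such a tuple could select a region of a map-shaped array. It assumes half-open limits (start included, stop excluded), which the text above does not state, and zoom_region is a hypothetical helper rather than part of the library.

```python
import numpy as np


def zoom_region(matrix, zoom):
    """Return the sub-array selected by ((row_start, row_stop), (col_start, col_stop))."""
    (row_start, row_stop), (col_start, col_stop) = zoom
    # Assumed half-open slicing: the stop index itself is not included.
    return matrix[row_start:row_stop, col_start:col_stop]


# Stand-in for a U-matrix on a map with 4 rows and 5 columns.
umatrix = np.arange(20).reshape(4, 5)
# Keep rows 1..2 and columns 0..1.
zoomed = zoom_region(umatrix, ((1, 3), (0, 2)))
```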