vbvarsel.custodian
==================

.. py:module:: vbvarsel.custodian


Classes
-------

.. autoapisummary::

   vbvarsel.custodian.UserDataHandler


Module Contents
---------------

.. py:class:: UserDataHandler

   Class representing user-supplied data.


   .. py:attribute:: ExperimentValues


   .. py:method:: normalise_data(data: pandas.DataFrame) -> numpy.ndarray

      Returns a normalised array from a non-normalised input DataFrame.

      Params
          data: pd.DataFrame
              DataFrame of unsorted, unshuffled and non-normalised data to be used.
      Returns
          array_normalised: np.ndarray
              2-D array of normalised data.




   .. py:method:: shuffle_normalised_data(normalised_data: numpy.ndarray) -> Union[pandas.Series, numpy.ndarray]

      Shuffles a DataFrame of normalised data.

      Params
          normalised_data: pd.DataFrame
              A normalised DataFrame of numerical data.
      Returns
          Tuple:
              shuffled_data: pd.Series, shuffled_indices: np.ndarray
                  A tuple of shuffled data and their corresponding indices.




   .. py:method:: load_data(data_source: str | os.PathLike, cols_to_ignore: list[str] = None, labels: str | list[str] = None, header: int = 0, index_col: bool = False) -> None

      Loads data to be used in simulations, with an option to clean the data.

      Params
          data_source: str | os.PathLike
              The file location of the spreadsheet, which must be in CSV format.
              IMPORTANT format note: columns should be variables, and rows should be observations.
              Header rows and the index column will not be loaded. The CSV must contain only numerical values.
              Non-numerical columns may be present if they are listed in the `cols_to_ignore` parameter.
          cols_to_ignore: list[str] (Optional) (Default: None)
              Columns that are irrelevant or non-numerical and should be excluded from the analysis.
          labels: str | list[str] (Optional) (Default: None)
              Labels used to calculate the adjusted Rand index (ARI), which checks clustering accuracy.
              This parameter is optional but strongly encouraged. If a string value
              is passed, it is assumed to be the name of a column to be used as
              labels. Alternatively, a list of strings may be passed separately.
              The label column can be included in the `cols_to_ignore` parameter;
              labels are extracted before the ignored columns are dropped.
          header: int (Optional) (Default: 0)
              Row index of the header in the incoming data.
          index_col: bool (Optional) (Default: False)
              Whether to include an index column.
      Returns
          None



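As an illustration of the input layout that ``load_data`` describes, the following sketch uses plain pandas rather than ``UserDataHandler`` itself. The column names, the in-memory CSV and the variable names are hypothetical. The order of operations (labels extracted first, ignored columns dropped second) follows the parameter notes above; the internals of ``load_data`` may differ.

```python
import io

import pandas as pd

# Hypothetical CSV: columns are variables, rows are observations.
csv_text = """sample_id,cell_type,gene_a,gene_b
s1,T,0.5,1.2
s2,B,0.7,0.9
s3,T,0.1,1.5
"""

# header=0 treats the first row as column names; index_col=False
# stops pandas from using the first column as the index.
frame = pd.read_csv(io.StringIO(csv_text), header=0, index_col=False)

labels_column = "cell_type"                   # a single column name used as labels
cols_to_ignore = ["sample_id", "cell_type"]   # non-numerical columns to exclude

# Extract the labels first, then drop the ignored columns,
# so the label column may safely appear in cols_to_ignore.
labels = frame[labels_column].tolist()
numeric = frame.drop(columns=cols_to_ignore)

print(labels)
print(numeric.columns.tolist())
```

After these two steps only numerical columns remain, which is the form the docstring requires for analysis.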

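The normalisation scheme used by ``normalise_data`` is not stated in this reference. As a minimal sketch, assuming column-wise z-score standardisation (an assumption, not confirmed behaviour), a DataFrame can be turned into a normalised 2-D array as follows. The helper ``zscore_normalise`` is a hypothetical stand-in and not part of ``vbvarsel``.

```python
import numpy as np
import pandas as pd


def zscore_normalise(data: pd.DataFrame) -> np.ndarray:
    """Hypothetical stand-in: centre each column and scale it to unit variance."""
    centred = data - data.mean(axis=0)
    # ddof=0 gives the population standard deviation, matching NumPy's default.
    scaled = centred / data.std(axis=0, ddof=0)
    return scaled.to_numpy()


# Hypothetical input: two variables (columns), four observations (rows).
raw = pd.DataFrame(
    {"gene_a": [1.0, 2.0, 3.0, 4.0], "gene_b": [10.0, 20.0, 30.0, 40.0]}
)
normalised = zscore_normalise(raw)

print(normalised.shape)
```

Under this assumption every column of the result has zero mean and unit standard deviation, regardless of the original scale of the variable.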

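Returning the shuffled indices alongside the shuffled data makes it possible to reorder labels consistently or to undo the shuffle later. The sketch below illustrates that idea with NumPy only, following the ``numpy.ndarray`` annotation in the signature. The names ``shuffle_rows`` and ``seed`` are hypothetical, and the code does not reproduce the implementation of ``shuffle_normalised_data``.

```python
import numpy as np


def shuffle_rows(normalised: np.ndarray, seed: int = 0) -> tuple[np.ndarray, np.ndarray]:
    """Hypothetical helper: permute the rows and return the permutation used."""
    rng = np.random.default_rng(seed)
    shuffled_indices = rng.permutation(normalised.shape[0])
    return normalised[shuffled_indices], shuffled_indices


# Hypothetical data: five observations, two variables, with one label per row.
data = np.arange(10, dtype=float).reshape(5, 2)
labels = np.array(["a", "b", "c", "d", "e"])

shuffled, indices = shuffle_rows(data)

# Apply the same permutation to the labels so rows and labels stay aligned.
shuffled_labels = labels[indices]

# Undo the shuffle by writing each row back to its original position.
restored = np.empty_like(shuffled)
restored[indices] = shuffled
```

Keeping the permutation explicit, rather than shuffling in place, is what allows the labels to follow the data.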