dataset¶
- class corankco.dataset.Dataset(rankings: List[Ranking], name: str = '')¶
Class representing a dataset containing rankings.
- Parameters:
rankings (list of Ranking) – Rankings in the dataset.
name (str, optional) – Name of the dataset.
- contains_element(element: str | int) bool ¶
- Parameters:
element – the element to find
- Returns:
True if and only if the target element is ranked in at least one input ranking of the dataset.
- description() str ¶
- Returns:
A complete description of the Dataset object containing all the available information
- classmethod from_file(path: str) Dataset ¶
Create a Dataset from a file containing rankings.
- Parameters:
path (str) – The path to the file.
- Returns:
A new Dataset object.
- Return type:
Dataset
- classmethod from_raw_list(rankings: List[List[Set[int | str]]], name: str = '') Dataset ¶
Create a Dataset from a raw list of rankings.
- Parameters:
rankings (List[List[Set[Union[int, str]]]]) – A list of rankings.
name (str, optional) – The name of the dataset.
- Returns:
A new Dataset object.
- Return type:
Dataset
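The nested type of the rankings argument can be illustrated without the library. The sketch below does not call corankco; it only assumes, from the signature above, that a raw ranking is a list of buckets (sets), with the elements of one bucket tied. The helper universe_of is a hypothetical name introduced for illustration.

```python
from typing import List, Set, Union

Element = Union[int, str]
RawRanking = List[Set[Element]]

# Three raw rankings; {2, 3} is a bucket, assumed here to hold tied elements.
raw_rankings: List[RawRanking] = [
    [{1}, {2, 3}],
    [{3}, {1}, {4}],
    [{2}, {4}],
]


def universe_of(rankings: List[RawRanking]) -> Set[Element]:
    """Hypothetical helper: union of all buckets of all rankings."""
    universe: Set[Element] = set()
    for ranking in rankings:
        for bucket in ranking:
            universe |= bucket
    return universe


universe = universe_of(raw_rankings)
```

A list shaped like raw_rankings is what the rankings parameter of from_raw_list is typed to accept.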
- static get_dataset_from_file(path: str) Dataset ¶
Read a file of rankings and return a Dataset object.
- Parameters:
path (str) – The path to the ranking file to read.
- Returns:
A Dataset object containing the read rankings.
- Return type:
Dataset
- static get_datasets_from_folder(path_folder: str) List[Dataset] ¶
Get a list of Dataset objects, one per file in the given folder.
- Parameters:
path_folder (str) – The path of the folder containing the datasets.
- Returns:
A list containing one Dataset per dataset file in the input folder.
- get_positions() ndarray ¶
- Returns:
A (nb_elements, nb_rankings) numpy matrix m where m[i][j] denotes the position of element i in ranking j, with m[i][j] = -1 if element i is not ranked in ranking j.
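The layout of this matrix can be sketched in plain Python. The code below is not the library's implementation: it uses nested lists instead of a numpy array, and it assumes that the position of an element is the 0-based index of its bucket, which the description above does not state. The names positions_matrix and elem_to_id are hypothetical.

```python
from typing import Dict, List, Set


def positions_matrix(rankings: List[List[Set[str]]],
                     elem_to_id: Dict[str, int]) -> List[List[int]]:
    """Row i, column j holds the position of element i in ranking j, or -1."""
    matrix = [[-1] * len(rankings) for _ in range(len(elem_to_id))]
    for j, ranking in enumerate(rankings):
        # Assumption: position = 0-based index of the bucket in the ranking.
        for bucket_index, bucket in enumerate(ranking):
            for elem in bucket:
                matrix[elem_to_id[elem]][j] = bucket_index
    return matrix


rankings = [[{"a"}, {"b", "c"}], [{"c"}, {"a"}]]
elem_to_id = {"a": 0, "b": 1, "c": 2}
m = positions_matrix(rankings, elem_to_id)
# "b" is absent from the second ranking, so m[1][1] stays at -1.
```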
- static get_random_dataset_markov(nb_elem: int, nb_rankings: int, steps: int, complete: bool = False) Dataset ¶
Get a Dataset generated using a Markov chain. Note that if complete is set to False, the returned dataset may contain fewer elements than requested if one or more elements have been removed from all the rankings during the Markov walk.
- Parameters:
nb_elem (int) – The number of elements in the wanted dataset.
nb_rankings (int) – The number of rankings in the wanted dataset.
steps (int) – The number of steps in the Markov chain for each ranking to generate.
complete (bool, optional) – True if and only if the wanted dataset must be complete, that is, if all the elements must be ranked in all the rankings.
- Returns:
A Dataset generated using a Markov chain; see details in corankco.rankingsgeneration.rankingsgenerate.
- static get_uniform_permutation_dataset(nb_elem: int, nb_rankings: int)¶
Get a Dataset of nb_elem elements and nb_rankings complete rankings without ties, where each ranking is uniformly generated.
- Parameters:
nb_elem (int) – The number of wanted elements for the dataset.
nb_rankings (int) – The number of rankings for the dataset.
- Returns:
A new Dataset instance whose rankings are uniformly generated complete rankings without ties.
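As an illustration of what "uniformly generated complete rankings without ties" means, the sketch below builds such rankings as raw lists with the standard library only. It is not the library's code: labelling the elements 0 to nb_elem - 1 and the seed parameter are assumptions made for this example.

```python
import random
from typing import List, Optional, Set


def uniform_permutation_rankings(nb_elem: int, nb_rankings: int,
                                 seed: Optional[int] = None) -> List[List[Set[int]]]:
    """Each ranking is a uniformly shuffled permutation, one element per bucket."""
    rng = random.Random(seed)
    rankings = []
    for _ in range(nb_rankings):
        elements = list(range(nb_elem))  # assumed element labels: 0..nb_elem-1
        rng.shuffle(elements)            # uniform shuffle of the permutation
        rankings.append([{elem} for elem in elements])  # singleton buckets: no ties
    return rankings


sample = uniform_permutation_rankings(nb_elem=5, nb_rankings=3, seed=0)
```

Every ranking in sample contains all five elements (complete) and only singleton buckets (without ties).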
- property is_complete: bool¶
Method to check if the dataset is complete, that is, if all the rankings of the dataset have the same domain.
- Returns:
Returns True if the object is complete, False otherwise.
- Return type:
bool
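The completeness condition can be checked on raw rankings as follows. This is a minimal sketch on lists of sets, not the property's actual implementation; domain and is_complete are hypothetical helper names, and treating an empty list of rankings as complete is an assumption.

```python
from typing import List, Set


def domain(ranking: List[Set[int]]) -> Set[int]:
    """Set of elements ranked in one ranking."""
    return set().union(*ranking)


def is_complete(rankings: List[List[Set[int]]]) -> bool:
    """True when every ranking has the same domain as the first one."""
    domains = [domain(r) for r in rankings]
    return all(d == domains[0] for d in domains)


complete = is_complete([[{1}, {2, 3}], [{3, 2}, {1}]])
incomplete = is_complete([[{1}, {2}], [{1}]])
```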
- property mapping_elem_id: Dict[Element, int]¶
Method to get the mapping element -> unique int ID for each element of the universe of the dataset
- Returns:
Returns a dictionary that associates for each element of the universe a unique int ID.
- Return type:
Dict[Element, int]
- property mapping_id_elem: Dict[int, Element]¶
Method to get the mapping unique int ID -> element for each element of the universe of the dataset
- Returns:
Returns a dictionary that associates each unique int ID with the corresponding element of the universe.
- Return type:
Dict[int, Element]
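Given their signatures, mapping_elem_id and mapping_id_elem are inverses of each other. The sketch below shows one way such a pair of dictionaries could be built from raw rankings. The order in which IDs are assigned (first appearance, sorted inside a bucket) is an assumption made to keep the example deterministic, not documented behaviour, and build_mappings is a hypothetical name.

```python
from typing import Dict, List, Set, Tuple


def build_mappings(rankings: List[List[Set[str]]]) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Assign consecutive int IDs to elements and build the inverse mapping."""
    elem_to_id: Dict[str, int] = {}
    for ranking in rankings:
        for bucket in ranking:
            for elem in sorted(bucket):  # sorted only for deterministic IDs
                # Keeps the existing ID if the element was already seen.
                elem_to_id.setdefault(elem, len(elem_to_id))
    id_to_elem = {ident: elem for elem, ident in elem_to_id.items()}
    return elem_to_id, id_to_elem


elem_to_id, id_to_elem = build_mappings([[{"b"}, {"a", "c"}], [{"d"}, {"a"}]])
```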
- property name: str¶
Method to get the name of the dataset.
- Returns:
Returns the name of the dataset.
- Return type:
str
- property nb_elements: int¶
Get the total number of elements that appear in at least one ranking of the Dataset.
- Returns:
The total number of elements in the Dataset.
- Return type:
int
- property nb_rankings: int¶
Method to get the number of rankings of the dataset.
- Returns:
Returns the number of rankings.
- Return type:
int
- penalties_relative_positions(scoring_scheme: ScoringScheme) Set[PairwiseElementComparison] ¶
Get the set of all pairs of elements with the costs of their different relative positions, under a Kemeny prism, for the given ScoringScheme. Complexity: O(nb_rankings * nb_elements²).
- Parameters:
scoring_scheme (ScoringScheme) – The ScoringScheme to use for the computation of the different relative costs.
- Returns:
A Set of PairwiseElementComparison objects.
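The stated complexity comes from visiting every pair of elements in every ranking. The sketch below only counts, for each pair, how many rankings place the first element before, after, or tied with the second, or lack one of them. It does not apply a ScoringScheme and does not build PairwiseElementComparison objects, so it illustrates the loop structure rather than the method itself; pairwise_counts is a hypothetical name.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def pairwise_counts(rankings: List[List[Set[str]]]) -> Dict[Tuple[str, str], Tuple[int, int, int, int]]:
    """For each pair (x, y): (x before y, x after y, tied, not both ranked)."""
    universe = sorted(set().union(*(b for r in rankings for b in r)))
    # Bucket index of every element, per ranking.
    positions = [{e: i for i, bucket in enumerate(r) for e in bucket} for r in rankings]
    counts = {}
    for x, y in combinations(universe, 2):   # O(nb_elements^2) pairs
        before = after = tied = missing = 0
        for pos in positions:                # O(nb_rankings) per pair
            if x not in pos or y not in pos:
                missing += 1
            elif pos[x] < pos[y]:
                before += 1
            elif pos[x] > pos[y]:
                after += 1
            else:
                tied += 1
        counts[(x, y)] = (before, after, tied, missing)
    return counts


counts = pairwise_counts([[{"a"}, {"b", "c"}], [{"c"}, {"a"}]])
```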
- property rankings: List[Ranking]¶
Get the rankings from the Dataset.
- Returns:
The list of rankings in this Dataset object.
- Return type:
List[Ranking]
- remove_elements(elements_to_remove: Set)¶
Remove elements from all rankings in the dataset.
- Parameters:
elements_to_remove (Set[Element]) – Set of elements to remove.
- Returns:
None
- remove_elements_rate_presence_lower_than(rate_presence: float)¶
Remove elements whose rate of presence in the rankings is lower than the provided threshold.
- Parameters:
rate_presence (float) – Threshold below which elements are removed.
- Returns:
None
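The filtering rule can be sketched on raw rankings. Unlike the method, which returns None, this hypothetical helper returns new lists. It assumes that the rate of presence of an element is the fraction of rankings containing it, that elements strictly below the threshold are removed, and that buckets left empty are dropped; none of these details is confirmed by the description above.

```python
from collections import Counter
from typing import List, Set


def remove_rare_elements(rankings: List[List[Set[int]]],
                         rate_presence: float) -> List[List[Set[int]]]:
    """Drop elements present in a fraction of rankings lower than rate_presence."""
    if not rankings:
        return []
    # An element appears at most once per ranking, so this counts rankings.
    presence = Counter(e for r in rankings for bucket in r for e in bucket)
    keep = {e for e, count in presence.items() if count / len(rankings) >= rate_presence}
    filtered = []
    for ranking in rankings:
        buckets = [bucket & keep for bucket in ranking]
        filtered.append([b for b in buckets if b])  # assumption: drop empty buckets
    return filtered


filtered = remove_rare_elements([[{1}, {2, 3}], [{1}, {3}], [{3}, {1, 4}]], 0.5)
```

Elements 2 and 4 each appear in one ranking out of three, so they fall below the 0.5 threshold and are removed.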
- remove_empty_rankings()¶
Remove empty rankings from the dataset.
- Returns:
None
- sub_problem_from_elements(elements_to_keep: Set[Element]) Dataset ¶
Generates a sub-problem Dataset by projecting the original Dataset on a given set of elements.
The resulting Dataset only includes the rankings that contain at least one of the elements from the ‘elements_to_keep’ set. Similarly, within each ranking, only buckets that contain at least one of the elements from the ‘elements_to_keep’ set are kept.
- sub_problem_from_ids(id_elements_to_keep: Set[int]) Dataset ¶
Generates a sub-problem Dataset by projecting the original Dataset on a given set of int IDs of elements.
The resulting Dataset only includes the rankings that contain at least one of the elements whose ID is in the ‘id_elements_to_keep’ set. Similarly, within each ranking, only buckets that contain at least one of these elements are kept.
- Parameters:
id_elements_to_keep (Set[int]) – A set of int IDs of the elements on which the sub-problem should be based.
- Returns:
A Dataset representing the sub-problem, which only includes the elements whose IDs are in the ‘id_elements_to_keep’ set.
- Return type:
Dataset
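The projection described for both sub-problem methods can be sketched on raw rankings. This is an illustration under the stated semantics (elements outside the kept set are removed, then empty buckets and empty rankings are dropped), not the library's code, and it works on elements directly rather than on int IDs; project is a hypothetical name.

```python
from typing import List, Set


def project(rankings: List[List[Set[int]]],
            elements_to_keep: Set[int]) -> List[List[Set[int]]]:
    """Project each ranking on elements_to_keep."""
    projected = []
    for ranking in rankings:
        buckets = [bucket & elements_to_keep for bucket in ranking]
        buckets = [b for b in buckets if b]  # keep only non-empty buckets
        if buckets:                          # keep only non-empty rankings
            projected.append(buckets)
    return projected


rankings = [[{1}, {2, 3}], [{3}, {1}, {4}], [{2}, {4}]]
sub_14 = project(rankings, {1, 4})
sub_1 = project(rankings, {1})
```

With {1} as the kept set, the third ranking contains no kept element and disappears from the result.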
- unified_dataset()¶
Get a new Dataset object, representing the unified version of the instance dataset. In the returned dataset, for each input ranking r, all the elements of the universe non-ranked in r are added in a unifying bucket at the end of r.
- Returns:
A new Dataset object representing the unified version of the current instance.
- unified_rankings() List[Ranking] ¶
Get a unified version of the dataset as a List of Ranking objects, that is a list of the input rankings such that for each ranking r, all the elements of the universe non-ranked in r are added in a unifying bucket at the end of r.
- Returns:
The unified rankings of the Dataset within a new Ranking List.
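Unification as described above can be sketched on raw rankings (lists of sets) instead of Ranking objects. The helper name unify is hypothetical, and leaving a ranking unchanged when it already ranks the whole universe is an assumption.

```python
from typing import List, Set


def unify(rankings: List[List[Set[int]]]) -> List[List[Set[int]]]:
    """Append to each ranking one bucket holding the elements it does not rank."""
    universe = set().union(*(b for r in rankings for b in r))
    unified = []
    for ranking in rankings:
        missing = universe - set().union(*ranking)
        new_ranking = [set(bucket) for bucket in ranking]  # copy, inputs untouched
        if missing:  # assumption: no empty unifying bucket is added
            new_ranking.append(missing)
        unified.append(new_ranking)
    return unified


unified = unify([[{1}, {2, 3}], [{3}, {1}, {4}], [{2}, {4}]])
```

After unification every ranking has the same domain, so the result is complete.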
- property universe: Set¶
Method to get the set of elements that appear in at least one input ranking of the dataset.
- Returns:
Returns a set of elements.
- Return type:
Set
- property without_ties: bool¶
Method to check if the dataset is a list of rankings without ties.
- Returns:
Returns True if and only if all the rankings of the dataset are without ties.
- Return type:
bool
- write(path) None ¶
Stores the input rankings of the dataset in a file.
- Parameters:
path – The path to store the dataset.
- Returns:
None
- class corankco.dataset.DatasetSelector(nb_elem_min: int = 0, nb_elem_max: int | float = inf, nb_rankings_min: int = 0, nb_rankings_max: int | float = inf)¶
Class usable to filter datasets according to their number of elements and / or rankings
- property nb_elem_max: int | float¶
- Returns:
The value of the attribute, that is the maximal number of elements to retain a dataset
- property nb_elem_min: int¶
- Returns:
The value of the attribute, that is the minimal number of elements to retain a dataset
- property nb_rankings_max: int | float¶
- Returns:
The value of the attribute, that is the maximal number of rankings to retain a dataset
- property nb_rankings_min: int¶
- Returns:
The value of the attribute, that is the minimal number of rankings to retain a dataset
- select_datasets(list_datasets: List[Dataset]) List[Dataset] ¶
Given a list of Dataset objects, returns the List of Dataset references that match the filter.
- Parameters:
list_datasets (List[Dataset]) – The list of datasets to filter.
- Returns:
The list l of Dataset references such that d is in l if and only if:
* self.nb_elem_min <= d.nb_elements <= self.nb_elem_max
* self.nb_rankings_min <= d.nb_rankings <= self.nb_rankings_max
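The two inequalities of the filter can be sketched with a stand-in type. DatasetSize is a hypothetical substitute for Dataset that only carries the two attributes the filter reads, and select is a hypothetical function mirroring the constructor defaults shown above; neither is part of corankco.

```python
from math import inf
from typing import List, NamedTuple, Union


class DatasetSize(NamedTuple):
    """Hypothetical stand-in exposing nb_elements and nb_rankings only."""
    name: str
    nb_elements: int
    nb_rankings: int


def select(datasets: List[DatasetSize],
           nb_elem_min: int = 0, nb_elem_max: Union[int, float] = inf,
           nb_rankings_min: int = 0, nb_rankings_max: Union[int, float] = inf) -> List[DatasetSize]:
    """Keep the datasets whose sizes lie within both inclusive ranges."""
    return [d for d in datasets
            if nb_elem_min <= d.nb_elements <= nb_elem_max
            and nb_rankings_min <= d.nb_rankings <= nb_rankings_max]


candidates = [DatasetSize("small", 5, 3), DatasetSize("wide", 200, 4), DatasetSize("deep", 30, 50)]
kept = select(candidates, nb_elem_min=10, nb_rankings_max=10)
everything = select(candidates)
```

With the default bounds (0 and inf) no dataset is filtered out, which is why everything equals candidates.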
- exception corankco.dataset.EmptyDatasetException¶
Custom exception for empty dataset