UniqueValuesMapping
- class ase2sprkkr.common.unique_values.UniqueValuesMapping(mapping, value_to_class_id=None)[source]
A class that can map a collection of (possibly non-unique) values to a set of unique identifiers. It effectively builds equivalence classes over the indexes of the input array.
Instances of the class can be merged to distinguish values that are the same according to one criterion but distinct according to another.
>>> UniqueValuesMapping.from_values([1,4,1]).mapping
array([1, 2, 1], dtype=int32)
>>> UniqueValuesMapping.from_values([int, int, str]).mapping
array([1, 1, 2], dtype=int32)
>>> UniqueValuesMapping.from_values([1,4,1]).value_to_class_id
{1: 1, 4: 2}
>>> UniqueValuesMapping.from_values([1,4,1,1]).merge([1,1,2,1]).mapping
array([1, 2, 3, 1], dtype=int32)
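The merge example above can be read as follows: two indexes stay in the same class only if they share a class under both criteria. The sketch below illustrates that idea in plain Python. The function name `merge_mappings` is hypothetical, and this is not the library's implementation (which works with numpy arrays); it only reproduces the documented result.

```python
def merge_mappings(first, second):
    """Combine two sequences of class ids (illustrative helper, not library code).

    Two indexes end up in the same merged class only if they are in the
    same class in BOTH input sequences.
    """
    if len(first) != len(second):
        raise ValueError("both mappings must have the same length")
    pair_to_id = {}
    merged = []
    for pair in zip(first, second):
        # Each previously unseen (first_id, second_id) pair gets the next integer id.
        class_id = pair_to_id.setdefault(pair, len(pair_to_id) + 1)
        merged.append(class_id)
    return merged


# Class ids [1, 2, 1, 1] correspond to the values [1, 4, 1, 1] in the example above.
print(merge_mappings([1, 2, 1, 1], [1, 1, 2, 1]))  # [1, 2, 3, 1]
```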
Constructor
- Parameters:
mapping (List) –
value_to_class_id (Dict) –
- __init__(mapping, value_to_class_id=None)[source]
- Parameters:
mapping (Union[np.ndarray, list]) – Array of equivalence class members: members[id] = <eq class id>
value_to_class_id (dict) – Mapping { value: <eq class id> }
- mapping
Map from <object index> to <object equivalence class id>.
- value_to_class_id
Map from <object> to <object equivalence class id>. If two mappings are merged, this attribute is not available.
- indexes(start_from=0)[source]
Returns a dictionary that maps each equivalence class id to the list of indexes of its members.
- Parameters:
start_from (int) – The indexes are zero-based by default; they can instead start from the given number (typically 1).
>>> UniqueValuesMapping([1,4,1]).indexes()
{1: [0, 2], 4: [1]}
>>> UniqueValuesMapping([1,4,1]).indexes(start_from = 1)
{1: [1, 3], 4: [2]}
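The grouping performed by indexes() can be sketched without the library. The helper `indexes_by_class` below is a hypothetical name used only for illustration; it takes a plain sequence of class ids and reproduces the outputs shown above.

```python
def indexes_by_class(mapping, start_from=0):
    """Group positions by class id (illustrative helper, not library code)."""
    groups = {}
    # enumerate(..., start=start_from) shifts the reported indexes, e.g. to 1-based.
    for index, class_id in enumerate(mapping, start=start_from):
        groups.setdefault(class_id, []).append(index)
    return groups


print(indexes_by_class([1, 4, 1]))                # {1: [0, 2], 4: [1]}
print(indexes_by_class([1, 4, 1], start_from=1))  # {1: [1, 3], 4: [2]}
```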
- unique_indexes()[source]
Returns a list containing the index of one member of each equivalence class (in the example below, the first occurrence).
>>> UniqueValuesMapping([1,1,4]).unique_indexes()
[0, 2]
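The example above returns [0, 2], which is consistent with picking the first index at which each class appears. The sketch below assumes exactly that rule; the helper name `first_indexes` is hypothetical, and whether the library always picks the first occurrence is an assumption based only on this one example.

```python
def first_indexes(mapping):
    """Return the index of the first member of each class (assumed rule)."""
    seen = set()
    result = []
    for index, class_id in enumerate(mapping):
        if class_id not in seen:
            # First time this class id appears: record its position.
            seen.add(class_id)
            result.append(index)
    return result


print(first_indexes([1, 1, 4]))  # [0, 2]
```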
- static from_values(values, length=None)[source]
Create an equivalence-class mapping. Unlike the constructor, this method tags the values with integers and also computes the reverse (value to equivalence class) mapping.
- values: iterable
Values whose equivalence classes are to be found
- length: int
Length of values; provide it if len(values) is not available
- Parameters:
length (int | None) –
- static _create_mapping(values, length=None, start_from=1, dtype=<class 'numpy.int32'>)[source]
- Returns:
mapping (np.ndarray) – maps the value indexes to equivalence class id
reverse (dict) – maps the values to equivalence class ids
>>> UniqueValuesMapping._create_mapping([1.,4.,1.])
(array([1, 2, 1], dtype=int32), {1.0: 1, 4.0: 2})
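The tagging done by from_values and _create_mapping can be illustrated with a short pure-Python sketch. The name `create_mapping` is hypothetical, the sketch returns a list rather than a numpy int32 array, and it assumes the values are hashable and that ids are assigned in order of first appearance, which matches the examples on this page.

```python
def create_mapping(values, start_from=1):
    """Tag each value with an integer class id (illustrative, not library code).

    Returns (mapping, value_to_class_id): mapping[i] is the class id of
    values[i], and value_to_class_id maps each distinct value to its id.
    """
    value_to_class_id = {}
    mapping = []
    for value in values:
        if value not in value_to_class_id:
            # Assumption: ids are handed out in order of first appearance.
            value_to_class_id[value] = start_from + len(value_to_class_id)
        mapping.append(value_to_class_id[value])
    return mapping, value_to_class_id


print(create_mapping([1.0, 4.0, 1.0]))  # ([1, 2, 1], {1.0: 1, 4.0: 2})
```

Because the helper only needs an iterable, it also works when len(values) is unavailable, for example with a generator.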
- is_equivalent_to(mapping)[source]
Returns whether this mapping is equal to another given mapping, regardless of the actual “names” of the equivalence classes.
- Parameters:
mapping (UniqueValuesMapping | Iterable) – The other mapping can be given either as an instance of this class or as any iterable that yields the equivalence class names of the items
>>> UniqueValuesMapping([1,4,1]).is_equivalent_to([0,1,0])
True
>>> UniqueValuesMapping([1,4,1]).is_equivalent_to([0,0,0])
False
>>> UniqueValuesMapping([1,4,1]).is_equivalent_to([0,1,1])
False
>>> UniqueValuesMapping([1,4,1]).is_equivalent_to([5,3,5])
True
>>> UniqueValuesMapping([1,4,1]).is_equivalent_to(UniqueValuesMapping.from_values([2,5,2]))
True
- Return type:
bool
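Two labelings describe the same equivalence classes exactly when there is a one-to-one correspondence between their class names. The sketch below checks this with two dictionaries; `same_partition` is a hypothetical helper operating on plain sequences, not the library's method, and it may differ from how the library performs the comparison.

```python
def same_partition(a, b):
    """Check whether two labelings induce the same classes (illustrative)."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    a_to_b, b_to_a = {}, {}
    for x, y in zip(a, b):
        # Each label in a must always pair with the same label in b, and vice versa.
        if a_to_b.setdefault(x, y) != y or b_to_a.setdefault(y, x) != x:
            return False
    return True


print(same_partition([1, 4, 1], [0, 1, 0]))  # True
print(same_partition([1, 4, 1], [0, 0, 0]))  # False
print(same_partition([1, 4, 1], [0, 1, 1]))  # False
print(same_partition([1, 4, 1], [5, 3, 5]))  # True
```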
- static are_equivalent(a, b)[source]
Returns whether the two mappings are equal, regardless of the actual “names” of the equivalence classes.
See is_equivalent_to
- Parameters:
a (UniqueValuesMapping | Iterable) –
b (UniqueValuesMapping | Iterable) –
- Return type:
bool
- normalized(start_from=1, strict=True, dtype=None)[source]
Map the class ids to integers
- Parameters:
strict (bool) – If True, the resulting integer names will be from range (start_from)..(n+start_from-1), where n is the number of equivalence classes. If False and the names are already integers in a numpy array, do nothing.
start_from – The number from which the numbering of the equivalence classes starts.
- Returns:
mapping (np.ndarray) – Array of integers, starting from start_from, that denote the equivalence classes of the values. It holds that
mapping[index] == equivalence_class
reverse (dict) – Dict
{ equivalence_class : value }
>>> UniqueValuesMapping.from_values([(0,2),(0,3),(0,2)]).normalized()
(array([1, 2, 1], dtype=int32), {1: 1, 2: 2})
>>> UniqueValuesMapping.from_values([(0,2),(0,3),(0,2)]).normalized(start_from=0)
(array([0, 1, 0], dtype=int32), {1: 0, 2: 1})
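The role of the strict flag can be sketched as follows. The helper `normalized_ids` is hypothetical and works on plain lists; it assumes new ids are assigned in order of first appearance, and it treats "already integers" as a simple isinstance check rather than the numpy dtype check the description implies. It is a sketch of the documented behavior, not the library's code.

```python
def normalized_ids(mapping, start_from=1, strict=True):
    """Renumber class ids as consecutive integers (illustrative helper).

    Returns (new_mapping, old_to_new), where old_to_new maps each original
    class id to its new integer id.
    """
    mapping = list(mapping)
    if not strict and all(isinstance(c, int) for c in mapping):
        # Non-strict mode: integer ids are accepted as they are.
        return mapping, {c: c for c in mapping}
    old_to_new = {}
    for class_id in mapping:
        if class_id not in old_to_new:
            # Assumption: consecutive ids in order of first appearance.
            old_to_new[class_id] = start_from + len(old_to_new)
    return [old_to_new[c] for c in mapping], old_to_new


print(normalized_ids([1, 2, 1], start_from=0))  # ([0, 1, 0], {1: 0, 2: 1})
print(normalized_ids([5, 3, 5]))                # ([1, 2, 1], {5: 1, 3: 2})
print(normalized_ids([5, 3, 5], strict=False))  # ([5, 3, 5], {5: 5, 3: 3})
```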
- normalize(start_from=1, strict=False, dtype=None)[source]
Replace the names of the equivalence classes with integers.
- Parameters:
strict (bool) – If True, the resulting integer names will be from range (start_from)..(n+start_from-1), where n is the number of equivalence classes. If False and the names are already integers in a numpy array, do nothing.
start_from – The number from which the numbering of the equivalence classes starts.
dtype – dtype of the normalized values. None means numpy.int32; however, if not strict, any integer type is sufficient.
- Returns:
unique_values_mapping – Return self.
>>> UniqueValuesMapping.from_values([(0,2),(0,3),(0,2)]).normalize().mapping
array([1, 2, 1], dtype=int32)
>>> UniqueValuesMapping.from_values([(0,2),(0,3),(0,2)]).normalize().value_to_class_id[(0,3)]
2
>>> UniqueValuesMapping.from_values([(0,2),(0,3),(0,2)]).normalize(start_from=0).mapping
array([0, 1, 0], dtype=int32)