Miscellaneous utilities: uniqify, listunion, listintersection, perminverse
Relatively fast pure Python uniqification function that preserves ordering.
Parameters
seq : sequence
Sequence object to uniqify.
idfun : function, optional
Optional collapse function to identify items as the same.
Returns
result : list
Python list with first occurrence of each item in seq, in order.
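The behaviour described above can be sketched in a few lines. This is a minimal illustration, not tabular's own implementation: the name uniqify_sketch is hypothetical, and it assumes the values returned by idfun are hashable.

```python
def uniqify_sketch(seq, idfun=None):
    """Order-preserving de-duplication (illustrative sketch only)."""
    if idfun is None:
        # Assumed default: items are compared directly.
        idfun = lambda x: x
    seen = set()
    result = []
    for item in seq:
        marker = idfun(item)  # key that decides whether two items are "the same"
        if marker in seen:
            continue
        seen.add(marker)
        result.append(item)   # keep the first occurrence, in original order
    return result


print(uniqify_sketch([3, 1, 3, 2, 1]))                  # [3, 1, 2]
print(uniqify_sketch(["a", "A", "b"], idfun=str.lower))  # ['a', 'b']
```

Passing str.lower as idfun collapses "a" and "A" into one item, so only the first of the two is kept.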
Take the union of a list of lists.
Take a Python list of Python lists:
[[l11, l12, ...], [l21, l22, ...], ..., [ln1, ln2, ...]]
and return the aggregated list:
[l11, l12, ..., l21, l22, ...]
For a list of two lists, e.g. [a, b], this is like:
a.extend(b)
Parameters
ListOfLists : Python list
Python list of Python lists.
Returns
u : Python list
Python list created by taking the union of the lists in ListOfLists.
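Read literally, the description above is a concatenation of the inner lists, so duplicates would be retained. A minimal sketch under that reading follows; listunion_sketch is a hypothetical name, not the library function.

```python
from itertools import chain


def listunion_sketch(list_of_lists):
    """Concatenate a list of lists into one flat list (illustrative sketch)."""
    # chain.from_iterable walks each inner list in order without
    # building intermediate lists.
    return list(chain.from_iterable(list_of_lists))


print(listunion_sketch([[1, 2], [3], [2, 4]]))  # [1, 2, 3, 2, 4]
```

Unlike a.extend(b), this sketch does not modify any of the input lists.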
Fast inverse of a (numpy) permutation.
Parameters
s : sequence
Sequence of indices giving a permutation.
Returns
inv : numpy array
Sequence of indices giving the inverse of permutation s.
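One common way to invert a permutation with NumPy is scatter assignment, which runs in linear time. The sketch below assumes s contains each index from 0 to len(s) - 1 exactly once. Whether tabular uses this technique is an assumption, and perminverse_sketch is a hypothetical name.

```python
import numpy as np


def perminverse_sketch(s):
    """Return inv such that inv[s[i]] == i for every i (illustrative sketch)."""
    s = np.asarray(s)
    inv = np.empty(len(s), dtype=int)
    # Position s[i] of the inverse receives the value i.
    inv[s] = np.arange(len(s))
    return inv


s = np.array([2, 0, 1])
inv = perminverse_sketch(s)
print(inv)     # [1 2 0]
print(s[inv])  # [0 1 2]
```

np.argsort(s) gives the same result for a valid permutation, at the cost of a sort.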
Returns a null value for each of the various kinds of numpy formats.
Default null value function used in tabular.spreadsheet.join().
Parameters
format : string
Numpy format descriptor, e.g. '<i4', '|S5'.
Returns
null : element in [0, 0.0, '']
Null value corresponding to the given format:
- if format.startswith(('<i', '|b')), i.e. format corresponds to an integer or Boolean, return 0
- else if format.startswith('<f'), i.e. format corresponds to a float, return 0.0
- else, i.e. format corresponds to a string, return ''
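The three rules above translate directly into a chain of startswith checks. The following sketch mirrors the documented rules only; the function name null_for_format is hypothetical.

```python
def null_for_format(fmt):
    """Map a NumPy format descriptor to a null value (illustrative sketch)."""
    if fmt.startswith(('<i', '|b')):
        return 0      # integer or Boolean column
    elif fmt.startswith('<f'):
        return 0.0    # float column
    else:
        return ''     # anything else is treated as a string column


print(repr(null_for_format('<i4')))  # 0
print(repr(null_for_format('<f8')))  # 0.0
print(repr(null_for_format('|S5')))  # ''
```

Note that the rules as documented match only little-endian descriptors; a descriptor such as '>i4' would fall through to the string case in this sketch.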
Returns a null value for each of various kinds of test values.
Parameters
test : bool, int, float or string
Value to test.
Returns
null : element in [False, 0, 0.0, '']
Null value corresponding to the given test value:
- if test is a bool, return False
- else if test is an int, return 0
- else if test is a float, return 0.0
- else (test is a str), return ''
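The order of the checks above matters in Python, because bool is a subclass of int: testing for int first would return 0 for True or False. A minimal sketch of the documented rules, using the hypothetical name null_for_value:

```python
def null_for_value(test):
    """Map a test value to a null value of the same kind (illustrative sketch)."""
    if isinstance(test, bool):
        return False  # must come before the int check: bool subclasses int
    elif isinstance(test, int):
        return 0
    elif isinstance(test, float):
        return 0.0
    else:
        return ''     # assumed fallback for strings and anything else


print(repr(null_for_value(True)))   # False
print(repr(null_for_value(7)))      # 0
print(repr(null_for_value(2.5)))    # 0.0
print(repr(null_for_value('abc')))  # ''
```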
Infer the data type (int, float, str) of a list of strings.
Takes a list of strings and attempts to infer a numeric data type that fits them all.
If the strings are all integers, returns a NumPy array of integers.
If the strings are all floats, returns a NumPy array of floats.
Otherwise, returns a NumPy array of the original list of strings.
Used to determine the data type of a column read from a separated-variable (SV) text file (e.g. .tsv, .csv), where each column is expected to be of uniform Python type.
This function is used by tabular load functions for SV files, e.g. by tabular.io.loadSV() when type information is not provided in the header, and by tabular.io.loadSVsafe().
Parameters
column : list of strings
List of strings corresponding to a column of data.
Returns
out : numpy array
Numpy array of data from column, with data type int, float or str.
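The int, then float, then string fallback described above can be sketched by trying each conversion in turn and moving on when a ValueError is raised. This is an illustration of the idea rather than tabular's actual code, and infer_column_sketch is a hypothetical name.

```python
import numpy as np


def infer_column_sketch(column):
    """Convert a list of strings to the narrowest fitting type (illustrative sketch)."""
    for caster in (int, float):
        try:
            # The conversion must succeed for every entry in the column.
            return np.array([caster(x) for x in column])
        except ValueError:
            continue  # at least one entry did not parse; try the next type
    return np.array(column)  # fall back to the original strings


print(infer_column_sketch(['1', '2', '3']).dtype.kind)  # i
print(infer_column_sketch(['1.5', '2']).dtype.kind)     # f
print(infer_column_sketch(['a', '1']).dtype.kind)       # U
```

A single non-numeric entry is enough to make the whole column fall back to strings, which matches the expectation that columns are of uniform type.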