glasspy.data package
Submodules
glasspy.data.load module
This module loads the data available in GlassPy.
Right now, the main source of GlassPy data is the SciGlass database, which is available at https://github.com/epam/SciGlass and licensed under the ODC Open Database License (ODbL). For a plain text version of this database, see the fork at https://github.com/drcassar/SciGlass. The data that ships with GlassPy is the same as the data in the plain text fork.
Typical usage example:
source = SciGlass()
df = source.data
- class glasspy.data.load.SciGlass(elements_cfg: dict = {}, properties_cfg: dict = {}, compounds_cfg: dict = {}, autocleanup: bool = True, metadata: bool = True)
Bases: object
Loader of SciGlass data.
- Parameters:
elements_cfg – Dictionary configuring how the element information is collected. See the docstring of the get_elements method for more details.
properties_cfg – Dictionary configuring how the property information is collected. See the docstring of the get_properties method for more details.
compounds_cfg – Dictionary configuring how the compound information is collected. See the docstring of the get_compounds method for more details.
autocleanup – If True, automatically delete columns of the final DataFrame that do not have any information (only zeros). Default value: True.
metadata – If True, add the metadata information to the DataFrame. Default value: True.
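A configuration might look like the following sketch. It assumes that each configuration dictionary is forwarded as keyword arguments to the matching get_* method, which the parameter descriptions suggest but do not state outright. The element list and the property name "Tg" are illustrative assumptions; check SciGlass.available_properties() for the names that actually exist.

```python
# Keys mirror the keyword arguments documented for get_elements and
# get_properties. The values are illustrative assumptions.
elements_cfg = {
    "keep": ["Si", "O", "Na"],        # elements to keep in the final DataFrame
    "acceptable_sum_deviation": 1,    # accept sums between 99 and 101
    "final_sum": 1,                   # normalize to atomic fractions
}
properties_cfg = {
    "keep": ["Tg"],                   # assumed property name
}


def load_sciglass():
    """Build the loader and return its DataFrame (requires GlassPy)."""
    from glasspy.data import SciGlass  # imported lazily on purpose

    source = SciGlass(elements_cfg=elements_cfg, properties_cfg=properties_cfg)
    return source.data
```

The import sits inside the function so the configuration can be inspected without GlassPy installed.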
- data
DataFrame of the collected data.
- static available_properties()
Returns a list of available properties.
- static available_properties_metadata()
Returns a list of available properties metadata.
- elements_from_compounds(final_sum=1, compounds_in_weight=False)
Create atomic fraction information from compound information.
- Parameters:
final_sum – positive int or float The final sum of all atomic fractions is normalized to this value. Usual values are 1 if you want atomic fractions or 100 if you want atomic percentages. Default value is 1.
compounds_in_weight – bool If True, assume that the compound fractions are in weight%; otherwise assume that they are in mol%. Default value is False.
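The conversion from compound amounts to atomic fractions can be sketched in plain Python. This is not GlassPy's implementation: the helper name atomic_fractions is hypothetical, the formula parser only handles simple formulas without parentheses, and the input is assumed to be in mol% (the compounds_in_weight=False case).

```python
import re


def atomic_fractions(compounds, final_sum=1):
    """Convert compound mole amounts to atomic fractions summing to final_sum."""
    atoms = {}
    for formula, amount in compounds.items():
        # Parse simple formulas such as "SiO2" or "Na2O" (no parentheses).
        for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
            n = int(count) if count else 1
            atoms[symbol] = atoms.get(symbol, 0.0) + amount * n
    total = sum(atoms.values())
    return {symbol: final_sum * value / total for symbol, value in atoms.items()}


# 75 mol% SiO2 and 25 mol% Na2O give 75 Si, 175 O and 50 Na per 300 atoms.
fractions = atomic_fractions({"SiO2": 75, "Na2O": 25})
```

For weight% input, each compound amount would first have to be divided by its molar mass, a step this sketch leaves out.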
- get_compounds(**kwargs)
Get compound information.
- Parameters:
path – str String with the path to the database csv file.
keep – list List of compounds to keep in the final DataFrame.
drop – list List of compounds to remove from the final DataFrame.
acceptable_sum_deviation – positive int or float The sum of all compound fractions should be 100%. However, due to floating-point or rounding errors, this sum will not always be exactly 100%. This argument controls the acceptable deviation of this sum in %. A value of 1 means that the sum of all compound fractions can be between 99 and 101. All examples that are not within this range are discarded.
final_sum – positive int or float The final sum of all compound fractions is normalized to this value. Usual values are 1 if you want compound fractions or 100 if you want compound percentages.
return_weight – bool If True, the chemical information stored in the DataFrame will be in weight%. Otherwise it will be in mol%.
IDs – pd.Index IDs of the dataset to consider. Each glass in the SciGlass database has a glass number and a paper number. The ID used in GlassPy is an integer that merges both numbers.
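The interplay of acceptable_sum_deviation and final_sum can be illustrated with a small standalone sketch. It is not GlassPy's code: the function name is hypothetical, rows are plain dictionaries instead of DataFrame rows, the compositions are made up, and the bounds are assumed to be inclusive.

```python
def filter_and_normalize(rows, acceptable_sum_deviation=1, final_sum=1):
    """Drop rows whose sum strays too far from 100, then rescale the rest."""
    kept = []
    for row in rows:
        total = sum(row.values())
        # Discard examples outside 100 +/- acceptable_sum_deviation.
        if abs(total - 100) > acceptable_sum_deviation:
            continue
        # Rescale so that the fractions add up to final_sum.
        kept.append({name: final_sum * value / total for name, value in row.items()})
    return kept


rows = [
    {"SiO2": 74.6, "Na2O": 25.0},  # sums to 99.6, within tolerance
    {"SiO2": 60.0, "Na2O": 25.0},  # sums to 85.0, discarded
]
result = filter_and_normalize(rows)
```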
- get_elements(**kwargs)
Get elemental atomic fraction information.
- Parameters:
path – str String with the path to the database csv file.
keep – list List of elements to keep in the final DataFrame.
drop – list List of elements to remove from the final DataFrame.
translate – dict Dictionary with the information on how to read and convert the elements. See variable AtMol_translation for an example.
acceptable_sum_deviation – positive int or float The sum of all atomic fractions should be 100%. However, due to floating-point or rounding errors, this sum will not always be exactly 100%. This argument controls the acceptable deviation of this sum in %. A value of 1 means that the sum of all atomic fractions can be between 99 and 101. All examples that are not within this range are discarded.
final_sum – positive int or float The final sum of all atomic fractions is normalized to this value. Usual values are 1 if you want atomic fractions or 100 if you want atomic percentages.
IDs – pd.Index IDs of the dataset to consider. Each glass in the SciGlass database has a glass number and a paper number. The ID used in GlassPy is an integer that merges both numbers.
- get_properties(**kwargs)
Get properties information.
- Parameters:
path – str String with the path to the database csv file.
keep – list List of properties to keep in the final DataFrame.
drop – list List of properties to remove from the final DataFrame.
translate – dict Dictionary with the information on how to read and convert the properties. See variable SciSK_translation for an example.
IDs – pd.Index IDs of the dataset to consider. Each glass in the SciGlass database has a glass number and a paper number. The ID used in GlassPy is an integer that merges both numbers.
- remove_duplicate_composition(scope='elements', decimals=3, aggregator='median')
Remove duplicate compositions from the data attribute.
Note that the original ID and the metadata are lost upon this operation.
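One plausible reading of the decimals and aggregator arguments is that compositions are rounded before comparison and the property values of matching compositions are then aggregated. The sketch below illustrates that reading in plain Python; it is an assumption about the behavior, not GlassPy's implementation, and the helper name and sample values are made up.

```python
from statistics import median


def merge_duplicates(samples, decimals=3):
    """Group samples by rounded composition and take the median property value."""
    groups = {}
    for composition, value in samples:
        # Compositions that agree after rounding count as duplicates.
        key = tuple(sorted((name, round(x, decimals)) for name, x in composition.items()))
        groups.setdefault(key, []).append(value)
    return [(dict(key), median(values)) for key, values in groups.items()]


samples = [
    ({"Si": 0.3333, "O": 0.6667}, 10.0),
    ({"Si": 0.33331, "O": 0.66669}, 20.0),  # same composition after rounding
    ({"Si": 0.25, "O": 0.75}, 30.0),
]
merged = merge_duplicates(samples)
```

Because several rows collapse into one, no single original ID or metadata record can describe the merged row, which is consistent with the note above.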
- remove_zero_sum_columns(scope='compounds')
Removes all columns that have zero sum from the data attribute.
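As a minimal illustration of the idea, the sketch below drops zero-sum columns from a table stored as a dictionary of lists. The function name and the data are hypothetical; GlassPy itself operates on the DataFrame in the data attribute.

```python
def drop_zero_sum_columns(table):
    """Return a copy of the table without columns whose values sum to zero."""
    return {name: values for name, values in table.items() if sum(values) != 0}


table = {
    "SiO2": [70.0, 80.0],
    "Na2O": [30.0, 20.0],
    "PbO": [0.0, 0.0],  # carries no information, so it is removed
}
cleaned = drop_zero_sum_columns(table)
```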
glasspy.data.translators module
Module contents
- class glasspy.data.SciGlass(elements_cfg: dict = {}, properties_cfg: dict = {}, compounds_cfg: dict = {}, autocleanup: bool = True, metadata: bool = True)
Bases: object
Loader of SciGlass data.
- Parameters:
elements_cfg – Dictionary configuring how the element information is collected. See the docstring of the get_elements method for more details.
properties_cfg – Dictionary configuring how the property information is collected. See the docstring of the get_properties method for more details.
compounds_cfg – Dictionary configuring how the compound information is collected. See the docstring of the get_compounds method for more details.
autocleanup – If True, automatically delete columns of the final DataFrame that do not have any information (only zeros). Default value: True.
metadata – If True, add the metadata information to the DataFrame. Default value: True.
- data
DataFrame of the collected data.
- static available_properties()
Returns a list of available properties.
- static available_properties_metadata()
Returns a list of available properties metadata.
- elements_from_compounds(final_sum=1, compounds_in_weight=False)
Create atomic fraction information from compound information.
- Parameters:
final_sum – positive int or float The final sum of all atomic fractions is normalized to this value. Usual values are 1 if you want atomic fractions or 100 if you want atomic percentages. Default value is 1.
compounds_in_weight – bool If True, assume that the compound fractions are in weight%; otherwise assume that they are in mol%. Default value is False.
- get_compounds(**kwargs)
Get compound information.
- Parameters:
path – str String with the path to the database csv file.
keep – list List of compounds to keep in the final DataFrame.
drop – list List of compounds to remove from the final DataFrame.
acceptable_sum_deviation – positive int or float The sum of all compound fractions should be 100%. However, due to floating-point or rounding errors, this sum will not always be exactly 100%. This argument controls the acceptable deviation of this sum in %. A value of 1 means that the sum of all compound fractions can be between 99 and 101. All examples that are not within this range are discarded.
final_sum – positive int or float The final sum of all compound fractions is normalized to this value. Usual values are 1 if you want compound fractions or 100 if you want compound percentages.
return_weight – bool If True, the chemical information stored in the DataFrame will be in weight%. Otherwise it will be in mol%.
IDs – pd.Index IDs of the dataset to consider. Each glass in the SciGlass database has a glass number and a paper number. The ID used in GlassPy is an integer that merges both numbers.
- get_elements(**kwargs)
Get elemental atomic fraction information.
- Parameters:
path – str String with the path to the database csv file.
keep – list List of elements to keep in the final DataFrame.
drop – list List of elements to remove from the final DataFrame.
translate – dict Dictionary with the information on how to read and convert the elements. See variable AtMol_translation for an example.
acceptable_sum_deviation – positive int or float The sum of all atomic fractions should be 100%. However, due to floating-point or rounding errors, this sum will not always be exactly 100%. This argument controls the acceptable deviation of this sum in %. A value of 1 means that the sum of all atomic fractions can be between 99 and 101. All examples that are not within this range are discarded.
final_sum – positive int or float The final sum of all atomic fractions is normalized to this value. Usual values are 1 if you want atomic fractions or 100 if you want atomic percentages.
IDs – pd.Index IDs of the dataset to consider. Each glass in the SciGlass database has a glass number and a paper number. The ID used in GlassPy is an integer that merges both numbers.
- get_properties(**kwargs)
Get properties information.
- Parameters:
path – str String with the path to the database csv file.
keep – list List of properties to keep in the final DataFrame.
drop – list List of properties to remove from the final DataFrame.
translate – dict Dictionary with the information on how to read and convert the properties. See variable SciSK_translation for an example.
IDs – pd.Index IDs of the dataset to consider. Each glass in the SciGlass database has a glass number and a paper number. The ID used in GlassPy is an integer that merges both numbers.
- remove_duplicate_composition(scope='elements', decimals=3, aggregator='median')
Remove duplicate compositions from the data attribute.
Note that the original ID and the metadata are lost upon this operation.
- remove_zero_sum_columns(scope='compounds')
Removes all columns that have zero sum from the data attribute.