GlassPy: loading data
Introduction
GlassPy loads experimental data through its glasspy.data subpackage. The SciGlass database is currently available as a data source.
Basic usage
Below is a minimal example of loading SciGlass data into a pandas DataFrame. It uses the default configuration, which loads most of the available data and metadata.
[1]:
from glasspy.data import SciGlass, sciglass_dbinfo
source = SciGlass()
df = source.data
It takes a while to run this cell, but after it loads all the data, we can check what we have.
[2]:
df
[2]:
elements | ... | property | metadata | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
H | Li | Be | B | C | N | O | F | Na | Mg | ... | SurfaceTensionAboveTg | SurfaceTension1173K | SurfaceTension1473K | SurfaceTension1573K | SurfaceTension1673K | ChemicalAnalysis | Author | Year | NumberElements | NumberCompounds | |
ID | |||||||||||||||||||||
20400020000 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.666667 | 0.0 | 0.000000 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Volarovich M.P. | 1936 | 2 | 1 |
20500020001 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.579213 | 0.0 | 0.196815 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Hoj J.W. | 1992 | 5 | 4 |
20500020002 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.580869 | 0.0 | 0.193449 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Hoj J.W. | 1992 | 5 | 4 |
20500020003 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.581986 | 0.0 | 0.187167 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Hoj J.W. | 1992 | 5 | 4 |
20500020004 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.583672 | 0.0 | 0.183080 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Hoj J.W. | 1992 | 5 | 4 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
4493300611694 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.625485 | 0.0 | 0.000000 | 0.049125 | ... | NaN | NaN | NaN | NaN | NaN | False | Murata T. | 2019 | 7 | 6 |
4493300611695 | 0.0 | 0.0 | 0.0 | 0.001948 | 0.0 | 0.0 | 0.637540 | 0.0 | 0.000000 | 0.009932 | ... | NaN | NaN | NaN | NaN | NaN | False | Murata T. | 2019 | 10 | 9 |
4493300611696 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.635921 | 0.0 | 0.000000 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Murata T. | 2019 | 8 | 7 |
4493300611697 | 0.0 | 0.0 | 0.0 | 0.014544 | 0.0 | 0.0 | 0.622226 | 0.0 | 0.035890 | 0.000000 | ... | NaN | NaN | NaN | NaN | NaN | False | Murata T. | 2019 | 9 | 8 |
4493300611698 | 0.0 | 0.0 | 0.0 | 0.041532 | 0.0 | 0.0 | 0.634462 | 0.0 | 0.000000 | 0.000487 | ... | NaN | NaN | NaN | NaN | NaN | False | Murata T. | 2019 | 7 | 6 |
283102 rows × 793 columns
To avoid naming conflicts and to make the DataFrame easier to navigate, the columns are organized in two levels. The first level groups the information by composition (elements and compounds), property, or metadata.
[3]:
print(df.columns.levels[0])
Index(['elements', 'compounds', 'property', 'metadata'], dtype='object')
So if you want to explore the chemical elements of the data, you can just filter that part of the DataFrame.
[4]:
els = df["elements"]
els
[4]:
H | Li | Be | B | C | N | O | F | Na | Mg | ... | W | Re | Pt | Au | Hg | Tl | Pb | Bi | Th | U | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ID | |||||||||||||||||||||
20400020000 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.666667 | 0.0 | 0.000000 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
20500020001 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.579213 | 0.0 | 0.196815 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
20500020002 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.580869 | 0.0 | 0.193449 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
20500020003 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.581986 | 0.0 | 0.187167 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
20500020004 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.583672 | 0.0 | 0.183080 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
4493300611694 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.625485 | 0.0 | 0.000000 | 0.049125 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4493300611695 | 0.0 | 0.0 | 0.0 | 0.001948 | 0.0 | 0.0 | 0.637540 | 0.0 | 0.000000 | 0.009932 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4493300611696 | 0.0 | 0.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.635921 | 0.0 | 0.000000 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4493300611697 | 0.0 | 0.0 | 0.0 | 0.014544 | 0.0 | 0.0 | 0.622226 | 0.0 | 0.035890 | 0.000000 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
4493300611698 | 0.0 | 0.0 | 0.0 | 0.041532 | 0.0 | 0.0 | 0.634462 | 0.0 | 0.000000 | 0.000487 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
283102 rows × 76 columns
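Because the elements block is an ordinary DataFrame, the usual pandas idioms apply to it. The sketch below uses a small hand-made table as a stand-in (the IDs and fractions are made up for illustration, not SciGlass records) to select the glasses that contain sodium and to count how many elements each glass has.

```python
import pandas as pd

# Toy stand-in for df["elements"]; values are illustrative, not SciGlass data.
els = pd.DataFrame(
    {
        "O": [0.667, 0.579, 0.625],
        "Si": [0.333, 0.224, 0.300],
        "Na": [0.000, 0.197, 0.000],
        "Mg": [0.000, 0.000, 0.075],
    },
    index=pd.Index([1, 2, 3], name="ID"),
)

# Rows (glasses) whose sodium fraction is non-zero.
with_na = els[els["Na"] > 0]

# Number of distinct elements present in each glass.
n_elements = (els > 0).sum(axis=1)

print(with_na.index.tolist())
print(n_elements.tolist())
```

The same two expressions should work unchanged on the real elements block, since they only rely on column names and numeric comparisons.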
The same is true if you want to explore a particular column of the DataFrame. Suppose you want to explore the glass transition temperature:
[5]:
Tg = df["property"]["Tg"]
Tg
[5]:
ID
20400020000 NaN
20500020001 1017.15
20500020002 1096.15
20500020003 1013.15
20500020004 1013.15
...
4493300611694 NaN
4493300611695 NaN
4493300611696 NaN
4493300611697 NaN
4493300611698 NaN
Name: Tg, Length: 283102, dtype: float64
As you can see, not all entries have a value for Tg.
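A common next step is to keep only the glasses for which the property was reported. The following is a minimal sketch on a toy table that mimics the two-level column layout (the values are illustrative, not SciGlass records). It also shows that a `(group, name)` tuple selects a column in a single step, which is standard pandas behaviour for MultiIndex columns.

```python
import numpy as np
import pandas as pd

# Toy table with the same two-level column layout; values are illustrative.
columns = pd.MultiIndex.from_tuples(
    [("elements", "O"), ("elements", "Si"), ("property", "Tg"), ("metadata", "Year")]
)
df = pd.DataFrame(
    [
        [0.667, 0.333, np.nan, 1936],
        [0.600, 0.400, 1017.15, 1992],
        [0.620, 0.380, 1096.15, 1992],
    ],
    columns=columns,
    index=pd.Index([10, 11, 12], name="ID"),
)

# A (level-0, level-1) tuple selects one column in a single step.
tg = df[("property", "Tg")]

# Keep only the glasses for which Tg was reported.
has_tg = df[tg.notna()]

print(int(tg.notna().sum()))
print(has_tg.index.tolist())
```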
To check for all available properties in GlassPy, run:
[9]:
sciglass_dbinfo()
ChemicalAnalysis: Indicates if the glass composition was obtained by chemical analysis [metadata]
Author: First author of the publication [metadata]
Year: Year of the publication [metadata]
T0: Temperature where viscosity is 1 Pa.s (K)
T1: Temperature where viscosity is 10 Pa.s (K)
T2: Temperature where viscosity is 100 Pa.s (K)
T3: Temperature where viscosity is 1000 Pa.s (K)
T4: Temperature where viscosity is 10000 Pa.s (K)
T5: Temperature where viscosity is 100000 Pa.s (K)
T6: Temperature where viscosity is 1000000 Pa.s (K)
T7: Temperature where viscosity is 10000000 Pa.s (K)
T8: Temperature where viscosity is 100000000 Pa.s (K)
T9: Temperature where viscosity is 1000000000 Pa.s (K)
T10: Temperature where viscosity is 10000000000 Pa.s (K)
T11: Temperature where viscosity is 100000000000 Pa.s (K)
T12: Temperature where viscosity is 1000000000000 Pa.s (K)
Viscosity773K: Viscosity at 773 K (Pa.s)
Viscosity873K: Viscosity at 873 K (Pa.s)
Viscosity973K: Viscosity at 973 K (Pa.s)
Viscosity1073K: Viscosity at 1073 K (Pa.s)
Viscosity1173K: Viscosity at 1173 K (Pa.s)
Viscosity1273K: Viscosity at 1273 K (Pa.s)
Viscosity1373K: Viscosity at 1373 K (Pa.s)
Viscosity1473K: Viscosity at 1473 K (Pa.s)
Viscosity1573K: Viscosity at 1573 K (Pa.s)
Viscosity1673K: Viscosity at 1673 K (Pa.s)
Viscosity1773K: Viscosity at 1773 K (Pa.s)
Viscosity1873K: Viscosity at 1873 K (Pa.s)
Viscosity2073K: Viscosity at 2073 K (Pa.s)
Viscosity2273K: Viscosity at 2273 K (Pa.s)
Viscosity2473K: Viscosity at 2473 K (Pa.s)
Tg: Glass transition temperature (K)
Tmelt: Melting temperature (K)
Tliquidus: Liquidus temperature (K)
TLittletons: Littletons softening temperature (K)
TAnnealing: Annealing point (K)
Tstrain: Strain point (K)
Tsoft: Softening point (K)
TdilatometricSoftening: Dilatometric softening temperature (K)
AbbeNum: Abbe's number
RefractiveIndex: Refractive index
RefractiveIndexLow: Refractive index measured at a wavelenght between 0.6 and 1 micron at 293 K
RefractiveIndexHigh: Refractive index measured at a wavelenght greater than 1 micron at 293 K
MeanDispersion: Mean dispersion (nF - nC)
Permittivity: Relative permittivity at ambient temperature anf frequency of1 MHz (or the nearest frequency in the range of 0.01 MHz to 10 MHz)
TangentOfLossAngle: Tangent of loss angle
TresistivityIs1MOhm.m: Temperature where the specific electrical resistivity is 1MOhm.m (K)
Resistivity293K: Specific electrical resistivity measured at 293 K (Ohm.m)
Resistivity373K: Specific electrical resistivity measured at 373 K (Ohm.m)
Resistivity423K: Specific electrical resistivity measured at 423 K (Ohm.m)
Resistivity573K: Specific electrical resistivity measured at 573 K (Ohm.m)
Resistivity1073K: Specific electrical resistivity measured at 1073 K (Ohm.m)
Resistivity1273K: Specific electrical resistivity measured at 1273 K (Ohm.m)
Resistivity1473K: Specific electrical resistivity measured at 1473 K (Ohm.m)
Resistivity1673K: Specific electrical resistivity measured at 1673 K (Ohm.m)
YoungModulus: Young's Modulus (GPa)
ShearModulus: Shear Modulus (GPa)
Microhardness: Microhardness measured by Knoop or Vickers indentation (GPa)
PoissonRatio: Poisson's ratio
Density293K: Density measured at 293 K (g/cm3)
Density1073K: Density measured at 1073 K (g/cm3)
Density1273K: Density measured at 1273 K (g/cm3)
Density1473K: Density measured at 1473 K (g/cm3)
Density1673K: Density measured at 1673 K (g/cm3)
ThermalConductivity: Thermal conductivity (W/(m.K))
ThermalShockRes: Thermal shock resistance (K)
CTEbelowTg: Linear coefficient of thermal expansion measured below the glass transition temperature (1/K)
CTE328K: Linear coefficient of thermal expansion measured at 328 +/- 10 K (1/K)
CTE373K: Linear coefficient of thermal expansion measured at 373 +/- 10 K (1/K)
CTE433K: Linear coefficient of thermal expansion measured at 433 +/- 10 K (1/K)
CTE483K: Linear coefficient of thermal expansion measured at 483 +/- 10 K (1/K)
CTE623K: Linear coefficient of thermal expansion measured at 623 +/- 10 K (1/K)
Cp293K: Heat capacity at constant pressure measured at 293 K (J/(kg.K))
Cp473K: Heat capacity at constant pressure measured at 473 K (J/(kg.K))
Cp673K: Heat capacity at constant pressure measured at 673 K (J/(kg.K))
Cp1073K: Heat capacity at constant pressure measured at 1073 K (J/(kg.K))
Cp1273K: Heat capacity at constant pressure measured at 1273 K (J/(kg.K))
Cp1473K: Heat capacity at constant pressure measured at 1473 K (J/(kg.K))
Cp1673K: Heat capacity at constant pressure measured at 1673 K (J/(kg.K))
NucleationTemperature: Nucleation temperature (K)
NucleationRate: Crystal nucleation rate (1/(s.m3))
TMaxGrowthVelocity: Temperature of maximum crystal growth velocity (K)
MaxGrowthVelocity: Maximum crystal growth velocity (m/s)
CrystallizationPeak: DTA temperature of crystallization peak (K)
CrystallizationOnset: DTA temperature of crystallization onset (K)
SurfaceTensionAboveTg: Surface tension above the glass transition temperature (J/m2)
SurfaceTension1173K: Surface tension at 1173 K (J/m2)
SurfaceTension1473K: Surface tension at 1473 K (J/m2)
SurfaceTension1573K: Surface tension at 1573 K (J/m2)
SurfaceTension1673K: Surface tension at 1673 K (J/m2)
See the pandas documentation if you are not familiar with how to use a pandas DataFrame.
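With this many properties, it can help to check how many entries each one actually has before choosing what to work with. A minimal sketch, using a toy stand-in for the property block (the numbers are illustrative, not SciGlass data):

```python
import numpy as np
import pandas as pd

# Toy stand-in for df["property"]; values are illustrative.
props = pd.DataFrame(
    {
        "Tg": [np.nan, 1017.15, 1096.15, 1013.15],
        "Density293K": [2.2, np.nan, np.nan, 2.5],
        "AbbeNum": [np.nan, np.nan, np.nan, 58.0],
    }
)

# Non-missing entries per property, most populated first.
coverage = props.notna().sum().sort_values(ascending=False)
print(coverage)
```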
Controlling the initial data collection
Loading all the SciGlass data takes a while, so it is wise to load only what you will actually use. You can control what is loaded by passing configuration dictionaries to the SciGlass class.
For example, suppose you want to exclude glasses that contain silver or gold, you are only interested in the glass transition temperature, and you don’t need information about the compounds that make up the glass. You can express this query as follows:
[7]:
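# List every available property, then remove Tg from that list
all_properties_except_Tg = SciGlass.available_properties()
all_properties_except_Tg.remove("Tg")

# Exclude glasses that contain silver or gold
config_el = {
    "drop": ["Ag", "Au"],
}

# Keep Tg and drop every other property
config_prop = {
    "keep": ["Tg"],
    "drop": all_properties_except_Tg,
}

# Compound information is not needed, so no options are set
config_comp = {}

source = SciGlass(
    elements_cfg=config_el,
    properties_cfg=config_prop,
    compounds_cfg=config_comp,
)
df = source.data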
[8]:
df
[8]:
elements | property | metadata | |||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
H | Li | Be | B | C | N | O | F | Na | Mg | ... | Tl | Pb | Bi | Th | U | Tg | ChemicalAnalysis | Author | Year | NumberElements | |
ID | |||||||||||||||||||||
20500020001 | 0.0 | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 57.921249 | 0.0 | 19.681530 | 0.0 | ... | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 1017.15 | False | Hoj J.W. | 1992 | 5 |
20500020002 | 0.0 | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 58.086941 | 0.0 | 19.344940 | 0.0 | ... | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 1096.15 | False | Hoj J.W. | 1992 | 5 |
20500020003 | 0.0 | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 58.198601 | 0.0 | 18.716690 | 0.0 | ... | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 1013.15 | False | Hoj J.W. | 1992 | 5 |
20500020004 | 0.0 | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 58.367241 | 0.0 | 18.308001 | 0.0 | ... | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 1013.15 | False | Hoj J.W. | 1992 | 5 |
20500020005 | 0.0 | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 58.282768 | 0.0 | 18.264561 | 0.0 | ... | 0.000000 | 0.0 | 0.000000 | 0.0 | 0.0 | 978.15 | False | Hoj J.W. | 1992 | 5 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
4493200611415 | 0.0 | 7.250638 | 0.0 | 2.368801 | 0.0 | 0.0 | 59.389221 | 0.0 | 0.000000 | 0.0 | ... | 8.964828 | 0.0 | 5.536447 | 0.0 | 0.0 | 543.15 | False | Jung Woo Man | 2019 | 9 |
4493200611416 | 0.0 | 7.445931 | 0.0 | 2.358826 | 0.0 | 0.0 | 59.595871 | 0.0 | 0.000000 | 0.0 | ... | 6.650183 | 0.0 | 5.808963 | 0.0 | 0.0 | 545.15 | False | Jung Woo Man | 2019 | 9 |
4493200611417 | 0.0 | 6.593068 | 0.0 | 10.288480 | 0.0 | 0.0 | 59.600090 | 0.0 | 0.000000 | 0.0 | ... | 10.782570 | 0.0 | 0.000000 | 0.0 | 0.0 | 532.15 | False | Jung Woo Man | 2019 | 9 |
4493200611418 | 0.0 | 5.919064 | 0.0 | 1.936039 | 0.0 | 0.0 | 64.014076 | 0.0 | 0.000000 | 0.0 | ... | 7.322553 | 0.0 | 0.000000 | 0.0 | 0.0 | 506.15 | False | Jung Woo Man | 2019 | 9 |
4493200611419 | 0.0 | 6.371798 | 0.0 | 2.019926 | 0.0 | 0.0 | 63.761761 | 0.0 | 0.000000 | 0.0 | ... | 7.882636 | 0.0 | 0.000000 | 0.0 | 0.0 | 522.15 | False | Jung Woo Man | 2019 | 9 |
91738 rows × 78 columns
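If the full table is already in memory, a comparable selection can be made with pandas after loading. The sketch below runs on a toy table (illustrative values, not SciGlass records) and is only an approximation of the query above. It is not necessarily identical to what the SciGlass class does internally.

```python
import numpy as np
import pandas as pd

# Toy table with two-level columns; values are illustrative.
columns = pd.MultiIndex.from_tuples(
    [
        ("elements", "O"), ("elements", "Si"), ("elements", "Ag"),
        ("elements", "Au"), ("property", "Tg"), ("property", "Tmelt"),
    ]
)
full = pd.DataFrame(
    [
        [0.60, 0.40, 0.00, 0.00, 1017.15, np.nan],
        [0.60, 0.30, 0.10, 0.00, 900.15, np.nan],
        [0.67, 0.33, 0.00, 0.00, np.nan, 1986.15],
        [0.55, 0.40, 0.00, 0.05, 850.15, np.nan],
    ],
    columns=columns,
    index=pd.Index([1, 2, 3, 4], name="ID"),
)

# Glasses with no silver and no gold ...
no_ag_au = (full[("elements", "Ag")] == 0) & (full[("elements", "Au")] == 0)
# ... and a reported Tg.
has_tg = full[("property", "Tg")].notna()

# Keep the element columns and Tg, dropping every other property.
keep_cols = [c[0] == "elements" or c == ("property", "Tg") for c in full.columns]
subset = full.loc[no_ag_au & has_tg, keep_cols]

print(subset.index.tolist())
print(subset.shape)
```

Filtering at load time, as in the cell above, avoids reading data that would be discarded anyway, so this approach is mainly useful when the full table is needed for other purposes.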
See the documentation for the SciGlass class for more information on how to control your initial data collection.
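When more than one property is of interest, the drop list used in the example above can be built with a small helper. `split_keep_drop` below is a hypothetical helper, not part of GlassPy. It only assembles a dictionary with the `keep` and `drop` keys shown in the example, and the list of property names is hard-coded here so that the snippet runs without loading the database.

```python
def split_keep_drop(available, wanted):
    """Build a {"keep": ..., "drop": ...} dict (hypothetical helper, not a GlassPy API)."""
    missing = [name for name in wanted if name not in available]
    if missing:
        raise ValueError(f"Unknown properties: {missing}")
    return {
        "keep": list(wanted),
        "drop": [name for name in available if name not in wanted],
    }


# In practice `available` would come from SciGlass.available_properties();
# a short hard-coded list keeps this sketch self-contained.
available = ["Tg", "Tmelt", "Tliquidus", "Density293K"]
config_prop = split_keep_drop(available, ["Tg", "Density293K"])
print(config_prop)
```

The resulting dictionary has the same shape as `config_prop` in the example and could be passed as `properties_cfg` in the same way.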