D47crunch
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements
Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47 and Δ48 values with fully propagated analytical error estimates (Daëron, 2021).
The tutorial section takes you through a series of simple steps to import/process data and print out the results. The how-to section provides instructions applicable to various specific tasks.
1. Tutorial
1.1 Installation
The easy option is to use `pip`; open a shell terminal and simply type:
python -m pip install D47crunch
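To check that the installation worked, you can print out the installed version from Python:

# sanity check: import the package and print its version
import D47crunch
print(D47crunch.__version__)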
For those wishing to experiment with the bleeding-edge development version, this can be done through the following steps:

- Download the `dev` branch source code here and rename it to `D47crunch.py`.
- Do any of the following:
  - copy `D47crunch.py` to somewhere in your Python path
  - copy `D47crunch.py` to a working directory (`import D47crunch` will only work if called within that directory)
  - copy `D47crunch.py` to any other location (e.g., `/foo/bar`) and then use the following code snippet in your own code to import `D47crunch`:

import sys
sys.path.append('/foo/bar')
import D47crunch
Documentation for the development version can be downloaded here (save the html file and open it locally).
1.2 Usage
Start by creating a file named `rawdata.csv` with the following contents:
UID, Sample, d45, d46, d47, d48, d49
A01, ETH-1, 5.79502, 11.62767, 16.89351, 24.56708, 0.79486
A02, MYSAMPLE-1, 6.21907, 11.49107, 17.27749, 24.58270, 1.56318
A03, ETH-2, -6.05868, -4.81718, -11.63506, -10.32578, 0.61352
A04, MYSAMPLE-2, -3.86184, 4.94184, 0.60612, 10.52732, 0.57118
A05, ETH-3, 5.54365, 12.05228, 17.40555, 25.96919, 0.74608
A06, ETH-2, -6.06706, -4.87710, -11.69927, -10.64421, 1.61234
A07, ETH-1, 5.78821, 11.55910, 16.80191, 24.56423, 1.47963
A08, MYSAMPLE-2, -3.87692, 4.86889, 0.52185, 10.40390, 1.07032
Then instantiate a `D47data` object, which will store and process this data:
import D47crunch
mydata = D47crunch.D47data()
For now, this object is empty:
>>> print(mydata)
[]
To load the analyses saved in `rawdata.csv` into our `D47data` object and process the data:
mydata.read('rawdata.csv')
# compute δ13C, δ18O of working gas:
mydata.wg()
# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()
# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()
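Equivalently, because `D47data` objects derive from `list`, you may first load the raw data as a list of dictionaries using the module-level `read_csv()` function (which auto-detects `,`, `;`, or tab separators) and pass it directly to the constructor:

import D47crunch

# read the analyses as a list of dictionaries, one per analysis:
rawdata = D47crunch.read_csv('rawdata.csv')
mydata = D47crunch.D47data(rawdata)
mydata.wg()
mydata.crunch()
mydata.standardize()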
We can now print a summary of the data processing:
>>> mydata.summary(verbose = True, save_to_file = False)
[summary]
––––––––––––––––––––––––––––––– –––––––––
N samples (anchors + unknowns) 5 (3 + 2)
N analyses (anchors + unknowns) 8 (5 + 3)
Repeatability of δ13C_VPDB 4.2 ppm
Repeatability of δ18O_VSMOW 47.5 ppm
Repeatability of Δ47 (anchors) 13.4 ppm
Repeatability of Δ47 (unknowns) 2.5 ppm
Repeatability of Δ47 (all) 9.6 ppm
Model degrees of freedom 3
Student's 95% t-factor 3.18
Standardization method pooled
––––––––––––––––––––––––––––––– –––––––––
This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see [Daëron, 2021]).
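The figures reported by `summary()` are also stored on the object itself; the sketch below assumes that the keys of the `repeatability` dictionary mirror the summary labels (check `mydata.repeatability.keys()` in your version):

# degrees of freedom of the standardization model:
print(mydata.Nf)
# pooled Δ47 repeatability (the key name is an assumption here):
print(mydata.repeatability['r_D47'])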
To see the actual results:
>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples]
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 2 2.01 37.01 0.2052 0.0131
ETH-2 2 -10.17 19.88 0.2085 0.0026
ETH-3 1 1.73 37.49 0.6132
MYSAMPLE-1 1 2.48 36.90 0.2996 0.0091 ± 0.0291
MYSAMPLE-2 2 -8.17 30.05 0.6600 0.0115 ± 0.0366 0.0025
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
This table lists, for each sample, the number of analytical replicates, the average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the SD of Δ47 for all replicates of this sample. For unknown samples, the SE and 95 % confidence limits for the mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large data sets the 95 % CL will tend to 1.96 times the SE; in this small example, however, the applicable t-factor is much larger.
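To post-process these results yourself rather than reading the formatted table, the same method can return the table as a list of lists of strings; this is how the module-level `table_of_samples()` function combines Δ47 and Δ48 outputs internally:

# fetch the sample table without printing or saving it:
table = mydata.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
header, rows = table[0], table[1:]
# e.g., extract the D47 value of MYSAMPLE-1 (column names as in the printed table):
i = header.index('D47')
print([r[i] for r in rows if r[0] == 'MYSAMPLE-1'])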
We can also generate a table of all analyses in the data set (again, note that `d18O_VSMOW` is the composition of the CO2 analyte):
>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses]
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
A01 mySession ETH-1 -3.807 24.921 5.795020 11.627670 16.893510 24.567080 0.794860 2.014086 37.041843 -0.574686 1.149684 -27.690250 0.214454
A02 mySession MYSAMPLE-1 -3.807 24.921 6.219070 11.491070 17.277490 24.582700 1.563180 2.476827 36.898281 -0.499264 1.435380 -27.122614 0.299589
A03 mySession ETH-2 -3.807 24.921 -6.058680 -4.817180 -11.635060 -10.325780 0.613520 -10.166796 19.907706 -0.685979 -0.721617 16.716901 0.206693
A04 mySession MYSAMPLE-2 -3.807 24.921 -3.861840 4.941840 0.606120 10.527320 0.571180 -8.159927 30.087230 -0.248531 0.613099 -4.979413 0.658270
A05 mySession ETH-3 -3.807 24.921 5.543650 12.052280 17.405550 25.969190 0.746080 1.727029 37.485567 -0.226150 1.678699 -28.280301 0.613200
A06 mySession ETH-2 -3.807 24.921 -6.067060 -4.877100 -11.699270 -10.644210 1.612340 -10.173599 19.845192 -0.683054 -0.922832 17.861363 0.210328
A07 mySession ETH-1 -3.807 24.921 5.788210 11.559100 16.801910 24.564230 1.479630 2.009281 36.970298 -0.591129 1.282632 -26.888335 0.195926
A08 mySession MYSAMPLE-2 -3.807 24.921 -3.876920 4.868890 0.521850 10.403900 1.070320 -8.173486 30.011134 -0.245768 0.636159 -4.324964 0.661803
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
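By default (`save_to_file = True`), these tables are also written to csv files. The `dir` and `filename` arguments of the module-level table functions control where the files go; the sketch below assumes the `D47data` methods accept the same arguments:

# write the full table of analyses to output/rawdata_analyses.csv
# (argument names taken from the module-level table functions)
mydata.table_of_analyses(dir = 'output', filename = 'rawdata_analyses.csv')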
2. How-to
2.1 Simulate a virtual data set to play with
It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).
This can be achieved with `virtual_data()`. The example below creates a data set with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named `FOO` and `BAR` with arbitrarily defined isotopic compositions. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the `virtual_data()` documentation for additional configuration parameters.
from D47crunch import virtual_data, D47data
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 3,
d13C_VPDB = -15., d18O_VPDB = -2.,
D47 = 0.6, D48 = 0.2),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)
D = D47data(session1 + session2 + session3 + session4)
D.crunch()
D.standardize()
D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
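To check the standardization errors and repeatabilities resulting from this virtual analytical scheme, you may also print the summary, as in the tutorial:

# repeatabilities, degrees of freedom, and t-factor for the virtual scheme:
D.summary(verbose = True, save_to_file = False)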
2.2 Control data quality
`D47crunch` offers several tools to visualize processed data. The examples below all use the same virtual data set, generated with:
from D47crunch import *
from random import shuffle
# generate virtual data:
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 8),
dict(Sample = 'ETH-2', N = 8),
dict(Sample = 'ETH-3', N = 8),
dict(Sample = 'FOO', N = 4,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 4,
d13C_VPDB = -15., d18O_VPDB = -15.,
D47 = 0.5, D48 = 0.2),
])
sessions = [
virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
for k in range(10)]
# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])
# create D47data instance:
data47 = D47data(data)
# process D47data instance:
data47.crunch()
data47.standardize()
2.2.1 Plotting the distribution of analyses through time
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')
The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See `D4xdata.plot_distribution_of_analyses()` for how to plot analyses as a function of “true” time, based on the `TimeTag` of each analysis.
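A minimal sketch of how `TimeTag` values might be attached to each analysis record, assuming the acquisition order is known (the half-hour spacing below is made up for illustration; see the method's documentation for the option that enables true-time plotting):

# attach hypothetical acquisition times (in hours) to each analysis;
# a D47data object derives from list, so records can be edited in place:
for k, r in enumerate(data47):
    r['TimeTag'] = 0.5 * k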
2.2.2 Generating session plots
data47.plot_sessions()
Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are plotted in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors (red) or the average Δ47 value for unknowns (blue, averaged over all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.
2.2.3 Plotting Δ47 or Δ48 residuals
data47.plot_residuals(filename = 'residuals.pdf', kde = True)
Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.
2.2.4 Checking δ13C and δ18O dispersion
mydata = D47data(virtual_data(
session = 'mysession',
samples = [
dict(Sample = 'ETH-1', N = 4),
dict(Sample = 'ETH-2', N = 4),
dict(Sample = 'ETH-3', N = 4),
dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
], seed = 123))
mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()
`D4xdata.plot_bulk_compositions()` produces a series of plots, one for each sample, plus an additional plot with all samples together. For example, here is the plot for sample `MYSAMPLE`:
2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters
Nominal values for various carbonate standards are defined in four places:

- `D4xdata.Nominal_d13C_VPDB`
- `D4xdata.Nominal_d18O_VPDB`
- `D47data.Nominal_D4x` (also accessible through `D47data.Nominal_D47`)
- `D48data.Nominal_D4x` (also accessible through `D48data.Nominal_D48`)
17O correction parameters are defined by:

- `D4xdata.R13_VPDB`
- `D4xdata.R18_VSMOW`
- `D4xdata.R18_VPDB`
- `D4xdata.LAMBDA_17`
- `D4xdata.R17_VSMOW`
- `D4xdata.R17_VPDB`
When creating a new instance of `D47data` or `D48data`, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., `R17_VSMOW` and `Nominal_D47` can thus be done in several ways:

Option 1: by redefining `D4xdata.R17_VSMOW` and `D47data.Nominal_D47` *before* creating a `D47data` object:
from D47crunch import D4xdata, D47data
# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value
# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17
# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
a: D47data.Nominal_D4x[a]
for a in ['ETH-1', 'ETH-2', 'ETH-3']
}
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600
# only now create D47data object:
mydata = D47data()
# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x
# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
Option 2: by redefining `R17_VSMOW` and `Nominal_D47` *after* creating a `D47data` object:
from D47crunch import D47data
# first create D47data object:
mydata = D47data()
# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value
# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17
# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
a: mydata.Nominal_D47[a]
for a in ['ETH-1', 'ETH-2', 'ETH-3']
}
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600
# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:
from D47crunch import D47data
# create two D47data objects:
foo = D47data()
bar = D47data()
# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
'ETH-1': foo.Nominal_D47['ETH-1'],
'ETH-2': foo.Nominal_D47['ETH-2'],
'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
'INLAB_REF_MATERIAL': 0.666,
}
# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg() # compute δ13C, δ18O of working gas
foo.crunch() # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values
bar.read('rawdata.csv')
bar.wg() # compute δ13C, δ18O of working gas
bar.crunch() # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values
# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)
2.4 Process paired Δ47 and Δ48 values
Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, `D47crunch` uses two independent classes, `D47data` and `D48data`, which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for `D47data` and `D48data`.
from D47crunch import *
# generate virtual data:
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)
# create D47data instance:
data47 = D47data(session1 + session2)
# process D47data instance:
data47.crunch()
data47.standardize()
# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)
# process D48data instance:
data48.crunch()
data48.standardize()
# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)
Expected output:
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O r_D47 a_47 ± SE 1e3 x b_47 ± SE c_47 ± SE r_D48 a_48 ± SE 1e3 x b_48 ± SE c_48 ± SE
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session_01 9 3 -4.000 26.000 0.0000 0.0000 0.0098 1.021 ± 0.019 -0.398 ± 0.260 -0.903 ± 0.006 0.0486 0.540 ± 0.151 1.235 ± 0.607 -0.390 ± 0.025
Session_02 9 3 -4.000 26.000 0.0000 0.0000 0.0090 1.015 ± 0.019 0.376 ± 0.260 -0.905 ± 0.006 0.0186 1.350 ± 0.156 -0.871 ± 0.608 -0.504 ± 0.027
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene D48 SE 95% CL SD p_Levene
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 6 2.02 37.02 0.2052 0.0078 0.1380 0.0223
ETH-2 6 -10.17 19.88 0.2085 0.0036 0.1380 0.0482
ETH-3 6 1.71 37.45 0.6132 0.0080 0.2700 0.0176
FOO 6 -5.00 28.91 0.3026 0.0044 ± 0.0093 0.0121 0.164 0.1397 0.0121 ± 0.0255 0.0267 0.127
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47 D48
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
1 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.120787 21.286237 27.780042 2.020000 37.024281 -0.708176 -0.316435 -0.000013 0.197297 0.087763
2 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.132240 21.307795 27.780042 2.020000 37.024281 -0.696913 -0.295333 -0.000013 0.208328 0.126791
3 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.132438 21.313884 27.780042 2.020000 37.024281 -0.696718 -0.289374 -0.000013 0.208519 0.137813
4 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.700300 -12.210735 -18.023381 -10.170000 19.875825 -0.683938 -0.297902 -0.000002 0.209785 0.198705
5 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.707421 -12.270781 -18.023381 -10.170000 19.875825 -0.691145 -0.358673 -0.000002 0.202726 0.086308
6 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.700061 -12.278310 -18.023381 -10.170000 19.875825 -0.683696 -0.366292 -0.000002 0.210022 0.072215
7 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.684379 22.225827 28.306614 1.710000 37.450394 -0.273094 -0.216392 -0.000014 0.623472 0.270873
8 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.660163 22.233729 28.306614 1.710000 37.450394 -0.296906 -0.208664 -0.000014 0.600150 0.285167
9 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.675191 22.215632 28.306614 1.710000 37.450394 -0.282128 -0.226363 -0.000014 0.614623 0.252432
10 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.328380 5.374933 4.665655 -5.000000 28.907344 -0.582131 -0.288924 -0.000006 0.314928 0.175105
11 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.302220 5.384454 4.665655 -5.000000 28.907344 -0.608241 -0.279457 -0.000006 0.289356 0.192614
12 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.322530 5.372841 4.665655 -5.000000 28.907344 -0.587970 -0.291004 -0.000006 0.309209 0.171257
13 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.140853 21.267202 27.780042 2.020000 37.024281 -0.688442 -0.335067 -0.000013 0.207730 0.138730
14 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.127087 21.256983 27.780042 2.020000 37.024281 -0.701980 -0.345071 -0.000013 0.194396 0.131311
15 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.148253 21.287779 27.780042 2.020000 37.024281 -0.681165 -0.314926 -0.000013 0.214898 0.153668
16 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.715859 -12.204791 -18.023381 -10.170000 19.875825 -0.699685 -0.291887 -0.000002 0.207349 0.149128
17 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.709763 -12.188685 -18.023381 -10.170000 19.875825 -0.693516 -0.275587 -0.000002 0.213426 0.161217
18 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.715427 -12.253049 -18.023381 -10.170000 19.875825 -0.699249 -0.340727 -0.000002 0.207780 0.112907
19 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.685994 22.249463 28.306614 1.710000 37.450394 -0.271506 -0.193275 -0.000014 0.618328 0.244431
20 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.681351 22.298166 28.306614 1.710000 37.450394 -0.276071 -0.145641 -0.000014 0.613831 0.279758
21 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.676169 22.306848 28.306614 1.710000 37.450394 -0.281167 -0.137150 -0.000014 0.608813 0.286056
22 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.324359 5.339497 4.665655 -5.000000 28.907344 -0.586144 -0.324160 -0.000006 0.314015 0.136535
23 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.297658 5.325854 4.665655 -5.000000 28.907344 -0.612794 -0.337727 -0.000006 0.287767 0.126473
24 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.310185 5.339898 4.665655 -5.000000 28.907344 -0.600291 -0.323761 -0.000006 0.300082 0.136830
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
3. Command-Line Interface (CLI)
Instead of writing Python code, you may directly use the CLI to process raw Δ47 data using reasonable defaults. The simplest way is to call:
D47crunch rawdata.csv
This will create a directory named `output` and populate it by calling the following methods:
- `D47data.wg()`
- `D47data.crunch()`
- `D47data.standardize()`
- `D47data.summary()`
- `D47data.table_of_samples()`
- `D47data.table_of_sessions()`
- `D47data.plot_sessions()`
- `D47data.plot_residuals()`
- `D47data.table_of_analyses()`
- `D47data.plot_distribution_of_analyses()`
- `D47data.plot_bulk_compositions()`
You may specify a custom set of anchors instead of the default ones using the `--anchors` or `-a` option:
D47crunch -a anchors.csv rawdata.csv
In this case, the `anchors.csv` file (you may use any other file name) must have the following format:
Sample, d13C_VPDB, d18O_VPDB, D47
ETH-1, 2.02, -2.19, 0.2052
ETH-2, -10.17, -18.69, 0.2085
ETH-3, 1.71, -1.78, 0.6132
ETH-4, , , 0.4511
The samples with non-empty `d13C_VPDB`, `d18O_VPDB`, and `D47` values are used to standardize δ13C, δ18O, and Δ47 values, respectively.
You may also provide a list of analyses and/or samples to exclude from the input. This is done with the `--exclude` or `-e` option:
D47crunch -e badbatch.csv rawdata.csv
In this case, the `badbatch.csv` file (again, you may use a different file name) must have the following format:
UID, Sample
A03
A09
B06
, MYBADSAMPLE-1
, MYBADSAMPLE-2
This will exclude (ignore) analyses with the UIDs `A03`, `A09`, and `B06`, as well as all analyses of samples `MYBADSAMPLE-1` and `MYBADSAMPLE-2`. It is possible to have an exclude file with only the `UID` column, or only the `Sample` column, or both, in any order.
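Since `D47data` objects derive from `list`, equivalent exclusions can also be applied in Python by filtering the records before processing (a sketch using the same UIDs and sample names as above, and the module-level `read_csv()` function):

from D47crunch import D47data, read_csv

# drop unwanted analyses before building the D47data object:
bad_uids = {'A03', 'A09', 'B06'}
bad_samples = {'MYBADSAMPLE-1', 'MYBADSAMPLE-2'}
mydata = D47data([
    r for r in read_csv('rawdata.csv')
    if r['UID'] not in bad_uids and r['Sample'] not in bad_samples
])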
The `--output-dir` or `-o` option may be used to specify a custom directory name for the output. For example, in Unix-like shells the following command will create a time-stamped output directory:
D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv
4. API Documentation
1''' 2Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements 3 4Process and standardize carbonate and/or CO2 clumped-isotope analyses, 5from low-level data out of a dual-inlet mass spectrometer to final, “absolute” 6Δ47 and Δ48 values with fully propagated analytical error estimates 7([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). 8 9The **tutorial** section takes you through a series of simple steps to import/process data and print out the results. 10The **how-to** section provides instructions applicable to various specific tasks. 11 12.. include:: ../docs/tutorial.md 13.. include:: ../docs/howto.md 14.. include:: ../docs/cli.md 15 16# 4. API Documentation 17''' 18 19__docformat__ = "restructuredtext" 20__author__ = 'Mathieu Daëron' 21__contact__ = 'daeron@lsce.ipsl.fr' 22__copyright__ = 'Copyright (c) 2023 Mathieu Daëron' 23__license__ = 'Modified BSD License - https://opensource.org/licenses/BSD-3-Clause' 24__date__ = '2023-07-20' 25__version__ = '2.2.0' 26 27import os 28import numpy as np 29import typer 30from typing_extensions import Annotated 31from statistics import stdev 32from scipy.stats import t as tstudent 33from scipy.stats import levene 34from scipy.interpolate import interp1d 35from numpy import linalg 36from lmfit import Minimizer, Parameters, report_fit 37from matplotlib import pyplot as ppl 38from datetime import datetime as dt 39from functools import wraps 40from colorsys import hls_to_rgb 41from matplotlib import rcParams 42 43rcParams['font.family'] = 'sans-serif' 44rcParams['font.sans-serif'] = 'Helvetica' 45rcParams['font.size'] = 10 46rcParams['mathtext.fontset'] = 'custom' 47rcParams['mathtext.rm'] = 'sans' 48rcParams['mathtext.bf'] = 'sans:bold' 49rcParams['mathtext.it'] = 'sans:italic' 50rcParams['mathtext.cal'] = 'sans:italic' 51rcParams['mathtext.default'] = 'rm' 52rcParams['xtick.major.size'] = 4 53rcParams['xtick.major.width'] = 1 54rcParams['ytick.major.size'] = 4 55rcParams['ytick.major.width'] = 1 56rcParams['axes.grid'] = False 57rcParams['axes.linewidth'] = 1 58rcParams['grid.linewidth'] = .75 59rcParams['grid.linestyle'] = '-' 60rcParams['grid.alpha'] = .15 61rcParams['savefig.dpi'] = 150 62 63Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], [53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], 
[57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], [390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 
0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], [960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]]) 64_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1]) 65def fCO2eqD47_Petersen(T): 66 ''' 67 CO2 equilibrium Δ47 value as a function of T (in degrees C) 68 according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127). 
69 70 ''' 71 return float(_fCO2eqD47_Petersen(T)) 72 73 74Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]]) 75_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1]) 76def fCO2eqD47_Wang(T): 77 ''' 78 CO2 equilibrium Δ47 value as a function of `T` (in degrees C) 79 according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039) 80 (supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)). 81 ''' 82 return float(_fCO2eqD47_Wang(T)) 83 84 85def correlated_sum(X, C, w = None): 86 ''' 87 Compute covariance-aware linear combinations 88 89 **Parameters** 90 91 + `X`: list or 1-D array of values to sum 92 + `C`: covariance matrix for the elements of `X` 93 + `w`: list or 1-D array of weights to apply to the elements of `X` 94 (all equal to 1 by default) 95 96 Return the sum (and its SE) of the elements of `X`, with optional weights equal 97 to the elements of `w`, accounting for covariances between the elements of `X`. 
98 ''' 99 if w is None: 100 w = [1 for x in X] 101 return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5 102 103 104def make_csv(x, hsep = ',', vsep = '\n'): 105 ''' 106 Formats a list of lists of strings as a CSV 107 108 **Parameters** 109 110 + `x`: the list of lists of strings to format 111 + `hsep`: the field separator (`,` by default) 112 + `vsep`: the line-ending convention to use (`\\n` by default) 113 114 **Example** 115 116 ```py 117 print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']])) 118 ``` 119 120 outputs: 121 122 ```py 123 a,b,c 124 d,e,f 125 ``` 126 ''' 127 return vsep.join([hsep.join(l) for l in x]) 128 129 130def pf(txt): 131 ''' 132 Modify string `txt` to follow `lmfit.Parameter()` naming rules. 133 ''' 134 return txt.replace('-','_').replace('.','_').replace(' ','_') 135 136 137def smart_type(x): 138 ''' 139 Tries to convert string `x` to a float if it includes a decimal point, or 140 to an integer if it does not. If both attempts fail, return the original 141 string unchanged. 142 ''' 143 try: 144 y = float(x) 145 except ValueError: 146 return x 147 if '.' not in x: 148 return int(y) 149 return y 150 151 152def pretty_table(x, header = 1, hsep = ' ', vsep = '–', align = '<'): 153 ''' 154 Reads a list of lists of strings and outputs an ascii table 155 156 **Parameters** 157 158 + `x`: a list of lists of strings 159 + `header`: the number of lines to treat as header lines 160 + `hsep`: the horizontal separator between columns 161 + `vsep`: the character to use as vertical separator 162 + `align`: string of left (`<`) or right (`>`) alignment characters. 163 164 **Example** 165 166 ```py 167 x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']] 168 print(pretty_table(x)) 169 ``` 170 yields: 171 ``` 172 -- ------ --- 173 A B C 174 -- ------ --- 175 1 1.9999 foo 176 10 x bar 177 -- ------ --- 178 ``` 179 180 ''' 181 txt = [] 182 widths = [np.max([len(e) for e in c]) for c in zip(*x)] 183 184 if len(widths) > len(align): 185 align += '>' * (len(widths)-len(align)) 186 sepline = hsep.join([vsep*w for w in widths]) 187 txt += [sepline] 188 for k,l in enumerate(x): 189 if k and k == header: 190 txt += [sepline] 191 txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])] 192 txt += [sepline] 193 txt += [''] 194 return '\n'.join(txt) 195 196 197def transpose_table(x): 198 ''' 199 Transpose a list if lists 200 201 **Parameters** 202 203 + `x`: a list of lists 204 205 **Example** 206 207 ```py 208 x = [[1, 2], [3, 4]] 209 print(transpose_table(x)) # yields: [[1, 3], [2, 4]] 210 ``` 211 ''' 212 return [[e for e in c] for c in zip(*x)] 213 214 215def w_avg(X, sX) : 216 ''' 217 Compute variance-weighted average 218 219 Returns the value and SE of the weighted average of the elements of `X`, 220 with relative weights equal to their inverse variances (`1/sX**2`). 
221 222 **Parameters** 223 224 + `X`: array-like of elements to average 225 + `sX`: array-like of the corresponding SE values 226 227 **Tip** 228 229 If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets, 230 they may be rearranged using `zip()`: 231 232 ```python 233 foo = [(0, 1), (1, 0.5), (2, 0.5)] 234 print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333) 235 ``` 236 ''' 237 X = [ x for x in X ] 238 sX = [ sx for sx in sX ] 239 W = [ sx**-2 for sx in sX ] 240 W = [ w/sum(W) for w in W ] 241 Xavg = sum([ w*x for w,x in zip(W,X) ]) 242 sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5 243 return Xavg, sXavg 244 245 246def read_csv(filename, sep = ''): 247 ''' 248 Read contents of `filename` in csv format and return a list of dictionaries. 249 250 In the csv string, spaces before and after field separators (`','` by default) 251 are optional. 252 253 **Parameters** 254 255 + `filename`: the csv file to read 256 + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, 257 whichever appers most often in the contents of `filename`. 258 ''' 259 with open(filename) as fid: 260 txt = fid.read() 261 262 if sep == '': 263 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 264 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 265 return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]] 266 267 268def simulate_single_analysis( 269 sample = 'MYSAMPLE', 270 d13Cwg_VPDB = -4., d18Owg_VSMOW = 26., 271 d13C_VPDB = None, d18O_VPDB = None, 272 D47 = None, D48 = None, D49 = 0., D17O = 0., 273 a47 = 1., b47 = 0., c47 = -0.9, 274 a48 = 1., b48 = 0., c48 = -0.45, 275 Nominal_D47 = None, 276 Nominal_D48 = None, 277 Nominal_d13C_VPDB = None, 278 Nominal_d18O_VPDB = None, 279 ALPHA_18O_ACID_REACTION = None, 280 R13_VPDB = None, 281 R17_VSMOW = None, 282 R18_VSMOW = None, 283 LAMBDA_17 = None, 284 R18_VPDB = None, 285 ): 286 ''' 287 Compute working-gas delta values for a single analysis, assuming a stochastic working 288 gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values). 289 290 **Parameters** 291 292 + `sample`: sample name 293 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 294 (respectively –4 and +26 ‰ by default) 295 + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 296 + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies 297 of the carbonate sample 298 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and 299 Δ48 values if `D47` or `D48` are not specified 300 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 301 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 302 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 303 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 304 correction parameters (by default equal to the `D4xdata` default values) 305 306 Returns a dictionary with fields 307 `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`. 
308 ''' 309 310 if Nominal_d13C_VPDB is None: 311 Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB 312 313 if Nominal_d18O_VPDB is None: 314 Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB 315 316 if ALPHA_18O_ACID_REACTION is None: 317 ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION 318 319 if R13_VPDB is None: 320 R13_VPDB = D4xdata().R13_VPDB 321 322 if R17_VSMOW is None: 323 R17_VSMOW = D4xdata().R17_VSMOW 324 325 if R18_VSMOW is None: 326 R18_VSMOW = D4xdata().R18_VSMOW 327 328 if LAMBDA_17 is None: 329 LAMBDA_17 = D4xdata().LAMBDA_17 330 331 if R18_VPDB is None: 332 R18_VPDB = D4xdata().R18_VPDB 333 334 R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17 335 336 if Nominal_D47 is None: 337 Nominal_D47 = D47data().Nominal_D47 338 339 if Nominal_D48 is None: 340 Nominal_D48 = D48data().Nominal_D48 341 342 if d13C_VPDB is None: 343 if sample in Nominal_d13C_VPDB: 344 d13C_VPDB = Nominal_d13C_VPDB[sample] 345 else: 346 raise KeyError(f"Sample {sample} is missing d13C_VDP value, and it is not defined in Nominal_d13C_VDP.") 347 348 if d18O_VPDB is None: 349 if sample in Nominal_d18O_VPDB: 350 d18O_VPDB = Nominal_d18O_VPDB[sample] 351 else: 352 raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.") 353 354 if D47 is None: 355 if sample in Nominal_D47: 356 D47 = Nominal_D47[sample] 357 else: 358 raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.") 359 360 if D48 is None: 361 if sample in Nominal_D48: 362 D48 = Nominal_D48[sample] 363 else: 364 raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.") 365 366 X = D4xdata() 367 X.R13_VPDB = R13_VPDB 368 X.R17_VSMOW = R17_VSMOW 369 X.R18_VSMOW = R18_VSMOW 370 X.LAMBDA_17 = LAMBDA_17 371 X.R18_VPDB = R18_VPDB 372 X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17 373 374 R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios( 375 R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000), 376 R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000), 377 ) 378 R45, R46, R47, R48, R49 = X.compute_isobar_ratios( 379 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 380 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 381 D17O=D17O, D47=D47, D48=D48, D49=D49, 382 ) 383 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios( 384 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 385 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 386 D17O=D17O, 387 ) 388 389 d45 = 1000 * (R45/R45wg - 1) 390 d46 = 1000 * (R46/R46wg - 1) 391 d47 = 1000 * (R47/R47wg - 1) 392 d48 = 1000 * (R48/R48wg - 1) 393 d49 = 1000 * (R49/R49wg - 1) 394 395 for k in range(3): # dumb iteration to adjust for small changes in d47 396 R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch 397 R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch 398 d47 = 1000 * (R47raw/R47wg - 1) 399 d48 = 1000 * (R48raw/R48wg - 1) 400 401 return dict( 402 Sample = sample, 403 D17O = D17O, 404 d13Cwg_VPDB = d13Cwg_VPDB, 405 d18Owg_VSMOW = d18Owg_VSMOW, 406 d45 = d45, 407 d46 = d46, 408 d47 = d47, 409 d48 = d48, 410 d49 = d49, 411 ) 412 413 414def virtual_data( 415 samples = [], 416 a47 = 1., b47 = 0., c47 = -0.9, 417 a48 = 1., b48 = 0., c48 = -0.45, 418 rd45 = 0.020, rd46 = 0.060, 419 rD47 = 0.015, rD48 = 0.045, 420 d13Cwg_VPDB = None, d18Owg_VSMOW = None, 421 session = None, 422 Nominal_D47 = None, Nominal_D48 = None, 423 Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None, 424 ALPHA_18O_ACID_REACTION = None, 425 R13_VPDB = None, 426 
R17_VSMOW = None, 427 R18_VSMOW = None, 428 LAMBDA_17 = None, 429 R18_VPDB = None, 430 seed = 0, 431 shuffle = True, 432 ): 433 ''' 434 Return list with simulated analyses from a single session. 435 436 **Parameters** 437 438 + `samples`: a list of entries; each entry is a dictionary with the following fields: 439 * `Sample`: the name of the sample 440 * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 441 * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample 442 * `N`: how many analyses to generate for this sample 443 + `a47`: scrambling factor for Δ47 444 + `b47`: compositional nonlinearity for Δ47 445 + `c47`: working gas offset for Δ47 446 + `a48`: scrambling factor for Δ48 447 + `b48`: compositional nonlinearity for Δ48 448 + `c48`: working gas offset for Δ48 449 + `rd45`: analytical repeatability of δ45 450 + `rd46`: analytical repeatability of δ46 451 + `rD47`: analytical repeatability of Δ47 452 + `rD48`: analytical repeatability of Δ48 453 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 454 (by default equal to the `simulate_single_analysis` default values) 455 + `session`: name of the session (no name by default) 456 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values 457 if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults) 458 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 459 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 460 (by default equal to the `simulate_single_analysis` defaults) 461 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 462 (by default equal to the `simulate_single_analysis` defaults) 463 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 464 correction parameters (by default equal to the `simulate_single_analysis` default) 465 + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations 466 + `shuffle`: randomly reorder the sequence of analyses 467 468 469 Here is an example of using this method to generate an arbitrary combination of 470 anchors and unknowns for a bunch of sessions: 471 472 ```py 473 .. include:: ../code_examples/virtual_data/example.py 474 ``` 475 476 This should output something like: 477 478 ``` 479 .. 
include:: ../code_examples/virtual_data/output.txt 480 ``` 481 ''' 482 483 kwargs = locals().copy() 484 485 from numpy import random as nprandom 486 if seed: 487 rng = nprandom.default_rng(seed) 488 else: 489 rng = nprandom.default_rng() 490 491 N = sum([s['N'] for s in samples]) 492 errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 493 errors45 *= rd45 / stdev(errors45) # scale errors to rd45 494 errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 495 errors46 *= rd46 / stdev(errors46) # scale errors to rd46 496 errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 497 errors47 *= rD47 / stdev(errors47) # scale errors to rD47 498 errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 499 errors48 *= rD48 / stdev(errors48) # scale errors to rD48 500 501 k = 0 502 out = [] 503 for s in samples: 504 kw = {} 505 kw['sample'] = s['Sample'] 506 kw = { 507 **kw, 508 **{var: kwargs[var] 509 for var in [ 510 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION', 511 'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB', 512 'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB', 513 'a47', 'b47', 'c47', 'a48', 'b48', 'c48', 514 ] 515 if kwargs[var] is not None}, 516 **{var: s[var] 517 for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O'] 518 if var in s}, 519 } 520 521 sN = s['N'] 522 while sN: 523 out.append(simulate_single_analysis(**kw)) 524 out[-1]['d45'] += errors45[k] 525 out[-1]['d46'] += errors46[k] 526 out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47 527 out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48 528 sN -= 1 529 k += 1 530 531 if session is not None: 532 for r in out: 533 r['Session'] = session 534 535 if shuffle: 536 nprandom.shuffle(out) 537 538 return out 539 540def table_of_samples( 541 data47 = None, 542 data48 = None, 543 dir = 'output', 544 filename = None, 545 save_to_file = True, 546 print_out = True, 547 output = None, 548 ): 549 ''' 550 Print out, save to disk and/or return a combined table of samples 551 for a pair of `D47data` and `D48data` objects. 
552 553 **Parameters** 554 555 + `data47`: `D47data` instance 556 + `data48`: `D48data` instance 557 + `dir`: the directory in which to save the table 558 + `filename`: the name to the csv file to write to 559 + `save_to_file`: whether to save the table to disk 560 + `print_out`: whether to print out the table 561 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 562 if set to `'raw'`: return a list of list of strings 563 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 564 ''' 565 if data47 is None: 566 if data48 is None: 567 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 568 else: 569 return data48.table_of_samples( 570 dir = dir, 571 filename = filename, 572 save_to_file = save_to_file, 573 print_out = print_out, 574 output = output 575 ) 576 else: 577 if data48 is None: 578 return data47.table_of_samples( 579 dir = dir, 580 filename = filename, 581 save_to_file = save_to_file, 582 print_out = print_out, 583 output = output 584 ) 585 else: 586 out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw') 587 out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw') 588 out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:]) 589 590 if save_to_file: 591 if not os.path.exists(dir): 592 os.makedirs(dir) 593 if filename is None: 594 filename = f'D47D48_samples.csv' 595 with open(f'{dir}/{filename}', 'w') as fid: 596 fid.write(make_csv(out)) 597 if print_out: 598 print('\n'+pretty_table(out)) 599 if output == 'raw': 600 return out 601 elif output == 'pretty': 602 return pretty_table(out) 603 604 605def table_of_sessions( 606 data47 = None, 607 data48 = None, 608 dir = 'output', 609 filename = None, 610 save_to_file = True, 611 print_out = True, 612 output = None, 613 ): 614 ''' 615 Print out, save to disk and/or return a combined table of sessions 616 for a pair of `D47data` and `D48data` objects. 
617 ***Only applicable if the sessions in `data47` and those in `data48` 618 consist of the exact same sets of analyses.*** 619 620 **Parameters** 621 622 + `data47`: `D47data` instance 623 + `data48`: `D48data` instance 624 + `dir`: the directory in which to save the table 625 + `filename`: the name to the csv file to write to 626 + `save_to_file`: whether to save the table to disk 627 + `print_out`: whether to print out the table 628 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 629 if set to `'raw'`: return a list of list of strings 630 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 631 ''' 632 if data47 is None: 633 if data48 is None: 634 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 635 else: 636 return data48.table_of_sessions( 637 dir = dir, 638 filename = filename, 639 save_to_file = save_to_file, 640 print_out = print_out, 641 output = output 642 ) 643 else: 644 if data48 is None: 645 return data47.table_of_sessions( 646 dir = dir, 647 filename = filename, 648 save_to_file = save_to_file, 649 print_out = print_out, 650 output = output 651 ) 652 else: 653 out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw') 654 out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw') 655 for k,x in enumerate(out47[0]): 656 if k>7: 657 out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47') 658 out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48') 659 out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:]) 660 661 if save_to_file: 662 if not os.path.exists(dir): 663 os.makedirs(dir) 664 if filename is None: 665 filename = f'D47D48_sessions.csv' 666 with open(f'{dir}/{filename}', 'w') as fid: 667 fid.write(make_csv(out)) 668 if print_out: 669 print('\n'+pretty_table(out)) 670 if output == 'raw': 671 return out 672 elif output == 'pretty': 673 return pretty_table(out) 674 675 676def table_of_analyses( 677 data47 = None, 678 data48 = None, 679 dir = 'output', 680 filename = None, 681 save_to_file = True, 682 print_out = True, 683 output = None, 684 ): 685 ''' 686 Print out, save to disk and/or return a combined table of analyses 687 for a pair of `D47data` and `D48data` objects. 688 689 If the sessions in `data47` and those in `data48` do not consist of 690 the exact same sets of analyses, the table will have two columns 691 `Session_47` and `Session_48` instead of a single `Session` column. 
692 693 **Parameters** 694 695 + `data47`: `D47data` instance 696 + `data48`: `D48data` instance 697 + `dir`: the directory in which to save the table 698 + `filename`: the name to the csv file to write to 699 + `save_to_file`: whether to save the table to disk 700 + `print_out`: whether to print out the table 701 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 702 if set to `'raw'`: return a list of list of strings 703 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 704 ''' 705 if data47 is None: 706 if data48 is None: 707 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 708 else: 709 return data48.table_of_analyses( 710 dir = dir, 711 filename = filename, 712 save_to_file = save_to_file, 713 print_out = print_out, 714 output = output 715 ) 716 else: 717 if data48 is None: 718 return data47.table_of_analyses( 719 dir = dir, 720 filename = filename, 721 save_to_file = save_to_file, 722 print_out = print_out, 723 output = output 724 ) 725 else: 726 out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw') 727 out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw') 728 729 if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical 730 out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:]) 731 else: 732 out47[0][1] = 'Session_47' 733 out48[0][1] = 'Session_48' 734 out47 = transpose_table(out47) 735 out48 = transpose_table(out48) 736 out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:]) 737 738 if save_to_file: 739 if not os.path.exists(dir): 740 os.makedirs(dir) 741 if filename is None: 742 filename = f'D47D48_sessions.csv' 743 with open(f'{dir}/{filename}', 'w') as fid: 744 fid.write(make_csv(out)) 745 if print_out: 746 print('\n'+pretty_table(out)) 747 if output == 'raw': 748 return out 749 elif output == 'pretty': 750 return pretty_table(out) 751 752 753def _fullcovar(minresult, epsilon = 0.01, named = False): 754 ''' 755 Construct full covariance matrix in the case of constrained parameters 756 ''' 757 758 import asteval 759 760 def f(values): 761 interp = asteval.Interpreter() 762 for n,v in zip(minresult.var_names, values): 763 interp(f'{n} = {v}') 764 for q in minresult.params: 765 if minresult.params[q].expr: 766 interp(f'{q} = {minresult.params[q].expr}') 767 return np.array([interp.symtable[q] for q in minresult.params]) 768 769 # construct Jacobian 770 J = np.zeros((minresult.nvarys, len(minresult.params))) 771 X = np.array([minresult.params[p].value for p in minresult.var_names]) 772 sX = np.array([minresult.params[p].stderr for p in minresult.var_names]) 773 774 for j in range(minresult.nvarys): 775 x1 = [_ for _ in X] 776 x1[j] += epsilon * sX[j] 777 x2 = [_ for _ in X] 778 x2[j] -= epsilon * sX[j] 779 J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j]) 780 781 _names = [q for q in minresult.params] 782 _covar = J.T @ minresult.covar @ J 783 _se = np.diag(_covar)**.5 784 _correl = _covar.copy() 785 for k,s in enumerate(_se): 786 if s: 787 _correl[k,:] /= s 788 _correl[:,k] /= s 789 790 if named: 791 _covar = {i: {j:_covar[i,j] for j in minresult.params} for i in minresult.params} 792 _se = {i: _se[i] for i in minresult.params} 793 _correl = {i: {j:_correl[i,j] for j in minresult.params} for i in minresult.params} 794 795 return _names, _covar, _se, _correl 796 797 798class D4xdata(list): 799 ''' 800 Store and process data for a large set of Δ47 and/or Δ48 801 analyses, 
    '''

    ### 17O CORRECTION PARAMETERS
    R13_VPDB = 0.01118 # (Chang & Li, 1990)
    '''
    Absolute (13C/12C) ratio of VPDB.
    By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
    '''

    R18_VSMOW = 0.0020052 # (Baertschi, 1976)
    '''
    Absolute (18O/16O) ratio of VSMOW.
    By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
    '''

    LAMBDA_17 = 0.528 # (Barkan & Luz, 2005)
    '''
    Mass-dependent exponent for triple oxygen isotopes.
    By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
    '''

    R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
    '''
    Absolute (17O/16O) ratio of VSMOW.
    By default equal to 0.00038475
    ([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
    rescaled to `R13_VPDB`)
    '''

    R18_VPDB = R18_VSMOW * 1.03092
    '''
    Absolute (18O/16O) ratio of VPDB.
    By definition equal to `R18_VSMOW * 1.03092`.
    '''

    R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
    '''
    Absolute (17O/16O) ratio of VPDB.
    By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
    '''

    LEVENE_REF_SAMPLE = 'ETH-3'
    '''
    After the Δ4x standardization step, each sample is tested to
    assess whether the Δ4x variance within all analyses for that
    sample differs significantly from that observed for a given reference
    sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
    which yields a p-value corresponding to the null hypothesis that the
    underlying variances are equal).

    `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
    sample should be used as a reference for this test.
    '''

    ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite)
    '''
    Specifies the 18O/16O fractionation factor generally applicable
    to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
    `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.

    By default equal to 1.008129 (calcite reacted at 90 °C,
    [Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
    '''

    Nominal_d13C_VPDB = {
        'ETH-1': 2.02,
        'ETH-2': -10.17,
        'ETH-3': 1.71,
        } # (Bernasconi et al., 2018)
    '''
    Nominal δ13C_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d13C()`.

    By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    Nominal_d18O_VPDB = {
        'ETH-1': -2.19,
        'ETH-2': -18.69,
        'ETH-3': -1.78,
        } # (Bernasconi et al., 2018)
    '''
    Nominal δ18O_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d18O()`.

    By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    d13C_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ13C values:

    + `'none'`: do not apply any δ13C standardization.
    + `'1pt'`: within each session, offset all initial δ13C values so as to
        minimize the difference between final δ13C_VPDB values and
        `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ13C
        values so as to minimize the difference between final δ13C_VPDB
        values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
        is defined).
    '''

    d18O_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ18O values:

    + `'none'`: do not apply any δ18O standardization.
    + `'1pt'`: within each session, offset all initial δ18O values so as to
        minimize the difference between final δ18O_VPDB values and
        `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ18O
        values so as to minimize the difference between final δ18O_VPDB
        values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
        is defined).
    '''

    def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
        '''
        **Parameters**

        + `l`: a list of dictionaries, with each dictionary including at least the keys
            `Sample`, `d45`, `d46`, and `d47` or `d48`.
        + `mass`: `'47'` or `'48'`
        + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
        + `session`: define session name for analyses without a `Session` key
        + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

        Returns a `D4xdata` object derived from `list`.
        '''
        self._4x = mass
        self.verbose = verbose
        self.prefix = 'D4xdata'
        self.logfile = logfile
        list.__init__(self, l)
        self.Nf = None
        self.repeatability = {}
        self.refresh(session = session)


    def make_verbal(oldfun):
        '''
        Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
        '''
        @wraps(oldfun)
        def newfun(*args, verbose = '', **kwargs):
            myself = args[0]
            oldprefix = myself.prefix
            myself.prefix = oldfun.__name__
            if verbose != '':
                oldverbose = myself.verbose
                myself.verbose = verbose
            out = oldfun(*args, **kwargs)
            myself.prefix = oldprefix
            if verbose != '':
                myself.verbose = oldverbose
            return out
        return newfun


    def msg(self, txt):
        '''
        Log a message to `self.logfile`, and print it out if `verbose = True`
        '''
        self.log(txt)
        if self.verbose:
            print(f'{f"[{self.prefix}]":<16} {txt}')


    def vmsg(self, txt):
        '''
        Log a message to `self.logfile` and print it out
        '''
        self.log(txt)
        print(txt)


    def log(self, *txts):
        '''
        Log a message to `self.logfile`
        '''
        if self.logfile:
            with open(self.logfile, 'a') as fid:
                for txt in txts:
                    fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')


    def refresh(self, session = 'mySession'):
        '''
        Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
        '''
        self.fill_in_missing_info(session = session)
        self.refresh_sessions()
        self.refresh_samples()


    def refresh_sessions(self):
        '''
        Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
        to `False` for all sessions.
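
        After this call, each entry of `self.sessions` maps a session name to a
        dictionary with the structure sketched below (the two standardization-method
        fields echo the corresponding class-level defaults):

        ```py
        {'data': [...],  # the list of analyses belonging to this session
         'scrambling_drift': False,
         'slope_drift': False,
         'wg_drift': False,
         'd13C_standardization_method': '2pt',
         'd18O_standardization_method': '2pt'}
        ```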
        '''
        self.sessions = {
            s: {'data': [r for r in self if r['Session'] == s]}
            for s in sorted({r['Session'] for r in self})
            }
        for s in self.sessions:
            self.sessions[s]['scrambling_drift'] = False
            self.sessions[s]['slope_drift'] = False
            self.sessions[s]['wg_drift'] = False
            self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
            self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD


    def refresh_samples(self):
        '''
        Define `self.samples`, `self.anchors`, and `self.unknowns`.
        '''
        self.samples = {
            s: {'data': [r for r in self if r['Sample'] == s]}
            for s in sorted({r['Sample'] for r in self})
            }
        self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
        self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}


    def read(self, filename, sep = '', session = ''):
        '''
        Read a file in csv format to load data into a `D47data` object.

        In the csv file, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `filename`: the path of the file to read
        + `sep`: csv separator delimiting the fields
        + `session`: set `Session` field to this string for all analyses
        '''
        with open(filename) as fid:
            self.input(fid.read(), sep = sep, session = session)


    def input(self, txt, sep = '', session = ''):
        '''
        Read a `txt` string in csv format to load analysis data into a `D47data` object.

        In the csv string, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `txt`: the csv string to read
        + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
            whichever appears most often in `txt`.
        + `session`: set `Session` field to this string for all analyses
        '''
        if sep == '':
            sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
        txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
        data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

        if session != '':
            for r in data:
                r['Session'] = session

        self += data
        self.refresh()


    @make_verbal
    def wg(self, samples = None, a18_acid = None):
        '''
        Compute bulk composition of the working gas for each session based on
        the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
        `self.Nominal_d18O_VPDB`.

        **Parameters**

        + `samples`: the list of standards to use; by default, all samples defined
            in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`
        + `a18_acid`: the 18O/16O acid fractionation factor to use; by default,
            `self.ALPHA_18O_ACID_REACTION`
        '''

        self.msg('Computing WG composition:')

        if a18_acid is None:
            a18_acid = self.ALPHA_18O_ACID_REACTION
        if samples is None:
            samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]

        assert a18_acid, 'Acid fractionation factor should not be zero.'

        samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
        R45R46_standards = {}
        for sample in samples:
            d13C_vpdb = self.Nominal_d13C_VPDB[sample]
            d18O_vpdb = self.Nominal_d18O_VPDB[sample]
            R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
            R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
            R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

            C12_s = 1 / (1 + R13_s)
            C13_s = R13_s / (1 + R13_s)
            C16_s = 1 / (1 + R17_s + R18_s)
            C17_s = R17_s / (1 + R17_s + R18_s)
            C18_s = R18_s / (1 + R17_s + R18_s)

            C626_s = C12_s * C16_s ** 2
            C627_s = 2 * C12_s * C16_s * C17_s
            C628_s = 2 * C12_s * C16_s * C18_s
            C636_s = C13_s * C16_s ** 2
            C637_s = 2 * C13_s * C16_s * C17_s
            C727_s = C12_s * C17_s ** 2

            R45_s = (C627_s + C636_s) / C626_s
            R46_s = (C628_s + C637_s + C727_s) / C626_s
            R45R46_standards[sample] = (R45_s, R46_s)

        for s in self.sessions:
            db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
            assert db, f'No sample from {samples} found in session "{s}".'
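
            # For each session, estimate R45_wg (and likewise R46_wg) by regressing
            # the nominal R45 of the standards against their measured d45 and taking
            # the intercept at d45 = 0 (i.e. the composition of the WG itself); when
            # d45 = 0 is not reasonably bracketed by the data, fall back on averaging
            # R45 / (1 + d45/1000) over the standard analyses instead.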
#           dbsamples = sorted({r['Sample'] for r in db})

            X = [r['d45'] for r in db]
            Y = [R45R46_standards[r['Sample']][0] for r in db]
            x1, x2 = np.min(X), np.max(X)

            if x1 < x2:
                wgcoord = x1/(x1-x2)
            else:
                wgcoord = 999

            if wgcoord < -.5 or wgcoord > 1.5:
                # unreasonable to extrapolate to d45 = 0
                R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
            else:
                # d45 = 0 is reasonably well bracketed
                R45_wg = np.polyfit(X, Y, 1)[1]

            X = [r['d46'] for r in db]
            Y = [R45R46_standards[r['Sample']][1] for r in db]
            x1, x2 = np.min(X), np.max(X)

            if x1 < x2:
                wgcoord = x1/(x1-x2)
            else:
                wgcoord = 999

            if wgcoord < -.5 or wgcoord > 1.5:
                # unreasonable to extrapolate to d46 = 0
                R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
            else:
                # d46 = 0 is reasonably well bracketed
                R46_wg = np.polyfit(X, Y, 1)[1]

            d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

            self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

            self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
            self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
            for r in self.sessions[s]['data']:
                r['d13Cwg_VPDB'] = d13Cwg_VPDB
                r['d18Owg_VSMOW'] = d18Owg_VSMOW


    def compute_bulk_delta(self, R45, R46, D17O = 0):
        '''
        Compute δ13C_VPDB and δ18O_VSMOW,
        by solving the generalized form of equation (17) from
        [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
        assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
        solving the corresponding second-order Taylor polynomial.
        (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))

        Concretely, with `x = d18O_VSMOW / 1000`, the code below solves the
        quadratic `aa * x**2 + bb * x + cc = 0`, then recovers R18, R17, R13,
        and hence δ13C_VPDB, from the fitted δ18O_VSMOW.
        '''

        K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

        A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
        B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
        C = 2 * self.R18_VSMOW
        D = -R46

        aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
        bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
        cc = A + B + C + D

        d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

        R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
        R17 = K * R18 ** self.LAMBDA_17
        R13 = R45 - 2 * R17

        d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

        return d13C_VPDB, d18O_VSMOW


    @make_verbal
    def crunch(self, verbose = ''):
        '''
        Compute bulk composition and raw clumped isotope anomalies for all analyses.
        '''
        for r in self:
            self.compute_bulk_and_clumping_deltas(r)
        self.standardize_d13C()
        self.standardize_d18O()
        self.msg(f"Crunched {len(self)} analyses.")


    def fill_in_missing_info(self, session = 'mySession'):
        '''
        Fill in optional fields with default values
        '''
        for i,r in enumerate(self):
            if 'D17O' not in r:
                r['D17O'] = 0.
            if 'UID' not in r:
                r['UID'] = f'{i+1}'
            if 'Session' not in r:
                r['Session'] = session
            for k in ['d47', 'd48', 'd49']:
                if k not in r:
                    r[k] = np.nan


    def standardize_d13C(self):
        '''
        Perform δ13C standardization within each session `s` according to
        `self.sessions[s]['d13C_standardization_method']`, which is defined by default
        by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
        may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
                X,Y = zip(*XY)
                if self.sessions[s]['d13C_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] += offset
                elif self.sessions[s]['d13C_standardization_method'] == '2pt':
                    a,b = np.polyfit(X,Y,1)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

    def standardize_d18O(self):
        '''
        Perform δ18O standardization within each session `s` according to
        `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
        which is defined by default by `D47data.refresh_sessions()` as equal to
        `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
                X,Y = zip(*XY)
                Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
                if self.sessions[s]['d18O_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] += offset
                elif self.sessions[s]['d18O_standardization_method'] == '2pt':
                    a,b = np.polyfit(X,Y,1)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b


    def compute_bulk_and_clumping_deltas(self, r):
        '''
        Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
        '''

        # Compute working gas R13, R18, and isobar ratios
        R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
        R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
        R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

        # Compute analyte isobar ratios
        R45 = (1 + r['d45'] / 1000) * R45_wg
        R46 = (1 + r['d46'] / 1000) * R46_wg
        R47 = (1 + r['d47'] / 1000) * R47_wg
        R48 = (1 + r['d48'] / 1000) * R48_wg
        R49 = (1 + r['d49'] / 1000) * R49_wg

        r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
        R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
        R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

        # Compute stochastic isobar ratios of the analyte
        R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
            R13, R18, D17O = r['D17O']
            )

        # Check that R45/R45stoch and R46/R46stoch are undistinguishable from 1,
        # and raise a warning if the corresponding anomalies exceed 0.05 ppm.
        if (R45 / R45stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
        if (R46 / R46stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

        # Compute raw clumped isotope anomalies
        r['D47raw'] = 1000 * (R47 / R47stoch - 1)
        r['D48raw'] = 1000 * (R48 / R48stoch - 1)
        r['D49raw'] = 1000 * (R49 / R49stoch - 1)


    def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0):
        '''
        Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
        optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
        anomalies (`D47`, `D48`, `D49`), all expressed in permil.
        '''

        # Compute R17
        R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

        # Compute isotope concentrations
        C12 = (1 + R13) ** -1
        C13 = C12 * R13
        C16 = (1 + R17 + R18) ** -1
        C17 = C16 * R17
        C18 = C16 * R18

        # Compute stochastic isotopologue concentrations
        C626 = C16 * C12 * C16
        C627 = C16 * C12 * C17 * 2
        C628 = C16 * C12 * C18 * 2
        C636 = C16 * C13 * C16
        C637 = C16 * C13 * C17 * 2
        C638 = C16 * C13 * C18 * 2
        C727 = C17 * C12 * C17
        C728 = C17 * C12 * C18 * 2
        C737 = C17 * C13 * C17
        C738 = C17 * C13 * C18 * 2
        C828 = C18 * C12 * C18
        C838 = C18 * C13 * C18

        # Compute stochastic isobar ratios
        R45 = (C636 + C627) / C626
        R46 = (C628 + C637 + C727) / C626
        R47 = (C638 + C728 + C737) / C626
        R48 = (C738 + C828) / C626
        R49 = C838 / C626

        # Account for clumped isotope anomalies relative to the stochastic distribution
        R47 *= 1 + D47 / 1000
        R48 *= 1 + D48 / 1000
        R49 *= 1 + D49 / 1000

        # Return isobar ratios
        return R45, R46, R47, R48, R49


    def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
        '''
        Split unknown samples by UID (treat all analyses as different samples)
        or by session (treat analyses of a given sample in different sessions as
        different samples).

        **Parameters**

        + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
        + `grouping`: `by_uid` | `by_session`
        '''
        if samples_to_split == 'all':
            samples_to_split = [s for s in self.unknowns]
        gkeys = {'by_uid': 'UID', 'by_session': 'Session'}
        self.grouping = grouping.lower()
        if self.grouping in gkeys:
            gkey = gkeys[self.grouping]
        for r in self:
            if r['Sample'] in samples_to_split:
                r['Sample_original'] = r['Sample']
                r['Sample'] = f"{r['Sample']}__{r[gkey]}"
            elif r['Sample'] in self.unknowns:
                r['Sample_original'] = r['Sample']
        self.refresh_samples()


    def unsplit_samples(self, tables = False):
        '''
        Reverse the effects of `D47data.split_samples()`.

        This should only be used after `D4xdata.standardize()` with `method='pooled'`.

        After `D4xdata.standardize()` with `method='indep_sessions'`, one should
        probably use `D4xdata.combine_samples()` instead to reverse the effects of
        `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
        effects of `D47data.split_samples()` with `grouping='by_session'` (because in
        that case session-averaged Δ4x values are statistically independent).
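
        A minimal sketch of the intended workflow, assuming `mydata` has already
        been read in and crunched:

        ```py
        mydata.split_samples(grouping = 'by_session')
        mydata.standardize(method = 'pooled')
        mydata.unsplit_samples()
        ```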
        '''
        unknowns_old = sorted({s for s in self.unknowns})
        CM_old = self.standardization.covar[:,:]
        VD_old = self.standardization.params.valuesdict().copy()
        vars_old = self.standardization.var_names

        unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

        Ns = len(vars_old) - len(unknowns_old)
        vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
        VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

        W = np.zeros((len(vars_new), len(vars_old)))
        W[:Ns,:Ns] = np.eye(Ns)
        for u in unknowns_new:
            splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
            if self.grouping == 'by_session':
                weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
            elif self.grouping == 'by_uid':
                weights = [1 for s in splits]
            sw = sum(weights)
            weights = [w/sw for w in weights]
            W[vars_new.index(f'D{self._4x}_{pf(u)}'), [vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

        CM_new = W @ CM_old @ W.T
        V = W @ np.array([[VD_old[k]] for k in vars_old])
        VD_new = {k: v[0] for k,v in zip(vars_new, V)}

        self.standardization.covar = CM_new
        self.standardization.params.valuesdict = lambda : VD_new
        self.standardization.var_names = vars_new

        for r in self:
            if r['Sample'] in self.unknowns:
                r['Sample_split'] = r['Sample']
                r['Sample'] = r['Sample_original']

        self.refresh_samples()
        self.consolidate_samples()
        self.repeatabilities()

        if tables:
            self.table_of_analyses()
            self.table_of_samples()

    def assign_timestamps(self):
        '''
        Assign a time field `t` of type `float` to each analysis.

        If `TimeTag` is one of the data fields, `t` is equal within a given session
        to `TimeTag` minus the mean value of `TimeTag` for that session.
        Otherwise, each analysis is assigned an integer index within its session,
        and `t` is defined as that index minus its session average.
        '''
        for session in self.sessions:
            sdata = self.sessions[session]['data']
            try:
                t0 = np.mean([r['TimeTag'] for r in sdata])
                for r in sdata:
                    r['t'] = r['TimeTag'] - t0
            except KeyError:
                t0 = (len(sdata) - 1) / 2
                for t,r in enumerate(sdata):
                    r['t'] = t - t0


    def report(self):
        '''
        Print a report on the standardization fit.
        Only applicable after `D4xdata.standardize(method='pooled')`.
        '''
        report_fit(self.standardization)


    def combine_samples(self, sample_groups):
        '''
        Combine analyses of different samples to compute weighted average Δ4x
        and new error (co)variances corresponding to the groups defined by the `sample_groups`
        dictionary.

        Caution: samples are weighted by number of replicate analyses, which is a
        reasonable default behavior but is not always optimal (e.g., in the case of strongly
        correlated analytical errors for one or more samples).
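
        Concretely, each group value is the replicate-weighted mean
        `sum(N_i * D4x_i) / sum(N_i)` over the samples in that group, and the group
        (co)variances are propagated as `CM_new = W @ CM_old @ W.T`, where `W` is
        the matrix of normalized weights built below.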

        Returns a tuple of:

        + the list of group names
        + an array of the corresponding Δ4x values
        + the corresponding (co)variance matrix

        **Parameters**

        + `sample_groups`: a dictionary of the form:
        ```py
        {'group1': ['sample_1', 'sample_2'],
         'group2': ['sample_3', 'sample_4', 'sample_5']}
        ```
        '''

        samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
        groups = sorted(sample_groups.keys())
        group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
        D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
        CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
        W = np.array([
            [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
            for j in groups])
        D4x_new = W @ D4x_old
        CM_new = W @ CM_old @ W.T

        return groups, D4x_new[:,0], CM_new


    @make_verbal
    def standardize(self,
        method = 'pooled',
        weighted_sessions = [],
        consolidate = True,
        consolidate_tables = False,
        consolidate_plots = False,
        constraints = {},
        ):
        '''
        Compute absolute Δ4x values for all replicate analyses and for sample averages.
        If the `method` argument is set to `'pooled'`, the standardization processes all sessions
        in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
        i.e. that their true Δ4x value does not change between sessions
        ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
        `'indep_sessions'`, the standardization processes each session independently, based only
        on anchor analyses.
        '''

        self.standardization_method = method
        self.assign_timestamps()

        if method == 'pooled':
            if weighted_sessions:
                for session_group in weighted_sessions:
                    if self._4x == '47':
                        X = D47data([r for r in self if r['Session'] in session_group])
                    elif self._4x == '48':
                        X = D48data([r for r in self if r['Session'] in session_group])
                    X.Nominal_D4x = self.Nominal_D4x.copy()
                    X.refresh()
                    result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
                    w = np.sqrt(result.redchi)
                    self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
                    for r in X:
                        r[f'wD{self._4x}raw'] *= w
            else:
                self.msg(f'All D{self._4x}raw weights set to 1 ‰')
                for r in self:
                    r[f'wD{self._4x}raw'] = 1.
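
            # Pooled standardization model ([Daëron, 2021]): for each analysis,
            #     D4x_raw = (a + a2*t) * D4x + (b + b2*t) * d4x + (c + c2*t)
            # where, within each session, a is the scrambling factor, b the
            # compositional slope, c the WG offset, and (a2, b2, c2) their optional
            # temporal drifts; session parameters and unknown-sample D4x values are
            # fitted simultaneously below.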
            params = Parameters()
            for k,session in enumerate(self.sessions):
                self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
                self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
                self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
                s = pf(session)
                params.add(f'a_{s}', value = 0.9)
                params.add(f'b_{s}', value = 0.)
                params.add(f'c_{s}', value = -0.9)
                params.add(f'a2_{s}', value = 0.,
#                   vary = self.sessions[session]['scrambling_drift'],
                    )
                params.add(f'b2_{s}', value = 0.,
#                   vary = self.sessions[session]['slope_drift'],
                    )
                params.add(f'c2_{s}', value = 0.,
#                   vary = self.sessions[session]['wg_drift'],
                    )
                if not self.sessions[session]['scrambling_drift']:
                    params[f'a2_{s}'].expr = '0'
                if not self.sessions[session]['slope_drift']:
                    params[f'b2_{s}'].expr = '0'
                if not self.sessions[session]['wg_drift']:
                    params[f'c2_{s}'].expr = '0'

            for sample in self.unknowns:
                params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

            for k in constraints:
                params[k].expr = constraints[k]

            def residuals(p):
                R = []
                for r in self:
                    session = pf(r['Session'])
                    sample = pf(r['Sample'])
                    if r['Sample'] in self.Nominal_D4x:
                        R += [ (
                            r[f'D{self._4x}raw'] - (
                                p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
                                + p[f'b_{session}'] * r[f'd{self._4x}']
                                + p[f'c_{session}']
                                + r['t'] * (
                                    p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
                                    + p[f'b2_{session}'] * r[f'd{self._4x}']
                                    + p[f'c2_{session}']
                                    )
                                )
                            ) / r[f'wD{self._4x}raw'] ]
                    else:
                        R += [ (
                            r[f'D{self._4x}raw'] - (
                                p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
                                + p[f'b_{session}'] * r[f'd{self._4x}']
                                + p[f'c_{session}']
                                + r['t'] * (
                                    p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
                                    + p[f'b2_{session}'] * r[f'd{self._4x}']
                                    + p[f'c2_{session}']
                                    )
                                )
                            ) / r[f'wD{self._4x}raw'] ]
                return R

            M = Minimizer(residuals, params)
            result = M.least_squares()
            self.Nf = result.nfree
            self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
            new_names, new_covar, new_se = _fullcovar(result)[:3]
            result.var_names = new_names
            result.covar = new_covar

            for r in self:
                s = pf(r["Session"])
                a = result.params.valuesdict()[f'a_{s}']
                b = result.params.valuesdict()[f'b_{s}']
                c = result.params.valuesdict()[f'c_{s}']
                a2 = result.params.valuesdict()[f'a2_{s}']
                b2 = result.params.valuesdict()[f'b2_{s}']
                c2 = result.params.valuesdict()[f'c2_{s}']
                r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])


            self.standardization = result

            for session in self.sessions:
                self.sessions[session]['Np'] = 3
                for k in ['scrambling', 'slope', 'wg']:
                    if self.sessions[session][f'{k}_drift']:
                        self.sessions[session]['Np'] += 1

            if consolidate:
                self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
            return result


        elif method == 'indep_sessions':

            if weighted_sessions:
                for session_group in weighted_sessions:
                    X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
                    X.Nominal_D4x = self.Nominal_D4x.copy()
                    X.refresh()
                    # This is only done to assign r['wD47raw'] for r in X:
                    X.standardize(method = method, weighted_sessions = [], consolidate = False)
                    self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
            else:
                self.msg('All weights set to 1 ‰')
                for r in self:
                    r[f'wD{self._4x}raw'] = 1
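
            # Each session is standardized independently, by weighted least squares
            # using its anchor analyses only: the design matrix A below keeps one
            # column per active parameter among (a, b, c, a2, b2, c2), and the
            # parameter covariance matrix CM is obtained from the normal equations.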
            for session in self.sessions:
                s = self.sessions[session]
                p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
                p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
                s['Np'] = sum(p_active)
                sdata = s['data']

                A = np.array([
                    [
                        self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
                        r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
                        1 / r[f'wD{self._4x}raw'],
                        self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
                        r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
                        r['t'] / r[f'wD{self._4x}raw']
                        ]
                    for r in sdata if r['Sample'] in self.anchors
                    ])[:,p_active] # only keep columns for the active parameters
                Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
                s['Na'] = Y.size
                CM = linalg.inv(A.T @ A)
                bf = (CM @ A.T @ Y).T[0,:]
                k = 0
                for n,a in zip(p_names, p_active):
                    if a:
                        s[n] = bf[k]
#                       self.msg(f'{n} = {bf[k]}')
                        k += 1
                    else:
                        s[n] = 0.
#                       self.msg(f'{n} = 0.0')

                for r in sdata:
                    a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
                    r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
                    r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

                # Expand CM into the full 6x6 covariance matrix, leaving zeros
                # for the inactive parameters:
                s['CM'] = np.zeros((6,6))
                i = 0
                k_active = [j for j,a in enumerate(p_active) if a]
                for j,a in enumerate(p_active):
                    if a:
                        s['CM'][j,k_active] = CM[i,:]
                        i += 1

            if not weighted_sessions:
                w = self.rmswd()['rmswd']
                for r in self:
                    r[f'wD{self._4x}'] *= w
                    r[f'wD{self._4x}raw'] *= w
                for session in self.sessions:
                    self.sessions[session]['CM'] *= w**2

            for session in self.sessions:
                s = self.sessions[session]
                s['SE_a'] = s['CM'][0,0]**.5
                s['SE_b'] = s['CM'][1,1]**.5
                s['SE_c'] = s['CM'][2,2]**.5
                s['SE_a2'] = s['CM'][3,3]**.5
                s['SE_b2'] = s['CM'][4,4]**.5
                s['SE_c2'] = s['CM'][5,5]**.5

            if not weighted_sessions:
                self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
            else:
                self.Nf = 0
                for sg in weighted_sessions:
                    self.Nf += self.rmswd(sessions = sg)['Nf']

            self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

            avgD4x = {
                sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
                for sample in self.samples
                }
            chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
            rD4x = (chi2/self.Nf)**.5
            self.repeatability[f'sigma_{self._4x}'] = rD4x

            if consolidate:
                self.consolidate(tables = consolidate_tables, plots = consolidate_plots)


    def standardization_error(self, session, d4x, D4x, t = 0):
        '''
        Compute the standardization error for a given session and
        (δ4x, Δ4x) composition.
        '''
        a = self.sessions[session]['a']
        b = self.sessions[session]['b']
        c = self.sessions[session]['c']
        a2 = self.sessions[session]['a2']
        b2 = self.sessions[session]['b2']
        c2 = self.sessions[session]['c2']
        CM = self.sessions[session]['CM']

        x, y = D4x, d4x
        z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
#       x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
        dxdy = -(b+b2*t) / (a+a2*t)
        dxdz = 1. / (a+a2*t)
        dxda = -x / (a+a2*t)
        dxdb = -y / (a+a2*t)
        dxdc = -1. / (a+a2*t)
        dxda2 = -x * t / (a+a2*t)
        dxdb2 = -y * t / (a+a2*t)
        dxdc2 = -t / (a+a2*t)
        V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
        sx = (V @ CM @ V.T) ** .5
        return sx


    @make_verbal
    def summary(self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        ):
        '''
        Print out and/or save to disk a summary of the standardization results.

        **Parameters**

        + `dir`: the directory in which to save the table
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the table to disk
        + `print_out`: whether to print out the table
        '''

        out = []
        out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
        out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
        out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
        out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
        out += [['Model degrees of freedom', f"{self.Nf}"]]
        out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
        out += [['Standardization method', self.standardization_method]]

        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_summary.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out, header = 0))


    @make_verbal
    def table_of_sessions(self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out and/or save to disk a table of sessions.
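
        For each session, the table lists the number of anchor and unknown analyses
        (`Na`, `Nu`), the working-gas composition, the session repeatabilities, and
        the best-fit standardization parameters `a`, `b`, `c` (along with their
        drift terms whenever the corresponding drifts are enabled).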

        **Parameters**

        + `dir`: the directory in which to save the table
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the table to disk
        + `print_out`: whether to print out the table
        + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
            if set to `'raw'`: return a list of lists of strings
            (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
        '''
        include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
        include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
        include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

        out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
        if include_a2:
            out[-1] += ['a2 ± SE']
        if include_b2:
            out[-1] += ['b2 ± SE']
        if include_c2:
            out[-1] += ['c2 ± SE']
        for session in self.sessions:
            out += [[
                session,
                f"{self.sessions[session]['Na']}",
                f"{self.sessions[session]['Nu']}",
                f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
                f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
                f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
                f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
                f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
                f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
                f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
                f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
                ]]
            if include_a2:
                if self.sessions[session]['scrambling_drift']:
                    out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
                else:
                    out[-1] += ['']
            if include_b2:
                if self.sessions[session]['slope_drift']:
                    out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
                else:
                    out[-1] += ['']
            if include_c2:
                if self.sessions[session]['wg_drift']:
                    out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
                else:
                    out[-1] += ['']

        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_sessions.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out))
        if output == 'raw':
            return out
        elif output == 'pretty':
            return pretty_table(out)


    @make_verbal
    def table_of_analyses(
        self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out and/or save to disk a table of analyses.

        **Parameters**

        + `dir`: the directory in which to save the table
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the table to disk
        + `print_out`: whether to print out the table
        + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
            if set to `'raw'`: return a list of lists of strings
            (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
        '''

        out = [['UID','Session','Sample']]
        extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
        for f in extra_fields:
            out[-1] += [f[0]]
        out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
        for r in self:
            out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
            for f in extra_fields:
                out[-1] += [f"{r[f[0]]:{f[1]}}"]
            out[-1] += [
                f"{r['d13Cwg_VPDB']:.3f}",
                f"{r['d18Owg_VSMOW']:.3f}",
                f"{r['d45']:.6f}",
                f"{r['d46']:.6f}",
                f"{r['d47']:.6f}",
                f"{r['d48']:.6f}",
                f"{r['d49']:.6f}",
                f"{r['d13C_VPDB']:.6f}",
                f"{r['d18O_VSMOW']:.6f}",
                f"{r['D47raw']:.6f}",
                f"{r['D48raw']:.6f}",
                f"{r['D49raw']:.6f}",
                f"{r[f'D{self._4x}']:.6f}"
                ]
        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_analyses.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out))
        return out

    @make_verbal
    def covar_table(
        self,
        correl = False,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out, save to disk and/or return the variance-covariance matrix of D4x
        for all unknown samples.

        **Parameters**

        + `correl`: if `True`, return correlations instead of covariances
        + `dir`: the directory in which to save the csv
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the csv
        + `print_out`: whether to print out the matrix
        + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
            if set to `'raw'`: return a list of lists of strings
            (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
        '''
        samples = sorted([u for u in self.unknowns])
        out = [[''] + samples]
        for s1 in samples:
            out.append([s1])
            for s2 in samples:
                if correl:
                    out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
                else:
                    out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                if correl:
                    filename = f'D{self._4x}_correl.csv'
                else:
                    filename = f'D{self._4x}_covar.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n'+pretty_table(out))
        if output == 'raw':
            return out
        elif output == 'pretty':
            return pretty_table(out)

    @make_verbal
    def table_of_samples(
        self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out, save to disk and/or return a table of samples.
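
        For example, to retrieve the table as raw strings without writing anything
        to disk (a minimal sketch, assuming `mydata` has already been standardized):

        ```py
        tbl = mydata.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
        ```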

        **Parameters**

        + `dir`: the directory in which to save the csv
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the csv
        + `print_out`: whether to print out the table
        + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
            if set to `'raw'`: return a list of lists of strings
            (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
        '''

        out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
        for sample in self.anchors:
            out += [[
                f"{sample}",
                f"{self.samples[sample]['N']}",
                f"{self.samples[sample]['d13C_VPDB']:.2f}",
                f"{self.samples[sample]['d18O_VSMOW']:.2f}",
                f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
                f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
                ]]
        for sample in self.unknowns:
            out += [[
                f"{sample}",
                f"{self.samples[sample]['N']}",
                f"{self.samples[sample]['d13C_VPDB']:.2f}",
                f"{self.samples[sample]['d18O_VSMOW']:.2f}",
                f"{self.samples[sample][f'D{self._4x}']:.4f}",
                f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
                f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
                f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
                f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
                ]]
        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_samples.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n'+pretty_table(out))
        if output == 'raw':
            return out
        elif output == 'pretty':
            return pretty_table(out)


    def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
        '''
        Generate session plots and save them to disk.

        **Parameters**

        + `dir`: the directory in which to save the plots
        + `figsize`: the width and height (in inches) of each plot
        + `filetype`: `'pdf'` or `'png'`
        + `dpi`: resolution for PNG output
        '''
        if not os.path.exists(dir):
            os.makedirs(dir)

        for session in self.sessions:
            sp = self.plot_single_session(session, xylimits = 'constant')
            ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
            ppl.close(sp.fig)


    @make_verbal
    def consolidate_samples(self):
        '''
        Compile various statistics for each sample.

        For each anchor sample:

        + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
        + `SE_D47` or `SE_D48`: set to zero by definition

        For each unknown sample:

        + `D47` or `D48`: the standardized Δ4x value for this unknown
        + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

        For each anchor and unknown:

        + `N`: the total number of analyses of this sample
        + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
        + `d13C_VPDB`: the average δ13C_VPDB value for this sample
        + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
        + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
            variance, indicating whether the Δ4x repeatability of this sample differs significantly from
            that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
        '''
        D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
        for sample in self.samples:
            self.samples[sample]['N'] = len(self.samples[sample]['data'])
            if self.samples[sample]['N'] > 1:
                self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

            self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
            self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

            D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
            if len(D4x_pop) > 2:
                self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

        if self.standardization_method == 'pooled':
            for sample in self.anchors:
                self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
                self.samples[sample][f'SE_D{self._4x}'] = 0.
            for sample in self.unknowns:
                self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
                try:
                    self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
                except ValueError:
                    # when `sample` is constrained by self.standardize(constraints = {...}),
                    # it is no longer listed in self.standardization.var_names.
                    # Temporary fix: define SE as zero for now
                    self.samples[sample][f'SE_D{self._4x}'] = 0.

        elif self.standardization_method == 'indep_sessions':
            for sample in self.anchors:
                self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
                self.samples[sample][f'SE_D{self._4x}'] = 0.
            for sample in self.unknowns:
                self.msg(f'Consolidating sample {sample}')
                self.unknowns[sample][f'session_D{self._4x}'] = {}
                session_avg = []
                for session in self.sessions:
                    sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
                    if sdata:
                        self.msg(f'{sample} found in session {session}')
                        avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
                        avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
                        # !! TODO: sigma_s below does not account for temporal changes in standardization error
                        sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
                        sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
                        session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
                        self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
                self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
                weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
                wsum = sum([weights[s] for s in weights])
                for s in weights:
                    self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

        for r in self:
            r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']


    def consolidate_sessions(self):
        '''
        Compute various statistics for each session.

        + `Na`: number of anchor analyses in the session
        + `Nu`: number of unknown analyses in the session
        + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
        + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
        + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
        + `a`: scrambling factor
        + `b`: compositional slope
        + `c`: WG offset
        + `SE_a`: model standard error of `a`
        + `SE_b`: model standard error of `b`
        + `SE_c`: model standard error of `c`
        + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
        + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
        + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
        + `a2`: scrambling factor drift
        + `b2`: compositional slope drift
        + `c2`: WG offset drift
        + `Np`: number of standardization parameters to fit
        + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
        + `d13Cwg_VPDB`: δ13C_VPDB of WG
        + `d18Owg_VSMOW`: δ18O_VSMOW of WG
        '''
        for session in self.sessions:
            if 'd13Cwg_VPDB' not in self.sessions[session]:
                self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
            if 'd18Owg_VSMOW' not in self.sessions[session]:
                self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
            self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
            self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

            self.msg(f'Computing repeatabilities for session {session}')
            self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
            self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
            self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

        if self.standardization_method == 'pooled':
            for session in self.sessions:

                self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
                i = self.standardization.var_names.index(f'a_{pf(session)}')
                self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
                self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
                i = self.standardization.var_names.index(f'b_{pf(session)}')
                self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5

                self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
                i = self.standardization.var_names.index(f'c_{pf(session)}')
                self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5

                self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
                if self.sessions[session]['scrambling_drift']:
                    i = self.standardization.var_names.index(f'a2_{pf(session)}')
                    self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
                else:
                    self.sessions[session]['SE_a2'] = 0.

                self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
                if self.sessions[session]['slope_drift']:
                    i = self.standardization.var_names.index(f'b2_{pf(session)}')
                    self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
                else:
                    self.sessions[session]['SE_b2'] = 0.

                self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
                if self.sessions[session]['wg_drift']:
                    i = self.standardization.var_names.index(f'c2_{pf(session)}')
                    self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
                else:
                    self.sessions[session]['SE_c2'] = 0.

                # Assemble the 6x6 covariance matrix of (a, b, c, a2, b2, c2),
                # leaving zeros for drift parameters that were not fitted:
                i = self.standardization.var_names.index(f'a_{pf(session)}')
                j = self.standardization.var_names.index(f'b_{pf(session)}')
                k = self.standardization.var_names.index(f'c_{pf(session)}')
                CM = np.zeros((6,6))
                CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
                try:
                    i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
                    CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
                    CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
                    try:
                        j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                        CM[3,4] = self.standardization.covar[i2,j2]
                        CM[4,3] = self.standardization.covar[j2,i2]
                    except ValueError:
                        pass
                    try:
                        k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                        CM[3,5] = self.standardization.covar[i2,k2]
                        CM[5,3] = self.standardization.covar[k2,i2]
                    except ValueError:
                        pass
                except ValueError:
                    pass
                try:
                    j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                    CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
                    CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
                    try:
                        k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                        CM[4,5] = self.standardization.covar[j2,k2]
                        CM[5,4] = self.standardization.covar[k2,j2]
                    except ValueError:
                        pass
                except ValueError:
                    pass
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
                    CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
                except ValueError:
                    pass

                self.sessions[session]['CM'] = CM

        elif self.standardization_method == 'indep_sessions':
            pass # Not implemented yet


    @make_verbal
    def repeatabilities(self):
        '''
        Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
        (for all samples, for anchors, and for unknowns).
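
        A minimal usage sketch, assuming `mydata` is a `D47data` instance that has
        already been standardized and consolidated:

        ```py
        mydata.repeatabilities()
        print(mydata.repeatability['r_D47'])
        ```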
2304 ''' 2305 self.msg('Computing reproducibilities for all sessions') 2306 2307 self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors') 2308 self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors') 2309 self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors') 2310 self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns') 2311 self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples') 2312 2313 2314 @make_verbal 2315 def consolidate(self, tables = True, plots = True): 2316 ''' 2317 Collect information about samples, sessions and repeatabilities. 2318 ''' 2319 self.consolidate_samples() 2320 self.consolidate_sessions() 2321 self.repeatabilities() 2322 2323 if tables: 2324 self.summary() 2325 self.table_of_sessions() 2326 self.table_of_analyses() 2327 self.table_of_samples() 2328 2329 if plots: 2330 self.plot_sessions() 2331 2332 2333 @make_verbal 2334 def rmswd(self, 2335 samples = 'all samples', 2336 sessions = 'all sessions', 2337 ): 2338 ''' 2339 Compute the χ2, root mean squared weighted deviation 2340 (i.e. reduced χ2), and corresponding degrees of freedom of the 2341 Δ4x values for samples in `samples` and sessions in `sessions`. 2342 2343 Only used in `D4xdata.standardize()` with `method='indep_sessions'`. 2344 ''' 2345 if samples == 'all samples': 2346 mysamples = [k for k in self.samples] 2347 elif samples == 'anchors': 2348 mysamples = [k for k in self.anchors] 2349 elif samples == 'unknowns': 2350 mysamples = [k for k in self.unknowns] 2351 else: 2352 mysamples = samples 2353 2354 if sessions == 'all sessions': 2355 sessions = [k for k in self.sessions] 2356 2357 chisq, Nf = 0, 0 2358 for sample in mysamples : 2359 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2360 if len(G) > 1 : 2361 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2362 Nf += (len(G) - 1) 2363 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2364 r = (chisq / Nf)**.5 if Nf > 0 else 0 2365 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2366 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf} 2367 2368 2369 @make_verbal 2370 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2371 ''' 2372 Compute the repeatability of `[r[key] for r in self]` 2373 ''' 2374 2375 if samples == 'all samples': 2376 mysamples = [k for k in self.samples] 2377 elif samples == 'anchors': 2378 mysamples = [k for k in self.anchors] 2379 elif samples == 'unknowns': 2380 mysamples = [k for k in self.unknowns] 2381 else: 2382 mysamples = samples 2383 2384 if sessions == 'all sessions': 2385 sessions = [k for k in self.sessions] 2386 2387 if key in ['D47', 'D48']: 2388 # Full disclosure: the definition of Nf is tricky/debatable 2389 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2390 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2391 Nf = len(G) 2392# print(f'len(G) = {Nf}') 2393 Nf -= len([s for s in mysamples if s in self.unknowns]) 2394# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2395 for session in sessions: 2396 Np = len([ 2397 _ for _ in self.standardization.params 2398 if ( 2399 self.standardization.params[_].expr is not None 2400 and ( 2401 (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session)) 2402 or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == 
pf(session))
2403 )
2404 )
2405 ])
2406 # print(f'session {session}: {Np} parameters to consider')
2407 Na = len({
2408 r['Sample'] for r in self.sessions[session]['data']
2409 if r['Sample'] in self.anchors and r['Sample'] in mysamples
2410 })
2411 # print(f'session {session}: {Na} different anchors in that session')
2412 Nf -= min(Np, Na)
2413 # print(f'Nf = {Nf}')
2414
2415 # for sample in mysamples :
2416 # X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2417 # if len(X) > 1 :
2418 # chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2419 # if sample in self.unknowns:
2420 # Nf += len(X) - 1
2421 # else:
2422 # Nf += len(X)
2423 # if samples in ['anchors', 'all samples']:
2424 # Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2425 r = (chisq / Nf)**.5 if Nf > 0 else 0
2426
2427 else: # if key not in ['D47', 'D48']
2428 chisq, Nf = 0, 0
2429 for sample in mysamples :
2430 X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2431 if len(X) > 1 :
2432 Nf += len(X) - 1
2433 chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2434 r = (chisq / Nf)**.5 if Nf > 0 else 0
2435
2436 self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2437 return r
2438
2439 def sample_average(self, samples, weights = 'equal', normalize = True):
2440 '''
2441 Weighted average Δ4x value of a group of samples, accounting for covariance.
2442
2443 Returns the weighted average Δ4x value and associated SE
2444 of a group of samples. Weights are equal by default. If `normalize` is
2445 true, `weights` will be rescaled so that their sum equals 1.
2446
2447 **Examples**
2448
2449 ```python
2450 self.sample_average(['X','Y'], [1, 2])
2451 ```
2452
2453 returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2454 where Δ4x(X) and Δ4x(Y) are the average Δ4x
2455 values of samples X and Y, respectively.
2456
2457 ```python
2458 self.sample_average(['X','Y'], [1, -1], normalize = False)
2459 ```
2460
2461 returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2462 '''
2463 if weights == 'equal':
2464 weights = [1/len(samples)] * len(samples)
2465
2466 if normalize:
2467 s = sum(weights)
2468 if s:
2469 weights = [w/s for w in weights]
2470
2471 try:
2472 # indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2473 # C = self.standardization.covar[indices,:][:,indices]
2474 C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2475 X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2476 return correlated_sum(X, C, weights)
2477 except ValueError:
2478 return (0., 0.)
2479
2480
2481 def sample_D4x_covar(self, sample1, sample2 = None):
2482 '''
2483 Covariance between Δ4x values of samples
2484
2485 Returns the error covariance between the average Δ4x values of two
2486 samples. If only `sample1` is specified, or if `sample1 == sample2`,
2487 returns the Δ4x variance for that sample.
2488 ''' 2489 if sample2 is None: 2490 sample2 = sample1 2491 if self.standardization_method == 'pooled': 2492 i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}') 2493 j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}') 2494 return self.standardization.covar[i, j] 2495 elif self.standardization_method == 'indep_sessions': 2496 if sample1 == sample2: 2497 return self.samples[sample1][f'SE_D{self._4x}']**2 2498 else: 2499 c = 0 2500 for session in self.sessions: 2501 sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1] 2502 sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2] 2503 if sdata1 and sdata2: 2504 a = self.sessions[session]['a'] 2505 # !! TODO: CM below does not account for temporal changes in standardization parameters 2506 CM = self.sessions[session]['CM'][:3,:3] 2507 avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1]) 2508 avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1]) 2509 avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2]) 2510 avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2]) 2511 c += ( 2512 self.unknowns[sample1][f'session_D{self._4x}'][session][2] 2513 * self.unknowns[sample2][f'session_D{self._4x}'][session][2] 2514 * np.array([[avg_D4x_1, avg_d4x_1, 1]]) 2515 @ CM 2516 @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T 2517 ) / a**2 2518 return float(c) 2519 2520 def sample_D4x_correl(self, sample1, sample2 = None): 2521 ''' 2522 Correlation between Δ4x errors of samples 2523 2524 Returns the error correlation between the average Δ4x values of two samples. 2525 ''' 2526 if sample2 is None or sample2 == sample1: 2527 return 1. 2528 return ( 2529 self.sample_D4x_covar(sample1, sample2) 2530 / self.unknowns[sample1][f'SE_D{self._4x}'] 2531 / self.unknowns[sample2][f'SE_D{self._4x}'] 2532 ) 2533 2534 def plot_single_session(self, 2535 session, 2536 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2537 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2538 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2539 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2540 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2541 xylimits = 'free', # | 'constant' 2542 x_label = None, 2543 y_label = None, 2544 error_contour_interval = 'auto', 2545 fig = 'new', 2546 ): 2547 ''' 2548 Generate plot for a single session 2549 ''' 2550 if x_label is None: 2551 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2552 if y_label is None: 2553 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2554 2555 out = _SessionPlot() 2556 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2557 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2558 2559 if fig == 'new': 2560 out.fig = ppl.figure(figsize = (6,6)) 2561 ppl.subplots_adjust(.1,.1,.9,.9) 2562 2563 out.anchor_analyses, = ppl.plot( 2564 [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors], 2565 [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors], 2566 **kw_plot_anchors) 2567 out.unknown_analyses, = ppl.plot( 2568 [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns], 2569 [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns], 2570 **kw_plot_unknowns) 2571 
out.anchor_avg = ppl.plot(
2572 np.array([ np.array([
2573 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2574 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2575 ]) for sample in anchors]).T,
2576 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T,
2577 **kw_plot_anchor_avg)
2578 out.unknown_avg = ppl.plot(
2579 np.array([ np.array([
2580 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2581 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2582 ]) for sample in unknowns]).T,
2583 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T,
2584 **kw_plot_unknown_avg)
2585 if xylimits == 'constant':
2586 x = [r[f'd{self._4x}'] for r in self]
2587 y = [r[f'D{self._4x}'] for r in self]
2588 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2589 w, h = x2-x1, y2-y1
2590 x1 -= w/20
2591 x2 += w/20
2592 y1 -= h/20
2593 y2 += h/20
2594 ppl.axis([x1, x2, y1, y2])
2595 elif xylimits == 'free':
2596 x1, x2, y1, y2 = ppl.axis()
2597 else:
2598 x1, x2, y1, y2 = ppl.axis(xylimits)
2599
2600 if error_contour_interval != 'none':
2601 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2602 XI,YI = np.meshgrid(xi, yi)
2603 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2604 if error_contour_interval == 'auto':
2605 rng = np.max(SI) - np.min(SI)
2606 if rng <= 0.01:
2607 cinterval = 0.001
2608 elif rng <= 0.03:
2609 cinterval = 0.004
2610 elif rng <= 0.1:
2611 cinterval = 0.01
2612 elif rng <= 0.3:
2613 cinterval = 0.03
2614 elif rng <= 1.:
2615 cinterval = 0.1
2616 else:
2617 cinterval = 0.5
2618 else:
2619 cinterval = error_contour_interval
2620
2621 cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2622 out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2623 out.clabel = ppl.clabel(out.contour)
2624
2625 ppl.xlabel(x_label)
2626 ppl.ylabel(y_label)
2627 ppl.title(session, weight = 'bold')
2628 ppl.grid(alpha = .2)
2629 out.ax = ppl.gca()
2630
2631 return out
2632
2633 def plot_residuals(
2634 self,
2635 kde = False,
2636 hist = False,
2637 binwidth = 2/3,
2638 dir = 'output',
2639 filename = None,
2640 highlight = [],
2641 colors = None,
2642 figsize = None,
2643 dpi = 100,
2644 yspan = None,
2645 ):
2646 '''
2647 Plot residuals of each analysis as a function of time (actually, as a function of
2648 the order of analyses in the `D4xdata` object)
2649
2650 + `kde`: whether to add a kernel density estimate of residuals
2651 + `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2652 + `binwidth`: histogram bin width, in units of the Δ4x repeatability
2653 + `dir`: the directory in which to save the plot
2654 + `highlight`: a list of samples to highlight
2655 + `colors`: a dict of `{<sample>: <color>}` for all samples
2656 + `figsize`: (width, height) of figure
2657 + `dpi`: resolution for PNG output
2658 + `yspan`: factor controlling the range of y values shown in plot
2659 (by default: `yspan = 1.5 if kde else 1.0`)
2660 '''
2661
2662 from matplotlib import ticker
2663
2664 if yspan is None:
2665 if kde:
2666 yspan = 1.5
2667 else:
2668 yspan = 1.0
2669
2670 # Layout
2671 fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2672 if hist or kde:
2673 ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace =
-0.72) 2674 ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15) 2675 else: 2676 ppl.subplots_adjust(.08,.05,.78,.8) 2677 ax1 = ppl.subplot(111) 2678 2679 # Colors 2680 N = len(self.anchors) 2681 if colors is None: 2682 if len(highlight) > 0: 2683 Nh = len(highlight) 2684 if Nh == 1: 2685 colors = {highlight[0]: (0,0,0)} 2686 elif Nh == 3: 2687 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])} 2688 elif Nh == 4: 2689 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2690 else: 2691 colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2692 else: 2693 if N == 3: 2694 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2695 elif N == 4: 2696 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2697 else: 2698 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2699 2700 ppl.sca(ax1) 2701 2702 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2703 2704 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2705 2706 session = self[0]['Session'] 2707 x1 = 0 2708# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2709 x_sessions = {} 2710 one_or_more_singlets = False 2711 one_or_more_multiplets = False 2712 multiplets = set() 2713 for k,r in enumerate(self): 2714 if r['Session'] != session: 2715 x2 = k-1 2716 x_sessions[session] = (x1+x2)/2 2717 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2718 session = r['Session'] 2719 x1 = k 2720 singlet = len(self.samples[r['Sample']]['data']) == 1 2721 if not singlet: 2722 multiplets.add(r['Sample']) 2723 if r['Sample'] in self.unknowns: 2724 if singlet: 2725 one_or_more_singlets = True 2726 else: 2727 one_or_more_multiplets = True 2728 kw = dict( 2729 marker = 'x' if singlet else '+', 2730 ms = 4 if singlet else 5, 2731 ls = 'None', 2732 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2733 mew = 1, 2734 alpha = 0.2 if singlet else 1, 2735 ) 2736 if highlight and r['Sample'] not in highlight: 2737 kw['alpha'] = 0.2 2738 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2739 x2 = k 2740 x_sessions[session] = (x1+x2)/2 2741 2742 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2743 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2744 if not (hist or kde): 2745 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2746 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2747 2748 xmin, xmax, ymin, ymax = ppl.axis() 2749 if yspan != 1: 2750 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2751 for s in x_sessions: 2752 ppl.text( 2753 x_sessions[s], 2754 ymax +1, 2755 s, 2756 va = 'bottom', 2757 **( 2758 dict(ha = 'center') 2759 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2760 else dict(ha = 'left', rotation = 45) 2761 ) 2762 ) 2763 2764 if hist or kde: 2765 ppl.sca(ax2) 2766 2767 for s in colors: 2768 kw['marker'] = '+' 2769 kw['ms'] = 5 2770 kw['mec'] = colors[s] 2771 kw['label'] = s 2772 kw['alpha'] = 1 2773 ppl.plot([], [], **kw) 2774 
2775 kw['mec'] = (0,0,0) 2776 2777 if one_or_more_singlets: 2778 kw['marker'] = 'x' 2779 kw['ms'] = 4 2780 kw['alpha'] = .2 2781 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2782 ppl.plot([], [], **kw) 2783 2784 if one_or_more_multiplets: 2785 kw['marker'] = '+' 2786 kw['ms'] = 4 2787 kw['alpha'] = 1 2788 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2789 ppl.plot([], [], **kw) 2790 2791 if hist or kde: 2792 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2793 else: 2794 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2795 leg.set_zorder(-1000) 2796 2797 ppl.sca(ax1) 2798 2799 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2800 ppl.xticks([]) 2801 ppl.axis([-1, len(self), None, None]) 2802 2803 if hist or kde: 2804 ppl.sca(ax2) 2805 X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2806 2807 if kde: 2808 from scipy.stats import gaussian_kde 2809 yi = np.linspace(ymin, ymax, 201) 2810 xi = gaussian_kde(X).evaluate(yi) 2811 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2812# ppl.plot(xi, yi, 'k-', lw = 1) 2813 elif hist: 2814 ppl.hist( 2815 X, 2816 orientation = 'horizontal', 2817 histtype = 'stepfilled', 2818 ec = [.4]*3, 2819 fc = [.25]*3, 2820 alpha = .25, 2821 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2822 ) 2823 ppl.text(0, 0, 2824 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2825 size = 7.5, 2826 alpha = 1, 2827 va = 'center', 2828 ha = 'left', 2829 ) 2830 2831 ppl.axis([0, None, ymin, ymax]) 2832 ppl.xticks([]) 2833 ppl.yticks([]) 2834# ax2.spines['left'].set_visible(False) 2835 ax2.spines['right'].set_visible(False) 2836 ax2.spines['top'].set_visible(False) 2837 ax2.spines['bottom'].set_visible(False) 2838 2839 ax1.axis([None, None, ymin, ymax]) 2840 2841 if not os.path.exists(dir): 2842 os.makedirs(dir) 2843 if filename is None: 2844 return fig 2845 elif filename == '': 2846 filename = f'D{self._4x}_residuals.pdf' 2847 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2848 ppl.close(fig) 2849 2850 2851 def simulate(self, *args, **kwargs): 2852 ''' 2853 Legacy function with warning message pointing to `virtual_data()` 2854 ''' 2855 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()') 2856 2857 def plot_distribution_of_analyses( 2858 self, 2859 dir = 'output', 2860 filename = None, 2861 vs_time = False, 2862 figsize = (6,4), 2863 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 2864 output = None, 2865 dpi = 100, 2866 ): 2867 ''' 2868 Plot temporal distribution of all analyses in the data set. 2869 2870 **Parameters** 2871 2872 + `dir`: the directory in which to save the plot 2873 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 
2874 + `figsize`: (width, height) of figure
2875 + `dpi`: resolution for PNG output
2876
2877 '''
2878
2879 asamples = [s for s in self.anchors]
2880 usamples = [s for s in self.unknowns]
2881 if output is None or output == 'fig':
2882 fig = ppl.figure(figsize = figsize)
2883 ppl.subplots_adjust(*subplots_adjust)
2884 Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2885 Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2886 Xmax += (Xmax-Xmin)/40
2887 Xmin -= (Xmax-Xmin)/41
2888 for k, s in enumerate(asamples + usamples):
2889 if vs_time:
2890 X = [r['TimeTag'] for r in self if r['Sample'] == s]
2891 else:
2892 X = [x for x,r in enumerate(self) if r['Sample'] == s]
2893 Y = [-k for x in X]
2894 ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2895 ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2896 ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2897 ppl.axis([Xmin, Xmax, -k-1, 1])
2898 ppl.xlabel('\ntime')
2899 ppl.gca().annotate('',
2900 xy = (0.6, -0.02),
2901 xycoords = 'axes fraction',
2902 xytext = (.4, -0.02),
2903 arrowprops = dict(arrowstyle = "->", color = 'k'),
2904 )
2905
2906
2907 x2 = -1
2908 for session in self.sessions:
2909 x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2910 if vs_time:
2911 ppl.axvline(x1, color = 'k', lw = .75)
2912 if x2 > -1:
2913 if not vs_time:
2914 ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2915 x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2916 # from xlrd import xldate_as_datetime
2917 # print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2918 if vs_time:
2919 ppl.axvline(x2, color = 'k', lw = .75)
2920 ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2921 ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2922
2923 ppl.xticks([])
2924 ppl.yticks([])
2925
2926 if output is None:
2927 if not os.path.exists(dir):
2928 os.makedirs(dir)
2929 if filename == None:
2930 filename = f'D{self._4x}_distribution_of_analyses.pdf'
2931 ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2932 ppl.close(fig)
2933 elif output == 'ax':
2934 return ppl.gca()
2935 elif output == 'fig':
2936 return fig
2937
2938
2939 def plot_bulk_compositions(
2940 self,
2941 samples = None,
2942 dir = 'output/bulk_compositions',
2943 figsize = (6,6),
2944 subplots_adjust = (0.15, 0.12, 0.95, 0.92),
2945 show = False,
2946 sample_color = (0,.5,1),
2947 analysis_color = (.7,.7,.7),
2948 labeldist = 0.3,
2949 radius = 0.05,
2950 ):
2951 '''
2952 Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
2953
2954 By default, creates a directory `./output/bulk_compositions` where plots for
2955 each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
2956
2957
2958 **Parameters**
2959
2960 + `samples`: Only these samples are processed (by default: all samples).
2961 + `dir`: where to save the plots
2962 + `figsize`: (width, height) of figure
2963 + `subplots_adjust`: passed to `subplots_adjust()`
2964 + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
2965 allowing for interactive visualization/exploration in (δ13C, δ18O) space.
2966
+ `sample_color`: color used for sample markers/labels
2967 + `analysis_color`: color used for analysis markers/labels
2968 + `labeldist`: distance (in inches) from replicate markers to replicate labels
2969 + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
2970 '''
2971
2972 from matplotlib.patches import Ellipse
2973
2974 if samples is None:
2975 samples = [_ for _ in self.samples]
2976
2977 saved = {}
2978
2979 for s in samples:
2980
2981 fig = ppl.figure(figsize = figsize)
2982 fig.subplots_adjust(*subplots_adjust)
2983 ax = ppl.subplot(111)
2984 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
2985 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
2986 ppl.title(s)
2987
2988
2989 XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
2990 UID = [_['UID'] for _ in self.samples[s]['data']]
2991 XY0 = XY.mean(0)
2992
2993 for xy in XY:
2994 ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
2995
2996 ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
2997 ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
2998 ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
2999 saved[s] = [XY, XY0]
3000
3001 x1, x2, y1, y2 = ppl.axis()
3002 x0, dx = (x1+x2)/2, (x2-x1)/2
3003 y0, dy = (y1+y2)/2, (y2-y1)/2
3004 dx, dy = [max(max(dx, dy), radius)]*2
3005
3006 ppl.axis([
3007 x0 - 1.2*dx,
3008 x0 + 1.2*dx,
3009 y0 - 1.2*dy,
3010 y0 + 1.2*dy,
3011 ])
3012
3013 XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3014
3015 for xy, uid in zip(XY, UID):
3016
3017 xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3018 vector_in_display_space = xy_in_display_space - XY0_in_display_space
3019
3020 if (vector_in_display_space**2).sum() > 0:
3021
3022 unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3023 label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3024 label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3025 label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3026
3027 ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3028
3029 else:
3030
3031 ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color)
3032
3033 if radius:
3034 ax.add_artist(Ellipse(
3035 xy = XY0,
3036 width = radius*2,
3037 height = radius*2,
3038 ls = (0, (2,2)),
3039 lw = .7,
3040 ec = analysis_color,
3041 fc = 'None',
3042 ))
3043 ppl.text(
3044 XY0[0],
3045 XY0[1]-radius,
3046 f'\n± {radius*1e3:.0f} ppm',
3047 color = analysis_color,
3048 va = 'top',
3049 ha = 'center',
3050 linespacing = 0.4,
3051 size = 8,
3052 )
3053
3054 if not os.path.exists(dir):
3055 os.makedirs(dir)
3056 fig.savefig(f'{dir}/{s}.pdf')
3057 ppl.close(fig)
3058
3059 fig = ppl.figure(figsize = figsize)
3060 fig.subplots_adjust(*subplots_adjust)
3061 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3062 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3063
3064 for s in saved:
3065 for xy in saved[s][0]:
3066 ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3067 ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3068 ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3069 ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3070
3071 x1, x2, y1, y2 = ppl.axis()
3072 ppl.axis([
3073 x1 - (x2-x1)/10,
3074 x2 + (x2-x1)/10,
3075 y1 - (y2-y1)/10,
3076 y2 + (y2-y1)/10,
3077 ])
3078
3079
3080 if not os.path.exists(dir):
3081 os.makedirs(dir)
3082 fig.savefig(f'{dir}/__all__.pdf')
3083 if show:
3084 ppl.show()
3085 ppl.close(fig)
3086
3087
3088
3089 class D47data(D4xdata):
3090 '''
3091 Store and process data for a large set of Δ47 analyses,
3092 usually comprising more than one analytical session.
3093 '''
3094
3095 Nominal_D4x = {
3096 'ETH-1': 0.2052,
3097 'ETH-2': 0.2085,
3098 'ETH-3': 0.6132,
3099 'ETH-4': 0.4511,
3100 'IAEA-C1': 0.3018,
3101 'IAEA-C2': 0.6409,
3102 'MERCK': 0.5135,
3103 } # I-CDES (Bernasconi et al., 2021)
3104 '''
3105 Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3106 `D47data.standardize()` to normalize unknown samples to an absolute Δ47
3107 reference frame.
3108
3109 By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3110 ```py
3111 {
3112 'ETH-1' : 0.2052,
3113 'ETH-2' : 0.2085,
3114 'ETH-3' : 0.6132,
3115 'ETH-4' : 0.4511,
3116 'IAEA-C1' : 0.3018,
3117 'IAEA-C2' : 0.6409,
3118 'MERCK' : 0.5135,
3119 }
3120 ```
3121 '''
3122
3123
3124 @property
3125 def Nominal_D47(self):
3126 return self.Nominal_D4x
3127
3128
3129 @Nominal_D47.setter
3130 def Nominal_D47(self, new):
3131 self.Nominal_D4x = dict(**new)
3132 self.refresh()
3133
3134
3135 def __init__(self, l = [], **kwargs):
3136 '''
3137 **Parameters:** same as `D4xdata.__init__()`
3138 '''
3139 D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3140
3141
3142 def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3143 '''
3144 Find all samples for which `Teq` is specified, compute equilibrium Δ47
3145 value for that temperature, and treat these samples as additional anchors.
3146
3147 **Parameters**
3148
3149 + `fCo2eqD47`: Which CO2 equilibrium law to use
3150 (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3151 `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3152 + `priority`: if `replace`: forget old anchors and only use the new ones;
3153 if `new`: keep pre-existing anchors but update them in case of conflict
3154 between old and new Δ47 values;
3155 if `old`: keep pre-existing anchors but preserve their original Δ47
3156 values in case of conflict.
3157 '''
3158 f = {
3159 'petersen': fCO2eqD47_Petersen,
3160 'wang': fCO2eqD47_Wang,
3161 }[fCo2eqD47]
3162 foo = {}
3163 for r in self:
3164 if 'Teq' in r:
3165 if r['Sample'] in foo:
3166 assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3167 else:
3168 foo[r['Sample']] = f(r['Teq'])
3169 else:
3170 assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3171
3172 if priority == 'replace':
3173 self.Nominal_D47 = {}
3174 for s in foo:
3175 if priority != 'old' or s not in self.Nominal_D47:
3176 self.Nominal_D47[s] = foo[s]
3177
3178
3179
3180
3181 class D48data(D4xdata):
3182 '''
3183 Store and process data for a large set of Δ48 analyses,
3184 usually comprising more than one analytical session.
3185 '''
3186
3187 Nominal_D4x = {
3188 'ETH-1': 0.138,
3189 'ETH-2': 0.138,
3190 'ETH-3': 0.270,
3191 'ETH-4': 0.223,
3192 'GU-1': -0.419,
3193 } # (Fiebig et al., 2019, 2021)
3194 '''
3195 Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3196 `D48data.standardize()` to normalize unknown samples to an absolute Δ48
3197 reference frame.
3198 3199 By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019), 3200 Fiebig et al. (in press)): 3201 3202 ```py 3203 { 3204 'ETH-1' : 0.138, 3205 'ETH-2' : 0.138, 3206 'ETH-3' : 0.270, 3207 'ETH-4' : 0.223, 3208 'GU-1' : -0.419, 3209 } 3210 ``` 3211 ''' 3212 3213 3214 @property 3215 def Nominal_D48(self): 3216 return self.Nominal_D4x 3217 3218 3219 @Nominal_D48.setter 3220 def Nominal_D48(self, new): 3221 self.Nominal_D4x = dict(**new) 3222 self.refresh() 3223 3224 3225 def __init__(self, l = [], **kwargs): 3226 ''' 3227 **Parameters:** same as `D4xdata.__init__()` 3228 ''' 3229 D4xdata.__init__(self, l = l, mass = '48', **kwargs) 3230 3231 3232 3233class _SessionPlot(): 3234 ''' 3235 Simple placeholder class 3236 ''' 3237 def __init__(self): 3238 pass 3239 3240_app = typer.Typer( 3241 add_completion = False, 3242 context_settings={'help_option_names': ['-h', '--help']}, 3243 rich_markup_mode = 'rich', 3244 ) 3245 3246@_app.command() 3247def _cli( 3248 rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")], 3249 exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none', 3250 anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none', 3251 output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output', 3252 ): 3253 """ 3254 Process raw D47 data and return standardized results. 3255 """ 3256 3257 data = D47data() 3258 data.read(rawdata) 3259 3260 if exclude != 'none': 3261 exclude = read_csv(exclude) 3262 exclude_uid = {r['UID'] for r in exclude if 'UID' in r} 3263 exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r} 3264 else: 3265 exclude_uid = [] 3266 exclude_sample = [] 3267 3268 data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3269 3270 if anchors != 'none': 3271 anchors = read_csv(anchors) 3272 data.Nominal_d13C_VPDB = { 3273 _['Sample']: _['d13C_VPDB'] 3274 for _ in anchors 3275 if 'd13C_VPDB' in _ 3276 } 3277 data.Nominal_d18O_VPDB = { 3278 _['Sample']: _['d18O_VPDB'] 3279 for _ in anchors 3280 if 'd18O_VPDB' in _ 3281 } 3282 data.Nominal_D4x = { 3283 _['Sample']: _['D47'] 3284 for _ in anchors 3285 if 'D47' in _ 3286 } 3287 3288 data.refresh() 3289 data.wg() 3290 data.crunch() 3291 data.standardize() 3292 data.summary(dir = output_dir) 3293 data.table_of_samples(dir = output_dir) 3294 data.table_of_sessions(dir = output_dir) 3295 data.plot_sessions(dir = output_dir) 3296 data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True) 3297 data.table_of_analyses(dir = output_dir) 3298 data.plot_distribution_of_analyses(dir = output_dir) 3299 data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions') 3300 3301def __cli(): 3302 _app()
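The command-line interface above simply chains the same public API calls used throughout this documentation. For users who prefer to stay in Python, the following minimal sketch reproduces the gist of it, including the effect of the `--exclude` option (the excluded UID and sample name below are purely hypothetical, shown only to illustrate the filtering logic):

```py
import D47crunch

data = D47crunch.D47data()
data.read('rawdata.csv')

# mimic the --exclude option: drop analyses by UID and/or by Sample
exclude_uid = {'A03'}           # hypothetical UID to exclude
exclude_sample = {'MYSAMPLE-2'} # hypothetical sample to exclude
data = D47crunch.D47data([
	r for r in data
	if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample
])

data.wg()
data.crunch()
data.standardize()

# save the same tables and plots as the CLI, to the 'output' directory:
data.summary(dir = 'output')
data.table_of_samples(dir = 'output')
data.table_of_sessions(dir = 'output')
data.plot_residuals(dir = 'output', filename = 'D47_residuals.pdf', kde = True)
```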
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of T (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))
CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Petersen et al. (2019).
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))
CO2 equilibrium Δ47 value as a function of `T` (in degrees C) according to Wang et al. (2004) (supplementary data of Dennis et al., 2011).
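Both equilibrium laws are plain module-level functions and may be called directly; for instance, to compare them over a range of temperatures (a minimal sketch; the printed values are not reproduced here):

```py
from D47crunch import fCO2eqD47_Petersen, fCO2eqD47_Wang

# compare the two equilibrium laws at a few temperatures (in degrees C):
for T in (0, 25, 100, 1000):
	print(T, fCO2eqD47_Petersen(T), fCO2eqD47_Wang(T))
```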
def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])
Formats a list of lists of strings as a CSV

**Parameters**

+ `x`: the list of lists of strings to format
+ `hsep`: the field separator (`,` by default)
+ `vsep`: the line-ending convention to use (`\n` by default)

**Example**

```py
print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
```

outputs:

```py
a,b,c
d,e,f
```
def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')
Modify string `txt` to follow `lmfit.Parameter()` naming rules.
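For instance, a session name containing spaces, dashes or periods maps to a valid parameter name:

```py
from D47crunch import pf

print(pf('Session 2023-01.a'))  # outputs: Session_2023_01_a
```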
def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If both attempts fail, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y
Tries to convert string `x` to a float if it includes a decimal point, or to an integer if it does not. If both attempts fail, return the original string unchanged.
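This is the conversion applied to each field by `read_csv()`; for instance:

```py
from D47crunch import smart_type

print([smart_type(x) for x in ['A01', '5', '5.79502']])
# outputs: ['A01', 5, 5.79502]
```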
def pretty_table(x, header = 1, hsep = ' ', vsep = '–', align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
	print(pretty_table(x))
	```
	yields:
	```
	-- ------ ---
	A       B   C
	-- ------ ---
	1  1.9999 foo
	10      x bar
	-- ------ ---
	```

	'''
	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)
Reads a list of lists of strings and outputs an ascii table

**Parameters**

+ `x`: a list of lists of strings
+ `header`: the number of lines to treat as header lines
+ `hsep`: the horizontal separator between columns
+ `vsep`: the character to use as vertical separator
+ `align`: string of left (`<`) or right (`>`) alignment characters.

**Example**

```py
x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
print(pretty_table(x))
```
yields:
```
-- ------ ---
A       B   C
-- ------ ---
1  1.9999 foo
10      x bar
-- ------ ---
```
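If `align` is shorter than the number of columns, it is padded with `>`, so the default `'<'` left-aligns only the first column and right-aligns the rest. To left-align all three columns of the example above:

```py
from D47crunch import pretty_table

x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
print(pretty_table(x, align = '<<<'))  # one alignment character per column
```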
def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]
Transpose a list of lists

**Parameters**

+ `x`: a list of lists

**Example**

```py
x = [[1, 2], [3, 4]]
print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
```
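Combined with `make_csv()`, this makes it easy to turn column-oriented lists into CSV rows; a minimal sketch:

```py
from D47crunch import make_csv, transpose_table

cols = [['UID', 'A01', 'A02'], ['Sample', 'ETH-1', 'ETH-2']]
print(make_csv(transpose_table(cols)))
# outputs:
# UID,Sample
# A01,ETH-1
# A02,ETH-2
```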
def w_avg(X, sX) :
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [ x for x in X ]
	sX = [ sx for sx in sX ]
	W = [ sx**-2 for sx in sX ]
	W = [ w/sum(W) for w in W ]
	Xavg = sum([ w*x for w,x in zip(W,X) ])
	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
	return Xavg, sXavg
Compute variance-weighted average

Returns the value and SE of the weighted average of the elements of `X`, with relative weights equal to their inverse variances (`1/sX**2`).

**Parameters**

+ `X`: array-like of elements to average
+ `sX`: array-like of the corresponding SE values

**Tip**

If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets, they may be rearranged using `zip()`:

```python
foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
```
def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]
Read contents of `filename` in csv format and return a list of dictionaries.

In the csv string, spaces before and after field separators (`','` by default) are optional.

**Parameters**

+ `filename`: the csv file to read
+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, whichever appears most often in the contents of `filename`.
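For instance, reading back the `rawdata.csv` file from the tutorial (or any similarly formatted file) yields one dictionary per analysis, with numeric fields already converted by `smart_type()`:

```py
from D47crunch import read_csv

# assuming the rawdata.csv file from the tutorial section:
data = read_csv('rawdata.csv')
print(data[0]['UID'], data[0]['Sample'], data[0]['d45'])
# outputs: A01 ETH-1 5.79502
```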
269 def simulate_single_analysis(
270 sample = 'MYSAMPLE',
271 d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
272 d13C_VPDB = None, d18O_VPDB = None,
273 D47 = None, D48 = None, D49 = 0., D17O = 0.,
274 a47 = 1., b47 = 0., c47 = -0.9,
275 a48 = 1., b48 = 0., c48 = -0.45,
276 Nominal_D47 = None,
277 Nominal_D48 = None,
278 Nominal_d13C_VPDB = None,
279 Nominal_d18O_VPDB = None,
280 ALPHA_18O_ACID_REACTION = None,
281 R13_VPDB = None,
282 R17_VSMOW = None,
283 R18_VSMOW = None,
284 LAMBDA_17 = None,
285 R18_VPDB = None,
286 ):
287 '''
288 Compute working-gas delta values for a single analysis, assuming a stochastic working
289 gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).
290
291 **Parameters**
292
293 + `sample`: sample name
294 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
295 (respectively –4 and +26 ‰ by default)
296 + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
297 + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
298 of the carbonate sample
299 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and
300 Δ48 values if `D47` or `D48` are not specified
301 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
302 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
303 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
304 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
305 correction parameters (by default equal to the `D4xdata` default values)
306
307 Returns a dictionary with fields
308 `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
309 '''
310
311 if Nominal_d13C_VPDB is None:
312 Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB
313
314 if Nominal_d18O_VPDB is None:
315 Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB
316
317 if ALPHA_18O_ACID_REACTION is None:
318 ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION
319
320 if R13_VPDB is None:
321 R13_VPDB = D4xdata().R13_VPDB
322
323 if R17_VSMOW is None:
324 R17_VSMOW = D4xdata().R17_VSMOW
325
326 if R18_VSMOW is None:
327 R18_VSMOW = D4xdata().R18_VSMOW
328
329 if LAMBDA_17 is None:
330 LAMBDA_17 = D4xdata().LAMBDA_17
331
332 if R18_VPDB is None:
333 R18_VPDB = D4xdata().R18_VPDB
334
335 R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17
336
337 if Nominal_D47 is None:
338 Nominal_D47 = D47data().Nominal_D47
339
340 if Nominal_D48 is None:
341 Nominal_D48 = D48data().Nominal_D48
342
343 if d13C_VPDB is None:
344 if sample in Nominal_d13C_VPDB:
345 d13C_VPDB = Nominal_d13C_VPDB[sample]
346 else:
347 raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")
348
349 if d18O_VPDB is None:
350 if sample in Nominal_d18O_VPDB:
351 d18O_VPDB = Nominal_d18O_VPDB[sample]
352 else:
353 raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")
354
355 if D47 is None:
356 if sample in Nominal_D47:
357 D47 = Nominal_D47[sample]
358 else:
359 raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")
360
361 if D48 is None:
362 if sample in Nominal_D48:
363 D48 = Nominal_D48[sample]
364 else:
365 raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")
366
367 X = D4xdata()
368 X.R13_VPDB = R13_VPDB
369 X.R17_VSMOW = R17_VSMOW
370 X.R18_VSMOW = R18_VSMOW
371 X.LAMBDA_17 = LAMBDA_17
372 X.R18_VPDB = R18_VPDB
373 X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17
374 375 R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios( 376 R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000), 377 R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000), 378 ) 379 R45, R46, R47, R48, R49 = X.compute_isobar_ratios( 380 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 381 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 382 D17O=D17O, D47=D47, D48=D48, D49=D49, 383 ) 384 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios( 385 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 386 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 387 D17O=D17O, 388 ) 389 390 d45 = 1000 * (R45/R45wg - 1) 391 d46 = 1000 * (R46/R46wg - 1) 392 d47 = 1000 * (R47/R47wg - 1) 393 d48 = 1000 * (R48/R48wg - 1) 394 d49 = 1000 * (R49/R49wg - 1) 395 396 for k in range(3): # dumb iteration to adjust for small changes in d47 397 R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch 398 R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch 399 d47 = 1000 * (R47raw/R47wg - 1) 400 d48 = 1000 * (R48raw/R48wg - 1) 401 402 return dict( 403 Sample = sample, 404 D17O = D17O, 405 d13Cwg_VPDB = d13Cwg_VPDB, 406 d18Owg_VSMOW = d18Owg_VSMOW, 407 d45 = d45, 408 d46 = d46, 409 d47 = d47, 410 d48 = d48, 411 d49 = d49, 412 )
Compute working-gas delta values for a single analysis, assuming a stochastic working gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

**Parameters**

+ `sample`: sample name
+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas (respectively –4 and +26 ‰ by default)
+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies of the carbonate sample
+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values if `D47` or `D48` are not specified
+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 correction parameters (by default equal to the `D4xdata` default values)

Returns a dictionary with fields `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
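As a minimal sketch, simulating a single noise-free ETH-3 analysis, with every parameter left at its default so that the bulk composition and Δ values are looked up in the nominal anchor tables:

```py
from D47crunch import simulate_single_analysis

# one "perfect" ETH-3 analysis, all parameters at their defaults:
r = simulate_single_analysis(sample = 'ETH-3')
print(r['d45'], r['d47'])  # raw delta values of the simulated analysis
```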
415def virtual_data( 416 samples = [], 417 a47 = 1., b47 = 0., c47 = -0.9, 418 a48 = 1., b48 = 0., c48 = -0.45, 419 rd45 = 0.020, rd46 = 0.060, 420 rD47 = 0.015, rD48 = 0.045, 421 d13Cwg_VPDB = None, d18Owg_VSMOW = None, 422 session = None, 423 Nominal_D47 = None, Nominal_D48 = None, 424 Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None, 425 ALPHA_18O_ACID_REACTION = None, 426 R13_VPDB = None, 427 R17_VSMOW = None, 428 R18_VSMOW = None, 429 LAMBDA_17 = None, 430 R18_VPDB = None, 431 seed = 0, 432 shuffle = True, 433 ): 434 ''' 435 Return list with simulated analyses from a single session. 436 437 **Parameters** 438 439 + `samples`: a list of entries; each entry is a dictionary with the following fields: 440 * `Sample`: the name of the sample 441 * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 442 * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample 443 * `N`: how many analyses to generate for this sample 444 + `a47`: scrambling factor for Δ47 445 + `b47`: compositional nonlinearity for Δ47 446 + `c47`: working gas offset for Δ47 447 + `a48`: scrambling factor for Δ48 448 + `b48`: compositional nonlinearity for Δ48 449 + `c48`: working gas offset for Δ48 450 + `rd45`: analytical repeatability of δ45 451 + `rd46`: analytical repeatability of δ46 452 + `rD47`: analytical repeatability of Δ47 453 + `rD48`: analytical repeatability of Δ48 454 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 455 (by default equal to the `simulate_single_analysis` default values) 456 + `session`: name of the session (no name by default) 457 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values 458 if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults) 459 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 460 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 461 (by default equal to the `simulate_single_analysis` defaults) 462 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 463 (by default equal to the `simulate_single_analysis` defaults) 464 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 465 correction parameters (by default equal to the `simulate_single_analysis` default) 466 + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations 467 + `shuffle`: randomly reorder the sequence of analyses 468 469 470 Here is an example of using this method to generate an arbitrary combination of 471 anchors and unknowns for a bunch of sessions: 472 473 ```py 474 .. include:: ../code_examples/virtual_data/example.py 475 ``` 476 477 This should output something like: 478 479 ``` 480 .. 
include:: ../code_examples/virtual_data/output.txt 481 ``` 482 ''' 483 484 kwargs = locals().copy() 485 486 from numpy import random as nprandom 487 if seed: 488 rng = nprandom.default_rng(seed) 489 else: 490 rng = nprandom.default_rng() 491 492 N = sum([s['N'] for s in samples]) 493 errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 494 errors45 *= rd45 / stdev(errors45) # scale errors to rd45 495 errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 496 errors46 *= rd46 / stdev(errors46) # scale errors to rd46 497 errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 498 errors47 *= rD47 / stdev(errors47) # scale errors to rD47 499 errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 500 errors48 *= rD48 / stdev(errors48) # scale errors to rD48 501 502 k = 0 503 out = [] 504 for s in samples: 505 kw = {} 506 kw['sample'] = s['Sample'] 507 kw = { 508 **kw, 509 **{var: kwargs[var] 510 for var in [ 511 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION', 512 'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB', 513 'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB', 514 'a47', 'b47', 'c47', 'a48', 'b48', 'c48', 515 ] 516 if kwargs[var] is not None}, 517 **{var: s[var] 518 for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O'] 519 if var in s}, 520 } 521 522 sN = s['N'] 523 while sN: 524 out.append(simulate_single_analysis(**kw)) 525 out[-1]['d45'] += errors45[k] 526 out[-1]['d46'] += errors46[k] 527 out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47 528 out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48 529 sN -= 1 530 k += 1 531 532 if session is not None: 533 for r in out: 534 r['Session'] = session 535 536 if shuffle: 537 nprandom.shuffle(out) 538 539 return out
Return list with simulated analyses from a single session.

**Parameters**

+ `samples`: a list of entries; each entry is a dictionary with the following fields:
    * `Sample`: the name of the sample
    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
    * `N`: how many analyses to generate for this sample
+ `a47`: scrambling factor for Δ47
+ `b47`: compositional nonlinearity for Δ47
+ `c47`: working gas offset for Δ47
+ `a48`: scrambling factor for Δ48
+ `b48`: compositional nonlinearity for Δ48
+ `c48`: working gas offset for Δ48
+ `rd45`: analytical repeatability of δ45
+ `rd46`: analytical repeatability of δ46
+ `rD47`: analytical repeatability of Δ47
+ `rD48`: analytical repeatability of Δ48
+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas (by default equal to the `simulate_single_analysis` default values)
+ `session`: name of the session (no name by default)
+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified (by default equal to the `simulate_single_analysis` defaults)
+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor (by default equal to the `simulate_single_analysis` defaults)
+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 correction parameters (by default equal to the `simulate_single_analysis` default)
+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
+ `shuffle`: randomly reorder the sequence of analyses
Here is an example of using this method to generate an arbitrary combination of anchors and unknowns for a bunch of sessions:
```py
from D47crunch import virtual_data, D47data

args = dict(
	samples = [
		dict(Sample = 'ETH-1', N = 3),
		dict(Sample = 'ETH-2', N = 3),
		dict(Sample = 'ETH-3', N = 3),
		dict(Sample = 'FOO', N = 3,
			d13C_VPDB = -5., d18O_VPDB = -10.,
			D47 = 0.3, D48 = 0.15),
		dict(Sample = 'BAR', N = 3,
			d13C_VPDB = -15., d18O_VPDB = -2.,
			D47 = 0.6, D48 = 0.2),
	], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
```
This should output something like:
[table_of_sessions]
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––– ––––––––––––––
Session Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O r_D47 a ± SE 1e3 x b ± SE c ± SE
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––– ––––––––––––––
Session_01 9 6 -4.000 26.000 0.0205 0.0633 0.0091 1.015 ± 0.015 0.427 ± 0.232 -0.909 ± 0.006
Session_02 9 6 -4.000 26.000 0.0210 0.0882 0.0100 0.990 ± 0.015 0.484 ± 0.232 -0.905 ± 0.006
Session_03 9 6 -4.000 26.000 0.0186 0.0505 0.0111 0.997 ± 0.015 0.167 ± 0.233 -0.901 ± 0.006
Session_04 9 6 -4.000 26.000 0.0192 0.0467 0.0086 1.017 ± 0.015 0.229 ± 0.232 -0.910 ± 0.006
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––– ––––––––––––––
[table_of_samples]
–––––– –– ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene
–––––– –– ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 12 2.02 37.01 0.2052 0.0083
ETH-2 12 -10.17 19.88 0.2085 0.0090
ETH-3 12 1.71 37.46 0.6132 0.0083
BAR 12 -15.02 37.22 0.6057 0.0042 ± 0.0085 0.0088 0.753
FOO 12 -5.00 28.89 0.3024 0.0031 ± 0.0062 0.0070 0.497
–––––– –– ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
[table_of_analyses]
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– ––––––––
1 Session_01 ETH-1 -4.000 26.000 6.010276 10.840276 16.207960 21.475150 27.780042 2.011176 37.073454 -0.704188 -0.315986 -0.172089 0.194589
2 Session_01 ETH-3 -4.000 26.000 5.734896 11.229855 16.740410 22.402091 28.306614 1.702875 37.472070 -0.276998 -0.179635 -0.125368 0.615396
3 Session_01 ETH-3 -4.000 26.000 5.727341 11.211663 16.713472 22.364770 28.306614 1.695479 37.453503 -0.278056 -0.180158 -0.082015 0.614365
4 Session_01 ETH-2 -4.000 26.000 -5.991278 -5.995054 -12.741562 -12.184075 -18.023381 -10.180122 19.902809 -0.711697 -0.232746 0.032602 0.199357
5 Session_01 BAR -4.000 26.000 -9.920507 10.903408 0.065076 21.704075 10.707292 -14.998270 37.174839 -0.307018 -0.216978 -0.026076 0.592818
6 Session_01 FOO -4.000 26.000 -0.838118 2.819853 1.310384 5.326005 4.665655 -5.004629 28.895933 -0.593755 -0.319861 0.014956 0.309692
7 Session_01 ETH-3 -4.000 26.000 5.755174 11.255104 16.792797 22.451660 28.306614 1.723596 37.497816 -0.270825 -0.181089 -0.195908 0.621458
8 Session_01 ETH-1 -4.000 26.000 6.049381 10.706856 16.135579 21.196941 27.780042 2.057827 36.937067 -0.685751 -0.324384 0.045870 0.212791
9 Session_01 FOO -4.000 26.000 -0.848028 2.874679 1.346196 5.439150 4.665655 -5.017230 28.951964 -0.601502 -0.316664 -0.081898 0.302042
10 Session_01 FOO -4.000 26.000 -0.876454 2.906764 1.341194 5.490264 4.665655 -5.048760 28.984806 -0.608593 -0.329808 -0.114437 0.295055
11 Session_01 ETH-2 -4.000 26.000 -5.974124 -5.955517 -12.668784 -12.208184 -18.023381 -10.163274 19.943159 -0.694902 -0.336672 -0.063946 0.215880
12 Session_01 ETH-1 -4.000 26.000 5.995601 10.755323 16.116087 21.285428 27.780042 1.998631 36.986704 -0.696924 -0.333640 0.008600 0.201787
13 Session_01 ETH-2 -4.000 26.000 -5.982229 -6.110437 -12.827036 -12.492272 -18.023381 -10.166188 19.784916 -0.693555 -0.312598 0.251040 0.217274
14 Session_01 BAR -4.000 26.000 -9.915975 10.968470 0.153453 21.749385 10.707292 -14.995822 37.241294 -0.286638 -0.301325 -0.157376 0.612868
15 Session_01 BAR -4.000 26.000 -9.959983 10.926995 0.053806 21.724901 10.707292 -15.041279 37.199026 -0.300066 -0.243252 -0.029371 0.599675
16 Session_02 ETH-1 -4.000 26.000 6.019963 10.773112 16.163825 21.331060 27.780042 2.029040 37.042346 -0.692234 -0.324161 -0.051788 0.207075
17 Session_02 FOO -4.000 26.000 -0.848415 2.849823 1.308081 5.427767 4.665655 -5.018107 28.927036 -0.614791 -0.278426 -0.032784 0.292547
18 Session_02 ETH-2 -4.000 26.000 -5.993476 -5.944866 -12.696865 -12.149754 -18.023381 -10.190430 19.913381 -0.713779 -0.298963 -0.064251 0.199436
19 Session_02 BAR -4.000 26.000 -9.957566 10.903888 0.031785 21.739434 10.707292 -15.048386 37.213724 -0.302139 -0.183327 0.012926 0.608897
20 Session_02 ETH-3 -4.000 26.000 5.757137 11.232751 16.744567 22.398244 28.306614 1.731295 37.514660 -0.298533 -0.189123 -0.154557 0.604363
21 Session_02 ETH-1 -4.000 26.000 5.993918 10.617469 15.991900 21.070358 27.780042 2.006934 36.882679 -0.683329 -0.271476 0.278458 0.216152
22 Session_02 FOO -4.000 26.000 -0.835046 2.870518 1.355370 5.487896 4.665655 -5.004585 28.948243 -0.601666 -0.259900 -0.087592 0.305777
23 Session_02 BAR -4.000 26.000 -9.963888 10.865863 -0.023549 21.615868 10.707292 -15.053743 37.174715 -0.313906 -0.229031 0.093637 0.597041
24 Session_02 ETH-2 -4.000 26.000 -5.950370 -5.959974 -12.650784 -12.197864 -18.023381 -10.143809 19.897777 -0.696916 -0.317263 -0.080604 0.216441
25 Session_02 ETH-2 -4.000 26.000 -5.982371 -6.036210 -12.762399 -12.309944 -18.023381 -10.175178 19.819614 -0.701348 -0.277354 0.104418 0.212021
26 Session_02 BAR -4.000 26.000 -9.936020 10.862339 0.024660 21.563307 10.707292 -15.023836 37.171034 -0.291333 -0.273498 0.070452 0.619812
27 Session_02 FOO -4.000 26.000 -0.819742 2.826793 1.317044 5.330616 4.665655 -4.986618 28.903335 -0.612871 -0.329113 -0.018244 0.294481
28 Session_02 ETH-1 -4.000 26.000 6.030532 10.851030 16.245571 21.457100 27.780042 2.037466 37.122284 -0.698413 -0.354920 -0.214443 0.200795
29 Session_02 ETH-3 -4.000 26.000 5.716356 11.091821 16.582487 22.123857 28.306614 1.692901 37.370126 -0.279100 -0.178789 0.162540 0.624067
30 Session_02 ETH-3 -4.000 26.000 5.719281 11.207303 16.681693 22.370886 28.306614 1.691780 37.488633 -0.296801 -0.165556 -0.065004 0.606143
31 Session_03 ETH-2 -4.000 26.000 -5.997147 -5.905858 -12.655382 -12.081612 -18.023381 -10.165400 19.891551 -0.706536 -0.308464 -0.137414 0.197550
32 Session_03 ETH-2 -4.000 26.000 -6.008525 -5.909707 -12.647727 -12.075913 -18.023381 -10.177379 19.887608 -0.683183 -0.294956 -0.117608 0.220975
33 Session_03 BAR -4.000 26.000 -9.928709 10.989665 0.148059 21.852677 10.707292 -14.976237 37.324152 -0.299358 -0.242185 -0.184835 0.603855
34 Session_03 BAR -4.000 26.000 -9.957114 10.898997 0.044946 21.602296 10.707292 -15.003175 37.230716 -0.284699 -0.307849 0.021944 0.618578
35 Session_03 ETH-2 -4.000 26.000 -6.000290 -5.947172 -12.697463 -12.164602 -18.023381 -10.167221 19.848953 -0.705037 -0.309350 -0.052386 0.199061
36 Session_03 ETH-1 -4.000 26.000 5.994622 10.743980 16.116098 21.243734 27.780042 1.997857 37.033567 -0.684883 -0.352014 0.031692 0.214449
37 Session_03 FOO -4.000 26.000 -0.823857 2.761300 1.258060 5.239992 4.665655 -4.973383 28.817444 -0.603327 -0.288652 0.114488 0.298751
38 Session_03 ETH-3 -4.000 26.000 5.748546 11.079879 16.580826 22.120063 28.306614 1.723364 37.380534 -0.302133 -0.158882 0.151641 0.598318
39 Session_03 ETH-1 -4.000 26.000 6.040566 10.786620 16.205283 21.374963 27.780042 2.045244 37.077432 -0.685706 -0.307909 -0.099869 0.213609
40 Session_03 ETH-3 -4.000 26.000 5.718991 11.146227 16.640814 22.243185 28.306614 1.689442 37.449023 -0.277332 -0.169668 0.053997 0.623187
41 Session_03 BAR -4.000 26.000 -9.952115 11.034508 0.169809 21.885915 10.707292 -15.002819 37.370451 -0.296804 -0.298351 -0.246731 0.606414
42 Session_03 ETH-3 -4.000 26.000 5.753467 11.206589 16.719131 22.373244 28.306614 1.723960 37.511190 -0.294350 -0.161838 -0.099835 0.606103
43 Session_03 FOO -4.000 26.000 -0.800284 2.851299 1.376828 5.379547 4.665655 -4.951581 28.910199 -0.597293 -0.329315 -0.087015 0.304784
44 Session_03 FOO -4.000 26.000 -0.873798 2.820799 1.272165 5.370745 4.665655 -5.028782 28.878917 -0.596008 -0.277258 0.051165 0.306090
45 Session_03 ETH-1 -4.000 26.000 6.004078 10.683951 16.045192 21.214355 27.780042 2.010134 36.971642 -0.705956 -0.262026 0.138399 0.193323
46 Session_04 BAR -4.000 26.000 -9.931741 10.819830 -0.023748 21.529372 10.707292 -15.006533 37.118743 -0.302866 -0.222623 0.148462 0.596536
47 Session_04 FOO -4.000 26.000 -0.853969 2.805035 1.267571 5.353907 4.665655 -5.030523 28.850660 -0.605611 -0.262571 0.060903 0.298685
48 Session_04 FOO -4.000 26.000 -0.791191 2.708220 1.256167 5.145784 4.665655 -4.960004 28.750896 -0.586913 -0.276505 0.183674 0.317065
49 Session_04 ETH-1 -4.000 26.000 6.023822 10.730714 16.121184 21.235757 27.780042 2.012958 36.989833 -0.696908 -0.333582 0.026555 0.205610
50 Session_04 ETH-2 -4.000 26.000 -5.973623 -5.975018 -12.694278 -12.194472 -18.023381 -10.166297 19.828211 -0.701951 -0.283570 -0.025935 0.207135
51 Session_04 ETH-1 -4.000 26.000 6.017312 10.735930 16.123043 21.270597 27.780042 2.005824 36.995214 -0.693479 -0.309795 0.023309 0.208980
52 Session_04 ETH-1 -4.000 26.000 6.029937 10.766997 16.151273 21.345479 27.780042 2.018148 37.027152 -0.708855 -0.297953 -0.050465 0.193862
53 Session_04 ETH-3 -4.000 26.000 5.739420 11.128582 16.641344 22.166106 28.306614 1.695046 37.399884 -0.280608 -0.210162 0.066645 0.614665
54 Session_04 ETH-3 -4.000 26.000 5.751908 11.207110 16.726741 22.380392 28.306614 1.705481 37.480657 -0.285776 -0.155878 -0.099197 0.609567
55 Session_04 FOO -4.000 26.000 -0.848192 2.777763 1.251297 5.280272 4.665655 -5.023358 28.822585 -0.601094 -0.281419 0.108186 0.303128
56 Session_04 ETH-2 -4.000 26.000 -5.966627 -5.893789 -12.597717 -12.120719 -18.023381 -10.161842 19.911776 -0.691757 -0.372308 -0.193986 0.217132
57 Session_04 ETH-3 -4.000 26.000 5.798016 11.254135 16.832228 22.432473 28.306614 1.752928 37.528936 -0.275047 -0.197935 -0.239408 0.620088
58 Session_04 ETH-2 -4.000 26.000 -5.986501 -5.915157 -12.656583 -12.060382 -18.023381 -10.182247 19.889836 -0.709603 -0.268277 -0.130450 0.199604
59 Session_04 BAR -4.000 26.000 -9.926078 10.884823 0.060864 21.650722 10.707292 -15.002880 37.185606 -0.287358 -0.232425 0.016044 0.611760
60 Session_04 BAR -4.000 26.000 -9.951025 10.951923 0.089386 21.738926 10.707292 -15.031949 37.254709 -0.298065 -0.278834 -0.087463 0.601230
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– ––––––––
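Beyond printing them out, the same three tables can be written to csv files by keeping `save_to_file` enabled and pointing `dir` at an output directory. A minimal sketch, reusing the `D` object from above (the file names shown in comments are the documented defaults):

```py
# Save the three tables to csv files in a directory named 'tables':
D.table_of_sessions(save_to_file = True, print_out = False, dir = 'tables')  # tables/D47_sessions.csv
D.table_of_samples(save_to_file = True, print_out = False, dir = 'tables')   # tables/D47_samples.csv
D.table_of_analyses(save_to_file = True, print_out = False, dir = 'tables')  # tables/D47_analyses.csv
```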
The module-level function `table_of_samples()` combines the sample tables of a `D47data` and a `D48data` object:

```py
def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = f'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)
```
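For instance, assuming `mydata47` and `mydata48` are fully standardized `D47data` and `D48data` objects built from the same analyses (both names are placeholders), the combined table can be printed without writing anything to disk:

```py
from D47crunch import table_of_samples

# mydata47: a standardized D47data instance (hypothetical)
# mydata48: a standardized D48data instance (hypothetical)
table_of_samples(data47 = mydata47, data48 = mydata48, save_to_file = False)
```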
The module-level function `table_of_sessions()` does the same for session tables. It is only applicable if the sessions in `data47` and those in `data48` consist of the exact same sets of analyses:

```py
def table_of_sessions(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of sessions
	for a pair of `D47data` and `D48data` objects.
	***Only applicable if the sessions in `data47` and those in `data48`
	consist of the exact same sets of analyses.***

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			for k,x in enumerate(out47[0]):
				if k>7:
					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = f'D47D48_sessions.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)
```
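Because of this restriction, it may be worth verifying that the two datasets cover identical sessions before calling the function. A minimal sketch of a necessary (though not sufficient) check, reusing the hypothetical `mydata47` and `mydata48` objects from above:

```py
# Each analysis is a dictionary with a 'Session' key, so the sets of
# session names in the two datasets can be compared directly:
assert {r['Session'] for r in mydata47} == {r['Session'] for r in mydata48}, \
	'table_of_sessions() expects the same sessions in both datasets.'
```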
Finally, the module-level function `table_of_analyses()` combines the tables of analyses:

```py
def table_of_analyses(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of analyses
	for a pair of `D47data` and `D48data` objects.

	If the sessions in `data47` and those in `data48` do not consist of
	the exact same sets of analyses, the table will have two columns
	`Session_47` and `Session_48` instead of a single `Session` column.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
			else:
				out47[0][1] = 'Session_47'
				out48[0][1] = 'Session_48'
				out47 = transpose_table(out47)
				out48 = transpose_table(out48)
				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = f'D47D48_analyses.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)
```
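With `output = 'raw'`, the combined table comes back as a list of lists of strings, which is convenient for further processing. A short sketch, again using the hypothetical `mydata47` and `mydata48` objects:

```py
from D47crunch import table_of_analyses

rows = table_of_analyses(
	data47 = mydata47,
	data48 = mydata48,
	save_to_file = False,
	print_out = False,
	output = 'raw',
	)
header, analyses = rows[0], rows[1:]  # first row holds the column names
```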
799class D4xdata(list): 800 ''' 801 Store and process data for a large set of Δ47 and/or Δ48 802 analyses, usually comprising more than one analytical session. 803 ''' 804 805 ### 17O CORRECTION PARAMETERS 806 R13_VPDB = 0.01118 # (Chang & Li, 1990) 807 ''' 808 Absolute (13C/12C) ratio of VPDB. 809 By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm)) 810 ''' 811 812 R18_VSMOW = 0.0020052 # (Baertschi, 1976) 813 ''' 814 Absolute (18O/16C) ratio of VSMOW. 815 By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1)) 816 ''' 817 818 LAMBDA_17 = 0.528 # (Barkan & Luz, 2005) 819 ''' 820 Mass-dependent exponent for triple oxygen isotopes. 821 By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250)) 822 ''' 823 824 R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB) 825 ''' 826 Absolute (17O/16C) ratio of VSMOW. 827 By default equal to 0.00038475 828 ([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011), 829 rescaled to `R13_VPDB`) 830 ''' 831 832 R18_VPDB = R18_VSMOW * 1.03092 833 ''' 834 Absolute (18O/16C) ratio of VPDB. 835 By definition equal to `R18_VSMOW * 1.03092`. 836 ''' 837 838 R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17 839 ''' 840 Absolute (17O/16C) ratio of VPDB. 841 By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`. 842 ''' 843 844 LEVENE_REF_SAMPLE = 'ETH-3' 845 ''' 846 After the Δ4x standardization step, each sample is tested to 847 assess whether the Δ4x variance within all analyses for that 848 sample differs significantly from that observed for a given reference 849 sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test), 850 which yields a p-value corresponding to the null hypothesis that the 851 underlying variances are equal). 852 853 `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which 854 sample should be used as a reference for this test. 855 ''' 856 857 ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite) 858 ''' 859 Specifies the 18O/16O fractionation factor generally applicable 860 to acid reactions in the dataset. Currently used by `D4xdata.wg()`, 861 `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`. 862 863 By default equal to 1.008129 (calcite reacted at 90 °C, 864 [Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)). 865 ''' 866 867 Nominal_d13C_VPDB = { 868 'ETH-1': 2.02, 869 'ETH-2': -10.17, 870 'ETH-3': 1.71, 871 } # (Bernasconi et al., 2018) 872 ''' 873 Nominal δ13C_VPDB values assigned to carbonate standards, used by 874 `D4xdata.standardize_d13C()`. 875 876 By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after 877 [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385). 878 ''' 879 880 Nominal_d18O_VPDB = { 881 'ETH-1': -2.19, 882 'ETH-2': -18.69, 883 'ETH-3': -1.78, 884 } # (Bernasconi et al., 2018) 885 ''' 886 Nominal δ18O_VPDB values assigned to carbonate standards, used by 887 `D4xdata.standardize_d18O()`. 888 889 By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after 890 [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385). 891 ''' 892 893 d13C_STANDARDIZATION_METHOD = '2pt' 894 ''' 895 Method by which to standardize δ13C values: 896 897 + `none`: do not apply any δ13C standardization. 
898 + `'1pt'`: within each session, offset all initial δ13C values so as to 899 minimize the difference between final δ13C_VPDB values and 900 `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined). 901 + `'2pt'`: within each session, apply a affine trasformation to all δ13C 902 values so as to minimize the difference between final δ13C_VPDB 903 values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` 904 is defined). 905 ''' 906 907 d18O_STANDARDIZATION_METHOD = '2pt' 908 ''' 909 Method by which to standardize δ18O values: 910 911 + `none`: do not apply any δ18O standardization. 912 + `'1pt'`: within each session, offset all initial δ18O values so as to 913 minimize the difference between final δ18O_VPDB values and 914 `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined). 915 + `'2pt'`: within each session, apply a affine trasformation to all δ18O 916 values so as to minimize the difference between final δ18O_VPDB 917 values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` 918 is defined). 919 ''' 920 921 def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False): 922 ''' 923 **Parameters** 924 925 + `l`: a list of dictionaries, with each dictionary including at least the keys 926 `Sample`, `d45`, `d46`, and `d47` or `d48`. 927 + `mass`: `'47'` or `'48'` 928 + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods. 929 + `session`: define session name for analyses without a `Session` key 930 + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods. 931 932 Returns a `D4xdata` object derived from `list`. 933 ''' 934 self._4x = mass 935 self.verbose = verbose 936 self.prefix = 'D4xdata' 937 self.logfile = logfile 938 list.__init__(self, l) 939 self.Nf = None 940 self.repeatability = {} 941 self.refresh(session = session) 942 943 944 def make_verbal(oldfun): 945 ''' 946 Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`. 947 ''' 948 @wraps(oldfun) 949 def newfun(*args, verbose = '', **kwargs): 950 myself = args[0] 951 oldprefix = myself.prefix 952 myself.prefix = oldfun.__name__ 953 if verbose != '': 954 oldverbose = myself.verbose 955 myself.verbose = verbose 956 out = oldfun(*args, **kwargs) 957 myself.prefix = oldprefix 958 if verbose != '': 959 myself.verbose = oldverbose 960 return out 961 return newfun 962 963 964 def msg(self, txt): 965 ''' 966 Log a message to `self.logfile`, and print it out if `verbose = True` 967 ''' 968 self.log(txt) 969 if self.verbose: 970 print(f'{f"[{self.prefix}]":<16} {txt}') 971 972 973 def vmsg(self, txt): 974 ''' 975 Log a message to `self.logfile` and print it out 976 ''' 977 self.log(txt) 978 print(txt) 979 980 981 def log(self, *txts): 982 ''' 983 Log a message to `self.logfile` 984 ''' 985 if self.logfile: 986 with open(self.logfile, 'a') as fid: 987 for txt in txts: 988 fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}') 989 990 991 def refresh(self, session = 'mySession'): 992 ''' 993 Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`. 994 ''' 995 self.fill_in_missing_info(session = session) 996 self.refresh_sessions() 997 self.refresh_samples() 998 999 1000 def refresh_sessions(self): 1001 ''' 1002 Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` 1003 to `False` for all sessions. 
1004 ''' 1005 self.sessions = { 1006 s: {'data': [r for r in self if r['Session'] == s]} 1007 for s in sorted({r['Session'] for r in self}) 1008 } 1009 for s in self.sessions: 1010 self.sessions[s]['scrambling_drift'] = False 1011 self.sessions[s]['slope_drift'] = False 1012 self.sessions[s]['wg_drift'] = False 1013 self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD 1014 self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD 1015 1016 1017 def refresh_samples(self): 1018 ''' 1019 Define `self.samples`, `self.anchors`, and `self.unknowns`. 1020 ''' 1021 self.samples = { 1022 s: {'data': [r for r in self if r['Sample'] == s]} 1023 for s in sorted({r['Sample'] for r in self}) 1024 } 1025 self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x} 1026 self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x} 1027 1028 1029 def read(self, filename, sep = '', session = ''): 1030 ''' 1031 Read file in csv format to load data into a `D47data` object. 1032 1033 In the csv file, spaces before and after field separators (`','` by default) 1034 are optional. Each line corresponds to a single analysis. 1035 1036 The required fields are: 1037 1038 + `UID`: a unique identifier 1039 + `Session`: an identifier for the analytical session 1040 + `Sample`: a sample identifier 1041 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1042 1043 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1044 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1045 and `d49` are optional, and set to NaN by default. 1046 1047 **Parameters** 1048 1049 + `fileneme`: the path of the file to read 1050 + `sep`: csv separator delimiting the fields 1051 + `session`: set `Session` field to this string for all analyses 1052 ''' 1053 with open(filename) as fid: 1054 self.input(fid.read(), sep = sep, session = session) 1055 1056 1057 def input(self, txt, sep = '', session = ''): 1058 ''' 1059 Read `txt` string in csv format to load analysis data into a `D47data` object. 1060 1061 In the csv string, spaces before and after field separators (`','` by default) 1062 are optional. Each line corresponds to a single analysis. 1063 1064 The required fields are: 1065 1066 + `UID`: a unique identifier 1067 + `Session`: an identifier for the analytical session 1068 + `Sample`: a sample identifier 1069 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1070 1071 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1072 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1073 and `d49` are optional, and set to NaN by default. 1074 1075 **Parameters** 1076 1077 + `txt`: the csv string to read 1078 + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, 1079 whichever appers most often in `txt`. 
1080 + `session`: set `Session` field to this string for all analyses 1081 ''' 1082 if sep == '': 1083 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 1084 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 1085 data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]] 1086 1087 if session != '': 1088 for r in data: 1089 r['Session'] = session 1090 1091 self += data 1092 self.refresh() 1093 1094 1095 @make_verbal 1096 def wg(self, samples = None, a18_acid = None): 1097 ''' 1098 Compute bulk composition of the working gas for each session based on 1099 the carbonate standards defined in both `self.Nominal_d13C_VPDB` and 1100 `self.Nominal_d18O_VPDB`. 1101 ''' 1102 1103 self.msg('Computing WG composition:') 1104 1105 if a18_acid is None: 1106 a18_acid = self.ALPHA_18O_ACID_REACTION 1107 if samples is None: 1108 samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB] 1109 1110 assert a18_acid, f'Acid fractionation factor should not be zero.' 1111 1112 samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB] 1113 R45R46_standards = {} 1114 for sample in samples: 1115 d13C_vpdb = self.Nominal_d13C_VPDB[sample] 1116 d18O_vpdb = self.Nominal_d18O_VPDB[sample] 1117 R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000) 1118 R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17 1119 R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid 1120 1121 C12_s = 1 / (1 + R13_s) 1122 C13_s = R13_s / (1 + R13_s) 1123 C16_s = 1 / (1 + R17_s + R18_s) 1124 C17_s = R17_s / (1 + R17_s + R18_s) 1125 C18_s = R18_s / (1 + R17_s + R18_s) 1126 1127 C626_s = C12_s * C16_s ** 2 1128 C627_s = 2 * C12_s * C16_s * C17_s 1129 C628_s = 2 * C12_s * C16_s * C18_s 1130 C636_s = C13_s * C16_s ** 2 1131 C637_s = 2 * C13_s * C16_s * C17_s 1132 C727_s = C12_s * C17_s ** 2 1133 1134 R45_s = (C627_s + C636_s) / C626_s 1135 R46_s = (C628_s + C637_s + C727_s) / C626_s 1136 R45R46_standards[sample] = (R45_s, R46_s) 1137 1138 for s in self.sessions: 1139 db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples] 1140 assert db, f'No sample from {samples} found in session "{s}".' 
1141# dbsamples = sorted({r['Sample'] for r in db}) 1142 1143 X = [r['d45'] for r in db] 1144 Y = [R45R46_standards[r['Sample']][0] for r in db] 1145 x1, x2 = np.min(X), np.max(X) 1146 1147 if x1 < x2: 1148 wgcoord = x1/(x1-x2) 1149 else: 1150 wgcoord = 999 1151 1152 if wgcoord < -.5 or wgcoord > 1.5: 1153 # unreasonable to extrapolate to d45 = 0 1154 R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1155 else : 1156 # d45 = 0 is reasonably well bracketed 1157 R45_wg = np.polyfit(X, Y, 1)[1] 1158 1159 X = [r['d46'] for r in db] 1160 Y = [R45R46_standards[r['Sample']][1] for r in db] 1161 x1, x2 = np.min(X), np.max(X) 1162 1163 if x1 < x2: 1164 wgcoord = x1/(x1-x2) 1165 else: 1166 wgcoord = 999 1167 1168 if wgcoord < -.5 or wgcoord > 1.5: 1169 # unreasonable to extrapolate to d46 = 0 1170 R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1171 else : 1172 # d46 = 0 is reasonably well bracketed 1173 R46_wg = np.polyfit(X, Y, 1)[1] 1174 1175 d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg) 1176 1177 self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}') 1178 1179 self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB 1180 self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW 1181 for r in self.sessions[s]['data']: 1182 r['d13Cwg_VPDB'] = d13Cwg_VPDB 1183 r['d18Owg_VSMOW'] = d18Owg_VSMOW 1184 1185 1186 def compute_bulk_delta(self, R45, R46, D17O = 0): 1187 ''' 1188 Compute δ13C_VPDB and δ18O_VSMOW, 1189 by solving the generalized form of equation (17) from 1190 [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05), 1191 assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and 1192 solving the corresponding second-order Taylor polynomial. 1193 (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014)) 1194 ''' 1195 1196 K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17 1197 1198 A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17) 1199 B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17 1200 C = 2 * self.R18_VSMOW 1201 D = -R46 1202 1203 aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2 1204 bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C 1205 cc = A + B + C + D 1206 1207 d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa) 1208 1209 R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW 1210 R17 = K * R18 ** self.LAMBDA_17 1211 R13 = R45 - 2 * R17 1212 1213 d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1) 1214 1215 return d13C_VPDB, d18O_VSMOW 1216 1217 1218 @make_verbal 1219 def crunch(self, verbose = ''): 1220 ''' 1221 Compute bulk composition and raw clumped isotope anomalies for all analyses. 1222 ''' 1223 for r in self: 1224 self.compute_bulk_and_clumping_deltas(r) 1225 self.standardize_d13C() 1226 self.standardize_d18O() 1227 self.msg(f"Crunched {len(self)} analyses.") 1228 1229 1230 def fill_in_missing_info(self, session = 'mySession'): 1231 ''' 1232 Fill in optional fields with default values 1233 ''' 1234 for i,r in enumerate(self): 1235 if 'D17O' not in r: 1236 r['D17O'] = 0. 
1237 if 'UID' not in r: 1238 r['UID'] = f'{i+1}' 1239 if 'Session' not in r: 1240 r['Session'] = session 1241 for k in ['d47', 'd48', 'd49']: 1242 if k not in r: 1243 r[k] = np.nan 1244 1245 1246 def standardize_d13C(self): 1247 ''' 1248 Perform δ13C standadization within each session `s` according to 1249 `self.sessions[s]['d13C_standardization_method']`, which is defined by default 1250 by `D47data.refresh_sessions()`as equal to `self.d13C_STANDARDIZATION_METHOD`, but 1251 may be redefined abitrarily at a later stage. 1252 ''' 1253 for s in self.sessions: 1254 if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']: 1255 XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB] 1256 X,Y = zip(*XY) 1257 if self.sessions[s]['d13C_standardization_method'] == '1pt': 1258 offset = np.mean(Y) - np.mean(X) 1259 for r in self.sessions[s]['data']: 1260 r['d13C_VPDB'] += offset 1261 elif self.sessions[s]['d13C_standardization_method'] == '2pt': 1262 a,b = np.polyfit(X,Y,1) 1263 for r in self.sessions[s]['data']: 1264 r['d13C_VPDB'] = a * r['d13C_VPDB'] + b 1265 1266 def standardize_d18O(self): 1267 ''' 1268 Perform δ18O standadization within each session `s` according to 1269 `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`, 1270 which is defined by default by `D47data.refresh_sessions()`as equal to 1271 `self.d18O_STANDARDIZATION_METHOD`, but may be redefined abitrarily at a later stage. 1272 ''' 1273 for s in self.sessions: 1274 if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']: 1275 XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB] 1276 X,Y = zip(*XY) 1277 Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y] 1278 if self.sessions[s]['d18O_standardization_method'] == '1pt': 1279 offset = np.mean(Y) - np.mean(X) 1280 for r in self.sessions[s]['data']: 1281 r['d18O_VSMOW'] += offset 1282 elif self.sessions[s]['d18O_standardization_method'] == '2pt': 1283 a,b = np.polyfit(X,Y,1) 1284 for r in self.sessions[s]['data']: 1285 r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b 1286 1287 1288 def compute_bulk_and_clumping_deltas(self, r): 1289 ''' 1290 Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`. 1291 ''' 1292 1293 # Compute working gas R13, R18, and isobar ratios 1294 R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000) 1295 R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000) 1296 R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg) 1297 1298 # Compute analyte isobar ratios 1299 R45 = (1 + r['d45'] / 1000) * R45_wg 1300 R46 = (1 + r['d46'] / 1000) * R46_wg 1301 R47 = (1 + r['d47'] / 1000) * R47_wg 1302 R48 = (1 + r['d48'] / 1000) * R48_wg 1303 R49 = (1 + r['d49'] / 1000) * R49_wg 1304 1305 r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O']) 1306 R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB 1307 R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW 1308 1309 # Compute stochastic isobar ratios of the analyte 1310 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios( 1311 R13, R18, D17O = r['D17O'] 1312 ) 1313 1314 # Check that R45/R45stoch and R46/R46stoch are undistinguishable from 1, 1315 # and raise a warning if the corresponding anomalies exceed 0.02 ppm. 
1316 if (R45 / R45stoch - 1) > 5e-8: 1317 self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm') 1318 if (R46 / R46stoch - 1) > 5e-8: 1319 self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm') 1320 1321 # Compute raw clumped isotope anomalies 1322 r['D47raw'] = 1000 * (R47 / R47stoch - 1) 1323 r['D48raw'] = 1000 * (R48 / R48stoch - 1) 1324 r['D49raw'] = 1000 * (R49 / R49stoch - 1) 1325 1326 1327 def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0): 1328 ''' 1329 Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`, 1330 optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope 1331 anomalies (`D47`, `D48`, `D49`), all expressed in permil. 1332 ''' 1333 1334 # Compute R17 1335 R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17 1336 1337 # Compute isotope concentrations 1338 C12 = (1 + R13) ** -1 1339 C13 = C12 * R13 1340 C16 = (1 + R17 + R18) ** -1 1341 C17 = C16 * R17 1342 C18 = C16 * R18 1343 1344 # Compute stochastic isotopologue concentrations 1345 C626 = C16 * C12 * C16 1346 C627 = C16 * C12 * C17 * 2 1347 C628 = C16 * C12 * C18 * 2 1348 C636 = C16 * C13 * C16 1349 C637 = C16 * C13 * C17 * 2 1350 C638 = C16 * C13 * C18 * 2 1351 C727 = C17 * C12 * C17 1352 C728 = C17 * C12 * C18 * 2 1353 C737 = C17 * C13 * C17 1354 C738 = C17 * C13 * C18 * 2 1355 C828 = C18 * C12 * C18 1356 C838 = C18 * C13 * C18 1357 1358 # Compute stochastic isobar ratios 1359 R45 = (C636 + C627) / C626 1360 R46 = (C628 + C637 + C727) / C626 1361 R47 = (C638 + C728 + C737) / C626 1362 R48 = (C738 + C828) / C626 1363 R49 = C838 / C626 1364 1365 # Account for stochastic anomalies 1366 R47 *= 1 + D47 / 1000 1367 R48 *= 1 + D48 / 1000 1368 R49 *= 1 + D49 / 1000 1369 1370 # Return isobar ratios 1371 return R45, R46, R47, R48, R49 1372 1373 1374 def split_samples(self, samples_to_split = 'all', grouping = 'by_session'): 1375 ''' 1376 Split unknown samples by UID (treat all analyses as different samples) 1377 or by session (treat analyses of a given sample in different sessions as 1378 different samples). 1379 1380 **Parameters** 1381 1382 + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']` 1383 + `grouping`: `by_uid` | `by_session` 1384 ''' 1385 if samples_to_split == 'all': 1386 samples_to_split = [s for s in self.unknowns] 1387 gkeys = {'by_uid':'UID', 'by_session':'Session'} 1388 self.grouping = grouping.lower() 1389 if self.grouping in gkeys: 1390 gkey = gkeys[self.grouping] 1391 for r in self: 1392 if r['Sample'] in samples_to_split: 1393 r['Sample_original'] = r['Sample'] 1394 r['Sample'] = f"{r['Sample']}__{r[gkey]}" 1395 elif r['Sample'] in self.unknowns: 1396 r['Sample_original'] = r['Sample'] 1397 self.refresh_samples() 1398 1399 1400 def unsplit_samples(self, tables = False): 1401 ''' 1402 Reverse the effects of `D47data.split_samples()`. 1403 1404 This should only be used after `D4xdata.standardize()` with `method='pooled'`. 1405 1406 After `D4xdata.standardize()` with `method='indep_sessions'`, one should 1407 probably use `D4xdata.combine_samples()` instead to reverse the effects of 1408 `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the 1409 effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in 1410 that case session-averaged Δ4x values are statistically independent). 
1411 ''' 1412 unknowns_old = sorted({s for s in self.unknowns}) 1413 CM_old = self.standardization.covar[:,:] 1414 VD_old = self.standardization.params.valuesdict().copy() 1415 vars_old = self.standardization.var_names 1416 1417 unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r}) 1418 1419 Ns = len(vars_old) - len(unknowns_old) 1420 vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new] 1421 VD_new = {k: VD_old[k] for k in vars_old[:Ns]} 1422 1423 W = np.zeros((len(vars_new), len(vars_old))) 1424 W[:Ns,:Ns] = np.eye(Ns) 1425 for u in unknowns_new: 1426 splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u}) 1427 if self.grouping == 'by_session': 1428 weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits] 1429 elif self.grouping == 'by_uid': 1430 weights = [1 for s in splits] 1431 sw = sum(weights) 1432 weights = [w/sw for w in weights] 1433 W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:] 1434 1435 CM_new = W @ CM_old @ W.T 1436 V = W @ np.array([[VD_old[k]] for k in vars_old]) 1437 VD_new = {k:v[0] for k,v in zip(vars_new, V)} 1438 1439 self.standardization.covar = CM_new 1440 self.standardization.params.valuesdict = lambda : VD_new 1441 self.standardization.var_names = vars_new 1442 1443 for r in self: 1444 if r['Sample'] in self.unknowns: 1445 r['Sample_split'] = r['Sample'] 1446 r['Sample'] = r['Sample_original'] 1447 1448 self.refresh_samples() 1449 self.consolidate_samples() 1450 self.repeatabilities() 1451 1452 if tables: 1453 self.table_of_analyses() 1454 self.table_of_samples() 1455 1456 def assign_timestamps(self): 1457 ''' 1458 Assign a time field `t` of type `float` to each analysis. 1459 1460 If `TimeTag` is one of the data fields, `t` is equal within a given session 1461 to `TimeTag` minus the mean value of `TimeTag` for that session. 1462 Otherwise, `TimeTag` is by default equal to the index of each analysis 1463 in the dataset and `t` is defined as above. 1464 ''' 1465 for session in self.sessions: 1466 sdata = self.sessions[session]['data'] 1467 try: 1468 t0 = np.mean([r['TimeTag'] for r in sdata]) 1469 for r in sdata: 1470 r['t'] = r['TimeTag'] - t0 1471 except KeyError: 1472 t0 = (len(sdata)-1)/2 1473 for t,r in enumerate(sdata): 1474 r['t'] = t - t0 1475 1476 1477 def report(self): 1478 ''' 1479 Prints a report on the standardization fit. 1480 Only applicable after `D4xdata.standardize(method='pooled')`. 1481 ''' 1482 report_fit(self.standardization) 1483 1484 1485 def combine_samples(self, sample_groups): 1486 ''' 1487 Combine analyses of different samples to compute weighted average Δ4x 1488 and new error (co)variances corresponding to the groups defined by the `sample_groups` 1489 dictionary. 1490 1491 Caution: samples are weighted by number of replicate analyses, which is a 1492 reasonable default behavior but is not always optimal (e.g., in the case of strongly 1493 correlated analytical errors for one or more samples). 
1494 1495 Returns a tuplet of: 1496 1497 + the list of group names 1498 + an array of the corresponding Δ4x values 1499 + the corresponding (co)variance matrix 1500 1501 **Parameters** 1502 1503 + `sample_groups`: a dictionary of the form: 1504 ```py 1505 {'group1': ['sample_1', 'sample_2'], 1506 'group2': ['sample_3', 'sample_4', 'sample_5']} 1507 ``` 1508 ''' 1509 1510 samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])] 1511 groups = sorted(sample_groups.keys()) 1512 group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups} 1513 D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples]) 1514 CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples]) 1515 W = np.array([ 1516 [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples] 1517 for j in groups]) 1518 D4x_new = W @ D4x_old 1519 CM_new = W @ CM_old @ W.T 1520 1521 return groups, D4x_new[:,0], CM_new 1522 1523 1524 @make_verbal 1525 def standardize(self, 1526 method = 'pooled', 1527 weighted_sessions = [], 1528 consolidate = True, 1529 consolidate_tables = False, 1530 consolidate_plots = False, 1531 constraints = {}, 1532 ): 1533 ''' 1534 Compute absolute Δ4x values for all replicate analyses and for sample averages. 1535 If `method` argument is set to `'pooled'`, the standardization processes all sessions 1536 in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, 1537 i.e. that their true Δ4x value does not change between sessions, 1538 ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to 1539 `'indep_sessions'`, the standardization processes each session independently, based only 1540 on anchors analyses. 1541 ''' 1542 1543 self.standardization_method = method 1544 self.assign_timestamps() 1545 1546 if method == 'pooled': 1547 if weighted_sessions: 1548 for session_group in weighted_sessions: 1549 if self._4x == '47': 1550 X = D47data([r for r in self if r['Session'] in session_group]) 1551 elif self._4x == '48': 1552 X = D48data([r for r in self if r['Session'] in session_group]) 1553 X.Nominal_D4x = self.Nominal_D4x.copy() 1554 X.refresh() 1555 result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False) 1556 w = np.sqrt(result.redchi) 1557 self.msg(f'Session group {session_group} MRSWD = {w:.4f}') 1558 for r in X: 1559 r[f'wD{self._4x}raw'] *= w 1560 else: 1561 self.msg(f'All D{self._4x}raw weights set to 1 ‰') 1562 for r in self: 1563 r[f'wD{self._4x}raw'] = 1. 1564 1565 params = Parameters() 1566 for k,session in enumerate(self.sessions): 1567 self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.") 1568 self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.") 1569 self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.") 1570 s = pf(session) 1571 params.add(f'a_{s}', value = 0.9) 1572 params.add(f'b_{s}', value = 0.) 
1573 params.add(f'c_{s}', value = -0.9) 1574 params.add(f'a2_{s}', value = 0., 1575# vary = self.sessions[session]['scrambling_drift'], 1576 ) 1577 params.add(f'b2_{s}', value = 0., 1578# vary = self.sessions[session]['slope_drift'], 1579 ) 1580 params.add(f'c2_{s}', value = 0., 1581# vary = self.sessions[session]['wg_drift'], 1582 ) 1583 if not self.sessions[session]['scrambling_drift']: 1584 params[f'a2_{s}'].expr = '0' 1585 if not self.sessions[session]['slope_drift']: 1586 params[f'b2_{s}'].expr = '0' 1587 if not self.sessions[session]['wg_drift']: 1588 params[f'c2_{s}'].expr = '0' 1589 1590 for sample in self.unknowns: 1591 params.add(f'D{self._4x}_{pf(sample)}', value = 0.5) 1592 1593 for k in constraints: 1594 params[k].expr = constraints[k] 1595 1596 def residuals(p): 1597 R = [] 1598 for r in self: 1599 session = pf(r['Session']) 1600 sample = pf(r['Sample']) 1601 if r['Sample'] in self.Nominal_D4x: 1602 R += [ ( 1603 r[f'D{self._4x}raw'] - ( 1604 p[f'a_{session}'] * self.Nominal_D4x[r['Sample']] 1605 + p[f'b_{session}'] * r[f'd{self._4x}'] 1606 + p[f'c_{session}'] 1607 + r['t'] * ( 1608 p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']] 1609 + p[f'b2_{session}'] * r[f'd{self._4x}'] 1610 + p[f'c2_{session}'] 1611 ) 1612 ) 1613 ) / r[f'wD{self._4x}raw'] ] 1614 else: 1615 R += [ ( 1616 r[f'D{self._4x}raw'] - ( 1617 p[f'a_{session}'] * p[f'D{self._4x}_{sample}'] 1618 + p[f'b_{session}'] * r[f'd{self._4x}'] 1619 + p[f'c_{session}'] 1620 + r['t'] * ( 1621 p[f'a2_{session}'] * p[f'D{self._4x}_{sample}'] 1622 + p[f'b2_{session}'] * r[f'd{self._4x}'] 1623 + p[f'c2_{session}'] 1624 ) 1625 ) 1626 ) / r[f'wD{self._4x}raw'] ] 1627 return R 1628 1629 M = Minimizer(residuals, params) 1630 result = M.least_squares() 1631 self.Nf = result.nfree 1632 self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf) 1633 new_names, new_covar, new_se = _fullcovar(result)[:3] 1634 result.var_names = new_names 1635 result.covar = new_covar 1636 1637 for r in self: 1638 s = pf(r["Session"]) 1639 a = result.params.valuesdict()[f'a_{s}'] 1640 b = result.params.valuesdict()[f'b_{s}'] 1641 c = result.params.valuesdict()[f'c_{s}'] 1642 a2 = result.params.valuesdict()[f'a2_{s}'] 1643 b2 = result.params.valuesdict()[f'b2_{s}'] 1644 c2 = result.params.valuesdict()[f'c2_{s}'] 1645 r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t']) 1646 1647 1648 self.standardization = result 1649 1650 for session in self.sessions: 1651 self.sessions[session]['Np'] = 3 1652 for k in ['scrambling', 'slope', 'wg']: 1653 if self.sessions[session][f'{k}_drift']: 1654 self.sessions[session]['Np'] += 1 1655 1656 if consolidate: 1657 self.consolidate(tables = consolidate_tables, plots = consolidate_plots) 1658 return result 1659 1660 1661 elif method == 'indep_sessions': 1662 1663 if weighted_sessions: 1664 for session_group in weighted_sessions: 1665 X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x) 1666 X.Nominal_D4x = self.Nominal_D4x.copy() 1667 X.refresh() 1668 # This is only done to assign r['wD47raw'] for r in X: 1669 X.standardize(method = method, weighted_sessions = [], consolidate = False) 1670 self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}') 1671 else: 1672 self.msg('All weights set to 1 ‰') 1673 for r in self: 1674 r[f'wD{self._4x}raw'] = 1 1675 1676 for session in self.sessions: 1677 s = self.sessions[session] 1678 p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2'] 
1679 p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']] 1680 s['Np'] = sum(p_active) 1681 sdata = s['data'] 1682 1683 A = np.array([ 1684 [ 1685 self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'], 1686 r[f'd{self._4x}'] / r[f'wD{self._4x}raw'], 1687 1 / r[f'wD{self._4x}raw'], 1688 self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'], 1689 r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'], 1690 r['t'] / r[f'wD{self._4x}raw'] 1691 ] 1692 for r in sdata if r['Sample'] in self.anchors 1693 ])[:,p_active] # only keep columns for the active parameters 1694 Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors]) 1695 s['Na'] = Y.size 1696 CM = linalg.inv(A.T @ A) 1697 bf = (CM @ A.T @ Y).T[0,:] 1698 k = 0 1699 for n,a in zip(p_names, p_active): 1700 if a: 1701 s[n] = bf[k] 1702# self.msg(f'{n} = {bf[k]}') 1703 k += 1 1704 else: 1705 s[n] = 0. 1706# self.msg(f'{n} = 0.0') 1707 1708 for r in sdata : 1709 a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2'] 1710 r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t']) 1711 r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t']) 1712 1713 s['CM'] = np.zeros((6,6)) 1714 i = 0 1715 k_active = [j for j,a in enumerate(p_active) if a] 1716 for j,a in enumerate(p_active): 1717 if a: 1718 s['CM'][j,k_active] = CM[i,:] 1719 i += 1 1720 1721 if not weighted_sessions: 1722 w = self.rmswd()['rmswd'] 1723 for r in self: 1724 r[f'wD{self._4x}'] *= w 1725 r[f'wD{self._4x}raw'] *= w 1726 for session in self.sessions: 1727 self.sessions[session]['CM'] *= w**2 1728 1729 for session in self.sessions: 1730 s = self.sessions[session] 1731 s['SE_a'] = s['CM'][0,0]**.5 1732 s['SE_b'] = s['CM'][1,1]**.5 1733 s['SE_c'] = s['CM'][2,2]**.5 1734 s['SE_a2'] = s['CM'][3,3]**.5 1735 s['SE_b2'] = s['CM'][4,4]**.5 1736 s['SE_c2'] = s['CM'][5,5]**.5 1737 1738 if not weighted_sessions: 1739 self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions]) 1740 else: 1741 self.Nf = 0 1742 for sg in weighted_sessions: 1743 self.Nf += self.rmswd(sessions = sg)['Nf'] 1744 1745 self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf) 1746 1747 avgD4x = { 1748 sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample]) 1749 for sample in self.samples 1750 } 1751 chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self]) 1752 rD4x = (chi2/self.Nf)**.5 1753 self.repeatability[f'sigma_{self._4x}'] = rD4x 1754 1755 if consolidate: 1756 self.consolidate(tables = consolidate_tables, plots = consolidate_plots) 1757 1758 1759 def standardization_error(self, session, d4x, D4x, t = 0): 1760 ''' 1761 Compute standardization error for a given session and 1762 (δ47, Δ47) composition. 1763 ''' 1764 a = self.sessions[session]['a'] 1765 b = self.sessions[session]['b'] 1766 c = self.sessions[session]['c'] 1767 a2 = self.sessions[session]['a2'] 1768 b2 = self.sessions[session]['b2'] 1769 c2 = self.sessions[session]['c2'] 1770 CM = self.sessions[session]['CM'] 1771 1772 x, y = D4x, d4x 1773 z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t 1774# x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t) 1775 dxdy = -(b+b2*t) / (a+a2*t) 1776 dxdz = 1. / (a+a2*t) 1777 dxda = -x / (a+a2*t) 1778 dxdb = -y / (a+a2*t) 1779 dxdc = -1. 
/ (a+a2*t) 1780 dxda2 = -x * a2 / (a+a2*t) 1781 dxdb2 = -y * t / (a+a2*t) 1782 dxdc2 = -t / (a+a2*t) 1783 V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2]) 1784 sx = (V @ CM @ V.T) ** .5 1785 return sx 1786 1787 1788 @make_verbal 1789 def summary(self, 1790 dir = 'output', 1791 filename = None, 1792 save_to_file = True, 1793 print_out = True, 1794 ): 1795 ''' 1796 Print out an/or save to disk a summary of the standardization results. 1797 1798 **Parameters** 1799 1800 + `dir`: the directory in which to save the table 1801 + `filename`: the name to the csv file to write to 1802 + `save_to_file`: whether to save the table to disk 1803 + `print_out`: whether to print out the table 1804 ''' 1805 1806 out = [] 1807 out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]] 1808 out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]] 1809 out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]] 1810 out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]] 1811 out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]] 1812 out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]] 1813 out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]] 1814 out += [['Model degrees of freedom', f"{self.Nf}"]] 1815 out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]] 1816 out += [['Standardization method', self.standardization_method]] 1817 1818 if save_to_file: 1819 if not os.path.exists(dir): 1820 os.makedirs(dir) 1821 if filename is None: 1822 filename = f'D{self._4x}_summary.csv' 1823 with open(f'{dir}/{filename}', 'w') as fid: 1824 fid.write(make_csv(out)) 1825 if print_out: 1826 self.msg('\n' + pretty_table(out, header = 0)) 1827 1828 1829 @make_verbal 1830 def table_of_sessions(self, 1831 dir = 'output', 1832 filename = None, 1833 save_to_file = True, 1834 print_out = True, 1835 output = None, 1836 ): 1837 ''' 1838 Print out an/or save to disk a table of sessions. 
1839 1840 **Parameters** 1841 1842 + `dir`: the directory in which to save the table 1843 + `filename`: the name to the csv file to write to 1844 + `save_to_file`: whether to save the table to disk 1845 + `print_out`: whether to print out the table 1846 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1847 if set to `'raw'`: return a list of list of strings 1848 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1849 ''' 1850 include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions]) 1851 include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions]) 1852 include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions]) 1853 1854 out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']] 1855 if include_a2: 1856 out[-1] += ['a2 ± SE'] 1857 if include_b2: 1858 out[-1] += ['b2 ± SE'] 1859 if include_c2: 1860 out[-1] += ['c2 ± SE'] 1861 for session in self.sessions: 1862 out += [[ 1863 session, 1864 f"{self.sessions[session]['Na']}", 1865 f"{self.sessions[session]['Nu']}", 1866 f"{self.sessions[session]['d13Cwg_VPDB']:.3f}", 1867 f"{self.sessions[session]['d18Owg_VSMOW']:.3f}", 1868 f"{self.sessions[session]['r_d13C_VPDB']:.4f}", 1869 f"{self.sessions[session]['r_d18O_VSMOW']:.4f}", 1870 f"{self.sessions[session][f'r_D{self._4x}']:.4f}", 1871 f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}", 1872 f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}", 1873 f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}", 1874 ]] 1875 if include_a2: 1876 if self.sessions[session]['scrambling_drift']: 1877 out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"] 1878 else: 1879 out[-1] += [''] 1880 if include_b2: 1881 if self.sessions[session]['slope_drift']: 1882 out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"] 1883 else: 1884 out[-1] += [''] 1885 if include_c2: 1886 if self.sessions[session]['wg_drift']: 1887 out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"] 1888 else: 1889 out[-1] += [''] 1890 1891 if save_to_file: 1892 if not os.path.exists(dir): 1893 os.makedirs(dir) 1894 if filename is None: 1895 filename = f'D{self._4x}_sessions.csv' 1896 with open(f'{dir}/{filename}', 'w') as fid: 1897 fid.write(make_csv(out)) 1898 if print_out: 1899 self.msg('\n' + pretty_table(out)) 1900 if output == 'raw': 1901 return out 1902 elif output == 'pretty': 1903 return pretty_table(out) 1904 1905 1906 @make_verbal 1907 def table_of_analyses( 1908 self, 1909 dir = 'output', 1910 filename = None, 1911 save_to_file = True, 1912 print_out = True, 1913 output = None, 1914 ): 1915 ''' 1916 Print out an/or save to disk a table of analyses. 
1917 1918 **Parameters** 1919 1920 + `dir`: the directory in which to save the table 1921 + `filename`: the name to the csv file to write to 1922 + `save_to_file`: whether to save the table to disk 1923 + `print_out`: whether to print out the table 1924 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1925 if set to `'raw'`: return a list of list of strings 1926 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1927 ''' 1928 1929 out = [['UID','Session','Sample']] 1930 extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}] 1931 for f in extra_fields: 1932 out[-1] += [f[0]] 1933 out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}'] 1934 for r in self: 1935 out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]] 1936 for f in extra_fields: 1937 out[-1] += [f"{r[f[0]]:{f[1]}}"] 1938 out[-1] += [ 1939 f"{r['d13Cwg_VPDB']:.3f}", 1940 f"{r['d18Owg_VSMOW']:.3f}", 1941 f"{r['d45']:.6f}", 1942 f"{r['d46']:.6f}", 1943 f"{r['d47']:.6f}", 1944 f"{r['d48']:.6f}", 1945 f"{r['d49']:.6f}", 1946 f"{r['d13C_VPDB']:.6f}", 1947 f"{r['d18O_VSMOW']:.6f}", 1948 f"{r['D47raw']:.6f}", 1949 f"{r['D48raw']:.6f}", 1950 f"{r['D49raw']:.6f}", 1951 f"{r[f'D{self._4x}']:.6f}" 1952 ] 1953 if save_to_file: 1954 if not os.path.exists(dir): 1955 os.makedirs(dir) 1956 if filename is None: 1957 filename = f'D{self._4x}_analyses.csv' 1958 with open(f'{dir}/{filename}', 'w') as fid: 1959 fid.write(make_csv(out)) 1960 if print_out: 1961 self.msg('\n' + pretty_table(out)) 1962 return out 1963 1964 @make_verbal 1965 def covar_table( 1966 self, 1967 correl = False, 1968 dir = 'output', 1969 filename = None, 1970 save_to_file = True, 1971 print_out = True, 1972 output = None, 1973 ): 1974 ''' 1975 Print out, save to disk and/or return the variance-covariance matrix of D4x 1976 for all unknown samples. 1977 1978 **Parameters** 1979 1980 + `dir`: the directory in which to save the csv 1981 + `filename`: the name of the csv file to write to 1982 + `save_to_file`: whether to save the csv 1983 + `print_out`: whether to print out the matrix 1984 + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); 1985 if set to `'raw'`: return a list of list of strings 1986 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1987 ''' 1988 samples = sorted([u for u in self.unknowns]) 1989 out = [[''] + samples] 1990 for s1 in samples: 1991 out.append([s1]) 1992 for s2 in samples: 1993 if correl: 1994 out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}') 1995 else: 1996 out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}') 1997 1998 if save_to_file: 1999 if not os.path.exists(dir): 2000 os.makedirs(dir) 2001 if filename is None: 2002 if correl: 2003 filename = f'D{self._4x}_correl.csv' 2004 else: 2005 filename = f'D{self._4x}_covar.csv' 2006 with open(f'{dir}/{filename}', 'w') as fid: 2007 fid.write(make_csv(out)) 2008 if print_out: 2009 self.msg('\n'+pretty_table(out)) 2010 if output == 'raw': 2011 return out 2012 elif output == 'pretty': 2013 return pretty_table(out) 2014 2015 @make_verbal 2016 def table_of_samples( 2017 self, 2018 dir = 'output', 2019 filename = None, 2020 save_to_file = True, 2021 print_out = True, 2022 output = None, 2023 ): 2024 ''' 2025 Print out, save to disk and/or return a table of samples. 
@make_verbal
def table_of_samples(
	self,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a table of samples.

	**Parameters**

	+ `dir`: the directory in which to save the csv
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the csv
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
	  if set to `'raw'`: return a list of list of strings
	  (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''

	out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
	for sample in self.anchors:
		out += [[
			f"{sample}",
			f"{self.samples[sample]['N']}",
			f"{self.samples[sample]['d13C_VPDB']:.2f}",
			f"{self.samples[sample]['d18O_VSMOW']:.2f}",
			f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
			f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
			]]
	for sample in self.unknowns:
		out += [[
			f"{sample}",
			f"{self.samples[sample]['N']}",
			f"{self.samples[sample]['d13C_VPDB']:.2f}",
			f"{self.samples[sample]['d18O_VSMOW']:.2f}",
			f"{self.samples[sample][f'D{self._4x}']:.4f}",
			f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
			f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
			f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
			f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
			]]
	if save_to_file:
		if not os.path.exists(dir):
			os.makedirs(dir)
		if filename is None:
			filename = f'D{self._4x}_samples.csv'
		with open(f'{dir}/{filename}', 'w') as fid:
			fid.write(make_csv(out))
	if print_out:
		self.msg('\n'+pretty_table(out))
	if output == 'raw':
		return out
	elif output == 'pretty':
		return pretty_table(out)


def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
	'''
	Generate session plots and save them to disk.

	**Parameters**

	+ `dir`: the directory in which to save the plots
	+ `figsize`: the width and height (in inches) of each plot
	+ `filetype`: 'pdf' or 'png'
	+ `dpi`: resolution for PNG output
	'''
	if not os.path.exists(dir):
		os.makedirs(dir)

	for session in self.sessions:
		sp = self.plot_single_session(session, xylimits = 'constant')
		ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
		ppl.close(sp.fig)
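A quick sketch of a typical call (again assuming a standardized `mydata` object; the directory name is arbitrary), writing one PNG plot per session:

```python
# Save one plot per session as PNG files in ./sessionplots:
mydata.plot_sessions(dir = 'sessionplots', filetype = 'png', dpi = 200)
```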
@make_verbal
def consolidate_samples(self):
	'''
	Compile various statistics for each sample.

	For each anchor sample:

	+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
	+ `SE_D47` or `SE_D48`: set to zero by definition

	For each unknown sample:

	+ `D47` or `D48`: the standardized Δ4x value for this unknown
	+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

	For each anchor and unknown:

	+ `N`: the total number of analyses of this sample
	+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
	+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
	+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
	+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
	  variance, indicating whether the Δ4x repeatability of this sample differs significantly from
	  that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
	'''
	D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
	for sample in self.samples:
		self.samples[sample]['N'] = len(self.samples[sample]['data'])
		if self.samples[sample]['N'] > 1:
			self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

		self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
		self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

		D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
		if len(D4x_pop) > 2:
			self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

	if self.standardization_method == 'pooled':
		for sample in self.anchors:
			self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
			self.samples[sample][f'SE_D{self._4x}'] = 0.
		for sample in self.unknowns:
			self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
			try:
				self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
			except ValueError:
				# when `sample` is constrained by self.standardize(constraints = {...}),
				# it is no longer listed in self.standardization.var_names.
				# Temporary fix: define SE as zero for now
				self.samples[sample][f'SE_D{self._4x}'] = 0.

	elif self.standardization_method == 'indep_sessions':
		for sample in self.anchors:
			self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
			self.samples[sample][f'SE_D{self._4x}'] = 0.
		for sample in self.unknowns:
			self.msg(f'Consolidating sample {sample}')
			self.unknowns[sample][f'session_D{self._4x}'] = {}
			session_avg = []
			for session in self.sessions:
				sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
				if sdata:
					self.msg(f'{sample} found in session {session}')
					avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
					avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
					# !! TODO: sigma_s below does not account for temporal changes in standardization error
					sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
					sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
					session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
					self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
			self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
			weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
			wsum = sum([weights[s] for s in weights])
			for s in weights:
				self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

	for r in self:
		r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']


def consolidate_sessions(self):
	'''
	Compute various statistics for each session.

	+ `Na`: Number of anchor analyses in the session
	+ `Nu`: Number of unknown analyses in the session
	+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
	+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
	+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
	+ `a`: scrambling factor
	+ `b`: compositional slope
	+ `c`: WG offset
	+ `SE_a`: Model standard error of `a`
	+ `SE_b`: Model standard error of `b`
	+ `SE_c`: Model standard error of `c`
	+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
	+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
	+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
	+ `a2`: scrambling factor drift
	+ `b2`: compositional slope drift
	+ `c2`: WG offset drift
	+ `Np`: Number of standardization parameters to fit
	+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
	+ `d13Cwg_VPDB`: δ13C_VPDB of WG
	+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
	'''
	for session in self.sessions:
		if 'd13Cwg_VPDB' not in self.sessions[session]:
			self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
		if 'd18Owg_VSMOW' not in self.sessions[session]:
			self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
		self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
		self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

		self.msg(f'Computing repeatabilities for session {session}')
		self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
		self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
		self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

	if self.standardization_method == 'pooled':
		for session in self.sessions:

			self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
			i = self.standardization.var_names.index(f'a_{pf(session)}')
			self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5

			self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
			i = self.standardization.var_names.index(f'b_{pf(session)}')
			self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5

			self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
			i = self.standardization.var_names.index(f'c_{pf(session)}')
			self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5

			self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
			if self.sessions[session]['scrambling_drift']:
				i = self.standardization.var_names.index(f'a2_{pf(session)}')
				self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
			else:
				self.sessions[session]['SE_a2'] = 0.

			self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
			if self.sessions[session]['slope_drift']:
				i = self.standardization.var_names.index(f'b2_{pf(session)}')
				self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
			else:
				self.sessions[session]['SE_b2'] = 0.

			self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
			if self.sessions[session]['wg_drift']:
				i = self.standardization.var_names.index(f'c2_{pf(session)}')
				self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
			else:
				self.sessions[session]['SE_c2'] = 0.

			i = self.standardization.var_names.index(f'a_{pf(session)}')
			j = self.standardization.var_names.index(f'b_{pf(session)}')
			k = self.standardization.var_names.index(f'c_{pf(session)}')
			CM = np.zeros((6,6))
			CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
			try:
				i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
				CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
				CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
				try:
					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
					CM[3,4] = self.standardization.covar[i2,j2]
					CM[4,3] = self.standardization.covar[j2,i2]
				except ValueError:
					pass
				try:
					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
					CM[3,5] = self.standardization.covar[i2,k2]
					CM[5,3] = self.standardization.covar[k2,i2]
				except ValueError:
					pass
			except ValueError:
				pass
			try:
				j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
				CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
				CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
				try:
					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
					CM[4,5] = self.standardization.covar[j2,k2]
					CM[5,4] = self.standardization.covar[k2,j2]
				except ValueError:
					pass
			except ValueError:
				pass
			try:
				k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
				CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
				CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
			except ValueError:
				pass

			self.sessions[session]['CM'] = CM

	elif self.standardization_method == 'indep_sessions':
		pass # Not implemented yet
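These per-session statistics are stored in `self.sessions` once consolidation has run (it is normally invoked automatically at the end of standardization). A minimal sketch, assuming a standardized `mydata` object:

```python
# Inspect the standardization parameters fitted for each session:
for session in mydata.sessions:
	s = mydata.sessions[session]
	print(f"{session}: a = {s['a']:.3f} ± {s['SE_a']:.3f}, c = {s['c']:.4f} ± {s['SE_c']:.4f}")
```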
@make_verbal
def repeatabilities(self):
	'''
	Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
	(for all samples, for anchors, and for unknowns).
	'''
	self.msg('Computing repeatabilities for all sessions')

	self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
	self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
	self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
	self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
	self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')


@make_verbal
def consolidate(self, tables = True, plots = True):
	'''
	Collect information about samples, sessions and repeatabilities.
	'''
	self.consolidate_samples()
	self.consolidate_sessions()
	self.repeatabilities()

	if tables:
		self.summary()
		self.table_of_sessions()
		self.table_of_analyses()
		self.table_of_samples()

	if plots:
		self.plot_sessions()


@make_verbal
def rmswd(self,
	samples = 'all samples',
	sessions = 'all sessions',
	):
	'''
	Compute the χ2, the root mean squared weighted deviation
	(i.e. the square root of the reduced χ2), and the corresponding degrees of freedom of the
	Δ4x values for samples in `samples` and sessions in `sessions`.

	Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
	'''
	if samples == 'all samples':
		mysamples = [k for k in self.samples]
	elif samples == 'anchors':
		mysamples = [k for k in self.anchors]
	elif samples == 'unknowns':
		mysamples = [k for k in self.unknowns]
	else:
		mysamples = samples

	if sessions == 'all sessions':
		sessions = [k for k in self.sessions]

	chisq, Nf = 0, 0
	for sample in mysamples:
		G = [r for r in self if r['Sample'] == sample and r['Session'] in sessions]
		if len(G) > 1:
			X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
			Nf += (len(G) - 1)
			chisq += np.sum([((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
	r = (chisq / Nf)**.5 if Nf > 0 else 0
	self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
	return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}


@make_verbal
def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
	'''
	Compute the repeatability of `[r[key] for r in self]`
	'''

	if samples == 'all samples':
		mysamples = [k for k in self.samples]
	elif samples == 'anchors':
		mysamples = [k for k in self.anchors]
	elif samples == 'unknowns':
		mysamples = [k for k in self.unknowns]
	else:
		mysamples = samples

	if sessions == 'all sessions':
		sessions = [k for k in self.sessions]

	if key in ['D47', 'D48']:
		# Full disclosure: the definition of Nf is tricky/debatable
		G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
		chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
		Nf = len(G)
#		print(f'len(G) = {Nf}')
		Nf -= len([s for s in mysamples if s in self.unknowns])
#		print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
		for session in sessions:
			Np = len([
				_ for _ in self.standardization.params
				if (
					self.standardization.params[_].expr is not None
					and (
						(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
						or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
						)
					)
				])
#			print(f'session {session}: {Np} parameters to consider')
			Na = len({
				r['Sample'] for r in self.sessions[session]['data']
				if r['Sample'] in self.anchors and r['Sample'] in mysamples
				})
#			print(f'session {session}: {Na} different anchors in that session')
			Nf -= min(Np, Na)
#		print(f'Nf = {Nf}')

#		for sample in mysamples :
#			X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
#			if len(X) > 1 :
#				chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
#				if sample in self.unknowns:
#					Nf += len(X) - 1
#				else:
#					Nf += len(X)
#		if samples in ['anchors', 'all samples']:
#			Nf -= sum([self.sessions[s]['Np'] for s in sessions])
		r = (chisq / Nf)**.5 if Nf > 0 else 0

	else: # if key not in ['D47', 'D48']
		chisq, Nf = 0, 0
		for sample in mysamples:
			X = [r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions]
			if len(X) > 1:
				Nf += len(X) - 1
				chisq += np.sum([(x-np.mean(X))**2 for x in X])
		r = (chisq / Nf)**.5 if Nf > 0 else 0

	self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
	return r

def sample_average(self, samples, weights = 'equal', normalize = True):
	'''
	Weighted average Δ4x value of a group of samples, accounting for covariance.

	Returns the weighted average Δ4x value and associated SE
	of a group of samples. Weights are equal by default. If `normalize` is
	true, `weights` will be rescaled so that their sum equals 1.

	**Examples**

	```python
	self.sample_average(['X','Y'], [1, 2])
	```

	returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
	where Δ4x(X) and Δ4x(Y) are the average Δ4x
	values of samples X and Y, respectively.

	```python
	self.sample_average(['X','Y'], [1, -1], normalize = False)
	```

	returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
	'''
	if weights == 'equal':
		weights = [1/len(samples)] * len(samples)

	if normalize:
		s = sum(weights)
		if s:
			weights = [w/s for w in weights]

	try:
#		indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
#		C = self.standardization.covar[indices,:][:,indices]
		C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
		X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
		return correlated_sum(X, C, weights)
	except ValueError:
		return (0., 0.)
def sample_D4x_covar(self, sample1, sample2 = None):
	'''
	Covariance between Δ4x values of samples

	Returns the error covariance between the average Δ4x values of two
	samples. If only `sample1` is specified, or if `sample1 == sample2`,
	returns the Δ4x variance for that sample.
	'''
	if sample2 is None:
		sample2 = sample1
	if self.standardization_method == 'pooled':
		i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
		j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
		return self.standardization.covar[i, j]
	elif self.standardization_method == 'indep_sessions':
		if sample1 == sample2:
			return self.samples[sample1][f'SE_D{self._4x}']**2
		else:
			c = 0
			for session in self.sessions:
				sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
				sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
				if sdata1 and sdata2:
					a = self.sessions[session]['a']
					# !! TODO: CM below does not account for temporal changes in standardization parameters
					CM = self.sessions[session]['CM'][:3,:3]
					avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
					avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
					avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
					avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
					c += (
						self.unknowns[sample1][f'session_D{self._4x}'][session][2]
						* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
						* np.array([[avg_D4x_1, avg_d4x_1, 1]])
						@ CM
						@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
						) / a**2
			return float(c)

def sample_D4x_correl(self, sample1, sample2 = None):
	'''
	Correlation between Δ4x errors of samples

	Returns the error correlation between the average Δ4x values of two samples.
	'''
	if sample2 is None or sample2 == sample1:
		return 1.
	return (
		self.sample_D4x_covar(sample1, sample2)
		/ self.unknowns[sample1][f'SE_D{self._4x}']
		/ self.unknowns[sample2][f'SE_D{self._4x}']
		)
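These covariance terms are what makes fully propagated error estimates on sample comparisons possible. A short sketch, assuming a standardized `mydata` object containing the two unknowns from the tutorial (`MYSAMPLE-1`, `MYSAMPLE-2`):

```python
# Standard error of the difference Δ47(MYSAMPLE-1) - Δ47(MYSAMPLE-2),
# accounting for the error covariance between the two samples:
var_diff = (
	mydata.sample_D4x_covar('MYSAMPLE-1')
	+ mydata.sample_D4x_covar('MYSAMPLE-2')
	- 2 * mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')
	)
SE_diff = var_diff ** 0.5

# The same result may be obtained directly with sample_average():
D_diff, SE_diff = mydata.sample_average(['MYSAMPLE-1', 'MYSAMPLE-2'], [1, -1], normalize = False)
```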
def plot_single_session(self,
	session,
	kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
	kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
	kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
	kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
	kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
	xylimits = 'free', # | 'constant'
	x_label = None,
	y_label = None,
	error_contour_interval = 'auto',
	fig = 'new',
	):
	'''
	Generate plot for a single session
	'''
	if x_label is None:
		x_label = f'δ$_{{{self._4x}}}$ (‰)'
	if y_label is None:
		y_label = f'Δ$_{{{self._4x}}}$ (‰)'

	out = _SessionPlot()
	anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
	unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]

	if fig == 'new':
		out.fig = ppl.figure(figsize = (6,6))
		ppl.subplots_adjust(.1,.1,.9,.9)

	out.anchor_analyses, = ppl.plot(
		[r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
		[r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
		**kw_plot_anchors)
	out.unknown_analyses, = ppl.plot(
		[r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
		[r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
		**kw_plot_unknowns)
	out.anchor_avg = ppl.plot(
		np.array([ np.array([
			np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
			np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
			]) for sample in anchors]).T,
		np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T,
		**kw_plot_anchor_avg)
	out.unknown_avg = ppl.plot(
		np.array([ np.array([
			np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
			np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
			]) for sample in unknowns]).T,
		np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T,
		**kw_plot_unknown_avg)
	if xylimits == 'constant':
		x = [r[f'd{self._4x}'] for r in self]
		y = [r[f'D{self._4x}'] for r in self]
		x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
		w, h = x2-x1, y2-y1
		x1 -= w/20
		x2 += w/20
		y1 -= h/20
		y2 += h/20
		ppl.axis([x1, x2, y1, y2])
	elif xylimits == 'free':
		x1, x2, y1, y2 = ppl.axis()
	else:
		x1, x2, y1, y2 = ppl.axis(xylimits)

	if error_contour_interval != 'none':
		xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
		XI,YI = np.meshgrid(xi, yi)
		SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
		if error_contour_interval == 'auto':
			rng = np.max(SI) - np.min(SI)
			if rng <= 0.01:
				cinterval = 0.001
			elif rng <= 0.03:
				cinterval = 0.004
			elif rng <= 0.1:
				cinterval = 0.01
			elif rng <= 0.3:
				cinterval = 0.03
			elif rng <= 1.:
				cinterval = 0.1
			else:
				cinterval = 0.5
		else:
			cinterval = error_contour_interval

		cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
		out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
		out.clabel = ppl.clabel(out.contour)

	ppl.xlabel(x_label)
	ppl.ylabel(y_label)
	ppl.title(session, weight = 'bold')
	ppl.grid(alpha = .2)
	out.ax = ppl.gca()

	return out

def plot_residuals(
	self,
	kde = False,
	hist = False,
	binwidth = 2/3,
	dir = 'output',
	filename = None,
	highlight = [],
	colors = None,
	figsize = None,
	dpi = 100,
	yspan = None,
	):
	'''
	Plot residuals of each analysis as a function of time (actually, as a function of
	the order of analyses in the `D4xdata` object)

	+ `kde`: whether to add a kernel density estimate of residuals
	+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
	+ `binwidth`: the width of histogram bins, in multiples of the Δ4x repeatability
	+ `dir`: the directory in which to save the plot
	+ `filename`: the name of the file to save to; if `None` (default), return the
	  figure without saving it; if `''`, save to a default file name
	+ `highlight`: a list of samples to highlight
	+ `colors`: a dict of `{<sample>: <color>}` for all samples
	+ `figsize`: (width, height) of figure
	+ `dpi`: resolution for PNG output
	+ `yspan`: factor controlling the range of y values shown in plot
	  (by default: `yspan = 1.5 if kde else 1.0`)
	'''

	from matplotlib import ticker

	if yspan is None:
		if kde:
			yspan = 1.5
		else:
			yspan = 1.0

	# Layout
	fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
	if hist or kde:
		ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
		ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
	else:
		ppl.subplots_adjust(.08,.05,.78,.8)
		ax1 = ppl.subplot(111)

	# Colors
	N = len(self.anchors)
	if colors is None:
		if len(highlight) > 0:
			Nh = len(highlight)
			if Nh == 1:
				colors = {highlight[0]: (0,0,0)}
			elif Nh == 3:
				colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
			elif Nh == 4:
				colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
			else:
				colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
		else:
			if N == 3:
				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
			elif N == 4:
				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
			else:
				colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}

	ppl.sca(ax1)

	ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)

	ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))

	session = self[0]['Session']
	x1 = 0
#	ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
	x_sessions = {}
	one_or_more_singlets = False
	one_or_more_multiplets = False
	multiplets = set()
	for k,r in enumerate(self):
		if r['Session'] != session:
			x2 = k-1
			x_sessions[session] = (x1+x2)/2
			ppl.axvline(k - 0.5, color = 'k', lw = .5)
			session = r['Session']
			x1 = k
		singlet = len(self.samples[r['Sample']]['data']) == 1
		if not singlet:
			multiplets.add(r['Sample'])
		if r['Sample'] in self.unknowns:
			if singlet:
				one_or_more_singlets = True
			else:
				one_or_more_multiplets = True
		kw = dict(
			marker = 'x' if singlet else '+',
			ms = 4 if singlet else 5,
			ls = 'None',
			mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
			mew = 1,
			alpha = 0.2 if singlet else 1,
			)
		if highlight and r['Sample'] not in highlight:
			kw['alpha'] = 0.2
		ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
	x2 = k
	x_sessions[session] = (x1+x2)/2

	ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
	ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
	if not (hist or kde):
		ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
		ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')

	xmin, xmax, ymin, ymax = ppl.axis()
	if yspan != 1:
		ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
	for s in x_sessions:
		ppl.text(
			x_sessions[s],
			ymax +1,
			s,
			va = 'bottom',
			**(
				dict(ha = 'center')
				if len(self.sessions[s]['data']) > (0.15 * len(self))
				else dict(ha = 'left', rotation = 45)
				)
			)

	if hist or kde:
		ppl.sca(ax2)

	for s in colors:
		kw['marker'] = '+'
		kw['ms'] = 5
		kw['mec'] = colors[s]
		kw['label'] = s
		kw['alpha'] = 1
		ppl.plot([], [], **kw)

	kw['mec'] = (0,0,0)

	if one_or_more_singlets:
		kw['marker'] = 'x'
		kw['ms'] = 4
		kw['alpha'] = .2
		kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
		ppl.plot([], [], **kw)

	if one_or_more_multiplets:
		kw['marker'] = '+'
		kw['ms'] = 4
		kw['alpha'] = 1
		kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
		ppl.plot([], [], **kw)

	if hist or kde:
		leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
	else:
		leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
	leg.set_zorder(-1000)

	ppl.sca(ax1)

	ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
	ppl.xticks([])
	ppl.axis([-1, len(self), None, None])

	if hist or kde:
		ppl.sca(ax2)
		X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])

		if kde:
			from scipy.stats import gaussian_kde
			yi = np.linspace(ymin, ymax, 201)
			xi = gaussian_kde(X).evaluate(yi)
			ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
#			ppl.plot(xi, yi, 'k-', lw = 1)
		elif hist:
			ppl.hist(
				X,
				orientation = 'horizontal',
				histtype = 'stepfilled',
				ec = [.4]*3,
				fc = [.25]*3,
				alpha = .25,
				bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
				)
			ppl.text(0, 0,
				f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
				size = 7.5,
				alpha = 1,
				va = 'center',
				ha = 'left',
				)

		ppl.axis([0, None, ymin, ymax])
		ppl.xticks([])
		ppl.yticks([])
#		ax2.spines['left'].set_visible(False)
		ax2.spines['right'].set_visible(False)
		ax2.spines['top'].set_visible(False)
		ax2.spines['bottom'].set_visible(False)

	ax1.axis([None, None, ymin, ymax])

	if not os.path.exists(dir):
		os.makedirs(dir)
	if filename is None:
		return fig
	elif filename == '':
		filename = f'D{self._4x}_residuals.pdf'
	ppl.savefig(f'{dir}/{filename}', dpi = dpi)
	ppl.close(fig)
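A minimal usage sketch (again assuming a standardized `mydata` object):

```python
# Save a residual plot with a kernel density estimate to
# 'output/D47_residuals.pdf' (filename = '' selects the default file name):
mydata.plot_residuals(kde = True, filename = '')
```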
def simulate(self, *args, **kwargs):
	'''
	Legacy function with warning message pointing to `virtual_data()`
	'''
	raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')

def plot_distribution_of_analyses(
	self,
	dir = 'output',
	filename = None,
	vs_time = False,
	figsize = (6,4),
	subplots_adjust = (0.02, 0.13, 0.85, 0.8),
	output = None,
	dpi = 100,
	):
	'''
	Plot temporal distribution of all analyses in the data set.

	**Parameters**

	+ `dir`: the directory in which to save the plot
	+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
	+ `figsize`: (width, height) of figure
	+ `dpi`: resolution for PNG output
	'''

	asamples = [s for s in self.anchors]
	usamples = [s for s in self.unknowns]
	if output is None or output == 'fig':
		fig = ppl.figure(figsize = figsize)
		ppl.subplots_adjust(*subplots_adjust)
	Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
	Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
	Xmax += (Xmax-Xmin)/40
	Xmin -= (Xmax-Xmin)/41
	for k, s in enumerate(asamples + usamples):
		if vs_time:
			X = [r['TimeTag'] for r in self if r['Sample'] == s]
		else:
			X = [x for x,r in enumerate(self) if r['Sample'] == s]
		Y = [-k for x in X]
		ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
		ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
		ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
	ppl.axis([Xmin, Xmax, -k-1, 1])
	ppl.xlabel('\ntime')
	ppl.gca().annotate('',
		xy = (0.6, -0.02),
		xycoords = 'axes fraction',
		xytext = (.4, -0.02),
		arrowprops = dict(arrowstyle = "->", color = 'k'),
		)

	x2 = -1
	for session in self.sessions:
		x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
		if vs_time:
			ppl.axvline(x1, color = 'k', lw = .75)
		if x2 > -1:
			if not vs_time:
				ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
		x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
#		from xlrd import xldate_as_datetime
#		print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
		if vs_time:
			ppl.axvline(x2, color = 'k', lw = .75)
			ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
		ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

	ppl.xticks([])
	ppl.yticks([])

	if output is None:
		if not os.path.exists(dir):
			os.makedirs(dir)
		if filename == None:
			filename = f'D{self._4x}_distribution_of_analyses.pdf'
		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
		ppl.close(fig)
	elif output == 'ax':
		return ppl.gca()
	elif output == 'fig':
		return fig
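A short usage sketch (plotting against `TimeTag` requires that field to be defined for every analysis; the default, sequential mode does not):

```python
# Plot all analyses in sequence order and save under the default file name:
mydata.plot_distribution_of_analyses()
```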
def plot_bulk_compositions(
	self,
	samples = None,
	dir = 'output/bulk_compositions',
	figsize = (6,6),
	subplots_adjust = (0.15, 0.12, 0.95, 0.92),
	show = False,
	sample_color = (0,.5,1),
	analysis_color = (.7,.7,.7),
	labeldist = 0.3,
	radius = 0.05,
	):
	'''
	Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

	By default, creates a directory `./output/bulk_compositions` where plots for
	each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

	**Parameters**

	+ `samples`: Only these samples are processed (by default: all samples).
	+ `dir`: where to save the plots
	+ `figsize`: (width, height) of figure
	+ `subplots_adjust`: passed to `subplots_adjust()`
	+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
	  allowing for interactive visualization/exploration in (δ13C, δ18O) space.
	+ `sample_color`: color used for sample average markers/labels
	+ `analysis_color`: color used for replicate (individual analysis) markers/labels
	+ `labeldist`: distance (in inches) from replicate markers to replicate labels
	+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
	'''

	from matplotlib.patches import Ellipse

	if samples is None:
		samples = [_ for _ in self.samples]

	saved = {}

	for s in samples:

		fig = ppl.figure(figsize = figsize)
		fig.subplots_adjust(*subplots_adjust)
		ax = ppl.subplot(111)
		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
		ppl.title(s)

		XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
		UID = [_['UID'] for _ in self.samples[s]['data']]
		XY0 = XY.mean(0)

		for xy in XY:
			ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)

		ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
		ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
		ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
		saved[s] = [XY, XY0]

		x1, x2, y1, y2 = ppl.axis()
		x0, dx = (x1+x2)/2, (x2-x1)/2
		y0, dy = (y1+y2)/2, (y2-y1)/2
		dx, dy = [max(max(dx, dy), radius)]*2

		ppl.axis([
			x0 - 1.2*dx,
			x0 + 1.2*dx,
			y0 - 1.2*dy,
			y0 + 1.2*dy,
			])

		XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))

		for xy, uid in zip(XY, UID):

			xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
			vector_in_display_space = xy_in_display_space - XY0_in_display_space

			if (vector_in_display_space**2).sum() > 0:

				unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
				label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
				label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
				label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))

				ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)

			else:

				ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)

		if radius:
			ax.add_artist(Ellipse(
				xy = XY0,
				width = radius*2,
				height = radius*2,
				ls = (0, (2,2)),
				lw = .7,
				ec = analysis_color,
				fc = 'None',
				))
			ppl.text(
				XY0[0],
				XY0[1]-radius,
				f'\n± {radius*1e3:.0f} ppm',
				color = analysis_color,
				va = 'top',
				ha = 'center',
				linespacing = 0.4,
				size = 8,
				)

		if not os.path.exists(dir):
			os.makedirs(dir)
		fig.savefig(f'{dir}/{s}.pdf')
		ppl.close(fig)

	fig = ppl.figure(figsize = figsize)
	fig.subplots_adjust(*subplots_adjust)
	ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
	ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')

	for s in saved:
		for xy in saved[s][0]:
			ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
		ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
		ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
		ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')

	x1, x2, y1, y2 = ppl.axis()
	ppl.axis([
		x1 - (x2-x1)/10,
		x2 + (x2-x1)/10,
		y1 - (y2-y1)/10,
		y2 + (y2-y1)/10,
		])

	if not os.path.exists(dir):
		os.makedirs(dir)
	fig.savefig(f'{dir}/__all__.pdf')
	if show:
		ppl.show()
	ppl.close(fig)
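A short usage sketch (assuming a crunched `mydata` object; the sample names are those of the tutorial data set):

```python
# Plot bulk compositions for two samples only, and open the combined
# figure interactively:
mydata.plot_bulk_compositions(samples = ['ETH-1', 'ETH-2'], show = True)
```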
Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.
def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
	'''
	**Parameters**

	+ `l`: a list of dictionaries, with each dictionary including at least the keys
	  `Sample`, `d45`, `d46`, and `d47` or `d48`.
	+ `mass`: `'47'` or `'48'`
	+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
	+ `session`: define session name for analyses without a `Session` key
	+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

	Returns a `D4xdata` object derived from `list`.
	'''
	self._4x = mass
	self.verbose = verbose
	self.prefix = 'D4xdata'
	self.logfile = logfile
	list.__init__(self, l)
	self.Nf = None
	self.repeatability = {}
	self.refresh(session = session)
**Parameters**

+ `l`: a list of dictionaries, with each dictionary including at least the keys `Sample`, `d45`, `d46`, and `d47` or `d48`.
+ `mass`: `'47'` or `'48'`
+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
+ `session`: define session name for analyses without a `Session` key
+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

Returns a `D4xdata` object derived from `list`.
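For instance, a minimal sketch constructing a `D47data` object directly from a list of dictionaries (the delta values below are purely illustrative):

```python
import D47crunch

rawdata = [
	{'UID': 'A01', 'Sample': 'ETH-1',      'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
	{'UID': 'A02', 'Sample': 'MYSAMPLE-1', 'd45': 6.219, 'd46': 11.491, 'd47': 17.277},
	]
mydata = D47crunch.D47data(rawdata, session = 'Session01', verbose = True)
```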
Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976)
Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005)
Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to `R13_VPDB`)
Absolute (18O/16O) ratio of VPDB. By definition equal to `R18_VSMOW * 1.03092`.
Absolute (17O/16O) ratio of VPDB. By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal).
`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which sample should be used as a reference for this test.
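For example, a trivial sketch (any sample name present in the data set would work):

```python
# Use ETH-1 instead of ETH-3 as the reference sample for Levene's test:
mydata.LEVENE_REF_SAMPLE = 'ETH-1'
```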
Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by `D4xdata.wg()`, `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`. By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
Nominal δ13C_VPDB values assigned to carbonate standards, used by `D4xdata.standardize_d13C()`. By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after Bernasconi et al. (2018).
Nominal δ18O_VPDB values assigned to carbonate standards, used by `D4xdata.standardize_d18O()`. By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after Bernasconi et al. (2018).
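For instance, a sketch extending the default values with a hypothetical in-house standard (`MY-STD` and its nominal values below are placeholders):

```python
mydata.Nominal_d13C_VPDB = {
	'ETH-1': 2.02,
	'ETH-2': -10.17,
	'ETH-3': 1.71,
	'MY-STD': 1.23,   # hypothetical in-house standard
	}
```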
Method by which to standardize δ13C values:

+ `none`: do not apply any δ13C standardization.
+ `'1pt'`: within each session, offset all initial δ13C values so as to minimize the difference between final δ13C_VPDB values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
+ `'2pt'`: within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13C_VPDB values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
Method by which to standardize δ18O values:

+ `none`: do not apply any δ18O standardization.
+ `'1pt'`: within each session, offset all initial δ18O values so as to minimize the difference between final δ18O_VPDB values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
+ `'2pt'`: within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18O_VPDB values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined), as shown in the sketch below.
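A minimal configuration sketch (these attributes are copied into each session when the data is loaded, so they should be set beforehand; `rawdata.csv` is the tutorial file):

```python
mydata = D47crunch.D47data()
# Use a two-point (affine) standardization for both δ13C and δ18O:
mydata.d13C_STANDARDIZATION_METHOD = '2pt'
mydata.d18O_STANDARDIZATION_METHOD = '2pt'
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()
```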
def make_verbal(oldfun):
	'''
	Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
	'''
	@wraps(oldfun)
	def newfun(*args, verbose = '', **kwargs):
		myself = args[0]
		oldprefix = myself.prefix
		myself.prefix = oldfun.__name__
		if verbose != '':
			oldverbose = myself.verbose
			myself.verbose = verbose
		out = oldfun(*args, **kwargs)
		myself.prefix = oldprefix
		if verbose != '':
			myself.verbose = oldverbose
		return out
	return newfun
Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
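In practice, this means any decorated method accepts a transient `verbose` keyword; a short sketch:

```python
# Print detailed logs for this call only, regardless of mydata.verbose:
mydata.crunch(verbose = True)
```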
def msg(self, txt):
	'''
	Log a message to `self.logfile`, and print it out if `verbose = True`
	'''
	self.log(txt)
	if self.verbose:
		print(f'{f"[{self.prefix}]":<16} {txt}')
Log a message to `self.logfile`, and print it out if `verbose = True`
def vmsg(self, txt):
	'''
	Log a message to `self.logfile` and print it out
	'''
	self.log(txt)
	print(txt)
Log a message to `self.logfile` and print it out
def log(self, *txts):
	'''
	Log a message to `self.logfile`
	'''
	if self.logfile:
		with open(self.logfile, 'a') as fid:
			for txt in txts:
				fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
Log a message to `self.logfile`
Update self.sessions
, self.samples
, self.anchors
, and self.unknowns
.
def refresh_sessions(self):
	'''
	Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
	to `False` for all sessions.
	'''
	self.sessions = {
		s: {'data': [r for r in self if r['Session'] == s]}
		for s in sorted({r['Session'] for r in self})
		}
	for s in self.sessions:
		self.sessions[s]['scrambling_drift'] = False
		self.sessions[s]['slope_drift'] = False
		self.sessions[s]['wg_drift'] = False
		self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
		self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` to `False` for all sessions.
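Because all drift flags default to `False`, allowing a drift term must be done explicitly after the data is loaded. A sketch (assuming the session was named `Session01` at import):

```python
mydata.read('rawdata.csv', session = 'Session01')
# Allow a temporal drift of the WG offset (c2) in this session:
mydata.sessions['Session01']['wg_drift'] = True
```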
def refresh_samples(self):
	'''
	Define `self.samples`, `self.anchors`, and `self.unknowns`.
	'''
	self.samples = {
		s: {'data': [r for r in self if r['Sample'] == s]}
		for s in sorted({r['Sample'] for r in self})
		}
	self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
	self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
Define `self.samples`, `self.anchors`, and `self.unknowns`.
def read(self, filename, sep = '', session = ''):
	'''
	Read file in csv format to load data into a `D47data` object.

	In the csv file, spaces before and after field separators (`','` by default)
	are optional. Each line corresponds to a single analysis.

	The required fields are:

	+ `UID`: a unique identifier
	+ `Session`: an identifier for the analytical session
	+ `Sample`: a sample identifier
	+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

	Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
	and `d49` are optional, and set to NaN by default.

	**Parameters**

	+ `filename`: the path of the file to read
	+ `sep`: csv separator delimiting the fields
	+ `session`: set `Session` field to this string for all analyses
	'''
	with open(filename) as fid:
		self.input(fid.read(), sep = sep, session = session)
Read file in csv format to load data into a `D47data` object.

In the csv file, spaces before and after field separators (`','` by default) are optional. Each line corresponds to a single analysis.

The required fields are:

+ `UID`: a unique identifier
+ `Session`: an identifier for the analytical session
+ `Sample`: a sample identifier
+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.

**Parameters**

+ `filename`: the path of the file to read
+ `sep`: csv separator delimiting the fields
+ `session`: set `Session` field to this string for all analyses
def input(self, txt, sep = '', session = ''):
	'''
	Read `txt` string in csv format to load analysis data into a `D47data` object.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional. Each line corresponds to a single analysis.

	The required fields are:

	+ `UID`: a unique identifier
	+ `Session`: an identifier for the analytical session
	+ `Sample`: a sample identifier
	+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

	Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
	and `d49` are optional, and set to NaN by default.

	**Parameters**

	+ `txt`: the csv string to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	  whichever appears most often in `txt`.
	+ `session`: set `Session` field to this string for all analyses
	'''
	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

	if session != '':
		for r in data:
			r['Session'] = session

	self += data
	self.refresh()
Read `txt` string in csv format to load analysis data into a `D47data` object.

In the csv string, spaces before and after field separators (`','` by default) are optional. Each line corresponds to a single analysis.

The required fields are:

+ `UID`: a unique identifier
+ `Session`: an identifier for the analytical session
+ `Sample`: a sample identifier
+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.

**Parameters**

+ `txt`: the csv string to read
+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, whichever appears most often in `txt`.
+ `session`: set `Session` field to this string for all analyses
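A minimal sketch feeding a csv string directly (the delta values are purely illustrative):

```python
mydata = D47crunch.D47data()
mydata.input('''UID, Sample, d45, d46, d47
A01, ETH-1, 5.795, 11.628, 16.894
A02, ETH-2, -6.059, -4.817, -11.635''', session = 'Session01')
```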
@make_verbal
def wg(self, samples = None, a18_acid = None):
	'''
	Compute bulk composition of the working gas for each session based on
	the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
	`self.Nominal_d18O_VPDB`.
	'''

	self.msg('Computing WG composition:')

	if a18_acid is None:
		a18_acid = self.ALPHA_18O_ACID_REACTION
	if samples is None:
		samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]

	assert a18_acid, f'Acid fractionation factor should not be zero.'

	samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
	R45R46_standards = {}
	for sample in samples:
		d13C_vpdb = self.Nominal_d13C_VPDB[sample]
		d18O_vpdb = self.Nominal_d18O_VPDB[sample]
		R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
		R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
		R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

		C12_s = 1 / (1 + R13_s)
		C13_s = R13_s / (1 + R13_s)
		C16_s = 1 / (1 + R17_s + R18_s)
		C17_s = R17_s / (1 + R17_s + R18_s)
		C18_s = R18_s / (1 + R17_s + R18_s)

		C626_s = C12_s * C16_s ** 2
		C627_s = 2 * C12_s * C16_s * C17_s
		C628_s = 2 * C12_s * C16_s * C18_s
		C636_s = C13_s * C16_s ** 2
		C637_s = 2 * C13_s * C16_s * C17_s
		C727_s = C12_s * C17_s ** 2

		R45_s = (C627_s + C636_s) / C626_s
		R46_s = (C628_s + C637_s + C727_s) / C626_s
		R45R46_standards[sample] = (R45_s, R46_s)

	for s in self.sessions:
		db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
		assert db, f'No sample from {samples} found in session "{s}".'
#		dbsamples = sorted({r['Sample'] for r in db})

		X = [r['d45'] for r in db]
		Y = [R45R46_standards[r['Sample']][0] for r in db]
		x1, x2 = np.min(X), np.max(X)

		if x1 < x2:
			wgcoord = x1/(x1-x2)
		else:
			wgcoord = 999

		if wgcoord < -.5 or wgcoord > 1.5:
			# unreasonable to extrapolate to d45 = 0
			R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
		else:
			# d45 = 0 is reasonably well bracketed
			R45_wg = np.polyfit(X, Y, 1)[1]

		X = [r['d46'] for r in db]
		Y = [R45R46_standards[r['Sample']][1] for r in db]
		x1, x2 = np.min(X), np.max(X)

		if x1 < x2:
			wgcoord = x1/(x1-x2)
		else:
			wgcoord = 999

		if wgcoord < -.5 or wgcoord > 1.5:
			# unreasonable to extrapolate to d46 = 0
			R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
		else:
			# d46 = 0 is reasonably well bracketed
			R46_wg = np.polyfit(X, Y, 1)[1]

		d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

		self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

		self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
		self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
		for r in self.sessions[s]['data']:
			r['d13Cwg_VPDB'] = d13Cwg_VPDB
			r['d18Owg_VSMOW'] = d18Owg_VSMOW
Compute bulk composition of the working gas for each session based on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
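A short sketch restricting the WG computation to a subset of standards and overriding the acid fractionation factor for this computation only (the value shown is purely illustrative):

```python
mydata.wg(samples = ['ETH-1', 'ETH-2'], a18_acid = 1.0081)
```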
def compute_bulk_delta(self, R45, R46, D17O = 0):
	'''
	Compute δ13C_VPDB and δ18O_VSMOW,
	by solving the generalized form of equation (17) from
	[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
	assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
	solving the corresponding second-order Taylor polynomial.
	(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
	'''

	K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

	A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
	B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
	C = 2 * self.R18_VSMOW
	D = -R46

	aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
	bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
	cc = A + B + C + D

	d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

	R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
	R17 = K * R18 ** self.LAMBDA_17
	R13 = R45 - 2 * R17

	d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

	return d13C_VPDB, d18O_VSMOW
Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial. (Appendix A of Daëron et al., 2016)
@make_verbal
def crunch(self, verbose = ''):
	'''
	Compute bulk composition and raw clumped isotope anomalies for all analyses.
	'''
	for r in self:
		self.compute_bulk_and_clumping_deltas(r)
	self.standardize_d13C()
	self.standardize_d18O()
	self.msg(f"Crunched {len(self)} analyses.")
Compute bulk composition and raw clumped isotope anomalies for all analyses.
def fill_in_missing_info(self, session = 'mySession'):
	'''
	Fill in optional fields with default values
	'''
	for i,r in enumerate(self):
		if 'D17O' not in r:
			r['D17O'] = 0.
		if 'UID' not in r:
			r['UID'] = f'{i+1}'
		if 'Session' not in r:
			r['Session'] = session
		for k in ['d47', 'd48', 'd49']:
			if k not in r:
				r[k] = np.nan
Fill in optional fields (`UID`, `Session`, `D17O`, `d47`, `d48`, `d49`) with default values.
```py
def standardize_d13C(self)
```
Perform δ13C standardization within each session `s` according to `self.sessions[s]['d13C_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```py
def standardize_d18O(self)
```
Perform δ18O standardization within each session `s` according to `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```py
def compute_bulk_and_clumping_deltas(self, r)
```
Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
```py
def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0)
```
Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`, optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope anomalies (`D47`, `D48`, `D49`), all expressed in permil.
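The two functions above are mutually consistent: converting a bulk composition to isobar ratios and back should recover the original deltas to within the Taylor approximation (a minimal sketch; `mydata` is the `D47data` object from the tutorial):

```py
# Sketch: round-trip between bulk composition and isobar ratios.
R13 = mydata.R13_VPDB * (1 + 2.0 / 1000)    # δ13C_VPDB = +2.0 ‰
R18 = mydata.R18_VSMOW * (1 + 30.0 / 1000)  # δ18O_VSMOW = +30.0 ‰
R45, R46, R47, R48, R49 = mydata.compute_isobar_ratios(R13, R18)
d13C, d18O = mydata.compute_bulk_delta(R45, R46)
print(f'{d13C:.4f}, {d18O:.4f}')  # expected to be very close to 2.0000, 30.0000
```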
```py
def split_samples(self, samples_to_split = 'all', grouping = 'by_session')
```
Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).
**Parameters**

- `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
- `grouping`: `by_uid` | `by_session`
```py
def unsplit_samples(self, tables = False)
```
Reverse the effects of `D47data.split_samples()`.

This should only be used after `D4xdata.standardize()` with `method='pooled'`.

After `D4xdata.standardize()` with `method='indep_sessions'`, one should probably use `D4xdata.combine_samples()` instead to reverse the effects of `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the effects of `D47data.split_samples()` with `grouping='by_session'` (because in that case session-averaged Δ4x values are statistically independent).
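A typical pooled workflow thus splits an unknown per session before standardization and merges it back afterwards (a minimal sketch; with the single-session tutorial data the split is trivial, but the same calls apply to multi-session data sets):

```py
# Sketch: check between-session consistency of one unknown, then merge back.
mydata.split_samples(['MYSAMPLE-2'], grouping = 'by_session')
mydata.standardize(method = 'pooled')
mydata.unsplit_samples()
```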
```py
def assign_timestamps(self)
```
Assign a time field `t` of type `float` to each analysis.

If `TimeTag` is one of the data fields, `t` is equal within a given session to `TimeTag` minus the mean value of `TimeTag` for that session. Otherwise, `TimeTag` is by default equal to the index of each analysis in the dataset and `t` is defined as above.
```py
def report(self)
```
Prints a report on the standardization fit. Only applicable after `D4xdata.standardize(method='pooled')`.
```py
def combine_samples(self, sample_groups)
```
Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the `sample_groups` dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

- the list of group names
- an array of the corresponding Δ4x values
- the corresponding (co)variance matrix

**Parameters**

- `sample_groups`: a dictionary of the form:

```py
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
```
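After standardization, one can thus merge related unknowns into groups (a minimal sketch using a hypothetical group name and the tutorial unknowns):

```py
# Sketch: combine the two tutorial unknowns into a single group.
groups, D47_avg, CM = mydata.combine_samples({'mygroup': ['MYSAMPLE-1', 'MYSAMPLE-2']})
print(groups, D47_avg, CM)
```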
```py
@make_verbal
def standardize(self, method = 'pooled', weighted_sessions = [], consolidate = True, consolidate_tables = False, consolidate_plots = False, constraints = {})
```
Compute absolute Δ4x values for all replicate analyses and for sample averages. If the `method` argument is set to `'pooled'`, the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to `'indep_sessions'`, the standardization processes each session independently, based only on anchor analyses.
```py
def standardization_error(self, session, d4x, D4x, t = 0)
```
Compute standardization error for a given session and (δ47, Δ47) composition.
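This is the quantity used to draw the error contours in session plots; it can also be called directly (a sketch; `'mySession'` is the default session name assigned by `fill_in_missing_info()` when the raw data carry no `Session` field):

```py
# Sketch: standardization error at δ47 = 20 ‰, Δ47 = 0.6 ‰, mid-session (t = 0).
sx = mydata.standardization_error('mySession', 20.0, 0.6)
print(f'{1000 * sx:.1f} ppm')
```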
```py
@make_verbal
def summary(self, dir = 'output', filename = None, save_to_file = True, print_out = True)
```
Print out and/or save to disk a summary of the standardization results.

**Parameters**

- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
```py
@make_verbal
def table_of_sessions(self, dir = 'output', filename = None, save_to_file = True, print_out = True, output = None)
```
Print out and/or save to disk a table of sessions.

**Parameters**

- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
@make_verbal
def table_of_analyses(self, dir = 'output', filename = None, save_to_file = True, print_out = True, output = None)
```
Print out and/or save to disk a table of analyses.

**Parameters**

- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
@make_verbal
def covar_table(self, correl = False, dir = 'output', filename = None, save_to_file = True, print_out = True, output = None)
```
Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.

**Parameters**

- `correl`: whether to output correlations (`sample_D4x_correl()`) instead of (co)variances (`sample_D4x_covar()`)
- `dir`: the directory in which to save the csv
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the csv
- `print_out`: whether to print out the matrix
- `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
@make_verbal
def table_of_samples(self, dir = 'output', filename = None, save_to_file = True, print_out = True, output = None)
```
Print out, save to disk and/or return a table of samples.

**Parameters**

- `dir`: the directory in which to save the csv
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the csv
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100)
```
Generate session plots and save them to disk.

**Parameters**

- `dir`: the directory in which to save the plots
- `figsize`: the width and height (in inches) of each plot
- `filetype`: `'pdf'` or `'png'`
- `dpi`: resolution for PNG output
```py
@make_verbal
def consolidate_samples(self)
```
Compile various statistics for each sample.

For each anchor sample:

- `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
- `SE_D47` or `SE_D48`: set to zero by definition

For each unknown sample:

- `D47` or `D48`: the standardized Δ4x value for this unknown
- `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

For each anchor and unknown:

- `N`: the total number of analyses of this sample
- `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
- `d13C_VPDB`: the average δ13C_VPDB value for this sample
- `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
- `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`
```py
def consolidate_sessions(self)
```
Compute various statistics for each session.

- `Na`: number of anchor analyses in the session
- `Nu`: number of unknown analyses in the session
- `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
- `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
- `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
- `a`: scrambling factor
- `b`: compositional slope
- `c`: WG offset
- `SE_a`: model standard error of `a`
- `SE_b`: model standard error of `b`
- `SE_c`: model standard error of `c`
- `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
- `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
- `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
- `a2`: scrambling factor drift
- `b2`: compositional slope drift
- `c2`: WG offset drift
- `Np`: number of standardization parameters to fit
- `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
- `d13Cwg_VPDB`: δ13C_VPDB of WG
- `d18Owg_VSMOW`: δ18O_VSMOW of WG
```py
@make_verbal
def repeatabilities(self)
```
Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x (for all samples, for anchors, and for unknowns).
```py
@make_verbal
def consolidate(self, tables = True, plots = True)
```
Collect information about samples, sessions and repeatabilities.
```py
@make_verbal
def rmswd(self, samples = 'all samples', sessions = 'all sessions')
```
Compute the χ2, the root mean squared weighted deviation (i.e. the square root of the reduced χ2), and the corresponding degrees of freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.

Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
```py
@make_verbal
def compute_r(self, key, samples = 'all samples', sessions = 'all sessions')
```
Compute the repeatability of `[r[key] for r in self]`.
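For example, to get the Δ47 repeatability of unknowns only, in ppm (a minimal sketch):

```py
# Sketch: pooled Δ47 repeatability of unknown samples across all sessions.
r_unknowns = mydata.compute_r('D47', samples = 'unknowns')
print(f'{1000 * r_unknowns:.1f} ppm')
```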
```py
def sample_average(self, samples, weights = 'equal', normalize = True)
```
Weighted average Δ4x value of a group of samples, accounting for covariance.

Returns the weighted average Δ4x value and associated SE of a group of samples. Weights are equal by default. If `normalize` is true, `weights` will be rescaled so that their sum equals 1.

**Examples**

```py
self.sample_average(['X','Y'], [1, 2])
```

returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, where Δ4x(X) and Δ4x(Y) are the average Δ4x values of samples X and Y, respectively.

```py
self.sample_average(['X','Y'], [1, -1], normalize = False)
```

returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
```py
def sample_D4x_covar(self, sample1, sample2 = None)
```
````python
def sample_D4x_correl(self, sample1, sample2 = None):
    '''
    Correlation between Δ4x errors of samples

    Returns the error correlation between the average Δ4x values of two samples.
    '''
    if sample2 is None or sample2 == sample1:
        return 1.
    return (
        self.sample_D4x_covar(sample1, sample2)
        / self.unknowns[sample1][f'SE_D{self._4x}']
        / self.unknowns[sample2][f'SE_D{self._4x}']
        )
````
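The relationship between these two methods, as a sketch (again with hypothetical unknowns `FOO-1` and `FOO-2` in a standardized `mydata`):

```python
var_1  = mydata.sample_D4x_covar('FOO-1')            # Δ47 variance of FOO-1
cov_12 = mydata.sample_D4x_covar('FOO-1', 'FOO-2')   # error covariance
r_12   = mydata.sample_D4x_correl('FOO-1', 'FOO-2')  # correlation: cov_12 / (SE_1 * SE_2)
```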
````python
def plot_single_session(self,
    session,
    kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
    kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
    kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
    kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
    kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
    xylimits = 'free', # | 'constant'
    x_label = None,
    y_label = None,
    error_contour_interval = 'auto',
    fig = 'new',
    ):
    '''
    Generate plot for a single session
    '''
    if x_label is None:
        x_label = f'δ$_{{{self._4x}}}$ (‰)'
    if y_label is None:
        y_label = f'Δ$_{{{self._4x}}}$ (‰)'

    out = _SessionPlot()
    anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
    unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]

    if fig == 'new':
        out.fig = ppl.figure(figsize = (6,6))
        ppl.subplots_adjust(.1,.1,.9,.9)

    out.anchor_analyses, = ppl.plot(
        [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
        [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
        **kw_plot_anchors)
    out.unknown_analyses, = ppl.plot(
        [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
        [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
        **kw_plot_unknowns)
    out.anchor_avg = ppl.plot(
        np.array([ np.array([
            np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
            np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
            ]) for sample in anchors]).T,
        np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T,
        **kw_plot_anchor_avg)
    out.unknown_avg = ppl.plot(
        np.array([ np.array([
            np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
            np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
            ]) for sample in unknowns]).T,
        np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T,
        **kw_plot_unknown_avg)
    if xylimits == 'constant':
        x = [r[f'd{self._4x}'] for r in self]
        y = [r[f'D{self._4x}'] for r in self]
        x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
        w, h = x2-x1, y2-y1
        x1 -= w/20
        x2 += w/20
        y1 -= h/20
        y2 += h/20
        ppl.axis([x1, x2, y1, y2])
    elif xylimits == 'free':
        x1, x2, y1, y2 = ppl.axis()
    else:
        x1, x2, y1, y2 = ppl.axis(xylimits)

    if error_contour_interval != 'none':
        xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
        XI, YI = np.meshgrid(xi, yi)
        SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
        if error_contour_interval == 'auto':
            rng = np.max(SI) - np.min(SI)
            if rng <= 0.01:
                cinterval = 0.001
            elif rng <= 0.03:
                cinterval = 0.004
            elif rng <= 0.1:
                cinterval = 0.01
            elif rng <= 0.3:
                cinterval = 0.03
            elif rng <= 1.:
                cinterval = 0.1
            else:
                cinterval = 0.5
        else:
            cinterval = error_contour_interval

        cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
        out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
        out.clabel = ppl.clabel(out.contour)

    ppl.xlabel(x_label)
    ppl.ylabel(y_label)
    ppl.title(session, weight = 'bold')
    ppl.grid(alpha = .2)
    out.ax = ppl.gca()

    return out
````
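Typical usage, as a sketch assuming `mydata` contains a session named `Session01` (hypothetical name):

```python
from matplotlib import pyplot as ppl

out = mydata.plot_single_session('Session01')
out.fig.savefig('Session01.pdf')  # out.fig exists because fig = 'new' by default
ppl.close(out.fig)
```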
````python
def plot_residuals(
    self,
    kde = False,
    hist = False,
    binwidth = 2/3,
    dir = 'output',
    filename = None,
    highlight = [],
    colors = None,
    figsize = None,
    dpi = 100,
    yspan = None,
    ):
    '''
    Plot residuals of each analysis as a function of time (actually, as a function of
    the order of analyses in the `D4xdata` object)

    + `kde`: whether to add a kernel density estimate of residuals
    + `hist`: whether to add a histogram of residuals (incompatible with `kde`)
    + `binwidth`: width of the histogram bins, as a fraction of the Δ4x repeatability
    + `dir`: the directory in which to save the plot
    + `filename`: the name of the file to save the plot to; if `None` (default),
      return the figure without saving it; if `''`, default to `D{4x}_residuals.pdf`
    + `highlight`: a list of samples to highlight
    + `colors`: a dict of `{<sample>: <color>}` for all samples
    + `figsize`: (width, height) of figure
    + `dpi`: resolution for PNG output
    + `yspan`: factor controlling the range of y values shown in plot
      (by default: `yspan = 1.5 if kde else 1.0`)
    '''

    from matplotlib import ticker

    if yspan is None:
        if kde:
            yspan = 1.5
        else:
            yspan = 1.0

    # Layout
    fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
    if hist or kde:
        ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
        ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
    else:
        ppl.subplots_adjust(.08,.05,.78,.8)
        ax1 = ppl.subplot(111)

    # Colors
    N = len(self.anchors)
    if colors is None:
        if len(highlight) > 0:
            Nh = len(highlight)
            if Nh == 1:
                colors = {highlight[0]: (0,0,0)}
            elif Nh == 3:
                colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
            elif Nh == 4:
                colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
            else:
                colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
        else:
            if N == 3:
                colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
            elif N == 4:
                colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
            else:
                colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}

    ppl.sca(ax1)

    ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)

    ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))

    session = self[0]['Session']
    x1 = 0
    x_sessions = {}
    one_or_more_singlets = False
    one_or_more_multiplets = False
    multiplets = set()
    for k,r in enumerate(self):
        if r['Session'] != session:
            x2 = k-1
            x_sessions[session] = (x1+x2)/2
            ppl.axvline(k - 0.5, color = 'k', lw = .5)
            session = r['Session']
            x1 = k
        singlet = len(self.samples[r['Sample']]['data']) == 1
        if not singlet:
            multiplets.add(r['Sample'])
        if r['Sample'] in self.unknowns:
            if singlet:
                one_or_more_singlets = True
            else:
                one_or_more_multiplets = True
        kw = dict(
            marker = 'x' if singlet else '+',
            ms = 4 if singlet else 5,
            ls = 'None',
            mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
            mew = 1,
            alpha = 0.2 if singlet else 1,
            )
        if highlight and r['Sample'] not in highlight:
            kw['alpha'] = 0.2
        ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
    x2 = k
    x_sessions[session] = (x1+x2)/2

    ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
    ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
    if not (hist or kde):
        ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
        ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')

    xmin, xmax, ymin, ymax = ppl.axis()
    if yspan != 1:
        ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
    for s in x_sessions:
        ppl.text(
            x_sessions[s],
            ymax +1,
            s,
            va = 'bottom',
            **(
                dict(ha = 'center')
                if len(self.sessions[s]['data']) > (0.15 * len(self))
                else dict(ha = 'left', rotation = 45)
                )
            )

    if hist or kde:
        ppl.sca(ax2)

    for s in colors:
        kw['marker'] = '+'
        kw['ms'] = 5
        kw['mec'] = colors[s]
        kw['label'] = s
        kw['alpha'] = 1
        ppl.plot([], [], **kw)

    kw['mec'] = (0,0,0)

    if one_or_more_singlets:
        kw['marker'] = 'x'
        kw['ms'] = 4
        kw['alpha'] = .2
        kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
        ppl.plot([], [], **kw)

    if one_or_more_multiplets:
        kw['marker'] = '+'
        kw['ms'] = 4
        kw['alpha'] = 1
        kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
        ppl.plot([], [], **kw)

    if hist or kde:
        leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
    else:
        leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
    leg.set_zorder(-1000)

    ppl.sca(ax1)

    ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
    ppl.xticks([])
    ppl.axis([-1, len(self), None, None])

    if hist or kde:
        ppl.sca(ax2)
        X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])

        if kde:
            from scipy.stats import gaussian_kde
            yi = np.linspace(ymin, ymax, 201)
            xi = gaussian_kde(X).evaluate(yi)
            ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
        elif hist:
            ppl.hist(
                X,
                orientation = 'horizontal',
                histtype = 'stepfilled',
                ec = [.4]*3,
                fc = [.25]*3,
                alpha = .25,
                bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
                )
        ppl.text(0, 0,
            f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
            size = 7.5,
            alpha = 1,
            va = 'center',
            ha = 'left',
            )

        ppl.axis([0, None, ymin, ymax])
        ppl.xticks([])
        ppl.yticks([])
        ax2.spines['right'].set_visible(False)
        ax2.spines['top'].set_visible(False)
        ax2.spines['bottom'].set_visible(False)

    ax1.axis([None, None, ymin, ymax])

    if not os.path.exists(dir):
        os.makedirs(dir)
    if filename is None:
        return fig
    elif filename == '':
        filename = f'D{self._4x}_residuals.pdf'
    ppl.savefig(f'{dir}/{filename}', dpi = dpi)
    ppl.close(fig)
````
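For example, assuming `mydata` has been standardized — a minimal sketch:

```python
# save output/D47_residuals.pdf, with a kernel density estimate of the residuals:
mydata.plot_residuals(kde = True, filename = '')

# or get the figure object back instead (the default behavior when filename is None):
fig = mydata.plot_residuals(hist = True)
```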
````python
def simulate(self, *args, **kwargs):
    '''
    Legacy function; raises a `DeprecationWarning` pointing to `virtual_data()`
    '''
    raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
````
````python
def plot_distribution_of_analyses(
    self,
    dir = 'output',
    filename = None,
    vs_time = False,
    figsize = (6,4),
    subplots_adjust = (0.02, 0.13, 0.85, 0.8),
    output = None,
    dpi = 100,
    ):
    '''
    Plot temporal distribution of all analyses in the data set.

    **Parameters**

    + `dir`: the directory in which to save the plot
    + `filename`: the name of the file to save the plot to
      (by default: `D{4x}_distribution_of_analyses.pdf`)
    + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
    + `figsize`: (width, height) of figure
    + `subplots_adjust`: passed to `subplots_adjust()`
    + `output`: if `'ax'`, return the plot axes; if `'fig'`, return the figure;
      otherwise (default) save the plot to file
    + `dpi`: resolution for PNG output
    '''

    asamples = [s for s in self.anchors]
    usamples = [s for s in self.unknowns]
    if output is None or output == 'fig':
        fig = ppl.figure(figsize = figsize)
        ppl.subplots_adjust(*subplots_adjust)
    Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
    Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
    Xmax += (Xmax-Xmin)/40
    Xmin -= (Xmax-Xmin)/41
    for k, s in enumerate(asamples + usamples):
        if vs_time:
            X = [r['TimeTag'] for r in self if r['Sample'] == s]
        else:
            X = [x for x,r in enumerate(self) if r['Sample'] == s]
        Y = [-k for x in X]
        ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
        ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
        ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
    ppl.axis([Xmin, Xmax, -k-1, 1])
    ppl.xlabel('\ntime')
    ppl.gca().annotate('',
        xy = (0.6, -0.02),
        xycoords = 'axes fraction',
        xytext = (.4, -0.02),
        arrowprops = dict(arrowstyle = "->", color = 'k'),
        )

    x2 = -1
    for session in self.sessions:
        x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
        if vs_time:
            ppl.axvline(x1, color = 'k', lw = .75)
        if x2 > -1:
            if not vs_time:
                ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
        x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
        if vs_time:
            ppl.axvline(x2, color = 'k', lw = .75)
            ppl.axvspan(x1, x2, color = 'k', zorder = -100, alpha = .15)
        ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

    ppl.xticks([])
    ppl.yticks([])

    if output is None:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_distribution_of_analyses.pdf'
        ppl.savefig(f'{dir}/{filename}', dpi = dpi)
        ppl.close(fig)
    elif output == 'ax':
        return ppl.gca()
    elif output == 'fig':
        return fig
````
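A quick sketch, again assuming a standardized `mydata`:

```python
# save output/D47_distribution_of_analyses.pdf, plotted against TimeTag:
mydata.plot_distribution_of_analyses(vs_time = True)
```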
````python
def plot_bulk_compositions(
    self,
    samples = None,
    dir = 'output/bulk_compositions',
    figsize = (6,6),
    subplots_adjust = (0.15, 0.12, 0.95, 0.92),
    show = False,
    sample_color = (0,.5,1),
    analysis_color = (.7,.7,.7),
    labeldist = 0.3,
    radius = 0.05,
    ):
    '''
    Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

    By default, creates a directory `./output/bulk_compositions` where plots for
    each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

    **Parameters**

    + `samples`: Only these samples are processed (by default: all samples).
    + `dir`: where to save the plots
    + `figsize`: (width, height) of figure
    + `subplots_adjust`: passed to `subplots_adjust()`
    + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
      allowing for interactive visualization/exploration in (δ13C, δ18O) space.
    + `sample_color`: color used for sample markers/labels
    + `analysis_color`: color used for replicate markers/labels
    + `labeldist`: distance (in inches) from replicate markers to replicate labels
    + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
    '''

    from matplotlib.patches import Ellipse

    if samples is None:
        samples = [_ for _ in self.samples]

    saved = {}

    for s in samples:

        fig = ppl.figure(figsize = figsize)
        fig.subplots_adjust(*subplots_adjust)
        ax = ppl.subplot(111)
        ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
        ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
        ppl.title(s)

        XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
        UID = [_['UID'] for _ in self.samples[s]['data']]
        XY0 = XY.mean(0)

        for xy in XY:
            ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)

        ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
        ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
        saved[s] = [XY, XY0]

        x1, x2, y1, y2 = ppl.axis()
        x0, dx = (x1+x2)/2, (x2-x1)/2
        y0, dy = (y1+y2)/2, (y2-y1)/2
        dx, dy = [max(max(dx, dy), radius)]*2

        ppl.axis([
            x0 - 1.2*dx,
            x0 + 1.2*dx,
            y0 - 1.2*dy,
            y0 + 1.2*dy,
            ])

        XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))

        for xy, uid in zip(XY, UID):

            xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
            vector_in_display_space = xy_in_display_space - XY0_in_display_space

            if (vector_in_display_space**2).sum() > 0:

                unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
                label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
                label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
                label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))

                ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)

            else:

                ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)

        if radius:
            ax.add_artist(Ellipse(
                xy = XY0,
                width = radius*2,
                height = radius*2,
                ls = (0, (2,2)),
                lw = .7,
                ec = analysis_color,
                fc = 'None',
                ))
            ppl.text(
                XY0[0],
                XY0[1]-radius,
                f'\n± {radius*1e3:.0f} ppm',
                color = analysis_color,
                va = 'top',
                ha = 'center',
                linespacing = 0.4,
                size = 8,
                )

        if not os.path.exists(dir):
            os.makedirs(dir)
        fig.savefig(f'{dir}/{s}.pdf')
        ppl.close(fig)

    fig = ppl.figure(figsize = figsize)
    fig.subplots_adjust(*subplots_adjust)
    ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
    ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')

    for s in saved:
        for xy in saved[s][0]:
            ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
        ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
        ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')

    x1, x2, y1, y2 = ppl.axis()
    ppl.axis([
        x1 - (x2-x1)/10,
        x2 + (x2-x1)/10,
        y1 - (y2-y1)/10,
        y2 + (y2-y1)/10,
        ])

    if not os.path.exists(dir):
        os.makedirs(dir)
    fig.savefig(f'{dir}/__all__.pdf')
    if show:
        ppl.show()
    ppl.close(fig)
````
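For instance, a sketch:

```python
# save one plot per sample plus __all__.pdf under output/bulk_compositions/,
# then open an interactive view of all analyses:
mydata.plot_bulk_compositions(show = True)
```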
````python
class D47data(D4xdata):
    '''
    Store and process data for a large set of Δ47 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1':   0.2052,
        'ETH-2':   0.2085,
        'ETH-3':   0.6132,
        'ETH-4':   0.4511,
        'IAEA-C1': 0.3018,
        'IAEA-C2': 0.6409,
        'MERCK':   0.5135,
        } # I-CDES (Bernasconi et al., 2021)
    '''
    Nominal Δ47 values assigned to the Δ47 anchor samples, used by
    `D47data.standardize()` to normalize unknown samples to an absolute Δ47
    reference frame.

    By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
    ```py
    {
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
    }
    ```
    '''

    @property
    def Nominal_D47(self):
        return self.Nominal_D4x

    @Nominal_D47.setter
    def Nominal_D47(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '47', **kwargs)

    def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
        '''
        Find all samples for which `Teq` is specified, compute equilibrium Δ47
        value for that temperature, and treat these samples as additional anchors.

        **Parameters**

        + `fCo2eqD47`: Which CO2 equilibrium law to use
          (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
          `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
        + `priority`: if `replace`: forget old anchors and only use the new ones;
          if `new`: keep pre-existing anchors but update them in case of conflict
          between old and new Δ47 values;
          if `old`: keep pre-existing anchors but preserve their original Δ47
          values in case of conflict.
        '''
        f = {
            'petersen': fCO2eqD47_Petersen,
            'wang': fCO2eqD47_Wang,
            }[fCo2eqD47]
        foo = {}
        for r in self:
            if 'Teq' in r:
                if r['Sample'] in foo:
                    assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
                else:
                    foo[r['Sample']] = f(r['Teq'])
            else:
                assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'

        if priority == 'replace':
            self.Nominal_D47 = {}
        for s in foo:
            if priority != 'old' or s not in self.Nominal_D47:
                self.Nominal_D47[s] = foo[s]
````
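Assigning to `Nominal_D47` replaces the anchor values and refreshes the object; a sketch (the modified ETH-3 value below is purely illustrative):

```python
import D47crunch

mydata = D47crunch.D47data()
mydata.Nominal_D47 = {
    'ETH-1': 0.2052,
    'ETH-2': 0.2085,
    'ETH-3': 0.6100,  # illustrative custom value
}
```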
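To illustrate `D47fromTeq()`, a sketch assuming some analyses in `mydata` carry a `Teq` field identifying CO2 equilibrated at a known temperature:

```python
# treat equilibrated-gas samples as additional anchors,
# updating pre-existing anchors in case of conflict:
mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
mydata.standardize()
```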
````python
class D48data(D4xdata):
    '''
    Store and process data for a large set of Δ48 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1':  0.138,
        'ETH-2':  0.138,
        'ETH-3':  0.270,
        'ETH-4':  0.223,
        'GU-1':  -0.419,
        } # (Fiebig et al., 2019, 2021)
    '''
    Nominal Δ48 values assigned to the Δ48 anchor samples, used by
    `D48data.standardize()` to normalize unknown samples to an absolute Δ48
    reference frame.

    By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
    Fiebig et al. (2021)):

    ```py
    {
        'ETH-1' :  0.138,
        'ETH-2' :  0.138,
        'ETH-3' :  0.270,
        'ETH-4' :  0.223,
        'GU-1'  : -0.419,
    }
    ```
    '''

    @property
    def Nominal_D48(self):
        return self.Nominal_D4x

    @Nominal_D48.setter
    def Nominal_D48(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '48', **kwargs)
````
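Δ48 processing mirrors the Δ47 workflow; a sketch assuming a raw data file `rawdata.csv` with d45 to d49 columns:

```python
import D47crunch

mydata48 = D47crunch.D48data()
mydata48.read('rawdata.csv')
mydata48.wg()           # δ13C, δ18O of working gas
mydata48.crunch()       # raw Δ48 values
mydata48.standardize()  # absolute Δ48 values
```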