D47crunch

Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47 and Δ48 values with fully propagated analytical error estimates (Daëron, 2021).

The tutorial section takes you through a series of simple steps to import/process data and print out the results. The how-to section provides instructions applicable to various specific tasks.

1. Tutorial

1.1 Installation

The easy option is to use pip; open a shell terminal and simply type:

python -m pip install D47crunch

To experiment with the bleeding-edge development version instead, take the following steps:

  1. Download the dev branch source code here and rename it to D47crunch.py.
  2. Do any of the following:
    • copy D47crunch.py to somewhere in your Python path
    • copy D47crunch.py to a working directory (import D47crunch will only work if called within that directory)
    • copy D47crunch.py to any other location (e.g., /foo/bar) and then use the following code snippet in your own code to import D47crunch:
import sys
sys.path.append('/foo/bar')
import D47crunch

Documentation for the development version can be downloaded here (save the HTML file and open it locally).

1.2 Usage

Start by creating a file named rawdata.csv with the following contents:

UID,  Sample,           d45,       d46,        d47,        d48,       d49
A01,  ETH-1,        5.79502,  11.62767,   16.89351,   24.56708,   0.79486
A02,  MYSAMPLE-1,   6.21907,  11.49107,   17.27749,   24.58270,   1.56318
A03,  ETH-2,       -6.05868,  -4.81718,  -11.63506,  -10.32578,   0.61352
A04,  MYSAMPLE-2,  -3.86184,   4.94184,    0.60612,   10.52732,   0.57118
A05,  ETH-3,        5.54365,  12.05228,   17.40555,   25.96919,   0.74608
A06,  ETH-2,       -6.06706,  -4.87710,  -11.69927,  -10.64421,   1.61234
A07,  ETH-1,        5.78821,  11.55910,   16.80191,   24.56423,   1.47963
A08,  MYSAMPLE-2,  -3.87692,   4.86889,    0.52185,   10.40390,   1.07032

Then instantiate a D47data object which will store and process this data:

import D47crunch
mydata = D47crunch.D47data()

For now, this object is empty:

>>> print(mydata)
[]

To load the analyses saved in rawdata.csv into our D47data object and process the data:

mydata.read('rawdata.csv')

# compute δ13C, δ18O of working gas:
mydata.wg()

# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()

# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()

We can now print a summary of the data processing:

>>> mydata.summary(verbose = True, save_to_file = False)
[summary]        
–––––––––––––––––––––––––––––––  –––––––––
N samples (anchors + unknowns)   5 (3 + 2)
N analyses (anchors + unknowns)  8 (5 + 3)
Repeatability of δ13C_VPDB         4.2 ppm
Repeatability of δ18O_VSMOW       47.5 ppm
Repeatability of Δ47 (anchors)    13.4 ppm
Repeatability of Δ47 (unknowns)    2.5 ppm
Repeatability of Δ47 (all)         9.6 ppm
Model degrees of freedom                 3
Student's 95% t-factor                3.18
Standardization method              pooled
–––––––––––––––––––––––––––––––  –––––––––

This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see Daëron, 2021).
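As a quick cross-check (using scipy directly, this is not part of D47crunch itself), the t-factor above can be recomputed from the reported degrees of freedom:

from scipy.stats import t as tstudent

# two-sided 95 % t-factor for 3 degrees of freedom:
print(tstudent.ppf(1 - 0.05/2, 3)) # ≈ 3.18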

To see the actual results:

>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples] 
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample      N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1       2       2.01       37.01  0.2052                    0.0131          
ETH-2       2     -10.17       19.88  0.2085                    0.0026          
ETH-3       1       1.73       37.49  0.6132                                    
MYSAMPLE-1  1       2.48       36.90  0.2996  0.0091  ± 0.0291                  
MYSAMPLE-2  2      -8.17       30.05  0.6600  0.0115  ± 0.0366  0.0025          
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––

This table lists, for each sample, the number of analytical replicates, average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), and the average Δ47 value and SD of Δ47 for all replicates of this sample. For unknown samples, the SE and 95 % confidence limits for mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large data sets the 95 % CL will tend to 1.96 times the SE, whereas in this small data set the applicable t-factor is much larger.
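To reuse these values programmatically rather than parsing the printed table, the per-sample statistics are also stored in the samples attribute of the standardized D47data object; the exact field names below (D47, SE_D47, SD_D47) are assumptions based on this table's headers:

# sketch: mydata.samples maps each sample name to its summary statistics
s = mydata.samples['MYSAMPLE-2']
print(s['D47'], s['SE_D47'], s['SD_D47'])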

We can also generate a table of all analyses in the data set (again, note that d18O_VSMOW is the composition of the CO2 analyte):

>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses] 
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
UID    Session      Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48       d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw      D49raw       D47
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
A01  mySession       ETH-1       -3.807        24.921   5.795020  11.627670   16.893510   24.567080  0.794860    2.014086   37.041843  -0.574686   1.149684  -27.690250  0.214454
A02  mySession  MYSAMPLE-1       -3.807        24.921   6.219070  11.491070   17.277490   24.582700  1.563180    2.476827   36.898281  -0.499264   1.435380  -27.122614  0.299589
A03  mySession       ETH-2       -3.807        24.921  -6.058680  -4.817180  -11.635060  -10.325780  0.613520  -10.166796   19.907706  -0.685979  -0.721617   16.716901  0.206693
A04  mySession  MYSAMPLE-2       -3.807        24.921  -3.861840   4.941840    0.606120   10.527320  0.571180   -8.159927   30.087230  -0.248531   0.613099   -4.979413  0.658270
A05  mySession       ETH-3       -3.807        24.921   5.543650  12.052280   17.405550   25.969190  0.746080    1.727029   37.485567  -0.226150   1.678699  -28.280301  0.613200
A06  mySession       ETH-2       -3.807        24.921  -6.067060  -4.877100  -11.699270  -10.644210  1.612340  -10.173599   19.845192  -0.683054  -0.922832   17.861363  0.210328
A07  mySession       ETH-1       -3.807        24.921   5.788210  11.559100   16.801910   24.564230  1.479630    2.009281   36.970298  -0.591129   1.282632  -26.888335  0.195926
A08  mySession  MYSAMPLE-2       -3.807        24.921  -3.876920   4.868890    0.521850   10.403900  1.070320   -8.173486   30.011134  -0.245768   0.636159   -4.324964  0.661803
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
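The same tables can also be written to CSV files instead of (or in addition to) being printed; the sketch below assumes the instance methods accept the same save_to_file behavior as the module-level table functions documented in section 4, writing by default to a directory named output:

# write the tables to CSV files instead of printing them:
mydata.table_of_samples(verbose = False, save_to_file = True)
mydata.table_of_analyses(verbose = False, save_to_file = True)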

2. How-to

2.1 Simulate a virtual data set to play with

It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).

This can be achieved with virtual_data(). The example below creates a dataset with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named FOO and BAR with an arbitrarily defined isotopic composition. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the virtual_data() documentation for additional configuration parameters.

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
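As in the tutorial, a quick way to assess the final precision achievable with this virtual design is to print the corresponding summary:

D.summary(verbose = True, save_to_file = False)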

2.2 Control data quality

D47crunch offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:

from D47crunch import *
from random import shuffle

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 8),
        dict(Sample = 'ETH-2', N = 8),
        dict(Sample = 'ETH-3', N = 8),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 4,
            d13C_VPDB = -15., d18O_VPDB = -15.,
            D47 = 0.5, D48 = 0.2),
        ])

sessions = [
    virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
    for k in range(10)]

# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])

# create D47data instance:
data47 = D47data(data)

# process D47data instance:
data47.crunch()
data47.standardize()

2.2.1 Plotting the distribution of analyses through time

data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')

time_distribution.png

The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See D4xdata.plot_distribution_of_analyses() for how to plot analyses as a function of “true” time (based on the TimeTag for each analysis).
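For example, if your raw data include TimeTag fields, a sketch of plotting against true time might look like the line below; the vs_time keyword is an assumption, so check the D4xdata.plot_distribution_of_analyses() documentation for the exact parameter name:

# hypothetical sketch, assuming a 'vs_time' keyword and 'TimeTag' fields:
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf', vs_time = True)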

2.2.2 Generating session plots

data47.plot_sessions()

Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors (red) or the average Δ47 value for unknowns (blue; overall average for all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.

D47_plot_Session_03.png

2.2.3 Plotting Δ47 or Δ48 residuals

data47.plot_residuals(filename = 'residuals.pdf', kde = True)

residuals.png

Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.

2.2.4 Checking δ13C and δ18O dispersion

mydata = D47data(virtual_data(
    session = 'mysession',
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
    ], seed = 123))

mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()

D4xdata.plot_bulk_compositions() produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample MYSAMPLE:

bulk_compositions.png

2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters

Nominal values for various carbonate standards are defined in four places: D4xdata.Nominal_d13C_VPDB and D4xdata.Nominal_d18O_VPDB for bulk compositions, and D47data.Nominal_D47 and D48data.Nominal_D48 for clumped-isotope anomalies.

17O correction parameters are defined by: D4xdata.R13_VPDB, D4xdata.R17_VSMOW, D4xdata.R18_VSMOW, D4xdata.LAMBDA_17, and D4xdata.R18_VPDB (from which D4xdata.R17_VPDB is derived).

When creating a new instance of D47data or D48data, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., R17_VSMOW and Nominal_D47 can thus be done in several ways:

Option 1: by redefining D4xdata.R17_VSMOW and D47data.Nominal_D47 _before_ creating a D47data object:

from D47crunch import D4xdata, D47data

# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
    a: D47data.Nominal_D4x[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600

# only now create D47data object:
mydata = D47data()

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

Option 2: by redefining R17_VSMOW and Nominal_D47 _after_ creating a D47data object:

from D47crunch import D47data

# first create D47data object:
mydata = D47data()

# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
    a: mydata.Nominal_D47[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:

from D47crunch import D47data

# create two D47data objects:
foo = D47data()
bar = D47data()

# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
    'ETH-1': foo.Nominal_D47['ETH-1'],
    'ETH-2': foo.Nominal_D47['ETH-2'],
    'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
    'INLAB_REF_MATERIAL': 0.666,
    }

# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg()          # compute δ13C, δ18O of working gas
foo.crunch()      # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values

bar.read('rawdata.csv')
bar.wg()          # compute δ13C, δ18O of working gas
bar.crunch()      # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values

# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)

2.4 Process paired Δ47 and Δ48 values

Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, D47crunch uses two independent classes — D47data and D48data — which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for D47data and D48data.

from D47crunch import *

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)

# create D47data instance:
data47 = D47data(session1 + session2)

# process D47data instance:
data47.crunch()
data47.standardize()

# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)

# process D48data instance:
data48.crunch()
data48.standardize()

# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)

Expected output:

––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47      a_47 ± SE  1e3 x b_47 ± SE       c_47 ± SE   r_D48      a_48 ± SE  1e3 x b_48 ± SE       c_48 ± SE
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session_01   9   3       -4.000        26.000  0.0000  0.0000  0.0098  1.021 ± 0.019   -0.398 ± 0.260  -0.903 ± 0.006  0.0486  0.540 ± 0.151    1.235 ± 0.607  -0.390 ± 0.025
Session_02   9   3       -4.000        26.000  0.0000  0.0000  0.0090  1.015 ± 0.019    0.376 ± 0.260  -0.905 ± 0.006  0.0186  1.350 ± 0.156   -0.871 ± 0.608  -0.504 ± 0.027
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––


––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample  N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene     D48      SE    95% CL      SD  p_Levene
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1   6       2.02       37.02  0.2052                    0.0078            0.1380                    0.0223          
ETH-2   6     -10.17       19.88  0.2085                    0.0036            0.1380                    0.0482          
ETH-3   6       1.71       37.45  0.6132                    0.0080            0.2700                    0.0176          
FOO     6      -5.00       28.91  0.3026  0.0044  ± 0.0093  0.0121     0.164  0.1397  0.0121  ± 0.0255  0.0267     0.127
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––


–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47       D48
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
1    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.120787   21.286237   27.780042    2.020000   37.024281  -0.708176  -0.316435  -0.000013  0.197297  0.087763
2    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132240   21.307795   27.780042    2.020000   37.024281  -0.696913  -0.295333  -0.000013  0.208328  0.126791
3    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132438   21.313884   27.780042    2.020000   37.024281  -0.696718  -0.289374  -0.000013  0.208519  0.137813
4    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700300  -12.210735  -18.023381  -10.170000   19.875825  -0.683938  -0.297902  -0.000002  0.209785  0.198705
5    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.707421  -12.270781  -18.023381  -10.170000   19.875825  -0.691145  -0.358673  -0.000002  0.202726  0.086308
6    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700061  -12.278310  -18.023381  -10.170000   19.875825  -0.683696  -0.366292  -0.000002  0.210022  0.072215
7    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.684379   22.225827   28.306614    1.710000   37.450394  -0.273094  -0.216392  -0.000014  0.623472  0.270873
8    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.660163   22.233729   28.306614    1.710000   37.450394  -0.296906  -0.208664  -0.000014  0.600150  0.285167
9    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.675191   22.215632   28.306614    1.710000   37.450394  -0.282128  -0.226363  -0.000014  0.614623  0.252432
10   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.328380    5.374933    4.665655   -5.000000   28.907344  -0.582131  -0.288924  -0.000006  0.314928  0.175105
11   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.302220    5.384454    4.665655   -5.000000   28.907344  -0.608241  -0.279457  -0.000006  0.289356  0.192614
12   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.322530    5.372841    4.665655   -5.000000   28.907344  -0.587970  -0.291004  -0.000006  0.309209  0.171257
13   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.140853   21.267202   27.780042    2.020000   37.024281  -0.688442  -0.335067  -0.000013  0.207730  0.138730
14   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.127087   21.256983   27.780042    2.020000   37.024281  -0.701980  -0.345071  -0.000013  0.194396  0.131311
15   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.148253   21.287779   27.780042    2.020000   37.024281  -0.681165  -0.314926  -0.000013  0.214898  0.153668
16   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715859  -12.204791  -18.023381  -10.170000   19.875825  -0.699685  -0.291887  -0.000002  0.207349  0.149128
17   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.709763  -12.188685  -18.023381  -10.170000   19.875825  -0.693516  -0.275587  -0.000002  0.213426  0.161217
18   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715427  -12.253049  -18.023381  -10.170000   19.875825  -0.699249  -0.340727  -0.000002  0.207780  0.112907
19   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.685994   22.249463   28.306614    1.710000   37.450394  -0.271506  -0.193275  -0.000014  0.618328  0.244431
20   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.681351   22.298166   28.306614    1.710000   37.450394  -0.276071  -0.145641  -0.000014  0.613831  0.279758
21   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.676169   22.306848   28.306614    1.710000   37.450394  -0.281167  -0.137150  -0.000014  0.608813  0.286056
22   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.324359    5.339497    4.665655   -5.000000   28.907344  -0.586144  -0.324160  -0.000006  0.314015  0.136535
23   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.297658    5.325854    4.665655   -5.000000   28.907344  -0.612794  -0.337727  -0.000006  0.287767  0.126473
24   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.310185    5.339898    4.665655   -5.000000   28.907344  -0.600291  -0.323761  -0.000006  0.300082  0.136830
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––

3. Command-Line Interface (CLI)

Instead of writing Python code, you may directly use the CLI to process raw Δ47 data using reasonable defaults. The simplest way is to call:

D47crunch rawdata.csv

This will create a directory named output and populate it by calling the following methods: D47data.wg(), D47data.crunch(), D47data.standardize(), D47data.summary(), D47data.table_of_sessions(), D47data.table_of_samples(), D47data.table_of_analyses(), D47data.plot_sessions(), D47data.plot_residuals(), and D47data.plot_distribution_of_analyses().

You may specify a custom set of anchors instead of the default ones using the --anchors or -a option:

D47crunch -a anchors.csv rawdata.csv

In this case, the anchors.csv file (you may use any other file name) must have the following format:

Sample, d13C_VPDB, d18O_VPDB,    D47
 ETH-1,      2.02,     -2.19, 0.2052
 ETH-2,    -10.17,    -18.69, 0.2085
 ETH-3,      1.71,     -1.78, 0.6132
 ETH-4,          ,          , 0.4511

The samples with non-empty d13C_VPDB, d18O_VPDB, and D47 values are used to standardize δ13C, δ18O, and Δ47 values respectively.

You may also provide a list of analyses and/or samples to exclude from the input. This is done with the --exclude or -e option:

D47crunch -e badbatch.csv rawdata.csv

In this case, the badbatch.csv file (again, you may use a different file name) must have the following format:

UID, Sample
A03
A09
B06
   , MYBADSAMPLE-1
   , MYBADSAMPLE-2

This will exclude (ignore) analyses with the UIDs A03, A09, and B06, and all analyses of samples MYBADSAMPLE-1 and MYBADSAMPLE-2. It is possible to have an exclude file with only the UID column, or only the Sample column, or both, in any order.

The --output-dir or -o option may be used to specify a custom directory name for the output. For example, in unix-like shells the following command will create a time-stamped output directory:

D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv
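The options above can be combined; for example:

D47crunch -a anchors.csv -e badbatch.csv -o myoutput rawdata.csv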

4. API Documentation

'''
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses,
from low-level data out of a dual-inlet mass spectrometer to final, “absolute”
Δ47 and Δ48 values with fully propagated analytical error estimates
([Daëron, 2021](https://doi.org/10.1029/2020GC009592)).

The **tutorial** section takes you through a series of simple steps to import/process data and print out the results.
The **how-to** section provides instructions applicable to various specific tasks.

.. include:: ../docs/tutorial.md
.. include:: ../docs/howto.md
.. include:: ../docs/cli.md

# 4. API Documentation
'''

__docformat__ = "restructuredtext"
__author__    = 'Mathieu Daëron'
__contact__   = 'daeron@lsce.ipsl.fr'
__copyright__ = 'Copyright (c) 2023 Mathieu Daëron'
__license__   = 'Modified BSD License - https://opensource.org/licenses/BSD-3-Clause'
__date__      = '2023-07-20'
__version__   = '2.2.0'

import os
import numpy as np
import typer
from typing_extensions import Annotated
from statistics import stdev
from scipy.stats import t as tstudent
from scipy.stats import levene
from scipy.interpolate import interp1d
from numpy import linalg
from lmfit import Minimizer, Parameters, report_fit
from matplotlib import pyplot as ppl
from datetime import datetime as dt
from functools import wraps
from colorsys import hls_to_rgb
from matplotlib import rcParams

rcParams['font.family'] = 'sans-serif'
rcParams['font.sans-serif'] = 'Helvetica'
rcParams['font.size'] = 10
rcParams['mathtext.fontset'] = 'custom'
rcParams['mathtext.rm'] = 'sans'
rcParams['mathtext.bf'] = 'sans:bold'
rcParams['mathtext.it'] = 'sans:italic'
rcParams['mathtext.cal'] = 'sans:italic'
rcParams['mathtext.default'] = 'rm'
rcParams['xtick.major.size'] = 4
rcParams['xtick.major.width'] = 1
rcParams['ytick.major.size'] = 4
rcParams['ytick.major.width'] = 1
rcParams['axes.grid'] = False
rcParams['axes.linewidth'] = 1
rcParams['grid.linewidth'] = .75
rcParams['grid.linestyle'] = '-'
rcParams['grid.alpha'] = .15
rcParams['savefig.dpi'] = 150

Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], [53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 
0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], [390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], 
[960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]])
_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1])
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))


Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]])
_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1])
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))


def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	       (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5

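# Example (added for illustration): summing two correlated values,
# correlated_sum([1., 2.], [[0.010, 0.004], [0.004, 0.010]])
# returns (3.0, 0.167...), since the variance of the sum is
# C[0][0] + C[1][1] + 2*C[0][1] = 0.028, and 0.028**.5 ≈ 0.167.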

def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])


def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')

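# Examples (added for illustration):
# pf('ETH-1') returns 'ETH_1'
# pf('MY SAMPLE.2') returns 'MY_SAMPLE_2'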

def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If the conversion fails, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y

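# Examples (added for illustration):
# smart_type('3')   returns the integer 3
# smart_type('3.0') returns the float 3.0
# smart_type('foo') returns the string 'foo' unchanged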

def pretty_table(x, header = 1, hsep = '  ', vsep = '–', align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
	print(pretty_table(x))
	```
	yields:
	```
	––  ––––––  –––
	A        B    C
	––  ––––––  –––
	1   1.9999  foo
	10       x  bar
	––  ––––––  –––
	```

	'''
	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)


def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]


def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [ x for x in X ]
	sX = [ sx for sx in sX ]
	W = [ sx**-2 for sx in sX ]
	W = [ w/sum(W) for w in W ]
	Xavg = sum([ w*x for w,x in zip(W,X) ])
	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
	return Xavg, sXavg


def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]


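# Example (added for illustration), using the tutorial's rawdata.csv:
# read_csv('rawdata.csv')[0] returns
# {'UID': 'A01', 'Sample': 'ETH-1', 'd45': 5.79502, 'd46': 11.62767,
#  'd47': 16.89351, 'd48': 24.56708, 'd49': 0.79486}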
def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
		of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
		Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O, D47=D47, D48=D48, D49=D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # dumb iteration to adjust for small changes in d47
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)


def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return a list of simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` defaults)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this function to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

	if session is not None:
		for r in out:
			r['Session'] = session

	if shuffle:
		nprandom.shuffle(out)

	return out


def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)


 605def table_of_sessions(
 606	data47 = None,
 607	data48 = None,
 608	dir = 'output',
 609	filename = None,
 610	save_to_file = True,
 611	print_out = True,
 612	output = None,
 613	):
 614	'''
 615	Print out, save to disk and/or return a combined table of sessions
 616	for a pair of `D47data` and `D48data` objects.
 617	***Only applicable if the sessions in `data47` and those in `data48`
 618	consist of the exact same sets of analyses.***
 619
 620	**Parameters**
 621
 622	+ `data47`: `D47data` instance
 623	+ `data48`: `D48data` instance
 624	+ `dir`: the directory in which to save the table
 625	+ `filename`: the name to the csv file to write to
 626	+ `save_to_file`: whether to save the table to disk
 627	+ `print_out`: whether to print out the table
 628	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 629		if set to `'raw'`: return a list of list of strings
 630		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 631	'''
 632	if data47 is None:
 633		if data48 is None:
 634			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 635		else:
 636			return data48.table_of_sessions(
 637				dir = dir,
 638				filename = filename,
 639				save_to_file = save_to_file,
 640				print_out = print_out,
 641				output = output
 642				)
 643	else:
 644		if data48 is None:
 645			return data47.table_of_sessions(
 646				dir = dir,
 647				filename = filename,
 648				save_to_file = save_to_file,
 649				print_out = print_out,
 650				output = output
 651				)
 652		else:
 653			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 654			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 655			for k,x in enumerate(out47[0]):
 656				if k>7:
 657					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
 658					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
 659			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])
 660
 661			if save_to_file:
 662				if not os.path.exists(dir):
 663					os.makedirs(dir)
 664				if filename is None:
 665					filename = 'D47D48_sessions.csv'
 666				with open(f'{dir}/{filename}', 'w') as fid:
 667					fid.write(make_csv(out))
 668			if print_out:
 669				print('\n'+pretty_table(out))
 670			if output == 'raw':
 671				return out
 672			elif output == 'pretty':
 673				return pretty_table(out)
 674
 675
 676def table_of_analyses(
 677	data47 = None,
 678	data48 = None,
 679	dir = 'output',
 680	filename = None,
 681	save_to_file = True,
 682	print_out = True,
 683	output = None,
 684	):
 685	'''
 686	Print out, save to disk and/or return a combined table of analyses
 687	for a pair of `D47data` and `D48data` objects.
 688
 689	If the sessions in `data47` and those in `data48` do not consist of
 690	the exact same sets of analyses, the table will have two columns
 691	`Session_47` and `Session_48` instead of a single `Session` column.
 692
 693	**Parameters**
 694
 695	+ `data47`: `D47data` instance
 696	+ `data48`: `D48data` instance
 697	+ `dir`: the directory in which to save the table
 698	+ `filename`: the name of the csv file to write to
 699	+ `save_to_file`: whether to save the table to disk
 700	+ `print_out`: whether to print out the table
 701	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 702		if set to `'raw'`: return a list of lists of strings
 703		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 704	'''
 705	if data47 is None:
 706		if data48 is None:
 707			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 708		else:
 709			return data48.table_of_analyses(
 710				dir = dir,
 711				filename = filename,
 712				save_to_file = save_to_file,
 713				print_out = print_out,
 714				output = output
 715				)
 716	else:
 717		if data48 is None:
 718			return data47.table_of_analyses(
 719				dir = dir,
 720				filename = filename,
 721				save_to_file = save_to_file,
 722				print_out = print_out,
 723				output = output
 724				)
 725		else:
 726			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 727			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 728			
 729			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
 730				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
 731			else:
 732				out47[0][1] = 'Session_47'
 733				out48[0][1] = 'Session_48'
 734				out47 = transpose_table(out47)
 735				out48 = transpose_table(out48)
 736				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])
 737
 738			if save_to_file:
 739				if not os.path.exists(dir):
 740					os.makedirs(dir)
 741				if filename is None:
 742					filename = 'D47D48_analyses.csv'
 743				with open(f'{dir}/{filename}', 'w') as fid:
 744					fid.write(make_csv(out))
 745			if print_out:
 746				print('\n'+pretty_table(out))
 747			if output == 'raw':
 748				return out
 749			elif output == 'pretty':
 750				return pretty_table(out)
 751
 752
 753def _fullcovar(minresult, epsilon = 0.01, named = False):
 754	'''
 755	Construct the full covariance matrix in the case of constrained parameters, by propagating the covariance of the free parameters through a numerical Jacobian (`J.T @ covar @ J`)
 756	'''
 757	
 758	import asteval
 759	
 760	def f(values):
 761		interp = asteval.Interpreter()
 762		for n,v in zip(minresult.var_names, values):
 763			interp(f'{n} = {v}')
 764		for q in minresult.params:
 765			if minresult.params[q].expr:
 766				interp(f'{q} = {minresult.params[q].expr}')
 767		return np.array([interp.symtable[q] for q in minresult.params])
 768
 769	# construct Jacobian
 770	J = np.zeros((minresult.nvarys, len(minresult.params)))
 771	X = np.array([minresult.params[p].value for p in minresult.var_names])
 772	sX = np.array([minresult.params[p].stderr for p in minresult.var_names])
 773
 774	for j in range(minresult.nvarys):
 775		x1 = [_ for _ in X]
 776		x1[j] += epsilon * sX[j]
 777		x2 = [_ for _ in X]
 778		x2[j] -= epsilon * sX[j]
 779		J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j])
 780
 781	_names = [q for q in minresult.params]
 782	_covar = J.T @ minresult.covar @ J
 783	_se = np.diag(_covar)**.5
 784	_correl = _covar.copy()
 785	for k,s in enumerate(_se):
 786		if s:
 787			_correl[k,:] /= s
 788			_correl[:,k] /= s
 789
 790	if named:
 791		_covar = {i: {j: _covar[ii,jj] for jj,j in enumerate(minresult.params)} for ii,i in enumerate(minresult.params)}
 792		_se = {i: _se[ii] for ii,i in enumerate(minresult.params)}
 793		_correl = {i: {j: _correl[ii,jj] for jj,j in enumerate(minresult.params)} for ii,i in enumerate(minresult.params)}
 794
 795	return _names, _covar, _se, _correl
 796
 797
 798class D4xdata(list):
 799	'''
 800	Store and process data for a large set of Δ47 and/or Δ48
 801	analyses, usually comprising more than one analytical session.
 802	'''
 803
 804	### 17O CORRECTION PARAMETERS
 805	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 806	'''
 807	Absolute (13C/12C) ratio of VPDB.
 808	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 809	'''
 810
 811	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 812	'''
 813	Absolute (18O/16O) ratio of VSMOW.
 814	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 815	'''
 816
 817	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 818	'''
 819	Mass-dependent exponent for triple oxygen isotopes.
 820	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 821	'''
 822
 823	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 824	'''
 825	Absolute (17O/16O) ratio of VSMOW.
 826	By default equal to 0.00038475
 827	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 828	rescaled to `R13_VPDB`)
 829	'''
 830
 831	R18_VPDB = R18_VSMOW * 1.03092
 832	'''
 833	Absolute (18O/16O) ratio of VPDB.
 834	By definition equal to `R18_VSMOW * 1.03092`.
 835	'''
 836
 837	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 838	'''
 839	Absolute (17O/16O) ratio of VPDB.
 840	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 841	'''
 842
 843	LEVENE_REF_SAMPLE = 'ETH-3'
 844	'''
 845	After the Δ4x standardization step, each sample is tested to
 846	assess whether the Δ4x variance within all analyses for that
 847	sample differs significantly from that observed for a given reference
 848	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 849	which yields a p-value corresponding to the null hypothesis that the
 850	underlying variances are equal).
 851
 852	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 853	sample should be used as a reference for this test.
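
	For example, to use ETH-1 as the reference sample instead:

	```py
	mydata.LEVENE_REF_SAMPLE = 'ETH-1'
	```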
 854	'''
 855
 856	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 857	'''
 858	Specifies the 18O/16O fractionation factor generally applicable
 859	to acid reactions in the dataset. Currently used by `D4xdata.wg()`
 860	and `D4xdata.standardize_d18O()`.
 861
 862	By default equal to 1.008129 (calcite reacted at 90 °C,
 863	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 864	'''
 865
 866	Nominal_d13C_VPDB = {
 867		'ETH-1': 2.02,
 868		'ETH-2': -10.17,
 869		'ETH-3': 1.71,
 870		}	# (Bernasconi et al., 2018)
 871	'''
 872	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 873	`D4xdata.standardize_d13C()`.
 874
 875	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 876	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
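
	To define additional anchors, redefine this dictionary before processing the
	data, e.g. (a minimal sketch, using a hypothetical in-house standard):

	```py
	mydata.Nominal_d13C_VPDB = {
		**D47crunch.D47data.Nominal_d13C_VPDB,
		'INHOUSE-STD': 1.23,  # hypothetical δ13C_VPDB value
		}
	```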
 877	'''
 878
 879	Nominal_d18O_VPDB = {
 880		'ETH-1': -2.19,
 881		'ETH-2': -18.69,
 882		'ETH-3': -1.78,
 883		}	# (Bernasconi et al., 2018)
 884	'''
 885	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 886	`D4xdata.standardize_d18O()`.
 887
 888	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 889	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 890	'''
 891
 892	d13C_STANDARDIZATION_METHOD = '2pt'
 893	'''
 894	Method by which to standardize δ13C values:
 895	
 896	+ `'none'`: do not apply any δ13C standardization.
 897	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 898	minimize the difference between final δ13C_VPDB values and
 899	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 900	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 901	values so as to minimize the difference between final δ13C_VPDB
 902	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 903	is defined).
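
	For example, to use the `'1pt'` method instead (a minimal sketch; this must be
	set before the data are read, because `D4xdata.refresh_sessions()` copies it
	to each session):

	```py
	mydata = D47crunch.D47data()
	mydata.d13C_STANDARDIZATION_METHOD = '1pt'
	mydata.read('rawdata.csv')
	```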
 904	'''
 905
 906	d18O_STANDARDIZATION_METHOD = '2pt'
 907	'''
 908	Method by which to standardize δ18O values:
 909	
 910	+ `'none'`: do not apply any δ18O standardization.
 911	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 912	minimize the difference between final δ18O_VPDB values and
 913	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 914	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 915	values so as to minimize the difference between final δ18O_VPDB
 916	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 917	is defined).
 918	'''
 919
 920	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 921		'''
 922		**Parameters**
 923
 924		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 925		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 926		+ `mass`: `'47'` or `'48'`
 927		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 928		+ `session`: define session name for analyses without a `Session` key
 929		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 930
 931		Returns a `D4xdata` object derived from `list`.
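
		Example (a minimal sketch with hypothetical values), as an alternative
		to reading a csv file with `D4xdata.read()`:

		```py
		mydata = D47crunch.D4xdata([
			{'Sample': 'MYSAMPLE-1', 'd45': 1.23, 'd46': 4.56, 'd47': 7.89},
			], mass = '47', session = 'Session01')
		```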
 932		'''
 933		self._4x = mass
 934		self.verbose = verbose
 935		self.prefix = 'D4xdata'
 936		self.logfile = logfile
 937		list.__init__(self, l)
 938		self.Nf = None
 939		self.repeatability = {}
 940		self.refresh(session = session)
 941
 942
 943	def make_verbal(oldfun):
 944		'''
 945		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 946		'''
 947		@wraps(oldfun)
 948		def newfun(*args, verbose = '', **kwargs):
 949			myself = args[0]
 950			oldprefix = myself.prefix
 951			myself.prefix = oldfun.__name__
 952			if verbose != '':
 953				oldverbose = myself.verbose
 954				myself.verbose = verbose
 955			out = oldfun(*args, **kwargs)
 956			myself.prefix = oldprefix
 957			if verbose != '':
 958				myself.verbose = oldverbose
 959			return out
 960		return newfun
 961
 962
 963	def msg(self, txt):
 964		'''
 965		Log a message to `self.logfile`, and print it out if `verbose = True`
 966		'''
 967		self.log(txt)
 968		if self.verbose:
 969			print(f'{f"[{self.prefix}]":<16} {txt}')
 970
 971
 972	def vmsg(self, txt):
 973		'''
 974		Log a message to `self.logfile` and print it out
 975		'''
 976		self.log(txt)
 977		print(txt)
 978
 979
 980	def log(self, *txts):
 981		'''
 982		Log a message to `self.logfile`
 983		'''
 984		if self.logfile:
 985			with open(self.logfile, 'a') as fid:
 986				for txt in txts:
 987					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
 988
 989
 990	def refresh(self, session = 'mySession'):
 991		'''
 992		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
 993		'''
 994		self.fill_in_missing_info(session = session)
 995		self.refresh_sessions()
 996		self.refresh_samples()
 997
 998
 999	def refresh_sessions(self):
1000		'''
1001		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1002		to `False` for all sessions.
1003		'''
1004		self.sessions = {
1005			s: {'data': [r for r in self if r['Session'] == s]}
1006			for s in sorted({r['Session'] for r in self})
1007			}
1008		for s in self.sessions:
1009			self.sessions[s]['scrambling_drift'] = False
1010			self.sessions[s]['slope_drift'] = False
1011			self.sessions[s]['wg_drift'] = False
1012			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1013			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1014
1015
1016	def refresh_samples(self):
1017		'''
1018		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1019		'''
1020		self.samples = {
1021			s: {'data': [r for r in self if r['Sample'] == s]}
1022			for s in sorted({r['Sample'] for r in self})
1023			}
1024		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1025		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1026
1027
1028	def read(self, filename, sep = '', session = ''):
1029		'''
1030		Read file in csv format to load data into a `D47data` object.
1031
1032		In the csv file, spaces before and after field separators (`','` by default)
1033		are optional. Each line corresponds to a single analysis.
1034
1035		The required fields are:
1036
1037		+ `UID`: a unique identifier
1038		+ `Session`: an identifier for the analytical session
1039		+ `Sample`: a sample identifier
1040		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1041
1042		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1043		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas
1044		`d47`, `d48` and `d49` not required above are optional, and set to NaN by default.
1045
1046		**Parameters**
1047
1048		+ `filename`: the path of the file to read
1049		+ `sep`: csv separator delimiting the fields
1050		+ `session`: set `Session` field to this string for all analyses
1051		'''
1052		with open(filename) as fid:
1053			self.input(fid.read(), sep = sep, session = session)
1054
1055
1056	def input(self, txt, sep = '', session = ''):
1057		'''
1058		Read `txt` string in csv format to load analysis data into a `D47data` object.
1059
1060		In the csv string, spaces before and after field separators (`','` by default)
1061		are optional. Each line corresponds to a single analysis.
1062
1063		The required fields are:
1064
1065		+ `UID`: a unique identifier
1066		+ `Session`: an identifier for the analytical session
1067		+ `Sample`: a sample identifier
1068		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1069
1070		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1071		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas
1072		`d47`, `d48` and `d49` not required above are optional, and set to NaN by default.
1073
1074		**Parameters**
1075
1076		+ `txt`: the csv string to read
1077		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1078		whichever appears most often in `txt`.
1079		+ `session`: set `Session` field to this string for all analyses
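
		Example (a minimal sketch with hypothetical values; the separator is
		detected automatically here):

		```py
		mydata.input('UID,Sample,d45,d46,d47\nA01,MYSAMPLE-1,1.23,4.56,7.89')
		```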
1080		'''
1081		if sep == '':
1082			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1083		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1084		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1085
1086		if session != '':
1087			for r in data:
1088				r['Session'] = session
1089
1090		self += data
1091		self.refresh()
1092
1093
1094	@make_verbal
1095	def wg(self, samples = None, a18_acid = None):
1096		'''
1097		Compute bulk composition of the working gas for each session based on
1098		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1099		`self.Nominal_d18O_VPDB`.
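
		**Parameters**

		+ `samples`: the list of standards to use for this computation; by default,
		all samples defined in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`
		+ `a18_acid`: the 18O/16O acid fractionation factor to use; by default,
		`self.ALPHA_18O_ACID_REACTION`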
1100		'''
1101
1102		self.msg('Computing WG composition:')
1103
1104		if a18_acid is None:
1105			a18_acid = self.ALPHA_18O_ACID_REACTION
1106		if samples is None:
1107			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1108
1109		assert a18_acid, 'Acid fractionation factor should not be zero.'
1110
1111		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1112		R45R46_standards = {}
1113		for sample in samples:
1114			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1115			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1116			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1117			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1118			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1119
1120			C12_s = 1 / (1 + R13_s)
1121			C13_s = R13_s / (1 + R13_s)
1122			C16_s = 1 / (1 + R17_s + R18_s)
1123			C17_s = R17_s / (1 + R17_s + R18_s)
1124			C18_s = R18_s / (1 + R17_s + R18_s)
1125
1126			C626_s = C12_s * C16_s ** 2
1127			C627_s = 2 * C12_s * C16_s * C17_s
1128			C628_s = 2 * C12_s * C16_s * C18_s
1129			C636_s = C13_s * C16_s ** 2
1130			C637_s = 2 * C13_s * C16_s * C17_s
1131			C727_s = C12_s * C17_s ** 2
1132
1133			R45_s = (C627_s + C636_s) / C626_s
1134			R46_s = (C628_s + C637_s + C727_s) / C626_s
1135			R45R46_standards[sample] = (R45_s, R46_s)
1136		
1137		for s in self.sessions:
1138			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1139			assert db, f'No sample from {samples} found in session "{s}".'
1140# 			dbsamples = sorted({r['Sample'] for r in db})
1141
1142			X = [r['d45'] for r in db]
1143			Y = [R45R46_standards[r['Sample']][0] for r in db]
1144			x1, x2 = np.min(X), np.max(X)
1145
1146			if x1 < x2:
1147				wgcoord = x1/(x1-x2)
1148			else:
1149				wgcoord = 999
1150
1151			if wgcoord < -.5 or wgcoord > 1.5:
1152				# unreasonable to extrapolate to d45 = 0
1153				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1154			else:
1155				# d45 = 0 is reasonably well bracketed
1156				R45_wg = np.polyfit(X, Y, 1)[1]
1157
1158			X = [r['d46'] for r in db]
1159			Y = [R45R46_standards[r['Sample']][1] for r in db]
1160			x1, x2 = np.min(X), np.max(X)
1161
1162			if x1 < x2:
1163				wgcoord = x1/(x1-x2)
1164			else:
1165				wgcoord = 999
1166
1167			if wgcoord < -.5 or wgcoord > 1.5:
1168				# unreasonable to extrapolate to d46 = 0
1169				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1170			else:
1171				# d46 = 0 is reasonably well bracketed
1172				R46_wg = np.polyfit(X, Y, 1)[1]
1173
1174			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1175
1176			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1177
1178			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1179			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1180			for r in self.sessions[s]['data']:
1181				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1182				r['d18Owg_VSMOW'] = d18Owg_VSMOW
1183
1184
1185	def compute_bulk_delta(self, R45, R46, D17O = 0):
1186		'''
1187		Compute δ13C_VPDB and δ18O_VSMOW,
1188		by solving the generalized form of equation (17) from
1189		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1190		assuming that δ18O_VSMOW is not too large (0 ± 50 ‰) and
1191		solving the corresponding second-order Taylor polynomial.
1192		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
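
		Example (a minimal sketch with hypothetical isobar ratios):

		```py
		d13C_VPDB, d18O_VSMOW = mydata.compute_bulk_delta(R45 = 0.01198, R46 = 0.00417)
		```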
1193		'''
1194
1195		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1196
1197		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1198		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1199		C = 2 * self.R18_VSMOW
1200		D = -R46
1201
1202		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1203		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1204		cc = A + B + C + D
1205
1206		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1207
1208		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1209		R17 = K * R18 ** self.LAMBDA_17
1210		R13 = R45 - 2 * R17
1211
1212		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1213
1214		return d13C_VPDB, d18O_VSMOW
1215
1216
1217	@make_verbal
1218	def crunch(self, verbose = ''):
1219		'''
1220		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1221		'''
1222		for r in self:
1223			self.compute_bulk_and_clumping_deltas(r)
1224		self.standardize_d13C()
1225		self.standardize_d18O()
1226		self.msg(f"Crunched {len(self)} analyses.")
1227
1228
1229	def fill_in_missing_info(self, session = 'mySession'):
1230		'''
1231		Fill in optional fields with default values
1232		'''
1233		for i,r in enumerate(self):
1234			if 'D17O' not in r:
1235				r['D17O'] = 0.
1236			if 'UID' not in r:
1237				r['UID'] = f'{i+1}'
1238			if 'Session' not in r:
1239				r['Session'] = session
1240			for k in ['d47', 'd48', 'd49']:
1241				if k not in r:
1242					r[k] = np.nan
1243
1244
1245	def standardize_d13C(self):
1246		'''
1247		Perform δ13C standardization within each session `s` according to
1248		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1249		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1250		may be redefined arbitrarily at a later stage.
1251		'''
1252		for s in self.sessions:
1253			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1254				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1255				X,Y = zip(*XY)
1256				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1257					offset = np.mean(Y) - np.mean(X)
1258					for r in self.sessions[s]['data']:
1259						r['d13C_VPDB'] += offset				
1260				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1261					a,b = np.polyfit(X,Y,1)
1262					for r in self.sessions[s]['data']:
1263						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1264
1265	def standardize_d18O(self):
1266		'''
1267		Perform δ18O standardization within each session `s` according to
1268		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1269		which is defined by default by `D47data.refresh_sessions()` as equal to
1270		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1271		'''
1272		for s in self.sessions:
1273			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1274				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1275				X,Y = zip(*XY)
1276				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1277				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1278					offset = np.mean(Y) - np.mean(X)
1279					for r in self.sessions[s]['data']:
1280						r['d18O_VSMOW'] += offset				
1281				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1282					a,b = np.polyfit(X,Y,1)
1283					for r in self.sessions[s]['data']:
1284						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1285	
1286
1287	def compute_bulk_and_clumping_deltas(self, r):
1288		'''
1289		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1290		'''
1291
1292		# Compute working gas R13, R18, and isobar ratios
1293		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1294		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1295		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1296
1297		# Compute analyte isobar ratios
1298		R45 = (1 + r['d45'] / 1000) * R45_wg
1299		R46 = (1 + r['d46'] / 1000) * R46_wg
1300		R47 = (1 + r['d47'] / 1000) * R47_wg
1301		R48 = (1 + r['d48'] / 1000) * R48_wg
1302		R49 = (1 + r['d49'] / 1000) * R49_wg
1303
1304		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1305		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1306		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1307
1308		# Compute stochastic isobar ratios of the analyte
1309		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1310			R13, R18, D17O = r['D17O']
1311		)
1312
1313		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1314		# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
1315		if (R45 / R45stoch - 1) > 5e-8:
1316			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1317		if (R46 / R46stoch - 1) > 5e-8:
1318			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1319
1320		# Compute raw clumped isotope anomalies
1321		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1322		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1323		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1324
1325
1326	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1327		'''
1328		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1329		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1330		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1331		'''
1332
1333		# Compute R17
1334		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1335
1336		# Compute isotope concentrations
1337		C12 = (1 + R13) ** -1
1338		C13 = C12 * R13
1339		C16 = (1 + R17 + R18) ** -1
1340		C17 = C16 * R17
1341		C18 = C16 * R18
1342
1343		# Compute stochastic isotopologue concentrations
1344		C626 = C16 * C12 * C16
1345		C627 = C16 * C12 * C17 * 2
1346		C628 = C16 * C12 * C18 * 2
1347		C636 = C16 * C13 * C16
1348		C637 = C16 * C13 * C17 * 2
1349		C638 = C16 * C13 * C18 * 2
1350		C727 = C17 * C12 * C17
1351		C728 = C17 * C12 * C18 * 2
1352		C737 = C17 * C13 * C17
1353		C738 = C17 * C13 * C18 * 2
1354		C828 = C18 * C12 * C18
1355		C838 = C18 * C13 * C18
1356
1357		# Compute stochastic isobar ratios
1358		R45 = (C636 + C627) / C626
1359		R46 = (C628 + C637 + C727) / C626
1360		R47 = (C638 + C728 + C737) / C626
1361		R48 = (C738 + C828) / C626
1362		R49 = C838 / C626
1363
1364		# Account for clumped-isotope anomalies (departures from the stochastic distribution)
1365		R47 *= 1 + D47 / 1000
1366		R48 *= 1 + D48 / 1000
1367		R49 *= 1 + D49 / 1000
1368
1369		# Return isobar ratios
1370		return R45, R46, R47, R48, R49
1371
1372
1373	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1374		'''
1375		Split unknown samples by UID (treat all analyses as different samples)
1376		or by session (treat analyses of a given sample in different sessions as
1377		different samples).
1378
1379		**Parameters**
1380
1381		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1382		+ `grouping`: `by_uid` | `by_session`
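
		Example (a minimal sketch): test whether MYSAMPLE-1 is homogeneous across
		sessions, then revert to the original sample groupings:

		```py
		mydata.split_samples(['MYSAMPLE-1'], grouping = 'by_session')
		mydata.standardize()
		mydata.unsplit_samples()
		```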
1383		'''
1384		if samples_to_split == 'all':
1385			samples_to_split = [s for s in self.unknowns]
1386		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1387		self.grouping = grouping.lower()
1388		if self.grouping in gkeys:
1389			gkey = gkeys[self.grouping]
1390		for r in self:
1391			if r['Sample'] in samples_to_split:
1392				r['Sample_original'] = r['Sample']
1393				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1394			elif r['Sample'] in self.unknowns:
1395				r['Sample_original'] = r['Sample']
1396		self.refresh_samples()
1397
1398
1399	def unsplit_samples(self, tables = False):
1400		'''
1401		Reverse the effects of `D47data.split_samples()`.
1402		
1403		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1404		
1405		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1406		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1407		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1408	effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1409		that case session-averaged Δ4x values are statistically independent).
1410		'''
1411		unknowns_old = sorted({s for s in self.unknowns})
1412		CM_old = self.standardization.covar[:,:]
1413		VD_old = self.standardization.params.valuesdict().copy()
1414		vars_old = self.standardization.var_names
1415
1416		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1417
1418		Ns = len(vars_old) - len(unknowns_old)
1419		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1420		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1421
1422		W = np.zeros((len(vars_new), len(vars_old)))
1423		W[:Ns,:Ns] = np.eye(Ns)
1424		for u in unknowns_new:
1425			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1426			if self.grouping == 'by_session':
1427				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1428			elif self.grouping == 'by_uid':
1429				weights = [1 for s in splits]
1430			sw = sum(weights)
1431			weights = [w/sw for w in weights]
1432			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1433
1434		CM_new = W @ CM_old @ W.T
1435		V = W @ np.array([[VD_old[k]] for k in vars_old])
1436		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1437
1438		self.standardization.covar = CM_new
1439		self.standardization.params.valuesdict = lambda : VD_new
1440		self.standardization.var_names = vars_new
1441
1442		for r in self:
1443			if r['Sample'] in self.unknowns:
1444				r['Sample_split'] = r['Sample']
1445				r['Sample'] = r['Sample_original']
1446
1447		self.refresh_samples()
1448		self.consolidate_samples()
1449		self.repeatabilities()
1450
1451		if tables:
1452			self.table_of_analyses()
1453			self.table_of_samples()
1454
1455	def assign_timestamps(self):
1456		'''
1457		Assign a time field `t` of type `float` to each analysis.
1458
1459		If `TimeTag` is one of the data fields, `t` is equal within a given session
1460		to `TimeTag` minus the mean value of `TimeTag` for that session.
1461		Otherwise, each analysis is assigned an index within its session, and `t`
1462		is defined as that index minus the mean index for that session.
1463		'''
1464		for session in self.sessions:
1465			sdata = self.sessions[session]['data']
1466			try:
1467				t0 = np.mean([r['TimeTag'] for r in sdata])
1468				for r in sdata:
1469					r['t'] = r['TimeTag'] - t0
1470			except KeyError:
1471				t0 = (len(sdata)-1)/2
1472				for t,r in enumerate(sdata):
1473					r['t'] = t - t0
1474
1475
1476	def report(self):
1477		'''
1478		Prints a report on the standardization fit.
1479		Only applicable after `D4xdata.standardize(method='pooled')`.
1480		'''
1481		report_fit(self.standardization)
1482
1483
1484	def combine_samples(self, sample_groups):
1485		'''
1486		Combine analyses of different samples to compute weighted average Δ4x
1487		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1488		dictionary.
1489		
1490		Caution: samples are weighted by number of replicate analyses, which is a
1491		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1492		correlated analytical errors for one or more samples).
1493		
1494	Returns a tuple of:
1495		
1496		+ the list of group names
1497		+ an array of the corresponding Δ4x values
1498		+ the corresponding (co)variance matrix
1499		
1500		**Parameters**
1501
1502		+ `sample_groups`: a dictionary of the form:
1503		```py
1504		{'group1': ['sample_1', 'sample_2'],
1505		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1506		```
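
	Example of use (a minimal sketch, assuming `mydata` has already been
	standardized):

	```py
	groups, D47_combined, CM_combined = mydata.combine_samples(
		{'groupA': ['MYSAMPLE-1', 'MYSAMPLE-2']},
		)
	```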
1507		'''
1508		
1509		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1510		groups = sorted(sample_groups.keys())
1511		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1512		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1513		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1514		W = np.array([
1515			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1516			for j in groups])
1517		D4x_new = W @ D4x_old
1518		CM_new = W @ CM_old @ W.T
1519
1520		return groups, D4x_new[:,0], CM_new
1521		
1522
1523	@make_verbal
1524	def standardize(self,
1525		method = 'pooled',
1526		weighted_sessions = [],
1527		consolidate = True,
1528		consolidate_tables = False,
1529		consolidate_plots = False,
1530		constraints = {},
1531		):
1532		'''
1533		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1534		If the `method` argument is set to `'pooled'`, the standardization processes all sessions
1535		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1536		i.e. that their true Δ4x value does not change between sessions
1537		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` is set to
1538		`'indep_sessions'`, the standardization processes each session independently, based only
1539		on anchor analyses.
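
		A minimal sketch of both approaches:

		```py
		mydata.standardize()                           # pooled fit across all sessions
		mydata.standardize(method = 'indep_sessions')  # each session fitted separately
		```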
1540		'''
1541
1542		self.standardization_method = method
1543		self.assign_timestamps()
1544
1545		if method == 'pooled':
1546			if weighted_sessions:
1547				for session_group in weighted_sessions:
1548					if self._4x == '47':
1549						X = D47data([r for r in self if r['Session'] in session_group])
1550					elif self._4x == '48':
1551						X = D48data([r for r in self if r['Session'] in session_group])
1552					X.Nominal_D4x = self.Nominal_D4x.copy()
1553					X.refresh()
1554					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1555					w = np.sqrt(result.redchi)
1556					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1557					for r in X:
1558						r[f'wD{self._4x}raw'] *= w
1559			else:
1560				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1561				for r in self:
1562					r[f'wD{self._4x}raw'] = 1.
1563
1564			params = Parameters()
1565			for k,session in enumerate(self.sessions):
1566				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1567				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1568				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1569				s = pf(session)
1570				params.add(f'a_{s}', value = 0.9)
1571				params.add(f'b_{s}', value = 0.)
1572				params.add(f'c_{s}', value = -0.9)
1573				params.add(f'a2_{s}', value = 0.,
1574# 					vary = self.sessions[session]['scrambling_drift'],
1575					)
1576				params.add(f'b2_{s}', value = 0.,
1577# 					vary = self.sessions[session]['slope_drift'],
1578					)
1579				params.add(f'c2_{s}', value = 0.,
1580# 					vary = self.sessions[session]['wg_drift'],
1581					)
1582				if not self.sessions[session]['scrambling_drift']:
1583					params[f'a2_{s}'].expr = '0'
1584				if not self.sessions[session]['slope_drift']:
1585					params[f'b2_{s}'].expr = '0'
1586				if not self.sessions[session]['wg_drift']:
1587					params[f'c2_{s}'].expr = '0'
1588
1589			for sample in self.unknowns:
1590				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1591
1592			for k in constraints:
1593				params[k].expr = constraints[k]
1594
1595			def residuals(p):
1596				R = []
1597				for r in self:
1598					session = pf(r['Session'])
1599					sample = pf(r['Sample'])
1600					if r['Sample'] in self.Nominal_D4x:
1601						R += [ (
1602							r[f'D{self._4x}raw'] - (
1603								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1604								+ p[f'b_{session}'] * r[f'd{self._4x}']
1605								+	p[f'c_{session}']
1606								+ r['t'] * (
1607									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1608									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1609									+	p[f'c2_{session}']
1610									)
1611								)
1612							) / r[f'wD{self._4x}raw'] ]
1613					else:
1614						R += [ (
1615							r[f'D{self._4x}raw'] - (
1616								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1617								+ p[f'b_{session}'] * r[f'd{self._4x}']
1618								+	p[f'c_{session}']
1619								+ r['t'] * (
1620									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1621									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1622									+	p[f'c2_{session}']
1623									)
1624								)
1625							) / r[f'wD{self._4x}raw'] ]
1626				return R
1627
1628			M = Minimizer(residuals, params)
1629			result = M.least_squares()
1630			self.Nf = result.nfree
1631			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1632			new_names, new_covar, new_se = _fullcovar(result)[:3]
1633			result.var_names = new_names
1634			result.covar = new_covar
1635
1636			for r in self:
1637				s = pf(r["Session"])
1638				a = result.params.valuesdict()[f'a_{s}']
1639				b = result.params.valuesdict()[f'b_{s}']
1640				c = result.params.valuesdict()[f'c_{s}']
1641				a2 = result.params.valuesdict()[f'a2_{s}']
1642				b2 = result.params.valuesdict()[f'b2_{s}']
1643				c2 = result.params.valuesdict()[f'c2_{s}']
1644				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1645				
1646
1647			self.standardization = result
1648
1649			for session in self.sessions:
1650				self.sessions[session]['Np'] = 3
1651				for k in ['scrambling', 'slope', 'wg']:
1652					if self.sessions[session][f'{k}_drift']:
1653						self.sessions[session]['Np'] += 1
1654
1655			if consolidate:
1656				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1657			return result
1658
1659
1660		elif method == 'indep_sessions':
1661
1662			if weighted_sessions:
1663				for session_group in weighted_sessions:
1664					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1665					X.Nominal_D4x = self.Nominal_D4x.copy()
1666					X.refresh()
1667					# This is only done to assign r['wD47raw'] for r in X:
1668					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1669					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1670			else:
1671				self.msg('All weights set to 1 ‰')
1672				for r in self:
1673					r[f'wD{self._4x}raw'] = 1
1674
1675			for session in self.sessions:
1676				s = self.sessions[session]
1677				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1678				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1679				s['Np'] = sum(p_active)
1680				sdata = s['data']
1681
1682				A = np.array([
1683					[
1684						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1685						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1686						1 / r[f'wD{self._4x}raw'],
1687						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1688						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1689						r['t'] / r[f'wD{self._4x}raw']
1690						]
1691					for r in sdata if r['Sample'] in self.anchors
1692					])[:,p_active] # only keep columns for the active parameters
1693				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1694				s['Na'] = Y.size
1695				CM = linalg.inv(A.T @ A)
1696				bf = (CM @ A.T @ Y).T[0,:]
1697				k = 0
1698				for n,a in zip(p_names, p_active):
1699					if a:
1700						s[n] = bf[k]
1701# 						self.msg(f'{n} = {bf[k]}')
1702						k += 1
1703					else:
1704						s[n] = 0.
1705# 						self.msg(f'{n} = 0.0')
1706
1707				for r in sdata :
1708					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1709					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1710					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1711
1712				s['CM'] = np.zeros((6,6))
1713				i = 0
1714				k_active = [j for j,a in enumerate(p_active) if a]
1715				for j,a in enumerate(p_active):
1716					if a:
1717						s['CM'][j,k_active] = CM[i,:]
1718						i += 1
1719
1720			if not weighted_sessions:
1721				w = self.rmswd()['rmswd']
1722				for r in self:
1723					r[f'wD{self._4x}'] *= w
1724					r[f'wD{self._4x}raw'] *= w
1725				for session in self.sessions:
1726					self.sessions[session]['CM'] *= w**2
1727
1728			for session in self.sessions:
1729				s = self.sessions[session]
1730				s['SE_a'] = s['CM'][0,0]**.5
1731				s['SE_b'] = s['CM'][1,1]**.5
1732				s['SE_c'] = s['CM'][2,2]**.5
1733				s['SE_a2'] = s['CM'][3,3]**.5
1734				s['SE_b2'] = s['CM'][4,4]**.5
1735				s['SE_c2'] = s['CM'][5,5]**.5
1736
1737			if not weighted_sessions:
1738				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1739			else:
1740				self.Nf = 0
1741				for sg in weighted_sessions:
1742					self.Nf += self.rmswd(sessions = sg)['Nf']
1743
1744			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1745
1746			avgD4x = {
1747				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1748				for sample in self.samples
1749				}
1750			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1751			rD4x = (chi2/self.Nf)**.5
1752			self.repeatability[f'sigma_{self._4x}'] = rD4x
1753
1754			if consolidate:
1755				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1756
1757
1758	def standardization_error(self, session, d4x, D4x, t = 0):
1759		'''
1760		Compute the standardization error for a given session and
1761		(δ4x, Δ4x) composition.
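
		Specifically, with `V` the vector of partial derivatives of Δ4x with respect
		to the session parameters (`a`, `b`, `c`, `a2`, `b2`, `c2`) and `CM` their
		covariance matrix, the returned error is `sqrt(V @ CM @ V.T)`.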
1762		'''
1763		a = self.sessions[session]['a']
1764		b = self.sessions[session]['b']
1765		c = self.sessions[session]['c']
1766		a2 = self.sessions[session]['a2']
1767		b2 = self.sessions[session]['b2']
1768		c2 = self.sessions[session]['c2']
1769		CM = self.sessions[session]['CM']
1770
1771		x, y = D4x, d4x
1772		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1773# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1774		dxdy = -(b+b2*t) / (a+a2*t)
1775		dxdz = 1. / (a+a2*t)
1776		dxda = -x / (a+a2*t)
1777		dxdb = -y / (a+a2*t)
1778		dxdc = -1. / (a+a2*t)
1779		dxda2 = -x * t / (a+a2*t)  # ∂x/∂a2 (cf. the commented expression for x above)
1780		dxdb2 = -y * t / (a+a2*t)
1781		dxdc2 = -t / (a+a2*t)
1782		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1783		sx = (V @ CM @ V.T) ** .5
1784		return sx
1785
1786
1787	@make_verbal
1788	def summary(self,
1789		dir = 'output',
1790		filename = None,
1791		save_to_file = True,
1792		print_out = True,
1793		):
1794		'''
1795		Print out and/or save to disk a summary of the standardization results.
1796
1797		**Parameters**
1798
1799		+ `dir`: the directory in which to save the table
1800		+ `filename`: the name of the csv file to write to
1801		+ `save_to_file`: whether to save the table to disk
1802		+ `print_out`: whether to print out the table
1803		'''
1804
1805		out = []
1806		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1807		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1808		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1809		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1810		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1811		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1812		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1813		out += [['Model degrees of freedom', f"{self.Nf}"]]
1814		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1815		out += [['Standardization method', self.standardization_method]]
1816
1817		if save_to_file:
1818			if not os.path.exists(dir):
1819				os.makedirs(dir)
1820			if filename is None:
1821				filename = f'D{self._4x}_summary.csv'
1822			with open(f'{dir}/{filename}', 'w') as fid:
1823				fid.write(make_csv(out))
1824		if print_out:
1825			self.msg('\n' + pretty_table(out, header = 0))
1826
1827
1828	@make_verbal
1829	def table_of_sessions(self,
1830		dir = 'output',
1831		filename = None,
1832		save_to_file = True,
1833		print_out = True,
1834		output = None,
1835		):
1836		'''
1837		Print out and/or save to disk a table of sessions.
1838
1839		**Parameters**
1840
1841		+ `dir`: the directory in which to save the table
1842		+ `filename`: the name of the csv file to write to
1843		+ `save_to_file`: whether to save the table to disk
1844		+ `print_out`: whether to print out the table
1845		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1846		    if set to `'raw'`: return a list of lists of strings
1847		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1848		'''
1849		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1850		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1851		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1852
1853		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1854		if include_a2:
1855			out[-1] += ['a2 ± SE']
1856		if include_b2:
1857			out[-1] += ['b2 ± SE']
1858		if include_c2:
1859			out[-1] += ['c2 ± SE']
1860		for session in self.sessions:
1861			out += [[
1862				session,
1863				f"{self.sessions[session]['Na']}",
1864				f"{self.sessions[session]['Nu']}",
1865				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1866				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1867				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1868				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1869				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1870				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1871				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1872				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1873				]]
1874			if include_a2:
1875				if self.sessions[session]['scrambling_drift']:
1876					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1877				else:
1878					out[-1] += ['']
1879			if include_b2:
1880				if self.sessions[session]['slope_drift']:
1881					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1882				else:
1883					out[-1] += ['']
1884			if include_c2:
1885				if self.sessions[session]['wg_drift']:
1886					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1887				else:
1888					out[-1] += ['']
1889
1890		if save_to_file:
1891			if not os.path.exists(dir):
1892				os.makedirs(dir)
1893			if filename is None:
1894				filename = f'D{self._4x}_sessions.csv'
1895			with open(f'{dir}/{filename}', 'w') as fid:
1896				fid.write(make_csv(out))
1897		if print_out:
1898			self.msg('\n' + pretty_table(out))
1899		if output == 'raw':
1900			return out
1901		elif output == 'pretty':
1902			return pretty_table(out)
1903
1904
1905	@make_verbal
1906	def table_of_analyses(
1907		self,
1908		dir = 'output',
1909		filename = None,
1910		save_to_file = True,
1911		print_out = True,
1912		output = None,
1913		):
1914		'''
1915		Print out and/or save to disk a table of analyses.
1916
1917		**Parameters**
1918
1919		+ `dir`: the directory in which to save the table
1920		+ `filename`: the name of the csv file to write to
1921		+ `save_to_file`: whether to save the table to disk
1922		+ `print_out`: whether to print out the table
1923		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1924		    if set to `'raw'`: return a list of lists of strings
1925		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1926		'''
1927
1928		out = [['UID','Session','Sample']]
1929		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1930		for f in extra_fields:
1931			out[-1] += [f[0]]
1932		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1933		for r in self:
1934			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1935			for f in extra_fields:
1936				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1937			out[-1] += [
1938				f"{r['d13Cwg_VPDB']:.3f}",
1939				f"{r['d18Owg_VSMOW']:.3f}",
1940				f"{r['d45']:.6f}",
1941				f"{r['d46']:.6f}",
1942				f"{r['d47']:.6f}",
1943				f"{r['d48']:.6f}",
1944				f"{r['d49']:.6f}",
1945				f"{r['d13C_VPDB']:.6f}",
1946				f"{r['d18O_VSMOW']:.6f}",
1947				f"{r['D47raw']:.6f}",
1948				f"{r['D48raw']:.6f}",
1949				f"{r['D49raw']:.6f}",
1950				f"{r[f'D{self._4x}']:.6f}"
1951				]
1952		if save_to_file:
1953			if not os.path.exists(dir):
1954				os.makedirs(dir)
1955			if filename is None:
1956				filename = f'D{self._4x}_analyses.csv'
1957			with open(f'{dir}/{filename}', 'w') as fid:
1958				fid.write(make_csv(out))
1959		if print_out:
1960			self.msg('\n' + pretty_table(out))
1961		return pretty_table(out) if output == 'pretty' else out
1962
1963	@make_verbal
1964	def covar_table(
1965		self,
1966		correl = False,
1967		dir = 'output',
1968		filename = None,
1969		save_to_file = True,
1970		print_out = True,
1971		output = None,
1972		):
1973		'''
1974		Print out, save to disk and/or return the variance-covariance matrix of Δ4x
1975		for all unknown samples.
1976
1977		**Parameters**
1978
1979		+ `dir`: the directory in which to save the csv
1980		+ `filename`: the name of the csv file to write to
1981		+ `save_to_file`: whether to save the csv
1982		+ `print_out`: whether to print out the matrix
1983		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
1984		    if set to `'raw'`: return a list of lists of strings
1985		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
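
		For example (a minimal sketch):

		```py
		mydata.covar_table(correl = True, save_to_file = False)  # print the correlation matrix
		```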
1986		'''
1987		samples = sorted([u for u in self.unknowns])
1988		out = [[''] + samples]
1989		for s1 in samples:
1990			out.append([s1])
1991			for s2 in samples:
1992				if correl:
1993					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
1994				else:
1995					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
1996
1997		if save_to_file:
1998			if not os.path.exists(dir):
1999				os.makedirs(dir)
2000			if filename is None:
2001				if correl:
2002					filename = f'D{self._4x}_correl.csv'
2003				else:
2004					filename = f'D{self._4x}_covar.csv'
2005			with open(f'{dir}/{filename}', 'w') as fid:
2006				fid.write(make_csv(out))
2007		if print_out:
2008			self.msg('\n'+pretty_table(out))
2009		if output == 'raw':
2010			return out
2011		elif output == 'pretty':
2012			return pretty_table(out)
2013
2014	@make_verbal
2015	def table_of_samples(
2016		self,
2017		dir = 'output',
2018		filename = None,
2019		save_to_file = True,
2020		print_out = True,
2021		output = None,
2022		):
2023		'''
2024		Print out, save to disk and/or return a table of samples.
2025
2026		**Parameters**
2027
2028		+ `dir`: the directory in which to save the csv
2029		+ `filename`: the name of the csv file to write to
2030		+ `save_to_file`: whether to save the csv
2031		+ `print_out`: whether to print out the table
2032		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2033		    if set to `'raw'`: return a list of lists of strings
2034		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2035		'''
2036
2037		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2038		for sample in self.anchors:
2039			out += [[
2040				f"{sample}",
2041				f"{self.samples[sample]['N']}",
2042				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2043				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2044				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2045				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2046				]]
2047		for sample in self.unknowns:
2048			out += [[
2049				f"{sample}",
2050				f"{self.samples[sample]['N']}",
2051				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2052				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2053				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2054				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2055				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2056				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2057				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2058				]]
2059		if save_to_file:
2060			if not os.path.exists(dir):
2061				os.makedirs(dir)
2062			if filename is None:
2063				filename = f'D{self._4x}_samples.csv'
2064			with open(f'{dir}/{filename}', 'w') as fid:
2065				fid.write(make_csv(out))
2066		if print_out:
2067			self.msg('\n'+pretty_table(out))
2068		if output == 'raw':
2069			return out
2070		elif output == 'pretty':
2071			return pretty_table(out)
2072
2073
2074	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2075		'''
2076		Generate session plots and save them to disk.
2077
2078		**Parameters**
2079
2080		+ `dir`: the directory in which to save the plots
2081		+ `figsize`: the width and height (in inches) of each plot
2082		+ `filetype`: 'pdf' or 'png'
2083		+ `dpi`: resolution for PNG output
2084		'''
2085		if not os.path.exists(dir):
2086			os.makedirs(dir)
2087
2088		for session in self.sessions:
2089			sp = self.plot_single_session(session, xylimits = 'constant')
2090			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2091			ppl.close(sp.fig)
2092
2093
2094	@make_verbal
2095	def consolidate_samples(self):
2096		'''
2097		Compile various statistics for each sample.
2098
2099		For each anchor sample:
2100
2101		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2102		+ `SE_D47` or `SE_D48`: set to zero by definition
2103
2104		For each unknown sample:
2105
2106		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2107		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2108
2109		For each anchor and unknown:
2110
2111		+ `N`: the total number of analyses of this sample
2112		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2113		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2114		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2115		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2116		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2117		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2118		'''
2119		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2120		for sample in self.samples:
2121			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2122			if self.samples[sample]['N'] > 1:
2123				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2124
2125			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2126			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2127
2128			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2129			if len(D4x_pop) > 2:
2130				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2131			
2132		if self.standardization_method == 'pooled':
2133			for sample in self.anchors:
2134				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2135				self.samples[sample][f'SE_D{self._4x}'] = 0.
2136			for sample in self.unknowns:
2137				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2138				try:
2139					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2140				except ValueError:
2141					# when `sample` is constrained by self.standardize(constraints = {...}),
2142					# it is no longer listed in self.standardization.var_names.
2143					# Temporary fix: define SE as zero for now
2144					self.samples[sample][f'SE_D{self._4x}'] = 0.
2145
2146		elif self.standardization_method == 'indep_sessions':
2147			for sample in self.anchors:
2148				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2149				self.samples[sample][f'SE_D{self._4x}'] = 0.
2150			for sample in self.unknowns:
2151				self.msg(f'Consolidating sample {sample}')
2152				self.unknowns[sample][f'session_D{self._4x}'] = {}
2153				session_avg = []
2154				for session in self.sessions:
2155					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2156					if sdata:
2157						self.msg(f'{sample} found in session {session}')
2158						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2159						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2160						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2161						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2162						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2163						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2164						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2165				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2166				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2167				wsum = sum([weights[s] for s in weights])
2168				for s in weights:
2169					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2170
2171		for r in self:
2172			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2173
2174
2175
2176	def consolidate_sessions(self):
2177		'''
2178		Compute various statistics for each session.
2179
2180		+ `Na`: Number of anchor analyses in the session
2181		+ `Nu`: Number of unknown analyses in the session
2182		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2183		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2184		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2185		+ `a`: scrambling factor
2186		+ `b`: compositional slope
2187		+ `c`: WG offset
2188		+ `SE_a`: Model standard error of `a`
2189		+ `SE_b`: Model standard error of `b`
2190		+ `SE_c`: Model standard error of `c`
2191		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2192		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2193		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2194		+ `a2`: scrambling factor drift
2195		+ `b2`: compositional slope drift
2196		+ `c2`: WG offset drift
2197		+ `Np`: Number of standardization parameters to fit
2198		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2199		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2200		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
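
		**Example**

		A minimal sketch, assuming `mydata` was standardized with the default
		pooled method, with `Session01` standing in for any session name:

		```py
		s = mydata.sessions['Session01']
		print(s['Na'], s['Nu'], s['a'], s['SE_a'])
		```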
2201		'''
2202		for session in self.sessions:
2203			if 'd13Cwg_VPDB' not in self.sessions[session]:
2204				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2205			if 'd18Owg_VSMOW' not in self.sessions[session]:
2206				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2207			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2208			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2209
2210			self.msg(f'Computing repeatabilities for session {session}')
2211			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2212			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2213			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2214
2215		if self.standardization_method == 'pooled':
2216			for session in self.sessions:
2217
2218				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2219				i = self.standardization.var_names.index(f'a_{pf(session)}')
2220				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2221
2222				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2223				i = self.standardization.var_names.index(f'b_{pf(session)}')
2224				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2225
2226				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2227				i = self.standardization.var_names.index(f'c_{pf(session)}')
2228				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2229
2230				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2231				if self.sessions[session]['scrambling_drift']:
2232					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2233					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2234				else:
2235					self.sessions[session]['SE_a2'] = 0.
2236
2237				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2238				if self.sessions[session]['slope_drift']:
2239					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2240					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2241				else:
2242					self.sessions[session]['SE_b2'] = 0.
2243
2244				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2245				if self.sessions[session]['wg_drift']:
2246					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2247					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2248				else:
2249					self.sessions[session]['SE_c2'] = 0.
2250
2251				i = self.standardization.var_names.index(f'a_{pf(session)}')
2252				j = self.standardization.var_names.index(f'b_{pf(session)}')
2253				k = self.standardization.var_names.index(f'c_{pf(session)}')
2254				CM = np.zeros((6,6))
2255				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2256				try:
2257					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2258					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2259					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2260					try:
2261						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2262						CM[3,4] = self.standardization.covar[i2,j2]
2263						CM[4,3] = self.standardization.covar[j2,i2]
2264					except ValueError:
2265						pass
2266					try:
2267						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2268						CM[3,5] = self.standardization.covar[i2,k2]
2269						CM[5,3] = self.standardization.covar[k2,i2]
2270					except ValueError:
2271						pass
2272				except ValueError:
2273					pass
2274				try:
2275					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2276					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2277					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2278					try:
2279						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2280						CM[4,5] = self.standardization.covar[j2,k2]
2281						CM[5,4] = self.standardization.covar[k2,j2]
2282					except ValueError:
2283						pass
2284				except ValueError:
2285					pass
2286				try:
2287					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2288					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2289					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2290				except ValueError:
2291					pass
2292
2293				self.sessions[session]['CM'] = CM
2294
2295		elif self.standardization_method == 'indep_sessions':
2296			pass # Not implemented yet
2297
2298
2299	@make_verbal
2300	def repeatabilities(self):
2301		'''
2302		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2303		(for all samples, for anchors, and for unknowns).
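
		**Example**

		A minimal sketch, assuming `mydata` is a standardized `D47data` instance:

		```py
		mydata.repeatabilities()
		print(mydata.repeatability['r_D47'])   # pooled SD of Δ47 over all samples
		```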
2304		'''
2305		self.msg('Computing repeatabilities for all sessions')
2306
2307		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2308		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2309		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2310		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2311		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2312
2313
2314	@make_verbal
2315	def consolidate(self, tables = True, plots = True):
2316		'''
2317		Collect information about samples, sessions and repeatabilities.
2318		'''
2319		self.consolidate_samples()
2320		self.consolidate_sessions()
2321		self.repeatabilities()
2322
2323		if tables:
2324			self.summary()
2325			self.table_of_sessions()
2326			self.table_of_analyses()
2327			self.table_of_samples()
2328
2329		if plots:
2330			self.plot_sessions()
2331
2332
2333	@make_verbal
2334	def rmswd(self,
2335		samples = 'all samples',
2336		sessions = 'all sessions',
2337		):
2338		'''
2339		Compute the χ2, the root mean squared weighted deviation
2340		(i.e. the square root of the reduced χ2), and the corresponding degrees
2341		of freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.
2342		
2343		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
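
		**Example**

		A minimal sketch, assuming `mydata` was standardized with
		`method = 'indep_sessions'`:

		```py
		out = mydata.rmswd(samples = 'anchors')
		print(out['rmswd'], out['chisq'], out['Nf'])
		```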
2344		'''
2345		if samples == 'all samples':
2346			mysamples = [k for k in self.samples]
2347		elif samples == 'anchors':
2348			mysamples = [k for k in self.anchors]
2349		elif samples == 'unknowns':
2350			mysamples = [k for k in self.unknowns]
2351		else:
2352			mysamples = samples
2353
2354		if sessions == 'all sessions':
2355			sessions = [k for k in self.sessions]
2356
2357		chisq, Nf = 0, 0
2358		for sample in mysamples :
2359			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2360			if len(G) > 1 :
2361				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2362				Nf += (len(G) - 1)
2363				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2364		r = (chisq / Nf)**.5 if Nf > 0 else 0
2365		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2366		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2367
2368	
2369	@make_verbal
2370	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2371		'''
2372		Compute the repeatability of `[r[key] for r in self]`
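
		**Example**

		A minimal sketch, assuming `mydata` is a standardized `D47data` instance:

		```py
		r = mydata.compute_r('D47', samples = 'anchors')   # pooled SD of anchor Δ47 values
		```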
2373		'''
2374
2375		if samples == 'all samples':
2376			mysamples = [k for k in self.samples]
2377		elif samples == 'anchors':
2378			mysamples = [k for k in self.anchors]
2379		elif samples == 'unknowns':
2380			mysamples = [k for k in self.unknowns]
2381		else:
2382			mysamples = samples
2383
2384		if sessions == 'all sessions':
2385			sessions = [k for k in self.sessions]
2386
2387		if key in ['D47', 'D48']:
2388			# Full disclosure: the definition of Nf is tricky/debatable
2389			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2390			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2391			Nf = len(G)
2392# 			print(f'len(G) = {Nf}')
2393			Nf -= len([s for s in mysamples if s in self.unknowns])
2394# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2395			for session in sessions:
2396				Np = len([
2397					_ for _ in self.standardization.params
2398					if (
2399						self.standardization.params[_].expr is not None
2400						and (
2401							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2402							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2403							)
2404						)
2405					])
2406# 				print(f'session {session}: {Np} parameters to consider')
2407				Na = len({
2408					r['Sample'] for r in self.sessions[session]['data']
2409					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2410					})
2411# 				print(f'session {session}: {Na} different anchors in that session')
2412				Nf -= min(Np, Na)
2413# 			print(f'Nf = {Nf}')
2414
2415# 			for sample in mysamples :
2416# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2417# 				if len(X) > 1 :
2418# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2419# 					if sample in self.unknowns:
2420# 						Nf += len(X) - 1
2421# 					else:
2422# 						Nf += len(X)
2423# 			if samples in ['anchors', 'all samples']:
2424# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2425			r = (chisq / Nf)**.5 if Nf > 0 else 0
2426
2427		else: # if key not in ['D47', 'D48']
2428			chisq, Nf = 0, 0
2429			for sample in mysamples :
2430				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2431				if len(X) > 1 :
2432					Nf += len(X) - 1
2433					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2434			r = (chisq / Nf)**.5 if Nf > 0 else 0
2435
2436		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2437		return r
2438
2439	def sample_average(self, samples, weights = 'equal', normalize = True):
2440		'''
2441		Weighted average Δ4x value of a group of samples, accounting for covariance.
2442
2443		Returns the weighted average Δ4x value and associated SE
2444		of a group of samples. Weights are equal by default. If `normalize` is
2445		true, `weights` will be rescaled so that their sum equals 1.
2446
2447		**Examples**
2448
2449		```python
2450		self.sample_average(['X','Y'], [1, 2])
2451		```
2452
2453		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2454		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2455		values of samples X and Y, respectively.
2456
2457		```python
2458		self.sample_average(['X','Y'], [1, -1], normalize = False)
2459		```
2460
2461		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2462		'''
2463		if weights == 'equal':
2464			weights = [1/len(samples)] * len(samples)
2465
2466		if normalize:
2467			s = sum(weights)
2468			if s:
2469				weights = [w/s for w in weights]
2470
2471		try:
2472# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2473# 			C = self.standardization.covar[indices,:][:,indices]
2474			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2475			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2476			return correlated_sum(X, C, weights)
2477		except ValueError:
2478			return (0., 0.)
2479
2480
2481	def sample_D4x_covar(self, sample1, sample2 = None):
2482		'''
2483		Covariance between Δ4x values of samples
2484
2485		Returns the error covariance between the average Δ4x values of two
2486		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2487		returns the Δ4x variance for that sample.
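
		**Example**

		A minimal sketch, with `X` and `Y` standing in for any two unknown samples:

		```py
		var_X = mydata.sample_D4x_covar('X')         # Δ4x variance of sample X
		cov_XY = mydata.sample_D4x_covar('X', 'Y')   # error covariance between X and Y
		```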
2488		'''
2489		if sample2 is None:
2490			sample2 = sample1
2491		if self.standardization_method == 'pooled':
2492			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2493			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2494			return self.standardization.covar[i, j]
2495		elif self.standardization_method == 'indep_sessions':
2496			if sample1 == sample2:
2497				return self.samples[sample1][f'SE_D{self._4x}']**2
2498			else:
2499				c = 0
2500				for session in self.sessions:
2501					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2502					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2503					if sdata1 and sdata2:
2504						a = self.sessions[session]['a']
2505						# !! TODO: CM below does not account for temporal changes in standardization parameters
2506						CM = self.sessions[session]['CM'][:3,:3]
2507						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2508						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2509						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2510						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2511						c += (
2512							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2513							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2514							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2515							@ CM
2516							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2517							) / a**2
2518				return float(c)
2519
2520	def sample_D4x_correl(self, sample1, sample2 = None):
2521		'''
2522		Correlation between Δ4x errors of samples
2523
2524		Returns the error correlation between the average Δ4x values of two samples.
2525		'''
2526		if sample2 is None or sample2 == sample1:
2527			return 1.
2528		return (
2529			self.sample_D4x_covar(sample1, sample2)
2530			/ self.unknowns[sample1][f'SE_D{self._4x}']
2531			/ self.unknowns[sample2][f'SE_D{self._4x}']
2532			)
2533
2534	def plot_single_session(self,
2535		session,
2536		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2537		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2538		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2539		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2540		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2541		xylimits = 'free', # | 'constant'
2542		x_label = None,
2543		y_label = None,
2544		error_contour_interval = 'auto',
2545		fig = 'new',
2546		):
2547		'''
2548		Generate plot for a single session
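
		**Example**

		A minimal sketch, with `Session01` standing in for any session name
		in the data set:

		```py
		sp = mydata.plot_single_session('Session01', xylimits = 'constant')
		sp.fig.savefig('Session01.pdf')
		```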
2549		'''
2550		if x_label is None:
2551			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2552		if y_label is None:
2553			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2554
2555		out = _SessionPlot()
2556		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2557		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2558		
2559		if fig == 'new':
2560			out.fig = ppl.figure(figsize = (6,6))
2561			ppl.subplots_adjust(.1,.1,.9,.9)
2562
2563		out.anchor_analyses, = ppl.plot(
2564			[r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
2565			[r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
2566			**kw_plot_anchors)
2567		out.unknown_analyses, = ppl.plot(
2568			[r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
2569			[r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
2570			**kw_plot_unknowns)
2571		out.anchor_avg = ppl.plot(
2572			np.array([ np.array([
2573				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2574				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2575				]) for sample in anchors]).T,
2576			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T,
2577			**kw_plot_anchor_avg)
2578		out.unknown_avg = ppl.plot(
2579			np.array([ np.array([
2580				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2581				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2582				]) for sample in unknowns]).T,
2583			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T,
2584			**kw_plot_unknown_avg)
2585		if xylimits == 'constant':
2586			x = [r[f'd{self._4x}'] for r in self]
2587			y = [r[f'D{self._4x}'] for r in self]
2588			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2589			w, h = x2-x1, y2-y1
2590			x1 -= w/20
2591			x2 += w/20
2592			y1 -= h/20
2593			y2 += h/20
2594			ppl.axis([x1, x2, y1, y2])
2595		elif xylimits == 'free':
2596			x1, x2, y1, y2 = ppl.axis()
2597		else:
2598			x1, x2, y1, y2 = ppl.axis(xylimits)
2599				
2600		if error_contour_interval != 'none':
2601			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2602			XI,YI = np.meshgrid(xi, yi)
2603			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2604			if error_contour_interval == 'auto':
2605				rng = np.max(SI) - np.min(SI)
2606				if rng <= 0.01:
2607					cinterval = 0.001
2608				elif rng <= 0.03:
2609					cinterval = 0.004
2610				elif rng <= 0.1:
2611					cinterval = 0.01
2612				elif rng <= 0.3:
2613					cinterval = 0.03
2614				elif rng <= 1.:
2615					cinterval = 0.1
2616				else:
2617					cinterval = 0.5
2618			else:
2619				cinterval = error_contour_interval
2620
2621			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2622			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2623			out.clabel = ppl.clabel(out.contour)
2624
2625		ppl.xlabel(x_label)
2626		ppl.ylabel(y_label)
2627		ppl.title(session, weight = 'bold')
2628		ppl.grid(alpha = .2)
2629		out.ax = ppl.gca()		
2630
2631		return out
2632
2633	def plot_residuals(
2634		self,
2635		kde = False,
2636		hist = False,
2637		binwidth = 2/3,
2638		dir = 'output',
2639		filename = None,
2640		highlight = [],
2641		colors = None,
2642		figsize = None,
2643		dpi = 100,
2644		yspan = None,
2645		):
2646		'''
2647		Plot residuals of each analysis as a function of time (actually, as a function of
2648		the order of analyses in the `D4xdata` object)
2649
2650		+ `kde`: whether to add a kernel density estimate of residuals
2651		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2652		+ `binwidth`: the width of the histogram bins, as a fraction of the Δ4x repeatability (SD)
2653		+ `dir`: the directory in which to save the plot
2654		+ `highlight`: a list of samples to highlight
2655		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2656		+ `figsize`: (width, height) of figure
2657		+ `dpi`: resolution for PNG output
2658		+ `yspan`: factor controlling the range of y values shown in plot
2659		  (by default: `yspan = 1.5 if kde else 1.0`)
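
		**Example**

		A minimal sketch, assuming `mydata` is a standardized `D47data` instance:

		```py
		mydata.plot_residuals(filename = 'residuals.pdf', kde = True)
		# saves ./output/residuals.pdf
		```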
2660		'''
2661		
2662		from matplotlib import ticker
2663
2664		if yspan is None:
2665			if kde:
2666				yspan = 1.5
2667			else:
2668				yspan = 1.0
2669		
2670		# Layout
2671		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2672		if hist or kde:
2673			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2674			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2675		else:
2676			ppl.subplots_adjust(.08,.05,.78,.8)
2677			ax1 = ppl.subplot(111)
2678		
2679		# Colors
2680		N = len(self.anchors)
2681		if colors is None:
2682			if len(highlight) > 0:
2683				Nh = len(highlight)
2684				if Nh == 1:
2685					colors = {highlight[0]: (0,0,0)}
2686				elif Nh == 3:
2687					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2688				elif Nh == 4:
2689					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2690				else:
2691					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2692			else:
2693				if N == 3:
2694					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2695				elif N == 4:
2696					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2697				else:
2698					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2699
2700		ppl.sca(ax1)
2701		
2702		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2703
2704		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2705
2706		session = self[0]['Session']
2707		x1 = 0
2708# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2709		x_sessions = {}
2710		one_or_more_singlets = False
2711		one_or_more_multiplets = False
2712		multiplets = set()
2713		for k,r in enumerate(self):
2714			if r['Session'] != session:
2715				x2 = k-1
2716				x_sessions[session] = (x1+x2)/2
2717				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2718				session = r['Session']
2719				x1 = k
2720			singlet = len(self.samples[r['Sample']]['data']) == 1
2721			if not singlet:
2722				multiplets.add(r['Sample'])
2723			if r['Sample'] in self.unknowns:
2724				if singlet:
2725					one_or_more_singlets = True
2726				else:
2727					one_or_more_multiplets = True
2728			kw = dict(
2729				marker = 'x' if singlet else '+',
2730				ms = 4 if singlet else 5,
2731				ls = 'None',
2732				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2733				mew = 1,
2734				alpha = 0.2 if singlet else 1,
2735				)
2736			if highlight and r['Sample'] not in highlight:
2737				kw['alpha'] = 0.2
2738			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2739		x2 = k
2740		x_sessions[session] = (x1+x2)/2
2741
2742		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2743		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2744		if not (hist or kde):
2745			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2746			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2747
2748		xmin, xmax, ymin, ymax = ppl.axis()
2749		if yspan != 1:
2750			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2751		for s in x_sessions:
2752			ppl.text(
2753				x_sessions[s],
2754				ymax +1,
2755				s,
2756				va = 'bottom',
2757				**(
2758					dict(ha = 'center')
2759					if len(self.sessions[s]['data']) > (0.15 * len(self))
2760					else dict(ha = 'left', rotation = 45)
2761					)
2762				)
2763
2764		if hist or kde:
2765			ppl.sca(ax2)
2766
2767		for s in colors:
2768			kw['marker'] = '+'
2769			kw['ms'] = 5
2770			kw['mec'] = colors[s]
2771			kw['label'] = s
2772			kw['alpha'] = 1
2773			ppl.plot([], [], **kw)
2774
2775		kw['mec'] = (0,0,0)
2776
2777		if one_or_more_singlets:
2778			kw['marker'] = 'x'
2779			kw['ms'] = 4
2780			kw['alpha'] = .2
2781			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2782			ppl.plot([], [], **kw)
2783
2784		if one_or_more_multiplets:
2785			kw['marker'] = '+'
2786			kw['ms'] = 4
2787			kw['alpha'] = 1
2788			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2789			ppl.plot([], [], **kw)
2790
2791		if hist or kde:
2792			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2793		else:
2794			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2795		leg.set_zorder(-1000)
2796
2797		ppl.sca(ax1)
2798
2799		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2800		ppl.xticks([])
2801		ppl.axis([-1, len(self), None, None])
2802
2803		if hist or kde:
2804			ppl.sca(ax2)
2805			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2806
2807			if kde:
2808				from scipy.stats import gaussian_kde
2809				yi = np.linspace(ymin, ymax, 201)
2810				xi = gaussian_kde(X).evaluate(yi)
2811				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2812# 				ppl.plot(xi, yi, 'k-', lw = 1)
2813			elif hist:
2814				ppl.hist(
2815					X,
2816					orientation = 'horizontal',
2817					histtype = 'stepfilled',
2818					ec = [.4]*3,
2819					fc = [.25]*3,
2820					alpha = .25,
2821					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2822					)
2823			ppl.text(0, 0,
2824				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2825				size = 7.5,
2826				alpha = 1,
2827				va = 'center',
2828				ha = 'left',
2829				)
2830
2831			ppl.axis([0, None, ymin, ymax])
2832			ppl.xticks([])
2833			ppl.yticks([])
2834# 			ax2.spines['left'].set_visible(False)
2835			ax2.spines['right'].set_visible(False)
2836			ax2.spines['top'].set_visible(False)
2837			ax2.spines['bottom'].set_visible(False)
2838
2839		ax1.axis([None, None, ymin, ymax])
2840
2841		if not os.path.exists(dir):
2842			os.makedirs(dir)
2843		if filename is None:
2844			return fig
2845		elif filename == '':
2846			filename = f'D{self._4x}_residuals.pdf'
2847		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2848		ppl.close(fig)
2849				
2850
2851	def simulate(self, *args, **kwargs):
2852		'''
2853		Legacy function: raises a `DeprecationWarning` pointing to `virtual_data()`.
2854		'''
2855		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2856
2857	def plot_distribution_of_analyses(
2858		self,
2859		dir = 'output',
2860		filename = None,
2861		vs_time = False,
2862		figsize = (6,4),
2863		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2864		output = None,
2865		dpi = 100,
2866		):
2867		'''
2868		Plot temporal distribution of all analyses in the data set.
2869		
2870		**Parameters**
2871
2872		+ `dir`: the directory in which to save the plot
2873		+ `filename`: the name of the file to save the plot to (by default: `D4x_distribution_of_analyses.pdf`)
2874		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2875		+ `figsize`: (width, height) of figure
2876		+ `dpi`: resolution for PNG output
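
		**Example**

		A minimal sketch, assuming each analysis in `mydata` defines a `TimeTag` field:

		```py
		mydata.plot_distribution_of_analyses(vs_time = True)
		```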
2877		'''
2878
2879		asamples = [s for s in self.anchors]
2880		usamples = [s for s in self.unknowns]
2881		if output is None or output == 'fig':
2882			fig = ppl.figure(figsize = figsize)
2883			ppl.subplots_adjust(*subplots_adjust)
2884		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2885		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2886		Xmax += (Xmax-Xmin)/40
2887		Xmin -= (Xmax-Xmin)/41
2888		for k, s in enumerate(asamples + usamples):
2889			if vs_time:
2890				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2891			else:
2892				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2893			Y = [-k for x in X]
2894			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2895			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2896			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2897		ppl.axis([Xmin, Xmax, -k-1, 1])
2898		ppl.xlabel('\ntime')
2899		ppl.gca().annotate('',
2900			xy = (0.6, -0.02),
2901			xycoords = 'axes fraction',
2902			xytext = (.4, -0.02),
2903			arrowprops = dict(arrowstyle = "->", color = 'k'),
2904			)
2905			
2906
2907		x2 = -1
2908		for session in self.sessions:
2909			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2910			if vs_time:
2911				ppl.axvline(x1, color = 'k', lw = .75)
2912			if x2 > -1:
2913				if not vs_time:
2914					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2915			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2916# 			from xlrd import xldate_as_datetime
2917# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2918			if vs_time:
2919				ppl.axvline(x2, color = 'k', lw = .75)
2920				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2921			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2922
2923		ppl.xticks([])
2924		ppl.yticks([])
2925
2926		if output is None:
2927			if not os.path.exists(dir):
2928				os.makedirs(dir)
2929			if filename is None:
2930				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2931			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2932			ppl.close(fig)
2933		elif output == 'ax':
2934			return ppl.gca()
2935		elif output == 'fig':
2936			return fig
2937
2938
2939	def plot_bulk_compositions(
2940		self,
2941		samples = None,
2942		dir = 'output/bulk_compositions',
2943		figsize = (6,6),
2944		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
2945		show = False,
2946		sample_color = (0,.5,1),
2947		analysis_color = (.7,.7,.7),
2948		labeldist = 0.3,
2949		radius = 0.05,
2950		):
2951		'''
2952		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
2953		
2954		By default, creates a directory `./output/bulk_compositions` where plots for
2955		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
2956		
2957		
2958		**Parameters**
2959
2960		+ `samples`: Only these samples are processed (by default: all samples).
2961		+ `dir`: where to save the plots
2962		+ `figsize`: (width, height) of figure
2963		+ `subplots_adjust`: passed to `subplots_adjust()`
2964		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
2965		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
2966		+ `sample_color`: color used for sample markers/labels
2967		+ `analysis_color`: color used for analysis (replicate) markers/labels
2968		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
2969		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
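
		**Example**

		A minimal sketch, with sample names standing in for any samples in the data set:

		```py
		mydata.plot_bulk_compositions(samples = ['MYSAMPLE-1', 'MYSAMPLE-2'])
		```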
2970		'''
2971
2972		from matplotlib.patches import Ellipse
2973
2974		if samples is None:
2975			samples = [_ for _ in self.samples]
2976
2977		saved = {}
2978
2979		for s in samples:
2980
2981			fig = ppl.figure(figsize = figsize)
2982			fig.subplots_adjust(*subplots_adjust)
2983			ax = ppl.subplot(111)
2984			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
2985			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
2986			ppl.title(s)
2987
2988
2989			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
2990			UID = [_['UID'] for _ in self.samples[s]['data']]
2991			XY0 = XY.mean(0)
2992
2993			for xy in XY:
2994				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
2995				
2996			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
2997			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
2998			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
2999			saved[s] = [XY, XY0]
3000			
3001			x1, x2, y1, y2 = ppl.axis()
3002			x0, dx = (x1+x2)/2, (x2-x1)/2
3003			y0, dy = (y1+y2)/2, (y2-y1)/2
3004			dx, dy = [max(max(dx, dy), radius)]*2
3005
3006			ppl.axis([
3007				x0 - 1.2*dx,
3008				x0 + 1.2*dx,
3009				y0 - 1.2*dy,
3010				y0 + 1.2*dy,
3011				])			
3012
3013			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3014
3015			for xy, uid in zip(XY, UID):
3016
3017				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3018				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3019
3020				if (vector_in_display_space**2).sum() > 0:
3021
3022					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3023					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3024					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3025					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3026
3027					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3028
3029				else:
3030
3031					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3032
3033			if radius:
3034				ax.add_artist(Ellipse(
3035					xy = XY0,
3036					width = radius*2,
3037					height = radius*2,
3038					ls = (0, (2,2)),
3039					lw = .7,
3040					ec = analysis_color,
3041					fc = 'None',
3042					))
3043				ppl.text(
3044					XY0[0],
3045					XY0[1]-radius,
3046					f'\n± {radius*1e3:.0f} ppm',
3047					color = analysis_color,
3048					va = 'top',
3049					ha = 'center',
3050					linespacing = 0.4,
3051					size = 8,
3052					)
3053
3054			if not os.path.exists(dir):
3055				os.makedirs(dir)
3056			fig.savefig(f'{dir}/{s}.pdf')
3057			ppl.close(fig)
3058
3059		fig = ppl.figure(figsize = figsize)
3060		fig.subplots_adjust(*subplots_adjust)
3061		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3062		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3063
3064		for s in saved:
3065			for xy in saved[s][0]:
3066				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3067			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3068			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3069			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3070
3071		x1, x2, y1, y2 = ppl.axis()
3072		ppl.axis([
3073			x1 - (x2-x1)/10,
3074			x2 + (x2-x1)/10,
3075			y1 - (y2-y1)/10,
3076			y2 + (y2-y1)/10,
3077			])			
3078
3079
3080		if not os.path.exists(dir):
3081			os.makedirs(dir)
3082		fig.savefig(f'{dir}/__all__.pdf')
3083		if show:
3084			ppl.show()
3085		ppl.close(fig)
3086		
3087
3088
3089class D47data(D4xdata):
3090	'''
3091	Store and process data for a large set of Δ47 analyses,
3092	usually comprising more than one analytical session.
3093	'''
3094
3095	Nominal_D4x = {
3096		'ETH-1':   0.2052,
3097		'ETH-2':   0.2085,
3098		'ETH-3':   0.6132,
3099		'ETH-4':   0.4511,
3100		'IAEA-C1': 0.3018,
3101		'IAEA-C2': 0.6409,
3102		'MERCK':   0.5135,
3103		} # I-CDES (Bernasconi et al., 2021)
3104	'''
3105	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3106	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3107	reference frame.
3108
3109	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3110	```py
3111	{
3112		'ETH-1'   : 0.2052,
3113		'ETH-2'   : 0.2085,
3114		'ETH-3'   : 0.6132,
3115		'ETH-4'   : 0.4511,
3116		'IAEA-C1' : 0.3018,
3117		'IAEA-C2' : 0.6409,
3118		'MERCK'   : 0.5135,
3119	}
3120	```
3121	'''
3122
3123
3124	@property
3125	def Nominal_D47(self):
3126		return self.Nominal_D4x
3127	
3128
3129	@Nominal_D47.setter
3130	def Nominal_D47(self, new):
3131		self.Nominal_D4x = dict(**new)
3132		self.refresh()
3133
3134
3135	def __init__(self, l = [], **kwargs):
3136		'''
3137		**Parameters:** same as `D4xdata.__init__()`
3138		'''
3139		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3140
3141
3142	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3143		'''
3144		Find all samples for which `Teq` is specified, compute the equilibrium Δ47
3145		value for that temperature, and treat these samples as additional anchors.
3146
3147		**Parameters**
3148
3149		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3150		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3151		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3152		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3153		if `new`: keep pre-existing anchors but update them in case of conflict
3154		between old and new Δ47 values;
3155		if `old`: keep pre-existing anchors but preserve their original Δ47
3156		values in case of conflict.
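
		**Example**

		A minimal sketch, assuming some analyses in `mydata` carry a `Teq` field
		(equilibration temperature in degrees C):

		```py
		mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
		# the corresponding samples are now treated as additional anchors
		```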
3157		'''
3158		f = {
3159			'petersen': fCO2eqD47_Petersen,
3160			'wang': fCO2eqD47_Wang,
3161			}[fCo2eqD47]
3162		foo = {}
3163		for r in self:
3164			if 'Teq' in r:
3165				if r['Sample'] in foo:
3166					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3167				else:
3168					foo[r['Sample']] = f(r['Teq'])
3169			else:
3170					assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3171
3172		if priority == 'replace':
3173			self.Nominal_D47 = {}
3174		for s in foo:
3175			if priority != 'old' or s not in self.Nominal_D47:
3176				self.Nominal_D47[s] = foo[s]
3177	
3178
3179
3180
3181class D48data(D4xdata):
3182	'''
3183	Store and process data for a large set of Δ48 analyses,
3184	usually comprising more than one analytical session.
3185	'''
3186
3187	Nominal_D4x = {
3188		'ETH-1':  0.138,
3189		'ETH-2':  0.138,
3190		'ETH-3':  0.270,
3191		'ETH-4':  0.223,
3192		'GU-1':  -0.419,
3193		} # (Fiebig et al., 2019, 2021)
3194	'''
3195	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3196	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3197	reference frame.
3198
3199	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3200	Fiebig et al. (2021)):
3201
3202	```py
3203	{
3204		'ETH-1' :  0.138,
3205		'ETH-2' :  0.138,
3206		'ETH-3' :  0.270,
3207		'ETH-4' :  0.223,
3208		'GU-1'  : -0.419,
3209	}
3210	```
3211	'''
3212
3213
3214	@property
3215	def Nominal_D48(self):
3216		return self.Nominal_D4x
3217
3218	
3219	@Nominal_D48.setter
3220	def Nominal_D48(self, new):
3221		self.Nominal_D4x = dict(**new)
3222		self.refresh()
3223
3224
3225	def __init__(self, l = [], **kwargs):
3226		'''
3227		**Parameters:** same as `D4xdata.__init__()`
3228		'''
3229		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3230
3231
3232
3233class _SessionPlot():
3234	'''
3235	Simple placeholder class
3236	'''
3237	def __init__(self):
3238		pass
3239
3240_app = typer.Typer(
3241	add_completion = False,
3242	context_settings={'help_option_names': ['-h', '--help']},
3243	rich_markup_mode = 'rich',
3244	)
3245
3246@_app.command()
3247def _cli(
3248	rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")],
3249	exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none',
3250	anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none',
3251	output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output',
3252	):
3253	"""
3254	Process raw Δ47 data and save standardized results (tables and plots) to the output directory.
3255	"""
3256
3257	data = D47data()
3258	data.read(rawdata)
3259
3260	if exclude != 'none':
3261		exclude = read_csv(exclude)
3262		exclude_uid = {r['UID'] for r in exclude if 'UID' in r}
3263		exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r}
3264	else:
3265		exclude_uid = []
3266		exclude_sample = []
3267	
3268	data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3269
3270	if anchors != 'none':
3271		anchors = read_csv(anchors)
3272		data.Nominal_d13C_VPDB = {
3273			_['Sample']: _['d13C_VPDB']
3274			for _ in anchors
3275			if 'd13C_VPDB' in _
3276			}
3277		data.Nominal_d18O_VPDB = {
3278			_['Sample']: _['d18O_VPDB']
3279			for _ in anchors
3280			if 'd18O_VPDB' in _
3281			}
3282		data.Nominal_D4x = {
3283			_['Sample']: _['D47']
3284			for _ in anchors
3285			if 'D47' in _
3286			}
3287
3288	data.refresh()
3289	data.wg()
3290	data.crunch()
3291	data.standardize()
3292	data.summary(dir = output_dir)
3293	data.table_of_samples(dir = output_dir)
3294	data.table_of_sessions(dir = output_dir)
3295	data.plot_sessions(dir = output_dir)
3296	data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True)
3297	data.table_of_analyses(dir = output_dir)
3298	data.plot_distribution_of_analyses(dir = output_dir)
3299	data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions')
3300
3301def __cli():
3302	_app()
def fCO2eqD47_Petersen(T):
66def fCO2eqD47_Petersen(T):
67	'''
68	CO2 equilibrium Δ47 value as a function of T (in degrees C)
69	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
70
71	'''
72	return float(_fCO2eqD47_Petersen(T))

def fCO2eqD47_Wang(T):
77def fCO2eqD47_Wang(T):
78	'''
79	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
80	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
81	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
82	'''
83	return float(_fCO2eqD47_Wang(T))

def correlated_sum(X, C, w=None):
 86def correlated_sum(X, C, w = None):
 87	'''
 88	Compute covariance-aware linear combinations
 89
 90	**Parameters**
 91	
 92	+ `X`: list or 1-D array of values to sum
 93	+ `C`: covariance matrix for the elements of `X`
 94	+ `w`: list or 1-D array of weights to apply to the elements of `X`
 95	       (all equal to 1 by default)
 96
 97	Return the sum (and its SE) of the elements of `X`, with optional weights equal
 98	to the elements of `w`, accounting for covariances between the elements of `X`.
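
	**Example**

	A minimal numerical sketch:

	```py
	X = [1.0, 2.0]
	C = [[0.04, 0.02], [0.02, 0.09]]
	print(correlated_sum(X, C))
	# sum = 3.0; SE = (0.04 + 0.09 + 2 * 0.02)**.5 ≈ 0.412
	```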
 99	'''
100	if w is None:
101		w = [1 for x in X]
102	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5

def make_csv(x, hsep=',', vsep='\n'):
105def make_csv(x, hsep = ',', vsep = '\n'):
106	'''
107	Formats a list of lists of strings as a CSV
108
109	**Parameters**
110
111	+ `x`: the list of lists of strings to format
112	+ `hsep`: the field separator (`,` by default)
113	+ `vsep`: the line-ending convention to use (`\\n` by default)
114
115	**Example**
116
117	```py
118	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
119	```
120
121	outputs:
122
123	```py
124	a,b,c
125	d,e,f
126	```
127	'''
128	return vsep.join([hsep.join(l) for l in x])

def pf(txt):
131def pf(txt):
132	'''
133	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
134	'''
135	return txt.replace('-','_').replace('.','_').replace(' ','_')

def smart_type(x):
138def smart_type(x):
139	'''
140	Tries to convert string `x` to a float if it includes a decimal point, or
141	to an integer if it does not. If the conversion fails, return the original
142	string unchanged.
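
	**Example**

	A few illustrative conversions:

	```py
	smart_type('5')     # yields the integer 5
	smart_type('5.0')   # yields the float 5.0
	smart_type('foo')   # yields the unchanged string 'foo'
	```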
143	'''
144	try:
145		y = float(x)
146	except ValueError:
147		return x
148	if '.' not in x:
149		return int(y)
150	return y

def pretty_table(x, header=1, hsep='  ', vsep='–', align='<'):
153def pretty_table(x, header = 1, hsep = '  ', vsep = '–', align = '<'):
154	'''
155	Reads a list of lists of strings and outputs an ascii table
156
157	**Parameters**
158
159	+ `x`: a list of lists of strings
160	+ `header`: the number of lines to treat as header lines
161	+ `hsep`: the horizontal separator between columns
162	+ `vsep`: the character to use as vertical separator
163	+ `align`: string of left (`<`) or right (`>`) alignment characters.
164
165	**Example**
166
167	```py
168	x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
169	print(pretty_table(x))
170	```
171	yields:	
172	```
173	––  ––––––  –––
174	A        B    C
175	––  ––––––  –––
176	1   1.9999  foo
177	10       x  bar
178	––  ––––––  –––
179	```
180	
181	'''
182	txt = []
183	widths = [np.max([len(e) for e in c]) for c in zip(*x)]
184
185	if len(widths) > len(align):
186		align += '>' * (len(widths)-len(align))
187	sepline = hsep.join([vsep*w for w in widths])
188	txt += [sepline]
189	for k,l in enumerate(x):
190		if k and k == header:
191			txt += [sepline]
192		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
193	txt += [sepline]
194	txt += ['']
195	return '\n'.join(txt)

def transpose_table(x):
198def transpose_table(x):
199	'''
200	Transpose a list of lists
201
202	**Parameters**
203
204	+ `x`: a list of lists
205
206	**Example**
207
208	```py
209	x = [[1, 2], [3, 4]]
210	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
211	```
212	'''
213	return [[e for e in c] for c in zip(*x)]

def w_avg(X, sX):
216def w_avg(X, sX) :
217	'''
218	Compute variance-weighted average
219
220	Returns the value and SE of the weighted average of the elements of `X`,
221	with relative weights equal to their inverse variances (`1/sX**2`).
222
223	**Parameters**
224
225	+ `X`: array-like of elements to average
226	+ `sX`: array-like of the corresponding SE values
227
228	**Tip**
229
230	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
231	they may be rearranged using `zip()`:
232
233	```python
234	foo = [(0, 1), (1, 0.5), (2, 0.5)]
235	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
236	```
237	'''
238	X = [ x for x in X ]
239	sX = [ sx for sx in sX ]
240	W = [ sx**-2 for sx in sX ]
241	W = [ w/sum(W) for w in W ]
242	Xavg = sum([ w*x for w,x in zip(W,X) ])
243	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
244	return Xavg, sXavg

def read_csv(filename, sep=''):
247def read_csv(filename, sep = ''):
248	'''
249	Read contents of `filename` in csv format and return a list of dictionaries.
250
251	In the csv string, spaces before and after field separators (`','` by default)
252	are optional.
253
254	**Parameters**
255
256	+ `filename`: the csv file to read
257	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
258	whichever appears most often in the contents of `filename`.
259	'''
260	with open(filename) as fid:
261		txt = fid.read()
262
263	if sep == '':
264		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
265	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
266	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]

def simulate_single_analysis( sample='MYSAMPLE', d13Cwg_VPDB=-4.0, d18Owg_VSMOW=26.0, d13C_VPDB=None, d18O_VPDB=None, D47=None, D48=None, D49=0.0, D17O=0.0, a47=1.0, b47=0.0, c47=-0.9, a48=1.0, b48=0.0, c48=-0.45, Nominal_D47=None, Nominal_D48=None, Nominal_d13C_VPDB=None, Nominal_d18O_VPDB=None, ALPHA_18O_ACID_REACTION=None, R13_VPDB=None, R17_VSMOW=None, R18_VSMOW=None, LAMBDA_17=None, R18_VPDB=None):
269def simulate_single_analysis(
270	sample = 'MYSAMPLE',
271	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
272	d13C_VPDB = None, d18O_VPDB = None,
273	D47 = None, D48 = None, D49 = 0., D17O = 0.,
274	a47 = 1., b47 = 0., c47 = -0.9,
275	a48 = 1., b48 = 0., c48 = -0.45,
276	Nominal_D47 = None,
277	Nominal_D48 = None,
278	Nominal_d13C_VPDB = None,
279	Nominal_d18O_VPDB = None,
280	ALPHA_18O_ACID_REACTION = None,
281	R13_VPDB = None,
282	R17_VSMOW = None,
283	R18_VSMOW = None,
284	LAMBDA_17 = None,
285	R18_VPDB = None,
286	):
287	'''
288	Compute working-gas delta values for a single analysis, assuming a stochastic working
289	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).
290	
291	**Parameters**
292
293	+ `sample`: sample name
294	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
295		(respectively –4 and +26 ‰ by default)
296	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
297	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
298		of the carbonate sample
299	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and
300		Δ48 values if `D47` or `D48` are not specified
301	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
302		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
303	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
304	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
305		correction parameters (by default equal to the `D4xdata` default values)
306	
307	Returns a dictionary with fields
308	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
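
	**Example**

	A minimal sketch simulating one analysis of ETH-3 with the default
	nominal values:

	```py
	rawdata = simulate_single_analysis(sample = 'ETH-3')
	print(rawdata['d45'], rawdata['d47'])
	```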
309	'''
310
311	if Nominal_d13C_VPDB is None:
312		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB
313
314	if Nominal_d18O_VPDB is None:
315		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB
316
317	if ALPHA_18O_ACID_REACTION is None:
318		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION
319
320	if R13_VPDB is None:
321		R13_VPDB = D4xdata().R13_VPDB
322
323	if R17_VSMOW is None:
324		R17_VSMOW = D4xdata().R17_VSMOW
325
326	if R18_VSMOW is None:
327		R18_VSMOW = D4xdata().R18_VSMOW
328
329	if LAMBDA_17 is None:
330		LAMBDA_17 = D4xdata().LAMBDA_17
331
332	if R18_VPDB is None:
333		R18_VPDB = D4xdata().R18_VPDB
334	
335	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17
336	
337	if Nominal_D47 is None:
338		Nominal_D47 = D47data().Nominal_D47
339
340	if Nominal_D48 is None:
341		Nominal_D48 = D48data().Nominal_D48
342	
343	if d13C_VPDB is None:
344		if sample in Nominal_d13C_VPDB:
345			d13C_VPDB = Nominal_d13C_VPDB[sample]
346		else:
347			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")
348
349	if d18O_VPDB is None:
350		if sample in Nominal_d18O_VPDB:
351			d18O_VPDB = Nominal_d18O_VPDB[sample]
352		else:
353			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")
354
355	if D47 is None:
356		if sample in Nominal_D47:
357			D47 = Nominal_D47[sample]
358		else:
359			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")
360
361	if D48 is None:
362		if sample in Nominal_D48:
363			D48 = Nominal_D48[sample]
364		else:
365			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")
366
367	X = D4xdata()
368	X.R13_VPDB = R13_VPDB
369	X.R17_VSMOW = R17_VSMOW
370	X.R18_VSMOW = R18_VSMOW
371	X.LAMBDA_17 = LAMBDA_17
372	X.R18_VPDB = R18_VPDB
373	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17
374
375	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
376		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
377		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
378		)
379	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
380		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
381		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
382		D17O=D17O, D47=D47, D48=D48, D49=D49,
383		)
384	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
385		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
386		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
387		D17O=D17O,
388		)
389	
390	d45 = 1000 * (R45/R45wg - 1)
391	d46 = 1000 * (R46/R46wg - 1)
392	d47 = 1000 * (R47/R47wg - 1)
393	d48 = 1000 * (R48/R48wg - 1)
394	d49 = 1000 * (R49/R49wg - 1)
395
396	for k in range(3): # dumb iteration to adjust for small changes in d47
397		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
398		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch	
399		d47 = 1000 * (R47raw/R47wg - 1)
400		d48 = 1000 * (R48raw/R48wg - 1)
401
402	return dict(
403		Sample = sample,
404		D17O = D17O,
405		d13Cwg_VPDB = d13Cwg_VPDB,
406		d18Owg_VSMOW = d18Owg_VSMOW,
407		d45 = d45,
408		d46 = d46,
409		d47 = d47,
410		d48 = d48,
411		d49 = d49,
412		)

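For instance, here is a minimal, illustrative sketch (`MYSAMPLE-1` and its compositions are made up): an anchor already listed in the default `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`, `Nominal_D47` and `Nominal_D48` dictionaries only requires a sample name, whereas an unknown must specify its bulk and clumped compositions explicitly:

from D47crunch import simulate_single_analysis

# ETH-1 is an anchor: its compositions are looked up in the Nominal_* dictionaries
anchor = simulate_single_analysis('ETH-1')

# hypothetical unknown: bulk and clumped compositions must be provided
unknown = simulate_single_analysis('MYSAMPLE-1',
    d13C_VPDB = -5.0, d18O_VPDB = -10.0,
    D47 = 0.3, D48 = 0.15)

print(anchor['d47'], unknown['d47'])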

def virtual_data( samples=[], a47=1.0, b47=0.0, c47=-0.9, a48=1.0, b48=0.0, c48=-0.45, rd45=0.02, rd46=0.06, rD47=0.015, rD48=0.045, d13Cwg_VPDB=None, d18Owg_VSMOW=None, session=None, Nominal_D47=None, Nominal_D48=None, Nominal_d13C_VPDB=None, Nominal_d18O_VPDB=None, ALPHA_18O_ACID_REACTION=None, R13_VPDB=None, R17_VSMOW=None, R18_VSMOW=None, LAMBDA_17=None, R18_VPDB=None, seed=0, shuffle=True):
def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return list with simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` default)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` defaults)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this function to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

	if session is not None:
		for r in out:
			r['Session'] = session

	if shuffle:
		rng.shuffle(out) # shuffle with the seeded generator, so that a non-zero `seed` also makes the output order repeatable

	return out


Here is an example of using this function to generate an arbitrary combination of anchors and unknowns for a bunch of sessions:

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)

This should output something like:

[table_of_sessions] 
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––  ––––––––––––––
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47         a ± SE   1e3 x b ± SE          c ± SE
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––  ––––––––––––––
Session_01   9   6       -4.000        26.000  0.0205  0.0633  0.0091  1.015 ± 0.015  0.427 ± 0.232  -0.909 ± 0.006
Session_02   9   6       -4.000        26.000  0.0210  0.0882  0.0100  0.990 ± 0.015  0.484 ± 0.232  -0.905 ± 0.006
Session_03   9   6       -4.000        26.000  0.0186  0.0505  0.0111  0.997 ± 0.015  0.167 ± 0.233  -0.901 ± 0.006
Session_04   9   6       -4.000        26.000  0.0192  0.0467  0.0086  1.017 ± 0.015  0.229 ± 0.232  -0.910 ± 0.006
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––  ––––––––––––––

[table_of_samples] 
––––––  ––  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample   N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
––––––  ––  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1   12       2.02       37.01  0.2052                    0.0083          
ETH-2   12     -10.17       19.88  0.2085                    0.0090          
ETH-3   12       1.71       37.46  0.6132                    0.0083          
BAR     12     -15.02       37.22  0.6057  0.0042  ± 0.0085  0.0088     0.753
FOO     12      -5.00       28.89  0.3024  0.0031  ± 0.0062  0.0070     0.497
––––––  ––  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––

[table_of_analyses] 
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––
1    Session_01   ETH-1       -4.000        26.000   6.010276  10.840276   16.207960   21.475150   27.780042    2.011176   37.073454  -0.704188  -0.315986  -0.172089  0.194589
2    Session_01   ETH-3       -4.000        26.000   5.734896  11.229855   16.740410   22.402091   28.306614    1.702875   37.472070  -0.276998  -0.179635  -0.125368  0.615396
3    Session_01   ETH-3       -4.000        26.000   5.727341  11.211663   16.713472   22.364770   28.306614    1.695479   37.453503  -0.278056  -0.180158  -0.082015  0.614365
4    Session_01   ETH-2       -4.000        26.000  -5.991278  -5.995054  -12.741562  -12.184075  -18.023381  -10.180122   19.902809  -0.711697  -0.232746   0.032602  0.199357
5    Session_01     BAR       -4.000        26.000  -9.920507  10.903408    0.065076   21.704075   10.707292  -14.998270   37.174839  -0.307018  -0.216978  -0.026076  0.592818
6    Session_01     FOO       -4.000        26.000  -0.838118   2.819853    1.310384    5.326005    4.665655   -5.004629   28.895933  -0.593755  -0.319861   0.014956  0.309692
7    Session_01   ETH-3       -4.000        26.000   5.755174  11.255104   16.792797   22.451660   28.306614    1.723596   37.497816  -0.270825  -0.181089  -0.195908  0.621458
8    Session_01   ETH-1       -4.000        26.000   6.049381  10.706856   16.135579   21.196941   27.780042    2.057827   36.937067  -0.685751  -0.324384   0.045870  0.212791
9    Session_01     FOO       -4.000        26.000  -0.848028   2.874679    1.346196    5.439150    4.665655   -5.017230   28.951964  -0.601502  -0.316664  -0.081898  0.302042
10   Session_01     FOO       -4.000        26.000  -0.876454   2.906764    1.341194    5.490264    4.665655   -5.048760   28.984806  -0.608593  -0.329808  -0.114437  0.295055
11   Session_01   ETH-2       -4.000        26.000  -5.974124  -5.955517  -12.668784  -12.208184  -18.023381  -10.163274   19.943159  -0.694902  -0.336672  -0.063946  0.215880
12   Session_01   ETH-1       -4.000        26.000   5.995601  10.755323   16.116087   21.285428   27.780042    1.998631   36.986704  -0.696924  -0.333640   0.008600  0.201787
13   Session_01   ETH-2       -4.000        26.000  -5.982229  -6.110437  -12.827036  -12.492272  -18.023381  -10.166188   19.784916  -0.693555  -0.312598   0.251040  0.217274
14   Session_01     BAR       -4.000        26.000  -9.915975  10.968470    0.153453   21.749385   10.707292  -14.995822   37.241294  -0.286638  -0.301325  -0.157376  0.612868
15   Session_01     BAR       -4.000        26.000  -9.959983  10.926995    0.053806   21.724901   10.707292  -15.041279   37.199026  -0.300066  -0.243252  -0.029371  0.599675
16   Session_02   ETH-1       -4.000        26.000   6.019963  10.773112   16.163825   21.331060   27.780042    2.029040   37.042346  -0.692234  -0.324161  -0.051788  0.207075
17   Session_02     FOO       -4.000        26.000  -0.848415   2.849823    1.308081    5.427767    4.665655   -5.018107   28.927036  -0.614791  -0.278426  -0.032784  0.292547
18   Session_02   ETH-2       -4.000        26.000  -5.993476  -5.944866  -12.696865  -12.149754  -18.023381  -10.190430   19.913381  -0.713779  -0.298963  -0.064251  0.199436
19   Session_02     BAR       -4.000        26.000  -9.957566  10.903888    0.031785   21.739434   10.707292  -15.048386   37.213724  -0.302139  -0.183327   0.012926  0.608897
20   Session_02   ETH-3       -4.000        26.000   5.757137  11.232751   16.744567   22.398244   28.306614    1.731295   37.514660  -0.298533  -0.189123  -0.154557  0.604363
21   Session_02   ETH-1       -4.000        26.000   5.993918  10.617469   15.991900   21.070358   27.780042    2.006934   36.882679  -0.683329  -0.271476   0.278458  0.216152
22   Session_02     FOO       -4.000        26.000  -0.835046   2.870518    1.355370    5.487896    4.665655   -5.004585   28.948243  -0.601666  -0.259900  -0.087592  0.305777
23   Session_02     BAR       -4.000        26.000  -9.963888  10.865863   -0.023549   21.615868   10.707292  -15.053743   37.174715  -0.313906  -0.229031   0.093637  0.597041
24   Session_02   ETH-2       -4.000        26.000  -5.950370  -5.959974  -12.650784  -12.197864  -18.023381  -10.143809   19.897777  -0.696916  -0.317263  -0.080604  0.216441
25   Session_02   ETH-2       -4.000        26.000  -5.982371  -6.036210  -12.762399  -12.309944  -18.023381  -10.175178   19.819614  -0.701348  -0.277354   0.104418  0.212021
26   Session_02     BAR       -4.000        26.000  -9.936020  10.862339    0.024660   21.563307   10.707292  -15.023836   37.171034  -0.291333  -0.273498   0.070452  0.619812
27   Session_02     FOO       -4.000        26.000  -0.819742   2.826793    1.317044    5.330616    4.665655   -4.986618   28.903335  -0.612871  -0.329113  -0.018244  0.294481
28   Session_02   ETH-1       -4.000        26.000   6.030532  10.851030   16.245571   21.457100   27.780042    2.037466   37.122284  -0.698413  -0.354920  -0.214443  0.200795
29   Session_02   ETH-3       -4.000        26.000   5.716356  11.091821   16.582487   22.123857   28.306614    1.692901   37.370126  -0.279100  -0.178789   0.162540  0.624067
30   Session_02   ETH-3       -4.000        26.000   5.719281  11.207303   16.681693   22.370886   28.306614    1.691780   37.488633  -0.296801  -0.165556  -0.065004  0.606143
31   Session_03   ETH-2       -4.000        26.000  -5.997147  -5.905858  -12.655382  -12.081612  -18.023381  -10.165400   19.891551  -0.706536  -0.308464  -0.137414  0.197550
32   Session_03   ETH-2       -4.000        26.000  -6.008525  -5.909707  -12.647727  -12.075913  -18.023381  -10.177379   19.887608  -0.683183  -0.294956  -0.117608  0.220975
33   Session_03     BAR       -4.000        26.000  -9.928709  10.989665    0.148059   21.852677   10.707292  -14.976237   37.324152  -0.299358  -0.242185  -0.184835  0.603855
34   Session_03     BAR       -4.000        26.000  -9.957114  10.898997    0.044946   21.602296   10.707292  -15.003175   37.230716  -0.284699  -0.307849   0.021944  0.618578
35   Session_03   ETH-2       -4.000        26.000  -6.000290  -5.947172  -12.697463  -12.164602  -18.023381  -10.167221   19.848953  -0.705037  -0.309350  -0.052386  0.199061
36   Session_03   ETH-1       -4.000        26.000   5.994622  10.743980   16.116098   21.243734   27.780042    1.997857   37.033567  -0.684883  -0.352014   0.031692  0.214449
37   Session_03     FOO       -4.000        26.000  -0.823857   2.761300    1.258060    5.239992    4.665655   -4.973383   28.817444  -0.603327  -0.288652   0.114488  0.298751
38   Session_03   ETH-3       -4.000        26.000   5.748546  11.079879   16.580826   22.120063   28.306614    1.723364   37.380534  -0.302133  -0.158882   0.151641  0.598318
39   Session_03   ETH-1       -4.000        26.000   6.040566  10.786620   16.205283   21.374963   27.780042    2.045244   37.077432  -0.685706  -0.307909  -0.099869  0.213609
40   Session_03   ETH-3       -4.000        26.000   5.718991  11.146227   16.640814   22.243185   28.306614    1.689442   37.449023  -0.277332  -0.169668   0.053997  0.623187
41   Session_03     BAR       -4.000        26.000  -9.952115  11.034508    0.169809   21.885915   10.707292  -15.002819   37.370451  -0.296804  -0.298351  -0.246731  0.606414
42   Session_03   ETH-3       -4.000        26.000   5.753467  11.206589   16.719131   22.373244   28.306614    1.723960   37.511190  -0.294350  -0.161838  -0.099835  0.606103
43   Session_03     FOO       -4.000        26.000  -0.800284   2.851299    1.376828    5.379547    4.665655   -4.951581   28.910199  -0.597293  -0.329315  -0.087015  0.304784
44   Session_03     FOO       -4.000        26.000  -0.873798   2.820799    1.272165    5.370745    4.665655   -5.028782   28.878917  -0.596008  -0.277258   0.051165  0.306090
45   Session_03   ETH-1       -4.000        26.000   6.004078  10.683951   16.045192   21.214355   27.780042    2.010134   36.971642  -0.705956  -0.262026   0.138399  0.193323
46   Session_04     BAR       -4.000        26.000  -9.931741  10.819830   -0.023748   21.529372   10.707292  -15.006533   37.118743  -0.302866  -0.222623   0.148462  0.596536
47   Session_04     FOO       -4.000        26.000  -0.853969   2.805035    1.267571    5.353907    4.665655   -5.030523   28.850660  -0.605611  -0.262571   0.060903  0.298685
48   Session_04     FOO       -4.000        26.000  -0.791191   2.708220    1.256167    5.145784    4.665655   -4.960004   28.750896  -0.586913  -0.276505   0.183674  0.317065
49   Session_04   ETH-1       -4.000        26.000   6.023822  10.730714   16.121184   21.235757   27.780042    2.012958   36.989833  -0.696908  -0.333582   0.026555  0.205610
50   Session_04   ETH-2       -4.000        26.000  -5.973623  -5.975018  -12.694278  -12.194472  -18.023381  -10.166297   19.828211  -0.701951  -0.283570  -0.025935  0.207135
51   Session_04   ETH-1       -4.000        26.000   6.017312  10.735930   16.123043   21.270597   27.780042    2.005824   36.995214  -0.693479  -0.309795   0.023309  0.208980
52   Session_04   ETH-1       -4.000        26.000   6.029937  10.766997   16.151273   21.345479   27.780042    2.018148   37.027152  -0.708855  -0.297953  -0.050465  0.193862
53   Session_04   ETH-3       -4.000        26.000   5.739420  11.128582   16.641344   22.166106   28.306614    1.695046   37.399884  -0.280608  -0.210162   0.066645  0.614665
54   Session_04   ETH-3       -4.000        26.000   5.751908  11.207110   16.726741   22.380392   28.306614    1.705481   37.480657  -0.285776  -0.155878  -0.099197  0.609567
55   Session_04     FOO       -4.000        26.000  -0.848192   2.777763    1.251297    5.280272    4.665655   -5.023358   28.822585  -0.601094  -0.281419   0.108186  0.303128
56   Session_04   ETH-2       -4.000        26.000  -5.966627  -5.893789  -12.597717  -12.120719  -18.023381  -10.161842   19.911776  -0.691757  -0.372308  -0.193986  0.217132
57   Session_04   ETH-3       -4.000        26.000   5.798016  11.254135   16.832228   22.432473   28.306614    1.752928   37.528936  -0.275047  -0.197935  -0.239408  0.620088
58   Session_04   ETH-2       -4.000        26.000  -5.986501  -5.915157  -12.656583  -12.060382  -18.023381  -10.182247   19.889836  -0.709603  -0.268277  -0.130450  0.199604
59   Session_04     BAR       -4.000        26.000  -9.926078  10.884823    0.060864   21.650722   10.707292  -15.002880   37.185606  -0.287358  -0.232425   0.016044  0.611760
60   Session_04     BAR       -4.000        26.000  -9.951025  10.951923    0.089386   21.738926   10.707292  -15.031949   37.254709  -0.298065  -0.278834  -0.087463  0.601230
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––


def table_of_samples( data47=None, data48=None, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of list of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

def table_of_sessions( data47=None, data48=None, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
def table_of_sessions(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of sessions
	for a pair of `D47data` and `D48data` objects.
	***Only applicable if the sessions in `data47` and those in `data48`
	consist of the exact same sets of analyses.***

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of list of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			for k,x in enumerate(out47[0]):
				if k>7:
					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_sessions.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

def table_of_analyses( data47=None, data48=None, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
def table_of_analyses(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of analyses
	for a pair of `D47data` and `D48data` objects.

	If the sessions in `data47` and those in `data48` do not consist of
	the exact same sets of analyses, the table will have two columns
	`Session_47` and `Session_48` instead of a single `Session` column.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of list of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
			else:
				out47[0][1] = 'Session_47'
				out48[0][1] = 'Session_48'
				out47 = transpose_table(out47)
				out48 = transpose_table(out48)
				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_analyses.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

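All three functions above share the same signature. The following sketch (sample names, compositions and seeds are arbitrary) builds a `D47data` and a `D48data` object from the same simulated analyses, then prints out the three combined tables without saving them:

from D47crunch import D47data, D48data, virtual_data
from D47crunch import table_of_samples, table_of_sessions, table_of_analyses

args = dict(samples = [
    dict(Sample = 'ETH-1', N = 3),
    dict(Sample = 'ETH-2', N = 3),
    dict(Sample = 'ETH-3', N = 3),
    dict(Sample = 'FOO', N = 3,
        d13C_VPDB = -5., d18O_VPDB = -10.,
        D47 = 0.3, D48 = 0.15),
    ])

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)

data47 = D47data(session1 + session2)
data47.crunch()
data47.standardize()

data48 = D48data(session1 + session2)
data48.crunch()
data48.standardize()

# both objects hold the exact same sets of analyses, so table_of_sessions() applies:
table_of_sessions(data47, data48, save_to_file = False)
table_of_samples(data47, data48, save_to_file = False)
table_of_analyses(data47, data48, save_to_file = False)
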
class D4xdata(builtins.list):
class D4xdata(list):
	'''
	Store and process data for a large set of Δ47 and/or Δ48
	analyses, usually comprising more than one analytical session.
	'''

	### 17O CORRECTION PARAMETERS
	R13_VPDB = 0.01118  # (Chang & Li, 1990)
	'''
	Absolute (13C/12C) ratio of VPDB.
	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
	'''

	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
	'''
	Absolute (18O/16O) ratio of VSMOW.
	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
	'''

	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
	'''
	Mass-dependent exponent for triple oxygen isotopes.
	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
	'''

	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
	'''
	Absolute (17O/16O) ratio of VSMOW.
	By default equal to 0.00038475
	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
	rescaled to `R13_VPDB`)
	'''

	R18_VPDB = R18_VSMOW * 1.03092
	'''
	Absolute (18O/16O) ratio of VPDB.
	By definition equal to `R18_VSMOW * 1.03092`.
	'''

	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
	'''
	Absolute (17O/16O) ratio of VPDB.
	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
	'''

	LEVENE_REF_SAMPLE = 'ETH-3'
	'''
	After the Δ4x standardization step, each sample is tested to
	assess whether the Δ4x variance within all analyses for that
	sample differs significantly from that observed for a given reference
	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
	which yields a p-value corresponding to the null hypothesis that the
	underlying variances are equal).

	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
	sample should be used as a reference for this test.
	'''

	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
	'''
	Specifies the 18O/16O fractionation factor generally applicable
	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
	`D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.

	By default equal to 1.008129 (calcite reacted at 90 °C,
	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
	'''

	Nominal_d13C_VPDB = {
		'ETH-1': 2.02,
		'ETH-2': -10.17,
		'ETH-3': 1.71,
		}	# (Bernasconi et al., 2018)
	'''
	Nominal δ13C_VPDB values assigned to carbonate standards, used by
	`D4xdata.standardize_d13C()`.

	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
	'''

	Nominal_d18O_VPDB = {
		'ETH-1': -2.19,
		'ETH-2': -18.69,
		'ETH-3': -1.78,
		}	# (Bernasconi et al., 2018)
	'''
	Nominal δ18O_VPDB values assigned to carbonate standards, used by
	`D4xdata.standardize_d18O()`.

	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
	'''

	d13C_STANDARDIZATION_METHOD = '2pt'
	'''
	Method by which to standardize δ13C values:

	+ `'none'`: do not apply any δ13C standardization.
	+ `'1pt'`: within each session, offset all initial δ13C values so as to
	minimize the difference between final δ13C_VPDB values and
	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
	values so as to minimize the difference between final δ13C_VPDB
	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
	is defined).
	'''

	d18O_STANDARDIZATION_METHOD = '2pt'
	'''
	Method by which to standardize δ18O values:

	+ `'none'`: do not apply any δ18O standardization.
	+ `'1pt'`: within each session, offset all initial δ18O values so as to
	minimize the difference between final δ18O_VPDB values and
	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
	values so as to minimize the difference between final δ18O_VPDB
	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
	is defined).
	'''

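	# Note (illustrative): `refresh_sessions()` copies these two class-level defaults
	# into each session dictionary, where they may be redefined per session, e.g.:
	#     mydata.sessions['Session_01']['d13C_standardization_method'] = '1pt'
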
	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
		'''
		**Parameters**

		+ `l`: a list of dictionaries, with each dictionary including at least the keys
		`Sample`, `d45`, `d46`, and `d47` or `d48`.
		+ `mass`: `'47'` or `'48'`
		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
		+ `session`: define session name for analyses without a `Session` key
		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

		Returns a `D4xdata` object derived from `list`.
		'''
		self._4x = mass
		self.verbose = verbose
		self.prefix = 'D4xdata'
		self.logfile = logfile
		list.__init__(self, l)
		self.Nf = None
		self.repeatability = {}
		self.refresh(session = session)


	def make_verbal(oldfun):
		'''
		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
		'''
		@wraps(oldfun)
		def newfun(*args, verbose = '', **kwargs):
			myself = args[0]
			oldprefix = myself.prefix
			myself.prefix = oldfun.__name__
			if verbose != '':
				oldverbose = myself.verbose
				myself.verbose = verbose
			out = oldfun(*args, **kwargs)
			myself.prefix = oldprefix
			if verbose != '':
				myself.verbose = oldverbose
			return out
		return newfun


	def msg(self, txt):
		'''
		Log a message to `self.logfile`, and print it out if `verbose = True`
		'''
		self.log(txt)
		if self.verbose:
			print(f'{f"[{self.prefix}]":<16} {txt}')


	def vmsg(self, txt):
		'''
		Log a message to `self.logfile` and print it out
		'''
		self.log(txt)
		print(txt)


	def log(self, *txts):
		'''
		Log a message to `self.logfile`
		'''
		if self.logfile:
			with open(self.logfile, 'a') as fid:
				for txt in txts:
					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')


	def refresh(self, session = 'mySession'):
		'''
		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
		'''
		self.fill_in_missing_info(session = session)
		self.refresh_sessions()
		self.refresh_samples()


	def refresh_sessions(self):
		'''
		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
		to `False` for all sessions.
		'''
		self.sessions = {
			s: {'data': [r for r in self if r['Session'] == s]}
			for s in sorted({r['Session'] for r in self})
			}
		for s in self.sessions:
			self.sessions[s]['scrambling_drift'] = False
			self.sessions[s]['slope_drift'] = False
			self.sessions[s]['wg_drift'] = False
			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD


	def refresh_samples(self):
		'''
		Define `self.samples`, `self.anchors`, and `self.unknowns`.
		'''
		self.samples = {
			s: {'data': [r for r in self if r['Sample'] == s]}
			for s in sorted({r['Sample'] for r in self})
			}
		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}


	def read(self, filename, sep = '', session = ''):
		'''
		Read file in csv format to load data into a `D47data` object.

		In the csv file, spaces before and after field separators (`','` by default)
		are optional. Each line corresponds to a single analysis.

		The required fields are:

		+ `UID`: a unique identifier
		+ `Session`: an identifier for the analytical session
		+ `Sample`: a sample identifier
		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
		and `d49` are optional, and set to NaN by default.

		**Parameters**

		+ `filename`: the path of the file to read
		+ `sep`: csv separator delimiting the fields
		+ `session`: set `Session` field to this string for all analyses
		'''
		with open(filename) as fid:
			self.input(fid.read(), sep = sep, session = session)


	def input(self, txt, sep = '', session = ''):
		'''
		Read `txt` string in csv format to load analysis data into a `D47data` object.

		In the csv string, spaces before and after field separators (`','` by default)
		are optional. Each line corresponds to a single analysis.

		The required fields are:

		+ `UID`: a unique identifier
		+ `Session`: an identifier for the analytical session
		+ `Sample`: a sample identifier
		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
		and `d49` are optional, and set to NaN by default.

		**Parameters**

		+ `txt`: the csv string to read
		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
		whichever appears most often in `txt`.
		+ `session`: set `Session` field to this string for all analyses
		'''
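		# For instance (illustrative data), the following is equivalent to reading
		# the same text from a csv file:
		#     mydata = D47data()
		#     mydata.input('UID,Session,Sample,d45,d46,d47\nA01,S1,ETH-1,5.795,11.627,16.893')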
		if sep == '':
			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

		if session != '':
			for r in data:
				r['Session'] = session

		self += data
		self.refresh()


	@make_verbal
	def wg(self, samples = None, a18_acid = None):
		'''
		Compute bulk composition of the working gas for each session based on
		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
		`self.Nominal_d18O_VPDB`.
		'''

		self.msg('Computing WG composition:')

		if a18_acid is None:
			a18_acid = self.ALPHA_18O_ACID_REACTION
		if samples is None:
			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]

		assert a18_acid, 'Acid fractionation factor should not be zero.'

		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
		R45R46_standards = {}
		for sample in samples:
			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

			C12_s = 1 / (1 + R13_s)
			C13_s = R13_s / (1 + R13_s)
			C16_s = 1 / (1 + R17_s + R18_s)
			C17_s = R17_s / (1 + R17_s + R18_s)
			C18_s = R18_s / (1 + R17_s + R18_s)

			C626_s = C12_s * C16_s ** 2
			C627_s = 2 * C12_s * C16_s * C17_s
			C628_s = 2 * C12_s * C16_s * C18_s
			C636_s = C13_s * C16_s ** 2
			C637_s = 2 * C13_s * C16_s * C17_s
			C727_s = C12_s * C17_s ** 2

			R45_s = (C627_s + C636_s) / C626_s
			R46_s = (C628_s + C637_s + C727_s) / C626_s
			R45R46_standards[sample] = (R45_s, R46_s)

		for s in self.sessions:
			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
			assert db, f'No sample from {samples} found in session "{s}".'

			X = [r['d45'] for r in db]
			Y = [R45R46_standards[r['Sample']][0] for r in db]
			x1, x2 = np.min(X), np.max(X)

			if x1 < x2:
				wgcoord = x1/(x1-x2)
			else:
				wgcoord = 999

			if wgcoord < -.5 or wgcoord > 1.5:
				# unreasonable to extrapolate to d45 = 0
				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
			else:
				# d45 = 0 is reasonably well bracketed
				R45_wg = np.polyfit(X, Y, 1)[1]

			X = [r['d46'] for r in db]
			Y = [R45R46_standards[r['Sample']][1] for r in db]
			x1, x2 = np.min(X), np.max(X)

			if x1 < x2:
				wgcoord = x1/(x1-x2)
			else:
				wgcoord = 999

			if wgcoord < -.5 or wgcoord > 1.5:
				# unreasonable to extrapolate to d46 = 0
				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
			else:
				# d46 = 0 is reasonably well bracketed
				R46_wg = np.polyfit(X, Y, 1)[1]

			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
			for r in self.sessions[s]['data']:
				r['d13Cwg_VPDB'] = d13Cwg_VPDB
				r['d18Owg_VSMOW'] = d18Owg_VSMOW


	def compute_bulk_delta(self, R45, R46, D17O = 0):
		'''
		Compute δ13C_VPDB and δ18O_VSMOW,
		by solving the generalized form of equation (17) from
		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
		solving the corresponding second-order Taylor polynomial.
		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
		'''

		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
		C = 2 * self.R18_VSMOW
		D = -R46

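		# The quadratic aa * x**2 + bb * x + cc = 0 below is the second-order Taylor
		# polynomial mentioned in the docstring; the root selected on the next lines
		# equals d18O_VSMOW / 1000.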
		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
		cc = A + B + C + D

		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
		R17 = K * R18 ** self.LAMBDA_17
		R13 = R45 - 2 * R17

		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

		return d13C_VPDB, d18O_VSMOW


	@make_verbal
	def crunch(self, verbose = ''):
		'''
		Compute bulk composition and raw clumped isotope anomalies for all analyses.
		'''
		for r in self:
			self.compute_bulk_and_clumping_deltas(r)
		self.standardize_d13C()
		self.standardize_d18O()
		self.msg(f"Crunched {len(self)} analyses.")


	def fill_in_missing_info(self, session = 'mySession'):
		'''
		Fill in optional fields with default values
		'''
		for i,r in enumerate(self):
			if 'D17O' not in r:
				r['D17O'] = 0.
			if 'UID' not in r:
				r['UID'] = f'{i+1}'
			if 'Session' not in r:
				r['Session'] = session
			for k in ['d47', 'd48', 'd49']:
				if k not in r:
					r[k] = np.nan


	def standardize_d13C(self):
		'''
		Perform δ13C standardization within each session `s` according to
		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
		may be redefined arbitrarily at a later stage.
		'''
		for s in self.sessions:
			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
				X,Y = zip(*XY)
				if self.sessions[s]['d13C_standardization_method'] == '1pt':
					offset = np.mean(Y) - np.mean(X)
					for r in self.sessions[s]['data']:
						r['d13C_VPDB'] += offset
				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
					a,b = np.polyfit(X,Y,1)
					for r in self.sessions[s]['data']:
						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b


	def standardize_d18O(self):
		'''
		Perform δ18O standardization within each session `s` according to
		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
		which is defined by default by `D47data.refresh_sessions()` as equal to
		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
		'''
		for s in self.sessions:
			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
				X,Y = zip(*XY)
				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
				if self.sessions[s]['d18O_standardization_method'] == '1pt':
					offset = np.mean(Y) - np.mean(X)
					for r in self.sessions[s]['data']:
						r['d18O_VSMOW'] += offset
				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
					a,b = np.polyfit(X,Y,1)
					for r in self.sessions[s]['data']:
						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b


	def compute_bulk_and_clumping_deltas(self, r):
		'''
		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
		'''

		# Compute working gas R13, R18, and isobar ratios
		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

		# Compute analyte isobar ratios
		R45 = (1 + r['d45'] / 1000) * R45_wg
		R46 = (1 + r['d46'] / 1000) * R46_wg
		R47 = (1 + r['d47'] / 1000) * R47_wg
		R48 = (1 + r['d48'] / 1000) * R48_wg
		R49 = (1 + r['d49'] / 1000) * R49_wg

		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

		# Compute stochastic isobar ratios of the analyte
		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
			R13, R18, D17O = r['D17O']
		)

		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
		# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
		if (R45 / R45stoch - 1) > 5e-8:
			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
		if (R46 / R46stoch - 1) > 5e-8:
			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

		# Compute raw clumped isotope anomalies
		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
		r['D49raw'] = 1000 * (R49 / R49stoch - 1)


	def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0):
		'''
		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
		'''

		# Compute R17
		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

		# Compute isotope concentrations
		C12 = (1 + R13) ** -1
		C13 = C12 * R13
		C16 = (1 + R17 + R18) ** -1
		C17 = C16 * R17
		C18 = C16 * R18

		# Compute stochastic isotopologue concentrations
		C626 = C16 * C12 * C16
		C627 = C16 * C12 * C17 * 2
		C628 = C16 * C12 * C18 * 2
		C636 = C16 * C13 * C16
		C637 = C16 * C13 * C17 * 2
		C638 = C16 * C13 * C18 * 2
		C727 = C17 * C12 * C17
		C728 = C17 * C12 * C18 * 2
		C737 = C17 * C13 * C17
		C738 = C17 * C13 * C18 * 2
		C828 = C18 * C12 * C18
		C838 = C18 * C13 * C18

		# Compute stochastic isobar ratios
		R45 = (C636 + C627) / C626
		R46 = (C628 + C637 + C727) / C626
		R47 = (C638 + C728 + C737) / C626
		R48 = (C738 + C828) / C626
		R49 = C838 / C626

		# Account for stochastic anomalies
		R47 *= 1 + D47 / 1000
		R48 *= 1 + D48 / 1000
		R49 *= 1 + D49 / 1000

		# Return isobar ratios
		return R45, R46, R47, R48, R49


	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
		'''
		Split unknown samples by UID (treat all analyses as different samples)
		or by session (treat analyses of a given sample in different sessions as
		different samples).

		**Parameters**

		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
		+ `grouping`: `by_uid` | `by_session`
		'''
		if samples_to_split == 'all':
			samples_to_split = [s for s in self.unknowns]
		gkeys = {'by_uid':'UID', 'by_session':'Session'}
		self.grouping = grouping.lower()
		if self.grouping in gkeys:
			gkey = gkeys[self.grouping]
		for r in self:
			if r['Sample'] in samples_to_split:
				r['Sample_original'] = r['Sample']
				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
			elif r['Sample'] in self.unknowns:
				r['Sample_original'] = r['Sample']
		self.refresh_samples()


	def unsplit_samples(self, tables = False):
		'''
		Reverse the effects of `D47data.split_samples()`.

		This should only be used after `D4xdata.standardize()` with `method='pooled'`.

		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
		probably use `D4xdata.combine_samples()` instead to reverse the effects of
		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
		that case session-averaged Δ4x values are statistically independent).
		'''
		unknowns_old = sorted({s for s in self.unknowns})
		CM_old = self.standardization.covar[:,:]
		VD_old = self.standardization.params.valuesdict().copy()
		vars_old = self.standardization.var_names

		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

		Ns = len(vars_old) - len(unknowns_old)
		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

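		# Build the averaging matrix W that maps the standardization variables of the
		# split samples back onto the original ones: the first Ns variables (those that
		# are not split-sample Δ4x values) pass through unchanged, while each original
		# unknown becomes a weighted average of its splits. The covariance matrix then
		# propagates linearly as W @ CM_old @ W.T.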
1423		W = np.zeros((len(vars_new), len(vars_old)))
1424		W[:Ns,:Ns] = np.eye(Ns)
1425		for u in unknowns_new:
1426			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1427			if self.grouping == 'by_session':
1428				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1429			elif self.grouping == 'by_uid':
1430				weights = [1 for s in splits]
1431			sw = sum(weights)
1432			weights = [w/sw for w in weights]
1433			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1434
1435		CM_new = W @ CM_old @ W.T
1436		V = W @ np.array([[VD_old[k]] for k in vars_old])
1437		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1438
1439		self.standardization.covar = CM_new
1440		self.standardization.params.valuesdict = lambda : VD_new
1441		self.standardization.var_names = vars_new
1442
1443		for r in self:
1444			if r['Sample'] in self.unknowns:
1445				r['Sample_split'] = r['Sample']
1446				r['Sample'] = r['Sample_original']
1447
1448		self.refresh_samples()
1449		self.consolidate_samples()
1450		self.repeatabilities()
1451
1452		if tables:
1453			self.table_of_analyses()
1454			self.table_of_samples()
1455
1456	def assign_timestamps(self):
1457		'''
1458		Assign a time field `t` of type `float` to each analysis.
1459
1460		If `TimeTag` is one of the data fields, `t` is equal within a given session
1461		to `TimeTag` minus the mean value of `TimeTag` for that session.
1462		Otherwise, `TimeTag` defaults to the index of each analysis within its
1463		session, and `t` is defined as above.
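	
		**Example** (a sketch of the fallback behavior; the session name
		`'Session01'` is hypothetical):
	
		```py
		mydata.assign_timestamps()
		# for a session of 5 analyses lacking a TimeTag field:
		print(sorted(r['t'] for r in mydata.sessions['Session01']['data']))
		# expected output: [-2.0, -1.0, 0.0, 1.0, 2.0]
		```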
1464		'''
1465		for session in self.sessions:
1466			sdata = self.sessions[session]['data']
1467			try:
1468				t0 = np.mean([r['TimeTag'] for r in sdata])
1469				for r in sdata:
1470					r['t'] = r['TimeTag'] - t0
1471			except KeyError:
1472				t0 = (len(sdata)-1)/2
1473				for t,r in enumerate(sdata):
1474					r['t'] = t - t0
1475
1476
1477	def report(self):
1478		'''
1479		Prints a report on the standardization fit.
1480		Only applicable after `D4xdata.standardize(method='pooled')`.
1481		'''
1482		report_fit(self.standardization)
1483
1484
1485	def combine_samples(self, sample_groups):
1486		'''
1487		Combine analyses of different samples to compute weighted average Δ4x
1488		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1489		dictionary.
1490		
1491		Caution: samples are weighted by number of replicate analyses, which is a
1492		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1493		correlated analytical errors for one or more samples).
1494		
1495		Returns a tuple of:
1496		
1497		+ the list of group names
1498		+ an array of the corresponding Δ4x values
1499		+ the corresponding (co)variance matrix
1500		
1501		**Parameters**
1502
1503		+ `sample_groups`: a dictionary of the form:
1504		```py
1505		{'group1': ['sample_1', 'sample_2'],
1506		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1507		```
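	
		**Example** (a minimal sketch; sample and group names are hypothetical):
	
		```py
		groups, D47_avg, CM = mydata.combine_samples(
			{'groupA': ['sample_1', 'sample_2'],
			 'groupB': ['sample_3', 'sample_4']})
		```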
1508		'''
1509		
1510		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1511		groups = sorted(sample_groups.keys())
1512		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1513		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1514		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1515		W = np.array([
1516			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1517			for j in groups])
1518		D4x_new = W @ D4x_old
1519		CM_new = W @ CM_old @ W.T
1520
1521		return groups, D4x_new[:,0], CM_new
1522		
1523
1524	@make_verbal
1525	def standardize(self,
1526		method = 'pooled',
1527		weighted_sessions = [],
1528		consolidate = True,
1529		consolidate_tables = False,
1530		consolidate_plots = False,
1531		constraints = {},
1532		):
1533		'''
1534		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1535		If the `method` argument is set to `'pooled'`, the standardization processes all sessions
1536		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1537		i.e. that their true Δ4x value does not change between sessions
1538		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
1539		`'indep_sessions'`, the standardization processes each session independently, based only
1540		on anchor analyses.
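	
		**Example** (a minimal sketch, assuming `mydata` holds crunched analyses):
	
		```py
		mydata.standardize()                             # pooled fit (default)
		# or, alternatively:
		# mydata.standardize(method = 'indep_sessions')  # session-by-session fit
		```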
1541		'''
1542
1543		self.standardization_method = method
1544		self.assign_timestamps()
1545
1546		if method == 'pooled':
1547			if weighted_sessions:
1548				for session_group in weighted_sessions:
1549					if self._4x == '47':
1550						X = D47data([r for r in self if r['Session'] in session_group])
1551					elif self._4x == '48':
1552						X = D48data([r for r in self if r['Session'] in session_group])
1553					X.Nominal_D4x = self.Nominal_D4x.copy()
1554					X.refresh()
1555					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1556					w = np.sqrt(result.redchi)
1557					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1558					for r in X:
1559						r[f'wD{self._4x}raw'] *= w
1560			else:
1561				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1562				for r in self:
1563					r[f'wD{self._4x}raw'] = 1.
1564
1565			params = Parameters()
1566			for k,session in enumerate(self.sessions):
1567				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1568				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1569				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1570				s = pf(session)
1571				params.add(f'a_{s}', value = 0.9)
1572				params.add(f'b_{s}', value = 0.)
1573				params.add(f'c_{s}', value = -0.9)
1574				params.add(f'a2_{s}', value = 0.,
1575# 					vary = self.sessions[session]['scrambling_drift'],
1576					)
1577				params.add(f'b2_{s}', value = 0.,
1578# 					vary = self.sessions[session]['slope_drift'],
1579					)
1580				params.add(f'c2_{s}', value = 0.,
1581# 					vary = self.sessions[session]['wg_drift'],
1582					)
1583				if not self.sessions[session]['scrambling_drift']:
1584					params[f'a2_{s}'].expr = '0'
1585				if not self.sessions[session]['slope_drift']:
1586					params[f'b2_{s}'].expr = '0'
1587				if not self.sessions[session]['wg_drift']:
1588					params[f'c2_{s}'].expr = '0'
1589
1590			for sample in self.unknowns:
1591				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1592
1593			for k in constraints:
1594				params[k].expr = constraints[k]
1595
1596			def residuals(p):
1597				R = []
1598				for r in self:
1599					session = pf(r['Session'])
1600					sample = pf(r['Sample'])
1601					if r['Sample'] in self.Nominal_D4x:
1602						R += [ (
1603							r[f'D{self._4x}raw'] - (
1604								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1605								+ p[f'b_{session}'] * r[f'd{self._4x}']
1606								+	p[f'c_{session}']
1607								+ r['t'] * (
1608									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1609									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1610									+	p[f'c2_{session}']
1611									)
1612								)
1613							) / r[f'wD{self._4x}raw'] ]
1614					else:
1615						R += [ (
1616							r[f'D{self._4x}raw'] - (
1617								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1618								+ p[f'b_{session}'] * r[f'd{self._4x}']
1619								+	p[f'c_{session}']
1620								+ r['t'] * (
1621									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1622									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1623									+	p[f'c2_{session}']
1624									)
1625								)
1626							) / r[f'wD{self._4x}raw'] ]
1627				return R
1628
1629			M = Minimizer(residuals, params)
1630			result = M.least_squares()
1631			self.Nf = result.nfree
1632			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1633			new_names, new_covar, new_se = _fullcovar(result)[:3]
1634			result.var_names = new_names
1635			result.covar = new_covar
1636
1637			for r in self:
1638				s = pf(r["Session"])
1639				a = result.params.valuesdict()[f'a_{s}']
1640				b = result.params.valuesdict()[f'b_{s}']
1641				c = result.params.valuesdict()[f'c_{s}']
1642				a2 = result.params.valuesdict()[f'a2_{s}']
1643				b2 = result.params.valuesdict()[f'b2_{s}']
1644				c2 = result.params.valuesdict()[f'c2_{s}']
1645				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1646				
1647
1648			self.standardization = result
1649
1650			for session in self.sessions:
1651				self.sessions[session]['Np'] = 3
1652				for k in ['scrambling', 'slope', 'wg']:
1653					if self.sessions[session][f'{k}_drift']:
1654						self.sessions[session]['Np'] += 1
1655
1656			if consolidate:
1657				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1658			return result
1659
1660
1661		elif method == 'indep_sessions':
1662
1663			if weighted_sessions:
1664				for session_group in weighted_sessions:
1665					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1666					X.Nominal_D4x = self.Nominal_D4x.copy()
1667					X.refresh()
1668					# This is only done to assign r['wD47raw'] for r in X:
1669					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1670					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1671			else:
1672				self.msg('All weights set to 1 ‰')
1673				for r in self:
1674					r[f'wD{self._4x}raw'] = 1
1675
1676			for session in self.sessions:
1677				s = self.sessions[session]
1678				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1679				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1680				s['Np'] = sum(p_active)
1681				sdata = s['data']
1682
1683				A = np.array([
1684					[
1685						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1686						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1687						1 / r[f'wD{self._4x}raw'],
1688						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1689						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1690						r['t'] / r[f'wD{self._4x}raw']
1691						]
1692					for r in sdata if r['Sample'] in self.anchors
1693					])[:,p_active] # only keep columns for the active parameters
1694				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1695				s['Na'] = Y.size
1696				CM = linalg.inv(A.T @ A)
1697				bf = (CM @ A.T @ Y).T[0,:]
1698				k = 0
1699				for n,a in zip(p_names, p_active):
1700					if a:
1701						s[n] = bf[k]
1702# 						self.msg(f'{n} = {bf[k]}')
1703						k += 1
1704					else:
1705						s[n] = 0.
1706# 						self.msg(f'{n} = 0.0')
1707
1708				for r in sdata :
1709					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1710					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1711					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1712
1713				s['CM'] = np.zeros((6,6))
1714				i = 0
1715				k_active = [j for j,a in enumerate(p_active) if a]
1716				for j,a in enumerate(p_active):
1717					if a:
1718						s['CM'][j,k_active] = CM[i,:]
1719						i += 1
1720
1721			if not weighted_sessions:
1722				w = self.rmswd()['rmswd']
1723				for r in self:
1724				r[f'wD{self._4x}'] *= w
1725				r[f'wD{self._4x}raw'] *= w
1726				for session in self.sessions:
1727					self.sessions[session]['CM'] *= w**2
1728
1729			for session in self.sessions:
1730				s = self.sessions[session]
1731				s['SE_a'] = s['CM'][0,0]**.5
1732				s['SE_b'] = s['CM'][1,1]**.5
1733				s['SE_c'] = s['CM'][2,2]**.5
1734				s['SE_a2'] = s['CM'][3,3]**.5
1735				s['SE_b2'] = s['CM'][4,4]**.5
1736				s['SE_c2'] = s['CM'][5,5]**.5
1737
1738			if not weighted_sessions:
1739				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1740			else:
1741				self.Nf = 0
1742				for sg in weighted_sessions:
1743					self.Nf += self.rmswd(sessions = sg)['Nf']
1744
1745			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1746
1747			avgD4x = {
1748				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1749				for sample in self.samples
1750				}
1751			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1752			rD4x = (chi2/self.Nf)**.5
1753			self.repeatability[f'sigma_{self._4x}'] = rD4x
1754
1755			if consolidate:
1756				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1757
1758
1759	def standardization_error(self, session, d4x, D4x, t = 0):
1760		'''
1761		Compute standardization error for a given session and
1762		(δ4x, Δ4x) composition.
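	
		**Example** (a sketch; the session name and the (δ4x, Δ4x) values are
		hypothetical, and session covariance matrices must be available, e.g.
		after `standardize(method = 'indep_sessions')`):
	
		```py
		sigma = mydata.standardization_error('Session01', 10., 0.6)
		```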
1763		'''
1764		a = self.sessions[session]['a']
1765		b = self.sessions[session]['b']
1766		c = self.sessions[session]['c']
1767		a2 = self.sessions[session]['a2']
1768		b2 = self.sessions[session]['b2']
1769		c2 = self.sessions[session]['c2']
1770		CM = self.sessions[session]['CM']
1771
1772		x, y = D4x, d4x
1773		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1774# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1775		dxdy = -(b+b2*t) / (a+a2*t)
1776		dxdz = 1. / (a+a2*t)
1777		dxda = -x / (a+a2*t)
1778		dxdb = -y / (a+a2*t)
1779		dxdc = -1. / (a+a2*t)
1780		dxda2 = -x * t / (a+a2*t)
1781		dxdb2 = -y * t / (a+a2*t)
1782		dxdc2 = -t / (a+a2*t)
1783		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1784		sx = (V @ CM @ V.T) ** .5
1785		return sx
1786
1787
1788	@make_verbal
1789	def summary(self,
1790		dir = 'output',
1791		filename = None,
1792		save_to_file = True,
1793		print_out = True,
1794		):
1795		'''
1796		Print out and/or save to disk a summary of the standardization results.
1797
1798		**Parameters**
1799
1800		+ `dir`: the directory in which to save the table
1801		+ `filename`: the name of the csv file to write to
1802		+ `save_to_file`: whether to save the table to disk
1803		+ `print_out`: whether to print out the table
1804		'''
1805
1806		out = []
1807		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1808		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1809		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1810		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1811		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1812		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1813		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1814		out += [['Model degrees of freedom', f"{self.Nf}"]]
1815		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1816		out += [['Standardization method', self.standardization_method]]
1817
1818		if save_to_file:
1819			if not os.path.exists(dir):
1820				os.makedirs(dir)
1821			if filename is None:
1822				filename = f'D{self._4x}_summary.csv'
1823			with open(f'{dir}/{filename}', 'w') as fid:
1824				fid.write(make_csv(out))
1825		if print_out:
1826			self.msg('\n' + pretty_table(out, header = 0))
1827
1828
1829	@make_verbal
1830	def table_of_sessions(self,
1831		dir = 'output',
1832		filename = None,
1833		save_to_file = True,
1834		print_out = True,
1835		output = None,
1836		):
1837		'''
1838		Print out and/or save to disk a table of sessions.
1839
1840		**Parameters**
1841
1842		+ `dir`: the directory in which to save the table
1843		+ `filename`: the name of the csv file to write to
1844		+ `save_to_file`: whether to save the table to disk
1845		+ `print_out`: whether to print out the table
1846		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1847		    if set to `'raw'`: return a list of list of strings
1848		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
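	
		**Example** (a sketch returning the table as a list of lists, without
		writing anything to disk):
	
		```py
		sessions_table = mydata.table_of_sessions(
			save_to_file = False, print_out = False, output = 'raw')
		```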
1849		'''
1850		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1851		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1852		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1853
1854		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1855		if include_a2:
1856			out[-1] += ['a2 ± SE']
1857		if include_b2:
1858			out[-1] += ['b2 ± SE']
1859		if include_c2:
1860			out[-1] += ['c2 ± SE']
1861		for session in self.sessions:
1862			out += [[
1863				session,
1864				f"{self.sessions[session]['Na']}",
1865				f"{self.sessions[session]['Nu']}",
1866				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1867				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1868				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1869				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1870				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1871				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1872				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1873				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1874				]]
1875			if include_a2:
1876				if self.sessions[session]['scrambling_drift']:
1877					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1878				else:
1879					out[-1] += ['']
1880			if include_b2:
1881				if self.sessions[session]['slope_drift']:
1882					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1883				else:
1884					out[-1] += ['']
1885			if include_c2:
1886				if self.sessions[session]['wg_drift']:
1887					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1888				else:
1889					out[-1] += ['']
1890
1891		if save_to_file:
1892			if not os.path.exists(dir):
1893				os.makedirs(dir)
1894			if filename is None:
1895				filename = f'D{self._4x}_sessions.csv'
1896			with open(f'{dir}/{filename}', 'w') as fid:
1897				fid.write(make_csv(out))
1898		if print_out:
1899			self.msg('\n' + pretty_table(out))
1900		if output == 'raw':
1901			return out
1902		elif output == 'pretty':
1903			return pretty_table(out)
1904
1905
1906	@make_verbal
1907	def table_of_analyses(
1908		self,
1909		dir = 'output',
1910		filename = None,
1911		save_to_file = True,
1912		print_out = True,
1913		output = None,
1914		):
1915		'''
1916		Print out and/or save to disk a table of analyses.
1917
1918		**Parameters**
1919
1920		+ `dir`: the directory in which to save the table
1921		+ `filename`: the name of the csv file to write to
1922		+ `save_to_file`: whether to save the table to disk
1923		+ `print_out`: whether to print out the table
1924		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1925		    if set to `'raw'`: return a list of list of strings
1926		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1927		'''
1928
1929		out = [['UID','Session','Sample']]
1930		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1931		for f in extra_fields:
1932			out[-1] += [f[0]]
1933		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1934		for r in self:
1935			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1936			for f in extra_fields:
1937				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1938			out[-1] += [
1939				f"{r['d13Cwg_VPDB']:.3f}",
1940				f"{r['d18Owg_VSMOW']:.3f}",
1941				f"{r['d45']:.6f}",
1942				f"{r['d46']:.6f}",
1943				f"{r['d47']:.6f}",
1944				f"{r['d48']:.6f}",
1945				f"{r['d49']:.6f}",
1946				f"{r['d13C_VPDB']:.6f}",
1947				f"{r['d18O_VSMOW']:.6f}",
1948				f"{r['D47raw']:.6f}",
1949				f"{r['D48raw']:.6f}",
1950				f"{r['D49raw']:.6f}",
1951				f"{r[f'D{self._4x}']:.6f}"
1952				]
1953		if save_to_file:
1954			if not os.path.exists(dir):
1955				os.makedirs(dir)
1956			if filename is None:
1957				filename = f'D{self._4x}_analyses.csv'
1958			with open(f'{dir}/{filename}', 'w') as fid:
1959				fid.write(make_csv(out))
1960		if print_out:
1961			self.msg('\n' + pretty_table(out))
1962		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)
1963
1964	@make_verbal
1965	def covar_table(
1966		self,
1967		correl = False,
1968		dir = 'output',
1969		filename = None,
1970		save_to_file = True,
1971		print_out = True,
1972		output = None,
1973		):
1974		'''
1975		Print out, save to disk and/or return the variance-covariance matrix of D4x
1976		for all unknown samples.
1977
1978		**Parameters**
1979
1980		+ `dir`: the directory in which to save the csv
1981		+ `filename`: the name of the csv file to write to
1982		+ `save_to_file`: whether to save the csv
1983		+ `print_out`: whether to print out the matrix
1984		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
1985		    if set to `'raw'`: return a list of list of strings
1986		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
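	
		**Example** (a sketch writing the correlation matrix, rather than the
		covariance matrix, to `output/D47_correl.csv` without printing it):
	
		```py
		mydata.covar_table(correl = True, print_out = False)
		```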
1987		'''
1988		samples = sorted([u for u in self.unknowns])
1989		out = [[''] + samples]
1990		for s1 in samples:
1991			out.append([s1])
1992			for s2 in samples:
1993				if correl:
1994					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
1995				else:
1996					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
1997
1998		if save_to_file:
1999			if not os.path.exists(dir):
2000				os.makedirs(dir)
2001			if filename is None:
2002				if correl:
2003					filename = f'D{self._4x}_correl.csv'
2004				else:
2005					filename = f'D{self._4x}_covar.csv'
2006			with open(f'{dir}/{filename}', 'w') as fid:
2007				fid.write(make_csv(out))
2008		if print_out:
2009			self.msg('\n'+pretty_table(out))
2010		if output == 'raw':
2011			return out
2012		elif output == 'pretty':
2013			return pretty_table(out)
2014
2015	@make_verbal
2016	def table_of_samples(
2017		self,
2018		dir = 'output',
2019		filename = None,
2020		save_to_file = True,
2021		print_out = True,
2022		output = None,
2023		):
2024		'''
2025		Print out, save to disk and/or return a table of samples.
2026
2027		**Parameters**
2028
2029		+ `dir`: the directory in which to save the csv
2030		+ `filename`: the name of the csv file to write to
2031		+ `save_to_file`: whether to save the csv
2032		+ `print_out`: whether to print out the table
2033		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2034		    if set to `'raw'`: return a list of list of strings
2035		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2036		'''
2037
2038		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2039		for sample in self.anchors:
2040			out += [[
2041				f"{sample}",
2042				f"{self.samples[sample]['N']}",
2043				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2044				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2045				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2046				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2047				]]
2048		for sample in self.unknowns:
2049			out += [[
2050				f"{sample}",
2051				f"{self.samples[sample]['N']}",
2052				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2053				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2054				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2055				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2056				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2057				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2058				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2059				]]
2060		if save_to_file:
2061			if not os.path.exists(dir):
2062				os.makedirs(dir)
2063			if filename is None:
2064				filename = f'D{self._4x}_samples.csv'
2065			with open(f'{dir}/{filename}', 'w') as fid:
2066				fid.write(make_csv(out))
2067		if print_out:
2068			self.msg('\n'+pretty_table(out))
2069		if output == 'raw':
2070			return out
2071		elif output == 'pretty':
2072			return pretty_table(out)
2073
2074
2075	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2076		'''
2077		Generate session plots and save them to disk.
2078
2079		**Parameters**
2080
2081		+ `dir`: the directory in which to save the plots
2082		+ `figsize`: the width and height (in inches) of each plot
2083		+ `filetype`: 'pdf' or 'png'
2084		+ `dpi`: resolution for PNG output
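	
		**Example** (a sketch saving one PNG plot per session):
	
		```py
		mydata.plot_sessions(filetype = 'png', dpi = 200)
		```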
2085		'''
2086		if not os.path.exists(dir):
2087			os.makedirs(dir)
2088
2089		for session in self.sessions:
2090			sp = self.plot_single_session(session, xylimits = 'constant')
2091			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2092			ppl.close(sp.fig)
2093
2094
2095	@make_verbal
2096	def consolidate_samples(self):
2097		'''
2098		Compile various statistics for each sample.
2099
2100		For each anchor sample:
2101
2102		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2103		+ `SE_D47` or `SE_D48`: set to zero by definition
2104
2105		For each unknown sample:
2106
2107		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2108		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2109
2110		For each anchor and unknown:
2111
2112		+ `N`: the total number of analyses of this sample
2113		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2114		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2115		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2116		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2117		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2118		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
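	
		**Example** (a sketch; the sample name is hypothetical). After
		standardization, these statistics may be read back directly:
	
		```py
		s = mydata.samples['MYSAMPLE-1']
		print(s['D47'], s['SE_D47'], s['N'])
		```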
2119		'''
2120		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2121		for sample in self.samples:
2122			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2123			if self.samples[sample]['N'] > 1:
2124				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2125
2126			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2127			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2128
2129			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2130			if len(D4x_pop) > 2:
2131				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2132			
2133		if self.standardization_method == 'pooled':
2134			for sample in self.anchors:
2135				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2136				self.samples[sample][f'SE_D{self._4x}'] = 0.
2137			for sample in self.unknowns:
2138				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2139				try:
2140					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2141				except ValueError:
2142					# when `sample` is constrained by self.standardize(constraints = {...}),
2143					# it is no longer listed in self.standardization.var_names.
2144					# Temporary fix: define SE as zero for now
2145					self.samples[sample][f'SE_D{self._4x}'] = 0.
2146
2147		elif self.standardization_method == 'indep_sessions':
2148			for sample in self.anchors:
2149				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2150				self.samples[sample][f'SE_D{self._4x}'] = 0.
2151			for sample in self.unknowns:
2152				self.msg(f'Consolidating sample {sample}')
2153				self.unknowns[sample][f'session_D{self._4x}'] = {}
2154				session_avg = []
2155				for session in self.sessions:
2156					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2157					if sdata:
2158						self.msg(f'{sample} found in session {session}')
2159						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2160						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2161						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2162						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2163						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2164						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2165						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2166				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2167				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2168				wsum = sum([weights[s] for s in weights])
2169				for s in weights:
2170					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2171
2172		for r in self:
2173			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2174
2175
2176
2177	def consolidate_sessions(self):
2178		'''
2179		Compute various statistics for each session.
2180
2181		+ `Na`: Number of anchor analyses in the session
2182		+ `Nu`: Number of unknown analyses in the session
2183		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2184		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2185		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2186		+ `a`: scrambling factor
2187		+ `b`: compositional slope
2188		+ `c`: WG offset
2189		+ `SE_a`: Model standard error of `a`
2190		+ `SE_b`: Model standard error of `b`
2191		+ `SE_c`: Model standard error of `c`
2192		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2193		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2194		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2195		+ `a2`: scrambling factor drift
2196		+ `b2`: compositional slope drift
2197		+ `c2`: WG offset drift
2198		+ `Np`: Number of standardization parameters to fit
2199		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2200		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2201		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2202		'''
2203		for session in self.sessions:
2204			if 'd13Cwg_VPDB' not in self.sessions[session]:
2205				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2206			if 'd18Owg_VSMOW' not in self.sessions[session]:
2207				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2208			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2209			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2210
2211			self.msg(f'Computing repeatabilities for session {session}')
2212			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2213			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2214			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2215
2216		if self.standardization_method == 'pooled':
2217			for session in self.sessions:
2218
2219				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2220				i = self.standardization.var_names.index(f'a_{pf(session)}')
2221				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2222
2223				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2224				i = self.standardization.var_names.index(f'b_{pf(session)}')
2225				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2226
2227				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2228				i = self.standardization.var_names.index(f'c_{pf(session)}')
2229				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2230
2231				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2232				if self.sessions[session]['scrambling_drift']:
2233					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2234					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2235				else:
2236					self.sessions[session]['SE_a2'] = 0.
2237
2238				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2239				if self.sessions[session]['slope_drift']:
2240					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2241					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2242				else:
2243					self.sessions[session]['SE_b2'] = 0.
2244
2245				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2246				if self.sessions[session]['wg_drift']:
2247					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2248					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2249				else:
2250					self.sessions[session]['SE_c2'] = 0.
2251
2252				i = self.standardization.var_names.index(f'a_{pf(session)}')
2253				j = self.standardization.var_names.index(f'b_{pf(session)}')
2254				k = self.standardization.var_names.index(f'c_{pf(session)}')
2255				CM = np.zeros((6,6))
2256				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2257				try:
2258					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2259					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2260					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2261					try:
2262						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2263						CM[3,4] = self.standardization.covar[i2,j2]
2264						CM[4,3] = self.standardization.covar[j2,i2]
2265					except ValueError:
2266						pass
2267					try:
2268						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2269						CM[3,5] = self.standardization.covar[i2,k2]
2270						CM[5,3] = self.standardization.covar[k2,i2]
2271					except ValueError:
2272						pass
2273				except ValueError:
2274					pass
2275				try:
2276					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2277					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2278					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2279					try:
2280						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2281						CM[4,5] = self.standardization.covar[j2,k2]
2282						CM[5,4] = self.standardization.covar[k2,j2]
2283					except ValueError:
2284						pass
2285				except ValueError:
2286					pass
2287				try:
2288					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2289					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2290					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2291				except ValueError:
2292					pass
2293
2294				self.sessions[session]['CM'] = CM
2295
2296		elif self.standardization_method == 'indep_sessions':
2297			pass # Not implemented yet
2298
2299
2300	@make_verbal
2301	def repeatabilities(self):
2302		'''
2303		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2304		(for all samples, for anchors, and for unknowns).
2305		'''
2306		self.msg('Computing repeatabilities for all sessions')
2307
2308		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2309		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2310		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2311		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2312		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2313
2314
2315	@make_verbal
2316	def consolidate(self, tables = True, plots = True):
2317		'''
2318		Collect information about samples, sessions and repeatabilities.
2319		'''
2320		self.consolidate_samples()
2321		self.consolidate_sessions()
2322		self.repeatabilities()
2323
2324		if tables:
2325			self.summary()
2326			self.table_of_sessions()
2327			self.table_of_analyses()
2328			self.table_of_samples()
2329
2330		if plots:
2331			self.plot_sessions()
2332
2333
2334	@make_verbal
2335	def rmswd(self,
2336		samples = 'all samples',
2337		sessions = 'all sessions',
2338		):
2339		'''
2340		Compute the χ2, the root mean squared weighted deviation (i.e., the
2341		square root of the reduced χ2), and the corresponding degrees of freedom
2342		of the Δ4x values for samples in `samples` and sessions in `sessions`.
2343		
2344		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
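	
		**Example** (a sketch illustrating the returned values):
	
		```py
		res = mydata.rmswd(samples = 'anchors')
		# res['rmswd'] equals (res['chisq'] / res['Nf']) ** 0.5
		```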
2345		'''
2346		if samples == 'all samples':
2347			mysamples = [k for k in self.samples]
2348		elif samples == 'anchors':
2349			mysamples = [k for k in self.anchors]
2350		elif samples == 'unknowns':
2351			mysamples = [k for k in self.unknowns]
2352		else:
2353			mysamples = samples
2354
2355		if sessions == 'all sessions':
2356			sessions = [k for k in self.sessions]
2357
2358		chisq, Nf = 0, 0
2359		for sample in mysamples :
2360			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2361			if len(G) > 1 :
2362				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2363				Nf += (len(G) - 1)
2364				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2365		r = (chisq / Nf)**.5 if Nf > 0 else 0
2366		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2367		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2368
2369	
2370	@make_verbal
2371	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2372		'''
2373		Compute the repeatability of `[r[key] for r in self]`
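	
		**Example** (a sketch, assuming `mydata` is standardized):
	
		```py
		r_d13C = mydata.compute_r('d13C_VPDB', samples = 'anchors')
		r_D47u = mydata.compute_r('D47', samples = 'unknowns')
		```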
2374		'''
2375
2376		if samples == 'all samples':
2377			mysamples = [k for k in self.samples]
2378		elif samples == 'anchors':
2379			mysamples = [k for k in self.anchors]
2380		elif samples == 'unknowns':
2381			mysamples = [k for k in self.unknowns]
2382		else:
2383			mysamples = samples
2384
2385		if sessions == 'all sessions':
2386			sessions = [k for k in self.sessions]
2387
2388		if key in ['D47', 'D48']:
2389			# Full disclosure: the definition of Nf is tricky/debatable
2390			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2391			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2392			Nf = len(G)
2393# 			print(f'len(G) = {Nf}')
2394			Nf -= len([s for s in mysamples if s in self.unknowns])
2395# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2396			for session in sessions:
2397				Np = len([
2398					_ for _ in self.standardization.params
2399					if (
2400						self.standardization.params[_].expr is None
2401						and (
2402							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2403							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2404							)
2405						)
2406					])
2407# 				print(f'session {session}: {Np} parameters to consider')
2408				Na = len({
2409					r['Sample'] for r in self.sessions[session]['data']
2410					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2411					})
2412# 				print(f'session {session}: {Na} different anchors in that session')
2413				Nf -= min(Np, Na)
2414# 			print(f'Nf = {Nf}')
2415
2416# 			for sample in mysamples :
2417# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2418# 				if len(X) > 1 :
2419# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2420# 					if sample in self.unknowns:
2421# 						Nf += len(X) - 1
2422# 					else:
2423# 						Nf += len(X)
2424# 			if samples in ['anchors', 'all samples']:
2425# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2426			r = (chisq / Nf)**.5 if Nf > 0 else 0
2427
2428		else: # if key not in ['D47', 'D48']
2429			chisq, Nf = 0, 0
2430			for sample in mysamples :
2431				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2432				if len(X) > 1 :
2433					Nf += len(X) - 1
2434					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2435			r = (chisq / Nf)**.5 if Nf > 0 else 0
2436
2437		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2438		return r
2439
2440	def sample_average(self, samples, weights = 'equal', normalize = True):
2441		'''
2442		Weighted average Δ4x value of a group of samples, accounting for covariance.
2443
2444		Returns the weighted average Δ4x value and associated SE
2445		of a group of samples. Weights are equal by default. If `normalize` is
2446		true, `weights` will be rescaled so that their sum equals 1.
2447
2448		**Examples**
2449
2450		```python
2451		self.sample_average(['X','Y'], [1, 2])
2452		```
2453
2454		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2455		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2456		values of samples X and Y, respectively.
2457
2458		```python
2459		self.sample_average(['X','Y'], [1, -1], normalize = False)
2460		```
2461
2462		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2463		'''
2464		if weights == 'equal':
2465			weights = [1/len(samples)] * len(samples)
2466
2467		if normalize:
2468			s = sum(weights)
2469			if s:
2470				weights = [w/s for w in weights]
2471
2472		try:
2473# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2474# 			C = self.standardization.covar[indices,:][:,indices]
2475			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2476			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2477			return correlated_sum(X, C, weights)
2478		except ValueError:
2479			return (0., 0.)
2480
2481
2482	def sample_D4x_covar(self, sample1, sample2 = None):
2483		'''
2484		Covariance between Δ4x values of samples
2485
2486		Returns the error covariance between the average Δ4x values of two
2487		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2488		returns the Δ4x variance for that sample.
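	
		**Example** (a sketch; sample names are hypothetical unknowns):
	
		```py
		var_X  = mydata.sample_D4x_covar('MYSAMPLE-1')
		cov_XY = mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')
		```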
2489		'''
2490		if sample2 is None:
2491			sample2 = sample1
2492		if self.standardization_method == 'pooled':
2493			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2494			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2495			return self.standardization.covar[i, j]
2496		elif self.standardization_method == 'indep_sessions':
2497			if sample1 == sample2:
2498				return self.samples[sample1][f'SE_D{self._4x}']**2
2499			else:
2500				c = 0
2501				for session in self.sessions:
2502					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2503					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2504					if sdata1 and sdata2:
2505						a = self.sessions[session]['a']
2506						# !! TODO: CM below does not account for temporal changes in standardization parameters
2507						CM = self.sessions[session]['CM'][:3,:3]
2508						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2509						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2510						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2511						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2512						c += (
2513							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2514							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2515							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2516							@ CM
2517							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2518							) / a**2
2519				return float(c)
2520
2521	def sample_D4x_correl(self, sample1, sample2 = None):
2522		'''
2523		Correlation between Δ4x errors of samples
2524
2525		Returns the error correlation between the average Δ4x values of two samples.
2526		'''
2527		if sample2 is None or sample2 == sample1:
2528			return 1.
2529		return (
2530			self.sample_D4x_covar(sample1, sample2)
2531			/ self.unknowns[sample1][f'SE_D{self._4x}']
2532			/ self.unknowns[sample2][f'SE_D{self._4x}']
2533			)
2534
2535	def plot_single_session(self,
2536		session,
2537		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2538		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2539		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2540		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2541		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2542		xylimits = 'free', # | 'constant'
2543		x_label = None,
2544		y_label = None,
2545		error_contour_interval = 'auto',
2546		fig = 'new',
2547		):
2548		'''
2549		Generate plot for a single session
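	
		Returns a `_SessionPlot` object. **Example** (a sketch; the session
		name is hypothetical):
	
		```py
		sp = mydata.plot_single_session('Session01', xylimits = 'constant')
		sp.fig.savefig('Session01.pdf')
		```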
2550		'''
2551		if x_label is None:
2552			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2553		if y_label is None:
2554			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2555
2556		out = _SessionPlot()
2557		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2558		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2559		
2560		if fig == 'new':
2561			out.fig = ppl.figure(figsize = (6,6))
2562			ppl.subplots_adjust(.1,.1,.9,.9)
2563
2564		out.anchor_analyses, = ppl.plot(
2565			[r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
2566			[r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
2567			**kw_plot_anchors)
2568		out.unknown_analyses, = ppl.plot(
2569			[r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
2570			[r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
2571			**kw_plot_unknowns)
2572		out.anchor_avg = ppl.plot(
2573			np.array([ np.array([
2574				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2575				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2576				]) for sample in anchors]).T,
2577			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T,
2578			**kw_plot_anchor_avg)
2579		out.unknown_avg = ppl.plot(
2580			np.array([ np.array([
2581				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2582				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2583				]) for sample in unknowns]).T,
2584			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T,
2585			**kw_plot_unknown_avg)
2586		if xylimits == 'constant':
2587			x = [r[f'd{self._4x}'] for r in self]
2588			y = [r[f'D{self._4x}'] for r in self]
2589			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2590			w, h = x2-x1, y2-y1
2591			x1 -= w/20
2592			x2 += w/20
2593			y1 -= h/20
2594			y2 += h/20
2595			ppl.axis([x1, x2, y1, y2])
2596		elif xylimits == 'free':
2597			x1, x2, y1, y2 = ppl.axis()
2598		else:
2599			x1, x2, y1, y2 = ppl.axis(xylimits)
2600				
2601		if error_contour_interval != 'none':
2602			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2603			XI,YI = np.meshgrid(xi, yi)
2604			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2605			if error_contour_interval == 'auto':
2606				rng = np.max(SI) - np.min(SI)
2607				if rng <= 0.01:
2608					cinterval = 0.001
2609				elif rng <= 0.03:
2610					cinterval = 0.004
2611				elif rng <= 0.1:
2612					cinterval = 0.01
2613				elif rng <= 0.3:
2614					cinterval = 0.03
2615				elif rng <= 1.:
2616					cinterval = 0.1
2617				else:
2618					cinterval = 0.5
2619			else:
2620				cinterval = error_contour_interval
2621
2622			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2623			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2624			out.clabel = ppl.clabel(out.contour)
2625
2626		ppl.xlabel(x_label)
2627		ppl.ylabel(y_label)
2628		ppl.title(session, weight = 'bold')
2629		ppl.grid(alpha = .2)
2630		out.ax = ppl.gca()		
2631
2632		return out
2633
2634	def plot_residuals(
2635		self,
2636		kde = False,
2637		hist = False,
2638		binwidth = 2/3,
2639		dir = 'output',
2640		filename = None,
2641		highlight = [],
2642		colors = None,
2643		figsize = None,
2644		dpi = 100,
2645		yspan = None,
2646		):
2647		'''
2648		Plot residuals of each analysis as a function of time (actually, as a function of
2649		the order of analyses in the `D4xdata` object)
2650
2651		+ `kde`: whether to add a kernel density estimate of residuals
2652		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2653		+ `binwidth`: the width of the histogram bins, in units of the Δ4x repeatability
2654		+ `dir`: the directory in which to save the plot
		+ `filename`: the name of the file to save to; if `None` (default), return the figure
		  without saving it; if `''`, use a default file name
2655		+ `highlight`: a list of samples to highlight
2656		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2657		+ `figsize`: (width, height) of figure
2658		+ `dpi`: resolution for PNG output
2659		+ `yspan`: factor controlling the range of y values shown in plot
2660		  (by default: `yspan = 1.5 if kde else 1.0`)
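	
		**Example** (a sketch; the `filename = ''` form saves under the default
		name, e.g. `D47_residuals.pdf`):
	
		```py
		mydata.plot_residuals(filename = '', kde = True)
		```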
2661		'''
2662		
2663		from matplotlib import ticker
2664
2665		if yspan is None:
2666			if kde:
2667				yspan = 1.5
2668			else:
2669				yspan = 1.0
2670		
2671		# Layout
2672		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2673		if hist or kde:
2674			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2675			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2676		else:
2677			ppl.subplots_adjust(.08,.05,.78,.8)
2678			ax1 = ppl.subplot(111)
2679		
2680		# Colors
2681		N = len(self.anchors)
2682		if colors is None:
2683			if len(highlight) > 0:
2684				Nh = len(highlight)
2685				if Nh == 1:
2686					colors = {highlight[0]: (0,0,0)}
2687				elif Nh == 3:
2688					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2689				elif Nh == 4:
2690					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2691				else:
2692					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2693			else:
2694				if N == 3:
2695					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2696				elif N == 4:
2697					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2698				else:
2699					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2700
2701		ppl.sca(ax1)
2702		
2703		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2704
2705		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2706
2707		session = self[0]['Session']
2708		x1 = 0
2709# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2710		x_sessions = {}
2711		one_or_more_singlets = False
2712		one_or_more_multiplets = False
2713		multiplets = set()
2714		for k,r in enumerate(self):
2715			if r['Session'] != session:
2716				x2 = k-1
2717				x_sessions[session] = (x1+x2)/2
2718				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2719				session = r['Session']
2720				x1 = k
2721			singlet = len(self.samples[r['Sample']]['data']) == 1
2722			if not singlet:
2723				multiplets.add(r['Sample'])
2724			if r['Sample'] in self.unknowns:
2725				if singlet:
2726					one_or_more_singlets = True
2727				else:
2728					one_or_more_multiplets = True
2729			kw = dict(
2730				marker = 'x' if singlet else '+',
2731				ms = 4 if singlet else 5,
2732				ls = 'None',
2733				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2734				mew = 1,
2735				alpha = 0.2 if singlet else 1,
2736				)
2737			if highlight and r['Sample'] not in highlight:
2738				kw['alpha'] = 0.2
2739			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2740		x2 = k
2741		x_sessions[session] = (x1+x2)/2
2742
2743		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2744		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2745		if not (hist or kde):
2746			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2747			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2748
2749		xmin, xmax, ymin, ymax = ppl.axis()
2750		if yspan != 1:
2751			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2752		for s in x_sessions:
2753			ppl.text(
2754				x_sessions[s],
2755				ymax +1,
2756				s,
2757				va = 'bottom',
2758				**(
2759					dict(ha = 'center')
2760					if len(self.sessions[s]['data']) > (0.15 * len(self))
2761					else dict(ha = 'left', rotation = 45)
2762					)
2763				)
2764
2765		if hist or kde:
2766			ppl.sca(ax2)
2767
2768		for s in colors:
2769			kw['marker'] = '+'
2770			kw['ms'] = 5
2771			kw['mec'] = colors[s]
2772			kw['label'] = s
2773			kw['alpha'] = 1
2774			ppl.plot([], [], **kw)
2775
2776		kw['mec'] = (0,0,0)
2777
2778		if one_or_more_singlets:
2779			kw['marker'] = 'x'
2780			kw['ms'] = 4
2781			kw['alpha'] = .2
2782			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2783			ppl.plot([], [], **kw)
2784
2785		if one_or_more_multiplets:
2786			kw['marker'] = '+'
2787			kw['ms'] = 4
2788			kw['alpha'] = 1
2789			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2790			ppl.plot([], [], **kw)
2791
2792		if hist or kde:
2793			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2794		else:
2795			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2796		leg.set_zorder(-1000)
2797
2798		ppl.sca(ax1)
2799
2800		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2801		ppl.xticks([])
2802		ppl.axis([-1, len(self), None, None])
2803
2804		if hist or kde:
2805			ppl.sca(ax2)
2806			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2807
2808			if kde:
2809				from scipy.stats import gaussian_kde
2810				yi = np.linspace(ymin, ymax, 201)
2811				xi = gaussian_kde(X).evaluate(yi)
2812				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2813# 				ppl.plot(xi, yi, 'k-', lw = 1)
2814			elif hist:
2815				ppl.hist(
2816					X,
2817					orientation = 'horizontal',
2818					histtype = 'stepfilled',
2819					ec = [.4]*3,
2820					fc = [.25]*3,
2821					alpha = .25,
2822					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2823					)
2824			ppl.text(0, 0,
2825				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2826				size = 7.5,
2827				alpha = 1,
2828				va = 'center',
2829				ha = 'left',
2830				)
2831
2832			ppl.axis([0, None, ymin, ymax])
2833			ppl.xticks([])
2834			ppl.yticks([])
2835# 			ax2.spines['left'].set_visible(False)
2836			ax2.spines['right'].set_visible(False)
2837			ax2.spines['top'].set_visible(False)
2838			ax2.spines['bottom'].set_visible(False)
2839
2840		ax1.axis([None, None, ymin, ymax])
2841
2842		if not os.path.exists(dir):
2843			os.makedirs(dir)
2844		if filename is None:
2845			return fig
2846		elif filename == '':
2847			filename = f'D{self._4x}_residuals.pdf'
2848		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2849		ppl.close(fig)
2850				
2851
2852	def simulate(self, *args, **kwargs):
2853		'''
2854		Legacy function with warning message pointing to `virtual_data()`
2855		'''
2856		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2857
2858	def plot_distribution_of_analyses(
2859		self,
2860		dir = 'output',
2861		filename = None,
2862		vs_time = False,
2863		figsize = (6,4),
2864		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2865		output = None,
2866		dpi = 100,
2867		):
2868		'''
2869		Plot temporal distribution of all analyses in the data set.
2870		
2871		**Parameters**
2872
2873		+ `dir`: the directory in which to save the plot
2874		+ `filename`: the name of the file to save to (default: `D4x_distribution_of_analyses.pdf`)
2875		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially
2876		+ `figsize`: (width, height) of figure
2877		+ `subplots_adjust`: passed to `subplots_adjust()`
		+ `output`: if `'fig'`, return the figure; if `'ax'`, return the axes; otherwise save the plot to disk
		+ `dpi`: resolution for PNG output
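	
		**Example** (a sketch, assuming a `TimeTag` field is available):
	
		```py
		mydata.plot_distribution_of_analyses(vs_time = True)
		```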
2878		'''
2879
2880		asamples = [s for s in self.anchors]
2881		usamples = [s for s in self.unknowns]
2882		if output is None or output == 'fig':
2883			fig = ppl.figure(figsize = figsize)
2884			ppl.subplots_adjust(*subplots_adjust)
2885		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2886		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2887		Xmax += (Xmax-Xmin)/40
2888		Xmin -= (Xmax-Xmin)/41
2889		for k, s in enumerate(asamples + usamples):
2890			if vs_time:
2891				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2892			else:
2893				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2894			Y = [-k for x in X]
2895			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2896			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2897			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2898		ppl.axis([Xmin, Xmax, -k-1, 1])
2899		ppl.xlabel('\ntime')
2900		ppl.gca().annotate('',
2901			xy = (0.6, -0.02),
2902			xycoords = 'axes fraction',
2903			xytext = (.4, -0.02), 
2904			arrowprops = dict(arrowstyle = "->", color = 'k'),
2905			)
2906			
2907
2908		x2 = -1
2909		for session in self.sessions:
2910			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2911			if vs_time:
2912				ppl.axvline(x1, color = 'k', lw = .75)
2913			if x2 > -1:
2914				if not vs_time:
2915					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2916			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2917# 			from xlrd import xldate_as_datetime
2918# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2919			if vs_time:
2920				ppl.axvline(x2, color = 'k', lw = .75)
2921				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2922			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2923
2924		ppl.xticks([])
2925		ppl.yticks([])
2926
2927		if output is None:
2928			if not os.path.exists(dir):
2929				os.makedirs(dir)
2930			if filename == None:
2931				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2932			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2933			ppl.close(fig)
2934		elif output == 'ax':
2935			return ppl.gca()
2936		elif output == 'fig':
2937			return fig
2938
2939
2940	def plot_bulk_compositions(
2941		self,
2942		samples = None,
2943		dir = 'output/bulk_compositions',
2944		figsize = (6,6),
2945		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
2946		show = False,
2947		sample_color = (0,.5,1),
2948		analysis_color = (.7,.7,.7),
2949		labeldist = 0.3,
2950		radius = 0.05,
2951		):
2952		'''
2953		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
2954		
2955		By default, creates a directory `./output/bulk_compositions` where plots for
2956		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
2957		
2958		
2959		**Parameters**
2960
2961		+ `samples`: Only these samples are processed (by default: all samples).
2962		+ `dir`: where to save the plots
2963		+ `figsize`: (width, height) of figure
2964		+ `subplots_adjust`: passed to `subplots_adjust()`
2965		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
2966		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
2967		+ `sample_color`: color used for sample markers/labels
2968		+ `analysis_color`: color used for replicate (analysis) markers/labels
2969		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
2970		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
2971		'''
2972
2973		from matplotlib.patches import Ellipse
2974
2975		if samples is None:
2976			samples = [_ for _ in self.samples]
2977
2978		saved = {}
2979
2980		for s in samples:
2981
2982			fig = ppl.figure(figsize = figsize)
2983			fig.subplots_adjust(*subplots_adjust)
2984			ax = ppl.subplot(111)
2985			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
2986			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
2987			ppl.title(s)
2988
2989
2990			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
2991			UID = [_['UID'] for _ in self.samples[s]['data']]
2992			XY0 = XY.mean(0)
2993
2994			for xy in XY:
2995				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
2996				
2997			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
2998			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
2999			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3000			saved[s] = [XY, XY0]
3001			
3002			x1, x2, y1, y2 = ppl.axis()
3003			x0, dx = (x1+x2)/2, (x2-x1)/2
3004			y0, dy = (y1+y2)/2, (y2-y1)/2
3005			dx, dy = [max(max(dx, dy), radius)]*2
3006
3007			ppl.axis([
3008				x0 - 1.2*dx,
3009				x0 + 1.2*dx,
3010				y0 - 1.2*dy,
3011				y0 + 1.2*dy,
3012				])			
3013
3014			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3015
3016			for xy, uid in zip(XY, UID):
3017
3018				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3019				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3020
3021				if (vector_in_display_space**2).sum() > 0:
3022
3023					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3024					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3025					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3026					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3027
3028					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3029
3030				else:
3031
3032					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3033
3034			if radius:
3035				ax.add_artist(Ellipse(
3036					xy = XY0,
3037					width = radius*2,
3038					height = radius*2,
3039					ls = (0, (2,2)),
3040					lw = .7,
3041					ec = analysis_color,
3042					fc = 'None',
3043					))
3044				ppl.text(
3045					XY0[0],
3046					XY0[1]-radius,
3047					f'\n± {radius*1e3:.0f} ppm',
3048					color = analysis_color,
3049					va = 'top',
3050					ha = 'center',
3051					linespacing = 0.4,
3052					size = 8,
3053					)
3054
3055			if not os.path.exists(dir):
3056				os.makedirs(dir)
3057			fig.savefig(f'{dir}/{s}.pdf')
3058			ppl.close(fig)
3059
3060		fig = ppl.figure(figsize = figsize)
3061		fig.subplots_adjust(*subplots_adjust)
3062		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3063		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3064
3065		for s in saved:
3066			for xy in saved[s][0]:
3067				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3068			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3069			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3070			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3071
3072		x1, x2, y1, y2 = ppl.axis()
3073		ppl.axis([
3074			x1 - (x2-x1)/10,
3075			x2 + (x2-x1)/10,
3076			y1 - (y2-y1)/10,
3077			y2 + (y2-y1)/10,
3078			])			
3079
3080
3081		if not os.path.exists(dir):
3082			os.makedirs(dir)
3083		fig.savefig(f'{dir}/__all__.pdf')
3084		if show:
3085			ppl.show()
3086		ppl.close(fig)

Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.

D4xdata(l=[], mass='47', logfile='', session='mySession', verbose=False)
921	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
922		'''
923		**Parameters**
924
925		+ `l`: a list of dictionaries, with each dictionary including at least the keys
926		`Sample`, `d45`, `d46`, and `d47` or `d48`.
927		+ `mass`: `'47'` or `'48'`
928		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
929		+ `session`: define session name for analyses without a `Session` key
930		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
931
932		Returns a `D4xdata` object derived from `list`.
933		'''
934		self._4x = mass
935		self.verbose = verbose
936		self.prefix = 'D4xdata'
937		self.logfile = logfile
938		list.__init__(self, l)
939		self.Nf = None
940		self.repeatability = {}
941		self.refresh(session = session)

Parameters

  • l: a list of dictionaries, with each dictionary including at least the keys Sample, d45, d46, and d47 or d48.
  • mass: '47' or '48'
  • logfile: if specified, write detailed logs to this file path when calling D4xdata methods.
  • session: define session name for analyses without a Session key
  • verbose: if True, print out detailed logs when calling D4xdata methods.

Returns a D4xdata object derived from list.
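
For instance, a D4xdata object can be built directly from a list of dictionaries instead of a csv file. A minimal sketch (all sample names and delta values below are made up for illustration):

import D47crunch

rawdata = [
	{'Sample': 'FOO', 'd45': 6.02, 'd46': 12.01, 'd47': 17.51},
	{'Sample': 'BAR', 'd45': -5.99, 'd46': -4.82, 'd47': -11.63},
	]
mydata = D47crunch.D4xdata(rawdata, mass = '47', session = 'Session01', verbose = True)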

R13_VPDB = 0.01118

Absolute (13C/12C) ratio of VPDB. By default equal to 0.01118 (Chang & Li, 1990)

R18_VSMOW = 0.0020052

Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976)

LAMBDA_17 = 0.528

Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005)

R17_VSMOW = 0.00038475

Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)

R18_VPDB = 0.0020672007840000003

Absolute (18O/16O) ratio of VPDB. By definition equal to R18_VSMOW * 1.03092.

R17_VPDB = 0.0003909861828790272

Absolute (17O/16O) ratio of VPDB. By definition equal to R17_VSMOW * 1.03092 ** LAMBDA_17.

LEVENE_REF_SAMPLE = 'ETH-3'

After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal).

LEVENE_REF_SAMPLE (by default equal to 'ETH-3') specifies which sample should be used as a reference for this test.

ALPHA_18O_ACID_REACTION = 1.008129

Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by D4xdata.wg(), D4xdata.standardize_d13C(), and D4xdata.standardize_d18O().

By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).

Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}

Nominal δ13C_VPDB values assigned to carbonate standards, used by D4xdata.standardize_d13C().

By default equal to {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71} after Bernasconi et al. (2018).

Nominal_d18O_VPDB = {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}

Nominal δ18O_VPDB values assigned to carbonate standards, used by D4xdata.standardize_d18O().

By default equal to {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78} after Bernasconi et al. (2018).

d13C_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ13C values:

  • 'none': do not apply any δ13C standardization.
  • '1pt': within each session, offset all initial δ13C values so as to minimize the difference between final δ13C_VPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13C_VPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).

d18O_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ18O values:

  • 'none': do not apply any δ18O standardization.
  • '1pt': within each session, offset all initial δ18O values so as to minimize the difference between final δ18O_VPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18O_VPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
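
These class-level defaults may be overridden on a per-object basis before the data are processed. A minimal sketch (the acid fractionation value shown is hypothetical):

import D47crunch

mydata = D47crunch.D47data()
mydata.LEVENE_REF_SAMPLE = 'ETH-1'            # use ETH-1 rather than ETH-3 in Levene's test
mydata.ALPHA_18O_ACID_REACTION = 1.00813      # hypothetical acid fractionation factor
mydata.d13C_STANDARDIZATION_METHOD = '1pt'    # copied into each session by refresh_sessions()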
def make_verbal(oldfun):
944	def make_verbal(oldfun):
945		'''
946		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
947		'''
948		@wraps(oldfun)
949		def newfun(*args, verbose = '', **kwargs):
950			myself = args[0]
951			oldprefix = myself.prefix
952			myself.prefix = oldfun.__name__
953			if verbose != '':
954				oldverbose = myself.verbose
955				myself.verbose = verbose
956			out = oldfun(*args, **kwargs)
957			myself.prefix = oldprefix
958			if verbose != '':
959				myself.verbose = oldverbose
960			return out
961		return newfun

Decorator: allow temporarily changing self.prefix and overriding self.verbose.
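
Methods wrapped by this decorator thus accept an extra verbose keyword argument, allowing verbosity to be overridden for a single call. A minimal sketch:

# assuming mydata.verbose is False, this call alone prints detailed logs:
mydata.wg(verbose = True)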

def msg(self, txt):
964	def msg(self, txt):
965		'''
966		Log a message to `self.logfile`, and print it out if `verbose = True`
967		'''
968		self.log(txt)
969		if self.verbose:
970			print(f'{f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile, and print it out if verbose = True

def vmsg(self, txt):
973	def vmsg(self, txt):
974		'''
975		Log a message to `self.logfile` and print it out
976		'''
977		self.log(txt)
978		print(txt)

Log a message to self.logfile and print it out

def log(self, *txts):
981	def log(self, *txts):
982		'''
983		Log a message to `self.logfile`
984		'''
985		if self.logfile:
986			with open(self.logfile, 'a') as fid:
987				for txt in txts:
988					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile

def refresh(self, session='mySession'):
991	def refresh(self, session = 'mySession'):
992		'''
993		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
994		'''
995		self.fill_in_missing_info(session = session)
996		self.refresh_sessions()
997		self.refresh_samples()

Update self.sessions, self.samples, self.anchors, and self.unknowns.

def refresh_sessions(self):
1000	def refresh_sessions(self):
1001		'''
1002		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1003		to `False` for all sessions.
1004		'''
1005		self.sessions = {
1006			s: {'data': [r for r in self if r['Session'] == s]}
1007			for s in sorted({r['Session'] for r in self})
1008			}
1009		for s in self.sessions:
1010			self.sessions[s]['scrambling_drift'] = False
1011			self.sessions[s]['slope_drift'] = False
1012			self.sessions[s]['wg_drift'] = False
1013			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1014			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD

Update self.sessions and set scrambling_drift, slope_drift, and wg_drift to False for all sessions.

def refresh_samples(self):
1017	def refresh_samples(self):
1018		'''
1019		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1020		'''
1021		self.samples = {
1022			s: {'data': [r for r in self if r['Sample'] == s]}
1023			for s in sorted({r['Sample'] for r in self})
1024			}
1025		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1026		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}

Define self.samples, self.anchors, and self.unknowns.
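
A minimal sketch of inspecting these dictionaries after loading data:

print(sorted(mydata.samples))    # all sample names
print(sorted(mydata.anchors))    # samples listed in mydata.Nominal_D4x
print(sorted(mydata.unknowns))   # all remaining samples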

def read(self, filename, sep='', session=''):
1029	def read(self, filename, sep = '', session = ''):
1030		'''
1031		Read file in csv format to load data into a `D47data` object.
1032
1033		In the csv file, spaces before and after field separators (`','` by default)
1034		are optional. Each line corresponds to a single analysis.
1035
1036		The required fields are:
1037
1038		+ `UID`: a unique identifier
1039		+ `Session`: an identifier for the analytical session
1040		+ `Sample`: a sample identifier
1041		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1042
1043		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1044		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1045		and `d49` are optional, and set to NaN by default.
1046
1047		**Parameters**
1048
1049		+ `filename`: the path of the file to read
1050		+ `sep`: csv separator delimiting the fields
1051		+ `session`: set `Session` field to this string for all analyses
1052		'''
1053		with open(filename) as fid:
1054			self.input(fid.read(), sep = sep, session = session)

Read file in csv format to load data into a D47data object.

In the csv file, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • filename: the path of the file to read
  • sep: csv separator delimiting the fields
  • session: set Session field to this string for all analyses
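
For example, to read a hypothetical semicolon-delimited file while assigning all of its analyses to a single session:

import D47crunch

mydata = D47crunch.D47data()
mydata.read('mydata.csv', sep = ';', session = 'Session01')   # file name and session are hypothetical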
def input(self, txt, sep='', session=''):
1057	def input(self, txt, sep = '', session = ''):
1058		'''
1059		Read `txt` string in csv format to load analysis data into a `D47data` object.
1060
1061		In the csv string, spaces before and after field separators (`','` by default)
1062		are optional. Each line corresponds to a single analysis.
1063
1064		The required fields are:
1065
1066		+ `UID`: a unique identifier
1067		+ `Session`: an identifier for the analytical session
1068		+ `Sample`: a sample identifier
1069		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1070
1071		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1072		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1073		and `d49` are optional, and set to NaN by default.
1074
1075		**Parameters**
1076
1077		+ `txt`: the csv string to read
1078		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1079		whichever appears most often in `txt`.
1080		+ `session`: set `Session` field to this string for all analyses
1081		'''
1082		if sep == '':
1083			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1084		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1085		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1086
1087		if session != '':
1088			for r in data:
1089				r['Session'] = session
1090
1091		self += data
1092		self.refresh()

Read txt string in csv format to load analysis data into a D47data object.

In the csv string, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • txt: the csv string to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or '\t' (tab), whichever appears most often in txt.
  • session: set Session field to this string for all analyses
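
The same data may also be provided as a string. A minimal sketch (values are made up):

csv_text = '''UID,Session,Sample,d45,d46,d47
X01,Session01,FOO,6.02,12.01,17.51
X02,Session01,BAR,-5.99,-4.82,-11.63'''

mydata = D47crunch.D47data()
mydata.input(csv_text)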
@make_verbal
def wg(self, samples=None, a18_acid=None):
1095	@make_verbal
1096	def wg(self, samples = None, a18_acid = None):
1097		'''
1098		Compute bulk composition of the working gas for each session based on
1099		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1100		`self.Nominal_d18O_VPDB`.
1101		'''
1102
1103		self.msg('Computing WG composition:')
1104
1105		if a18_acid is None:
1106			a18_acid = self.ALPHA_18O_ACID_REACTION
1107		if samples is None:
1108			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1109
1110		assert a18_acid, f'Acid fractionation factor should not be zero.'
1111
1112		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1113		R45R46_standards = {}
1114		for sample in samples:
1115			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1116			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1117			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1118			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1119			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1120
1121			C12_s = 1 / (1 + R13_s)
1122			C13_s = R13_s / (1 + R13_s)
1123			C16_s = 1 / (1 + R17_s + R18_s)
1124			C17_s = R17_s / (1 + R17_s + R18_s)
1125			C18_s = R18_s / (1 + R17_s + R18_s)
1126
1127			C626_s = C12_s * C16_s ** 2
1128			C627_s = 2 * C12_s * C16_s * C17_s
1129			C628_s = 2 * C12_s * C16_s * C18_s
1130			C636_s = C13_s * C16_s ** 2
1131			C637_s = 2 * C13_s * C16_s * C17_s
1132			C727_s = C12_s * C17_s ** 2
1133
1134			R45_s = (C627_s + C636_s) / C626_s
1135			R46_s = (C628_s + C637_s + C727_s) / C626_s
1136			R45R46_standards[sample] = (R45_s, R46_s)
1137		
1138		for s in self.sessions:
1139			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1140			assert db, f'No sample from {samples} found in session "{s}".'
1141# 			dbsamples = sorted({r['Sample'] for r in db})
1142
1143			X = [r['d45'] for r in db]
1144			Y = [R45R46_standards[r['Sample']][0] for r in db]
1145			x1, x2 = np.min(X), np.max(X)
1146
1147			if x1 < x2:
1148				wgcoord = x1/(x1-x2)
1149			else:
1150				wgcoord = 999
1151
1152			if wgcoord < -.5 or wgcoord > 1.5:
1153				# unreasonable to extrapolate to d45 = 0
1154				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1155			else :
1156				# d45 = 0 is reasonably well bracketed
1157				R45_wg = np.polyfit(X, Y, 1)[1]
1158
1159			X = [r['d46'] for r in db]
1160			Y = [R45R46_standards[r['Sample']][1] for r in db]
1161			x1, x2 = np.min(X), np.max(X)
1162
1163			if x1 < x2:
1164				wgcoord = x1/(x1-x2)
1165			else:
1166				wgcoord = 999
1167
1168			if wgcoord < -.5 or wgcoord > 1.5:
1169				# unreasonable to extrapolate to d46 = 0
1170				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1171			else :
1172				# d46 = 0 is reasonably well bracketed
1173				R46_wg = np.polyfit(X, Y, 1)[1]
1174
1175			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1176
1177			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1178
1179			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1180			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1181			for r in self.sessions[s]['data']:
1182				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1183				r['d18Owg_VSMOW'] = d18Owg_VSMOW

Compute bulk composition of the working gas for each session based on the carbonate standards defined in both self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB.
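
For example, to restrict the computation to specific standards and use a custom acid fractionation factor (a sketch; the value below is hypothetical):

mydata.wg(samples = ['ETH-1', 'ETH-2'], a18_acid = 1.00813)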

def compute_bulk_delta(self, R45, R46, D17O=0):
1186	def compute_bulk_delta(self, R45, R46, D17O = 0):
1187		'''
1188		Compute δ13C_VPDB and δ18O_VSMOW,
1189		by solving the generalized form of equation (17) from
1190		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1191		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1192		solving the corresponding second-order Taylor polynomial.
1193		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1194		'''
1195
1196		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1197
1198		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1199		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1200		C = 2 * self.R18_VSMOW
1201		D = -R46
1202
1203		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1204		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1205		cc = A + B + C + D
1206
1207		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1208
1209		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1210		R17 = K * R18 ** self.LAMBDA_17
1211		R13 = R45 - 2 * R17
1212
1213		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1214
1215		return d13C_VPDB, d18O_VSMOW

Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial. (Appendix A of Daëron et al., 2016)

@make_verbal
def crunch(self, verbose=''):
1218	@make_verbal
1219	def crunch(self, verbose = ''):
1220		'''
1221		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1222		'''
1223		for r in self:
1224			self.compute_bulk_and_clumping_deltas(r)
1225		self.standardize_d13C()
1226		self.standardize_d18O()
1227		self.msg(f"Crunched {len(self)} analyses.")

Compute bulk composition and raw clumped isotope anomalies for all analyses.

def fill_in_missing_info(self, session='mySession'):
1230	def fill_in_missing_info(self, session = 'mySession'):
1231		'''
1232		Fill in optional fields with default values
1233		'''
1234		for i,r in enumerate(self):
1235			if 'D17O' not in r:
1236				r['D17O'] = 0.
1237			if 'UID' not in r:
1238				r['UID'] = f'{i+1}'
1239			if 'Session' not in r:
1240				r['Session'] = session
1241			for k in ['d47', 'd48', 'd49']:
1242				if k not in r:
1243					r[k] = np.nan

Fill in optional fields with default values

def standardize_d13C(self):
1246	def standardize_d13C(self):
1247		'''
1248		Perform δ13C standardization within each session `s` according to
1249		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1250		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1251		may be redefined arbitrarily at a later stage.
1252		'''
1253		for s in self.sessions:
1254			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1255				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1256				X,Y = zip(*XY)
1257				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1258					offset = np.mean(Y) - np.mean(X)
1259					for r in self.sessions[s]['data']:
1260						r['d13C_VPDB'] += offset				
1261				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1262					a,b = np.polyfit(X,Y,1)
1263					for r in self.sessions[s]['data']:
1264						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

Perform δ13C standardization within each session s according to self.sessions[s]['d13C_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d13C_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def standardize_d18O(self):
1266	def standardize_d18O(self):
1267		'''
1268		Perform δ18O standardization within each session `s` according to
1269		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1270		which is defined by default by `D47data.refresh_sessions()` as equal to
1271		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1272		'''
1273		for s in self.sessions:
1274			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1275				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1276				X,Y = zip(*XY)
1277				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1278				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1279					offset = np.mean(Y) - np.mean(X)
1280					for r in self.sessions[s]['data']:
1281						r['d18O_VSMOW'] += offset				
1282				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1283					a,b = np.polyfit(X,Y,1)
1284					for r in self.sessions[s]['data']:
1285						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b

Perform δ18O standardization within each session s according to self.ALPHA_18O_ACID_REACTION and self.sessions[s]['d18O_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d18O_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def compute_bulk_and_clumping_deltas(self, r):
1288	def compute_bulk_and_clumping_deltas(self, r):
1289		'''
1290		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1291		'''
1292
1293		# Compute working gas R13, R18, and isobar ratios
1294		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1295		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1296		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1297
1298		# Compute analyte isobar ratios
1299		R45 = (1 + r['d45'] / 1000) * R45_wg
1300		R46 = (1 + r['d46'] / 1000) * R46_wg
1301		R47 = (1 + r['d47'] / 1000) * R47_wg
1302		R48 = (1 + r['d48'] / 1000) * R48_wg
1303		R49 = (1 + r['d49'] / 1000) * R49_wg
1304
1305		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1306		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1307		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1308
1309		# Compute stochastic isobar ratios of the analyte
1310		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1311			R13, R18, D17O = r['D17O']
1312		)
1313
1314		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1315		# and log a warning if the corresponding anomalies exceed 0.05 ppm (5e-8).
1316		if (R45 / R45stoch - 1) > 5e-8:
1317			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1318		if (R46 / R46stoch - 1) > 5e-8:
1319			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1320
1321		# Compute raw clumped isotope anomalies
1322		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1323		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1324		r['D49raw'] = 1000 * (R49 / R49stoch - 1)

Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis r.

def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1327	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1328		'''
1329		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1330		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1331		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1332		'''
1333
1334		# Compute R17
1335		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1336
1337		# Compute isotope concentrations
1338		C12 = (1 + R13) ** -1
1339		C13 = C12 * R13
1340		C16 = (1 + R17 + R18) ** -1
1341		C17 = C16 * R17
1342		C18 = C16 * R18
1343
1344		# Compute stochastic isotopologue concentrations
1345		C626 = C16 * C12 * C16
1346		C627 = C16 * C12 * C17 * 2
1347		C628 = C16 * C12 * C18 * 2
1348		C636 = C16 * C13 * C16
1349		C637 = C16 * C13 * C17 * 2
1350		C638 = C16 * C13 * C18 * 2
1351		C727 = C17 * C12 * C17
1352		C728 = C17 * C12 * C18 * 2
1353		C737 = C17 * C13 * C17
1354		C738 = C17 * C13 * C18 * 2
1355		C828 = C18 * C12 * C18
1356		C838 = C18 * C13 * C18
1357
1358		# Compute stochastic isobar ratios
1359		R45 = (C636 + C627) / C626
1360		R46 = (C628 + C637 + C727) / C626
1361		R47 = (C638 + C728 + C737) / C626
1362		R48 = (C738 + C828) / C626
1363		R49 = C838 / C626
1364
1365		# Account for stochastic anomalies
1366		R47 *= 1 + D47 / 1000
1367		R48 *= 1 + D48 / 1000
1368		R49 *= 1 + D49 / 1000
1369
1370		# Return isobar ratios
1371		return R45, R46, R47, R48, R49

Compute isobar ratios for a sample with isotopic ratios R13 and R18, optionally accounting for non-zero values of Δ17O (D17O) and clumped isotope anomalies (D47, D48, D49), all expressed in permil.
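
For instance, the stochastic isobar ratios expected for a hypothetical bulk composition may be computed as follows (the δ values are made up; the conversions from δ to R mirror those used in compute_bulk_and_clumping_deltas()):

R13 = mydata.R13_VPDB * (1 + 2.0 / 1000)     # δ13C_VPDB = +2.0 ‰ (hypothetical)
R18 = mydata.R18_VSMOW * (1 + 37.0 / 1000)   # δ18O_VSMOW = +37.0 ‰ (hypothetical)
R45, R46, R47, R48, R49 = mydata.compute_isobar_ratios(R13, R18)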

def split_samples(self, samples_to_split='all', grouping='by_session'):
1374	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1375		'''
1376		Split unknown samples by UID (treat all analyses as different samples)
1377		or by session (treat analyses of a given sample in different sessions as
1378		different samples).
1379
1380		**Parameters**
1381
1382		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1383		+ `grouping`: `by_uid` | `by_session`
1384		'''
1385		if samples_to_split == 'all':
1386			samples_to_split = [s for s in self.unknowns]
1387		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1388		self.grouping = grouping.lower()
1389		if self.grouping in gkeys:
1390			gkey = gkeys[self.grouping]
1391		for r in self:
1392			if r['Sample'] in samples_to_split:
1393				r['Sample_original'] = r['Sample']
1394				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1395			elif r['Sample'] in self.unknowns:
1396				r['Sample_original'] = r['Sample']
1397		self.refresh_samples()

Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).

Parameters

  • samples_to_split: a list of samples to split, e.g., ['IAEA-C1', 'IAEA-C2']
  • grouping: by_uid | by_session
def unsplit_samples(self, tables=False):
1400	def unsplit_samples(self, tables = False):
1401		'''
1402		Reverse the effects of `D47data.split_samples()`.
1403		
1404		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1405		
1406		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1407		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1408		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1409		effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in
1410		that case session-averaged Δ4x values are statistically independent).
1411		'''
1412		unknowns_old = sorted({s for s in self.unknowns})
1413		CM_old = self.standardization.covar[:,:]
1414		VD_old = self.standardization.params.valuesdict().copy()
1415		vars_old = self.standardization.var_names
1416
1417		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1418
1419		Ns = len(vars_old) - len(unknowns_old)
1420		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1421		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1422
1423		W = np.zeros((len(vars_new), len(vars_old)))
1424		W[:Ns,:Ns] = np.eye(Ns)
1425		for u in unknowns_new:
1426			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1427			if self.grouping == 'by_session':
1428				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1429			elif self.grouping == 'by_uid':
1430				weights = [1 for s in splits]
1431			sw = sum(weights)
1432			weights = [w/sw for w in weights]
1433			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1434
1435		CM_new = W @ CM_old @ W.T
1436		V = W @ np.array([[VD_old[k]] for k in vars_old])
1437		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1438
1439		self.standardization.covar = CM_new
1440		self.standardization.params.valuesdict = lambda : VD_new
1441		self.standardization.var_names = vars_new
1442
1443		for r in self:
1444			if r['Sample'] in self.unknowns:
1445				r['Sample_split'] = r['Sample']
1446				r['Sample'] = r['Sample_original']
1447
1448		self.refresh_samples()
1449		self.consolidate_samples()
1450		self.repeatabilities()
1451
1452		if tables:
1453			self.table_of_analyses()
1454			self.table_of_samples()

Reverse the effects of D47data.split_samples().

This should only be used after D4xdata.standardize() with method='pooled'.

After D4xdata.standardize() with method='indep_sessions', one should probably use D4xdata.combine_samples() instead to reverse the effects of D47data.split_samples() with grouping='by_uid', or w_avg() to reverse the effects of D47data.split_samples() with grouping='by_sessions' (because in that case session-averaged Δ4x values are statistically independent).
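
A typical round trip, in sketch form (the sample name is hypothetical):

mydata.split_samples(['MYUNKNOWN'], grouping = 'by_session')   # one virtual sample per session
mydata.standardize(method = 'pooled')
mydata.unsplit_samples()   # recombine the splits into session-weighted averages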

def assign_timestamps(self):
1456	def assign_timestamps(self):
1457		'''
1458		Assign a time field `t` of type `float` to each analysis.
1459
1460		If `TimeTag` is one of the data fields, `t` is equal within a given session
1461		to `TimeTag` minus the mean value of `TimeTag` for that session.
1462		Otherwise, `TimeTag` is by default equal to the index of each analysis
1463		in the dataset and `t` is defined as above.
1464		'''
1465		for session in self.sessions:
1466			sdata = self.sessions[session]['data']
1467			try:
1468				t0 = np.mean([r['TimeTag'] for r in sdata])
1469				for r in sdata:
1470					r['t'] = r['TimeTag'] - t0
1471			except KeyError:
1472				t0 = (len(sdata)-1)/2
1473				for t,r in enumerate(sdata):
1474					r['t'] = t - t0

Assign a time field t of type float to each analysis.

If TimeTag is one of the data fields, t is equal within a given session to TimeTag minus the mean value of TimeTag for that session. Otherwise, TimeTag is by default equal to the index of each analysis in the dataset and t is defined as above.
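
For example, in a session of five analyses lacking TimeTag fields, t takes the values -2, -1, 0, +1, +2. A minimal sketch:

mydata.assign_timestamps()
for r in mydata:
	print(r['UID'], r['Session'], r['t'])   # t is centered on zero within each session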

def report(self):
1477	def report(self):
1478		'''
1479		Prints a report on the standardization fit.
1480		Only applicable after `D4xdata.standardize(method='pooled')`.
1481		'''
1482		report_fit(self.standardization)

Prints a report on the standardization fit. Only applicable after D4xdata.standardize(method='pooled').

def combine_samples(self, sample_groups):
1485	def combine_samples(self, sample_groups):
1486		'''
1487		Combine analyses of different samples to compute weighted average Δ4x
1488		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1489		dictionary.
1490		
1491		Caution: samples are weighted by number of replicate analyses, which is a
1492		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1493		correlated analytical errors for one or more samples).
1494		
1495		Returns a tuple of:
1496		
1497		+ the list of group names
1498		+ an array of the corresponding Δ4x values
1499		+ the corresponding (co)variance matrix
1500		
1501		**Parameters**
1502
1503		+ `sample_groups`: a dictionary of the form:
1504		```py
1505		{'group1': ['sample_1', 'sample_2'],
1506		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1507		```
1508		'''
1509		
1510		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1511		groups = sorted(sample_groups.keys())
1512		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1513		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1514		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1515		W = np.array([
1516			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1517			for j in groups])
1518		D4x_new = W @ D4x_old
1519		CM_new = W @ CM_old @ W.T
1520
1521		return groups, D4x_new[:,0], CM_new

Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the sample_groups dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

  • the list of group names
  • an array of the corresponding Δ4x values
  • the corresponding (co)variance matrix

Parameters

  • sample_groups: a dictionary of the form:
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
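
For example, after standardization (a sketch; group and sample names are hypothetical):

groups, D47_groups, CM_groups = mydata.combine_samples({
	'groupA': ['SAMPLE-1', 'SAMPLE-2'],
	'groupB': ['SAMPLE-3', 'SAMPLE-4', 'SAMPLE-5'],
	})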
@make_verbal
def standardize(self, method='pooled', weighted_sessions=[], consolidate=True, consolidate_tables=False, consolidate_plots=False, constraints={}):
1524	@make_verbal
1525	def standardize(self,
1526		method = 'pooled',
1527		weighted_sessions = [],
1528		consolidate = True,
1529		consolidate_tables = False,
1530		consolidate_plots = False,
1531		constraints = {},
1532		):
1533		'''
1534		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1535		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1536		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1537		i.e. that their true Δ4x value does not change between sessions,
1538		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
1539		`'indep_sessions'`, the standardization processes each session independently, based only
1540		on anchors analyses.
1541		'''
1542
1543		self.standardization_method = method
1544		self.assign_timestamps()
1545
1546		if method == 'pooled':
1547			if weighted_sessions:
1548				for session_group in weighted_sessions:
1549					if self._4x == '47':
1550						X = D47data([r for r in self if r['Session'] in session_group])
1551					elif self._4x == '48':
1552						X = D48data([r for r in self if r['Session'] in session_group])
1553					X.Nominal_D4x = self.Nominal_D4x.copy()
1554					X.refresh()
1555					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1556					w = np.sqrt(result.redchi)
1557					self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
1558					for r in X:
1559						r[f'wD{self._4x}raw'] *= w
1560			else:
1561				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1562				for r in self:
1563					r[f'wD{self._4x}raw'] = 1.
1564
1565			params = Parameters()
1566			for k,session in enumerate(self.sessions):
1567				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1568				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1569				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1570				s = pf(session)
1571				params.add(f'a_{s}', value = 0.9)
1572				params.add(f'b_{s}', value = 0.)
1573				params.add(f'c_{s}', value = -0.9)
1574				params.add(f'a2_{s}', value = 0.,
1575# 					vary = self.sessions[session]['scrambling_drift'],
1576					)
1577				params.add(f'b2_{s}', value = 0.,
1578# 					vary = self.sessions[session]['slope_drift'],
1579					)
1580				params.add(f'c2_{s}', value = 0.,
1581# 					vary = self.sessions[session]['wg_drift'],
1582					)
1583				if not self.sessions[session]['scrambling_drift']:
1584					params[f'a2_{s}'].expr = '0'
1585				if not self.sessions[session]['slope_drift']:
1586					params[f'b2_{s}'].expr = '0'
1587				if not self.sessions[session]['wg_drift']:
1588					params[f'c2_{s}'].expr = '0'
1589
1590			for sample in self.unknowns:
1591				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1592
1593			for k in constraints:
1594				params[k].expr = constraints[k]
1595
1596			def residuals(p):
1597				R = []
1598				for r in self:
1599					session = pf(r['Session'])
1600					sample = pf(r['Sample'])
1601					if r['Sample'] in self.Nominal_D4x:
1602						R += [ (
1603							r[f'D{self._4x}raw'] - (
1604								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1605								+ p[f'b_{session}'] * r[f'd{self._4x}']
1606								+	p[f'c_{session}']
1607								+ r['t'] * (
1608									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1609									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1610									+	p[f'c2_{session}']
1611									)
1612								)
1613							) / r[f'wD{self._4x}raw'] ]
1614					else:
1615						R += [ (
1616							r[f'D{self._4x}raw'] - (
1617								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1618								+ p[f'b_{session}'] * r[f'd{self._4x}']
1619								+	p[f'c_{session}']
1620								+ r['t'] * (
1621									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1622									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1623									+	p[f'c2_{session}']
1624									)
1625								)
1626							) / r[f'wD{self._4x}raw'] ]
1627				return R
1628
1629			M = Minimizer(residuals, params)
1630			result = M.least_squares()
1631			self.Nf = result.nfree
1632			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1633			new_names, new_covar, new_se = _fullcovar(result)[:3]
1634			result.var_names = new_names
1635			result.covar = new_covar
1636
1637			for r in self:
1638				s = pf(r["Session"])
1639				a = result.params.valuesdict()[f'a_{s}']
1640				b = result.params.valuesdict()[f'b_{s}']
1641				c = result.params.valuesdict()[f'c_{s}']
1642				a2 = result.params.valuesdict()[f'a2_{s}']
1643				b2 = result.params.valuesdict()[f'b2_{s}']
1644				c2 = result.params.valuesdict()[f'c2_{s}']
1645				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1646				
1647
1648			self.standardization = result
1649
1650			for session in self.sessions:
1651				self.sessions[session]['Np'] = 3
1652				for k in ['scrambling', 'slope', 'wg']:
1653					if self.sessions[session][f'{k}_drift']:
1654						self.sessions[session]['Np'] += 1
1655
1656			if consolidate:
1657				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1658			return result
1659
1660
1661		elif method == 'indep_sessions':
1662
1663			if weighted_sessions:
1664				for session_group in weighted_sessions:
1665					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1666					X.Nominal_D4x = self.Nominal_D4x.copy()
1667					X.refresh()
1668					# This is only done to assign r['wD47raw'] for r in X:
1669					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1670					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1671			else:
1672				self.msg('All weights set to 1 ‰')
1673				for r in self:
1674					r[f'wD{self._4x}raw'] = 1
1675
1676			for session in self.sessions:
1677				s = self.sessions[session]
1678				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1679				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1680				s['Np'] = sum(p_active)
1681				sdata = s['data']
1682
1683				A = np.array([
1684					[
1685						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1686						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1687						1 / r[f'wD{self._4x}raw'],
1688						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1689						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1690						r['t'] / r[f'wD{self._4x}raw']
1691						]
1692					for r in sdata if r['Sample'] in self.anchors
1693					])[:,p_active] # only keep columns for the active parameters
1694				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1695				s['Na'] = Y.size
1696				CM = linalg.inv(A.T @ A)
1697				bf = (CM @ A.T @ Y).T[0,:]
1698				k = 0
1699				for n,a in zip(p_names, p_active):
1700					if a:
1701						s[n] = bf[k]
1702# 						self.msg(f'{n} = {bf[k]}')
1703						k += 1
1704					else:
1705						s[n] = 0.
1706# 						self.msg(f'{n} = 0.0')
1707
1708				for r in sdata :
1709					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1710					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1711					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1712
1713				s['CM'] = np.zeros((6,6))
1714				i = 0
1715				k_active = [j for j,a in enumerate(p_active) if a]
1716				for j,a in enumerate(p_active):
1717					if a:
1718						s['CM'][j,k_active] = CM[i,:]
1719						i += 1
1720
1721			if not weighted_sessions:
1722				w = self.rmswd()['rmswd']
1723				for r in self:
1724						r[f'wD{self._4x}'] *= w
1725						r[f'wD{self._4x}raw'] *= w
1726				for session in self.sessions:
1727					self.sessions[session]['CM'] *= w**2
1728
1729			for session in self.sessions:
1730				s = self.sessions[session]
1731				s['SE_a'] = s['CM'][0,0]**.5
1732				s['SE_b'] = s['CM'][1,1]**.5
1733				s['SE_c'] = s['CM'][2,2]**.5
1734				s['SE_a2'] = s['CM'][3,3]**.5
1735				s['SE_b2'] = s['CM'][4,4]**.5
1736				s['SE_c2'] = s['CM'][5,5]**.5
1737
1738			if not weighted_sessions:
1739				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1740			else:
1741				self.Nf = 0
1742				for sg in weighted_sessions:
1743					self.Nf += self.rmswd(sessions = sg)['Nf']
1744
1745			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1746
1747			avgD4x = {
1748				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1749				for sample in self.samples
1750				}
1751			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1752			rD4x = (chi2/self.Nf)**.5
1753			self.repeatability[f'sigma_{self._4x}'] = rD4x
1754
1755			if consolidate:
1756				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)

Compute absolute Δ4x values for all replicate analyses and for sample averages. If method argument is set to 'pooled', the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions, (Daëron, 2021). If method argument is set to 'indep_sessions', the standardization processes each session independently, based only on anchors analyses.
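
For example, to allow a drifting working-gas offset in one session before a pooled standardization (a sketch; the session name is hypothetical):

mydata.sessions['Session01']['wg_drift'] = True
mydata.standardize(method = 'pooled')

# or, alternatively, standardize each session independently, based only on anchors:
mydata.standardize(method = 'indep_sessions')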

def standardization_error(self, session, d4x, D4x, t=0):
1759	def standardization_error(self, session, d4x, D4x, t = 0):
1760		'''
1761		Compute standardization error for a given session and
1762		(δ47, Δ47) composition.
1763		'''
1764		a = self.sessions[session]['a']
1765		b = self.sessions[session]['b']
1766		c = self.sessions[session]['c']
1767		a2 = self.sessions[session]['a2']
1768		b2 = self.sessions[session]['b2']
1769		c2 = self.sessions[session]['c2']
1770		CM = self.sessions[session]['CM']
1771
1772		x, y = D4x, d4x
1773		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1774# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1775		dxdy = -(b+b2*t) / (a+a2*t)
1776		dxdz = 1. / (a+a2*t)
1777		dxda = -x / (a+a2*t)
1778		dxdb = -y / (a+a2*t)
1779		dxdc = -1. / (a+a2*t)
1780		dxda2 = -x * a2 / (a+a2*t)
1781		dxdb2 = -y * t / (a+a2*t)
1782		dxdc2 = -t / (a+a2*t)
1783		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1784		sx = (V @ CM @ V.T) ** .5
1785		return sx

Compute standardization error for a given session and (δ47, Δ47) composition.
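
For instance, once session parameters and their covariance matrix have been computed, the standardization error at an arbitrary point in (δ47, Δ47) space may be evaluated as follows (a sketch; the session name and composition are hypothetical):

sx = mydata.standardization_error('Session01', d4x = 20.0, D4x = 0.6)
print(f'standardization error: {1000 * sx:.1f} ppm')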

@make_verbal
def summary(self, dir='output', filename=None, save_to_file=True, print_out=True):
1788	@make_verbal
1789	def summary(self,
1790		dir = 'output',
1791		filename = None,
1792		save_to_file = True,
1793		print_out = True,
1794		):
1795		'''
1796		Print out and/or save to disk a summary of the standardization results.
1797
1798		**Parameters**
1799
1800		+ `dir`: the directory in which to save the table
1801		+ `filename`: the name of the csv file to write to
1802		+ `save_to_file`: whether to save the table to disk
1803		+ `print_out`: whether to print out the table
1804		'''
1805
1806		out = []
1807		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1808		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1809		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1810		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1811		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1812		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1813		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1814		out += [['Model degrees of freedom', f"{self.Nf}"]]
1815		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1816		out += [['Standardization method', self.standardization_method]]
1817
1818		if save_to_file:
1819			if not os.path.exists(dir):
1820				os.makedirs(dir)
1821			if filename is None:
1822				filename = f'D{self._4x}_summary.csv'
1823			with open(f'{dir}/{filename}', 'w') as fid:
1824				fid.write(make_csv(out))
1825		if print_out:
1826			self.msg('\n' + pretty_table(out, header = 0))

Print out and/or save to disk a summary of the standardization results.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
@make_verbal
def table_of_sessions(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1829	@make_verbal
1830	def table_of_sessions(self,
1831		dir = 'output',
1832		filename = None,
1833		save_to_file = True,
1834		print_out = True,
1835		output = None,
1836		):
1837		'''
1838		Print out and/or save to disk a table of sessions.
1839
1840		**Parameters**
1841
1842		+ `dir`: the directory in which to save the table
1843		+ `filename`: the name of the csv file to write to
1844		+ `save_to_file`: whether to save the table to disk
1845		+ `print_out`: whether to print out the table
1846		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1847		    if set to `'raw'`: return a list of list of strings
1848		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1849		'''
1850		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1851		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1852		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1853
1854		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1855		if include_a2:
1856			out[-1] += ['a2 ± SE']
1857		if include_b2:
1858			out[-1] += ['b2 ± SE']
1859		if include_c2:
1860			out[-1] += ['c2 ± SE']
1861		for session in self.sessions:
1862			out += [[
1863				session,
1864				f"{self.sessions[session]['Na']}",
1865				f"{self.sessions[session]['Nu']}",
1866				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1867				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1868				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1869				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1870				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1871				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1872				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1873				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1874				]]
1875			if include_a2:
1876				if self.sessions[session]['scrambling_drift']:
1877					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1878				else:
1879					out[-1] += ['']
1880			if include_b2:
1881				if self.sessions[session]['slope_drift']:
1882					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1883				else:
1884					out[-1] += ['']
1885			if include_c2:
1886				if self.sessions[session]['wg_drift']:
1887					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1888				else:
1889					out[-1] += ['']
1890
1891		if save_to_file:
1892			if not os.path.exists(dir):
1893				os.makedirs(dir)
1894			if filename is None:
1895				filename = f'D{self._4x}_sessions.csv'
1896			with open(f'{dir}/{filename}', 'w') as fid:
1897				fid.write(make_csv(out))
1898		if print_out:
1899			self.msg('\n' + pretty_table(out))
1900		if output == 'raw':
1901			return out
1902		elif output == 'pretty':
1903			return pretty_table(out)

Print out and/or save to disk a table of sessions.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
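
For example, to obtain the table as a list of lists of strings, without saving or printing anything (a sketch):

rows = mydata.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')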
@make_verbal
def table_of_analyses(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1906	@make_verbal
1907	def table_of_analyses(
1908		self,
1909		dir = 'output',
1910		filename = None,
1911		save_to_file = True,
1912		print_out = True,
1913		output = None,
1914		):
1915		'''
1916		Print out and/or save to disk a table of analyses.
1917
1918		**Parameters**
1919
1920		+ `dir`: the directory in which to save the table
1921		+ `filename`: the name of the csv file to write to
1922		+ `save_to_file`: whether to save the table to disk
1923		+ `print_out`: whether to print out the table
1924		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1925		    if set to `'raw'`: return a list of list of strings
1926		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1927		'''
1928
1929		out = [['UID','Session','Sample']]
1930		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1931		for f in extra_fields:
1932			out[-1] += [f[0]]
1933		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1934		for r in self:
1935			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1936			for f in extra_fields:
1937				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1938			out[-1] += [
1939				f"{r['d13Cwg_VPDB']:.3f}",
1940				f"{r['d18Owg_VSMOW']:.3f}",
1941				f"{r['d45']:.6f}",
1942				f"{r['d46']:.6f}",
1943				f"{r['d47']:.6f}",
1944				f"{r['d48']:.6f}",
1945				f"{r['d49']:.6f}",
1946				f"{r['d13C_VPDB']:.6f}",
1947				f"{r['d18O_VSMOW']:.6f}",
1948				f"{r['D47raw']:.6f}",
1949				f"{r['D48raw']:.6f}",
1950				f"{r['D49raw']:.6f}",
1951				f"{r[f'D{self._4x}']:.6f}"
1952				]
1953		if save_to_file:
1954			if not os.path.exists(dir):
1955				os.makedirs(dir)
1956			if filename is None:
1957				filename = f'D{self._4x}_analyses.csv'
1958			with open(f'{dir}/{filename}', 'w') as fid:
1959				fid.write(make_csv(out))
1960		if print_out:
1961			self.msg('\n' + pretty_table(out))
1962		return out

Print out and/or save to disk a table of analyses.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
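For instance, assuming mydata is a standardized D47data object, a sketch of exporting the analysis table to disk without printing it:

mydata.table_of_analyses(dir = 'output', save_to_file = True, print_out = False)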
@make_verbal
def covar_table( self, correl=False, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1964	@make_verbal
1965	def covar_table(
1966		self,
1967		correl = False,
1968		dir = 'output',
1969		filename = None,
1970		save_to_file = True,
1971		print_out = True,
1972		output = None,
1973		):
1974		'''
1975		Print out, save to disk and/or return the variance-covariance matrix of D4x
1976		for all unknown samples.
1977
1978		**Parameters**
1979
1980		+ `dir`: the directory in which to save the csv
1981		+ `filename`: the name of the csv file to write to
1982		+ `save_to_file`: whether to save the csv
1983		+ `print_out`: whether to print out the matrix
1984		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
1985		    if set to `'raw'`: return a list of list of strings
1986		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1987		'''
1988		samples = sorted([u for u in self.unknowns])
1989		out = [[''] + samples]
1990		for s1 in samples:
1991			out.append([s1])
1992			for s2 in samples:
1993				if correl:
1994					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
1995				else:
1996					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
1997
1998		if save_to_file:
1999			if not os.path.exists(dir):
2000				os.makedirs(dir)
2001			if filename is None:
2002				if correl:
2003					filename = f'D{self._4x}_correl.csv'
2004				else:
2005					filename = f'D{self._4x}_covar.csv'
2006			with open(f'{dir}/{filename}', 'w') as fid:
2007				fid.write(make_csv(out))
2008		if print_out:
2009			self.msg('\n'+pretty_table(out))
2010		if output == 'raw':
2011			return out
2012		elif output == 'pretty':
2013			return pretty_table(out)

Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the matrix
  • output: if set to 'pretty': return a pretty text matrix (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
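For example, assuming mydata has been standardized and contains at least two unknowns, one might retrieve the covariance and correlation matrices as lists of strings:

covar = mydata.covar_table(save_to_file = False, print_out = False, output = 'raw')
correl = mydata.covar_table(correl = True, save_to_file = False, print_out = False, output = 'raw')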
@make_verbal
def table_of_samples( self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
2015	@make_verbal
2016	def table_of_samples(
2017		self,
2018		dir = 'output',
2019		filename = None,
2020		save_to_file = True,
2021		print_out = True,
2022		output = None,
2023		):
2024		'''
2025		Print out, save to disk and/or return a table of samples.
2026
2027		**Parameters**
2028
2029		+ `dir`: the directory in which to save the csv
2030		+ `filename`: the name of the csv file to write to
2031		+ `save_to_file`: whether to save the csv
2032		+ `print_out`: whether to print out the table
2033		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2034		    if set to `'raw'`: return a list of list of strings
2035		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2036		'''
2037
2038		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2039		for sample in self.anchors:
2040			out += [[
2041				f"{sample}",
2042				f"{self.samples[sample]['N']}",
2043				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2044				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2045				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2046				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2047				]]
2048		for sample in self.unknowns:
2049			out += [[
2050				f"{sample}",
2051				f"{self.samples[sample]['N']}",
2052				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2053				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2054				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2055				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2056				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2057				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2058				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2059				]]
2060		if save_to_file:
2061			if not os.path.exists(dir):
2062				os.makedirs(dir)
2063			if filename is None:
2064				filename = f'D{self._4x}_samples.csv'
2065			with open(f'{dir}/{filename}', 'w') as fid:
2066				fid.write(make_csv(out))
2067		if print_out:
2068			self.msg('\n'+pretty_table(out))
2069		if output == 'raw':
2070			return out
2071		elif output == 'pretty':
2072			return pretty_table(out)

Print out, save to disk and/or return a table of samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
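For instance, to get the table of samples as a pretty-printed string without writing anything to disk (assuming mydata is a standardized D47data object):

txt = mydata.table_of_samples(save_to_file = False, print_out = False, output = 'pretty')
print(txt)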
def plot_sessions(self, dir='output', figsize=(8, 8), filetype='pdf', dpi=100):
2075	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2076		'''
2077		Generate session plots and save them to disk.
2078
2079		**Parameters**
2080
2081		+ `dir`: the directory in which to save the plots
2082		+ `figsize`: the width and height (in inches) of each plot
2083		+ `filetype`: 'pdf' or 'png'
2084		+ `dpi`: resolution for PNG output
2085		'''
2086		if not os.path.exists(dir):
2087			os.makedirs(dir)
2088
2089		for session in self.sessions:
2090			sp = self.plot_single_session(session, xylimits = 'constant')
2091			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2092			ppl.close(sp.fig)

Generate session plots and save them to disk.

Parameters

  • dir: the directory in which to save the plots
  • figsize: the width and height (in inches) of each plot
  • filetype: 'pdf' or 'png'
  • dpi: resolution for PNG output
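A usage sketch, assuming mydata is a standardized D47data object; this saves one PNG plot per session under ./plots:

mydata.plot_sessions(dir = 'plots', filetype = 'png', dpi = 200)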
@make_verbal
def consolidate_samples(self):
2095	@make_verbal
2096	def consolidate_samples(self):
2097		'''
2098		Compile various statistics for each sample.
2099
2100		For each anchor sample:
2101
2102		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2103		+ `SE_D47` or `SE_D48`: set to zero by definition
2104
2105		For each unknown sample:
2106
2107		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2108		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2109
2110		For each anchor and unknown:
2111
2112		+ `N`: the total number of analyses of this sample
2113		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2114		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2115		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2116		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2117		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2118		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2119		'''
2120		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2121		for sample in self.samples:
2122			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2123			if self.samples[sample]['N'] > 1:
2124				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2125
2126			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2127			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2128
2129			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2130			if len(D4x_pop) > 2:
2131				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2132			
2133		if self.standardization_method == 'pooled':
2134			for sample in self.anchors:
2135				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2136				self.samples[sample][f'SE_D{self._4x}'] = 0.
2137			for sample in self.unknowns:
2138				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2139				try:
2140					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2141				except ValueError:
2142					# when `sample` is constrained by self.standardize(constraints = {...}),
2143					# it is no longer listed in self.standardization.var_names.
2144					# Temporary fix: define SE as zero for now
2145					self.samples[sample][f'SE_D{self._4x}'] = 0.
2146
2147		elif self.standardization_method == 'indep_sessions':
2148			for sample in self.anchors:
2149				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2150				self.samples[sample][f'SE_D{self._4x}'] = 0.
2151			for sample in self.unknowns:
2152				self.msg(f'Consolidating sample {sample}')
2153				self.unknowns[sample][f'session_D{self._4x}'] = {}
2154				session_avg = []
2155				for session in self.sessions:
2156					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2157					if sdata:
2158						self.msg(f'{sample} found in session {session}')
2159						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2160						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2161						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2162						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2163						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2164						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2165						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2166				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2167				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2168				wsum = sum([weights[s] for s in weights])
2169				for s in weights:
2170					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2171
2172		for r in self:
2173			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']

Compile various statistics for each sample.

For each anchor sample:

  • D47 or D48: the nominal Δ4x value for this anchor, specified by self.Nominal_D4x
  • SE_D47 or SE_D48: set to zero by definition

For each unknown sample:

  • D47 or D48: the standardized Δ4x value for this unknown
  • SE_D47 or SE_D48: the standard error of Δ4x for this unknown

For each anchor and unknown:

  • N: the total number of analyses of this sample
  • SD_D47 or SD_D48: the “sample” (in the statistical sense) standard deviation for this sample
  • d13C_VPDB: the average δ13C_VPDB value for this sample
  • d18O_VSMOW: the average δ18O_VSMOW value for this sample (as CO2)
  • p_Levene: the p-value from a Levene test of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by self.LEVENE_REF_SAMPLE.
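After calling this method (it is also invoked by consolidate()), the computed statistics can be read back from the samples dictionary. A sketch, using a sample name from the tutorial:

mydata.consolidate_samples()
print(mydata.samples['MYSAMPLE-1']['D47'], mydata.samples['MYSAMPLE-1']['SE_D47'])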
def consolidate_sessions(self):
2177	def consolidate_sessions(self):
2178		'''
2179		Compute various statistics for each session.
2180
2181		+ `Na`: Number of anchor analyses in the session
2182		+ `Nu`: Number of unknown analyses in the session
2183		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2184		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2185		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2186		+ `a`: scrambling factor
2187		+ `b`: compositional slope
2188		+ `c`: WG offset
2189		+ `SE_a`: Model standard error of `a`
2190		+ `SE_b`: Model standard error of `b`
2191		+ `SE_c`: Model standard error of `c`
2192		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2193		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2194		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2195		+ `a2`: scrambling factor drift
2196		+ `b2`: compositional slope drift
2197		+ `c2`: WG offset drift
2198		+ `Np`: Number of standardization parameters to fit
2199		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2200		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2201		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2202		'''
2203		for session in self.sessions:
2204			if 'd13Cwg_VPDB' not in self.sessions[session]:
2205				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2206			if 'd18Owg_VSMOW' not in self.sessions[session]:
2207				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2208			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2209			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2210
2211			self.msg(f'Computing repeatabilities for session {session}')
2212			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2213			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2214			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2215
2216		if self.standardization_method == 'pooled':
2217			for session in self.sessions:
2218
2219				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2220				i = self.standardization.var_names.index(f'a_{pf(session)}')
2221				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2222
2223				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2224				i = self.standardization.var_names.index(f'b_{pf(session)}')
2225				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2226
2227				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2228				i = self.standardization.var_names.index(f'c_{pf(session)}')
2229				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2230
2231				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2232				if self.sessions[session]['scrambling_drift']:
2233					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2234					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2235				else:
2236					self.sessions[session]['SE_a2'] = 0.
2237
2238				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2239				if self.sessions[session]['slope_drift']:
2240					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2241					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2242				else:
2243					self.sessions[session]['SE_b2'] = 0.
2244
2245				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2246				if self.sessions[session]['wg_drift']:
2247					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2248					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2249				else:
2250					self.sessions[session]['SE_c2'] = 0.
2251
2252				i = self.standardization.var_names.index(f'a_{pf(session)}')
2253				j = self.standardization.var_names.index(f'b_{pf(session)}')
2254				k = self.standardization.var_names.index(f'c_{pf(session)}')
2255				CM = np.zeros((6,6))
2256				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2257				try:
2258					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2259					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2260					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2261					try:
2262						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2263						CM[3,4] = self.standardization.covar[i2,j2]
2264						CM[4,3] = self.standardization.covar[j2,i2]
2265					except ValueError:
2266						pass
2267					try:
2268						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2269						CM[3,5] = self.standardization.covar[i2,k2]
2270						CM[5,3] = self.standardization.covar[k2,i2]
2271					except ValueError:
2272						pass
2273				except ValueError:
2274					pass
2275				try:
2276					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2277					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2278					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2279					try:
2280						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2281						CM[4,5] = self.standardization.covar[j2,k2]
2282						CM[5,4] = self.standardization.covar[k2,j2]
2283					except ValueError:
2284						pass
2285				except ValueError:
2286					pass
2287				try:
2288					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2289					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2290					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2291				except ValueError:
2292					pass
2293
2294				self.sessions[session]['CM'] = CM
2295
2296		elif self.standardization_method == 'indep_sessions':
2297			pass # Not implemented yet

Compute various statistics for each session.

  • Na: Number of anchor analyses in the session
  • Nu: Number of unknown analyses in the session
  • r_d13C_VPDB: δ13C_VPDB repeatability of analyses within the session
  • r_d18O_VSMOW: δ18O_VSMOW repeatability of analyses within the session
  • r_D47 or r_D48: Δ4x repeatability of analyses within the session
  • a: scrambling factor
  • b: compositional slope
  • c: WG offset
  • SE_a: Model standard error of a
  • SE_b: Model standard error of b
  • SE_c: Model standard error of c
  • scrambling_drift (boolean): whether to allow a temporal drift in the scrambling factor (a)
  • slope_drift (boolean): whether to allow a temporal drift in the compositional slope (b)
  • wg_drift (boolean): whether to allow a temporal drift in the WG offset (c)
  • a2: scrambling factor drift
  • b2: compositional slope drift
  • c2: WG offset drift
  • Np: Number of standardization parameters to fit
  • CM: model covariance matrix for (a, b, c, a2, b2, c2)
  • d13Cwg_VPDB: δ13C_VPDB of WG
  • d18Owg_VSMOW: δ18O_VSMOW of WG
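These statistics end up in the sessions dictionary. A sketch of listing the standardization parameters of each session, assuming mydata was standardized with the pooled method:

for session in mydata.sessions:
    s = mydata.sessions[session]
    print(session, s['Na'], s['Nu'], f"a = {s['a']:.3f} ± {s['SE_a']:.3f}")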
@make_verbal
def repeatabilities(self):
2300	@make_verbal
2301	def repeatabilities(self):
2302		'''
2303		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2304		(for all samples, for anchors, and for unknowns).
2305		'''
2306		self.msg('Computing repeatabilities for all sessions')
2307
2308		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2309		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2310		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2311		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2312		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')

Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x (for all samples, for anchors, and for unknowns).
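The results are stored in the repeatability dictionary, e.g. (for a D47data object named mydata):

mydata.repeatabilities()
print(mydata.repeatability['r_D47'])  # pooled Δ47 repeatability of all samples, in permil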

@make_verbal
def consolidate(self, tables=True, plots=True):
2315	@make_verbal
2316	def consolidate(self, tables = True, plots = True):
2317		'''
2318		Collect information about samples, sessions and repeatabilities.
2319		'''
2320		self.consolidate_samples()
2321		self.consolidate_sessions()
2322		self.repeatabilities()
2323
2324		if tables:
2325			self.summary()
2326			self.table_of_sessions()
2327			self.table_of_analyses()
2328			self.table_of_samples()
2329
2330		if plots:
2331			self.plot_sessions()

Collect information about samples, sessions and repeatabilities.
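This is typically invoked automatically at the end of standardize(); to re-generate the tables without re-creating the plots, one might call:

mydata.consolidate(tables = True, plots = False)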

@make_verbal
def rmswd(self, samples='all samples', sessions='all sessions'):
2334	@make_verbal
2335	def rmswd(self,
2336		samples = 'all samples',
2337		sessions = 'all sessions',
2338		):
2339		'''
2340		Compute the χ2, root mean squared weighted deviation
2341		(i.e. the square root of the reduced χ2), and corresponding degrees of freedom of the
2342		Δ4x values for samples in `samples` and sessions in `sessions`.
2343		
2344		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2345		'''
2346		if samples == 'all samples':
2347			mysamples = [k for k in self.samples]
2348		elif samples == 'anchors':
2349			mysamples = [k for k in self.anchors]
2350		elif samples == 'unknowns':
2351			mysamples = [k for k in self.unknowns]
2352		else:
2353			mysamples = samples
2354
2355		if sessions == 'all sessions':
2356			sessions = [k for k in self.sessions]
2357
2358		chisq, Nf = 0, 0
2359		for sample in mysamples :
2360			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2361			if len(G) > 1 :
2362				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2363				Nf += (len(G) - 1)
2364				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2365		r = (chisq / Nf)**.5 if Nf > 0 else 0
2366		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2367		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}

Compute the χ2, root mean squared weighted deviation (i.e. the square root of the reduced χ2), and corresponding degrees of freedom of the Δ4x values for samples in samples and sessions in sessions.

Only used in D4xdata.standardize() with method='indep_sessions'.
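A sketch of direct usage, assuming mydata was standardized with method = 'indep_sessions':

stats = mydata.rmswd(samples = 'anchors')
print(stats['rmswd'], stats['chisq'], stats['Nf'])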

@make_verbal
def compute_r(self, key, samples='all samples', sessions='all sessions'):
2370	@make_verbal
2371	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2372		'''
2373		Compute the repeatability of `[r[key] for r in self]`
2374		'''
2375
2376		if samples == 'all samples':
2377			mysamples = [k for k in self.samples]
2378		elif samples == 'anchors':
2379			mysamples = [k for k in self.anchors]
2380		elif samples == 'unknowns':
2381			mysamples = [k for k in self.unknowns]
2382		else:
2383			mysamples = samples
2384
2385		if sessions == 'all sessions':
2386			sessions = [k for k in self.sessions]
2387
2388		if key in ['D47', 'D48']:
2389			# Full disclosure: the definition of Nf is tricky/debatable
2390			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2391			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2392			Nf = len(G)
2393# 			print(f'len(G) = {Nf}')
2394			Nf -= len([s for s in mysamples if s in self.unknowns])
2395# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2396			for session in sessions:
2397				Np = len([
2398					_ for _ in self.standardization.params
2399					if (
2400						self.standardization.params[_].expr is not None
2401						and (
2402							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2403							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2404							)
2405						)
2406					])
2407# 				print(f'session {session}: {Np} parameters to consider')
2408				Na = len({
2409					r['Sample'] for r in self.sessions[session]['data']
2410					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2411					})
2412# 				print(f'session {session}: {Na} different anchors in that session')
2413				Nf -= min(Np, Na)
2414# 			print(f'Nf = {Nf}')
2415
2416# 			for sample in mysamples :
2417# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2418# 				if len(X) > 1 :
2419# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2420# 					if sample in self.unknowns:
2421# 						Nf += len(X) - 1
2422# 					else:
2423# 						Nf += len(X)
2424# 			if samples in ['anchors', 'all samples']:
2425# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2426			r = (chisq / Nf)**.5 if Nf > 0 else 0
2427
2428		else: # if key not in ['D47', 'D48']
2429			chisq, Nf = 0, 0
2430			for sample in mysamples :
2431				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2432				if len(X) > 1 :
2433					Nf += len(X) - 1
2434					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2435			r = (chisq / Nf)**.5 if Nf > 0 else 0
2436
2437		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2438		return r

Compute the repeatability of [r[key] for r in self]
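For example, to compute the δ13C_VPDB repeatability of anchor analyses within a single session (the session name 'Session01' below is hypothetical):

r = mydata.compute_r('d13C_VPDB', samples = 'anchors', sessions = ['Session01'])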

def sample_average(self, samples, weights='equal', normalize=True):
2440	def sample_average(self, samples, weights = 'equal', normalize = True):
2441		'''
2442		Weighted average Δ4x value of a group of samples, accounting for covariance.
2443
2444		Returns the weighted average Δ4x value and associated SE
2445		of a group of samples. Weights are equal by default. If `normalize` is
2446		true, `weights` will be rescaled so that their sum equals 1.
2447
2448		**Examples**
2449
2450		```python
2451		self.sample_average(['X','Y'], [1, 2])
2452		```
2453
2454		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2455		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2456		values of samples X and Y, respectively.
2457
2458		```python
2459		self.sample_average(['X','Y'], [1, -1], normalize = False)
2460		```
2461
2462		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2463		'''
2464		if weights == 'equal':
2465			weights = [1/len(samples)] * len(samples)
2466
2467		if normalize:
2468			s = sum(weights)
2469			if s:
2470				weights = [w/s for w in weights]
2471
2472		try:
2473# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2474# 			C = self.standardization.covar[indices,:][:,indices]
2475			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2476			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2477			return correlated_sum(X, C, weights)
2478		except ValueError:
2479			return (0., 0.)

Weighted average Δ4x value of a group of samples, accounting for covariance.

Returns the weighted average Δ4x value and associated SE of a group of samples. Weights are equal by default. If normalize is true, weights will be rescaled so that their sum equals 1.

Examples

self.sample_average(['X','Y'], [1, 2])

returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, where Δ4x(X) and Δ4x(Y) are the average Δ4x values of samples X and Y, respectively.

self.sample_average(['X','Y'], [1, -1], normalize = False)

returns the value and SE of the difference Δ4x(X) - Δ4x(Y).

def sample_D4x_covar(self, sample1, sample2=None):
2482	def sample_D4x_covar(self, sample1, sample2 = None):
2483		'''
2484		Covariance between Δ4x values of samples
2485
2486		Returns the error covariance between the average Δ4x values of two
2487		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2488		returns the Δ4x variance for that sample.
2489		'''
2490		if sample2 is None:
2491			sample2 = sample1
2492		if self.standardization_method == 'pooled':
2493			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2494			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2495			return self.standardization.covar[i, j]
2496		elif self.standardization_method == 'indep_sessions':
2497			if sample1 == sample2:
2498				return self.samples[sample1][f'SE_D{self._4x}']**2
2499			else:
2500				c = 0
2501				for session in self.sessions:
2502					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2503					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2504					if sdata1 and sdata2:
2505						a = self.sessions[session]['a']
2506						# !! TODO: CM below does not account for temporal changes in standardization parameters
2507						CM = self.sessions[session]['CM'][:3,:3]
2508						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2509						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2510						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2511						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2512						c += (
2513							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2514							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2515							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2516							@ CM
2517							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2518							) / a**2
2519				return float(c)

Covariance between Δ4x values of samples

Returns the error covariance between the average Δ4x values of two samples. If only sample1 is specified, or if sample1 == sample2, returns the Δ4x variance for that sample.
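A sketch, using the unknown sample names from the tutorial:

var = mydata.sample_D4x_covar('MYSAMPLE-1')                # Δ47 variance of one sample
cov = mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')  # covariance between two samples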

def sample_D4x_correl(self, sample1, sample2=None):
2521	def sample_D4x_correl(self, sample1, sample2 = None):
2522		'''
2523		Correlation between Δ4x errors of samples
2524
2525		Returns the error correlation between the average Δ4x values of two samples.
2526		'''
2527		if sample2 is None or sample2 == sample1:
2528			return 1.
2529		return (
2530			self.sample_D4x_covar(sample1, sample2)
2531			/ self.unknowns[sample1][f'SE_D{self._4x}']
2532			/ self.unknowns[sample2][f'SE_D{self._4x}']
2533			)

Correlation between Δ4x errors of samples

Returns the error correlation between the average Δ4x values of two samples.
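For example, again using the unknowns from the tutorial:

rho = mydata.sample_D4x_correl('MYSAMPLE-1', 'MYSAMPLE-2')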

def plot_single_session( self, session, kw_plot_anchors={'ls': 'None', 'marker': 'x', 'mec': (0.75, 0, 0), 'mew': 0.75, 'ms': 4}, kw_plot_unknowns={'ls': 'None', 'marker': 'x', 'mec': (0, 0, 0.75), 'mew': 0.75, 'ms': 4}, kw_plot_anchor_avg={'ls': '-', 'marker': 'None', 'color': (0.75, 0, 0), 'lw': 0.75}, kw_plot_unknown_avg={'ls': '-', 'marker': 'None', 'color': (0, 0, 0.75), 'lw': 0.75}, kw_contour_error={'colors': [[0, 0, 0]], 'alpha': 0.5, 'linewidths': 0.75}, xylimits='free', x_label=None, y_label=None, error_contour_interval='auto', fig='new'):
2535	def plot_single_session(self,
2536		session,
2537		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2538		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2539		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2540		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2541		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2542		xylimits = 'free', # | 'constant'
2543		x_label = None,
2544		y_label = None,
2545		error_contour_interval = 'auto',
2546		fig = 'new',
2547		):
2548		'''
2549		Generate plot for a single session
2550		'''
2551		if x_label is None:
2552			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2553		if y_label is None:
2554			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2555
2556		out = _SessionPlot()
2557		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2558		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2559		
2560		if fig == 'new':
2561			out.fig = ppl.figure(figsize = (6,6))
2562			ppl.subplots_adjust(.1,.1,.9,.9)
2563
2564		out.anchor_analyses, = ppl.plot(
2565			[r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
2566			[r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
2567			**kw_plot_anchors)
2568		out.unknown_analyses, = ppl.plot(
2569			[r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
2570			[r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
2571			**kw_plot_unknowns)
2572		out.anchor_avg = ppl.plot(
2573			np.array([ np.array([
2574				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2575				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2576				]) for sample in anchors]).T,
2577			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T,
2578			**kw_plot_anchor_avg)
2579		out.unknown_avg = ppl.plot(
2580			np.array([ np.array([
2581				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2582				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2583				]) for sample in unknowns]).T,
2584			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T,
2585			**kw_plot_unknown_avg)
2586		if xylimits == 'constant':
2587			x = [r[f'd{self._4x}'] for r in self]
2588			y = [r[f'D{self._4x}'] for r in self]
2589			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2590			w, h = x2-x1, y2-y1
2591			x1 -= w/20
2592			x2 += w/20
2593			y1 -= h/20
2594			y2 += h/20
2595			ppl.axis([x1, x2, y1, y2])
2596		elif xylimits == 'free':
2597			x1, x2, y1, y2 = ppl.axis()
2598		else:
2599			x1, x2, y1, y2 = ppl.axis(xylimits)
2600				
2601		if error_contour_interval != 'none':
2602			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2603			XI,YI = np.meshgrid(xi, yi)
2604			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2605			if error_contour_interval == 'auto':
2606				rng = np.max(SI) - np.min(SI)
2607				if rng <= 0.01:
2608					cinterval = 0.001
2609				elif rng <= 0.03:
2610					cinterval = 0.004
2611				elif rng <= 0.1:
2612					cinterval = 0.01
2613				elif rng <= 0.3:
2614					cinterval = 0.03
2615				elif rng <= 1.:
2616					cinterval = 0.1
2617				else:
2618					cinterval = 0.5
2619			else:
2620				cinterval = error_contour_interval
2621
2622			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2623			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2624			out.clabel = ppl.clabel(out.contour)
2625
2626		ppl.xlabel(x_label)
2627		ppl.ylabel(y_label)
2628		ppl.title(session, weight = 'bold')
2629		ppl.grid(alpha = .2)
2630		out.ax = ppl.gca()		
2631
2632		return out

Generate plot for a single session
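A usage sketch, assuming a session named 'Session01' exists in the data set (hypothetical name); the returned object exposes the figure through its fig attribute:

sp = mydata.plot_single_session('Session01', xylimits = 'constant')
sp.fig.savefig('Session01.pdf')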

def plot_residuals( self, kde=False, hist=False, binwidth=0.6666666666666666, dir='output', filename=None, highlight=[], colors=None, figsize=None, dpi=100, yspan=None):
2634	def plot_residuals(
2635		self,
2636		kde = False,
2637		hist = False,
2638		binwidth = 2/3,
2639		dir = 'output',
2640		filename = None,
2641		highlight = [],
2642		colors = None,
2643		figsize = None,
2644		dpi = 100,
2645		yspan = None,
2646		):
2647		'''
2648		Plot residuals of each analysis as a function of time (actually, as a function of
2649		the order of analyses in the `D4xdata` object)
2650
2651		+ `kde`: whether to add a kernel density estimate of residuals
2652		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2653		+ `binwidth`: the width of the histogram bins, in units of the Δ4x repeatability (only used if `hist` is `True`)
2654		+ `dir`: the directory in which to save the plot
2655		+ `highlight`: a list of samples to highlight
2656		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2657		+ `figsize`: (width, height) of figure
2658		+ `dpi`: resolution for PNG output
2659		+ `yspan`: factor controlling the range of y values shown in plot
2660		  (by default: `yspan = 1.5 if kde else 1.0`)
2661		'''
2662		
2663		from matplotlib import ticker
2664
2665		if yspan is None:
2666			if kde:
2667				yspan = 1.5
2668			else:
2669				yspan = 1.0
2670		
2671		# Layout
2672		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2673		if hist or kde:
2674			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2675			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2676		else:
2677			ppl.subplots_adjust(.08,.05,.78,.8)
2678			ax1 = ppl.subplot(111)
2679		
2680		# Colors
2681		N = len(self.anchors)
2682		if colors is None:
2683			if len(highlight) > 0:
2684				Nh = len(highlight)
2685				if Nh == 1:
2686					colors = {highlight[0]: (0,0,0)}
2687				elif Nh == 3:
2688					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2689				elif Nh == 4:
2690					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2691				else:
2692					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2693			else:
2694				if N == 3:
2695					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2696				elif N == 4:
2697					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2698				else:
2699					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2700
2701		ppl.sca(ax1)
2702		
2703		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2704
2705		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2706
2707		session = self[0]['Session']
2708		x1 = 0
2709# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2710		x_sessions = {}
2711		one_or_more_singlets = False
2712		one_or_more_multiplets = False
2713		multiplets = set()
2714		for k,r in enumerate(self):
2715			if r['Session'] != session:
2716				x2 = k-1
2717				x_sessions[session] = (x1+x2)/2
2718				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2719				session = r['Session']
2720				x1 = k
2721			singlet = len(self.samples[r['Sample']]['data']) == 1
2722			if not singlet:
2723				multiplets.add(r['Sample'])
2724			if r['Sample'] in self.unknowns:
2725				if singlet:
2726					one_or_more_singlets = True
2727				else:
2728					one_or_more_multiplets = True
2729			kw = dict(
2730				marker = 'x' if singlet else '+',
2731				ms = 4 if singlet else 5,
2732				ls = 'None',
2733				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2734				mew = 1,
2735				alpha = 0.2 if singlet else 1,
2736				)
2737			if highlight and r['Sample'] not in highlight:
2738				kw['alpha'] = 0.2
2739			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2740		x2 = k
2741		x_sessions[session] = (x1+x2)/2
2742
2743		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2744		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2745		if not (hist or kde):
2746			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2747			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2748
2749		xmin, xmax, ymin, ymax = ppl.axis()
2750		if yspan != 1:
2751			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2752		for s in x_sessions:
2753			ppl.text(
2754				x_sessions[s],
2755				ymax +1,
2756				s,
2757				va = 'bottom',
2758				**(
2759					dict(ha = 'center')
2760					if len(self.sessions[s]['data']) > (0.15 * len(self))
2761					else dict(ha = 'left', rotation = 45)
2762					)
2763				)
2764
2765		if hist or kde:
2766			ppl.sca(ax2)
2767
2768		for s in colors:
2769			kw['marker'] = '+'
2770			kw['ms'] = 5
2771			kw['mec'] = colors[s]
2772			kw['label'] = s
2773			kw['alpha'] = 1
2774			ppl.plot([], [], **kw)
2775
2776		kw['mec'] = (0,0,0)
2777
2778		if one_or_more_singlets:
2779			kw['marker'] = 'x'
2780			kw['ms'] = 4
2781			kw['alpha'] = .2
2782			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2783			ppl.plot([], [], **kw)
2784
2785		if one_or_more_multiplets:
2786			kw['marker'] = '+'
2787			kw['ms'] = 4
2788			kw['alpha'] = 1
2789			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2790			ppl.plot([], [], **kw)
2791
2792		if hist or kde:
2793			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2794		else:
2795			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2796		leg.set_zorder(-1000)
2797
2798		ppl.sca(ax1)
2799
2800		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2801		ppl.xticks([])
2802		ppl.axis([-1, len(self), None, None])
2803
2804		if hist or kde:
2805			ppl.sca(ax2)
2806			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2807
2808			if kde:
2809				from scipy.stats import gaussian_kde
2810				yi = np.linspace(ymin, ymax, 201)
2811				xi = gaussian_kde(X).evaluate(yi)
2812				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2813# 				ppl.plot(xi, yi, 'k-', lw = 1)
2814			elif hist:
2815				ppl.hist(
2816					X,
2817					orientation = 'horizontal',
2818					histtype = 'stepfilled',
2819					ec = [.4]*3,
2820					fc = [.25]*3,
2821					alpha = .25,
2822					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2823					)
2824			ppl.text(0, 0,
2825				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2826				size = 7.5,
2827				alpha = 1,
2828				va = 'center',
2829				ha = 'left',
2830				)
2831
2832			ppl.axis([0, None, ymin, ymax])
2833			ppl.xticks([])
2834			ppl.yticks([])
2835# 			ax2.spines['left'].set_visible(False)
2836			ax2.spines['right'].set_visible(False)
2837			ax2.spines['top'].set_visible(False)
2838			ax2.spines['bottom'].set_visible(False)
2839
2840		ax1.axis([None, None, ymin, ymax])
2841
2842		if not os.path.exists(dir):
2843			os.makedirs(dir)
2844		if filename is None:
2845			return fig
2846		elif filename == '':
2847			filename = f'D{self._4x}_residuals.pdf'
2848		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2849		ppl.close(fig)

Plot residuals of each analysis as a function of time (actually, as a function of the order of analyses in the D4xdata object)

  • kde: whether to add a kernel density estimate of residuals
  • hist: whether to add a histogram of residuals (incompatible with kde)
  • binwidth: the width of the histogram bins, in units of the Δ4x repeatability (only used if hist is True)
  • dir: the directory in which to save the plot
  • highlight: a list of samples to highlight
  • colors: a dict of {<sample>: <color>} for all samples
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
  • yspan: factor controlling the range of y values shown in plot (by default: yspan = 1.5 if kde else 1.0)
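Per the code above, leaving filename as None returns the figure instead of saving it, while filename = '' saves under a default name:

fig = mydata.plot_residuals(kde = True)            # returns a matplotlib figure
mydata.plot_residuals(hist = True, filename = '')  # saves output/D47_residuals.pdf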
def simulate(self, *args, **kwargs):
2852	def simulate(self, *args, **kwargs):
2853		'''
2854		Legacy function with warning message pointing to `virtual_data()`
2855		'''
2856		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')

Legacy function with warning message pointing to virtual_data()

def plot_distribution_of_analyses( self, dir='output', filename=None, vs_time=False, figsize=(6, 4), subplots_adjust=(0.02, 0.13, 0.85, 0.8), output=None, dpi=100):
2858	def plot_distribution_of_analyses(
2859		self,
2860		dir = 'output',
2861		filename = None,
2862		vs_time = False,
2863		figsize = (6,4),
2864		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2865		output = None,
2866		dpi = 100,
2867		):
2868		'''
2869		Plot temporal distribution of all analyses in the data set.
2870		
2871		**Parameters**
2872
2873		+ `dir`: the directory in which to save the plot
2874		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2875		+ `figsize`: (width, height) of figure
2876		+ `dpi`: resolution for PNG output
2877		+ `output`: if `None`, save the plot to disk; if `'ax'`, return the plot axes; if `'fig'`, return the figure
2878		'''
2879
2880		asamples = [s for s in self.anchors]
2881		usamples = [s for s in self.unknowns]
2882		if output is None or output == 'fig':
2883			fig = ppl.figure(figsize = figsize)
2884			ppl.subplots_adjust(*subplots_adjust)
2885		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2886		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2887		Xmax += (Xmax-Xmin)/40
2888		Xmin -= (Xmax-Xmin)/41
2889		for k, s in enumerate(asamples + usamples):
2890			if vs_time:
2891				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2892			else:
2893				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2894			Y = [-k for x in X]
2895			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2896			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2897			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2898		ppl.axis([Xmin, Xmax, -k-1, 1])
2899		ppl.xlabel('\ntime')
2900		ppl.gca().annotate('',
2901			xy = (0.6, -0.02),
2902			xycoords = 'axes fraction',
2903			xytext = (.4, -0.02), 
2904            arrowprops = dict(arrowstyle = "->", color = 'k'),
2905            )
2906			
2907
2908		x2 = -1
2909		for session in self.sessions:
2910			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2911			if vs_time:
2912				ppl.axvline(x1, color = 'k', lw = .75)
2913			if x2 > -1:
2914				if not vs_time:
2915					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2916			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2917# 			from xlrd import xldate_as_datetime
2918# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2919			if vs_time:
2920				ppl.axvline(x2, color = 'k', lw = .75)
2921				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2922			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2923
2924		ppl.xticks([])
2925		ppl.yticks([])
2926
2927		if output is None:
2928			if not os.path.exists(dir):
2929				os.makedirs(dir)
2930			if filename is None:
2931				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2932			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2933			ppl.close(fig)
2934		elif output == 'ax':
2935			return ppl.gca()
2936		elif output == 'fig':
2937			return fig

Plot temporal distribution of all analyses in the data set.

Parameters

  • dir: the directory in which to save the plot
  • vs_time: if True, plot as a function of TimeTag rather than sequentially.
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
  • output: if None, save the plot to disk; if 'ax', return the plot axes; if 'fig', return the figure
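For example, to obtain the figure for further customization rather than saving it to disk:

fig = mydata.plot_distribution_of_analyses(vs_time = False, output = 'fig')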
def plot_bulk_compositions( self, samples=None, dir='output/bulk_compositions', figsize=(6, 6), subplots_adjust=(0.15, 0.12, 0.95, 0.92), show=False, sample_color=(0, 0.5, 1), analysis_color=(0.7, 0.7, 0.7), labeldist=0.3, radius=0.05):
2940	def plot_bulk_compositions(
2941		self,
2942		samples = None,
2943		dir = 'output/bulk_compositions',
2944		figsize = (6,6),
2945		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
2946		show = False,
2947		sample_color = (0,.5,1),
2948		analysis_color = (.7,.7,.7),
2949		labeldist = 0.3,
2950		radius = 0.05,
2951		):
2952		'''
2953		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
2954		
2955		By default, creates a directory `./output/bulk_compositions` where plots for
2956		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
2957		
2958		
2959		**Parameters**
2960
2961		+ `samples`: Only these samples are processed (by default: all samples).
2962		+ `dir`: where to save the plots
2963		+ `figsize`: (width, height) of figure
2964		+ `subplots_adjust`: passed to `subplots_adjust()`
2965		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
2966		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
2967		+ `sample_color`: color used for sample markers/labels
2968		+ `analysis_color`: color used for replicate markers/labels
2969		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
2970		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
2971		'''
2972
2973		from matplotlib.patches import Ellipse
2974
2975		if samples is None:
2976			samples = [_ for _ in self.samples]
2977
2978		saved = {}
2979
2980		for s in samples:
2981
2982			fig = ppl.figure(figsize = figsize)
2983			fig.subplots_adjust(*subplots_adjust)
2984			ax = ppl.subplot(111)
2985			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
2986			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
2987			ppl.title(s)
2988
2989
2990			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
2991			UID = [_['UID'] for _ in self.samples[s]['data']]
2992			XY0 = XY.mean(0)
2993
2994			for xy in XY:
2995				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
2996				
2997			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
2998			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
2999			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3000			saved[s] = [XY, XY0]
3001			
3002			x1, x2, y1, y2 = ppl.axis()
3003			x0, dx = (x1+x2)/2, (x2-x1)/2
3004			y0, dy = (y1+y2)/2, (y2-y1)/2
3005			dx, dy = [max(max(dx, dy), radius)]*2
3006
3007			ppl.axis([
3008				x0 - 1.2*dx,
3009				x0 + 1.2*dx,
3010				y0 - 1.2*dy,
3011				y0 + 1.2*dy,
3012				])			
3013
3014			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3015
3016			for xy, uid in zip(XY, UID):
3017
3018				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3019				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3020
3021				if (vector_in_display_space**2).sum() > 0:
3022
3023					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3024					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3025					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3026					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3027
3028					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3029
3030				else:
3031
3032					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3033
3034			if radius:
3035				ax.add_artist(Ellipse(
3036					xy = XY0,
3037					width = radius*2,
3038					height = radius*2,
3039					ls = (0, (2,2)),
3040					lw = .7,
3041					ec = analysis_color,
3042					fc = 'None',
3043					))
3044				ppl.text(
3045					XY0[0],
3046					XY0[1]-radius,
3047					f'\n± {radius*1e3:.0f} ppm',
3048					color = analysis_color,
3049					va = 'top',
3050					ha = 'center',
3051					linespacing = 0.4,
3052					size = 8,
3053					)
3054
3055			if not os.path.exists(dir):
3056				os.makedirs(dir)
3057			fig.savefig(f'{dir}/{s}.pdf')
3058			ppl.close(fig)
3059
3060		fig = ppl.figure(figsize = figsize)
3061		fig.subplots_adjust(*subplots_adjust)
3062		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3063		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3064
3065		for s in saved:
3066			for xy in saved[s][0]:
3067				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3068			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3069			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3070			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3071
3072		x1, x2, y1, y2 = ppl.axis()
3073		ppl.axis([
3074			x1 - (x2-x1)/10,
3075			x2 + (x2-x1)/10,
3076			y1 - (y2-y1)/10,
3077			y2 + (y2-y1)/10,
3078			])			
3079
3080
3081		if not os.path.exists(dir):
3082			os.makedirs(dir)
3083		fig.savefig(f'{dir}/__all__.pdf')
3084		if show:
3085			ppl.show()
3086		ppl.close(fig)

Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

By default, creates a directory ./output/bulk_compositions where plots for each sample are saved. Another plot named __all__.pdf shows all analyses together.

Parameters

  • samples: Only these samples are processed (by default: all samples).
  • dir: where to save the plots
  • figsize: (width, height) of figure
  • subplots_adjust: passed to subplots_adjust()
  • show: whether to call matplotlib.pyplot.show() on the plot with all samples, allowing for interactive visualization/exploration in (δ13C, δ18O) space.
  • sample_color: color used for sample markers/labels
  • analysis_color: color used for replicate markers/labels
  • labeldist: distance (in inches) from replicate markers to replicate labels
  • radius: radius of the dashed circle providing scale. No circle if radius = 0.
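A sketch, using the unknown samples from the tutorial; one PDF per sample plus __all__.pdf is written under the default directory:

mydata.plot_bulk_compositions(samples = ['MYSAMPLE-1', 'MYSAMPLE-2'], show = False)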
class D47data(D4xdata):
3090class D47data(D4xdata):
3091	'''
3092	Store and process data for a large set of Δ47 analyses,
3093	usually comprising more than one analytical session.
3094	'''
3095
3096	Nominal_D4x = {
3097		'ETH-1':   0.2052,
3098		'ETH-2':   0.2085,
3099		'ETH-3':   0.6132,
3100		'ETH-4':   0.4511,
3101		'IAEA-C1': 0.3018,
3102		'IAEA-C2': 0.6409,
3103		'MERCK':   0.5135,
3104		} # I-CDES (Bernasconi et al., 2021)
3105	'''
3106	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3107	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3108	reference frame.
3109
3110	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3111	```py
3112	{
3113		'ETH-1'   : 0.2052,
3114		'ETH-2'   : 0.2085,
3115		'ETH-3'   : 0.6132,
3116		'ETH-4'   : 0.4511,
3117		'IAEA-C1' : 0.3018,
3118		'IAEA-C2' : 0.6409,
3119		'MERCK'   : 0.5135,
3120	}
3121	```
3122	'''
3123
3124
3125	@property
3126	def Nominal_D47(self):
3127		return self.Nominal_D4x
3128	
3129
3130	@Nominal_D47.setter
3131	def Nominal_D47(self, new):
3132		self.Nominal_D4x = dict(**new)
3133		self.refresh()
3134
3135
3136	def __init__(self, l = [], **kwargs):
3137		'''
3138		**Parameters:** same as `D4xdata.__init__()`
3139		'''
3140		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3141
3142
3143	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3144		'''
3145		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3146		value for that temperature, and treat these samples as additional anchors.
3147
3148		**Parameters**
3149
3150		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3151		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3152		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3153		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3154		if `new`: keep pre-existing anchors but update them in case of conflict
3155		between old and new Δ47 values;
3156		if `old`: keep pre-existing anchors but preserve their original Δ47
3157		values in case of conflict.
3158		'''
3159		f = {
3160			'petersen': fCO2eqD47_Petersen,
3161			'wang': fCO2eqD47_Wang,
3162			}[fCo2eqD47]
3163		foo = {}
3164		for r in self:
3165			if 'Teq' in r:
3166				if r['Sample'] in foo:
3167					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3168				else:
3169					foo[r['Sample']] = f(r['Teq'])
3170			else:
3171					assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3172
3173		if priority == 'replace':
3174			self.Nominal_D47 = {}
3175		for s in foo:
3176			if priority != 'old' or s not in self.Nominal_D47:
3177				self.Nominal_D47[s] = foo[s]

Store and process data for a large set of Δ47 analyses, usually comprising more than one analytical session.

D47data(l=[], **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6132, 'ETH-4': 0.4511, 'IAEA-C1': 0.3018, 'IAEA-C2': 0.6409, 'MERCK': 0.5135}

Nominal Δ47 values assigned to the Δ47 anchor samples, used by D47data.standardize() to normalize unknown samples to an absolute Δ47 reference frame.

By default equal to (after Bernasconi et al. (2021)):

{
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
}
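
Because Nominal_D47 is a settable property wrapping Nominal_D4x (see the class source above), the anchor set may be overridden before standardization. A minimal sketch, restricting the anchors to the three ETH standards purely as an illustration:

import D47crunch

mydata = D47crunch.D47data()
# assigning to Nominal_D47 replaces Nominal_D4x and refreshes the object;
# the restricted anchor set below is an arbitrary example, not a recommendation
mydata.Nominal_D47 = {
    'ETH-1': 0.2052,
    'ETH-2': 0.2085,
    'ETH-3': 0.6132,
    }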
def D47fromTeq(self, fCo2eqD47='petersen', priority='new'):

Find all samples for which Teq is specified, compute the equilibrium Δ47 value for that temperature, and treat these samples as additional anchors.

Parameters

  • fCo2eqD47: Which CO2 equilibrium law to use (petersen: Petersen et al. (2019); wang: Wang et al. (2004)).
  • priority: if replace: forget old anchors and only use the new ones; if new: keep pre-existing anchors but update them in case of conflict between old and new Δ47 values; if old: keep pre-existing anchors but preserve their original Δ47 values in case of conflict.
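
For example, here is a minimal sketch of using equilibrated CO2 gases as extra anchors; it assumes the raw data file provides a Teq field (the equilibration temperature, in °C) for the relevant analyses:

import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv')  # assumed to include a 'Teq' value for some analyses
mydata.wg()
mydata.crunch()

# compute equilibrium Δ47 values from Teq (in °C) using the Petersen et al. (2019)
# law and add those samples as anchors, updating any conflicting old values:
mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')

mydata.standardize()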
class D48data(D4xdata):
class D48data(D4xdata):
	'''
	Store and process data for a large set of Δ48 analyses,
	usually comprising more than one analytical session.
	'''

	Nominal_D4x = {
		'ETH-1':  0.138,
		'ETH-2':  0.138,
		'ETH-3':  0.270,
		'ETH-4':  0.223,
		'GU-1':  -0.419,
		} # (Fiebig et al., 2019, 2021)
	'''
	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
	reference frame.

	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
	Fiebig et al. (2021)):

	```py
	{
		'ETH-1' :  0.138,
		'ETH-2' :  0.138,
		'ETH-3' :  0.270,
		'ETH-4' :  0.223,
		'GU-1'  : -0.419,
	}
	```
	'''


	@property
	def Nominal_D48(self):
		return self.Nominal_D4x


	@Nominal_D48.setter
	def Nominal_D48(self, new):
		self.Nominal_D4x = dict(**new)
		self.refresh()


	def __init__(self, l = [], **kwargs):
		'''
		**Parameters:** same as `D4xdata.__init__()`
		'''
		D4xdata.__init__(self, l = l, mass = '48', **kwargs)

Store and process data for a large set of Δ48 analyses, usually comprising more than one analytical session.

D48data(l=[], **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.138, 'ETH-2': 0.138, 'ETH-3': 0.27, 'ETH-4': 0.223, 'GU-1': -0.419}

Nominal Δ48 values assigned to the Δ48 anchor samples, used by D48data.standardize() to normalize unknown samples to an absolute Δ48 reference frame.

By default equal to (after Fiebig et al. (2019), Fiebig et al. (2021)):

{
        'ETH-1' :  0.138,
        'ETH-2' :  0.138,
        'ETH-3' :  0.270,
        'ETH-4' :  0.223,
        'GU-1'  : -0.419,
}
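
Since D48data exposes the same interface as D47data (both inherit from D4xdata), the tutorial workflow carries over directly; a minimal sketch:

import D47crunch

mydata48 = D47crunch.D48data()
mydata48.read('rawdata.csv')
mydata48.wg()           # δ13C, δ18O of working gas
mydata48.crunch()       # raw Δ48 values for each analysis
mydata48.standardize()  # absolute Δ48 values, anchored to the nominal values above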