Non-Gaussian Behavior; Testing Fits¶
Introduction¶
The various analyses in the Tutorial assume implicitly that every
probability distribution relevant to a fit is Gaussian. The input
data and priors are assumed Gaussian. The chi**2 function is
assumed to be well approximated by a quadratic expansion in the vicinity of
its minimum, in order to estimate uncertainties for the best-fit
parameters. Functions of those parameters are assumed to yield
results that are described by Gaussian random variables. These assumptions
are usually pretty good for high-statistics data, when standard deviations
are small, but can lead to problems with low statistics.
Here we present three methods for testing these assumptions. Some of these techniques, like the statistical bootstrap and Bayesian integration, can also be used to analyze non-Gaussian results.
Bootstrap Error Analysis; Non-Gaussian Output¶
The bootstrap provides an efficient way to check on a fit’s validity, and also a method for analyzing non-Gaussian outputs. The strategy is to:
make a large number of “bootstrap copies” of the original input data and prior that differ from each other by random amounts characteristic of the underlying randomness in the original data and prior (see the documentation for more information);
repeat the entire fit analysis for each bootstrap copy of the data and prior, extracting fit results from each;
use the variation of the fit results from bootstrap copy to bootstrap copy to determine an approximate probability distribution (possibly non-Gaussian) for each result.
To illustrate, we return to our fit in the section
on Correlated Parameters; Gaussian Bayes Factor, where the uncertainties on the
final parameters were relatively large.
We will use a bootstrap analysis to check the error
estimates coming out of that fit.
We do this by adding code right after the
fit, in the main() function:
import numpy as np
import gvar as gv
import lsqfit

def main():
    x, y = make_data()
    prior = make_prior()
    fit = lsqfit.nonlinear_fit(prior=prior, data=(x,y), fcn=fcn)
    print(fit)
    print('p1/p0 =', fit.p[1] / fit.p[0], 'p3/p2 =', fit.p[3] / fit.p[2])
    print('corr(p0,p1) =', gv.evalcorr(fit.p[:2])[1,0])

    # bootstrap analysis: collect bootstrap data
    print('\nBootstrap Analysis:')
    Nbs = 40                                    # number of bootstrap copies
    output = {'p':[], 'p1/p0':[], 'p3/p2':[]}
    for bsfit in fit.bootstrapped_fit_iter(Nbs):
        p = bsfit.pmean
        output['p'].append(p)
        output['p1/p0'].append(p[1] / p[0])
        output['p3/p2'].append(p[3] / p[2])

    # average over bootstrap copies and tabulate results
    output = gv.dataset.avg_data(output, bstrap=True)
    print(gv.tabulate(output))
    print('corr(p0,p1) =', gv.evalcorr(output['p'][:2])[1,0])

def make_data():
    x = np.array([
        4., 2., 1., 0.5, 0.25, 0.167, 0.125, 0.1, 0.0833, 0.0714, 0.0625
        ])
    y = gv.gvar([
        '0.198(14)', '0.216(15)', '0.184(23)', '0.156(44)', '0.099(49)',
        '0.142(40)', '0.108(32)', '0.065(26)', '0.044(22)', '0.041(19)',
        '0.044(16)'
        ])
    return x, y

def make_prior():
    p = gv.gvar(['0(1)', '0(1)', '0(1)', '0(1)'])
    p[1] = 20 * p[0] + gv.gvar('0.0(1)')        # p[1] correlated with p[0]
    return p

def fcn(x, p):
    return (p[0] * (x**2 + p[1] * x)) / (x**2 + x * p[2] + p[3])

if __name__ == '__main__':
    main()
The bootstrapped_fit_iter produces fits bsfit for each of Nbs=40 different
bootstrap copies of the input data (y and the prior).
We collect the mean values for the various parameters and functions of
parameters from each fit, ignoring the uncertainties, and
then calculate averages and covariance matrices from these results using
gvar.dataset.avg_data().
Most of the bootstrap results agree with the results coming directly from the fit:
Least Square Fit:
chi2/dof [dof] = 0.61 [11] Q = 0.82 logGBF = 19.129
Parameters:
0 0.149 (17) [ 0.0 (1.0) ]
1 2.97 (34) [ 0 (20) ]
2 1.23 (61) [ 0.0 (1.0) ] *
3 0.59 (15) [ 0.0 (1.0) ]
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 19/0.0)
p1/p0 = 19.97(67) p3/p2 = 0.48(22)
corr(p0,p1) = 0.9570678206095675
Bootstrap Averages:
key/index value
-------------------------
p 0 0.146 (18)
1 2.91 (29)
2 1.14 (77)
3 0.59 (15)
p1/p0 19.86 (79)
p3/p2 0.49 (39)
corr(p0,p1) = 0.9487861077873482
In particular, the bootstrap analysis confirms the previous error estimates
(to within 10-30%, since Nbs=40), except for p3/p2, where the error
is substantially larger in the bootstrap analysis.
If p3/p2 is important, one might want to look
more closely at its distribution.
We use the bootstrap to create histograms of the probability distributions
of p3/p2 and p1/p0 by adding the following code to the end of the main()
function:
print('Histogram Analysis:')
count = {'p1/p0':[], 'p3/p2':[]}
hist = {
    'p1/p0':gv.PDFHistogram(fit.p[1] / fit.p[0]),
    'p3/p2':gv.PDFHistogram(fit.p[3] / fit.p[2]),
    }

# collect bootstrap data
for bsfit in fit.bootstrapped_fit_iter(n=1000):
    p = bsfit.pmean
    count['p1/p0'].append(hist['p1/p0'].count(p[1] / p[0]))
    count['p3/p2'].append(hist['p3/p2'].count(p[3] / p[2]))

# calculate averages and covariances
count = gv.dataset.avg_data(count)

# print histogram statistics and show plots
import matplotlib.pyplot as plt
pltnum = 1
for k in count:
    print(k + ':')
    print(hist[k].analyze(count[k]).stats)
    plt.subplot(1, 2, pltnum)
    plt.xlabel(k)
    hist[k].make_plot(count[k], plot=plt)
    if pltnum == 2:
        plt.ylabel('')
    pltnum += 1
plt.show()
Here we generate 1000 bootstrap copies (rather than 40)
to improve the accuracy of the bootstrap
results. The output from this code shows statistical analyses of the
histogram data for p1/p0 and p3/p2:
Histogram Analysis:
p1/p0:
mean = 19.967(23) sdev = 0.723(16) skew = 0.127(75) ex_kurt = -0.02(14)
median = 19.939(28) plus = 0.734(35) minus = 0.662(31)
p3/p2:
mean = 0.5164(75) sdev = 0.2385(64) skew = 0.628(84) ex_kurt = 0.84(18)
median = 0.4874(79) plus = 0.314(17) minus = 0.1481(54)
The code also displays histograms of the probability distributions, where the dashed lines show the results expected directly from the fit (that is, in the Gaussian approximation):
While the distribution for p1/p0 is consistent with the fit
results (dashed line) and Gaussian,
the distribution for p3/p2 is significantly skewed, with
a much longer tail to the right. The final result for p3/p2 might
more accurately be summarized as 0.48 with errors of +0.31 and -0.15,
although the Gaussian estimate of 0.48±0.22 would suffice for many
applications.
The skewed distribution for p3/p2 is not particularly
surprising given the ±50% uncertainty in the
denominator p2.
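If a compact summary with asymmetric errors is wanted, it can be read off the statistics object printed above. The lines below are only a sketch (not part of the original script, and reusing hist and count from the code above); they assume the object returned by hist[k].analyze(count[k]).stats exposes median, plus, and minus attributes matching the printed values:

# sketch: summarize the skewed p3/p2 distribution as median +plus/-minus
# (assumes the stats object has median, plus, and minus attributes)
stats = hist['p3/p2'].analyze(count['p3/p2']).stats
print('p3/p2 = {:.2f} +{:.2f}/-{:.2f}'.format(
    gv.mean(stats.median), gv.mean(stats.plus), gv.mean(stats.minus),
    ))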
Bayesian Integrals¶
Bayesian expectation values provide an alternative to least-squares fits.
These expectation values are integrals over the fit parameters that are
weighted by the probability density function (PDF) for the parameters,
which is proportional to exp(-chi**2/2), where chi**2 includes contributions
from both the data and the priors. They can be used to
calculate mean values of the parameters, their covariances, and the means
and covariances of any function of the parameters.
These will agree with the
best-fit results of our least-squares fits provided chi**2 is well
approximated by its quadratic expansion in the parameters — that is,
insofar as exp(-chi**2/2) is well approximated
by the Gaussian distribution in the parameters
specified by their best-fit means and covariance matrix (from fit.p).
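In LaTeX notation, the expectation values in question are ratios of parameter-space integrals,

\langle g(p) \rangle \;=\;
    \frac{\int \mathrm{d}p \, g(p)\, e^{-\chi^2(p)/2}}
         {\int \mathrm{d}p \, e^{-\chi^2(p)/2}} ,

for any function g(p) of the parameters; the denominator is the normalization of the PDF.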
Here we use
vegas.PDFIntegrator
to evaluate Bayesian expectation values.
vegas.PDFIntegrator
uses the vegas
module for adaptive
multi-dimensional integration to evaluate expectation values. It integrates
arbitrary functions of the parameters, multiplied by the probability
density function, over the entire parameter space.
To illustrate how vegas.PDFIntegrator works with lsqfit,
we again revisit the analysis in the section
on Correlated Parameters; Gaussian Bayes Factor. We modify the end of the main() function
of our original code
to evaluate the means and covariances of the parameters, and also
their probability histograms, using a Bayesian integral:
import matplotlib.pyplot as plt
import numpy as np
import gvar as gv
import lsqfit
import vegas

def main():
    x, y = make_data()
    prior = make_prior()
    fit = lsqfit.nonlinear_fit(prior=prior, data=(x,y), fcn=fcn)
    print(fit)

    # Bayesian integrator with PDF from the fit
    expval = vegas.PDFIntegrator(fit.p, pdf=fit.pdf)

    # adapt integrator expval to PDF from fit
    neval = 1000
    nitn = 10
    expval(neval=neval, nitn=nitn)

    # <g(p)> gives mean and covariance matrix, and counts for histograms
    hist = [
        gv.PDFHistogram(fit.p[0]), gv.PDFHistogram(fit.p[1]),
        gv.PDFHistogram(fit.p[2]), gv.PDFHistogram(fit.p[3]),
        ]
    def g(p):
        return dict(
            mean=p,
            outer=np.outer(p, p),
            count=[
                hist[0].count(p[0]), hist[1].count(p[1]),
                hist[2].count(p[2]), hist[3].count(p[3]),
                ],
            )

    # evaluate expectation value of g(p)
    results = expval(g, neval=neval, nitn=nitn, adapt=False)

    # analyze results
    print('\nIterations:')
    print(results.summary())
    print('Integration Results:')
    pmean = results['mean']
    pcov = results['outer'] - np.outer(pmean, pmean)
    print(' mean(p) =', pmean)
    print(' cov(p) =\n', pcov)

    # create GVars from results
    p = gv.gvar(gv.mean(pmean), gv.mean(pcov))
    print('\nBayesian Parameters:')
    print(gv.tabulate(p), '\n')
    print('logBF =', np.log(results.pdfnorm) - fit.pdf.lognorm)

    # show histograms
    print('\nHistogram Statistics:')
    count = results['count']
    for i in range(4):
        # print histogram statistics
        print('p[{}]:'.format(i))
        print(hist[i].analyze(count[i]).stats)
        # make histogram plots
        plt.subplot(2, 2, i + 1)
        plt.xlabel('p[{}]'.format(i))
        hist[i].make_plot(count[i], plot=plt)
        if i % 2 != 0:
            plt.ylabel('')
    plt.show()

def make_data():
    x = np.array([
        4., 2., 1., 0.5, 0.25, 0.167, 0.125,
        0.1, 0.0833, 0.0714, 0.0625
        ])
    y = gv.gvar([
        '0.198(14)', '0.216(15)', '0.184(23)', '0.156(44)', '0.099(49)',
        '0.142(40)', '0.108(32)', '0.065(26)', '0.044(22)', '0.041(19)',
        '0.044(16)'
        ])
    return x, y

def make_prior():
    p = gv.gvar(['0(1)', '0(1)', '0(1)', '0(1)'])
    p[1] = 20 * p[0] + gv.gvar('0.0(1)')        # p[1] correlated with p[0]
    return p

def fcn(x, p):
    return (p[0] * (x**2 + p[1] * x)) / (x**2 + x * p[2] + p[3])

if __name__ == '__main__':
    main()
Here expval is an integrator that is used to evaluate expectation
values of arbitrary functions of the fit parameters.
PDFIntegrator uses output (fit.p) from the least-squares fit to
design a vegas integrator
optimized for calculating expectation values.
The integrator uses an
iterative Monte Carlo algorithm that adapts to the
probability density function (fit.pdf(p)) after each iteration.
See the vegas documentation for much more information.
We first call the integrator without a function. This allows it
to adapt to the probability density function from the fit without the extra
overhead of evaluating a function of the parameters. The integrator
uses nitn=10 iterations of the vegas algorithm, with at most
neval=1000 evaluations of the probability density function for each
iteration.
We then use the optimized integrator to evaluate the expectation value
of function g(p), turning adaptation off with adapt=False (since
it is unneeded).
The expectation value of g(p) is returned
in the dictionary results.
The results from this script are:
Least Square Fit:
chi2/dof [dof] = 0.61 [11] Q = 0.82 logGBF = 19.129
Parameters:
0 0.149 (17) [ 0.0 (1.0) ]
1 2.97 (34) [ 0 (20) ]
2 1.23 (61) [ 0.0 (1.0) ] *
3 0.59 (15) [ 0.0 (1.0) ]
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 19/0.0)
Iterations:
itn integral average chi2/dof Q
-------------------------------------------------------
1 0.001288(19) 0.001288(19) 0.00 1.00
2 0.001303(23) 0.001296(15) 0.76 0.92
3 0.001311(19) 0.001301(12) 0.67 1.00
4 0.001302(19) 0.001301(10) 0.64 1.00
5 0.001422(71) 0.001325(16) 0.68 1.00
6 0.001350(19) 0.001329(14) 0.70 1.00
7 0.001301(22) 0.001325(12) 0.71 1.00
8 0.001357(40) 0.001329(12) 0.76 1.00
9 0.001265(21) 0.001322(11) 0.76 1.00
10 0.001297(19) 0.0013196(99) 0.75 1.00
Integration Results:
mean(p) = [0.15490(18) 3.0953(37) 1.4430(68) 0.6900(36)]
cov(p) =
[[0.0002303(44) 0.004469(85) 0.00791(15) 0.001280(96)]
[0.004469(85) 0.0966(17) 0.1595(32) 0.0268(18)]
[0.00791(15) 0.1595(32) 0.3213(70) 0.0269(29)]
[0.001280(96) 0.0268(18) 0.0269(29) 0.0316(22)]]
Bayesian Parameters:
index value
---------------------
0 0.155 (15)
1 3.10 (31)
2 1.44 (57)
3 0.69 (18)
logBF = 19.1514(75)
Histogram Statistics:
p[0] -
mean = 0.15496(22) sdev = 0.01607(20) skew = 0.106(45) ex_kurt = 0.74(29)
median = 0.15446(17) plus = 0.01586(31) minus = 0.01510(15)
p[1] -
mean = 3.0955(35) sdev = 0.3262(27) skew = 0.027(66) ex_kurt = 1.13(54)
median = 3.0883(37) plus = 0.3203(49) minus = 0.3102(34)
p[2] -
mean = 1.4414(67) sdev = 0.5886(56) skew = 0.285(29) ex_kurt = -0.053(55)
median = 1.4107(64) plus = 0.597(10) minus = 0.5438(57)
p[3] -
mean = 0.6706(38) sdev = 0.1848(58) skew = -0.24(17) ex_kurt = 1.48(29)
median = 0.6657(20) plus = 0.1924(37) minus = 0.1481(15)
The iterations table shows results from each of the nitn=10
vegas iterations used to
evaluate the expectation values. Estimates for the integral of the probability
density function are listed for each iteration. (Results from the integrator
are approximate, with error estimates.) These are consistent with each other
and with the (more accurate) overall average.
The integration results show that the Bayesian estimates for the means
of the parameters are accurate to roughly 1% or better, which is
sufficiently accurate here given the size of the standard deviations.
Estimates for
the covariance matrix elements are less accurate, which is typical.
This information is converted into gvar.GVars for the parameters and
tabulated under “Bayesian Parameters,” for comparison with the
original fit results — the agreement is pretty good.
This is further confirmed by the
(posterior) probability distributions
for each parameter:

The means are shifted slightly from the fit results and there
is modest skewing, but the differences are not great. The
logarithm of the Bayes Factor (logBF), calculated from
the integral of fit.pdf(p), agrees well with
fit.logGBF, which is the same quantity evaluated
in the Gaussian approximation (implicit in
least-squares fitting). This is further evidence
that the Gaussian approximation is reasonable
for this problem.
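For example, the two estimates can be printed side by side by appending the following lines to the script above (reusing its results and fit objects):

# compare the integral-based log(Bayes Factor) with the Gaussian approximation
logBF = np.log(results.pdfnorm) - fit.pdf.lognorm
print('logBF =', logBF, '   fit.logGBF =', fit.logGBF)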
As a second example of Bayesian integration, we return briefly to
the problem described in Positive Parameters; Non-Gaussian Priors: we want the
average a of noisy data, subject to the constraint that the average
must be positive. The constraint is likely to
introduce strong distortions in the probability density function (PDF)
given that the fit analysis suggests a value of 0.011±0.013.
We plot the actual PDF using the following code, beginning with
a fit that uses a flat prior (between 0 and 0.04):
import gvar as gv
import lsqfit
import vegas

# data, prior, and fit function
y = gv.gvar([
    '-0.17(20)', '-0.03(20)', '-0.39(20)', '0.10(20)', '-0.03(20)',
    '0.06(20)', '-0.23(20)', '-0.23(20)', '-0.15(20)', '-0.01(20)',
    '-0.12(20)', '0.05(20)', '-0.09(20)', '-0.36(20)', '0.09(20)',
    '-0.07(20)', '-0.31(20)', '0.12(20)', '0.11(20)', '0.13(20)'
    ])
prior = {}
prior['ga(a)'] = gv.BufferDict.uniform('ga', 0., 0.04)

def fcn(p, N=len(y)):
    return N * [p['a']]

# least-squares fit
fit = lsqfit.nonlinear_fit(prior=prior, data=y, fcn=fcn)
print(fit)
a = fit.p['a']
print('a =', a)

# Bayesian analysis: histogram for a
hist = gv.PDFHistogram(a, nbin=16, binwidth=0.5)

def g(p):
    a = p['a']
    return hist.count(a)

expval = vegas.PDFIntegrator(fit.p, pdf=fit.pdf)
expval(neval=1009, nitn=10)
count = expval(g, neval=1000, nitn=10, adapt=False)

# print out results and show plot
print('\nHistogram Analysis:')
print(hist.analyze(count).stats)
hist.make_plot(count, show=True)
The output from this script is
Least Square Fit:
chi2/dof [dof] = 0.85 [20] Q = 0.65 logGBF = 5.2385
Parameters:
ga(a) -0.59 (96) [ 0.0 (1.0) ]
-----------------------------------------------
a 0.011 (13) [ 0.020 (16) ]
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 15/0.0)
a = 0.011(13)
Histogram Analysis:
mean = 0.014054(22) sdev = 0.010659(21) skew = 0.6181(26) ex_kurt = -0.5593(50)
median = 0.011867(19) plus = 0.014385(31) minus = 0.008761(18)
and the probability distribution for a
looks like

This distribution is distorted between a=0 and the mean
value, but otherwise is fairly similar to the Gaussian result
0.011±0.013 (dashed line). A more accurate summary of the result
for a would be 0.012 with errors of +0.014 and -0.009, though
again the Gaussian result is not terribly misleading even in this
case.
The Bayesian integrals are relatively simple in these examples. More
complicated problems can require much more computer time to evaluate
the integrals, with hundreds of thousands or millions of integrand
evaluations per iteration (neval). This is particularly true
as the number of parameters increases. PDFIntegrator uses
information from the least-squares fit to simplify the integration
for vegas by optimizing the integration variables used, but
integrals over tens of variables are intrinsically challenging.
PDFIntegrator can be used with MPI to run such integrals
on multiple processors, for a considerable speed-up.
We used Bayesian integrals here to deal with non-Gaussian behavior in fit outputs. The case study Case Study: Outliers and Bayesian Integrals shows how to use them when the input data is not quite Gaussian.
Testing Fits with Simulated Data¶
Ideally we would test a fitting protocol by doing fits of data similar to our actual fit but where we know the correct values for the fit parameters ahead of the fit. The simulated_fit_iter method
returns an iterator that creates any number of such simulations of the original fit.
A key assumption underlying least-squares fitting is that the fit
data y[i]
are random samples from a distribution whose mean
is the fit function fcn(x, fitp)
evaluated with the best-fit
values fitp
for the parameters. simulated_fit_iter
iterators
generate simulated data by drawing other random samples from the
same distribution, assigning them the same covariance matrix as the
original data. The simulated data are fit
using the same priors and fitter settings
as in the original fit, and the results (an lsqfit.nonlinear_fit
object) are returned by the iterator. The fits with simulated data should
have good chi**2
values, and the results from these fits
should agree, within errors, with the original fit results since the
simulated data are from the same distribution as the original data. There
is a problem with the fitting protocol if this is not the case most of the
time.
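Conceptually, a single simulated data set can be sketched as follows, with x, y, fcn, and fit as in the scripts of this section (the iterator does this, together with the refitting, internally):

import gvar as gv

# conceptual sketch of one simulated data set (illustration only)
pexact = fit.pmean                         # treat best-fit means as 'exact' values
noise = gv.sample(y) - gv.mean(y)          # random fluctuations with y's covariance
ysim = gv.gvar(fcn(x, pexact) + noise, gv.evalcov(y))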
To illustrate we again examine the fits
in the section on Correlated Parameters; Gaussian Bayes Factor:
we add three fit simulations at the end of the main()
function:
import numpy as np
import gvar as gv
import lsqfit

def main():
    x, y = make_data()
    prior = make_prior()
    fit = lsqfit.nonlinear_fit(prior=prior, data=(x,y), fcn=fcn)
    print(40 * '*' + ' real fit')
    print(fit.format(True))

    # 3 simulated fits
    for sfit in fit.simulated_fit_iter(n=3):
        # print simulated fit details
        print(40 * '=' + ' simulation')
        print(sfit.format(True))

        # compare simulated fit results with exact values (pexact=fit.pmean)
        diff = sfit.p - sfit.pexact
        print('\nsfit.p - pexact =', diff)
        print(gv.fmt_chi2(gv.chi2(diff)))
        print()

def make_data():
    x = np.array([
        4., 2., 1., 0.5, 0.25, 0.167, 0.125,
        0.1, 0.0833, 0.0714, 0.0625
        ])
    y = gv.gvar([
        '0.198(14)', '0.216(15)', '0.184(23)', '0.156(44)', '0.099(49)',
        '0.142(40)', '0.108(32)', '0.065(26)', '0.044(22)', '0.041(19)',
        '0.044(16)'
        ])
    return x, y

def make_prior():
    p = gv.gvar(['0(1)', '0(1)', '0(1)', '0(1)'])
    p[1] = 20 * p[0] + gv.gvar('0.0(1)')        # p[1] correlated with p[0]
    return p

def fcn(x, p):
    return (p[0] * (x**2 + p[1] * x)) / (x**2 + x * p[2] + p[3])

if __name__ == '__main__':
    main()
This code produces the following output, showing how the input data fluctuate from simulation to simulation:
**************************************** real fit
Least Square Fit:
chi2/dof [dof] = 0.61 [11] Q = 0.82 logGBF = 19.129
Parameters:
0 0.149 (17) [ 0.0 (1.0) ]
1 2.97 (34) [ 0 (20) ]
2 1.23 (61) [ 0.0 (1.0) ] *
3 0.59 (15) [ 0.0 (1.0) ]
Fit:
x[k] y[k] f(x[k],p)
--------------------------------------
4 0.198 (14) 0.193 (11)
2 0.216 (15) 0.210 (10)
1 0.184 (23) 0.209 (15) *
0.5 0.156 (44) 0.177 (15)
0.25 0.099 (49) 0.124 (13)
0.167 0.142 (40) 0.094 (12) *
0.125 0.108 (32) 0.075 (11) *
0.1 0.065 (26) 0.0629 (96)
0.0833 0.044 (22) 0.0538 (87)
0.0714 0.041 (19) 0.0471 (79)
0.0625 0.044 (16) 0.0418 (72)
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 19/0.0)
======================================== simulation
Least Square Fit:
chi2/dof [dof] = 1.2 [11] Q = 0.27 logGBF = 15.278
Parameters:
0 0.134 (14) [ 0.0 (1.0) ]
1 2.68 (29) [ 0 (20) ]
2 0.68 (47) [ 0.0 (1.0) ]
3 0.54 (12) [ 0.0 (1.0) ]
Fit:
x[k] y[k] f(x[k],p)
--------------------------------------
4 0.200 (14) 0.186 (12)
2 0.192 (15) 0.212 (10) *
1 0.242 (23) 0.221 (16)
0.5 0.163 (44) 0.187 (18)
0.25 0.089 (49) 0.126 (15)
0.167 0.130 (40) 0.093 (13)
0.125 0.103 (32) 0.073 (11)
0.1 0.046 (26) 0.0599 (96)
0.0833 0.054 (22) 0.0508 (85)
0.0714 0.004 (19) 0.0441 (77) **
0.0625 0.060 (16) 0.0389 (70) *
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 7/0.0)
sfit.p - pexact = [-0.015(14) -0.29(29) -0.54(47) -0.05(12)]
chi2/dof = 0.34 [4] Q = 0.85
======================================== simulation
Least Square Fit:
chi2/dof [dof] = 1.1 [11] Q = 0.38 logGBF = 17.048
Parameters:
0 0.156 (18) [ 0.0 (1.0) ]
1 3.12 (36) [ 0 (20) ]
2 1.35 (66) [ 0.0 (1.0) ] *
3 0.77 (20) [ 0.0 (1.0) ]
Fit:
x[k] y[k] f(x[k],p)
--------------------------------------
4 0.207 (14) 0.201 (11)
2 0.224 (15) 0.214 (10)
1 0.163 (23) 0.206 (14) *
0.5 0.162 (44) 0.167 (15)
0.25 0.124 (49) 0.113 (14)
0.167 0.111 (40) 0.084 (12)
0.125 0.085 (32) 0.066 (11)
0.1 0.097 (26) 0.0550 (93) *
0.0833 0.020 (22) 0.0469 (83) *
0.0714 0.043 (19) 0.0409 (75)
0.0625 0.031 (16) 0.0362 (68)
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 23/0.0)
sfit.p - pexact = [0.008(18) 0.15(36) 0.13(66) 0.18(20)]
chi2/dof = 0.22 [4] Q = 0.93
======================================== simulation
Least Square Fit:
chi2/dof [dof] = 0.76 [11] Q = 0.68 logGBF = 17.709
Parameters:
0 0.138 (14) [ 0.0 (1.0) ]
1 2.77 (29) [ 0 (20) ]
2 0.72 (46) [ 0.0 (1.0) ]
3 0.53 (11) [ 0.0 (1.0) ]
Fit:
x[k] y[k] f(x[k],p)
--------------------------------------
4 0.196 (14) 0.193 (12)
2 0.218 (15) 0.221 (10)
1 0.240 (23) 0.231 (16)
0.5 0.157 (44) 0.198 (18)
0.25 0.157 (49) 0.134 (14)
0.167 0.022 (40) 0.099 (12) *
0.125 0.070 (32) 0.078 (11)
0.1 0.090 (26) 0.0644 (96)
0.0833 0.045 (22) 0.0547 (86)
0.0714 0.053 (19) 0.0475 (78)
0.0625 0.059 (16) 0.0420 (70) *
Settings:
svdcut/n = 1e-12/0 tol = (1e-08*,1e-10,1e-10) (itns/time = 10/0.0)
sfit.p - pexact = [-0.010(14) -0.21(29) -0.51(46) -0.06(11)]
chi2/dof = 0.77 [4] Q = 0.54
The parameters sfit.p produced by the simulated fits agree well
with the original fit parameters pexact=fit.pmean, with good
fits in each case. We calculate the chi**2 for the difference
sfit.p - pexact in
each case; good chi**2 values validate the parameter values, standard
deviations, and correlations.
Goodness of Fit¶
The quality of a fit is often judged by the value of
chi**2/N
, where N
is the
number of degrees of freedom.
Conventionally we expect chi**2/N
to be of order 1 ± sqrt(2/N)
since fluctuations in the mean values of the data are of order the
uncertainties in the data. More precisely the means are
assumed to be random samples drawn from a Gaussian distribution whose
means are given by the best-fit function and whose covariance matrix
comes from
the data. There are two situations where this measure of goodness-of-fit
becomes unreliable.
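As a quick numerical check (not part of any script in this section), the 11-degree-of-freedom fits used earlier have a fairly broad expected range:

import numpy as np

# expected spread of chi**2/N for a good fit with N degrees of freedom
N = 11
print('chi2/N = 1 +- {:.2f}'.format(np.sqrt(2. / N)))    # prints: chi2/N = 1 +- 0.43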
The first situation is when there is a large SVD cut on the data.
As discussed in SVD Cuts and Inadequate Statistics, an SVD cut increases
the uncertainties
in the data without increasing the random fluctuations in the
data means. As a result contributions
from the parts of the chi**2
function affected by the SVD cut
tend to be much smaller than naively expected, artificially
pulling chi**2/N
down.
The second situation that compromises chi**2
is when some or all of
the priors used in a fit are broad — that is, when a fit result for
a parameter has a much
smaller uncertainty than the corresponding prior, but a mean that
is artificially
close to the prior’s mean. This often arises when the means
used in the priors are not random samples, which is
frequently the case (unlike for the fit data). Again contributions to chi**2
associated with such priors tend to be much smaller than naively expected,
pulling chi**2
down.
These complications can conspire to make chi**2/N
significantly less than 1 when the fit is good. Of greater concern,
they can mask evidence of a bad fit: chi**2/N ≈ 1
is not
necessarily evidence of a good fit in such situations.
A simple way to address these situations is to redo the fit with
keyword parameter noise=True.
This causes lsqfit.nonlinear_fit to
add extra fluctuations to the means in the prior and the data
that are characteristic of the probability distributions associated
with the priors and the SVD cut, respectively:
prior => prior + (gv.sample(prior) - gv.mean(prior))
y => y + gv.sample(y.correction)
These fluctuations
should leave fit results unchanged (within errors) but increase
chi**2/N
so it is of order one.
To add this test to the fit from SVD Cuts and Inadequate Statistics, we modify the code to include a second fit at the end:
import numpy as np
import gvar as gv
import lsqfit

def main():
    ysamples = [
        [0.0092441016, 0.0068974057, 0.0051480509, 0.0038431422, 0.0028690492],
        [0.0092477405, 0.0069030565, 0.0051531383, 0.0038455855, 0.0028700587],
        [0.0092558569, 0.0069102437, 0.0051596569, 0.0038514537, 0.0028749153],
        [0.0092294581, 0.0068865156, 0.0051395262, 0.003835656, 0.0028630454],
        [0.009240534, 0.0068961523, 0.0051480046, 0.0038424661, 0.0028675632],
        ]
    y = gv.dataset.avg_data(ysamples)
    x = np.array([15., 16., 17., 18., 19.])
    def fcn(p):
        return p['a'] * gv.exp(- p['b'] * x)
    prior = gv.gvar(dict(a='0.75(5)', b='0.30(3)'))
    fit = lsqfit.nonlinear_fit(data=y, prior=prior, fcn=fcn, svdcut=0.0028)
    print(fit.format(True))

    print('\n================ Add noise to prior, SVD')
    noisyfit = lsqfit.nonlinear_fit(
        data=y, prior=prior, fcn=fcn, svdcut=0.0028, noise=True,
        )
    print(noisyfit.format(True))

if __name__ == '__main__':
    main()
Running this code gives the following output:
Least Square Fit:
chi2/dof [dof] = 0.38 [5] Q = 0.86 logGBF = 53.598
Parameters:
a 0.74338 (29) [ 0.750 (50) ]
b 0.292480 (42) [ 0.300 (30) ]
Settings:
svdcut/n = 0.0028/3 tol = (1e-08*,1e-10,1e-10) (itns/time = 5/0.0)
================ Add noise to prior, SVD
Least Square Fit:
chi2/dof [dof] = 0.81 [5] Q = 0.54 logGBF = 52.523
Parameters:
a 0.74343 (29) [ 0.803 (50) ] *
b 0.292485 (42) [ 0.314 (30) ]
Fit:
key y[key] f(p)[key]
---------------------------------------------
0 0.0092436 (39) 0.0092443 (32)
1 0.0068987 (35) 0.0069000 (26)
2 0.0051496 (30) 0.0051502 (21)
3 0.0038437 (23) 0.0038441 (17)
4 0.0028689 (17) 0.0028693 (14)
Settings:
svdcut/n = 0.0028/3* tol = (1e-08*,1e-10,1e-10) (itns/time = 5/0.0)
The fit with extra noise has a larger chi**2
, as expected,
but is still a good fit. It also
gives fit parameters that agree within errors
with those from the
original fit. In general, there is probably something wrong with
the original fit (e.g., svdcut
too small, or priors inconsistent with the fit data)
if adding noise makes chi**2/N
significantly larger than one,
or changes the best-fit values of the parameters significantly.
Fit Residuals and Q-Q Plots¶
It can be useful to examine the normalized residuals from a fit (in array
fit.residuals). These are the differences between
the data and the corresponding values from the fit function using the best-fit
values for the fit parameters. The differences are projected onto the
eigenvectors of the correlation matrix and normalized by dividing by the
square root of the corresponding eigenvalues.
The statistical assumptions underlying lsqfit.nonlinear_fit imply that the
normalized fit residuals should be uncorrelated and distributed
randomly about zero in
a Gaussian distribution.
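The following sketch is not part of lsqfit; it illustrates the construction just described, for the data term only, using gvar and numpy directly:

import numpy as np
import gvar as gv

def normalized_residuals(y, f):
    # y: data (array of GVars); f: fit function evaluated at the best-fit parameters
    diff = gv.mean(y) - gv.mean(f)
    cov = gv.evalcov(y)                      # covariance matrix of the data
    sdev = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sdev, sdev)        # correlation matrix
    eigval, eigvec = np.linalg.eigh(corr)
    # project the scaled differences onto the eigenvectors and divide by
    # the square roots of the corresponding eigenvalues
    return eigvec.T.dot(diff / sdev) / np.sqrt(eigval)

The sum of the squares of these residuals reproduces the data contribution to chi**2.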
One way to test whether residuals from a fit have a Gaussian distribution is to make a Q-Q plot. Plots for the two fits from the previous section (one without extra noise on the left, and the other with noise) are:


These plots were made using qqplot_residuals(),
by adding the following lines at the end
of the main() function:
fit.qqplot_residuals().show()
noisyfit.qqplot_residuals().show()
In each case the residuals are first ordered, from smallest to largest. They are then plotted against the value expected for a residual at that position in the list if the list elements were drawn at random from a Gaussian distribution of unit width and zero mean. (For example, given 100 samples from the Gaussian distribution, the sample in position 16 of the ordered list should have a value around -1 since 16% of the values should be more than one standard deviation below the mean.) The residuals are consistent with a Gaussian distribution if they fall on or near a straight line in such a plot.
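The construction is easy to sketch by hand (an illustration only, not how qqplot_residuals() is implemented; it assumes scipy is available for the Gaussian quantiles):

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

def qqplot(residuals):
    # order the residuals and compare with unit-Gaussian expectations
    r = np.sort(np.asarray(residuals))
    n = len(r)
    # expected quantile for each position in the ordered list
    q = norm.ppf((np.arange(1, n + 1) - 0.5) / n)
    plt.plot(q, r, 'o')              # ordered residuals
    plt.plot(q, q, 'k:')             # reference: unit width, zero mean
    plt.xlabel('theoretical quantiles')
    plt.ylabel('ordered residuals')
    plt.show()

For example, qqplot(fit.residuals) would plot the residuals from the first fit above.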
The plots show fits of the residuals to straight lines (red solid lines).
The residuals in the left plot (without additional noise) are reasonably
consistent with a straight line (and, therefore, a Gaussian distribution),
but the slope (0.55) is much less than
one. This is because chi2/dof = 0.38
for this fit
is much smaller than one. Typically
we expect the slope to be roughly the square root of chi2/dof
(since chi2
equals the sum of the residuals squared).
The residuals in the right plot are also quite linear in the Q-Q plot.
In this case the fit residuals include extra noise associated with
the priors and with the SVD cut, as discussed in the previous section.
As a result chi2/dof = 0.81
is much closer to one, as is the
resulting slope (0.90) in the Q-Q plot. The fit line through
the residuals is much closer here to the dashed line in the plot,
which is what would result if the residuals had unit standard
deviation and zero mean.
The fits pictured above have relatively few residuals. Q-Q plots become increasingly compelling as the number of residuals increases. The following plots, from fits without (left) and with (right) prior/SVD noise, come from a lattice QCD analysis with 383 residuals:

