lsqfit
- Nonlinear Least Squares Fitting¶
Introduction¶
This package contains tools for nonlinear least-squares curve fitting of data. In general a fit has four inputs:
- The dependent data
y
that is to be fit — typicallyy
is a Python dictionary in anlsqfit
analysis. Its valuesy[k]
are eithergvar.GVar
s or arrays (any shape or dimension) ofgvar.GVar
s that specify the values of the dependent variables and their errors.- A collection
x
of independent data —x
can have any structure and contain any data (or no data).- A fit function
f(x, p)
whose parametersp
are adjusted by the fit untilf(x, p)
equalsy
to withiny
s errors — parameters p` are usually specified by a dictionary whose valuesp[k]
are individual parameters or (numpy
) arrays of parameters. The fit function is assumed independent ofx
(that is,f(p)
) ifx = False
(or ifx
is omitted from the input data).- Initial estimates or priors for each parameter in
p
— priors are usually specified using a dictionaryprior
whose valuesprior[k]
aregvar.GVar
s or arrays ofgvar.GVar
s that give initial estimates (values and errors) for parametersp[k]
.
A typical code sequence has the structure:
... collect x, y, prior ...
def f(x, p):
... compute fit to y[k], for all k in y, using x, p ...
... return dictionary containing the fit values for the y[k]s ...
fit = lsqfit.nonlinear_fit(data=(x, y), prior=prior, fcn=f)
print(fit) # variable fit is of type nonlinear_fit
The parameters p[k]
are varied until the chi**2
for the fit is
minimized.
The best-fit values for the parameters are recovered after fitting
using, for example, p=fit.p
. Then the p[k]
are gvar.GVar
s or
arrays of gvar.GVar
s that give best-fit estimates and fit uncertainties
in those estimates. The print(fit)
statement prints a summary of
the fit results.
The dependent variable y
above could be an array instead of a
dictionary, which is less flexible in general but possibly more
convenient in simpler fits. Then the approximate y
returned by fit
function f(x, p)
must be an array with the same shape as the dependent
variable. The prior prior
could also be represented by an array
instead of a dictionary.
By default priors are Gaussian/normal distributions, represented by gvar.GVar
s.
Setting nonlinear_fit
parameter extend=True
allows for log-normal
and sqrt-normal distributions as well. The latter are indicated by replacing
the prior (in a dictionary prior) with key c
, for example, by a prior for
the parameter’s logarithm or square root, with key log(c)
or sqrt(c)
,
respectively. nonlinear_fit
adds parameter c
to the parameter
dictionary, deriving its value from parameter log(c)
or sqrt(c)
. The
fit function can be expressed directly in terms of parameter c
and so is
the same no matter which distribution is used for c
. Note that a
sqrt-normal distribution with zero mean is equivalent to an exponential
distribution. Additional distributions can be added using
gvar.add_parameter_distribution()
.
The lsqfit
tutorial contains extended explanations and examples.
The first appendix in the paper at http://arxiv.org/abs/arXiv:1406.2279
provides conceptual background on the techniques used in this
module for fits and, especially, error budgets.
nonlinear_fit Objects¶
-
class
lsqfit.
nonlinear_fit
(data, fcn, prior=None, p0=None, extend=False, svdcut=1e-15, debug=False, **kargs)¶ Nonlinear least-squares fit.
lsqfit.nonlinear_fit
fits a (nonlinear) functionf(x, p)
to datay
by varying parametersp
, and stores the results: for example,fit = nonlinear_fit(data=(x, y), fcn=f, prior=prior) # do fit print(fit) # print fit results
The best-fit values for the parameters are in
fit.p
, while thechi**2
, the number of degrees of freedom, the logarithm of Gaussian Bayes Factor, the number of iterations, and the cpu time needed for the fit are infit.chi2
,fit.dof
,fit.logGBF
,fit.nit
, andfit.time
, respectively. Results for individual parameters infit.p
are of typegvar.GVar
, and therefore carry information about errors and correlations with other parameters. The fit data and prior can be recovered usingfit.x
(equalsFalse
if there is nox
),fit.y
, andfit.prior
; the data and prior are corrected for the svd cut, if there is one (that is, their covariance matrices have been modified in accordance with the svd cut).Parameters: - data –
Data to be fit by
lsqfit.nonlinear_fit
. It can have any of the following formats:data = x, y
x
is the independent data that is passed to the fit function with the fit parameters:fcn(x, p)
.y
is a dictionary (or array) ofgvar.GVar
s that encode the means and covariance matrix for the data that is to be fit being fit. The fit function must return a result having the same layout asy
.data = y
y
is a dictionary (or array) ofgvar.GVar
s that encode the means and covariance matrix for the data being fit. There is no independent data so the fit function depends only upon the fit parameters:fit(p)
. The fit function must return a result having the same layout asy
.data = x, ymean, ycov
x
is the independent data that is passed to the fit function with the fit parameters:fcn(x, p)
.ymean
is an array containing the mean values of the fit data.ycov
is an array containing the covariance matrix of the fit data;ycov.shape
equals2*ymean.shape
. The fit function must return an array having the same shape asymean
.data = x, ymean, ysdev
x
is the independent data that is passed to the fit function with the fit parameters:fcn(x, p)
.ymean
is an array containing the mean values of the fit data.ysdev
is an array containing the standard deviations of the fit data;ysdev.shape
equalsymean.shape
. The data are assumed to be uncorrelated. The fit function must return an array having the same shape asymean
.
Setting
x=False
in the first, third or fourth of these formats implies that the fit function depends only on the fit parameters: that is,fcn(p)
instead offcn(x, p)
. (This is not assumed ifx=None
.) - fcn (function) – The function to be fit to
data
. It is either a function of the independent datax
and the fit parametersp
(fcn(x, p)
), or a function of just the fit parameters (fcn(p)
) when there is nox
data orx=False
. The parameters are tuned in the fit until the function returns values that agree with they
data to within they
s’ errors. The function’s return value must have the same layout as they
data (a dictionary or an array). The fit parametersp
are either: 1) a dictionary where eachp[k]
is a single parameter or an array of parameters (any shape); or, 2) a single array of parameters. The layout of the parameters is the same as that of priorprior
if it is specified; otherwise, it is inferred from of the starting valuep0
for the fit. - prior (dictionary, array, or
None
) – A dictionary (or array) containing a priori estimates for all parametersp
used by fit functionfcn(x, p)
(orfcn(p)
). Fit parametersp
are stored in a dictionary (or array) with the same keys and structure (or shape) asprior
. The default value isNone
;prior
must be defined ifp0
isNone
. - p0 (dictionary, array, string or
None
) – Starting values for fit parameters in fit.lsqfit.nonlinear_fit
adjustsp0
to make it consistent in shape and structure withprior
when the latter is specified: elements missing fromp0
are filled in usingprior
, and elements inp0
that are not inprior
are discarded. Ifp0
is a string, it is taken as a file name andlsqfit.nonlinear_fit
attempts to read starting values from that file; best-fit parameter values are written out to the same file after the fit (for priming future fits). Ifp0
isNone
or the attempt to read the file fails, starting values are extracted fromprior
. The default value isNone
;p0
must be defined ifprior
isNone
. - svdcut (
None
orfloat
) – Ifsvdcut
is nonzero (notNone
), svd cuts are applied to every block-diagonal sub-matrix of the covariance matrix for the datay
andprior
(if there is a prior). The blocks are first rescaled so that all diagonal elements equal 1 – that is, the blocks are replaced by the correlation matrices for the corresponding subsets of variables. Then, ifsvdcut > 0
, eigenvalues of the rescaled matrices that are smaller thansvdcut
times the maximum eigenvalue are replaced bysvdcut
times the maximum eigenvalue. This makes the covariance matrix less singular and less susceptible to roundoff error. Whensvdcut < 0
, eigenvalues smaller than|svdcut|
times the maximum eigenvalue are discarded and the corresponding components iny
andprior
are zeroed out. - extend – Log-normal and sqrt-normal distributions can be used
for fit priors when
extend=True
, provided the parameters are specified by a dictionary (as opposed to an array). To use such a distribution for a parameter'c'
in the fit prior, replaceprior['c']
with a prior specifying its logarithm or square root, designated byprior['log(c)']
orprior['sqrt(c)']
, respectively. The dictionaries containing parameters generated bylsqfit.nonlinear_fit
will have entries for both'c'
and'log(c)'
or'sqrt(c)'
, so only the prior need be changed to switch to log-normal/sqrt-normal distributions. Settingextend=False
(the default) restricts all parameters to Gaussian distributions. Additional distributions can be added usinggvar.add_parameter_distribution()
. - debug (boolean) – Set to
True
for extra debugging of the fit function and a check for roundoff errors. (Default isFalse
.) - fitterargs – Dictionary of arguments passed on to
lsqfit.multifit
, which does the fitting.
The results from the fit are accessed through the following attributes (of
fit
wherefit = nonlinear_fit(...)
):-
chi2
¶ The minimum
chi**2
for the fit.fit.chi2 / fit.dof
is usually of order one in good fits; values much less than one suggest that the actual standard deviations in the input data and/or priors are smaller than the standard deviations used in the fit.
-
cov
¶ Covariance matrix of the best-fit parameters from the fit.
-
dof
¶ Number of degrees of freedom in the fit, which equals the number of pieces of data being fit when priors are specified for the fit parameters. Without priors, it is the number of pieces of data minus the number of fit parameters.
-
logGBF
¶ The logarithm of the probability (density) of obtaining the fit data by randomly sampling the parameter model (priors plus fit function) used in the fit. This quantity is useful for comparing fits of the same data to different models, with different priors and/or fit functions. The model with the largest value of
fit.logGBF
is the one prefered by the data. The exponential of the difference infit.logGBF
between two models is the ratio of probabilities (Bayes factor) for those models. Differences infit.logGBF
smaller than 1 are not very significant. Gaussian statistics are assumed when computingfit.logGBF
.
-
p
¶ Best-fit parameters from fit. Depending upon what was used for the prior (or
p0
), it is either: a dictionary (gvar.BufferDict
) ofgvar.GVar
s and/or arrays ofgvar.GVar
s; or an array (numpy.ndarray
) ofgvar.GVar
s.fit.p
represents a multi-dimensional Gaussian distribution which, in Bayesian terminology, is the posterior probability distribution of the fit parameters.
-
pmean
¶ Means of the best-fit parameters from fit (dictionary or array).
-
psdev
¶ Standard deviations of the best-fit parameters from fit (dictionary or array).
-
palt
¶ Same as
fit.p
except that the errors are computed directly fromfit.cov
. This is faster but means that no information about correlations with the input data is retained (unlike infit.p
); and, therefore,fit.palt
cannot be used to generate error budgets.fit.p
andfit.palt
give the same means and normally give the same errors for each parameter. They differ only when the input data’s covariance matrix is too singular to invert accurately (because of roundoff error), in which case an SVD cut is advisable.
-
p0
¶ The parameter values used to start the fit.
-
Q
¶ The probability that the
chi**2
from the fit could have been larger, by chance, assuming the best-fit model is correct. Good fits haveQ
values larger than 0.1 or so. Also called the p-value of the fit.
-
svdcorrection
¶ The sum of all SVD corrections, if any, added to the fit data
y
or the priorprior
.
-
svdn
¶ The number of eignemodes modified (and/or deleted) by the SVD cut.
-
nblocks
¶ A dictionary where
nblocks[s]
equals the number of block-diagonal sub-matrices of they
–prior
covariance matrix that are sizes
-by-s
. This is sometimes useful for debugging.
-
time
¶ CPU time (in secs) taken by fit.
The input parameters to the fit can be accessed as attributes. Note in particular attributes:
-
prior
¶ Prior used in the fit. This may differ from the input prior if an SVD cut is used. It is either a dictionary (
gvar.BufferDict
) or an array (numpy.ndarray
), depending upon the input. EqualsNone
if no prior was specified.
-
x
¶ The first field in the input
data
. This is sometimes the independent variable (as in ‘y vs x’ plot), but may be anything. It is set equal toFalse
if thex
field is omitted from the inputdata
. (This also means that the fit function has nox
argument: sof(p)
rather thanf(x,p)
.)
-
y
¶ Fit data used in the fit. This may differ from the input data if an SVD cut is used. It is either a dictionary (
gvar.BufferDict
) or an array (numpy.ndarray
), depending upon the input.
Additional methods are provided for printing out detailed information about the fit, testing fits with simulated data, doing bootstrap analyses of the fit errors, dumping (for later use) and loading parameter values, and checking for roundoff errors in the final error estimates:
-
format
(maxline=0, pstyle='v')¶ Formats fit output details into a string for printing.
The output tabulates the
chi**2
per degree of freedom of the fit (chi2/dof
), the number of degrees of freedom, the logarithm of the Gaussian Bayes Factor for the fit (logGBF
), and the number of fit- algorithm iterations needed by the fit. Optionally, it will also list the best-fit values for the fit parameters together with the prior for each (in[]
on each line). Lines for parameters that deviate from their prior by more than one (prior) standard deviation are marked with asterisks, with the number of asterisks equal to the number of standard deviations (up to five).format
can also list all of the data and the corresponding values from the fit, again with asterisks on lines where there is a significant discrepancy. At the end it lists the SVD cut, the number of eigenmodes modified by the SVD cut, the tolerances used in the fit, and the time in seconds needed to do the fit. The tolerance used to terminate the fit is marked with an asterisk.Parameters: - maxline (integer or bool) – Maximum number of data points for which fit
results and input data are tabulated.
maxline<0
implies that onlychi2
,Q
,logGBF
, anditns
are tabulated; no parameter values are included. Settingmaxline=True
prints all data points; setting it equalNone
orFalse
is the same as setting it equal to-1
. Default ismaxline=0
. - pstyle (‘vv’, ‘v’, or ‘m’) – Style used for parameter list. Supported values are ‘vv’ for very verbose, ‘v’ for verbose, and ‘m’ for minimal. When ‘m’ is set, only parameters whose values differ from their prior values are listed.
Returns: String containing detailed information about fit.
- maxline (integer or bool) – Maximum number of data points for which fit
results and input data are tabulated.
-
fmt_errorbudget
(outputs, inputs, ndecimal=2, percent=True)¶ Tabulate error budget for
outputs[ko]
due toinputs[ki]
.For each output
outputs[ko]
,fmt_errorbudget
computes the contributions tooutputs[ko]
‘s standard deviation coming from thegvar.GVar
s collected ininputs[ki]
. This is done for each key combination(ko,ki)
and the results are tabulated with columns and rows labeled byko
andki
, respectively. If agvar.GVar
ininputs[ki]
is correlated with othergvar.GVar
s, the contribution from the others is included in theki
contribution as well (since contributions from correlatedgvar.GVar
s cannot be distinguished). The table is returned as a string.Parameters: - outputs – Dictionary of
gvar.GVar
s for which an error budget is computed. - inputs – Dictionary of:
gvar.GVar
s, arrays/dictionaries ofgvar.GVar
s, or lists ofgvar.GVar
s and/or arrays/dictionaries ofgvar.GVar
s.fmt_errorbudget
tabulates the parts of the standard deviations of eachoutputs[ko]
due to eachinputs[ki]
. - ndecimal (
int
) – Number of decimal places displayed in table. - percent (boolean) – Tabulate % errors if
percent is True
; otherwise tabulate the errors themselves. - colwidth (positive integer or None) – Width of each column. This is set automatically, to
accommodate label widths, if
colwidth=None
(default). - verify (boolean) – If
True
, a warning is issued if: 1) different inputs are correlated (and therefore double count errors); or 2) the sum (in quadrature) of partial errors is not equal to the total error to within 0.1% of the error (and the error budget is incomplete or overcomplete). No checking is done ifverify==False
(default).
Returns: A table (
str
) containing the error budget. Output variables are labeled by the keys inoutputs
(columns); sources of uncertainty are labeled by the keys ininputs
(rows).- outputs – Dictionary of
-
fmt_values
(outputs, ndecimal=None)¶ Tabulate
gvar.GVar
s inoutputs
.Parameters: - outputs – A dictionary of
gvar.GVar
objects. - ndecimal (
int
orNone
) – Format valuesv
usingv.fmt(ndecimal)
.
Returns: A table (
str
) containing values and standard deviations for variables inoutputs
, labeled by the keys inoutputs
.- outputs – A dictionary of
-
simulated_fit_iter
(n=None, pexact=None, **kargs)¶ Iterator that returns simulation copies of a fit.
Fit reliability can be tested using simulated data which replaces the mean values in
self.y
with random numbers drawn from a distribution whose mean equalsself.fcn(pexact)
and whose covariance matrix is the same asself.y
‘s. Simulated data is very similar to the original fit data,self.y
, but corresponds to a world where the correct values for the parameters (i.e., averaged over many simulated data sets) are given bypexact
.pexact
is usually taken equal tofit.pmean
.Each iteration of the iterator creates new simulated data, with different random numbers, and fits it, returning the the
lsqfit.nonlinear_fit
that results. The simulated data has the same covariance matrix asfit.y
. Typical usage is:... fit = nonlinear_fit(...) ... for sfit in fit.simulated_fit_iter(n=3): ... verify that sfit.p agrees with pexact=fit.pmean within errors ...
Only a few iterations are needed to get a sense of the fit’s reliability since we know the correct answer in each case. The simulated fit’s output results should agree with
pexact
(=fit.pmean
here) within the simulated fit’s errors.Simulated fits can also be used to estimate biases in the fit’s output parameters or functions of them, should non-Gaussian behavior arise. This is possible, again, because we know the correct value for every parameter before we do the fit. Again only a few iterations may be needed for reliable estimates.
The (possibly non-Gaussian) probability distributions for parameters, or functions of them, can be explored in more detail by setting option
bootstrap=True
and collecting results from a large number of simulated fits. Withbootstrap=True
, the means of the priors are also varied from fit to fit, as in a bootstrap simulation; the new prior means are chosen at random from the prior distribution. Variations in the best-fit parameters (or functions of them) from fit to fit define the probability distributions for those quantities. For example, one would use the following code to analyze the distribution of functiong(p)
of the fit parameters:fit = nonlinear_fit(...) ... glist = [] for sfit in fit.simulated_fit_iter(n=100, bootstrap=True): glist.append(g(sfit.pmean)) ... analyze samples glist[i] from g(p) distribution ...
This code generates
n=100
samplesglist[i]
from the probability distribution ofg(p)
. If everything is Gaussian, the mean and standard deviation ofglist[i]
should agree withg(fit.p).mean
andg(fit.p).sdev
.The only difference between simulated fits with
bootstrap=True
andbootstrap=False
(the default) is that the prior means are varied. It is essential that they be varied in a bootstrap analysis since one wants to capture the impact of the priors on the final distributions, but it is not necessary and probably not desirable when simply testing a fit’s reliability.Parameters: - n (integer or
None
) – Maximum number of iterations (equals infinity ifNone
). - pexact (
None
or array or dictionary of numbers) – Fit-parameter values for the underlying distribution used to generate simulated data; replaced byself.pmean
if isNone
(default). - bootstrap (bool) – Vary prior means if
True
; otherwise vary only the means inself.y
(default).
Returns: An iterator that returns
lsqfit.nonlinear_fit
s for different simulated data.Note that additional keywords can be added to overwrite keyword arguments in
lsqfit.nonlinear_fit
.- n (integer or
-
bootstrap_iter
(n=None, datalist=None)¶ Iterator that returns bootstrap copies of a fit.
A bootstrap analysis involves three steps: 1) make a large number of “bootstrap copies” of the original input data and prior that differ from each other by random amounts characteristic of the underlying randomness in the original data; 2) repeat the entire fit analysis for each bootstrap copy of the data, extracting fit results from each; and 3) use the variation of the fit results from bootstrap copy to bootstrap copy to determine an approximate probability distribution (possibly non-gaussian) for the fit parameters and/or functions of them: the results from each bootstrap fit are samples from that distribution.
Bootstrap copies of the data for step 2 are provided in
datalist
. Ifdatalist
isNone
, they are generated instead from the means and covariance matrix of the fit data (assuming gaussian statistics). The maximum number of bootstrap copies considered is specified byn
(None
implies no limit).Variations in the best-fit parameters (or functions of them) from bootstrap fit to bootstrap fit define the probability distributions for those quantities. For example, one could use the following code to analyze the distribution of function
g(p)
of the fit parameters:fit = nonlinear_fit(...) ... glist = [] for sfit in fit.bootstrapped_fit_iter( n=100, datalist=datalist, bootstrap=True ): glist.append(g(sfit.pmean)) ... analyze samples glist[i] from g(p) distribution ...
This code generates
n=100
samplesglist[i]
from the probability distribution ofg(p)
. If everything is Gaussian, the mean and standard deviation ofglist[i]
should agree withg(fit.p).mean
andg(fit.p).sdev
.Parameters: - n (integer) – Maximum number of iterations if
n
is notNone
; otherwise there is no maximum. - datalist (sequence or iterator or
None
) – Collection of bootstrapdata
sets for fitter.
Returns: Iterator that returns an
lsqfit.nonlinear_fit
object containing results from the fit to the next data set indatalist
- n (integer) – Maximum number of iterations if
-
dump_p
(filename)¶ Dump parameter values (
fit.p
) into filefilename
.fit.dump_p(filename)
saves the best-fit parameter values (fit.p
) from anonlinear_fit
calledfit
. These values are recovered usingp = nonlinear_fit.load_parameters(filename)
wherep
‘s layout is the same as that offit.p
.
-
dump_pmean
(filename)¶ Dump parameter means (
fit.pmean
) into filefilename
.fit.dump_pmean(filename)
saves the means of the best-fit parameter values (fit.pmean
) from anonlinear_fit
calledfit
. These values are recovered usingp0 = nonlinear_fit.load_parameters(filename)
wherep0
‘s layout is the same asfit.pmean
. The saved values can be used to initialize a later fit (nonlinear_fit
parameterp0
).
-
static
load_parameters
(filename)¶ Load parameters stored in file
filename
.p = nonlinear_fit.load_p(filename)
is used to recover the values of fit parameters dumped usingfit.dump_p(filename)
(orfit.dump_pmean(filename)
) wherefit
is of typelsqfit.nonlinear_fit
. The layout of the returned parametersp
is the same as that offit.p
(orfit.pmean
).
-
check_roundoff
(rtol=0.25, atol=1e-6)¶ Check for roundoff errors in fit.p.
Compares standard deviations from fit.p and fit.palt to see if they agree to within relative tolerance
rtol
and absolute toleranceatol
. Generates a warning if they do not (in which case an svd cut might be advisable).
- data –
Functions¶
-
lsqfit.
empbayes_fit
(z0, fitargs, **minargs)¶ Call
lsqfit.nonlinear_fit(**fitargs(z))
varyingz
, starting atz0
, to maximizelogGBF
(empirical Bayes procedure).The fit is redone for each value of
z
that is tried, in order to determinelogGBF
.Parameters: - z0 (array) – Starting point for search.
- fitargs (function) – Function of array
z
that determines which fit parameters to use. The function returns these as an argument dictionary forlsqfit.nonlinear_fit()
. - minargs (dictionary) – Optional argument dictionary, passed on to
lsqfit.multiminex
, which finds the minimum.
Returns: A tuple containing the best fit (object of type
lsqfit.nonlinear_fit
) and the optimal value for parameterz
.
-
lsqfit.
wavg
(dataseq, prior=None, fast=False, **kargs)¶ Weighted average of
gvar.GVar
s or arrays/dicts ofgvar.GVar
s.The weighted average of several
gvar.GVar
s is what one obtains from a least-squares fit of the collection ofgvar.GVar
s to the one-parameter fit functiondef f(p): return N * [p[0]]
where
N
is the number ofgvar.GVar
s. The average is the best-fit value forp[0]
.gvar.GVar
s with smaller standard deviations carry more weight than those with larger standard deviations. The averages computed bywavg
take account of correlations between thegvar.GVar
s.If
prior
is notNone
, it is added to the list of data used in the average. Thuswavg([x2, x3], prior=x1)
is the same aswavg([x1, x2, x3])
.Typical usage is
x1 = gvar.gvar(...) x2 = gvar.gvar(...) x3 = gvar.gvar(...) xavg = wavg([x1, x2, x3]) # weighted average of x1, x2 and x3
where the result
xavg
is agvar.GVar
containing the weighted average.The individual
gvar.GVar
s in the last example can be replaced by multidimensional distributions, represented by arrays ofgvar.GVar
s or dictionaries ofgvar.GVar
s (or arrays ofgvar.GVar
s). For example,x1 = [gvar.gvar(...), gvar.gvar(...)] x2 = [gvar.gvar(...), gvar.gvar(...)] x3 = [gvar.gvar(...), gvar.gvar(...)] xavg = wavg([x1, x2, x3]) # xavg[i] is wgtd avg of x1[i], x2[i], x3[i]
where each array
x1
,x2
... must have the same shape. The resultxavg
in this case is an array ofgvar.GVar
s, where the shape of the array is the same as that ofx1
, etc.Another example is
x1 = dict(a=[gvar.gvar(...), gvar.gvar(...)], b=gvar.gvar(...)) x2 = dict(a=[gvar.gvar(...), gvar.gvar(...)], b=gvar.gvar(...)) x3 = dict(a=[gvar.gvar(...), gvar.gvar(...)]) xavg = wavg([x1, x2, x3]) # xavg['a'][i] is wgtd avg of x1['a'][i], x2['a'][i], x3['a'][i] # xavg['b'] is gtd avg of x1['b'], x2['b']
where different dictionaries can have (some) different keys. Here the result
xavg
is agvar.BufferDict`
having the same keys asx1
, etc.Weighted averages can become costly when the number of random samples being averaged is large (100s or more). In such cases it might be useful to set parameter
fast=True
. This causeswavg
to estimate the weighted average by incorporating the random samples one at a time into a running average:result = prior for dataseq_i in dataseq: result = wavg([result, dataseq_i], ...)
This method is much faster when
len(dataseq)
is large, and gives the exact result when there are no correlations between different elements of listdataseq
. The results are approximately correct whendataseq[i]
anddataseq[j]
are correlated fori!=j
.Parameters: - dataseq – The
gvar.GVar
s to be averaged.dataseq
is a one-dimensional sequence ofgvar.GVar
s, or of arrays ofgvar.GVar
s, or of dictionaries containinggvar.GVar
s or arrays ofgvar.GVar
s. Alldataseq[i]
must have the same shape. - prior (
gvar.GVar
or array/dictionary ofgvar.GVar
s) – Prior values for the averages, to be included in the weighted average. Default value isNone
, in which caseprior
is ignored. - fast (bool) – Setting
fast=True
causeswavg
to compute an approximation to the weighted average that is much faster to calculate when averaging a large number of samples (100s or more). The default isfast=False
. - kargs (dict) – Additional arguments (e.g.,
svdcut
) to the fitter used to do the averaging.
Results returned by
gvar.wavg()
have the following extra attributes describing the average:-
lsqfit.
chi2
¶ chi**2
for weighted average.
-
lsqfit.
dof
¶ Effective number of degrees of freedom.
-
lsqfit.
Q
¶ The probability that the
chi**2
could have been larger, by chance, assuming that the data are all Gaussain and consistent with each other. Values smaller than 0.1 or suggest that the data are not Gaussian or are inconsistent with each other. Also called the p-value.Quality factor Q (or p-value) for fit.
-
lsqfit.
time
¶ Time required to do average.
-
lsqfit.
svdcorrection
¶ The svd corrections made to the data when
svdcut
is notNone
.
-
lsqfit.
fit
¶ Fit output from average.
- dataseq – The
-
lsqfit.
gammaQ
()¶ Return the normalized incomplete gamma function
Q(a,x) = 1-P(a,x)
.Q(a, x) = 1/Gamma(a) * \int_x^\infty dt exp(-t) t ** (a-1) = 1 - P(a, x)
Note that
gammaQ(ndof/2., chi2/2.)
is the probabilty that one could get achi**2
larger thanchi2
withndof
degrees of freedom even if the model used to constructchi2
is correct.
-
gvar.
add_parameter_distribution
(name, invfcn)¶ Add new parameter distribution for use in fits.
This function adds new distributions for the parameters used in
lsqfit.nonlinear_fit
. For example, the codeimport gvar as gv gv.add_parameter_distribution('log', gv.exp)
enables the use of log-normal distributions for parameters. The log-normal distribution is invoked for a parameter
p
by includinglog(p)
rather thanp
itself in the fit prior. log-normal, sqrt-normal, and erfinv-normal distributions are included by default. (Setting a priorprior[erfinv(w)]
equal togv.gvar('0(1)') / gv.sqrt(2)
means that the prior probability forw
is distributed uniformly between -1 and 1, and is zero elsewhere.)These distributions are implemented by replacing a fit parameter
p
by a new fit parameterfcn(p)
wherefcn
is some function.fcn(p)
is assumed to have a Gaussian distribution, and parameterp
is recovered using the inverse functioninvfcn
wherep=invfcn(fcn(p))
.Parameters: - name (str) – Distribution’s name.
- invfcn – Inverse of the transformation function.
-
gvar.
del_parameter_distribution
(name)¶ Delete parameter distribution
name
.
-
gvar.
add_parameter_parentheses
(p)¶ Return dictionary with proper keys for parameter distributions (legacy code).
This utility function helps fix legacy code that uses parameter keys like
logp
orsqrtp
instead oflog(p)
orsqrt(p)
, as now required. This method creates a copy of dictionaryp'' but with keys like ``logp
orsqrtp
replaced bylog(p)
orsqrt(p)
. So settingp = add_parameter_parentheses(p)
fixes the keys in
p
for log-normal and sqrt-normal parameters.
Classes for Bayesian Integrals¶
-
class
lsqfit.
BayesPDF
(fit, svdcut=1e-15)¶ Bayesian probability density function corresponding to
nonlinear_fit
fit
.The probability density function is the exponential of
-1/2
times thechi**2
function (data and priors) used infit
divided bynorm
.Parameters: - fit – Fit from
nonlinear_fit
. - svdcut (non-negative float or None) – If not
None
, replace covariance matrix ofg
with a new matrix whose small eigenvalues are modified: eigenvalues smaller thansvdcut
times the maximum eigenvalueeig_max
are replaced bysvdcut*eig_max
. This can ameliorate problems caused by roundoff errors when inverting the covariance matrix. It increases the uncertainty associated with the modified eigenvalues and so is conservative. Settingsvdcut=None
orsvdcut=0
leaves the covariance matrix unchanged. Default is1e-15
.
-
__call__
(p)¶ Probability density function evaluated at
p
.
-
logpdf
(p)¶ Logarithm of the probability density function evaluated at
p
.
- fit – Fit from
-
class
lsqfit.
BayesIntegrator
(fit, limit=1e15, scale=1, pdf=None, svdcut=1e-15)¶ vegas
integrator for Bayesian fit integrals.Parameters: - fit – Fit from
nonlinear_fit
. - limit (positive float) – Limits the integrations to a finite
region of size
limit
times the standard deviation on either side of the mean. This can be useful if the functions being integrated misbehave for large parameter values (e.g.,numpy.exp
overflows for a large range of arguments). Default is1e15
. - scale (positive float) – The integration variables are
rescaled to emphasize parameter values of order
scale
times the corresponding standard deviations. The rescaling does not change the value of the integral but it can reduce uncertainties in thevegas
estimates. Default is1.0
. - pdf (function) – Probability density function
pdf(p)
of the fit parameters to use in place of the normal PDF associated with the least-squares fit used to create the integrator. - svdcut (non-negative float or None) – If not
None
, replace covariance matrix ofg
with a new matrix whose small eigenvalues are modified: eigenvalues smaller thansvdcut
times the maximum eigenvalueeig_max
are replaced bysvdcut*eig_max
. This can ameliorate problems caused by roundoff errors when inverting the covariance matrix. It increases the uncertainty associated with the modified eigenvalues and so is conservative. Settingsvdcut=None
orsvdcut=0
leaves the covariance matrix unchanged. Default is1e-15
.
BayesIntegrator(fit)
is avegas
integrator that evaluates expectation values for the multi-dimensional Bayesian distribution associated withnonlinear_fit
fit
: the probability density function is the exponential of thechi**2
function (times-1/2
), for data and priors, used in the fit. For linear fits, it is equivalent togvar.PDFIntegrator(fit.p)
, since thechi**2
function is quadratic in the fit parameters; but they can differ significantly for nonlinear fits.BayesIntegrator
integrates over the entire parameter space but first reexpresses the integrals in terms of variables that diagonalize the covariance matrix of the best-fit parametersfit.p
fromnonlinear_fit
and are centered at the best-fit values. This greatly facilitates the integration usingvegas
, making integrals over 10s or more of parameters feasible. (Thevegas
module must be installed separately in order to useBayesIntegrator
.)A simple illustration of
BayesIntegrator
is given by the following code, which we use to evaluate the mean and standard deviation fors*g
wheres
andg
are fit parameters:import lsqfit import gvar as gv import numpy as np # least-squares fit x = np.array([0.1, 1.2, 1.9, 3.5]) y = gv.gvar(['1.2(1.0)', '2.4(1)', '2.0(1.2)', '5.2(3.2)']) prior = gv.gvar(dict(a='0(5)', s='0(2)', g='2(2)')) def f(x, p): return p['a'] + p['s'] * x ** p['g'] fit = lsqfit.nonlinear_fit(data=(x,y), prior=prior, fcn=f, debug=True) print(fit) # Bayesian integral to evaluate expectation value of s*g def g(p): sg = p['s'] * p['g'] return [sg, sg**2] expval = lsqfit.BayesIntegrator(fit, limit=20.) warmup = expval(neval=4000, nitn=10) results = expval(g, neval=4000, nitn=15, adapt=False) print(results.summary()) print('results =', results, '\n') sg, sg2 = results sg_sdev = (sg2 - sg**2) ** 0.5 print('s*g from Bayes integral: mean =', sg, ' sdev =', sg_sdev) print('s*g from fit:', fit.p['s'] * fit.p['g'])
where the
warmup
calls to the integrator are used to adapt it to probability density function from the fit, and then the integrator is used to evaluate the expectation value ofg(p)
, which is returned in arrayresults
. Hereneval
is the (approximate) number of function calls per iteration of thevegas
algorithm andnitn
is the number of iterations. We use the integrator to calculated the expectation value ofs*g
and(s*g)**2
so we can compute a mean and standard deviation.The output from this code shows that the Gaussian approximation for
s*g
(0.76(66)) is somewhat different from the result obtained from a Bayesian integral (0.48(54)):Least Square Fit: chi2/dof [dof] = 0.32 [4] Q = 0.87 logGBF = -9.2027 Parameters: a 1.61 (90) [ 0.0 (5.0) ] s 0.62 (81) [ 0.0 (2.0) ] g 1.2 (1.1) [ 2.0 (2.0) ] Settings: svdcut/n = 1e-15/0 reltol/abstol = 0.0001/0* (itns/time = 10/0.0) itn integral average chi2/dof Q ------------------------------------------------------- 1 1.034(21) 1.034(21) 0.00 1.00 2 1.034(21) 1.034(15) 0.56 0.64 3 1.024(18) 1.030(12) 0.37 0.90 4 1.010(18) 1.0254(98) 0.47 0.89 5 1.005(17) 1.0213(85) 0.55 0.88 6 1.013(19) 1.0199(78) 0.69 0.80 7 0.987(16) 1.0152(70) 0.78 0.72 8 1.002(18) 1.0135(66) 0.90 0.59 9 1.036(20) 1.0160(62) 0.86 0.66 10 1.060(20) 1.0204(60) 0.94 0.55 results = [0.4837(32) 0.5259(47)] s*g from Bayes integral: mean = 0.4837(32) sdev = 0.5403(25) s*g from fit: 0.78(66)
The table shows estimates of the probability density function’s normalization from each of the
vegas
iterations used by the integrator to estimate the final results.In general functions being integrated can return a number, or an array of numbers, or a dictionary whose values are numbers or arrays of numbers. This allows multiple expectation values to be evaluated simultaneously.
See the documentation with the
vegas
module for more details on its use, and on the attributes and methods associated with integrators. The example above setsadapt=False
when computing final results. This gives more reliable error estimates whenneval
is small. Note thatneval
may need to be much larger (tens or hundreds of thousands) for more difficult high-dimension integrals.-
__call__
(f=None, mpi=False, pdf=None, **kargs)¶ Estimate expectation value of function
f(p)
.Uses multi-dimensional integration modules
vegas
to estimate the expectation value off(p)
with respect to the probability density function associated withnonlinear_fit
fit
. Settingmpi=True
configures vegas for multi-processor running using MPI.Parameters: - f (function) – Function
f(p)
to integrate. Integral is the expectation value of the function with respect to the distribution. The function can return a number, an array of numbers, or a dictionary whose values are numbers or arrays of numbers. - mpi (bool) –
If
True
configure for use with multiple processors and MPI. This option requires modulempi4py
. A scriptxxx.py
using an MPI integrator is run withmpirun
: e.g.,mpirun -np 4 -output-filename xxx.out python xxx.py
runs on 4 processors. Setting
mpi=False
(default) does not support multiple processors. The MPI processor rank can be obtained from thempi_rank
attribute of the integrator. - pdf (function) – Probability density function
pdf(p)
of the fit parameters to use in place of the normal PDF associated with the least-squares fit used to create the integrator.
All other keyword arguments are passed on to a
vegas
integrator; see thevegas
documentation for further information.- f (function) – Function
- fit – Fit from
Other Classes¶
-
class
lsqfit.
multifit
(x0, n, f, tol=1e-4, maxit=1000, alg='lmsder', analyzer=None)¶ Fitter for nonlinear least-squares multidimensional fits.
Parameters: - x0 (
numpy
array of floats) – Starting point for minimization. - n (positive integer) – Length of vector returned by the fit function
f(x)
. - f (function) – Fit function:
multifit
minimizessum_i f_i(x)**2
by varying parametersx
. The parameters are a 1-dnumpy
array of either numbers orgvar.GVar
s. - tol (tuple or float) –
Setting
tol=(reltol, abstol)
causes the fit to stop searching for a solution when|dx_i| <= abstol + reltol * |x_i|
. With version 2 or higher of the GSL library,tol=(xtol, gtol, ftol)
can be used, where the fit stops when any one of the following three criteria is satisfied:- step size small:
|dx_i| <= xtol * (xtol + |x_i|)
; - gradient small:
||g . x||_inf <= gtol * ||f||^2
; - residuals small:
||f(x+dx) - f(x)|| <= ftol * max(||f(x)||, 1)
.
Recommended values are:
xtol=1/10**d
ford
digits of precision in the parameters;gtol=1e-6
to account for roundoff errors in gradientg
(unless the second order derivative vanishes at minimum as well, in which casegtol=0
might be good); andftol<<1
. Settingtol=reltol
is equivalent to settingtol=(reltol, 0.0)
. The default setting istol=0.0001
. - step size small:
- maxit (integer) – Maximum number of iterations in search for minimum; default is 1000.
- alg (string) – GSL algorithm to use for minimization. Two options are
currently available:
"lmsder"
, the scaled LMDER algorithm (default); and"lmder"
, the unscaled LMDER algorithm. With version 2 of the GSL library, another option is"lmniel"
, which can be useful when there is much more data than parameters. - analyzer (function) – Optional function of
x, [...f_i(x)...], [[..df_ij(x)..]]
which is called after each iteration. This can be used to inspect intermediate steps in the minimization, if needed.
multifit
is a function-class whose constructor does a least squares fit by minimizingsum_i f_i(x)**2
as a function of vectorx
. The following attributes are available:-
x
¶ Location of the most recently computed (best) fit point.
-
cov
¶ Covariance matrix at the minimum point.
-
f
¶ The fit function
f(x)
at the minimum in the most recent fit.
-
J
¶ Gradient
J_ij = df_i/dx[j]
for most recent fit.
-
nit
¶ Number of iterations used in last fit to find the minimum.
-
stopping_criterion
¶ - Criterion used to stop fit:
- 0 => didn’t converge 1 => step size small 2 => gradient small 3 => residuals small
-
error
¶ None
if fit successful; an error message otherwise.
multifit
is a wrapper for themultifit
GSL routine.- x0 (
-
class
lsqfit.
multiminex
(x0, f, tol=1e-4, maxit=1000, step=1, alg='nmsimplex2', analyzer=None)¶ Minimizer for multidimensional functions.
Parameters: - x0 (
numpy
array of floats) – Starting point for minimization search. - f (function) – Function
f(x)
to be minimized by varying vectorx
. - tol (float) – Minimization stops when
x
has converged to with tolerancetol
; default is1e-4
. - maxit (integer) – Maximum number of iterations in search for minimum; default is 1000.
- step (number) – Initial step size to use in varying components of
x
; default is 1. - alg (string) – GSL algorithm to use for minimization. Three options are
currently available:
"nmsimplex"
, Nelder Mead Simplex algorithm;"nmsimplex2"
, an improved version of"nmsimplex"
(default); and"nmsimplex2rand"
, a version of"nmsimplex2"
with random shifts in the start position. - analyzer (function) – Optional function of
x, f(x), it
, whereit
is the iteration number, which is called after each iteration. This can be used to inspect intermediate steps in the minimization, if needed.
multiminex
is a function-class whose constructor minimizes a multidimensional functionf(x)
by varying vectorx
. This routine does not use user-supplied information about the gradient off(x)
. The following attributes are available:-
x
¶ Location of the most recently computed minimum (1-d array).
-
f
¶ Value of function
f(x)
at the most recently computed minimum.
-
nit
¶ Number of iterations required to find most recent minimum.
-
error
¶ None
if fit successful; an error message otherwise.
multiminex
is a wrapper for themultimin
GSL routine.- x0 (