This notebook demonstrates the main features of pycheops for the analysis of a single transit of KELT-11 b observed by CHEOPS.
To run this notebook ...
- Configure your pycheops installation:
  >>> from pycheops.core import setup_config
- Install the DACE API (https://dace.unige.ch/pythonAPI/)
We use pylab to import numpy, scipy and matplotlib. Use %pylab notebook for interactive plots, but be aware that plots may not appear in the window you expect if you do not stop the plot interaction in one cell before running plotting commands in another cell.
%pylab inline
import pycheops
The light curve for a planet of radius $R_p$ transiting a star of radius $R_s$ with impact parameter $b$ is calculated using the power-2 limb-darkening law, $$I_{\lambda}(\mu) = 1 - c(1 - \mu^{\alpha}),$$ and is described by the following parameters: the time of mid-transit $T_0$, the orbital period $P$, the transit depth $D = (R_p/R_s)^2$, the transit width $W$, the impact parameter $b$, the eccentricity parameters $f_c = \sqrt{e}\cos\omega$ and $f_s = \sqrt{e}\sin\omega$, and the limb-darkening parameters $h_1$ and $h_2$.
See help(pycheops.models.TransitModel)
for more information on the transit model.
For a circular orbit, the parameter W is the transit duration in phase units (not true for eccentric orbits).
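As a minimal illustration of the power-2 law and of the decorrelated limb-darkening parameters $h_1 = I_\lambda(1/2)$ and $h_2 = I_\lambda(1/2) - I_\lambda(0)$ used in the fits below (Maxted 2018), here is a short sketch with purely illustrative values of $c$ and $\alpha$:

import numpy as np

def power2_intensity(mu, c, alpha):
    """Power-2 limb-darkening law, I(mu) = 1 - c*(1 - mu**alpha)."""
    return 1 - c*(1 - mu**alpha)

c, alpha = 0.6, 0.7   # illustrative values only
h_1 = power2_intensity(0.5, c, alpha)                                   # I(1/2)
h_2 = power2_intensity(0.5, c, alpha) - power2_intensity(0, c, alpha)   # I(1/2) - I(0)
print(f'h_1 = {h_1:.3f}, h_2 = {h_2:.3f}')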
Initial values for some or all of these parameters can be obtained using the PlanetProperties class. CHEOPS science team members can use the latest information on the planet stored in a table at DACE using the keyword option query_dace=True. For non-CST members, or if the planet is not in the DACE table, the planet information is obtained from TEPCat.
from uncertainties import ufloat  # used for the ecosw/esinw values below

DACE_ACCESS = True
if DACE_ACCESS:
    Kelt11b = pycheops.PlanetProperties('KELT-11b', query_dace=True)
else:
    Kelt11b = pycheops.PlanetProperties('KELT-11b', query_dace=False,
                                        ecosw=ufloat(-0.0040, 0.0490),
                                        esinw=ufloat(0.0310, 0.0680))
print(Kelt11b)
A query to DACE is used to obtain the file key for the observation of KELT-11b during in-orbit commissioning (prog_id=24).
If the data are not already in your pycheops cache directory then pycheops will attempt to download them from DACE (assuming you have set up your access correctly).
There may be multiple versions of the data set in DACE so we use the data_proc_num attribute to select the latest version.
The data reduction pipeline (DRP) report is shown when you download the data. To view the report again at a later time you can use the command dataset.view_report()
The output from dataset.get_lightcurve consists of the numpy arrays time, flux and flux_err. We do not need these yet so we send the output to the throw-away variable _.
N.B. you must decide whether or not to apply a correction for contamination of the aperture by nearby stars (decontaminate=True or decontaminate=False). There is some uncertainty in this correction so, in some cases, using decontaminate=True can add noise to the light curve.
from dace.cheops import Cheops

# Filter on the target name and the programme ID (in-orbit commissioning, prog_id=24)
myfilter = {'obj_id_catname': {'contains': 'KELT-11b'},
            'prog_id': {'equal': [24]}}
data = Cheops.query_database(filters=myfilter)
i_last_data_proc_num = np.argmax(data['data_proc_num'])
file_key = data['file_key'][i_last_data_proc_num]
dataset = pycheops.Dataset(file_key)
_ = dataset.get_lightcurve(aperture='OPTIMAL',decontaminate=True)
This is an important step in order to understand the instrumental effects that may be present in your data. It may take a few minutes to generate the movie.
dataset.animate_frames(vmax=0.5)
The items in the dictionary dataset.lc are
dataset.lc.keys()
An initial cut for obvious outliers is applied here. Data points with discrepant fluxes are rejected from all the arrays in dataset.lc except lc['table']
. The output of the function is the clipped arrays time, flux, and flux_err.
time,flux,flux_err = dataset.clip_outliers(verbose=True);
Useful as a quick look at the data and meta-data to check for any obvious correlations or problems.
dataset.diagnostic_plot()
The plot shows a well-defined transit with a depth of about 0.2% and some little "spikes" that seem to occur regularly - we will deal with these below.
plot(time,flux,'r+') # matplotlib
bjd_ref = dataset.bjd_ref
xlabel(f'BJD - {bjd_ref}') # example of python f-string formatting
ylabel('Flux')
title(dataset.target)
Many of the instrumental artefacts in the CHEOPS light curves are associated with the spacecraft roll angle. In this case we can see that the "spikes" occur near a roll angle of 340 degrees.
Note the use of a temporary variable t
that "points" to the object we are using in the plot command.
t = dataset.lc
plot(t['roll_angle'],t['flux'],'gx') # plot with green crosses, just for a change
xlabel('Roll angle [degrees]')
ylabel('Flux')
xlim(0,360)
A flux ramp is often observed at the beginning of a visit with an amplitude of a few hundred ppm (either positive or negative), decaying over a time scale of several hours.
This ramp is due to a small scale change in the shape of the PSF. This in turn can be understood as a slight focus change as a result of a thermal adaptation of the telescope tube to the new heat load by the thermal radiation from the Earth. This thermal adaptation (breathing) is monitored by thermal sensors in the tube.
At the time of writing (Dec 2020) several algorithms are being investigated to correct for this ramp effect. One algorithm that is simple to implement and seems to work quite well is to correct the measured flux using the equation $$ {\rm Flux}_{\rm corrected} = {\rm Flux}_{\rm measured} (1+\beta \Delta T) $$ where $$ \Delta T = T_{\rm thermFront\_2} + 12 $$
The following values of the coefficient $\beta$ have been determined by Goran Olofsson.
| Aperture radius [pixels] | $\beta$ |
| --- | --- |
| 22.5 | 0.00014 |
| 25.0 | 0.00020 |
| 30.0 | 0.00033 |
| 40.0 | 0.00040 |
The method correct_ramp uses linear interpolation in this table to predict the coefficient $\beta$ for the aperture radius of the light curve.
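For illustration, a hand-rolled version of this correction might look like the following sketch (correct_ramp does all of this for you; the aperture radius value and the thermFront_2 column name used here are assumptions for the sketch only):

import numpy as np

# Ramp coefficients beta v. aperture radius from the table above
ap_radius = np.array([22.5, 25.0, 30.0, 40.0])   # pixels
beta_tab = np.array([0.00014, 0.00020, 0.00033, 0.00040])

# Interpolate beta to the aperture radius used for this light curve
ap_rad = 25.0                                    # assumed aperture radius for the sketch
beta = np.interp(ap_rad, ap_radius, beta_tab)

# Flux_corrected = Flux_measured*(1 + beta*deltaT), with deltaT = thermFront_2 + 12
thermFront_2 = dataset.lc['table']['thermFront_2']   # assumed column name
flux_corr = flux*(1 + beta*(thermFront_2 + 12))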
time,flux,flux_err = dataset.correct_ramp(plot=True)
Notice that when the last command in a cell returns a value, its output is printed to the screen - a semi-colon at the end of the last command, as in the next cell, avoids this little issue.
The contamination in this case is a small fraction of the total flux.
plot(t['roll_angle'],t['contam'],'bo',ms=3) # blue dots, marker size =3
xlabel('Roll angle [degrees]')
ylabel('contam')
xlim(0,360);
This is only done as an example of how to mask the data - there is no good reason to exclude these data points from the analysis.
As with clip_outliers, all the internal arrays except lc['table']
are masked.
mask = dataset.lc['centroid_x'] < 813.5
time,flux,flux_err = dataset.mask_data(mask)
Properties of the star such as Teff and log g are taken from SWEET-Cat. The SWEET-Cat catalogue is downloaded automatically if it is not in your pycheops cache folder, or if the cached version is more than a day old.
The values of Teff, logg and [Fe/H] are used to estimate the limb-darkening parameters of the star and the mean stellar density $\rho_{\star}$.
See help(pycheops.StarProperties)
for instructions on how to specify your own values.
CHEOPS Science team members can use the option dace=True
to use updated values from the StarProperties table stored on DACE.
star = pycheops.StarProperties(dataset.target) # dace=True
print(star)
The dataset method lmfit_transit uses the Levenberg-Marquardt method implemented in lmfit to find a least-squares best fit for a transit model calculated using the qpower2 algorithm.
Parameter values can be specified in various ways: a fixed value (e.g., P=P), a starting value with limits as a tuple (min, start, max), or a ufloat to impose a Gaussian prior.
To enable decorrelation against a parameter, specify it as a free parameter, e.g., dfdbg=(0,1). The decorrelation parameters that are available can be found using help(pycheops.models.FactorModel)
Decorrelation is done against a scaled version of the quantity specified, with a range of either (-1,1) or, for strictly positive quantities, (0,1). This means the coefficients dfdx, dfdy, etc. correspond to the amplitude of the flux variation due to the correlation with the relevant parameter.
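For example, the background level would be scaled onto (0,1) along these lines (a sketch of the idea, not the pycheops internals):

import numpy as np

bg = dataset.lc['bg']                      # background level in the aperture
bg_scaled = (bg - bg.min())/np.ptp(bg)     # strictly positive quantity -> range (0, 1)
# A linear term dfdbg*bg_scaled then changes the flux by at most |dfdbg|
# over the full range of background levels seen during the visit.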
Where possible, the initial values for lmfit are taken from the Kelt11b PlanetProperties object. These values are stored as ufloat objects. The .n attribute of a ufloat is its value and the .s attribute is its standard error.
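A quick demonstration of how ufloat objects behave:

from uncertainties import ufloat

x = ufloat(0.5, 0.1)
print(x.n, x.s)    # 0.5 0.1 - the value and its standard error
print(2*x + 1)     # uncertainties are propagated automatically: 2.00+/-0.20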
from uncertainties import ufloat
from uncertainties.umath import sqrt as usqrt
P = Kelt11b.P.n
BJD_0 = Kelt11b.T0.n
cycle = round((dataset.bjd_ref-BJD_0)/P)
T_0 = BJD_0 - dataset.bjd_ref + cycle*P
D = Kelt11b.D.n/1e6 # Depth stored in ppm
W = Kelt11b.W.n/P # Width stored in days
if Kelt11b.f_c and Kelt11b.f_s:
f_c = Kelt11b.f_c
f_s = Kelt11b.f_s
else:
# From Pepper et al., 2017
ecosw = ufloat(-0.004,0.05099)
esinw = ufloat(0.031,0.055)
ecc = usqrt(ecosw**2+esinw**2)
f_s = esinw/usqrt(ecc) # f_s = sqrt(e)sin(omega) = e.sin(omega)/sqrt(e)
f_c = ecosw/usqrt(ecc) # f_c = sqrt(e)cos(omega) = e.cos(omega)/sqrt(e)
Use a straight-line fit to the data on either side of the transit to set the out-of-transit level to 1.
plot(time,flux,'.',color='gray')
time,flux,flux_err = dataset.flatten(T_0,P*W)
plot(time,flux,'.',color='blue')
axvline(T_0-P*W/2,color='darkcyan',ls=':')
axvline(T_0+P*W/2,color='darkcyan',ls=':')
axhline(1,color='darkcyan',ls=':')
xlabel(f'BJD - {bjd_ref}')
ylabel('Flux');
We will use the residuals from this fit to calculate priors on the parameters of the decorrelation model.
There is only one transit in the dataset so the orbital period is fixed for this analysis.
The output from lmfit_transit() is an lmfit MinimizerResult object.
lmfit0 = dataset.lmfit_transit(T_0 = ufloat(T_0, 0.001), P=P,
D=(D/4, D, D*4), W=(W/4, W, W*4),b=(0,0.5,1),
f_c=f_c, f_s=f_s,
h_1=star.h_1, h_2=star.h_2.n,
logrhoprior=star.logrho)
Note the ";" at the end of the plot command to stop the plot appearing twice. Alternatively, you can put the output of the plot command (which is a matplotlib Figure object) into a variable. This can be used later to save the plot to a file, for example.
dataset.plot_lmfit(binwidth=0.008, figsize=(10,6), fontsize=14);
A sensible prior for these coefficients (decorrelation parameters) is a normal distribution with the same standard deviation as the residuals from the fit above with no decorrelation.
Careful selection of the prior is important because it will be used later to determine if we have selected a good set of decorrelation parameters.
sigma_0 = lmfit0.rms
dprior = ufloat(0, sigma_0)
print(f'sigma_0 = {sigma_0:0.6f}')
# The prior on dfdt is different because dfdt is measured in 1/day, so scale by the visit duration
tprior = ufloat(0, sigma_0/np.ptp(time))
The question of which decorrelation parameters to include is an example of model selection. In this case we have nested models in which one or more of the parameters is fixed in the simpler model.
For models $M_0$ and $M_1$ with parameters $\theta_0 = \{p_1, p_2, \dots, p_n, 0\}$ and $\theta_1 = \{p_1, p_2, \dots, p_n, \psi\}$, given the data $D$, the Bayes factor $B_{01}$ is defined by
$$ \frac{P(M_0|D)}{P(M_1|D)} = \frac{P(M_0)}{P(M_1)} \frac{P(D|M_0)}{P(D|M_1)} = \frac{P(M_0)}{P(M_1)} B_{01},$$where $ P(D|M_0) = \int P(D|\theta_0)P(\theta_0)d^n\theta_0$ and similarly for $P(D|M_1)$. $P(\theta_0)$ is the prior on the parameters of model $M_0$.
The priors on the parameters common to both models are the same, so we can use the Savage-Dickey Density Ratio to calculate the Bayes factor $$ B_{01} = \frac{P(\psi=0|D)}{P(\psi=0)}. $$
For a normal prior with standard deviation $\sigma_0$, $P(\psi=0) = 1/(\sigma_0\sqrt{2\pi})$.
The posterior probability distributions for the decorrelation parameters are normally well-behaved and close to Gaussian, as expected for a linear model. Assuming that they are normally distributed, that the standard deviation is given accurately by the error on the parameter from lmfit, and that a priori the two models are equally likely, we can calculate the Bayes factor for models with/without a parameter with value $p\pm\sigma_p$ from lmfit using
$$B_p = e^{-(p/\sigma_p)^2/2}\sigma_0/\sigma_p.$$These are the values listed in the "Bayes Factors" section of the report above. Parameters with Bayes factors $\overset{>}{\sim} 1$ are not supported by the data and can be removed from the model. In this example we can safely remove dfdt. We could instead remove dfdsin2phi, but we will keep it because it is one of a set.
N.B. this statistic is only valid for comparison of the models with/without one parameter, so parameters should be removed one-by-one and the test repeated for every new pair of models.
From the initial fit with no decorrelation we use the RMS of the residuals to set the priors on the decorrelation parameters, $\mathcal{N}(0, \sigma_0)$ or, for ${df}/{dt}$, $\mathcal{N}(0, \sigma_0/\Delta t)$ where $\Delta t$ is the duration of the visit. We then add decorrelation parameters to the fit one-by-one, selecting the parameter with the lowest Bayes factor at each step and stopping when $B_p>1$ for all remaining parameters. This process sometimes leads to a set of parameters that includes a few parameters that are strongly correlated with one another and are therefore not well determined, i.e., they have large Bayes factors. So, after adding parameters one-by-one, we go through a process of repeatedly removing the parameter with the largest Bayes factor while any of the parameters has a Bayes factor $B_p>1$.
For a derivation of the Savage-Dickey Density Ratio see Trotta (2007MNRAS.378...72T).
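In code, the Bayes factor $B_p$ defined above can be computed from an lmfit parameter along these lines (a sketch of the statistic reported in the "Bayes Factors" section; dfdbg is just an example name, and this assumes the last fit included it as a free parameter):

import numpy as np

def bayes_factor(p, sigma_p, sigma_0):
    """B_p = exp(-(p/sigma_p)**2/2)*sigma_0/sigma_p for a N(0, sigma_0) prior."""
    return np.exp(-0.5*(p/sigma_p)**2)*sigma_0/sigma_p

# e.g., for a fit that included dfdbg as a free parameter
par = dataset.lmfit.params['dfdbg']
print(f'B_p(dfdbg) = {bayes_factor(par.value, par.stderr, sigma_0):.2f}')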
dfdt is a linear trend with time and dfdcontam is a linear trend v. the contamination of the aperture by nearby stars. The contamination is estimated using the Gaia catalogue. If decontaminate=True is used in get_lightcurve() then the contamination is subtracted from the measured flux, so dfdcontam should be close to zero.
import re

detrend = {}
bestbf = 0
allpar = ['dfdsinphi', 'dfdcosphi',
          'dfdsin2phi', 'dfdcos2phi',
          'dfdsin3phi', 'dfdcos3phi',
          'dfdx', 'dfdy', 'dfdsmear',
          'dfdbg', 'dfdt', 'dfdcontam']
print('Parameter BF Delta_BIC RMS(ppm)')
while bestbf < 1:
    bestbf = np.inf
    for p in allpar:
        # Trial fit with parameter p added to the current decorrelation set
        dtmp = detrend.copy()
        dtmp[p] = tprior if p == 'dfdt' else dprior
        lmfit = dataset.lmfit_transit(T_0=ufloat(T_0, 0.001), P=P,
                                      D=(D/4, D, D*4), W=(W/4, W, W*4), b=(0, 0.5, 1),
                                      f_c=f_c, f_s=f_s,
                                      h_1=star.h_1, h_2=star.h_2.n,
                                      logrhoprior=star.logrho, **dtmp)
        # Extract the Bayes factor for p from the fit report
        bre = re.compile(r'{}: *(\d+\.\d{{3}})\n'.format(p))
        m = bre.findall(dataset.lmfit_report())
        if len(m) > 0:
            bf = float(m[-1])
            if bf < bestbf:
                bestbf = bf
                newpar = p
                best_bic, best_rms = lmfit.bic, lmfit.rms
    if bestbf < 1:
        print(f'+{newpar:<12s} {bestbf:6.2f} {best_bic-lmfit0.bic:8.1f} {1e6*best_rms:8.1f}')
        detrend[newpar] = tprior if newpar == 'dfdt' else dprior
        allpar.remove(newpar)
# Refit with the selected parameter set so that the report used below is up to date
lmfit = dataset.lmfit_transit(T_0=ufloat(T_0, 0.001), P=P,
                              D=(D/4, D, D*4), W=(W/4, W, W*4), b=(0, 0.5, 1),
                              f_c=f_c, f_s=f_s,
                              h_1=star.h_1, h_2=star.h_2.n,
                              logrhoprior=star.logrho, **detrend)
worstbf = 10
while worstbf > 1:
    worstbf = 0
    for p in detrend:
        bre = re.compile(r'{}: *(\d+\.\d{{3}})\n'.format(p))
        m = bre.findall(dataset.lmfit_report())
        if len(m) > 0:
            bf = float(m[-1])
            if bf > worstbf:
                worstbf = bf
                delpar = p
    if worstbf > 1:
        del detrend[delpar]
        lmfit = dataset.lmfit_transit(T_0=ufloat(T_0, 0.001), P=P,
                                      D=(D/4, D, D*4), W=(W/4, W, W*4), b=(0, 0.5, 1),
                                      f_c=f_c, f_s=f_s,
                                      h_1=star.h_1, h_2=star.h_2.n,
                                      logrhoprior=star.logrho, **detrend)
        print(f'-{delpar:<12s} {worstbf:6.2f} {lmfit.bic-lmfit0.bic:8.1f} {1e6*lmfit.rms:8.1f}')
lmfit = dataset.lmfit_transit(T_0 = ufloat(T_0, 0.001), P=P,
D=(D/4, D, D*4), W=(W/4, W, W*4),b=(0,0.5,1),
f_c=f_c, f_s=f_s,
h_1=star.h_1, h_2=star.h_2.n,
logrhoprior=star.logrho, **detrend)
print(dataset.lmfit_report(min_correl=0.5))
In this plot (detrend=True) the flux measurements have been corrected for the instrumental trends included in the model.
dataset.plot_lmfit(binwidth=0.008, figsize=(10,6), fontsize=14, detrend=True);
Same color-coding of data and models as before.
fig = dataset.rollangle_plot()
The method add_glint()
is being used here to calculate a spline fit to the residuals from the last fit. A scaled version of this function (which is stored in dataset) can then be added to the model.
glint_func = dataset.add_glint(nspline=48,binwidth=5,figsize=(10,4),gapmax=5)
The parameter glint_scale applies a linear scaling to the glint function we created and includes it in the model. Here we allow the glint to be scaled between 0 (no glint) and 2 (twice as much glint as estimated from the previous fit).
dataset.lmfit_transit(T_0 = ufloat(T_0, 0.001), P=P,
D=(D/4, D, D*4), W=(W/4, W, W*4),b=(0,0.5,1),
f_c=f_c, f_s=f_s,
h_1=star.h_1, h_2=star.h_2.n,
logrhoprior=star.logrho, **detrend,
glint_scale=(0,2))
print(dataset.lmfit_report(min_correl=0.5))
dataset.plot_lmfit(binwidth=0.008, figsize=(10,6), fontsize=14, detrend=True,
title='With glint v. roll-angle');
dataset.plot_lmfit(binwidth=0.008, figsize=(10,6), fontsize=14, detrend=False,
title='With glint v. roll-angle');
fig = dataset.rollangle_plot()
The previous fit is better, but the extra flux does not appear at exactly the same roll angle every time. The reason for this can be found by checking for solar system objects near the field of view. We see that the Moon was only 16$^{\circ}$ from the target. The position of the Moon varies noticeably during the observation, so we must account for this when we model the glint.
dataset.planet_check()
Residuals from the last fit will already be corrected for glint, so we can either re-run lmfit without glint_scale and re-fit, or just fit the out-of-transit flux.
For this example, let's create a mask so that we can fit the out-of-transit flux.
from pycheops.utils import phaser
phase = phaser(time,lmfit.params['P'],lmfit.params['T_0'],-0.5)
mask = abs(phase) < lmfit.params['W']/2
plot(phase[mask],flux[mask],'.',color='gray')
plot(phase[~mask],flux[~mask],'.',color='blue')
xlim(-0.09,0.09)
xlabel('Phase')
ylabel('Flux');
The previous glint function is overwritten by this new glint function.
Note that the spike v. Moon angle is sharper than the spike v. roll angle.
moon_glint = dataset.add_glint(moon=True,nspline=48,binwidth=5,
fit_flux=True, mask=mask,
figsize=(10,5))
dataset.lmfit_transit(T_0 = ufloat(T_0, 0.001), P=P,
D=(D/4, D, D*4), W=(W/4, W, W*4),b=(0,0.5,1),
f_c=f_c, f_s=f_s,
h_1=star.h_1, h_2=star.h_2.n,
logrhoprior=star.logrho, **detrend,
glint_scale=(0,2))
print(dataset.lmfit_report(min_correl=0.5))
The glint function will remove some of the count-rate variation with roll angle, so repeat the process of removing decorrelation parameters one-by-one.
worstbf = 10
while worstbf > 1:
    worstbf = 0
    for p in detrend:
        bre = re.compile(r'{}: *(\d+\.\d{{3}})\n'.format(p))
        m = bre.findall(dataset.lmfit_report())
        if len(m) > 0:
            bf = float(m[-1])
            if bf > worstbf:
                worstbf = bf
                delpar = p
    if worstbf > 1:
        del detrend[delpar]
        # Refit including the glint term so the Bayes factors stay consistent
        lmfit = dataset.lmfit_transit(T_0=ufloat(T_0, 0.001), P=P,
                                      D=(D/4, D, D*4), W=(W/4, W, W*4), b=(0, 0.5, 1),
                                      f_c=f_c, f_s=f_s,
                                      h_1=star.h_1, h_2=star.h_2.n,
                                      logrhoprior=star.logrho, **detrend,
                                      glint_scale=(0, 2))
        print(f'-{delpar:<12s} {worstbf:6.2f} {lmfit.bic-lmfit0.bic:8.1f} {1e6*lmfit.rms:8.1f}')
dataset.lmfit_transit(T_0 = ufloat(T_0, 0.001), P=P,
D=(D/4, D, D*4), W=(W/4, W, W*4),b=(0,0.5,1),
f_c=f_c, f_s=f_s,
h_1=star.h_1, h_2=star.h_2.n,
logrhoprior=star.logrho, **detrend,
glint_scale=(0,2))
print(dataset.lmfit_report(min_correl=0.5))
Note that for the roll angle plot there are now 3 panels. The residuals in the top panel are corrected for glint but not for other trends - the model for these is shown in brown. The fit of the glint function to the residuals corrected for other trends is shown in the middle panel. The lower panel shows the residuals from the complete model, i.e., flux - (transit $\times$ trends + glint).
fig = dataset.rollangle_plot()
dataset.plot_lmfit(binwidth=0.008, figsize=(10,6), fontsize=14, detrend=False,
title='Including moon glint ');
dataset.plot_lmfit(binwidth=0.008, figsize=(10,6), fontsize=14, detrend=True,
title='Including moon glint ');
The "bumps-and-wiggles" in the residuals from the previous fit are very likely to be due to real flux variations from the star due to convection in its atmosphere a.k.a. granulation noise or "flicker" (Sulis et al. 2020). The time scale ($\sim$ hour) and amplitude ($\sim 100$ppm) of the variation are typical granulation noise observed in other cool sub-giant stars.
A Gaussian Process (GP) is a way to model the granulation noise by specifying the properties of the correlation matrix for this correlated noise. We will use the celerite
package to do this because it can do the calculation of the likelihood for a given noise function very quickly, and it includes a kernel that is appropriate for granulation noise.
Our noise model will then have three components:
- the error estimate for each measurement provided in flux_err;
- a white-noise "jitter" term with standard deviation $\sigma$ added in quadrature to flux_err;
- correlated noise described by a celerite SHOTerm kernel, a stochastically-driven, damped simple harmonic oscillator with quality factor $Q=1/\sqrt{2}$.

With $Q=1/\sqrt{2}$, the kernel that describes the covariance between observations separated by a time $\tau$ is $$ k(\tau) = S_0\omega_0 e^{-\omega_0\tau/\sqrt{2}}\cos\left(\omega_0\tau/\sqrt{2} - \pi/4\right).$$
The free parameters of the noise model are then $\sigma$ (amplitude of the jitter), $\omega_0$ (related to correlation time scale) and $S_0$ (related to amplitude of the correlated noise). These are all strictly positive values for which we want to use a scale-free prior so we use the log of these values when we sample the posterior probability distribution (PPD). The sampling of the PPD is done using emcee.
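The corresponding celerite kernel can be constructed directly as below (a sketch with placeholder hyperparameter values; pycheops sets this up for you via add_shoterm=True in the next cell):

import numpy as np
from celerite import terms

# Stochastically-driven, damped SHO with Q = 1/sqrt(2), suitable for granulation.
# celerite parameterizes the term by log(S0), log(Q) and log(omega0).
sho = terms.SHOTerm(log_S0=np.log(1e-8),
                    log_Q=np.log(1/np.sqrt(2)),
                    log_omega0=np.log(2*np.pi/0.1))
# White-noise "jitter" with standard deviation sigma, added in quadrature
kernel = sho + terms.JitterTerm(log_sigma=np.log(1e-4))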
Directly fitting a GP to the data with all the transit and decorrelation parameters as free parameters leads to large errors on the transit parameters and a GP that implies much more variability in the flux than is seen in CHEOPS data for similar stars, i.e., the GP tries to model the transit as noise. To avoid this, we first do a GP fit with the transit parameters fixed. The results from this fit are used to put priors on the parameters of the noise model in the final fit.
# Copy the parameters from the last lmfit
params_fixed = dataset.lmfit.params.copy()
# Fix the transit parameters
for p in ['T_0','D','W','b']:
params_fixed[p].set(vary=False)
# Use emcee to sample the PPD. Note we add the SHOTerm at this step.
result = dataset.emcee_sampler(params=params_fixed, add_shoterm=True,
burn=512,steps=256,nwalkers=128)
print(dataset.emcee_report(min_correl=0.5))
With detrend=False (the default) the upper panel shows the unbinned and binned flux values used in the fit (light blue and dark blue, as before), the transit model (green) and the best-fit (maximum likelihood) model including all trends, glint and the GP noise model. Several samples from the PPD are used to plot other models that give a reasonable fit to the data in light brown.
The lower panel shows the residuals from non-GP parts of the model, i.e., flux - (transit $\times$ trends + glint). The GP noise model is shown in brown.
fig = dataset.plot_emcee(binwidth=0.008, figsize=(10,6), fontsize=14,
title='Fixed transit parameters')
With detrend=True, the flux values in the upper panel have been corrected for instrumental trends and glint. The transit model is shown in green, and the transit model + GP noise models are shown in brown (best-fit and samples, as before). The residuals are plotted in the lower panel in the same way as for detrend=False.
fig = dataset.plot_emcee(binwidth=0.008, figsize=(10,6), fontsize=14, detrend=True,
title='Fixed transit parameters')
We now create Gaussian priors on the noise model parameters centred on the mean of the PPD from the previous fit, with the standard deviation set to twice the width of the PPD. This keeps the noise model in a reasonable region of its parameter space while allowing us to explore whether the transit model parameters depend strongly on the noise model. This is done by setting the user_data attribute of the lmfit Parameter object to a ufloat value.
We already added the SHOTerm to the model in the previous step so we do not need to include "add_shoterm" in the call to emcee_sampler again. The output from this emcee run will overwrite the previous results stored in the dataset object. We are also over-writing the "result" variable that stores the output from the previous emcee run as an lmfit MinimizerResult object - we could give the result a different variable name here if we want to save those results.
# Get a copy of the parameters from the previous emcee run
params_prior = dataset.emcee.params.copy()
# Setting the noise model parameters
from uncertainties import ufloat
for p in ['log_S0', 'log_omega0', 'log_sigma']:
params_prior[p].user_data=ufloat(params_prior[p].value,2*params_prior[p].stderr)
# Restoring the transit model parameters as free parameters
for p in ['T_0','D','W','b']:
params_prior[p].set(vary=True)
# Run emcee again
result = dataset.emcee_sampler(params=params_prior, burn=512, steps=256, nwalkers=256)
print(dataset.emcee_report(min_correl=0.5))
fig = dataset.plot_emcee(binwidth=0.008, figsize=(10,6), fontsize=14, detrend=False,
title='GP prior')
fig = dataset.plot_emcee(binwidth=0.008, figsize=(10,6), fontsize=16, detrend=True)
fig.savefig('kelt11fig.png')
We can use a corner plot to view correlations between model parameters. Gaussian priors on parameters are shown using dashed-green lines to indicate the $\pm$ 1-sigma limits. In this first corner plot we can see how the PPD of the noise model parameters are narrower than their priors, i.e., the noise models are still being determined mostly by the data and the priors are only preventing these parameters from going to very large or small values.
font = {'family': 'serif', 'color': 'black', 'weight': 'normal','size': 18}
fig = dataset.corner_plot(
plotkeys=['log_S0','log_omega0','log_sigma'],
kwargs={'label_kwargs':{'fontdict':font}})
fig = dataset.corner_plot(plotkeys='all',kwargs={'label_kwargs':{'fontdict':font}})
fig = dataset.rollangle_plot()
A trail plot shows the parameter values v. step number for all the walkers in the sampler. This is a very effective way to check that the sampler has converged, i.e., that it is randomly sampling the posterior probability distribution.
In a "well-mixed" chain, the trail plot will look like noise for all the walkers and there will be no trends in the position or width of the trails.
fig = dataset.trail_plot('all')
Lomb-Scargle power-spectrum of the residuals.
The previous fit included a GP which is not included in the calculation of the residuals, i.e., the power spectrum includes the power "fitted-out" using the GP. The assumption here is that the GP has been used to model stellar variability that we wish to characterize using the power spectrum.
The red vertical dotted lines show the CHEOPS orbital frequency and its first two harmonics.
The likely range of $\nu_{\rm max}$ (peak power for stellar oscillation) is shown using green dashed lines.
fig = dataset.plot_fft(star, gsmooth=10)
Once we are happy with the fit to the transit light curve we can use the parameters derived to calculate the mass and radius of the planet. To do this we need to specify the semi-amplitude of the star's orbit due to the planet (K in m/s) and provide the star's mass and/or radius. If only the mass is provided then the radius is calculated from the stellar density obtained from the transit width, and vice versa.
For more details on the options in dataset.massradius, see the inline help for pycheops.funcs.massradius.
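The underlying relation is straightforward: given the mean stellar density from the transit and an assumed mass, the radius follows from $R = (3M/4\pi\rho)^{1/3}$. A sketch with astropy units and an illustrative density value (not the pycheops implementation):

import numpy as np
from astropy import units as u
from astropy.constants import M_sun, R_sun

rho_star = 0.10*u.g/u.cm**3     # illustrative mean stellar density
m_star = 1.44*M_sun             # assumed stellar mass
r_star = (3*m_star/(4*np.pi*rho_star))**(1/3)
print(f'R_star = {(r_star/R_sun).si:.2f} R_sun')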
r,fig=dataset.massradius(m_star=(1.44, 0.06), K=(18.5,1.7),jovian=False)
fig.savefig('KELT-11b_massradius.png')
This will save all the data and results from the last least-squares fit and the last emcee fit.
dataset.save();
MultiVisit can be used to analyse one or more datasets previously saved from pycheops.
The fit automagically includes the following model to account for trends in the data correlated with roll angle: $$\sum_{j=1}^n \alpha_j\sin(j\cdot\Omega t) + \beta_j\cos(j\cdot\Omega t),$$
where $\Omega$ is the angular rotation frequency of the spacecraft. The free parameters $\alpha_j$ and $\beta_j$ are never calculated explicitly. Instead, they are implicitly accounted for in the calculation of the likelihood using the trick explained by Luger et al. (2017RNAAS...1....7L). The number of terms in this model ($n=3$ by default) can be set using the parameter nroll. To disable this feature use unroll=False.
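Explicitly, this implicit linear model corresponds to a design matrix with the following basis functions (a sketch for illustration only; MultiVisit never constructs these columns):

import numpy as np

def roll_basis(t, Omega, n=3):
    """Columns sin(j*Omega*t), cos(j*Omega*t) for j = 1..n."""
    cols = []
    for j in range(1, n + 1):
        cols.append(np.sin(j*Omega*t))
        cols.append(np.cos(j*Omega*t))
    return np.column_stack(cols)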
If a saved dataset includes any other decorrelation parameters as free parameters in the last fit, e.g., dfdt, dfdbg, etc., then these are also included as free parameters in the fit to that dataset with MultiVisit. The same applies to glint_scale.
The unroll feature in MultiVisit assumes that the spacecraft roll angle changes linearly with time at a constant rate $\Omega$. The value of $\Omega$ is determined separately for each dataset. In fact there are smooth variations in $\Omega$, so the results from MultiVisit will not be quite the same as those obtained using dfdsinphi .. dfdcos3phi in Dataset.
See help(MultiVisit)
for more details on the following functions within MultiVisit
N.B. time values in MultiVisit objects are stored as BJD-2457000 (same as TESS) so we use the tzero() method here to calculate the time of mid-transit closest to the mid-point of the observations.
from pycheops import MultiVisit
M = MultiVisit('KELT-11b', id_kws={'dace':False})
TJD_0 = M.tzero(BJD_0,P)
result = M.fit_transit(log_omega0=params_prior['log_omega0'],
log_S0=params_prior['log_S0'],
f_c=f_c, f_s=f_s,
unroll=True, nroll=1,
extra_priors={
'T_0':ufloat(TJD_0,0.01),
'h_1':star.h_1,
'h_2':star.h_2,
'logrho':star.logrho},
burn=512, steps=256, nwalkers=256)
print(M.fit_report(min_correl=0.8))
Here we can check whether the approximation that $\Omega$ is constant is a good one for KELT-11. In general, variations in the roll angle rate are larger for stars far from the celestial equator.
fig, ax = subplots(nrows=2, sharex=True)
t = M.datasets[0].lc['time']
t0 = floor(t[0])
roll_angle = M.datasets[0].lc['roll_angle']
ax[0].plot(t - t0, roll_angle, '.')
ax[0].set_ylabel(r'Roll angle [$^{\circ}$]')
# Roll rate from the numerical gradient, normalised by its median value
drdt = gradient(roll_angle)
Omega = np.nanmedian(drdt)
ax[1].plot(t - t0, drdt/Omega, '.')
ax[1].set_ylim(0.99, 1.01)
ax[1].set_xlabel(f'BJD - {2457000 + t0:.0f}')
ax[1].set_ylabel(r'$\Omega/<\Omega>$');
There are three options for dealing with the non-linear change of roll angle with time ... one is to increase the number of harmonics in the unroll model using nroll; option 3 is to remove the roll-angle trend from each dataset using the decorrelation from its own previous fit. To enable option 3 use unwrap=True - here is an example for KELT-11b.
result = M.fit_transit(log_omega0=params_prior['log_omega0'],
log_S0=params_prior['log_S0'],
f_c=f_c, f_s=f_s,
unroll=True, nroll=1, unwrap=True,
extra_priors={
'T_0':ufloat(TJD_0,0.01),
'h_1':star.h_1,
'h_2':star.h_2,
'logrho':star.logrho},
burn=512, steps=256, nwalkers=256)
print(M.fit_report(min_correl=0.8))
M.plot_fit();
M.corner_plot(plotkeys='all');
result.params
M.trail_plot()
© Dr Pierre Maxted, Keele University, UK