
Example - Spring Mass System
=============================

This example provides a simple demonstration of SMCPy functionality. The goal
is to inversely determine the uncertainty in the model parameters given a set
of noisy observations of the spring mass system using Sequential Monte Carlo
(SMC) and to compare the results to Markov chain Monte Carlo (MCMC). As SMC
relies on a MCMC kernel, a fully functional MCMC sampler is included in the
SMCPy package. In this example, it is used for verification of the SMC sampler.
The example covers all steps for implementing SMC and MCMC samplers using
SMCPy, including the creation of a user-defined computational model (spring
mass numerical integrator) that uses the standardized SMCPy interface, defining
prior distributions, initializing the samplers, computing statistical moments
with the resulting estimators, and plotting the results. The full source code
for this example can be found in the SMCPy repository:
``/SMCPy/examples/spring_mass/spring_mass_example.py`` and
``/SMCPy/examples/spring_mass/mcmc_verify/spring_mass_mcmc.py`` for the SMC and
MCMC samplers, respectively. Data generation was conducted using
``/SMCPy/examples/spring_mass/generate_noisy_data.py``.

.. _spring-mass:

.. figure:: images/spring_mass_diagram.png
    :align: center
    :width: 2in

Problem Specification
----------------------

The governing equation of motion for the system is given by

.. math:: m_s \ddot{z}  = -k_s z + m_s g
    :label: springmass

where :math:`m_s` is the mass, :math:`k_s` is the spring stiffness, :math:`g`
is the acceleration due to gravity, :math:`z` is the vertical displacement of
the mass, and :math:`\ddot{z}` is the acceleration of the mass. The true
stiffness and gravitational constant are unknown. The goal of this example is
to, by observing the motion of the mass over time, :math:`z_t`, estimate both
:math:`K_s` and :math:`G`, which are random variables representing the
uncertainty in the value of spring stiffness and the gravitational constant,
respectively. For reasons outside the scope of this example, the problem
becomes ill-posed unless the stiffness is normalized by mass such that
:math:`k_s^* = k_s/m_s` and :math:`k_s^*` are realizations of the random
variable of interest :math:`K_s^*`. The equation of motion given by Equation
(1) is now

.. math:: \ddot{z}  = -k_s^* z + g.
    :label: springmass_mod

Given observations :math:`z_t` which, assuming measurement noise exists, are realizations of the random variable :math:`Z_t`, the relationship between the computational model, measurement noise, and observations is

.. math:: Z_t = f_t(K_s^*, G) + \epsilon_t
    :label: springmass_stat_model

Here, :math:`\epsilon_t` is the measurement noise associated with each observation of displacement taken over time (also a random variable), and :math:`f_t` is the computational model response at time :math:`t`, which involves numerical integration of Equation (2).

The inverse solution of equations in the form of (2) is the posterior
distribution, or, in the context of this example, the joint distribution of
:math:`K_s^*` and :math:`G` conditional on the observed data, :math:`z_t`. While
difficult to solve directly, the posterior distribution can be approximated via
sampling methods such as SMC and MCMC. This example covers this process using
the SMCSampler and MCMCSampler classes in the SMCPy Python module. In
particular, the objective is to approximate the expected value of the model
parameters given observations :math:`z_t`.

The MCMC sampler draws samples from the unknown posterior distribution
by forming a Markov chain through the parameter space whose stationary
distribution is the posterior. The samples forming the chain can then be used
to build an estimator of the expected value

.. math:: E[X] = \frac{1}{N} \sum_{i=1}^N X_i

where :math:`X_i` is the random variable of interest, and :math:`i=1,\ldots,N`, with :math:`N` being the number of equally-weighted samples drawn using the MCMC sampler.

While MCMC is a proven approach to evaluting the quantities of interest, it can
be slow if the computational model is expensive. The Markovian nature of MCMC
means it is inherently a serial process. SMC, on the other hand, is a
parallelizable alternative that uses weighted samples called "particles." In
SMC, a sequence of target distributions is defined that gradually transitions
from the typically-known prior distribution to the posterior distribution of
interest. A particle filtering framework based on importance sampling and a
MCMC transition kernel can then be introduced that allows for recursive
estimation of each target in the sequence. Taking the final particle state, the
SMC estimator is

.. math:: E[X] = \sum_{j=1}^M W_j X_j

where :math:`W_j` is the normalized weight associated with the :math:`j^{th}` particle and :math:`M` is the total number of particles.

For this example, synthetic data will be generated by adding noise to a model response for a given set of "true" parameters. Next, MCMC and SMC will be used to estimate the expected value of these parameters with the expectation that these estimates are close to the true parameter values. Note that the code used to generate results in this example is actually split between two files in the SMCPy package. The two will be combined here and redundant code will be skipped.


Step 1: Initialization; define the model and generate the data
--------------------------------------------------------------

Begin by importing the needed Python modules, including SMCPy sampler classes
and the SpringMassModel class that defines the spring mass numerical
integrator:

.. code-block:: python

    import numpy as np
    from smcpy.smc.smc_sampler import SMCSampler
    from smcpy.mcmc.mcmc_sampler import MCMCSampler
    from spring_mass_models import SpringMassModel

Below is a snippet of the SpringMassModel class; the entire class can be found in the SMCPy repo (``/SMCPy/examples/spring_mass/spring_mass_model.py``):

.. code-block:: python

  from smcpy.model.base_model import BaseModel

  ...

  class SpringMassModel(BaseModel):
      '''
      Defines Spring Mass model with 2 free params (spring stiffness, k &
      mass, m)
      '''
      def __init__(self, state0=None, time_grid=None):


Note that user-defined models in SMCPy must inherit from the SMCPy abstract class ``BaseModel`` and implement an  ``evaluate`` function that accepts and returns numpy arrays for inputs and outputs, respectively. Here, the ``state0`` argument defines the initial state of the spring mass system, and ``time_grid`` defines the times at which to return displacement.

The first step in an analysis is to obtain data from which to make an
inference. In this example, this data will come in the form of observations of
the z-displacement of the mass made over time. For demonstration purposes, the
data will be generated from the spring mass model, and noise will be added by
sampling from a zero-mean Gaussian distribution and adding these values to the
model output. While not a realistic case, it is typical to generate and use
synthetic data in this manner for verification purposes when performing inverse
uncertainty quantification. The following code snippet is from
``generate_noisy_data.py`` and demonstrates the generation of data assuming the
ground truth parameters, or those that are unknown and are to be estimated, are
:math:`k_s^*=1.67` and :math:`g^*=4.62`:

.. code-block:: python

  # Initialize model
  state0 = [0., 0.]                        #initial conditions
  measure_t_grid = np.arange(0., 5., 0.2)  #time 
  model = SpringMassModel(state0, measure_t_grid)

  # Define the ground truth
  true_params = {'K': 1.67, 'g': 4.62}
  
  # Load data
  noise_stddev = 0.5
  displacement_data = model.generate_noisy_data_with_model(noise_std_dev, truth_params)

Once the data has been generated, the only remaining task is to use the SMC and MCMC sampler classes to generate estimates of the parameter expected values.

Step 2: Perform Parameter Estimation using SMCPy
------------------------------------------------

To instance the SMC sampler class, the data, model and parameter prior
distributions must be passed to the constructor. The first two have been
defined already. The parameter priors can be defined using the SMCPy standard
format. In this case, both prior distributions will be defined as Uniform
bounded from 0.0 to 10.0.

.. code-block:: python

  # Define prior distributions
  param_priors = {'K': ['Uniform', 0.0, 10.0],
                  'g': ['Uniform', 0.0, 10.0]}
  
Other options, such as Normal or Truncated Normal distributions are supported.

All that is left now is to use the ``sample()`` method, which requires the number of particles, number of time steps (i.e., the number of target distributions to define in the sequence) and the number of MCMC proposals to make when implementing the MCMC transition kernel between targets. Another optional but useful argument used here is the effective sample size (ESS) threshold. The ESS is a measure of how many particles in the overall population have significant weight. If this number falls below the ESS threshold, the particles are resampled with replacement to reduce degeneracy.

.. code-block:: python

  # SMC sampling
  num_particles = 5000
  num_time_steps = 20
  num_mcmc_steps = 1
  smc = SMCSampler(displacement_data, model, param_priors)
  pchain = smc.sample(num_particles, num_time_steps, num_mcmc_steps, noise_stddev,
                      ess_threshold=num_particles*0.5)

The object returned by the sampler is referred to as a "particle chain," which
is just a data storage object that contains all particle values, weights, and
likelihoods computed at each step in the sampling process. This object has some useful methods, such as plotting tools and estimators. One such estimator is that of Equation (5), the quantity of interest for this example.

.. code-block:: python

  # Calculate smc-estimated means
  smc_means = pchain.get_mean()

These values will be stored to compare with MCMC in the next section.

Before moving on, a pairwise plot can be generated to visualize parameter uncertainty and correlation using the ``plot_pairwise_weights()`` method of the particle chain object. The particles are represented as colored dots, with the colorbar showing normalized particle weight. The mean value of each parameter is shown as a dotted orange line.

.. code-block:: python

  pchain.plot_pairwise_weights(save=True, show=False)

.. _pairwise:

.. figure:: ../examples/spring_mass/pairwise.png
    :align: center
    :width: 3.5in

Step 3: Perform Parameter Estimation using MCMCPy
-------------------------------------------------

Repeating the estimation of parameter means is similar with the MCMCPy submodule. Instancing the ``MCMCSampler`` class is done in the same way as before.

.. code-block:: python

  mcmc = MCMCSampler(displacement_data, model, param_priors)

Additional parameters have to be set prior to sampling, however. In this
example, it is assumed that the measurement noise variance is known, so the
variance will be fixed and provided in the form of the standard deviation used
when generating the data, previously. Additionally, an initial guess (i.e.,
where to initialize the Markov chain within the parameter space) are specified.

.. code-block:: python

  initial_guess = {'K': 1.0, 'g': 1.0}
  mcmc.generate_pymc_model(q0=initial_guess, std_dev0=noise_stddev, fix_var=True)

When using the ``sample()`` method, the number of samples, :math:`N`, must be
defined along with the number of those samples that will be discarded as
burn-in. Burn-in is defined as the samples generated prior to the Markov chain
reaching its stationary condition. In practice, the number of samples in the
burn-in period is unknown and conservatively estimated.

.. code-block:: python

  num_samples = 100000
  num_samples_burnin = 5000
  mcmc.sample(num_samples, num_samples_burnin)

The MCMC module is a wrapper built around the PyMC package (https://github.com/pymc-devs/pymc). The stored samples can be accessed through the PyMC trace object as follows: ``<mcmc_object>.MCMC.trace(<param_name>)[:]``. The means can be calculated from the trace.

.. code-block:: python

  Kmean = np.mean(mcmc.MCMC.trace('K')[:])
  gmean = np.mean(mcmc.MCMC.trace('g')[:])

A pairwise plot can also be generated with the MCMC results. Note that the samples are not weighted in this case.

.. code-block:: python

  mcmc.plot_pairwise(keys=['K', 'g'], filename='mcmc_pairwise.png')

.. _mcmc_pairwise:

.. figure:: ../examples/spring_mass/mcmc_verify/pairwise.png
    :align: center
    :width: 3.25in


Comparing the Results
---------------------

The mean values obtained for a given run of the example scripts is shown below.
Note that both SMC and MCMC sampling involve random numbers, meaning these
numbers will be slightly different for every run. The true parameters set in the data generation step were :math:`k_s^* = 1.67` and :math:`g^* = 4.62`.

====================     =====================     =====================
Parameter                MCMC-estimated Mean       SMC-estimated mean
====================     =====================     =====================
:math:`k_s^*`            1.5397276686462869        1.5567662328730911
:math:`g^*`              4.231683157045673         4.276694515489001
====================     =====================     =====================

Running the SMCSampler in Parallel with mpi4py
----------------------------------------------

As mentioned, the ``SMCSampler`` class was designed with high performance computing in mind. The sampler uses the mpi4py package (https://bitbucket.org/mpi4py/mpi4py) to run model evaluations at each SMC step in parallel. The SMC example script can be run in parallel using ``mpirun -np <number of processors> python spring_mass_example.py``.
