Overview of momi2
Before assembling the watch, lay out every part and understand what it does.
momi2 at a glance. Panel A: The W-matrix (Polanski-Kimmel coefficients) as a heatmap – the combinatorial bridge between sample configurations and expected branch lengths. Panel B: Moran model transition dynamics showing how a delta distribution at frequency \(k\) spreads over time. Panel C: Admixture tensor – expected migrating lineage counts as a function of admixture fraction. Panel D: Expected time spent with \(j\) lineages under constant-size epochs with different population sizes \(N\).
What Does momi2 Do?
Given DNA sequence data from one or more populations, momi2 infers the
demographic history that produced the observed patterns of genetic variation.
Concretely, momi2 asks: What history of population size changes, splits,
and admixture events would produce a frequency spectrum that looks like the one
we observe?
Think of a fine mechanical watch. A watchmaker can deduce every gear ratio
inside the case by studying the positions of the hands on the dial. momi2
works the same way: it computes the expected hand positions (the expected SFS)
for a given gear configuration (a demographic model), then adjusts the gears
until the hands match what is observed. But unlike moments, which turns the
gears forward in time to see where the hands end up, momi2 traces the
hands backward through the coalescent to figure out what gear configuration
must have produced them.
From the inferred history, you learn:
Population sizes through time – expansions, contractions, bottlenecks
Divergence times – when populations split from each other
Admixture fractions – the proportion of ancestry contributed by each source during a pulse admixture event
Growth rates – exponential growth or decline within epochs
How momi2 Differs from moments and dadi
All three tools infer demography from the SFS. The difference is how they compute the expected SFS under a given model.
Property |
dadi |
moments |
momi2 |
|---|---|---|---|
Direction |
Forward (diffusion PDE) |
Forward (moment ODEs) |
Backward (coalescent) |
Core equation |
Wright-Fisher diffusion PDE on a frequency grid |
System of ODEs for SFS entries |
Tensor products over a demographic event tree |
Population model |
Continuous diffusion |
Continuous diffusion (via moments) |
Moran model (discrete) |
Gradient computation |
Finite differences |
Analytic or finite differences |
Automatic differentiation (autograd) |
Multi-population scaling |
Grid grows as \(O(n^D)\) – expensive beyond 3 populations |
ODE system grows polynomially – practical for 4–5 populations |
Tensor operations along event tree – scales well for many populations |
Selection |
Yes |
Yes |
No (neutral models only) |
The key trade-off
momi2 gives up selection modeling in exchange for two advantages:
exact gradients via automatic differentiation (enabling faster, more
reliable optimization) and a coalescent formulation that scales more
naturally to many populations. If you need selection, use moments
or dadi. If you have many populations and neutral models, momi2
is often the better choice.
The Key Innovation: Tensors + Autograd
The central insight of momi2 is that the expected SFS under any demographic
model can be written as a sequence of tensor operations – matrix
multiplications, convolutions, and antidiagonal summations – applied to
likelihood tensors that track allele configurations across populations.
Each operation corresponds to a demographic event:
Moran transition (a period of constant or exponential population size): multiply the likelihood tensor along the relevant axis by the Moran transition matrix \(P(t) = V e^{\Lambda t} V^{-1}\)
Population merge (a split event, viewed backward in time): convolve two population axes into one
Admixture pulse: contract the likelihood tensor with a 3-tensor encoding the binomial mixing of lineages
Because every one of these operations is differentiable, the Python library
autograd can compute exact gradients of the log-likelihood with respect to
all demographic parameters by backpropagating through the entire computation
graph. No hand-coded Jacobians, no finite-difference approximations, no
truncation error.
The Moran Model Connection
Within each epoch (a period between demographic events), momi2 needs to
compute how the distribution of lineage counts evolves over time. It uses the
Moran model – a continuous-time Markov chain on the number of derived
alleles in a sample of size \(n\).
The Moran model has a tridiagonal rate matrix with a known eigendecomposition. This means the transition matrix for any time \(t\) can be computed in closed form:
where \(V\) is the matrix of eigenvectors and \(\Lambda\) is the diagonal matrix of eigenvalues. No numerical ODE integration is needed – just a single matrix operation per epoch.
For epochs with exponential growth, the time is rescaled by integrating the inverse population size: \(\tau = \int_0^T 2/N(s)\, ds\), and the expected coalescence times are computed via closed-form expressions involving exponential integrals.
Parameters
Parameter |
Meaning |
Typical units |
|---|---|---|
\(N_e\) |
Effective population size (reference) |
Individuals |
\(N_i(t)\) |
Population size of deme \(i\) at time \(t\) |
Individuals (or scaled by \(N_e\)) |
\(T_{\text{split}}\) |
Time of population split (backward from present) |
Generations (or scaled: \(T / 4N_e\)) |
\(f\) |
Admixture fraction in a pulse event |
Proportion (0 to 1) |
\(g\) |
Exponential growth rate within an epoch |
Per generation |
\(\mu\) |
Per-site per-generation mutation rate |
\(\sim 10^{-8}\) |
The Flow in Detail
Step 1: Observed SFS
Count derived allele frequencies across k populations
from SNP data. Result: a k-dimensional histogram.
Step 2: Demographic model
Specify a population tree: which populations split when,
with what sizes, growth rates, and admixture events.
Step 3: Expected SFS via tensor computation
Traverse the event tree bottom-up (leaves to root):
- At each leaf: initialize a likelihood tensor
- At each epoch boundary: apply Moran transition matrix
- At each split: convolve two population tensors
- At each pulse: apply admixture 3-tensor
Accumulate the expected SFS along the way.
Step 4: Likelihood computation
Compare observed and expected SFS via composite
log-likelihood (multinomial or Poisson).
Step 5: Optimization
Compute gradients via autograd, then optimize
parameters using TNC, L-BFGS-B, ADAM, or SVRG.
Step 6: Uncertainty quantification
Use the autograd Hessian or bootstrap to estimate
confidence intervals on inferred parameters.
Ready to Build
The next four chapters disassemble momi2 gear by gear:
The Coalescent SFS – The dial: how expected branch lengths in the coalescent translate into expected SFS entries, and how demographic events modify these expectations.
The Moran Model – The escapement: the Moran model as a continuous-time Markov chain, its eigendecomposition, and how it governs lineage dynamics within each epoch.
Tensor Machinery – The gear train: likelihood tensors, convolution for population merges, the junction tree algorithm, and how the full SFS is assembled from parts.
Automatic Differentiation & Inference – The mainspring: autograd-powered optimization, composite likelihoods, stochastic gradient methods, and uncertainty quantification.