Chapter 8: Applications Under Error
“Theory is clean. Reality is messy. The art is knowing how much error you can tolerate.”
Connection to the Central Thesis
We’ve built the theoretical foundation. Now we apply it to real problems—each with its own error sources, constraints, and tolerances.
The unifying theme: every application involves approximating unknown dynamics from imperfect data. Success depends on:
Understanding the error sources
Choosing appropriate methods
Quantifying uncertainty
Validating against domain knowledge
This chapter surveys applications across domains, emphasizing practical considerations.
Sensor Fusion and State Estimation
The Problem
Multiple sensors measure the same system:
GPS: Position, noisy, 1 Hz
IMU: Acceleration, less noisy, 100 Hz
Wheel encoders: Velocity, very noisy
How do you combine them to estimate position, velocity, and acceleration?
The Kalman Filter Approach
Model the system as: $\(x_{k+1} = A x_k + B u_k + w_k\)\( \)\(y_k = C x_k + v_k\)$
The Kalman filter optimally fuses measurements when noise is Gaussian.
Where PyDelt Fits
PyDelt complements Kalman filtering:
Preprocessing: Smooth individual sensor streams before fusion
Derivative estimation: Get velocity from position, acceleration from velocity
Model identification: Discover A, B, C from data
from pydelt.interpolation import SplineInterpolator
import numpy as np
# GPS position data (1 Hz, noisy)
t_gps = np.arange(0, 100, 1.0)
pos_gps = true_position(t_gps) + 2.0 * np.random.randn(len(t_gps))
# Smooth and differentiate
interp = SplineInterpolator(smoothing=10.0) # Heavy smoothing for GPS
interp.fit(t_gps, pos_gps)
# Estimate velocity and acceleration
vel_est = interp.differentiate(order=1)
acc_est = interp.differentiate(order=2)
# Evaluate at high frequency for fusion with IMU
t_fine = np.arange(0, 100, 0.01)
vel_from_gps = vel_est(t_fine)
acc_from_gps = acc_est(t_fine)
Error Considerations
Sensor |
Error Source |
Typical Magnitude |
|---|---|---|
GPS |
Multipath, atmospheric |
2-5 m position |
IMU |
Bias drift, noise |
Accumulates over time |
Encoders |
Wheel slip, quantization |
1-5% velocity |
Key insight: Differentiation amplifies GPS noise. Heavy smoothing is essential.
Financial Time Series
The Problem
Estimate instantaneous volatility, drift, and Greeks from price data.
Challenges Specific to Finance
Non-stationarity: Parameters change over time
Jumps: Prices can gap (earnings, news)
Microstructure noise: Bid-ask bounce at high frequency
Fat tails: Extreme events more common than Gaussian
Volatility Estimation
from pydelt.interpolation import LowessInterpolator
import numpy as np
# Log returns
log_prices = np.log(prices)
returns = np.diff(log_prices)
# Realized volatility (rolling window)
window = 20
realized_vol = np.array([
np.std(returns[max(0,i-window):i]) * np.sqrt(252)
for i in range(1, len(returns)+1)
])
# Smooth volatility estimate
lowess = LowessInterpolator(frac=0.1)
lowess.fit(np.arange(len(realized_vol)), realized_vol)
vol_smooth = lowess(np.arange(len(realized_vol)))
Greeks from Market Data
Option prices V(S, t, σ, r) depend on underlying S, time t, volatility σ, rate r.
Greeks are partial derivatives:
Delta = ∂V/∂S (hedge ratio)
Gamma = ∂²V/∂S² (convexity)
Vega = ∂V/∂σ (volatility sensitivity)
Theta = ∂V/∂t (time decay)
from pydelt.multivariate import MultivariateDerivatives
from pydelt.interpolation import SplineInterpolator
# Option prices at different strikes and maturities
# strikes: array of strike prices
# maturities: array of times to expiration
# prices: 2D array of option prices
# Flatten for multivariate fitting
inputs = np.column_stack([
strikes.flatten(),
maturities.flatten()
])
outputs = prices.flatten()
mv = MultivariateDerivatives(SplineInterpolator, smoothing=0.01)
mv.fit(inputs, outputs)
# Delta = ∂V/∂K (approximately ∂V/∂S with sign flip)
gradient = mv.gradient()
delta_gamma = gradient(inputs) # [:, 0] is ∂V/∂K
Error Considerations
Greek |
Error Source |
Mitigation |
|---|---|---|
Delta |
Price noise, discrete strikes |
Smooth across strikes |
Gamma |
Second derivative amplifies noise |
Heavy smoothing |
Vega |
Implied vol uncertainty |
Use vol surface |
Theta |
Time discretization |
Interpolate in time |
Physics Simulations
The Problem
Extract physical quantities (forces, energies, transport coefficients) from simulation trajectories.
Molecular Dynamics
Atoms move according to Newton’s equations: $\(m_i \ddot{r}_i = -\nabla_i U(r_1, ..., r_N)\)$
From trajectories r_i(t), estimate:
Velocities: ṙ_i
Forces: F_i = mᵢr̈_i
Diffusion coefficients: from velocity autocorrelation
from pydelt.interpolation import SplineInterpolator
import numpy as np
# Particle trajectory from MD simulation
# positions: (n_frames, n_atoms, 3)
# times: (n_frames,)
def compute_velocities(times, positions):
"""Compute velocities for all atoms."""
n_frames, n_atoms, n_dims = positions.shape
velocities = np.zeros_like(positions)
for atom in range(n_atoms):
for dim in range(n_dims):
interp = SplineInterpolator(smoothing=0.01)
interp.fit(times, positions[:, atom, dim])
velocities[:, atom, dim] = interp.differentiate(order=1)(times)
return velocities
def compute_diffusion_coefficient(times, positions):
"""Compute diffusion coefficient from MSD."""
# Mean squared displacement
msd = np.mean(np.sum((positions - positions[0])**2, axis=2), axis=1)
# D = lim_{t→∞} MSD / (6t) in 3D
# Fit linear region
interp = SplineInterpolator(smoothing=1.0)
interp.fit(times, msd)
slope = interp.differentiate(order=1)(times[-1])
return slope / 6
Fluid Dynamics
From velocity fields u(x, t), compute:
Vorticity: ω = ∇ × u
Strain rate: S = (∇u + ∇uᵀ)/2
Dissipation: ε = 2ν S:S
from pydelt.multivariate import MultivariateDerivatives
# Velocity field on grid
# u_x, u_y: (nx, ny) arrays
# Compute velocity gradients
def compute_vorticity(x, y, u_x, u_y):
"""Compute vorticity ω = ∂u_y/∂x - ∂u_x/∂y."""
# Flatten grid
X, Y = np.meshgrid(x, y)
inputs = np.column_stack([X.flatten(), Y.flatten()])
# Fit u_x and u_y
mv_ux = MultivariateDerivatives(SplineInterpolator, smoothing=0.1)
mv_ux.fit(inputs, u_x.flatten())
mv_uy = MultivariateDerivatives(SplineInterpolator, smoothing=0.1)
mv_uy.fit(inputs, u_y.flatten())
# Gradients
grad_ux = mv_ux.gradient()(inputs) # [:, 0] = ∂u_x/∂x, [:, 1] = ∂u_x/∂y
grad_uy = mv_uy.gradient()(inputs) # [:, 0] = ∂u_y/∂x, [:, 1] = ∂u_y/∂y
# Vorticity
vorticity = grad_uy[:, 0] - grad_ux[:, 1]
return vorticity.reshape(u_x.shape)
Biomedical Signals
The Problem
Extract physiological information from noisy biosignals (ECG, EEG, EMG).
ECG Analysis
Heart rate variability (HRV) requires:
Detect R-peaks
Compute RR intervals
Estimate instantaneous heart rate
Compute HRV metrics (SDNN, RMSSD, frequency bands)
from pydelt.interpolation import SplineInterpolator
import numpy as np
def analyze_hrv(r_peak_times):
"""Compute HRV metrics from R-peak times."""
# RR intervals
rr_intervals = np.diff(r_peak_times)
rr_times = r_peak_times[:-1] + rr_intervals / 2
# Instantaneous heart rate
hr = 60 / rr_intervals # beats per minute
# Smooth for continuous HR estimate
interp = SplineInterpolator(smoothing=0.5)
interp.fit(rr_times, hr)
# Heart rate variability = derivative of HR
hr_variability = interp.differentiate(order=1)
# Time-domain metrics
sdnn = np.std(rr_intervals)
rmssd = np.sqrt(np.mean(np.diff(rr_intervals)**2))
return {
'hr_interp': interp,
'hr_variability': hr_variability,
'sdnn': sdnn,
'rmssd': rmssd
}
EEG Processing
Brain signals require:
Artifact removal (eye blinks, muscle)
Frequency band extraction (alpha, beta, gamma)
Phase and amplitude estimation
Key insight: EEG is inherently noisy. Heavy smoothing is appropriate for slow features (< 30 Hz), but fast oscillations require careful bandpass filtering first.
Machine Learning
Gradient-Based Optimization
All of deep learning is gradient descent: $\(\theta_{t+1} = \theta_t - \eta \nabla L(\theta_t)\)$
PyDelt’s relevance:
Loss landscape analysis: Estimate Hessian eigenvalues from loss samples
Hyperparameter sensitivity: ∂(validation loss)/∂(hyperparameter)
Neural network analysis: Jacobian of network outputs
Physics-Informed Neural Networks (PINNs)
Embed physical laws in the loss: $\(L = L_{\text{data}} + \lambda L_{\text{physics}}\)$
where: $\(L_{\text{physics}} = \left\|\frac{\partial u}{\partial t} - F\left(u, \frac{\partial u}{\partial x}, ...\right)\right\|^2\)$
PyDelt can:
Precompute derivatives for training data
Validate PINN predictions against numerical derivatives
Estimate derivatives where PINN is uncertain
Sensitivity Analysis
How do model outputs depend on inputs?
from pydelt.multivariate import MultivariateDerivatives
from pydelt.interpolation import SplineInterpolator
# Model predictions at various input points
# inputs: (n_samples, n_features)
# outputs: (n_samples,) or (n_samples, n_outputs)
mv = MultivariateDerivatives(SplineInterpolator, smoothing=0.1)
mv.fit(inputs, outputs)
# Sensitivity = gradient magnitude
gradient = mv.gradient()
sensitivities = gradient(inputs)
# Feature importance = average |∂output/∂feature|
feature_importance = np.mean(np.abs(sensitivities), axis=0)
Error Budget Framework
Systematic Approach
For any application, construct an error budget:
Error Source |
Magnitude |
Mitigation |
|---|---|---|
Measurement noise |
σ_meas |
Smoothing, filtering |
Sampling rate |
Δt |
Interpolation |
Model mismatch |
ε_model |
Better model class |
Numerical precision |
ε_num |
Usually negligible |
Total |
√(Σε²) |
Propagate through pipeline |
Propagation Through Derivatives
If measurement error is σ and you estimate f’ via finite differences with step h:
For smoothing splines with n points:
Validation Strategies
Synthetic data: Generate from known system, verify recovery
Cross-validation: Hold out data, predict derivatives
Physical constraints: Do derivatives satisfy conservation laws?
Sensitivity analysis: How do results change with parameters?
Practical Checklist
Before applying PyDelt to a new problem:
1. Understand Your Data
What is the sampling rate?
What is the noise level?
Are there outliers or gaps?
Is the underlying process smooth?
2. Choose Appropriate Method
Low noise → Splines or finite differences
High noise → LOWESS or heavy smoothing
Outliers → Robust methods (LOWESS)
Complex patterns → Neural networks
3. Validate Results
Do derivatives have correct sign and magnitude?
Are results stable to smoothing parameter?
Do they satisfy physical constraints?
Cross-validation error acceptable?
4. Quantify Uncertainty
Estimate error from noise propagation
Bootstrap confidence intervals
Sensitivity to method choice
Key Takeaways
Every application has domain-specific error sources
Smoothing parameters should reflect noise level
Validation against domain knowledge is essential
Error budgets help quantify uncertainty
No single method works for all applications
Exercises
Sensor fusion: Simulate GPS (1 Hz, σ=2m) and IMU (100 Hz, σ=0.1 m/s²) data. Use PyDelt to estimate velocity from GPS. Compare to integrated IMU.
Option Greeks: Download option chain data. Estimate Delta and Gamma across strikes. Compare to Black-Scholes theoretical values.
HRV analysis: Use a public ECG dataset. Compute instantaneous heart rate and HRV metrics. Validate against published norms.
Error budget: For your favorite application, construct an error budget. Identify the dominant error source.
Previous: ← Stochastic Calculus | Back to: Introduction
Further Reading
General Numerical Methods
Press et al., Numerical Recipes
Trefethen, Approximation Theory and Approximation Practice
Domain-Specific
Finance: Hull, Options, Futures, and Other Derivatives
Physics: Frenkel & Smit, Understanding Molecular Simulation
Biomedical: Rangayyan, Biomedical Signal Analysis
ML: Goodfellow et al., Deep Learning