Chapter 1: Numerical Differentiation
“The derivative is a limit. In practice, we can’t take limits—we can only approximate them.”
Connection to the Central Thesis
Recall our core problem: we observe a system governed by unknown dynamics dx/dt = f(x,t), but we only have discrete, noisy samples. We cannot compute the exact derivative—we can only approximate it.
Finite differences are the simplest approximation. They work directly on the data without fitting a surrogate. This makes them fast and assumption-free, but also fragile: they amplify noise and require careful step size selection.
Understanding finite differences is essential because:
They reveal why differentiation is hard (ill-posed inverse problem)
They establish error analysis techniques used throughout
They motivate why we need smoothing and interpolation
The Fundamental Problem
You have discrete samples (tᵢ, yᵢ) and want to estimate dy/dt. The textbook definition:
is useless because:
You can’t take h → 0 with finite precision arithmetic
Your data has gaps—you can’t evaluate f at arbitrary points
Your measurements have noise
This chapter covers the classical approaches and their limitations.
Finite Difference Formulas
Forward, Backward, and Central Differences
Given evenly-spaced data with step h:
Formula |
Expression |
Error Order |
|---|---|---|
Forward |
(f(x+h) - f(x)) / h |
O(h) |
Backward |
(f(x) - f(x-h)) / h |
O(h) |
Central |
(f(x+h) - f(x-h)) / 2h |
O(h²) |
The central difference is more accurate because errors cancel symmetrically.
Derivation via Taylor Series
Expand f(x+h) and f(x-h) around x:
Subtracting:
So:
The error is O(h²), not O(h).
Higher-Order Formulas
More points = higher accuracy:
Five-point stencil for f’(x): $\(f'(x) \approx \frac{-f(x+2h) + 8f(x+h) - 8f(x-h) + f(x-2h)}{12h} + O(h^4)\)$
Second derivative (central): $\(f''(x) \approx \frac{f(x+h) - 2f(x) + f(x-h)}{h^2} + O(h^2)\)$
Implementation
import numpy as np
def finite_diff_derivative(y, h, order=1, accuracy=2):
"""
Compute derivative using finite differences.
Parameters:
y: array of function values
h: step size (assumed uniform)
order: derivative order (1 or 2)
accuracy: 2 or 4 (determines stencil width)
"""
n = len(y)
dy = np.zeros(n)
if order == 1 and accuracy == 2:
# Central difference O(h²)
dy[1:-1] = (y[2:] - y[:-2]) / (2*h)
dy[0] = (y[1] - y[0]) / h # Forward at boundary
dy[-1] = (y[-1] - y[-2]) / h # Backward at boundary
elif order == 1 and accuracy == 4:
# Five-point stencil O(h⁴)
dy[2:-2] = (-y[4:] + 8*y[3:-1] - 8*y[1:-3] + y[:-4]) / (12*h)
# Fall back to lower order at boundaries
dy[0] = (y[1] - y[0]) / h
dy[1] = (y[2] - y[0]) / (2*h)
dy[-2] = (y[-1] - y[-3]) / (2*h)
dy[-1] = (y[-1] - y[-2]) / h
elif order == 2 and accuracy == 2:
# Second derivative O(h²)
dy[1:-1] = (y[2:] - 2*y[1:-1] + y[:-2]) / h**2
dy[0] = dy[1] # Extrapolate at boundaries
dy[-1] = dy[-2]
return dy
The Two Sources of Error
Truncation Error
From dropping higher-order Taylor terms. For central difference:
Decreases as h → 0.
Rounding Error
From finite precision arithmetic. When subtracting nearly-equal numbers:
where ε_machine ≈ 10⁻¹⁶ for float64.
Increases as h → 0.
The Optimal Step Size
Total error ≈ truncation + rounding:
Minimize by taking derivative with respect to h and setting to zero:
For typical functions and float64: h_opt ≈ 10⁻⁵ to 10⁻⁶.
import numpy as np
import matplotlib.pyplot as plt
def test_optimal_h():
"""Demonstrate optimal step size."""
f = np.sin
f_prime = np.cos
x = 1.0
h_values = np.logspace(-15, 0, 100)
errors = []
for h in h_values:
approx = (f(x + h) - f(x - h)) / (2 * h)
true = f_prime(x)
errors.append(abs(approx - true))
plt.loglog(h_values, errors)
plt.xlabel('Step size h')
plt.ylabel('Absolute error')
plt.title('Finite Difference Error vs Step Size')
plt.axvline(1e-5, color='r', linestyle='--', label='Optimal h ≈ 10⁻⁵')
plt.legend()
plt.show()
# Find optimal
opt_idx = np.argmin(errors)
print(f"Optimal h: {h_values[opt_idx]:.2e}, Error: {errors[opt_idx]:.2e}")
The Noise Amplification Problem
With noisy data y = f(x) + ε where ε ~ N(0, σ²):
The noise term has variance:
Noise in derivative scales as σ/h. Smaller h = more noise amplification.
Demonstration
import numpy as np
from pydelt.interpolation import SplineInterpolator
# True function and derivative
t = np.linspace(0, 2*np.pi, 200)
y_true = np.sin(t)
dy_true = np.cos(t)
# Add noise
noise_level = 0.02
y_noisy = y_true + noise_level * np.random.randn(len(t))
# Finite difference (naive)
h = t[1] - t[0]
dy_fd = np.gradient(y_noisy, h)
# Compare noise levels
print(f"Signal noise std: {np.std(y_noisy - y_true):.4f}")
print(f"Derivative noise std: {np.std(dy_fd - dy_true):.4f}")
print(f"Amplification factor: {np.std(dy_fd - dy_true) / np.std(y_noisy - y_true):.1f}x")
Irregular Grids
Real data often has non-uniform spacing. The formulas change:
Two-Point (Non-uniform)
Three-Point (Non-uniform)
For points x₀, x₁, x₂ with spacings h₁ = x₁ - x₀ and h₂ = x₂ - x₁:
Fornberg’s Algorithm
For arbitrary stencils, use Fornberg’s algorithm to compute optimal weights:
def fornberg_weights(z, x, m):
"""
Compute finite difference weights for derivative of order m
at point z using function values at points x.
Based on Fornberg (1988).
"""
n = len(x) - 1
c = np.zeros((n+1, m+1))
c[0, 0] = 1
c1 = 1
for i in range(1, n+1):
mn = min(i, m)
c2 = 1
for j in range(i):
c3 = x[i] - x[j]
c2 *= c3
if i <= m:
c[i-1, i] = 0
for k in range(mn, -1, -1):
c[i, k] = c1 * (k * c[i-1, k-1] - (x[i-1] - z) * c[i-1, k]) / c2
c[j, k] = ((x[i] - z) * c[j, k] - k * c[j, k-1]) / c3
c1 = c2
return c[:, m]
PyDelt’s Finite Difference Methods
PyDelt provides robust finite difference implementations:
from pydelt import finite_difference
# Basic usage
t = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * t)
# First derivative
dy = finite_difference(t, y, order=1)
# Second derivative
d2y = finite_difference(t, y, order=2)
# With specified accuracy
dy_high = finite_difference(t, y, order=1, accuracy=4)
When Finite Differences Fail
Finite differences are inadequate when:
Noise level is significant (σ/h becomes large)
Data is sparse (can’t use higher-order stencils)
Function has sharp features (truncation error dominates)
You need derivatives at non-sample points
The solution: smoothing and interpolation, covered in the next chapters.
Key Takeaways
Central differences are O(h²), better than forward/backward O(h)
Optimal h balances truncation and rounding error (~10⁻⁵ for float64)
Noise amplifies as 1/h—smaller steps = more noise
Irregular grids require modified formulas (Fornberg’s algorithm)
Finite differences alone are insufficient for noisy data
Exercises
Optimal step size: For f(x) = exp(x) at x = 0, find the step size that minimizes total error. Compare to the theoretical prediction.
Noise amplification: Generate sin(x) with Gaussian noise σ = 0.01. Plot derivative error vs. step size h. At what h does noise dominate?
Higher-order stencils: Implement the 5-point stencil for f’(x). Compare accuracy to 3-point for f(x) = x⁵ at x = 1.
Non-uniform grid: Generate data on a random grid. Implement the 3-point non-uniform formula and compare to linear interpolation + uniform finite difference.
Previous: ← The Numerical Calculus Problem | Next: Noise and Smoothing →