Chapter 4: Integration Intuition
“Integration is the inverse of differentiation. It accumulates change over time. It computes areas, volumes, and expectations.”
The Two Faces of Integration
Integration has two complementary interpretations:
Geometric: The area under a curve
Analytical: The reverse of differentiation (antiderivative)
Both are connected by the Fundamental Theorem of Calculus—one of the most important results in mathematics.
Integration as Accumulation
The Intuition
If the derivative tells you the rate of change, the integral tells you the total change.
Example: Velocity and Distance
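If v(t) is your velocity at time t
Then ∫v(t)dt over a time interval is your displacement, the net change in position (this equals the total distance traveled whenever v(t) does not change sign)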
Velocity (rate) ──derivative──> Acceleration
                <──integral───
Position ──derivative──> Velocity ──derivative──> Acceleration
         <──integral───           <──integral───
In Code
import numpy as np
from pydelt.integrals import integrate_derivative
# Velocity data (e.g., from a sensor)
time = np.linspace(0, 10, 101) # 101 samples, so time[50] is t = 5
velocity = 2 * time # Constant acceleration of 2: v = 2t
# Integrate to get position
# If v = 2t, then position = t² (plus initial position)
position = integrate_derivative(velocity, time, initial_value=0)
# Check: position should be approximately t²
print(f"Position at t=5: {position[50]:.2f}") # Should be ~25
print(f"Exact: {5**2}") # 25
Integration as Area
The Geometric View
The definite integral ∫ₐᵇ f(x)dx represents the signed area between the curve f(x) and the x-axis, from x=a to x=b.
f(x)
 │    ╱╲
 │   ╱  ╲        ← Area above x-axis (positive)
 │  ╱    ╲
─┼─╱──────╲────── x
 │         ╲
 │          ╲    ← Area below x-axis (negative)
   a        b
Area above the x-axis counts as positive
Area below the x-axis counts as negative
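The sign convention can be checked numerically. The sketch below is an illustration only (the helper `trapezoid` is defined here and is not part of PyDelt): over one full period of sin(x) the positive and negative lobes cancel, while integrating |sin(x)| counts both lobes as positive.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule for samples y on grid x."""
    return np.sum((y[:-1] + y[1:]) / 2 * np.diff(x))

x = np.linspace(0.0, 2.0 * np.pi, 2001)

# Signed area: the lobe above the axis cancels the lobe below it
signed_area = trapezoid(np.sin(x), x)

# Unsigned area: taking |f| makes both lobes count as positive
unsigned_area = trapezoid(np.abs(np.sin(x)), x)

print(f"Signed area of sin over [0, 2π]:   {signed_area:.6f}")
print(f"Unsigned area of sin over [0, 2π]: {unsigned_area:.6f}")
```

The signed result is essentially zero and the unsigned result is close to 4, i.e. two lobes of area 2 each.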
Why This Matters for ML
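Probability distributions are defined by integrals. For a continuous random variable X with density p(x):
$$P(a \le X \le b) = \int_a^b p(x)\,dx$$
The total probability must equal 1:
$$\int_{-\infty}^{\infty} p(x)\,dx = 1$$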
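As a quick numerical check of both statements, the sketch below integrates a standard normal density, written out by hand so that only NumPy is needed. The grid limits of ±8 are an assumption: beyond them the density is negligible.

```python
import math
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule for samples y on grid x."""
    return np.sum((y[:-1] + y[1:]) / 2 * np.diff(x))

def normal_pdf(x):
    """Standard normal density."""
    return np.exp(-0.5 * x**2) / math.sqrt(2.0 * math.pi)

# Total probability: integrate over a range wide enough to hold almost all the mass
x_all = np.linspace(-8.0, 8.0, 16001)
total = trapezoid(normal_pdf(x_all), x_all)

# Probability of an interval: P(-1 <= X <= 1)
x_mid = np.linspace(-1.0, 1.0, 2001)
prob_within_one_sigma = trapezoid(normal_pdf(x_mid), x_mid)

# Closed form for comparison: P(-1 <= X <= 1) = erf(1/sqrt(2))
exact = math.erf(1.0 / math.sqrt(2.0))

print(f"Total probability: {total:.6f}")
print(f"P(-1 <= X <= 1):   {prob_within_one_sigma:.6f} (erf-based value {exact:.6f})")
```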
The Fundamental Theorem of Calculus
This theorem connects derivatives and integrals:
Part 1: Differentiation Undoes Integration
If F(x) = ∫ₐˣ f(t)dt, then F’(x) = f(x).
Translation: If you integrate a function and then differentiate the result, you get back the original function.
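Part 1 can be illustrated numerically. This is a sketch using plain NumPy rather than PyDelt: build the running integral F of f(x) = cos(x) with the trapezoidal rule, differentiate F with finite differences, and compare the result with f.

```python
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 2001)
f = np.cos(x)

# F(x) = ∫₀ˣ f(t) dt, built as a cumulative sum of trapezoid areas
steps = (f[:-1] + f[1:]) / 2 * np.diff(x)
F = np.concatenate(([0.0], np.cumsum(steps)))

# Differentiate F numerically (central differences at interior points)
F_prime = np.gradient(F, x)

# Compare at interior points, where the finite differences are most accurate
max_err = np.max(np.abs(F_prime[1:-1] - f[1:-1]))
print(f"Max |F'(x) - f(x)| at interior points: {max_err:.2e}")
```

The remaining discrepancy comes from the two discretizations, not from the theorem, and it shrinks as the grid is refined.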
Part 2: Integration Undoes Differentiation
If F’(x) = f(x), then ∫ₐᵇ f(x)dx = F(b) - F(a).
Translation: To compute a definite integral, find an antiderivative and evaluate at the endpoints.
Example
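One small worked case of Part 2, chosen here for illustration: take f(x) = 2x on the interval [1, 3]. An antiderivative is F(x) = x², since F’(x) = 2x, so

$$\int_1^3 2x\,dx = F(3) - F(1) = 3^2 - 1^2 = 8$$

The geometric view gives the same number: the region under the line y = 2x between x = 1 and x = 3 is a trapezoid with parallel sides 2 and 6 and width 2, so its area is (2 + 6)/2 · 2 = 8.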
Common Integrals
| Function | Integral (Antiderivative) |
|---|---|
| xⁿ | xⁿ⁺¹/(n+1) + C (n ≠ -1) |
| 1/x | ln\|x\| + C |
| eˣ | eˣ + C |
| sin(x) | -cos(x) + C |
| cos(x) | sin(x) + C |
| 1/(1+x²) | arctan(x) + C |
The “+ C” is the constant of integration—since the derivative of a constant is zero, any constant could have been there.
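Each row of the table can be sanity-checked by differentiating the antiderivative numerically and comparing it with the original function. The sketch below does this with central differences. The test interval [0.5, 2], the step size, and the choice n = 3 for the power rule are arbitrary assumptions for illustration.

```python
import numpy as np

# (function, antiderivative) pairs from the table, taking C = 0
pairs = {
    "x^3": (lambda x: x**3, lambda x: x**4 / 4),
    "1/x": (lambda x: 1 / x, lambda x: np.log(np.abs(x))),
    "e^x": (np.exp, np.exp),
    "sin(x)": (np.sin, lambda x: -np.cos(x)),
    "cos(x)": (np.cos, np.sin),
    "1/(1+x^2)": (lambda x: 1 / (1 + x**2), np.arctan),
}

x = np.linspace(0.5, 2.0, 50)
h = 1e-5  # finite-difference step

max_errors = {}
for name, (f, F) in pairs.items():
    # Central difference approximation of F'(x)
    dF = (F(x + h) - F(x - h)) / (2 * h)
    max_errors[name] = float(np.max(np.abs(dF - f(x))))
    print(f"{name:>10}: max |F'(x) - f(x)| = {max_errors[name]:.2e}")
```

Adding any constant C to an antiderivative leaves these differences unchanged, which is the "+ C" ambiguity in action.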
Numerical Integration
When you don’t have a formula, you approximate the integral numerically.
The Trapezoidal Rule
Approximate the area using trapezoids:
def trapezoidal_integrate(y, x):
    """Integrate y with respect to x using trapezoidal rule."""
    dx = np.diff(x)
    return np.sum((y[:-1] + y[1:]) / 2 * dx)
# Example
x = np.linspace(0, np.pi, 100)
y = np.sin(x)
area = trapezoidal_integrate(y, x)
print(f"∫sin(x)dx from 0 to π = {area:.4f}") # Should be ≈ 2.0
Simpson’s Rule
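Simpson’s rule fits a parabola through each successive triple of points instead of joining pairs of points with straight lines. For smooth functions on a uniform grid it is usually more accurate than the trapezoidal rule with the same number of points. With an even number n of intervals of width h:
$$\int_a^b f(x)\,dx \approx \frac{h}{3}\left[f(x_0) + 4f(x_1) + 2f(x_2) + 4f(x_3) + \cdots + 4f(x_{n-1}) + f(x_n)\right]$$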
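A minimal sketch of composite Simpson’s rule in plain NumPy follows, compared against the trapezoidal rule on the same grid. The function name `simpson_integrate` is made up for this example and is not a PyDelt function. The implementation assumes a uniform grid with an odd number of points.

```python
import numpy as np

def simpson_integrate(y, x):
    """Composite Simpson's rule on a uniform grid with an odd number of points."""
    n_intervals = len(x) - 1
    if n_intervals < 2 or n_intervals % 2 != 0:
        raise ValueError("need an even number of intervals (odd number of points)")
    h = (x[-1] - x[0]) / n_intervals
    # Weights: 1 at the endpoints, 4 at odd indices, 2 at even interior indices
    return h / 3 * (y[0] + 4 * np.sum(y[1:-1:2]) + 2 * np.sum(y[2:-1:2]) + y[-1])

x = np.linspace(0, np.pi, 101)  # 100 intervals
y = np.sin(x)

simpson_area = simpson_integrate(y, x)
trapezoid_area = np.sum((y[:-1] + y[1:]) / 2 * np.diff(x))

print(f"Simpson error:   {abs(simpson_area - 2.0):.2e}")
print(f"Trapezoid error: {abs(trapezoid_area - 2.0):.2e}")
```

On this smooth integrand, Simpson’s error is several orders of magnitude smaller than the trapezoidal error for the same 101 samples.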
PyDelt’s Integration
from pydelt.integrals import integrate_derivative, integrate_derivative_with_error
# With error estimation
result, error = integrate_derivative_with_error(
derivative_signal=velocity,
time=time,
initial_value=0
)
print(f"Integrated value: {result[-1]:.4f} ± {error:.4f}")
Integration in Machine Learning
1. Probability and Expectations
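The expected value of a continuous random variable X with density p(x):
$$E[X] = \int_{-\infty}^{\infty} x\,p(x)\,dx$$
The variance:
$$\mathrm{Var}(X) = \int_{-\infty}^{\infty} (x - E[X])^2\,p(x)\,dx$$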
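Both quantities can be computed by numerical integration when the density is known on a grid. The sketch below uses an exponential distribution with rate 2, an arbitrary choice for illustration, whose exact mean and variance are 1/2 and 1/4. The infinite upper limit is truncated at x = 20, where the density is negligible.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule for samples y on grid x."""
    return np.sum((y[:-1] + y[1:]) / 2 * np.diff(x))

rate = 2.0                           # assumed rate parameter, for illustration
x = np.linspace(0.0, 20.0, 20001)    # truncate the infinite domain at x = 20
p = rate * np.exp(-rate * x)         # exponential density on [0, ∞)

mean = trapezoid(x * p, x)                     # E[X] = ∫ x p(x) dx
variance = trapezoid((x - mean) ** 2 * p, x)   # Var(X) = ∫ (x - E[X])² p(x) dx

print(f"E[X]   ≈ {mean:.4f} (exact {1 / rate})")
print(f"Var(X) ≈ {variance:.4f} (exact {1 / rate**2})")
```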
2. Loss Functions as Integrals
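Many loss functions are integrals (or sums) in disguise:
Cross-entropy, which equals the entropy of p plus the KL divergence D_KL(p ‖ q), is a sum over outcomes for discrete distributions and an integral over densities for continuous ones:
$$H(p, q) = -\sum_x p(x) \log q(x) \quad \text{(discrete)}, \qquad H(p, q) = -\int p(x) \log q(x)\,dx \quad \text{(continuous)}$$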
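For a discrete distribution the cross-entropy is a short sum. The sketch below uses two made-up three-outcome distributions and also checks the standard decomposition H(p, q) = H(p) + D_KL(p ‖ q).

```python
import numpy as np

# Hypothetical distributions over three outcomes (illustrative values only)
p = np.array([0.7, 0.2, 0.1])   # "true" distribution
q = np.array([0.5, 0.3, 0.2])   # model distribution

cross_entropy = -np.sum(p * np.log(q))   # H(p, q)
entropy = -np.sum(p * np.log(p))         # H(p)
kl = np.sum(p * np.log(p / q))           # D_KL(p ‖ q)

print(f"H(p, q)           = {cross_entropy:.4f}")
print(f"H(p) + KL(p ‖ q)  = {entropy + kl:.4f}")
```

Because H(p) does not depend on the model q, minimizing cross-entropy with respect to q is equivalent to minimizing the KL divergence.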
3. Cumulative Distribution Functions
The CDF is the integral of the PDF up to x:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} p(t)\,dt$$
# Example: Standard normal CDF
import numpy as np
from scipy import stats
from scipy.integrate import cumulative_trapezoid
x = np.linspace(-3, 3, 100)
pdf = stats.norm.pdf(x) # Probability density
cdf = stats.norm.cdf(x) # Cumulative (integral of pdf)
# Verify: numerical integration of PDF ≈ CDF
# The grid starts at x = -3, so the running integral omits the small tail mass below -3
numerical_cdf = cumulative_trapezoid(pdf, x, initial=0)
numerical_cdf = numerical_cdf / numerical_cdf[-1] # Normalize so the last value is 1
print(f"Max difference from exact CDF: {np.max(np.abs(numerical_cdf - cdf)):.4f}")
4. Neural ODEs
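Neural ODEs define the network’s output as the solution of an initial value problem, that is, as an integral of learned dynamics f_θ:
$$h(T) = h(0) + \int_0^T f_\theta(h(t), t)\,dt$$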
The hidden state evolves continuously, and the integral is computed numerically.
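To make "computed numerically" concrete, here is a toy sketch that integrates hidden-state dynamics with fixed-step forward Euler. Everything in it is an assumption for illustration: the `dynamics` function is a single tanh layer with fixed random weights standing in for a trained network, and real Neural ODE implementations typically use adaptive solvers rather than plain Euler.

```python
import numpy as np

def euler_integrate(f, h0, t0, t1, n_steps):
    """Approximate h(t1) = h(t0) + ∫ f(h(t), t) dt with fixed-step forward Euler."""
    h = np.asarray(h0, dtype=float)
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        h = h + dt * f(h, t)   # one Euler step: follow the current slope for dt
        t += dt
    return h

# Hypothetical "learned" dynamics: one tanh layer with fixed random weights
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(3, 3))

def dynamics(h, t):
    return np.tanh(W @ h)

h0 = np.array([1.0, 0.0, -1.0])
h1 = euler_integrate(dynamics, h0, 0.0, 1.0, 100)
print("Hidden state at t = 1:", h1)

# Sanity check on a problem with a known answer: dh/dt = -h gives h(1) = e^{-1}
decay = euler_integrate(lambda h, t: -h, np.array([1.0]), 0.0, 1.0, 1000)
print(f"Euler: {decay[0]:.5f}, exact: {np.exp(-1.0):.5f}")
```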
5. Normalizing Flows
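The change-of-variables formula rescales densities by the Jacobian determinant. For an invertible map z = f(x):
$$p_X(x) = p_Z(f(x)) \left|\det \frac{\partial f}{\partial x}\right|$$
Continuous-time flows obtain the change in log-density by integrating the trace of the Jacobian along the trajectory.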
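A one-dimensional affine map shows why the Jacobian factor is needed. In the sketch below (parameters a = 2 and b = 1 are arbitrary), x = a·z + b with z standard normal, so z = (x − b)/a and the Jacobian factor is 1/|a|. With the factor the transformed density integrates to 1, and without it the result is off by exactly |a|.

```python
import math
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule for samples y on grid x."""
    return np.sum((y[:-1] + y[1:]) / 2 * np.diff(x))

def base_pdf(z):
    """Standard normal density p_Z."""
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)

a, b = 2.0, 1.0                       # assumed affine flow: x = a*z + b
x = np.linspace(-19.0, 21.0, 4001)    # covers z in [-10, 10]
z = (x - b) / a                       # inverse map f(x)

with_jacobian = base_pdf(z) / abs(a)  # p_X(x) = p_Z(f(x)) * |df/dx|
without_jacobian = base_pdf(z)        # missing the |df/dx| = 1/|a| factor

total_with = trapezoid(with_jacobian, x)
total_without = trapezoid(without_jacobian, x)

print(f"With Jacobian factor:    {total_with:.6f}")
print(f"Without Jacobian factor: {total_without:.6f}")
```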
Monte Carlo Integration
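For high-dimensional integrals, random sampling often works better than grid-based methods. An expectation under a density p is estimated by a sample average:
$$\int f(x)\,p(x)\,dx \approx \frac{1}{N}\sum_{i=1}^{N} f(x_i)$$
where the x_i are random samples drawn from p.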
Why This Works
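By the law of large numbers, the sample mean of independent samples x_i drawn from p converges to the expected value as the number of samples N grows:
$$\frac{1}{N}\sum_{i=1}^{N} f(x_i) \;\to\; E_p[f(X)] = \int f(x)\,p(x)\,dx \quad \text{as } N \to \infty$$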
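The convergence can be observed directly. This sketch estimates ∫₀^π sin(x) dx = 2 from uniform random samples. The seed and the sample sizes are arbitrary choices, so the printed digits will differ with other settings.

```python
import numpy as np

rng = np.random.default_rng(42)

def mc_integrate_sin(n_samples):
    """Estimate ∫₀^π sin(x) dx from uniform samples on [0, π]."""
    x = rng.uniform(0.0, np.pi, size=n_samples)
    # The uniform density on [0, π] is 1/π, so the integral equals π · E[sin(X)]
    return np.pi * np.mean(np.sin(x))

estimates = {n: mc_integrate_sin(n) for n in (100, 10_000, 1_000_000)}
for n, est in estimates.items():
    print(f"N = {n:>9,}: estimate = {est:.4f}, |error| = {abs(est - 2.0):.4f}")
```

The error typically shrinks by about a factor of 10 for every 100-fold increase in N, the 1/√N behavior characteristic of Monte Carlo.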
ML Connection
Stochastic gradient descent is Monte Carlo estimation of the full gradient
Variational inference uses Monte Carlo to estimate intractable integrals
Reinforcement learning uses Monte Carlo returns
Integration Challenges
1. No Closed Form
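Many integrals have no antiderivative that can be written in terms of elementary functions. A classic example is the Gaussian integrand:
$$\int e^{-x^2}\,dx$$
The error function erf(x) is defined through this integral and has no simpler form:
$$\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt$$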
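Even without a closed form, the integral is easy to evaluate numerically. The sketch below approximates erf(x) with the trapezoidal rule and compares it with `math.erf` from the Python standard library. The helper name `erf_numeric` and the grid size are choices made for this example.

```python
import math
import numpy as np

def erf_numeric(x, n_points=2001):
    """Approximate erf(x) = (2/√π) ∫₀ˣ exp(-t²) dt with the trapezoidal rule."""
    t = np.linspace(0.0, x, n_points)
    y = np.exp(-t**2)
    integral = np.sum((y[:-1] + y[1:]) / 2 * np.diff(t))
    return 2.0 / math.sqrt(math.pi) * integral

errors = {}
for x in (0.5, 1.0, 2.0):
    approx = erf_numeric(x)
    errors[x] = abs(approx - math.erf(x))
    print(f"erf({x}) ≈ {approx:.6f}, |difference from math.erf| = {errors[x]:.1e}")
```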
2. Improper Integrals
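Integrals over infinite domains or with singularities, for example:
$$\int_1^{\infty} \frac{1}{x^2}\,dx = 1, \qquad \int_0^1 \frac{1}{\sqrt{x}}\,dx = 2$$
These are defined as limits of ordinary integrals, so they require careful handling: take the limit explicitly and check that it converges.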
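One simple way to handle an infinite upper limit numerically is to truncate it and watch the result settle as the cutoff grows. The sketch below does this for ∫₀^∞ e⁻ˣ dx = 1, whose truncated value is exactly 1 − e⁻ᵇ. The cutoffs and grid density are arbitrary illustrative choices.

```python
import numpy as np

def truncated_integral(upper, points_per_unit=1000):
    """Trapezoidal estimate of ∫₀^upper exp(-x) dx."""
    x = np.linspace(0.0, upper, int(upper * points_per_unit) + 1)
    y = np.exp(-x)
    return np.sum((y[:-1] + y[1:]) / 2 * np.diff(x))

uppers = [1, 5, 10, 20, 40]
values = [truncated_integral(b) for b in uppers]

for b, v in zip(uppers, values):
    print(f"upper limit {b:>2}: integral ≈ {v:.8f} (exact truncated value {1 - np.exp(-b):.8f})")
```

Truncation only works when the tail really is negligible. For a divergent integral such as ∫₁^∞ 1/x dx, the truncated values keep growing instead of settling.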
3. High Dimensions
The “curse of dimensionality” makes grid-based integration impractical:
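10 points per dimension in 10 dimensions = 10¹⁰ = 10 billion grid points, and the count grows exponentially with every added dimension.
Monte Carlo methods scale much better: their error shrinks like 1/√N in the number of samples N, at a rate that does not depend on the dimension.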
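The contrast can be made concrete with a 10-dimensional integral that has a known answer. The integrand below, the sum of squared coordinates over the unit hypercube, is a made-up test case whose exact integral is d/3. The sample count and seed are arbitrary.

```python
import numpy as np

dim = 10
points_per_dim = 10
grid_points = points_per_dim ** dim   # evaluations a full grid would need
print(f"Grid evaluations needed: {grid_points:,}")

# Monte Carlo: integrate f(x) = x_1² + ... + x_d² over the unit hypercube [0, 1]^d
rng = np.random.default_rng(0)
n_samples = 100_000
samples = rng.uniform(0.0, 1.0, size=(n_samples, dim))
values = np.sum(samples**2, axis=1)

estimate = values.mean()
std_error = values.std(ddof=1) / np.sqrt(n_samples)   # estimated standard error
exact = dim / 3                                        # each coordinate contributes 1/3

print(f"Monte Carlo estimate: {estimate:.4f} ± {std_error:.4f} (exact {exact:.4f})")
```

Here 100,000 random samples give a usable estimate, while the grid would need 100,000 times as many function evaluations.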
Connecting Derivatives and Integrals in PyDelt
PyDelt lets you go both directions:
import numpy as np
from pydelt.interpolation import SplineInterpolator
from pydelt.integrals import integrate_derivative
# Original function
x = np.linspace(0, 2*np.pi, 100)
y = np.sin(x)
# Differentiate
interp = SplineInterpolator(smoothing=0.01)
interp.fit(x, y)
derivative = interp.differentiate(order=1)(x) # Should be cos(x)
# Integrate the derivative back
reconstructed = integrate_derivative(derivative, x, initial_value=0)
# Should recover sin(x) (up to numerical error)
print(f"Max reconstruction error: {np.max(np.abs(reconstructed - y)):.6f}")
Key Takeaways
Integration accumulates change—the inverse of differentiation
Geometrically, it’s the area under a curve
The Fundamental Theorem connects derivatives and integrals
Numerical methods (trapezoidal, Simpson’s, Monte Carlo) handle real data
ML uses integrals everywhere: probability, expectations, loss functions, Neural ODEs
Exercises
Compute by hand: ∫₀¹ x² dx (use the power rule for antiderivatives)
Verify numerically: Use PyDelt’s integrate_derivative to verify your answer.
Probability: If X ~ Uniform(0, 1), compute E[X²] = ∫₀¹ x² dx. What is it?
Round trip: Generate data from f(x) = x³, differentiate it, then integrate the derivative. How close do you get to the original?