Why Calculus? A Guide for ML Practitioners
“Calculus is the mathematics of change. In machine learning, everything changes: weights during training, predictions with inputs, loss over time.”
What Calculus Actually Does
At its core, calculus answers two fundamental questions:
Question 1: How Fast Is Something Changing?
This is the domain of derivatives. When you ask:
“How sensitive is my model’s output to this input feature?”
“Which direction should I adjust my weights to reduce the loss?”
“How quickly is my training loss decreasing?”
You’re asking about rates of change—derivatives.
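To make "rate of change" concrete, here is a minimal sketch that estimates a derivative numerically. The `model` function is a made-up one-feature example and `central_difference` is a hypothetical helper name; neither comes from a library.

```python
import math

def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric (central) finite difference."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Hypothetical one-feature "model": a sigmoid applied to a scaled input.
def model(x):
    return 1.0 / (1.0 + math.exp(-2.0 * x))

# Sensitivity of the output to the input at x = 0.
sensitivity = central_difference(model, 0.0)

# Analytic check: d/dx sigmoid(2x) = 2 * s * (1 - s), and s = 0.5 at x = 0.
analytic = 2.0 * 0.5 * (1.0 - 0.5)
print(sensitivity, analytic)
```

The numerical estimate and the analytic value should agree closely, which is the sense in which a derivative answers "how sensitive is the output to this input?"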
Question 2: How Much Has Accumulated?
This is the domain of integrals. When you ask:
“What’s the total probability under this distribution?”
“What’s the expected value of this random variable?”
“How much error has accumulated over this time series?”
You’re asking about accumulation—integrals.
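As a small illustration of accumulation, the sketch below adds up sampled error values with the trapezoid rule. The error values and the one-second spacing are invented for the example, and `trapezoid` is a hypothetical helper name.

```python
def trapezoid(ys, dt):
    """Accumulate sampled values ys, spaced dt apart, with the trapezoid rule."""
    total = 0.0
    for left, right in zip(ys, ys[1:]):
        # Area of one trapezoid: average height times width.
        total += 0.5 * (left + right) * dt
    return total

# Hypothetical absolute errors sampled once per second.
abs_errors = [0.0, 1.0, 2.0, 1.0, 0.0]
accumulated = trapezoid(abs_errors, dt=1.0)
print(accumulated)  # 4.0
```

Each term is the area of a thin slice under the error curve; an integral is the limit of this sum as the slices get thinner.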
Why ML Practitioners Should Care
1. Gradient Descent Is Just Calculus
Gradient descent and its variants train most modern ML models. The basic update is:
new_weights = old_weights - learning_rate × gradient
That gradient? It’s a vector of partial derivatives of the loss, one for each weight. Understanding what derivatives mean helps you:
Debug training issues (vanishing/exploding gradients)
Choose appropriate learning rates
Understand why certain architectures work better than others
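The update rule can be sketched in a few lines. This example assumes a toy one-parameter loss, (w - 3)^2, chosen only because its minimum is known; `gradient_descent` is a hypothetical helper, not a library function.

```python
def grad(w):
    """Derivative of the toy loss (w - 3)**2 with respect to w."""
    return 2.0 * (w - 3.0)

def gradient_descent(learning_rate, steps=100, w=0.0):
    """Repeatedly apply: new_w = old_w - learning_rate * gradient."""
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

w_good = gradient_descent(learning_rate=0.1)           # settles near the minimum at 3
w_bad = gradient_descent(learning_rate=1.1, steps=20)  # overshoots further each step
print(w_good, w_bad)
```

With the smaller learning rate the distance to the minimum shrinks at every step; with the larger one it grows, which is a one-dimensional picture of why learning-rate choice matters.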
2. Backpropagation Is the Chain Rule
When you call loss.backward() in PyTorch, you’re applying the chain rule of calculus—a method for computing derivatives of composed functions. Understanding this helps you:
Design custom loss functions
Implement custom layers
Debug gradient flow issues
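Here is a hand-written sketch of the chain rule on a tiny composed function. It does not call PyTorch; the one-weight "network" (a sigmoid followed by a squared error) and the names `forward` and `backward` are assumptions for illustration. Automatic differentiation automates this kind of bookkeeping.

```python
import math

def forward(w, x, y):
    """Compute the loss as a composition of three simple steps."""
    z = w * x                          # linear step
    a = 1.0 / (1.0 + math.exp(-z))     # sigmoid activation
    loss = (a - y) ** 2                # squared error
    return z, a, loss

def backward(w, x, y):
    """Chain rule: dloss/dw = dloss/da * da/dz * dz/dw."""
    z, a, loss = forward(w, x, y)
    dloss_da = 2.0 * (a - y)
    da_dz = a * (1.0 - a)
    dz_dw = x
    return dloss_da * da_dz * dz_dw

w, x, y = 0.5, 2.0, 1.0
analytic = backward(w, x, y)

# Independent check with a central finite difference on the loss.
h = 1e-6
numeric = (forward(w + h, x, y)[2] - forward(w - h, x, y)[2]) / (2 * h)
print(analytic, numeric)
```

The two numbers should match closely. Comparing a hand-derived gradient against a finite difference like this is also a common way to sanity-check a custom layer or loss.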
3. Probability Distributions Require Integration
Every time you work with:
Probability density functions (PDFs)
Cumulative distribution functions (CDFs)
Expected values and variances
KL divergence or cross-entropy
You’re working with integrals (or sums, for discrete distributions), whether you realize it or not.
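The quantities listed above can each be written as an integral and checked numerically. The sketch below assumes two normal distributions picked only for illustration, and uses a simple midpoint rule; `normal_pdf` and `integrate` are hypothetical helpers.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def p(x):
    return normal_pdf(x, 0.0, 1.0)   # standard normal

def q(x):
    return normal_pdf(x, 1.0, 1.0)   # same width, mean shifted to 1

lo, hi = -12.0, 12.0  # wide enough that the tails contribute almost nothing

total_prob = integrate(p, lo, hi)                # total probability, close to 1
mean_q = integrate(lambda x: x * q(x), lo, hi)   # expected value of q, close to 1
cdf_p_at_0 = integrate(p, lo, 0.0)               # CDF of p at 0, close to 0.5
kl_pq = integrate(lambda x: p(x) * math.log(p(x) / q(x)), lo, hi)
print(total_prob, mean_q, cdf_p_at_0, kl_pq)
```

For two normals with equal variance, the KL divergence has the closed form (difference of means)^2 / (2 * variance), which is 0.5 here, so the last integral can be checked against it.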
4. Physics-Informed ML Uses Differential Equations
The cutting edge of ML increasingly incorporates physical laws:
Neural ODEs (Ordinary Differential Equations)
Physics-Informed Neural Networks (PINNs)
Hamiltonian Neural Networks
These all require understanding how derivatives describe physical systems.
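To see how a derivative can define a system, consider the simplest case: a quantity that decays at a rate proportional to its current value, dy/dt = -k*y. The sketch below steps that equation forward with Euler's method. The decay rate `k` and the helper `euler_solve` are assumptions for illustration; real ODE solvers are considerably more careful.

```python
import math

def euler_solve(dydt, y0, t_end, steps):
    """Integrate dy/dt = dydt(t, y) from t = 0 to t_end with fixed Euler steps."""
    dt = t_end / steps
    t, y = 0.0, y0
    for _ in range(steps):
        y += dt * dydt(t, y)   # follow the current slope for one small step
        t += dt
    return y

k = 2.0  # hypothetical decay rate

def decay(t, y):
    return -k * y

approx = euler_solve(decay, y0=1.0, t_end=1.0, steps=1000)
exact = math.exp(-k * 1.0)   # known solution y(t) = exp(-k * t)
print(approx, exact)
```

Roughly speaking, a Neural ODE keeps this structure but replaces the hand-written right-hand side (`decay` here) with a learned function.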
The PyDelt Perspective
PyDelt exists because real-world data doesn’t come with analytical formulas. You have:
Sensor measurements, not equations
Time series, not functions
Noisy observations, not clean curves
Traditional calculus assumes you have a formula like f(x) = sin(x). But what if you only have 1000 data points that look like a sine wave?
PyDelt bridges this gap. It lets you:
Fit smooth functions to your data
Compute derivatives of those functions
Use those derivatives for analysis, optimization, or modeling
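The sketch below illustrates the general idea in plain Python. It is not PyDelt's API. It assumes 1,000 noisy samples of a sine wave (the noise level is invented) and estimates the derivative at one point by fitting a straight line to nearby samples; `local_slope` is a hypothetical helper.

```python
import math
import random

random.seed(0)
n = 1000
ts = [2.0 * math.pi * i / (n - 1) for i in range(n)]
ys = [math.sin(t) + random.gauss(0.0, 0.01) for t in ts]  # noisy observations

def local_slope(ts, ys, center, half_window=25):
    """Least-squares slope of a line fitted to samples around index `center`."""
    lo = max(0, center - half_window)
    hi = min(len(ts), center + half_window + 1)
    t_win, y_win = ts[lo:hi], ys[lo:hi]
    t_mean = sum(t_win) / len(t_win)
    y_mean = sum(y_win) / len(y_win)
    cov = sum((t - t_mean) * (y - y_mean) for t, y in zip(t_win, y_win))
    var = sum((t - t_mean) ** 2 for t in t_win)
    return cov / var

i = n // 2
smoothed = local_slope(ts, ys, i)                  # derivative of the local fit
naive = (ys[i + 1] - ys[i]) / (ts[i + 1] - ts[i])  # raw neighbor difference
truth = math.cos(ts[i])                            # derivative of sin is cos
print(smoothed, naive, truth)
```

The raw neighbor difference divides the noise by a very small spacing, so it tends to be far less reliable than the slope of the local fit. That gap is the reason to fit a smooth function first and differentiate second.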
What You’ll Learn in This Series
This theory section builds your calculus intuition from the ground up:
| Chapter | What You’ll Learn | ML Connection |
|---|---|---|
| 1 | What functions are and how limits work | Understanding model behavior at boundaries |
| 2 | Rates of change, slopes, sensitivity | Gradients, feature importance, sensitivity analysis |
| 3 | Chain rule, product rule, quotient rule | Backpropagation, custom gradients |
| 4 | Accumulation, area, inverse of derivative | Probability, expectations, cumulative metrics |
| 5 | Taylor series, polynomial approximation | Why neural networks work, local linear models |
| 6 | Gradients, Jacobians, Hessians | High-dimensional optimization, curvature |
| 7 | Complex numbers, Euler’s formula | Fourier transforms, signal processing |
| 8 | Putting it all together | Backprop, optimization, physics-informed NN |
A Note on Rigor vs. Intuition
This series prioritizes intuition over formalism. We’ll:
Start with real-world examples before equations
Use visualizations to build geometric understanding
Connect every concept to practical ML applications
Provide rigorous definitions for those who want them
If you want theorem-proof style mathematics, excellent textbooks exist (see Bibliography). Our goal is different: to give you the working understanding you need to be a more effective ML practitioner.
Getting Started
Ready to begin? Start with Chapter 1: Functions and Limits, where we’ll explore what it really means for a function to approach a value—and why that matters for understanding derivatives.
Quick Reference: Calculus in ML
| Calculus Concept | ML Application |
|---|---|
| Derivative | Gradient, sensitivity, rate of change |
| Partial derivative | Gradient component for one parameter |
| Gradient (∇f) | Direction of steepest ascent |
| Chain rule | Backpropagation |
| Integral | Probability, expectation, cumulative sum |
| Taylor series | Local approximation, why NNs work |
| Jacobian | Transformation of probability densities |
| Hessian | Curvature, second-order optimization |
| Laplacian | Diffusion, smoothing, graph neural networks |
References
For those wanting deeper mathematical treatment:
Strang, G. Calculus. MIT OpenCourseWare (free online).
Goodfellow, I., Bengio, Y., & Courville, A. Deep Learning, Chapter 4: Numerical Computation. deeplearningbook.org
Boyd, S. & Vandenberghe, L. Convex Optimization. stanford.edu