Neural Networks & Automatic Differentiation
Neural networks provide powerful interpolation capabilities with exact automatic differentiation. Unlike traditional methods that approximate derivatives numerically, neural networks can compute exact gradients through backpropagation, making them ideal for optimization and machine learning applications.
🧠 Core Concepts
Automatic Differentiation (AutoDiff) computes derivatives by applying the chain rule systematically to elementary operations. This provides machine-precision accuracy for derivatives, unlike numerical approximation methods.
Universal Function Approximation: Neural networks can approximate any continuous function to arbitrary precision given sufficient capacity, making them extremely versatile interpolators.
Backpropagation: The algorithm that efficiently computes gradients by propagating error signals backward through the network.
🔧 Neural Network Interpolator
The NeuralNetworkInterpolator combines the universal pydelt API with deep learning backends:
from pydelt.interpolation import NeuralNetworkInterpolator
import numpy as np
# Create neural network interpolator
nn_interp = NeuralNetworkInterpolator(
hidden_layers=[64, 32, 16], # Network architecture
activation='relu', # Activation function
learning_rate=0.001, # Optimizer learning rate
epochs=1000, # Training iterations
backend='pytorch' # 'pytorch' or 'tensorflow'
)
Key Parameters:
- hidden_layers: List of hidden layer sizes [64, 32] creates 2 hidden layers
- activation: ‘relu’, ‘tanh’, ‘sigmoid’, ‘swish’, ‘gelu’
- learning_rate: Adam optimizer learning rate (0.001-0.01 typical)
- epochs: Training iterations (500-5000 depending on complexity)
- backend: ‘pytorch’ (default) or ‘tensorflow’
🎯 Example 1: Nonlinear Function Approximation
Classic Example: Runge Function
The Runge function is notoriously difficult for polynomial interpolation but neural networks handle it well:
import numpy as np
import matplotlib.pyplot as plt
from pydelt.interpolation import NeuralNetworkInterpolator, SplineInterpolator
# Runge function: f(x) = 1 / (1 + 25x²)
def runge_function(x):
return 1 / (1 + 25 * x**2)
def runge_derivative(x):
return -50 * x / (1 + 25 * x**2)**2
# Training data (sparse sampling)
x_train = np.linspace(-1, 1, 15)
y_train = runge_function(x_train)
# Neural network interpolator
nn_interp = NeuralNetworkInterpolator(
hidden_layers=[128, 64, 32],
activation='tanh', # Good for smooth functions
learning_rate=0.005,
epochs=2000
)
nn_interp.fit(x_train, y_train)
# Compare with spline
spline = SplineInterpolator(smoothing=0.0)
spline.fit(x_train, y_train)
# Evaluation points
x_test = np.linspace(-1, 1, 200)
y_true = runge_function(x_test)
dy_true = runge_derivative(x_test)
# Predictions
y_nn = nn_interp.predict(x_test)
y_spline = spline.predict(x_test)
# Derivatives (automatic vs numerical)
nn_deriv_func = nn_interp.differentiate(order=1)
spline_deriv_func = spline.differentiate(order=1)
dy_nn = nn_deriv_func(x_test)
dy_spline = spline_deriv_func(x_test)
# Error analysis
nn_func_error = np.sqrt(np.mean((y_nn - y_true)**2))
nn_deriv_error = np.sqrt(np.mean((dy_nn - dy_true)**2))
spline_func_error = np.sqrt(np.mean((y_spline - y_true)**2))
spline_deriv_error = np.sqrt(np.mean((dy_spline - dy_true)**2))
print("Function Approximation Errors:")
print(f"Neural Network: {nn_func_error:.6f}")
print(f"Spline: {spline_func_error:.6f}")
print("\nDerivative Errors:")
print(f"Neural Network: {nn_deriv_error:.6f}")
print(f"Spline: {spline_deriv_error:.6f}")
Expected Results: Neural networks typically achieve 10-100x better accuracy than splines for the Runge function, especially near the boundaries where polynomial methods struggle.
🌊 Example 2: Fluid Dynamics - Velocity Field
Application: Reconstructing velocity fields from particle tracking data in fluid mechanics.
# Simulate 2D fluid flow around a cylinder (potential flow)
def potential_flow_velocity(x, y, U_inf=1.0, R=0.5):
"""Velocity field around a cylinder in cross-flow"""
r_sq = x**2 + y**2
# Avoid singularity at origin
r_sq = np.maximum(r_sq, 1e-10)
# Velocity components for flow around cylinder
u = U_inf * (1 - R**2 * (x**2 - y**2) / r_sq**2)
v = U_inf * (-R**2 * 2 * x * y / r_sq**2)
return u, v
# Generate training data (sparse particle tracking)
np.random.seed(42)
n_particles = 200
x_particles = np.random.uniform(-2, 2, n_particles)
y_particles = np.random.uniform(-2, 2, n_particles)
# Remove particles inside cylinder
mask = (x_particles**2 + y_particles**2) > 0.6**2
x_particles = x_particles[mask]
y_particles = y_particles[mask]
# Get velocity components
u_true, v_true = potential_flow_velocity(x_particles, y_particles)
# Add measurement noise
u_measured = u_true + 0.05 * np.random.randn(len(u_true))
v_measured = v_true + 0.05 * np.random.randn(len(v_true))
# Prepare input data (x,y positions) and output data (u,v velocities)
input_data = np.column_stack([x_particles, y_particles])
output_data = np.column_stack([u_measured, v_measured])
# Neural network for vector-valued function
nn_flow = NeuralNetworkInterpolator(
hidden_layers=[128, 128, 64],
activation='swish', # Good for fluid dynamics
learning_rate=0.002,
epochs=3000
)
nn_flow.fit(input_data, output_data)
# Create evaluation grid
x_grid = np.linspace(-2, 2, 50)
y_grid = np.linspace(-2, 2, 50)
X, Y = np.meshgrid(x_grid, y_grid)
# Remove points inside cylinder
mask_grid = (X**2 + Y**2) > 0.6**2
x_eval = X[mask_grid]
y_eval = Y[mask_grid]
eval_points = np.column_stack([x_eval, y_eval])
# Predict velocity field
velocity_pred = nn_flow.predict(eval_points)
u_pred = velocity_pred[:, 0]
v_pred = velocity_pred[:, 1]
# Compute derivatives for vorticity analysis
# ∂u/∂y and ∂v/∂x for vorticity ω = ∂v/∂x - ∂u/∂y
deriv_func = nn_flow.differentiate(order=1)
derivatives = deriv_func(eval_points)
print(f"Reconstructed {len(eval_points)} velocity vectors from {len(x_particles)} measurements")
print(f"Velocity field ready for vorticity and strain rate analysis")
📊 Example 3: Time Series with Complex Dynamics
Application: Chaotic time series analysis (Lorenz attractor).
from scipy.integrate import odeint
# Lorenz system parameters
def lorenz_system(state, t, sigma=10, rho=28, beta=8/3):
x, y, z = state
dxdt = sigma * (y - x)
dydt = x * (rho - z) - y
dzdt = x * y - beta * z
return [dxdt, dydt, dzdt]
# Generate Lorenz attractor data
t = np.linspace(0, 20, 2000)
initial_state = [1.0, 1.0, 1.0]
trajectory = odeint(lorenz_system, initial_state, t)
# Extract x-component time series
x_series = trajectory[:, 0]
# Subsample for training (simulate sparse measurements)
indices = np.arange(0, len(t), 10) # Every 10th point
t_train = t[indices]
x_train = x_series[indices]
# Add measurement noise
x_noisy = x_train + 0.5 * np.random.randn(len(x_train))
# Neural network interpolator
nn_chaos = NeuralNetworkInterpolator(
hidden_layers=[256, 128, 64, 32], # Deep network for complex dynamics
activation='gelu', # Good for chaotic systems
learning_rate=0.001,
epochs=4000
)
nn_chaos.fit(t_train, x_noisy)
# Predict full time series
x_pred = nn_chaos.predict(t)
# Compute instantaneous rate of change
rate_func = nn_chaos.differentiate(order=1)
dx_dt_pred = rate_func(t)
# True derivative from Lorenz equations
dx_dt_true = 10 * (trajectory[:, 1] - trajectory[:, 0])
# Analysis
reconstruction_error = np.sqrt(np.mean((x_pred - x_series)**2))
derivative_error = np.sqrt(np.mean((dx_dt_pred - dx_dt_true)**2))
print(f"Time series reconstruction error: {reconstruction_error:.3f}")
print(f"Derivative reconstruction error: {derivative_error:.3f}")
# Phase space reconstruction quality
correlation = np.corrcoef(x_pred, x_series)[0, 1]
print(f"Correlation with true attractor: {correlation:.4f}")
⚡ Advantages of Neural Networks
1. Exact Derivatives - Automatic differentiation provides machine-precision gradients - No numerical approximation errors - Consistent accuracy across all derivative orders
2. Universal Approximation - Can represent any continuous function - Handles highly nonlinear relationships - Scales to high-dimensional problems
3. Noise Robustness - Implicit regularization through architecture - Dropout and batch normalization for stability - Learns underlying patterns despite measurement noise
4. Scalability - GPU acceleration for large datasets - Batch processing for efficiency - Parallel computation of derivatives
🔧 Advanced Configuration
Custom Architecture Design:
# Deep network for complex functions
complex_nn = NeuralNetworkInterpolator(
hidden_layers=[512, 256, 128, 64, 32],
activation='swish',
learning_rate=0.0005,
epochs=5000,
batch_size=64,
dropout_rate=0.1
)
# Wide network for high-frequency components
wide_nn = NeuralNetworkInterpolator(
hidden_layers=[1024, 1024],
activation='relu',
learning_rate=0.002,
epochs=2000
)
Training Monitoring:
# Enable training progress tracking
nn_interp = NeuralNetworkInterpolator(
hidden_layers=[128, 64],
epochs=1000,
verbose=True, # Print training progress
early_stopping=True, # Stop when validation loss plateaus
validation_split=0.2 # Use 20% of data for validation
)
Backend Selection:
# PyTorch backend (default, recommended)
nn_pytorch = NeuralNetworkInterpolator(backend='pytorch')
# TensorFlow backend
nn_tensorflow = NeuralNetworkInterpolator(backend='tensorflow')
⚠️ Limitations & Considerations
Computational Cost: - Training time scales with network size and data complexity - GPU recommended for large networks (>1000 parameters) - Memory usage grows with batch size and network depth
Hyperparameter Sensitivity: - Learning rate requires tuning (too high: instability, too low: slow convergence) - Architecture choice affects approximation quality - Overfitting possible with insufficient data
Reproducibility: - Random initialization affects results - Set random seeds for reproducible results:
import torch
import numpy as np
# Set seeds for reproducibility
torch.manual_seed(42)
np.random.seed(42)
nn_interp = NeuralNetworkInterpolator(...)
🎓 Best Practices
Architecture Guidelines: 1. Start simple: Begin with [64, 32] hidden layers 2. Go deeper for complexity: Add layers for highly nonlinear functions 3. Go wider for detail: Increase layer sizes for high-frequency components 4. Use appropriate activations: ‘relu’ (general), ‘tanh’ (smooth), ‘swish’ (modern)
Training Tips: 1. Monitor convergence: Use validation split to track overfitting 2. Adjust learning rate: Decrease if training is unstable 3. Early stopping: Prevent overfitting with patience parameter 4. Data normalization: Scale inputs to [-1, 1] or [0, 1] range
Derivative Accuracy: - Neural network derivatives are exact (no approximation error) - Accuracy depends on function approximation quality - Higher-order derivatives may amplify approximation errors
🔗 Next Steps
Neural networks excel at univariate and simple multivariate problems. For advanced multivariate calculus operations, continue to:
Multivariate Calculus: Gradients, Jacobians, and Hessians for vector-valued functions
Stochastic Computing: Probabilistic neural networks with uncertainty quantification
The automatic differentiation capabilities of neural networks become especially powerful when combined with multivariate operations and stochastic link functions.