Examples & Use Cases
Real-World Applications
1. Network Performance Monitoring
Detect performance degradation and anomalies in network metrics:
from pybocd import BOCDNIG
import numpy as np
# Monitor latency over time
model = BOCDNIG(m_0=50.0, kappa_0=1.0, alpha_0=2.0, beta_0=1.0, l=300.0)
latencies = [45, 48, 50, 52, 51, 100, 98, 105, 103, 50, 52]
for t, latency in enumerate(latencies):
    model.add_data(latency)
    if model.run_length == 0:
        print(f"Changepoint detected at time {t}! Latency spike detected.")
2. Financial Market Analysis
Identify regime changes in stock returns:
from pybocd import BOCDNIG
import numpy as np
# Monitor stock returns
model = BOCDNIG(m_0=0.0, kappa_0=0.5, alpha_0=2.0, beta_0=0.5, l=100.0)
# Returns with regime change
returns = np.concatenate([
    np.random.normal(0.01, 0.02, 100),   # Bull market
    np.random.normal(-0.01, 0.03, 100),  # Bear market
])
for t, ret in enumerate(returns):
    model.add_data(ret)
    if model.run_length < 10:
        print(f"Regime change at time {t}")
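A threshold check such as `run_length < 10` stays true on every step while the run length is short, so one regime change may be printed several times. The sketch below reports each change once by looking for sharp drops in a recorded run-length trace. It assumes `run_length` is the model's current most probable run length and that you append it to a list after each `add_data` call; `find_drops` and `min_drop` are hypothetical names for illustration, not part of pybocd.

```python
def find_drops(run_lengths, min_drop=10):
    """Return indices where the run length falls by at least `min_drop`.

    `run_lengths` is a plain list of integers, e.g. `model.run_length`
    recorded after each `add_data` call (an assumption about usage).
    """
    drops = []
    for t in range(1, len(run_lengths)):
        if run_lengths[t - 1] - run_lengths[t] >= min_drop:
            drops.append(t)
    return drops


# Hypothetical trace: the run length grows for 30 steps, then resets.
trace = list(range(30)) + list(range(20))
print(find_drops(trace))
```

Because the helper works on plain integers, it can be tuned offline against a saved trace before being wired into a live loop.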
3. System Health Monitoring
Track CPU usage and detect anomalies:
import numpy as np
from pybocd import BOCDGMM
# GMM is better for multimodal CPU usage patterns
model = BOCDGMM(
    alpha_0=2.0, beta_0=2.0,
    m_0=30.0, kappa_0=1.0,
    alpha_p_0=2.0, beta_p_0=2.0,
    l=200.0, m=10, n=150
)
# Simulate CPU usage: idle + working + spike
cpu_usage = np.concatenate([
    np.random.normal(10, 5, 100),   # Idle
    np.random.normal(50, 10, 100),  # Working
    np.random.normal(90, 5, 50),    # Overload
])
changepoints = []
for t, cpu in enumerate(cpu_usage):
    model.add_data(cpu)
    if model.run_length == 0:
        changepoints.append(t)
print(f"Changepoints detected at: {changepoints}")
4. Sensor Data Analysis
Detect equipment failures by monitoring several sensor streams, each with its own univariate model:
from pybocd import BOCDNIG
import numpy as np
# Monitor multiple sensor streams separately
sensors = {
    'temperature': BOCDNIG(m_0=25.0, kappa_0=1.0, alpha_0=2.0, beta_0=1.0),
    'vibration': BOCDNIG(m_0=0.5, kappa_0=0.5, alpha_0=2.0, beta_0=0.1),
    'pressure': BOCDNIG(m_0=100.0, kappa_0=1.0, alpha_0=2.0, beta_0=1.0)
}
# Simulate sensor readings
data = {
    'temperature': [25, 26, 25, 27, 45, 46, 45, 47],
    'vibration': [0.5, 0.5, 0.6, 0.5, 3.0, 3.1, 3.2, 3.0],
    'pressure': [100, 101, 99, 100, 150, 151, 150, 152]
}
for t in range(len(data['temperature'])):
    for sensor_name, model in sensors.items():
        value = data[sensor_name][t]
        model.add_data(value)
    # Check if any sensor detected a changepoint
    anomalies = [name for name, model in sensors.items() if model.run_length < 5]
    if anomalies:
        print(f"Potential equipment failure at time {t}: {anomalies}")
Benchmark: BOCD-NIG vs BOCD-GMM
import time
import numpy as np
from pybocd import BOCDNIG, BOCDGMM
# Create synthetic data with changepoint
np.random.seed(42)
data = np.concatenate([
    np.random.normal(0, 1, 1000),
    np.random.normal(3, 1, 1000)
])
# Benchmark BOCD-NIG (time only the update loop, not model construction)
nig_model = BOCDNIG(m_0=0.0, kappa_0=1.0, alpha_0=1.0, beta_0=1.0, l=200.0)
start = time.perf_counter()
for x in data:
    nig_model.add_data(x)
nig_time = time.perf_counter() - start
# Benchmark BOCD-GMM
gmm_model = BOCDGMM(alpha_0=2.0, beta_0=2.0, m_0=0.0, kappa_0=1.0,
                    alpha_p_0=2.0, beta_p_0=2.0, l=200.0, m=10, n=100)
start = time.perf_counter()
for x in data:
    gmm_model.add_data(x)
gmm_time = time.perf_counter() - start
print(f"BOCD-NIG: {nig_time:.3f}s")
print(f"BOCD-GMM: {gmm_time:.3f}s")
print(f"Ratio: {gmm_time/nig_time:.1f}x slower")
Tips for Best Results
1. Choose the Right Model
- Use BOCD-NIG if: Data is univariate, approximately normal, and speed is critical
- Use BOCD-GMM if: Data is multimodal, has outliers, or is non-Gaussian
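If it is unclear whether a stream is multimodal, a rough screen on a sample of historical data can help with the choice above. The sketch below computes Sarle's bimodality coefficient in its simple population form; values above 5/9 (the value for a uniform distribution) hint at more than one mode. This is a heuristic only: strong skew also raises the coefficient, so a histogram is still worth a look. The helper name and the sample data are illustrative assumptions, not part of pybocd.

```python
from statistics import NormalDist, fmean


def bimodality_coefficient(values):
    """Sarle's bimodality coefficient (population form): (skew^2 + 1) / kurtosis."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # non-excess kurtosis (3 for a normal distribution)
    return (skew ** 2 + 1) / kurtosis


# Deterministic stand-ins for real data: evenly spaced normal quantiles
# and a two-level signal (e.g. idle vs. busy CPU).
unimodal = [NormalDist(50, 5).inv_cdf((i + 0.5) / 200) for i in range(200)]
bimodal = [10.0] * 100 + [50.0] * 100

THRESHOLD = 5 / 9  # value for a uniform distribution; larger hints at multiple modes
for name, sample in [("unimodal", unimodal), ("bimodal", bimodal)]:
    bc = bimodality_coefficient(sample)
    print(f"{name}: BC = {bc:.2f}, multimodal hint: {bc > THRESHOLD}")
```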
2. Tuning Expected Run Length
Start with l = 100, then move it toward the number of observations you expect between changepoints in your domain:
# If changepoints are frequent
model = BOCDNIG(..., l=50)
# If changepoints are rare
model = BOCDNIG(..., l=500)
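To build intuition for these values: in the standard BOCD formulation with a constant hazard, an expected run length of l corresponds to a prior changepoint probability of 1/l at each step. The sketch below assumes pybocd's `l` plays that role (the source only calls it the expected run length) and shows how the prior probability of seeing at least one changepoint in a window changes with l. The function name is hypothetical.

```python
def prior_changepoint_probability(l, window):
    """Prior probability of at least one changepoint in `window` steps,
    assuming a constant hazard of 1 / l (geometric run lengths)."""
    hazard = 1.0 / l
    return 1.0 - (1.0 - hazard) ** window


for l in (50, 100, 500):
    p = prior_changepoint_probability(l, window=100)
    print(f"l={l}: hazard={1 / l:.4f}, P(change within 100 steps)={p:.2f}")
```

A smaller l makes the prior expect changes more often, which is why it tends to produce more detections on the same data.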
3. Preprocess Your Data
- Normalize: Scale to comparable units
- Remove outliers: If not modeling them explicitly
- Resample: If data has missing values
# Example preprocessing
data = (data - np.mean(data)) / np.std(data) # Standardize
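Standardizing with the mean and standard deviation of the whole series uses future observations, which are not available when data arrives as a stream, and a level shift inside the series inflates the estimated spread. A minimal alternative, assuming the first few observations are representative of normal operation, is to scale with statistics from a warm-up window only. `standardize_with_warmup` and `warmup` are hypothetical names.

```python
from statistics import fmean, pstdev


def standardize_with_warmup(values, warmup=50):
    """Scale a series using only the first `warmup` points.

    Avoids look-ahead: the statistics come from an initial window
    rather than from the whole series.
    """
    head = values[:warmup]
    center = fmean(head)
    scale = pstdev(head)  # population std, matching np.std's default
    if scale == 0:
        raise ValueError("warm-up window is constant; cannot scale")
    return [(v - center) / scale for v in values]


raw = [45, 48, 50, 52, 51, 100, 98, 105, 103, 50, 52]
scaled = standardize_with_warmup(raw, warmup=5)
print([round(v, 2) for v in scaled])
```

If you scale the data this way, set the model's prior parameters (for example `m_0`) on the scaled values rather than the raw units.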
4. Validate Results
Cross-validate with domain experts and other methods:
# Compare with simpler methods (e.g., CUSUM)
# Validate detected changepoints against known events
# Run multiple models and check where their detections agree
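As one possible baseline for the comparison suggested above, here is a minimal two-sided CUSUM sketch on z-scores. It is not part of pybocd, and the parameter names and defaults (`warmup`, slack `k`, threshold `h`) are illustrative assumptions. After each alarm it re-estimates the reference level from the following points, so a sustained shift is reported once.

```python
from statistics import fmean, pstdev


def cusum_alarms(values, warmup=20, k=0.5, h=5.0):
    """Two-sided CUSUM on z-scores; returns indices where an alarm fires."""
    alarms = []
    start = 0
    while start + warmup < len(values):
        # Estimate the reference level from the current warm-up window.
        ref = values[start:start + warmup]
        mean, std = fmean(ref), pstdev(ref)
        if std == 0:
            std = 1.0  # fall back to raw units for a constant window
        pos = neg = 0.0
        alarm = None
        for t in range(start + warmup, len(values)):
            z = (values[t] - mean) / std
            pos = max(0.0, pos + z - k)  # accumulates upward drift
            neg = max(0.0, neg - z - k)  # accumulates downward drift
            if pos > h or neg > h:
                alarm = t
                break
        if alarm is None:
            break
        alarms.append(alarm)
        start = alarm  # re-estimate the baseline from the alarm onward
    return alarms


# Deterministic example: the level jumps from about 0.5 to about 5.5 at index 100.
series = [0.0, 1.0] * 50 + [5.0, 6.0] * 50
print(cusum_alarms(series))
```

Detections that both this baseline and a BOCD model report near the same index are stronger candidates for review than detections only one of them reports.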
Running Examples from pybocd-Examples
Visit the pybocd-Examples repository for complete, runnable examples with real datasets and detailed explanations.
Common Pitfalls to Avoid
❌ Setting l too small → Too many false positives
❌ Setting l too large → Missing real changepoints
❌ Ignoring preprocessing → Poor results on raw data
❌ Using BOCD-GMM for simple data → Unnecessary complexity
❌ Not validating results → Unknown reliability of detections