Examples & Use Cases

Real-World Applications

1. Network Performance Monitoring

Detect performance degradation and anomalies in network metrics:

from pybocd import BOCDNIG
import numpy as np

# Monitor latency over time
model = BOCDNIG(m_0=50.0, kappa_0=1.0, alpha_0=2.0, beta_0=1.0, l=300.0)

latencies = [45, 48, 50, 52, 51, 100, 98, 105, 103, 50, 52]

for t, latency in enumerate(latencies):
    model.add_data(latency)
    if model.run_length == 0:
        print(f"Changepoint detected at time {t}! Latency spike detected.")

2. Financial Market Analysis

Identify regime changes in stock prices:

from pybocd import BOCDNIG
import numpy as np

# Monitor stock returns
model = BOCDNIG(m_0=0.0, kappa_0=0.5, alpha_0=2.0, beta_0=0.5, l=100.0)

# Returns with regime change
returns = np.concatenate([
    np.random.normal(0.01, 0.02, 100),   # Bull market
    np.random.normal(-0.01, 0.03, 100)   # Bear market
])

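for t, ret in enumerate(returns):
    model.add_data(ret)
    # Skip the first 10 steps: the run length is short there only because little data has been seen
    if t >= 10 and model.run_length < 10:
        print(f"Regime change at time {t}")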
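A threshold such as run_length < 10 stays true for several consecutive steps after a change, so a loop like the one above reports the same regime change more than once. The following is a minimal sketch, assuming the flagged time indices are collected in a list first, of collapsing them into one event per burst. The helper group_flags and its max_gap parameter are hypothetical and not part of pybocd.

```python
def group_flags(flagged, max_gap=1):
    """Collapse flagged time indices into (start, end) events.

    Indices that are at most `max_gap` steps apart belong to the same event.
    """
    events = []
    for t in sorted(flagged):
        if events and t - events[-1][1] <= max_gap:
            # Extend the current event to include this index.
            events[-1] = (events[-1][0], t)
        else:
            # Start a new event.
            events.append((t, t))
    return events


# Hypothetical flagged indices: two bursts of consecutive alerts.
flagged = [100, 101, 102, 103, 150, 151]
print(group_flags(flagged))
```

Reporting only the start of each event gives one alert per regime change instead of one per time step.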

3. System Health Monitoring

Track CPU usage and detect anomalies:

from pybocd import BOCDGMM
import numpy as np

# GMM is better for multimodal CPU usage patterns
model = BOCDGMM(
    alpha_0=2.0, beta_0=2.0,
    m_0=30.0, kappa_0=1.0,
    alpha_p_0=2.0, beta_p_0=2.0,
    l=200.0, m=10, n=150
)

# Simulate CPU usage: idle + working + spike
cpu_usage = np.concatenate([
    np.random.normal(10, 5, 100),       # Idle
    np.random.normal(50, 10, 100),      # Working
    np.random.normal(90, 5, 50)         # Overload
])

changepoints = []
for t, cpu in enumerate(cpu_usage):
    model.add_data(cpu)
    if model.run_length == 0:
        changepoints.append(t)

print(f"Changepoints detected at: {changepoints}")

4. Sensor Data Analysis

Detect equipment failures by monitoring several sensor streams, each with its own univariate model:

from pybocd import BOCDNIG
import numpy as np

# Monitor multiple sensor streams separately
sensors = {
    'temperature': BOCDNIG(m_0=25.0, kappa_0=1.0, alpha_0=2.0, beta_0=1.0),
    'vibration': BOCDNIG(m_0=0.5, kappa_0=0.5, alpha_0=2.0, beta_0=0.1),
    'pressure': BOCDNIG(m_0=100.0, kappa_0=1.0, alpha_0=2.0, beta_0=1.0)
}

# Simulate sensor readings
data = {
    'temperature': [25, 26, 25, 27, 45, 46, 45, 47],
    'vibration': [0.5, 0.5, 0.6, 0.5, 3.0, 3.1, 3.2, 3.0],
    'pressure': [100, 101, 99, 100, 150, 151, 150, 152]
}

for t in range(len(data['temperature'])):
    for sensor_name, model in sensors.items():
        value = data[sensor_name][t]
        model.add_data(value)
    
    # Check if any sensor detected a changepoint
    anomalies = [name for name, model in sensors.items() if model.run_length < 5]
    if anomalies:
        print(f"Potential equipment failure at time {t}: {anomalies}")

Benchmark: BOCD-NIG vs BOCD-GMM

import time
import numpy as np
from pybocd import BOCDNIG, BOCDGMM

# Create synthetic data with changepoint
np.random.seed(42)
data = np.concatenate([
    np.random.normal(0, 1, 1000),
    np.random.normal(3, 1, 1000)
])

# Benchmark BOCD-NIG (perf_counter is a monotonic, high-resolution timer)
start = time.perf_counter()
nig_model = BOCDNIG(m_0=0.0, kappa_0=1.0, alpha_0=1.0, beta_0=1.0, l=200.0)
for x in data:
    nig_model.add_data(x)
nig_time = time.perf_counter() - start

# Benchmark BOCD-GMM
start = time.perf_counter()
gmm_model = BOCDGMM(alpha_0=2.0, beta_0=2.0, m_0=0.0, kappa_0=1.0,
                     alpha_p_0=2.0, beta_p_0=2.0, l=200.0, m=10, n=100)
for x in data:
    gmm_model.add_data(x)
gmm_time = time.perf_counter() - start

print(f"BOCD-NIG: {nig_time:.3f}s")
print(f"BOCD-GMM: {gmm_time:.3f}s")
print(f"Ratio: {gmm_time/nig_time:.1f}x slower")
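A single wall-clock measurement is noisy, so the ratio above can vary from run to run. As a rough sketch, the timing can be repeated and summarized by its median. The helper median_runtime is hypothetical, and dummy_workload is only a stand-in for "construct a model and feed it every sample"; neither comes from pybocd.

```python
import statistics
import time


def median_runtime(run_once, repeats=5):
    """Return the median wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def dummy_workload():
    # Stand-in for "construct a model and call add_data on every sample".
    return sum(i * i for i in range(10_000))


print(f"median: {median_runtime(dummy_workload):.6f}s")
```

To compare the two models this way, wrap each benchmark loop in its own function and pass it as run_once, so that every repeat starts from a freshly constructed model.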

Tips for Best Results

1. Choose the Right Model

  • Use BOCD-NIG if: Data is univariate and approximately normal, and speed is critical
  • Use BOCD-GMM if: Data is multimodal, has outliers, or is non-Gaussian
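One rough way to judge "approximately normal" is to look at the sample skewness and excess kurtosis of a stable stretch of data, both of which are close to zero for Gaussian data. The sketch below uses only NumPy; the function name looks_gaussian and the tolerance of 0.5 are illustrative assumptions, not a pybocd feature or a formal statistical test.

```python
import numpy as np


def looks_gaussian(x, tol=0.5):
    """Rough check: sample skewness and excess kurtosis both near zero."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    skewness = np.mean(z ** 3)
    excess_kurtosis = np.mean(z ** 4) - 3.0
    return bool(abs(skewness) < tol and abs(excess_kurtosis) < tol)


rng = np.random.default_rng(0)
unimodal = rng.normal(0.0, 1.0, 5000)
bimodal = np.concatenate([rng.normal(-3.0, 1.0, 2500),
                          rng.normal(3.0, 1.0, 2500)])

print(looks_gaussian(unimodal))
print(looks_gaussian(bimodal))
```

Run the check on a segment without changepoints: a series that spans a shift in mean looks bimodal even when each regime on its own is Gaussian.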

2. Tuning Expected Run Length

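The parameter l sets the expected run length, i.e. roughly how many observations you expect between changepoints. Start with l = 100 and adjust based on your domain knowledge: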

# If changepoints are frequent
model = BOCDNIG(..., l=50)

# If changepoints are rare
model = BOCDNIG(..., l=500)
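To build intuition for what a given l implies, the sketch below assumes the standard BOCD formulation with a constant hazard, where the per-step changepoint probability is 1/l and run lengths follow a geometric prior. Whether pybocd parameterizes l in exactly this way is an assumption; the function names are illustrative.

```python
def constant_hazard(l):
    """Per-step changepoint probability when the expected run length is l."""
    return 1.0 / l


def prob_change_within(n, l):
    """Prior probability of at least one changepoint in the next n steps."""
    return 1.0 - (1.0 - constant_hazard(l)) ** n


for l in (50, 100, 500):
    print(f"l={l}: hazard={constant_hazard(l):.4f}, "
          f"P(change within 100 steps)={prob_change_within(100, l):.3f}")
```

Under this assumption, a smaller l makes the prior expect changepoints more often, so less evidence is needed before the model reports one.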

3. Preprocess Your Data

  • Normalize: Scale to comparable units
  • Remove outliers: If not modeling them explicitly
  • Resample: If data has missing values

# Example preprocessing
data = (data - np.mean(data)) / np.std(data)  # Standardize
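The three steps can be combined in one small function. This is a sketch under stated assumptions: missing values are encoded as NaN and filled by linear interpolation, outliers are clipped using the median absolute deviation (MAD), and the function name preprocess and the clip_k threshold are illustrative choices rather than anything pybocd prescribes.

```python
import numpy as np


def preprocess(x, clip_k=5.0):
    """Fill NaNs by linear interpolation, clip outliers, then standardize."""
    x = np.asarray(x, dtype=float).copy()

    # 1. Fill missing values (NaN) by linear interpolation over the index.
    idx = np.arange(x.size)
    missing = np.isnan(x)
    x[missing] = np.interp(idx[missing], idx[~missing], x[~missing])

    # 2. Clip points further than clip_k robust standard deviations from the median.
    median = np.median(x)
    mad = np.median(np.abs(x - median))
    scale = 1.4826 * mad  # MAD scaled to match the std of a normal distribution
    if scale > 0:
        x = np.clip(x, median - clip_k * scale, median + clip_k * scale)

    # 3. Standardize to zero mean and unit variance.
    std = x.std()
    return (x - x.mean()) / std if std > 0 else x - x.mean()


raw = np.array([10.0, 11.0, np.nan, 9.0, 10.5, 250.0, 10.2, 9.8])
clean = preprocess(raw)
print(np.round(clean, 2))
```

Two cautions apply. The clipping bounds here are computed over the whole series, so a genuine level shift can be clipped along with the outliers; estimating the bounds from a reference window, or choosing a generous clip_k, avoids hiding the very change you want to detect. After standardizing, prior parameters such as m_0 should be chosen in standardized units as well.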

4. Validate Results

Cross-validate with domain experts and other methods:

# Compare with simpler methods (e.g., CUSUM)
# Validate detected changepoints against known events
# Use multiple models and average results
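As a baseline for such a comparison, a two-sided CUSUM detector fits in a few lines of NumPy. This is a minimal textbook-style sketch, not part of pybocd: it assumes the first n_ref samples are free of changepoints and uses them to estimate the in-control mean and standard deviation. The function name and default parameters are illustrative.

```python
import numpy as np


def cusum_alarm(x, n_ref=50, k=0.5, h=5.0):
    """Two-sided CUSUM; returns the index of the first alarm, or None.

    The first n_ref samples estimate the in-control mean and standard deviation.
    k is the slack and h the alarm threshold, both in standard deviations.
    """
    x = np.asarray(x, dtype=float)
    mu, sigma = x[:n_ref].mean(), x[:n_ref].std()
    if sigma == 0:
        raise ValueError("reference window has zero variance")
    s_hi = s_lo = 0.0
    for t in range(n_ref, x.size):
        z = (x[t] - mu) / sigma
        s_hi = max(0.0, s_hi + z - k)  # accumulates upward drift
        s_lo = max(0.0, s_lo - z - k)  # accumulates downward drift
        if s_hi > h or s_lo > h:
            return t
    return None


rng = np.random.default_rng(42)
series = np.concatenate([rng.normal(0, 1, 200), rng.normal(3, 1, 200)])
print(cusum_alarm(series))
```

Raising h lowers the false-alarm rate at the cost of a longer detection delay. If BOCD and CUSUM flag roughly the same location, that agreement is some evidence the changepoint is real.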

Running Examples from pybocd-Examples

Visit the pybocd-Examples repository for complete, runnable examples with real datasets and detailed explanations.

Common Pitfalls to Avoid

  • Setting l too small → Too many false positives
  • Setting l too large → Missing real changepoints
  • Ignoring preprocessing → Poor results on raw data
  • Using BOCD-GMM for simple data → Unnecessary complexity
  • Not validating results → Unknown reliability of detections
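When some changepoints are known in advance (for example, logged incidents), the reliability of detections can be quantified. The sketch below matches each detection to a known event within a tolerance window and reports precision and recall. The function score_detections, the tolerance of 10 steps, and the example indices are hypothetical illustrations.

```python
def score_detections(detected, known, tolerance=10):
    """Match detections to known events; return (precision, recall).

    A detection counts as a hit if it falls at most `tolerance` steps after
    a known event that has not been matched yet.
    """
    unmatched = sorted(known)
    hits = 0
    for d in sorted(detected):
        for event in unmatched:
            if 0 <= d - event <= tolerance:
                unmatched.remove(event)
                hits += 1
                break
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(known) if known else 0.0
    return precision, recall


# Hypothetical detections and ground-truth events.
detected = [103, 180, 252]
known = [100, 250]
print(score_detections(detected, known))
```

Low precision suggests l is too small or the data needs preprocessing; low recall suggests l is too large or the model does not fit the data well.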