Basic Examples
This page walks through the two foundational ModelDoctor examples: diagnosing a classification model and diagnosing a regression model.
All examples use synthetic data generated by scikit-learn, so no external datasets need to be downloaded. The full runnable scripts are in the examples/ directory of the repository.
Example 01: Basic Classification
Script: examples/01_basic_classification.py
The "Hello World" of ModelDoctor. Trains a Random Forest classifier and runs the full diagnostic pipeline.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import modeldoctor as md
# 1. Generate a dataset with noise and redundancy
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=5,
    n_redundant=10,
    random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# 2. Train a model
model = RandomForestClassifier(n_estimators=50, max_depth=5, random_state=42)
model.fit(X_train, y_train)
# 3. Diagnose
report = md.diagnose(model, X_train, y_train, X_test, y_test)
# 4. Print results
print(f"Health Score: {report.health_score.overall:.1f} / 100")
print(f"Grade: {report.health_score.grade}")
all_findings = [f for d in report.diagnoses for f in (d.findings or [])]
flagged = [f for f in all_findings if f.severity.value in ("warning", "critical")]
if flagged:
    print(f"\nFindings ({len(flagged)} issues flagged):")
    for f in flagged:
        print(f" [{f.severity.value.upper()}] {f.title}")
        print(f" {f.explanation}")
else:
    print("\nNo high-severity issues found.")
# 5. Review prescriptions
recs = report.prescription.all_recommendations
if recs:
    print(f"\nRecommendations ({len(recs)}):")
    for rec in recs:
        print(f" - {rec.description}")
        print(f" Estimated gain: {rec.estimated_improvement}")
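Step 1 above describes the dataset as containing noise and redundancy. The sketch below makes that concrete using only scikit-learn and NumPy. It departs from the example in one way: it passes shuffle=False, so the columns stay in generation order (informative, then redundant, then noise). That setting is an assumption made here for illustration and is not part of the example script.

```python
import numpy as np
from sklearn.datasets import make_classification

# Same parameters as the example, except shuffle=False (an illustrative
# change) so the column order is informative, redundant, then noise.
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=5,
    n_redundant=10,
    shuffle=False,
    random_state=42,
)

# The first 15 columns hold 5 informative features plus 10 redundant ones.
# Each redundant column is a linear combination of the informative columns,
# so together they span no more dimensions than the informative ones alone.
rank_signal = np.linalg.matrix_rank(X[:, :15], tol=1e-8)

# The remaining 5 columns are independent noise and add new directions.
rank_all = np.linalg.matrix_rank(X, tol=1e-8)

print(f"Rank of informative + redundant columns: {rank_signal}")
print(f"Rank of all 20 columns: {rank_all}")
```

Of the 20 features, only 5 carry independent signal, 10 duplicate that signal in linear form, and 5 are noise. This is the kind of structure a diagnostic tool can be expected to comment on.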
Expected Output
A constrained Random Forest (max_depth=5) on this synthetic dataset typically produces a health score above 80 with no critical findings.
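To see why the depth constraint matters, the following sketch compares training and test accuracy for the constrained forest and for an unconstrained one, using plain scikit-learn. It is an independent sanity check, not a description of how ModelDoctor computes its health score; the accuracy_gap helper is a hypothetical name introduced here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=5,
    n_redundant=10,
    random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


def accuracy_gap(max_depth):
    """Hypothetical helper: return (train accuracy, test accuracy, gap)."""
    model = RandomForestClassifier(
        n_estimators=50, max_depth=max_depth, random_state=42
    )
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    return train_acc, test_acc, train_acc - test_acc


# max_depth=5 matches the example; None lets every tree grow fully.
results = {depth: accuracy_gap(depth) for depth in (5, None)}

for depth, (train_acc, test_acc, gap) in results.items():
    label = "unlimited" if depth is None else str(depth)
    print(f"max_depth={label:>9}  train={train_acc:.3f}  "
          f"test={test_acc:.3f}  gap={gap:.3f}")
```

Fully grown trees usually fit the training set almost perfectly, so a large train/test gap in the unconstrained row is a common sign of memorization.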
Example 02: Basic Regression
Script: examples/02_basic_regression.py
Demonstrates ModelDoctor on a regression task. ModelDoctor automatically infers the task type from y_train.
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
import modeldoctor as md
X, y = make_regression(
    n_samples=1000,
    n_features=20,
    n_informative=10,
    noise=0.1,
    random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestRegressor(n_estimators=50, max_depth=5, random_state=100)
model.fit(X_train, y_train)
report = md.diagnose(model, X_train, y_train, X_test, y_test)
print(f"Health Score: {report.health_score.overall:.1f} / 100")
print(f"Grade: {report.health_score.grade}")
all_findings = [f for d in report.diagnoses for f in (d.findings or [])]
flagged = [f for f in all_findings if f.severity.value in ("warning", "critical")]
if flagged:
    print(f"\nFindings ({len(flagged)} issues):")
    for f in flagged:
        print(f" [{f.severity.value.upper()}] {f.title}: {f.explanation}")
else:
    print("\nNo high-severity issues found. Model appears healthy.")
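The example relies on the task type being inferred from y_train. ModelDoctor's own inference logic is not shown on this page; the sketch below is one plausible way to do it, built on scikit-learn's type_of_target. The infer_task_type function is a hypothetical helper, not part of the ModelDoctor API.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.utils.multiclass import type_of_target


def infer_task_type(y):
    """Hypothetical helper: map a target array to a coarse task type."""
    target_kind = type_of_target(y)
    if target_kind in ("binary", "multiclass"):
        return "classification"
    if target_kind == "continuous":
        return "regression"
    raise ValueError(f"Unsupported target type: {target_kind}")


_, y_cls = make_classification(n_samples=200, random_state=42)
_, y_reg = make_regression(n_samples=200, noise=0.1, random_state=42)

print(infer_task_type(y_cls))
print(infer_task_type(y_reg))
```

One caveat of this approach: a regression target made up entirely of whole numbers (for example, counts) is reported as multiclass by type_of_target, so a heuristic like this can misclassify such tasks.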
Notes on Regression
- CalibrationDoctor is skipped for regression tasks (it checks probability outputs).
- PredictionDoctor evaluates R², MSE, and MAE instead of accuracy and F1.
- OverfittingDoctor compares training R² vs. test R² to detect memorization.
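The metrics named in these notes can be reproduced by hand with scikit-learn. The sketch below computes R², MSE, and MAE on the test set and the gap between training and test R² for the same model as in Example 02. It only illustrates the quantities involved; how ModelDoctor weighs or thresholds them is not covered here.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(
    n_samples=1000,
    n_features=20,
    n_informative=10,
    noise=0.1,
    random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=50, max_depth=5, random_state=100)
model.fit(X_train, y_train)

pred_train = model.predict(X_train)
pred_test = model.predict(X_test)

# Regression counterparts of accuracy and F1.
test_r2 = r2_score(y_test, pred_test)
test_mse = mean_squared_error(y_test, pred_test)
test_mae = mean_absolute_error(y_test, pred_test)

# A training R² far above the test R² suggests memorization.
train_r2 = r2_score(y_train, pred_train)
r2_gap = train_r2 - test_r2

print(f"Test R²:  {test_r2:.3f}")
print(f"Test MSE: {test_mse:.1f}")
print(f"Test MAE: {test_mae:.1f}")
print(f"Train R² {train_r2:.3f} vs. test R² {test_r2:.3f} (gap {r2_gap:.3f})")
```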
Running the Examples
All examples in the examples/ directory are standalone scripts. Run each one directly with the Python interpreter, for example: python examples/01_basic_classification.py
Next Steps
- Advanced Examples — custom doctors, leakage detection, and overfitting analysis.
- Classification Guide — deeper walkthrough of classification diagnostics.
- Regression Guide — deeper walkthrough of regression diagnostics.