Visualize biologically-aware cross-validation to prevent data leakage.
[Interactive demo: PatientSplit, BatchAware, and PhyloSplit views. Each sample is marked Train or Test and carries a Group ID (e.g. Patient).]
Python API
from sciforge.crossvalbio import PatientSplit

# X, y, model, and patient_ids are assumed to be defined elsewhere.
# Ensure no patient leakage
ps = PatientSplit(n_splits=5)
for train, test in ps.split(X, y, groups=patient_ids):
    # Guaranteed disjoint patients
    model.fit(X[train], y[train])
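Standard K-Fold treats multiple samples from the same patient as independent, so one patient's samples can land in both the training and test folds. The model is then scored on patients it has already seen, which biases the performance estimate upward. PatientSplit avoids this by keeping the patients in each train/test pair disjoint.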
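To make the leakage concrete, here is a minimal, standard-library-only sketch that counts how many folds share a patient between train and test. It does not use or reproduce PatientSplit's internals, which are not shown here. The helpers `group_k_fold`, `naive_k_fold`, and `leaky_folds` are hypothetical names introduced for illustration, and the round-robin assignment of patients to folds is an assumed, deliberately simple strategy.

```python
import random


def group_k_fold(groups, n_splits):
    """Yield (train, test) index lists; each group lands in exactly one test fold."""
    unique = sorted(set(groups))
    # Round-robin assignment of groups to folds (illustrative assumption).
    fold_of = {g: i % n_splits for i, g in enumerate(unique)}
    for fold in range(n_splits):
        test = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train, test


def naive_k_fold(n_samples, n_splits, seed=0):
    """Yield (train, test) index lists from shuffled indices, ignoring groups."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    for fold in range(n_splits):
        test = idx[fold::n_splits]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        yield train, test


def leaky_folds(splits, groups):
    """Count folds in which some group appears in both train and test."""
    return sum(
        bool({groups[i] for i in train} & {groups[i] for i in test})
        for train, test in splits
    )


# Synthetic example: 20 hypothetical patients with 3 samples each.
patient_ids = [p for p in range(20) for _ in range(3)]

naive_leaks = leaky_folds(naive_k_fold(len(patient_ids), 5), patient_ids)
group_leaks = leaky_folds(group_k_fold(patient_ids, 5), patient_ids)
print(f"naive folds with leakage: {naive_leaks}, grouped folds with leakage: {group_leaks}")
```

The grouped split has zero leaky folds by construction, because every sample of a patient is assigned to the same fold. The shuffled, group-blind split will typically scatter each patient's three samples across several folds.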