scitex_ml.metrics
Scitex metrics module.
Standardized naming convention:
- calc_* functions: Modern standardized metric calculations
- Legacy names (bACC, balanced_accuracy, etc.): For backward compatibility
- scitex_ml.metrics.calc_bacc(y_true, y_pred, labels=None, fold=None)
Calculate balanced accuracy with robust label handling.
- Parameters:
y_true (np.ndarray) – True labels (can be str or int)
y_pred (np.ndarray) – Predicted labels (can be str or int)
labels (List, optional) – Expected label list
fold (int, optional) – Fold number for tracking
- Returns:
{‘metric’: ‘balanced_accuracy’, ‘value’: float, ‘fold’: int}
- Return type:
Dict[str, Any]
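As an illustration of the quantity reported here, balanced accuracy is the mean of the per-class recalls. The following is a minimal pure-Python sketch, not the scitex_ml implementation; the helper name balanced_accuracy_sketch is hypothetical, and the labels argument is omitted for brevity.

```python
from collections import defaultdict


def balanced_accuracy_sketch(y_true, y_pred, fold=None):
    """Mean of per-class recall (illustrative only, not the library code)."""
    correct = defaultdict(int)  # correctly predicted samples per true class
    total = defaultdict(int)    # number of samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    value = sum(recalls) / len(recalls)
    # Mirror the documented return shape.
    return {"metric": "balanced_accuracy", "value": value, "fold": fold}


# String labels work the same way as integer labels.
y_true = ["seizure", "seizure", "interictal", "interictal", "interictal", "interictal"]
y_pred = ["seizure", "interictal", "interictal", "interictal", "interictal", "interictal"]
result = balanced_accuracy_sketch(y_true, y_pred, fold=0)
```

Here the recall is 1/2 for the seizure class and 4/4 for the interictal class, so the balanced accuracy is 0.75, whereas plain accuracy would be 5/6.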
- scitex_ml.metrics.calc_mcc(y_true, y_pred, labels=None, fold=None)
Calculate Matthews Correlation Coefficient with robust label handling.
- Parameters:
y_true (np.ndarray) – True labels (can be str or int)
y_pred (np.ndarray) – Predicted labels (can be str or int)
labels (List, optional) – Expected label list
fold (int, optional) – Fold number for tracking
- Returns:
{‘metric’: ‘mcc’, ‘value’: float, ‘fold’: int}
- Return type:
Dict[str, Any]
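For reference, the binary form of the Matthews Correlation Coefficient can be computed from the four confusion counts. This is a minimal sketch of the textbook formula, assuming a binary problem; mcc_binary_sketch and its positive argument are hypothetical names, and the library may handle multiclass input and edge cases differently.

```python
import math


def mcc_binary_sketch(y_true, y_pred, positive=1):
    """Binary MCC from TP/FP/FN/TN counts (illustrative only)."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p == positive:
            fp += 1
        elif t == positive and p != positive:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumption: return 0.0 when the denominator vanishes (a common convention).
    return (tp * tn - fp * fn) / denom if denom else 0.0


mcc_value = mcc_binary_sketch([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

With TP = 2, FN = 1, TN = 2, FP = 1, the numerator is 3 and the denominator is 9, giving an MCC of 1/3.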
- scitex_ml.metrics.calc_conf_mat(y_true, y_pred, labels=None, fold=None, normalize=None)
Calculate confusion matrix with robust label handling.
- Parameters:
- Returns:
{‘metric’: ‘confusion_matrix’, ‘value’: pd.DataFrame, ‘fold’: int, ‘labels’: list}
- Return type:
Dict[str, Any]
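One reason to pass an explicit label list is that classes absent from a given fold still get a row and a column. The sketch below illustrates this with nested lists instead of a pd.DataFrame. It assumes rows are true labels and columns are predicted labels (the scikit-learn convention), does not model the normalize option, and uses the hypothetical name confusion_matrix_sketch.

```python
def confusion_matrix_sketch(y_true, y_pred, labels=None, fold=None):
    """Count matrix with rows = true labels, columns = predicted labels."""
    if labels is None:
        # Fall back to the labels observed in either array.
        labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return {
        "metric": "confusion_matrix",
        "value": matrix,
        "fold": fold,
        "labels": list(labels),
    }


# Class "c" never occurs in this fold but is kept because it is listed in labels.
cm_result = confusion_matrix_sketch(
    ["a", "a", "b"], ["a", "b", "b"], labels=["a", "b", "c"], fold=1
)
```

The resulting matrix is [[1, 1, 0], [0, 1, 0], [0, 0, 0]], with an all-zero row and column for the unseen class.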
- scitex_ml.metrics.calc_clf_report(y_true, y_pred, labels=None, fold=None)
Generate classification report with robust label handling.
- Parameters:
y_true (np.ndarray) – True labels (can be str or int)
y_pred (np.ndarray) – Predicted labels (can be str or int)
labels (List, optional) – Expected label list
fold (int, optional) – Fold number for tracking
- Returns:
{‘metric’: ‘classification_report’, ‘value’: pd.DataFrame, ‘fold’: int, ‘labels’: list}
- Return type:
Dict[str, Any]
- scitex_ml.metrics.calc_roc_auc(y_true, y_proba, labels=None, fold=None, return_curve=False)
Calculate ROC AUC score with robust handling.
- Parameters:
- Returns:
{‘metric’: ‘roc_auc’, ‘value’: float, ‘fold’: int}
- Return type:
Dict[str, Any]
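For a binary problem, ROC AUC equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below computes that quantity directly. It assumes y_score holds positive-class scores (how y_proba is interpreted by the library is not documented here), and roc_auc_sketch is a hypothetical name.

```python
def roc_auc_sketch(y_true, y_score, positive=1):
    """Binary ROC AUC via pairwise ranking (O(n_pos * n_neg), illustrative only)."""
    pos = [s for t, s in zip(y_true, y_score) if t == positive]
    neg = [s for t, s in zip(y_true, y_score) if t != positive]
    if not pos or not neg:
        # AUC is undefined when only one class is present.
        return float("nan")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


auc_value = roc_auc_sketch([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

Three of the four positive/negative pairs are ranked correctly, so the AUC is 0.75.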
- scitex_ml.metrics.calc_pre_rec_auc(y_true, y_proba, labels=None, fold=None, return_curve=False)
Calculate Precision-Recall AUC with robust handling.
- Parameters:
- Returns:
{‘metric’: ‘pr_auc’, ‘value’: float, ‘fold’: int}
- Return type:
Dict[str, Any]
- scitex_ml.metrics.calc_bacc_from_conf_mat(cm)
Calculate balanced accuracy from confusion matrix.
- Parameters:
cm (np.ndarray) – Confusion matrix
- Returns:
Balanced accuracy
- Return type:
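Given a confusion matrix, balanced accuracy is the mean of the diagonal entries divided by their row sums. The sketch below assumes rows correspond to true labels and skips classes with no true samples; both are assumptions, and bacc_from_conf_mat_sketch is a hypothetical name rather than the library function.

```python
def bacc_from_conf_mat_sketch(cm):
    """Mean per-class recall from a square confusion matrix (rows = true labels)."""
    recalls = []
    for i, row in enumerate(cm):
        support = sum(row)
        if support > 0:  # assumption: ignore classes with no true samples
            recalls.append(row[i] / support)
    return sum(recalls) / len(recalls)


bacc_value = bacc_from_conf_mat_sketch([[8, 2], [1, 9]])
```

The per-class recalls are 8/10 and 9/10, so the result is 0.85 up to floating-point rounding.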
- scitex_ml.metrics.calc_seizure_window_prediction_metrics(y_true, y_pred, metadata, window_duration_min=1.0)
Calculate clinical seizure prediction metrics (window-based).
This function calculates window-based sensitivity, meaning it measures the percentage of seizure time windows that were correctly identified. This is NOT event-based sensitivity (which would measure % of seizure events detected regardless of how many windows within each event).
- Parameters:
y_true (np.ndarray) – True labels (string: ‘seizure’ or ‘interictal_control’)
y_pred (np.ndarray) – Predicted labels (string: ‘seizure’ or ‘interictal_control’)
metadata (pd.DataFrame) – Metadata with ‘seizure_type’ column indicating seizure/interictal periods
window_duration_min (float, optional) – Duration of each time window in minutes (default: 1.0)
- Returns:
Dictionary containing:
- seizure_sensitivity: % of seizure time windows detected (NOT event-based)
- fp_per_hour: False positives per hour during interictal periods
- time_in_warning: % of total time in alarm state
- n_seizure_windows: Number of seizure windows
- n_interictal_windows: Number of interictal windows
- n_true_positives: Correctly predicted seizure windows
- n_false_positives: Incorrectly predicted as seizure
- n_false_negatives: Missed seizure windows
- n_true_negatives: Correctly predicted as interictal
- meets_sensitivity_target: Whether sensitivity ≥ 90%
- meets_fp_target: Whether FP/h ≤ 0.2
- meets_tiw_target: Whether time in warning ≤ 20%
- Return type:
Notes
False positives are calculated only during interictal periods
True positives/false negatives are calculated only during seizure periods
Clinical targets based on FDA guidance for seizure prediction devices
For event-based sensitivity, use calc_seizure_event_prediction_metrics instead
Example
>>> # 1 seizure spanning 20 windows, detect 5 windows
>>> # Window-based sensitivity: 5/20 = 25%
>>> # This measures temporal coverage of the seizure
References
FDA guidance on seizure prediction devices
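The window-based quantities described above can be sketched as follows. This is a simplified illustration, not the library code: it derives seizure and interictal periods from y_true instead of the metadata ‘seizure_type’ column, and it assumes that FP/h divides false positives by total interictal duration and that time in warning is the share of all windows predicted as seizure. The name window_metrics_sketch is hypothetical.

```python
def window_metrics_sketch(y_true, y_pred, window_duration_min=1.0):
    """Window-based sensitivity, FP/h, and time in warning (illustrative only)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == "seizure" and p == "seizure" for t, p in pairs)
    fn = sum(t == "seizure" and p != "seizure" for t, p in pairs)
    fp = sum(t != "seizure" and p == "seizure" for t, p in pairs)
    tn = sum(t != "seizure" and p != "seizure" for t, p in pairs)
    nan = float("nan")
    interictal_hours = (fp + tn) * window_duration_min / 60.0
    sensitivity = 100.0 * tp / (tp + fn) if (tp + fn) else nan
    fp_per_hour = fp / interictal_hours if interictal_hours else nan
    time_in_warning = 100.0 * (tp + fp) / len(pairs) if pairs else nan
    return {
        "seizure_sensitivity": sensitivity,
        "fp_per_hour": fp_per_hour,
        "time_in_warning": time_in_warning,
        "meets_sensitivity_target": sensitivity >= 90.0,
        "meets_fp_target": fp_per_hour <= 0.2,
        "meets_tiw_target": time_in_warning <= 20.0,
    }


# One seizure spanning 20 windows (5 detected) plus 10 hours of interictal data
# containing a single false alarm.
y_true = ["seizure"] * 20 + ["interictal_control"] * 600
y_pred = (
    ["seizure"] * 5
    + ["interictal_control"] * 15
    + ["seizure"] * 1
    + ["interictal_control"] * 599
)
window_result = window_metrics_sketch(y_true, y_pred)
```

In this setup the window-based sensitivity is 25%, which misses the 90% target, while the false-alarm rate of 0.1 per hour meets the 0.2 FP/h target.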
- scitex_ml.metrics.calc_seizure_event_prediction_metrics(y_true, y_pred, metadata, window_duration_min=1.0)
Calculate clinical seizure prediction metrics (event-based).
This function calculates event-based sensitivity, meaning it measures whether each seizure EVENT was detected (at least one alarm raised), regardless of how many windows within that event were predicted.
This is clinically more relevant as one timely alarm per seizure event is sufficient for intervention, matching the clinical requirement: “Did the system raise an alarm for this seizure?”
- Parameters:
y_true (np.ndarray) – True labels (string: ‘seizure’ or ‘interictal_control’)
y_pred (np.ndarray) – Predicted labels (string: ‘seizure’ or ‘interictal_control’)
metadata (pd.DataFrame) – Metadata with ‘seizure_type’ and ‘seizure_id’ columns. seizure_id is a unique identifier for each seizure event (e.g., ‘sz_001’, ‘sz_002’) and should be NaN or empty for interictal periods.
window_duration_min (float, optional) – Duration of each time window in minutes (default: 1.0)
- Returns:
Dictionary containing:
- seizure_sensitivity: % of seizure events detected (event-based)
- fp_per_hour: False positives per hour during interictal periods
- time_in_warning: % of total time in alarm state
- n_seizure_events: Number of unique seizure events
- n_detected_events: Number of events with at least one alarm
- n_missed_events: Number of events with zero alarms
- n_interictal_windows: Number of interictal windows
- n_false_positives: Incorrectly predicted as seizure
- n_true_negatives: Correctly predicted as interictal
- meets_sensitivity_target: Whether sensitivity ≥ 90%
- meets_fp_target: Whether FP/h ≤ 0.2
- meets_tiw_target: Whether time in warning ≤ 20%
- Return type:
Notes
Requires ‘seizure_id’ column in metadata to group windows by event
False positives are calculated only during interictal periods
Event detection requires at least one window predicted as seizure
Clinical targets based on FDA guidance for seizure prediction devices
For window-based sensitivity, use calc_seizure_window_prediction_metrics instead
Example
>>> # 1 seizure spanning 20 windows, detect just 1 window
>>> # Event-based sensitivity: 1/1 = 100% (event was detected!)
>>> # This measures "did we catch the seizure at all?"
References
FDA guidance on seizure prediction devices
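The event-level grouping described above can be sketched by collecting windows per seizure_id and marking an event as detected when any of its windows is predicted as seizure. This is an illustration under assumptions: the IDs are passed as a plain list instead of a metadata column, None stands in for the NaN or empty value used for interictal windows, and event_sensitivity_sketch is a hypothetical name.

```python
def event_sensitivity_sketch(y_pred, seizure_ids):
    """Event-based sensitivity: an event counts once if any window alarms."""
    detected = {}
    for pred, sz_id in zip(y_pred, seizure_ids):
        if sz_id is None:  # interictal window, not part of any event
            continue
        detected[sz_id] = detected.get(sz_id, False) or pred == "seizure"
    n_events = len(detected)
    n_detected = sum(detected.values())
    sensitivity = 100.0 * n_detected / n_events if n_events else float("nan")
    return {
        "seizure_sensitivity": sensitivity,
        "n_seizure_events": n_events,
        "n_detected_events": n_detected,
        "n_missed_events": n_events - n_detected,
    }


# sz_001 spans 20 windows with a single alarm; sz_002 spans 10 windows with none.
seizure_ids = ["sz_001"] * 20 + ["sz_002"] * 10 + [None] * 30
y_pred = (
    ["interictal_control"] * 7
    + ["seizure"]
    + ["interictal_control"] * 12
    + ["interictal_control"] * 10
    + ["interictal_control"] * 30
)
event_result = event_sensitivity_sketch(y_pred, seizure_ids)
```

One of the two events is detected, so the event-based sensitivity is 50%, even though only 1 of the 30 seizure windows raised an alarm.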
- scitex_ml.metrics.calc_seizure_prediction_metrics(y_true, y_pred, metadata, window_duration_min=1.0)
Calculate clinical seizure prediction metrics (window-based).
This function calculates window-based sensitivity, meaning it measures the percentage of seizure time windows that were correctly identified. This is NOT event-based sensitivity (which would measure % of seizure events detected regardless of how many windows within each event).
- Parameters:
y_true (np.ndarray) – True labels (string: ‘seizure’ or ‘interictal_control’)
y_pred (np.ndarray) – Predicted labels (string: ‘seizure’ or ‘interictal_control’)
metadata (pd.DataFrame) – Metadata with ‘seizure_type’ column indicating seizure/interictal periods
window_duration_min (float, optional) – Duration of each time window in minutes (default: 1.0)
- Returns:
Dictionary containing:
- seizure_sensitivity: % of seizure time windows detected (NOT event-based)
- fp_per_hour: False positives per hour during interictal periods
- time_in_warning: % of total time in alarm state
- n_seizure_windows: Number of seizure windows
- n_interictal_windows: Number of interictal windows
- n_true_positives: Correctly predicted seizure windows
- n_false_positives: Incorrectly predicted as seizure
- n_false_negatives: Missed seizure windows
- n_true_negatives: Correctly predicted as interictal
- meets_sensitivity_target: Whether sensitivity ≥ 90%
- meets_fp_target: Whether FP/h ≤ 0.2
- meets_tiw_target: Whether time in warning ≤ 20%
- Return type:
Notes
False positives are calculated only during interictal periods
True positives/false negatives are calculated only during seizure periods
Clinical targets based on FDA guidance for seizure prediction devices
For event-based sensitivity, use calc_seizure_event_prediction_metrics instead
Example
>>> # 1 seizure spanning 20 windows, detect 5 windows
>>> # Window-based sensitivity: 5/20 = 25%
>>> # This measures temporal coverage of the seizure
References
FDA guidance on seizure prediction devices
- scitex_ml.metrics.calc_silhouette_score_slow(X, labels, metric='euclidean', sample_size=None, random_state=None, **kwds)
Compute the mean Silhouette Coefficient of all samples.
This method is computationally expensive compared to the reference one.
The Silhouette Coefficient is calculated using the mean intra-cluster distance (a) and the mean nearest-cluster distance (b) for each sample. The Silhouette Coefficient for a sample is
(b - a) / max(a, b). To clarify, b is the mean distance between a sample and the nearest cluster that the sample is not a part of. This function returns the mean Silhouette Coefficient over all samples. To obtain the values for each sample, use silhouette_samples.
The best value is 1 and the worst value is -1. Values near 0 indicate overlapping clusters. Negative values generally indicate that a sample has been assigned to the wrong cluster, as a different cluster is more similar.
- Parameters:
X (array [n_samples_a, n_features]) – Feature array.
labels (array, shape = [n_samples]) – label values for each sample
metric (string, or callable) – The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by metrics.pairwise._pairwise_distances. If X is the distance array itself, use “precomputed” as the metric.
sample_size (int or None) – The size of the sample to use when computing the Silhouette Coefficient. If sample_size is None, no sampling is used.
random_state (integer or numpy.RandomState, optional) – The generator used to randomly select the subset of samples when sample_size is not None. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
**kwds (optional keyword parameters) – Any further parameters are passed directly to the distance function. If using a scipy.spatial.distance metric, the parameters are still metric dependent. See the scipy docs for usage examples.
- Returns:
silhouette – Mean Silhouette Coefficient for all samples.
- Return type:
References
- Peter J. Rousseeuw (1987). “Silhouettes: a Graphical Aid to the Interpretation and Validation of Cluster Analysis”. Journal of Computational and Applied Mathematics 20: 53-65. doi:10.1016/0377-0427(87)90125-7.
- scitex_ml.metrics.calc_silhouette_samples_slow(X, labels, metric='euclidean', **kwds)
Compute the Silhouette Coefficient for each sample.
The Silhouette Coefficient is a measure of how well samples are clustered with samples that are similar to themselves. Clustering models with a high Silhouette Coefficient are said to be dense, where samples in the same cluster are similar to each other, and well separated, where samples in different clusters are not very similar to each other.
The Silhouette Coefficient is calculated using the mean intra-cluster distance (a) and the mean nearest-cluster distance (b) for each sample. The Silhouette Coefficient for a sample is
(b - a) / max(a, b). This function returns the Silhouette Coefficient for each sample.
The best value is 1 and the worst value is -1. Values near 0 indicate overlapping clusters.
- Parameters:
X (array [n_samples_a, n_features]) – Feature array.
labels (array, shape = [n_samples]) – label values for each sample
metric (string, or callable) – The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by metrics.pairwise._pairwise_distances. If X is the distance array itself, use “precomputed” as the metric.
**kwds (optional keyword parameters) – Any further parameters are passed directly to the distance function. If using a scipy.spatial.distance metric, the parameters are still metric dependent. See the scipy docs for usage examples.
- Returns:
silhouette – Silhouette Coefficient for each sample.
- Return type:
array, shape = [n_samples]
References
- Peter J. Rousseeuw (1987). “Silhouettes: a Graphical Aid to the Interpretation and Validation of Cluster Analysis”. Journal of Computational and Applied Mathematics 20: 53-65. doi:10.1016/0377-0427(87)90125-7.
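The per-sample definition (b - a) / max(a, b) can be illustrated with a direct, quadratic-time computation. The sketch below uses Euclidean distance only, requires at least two clusters, and assigns 0 to samples in singleton clusters (an assumption following common convention). It is not the scitex_ml implementation, and silhouette_samples_sketch is a hypothetical name.

```python
import math
from collections import defaultdict


def silhouette_samples_sketch(X, labels):
    """Per-sample Silhouette Coefficient with Euclidean distance (illustrative)."""
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    scores = []
    for i, label_i in enumerate(labels):
        own = [j for j in clusters[label_i] if j != i]
        if not own:
            scores.append(0.0)  # assumption: singleton clusters score 0
            continue
        # a: mean distance to the other members of the sample's own cluster
        a = sum(math.dist(X[i], X[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(X[i], X[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label_i
        )
        scores.append((b - a) / max(a, b))
    return scores


X = [[0.0], [1.0], [10.0], [11.0]]
labels = [0, 0, 1, 1]
sample_scores = silhouette_samples_sketch(X, labels)
mean_score = sum(sample_scores) / len(sample_scores)
```

For the first sample, a = 1 and b = (10 + 11) / 2 = 10.5, so its coefficient is 9.5 / 10.5. Averaging the per-sample values gives the mean Silhouette Coefficient reported by the score functions.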
- scitex_ml.metrics.calc_silhouette_score_block(X, labels, metric='euclidean', sample_size=None, random_state=None, n_jobs=1, **kwds)
Compute the mean Silhouette Coefficient of all samples.
The Silhouette Coefficient is calculated using the mean intra-cluster distance (a) and the mean nearest-cluster distance (b) for each sample. The Silhouette Coefficient for a sample is
(b - a) / max(a, b). To clarify, b is the mean distance between a sample and the nearest cluster that the sample is not a part of. This function returns the mean Silhouette Coefficient over all samples. To obtain the values for each sample, use silhouette_samples.
The best value is 1 and the worst value is -1. Values near 0 indicate overlapping clusters. Negative values generally indicate that a sample has been assigned to the wrong cluster, as a different cluster is more similar.
- Parameters:
X (array [n_samples_a, n_features]) – Feature array.
labels (array, shape = [n_samples]) – label values for each sample
metric (string, or callable) – The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by metrics.pairwise._pairwise_distances. If X is the distance array itself, use “precomputed” as the metric.
sample_size (int or None) – The size of the sample to use when computing the Silhouette Coefficient. If sample_size is None, no sampling is used.
random_state (integer or numpy.RandomState, optional) – The generator used to randomly select the subset of samples when sample_size is not None. If an integer is given, it fixes the seed. Defaults to the global numpy random number generator.
**kwds (optional keyword parameters) – Any further parameters are passed directly to the distance function. If using a scipy.spatial.distance metric, the parameters are still metric dependent. See the scipy docs for usage examples.
- Returns:
silhouette – Mean Silhouette Coefficient for all samples.
- Return type:
References
- Peter J. Rousseeuw (1987). “Silhouettes: a Graphical Aid to the Interpretation and Validation of Cluster Analysis”. Journal of Computational and Applied Mathematics 20: 53-65. doi:10.1016/0377-0427(87)90125-7.
- scitex_ml.metrics.calc_silhouette_samples_block(X, labels, metric='euclidean', n_jobs=1, **kwds)
Compute the Silhouette Coefficient for each sample.
The Silhouette Coefficient is a measure of how well samples are clustered with samples that are similar to themselves. Clustering models with a high Silhouette Coefficient are said to be dense, where samples in the same cluster are similar to each other, and well separated, where samples in different clusters are not very similar to each other.
The Silhouette Coefficient is calculated using the mean intra-cluster distance (a) and the mean nearest-cluster distance (b) for each sample. The Silhouette Coefficient for a sample is
(b - a) / max(a, b). This function returns the Silhouette Coefficient for each sample.
The best value is 1 and the worst value is -1. Values near 0 indicate overlapping clusters.
- Parameters:
X (array [n_samples_a, n_features]) – Feature array.
labels (array, shape = [n_samples]) – label values for each sample
metric (string, or callable) – The metric to use when calculating distance between instances in a feature array. If metric is a string, it must be one of the options allowed by metrics.pairwise._pairwise_distances. If X is the distance array itself, use “precomputed” as the metric.
**kwds (optional keyword parameters) – Any further parameters are passed directly to the distance function. If using a scipy.spatial.distance metric, the parameters are still metric dependent. See the scipy docs for usage examples.
- Returns:
silhouette – Silhouette Coefficient for each sample.
- Return type:
array, shape = [n_samples]
References
- Peter J. Rousseeuw (1987). “Silhouettes: a Graphical Aid to the Interpretation and Validation of Cluster Analysis”. Journal of Computational and Applied Mathematics 20: 53-65. doi:10.1016/0377-0427(87)90125-7.
- scitex_ml.metrics.calc_feature_importance(model, feature_names=None, top_n=None)
Calculate feature importance from a trained model.
- Parameters:
model (object) – Trained model with feature importance attributes. Supports:
- Tree-based: feature_importances_ (RandomForest, XGBoost, etc.)
- Linear: coef_ (LogisticRegression, LinearSVC, etc.)
feature_names (List[str], optional) – Names of features. If None, uses feature_0, feature_1, …
top_n (int, optional) – Return only top N most important features
- Return type:
- Returns:
importance_dict (Dict[str, float]) – Dictionary mapping feature names to importance scores
importance_array (np.ndarray) – Array of importance scores (same order as feature_names)
- Raises:
ValueError – If model doesn’t support feature importance extraction
Examples
>>> from sklearn.ensemble import RandomForestClassifier
>>> import numpy as np
>>> from scitex_ml.metrics import calc_feature_importance
>>> X = np.random.rand(100, 5)
>>> y = np.random.randint(0, 2, 100)
>>> model = RandomForestClassifier().fit(X, y)
>>> importance_dict, importance_array = calc_feature_importance(
...     model, feature_names=['f1', 'f2', 'f3', 'f4', 'f5']
... )
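The attribute-based dispatch described in the parameters can be sketched without scikit-learn by using stand-in model objects. Everything here is illustrative: feature_importance_sketch, ToyTreeModel, and ToyLinearModel are hypothetical names, coefficients are assumed to be one-dimensional and ranked by absolute value, and the way top_n interacts with the returned array in the library is not documented above.

```python
def feature_importance_sketch(model, feature_names=None, top_n=None):
    """Read importances from feature_importances_ or coef_ (illustrative only)."""
    if hasattr(model, "feature_importances_"):
        scores = list(model.feature_importances_)
    elif hasattr(model, "coef_"):
        # Assumption: 1-D coefficients, importance taken as the magnitude.
        scores = [abs(c) for c in model.coef_]
    else:
        raise ValueError("model exposes neither feature_importances_ nor coef_")
    if feature_names is None:
        feature_names = [f"feature_{i}" for i in range(len(scores))]
    ranked = sorted(zip(feature_names, scores), key=lambda kv: kv[1], reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    # The dict is ordered by importance; the list keeps the original feature order.
    return dict(ranked), scores


class ToyTreeModel:
    """Stand-in for a fitted tree ensemble."""
    feature_importances_ = [0.1, 0.6, 0.3]


class ToyLinearModel:
    """Stand-in for a fitted linear model."""
    coef_ = [-2.0, 0.5, 1.0]


tree_dict, tree_scores = feature_importance_sketch(ToyTreeModel(), top_n=2)
linear_dict, linear_scores = feature_importance_sketch(ToyLinearModel(), ["a", "b", "c"])
```

Taking absolute values for coef_ means a strongly negative coefficient ranks as highly as a strongly positive one, which is usually the intent when ranking features by influence.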
- scitex_ml.metrics.calc_permutation_importance(model, X, y, feature_names=None, n_repeats=10, random_state=None, scoring=None)
Calculate permutation feature importance.
More reliable than built-in importance for some models, but slower.
- Parameters:
model (object) – Trained model
X (np.ndarray) – Feature matrix
y (np.ndarray) – Target vector
feature_names (List[str], optional) – Names of features
n_repeats (int, default 10) – Number of times to permute each feature
random_state (int, optional) – Random seed for reproducibility
scoring (str, optional) – Scoring metric (default uses model’s score method)
- Return type:
- Returns:
importance_mean (Dict[str, float]) – Mean importance for each feature
importance_std (Dict[str, float]) – Standard deviation of importance for each feature
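The idea behind permutation importance is to shuffle one feature column at a time and record how much the model's score drops. The sketch below shows this with nested lists and a toy classifier. It assumes importance is defined as the baseline score minus the permuted score and uses the model's own score method; permutation_importance_sketch and ThresholdModel are hypothetical names, not part of scitex_ml.

```python
import random
import statistics


def permutation_importance_sketch(model, X, y, n_repeats=10, random_state=None):
    """Mean and std of the score drop after shuffling each column (illustrative)."""
    rng = random.Random(random_state)
    baseline = model.score(X, y)
    means, stds = {}, {}
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild X with only this column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - model.score(X_perm, y))
        means[f"feature_{col}"] = statistics.mean(drops)
        stds[f"feature_{col}"] = statistics.pstdev(drops)
    return means, stds


class ThresholdModel:
    """Toy classifier that only looks at column 0."""

    def predict(self, X):
        return [int(row[0] > 0.5) for row in X]

    def score(self, X, y):
        preds = self.predict(X)
        return sum(p == t for p, t in zip(preds, y)) / len(y)


X = [[0.0, 5.0], [0.1, 3.0], [0.9, 4.0], [1.0, 1.0], [0.2, 2.0], [0.8, 6.0]]
y = [0, 0, 1, 1, 0, 1]
perm_means, perm_stds = permutation_importance_sketch(
    ThresholdModel(), X, y, n_repeats=20, random_state=0
)
```

Because the toy model ignores column 1, shuffling it never changes the score and its importance is exactly zero, while shuffling column 0 can only lower the score from its perfect baseline.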
Modules