spacr.deep_spacr
Module Contents
- spacr.deep_spacr.apply_model(src, model_path, image_size=224, batch_size=64, normalize=True, n_jobs=10)[source]
Apply a trained PyTorch model to images in a directory.
The function loads a saved model, builds a dataset from the input images, runs batched inference, and saves prediction scores to a CSV file.
- Parameters:
src (str or sequence) – Path to the input image directory or collection of image paths.
model_path (str) – Path to the saved PyTorch model.
image_size (int) – Final square crop size used before inference.
batch_size (int) – Number of images processed per batch.
normalize (bool) – Whether to normalize the image channels using mean 0.5 and standard deviation 0.5.
n_jobs (int) – Number of worker processes used by the DataLoader.
- Returns:
DataFrame with image paths and predicted positive-class probabilities.
- Return type:
pandas.DataFrame
The returned DataFrame contains the columns path and pred. Results are also written to a CSV file derived from model_path and the current date. The model output is interpreted as a binary logit and converted to probabilities with torch.sigmoid.
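The logit-to-probability conversion described above can be illustrated without PyTorch. This is a minimal sketch, assuming one raw logit per image; sigmoid and logits_to_rows are hypothetical helper names that are not part of spacr, and the real function applies torch.sigmoid to tensors rather than Python floats.

```python
import math

def sigmoid(logit: float) -> float:
    """Numerically stable logistic function (scalar analogue of torch.sigmoid)."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)

def logits_to_rows(paths, logits):
    """Pair each image path with its positive-class probability (hypothetical helper)."""
    return [{"path": p, "pred": sigmoid(x)} for p, x in zip(paths, logits)]

# Illustrative paths and logits only; these are not real spacr outputs.
rows = logits_to_rows(["a.png", "b.png", "c.png"], [-2.0, 0.0, 3.0])
```

A logit of 0 maps to a probability of exactly 0.5, negative logits fall below it, and positive logits rise above it.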
- spacr.deep_spacr.apply_model_to_tar(settings={})[source]
Apply a trained PyTorch model to images stored in a tar archive.
The function loads a saved model, reads images from a tar-based dataset, performs batched inference, post-processes prediction scores, and saves the results to a CSV file.
- Parameters:
settings (dict) – Dictionary of inference settings. Expected keys include tar_path, model_path, image_size, batch_size, normalize, n_jobs, verbose, and score_threshold.
- Returns:
DataFrame with processed prediction results.
- Return type:
pandas.DataFrame
The returned DataFrame contains at least the columns path and pred. Additional columns may be added by process_vision_results. If the model output has shape (N, 2), the probability of class 1 is computed with torch.softmax. Otherwise, outputs are treated as binary logits and converted with torch.sigmoid.
- spacr.deep_spacr.evaluate_model_performance(model, loader, epoch, loss_type='auto', loss_fn=None, num_classes=None)[source]
Evaluates performance for binary or multiclass models.
- Returns:
Metrics, loss, and epoch, plus [prediction_probs, all_labels].
Binary: probs has shape (N,).
Multiclass: probs has shape (N, C).
- Return type:
data_dict (dict)
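The two probability layouts lead to different ways of deriving predicted labels. The sketch below assumes a 0.5 decision threshold for the binary case and an argmax for the multiclass case; both are assumptions for illustration, and predicted_labels and accuracy are hypothetical helpers, not spacr functions.

```python
def predicted_labels(probs, threshold=0.5):
    """Convert probabilities to class indices for (N,) or (N, C) shaped input."""
    labels = []
    for p in probs:
        if isinstance(p, (list, tuple)):
            # Multiclass row of C probabilities: pick the index of the largest.
            labels.append(max(range(len(p)), key=lambda c: p[c]))
        else:
            # Binary scalar probability: compare against the threshold.
            labels.append(1 if p >= threshold else 0)
    return labels

def accuracy(probs, all_labels):
    """Fraction of predictions that match the true labels."""
    preds = predicted_labels(probs)
    return sum(int(a == b) for a, b in zip(preds, all_labels)) / len(all_labels)
```

For example, accuracy([0.9, 0.2, 0.6], [1, 0, 0]) scores two of three binary predictions as correct, while a multiclass call passes one list of C probabilities per sample.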
- spacr.deep_spacr.test_model_core(model, loader, loader_name, epoch, loss_type)[source]
Core test loop returning both summary metrics and a row-per-image dataframe, compatible with binary & multiclass.
- spacr.deep_spacr.test_model_performance(loaders, model, loader_name_list, epoch, loss_type)[source]
Wrapper kept for API compatibility with existing callers. Returns (summary_metrics_dataframe, per_file_results_dataframe).
- spacr.deep_spacr.train_model(src, dst, model_type, train_loaders, epochs=100, learning_rate=0.0001, weight_decay=0.05, amsgrad=False, optimizer_type='adamw', use_checkpoint=False, dropout_rate=0, n_jobs=20, val_loaders=None, test_loaders=None, init_weights='imagenet', intermedeate_save=None, chan_dict=None, schedule=None, loss_type='auto', gradient_accumulation=False, gradient_accumulation_steps=4, channels=['r', 'g', 'b'], verbose=False, num_classes=2, early_stopping_patience=0)[source]
Trains a model (supports 2-class and >2-class via CrossEntropy).
- New parameters:
- early_stopping_patience: Number of epochs without validation improvement before training stops.
Set to 0 to disable early stopping (original behavior).
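The patience rule can be sketched as follows. The documentation does not say which validation metric is monitored, so this example assumes validation loss, where lower is better; epochs_run is a hypothetical helper and not part of spacr.

```python
def epochs_run(val_losses, early_stopping_patience=0):
    """Return how many epochs would run given a sequence of validation losses."""
    best = float("inf")
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
        # A patience of 0 disables early stopping entirely.
        if early_stopping_patience > 0 and stale >= early_stopping_patience:
            return epoch
    return len(val_losses)

# Illustrative loss curve, not real training output.
losses = [1.0, 0.8, 0.9, 0.85, 0.7]
```

With this curve, a patience of 2 stops after epoch 4 because epochs 3 and 4 fail to beat the best loss of 0.8, whereas a patience of 0 or 3 runs all five epochs.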
- spacr.deep_spacr.train_model_v1(src, dst, model_type, train_loaders, epochs=100, learning_rate=0.0001, weight_decay=0.05, amsgrad=False, optimizer_type='adamw', use_checkpoint=False, dropout_rate=0, n_jobs=20, val_loaders=None, test_loaders=None, init_weights='imagenet', intermedeate_save=None, chan_dict=None, schedule=None, loss_type='auto', gradient_accumulation=False, gradient_accumulation_steps=4, channels=['r', 'g', 'b'], verbose=False, num_classes=2)[source]
Trains a model (supports 2-class and >2-class via CrossEntropy; BCE only for true single-logit binary).
- spacr.deep_spacr.visualize_integrated_gradients(src, model_path, target_label_idx=0, image_size=224, channels=[1, 2, 3], normalize=True, save_integrated_grads=False, save_dir='integrated_grads')[source]
- spacr.deep_spacr.visualize_smooth_grad(src, model_path, target_label_idx, image_size=224, channels=[1, 2, 3], normalize=True, save_smooth_grad=False, save_dir='smooth_grad')[source]
- spacr.deep_spacr.save_top_class_examples(df, tar_path, dst, n=20, classes=None)[source]
Extract the N most confident images per class from the tar and save them into class-labelled subfolders under dst.
- For binary classification (classes=[0, 1]):
class_0/ ← the n images (20 by default) with the LOWEST pred (closest to 0)
class_1/ ← the n images (20 by default) with the HIGHEST pred (closest to 1)
- Parameters:
df (pd.DataFrame) – Must contain columns ‘path’ (tar member name) and ‘pred’ (probability).
tar_path (str) – Path to the tar archive that holds the images.
dst (str) – Root folder where class subfolders will be created.
n (int) – Number of top images to keep per class.
classes (list or None) – Explicit class labels. If None, defaults to binary [0, 1].
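The binary selection rule can be sketched with plain Python records standing in for the DataFrame. This covers only the ranking step, not tar extraction or file writing; top_examples is a hypothetical helper and the sample paths are invented.

```python
def top_examples(rows, n=20):
    """Return the n lowest-pred paths for class 0 and the n highest for class 1."""
    ordered = sorted(rows, key=lambda r: r["pred"])
    return {
        "class_0": [r["path"] for r in ordered[:n]],        # closest to 0
        "class_1": [r["path"] for r in ordered[::-1][:n]],  # closest to 1
    }

# Invented sample rows for illustration.
sample = [
    {"path": "a.png", "pred": 0.05},
    {"path": "b.png", "pred": 0.95},
    {"path": "c.png", "pred": 0.50},
    {"path": "d.png", "pred": 0.20},
    {"path": "e.png", "pred": 0.80},
]
picked = top_examples(sample, n=2)
```

With fewer than 2n rows, the two lists can overlap in this sketch; whether the real function guards against that is not stated in the documentation.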
- spacr.deep_spacr.merge_predictions_into_db(df, db_path, table='png_list', pred_col='pred', class_col='cv_predictions')[source]
Merge prediction scores back into the SQLite database.
Rows are matched on the basename of png_path (DB) against path (tar member name), since the tar stores relative member names while the DB stores full disk paths.
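Basename matching can be sketched with the standard sqlite3 module. This is an illustration under assumptions, not the spacr implementation: merge_by_basename is a hypothetical helper, the in-memory table holds only a png_path column, the sample paths are invented, and the class_col handling is left out.

```python
import os
import sqlite3

def merge_by_basename(conn, rows, table="png_list", pred_col="pred"):
    """Write each row's pred into the DB row whose png_path shares its basename."""
    cur = conn.cursor()
    # Add the prediction column if the table does not have it yet.
    existing = {r[1] for r in cur.execute(f"PRAGMA table_info({table})")}
    if pred_col not in existing:
        cur.execute(f"ALTER TABLE {table} ADD COLUMN {pred_col} REAL")
    by_name = {os.path.basename(r["path"]): r["pred"] for r in rows}
    updated = 0
    for (png_path,) in cur.execute(f"SELECT png_path FROM {table}").fetchall():
        pred = by_name.get(os.path.basename(png_path))
        if pred is not None:
            cur.execute(
                f"UPDATE {table} SET {pred_col} = ? WHERE png_path = ?",
                (pred, png_path),
            )
            updated += 1
    conn.commit()
    return updated

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE png_list (png_path TEXT)")
conn.executemany(
    "INSERT INTO png_list VALUES (?)",
    [("/data/plate1/a.png",), ("/data/plate1/b.png",)],
)
n_updated = merge_by_basename(
    conn,
    [{"path": "imgs/a.png", "pred": 0.9}, {"path": "imgs/zzz.png", "pred": 0.1}],
)
```

Matching on basename assumes file names are unique across directories; two images with the same name in different folders would receive the same score in this sketch.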
- spacr.deep_spacr.model_knowledge_transfer(teacher_paths, student_save_path, data_loader, device='cpu', student_model_name='maxvit_t', pretrained=True, dropout_rate=None, use_checkpoint=False, alpha=0.5, temperature=2.0, lr=0.0001, epochs=10)[source]