Metadata-Version: 2.4
Name: saliencytools
Version: 0.35
Summary: A collection of metrics for comparing saliency maps
Home-page: https://github.com/valevalerio/saliencytools
Author: Valerio Bonsignori
Author-email: Valerio Bonsignori <valerio.bonsignori@phd.unipi.it>
License-Expression: MIT
Project-URL: Homepage, https://github.com/valevalerio/saliencytools
Project-URL: Issues, https://github.com/valevalerio/saliencytools/issues
Project-URL: Documentation, https://valevalerio.github.io/saliencytools/
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# SaliencyTools

Comparing saliency maps produced by XAI methods.

![Tests](https://github.com/valevalerio/saliencytools/actions/workflows/test.yml/badge.svg)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![PyPI version](https://img.shields.io/pypi/v/saliencytools)](https://pypi.org/project/saliencytools/)
[![Documentation Status](https://img.shields.io/website?down_color=red&up_color=44cc11&url=https://valevalerio.github.io/saliencytools&label=Documentation)](https://valevalerio.github.io/saliencytools/)

### This module is a work in progress; contributions are welcome!

**SaliencyTools** is an open-source Python package providing thirteen curated, image-native metrics for comparing saliency maps generated by explainability methods (SHAP, LIME, GradCAM, Integrated Gradients, …).
Unlike general-purpose distance libraries, all metrics operate directly on 2-D NumPy arrays, preserve spatial structure, and handle signed attribution maps natively.

# Installation

```bash
pip install saliencytools
```

# Usage

```python
import numpy as np
from saliencytools.maskcompare import (
    sign_agreement_ratio,
    ssim,
    mean_absolute_error,
    euclidean_distance,
    cosine_distance,
    mean_squared_error,
    psnr,
    emd,
    correlation_distance,
    jaccard_distance,
    czenakowski_distance,
    kl_divergence,
    auc_judd,
    # preprocessing utilities
    normalize_mask_0_1,
    clip_mask,
)

# Two signed attribution maps (e.g. from SHAP and LIME on the same input)
map_a = np.random.randn(28, 28)
map_b = np.random.randn(28, 28)

print(sign_agreement_ratio(map_a, map_b))
print(ssim(map_a, map_b))

# Set-theoretic metrics require non-negative inputs
map_a_nn = normalize_mask_0_1(map_a)
map_b_nn = normalize_mask_0_1(map_b)
print(jaccard_distance(map_a_nn, map_b_nn))

# AUC-Judd is directed: first argument is the prediction, second is the reference
# e.g. compare a LIME explanation against a prototype
print(auc_judd(map_a, map_b))
```
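
To make the non-negativity requirement concrete, here is a minimal NumPy sketch. It is not the library's implementation: `minmax_01` and `weighted_jaccard_distance` are hypothetical helpers, and the min/max form of the Jaccard distance is an assumption made for illustration. On signed maps the ratio can leave the [0, 1] range, which is why rescaling comes first.

```python
import numpy as np

def minmax_01(mask):
    """Hypothetical helper: rescale a map linearly to [0, 1]."""
    lo, hi = mask.min(), mask.max()
    if hi == lo:
        return np.zeros_like(mask, dtype=float)
    return (mask - lo) / (hi - lo)

def weighted_jaccard_distance(a, b):
    """Hypothetical helper: 1 - sum(min(a, b)) / sum(max(a, b)).

    Only bounded to [0, 1] when both inputs are non-negative.
    """
    denom = np.maximum(a, b).sum()
    if denom == 0:
        return 0.0
    return 1.0 - np.minimum(a, b).sum() / denom

rng = np.random.default_rng(0)
signed_a = rng.standard_normal((28, 28))
signed_b = rng.standard_normal((28, 28))

# Signed inputs: the sum of minima is negative, so the "distance" exceeds 1.
raw = weighted_jaccard_distance(signed_a, signed_b)

# Rescaled inputs: the value is a proper distance in [0, 1].
dist = weighted_jaccard_distance(minmax_01(signed_a), minmax_01(signed_b))
print(raw, dist)
```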

# Metrics implemented

| Category | Metric | Function |
|---|---|---|
| Geometric | Euclidean distance | `euclidean_distance` |
| Geometric | Cosine distance | `cosine_distance` |
| Geometric | Mean Absolute Error | `mean_absolute_error` |
| Geometric | Mean Squared Error | `mean_squared_error` |
| Statistical | Earth Mover's Distance | `emd` |
| Statistical | Peak Signal-to-Noise Ratio | `psnr` |
| Statistical | Correlation distance | `correlation_distance` |
| Set-theoretic | Jaccard distance | `jaccard_distance` |
| Set-theoretic | Czekanowski distance | `czenakowski_distance` |
| Binary | Sign Agreement Ratio | `sign_agreement_ratio` |
| Structural | SSIM | `ssim` |
| Information-theoretic | KL Divergence (symmetric) | `kl_divergence` |
| Information-theoretic | AUC-Judd (directed) | `auc_judd` |

**Preprocessing utilities:** `normalize_mask_0_1`, `clip_mask`, `normalize_mask`

Set-theoretic metrics (`jaccard_distance`, `czenakowski_distance`) require non-negative inputs; apply `normalize_mask_0_1` first.
`auc_judd` is intentionally asymmetric: `auc_judd(prediction, reference)` scores how well the prediction recovers the above-mean regions of the reference map.
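
The directed behaviour can be illustrated with a small NumPy sketch. This is an assumption-laden stand-in rather than the library's `auc_judd`: the hypothetical `auc_above_mean` marks pixels where the reference exceeds its own mean as positives and computes a rank-based ROC AUC of the prediction scores, without any special handling of ties.

```python
import numpy as np

def auc_above_mean(prediction, reference):
    """Hypothetical sketch: ROC AUC of `prediction` for detecting the pixels
    where `reference` is above its own mean."""
    scores = np.asarray(prediction, dtype=float).ravel()
    ref = np.asarray(reference, dtype=float).ravel()
    positives = ref > ref.mean()
    n_pos = int(positives.sum())
    n_neg = positives.size - n_pos
    if n_pos == 0 or n_neg == 0:
        return float("nan")  # AUC is undefined without both classes
    # Rank-based AUC (Mann-Whitney U statistic); ties are not averaged here.
    order = scores.argsort()
    ranks = np.empty(scores.size, dtype=float)
    ranks[order] = np.arange(1, scores.size + 1)
    u = ranks[positives].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

a = np.array([[0.0, 1.0], [2.0, 10.0]])
b = np.array([[0.0, 5.0], [1.0, 6.0]])

forward = auc_above_mean(a, b)   # a as prediction, b as reference
backward = auc_above_mean(b, a)  # roles swapped
print(forward, backward)         # 0.75 1.0
```

Swapping the arguments changes which map defines the positive region, so the two calls generally disagree.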

# Proxy benchmark

Because real saliency maps have no ground truth, we validate metric discriminability with a controlled proxy benchmark: a k-nearest-neighbour classifier on MNIST, evaluated with macro-F1 across 8 preprocessing configurations and 10 independent prototype draws.
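
The idea behind the proxy benchmark can be sketched on toy data. The code below is an illustration, not `run_benchmark.py`: it assumes 1-nearest-prototype voting, uses a stand-in `mean_abs_error` instead of importing from the package, and replaces MNIST with synthetic 8×8 images. All function names are hypothetical.

```python
import numpy as np

def mean_abs_error(a, b):
    """Stand-in distance for illustration (not imported from saliencytools)."""
    return float(np.mean(np.abs(a - b)))

def nearest_prototype_predict(image, prototypes, labels, distance):
    """Return the label of the prototype closest to `image`."""
    dists = [distance(image, p) for p in prototypes]
    return labels[int(np.argmin(dists))]

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    scores = []
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))

rng = np.random.default_rng(0)

def make_image(label):
    """Toy 8x8 image: class 0 is bright on the left, class 1 on the right."""
    img = rng.normal(0.0, 0.1, size=(8, 8))
    if label == 0:
        img[:, :4] += 1.0
    else:
        img[:, 4:] += 1.0
    return img

proto_labels = np.array([0, 0, 1, 1])
prototypes = [make_image(l) for l in proto_labels]
test_labels = np.array([0, 1] * 10)
test_images = [make_image(l) for l in test_labels]

preds = [nearest_prototype_predict(x, prototypes, proto_labels, mean_abs_error)
         for x in test_images]
score = macro_f1(test_labels, preds)
print(score)
```

A similarity score such as SSIM or the sign agreement ratio would need `argmax` in place of `argmin`; a metric that separates the classes well yields a higher macro-F1.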

Key findings (k=20 prototypes/class, 10 seeds):

| Metric | Mean F1 | Best config | Time (s) |
|---|---|---|---|
| Sign Agreement Ratio | **0.746 ± 0.017** | `[---]` | ~11 |
| SSIM | 0.726 ± 0.013 | `[---]` | ~228 |
| MAE | 0.707 ± 0.016 | `[CN-]` | ~9 |
| … | | | |
| Earth Mover's Distance | 0.380 ± 0.009 | `[---]` | ~283 |

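Each code has three positions, one per preprocessing step, and `-` marks a step that is switched off: `[C--]` = clip to [-1, 1]; `[-N-]` = normalize to [0, 1]; `[--S]` = Sobel filter. `[---]` therefore applies no preprocessing, and `[CN-]` combines clipping and normalization.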
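
One way to read the configuration codes is as three on/off flags, which gives the 8 configurations mentioned above. The sketch below assumes that reading (suggested by `[CN-]` in the table) and an application order of clip, then normalize, then Sobel. The helpers `sobel_magnitude`, `apply_config`, and `config_code` are hypothetical and do not reproduce the package's preprocessing functions.

```python
import itertools
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from 3x3 Sobel kernels, with edge padding."""
    p = np.pad(img, 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def apply_config(mask, clip, normalize, sobel):
    """Apply the enabled steps in the assumed order C -> N -> S."""
    out = np.asarray(mask, dtype=float)
    if clip:
        out = np.clip(out, -1.0, 1.0)
    if normalize:
        lo, hi = out.min(), out.max()
        out = (out - lo) / (hi - lo) if hi > lo else np.zeros_like(out)
    if sobel:
        out = sobel_magnitude(out)
    return out

def config_code(clip, normalize, sobel):
    """Build a code such as '[CN-]' from the three flags."""
    flags = zip("CNS", (clip, normalize, sobel))
    return "[" + "".join(ch if on else "-" for ch, on in flags) + "]"

# All 2**3 = 8 combinations of the three flags.
configs = list(itertools.product([False, True], repeat=3))
codes = [config_code(*cfg) for cfg in configs]

rng = np.random.default_rng(0)
saliency = 3.0 * rng.standard_normal((28, 28))
processed = {config_code(*cfg): apply_config(saliency, *cfg) for cfg in configs}
print(codes)
```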

<!-- ![F1_scores](https://valevalerio.github.io/saliencytools/_static/heatmap.png) -->
![joyplot](https://valevalerio.github.io/saliencytools/_static/joyplot.png)

## Reproducing the benchmark

```bash
# Full multi-seed run (k=20 prototypes/class, 10 seeds; resumes safely if interrupted)
python run_benchmark.py --out results_seeds.json

# k-sensitivity run
python run_benchmark.py --k 5 --out results_k5.json
```

Estimated runtime: 3–4 hours on CPU for the default run. Use `--resume` to continue an interrupted run.

## Reproducing paper tables and figures

```bash
# LaTeX results table (auto-detects multi-seed format)
python paper/generate_tables.py --results results_seeds.json

# All figures (heatmap, F1-vs-time, stability, joyplot)
# Add --results-k5 to overlay the k=5 KDE on the joyplot
python paper/generate_figures.py --results results_seeds.json --multi-seed
python paper/generate_figures.py --results results_seeds.json --results-k5 results_k5.json --multi-seed
```

Tables are written to `paper/tables/` and figures to `paper/figures/`.

# Why SaliencyTools?

Existing alternatives fall short for saliency map comparison:

- **distancia** ([GitHub](https://github.com/ym001/distancia), [docs](https://distancia.readthedocs.io/en/latest/)) — broad coverage of mathematical distances, but images must be converted to flat lists (spatial structure lost), several metrics contain implementation errors, and there is no image-native preprocessing pipeline.
- **saliency-metrics** ([GitHub](https://github.com/sandylaker/saliency-metrics), [docs](https://saliency-metrics.readthedocs.io/en/latest/index.html)) — targets saliency evaluation specifically, but abandoned since 2022 with incomplete documentation.
- **Quantus** ([GitHub](https://github.com/understandable-machine-intelligence-lab/Quantus)) — evaluates explanation *quality* relative to a model (faithfulness, robustness, localisation); requires the model and input data. SaliencyTools only needs two maps, making it suitable for lightweight, model-agnostic comparison. The two tools are complementary.

SaliencyTools is image-native, actively maintained, and covered by tests that check the symmetry, non-negativity, and identity axioms (`test/test_metrics.py`).
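
As a rough illustration of what such axiom checks can look like, the sketch below runs the three properties on random maps. It does not reproduce `test/test_metrics.py`: `check_metric_axioms` and the stand-in `euclidean` function are hypothetical, and the tolerance is an arbitrary choice.

```python
import numpy as np

def check_metric_axioms(distance, shape=(8, 8), trials=20, seed=0, tol=1e-9):
    """Check symmetry, non-negativity, and identity on random signed maps."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        a = rng.standard_normal(shape)
        b = rng.standard_normal(shape)
        assert abs(distance(a, b) - distance(b, a)) <= tol, "symmetry"
        assert distance(a, b) >= -tol, "non-negativity"
        assert abs(distance(a, a)) <= tol, "identity"
    return True

def euclidean(a, b):
    """Stand-in distance for illustration (not imported from saliencytools)."""
    return float(np.linalg.norm(a - b))

ok = check_metric_axioms(euclidean)
print(ok)
```

A directed score such as `auc_judd` is asymmetric by design, so a symmetry check of this kind would not apply to it.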

# Further reading

- Interpretable Machine Learning Book — https://christophm.github.io/interpretable-ml-book/pixel-attribution.html
- Bylinskii et al. (2019) — *What Do Different Evaluation Metrics Tell Us About Saliency Models?* — IEEE TPAMI
- Samek et al. (2017) — *Evaluating the Visualization of What a Deep Neural Network Has Learned* — IEEE TNNLS
- Wörheide et al. (2021) — *Multilevel Correspondence Analysis* — https://doi.org/10.1364/JOSAA.31.000532
- Google `saliency` library — https://pypi.org/project/saliency/
