Metadata-Version: 2.4
Name: shapiq
Version: 1.5.2
Summary: Shapley Interactions for Machine Learning
Author-email: Maximilian Muschalik <Maximilian.Muschalik@lmu.de>, "Santo M. A. R. Thies" <Santo.Thies@lmu.de>, Hubert Baniecki <h.baniecki@uw.edu.pl>, Fabian Fumagalli <ffumagalli@techfak.de>
Maintainer-email: Maximilian Muschalik <Maximilian.Muschalik@lmu.de>, "Santo M. A. R. Thies" <Santo.Thies@lmu.de>
License-Expression: MIT
Project-URL: documentation, https://shapiq.readthedocs.io
Project-URL: source, https://github.com/mmschlk/shapiq
Project-URL: tracker, https://github.com/mmschlk/shapiq/issues
Project-URL: changelog, https://github.com/mmschlk/shapiq/blob/main/CHANGELOG.md
Keywords: python,machine learning,interpretable machine learning,shap,xai,explainable ai,interaction,shapley interactions,shapley values,feature interaction
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: Unix
Classifier: Operating System :: MacOS
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: pandas
Requires-Dist: joblib
Requires-Dist: scikit-learn
Requires-Dist: tqdm
Requires-Dist: requests
Requires-Dist: matplotlib
Requires-Dist: networkx
Requires-Dist: colour
Requires-Dist: pillow
Provides-Extra: sparse
Requires-Dist: sparse-transform; extra == "sparse"
Requires-Dist: galois; extra == "sparse"
Provides-Extra: proxy
Requires-Dist: xgboost; extra == "proxy"
Requires-Dist: lightgbm; extra == "proxy"
Requires-Dist: catboost; extra == "proxy"
Requires-Dist: smac; extra == "proxy"
Dynamic: license-file

# shapiq: Shapley Interactions for Machine Learning <img src="https://raw.githubusercontent.com/mmschlk/shapiq/main/docs/source/_static/logo/logo_shapiq_light.svg" alt="shapiq_logo" align="right" height="250px"/>

[![PyPI version](https://badge.fury.io/py/shapiq.svg)](https://badge.fury.io/py/shapiq)
[![License](https://img.shields.io/badge/License-MIT-brightgreen.svg)](https://opensource.org/licenses/MIT)
[![codecov](https://codecov.io/gh/mmschlk/shapiq/branch/main/graph/badge.svg)](https://codecov.io/gh/mmschlk/shapiq)
[![Tests](https://github.com/mmschlk/shapiq/actions/workflows/ci.yml/badge.svg)](https://github.com/mmschlk/shapiq/blob/main/.github/workflows/ci.yml)
[![Read the Docs](https://readthedocs.org/projects/shapiq/badge/?version=latest)](https://shapiq.readthedocs.io/en/latest/?badge=latest)

[![PyPI Version](https://img.shields.io/pypi/pyversions/shapiq.svg)](https://pypi.org/project/shapiq)
[![PyPI status](https://img.shields.io/pypi/status/shapiq.svg?color=blue)](https://pypi.org/project/shapiq)
[![PePy](https://static.pepy.tech/badge/shapiq?style=flat-square)](https://pepy.tech/project/shapiq)

[![Code Style](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Contributions Welcome](https://img.shields.io/badge/contributions-welcome-brightgreen)](https://github.com/mmschlk/shapiq/issues)
[![Last Commit](https://img.shields.io/github/last-commit/mmschlk/shapiq)](https://github.com/mmschlk/shapiq/commits/main)

> An interaction may speak more than a thousand main effects.

Shapley Interaction Quantification (`shapiq`) is a Python package for (1) approximating any-order Shapley interactions, (2) benchmarking game-theoretical algorithms for machine learning, and (3) explaining feature interactions of model predictions. `shapiq` extends the well-known [shap](https://github.com/shap/shap) package for both researchers working on game theory in machine learning and end users explaining models. SHAP-IQ extends individual Shapley values by quantifying the **synergy** effect between entities (called **players** in game theory) such as explanatory features, data points, or weak learners in ensemble models. Synergies between players give a more comprehensive view of machine learning models.

## 🛠️ Install
`shapiq` is intended to work with **Python 3.12 and above**.
Installation can be done via `uv`:
```sh
uv add shapiq
```

or via `pip`:

```sh
pip install shapiq
```

## 👀 Upcoming
See what’s on the horizon for the library in our [GitHub Project Board](https://github.com/users/mmschlk/projects/4). We plan and track upcoming features, improvements, and maintenance tasks there, including new explainers, performance optimizations, and expanded model support.

## ⭐ Quickstart

You can explain your model with `shapiq.explainer` and visualize Shapley interactions with `shapiq.plot`.
If you are interested in the underlying game-theoretic algorithms, check out the `shapiq.approximator` and `shapiq.games` modules.

### Compute any-order feature interactions

Explain your models with Shapley interactions:
Just load your data and model, and then use a `shapiq.Explainer` to compute Shapley interactions.

```python
import shapiq
# load data
X, y = shapiq.load_california_housing(to_numpy=True)
# train a model
from sklearn.ensemble import RandomForestRegressor
model = RandomForestRegressor()
model.fit(X, y)
# set up an explainer with k-SII interaction values up to order 4
explainer = shapiq.TabularExplainer(
    model=model,
    data=X,
    index="k-SII",
    max_order=4
)
# explain the model's prediction for the first sample
interaction_values = explainer.explain(X[0], budget=256)
# analyse interaction values
print(interaction_values)

>> InteractionValues(
>>     index=k-SII, max_order=4, min_order=0, estimated=False,
>>     estimation_budget=256, n_players=8, baseline_value=2.07282292,
>>     Top 10 interactions:
>>         (0,): 1.696969079  # attribution of feature 0
>>         (0, 5): 0.4847876
>>         (0, 1): 0.4494288  # interaction between features 0 & 1
>>         (0, 6): 0.4477677
>>         (1, 5): 0.3750034
>>         (4, 5): 0.3468325
>>         (0, 3, 6): -0.320  # interaction between features 0 & 3 & 6
>>         (2, 3, 6): -0.329
>>         (0, 1, 5): -0.363
>>         (6,): -0.56358890
>> )
```
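
To make the meaning of an attribution such as `(0,)` and an interaction such as `(0, 1)` concrete, here is a minimal pure-Python sketch that computes exact Shapley values and a pairwise Shapley interaction for a tiny game by enumerating all coalitions. It does not use the `shapiq` API: `toy_game` and all helper names are hypothetical, and it computes the plain Shapley interaction index (SII), not the k-SII index used above.

```python
from itertools import combinations
from math import factorial


def toy_game(coalition):
    """Hypothetical value function: players 0 and 1 share a synergy of 3, player 2 is a dummy."""
    s = set(coalition)
    return 1.0 * (0 in s) + 2.0 * (1 in s) + 3.0 * (0 in s and 1 in s)


def subsets(players):
    """Yield every subset of `players` as a tuple."""
    for size in range(len(players) + 1):
        yield from combinations(players, size)


def shapley_value(game, n, i):
    """Exact Shapley value of player i (weighted average of marginal contributions)."""
    others = [p for p in range(n) if p != i]
    total = 0.0
    for s in subsets(others):
        weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
        total += weight * (game(s + (i,)) - game(s))
    return total


def shapley_interaction(game, n, i, j):
    """Exact pairwise Shapley interaction index (SII) of players i and j."""
    others = [p for p in range(n) if p not in (i, j)]
    total = 0.0
    for s in subsets(others):
        weight = factorial(len(s)) * factorial(n - len(s) - 2) / factorial(n - 1)
        # discrete second-order derivative: joint effect minus both individual effects
        delta = game(s + (i, j)) - game(s + (i,)) - game(s + (j,)) + game(s)
        total += weight * delta
    return total


n_players = 3
values = [shapley_value(toy_game, n_players, i) for i in range(n_players)]
synergy_01 = shapley_interaction(toy_game, n_players, 0, 1)
print(values, synergy_01)
```

The Shapley values split the synergy of 3 evenly between players 0 and 1, while the pairwise interaction reports it as a separate quantity. Enumeration scales as `2**n`, which is why approximation with a `budget` is needed for realistic feature counts.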

### Compute Shapley values like you are used to with SHAP

If you are used to working with SHAP, you can also compute Shapley values with `shapiq` the same way:
You can load your data and model, and then use the `shapiq.Explainer` to compute Shapley values.
If you set the index to ``'SV'``, you will get the Shapley values as you know them from SHAP.

```python
import shapiq

data, model = ...  # get your data and model
explainer = shapiq.Explainer(
    model=model,
    data=data,
    index="SV",  # Shapley values
)
shapley_values = explainer.explain(data[0])
shapley_values.plot_force(feature_names=...)
```

Once you have the Shapley values, you can easily compute interaction values as well:

```python
explainer = shapiq.Explainer(
    model=model,
    data=data,
    index="k-SII",  # k-SII interaction values
    max_order=2     # specify any order you want
)
interaction_values = explainer.explain(data[0])
interaction_values.plot_force(feature_names=...)
```

<p align="center">
  <img width="800px" src="https://raw.githubusercontent.com/mmschlk/shapiq/main/docs/source/_static/images/motivation_sv_and_si.png" alt="An example Force Plot for the California Housing Dataset with Shapley Interactions">
</p>

### Use ProxySPEX (Proxy SParse EXplainer) <img src="https://raw.githubusercontent.com/mmschlk/shapiq/main/docs/source/_static/images/spex_logo.png" alt="spex_logo" align="right" height="75px"/>
For large-scale use-cases you can also check out the [👓``ProxySPEX``](https://shapiq.readthedocs.io/en/latest/api/shapiq.approximator.sparse.html#shapiq.approximator.sparse.SPEX) approximator.

```python
# load your data and model with large number of features
data, model, n_features = ...

# use the ProxySPEX approximator directly
approximator = shapiq.ProxySPEX(n=n_features, index="FBII", max_order=2)
fbii_scores = approximator.approximate(budget=2000, game=model.predict)

# or use ProxySPEX with an explainer
explainer = shapiq.Explainer(
    model=model,
    data=data,
    index="FBII",
    max_order=2,
    approximator="proxyspex"  # specify ProxySPEX as approximator
)
explanation = explainer.explain(data[0])
```

### Visualize feature interactions

A handy way of visualizing interaction scores up to order 2 is a network plot.
You can see an example of such a plot below.
The nodes represent feature **attributions** and the edges represent the **interactions** between features.
The strength and size of the nodes and edges are proportional to the absolute value of attributions and interactions, respectively.

```python
shapiq.network_plot(
    first_order_values=interaction_values.get_n_order_values(1),
    second_order_values=interaction_values.get_n_order_values(2)
)
# or use
interaction_values.plot_network()
```

The code above can produce a plot like the following (this example additionally includes an image):

<p align="center">
  <img width="500px" src="https://raw.githubusercontent.com/mmschlk/shapiq/main/docs/source/_static/network_example2.png" alt="network_plot_example">
</p>

### Explain TabPFN

With ``shapiq`` you can also explain [``TabPFN``](https://github.com/PriorLabs/TabPFN) by making use of the _remove-and-recontextualize_ explanation paradigm implemented in ``shapiq.TabPFNExplainer``.

```python
import tabpfn, shapiq
data, labels = ...                    # load your data
model = tabpfn.TabPFNClassifier()     # get TabPFN
model.fit(data, labels)               # "fit" TabPFN (optional)
explainer = shapiq.TabPFNExplainer(   # setup the explainer
    model=model,
    data=data,
    labels=labels,
    index="FSII"
)
fsii_values = explainer.explain(data[0])  # explain with Faithful Shapley values
fsii_values.plot_force()               # plot the force plot
```

<p align="center">
  <img width="800px" src="https://raw.githubusercontent.com/mmschlk/shapiq/main/docs/source/_static/images/fsii_tabpfn_force_plot_example.png" alt="Force Plot of FSII values">
</p>


## 📖 Documentation with tutorials
The documentation of ``shapiq`` can be found at https://shapiq.readthedocs.io.
If you are new to Shapley values or Shapley interactions, we recommend starting with the [introduction](https://shapiq.readthedocs.io/en/latest/introduction/) and the [examples & tutorials](https://shapiq.readthedocs.io/en/latest/auto_examples/index.html).
There are many great resources available to get you started with Shapley values and interactions.

## 💬 Citation

If you use ``shapiq`` and enjoy it, please consider citing our [NeurIPS paper](https://arxiv.org/abs/2410.01649) or consider starring this repository.

```bibtex
@inproceedings{Muschalik.2024b,
  title     = {shapiq: Shapley Interactions for Machine Learning},
  author    = {Maximilian Muschalik and Hubert Baniecki and Fabian Fumagalli and
               Patrick Kolpaczki and Barbara Hammer and Eyke H\"{u}llermeier},
  booktitle = {The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year      = {2024},
  url       = {https://openreview.net/forum?id=knxGmi6SJi}
}
```

## 📦 Contributing
We welcome any kind of contributions to `shapiq`!
If you are interested in contributing, please check out our [contributing guidelines](https://github.com/mmschlk/shapiq/blob/main/.github/CONTRIBUTING.md).
If you have any questions, feel free to reach out to us.
We are tracking our progress via a [project board](https://github.com/users/mmschlk/projects/4) and the [issues](https://github.com/mmschlk/shapiq/issues) section.
If you find a bug or have a feature request, please open an issue or help us fix it by opening a pull request.

## 📜 License
This project is licensed under the [MIT License](https://github.com/mmschlk/shapiq/blob/main/LICENSE).

## 💰 Funding
This work is openly available under the MIT license.
Some authors acknowledge the financial support by the German Research Foundation (DFG) under grant number TRR 318/1 2021 – 438445824.

---
Built with ❤️ by the shapiq team.

# Changelog

## v1.5.2 (2026-06-12)

### Highlights of new Features

- Added `'baseline'` imputer support (`BaselineImputer`) to the `shapiq_games` `LocalExplanation` benchmark games.
- Added `ProxySHAP` and `ProxySPEX` as approximator options for the `TabularExplainer`, `TabPFNExplainer`, and `AgnosticExplainer` in `shapiq.explainer`. All three resolve approximators through the shared `setup_approximator` configuration, so `proxyshap` and `proxyspex` are selectable wherever that configuration is used.
- Added support for linear surrogate models and HPO-informed proxies in `ProxySHAP` and `RegressionMSR`. The extraction route (linear coefficient read-out vs. exact tree read-out) is selected automatically from the proxy's base estimator type, and hyperparameter-search wrappers (scikit-learn's `GridSearchCV`, `RandomizedSearchCV`, `HalvingGridSearchCV`, or SMAC) are fitted and their `best_estimator_` read out, for both linear and tree proxies.
- Made `ProxySPEX`'s `proxy_model` configurable: it now also accepts the string tags `"lightgbm"` (default), `"xgboost"`, and `"tree"`, as well as custom tree estimators and HPO wrappers. The default remains the HPO-informed LightGBM proxy.
- `ProxySHAP`, `ProxySPEX`, and `RegressionMSR` no longer require LightGBM, XGBoost, or SMAC to be installed. When an optional backend is unavailable they warn and fall back to a scikit-learn `DecisionTreeRegressor`, so the proxy approximators always import and run.
- Added log-space weight computation for the regression-based (`KernelSHAP`, `KernelSHAPIQ`, `RegressionFSII`, `RegressionFBII`, `kADDSHAP`) and Monte Carlo (`SHAPIQ`, `SVARMIQ`) approximators, supporting Shapley approximation for feature counts beyond 1000.
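
Why log-space helps for large feature counts can be seen in a small standalone sketch. It uses the standard Shapley kernel weight `(n - 1) / (C(n, s) * s * (n - s))` for a coalition of size `s` and evaluates it with `math.lgamma`. This is an illustration only, not the code used inside `shapiq`; all function names are hypothetical.

```python
import math


def log_binom(n, k):
    """Natural log of the binomial coefficient C(n, k), computed via log-gamma."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_kernel_weight(n, size):
    """Log of the Shapley kernel weight of one coalition with 0 < size < n."""
    return math.log(n - 1) - log_binom(n, size) - math.log(size) - math.log(n - size)


def size_probabilities(n):
    """Probability of sampling each coalition size 1..n-1, normalised with log-sum-exp."""
    # total mass of a size = weight of one coalition times the number of such coalitions
    log_mass = [log_kernel_weight(n, s) + log_binom(n, s) for s in range(1, n)]
    top = max(log_mass)
    log_total = top + math.log(sum(math.exp(m - top) for m in log_mass))
    return [math.exp(m - log_total) for m in log_mass]


n_features = 5000
log_w = log_kernel_weight(n_features, n_features // 2)
probabilities = size_probabilities(n_features)
print(log_w, sum(probabilities))
```

For `n_features = 5000`, converting `math.comb(5000, 2500)` to a float raises `OverflowError` and the plain weight underflows to `0.0`, whereas the log-weight stays an ordinary finite number.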


### Bugfix

- Fixes `InterventionalTreeExplainer` routing the explain point at float64 precision while tree thresholds and the reference data are float32. The explain point is now cast to float32 to match the model's evaluation.
- Fixes `ProxySHAP` and `RegressionMSR` alignment of sampled coalitions between the approximator and its adjustment method (`random_state` now defaults to a fixed seed so both samplers reproduce identical coalitions).
- Fixes `ProxySPEX` passing `initialize_dict=True` to its sampler, which was unnecessary since `ProxySPEX` does not return all possible interactions (now `False`).
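
The precision mismatch behind the `InterventionalTreeExplainer` fix can be reproduced with the standard library alone. The sketch below assumes a hypothetical split rule "go left when `value <= threshold`" and emulates float32 rounding with `struct`; it is not the explainer's actual routing code.

```python
import struct


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def goes_left(value, threshold):
    """Hypothetical split rule: route left when value <= threshold."""
    return value <= threshold


threshold = to_float32(0.1)  # tree thresholds are stored in float32
x = 0.100000002              # explain point kept at float64 precision

route_float64 = goes_left(x, threshold)              # compared at float64 precision
route_float32 = goes_left(to_float32(x), threshold)  # cast first, as the model would
print(route_float64, route_float32)
```

The two routes differ because `x` lies just above the float32 threshold but rounds onto it once cast, so an explainer that skips the cast can follow a different branch than the model itself.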

## v1.5.1 (2026-05-30)

### Bugfix

- Fixes segfaults on macOS when using shapiq alongside XGBoost or LightGBM by static-linking the OpenMP runtime instead of vendoring a dynamic `libomp.dylib`. No API changes. [#536](https://github.com/mmschlk/shapiq/issues/536)

## v1.5.0 (2026-05-29)

### Highlights of new Features

- adds `ProxySHAP` approximator in `shapiq.approximator.proxy` for proxy-model-accelerated interaction estimation
- adds `RegressionMSR` approximator in `shapiq.approximator.proxy` for proxy-model accelerated value estimation
- refactors `shapiq.tree` into submodules: `conversion/`, `linear/`, `interventional/` with speedup through a C++ backend
- adds `LinearTreeSHAP` in `shapiq.tree.linear` for fast first-order Shapley value computation
- adds `InterventionalTreeExplainer` in `shapiq.tree.interventional`
- adds `KNNExplainer`, `WeightedKNNExplainer` and `ThresholdNNExplainer` for nearest neighbor models
- changes the default for all user-facing `Explainer` classes to `index="SV"`, `max_order=1` (Shapley values) — see Breaking Changes below
- adds `shapiq.scatter_plot` for SHAP-style scatter (dependence) plots of interaction values, supporting both first-order and higher-order interactions [#516](https://github.com/mmschlk/shapiq/pull/516)


### Introducing ProxySHAP [#501](https://github.com/mmschlk/shapiq/pull/501), [Preprint](https://arxiv.org/abs/2605.22738)

Adds [`ProxySHAP`](src/shapiq/approximator/proxy/proxyshap.py) as a new approximator that accelerates Shapley interaction estimation by fitting a lightweight **proxy tree model** (XGBoost by default) on sampled coalitions, computing _exact_ interactions for the proxy via the `InterventionalTreeExplainer`, and then optionally correcting for the approximation error on the true model.
Adds [`RegressionMSR`](src/shapiq/approximator/proxy/regressionmsr.py) as a new approximator that accelerates Shapley value estimation by fitting a lightweight **proxy tree model** (XGBoost by default) on sampled coalitions, computing _exact_ interactions for the proxy via the `InterventionalTreeExplainer`, and then optionally correcting for the approximation error on the true model.

### Introducing Explainers for Nearest Neighbor Models

Adds three new explainers, namely `KNNExplainer`, `WeightedKNNExplainer` and `ThresholdNNExplainer`, which efficiently compute explanations for nearest neighbor models from the [scikit-learn](https://scikit-learn.org/stable/) library.
One application of these explainers is Data Valuation, i.e. the task of evaluating the usefulness of training data points for training models.
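
As an illustration of data valuation for nearest neighbor models, the following sketch implements the well-known closed-form recursion for Shapley values of an unweighted k-NN classifier on a single test point and checks it against brute-force enumeration. It is not the code of `KNNExplainer`. It assumes training points are already sorted by distance to the test point and that `matches[i]` is 1 when the label of the i-th nearest point equals the test label; all names are hypothetical.

```python
from itertools import combinations
from math import factorial


def knn_utility(subset, matches, k):
    """Fraction of matching labels among the k nearest points of `subset` (divided by k)."""
    nearest = sorted(subset)[:k]  # indices are ordered by distance
    return sum(matches[i] for i in nearest) / k


def knn_shapley(matches, k):
    """Shapley value of every training point via the backward recursion."""
    n = len(matches)
    values = [0.0] * n
    values[n - 1] = matches[n - 1] / n  # farthest point
    for pos in range(n - 2, -1, -1):
        rank = pos + 1  # 1-based distance rank
        step = (matches[pos] - matches[pos + 1]) / k * min(k, rank) / rank
        values[pos] = values[pos + 1] + step
    return values


def brute_force_shapley(matches, k):
    """Reference implementation that enumerates all coalitions of training points."""
    n = len(matches)
    values = []
    for i in range(n):
        others = [p for p in range(n) if p != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for s in combinations(others, size):
                total += weight * (knn_utility(s + (i,), matches, k) - knn_utility(s, matches, k))
        values.append(total)
    return values


matches = [1, 0, 1, 1, 0]
fast = knn_shapley(matches, k=2)
slow = brute_force_shapley(matches, k=2)
print(fast)
```

The recursion needs a single pass over the sorted points, while the brute-force reference grows exponentially with the number of training points.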


### Python Version [#497](https://github.com/mmschlk/shapiq/pull/497)

- adds support for Python 3.14 making the package compatible with the latest Python version.
- drops support for Python 3.10 and 3.11. The minimum supported Python version is now 3.12.
- drops support for Intel (x86_64) macOS. No pre-built wheels are shipped for Intel-based Macs because `numba` and `llvmlite` (transitive dependencies via `galois`) no longer publish x86_64 macOS wheels. Apple Silicon (arm64) macOS, Linux, and Windows are unaffected.

### Removed Deprecated Features

- removes `path_to_values` parameter from `shapiq.Game`, which was previously deprecated. Use `shapiq.Game.load()` instead. [#496](https://github.com/mmschlk/shapiq/pull/496)
- removes pickle support from `shapiq.InteractionValues`. JSON is now the only supported file format. Use `InteractionValues.save()` and `InteractionValues.load()` with JSON files. [#496](https://github.com/mmschlk/shapiq/pull/496)
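
If you need to migrate values that only exist in your own Python structures, note that JSON objects cannot use tuples as keys. The sketch below shows one hypothetical encoding (comma-joined player indices) for a plain dictionary of interaction scores. It is not the file format written by `InteractionValues.save()`, which remains the supported route for `InteractionValues` objects.

```python
import json


def dump_interactions(interactions):
    """Serialise {tuple_of_players: score} to a JSON string with string keys."""
    payload = {",".join(map(str, key)): value for key, value in interactions.items()}
    return json.dumps(payload)


def load_interactions(text):
    """Inverse of dump_interactions; the empty string decodes to the empty coalition."""
    payload = json.loads(text)
    return {
        tuple(int(part) for part in key.split(",") if part): value
        for key, value in payload.items()
    }


interactions = {(): 2.07, (0,): 1.70, (0, 5): 0.48}
restored = load_interactions(dump_interactions(interactions))
print(restored)
```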

### Breaking Changes

- **`Explainer` default changed to Shapley values.** `Explainer`,
  `TabularExplainer`, `TabPFNExplainer`, `AgnosticExplainer`, and `TreeExplainer`
  now default to `index="SV"`, `max_order=1`. Previously they defaulted to
  `index="k-SII"`, `max_order=2`. Users relying on the previous default must
  pass these arguments explicitly. NN explainers and `ProductKernelExplainer`
  are unaffected — they already defaulted to `"SV"`. Approximators and
  computers also retain their existing defaults.

### Extending shapiq-games [#476](https://github.com/mmschlk/shapiq/issues/476)
- adds standard SHAP datasets for benchmarking in `shapiq_games.benchmark.local_xai`.
- adds **all** TabArena datasets for benchmarking in `shapiq_games.benchmark.local_xai`.

`ProxySHAP` supports four adjustment strategies:

- **`"none"`**: use proxy interactions directly (fastest, least accurate)
- **`"msr"`** / **`"svarm"`** / **`"kernel"`**: unbiased adjustments via established estimators

The `ProxySHAP` implementation relies on C-extension routines (`compute_interactions_sparse`) for high-throughput coalition evaluation.

### Introducing LinearTreeSHAP [#501](https://github.com/mmschlk/shapiq/pull/501)

Adds [`LinearTreeSHAP`](src/shapiq/tree/linear/explainer.py) — an efficient implementation of the **Linear TreeSHAP** algorithm (Yu et al., 2022) for computing first-order Shapley values on tree-based models.
Unlike `TreeSHAPIQ`, which supports any-order interactions, `LinearTreeSHAP` is optimized exclusively for Shapley values (`index="SV"`) and achieves higher throughput by using a dedicated C++ extension (`linear_tree_shap_iterative`).
It is exported from `shapiq.tree.LinearTreeSHAP`. For further details we refer to the paper: Yu, S., Zheng, S., Chen, H., & Li, J. (2022). Linear TreeSHAP. _NeurIPS 2022_.

### Complete Refactor of the `shapiq.tree` Module [#501](https://github.com/mmschlk/shapiq/pull/501)

The internal tree infrastructure has been fully reorganized into a clean subpackage layout:

```
shapiq/tree/
├── base.py              — TreeModel and EdgeTree data structures
├── treeshapiq.py        — TreeSHAP-IQ: any-order interactions via Chebyshev interpolation
├── explainer.py         — TreeExplainer (main user-facing API, unchanged interface)
├── validation.py        — validate_tree_model() for sklearn / XGBoost / LightGBM
├── utils.py             — helper utilities
├── conversion/          — tree-to-internal-format converters (sklearn, XGBoost, LightGBM)
├── linear/              — LinearTreeSHAP implementation
└── interventional/      — InterventionalTreeExplainer and InterventionalGame
```

The conversion of the tree methods has been moved to C++, giving a 2x to 6x speedup over shap's tree conversion.

### Documentation and Examples

- removes all Jupyter notebooks from the library and moves the examples to a Sphinx-Gallery so they are built and tested as part of the documentation. [#509](https://github.com/mmschlk/shapiq/pull/509)

### Performance

- speeds up the baseline imputer by removing a Python `for` loop from its hot path. [#498](https://github.com/mmschlk/shapiq/pull/498)

### Refactoring of spex module
- moves `ProxySPEX` from `shapiq.approximator.sparse` to `shapiq.approximator.proxy`. Imports from `shapiq.approximator` (i.e. `from shapiq.approximator import ProxySPEX`) are unchanged.
- removes the `"proxyspex"` option from `Sparse.decoder_type`; it now accepts only `"soft"` or `"hard"` (default `"soft"`).
- moves `sparse-transform` and `galois` out of the core dependencies. Install with `pip install shapiq[sparse]` to use `SPEX` / `Sparse`.
- adds a `sparse` extra (`sparse-transform`, `galois`) required by `shapiq.approximator.sparse`.
- adds a `proxy` extra (`xgboost`, `lightgbm`) required by `shapiq.approximator.proxy` (`ProxySHAP`, `ProxySPEX`, `RegressionMSR`).
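
To check which optional backends are importable in the current environment, a small standard-library sketch like the following can help. The helper name is hypothetical, and the module names are the usual import names of the listed packages, which is an assumption about your installation.

```python
from importlib.util import find_spec


def available_backends(modules=("xgboost", "lightgbm", "galois")):
    """Map each top-level module name to whether it can be imported."""
    return {name: find_spec(name) is not None for name in modules}


status = available_backends()
for name, present in status.items():
    print(f"{name}: {'installed' if present else 'missing'}")
```

`find_spec` only looks the module up and does not import it, so the check stays cheap even for heavy packages.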

### Bugfix

- fixes a bug in tree conversion, such that tree models with no splits are still correctly parsed. [#370](https://github.com/mmschlk/shapiq/issues/370)
- fixes `min_order` in `TreeExplainer` so that it now actually restricts the returned `InteractionValues` to interactions of order ``min_order..max_order`` (``min_order=0`` continues to include the empty interaction at the baseline value); invalid values now raise a clear `ValueError`. [#325](https://github.com/mmschlk/shapiq/issues/325)
- fixes tree conversion breaking when the `LC_NUMERIC` locale is not set to the standard `"C"` value. [#515](https://github.com/mmschlk/shapiq/pull/515)
- fixes a segfault in the `ProxySHAP` C++ extension code. [#506](https://github.com/mmschlk/shapiq/pull/506)
- fixes an out-of-memory (OOM) error when `values` is a dictionary and `n_players` is large. [#531](https://github.com/mmschlk/shapiq/issues/531)
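
The intended `min_order` semantics can be pictured with a plain dictionary of interactions. This is a hypothetical helper for illustration, not the `TreeExplainer` implementation.

```python
def restrict_orders(interactions, min_order, max_order):
    """Keep only interactions whose order (number of players) lies in min_order..max_order."""
    if not 0 <= min_order <= max_order:
        raise ValueError(f"need 0 <= min_order <= max_order, got {min_order} and {max_order}")
    return {
        players: value
        for players, value in interactions.items()
        if min_order <= len(players) <= max_order
    }


interactions = {(): 2.07, (0,): 1.70, (6,): -0.56, (0, 5): 0.48, (0, 3, 6): -0.32}
pairs_and_up = restrict_orders(interactions, min_order=2, max_order=3)
with_baseline = restrict_orders(interactions, min_order=0, max_order=1)
print(pairs_and_up, with_baseline)
```

With `min_order=0` the empty interaction `()`, which carries the baseline value, stays in the result.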

## v1.4.1 (2025-11-10)

### Bugfix

- fixes a bug in `ProxySPEX` where the `baseline_value` was set to a wrong id and not the correct score of the empty coalition. [#469](https://github.com/mmschlk/shapiq/issues/469)
- fixes the build process of `shapiq` to correctly exclude all test/benchmark/docs/etc. files from the built package. [#464](https://github.com/mmschlk/shapiq/issues/464)

## v1.4.0 (2025-10-31)

### Introducing ProxySPEX [#442](https://github.com/mmschlk/shapiq/pull/442)

Adds the [`ProxySPEX`](https://arxiv.org/pdf/2505.17495) [approximator](https://github.com/mmschlk/shapiq/blob/main/src/shapiq/approximator/sparse/proxyspex.py) for efficient computation of sparse interaction values using the new ProxySPEX algorithm.
ProxySPEX is a direct extension of the [SPEX](https://openreview.net/pdf?id=UQpYmaBGwB) algorithm, which uses Fourier representations and analysis of the value function to identify the most relevant interactions (in terms of `Moebius` coefficients) and transforms them into summary scores (Shapley interactions).
One of the key innovations of ProxySPEX compared to SPEX is the use of a proxy model that approximates the original value function (uses a LightGBM model internally).
**Notably,** to run ProxySPEX, users have to install the `lightgbm` package in their environment.
For further details we refer to the paper, which will be presented at NeurIPS'2025: Butler, L., Kang, J.S., Agarwal, A., Erginbas, Y.E., Yu, Bin, Ramchandran, K. (2025). ProxySPEX: Inference-Efficient Interpretability via Sparse Feature Interactions in LLMs. [arxiv](https://arxiv.org/pdf/2505.17495)
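
The role of the `Moebius` coefficients can be illustrated on a tiny game. The sketch below computes all coefficients by brute force and then turns them into Shapley values by sharing each coefficient equally among its players. It shows the general idea of sparse coefficients being converted into summary scores and is not the SPEX or ProxySPEX algorithm; `toy_game` and the helper names are hypothetical.

```python
from itertools import combinations


def toy_game(coalition):
    """Hypothetical value function with one pairwise synergy between players 0 and 1."""
    s = set(coalition)
    return 1.0 * (0 in s) + 2.0 * (1 in s) + 3.0 * (0 in s and 1 in s)


def subsets(players):
    """Yield every subset of `players` as a tuple."""
    for size in range(len(players) + 1):
        yield from combinations(players, size)


def moebius_transform(game, n):
    """Return the non-zero Moebius coefficients m(S) = sum over T in S of (-1)^(|S|-|T|) v(T)."""
    coefficients = {}
    for s in subsets(tuple(range(n))):
        m = sum((-1) ** (len(s) - len(t)) * game(t) for t in subsets(s))
        if m != 0:
            coefficients[s] = m
    return coefficients


def shapley_from_moebius(coefficients, n):
    """Each coefficient m(S) is split equally among the players in S."""
    values = [0.0] * n
    for s, m in coefficients.items():
        for player in s:
            values[player] += m / len(s)
    return values


coefficients = moebius_transform(toy_game, 3)
shapley_values = shapley_from_moebius(coefficients, 3)
print(coefficients, shapley_values)
```

Only three of the eight possible coefficients are non-zero here, which is the kind of sparsity that sparse approximators try to exploit without enumerating every coalition.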

### Introducing ProductKernelExplainer [#431](https://github.com/mmschlk/shapiq/pull/431)

The `ProductKernelExplainer` is a new model-specific explanation method for machine learning models that utilize Product Kernels, such as Gaussian Processes and Support Vector Machines.
Similar to the TreeExplainer, it uses a specific computation scheme that leverages the structure of the underlying product kernels to efficiently compute exact Shapley values.
**Note**, this explainer is only able to compute Shapley values (not higher-order interactions yet).
For further details we refer to the paper: Mohammadi, M., Chau, S.-L., Muandet, K. Computing Exact Shapley Values in Polynomial Time for Product-Kernel Methods. [arxiv](https://arxiv.org/abs/2505.16516)

### New Conditional Imputation Methods [#435](https://github.com/mmschlk/shapiq/pull/435)

Based on traditional statistical methods, we implemented two new conditional imputation methods named `GaussianImputer` and `GaussianCopulaImputer` within the `shapiq.imputer` module.
Both imputation methods are designed to handle missing feature imputation in a way that respects the underlying data distribution with the assumption that the data follows a multivariate Gaussian distribution (`GaussianImputer`) or can be represented with Gaussian copulas (`GaussianCopulaImputer`).
In practice, this assumption may often be violated, but these methods can still provide reasonable imputations in many scenarios and serve as a useful benchmark enabling easier research in the field of conditional imputation for Shapley value explanations.
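
The idea behind Gaussian conditional imputation is easiest to see with two features. Under a bivariate Gaussian assumption, a missing feature is drawn from its conditional distribution given the observed one. The sketch below uses the textbook formulas and hypothetical names; it is not the `GaussianImputer` implementation.

```python
import math
import random


def conditional_gaussian(x1, mean, std, rho):
    """Mean and standard deviation of feature 2 given feature 1 = x1."""
    mu1, mu2 = mean
    s1, s2 = std
    cond_mean = mu2 + rho * s2 / s1 * (x1 - mu1)
    cond_std = s2 * math.sqrt(1.0 - rho**2)
    return cond_mean, cond_std


def impute_feature2(x1, mean, std, rho, n_samples, seed=0):
    """Draw replacement values for the missing feature 2."""
    cond_mean, cond_std = conditional_gaussian(x1, mean, std, rho)
    rng = random.Random(seed)
    return [rng.gauss(cond_mean, cond_std) for _ in range(n_samples)]


cond_mean, cond_std = conditional_gaussian(2.0, mean=(0.0, 0.0), std=(1.0, 1.0), rho=0.5)
samples = impute_feature2(2.0, mean=(0.0, 0.0), std=(1.0, 1.0), rho=0.5, n_samples=5)
print(cond_mean, cond_std, samples)
```

With zero correlation the conditional distribution collapses to the marginal one, which is what a marginal imputer would sample from.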

### Shapiq Statically Typechecked [#430](https://github.com/mmschlk/shapiq/pull/430)

We have introduced static type checking to `shapiq` using [Pyright](https://github.com/microsoft/pyright), and integrated it into our `pre-commit` hooks.
This ensures that type inconsistencies are caught early during development, improving code quality and maintainability.
Developers will now benefit from immediate feedback on type errors, making the codebase more robust and reliable as it evolves.

### Separation of `shapiq` into `shapiq`, `shapiq_games`, and `shapiq-benchmark` [#459](https://github.com/mmschlk/shapiq/issues/459)

We have begun the process of modularizing the `shapiq` package by splitting it into three distinct packages: `shapiq`, `shapiq_games`, and `shapiq-benchmark`.

- The `shapiq` package now serves as the core library. It contains the main functionality, including approximators, explainers, computation routines, interaction value logic, and plotting utilities.
- The new `shapiq_games` package includes examples and utilities for defining custom cooperative games using the `shapiq.Game` API. Although it lives in the same repository as `shapiq`, it is designed to be installable and usable as a standalone package. Internally, its source code is available via the `shapiq_games` submodule. Dependencies for this package can be managed via extras (e.g., `uv pip install shapiq[games]`) or by installing `shapiq_games` directly.
- The `shapiq-benchmark` package is hosted in a separate repository and is intended for conducting benchmarks. It builds on top of both `shapiq` and `shapiq_games`, and includes benchmarking utilities, datasets, and game configurations for evaluating the performance of different approximators and explainers. It can be installed via `pip install shapiq-benchmark`.

This restructuring aims to improve maintainability and development scalability. The core `shapiq` package will continue to receive the majority of updates and enhancements, and keeping it streamlined ensures better focus and usability. Meanwhile, separating games and benchmarking functionality allows these components to evolve more independently while maintaining compatibility through clearly defined dependencies.

### List of All New Features

- adds the ProxySPEX (Proxy Sparse Explanation) module in `approximator.sparse` for even more efficient computation of sparse interaction values [#442](https://github.com/mmschlk/shapiq/pull/442)
- uses `predict_logits` method of sklearn-like classifiers if available in favor of `predict_proba` to support models that also offer logit outputs like TabPFNClassifier for better interpretability of the explanations [#426](https://github.com/mmschlk/shapiq/issues/426)
- adds the `shapiq.explainer.ProductKernelExplainer` for model-specific explanation of Product Kernel based models like Gaussian Processes and Support Vector Machines. [#431](https://github.com/mmschlk/shapiq/pull/431)
- adds the `GaussianImputer` and `GaussianCopulaImputer` classes to the `shapiq.imputer` module for conditional imputation based on Gaussian assumptions. [#435](https://github.com/mmschlk/shapiq/pull/435)
- speeds up the imputation process in `MarginalImputer` by dropping an unnecessary loop [#449](https://github.com/mmschlk/shapiq/pull/449)
- makes `n_players` argument of `shapiq.ExactComputer` optional when a `shapiq.Game` object is passed [#388](https://github.com/mmschlk/shapiq/issues/388)

### Removed Features and Breaking Changes

- deprecates loading `InteractionValues` from pickle files; this ability will be removed in the next release. Use `InteractionValues.save(..., as_json=True)` to save interaction values as JSON files instead. [#413](https://github.com/mmschlk/shapiq/issues/413)
- removes `coalition_lookup` and `value_storage` properties from `shapiq.Game` since the separated view on game values and the coalitions they belong to is now outdated. Use the `shapiq.Game.game_values` dictionary instead. [#430](https://github.com/mmschlk/shapiq/pull/430)
- reorders the arguments of `shapiq.ExactComputer`'s constructor to have `n_players` be optional if a `shapiq.Game` object is passed. [#388](https://github.com/mmschlk/shapiq/issues/388)

### Bugfixes

- fixes a bug where the `RegressionFBII` approximator threw an error when the index was `'BV'` or `'FBII'`. [#420](https://github.com/mmschlk/shapiq/pull/420)
- allows subtraction and addition of `InteractionValues` objects with different `index` attributes by raising a warning instead of an error. The resulting `InteractionValues` object will have the `index` of the first object. [#423](https://github.com/mmschlk/shapiq/pull/423)

### Maintenance and Development

- refactored the `shapiq.Games` and `shapiq.InteractionValues` API by adding an interactions and game_values dictionary as the main data structure to store the interaction scores and game values. This allows for more efficient storage and retrieval of interaction values and game values, as well as easier manipulation of the data. [#419](https://github.com/mmschlk/shapiq/pull/419)
- addition and subtraction of InteractionValues objects (via `shapiq.InteractionValues.__add__`) now also works for different indices, which will raise a warning and will return a new InteractionValues object with the index set of the first. [#422](https://github.com/mmschlk/shapiq/pull/422)
- refactors the `shapiq.ExactComputer` to allow for initialization without passing n_players when a `shapiq.Game` object is passed [#388](https://github.com/mmschlk/shapiq/issues/388). Also introduces a tighter type hinting for the `index` parameter using `Literal` types. [#450](https://github.com/mmschlk/shapiq/pull/450)
- removes zeros from the `InteractionValues.coalition_lookup` from the `MoebiusConverter` for better memory efficiency. [#369](https://github.com/mmschlk/shapiq/issues/369)

### Docs

- added an example notebook for `InteractionValues`, highlighting _Initialization_, _Modification_, _Visualization_ and _Save and Loading_.
- makes API reference docs more consistent by adding missing docstrings and improving existing ones across the package. [#420](https://github.com/mmschlk/shapiq/pull/420), [#437](https://github.com/mmschlk/shapiq/issues/437), [#452](https://github.com/mmschlk/shapiq/issues/452) among others.

## v1.3.2 (2025-10-14)

### Hotfix

Removes `overrides` import in tabular explainer, which is not part of the package dependencies resulting in an ImportError when importing `shapiq`. [#436](https://github.com/mmschlk/shapiq/issues/436)

## v1.3.1 (2025-07-11)

### New Features

- adds the `shapiq.plot.beeswarm_plot()` function to shapiq. The beeswarm plot was extended to also support interactions of features. Beeswarm plots are useful in visualizing dependencies between feature values. The beeswarm plot was adapted from the SHAP library by sub-dividing the y-axis for each interaction term. [#399](https://github.com/mmschlk/shapiq/issues/399)
- adds JSON support to `InteractionValues` and `Game` objects, allowing for easy serialization and deserialization of interaction values and game objects. [#412](https://github.com/mmschlk/shapiq/pull/412) Usage of `pickle` is now deprecated. This change allows us to revamp the data structures in the future and offers more flexibility.
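
JSON needs a little care for this kind of data: interaction scores are naturally keyed by tuples of player indices, which JSON objects cannot use as keys. The following is a minimal sketch of one way to round-trip such a dictionary through JSON. The helper names `to_json` and `from_json` and the comma-joined key encoding are assumptions for illustration and do not describe shapiq's actual file format.

```python
import json


def to_json(interactions: dict[tuple[int, ...], float]) -> str:
    """Serialize a tuple-keyed dict by encoding each key as a comma-joined string."""
    encoded = {",".join(map(str, key)): value for key, value in interactions.items()}
    return json.dumps(encoded)


def from_json(payload: str) -> dict[tuple[int, ...], float]:
    """Inverse of to_json: rebuild the tuple keys from their string encoding."""
    decoded = json.loads(payload)
    return {
        # The empty string encodes the empty coalition ().
        (tuple(int(p) for p in key.split(",")) if key else ()): value
        for key, value in decoded.items()
    }


interactions = {(): 0.5, (0,): 1.25, (0, 2): -0.75}
restored = from_json(to_json(interactions))
```

Unlike a pickle, the resulting file is plain text and does not depend on the in-memory layout of any class, which is what makes later changes to the data structures easier.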

### Testing, Code-Quality and Documentation

- adds a testing suite for testing deprecations in `tests/tests_deprecations/` which allows for easier deprecation management and tracking of deprecated features [#412](https://github.com/mmschlk/shapiq/pull/412)

### Deprecated

- The `Game(path_to_values=...)` constructor is now deprecated and will be removed in version 1.4.0. Use `Game.load(...)` or `Game().load_values(...)` instead.
- Saving and loading `InteractionValues` via `InteractionValues.save(..., as_pickle=True)` and `InteractionValues.save(..., as_npz=True)` is now deprecated and will be removed in version 1.4.0. Use `InteractionValues.save(...)` to save as JSON.

## v1.3.0 (2025-06-17)

### Highlights

- `shapiq.SPEX` (Sparse Exact) approximator for efficient computation of sparse interaction values for really large models and games. Paper: [SPEX: Scaling Feature Interaction Explanations for LLMs](https://arxiv.org/abs/2502.13870)
- `shapiq.AgnosticExplainer` a generic explainer that works for any value function or `shapiq.Game` object, allowing for more flexibility in explainers.
- prettier graph-based plots via `shapiq.si_graph_plot()` and `shapiq.network_plot()`, which now use the same backend for more flexibility and easier maintenance.

### New Features

- adds the SPEX (Sparse Exact) module in `approximator.sparse` for efficient computation of sparse interaction values [#379](https://github.com/mmschlk/shapiq/pull/379)
- adds `shapiq.AgnosticExplainer` which is a generic explainer that can be used for any value function or `shapiq.Game` object. This allows for more flexibility in the explainers. [#100](https://github.com/mmschlk/shapiq/issues/100), [#395](https://github.com/mmschlk/shapiq/pull/395)
- changes `budget` to be a mandatory parameter given to the `TabularExplainer.explain()` method [#355](https://github.com/mmschlk/shapiq/pull/356)
- changes logic of `InteractionValues.get_n_order()` function to be callable with **either** the `order: int` parameter and optional assignment of `min_order: int` and `max_order: int` parameters **or** with the min/max order parameters [#372](https://github.com/mmschlk/shapiq/pull/372)
- renamed `min_percentage` parameter in the force plot to `contribution_threshold` to better reflect its purpose [#391](https://github.com/mmschlk/shapiq/pull/391)
- adds a `verbose` parameter to the `Explainer`'s `explain_X()` method to control whether a progress bar is shown or not. It defaults to `False`. [#391](https://github.com/mmschlk/shapiq/pull/391)
- made the `InteractionValues.get_n_order()` and `InteractionValues.get_n_order_values()` functions more efficient by iterating over the stored interactions and not over the powerset of all potential interactions; the latter made the functions unusable for higher player counts (models with many features, and results obtained from `TreeExplainer`). Note, this change does not really help `get_n_order_values()` as it still needs to create a numpy array of shape `n_players` times `order` [#372](https://github.com/mmschlk/shapiq/pull/372)
- streamlined the `network_plot()` plot function to use the `si_graph_plot()` as its backend function. This allows for more flexibility in the plot function and makes it easier to use the same code for different purposes. In addition, the `si_graph_plot` was modified to make plotting easier and allow for more flexibility with new parameters. [#349](https://github.com/mmschlk/shapiq/pull/349)
- adds `Game.compute()` method to the `shapiq.Game` class to compute game values without changing the state of the game object. The compute method also introduces a `shapiq.utils.sets.generate_interaction_lookup_from_coalitions()` utility method which creates an interaction lookup dict from an array of coalitions. [#397](https://github.com/mmschlk/shapiq/pull/397)
- streamlines the creation of network plots and graph plots which now uses the same backend. The network plot via `shapiq.network_plot()` or `InteractionValues.plot_network()` is now a special case of the `shapiq.si_graph_plot()` and `InteractionValues.plot_si_graph()`. This allows to create more beautiful plots and easier maintenance in the future. [#349](https://github.com/mmschlk/shapiq/pull/349)
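
The efficiency gain in `get_n_order()` comes from looping over the interactions that are actually stored rather than over every possible subset of players. A rough sketch of the idea follows, using a plain dictionary as a stand-in for the stored interactions; the function name `filter_by_order` is hypothetical and not part of shapiq.

```python
def filter_by_order(
    interactions: dict[tuple[int, ...], float],
    min_order: int,
    max_order: int,
) -> dict[tuple[int, ...], float]:
    """Keep only interactions whose size lies in [min_order, max_order]."""
    return {
        coalition: value
        for coalition, value in interactions.items()
        if min_order <= len(coalition) <= max_order
    }


# Only five interactions are stored, although the player indices go up to 7.
stored = {(): 0.1, (0,): 0.4, (3,): -0.2, (0, 3): 0.05, (0, 3, 7): 0.01}
second_order = filter_by_order(stored, min_order=2, max_order=2)
```

The cost of this loop is linear in the number of stored entries, whereas enumerating the powerset grows exponentially with the number of players.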

### Testing, Code-Quality and Documentation

- activates `"ALL"` rules in `ruff-format` configuration to enforce stricter code quality checks and addressed around 500 (not automatically solvable) issues in the code base. [#391](https://github.com/mmschlk/shapiq/pull/391)
- improved the testing environment by adding a new fixture module containing mock `InteractionValues` objects to be used in the tests. This allows for more efficient and cleaner tests, as well as easier debugging of the tests [#372](https://github.com/mmschlk/shapiq/pull/372)
- removed check and error message if the `index` parameter is not in the list of available indices in the `TabularExplainer` since the type hints were replaced by Literals [#391](https://github.com/mmschlk/shapiq/pull/391)
- removed multiple instances where `shapiq` tests if some approximators/explainers can be instantiated with certain indices or not in favor of using Literals in the `__init__` method of the approximator classes. This allows for better type hinting and IDE support, as well as cleaner code. [#391](https://github.com/mmschlk/shapiq/pull/391)
- Added documentation for all public modules, classes, and functions in the code base to improve the documentation quality and make it easier to understand how to use the package. [#391](https://github.com/mmschlk/shapiq/pull/391)
- suppresses a `RuntimeWarning` in the `Regression` approximators' `solve_regression()` method when the solver is not able to find good interim solutions for the regression problem.
- refactors the tests into `tests_unit/` and `tests_integration/` to better separate unit tests from integration tests. [#395](https://github.com/mmschlk/shapiq/pull/395)
- adds new integration tests in `tests/tests_integration/test_explainer_california_housing` which compares the different explainers against ground-truth interaction values computed by `shapiq.ExactComputer` and interaction values stored on [disk](https://github.com/mmschlk/shapiq/tree/main/tests/data/interaction_values/california_housing) as a form of regression test. This test should help finding bugs in the future when the approximators, explainers, or exact computation are changed. [#395](https://github.com/mmschlk/shapiq/pull/395)

### Bug Fixes

- fixed a bug in the `shapiq.waterfall_plot` function that caused the plot to not display correctly resulting in cutoff y_ticks. Additionally, the file was renamed from `watefall.py` to `waterfall.py` to match the function name [#377](https://github.com/mmschlk/shapiq/pull/377)
- fixes a bug with `TabPFNExplainer`, where the model was not able to be used for predictions after it was explained. This was due to the model being fitted on a subset of features, which caused inconsistencies in the model's predictions after explanation. The fix includes that after each call to the `TabPFNImputer.value_function`, the tabpfn model is fitted on the whole dataset (without omitting features). This means that the original model can be used for predictions after it has been explained. [#396](https://github.com/mmschlk/shapiq/issues/396).
- fixed a bug in computing `BII` or `BV` indices with `shapiq.approximator.MonteCarlo` approximators (affecting `SHAP-IQ`, `SVARM` and `SVARM-IQ`). All orders of BII should now be computed correctly. [#395](https://github.com/mmschlk/shapiq/pull/395)

## v1.2.3 (2025-03-24)

- substantially improves the runtime of all `Regression` approximators by a) a faster pre-computation of the regression matrices and b) a faster computation of the weighted least squares regression [#340](https://github.com/mmschlk/shapiq/issues/340)
- removes `sample_replacements` parameter from `MarginalImputer` and removes the DeprecationWarning for it
- adds a trivial computation to `TreeSHAP-IQ` for trees that use only one feature in the tree (this works for decision stumps or trees splitting on only one feature multiple times). In such trees, the computation is trivial as the whole effect of $\nu(N) - \nu(\emptyset)$ is all on the main effect of the single feature and there is no interaction effect. This expands on the fix in v1.2.1 [#286](https://github.com/mmschlk/shapiq/issues/286).
- fixes a bug with xgboost feature names where trees that did not contain all features would cause `TreeExplainer` to fail
- fixes a bug with `stacked_bar_plot` where the higher order interactions were inflated by the lower order interactions, thus wrongly showing the higher order interactions as higher than they are
- fixes a bug where `InteractionValues.get_subset()` returns a faulty `coalition_lookup` dictionary pointing to indices outside the subset of players [#336](https://github.com/mmschlk/shapiq/issues/336)
- updates default value of `TreeExplainer`'s `min_order` parameter from 1 to 0 to include the baseline value in the interaction values as per default
- adds the `RegressionFBII` approximator to estimate Faithful Banzhaf interactions via least squares regression [#333](https://github.com/mmschlk/shapiq/pull/333). Additionally, FBII support was introduced in TabularExplainer and MonteCarlo-Approximator.
- adds a `RandomGame` class as part of `shapiq.games.benchmark` which always returns a random vector of integers between 0 and 100.
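
The single-feature shortcut for `TreeSHAP-IQ` mentioned above rests on a general property: if a game depends on only one player, that player receives the full $\nu(N) - \nu(\emptyset)$ and all other players receive zero. The sketch below checks this with a brute-force Shapley computation on a toy game. It is independent of shapiq; `shapley_values` and `stump` are illustrative names, not library functions.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_players, value):
    """Exact Shapley values by enumerating all coalitions (small n only)."""
    players = range(n_players)
    phi = [0.0] * n_players
    for i in players:
        others = [p for p in players if p != i]
        for size in range(n_players):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n_players - size - 1) / factorial(n_players)
            for subset in combinations(others, size):
                without_i = set(subset)
                phi[i] += weight * (value(without_i | {i}) - value(without_i))
    return phi


def stump(coalition):
    # Toy game: the output depends only on whether player 1 is present.
    return 5.0 if 1 in coalition else 2.0


phi = shapley_values(3, stump)
full_effect = stump({0, 1, 2}) - stump(set())
```

In this toy game `phi[1]` matches `full_effect` up to floating-point rounding, while `phi[0]` and `phi[2]` are zero, so no enumeration is needed when a tree splits on a single feature.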

## v1.2.2 (2025-03-11)

- changes python support to 3.10-3.13 [#318](https://github.com/mmschlk/shapiq/pull/318)
- fixes a bug that prohibited importing shapiq in environments without write access [#326](https://github.com/mmschlk/shapiq/issues/326)
- adds `ExtraTreeRegressors` to supported models [#309](https://github.com/mmschlk/shapiq/pull/309)

## v1.2.1 (2025-02-17)

- fixes bugs regarding plotting [#315](https://github.com/mmschlk/shapiq/issues/315) and [#316](https://github.com/mmschlk/shapiq/issues/316)
- fixes a bug with TreeExplainer and Trees that consist of only one feature [#286](https://github.com/mmschlk/shapiq/issues/286)
- fixes SV init with explainer for permutation, svarm, kernelshap, and unbiased kernelshap [#319](https://github.com/mmschlk/shapiq/issues/319)
- adds a progress bar to `explain_X()` [#324](https://github.com/mmschlk/shapiq/issues/324)

## v1.2.0 (2025-01-15)

- adds `shapiq.TabPFNExplainer` as a specialized version of the `shapiq.TabularExplainer` which offers a streamlined variant of the explainer for the TabPFN model [#301](https://github.com/mmschlk/shapiq/issues/301)
- handles `explainer.explain()` now through a common interface for all explainer classes which now need to implement a `explain_function()` method
- adds the baseline_value into the InteractionValues object's value storage for the `()` interaction if `min_order=0` (default usually) for all indices that are not `SII` (SII has another baseline value) such that the values are efficient (sum up to the model prediction) without the awkward handling of the baseline_value attribute
- renames `game_fun` parameter in `shapiq.ExactComputer` to `game` [#297](https://github.com/mmschlk/shapiq/issues/297)
- adds a TabPFN example notebook to the documentation
- removes warning when class_index is not provided in explainers [#298](https://github.com/mmschlk/shapiq/issues/298)
- adds the `sentence_plot` function to the `plot` module to visualize the contributions of words to a language model prediction in a sentence-like format
- makes abbreviations in the `plot` module optional [#281](https://github.com/mmschlk/shapiq/issues/281)
- adds the `upset_plot` function to the `plot` module to visualize the interactions of higher-order [#290](https://github.com/mmschlk/shapiq/issues/290)
- adds support for IsoForest models to explainer and tree explainer [#278](https://github.com/mmschlk/shapiq/issues/278)
- adds support for sub-selection of players in the interaction values data class [#276](https://github.com/mmschlk/shapiq/issues/276) which allows retrieving interaction values for a subset of players
- refactors game theory computations like `ExactComputer`, `MoebiusConverter`, `core`, among others to be more modular and flexible into the `game_theory` module [#258](https://github.com/mmschlk/shapiq/issues/258)
- improves quality of the tests by adding many more semantic tests to the different interaction indices and computations [#285](https://github.com/mmschlk/shapiq/pull/285)
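
To make the efficiency property from the baseline entry above concrete: once the baseline is stored under the empty interaction `()`, summing all stored scores reproduces the model prediction directly. The numbers below are made up for illustration, and the plain dictionary is only a stand-in for the value storage.

```python
# Made-up scores for a two-feature model; () holds the baseline value.
interactions = {(): 2.0, (0,): 1.5, (1,): -0.5, (0, 1): 0.25}
prediction = 3.25

# With the baseline stored as the () interaction, a plain sum is efficient.
total = sum(interactions.values())

# Without the () entry, the baseline would have to be added separately.
baseline = interactions[()]
total_without_empty = sum(v for k, v in interactions.items() if k != ())
```

Both `total` and `baseline + total_without_empty` equal `prediction` here; storing the baseline in `()` simply removes the need for the second form.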

## v1.1.1 (2024-11-13)

### Improvements and Ease of Use

- adds a `class_index` parameter to `TabularExplainer` and `Explainer` to specify the class index to be explained for classification models [#271](https://github.com/mmschlk/shapiq/issues/271) (renames `class_label` parameter in TreeExplainer to `class_index`)
- adds support for `PyTorch` models to `Explainer` [#272](https://github.com/mmschlk/shapiq/issues/272)
- adds new tests comparing `shapiq` outputs for SVs with values computed with `shap`
- adds new tests for checking `shapiq` explainers with different types of models

### Bug Fixes

- fixes a bug that `RandomForestClassifier` models were not working with the `TreeExplainer` [#273](https://github.com/mmschlk/shapiq/issues/273)

## v1.1.0 (2024-11-07)

### New Features and Improvements

- adds computation of the Egalitarian Core (`EC`) and Egalitarian Least-Core (`ELC`) to the `ExactComputer` [#182](https://github.com/mmschlk/shapiq/issues/182)
- adds `waterfall_plot` [#34](https://github.com/mmschlk/shapiq/issues/34) that visualizes the contributions of features to the model prediction
- adds `BaselineImputer` [#107](https://github.com/mmschlk/shapiq/issues/107) which is now responsible for handling the `sample_replacements` parameter. Added a DeprecationWarning for the parameter in `MarginalImputer`, which will be removed in the next release.
- adds `joint_marginal_distribution` parameter to `MarginalImputer` with default value `True` [#261](https://github.com/mmschlk/shapiq/issues/261)
- renames explanation graph to `si_graph`
- `get_n_order` now has optional lower/upper limits for the order
- computing metrics for benchmarking now tries to resolve not-matching interaction indices and will throw a warning instead of a ValueError [#179](https://github.com/mmschlk/shapiq/issues/179)
- add a legend to benchmark plots [#170](https://github.com/mmschlk/shapiq/issues/170)
- refactored the `shapiq.games.benchmark` module into a separate `shapiq.benchmark` module by moving all but the benchmark games into the new module. This closes [#169](https://github.com/mmschlk/shapiq/issues/169) and makes benchmarking more flexible and convenient.
- a `shapiq.Game` can now be called more intuitively with coalitions data types (tuples of int or str) and also allows to add `player_names` to the game at initialization [#183](https://github.com/mmschlk/shapiq/issues/183)
- improve tests across the package

### Documentation

- adds a notebook showing how to use custom tree models with the `TreeExplainer` [#66](https://github.com/mmschlk/shapiq/issues/66)
- adds a notebook showing how to use the `shapiq.Game` API to create custom games [#184](https://github.com/mmschlk/shapiq/issues/184)
- adds a notebook showing how to visualize interactions [#252](https://github.com/mmschlk/shapiq/issues/252)
- adds a notebook showing how to compute Shapley values with `shapiq` [#193](https://github.com/mmschlk/shapiq/issues/197)
- adds a notebook for conducting data valuation [#190](https://github.com/mmschlk/shapiq/issues/190)
- adds a notebook introducing the Core and how to compute it with `shapiq` [#191](https://github.com/mmschlk/shapiq/issues/191)

### Bug Fixes

- fixes a bug with SIs not adding up to the model prediction because of wrong values in the empty set [#264](https://github.com/mmschlk/shapiq/issues/264)
- fixes a bug that `TreeExplainer` did not have the correct baseline_value when using XGBoost models [#250](https://github.com/mmschlk/shapiq/issues/250)
- fixes the force plot not showing and its baseline value

## v1.0.1 (2024-06-05)

- add `max_order=1` to `TabularExplainer` and `TreeExplainer`
- fix `TreeExplainer.explain_X(..., n_jobs=2, random_state=0)`

## v1.0.0 (2024-06-04)

Major release of the `shapiq` Python package including (among others):

- `approximator` module implements over 10 approximators of Shapley values and interaction indices.
- `exact` module implements a computer for over 10 game theoretic concepts like interaction indices or generalized values.
- `games` module implements over 10 application benchmarks for the approximators.
- `explainer` module includes a `TabularExplainer` and `TreeExplainer` for any-order feature interactions of machine learning model predictions.
- `interaction_values` module implements a data class to store and analyze interaction values.
- `plot` module allows visualizing interaction values.
- `datasets` module loads datasets for testing and examples.

Documentation of `shapiq` with tutorials and API reference is available at https://shapiq.readthedocs.io
