causalis.scenarios.unconfoundedness.refutation.score.influence_plot

Lightweight plots for the most influential score contributions.

Module Contents

Functions

plot_influence_instability

Plot only the most influential score contributions.

Data

__all__

API

causalis.scenarios.unconfoundedness.refutation.score.influence_plot.plot_influence_instability(estimate: causalis.data_contracts.causal_estimate.CausalEstimate, data: Optional[causalis.dgp.causaldata.CausalData] = None, *, trimming_threshold: Optional[float] = None, use_estimator_psi: bool = True, top_k: int = 20, annotate: bool = True, figsize: Optional[Tuple[float, float]] = None, dpi: int = 220, font_scale: float = 1.1, save: Optional[str] = None, save_dpi: Optional[int] = None, transparent: bool = False) -> matplotlib.pyplot.Figure

Plot only the most influential score contributions.

Panels

  1. Ranked bar chart of the top k observations by |psi_i|.

  2. Scatter of those top k observations versus clipped propensity m_i.

This replacement intentionally avoids plotting every observation. Use run_score_diagnostics(...)["summary"] for global tail metrics, and use this plot for a lightweight drill-down into the largest contributors.
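The ranking behind the first panel can be sketched as follows. This is a minimal illustration with NumPy and synthetic values, not the library's implementation; the helper `top_k_by_abs_score` and the array `psi` are hypothetical names introduced here.

```python
import numpy as np


def top_k_by_abs_score(psi, k=20):
    """Return indices of the k largest |psi_i|, largest first (hypothetical helper)."""
    psi = np.asarray(psi, dtype=float)
    k = min(k, psi.size)
    # A stable sort on negated magnitudes yields descending order,
    # with ties broken by original position.
    order = np.argsort(-np.abs(psi), kind="stable")
    return order[:k]


# Synthetic score contributions, for illustration only.
psi = np.array([0.2, -3.5, 1.1, 4.0, -0.1])
idx = top_k_by_abs_score(psi, k=3)
print(idx)       # positions of the three largest magnitudes
print(psi[idx])  # signed contributions in ranked order
```

Keeping the signed values alongside the ranked indices makes it possible to see whether the largest contributors push the estimate in the same direction.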

Notes

The figure ranks observations by :math:`|\hat\psi_i|`, where :math:`\hat\psi_i` is the fitted score contribution. Large values indicate observations with unusually strong leverage on the final estimate. If many top points sit near propensity clipping boundaries, that is usually a sign to inspect overlap and nuisance fit quality together.
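One rough way to quantify "many top points sit near propensity clipping boundaries" is to compute the share of the top-k observations whose clipped propensity lies on a boundary. The sketch below assumes propensities were clipped to [threshold, 1 - threshold]; that convention, the helper `share_near_clipping`, and the synthetic arrays are assumptions made for illustration, not confirmed library behavior.

```python
import numpy as np


def share_near_clipping(psi, m, threshold, k=20, tol=1e-8):
    """Share of the k largest-|psi| points whose propensity sits on a clipping bound.

    Hypothetical helper: assumes m was clipped to [threshold, 1 - threshold].
    """
    psi = np.asarray(psi, dtype=float)
    m = np.asarray(m, dtype=float)
    k = min(k, psi.size)
    # Indices of the k largest score magnitudes, largest first.
    top = np.argsort(-np.abs(psi), kind="stable")[:k]
    # A point counts as "at the boundary" if it is within tol of either bound.
    at_boundary = (m[top] <= threshold + tol) | (m[top] >= 1.0 - threshold - tol)
    return float(at_boundary.mean())


# Synthetic score contributions and clipped propensities, for illustration only.
psi = np.array([5.0, -4.0, 0.3, 0.1, 3.0, -0.2])
m = np.array([0.01, 0.99, 0.50, 0.40, 0.35, 0.60])
share = share_near_clipping(psi, m, threshold=0.01, k=3)
print(share)
```

A share close to 1 would suggest that the most influential contributions are driven by extreme propensities, which is the situation the note above flags for a closer look at overlap.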

Examples

>>> from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
>>> from causalis.dgp import obs_linear_26_dataset
>>> from causalis.scenarios.unconfoundedness.model import IRM
>>> from causalis.scenarios.unconfoundedness.refutation.score.influence_plot import (
...     plot_influence_instability,
... )
>>> data = obs_linear_26_dataset(
...     n=1000,
...     seed=3141,
...     include_oracle=False,
...     return_causal_data=True,
... )
>>> irm = IRM(
...     data=data,
...     ml_g=RandomForestRegressor(
...         n_estimators=200,
...         max_depth=6,
...         min_samples_leaf=5,
...         random_state=3141,
...     ),
...     ml_m=RandomForestClassifier(
...         n_estimators=200,
...         max_depth=6,
...         min_samples_leaf=5,
...         random_state=3141,
...     ),
...     n_folds=3,
...     random_state=3141,
... )
>>> estimate = irm.fit().estimate(score="ATE")
>>> fig = plot_influence_instability(estimate, data=data, top_k=15)  # doctest: +SKIP

causalis.scenarios.unconfoundedness.refutation.score.influence_plot.__all__

['plot_influence_instability']