causalis.scenarios.unconfoundedness.refutation.unconfoundedness.unconfoundedness_validation

Unconfoundedness diagnostics focused on covariate balance (SMD).

Module Contents

Functions

run_unconfoundedness_diagnostics

Run covariate-balance diagnostics implied by unconfoundedness.

Data

__all__

API

causalis.scenarios.unconfoundedness.refutation.unconfoundedness.unconfoundedness_validation.run_unconfoundedness_diagnostics(data: causalis.dgp.causaldata.CausalData, estimate: causalis.data_contracts.causal_estimate.CausalEstimate, *, threshold: float = 0.1, normalize: Optional[bool] = None, return_summary: bool = True) -> Dict[str, Any]

Run covariate-balance diagnostics implied by unconfoundedness.

The diagnostic compares the treated and control pseudo-populations induced
by the estimated propensity score. For ATE, the effective weights are

.. math::

    w_{1i} = \bar w_i \frac{D_i}{\hat m_i},
    \qquad
    w_{0i} = \bar w_i \frac{1-D_i}{1-\hat m_i},

while for ATTE this implementation uses

.. math::

    w_{1i} = D_i,
    \qquad
    w_{0i} = (1-D_i) \frac{\hat m_i}{1-\hat m_i}.
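
The two weighting schemes above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: ``balance_weights`` and its ``base_weights`` argument are hypothetical names introduced here, the base weights :math:`\bar w_i` are assumed to equal one when not supplied, and the propensities :math:`\hat m_i` are assumed to lie strictly between 0 and 1.

```python
import numpy as np


def balance_weights(d, m_hat, score="ATE", base_weights=None):
    """Return (w1, w0) pseudo-population weights for treated and control units.

    Hypothetical helper for illustration only; not part of causalis.
    """
    d = np.asarray(d, dtype=float)
    m_hat = np.asarray(m_hat, dtype=float)
    if d.shape != m_hat.shape:
        raise ValueError("d and m_hat must have the same shape")

    # \bar w_i from the ATE formula; assumed to be 1 when nothing is supplied.
    if base_weights is None:
        w_bar = np.ones_like(d)
    else:
        w_bar = np.asarray(base_weights, dtype=float)

    if score == "ATE":
        # Inverse-propensity weights for both arms.
        w1 = w_bar * d / m_hat
        w0 = w_bar * (1.0 - d) / (1.0 - m_hat)
    elif score == "ATTE":
        # Treated units keep unit weight; controls are reweighted by the odds.
        w1 = d.copy()
        w0 = (1.0 - d) * m_hat / (1.0 - m_hat)
    else:
        raise ValueError(f"unknown score: {score!r}")

    # A group with zero total weight cannot form a pseudo-population.
    if w1.sum() <= 0 or w0.sum() <= 0:
        raise RuntimeError("balance weights have zero total mass")
    return w1, w0


# Toy data: two treated and two control units.
d = np.array([1, 0, 1, 0])
m_hat = np.array([0.5, 0.5, 0.8, 0.2])
w1_ate, w0_ate = balance_weights(d, m_hat, score="ATE")
w1_att, w0_att = balance_weights(d, m_hat, score="ATTE")
```

In this toy example the control unit with :math:`\hat m_i = 0.2` receives ATE weight 1.25 but ATTE weight 0.25, because it resembles the treated group only weakly.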

For each confounder :math:`X_j`, the weighted standardized mean
difference is

.. math::

    \mathrm{SMD}_j =
    \frac{|\mu_{1j}^{(w)} - \mu_{0j}^{(w)}|}
         {\sqrt{(s_{1j}^{2,(w)} + s_{0j}^{2,(w)}) / 2}}.

Smaller weighted SMDs indicate better covariate balance between the two
pseudo-populations. A common rule of thumb is to aim for
:math:`\mathrm{SMD}_j < 0.10`.
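
As a rough sketch of the SMD formula for a single covariate, the snippet below computes weighted means and variances with NumPy. ``weighted_smd`` is a hypothetical helper, not a causalis function, and the variance here uses plain weighted averages with no small-sample correction, which is an assumption of this sketch rather than a statement about the library.

```python
import numpy as np


def weighted_smd(x, w1, w0):
    """Weighted standardized mean difference for one covariate column.

    Hypothetical helper for illustration only; not part of causalis.
    """
    x = np.asarray(x, dtype=float)
    w1 = np.asarray(w1, dtype=float)
    w0 = np.asarray(w0, dtype=float)

    # Weighted means in the treated and control pseudo-populations.
    mu1 = np.average(x, weights=w1)
    mu0 = np.average(x, weights=w0)

    # Weighted variances around those means (no small-sample correction;
    # this choice is an assumption of the sketch).
    var1 = np.average((x - mu1) ** 2, weights=w1)
    var0 = np.average((x - mu0) ** 2, weights=w0)

    pooled_sd = np.sqrt((var1 + var0) / 2.0)
    return abs(mu1 - mu0) / pooled_sd


# Toy covariate with treatment indicators used directly as weights,
# which reproduces the unweighted SMD.
x = np.array([1.0, 3.0, 2.0, 6.0])
d = np.array([1.0, 1.0, 0.0, 0.0])
smd_unweighted = weighted_smd(x, w1=d, w0=1.0 - d)

# Compare against the rule-of-thumb threshold.
is_balanced = smd_unweighted < 0.10
```

Passing propensity-based weights instead of the raw indicators gives the weighted version; a covariate that is constant within both groups would make the pooled standard deviation zero and needs separate handling.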

Parameters
----------
data : CausalData
    Dataset used to fit the estimator.
estimate : CausalEstimate
    Effect estimate with ``diagnostic_data`` containing propensity and, when
    available, weight information.
threshold : float, default 0.10
    SMD threshold used for warnings and pass/fail summaries.
normalize : bool, optional
    Override whether pseudo-population weights are mean-normalized.
return_summary : bool, default True
    Include a compact summary table in the returned payload.

Returns
-------
Dict[str, Any]
    Diagnostic report with weighted balance tables, severity flags, and an
    optional summary DataFrame.

Raises
------
ValueError
    If required diagnostic arrays are missing or have incompatible shapes.
RuntimeError
    If balance weights collapse to zero total mass.

Examples
--------
>>> from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
>>> from causalis.dgp import obs_linear_26_dataset
>>> from causalis.scenarios.unconfoundedness.model import IRM
>>> data = obs_linear_26_dataset(
...     n=1000,
...     seed=3141,
...     include_oracle=False,
...     return_causal_data=True,
... )
>>> irm = IRM(
...     data=data,
...     ml_g=RandomForestRegressor(
...         n_estimators=200,
...         max_depth=6,
...         min_samples_leaf=5,
...         random_state=3141,
...     ),
...     ml_m=RandomForestClassifier(
...         n_estimators=200,
...         max_depth=6,
...         min_samples_leaf=5,
...         random_state=3141,
...     ),
...     n_folds=3,
...     random_state=3141,
... )
>>> estimate = irm.fit().estimate(score="ATE")
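>>> from causalis.scenarios.unconfoundedness.refutation.unconfoundedness.unconfoundedness_validation import (
...     run_unconfoundedness_diagnostics,
... )
>>> report = run_unconfoundedness_diagnostics(data, estimate)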
>>> report["balance"]["smd_max"]  # doctest: +SKIP
>>> report["balance"]["worst_features"].head()  # doctest: +SKIP

causalis.scenarios.unconfoundedness.refutation.unconfoundedness.unconfoundedness_validation.__all__

['run_unconfoundedness_diagnostics']