causalis.scenarios.unconfoundedness.model

IRM estimator consuming CausalData.

Implements cross-fitted nuisance estimation for g0, g1 and m, and supports ATE/ATTE/GATE/GATET scores.

See also the DoubleML IRM implementation: https://github.com/DoubleML/doubleml-for-py/blob/main/doubleml/irm/irm.py

Module Contents

Classes

IRM

Interactive Regression Model (IRM) with cross-fitting using CausalData.

API

class causalis.scenarios.unconfoundedness.model.IRM(data: Optional[causalis.dgp.causaldata.CausalData] = None, ml_g: Any = None, ml_m: Any = None, *, n_folds: int = 5, n_rep: int = 1, normalize_ipw: bool = False, trimming_rule: str = 'truncate', trimming_threshold: float = 0.01, weights: Optional[numpy.ndarray | Dict[str, Any]] = None, relative_baseline_min: float = 1e-08, random_state: Optional[int] = None, n_jobs: int = 1, store_diagnostics: bool = True)

Bases: sklearn.base.BaseEstimator

Interactive Regression Model (IRM) with cross-fitting using CausalData.

Parameters

data : CausalData
    Data container with outcome, binary treatment (0/1), and confounders.
ml_g : estimator
    Learner for E[Y|X,D]. If classifier and Y is binary, predict_proba is used; otherwise predict().
ml_m : classifier
    Learner for E[D|X] (propensity). Must support predict_proba() or predict() in (0,1).
n_folds : int, default 5
    Number of cross-fitting folds.
n_rep : int, default 1
    Number of repetitions of sample splitting. Currently only 1 is supported.
normalize_ipw : bool, default False
    Whether to normalize IPW terms within the score. Applied to ATE only. For ATTE, normalization is ignored to preserve the canonical ATTE EIF.
trimming_rule : {"truncate"}, default "truncate"
    Trimming approach for propensity scores.
trimming_threshold : float, default 1e-2
    Threshold for trimming if rule is "truncate".
weights : Optional[np.ndarray or Dict], default None
    Optional weights.

    - If array of shape (n,), used as ATE weights (w). Assumed E[w|X] = w.
    - If dict, can contain 'weights' (w) and 'weights_bar' (E[w|X]).
    - For ATTE, computed internally (w=D/P(D=1), w_bar=m(X)/P(D=1)).

    Note: If weights depend on treatment or outcome, E[w|X] must be provided for correct sensitivity analysis.
relative_baseline_min : float, default 1e-8
    Minimum absolute baseline value used for relative effects. If |mu_c| is below this threshold, relative estimates are set to NaN with a warning.
random_state : Optional[int], default None
    Random seed for fold creation.
n_jobs : int, default 1
    Number of parallel jobs for fold-level cross-fitting. Use -1 to use all available CPUs.

    Practical guidance:

    - Start with n_jobs=1 for stable, low-contention defaults.
    - Increase to n_jobs=2/4/-1 when cross-fitting is the bottleneck.
    - If nuisance learners are already multithreaded (e.g. CatBoost with thread_count=-1), keep n_jobs=1 or set learner threads to 1 to avoid CPU oversubscription.
    - On shared machines, prefer a bounded value (for example 2 or 4) instead of -1.
store_diagnostics : bool, default True
    Whether to retain raw fit-time arrays and diagnostic-only artifacts on the fitted model. Set to False for a lighter-weight estimator that still supports effect estimation, while only retaining immutable outcome and treatment snapshots. In lightweight mode the estimator no longer keeps the confounder matrix, raw propensities, or fold assignments in memory after fit().

Examples

>>> from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
>>> from causalis.dgp import obs_linear_26_dataset
>>> from causalis.scenarios.unconfoundedness.model import IRM
>>> data = obs_linear_26_dataset(
...     n=1000,
...     seed=3141,
...     include_oracle=False,
...     return_causal_data=True,
... )
>>> ml_g = RandomForestRegressor(
...     n_estimators=200,
...     max_depth=6,
...     min_samples_leaf=5,
...     random_state=3141,
... )
>>> ml_m = RandomForestClassifier(
...     n_estimators=200,
...     max_depth=6,
...     min_samples_leaf=5,
...     random_state=3141,
... )
>>> irm = IRM(data=data, ml_g=ml_g, ml_m=ml_m, n_folds=3, random_state=3141)
>>> ate = irm.fit().estimate(score="ATE")
>>> ate.summary()  # doctest: +SKIP
>>> atte = irm.estimate(score="ATTE")
>>> atte.value  # doctest: +SKIP

Notes

The IRM model targets binary-treatment causal effects under unconfoundedness. Let :math:`W = (Y, D, X)` with :math:`D \in \{0, 1\}` and define

.. math::

g_0(d, x) = \mathbb{E}[Y \mid D=d, X=x], \qquad
m_0(x) = \mathbb{P}(D=1 \mid X=x).

Under conditional ignorability and overlap,

.. math::

(Y(0), Y(1)) \perp D \mid X, \qquad 0 < m_0(X) < 1 \ \text{a.s.},

the target functionals are identified as

.. math::

\theta_0^{ATE} = \mathbb{E}[g_0(1, X) - g_0(0, X)]

and

.. math::

\theta_0^{ATTE} = \mathbb{E}[g_0(1, X) - g_0(0, X) \mid D=1].

This implementation cross-fits three nuisance objects: :math:`\hat g_1(x) \approx \mathbb{E}[Y \mid D=1, X=x]`, :math:`\hat g_0(x) \approx \mathbb{E}[Y \mid D=0, X=x]`, and :math:`\hat m(x) \approx \mathbb{P}(D=1 \mid X=x)`. Propensities are trimmed via

.. math::

\tilde m(x) = \min\{1-\varepsilon, \max(\hat m(x), \varepsilon)\},

where :math:`\varepsilon` is the value of ``trimming_threshold``.
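The truncation rule above can be sketched with NumPy. This is an illustration rather than the library's code: the helper name `truncate_propensity` and the range check on the threshold are assumptions made for the example.

```python
import numpy as np


def truncate_propensity(m_hat, trimming_threshold=0.01):
    """Clip propensities into [eps, 1 - eps] (hypothetical helper)."""
    eps = float(trimming_threshold)
    # Assumption for this sketch: eps must leave a non-empty interval.
    if not 0.0 < eps < 0.5:
        raise ValueError("trimming_threshold must lie in (0, 0.5)")
    # min{1 - eps, max(m_hat, eps)} is exactly an element-wise clip.
    return np.clip(np.asarray(m_hat, dtype=float), eps, 1.0 - eps)


m_hat = np.array([0.0, 0.004, 0.3, 0.995, 1.0])
m_tilde = truncate_propensity(m_hat, trimming_threshold=0.01)
```

Values already inside the interval are left unchanged, so only extreme propensities are moved to the boundary.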

Estimation solves the sample moment equation

.. math::

\mathbb{E}_n[\psi_a(W_i; \hat\eta)\theta + \psi_b(W_i; \hat\eta)] = 0,

giving the closed-form estimator

.. math::

\hat\theta = -\frac{\mathbb{E}_n[\psi_b(W_i; \hat\eta)]}
{\mathbb{E}_n[\psi_a(W_i; \hat\eta)]}.

For both ATE and ATTE, the orthogonal score component used here is

.. math::

\psi_b =
w \, (\hat g_1(X) - \hat g_0(X))
+ \bar w
\left[
(Y - \hat g_1(X)) \frac{D}{\tilde m(X)}
-
(Y - \hat g_0(X)) \frac{1-D}{1-\tilde m(X)}
\right].

The score derivative differs by estimand:

.. math::

\psi_a = -1 \quad \text{for ATE}, \qquad
\psi_a = -w \quad \text{for ATTE}.

The corresponding weights are

.. math::

w = \bar w = 1 \quad \text{for unweighted ATE},

while for ATTE this implementation uses normalized treated weights

.. math::

w_i = \frac{D_i}{\mathbb{E}_n[D]}, \qquad
\bar w_i = \frac{\tilde m(X_i)}{\mathbb{E}_n[D]}.
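The score components and the closed-form estimator above can be written out directly. The following is a minimal sketch on simulated data with oracle nuisances, not the library's implementation: the function name `irm_score`, the data-generating process, and the use of true nuisance functions in place of cross-fitted ones are all assumptions made for illustration.

```python
import numpy as np


def irm_score(y, d, g0_hat, g1_hat, m_tilde, score="ATE"):
    """Return (theta_hat, psi_a, psi_b) for unweighted ATE or ATTE (sketch)."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    if score == "ATE":
        w = np.ones_like(y)          # w = 1
        w_bar = np.ones_like(y)      # w_bar = 1
        psi_a = -np.ones_like(y)     # psi_a = -1
    elif score == "ATTE":
        p_treated = d.mean()         # E_n[D]
        w = d / p_treated
        w_bar = m_tilde / p_treated
        psi_a = -w                   # psi_a = -w
    else:
        raise ValueError("score must be 'ATE' or 'ATTE' in this sketch")
    psi_b = w * (g1_hat - g0_hat) + w_bar * (
        (y - g1_hat) * d / m_tilde
        - (y - g0_hat) * (1.0 - d) / (1.0 - m_tilde)
    )
    theta_hat = -psi_b.mean() / psi_a.mean()
    return theta_hat, psi_a, psi_b


# Simulated data with a constant treatment effect of 2 (assumed DGP).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
m0 = 1.0 / (1.0 + np.exp(-0.5 * x))
d = (rng.random(n) < m0).astype(float)
y = 2.0 * d + x + rng.normal(size=n)

# Oracle nuisances stand in for cross-fitted predictions.
g0, g1 = x, 2.0 + x
m_tilde = np.clip(m0, 0.01, 0.99)

ate, psi_a_ate, psi_b_ate = irm_score(y, d, g0, g1, m_tilde, score="ATE")
atte, psi_a_atte, psi_b_atte = irm_score(y, d, g0, g1, m_tilde, score="ATTE")
```

Because the simulated effect is constant, both estimates should land near 2, and plugging either estimate back into the sample moment equation gives zero up to floating-point error.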

If ``normalize_ipw=True``, the inverse-probability factors :math:`D / \tilde m(X)` and :math:`(1-D) / (1-\tilde m(X))` are additionally stabilized by their sample means (a Hajek-style normalization). This option is applied to ATE only; for ATTE it is intentionally ignored to preserve the canonical ATTE efficient influence function used by the estimator.
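One common reading of this stabilization is to divide each inverse-probability factor by its own sample mean, so that both factors average to one. The sketch below assumes that reading; the helper name `hajek_ipw_factors` is hypothetical and the exact normalization inside the library may differ.

```python
import numpy as np


def hajek_ipw_factors(d, m_tilde):
    """Return mean-normalized IPW factors for treated and control (sketch)."""
    d = np.asarray(d, dtype=float)
    m_tilde = np.asarray(m_tilde, dtype=float)
    h1 = d / m_tilde                    # D / m~(X)
    h0 = (1.0 - d) / (1.0 - m_tilde)    # (1 - D) / (1 - m~(X))
    # Dividing by the sample mean makes each factor average to exactly 1.
    return h1 / h1.mean(), h0 / h0.mean()


d = np.array([1, 0, 1, 0, 0])
m_tilde = np.array([0.8, 0.2, 0.5, 0.4, 0.1])
h1_norm, h0_norm = hajek_ipw_factors(d, m_tilde)
```

The sketch requires at least one treated and one control observation; otherwise a sample mean is zero and the division is undefined.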

Initialization

Initialize the estimator and validate configuration options.

fit(data: Optional[causalis.dgp.causaldata.CausalData] = None, *, store_diagnostics: Optional[bool] = None) causalis.scenarios.unconfoundedness.model.IRM

Fit nuisance models via cross-fitting.

Parameters

data : Optional[CausalData], default None
    CausalData container. If None, uses self.data.
store_diagnostics : Optional[bool], default None
    Optional override for whether the fitted model should retain diagnostics-oriented arrays and expose diagnostic payloads from subsequent estimate() calls. Outcome and treatment snapshots are always retained to keep post-fit estimation deterministic.

Returns

self : IRM
    Fitted estimator.
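To make the cross-fitting step concrete, here is a minimal NumPy-only sketch of out-of-fold nuisance prediction. It is not the library's code: ordinary least squares stands in for the user-supplied `ml_g` and `ml_m` learners (including a linear probability model for the propensity), and the helper `cross_fit_predictions` and the simulated data are assumptions made for illustration.

```python
import numpy as np


def cross_fit_predictions(X, target, n_folds=5, random_state=None, mask=None):
    """Out-of-fold OLS predictions; `mask` restricts the training rows."""
    n = X.shape[0]
    rng = np.random.default_rng(random_state)
    folds = rng.permutation(n) % n_folds          # balanced random folds
    design = np.column_stack([np.ones(n), X])     # add an intercept
    preds = np.empty(n)
    for k in range(n_folds):
        test = folds == k
        train = ~test if mask is None else (~test & mask)
        beta, *_ = np.linalg.lstsq(design[train], target[train], rcond=None)
        # Every held-out row is predicted by a model that never saw it.
        preds[test] = design[test] @ beta
    return preds, folds


rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 2))
d = (rng.random(n) < 0.5).astype(float)
y = 1.0 + X @ np.array([1.0, -1.0]) + 2.0 * d + 0.5 * rng.normal(size=n)

# Same seed for each call, so all three nuisances share one fold assignment.
g1_hat, folds = cross_fit_predictions(X, y, n_folds=3, random_state=0, mask=d == 1)
g0_hat, _ = cross_fit_predictions(X, y, n_folds=3, random_state=0, mask=d == 0)
m_hat, _ = cross_fit_predictions(X, d, n_folds=3, random_state=0)
m_tilde = np.clip(m_hat, 0.01, 0.99)
```

The outcome models are trained on treated or control rows only but predict for every held-out row, which is what allows both potential-outcome predictions to be compared for each observation.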

estimate(score: str = 'ATE', alpha: float = 0.05, groups: Optional[pandas.DataFrame | pandas.Series] = None, cov_type: str = 'HC3', cov_kwds: Optional[Dict[str, Any]] = None) causalis.data_contracts.causal_estimate.CausalEstimate | causalis.data_contracts.gate_estimate.GateEstimate

Compute treatment effects using stored nuisance predictions.

Parameters

score : {"ATE", "ATTE", "GATE", "GATET"}, default "ATE"
    Target estimand.
alpha : float, default 0.05
    Significance level for intervals.
groups : Optional[pd.DataFrame | pd.Series], default None
    Group labels/indicators for score="GATE" or score="GATET". If None, falls back to self.data.gate_groups when present. GATE/GATET requires CausalData.user_id and aligns groups to those fit-time observation ids. Row-indexed groups are accepted only when the fit-time row-to-user_id mapping is still unchanged.
cov_type : {"HC0", "HC1", "HC2", "HC3"}, default "HC3"
    Robust covariance type for score="GATE" / score="GATET" inference.
cov_kwds : Optional[Dict[str, Any]], default None
    Additional covariance keyword arguments requested for subgroup inference. These are currently ignored because GATE/GATET use closed-form HCx covariance formulas rather than delegating to statsmodels.

Diagnostic payloads are included in the result only when the model was fitted with store_diagnostics=True.

Returns

CausalEstimate or GateEstimate
    Result container for the estimated effect. For subgroup scores, the returned GateEstimate supports summary() for subgroup-vs-zero inference, contrast(...) for formal group-vs-group tests, and pairwise_summary(...) for a broader comparison table.
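As a rough illustration of what closed-form HCx subgroup inference can look like, the sketch below regresses an orthogonal signal on a strict dummy basis and builds the sandwich covariance by hand. It is an assumption about the general technique, not a description of the library's internals; the function name `gate_from_signal` and the simulated signal are hypothetical.

```python
import numpy as np


def gate_from_signal(orth_signal, labels, cov_type="HC3"):
    """Group means of the signal with HCx standard errors (sketch)."""
    orth_signal = np.asarray(orth_signal, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    basis = (labels[:, None] == groups[None, :]).astype(float)  # (n, G) dummies
    n, k = basis.shape
    xtx_inv = np.linalg.inv(basis.T @ basis)
    coef = xtx_inv @ basis.T @ orth_signal       # OLS on dummies = group means
    resid = orth_signal - basis @ coef
    leverage = np.einsum("ij,jk,ik->i", basis, xtx_inv, basis)  # h_ii
    if cov_type == "HC0":
        scale = np.ones(n)
    elif cov_type == "HC1":
        scale = np.full(n, n / (n - k))
    elif cov_type == "HC2":
        scale = 1.0 / (1.0 - leverage)
    elif cov_type == "HC3":
        scale = 1.0 / (1.0 - leverage) ** 2
    else:
        raise ValueError("cov_type must be one of HC0, HC1, HC2, HC3")
    meat = basis.T @ (basis * (scale * resid**2)[:, None])
    cov = xtx_inv @ meat @ xtx_inv               # sandwich covariance
    return groups, coef, np.sqrt(np.diag(cov))


rng = np.random.default_rng(1)
labels = np.array(["a"] * 50 + ["b"] * 70)
signal = np.where(labels == "a", 1.0, 3.0) + rng.normal(size=labels.size)
groups, gate_coef, gate_se = gate_from_signal(signal, labels, cov_type="HC3")
```

With a strict dummy basis each coefficient is simply the within-group mean of the signal, and every group needs at least two observations for the HC2/HC3 leverage correction to be defined.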

property diagnostics_: Dict[str, Any]

Return diagnostic data.

Returns

dict
    Dictionary containing 'm_hat', 'g0_hat', 'g1_hat', and 'folds'.

property coef: numpy.ndarray

Return the estimated coefficient.

Returns

np.ndarray
    The estimated coefficient.

property se: numpy.ndarray

Return the standard error of the estimate.

Returns

np.ndarray
    The standard error.

property pvalues: numpy.ndarray

Return the p-values for the estimate.

Returns

np.ndarray
    The p-values.

property summary: pandas.DataFrame

Return a summary DataFrame of the results.

Returns

pd.DataFrame
    The results summary.

property orth_signal: numpy.ndarray

Return the cross-fitted orthogonal signal (psi_b).

Returns

np.ndarray
    The orthogonal signal.

gate(groups: pandas.DataFrame | pandas.Series, alpha: float = 0.05, cov_type: str = 'HC3', cov_kwds: Optional[Dict[str, Any]] = None) causalis.data_contracts.gate_estimate.GateEstimate

Convenience wrapper for estimate(score="GATE", ...).

Parameters

groups : pd.DataFrame or pd.Series
    Subgroup labels or a strict dummy basis. GATE requires CausalData.user_id and aligns groups to those fit-time observation ids.
alpha : float, default 0.05
    Significance level for confidence intervals.
cov_type : {"HC0", "HC1", "HC2", "HC3"}, default "HC3"
    Robust covariance type for subgroup inference.
cov_kwds : Optional[Dict[str, Any]], default None
    Additional covariance keyword arguments requested by the caller. These are currently ignored by the closed-form GATE implementation.

Returns

GateEstimate
    Estimated subgroup effects and diagnostics. The returned result also supports contrast(...) and pairwise_summary(...) for formal post-estimation group comparisons.

gatet(groups: pandas.DataFrame | pandas.Series, alpha: float = 0.05, cov_type: str = 'HC3', cov_kwds: Optional[Dict[str, Any]] = None) causalis.data_contracts.gate_estimate.GateEstimate

Convenience wrapper for estimate(score="GATET", ...).

sensitivity_analysis(r2_y: float, r2_d: float, rho: float = 1.0, H0: float = 0.0, alpha: float = 0.05) causalis.scenarios.unconfoundedness.model.IRM

Compute a sensitivity analysis following Chernozhukov et al. (2022).

Parameters

r2_y : float
    Sensitivity parameter for outcome equation (R^2 form, R_Y^2; converted to odds form internally).
r2_d : float
    Sensitivity parameter for treatment equation (R^2 form, R_D^2).
rho : float, default 1.0
    Correlation between unobserved components.
H0 : float, default 0.0
    Null hypothesis for robustness values.
alpha : float, default 0.05
    Significance level for CI bounds.

confint(alpha: float = 0.05) pandas.DataFrame

Compute confidence intervals for the estimated coefficient.

Parameters

alpha : float, default 0.05
    Significance level.

Returns

pd.DataFrame
    DataFrame with confidence intervals.
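The page does not spell out how the interval is constructed. A common choice for estimators of this kind is a two-sided normal (Wald) interval built from the coefficient and its standard error, and the sketch below assumes that construction; the helper `normal_confint` is hypothetical and uses only the standard library.

```python
from statistics import NormalDist


def normal_confint(coef, se, alpha=0.05):
    """Two-sided (1 - alpha) Wald interval: coef +/- z * se (sketch)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # standard normal quantile
    return coef - z * se, coef + z * se


lower, upper = normal_confint(coef=2.0, se=0.5, alpha=0.05)
```

Under this assumption, a smaller alpha widens the interval, and the interval is always symmetric around the point estimate.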