causalis.scenarios.unconfoundedness.model
IRM estimator consuming CausalData.
Implements cross-fitted nuisance estimation for g0, g1 and m, and supports ATE/ATTE/GATE/GATET scores.
See https://github.com/DoubleML/doubleml-for-py/blob/main/doubleml/irm/irm.py
Module Contents
Classes
IRM | Interactive Regression Model (IRM) with cross-fitting using CausalData.
API
- class causalis.scenarios.unconfoundedness.model.IRM(data: Optional[causalis.dgp.causaldata.CausalData] = None, ml_g: Any = None, ml_m: Any = None, *, n_folds: int = 5, n_rep: int = 1, normalize_ipw: bool = False, trimming_rule: str = 'truncate', trimming_threshold: float = 0.01, weights: Optional[numpy.ndarray | Dict[str, Any]] = None, relative_baseline_min: float = 1e-08, random_state: Optional[int] = None, n_jobs: int = 1, store_diagnostics: bool = True)
Bases: sklearn.base.BaseEstimator
Interactive Regression Model (IRM) with cross-fitting using CausalData.
Parameters
data : CausalData
    Data container with outcome, binary treatment (0/1), and confounders.
ml_g : estimator
    Learner for E[Y|X,D]. If it is a classifier and Y is binary, predict_proba() is used; otherwise predict().
ml_m : classifier
    Learner for E[D|X] (propensity). Must support predict_proba(), or a predict() that returns values in (0, 1).
n_folds : int, default 5
    Number of cross-fitting folds.
n_rep : int, default 1
    Number of repetitions of sample splitting. Currently only 1 is supported.
normalize_ipw : bool, default False
    Whether to normalize IPW terms within the score. Applied to ATE only. For ATTE, normalization is ignored to preserve the canonical ATTE EIF.
trimming_rule : {"truncate"}, default "truncate"
    Trimming approach for propensity scores.
trimming_threshold : float, default 1e-2
    Threshold for trimming if rule is "truncate".
weights : Optional[np.ndarray or Dict], default None
    Optional weights.
    - If array of shape (n,), used as ATE weights (w). Assumed E[w|X] = w.
    - If dict, can contain 'weights' (w) and 'weights_bar' (E[w|X]).
    - For ATTE, computed internally (w=D/P(D=1), w_bar=m(X)/P(D=1)).
    Note: If weights depend on treatment or outcome, E[w|X] must be provided for correct sensitivity analysis.
relative_baseline_min : float, default 1e-8
    Minimum absolute baseline value used for relative effects. If |mu_c| is below this threshold, relative estimates are set to NaN with a warning.
random_state : Optional[int], default None
    Random seed for fold creation.
n_jobs : int, default 1
    Number of parallel jobs for fold-level cross-fitting. Use ``-1`` to use all available CPUs. Practical guidance:
    - Start with ``n_jobs=1`` for stable, low-contention defaults.
    - Increase to ``n_jobs=2/4/-1`` when cross-fitting is the bottleneck.
    - If nuisance learners are already multithreaded (e.g. CatBoost with ``thread_count=-1``), keep ``n_jobs=1`` or set learner threads to ``1`` to avoid CPU oversubscription.
    - On shared machines, prefer a bounded value (for example ``2`` or ``4``) instead of ``-1``.
store_diagnostics : bool, default True
    Whether to retain raw fit-time arrays and diagnostic-only artifacts on the fitted model. Set to ``False`` for a lighter-weight estimator that still supports effect estimation while retaining only immutable outcome and treatment snapshots. In lightweight mode the estimator no longer keeps the confounder matrix, raw propensities, or fold assignments in memory after ``fit()``.
Examples
>>> from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
>>> from causalis.dgp import obs_linear_26_dataset
>>> from causalis.scenarios.unconfoundedness.model import IRM
>>> data = obs_linear_26_dataset(
...     n=1000,
...     seed=3141,
...     include_oracle=False,
...     return_causal_data=True,
... )
>>> ml_g = RandomForestRegressor(
...     n_estimators=200,
...     max_depth=6,
...     min_samples_leaf=5,
...     random_state=3141,
... )
>>> ml_m = RandomForestClassifier(
...     n_estimators=200,
...     max_depth=6,
...     min_samples_leaf=5,
...     random_state=3141,
... )
>>> irm = IRM(data=data, ml_g=ml_g, ml_m=ml_m, n_folds=3, random_state=3141)
>>> ate = irm.fit().estimate(score="ATE")
>>> ate.summary()  # doctest: +SKIP
>>> atte = irm.estimate(score="ATTE")
>>> atte.value  # doctest: +SKIP
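The effect of ``trimming_rule="truncate"`` can be illustrated in isolation. The snippet below is a minimal NumPy sketch of truncation at a threshold, not the causalis code; ``truncate_propensity`` is a hypothetical helper name introduced only for this illustration.

```python
import numpy as np

def truncate_propensity(m_hat, threshold=0.01):
    """Clip propensity scores into [threshold, 1 - threshold] (hypothetical helper)."""
    if not 0.0 < threshold < 0.5:
        raise ValueError("threshold must lie in (0, 0.5)")
    return np.clip(np.asarray(m_hat, dtype=float), threshold, 1.0 - threshold)

# Raw propensities, including values at or near the boundaries.
m_hat = np.array([0.0, 0.004, 0.30, 0.997, 1.0])
m_trimmed = truncate_propensity(m_hat, threshold=0.01)

# After truncation the inverse-probability factor 1 / m is bounded by 1 / threshold.
max_ipw = np.max(1.0 / m_trimmed)
```

Truncation keeps every observation in the sample and only bounds the inverse-probability factors, which is why a smaller threshold allows larger weights.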
Notes
The IRM model targets binary-treatment causal effects under unconfoundedness. Let :math:`W = (Y, D, X)` with :math:`D \in \{0, 1\}` and define

.. math::

    g_0(d, x) = \mathbb{E}[Y \mid D=d, X=x], \qquad m_0(x) = \mathbb{P}(D=1 \mid X=x).

Under conditional ignorability and overlap,

.. math::

    (Y(0), Y(1)) \perp D \mid X, \qquad 0 < m_0(X) < 1 \ \text{a.s.},

the target functionals are identified as

.. math::

    \theta_0^{ATE} = \mathbb{E}[g_0(1, X) - g_0(0, X)]

and

.. math::

    \theta_0^{ATTE} = \mathbb{E}[g_0(1, X) - g_0(0, X) \mid D=1].

This implementation cross-fits three nuisance objects: :math:`\hat g_1(x) \approx \mathbb{E}[Y \mid D=1, X=x]`, :math:`\hat g_0(x) \approx \mathbb{E}[Y \mid D=0, X=x]`, and :math:`\hat m(x) \approx \mathbb{P}(D=1 \mid X=x)`. Propensities are trimmed via

.. math::

    \tilde m(x) = \min\{1-\varepsilon, \max(\hat m(x), \varepsilon)\},

where :math:`\varepsilon =` ``trimming_threshold``.

Estimation solves the sample moment equation

.. math::

    \mathbb{E}_n[\psi_a(W_i; \hat\eta)\theta + \psi_b(W_i; \hat\eta)] = 0,

giving the closed-form estimator

.. math::

    \hat\theta = -\frac{\mathbb{E}_n[\psi_b(W_i; \hat\eta)]} {\mathbb{E}_n[\psi_a(W_i; \hat\eta)]}.

For both ATE and ATTE, the orthogonal score component used here is

.. math::

    \psi_b = w \, (\hat g_1(X) - \hat g_0(X)) + \bar w \left[ (Y - \hat g_1(X)) \frac{D}{\tilde m(X)} - (Y - \hat g_0(X)) \frac{1-D}{1-\tilde m(X)} \right].

The score derivative differs by estimand:

.. math::

    \psi_a = -1 \quad \text{for ATE}, \qquad \psi_a = -w \quad \text{for ATTE}.

The corresponding weights are

.. math::

    w = \bar w = 1 \quad \text{for unweighted ATE},

while for ``ATTE`` this implementation uses normalized treated weights

.. math::

    w_i = \frac{D_i}{\mathbb{E}_n[D]}, \qquad \bar w_i = \frac{\tilde m(X_i)}{\mathbb{E}_n[D]}.

If ``normalize_ipw=True``, the inverse-probability factors :math:`D / \tilde m(X)` and :math:`(1-D) / (1-\tilde m(X))` are additionally stabilized by their sample means (a Hajek-style normalization). This option is applied to ATE only; for ATTE it is intentionally ignored to preserve the canonical ATTE efficient influence function used by the estimator.

Initialization
Initialize the estimator and validate configuration options.
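The score formulas from the Notes can be traced numerically. The following is a minimal NumPy sketch under stated assumptions, not the causalis implementation: ``irm_theta`` is a hypothetical function, the nuisance arrays are passed in directly (here as oracle values from a simulated design), and the standard error uses the usual plug-in variance :math:`\mathbb{E}_n[\psi^2] / (\mathbb{E}_n[\psi_a])^2 / n`, which is an assumption about how uncertainty is reported.

```python
import numpy as np

def irm_theta(y, d, g0, g1, m, score="ATE", eps=0.01, normalize_ipw=False):
    """Solve E_n[psi_a * theta + psi_b] = 0 for the ATE or ATTE score (sketch)."""
    y, d, g0, g1, m = (np.asarray(a, dtype=float) for a in (y, d, g0, g1, m))
    m = np.clip(m, eps, 1.0 - eps)        # truncation trimming of propensities
    h1 = d / m                            # treated inverse-probability factor
    h0 = (1.0 - d) / (1.0 - m)            # control inverse-probability factor
    if score == "ATE":
        w = np.ones_like(y)
        w_bar = np.ones_like(y)
        if normalize_ipw:                 # Hajek-style: divide by sample means
            h1 = h1 / h1.mean()
            h0 = h0 / h0.mean()
        psi_a = -np.ones_like(y)
    elif score == "ATTE":
        p_treated = d.mean()              # E_n[D]
        w = d / p_treated
        w_bar = m / p_treated
        psi_a = -w
    else:
        raise ValueError("score must be 'ATE' or 'ATTE'")
    psi_b = w * (g1 - g0) + w_bar * ((y - g1) * h1 - (y - g0) * h0)
    theta = -psi_b.mean() / psi_a.mean()
    psi = psi_a * theta + psi_b           # score evaluated at the estimate
    se = np.sqrt(np.mean(psi**2) / psi_a.mean() ** 2 / len(y))
    return theta, se

# Simulated design with a constant treatment effect of 2 (illustrative only).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
m_true = 1.0 / (1.0 + np.exp(-x))
d = rng.binomial(1, m_true).astype(float)
g0_true = x
g1_true = x + 2.0
y = np.where(d == 1, g1_true, g0_true) + rng.normal(size=n)

ate, ate_se = irm_theta(y, d, g0_true, g1_true, m_true, score="ATE")
ate_hajek, _ = irm_theta(y, d, g0_true, g1_true, m_true, normalize_ipw=True)
atte, atte_se = irm_theta(y, d, g0_true, g1_true, m_true, score="ATTE")
```

With a constant effect, the ATE and ATTE estimates should both land near 2; in real use the oracle arrays would be replaced by cross-fitted predictions.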
- fit(data: Optional[causalis.dgp.causaldata.CausalData] = None, *, store_diagnostics: Optional[bool] = None) -> causalis.scenarios.unconfoundedness.model.IRM
Fit nuisance models via cross-fitting.
Parameters
data : Optional[CausalData], default None
    CausalData container. If None, uses self.data.
store_diagnostics : Optional[bool], default None
    Optional override for whether the fitted model should retain diagnostics-oriented arrays and expose diagnostic payloads from subsequent ``estimate()`` calls. Outcome and treatment snapshots are always retained to keep post-fit estimation deterministic.
Returns
self : IRM
    Fitted estimator.
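Cross-fitting means that each observation's nuisance predictions come from models trained on the other folds. The sketch below shows the idea with NumPy only. It is not the causalis code: ``fit_ols``, ``fit_logit``, and ``cross_fit`` are hypothetical helpers, and simple linear and logistic learners stand in for whatever ``ml_g`` and ``ml_m`` would be.

```python
import numpy as np

def _design(X):
    """Prepend an intercept column."""
    return np.column_stack([np.ones(len(X)), X])

def fit_ols(X, y):
    """Least-squares outcome learner; returns a prediction function."""
    beta, *_ = np.linalg.lstsq(_design(X), y, rcond=None)
    return lambda X_new: _design(X_new) @ beta

def fit_logit(X, d, n_iter=25, ridge=1e-6):
    """Logistic propensity learner fitted by Newton-Raphson steps."""
    Xb = _design(X)
    beta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ beta))
        grad = Xb.T @ (d - p) - ridge * beta
        hess = (Xb * (p * (1.0 - p))[:, None]).T @ Xb + ridge * np.eye(len(beta))
        beta = beta + np.linalg.solve(hess, grad)
    return lambda X_new: 1.0 / (1.0 + np.exp(-_design(X_new) @ beta))

def cross_fit(X, y, d, n_folds=5, seed=0):
    """Return out-of-fold predictions for g0, g1, m and the fold labels."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = rng.permutation(n) % n_folds      # balanced random fold labels
    g0_hat, g1_hat, m_hat = np.empty(n), np.empty(n), np.empty(n)
    for k in range(n_folds):
        test = folds == k
        train = ~test
        tr0 = train & (d == 0)                # controls in the training folds
        tr1 = train & (d == 1)                # treated in the training folds
        g0_hat[test] = fit_ols(X[tr0], y[tr0])(X[test])
        g1_hat[test] = fit_ols(X[tr1], y[tr1])(X[test])
        m_hat[test] = fit_logit(X[train], d[train])(X[test])
    return g0_hat, g1_hat, m_hat, folds

# Simulated data with a treatment effect of 2 (illustrative only).
rng = np.random.default_rng(1)
n = 1000
X = rng.normal(size=(n, 2))
d = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * X[:, 0]))).astype(float)
y = 1.0 + X[:, 0] - X[:, 1] + 2.0 * d + rng.normal(size=n)
g0_hat, g1_hat, m_hat, folds = cross_fit(X, y, d, n_folds=5, seed=0)
```

The outcome models are trained separately on treated and control rows, which matches the separate nuisance objects for g0 and g1 described in the Notes.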
- estimate(score: str = 'ATE', alpha: float = 0.05, groups: Optional[pandas.DataFrame | pandas.Series] = None, cov_type: str = 'HC3', cov_kwds: Optional[Dict[str, Any]] = None) -> causalis.data_contracts.causal_estimate.CausalEstimate | causalis.data_contracts.gate_estimate.GateEstimate
Compute treatment effects using stored nuisance predictions.
Parameters
score : {"ATE", "ATTE", "GATE", "GATET"}, default "ATE"
    Target estimand.
alpha : float, default 0.05
    Significance level for intervals. Diagnostic payloads are included only when the model was fitted with ``store_diagnostics=True``.
groups : Optional[pd.DataFrame | pd.Series], default None
    Group labels/indicators for ``score="GATE"`` or ``score="GATET"``. If None, falls back to ``self.data.gate_groups`` when present. GATE/GATET require ``CausalData.user_id`` and align groups to those fit-time observation ids. Row-indexed groups are also accepted, but only when the fit-time row-to-``user_id`` mapping is still unchanged.
cov_type : {"HC0", "HC1", "HC2", "HC3"}, default "HC3"
    Robust covariance type for ``score="GATE"``/``score="GATET"`` inference.
cov_kwds : Optional[Dict[str, Any]], default None
    Additional covariance keyword arguments requested for subgroup inference. These are currently ignored because GATE/GATET use closed-form HCx covariance formulas rather than delegating to statsmodels.
Returns
CausalEstimate or GateEstimate
    Result container for the estimated effect. For subgroup scores, the returned ``GateEstimate`` supports ``summary()`` for subgroup-vs-zero inference, ``contrast(...)`` for formal group-vs-group tests, and ``pairwise_summary(...)`` for a broader comparison table.
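To make the closed-form HCx idea concrete, here is a generic textbook HC3 sandwich estimator applied to a regression of a per-observation orthogonal signal on a group dummy basis. It is a sketch and is not taken from the causalis source; ``gate_hc3`` is a hypothetical function and the signal is simulated.

```python
import numpy as np

def gate_hc3(orth_signal, basis):
    """OLS of the signal on a dummy basis with an HC3 covariance (sketch)."""
    X = np.asarray(basis, dtype=float)
    y = np.asarray(orth_signal, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    # Diagonal of the hat matrix X (X'X)^{-1} X'.
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)
    omega = (resid / (1.0 - leverage)) ** 2     # HC3 residual weights
    cov = xtx_inv @ (X.T * omega) @ X @ xtx_inv
    return coef, np.sqrt(np.diag(cov))

# Two disjoint groups with different mean signals (illustrative only).
rng = np.random.default_rng(2)
labels = rng.integers(0, 2, size=400)
basis = np.column_stack([labels == 0, labels == 1])
signal = np.where(labels == 1, 3.0, 1.0) + rng.normal(size=400)
coef, se = gate_hc3(signal, basis)
```

With a disjoint dummy basis, each coefficient equals the within-group mean of the signal, so the regression form is mainly a convenient way to obtain a joint robust covariance.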
- property diagnostics_: Dict[str, Any]
Return diagnostic data.
Returns
dict
    Dictionary containing 'm_hat', 'g0_hat', 'g1_hat', and 'folds'.
- property coef: numpy.ndarray
Return the estimated coefficient.
Returns
np.ndarray
    The estimated coefficient.
- property se: numpy.ndarray
Return the standard error of the estimate.
Returns
np.ndarray
    The standard error.
- property pvalues: numpy.ndarray
Return the p-values for the estimate.
Returns
np.ndarray
    The p-values.
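The relation between a coefficient, its standard error, and a p-value can be sketched as follows. This assumes a two-sided Wald test under a normal approximation, which is typical for this family of estimators but is not confirmed by this page; ``two_sided_pvalue`` is a hypothetical helper.

```python
from statistics import NormalDist

def two_sided_pvalue(coef, se, h0=0.0):
    """Two-sided p-value for H0: theta = h0 under a normal approximation."""
    z = (coef - h0) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

p_null = two_sided_pvalue(0.0, 1.0)     # estimate exactly at the null
p_196 = two_sided_pvalue(1.96, 1.0)     # roughly the 5% critical value
```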
- property summary: pandas.DataFrame
Return a summary DataFrame of the results.
Returns
pd.DataFrame
    The results summary.
- property orth_signal: numpy.ndarray
Return the cross-fitted orthogonal signal (psi_b).
Returns
np.ndarray
    The orthogonal signal.
- gate(groups: pandas.DataFrame | pandas.Series, alpha: float = 0.05, cov_type: str = 'HC3', cov_kwds: Optional[Dict[str, Any]] = None) -> causalis.data_contracts.gate_estimate.GateEstimate
Convenience wrapper for ``estimate(score="GATE", ...)``.
Parameters
groups : pd.DataFrame or pd.Series
    Subgroup labels or a strict dummy basis. GATE requires ``CausalData.user_id`` and aligns groups to those fit-time observation ids.
alpha : float, default 0.05
    Significance level for confidence intervals.
cov_type : {"HC0", "HC1", "HC2", "HC3"}, default "HC3"
    Robust covariance type for subgroup inference.
cov_kwds : Optional[Dict[str, Any]], default None
    Additional covariance keyword arguments requested by the caller. These are currently ignored by the closed-form GATE implementation.
Returns
GateEstimate
    Estimated subgroup effects and diagnostics. The returned result also supports ``contrast(...)`` and ``pairwise_summary(...)`` for formal post-estimation group comparisons.
- gatet(groups: pandas.DataFrame | pandas.Series, alpha: float = 0.05, cov_type: str = 'HC3', cov_kwds: Optional[Dict[str, Any]] = None) -> causalis.data_contracts.gate_estimate.GateEstimate
Convenience wrapper for ``estimate(score="GATET", ...)``.
- sensitivity_analysis(r2_y: float, r2_d: float, rho: float = 1.0, H0: float = 0.0, alpha: float = 0.05) -> causalis.scenarios.unconfoundedness.model.IRM
Compute a sensitivity analysis following Chernozhukov et al. (2022).
Parameters
r2_y : float
    Sensitivity parameter for outcome equation (R^2 form, R_Y^2; converted to odds form internally).
r2_d : float
    Sensitivity parameter for treatment equation (R^2 form, R_D^2).
rho : float, default 1.0
    Correlation between unobserved components.
H0 : float, default 0.0
    Null hypothesis for robustness values.
alpha : float, default 0.05
    Significance level for CI bounds.
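The ``r2_y`` description mentions a conversion from R^2 form to odds form. Assuming "odds form" carries its usual meaning, R^2 / (1 - R^2), the conversion looks like the sketch below. ``r2_to_odds`` is a hypothetical helper, and the exact internal formula used by ``sensitivity_analysis`` is not shown on this page.

```python
def r2_to_odds(r2):
    """Map an R^2-type parameter in [0, 1) to r2 / (1 - r2) (assumed odds form)."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return r2 / (1.0 - r2)

odds_small = r2_to_odds(0.2)   # a confounder explaining 20% of residual variation
odds_half = r2_to_odds(0.5)    # 50% maps to even odds
```

The odds scale grows without bound as R^2 approaches 1, so small changes in a large ``r2_y`` or ``r2_d`` can imply a much stronger assumed confounder.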
- confint(alpha: float = 0.05) -> pandas.DataFrame
Compute confidence intervals for the estimated coefficient.
Parameters
alpha : float, default 0.05
    Significance level.
Returns
pd.DataFrame
    DataFrame with confidence intervals.
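For intuition about how ``alpha`` maps to interval width, the sketch below builds a symmetric Wald interval from a coefficient and its standard error. It assumes a normal approximation, which this page does not state explicitly, and ``wald_confint`` is a hypothetical helper rather than part of the API.

```python
from statistics import NormalDist

def wald_confint(coef, se, alpha=0.05):
    """Return (lower, upper) for a (1 - alpha) normal-approximation interval."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # two-sided critical value
    return coef - z * se, coef + z * se

lower, upper = wald_confint(2.0, 0.5, alpha=0.05)
narrow_lower, narrow_upper = wald_confint(2.0, 0.5, alpha=0.32)
```

A larger ``alpha`` gives a narrower interval, since less coverage is demanded.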