causalis.scenarios.multi_unconfoundedness.model¶
Module Contents¶
Classes¶
MultiTreatmentIRM | Interactive Regression Model (IRM) for multi-treatment unconfoundedness.
API¶
- class causalis.scenarios.multi_unconfoundedness.model.MultiTreatmentIRM(data: Optional[causalis.data_contracts.multicausaldata.MultiCausalData] = None, ml_g: Any = None, ml_m: Any = None, *, n_folds: int = 5, n_rep: int = 1, normalize_ipw: bool = False, trimming_rule: str = 'truncate', trimming_threshold: float = 0.01, random_state: Optional[int] = None, n_jobs: int = 1, store_diagnostics: bool = True)¶
Bases: sklearn.base.BaseEstimator
Interactive Regression Model (IRM) for multi-treatment unconfoundedness.
DoubleML-style cross-fitting estimator consuming ``MultiCausalData`` and producing pairwise contrasts between each active treatment arm and the baseline arm (column 0). The model supports ``K >= 2`` mutually exclusive treatment arms encoded as one-hot columns.
Parameters
data : MultiCausalData
    Data container with outcome, one-hot treatment indicators, and confounders.
ml_g : estimator
    Learner for :math:`\mathbb{E}[Y \mid X, D=k]`. If it is a classifier and ``Y`` is binary, ``predict_proba`` is used; otherwise ``predict()`` is used.
ml_m : classifier
    Learner for the generalized propensity score :math:`\mathbb{P}(D=k \mid X)`. Must support ``predict_proba()``.
n_folds : int, default 5
    Number of cross-fitting folds.
n_rep : int, default 1
    Number of repetitions of sample splitting. Currently only 1 is supported.
normalize_ipw : bool, default False
    Whether to normalize inverse-probability terms within the score. Applied to ``score="ATE"`` only. For ``score="ATTE"``, normalization is ignored to preserve the canonical orthogonal ATTE score used by the estimator.
trimming_rule : {"truncate"}, default "truncate"
    Trimming approach for propensity scores.
trimming_threshold : float, default 1e-2
    Lower threshold used before renormalizing multiclass propensities back to the simplex.
random_state : Optional[int], default None
    Random seed for fold creation.
n_jobs : int, default 1
    Number of parallel jobs for fold-level cross-fitting. Use ``-1`` to use all available CPUs. Practical guidance:

    - Start with ``n_jobs=1`` for stable, low-contention defaults.
    - Increase to ``n_jobs=2/4/-1`` when cross-fitting is the bottleneck.
    - If nuisance learners are already multithreaded (e.g. CatBoost with ``thread_count=-1``), keep ``n_jobs=1`` or set learner threads to ``1`` to avoid CPU oversubscription.
store_diagnostics : bool, default True
    Whether to retain raw fit-time arrays and diagnostic-only artifacts on the fitted model. Set to ``False`` for a lighter-weight estimator that still supports effect estimation while omitting heavier caches such as confounders, raw propensities, and fold assignments.
Examples
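>>> from sklearn.linear_model import LinearRegression, LogisticRegression
>>> from causalis.dgp.multicausaldata import generate_multi_dml_cx_26
>>> from causalis.scenarios.multi_unconfoundedness.model import MultiTreatmentIRM
>>> # Simulate a multi-arm dataset wrapped in a MultiCausalData container.
>>> data = generate_multi_dml_cx_26(
...     n=1000,
...     seed=3141,
...     return_causal_data=True,
... )
>>> # ml_g models the outcome; ml_m models the propensities and needs predict_proba().
>>> irm = MultiTreatmentIRM(
...     data=data,
...     ml_g=LinearRegression(),
...     ml_m=LogisticRegression(max_iter=2000),
...     n_folds=3,
...     random_state=3141,
... )
>>> # Fit the nuisances once, then request either estimand from the same fitted model.
>>> res_ate = irm.fit().estimate(score="ATE")
>>> res_ate.summary()  # doctest: +SKIP
>>> res_atte = irm.estimate(score="ATTE")
>>> res_atte.value  # doctest: +SKIP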
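To make the roles of ``n_folds`` and ``random_state`` more concrete, the following is a minimal NumPy sketch of cross-fitting for arm-specific outcome nuisances. It is an illustration only, not the library's internal code: the helper name ``cross_fit_arm_means`` is hypothetical, treatment is passed as integer labels rather than one-hot columns, and the "learner" is simply the training-fold mean of ``Y`` within each arm, standing in for ``ml_g``.

```python
import numpy as np


def cross_fit_arm_means(y, d, n_folds=5, random_state=None):
    """Out-of-fold arm means: a toy stand-in for cross-fitted g_hat (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=int)
    n, n_arms = y.shape[0], int(d.max()) + 1
    rng = np.random.default_rng(random_state)

    # Shuffle the indices once, then deal them into n_folds nearly equal folds.
    folds = np.empty(n, dtype=int)
    folds[rng.permutation(n)] = np.arange(n) % n_folds

    g_hat = np.empty((n, n_arms))
    for f in range(n_folds):
        test = folds == f
        train = ~test
        for k in range(n_arms):
            # "Fit" on the training folds only, "predict" on the held-out fold.
            g_hat[test, k] = y[train & (d == k)].mean()
    return g_hat, folds


# Toy data: three arms, arm k shifts the outcome by k.
rng = np.random.default_rng(0)
d = rng.integers(0, 3, size=300)
y = 1.0 * d + rng.normal(size=300)
g_hat, folds = cross_fit_arm_means(y, d, n_folds=3, random_state=3141)
```

Every row of ``g_hat`` is produced by a model that never saw that row, which is the property cross-fitting is meant to guarantee. Fixing ``random_state`` makes the fold assignment, and therefore the predictions, reproducible.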
Notes
Let :math:`W = (Y, D, X)` where :math:`D \in \{0, 1, \dots, K-1\}` and arm :math:`0` is the designated baseline. Define the arm-specific outcome regressions and generalized propensity scores as

.. math::

    g_{0, k}(x) = \mathbb{E}[Y \mid D=k, X=x], \qquad m_{0, k}(x) = \mathbb{P}(D=k \mid X=x).

Under multi-arm unconfoundedness and overlap,

.. math::

    (Y(0), \dots, Y(K-1)) \perp D \mid X, \qquad 0 < m_{0, k}(X) < 1 \ \text{a.s. for all } k,

the pairwise baseline ATE for arm :math:`k > 0` is

.. math::

    \theta_{0, k}^{ATE} = \mathbb{E}[g_{0, k}(X) - g_{0, 0}(X)].

The corresponding pairwise ATTE conditions on membership in arm :math:`k`:

.. math::

    \theta_{0, k}^{ATTE} = \mathbb{E}[g_{0, k}(X) - g_{0, 0}(X) \mid D=k].

This implementation cross-fits all arm-specific outcome nuisances :math:`\hat g_k(X)` and all class propensities :math:`\hat m_k(X)`. The propensity vector is lower-trimmed componentwise and then renormalized onto the simplex so that each row still sums to one.

Estimation solves the sample moment equation

.. math::

    \mathbb{E}_n[\psi_a(W_i; \hat\eta)\theta_k + \psi_{b, k}(W_i; \hat\eta)] = 0,

which yields the closed-form estimate

.. math::

    \hat\theta_k = -\frac{\mathbb{E}_n[\psi_{b, k}(W_i; \hat\eta)]}{\mathbb{E}_n[\psi_a(W_i; \hat\eta)]}.

For the pairwise ATE, the score component for each active arm :math:`k > 0` is

.. math::

    \psi_{b, k}^{ATE} = \hat g_k(X) - \hat g_0(X) + (Y - \hat g_k(X)) \frac{d_k}{\tilde m_k(X)} - (Y - \hat g_0(X)) \frac{d_0}{\tilde m_0(X)},

with :math:`\psi_a = -1`. Here :math:`d_k = 1\{D=k\}` and :math:`\tilde m_k` denotes the trimmed-and-renormalized propensity for arm :math:`k`.

For the pairwise ATTE, let :math:`p_k = \mathbb{E}[d_k]`. Because :math:`Y(k)` is observed for treated units in arm :math:`k`, the orthogonal (doubly robust) score takes the baseline-regression form

.. math::

    \psi_{b, k}^{ATTE} = \frac{d_k}{p_k} (Y - \hat g_0(X)) - \frac{d_0}{p_k} \frac{\tilde m_k(X)}{\tilde m_0(X)} (Y - \hat g_0(X)).

For ATTE, :math:`\psi_a = -1` in the solved moment equation and the returned estimate object keeps the same shape and fields as for ATE.
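The formulas in these notes translate fairly directly into array operations. The sketch below is one possible NumPy transcription, written under stated assumptions rather than taken from the library: the function names are hypothetical, nuisance predictions are assumed to be ``(n, K)`` arrays with the baseline in column 0, and the demo plugs in the true nuisances of a simulated design instead of cross-fitted learners.

```python
import numpy as np


def trim_and_renormalize(m_hat, threshold=0.01):
    """Clip each propensity from below, then rescale every row to sum to one."""
    m = np.maximum(np.asarray(m_hat, dtype=float), threshold)
    return m / m.sum(axis=1, keepdims=True)


def pairwise_ate(y, d_onehot, g_hat, m_tilde):
    """With psi_a = -1, theta_k is the sample mean of psi_b_k for each arm k > 0."""
    y = np.asarray(y, dtype=float)[:, None]
    # g_k + (Y - g_k) * d_k / m_k for every arm, shape (n, K).
    aipw = g_hat + (y - g_hat) * d_onehot / m_tilde
    psi_b = aipw[:, 1:] - aipw[:, [0]]
    return psi_b.mean(axis=0), psi_b


def pairwise_atte(y, d_onehot, g_hat, m_tilde):
    """Baseline-regression ATTE score: only g_0 enters, weighted by d_k and d_0 * m_k / m_0."""
    y = np.asarray(y, dtype=float)
    resid0 = y - g_hat[:, 0]                # Y - g_0(X)
    p = d_onehot.mean(axis=0)               # p_k = E_n[d_k]
    d0 = d_onehot[:, 0]
    n_arms = d_onehot.shape[1]
    psi_b = np.empty((y.shape[0], n_arms - 1))
    for k in range(1, n_arms):
        weight = d_onehot[:, k] - d0 * m_tilde[:, k] / m_tilde[:, 0]
        psi_b[:, k - 1] = weight * resid0 / p[k]
    return psi_b.mean(axis=0), psi_b


# Simulated design (an assumption for illustration): constant effects 1 and 2.
rng = np.random.default_rng(0)
n, K = 20000, 3
x = rng.normal(size=n)
logits = np.column_stack([np.zeros(n), 0.5 * x, -0.5 * x])
m_true = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
# Draw one arm per unit from its propensity vector.
cum = np.cumsum(m_true, axis=1)[:, :-1]
d = (rng.random(n)[:, None] > cum).sum(axis=1)
d_onehot = np.eye(K)[d]
g_true = x[:, None] + np.array([0.0, 1.0, 2.0])
y = g_true[np.arange(n), d] + rng.normal(size=n)

m_tilde = trim_and_renormalize(m_true, threshold=0.01)
theta_ate, psi_ate = pairwise_ate(y, d_onehot, g_true, m_tilde)
theta_atte, psi_atte = pairwise_atte(y, d_onehot, g_true, m_tilde)
```

Because the simulated effects are constant across units, both estimands equal 1 and 2 for arms 1 and 2, so ``theta_ate`` and ``theta_atte`` should both land close to those values. Note that after renormalization a clipped entry can end up slightly below the threshold, since the row is divided by a sum that exceeds one.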
- fit(data: Optional[causalis.data_contracts.multicausaldata.MultiCausalData] = None, *, store_diagnostics: Optional[bool] = None) → causalis.scenarios.multi_unconfoundedness.model.MultiTreatmentIRM¶
- estimate(score: str = 'ATE', alpha: float = 0.05, diagnostic_data: bool = True) → causalis.data_contracts.multicausal_estimate.MultiCausalEstimate¶
Estimate pairwise baseline contrasts for each active treatment arm.
Parameters
score : {"ATE", "ATTE"}, default "ATE"
    Target estimand. ``"ATE"`` estimates pairwise average treatment effects for each active arm versus baseline arm 0. ``"ATTE"`` estimates the corresponding pairwise average treatment effect on the treated for each active arm versus baseline arm 0 using the orthogonal / doubly robust ATTE score.
alpha : float, default 0.05
    Two-sided significance level used for Wald confidence intervals.
diagnostic_data : bool, default True
    Whether to attach the fitted diagnostic payload to the returned estimate.
Returns
MultiCausalEstimate Result container holding one effect estimate per active arm versus the baseline arm, together with confidence intervals, p-values, relative effects, and optionally diagnostic payloads.
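As a rough guide to how ``alpha`` feeds into the reported intervals, the sketch below computes two-sided Wald confidence intervals and normal-approximation p-values from a vector of estimates and standard errors. The function name ``wald_inference`` and the null value argument ``h0`` are hypothetical, and the p-value formula is the textbook one; the source only states that the intervals are Wald intervals.

```python
import numpy as np
from statistics import NormalDist


def wald_inference(coef, se, alpha=0.05, h0=0.0):
    """Two-sided Wald intervals and p-values under a normal approximation (hypothetical helper)."""
    coef = np.atleast_1d(np.asarray(coef, dtype=float))
    se = np.atleast_1d(np.asarray(se, dtype=float))
    std_normal = NormalDist()

    # Critical value z_{1 - alpha/2}; about 1.96 for alpha = 0.05.
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    ci_lower = coef - z_crit * se
    ci_upper = coef + z_crit * se

    # Two-sided p-value for H0: theta_k = h0, one per active arm.
    z = (coef - h0) / se
    pvalues = np.array([2.0 * (1.0 - std_normal.cdf(abs(v))) for v in z])
    return ci_lower, ci_upper, pvalues


ci_lower, ci_upper, pvalues = wald_inference([1.0, 0.0], [0.5, 0.5], alpha=0.05)
```

A smaller ``alpha`` widens every interval because the critical value grows, while the p-values do not depend on ``alpha`` at all.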
- property diagnostics_: Dict[str, Any]¶
Return diagnostic data.
Returns
dict Dictionary containing ‘m_hat’, ‘g_hat’ and ‘folds’.
- property coef: numpy.ndarray¶
Return the estimated coefficient.
Returns
np.ndarray The estimated coefficient.
- property se: numpy.ndarray¶
Return the standard error of the estimate.
Returns
np.ndarray The standard error.
- property pvalues: numpy.ndarray¶
Return the p-values for the estimate.
Returns
np.ndarray The p-values.
- property summary: pandas.DataFrame¶
Return a summary DataFrame of the results.
Returns
pd.DataFrame The results summary.
- property orth_signal: numpy.ndarray¶
Return the cross-fitted orthogonal signal (psi_b).
Returns
np.ndarray The orthogonal signal.
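One common use of an orthogonal signal is to recover the point estimate and a plug-in standard error. The sketch below assumes ``psi_a = -1`` (so the estimate is the column mean of ``psi_b``) and the usual DoubleML-style variance formula; it also assumes the signal is stored as an ``(n, K-1)`` array with one column per active arm. Neither the array layout nor the exact variance formula used by the library is stated in this page, and ``se_from_orth_signal`` is a hypothetical name.

```python
import numpy as np


def se_from_orth_signal(psi_b):
    """Point estimate and plug-in standard error from psi_b, assuming psi_a = -1."""
    psi_b = np.asarray(psi_b, dtype=float)
    if psi_b.ndim == 1:
        psi_b = psi_b[:, None]
    n = psi_b.shape[0]

    theta = psi_b.mean(axis=0)   # solves E_n[-theta + psi_b] = 0
    psi = psi_b - theta          # centred score (influence values)
    se = np.sqrt((psi ** 2).mean(axis=0) / n)
    return theta, se


# Tiny hand-checkable example with two active arms.
psi_b = np.array([[1.0, 0.0], [3.0, 0.0], [1.0, 2.0], [3.0, 2.0]])
theta, se = se_from_orth_signal(psi_b)
# theta -> [2. 1.], se -> [0.5 0.5]
```

In the toy input every centred value is plus or minus one, so the mean squared score is 1 and the standard error is ``sqrt(1 / 4) = 0.5`` for both arms.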
- sensitivity_analysis(cf_y: Optional[float] = None, r2_d: Any = 0.0, rho: Any = 1.0, H0: float = 0.0, alpha: float = 0.05, *, r2_y: Optional[float] = None) → causalis.scenarios.multi_unconfoundedness.model.MultiTreatmentIRM¶
- confint() → pandas.DataFrame¶