causalis.scenarios.cuped.diagnostics.regression_checks¶
Module Contents¶
Classes¶
RegressionChecks | Lightweight OLS/regression health checks for CUPED diagnostics.
Functions¶
design_matrix_checks | Return rank/conditioning diagnostics for a numeric design matrix.
near_duplicate_corr_pairs | Find pairs with absolute correlation very close to one.
vif_from_corr | Approximate VIF from inverse correlation matrix of standardized covariates.
leverage_and_cooks | Compute leverage, Cook’s distance, and internally studentized residuals.
winsor_fit_tau | Refit OLS on winsorized outcome and return treatment coefficient.
run_regression_checks | Build a compact payload with design, residual, and influence diagnostics.
assumption_design_rank | Check that the design matrix is full rank.
assumption_condition_number | Check global collinearity via condition number.
assumption_near_duplicates | Check near-duplicate centered covariate pairs.
assumption_vif | Check VIF from centered main-effect covariates.
assumption_ate_gap | Check adjusted-vs-naive ATE gap relative to naive SE.
assumption_residual_tails | Check residual extremes using max standardized residual only.
assumption_leverage | Check leverage concentration.
assumption_cooks | Check Cook’s distance influence diagnostics.
assumption_hc23_stability | Check HC2/HC3 stability when leverage terms approach one.
assumption_winsor_sensitivity | Check sensitivity of adjusted ATE to winsorized-outcome refit.
regression_assumption_rows_from_checks | Run all CUPED regression assumption tests and return row payloads.
regression_assumptions_table_from_checks | Return a table of GREEN/YELLOW/RED assumption flags from checks payload.
regression_assumptions_table_from_diagnostic_data | Build assumption table from a CUPEDDiagnosticData payload.
regression_assumptions_table_from_estimate | Build assumptions table from a CUPED estimate.
regression_assumptions_table_from_data | Fit CUPED on CausalData and return the assumptions flag table.
overall_assumption_flag | Return overall GREEN/YELLOW/RED status from an assumptions table.
style_regression_assumptions_table | Return pandas Styler with colored flag cells for notebook display.
Data¶
FLAG_GREEN
FLAG_YELLOW
FLAG_RED
FLAG_LEVEL
FLAG_COLOR
API¶
- causalis.scenarios.cuped.diagnostics.regression_checks.FLAG_GREEN¶
‘GREEN’
- causalis.scenarios.cuped.diagnostics.regression_checks.FLAG_YELLOW¶
‘YELLOW’
- causalis.scenarios.cuped.diagnostics.regression_checks.FLAG_RED¶
‘RED’
- causalis.scenarios.cuped.diagnostics.regression_checks.FLAG_LEVEL¶
None
- causalis.scenarios.cuped.diagnostics.regression_checks.FLAG_COLOR¶
None
- class causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks(/, **data: Any)¶
Bases: pydantic.BaseModel
Lightweight OLS/regression health checks for CUPED diagnostics.
Initialization
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError (pydantic_core.ValidationError) if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- ate_naive: float¶
None
- ate_adj: float¶
None
- ate_gap: float¶
None
- ate_gap_over_se_naive: Optional[float]¶
None
- k: int¶
None
- rank: int¶
None
- full_rank: bool¶
None
- condition_number: float¶
None
- p_main_covariates: int¶
None
- near_duplicate_pairs: list[tuple[str, str, float]]¶
‘Field(…)’
- vif: Optional[Dict[str, float]]¶
None
- resid_scale_mad: float¶
None
- n_std_resid_gt_3: int¶
None
- n_std_resid_gt_4: int¶
None
- max_abs_std_resid: float¶
None
- max_leverage: float¶
None
- leverage_cutoff: float¶
None
- n_high_leverage: int¶
None
- max_cooks: float¶
None
- cooks_cutoff: float¶
None
- n_high_cooks: int¶
None
- min_one_minus_h: float¶
None
- n_tiny_one_minus_h: int¶
None
- winsor_q: Optional[float]¶
None
- ate_adj_winsor: Optional[float]¶
None
- ate_adj_winsor_gap: Optional[float]¶
None
- causalis.scenarios.cuped.diagnostics.regression_checks.design_matrix_checks(design: pandas.DataFrame) tuple[int, int, bool, float]¶
Return rank/conditioning diagnostics for a numeric design matrix.
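To make the idea of rank/conditioning diagnostics concrete, here is a minimal NumPy sketch. It is not the library's implementation: the helper name `design_rank_and_condition` is hypothetical, and the order of the returned tuple (column count, rank, full-rank flag, condition number) is an assumption based on the `k`, `rank`, `full_rank`, and `condition_number` fields of `RegressionChecks`.

```python
import numpy as np

def design_rank_and_condition(design):
    """Hypothetical helper: return (k, rank, full_rank, condition_number)."""
    z = np.asarray(design, dtype=float)
    k = z.shape[1]                          # number of design columns
    rank = int(np.linalg.matrix_rank(z))    # numerical rank from the SVD
    cond = float(np.linalg.cond(z))         # largest / smallest singular value
    return k, rank, rank == k, cond

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
good = np.column_stack([np.ones(200), x])   # intercept plus two covariates
bad = np.column_stack([good, x[:, 0]])      # duplicated column: rank deficient

k_good, rank_good, full_good, cond_good = design_rank_and_condition(good)
k_bad, rank_bad, full_bad, cond_bad = design_rank_and_condition(bad)
```

With the duplicated column, the rank falls below the column count and the condition number becomes extremely large, which is the situation the full-rank and condition-number checks are meant to surface.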
- causalis.scenarios.cuped.diagnostics.regression_checks.near_duplicate_corr_pairs(x: pandas.DataFrame, tol: float, max_pairs: int = 50) list[tuple[str, str, float]]¶
Find pairs with absolute correlation very close to one.
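A near-duplicate search of this kind might look like the sketch below. The helper `near_duplicate_pairs` is hypothetical and works on a plain array plus a list of names rather than a DataFrame. Reading `tol` as "flag a pair when |r| >= 1 - tol" is an assumption, since the source does not state how the tolerance is applied.

```python
import numpy as np

def near_duplicate_pairs(x, names, tol, max_pairs=50):
    """Hypothetical helper: list (name_i, name_j, r) with |r| >= 1 - tol."""
    corr = np.corrcoef(np.asarray(x, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = float(corr[i, j])
            if abs(r) >= 1.0 - tol:         # assumed interpretation of tol
                pairs.append((names[i], names[j], r))
                if len(pairs) >= max_pairs:  # cap the size of the report
                    return pairs
    return pairs

rng = np.random.default_rng(1)
n = 500
a = rng.normal(size=n)
b = 2.0 * a + 1e-6 * rng.normal(size=n)     # almost an exact rescaling of a
c = rng.normal(size=n)                      # unrelated column
pairs = near_duplicate_pairs(np.column_stack([a, b, c]), ["a", "b", "c"], tol=1e-3)
```

Only the `("a", "b", ...)` pair is reported here, because `c` is generated independently of the other two columns.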
- causalis.scenarios.cuped.diagnostics.regression_checks.vif_from_corr(x: pandas.DataFrame) Optional[Dict[str, float]]¶
Approximate VIF from inverse correlation matrix of standardized covariates.
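The diagonal of the inverse correlation matrix gives the variance inflation factors, since entry (j, j) equals 1 / (1 - R_j^2) for the regression of covariate j on the others. The sketch below illustrates that identity with NumPy. The helper `vif_from_correlation` is hypothetical, and returning `None` when inversion fails is an assumption suggested by the `Optional` return type.

```python
import numpy as np

def vif_from_correlation(x, names):
    """Hypothetical helper: VIF per column, or None if inversion fails."""
    corr = np.corrcoef(np.asarray(x, dtype=float), rowvar=False)
    try:
        inv = np.linalg.inv(corr)
    except np.linalg.LinAlgError:
        return None                          # assumed fallback for singular input
    return {name: float(inv[j, j]) for j, name in enumerate(names)}

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)           # strongly correlated with x1
x3 = rng.normal(size=n)                      # unrelated to the others
vif = vif_from_correlation(np.column_stack([x1, x2, x3]), ["x1", "x2", "x3"])

# Cross-check x1 against the textbook definition 1 / (1 - R^2).
others = np.column_stack([np.ones(n), x2, x3])
beta, *_ = np.linalg.lstsq(others, x1, rcond=None)
resid = x1 - others @ beta
r2 = 1.0 - resid @ resid / np.sum((x1 - x1.mean()) ** 2)
vif_x1_direct = 1.0 / (1.0 - r2)
```

The two correlated columns get large VIFs, the independent column stays near one, and the regression-based value for `x1` agrees with the matrix-inverse value.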
- causalis.scenarios.cuped.diagnostics.regression_checks.leverage_and_cooks(y: numpy.ndarray, z: numpy.ndarray, params: numpy.ndarray) tuple[numpy.ndarray, numpy.ndarray, numpy.ndarray]¶
Compute leverage, Cook’s distance, and internally studentized residuals.
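These three quantities follow standard OLS formulas, sketched below with NumPy. This is an illustration, not the library's code: the helper name `leverage_cooks_studentized` is hypothetical, and the clipping of `1 - h` to avoid division by zero is an assumption.

```python
import numpy as np

def leverage_cooks_studentized(y, z, params):
    """Hypothetical helper: return (leverage, Cook's distance, studentized resid)."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    n, k = z.shape
    resid = y - z @ params
    # Thin QR: the hat matrix is Q Q^T, so h_i is the squared norm of row i of Q.
    q, _ = np.linalg.qr(z)
    h = np.sum(q * q, axis=1)
    s2 = resid @ resid / (n - k)                    # residual variance estimate
    one_minus_h = np.clip(1.0 - h, 1e-12, None)     # assumed guard against h = 1
    r = resid / np.sqrt(s2 * one_minus_h)           # internally studentized
    cooks = r**2 * h / (k * one_minus_h)            # Cook's distance
    return h, cooks, r

rng = np.random.default_rng(3)
n = 100
x = rng.normal(size=n)
x[0] = 8.0                                   # far from the other x values
y = 1.0 + 2.0 * x + rng.normal(size=n)
y[0] += 15.0                                 # and off the regression line
z = np.column_stack([np.ones(n), x])
params, *_ = np.linalg.lstsq(z, y, rcond=None)
h, cooks, r = leverage_cooks_studentized(y, z, params)
```

The leverages sum to the number of design columns, and the contaminated first row has both the highest leverage and the largest Cook's distance.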
- causalis.scenarios.cuped.diagnostics.regression_checks.winsor_fit_tau(y: pandas.Series, design: pandas.DataFrame, cov_type: str, use_t_fit: bool, winsor_q: Optional[float]) Optional[float]¶
Refit OLS on winsorized outcome and return treatment coefficient.
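A winsorized refit can be sketched as follows. Several details are assumptions rather than documented behavior: the outcome is clipped symmetrically at the `winsor_q` and `1 - winsor_q` quantiles, the treatment indicator sits in column 1 of the design, and `None` is returned when `winsor_q` is `None`. The helper `winsorized_treatment_coef` is hypothetical and omits `cov_type` and `use_t_fit`, which would affect inference but not the point estimate.

```python
import numpy as np

def winsorized_treatment_coef(y, design, winsor_q, treat_col=1):
    """Hypothetical helper: OLS treatment coefficient on a winsorized outcome."""
    if winsor_q is None:
        return None                                   # assumed: check disabled
    y = np.asarray(y, dtype=float)
    lo, hi = np.quantile(y, [winsor_q, 1.0 - winsor_q])
    y_w = np.clip(y, lo, hi)                          # pull extremes to the quantiles
    beta, *_ = np.linalg.lstsq(np.asarray(design, dtype=float), y_w, rcond=None)
    return float(beta[treat_col])

rng = np.random.default_rng(5)
n = 1000
d = (np.arange(n) % 2).astype(float)          # alternating control/treated
y = 1.0 + 0.5 * d + rng.normal(size=n)        # true effect is 0.5
y[[1, 3, 5, 7, 9]] += 200.0                   # a few extreme treated outcomes
design = np.column_stack([np.ones(n), d])

tau_raw = winsorized_treatment_coef(y, design, winsor_q=0.0)   # no clipping
tau_w = winsorized_treatment_coef(y, design, winsor_q=0.01)
```

In this simulated data the five extreme outcomes inflate the unclipped estimate, while the winsorized refit lands much closer to the true effect. A large gap between the two is what a winsor-sensitivity check would look for.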
- causalis.scenarios.cuped.diagnostics.regression_checks.run_regression_checks(y: pandas.Series, design: pandas.DataFrame, result: Any, result_naive: Any, cov_type: str, use_t_fit: bool, corr_near_one_tol: float, tiny_one_minus_h_tol: float, winsor_q: Optional[float]) causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks¶
Build a compact payload with design, residual, and influence diagnostics.
- causalis.scenarios.cuped.diagnostics.regression_checks.assumption_design_rank(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks) Dict[str, Any]¶
Check that the design matrix is full rank.
- causalis.scenarios.cuped.diagnostics.regression_checks.assumption_condition_number(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, warn_threshold: float = 100000000.0, red_multiplier: float = 100.0) Dict[str, Any]¶
Check global collinearity via condition number.
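The `warn_threshold` and `red_multiplier` parameters suggest a two-level rule. The sketch below shows one plausible reading, not the library's confirmed logic: YELLOW once the condition number reaches `warn_threshold`, RED once it reaches `warn_threshold * red_multiplier` or is not finite. The helper `condition_number_flag` is hypothetical and returns only the flag string, because the keys of the real `Dict[str, Any]` payload are not documented here.

```python
import math

def condition_number_flag(cond, warn_threshold=1e8, red_multiplier=100.0):
    """Hypothetical helper: map a condition number to GREEN/YELLOW/RED."""
    if not math.isfinite(cond) or cond >= warn_threshold * red_multiplier:
        return "RED"        # assumed: red cutoff is a multiple of the warn cutoff
    if cond >= warn_threshold:
        return "YELLOW"
    return "GREEN"

flags = [condition_number_flag(c) for c in (1e3, 5e8, 1e11, math.inf)]
```

Under this reading, the defaults place the YELLOW band between 1e8 and 1e10.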
- causalis.scenarios.cuped.diagnostics.regression_checks.assumption_near_duplicates(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, red_pairs_threshold: int = 3) Dict[str, Any]¶
Check near-duplicate centered covariate pairs.
- causalis.scenarios.cuped.diagnostics.regression_checks.assumption_vif(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, warn_threshold: float = 20.0, red_multiplier: float = 2.0) Dict[str, Any]¶
Check VIF from centered main-effect covariates.
- causalis.scenarios.cuped.diagnostics.regression_checks.assumption_ate_gap(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, yellow_threshold: float = 2.0, red_threshold: float = 2.5) Dict[str, Any]¶
Check adjusted-vs-naive ATE gap relative to naive SE.
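One way to read this check is as a comparison of the shift between the two estimates against the naive standard error. The sketch below is an assumption-laden illustration: the helper `ate_gap_flag` is hypothetical, the gap is taken as adjusted minus naive, the absolute ratio is compared to the thresholds, and a missing or non-positive SE yields `None` for both ratio and flag (mirroring the `Optional` type of `ate_gap_over_se_naive`).

```python
def ate_gap_flag(ate_naive, ate_adj, se_naive,
                 yellow_threshold=2.0, red_threshold=2.5):
    """Hypothetical helper: return (gap, gap_over_se, flag)."""
    gap = ate_adj - ate_naive                 # assumed sign convention
    if se_naive is None or not se_naive > 0:
        return gap, None, None                # assumed: ratio undefined
    ratio = abs(gap) / se_naive
    if ratio >= red_threshold:
        flag = "RED"
    elif ratio >= yellow_threshold:
        flag = "YELLOW"
    else:
        flag = "GREEN"
    return gap, ratio, flag

small = ate_gap_flag(1.0, 1.10, 0.1)   # shift of about one naive SE
medium = ate_gap_flag(1.0, 1.22, 0.1)  # shift of about 2.2 naive SEs
large = ate_gap_flag(1.0, 1.30, 0.1)   # shift of about three naive SEs
```

The intuition is that covariate adjustment should mainly reduce variance, so an adjusted estimate that moves several naive standard errors away from the unadjusted one deserves a closer look.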
- causalis.scenarios.cuped.diagnostics.regression_checks.assumption_residual_tails(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, yellow_abs_std_resid: float = 7.0, red_abs_std_resid: float = 10.0) Dict[str, Any]¶
Check residual extremes using max standardized residual only.
- causalis.scenarios.cuped.diagnostics.regression_checks.assumption_leverage(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, yellow_multiplier: float = 5.0, red_multiplier: float = 10.0, red_floor: float = 0.5) Dict[str, Any]¶
Check leverage concentration.
- causalis.scenarios.cuped.diagnostics.regression_checks.assumption_cooks(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, yellow_threshold: float = 0.1, red_threshold: float = 1.0) Dict[str, Any]¶
Check Cook’s distance influence diagnostics.
- causalis.scenarios.cuped.diagnostics.regression_checks.assumption_hc23_stability(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, cov_type: str, tiny_one_minus_h_tol: float = 1e-08) Dict[str, Any]¶
Check HC2/HC3 stability when leverage terms approach one.
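HC2 and HC3 covariance estimators divide squared residuals by `1 - h_i` and `(1 - h_i)^2` respectively, so an observation with leverage close to one makes them numerically unstable. The sketch below computes the two summaries that correspond to the `min_one_minus_h` and `n_tiny_one_minus_h` fields. The helper `one_minus_h_summary` is hypothetical, and counting with a strict `<` against the tolerance is an assumption.

```python
import numpy as np

def one_minus_h_summary(z, tol=1e-8):
    """Hypothetical helper: return (min(1 - h), count of 1 - h below tol)."""
    q, _ = np.linalg.qr(np.asarray(z, dtype=float))   # hat matrix is Q Q^T
    one_minus_h = 1.0 - np.sum(q * q, axis=1)
    return float(one_minus_h.min()), int(np.sum(one_minus_h < tol))

rng = np.random.default_rng(4)
n = 50
x = rng.normal(size=n)
dummy = np.zeros(n)
dummy[0] = 1.0                        # indicator for a single observation
z = np.column_stack([np.ones(n), x, dummy])
min_one_minus_h, n_tiny = one_minus_h_summary(z)
```

A column that is nonzero for exactly one row forces that row to be fitted perfectly, so its leverage is one and its `1 - h` term is zero up to rounding.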
- causalis.scenarios.cuped.diagnostics.regression_checks.assumption_winsor_sensitivity(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, winsor_reference_se: Optional[float] = None, yellow_sigma: float = 1.0, red_sigma: float = 2.0, yellow_ratio: float = 0.1, red_ratio: float = 0.25) Dict[str, Any]¶
Check sensitivity of adjusted ATE to winsorized-outcome refit.
- causalis.scenarios.cuped.diagnostics.regression_checks.regression_assumption_rows_from_checks(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, cov_type: str = 'HC2', condition_number_warn_threshold: float = 100000000.0, vif_warn_threshold: float = 20.0, tiny_one_minus_h_tol: float = 1e-08, winsor_reference_se: Optional[float] = None) list[Dict[str, Any]]¶
Run all CUPED regression assumption tests and return row payloads.
- causalis.scenarios.cuped.diagnostics.regression_checks.regression_assumptions_table_from_checks(checks: causalis.scenarios.cuped.diagnostics.regression_checks.RegressionChecks, cov_type: str = 'HC2', condition_number_warn_threshold: float = 100000000.0, vif_warn_threshold: float = 20.0, tiny_one_minus_h_tol: float = 1e-08, winsor_reference_se: Optional[float] = None) pandas.DataFrame¶
Return a table of GREEN/YELLOW/RED assumption flags from checks payload.
- causalis.scenarios.cuped.diagnostics.regression_checks.regression_assumptions_table_from_diagnostic_data(diagnostic_data: causalis.data_contracts.causal_diagnostic_data.CUPEDDiagnosticData, cov_type: str = 'HC2', condition_number_warn_threshold: float = 100000000.0, vif_warn_threshold: float = 20.0, tiny_one_minus_h_tol: float = 1e-08, winsor_reference_se: Optional[float] = None) pandas.DataFrame¶
Build assumption table from a CUPEDDiagnosticData payload.
- causalis.scenarios.cuped.diagnostics.regression_checks.regression_assumptions_table_from_estimate(data_or_estimate: causalis.dgp.causaldata.CausalData | causalis.data_contracts.causal_estimate.CausalEstimate, estimate: Optional[causalis.data_contracts.causal_estimate.CausalEstimate] = None, style_regression_assumptions_table: Optional[Callable[[pandas.DataFrame], Any]] = None, cov_type: Optional[str] = None, condition_number_warn_threshold: float = 100000000.0, vif_warn_threshold: float = 20.0, tiny_one_minus_h_tol: float = 1e-08) Any¶
Build assumptions table from a CUPED estimate.
Supports both call styles:
regression_assumptions_table_from_estimate(estimate, ...)
regression_assumptions_table_from_estimate(data, estimate, ...)
- causalis.scenarios.cuped.diagnostics.regression_checks.regression_assumptions_table_from_data(data: causalis.dgp.causaldata.CausalData, covariates: Sequence[str], model_kwargs: Optional[Dict[str, Any]] = None, fit_kwargs: Optional[Dict[str, Any]] = None) pandas.DataFrame¶
Fit CUPED on CausalData and return the assumptions flag table.
- causalis.scenarios.cuped.diagnostics.regression_checks.overall_assumption_flag(table: pandas.DataFrame) str¶
Return overall GREEN/YELLOW/RED status from an assumptions table.
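A natural way to aggregate per-check flags is to report the most severe one. The sketch below assumes that rule; the source does not confirm it. The helper `worst_flag` and the `SEVERITY` mapping are hypothetical, the function takes a plain iterable of flag strings instead of a DataFrame because the table's column names are not documented here, and returning GREEN for an empty input is also an assumption.

```python
SEVERITY = {"GREEN": 0, "YELLOW": 1, "RED": 2}   # hypothetical severity ordering

def worst_flag(flags):
    """Hypothetical helper: return the most severe flag in an iterable."""
    flags = list(flags)
    if not flags:
        return "GREEN"                           # assumed default for no checks
    return max(flags, key=SEVERITY.__getitem__)

overall_ok = worst_flag(["GREEN", "GREEN"])
overall_warn = worst_flag(["GREEN", "YELLOW", "GREEN"])
overall_bad = worst_flag(["YELLOW", "RED", "GREEN"])
```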
- causalis.scenarios.cuped.diagnostics.regression_checks.style_regression_assumptions_table(table: pandas.DataFrame)¶
Return pandas Styler with colored flag cells for notebook display.