causalis.data_contracts.multicausaldata
Causalis data class for storing a cross-sectional DataFrame and column metadata for causal inference with multiple treatments.
Module Contents
Classes
MultiCausalData | Data contract for cross-sectional causal data with multi-class one-hot treatments.
API
- class causalis.data_contracts.multicausaldata.MultiCausalData(/, **data: Any)
Bases: pydantic.BaseModel
Data contract for cross-sectional causal data with multi-class one-hot treatments.
Parameters
df : pd.DataFrame
    The DataFrame containing the causal data.
outcome : str
    The name of the outcome column.
treatment_names : List[str]
    The names of the treatment columns.
confounders : List[str], optional
    The names of the confounder columns, by default [].
user_id : Optional[str], optional
    The name of the user ID column, by default None.
control_treatment : str
    Name of the control/baseline treatment column.
Notes
This class enforces several constraints on the data, including:
Maximum number of treatment_names (default 15).
No duplicate column names in the input DataFrame.
Disjoint roles for columns (outcome, treatment_names, confounders, user_id).
Non-empty normalized names for outcome and user_id (if provided).
Existence of all specified columns in the DataFrame.
Numeric or boolean types for outcome and confounders.
Finite values for outcome, confounders, and treatment_names.
Non-constant values for outcome, treatment_names, and confounders.
No NaN values in used columns.
Binary (0/1) encoding for treatment columns.
One-hot treatment assignment (exactly one active treatment per row).
A stable control treatment in position 0.
No two different columns with identical contents.
Unique values for user_id (if specified).
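Several of the treatment constraints listed above (no NaN, binary encoding, one-hot assignment, non-constant columns, control in position 0) can be illustrated with plain pandas. The sketch below is not the library's validator: `check_one_hot` and the column names `d_0`, `d_1`, `d_2` are hypothetical, and reading "control treatment in position 0" as "the control is the first entry of `treatment_names`" is an assumption.

```python
from typing import List

import numpy as np
import pandas as pd


def check_one_hot(df: pd.DataFrame, treatment_names: List[str], control_treatment: str) -> None:
    """Hypothetical helper: raise ValueError if the treatment block is not valid one-hot."""
    # Assumption: "position 0" means the control is listed first.
    if treatment_names[0] != control_treatment:
        raise ValueError("control treatment must be the first treatment name")

    block = df[treatment_names]
    if block.isna().any().any():
        raise ValueError("treatment columns contain NaN")

    values = block.to_numpy(dtype=float)
    if not np.isin(values, [0.0, 1.0]).all():
        raise ValueError("treatment columns must be encoded as 0/1")
    if not (values.sum(axis=1) == 1.0).all():
        raise ValueError("each row needs exactly one active treatment")
    if (block.nunique() < 2).any():
        raise ValueError("treatment columns must not be constant")


# Illustrative data: three arms, one active per row.
arms = pd.DataFrame(
    {
        "d_0": [1, 0, 0, 1],
        "d_1": [0, 1, 0, 0],
        "d_2": [0, 0, 1, 0],
    }
)
check_one_hot(arms, ["d_0", "d_1", "d_2"], control_treatment="d_0")  # passes silently
```

Running a check like this before constructing the model can make it easier to see which constraint a frame violates.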
Initialization
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- model_config
‘ConfigDict(…)’
- MAX_TREATMENTS: ClassVar[int]
15
- FLOAT_TOL: ClassVar[float]
1e-12
- df: pandas.DataFrame
None
- outcome: str
None
- treatment_names: List[str]
None
- confounders: List[str]
‘Field(…)’
- user_id: Optional[str]
None
- control_treatment: str
None
- classmethod from_df(df: pandas.DataFrame, *, outcome: str, treatment_names: Union[str, List[str]], confounders: Optional[Union[str, List[str]]] = None, user_id: Optional[str] = None, control_treatment: str, **kwargs: Any) → causalis.data_contracts.multicausaldata.MultiCausalData
Create a MultiCausalData instance from a pandas DataFrame.
Parameters
df : pd.DataFrame
    The input DataFrame.
outcome : str
    The name of the outcome column.
treatment_names : Union[str, List[str]]
    The name(s) of the treatment column(s).
confounders : Union[str, List[str]], optional
    The name(s) of the confounder column(s), by default None.
user_id : str, optional
    The name of the user ID column, by default None.
control_treatment : str
    Name of the control treatment column.
**kwargs : Any
    Additional keyword arguments passed to the constructor.
Returns
MultiCausalData
    An instance of MultiCausalData.
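A minimal usage sketch for `from_df`, assuming the raw data holds a single categorical arm column that must first be expanded into one-hot columns. The column names (`arm`, `age`, `income`, `y`, `d_0`, `d_1`, `d_2`), the values, and the helpers `make_example_frame` and `build_contract` are all hypothetical; only the `from_df` keyword arguments come from the signature above. The causalis import is kept inside the function so that the data-preparation part runs without the package installed.

```python
import pandas as pd

TREATMENTS = ["d_0", "d_1", "d_2"]  # hypothetical names; d_0 is the control arm


def make_example_frame() -> pd.DataFrame:
    """Build a small frame whose treatment columns are 0/1 one-hot indicators."""
    raw = pd.DataFrame(
        {
            "arm": ["control", "a", "b", "control", "a", "b"],
            "age": [23.0, 35.0, 41.0, 29.0, 52.0, 38.0],
            "income": [31.0, 48.5, 52.0, 40.0, 75.5, 44.0],
            "y": [1.2, 2.9, 3.4, 1.0, 3.1, 3.0],
        }
    )
    # Expand the categorical arm into integer 0/1 indicator columns.
    one_hot = pd.get_dummies(raw["arm"], dtype=int)
    one_hot = one_hot.rename(columns={"control": "d_0", "a": "d_1", "b": "d_2"})
    # List the control column first, then the remaining arms.
    return pd.concat([one_hot[TREATMENTS], raw.drop(columns="arm")], axis=1)


def build_contract(df: pd.DataFrame):
    """Wrap the frame in MultiCausalData (requires causalis to be installed)."""
    from causalis.data_contracts.multicausaldata import MultiCausalData

    return MultiCausalData.from_df(
        df,
        outcome="y",
        treatment_names=TREATMENTS,
        confounders=["age", "income"],
        control_treatment="d_0",
    )


frame = make_example_frame()
```

With causalis installed, `build_contract(frame)` would be expected to return a validated instance, provided the frame satisfies every constraint listed in the class notes.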
- property treatments: pandas.DataFrame
Return the treatment columns as a pandas DataFrame.
Returns
pd.DataFrame
    The treatment columns.
- property treatment: pandas.Series
Return the single treatment column as a pandas Series.
Returns
pd.Series
    The treatment column.
Raises
AttributeError
    If there is more than one treatment column.
- property X: pandas.DataFrame
Return the confounder columns as a pandas DataFrame.
Returns
pd.DataFrame
    The confounder columns.
- get_df(columns: Optional[List[str]] = None, include_outcome: bool = True, include_confounders: bool = True, include_treatments: bool = True, include_user_id: bool = False) → pandas.DataFrame
Get a subset of the underlying DataFrame.
Parameters
columns : List[str], optional
    Specific columns to include, by default None.
include_outcome : bool, optional
    Whether to include the outcome column, by default True.
include_confounders : bool, optional
    Whether to include confounder columns, by default True.
include_treatments : bool, optional
    Whether to include treatment columns, by default True.
include_user_id : bool, optional
    Whether to include the user ID column, by default False.
Returns
pd.DataFrame
    A copy of the requested DataFrame subset.
Raises
ValueError
    If any of the requested columns do not exist.
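The documented behaviour of `get_df` (role-based flags, a copy as the return value, `ValueError` on unknown columns) can be approximated in plain pandas. The function below is a hypothetical stand-in, not the library's code. In particular, how `columns` combines with the `include_*` flags is not stated in this reference, so the union used here is an assumption.

```python
from typing import List, Optional

import pandas as pd


def select_columns(
    df: pd.DataFrame,
    *,
    outcome: str,
    treatment_names: List[str],
    confounders: List[str],
    user_id: Optional[str] = None,
    columns: Optional[List[str]] = None,
    include_outcome: bool = True,
    include_confounders: bool = True,
    include_treatments: bool = True,
    include_user_id: bool = False,
) -> pd.DataFrame:
    """Hypothetical sketch of role-based column selection that returns a copy."""
    # Assumption: explicit columns are unioned with the role-selected ones.
    selected = list(columns or [])
    if include_outcome:
        selected.append(outcome)
    if include_confounders:
        selected.extend(confounders)
    if include_treatments:
        selected.extend(treatment_names)
    if include_user_id and user_id is not None:
        selected.append(user_id)

    # Drop duplicates while preserving first-seen order.
    selected = list(dict.fromkeys(selected))

    missing = [name for name in selected if name not in df.columns]
    if missing:
        raise ValueError(f"columns not found: {missing}")
    return df[selected].copy()


demo = pd.DataFrame(
    {"y": [1.0, 2.0, 3.0], "age": [30, 41, 27], "d_0": [1, 0, 0], "d_1": [0, 1, 1]}
)
subset = select_columns(
    demo,
    outcome="y",
    treatment_names=["d_0", "d_1"],
    confounders=["age"],
    include_treatments=False,
)
```

Because the result is a copy, edits to `subset` leave `demo` unchanged, which matches the "copy" wording in the Returns entry.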
- __repr__() → str
- __str__() → str