Metadata-Version: 2.4
Name: causal-toolkit-yamashita
Version: 0.1.2
Summary: A Python package for causal inference methods including ATE estimation, propensity score methods, and meta-learners
Author-email: Hannah Yamashita <yamashitah@uchicago.edu>
License: MIT
Project-URL: Homepage, https://github.com/yamashann/causal-toolkit-yamashita
Project-URL: Documentation, https://github.com/yamashann/causal-toolkit-yamashita#readme
Project-URL: Repository, https://github.com/yamashann/causal-toolkit-yamashita
Project-URL: Bug Tracker, https://github.com/yamashann/causal-toolkit-yamashita/issues
Keywords: causal inference,statistics,machine learning,treatment effects
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: packaging
Requires-Dist: pandas>=1.3.0
Requires-Dist: numpy>=1.21.0
Requires-Dist: scipy>=1.7.0
Requires-Dist: scikit-learn>=1.0.0
Requires-Dist: lightgbm>=3.3.0
Requires-Dist: patsy>=0.5.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=3.0.0; extra == "dev"
Requires-Dist: black>=22.0.0; extra == "dev"
Requires-Dist: pylint>=2.12.0; extra == "dev"
Requires-Dist: mypy>=0.950; extra == "dev"
Dynamic: license-file

# causal-toolkit-yamashita

[![Tests](https://github.com/yamashann/causal-toolkit-yamashita/workflows/Tests/badge.svg)](https://github.com/yamashann/causal-toolkit-yamashita/actions)
[![PyPI version](https://badge.fury.io/py/causal-toolkit-yamashita.svg)](https://pypi.org/project/causal-toolkit-yamashita/)

A Python package for causal inference methods, including:

- **Randomized experiment analysis** — average treatment effect (ATE) with confidence intervals and p-values
- **Propensity score methods** — inverse probability weighting (IPW) and doubly robust estimation
- **Meta-learners** — S-learner, T-learner, X-learner, and double machine learning for conditional average treatment effects (CATE)

## Installation

Install the released package from PyPI:

```bash
pip install causal-toolkit-yamashita
```

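Or install from source in editable mode for development (the last command assumes [uv](https://github.com/astral-sh/uv) is installed; without it, plain `pip install -e .` does the same job):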

```bash
git clone https://github.com/yamashann/causal-toolkit-yamashita.git
cd causal-toolkit-yamashita
uv pip install -e .
```

## Quick Start

After installing (see above), save the following as `quickstart.py` and run it
with `python quickstart.py`:

```python
import numpy as np
import pandas as pd

from causal_toolkit_yamashita import calculate_ate_ci, doubly_robust, t_learner_discrete

rng = np.random.default_rng(0)
n = 1000

# --- Randomized experiment: estimate the average treatment effect (ATE) ---
# The DataFrame must have a treatment column "T" (0/1) and an outcome column "Y".
T = rng.integers(0, 2, size=n)
Y = 3.0 * T + rng.normal(size=n)                 # true effect = 3.0
rct = pd.DataFrame({"T": T, "Y": Y})

ate, lo, hi = calculate_ate_ci(rct)
print(f"RCT ATE        = {ate:.2f}  (95% CI [{lo:.2f}, {hi:.2f}])")

# --- Observational data where covariates confound the treatment ---
x1, x2 = rng.normal(size=n), rng.normal(size=n)
treat = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * x1 + 0.5 * x2))))
outcome = 2.0 * treat + 1.5 * x1 + x2 + rng.normal(size=n)   # true effect = 2.0
obs = pd.DataFrame({"x1": x1, "x2": x2, "treat": treat, "outcome": outcome})

# doubly_robust(df, formula, T, Y) — `formula` is a patsy formula of the covariates
print(f"Doubly-robust  = {doubly_robust(obs, 'x1 + x2', 'treat', 'outcome'):.2f}")

# --- Per-row heterogeneous effects (CATE) via a meta-learner ---
train, test = obs.iloc[:800], obs.iloc[800:]
cate = t_learner_discrete(train, test, X=["x1", "x2"], T="treat", y="outcome")
print(f"Mean CATE      = {cate['cate'].mean():.2f}  ({len(cate)} test rows)")
```

Expected output (estimates land near the true effects of 3.0 and 2.0 built into the simulated data; the exact digits may differ slightly across library versions):

```text
RCT ATE        = 3.12  (95% CI [3.00, 3.25])
Doubly-robust  = 2.08
Mean CATE      = 2.16  (200 test rows)
```

All eight functions (`calculate_ate_ci`, `calculate_ate_pvalue`, `ipw`,
`doubly_robust`, `s_learner_discrete`, `t_learner_discrete`,
`x_learner_discrete`, `double_ml_cate`) are importable directly from
`causal_toolkit_yamashita`; see the API tables below for signatures.

## API

### `rct` — Randomized experiments

| Function | Description |
|---|---|
| `calculate_ate_ci(data, alpha=0.05)` | Returns `(ate, ci_lower, ci_upper)`; `data` needs a 0/1 treatment column `T` and an outcome column `Y` |
| `calculate_ate_pvalue(data)` | Returns `(ate, t_stat, p_value)` |
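
For intuition about the quantities these functions return, here is a minimal standard-library sketch of the textbook difference-in-means estimator with a normal-approximation interval and two-sided p-value. The helper name `diff_in_means` and the toy data are hypothetical, and this is not necessarily how the package computes its results (it may, for example, use a t-distribution instead).

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def diff_in_means(treated, control, alpha=0.05):
    """Hypothetical helper: difference-in-means ATE with a normal-approximation CI."""
    ate = mean(treated) - mean(control)
    # Standard error of the difference, allowing unequal variances in the two arms.
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    z = NormalDist().inv_cdf(1 - alpha / 2)      # critical value, about 1.96 for alpha=0.05
    stat = ate / se                              # test statistic for H0: ATE = 0
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))
    return ate, ate - z * se, ate + z * se, stat, p_value


# Toy outcomes for the two arms of a randomized experiment (made-up numbers).
treated = [5.1, 4.8, 5.6, 5.0, 5.3]
control = [2.9, 3.2, 3.0, 2.7, 3.1]

ate, lo, hi, stat, p = diff_in_means(treated, control)
print(f"ATE = {ate:.2f}, 95% CI [{lo:.2f}, {hi:.2f}], p = {p:.3g}")
```

With samples this small a t-based interval would be wider; the normal approximation is used here only to keep the sketch dependency-free.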

### `propensity` — Propensity score methods

| Function | Description |
|---|---|
| `ipw(df, ps_formula, T, Y)` | Inverse probability weighted ATE |
| `doubly_robust(df, formula, T, Y)` | Doubly robust ATE estimate |
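
To illustrate what inverse probability weighting does, the sketch below applies the standard Horvitz-Thompson form of the estimator to a tiny hand-built dataset. It assumes the propensity scores are already known, whereas `ipw` takes a formula and presumably estimates them from the covariates; the helper name `ipw_ate` and the data are hypothetical and not part of the package.

```python
def ipw_ate(treatment, outcome, propensity):
    """Hypothetical helper: Horvitz-Thompson IPW estimate of the ATE."""
    n = len(outcome)
    rows = list(zip(treatment, outcome, propensity))
    # Each treated unit is up-weighted by 1/e, each control unit by 1/(1 - e).
    treated_mean = sum(t * y / e for t, y, e in rows) / n
    control_mean = sum((1 - t) * y / (1 - e) for t, y, e in rows) / n
    return treated_mean - control_mean


# Two strata with different treatment probabilities and different effects:
#   stratum A: e = 0.50, effect 2 (treated y = 3, control y = 1)
#   stratum B: e = 0.25, effect 4 (treated y = 6, control y = 2)
treatment = [1, 1, 0, 0, 1, 0, 0, 0]
outcome = [3, 3, 1, 1, 6, 2, 2, 2]
propensity = [0.5, 0.5, 0.5, 0.5, 0.25, 0.25, 0.25, 0.25]

treated_y = [y for t, y in zip(treatment, outcome) if t == 1]
control_y = [y for t, y in zip(treatment, outcome) if t == 0]
naive = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)

print(f"Naive difference in means = {naive:.2f}")   # 2.40, biased by confounding
print(f"IPW estimate              = {ipw_ate(treatment, outcome, propensity):.2f}")  # 3.00
```

The reweighting recovers the population-average effect of 3 (the mean of the two stratum effects), while the unweighted comparison under-counts the rarely treated stratum B.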

### `meta_learners` — CATE estimation

| Function | Description |
|---|---|
| `s_learner_discrete(train, test, X, T, y)` | Single-model learner; returns DataFrame with `cate` column |
| `t_learner_discrete(train, test, X, T, y)` | Two-model learner |
| `x_learner_discrete(train, test, X, T, y)` | Cross-fitted X-learner |
| `double_ml_cate(train, test, X, T, y)` | Double machine learning CATE |
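
As a rough picture of the two-model idea behind a T-learner, the sketch below fits one outcome model per treatment arm and takes the difference of their predictions on test points. It uses a hand-rolled one-covariate least-squares line as the base model, which is an assumption made to keep the example dependency-free; the base learner the package uses is not described here, and the names `fit_line` and `t_learner_sketch` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one covariate; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def t_learner_sketch(train, test_x):
    """Fit one model per arm, then return mu1(x) - mu0(x) for each test x.

    `train` is a list of (x, t, y) tuples with t in {0, 1}.
    """
    models = {}
    for arm in (0, 1):
        xs = [x for x, t, _ in train if t == arm]
        ys = [y for _, t, y in train if t == arm]
        models[arm] = fit_line(xs, ys)
    (a0, b0), (a1, b1) = models[0], models[1]
    return [(a1 + b1 * x) - (a0 + b0 * x) for x in test_x]


# Noise-free toy data: control y = 1 + 2x, treated y = 3 + 3x, so CATE(x) = 2 + x.
train = [(x, 0, 1 + 2 * x) for x in (0.0, 1.0, 2.0, 3.0)]
train += [(x, 1, 3 + 3 * x) for x in (0.5, 1.5, 2.5, 3.5)]

cate = t_learner_sketch(train, test_x=[0.0, 1.0, 4.0])
print([round(c, 6) for c in cate])   # [2.0, 3.0, 6.0]
```

An S-learner would instead fit a single model with the treatment indicator as an extra feature and difference its predictions at `t=1` and `t=0`.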

## Running Tests

The test suite uses pytest, which is listed in the `dev` extra:

```bash
uv run pytest
```

## License

MIT
