Metadata-Version: 2.4
Name: scinference
Version: 0.1.0
Summary: Inference methods for synthetic control and related methods
Author-email: Kaspar Wuthrich <kwuthrich@ucsd.edu>
Maintainer: Anzony Quispe
License: GPL-3.0
Project-URL: Homepage, https://github.com/anzonyquispe/scinference
Project-URL: Repository, https://github.com/anzonyquispe/scinference
Keywords: synthetic control,causal inference,conformal inference,econometrics
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.20.0
Requires-Dist: scipy>=1.7.0
Requires-Dist: cvxpy>=1.2.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: pandas>=1.3.0; extra == "dev"

# scinference

A Python package implementing inference procedures for synthetic control and related methods.

This is a Python port of the [R scinference package](https://github.com/kwuthrich/scinference) by Victor Chernozhukov, Kaspar Wuthrich, and Yinchu Zhu.

## Installation

```bash
pip install scinference
```

## Validation Against R Package

All results in this README are generated using **data from the original R package**, and the tables below show that the Python and R outputs **agree to the displayed precision**.

---

# Example 1: Conformal Inference

Conformal inference works with a small number of post-treatment periods and provides valid p-values and confidence intervals.

## Data

We use simulated data generated by the R package with `set.seed(12345)`:
- **J = 50** control units
- **T0 = 50** pre-treatment periods
- **T1 = 5** post-treatment periods
- **True treatment effect = 2**

```python
import numpy as np
from scinference import scinference

# Load data (generated by R for exact comparison); the loading step is omitted here.
# Y0: control outcomes, shape (T0 + T1, J) = (55, 50)
# Y1: treated outcome, shape (T0 + T1, 1) = (55, 1)
T0, T1 = 50, 5
```
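To experiment without the R-generated arrays, stand-in data with the same dimensions can be simulated. The sketch below is for illustration only: the data-generating process (a sparse weighted average of controls plus noise) and the names `simulate_panel` and `effect` are hypothetical, and results computed on this data will not match the tables in this README.

```python
import numpy as np

def simulate_panel(J=50, T0=50, T1=5, effect=2.0, seed=12345):
    """Simulate stand-in data with the shapes used in Example 1.

    Hypothetical data-generating process, not the one used by the R package.
    """
    rng = np.random.default_rng(seed)
    T = T0 + T1
    Y0 = rng.normal(size=(T, J))             # control outcomes, T x J
    w = np.zeros(J)
    w[:3] = 1.0 / 3.0                         # treated unit tracks three controls
    Y1 = Y0 @ w + rng.normal(scale=0.5, size=T)
    Y1[T0:] += effect                         # constant effect after treatment
    return Y1.reshape(-1, 1), Y0              # Y1 is T x 1

Y1, Y0 = simulate_panel()
print(Y1.shape, Y0.shape)  # (55, 1) (55, 50)
```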

## Testing a Null Hypothesis

We test the null hypothesis **H0: theta = (4, 4, 4, 4, 4)'**. Since the true effect is 2, this is a false null and we expect to reject it.

```python
# Synthetic Control with Moving Block permutations
result_sc = scinference(Y1, Y0, T1=T1, T0=T0, theta0=4,
                        estimation_method="sc", permutation_method="mb")

# Difference-in-Differences
result_did = scinference(Y1, Y0, T1=T1, T0=T0, theta0=4,
                         estimation_method="did", permutation_method="mb")

# Constrained Lasso
result_classo = scinference(Y1, Y0, T1=T1, T0=T0, theta0=4,
                            estimation_method="classo", permutation_method="mb")
```

### Results: P-values for H0: theta = 4 (Moving Block Permutations)

| Method | Python | R | Match |
|--------|--------|---|-------|
| Synthetic Control | 0.018182 | 0.018182 | YES |
| Difference-in-Diff | 0.036364 | 0.036364 | YES |
| Constrained Lasso | 0.036364 | 0.036364 | YES |

All three methods reject the false null at the 5% level.
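Each p-value in the table is a multiple of 1/55 (0.018182 = 1/55, 0.036364 = 2/55), which is what a moving block scheme over the T0 + T1 = 55 cyclic shifts of the residuals would produce. The sketch below illustrates that idea under stated assumptions: the helper name `moving_block_pvalue` and the choice of test statistic (mean absolute post-treatment residual) are hypothetical and are not taken from the package source.

```python
import numpy as np

def moving_block_pvalue(residuals, T1):
    """P-value from all cyclic shifts of a residual series.

    residuals: length T0 + T1 residuals computed under the null.
    T1: number of post-treatment periods (the last T1 entries).
    """
    u = np.asarray(residuals, dtype=float).ravel()
    T = u.size

    def stat(v):
        # Assumed test statistic: mean absolute residual in the last T1 periods.
        return np.mean(np.abs(v[-T1:]))

    observed = stat(u)
    # np.roll(u, j) is the j-th cyclic shift; j = 0 is the identity.
    shifted = np.array([stat(np.roll(u, j)) for j in range(T)])
    return float(np.mean(shifted >= observed))

# Toy residuals: zero before treatment, large afterwards.
u = np.zeros(55)
u[-5:] = 10.0
p = moving_block_pvalue(u, T1=5)
print(p)
```

In this toy case only the identity shift matches the observed statistic, so the p-value is the smallest attainable one, 1/55.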

## Pointwise Confidence Intervals

Compute 90% pointwise confidence intervals using the synthetic control method:

```python
result_ci = scinference(Y1, Y0, T1=T1, T0=T0,
                        estimation_method="sc",
                        ci=True,
                        ci_grid=np.arange(-2, 8.1, 0.1),
                        alpha=0.1)

print("90% Confidence Intervals:")
for t in range(T1):
    print(f"  Period {t+1}: [{result_ci['lb'][t]:.2f}, {result_ci['ub'][t]:.2f}]")
```

### Results: 90% Pointwise Confidence Intervals

| Period | Python LB | R LB | Python UB | R UB | Match |
|--------|-----------|------|-----------|------|-------|
| 1 | 0.80 | 0.80 | 4.10 | 4.10 | YES |
| 2 | 1.50 | 1.50 | 4.60 | 4.60 | YES |
| 3 | 0.40 | 0.40 | 4.00 | 4.00 | YES |
| 4 | 1.80 | 1.80 | 5.20 | 5.20 | YES |
| 5 | -0.50 | -0.50 | 2.60 | 2.60 | YES |

![Confidence Intervals](images/confidence_intervals.png)

The true treatment effect (2) is covered by the confidence intervals in all periods.
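Pointwise intervals of this kind are typically obtained by test inversion: a p-value is computed for each candidate value on `ci_grid`, and the candidates whose p-value exceeds `alpha` form the interval. The sketch below shows the inversion step only. The helper `invert_test` and the callback `pvalue_fn` are hypothetical names, and this is not the package's internal code.

```python
import numpy as np

def invert_test(pvalue_fn, grid, alpha=0.1):
    """Return (lb, ub) of the grid values not rejected at level alpha."""
    grid = np.asarray(grid, dtype=float)
    pvals = np.array([pvalue_fn(theta) for theta in grid])
    accepted = grid[pvals > alpha]
    if accepted.size == 0:
        return np.nan, np.nan  # every candidate was rejected
    return accepted.min(), accepted.max()

# Toy p-value function that only "accepts" effects near 2 (hypothetical).
def toy_pvalue(theta):
    return 0.5 if 0.95 < theta < 3.05 else 0.01

lb, ub = invert_test(toy_pvalue, np.arange(-2, 8.1, 0.1))
print(round(float(lb), 2), round(float(ub), 2))
```

If a reported bound coincides with the first or last grid value, the grid is probably too narrow and should be widened.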

---

# Example 2: T-test Based Inference

The t-test method requires a **larger number of post-treatment periods** and provides estimates of the Average Treatment Effect on the Treated (ATT).

## Data

Data generated by R with `set.seed(12345)`:
- **J = 30** control units
- **T0 = 30** pre-treatment periods
- **T1 = 30** post-treatment periods
- **True ATT = 2**

```python
# Example 2 dimensions (Y1 and Y0 now hold the Example 2 data)
T0, T1 = 30, 30

# T-test with K=2 cross-fits
result = scinference(Y1, Y0, T1=T1, T0=T0,
                     inference_method="ttest", K=2, alpha=0.1)

print(f"ATT estimate: {result['att']:.4f}")
print(f"Standard Error: {result['se']:.4f}")
print(f"90% CI: [{result['lb']:.4f}, {result['ub']:.4f}]")
```

### Results: T-test with Synthetic Control (K=2)

| Metric | Python | R | Match |
|--------|--------|---|-------|
| ATT | 1.488715 | 1.488715 | YES |
| Standard Error | 0.292220 | 0.292220 | YES |
| 90% CI Lower | -0.356287 | -0.356287 | YES |
| 90% CI Upper | 3.333716 | 3.333716 | YES |

### Results: T-test with Synthetic Control (K=3)

```python
result_K3 = scinference(Y1, Y0, T1=T1, T0=T0,
                        inference_method="ttest", K=3, alpha=0.1)
```

| Metric | Python | R | Match |
|--------|--------|---|-------|
| ATT | 1.526818 | 1.526818 | YES |
| Standard Error | 0.194858 | 0.194858 | YES |
| 90% CI Lower | 0.957835 | 0.957835 | YES |
| 90% CI Upper | 2.095801 | 2.095801 | YES |

### Results: T-test with Difference-in-Differences (K=2)

```python
result_did = scinference(Y1, Y0, T1=T1, T0=T0,
                         inference_method="ttest",
                         estimation_method="did", K=2, alpha=0.1)
```

| Metric | Python | R | Match |
|--------|--------|---|-------|
| ATT | 1.624656 | 1.624656 | YES |
| Standard Error | 0.192063 | 0.192063 | YES |
| 90% CI Lower | 0.412019 | 0.412019 | YES |
| 90% CI Upper | 2.837293 | 2.837293 | YES |

![T-test Comparison](images/ttest_comparison.png)

All three specifications give ATT estimates between 1.49 and 1.62, somewhat below the true ATT of 2, and every 90% confidence interval covers the true value.
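The reported bounds are consistent with a Student-t interval of the form `att ± t(K-1, 1-alpha/2) * se`. The sketch below recomputes the bounds from the tabulated `att` and `se` values. The helper `ttest_ci` is a hypothetical name, and the K - 1 degrees of freedom are inferred from the reported numbers rather than stated in this README.

```python
from scipy import stats

def ttest_ci(att, se, K, alpha=0.1):
    """Confidence interval att +/- t_{K-1, 1-alpha/2} * se."""
    crit = stats.t.ppf(1 - alpha / 2, df=K - 1)
    return att - crit * se, att + crit * se

# Values copied from the synthetic control tables above.
ci_k2 = ttest_ci(1.488715, 0.292220, K=2)
ci_k3 = ttest_ci(1.526818, 0.194858, K=3)
print(ci_k2)
print(ci_k3)
```

With K=2 there is a single degree of freedom and the critical value is about 6.31, compared with about 2.92 for K=3. This is the main reason the K=2 intervals are so much wider.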

---

# API Reference

## Main Function

```python
scinference(Y1, Y0, T1, T0,
            inference_method="conformal",
            alpha=0.1,
            ci=False,
            theta0=0,
            estimation_method="sc",
            permutation_method="mb",
            ci_grid=None,
            n_perm=5000,
            K=2)
```

## Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `Y1` | array | Required | Outcome for treated unit (T x 1, with T = T0 + T1) |
| `Y0` | array | Required | Outcomes for control units (T x J, with J control units) |
| `T1` | int | Required | Number of post-treatment periods |
| `T0` | int | Required | Number of pre-treatment periods |
| `inference_method` | str | "conformal" | "conformal" or "ttest" |
| `alpha` | float | 0.1 | Significance level |
| `ci` | bool | False | Compute confidence intervals (conformal) |
| `theta0` | float/array | 0 | Null hypothesis value |
| `estimation_method` | str | "sc" | "did", "sc", or "classo" |
| `permutation_method` | str | "mb" | "mb" (moving block) or "iid" |
| `ci_grid` | array | None | Grid of candidate values for CI computation (conformal) |
| `n_perm` | int | 5000 | Number of permutations for the IID method |
| `K` | int | 2 | Number of cross-fits (t-test) |

## Returns

**Conformal inference:**
- `p_val`: p-value for the null hypothesis
- `lb`: lower bounds of pointwise CIs (if `ci=True`)
- `ub`: upper bounds of pointwise CIs (if `ci=True`)

**T-test:**
- `att`: Average Treatment Effect on the Treated
- `se`: Standard error
- `lb`: Lower bound of confidence interval
- `ub`: Upper bound of confidence interval

## Estimation Methods

| Method | Description |
|--------|-------------|
| `did` | Difference-in-differences (simple average of controls) |
| `sc` | Synthetic control (Abadie et al.) - constrained weighted average |
| `classo` | Constrained lasso - least squares with an intercept, subject to the L1 norm of the weights being at most one |
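
As an illustration of the simplest row in the table, the sketch below builds a difference-in-differences counterfactual: the equal-weight average of the controls, shifted by the mean pre-treatment gap to the treated unit. The helper name `did_counterfactual` is hypothetical, and the intercept handling is an assumption about the estimator, not a copy of the package code.

```python
import numpy as np

def did_counterfactual(Y1, Y0, T0):
    """Equal-weight control average plus the mean pre-treatment gap."""
    y1 = np.asarray(Y1, dtype=float).ravel()
    control_avg = np.asarray(Y0, dtype=float).mean(axis=1)
    # Intercept is estimated from the first T0 (pre-treatment) periods only.
    intercept = np.mean(y1[:T0] - control_avg[:T0])
    return intercept + control_avg

# Toy data: 6 periods, 3 identical controls; the treated unit sits 1 above
# the control average and gains an effect of 2 after T0 = 4.
Y0 = np.tile(np.arange(6.0).reshape(-1, 1), (1, 3))
Y1 = Y0.mean(axis=1) + 1.0
Y1[4:] += 2.0
gap = Y1 - did_counterfactual(Y1, Y0, T0=4)
print(gap)  # [0. 0. 0. 0. 2. 2.]
```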

---


# References

- Chernozhukov, V., Wuthrich, K., & Zhu, Y. (2021). "An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls." *Journal of the American Statistical Association*. [arXiv:1712.09089](https://arxiv.org/abs/1712.09089)

- Chernozhukov, V., Wuthrich, K., & Zhu, Y. (2019). "Practical and robust t-test based inference for synthetic control and related methods." [arXiv:1812.10820](https://arxiv.org/abs/1812.10820)

---

# License

GPL-3.0
