Same Root Panel Case Studies

Shen, Ding, Sekhon, and Yu frame a useful distinction for panel causal designs: the same observed panel can support horizontal (time-series style) prediction, vertical (cross-sectional style) prediction, and weighting estimators that interpolate between the two. The point estimates may look like close cousins; the inference changes because the source of randomness changes.

This cached ablation puts that idea into one reproducible page for two canonical synthetic-control cases:

  • the Basque Country GDP case from Abadie and Gardeazabal;
  • the California Proposition 99 tobacco case from Abadie, Diamond, and Hainmueller.

For each case, the page runs the same panel through DID/TWFE, horizontal ridge prediction, vertical ridge prediction, synthetic control, synthetic DID, and matrix completion. It then reports the relevant placebo standard error and donor-placebo \(p\)-value that treat the donor pool as the comparison population, while leaving bootstrap/jackknife entries null where they are not meaningful for the design.

The final section runs a small semi-synthetic placebo simulation on the two real donor panels: each donor takes a turn as the treated unit, a known effect is added after the historical treatment date, and the estimators are scored against that known truth.

Data are vendored under docs/data/ so the page renders without network access.

1 Setup

Show code
from dataclasses import dataclass
from html import escape
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from IPython.display import HTML, display

import crabbymetrics as cm

np.set_printoptions(precision=4, suppress=True)
DATA_DIR = Path("..") / "data"


def html_table(headers, rows):
    parts = ["<table>", "<thead>", "<tr>"]
    parts.extend(f"<th>{escape(str(h))}</th>" for h in headers)
    parts.extend(["</tr>", "</thead>", "<tbody>"])
    for row in rows:
        parts.append("<tr>")
        parts.extend(f"<td>{cell}</td>" for cell in row)
        parts.append("</tr>")
    parts.extend(["</tbody>", "</table>"])
    return "".join(parts)


@dataclass(frozen=True)
class PanelCase:
    key: str
    title: str
    outcome_label: str
    panel: np.ndarray
    units: list[str]
    years: np.ndarray
    treated_units: list[int]
    t_pre: int
    simulation_tau: float

    @property
    def treatment_year(self):
        return int(self.years[self.t_pre])

    @property
    def donor_units(self):
        treated = set(self.treated_units)
        return [i for i in range(self.panel.shape[0]) if i not in treated]

2 Load The Two Case Panels

Show code
def wide_from_long(df, unit_col, time_col, outcome_col, units=None):
    if units is None:
        units = sorted(df[unit_col].drop_duplicates().tolist())
    years = np.array(sorted(df[time_col].drop_duplicates().astype(int).tolist()))
    lookup = df.set_index([unit_col, time_col])[outcome_col]
    panel = np.array([[lookup.loc[(u, y)] for y in years] for u in units], dtype=float)
    return panel, [str(u) for u in units], years


basque_raw = pd.read_csv(DATA_DIR / "basque.csv")
basque_raw["regionno"] = basque_raw["regionno"].astype(int)
basque_raw["year"] = basque_raw["year"].astype(int)
# Region 1 is the national aggregate, not a donor region. Region 17 is Basque Country.
basque_raw = basque_raw[basque_raw["regionno"] != 1].copy()
basque_raw["label"] = basque_raw["regionno"].astype(str) + ": " + basque_raw["regionname"]
basque_units = (
    basque_raw[["regionno", "regionname", "label"]]
    .drop_duplicates()
    .sort_values("regionno")
)
basque_panel, basque_unit_labels, basque_years = wide_from_long(
    basque_raw,
    unit_col="label",
    time_col="year",
    outcome_col="gdpcap",
    units=basque_units["label"].tolist(),
)
basque_treated = [next(i for i, u in enumerate(basque_unit_labels) if u.startswith("17:"))]
basque_t_pre = int(np.where(basque_years == 1970)[0][0])

cal_raw = pd.read_csv(DATA_DIR / "california_prop99.csv", sep=";")
cal_units = sorted(cal_raw["State"].drop_duplicates().tolist())
cal_panel, cal_unit_labels, cal_years = wide_from_long(
    cal_raw,
    unit_col="State",
    time_col="Year",
    outcome_col="PacksPerCapita",
    units=cal_units,
)
cal_treated = [cal_unit_labels.index("California")]
cal_t_pre = int(np.where(cal_years == 1989)[0][0])

cases = [
    PanelCase(
        key="basque",
        title="Basque Country GDP",
        outcome_label="real per-capita GDP, thousand 1986 USD",
        panel=basque_panel,
        units=basque_unit_labels,
        years=basque_years,
        treated_units=basque_treated,
        t_pre=basque_t_pre,
        simulation_tau=-0.75,
    ),
    PanelCase(
        key="california",
        title="California Proposition 99",
        outcome_label="cigarette packs per capita",
        panel=cal_panel,
        units=cal_unit_labels,
        years=cal_years,
        treated_units=cal_treated,
        t_pre=cal_t_pre,
        simulation_tau=-15.0,
    ),
]

display(
    HTML(
        html_table(
            ["Case", "Units", "Years", "Treated", "First treated year", "Outcome"],
            [
                [
                    c.title,
                    c.panel.shape[0],
                    f"{int(c.years[0])}-{int(c.years[-1])}",
                    ", ".join(c.units[i] for i in c.treated_units),
                    c.treatment_year,
                    c.outcome_label,
                ]
                for c in cases
            ],
        )
    )
)
| Case | Units | Years | Treated | First treated year | Outcome |
| --- | --- | --- | --- | --- | --- |
| Basque Country GDP | 17 | 1955-1997 | 17: Basque Country (Pais Vasco) | 1970 | real per-capita GDP, thousand 1986 USD |
| California Proposition 99 | 39 | 1970-2000 | California | 1989 | cigarette packs per capita |

3 Estimators

Each estimator returns an average post-treatment ATT estimate and, when meaningful, a period-by-period post-treatment effect path.

  • horizontal ridge is now the first-class crabbymetrics.HorizontalPanelRidge: it uses the common fit(Y, W) panel API, learns treated cohorts and never-treated donors internally, and forecasts treated counterfactual paths from donor paths;
  • vertical ridge remains a short reference calculation: it regresses donor outcomes in each post-treatment period on donor pre-treatment histories, then predicts the treated unit’s missing counterfactual from its pre-treatment history (formalized in the display below).

Synthetic control remains the low-level simplex-weight reference. Synthetic DID and matrix completion now use the same fit(Y, W) panel contract as horizontal ridge, so the vignette no longer constructs estimator-specific masks or treated-unit arguments for the modal panel estimators.
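
Written out in the notation of this page, the vertical-ridge reference calculation below runs one ridge regression per post-treatment period, with donor units as observations; \(T_0\) is the number of pre-treatment periods, \(\alpha\) is the ridge penalty, and the intercept is left unpenalized:

\[
\hat\beta^{(t)} = \arg\min_{\beta}\; \sum_{j \in \text{donors}} \Bigl( Y_{j,t} - \beta_0 - \sum_{s=1}^{T_0} \beta_s Y_{j,s} \Bigr)^{2} + \alpha \sum_{s=1}^{T_0} \beta_s^{2},
\qquad t = T_0 + 1, \dots, T .
\]

The treated unit's missing potential outcome is then filled in as \(\hat Y_{1,t}(0) = \hat\beta^{(t)}_0 + \sum_{s=1}^{T_0} \hat\beta^{(t)}_s Y_{1,s}\) from its own pre-treatment history, and the reported ATT is the average of \(Y_{1,t} - \hat Y_{1,t}(0)\) over the post-treatment periods.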

Show code
@dataclass
class Estimate:
    name: str
    att: float
    effect_path: np.ndarray
    counterfactual: np.ndarray | None = None
    pre_rmse: float | None = None


def split_panel(panel, treated_units):
    treated_units = list(treated_units)
    controls = [i for i in range(panel.shape[0]) if i not in set(treated_units)]
    treated_mean = panel[treated_units].mean(axis=0)
    control_mean = panel[controls].mean(axis=0)
    return treated_mean, control_mean, controls


def treatment_matrix(panel, treated_units, t_pre):
    w = np.zeros_like(panel, dtype=float)
    w[np.ix_(list(treated_units), np.arange(t_pre, panel.shape[1]))] = 1.0
    return w


def panel_counterfactual_mean(summary, treated_units):
    return np.asarray(summary["counterfactual"])[list(treated_units)].mean(axis=0)


def panel_effect_path(summary, treated_units, t_pre):
    return np.asarray(summary["treatment_effect"])[list(treated_units), t_pre:].mean(axis=0)


def ridge_solve(x, y, alpha=1.0, penalize_intercept=False):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    penalty = alpha * np.eye(x.shape[1])
    if not penalize_intercept:
        penalty[0, 0] = 0.0
    return np.linalg.solve(x.T @ x + penalty, x.T @ y)


def estimate_did(panel, treated_units, t_pre):
    treated_mean, control_mean, _ = split_panel(panel, treated_units)
    pre_gap = treated_mean[:t_pre].mean() - control_mean[:t_pre].mean()
    effect = treated_mean[t_pre:] - control_mean[t_pre:] - pre_gap
    return Estimate("DID", float(effect.mean()), effect, control_mean + pre_gap)


def estimate_twfe(panel, treated_units, t_pre):
    n, tt = panel.shape
    unit = np.repeat(np.arange(n, dtype=np.uint32), tt)
    time = np.tile(np.arange(tt, dtype=np.uint32), n)
    treated = np.zeros(n)
    treated[treated_units] = 1.0
    post = (np.arange(tt) >= t_pre).astype(float)
    d = np.repeat(treated, tt) * np.tile(post, n)
    fe = np.column_stack([unit, time])
    model = cm.FixedEffectsOLS()
    model.fit(d[:, None], fe, panel.reshape(-1))
    att = float(np.asarray(model.summary()["coef"])[0])
    # For a single adoption date in a balanced panel, TWFE and DID have the same
    # coefficient. Use the DID post-period effect path for the time-series
    # inference diagnostic rather than a degenerate constant coefficient path.
    effect = estimate_did(panel, treated_units, t_pre).effect_path
    return Estimate("TWFE", att, effect)


def estimate_horizontal_ridge(panel, treated_units, t_pre, alpha=0.25):
    model = cm.HorizontalPanelRidge(penalty=alpha)
    model.fit(panel, treatment_matrix(panel, treated_units, t_pre))
    summary = model.summary()
    return Estimate(
        "Horizontal ridge",
        float(summary["att"]),
        panel_effect_path(summary, treated_units, t_pre),
        panel_counterfactual_mean(summary, treated_units),
        float(summary["pre_rmse"]),
    )


def estimate_vertical_ridge(panel, treated_units, t_pre, alpha=1.0):
    treated_mean, _, controls = split_panel(panel, treated_units)
    x_controls = np.column_stack([np.ones(len(controls)), panel[controls, :t_pre]])
    x_treated = np.r_[1.0, treated_mean[:t_pre]]
    counter_post = []
    for t in range(t_pre, panel.shape[1]):
        beta = ridge_solve(x_controls, panel[controls, t], alpha=alpha)
        counter_post.append(float(x_treated @ beta))
    counter_post = np.asarray(counter_post)
    effect = treated_mean[t_pre:] - counter_post
    counterfactual = np.r_[treated_mean[:t_pre], counter_post]
    return Estimate("Vertical ridge", float(effect.mean()), effect, counterfactual)


def estimate_sc(panel, treated_units, t_pre):
    treated_mean, _, controls = split_panel(panel, treated_units)
    donors_pre = panel[controls, :t_pre].T
    donors_all = panel[controls].T
    model = cm.SyntheticControl(max_iterations=800)
    model.fit(donors_pre, treated_mean[:t_pre])
    y0_hat = np.asarray(model.predict(donors_all))
    effect = treated_mean[t_pre:] - y0_hat[t_pre:]
    return Estimate("Synthetic control", float(effect.mean()), effect, y0_hat, float(model.summary()["pre_rmse"]))


def estimate_sdid(panel, treated_units, t_pre):
    model = cm.SyntheticDID(max_iterations=5000)
    model.fit(panel, treatment_matrix(panel, treated_units, t_pre))
    summary = model.summary()
    y0_hat = panel_counterfactual_mean(summary, treated_units)
    effect = panel_effect_path(summary, treated_units, t_pre)
    return Estimate("Synthetic DID", float(summary["att"]), effect, y0_hat, float(summary["pre_rmse"]))


def estimate_mc(panel, treated_units, t_pre):
    treated_mean, _, _ = split_panel(panel, treated_units)
    model = cm.MatrixCompletion(lambda_fraction=0.035, max_iterations=350, tolerance=1e-7)
    model.fit(panel, treatment_matrix(panel, treated_units, t_pre))
    summary = model.summary()
    y0_hat = panel_counterfactual_mean(summary, treated_units)
    effect = panel_effect_path(summary, treated_units, t_pre)
    pre_rmse = float(np.sqrt(np.mean((treated_mean[:t_pre] - y0_hat[:t_pre]) ** 2)))
    return Estimate("Matrix completion", float(summary["att"]), effect, y0_hat, pre_rmse)


ESTIMATORS = [
    ("DID", estimate_did),
    ("TWFE", estimate_twfe),
    ("Horizontal ridge", estimate_horizontal_ridge),
    ("Vertical ridge", estimate_vertical_ridge),
    ("Synthetic control", estimate_sc),
    ("Synthetic DID", estimate_sdid),
    ("Matrix completion", estimate_mc),
]

4 Inference Helper

The donor-placebo standard error and \(p\)-value rotate treatment across donor units at the same intervention date and compare the realized estimate with that placebo distribution. Bootstrap and jackknife columns are kept explicit in the summary table, but are left null where that inferential leaf is not meaningful or not exposed for the estimator. In particular, both case studies have a single treated unit/group, so SDID’s bootstrap and jackknife SEs are undefined; placebo is the relevant SDID inference method here.
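
Written out for \(J\) donor-placebo estimates \(\hat\tau_1, \dots, \hat\tau_J\) and a realized estimate \(\hat\tau\), the helper below uses the sample standard deviation of the finite placebo estimates as the placebo SE and the add-one two-sided rank statistic

\[
p = \frac{1 + \#\{\, j : |\hat\tau_j| \ge |\hat\tau| \,\}}{J + 1},
\]

so the smallest attainable \(p\)-value is \(1/(J+1)\): about 0.059 with the 16 Basque donors and about 0.026 with the 38 California donors.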

Show code
def donor_placebo_distribution(case, estimator_fn, tau_to_add=0.0):
    actual = set(case.treated_units)
    donor_pool = [i for i in range(case.panel.shape[0]) if i not in actual]
    out = []
    for pseudo in donor_pool:
        sub_units = [pseudo] + [i for i in donor_pool if i != pseudo]
        sub_panel = case.panel[sub_units].copy()
        if tau_to_add != 0.0:
            sub_panel[0, case.t_pre:] += tau_to_add
        try:
            out.append(float(estimator_fn(sub_panel, [0], case.t_pre).att))
        except Exception:
            out.append(np.nan)
    return np.asarray(out, dtype=float)


def placebo_standard_error(placebo):
    vals = placebo[np.isfinite(placebo)]
    if len(vals) <= 1:
        return np.nan
    return float(np.std(vals, ddof=1))


def placebo_p_value(actual, placebo):
    vals = placebo[np.isfinite(placebo)]
    if len(vals) == 0:
        return np.nan
    return float((1.0 + np.sum(np.abs(vals) >= abs(actual))) / (len(vals) + 1.0))


def format_number(value):
    return f"{value:.3f}" if np.isfinite(value) else "--"


def inference_summary(name, placebo):
    return {
        "bootstrap_se": np.nan,
        "jackknife_se": np.nan,
        "placebo_se": placebo_standard_error(placebo),
    }

5 Case Study Estimates

Show code
case_results = {}
for case in cases:
    case_results[case.key] = {}
    for name, fn in ESTIMATORS:
        est = fn(case.panel, case.treated_units, case.t_pre)
        placebo = donor_placebo_distribution(case, fn)
        case_results[case.key][name] = {
            "estimate": est,
            "placebo": placebo,
            "inference": inference_summary(name, placebo),
            "placebo_p": placebo_p_value(est.att, placebo),
        }

rows = []
for case in cases:
    for name, _ in ESTIMATORS:
        item = case_results[case.key][name]
        est = item["estimate"]
        inf = item["inference"]
        rows.append(
            [
                case.title,
                name,
                f"{est.att:.3f}",
                "" if est.pre_rmse is None else f"{est.pre_rmse:.3f}",
                format_number(inf["bootstrap_se"]),
                format_number(inf["jackknife_se"]),
                format_number(inf["placebo_se"]),
                format_number(item["placebo_p"]),
            ]
        )

display(
    HTML(
        html_table(
            [
                "Case",
                "Estimator",
                "ATT",
                "Pre RMSE",
                "Bootstrap SE",
                "Jackknife SE",
                "Placebo SE",
                "Donor-placebo p",
            ],
            rows,
        )
    )
)
| Case | Estimator | ATT | Pre RMSE | Bootstrap SE | Jackknife SE | Placebo SE | Donor-placebo p |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Basque Country GDP | DID | -0.431 | | -- | -- | 0.663 | 0.412 |
| Basque Country GDP | TWFE | -0.431 | | -- | -- | 0.663 | 0.412 |
| Basque Country GDP | Horizontal ridge | -1.035 | 0.057 | -- | -- | 0.357 | 0.059 |
| Basque Country GDP | Vertical ridge | -0.806 | | -- | -- | 0.398 | 0.118 |
| Basque Country GDP | Synthetic control | -0.895 | 0.076 | -- | -- | 0.512 | 0.118 |
| Basque Country GDP | Synthetic DID | -0.793 | 0.075 | -- | -- | 0.331 | 0.059 |
| Basque Country GDP | Matrix completion | -0.589 | 0.062 | -- | -- | 0.520 | 0.176 |
| California Proposition 99 | DID | -27.349 | | -- | -- | 17.519 | 0.103 |
| California Proposition 99 | TWFE | -27.349 | | -- | -- | 17.519 | 0.103 |
| California Proposition 99 | Horizontal ridge | -15.756 | 0.002 | -- | -- | 16.174 | 0.308 |
| California Proposition 99 | Vertical ridge | -15.454 | | -- | -- | 16.384 | 0.205 |
| California Proposition 99 | Synthetic control | -18.998 | 2.400 | -- | -- | 10.982 | 0.077 |
| California Proposition 99 | Synthetic DID | -15.605 | 1.725 | -- | -- | 9.490 | 0.051 |
| California Proposition 99 | Matrix completion | -20.082 | 1.105 | -- | -- | 11.994 | 0.103 |

6 Outcome Paths And Gaps

Show code
plot_methods = ["Horizontal ridge", "Synthetic control", "Synthetic DID", "Matrix completion"]
colors = {
    "Donor mean": "#d95f02",
    "Horizontal ridge": "#e7298a",
    "Synthetic control": "#1b9e77",
    "Synthetic DID": "#7570b3",
    "Matrix completion": "#315f9f",
}

fig, axes = plt.subplots(len(cases), 2, figsize=(10, 3.7 * len(cases)), dpi=100, constrained_layout=True)
if len(cases) == 1:
    axes = np.array([axes])
fig.patch.set_facecolor("#f7f4ef")

for row, case in enumerate(cases):
    treated_mean, donor_mean, _ = split_panel(case.panel, case.treated_units)
    ax_path, ax_gap = axes[row]
    for ax in (ax_path, ax_gap):
        ax.set_facecolor("#fbfaf7")
        ax.grid(True, color="#ddd6cc", linewidth=0.8, alpha=0.7)
        ax.axvline(case.treatment_year, color="0.35", ls="--", lw=1.2)

    ax_path.plot(case.years, treated_mean, color="black", lw=2.4, label="Treated")
    ax_path.plot(case.years, donor_mean, color=colors["Donor mean"], lw=1.8, label="Donor mean")
    ax_gap.plot(case.years, treated_mean - donor_mean, color=colors["Donor mean"], lw=1.6, label="Treated - donor mean")

    for method in plot_methods:
        est = case_results[case.key][method]["estimate"]
        if est.counterfactual is None:
            continue
        ax_path.plot(case.years, est.counterfactual, color=colors[method], lw=1.9, label=method)
        ax_gap.plot(case.years, treated_mean - est.counterfactual, color=colors[method], lw=1.8, label=method)

    ax_path.set_title(f"{case.title}: observed and counterfactual paths", weight="bold")
    ax_path.set_xlabel("Year")
    ax_path.set_ylabel(case.outcome_label)
    ax_path.legend(frameon=False, fontsize=8)

    ax_gap.axhline(0.0, color="0.45", lw=1)
    ax_gap.set_title(f"{case.title}: treated minus counterfactual", weight="bold")
    ax_gap.set_xlabel("Year")
    ax_gap.set_ylabel("Gap")
    ax_gap.legend(frameon=False, fontsize=8)

plt.show()

7 Semi-Synthetic Placebo Simulation

This simulation keeps the empirical donor panels intact. For each case and each donor unit, we temporarily declare that donor treated, remove the real treated unit from the donor pool, add a known post-treatment effect, and rerun every estimator. This is not a universal Monte Carlo; it is a compact stress test of the estimator/inference code on the two empirical designs used above.
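
With the injected effect \(\tau\) set to the case's simulation_tau and donor-turn estimates \(\hat\tau_1, \dots, \hat\tau_J\), the scoring below is the usual

\[
\text{Bias} = \frac{1}{J} \sum_{j=1}^{J} (\hat\tau_j - \tau),
\qquad
\text{RMSE} = \sqrt{\frac{1}{J} \sum_{j=1}^{J} (\hat\tau_j - \tau)^{2}},
\]

together with the median absolute error, all computed over the donor turns that return a finite estimate.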

Show code
def donor_turn_simulation(case):
    rows = []
    for name, fn in ESTIMATORS:
        placebo_with_tau = donor_placebo_distribution(case, fn, tau_to_add=case.simulation_tau)
        vals = placebo_with_tau[np.isfinite(placebo_with_tau)]
        err = vals - case.simulation_tau
        rows.append(
            {
                "Case": case.title,
                "Estimator": name,
                "N": len(vals),
                "Truth": case.simulation_tau,
                "Mean estimate": float(vals.mean()),
                "Bias": float(err.mean()),
                "RMSE": float(np.sqrt(np.mean(err**2))),
                "Median abs error": float(np.median(np.abs(err))),
            }
        )
    return rows

sim_rows_raw = []
for case in cases:
    sim_rows_raw.extend(donor_turn_simulation(case))

sim_rows = [
    [
        r["Case"],
        r["Estimator"],
        r["N"],
        f"{r['Truth']:.2f}",
        f"{r['Mean estimate']:.3f}",
        f"{r['Bias']:+.3f}",
        f"{r['RMSE']:.3f}",
        f"{r['Median abs error']:.3f}",
    ]
    for r in sim_rows_raw
]

display(
    HTML(
        html_table(
            ["Case", "Estimator", "Pseudo-treated units", "Truth", "Mean estimate", "Bias", "RMSE", "Median |error|"],
            sim_rows,
        )
    )
)
| Case | Estimator | Pseudo-treated units | Truth | Mean estimate | Bias | RMSE | Median \|error\| |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Basque Country GDP | DID | 16 | -0.75 | -0.750 | +0.000 | 0.642 | 0.334 |
| Basque Country GDP | TWFE | 16 | -0.75 | -0.750 | +0.000 | 0.642 | 0.334 |
| Basque Country GDP | Horizontal ridge | 16 | -0.75 | -0.695 | +0.055 | 0.350 | 0.278 |
| Basque Country GDP | Vertical ridge | 16 | -0.75 | -0.749 | +0.001 | 0.386 | 0.281 |
| Basque Country GDP | Synthetic control | 16 | -0.75 | -0.733 | +0.017 | 0.496 | 0.241 |
| Basque Country GDP | Synthetic DID | 16 | -0.75 | -0.751 | -0.001 | 0.321 | 0.165 |
| Basque Country GDP | Matrix completion | 16 | -0.75 | -0.732 | +0.018 | 0.504 | 0.310 |
| California Proposition 99 | DID | 38 | -15.00 | -15.000 | +0.000 | 17.287 | 9.050 |
| California Proposition 99 | TWFE | 38 | -15.00 | -15.000 | -0.000 | 17.287 | 9.050 |
| California Proposition 99 | Horizontal ridge | 38 | -15.00 | -17.783 | -2.783 | 16.201 | 7.517 |
| California Proposition 99 | Vertical ridge | 38 | -15.00 | -16.094 | -1.094 | 16.204 | 7.515 |
| California Proposition 99 | Synthetic control | 38 | -15.00 | -15.759 | -0.759 | 10.863 | 7.108 |
| California Proposition 99 | Synthetic DID | 38 | -15.00 | -14.612 | +0.388 | 9.372 | 6.108 |
| California Proposition 99 | Matrix completion | 38 | -15.00 | -15.559 | -0.559 | 11.849 | 7.209 |

8 Reading The Results

The Basque and California panels are small enough that no single inferential convention should be treated as dispositive. That is exactly the Shen-Ding-Sekhon-Yu lesson. The table therefore keeps unavailable bootstrap/jackknife leaves null and reports the donor-placebo leaf where it is relevant. For SDID in these two single-treated-unit designs, that means placebo inference only. A donor-placebo standard error or \(p\)-value asks whether the treated unit looks unusual relative to donor units assigned the same historical intervention date.

The semi-synthetic donor-turn simulation is the sanity check: when the true effect is known and injected into real donor paths, the synthetic-control-style and matrix-completion leaves usually beat raw DID/TWFE when pre-treatment factor mismatch matters, but their placebo inference remains limited by the size and quality of the donor pool. In other words, the estimators and the inference really are leaves from the same root, not interchangeable black boxes.