did.estimation.did2s(data, yname, first_stage, second_stage, treatment, cluster)
Estimate a Difference-in-Differences model using Gardner’s two-step DID2S estimator.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| data | pd.DataFrame | The DataFrame containing all variables. | required |
| yname | str | The name of the dependent variable. | required |
| first_stage | str | The formula for the first stage, starting with ‘~’. | required |
| second_stage | str | The formula for the second stage, starting with ‘~’. | required |
| treatment | str | The name of the treatment variable. | required |
| cluster | str | The name of the cluster variable. | required |
Returns
| Type | Description |
|---|---|
| object | A fitted model object of class [Feols](/reference/Feols.qmd). |
Examples
import pandas as pd
import numpy as np
from pyfixest.did.estimation import did2s
url = "https://raw.githubusercontent.com/py-econometrics/pyfixest/master/pyfixest/did/data/df_het.csv"
df_het = pd.read_csv(url)
df_het.head()
|  | unit | state | group | unit_fe | g | year | year_fe | treat | rel_year | rel_year_binned | error | te | te_dynamic | dep_var |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1990 | 0.066159 | False | -20.0 | -6 | -0.086466 | 0 | 0.0 | 7.022709 |
| 1 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1991 | -0.030980 | False | -19.0 | -6 | 0.766593 | 0 | 0.0 | 7.778628 |
| 2 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1992 | -0.119607 | False | -18.0 | -6 | 1.512968 | 0 | 0.0 | 8.436377 |
| 3 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1993 | 0.126321 | False | -17.0 | -6 | 0.021870 | 0 | 0.0 | 7.191207 |
| 4 | 1 | 33 | Group 2 | 7.043016 | 2010 | 1994 | -0.106921 | False | -16.0 | -6 | -0.017603 | 0 | 0.0 | 6.918492 |
In a first step, we estimate a classical event study model:
# estimate the model
fit = did2s(
df_het,
yname="dep_var",
first_stage="~ 0 | unit + year",
second_stage="~i(rel_year, ref=-1.0)",
treatment="treat",
cluster="state",
)
fit.tidy().head()
| Coefficient | Estimate | Std. Error | t value | Pr(>\|t\|) | 2.5% | 97.5% |
|---|---|---|---|---|---|---|
| C(rel_year, contr.treatment(base=-1.0))[T.-inf] | 1.905043e-09 | 3.134439e-10 | 6.077778 | 4.038252e-07 | 1.271042e-09 | 2.539043e-09 |
| C(rel_year, contr.treatment(base=-1.0))[T.-20.0] | -5.822583e-02 | 3.580900e-02 | -1.626011 | 1.120020e-01 | -1.306564e-01 | 1.420471e-02 |
| C(rel_year, contr.treatment(base=-1.0))[T.-19.0] | -6.032229e-03 | 3.034072e-02 | -0.198816 | 8.434394e-01 | -6.740213e-02 | 5.533768e-02 |
| C(rel_year, contr.treatment(base=-1.0))[T.-18.0] | -6.152383e-03 | 3.509400e-02 | -0.175312 | 8.617419e-01 | -7.713670e-02 | 6.483193e-02 |
| C(rel_year, contr.treatment(base=-1.0))[T.-17.0] | -1.253330e-02 | 2.483369e-02 | -0.504690 | 6.166168e-01 | -6.276418e-02 | 3.769757e-02 |
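The coefficient labels in this output embed the relative period as the factor level in square brackets. For filtering or custom plotting it can help to recover that number. The following is a minimal sketch using only the standard library; `relative_period` and `LEVEL_PATTERN` are hypothetical names introduced here and are not part of pyfixest.

```python
import re

# Hypothetical helper (not part of pyfixest): extract the factor level from
# labels such as "C(rel_year, contr.treatment(base=-1.0))[T.-20.0]".
LEVEL_PATTERN = re.compile(r"\[T\.(.+)\]$")


def relative_period(label: str) -> float:
    """Return the relative period encoded at the end of a coefficient label."""
    match = LEVEL_PATTERN.search(label)
    if match is None:
        raise ValueError(f"no factor level found in {label!r}")
    # float() also parses the "-inf" level shown in the output.
    return float(match.group(1))


# Labels copied from the output shown in this section.
labels = [
    "C(rel_year, contr.treatment(base=-1.0))[T.-inf]",
    "C(rel_year, contr.treatment(base=-1.0))[T.-20.0]",
    "C(rel_year, contr.treatment(base=-1.0))[T.-19.0]",
]
periods = [relative_period(label) for label in labels]
print(periods)
```

Anchoring the pattern at the end of the label avoids matching the `base=-1.0` reference level that appears earlier in the string.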
We can also inspect the model visually:
fit.iplot(figsize=[1200, 400], coord_flip=False).show()
To estimate a pooled effect, change the second-stage formula so that it uses the treatment indicator instead of the relative-year dummies:
fit = did2s(
df_het,
yname="dep_var",
first_stage="~ 0 | unit + year",
second_stage="~i(treat)",
treatment="treat",
cluster="state"
)
fit.tidy().head()
| Coefficient | Estimate | Std. Error | t value | Pr(>\|t\|) | 2.5% | 97.5% |
|---|---|---|---|---|---|---|
| C(treat)[T.True] | 2.230482 | 0.024709 | 90.271437 | 0.0 | 2.180504 | 2.28046 |
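To build intuition for what the two stages do, the point estimate can be sketched by hand on a small, noise-free synthetic panel. This is an illustration under stated assumptions, not pyfixest's implementation: the panel, the constant effect `TRUE_EFFECT`, and the backfitting loop are all invented for the example, the second stage is assumed to have no intercept, and no standard errors are computed.

```python
from statistics import mean

TRUE_EFFECT = 2.0  # assumed constant treatment effect (illustrative)

# Synthetic balanced panel: units 0-2 are treated from year 3 onward,
# units 3-5 are never treated. The outcome has no noise.
rows = []
for unit in range(6):
    for year in range(6):
        treat = unit < 3 and year >= 3
        y = 1.0 * unit + 0.5 * year + TRUE_EFFECT * treat
        rows.append({"unit": unit, "year": year, "treat": treat, "y": y})

# Stage 1: estimate unit and year fixed effects on untreated observations
# only, here via simple backfitting (alternating group means).
untreated = [r for r in rows if not r["treat"]]
unit_fe = {u: 0.0 for u in range(6)}
year_fe = {t: 0.0 for t in range(6)}
for _ in range(200):
    for u in unit_fe:
        unit_fe[u] = mean(
            r["y"] - year_fe[r["year"]] for r in untreated if r["unit"] == u
        )
    for t in year_fe:
        year_fe[t] = mean(
            r["y"] - unit_fe[r["unit"]] for r in untreated if r["year"] == t
        )

# Residualize every observation, treated ones included.
for r in rows:
    r["resid"] = r["y"] - unit_fe[r["unit"]] - year_fe[r["year"]]

# Stage 2: regress the residual on the treatment dummy without an intercept,
# which reduces to the mean residual among treated observations.
pooled_effect = mean(r["resid"] for r in rows if r["treat"])
print(round(pooled_effect, 6))
```

Because the fixed effects are fitted only where treatment is absent, the treated residuals isolate the treatment effect. For real data, use `did2s` itself rather than this sketch.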