pyrollmatch Validation Report

Comparing pyrollmatch (Python/polars) vs rollmatch (R) on synthetic data.

1. Test Configuration

Parameter         Value
Treated units     500
Control units     2000
Time periods      20
Covariates        x1, x2, x3, x4, x5
Alpha (caliper)   0.1
Lookback          1
Num matches       3
Replacement       True
Seed              42

2. Summary Comparison

Metric                  Python                R
Runtime                 0.05s                 0.516s (wall time 1.1s incl. startup)
Treated units matched   500 / 500 (100.0%)    500
Matched pairs           1498                  1496

3. Propensity Score Comparison

Score correlation (R vs Python): 1.000000 to six decimal places (perfect agreement = 1.000000)

Mean absolute error: 0.000073 (the difference is attributed to the GLM solver: R glm vs sklearn)
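
The two statistics above can be reproduced from the paired score vectors. The following is a minimal sketch, assuming the propensity scores from both implementations have been exported as equal-length sequences aligned row by row. The function name `compare_scores` and that alignment are assumptions made for illustration; they are not part of either package.

```python
import math


def compare_scores(scores_a, scores_b):
    """Return (Pearson correlation, mean absolute error) for two aligned score lists."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two aligned sequences with at least two scores each")
    n = len(scores_a)
    mean_a = sum(scores_a) / n
    mean_b = sum(scores_b) / n
    # Sums of cross-products and squares around the means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(scores_a, scores_b))
    ss_a = sum((a - mean_a) ** 2 for a in scores_a)
    ss_b = sum((b - mean_b) ** 2 for b in scores_b)
    corr = cov / math.sqrt(ss_a * ss_b)  # undefined if either list is constant
    mae = sum(abs(a - b) for a, b in zip(scores_a, scores_b)) / n
    return corr, mae


# Toy data: the second list is the first shifted by a constant.
corr, mae = compare_scores([0.1, 0.2, 0.4], [0.2, 0.3, 0.5])
```

A constant shift leaves the correlation at 1 while the mean absolute error is nonzero, which is why the report lists both statistics.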

4. Matched Pair Overlap

Metric          Count   %
Python pairs    1498
R pairs         1496
Exact overlap   1010    67.5%
Python only     488
R only          486

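Note: The two GLM solvers produce slightly different propensity scores (see the mean absolute error in Section 3), so the implementations can select different controls for the same treated unit within tight calipers. This is expected and does not indicate a bug; the relevant criterion is the covariate balance achieved after matching.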
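
The overlap counts can be derived with plain set operations. This is a minimal sketch, assuming each matched pair is represented as a hashable tuple such as (treated_id, control_id); the tuple layout and the name `pair_overlap` are hypothetical, and with rolling entry a real comparison may also need the time period in the key.

```python
def pair_overlap(pairs_a, pairs_b):
    """Count pairs found in both results, only in the first, and only in the second."""
    a, b = set(pairs_a), set(pairs_b)
    return {"both": len(a & b), "only_a": len(a - b), "only_b": len(b - a)}


# Toy example with hypothetical (treated_id, control_id) pairs.
python_pairs = [(1, 10), (1, 11), (2, 12)]
r_pairs = [(1, 10), (2, 12), (2, 13)]
counts = pair_overlap(python_pairs, r_pairs)
```

The table's counts are consistent with this decomposition: 1010 + 488 = 1498 Python pairs and 1010 + 486 = 1496 R pairs.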

5. Covariate Balance (SMD)

Covariate   Unmatched SMD   Python Matched SMD   R Matched SMD
x1           0.0364          0.0013               0.0302
x2           0.0244          0.0032               0.0248
x3          -0.0462         -0.0103              -0.0052
x4          -0.0046         -0.0019              -0.0029
x5          -0.0970          0.0007               0.0088
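
For readers unfamiliar with the metric, a standardized mean difference (SMD) can be sketched as follows. The report does not state which denominator pyrollmatch or rollmatch uses, so the pooled standard deviation below (the square root of the average of the two group variances) is an assumption, as is the function name `smd`.

```python
import math
import statistics


def smd(treated, control):
    """Standardized mean difference using a pooled SD (assumed convention)."""
    mean_diff = statistics.fmean(treated) - statistics.fmean(control)
    # Pool the two sample variances by simple averaging.
    pooled_var = (statistics.variance(treated) + statistics.variance(control)) / 2
    return mean_diff / math.sqrt(pooled_var)


# Toy data: means differ by 1 and both sample variances equal 1.
example = smd([2, 3, 4], [1, 2, 3])
```

The sign follows the treated-minus-control convention, so a negative value means the treated group has the lower mean.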

6. Post-Matching Diagnostics (Python)

6a. Balance Test (SMD + t-test + Variance Ratio + KS)

Covariate   SMD       t-test p   Var Ratio   KS p
x1           0.0013   0.9799     1.114       0.7955
x2           0.0032   0.9500     0.949       0.7667
x3          -0.0103   0.8426     0.944       0.9884
x4          -0.0019   0.9715     1.047       0.9608
x5           0.0007   0.9892     1.046       0.9515
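
Two of the statistics behind this table can be illustrated with the standard library alone. This is a rough sketch, not the validation suite's code: it computes the variance ratio and the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical CDFs) but not the p-values, which would normally come from a statistics library. The function names are hypothetical.

```python
import statistics
from bisect import bisect_right


def variance_ratio(treated, control):
    """Ratio of sample variances, treated over control; 1.0 means equal spread."""
    return statistics.variance(treated) / statistics.variance(control)


def ks_statistic(treated, control):
    """Largest absolute gap between the two empirical CDFs."""
    t_sorted, c_sorted = sorted(treated), sorted(control)

    def gap(x):
        # Empirical CDF value at x = share of observations <= x.
        cdf_t = bisect_right(t_sorted, x) / len(t_sorted)
        cdf_c = bisect_right(c_sorted, x) / len(c_sorted)
        return abs(cdf_t - cdf_c)

    # The maximum gap is always attained at an observed data point.
    return max(gap(x) for x in t_sorted + c_sorted)


ratio = variance_ratio([0, 2, 4], [1, 2, 3])
separated = ks_statistic([1, 2, 3], [4, 5, 6])
```

A variance ratio near 1 and a small KS statistic both indicate that the matched groups have similar distributions, not just similar means.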

6b. TOST Equivalence Test (bound = 0.36σ)

Covariate   Equiv Bound (δ)   TOST p     Equivalent?
x1          0.7245            < 0.0001   Yes
x2          0.7073            < 0.0001   Yes
x3          0.7014            < 0.0001   Yes
x4          0.7236            < 0.0001   Yes
x5          0.6670            < 0.0001   Yes
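
The logic of a two one-sided tests (TOST) procedure can be sketched as below. This is a large-sample illustration that uses the normal approximation rather than the t distribution, and it is not the validation suite's implementation. It takes the equivalence bound δ in the covariate's raw units; how the report derives δ from 0.36σ is not stated here, and the significance level alpha=0.05 is an assumption.

```python
import math
import statistics
from statistics import NormalDist


def tost_equivalence(treated, control, bound, alpha=0.05):
    """Return (TOST p-value, equivalent?) for a mean difference within +/- bound."""
    diff = statistics.fmean(treated) - statistics.fmean(control)
    # Standard error of the difference in means (unequal variances allowed).
    se = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    std_normal = NormalDist()
    # One-sided test against H0: diff <= -bound.
    p_lower = 1.0 - std_normal.cdf((diff + bound) / se)
    # One-sided test against H0: diff >= +bound.
    p_upper = std_normal.cdf((diff - bound) / se)
    # Equivalence requires rejecting both, so report the larger p-value.
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Toy data: identical groups of 100 values each.
group = [0.0, 1.0, 2.0, 3.0] * 25
p_wide, ok_wide = tost_equivalence(group, group, bound=1.0)
p_tight, ok_tight = tost_equivalence(group, group, bound=0.01)
```

With a wide bound the identical groups are declared equivalent, while a very tight bound cannot be confirmed at this sample size, even though the observed difference is zero.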

7. Verdict

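VALIDATED: pyrollmatch achieves matching quality equivalent to R rollmatch on this synthetic dataset. Propensity scores agree closely (correlation 1.000000, mean absolute error 0.000073), and both implementations achieve good covariate balance: every matched absolute SMD is below 0.011 for Python and below 0.031 for R.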

Generated by pyrollmatch validation suite. Seed=42.