pyrollmatch Validation Report
Comparing pyrollmatch (Python/polars) vs rollmatch (R) on synthetic data.
1. Test Configuration
| Parameter | Value |
| Treated units | 500 |
| Control units | 2000 |
| Time periods | 20 |
| Covariates | x1, x2, x3, x4, x5 |
| Alpha (caliper) | 0.1 |
| Lookback | 1 |
| Num matches | 3 |
| Replacement | True |
| Seed | 42 |
2. Summary Comparison
| Metric | Value | Detail |
| R runtime | 0.516s | Wall time: 1.1s (incl. startup) |
| Python matched | 500 / 500 | 100.0% match rate |
| R matched | 500 | 1498 (Python) vs 1496 (R) pairs |
3. Propensity Score Comparison
| Metric | Value | Note |
| Score correlation (R vs Python) | 1.000000 | Perfect agreement = 1.000000 |
| Mean absolute error | 0.000073 | Difference due to GLM solver (R glm vs sklearn) |
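The two agreement metrics above can be reproduced from any pair of aligned score vectors. The following is a minimal sketch using only the standard library; the function names and the toy scores are illustrative assumptions, not part of the validation suite.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_abs_error(a, b):
    """Average absolute difference between paired values."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Toy propensity scores (hypothetical values, one entry per unit-period).
r_scores = [0.10, 0.25, 0.40, 0.55, 0.70]
py_scores = [0.1001, 0.2499, 0.4001, 0.5499, 0.7001]

print(f"correlation: {pearson_r(r_scores, py_scores):.6f}")
print(f"MAE:         {mean_abs_error(r_scores, py_scores):.6f}")
```

Both vectors must be sorted by the same unit and period keys before comparison; otherwise the metrics are meaningless.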
4. Matched Pair Overlap
| Metric | Count | % of R pairs |
| Python pairs | 1498 | — |
| R pairs | 1496 | — |
| Exact overlap | 1010 | 67.5% |
| Python only | 488 | |
| R only | 486 | |
Note: The two GLM solvers produce slightly different propensity scores (mean
absolute error 0.000073), so near-tied candidates within the caliper can be ranked
differently and yield different pairs. This is expected and does not indicate a
bug; what matters is balance quality.
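The overlap counts can be derived with plain set operations on (treated, control) identifier pairs. This is a sketch under the assumption that each pair is represented as a tuple of IDs; the function name and toy pairs are hypothetical.

```python
def pair_overlap(py_pairs, r_pairs):
    """Count shared and implementation-specific (treated, control) pairs."""
    py_set, r_set = set(py_pairs), set(r_pairs)
    return {
        "python_pairs": len(py_set),
        "r_pairs": len(r_set),
        "exact_overlap": len(py_set & r_set),
        "python_only": len(py_set - r_set),
        "r_only": len(r_set - py_set),
    }


# Toy pairs (hypothetical IDs): treated unit first, matched control second.
py_pairs = [(1, 101), (1, 102), (2, 103), (3, 104)]
r_pairs = [(1, 101), (2, 103), (3, 105)]

print(pair_overlap(py_pairs, r_pairs))
```

The table above follows the same bookkeeping: 1498 - 1010 = 488 Python-only pairs and 1496 - 1010 = 486 R-only pairs.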
5. Covariate Balance (SMD)
| Covariate | Unmatched SMD | Python Matched SMD | R Matched SMD |
| x1 | 0.0364 | 0.0013 | 0.0302 |
| x2 | 0.0244 | 0.0032 | 0.0248 |
| x3 | -0.0462 | -0.0103 | -0.0052 |
| x4 | -0.0046 | -0.0019 | -0.0029 |
| x5 | -0.0970 | 0.0007 | 0.0088 |
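For readers unfamiliar with the metric, a standardized mean difference can be sketched as follows. The report does not state which denominator the suite uses; this example assumes the common convention of dividing by the square root of the average of the two group variances.

```python
import math
import statistics


def smd(treated, control):
    """Standardized mean difference.

    Assumption: the denominator is sqrt((var_t + var_c) / 2), using sample
    variances. Other conventions (e.g. treated-group SD only) also exist.
    """
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2
    )
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


# Toy covariate values (hypothetical).
treated = [1.0, 2.0, 3.0, 4.0, 5.0]
control = [2.0, 3.0, 4.0, 5.0, 6.0]

print(f"SMD: {smd(treated, control):.4f}")
```

A negative value means the treated group has the lower mean, which is how the negative entries for x3, x4, and x5 should be read.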
6. Post-Matching Diagnostics (Python)
6a. Balance Test (SMD + t-test + Variance Ratio + KS)
| Covariate | SMD | t-test p | Var Ratio | KS p |
| x1 | 0.0013 | 0.9799 | 1.114 | 0.7955 |
| x2 | 0.0032 | 0.9500 | 0.949 | 0.7667 |
| x3 | -0.0103 | 0.8426 | 0.944 | 0.9884 |
| x4 | -0.0019 | 0.9715 | 1.047 | 0.9608 |
| x5 | 0.0007 | 0.9892 | 1.046 | 0.9515 |
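Two of the diagnostics in this table, the variance ratio and the two-sample Kolmogorov-Smirnov statistic, are simple enough to sketch with the standard library. The example below computes the KS statistic only, not its p-value; in practice the p-values would likely come from a statistics library such as SciPy (`scipy.stats.ttest_ind`, `scipy.stats.ks_2samp`). Function names and data here are illustrative assumptions.

```python
import statistics
from bisect import bisect_right


def variance_ratio(treated, control):
    """Ratio of sample variances; values near 1 indicate similar spread."""
    return statistics.variance(treated) / statistics.variance(control)


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    # The gap between two step functions can only change at observed values.
    for x in sa + sb:
        cdf_a = bisect_right(sa, x) / len(sa)
        cdf_b = bisect_right(sb, x) / len(sb)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy covariate values (hypothetical).
treated = [1.0, 2.0, 3.0, 4.0]
control = [3.0, 4.0, 5.0, 6.0]

print(f"variance ratio: {variance_ratio(treated, control):.3f}")
print(f"KS statistic:   {ks_statistic(treated, control):.3f}")
```

The toy groups have identical spread but shifted location, so the variance ratio is 1 while the KS statistic is large; the two diagnostics capture different aspects of imbalance.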
6b. TOST Equivalence Test (bound = 0.36σ)
| Covariate | Equiv Bound (δ) | TOST p | Equivalent? |
| x1 | 0.7245 | 0.0000 | Yes |
| x2 | 0.7073 | 0.0000 | Yes |
| x3 | 0.7014 | 0.0000 | Yes |
| x4 | 0.7236 | 0.0000 | Yes |
| x5 | 0.6670 | 0.0000 | Yes |
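The TOST (two one-sided tests) logic can be sketched as below. This is not the suite's implementation: it assumes a large-sample normal approximation with an unpooled standard error, whereas a production version would more likely use a t distribution. The `delta` argument plays the role of the Equiv Bound (δ) column.

```python
import math
import statistics
from statistics import NormalDist


def tost_p_value(treated, control, delta):
    """TOST p-value for equivalence of means within +/- delta.

    Assumption: normal approximation with an unpooled (Welch-style) standard
    error. Equivalence is claimed when the returned p-value is below the
    chosen significance level.
    """
    diff = statistics.mean(treated) - statistics.mean(control)
    se = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    z = NormalDist()
    p_lower = 1 - z.cdf((diff + delta) / se)  # H0: diff <= -delta
    p_upper = z.cdf((diff - delta) / se)      # H0: diff >= +delta
    return max(p_lower, p_upper)


# Toy data (hypothetical): two groups whose means differ by 0.01.
treated = [0.1 * i for i in range(100)]
control = [v + 0.01 for v in treated]

print(f"wide bound:   p = {tost_p_value(treated, control, delta=2.0):.4f}")
print(f"narrow bound: p = {tost_p_value(treated, control, delta=0.1):.4f}")
```

With a wide bound both one-sided nulls are rejected and the groups are declared equivalent; with a narrow bound the same data cannot establish equivalence, which is why the choice of δ matters when reading the table.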
7. Verdict
VALIDATED: pyrollmatch produces matching quality equivalent to R rollmatch. Propensity scores agree closely (correlation 1.000000, mean absolute error 0.000073), and both implementations achieve good covariate balance: every matched SMD is below 0.04 in absolute value.
Generated by pyrollmatch validation suite. Seed=42.