Metadata-Version: 2.4
Name: pybdiff
Version: 0.1.0
Summary: Python replication of Stata's bdiff command — tests whether regression coefficients differ across two subgroups
Project-URL: Homepage, https://github.com/luzhiyu-econ/pybdiff
Project-URL: Repository, https://github.com/luzhiyu-econ/pybdiff
Project-URL: Issues, https://github.com/luzhiyu-econ/pybdiff/issues
Author-email: luzhiyu-econ <zhiyu.lu.econ@icloud.com>
License: MIT License
        
        Copyright (c) 2025 luzhiyu-econ
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Requires-Python: >=3.9
Requires-Dist: joblib
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: pyfixest
Requires-Dist: scipy
Requires-Dist: tqdm
Description-Content-Type: text/markdown

# pybdiff

Python replication of Stata's `bdiff` command. It tests whether regression coefficients differ across two subgroups, using a permutation test, a bootstrap test, or a parametric Wald test, with optional parallel execution.

## Attribution

This package is a Python reimplementation of the Stata command `bdiff` originally written by:

> Lian, Yujun (连玉君). *bdiff*: Coefficient Difference Test across Two Groups.
> Version 1.04, 24 Nov 2020. Sun Yat-sen University.
> Email: arlionn@163.com · Blog: https://www.lianxh.cn

All methodological credit belongs to the original author. The bootstrap and permutation procedures follow:

> Efron, B., Tibshirani, R., 1993. *An Introduction to the Bootstrap*. Chapman & Hall.

## Installation

```bash
pip install pybdiff
```

## Quick Start

```python
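from pybdiff import bdiff

# `df` is a pandas DataFrame containing every column named below
# (y, x1, x2, firm, year) plus the 0/1 group column "treated".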
result = bdiff(
    df      = df,
    group   = "treated",          # column with values 0 and 1
    formula = "y ~ x1 + x2 | firm + year",
    vcov    = {"CRV1": "firm"},
    method  = "permutation",      # "permutation", "bootstrap", or "wald"
    reps    = 500,
    seed    = 42,
    n_jobs  = -1,                 # use all CPU cores
)
print(result)
```

## Methods

| Method | Type | Description |
|--------|------|-------------|
| `permutation` | Non-parametric | Randomly shuffles group labels (default) |
| `bootstrap` | Non-parametric | Resamples each group with replacement |
| `wald` | Parametric | Chi-squared test using block-diagonal VCV |
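
The two resampling methods in the table can be illustrated with a minimal, dependency-free sketch. It is not pybdiff's implementation: it uses a single-regressor OLS slope in place of a PyFixest model, and the helper names (`slope`, `coef_diff`, `permutation_pvalue`, `bootstrap_diffs`), the synthetic data, and the two-sided counting rule are all assumptions made for illustration.

```python
import random


def slope(xs, ys):
    """OLS slope of y on x (with intercept), computed from sample moments."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def coef_diff(rows):
    """Slope in group 0 minus slope in group 1; rows are (group, x, y) tuples."""
    g0 = [(x, y) for g, x, y in rows if g == 0]
    g1 = [(x, y) for g, x, y in rows if g == 1]
    return slope(*zip(*g0)) - slope(*zip(*g1))


def permutation_pvalue(rows, reps=500, seed=42):
    """Shuffle group labels and count how often |permuted diff| >= |observed diff|."""
    rng = random.Random(seed)
    observed = coef_diff(rows)
    labels = [g for g, _, _ in rows]
    exceed = 0
    for _ in range(reps):
        shuffled = labels[:]
        rng.shuffle(shuffled)  # group sizes are preserved, only membership changes
        permuted = [(g, x, y) for g, (_, x, y) in zip(shuffled, rows)]
        if abs(coef_diff(permuted)) >= abs(observed):
            exceed += 1
    return observed, exceed, exceed / reps


def bootstrap_diffs(rows, reps=500, seed=42):
    """Resample each group with replacement and recompute the difference."""
    rng = random.Random(seed)
    g0 = [r for r in rows if r[0] == 0]
    g1 = [r for r in rows if r[0] == 1]
    diffs = []
    for _ in range(reps):
        sample = rng.choices(g0, k=len(g0)) + rng.choices(g1, k=len(g1))
        diffs.append(coef_diff(sample))
    return diffs


# Synthetic data (illustrative only): true slope is 2.0 in group 0, 0.5 in group 1.
data_rng = random.Random(0)
rows = []
for i in range(200):
    g = i % 2
    x = data_rng.gauss(0, 1)
    beta = 2.0 if g == 0 else 0.5
    rows.append((g, x, beta * x + data_rng.gauss(0, 0.5)))

observed, count, p_value = permutation_pvalue(rows, reps=200)
boot = bootstrap_diffs(rows, reps=200)
print(f"observed diff = {observed:.3f}, count = {count}, p = {p_value:.3f}")
print(f"bootstrap mean diff = {sum(boot) / len(boot):.3f}")
```

In this sketch the permutation count plays the role of a frequency-count statistic, and the bootstrap draws give an empirical distribution of the difference. How pybdiff turns those quantities into its reported `stat` and `p_value` may differ.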

## Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `df` | `pd.DataFrame` | — | Dataset with a 0/1 group column |
| `group` | `str` | — | Name of the grouping variable |
| `formula` | `str` | — | PyFixest formula, e.g. `"y ~ x1 + x2 \| fe"` |
| `vcov` | `str \| dict` | `"iid"` | Variance-covariance type |
| `method` | `str` | `"permutation"` | Test method |
| `reps` | `int` | `500` | Resampling iterations (ignored for Wald) |
| `seed` | `int \| None` | `42` | Master random seed |
| `n_jobs` | `int` | `1` | Parallel workers (`-1` = all cores) |
| `verbose` | `bool` | `True` | Print progress bar and results table |
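
One way a master `seed` can stay reproducible regardless of `n_jobs` is to derive one child seed per replication before any work is dispatched. The sketch below shows that idea with the standard library only. It is an assumption about the general technique, not a description of pybdiff's internals: the package depends on joblib, while this example uses `concurrent.futures`, and `child_seeds`, `one_replication`, and `run` are hypothetical names.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def child_seeds(master_seed, reps):
    """Derive one seed per replication from a single master seed."""
    rng = random.Random(master_seed)
    return [rng.randrange(2**32) for _ in range(reps)]


def one_replication(seed):
    """Stand-in for one resampling iteration; its result depends only on its seed."""
    rng = random.Random(seed)
    return rng.random()


def run(master_seed, reps, n_workers):
    """Run all replications; pool.map returns results in submission order."""
    seeds = child_seeds(master_seed, reps)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(one_replication, seeds))


serial = run(master_seed=42, reps=8, n_workers=1)
parallel = run(master_seed=42, reps=8, n_workers=4)
print(serial == parallel)
```

Because each replication owns its generator, the number of workers changes only the scheduling, not the draws.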

## Returns

`pd.DataFrame` indexed by variable name with columns:

- `b_group0`, `b_group1` — estimated coefficients per group
- `diff` — coefficient difference (b0 − b1)
- `stat` — test statistic (frequency count or chi-squared)
- `p_value` — two-sided p-value
- `se_diff` — standard error of the difference (Wald only)
- `valid_reps` — successful resampling iterations (resampling only)
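
For the Wald method, the relationship between `diff`, `se_diff`, `stat`, and `p_value` can be sketched for a single coefficient. With a block-diagonal VCV the cross-group covariance is zero, so the variance of the difference is the sum of the two group variances. The function name `wald_diff_test`, the one-degree-of-freedom test, and the input numbers are assumptions for illustration; pybdiff's own computation may differ in detail.

```python
import math


def wald_diff_test(b0, se0, b1, se1):
    """Wald test of b0 == b1 assuming zero covariance between the two estimates."""
    diff = b0 - b1
    se_diff = math.sqrt(se0 ** 2 + se1 ** 2)  # block-diagonal VCV: variances add
    stat = (diff / se_diff) ** 2              # chi-squared with 1 degree of freedom
    # Upper tail of chi-squared(1): P(Z^2 > stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return {"diff": diff, "se_diff": se_diff, "stat": stat, "p_value": p_value}


# Made-up coefficient estimates and standard errors, for illustration only.
res = wald_diff_test(b0=0.80, se0=0.30, b1=0.20, se1=0.40)
print(res)
```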

## Dependencies

- [pyfixest](https://github.com/py-econometrics/pyfixest)
- pandas, numpy, scipy, joblib, tqdm

## License

MIT
