Metadata-Version: 2.4
Name: pycointbreak
Version: 0.1.0
Summary: Tests for breaks in fractional cointegration (Hassler & Breitung 2006; Rodrigues, Sibbertsen & Voges 2019).
Home-page: https://github.com/merwanroudane/pycointbreak
Author: Dr. Merwan Roudane
Author-email: "Dr. Merwan Roudane" <merwanroudane920@gmail.com>
Maintainer-email: "Dr. Merwan Roudane" <merwanroudane920@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/merwanroudane/pycointbreak
Project-URL: Repository, https://github.com/merwanroudane/pycointbreak
Project-URL: Bug Tracker, https://github.com/merwanroudane/pycointbreak/issues
Keywords: econometrics,fractional cointegration,structural breaks,time series,Hassler-Breitung,Rodrigues-Sibbertsen-Voges,long memory,LM test
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.22
Requires-Dist: pandas>=1.4
Requires-Dist: scipy>=1.8
Requires-Dist: statsmodels>=0.13
Requires-Dist: matplotlib>=3.5
Requires-Dist: seaborn>=0.12
Provides-Extra: examples
Requires-Dist: yfinance>=0.2; extra == "examples"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# pycointbreak

> **Tests for breaks in fractional cointegration** — a Python
> implementation of the Hassler & Breitung (2006) residual-based LM
> test and the Rodrigues, Sibbertsen & Voges (2019) supremum tests
> for breaks in the cointegrating relationship.

![python](https://img.shields.io/badge/python-3.10%2B-blue)
![status](https://img.shields.io/badge/status-research%20release-orange)
![license](https://img.shields.io/badge/license-MIT-green)

---

## Author

**Dr. Merwan Roudane**
Email: [merwanroudane920@gmail.com](mailto:merwanroudane920@gmail.com)
GitHub: <https://github.com/merwanroudane/pycointbreak>

If you use this library in academic work, please cite both the
underlying papers and this package — see `pycointbreak.cite()` for
ready-made BibTeX entries.

---

## What does it do?

Given a scalar series `y` and a (possibly multivariate) series `x`,
each integrated of order `d`, `pycointbreak` lets you test whether the
relationship

```
y_t = β' x_t + z_t,        z_t ~ I(d − b),   b ≥ 0,
```

is

1. **Cointegrated at all** (`b > 0`) — using the Hassler & Breitung
   (2006) LM-type test.
2. **Cointegrated only on part of the sample** (segmented fractional
   cointegration) — using the Rodrigues, Sibbertsen & Voges (2019)
   sup-tests over split, forward incremental, backward incremental,
   and rolling sub-samples.

When the tests point to a break, the library can also estimate the
**break date** using the consistent estimator of RSV (2019, eq. 19).
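
To make the model above concrete, here is a minimal NumPy sketch that
builds one cointegrated pair by truncated fractional integration. The
helper name `frac_integrate_sketch` and the parameter values are
illustrative assumptions; this is not necessarily the DGP used by the
package's own simulators.

```python
import numpy as np

def frac_integrate_sketch(u, d):
    """Apply the truncated filter (1 - L)^(-d) to u (pre-sample values = 0).

    Hypothetical helper for illustration only.
    """
    u = np.asarray(u, dtype=float)
    n = len(u)
    psi = np.ones(n)
    # Coefficients of (1 - L)^(-d): psi_0 = 1, psi_j = psi_{j-1} * (j - 1 + d) / j
    for j in range(1, n):
        psi[j] = psi[j - 1] * (j - 1 + d) / j
    # Truncated convolution: sum_{j=0}^{t} psi_j * u_{t-j}
    return np.convolve(u, psi)[:n]

rng = np.random.default_rng(0)
T, d, b, beta = 300, 1.0, 0.4, 1.0           # assumed illustrative values

x = frac_integrate_sketch(rng.standard_normal(T), d)      # x_t ~ I(d)
z = frac_integrate_sketch(rng.standard_normal(T), d - b)  # z_t ~ I(d - b)
y = beta * x + z                                           # y_t = beta * x_t + z_t
```

With `b > 0` the error `z` has a lower integration order than `x` and
`y`, which is the case the tests are designed to detect.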

---

## Installation

```bash
git clone https://github.com/merwanroudane/pycointbreak.git
cd pycointbreak
pip install -e .
```

Dependencies: `numpy`, `pandas`, `scipy`, `statsmodels`,
`matplotlib`, `seaborn`. Optional: `yfinance` (for the real-data
example).

---

## Quick start

```python
import pycointbreak as pcb

# 1. Generate (or load) two I(d) series.
y, x = pcb.simulate_segmented_cointegration(
    T=500, d=1.0, b=0.4, break_frac=0.5,
    regime="coint_then_spurious", seed=2024,
)

# 2. Full-sample Hassler-Breitung test.
hb = pcb.hassler_breitung_test(y, x, d=1.0, p=0)
print(hb.summary())

# 3. RSV (2019) sup-test battery.
battery = pcb.rsv_battery(y, x, d=1.0, lam0=0.5, n_grid=120)
for k, r in battery.items():
    print(r.summary())

# 4. Estimate the break date.
bp = pcb.estimate_break_point(y, x, d=1.0, direction="auto")
print(bp.summary())

# 5. Paper-style summary table.
df = pcb.battery_to_dataframe(battery, hb=hb, label="my pair")
print(pcb.render_table(df, fmt="text",
                       title="Tests for segmented cointegration"))

# 6. Plots.
import matplotlib.pyplot as plt
pcb.plot_series_with_breaks({"y": y, "x": x}, {"break": bp.obs})
pcb.plot_rsv_battery(battery)
pcb.plot_breakpoint_objective(bp)
plt.show()
```

---

## Mathematical compatibility with the papers

The implementation follows the papers' notation closely. The key
ingredients are exposed as building blocks:

| Symbol                                                             | Function                                  |
| ------------------------------------------------------------------ | ----------------------------------------- |
| `Δ⁺ᵈ x_t` — truncated (Type II) fractional difference              | `pcb.fdiff(x, d)`                         |
| `π_j(d)` — coefficients of `(1−L)^d`                               | `pcb.frac_coefs(d, n)`                    |
| `x*_{t−1} = Σ_{j=1}^{t−1} x_{t−j}/j` — harmonic-weighted sum (HB eq. 3) | `pcb.harmonic_running_sum(x)`             |
| `(1−L)^{−d} x_t` — fractional integration                          | `pcb.frac_integrate(x, d)`                |
| Geweke & Porter-Hudak `d̂`                                          | `pcb.gph(x)`                              |
| HB test (eqs. 12–14, iid case)                                     | `pcb.hassler_breitung_test(..., p=0)`     |
| HB test with AR(p) augmentation (eqs. 17–18)                       | `pcb.hassler_breitung_test(..., p=p)`     |
| RSV sub-sample t-stat (eq. 5)                                      | internal `_hb_subsample_tstat`            |
| Sup-statistics 𝒯_S, 𝒯_S*, 𝒯_If, 𝒯_Ib, 𝒯_R (eqs. 10–15)             | `pcb.rsv_sup_test(..., kind=...)`         |
| RSV break-point estimator (eq. 19)                                 | `pcb.estimate_break_point(...)`           |
| Critical values from Table 1 of RSV2019                            | `pcb.rsv_critical_value(test, T, alpha)`  |
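
The first three building blocks in the table can be written in a few
lines of NumPy. The sketch below uses the standard recursion for the
coefficients of `(1−L)^d` and treats pre-sample values as zero. The
`*_sketch` names are hypothetical and the code is not taken from the
package, so treat it as an illustration of the definitions rather than
the library's implementation.

```python
import numpy as np

def frac_coefs_sketch(d, n):
    """First n coefficients pi_j(d) of (1 - L)^d."""
    pi = np.ones(n)
    # pi_0 = 1, pi_j = pi_{j-1} * (j - 1 - d) / j
    for j in range(1, n):
        pi[j] = pi[j - 1] * (j - 1 - d) / j
    return pi

def fdiff_sketch(x, d):
    """Truncated (Type II) fractional difference: sum_{j=0}^{t-1} pi_j x_{t-j}."""
    x = np.asarray(x, dtype=float)
    pi = frac_coefs_sketch(d, len(x))
    # x[t::-1] is x_t, x_{t-1}, ..., x_1 in the 1-based notation of the papers
    return np.array([pi[: t + 1] @ x[t::-1] for t in range(len(x))])

def harmonic_sum_sketch(x):
    """x*_{t-1} = sum_{j=1}^{t-1} x_{t-j} / j, aligned with observation t."""
    x = np.asarray(x, dtype=float)
    out = np.zeros(len(x))          # the first observation has no lags
    for t in range(1, len(x)):
        j = np.arange(1, t + 1)
        out[t] = np.sum(x[t - j] / j)
    return out

print(frac_coefs_sketch(0.5, 4))
print(fdiff_sketch([1, 3, 6, 10], 1.0))
print(harmonic_sum_sketch([1, 1, 1]))
```

For `d = 1` the truncated difference reduces to the ordinary first
difference with the first observation kept, which is a convenient
sanity check.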

### On the eq.-(5) `√n` factor

Equation (5) of RSV (2019) literally reads

```
t(ê(λ₁,λ₂)) = √(⌊λ₂T⌋ − ⌊λ₁T⌋) · Σ ê_t · ê*_{t−1}
              / [√Σ ê*²_{t−1} · √((T−1)⁻¹ · Σ ê²_t)]
```

With the full-sample variance normalisation `1/(T−1)` and the extra
`√n` factor, this statistic is `O_p(√T)` under H₀ rather than
asymptotically N(0, 1). That would contradict Theorem 2 (`sup χ²₁`
limit) and the critical values of Table 1 (5% critical values of
roughly 4–6 for `T = 500`). We therefore interpret the formula as the
standard HB statistic (eq. 14 of HB2006) applied to the
**sub-sample**, with sub-sample variance normalisation `1/(n−1)`,
where `n = ⌊λ₂T⌋ − ⌊λ₁T⌋` is the sub-sample length. This is the same
convention used by Davidson & Monticini (2010). It yields
`t → N(0, 1)` per Proposition 3 of HB2006 and `T_K → sup χ²₁`,
consistent with Theorem 2 of RSV2019.
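
As a rough illustration of this interpretation, the sketch below
computes a sub-sample statistic with the `1/(n−1)` normalisation and no
extra `√n` factor. The function name `subsample_tstat_sketch` is
hypothetical, the input is assumed to be an already-computed residual
series, and restarting the harmonic sum at the start of the sub-sample
is an assumption of this sketch; the package's internal
`_hb_subsample_tstat` may differ in these details.

```python
import numpy as np

def subsample_tstat_sketch(e, start, stop):
    """HB-type t-statistic on the sub-sample e[start:stop] (illustrative)."""
    seg = np.asarray(e, dtype=float)[start:stop]
    n = len(seg)
    # Harmonic-weighted sum of lagged residuals, restarted inside the
    # sub-sample (an assumption of this sketch).
    star = np.zeros(n)
    for t in range(1, n):
        j = np.arange(1, t + 1)
        star[t] = np.sum(seg[t - j] / j)
    num = np.sum(seg[1:] * star[1:])
    sigma2 = np.sum(seg[1:] ** 2) / (n - 1)    # sub-sample variance, 1/(n-1)
    return num / np.sqrt(sigma2 * np.sum(star[1:] ** 2))

rng = np.random.default_rng(1)
u = rng.standard_normal(401)

t_null = subsample_tstat_sketch(u, 100, 300)       # white-noise residuals
t_over = subsample_tstat_sketch(np.diff(u), 0, 400)  # over-differenced residuals
print(t_null, t_over)
```

White-noise input gives a statistic of moderate size, while the
over-differenced series, which is strongly negatively autocorrelated,
pushes the statistic far into the negative tail.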

### On the critical values

The tabulated critical values in `pcb.RSV_TABLE1` reproduce Table 1
of RSV (2019) for `T = 250` and `T = 500` with `λ₀ = 0.5`. The paper
obtains them by Monte Carlo simulation, averaged over several values
of `d`. For configurations far from
`(T ∈ {250, 500}, λ₀ = 0.5, d ≈ 1)`, use the bootstrap helper

```python
cv = pcb.bootstrap_rsv_cv(
    T=1000, d=1.0, kind="T_If", n_sim=2000,
)
print(cv)   # {0.10: ..., 0.05: ..., 0.01: ...}
```

which returns critical values calibrated to your exact sample size,
grid, and `d`.
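
The idea behind simulated critical values can be sketched without the
package: draw many series under the null, compute a sup-type statistic
on each, and read off the upper quantiles. The code below is a
simplified stand-in, not `pcb.bootstrap_rsv_cv`. It assumes white-noise
residuals (so it ignores the effect of estimating `β`), uses a
forward-incremental scheme with sub-samples `[1, ⌊λT⌋]` for
`λ ∈ [λ₀, 1]`, and all function names are hypothetical.

```python
import numpy as np

def harmonic_sum(e):
    """sum_{j=1}^{t-1} e_{t-j} / j for each t, with 0 for the first point."""
    out = np.zeros(len(e))
    for t in range(1, len(e)):
        out[t] = np.sum(e[t - 1::-1] / np.arange(1, t + 1))
    return out

def sup_forward_sketch(e, lam0=0.5, n_grid=20):
    """Max of squared t-statistics over growing sub-samples (illustrative)."""
    e = np.asarray(e, dtype=float)
    T = len(e)
    star = harmonic_sum(e)
    # Cumulative sums let every sub-sample starting at the first
    # observation be evaluated without recomputing the harmonic sum.
    cum_num = np.cumsum(e * star)
    cum_e2 = np.cumsum(np.r_[0.0, e[1:] ** 2])
    cum_s2 = np.cumsum(star ** 2)
    ends = np.unique(np.floor(np.linspace(lam0, 1.0, n_grid) * T).astype(int))
    stats = []
    for n in ends:
        sigma2 = cum_e2[n - 1] / (n - 1)       # sub-sample variance
        t = cum_num[n - 1] / np.sqrt(sigma2 * cum_s2[n - 1])
        stats.append(t ** 2)
    return max(stats)

def simulate_cv_sketch(T=100, n_sim=300, lam0=0.5, n_grid=10, seed=0):
    """Upper quantiles of the sup-statistic under white-noise residuals."""
    rng = np.random.default_rng(seed)
    draws = np.array([
        sup_forward_sketch(rng.standard_normal(T), lam0, n_grid)
        for _ in range(n_sim)
    ])
    return {a: float(np.quantile(draws, 1 - a)) for a in (0.10, 0.05, 0.01)}

print(simulate_cv_sketch())
```

Because the simulation fixes `T`, `λ₀`, and the grid, the resulting
quantiles are specific to that configuration, which is why tabulated
values for other settings may not transfer.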

---

## Modules

| Module                          | Contents                                               |
| ------------------------------- | ------------------------------------------------------ |
| `pycointbreak.fracdiff`         | Truncated (Type II) fractional difference operators    |
| `pycointbreak.gph`              | GPH log-periodogram estimator of `d`                   |
| `pycointbreak.hassler_breitung` | HB (2006) LM-type test                                 |
| `pycointbreak.rsv_tests`        | RSV (2019) sup-tests for breaks                        |
| `pycointbreak.breakpoint`       | Break-fraction estimator (RSV eq. 19)                  |
| `pycointbreak.critical_values`  | Tabulated and bootstrap critical values                |
| `pycointbreak.simulate`         | DGPs from HB2006 §5 and RSV2019 §4 (Experiments 1–3)   |
| `pycointbreak.reporting`        | Paper-style tables (text/markdown/LaTeX/HTML)          |
| `pycointbreak.plots`            | Publication-quality figures (matplotlib + seaborn)     |
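
For readers unfamiliar with the log-periodogram estimator listed
above, the following is a textbook-style sketch of the Geweke and
Porter-Hudak regression. The function name `gph_sketch` and the
bandwidth rule `m = ⌊T^0.5⌋` are assumptions made for illustration;
`pycointbreak.gph` may use a different bandwidth or options.

```python
import numpy as np

def gph_sketch(x, power=0.5):
    """Log-periodogram estimate of d using the first m Fourier frequencies."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    m = int(T ** power)                          # assumed bandwidth rule
    lam = 2 * np.pi * np.arange(1, m + 1) / T    # Fourier frequencies
    dft = np.fft.fft(x - x.mean())[1 : m + 1]
    periodogram = np.abs(dft) ** 2 / (2 * np.pi * T)
    # log I(lam_j) = c + d * (-2 log(2 sin(lam_j / 2))) + error
    reg = -2.0 * np.log(2.0 * np.sin(lam / 2.0))
    reg_c = reg - reg.mean()
    # OLS slope of log-periodogram on the centred regressor
    return float(reg_c @ np.log(periodogram) / (reg_c @ reg_c))

rng = np.random.default_rng(0)
noise = rng.standard_normal(2000)
print(gph_sketch(noise))             # white noise: true d = 0
print(gph_sketch(np.cumsum(noise)))  # random walk: true d = 1
```

The estimate is noisy for small bandwidths, so individual values can
sit some distance from the true `d` even in samples of this size.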

---

## API at a glance

### Tests

```python
hassler_breitung_test(y1, y2, d=1.0, p=0, include_const=True)
    → HBResult(t_stat, p_value, reject, ...)

rsv_sup_test(y1, y2, d=1.0, kind="T_If", lam0=0.5, n_grid=200, ...)
    → RSVResult(statistic, argmax, t2_path, ...)

rsv_battery(y1, y2, d=1.0, tests=("T_S_star", "T_If", "T_Ib", "T_R"), ...)
    → dict[str, RSVResult]

estimate_break_point(y1, y2, d=1.0, delta=0.05, direction="auto", ...)
    → BreakPointResult(tau_hat, obs, date, objective, ...)
```

### Reporting

```python
df = battery_to_dataframe(battery, hb=hb, label="series")
print(render_table(df, fmt="text"))      # ASCII
print(render_table(df, fmt="markdown"))   # for GitHub / Jupyter
print(render_table(df, fmt="latex"))      # for papers
print(render_table(df, fmt="html"))       # for web
```

### Plots

```python
plot_series_with_breaks(series_dict, breaks_dict)
plot_rsv_profile(rsv_result)            # one sup-test
plot_rsv_battery(battery_dict)          # 2×2 panel of all four
plot_breakpoint_objective(bp_result)
plot_residual_diagnostics(residuals)
plot_split_heatmap(y, x, d=1.0, ...)    # 2-D (λ₁, λ₂) heatmap
```

### Simulation

```python
simulate_rsv_dgp(T=500, d=1.0, b=0.0)
    # Experiment 1 — constant cointegration over the whole sample

simulate_segmented_cointegration(
    T=500, d=1.0, b=0.4, break_frac=0.5,
    regime="coint_then_spurious",
)
    # Experiments 2 & 3 — one break in the cointegrating relationship
```

---

## Examples

The `examples/` folder contains three runnable scripts:

* `example_basic.py` — minimal end-to-end demo on simulated data.
* `example_real_data.py` — applies the full pipeline to real stock
  prices (Coca-Cola / Pepsi by default via `yfinance`; falls back to
  simulated data when the network is unavailable).
* `example_simulation_study.py` — small Monte Carlo reproducing a
  slice of Table 2 of RSV (2019).

---

## References

* Hassler, U. and Breitung, J. (2006). *A Residual-Based LM Type
  Test Against Fractional Cointegration*. **Econometric Theory**,
  22(6), 1091–1111.
* Rodrigues, P. M. M., Sibbertsen, P. and Voges, M. (2019).
  *Testing for breaks in the cointegrating relationship: On the
  stability of government bond markets' equilibrium*. Discussion
  Paper 656.
* Davidson, J. and Monticini, A. (2010). *Tests for cointegration
  with structural breaks based on subsamples*.
  **Computational Statistics & Data Analysis**, 54(11), 2498–2511.
* Robinson, P. M. (1991). *Testing for strong serial correlation and
  dynamic conditional heteroskedasticity in multiple regression*.
  **Journal of Econometrics**, 47, 67–84.
* Breitung, J. and Hassler, U. (2002). *Inference on the cointegration
  rank in fractionally integrated processes*. **Journal of
  Econometrics**, 110, 167–185.
* Demetrescu, M., Kuzin, V. and Hassler, U. (2008). *Long memory
  testing in the time domain*. **Econometric Theory**, 24(1),
  176–215.
* Geweke, J. and Porter-Hudak, S. (1983). *The Estimation and
  Application of Long Memory Time Series Models*. **Journal of Time
  Series Analysis**, 4, 221–238.

---

## License

MIT
