Metadata-Version: 2.4
Name: lrdbench
Version: 1.2.1
Summary: A reproducible benchmark framework for evaluating long-range dependence estimators on canonical, contaminated, and observational time series.
Project-URL: Homepage, https://github.com/dave2k77/lrdbench
Project-URL: Documentation, https://lrdbench.readthedocs.io/
Project-URL: Repository, https://github.com/dave2k77/lrdbench
Project-URL: Issues, https://github.com/dave2k77/lrdbench/issues
Author: Davian Chin
License: MIT
License-File: LICENSE
Keywords: benchmarking,biomedical signal processing,fractional processes,hurst exponent,long-range dependence,scientific computing,time series,uncertainty quantification
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Requires-Dist: matplotlib>=3.8
Requires-Dist: numpy>=1.26
Requires-Dist: pandas>=2.1
Requires-Dist: pywavelets>=1.5
Requires-Dist: pyyaml>=6.0
Requires-Dist: scipy>=1.11
Requires-Dist: seaborn>=0.13
Requires-Dist: typing-extensions>=4.8
Provides-Extra: all
Requires-Dist: build>=1.0; extra == 'all'
Requires-Dist: hypothesis>=6.98; extra == 'all'
Requires-Dist: ipykernel>=6.0; extra == 'all'
Requires-Dist: jinja2>=3.1; extra == 'all'
Requires-Dist: joblib>=1.3; extra == 'all'
Requires-Dist: jupyterlab>=4.0; extra == 'all'
Requires-Dist: mkdocs-material>=9.5; extra == 'all'
Requires-Dist: mkdocs>=1.5; extra == 'all'
Requires-Dist: mkdocstrings[python]>=0.24; extra == 'all'
Requires-Dist: mypy>=1.8; extra == 'all'
Requires-Dist: pre-commit>=3.6; extra == 'all'
Requires-Dist: pyarrow>=14.0; extra == 'all'
Requires-Dist: pytest-cov>=4.1; extra == 'all'
Requires-Dist: pytest>=8.0; extra == 'all'
Requires-Dist: ruff>=0.4; extra == 'all'
Requires-Dist: scikit-learn>=1.4; extra == 'all'
Requires-Dist: tabulate>=0.9; extra == 'all'
Requires-Dist: torch>=2.2; extra == 'all'
Requires-Dist: twine>=5.0; extra == 'all'
Provides-Extra: data-driven
Requires-Dist: joblib>=1.3; extra == 'data-driven'
Requires-Dist: scikit-learn>=1.4; extra == 'data-driven'
Requires-Dist: torch>=2.2; extra == 'data-driven'
Provides-Extra: dev
Requires-Dist: build>=1.0; extra == 'dev'
Requires-Dist: mypy>=1.8; extra == 'dev'
Requires-Dist: pre-commit>=3.6; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Requires-Dist: twine>=5.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.5; extra == 'docs'
Requires-Dist: mkdocs>=1.5; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.24; extra == 'docs'
Provides-Extra: ml
Requires-Dist: joblib>=1.3; extra == 'ml'
Requires-Dist: scikit-learn>=1.4; extra == 'ml'
Provides-Extra: nn
Requires-Dist: torch>=2.2; extra == 'nn'
Provides-Extra: notebooks
Requires-Dist: ipykernel>=6.0; extra == 'notebooks'
Requires-Dist: jupyterlab>=4.0; extra == 'notebooks'
Provides-Extra: parquet
Requires-Dist: pyarrow>=14.0; extra == 'parquet'
Provides-Extra: reports
Requires-Dist: jinja2>=3.1; extra == 'reports'
Requires-Dist: tabulate>=0.9; extra == 'reports'
Provides-Extra: test
Requires-Dist: hypothesis>=6.98; extra == 'test'
Requires-Dist: pytest-cov>=4.1; extra == 'test'
Requires-Dist: pytest>=8.0; extra == 'test'
Description-Content-Type: text/markdown

# lrdbench

**A reproducible benchmark framework for evaluating long-range dependence estimators on canonical, contaminated, and observational time series.**

**Documentation:** [lrdbench.readthedocs.io](https://lrdbench.readthedocs.io/) (built with MkDocs and hosted on Read the Docs).

Current public release: `1.2.1`. Releases are archived on Zenodo. To cite the software, use the
concept DOI [`10.5281/zenodo.20937726`](https://doi.org/10.5281/zenodo.20937726), which always
resolves to the latest archived version, or use `CITATION.cff`.

`lrdbench` is a research-oriented benchmarking framework for studying the behaviour of long-range dependence (LRD) estimators across three distinct settings:

- **ground-truth mode** for canonical synthetic time series with declared target truth;
- **stress-test mode** for synthetic time series under controlled contamination;
- **observational mode** for biomedical or user-provided time series without benchmark truth.

The framework is designed to support:

- rigorous comparison of classical and new LRD estimators;
- bundled temporal, spectral, geometric, wavelet, aggregation, and data-driven estimator families;
- uncertainty-aware benchmarking, including empirical interval coverage where applicable;
- robustness analysis under the bundled contamination operators: heavy-tailed noise, level shifts, outliers, and polynomial trends;
- experimental data-driven baselines, including Random Forest, SVR, CNN, and LSTM estimators;
- transparent failure analysis and validity-rate reporting;
- manifest-driven, provenance-complete, reproducible benchmark execution.

---

## Why this project exists

There is currently no widely adopted, comprehensive, reproducible benchmark specifically designed for long-range dependence estimation that simultaneously addresses:

- canonical synthetic processes with known targets;
- contamination-induced estimator instability;
- uncertainty quantification and interval coverage;
- observational biomedical time series with no benchmark truth;
- extensible enrolment of new estimators under a common interface.

`lrdbench` aims to fill that gap.

It is especially intended to support the careful evaluation of the hypothesis that many classical second-order LRD estimators behave well in their intended stationary finite-variance regime, but become unstable, miscalibrated, or non-identifiable under nonstationarity, heavy-tailed fluctuations, artefacts, and other out-of-regime conditions.

---

## Scope

`lrdbench` is a **research benchmark framework**. It is **not**:

- a clinical decision system;
- a diagnostic tool;
- a guarantee of “true LRD” in arbitrary empirical signals;
- a universal ranking oracle for all estimators in all regimes.

Benchmark results must always be interpreted in light of:

- the declared benchmark mode;
- the target estimand;
- the source specification;
- the contamination design;
- the metric definitions;
- the aggregation and leaderboard rules.

See [`RESEARCH_USAGE.md`](RESEARCH_USAGE.md) for the full policy.

---

## Core features

### Benchmark modes
- **Ground-truth mode**
  - bias, MAE, RMSE
  - empirical coverage
  - interval width
  - validity rate
  - runtime and efficiency

- **Stress-test mode**
  - estimate drift
  - degradation ratios
  - validity collapse
  - coverage collapse
  - robustness leaderboards

- **Observational mode**
  - instability across windows
  - preprocessing sensitivity
  - resampling variability
  - failure analysis
  - stability leaderboards
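
As an illustration of the ground-truth metrics listed above, the following is a minimal sketch using their textbook definitions. The function name `ground_truth_metrics`, its arguments, and the convention of marking failed fits as `NaN` are assumptions made for this example; they are not the package's API, and the framework's own metric definitions take precedence.

```python
import numpy as np


def ground_truth_metrics(estimates, lower, upper, truth):
    """Summarise repeated estimates of a known target (hypothetical helper)."""
    estimates = np.asarray(estimates, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    # Assumption: a failed fit is recorded as NaN rather than dropped.
    valid = np.isfinite(estimates)
    err = estimates[valid] - truth

    # Coverage and width only use fits that returned a finite interval.
    has_ci = valid & np.isfinite(lower) & np.isfinite(upper)
    covered = (lower[has_ci] <= truth) & (truth <= upper[has_ci])

    return {
        "bias": float(err.mean()),
        "mae": float(np.abs(err).mean()),
        "rmse": float(np.sqrt((err ** 2).mean())),
        "coverage": float(covered.mean()),
        "mean_interval_width": float((upper[has_ci] - lower[has_ci]).mean()),
        "validity_rate": float(valid.mean()),
    }


# Four replicate fits against a declared target of H = 0.70; the last one failed.
metrics = ground_truth_metrics(
    estimates=[0.68, 0.72, 0.75, np.nan],
    lower=[0.60, 0.65, 0.71, np.nan],
    upper=[0.76, 0.79, 0.79, np.nan],
    truth=0.70,
)
print(metrics)
```

Reporting the validity rate next to the error metrics matters because bias, MAE, and RMSE are computed over valid fits only, so an estimator that fails often can otherwise look deceptively accurate.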

### Supported data sources
- implemented synthetic generators:
  - fGn
  - fBm
  - ARFIMA(0,d,0)
  - MRW
  - fOU
- contaminated synthetic pipelines
- custom CSV datasets
- planned: observational/API-based datasets
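
To make "declared target truth" concrete for one of the generators above, here is a minimal sketch of fractional Gaussian noise (fGn) simulated by Cholesky factorisation of its exact autocovariance. The function names are hypothetical and this is not necessarily how the bundled generator works; the Cholesky approach is chosen for clarity, costs O(n³), and suits only short series.

```python
import numpy as np


def fgn_autocovariance(n, hurst):
    """Unit-variance fGn autocovariance at lags 0..n-1."""
    k = np.arange(n, dtype=float)
    two_h = 2.0 * hurst
    return 0.5 * (np.abs(k + 1) ** two_h - 2 * np.abs(k) ** two_h + np.abs(k - 1) ** two_h)


def simulate_fgn(n, hurst, seed=None):
    """Draw one fGn path of length n (hypothetical helper, exact but slow)."""
    rng = np.random.default_rng(seed)
    gamma = fgn_autocovariance(n, hurst)
    idx = np.arange(n)
    # Toeplitz covariance matrix: entry (i, j) is gamma(|i - j|).
    cov = gamma[np.abs(idx[:, None] - idx[None, :])]
    chol = np.linalg.cholesky(cov)
    return chol @ rng.standard_normal(n)


x = simulate_fgn(n=512, hurst=0.7, seed=0)  # fGn with declared H = 0.7
fbm = np.cumsum(x)                          # partial sums give an fBm path
```

At `hurst=0.5` the autocovariance vanishes at every non-zero lag, so the generator reduces to white noise, which is a convenient sanity check.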

### Reporting
- HTML reports
- Markdown reports
- CSV exports
- Parquet result stores
- JSON metadata exports
- LaTeX tables for publication workflows

### Extensibility
- pluggable estimator interface
- manifest-driven benchmark runs
- explicit estimator metadata and estimand declarations
- run-local supervised training for built-in ML/NN baseline estimators
- registry-based component enrolment

### Bundled estimators
- temporal Hurst-proxy methods: `RS`, `DFA`, `DMA`, `AbsoluteMoment`, `Variance`,
  `VarianceResidual`
- spectral long-memory methods: `GPH`, `Periodogram`, `WhittleMLE`,
  `ModifiedLocalWhittle`
- geometric and wavelet Hurst-proxy comparators
- experimental data-driven baselines: `MLRandomForest`, `MLSVR`, `MLCNN`, `MLLSTM`

See [`docs/bundled_estimators.md`](docs/bundled_estimators.md) and
[`docs/estimator_status.md`](docs/estimator_status.md) for names, targets, and interpretation
status.
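
For readers unfamiliar with the temporal family, the sketch below shows the classical aggregated-variance idea: for a stationary long-memory series, the variance of block means scales roughly as m^(2H-2) in the block size m, so a log-log slope yields a Hurst estimate. This is a textbook illustration with a hypothetical function name and default block sizes; it is not the bundled `Variance` estimator and returns no uncertainty interval.

```python
import numpy as np


def aggregated_variance_hurst(x, block_sizes=(4, 8, 16, 32, 64)):
    """Estimate H from the scaling of block-mean variances (illustrative only)."""
    x = np.asarray(x, dtype=float)
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = x.size // m
        if n_blocks < 2:
            continue  # too few blocks to estimate a variance at this scale
        means = x[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        log_m.append(np.log(m))
        log_var.append(np.log(means.var(ddof=1)))
    if len(log_m) < 2:
        raise ValueError("series too short for the requested block sizes")
    # The slope of log Var(block mean) against log m is approximately 2H - 2.
    slope = np.polyfit(log_m, log_var, 1)[0]
    return 1.0 + slope / 2.0


rng = np.random.default_rng(0)
h_white = aggregated_variance_hurst(rng.standard_normal(8192))
print(f"white-noise estimate: {h_white:.3f}")  # should be near 0.5
```

Raising an explicit error on short input, rather than returning a number, is the kind of behaviour that validity-rate reporting is designed to capture.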

---

## Design principles

`lrdbench` is built around a few non-negotiable principles:

1. **Explicit estimands**  
   Every estimator must declare the quantity it is intended to estimate.

2. **Mode-aware evaluation**  
   Truth-based metrics are not used where truth does not exist.

3. **Failure transparency**  
   Invalid outputs, crashes, and missing uncertainty are recorded explicitly.

4. **Provenance preservation**  
   Every benchmark result is traceable to a manifest, source, estimator configuration, and software version.

5. **Reproducibility first**  
   A benchmark run should be reproducible from a single manifest plus the relevant package version and data sources.
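
The first and third principles can be sketched in code. Everything below (`EstimateResult`, the `Estimator` protocol, `run_safely`, and the `AlwaysFails` example class) is hypothetical and written only to illustrate the idea; the actual estimator interface is described in the project documentation and may differ.

```python
from dataclasses import dataclass
from typing import Optional, Protocol

import numpy as np


@dataclass(frozen=True)
class EstimateResult:
    """Outcome of one fit; failures are data, not exceptions (hypothetical type)."""
    value: Optional[float]
    ci: Optional[tuple[float, float]]
    valid: bool
    failure_reason: Optional[str] = None


class Estimator(Protocol):
    """Hypothetical interface: every estimator names its estimand explicitly."""
    name: str
    estimand: str

    def fit(self, x: np.ndarray) -> EstimateResult: ...


def run_safely(estimator: Estimator, x) -> EstimateResult:
    """Run one fit and convert crashes or non-finite outputs into records."""
    try:
        result = estimator.fit(np.asarray(x, dtype=float))
    except Exception as exc:  # record the crash instead of aborting the run
        return EstimateResult(None, None, False, f"{type(exc).__name__}: {exc}")
    if result.value is None or not np.isfinite(result.value):
        return EstimateResult(None, result.ci, False, "non-finite estimate")
    return result


class AlwaysFails:
    """Toy estimator used to show how a crash is recorded."""
    name = "AlwaysFails"
    estimand = "hurst_exponent"

    def fit(self, x: np.ndarray) -> EstimateResult:
        raise ValueError("series too short")


outcome = run_safely(AlwaysFails(), [1.0, 2.0])
print(outcome.valid, outcome.failure_reason)
```

Keeping the failure reason in the result record means a crashed fit still contributes a row to the results, which is what makes validity rates computable afterwards.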

---

## Installation

### Core installation

```bash
pip install lrdbench
```

### First run

Install with reporting support, then run the smallest packaged benchmark:

```bash
pip install "lrdbench[reports]"
lrdbench run smoke_ground_truth
```

From a repository checkout, the same quickstart is available as:

```bash
pip install -e ".[reports]"
python examples/quickstart_pure.py
```

The run prints the run identifier, the result store location, the HTML report path, and the command
for validating the outputs. See the [quickstart tutorial](docs/tutorials/quickstart.md) for the full walkthrough.

### Preview before running

Use `--dry-run` when you want to inspect the benchmark grid before fitting estimators or writing
outputs:

```bash
lrdbench run smoke_ground_truth --dry-run
```

The preview prints the benchmark mode, materialised record count, enrolled estimator count, total
fit jobs, clean/contaminated split for stress tests, and global seed. This is the recommended first
step before launching public-medium, custom, or data-driven suites.

### Data-driven smoke benchmark

The RF/SVR data-driven smoke suite uses optional scikit-learn dependencies:

```bash
pip install "lrdbench[ml,reports]"
lrdbench run smoke_data_driven
```

From a repository checkout:

```bash
pip install -e ".[ml,reports]"
lrdbench run configs/suites/smoke_data_driven.yaml
python examples/data_driven_baseline_benchmark.py
```

Data-driven estimators are trained from the manifest-declared `ml_training` block and should be
interpreted relative to that synthetic training distribution. See
[`docs/data_driven_estimators.md`](docs/data_driven_estimators.md).
