Metadata-Version: 2.4
Name: bbstat
Version: 0.1.1
Summary: A lightweight library for Bayesian bootstrapping and statistical evaluation.
Author-email: Chris Wehmeyer <wehmeyer.chris@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/cwehmeyer/bbstat
Project-URL: Documentation, https://cwehmeyer.github.io/bbstat/
Project-URL: Repository, https://github.com/cwehmeyer/bbstat.git
Project-URL: Issues, https://github.com/cwehmeyer/bbstat/issues
Project-URL: Changelog, https://github.com/cwehmeyer/bbstat/blob/main/CHANGELOG.md
Keywords: bayesian,bootstrap,statistics
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: matplotlib
Requires-Dist: numpy>=1.21
Requires-Dist: scipy>=1.11
Provides-Extra: test
Requires-Dist: pytest>=7.0; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Provides-Extra: docs
Requires-Dist: mkdocs; extra == "docs"
Requires-Dist: mkdocs-autorefs; extra == "docs"
Requires-Dist: mkdocs-get-deps; extra == "docs"
Requires-Dist: mkdocs-include-markdown-plugin; extra == "docs"
Requires-Dist: mkdocstrings[python]; extra == "docs"
Provides-Extra: lint
Requires-Dist: black; extra == "lint"
Requires-Dist: ruff; extra == "lint"
Provides-Extra: typecheck
Requires-Dist: mypy; extra == "typecheck"
Requires-Dist: scipy-stubs; extra == "typecheck"
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"
Requires-Dist: tox; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

# bbstat

[![PyPI version](https://img.shields.io/pypi/v/bbstat.svg)](https://pypi.org/project/bbstat/)
[![Python Versions](https://img.shields.io/pypi/pyversions/bbstat.svg)](https://pypi.org/project/bbstat/)
[![CI](https://github.com/cwehmeyer/bbstat/actions/workflows/ci.yaml/badge.svg?branch=main)](https://github.com/cwehmeyer/bbstat/actions/workflows/ci.yaml)
[![codecov](https://codecov.io/gh/cwehmeyer/bbstat/branch/main/graph/badge.svg?token=V3QV2DFJ9W)](https://codecov.io/gh/cwehmeyer/bbstat)
[![Docs](https://img.shields.io/badge/docs-latest-blue.svg)](https://cwehmeyer.github.io/bbstat/)

A lightweight library for Bayesian bootstrapping and statistical evaluation.

## Installation

### From PyPI:

```bash
pip install bbstat
```

### From GitHub source code:

```bash
git clone https://github.com/cwehmeyer/bbstat.git
cd bbstat
pip install .
```

## Quickstart

```python
import numpy as np
from bbstat import bootstrap

# Data preparation: simulated income for a small population (e.g., a survey of 25 people)
income = np.array([
    24_000, 26_000, 28_000, 30_000, 32_000,
    35_000, 36_000, 38_000, 40_000, 41_000,
    45_000, 48_000, 50_000, 52_000, 54_000,
    58_000, 60_000, 62_000, 65_000, 68_000,
    70_000, 75_000, 80_000, 90_000, 100_000,
], dtype=np.float64)

# Direct estimate of mean income
print(np.mean(income))  # => 52280.0

# Bootstrapped distribution of the mean income.
distribution = bootstrap(data=income, statistic_fn="mean", seed=1)
print(distribution)  # => BootstrapDistribution(mean=52263.8..., size=1000)

# Summarize the bootstrapped distribution of the mean income.
summary = distribution.summarize(level=0.87)
print(summary)  # => BootstrapSummary(mean=52263.8..., ci_low=46566.8..., ci_high=58453.6..., level=0.87)
print(summary.round())  # => BootstrapSummary(mean=52000.0, ci_low=47000.0, ci_high=58000.0, level=0.87)
```

## API Overview

### `bootstrap(data, statistic_fn, n_boot=1000, ...)`

Performs Bayesian bootstrapping on input `data` using the given statistic.

- `data`: 1D NumPy array, or tuple/list thereof
- `statistic_fn`: string or callable (e.g., `"mean"`, `"median"`, or custom function)
- `level`: credible interval (default 0.87)
- `n_boot`: number of bootstrap samples
- `seed`: random seed (optional)
- `blocksize`: number of resamples to allocate in one block
- `fn_kwargs`: optional dictionary with parameters for `statistic_fn`

Returns a `BootstrapResult` with:
- `.mean`: estimated statistic value
- `.ci`: tuple representing lower and upper bounds of the credible interval
- `.level`: credible level used
- `.n_boot`: number of bootstraps performed
- `.estimates`: array of statistic values computed across the bootstrapped posterior samples

### Weighted statistic functions included

The module bbstat.statistics provides a number univariate and bivariate weighted statistics:
- `"entropy"`: `bbstat.statistics.compute_weighted_entropy(data, weights)`
- `"eta_square_dependency"`: `bbstat.statistics.compute_weighted_eta_square_dependency(data, weights)`
- `"log_odds"`: `bbstat.statistics.compute_weighted_log_odds(data, weights, state: int)`
- `"mean"`: `bbstat.statistics.compute_weighted_mean(data, weights)`
- `"median"`: `bbstat.statistics.compute_weighted_median(data, weights)`
- `"mutual_information"`: `bbstat.statistics.compute_weighted_mutual_information(data, weights)`
- `"pearson_dependence"`: `bbstat.statistics.compute_weighted_pearson_dependence(data, weights, ddof: int = 0)`
- `"percentile"`: `bbstat.statistics.compute_weighted_percentile(data, weights, percentile: float)`
- `"probability"`: `bbstat.statistics.compute_weighted_probability(data, weights, state: int)`
- `"quantile"`: `bbstat.statistics.compute_weighted_quantile(data, weights, quantile: float)`
- `"self_information"`: `bbstat.statistics.compute_weighted_self_information(data, weights, state: int)`
- `"spearman_depedence"`: `bbstat.statistics.compute_weighted_spearman_depedence(data, weights, ddof: int = 0)`
- `"std"`: `bbstat.statistics.compute_weighted_std(data, weights, ddof: int = 0)`
- `"sum"`: `bbstat.statistics.compute_weighted_sum(data, weights)`
- `"variance"`: `bbstat.statistics.compute_weighted_variance(data, weights, ddof: int = 0)`

If you want to use your own custom functions, please adhere to this pattern

```python
def custom_statistic(data, weights, *, **kwargs) -> float
```

where `data` is
- a 1D numpy array of length `n_data` or
- a tuple/list of 1D numpy arrays, each of length `n_data`

and `weights` is a 1D numpy array of length `n_data`, with non-negative elements that sum up to one. The function may also take additional parameters which can be supplied via `**kwargs`.

## License

This project is licensed under the MIT License.

## Contributing

Contributions are welcome! Please open an issue or submit a pull request.
