Metadata-Version: 2.4
Name: binjamin
Version: 0.2.0
Summary: Bin width estimation and lattice geometry — from data to measurement space
Project-URL: Homepage, https://github.com/adelic-ai/binjamin
Author-email: Shun Richard Honda <shun.honda@adelic.org>
License: MIT
License-File: LICENSE
Keywords: bayesian-blocks,bin-width,binning,factorization,freedman-diaconis,histogram,lattice,multiscale,p-adic
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Scientific/Engineering :: Mathematics
Requires-Python: >=3.12
Requires-Dist: numpy>=1.24
Description-Content-Type: text/markdown

# binjamin

Derive artifact-free, nested multiscale analysis windows from any discrete data. Binjamin also bundles the major bin width estimation methods in one place.

## Install

```bash
pip install binjamin
```

## Quick Start

### Lattice geometry — the main feature

Standard multiscale analysis requires choosing window sizes. Those choices introduce boundary artifacts, inconsistent nesting, and results that depend on the analyst. Binjamin derives the measurement space from arithmetic — the windows are forced by the math, not chosen.

```python
import binjamin as bj

geo = bj.lattice([10, 30, 60, 360])
```

That's it. You get a complete measurement space:

```
geo.cbin        # 10 — computational resolution (gcd of windows)
geo.horizon     # 360 — lcm of windows
geo.windows     # (10, 20, 30, 40, 60, 90, 120, 180, 360)
geo.prime_basis # {2: 2, 3: 2} — the geometry of scale
```

Every window is a multiple of `cbin` and divides the horizon, so each one tiles the horizon exactly and nests cleanly inside any larger window it divides. No boundary artifacts. The structure comes from the [divisibility lattice](docs/concepts.md): the unique maximal family of window sizes with these properties.
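
The arithmetic behind this can be sketched with the standard library alone. This is a minimal illustration, not binjamin's implementation: `derive_windows` is a hypothetical helper, and it assumes the windows are exactly the multiples of `cbin` that divide the horizon.

```python
from math import gcd, lcm

def derive_windows(requested):
    """Return (cbin, horizon, windows) for a list of requested window sizes.

    Hypothetical helper for illustration; not part of the binjamin API.
    """
    cbin = gcd(*requested)      # largest step that divides every requested window
    horizon = lcm(*requested)   # smallest span that every requested window divides
    # Assumption: valid windows are the multiples of cbin that divide the horizon.
    windows = tuple(w for w in range(cbin, horizon + 1, cbin) if horizon % w == 0)
    return cbin, horizon, windows

cbin, horizon, windows = derive_windows([10, 30, 60, 360])
print(cbin, horizon)   # 10 360
print(windows)         # (10, 20, 30, 40, 60, 90, 120, 180, 360)
```

Nesting follows divisibility: 20 fits a whole number of times into 40 and 60, but not into 30.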

### From real data

```python
# EEG at 256 Hz, windows in samples (sample_indices: your 1-D array of integer sample positions)
geo = bj.lattice([256, 1024, 4096, 15360], data=sample_indices)
geo.data_grain  # estimated from data
geo.cbin        # 256 — analysis resolution
geo.grain       # 1 — can zoom finer later
```

### Bin width estimation

```python
bj.freedman_diaconis(intervals)   # robust, no assumption
bj.auto(intervals)                # good default
edges, counts = bj.bin(data)      # bin and count
```

## Features

### Lattice geometry

Derive a complete measurement space from analysis windows.

```python
geo = bj.lattice(
    windows=[10, 30, 60, 360],   # desired analysis scales
    grain=1,                      # finest resolution (optional)
    cbin=10,                      # materialized resolution (optional)
    data=orders,                  # data for grain estimation (optional)
    horizon=720,                  # override horizon (optional)
)
```

Usually you just pass `windows`. Everything else is derived. See [docs/lattice.md](docs/lattice.md) for details.

**LatticeGeometry fields:**

| Field | Meaning |
|-------|---------|
| `data_grain` | Estimated from data (raw cadence) |
| `grain` | Finest admissible resolution |
| `cbin` | Materialized resolution (divides all windows) |
| `horizon` | Outer boundary of coordinate space |
| `windows` | All valid scales in the active domain |
| `prime_basis` | Prime factorization of `horizon // cbin` |
| `coordinates` | Prime exponent vector per window |
| `members` | Full set of lattice members |
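
To make `prime_basis` and `coordinates` concrete, here is a rough stand-alone sketch for the `[10, 30, 60, 360]` example. The helper `prime_exponents` is hypothetical, and the tuple layout used for coordinates is an assumption; the library may represent them differently.

```python
def prime_exponents(n):
    """Trial-division factorization: map each prime factor of n to its exponent."""
    exps = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            exps[p] = exps.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                      # whatever remains is itself prime
        exps[n] = exps.get(n, 0) + 1
    return exps

cbin, horizon = 10, 360
windows = (10, 20, 30, 40, 60, 90, 120, 180, 360)

basis = prime_exponents(horizon // cbin)   # factorization of 36
primes = sorted(basis)                     # fixed axis order: [2, 3]

# Assumed layout: one exponent per basis prime, taken from window // cbin.
coords = {}
for w in windows:
    exps = prime_exponents(w // cbin)
    coords[w] = tuple(exps.get(p, 0) for p in primes)

print(basis)        # {2: 2, 3: 2}
print(coords[60])   # (1, 1) because 60 // 10 = 6 = 2 * 3
```

In this picture each window is a point on a small grid whose axes are the primes of `horizon // cbin`.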

### Coordinates

Every integer has a unique prime factorization. Binjamin exposes this as a coordinate system.

```python
bj.factorize(360)                  # {2: 3, 3: 2, 5: 1}
bj.divisors(12)                    # (1, 2, 3, 4, 6, 12)
bj.lattice_members(360, 10)        # (10, 20, 30, ..., 360)
bj.smallest_divisor_gte(360, 50)   # 60
bj.to_int({2: 3, 3: 2, 5: 1})     # 360
```

Vector arithmetic mirrors integer operations:

```python
bj.vec_add({2: 1}, {3: 1})        # {2: 1, 3: 1}  — multiplication
bj.vec_sub({2: 2, 3: 1}, {2: 1})  # {2: 1, 3: 1}  — division
bj.vec_le({2: 1}, {2: 2, 3: 1})   # True           — divisibility
bj.vec_min({2: 3}, {2: 1, 5: 2})  # {2: 1}         — gcd
bj.vec_max({2: 3}, {2: 1, 5: 2})  # {2: 3, 5: 2}   — lcm
```

See [docs/coordinates.md](docs/coordinates.md) for the full API and the math behind it.
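
The correspondence between exponent vectors and integer operations can be checked in a few lines of plain Python. The helpers below (`as_int`, `vadd`, `vmin`, `vmax`, `vle`) are hypothetical stand-ins written for this sketch, not the library's functions.

```python
from math import gcd, lcm, prod

def as_int(vec):
    """Multiply out a {prime: exponent} vector into an integer."""
    return prod(p ** e for p, e in vec.items())

def vadd(a, b):
    """Add exponents prime by prime (integer multiplication)."""
    return {p: a.get(p, 0) + b.get(p, 0) for p in a.keys() | b.keys()}

def vmin(a, b):
    """Componentwise minimum (gcd). A prime absent from either vector
    has minimum exponent 0, so it is omitted."""
    return {p: min(a[p], b[p]) for p in a.keys() & b.keys()}

def vmax(a, b):
    """Componentwise maximum (lcm)."""
    return {p: max(a.get(p, 0), b.get(p, 0)) for p in a.keys() | b.keys()}

def vle(a, b):
    """True when every exponent in a is at most the one in b (a divides b)."""
    return all(e <= b.get(p, 0) for p, e in a.items())

a, b = {2: 3}, {2: 1, 5: 2}   # 8 and 50
assert as_int(vadd(a, b)) == as_int(a) * as_int(b)
assert as_int(vmin(a, b)) == gcd(as_int(a), as_int(b))
assert as_int(vmax(a, b)) == lcm(as_int(a), as_int(b))
assert vle({2: 1}, {2: 2, 3: 1}) == (12 % 2 == 0)
```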

### Bin width estimation

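Every major method is included. All scalar methods take a 1-D array and return a float bin width; `bayesian_blocks` is the exception, since it produces variable-width bins.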

| Method | Best for |
|---|---|
| `auto` | General default |
| `freedman_diaconis` | Unknown distribution, outliers |
| `scott` | Near-normal data |
| `sturges` | Small, near-normal |
| `rice` | Simple, no assumption |
| `sqrt` | Quick exploratory |
| `doane` | Skewed, non-normal data |
| `stone` | Accuracy over speed |
| `knuth` | Optimal uniform bins |
| `gcd_interval` | Integer sequences, regular data |
| `bayesian_blocks` | Non-stationary event data (variable-width) |

See [docs/methods.md](docs/methods.md) for formulas and usage guidance.
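
As a point of reference, the Freedman-Diaconis rule sets the bin width to `2 * IQR / n^(1/3)`. The sketch below computes it with the standard library only. `fd_width` is a hypothetical helper, and the quartile convention (linear interpolation, `method="inclusive"`) is an assumption that may differ from the one binjamin uses.

```python
from statistics import quantiles

def fd_width(values):
    """Freedman-Diaconis bin width: 2 * IQR / n^(1/3).

    Hypothetical helper for illustration; not part of the binjamin API.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return 2.0 * (q3 - q1) / len(values) ** (1.0 / 3.0)

data = list(range(1, 101))   # 1..100: quartiles 25.75 and 75.25, so IQR = 49.5
width = fd_width(data)
print(round(width, 2))       # 21.33
```

Because the rule depends on the interquartile range rather than the standard deviation, a handful of extreme values barely moves the result.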

### Grain estimation

```python
bj.grain_from_orders(orders)                        # Freedman-Diaconis default
bj.grain_from_orders(orders, method="gcd_interval")  # exact for regular data
bj.suggest_cbin(grain=60, windows=[3600, 86400])     # → 60
```
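
For regularly sampled data, a gcd-based grain can be illustrated as the greatest common divisor of the gaps between consecutive distinct positions. This is a sketch under that assumption, not the library's `gcd_interval` code; `gcd_grain` is a hypothetical name.

```python
from functools import reduce
from math import gcd

def gcd_grain(orders):
    """GCD of the gaps between consecutive distinct integer positions.

    Hypothetical helper for illustration. Returns 0 when there are fewer
    than two distinct positions.
    """
    pts = sorted(set(orders))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    return reduce(gcd, gaps, 0)

print(gcd_grain([0, 60, 120, 300, 360]))   # 60: samples on a 60-unit cadence, with gaps
print(gcd_grain([0, 60, 121]))             # 1: a single off-grid point collapses the gcd
```

The second call shows why an exact gcd suits only regular data: one jittered timestamp drives the estimate down to 1, which is where a robust estimator is the safer choice.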

## Documentation

| Doc | What it covers |
|-----|----------------|
| [Lattice geometry](docs/lattice.md) | LatticeGeometry, derivation logic, grain/cbin/horizon |
| [Coordinates](docs/coordinates.md) | Prime factorization, divisors, vector arithmetic |
| [Methods](docs/methods.md) | Bin width estimation formulas and guidance |

## Used by

[SignalForge](https://github.com/adelic-ai/signalforge) — multiscale signal analysis on the p-adic divisibility lattice. Binjamin provides the mathematical substrate; SignalForge provides signals, surfaces, and the exploration tools.

## License

[MIT](LICENSE)
