Metadata-Version: 2.4
Name: tri-boost
Version: 0.2.1
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Financial and Insurance Industry
Requires-Dist: numpy>=1.23
Requires-Dist: scikit-learn>=1.2 ; extra == 'bench'
Requires-Dist: pandas>=1.5 ; extra == 'bench'
Requires-Dist: pyarrow>=10.0 ; extra == 'bench'
Requires-Dist: scikit-learn>=1.2 ; extra == 'sklearn'
Provides-Extra: bench
Provides-Extra: sklearn
Summary: A depth-3 oblivious GBM that is exactly decomposable into ≤3rd-order fANOVA tables.
Keywords: gradient-boosting,gbm,explainable,interpretable,fanova,glassbox,insurance,actuarial
License: Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Documentation, https://github.com/PricingFrontier/tri-boost#readme
Project-URL: Repository, https://github.com/PricingFrontier/tri-boost

# tri-boost

**A depth-3 oblivious gradient-boosting machine that is *exactly* decomposable into ≤3rd-order
functional-ANOVA (fANOVA) "rating tables" — without giving up accuracy or speed.**

Every tree is a depth-`1..=3` symmetric (oblivious) tree with one shared `(feature, threshold)`
test per level and at most three distinct raw features, so the trained ensemble truncates *exactly*
at the 3rd interaction order. That structure lets the fitted model be rewritten, losslessly, as a
small set of main-effect and interaction tables that reproduce the model's raw scores bit-for-bit:
a glass-box GBM you can read, ship as lookup tables, or audit.
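To make the tree shape concrete, here is a minimal pure-Python sketch of how one oblivious tree can be evaluated. It is an illustration, not the `tri-boost-core` implementation: the helper name, the `>` split direction, the bit ordering of the leaf index, and all feature indices, thresholds, and leaf values are assumptions chosen for the example.

```python
from typing import Sequence


def oblivious_leaf_index(x: Sequence[float], tests: Sequence[tuple[int, float]]) -> int:
    """Map one row to a leaf index; every level contributes exactly one bit."""
    index = 0
    for feature, threshold in tests:  # one shared (feature, threshold) test per level
        # Assumed convention: bit is 1 when the feature value exceeds the threshold.
        index = (index << 1) | int(x[feature] > threshold)
    return index


# Hypothetical depth-3 tree over raw features 0, 2 and 5 (illustrative values only).
tests = [(0, 0.5), (2, 10.0), (5, -1.0)]
leaves = [0.1, -0.2, 0.3, 0.0, -0.4, 0.25, 0.05, 0.6]  # 2**3 leaf values

row = [0.9, 7.0, 3.0, 0.0, 0.0, 2.0]
leaf = oblivious_leaf_index(row, tests)  # bits: 1 (0.9 > 0.5), 0 (3.0 > 10.0), 1 (2.0 > -1.0)
score = leaves[leaf]
```

Because the tree reads at most three raw features, its output is a function of at most three variables, which is why a sum of such trees has no interaction term above order 3.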

- **Rust core** (`tri-boost-core`), thin [PyO3](https://pyo3.rs) bindings, and scikit-learn
  estimators. The core is `#![forbid(unsafe_code)]`, no-panic-gated, and deterministic
  (bit-identical across thread counts).
- **scikit-learn compatible**: `TriBoostRegressor` / `TriBoostClassifier` drop into `Pipeline`,
  `GridSearchCV`, `cross_val_score`.
- **Objectives**: `squared_error`, `logistic` (binary), native-softmax **multiclass**, and the
  log-link `poisson` / `gamma` / `tweedie` families for insurance frequency & severity.
- **Exact decomposition**: `model.tables(X)` emits the ≤3rd-order fANOVA tables; the reconstruction
  is verified against the ensemble by five lossless invariant checks.

## Install

```bash
pip install tri-boost            # numpy-only core (raw booster API)
pip install "tri-boost[sklearn]" # + the scikit-learn estimators
```

Wheels are built for Linux, macOS, and Windows: one abi3 wheel per platform, covering CPython 3.10–3.13.

### From source

Building from source needs a Rust toolchain (`rustup`) and [maturin](https://www.maturin.rs):

```bash
pip install maturin
maturin develop --release        # builds the Rust extension into the active venv
```

## Quickstart

```python
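import numpy as np
from tri_boost import TriBoostRegressor, TriBoostClassifier  # needs the [sklearn] extra

# Synthetic data so the snippet runs end to end; substitute your own arrays.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X[:, 0] + X[:, 1] * X[:, 2] + rng.normal(scale=0.1, size=1000)
X_train, X_test, y_train = X[:800], X[800:], y[:800]
y_binary = (y_train > 0).astype(int)                # two classes: 0, 1
y_multiclass = np.digitize(y_train, [-0.5, 0.5])    # three classes: 0, 1, 2

# Constructor defaults embed the recommended recipe (early stopping against a large n_trees cap,
# leaf refinement, outer bagging), so a bare estimator performs well without tuning.

# Regression
reg = TriBoostRegressor().fit(X_train, y_train)
pred = reg.predict(X_test)

# Binary classification
clf = TriBoostClassifier().fit(X_train, y_binary)
proba = clf.predict_proba(X_test)        # (n, 2)

# Multiclass (native softmax): same API, K >= 3 classes
mclf = TriBoostClassifier().fit(X_train, y_multiclass)
proba = mclf.predict_proba(X_test)       # (n, K), rows sum to 1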
```

### The exact decomposition

```python
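import json
# `clf` is the fitted Quickstart classifier; `X_sample` is a feature matrix shaped like the training data.
tables = json.loads(clf.tables(X_sample))   # tables() returns JSON: the ≤3rd-order fANOVA rating tables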
```

Each fitted model decomposes into main effects and ≤3-way interactions that reproduce the raw
score exactly (for a multiclass model, one table bank per class logit).
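The idea behind the decomposition can be sketched for a single depth-3 tree. The example below is a hedged illustration rather than the library's algorithm or table schema: it assumes uniform weights over the 8 leaf cells (a real implementation may weight by the data), and `conditional_mean`, `fanova_tables`, `reconstruct`, and the leaf values are hypothetical. Each effect is the conditional mean on a subset of levels minus all lower-order effects, so the terms sum back to the leaf value.

```python
from itertools import combinations, product


def conditional_mean(leaves, fixed):
    """Mean leaf value over the cells that agree with `fixed` ({level: bit}), uniform weights."""
    cells = [bits for bits in product((0, 1), repeat=3)
             if all(bits[k] == v for k, v in fixed.items())]
    return sum(leaves[bits] for bits in cells) / len(cells)


def fanova_tables(leaves):
    """Return {subset_of_levels: {bits_on_subset: effect}} for all 8 subsets of the 3 levels."""
    tables = {}
    for order in range(4):  # 0 = intercept, 1 = main effects, 2 = pairs, 3 = the triple
        for subset in combinations(range(3), order):
            table = {}
            for values in product((0, 1), repeat=order):
                fixed = dict(zip(subset, values))
                effect = conditional_mean(leaves, fixed)
                # Subtract every strictly lower-order effect nested in this subset.
                for lower in range(order):
                    for sub in combinations(subset, lower):
                        effect -= tables[sub][tuple(fixed[k] for k in sub)]
                table[values] = effect
            tables[subset] = table
    return tables


def reconstruct(tables, bits):
    """Sum the intercept, main effects, and interactions for one leaf cell."""
    return sum(table[tuple(bits[k] for k in subset)] for subset, table in tables.items())


# Hypothetical leaf values, keyed by the (bit0, bit1, bit2) outcome of the three tests.
leaf_values = [0.1, -0.2, 0.3, 0.0, -0.4, 0.25, 0.05, 0.6]
leaves = dict(zip(product((0, 1), repeat=3), leaf_values))

tables = fanova_tables(leaves)
max_error = max(abs(reconstruct(tables, bits) - v) for bits, v in leaves.items())
```

In this sketch `max_error` is zero up to floating-point rounding. Summing the tables of the same order across trees that share the same features is what would collapse an ensemble into one small set of rating tables.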

## Objectives

| Objective | Task | Link |
|-----------|------|------|
| `squared_error` | regression | identity |
| `logistic` | binary classification | logit |
| softmax (automatic for `TriBoostClassifier` with ≥3 classes) | multiclass | softmax |
| `poisson` | counts / frequency | log |
| `gamma` | positive severities | log |
| `tweedie` | compound Poisson-gamma | log |
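
The link column describes how a raw additive score maps to the response scale. As a rough sketch of the standard inverse links (plain Python, not tri-boost code; `to_prediction` and its objective-to-link mapping are hypothetical helpers that mirror the table above):

```python
import math


def sigmoid(z: float) -> float:
    """Inverse of the logit link, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softmax(logits: list[float]) -> list[float]:
    """Turn K class logits into K probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max keeps exp() in range
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical mapping that mirrors the objectives table above.
LINKS = {
    "squared_error": "identity",
    "logistic": "logit",
    "poisson": "log",
    "gamma": "log",
    "tweedie": "log",
}


def to_prediction(raw_score: float, objective: str) -> float:
    """Apply the inverse link for `objective` to a raw additive score."""
    link = LINKS[objective]
    if link == "identity":
        return raw_score
    if link == "logit":
        return sigmoid(raw_score)
    return math.exp(raw_score)  # log link: the mean is exp(raw score)


probs = softmax([1.0, 2.0, 3.0])
```

Under a log link, additive contributions on the raw scale become multiplicative factors on the response scale, which matches the usual multiplicative structure of insurance rating tables.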

## Status & license

Pre-1.0. The `schema_version` of the wire format is versioned independently of the package version.
Licensed under **Apache-2.0**.

