Metadata-Version: 2.4
Name: geoboost
Version: 0.1.0
Summary: Geometry-Aware Gradient Boosting on Toroidal Manifolds — drop-in XGBoost replacement for periodic and manifold-structured data
Home-page: https://github.com/axiomcorpltd/geoboost
Author: Liviu Ioan Cadar
Author-email: Liviu Ioan Cadar <lee.cadar@gmail.com>
Maintainer-email: Liviu Ioan Cadar <lee.cadar@gmail.com>
License: Apache-2.0
Project-URL: Homepage, https://pypi.org/project/geoboost
Project-URL: Research Paper, https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6511819
Project-URL: Bug Tracker, https://github.com/leecadar/geoboost/issues
Project-URL: Author, https://axiomcorp.ai
Keywords: gradient-boosting,riemannian-geometry,toroidal-manifold,machine-learning,phase-transition,xgboost,bayesian,periodic-data,gaussian-curvature,manifold-learning
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Mathematics
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21.0
Requires-Dist: xgboost>=1.7.0
Requires-Dist: scikit-learn>=1.0.0
Requires-Dist: scipy>=1.7.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: build>=0.10.0; extra == "dev"
Requires-Dist: twine>=4.0.0; extra == "dev"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# GeoBoost

**Geometry-Aware Gradient Boosting on Toroidal Manifolds**

*Axiom Corp Ltd · Liviu Ioan Cadar · Manchester, UK*

[![PyPI version](https://badge.fury.io/py/geoboost.svg)](https://badge.fury.io/py/geoboost)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

---

## What is GeoBoost?

GeoBoost is a drop-in replacement for XGBoost that operates in the **Riemannian geometry of the toroidal manifold T²**. It addresses a fundamental problem that flat (Euclidean) gradient boosting cannot: modelling data whose features are periodic, cyclic, or angular.

**The core problem:** XGBoost assumes Euclidean space. For periodic data, two points at φ = +3.10 and φ = −3.10 are treated as 6.20 units apart. On the torus, they are neighbours separated by about 0.08 rad (2π − 6.20 ≈ 0.083). Any split threshold that falls between them, such as φ = 0, places these neighbours in opposite tree branches. This is not a bias; it is a **topological error**.

GeoBoost eliminates it.
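
The gap between the two views of distance can be checked with a few lines of standard-library Python. This is only an illustration of the arithmetic above; the helper names `euclidean_gap` and `circular_gap` are hypothetical and not part of the geoboost API.

```python
import math

def euclidean_gap(a: float, b: float) -> float:
    """Distance as a flat model sees it: the plain absolute difference."""
    return abs(a - b)

def circular_gap(a: float, b: float) -> float:
    """Shortest angular distance on the circle, in radians (always in [0, pi])."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

flat = euclidean_gap(3.10, -3.10)      # about 6.20
wrapped = circular_gap(3.10, -3.10)    # about 0.083, i.e. 2*pi - 6.20
print(f"flat={flat:.3f}  wrapped={wrapped:.3f}")
```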

---

## Benchmark Results

| Metric | XGBoost | GeoBoost | Notes |
|--------|---------|----------|-------|
| **Phase boundary accuracy** | 73.3% | **99.1%** | n=1000, 100 runs — *primary metric* |
| Overall accuracy | 99.3% | 99.8% | |
| Regressor RMSE | 0.028 | **0.012** | 2.4× improvement |
| AUC-ROC | 0.9977 | **1.0000** | |
| Commercial (Meridian) AUC | ~0.75 | **0.933** | Real-world data, n=180 |
| CRISIS recall | 0% | **80%** | 5-fold CV on real data |

**The phase boundary is the critical metric.** At the K=0 locus, where Gaussian curvature crosses zero, complex systems undergo phase transitions. Financial crises, protein misfolding, hurricane rapid intensification, geomagnetic storms: in this framework, all manifest as K sign inversions on the toroidal manifold. Flat XGBoost is systematically weak at this boundary (73.3% accuracy in the table above); GeoBoost is not (99.1%).

---

## Installation

```bash
pip install geoboost
```

---

## Quick Start

```python
import numpy as np
from geoboost import GeoBoostClassifier

# Data: first two columns MUST be (phi, psi) — angular coordinates
phi = np.random.uniform(-np.pi, np.pi, 500)
psi = np.random.uniform(-np.pi, np.pi, 500)
X = np.column_stack([phi, psi])
y = (np.cos(phi) > 0).astype(int)  # 0=CRISIS, 1=STABLE

# Fit — same API as XGBoost
clf = GeoBoostClassifier(R=1.0, r=0.5, n_estimators=100)
clf.fit(X, y)

# Predict
clf.predict(X[:5])
clf.predict_proba(X[:5])
clf.predict_regime(X[:5])         # 'CRISIS' / 'TRANSITION' / 'STABLE'
clf.phase_boundary_score(X, y)    # Primary metric — accuracy at K=0
```
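
To make "accuracy at K=0" concrete, here is a minimal sketch of a boundary-restricted accuracy. It assumes that "near the boundary" means |K| < `K_threshold` (0.05 is the documented default of that hyperparameter) and that curvature values are already available per sample. The function `boundary_accuracy` is hypothetical; how `phase_boundary_score` is actually computed inside the package may differ.

```python
import numpy as np

def boundary_accuracy(y_true, y_pred, K, K_threshold=0.05):
    """Accuracy restricted to samples with |K| < K_threshold.

    Hypothetical helper, not part of the geoboost API. Returns NaN when no
    sample falls inside the boundary band.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    near_boundary = np.abs(np.asarray(K, dtype=float)) < K_threshold
    if not near_boundary.any():
        return float("nan")
    return float(np.mean(y_true[near_boundary] == y_pred[near_boundary]))

# Toy data: four samples, two of which sit close to K = 0.
K = np.array([0.90, 0.01, -0.02, -0.70])
y_true = np.array([1, 1, 0, 0])
y_pred = np.array([1, 0, 0, 0])
score = boundary_accuracy(y_true, y_pred, K)   # only the two boundary samples count
```

Restricting the mask this way is why a model can score well overall yet poorly on this metric: the bulk samples far from K=0 do not enter the average.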

### Binary crisis classifier (recommended for real data)

```python
from geoboost import GeoBoostBinaryClassifier

clf = GeoBoostBinaryClassifier(
    crisis_weight=5.0,   # set to n_stable / n_crisis
    geodesic_splits=True
)
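# X and y come from the Quick Start above, where 0=CRISIS and 1=STABLE.
y_binary = 1 - y         # flip the coding so that 1=CRISIS, 0=NOT
clf.fit(X, y_binary)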

# Returns calibrated P(CRISIS) per sample
probs = clf.predict_crisis_probability(X)

# sklearn Pipeline compatible
from sklearn.pipeline import Pipeline
pipe = Pipeline([('geo', clf)])
```
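
The `crisis_weight` comment above suggests the ratio n_stable / n_crisis. A small sketch of that calculation, assuming labels coded as 1=CRISIS and 0=NOT; the helper name `crisis_weight_from_labels` is hypothetical and the toy labels are made up for illustration.

```python
import numpy as np

def crisis_weight_from_labels(y_binary):
    """Return n_stable / n_crisis for labels where 1 = CRISIS and 0 = NOT."""
    y_binary = np.asarray(y_binary)
    n_crisis = int(np.sum(y_binary == 1))
    n_stable = int(np.sum(y_binary == 0))
    if n_crisis == 0:
        raise ValueError("no CRISIS samples: the ratio is undefined")
    return n_stable / n_crisis

toy_labels = np.array([0] * 100 + [1] * 20)      # 100 stable, 20 crisis
weight = crisis_weight_from_labels(toy_labels)   # 100 / 20
```

The resulting value can then be passed as `crisis_weight=weight` when constructing the classifier.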

### Regressor

```python
from geoboost import GeoBoostRegressor

reg = GeoBoostRegressor(R=1.0, r=0.5, scale_riemannian=True)
reg.fit(X, K_values)    # K_values: Gaussian curvature targets, one per row of X
reg.predict(X)
```

### Auto-tune manifold parameters

```python
clf = GeoBoostClassifier(R=None, r=None)  # auto-tunes R and r from data
clf.fit(X, y)
print(clf.manifold_)   # e.g. TorusManifold(R=1.23, r=0.48)
```

---

## Full Report

```python
from geoboost.metrics import geoboost_report

print(geoboost_report(
    X_test, y_test,
    clf.predict(X_test),
    clf.predict_proba(X_test),
    clf.manifold_
))
```

Output:
```
╔══════════════════════════════════════════════╗
║         GeoBoost Evaluation Report           ║
╠══════════════════════════════════════════════╣
║  Overall Accuracy:        0.998              ║
║  Phase Boundary Acc:      0.991  ← PRIMARY   ║
║  AUC-ROC:                 1.000              ║
║  Geodesic RMSE:           0.012              ║
║  Manifold: R=1.00, r=0.50                    ║
╚══════════════════════════════════════════════╝
```

---

## Hyperparameters

### GeoBoost-specific

| Parameter | Default | Range | Description |
|-----------|---------|-------|-------------|
| `R` | 1.0 | [0.5, 5.0] | Major torus radius (None = auto) |
| `r` | 0.5 | [0.1, R×0.8] | Minor torus radius (None = auto) |
| `K_threshold` | 0.05 | [0.02, 0.20] | Phase boundary sensitivity |
| `boundary_weight` | 2.0 | [1.0, 5.0] | Amplification near K=0 |
| `geodesic_splits` | True | bool | Enable geodesic pre-clustering |
| `n_clusters` | 20 | [10, 50] | Geodesic cluster count |

All standard XGBoost parameters (`n_estimators`, `max_depth`, `learning_rate`, etc.) work identically.

### Auto-tuning

```python
from geoboost import GeoBoostHPO

hpo = GeoBoostHPO(n_iter=30, cv=5)
best_params = hpo.fit(X_train, y_train)
print(hpo.summary())
```

---

## When to Use GeoBoost

GeoBoost provides a statistically significant advantage when:

1. **Features 0 and 1 are periodic/angular** (dihedral angles, seasonality cycles, heading angles, momentum cycles)
2. **Mean |K| > 0.01** — the manifold has non-trivial curvature
3. **Phase transitions exist in the data** — regimes that shift at a geometric boundary
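
The second criterion can be checked before fitting. The sketch below assumes the standard embedded torus, whose Gaussian curvature is K = cos θ / (r (R + r cos θ)) with θ the tube (minor-circle) angle and R > r > 0. Which of `phi` or `psi` plays the role of θ inside GeoBoost is not stated here, so the generic name `theta` is used; `torus_gaussian_curvature` is a hypothetical helper, not a package function.

```python
import numpy as np

def torus_gaussian_curvature(theta, R=1.0, r=0.5):
    """K = cos(theta) / (r * (R + r*cos(theta))) for a torus with R > r > 0."""
    theta = np.asarray(theta, dtype=float)
    return np.cos(theta) / (r * (R + r * np.cos(theta)))

rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, 500)     # stand-in for the tube-angle column
K = torus_gaussian_curvature(theta)
mean_abs_K = float(np.mean(np.abs(K)))
has_curvature = mean_abs_K > 0.01           # criterion 2 from the list above
```

With this formula K is positive on the outer face (cos θ > 0), negative on the inner face (cos θ < 0), and zero on the two circles θ = ±π/2.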

**Use `GeoBoostBinaryClassifier` when:** n_crisis < 100  
**Use `RiemannianGeoBoostClassifier` when:** n_crisis > 100 (full Riemannian gradient G⁻¹∇L)

### Validated domains

| Domain | phi | psi | What K=0 means |
|--------|-----|-----|----------------|
| **Financial markets** | Price momentum cycle | Annual seasonality | Market crisis onset |
| **Structural biology** | Backbone dihedral φ | Backbone dihedral ψ | IDP→amyloid transition |
| **Atmospheric science** | Storm heading | Departure angle | Rapid intensification onset |
| **Commercial demand** | Booking momentum | Departure seasonality | Margin compression event |

---

## Architecture

GeoBoost augments XGBoost with three Riemannian components:

**1. Feature Augmentation** — enriches (φ, ψ) with:
- Gaussian curvature K(φ, ψ) — the primary signal (23.3% importance)
- Metric tensor components g_φφ, g_ψψ
- Distance to K=0 boundary
- Smooth periodic sin/cos encodings (replaces raw angles)
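
A minimal sketch of what such an augmentation could look like. It assumes the standard torus embedding with `psi` as the tube angle (swap the roles if your data uses the opposite convention), and it measures distance to the boundary as an angle in radians rather than as arc length. `augment_torus_features` is a hypothetical function; the package's own feature set and column order may differ.

```python
import numpy as np

def augment_torus_features(phi, psi, R=1.0, r=0.5):
    """Build an (n, 8) feature matrix from two angle arrays given in radians.

    Assumes psi is the tube angle of a standard torus with R > r > 0.
    Columns: sin/cos of phi, sin/cos of psi, K, g_phiphi, g_psipsi,
    angular distance of psi to the K = 0 circles at +/- pi/2.
    """
    phi = np.asarray(phi, dtype=float)
    psi = np.asarray(psi, dtype=float)
    rho = R + r * np.cos(psi)                       # distance from the torus axis
    K = np.cos(psi) / (r * rho)                     # Gaussian curvature
    g_phiphi = rho ** 2                             # metric component along phi
    g_psipsi = np.full_like(psi, r ** 2)            # metric component along psi (constant)
    wrapped = np.arctan2(np.sin(psi), np.cos(psi))  # psi folded into [-pi, pi]
    dist_to_boundary = np.abs(np.abs(wrapped) - np.pi / 2)
    return np.column_stack([
        np.sin(phi), np.cos(phi), np.sin(psi), np.cos(psi),
        K, g_phiphi, g_psipsi, dist_to_boundary,
    ])

features = augment_torus_features([0.0, 3.10, -3.10], [0.0, np.pi / 2, np.pi])
```

The sin/cos columns are what let a tree model see φ = +3.10 and φ = −3.10 as nearly identical inputs, since both map to almost the same (sin, cos) pair.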

**2. Geodesic Splits** — the 4D embedding (φ, ψ) ↦ (cos φ, sin φ, cos ψ, sin ψ) ∈ R⁴ removes the ±π wrap discontinuity entirely. K=0 boundary samples are seeded into dedicated clusters to preserve phase boundary accuracy.
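
The effect of the embedding can be seen by comparing Euclidean distances in R⁴. This sketch covers only the embedding step; `embed_torus` is a hypothetical helper, and the clustering procedure GeoBoost runs on top of it is not reproduced here.

```python
import numpy as np

def embed_torus(phi, psi):
    """Map angle arrays to points (cos phi, sin phi, cos psi, sin psi) in R^4."""
    phi = np.asarray(phi, dtype=float)
    psi = np.asarray(psi, dtype=float)
    return np.column_stack([np.cos(phi), np.sin(phi), np.cos(psi), np.sin(psi)])

points = embed_torus([3.10, -3.10, 0.0], [0.0, 0.0, 0.0])
gap_across_wrap = float(np.linalg.norm(points[0] - points[1]))  # small: true neighbours
gap_to_zero = float(np.linalg.norm(points[0] - points[2]))      # large: opposite side
```

Because the embedded points vary continuously with the angles, any Euclidean clustering run on them treats samples on either side of ±π as close together.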

**3. Riemannian Gradient** (`RiemannianGeoBoostClassifier`) — replaces the Euclidean gradient with the true Riemannian gradient G⁻¹∇L, applying a 1.8× correction factor at the crisis (inner face) region.
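
For a diagonal metric, applying G⁻¹ reduces to dividing each gradient component by the matching metric component. The sketch below assumes the standard torus metric G = diag((R + r cos ψ)², r²) with `psi` as the tube angle. `riemannian_gradient` is a hypothetical helper; how the package maps this onto per-sample boosting gradients, and how the 1.8× factor is derived, is not described in this README and is not reproduced here.

```python
import numpy as np

def riemannian_gradient(grad_phi, grad_psi, psi, R=1.0, r=0.5):
    """Apply G^{-1} to a Euclidean gradient for the diagonal torus metric.

    G = diag((R + r*cos(psi))**2, r**2), so G^{-1} rescales each component.
    Assumes psi is the tube angle and R > r > 0.
    """
    grad_phi = np.asarray(grad_phi, dtype=float)
    grad_psi = np.asarray(grad_psi, dtype=float)
    psi = np.asarray(psi, dtype=float)
    g_phiphi = (R + r * np.cos(psi)) ** 2
    g_psipsi = r ** 2
    return grad_phi / g_phiphi, grad_psi / g_psipsi

# The same Euclidean gradient evaluated at the outer face (psi = 0)
# and at the inner face (psi = pi).
nat_phi, nat_psi = riemannian_gradient([1.0, 1.0], [1.0, 1.0], [0.0, np.pi])
```

Under these assumptions the φ component is scaled up on the inner face, where (R + r cos ψ)² is smallest, while the ψ component is rescaled uniformly.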

---

## Scientific Background

GeoBoost is part of the **CCMC (Cadar Chain Monte Carlo)** research framework — a toroidal manifold-based Bayesian inference system validated across four independent scientific domains.

The framework's central finding: **the K sign inversion is the universal phase transition signal on T²**. When Gaussian curvature transitions from positive to negative across the K=0 boundary, a regime transition is occurring. GeoBoost is the supervised learning component of this framework: it learns the K=0 crossing from labelled data.

**Related publications:**
- Paper 2: *The Regime-Inversion Theorem: Geometric Detection of Market Crisis via Toroidal Manifold Analysis* — SSRN 6511819 / Quantitative Finance (submitted April 2026)
- Paper 3: *Toroidal Curvature Inversion as a Geometric Biosignal for IDP Amyloidogenesis* — Nature Methods NMETH-A66145 (submitted May 2026)
- Paper 7: *GeoBoost: Geometry-Aware Gradient Boosting on Toroidal Manifolds* — in preparation

---

## Citation

```bibtex
@software{cadar2026geoboost,
  title   = {GeoBoost: Geometry-Aware Gradient Boosting on Toroidal Manifolds},
  author  = {Cadar, Liviu Ioan},
  year    = {2026},
  version = {0.1.0},
  url     = {https://pypi.org/project/geoboost},
  note    = {Axiom Corp Ltd, Manchester UK. ORCID: 0009-0000-7874-8121}
}
```

---

## Known Limitations

- Optimised for the torus T²; extension to Tⁿ (n>2) is untested
- Geodesic clustering is CPU-bound (no GPU acceleration yet)
- Riemannian gradient advantage requires n_crisis > 100 to stabilise
- Real-world performance on very small datasets (N < 500) is less pronounced than synthetic benchmarks

---

## License

Apache License 2.0 — see [LICENSE](LICENSE) for details.

Developed by **Liviu Ioan Cadar**, Axiom Corp Ltd, Manchester UK.  
*Legatum Super Omnia.*

---

*For research enquiries: lee.cadar@gmail.com*  
*Research profile: [SSRN](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6511819)*
