Metadata-Version: 2.4
Name: pytyche
Version: 0.2.1
Summary: Bayesian causal inference for zero-inflated outcomes — GPU-accelerated joint hurdle BCF with SBC calibration
Author-email: Tim Radcliffe <tradcliffe2@gmail.com>
License: MIT License
        
        Copyright (c) 2026 Tim Radcliffe
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://gitlab.com/tradcliffe2/tyche
Project-URL: Documentation, https://tradcliffe2.gitlab.io/tyche/
Project-URL: Repository, https://gitlab.com/tradcliffe2/tyche
Project-URL: Issues, https://gitlab.com/tradcliffe2/tyche/-/issues
Project-URL: Changelog, https://gitlab.com/tradcliffe2/tyche/-/blob/main/CHANGELOG.md
Keywords: causal-inference,bayesian,BCF,BART,JAX,calibration,experimentation,treatment-effects,hurdle-model
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: scipy
Requires-Dist: pandas
Requires-Dist: scikit-learn
Requires-Dist: pyyaml
Requires-Dist: arviz
Requires-Dist: matplotlib>=3.7
Requires-Dist: pillow>=10.0
Requires-Dist: tqdm>=4.60
Requires-Dist: jax
Requires-Dist: bartz>=0.9.0
Requires-Dist: numpyro
Provides-Extra: gpu
Requires-Dist: jax[cuda12]; extra == "gpu"
Provides-Extra: cpu-bcf
Requires-Dist: stochtree; extra == "cpu-bcf"
Provides-Extra: tracking
Requires-Dist: dvc>=3.0; extra == "tracking"
Requires-Dist: dvclive>=3.0; extra == "tracking"
Requires-Dist: matplotlib>=3.7; extra == "tracking"
Requires-Dist: plotext>=5; extra == "tracking"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: pytest-xdist; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: pyright; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=7.3; extra == "docs"
Requires-Dist: myst-parser>=3.0; extra == "docs"
Requires-Dist: myst-nb>=1.1; extra == "docs"
Requires-Dist: jupytext>=1.16; extra == "docs"
Requires-Dist: furo>=2024.5; extra == "docs"
Requires-Dist: sphinx-autodoc-typehints; extra == "docs"
Requires-Dist: sphinx-autobuild>=2024.4; extra == "docs"
Requires-Dist: sphinx-copybutton; extra == "docs"
Requires-Dist: sphinx-design; extra == "docs"
Requires-Dist: sphinx-sitemap; extra == "docs"
Requires-Dist: jsonschema>=4.0; extra == "docs"
Requires-Dist: linkify-it-py>=2.0; extra == "docs"
Dynamic: license-file

# pytyche

**GPU-accelerated Bayesian causal forests for zero-inflated outcomes — and an adaptive, round-based experiment loop built on top of them.**

[![PyPI version](https://img.shields.io/pypi/v/pytyche)](https://pypi.org/project/pytyche/)
[![Python versions](https://img.shields.io/pypi/pyversions/pytyche)](https://pypi.org/project/pytyche/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Docs](https://img.shields.io/badge/docs-gitlab.io-blue)](https://tradcliffe2.gitlab.io/tyche/)

pytyche does two things. First, it ships some of the **fastest Bayesian Causal
Forest (BCF) estimators available**: continuous and binary effects run on the
GPU via [bartz](https://github.com/Gattocrucco/bartz), and *hurdle* outcomes
(revenue, spend, and other "mostly-zero, sometimes-positive" metrics) run on
pytyche's own GPU kernel. Any of these can be used **standalone for a single
fit** — give it data, get back a calibrated posterior over heterogeneous
treatment effects. Second, it wraps those estimators in a **round-based
adaptive experiment loop** that allocates the next round's traffic toward the
segments that respond, while keeping controls everywhere so measurement stays
honest. The whole loop runs on a single GPU.

The speed is what makes the rest practical: BCF intervals at production scale
need empirical recalibration (simulation-based calibration across realistic
data), which means hundreds of full posterior fits. On a GPU that is an
overnight job instead of a CPU-week, so calibration becomes something you do
per-deployment rather than per-publication.
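
As a toy illustration of the simulation-based calibration loop described above, the sketch below uses a conjugate normal model whose posterior is known exactly, so it needs only the standard library. It is not pytyche's calibration code: `sbc_ranks` and `interval_coverage` are hypothetical helper names. In a real sweep the exact posterior draw would be replaced by a full BCF fit per simulated dataset, which is where the hundreds of fits come from.

```python
import random

def sbc_ranks(num_sims=500, num_draws=99, n_obs=10, seed=0):
    """Return one SBC rank per simulated dataset (toy conjugate normal model)."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(num_sims):
        theta = rng.gauss(0.0, 1.0)                         # truth drawn from the N(0, 1) prior
        ys = [rng.gauss(theta, 1.0) for _ in range(n_obs)]  # data simulated from that truth
        post_var = 1.0 / (n_obs + 1)                        # exact posterior variance
        post_mean = sum(ys) * post_var                      # exact posterior mean
        draws = [rng.gauss(post_mean, post_var ** 0.5) for _ in range(num_draws)]
        ranks.append(sum(d < theta for d in draws))         # rank of the truth among the draws
    return ranks

def interval_coverage(ranks, num_draws=99, level=0.9):
    """Share of simulations whose truth lies inside the central `level` interval."""
    lo = (1 - level) / 2 * (num_draws + 1)
    hi = (1 + level) / 2 * (num_draws + 1)
    return sum(lo <= r < hi for r in ranks) / len(ranks)

ranks = sbc_ranks()
coverage = interval_coverage(ranks)
```

With a correct posterior the ranks are uniform, so `coverage` lands near the nominal 0.9. Intervals that are too narrow show up as coverage below nominal, which is the gap recalibration is meant to close.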

## Install

```bash
# Recommended — GPU JAX (CUDA 12, Linux)
uv add 'pytyche[gpu]'      # or: pip install 'pytyche[gpu]'

# CPU-only (fully functional; the first fit warns once if no GPU is found)
uv add pytyche             # or: pip install pytyche
```

Check the runtime with `python -c "import pytyche as pt; pt.check_setup()"`.

## Quick start

Fit the canonical hurdle model on an 800-visitor synthetic dataset in about
20 seconds on JAX-CPU:

```python
import os
os.environ["JAX_PLATFORMS"] = "cpu"  # set before importing pytyche; omit to use a GPU
import pytyche as pt

bundle = pt.generate(n_visitors=800, segments={
    "responders":     {"pct": 0.4, "base_conv": 0.08, "treatment_effect": 0.10,
                       "aov_mu": 3.5, "aov_sigma": 0.5, "treatment_aov_mu_shift": 0.15},
    "non_responders": {"pct": 0.6, "base_conv": 0.06, "treatment_effect": 0.0,
                       "aov_mu": 3.3, "aov_sigma": 0.5, "treatment_aov_mu_shift": 0.0},
}, metric="revenue_per_visitor", seed=0)

result = pt.fit(bundle.observed, num_burnin=40, num_mcmc=80, num_trees_mu=30,
                num_trees_tau=15, max_depth=4, num_gfr_sweeps=2,
                diagnostic_interval=20, seed=0)
result.analyze()  # treatment comparisons, discovered segments, recommendation
# result.rpv_cate_samples → (n_visitors, 80) posterior draws of the per-visitor effect
```
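
For intuition about what a hurdle outcome is, the sketch below computes expected revenue per visitor as conversion probability times mean order value, using the `responders` numbers from the generator call above. It rests on assumptions this README does not confirm: that `treatment_effect` is an additive lift in conversion probability, that `aov_mu` and `aov_sigma` parameterize a lognormal order value, and that `treatment_aov_mu_shift` adds to the log-mean. `expected_rpv` is a hypothetical helper, not part of pytyche.

```python
import math

def expected_rpv(p_convert, log_mu, log_sigma):
    """Expected revenue per visitor under a hurdle model with lognormal order values."""
    mean_order_value = math.exp(log_mu + 0.5 * log_sigma ** 2)  # mean of a lognormal
    return p_convert * mean_order_value                          # zeros contribute nothing

# "responders" segment, under the assumed parameter semantics described above
control = expected_rpv(0.08, 3.5, 0.5)
treated = expected_rpv(0.08 + 0.10, 3.5 + 0.15, 0.5)
lift = treated - control  # per-visitor revenue effect implied by the assumed generator
```

The two factors correspond to the two channels a hurdle model fits: whether a visitor converts at all, and how much they spend if they do.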

To run a full multi-round experiment instead of a single fit,
`pt.sequential_experiment(...)` drives the adaptive loop end to end. A
realistically sized run (350,000 visitors) takes about fifteen minutes on a
consumer GPU.

## Highlights

- **GPU hurdle BCF.** Two coupled forests — probit conversion and log-severity
  — share a single tree topology (following Linero et al.'s shared Bayesian
  forests), so the structure carries information across both channels and
  stabilizes per-segment effects at the low conversion rates online
  experiments actually live at. Roughly **4.5–8.6× faster** than the StochTree
  CPU backend at n=750k; single-channel continuous/binary fits hit **17–63×**
  from n=250k to n=2M ([benchmark grid](ROADMAP.md#empirical-performance-snapshot-2026-05-08--bench-publication-grid)).
- **Calibrated intervals.** BCF posteriors are narrow by construction, so raw
  credible intervals can under-cover; pytyche recalibrates them against
  simulation-based ground truth so the intervals you report are honest at
  your operating scale.
- **Adaptive experiment loop.** `pt.sequential_experiment` runs Thompson
  allocation with guaranteed control retention and built-in power simulation.
- **Interpretable segments.** Each round compresses the effect posterior into
  a shallow policy tree — a reviewable decision surface, not just a model.
- **Synthetic data generators.** A small typed grammar
  (`pytyche.generators.scenarios`) parameterizes the data-generating process
  for calibration sweeps and power analysis.
- **Honest-uncertainty contracts.** `pytyche.contracts` separates observed
  data from ground truth at the type level, so analysis code cannot
  accidentally peek at what it shouldn't see.
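
The adaptive-loop bullet above mentions Thompson allocation with guaranteed control retention. The sketch below shows one generic way to combine those two ideas for a binary conversion outcome. It is not pytyche's algorithm, which works from the BCF posterior: it uses a simple Beta-Bernoulli posterior per segment, and `thompson_allocation` and `control_floor` are hypothetical names.

```python
import random

def thompson_allocation(successes, failures, control_floor=0.2, num_draws=2000, seed=0):
    """Treatment share per segment via probability matching, clipped to a floor.

    successes / failures map segment -> (control_count, treated_count).
    """
    rng = random.Random(seed)
    shares = {}
    for seg in successes:
        s_c, s_t = successes[seg]
        f_c, f_t = failures[seg]
        wins = 0
        for _ in range(num_draws):
            p_c = rng.betavariate(1 + s_c, 1 + f_c)  # posterior draw, control rate
            p_t = rng.betavariate(1 + s_t, 1 + f_t)  # posterior draw, treated rate
            wins += p_t > p_c
        prob_treatment_better = wins / num_draws
        # Clip so neither arm ever drops below `control_floor` of the segment's traffic.
        shares[seg] = min(max(prob_treatment_better, control_floor), 1 - control_floor)
    return shares

successes = {"responders": (80, 180), "non_responders": (60, 60)}
failures = {"responders": (920, 820), "non_responders": (940, 940)}
shares = thompson_allocation(successes, failures)
```

Segments where treatment clearly wins are pushed to the maximum treatment share, while the clip keeps a control group in every segment so the next round's effect estimates remain measurable.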

## Documentation

- **[Your first hurdle BCF fit](https://tradcliffe2.gitlab.io/tyche/tutorials/first-hurdle-bcf-fit.html)** — install to an interpretable posterior.
- **[The adaptive experiment](https://tradcliffe2.gitlab.io/tyche/tutorials/first-adaptive-experiment.html)** — the full multi-round loop, end to end.
- **[Overview](https://tradcliffe2.gitlab.io/tyche/concepts/overview.html)** — what pytyche does, who it's for, and the design lineage.
- **[Full documentation](https://tradcliffe2.gitlab.io/tyche/)** — tutorials, how-to guides, concepts, and the API reference.

## When to use it

pytyche is built for **designed experiments**: round-based online tests with a
handful of treatments where assignment rules are explicit and propensities are
recorded exactly. It also supports **observational causal inference** — BCF is
purpose-built for confounded settings, taking propensity scores into the prior
for strong point estimation. Two honest caveats there: pytyche expects
propensity scores as an *input* (it has no built-in nuisance/propensity
estimation or double-ML cross-fitting — that's the reason to reach for
[econml](https://github.com/py-why/EconML) or
[DoubleML](https://github.com/DoubleML/doubleml-for-py) instead), and the
library is shaped and validated around designed experiments, so observational
use is supported but less tested. In all cases, treat intervals as needing
calibration at your scale before you rely on them.
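
To make "propensities are recorded exactly" concrete, the sketch below logs the assignment probability at randomization time and then uses it in a Horvitz-Thompson (inverse-propensity-weighted) estimate. This is a generic estimator chosen for illustration, not how pytyche's BCF consumes propensities. The helper names, the row layout, and the synthetic outcomes are all assumptions.

```python
import random

def assign_and_log(segments, treat_prob, seed=0):
    """Randomize each visitor and record the exact propensity that was used."""
    rng = random.Random(seed)
    rows = []
    for seg in segments:
        e = treat_prob[seg]  # explicit, per-segment assignment rule
        rows.append({"segment": seg, "propensity": e, "treated": rng.random() < e})
    return rows

def ipw_effect(rows, outcomes):
    """Horvitz-Thompson estimate of the average treatment effect."""
    total = 0.0
    for row, y in zip(rows, outcomes):
        e = row["propensity"]
        total += y / e if row["treated"] else -y / (1.0 - e)
    return total / len(rows)

segments = ["a"] * 10_000 + ["b"] * 10_000
rows = assign_and_log(segments, {"a": 0.5, "b": 0.8})
outcomes = [3.0 if r["treated"] else 1.0 for r in rows]  # synthetic: true effect is 2.0
estimate = ipw_effect(rows, outcomes)
```

Because each row carries the propensity actually used, unequal allocation across segments does not bias the estimate. When propensities must instead be estimated from observational data, that step happens outside pytyche, as noted above.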

Out of scope: marketplaces and anything with cross-visitor interference (SUTVA
violations), regulated contexts needing preregistration-grade governance,
large-catalog per-item recommendation, and real-time / streaming adaptation.
The full scope discussion is in the
[overview](https://tradcliffe2.gitlab.io/tyche/concepts/overview.html#scope-and-assumptions).

## Contributing

Contributions are welcome — see [`CONTRIBUTING.md`](CONTRIBUTING.md) for the
development setup, branching model, and testing tiers.

## License

MIT — see [`LICENSE`](LICENSE). Built on
[bartz](https://github.com/Gattocrucco/bartz) (MIT) by Giacomo Petrillo; the
GPU BART kernels the continuous and binary paths fit on top of are bartz's. The
hurdle GPU kernel, shared-tree extensions, and the calibration / targeting /
generator stack are pytyche's.

**Source:** <https://gitlab.com/tradcliffe2/tyche> · **PyPI:** <https://pypi.org/project/pytyche/> (the package is `pytyche`; the GitLab repo is `tyche` for URL brevity).

## Citation

*Methodology paper in preparation. Cite as `pytyche, v0.2.1,
https://gitlab.com/tradcliffe2/tyche` until a citable DOI is available.*
