Metadata-Version: 2.4
Name: trench-core
Version: 0.8.0
Summary: Plumbing for calibrated AI trading agents — capture, score, instrument, anchor.
Author-email: TrenchSignals <hello@trenchsignals.io>
License: MIT
Project-URL: Homepage, https://trenchsignals.io
Project-URL: Documentation, https://trenchsignals.io/api
Project-URL: Repository, https://github.com/trenchsignals/trench-core
Project-URL: Changelog, https://github.com/trenchsignals/trench-core/blob/main/CHANGELOG.md
Keywords: llm-agents,calibration,brier-score,prediction-markets,trading,ai,anthropic,claude
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Office/Business :: Financial :: Investment
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: calibration
Requires-Dist: pandas>=2.0; extra == "calibration"
Requires-Dist: numpy>=1.24; extra == "calibration"
Provides-Extra: replay
Provides-Extra: sources
Requires-Dist: feedparser>=6.0; extra == "sources"
Provides-Extra: markets
Provides-Extra: ontology
Provides-Extra: all
Requires-Dist: trench-core[calibration,markets,ontology,replay,sources]; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: pdoc>=14.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Dynamic: license-file

# trench-core

> Plumbing for calibrated AI trading agents.
> Capture, score, instrument, anchor.

[![PyPI version](https://img.shields.io/pypi/v/trench-core.svg)](https://pypi.org/project/trench-core/)
[![Python versions](https://img.shields.io/pypi/pyversions/trench-core.svg)](https://pypi.org/project/trench-core/)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

`trench-core` is the open-source framework that powers
[TrenchSignals](https://trenchsignals.io), an autonomous AI agent that
paper-trades geopolitical conflict markets in public, with every prediction
Brier-scored and every loss publicly post-mortemed.

This package is the **plumbing**: capture, score, instrument, anchor. It is
deliberately *not* the agent itself — TrenchSignals' specific ontology,
prompts, brand voice, and operating record stay private.

If you want to:

- **Capture every LLM call** with full input/output for later replay through a
  different model
- **Score predictions** against actual market settlements (Brier, calibration
  curves, threshold backtests, P&L attribution)
- **Anchor predictions** into a public hash chain so anyone can verify them
  without trusting your server
- **Instrument every loop iteration** with a single structured outcome line so
  failure modes are observable in production
- **Run a multi-variant tournament** of decision policies on the same
  intelligence pipeline

…then `trench-core` gives you the building blocks.
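
As a rough illustration of the scoring idea, the Brier score is the mean squared difference between forecast probabilities and the 0/1 settlement outcomes. The sketch below is standalone Python and does not use `trench_core`; the function name `brier_score` and the sample numbers are hypothetical, chosen only for illustration.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equal-length, non-empty sequences")
    total = sum((p - o) ** 2 for p, o in zip(forecasts, outcomes))
    return total / len(forecasts)


# Hypothetical forecasts of YES, and how each market settled (1 = YES).
probs = [0.9, 0.2, 0.5]
settled = [1, 0, 1]

score = brier_score(probs, settled)
print(round(score, 4))
```

Lower is better: a perfect forecaster scores 0, and always answering 0.5 scores 0.25.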

## Status

> **Alpha (`0.x`).** All eight modules are shipped (`0.8.0`). API may break
> between minor versions until `1.0.0`. Pin exactly if you depend on it.

See [the changelog](CHANGELOG.md) for what's landed.

## Install

```bash
pip install trench-core              # core + most modules (stdlib only)
pip install 'trench-core[sources]'   # adds feedparser for RSS polling
pip install 'trench-core[all]'       # everything
```

Six of the eight modules need only the standard library. The package
metadata declares optional runtime dependencies for two extras: `sources`
(`feedparser`, for RSS polling) and `calibration` (`pandas` and `numpy`).
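
If you are unsure which optional dependencies are present in an environment, one way to check is with the standard library's `importlib.util.find_spec`. This is a generic sketch, not a `trench_core` API; the helper name `has_module` is hypothetical, and the package names are the ones listed in the package metadata.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


# Optional third-party packages named in the package metadata.
available = {name: has_module(name) for name in ("feedparser", "pandas", "numpy")}
for name, present in available.items():
    print(f"{name}: {'installed' if present else 'missing'}")
```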

## Quickstart

A minimal end-to-end loop: capture a Claude call, log its outcome,
register the bundle's hash on a public chain, score the prediction
once the market settles, and replay the captured call through a different
model. Neither `requests` nor the Anthropic SDK is required. The framework
stays provider-agnostic, and you wire in your own model caller.

```python
from pathlib import Path

from trench_core.replay import BundleWriter, load_bundles, replay_bundle, diff_signals
from trench_core.cycle_outcomes import OutcomeLogger
from trench_core.registry import RegistryWriter, verify_chain
from trench_core.calibration import calibration_report

# 1. Capture: at the moment of generation
writer = BundleWriter(
    path="bundles.jsonl",
    system_prompt="You are an analyst...",   # auto-hashed for grouping
)
my_prompt = "Will Iran sign a deal by June 2027?"
my_raw    = your_llm_call(my_prompt, model="sonnet-4-6")  # caller-supplied
my_parsed = your_parser(my_raw)
writer.write(prompt=my_prompt, response_raw=my_raw,
             parsed=my_parsed, model="sonnet-4-6")

# 2. Instrument: emit one structured outcome line per loop iteration
outcomes = OutcomeLogger("/var/log/agent.log")
outcomes.emit("traded", confidence=my_parsed["confidence"], side="NO")

# 3. Pre-register: anchor today's bundles in a public hash chain
chain = RegistryWriter(
    bundle_paths={"baseline": "bundles.jsonl"},
    registry_root="registry/",
    summary_fields=("direction", "confidence"),
)
chain.update()                    # appends today's record
verify_chain("registry/")         # raises ChainBroken if anything tampered

# 4. Score: when the market settles, run a calibration report
trades = [...]        # your closed-trade dicts
evals  = [...]        # your per-market evaluation dicts
report = calibration_report(trades, evaluations=evals)
print(report["brier"]["mean_brier"], report["trade_summary"]["roi_pct"])

# 5. Replay: re-run a captured cycle through a different model
bundles = load_bundles("bundles.jsonl")
res = replay_bundle(
    bundle=bundles[-1],
    model="haiku-4-5",
    model_caller=your_llm_call,
    parser=your_parser,
)
print(diff_signals(res.bundle.parsed, res.candidate_parsed))
```

Each module is independently usable — pick the ones you need. Per-module
quickstarts live in their `__init__.py` docstrings (also rendered in the
generated API docs).
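
To make the pre-registration step more concrete, here is a minimal sketch of the general hash-chain idea: each record stores the hash of the previous record, so editing any earlier entry invalidates everything after it. This is not the `trench_core.registry` implementation or its on-disk format; the record layout and the names `GENESIS`, `record_hash`, `append_record`, and `chain_is_valid` are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first record


def record_hash(prev_hash, payload):
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a record that links back to the current chain tip."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": record_hash(prev_hash, payload)})


def chain_is_valid(chain):
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = GENESIS
    for rec in chain:
        if rec["prev"] != prev_hash:
            return False
        if rec["hash"] != record_hash(rec["prev"], rec["payload"]):
            return False
        prev_hash = rec["hash"]
    return True


chain = []
append_record(chain, {"direction": "NO", "confidence": 0.7})
append_record(chain, {"direction": "YES", "confidence": 0.6})
ok_before = chain_is_valid(chain)

chain[0]["payload"]["confidence"] = 0.99  # simulate after-the-fact tampering
ok_after = chain_is_valid(chain)
print(ok_before, ok_after)
```

Because verification only recomputes hashes, anyone holding a copy of the records can run it without trusting the server that produced them.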

## Modules

| Module | What it does | Status |
|---|---|---|
| `trench_core.calibration` | Brier scoring, calibration curves, threshold backtests, P&L attribution | ✅ shipped (0.3.0) |
| `trench_core.replay` | Capture-then-replay + diff harness for LLM agents | ✅ shipped (0.4.0) |
| `trench_core.cycle_outcomes` | Structured "one outcome line per loop iteration" instrumentation | ✅ shipped (0.2.0) |
| `trench_core.registry` | Public hash-chain pre-registration of agent outputs | ✅ shipped (0.2.0) |
| `trench_core.ontology` | Generic typed entity graph + alias resolver, SQLite-backed | ✅ shipped (0.5.0) |
| `trench_core.sources` | RSS poller + USGS seismic poller (Twitter/Telegram/financial deferred) | ✅ shipped (0.6.0) |
| `trench_core.markets` | Public-data clients — Kalshi (read-only), Manifold (Polymarket trading deferred) | ✅ shipped (0.7.0) |
| `trench_core.tournament` | Multi-variant runner pattern — same intel, different policies | ✅ shipped (0.8.0) |
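
The "same intel, different policies" pattern in the last row can be pictured as follows. This is a conceptual sketch only and does not reflect the `trench_core.tournament` API; `run_tournament`, the two policy functions, and the thresholds are hypothetical.

```python
def run_tournament(intel, policies):
    """Apply every policy to the same intel and collect each decision."""
    return {name: policy(intel) for name, policy in policies.items()}


def cautious(intel):
    # Only trades on high-confidence signals.
    return "trade" if intel["confidence"] >= 0.8 else "skip"


def aggressive(intel):
    # Trades on anything modestly better than a coin flip.
    return "trade" if intel["confidence"] >= 0.55 else "skip"


# One shared piece of intelligence, evaluated by both variants.
intel = {"market": "example-market", "confidence": 0.65}
decisions = run_tournament(intel, {"cautious": cautious, "aggressive": aggressive})
print(decisions)
```

Holding the input fixed means any difference in results comes from the decision policy rather than from the intelligence pipeline.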

## What's deliberately not in scope

- ❌ A backtesting engine (use [zipline](https://zipline.ml4trading.io/) or
  [vectorbt](https://vectorbt.dev/) for historical sim)
- ❌ A strategy library (no canned signals)
- ❌ A managed/SaaS version (you run it yourself)
- ❌ Brokerage integration (the framework provides data, not execution glue)
- ❌ A multi-provider AI abstraction layer (Anthropic-first by design; wrap
  your own analyzer for OpenAI)

## Examples

The [`examples/`](examples/) directory has runnable demos for every
module. All of them run offline (network calls are mocked):

```bash
python examples/01_cycle_outcomes.py
python examples/02_registry.py
python examples/03_calibration.py
python examples/04_replay.py
python examples/05_ontology.py
python examples/06_sources.py
python examples/07_markets.py
python examples/08_tournament.py
```

## Contributing

The project is alpha — issues and PRs welcome, but expect API churn
until `1.0.0`. See [CONTRIBUTING.md](CONTRIBUTING.md) for the
development setup and the discipline that's kept the extraction
clean (audit before code, ruff + pytest pre-flight, byte-identical
proof for refactors).

## License

MIT — see [LICENSE](LICENSE).
