Metadata-Version: 2.4
Name: helm-ai
Version: 0.2.5
Summary: AI agent orchestration and token-optimisation system
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.0
Requires-Dist: python-frontmatter>=1.1
Requires-Dist: anthropic>=0.40
Requires-Dist: fastapi>=0.100
Requires-Dist: uvicorn>=0.23
Requires-Dist: httpx>=0.25
Requires-Dist: jinja2>=3.0
Requires-Dist: fastmcp<0.2,>=0.1
Requires-Dist: requests>=2.31
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-timeout>=2.3; extra == "dev"
Requires-Dist: pytest-cov>=5.0; extra == "dev"
Requires-Dist: pytest-xdist>=3.0; extra == "dev"
Dynamic: license-file

# Helm

AI agent orchestration and token-optimisation system.

A routing layer dispatches tasks to the most efficient capable worker — local Ollama, MCP expert, Playwright, or Claude Orchestrator — based on confidence scoring. A SENTINEL knowledge file accumulates lessons per-pod so routing improves over time. At the org level, a CEO orchestrator aggregates across pods into exec summaries and a risk register.
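
As a rough illustration of confidence-based routing, the sketch below picks the cheapest worker whose confidence clears a threshold and falls back to the orchestrator otherwise. The `Worker` type, the cost ranks, the `0.8` threshold, and the `route` function are all hypothetical and are not taken from the `helm` package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    name: str
    cost_rank: int      # hypothetical: lower means cheaper in tokens
    confidence: float   # hypothetical: estimated chance of handling the task, 0.0 to 1.0


def route(workers: list[Worker], fallback: Worker, threshold: float = 0.8) -> Worker:
    """Return the cheapest worker at or above the threshold, else the fallback."""
    capable = [w for w in workers if w.confidence >= threshold]
    if not capable:
        return fallback
    return min(capable, key=lambda w: w.cost_rank)


# Illustrative values only; real scores would come from the routing layer.
orchestrator = Worker("claude-orchestrator", cost_rank=3, confidence=1.0)
candidates = [
    Worker("ollama-local", cost_rank=0, confidence=0.55),
    Worker("mcp-expert", cost_rank=1, confidence=0.9),
    Worker("playwright", cost_rank=2, confidence=0.95),
]
chosen = route(candidates, fallback=orchestrator)
print(chosen.name)
```

Here the local worker is cheapest but falls below the threshold, so the next-cheapest capable worker is selected.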

---

## Stack

- **Python 3.11+** — core orchestration package (`helm/`)
- **Pydantic v2** — all schema validation
- **TypeScript** — MCP bridge (Sprint 3B)
- **HTML** — report output (Sprint 3B)

---

## Project Structure

```
helm/                   # Python package
tests/
  fixtures/             # Canonical test fixtures — never modified during test runs
  mocks/
    claude_responses/   # Mock Claude API responses for unit tests
    ollama_responses/   # Mock Ollama responses for unit tests
docs/
  tasks/
    worker.md           # helm-builder task queue
    mcp.md              # mcp-builder task queue
  messages/             # Async inter-agent comms
  tracking/
    sprint.md           # Sprint board (source of truth)
  decisions/            # Architecture decision records
  lessons-learned.md    # Known failure patterns — read before every task
  guidelines/
    workflow.md         # Task sizing and completion protocol
CLAUDE.md               # Project rules — read every session via /go
AGENT_SYSTEM.md         # Operating manual for multi-agent coordination
```
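
One way to honour the "never modified" rule for `tests/fixtures/` is to hand each test a throwaway copy. The helper below is a minimal sketch using only the standard library; the `copy_fixture` name and the `task.json` file are assumptions, not part of the repository.

```python
import shutil
import tempfile
from pathlib import Path


def copy_fixture(fixture_dir: Path, name: str, dest_dir: Path) -> Path:
    """Copy one canonical fixture file into dest_dir and return the copy's path."""
    source = fixture_dir / name
    target = dest_dir / name
    shutil.copy2(source, target)
    return target


# Demonstration with temporary directories standing in for tests/fixtures/.
with tempfile.TemporaryDirectory() as tmp:
    fixtures = Path(tmp) / "fixtures"
    scratch = Path(tmp) / "scratch"
    fixtures.mkdir()
    scratch.mkdir()
    (fixtures / "task.json").write_text('{"id": 1}')  # hypothetical fixture

    working = copy_fixture(fixtures, "task.json", scratch)
    working.write_text('{"id": 2}')  # the test mutates only its copy

    original_text = (fixtures / "task.json").read_text()
    copy_text = working.read_text()
```

In a pytest suite, the built-in `tmp_path` fixture could serve as `dest_dir`.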

---

## Development Setup

**Requires Python 3.11+.** The codebase uses `datetime.UTC` and other 3.11 stdlib additions throughout. Running on an older Python will fail at import time.
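
The version requirement can be checked before anything else is imported. The guard below is a small sketch, not code from the package; `MIN_PYTHON` and `python_is_supported` are illustrative names. It avoids newer syntax so that it still runs on the older interpreters it is meant to detect.

```python
import sys

MIN_PYTHON = (3, 11)  # matches Requires-Python: >=3.11


def python_is_supported(version=None):
    """Return True when the given (major, minor, ...) tuple is at least MIN_PYTHON."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


if python_is_supported():
    from datetime import UTC, datetime  # datetime.UTC was added in Python 3.11
    print(datetime.now(UTC).isoformat())
else:
    print("Python 3.11+ required, found %d.%d" % sys.version_info[:2])
```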

### Mac (first time)

```bash
brew install python@3.11          # install if not present
python3.11 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pip install fastapi "uvicorn[standard]" httpx
pytest --co -q                    # verify test discovery
```

### Windows / home PC

```powershell
# Install Python 3.11+ from python.org, or run: winget install Python.Python.3.11
python -m venv .venv
.venv\Scripts\activate
pip install -e ".[dev]"
pip install fastapi "uvicorn[standard]" httpx
pytest --co -q
```

### Cross-machine notes

- `.python-version` (repo root) is set to `3.11` — pyenv and most tooling will respect it.
- `pyproject.toml` enforces `requires-python = ">=3.11"` — `pip install` will fail loudly on older Pythons rather than silently producing broken installs.
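- The `.[dev]` extras install pytest, pytest-timeout, pytest-cov, and pytest-xdist. FastAPI, uvicorn, and httpx are declared as core runtime dependencies in the package metadata rather than in the `[dev]` extras (pending S25 dep review), so `pip install -e ".[dev]"` already pulls them in; the separate `pip install` line in the setup steps adds only the `uvicorn[standard]` extras.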
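
After setup on a new machine, a quick check that the web-layer packages are importable can catch a half-finished install. This is a sketch using only the standard library; `WEB_LAYER_MODULES` and `missing_modules` are illustrative names.

```python
from importlib.util import find_spec

# Top-level import names of the web-layer dependencies named above.
WEB_LAYER_MODULES = ("fastapi", "uvicorn", "httpx")


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(WEB_LAYER_MODULES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("Web-layer dependencies are importable")
```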

### Run the test suite

```bash
pytest                             # all tests
pytest -m sprint25_gate -v        # current sprint gate only
pytest --co -q                    # discovery check (no execution)
```
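
Selecting with `-m sprint25_gate` relies on a custom marker, and pytest warns about markers it does not know. How this project registers its markers is not shown here; one common approach, sketched below as an assumption, is a `pytest_configure` hook in `conftest.py`. The marker description text is made up.

```python
# conftest.py (sketch; the project's actual marker registration may differ)

GATE_MARKERS = {
    "sprint25_gate": "tests that must pass to close the current sprint",
}


def pytest_configure(config):
    """Register custom markers so `pytest -m <marker>` runs without unknown-mark warnings."""
    for name, description in GATE_MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

A test then opts in to the gate with the `@pytest.mark.sprint25_gate` decorator.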

---

## Sprint Plan

| Sprint | Focus | Gate |
|--------|-------|------|
| 1 | Foundation: schemas, briefing, routing, SENTINEL | 37 tests green |
| 2 | Integration: Ollama worker, circuit breaker, pool queue, quota monitor | GP-001/002/003 pass |
| 3A | Optimisation: real confidence scoring, shadow testing | Shadow tester live |
| 3B | Report generator + MCP bridge | HTML output from real data |
| 4 | Hardening: security audit, 100-task baseline | ≥74% token savings |
| 5 | Org layer: CEO orchestrator, Org-SENTINEL | GP-004 passes |
| 6 | Org hardening + optional live viewer | 200-task baseline ≥74% |

See `docs/tracking/sprint.md` for current status.
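
The "≥74% token savings" gates imply a comparison against a baseline. The plan does not define the metric, so the sketch below assumes one plausible reading: the fraction of baseline tokens that routing avoids. The function name and the token totals are made up for illustration.

```python
def token_savings(baseline_tokens: int, routed_tokens: int) -> float:
    """Fraction of baseline tokens saved; assumed definition, not confirmed by the project."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1.0 - routed_tokens / baseline_tokens


GATE = 0.74  # threshold from the sprint plan

# Made-up totals for illustration only.
savings = token_savings(baseline_tokens=1_000_000, routed_tokens=250_000)
gate_passed = savings >= GATE
print(f"{savings:.0%} saved, gate passed: {gate_passed}")
```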

---

## Contributing

Read `CLAUDE.md` and `AGENT_SYSTEM.md` before contributing.
Branch from `develop` and open pull requests against `develop`. The Managing Director reviews all merges.
