Metadata-Version: 2.4
Name: xeno-harness
Version: 4.1.0
Summary: Multi-agent autonomy harness with multi-provider support, per-role permission tiers, and Agent Harness Engineering observability.
Author-email: Pierre Belon-Savon <belonsavon@gmail.com>
License-Expression: Apache-2.0
License-File: LICENSE
License-File: NOTICE
Requires-Python: >=3.12
Requires-Dist: anthropic<1.0,>=0.40
Requires-Dist: claude-agent-sdk<1.0,>=0.1
Requires-Dist: fastapi<1.0,>=0.110
Requires-Dist: httpx<1.0,>=0.27
Requires-Dist: litellm<1.85,>=1.84
Requires-Dist: mem0ai==0.1.116
Requires-Dist: pgserver<1.0,>=0.1
Requires-Dist: procrastinate<4.0,>=3.8.1
Requires-Dist: prompt-toolkit<4.0,>=3.0
Requires-Dist: psycopg2-binary<3.0,>=2.9
Requires-Dist: psycopg[binary,pool]<4.0,>=3.2
Requires-Dist: pydantic[email]<3.0,>=2.12
Requires-Dist: python-dotenv<2.0,>=1.0
Requires-Dist: pyyaml<7.0,>=6.0
Requires-Dist: rich<14,>=13
Requires-Dist: sqlparse<1.0,>=0.5
Requires-Dist: uvicorn[standard]<1.0,>=0.27
Requires-Dist: yoyo-migrations<9.0,>=8.2
Description-Content-Type: text/markdown

# agent-harness

An autonomy-aware Claude harness with cross-session memory, skills, a
streaming dashboard, and an MCP server. Built across a 7-lab series; see
[`LABS/`](LABS/) and [`DEEP-DIVES/`](DEEP-DIVES/).

## Layout

```
autonomy-lab/        # Canonical harness (sync CLI loop)
  autonomy_harness.py    # Budget, CacheMetrics, @tool registry, memory,
                         # compaction, sub-agents, skills, ~1600 lines
  memory_backend.py      # FilesystemBackend + SqliteBackend
  skill_loader.py        # Two-tier cascade + path-safe Read
  consolidation.py       # End-of-session Haiku consolidation pass
  mcp_server.py          # MCP stdio server exposing the registry
  skills/                # Skill cascade (3 starter skills)
  tests/                 # pytest suite (offline + API-gated)

ai-harness/          # Streaming dashboard (FastAPI + React)
  backend/agent.py       # 190-line streaming wrapper around autonomy_harness
  backend/main.py        # FastAPI app: auth, rate-limit, CORS, /api/*
  frontend/              # Vite + React + Tailwind v4 + shadcn

.github/workflows/   # CI: pytest + tsc + backend-import on every PR
```

## Use it

```bash
# CLI single-shot (full machinery, no streaming)
cd autonomy-lab
uv run python autonomy_harness.py "Your goal here"

# Dashboard with streaming UI
cd ai-harness && ./dev.sh   # backend :8000, frontend :5173

# MCP server (Claude Code, Cursor, ...)
# In your client's MCP config:
#   "autonomy-harness": {
#     "command": "uv",
#     "args": ["run", "--directory",
#              "/abs/path/to/autonomy-lab",
#              "python", "mcp_server.py"]
#   }
```

## Deploy the dashboard

```bash
cd deploy
cp .env.example .env   # set ANTHROPIC_API_KEY and DASHBOARD_TOKEN
docker compose up --build
# http://localhost:8080
```

See [`deploy/README.md`](deploy/README.md) for the three modes:
local laptop (HTTP), Mac mini + Tailscale (HTTP over private mesh), or
public domain (HTTPS via Let's Encrypt). Memory + replay state persist in
`deploy/data/`.

Dev mode (no `DASHBOARD_TOKEN`) auto-disables auth.

## Web tools

`web_search` and `web_fetch` are wired as Anthropic server-side tools (zero
local impl, real-time index, billed per call). On by default; disable with
`WEB_TOOLS_ENABLED=0` before running the CLI or the dashboard backend.

## Tests

```bash
cd autonomy-lab && uv run pytest -v
```

CI runs the offline subset (38 tests, ~2s) on every PR. API-using tests
(`test_skill_triggers`, `test_context_rot`, `test_claude_picks_right_tool`)
self-skip without `ANTHROPIC_API_KEY`.
