Metadata-Version: 2.4
Name: aegisloop
Version: 0.1.0
Summary: Aegisloop — a model-agnostic, trustworthy autonomous LLM agent harness (run, orchestrate, verify, deploy)
Author: ai-harness
License: MIT
Project-URL: Homepage, https://github.com/ai-harness/harness
Project-URL: Documentation, https://github.com/ai-harness/harness/blob/main/docs/ARCHITECTURE.md
Keywords: llm,agent,autonomous,orchestration,verification,openai,anthropic,gemini
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: pydantic>=2
Requires-Dist: httpx>=0.27
Requires-Dist: mcp>=1.0
Requires-Dist: numpy>=1.24
Provides-Extra: api
Requires-Dist: fastapi>=0.110; extra == "api"
Requires-Dist: uvicorn>=0.27; extra == "api"
Provides-Extra: postgres
Requires-Dist: pg8000>=1.30; extra == "postgres"
Provides-Extra: embeddings
Requires-Dist: fastembed>=0.3; extra == "embeddings"
Provides-Extra: otel
Requires-Dist: opentelemetry-sdk>=1.20; extra == "otel"
Requires-Dist: opentelemetry-exporter-otlp-proto-http>=1.20; extra == "otel"
Provides-Extra: all
Requires-Dist: fastapi>=0.110; extra == "all"
Requires-Dist: uvicorn>=0.27; extra == "all"
Requires-Dist: pg8000>=1.30; extra == "all"
Requires-Dist: fastembed>=0.3; extra == "all"
Requires-Dist: opentelemetry-sdk>=1.20; extra == "all"
Requires-Dist: opentelemetry-exporter-otlp-proto-http>=1.20; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: ruff>=0.6; extra == "dev"
Requires-Dist: fastapi>=0.110; extra == "dev"
Requires-Dist: uvicorn>=0.27; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"

# Ultimate Agent Harness

A **model-agnostic** harness that lets any LLM (GPT, Claude, Gemini, local) operate
**autonomously** with **grounded, verifiable** output and production-grade guardrails.
Ships with an **interactive coding agent** (`harness chat`) — a Claude-Code/Copilot-style
REPL with real file & shell tools — **and** a **hosted HTTP API + web dashboard**
(`harness serve`) so the same engine deploys as a service. Built on the harness's safety layers.

> ⚠️ Honest framing: hallucination can't be *eliminated* with today's models. This harness
> **minimizes and detects** it via layered, independent verification — **correctness over cost** —
> and **abstains** ("I don't know") instead of guessing.

---

## What it is

The model is one component. This harness is everything *around* it:

- **Provider gateway** — one interface, swappable models (+ retries, fallback, cost).
- **Context & memory** — prompt assembly, conversation, vector retrieval, compaction.
- **Tools & sandbox** — **MCP**-based tools; untrusted code in WASM/microVM isolation.
- **Agent loop** — typed state machine: plan → act → observe → reflect → verify → stop.
- **Verification** — layered, *always-on*, *independent*: deterministic tools, claim/NLI checks,
  cross-model judges, semantic-entropy uncertainty → verify-or-abstain.
- **Zero-trust security** — assume the model is compromised: dual-LLM split, taint tracking,
  egress lockdown (breaks the "lethal trifecta"), capability policy + human approval.
- **Evaluation** — datasets + scorers + a CI regression gate.
- **Observability** — OTel GenAI tracing → Langfuse; durable execution (crash-resumable), replay.

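The agent loop's typed state machine can be pictured as an enum plus a transition table. This is a minimal sketch rather than the harness's actual implementation: the names `Phase`, `TRANSITIONS`, `advance`, and `run_path` are hypothetical, and the re-plan edges out of `reflect` and `verify` are assumptions, since the list above only gives the happy path.

```python
from enum import Enum


class Phase(str, Enum):
    PLAN = "plan"
    ACT = "act"
    OBSERVE = "observe"
    REFLECT = "reflect"
    VERIFY = "verify"
    STOP = "stop"


# Allowed transitions. The back-edges to PLAN are assumptions for illustration.
TRANSITIONS: dict[Phase, frozenset[Phase]] = {
    Phase.PLAN: frozenset({Phase.ACT}),
    Phase.ACT: frozenset({Phase.OBSERVE}),
    Phase.OBSERVE: frozenset({Phase.REFLECT}),
    Phase.REFLECT: frozenset({Phase.PLAN, Phase.VERIFY}),  # re-plan or verify
    Phase.VERIFY: frozenset({Phase.PLAN, Phase.STOP}),  # failed check -> re-plan
    Phase.STOP: frozenset(),  # terminal state
}


def advance(current: Phase, target: Phase) -> Phase:
    """Return target if the move is legal, otherwise raise (fail closed)."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


def run_path(path: list[Phase]) -> Phase:
    """Walk a sequence of phases starting from PLAN and return the final phase."""
    state = Phase.PLAN
    for target in path:
        state = advance(state, target)
    return state
```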
---

## Documentation

- 📐 **[docs/ARCHITECTURE.md](docs/ARCHITECTURE.md)** — the 8 layers, interfaces,
  anti-hallucination strategy, tech stack, repo structure.
- 🗺️ **[docs/ROADMAP.md](docs/ROADMAP.md)** — phased build plan with quality gates.

> 🔎 **Prior art:** design influenced by an analysis of
> [safe-agentic-workflow](https://github.com/bybren-llc/safe-agentic-workflow). Adopted
> patterns (progressive disclosure, independence gates, skill packs, manifest-sync) are
> documented in ARCHITECTURE §15.
>
> 🧪 **Frontier (2024–2026):** MCP, dual-LLM/CaMeL zero-trust, semantic-entropy verification,
> durable execution, context engineering — with sources in ARCHITECTURE §16.

---

## Design principles

1. Provider-agnostic core · 2. Everything typed (pydantic) · 3. Ground, then generate ·
4. Verify before commit · 5. Fail closed · 6. Everything observable ·
7. Deterministic where possible · 8. Budget-bounded · 9. Process as service ·
10. Independent verification (the checker is never the author) ·
11. Assume compromise (zero-trust).

> **Correctness over cost** — verification is layered and never skipped to save tokens.

---

## Status

**Working prototype** (`src/harness/`) — a real, live-model-integrated, tested system
(**~6,220 LOC src+tests, 230 tests passing, ruff-clean**), hardened through repeated
**GPT-5.5 adversarial review of the entire codebase** (80+ findings fixed, then re-challenged
to convergence — provider fault-handling, exactly-once keys, verifier quorum, zero-trust egress,
workspace path-confinement, bounded process control, circuit-breaker concurrency, attributed-verify
soundness, injection ReDoS).

The **trust moat** (what makes it differentiated — answers with verifiable evidence, resists
manipulation, and provably abstains): **claim-level attributed verification**, **injection ASR
driven to 0%**, and **selective prediction with risk control**.

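Selective prediction with risk control, as named above, amounts to choosing the lowest confidence threshold whose accepted answers stay within a target error rate, and abstaining otherwise. The sketch below is a purely empirical illustration, not the harness's controller: `calibrate_threshold` and `decide` are hypothetical names, the labeled `(confidence, correct)` pairs are assumed inputs, and no finite-sample correction is applied.

```python
from typing import Optional, Sequence, Tuple


def calibrate_threshold(
    samples: Sequence[Tuple[float, bool]], target_error: float
) -> Optional[float]:
    """Return the lowest threshold whose accepted set (confidence >= threshold)
    has an empirical error rate <= target_error, or None if none qualifies."""
    ranked = sorted(samples, key=lambda s: s[0], reverse=True)
    best: Optional[float] = None
    errors = 0
    for i, (confidence, correct) in enumerate(ranked, start=1):
        errors += 0 if correct else 1
        # Only evaluate once every sample tied at this confidence is counted.
        last_of_tie = i == len(ranked) or ranked[i][0] != confidence
        if last_of_tie and errors / i <= target_error:
            best = confidence  # lower threshold means higher coverage
    return best


def decide(confidence: float, threshold: Optional[float]) -> str:
    """Fail closed: abstain when uncalibrated or below the threshold."""
    if threshold is None or confidence < threshold:
        return "abstain"
    return "accept"
```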
| Area | Status |
|---|---|
| **Claim-level attributed verification** (flagship anti-hallucination): decompose an answer into atomic claims, verify EACH (deterministic / evidence+citation / cross-model consensus), strip unsupported claims, **fail-closed accept-or-abstain** | ✅ live: caught a hallucinated false claim & abstained |
| **Injection-hardened zero-trust**: dual-LLM quarantine + **nonce spotlighting** + injection detector + **fail-closed neutralization**; expanded **27-attack** red-team | ✅ **ASR 11% → 0%** (live, A/B-measured) |
| **Selective prediction with risk control**: calibrate a confidence threshold to a **target error rate** at max coverage; fail-closed until calibrated | ✅ risk-coverage controller |
| **Interactive coding CLI** (`harness chat`): read/write/edit/list/glob/grep/run_command on the real workspace, **path-confined** + **approval-gated**, live tool events, model switching, session save/resume | ✅ live: creates & edits files, runs commands |
| **Hosted API + web dashboard** (`harness serve`): run/orchestrate/verify/enqueue/stream over HTTP, live ops health, kill-switch, trust scorecard; **pip-installable** (`harness`/`harness-serve`), optional Bearer auth | ✅ live end-to-end vs real models (run → 465, SSE tokens, 9/9) |
| **Resilience layer** (all model calls): retry + backoff + jitter, per-model **circuit breaker** (single-trial half-open), **rate limiter** (token bucket) | ✅ wired into the gateway |
| **In-loop cost enforcement** + **stuck-loop detection**: hard per-run USD budget (priced by the model that answered); repeated no-progress tool calls trip `STUCK` | ✅ deploy-without-babysitting rails |
| **Planning**: LLM→DAG plan, **parallel fan-out**, **per-step retry**, blocked-dependent propagation, optional per-step verifier | ✅ wave-based concurrent executor |
| **Ops**: per-tenant cost governor + **SLO/drift monitor** + throughput benchmark | ✅ |
| Provider gateway (per-model **completion vs response** routing + fallback + capability matrix + **SSE streaming**) | ✅ live: claude-opus-4.8, gemini-3.1-pro-preview, gpt-5.5 |
| Agent loop (typed state machine + native tool-calling + **forced tool-use** + tracing + **resume**) | ✅ |
| Tools + **hardened sandbox** (Docker `--network none`, runs as `nobody`, **kills container on timeout**) | ✅ writes & runs Python; blocks net; no orphans |
| **MCP transport** (FastMCP server ↔ stdio client) | ✅ |
| Verification: ensemble + **NLI clustering** + **CoVe** + **best-of-N** + deterministic + verify-or-abstain | ✅ **calibrated: 0% false-accept on 16 labeled cases** |
| **RAG**: **neural hybrid** (BM25 + fastembed/ONNX vectors, RRF fusion) + grounded answers w/ citations | ✅ semantic + lexical |
| **Durable execution**: SQLite + **Postgres**, resume, + **exactly-once** tool side effects (idempotency cache) | ✅ side effect runs once across re-executions |
| **Multi-agent**: orchestrator-worker (taint-aware) + **evaluator-optimizer** (generate→critique→refine) | ✅ |
| **Factory mode** (headless durable batch; **separate-process workers**) | ✅ |
| **Observability**: OpenTelemetry GenAI spans → **Jaeger** backend (traces queryable via API) | ✅ |
| Eval + **calibration** + CLI + hardening (ruff-clean, console scripts) | ✅ |

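The RAG row above fuses BM25 and vector rankings with RRF (reciprocal rank fusion). As a rough illustration of that technique, each document scores the sum of `1 / (k + rank)` over the rankings it appears in. The function name `rrf_fuse` is hypothetical, and `k = 60` is the constant commonly used for RRF; whether the harness uses the same value is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids into a single ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]    # lexical ranking (made-up ids)
vector_hits = ["b", "d", "a"]  # semantic ranking (made-up ids)
fused = rrf_fuse([bm25_hits, vector_hits])
```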
What honestly remains (the next frontier, per a 3-model review by Opus-4.8 / GPT-5.5 / Gemini-3.1-Pro):
**async human-in-the-loop** (pause durable state, escalate, resume on reply), **governance/versioning**
(prompt/tool/model version pins, eval gates, canary, rollback, audit), **DLP** (PII/secret scrubbing
on the egress path), and true **distributed multi-process** scale-out + durable *plan* resume. Also
environment-gated: **microVM** isolation (Linux), a **dedicated NLI model** (torch), **LATS**
tree-search, and large **external benchmarks** (AgentDojo/SWE-bench).

## Quickstart

```bash
# Windows path shown; on macOS/Linux use .venv/bin/pip instead of .venv\Scripts\pip
python -m venv .venv && .venv\Scripts\pip install -e ".[dev]"
# point at your local model proxy (default http://localhost:5000)
python -m harness chat                                            # ⭐ interactive coding agent (Claude-Code-style)
python -m harness run --code "Compute the 20th Fibonacci number"  # one-shot agent (sandboxed code)
python -m harness verify-full "Capital of Australia?"             # ensemble + NLI + CoVe + best-of-N
python -m harness verify-claims "Tell me about the Eiffel Tower"  # claim-level attributed verification
python -m harness multi "Capitals of France, Japan, Egypt?"       # multi-agent orchestrator
python -m harness redteam                                         # prompt-injection benchmark (ASR 0%)
python -m harness factory "6*7" "capital of Italy" "100-1"        # headless durable batch
python -m harness eval                                            # trap-set scorecard (7/7)
python -m harness serve                                          # ⭐ HTTP API + web dashboard (http://127.0.0.1:8080)
python -m pytest -q                                               # run the test suite
```

In the `chat` REPL: `/help`, `/model <name>`, `/tools`, `/auto`, `/save`, `/resume`, `/cwd`,
`/clear`, `/exit`. Mutating/exec actions ask `[y/N/a]` (use `--auto-approve` or `/auto on` to skip).

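The `[y/N/a]` prompt can be read as a small fail-closed gate. The sketch below is illustrative only: `ApprovalGate` is a hypothetical name, and treating `a` as "approve this and every later action" is an assumption about the prompt's meaning, not documented behavior.

```python
from dataclasses import dataclass


@dataclass
class ApprovalGate:
    # Mirrors the effect of --auto-approve or `/auto on` (assumed equivalent).
    auto_approve: bool = False

    def decide(self, answer: str) -> bool:
        """Return True if the action may run; anything but y/a is a refusal."""
        if self.auto_approve:
            return True
        choice = answer.strip().lower()
        if choice == "a":
            self.auto_approve = True  # assumed meaning: approve all from now on
            return True
        return choice == "y"  # empty input falls through to the default, N
```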
## Deploy: install, API & dashboard

Install as a package (the `harness` / `harness-serve` commands land on your PATH):

```bash
python -m build                       # -> dist/aegisloop-0.1.0-py3-none-any.whl (+ sdist)
pip install "dist/aegisloop-0.1.0-py3-none-any.whl[api]"   # extras: api, postgres, embeddings, otel, all
harness serve --host 0.0.0.0 --port 8080                 # or: harness-serve
```

`harness serve` exposes the whole harness over HTTP **plus a single-page web console** at `/`:

| Endpoint | Purpose |
|---|---|
| `GET /` | Dashboard — ask/run/orchestrate/verify/stream, fleet queue, live health, kill-switch, scorecard |
| `POST /api/run` · `/api/orchestrate` · `/api/verify` | Autonomous run · multi-agent decompose · cross-model verify |
| `POST /api/enqueue` · `GET /api/runs` · `/api/queue` | Durable work queue + fleet (drain with `harness worker`) |
| `GET /api/stream` (SSE) | Live token-by-token streaming |
| `GET /api/health` · `POST /api/control/{stop,resume}` | Ops health + the persistent, fail-closed kill-switch |
| `GET /api/frontier` | The trust scorecard (9/9) |

Multi-user auth is off by default; set `HARNESS_API_KEY` to require `Authorization: Bearer <key>`
on every `/api/*` call (the dashboard has a key field).

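A client might attach the key as follows. This sketch uses only the standard library and the `GET /api/health` endpoint from the table above; the helper names `build_health_request` and `fetch_health` are hypothetical, and the shape of the response body is not specified here, so it is returned as raw text.

```python
import urllib.request
from typing import Optional


def build_health_request(
    base_url: str, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a GET /api/health request."""
    headers = {}
    if api_key:
        # Only needed when the server was started with HARNESS_API_KEY set.
        headers["Authorization"] = f"Bearer {api_key}"
    url = f"{base_url.rstrip('/')}/api/health"
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_health(base_url: str, api_key: Optional[str] = None) -> str:
    """Send the request and return the raw response body as text."""
    request = build_health_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

With a server running locally, `fetch_health("http://127.0.0.1:8080", "<key>")` would perform the call.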
## Target stack

Python 3.11+ · pydantic v2 · httpx/asyncio · **MCP** · instructor/Outlines · chromadb/qdrant ·
**DBOS** · Langfuse/OTel · pytest. See ARCHITECTURE §9 for the full table.
