Metadata-Version: 2.4
Name: harnessforge
Version: 0.2.1
Summary: Universal harness layer for AI coding agents — one command sets up your repo for any agent (Claude Code, Cursor, Codex, Gemini CLI, Aider, OpenHarness).
Project-URL: Homepage, https://github.com/jcaiagent7143-ui/harnessforge
Project-URL: Documentation, https://jcaiagent7143-ui.github.io/harnessforge
Project-URL: Repository, https://github.com/jcaiagent7143-ui/harnessforge
Project-URL: Issues, https://github.com/jcaiagent7143-ui/harnessforge/issues
Project-URL: Changelog, https://github.com/jcaiagent7143-ui/harnessforge/blob/main/CHANGELOG.md
Author: The harnessforge Contributors
License: MIT
License-File: LICENSE
Keywords: agent,agent-blueprint,agents-md,ai,aider,claude,claude-code,codex,cursor,gemini-cli,harness,llm,mcp,openai,openharness,rag,skills
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Requires-Dist: anyio>=4.3
Requires-Dist: httpx>=0.27
Requires-Dist: jinja2>=3.1
Requires-Dist: pydantic>=2.6
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.7
Requires-Dist: typer>=0.12
Provides-Extra: all
Requires-Dist: anthropic>=0.40; extra == 'all'
Requires-Dist: fastapi>=0.110; extra == 'all'
Requires-Dist: google-genai>=0.3; extra == 'all'
Requires-Dist: litellm>=1.40; extra == 'all'
Requires-Dist: mcp>=1.0; extra == 'all'
Requires-Dist: numpy>=1.26; extra == 'all'
Requires-Dist: ollama>=0.3; extra == 'all'
Requires-Dist: openai>=1.40; extra == 'all'
Requires-Dist: textual>=0.60; extra == 'all'
Requires-Dist: uvicorn[standard]>=0.29; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.40; extra == 'anthropic'
Provides-Extra: dev
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: respx>=0.21; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Requires-Dist: types-pyyaml>=6.0; extra == 'dev'
Requires-Dist: vcrpy>=6.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.5; extra == 'docs'
Requires-Dist: mkdocs>=1.6; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.25; extra == 'docs'
Provides-Extra: embeddings
Requires-Dist: numpy>=1.26; extra == 'embeddings'
Provides-Extra: gemini
Requires-Dist: google-genai>=0.3; extra == 'gemini'
Provides-Extra: litellm
Requires-Dist: litellm>=1.40; extra == 'litellm'
Provides-Extra: mcp
Requires-Dist: mcp>=1.0; extra == 'mcp'
Provides-Extra: ollama
Requires-Dist: ollama>=0.3; extra == 'ollama'
Provides-Extra: openai
Requires-Dist: openai>=1.40; extra == 'openai'
Provides-Extra: proxy
Requires-Dist: fastapi>=0.110; extra == 'proxy'
Requires-Dist: uvicorn[standard]>=0.29; extra == 'proxy'
Provides-Extra: tui
Requires-Dist: textual>=0.60; extra == 'tui'
Provides-Extra: web
Requires-Dist: fastapi>=0.110; extra == 'web'
Requires-Dist: uvicorn[standard]>=0.29; extra == 'web'
Description-Content-Type: text/markdown

<div align="center">

# harnessforge

### One command turns any repo into a project where Claude Code, Cursor, Codex, Gemini CLI, Aider, and any other coding agent show up already knowing the codebase.

```bash
uvx harnessforge init
```

[![PyPI](https://img.shields.io/pypi/v/harnessforge?label=pypi)](https://pypi.org/project/harnessforge/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![CI](https://github.com/jcaiagent7143-ui/harnessforge/actions/workflows/test.yml/badge.svg)](https://github.com/jcaiagent7143-ui/harnessforge/actions/workflows/test.yml)
[![Tests](https://img.shields.io/badge/tests-186%20passing-brightgreen)](https://github.com/jcaiagent7143-ui/harnessforge/actions/workflows/test.yml)

</div>

---

## What you get

Run `harness init` in your repo. ~3 seconds later, you have:

```
your-repo/
├── AGENTS.md           ← every coding agent reads this (OpenAI Codex CLI convention)
├── SOUL.md             ← personality / tone for this project
├── TOOLS.md            ← which tools / MCPs to use
├── MEMORY.md           ← memory schemas
├── SKILLS/             ← anthropics/skills-compatible procedures
│   ├── chunk-and-embed/SKILL.md
│   ├── retrieve-and-rerank/SKILL.md
│   └── …
├── .claude/CLAUDE.md       ← Claude Code reads this automatically
├── .cursor/rules           ← Cursor reads this automatically
├── .continue/config.json   ← Continue reads this automatically
├── .windsurf/rules         ← Windsurf reads this automatically
├── harness.config.json     ← what blueprint this repo is bound to
└── .harness/
    ├── profile.yaml        ← the canonical machine-readable description
    ├── manifest.json       ← sha256 of every file for safe re-runs
    └── memory_schemas/     ← JSON Schemas the blueprint expects
```

These aren't placeholder stubs. Here's the first 25 lines of an `AGENTS.md` `harness init --no-llm` produced for a tiny stock-analysis repo:

```markdown
# AGENTS.md

> _Generated by harnessforge v0.2.1 · blueprint `finance-agent` v1.0.0._

You are a **finance / market-data analyst agent** working in **portfoliowatch**.

This project is **read-only by default.** You fetch market data, compute
signals, surface insights. You **never** place orders, move money, or
modify positions without an **explicit per-action human-approval gate**
that the user typed "yes" through in this session.

---

## The analyst loop

```
fetch → compute → screen → flag
```

- `SKILLS/fetch-market-data` — get prices/quotes/fundamentals; respect rate limits.
- `SKILLS/compute-technicals` — RSI, SMA, MACD, etc. Vectorized; tested against canonical references.
- `SKILLS/screen-positions` — filter the universe by your declared criteria.
- `SKILLS/flag-attention` — surface what changed and why — calibrated, not alarmist.
```

A Claude Code session opened in that repo reads it and knows the loop, the
safety contract, and which skills to invoke — without you typing any of it
into the chat.

---

## Install

```bash
# No install, run once:
uvx harnessforge init

# Or install globally:
pipx install harnessforge
harness init

# Or pip into a venv:
pip install harnessforge
```

`--no-llm` makes init fully deterministic — no API key, ~2 seconds:

```bash
uvx harnessforge init --no-llm
```

Optional extras add an LLM-based refinement step (pulls dependency only if
you want it):

```bash
pip install "harnessforge[anthropic]"   # use Claude to refine the profile
pip install "harnessforge[openai]"      # use GPT
pip install "harnessforge[mcp]"         # expose harness itself as an MCP server
```

---

## Why this exists

Every time a developer opens a new repo in Claude Code (or Cursor, Codex,
Gemini CLI, Aider), they re-explain the project: what kind of code is
this, what conventions, what's forbidden, what does done look like.
Different IDE, same boilerplate. And every file is project-specific —
you can't just copy yesterday's `CLAUDE.md`.

The fix is small: a deterministic walker that inspects the repo, picks a
sensible **agent blueprint** based on the deps + structure, and emits the
ground-truth files every major coding agent reads on startup. Run it once
per repo, commit the output, every agent you use shows up smarter.

That's what `harness init` is. The thing it generates is the harness; the
CLI is a 60-second way to author one.

---

## Five blueprints in 0.2.x

Pick with `--blueprint`, or let the recommender choose based on inspection
(yfinance deps → `finance-agent`; langchain/qdrant → `rag-agent`; airflow
→ `workflow-agent`; generic Python → `python-cli-app`).

| Blueprint | For | Skills it ships |
|---|---|---|
| **`python-cli-app`** | Build a Python CLI / library / web API — the default for greenfield Python work | `add-cli-command`, `add-unit-test`, `manage-dependency`, `check-style` |
| **`finance-agent`** | Market data + portfolio analysis | `fetch-market-data`, `compute-technicals`, `screen-positions`, `flag-attention` + `no_trades_without_gate` validator that fails if generated code calls a broker function without an approval check |
| **`rag-agent`** | Retrieval-augmented Q&A with citation enforcement | `chunk-and-embed`, `retrieve-and-rerank`, `answer-with-citations`, `eval-recall-precision` + citation cross-checker |
| **`support-agent`** | Customer support: intent → KB → ticket → escalate | `classify-intent`, `retrieve-kb-answer`, `file-ticket`, `escalate-if-unresolved` + ticket-lineage validator |
| **`workflow-agent`** | Multi-step orchestration (Zapier/n8n-style) | `decompose-task`, `call-tool-with-retry`, `check-result` + tool-log + idempotency validators |

Beyond the catalog you can author project-specific skills:

```bash
harness skills add fetch-portfolio-prices \
  --domain --description "Fetch live prices for tickers in positions.json from Polygon."
```

Domain skills land under `SKILLS/domain/<name>/SKILL.md` and surface in
`harness skills list` alongside blueprint-shipped skills.

---

## Validation that's actually a contract, not a vibe

`harness verify` runs blueprint-defined checks and emits a stable JSON
contract — the same contract whether you call the CLI or the MCP tool. Your
coding agent reads the JSON and self-corrects:

```bash
$ harness verify --json
{
  "schema_version": 1,
  "blueprint": "finance-agent",
  "checks": [
    {"name": "structure", "status": "pass", "duration_ms": 1, "messages": []},
    {"name": "tests",     "status": "pass", "duration_ms": 42, "messages": []},
    {"name": "no_trades_without_gate", "status": "pass", "duration_ms": 8, "messages": []}
  ],
  "summary": {"total": 3, "passed": 3, "failed": 0}
}
```

Exit codes: `0` all pass · `1` failures · `2` config error · `3` not a harness-bootstrapped repo. Drop in CI:

```yaml
- run: pip install harnessforge
- run: harness sync --check    # fail if generated files drifted from manifest
- run: harness verify --json   # fail if blueprint contract broken
```

The `no_trades_without_gate` validator on `finance-agent` is the spicy one:
it static-scans the repo for `place_order(`, `buy(`, `sell(`, etc., and
fails the build if any of them sit behind only a config flag instead of a
runtime user-approval gate. See `docs/concepts/trust-model.md` for the full
trust boundary (`profile.test_command` and `lint_command` are executed as
shell — same model as `make test` / `npm test`).

---

## Commands

```
harness init [PATH]                       # inspect → profile → blueprint → render
harness sync [PATH] [--check]             # re-render adapters; --check = drift detect for CI
harness inspect [PATH] [--json|yaml]      # deterministic InspectionReport
harness verify [TARGET] [--json] [--tests|--lint]
harness blueprint {list, show, apply}
harness skills {list, show, add [--domain]}
harness doctor                            # diagnose env: provider keys, extras, repo health
harness mcp                               # run stdio MCP server (5 typed tools)
harness version
```

---

## Use with your existing coding agent

| Agent | What it reads | Setup |
|---|---|---|
| Claude Code | `.claude/CLAUDE.md` | nothing — automatic |
| Cursor | `.cursor/rules` | nothing |
| Codex CLI | `AGENTS.md` | nothing |
| Continue | `.continue/config.json` | nothing |
| Windsurf | `.windsurf/rules` | nothing |
| Gemini CLI / Aider | `AGENTS.md` | nothing |
| Anything that speaks MCP | `harness mcp` exposes 5 typed tools | add to client config |

For the MCP path:

```json
{
  "mcpServers": {
    "harness": {
      "command": "uvx",
      "args": ["--from", "harnessforge[mcp]", "harness", "mcp"]
    }
  }
}
```

Exposes `harness_inspect`, `harness_blueprint_list`, `harness_skills_list`,
`harness_verify`, `harness_profile_read` as typed tools your agent can call.

---

## Hero demos (reproducible from this repo)

Three end-to-end demos against real public repos, pinned to specific SHAs:

| Demo | Repo | Blueprint | Run |
|---|---|---|---|
| FastAPI + RAG | [fastapi/full-stack-fastapi-template](https://github.com/fastapi/full-stack-fastapi-template) | `rag-agent` | [`examples/hero/fastapi_rag/run.sh`](examples/hero/fastapi_rag/run.sh) |
| Zulip + Support | [zulip/zulip](https://github.com/zulip/zulip) | `support-agent` | [`examples/hero/zulip_support/run.sh`](examples/hero/zulip_support/run.sh) |
| Airflow + Workflow | [apache/airflow](https://github.com/apache/airflow) | `workflow-agent` | [`examples/hero/airflow_workflow/run.sh`](examples/hero/airflow_workflow/run.sh) |

Each `run.sh` shallow-clones the upstream repo at the pinned SHA, runs
`harness init --no-llm`, and runs `harness verify` to confirm everything passes. CI runs all three on every push.

---

## How this compares to alternatives

The "agent infrastructure" space has runtimes (Hermes, OpenClaw,
OpenHarness), SDKs (OpenAI Agents SDK, Mastra), and now provisioners
(harnessforge). The honest version of the comparison — including when
*not* to use harnessforge — lives in [`docs/concepts/vs-hermes-openclaw-openharness.md`](docs/concepts/vs-hermes-openclaw-openharness.md).

The shortest version: **if you write your own `CLAUDE.md` for every repo
you start, harnessforge replaces that file with one that's actually
project-specific, plus everything the other coding agents you use need
to read.** That's the comparison most readers are actually making.

---

## Quality bar

- **186 tests** across unit / golden-file / interop / integration tiers, all green on Python 3.11 / 3.12 / 3.13
- **83% line coverage** on `src/harness/`
- **mypy strict** + **ruff** clean
- **mkdocs --strict** builds clean
- **Fresh-venv install verified** — `pip install harnessforge && harness version` works on a clean machine (the v0.2.1 release was held until this passed)
- **Two rounds of real-agent A/B evaluation** — Claude Code building the same stock-analysis agent WITH vs. WITHOUT the harness; the diff drove the v0.2 + v0.2.1 designs. See [`CHANGELOG.md`](CHANGELOG.md) for the per-fix-per-eval breakdown.

---

## Status

`harnessforge` 0.2.1 — first public release. Following [Semantic Versioning](https://semver.org).

Feedback welcome via issues — see [CONTRIBUTING.md](CONTRIBUTING.md) and
[SECURITY.md](SECURITY.md). The 5 shipped blueprints are intentionally
opinionated; PRs proposing a 6th are encouraged. Sales / browser blueprints
land in 0.3 alongside auth-bearing MCP catalog entries.

## License

MIT — see [LICENSE](LICENSE).
