Metadata-Version: 2.4
Name: autoctx
Version: 0.2.1
Summary: autocontext control plane for iterative strategy evolution.
Project-URL: Homepage, https://github.com/greyhaven-ai/autocontext
Project-URL: Repository, https://github.com/greyhaven-ai/autocontext
Project-URL: Issues, https://github.com/greyhaven-ai/autocontext/issues
Project-URL: Documentation, https://github.com/greyhaven-ai/autocontext/tree/main/autocontext/docs
License: Apache-2.0
Keywords: agents,autocontext,evaluation,harness,llm,optimization
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Requires-Dist: anthropic>=0.66.0
Requires-Dist: fastapi>=0.116.1
Requires-Dist: httpx>=0.28.1
Requires-Dist: prime-sandboxes>=0.2.14
Requires-Dist: pydantic>=2.11.0
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: rich>=13.9.4
Requires-Dist: typer>=0.16.0
Requires-Dist: uvicorn>=0.35.0
Requires-Dist: websockets>=16.0
Provides-Extra: agent-sdk
Requires-Dist: claude-agent-sdk>=0.1.0; extra == 'agent-sdk'
Provides-Extra: all
Requires-Dist: claude-agent-sdk>=0.1.0; extra == 'all'
Requires-Dist: mcp>=1.0.0; extra == 'all'
Requires-Dist: openai>=1.0.0; extra == 'all'
Requires-Dist: pydantic-monty>=0.0.7; extra == 'all'
Provides-Extra: mcp
Requires-Dist: mcp>=1.0.0; extra == 'mcp'
Provides-Extra: mlx
Requires-Dist: mlx>=0.30.0; extra == 'mlx'
Requires-Dist: rustbpe>=0.1.0; extra == 'mlx'
Requires-Dist: safetensors>=0.4.0; extra == 'mlx'
Requires-Dist: tiktoken>=0.11.0; extra == 'mlx'
Provides-Extra: monty
Requires-Dist: pydantic-monty>=0.0.7; extra == 'monty'
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == 'openai'
Description-Content-Type: text/markdown

# autocontext

autocontext is a control plane for improving agent behavior over repeated runs. It combines multi-agent candidate generation, staged validation, scenario execution, knowledge accumulation, optional local distillation, and OpenClaw-facing APIs.

## Working Directory

Run the commands in this README from the `autocontext/` directory. The Python package, CLI entrypoint, tests, migrations, and dashboard assets all live here.

## What It Does

- Runs iterative generation loops against game scenarios and agent-task scenarios
- Persists playbooks, hints, tools, reports, and snapshots across runs
- Supports staged validation, harness synthesis, and harness-aware routing
- Exports training data and runs autoresearch-style local training loops
- Exposes evaluation, validation, artifact, and discovery operations over MCP and HTTP

## Quick Start

From the repo root:

```bash
cd autocontext
uv venv
source .venv/bin/activate
uv sync --group dev
```

Use the repo-level `.env.example` as the reference for available `AUTOCONTEXT_*` settings.

`operator-in-the-loop` remains a typed scenario family for capability discovery and experimentation, but autocontext does not scaffold executable operator-loop runtimes. Use datasets, tools, or live-agent experiments instead of harness-owned escalation scripts.

Run a deterministic local scenario:

```bash
AUTOCONTEXT_AGENT_PROVIDER=deterministic \
uv run autoctx run --scenario grid_ctf --gens 3 --run-id quickstart
```

Run with Anthropic:

```bash
AUTOCONTEXT_AGENT_PROVIDER=anthropic \
AUTOCONTEXT_ANTHROPIC_API_KEY=... \
uv run autoctx run --scenario grid_ctf --gens 3
```

Start the API server and dashboard:

```bash
uv run autoctx serve --host 127.0.0.1 --port 8000
```

Open `http://127.0.0.1:8000` after the server starts.

Start the MCP server:

```bash
uv sync --group dev --extra mcp
uv run autoctx mcp-serve
```

## Main CLI Commands

```bash
uv run autoctx run --scenario grid_ctf --gens 3
uv run autoctx list
uv run autoctx status <run_id>
uv run autoctx replay <run_id> --generation 1
uv run autoctx benchmark --scenario grid_ctf --runs 5
uv run autoctx new-scenario --template prompt-optimization --name my-task
uv run autoctx serve --host 127.0.0.1 --port 8000
uv run autoctx mcp-serve
uv run autoctx wait <condition_id> --json
```

Useful variants:

```bash
AUTOCONTEXT_AGENT_PROVIDER=anthropic AUTOCONTEXT_ANTHROPIC_API_KEY=... \
uv run autoctx run --scenario grid_ctf --gens 3

AUTOCONTEXT_AGENT_PROVIDER=deterministic AUTOCONTEXT_RLM_ENABLED=true \
uv run autoctx run --scenario grid_ctf --gens 3
```

## Training Workflow

Export JSONL training data from completed runs:

```bash
uv run autoctx export-training-data \
  --scenario grid_ctf \
  --all-runs \
  --output training/grid_ctf.jsonl
```

Launch the autoresearch-style training loop:

```bash
uv sync --group dev --extra mlx
uv run autoctx train \
  --scenario grid_ctf \
  --data training/grid_ctf.jsonl \
  --time-budget 300
```

MLX training is host-only: it requires an Apple Silicon macOS machine with Metal access and will not run correctly inside a Docker sandbox on macOS.

If you only want to inspect generated training data first, export without training and open the JSONL directly.
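A quick way to sanity-check an export is to count records and look at their top-level keys. This sketch assumes only that the file is JSON Lines; the actual field names depend on the scenario and are not specified here (the sample record below is made up):

```python
import json
from typing import Iterable


def summarize_jsonl(lines: Iterable[str]) -> tuple[int, set[str]]:
    """Count records and collect the union of top-level keys."""
    count, keys = 0, set()
    for line in lines:
        if line.strip():
            record = json.loads(line)
            count += 1
            keys |= record.keys()
    return count, keys


# With a real export: summarize_jsonl(open("training/grid_ctf.jsonl"))
count, keys = summarize_jsonl(['{"prompt": "p", "score": 1.0}'])  # stand-in data
print(count, sorted(keys))  # 1 ['prompt', 'score']
```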

For host setup details and OpenClaw automation via a file-based watcher bridge, see [docs/mlx-training.md](docs/mlx-training.md).

## Configuration

Configuration is loaded from `AUTOCONTEXT_*` environment variables in `src/autocontext/config/settings.py`.

Common settings:

- `AUTOCONTEXT_AGENT_PROVIDER`
- `AUTOCONTEXT_EXECUTOR_MODE`
- `AUTOCONTEXT_MODEL_COMPETITOR`
- `AUTOCONTEXT_MATCHES_PER_GENERATION`
- `AUTOCONTEXT_MAX_RETRIES`
- `AUTOCONTEXT_JUDGE_PROVIDER`
- `AUTOCONTEXT_RLM_ENABLED`
- `AUTOCONTEXT_HARNESS_PREFLIGHT_ENABLED`
- `AUTOCONTEXT_STAGED_VALIDATION_ENABLED`

See the repo-level [.env.example](../.env.example) for a working starting point.
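For a shell session, the settings above can be exported directly. The values below are illustrative only; consult `.env.example` for the authoritative names and defaults:

```bash
# Illustrative values only — see the repo-level .env.example for real defaults.
export AUTOCONTEXT_AGENT_PROVIDER=deterministic
export AUTOCONTEXT_MATCHES_PER_GENERATION=4
export AUTOCONTEXT_MAX_RETRIES=2
export AUTOCONTEXT_RLM_ENABLED=false
```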

## Repository Structure

```text
autocontext/
  src/autocontext/   Python package
  tests/             Pytest suite
  dashboard/         Static dashboard assets
  docs/              Package-specific documentation
  migrations/        SQLite migrations
ts/                  TypeScript package
tui/                 Interactive terminal UI
infra/               Docker, Fly.io, bootstrap scripts
```

## Validation and Development

```bash
uv run ruff check src tests
uv run mypy src
uv run pytest
```

If you change protocol messages, regenerate the derived protocol artifacts from the repo root:

```bash
cd ..
uv run --directory autocontext python scripts/generate_protocol.py
```

## OpenClaw / ClawHub

autocontext exposes:

- artifact contracts for harnesses, policies, and distilled models
- REST and MCP operations for evaluate, validate, publish, import, and discover
- ClawHub skill manifests and scenario discovery metadata
- an adapter layer for running OpenClaw agents inside the harness

## Additional Docs

- [Agent integration guide](docs/agent-integration.md) — CLI-first integration for external agents, MCP fallback, JSON output reference
- [Sandbox modes](docs/sandbox.md)
- [MLX host training](docs/mlx-training.md)
- [Demo data notes](demo_data/README.md)
- [Copy-paste examples](../examples/README.md)
- [Change history](../CHANGELOG.md)
- [Repository overview](../README.md)
