# pcq Agent Guide

pcq is an agent-operable evidence and control library for machine learning
experiments. It is not a training framework, a model zoo, a framework adapter
matrix, or an experiment-tracking SaaS. It is a small Python package that makes
arbitrary ML code inspectable, reproducible, verifiable, comparable, and
repeatable through standard files and JSON surfaces.

## Identity

- Package: pcq
- Import: `import pcq`
- CLI: `pcq`
- License: Apache-2.0
- Repository: https://github.com/playidea-lab/pcq
- PyPI: https://pypi.org/project/pcq/
- Website: https://playidea-lab.github.io/pcq/

Core sentence:

```text
pcq does not operate the model.
pcq operates the experiment boundary.
```

Runtime contract names:

- `cq.yaml`
- `CQ_CONFIG_JSON`
- `cq://`

These names do not mean pcq is usable only with the managed CQ service.

## What pcq Standardizes

An experiment project has a `cq.yaml` file:

```yaml
name: mnist-mlp
cmd: uv run python train.py
configs:
  seed: 42
  epochs: 3
  output_dir: output
  monitor: eval_acc
  mode: max
metrics:
  - epoch
  - eval_acc
artifacts:
  - output/
```

The training code can use any ML stack:

```python
import pcq

cfg = pcq.config()
out = pcq.output_dir()

# Run any training code here.
score = 0.82

pcq.log(epoch=0, eval_acc=score)
pcq.save_all(
    history=[{"epoch": 0, "eval_acc": score}],
    status="completed",
    artifacts={"model": "model.pt"},
)
```

## Read Path For Agents

Use these commands before editing or running:

```bash
pcq resolve --json
pcq inspect . --json
pcq validate . --strictness 2 --json
```

The agent should identify:

- project root
- selected `cq.yaml`
- command to run
- declared metrics
- output directory
- existing artifacts
- previous run records
- validation warnings or blocking failures
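
The identification checklist above can be sketched as a small triage helper.
Everything here is illustrative: the field names (`root`, `cmd`, `findings`,
`severity`) are assumptions, not pcq's documented JSON contract, so adapt them
to the real `resolve`/`validate` payloads.

```python
import json

def triage(resolve_out: dict, validate_out: dict) -> dict:
    """Collect pre-run facts from resolve/validate payloads.

    Field names are illustrative assumptions, not pcq's contract.
    """
    blocking = [
        f for f in validate_out.get("findings", [])
        if f.get("severity") == "blocking"
    ]
    return {
        "root": resolve_out.get("root"),
        "cmd": resolve_out.get("cmd"),
        "can_run": not blocking,
        "blocking": blocking,
    }

# Hypothetical payloads standing in for `pcq resolve --json` and
# `pcq validate . --strictness 2 --json` output.
resolve_out = json.loads('{"root": ".", "cmd": "uv run python train.py"}')
validate_out = json.loads(
    '{"findings": [{"severity": "warning", "msg": "no previous runs"}]}'
)
facts = triage(resolve_out, validate_out)
```

The point of the sketch is the split: warnings are facts to carry forward,
while blocking findings stop the run path before any process is spawned.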

## Run Path For Agents

For a final result object only:

```bash
pcq run --path . --json
```

For live events:

```bash
pcq run --path . --jsonl
```

For a final JSON object plus event evidence in a file:

```bash
pcq run --path . --events output/events.jsonl --json
```

JSONL events are newline-delimited JSON objects. Each event includes at least:

- `schema_version`
- `seq`
- `time`
- `event`

Important event types:

- `run.started`
- `stdout`
- `stderr`
- `metric`
- `run.completed`
- `run.failed`
- `run.error`

Metric events are derived from `pcq.log(...)` stdout lines in `@key=value`
format.
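
A minimal, stdlib-only consumer of the stream might look like the following
sketch. The sample events are fabricated to match only the minimum fields
listed above, and the metric-line token layout (one `@key=value` token per
pair) is an assumption, not a documented grammar.

```python
import json

def parse_metric_tokens(line: str) -> dict:
    """Extract @key=value tokens from a stdout line.

    One `@key=value` token per whitespace-separated word is assumed.
    """
    metrics = {}
    for tok in line.split():
        if tok.startswith("@") and "=" in tok:
            key, raw = tok[1:].split("=", 1)
            try:
                metrics[key] = float(raw)
            except ValueError:
                metrics[key] = raw
    return metrics

def read_events(lines) -> list:
    """Parse newline-delimited JSON events and order them by `seq`."""
    events = [json.loads(line) for line in lines if line.strip()]
    events.sort(key=lambda e: e["seq"])
    return events

# Fabricated sample stream carrying only the minimum fields.
sample = [
    '{"schema_version": 1, "seq": 0, "time": 0.0, "event": "run.started"}',
    '{"schema_version": 1, "seq": 1, "time": 1.2, "event": "metric"}',
    '{"schema_version": 1, "seq": 2, "time": 2.5, "event": "run.completed"}',
]
names = [e["event"] for e in read_events(sample)]
```

Sorting by `seq` rather than trusting arrival order is a cheap defensive
choice when the stream is read from a file after the fact.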

## Post-Run Path For Agents

After the process exits:

```bash
pcq validate-run output --strictness 3 --json
pcq describe-run output --json
```

For comparing two iterations:

```bash
pcq compare-runs old_output new_output --json
pcq lineage new_output --json
```

`describe-run` and `compare-runs` expose decision facts. They intentionally do
not decide whether to continue, roll back, or accept a run; that judgment
stays with the agent.
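
Because pcq stops at exposing facts, the continue/rollback/accept decision is
agent-side logic. A sketch under assumptions: the payload shape used here
(`metrics` mapping each name to `old`/`new` values) is hypothetical, not the
real `compare-runs` output.

```python
def decide(compare: dict, monitor: str, mode: str = "max") -> str:
    """Turn comparison facts into a policy decision.

    The payload shape is hypothetical; the point is the division of
    labor: pcq supplies the numbers, the agent supplies the judgment.
    """
    values = compare["metrics"][monitor]
    delta = values["new"] - values["old"]
    improved = delta > 0 if mode == "max" else delta < 0
    return "accept" if improved else "rollback"

# Fabricated comparison facts for two iterations.
compare = {"metrics": {"eval_acc": {"old": 0.80, "new": 0.82}}}
decision = decide(compare, "eval_acc", "max")
```

The `monitor`/`mode` pair mirrors the fields already declared in `cq.yaml`, so
the same configuration can drive both logging and acceptance policy.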

## Standard Artifacts

A valid run should produce:

- `config.json`
- `metrics.json`
- `manifest.json`
- `run_summary.json`
- `run_record.json`
- `validation_report.json`

`run_record.json` is the canonical completion record. It should contain source,
environment, input, metric, artifact, validation, lineage, and summary evidence.
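
A quick completeness check over that artifact set needs only the stdlib. The
file list comes straight from above; treating any absence as a gap to report
(rather than a hard failure) is this sketch's own policy, and
`pcq validate-run` remains the authoritative check.

```python
import tempfile
from pathlib import Path

# The standard artifact set listed above.
STANDARD_ARTIFACTS = [
    "config.json", "metrics.json", "manifest.json",
    "run_summary.json", "run_record.json", "validation_report.json",
]

def missing_artifacts(output_dir: str) -> list:
    """Return the standard artifact files absent from a run directory."""
    out = Path(output_dir)
    return [name for name in STANDARD_ARTIFACTS if not (out / name).is_file()]

# Demo on a throwaway directory containing only config.json.
with tempfile.TemporaryDirectory() as d:
    Path(d, "config.json").write_text("{}")
    gaps = missing_artifacts(d)
```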

## Agent Behavior

Do:

- Prefer JSON/JSONL surfaces over scraping terminal prose.
- Keep project-specific model, dataset, loss, optimizer, scheduler, and
  framework code in the user's project.
- Declare metrics in `cq.yaml` before emitting them with `pcq.log(...)`.
- Use `pcq.output_dir()` rather than hard-coded `output/` paths.
- Treat failed runs as evidence when partial artifacts can be preserved.

Do not:

- Treat process exit code alone as experiment success.
- Assume pcq owns the training loop.
- Assume CQ service is required.
- Add framework adapters when direct contract code is enough.
- Edit pcq internals for one project-specific experiment.
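
The first "do not" can be made concrete: require both a zero exit code and a
recorded completion status before calling a run successful. The `status` field
in `run_summary.json` is an assumption, extrapolated from the
`save_all(status="completed")` call shown earlier.

```python
import json
import tempfile
from pathlib import Path

def run_succeeded(exit_code: int, output_dir: str) -> bool:
    """Exit code alone is not experiment success; also require a
    completed status in run_summary.json (field name assumed)."""
    summary = Path(output_dir) / "run_summary.json"
    if exit_code != 0 or not summary.is_file():
        return False
    return json.loads(summary.read_text()).get("status") == "completed"

with tempfile.TemporaryDirectory() as d:
    bare = run_succeeded(0, d)  # clean exit, but no evidence yet
    Path(d, "run_summary.json").write_text(json.dumps({"status": "completed"}))
    ok = run_succeeded(0, d)
```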

## MCP Integration (v4.1.0)

pcq ships an optional Model Context Protocol (MCP) server so that agent
runtimes (Claude Code, Codex, custom LLM clients) can call pcq through
structured JSON instead of shelling out and parsing stdout.

Install:

```bash
uv add 'pcq[mcp]'
```

Wire the project:

```bash
pcq init-experiment --output ./my-exp --agent claude
pcq agent install --target claude --path ./my-exp --mcp
```

The `--mcp` flag merges the following into `.mcp.json` (existing entries
preserved):

```json
{
  "mcpServers": {
    "pcq": {
      "command": "pcq",
      "args": ["mcp", "serve"]
    }
  }
}
```

Serve:

```bash
pcq mcp serve                                            # stdio (default)
pcq mcp serve --transport sse --host 127.0.0.1 --port 8765
```

pcq exposes 14 MCP tools. Read-only tools never create directories, never
mutate `cq.yaml`, and never spawn subprocesses:

| Tool | Read-only | Maps to |
|------|-----------|---------|
| `resolve_project` | yes | `pcq resolve` |
| `inspect_project` | yes | `pcq inspect` |
| `validate_project` | yes | `pcq validate` |
| `validate_run` | yes | `pcq validate-run` |
| `describe_run` | yes | `pcq describe-run` |
| `compare_runs` | yes | `pcq compare-runs` |
| `lineage_chain` | yes | `pcq lineage` |
| `apply_plan` | no | `pcq apply-plan` |
| `apply_planset` | no | `pcq apply-planset` |
| `init_experiment` | no | `pcq init-experiment` |
| `finalize_run` | no | `pcq finalize` |
| `agent_install` | no | `pcq agent install` |
| `agent_status` | yes | `pcq agent status` |
| `run_experiment` | no | `pcq run` |

Every tool's input and output is anchored in the
`pcq.agent.json_contracts.JSON_CONTRACTS` registry, frozen since v2.13.

Multi-hour GPU training should not block the in-process `run_experiment`
tool. For that workload, prefer the CQ service queue, which consumes the same
contract.

To embed the tool registry directly, without the MCP server wrapper:

```python
import asyncio

from pcq.mcp.tools import build_tools

# Build the tool objects and call one handler directly.
tools = build_tools()
resolve = next(t for t in tools if t.name == "resolve_project")
result = asyncio.run(resolve.handler({"path": "."}))
```

## Examples

### sklearn — RandomForest on Iris

```yaml
# cq.yaml
name: sklearn-baseline
cmd: uv run python train.py
configs:
  output_dir: output
  seed: 42
  n_estimators: 100
  monitor: eval_acc
  mode: max
metrics:
  - epoch
  - eval_acc
artifacts:
  - output/
```

```python
# train.py
import pcq, joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

cfg = pcq.config()
out = pcq.output_dir()
seed = cfg.get("seed", 42)
pcq.seed_everything(seed)

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=seed
)
model = RandomForestClassifier(n_estimators=cfg.get("n_estimators", 100))
model.fit(X_tr, y_tr)
acc = float(model.score(X_te, y_te))

pcq.log(epoch=0, eval_acc=acc)
joblib.dump(model, out / "model.pkl")
pcq.save_all(history=[{"epoch": 0, "eval_acc": acc}],
             artifacts={"model": "model.pkl"})
```

### PyTorch — agnostic training loop

```python
import pcq, torch
from torch import nn

cfg = pcq.config()
out = pcq.output_dir()
pcq.seed_everything(cfg.get("seed", 42))

model = nn.Linear(cfg["in_dim"], cfg["out_dim"])
opt = torch.optim.Adam(model.parameters(), lr=cfg["lr"])

history = []
for epoch in range(cfg["epochs"]):
    train_loss = train_one_epoch(model, opt)   # user code
    val_acc = evaluate(model)                   # user code
    pcq.log(epoch=epoch, train_loss=train_loss, val_acc=val_acc)
    history.append({"epoch": epoch, "train_loss": train_loss, "val_acc": val_acc})

torch.save(model.state_dict(), out / "model.pt")
pcq.save_all(history=history, artifacts={"model": "model.pt"})
```

## Related Docs

- v4 direction: https://github.com/playidea-lab/pcq/blob/main/docs/V4_DIRECTION.md
- Introduction: https://github.com/playidea-lab/pcq/blob/main/docs/INTRODUCTION.md
- JSON contracts: https://github.com/playidea-lab/pcq/blob/main/docs/JSON_CONTRACTS.md
- Agent guide: https://github.com/playidea-lab/pcq/blob/main/docs/AGENT_OPERATING_GUIDE.md
- Strictness: https://github.com/playidea-lab/pcq/blob/main/docs/STRICTNESS.md
- Runtime contract: https://github.com/playidea-lab/pcq/blob/main/docs/CQ_YAML_RUNTIME_CONTRACT.md
- MCP integration: https://github.com/playidea-lab/pcq/blob/main/docs/MCP_INTEGRATION.md
