Metadata-Version: 2.4
Name: relay-workflow
Version: 0.2.0
Summary: Cross-agent (Claude Code + Codex) planning/execution workflow — ALIGN, PLAN, PROJECT, IMPLEMENT, SHIP — plus a skill compiler that authors once and emits both host variants.
Project-URL: Homepage, https://github.com/applied-artificial-intelligence/relay
Project-URL: Repository, https://github.com/applied-artificial-intelligence/relay
Project-URL: Issues, https://github.com/applied-artificial-intelligence/relay/issues
Author-email: Stefan Jansen <stefan@applied-ai.com>
License-Expression: MIT
License-File: LICENSE
Keywords: agents,automation,claude-code,codex,github,workflow
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Version Control :: Git
Requires-Python: >=3.10
Provides-Extra: test
Requires-Dist: pytest>=8; extra == 'test'
Description-Content-Type: text/markdown

# Relay

**A cross-agent (Claude Code + Codex) planning/execution workflow.** Portable
operating discipline + continuity across agents: forcefully align on a spec, plan it
under a mechanically-bounded agent that *can't* run off and build, project it onto
GitHub (epic → milestones → issues), then implement and ship incrementally.

Local docs are the source of truth; **GitHub is the projection** (degrades to
local-only with no remote). One workflow, two host bindings — both converge on the
same `.workspace/work/<unit>/` files and `gh` projection.

> **Status**: alpha — pipeline complete (all 5 steps live-tested), published
> to PyPI as [`relay-workflow`](https://pypi.org/project/relay-workflow/) 0.1.0
> (2026-06-03). Extracted 2026-05-28 from `factory` work unit 024
> (`claude_codex_interop_framework`). Successor to the Claude-only
> `claude-code-toolkit` (which remains active separately). Design rationale
> and the empirical host probes are in [`docs/`](docs/).
>
> **See it run**: [`docs/demo.cast`](docs/demo.cast) — a recorded
> end-to-end pipeline against a fresh GitHub repo
> ([`applied-artificial-intelligence/relay-demo`](https://github.com/applied-artificial-intelligence/relay-demo)):
> `relay plan` → `relay project --execute` → `relay implement --execute`
> (headless Claude writes `greet.py`) → `relay ship --execute` (PR #8
> merged). Play with `asciinema play docs/demo.cast`.

## Pipeline (host-neutral)

Five steps built — pipeline complete:

```
align/    →  spec.md       forceful interrogation (skill + Codex prompt; interactive)
relay_plan.py      →  plan.json/.md   bounded plan-only agent run (can't implement)
relay_project.py   →  GitHub          idempotent epic + milestones + issues (+ branches)
relay_implement.py →  branch + PR     headless agent implements one issue (Closes #)
relay_ship.py      →  merged PR       bubble up issue/milestone/project state
```

Alongside the pipeline, **`relay compile`** is a small adjacent tool that
solves the *other* cross-agent problem: author a skill once in a canonical
SKILL.md and emit both `.claude/skills/<name>/` (full frontmatter) and
`.agents/skills/<name>/` (Codex-trimmed, with a `SKILL_DIR` resolver). See
[`docs/skill-format.md`](docs/skill-format.md).

## `align/` — spec interrogation (interactive)

`align/SKILL.md` (Claude skill) + `align/align.codex.md` (Codex `/align` prompt):
forcefully interrogate the user — one question at a time, challenge vague answers,
force out-of-scope exclusions — then write `<unit>/spec.md` from `spec-template.md`.
Front-loads all human clarification so the headless `plan` step never needs to ask.
Deploy: copy `SKILL.md` to a skills dir; copy `align.codex.md` to `~/.codex/prompts/align.md`.

## `relay_plan.py` — plan step

`relay_plan.py`: turns an aligned **spec** into a structured, milestone/issue
**plan** without letting the coding agent run off into implementation. Host-neutral.

```
spec.md  →  [bounded plan-only agent run]  →  plan.json + plan.md
```

1. Reads `<unit>/spec.md` (produced by `align`).
2. Runs the host's **mechanically bounded "plan-but-don't-build" primitive**:
   - **Claude**: `claude -p --permission-mode plan` — `ExitPlanMode` is terminal,
     nothing executes. Plan extracted from `--output-format json` result (falls back
     to newest `~/.claude/plans/*.md`).
   - **Codex**: `codex exec --sandbox read-only --output-schema <schema> -o <file>` —
     read-only sandbox can't write; schema **enforces** the milestone/issue JSON shape.
3. Writes `<unit>/plan.json` (structured) + `<unit>/plan.md` (human/checklist).
4. Emits `<unit>/gh_projection.sh` — a flat, non-idempotent `gh` script. **Superseded
   by `relay_project.py`** for real use; kept only as a quick-look dry-run.

## `relay_project.py` — project step (idempotent GitHub projection)

Mirrors `plan.json` onto GitHub: a tracking **epic** issue, **milestones**, one
**issue** per work item, with branch + `Closes #` wiring written back into `plan.json`.

```
plan.json  →  [resolve repo]  →  epic + milestones + issues  →  plan.json (numbers+branches) + projection.json
```

- **Idempotent**: matches existing milestones/issues by title (`gh api .../milestones`,
  `gh issue list`) and reuses them — re-running never duplicates. Verified live against
  a throwaway repo (4 milestones/21 issues; second run created 0 new objects).
- **Epic**: a `[epic] <objective>` tracking issue with a child task-list, refreshed
  (not duplicated) on re-run. `--no-epic` to skip.
- **Branch/PR wiring**: each issue gets `branch: relay/m<n>-<mslug>/<islug>` and a
  `Closes #<n>` contract recorded in `plan.json` for IMPLEMENT to consume.
- **Local-first**: no GitHub remote → exits local-only (add a remote and re-run to promote).
- **Missing labels** don't abort — the issue is created without them and logged.

## `relay_implement.py` — implement step (headless, one issue → PR)

Consumes the projected `plan.json` and drives a **headless coding agent** to
implement ONE issue on its branch, then opens a PR that closes the issue. This is
the step that lets Relay execute its own backlog instead of a human hand-building.

```
plan.json  →  [select issue]  →  branch off base  →  [headless agent]  →  push + PR (Closes #N)  →  plan.json (pr+state)
```

- **One issue at a time** (workflow-design.md model). Default target = first
  *pending* issue (state ∉ {implemented, merged, done}); `--issue <#n|title>` to pick.
- Reuses the `branch`/`closes` wiring the PROJECT step wrote into `plan.json`
  (errors if absent — run `relay_project.py --execute` first).
- Assembles a **brief** from objective + milestone + issue body + `spec.md` (the
  durable contract) + mechanical scope rules — host-neutral, byte-identical
  across hosts — then runs the chosen host headless (Claude: `claude -p
  --output-format json`; Codex: `codex exec` in a `workspace-write` sandbox).
  Success is verified by `git` (commits on the branch), not by parsing stdout.
- **Idempotent**: reuses an existing branch / existing PR; if the agent leaves no
  commits, records `state=no-op` and opens no PR. Local-only (no remote) →
  `state=implemented-local`, branch left for later promotion.
- **Dry-run by default**: prints target, branch, the commands it would run, and the
  full assembled brief. `--execute` to branch + spend an agent run + open the PR.
- **Host: `--host {claude,codex,auto}`** (default `auto`). Auto-detect mirrors
  PLAN's precedence — `claude` on PATH wins, else `codex`, else a clear error.
  Both hosts drive the same host-neutral brief and write the same `state`
  vocabulary (`open` / `no-op` / `implemented-local`).
- **Host equivalence**: Codex runs under a `workspace-write` sandbox so it can
  edit and commit the working tree. (PLAN uses `read-only` for a *different*
  reason — to mechanically bound the planner so it can't build; IMPLEMENT must
  be able to write.) For fully-autonomous runs that must run tests and commit
  without prompts, `codex --dangerously-bypass-approvals-and-sandbox` is the
  analogue of Claude's `--permission-mode bypassPermissions` (default is the
  safer `acceptEdits`).

## `relay_ship.py` — ship step (incremental: PR merge → issue/milestone done)

Consumes the `pr`/`state` IMPLEMENT wrote into `plan.json` and **closes the loop
incrementally**. There is no terminal "ship" event in workflow-design.md — SHIP
runs (and re-runs) as PRs become mergeable: PR merge → issue done (auto via
`Closes #N`) → milestone done (when all its issues are merged) → project done
(when every milestone closes).

```
plan.json  →  [pick PR]  →  check mergeable + checks  →  gh pr merge  →  bubble-up  →  plan.json (merged/done)
```

- **One PR per run by default** (mirrors IMPLEMENT's one-at-a-time); `--all` to
  drain every mergeable PR in one pass.
- Merge gate: `mergeable=MERGEABLE` AND checks not `FAILING`. Pending checks
  block by default; `--auto` enables GitHub auto-merge so the PR lands when
  checks pass.
- `--admin` bypasses branch protections on repos without CI.
- `mergeable=UNKNOWN` (GitHub's lazy-compute response on first probe) is
  auto-retried once.
- After each pass, bubbles up milestones whose issues are all done and sets
  `plan.state = "shipped"` when every milestone closes.
- `gh pr merge` lands on GitHub leaving the local base behind — SHIP fetches +
  fast-forwards before committing state so the push succeeds the first time.
- Already-merged PRs are recognised and reconciled without re-merging
  (re-running is safe + idempotent).

## `relay compile` — author skills once, emit both host variants

The pipeline is *workflow* interop; this is *artifact* interop. Same problem
shape (one source, two hosts), different surface.

```
skills-src/<name>/SKILL.md  →  .claude/skills/<name>/   (full frontmatter, ${CLAUDE_SKILL_DIR})
                            →  .agents/skills/<name>/   (name+description only, ${SKILL_DIR} resolver)
```

- Canonical SKILL.md is the **Claude-superset**: write at the richer level,
  the compiler drops Claude-only frontmatter fields when emitting Codex.
- Self-references use `{baseDir}` (the form Claude skills already use). The
  compiler rewrites it to `${CLAUDE_SKILL_DIR}` for Claude and to
  `${SKILL_DIR}` for Codex, and injects a one-line resolver into the first
  bash block on the Codex side so `SKILL_DIR` is always defined.
- `@import ./file.md` becomes an on-demand `Read` reference for Claude
  (progressive disclosure) and is *inlined* verbatim for Codex (which has
  no include mechanism).
- Codex output carries an `AUTO-GENERATED … do not edit` header so a future
  hand-editor doesn't change the wrong file.
- Bundled assets (`scripts/`, `references/`, etc.) are copied verbatim to
  both outputs.
- Lints on Codex's caps (`name ≤ 64`, `description ≤ 1024`).

Format details and rationale: [`docs/skill-format.md`](docs/skill-format.md).
Functionally verified on real marketplace skills (`lightgbm-training`,
`structured-writing`, …).

## Install

From PyPI:
```bash
uv tool install relay-workflow   # recommended (isolated env, like pipx)
# or
pipx install relay-workflow      # legacy-compatible alternative
```

From source (for development or to pin to a working tree):
```bash
git clone https://github.com/applied-artificial-intelligence/relay
cd relay
uv tool install --force .        # install this checkout
```

Both paths put a `relay` binary on your `PATH`. Requires Python ≥ 3.10 and the
`gh` CLI for the steps that touch GitHub.

## Usage

```bash
# align: invoke the Claude skill / Codex /align prompt (interactive), then:
relay plan <work-unit-dir>                            # auto-detect host, plan only
relay plan <unit> --host codex                        # force host
relay project <unit>                                  # dry-run GitHub projection
relay project <unit> --execute                        # create/refresh epic+milestones+issues
relay implement <unit>                                # dry-run: show target issue + brief
relay implement <unit> --execute                      # implement first pending issue → PR (auto-detect host)
relay implement <unit> --host codex --execute         # force Codex as the headless coding host
relay implement <unit> --issue 7 --execute --permission-mode bypassPermissions
relay ship <unit>                                     # dry-run: which PRs would merge
relay ship <unit> --all --execute                     # merge every mergeable PR, bubble up
relay ship <unit> --execute --auto                    # enable auto-merge (wait on checks)

relay compile <skills-src-dir> <out-dir>              # one SKILL.md → both .claude/ and .agents/
```

`relay --help` lists steps; `relay <step> --help` lists step-specific options.
`python -m relay <step>` works too (handy on systems where `pipx`/`uv`-installed
scripts aren't on `PATH` yet).

## Verified (2026-05-27)

| Host | Mechanism | Output | Time | Repo touched? |
|---|---|---|---|---|
| Codex (gpt-5.5) | PLAN: read-only + enforced schema | 4 milestones / 21 issues | 39s | no |
| Claude (2.1.152) | PLAN: `--permission-mode plan` + JSON extract | 2 milestones / 10 issues | 29s | no |
| Claude IMPLEMENT | `claude -p` in working tree | 2 issues → 2 PRs (`Closes #`) | — | yes (commits on branch) |
| Codex IMPLEMENT | `codex exec` + `workspace-write` sandbox | branch + PR (`Closes #`) — _pending live verification_ | — | yes (commits on branch) |

Both leave the repo untouched (the anti-runaway guarantee is mechanical, not
behavioral) and converge on the same `plan.json` + `gh_projection.sh`.

## Known gaps (prototype, not production)

- **Codex schema** requires `additionalProperties:false` + every property in
  `required` (OpenAI structured-output rule) — handled in `PLAN_SCHEMA`.
- Claude path is **not** schema-enforced — relies on the model returning JSON; a
  structuring/repair pass would harden it.
- No rolling-wave support yet (re-plan a single milestone into issues later).
- GitHub **Projects v2** board (Status/Area/Type fields, à la tradesharp) is not
  created — `relay_project.py` does epic issue + milestones only (CLI-friendly, no
  GraphQL). Project-v2 wiring is the next projection increment.
- Issue-level idempotency matches on **exact title**; renaming an issue in the spec
  creates a new one rather than updating the old. Acceptable for append-style planning.
- **IMPLEMENT `--execute` live-tested 2026-05-28** (throwaway `relay-smoketest`, 2 issues →
  2 PRs via `claude -p`, both in-scope with tests + correct `Closes #`). Idempotency holds:
  skips issues that already have a PR (`--redo` to force), state is committed to the base
  branch (survives between runs, leaves a clean tree), dirty-guard ignores untracked agent
  artifacts. Default selection walks pending issues one at a time.
- **Parallel issues on one file conflict at merge time**: each issue branches off base
  independently, so two issues editing the same file produce mergeable-but-conflicting PRs.
  That's a SHIP/rebase concern, not a runner bug.
- **SHIP `--execute` live-tested 2026-05-31** (same `relay-smoketest` repo): merged PR #4
  (`Closes #1` auto-closed issue #1), correctly detected the parallel-PR conflict on PR #5,
  bubble-up left milestone open (1 of 2 issues done). Already-merged reconciliation verified
  on a second pass after a hard reset. Fixed two bugs found during the live test:
  (a) `gh pr view` first probe returning `mergeable=UNKNOWN` — auto-retry once;
  (b) post-merge local base was stale, so state push failed — fetch + fast-forward before
  committing state.

## Repo layout

```
align/             ALIGN step — Claude skill + Codex /align prompt + spec template
src/relay/         Python package — installs as the `relay` CLI dispatcher
  cli.py           subcommand dispatcher (relay plan|project|implement|ship)
  plan.py          PLAN step  — bounded plan-only run → plan.json/.md
  project.py       PROJECT step — idempotent GitHub epic/milestones/issues
  implement.py     IMPLEMENT step — headless agent implements one issue → PR (Closes #)
  ship.py          SHIP step — incremental PR merge → issue/milestone bubble-up
pyproject.toml     hatchling build; entry-point `relay = "relay.cli:main"`
docs/              workflow-design.md, planmode-probe.md, portal-briefing.md, PROPOSAL.md
.workspace/        agent state (memory/transitions/work), interop convention
```

## Design & provenance

- [`docs/workflow-design.md`](docs/workflow-design.md) — canonical workflow spec (ALIGN→PLAN→PROJECT→IMPLEMENT→SHIP).
- [`docs/planmode-probe.md`](docs/planmode-probe.md) — Claude vs Codex plan-mode probe; the empirical basis for the bounded-planner design.
- [`docs/portal-briefing.md`](docs/portal-briefing.md) — briefing for the website Agent Lab portal.
- Next steps (from design): self-host Relay (run the full ALIGN→…→SHIP cycle on Relay's
  own backlog), then rolling-wave re-planning + Projects v2 board.
