Metadata-Version: 2.4
Name: relay-workflow
Version: 0.4.0
Summary: Host-neutral cross-agent work-tracking CLI for Claude Code + Codex — plan → project → ship, with a skill compiler/sync bridge and work-unit status/cleanup verbs.
Project-URL: Homepage, https://github.com/applied-artificial-intelligence/relay
Project-URL: Repository, https://github.com/applied-artificial-intelligence/relay
Project-URL: Issues, https://github.com/applied-artificial-intelligence/relay/issues
Author-email: Stefan Jansen <stefan@applied-ai.com>
License-Expression: MIT
License-File: LICENSE
Keywords: agents,automation,claude-code,codex,github,workflow
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Version Control :: Git
Requires-Python: >=3.10
Description-Content-Type: text/markdown

# Relay

**Host-neutral cross-agent work-tracking CLI for Claude Code + Codex.** Take a
spec → break it into a structured plan → project the plan onto GitHub (epic,
milestones, issues) → ship merged PRs back to closed issues. Plus a small
adjacent skill-bridge (`compile` / `sync`) so a single SKILL.md authored once
runs on either host.

Local files are the source of truth (`.workspace/work/<unit>/spec.md`,
`plan.json`, `projection.json`); **GitHub is the projection** (and gracefully
degrades to local-only with no remote). One workflow, two host bindings —
both converge on the same files and the same `gh` projection.

> **Status**: alpha — five pipeline verbs (`plan`, `project`, `ship`,
> `compile`, `sync`) plus two work-state verbs (`status`, `cleanup`), all
> stdlib-only at runtime. Published to PyPI as
> [`relay-workflow`](https://pypi.org/project/relay-workflow/). Successor to
> the Claude-only `claude-code-toolkit`.
>
> **0.4.0 (2026-06-09) is a deliberate breaking release** — `relay implement`
> is removed. Run the implementation in your foreground Claude Code or Codex
> session against shared `.workspace/` state; close the loop with
> `relay ship`. The full migration is in [`CHANGELOG.md`](CHANGELOG.md).
>
> **See it run**: [`docs/demo.cast`](docs/demo.cast) (Claude host) and
> [`docs/demo-codex.cast`](docs/demo-codex.cast) (Codex host) — recorded
> against [`applied-artificial-intelligence/relay-demo`](https://github.com/applied-artificial-intelligence/relay-demo).
> Play with `asciinema play docs/demo.cast`.

## The workflow

```
align       →  spec.md           interactive interrogation (skill + Codex prompt)
relay plan  →  plan.json/.md     bounded plan-only agent run (host-asymmetric — see below)
relay project →  GitHub          idempotent epic + milestones + issues (opt-in)
[ you implement each issue in your foreground Claude / Codex session ]
relay ship  →  merged PRs        bubble up issue/milestone state via `Closes #N` (opt-in)
```

`project` + `ship` are **opt-in**: a single-feature, solo flow can stop after
`plan` and never touch GitHub. The skill-bridge tools (`compile`, `sync`) live
adjacent to the pipeline — same problem shape (one source, two hosts),
different surface.

### Two state-management verbs (new in 0.4.0)

```
relay status            inspect ACTIVE_WORK + artifacts + drift + GH counts
relay cleanup           detect stalled units + stale pointer (dry-run; --execute to repair)
```

`status` is read-only. `cleanup` is dry-run by default and never deletes a
directory — at most it writes a `STATUS.md` tag inside a stalled unit or
repairs `ACTIVE_WORK`. Every pipeline verb also maintains the
`.workspace/work/ACTIVE_WORK` pointer so the agent and the
[`workflow` plugin's](https://github.com/applied-artificial-intelligence/claude_code_plugins)
`capture-plan.sh` hook stay in sync across sessions.

## Host recommendation matrix

Relay is genuinely host-neutral — both hosts can drive every verb. But the
two are **not symmetric** at plan-capture time, and pretending they are
would make the casts and docs misleading. Use the table below; the cast
narrations follow it.

| Workflow shape | Recommended host | Why |
|---|---|---|
| Interactive plan-then-iterate-then-capture | **Claude Code** | `ExitPlanMode` `PostToolUse` hook (Claude Code v2.1+) fires reliably; the `capture-plan.sh` hook receives the full plan content via stdin and writes it straight into the work unit. Iterations replace the file; capture is automatic on approval. |
| Headless single-shot plan generation | **Either** | `relay plan --host {claude,codex}` runs the same bounded primitive on both. Codex has the cleaner output path (`--sandbox read-only --output-schema` enforces the JSON shape and `-o <FILE>` is deterministic); Claude path requires JSON extraction. |
| Foreground per-issue implementation | **Either**, user preference | Both can read `.workspace/`, both close issues that the PR's body says `Closes #N`. Codex has finer-grained sandbox controls; Claude has subagent fan-out. |
| Multi-host project where one dev switches | **Use Claude for plan capture, either for execution** | The asymmetry only bites at plan-capture time; everything else is symmetric. |

**Codex `/plan` parity is structural, not bridgeable.** As of `codex-cli`
0.137.0 there is no discrete plan tool to hook (no `ExitPlanMode` analogue,
no `/plan` slash, no plan-event surface). Use `relay plan --host codex` for
the headless path on Codex. Probe details:
[`docs/probes/relay-0.4.0.md`](docs/probes/relay-0.4.0.md) (Probe A, B).

## Plan-mode: what it actually buys you

A 2026-06-09 probe (3 runs/side on a 159-LOC ambiguity-rich refactor) measured
the no-plan vs Claude plan-mode delta:

- **Same plan content, more elaboration.** A 9-dimension rubric scored
  no-plan 8.0/9 vs plan 8.33/9 — within-group variance exceeds the delta.
- **Subagent fan-out *is* real.** Plan-mode triggered one `Agent`/`Explore`
  exploration phase per run (3/3 plan vs 0/3 no-plan) at this task size.
  That translates to thoroughness in *reaching* the plan, not in the plan
  itself.
- **Structured artifact.** Plan-mode produces an
  `ExitPlanMode.plan` payload that `capture-plan.sh` can write deterministically
  into the work unit. No-plan plans live inside the result message text.
- **Cost: ~2.5× compute, ~2× wall time** versus no-plan.

> **Honest framing:** plan-mode is a *workflow-integration tool* — explicit
> approval gate, structured artifact, downstream-tooling-friendly — **not a
> plan-quality booster.** Recommend it when the downstream step
> (`relay project`) consumes the artifact, or when you want explicit
> iteration. Skip it for fast one-shots: you get the same plan content at
> ~40% of the cost.

Casts and copy do not (and will not) claim plan-mode "produces better plans"
— the probe data doesn't support the stronger claim.

## `align` — spec interrogation (interactive, host-neutral)

`align/SKILL.md` (Claude skill) + `align/align.codex.md` (Codex `/align`
prompt): forcefully interrogate the user — one question at a time, challenge
vague answers, force out-of-scope exclusions — then write `<unit>/spec.md`
from `spec-template.md`. Front-loads all human clarification so the headless
`plan` step never needs to ask. Deploy: copy `SKILL.md` to a skills dir;
copy `align.codex.md` to `~/.codex/prompts/align.md`.

`relay align` / `relay new <topic>` are **deferred to 0.5**. The skill +
Codex prompt remain the entry point in 0.4.0.

## `relay plan` — bounded plan-only run

```
spec.md  →  [bounded plan-only agent run]  →  plan.json + plan.md
```

1. Reads `<unit>/spec.md` (produced by `align`).
2. Runs the host's **mechanically bounded "plan-but-don't-build" primitive**:
   - **Claude**: `claude -p --permission-mode plan` — `ExitPlanMode` is
     terminal, nothing executes. Plan extracted from `--output-format json`
     result. (Probe A confirmed Claude Code v2.1+ always returns the plan
     content on stdout; the 0.3.x `~/.claude/plans/*.md` mtime-hunt is gone.)
   - **Codex**: `codex exec --sandbox read-only --output-schema <schema> -o
     <file>` — read-only sandbox can't write; schema **enforces** the
     milestone/issue JSON shape.
3. Writes `<unit>/plan.json` (structured) + `<unit>/plan.md` (human/checklist).
4. Emits `<unit>/gh_projection.sh` — a flat, non-idempotent `gh` script.
   **Superseded by `relay project`** for real use; kept only as a quick-look
   dry-run.

## `relay project` — idempotent GitHub projection (opt-in)

Mirrors `plan.json` onto GitHub: a tracking **epic** issue, **milestones**,
one **issue** per work item, with branch + `Closes #` wiring written back
into `plan.json`.

```
plan.json  →  [resolve repo]  →  epic + milestones + issues  →  plan.json (numbers+branches) + projection.json
```

- **Idempotent**: matches existing milestones/issues by title and reuses
  them — re-running never duplicates.
- **Epic**: a `[epic] <objective>` tracking issue with a child task-list,
  refreshed (not duplicated) on re-run. `--no-epic` to skip.
- **Branch/PR wiring**: each issue gets `branch:
  relay/m<n>-<mslug>/<islug>` and a `Closes #<n>` contract recorded in
  `plan.json`. The branch is just a *suggested* name — what `relay ship`
  actually probes for is any open PR whose body says `Closes #N`, so you
  can branch and PR however you like.
- **Local-first**: no GitHub remote → exits local-only.
- **Missing labels** don't abort — the issue is created without them and
  logged.

## You implement (foreground)

Open the issue in your Claude Code or Codex session, do the work, push a
branch, open a PR whose body contains `Closes #N`. Anything that closes the
issue via GitHub's keyword is fine — relay ship doesn't care how you got
there.

## `relay ship` — incremental PR merge → bubble-up (opt-in)

```
plan.json  →  [discover PR via gh search]  →  check mergeable + checks  →  gh pr merge  →  bubble-up  →  plan.json (merged/done)
```

- **PR auto-discovery** (new in 0.4.0): for each open issue with no
  recorded `pr` field, ship probes `gh pr list --search "Closes #N"`. An
  issue with no matching PR is logged and left alone (not errored). A
  0.3.x `plan.json` that already carries `pr=<n>` skips the probe
  (backward-compat).
- **One PR per run by default** (`--all` to drain every mergeable PR in one
  pass).
- Merge gate: `mergeable=MERGEABLE` AND checks not `FAILING`. Pending
  checks block by default; `--auto` enables GitHub auto-merge.
- `--admin` bypasses branch protections on repos without CI.
- `mergeable=UNKNOWN` (GitHub's lazy-compute response on first probe) is
  auto-retried once.
- After each pass, bubbles up milestones whose issues are all done and
  sets `plan.state = "shipped"` when every milestone closes.
- Already-merged PRs are reconciled without re-merging (idempotent).

## `relay status` / `relay cleanup` — work-state inspection

```bash
relay status                # active pointer + artifacts + drift + (optional) GH counts
relay status --offline      # skip the gh network calls
relay cleanup               # dry-run: list pointer + stalled-unit drift
relay cleanup --execute     # write STATUS.md tags + prompt to repair pointer
```

- `status` reports: `ACTIVE_WORK` value, whether its dir exists, presence +
  mtime of `spec.md` / `plan.json` / `projection.json`, the unit's
  most-recently-modified file, other unit dirs touched in the last 7 days
  (drift warning), and — if `gh` is on PATH and a remote exists — open
  epic / milestone / `Closes #N` PR counts.
- `cleanup` detects three drift cases:
  1. `ACTIVE_WORK` → nonexistent dir.
  2. Unit with `spec.md`, no `plan.json`, last activity >14 days.
  3. Unit with `plan.json`, no `projection.json`, no activity >14 days.
  Under `--execute`, it writes a `STATUS.md` tag inside the affected unit
  (idempotent — already-tagged units are skipped) or prompts to reassign /
  clear the stale pointer. **It never deletes a directory.**

Every pipeline verb (`plan` / `project` / `ship` / the `implement`
migration shim) also maintains `<repo>/.workspace/work/ACTIVE_WORK`
best-effort on entry, so pointer drift cannot accumulate across sessions.

## `relay compile` — author skills once, emit both host variants

```
skills-src/<name>/SKILL.md  →  .claude/skills/<name>/   (full frontmatter, ${CLAUDE_SKILL_DIR})
                            →  .agents/skills/<name>/   (name+description only, ${SKILL_DIR} resolver)
```

- Canonical SKILL.md is the **Claude-superset**: write at the richer level,
  the compiler drops Claude-only frontmatter fields when emitting Codex.
- Self-references use `{baseDir}` (the form Claude skills already use). The
  compiler rewrites it to `${CLAUDE_SKILL_DIR}` for Claude and to
  `${SKILL_DIR}` for Codex, and injects a one-line resolver into the first
  bash block on the Codex side.
- `@import ./file.md` becomes an on-demand `Read` reference for Claude
  (progressive disclosure) and is *inlined* verbatim for Codex (which has
  no include mechanism).
- Codex output carries an `AUTO-GENERATED … do not edit` header.
- Bundled assets (`scripts/`, `references/`, etc.) are copied verbatim to
  both outputs.
- Lints on Codex's caps (`name ≤ 64`, `description ≤ 1024`).

Format details: [`docs/skill-format.md`](docs/skill-format.md).

## `relay sync` — bulk-install a marketplace's skills into Codex user-scope

```
~/agents/coding/plugins/*/skills/<name>/SKILL.md
       │
       └─ relay sync ──→ ~/.codex/skills/<name>/        (Codex-trimmed + ${SKILL_DIR})
```

- **Codex-only by default.** Claude already discovers marketplace skills
  via the plugin mechanism; Codex has no plugin mechanism, so a
  user-scope install is its only path to the same library.
- **Dry-run by default.** `--execute` to actually emit.
- **Idempotent**: per-skill `<target>/<name>/` is fully replaced on each
  run; unrelated neighbour skills in the target dir are untouched.
- **Collision-aware**: same skill-name in two plugins gets flagged on
  stderr (last write wins).
- **Filters**: `--plugin <name>` to limit, `--marketplace <dir>` /
  `--target <dir>` to override defaults.

```bash
relay sync                                              # dry-run, defaults
relay sync --execute                                    # install every skill into ~/.codex/skills/
relay sync --plugin quant --execute                     # just the quant plugin's skills
relay sync --marketplace ~/my-plugins --target ~/.codex/skills --execute
```

## Install

From PyPI:
```bash
uv tool install relay-workflow   # recommended (isolated env, like pipx)
# or
pipx install relay-workflow      # legacy-compatible alternative
```

From source (for development or to pin to a working tree):
```bash
git clone https://github.com/applied-artificial-intelligence/relay
cd relay
uv tool install --force .        # install this checkout
```

Both paths put a `relay` binary on your `PATH`. Requires Python ≥ 3.10. The
GitHub-touching verbs (`project`, `ship`, the optional `status` GH counts)
need the `gh` CLI authenticated.

## Usage

```bash
# align: invoke the Claude skill / Codex /align prompt (interactive), then:
relay plan <work-unit-dir>                            # auto-detect host, plan only
relay plan <unit> --host codex                        # force host
relay project <unit>                                  # dry-run GitHub projection
relay project <unit> --execute                        # create/refresh epic+milestones+issues

# (you implement issue #N in your foreground Claude or Codex session,
#  open a PR whose body says `Closes #N`, then:)

relay ship <unit>                                     # dry-run: which PRs would merge
relay ship <unit> --all --execute                     # merge every mergeable PR, bubble up
relay ship <unit> --execute --auto                    # enable auto-merge (wait on checks)

relay status                                          # active pointer + artifacts + drift
relay status --offline                                # skip the gh network calls
relay cleanup                                         # dry-run: stalled-unit / stale-pointer report
relay cleanup --execute                               # write STATUS.md tags + prompt pointer repair

relay compile <skills-src-dir> <out-dir>              # one SKILL.md → both .claude/ and .agents/
relay sync                                            # dry-run: list marketplace skills → ~/.codex/skills/
relay sync --execute                                  # install all marketplace skills into Codex user-scope
```

`relay --help` lists steps; `relay <step> --help` lists step-specific
options. `python -m relay <step>` works too.

## Repo layout

```
align/             ALIGN step — Claude skill + Codex /align prompt + spec template
src/relay/         Python package — installs as the `relay` CLI dispatcher
  cli.py           subcommand dispatcher (relay plan|project|ship|compile|sync|status|cleanup)
  plan.py          PLAN step — bounded plan-only run → plan.json/.md
  project.py       PROJECT step — idempotent GitHub epic/milestones/issues
  ship.py          SHIP step — PR auto-discovery + incremental merge + bubble-up
  status.py        STATUS verb — active pointer + artifacts + drift + GH counts
  cleanup.py       CLEANUP verb — stalled-unit + stale-pointer detection (safe by default)
  workunit.py      best-effort ACTIVE_WORK pointer helper
  compile.py       SKILL.md → both host variants
  sync.py          marketplace-wide skill install for Codex
pyproject.toml     hatchling build; entry-point `relay = "relay.cli:main"`
docs/              workflow-design.md, plan-mode probes, demo casts
CHANGELOG.md       release notes (breaking changes called out)
.workspace/        agent state (memory/transitions/work), interop convention
```

## Design & provenance

- [`CHANGELOG.md`](CHANGELOG.md) — release history.
- [`docs/workflow-design.md`](docs/workflow-design.md) — canonical workflow
  spec (ALIGN→PLAN→PROJECT→SHIP).
- [`docs/planmode-probe.md`](docs/planmode-probe.md) — Claude vs Codex
  plan-mode probe; the empirical basis for the bounded-planner design.
- The 2026-06-09 probes (A–F) that drove the 0.4.0 spec rewrite are at
  `.workspace/work/current/2026-06-08-relay-0.4.0-implement-removal/probes-2026-06-09.md`
  in the agent state.
- Next steps (from design): self-host Relay (run the full
  ALIGN→PLAN→PROJECT→SHIP cycle on Relay's own backlog), then `relay align`
  / `relay new <topic>` (0.5), then rolling-wave re-planning + Projects v2
  board.
