Steplog is a build-governance tool for solo agentic builders. It captures what your agent did, surfaces drift, and renders it all into a single dashboard you can open in a browser. It does not tell you what to do next.
Steplog ships as a Python package. Install via pipx for an isolated environment:
pipx install steplog
Then check the install:
steplog version
steplog doctor
If you work in Claude Code, the plugin wraps the CLI in slash commands. Install in three steps:
/plugin marketplace add datafrogger/steplog
/plugin install steplog
You'll get /steplog-init, /steplog-render, /steplog-open, /steplog-archive, /steplog-unarchive, plus a SessionStart recap and a Stop hook that nudges you when a turn ended with code changes but no state-file update.
If you want to vendor Steplog into a project without a registry round-trip, download the latest tarball from the GitHub releases page and unpack it into the project root. The scripts/ directory carries the canonical generator and CLI; everything works locally without an install.
Once installed, bootstrap Steplog onto a project from inside that project's directory:
steplog init
# or, in Claude Code:
/steplog-init
Three things appear:
.steplog/state.json: the canonical state file. Every section, activity, decision, and nudge lives here.
state.schema.json: the JSON Schema that validates the state file. Pinned to the Steplog version that initialised the project.
BUILD_LOG.html: the dashboard, regenerated from state.json on every meaningful write.
If you're using Claude Code, every session opens with a 4-line recap printed to stdout: section count, most recent ship/prod event, currently in-flight activity, top of the next-actions queue. The recap is deterministic: same state in, same recap out. Other agents (Codex, Cursor, Aider) see the same protocol via AGENTS.md at the repo root.
The Claude Code plugin's Stop hook fires at the end of every turn. If the turn changed code but didn't update state.json, the hook prints a one-line nudge reminding you to log an activity. The reminder is gentle and dismissible — it's a habit-builder, not a gate.
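The rule the Stop hook applies can be sketched in a few lines. This is an illustration of the decision only, not the plugin's actual code: the function name needs_nudge and the set of suffixes treated as "code" are assumptions made for the example.

```python
from pathlib import PurePosixPath

STATE_FILE = ".steplog/state.json"  # state file path as named in this guide

# Assumption for this sketch: which file suffixes count as "code".
CODE_SUFFIXES = {".py", ".js", ".ts", ".go", ".rs"}


def needs_nudge(changed_paths):
    """Return True when a turn touched code but left the state file alone."""
    changed = {PurePosixPath(p).as_posix() for p in changed_paths}
    touched_code = any(PurePosixPath(p).suffix in CODE_SUFFIXES for p in changed)
    touched_state = STATE_FILE in changed
    return touched_code and not touched_state
```

A turn that edits only documentation, or that edits code and state.json together, produces no nudge under this rule.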
The dashboard has three layers: a sticky command center at the top, a top grid below it (completion ring + category view), and a stack of modules underneath. Every panel has a small ? next to its title; hover for a one-paragraph explainer.
The command center is pinned to the top of the viewport. The Steplog logo lockup sits at the left; below it is a row of labeled chips: updated (relative time, hover for the absolute timestamp), agent of record, current schema version, a ? Guide link, and a theme toggle that cycles Auto / Light / Dark. The six-cell metric strip on the right covers PRD-completion, features-remaining, open nudges, mid-build, total activities, and sensitive-surface count. Three of the cells are derived (computed from the state file rather than read from it) and carry a small red pill so you can tell the difference at a glance; hovering a derived pill anywhere on the dashboard explains what derived means. Cells that point at a specific surface are clickable: features remaining and PRD complete jump you to the BUILD MAP, and open nudges jumps to the Nudges module. Clicking a category row in the Category Completion view filters the BUILD MAP to that category. Each click is followed by a 2-second highlight pulse on the destination so you can see where you landed.
The completion ring is a 152px conic-gradient gauge. The headline percentage counts only production-marked sections, that is, work that's running and being used right now. A teal band shows shipped sections that haven't yet been operator-marked production-ready; an amber band shows in-progress unshipped sections; grey is the rest. The number is honest about how much actually reached production, not how much was merged. Hover for the full derivation.
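As a rough sketch of how the four bands relate to the headline number: the code below assumes each section carries a stage field with the values "production", "shipped", or "in_progress". Those names are hypothetical, chosen for the example; check state.schema.json for the real ones.

```python
BANDS = ("production", "shipped", "in_progress")


def ring_bands(sections):
    """Split sections into the ring's four bands, as fractions of the total."""
    counts = {"production": 0, "shipped": 0, "in_progress": 0, "rest": 0}
    for section in sections:
        stage = section.get("stage")  # hypothetical field name
        counts[stage if stage in BANDS else "rest"] += 1
    total = len(sections)
    if total == 0:
        return {band: 0.0 for band in counts}
    return {band: n / total for band, n in counts.items()}


def headline_percent(sections):
    """Only the production band feeds the headline percentage."""
    return round(100 * ring_bands(sections)["production"])
```

With one section in each band, every band is a quarter of the ring and the headline reads 25, even though half the work has been merged.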
Seven dashboard modules sit below the top grid. Each has a uniform header — drag handle, title, subtitle, pin, collapse — and you can rearrange them however you want. The pin button now cycles three states: top-pin (📌, anchored to the top of the stack), bottom-pin (📍, anchored to the bottom), or unpinned (free-floating). Order, pin direction, and collapsed state persist in your browser per project.
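One way to picture the three pin states is as a stable sort over your drag order. The sketch below is an assumption about the ordering rule, not the dashboard's own code; the pin key and its "top" / "bottom" values are made up for the example.

```python
# Hypothetical encoding: "top", "bottom", or None (unpinned).
PIN_RANK = {"top": 0, None: 1, "bottom": 2}


def stack_order(modules):
    """Top-pinned first, unpinned next, bottom-pinned last.

    sorted() is stable, so the existing drag order is kept within each group.
    """
    return sorted(modules, key=lambda m: PIN_RANK[m.get("pin")])
```

Because the sort is stable, dragging a module only changes its position relative to others pinned the same way.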
The faint dashed arcs in the BUILD MAP's left rail show which sections depend on which. An arrow points from a prerequisite (tail) to a dependent (arrowhead). If you see an arrow from s_017 to s_049, that's saying s_049 can't fully ship until s_017 is far enough along. They're soft hints — sourced from the depends_on field on each section in state.json. The renderer doesn't enforce them; they're for reading, not for blocking.
v0.9 made them readable in two layers. Feature-011 s_056 widened the rail and varied the arc depth by arrow length — short dependencies bow tightly, long ones bow further left, so arrows of different spans visually separate instead of overlapping into one mass. Hover any arrow (s_057) and a tooltip names both endpoints with their section IDs and titles. Hover any section row (s_058) and that section's full network lights up: outgoing arrows in blue (sections that depend on this one), incoming in teal (sections this one depends on), connected rows tint subtly behind them.
Each section in .steplog/state.json has an optional depends_on array. Empty array (or absent) means no prerequisites. Each ID in the array becomes one arrow ending at this section's row.
{
  "id": "s_049",
  "title": "BUILD MAP readability redesign",
  "depends_on": ["s_017"],
  ...
}
The schema is defined in state.schema.json under spec_sections[].depends_on — array of strings matching the s_NNN pattern. Validation runs on every steplog show so a typo in an ID surfaces immediately.
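If you want to sanity-check depends_on entries yourself before rendering, something like the following would do. It is a standalone sketch, not Steplog's validator: it assumes "s_NNN" means exactly three digits (as in s_017 and s_049), and the check that each ID refers to an existing section goes beyond what a schema pattern alone can express.

```python
import re

SECTION_ID = re.compile(r"s_\d{3}")  # assumed reading of the s_NNN pattern


def check_depends_on(state):
    """Return a list of human-readable problems found in depends_on arrays."""
    sections = state.get("spec_sections", [])
    known = {s["id"] for s in sections}
    problems = []
    for s in sections:
        for dep in s.get("depends_on", []):  # absent means no prerequisites
            if not SECTION_ID.fullmatch(dep):
                problems.append(f"{s['id']}: malformed ID {dep!r}")
            elif dep not in known:
                problems.append(f"{s['id']}: unknown section {dep!r}")
            elif dep == s["id"]:
                problems.append(f"{s['id']}: depends on itself")
    return problems
```

An empty list means every arrow has a well-formed, existing prerequisite at its tail.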
To add a dependency, hand-edit .steplog/state.json: add the prerequisite's section ID to the relevant section's depends_on array, then re-render with steplog show. The new arrow appears in the next pass; hover it to verify the tooltip reads correctly.
There's no CLI shortcut for this yet — the depends_on field is operator-curated, intentionally hand-maintained so the dependency graph reflects deliberate design decisions rather than auto-derived chatter from commit history or import order. If the manual editing friction becomes real, a future feature can add a CLI subcommand; until then, JSON editing is the contract.
The ⓘ button next to the BUILD MAP heading opens a slide-out panel with your project's current dependency stats (live counts, most-needed section, examples). That's the in-product equivalent of this section.
If your project has multiple worktrees, the topper carries a three-way branch indicator: committed (canonical branch only), working (current branch), all_branches (union view across worktrees with a small per-section pill showing which branches each section appears on). Switching modes requires re-rendering — each button's tooltip carries the exact steplog show --mode X command.
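The all_branches view is essentially a union with provenance. A minimal sketch, assuming you already have the section IDs seen on each branch (how Steplog gathers them from worktrees is not shown here, and the branch names in the usage note are invented):

```python
def union_view(sections_by_branch):
    """Map each section ID to the sorted list of branches it appears on."""
    pills = {}
    for branch, section_ids in sections_by_branch.items():
        for section_id in section_ids:
            pills.setdefault(section_id, set()).add(branch)
    return {sid: sorted(branches) for sid, branches in sorted(pills.items())}
```

Given {"main": ["s_017", "s_049"], "feature-011": ["s_049", "s_056"]}, section s_049 would carry a pill for both branches while the other two carry one each.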
Steplog has a small, deliberate contract with whatever agent is working on your project:
state.json: a build step, a design decision, a ship event, a bug fix.
The full contract lives in AGENT_PROTOCOL.md at the repo root. Every agent on the project (Claude Code, Codex CLI, Cursor, Aider) reads the same protocol.
Every section and most activities carry a technical description (for someone reading a diff) and a less-technical description (plain language, second person, no jargon). The toggle in the upper-right corner of the dashboard switches every visible description between the two. The button reads Less Technical by default — that's intentional: the rename from "Plain" to "Less Technical" frames the content as having two registers, not the reader as the simpler audience.
When you ask an agent to start building from a feature PRD, the protocol now includes a short ritual that fires first. The agent walks the PRD against the actual repo, looking for things like file paths that don't exist, schema versions that don't match, section IDs that collide, or convention assumptions that aren't true. The output is a Markdown review document at PRD/feature-NNN-prd-review.md with a verdict: APPROVE, APPROVE WITH REVISIONS, or BLOCK. You read it; you decide.
This is the fourth verification layer in the system. Three are automatic — the pre-commit hook (catches commits without state updates), the Claude Code Stop hook (catches per-turn agent neglect), the drift detector (catches files added outside any agent). The PRD-review ritual is the manual one. It catches the class of mistake the others can't see: a PRD asserting something about your repo that turns out to be wrong. It exists because two real catches (a missing mockup, an archive folder convention conflict) only surfaced because someone happened to look. The ritual makes the looking systematic.
The full ritual prose — when it fires, what the five categories check, what the output looks like — is in AGENT_PROTOCOL.md under "PRD Review ritual". A test fixture at src/steplog/migrations/test/fixtures/broken-prd-example.md shows the kind of defects the ritual is designed to catch.
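One of the checks mentioned above, file paths that don't exist, is easy to picture in code. The sketch below is not the ritual itself (an agent performs that by reading the PRD); it assumes the PRD cites paths in backticks and treats any backticked token containing a slash as a path. The function name missing_paths is hypothetical.

```python
import re
from pathlib import Path

# Assumption: paths in the PRD are wrapped in backticks and contain no spaces.
BACKTICKED = re.compile(r"`([^`\s]+)`")


def missing_paths(prd_text, repo_root):
    """List backticked paths in the PRD that do not exist under repo_root."""
    root = Path(repo_root)
    missing = []
    for token in BACKTICKED.findall(prd_text):
        if "/" not in token:
            continue  # skip plain identifiers such as `depends_on`
        if not (root / token).exists() and token not in missing:
            missing.append(token)
    return missing
```

A non-empty result is the kind of finding that would push a review toward APPROVE WITH REVISIONS or BLOCK; the verdict itself stays a human decision.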
v0.9 is a visual + UX polish pass. The dashboard goes from data-rich-but-rough to readable, branded, and bendable. Nothing changes about what Steplog records or how — only how the dashboard reads.
The Steplog Signal Steps lockup now sits in the topper. The browser tab carries a matching favicon. The meta line is restructured into labeled chips — updated (relative time, hover for the absolute timestamp), agent, version, a ? Guide link to this page, and a theme toggle (next subsection).
The topper has a theme toggle that cycles Auto → Light → Dark. Auto follows your OS's prefers-color-scheme preference live; Light and Dark force the theme regardless. Your choice persists per-project per-machine in localStorage. A project can also commit a default theme into state.ui_preferences.theme for any new visitor; your local override takes precedence.
The bootstrap script in <head> applies your theme before any visual content paints, so there's no flash of wrong-theme content on load. The logo automatically swaps to its dark variant when the theme is dark.
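The precedence described above (local override, then project default, then Auto following the OS) can be written down compactly. The real logic runs in the browser; this Python version is only a sketch of the decision order, and the lowercase theme values are an assumption.

```python
VALID = {"auto", "light", "dark"}  # assumed stored values


def resolve_theme(local_choice, project_default, os_prefers_dark):
    """Pick the theme to paint: local override, then project default, then Auto."""
    choice = local_choice if local_choice in VALID else project_default
    if choice not in VALID:
        choice = "auto"
    if choice == "auto":
        # Auto follows the OS preference.
        return "dark" if os_prefers_dark else "light"
    return choice
```

Here local_choice stands for the value kept in localStorage and project_default for state.ui_preferences.theme; either may be missing.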
Two strips of filter chips sit above the BUILD MAP card: category (multi-select; toggle CORE / SECURITY / ENHANCEMENT / INFRASTRUCTURE / RESEARCH / COMPLIANCE individually, or hit ALL to clear) and lifecycle (single-select: ALL / PROD ONLY / NOT IN PROD / IN PROGRESS). The BUILD MAP header reads "showing X of Y sections"; an "n filter(s) active" pill on the right appears when filters are non-default and gives a one-click clear. Filter state persists across both reload paths (manual reload + the 2-minute auto-refresh).
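How the two strips combine can be sketched as a single predicate: a section must match one of the selected categories and the one selected lifecycle mode. The category and stage field names, and the stage values, are assumptions for the example rather than the real schema.

```python
def apply_filters(sections, categories=frozenset(), lifecycle="ALL"):
    """Filter sections by category (multi-select) and lifecycle (single-select).

    An empty category set behaves like the ALL chip.
    """
    def lifecycle_ok(section):
        stage = section.get("stage")  # hypothetical field name
        if lifecycle == "PROD ONLY":
            return stage == "production"
        if lifecycle == "NOT IN PROD":
            return stage != "production"
        if lifecycle == "IN PROGRESS":
            return stage == "in_progress"
        return True  # "ALL"

    return [
        s for s in sections
        if (not categories or s.get("category") in categories) and lifecycle_ok(s)
    ]


def header_text(shown, total):
    return f"showing {shown} of {total} sections"
```

Selecting CORE plus NOT IN PROD, for example, keeps only CORE sections that have not reached production.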
Row text now reads at 14px (was 11px); titles and PRD anchors stack vertically with proper spacing. Status cells became compact pills — a small colored dot (whose fill density encodes done / in-progress / planned / not-started) plus the stage label. All dependency arrows reroute mechanically to the new pill positions. Total row width is unchanged.
Clicking a metric box that points at a specific surface now smooth-scrolls you there with a brief 2-second pulse so you can see where you landed. FEATURES REMAINING and PRD COMPLETE jump to the BUILD MAP; OPEN NUDGES jumps to the Nudges module; clicking a category row jumps to the BUILD MAP and filters it to that category. Hovering any DERIVED pill (anywhere on the dashboard) reveals a small tooltip after a brief delay: "Calculated by Steplog from current state — not directly recorded."
Hover any chip in the PRD Lifecycle stacked column and a small folder button (📁) appears in its corner. Click it to archive the chip from this view; it moves to the new LIFECYCLE ARCHIVE module pinned at the bottom of the page, and a 5-second toast offers Undo. Each archived pill carries an unarchive arrow (↑) that returns it. Archived chips still appear in BUILD MAP, NOW, and the timeline — only the PRD Lifecycle stacked column hides them.
The pin button on every module head now cycles three states: top-pin (📌), bottom-pin (📍), or unpinned. The LIFECYCLE ARCHIVE module ships pinned-bottom by default; future modules can opt into either direction. Pinned modules respect their direction during drag-reorder.
v0.9 bumps state.steplog_version from 0.8.0 to 0.9.0 via migration 006_add_ui_preferences_and_chip_archive.py. Two additive fields: a top-level ui_preferences object (theme and default_layout_preset, both optional) and a per-section prd_lifecycle_archived boolean (default false when missing). Existing state files validate cleanly after migration with no data loss.
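To make "additive" concrete, here is roughly what such a migration does to a state file. This is a sketch under assumptions, not the contents of 006_add_ui_preferences_and_chip_archive.py: it assumes sections live under spec_sections (the name used in the schema path earlier) and that the migration writes the default values explicitly.

```python
import copy


def migrate_0_8_to_0_9(state):
    """Return a new state dict with the two additive v0.9 fields in place."""
    if state.get("steplog_version") != "0.8.0":
        raise ValueError("expected a 0.8.0 state file")
    new_state = copy.deepcopy(state)  # leave the caller's copy untouched
    new_state.setdefault("ui_preferences", {})  # theme and layout preset stay optional
    for section in new_state.get("spec_sections", []):
        section.setdefault("prd_lifecycle_archived", False)
    new_state["steplog_version"] = "0.9.0"
    return new_state
```

Nothing is removed or renamed, which is why existing data survives the upgrade unchanged.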
When a section's work is done and you want it out of the active build map but kept in the project history:
steplog archive s_023
# or:
/steplog-archive s_023
Archived sections still count toward the completion ring and category dots — they just move into a "Show archive" toggle at the bottom of the build map. Reverse with steplog unarchive s_023.
The renderer reads which view to render at invocation time. To switch:
steplog show --mode committed # main only
steplog show --mode working # current branch only (default)
steplog show --mode all_branches # union across worktrees
The default button in the topper's layout strip clears any pinned, collapsed, or reordered state for the current project. Per-browser; doesn't affect other machines.
Every migration and operator-driven mutation writes a snapshot to .steplog/backups/ first. If something goes wrong:
steplog migrate --rollback
That restores the most recent snapshot. MIGRATIONS.md at the repo root is the human-readable audit trail of every schema upgrade.
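The snapshot-then-mutate pattern behind this is simple enough to sketch. The code below illustrates the idea only; the snapshot file naming (state-<nanosecond timestamp>.json) is invented for the example and is not how Steplog names its backups.

```python
import shutil
import time
from pathlib import Path


def _stamp(path):
    """Numeric timestamp embedded in a snapshot file name."""
    return int(path.stem.split("-")[1])


def snapshot(state_path, backups_dir):
    """Copy the state file into backups_dir before a mutation."""
    backups = Path(backups_dir)
    backups.mkdir(parents=True, exist_ok=True)
    target = backups / f"state-{time.time_ns()}.json"  # hypothetical naming
    shutil.copy2(state_path, target)
    return target


def rollback(state_path, backups_dir):
    """Restore the most recent snapshot over the state file."""
    snapshots = sorted(Path(backups_dir).glob("state-*.json"), key=_stamp)
    if not snapshots:
        raise FileNotFoundError("no snapshots to restore")
    shutil.copy2(snapshots[-1], state_path)
    return snapshots[-1]
```

Taking the snapshot first means a failed write can always be undone by restoring the newest file in the backups directory.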
Steplog v0.9.0 is Stage 1 of a longer staged path. Things explicitly outside the current scope:
steplog migrate per worktree to bring them in.
If you spot drift in the dashboard's rendering, or a feature that would have saved you ten minutes today, log it as a note activity in state.json. The next time you open the dashboard you'll see it in the timeline; the time after that you'll have decided whether it's worth a feature pack.