CLI reference
All commands, flags, and environment variables.
Synopsis
cantrip [run] [OPTIONS] [PATH]
cantrip compare CHARM_A CHARM_B
cantrip export-transcript PATH [OPTIONS]
cantrip hooks test EVENT [--payload JSON] [--path DIR]
cantrip skill export NAME PATH [--charm-path DIR] [--force]
cantrip checkpoints list [--db PATH] [--task-id ID]
cantrip checkpoints show [--db PATH] TASK_ID STEP_NAME ORDINAL
cantrip checkpoints delete [--db PATH] --task-id ID [--yes]
cantrip audit list [--path PATH] [--task-id ID] [--action KIND] [--tool NAME]
cantrip audit export [--path PATH] [--format {jsonl,csv}]
cantrip permissions test TOOL [OPTIONS]
cantrip permissions list [--charm-path DIR] [--user-config DIR] [--no-builtin]
cantrip docs index [--site NAME | --all] [--embed-provider NAME] [--embed-model MODEL]
cantrip docs list [--root PATH]
cantrip docs search SITE QUERY [--top-k N]
cantrip --version
cantrip --help
The run subcommand is the default — you can omit it.
cantrip /path/to/charm is equivalent to
cantrip run /path/to/charm.
Which command to reach for
| Command | Reach for it when |
|---|---|
| cantrip / cantrip run | You want a normal interactive session to build or improve a charm. |
| cantrip compare | You need a charm-aware diff between two charm trees. |
| cantrip export-transcript | You want to save a session for review, sharing, or CI artefacts. |
| cantrip hooks test | You are iterating on cantrip.hooks.yaml and want to fire one synthetic event locally. |
| cantrip skill export | You want to re-export a discovered skill as a standard SKILL.md bundle. |
| cantrip checkpoints ... | You need to inspect or clear durable-execution checkpoints in a .cantrip session file. |
| cantrip audit ... | You want to inspect the policy-decision trail after a run or export it for spreadsheets and CI logs. |
| cantrip permissions ... | You want to see what the permission gate would do before the agent makes a real tool call. |
| cantrip docs ... | You want to build or query the local documentation index that powers @docs. |
| Slash commands | You are already in chat and want to drive built-in features without another shell command. |
cantrip run
Start the agent and build or improve a charm.
Positional arguments
- PATH
- Path to the charm project directory. Defaults to the current directory. The directory is created if it does not exist.
Provider and model
- --provider {gemini,claude,inference-snap,fireworks,openrouter,opencode-zen,openai-compatible}
- LLM provider to use. Default: gemini. See Choose an LLM provider for a full comparison.
- --model MODEL
- Specific model name. Provider-dependent. When omitted, the provider's default model is used. Required with --provider openai-compatible.
- --snap SNAP_NAME
- Inference snap name when using --provider inference-snap. Default: gemma3.
- --base-url URL
- API base URL override. Required with --provider openai-compatible (e.g. https://api.together.xyz/v1). Optional for inference-snap (overrides snap discovery), fireworks, openrouter, and opencode-zen (for proxies or compatible hosts).
- --snap-read-timeout SECONDS
- HTTP read timeout for the inference-snap provider's chat completions. Default: 1200 s (20 min), long enough for a worst-case big-file rewrite on the slowest local snap. Lower this on faster GPUs to fail fast on stuck generations. Falls back to the CANTRIP_SNAP_READ_TIMEOUT environment variable when omitted.
- --short-session {on,off,auto}
- Short-session mode for tight-context models. auto (the default) turns it on for providers with less than ~16 K usable context (small local inference snaps such as gemma4) and off for everything else; on / off force it. When active, Cantrip compacts at 50 % of the window (instead of 80 %), replaces the prose-summary compaction with a one-line-per-tool-call history ledger that drops the raw older messages, and treats each turn as a near-fresh conversation. This trades some cross-edit memory for the ability to actually finish a multi-edit task on a model whose system prompt plus tool schemas already fill a third of the window. The status bar shows a [short-session] chip while it is on, and /cost reports the compaction strategy in use. Tight-context providers (this mode, or any provider that caps its tool array; inference snaps cap at 12) are also offered a curated tool slice scoped to the active workflow phase rather than the full toolbox; the status bar shows a [build · 11]-style chip and /cost names the phase. See CANTRIP_TOOL_PHASE below to pin it. Falls back to the CANTRIP_SHORT_SESSION environment variable when omitted.
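The compaction thresholds described for --short-session can be sketched as a small decision function. This is an illustration only, not Cantrip's implementation: the function names are hypothetical, and the 16,000-token cut-off is an assumed stand-in for the "~16 K" figure quoted above.

```python
# Assumed cut-off for "tight context"; the reference only says "~16 K".
TIGHT_CONTEXT_TOKENS = 16_000


def short_session_active(mode: str, usable_context_tokens: int) -> bool:
    """Resolve a --short-session value (on, off, auto) to a boolean."""
    if mode == "on":
        return True
    if mode == "off":
        return False
    if mode == "auto":
        # auto enables the mode only for tight-context providers.
        return usable_context_tokens < TIGHT_CONTEXT_TOKENS
    raise ValueError(f"unknown short-session mode: {mode!r}")


def compaction_trigger_tokens(mode: str, usable_context_tokens: int) -> int:
    """Token count at which compaction would start: 50 % or 80 % of the window."""
    percent = 50 if short_session_active(mode, usable_context_tokens) else 80
    return usable_context_tokens * percent // 100
```

Under these assumptions an 8,000-token window in auto mode would compact at 4,000 tokens, while a 128,000-token window would wait until 102,400.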
Light model (cost routing)
- --light-model MODEL
- Cheaper model for internal tasks (compaction, research summaries). Auto-detected if omitted.
- --light-provider {gemini,claude,inference-snap,fireworks,openrouter,opencode-zen}
- Use a different provider for light tasks, enabling hybrid mode.
- --light-snap SNAP_NAME
- Lighter inference snap for internal tasks (e.g. nemotron-3-nano).
Interface
- --no-tui
- Run in CLI mode (command-line REPL) without the terminal UI. See Choose an interface for when to prefer the CLI REPL over the TUI, Web UI, or print mode.
- --web
- Run with a browser-based Web UI instead of the TUI. See Choose an interface for browser-vs-terminal tradeoffs and Web-specific caveats.
- --web-port PORT
- Port for the Web UI. Default: 8471.
- --objective TEXT
- Free-text user-prose objective for the session, e.g. --objective "build a Postgres charm with COS plus Pebble notices". Stored on the session and used by Ralph re-feed and goal-aware status surfaces in place of the --charm-name + --charm-type paraphrase. Update mid-session with the /goal slash command. Persists across cantrip resume.
- --theme THEME
- TUI colour theme. Options: cantrip, ubuntu, monokai, solarized-dark, light.
Behaviour
- --improve CHARM_PATH
- Audit and improve an existing charm at the given path instead of building a new one.
- --watcher
- Start the event watcher on launch. The watcher monitors the dev model for status changes and creates diagnostic tasks automatically.
- --concurrency N
- Maximum number of concurrent subagent tasks. Default: 3.
- --no-snapshots
- Disable per-turn working-tree snapshots. By default Cantrip commits the charm tree into a hidden git repo before every user turn so /undo and /redo can roll back agent edits. Use this flag (or set CANTRIP_SNAPSHOTS=false) when working in a monorepo where snapshotting is too slow. See Undo agent changes.
- --no-auto-lint
- Disable per-edit lint feedback. By default Cantrip runs ruff and ty on every Python file the agent writes, and charmlint on charm YAML (metadata.yaml, charmcraft.yaml, actions.yaml, config.yaml), then appends the diagnostics to the tool result so the agent can react to lint and type errors in the same turn. Edits succeed even when the linter reports issues: diagnostics are advisory, not gating. Use this flag if the linters are unavailable or the inline feedback is noisy in your workflow.
- --architect
- Architect/editor two-model split. Each agent turn runs in two passes: an architect pass on the main model emits a plain-prose proposal (no tool calls), then an editor pass on a cheaper model translates the proposal into actual fs_edit / fs_write tool calls. Both passes appear separately in /cost. Toggle mid-session with /architect. See Use architect mode.
- --editor-provider NAME
- Override the editor provider when --architect is on. Useful for hybrid combinations like architect=Claude, editor=Gemini-Flash. Ignored without --architect.
- --editor-model SLUG
- Override the editor model slug when --architect is on. Defaults to the configured editor provider's default model. Ignored without --architect.
- --no-auto-commit
- Disable per-turn auto-commit. By default every turn that mutates files lands as a discrete git commit in the charm repo with a Cantrip co-author trailer; pre-existing dirty work commits separately as chore(pre-cantrip): save in-progress work. Use this flag (or /auto-commit off mid-session) when you prefer to batch agent edits into your own commits. See Auto-commit per turn.
cantrip compare
Diff two charm implementations along four dimensions and print a
human-readable report. Useful for evaluating a Cantrip-generated
charm against a hand-crafted or upstream one without running a
full diff -r. Reads both modern
charmcraft.yaml and the legacy
metadata.yaml / config.yaml /
actions.yaml split so charms on either layout
compare cleanly.
Positional arguments
- CHARM_A
- First charm directory. Required.
- CHARM_B
- Second charm directory. Required.
Output
The report groups drift into sections:
- Structure: which landmark files and directories exist on each side (e.g. tests/integration, terraform, .github/workflows).
- Config options: added, removed, and changed config.options keys, with both sides' values printed for every changed option.
- provides / requires / peers: relation endpoints, compared by name and interface.
- Actions, Containers, Extensions: sets of names.
- Tests: unit- and integration-test file counts, always rendered even when identical so "both zero" is itself a visible finding.
Sections that match on both sides render as
(identical — same X) so your eye can skip
straight to drift.
cantrip export-transcript
Export a session transcript from a .cantrip file.
Positional arguments
- PATH
- Charm directory containing a .cantrip session file. Required.
Options
- --format {html,markdown,jsonl}
- Output format. Default: html.
- --output FILE
- Output file path. Default: transcript.<ext> in the charm directory.
- --task TASK_ID
- Export only a specific task and its subagent conversation.
- --phase {research,build,deploy,test}
- Export only tasks in the given phase.
- --since TIMESTAMP
- Export only messages and events at or after the given ISO timestamp (e.g. 2026-04-15T10:00:00Z).
- --branch TURN_ID
- Export the conversation path leading to a specific turn id. Without this flag, the export follows the session's currently active branch, so a forked session exports only the active path by default. Off-branch turns stay reachable: list them with /tree and re-export with --branch <id> when needed.
- --page-size N
- Split HTML output into pages of N conversation messages each. Creates numbered files (transcript_1.html, transcript_2.html, etc.) with navigation links.
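To make the --page-size behaviour concrete, the sketch below splits a message list into pages and assigns the numbered file names quoted above. It is a hypothetical helper, not Cantrip's exporter, and it assumes the final page may simply be shorter than N.

```python
from typing import Dict, List, Sequence


def paginate(messages: Sequence[str], page_size: int) -> Dict[str, List[str]]:
    """Map transcript_<n>.html file names to the messages each page would hold."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    pages = [
        list(messages[start:start + page_size])
        for start in range(0, len(messages), page_size)
    ]
    # Pages are numbered from 1, matching transcript_1.html, transcript_2.html, ...
    return {f"transcript_{n}.html": page for n, page in enumerate(pages, start=1)}
```

With five messages and a page size of 2, this yields three files, the last of which holds a single message.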
cantrip hooks
Manage user-defined hooks configured in
~/.config/cantrip/hooks.yaml or
cantrip.hooks.yaml. See
How to configure hooks for the
schema.
cantrip hooks test
Fires a synthetic event against the loaded hook config and
prints per-hook results — exit code, duration,
stdout/stderr excerpts, veto / timeout status. Useful while
authoring a config to check an if: filter
matches and a run: command exits cleanly. Exits
0 on success, 2 on argument / JSON errors.
cantrip hooks test EVENT [--payload JSON] [--path DIR]
- EVENT (required)
- One of the hook event names: pre_tool_call, post_tool_call, pre_compact, post_compact, pre_subagent, post_subagent, plus the reserved-for-later names.
- --payload JSON
- Optional JSON object merged into the synthetic event alongside the auto-added event and timestamp fields. Your if: filter evaluates against the merged payload. Must be a JSON object; lists and scalars are rejected with exit 2.
- --path DIR
- Repo root for cantrip.hooks.yaml discovery. Defaults to the current working directory. The user-scope config at ~/.config/cantrip/hooks.yaml is always loaded; repo hooks with a colliding name override user hooks.
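The --payload rules above (object only, merged alongside event and timestamp) can be sketched as follows. The function name is hypothetical, and one detail is an assumption: the reference does not say which side wins if the payload itself contains an event or timestamp key, so this sketch lets the auto-added fields take precedence.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def build_synthetic_event(event: str, payload_json: Optional[str] = None) -> Dict[str, Any]:
    """Merge a JSON-object payload with the auto-added event and timestamp fields."""
    payload: Any = {}
    if payload_json is not None:
        # json.JSONDecodeError is a ValueError subclass, so bad JSON also raises ValueError.
        payload = json.loads(payload_json)
    if not isinstance(payload, dict):
        # Lists and scalars are rejected; the CLI reports this case with exit code 2.
        raise ValueError("--payload must be a JSON object")
    merged = dict(payload)
    merged["event"] = event  # assumption: auto-added fields override payload keys
    merged["timestamp"] = datetime.now(timezone.utc).isoformat()
    return merged
```

A caller mimicking the CLI would catch ValueError and exit with status 2.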
cantrip skill
Manage Cantrip skills. See How to add a custom skill for the standard SKILL.md format and the three directories Cantrip discovers skills from.
cantrip skill export
Writes a discovered skill to a file in the standard SKILL.md
format — the same shape Claude Code, gh skill, Cursor, Codex,
Gemini CLI, and Windsurf use. Works on the bundled skills and
on user skills under ~/.claude/skills/ or
~/.config/cantrip/skills/. The exported file drops straight
back into any of those trees and Cantrip re-imports it without
translation. Exits 0 on success, 2 on unknown-skill or
existing-target errors.
cantrip skill export NAME PATH [--charm-path DIR] [--force]
- NAME (required)
- Name of the skill to export, as listed in index.list_skills() or in the <available_skills> system-prompt block.
- PATH (required)
- Output path. A .md path is honoured verbatim (single-file layout); any other path is treated as a directory and the file is written as <path>/<name>/SKILL.md (directory layout). Parent directories are created as needed.
- --charm-path DIR
- Path whose occurrences are replaced with the literal <CHARM_PATH> placeholder in the exported body. Defaults to no charm-path scrubbing. Secret scrubbing (GitHub tokens, AWS keys, Bearer … values, password=… pairs, Slack tokens) always runs.
- --force
- Overwrite the target file if it already exists. Without this flag Cantrip refuses to clobber an existing file and exits 2.
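The two PATH layouts for skill export can be illustrated with a short path-resolution sketch. The helper name is hypothetical, and treating the .md suffix as case-sensitive is an assumption the reference does not confirm.

```python
from pathlib import PurePosixPath


def resolve_export_target(path: str, name: str) -> PurePosixPath:
    """Return the file a skill export would write to under the documented rules."""
    target = PurePosixPath(path)
    if target.suffix == ".md":
        # Single-file layout: a .md path is used verbatim.
        return target
    # Directory layout: <path>/<name>/SKILL.md
    return target / name / "SKILL.md"
```

For example, exporting a skill named my-skill to out/ would resolve to out/my-skill/SKILL.md, while out/my-skill.md would be used as given.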
cantrip checkpoints
Inspect and surgically remove step-level durable-execution
checkpoints stored under a session's .cantrip SQLite file.
Cantrip persists each LLM turn and each tool call for an
in-flight subagent task so that an interrupted run resumes from
the last completed step instead of re-burning tokens from turn 1.
The cantrip checkpoints subcommand is the out-of-band surface
for that state — usually you won't touch it, but it's there when
a stale row is masking a fix or when you want to see what's
cached before deciding whether to resume.
All three subcommands accept --db PATH (default: ./.cantrip).
cantrip checkpoints list
Prints a compact table per task showing every stored step
(llm_turn#N or tool:<name>#N), the storage kind, the first
12 characters of the input hash, and the creation timestamp.
With no filter, every task that has checkpoints is listed.
cantrip checkpoints list [--db PATH] [--task-id ID]
- --db PATH
- Path to the .cantrip session file. Default: ./.cantrip.
- --task-id ID
- Filter to a single task id. Default: list every task with checkpoints.
cantrip checkpoints show
Pretty-prints a single stored blob. JSON-encoded kinds
(llm_response, tool_result, value) are decoded and printed
with json.dumps(..., indent=2, sort_keys=True); bytes kinds
are printed as base64.
cantrip checkpoints show TASK_ID STEP_NAME ORDINAL [--db PATH]
- TASK_ID (required)
- Task id the checkpoint belongs to, from cantrip checkpoints list or the transcript viewer.
- STEP_NAME (required)
- Step name, e.g. llm_turn or tool:read_file.
- ORDINAL (required)
- 1-based ordinal within the step. The N-th llm_turn call for the task has ordinal N.
Exits 1 when no row matches.
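The rendering rule for checkpoints show (JSON kinds pretty-printed, everything else base64) is small enough to sketch. The function name is hypothetical and the blob is assumed to arrive as raw bytes; only the json.dumps arguments and the three JSON kind names come from this reference.

```python
import base64
import json

# Kinds this reference lists as JSON-encoded.
JSON_KINDS = {"llm_response", "tool_result", "value"}


def render_blob(kind: str, blob: bytes) -> str:
    """Pretty-print a stored checkpoint blob according to its storage kind."""
    if kind in JSON_KINDS:
        return json.dumps(json.loads(blob), indent=2, sort_keys=True)
    # Any other kind is treated as opaque bytes and shown as base64.
    return base64.b64encode(blob).decode("ascii")
```

Sorting keys keeps the output stable between runs, which helps when diffing two checkpoints by eye.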
cantrip checkpoints delete
Purges every checkpoint for one task. Useful when a stale row
is suspected of masking a real change, or when you want to force
a fresh run from turn 1 without setting
CANTRIP_NO_RESUME for the whole session.
cantrip checkpoints delete --task-id ID [--db PATH] [--yes]
- --task-id ID (required)
- Task id whose checkpoints should be removed.
- --yes
- Skip the interactive y/N confirmation prompt. Intended for scripted use.
cantrip audit
Read the JSONL policy-decision trail written during agent and subagent tool execution. Reach for it after unattended runs, when a tool call was denied unexpectedly, or when you want a machine-readable record to attach to CI artefacts.
cantrip audit list
Print audit entries as JSONL, optionally filtered by task, outcome, or tool. This is the quickest way to answer questions like "why did that call get denied?" or "which task triggered a review request?".
cantrip audit list [--path PATH] [--task-id ID]
[--action {allowed,denied,review-requested,rate-limited}]
[--tool NAME]
- --path PATH
- Audit file to read. Defaults to <cwd>/.cantrip-audit.jsonl.
- --task-id ID
- Filter to a single task id.
- --action ...
- Filter to one decision kind: allowed, denied, review-requested, or rate-limited.
- --tool NAME
- Filter to one exact tool name.
The command exits 1 if the audit file does not exist so
automation can fail clearly instead of silently reading nothing.
cantrip audit export
Re-emit the audit trail in a format that composes with downstream tools. JSONL passes through unchanged; CSV is useful for spreadsheet review or attaching a compact artefact to CI.
cantrip audit export [--path PATH] [--format {jsonl,csv}]
- --path PATH
- Audit file to read. Defaults to <cwd>/.cantrip-audit.jsonl.
- --format {jsonl,csv}
- Output format. jsonl prints the original lines; csv writes one row per decision with JSON-encoded arguments in the final column.
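For readers post-processing the trail themselves, the sketch below shows one way a JSONL audit file could be flattened into CSV with the arguments JSON-encoded in the last column. The field names task_id, action, tool, and arguments are assumptions for illustration; this reference does not document the audit record schema.

```python
import csv
import io
import json
from typing import Sequence


def audit_to_csv(jsonl_text: str, columns: Sequence[str] = ("task_id", "action", "tool")) -> str:
    """Convert JSONL audit entries to CSV, one row per decision."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([*columns, "arguments"])
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        entry = json.loads(line)
        # Hypothetical field names; arguments stay JSON-encoded in the final column.
        encoded_args = json.dumps(entry.get("arguments", {}), sort_keys=True)
        writer.writerow([*(entry.get(col, "") for col in columns), encoded_args])
    return out.getvalue()
```

Keeping the arguments as one JSON string avoids inventing a column per argument name, which would vary from tool to tool.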
cantrip permissions
Inspect the per-call permission gate without standing up the full
agent. Discovers the same three layers that the runtime uses —
built-in safe defaults, user-wide
~/.config/cantrip/permissions.yaml, and the per-charm
<charm>/.cantrip/permissions.yaml — composes them in the same
order, and exposes the resulting decision through two
subcommands.
cantrip permissions test
Evaluates one hypothetical tool call against the discovered
ruleset and prints the verdict, the matched rule, and the source
file (or builtin:<section>). Useful while authoring
permissions.yaml to confirm a local override actually loosens
the built-in safe default — and to catch the mirror case where a
typo in your glob silently lets a destructive command through.
Exits 0 on a successful evaluation regardless of the verdict; 2
on argument errors.
cantrip permissions test TOOL [--command CMD] [--path PATH]
[--agent NAME]
[--charm-path DIR] [--user-config DIR]
[--no-builtin] [--show-rules]
- TOOL (required)
- Tool name to test (e.g. run_command, read_file, juju_status). The tools section matches against this.
- --command CMD
- Bash command string for the bash section. Only contributes when TOOL is in bash_tools (default {run_command}); otherwise the bash section is skipped.
- --path PATH
- Path argument for the paths section. Mirrors the path, file_path, or filename argument the tool would receive at runtime.
- --agent NAME
- Activate a per-agent overlay defined under agents:. Matches SubagentContext.task.category.value (e.g. RESEARCH, BUILD). Overlays only tighten: across sections the most restrictive verdict wins.
- --charm-path DIR
- Repo root for .cantrip/permissions.yaml discovery. Defaults to the current working directory.
- --user-config DIR
- User config directory for permissions.yaml. Defaults to ~/.config/cantrip.
- --no-builtin
- Skip the built-in safe defaults (rm -rf * → deny, sudo * → ask, .env reads → deny, etc.). Only file-loaded rules are evaluated. Useful when probing a config in isolation.
- --show-rules
- Append the full loaded ruleset listing after the verdict (same output as cantrip permissions list).
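The "most restrictive verdict wins" rule mentioned under --agent can be sketched as a ranking over verdicts. The verdict names deny and ask appear in this reference; allow and the exact ordering are assumptions made for illustration, as is the function name.

```python
from typing import Iterable

# Assumed ordering from least to most restrictive.
SEVERITY = {"allow": 0, "ask": 1, "deny": 2}


def most_restrictive(verdicts: Iterable[str]) -> str:
    """Combine per-section verdicts; the strictest one wins."""
    verdicts = list(verdicts)
    if not verdicts:
        raise ValueError("at least one verdict is required")
    return max(verdicts, key=lambda verdict: SEVERITY[verdict])
```

So if the tools section allows a call but the bash section asks for confirmation, the combined result under this sketch is ask.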
cantrip permissions list
Prints every loaded permission rule grouped by section
(tools, bash, paths) and
per-agent overlay, with each rule's source file or
builtin:<section>. Helpful as a sanity check
that your YAML actually parsed and composed in the order you
expect.
cantrip permissions list [--charm-path DIR] [--user-config DIR]
[--no-builtin]
Flags match permissions test. With --no-builtin and no
user/repo file present the listing prints
"No permission rules loaded." so an empty config is visible at
a glance instead of looking the same as a missing one.
cantrip docs
Index Canonical's documentation surfaces (Juju, ops,
charmcraft, rockcraft, jubilant, Charmhub) so the agent's
docs_search tool and the @docs <site> <query> mention can
return cited passages instead of paraphrasing from memory.
Requires an embed provider configured via the
role router.
- cantrip docs index --site <name>
- Crawl the named site, chunk and embed every page, and upsert into the per-site SQLite cache under ~/.cache/cantrip/docs-index/<name>/index.db. Re-running replaces rows with a stable (url, ordinal) hash, so there is no need to clear the cache by hand. Pass --all to index every registered site. Per-page errors (404s, timeouts) are absorbed; the rest of the crawl proceeds.
- cantrip docs list
- Print every registered site with its index status and chunk count. Pass --root <path> to read from a non-default cache directory.
- cantrip docs search <site> <query>
- Embed <query> through the configured embed provider, cosine-search the per-site store, and print the top --top-k hits (default 5) with score, URL, and excerpt. Composes with shell pipelines.
The how-to guide walks through one-time setup and the in-chat surfaces.
Slash commands
Type these directly into the chat (TUI or Web) to drive features that don’t go through the LLM. Output renders as a system message; the agent is not consulted.
Memory
Manage durable lessons across sessions. See the memory how-to for workflow.
- /memory [scope]
- List every remembered entry. scope is optional (charm or global); omit to list both.
- /memory help
- Print the full syntax block.
- /remember <kind> [scope] -- <title> -- <body>
- Record a new memory. kind is fact, rule, or lesson; scope defaults to charm. The -- separator (space dash dash space) lets titles and bodies contain any punctuation.
- /forget <title> [scope]
- Delete a memory by exact title. Quoted titles with whitespace are supported. When the same title exists in both scopes the handler refuses with an ambiguous message rather than guessing.
- /memory export <name> <output_path> [scope]
- Bundle memories into a SKILL.md file at <output_path>/<name>/SKILL.md (or directly at <output_path> when it ends in .md). Charm paths are replaced with <CHARM_PATH> and obvious secrets (GitHub/AWS tokens, Bearer, password=, Slack) are scrubbed.
- /memory export-md <output_dir> [scope]
- Write one Markdown file per memory under the directory. This is the companion format for gist or PR-style sharing.
- /memory import <source_path> [target_scope]
- Read a SKILL.md or a directory of memory .md files and merge into the target scope (global by default). Duplicates skip by default.
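The /remember syntax above splits on a spaced double dash, which is why titles and bodies can carry their own punctuation. A minimal parsing sketch follows; the function name and error messages are hypothetical, and splitting at most twice (so a later " -- " stays in the body) is an assumption about how the handler treats extra separators.

```python
from typing import Dict

KINDS = {"fact", "rule", "lesson"}
SCOPES = {"charm", "global"}


def parse_remember(args: str) -> Dict[str, str]:
    """Parse '<kind> [scope] -- <title> -- <body>' into its four parts."""
    parts = args.split(" -- ", 2)  # at most two splits; the body keeps any later ' -- '
    if len(parts) != 3:
        raise ValueError("expected: <kind> [scope] -- <title> -- <body>")
    head, title, body = parts
    tokens = head.split()
    if not tokens or tokens[0] not in KINDS or len(tokens) > 2:
        raise ValueError("kind must be fact, rule, or lesson")
    scope = tokens[1] if len(tokens) == 2 else "charm"  # scope defaults to charm
    if scope not in SCOPES:
        raise ValueError("scope must be charm or global")
    return {"kind": tokens[0], "scope": scope, "title": title.strip(), "body": body.strip()}
```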
Arena (blind A/B)
Compare two models on the same prompt, blinded, and record the outcome as a global-scope memory. Useful when you’re picking a light provider for day-to-day work and want a quick head-to-head without reading model cards. See Racing and Arena for the design, the scoring rubric used by the non-interactive Best-of-N race, and the transcript events both surfaces emit.
- /arena <prompt>
- Run the primary and light providers concurrently on prompt. Both responses are shuffled into labels A and B; model names are hidden until you pick. Reply A, B, tie, or skip when the two responses arrive. A second /arena is rejected while one is pending so the labels don’t get mixed up.
Recognised picks write a fact memory at
global scope with source="arena",
tagged arena and model-preference. Titles
are arena-preference-<8-hex>, and the body cites
both models by name plus a 200-character excerpt of the prompt so
the preference is attributable to a specific ask. Ties record a
neutral “rated equivalent” entry;
skip clears the session without writing a memory.
Requires a configured light provider (--light-provider
or CANTRIP_LIGHT_PROVIDER)—without one the
command prints a setup hint and exits.
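The blinding step in /arena amounts to shuffling two responses behind neutral labels and only revealing the mapping after a pick. The sketch below illustrates that idea with hypothetical names; it is not Cantrip's arena code, and the injectable random generator exists only to make the example testable.

```python
import random
from typing import Dict, Optional, Tuple


def blind_labels(
    primary_reply: str,
    light_reply: str,
    rng: Optional[random.Random] = None,
) -> Dict[str, Tuple[str, str]]:
    """Assign the two replies to labels A and B in random order.

    Each value is (source, reply); a UI would show only the reply until the user picks.
    """
    rng = rng or random.Random()
    pair = [("primary", primary_reply), ("light", light_reply)]
    rng.shuffle(pair)
    return {"A": pair[0], "B": pair[1]}
```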
Mid-session model switching
- /model
- Print the active provider and model, plus the light provider when one is configured. Shows the syntax for switching.
- /model <provider>
- Swap to provider’s default model. Accepts gemini, claude, fireworks, openrouter, opencode-zen, and inference-snap. openai-compatible requires a --base-url that doesn’t fit the slash syntax; restart the session instead.
- /model <provider>/<model>
- Swap to a specific model. Only the first / splits, so Fireworks-style slugs like fireworks/accounts/fireworks/models/kimi-k2p6 work.
The swap is atomic: the context-window budget tracks the new
provider’s window, provider-dependent caches (tool list,
auto-writer) rebuild on next access, and a
model_switched event lands on the event bus so the
status bar and cost tracker follow. Cost accumulators survive the
swap—they’re session totals, not per-provider. Any
cross-provider light routing (--light-provider snap
etc.) drops in favour of same-family routing; callers who rely on
a specific hybrid should restart the session.
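The "only the first / splits" rule for /model <provider>/<model> can be shown in a few lines. The function name is hypothetical; the behaviour mirrors what the reference describes rather than Cantrip's actual parser.

```python
from typing import Optional, Tuple


def parse_model_spec(spec: str) -> Tuple[str, Optional[str]]:
    """Split '<provider>' or '<provider>/<model>' on the first slash only."""
    provider, slash, model = spec.partition("/")
    # No slash means "use the provider's default model".
    return provider, (model if slash else None)
```

Because str.partition stops at the first separator, the remaining slashes in a Fireworks-style slug stay inside the model name.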
Auto-commit per turn
- /auto-commit
- Toggle the per-turn auto-commit. With no argument, flips on/off; /auto-commit on and /auto-commit off are explicit. When on, every turn that mutates files lands as a discrete git commit with a Co-Authored-By: Cantrip trailer; pre-existing dirty work commits first as chore(pre-cantrip): save in-progress work. See Auto-commit per turn.
The auto-commit hook fires inside the conversation loop after
the final assistant message lands. It walks the turn's tool
calls for write_file /
edit_file / multi_edit, stages the
touched paths via git add -- <paths> (no
catch-alls), and commits with a body that embeds the user
prompt plus a list of touched files. When a light provider is
configured the subject is generated by it; otherwise we fall
back to agent: <truncated user message>.
The most recent agent commit's SHA lands on
state.last_cantrip_commit_sha for future audit.
Architect / editor mode
- /architect
- Toggle the architect/editor split. With no argument, flips on/off; /architect on and /architect off are explicit. With a second token, sets the editor (same syntax as /model): /architect on claude, /architect on claude/claude-haiku-4-5-20251001. See Use architect mode for the design rationale and editor resolution rules.
When architect mode is on, every conversation-loop call splits
into two passes: an architect pass on the main provider
that emits a plain-prose proposal (no tool calls), then an
editor pass on a cheaper provider that consumes the
proposal and emits the actual fs_edit /
fs_write calls. Both passes record usage attributed
to their own provider, so /cost shows two model
lines per turn. Both passes also fire transcript events
(architect_pass / editor_pass) for
audit. Streaming surfaces yield the editor's response as a
single chunk — the architect's proposal is internal, not
streamed to the user.
Share session as a gist
- /share
- Export the live session as an HTML transcript and upload it as a secret GitHub gist via gh gist create. Returns the gist URL. Uses the same HTML renderer as /export html so the gist content is identical to what a local export would produce.
Requires the GitHub CLI: install gh and run
gh auth login once. When gh is missing
or unauthenticated, /share still writes the HTML to a
temp file and prints a copy-pasteable
gh gist create command so nothing is lost—the
session is never blocked by a missing dependency. Cantrip does
not run its own hosting service; the gist lives on GitHub under
the authenticated user.
Copy a chat message to the system clipboard
- /copy
- Copy the most recent assistant message body, rendered as Markdown, to the system clipboard. The TUI uses Textual's App.copy_to_clipboard helper; the CLI writes an OSC 52 escape directly to the controlling terminal so the copy works through tmux, screen, and ssh.
- /copy last
- Copy the last message of any role (including the user's own most recent message). Useful when an agent reply interleaves with a tool block and you want the latest visible line.
- /copy <N>
- Copy the N-th message in 1-based session order. Indices line up with /export markdown output so you can cross-reference if you need to copy something earlier in the transcript.
For copy to actually reach the clipboard through tmux, your
tmux.conf needs set -g set-clipboard on
and tmux 3.2 or later. Most modern terminal emulators (kitty,
alacritty, foot, iTerm2, gnome-terminal, Windows Terminal)
accept OSC 52 by default. When the controlling terminal isn't a
tty (piped stdout, headless CI), the CLI prints the message body
inline so the user can still grab it manually. The Web UI has
no equivalent server-pushed clipboard channel; /copy
inlines the payload in a fenced code block for browser
select-and-copy instead.
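For context on the OSC 52 mechanism mentioned above: the escape carries the clipboard text as base64 between an OSC introducer and a terminator. The sketch below builds the commonly used form with the c (clipboard) selection and a BEL terminator. Which selection and terminator Cantrip actually emits is not stated here, so treat those choices and the function name as assumptions.

```python
import base64


def osc52_sequence(text: str) -> str:
    """Build an OSC 52 'set clipboard' escape sequence for the given text."""
    payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
    # ESC ] 52 ; c ; <base64> BEL
    return f"\x1b]52;c;{payload}\x07"
```

Writing the returned string to a terminal that accepts OSC 52 asks the terminal emulator, not the remote host, to set the clipboard, which is why the approach survives ssh and multiplexers.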
MCP (Model Context Protocol)
Inspect configured MCP servers and discover new ones from marketplaces. See the MCP how-to for configuration.
- /mcp
- List configured servers, their connection status, and the tool count each exposes. Status markers: [ok] connected, [!!] failed, [--] stopped, [..] pending.
- /mcp help
- Print the full syntax block.
- /mcp tools <server>
- List every tool a named server advertises, with descriptions. Tools appear to the agent as mcp__<server>__<tool>.
- /mcp marketplace
- List servers from configured marketplaces (read-only). Descriptors include the install hint, required env vars, and OAuth scopes. Cantrip never auto-installs: you copy the descriptor into cantrip.mcp.yaml after reviewing it.
- /mcp marketplace refresh
- Bypass the 24-hour cache and re-fetch every marketplace.
Undo and redo
- /undo
- Roll back the most recent user turn. Restores the working tree from the snapshot taken just before that turn started and removes the user’s message plus every assistant / tool message that followed from history (in-memory and the SQLite session store). Stacks: run again to walk back further.
- /redo
- Re-apply the most recently undone turn. Restores the working tree to its post-turn state and re-appends the messages that were sliced off. The redo stack is in-memory only and clears the moment a new user turn arrives.
Branch and tree
Cantrip stores the conversation history as a tree.
/undo deletes; /branch rewinds without
deleting, so every dead end stays reachable.
- /branch [turn-id]
- Move the active head to a prior turn and rebuild the in-memory conversation from that point. With no argument, forks before the most recent user turn, which is the typical recovery from a bad steering message. Off-branch turns stay in the SQLite store and remain reachable through /tree and export-transcript --branch <id>.
- /tree
- Render the session as an indented tree of turns. Every surface gets a markdown form with turn ids and an active-branch marker (*); the TUI replaces it with an interactive picker: Enter on a row dispatches /branch <id>, Escape leaves the active branch alone.
Snapshots are on by default. Disable per-session with
--no-snapshots on the command line or
CANTRIP_SNAPSHOTS=false in the environment.
The snapshot repo lives outside the charm tree (under
$XDG_STATE_HOME/cantrip/snapshots/) so it
will not appear in git status or be touched by
git clean -fdx. See
Undo agent changes for the full
how-to.
Repository map
- /map
- Print a compact summary of the top-ranked files in the active charm: one line per file with the file's primary symbol and a "+N more" hint. Files are ordered by PageRank over a reference graph (caller → callee, plus YAML interface names from charmcraft.yaml). Use this to confirm what the agent thinks the repo looks like before asking it to navigate.
- /map full
- Print the full per-file symbol breakdown, the same wall-of-text view the agent receives in its system prompt on every turn. Useful for digging into a specific area; overwhelming as the default in a small chat panel.
- /map-refresh / /map-refresh full
- Discard the cache at .cantrip-repomap.json and reparse every source file from scratch, then print the compact (or full) summary. Normal builds are incremental (only files whose mtime changed get reparsed); a refresh is useful after a large rename or when the cache looks stale.
The map injects automatically into the system prompt under a configurable token budget (default 1500). When the conversation fills past 80% of the context window the budget halves; past 95% it drops entirely so a near-full window isn't carrying a bird's-eye view it can't act on.
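The budget rule in the paragraph above can be written as a small step function. This is a sketch under stated assumptions: the function name is hypothetical, "fills past" is read as strictly greater than, and integer arithmetic is used so the 80 % and 95 % boundaries are exact.

```python
def repo_map_budget(used_tokens: int, window_tokens: int, base_budget: int = 1500) -> int:
    """Token budget for the injected repo map at the current context fill level."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    if used_tokens * 100 > 95 * window_tokens:
        return 0  # past 95 %: the map drops entirely
    if used_tokens * 100 > 80 * window_tokens:
        return base_budget // 2  # past 80 %: the budget halves
    return base_budget
```

With the default budget this gives 1500 tokens in a lightly used window, 750 once the window is 85 % full, and nothing at 96 %.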
Code intelligence
Three slash commands layered on the same read-only index that
backs the code_symbols, code_definition,
and code_references agent tools. The repo map
answers "what matters in this repo?"; these commands
answer "where is this exact symbol?".
- /symbols <query>
- Search workspace symbols by name. Layered match policy: exact qualified (e.g. MyCharm._on_install) wins, then exact unqualified (every _on_install), then prefix, then case-insensitive substring. Each layer fires only when the previous one returns nothing, so a precise hit is never drowned in fuzzy candidates.
- /definition <symbol>
- Resolve a symbol to its defining file/line plus a bounded snippet. Ambiguous queries return every candidate in the response so the caller can disambiguate; a missing symbol returns No definition rather than guessing.
- /references <symbol>
- List every recorded callsite for a symbol (calls, attribute access, and import sites) with file:line locations. Honest about ambiguity (notes when the name belongs to multiple candidate symbols) and truncation (the rest are counted in the response).
The same surface is also reachable as @-mention
providers: typing @symbol IngressHandler,
@definition build_layer, or @references
refresh in the chat input expands inline before the
message reaches the LLM. Coverage is Python source plus charm
metadata YAML; other languages still go through
grep/glob.
Review checks
/review-
Run every loaded prompt-based Check against the active charm.
Each Check is one structured LLM call (the CHECK_RESULT schema constrains the reply to {status, severity, message, evidence?, suggested_fix?}), so the report is uniform regardless of which model you're using. Failures appear first, then errors (couldn't reach a verdict), then skipped (no matching files), then passes. When the active charm also has linter diagnostics, they appear underneath as a Deterministic checks section so you see one combined view.
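The report ordering can be pictured as a stable sort on status. In this sketch the status strings ("fail", "error", "skipped", "pass") are assumed spellings, and order_report is a hypothetical helper; only the ordering itself comes from the description above.

```python
# Assumed status values; lower rank prints first.
STATUS_RANK = {"fail": 0, "error": 1, "skipped": 2, "pass": 3}


def order_report(results: list[dict]) -> list[dict]:
    """Sort check results failures-first; sorted() is stable, so ties keep input order."""
    return sorted(results, key=lambda result: STATUS_RANK[result["status"]])
```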
Checks are loaded from three layered locations (later wins on name
conflict): bundled defaults shipped with Cantrip, then
~/.config/cantrip/checks/.md (user scope), then
<charm>/.cantrip/checks/.md (repo scope). Each
file is YAML frontmatter plus a markdown body — see
design/CHECKS.md
for the schema and the boundary with charmlint
(roughly: charmlint for AST/regex rules,
/review for "an experienced human would notice this
is off but you can't write it as a regex").
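The "later wins on name conflict" rule amounts to a dictionary overwrite across layers. The sketch below assumes each Check lives in its own .md file and that the file stem serves as the Check name; both are assumptions for illustration, and load_checks is a hypothetical helper rather than Cantrip's loader.

```python
from pathlib import Path


def load_checks(layers: list[Path]) -> dict[str, Path]:
    """Resolve check files across layers given in precedence order (lowest first)."""
    resolved: dict[str, Path] = {}
    for directory in layers:
        if not directory.is_dir():
            continue  # a missing layer (for example, no repo-scope checks) is simply skipped
        for path in sorted(directory.glob("*.md")):
            resolved[path.stem] = path  # a later layer overwrites an earlier one
    return resolved
```

Passing the layers as bundled, user, repo reproduces the precedence described above: a repo-scope file shadows a user-scope file of the same name, which in turn shadows the bundled default.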
Three checks ship by default: charm-readme-coherence,
action-ergonomics, relation-data-hygiene.
Session objective
/goal-
Show the current free-text user-prose objective for the session. When no objective is set, prints a hint about /goal <text>.
/goal <text>-
Set or update the objective. Subsequent Ralph re-feeds use the user’s words verbatim instead of the --charm-name + --charm-type paraphrase, so iterate-until-green loops stay anchored to the original goal sentence. Stamping an objective at startup is also possible with --objective on cantrip run; the persisted value wins on resume.
/goal clear-
Remove the objective. Ralph re-feeds and status surfaces fall back to the spec-derived paraphrase.
The objective is persisted in the session store; cantrip
resume picks up the same value without the
--objective flag being re-supplied. A
/goal issued mid-Ralph takes effect on the very
next iteration without restarting the loop.
Pause and resume the autonomous loop
/pause-
Stop the background executor picking new tasks. Chat keeps
working and any task already in flight (including a CONFIRM
task waiting for your answer) runs to completion — only
the next task on the queue is held. The flag is sticky across
chat turns: typing a regular reply after
/pause won’t silently restart the loop.
/resume-
Restart a paused autonomous loop. Background tasks pick up where they left off; the queue is preserved while paused.
Both commands are idempotent: /pause while already
paused (and /resume while already running) reports
the current state without touching the executor. The TUI status
bar and the Web UI header surface a single Codex-style lifecycle
badge that swaps between PAUSED, DONE,
BLOCKED, BUDGET LIMITED, and a quiet
default running state — the projection lives
in cantrip.agent.lifecycle so the two surfaces
never disagree. Precedence: paused beats
blocked, budget-limited is more
specific than blocked, and a session with pending
or active work always reads as running regardless
of stale blockers.
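The precedence rules can be sketched as a single projection function. This is not the code in cantrip.agent.lifecycle: the SessionState fields are hypothetical, and two points are assumptions the text does not settle, namely that PAUSED also wins over pending work and that a session with nothing pending and no blockers reads as DONE.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical snapshot of the inputs the badge depends on."""
    paused: bool = False
    pending_or_active: int = 0
    budget_limited: bool = False
    blocked: bool = False


def lifecycle_badge(state: SessionState) -> str:
    if state.paused:
        return "PAUSED"          # paused beats blocked (assumed to beat running too)
    if state.pending_or_active > 0:
        return "running"         # live work outranks stale blockers
    if state.budget_limited:
        return "BUDGET LIMITED"  # more specific than blocked
    if state.blocked:
        return "BLOCKED"
    return "DONE"                # assumption: nothing left to do
```

Keeping the projection in one pure function is what lets two front-ends share it without drifting apart.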
Project diagnostics
/diagnostics-
Run ruff, ty, and (if the directory looks like a charm) charmlint across the active charm and print the issues grouped by severity. Output is capped at ~1500 tokens with a "N more issues suppressed" footer when the project has more issues than fit. Result is cached for 30 seconds so repeated calls in the same turn don't re-run the linters.
/diagnostics --refresh-
Force a fresh lint pass, bypassing the 30-second cache; useful right after editing files outside the agent's tools.
The same aggregator runs automatically when the autonomous loop
starts a BUILD or DEBUG subagent, so the agent begins each task
already knowing what's broken. Tools that aren't installed are
listed as [skipped] notes rather than silently
masking issues — a missing ty doesn't look the same
as "all clear."
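The 30-second cache and the --refresh bypass behave like a small time-to-live wrapper around the lint run. The class below is a sketch under that reading; DiagnosticsCache and its parameters are hypothetical names, and the injectable clock exists only to make the behaviour easy to test.

```python
import time
from typing import Callable, Optional


class DiagnosticsCache:
    """Cache one lint result for ttl_seconds; refresh=True forces a new run."""

    def __init__(self, run_linters: Callable[[], str], ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._run = run_linters
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[str] = None
        self._stamp: Optional[float] = None

    def get(self, refresh: bool = False) -> str:
        now = self._clock()
        stale = self._stamp is None or now - self._stamp >= self._ttl
        if refresh or stale:
            self._value = self._run()  # the expensive linter pass
            self._stamp = now
        return self._value
```

A single shared instance would also explain why /diagnostics and @problems do not trigger two lint passes back to back.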
@-mention context providers
Type @<name> [args] in the chat input and the mention is replaced
with structured context before the message reaches the LLM —
one fewer tool round-trip than asking the agent to read the same
content, and the substituted block is recorded in the transcript
with a [@name]…[/@name] fence so intent stays visible alongside
content. Slash commands run first, so a literal @x inside a
slash argument is not substituted.
Tab-complete in the TUI: typing look at @fi with the cursor at
the end pops a suggestion list; Tab completes to look at @file
without disturbing the surrounding prose. Up/Down move the
highlight, Escape dismisses.
@file <path>-
Inline the contents of a repo-relative file. Absolute paths and .. traversal are rejected.
@diff-
Output of git diff HEAD in the active charm.
@tree [path]-
Repo-tracked file listing via git ls-files (.gitignore respected). Falls back to a plain walk when the directory is not a git checkout.
@problems-
Current ruff/ty/charmlint diagnostics. Shares a 30-second cache with /diagnostics so a quick check followed by a mention does not re-run the linters.
@url <url>-
Fetch the URL via the same web-fetch path the agent uses (private-IP block, llms.txt probing, HTML-to-text extraction).
@charm <name>-
Charmhub metadata (relations, config, revision) for the named charm.
@preset [slug]-
A known-good bundle shape from the built-in catalogue. Bare @preset lists the shapes (cos-lite, twelve-factor-cos, identity-platform, charmed-kubeflow); @preset cos-lite expands one, listing the apps grouped by semantic layer, then every relation edge with its interface name and a one-line description. Knowledge only: it prescribes no deployment steps and emits no bundle.yaml.
@juju <subcmd>-
Run a read-only juju subcommand. The first token is hard-allowlisted: status, show-unit, show-application, show-model, config, list-secrets, show-relation, list-models. Anything else is rejected so a typo cannot reach a destructive verb.
@docs <site> <query>-
Top hits from the indexed Canonical docs (see Index the charm docs). Requires an embed provider; the mention reports the missing configuration if none is wired.
@symbol <query>-
Workspace-symbol search via the Phase 72b code-intelligence index. Layered match policy (exact qualified > exact > prefix > fuzzy) so a precise hit isn't drowned in fuzzy candidates. Coverage: Python source plus charm metadata YAML.
@definition <symbol>-
Resolve a symbol to its defining file/line plus a bounded snippet. Ambiguous queries surface every candidate.
@references <symbol>-
List every recorded callsite for a symbol with file:line locations. Honest about ambiguity and truncation.
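The @juju allowlist is a first-token check applied before anything is executed. The sketch below shows that shape only; juju_argv is a hypothetical helper, and splitting the arguments with shlex is an assumption about how the mention text is tokenised.

```python
import shlex

# Subcommands listed in the @juju description above.
READ_ONLY_JUJU = frozenset({
    "status", "show-unit", "show-application", "show-model",
    "config", "list-secrets", "show-relation", "list-models",
})


def juju_argv(mention_args: str) -> list[str]:
    """Build an argv for a read-only juju call, or raise if the verb is not allowlisted."""
    tokens = shlex.split(mention_args)
    if not tokens or tokens[0] not in READ_ONLY_JUJU:
        verb = tokens[0] if tokens else "(empty)"
        raise ValueError(f"juju subcommand not allowed: {verb}")
    return ["juju", *tokens]
```

Checking against a fixed set, rather than blocking known-dangerous verbs, means a misspelt or newly added destructive subcommand is rejected by default.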
Each provider has a per-call character cap; over-budget output is
truncated with a [truncated N chars] footer rather than silently
elided. Multi-line blocks get a [@name]…[/@name] fence so the
typed mention stays visible in the transcript.
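A minimal sketch of the cap and the fence follows. Both helpers are hypothetical, the exact layout of the footer and fence lines is assumed, and N is taken here to be the number of characters dropped.

```python
def cap_output(text: str, limit: int) -> str:
    """Truncate provider output to limit characters and say how much was cut."""
    if len(text) <= limit:
        return text
    dropped = len(text) - limit
    return f"{text[:limit]}\n[truncated {dropped} chars]"


def fence(name: str, body: str) -> str:
    """Wrap a multi-line block so the originating mention stays visible."""
    return f"[@{name}]\n{body}\n[/@{name}]"
```

For example, fence("diff", cap_output(diff_text, 4000)) would yield a block that records both which mention produced it and whether anything was cut, where diff_text and the 4000-character cap are placeholder values.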
Third-party providers register via MCP servers or hooks — the
protocol and conventions are in design/CONTEXT_PROVIDERS.md.
Environment variables
This is the complete reference table. For a task-oriented walk-through of the provider keys (Gemini, Claude, Fireworks, OpenRouter, OpenCode Zen, OpenAI-compatible) and the embed / rerank role keys, see Choose an LLM provider — environment variables. The operational tunables below (memory paths, MCP storage, snapshot toggles, self-update opt-outs, checkpoint debugging) are reference-only and not duplicated elsewhere.
| Variable | Required for | Description |
|---|---|---|
| GEMINI_API_KEY | --provider gemini | Google Gemini API key |
| ANTHROPIC_API_KEY | --provider claude | Anthropic API key |
| FIREWORKS_API_KEY | --provider fireworks | Fireworks.ai API key |
| OPENROUTER_API_KEY | --provider openrouter | OpenRouter.ai API key |
| OPENCODE_ZEN_API_KEY | --provider opencode-zen | OpenCode Zen API key. ZEN_API_KEY is accepted as a fallback. |
| OPENAI_COMPATIBLE_API_KEY | --provider openai-compatible | Bearer token for the configured endpoint; set to any non-empty string when auth is not required. |
| VOYAGE_API_KEY | --embed-provider voyage / --rerank-provider voyage | Voyage AI key. Required when Voyage serves the embed or rerank role. |
| OPENAI_API_KEY | --embed-provider openai | OpenAI key for the embed role. The chat-side --provider openai-compatible path uses OPENAI_COMPATIBLE_API_KEY instead. |
| OPENAI_EMBED_BASE_URL | optional | Override the OpenAI embed endpoint. Set to a self-hosted vLLM or any OpenAI-wire-compatible host that serves /v1/embeddings. |
| CANTRIP_EMBED_PROVIDER | optional | Provider for the embed role. Accepts voyage or openai; --embed-provider overrides. |
| CANTRIP_EMBED_MODEL | optional | Embed model identifier. Defaults: voyage-3 (Voyage), text-embedding-3-small (OpenAI). |
| CANTRIP_RERANK_PROVIDER | optional | Provider for the rerank role. Currently voyage only. |
| CANTRIP_RERANK_MODEL | optional | Rerank model identifier. Default rerank-2. |
| CANTRIP_MEMORY_DIR | optional | Override the global memory directory. Defaults to $XDG_CONFIG_HOME/cantrip/memory (falls back to ~/.config/cantrip/memory). |
| CANTRIP_MEMORY_SOFT_EXPIRY_DAYS | optional | Days untouched before a memory is archived by memory_sweep. Default 60. Non-integer or non-positive values log a warning and fall back to the default. |
| CANTRIP_MEMORY_HARD_EXPIRY_DAYS | optional | Days archived before a memory is surfaced as a deletion candidate by memory_purge_check. Default 180. |
| CANTRIP_MCP_USER_CONFIG | optional | Override the user-scope MCP config path. Defaults to ~/.config/cantrip/mcp.yaml. |
| CANTRIP_MCP_TOKEN_DIR | optional | Override the per-server OAuth token storage directory. Defaults to ~/.config/cantrip/mcp_tokens/ at 0700, with token files at 0600. |
| CANTRIP_MCP_GPG_TOKENS | optional | Set to 1/true/yes/on to encrypt OAuth tokens at rest with gpg --symmetric. Requires a configured gpg-agent so writes don’t block. |
| CANTRIP_MCP_MARKETPLACE_CACHE | optional | Override the marketplace response cache directory. Defaults to ~/.cache/cantrip/marketplaces/; 24-hour TTL. |
| CANTRIP_MAX_WORKTREES | optional | Cap concurrent subagent worktrees under .cantrip-worktrees/<task-id>/. Set to 0 to disable worktree isolation entirely and run every subagent in the main tree. |
| CANTRIP_NOTIFY | optional | Notify when a task finishes. Set to bell for a terminal bell (\a to stderr), desktop to shell out to notify-send with the task title, or both. Defaults to off. The desktop path silently no-ops on platforms where notify-send is not on PATH. |
| CANTRIP_NO_UPDATE_CHECK | optional | Skip the background PyPI self-update check. Accepts 1, true, yes, or on (case-insensitive). Useful on corporate networks that block pypi.org or for scripted runs that shouldn't talk to the public internet at all. The same effect can be made persistent by setting update_check_disabled = true in ~/.config/cantrip/settings.json. |
| CANTRIP_UPDATE_CACHE_DIR | optional | Override the disk cache directory for the PyPI check. Defaults to ~/.cache/cantrip/; the verdict lives in update.json with a 24-hour TTL. |
| CANTRIP_NO_RESUME | optional | Disable step-checkpoint replay for the next run. Accepts 1, true, yes, or on (case-insensitive). Subagents skip the checkpoint lookup and re-execute every LLM turn and tool call live; fresh results still land in the store so the next run without the var sees a clean cache. Useful when hunting a bug that might itself be cached in a stale checkpoint. |
| CANTRIP_KEEP_CHECKPOINTS | optional | Preserve step checkpoints after a task reaches DONE. Accepts 1, true, yes, or on (case-insensitive). By default, checkpoints are purged on successful task completion; setting this flips the purge into a no-op so rows can be inspected via SELECT * FROM step_checkpoints in the .cantrip SQLite file. Intended for debugging; leave unset in normal use. |
| CANTRIP_SNAPSHOTS | optional | Set to 0, false, no, or off (case-insensitive) to disable per-turn working-tree snapshots backing /undo and /redo. Equivalent to passing --no-snapshots. Defaults to on; the snapshot repo lives at $XDG_STATE_HOME/cantrip/snapshots/<hash>/. |
| CANTRIP_SHORT_SESSION | optional | Force short-session mode on or off, or auto (the default) to enable it only for providers below ~16 K usable context. Equivalent to --short-session; see that flag for what the mode changes. |
| CANTRIP_TOOL_FAILURE_CAP | optional | How many times the same tool call (same name and same arguments) may fail in a row before the active task is marked BLOCKED and the run stops. Default 5; clamped to [1, 50]; non-integer or out-of-range values log a warning and fall back to the default. One round before the cap the agent injects a message telling the model to change approach. A different tool call (or different arguments) resets the streak. Raise it for flaky environments, lower it to fail fast on small local models that loop on oversized write_file payloads. |
| CANTRIP_TOOL_PHASE | optional | Pin the curated tool slice to one of research, build, debug, deploy, or demo, regardless of what the work queue is doing. Tight-context providers (inference-snap's 12-tool cap, short-session mode) only ever see the tools the active workflow phase needs; normally that phase is derived from the running task's category, but this var forces it, which is useful for driving Cantrip through an unusual flow (e.g. a documentation pass that wants research-tier tools throughout). Unrecognised values log a warning and are ignored. Roomy providers (Claude, Gemini) are unaffected: they get the full toolset either way. |
The inference-snap provider does not require an API key
as it runs models locally.
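Several variables in the table share two parsing conventions: a case-insensitive truthy set (1, true, yes, on) and an integer with a fallback default. The sketch below illustrates both. The helper names are hypothetical, stripping surrounding whitespace is an assumption, and the cap follows the "log a warning and fall back to the default" wording for out-of-range values.

```python
import logging
import os
from typing import Mapping

TRUTHY = frozenset({"1", "true", "yes", "on"})


def env_flag(name: str, environ: Mapping[str, str] = os.environ) -> bool:
    """True when the variable is set to 1/true/yes/on, ignoring case."""
    return environ.get(name, "").strip().lower() in TRUTHY


def tool_failure_cap(environ: Mapping[str, str] = os.environ,
                     default: int = 5, low: int = 1, high: int = 50) -> int:
    """Read CANTRIP_TOOL_FAILURE_CAP, warning and using the default on bad input."""
    raw = environ.get("CANTRIP_TOOL_FAILURE_CAP")
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        logging.warning("CANTRIP_TOOL_FAILURE_CAP=%r is not an integer; using %d", raw, default)
        return default
    if not low <= value <= high:
        logging.warning("CANTRIP_TOOL_FAILURE_CAP=%d is outside [%d, %d]; using %d",
                        value, low, high, default)
        return default
    return value
```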
Session file
Cantrip stores session state in a .cantrip file (SQLite
database) in the charm project directory. This file contains:
- Conversation history (messages, tool calls, results)
- Task queue (status, dependencies, results)
- Design decisions
- Token usage metrics
- Virtual file cache
When you run cantrip in a directory that contains a
.cantrip file, the launcher prompts you to choose
Resume, Fresh, or Transcript:
- Resume: load the prior session (conversation, task queue, decisions) and continue where you left off.
- Fresh: rename the existing session to .cantrip.bak-<timestamp> so nothing is lost, then start with an empty store.
- Transcript: show the last 20 persisted messages inline, then re-ask the question.
The TUI shows a dedicated resume screen with the same three choices; the Web UI shows a banner across the top of the chat panel. On a non-TTY stdin (scripts, piped input) the CLI falls back to silent resume so automation keeps working.
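The launcher behaviour can be sketched in two small functions: one deciding whether to prompt, and one performing the Fresh rename. Both are hypothetical helpers, the return labels are invented for illustration, and the timestamp format in the backup name is an assumption (the text only specifies .cantrip.bak-<timestamp>).

```python
import sys
import time
from pathlib import Path


def startup_choice(session: Path, stdin=sys.stdin) -> str:
    """Decide what the launcher does before any user interaction."""
    if not session.exists():
        return "new"     # nothing to resume, so no question to ask
    if not stdin.isatty():
        return "resume"  # scripts and pipes: silent resume
    return "prompt"      # interactive: offer Resume / Fresh / Transcript


def start_fresh(session: Path) -> Path:
    """Move the old session aside so nothing is lost; returns the backup path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")  # assumed timestamp format
    backup = session.with_name(f"{session.name}.bak-{stamp}")
    session.rename(backup)
    return backup
```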
Self-update check
On startup, Cantrip queries the PyPI JSON API in the background
to see whether a newer release of the cantrip
distribution has been published. The result is cached on disk at
~/.cache/cantrip/update.json for 24 hours, so
day-to-day startups don't hit the network.
When a newer version is available, each front-end surfaces a non-blocking notice:
- TUI: a Rich panel prints after the Textual screen tears down, so the prompt never interrupts mid-session.
- Web UI: a dismissible banner appears at the top of the page. Dismissal is remembered per version in localStorage, so the banner reappears on the next release without further intervention.
- CLI (--no-tui): a single two-line notice prints after the REPL exits, giving the new version, the PyPI project URL, and the exact upgrade command for the detected installer.
The upgrade command is installer-aware. Cantrip inspects
sys.executable to pick among
uv tool upgrade juju-cantrip, pipx upgrade juju-cantrip,
uv pip install --user --upgrade juju-cantrip,
uv pip install --upgrade juju-cantrip, and
snap refresh cantrip. When nothing matches the
notice falls back to the PyPI URL rather than guess.
Pre-releases (1.2.0rc1) are hidden unless the
installed version is already a pre-release. If the installed
version has been yanked on PyPI, the notice shifts in tone to
recommend upgrading. To disable the check entirely, see
CANTRIP_NO_UPDATE_CHECK above.
The /update slash command forces a cache-bypassing
check from inside a session and renders the result in the chat
panel. Use it to confirm a newly published release without
waiting for the 24-hour cache TTL. Two flags toggle the
persistent opt-out without hand-editing
settings.json:
- /update --no-check writes update_check_disabled = true.
- /update --check clears the flag.
The CANTRIP_NO_UPDATE_CHECK environment variable
shadows the settings file and stays in force for the running
session regardless of /update --check.
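The interaction between the environment variable and the settings flag can be sketched as below. The function names are hypothetical and settings is modelled as a plain dict; the point is only that the environment is consulted first, so clearing the persisted flag cannot re-enable a check the environment has disabled.

```python
TRUTHY = frozenset({"1", "true", "yes", "on"})


def update_check_enabled(environ: dict, settings: dict) -> bool:
    """The environment variable shadows whatever the settings file says."""
    if environ.get("CANTRIP_NO_UPDATE_CHECK", "").lower() in TRUTHY:
        return False
    return not settings.get("update_check_disabled", False)


def apply_update_flag(settings: dict, flag: str) -> dict:
    """Return a copy of settings after /update --no-check or /update --check."""
    updated = dict(settings)
    if flag == "--no-check":
        updated["update_check_disabled"] = True
    elif flag == "--check":
        updated.pop("update_check_disabled", None)
    return updated
```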
Controller-safety guard
Mutating Juju tools (juju_deploy, juju_refresh,
juju_relate, juju_destroy_model,
juju_remove_application) refuse to run against a non-local
controller without an explicit per-call confirmed=true from
the agent. The agent surfaces the target controller and cloud
to the operator and asks for confirmation before re-calling.
Two axes decide whether the gate fires:
- Heuristic. localhost and lxd clouds are always treated as local. microk8s and k8s clouds are local only when their API endpoints point at loopback (127.0.0.1, [::1], localhost) or a snap-managed socket (e.g. /var/snap/microk8s/...). Anything else is non-local.
- Explicit list. A production_controllers array in ~/.config/cantrip/settings.json names controllers that always require confirmation regardless of cloud type. The error message escalates language ("production controller <name>") so the operator notices what they are about to touch:
  { "production_controllers": ["prod-aws", "prod-k8s"] }
The two axes compose: a controller hits the gate when either
axis says non-local. When the heuristic cannot classify the
controller (juju is not installed, or
juju show-controller produces nothing), the gate falls
silent so test environments that do not have juju available do
not see spurious refusals.
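The two axes can be combined in a few lines. This sketch is not Cantrip's guard: the function names and arguments are hypothetical, recognising a snap-managed socket by the /var/snap/ prefix is an assumption, and so is checking the explicit list before the "cannot classify" case (modelled here as cloud_type being None).

```python
from typing import Optional
from urllib.parse import urlsplit

LOOPBACK_HOSTS = {"127.0.0.1", "::1", "localhost"}


def endpoint_is_local(endpoint: str) -> bool:
    """Loopback host or a snap-managed socket path (prefix check is an assumption)."""
    if endpoint.startswith("/var/snap/"):
        return True
    # urlsplit only parses a host when the string has a '//' authority marker.
    target = endpoint if "//" in endpoint else f"//{endpoint}"
    return urlsplit(target).hostname in LOOPBACK_HOSTS


def needs_confirmation(name: str, cloud_type: Optional[str],
                       endpoints: list[str], production_controllers: list[str]) -> bool:
    if name in production_controllers:
        return True   # explicit list always requires confirmation
    if cloud_type is None:
        return False  # unclassifiable controller: the gate stays silent
    if cloud_type in {"localhost", "lxd"}:
        return False
    if cloud_type in {"microk8s", "k8s"}:
        all_local = bool(endpoints) and all(endpoint_is_local(e) for e in endpoints)
        return not all_local
    return True       # any other cloud is treated as non-local
```

Requiring every endpoint to be local before skipping confirmation errs on the side of asking, which matches the guard's intent of never touching a remote controller silently.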