Metadata-Version: 2.4
Name: octowright
Version: 0.11.0
Summary: MCP server that drives multiple headed Playwright browsers (chromium/firefox/webkit) in parallel with per-instance action recording and persistent profiles.
Project-URL: Homepage, https://github.com/livingstaccato/octowright
Project-URL: Documentation, https://foundry.provide.io/octowright/
Project-URL: Repository, https://github.com/livingstaccato/octowright
Project-URL: Issues, https://github.com/livingstaccato/octowright/issues
Author-email: Tim Perkins <code@tim.life>, "provide.io" <code@provide.io>
Maintainer-email: "provide.io" <code@provide.io>
License-Expression: Apache-2.0
License-File: LICENSES/Apache-2.0.txt
Keywords: agent,browser-automation,claude,mcp,model-context-protocol,playwright,testing
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: anyio>=4.4.0
Requires-Dist: click>=8.1
Requires-Dist: cryptography<46.0.4,>=46.0.3; sys_platform == 'win32' and (platform_machine == 'ARM64' or platform_machine == 'arm64' or platform_machine == 'aarch64')
Requires-Dist: cryptography>=48.0.1; sys_platform != 'win32' or (platform_machine != 'ARM64' and platform_machine != 'arm64' and platform_machine != 'aarch64')
Requires-Dist: defusedxml>=0.7.1
Requires-Dist: httpx>=0.27
Requires-Dist: mcp>=1.2.0
Requires-Dist: msgpack>=1.2.1
Requires-Dist: playwright>=1.47.0
Requires-Dist: provide-telemetry[otel]>=0.4.8
Requires-Dist: pydantic-settings>=2.14.2
Requires-Dist: python-multipart>=0.0.31
Requires-Dist: pyyaml>=6
Requires-Dist: starlette>=1.3.1
Requires-Dist: uvicorn[standard]>=0.30
Provides-Extra: terminal
Requires-Dist: provide-uterm-platform>=0.4.0; extra == 'terminal'
Requires-Dist: provide-uterm-server[gateway]>=0.4.0; extra == 'terminal'
Requires-Dist: provide-uterm>=0.4.0; extra == 'terminal'
Description-Content-Type: text/markdown

![octowright](https://raw.githubusercontent.com/livingstaccato/octowright/main/docs/images/brand/octowright-banner.png)

# Octowright

An MCP server that lets agentic coding clients drive **many headed Playwright browsers
in parallel** with a **mix of engines** (Chromium, Firefox, WebKit), recording every
action to a JSONL log so a session can later be exported as a standalone Playwright
script.

Octowright is optimized for multi-session, mixed-engine browser orchestration with
persistent profiles, structured recordings, and a local debugger dashboard.

## Get started

Octowright isn't on PyPI yet, so you install from source. Octowright uses
[`uv`](https://docs.astral.sh/uv/) for dependency management — there is no
`pip install` path. If you don't have `uv` yet:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Then, from any directory you'd like Octowright to live under (e.g. `~/code/`):

```bash
git clone https://github.com/livingstaccato/octowright.git
cd octowright
uv sync                                              # install Python deps
uv run playwright install webkit firefox chromium    # install browser binaries
uv run octowright init                               # print MCP registration block + scaffold config
```

Engine-binary management is currently CLI-driven (`playwright install` /
`playwright install --list`), not exposed as Octowright MCP tools.

The last command prints a JSON block to paste into an MCP client config, commonly
`.mcp.json` for a project or `~/.claude.json` for Claude Code. It also creates
Octowright's user config directory with a sample persona, scenario, and macro so
you have something to play with.

The block it prints looks like this — `init` substitutes your install path
into `<absolute-path-to-octowright>`:

```json
{
  "mcpServers": {
    "octowright": {
      "command": "uv",
      "args": [
        "--directory",
        "<absolute-path-to-octowright>",
        "run",
        "octowright",
        "serve"
      ]
    }
  }
}
```
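
For scripted setups, the same block can be assembled and merged into an existing client config. This is a minimal sketch, not what `init` runs internally: `build_registration` and `merge_into_config` are hypothetical helper names, and the only thing taken from this README is the shape of the JSON above.

```python
import json
from pathlib import Path


def build_registration(install_dir: str) -> dict:
    """Return an MCP registration block shaped like the one `init` prints."""
    # The printed block uses an absolute path, so resolve the input first.
    absolute = str(Path(install_dir).expanduser().resolve())
    return {
        "mcpServers": {
            "octowright": {
                "command": "uv",
                "args": ["--directory", absolute, "run", "octowright", "serve"],
            }
        }
    }


def merge_into_config(existing: dict, block: dict) -> dict:
    """Add the block to an existing client config, keeping other servers intact."""
    merged = dict(existing)
    servers = dict(merged.get("mcpServers", {}))
    servers.update(block["mcpServers"])
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    print(json.dumps(build_registration("~/code/octowright"), indent=2))
```

Merging instead of overwriting matters when `.mcp.json` already registers other servers.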

Reload your MCP client. The tools appear as `mcp__octowright__browser_launch`, etc.

Verify in 30 seconds: ask your client to launch a webkit browser at `octowright.com`,
click a link, list browsers, then close. The next section walks through that same
flow as a tour of what Octowright actually does.

## Your first 5 minutes

Once installed and registered, ask your MCP client to walk through these in order.
Each step builds on the previous one and shows you what Octowright actually does.

**1. Open a browser.** Ask: *"launch a webkit browser at octowright.com"*. The client
calls `browser_launch kind=webkit url=https://octowright.com`. A real WebKit window opens
on your desktop. The result includes the `instance_id` so the client can target later
actions.

**2. Drive it.** Ask: *"click the 'More information' link"*. The client calls
`browser_click_by text="More information"`. The window navigates. Every action lands
in a JSONL recording on disk.

**3. List what's open.** Ask: *"what browsers are running?"*. `browser_list` returns a
one-line summary like `1 browser: 8a3f.../webkit @ iana.org/help/example-domains`.

**4. Save the session as a macro.** Ask: *"save the last few clicks as a macro called
example-tour"*. The client calls `macro_save`. Now `macro_run name=example-tour`
replays it.

**5. Close the browser, then launch a *named* one.** Ask: *"close that browser, then
launch a chromium browser with profile=demo at github.com"*. The window opens and you
log in (or do anything else that sets site state). When you close it, the cookies and localStorage flush to
the profile directory in Octowright's config dir. Re-launch with the same profile and you're
already logged in.

That's the whole tool: parallel browsers, recordings, named macros, persistent
profiles. **Personas** are profiles with metadata (display name, default URL,
credential references); **scenarios** are pre-declared groups of personas you can
spin up with one call. Both are covered later. **The dashboard** ties every
piece together visually — see the next section.

## Demo catalog

Octowright now ships a curated demo catalog on top of the raw `examples/`
material.

- Repo-facing catalog: [`demo/INDEX.md`](demo/INDEX.md)
- Authored bundle manifests: `demo/bundles/<demo-id>/demo.yaml`

The current hero set promotes seven offline-first bundles:
`first-run-session`, `macro-replay-loop`, `cross-engine-trio`,
`role-based-duo`, `fixture-lab`, `verify-suite`, and
`seven-mix-orchestration`.

`examples/` remains the raw source layer for reusable macros and scenarios.
`demo/bundles/` is the product-facing layer that adds audience/tag metadata,
artifact expectations, regen commands, tutorial-export metadata, and small
deterministic seed assets.

To refresh the generated repo catalog and per-bundle tutorial-export JSON from the manifests:

```bash
uv run python scripts/demos/record_heroes.py
```

## Distributed Skill Pack

Octowright ships a packaged skill named `octowright` for Codex and
project-local plugin manifests for compatible runtimes such as Claude Code and Codex.

Install everything:

```bash
uv run octowright skill install octowright --target all
```

Check status and drift:

```bash
uv run octowright skill status octowright --target all
```

Run diagnostics:

```bash
uv run octowright skill doctor --json
```

Notes:
- Codex skill install target is `$CODEX_HOME/skills` (defaults to `~/.codex/skills`).
- Plugin manifests are written in the current project under `.claude-plugin/plugin.json`
  and `.codex-plugin/plugin.json`.
- Use `--dry-run` to preview writes and `--force` to overwrite existing installs.
- Distributed skill/plugin metadata versions are sourced from `octowright.VERSION`.

## Dashboard

`octowright serve` boots two things in one process: the MCP stdio server (what
your client talks to) and a Starlette HTTP server on `http://127.0.0.1:6286/` (what
*you* look at). One stable URL, pinned in your browser, replaces the old dance
of copying log paths and shelling out to `npx playwright show-trace` by hand.

Ask your MCP client `"give me the octowright dashboard URL"` (it'll call the
`octowright_dashboard_url` MCP tool), or just open the URL directly. You get:

- **Top-level dashboard** — every live browser, every live scenario, recent
  closed sessions, all your personas, all your saved macros. Auto-refreshes
  every 5 seconds.
- **Persona management** — each persona card shows engine list, last-used
  time, and on-disk size (chromium + firefox + webkit + yaml). Hover the
  card and click the edit (✎) icon to open an in-page YAML editor; save
  writes back to `<persona>/profile.yaml` via `PUT /api/personas/{name}`.
  Disk sizes are loaded lazily after first paint via
  `GET /api/personas/sizes` (a single directory-size scan over Octowright's
  profile config dir).
- **Closed-session cleanup** — closed-session rows expose an `⊗` delete
  button on hover; clicking removes the JSONL recording, video, trace, and
  screenshots from disk via `DELETE /api/sessions/{id}/recording`. Live
  sessions reject the call with 409 (close them first).
- **Per-session debugger** — click any session for a two-column page with the
  live browser preview or embedded session video on the left, action timeline
  on the right. Live previews stream JPEG frames over one WebSocket at
  `/api/sessions/{id}/screencast`, with bounded screenshot polling as a
  compatibility fallback.
  Controls include pause/resume plus fullscreen in panel or native browser
  mode. Click any action in the timeline to seek the video to that moment.
  Tabs underneath the timeline switch between **console messages** (filtered
  by level), **downloads** (with a "missing" badge if the file was moved),
  **markdown export**, and **screenshots** (lazy-loaded thumbnail grid).
- **Live updates** — for currently-running sessions, the page opens a
  WebSocket to `/api/sessions/{id}/tail` and appends new events as they
  arrive (no manual refresh). WebSocket frame payloads that are binary are
  intentionally hidden in the UI preview as `[binary payload hidden]`. Full frames
  are still cached to the websocket cache using base64 for safe replay and
  debugging.
- **Trace deep-dive** — a button on each session page spawns
  `npx playwright show-trace` against that session's `.zip` trace, opening
  the official Playwright trace viewer for full per-action inspection
  (network, snapshots, source links). Requires `npx` on PATH.

The markdown tab uses the new `GET /api/sessions/{id}/markdown` endpoint; the
server captures cached markdown on page load and user navigation, and generates
it on demand if a live session hasn't populated the cache yet.

The dashboard is a TypeScript SPA built into `packages/octowright-frontend/`
(Vite + strict tsc + Biome + vitest). It uses `@provide-io/telemetry` for structured
logging so frontend log lines are correlated with the Python server's
`provide.telemetry` calls. The compiled bundle ships inside the wheel; the
frontend has zero runtime dependency on Node — Node is only needed at build
time and for the optional `npx playwright show-trace` deep-dive.

If port 6286 is taken, the server walks up to 5 higher ports and picks the
first free one (or logs a warning and continues without the HTTP layer if all
are busy — MCP keeps running). Override the default with `OCTOWRIGHT_HTTP_PORT`
or bind to a different host with `OCTOWRIGHT_HTTP_HOST` (default `127.0.0.1`).
Binding to `0.0.0.0` only exposes health/static assets by default; sensitive
dashboard, API, and MCP access from another machine also requires
`OCTOWRIGHT_ALLOW_REMOTE_DASHBOARD=1`. Only enable that opt-in on trusted
networks because it exposes live browser state, recordings, traces, downloads,
and the MCP tool surface.

## Concepts: how the pieces relate

Five layers, each building on the one below:

**1. Browser.** A single live Playwright browser — one engine (chromium /
firefox / webkit), one window. Identified by an `instance_id`. Every action you
run against it gets appended to a JSONL recording, and a separate
`BrowserContext` gives it its own cookie jar (so seven parallel Discord tabs
never share auth, even when they all run on WebKit).

**2. Profile.** A directory on disk (`<octowright-config>/profiles/<persona>/<kind>/`)
that stores cookies, localStorage, IndexedDB, and service-worker state
between browser runs. When you pass `profile=dante` to `browser_launch`, the
browser uses a **persistent context** pointed at that directory — close the
browser, re-launch tomorrow, and you're still logged in. Profiles are scoped
per engine; dante on WebKit and dante on Firefox are two distinct profile
dirs under the same persona.

**3. Persona.** A *named identity* that owns profiles across one or more
engines, plus metadata: display name, `default_url`, `default_macros` to run
at launch, `credentials` (references to env vars or shell commands —
secrets themselves are never stored on disk), and an `app` dict for
free-form domain metadata. Think of a persona as "dante — my Discord power
user across all three engines", and a profile as one engine-specific piece
of that identity. You launch it with `browser_launch profile=dante`;
the resolver (`browser_suggest_for_url`) works out which persona to reuse
when the URL is ambiguous. See [docs/personas.md](https://github.com/livingstaccato/octowright/blob/main/docs/personas.md)
for the full `profile.yaml` shape.

**4. Scenario.** A *pre-declared group of personas to launch together*, each
with a `role` (`player`, `monitor`, `spectator`). Declared in
`<octowright-config>/scenarios/<name>.yaml` (or a Python `build()` function for
dynamic rosters). `scenario_start name=discord-raid` launches all seven
participants in parallel, applies shared fixtures (dialog policy, mock
routes), and runs each participant's startup macros. You can then broadcast a
macro across all participants (`scenario_run_macro`), role-filter
(`role=player`), or drive a single participant by its `instance_id`. See
[docs/scenarios.md](https://github.com/livingstaccato/octowright/blob/main/docs/scenarios.md) for the full spec shape.

**5. Dashboard.** The web UI bundled with `octowright serve` is the visual
projection of everything above. The dashboard page lists every live browser,
every live scenario, recent closed sessions, every persona, every macro;
each session links to a debugger page with embedded video, click-to-seek
action timeline, console messages, downloads, and screenshots. The Playwright
trace viewer is one button away. See the [Dashboard](#dashboard) section
above for what it shows; this layer doesn't add new state — it just makes
the other four layers observable.

**When to reach for which.** A single browser for one-shot exploration. A
named profile when you want login state to survive. A persona when that
identity is worth metadata and credential references. A scenario when you
need N coordinated browsers as a single unit. The dashboard whenever you
want to *see* what's happening rather than ask your MCP client.

## Tools

Every mutating tool takes an `instance_id` returned from `browser_launch`. Each call
appends a record to that instance's JSONL log.

### Browser lifecycle

| Tool | What |
|---|---|
| `browser_launch` | Launch a new headed browser. `kind` = `chromium` / `firefox` / `webkit`. Returns `instance_id`. |
| `browser_suggest_for_url` | Pre-launch: which saved persona owns this URL? Disambiguates `"open discord.com"` requests. |
| `browser_list` | List all live instances. |
| `browser_close` / `browser_close_all` | Close one / all. Protected browsers require `force=True`; `browser_close_all` skips protected browsers unless forced and reports failures. |
| `browser_spawn_roster` | Launch N browsers in parallel from a list of launch specs. |
| `browser_navigate` | Navigate a specific instance. |
| `browser_navigate_back` | Go back one entry in the browser's history. Returns `{ok, url, title}`; `ok=False` when there's no previous page. |
| `browser_open_url` | Open a URL on an existing instance. `target='tab'` (default) appends a new page; `target='window'` calls `window.open(...,'popup',width=W,height=H)` so the OS opens a new window (defaults 1024×768). Returns `{ok, target, page_index, url}`. |
| `browser_resize` | Resize the page viewport to `width × height` CSS pixels (does not resize the OS window). |

### Input

| Tool | What |
|---|---|
| `browser_click` / `browser_type` / `browser_fill` / `browser_press_key` | CSS-selector input. |
| `browser_click_by` / `browser_fill_by` / `browser_get_text_by` | ARIA-locator input (role / label / text / data-testid). |
| `browser_hover` | Hover the cursor over a CSS selector (triggers `:hover` / hover-reveal menus / tooltips). |
| `browser_select_option` | Select one option in a native `<select>` by `value`, `label`, or 0-based `index`. |
| `browser_drag` | Drag-and-drop from `source_selector` onto `target_selector` (Playwright `drag_and_drop`). |
| `browser_set_input_files` | Upload files into an `<input type=file>`. |

Recorded CSS `click` and `fill` actions also capture semantic metadata when
Playwright can resolve it. Macro playback and exported replay scripts try that
ARIA locator first, then fall back to the original CSS selector.
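
The ARIA-first, CSS-second order can be sketched as below. The `aria` key, the `locate_with_fallback` name, and the two lookup callables are all illustrative assumptions; the recorded field names are not documented in this README.

```python
from typing import Callable, Optional


def locate_with_fallback(
    action: dict,
    by_aria: Callable[[dict], Optional[str]],
    by_css: Callable[[str], Optional[str]],
) -> tuple[str, str]:
    """Return (strategy, element), trying captured ARIA metadata before the CSS selector."""
    aria = action.get("aria")  # hypothetical key for the captured semantic metadata
    if aria:
        element = by_aria(aria)
        if element is not None:
            return "aria", element
    # No metadata was captured, or it no longer matches: use the original selector.
    element = by_css(action["selector"])
    if element is None:
        raise LookupError(f"no element for {action['selector']!r}")
    return "css", element
```

The point of this ordering is that role and label lookups tend to survive CSS class churn, while the selector remains a safety net for elements without usable semantics.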

### Inspection

| Tool | What |
|---|---|
| `browser_screenshot` | PNG to disk. |
| `browser_snapshot` | Accessibility tree (defaults to `body`). |
| `browser_read_markdown` | Cached Markdown representation (highly token-efficient for reading). |
| `browser_evaluate` | Run JS in the page. |
| `browser_console_messages` | Collected console output since launch (cursor pagination). |
| `browser_wait_for` | Wait for selector / text / network-idle. |
| `browser_recording_path` | Path to the JSONL action log for this instance. |
| `browser_tail_recording` | Stream new JSONL events appended since a byte cursor — for live monitoring without `tail -f`. |
| `browser_export_script` | Emit a Playwright Python (or TS) script that replays the log. |
| `browser_open_trace` | Open the Playwright trace viewer (`npx playwright show-trace`) on this session's `.zip`. |
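
To illustrate the byte-cursor idea behind `browser_tail_recording`, here is a minimal sketch that reads a JSONL file directly. `tail_since` is a hypothetical helper, not the tool's implementation; it only assumes one JSON object per line.

```python
import json
from pathlib import Path


def tail_since(path: Path, cursor: int = 0) -> tuple[list[dict], int]:
    """Return (events, new_cursor) for complete JSONL lines written after `cursor`."""
    events = []
    with path.open("rb") as fh:
        fh.seek(cursor)
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                # End of file, or a half-written line: leave it for the next poll.
                break
            cursor += len(line)
            if line.strip():
                events.append(json.loads(line))
    return events, cursor
```

A caller keeps the returned cursor and passes it back on the next poll, so each event is delivered once and a partially written line is never parsed.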

### Assertions

| Tool | What |
|---|---|
| `browser_expect_url` / `browser_expect_text` / `browser_expect_selector` / `browser_expect_js` | Recording-aware assertions (raise on mismatch, append to JSONL). |

### Network & dialogs

| Tool | What |
|---|---|
| `browser_set_dialog_policy` | accept / dismiss / manual for `confirm()` / `alert()` / `prompt()`. Default: dismiss. |
| `browser_mock_route` / `browser_unmock_route` | Stub network responses for deterministic tests. |
| `browser_network_requests` | List captured HTTP/HTTPS requests for an instance. Optional substring `url` / `method` / `resource_type` filters; pass `since` cursor for incremental polling. |

### Pages, frames, downloads

| Tool | What |
|---|---|
| `page_list` / `page_switch` / `page_close` | Manage tabs + popups. |
| `browser_switch_frame` / `browser_reset_frame` / `browser_list_frames` | Drive an iframe. |
| `browser_downloads` / `browser_wait_for_download` | Captured downloads (cursor pagination). |

### Profiles, personas, scenarios, macros, goldens

| Tool | What |
|---|---|
| `profile_list` / `profile_delete` | Saved per-engine profile dirs. |
| `persona_list` / `persona_get` / `persona_create` / `persona_delete` | Identity-layer over profiles. |
| `persona_credentials_check` | Pre-flight: resolve every credential reference without launching a browser. |
| `scenario_list` / `scenario_start` / `scenario_status` / `scenario_stop` / `scenario_run_macro` / `scenario_participants` / `scenario_run_as_test` / `scenario_tail` | Multi-browser orchestration + verify-as-test. |
| `scenario_plan` | Dry-run: show resolved per-participant launch_kwargs without launching anything. |
| `macro_save` / `macro_list` / `macro_run` / `macro_run_sequence` / `macro_delete` | Named, parameterised action sequences. Supports `macro_call` for reusable submacros. |
| `macro_compile` | Compile the YAML macro DSL to canonical JSON; dry-run by default, save with `write=true`. |
| `macro_lint` | Static-analysis pass on a saved macro: missing required fields, unknown actions, unparameterized credential-shaped strings, empty conditional branches. |
| `golden_save` / `golden_assert` / `golden_list` / `golden_delete` | Accessibility-tree snapshot diffs. |
| `run_test_suite` | Run every `[test]`-tagged macro in a directory; emit JUnit XML. |

### Housekeeping

| Tool | What |
|---|---|
| `octowright_dashboard_url` | Returns the localhost dashboard URL (with optional `session_id` deep-link). |
| `octowright_check_takeover` | Detect competing Playwright MCP plugins in `.mcp.json` / `~/.claude.json`; report scope + suggested actions. |
| `recordings_cleanup` | Prune recording artefacts older than N days. Dry-run by default. |

## Persistent profiles (Discord, Slack, N-login-per-app)

By default `browser_launch` creates an ephemeral browser — cookies, localStorage, and
IndexedDB die on close. To **keep login state across runs**, pass a `profile` name:

```
browser_launch kind=webkit profile=disc-1 url=https://discord.com/login
```

Each `(kind, profile)` pair gets its own on-disk user-data-dir under Octowright's config dir.
First launch opens a fresh browser; after you log in manually, closing the browser flushes
state to disk. The next launch with the same `profile` skips the login (Discord /
Slack / etc. treat it as a returning session).

**Cookie isolation:** each live browser has its own `BrowserContext`, so seven logged-in
Discord tabs you run in parallel never share cookies, localStorage, or IndexedDB — even
if they're all `kind=webkit`.

**Window title format.** Every page's `document.title` is rewritten on the fly to
end with ` (<persona-emoji><engine-emoji>) [<profile>]` — so the page's own title
leads and the badge tails. Example: `Yahoo | Mail, Weather, … (🐬🦊) [microdosing]`
in firefox, or `Yahoo | Mail, Weather, … (🐬🧭) [microdosing]` in webkit. The
persona emoji is hash-picked from a curated 33-pick pool keyed off the persona
name (deterministic, same emoji every time); the engine emoji is fixed
(🌐 chromium · 🦊 firefox · 🧭 webkit). When parallel windows pile up in cmd-` /
the Window menu / a tab strip, the suffix lets you tell them apart at a
glance even after deep navigation. Override the emoji by setting `emoji:` in
the persona's `profile.yaml`, or disable the corner badge entirely with
`badge=False` on `browser_launch` (the title injection has no off-switch
short of editing the launch — it's purely a string rewrite, no DOM nodes).

Example — seven Discord accounts on seven WebKit windows, reusable later:

```
# First time: open all seven, log each one in manually
browser_launch kind=webkit profile=disc-1 url=https://discord.com/login label=acct-1
browser_launch kind=webkit profile=disc-2 url=https://discord.com/login label=acct-2
...
browser_launch kind=webkit profile=disc-7 url=https://discord.com/login label=acct-7

# Close them — profiles flush to disk
browser_close_all

# Days later: re-launch and skip login entirely
browser_launch kind=webkit profile=disc-1 url=https://discord.com/app
...
```

Protected sessions are intended for user-facing windows. Any close-capable
tool refuses them unless the call explicitly confirms with `force=True`;
this includes `browser_close`, `browser_close_all`, and
`browser_capture_and_close`.

`profile_list` enumerates saved profiles; `profile_delete` wipes one (refuses while a
live instance is using it). Exported replay scripts embed the absolute `user_data_dir`
path, so they work on the same machine but are **not portable across machines** when a
profile is involved.

## Personas — identity layer over engine profiles

Every browser profile belongs to a **persona**: a named identity with metadata,
credential references, and optional default URL + startup macros. A persona can
have browser profiles for multiple engines (WebKit, Firefox, Chromium); each
engine profile is a child directory.

```
<octowright-config>/profiles/
├── dante/
│   ├── profile.yaml     # persona metadata
│   ├── webkit/          # dante's WebKit browser state
│   └── chromium/        # dante's Chromium browser state
└── tim/
    ├── profile.yaml
    └── webkit/
```

`profile.yaml` declares display name, default URL + macros, credential
references (read from env vars or shell commands at use time; never stored),
and free-form app metadata:

```yaml
name: dante
display_name: Dante Alighieri
default_url: https://discord.com/app
default_macros: [discord-login]
credentials:
  email_env: DANTE_EMAIL
  password_cmd: "op read op://Personal/dante/password"
app:
  discord_user_id: "1234"
  role: player
```

MCP tools: `persona_list` / `persona_get` / `persona_create` / `persona_delete` /
`persona_credentials_check`.
CLI: `octowright persona list|show|create|delete`.

**Credentials pre-flight.** Before launching a scenario whose startup macros
need logins, call `persona_credentials_check name=dante` to verify every
`*_env` / `*_cmd` reference actually resolves. The report lists each
credential, its source (env var or shell command) and the reference itself,
plus per-field `ok`/`error` — the resolved secret is never included. Use
this to avoid the classic "logged in 6 of 7 windows, then discovered the
env var was unset on #7" failure mode.

Full reference: [docs/personas.md](https://github.com/livingstaccato/octowright/blob/main/docs/personas.md).

## Macros — reusable parameterized action sequences

Turn a recorded browser session into a named, reusable macro. Capture a login flow
once, replay it with different credentials later. Example workflow:

```
# 1. Manually log into Discord on a live instance
browser_launch kind=webkit profile=disc-1 url=https://discord.com/login label=acct-1
# ... fill email, password, submit ...

# 2. Snapshot those actions as a macro, telling Octowright which literal values
#    to treat as parameters:
macro_save instance_id=<id> name=discord-login \
           parameters={"email":"me@octowright.test","password":"hunter2"}

# 3. Days later, against a fresh instance, replay it with different creds:
browser_launch kind=webkit profile=disc-2 url=https://discord.com/login label=acct-2
macro_run instance_id=<new-id> name=discord-login \
          args={"email":"other@octowright.test","password":"correcthorsebatterystaple"}
```

`macro_list` enumerates saved macros; `macro_delete` removes one. Macros live at
`${XDG_CONFIG_HOME:-~/.config}/octowright/macros/<name>.json` on POSIX, or
`%APPDATA%\octowright\macros\<name>.json` on Windows. Override with
`OCTOWRIGHT_MACROS_DIR`.

Lifecycle actions (`launch`, `close`, `snapshot`) are dropped by default — macros
are the reusable middle of a flow, not the wrapper. Pass `include_launch=True` on
`macro_save` if you need the initial navigation baked in.
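
The save-then-replay flow can be sketched as a pair of pure functions. The `{{name}}` placeholder syntax, the action field names, and the reading that `include_launch` keeps only the `launch` step are assumptions for illustration; the on-disk macro format is described in docs/macros.md.

```python
LIFECYCLE = {"launch", "close", "snapshot"}


def _placeholder(value, by_literal: dict[str, str]):
    """Swap a recorded literal for its {{name}} placeholder when it is a parameter."""
    if isinstance(value, str) and value in by_literal:
        return "{{" + by_literal[value] + "}}"
    return value


def save_macro(actions: list[dict], parameters: dict[str, str],
               include_launch: bool = False) -> list[dict]:
    """Drop lifecycle steps and parameterise the literal values named in `parameters`."""
    by_literal = {literal: name for name, literal in parameters.items()}
    steps = []
    for action in actions:
        kind = action["action"]
        if kind in LIFECYCLE and not (include_launch and kind == "launch"):
            continue
        steps.append({key: _placeholder(value, by_literal) for key, value in action.items()})
    return steps


def render_macro(steps: list[dict], args: dict[str, str]) -> list[dict]:
    """Substitute call-time arguments; a missing argument raises KeyError."""
    def fill(value):
        if isinstance(value, str) and value.startswith("{{") and value.endswith("}}"):
            return args[value[2:-2]]
        return value
    return [{key: fill(value) for key, value in step.items()} for step in steps]
```

Keying the substitution on the recorded literal is what lets one captured login serve several accounts: the macro stores the parameter name, never the original credential.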

**Caveat:** JSONL macros break when the target site changes its DOM (Discord
rewrites its CSS classes frequently). Recorded `click` and `fill` actions use
captured ARIA metadata first when available, then fall back to CSS, but macros
are still short-term automation — when a macro breaks, re-record rather than
hand-patch.

### Conditional / branching actions

For sites that ship multiple DOM versions of the same flow, three action types
let one macro cover all of them. Hand-author these by editing the JSON; record
the linear baseline first, then wrap fragile steps:

- **`if_selector`** — predicate on selector presence; runs `then` or `else`.
  ```json
  {"action": "if_selector", "selector": ".cookie-banner", "present": true,
   "then": [{"action": "click", "selector": ".accept-cookies"}]}
  ```
- **`try`** — best-effort sub-sequence that SUPPRESSES errors. Use for
  optional steps like dismissing a one-off banner that may or may not exist.
  ```json
  {"action": "try", "actions": [
      {"action": "click", "selector": "#optional-popup-close"}
  ]}
  ```
- **`try_each`** — branches in order; succeeds on the first whose every
  action completes; raises if all fail. The "v1 OR v2 OR v3" hammer.
  ```json
  {"action": "try_each", "branches": [
      [{"action": "click", "selector": "[aria-label='Close']"}],
      [{"action": "click", "selector": "button.dismiss"}],
      [{"action": "press_key", "key": "Escape"}]
  ]}
  ```

These nest freely — `if_selector` inside `try_each` inside `try` works as you
would expect. See `examples/macros/conditional-discord-modal-dismiss.json` for
a real-world pattern.
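
The semantics of the three action types can be shown with a toy interpreter over a fake page. Everything here (`FakePage`, `run_actions`) is illustrative and is not Octowright's macro runner; it only follows the behaviour described in the list above.

```python
class FakePage:
    """Stand-in for a browser page: knows which selectors exist and logs what ran."""

    def __init__(self, present: set[str]):
        self.present = present
        self.log: list[str] = []

    def has(self, selector: str) -> bool:
        return selector in self.present

    def click(self, selector: str) -> None:
        if not self.has(selector):
            raise LookupError(selector)
        self.log.append(f"click {selector}")

    def press_key(self, key: str) -> None:
        self.log.append(f"press {key}")


def run_actions(page: FakePage, actions: list[dict]) -> None:
    for step in actions:
        kind = step["action"]
        if kind == "click":
            page.click(step["selector"])
        elif kind == "press_key":
            page.press_key(step["key"])
        elif kind == "if_selector":
            matched = page.has(step["selector"]) == step.get("present", True)
            run_actions(page, step.get("then" if matched else "else", []))
        elif kind == "try":
            try:
                run_actions(page, step["actions"])
            except Exception:
                pass  # best-effort: errors in the sub-sequence are suppressed
        elif kind == "try_each":
            for branch in step["branches"]:
                try:
                    run_actions(page, branch)
                    break  # first branch whose every action completes wins
                except Exception:
                    continue
            else:
                raise RuntimeError("every try_each branch failed")
        else:
            raise ValueError(f"unknown action {kind!r}")
```

In this sketch a branch that fails halfway does not undo the steps it already ran, so branches are best kept short and free of side effects before their first fragile step.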

Full reference: [docs/macros.md](https://github.com/livingstaccato/octowright/blob/main/docs/macros.md).

## Scenarios — coordinated multi-browser orchestration

A scenario is a named group of browser instances launched together. Spin up N
players + a monitoring window + a main-site window with one call; each
instance is a regular `BrowserSession` you can drive per-participant (via
`instance_id`) using all the normal `browser_*` tools.

Declare scenarios in Octowright's config dir:

```yaml
name: discord-raid
description: 7 players + 1 monitor + 1 main-site spectator
participants:
  - persona: dante
    kind: webkit
    role: player
  - persona: ops
    kind: firefox
    role: monitor
    url: https://octowright.com/monitor
fixtures:
  mock_routes:
    - pattern: "**/api/time"
      body: '{"now":"2026-04-24T00:00:00Z"}'
  dialog_policy: dismiss
teardown:
  macro: cleanup-session
verify:
  player: assert-in-server
  monitor: assert-monitor-healthy
```

Or as Python for dynamic participant lists — `<name>.py` exposes `def build() -> Scenario`.

Lifecycle:

- `scenario_start <name>` launches all participants in parallel, applies
  fixtures, and runs per-participant startup macros. Browsers **stay open**.
- `scenario_run_macro <id> <macro> [role=...]` broadcasts a macro across
  participants (optionally role-filtered). Per-participant results are returned.
- Any single participant can still be driven by `instance_id` with the regular
  `browser_*` tools.
- `scenario_stop <id>` runs the teardown macro per participant, closes every
  window, returns a summary.
- `scenario_run_as_test <id>` (or `--test` on the CLI) runs `verify` macros
  and produces JUnit XML.

CLI: `octowright scenario list|start [--test --out <xml>]`; the `start`
command blocks until Ctrl-C, then runs teardown and exits.

Full reference: [docs/scenarios.md](https://github.com/livingstaccato/octowright/blob/main/docs/scenarios.md).

## Configuration

All defaults live in `src/octowright/defaults.py` and can be overridden via the
environment variables listed in the table at the end of this section.

On POSIX systems, Octowright follows the XDG Base Directory split:

- Config: `${XDG_CONFIG_HOME:-~/.config}/octowright/` for durable user-authored data.
- State: `${XDG_STATE_HOME:-~/.local/state}/octowright/` for session history, logs, and manifests.
- Cache: `${XDG_CACHE_HOME:-~/.cache}/octowright/` for rebuildable analysis captures.

On Windows, config uses `%APPDATA%\octowright\`, while state and cache use
`%LOCALAPPDATA%\octowright\State\` and `%LOCALAPPDATA%\octowright\Cache\`.
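
The directory split above can be expressed as a small resolver. This is a sketch of the documented layout, not the contents of `defaults.py`; `octowright_dirs` is a hypothetical name, and pure paths are used so either platform's layout can be computed anywhere.

```python
import sys
from collections.abc import Mapping
from pathlib import PurePath, PurePosixPath, PureWindowsPath


def octowright_dirs(env: Mapping[str, str], platform: str = sys.platform,
                    home: str = "~") -> dict[str, PurePath]:
    """Resolve the config, state, and cache roots for the given environment."""
    if platform == "win32":
        roaming = PureWindowsPath(env["APPDATA"])
        local = PureWindowsPath(env["LOCALAPPDATA"])
        return {
            "config": roaming / "octowright",
            "state": local / "octowright" / "State",
            "cache": local / "octowright" / "Cache",
        }

    def xdg(variable: str, fallback: str) -> PurePosixPath:
        # Mirrors ${VAR:-default}: an unset or empty variable uses the fallback.
        value = env.get(variable)
        return PurePosixPath(value) if value else PurePosixPath(home) / fallback

    return {
        "config": xdg("XDG_CONFIG_HOME", ".config") / "octowright",
        "state": xdg("XDG_STATE_HOME", ".local/state") / "octowright",
        "cache": xdg("XDG_CACHE_HOME", ".cache") / "octowright",
    }
```

For real use, pass `os.environ` and `str(Path.home())`; the per-directory variables in the table below still take precedence over these roots.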

| Variable | Default | Description |
|---|---|---|
| `OCTOWRIGHT_DEFAULT_URL` | `https://octowright.com` | Fallback `url` when `browser_launch` omits it. |
| `OCTOWRIGHT_RECORDINGS` | POSIX: `${XDG_STATE_HOME:-~/.local/state}/octowright/sessions/`; Windows: `%LOCALAPPDATA%\octowright\State\sessions\` | Where session artifacts land: JSONL action logs, traces, screenshots, videos, downloads, and markdown captures. |
| `OCTOWRIGHT_CAPTURES_DIR` | POSIX: `${XDG_CACHE_HOME:-~/.cache}/octowright/captures/`; Windows: `%LOCALAPPDATA%\octowright\Cache\captures\` | Where large cached analysis payloads live. |
| `OCTOWRIGHT_CAPTURE_MAX_TOTAL_BYTES` | `52428800` | Size cap for cached analysis captures before oldest captures are pruned. |
| `OCTOWRIGHT_CAPTURE_TTL_SECONDS` | `604800` | Age cap for cached analysis captures. |
| `OCTOWRIGHT_SESSION_MANIFEST` | POSIX: `${XDG_STATE_HOME:-~/.local/state}/octowright/session-manifest.json`; Windows: `%LOCALAPPDATA%\octowright\State\session-manifest.json` | Live-session manifest used for crash recovery/status. |
| `OCTOWRIGHT_PROFILES_DIR` | POSIX: `${XDG_CONFIG_HOME:-~/.config}/octowright/profiles/`; Windows: `%APPDATA%\octowright\profiles\` | Where persistent profiles live. |
| `OCTOWRIGHT_MACROS_DIR` | POSIX: `${XDG_CONFIG_HOME:-~/.config}/octowright/macros/`; Windows: `%APPDATA%\octowright\macros\` | Where saved macros live. |
| `OCTOWRIGHT_SCENARIOS_DIR` | POSIX: `${XDG_CONFIG_HOME:-~/.config}/octowright/scenarios/`; Windows: `%APPDATA%\octowright\scenarios\` | Where scenario specs live. |
| `OCTOWRIGHT_VIEWPORT_W` / `OCTOWRIGHT_VIEWPORT_H` | `1280` / `800` | Default viewport. Used in headless mode and when dimensions are explicitly passed to `browser_launch`. In headed mode with neither set, context launches with `no_viewport=True` so the page tracks the OS window. |
| `OCTOWRIGHT_HEADLESS` | auto | Explicit `0` / `1` overrides headless mode. Auto-detected: headed on macOS or Linux+display, headless on CI (`CI=true`) or Linux without `$DISPLAY` / `$WAYLAND_DISPLAY`. |
| `OCTOWRIGHT_NAV_TIMEOUT_MS` / `OCTOWRIGHT_ACTION_TIMEOUT_MS` | — | Per-navigation / per-action timeouts. |
| `OCTOWRIGHT_HTTP_HOST` / `OCTOWRIGHT_HTTP_PORT` | `127.0.0.1` / `6286` | Dashboard bind address. Binding to `0.0.0.0` makes the HTTP sidecar reachable on your network, but sensitive dashboard/API/MCP routes stay blocked unless `OCTOWRIGHT_ALLOW_REMOTE_DASHBOARD=1` is also set. Only enable remote dashboard access on trusted networks because it exposes live browser state and local artifacts. If the port is in use, the server automatically tries up to 5 higher ports. |
| `OCTOWRIGHT_LIVE_SCREENCAST_FPS` | `10` | Positive integer cap for backend live-preview stream FPS and requested frontend `fps`. |
| `OCTOWRIGHT_LIVE_SCREENCAST_QUALITY` | `70` | JPEG quality for live-preview frames, clamped to `1..100`. |
| `OCTOWRIGHT_LIVE_SCREENCAST_FULLSCREEN_MODE` | `native` | Live-preview fullscreen behavior: `native` browser fullscreen or `panel` in-page fullscreen. |
| `OCTOWRIGHT_IDLE_GRACE` | `300` | Seconds before auto-exit when the browser pool is empty. Use `--keep-alive` to disable. |
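
The `OCTOWRIGHT_HEADLESS` auto-detection in the table can be read as a small decision function. The sketch below is one possible interpretation, not the actual implementation: `resolve_headless` is a hypothetical name, the order of the checks (explicit override, then CI, then platform) is an assumption, and so is the headed default for platforms the table does not mention.

```python
def resolve_headless(env, platform):
    """Return True for headless, False for headed.

    `env` is a mapping such as os.environ; `platform` is a sys.platform-style
    string ("darwin", "linux", ...).
    """
    explicit = env.get("OCTOWRIGHT_HEADLESS")
    if explicit in ("0", "1"):
        return explicit == "1"  # an explicit 0/1 always wins

    if env.get("CI", "").lower() == "true":
        return True  # CI runners are treated as headless

    if platform == "darwin":
        return False  # macOS is headed

    if platform.startswith("linux"):
        has_display = bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
        return not has_display  # headed only when a display is available

    return False  # assumption: other platforms default to headed


if __name__ == "__main__":
    import os
    import sys

    print("headless:", resolve_headless(os.environ, sys.platform))
```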

## CLI

`octowright` is a Click-based CLI; subcommands let you do common housekeeping
without going through an MCP client:

| Command | What |
|---|---|
| `octowright serve [--profile=<spec>]` | Run the MCP stdio server + the dashboard HTTP server. This is the default when you invoke `octowright` with no subcommand. Pass `--profile=core` (or `core,macros` etc.) to slim the LLM-visible MCP tool surface — see [Capability profiles](#capability-profiles) below. |
| `octowright init [--force]` | First-run scaffolding: create the standard config dirs, drop a sample persona / scenario / macro, and print the `.mcp.json` registration block with your install path filled in. |
| `octowright selftest` | Print the list of registered MCP tools without needing a live MCP client. Sanity check after install. |
| `octowright test [<dir>] [--kind <engine>] [--tag <tag>] [--out <xml>]` | Run every `[test]`-tagged macro in a directory, emit JUnit XML. |
| `octowright cleanup [--days N] [--apply]` | Prune old recording artefacts (JSONL logs, screenshots, videos, traces). Dry-run by default; `--apply` actually deletes. |
| `octowright takeover [--apply --scope=session\|project\|global --name=<n>]` | Detect competing Playwright MCP plugins in `.mcp.json` / `~/.claude.json` and offer to disable them in favour of octowright. Default is read-only report; `--apply` rewrites the config (with timestamped backup). Reversible — rename back to re-enable. |
| `octowright persona list\|show\|create\|delete` | Manage personas from the terminal. |
| `octowright scenario list\|start [--test --out <xml>] [--watch]` | List scenarios or start one. With `start`, `--watch` streams participant events to stdout in real time, and the command blocks until Ctrl-C. |
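
Both `octowright test` and `octowright scenario start --test` write JUnit XML, which a CI step may want to summarize. The sketch below parses a JUnit-style report with the standard library. The sample document and the `summarize` helper are hypothetical, and the exact attributes Octowright emits are not shown here, so treat the element layout as an assumption based on the common JUnit format.

```python
import xml.etree.ElementTree as ET

# Hypothetical report; a real one would be read from the --out <xml> path.
SAMPLE = """<testsuite name="login-flow" tests="3" failures="1" errors="0">
  <testcase classname="alice" name="verify-dashboard" time="1.2"/>
  <testcase classname="bob" name="verify-dashboard" time="1.4">
    <failure message="expected heading 'Welcome'">assertion failed</failure>
  </testcase>
  <testcase classname="carol" name="verify-dashboard" time="0.9"/>
</testsuite>"""


def summarize(xml_text):
    """Count test cases and list those with a <failure> or <error> child."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so this works whether the root element is
    # <testsuite> or a <testsuites> wrapper.
    cases = list(root.iter("testcase"))
    failed = [
        f"{case.get('classname')}::{case.get('name')}"
        for case in cases
        if case.find("failure") is not None or case.find("error") is not None
    ]
    return {"total": len(cases), "failed": failed}


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

For reports from untrusted sources, a hardened parser such as `defusedxml` is the safer choice; the standard-library parser is used here only to keep the sketch dependency-free.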

## Capability profiles

The full MCP tool surface is currently 125 tools on a core install — every workflow Octowright supports
(browser driving, macros, scenarios, persona management, etc.) shows up in
the LLM's tool schema by default. When the LLM only needs a slice, set
`OCTOWRIGHT_PROFILE` (or pass `--profile` to `octowright serve`) to one or
more comma-separated profile names. Tools not listed in any active profile
are skipped at registration time, so the LLM-visible schema shrinks. Seven
meta/Advisor tools are always registered so agents can inspect Octowright,
find the dashboard, and surface local guidance even under narrow profiles.

| Profile | What | Tool count |
|---|---|---|
| `core` | Minimum to drive a browser end-to-end, including compact DOM and HTTP-first discovery. | 24 |
| `advanced` | Inspection, cached captures, summaries, assertions, viewport controls, and ARIA-locator interactions for stable test automation. | 30 |
| `macros` | Macro record / list / run / lint / repair / compile + artifact bundles. | 15 |
| `scenarios` | Scenario orchestration (multi-browser test setups). | 12 |
| `personas` | Persona + on-disk profile management. | 8 |
| `goldens` | Accessibility-tree snapshot baselines + diff. | 5 |
| `terminals` | Optional terminal sessions; registers only when `octowright[terminal]` is installed. | 7 |
| always-on | Status, storage report, dashboard, takeover detection, and Advisor tools registered under every profile. | 7 |
| `all` (or unset) | Default — every core-install tool registers. | 125 |

```bash
octowright serve --profile=core              # 31 tools — core + always-on
octowright serve --profile=core,macros       # 46 tools — browser + macro pipeline + always-on
octowright serve --profile=core,scenarios    # browser + multi-browser orchestration
```

The active profile shows up in `octowright selftest` and in the
`octowright_status` MCP tool's `profile` block. If a tool you expected is
missing, that's where to look. The profile-to-tool mapping is a dict in
`src/octowright/server/profiles.py`; extend it to add or rebalance groups.
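
To make the registration filter concrete, here is a minimal sketch of how a comma-separated profile spec could select tools. It is not the contents of `profiles.py`: the `PROFILES` mapping is a toy with only a few tools per group, `macro_run` is a made-up tool name, the assignment of tools to groups is a guess, and rejecting unknown profile names is an assumption.

```python
# Always registered, regardless of the active profile (toy subset).
ALWAYS_ON = {"octowright_status", "octowright_advisor_status"}

# Toy mapping: the real groups hold many more tools.
PROFILES = {
    "core": {"browser_launch"},
    "macros": {"macro_run"},  # hypothetical tool name
    "scenarios": {"scenario_start", "scenario_run_macro", "scenario_stop"},
}


def active_tools(spec):
    """Return the tool names registered for a spec such as "core,macros"."""
    names = [n.strip() for n in (spec or "all").split(",") if n.strip()]
    if "all" in names:
        names = list(PROFILES)  # unset or "all" registers every group

    unknown = [n for n in names if n not in PROFILES]
    if unknown:
        raise ValueError(f"unknown profile(s): {', '.join(unknown)}")

    selected = set(ALWAYS_ON)
    for name in names:
        selected |= PROFILES[name]  # union of every active profile
    return selected


if __name__ == "__main__":
    print(sorted(active_tools("core,scenarios")))
```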

## Octowright Advisor

Octowright Advisor is a local, deterministic guidance layer exposed through
always-on MCP tools. `octowright_status` includes an `advisor` block, and
`octowright_advisor_status` returns the same Advisor snapshot directly:
preferences, recent usage summary, and current suggestions. (`octowright_status`
also carries an `upgrade` block on the first run after a version change — present
its highlights to the user as a "what's new" note.)

Advisor currently suggests two things:

- **Macro candidates**: agents call
  `octowright_advisor_record_macro_observation` when they notice repeated
  workflows. Two observations with the same signature produce a
  `macro_candidate` suggestion. Advisor never auto-saves a macro.
- **Profile changes**: recent MCP tool usage can suggest narrowing or expanding
  `OCTOWRIGHT_PROFILE`. Profile-change suggestions can be prompt-only or marked
  `auto_apply` when the `profile_change` preference is `automatic`.
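
The macro-candidate rule (two observations with the same signature) amounts to a counter with a threshold. The sketch below illustrates that rule only. `ObservationLog` and the suggestion dict shape are hypothetical, and the real Advisor persists its state to a JSON file rather than holding it in memory.

```python
from collections import Counter

CANDIDATE_THRESHOLD = 2  # two matching observations produce a suggestion


class ObservationLog:
    """In-memory stand-in for Advisor's record of repeated workflows."""

    def __init__(self):
        self._counts = Counter()

    def record(self, signature):
        """Note one more sighting of a workflow signature."""
        self._counts[signature] += 1

    def suggestions(self):
        """Return suggestions only; nothing is ever saved as a macro here."""
        return [
            {"kind": "macro_candidate", "signature": sig, "observations": count}
            for sig, count in self._counts.items()
            if count >= CANDIDATE_THRESHOLD
        ]


if __name__ == "__main__":
    log = ObservationLog()
    for sig in ("login>open-settings", "search", "login>open-settings"):
        log.record(sig)
    print(log.suggestions())
```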

Agents should check Advisor status after their first `octowright_status` call and before asking a
user whether repeated work should become a macro. Preferences are persisted in
the local Advisor state file and can be changed with
`octowright_advisor_set_preference`. Set `OCTOWRIGHT_ADVISOR_STATE` to isolate
that JSON state file for tests or separate deployments.

## Telemetry

Both halves of Octowright use the `provide.telemetry` family for structured
logging:

- **Python server** uses `provide-telemetry>=0.4.8` (structlog under the hood).
  `setup_telemetry()` is called by `octowright serve`; every module gets a
  logger via `get_logger(__name__)`. Logs land on stderr in development,
  JSON in production (auto-detected).
- **TypeScript dashboard** uses `@provide-io/telemetry@^0.4.7` (pino under
  the hood). `setupTelemetry()` runs at the top of each entrypoint;
  `getLogger('octowright.frontend.{api,tail,dashboard,session,global}')` per
  module. Logger names mirror the Python convention so log lines are easy
  to correlate across the stack.

### Log level and format

```bash
# Human-readable local debugging
export PROVIDE_LOG_LEVEL=DEBUG
export PROVIDE_LOG_FORMAT=pretty
uv run octowright serve
```

```bash
# Machine-friendly production logs
export PROVIDE_LOG_LEVEL=INFO
export PROVIDE_LOG_FORMAT=json
uv run octowright serve
```

`octowright serve --log-level DEBUG` is a convenience wrapper that sets
`PROVIDE_LOG_LEVEL` for the process and spawned daemon.

### OTLP export

Telemetry export is opt-in. To send OpenTelemetry signals to an OTLP collector:

```bash
export PROVIDE_TRACE_ENABLED=1
export PROVIDE_METRICS_ENABLED=1
export OTEL_EXPORTER_OTLP_ENDPOINT=http://127.0.0.1:4318
# optional auth/tenant headers
export OTEL_EXPORTER_OTLP_HEADERS="authorization=Bearer%20TOKEN,x-tenant-id=dev"
uv run octowright serve
```

Signals are no-ops when telemetry exporters are not configured or not available.
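
The `%20` in the `OTEL_EXPORTER_OTLP_HEADERS` example stands for a space. Assuming the usual OpenTelemetry convention of comma-separated `key=value` pairs with URL-encoded values, the string decodes as sketched below. `parse_otlp_headers` is a hypothetical helper for illustration, not part of `provide.telemetry` or the OpenTelemetry SDK.

```python
from urllib.parse import unquote


def parse_otlp_headers(raw):
    """Decode a comma-separated, URL-encoded header list into a dict."""
    headers = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if not sep or not key.strip():
            continue  # skip empty or malformed entries
        headers[key.strip()] = unquote(value.strip())
    return headers


if __name__ == "__main__":
    print(parse_otlp_headers("authorization=Bearer%20TOKEN,x-tenant-id=dev"))
```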

### Playwright traces vs telemetry traces

- **Playwright trace**: per-session browser artifact (`*.trace.zip`) produced
  by Playwright when session tracing is enabled; inspect with
  `npx playwright show-trace`.
- **Telemetry trace**: OpenTelemetry spans emitted by `provide.telemetry`
  (when `PROVIDE_TRACE_ENABLED=1`) and exported to OTLP.

These are separate systems and can be enabled independently.

### HTTP metrics

HTTP request metrics for the debugger/API server are recorded through
`provide.telemetry`'s `TelemetryMiddleware` and exported via OTLP alongside the
rest of octowright's telemetry — RED metrics (`http.requests.total`,
`http.errors.total`, `http.request.duration_ms`) attributed by route, method,
and status code, plus request-id/session-id log correlation and W3C trace
propagation. There is no separate Prometheus scrape endpoint; point an OTLP
collector at the process to consume them. Disable metric recording (propagation
stays on) with:

```bash
export OCTOWRIGHT_HTTP_METRICS=0
```

## Safari caveat

Playwright's `webkit` engine is a **bundled build of upstream WebKit**, not Apple's
Safari.app. It shares the engine family but ships as a separate binary (installed with
`playwright install webkit`). Driving the actual Safari.app with your cookies/profile
requires Apple's `safaridriver` and is not supported by Playwright today.

## Selftest

```bash
uv run octowright selftest
```

Prints the list of registered tools without needing a live MCP client.

## Documentation

- [docs/README.md](https://github.com/livingstaccato/octowright/blob/main/docs/README.md): full documentation index.
- [docs/getting-started.md](https://github.com/livingstaccato/octowright/blob/main/docs/getting-started.md): install, registration, and first successful run.
- [docs/engines.md](https://github.com/livingstaccato/octowright/blob/main/docs/engines.md): engine install/status/reinstall and launch-mode behavior.
- [docs/personas.md](https://github.com/livingstaccato/octowright/blob/main/docs/personas.md): persona/profile lifecycle and credential preflight.
- [docs/macros.md](https://github.com/livingstaccato/octowright/blob/main/docs/macros.md): macro record/replay, linting, and test execution.
- [docs/scenarios.md](https://github.com/livingstaccato/octowright/blob/main/docs/scenarios.md): multi-browser orchestration lifecycle.
- [docs/goldens.md](https://github.com/livingstaccato/octowright/blob/main/docs/goldens.md): baseline capture vs verify policy.
- [docs/ci-quality.md](https://github.com/livingstaccato/octowright/blob/main/docs/ci-quality.md): quality gates and local CI parity commands.
- [docs/troubleshooting.md](https://github.com/livingstaccato/octowright/blob/main/docs/troubleshooting.md): fast diagnosis for common failures.
- [docs/architecture/](https://github.com/livingstaccato/octowright/tree/main/docs/architecture/): system diagrams and architecture references.
- [CHANGELOG.md](https://github.com/livingstaccato/octowright/blob/main/CHANGELOG.md): release summaries.
