Metadata-Version: 2.4
Name: octowright
Version: 0.9.1
Summary: MCP server that drives multiple headed Playwright browsers (chromium/firefox/webkit) in parallel with per-instance action recording and persistent profiles.
Project-URL: Homepage, https://github.com/livingstaccato/octowright
Project-URL: Documentation, https://foundry.provide.io/octowright/
Project-URL: Repository, https://github.com/livingstaccato/octowright
Project-URL: Issues, https://github.com/livingstaccato/octowright/issues
Author-email: Tim Perkins <code@tim.life>, "provide.io" <code@provide.io>
Maintainer-email: "provide.io" <code@provide.io>
License-Expression: Apache-2.0
License-File: LICENSES/Apache-2.0.txt
Keywords: agent,browser-automation,claude,mcp,model-context-protocol,playwright,testing
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: anyio>=4.4.0
Requires-Dist: click>=8.1
Requires-Dist: cryptography<46.0.4,>=46.0.3; sys_platform == 'win32' and (platform_machine == 'ARM64' or platform_machine == 'arm64' or platform_machine == 'aarch64')
Requires-Dist: httpx>=0.27
Requires-Dist: mcp>=1.2.0
Requires-Dist: playwright>=1.47.0
Requires-Dist: provide-telemetry[otel]>=0.4.8
Requires-Dist: pyyaml>=6
Requires-Dist: starlette>=0.40
Requires-Dist: uvicorn[standard]>=0.30
Description-Content-Type: text/markdown

![octowright](https://raw.githubusercontent.com/livingstaccato/octowright/main/docs/images/brand/octowright-banner.png)

# Octowright

An MCP server that lets agentic coding clients drive **many headed Playwright browsers
in parallel** with a **mix of engines** (Chromium, Firefox, WebKit), recording every
action to a JSONL log so a session can later be exported as a standalone Playwright
script.

Octowright is optimized for multi-session, mixed-engine browser orchestration with
persistent profiles, structured recordings, and a local debugger dashboard.

## Get started

Octowright isn't on PyPI yet, so you install from source. Octowright uses
[`uv`](https://docs.astral.sh/uv/) for dependency management — there is no
`pip install` path. If you don't have `uv` yet:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Then, from any directory you'd like Octowright to live under (e.g. `~/code/`):

```bash
git clone https://github.com/livingstaccato/octowright.git
cd octowright
uv sync                                              # install Python deps
uv run playwright install webkit firefox chromium    # install browser binaries
uv run octowright init                               # print MCP registration block + scaffold config
```

Engine-binary management is currently CLI-driven (`playwright install` /
`playwright install --list`), not exposed as Octowright MCP tools.

The last command prints a JSON block to paste into an MCP client config, commonly
`.mcp.json` for a project or `~/.claude.json` for Claude Code. It also creates
Octowright's user config directory with a sample persona, scenario, and macro so
you have something to play with.

The block it prints looks like this — `init` substitutes your install path
into `<absolute-path-to-octowright>`:

```json
{
  "mcpServers": {
    "octowright": {
      "command": "uv",
      "args": [
        "--directory",
        "<absolute-path-to-octowright>",
        "run",
        "octowright",
        "serve"
      ]
    }
  }
}
```
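
If you would rather merge that block into an existing config file than paste it by hand, a minimal Python sketch might look like the following. The helper names `registration_block` and `merge_into` are illustrative and not part of Octowright; `octowright init` remains the supported way to produce the block.

```python
import json
from pathlib import Path


def registration_block(install_path: str) -> dict:
    """Build the same structure that `octowright init` prints."""
    return {
        "mcpServers": {
            "octowright": {
                "command": "uv",
                "args": ["--directory", install_path, "run", "octowright", "serve"],
            }
        }
    }


def merge_into(config_file: Path, install_path: str) -> dict:
    """Add the octowright entry to an MCP config file, keeping other servers."""
    config = json.loads(config_file.read_text()) if config_file.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["octowright"] = registration_block(install_path)["mcpServers"]["octowright"]
    config_file.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `merge_into(Path(".mcp.json"), "/path/to/octowright")` leaves any other registered servers untouched and overwrites only the `octowright` entry.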

Reload your MCP client. The tools appear as `mcp__octowright__browser_launch`, etc.

Verify in 30 seconds: ask your client to launch a webkit browser at `octowright.com`,
click a link, list browsers, then close. The next section walks through that same
flow as a tour of what Octowright actually does.

## Your first 5 minutes

Once installed and registered, ask your MCP client to walk through these in order.
Each step builds on the previous one and shows you what Octowright actually does.

**1. Open a browser.** Ask: *"launch a webkit browser at octowright.com"*. The client
calls `browser_launch kind=webkit url=https://octowright.com`. A real WebKit window opens
on your desktop. The result includes the `instance_id` so the client can target later
actions.

**2. Drive it.** Ask: *"click the 'More information' link"*. The client calls
`browser_click_by text="More information"`. The window navigates. Every action lands
in a JSONL recording on disk.

**3. List what's open.** Ask: *"what browsers are running?"*. `browser_list` returns a
one-line summary like `1 browser: 8a3f.../webkit @ iana.org/help/example-domains`.

**4. Save the session as a macro.** Ask: *"save the last few clicks as a macro called
example-tour"*. The client calls `macro_save`. Now `macro_run name=example-tour`
replays it.

**5. Close the browser, then launch a *named* one.** Ask: *"close that browser, then
launch a chromium browser with profile=demo at github.com"*. The window opens and you
log in (or do anything else that sets site state). When you close it, cookies and
localStorage flush to the profile directory in Octowright's config dir. Re-launch with
the same profile and you're already logged in.

That's the whole tool: parallel browsers, recordings, named macros, persistent
profiles. **Personas** are profiles with metadata (display name, default URL,
credential references); **scenarios** are pre-declared groups of personas you can
spin up with one call. Both are covered later. **The dashboard** ties every
piece together visually; see the [Dashboard](#dashboard) section below.

## Demo catalog

Octowright now ships a curated demo catalog on top of the raw `examples/`
material.

- Repo-facing catalog: [`demo/INDEX.md`](demo/INDEX.md)
- Authored bundle manifests: `demo/bundles/<demo-id>/demo.yaml`

The current hero set promotes seven offline-first bundles:
`first-run-session`, `macro-replay-loop`, `cross-engine-trio`,
`role-based-duo`, `fixture-lab`, `verify-suite`, and
`seven-mix-orchestration`.

`examples/` remains the raw source layer for reusable macros and scenarios.
`demo/bundles/` is the product-facing layer that adds audience/tag metadata,
artifact expectations, regen commands, tutorial-export metadata, and small
deterministic seed assets.

To refresh the generated repo catalog and per-bundle tutorial-export JSON from the manifests:

```bash
uv run python scripts/demos/record_heroes.py
```

## Distributed Skill Pack

Octowright ships a packaged skill named `octowright` for Codex and
project-local plugin manifests for compatible runtimes such as Claude Code and Codex.

Install everything:

```bash
uv run octowright skill install octowright --target all
```

Check status and drift:

```bash
uv run octowright skill status octowright --target all
```

Run diagnostics:

```bash
uv run octowright skill doctor --json
```

Notes:
- Codex skill install target is `$CODEX_HOME/skills` (defaults to `~/.codex/skills`).
- Plugin manifests are written in the current project under `.claude-plugin/plugin.json`
  and `.codex-plugin/plugin.json`.
- Use `--dry-run` to preview writes and `--force` to overwrite existing installs.
- Distributed skill/plugin metadata versions are sourced from `octowright.VERSION`.

## Dashboard

`octowright serve` boots two things in one process: the MCP stdio server (what
your client talks to) and a Starlette HTTP server on `http://127.0.0.1:6286/` (what
*you* look at). One stable URL, pinned in your browser, replaces the old dance
of copying log paths and shelling out to `npx playwright show-trace` by hand.

Ask your MCP client `"give me the octowright dashboard URL"` (it'll call the
`octowright_dashboard_url` MCP tool), or just open the URL directly. You get:

- **Top-level dashboard** — every live browser, every live scenario, recent
  closed sessions, all your personas, all your saved macros. Auto-refreshes
  every 5 seconds.
- **Persona management** — each persona card shows engine list, last-used
  time, and on-disk size (chromium + firefox + webkit + yaml). Hover the
  card and click the edit (✎) icon to open an in-page YAML editor; save
  writes back to `<persona>/profile.yaml` via `PUT /api/personas/{name}`.
  Disk sizes are loaded lazily after first paint via
  `GET /api/personas/sizes` (a single directory-size scan over Octowright's
  profile config dir).
- **Closed-session cleanup** — closed-session rows expose an `⊗` delete
  button on hover; clicking removes the JSONL recording, video, trace, and
  screenshots from disk via `DELETE /api/sessions/{id}/recording`. Live
  sessions reject the call with 409 (close them first).
- **Per-session debugger** — click any session for a two-column page with the
  embedded session video on the left, action timeline on the right. Click any
  action in the timeline to seek the video to that moment. Tabs underneath
  the timeline switch between **console messages** (filtered by level),
  **downloads** (with a "missing" badge if the file was moved),
  **markdown export**, and **screenshots** (lazy-loaded thumbnail grid).
- **Live updates** — for currently-running sessions, the page opens a
  WebSocket to `/api/sessions/{id}/tail` and appends new events as they
  arrive (no manual refresh). Binary WebSocket frame payloads are
  intentionally hidden in the UI preview as `[binary payload hidden]`. Full frames
  are still written to the WebSocket cache, base64-encoded, for safe replay and
  debugging.
- **Trace deep-dive** — a button on each session page spawns
  `npx playwright show-trace` against that session's `.zip` trace, opening
  the official Playwright trace viewer for full per-action inspection
  (network, snapshots, source links). Requires `npx` on PATH.

The markdown tab uses the new `GET /api/sessions/{id}/markdown` endpoint; the
server captures cached markdown on page load and user navigation, and generates
it on demand if a live session hasn't populated the cache yet.
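
The HTTP endpoints named above can also be exercised from a script. The following is a minimal standard-library sketch, assuming the default bind address; the helper names are illustrative, and response bodies are returned as raw bytes because their exact shape is not documented in this section.

```python
import urllib.error
import urllib.parse
import urllib.request

BASE = "http://127.0.0.1:6286"  # default dashboard address; adjust if the port moved


def persona_sizes_request(base: str = BASE) -> urllib.request.Request:
    """Request for the lazy persona disk-size scan."""
    return urllib.request.Request(f"{base}/api/personas/sizes", method="GET")


def delete_recording_request(session_id: str, base: str = BASE) -> urllib.request.Request:
    """Request that removes a closed session's recording artifacts."""
    sid = urllib.parse.quote(session_id, safe="")
    return urllib.request.Request(f"{base}/api/sessions/{sid}/recording", method="DELETE")


def send(request: urllib.request.Request) -> tuple[int, bytes]:
    """Send a request and return (status, body); 409 means the session is still live."""
    try:
        with urllib.request.urlopen(request, timeout=5) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()
```

With `octowright serve` running, `send(delete_recording_request("<session-id>"))` should return a 409 status for a live session, per the cleanup bullet above.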

The dashboard is a TypeScript SPA built into `packages/octowright-frontend/`
(Vite + strict tsc + Biome + vitest). It uses `@provide-io/telemetry` for structured
logging so frontend log lines are correlated with the Python server's
`provide.telemetry` calls. The compiled bundle ships inside the wheel; the
frontend has zero runtime dependency on Node — Node is only needed at build
time and for the optional `npx playwright show-trace` deep-dive.

If port 6286 is taken, the server walks up to 5 higher ports and picks the
first free one (or logs a warning and continues without the HTTP layer if all
are busy — MCP keeps running). Override the default with `OCTOWRIGHT_HTTP_PORT`
or bind to a different host with `OCTOWRIGHT_HTTP_HOST` (default `127.0.0.1`).
Binding to `0.0.0.0` only exposes health/static assets by default; sensitive
dashboard, API, and MCP access from another machine also requires
`OCTOWRIGHT_ALLOW_REMOTE_DASHBOARD=1`. Only enable that opt-in on trusted
networks because it exposes live browser state, recordings, traces, downloads,
and the MCP tool surface.

## Concepts: how the pieces relate

Five layers, each building on the one below:

**1. Browser.** A single live Playwright browser — one engine (chromium /
firefox / webkit), one window. Identified by an `instance_id`. Every action you
run against it gets appended to a JSONL recording, and a separate
`BrowserContext` gives it its own cookie jar (so seven parallel Discord tabs
never share auth, even when they all run on WebKit).

**2. Profile.** A directory on disk (`<octowright-config>/profiles/<persona>/<kind>/`)
that stores cookies, localStorage, IndexedDB, and service-worker state
between browser runs. When you pass `profile=dante` to `browser_launch`, the
browser uses a **persistent context** pointed at that directory — close the
browser, re-launch tomorrow, and you're still logged in. Profiles are scoped
per engine; dante on WebKit and dante on Firefox are two distinct profile
dirs under the same persona.

**3. Persona.** A *named identity* that owns profiles across one or more
engines, plus metadata: display name, `default_url`, `default_macros` to run
at launch, `credentials` (references to env vars or shell commands —
secrets themselves are never stored on disk), and an `app` dict for
free-form domain metadata. Think of a persona as "dante — my Discord power
user across all three engines", and a profile as one engine-specific piece
of that identity. You launch it with `browser_launch profile=dante`;
the resolver (`browser_suggest_for_url`) works out which persona to reuse
when the URL is ambiguous. See [docs/personas.md](https://github.com/livingstaccato/octowright/blob/main/docs/personas.md)
for the full `profile.yaml` shape.

**4. Scenario.** A *pre-declared group of personas to launch together*, each
with a `role` (`player`, `monitor`, `spectator`). Declared in
`<octowright-config>/scenarios/<name>.yaml` (or a Python `build()` function for
dynamic rosters). `scenario_start name=discord-raid` launches every
participant in parallel, applies shared fixtures (dialog policy, mock
routes), and runs each participant's startup macros. You can then broadcast a
macro across all participants (`scenario_run_macro`), role-filter
(`role=player`), or drive a single participant by its `instance_id`. See
[docs/scenarios.md](https://github.com/livingstaccato/octowright/blob/main/docs/scenarios.md) for the full spec shape.

**5. Dashboard.** The web UI bundled with `octowright serve` is the visual
projection of everything above. The dashboard page lists every live browser,
every live scenario, recent closed sessions, every persona, every macro;
each session links to a debugger page with embedded video, click-to-seek
action timeline, console messages, downloads, and screenshots. The Playwright
trace viewer is one button away. See the [Dashboard](#dashboard) section
above for what it shows; this layer doesn't add new state — it just makes
the other four layers observable.

**When to reach for which.** A single browser for one-shot exploration. A
named profile when you want login state to survive. A persona when that
identity is worth metadata and credential references. A scenario when you
need N coordinated browsers as a single unit. The dashboard whenever you
want to *see* what's happening rather than ask your MCP client.

## Tools

Every mutating tool takes an `instance_id` returned from `browser_launch`. Each call
appends a record to that instance's JSONL log.

### Browser lifecycle

| Tool | What |
|---|---|
| `browser_launch` | Launch a new headed browser. `kind` = `chromium` / `firefox` / `webkit`. Returns `instance_id`. |
| `browser_suggest_for_url` | Pre-launch: which saved persona owns this URL? Disambiguates `"open discord.com"` requests. |
| `browser_list` | List all live instances. |
| `browser_close` / `browser_close_all` | Close one / all. Protected browsers require `force=True`; `browser_close_all` skips protected browsers unless forced and reports failures. |
| `browser_spawn_roster` | Launch N browsers in parallel from a list of launch specs. |
| `browser_navigate` | Navigate a specific instance. |
| `browser_navigate_back` | Go back one entry in the browser's history. Returns `{ok, url, title}`; `ok=False` when there's no previous page. |
| `browser_open_url` | Open a URL on an existing instance. `target='tab'` (default) appends a new page; `target='window'` calls `window.open(...,'popup',width=W,height=H)` so the OS opens a new window (defaults 1024×768). Returns `{ok, target, page_index, url}`. |
| `browser_resize` | Resize the page viewport to `width × height` CSS pixels (does not resize the OS window). |

### Input

| Tool | What |
|---|---|
| `browser_click` / `browser_type` / `browser_fill` / `browser_press_key` | CSS-selector input. |
| `browser_click_by` / `browser_fill_by` / `browser_get_text_by` | ARIA-locator input (role / label / text / data-testid). |
| `browser_hover` | Hover the cursor over a CSS selector (triggers `:hover` / hover-reveal menus / tooltips). |
| `browser_select_option` | Select one option in a native `<select>` by `value`, `label`, or 0-based `index`. |
| `browser_drag` | Drag-and-drop from `source_selector` onto `target_selector` (Playwright `drag_and_drop`). |
| `browser_set_input_files` | Upload files into an `<input type=file>`. |

Recorded CSS `click` and `fill` actions also capture semantic (ARIA) metadata when
Playwright can resolve it. Macro playback and exported replay scripts try the
ARIA locator built from that metadata first, then fall back to the original CSS selector.

### Inspection

| Tool | What |
|---|---|
| `browser_screenshot` | PNG to disk. |
| `browser_snapshot` | Accessibility tree (defaults to `body`). |
| `browser_read_markdown` | Cached Markdown representation (highly token-efficient for reading). |
| `browser_evaluate` | Run JS in the page. |
| `browser_console_messages` | Collected console output since launch (cursor pagination). |
| `browser_wait_for` | Wait for selector / text / network-idle. |
| `browser_recording_path` | Path to the JSONL action log for this instance. |
| `browser_tail_recording` | Stream new JSONL events appended since a byte cursor — for live monitoring without `tail -f`. |
| `browser_export_script` | Emit a Playwright Python (or TS) script that replays the log. |
| `browser_open_trace` | Open the Playwright trace viewer (`npx playwright show-trace`) on this session's `.zip`. |
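
To make the byte-cursor idea behind `browser_tail_recording` concrete, here is a minimal sketch that tails a JSONL file directly from disk. It is not the tool's implementation; `tail_jsonl` is a hypothetical helper, and it makes no assumption about the fields inside each record.

```python
import json
from pathlib import Path


def tail_jsonl(path: Path, cursor: int = 0) -> tuple[list[dict], int]:
    """Return events appended after byte offset `cursor`, plus the new cursor."""
    with path.open("rb") as fh:
        fh.seek(cursor)
        chunk = fh.read()
    # Only consume complete lines; a partially written record waits for the next call.
    end = chunk.rfind(b"\n") + 1
    events = [json.loads(line) for line in chunk[:end].splitlines() if line.strip()]
    return events, cursor + end
```

Pass the returned cursor back in on the next call to receive only the records appended since then.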

### Assertions

| Tool | What |
|---|---|
| `browser_expect_url` / `browser_expect_text` / `browser_expect_selector` / `browser_expect_js` | Recording-aware assertions (raise on mismatch, append to JSONL). |

### Network & dialogs

| Tool | What |
|---|---|
| `browser_set_dialog_policy` | accept / dismiss / manual for `confirm()` / `alert()` / `prompt()`. Default: dismiss. |
| `browser_mock_route` / `browser_unmock_route` | Stub network responses for deterministic tests. |
| `browser_network_requests` | List captured HTTP/HTTPS requests for an instance. Optional substring `url` / `method` / `resource_type` filters; pass `since` cursor for incremental polling. |

### Pages, frames, downloads

| Tool | What |
|---|---|
| `page_list` / `page_switch` / `page_close` | Manage tabs + popups. |
| `browser_switch_frame` / `browser_reset_frame` / `browser_list_frames` | Drive an iframe. |
| `browser_downloads` / `browser_wait_for_download` | Captured downloads (cursor pagination). |

### Profiles, personas, scenarios, macros, goldens

| Tool | What |
|---|---|
| `profile_list` / `profile_delete` | Saved per-engine profile dirs. |
| `persona_list` / `persona_get` / `persona_create` / `persona_delete` | Identity-layer over profiles. |
| `persona_credentials_check` | Pre-flight: resolve every credential reference without launching a browser. |
| `scenario_list` / `scenario_start` / `scenario_status` / `scenario_stop` / `scenario_run_macro` / `scenario_participants` / `scenario_run_as_test` / `scenario_tail` | Multi-browser orchestration + verify-as-test. |
| `scenario_plan` | Dry-run: show resolved per-participant launch_kwargs without launching anything. |
| `macro_save` / `macro_list` / `macro_run` / `macro_run_sequence` / `macro_delete` | Named, parameterised action sequences. Supports `macro_call` for reusable submacros. |
| `macro_compile` | Compile the YAML macro DSL to canonical JSON; dry-run by default, save with `write=true`. |
| `macro_lint` | Static-analysis pass on a saved macro: missing required fields, unknown actions, unparameterized credential-shaped strings, empty conditional branches. |
| `golden_save` / `golden_assert` / `golden_list` / `golden_delete` | Accessibility-tree snapshot diffs. |
| `run_test_suite` | Run every `[test]`-tagged macro in a directory; emit JUnit XML. |

### Housekeeping

| Tool | What |
|---|---|
| `octowright_dashboard_url` | Returns the localhost dashboard URL (with optional `session_id` deep-link). |
| `octowright_check_takeover` | Detect competing Playwright MCP plugins in `.mcp.json` / `~/.claude.json`; report scope + suggested actions. |
| `recordings_cleanup` | Prune old recording artefacts older than N days. Dry-run by default. |

## Persistent profiles (Discord, Slack, N-login-per-app)

By default `browser_launch` creates an ephemeral browser — cookies, localStorage, and
IndexedDB die on close. To **keep login state across runs**, pass a `profile` name:

```
browser_launch kind=webkit profile=disc-1 url=https://discord.com/login
```

Each `(kind, profile)` pair gets its own on-disk user-data-dir under Octowright's config dir.
First launch opens a fresh browser; after you log in manually, closing the browser flushes
state to disk. The next launch with the same `profile` skips the login (Discord /
Slack / etc. treat it as a returning session).

**Cookie isolation:** each live browser has its own `BrowserContext`, so seven logged-in
Discord tabs you run in parallel never share cookies, localStorage, or IndexedDB — even
if they're all `kind=webkit`.

**Window title format.** Every page's `document.title` is rewritten on the fly to
end with ` (<persona-emoji><engine-emoji>) [<profile>]` — so the page's own title
leads and the badge tails. Example: `Yahoo | Mail, Weather, … (🐬🦊) [microdosing]`
in firefox, or `Yahoo | Mail, Weather, … (🐬🧭) [microdosing]` in webkit. The
persona emoji is hash-picked from a curated 33-pick pool keyed off the persona
name (deterministic, same emoji every time); the engine emoji is fixed
(🌐 chromium · 🦊 firefox · 🧭 webkit). When parallel windows pile up in cmd-` /
the Window menu / a tab strip, the suffix lets you tell them apart at a
glance even after deep navigation. Override the emoji by setting `emoji:` in
the persona's `profile.yaml`, or disable the corner badge entirely with
`badge=False` on `browser_launch`. The title suffix itself has no off-switch
short of editing the launch; it is purely a string rewrite and adds no DOM nodes.

Example — seven Discord accounts on seven WebKit windows, reusable later:

```
# First time: open all seven, log each one in manually
browser_launch kind=webkit profile=disc-1 url=https://discord.com/login label=acct-1
browser_launch kind=webkit profile=disc-2 url=https://discord.com/login label=acct-2
...
browser_launch kind=webkit profile=disc-7 url=https://discord.com/login label=acct-7

# Close them — profiles flush to disk
browser_close_all

# Days later: re-launch and skip login entirely
browser_launch kind=webkit profile=disc-1 url=https://discord.com/app
...
```

Protected sessions are intended for user-facing windows. Any close-capable
tool refuses them unless the call explicitly confirms with `force=True`;
this includes `browser_close`, `browser_close_all`, and
`browser_capture_and_close`.

`profile_list` enumerates saved profiles; `profile_delete` wipes one (refuses while a
live instance is using it). Exported replay scripts embed the absolute `user_data_dir`
path, so they work on the same machine but are **not portable across machines** when a
profile is involved.

## Personas — identity layer over engine profiles

Every browser profile belongs to a **persona**: a named identity with metadata,
credential references, and optional default URL + startup macros. A persona can
have browser profiles for multiple engines (WebKit, Firefox, Chromium); each
engine profile is a child directory.

```
<octowright-config>/profiles/
├── dante/
│   ├── profile.yaml     # persona metadata
│   ├── webkit/          # dante's WebKit browser state
│   └── chromium/        # dante's Chromium browser state
└── tim/
    ├── profile.yaml
    └── webkit/
```

`profile.yaml` declares display name, default URL + macros, credential
references (read from env vars or shell commands at use time; never stored),
and free-form app metadata:

```yaml
name: dante
display_name: Dante Alighieri
default_url: https://discord.com/app
default_macros: [discord-login]
credentials:
  email_env: DANTE_EMAIL
  password_cmd: "op read op://Personal/dante/password"
app:
  discord_user_id: "1234"
  role: player
```

MCP tools: `persona_list` / `persona_get` / `persona_create` / `persona_delete` /
`persona_credentials_check`.
CLI: `octowright persona list|show|create|delete`.

**Credentials pre-flight.** Before launching a scenario whose startup macros
need logins, call `persona_credentials_check name=dante` to verify every
`*_env` / `*_cmd` reference actually resolves. The report lists each
credential, its source (env var or shell command) and the reference itself,
plus per-field `ok`/`error` — the resolved secret is never included. Use
this to avoid the classic "logged in 6 of 7 windows, then discovered the
env var was unset on #7" failure mode.
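
A pre-flight of this kind can be sketched as follows. This is an illustration of the idea, not Octowright's code: `check_credentials` is a hypothetical name, the report keys mirror the description above, and running `*_cmd` references through `shlex.split` rather than a shell is an assumption.

```python
import os
import shlex
import subprocess


def check_credentials(credentials: dict[str, str], env=os.environ) -> list[dict]:
    """Resolve each *_env / *_cmd reference and report ok/error, never the secret."""
    report = []
    for field, ref in credentials.items():
        entry = {"field": field, "source": "unknown", "reference": ref,
                 "ok": False, "error": None}
        if field.endswith("_env"):
            entry["source"] = "env"
            if env.get(ref):
                entry["ok"] = True
            else:
                entry["error"] = f"environment variable {ref} is unset or empty"
        elif field.endswith("_cmd"):
            entry["source"] = "cmd"
            try:
                done = subprocess.run(shlex.split(ref), capture_output=True,
                                      text=True, timeout=10)
            except (OSError, ValueError, IndexError, subprocess.TimeoutExpired) as exc:
                entry["error"] = f"command failed: {type(exc).__name__}"
            else:
                if done.returncode != 0:
                    entry["error"] = f"command exited with {done.returncode}"
                elif not done.stdout.strip():
                    entry["error"] = "command printed nothing"
                else:
                    entry["ok"] = True  # stdout is checked but deliberately not stored
        else:
            entry["error"] = "expected a field ending in _env or _cmd"
        report.append(entry)
    return report
```

The report carries the reference (for example the variable name) but never the resolved value, which matches the guarantee described above.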

Full reference: [docs/personas.md](https://github.com/livingstaccato/octowright/blob/main/docs/personas.md).

## Macros — reusable parameterized action sequences

Turn a recorded browser session into a named, reusable macro. Capture a login flow
once, replay it with different credentials later. Example workflow:

```
# 1. Manually log into Discord on a live instance
browser_launch kind=webkit profile=disc-1 url=https://discord.com/login label=acct-1
# ... fill email, password, submit ...

# 2. Snapshot those actions as a macro, telling Octowright which literal values
#    to treat as parameters:
macro_save instance_id=<id> name=discord-login \
           parameters={"email":"me@octowright.test","password":"hunter2"}

# 3. Days later, against a fresh instance, replay it with different creds:
browser_launch kind=webkit profile=disc-2 url=https://discord.com/login label=acct-2
macro_run instance_id=<new-id> name=discord-login \
          args={"email":"other@octowright.test","password":"correcthorsebatterystaple"}
```
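
Conceptually, `macro_save` swaps the listed literal values for named parameters, and `macro_run` swaps them back with new arguments. The sketch below illustrates that round trip only; the `{{name}}` placeholder syntax, the `value` field, and the helper names are assumptions for illustration, not Octowright's stored format.

```python
def _map_strings(node, fn):
    """Apply fn to every string inside nested lists and dicts."""
    if isinstance(node, str):
        return fn(node)
    if isinstance(node, list):
        return [_map_strings(item, fn) for item in node]
    if isinstance(node, dict):
        return {key: _map_strings(value, fn) for key, value in node.items()}
    return node


def parameterize(actions: list[dict], parameters: dict[str, str]) -> list[dict]:
    """Replace recorded literal values with {{name}} placeholders."""
    def swap(text: str) -> str:
        for name, literal in parameters.items():
            text = text.replace(literal, "{{" + name + "}}")
        return text
    return _map_strings(actions, swap)


def render(actions: list[dict], args: dict[str, str]) -> list[dict]:
    """Fill {{name}} placeholders with the arguments supplied at run time."""
    def fill(text: str) -> str:
        for name, value in args.items():
            text = text.replace("{{" + name + "}}", value)
        return text
    return _map_strings(actions, fill)
```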

`macro_list` enumerates saved macros; `macro_delete` removes one. Macros live at
`${XDG_CONFIG_HOME:-~/.config}/octowright/macros/<name>.json` on POSIX, or
`%APPDATA%\octowright\macros\<name>.json` on Windows. Override with
`OCTOWRIGHT_MACROS_DIR`.

Lifecycle actions (`launch`, `close`, `snapshot`) are dropped by default — macros
are the reusable middle of a flow, not the wrapper. Pass `include_launch=True` on
`macro_save` if you need the initial navigation baked in.

**Caveat:** Macros recorded from JSONL logs break when the target site changes its
DOM (Discord rewrites its CSS classes frequently). Recorded `click` and `fill` actions
use captured ARIA metadata first when available, then fall back to CSS, but macros
are still short-term automation: when a macro breaks, re-record rather than
hand-patch.

### Conditional / branching actions

For sites that ship multiple DOM versions of the same flow, three action types
let one macro cover all of them. Hand-author these by editing the JSON; record
the linear baseline first, then wrap fragile steps:

- **`if_selector`** — predicate on selector presence; runs `then` or `else`.
  ```json
  {"action": "if_selector", "selector": ".cookie-banner", "present": true,
   "then": [{"action": "click", "selector": ".accept-cookies"}]}
  ```
- **`try`** — best-effort sub-sequence that SUPPRESSES errors. Use for
  optional steps like dismissing a one-off banner that may or may not exist.
  ```json
  {"action": "try", "actions": [
      {"action": "click", "selector": "#optional-popup-close"}
  ]}
  ```
- **`try_each`** — branches in order; succeeds on the first whose every
  action completes; raises if all fail. The "v1 OR v2 OR v3" hammer.
  ```json
  {"action": "try_each", "branches": [
      [{"action": "click", "selector": "[aria-label='Close']"}],
      [{"action": "click", "selector": "button.dismiss"}],
      [{"action": "press_key", "key": "Escape"}]
  ]}
  ```

These nest freely — `if_selector` inside `try_each` inside `try` works as you
would expect. See `examples/macros/conditional-discord-modal-dismiss.json` for
a real-world pattern.
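
The semantics of the three action types can be summarized with a toy interpreter. This is a sketch of the behavior described above, run against a fake page object; `FakePage` and `run_actions` are hypothetical names and the real executor will differ.

```python
class FakePage:
    """Stand-in for a browser page: knows which selectors exist, logs what ran."""

    def __init__(self, selectors: set[str]):
        self.selectors = selectors
        self.log: list[str] = []

    def exists(self, selector: str) -> bool:
        return selector in self.selectors

    def click(self, selector: str) -> None:
        if not self.exists(selector):
            raise LookupError(f"no element matches {selector}")
        self.log.append(f"click {selector}")

    def press_key(self, key: str) -> None:
        self.log.append(f"press_key {key}")


def run_actions(actions: list[dict], page: FakePage) -> None:
    for step in actions:
        kind = step["action"]
        if kind == "click":
            page.click(step["selector"])
        elif kind == "press_key":
            page.press_key(step["key"])
        elif kind == "if_selector":
            matched = page.exists(step["selector"]) == step.get("present", True)
            run_actions(step.get("then" if matched else "else", []), page)
        elif kind == "try":
            try:
                run_actions(step["actions"], page)
            except Exception:
                pass  # best-effort: errors are suppressed
        elif kind == "try_each":
            for branch in step["branches"]:
                try:
                    run_actions(branch, page)
                    break  # first branch whose every action completes wins
                except Exception:
                    continue
            else:
                raise RuntimeError("every try_each branch failed")
        else:
            raise ValueError(f"unknown action {kind!r}")
```

Note that this toy version does not undo the side effects of a branch that fails partway through, so earlier steps of a failed branch remain applied when the next branch runs.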

Full reference: [docs/macros.md](https://github.com/livingstaccato/octowright/blob/main/docs/macros.md).

## Scenarios — coordinated multi-browser orchestration

A scenario is a named group of browser instances launched together. Spin up N
players + a monitoring window + a main-site window with one call; each
instance is a regular `BrowserSession` you can drive per-participant (via
`instance_id`) using all the normal `browser_*` tools.

Declare scenarios in Octowright's config dir:

```yaml
name: discord-raid
description: 7 players + 1 monitor + 1 main-site spectator
participants:
  - persona: dante
    kind: webkit
    role: player
  - persona: ops
    kind: firefox
    role: monitor
    url: https://octowright.com/monitor
fixtures:
  mock_routes:
    - pattern: "**/api/time"
      body: '{"now":"2026-04-24T00:00:00Z"}'
  dialog_policy: dismiss
teardown:
  macro: cleanup-session
verify:
  player: assert-in-server
  monitor: assert-monitor-healthy
```

Or as Python for dynamic participant lists — `<name>.py` exposes `def build() -> Scenario`.

Lifecycle:

- `scenario_start <name>` launches all participants in parallel, applies
  fixtures, and runs per-participant startup macros. Browsers **stay open**.
- `scenario_run_macro <id> <macro> [role=...]` broadcasts a macro across
  participants (optionally role-filtered) and returns per-participant results.
- Any single participant can still be driven by `instance_id` with the regular
  `browser_*` tools.
- `scenario_stop <id>` runs the teardown macro per participant, closes every
  window, and returns a summary.
- `scenario_run_as_test <id>` (or `--test` on the CLI) runs `verify` macros
  and produces JUnit XML.

CLI: `octowright scenario list|start [--test --out <xml>]`; the `start`
command blocks until Ctrl-C, then runs teardown and exits.

Full reference: [docs/scenarios.md](https://github.com/livingstaccato/octowright/blob/main/docs/scenarios.md).

## Configuration

All defaults live in `src/octowright/defaults.py` and can be overridden via the
environment variables listed in the table below.

On POSIX systems, Octowright follows the XDG Base Directory split:

- Config: `${XDG_CONFIG_HOME:-~/.config}/octowright/` for durable user-authored data.
- State: `${XDG_STATE_HOME:-~/.local/state}/octowright/` for session history, logs, and manifests.
- Cache: `${XDG_CACHE_HOME:-~/.cache}/octowright/` for rebuildable analysis captures.

On Windows, config uses `%APPDATA%\octowright\`, while state and cache use
`%LOCALAPPDATA%\octowright\State\` and `%LOCALAPPDATA%\octowright\Cache\`.
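
The directory split above can be expressed as a small pure function. This is a sketch of the documented defaults only, with a hypothetical helper name; it does not cover the per-directory `OCTOWRIGHT_*` overrides.

```python
from pathlib import PurePosixPath, PureWindowsPath


def octowright_dirs(env: dict[str, str], home: str, windows: bool = False) -> dict:
    """Resolve the default config/state/cache roots described above."""
    if windows:
        appdata = PureWindowsPath(env["APPDATA"])
        local = PureWindowsPath(env["LOCALAPPDATA"])
        return {
            "config": appdata / "octowright",
            "state": local / "octowright" / "State",
            "cache": local / "octowright" / "Cache",
        }
    base = PurePosixPath(home)

    def xdg(var: str, fallback: PurePosixPath) -> PurePosixPath:
        # Like ${VAR:-default}: an unset or empty variable uses the fallback.
        return PurePosixPath(env.get(var) or fallback) / "octowright"

    return {
        "config": xdg("XDG_CONFIG_HOME", base / ".config"),
        "state": xdg("XDG_STATE_HOME", base / ".local" / "state"),
        "cache": xdg("XDG_CACHE_HOME", base / ".cache"),
    }
```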

| Variable | Default | Description |
|---|---|---|
| `OCTOWRIGHT_DEFAULT_URL` | `https://octowright.com` | Fallback `url` when `browser_launch` omits it. |
| `OCTOWRIGHT_RECORDINGS` | POSIX: `${XDG_STATE_HOME:-~/.local/state}/octowright/sessions/`; Windows: `%LOCALAPPDATA%\octowright\State\sessions\` | Where session artifacts land: JSONL action logs, traces, screenshots, videos, downloads, and markdown captures. |
| `OCTOWRIGHT_CAPTURES_DIR` | POSIX: `${XDG_CACHE_HOME:-~/.cache}/octowright/captures/`; Windows: `%LOCALAPPDATA%\octowright\Cache\captures\` | Where large cached analysis payloads live. |
| `OCTOWRIGHT_CAPTURE_MAX_TOTAL_BYTES` | `52428800` | Size cap for cached analysis captures before oldest captures are pruned. |
| `OCTOWRIGHT_CAPTURE_TTL_SECONDS` | `604800` | Age cap for cached analysis captures. |
| `OCTOWRIGHT_SESSION_MANIFEST` | POSIX: `${XDG_STATE_HOME:-~/.local/state}/octowright/session-manifest.json`; Windows: `%LOCALAPPDATA%\octowright\State\session-manifest.json` | Live-session manifest used for crash recovery/status. |
| `OCTOWRIGHT_PROFILES_DIR` | POSIX: `${XDG_CONFIG_HOME:-~/.config}/octowright/profiles/`; Windows: `%APPDATA%\octowright\profiles\` | Where persistent profiles live. |
| `OCTOWRIGHT_MACROS_DIR` | POSIX: `${XDG_CONFIG_HOME:-~/.config}/octowright/macros/`; Windows: `%APPDATA%\octowright\macros\` | Where saved macros live. |
| `OCTOWRIGHT_SCENARIOS_DIR` | POSIX: `${XDG_CONFIG_HOME:-~/.config}/octowright/scenarios/`; Windows: `%APPDATA%\octowright\scenarios\` | Where scenario specs live. |
| `OCTOWRIGHT_VIEWPORT_W` / `OCTOWRIGHT_VIEWPORT_H` | `1280` / `800` | Default viewport. Used in headless mode and when dimensions are explicitly passed to `browser_launch`. In headed mode with neither set, context launches with `no_viewport=True` so the page tracks the OS window. |
| `OCTOWRIGHT_HEADLESS` | auto | Explicit `0` / `1` overrides headless mode. Auto-detected: headed on macOS or Linux+display, headless on CI (`CI=true`) or Linux without `$DISPLAY` / `$WAYLAND_DISPLAY`. |
| `OCTOWRIGHT_NAV_TIMEOUT_MS` / `OCTOWRIGHT_ACTION_TIMEOUT_MS` | — | Per-navigation / per-action timeouts. |
| `OCTOWRIGHT_HTTP_HOST` / `OCTOWRIGHT_HTTP_PORT` | `127.0.0.1` / `6286` | Dashboard bind address. Binding to `0.0.0.0` makes the HTTP sidecar reachable on your network, but sensitive dashboard/API/MCP routes stay blocked unless `OCTOWRIGHT_ALLOW_REMOTE_DASHBOARD=1` is also set. Only enable remote dashboard access on trusted networks because it exposes live browser state and local artifacts. If the port is in use, the server walks up 5 higher ports automatically. |
| `OCTOWRIGHT_IDLE_GRACE` | `300` | Seconds before auto-exit when the browser pool is empty. Use `--keep-alive` to disable. |
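
The `OCTOWRIGHT_HEADLESS` auto-detection rule from the table can be sketched as a small pure function. This is a reading of the table, not Octowright's actual code: the name `resolve_headless` is hypothetical, and two details the table leaves open are filled in as assumptions (the CI check runs before the platform checks, and platforms other than macOS and Linux are treated as headed).

```python
def resolve_headless(env, system):
    """Return True for headless, False for headed (illustrative only).

    `system` is a platform name such as "Darwin", "Linux", or "Windows",
    in the form returned by platform.system().
    """
    explicit = env.get("OCTOWRIGHT_HEADLESS")
    if explicit in ("0", "1"):
        return explicit == "1"  # an explicit setting always wins

    if env.get("CI") == "true":
        return True  # assumption: CI is checked before the platform

    if system == "Linux":
        has_display = bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
        return not has_display

    # macOS is headed per the table; treating any other platform
    # (for example Windows) as headed is an assumption of this sketch.
    return False
```

For example, `resolve_headless({"WAYLAND_DISPLAY": "wayland-0"}, "Linux")` returns `False` (headed), while an empty environment on Linux returns `True`.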

## CLI

`octowright` is a Click-based CLI; subcommands let you do common housekeeping
without going through an MCP client:

| Command | What |
|---|---|
| `octowright serve [--profile=<spec>]` | Run the MCP stdio server + the dashboard HTTP server. This is the default when you invoke `octowright` with no subcommand. Pass `--profile=core` (or `core,macros` etc.) to slim the LLM-visible MCP tool surface — see [Capability profiles](#capability-profiles) below. |
| `octowright init [--force]` | First-run scaffolding: create the standard config dirs, drop a sample persona / scenario / macro, and print the `.mcp.json` registration block with your install path filled in. |
| `octowright selftest` | Print the list of registered MCP tools without needing a live MCP client. Sanity check after install. |
| `octowright test [<dir>] [--kind <engine>] [--tag <tag>] [--out <xml>]` | Run every `[test]`-tagged macro in a directory, emit JUnit XML. |
| `octowright cleanup [--days N] [--apply]` | Prune old recording artifacts (JSONL logs, screenshots, videos, traces). Dry-run by default; `--apply` actually deletes. |
| `octowright takeover [--apply --scope=session\|project\|global --name=<n>]` | Detect competing Playwright MCP plugins in `.mcp.json` / `~/.claude.json` and offer to disable them in favour of octowright. Default is read-only report; `--apply` rewrites the config (with timestamped backup). Reversible — rename back to re-enable. |
| `octowright persona list\|show\|create\|delete` | Manage personas from the terminal. |
| `octowright scenario list\|start [--test --out <xml>] [--watch]` | List scenarios or start one; `--watch` streams participant events to stdout in real time, and the command blocks until Ctrl-C. |
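
Because `octowright test --out <xml>` emits JUnit XML, a CI step can summarize the report with the standard library. The sketch below assumes the common JUnit layout (`testcase` elements with optional `failure`, `error`, or `skipped` children); the exact attributes Octowright writes are not documented here, and the sample report and macro names are made up.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Count passed/failed/skipped test cases in a JUnit XML report."""
    root = ET.fromstring(xml_text)
    summary = {"passed": 0, "failed": 0, "skipped": 0}
    # iter() finds <testcase> elements whether the root is
    # <testsuites> or a single <testsuite>.
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            summary["failed"] += 1
        elif case.find("skipped") is not None:
            summary["skipped"] += 1
        else:
            summary["passed"] += 1
    return summary


# Made-up sample report; the macro names are placeholders.
SAMPLE = """<testsuite name="macros" tests="3" failures="1">
  <testcase name="login"/>
  <testcase name="checkout"><failure message="selector not found"/></testcase>
  <testcase name="search"><skipped/></testcase>
</testsuite>"""

print(summarize_junit(SAMPLE))
```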

## Capability profiles

The full MCP tool surface is currently 111 tools — every workflow Octowright supports
(browser driving, macros, scenarios, persona management, etc.) shows up in
the LLM's tool schema by default. When the LLM only needs a slice, set
`OCTOWRIGHT_PROFILE` (or pass `--profile` to `octowright serve`) to one or
more comma-separated profile names. Tools not listed in any active profile
are skipped at registration time, so the LLM-visible schema shrinks. Seven
meta/Advisor tools are always registered so agents can inspect Octowright,
find the dashboard, and surface local guidance even under narrow profiles.

| Profile | What | Tool count |
|---|---|---|
| `core` | Minimum to drive a browser end-to-end (launch, navigate, click/type/fill, observe, close). | 16 |
| `advanced` | Inspection, cached captures, assertions, viewport controls, and ARIA-locator interactions for stable test automation. | 21 |
| `macros` | Macro record / list / run / lint / repair / compile + artifact bundles. | 15 |
| `scenarios` | Scenario orchestration (multi-browser test setups). | 12 |
| `personas` | Persona + on-disk profile management. | 8 |
| `goldens` | Accessibility-tree snapshot baselines + diff. | 5 |
| always-on | Status, storage report, dashboard, takeover detection, and Advisor tools registered under every profile. | 7 |
| `all` (or unset) | Default — every tool registers. | 111 |

```bash
octowright serve --profile=core              # 23 tools — core + always-on
octowright serve --profile=core,macros       # 38 tools — browser + macro pipeline + always-on
octowright serve --profile=core,scenarios    # browser + multi-browser orchestration
```
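
The counts in these comments follow from the table: each selected profile's tools plus the seven always-on tools. A minimal sketch of that arithmetic, assuming profiles do not share tools (which matches the two totals shown above); `visible_tool_count` is a hypothetical helper, not part of Octowright:

```python
# Tool counts copied from the table above.
PROFILE_TOOL_COUNTS = {
    "core": 16,
    "advanced": 21,
    "macros": 15,
    "scenarios": 12,
    "personas": 8,
    "goldens": 5,
}
ALWAYS_ON = 7
ALL_TOOLS = 111


def visible_tool_count(spec=None):
    """Estimate the LLM-visible tool count for a --profile spec."""
    names = [n.strip() for n in (spec or "all").split(",") if n.strip()]
    if "all" in names:
        return ALL_TOOLS
    # Assumption: profiles do not overlap, so the counts simply add up.
    return ALWAYS_ON + sum(PROFILE_TOOL_COUNTS[n] for n in set(names))


print(visible_tool_count("core"))         # 16 + 7
print(visible_tool_count("core,macros"))  # 16 + 15 + 7
```

Under the same assumption, `core,scenarios` would come to 35 tools.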

The active profile shows up in `octowright selftest` and in the
`octowright_status` MCP tool's `profile` block. If a tool you expected is
missing, that's where to look. The profile definitions dict lives in
`src/octowright/server/profiles.py`; extend it to add or rebalance groups.

## Octowright Advisor

Octowright Advisor is a local, deterministic guidance layer exposed through
always-on MCP tools. `octowright_status` includes an `advisor` block, and
`octowright_advisor_status` returns the same Advisor snapshot directly:
preferences, recent usage summary, and current suggestions. (`octowright_status`
also carries an `upgrade` block on the first run after a version change — present
its highlights to the user as a "what's new" note.)

Advisor currently suggests two things:

- **Macro candidates**: agents call
  `octowright_advisor_record_macro_observation` when they notice repeated
  workflows. Two observations with the same signature produce a
  `macro_candidate` suggestion. Advisor never auto-saves a macro.
- **Profile changes**: recent MCP tool usage can suggest narrowing or expanding
  `OCTOWRIGHT_PROFILE`. Profile-change suggestions can be prompt-only or marked
  `auto_apply` when the `profile_change` preference is `automatic`.
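
The macro-candidate rule above amounts to counting observations per signature. The following is a minimal sketch of that idea, not Advisor's implementation: the function name, the signature strings, and the shape of the returned dicts are all hypothetical.

```python
from collections import Counter

# "Two observations with the same signature" per the description above.
MACRO_CANDIDATE_THRESHOLD = 2


def macro_candidates(observed_signatures):
    """Return suggestion dicts for signatures seen at least twice.

    Only suggestions are produced; nothing is saved as a macro.
    """
    counts = Counter(observed_signatures)
    return [
        {"kind": "macro_candidate", "signature": sig, "observations": n}
        for sig, n in counts.items()
        if n >= MACRO_CANDIDATE_THRESHOLD
    ]


# Hypothetical workflow signatures reported by an agent.
suggestions = macro_candidates(["login>open-cart", "search", "login>open-cart"])
print(suggestions)
```

Here only `login>open-cart` yields a suggestion, since `search` was observed once.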

Agents should check Advisor status after their first `octowright_status` call and before asking a
user whether repeated work should become a macro. Preferences are persisted in
the local Advisor state file and can be changed with
`octowright_advisor_set_preference`. Set `OCTOWRIGHT_ADVISOR_STATE` to isolate
that JSON state file for tests or separate deployments.

## Telemetry

Both halves of Octowright use the `provide.telemetry` family for structured
logging:

- **Python server** uses `provide-telemetry>=0.4.8` (structlog under the hood).
  `setup_telemetry()` is called by `octowright serve`; every module gets a
  logger via `get_logger(__name__)`. Logs land on stderr in development,
  JSON in production (auto-detected).
- **TypeScript dashboard** uses `@provide-io/telemetry@^0.4.7` (pino under
  the hood). `setupTelemetry()` runs at the top of each entrypoint;
  `getLogger('octowright.frontend.{api,tail,dashboard,session,global}')` per
  module. Logger names mirror the Python convention so log lines are easy
  to correlate across the stack.

### Log level and format

```bash
# Human-readable local debugging
export PROVIDE_LOG_LEVEL=DEBUG
export PROVIDE_LOG_FORMAT=pretty
uv run octowright serve
```

```bash
# Machine-friendly production logs
export PROVIDE_LOG_LEVEL=INFO
export PROVIDE_LOG_FORMAT=json
uv run octowright serve
```

`octowright serve --log-level DEBUG` is a convenience wrapper that sets
`PROVIDE_LOG_LEVEL` for the process and spawned daemon.

### OTLP export

Telemetry export is opt-in. To send OpenTelemetry signals to an OTLP collector:

```bash
export PROVIDE_TRACE_ENABLED=1
export PROVIDE_METRICS_ENABLED=1
export OTEL_EXPORTER_OTLP_ENDPOINT=http://127.0.0.1:4318
# optional auth/tenant headers
export OTEL_EXPORTER_OTLP_HEADERS="authorization=Bearer%20TOKEN,x-tenant-id=dev"
uv run octowright serve
```

Signals are no-ops when telemetry exporters are not configured or not available.
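
The `OTEL_EXPORTER_OTLP_HEADERS` value is a comma-separated list of `key=value` pairs with percent-encoded values, which is why the space in the bearer token is written as `%20`. The sketch below shows roughly how such a string decodes; `parse_otlp_headers` is a hypothetical helper for illustration, not the OpenTelemetry SDK's parser.

```python
from urllib.parse import unquote


def parse_otlp_headers(raw):
    """Split comma-separated key=value pairs and percent-decode the values."""
    headers = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue  # ignore empty segments such as a trailing comma
        key, _, value = pair.partition("=")
        headers[key.strip()] = unquote(value.strip())
    return headers


print(parse_otlp_headers("authorization=Bearer%20TOKEN,x-tenant-id=dev"))
```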

### Playwright traces vs telemetry traces

- **Playwright trace**: per-session browser artifact (`*.trace.zip`) produced
  by Playwright when session tracing is enabled; inspect with
  `npx playwright show-trace`.
- **Telemetry trace**: OpenTelemetry spans emitted by `provide.telemetry`
  (when `PROVIDE_TRACE_ENABLED=1`) and exported to OTLP.

These are separate systems and can be enabled independently.

### HTTP metrics

HTTP request metrics for the debugger/API server are recorded through
`provide.telemetry`'s `TelemetryMiddleware` and exported via OTLP alongside the
rest of Octowright's telemetry. The middleware records RED metrics
(`http.requests.total`, `http.errors.total`, `http.request.duration_ms`)
attributed by route, method, and status code, and adds request-id/session-id log
correlation and W3C trace propagation. There is no separate Prometheus scrape
endpoint; export to an OTLP collector, as described under OTLP export, to
consume them. Disable metric recording (propagation stays on) with:

```bash
export OCTOWRIGHT_HTTP_METRICS=0
```

## Safari caveat

Playwright's `webkit` browser is the **bundled upstream WebKit engine**, not Apple's
Safari.app. It shares the engine family but is a separate binary, installed with
`playwright install webkit`. Driving actual Safari.app with your cookies/profile requires
Apple's `safaridriver` and is not supported by Playwright today.

## Selftest

```bash
uv run octowright selftest
```

Prints the list of registered tools without needing a live MCP client.

## Documentation

- [docs/README.md](https://github.com/livingstaccato/octowright/blob/main/docs/README.md): full documentation index.
- [docs/getting-started.md](https://github.com/livingstaccato/octowright/blob/main/docs/getting-started.md): install, registration, and first successful run.
- [docs/engines.md](https://github.com/livingstaccato/octowright/blob/main/docs/engines.md): engine install/status/reinstall and launch-mode behavior.
- [docs/personas.md](https://github.com/livingstaccato/octowright/blob/main/docs/personas.md): persona/profile lifecycle and credential preflight.
- [docs/macros.md](https://github.com/livingstaccato/octowright/blob/main/docs/macros.md): macro record/replay, linting, and test execution.
- [docs/scenarios.md](https://github.com/livingstaccato/octowright/blob/main/docs/scenarios.md): multi-browser orchestration lifecycle.
- [docs/goldens.md](https://github.com/livingstaccato/octowright/blob/main/docs/goldens.md): baseline capture vs verify policy.
- [docs/ci-quality.md](https://github.com/livingstaccato/octowright/blob/main/docs/ci-quality.md): quality gates and local CI parity commands.
- [docs/troubleshooting.md](https://github.com/livingstaccato/octowright/blob/main/docs/troubleshooting.md): fast diagnosis for common failures.
- [docs/architecture/](https://github.com/livingstaccato/octowright/tree/main/docs/architecture/): system diagrams and architecture references.
- [CHANGELOG.md](https://github.com/livingstaccato/octowright/blob/main/CHANGELOG.md): release summaries.
