BrowserTrace

Debug Browser Use failures with a local trace timeline

Browser Use agents fail in browser state, not just in logs. BrowserTrace records each step locally so you can inspect screenshots, URL, action, model I/O, status, and the first failed step.

Improving this guide or a Browser Use adapter note? Use the First PR Recipe to keep the first contribution small and reviewable.

Why this exists

When a Browser Use run fails late in a task, a stack trace usually tells you which exception happened. It often does not show what the agent saw, which URL it was on, which model decision selected the target, or whether the wrong assumption came from an earlier step.

BrowserTrace keeps that missing context in a local SQLite database plus screenshot files. No signup or cloud service is required.
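Because the trace lives in an ordinary SQLite file, you can look at it with Python's standard library. The sketch below only lists the tables it finds and makes no assumption about BrowserTrace's schema. The helper name list_tables and the path you pass in are hypothetical; this page does not say where the database file is stored.

```python
import sqlite3


def list_tables(db_path: str) -> list[str]:
    """Return the names of all tables in a SQLite file, sorted alphabetically."""
    # Note: sqlite3.connect creates an empty file if the path does not exist.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Call it with the path to your own trace database, then query whichever tables it reports.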

Try it before wiring Browser Use

uvx --from "browsertrace[ui]" browsertrace doctor
uvx --from "browsertrace[ui]" browsertrace demo
uvx --from "browsertrace[ui]" browsertrace

Open http://127.0.0.1:3000, then inspect the run named "demo: Browser Use local HTML upload navigation failure". From a source checkout, python examples/browser_use_callback_demo.py records Browser Use-shaped callback steps without installing Browser Use.

Attach it to a Browser Use agent

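import asyncio

# Older Browser Use releases import ChatOpenAI from langchain_openai instead.
from browser_use import Agent, ChatOpenAI
from browsertrace import Tracer
from browsertrace.integrations.browser_use import attach_tracer


async def main():
    tracer = Tracer()
    agent = Agent(task="...", llm=ChatOpenAI(model="gpt-4o"))

    # Every step of this run is recorded into the local timeline.
    with attach_tracer(agent, tracer, name="browser-use checkout run"):
        await agent.run()


asyncio.run(main())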

The adapter hooks Browser Use step callbacks and records URL, screenshot, action summary, compact browser-state context, model thought/actions, status, and errors into the same local timeline.

For adapter field requests, use the Browser Use feedback issue and include the Browser Use version, failure shape, and which context your logs missed.

Callback compatibility

attach_tracer supports Browser Use agents that expose register_new_step_callback, plus older or forked agents with on_step_start, on_step, or _new_step_callback attributes.
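Before filing a compatibility report, it can help to check which of these surfaces your Agent object actually exposes. The probe below is an illustrative sketch, not BrowserTrace's own attach logic; find_step_hook and CANDIDATE_HOOKS are hypothetical names, and the attribute list is copied from the paragraph above.

```python
from typing import Optional

# Attribute names taken from the paragraph above, in the order they are mentioned.
CANDIDATE_HOOKS = (
    "register_new_step_callback",
    "on_step_start",
    "on_step",
    "_new_step_callback",
)


def find_step_hook(agent: object) -> Optional[str]:
    """Return the first candidate hook attribute the agent exposes, or None."""
    for name in CANDIDATE_HOOKS:
        if hasattr(agent, name):
            return name
    return None
```

A None result suggests the agent only accepts hooks passed to agent.run(...), or uses a surface not listed here.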

Current Browser Use examples may also pass on_step_start or on_step_end directly to agent.run(...). For that run-hook-only path, use create_run_hooks:

import asyncio

from browsertrace import Tracer
from browsertrace.integrations.browser_use import create_run_hooks


async def main(agent):
    # `agent` is the Browser Use Agent you have already constructed.
    tracer = Tracer()
    hooks = create_run_hooks(tracer, name="browser-use checkout run")

    with hooks:
        await agent.run(on_step_start=hooks.on_step_start, on_step_end=hooks.on_step_end)

The run-hook helper reads Browser Use history and browser-session summaries when they are available, then records the latest thought, action, extracted content, URL, title, tabs, and screenshot flag into the same local timeline. If your Browser Use version exposes a different hook shape, comment on issue #11 with the version and callback surface.

To try this path without installing Browser Use, run python examples/browser_use_run_hooks_demo.py from a source checkout.

Debug icon-only click targets

If the screenshot shows the target but Browser Use clicks a nearby toolbar button, treat it as a visible-target versus accessible-target mismatch. Icon-only buttons often rely on hover tooltips, and that tooltip text may not be present in the accessibility tree when the agent ranks candidate elements.

The best app-side fix is an accessible name on the real button, for example aria-label="Create Test". Until the app can change, make the task prompt structural, such as "click the plus icon immediately next to the search field in the Functional toolbar", or use a deterministic selector for that step.
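To find candidate buttons on the app side, a rough audit of the page HTML can flag buttons that carry neither a label attribute nor text. This is a simplified heuristic written for illustration, not the full accessible-name computation a browser performs, and the names UnnamedButtonFinder and find_unnamed_buttons are hypothetical.

```python
from html.parser import HTMLParser
from typing import Optional


class UnnamedButtonFinder(HTMLParser):
    """Collect <button> elements with no aria-label, aria-labelledby, or text."""

    def __init__(self) -> None:
        super().__init__()
        self.unnamed: list = []  # attribute dicts of flagged buttons
        self._attrs: Optional[dict] = None  # button currently being parsed
        self._text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._attrs = dict(attrs)
            self._text = ""

    def handle_data(self, data):
        # Only text that appears inside an open <button> counts.
        if self._attrs is not None:
            self._text += data

    def handle_endtag(self, tag):
        if tag == "button" and self._attrs is not None:
            label = (self._attrs.get("aria-label") or "").strip()
            labelled_by = (self._attrs.get("aria-labelledby") or "").strip()
            if not label and not labelled_by and not self._text.strip():
                self.unnamed.append(self._attrs)
            self._attrs = None


def find_unnamed_buttons(html: str) -> list:
    """Return the attribute dict of every button that has no obvious name."""
    finder = UnnamedButtonFinder()
    finder.feed(html)
    finder.close()
    return finder.unnamed
```

Each returned dict holds the flagged button's attributes, which is usually enough to locate it in the app source.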

Related community case: browser-use/browser-use#4801.

Debug new-tab desync

If a click or Enter action opens a new tab, Browser Use may keep reasoning from the stale page context unless the action result makes that browser-state delta explicit. The symptom is usually repeated retries against the old page while the expected element exists in the new tab.

In BrowserTrace terms, treat this as a browser topology change. The trace should explain whether the agent switched to the new page, stayed on the stale page, or attempted later actions before the new tab finished loading.
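The three outcomes named above can be told apart by comparing the tab set before and after the action. The sketch below assumes you can capture some stable identifier per tab around each step; classify_tab_change and its arguments are hypothetical, and how you obtain tab identifiers depends on your Browser Use version.

```python
def classify_tab_change(before: list, after: list, active_after: str) -> str:
    """Describe how the tab set changed across one action.

    `before` and `after` are tab identifiers captured around the action;
    `active_after` is the tab the agent is reasoning about afterwards.
    """
    new_tabs = [tab for tab in after if tab not in before]
    if not new_tabs:
        return "no-new-tab"
    if active_after in new_tabs:
        return "switched-to-new-tab"
    return "stayed-on-stale-tab"
```

A "stayed-on-stale-tab" result on the step before the retries begin points at the desync rather than at the element lookup.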

Related community case: browser-use/browser-use#4758.

Debug remote CDP hangs

For Browserless or other remote-CDP providers, a failed Browser Use run may not be only a screenshot problem. A stale remote browser session can make one CDP request stop returning while the websocket still looks connected. If recovery holds a shared event-bus lock during that wait, one degraded browser session can delay unrelated sessions.

When you see screenshot capture, DOM snapshot, or browser-state collection timeouts, collect timing evidence before changing retry policy.

In BrowserTrace terms, treat this as method-timing and browser-session evidence, not just a red screenshot step. The trace should explain whether the CDP method failed fast, never returned, or blocked later state collection through recovery and lock timing.
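One way to gather that timing evidence is to wrap each suspect call in a timer with a hard timeout. This is a minimal sketch using only asyncio; time_call is a hypothetical helper, and the zero-argument coroutine function you pass in stands for whatever CDP-backed call you want to measure.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Tuple


async def time_call(
    call: Callable[[], Awaitable[Any]], timeout_s: float
) -> Tuple[str, float]:
    """Run one awaitable call and classify it as ok, failed-fast, or never-returned."""
    start = time.monotonic()
    try:
        await asyncio.wait_for(call(), timeout=timeout_s)
        outcome = "ok"
    except asyncio.TimeoutError:
        outcome = "never-returned"  # no reply within the timeout
    except Exception:
        outcome = "failed-fast"  # the call raised before the timeout
    return outcome, time.monotonic() - start
```

Logging the outcome and elapsed time per method makes it easier to see whether one call stalls or whether later calls are waiting behind it.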

Related community case: browser-use/browser-use#4579.

Debug local HTML upload navigation mistakes

A local HTML upload can be misread as a navigation target before the intended upload action runs. The security watchdog may correctly block the bad URL, but the useful debugging boundary is earlier: why did the planner or model-visible context turn an attachment name into navigate at step 0?

Treat this as a planner/action validation boundary and future adapter boundary, not as proof of a low-level file upload bug. It also does not mean BrowserTrace already captures every internal Browser Use field.
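A check at that boundary could compare a proposed navigate target against the task's attachment names before the action runs. The sketch below is an assumption about how such a guard might look, not part of BrowserTrace or Browser Use; looks_like_attachment and its arguments are hypothetical.

```python
from pathlib import PurePath
from urllib.parse import urlparse


def looks_like_attachment(target: str, attachment_paths: list) -> bool:
    """Return True if a navigate target is really one of the task's attachments."""
    names = {PurePath(p).name for p in attachment_paths}
    parsed = urlparse(target)
    # A bare file name has no scheme or host, so urlparse puts it all in `path`.
    candidate = PurePath(parsed.path).name
    return candidate in names and parsed.scheme in ("", "file")
```

A True result at step 0 suggests the planner treated the attachment as a page to visit rather than a file to upload.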

Related community case: browser-use/browser-use#4794.

Debug action schema validation boundaries

Action schema coercion can hide why a Browser Use step targeted the wrong element. For example, a raw model action may put a boolean in an element-index field, then one validation path coerces it while another rejects it. After normalization, the final executed target can look deliberate unless the trace preserves both sides of the boundary.

Treat this as a schema validation and future adapter boundary. It does not mean BrowserTrace already captures every internal Browser Use field.
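The boolean case is easy to reproduce in plain Python, where bool is a subclass of int. The sketch below shows a lenient path and a strict path disagreeing on the same raw value, and records both sides. All three function names and the "index" key are hypothetical; they stand in for whatever field your action schema uses.

```python
def lenient_index(value: object) -> int:
    """Coerce like a permissive validator would."""
    return int(value)  # True becomes 1, False becomes 0


def strict_index(value: object) -> int:
    """Reject booleans explicitly instead of silently treating them as 0 or 1."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"element index must be an int, got {type(value).__name__}")
    return value


def record_boundary(raw_action: dict) -> dict:
    """Keep the raw value next to the outcome of each validation path.

    Assumes raw_action has an "index" key holding a bool or an int.
    """
    raw = raw_action["index"]
    try:
        strict = strict_index(raw)
    except TypeError as exc:
        strict = f"rejected: {exc}"
    return {"raw": raw, "lenient": lenient_index(raw), "strict": strict}
```

With the raw value preserved, a click on element 1 can be traced back to a model output of true rather than read as a deliberate choice.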

Related community case: browser-use/browser-use#4796.

Debug empty model responses

A parser exception can occur when the model provider returns an empty response (for example, input_value='') before Browser Use validation runs. In that case the parser received no assistant JSON content at all, rather than malformed JSON.

BrowserTrace does not capture every internal Browser Use field, but it is still useful to preserve debugging evidence at the boundary between provider response, parsing, and execution.
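The empty-versus-malformed distinction can be made explicit before any schema validation runs. This is a minimal sketch using only the standard json module; classify_response is a hypothetical helper, and raw stands for the provider's response text however your setup exposes it.

```python
import json
from typing import Optional


def classify_response(raw: Optional[str]) -> str:
    """Separate 'no content at all' from 'content that is not valid JSON'."""
    if raw is None or not raw.strip():
        return "empty"
    try:
        json.loads(raw)
    except json.JSONDecodeError:
        return "malformed-json"
    return "valid-json"
```

An "empty" result points at the provider call (quota, filtering, truncation), while "malformed-json" points at the model's output format.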

Related community case: browser-use/browser-use#4786.

Share only what is safe

browsertrace list
browsertrace export <run_id> -o full.html
browsertrace export <run_id> --redact -o public.html
browsertrace export <run_id> --public -o public.html

The full export includes model input and output, screenshots, and URLs. Use --public to omit all three categories before public sharing, or use individual redaction flags when you want to keep some of them visible.

Troubleshooting Browser Use traces

Could not attach to this Agent
BrowserTrace first tries register_new_step_callback, then common step callback attributes used by older or forked Browser Use agents. If your version exposes a different hook, comment on issue #11 with the Browser Use version and callback surface.

No screenshots appear
Some Browser Use states do not expose a screenshot for every step. BrowserTrace still records the URL, action summary, model thought/actions, status, and error when those fields are available.

The trace includes private page or prompt data
Keep the full trace local. Before attaching anything to a public issue or community post, run browsertrace export <run_id> --public -o public.html to omit prompt/model I/O, screenshots, and URLs.

What to inspect first

  1. Did the screenshot match the model's assumption?
  2. Did the selected action target the right element?
  3. Did the URL change earlier than expected?
  4. Did the model output mention a selector or label that was stale?
  5. Was the red step wrong, or did an earlier step poison the state?

BrowserTrace is MIT licensed and local-first. Browser Use adapter feedback is tracked in issue #11.