How to Use Argus
A step-by-step walkthrough of every section in the Argus dashboard — from browsing your run history to reading failure details, comparing executions, and replaying from a broken node.
Quick start: paste this prompt into your AI coding assistant (Claude Code, Cursor, etc.) to add Argus to your pipeline:
Add Argus monitoring to my LangGraph pipeline. In the file where the graph is built, add the following before graph.compile():

    from argus import ArgusWatcher

    watcher = ArgusWatcher()
    watcher.watch(graph)  # must be called BEFORE graph.compile()
    app = graph.compile()

For cyclic graphs, also call watcher.finalize() after app.invoke().
1. Runs List
The home page is your pipeline execution history. Every time your pipeline runs with Argus attached, an entry appears here automatically.

The runs list — aggregate stats at the top, evaluation panel, and the full run table below.
Summary cards
Run table columns
Evaluation panel
The Evaluation section lets you filter runs by criteria — set a goal description and add constraints like overall_status == clean to find runs that meet specific conditions. Hit Evaluate to filter the table.
Click any run ID to open its full detail page.
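To make the Evaluation panel's constraint filtering concrete, here is a minimal sketch of how a constraint such as overall_status == clean could be applied to a list of runs. The record fields (run_id, duration_ms), the sample values, and the helper names are assumptions for illustration. This is not Argus's evaluator or its storage schema.

```python
import operator

# Hypothetical run records; field names other than overall_status are assumed.
RUNS = [
    {"run_id": "run-a", "overall_status": "clean", "duration_ms": 820},
    {"run_id": "run-b", "overall_status": "failed", "duration_ms": 1430},
    {"run_id": "run-c", "overall_status": "clean", "duration_ms": 2100},
]

# Only equality-style operators are supported in this sketch.
OPS = {"==": operator.eq, "!=": operator.ne}


def parse_constraint(text):
    """Split a 'field == value' string into (field, op, value)."""
    field, op, value = text.split(maxsplit=2)
    if op not in OPS:
        raise ValueError(f"unsupported operator: {op}")
    return field, op, value


def evaluate(runs, constraints):
    """Return the runs that satisfy every constraint (compared as strings)."""
    parsed = [parse_constraint(c) for c in constraints]
    return [
        run
        for run in runs
        if all(OPS[op](str(run.get(field)), value) for field, op, value in parsed)
    ]


matching = evaluate(RUNS, ["overall_status == clean"])
print([run["run_id"] for run in matching])  # ['run-a', 'run-c']
```

Constraints are combined with a logical AND here, so adding a second constraint can only narrow the result.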
2. Run Detail
The run detail page gives you a complete picture of what happened during a single pipeline execution — metrics, the execution trace, AI analysis, and the initial state.

Top of the run detail page: run ID, status, root cause chain, metrics grid, and the execution timeline.
Header
Shows the run ID, overall status badge, timestamp, total duration, step count, and Argus version. The Compare button lets you immediately diff this run against another.
Root cause chain
When a failure propagates downstream, Argus traces back to find the originating node. The red banner shows the chain — e.g. extract_skills → generate_summary — so you know exactly which node to fix, not which node complained.
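The idea behind the chain can be sketched as a walk upstream from the node that complained to the node that first produced the broken state. The trace structure, the blamed_upstream field, and the parse_resume node are hypothetical. Argus's actual data model and tracing logic may differ.

```python
# Hypothetical trace: each node records its status and, when it received bad
# input, the upstream node that should have produced the missing field.
TRACE = {
    "parse_resume": {"status": "clean", "blamed_upstream": None},
    "extract_skills": {"status": "failed", "blamed_upstream": None},
    "generate_summary": {
        "status": "degraded_input",
        "blamed_upstream": "extract_skills",
    },
}


def root_cause_chain(trace, complaining_node):
    """Follow blamed_upstream links back to the originating node.

    Returns the chain ordered from the origin to the complaining node.
    """
    chain = [complaining_node]
    seen = {complaining_node}  # guards against cycles in the blame links
    upstream = trace[complaining_node]["blamed_upstream"]
    while upstream is not None and upstream not in seen:
        chain.append(upstream)
        seen.add(upstream)
        upstream = trace[upstream]["blamed_upstream"]
    chain.reverse()
    return chain


chain = root_cause_chain(TRACE, "generate_summary")
print(" -> ".join(chain))  # extract_skills -> generate_summary
```

The first element of the chain is the node to fix. The later elements only inherited the bad state.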
Metrics
Execution timeline
Each node is listed in order with its name, output type tag, duration, and status. Nodes with failures show an indented root cause annotation — the specific field that was missing and which upstream node failed to produce it. Expand any row with the arrow to see the full input/output JSON.

Lower execution timeline showing degraded_input propagation, followed by the AI Analysis panel.
AI Analysis
When OPENAI_API_KEY is set, Argus automatically investigates non-clean runs. The panel breaks down the failure into three parts:
Root Cause Node
The specific node Argus identified as the origin of the failure — not the node that complained, but the one that first produced the broken state.
Reason
A concise explanation of why that node failed and how the bad state propagated through downstream nodes.
How to Fix It
Numbered action items — each targeting a specific node — telling you exactly what to change to prevent the failure from recurring.
A confidence score is shown in the top-right of the panel. The footer shows how many causal hypotheses were evaluated and how many observations were used.

AI fix steps, the Correlation panel (origin node + confidence), and the Behavior/Initial State sections.
Correlation
Argus runs a correlation analysis to confirm which node is the true origin of the degradation. The panel shows the origin node name, step index, failure signals (e.g. missing_field), and a confidence score.
Behavior & Initial State
The Behavior section shows the raw initial state your pipeline received: the exact input dict at invocation time. This is useful for reproducing the failure locally.
3. Compare
Compare two runs side-by-side to see exactly what changed — useful for verifying a fix worked, catching regressions, or understanding why one run is faster than another.

Compare page: winner verdict at the top, aggregate stats table, then a node-by-node status comparison.
How to compare
Open Compare
Enter two run IDs
Read the verdict
Read the node diff
The aggregate table shows Failures, Duration, and Success Rate side-by-side with a winner indicator (B ✓) for each metric.
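A per-metric winner like the one in the aggregate table can be derived by deciding, for each metric, whether lower or higher is better. The sketch below assumes fewer failures and shorter duration win, and a higher success rate wins. The metric keys and sample numbers are made up, and how Argus breaks ties or reaches its overall verdict is not covered here.

```python
# Metric name -> True when a lower value is better (an assumption of this sketch).
LOWER_IS_BETTER = {
    "failures": True,
    "duration_ms": True,
    "success_rate": False,
}


def winners(run_a, run_b):
    """Return 'A', 'B', or 'tie' for each metric."""
    result = {}
    for metric, lower_wins in LOWER_IS_BETTER.items():
        a, b = run_a[metric], run_b[metric]
        if a == b:
            result[metric] = "tie"
        elif (a < b) == lower_wins:
            result[metric] = "A"
        else:
            result[metric] = "B"
    return result


# Hypothetical aggregate values for two runs.
run_a = {"failures": 2, "duration_ms": 1430, "success_rate": 0.6}
run_b = {"failures": 0, "duration_ms": 1510, "success_rate": 1.0}

verdict = winners(run_a, run_b)
print(verdict)  # {'failures': 'B', 'duration_ms': 'A', 'success_rate': 'B'}
```

In this example run B wins on failures and success rate while run A is slightly faster, which is the pattern you would expect after a fix that adds a little work to a node.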
4. Replay
Replay re-executes your pipeline from a specific node using the frozen input state captured from a previous run. This means you can test a fix without re-running the full pipeline or making new LLM calls for the nodes before the broken one.
How replay works
When Argus records a run, it saves the input state at every node. When you replay from node X, Argus loads the exact input that node X received originally, then re-executes node X and everything downstream with your current code. A new run ID is created for the result.
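The mechanism can be illustrated with a small linear pipeline: record each node's input during the original run, then restart from a chosen node with its frozen input and the current node functions. All names below (run_pipeline, replay_from, the node functions) are hypothetical. This is a simplified model, not Argus's recorder, and it ignores branching and cyclic graphs.

```python
import copy

calls = []  # tracks which nodes actually executed


def load(state):
    calls.append("load")
    return {"text": state["raw"].strip()}


def extract_skills_buggy(state):
    calls.append("extract_skills")
    return {}  # bug: never emits "skills"


def extract_skills_fixed(state):
    calls.append("extract_skills")
    return {"skills": state["text"].split(",")}


def generate_summary(state):
    calls.append("generate_summary")
    return {"summary": f"{len(state.get('skills', []))} skills"}


def run_pipeline(nodes, state, recorded_inputs=None):
    """Run nodes in order, optionally saving a copy of each node's input."""
    for name, fn in nodes:
        if recorded_inputs is not None:
            recorded_inputs[name] = copy.deepcopy(state)
        state = {**state, **fn(state)}
    return state


def replay_from(nodes, recorded_inputs, start_node):
    """Re-run start_node and everything after it from its frozen input."""
    names = [name for name, _ in nodes]
    start = names.index(start_node)
    frozen = copy.deepcopy(recorded_inputs[start_node])
    return run_pipeline(nodes[start:], frozen)


# Original run with the buggy node; inputs are recorded at every node.
recorded = {}
original = run_pipeline(
    [("load", load), ("extract_skills", extract_skills_buggy),
     ("generate_summary", generate_summary)],
    {"raw": " python,sql "},
    recorded,
)

# Replay with the fixed node; "load" is not executed again.
replayed = replay_from(
    [("load", load), ("extract_skills", extract_skills_fixed),
     ("generate_summary", generate_summary)],
    recorded,
    "extract_skills",
)

print(original["summary"])  # 0 skills
print(replayed["summary"])  # 2 skills
print(calls.count("load"))  # 1
```

Because the upstream node's output is read from the recording, any expensive work it did (such as an LLM call) is not repeated during the replay.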
Step by step — from the dashboard
Open the failing run
Find the root cause node
Click the replay icon
Wait for the new run
Compare to confirm
Step by step — from the CLI
argus replay <run-id> <node-name>
argus replay <run-id> <node-name> --app my_pipeline:build_graph
The --app flag takes a module:function path to your graph factory function. It is only needed if node function references weren't captured at recording time. After the replay finishes, use argus diff to compare the two runs:
argus diff <original-run-id> <replay-run-id>
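For readers unfamiliar with the module:function notation used by --app, the sketch below shows how such a string is conventionally resolved in Python: import the module, then look up the attribute. Whether Argus resolves the path exactly this way is an assumption, and load_factory is a hypothetical helper name.

```python
import importlib


def load_factory(spec):
    """Resolve a 'module:function' string to the callable it names."""
    module_name, sep, attr = spec.partition(":")
    if not sep or not module_name or not attr:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    module = importlib.import_module(module_name)
    factory = getattr(module, attr)
    if not callable(factory):
        raise TypeError(f"{spec!r} does not name a callable")
    return factory


# Demonstrated with a standard-library function so the sketch runs anywhere.
# For the CLI example above, the spec would be "my_pipeline:build_graph".
factory = load_factory("json:dumps")
print(factory({"ok": True}))  # {"ok": true}
```

In practice this means the module must be importable from the directory where you run the command, and the named function should return your graph.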
Screenshots for the replay UI will be added in a future update.