ARGUS
local

Changelog

What's new in each version of ARGUS.

v0.4.0BETA2026-05-17

Beta Testing Rollout

  • LLM token/cost tracking — auto-extracts usage from node outputs, shows cost per node and total
  • Redesigned sidebar into Observe / Analyze sections
  • Replay from subdirectories — runs recorded from child folders are now found and replayable
  • Changelog page with full version timeline
  • Report the Dev page for bug reports and feature requests
  • Restored and expanded test suite
v0.3.10MINOR2026-05-06

Replay wired to UI

  • Replay from UI — hover any step, click "replay from here", no CLI needed
  • App factory input persists to .argus/config.json automatically
  • Auto-compare after replay — navigates to diff view showing original vs replay
  • Eval metrics panel in compare page (failure count, severity, success rate)
  • --app flag on argus ui for startup config
v0.3.7MINOR2026-05-04

Auto-login & interrupt stitching

  • Auto-login in argus ui — reads CLI credentials, no second OAuth flow
  • Interrupted runs + resume continuations stitched into a single merged view
  • Resume runs hidden from top-level list to avoid duplicates
v0.3.5MINOR2026-05-04

Cloud storage

  • Google login and Supabase cloud storage
  • Background sync of runs to cloud
v0.3.3MAJOR2026-04-27

Web Dashboard

  • argus ui — web dashboard served by a pure-Python HTTP server, no Node.js required
  • Runs list with status, duration, pass rate stats
  • Per-run detail view with step-by-step inspection
  • Side-by-side run comparison at /compare
  • Zero extra dependencies — just Python
v0.3.2PATCH2026-04-22

Silent failure enhancements

  • Strict mode for inspector
  • Nested error scan in tool outputs
  • Generic list type checking
v0.3.0MAJOR2026-04-19

Parallel Execution

  • Parallel nodes grouped in a parallel panel in argus show
  • Graph topology diagram above the step list
  • Silent failure detection is parallel-aware — only blames nodes for fields they wrote
  • Root cause chain excludes innocent parallel siblings
v0.2.2MINOR2026-04-18

Run Differentiator

  • argus diff — compare any two runs node-by-node
  • Status changes with FIXED / REGRESSION labels
  • Inspection diff, validator result flips, output field deltas
  • Frozen node and duration change tracking
v0.2.1MINOR2026-04-14

Tool Call Failure Detection

  • Detects error_response, rate_limit (HTTP 429), and empty_result from tool calls
  • Tool failures show inline under the node that swallowed them
v0.1.1MINOR2026-04-10

Deterministic Replay

  • Replay plays back exact outputs from the original run — no live LLM calls
  • Reproducible replays with zero configuration
v0.1.0MAJOR2026-04-02

MVP

  • Core monitoring: ArgusWatcher for LangGraph, ArgusSession for any framework
  • Silent failure detection: missing fields, type mismatches, empty outputs
  • Semantic signature registry for placeholder/degraded LLM outputs
  • Root cause analysis chain
  • CLI: argus show, argus replay
  • Local storage in .argus/runs/