Metadata-Version: 2.4
Name: runlens-ai
Version: 0.1.0
Summary: Local-first time-travel debugger for AI agent runs.
Project-URL: Homepage, https://github.com/harshbhatia04/RUNLENS
Project-URL: Repository, https://github.com/harshbhatia04/RUNLENS
Project-URL: Issues, https://github.com/harshbhatia04/RUNLENS/issues
Project-URL: Changelog, https://github.com/harshbhatia04/RUNLENS/blob/main/CHANGELOG.md
Author: RunLens Authors
License: MIT
License-File: LICENSE
Keywords: agents,ai,debugging,observability,replay
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Framework :: FastAPI
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Debuggers
Requires-Python: >=3.10
Requires-Dist: fastapi>=0.115
Requires-Dist: pydantic>=2.8
Requires-Dist: rich>=13.7
Requires-Dist: typer>=0.12
Requires-Dist: uvicorn>=0.30
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.40; extra == 'anthropic'
Provides-Extra: claude
Requires-Dist: anthropic>=0.40; extra == 'claude'
Provides-Extra: dev
Requires-Dist: httpx>=0.27; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.2; extra == 'dev'
Provides-Extra: langgraph
Requires-Dist: langgraph>=0.2; extra == 'langgraph'
Provides-Extra: litellm
Requires-Dist: litellm>=1.44; extra == 'litellm'
Provides-Extra: openai
Requires-Dist: openai>=1.40; extra == 'openai'
Description-Content-Type: text/markdown

# RunLens

**RunLens is a local-first time-travel debugger for AI agents.**

It records prompts, model calls, tool calls, state snapshots, errors, latency, token usage, and metadata into SQLite, then opens a dashboard where you can inspect the timeline, replay a run, fork from a bad step, patch the data, and compare the branch.

The core idea is simple: when an agent fails, you should not be staring at console logs and guessing. You should be able to see exactly what happened.

![RunLens dashboard](docs/assets/dashboard.png)

## Why It Exists

AI agents fail in messy ways:

- a tool returns bad data,
- a model makes an unsupported jump,
- state gets corrupted halfway through a workflow,
- a retry hides the first useful error,
- the final answer looks wrong but the cause is five steps earlier.

RunLens gives those failures a shape: timeline, replay, fork, patch, diff.

## Quick Start

Install from PyPI after release:

```bash
python -m pip install runlens-ai
runlens demo --mock --reset
runlens server
```

Install from a local checkout:

```bash
python -m pip install -e ".[dev]"
python -m runlens demo --mock --reset
python -m runlens server
```

Open:

```text
http://127.0.0.1:8765
```

The built-in demo creates:

- `broken-research-agent`: a failed run caused by weak search results.
- `repaired-research-agent`: a completed fork after patching the bad tool output.
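
The relationship between the two demo runs can be pictured with a small in-memory sketch. This is not RunLens's storage code: `Step`, `fork_run`, and the record layout are hypothetical names chosen for illustration. The idea is that a fork copies the parent's steps up to the chosen one, swaps in the patched output, and leaves the parent untouched.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class Step:
    # Hypothetical step record; the real RunLens schema may differ.
    name: str
    output: dict = field(default_factory=dict)


def fork_run(parent: list[Step], index: int, patched_output: dict) -> list[Step]:
    """Copy steps 0..index from the parent and replace the output at `index`."""
    if not 0 <= index < len(parent):
        raise IndexError(f"no step at index {index}")
    branch = deepcopy(parent[: index + 1])
    branch[index].output = deepcopy(patched_output)
    return branch


broken = [
    Step("prompt", {"text": "Find credible RAG evaluation papers"}),
    Step("search_web", {"results": []}),
    Step("writer", {"draft": "No sources found."}),
]

# Fork from the bad search step with a repaired tool output.
repaired = fork_run(broken, 1, {"results": [{"title": "example paper"}]})
```

Steps after the fork point are dropped in this sketch because they depended on the bad output and would need to be produced again in the branch.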

## What To Try In The Demo

1. Select `broken-research-agent` and click through each step to find the bad `search_web` result.
2. Select `repaired-research-agent` and open `Diff` to see exactly which nested fields changed.
3. Open `Replay` to inspect the deterministic timeline.
4. Select step `02 search_web`, edit the `Fork Output` JSON, and click `Create` to make another branch.
5. Export a trace and import it again to confirm the portable `.runlens.json` workflow.
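
The export/import check in step 5 amounts to a JSON round trip. The sketch below shows that idea with the standard library only. The payload shape and the helper names `export_trace` and `import_trace` are assumptions for illustration; the actual `.runlens.json` format may carry different fields.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical trace payload; the real export format may differ.
trace_payload = {
    "run": {"name": "broken-research-agent", "status": "failed"},
    "steps": [
        {"index": 1, "kind": "prompt", "output": {"text": "Find papers"}},
        {"index": 2, "kind": "tool", "name": "search_web", "output": {"results": []}},
    ],
}


def export_trace(payload: dict, path: Path) -> None:
    """Write the trace as stable, human-diffable JSON."""
    path.write_text(json.dumps(payload, indent=2, sort_keys=True), encoding="utf-8")


def import_trace(path: Path) -> dict:
    """Read a trace back from disk."""
    return json.loads(path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as tmp:
    trace_file = Path(tmp) / "demo.runlens.json"
    export_trace(trace_payload, trace_file)
    round_tripped = import_trace(trace_file)
```

If `round_tripped` equals `trace_payload`, the trace survived the trip unchanged.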

## Python SDK

```python
from runlens import trace

with trace("research-agent", tags=["demo"]) as run:
    run.log_prompt("Find credible RAG evaluation papers")
    run.log_tool_call(
        "search",
        input={"query": "RAG evaluation benchmark"},
        output={"results": []},
    )
    run.log_model_call(
        "writer",
        input={"sources": []},
        output={"draft": "No sources found."},
        total_tokens=88,
        latency_ms=231.4,
    )
    run.log_response({"status": "needs_repair"})
```

Tool decorator:

```python
from runlens import trace

with trace("agent-with-tools") as run:
    @run.tool("lookup")
    def lookup(query: str) -> dict:
        return {"query": query, "result": "example"}

    lookup("RAGAS paper")
```
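
For readers curious how a decorator of this shape can capture calls, here is a stdlib-only sketch. `FakeRun` is a hypothetical stand-in and not the RunLens class; the recorded field names are assumptions. It records input, output or error, and latency for every call, and re-raises errors so the caller's behavior does not change.

```python
import functools
import time


class FakeRun:
    """Hypothetical stand-in for a run object; not the RunLens implementation."""

    def __init__(self) -> None:
        self.steps: list[dict] = []

    def tool(self, name: str):
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                started = time.perf_counter()
                record = {
                    "kind": "tool",
                    "name": name,
                    "input": {"args": list(args), "kwargs": kwargs},
                }
                try:
                    record["output"] = func(*args, **kwargs)
                    return record["output"]
                except Exception as exc:
                    # Record the error, then re-raise it unchanged.
                    record["error"] = repr(exc)
                    raise
                finally:
                    # Runs on success and failure, so every call is logged.
                    record["latency_ms"] = (time.perf_counter() - started) * 1000
                    self.steps.append(record)

            return wrapper

        return decorator


run = FakeRun()


@run.tool("lookup")
def lookup(query: str) -> dict:
    return {"query": query, "result": "example"}


result = lookup("RAGAS paper")
```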

## CLI

```bash
python -m runlens init
python -m runlens demo --mock --reset
python -m runlens runs
python -m runlens replay <run_id>
python -m runlens export <run_id> --out trace.json
python -m runlens import trace.json
python -m runlens doctor
python -m runlens server --host 127.0.0.1 --port 8765
```

If the console script is on your PATH, use `runlens` instead of `python -m runlens`.

For the PyPI release, the distribution is named `runlens-ai` because the name `runlens` is already taken on PyPI. The import name and the CLI command are still `runlens`:

```bash
pip install runlens-ai
runlens demo --mock --reset
```

## Dashboard

The dashboard includes:

- run list with statuses, latency, token totals, and fork markers,
- ordered timeline of prompt/model/tool/state/error/response/check steps,
- step detail panel with input/output/error JSON,
- deterministic replay strip,
- fork editor for patching step output,
- deterministic or live-rerun-scaffold fork mode,
- parent-vs-fork diff with status and token comparison,
- nested changed paths inside input/output/error JSON,
- browser import/export of `.runlens.json` traces,
- responsive layout for desktop and mobile.
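
The "nested changed paths" item can be illustrated with a short recursive comparison of two JSON-like values. This is a sketch of the general technique, not the dashboard's code: the function name `changed_paths` and the dotted/bracketed path notation are assumptions.

```python
from typing import Any


def changed_paths(before: Any, after: Any, prefix: str = "") -> list[str]:
    """Return paths whose values differ between two JSON-like values."""
    paths: list[str] = []
    if isinstance(before, dict) and isinstance(after, dict):
        for key in sorted(set(before) | set(after)):
            child = f"{prefix}.{key}" if prefix else str(key)
            if key not in before or key not in after:
                paths.append(child)  # key added or removed
            else:
                paths.extend(changed_paths(before[key], after[key], child))
    elif isinstance(before, list) and isinstance(after, list):
        shared = min(len(before), len(after))
        for i in range(max(len(before), len(after))):
            child = f"{prefix}[{i}]"
            if i >= shared:
                paths.append(child)  # element added or removed
            else:
                paths.extend(changed_paths(before[i], after[i], child))
    elif before != after:
        paths.append(prefix or "$")  # "$" marks a change at the root
    return paths


parent_output = {"results": [], "meta": {"source": "web", "count": 0}}
fork_output = {
    "results": [{"title": "example paper"}],
    "meta": {"source": "web", "count": 1},
}

diff = changed_paths(parent_output, fork_output)
print(diff)  # ['meta.count', 'results[0]']
```

Reporting paths rather than whole objects keeps a diff readable when a patch changes one field deep inside a large tool output.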

## Adapters

Manual SDK logging works everywhere. Optional adapter helpers are included for common model clients:

```bash
python -m pip install ".[openai]"
python -m pip install ".[anthropic]"
python -m pip install ".[litellm]"
```

OpenAI helper:

```python
from openai import OpenAI
from runlens import trace
from runlens.adapters.openai import RunLensOpenAI

client = OpenAI()

with trace("openai-agent") as run:
    wrapped = RunLensOpenAI(client, run)
    wrapped.chat_completions_create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Say hello"}],
    )
```

Runnable OpenAI example:

```bash
python -m pip install ".[openai]"
export OPENAI_API_KEY="sk-..."  # PowerShell: $env:OPENAI_API_KEY="sk-..."
python examples/openai_agent.py
python -m runlens server
```

LiteLLM helper:

```python
from runlens import trace
from runlens.adapters.litellm import completion

with trace("litellm-agent") as run:
    completion(
        run,
        model="openai/gpt-4o-mini",
        messages=[{"role": "user", "content": "Say hello"}],
    )
```

Claude / Anthropic helper:

```python
from anthropic import Anthropic
from runlens import trace
from runlens.adapters.anthropic import RunLensAnthropic

client = Anthropic()

with trace("claude-agent") as run:
    wrapped = RunLensAnthropic(client, run)
    wrapped.messages_create(
        model="claude-sonnet-4-5",
        max_tokens=500,
        messages=[{"role": "user", "content": "Say hello"}],
    )
```

Runnable Claude example:

```bash
python -m pip install ".[anthropic]"
export ANTHROPIC_API_KEY="sk-ant-..."  # PowerShell: $env:ANTHROPIC_API_KEY="sk-ant-..."
python examples/claude_agent.py
python -m runlens server
```

LangGraph node wrapper:

```python
from runlens import trace
from runlens.adapters.langgraph import wrap_node

def plan_node(state: dict) -> dict:
    return {**state, "plan": ["search", "write", "check"]}

with trace("graph-agent") as run:
    wrapped_plan = wrap_node(run, "plan_node", plan_node)
    wrapped_plan({"task": "research RAG eval"})
```

## Architecture

```text
Python SDK / adapters
        |
        v
RunLensStore (SQLite)
        |
        +-- FastAPI JSON API
        |
        +-- React dashboard static bundle
```

Default database path:

```text
.runlens/runlens.sqlite
```

Override it with:

```bash
export RUNLENS_DB="/path/to/runlens.sqlite"  # PowerShell: $env:RUNLENS_DB="C:\path\to\runlens.sqlite"
```
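
In your own scripts, the same default-plus-override rule can be mirrored in a few lines. This is a sketch: `resolve_db_path` is a hypothetical helper, and treating an empty `RUNLENS_DB` as "not set" is an assumption that the source does not confirm.

```python
import os
from pathlib import Path

DEFAULT_DB = Path(".runlens") / "runlens.sqlite"


def resolve_db_path(env=None) -> Path:
    """Prefer RUNLENS_DB when set and non-empty, else the default path."""
    env = os.environ if env is None else env
    override = env.get("RUNLENS_DB", "").strip()
    return Path(override) if override else DEFAULT_DB


# Passing a mapping keeps the examples independent of the real environment.
default_path = resolve_db_path({})
custom_path = resolve_db_path({"RUNLENS_DB": "/tmp/custom.sqlite"})
```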

## Privacy Defaults

RunLens is local-first. It does not upload traces anywhere.

Redaction is applied before storage for common secret keys such as:

- `api_key`
- `authorization`
- `password`
- `secret`
- `token`
- `cookie`

Large strings and large containers are truncated before storage.
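
A minimal sketch of this kind of pre-storage sanitizing follows. It is not RunLens's redaction code: the `sanitize` name, the exact-match and case-insensitive key rule, the `[REDACTED]` marker, and both size limits are assumptions chosen for illustration.

```python
from typing import Any

SECRET_KEYS = {"api_key", "authorization", "password", "secret", "token", "cookie"}
MAX_STRING = 2000  # assumed limit, for illustration only
MAX_ITEMS = 100    # assumed limit, for illustration only


def sanitize(value: Any) -> Any:
    """Redact secret-looking keys and truncate oversized strings and containers."""
    if isinstance(value, dict):
        cleaned = {}
        for key, item in list(value.items())[:MAX_ITEMS]:
            if str(key).lower() in SECRET_KEYS:
                cleaned[key] = "[REDACTED]"
            else:
                cleaned[key] = sanitize(item)
        return cleaned
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value[:MAX_ITEMS]]
    if isinstance(value, str) and len(value) > MAX_STRING:
        return value[:MAX_STRING] + "...[truncated]"
    return value


payload = {
    "headers": {"Authorization": "Bearer abc123", "Accept": "application/json"},
    "api_key": "sk-test",
    "total_tokens": 88,
    "body": "x" * 5000,
    "items": list(range(250)),
}
clean = sanitize(payload)
```

Matching whole key names, rather than substrings, means a usage field such as `total_tokens` is kept while a `token` key is redacted.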

## Production Guardrails

RunLens v0.1 includes practical hardening for local and demo deployments:

- SQLite WAL mode and write busy timeout,
- restricted default CORS origins,
- configurable request body limit,
- security headers on API and dashboard responses,
- fork patch validation at both API and storage boundaries,
- Docker image with healthcheck and persistent `/data` volume.
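
The first guardrail maps onto two standard SQLite pragmas. The sketch below shows how they are typically set with Python's built-in `sqlite3` module. `open_store` is a hypothetical helper and the 5000 ms timeout is an assumed value, not necessarily what RunLens uses.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_store(path: Path, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Open a SQLite database with WAL journaling and a busy timeout."""
    conn = sqlite3.connect(path)
    # WAL lets readers keep reading while one writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to busy_timeout_ms for a lock instead of failing immediately.
    conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)}")
    return conn


with tempfile.TemporaryDirectory() as tmp:
    conn = open_store(Path(tmp) / "runlens.sqlite")
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    busy_timeout = conn.execute("PRAGMA busy_timeout").fetchone()[0]
    conn.close()
```

WAL mode is stored in the database file and persists across connections, while the busy timeout applies per connection and has to be set each time one is opened.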

Configuration:

```bash
# bash/zsh syntax; in PowerShell use $env:NAME="value" instead of export.
export RUNLENS_DB="/path/to/runlens.sqlite"
export RUNLENS_CORS_ORIGINS="https://your-demo.example.com"
export RUNLENS_MAX_REQUEST_BYTES="5000000"
```

Run with Docker:

```bash
docker build -t runlens-ai:0.1.0 .
docker run --rm -p 8765:8765 -v runlens-data:/data runlens-ai:0.1.0
```

## Development

```bash
python -m pip install -e ".[dev]"
cd frontend
npm install
npm run build
cd ..
python -m pytest
python -m runlens demo --mock --reset
python -m runlens server
```

Frontend dev server:

```bash
cd frontend
npm run dev
```

The Vite dev server proxies `/api` to `http://127.0.0.1:8765`.

## Publishing

Release docs live in:

- `docs/PUBLISHING.md`
- `docs/DEPLOYMENT.md`
- `docs/RELEASE_CHECKLIST.md`
- `CHANGELOG.md`
- `SECURITY.md`
- `CONTRIBUTING.md`

GitHub repository:

```text
https://github.com/harshbhatia04/RUNLENS
```

## v0.1 Status

Included:

- local SQLite trace storage,
- manual Python SDK,
- OpenAI, Anthropic, LiteLLM, and LangGraph adapter helpers,
- FastAPI trace API,
- React dashboard,
- deterministic replay,
- fork-with-patched-output,
- parent-vs-fork diff,
- trace import/export from CLI and browser,
- nested JSON diff paths,
- live-rerun scaffold boundary for patched forks,
- built-in repaired demo,
- tests and package metadata.

Not included yet:

- hosted cloud sync,
- auth,
- billing,
- automatic live rerun across arbitrary private frameworks without user runner code,
- binary artifact storage.

## License

MIT
