Metadata-Version: 2.4
Name: agenttrace-runtime
Version: 0.1.0
Summary: Open-source runtime tracing and diagnostics for AI agent execution flows.
Author: happli-sys
License-Expression: MIT
Project-URL: Homepage, https://github.com/happli-sys/AgentTrace
Project-URL: Issues, https://github.com/happli-sys/AgentTrace/issues
Project-URL: Repository, https://github.com/happli-sys/AgentTrace
Keywords: ai,agent,tracing,observability,runtime,diagnostics,llm
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.2; extra == "langchain"
Provides-Extra: openai-agents
Requires-Dist: openai-agents>=0.0.3; extra == "openai-agents"
Provides-Extra: crewai
Requires-Dist: crewai>=0.1; extra == "crewai"
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: ruff; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Dynamic: license-file

# AgentTrace

<p align="left">
  <a href="https://github.com/happli-sys/AgentTrace"><img alt="GitHub Repo" src="https://img.shields.io/badge/GitHub-AgentTrace-111827?logo=github"></a>
  <img alt="Python" src="https://img.shields.io/badge/Python-3.9%2B-3776AB?logo=python&logoColor=white">
  <img alt="License" src="https://img.shields.io/badge/License-MIT-16a34a">
  <img alt="Status" src="https://img.shields.io/badge/Status-Alpha-f59e0b">
  <img alt="Local First" src="https://img.shields.io/badge/Local--First-Execution%20Tracing-2563eb">
</p>

**Open-source runtime tracing and diagnostics for AI agent execution flows.**

AgentTrace helps you understand **what your agent actually did** at runtime — not just whether the final answer looks good.

It is built for people who want to answer questions like:

- Why did the agent call this tool twice?
- Where did the latency actually come from?
- Which fallback path was triggered?
- What did the LLM see before it made this decision?
- Was the execution flow correct, redundant, or suspicious?

If you want something closer to **pprof + tracing + agent diagnostics**, AgentTrace is designed for that.

---

## Why AgentTrace

Most agent tooling focuses on one of two things:

- **output evaluation** — “was the answer good?”
- **framework abstraction** — “how do I build the agent?”

AgentTrace focuses on a different question:

> **What exactly happened during execution, and why did the agent behave that way?**

That makes it especially useful for:

- debugging execution flow
- diagnosing redundancy and fallback behavior
- inspecting LLM prompts / responses in context
- understanding tool usage patterns
- tracing runtime state across a run

---

## Core capabilities

- Trace `LLM / Tool / Skill` execution flows
- Capture parallel, retry, fallback, and repeated-call patterns
- Record `Prompt / Response / Context / Plan / Execution` snapshots
- Persist runs locally and inspect them in a built-in dashboard
- Review runs with an LLM after execution
- Generate structured diagnostics: critical path, recovery chains, redundant calls, suspicious decisions

---

## What you get

### Execution tracing

AgentTrace records a runtime trace for each run, including:

- span type
- start / end time
- latency
- status
- input parameters
- grouping and parent-child relationships
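
To make the field list above concrete, here is a minimal sketch of what one span record could look like. The `Span` class and all of its field names are hypothetical and chosen for illustration only; they are not AgentTrace's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Span:
    """Hypothetical span record; names are illustrative, not AgentTrace's schema."""

    span_id: str
    span_type: str                      # e.g. "llm", "tool", or "skill"
    start: float                        # start time in seconds
    end: float                          # end time in seconds
    status: str = "ok"                  # e.g. "ok" or "error"
    inputs: dict[str, Any] = field(default_factory=dict)  # input parameters
    parent_id: Optional[str] = None     # parent-child relationship
    group: Optional[str] = None         # shared label, e.g. for parallel calls

    @property
    def latency(self) -> float:
        """Latency derived from the start and end times, in seconds."""
        return self.end - self.start


root = Span("s1", "llm", start=0.0, end=1.5)
child = Span(
    "s2", "tool", start=0.25, end=0.75,
    inputs={"city": "Beijing"}, parent_id=root.span_id,
)
print(child.span_type, child.latency, child.parent_id)
```

Deriving latency from the two timestamps rather than storing it separately keeps the record from becoming inconsistent with itself.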

### Structured state snapshots

For LLM spans, AgentTrace can capture:

- `ContextSnapshot`
- `MemorySnapshot`
- `PlanSnapshot`
- `DecisionSnapshot`
- `ResumeSnapshot`
- `ExecutionSnapshot`

### Diagnostics

AgentTrace builds a diagnostics layer on top of the raw trace:

- critical path
- failed tool calls
- recovery chains
- redundant tool clusters
- suspicious decisions
- filtered review findings
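
As an illustration of one item in this list, redundant tool clusters can be approximated by grouping calls that share a tool name and identical arguments. This is a sketch under that assumption, not AgentTrace's actual detection logic; the `redundant_clusters` helper and the call-record shape are hypothetical.

```python
import json
from collections import defaultdict


def redundant_clusters(calls):
    """Group calls by (tool, canonical args) and keep groups seen more than once."""
    groups = defaultdict(list)
    for index, call in enumerate(calls):
        # Sorting keys makes argument dicts with different key order compare equal.
        key = (call["tool"], json.dumps(call["args"], sort_keys=True))
        groups[key].append(index)
    return {key: indexes for key, indexes in groups.items() if len(indexes) > 1}


calls = [
    {"tool": "get_weather", "args": {"city": "Beijing"}},
    {"tool": "calculate", "args": {"expr": "1+2"}},
    {"tool": "get_weather", "args": {"city": "Beijing"}},
]
clusters = redundant_clusters(calls)
for (tool, args), indexes in clusters.items():
    print(f"{tool} {args} repeated at call positions {indexes}")
```

A repeated call is not always a mistake (a retry after a failure is legitimate), so a real check would likely also look at call status and timing.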

### LLM review

After each run, AgentTrace can ask an LLM to review the recorded execution flow and flag:

- redundant tool calls
- wrong tool choices
- suspicious fallback behavior
- unnecessary skill execution
- likely execution-flow issues

Review strictness is configurable:

- `review_level=1` → tolerant
- `review_level=2` → balanced (default)
- `review_level=3` → strict

At `review_level=1` and `review_level=2`, the UI hides `low` severity findings by default.
At `review_level=3`, all findings are shown.
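
The visibility rule above can be sketched as a small filter. The `visible_findings` function and the finding dictionaries are hypothetical; only the `low` severity label and the three review levels come from the description above, and the `medium` and `high` labels are assumed for the example.

```python
def visible_findings(findings, review_level):
    """Return the findings a UI would show by default at the given review level."""
    if review_level not in (1, 2, 3):
        raise ValueError("review_level must be 1, 2, or 3")
    if review_level == 3:
        # Strict: every finding is shown.
        return list(findings)
    # Tolerant and balanced: hide low severity findings.
    return [f for f in findings if f["severity"] != "low"]


findings = [
    {"severity": "low", "message": "tool called twice with the same input"},
    {"severity": "medium", "message": "fallback triggered without a failure"},
    {"severity": "high", "message": "wrong tool chosen for the task"},
]
print(len(visible_findings(findings, 2)))
print(len(visible_findings(findings, 3)))
```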

---

## Dashboard

AgentTrace includes a local dashboard at:

- `http://localhost:3500`

Current UI features include:

- session list
- execution timeline
- parallel-lane view
- collapsed repeated-tool clusters
- prompt / response modal for LLM spans
- execution-state tabs
- diagnostics panel
- LLM review panel
- collapsible final agent output

---

## Quick start

### 1. Install from source

```bash
git clone https://github.com/happli-sys/AgentTrace.git
cd AgentTrace
pip install -e .
```

### 2. Patch once, trace every run

```python
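import agenttrace
from my_agent import run

# Patch the agent's modules once so that every later run is traced.
agenttrace.patch(
    "my_agent.tools",
    "my_agent.skills",
    "my_agent.llm",
    llm_modules=["my_agent.llm"],
    skill_modules=["my_agent.skills"],
    review_level=2,  # 1 = tolerant, 2 = balanced (default), 3 = strict
)

prompt = "Check the weather in Beijing and calculate 1+2"

# session(...) wraps the run function; calling the wrapper executes one traced run.
traced_run = agenttrace.session(prompt)(run)
output = traced_run(prompt)
print(output)
print(agenttrace.last_result().summary())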
```

### 3. Start the dashboard

```python
from agenttrace.dashboard.server import start_server

start_server(port=3500)
```

Open:

- `http://localhost:3500`

---

## Demo agent

This repo includes a demo agent that intentionally exercises multiple tracing scenarios:

- `bash`
- `read`
- `grep`
- `calculate`
- `get_weather`
- `flaky_weather`
- `weather_report_skill`
- parallel weather queries
- fallback to stable tools

Run it:

```bash
python examples/demo_agent/main.py
```

Stress prompt:

```text
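Analyze the project in the current directory; bash pwd; read examples/demo_agent/tools.py; grep calculate examples/demo_agent; check the weather in Beijing and Xi'an, and calculate 1123123123+1283123; generate a Beijing weather report; finally, summarize.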
```

---

## Integration model

AgentTrace works best for:

- custom Python agents with source code
- local development environments
- CLI / hook-based agents
- runtime debugging and diagnostics workflows

The default integration style is intentionally lightweight:

- patch modules once
- wrap runs with `session(...)`
- inspect results locally
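
For readers curious how "patch modules once" can work in general, the sketch below wraps every public function in a module with a timing wrapper. It illustrates the general monkey-patching technique only and makes no claim about how AgentTrace implements `patch`; `TRACE`, `traced`, and `patch_module` are hypothetical names.

```python
import functools
import time
import types

TRACE = []  # hypothetical in-memory trace; one dict per recorded call


def traced(func, span_type):
    """Wrap func so that each call appends a record to TRACE."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise
        finally:
            TRACE.append({
                "name": func.__name__,
                "type": span_type,
                "status": status,
                "latency": time.perf_counter() - start,
            })
    return wrapper


def patch_module(module, span_type):
    """Replace each public plain function in module with a traced wrapper."""
    for name, obj in list(vars(module).items()):
        if isinstance(obj, types.FunctionType) and not name.startswith("_"):
            setattr(module, name, traced(obj, span_type))


# Demonstration with a throwaway module standing in for "my_agent.tools".
tools = types.ModuleType("demo_tools")


def calculate(a, b):
    return a + b


tools.calculate = calculate
patch_module(tools, "tool")
result = tools.calculate(1, 2)
print(result, TRACE[0]["name"], TRACE[0]["status"])
```

One consequence of this technique is that a reference obtained with `from module import func` before patching still points at the unwrapped function, which is why patching is usually done once, early, before the agent code runs.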

---

## Project scope

AgentTrace is currently optimized as:

- a **runtime tracing tool**
- a **local-first diagnostics tool**
- a **developer-facing execution inspector**

It is **not** currently focused on being:

- a hosted eval platform
- a benchmark leaderboard
- a dataset management system
- a full SaaS observability suite

---

## Who this is for

AgentTrace is especially useful for:

- engineers building custom agents
- teams debugging real runtime behavior
- people who need local-first execution visibility
- anyone who wants to inspect agent decisions beyond final output quality

---

## Roadmap direction

Current direction is intentionally focused:

- stronger execution tracing
- better diagnostics and issue localization
- cleaner runtime state modeling
- broader integration patterns for source-based agents
- more production-friendly export / observability hooks

The goal is to keep AgentTrace useful as a **general execution-flow listener**, not to turn it into a bloated all-in-one platform too early.

---

## Still useful for objective metrics

Although AgentTrace centers on tracing and diagnostics, it still retains objective runtime metrics such as:

- total latency
- avg / p95 step latency
- tool success rate
- token usage
- estimated cost
- step efficiency
- correctness (if `expected_output` is provided)
- regression tracking
- comparison helpers
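
A few of these metrics are simple to compute from per-step data. The sketch below uses made-up latencies and statuses and a nearest-rank percentile; it is an illustration, not necessarily the definition AgentTrace uses, and `p95` is a hypothetical helper.

```python
import math
from statistics import mean


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))  # 1-based rank
    return ordered[rank - 1]


# Made-up example data: step latencies in seconds and tool call outcomes.
step_latencies = [0.25, 0.5, 0.75, 1.0, 4.0]
tool_statuses = ["ok", "ok", "error", "ok"]

total_latency = sum(step_latencies)   # assumes the steps ran sequentially
avg_latency = mean(step_latencies)
p95_latency = p95(step_latencies)
success_rate = tool_statuses.count("ok") / len(tool_statuses)

print(total_latency, avg_latency, p95_latency, success_rate)
```

With only a handful of steps the p95 value is just the slowest step, so the average and the p95 together say more than either alone.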

---

## Contributing

Contributions are welcome — especially around:

- new agent integrations
- richer diagnostics
- runtime state capture
- dashboard usability
- packaging and release polish

For local development:

```bash
git clone https://github.com/happli-sys/AgentTrace.git
cd AgentTrace
pip install -e ".[dev]"
pytest tests/
```

If you want to contribute, small focused improvements are preferred over large platform-style expansions.

---

## License

MIT
