Metadata-Version: 2.4
Name: agenttracer-ai
Version: 0.2.0
Summary: Framework-agnostic observability and debugging for AI agents
Project-URL: Homepage, https://github.com/CrazeXD/agenttracer
Project-URL: Repository, https://github.com/CrazeXD/agenttracer
Project-URL: Issues, https://github.com/CrazeXD/agenttracer/issues
Author: AgentTracer Contributors
License: MIT
License-File: LICENSE
Keywords: agents,ai,anthropic,debugging,langchain,llm,observability,openai,tracing
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Debuggers
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: System :: Monitoring
Requires-Python: >=3.10
Requires-Dist: flask>=3.0
Provides-Extra: all
Requires-Dist: anthropic>=0.20; extra == 'all'
Requires-Dist: langchain-core>=0.1; extra == 'all'
Requires-Dist: openai>=1.0; extra == 'all'
Requires-Dist: pyautogen>=0.2; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.20; extra == 'anthropic'
Provides-Extra: autogen
Requires-Dist: pyautogen>=0.2; extra == 'autogen'
Provides-Extra: dev
Requires-Dist: mypy; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff; extra == 'dev'
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.1; extra == 'langchain'
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == 'openai'
Description-Content-Type: text/markdown

<p align="center">
  <img src="docs/dashboard-screenshot.png" alt="AgentTracer Dashboard" width="800">
</p>

<h1 align="center">AgentTracer</h1>

<p align="center">
  <strong>Framework-agnostic observability and debugging for AI agents.</strong>
  <br>
  See every step, tool call, model call, token usage, cost, and decision — in a clean local dashboard and structured logs.
</p>

<p align="center">
  <a href="#quickstart">Quickstart</a> •
  <a href="#features">Features</a> •
  <a href="#sdk-api">SDK API</a> •
  <a href="#dashboard">Dashboard</a> •
  <a href="#integrations">Integrations</a> •
  <a href="#examples">Examples</a>
</p>

---

## Why AgentTracer?

AI agents are black boxes. When an agent fails, you're left staring at logs wondering: What model was called? What did it return? Which tool errored? How much did that run cost? Why did it choose branch A over branch B?

AgentTracer gives you full visibility into every agent run — regardless of framework — with just a few lines of code.

```python
import agenttracer as at

tracer = at.AgentTracer()

with tracer.trace("my-agent", input_data="What's the weather?") as t:
    tracer.log_model_call(t, model="gpt-4o", input_data="...", output_data="...")
    tracer.log_tool_call(t, tool_name="weather", tool_args={"city": "SF"}, tool_result="72°F")
    tracer.log_decision(t, "route", options=["search", "answer"], chosen="answer")

at.dashboard()  # Open http://localhost:8484
```

## Features

- **Framework-agnostic** — works with OpenAI, Anthropic, LangChain, LangGraph, AutoGen, or plain Python
- **Simple SDK** — `start_trace()`, `log_step()`, `log_tool_call()`, `log_model_call()`, `log_decision()`, `end_trace()`
- **Fork/merge graph view** — visualize parallel subagent branches with `log_fork()`, `log_subagent()`, `log_merge()` and an interactive DAG view in the dashboard
- **Context managers** — `with tracer.trace(...)` for automatic lifecycle management
- **Token & cost estimation** — auto-estimates tokens and costs for 30+ models (GPT-4o, Claude, Gemini, Llama, etc.)
- **Local dashboard** — dark-themed web UI with timeline view, graph view, nested spans, filters, and diff viewer
- **Dual storage** — JSONL (append-only, human-readable) + SQLite (indexed queries, fast filtering)
- **Export** — download traces as JSON or Markdown reports
- **Auto-patching** — wrap OpenAI/Anthropic clients to trace all calls automatically
- **CI/CD ready** — GitHub Actions workflows for testing and automated publishing to PyPI on release
- **Lightweight** — Flask (for the dashboard) is the only runtime dependency; everything else is stdlib

## Quickstart

### Install

```bash
pip install agenttracer-ai
```

Or install from source:

```bash
git clone https://github.com/CrazeXD/agenttracer.git
cd agenttracer
pip install -e .
```

The PyPI package name is `agenttracer-ai`, but the import stays `import agenttracer`.

### Run the demo

Generate sample traces and launch the dashboard:

```bash
python examples/demo_app.py
```

Then open **http://localhost:8484** in your browser.

### Minimal example

```python
import agenttracer as at

tracer = at.AgentTracer(storage_dir="./traces")

# Start a trace
trace = tracer.start_trace("my-agent", input_data="Hello", tags=["demo"])

# Log a model call (tokens and cost auto-estimated)
tracer.log_model_call(
    trace,
    model="gpt-4o",
    input_data=[{"role": "user", "content": "Hello"}],
    output_data="Hi! How can I help?",
)

# Log a tool call
tracer.log_tool_call(
    trace,
    tool_name="search",
    tool_args={"query": "latest news"},
    tool_result={"results": ["..."]},
)

# Log a branching decision
tracer.log_decision(
    trace,
    name="response_strategy",
    options=["concise", "detailed", "follow_up"],
    chosen="concise",
    reasoning="User asked a simple question",
)

# End the trace
tracer.end_trace(trace, output_data="Here's what I found...")

# Launch dashboard
at.dashboard()
```

## SDK API

### Core Functions

| Function | Description |
|----------|-------------|
| `AgentTracer(storage_dir=...)` | Create a tracer instance |
| `start_trace(agent_name, input_data, tags, metadata)` | Begin a new trace |
| `end_trace(trace, status, output_data, error)` | End a trace and persist |
| `log_step(trace, name, input_data, output_data, parent)` | Log a generic step |
| `log_model_call(trace, model, input_data, output_data, token_usage, ...)` | Log an LLM call |
| `log_tool_call(trace, tool_name, tool_args, tool_result, error)` | Log a tool call |
| `log_decision(trace, name, options, chosen, reasoning)` | Log a branching decision |
| `end_span(span, status, output_data, error)` | Explicitly end a span |

### Fork / Merge (Multi-Agent Orchestration)

Trace parallel subagent branches that fork and merge back:

```python
# Fork into parallel branches
fork = tracer.log_fork(trace, "parallel_research", branches=["search", "analysis", "writing"])

# Log each subagent branch (linked by fork_span)
search_sub = tracer.log_subagent(trace, "search-agent", fork_span=fork, input_data="...")
tracer.log_model_call(trace, model="gpt-4o-mini", ..., parent=search_sub)
tracer.end_span(search_sub, output_data="search results")

analysis_sub = tracer.log_subagent(trace, "analysis-agent", fork_span=fork, input_data="...")
tracer.log_tool_call(trace, "code_interpreter", ..., parent=analysis_sub)
tracer.end_span(analysis_sub, output_data="analysis results")

# Merge branches back together
merge = tracer.log_merge(
    trace, "combine_results",
    fork_span=fork,
    source_spans=[search_sub, analysis_sub],
    output_data="merged output",
)
```

| Function | Description |
|----------|-------------|
| `log_fork(trace, name, branches)` | Log a fork point where execution splits into parallel branches |
| `log_subagent(trace, subagent_name, fork_span, input_data)` | Log a subagent branch — returns a span to use as `parent` for all work in the branch |
| `log_merge(trace, name, fork_span, source_spans, output_data)` | Log a merge point where parallel branches rejoin |

The dashboard automatically shows a **Graph** tab with an interactive DAG visualization when a trace contains fork/merge data.

<p align="center">
  <img src="docs/graph-view.png" alt="AgentTracer Graph View — fork/merge DAG" width="700">
</p>

See [`orchestrator_example.py`](examples/orchestrator_example.py) for a complete runnable example.

### Context Managers

```python
# Automatic trace lifecycle
with tracer.trace("my-agent", input_data="...") as t:
    # ... all spans auto-closed on exit
    pass  # trace auto-ends with SUCCESS

# Automatic span lifecycle
with tracer.span(trace, "processing") as s:
    result = process(data)
    s.output_data = result
```

### Token Usage

Provide exact token counts or let AgentTracer estimate:

```python
# Auto-estimated (~4 chars/token)
tracer.log_model_call(trace, model="gpt-4o", input_data="...", output_data="...")

# Exact counts from API response
tracer.log_model_call(
    trace,
    model="gpt-4o",
    token_usage={"prompt_tokens": 150, "completion_tokens": 89, "total_tokens": 239},
)
```
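The auto-estimate is a simple character-count heuristic. A minimal sketch of the assumed behavior (the real estimator lives in `src/agenttracer/pricing.py` and may round or clamp differently):

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4-characters-per-token rule of thumb."""
    return max(1, len(text) // 4)

print(estimate_tokens("Hello, how can I help?"))  # 22 chars -> 5
```

Exact counts from the provider's API response are always more accurate, so prefer passing `token_usage` when you have it.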

### Nested Spans

```python
parent = tracer.log_step(trace, "research")
tracer.log_model_call(trace, model="gpt-4o", ..., parent=parent)
tracer.log_tool_call(trace, "search", ..., parent=parent)
tracer.end_span(parent)
```

### Export

```python
import agenttracer as at

# Export as JSON
json_str = at.export_json(trace)
at.export_json(trace, path="trace.json")  # to file

# Export as Markdown report
md = at.export_markdown(trace)
at.export_markdown(trace, path="report.md")  # to file
```

### CLI

```bash
# Launch dashboard
agenttracer dashboard --port 8484

# List recent traces
agenttracer list --agent research-bot --limit 20

# Export a trace
agenttracer export <trace-id> --format json -o trace.json
agenttracer export <trace-id> --format markdown -o report.md
```

## Dashboard

<p align="center">
  <img src="docs/dashboard-detail.png" alt="AgentTracer Detail View" width="800">
</p>

The local web dashboard provides:

- **Trace list** — see all agent runs with status, duration, tokens, cost, and error counts
- **Filters** — filter by agent name, status, errors, and tags
- **Timeline view** — nested span tree showing the execution flow
- **Graph view** — interactive DAG visualization of fork/merge/subagent branches with color-coded nodes and curved edges (auto-appears when trace has fork/merge data)
- **Expandable spans** — click any span to see model, tokens, cost, input/output, tool args/results
- **Input/Output diff** — side-by-side view of trace input and output
- **Export buttons** — download any trace as JSON or Markdown
- **Auto-refresh** — updates every 10 seconds
- **Aggregate stats** — total traces, tokens, cost, and agent count in the header

Launch it:

```python
import agenttracer as at
at.dashboard(port=8484)
```

Or via CLI:

```bash
agenttracer dashboard
```

## Integrations

### OpenAI

Auto-patch an OpenAI client to trace all `chat.completions.create` calls:

```python
from agenttracer.integrations.openai_wrapper import patch_openai

client = patch_openai(openai.OpenAI(), tracer, trace)
response = client.chat.completions.create(model="gpt-4o", messages=[...])
# ^ automatically logged with tokens, cost, and any tool calls
```

### Anthropic

```python
from agenttracer.integrations.anthropic_wrapper import patch_anthropic

client = patch_anthropic(anthropic.Anthropic(), tracer, trace)
response = client.messages.create(model="claude-sonnet-4-20250514", messages=[...])
```

### LangChain / LangGraph

Use the callback handler:

```python
from agenttracer.integrations.langchain_callback import AgentTracerCallback

callback = AgentTracerCallback(tracer, trace)
chain.invoke(input, config={"callbacks": [callback]})

# Also works with LangGraph
graph.invoke(input, config={"callbacks": [callback]})
```

### AutoGen

```python
from agenttracer.integrations.autogen_wrapper import trace_autogen_chat

result = trace_autogen_chat(
    tracer, trace,
    initiator=user_proxy,
    recipient=assistant,
    message="Solve this problem...",
)
```

### Plain Python

No framework? No problem. Use the SDK directly:

```python
tracer = at.AgentTracer()
trace = tracer.start_trace("my-script")
tracer.log_step(trace, "step-1", input_data="...")
tracer.log_model_call(trace, model="gpt-4o", ...)
tracer.end_trace(trace)
```

## Cost Estimation

AgentTracer includes pricing data for 30+ models:

| Provider | Models |
|----------|--------|
| OpenAI | GPT-4o, GPT-4o-mini, GPT-4, GPT-3.5, o1, o3, o4-mini |
| Anthropic | Claude 4 Opus/Sonnet, Claude 3.5 Sonnet/Haiku, Claude 3 Opus |
| Google | Gemini 2.5 Pro/Flash, Gemini 2.0 Flash |
| Meta | Llama 3.1 70B/8B |
| Mistral | Mixtral 8x7B |
| DeepSeek | DeepSeek V3, DeepSeek R1 |

Costs are estimated per-call and aggregated per-trace. Update pricing in `src/agenttracer/pricing.py`.
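As a sketch of the per-call arithmetic (the rates below are placeholders for illustration, not AgentTracer's actual pricing table), cost is prompt and completion tokens multiplied by the model's per-million-token input and output rates:

```python
# Placeholder rates: (input USD, output USD) per 1M tokens. Real values
# live in src/agenttracer/pricing.py.
PRICES_PER_MTOK = {"gpt-4o": (2.50, 10.00)}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Per-call cost: tokens scaled by the model's per-million-token rates."""
    input_rate, output_rate = PRICES_PER_MTOK[model]
    return (prompt_tokens / 1e6) * input_rate + (completion_tokens / 1e6) * output_rate

print(round(estimate_cost("gpt-4o", 150, 89), 6))  # 0.001265
```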

## Storage

Traces are stored in both formats simultaneously:

- **JSONL** (`traces.jsonl`) — append-only, human-readable, `grep`-friendly
- **SQLite** (`traces.db`) — indexed columns for fast filtering by agent, status, duration, cost, tags

Default storage dir: `./agenttracer_data/`
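Because JSONL stores one record per line, ad-hoc analysis needs nothing beyond the standard library. A minimal sketch (the records and field names such as `trace_id` and `status` are illustrative assumptions, not the verified schema):

```python
import io
import json

# Stand-in for ./agenttracer_data/traces.jsonl with hypothetical records.
sample = io.StringIO(
    '{"trace_id": "t1", "agent_name": "research-bot", "status": "success"}\n'
    '{"trace_id": "t2", "agent_name": "research-bot", "status": "error"}\n'
)

# Parse each line as a standalone JSON record, then filter.
traces = [json.loads(line) for line in sample]
failed = [t["trace_id"] for t in traces if t["status"] == "error"]
print(failed)  # ['t2']
```

For larger trace volumes, the SQLite backend's indexed columns make the same filtering a single query instead of a full-file scan.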

## Project Structure

```
agenttracer/
├── src/agenttracer/
│   ├── __init__.py          # Public API
│   ├── tracer.py            # Core tracer SDK
│   ├── models.py            # Data models (Trace, Span, TokenUsage, etc.)
│   ├── pricing.py           # Token counting & cost estimation
│   ├── __main__.py          # CLI entry point
│   ├── storage/
│   │   ├── jsonl.py         # JSONL storage backend
│   │   └── sqlite.py        # SQLite storage backend
│   ├── exporters/
│   │   ├── json_export.py   # JSON export
│   │   └── markdown_export.py # Markdown report export
│   ├── integrations/
│   │   ├── openai_wrapper.py    # OpenAI auto-patching
│   │   ├── anthropic_wrapper.py # Anthropic auto-patching
│   │   ├── langchain_callback.py # LangChain/LangGraph callback
│   │   └── autogen_wrapper.py   # AutoGen integration
│   └── dashboard/
│       ├── app.py           # Flask dashboard app
│       ├── templates/       # HTML templates
│       └── static/          # CSS, JS & graph.js (SVG DAG renderer)
├── examples/
│   ├── basic_agent.py       # Simple traced agent
│   ├── multi_step_agent.py  # Complex agent with nesting
│   ├── orchestrator_example.py # Multi-agent fork/merge pattern
│   ├── context_manager_demo.py # Context manager API
│   ├── openai_example.py    # OpenAI integration
│   ├── langchain_example.py # LangChain integration
│   └── demo_app.py          # Generate sample data + dashboard
├── .github/workflows/
│   ├── ci.yml               # CI: tests, lint, build check on push/PR
│   └── publish.yml          # Auto-publish to PyPI on GitHub release
├── tests/
├── pyproject.toml
├── LICENSE (MIT)
└── README.md
```

## Examples

| Example | Description |
|---------|-------------|
| [`basic_agent.py`](examples/basic_agent.py) | Simple agent with model calls, tool calls, and decisions |
| [`multi_step_agent.py`](examples/multi_step_agent.py) | Multi-step research agent with nested spans and error handling |
| [`orchestrator_example.py`](examples/orchestrator_example.py) | **Multi-agent fork/merge** — orchestrator splits into parallel subagents and merges results |
| [`context_manager_demo.py`](examples/context_manager_demo.py) | Concise API using `with` statements |
| [`openai_example.py`](examples/openai_example.py) | Auto-trace OpenAI API calls |
| [`langchain_example.py`](examples/langchain_example.py) | LangChain/LangGraph callback handler |
| [`demo_app.py`](examples/demo_app.py) | Generate 25 sample traces and launch the dashboard |

## Contributing

Contributions are welcome. Please open an issue or PR.

```bash
# Development setup
git clone https://github.com/CrazeXD/agenttracer.git
cd agenttracer
pip install -e ".[dev]"
python -m pytest tests/ -v
```

## License

MIT — see [LICENSE](LICENSE).
