Metadata-Version: 2.4
Name: open-agent-observatory
Version: 0.1.1
Summary: The OpenTelemetry for AI agents — structured traces, semantic diffs, and failure pattern mining
Author: Agent Observatory Contributors
License: MIT
Project-URL: Homepage, https://github.com/chiragjoshi12/agent_observatory
Project-URL: Issues, https://github.com/chiragjoshi12/agent_observatory/issues
Keywords: ai,agents,observability,tracing,opentelemetry,llm,langchain,langgraph,crewai,agno
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: click>=8.1
Provides-Extra: otlp
Requires-Dist: requests>=2.28; extra == "otlp"
Provides-Extra: diff
Requires-Dist: sentence-transformers>=2.2; extra == "diff"
Provides-Extra: analytics
Requires-Dist: duckdb>=0.9; extra == "analytics"
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == "openai"
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.20; extra == "anthropic"
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.1; extra == "langchain"
Provides-Extra: langgraph
Requires-Dist: langgraph>=0.1; extra == "langgraph"
Provides-Extra: crewai
Requires-Dist: crewai>=0.1; extra == "crewai"
Provides-Extra: agno
Requires-Dist: agno>=0.1; extra == "agno"
Provides-Extra: all
Requires-Dist: requests>=2.28; extra == "all"
Requires-Dist: sentence-transformers>=2.2; extra == "all"
Requires-Dist: duckdb>=0.9; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21; extra == "dev"
Requires-Dist: black>=23.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"

# Agent Observatory 🔭

## Overview
Developing agentic AI systems is fundamentally different from building traditional, deterministic software. When an agent fails, goes off track, or loops incessantly, it's often a "cognitive" failure happening inside a black box. 

Developers are left struggling to answer:
- *Why did the agent choose this specific tool?*
- *What was the exact prompt context and raw JSON output at step 5 before the failure?*
- *If I tweak this system prompt, does the agent still follow the same behavior path as the old version?*

Relying on basic logs and `print()` statements to debug multi-step reasoning loops wastes tokens, time, and developer sanity.

## The Solution
**Agent Observatory** is a cognitive debugging, tracing, and reliability platform built specifically for complex agentic AI systems. The package is currently in alpha.

It acts as an "X-ray" for your agents: zero-friction auto-instrumentation, intelligent failure mining, and a real-time visual dashboard bring structural correctness and transparency back into your development lifecycle, all running locally and offline.

## Features

- 🔋 **Zero-Friction Auto-Instrumentation**: Drop-in tracing support for major frameworks (`OpenAI`, `Anthropic`, `CrewAI`, `Agno`). Get full visibility into LLM I/O and tool usage without polluting your core business logic with telemetry code.
- 🔬 **Trace Diff Engine**: Compare execution paths (structurally, not just textually) between two different agent runs. Definitively catch prompt regressions and verify whether a model change altered the agent's logical path.
- 🕵️ **Discriminative Failure Miner**: Intelligent loop detection and semantic deduplication to actively catch when agents get stuck in infinite loops, recursive tool failures, or hallucination spirals.
- 💻 **Real-Time Local Dashboard**: A clean, offline-first visualization dashboard (running on `localhost:7421`) that graphs hierarchical agent reasoning paths in real time. Your proprietary prompts and logs stay on your machine instead of being sent to third-party SaaS tools.
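
To make the loop-detection idea from the Failure Miner bullet concrete, here is a minimal sketch that flags tool calls repeated with identical arguments within one run. It is not the package's implementation: the `ToolCall` shape, the `find_repeated_calls` helper, and the default threshold are assumptions chosen for illustration, and exact-match counting stands in for the semantic deduplication described above.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical event shape for illustration: (tool_name, serialized_arguments).
ToolCall = Tuple[str, str]


def find_repeated_calls(calls: Iterable[ToolCall], threshold: int = 3) -> List[ToolCall]:
    """Return tool calls that occur at least `threshold` times in one run."""
    counts = Counter(calls)
    return [call for call, n in counts.items() if n >= threshold]


run = [
    ("search", '{"q": "NVDA price"}'),
    ("search", '{"q": "NVDA price"}'),
    ("calculator", '{"expr": "2+2"}'),
    ("search", '{"q": "NVDA price"}'),
]

# The identical "search" call appears three times, so it is flagged.
suspects = find_repeated_calls(run)
print(suspects)
```

A real miner would likely also compare near-duplicate arguments and look at call order, which this sketch deliberately ignores.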

---

## Getting Started

### 1. Installation

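Install the package into your Python environment. Optional extras such as `otlp`, `diff`, `analytics`, `openai`, `anthropic`, `langchain`, `langgraph`, `crewai`, `agno`, and `all` can be requested with pip's standard extras syntax, for example `pip install "open-agent-observatory[diff]"`.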

```bash
pip install open-agent-observatory
```

### 2. Basic Usage (Universal Auto-Instrumentation)

Enabling the observatory takes a single call. `obs.instrument()` automatically detects and patches installed frameworks such as Agno, OpenAI, or CrewAI.

**Agno Example:**
```python
import agent_observatory as obs
# Automatically detects Agno and instruments the Agent and all tools globally!
obs.instrument()

from agno.agent import Agent
from agno.tools.yfinance import YFinanceTools

agent = Agent(
    name="Finance Agent",
    tools=[YFinanceTools()]
)

# From here, all cognitive steps, agent reasoning, and tool calls are traced
agent.print_response("What is NVDA trading at?")
```

**OpenAI Example:**
```python
import agent_observatory as obs
obs.instrument()

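from openai import OpenAI

# openai>=1.0 uses a client object; the API key is read from OPENAI_API_KEY.
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Analyze my data..."}]
)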
```
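
For readers curious how this kind of drop-in patching can work in general, the sketch below wraps a method so that every call records its inputs, output, error, and duration. It illustrates the general monkey-patching technique only and is not taken from `agent_observatory.auto`; `EVENTS`, `traced`, and `FakeClient` are hypothetical names invented for this example.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory sink, used here only to keep the sketch self-contained.
EVENTS: List[Dict[str, Any]] = []


def traced(func: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap `func` so each call appends one event to EVENTS."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        start = time.perf_counter()
        event: Dict[str, Any] = {"name": func.__name__, "kwargs": kwargs}
        try:
            result = func(*args, **kwargs)
            event["output"] = result
            return result
        except Exception as exc:
            # Record the failure, then let it propagate unchanged.
            event["error"] = repr(exc)
            raise
        finally:
            event["duration_s"] = time.perf_counter() - start
            EVENTS.append(event)

    return wrapper


class FakeClient:
    """Stand-in for a framework class that would be patched."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


# Patch the class attribute once; every instance is traced afterwards.
FakeClient.complete = traced(FakeClient.complete)

print(FakeClient().complete(prompt="hello"))
print(EVENTS[0]["name"], EVENTS[0]["kwargs"])
```

Because the patch replaces the attribute on the class, calling code stays unchanged, which is the property the "zero-friction" claim relies on.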

### 3. Launch the Local Dashboard

Start the real-time visualization server in a separate terminal while you build and test your agents:

```bash
python -m agent_observatory.cli.main serve --port 7421
```
*(Open `http://localhost:7421` in your browser to inspect traces.)*

### 4. Trace Diffing (Regression Testing)

Ensure stability across prompt versions programmatically:

```python
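from agent_observatory.diff import engine

# Placeholders: substitute the IDs of two recorded runs
# (for example, one before and one after a prompt change).
run_id_feature_v1 = "<run-id-v1>"
run_id_feature_v2 = "<run-id-v2>"

# Compare two trace runs structurally to ensure stable reasoning
diff_report = engine.compare_traces(run_id_feature_v1, run_id_feature_v2)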

if diff_report.has_structural_changes:
    print(f"Warning: Agent logic has drifted! {diff_report.summary}")
```
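
As a rough illustration of what "structural" comparison can mean, the sketch below reduces each run to an ordered list of steps and diffs those lists with the standard library's `difflib.SequenceMatcher`. This is an assumption about the general approach, not a description of how `engine.compare_traces` works; the `Step` shape and the `structural_diff` helper are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# Hypothetical step shape for illustration: (step_kind, name).
Step = Tuple[str, str]
Change = Tuple[str, List[Step], List[Step]]


def structural_diff(old: List[Step], new: List[Step]) -> List[Change]:
    """Return the non-matching opcodes between two step sequences."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


run_v1 = [("llm", "plan"), ("tool", "search"), ("llm", "answer")]
run_v2 = [("llm", "plan"), ("tool", "search"), ("tool", "calculator"), ("llm", "answer")]

# Each change is (operation, steps removed from v1, steps added in v2).
for change in structural_diff(run_v1, run_v2):
    print(change)
```

Comparing step sequences rather than raw text means that rewording in an LLM response does not register as a change, while an added, removed, or reordered tool call does.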

## Architecture Map
- `agent_observatory.auto` — Drop-in instrumentation patches.
- `agent_observatory.core` — The robust atomic event tracer.
- `agent_observatory.diff` — The structural Trace Diff Engine.
- `agent_observatory.analytics` — Failure Miner and loop detection algorithms.
- `agent_observatory.store` — SQLite-backed atomic transaction storage.
- `dashboard/` — The raw HTML/JS/CSS visualization layer.
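
To show what SQLite-backed atomic storage can look like in practice, here is a minimal sketch using Python's built-in `sqlite3` module. The `events` table, its columns, and the `open_store` and `write_events` helpers are assumptions made for this example and do not reflect the actual schema or API of `agent_observatory.store`.

```python
import json
import sqlite3
from typing import Any, Dict, List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the (hypothetical) events table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY, run_id TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    return conn


def write_events(conn: sqlite3.Connection, run_id: str, events: List[Dict[str, Any]]) -> None:
    """Insert all events in one transaction: either every row lands or none do."""
    rows = [(run_id, json.dumps(event)) for event in events]
    with conn:  # commits on success, rolls back if an exception escapes
        conn.executemany("INSERT INTO events (run_id, payload) VALUES (?, ?)", rows)


conn = open_store()
write_events(conn, "run-1", [{"step": "plan"}, {"step": "search"}])
count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)
```

Using the connection as a context manager keeps a partially written batch from being committed if the process hits an error mid-insert.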
