Metadata-Version: 2.4
Name: agentobs
Version: 1.0.1
Summary: Tracium — reference implementation of RFC-0001 AGENTOBS, the Observability Schema Standard for Agentic AI Systems
Project-URL: Homepage, https://github.com/veerarag1973/agentobs
Project-URL: Documentation, https://github.com/veerarag1973/agentobs/blob/main/docs/index.md
Project-URL: Source, https://github.com/veerarag1973/agentobs
Project-URL: Bug Tracker, https://github.com/veerarag1973/agentobs/issues
Project-URL: Changelog, https://github.com/veerarag1973/agentobs/blob/main/docs/changelog.md
Author: LLM Toolkit Team
License: MIT
License-File: LICENSE
Keywords: agentic-ai,agentobs,events,llm,observability,opentelemetry,schema
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Logging
Classifier: Typing :: Typed
Requires-Python: >=3.9
Provides-Extra: all
Requires-Dist: anthropic>=0.25; extra == 'all'
Requires-Dist: httpx>=0.27; extra == 'all'
Requires-Dist: jsonschema>=4.21; extra == 'all'
Requires-Dist: kafka-python>=2.0; extra == 'all'
Requires-Dist: langchain-core>=0.2; extra == 'all'
Requires-Dist: llama-index-core>=0.10; extra == 'all'
Requires-Dist: openai>=1.0; extra == 'all'
Requires-Dist: opentelemetry-sdk>=1.24; extra == 'all'
Requires-Dist: pydantic>=2.7; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.25; extra == 'anthropic'
Provides-Extra: datadog
Provides-Extra: dev
Requires-Dist: httpx>=0.27; extra == 'dev'
Requires-Dist: hypothesis>=6.100; extra == 'dev'
Requires-Dist: jsonschema>=4.21; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pre-commit>=3.7; extra == 'dev'
Requires-Dist: pydantic>=2.7; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-benchmark>=4.0; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest-xdist>=3.5; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.4.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: pydata-sphinx-theme>=0.15; extra == 'docs'
Requires-Dist: sphinx>=7.0; extra == 'docs'
Provides-Extra: groq
Requires-Dist: groq>=0.9; extra == 'groq'
Provides-Extra: http
Requires-Dist: httpx>=0.27; extra == 'http'
Provides-Extra: jsonschema
Requires-Dist: jsonschema>=4.21; extra == 'jsonschema'
Provides-Extra: kafka
Requires-Dist: kafka-python>=2.0; extra == 'kafka'
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.2; extra == 'langchain'
Provides-Extra: llamaindex
Requires-Dist: llama-index-core>=0.10; extra == 'llamaindex'
Provides-Extra: ollama
Requires-Dist: ollama>=0.2; extra == 'ollama'
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == 'openai'
Provides-Extra: otel
Requires-Dist: opentelemetry-sdk>=1.24; extra == 'otel'
Provides-Extra: pydantic
Requires-Dist: pydantic>=2.7; extra == 'pydantic'
Provides-Extra: together
Requires-Dist: together>=1.2; extra == 'together'
Description-Content-Type: text/markdown

<h1 align="center">AgentOBS</h1>

<p align="center">
  <strong>The reference implementation of the AGENTOBS Standard.</strong><br/>
  A lightweight Python SDK that gives your AI applications a common, structured way to record, sign, redact, and export events — with zero mandatory dependencies.
</p>

<p align="center">
  <em>AGENTOBS (RFC-0001) is the open event-schema standard for observability of agentic AI systems.</em>
</p>

<p align="center">
  <img src="https://img.shields.io/badge/python-3.9%2B-4c8cbf?logo=python&logoColor=white" alt="Python 3.9+"/>
  <a href="https://www.getspanforge.com/standard"><img src="https://img.shields.io/badge/standard-AGENTOBS_RFC--0001-4c8cbf" alt="AGENTOBS RFC-0001"/></a>
  <img src="https://img.shields.io/badge/coverage-96%25-brightgreen" alt="96% test coverage"/>
  <img src="https://img.shields.io/badge/tests-1837%20passing-brightgreen" alt="1837 tests"/>
  <img src="https://img.shields.io/badge/dependencies-zero-brightgreen" alt="Zero dependencies"/>
  <a href="docs/index.md"><img src="https://img.shields.io/badge/docs-local-4c8cbf" alt="Documentation"/></a>
  <img src="https://img.shields.io/badge/license-MIT-blue" alt="MIT license"/>
</p>

---

## What is this?

**AgentOBS** (``tracium``) is the **reference implementation of [RFC-0001 AGENTOBS](https://www.getspanforge.com/standard)** — the open event-schema standard for observability of agentic AI systems.

AGENTOBS defines a structured, typed event envelope that every LLM-adjacent instrumentation tool can emit and every observability backend can consume. It covers the full lifecycle: event envelopes, agent span hierarchies, token and cost models, HMAC audit chains, PII redaction, OTLP-compatible export, and schema governance.

> Think of **AgentOBS** as a **universal receipt format** for your AI application.
> Every time your app calls a language model, makes a decision, redacts private data, or checks a guardrail — this library gives that action a consistent, structured record that any tool in your stack can read.

---

## Why use it?

Without a shared schema, every team invents their own log format. With ``tracium`` (and the AGENTOBS standard it implements), your logs, dashboards, compliance reports, and monitoring tools all speak the same language — automatically.

| Without AgentOBS | With AgentOBS |
|---|---|
| Each service logs events differently | Every event follows the same structure |
| Hard to audit who saw what data | Built-in HMAC signing creates a tamper-evident audit trail |
| PII scattered across logs | First-class PII redaction before data leaves your app |
| Vendor-specific observability | OpenTelemetry-compatible — works with any monitoring stack |
| No way to check compatibility | CLI + programmatic compliance checks in CI |
| Complex integration glue | Zero required dependencies — just ``pip install`` |

---

## Install

```bash
pip install agentobs
```

```python
import tracium  # distribution name is agentobs, import name is tracium
```

**Requires Python 3.9 or later.** No other packages are required for core usage.

> **Note:** The PyPI distribution is named `agentobs`. The Python import name remains `tracium`.
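
Because the two names differ, tools that query installed packages use the distribution name, while code imports `tracium`. A minimal standard-library sketch of such a lookup; `installed_version` is a hypothetical helper, not part of the SDK:

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Look up by distribution name ("agentobs"), not by import name ("tracium").
print(installed_version("agentobs"))
```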

### Optional extras

```bash
pip install "agentobs[jsonschema]"   # strict JSON Schema validation
pip install "agentobs[http]"         # Webhook + OTLP export
pip install "agentobs[pydantic]"     # Pydantic v2 model layer
pip install "agentobs[otel]"         # OpenTelemetry SDK integration
pip install "agentobs[kafka]"        # EventStream.from_kafka() via kafka-python
pip install "agentobs[langchain]"    # LangChain callback handler
pip install "agentobs[llamaindex]"   # LlamaIndex event handler
pip install "agentobs[datadog]"      # Datadog APM + metrics exporter
pip install "agentobs[all]"          # everything above
```

---

## Five-minute tour

### 1 — Trace an LLM call with the span API

```python
import tracium

tracium.configure(exporter="console", service_name="my-agent")

with tracium.span("call-llm") as span:
    span.set_model(model="gpt-4o", system="openai")
    result = call_llm(prompt)                          # your LLM call here
    span.set_token_usage(input=512, output=128, total=640)
    span.set_status("ok")
```

The context manager automatically records start/end times, parent-child span relationships, and emits a structured event when it exits.

---

### 2 — Record a raw event

```python
from tracium import Event, EventType, Tags

event = Event(
    event_type=EventType.TRACE_SPAN_COMPLETED,
    source="my-app@1.0.0",          # who emitted this
    org_id="org_acme",              # your organisation
    payload={
        "model": "gpt-4o",
        "prompt_tokens": 512,
        "completion_tokens": 128,
        "latency_ms": 340.5,
    },
    tags=Tags(env="production"),
)

event.validate()         # raises if structure is invalid
print(event.to_json())   # compact JSON string, ready to store or ship
```

Every event gets a **ULID** (a time-sortable unique ID) automatically — no need to generate one yourself.
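
For readers new to ULIDs: a ULID packs a 48-bit millisecond timestamp and 80 random bits into 26 Crockford base32 characters, which is why the IDs sort by creation time. The sketch below illustrates that layout with the standard library only; `sketch_ulid` is a hypothetical helper, not the generator the SDK uses.

```python
import os
import time
from typing import Optional

# Crockford base32 alphabet (no I, L, O, U); characters are in ASCII order.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def sketch_ulid(timestamp_ms: Optional[int] = None) -> str:
    """Build a ULID-shaped string: 48-bit timestamp followed by 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    randomness = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    value = (timestamp_ms << 80) | randomness            # 128 bits in total
    chars = []
    for _ in range(26):                                  # 26 chars x 5 bits
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


earlier = sketch_ulid(1_700_000_000_000)
later = sketch_ulid(1_700_000_000_001)
print(earlier < later)   # True: string order follows the timestamp
```

IDs created within the same millisecond differ only in their random part, so this sketch does not order them relative to each other.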

---

### 3 — Redact private information before logging

```python
from tracium import Event, EventType
from tracium.redact import Redactable, RedactionPolicy, Sensitivity

policy = RedactionPolicy(min_sensitivity=Sensitivity.PII, redacted_by="policy:gdpr-v1")

# Wrap any string that might contain PII
event = Event(
    event_type=EventType.TRACE_SPAN_COMPLETED,
    source="my-app@1.0.0",
    payload={"prompt": Redactable("Call me at 555-867-5309", Sensitivity.PII)},
)
result = policy.apply(event)
# result.event.payload["prompt"] -> "[REDACTED by policy:gdpr-v1]"
```

``Redactable`` is a string wrapper. You mark fields as sensitive at the point where they are created; the policy decides what to remove before the event is written to any log.
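
The mark-then-filter pattern can be sketched in a few lines of plain Python. Everything here is hypothetical and for illustration only: `Level`, `Marked`, and `apply_policy_sketch` are stand-ins, the ordering of levels is an assumption, and the real `tracium.redact` classes may behave differently.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Any, Dict


class Level(IntEnum):
    """Hypothetical sensitivity ordering, lowest to highest."""
    INTERNAL = 1
    PII = 2
    SECRET = 3


@dataclass(frozen=True)
class Marked:
    """A string tagged with its sensitivity at the point of creation."""
    value: str
    level: Level


def apply_policy_sketch(payload: Dict[str, Any], min_level: Level, redacted_by: str) -> Dict[str, Any]:
    """Return a copy of payload with sufficiently sensitive values masked."""
    clean: Dict[str, Any] = {}
    for key, item in payload.items():
        if isinstance(item, Marked):
            masked = item.level >= min_level
            clean[key] = f"[REDACTED by {redacted_by}]" if masked else item.value
        else:
            clean[key] = item            # unmarked values pass through unchanged
    return clean


clean = apply_policy_sketch(
    {"prompt": Marked("Call me at 555-867-5309", Level.PII), "model": "gpt-4o"},
    min_level=Level.PII,
    redacted_by="policy:gdpr-v1",
)
print(clean)
```

Keeping the marking separate from the policy means the threshold can change without touching the code that creates the data.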

> **Tip — auto-redact every span:** pass `redaction_policy=policy` to
> `tracium.configure()` and the policy runs automatically inside `_dispatch()`
> before any exporter sees the event.

---

### 4 — Sign events for tamper-proof audit trails

```python
from tracium.signing import sign, verify_chain, AuditStream

# Sign a single event
signed = sign(event, org_secret="my-org-secret")

# Or build a chain — every event references the one before it,
# so any gap or modification is immediately detectable.
stream = AuditStream(org_secret="my-org-secret")
for e in events:
    stream.append(e)

result = verify_chain(stream.events, org_secret="my-org-secret")
```

This is the same principle used in certificate chains and blockchain — each event's signature covers the previous event's signature, so you cannot alter history without breaking the chain.
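
The chaining idea can be shown with nothing but `hmac` and `hashlib`. This is a simplified sketch, not the SDK's wire format: the record layout and the helper names `sign_chain_sketch` and `verify_chain_sketch` are assumptions made for illustration.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, List


def _mac(secret: bytes, prev_sig: str, payload: Dict[str, Any]) -> str:
    # Canonical JSON so the same payload always produces the same bytes.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hmac.new(secret, (prev_sig + body).encode(), hashlib.sha256).hexdigest()


def sign_chain_sketch(payloads: List[Dict[str, Any]], secret: bytes) -> List[Dict[str, Any]]:
    """Sign each payload together with the previous record's signature."""
    records, prev_sig = [], ""
    for payload in payloads:
        sig = _mac(secret, prev_sig, payload)
        records.append({"payload": payload, "prev": prev_sig, "sig": sig})
        prev_sig = sig
    return records


def verify_chain_sketch(records: List[Dict[str, Any]], secret: bytes) -> bool:
    """Recompute every signature and check each link to its predecessor."""
    prev_sig = ""
    for record in records:
        expected = _mac(secret, prev_sig, record["payload"])
        if record["prev"] != prev_sig or not hmac.compare_digest(expected, record["sig"]):
            return False
        prev_sig = record["sig"]
    return True


chain = sign_chain_sketch([{"n": 1}, {"n": 2}, {"n": 3}], b"my-org-secret")
print(verify_chain_sketch(chain, b"my-org-secret"))                   # True
print(verify_chain_sketch([chain[0], chain[2]], b"my-org-secret"))    # False: gap
```

Editing a payload or dropping a middle record breaks verification. Dropping records from the end is not detected by this sketch.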

> **Tip — auto-sign every span:** pass `signing_key="your-secret"` to
> `tracium.configure()` and every emitted span is signed and chained
> automatically, with no per-event boilerplate.

---

### 5 — Export to anywhere

```python
import asyncio

from tracium.stream import EventStream
from tracium.export.jsonl import JSONLExporter
from tracium.export.webhook import WebhookExporter
from tracium.export.otlp import OTLPExporter
from tracium.export.datadog import DatadogExporter
from tracium.export.grafana import GrafanaLokiExporter


async def main(events):
    stream = EventStream(events)

    # Write everything to a local file
    await stream.drain(JSONLExporter("events.jsonl"))

    # Ship to your OpenTelemetry collector
    await stream.drain(OTLPExporter("http://otel-collector:4318/v1/traces"))

    # Send to Datadog APM (traces + metrics)
    await stream.drain(DatadogExporter(
        service="my-app",
        env="production",
        agent_url="http://dd-agent:8126",
        api_key="your-dd-api-key",
    ))

    # Push to Grafana Loki
    await stream.drain(GrafanaLokiExporter(
        url="http://loki:3100",
        labels={"app": "my-app", "env": "production"},
    ))

    # Fan-out: guard-blocked events -> Slack webhook
    await stream.route(
        WebhookExporter("https://hooks.slack.com/your-webhook"),
        predicate=lambda e: e.event_type == "llm.guard.blocked",
    )


# `events` is a list of Event objects built earlier
asyncio.run(main(events))
```

#### Kafka source

```python
from tracium.stream import EventStream

# Drain a Kafka topic directly into an EventStream
stream = EventStream.from_kafka(
    topic="llm-events",
    bootstrap_servers="kafka:9092",
    group_id="analytics",
    max_messages=5000,
)
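# `exporter` is any exporter instance, e.g. JSONLExporter("events.jsonl");
# the await must run inside an async function.
await stream.drain(exporter)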
```

---

### 6 — Sync exporters for non-async workflows

```python
from tracium.exporters.jsonl import SyncJSONLExporter
from tracium.exporters.console import SyncConsoleExporter

# Log all events to a JSONL file synchronously
exporter = SyncJSONLExporter("events.jsonl")
exporter.export(event)
exporter.close()

# Pretty-print events to the terminal during development
console = SyncConsoleExporter()
console.export(event)
```

---

### 7 — Check compliance and inspect events from the command line

```bash
tracium check-compat events.json        # v2.0 compatibility checklist
tracium validate events.jsonl           # JSON Schema validation per event
tracium audit-chain events.jsonl        # verify HMAC signing chain integrity
tracium inspect <EVENT_ID> events.jsonl # pretty-print a single event
tracium stats events.jsonl              # summary: counts, tokens, cost, timestamps
tracium list-deprecated                 # list all deprecated event types
tracium migration-roadmap [--json]      # v2 migration roadmap
tracium check-consumers                 # consumer registry compatibility check
```

```
CHK-1  All required fields present          (500 / 500 events)
CHK-2  Event types valid                    (500 / 500 events)
CHK-3  Source identifiers well-formed       (500 / 500 events)
CHK-5  Event IDs are valid ULIDs            (500 / 500 events)
All checks passed.
```

Drop any of these into your CI pipeline to catch schema drift, signing failures, or schema-breaking migrations before they reach production.
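
One way to wire these commands into CI is a small wrapper script. The sketch below assumes that each `tracium` sub-command exits with a non-zero status when its check fails; that behaviour is an assumption here, so confirm it before relying on it. `run_checks`, `CHECKS`, and `main` are hypothetical names.

```python
import subprocess
import sys
from typing import List, Sequence

# Commands taken from the list above; adjust file names to your pipeline.
CHECKS: List[List[str]] = [
    ["tracium", "check-compat", "events.json"],
    ["tracium", "validate", "events.jsonl"],
    ["tracium", "audit-chain", "events.jsonl"],
]


def run_checks(commands: Sequence[Sequence[str]]) -> List[Sequence[str]]:
    """Run each command and return the ones that failed."""
    failed = []
    for cmd in commands:
        try:
            code = subprocess.run(cmd).returncode
        except FileNotFoundError:
            code = 127                   # executable not on PATH counts as a failure
        if code != 0:
            failed.append(cmd)
    return failed


def main() -> int:
    failed = run_checks(CHECKS)
    for cmd in failed:
        print("FAILED:", " ".join(cmd), file=sys.stderr)
    return 1 if failed else 0
```

A CI step would then end with `sys.exit(main())` so that any failed check fails the build.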

---

## What is inside the box

<table>
<thead>
<tr><th>Module</th><th>What it does</th><th>For whom</th></tr>
</thead>
<tbody>
<tr>
  <td><code>tracium.event</code></td>
  <td>The core <code>Event</code> envelope — the one structure all tools share</td>
  <td>Everyone</td>
</tr>
<tr>
  <td><code>tracium.types</code></td>
  <td>All built-in event type strings (trace, cost, cache, eval, guard…)</td>
  <td>Everyone</td>
</tr>
<tr>
  <td><code>tracium.config</code></td>
  <td><code>configure()</code> and <code>get_config()</code> — global SDK configuration</td>
  <td>Everyone</td>
</tr>
<tr>
  <td><code>tracium._span</code></td>
  <td>Span, AgentRun, AgentStep context managers — the runtime tracing API</td>
  <td>App developers</td>
</tr>
<tr>
  <td><code>tracium._cli</code></td>
  <td>8 CLI sub-commands: <code>check-compat</code>, <code>validate</code>, <code>audit-chain</code>, <code>inspect</code>, <code>stats</code>, <code>list-deprecated</code>, <code>migration-roadmap</code>, <code>check-consumers</code></td>
  <td>DevOps / CI teams</td>
</tr>
<tr>
  <td><code>tracium.redact</code></td>
  <td>PII detection, sensitivity levels, redaction policies</td>
  <td>Data privacy / GDPR teams</td>
</tr>
<tr>
  <td><code>tracium.signing</code></td>
  <td>HMAC-SHA256 event signing and tamper-evident audit chains</td>
  <td>Security / compliance teams</td>
</tr>
<tr>
  <td><code>tracium.compliance</code></td>
  <td>Programmatic v2.0 compatibility checks — no pytest required</td>
  <td>Platform / DevOps teams</td>
</tr>
<tr>
  <td><code>tracium.export</code></td>
  <td>Ship events to files (JSONL), HTTP webhooks, OTLP collectors, Datadog APM, or Grafana Loki</td>
  <td>Infra / observability teams</td>
</tr>
<tr>
  <td><code>tracium.exporters</code></td>
  <td>Sync exporters — <code>SyncJSONLExporter</code> and <code>SyncConsoleExporter</code> for non-async code</td>
  <td>App developers</td>
</tr>
<tr>
  <td><code>tracium.stream</code></td>
  <td>Fan-out router — one <code>drain()</code> call reaches multiple backends; Kafka source via <code>from_kafka()</code></td>
  <td>Platform engineers</td>
</tr>
<tr>
  <td><code>tracium.validate</code></td>
  <td>JSON Schema validation against the published v2.0 schema</td>
  <td>All teams</td>
</tr>
<tr>
  <td><code>tracium.consumer</code></td>
  <td>Declare schema-namespace dependencies; fail fast at startup if version requirements are not met</td>
  <td>Platform / integration teams</td>
</tr>
<tr>
  <td><code>tracium.governance</code></td>
  <td>Policy-based event gating — block prohibited types, warn on deprecated usage, enforce custom rules</td>
  <td>Platform / compliance teams</td>
</tr>
<tr>
  <td><code>tracium.deprecations</code></td>
  <td>Register and surface per-event-type deprecation notices at runtime</td>
  <td>Library maintainers</td>
</tr>
<tr>
  <td><code>tracium.integrations</code></td>
  <td>Plug-in adapters for OpenAI, LangChain, LlamaIndex, Anthropic, Groq, Ollama, and Together</td>
  <td>App developers</td>
</tr>
<tr>
  <td><code>tracium.namespaces</code></td>
  <td>Typed payload dataclasses for all 10 built-in event namespaces</td>
  <td>Tool authors</td>
</tr>
<tr>
  <td><code>tracium.models</code></td>
  <td>Optional Pydantic v2 models for teams that prefer validated schemas</td>
  <td>API / backend teams</td>
</tr>
</tbody>
</table>

---

## Event namespaces

Every event carries a ``payload`` — a dictionary whose shape is defined by the event's **namespace**. The ten built-in namespaces cover everything from raw model traces to safety guardrails:

| Namespace prefix | Dataclass | What it records |
|---|---|---|
| ``llm.trace.*`` | ``SpanPayload``, ``AgentRunPayload``, ``AgentStepPayload`` | Model call — tokens, latency, finish reason **(frozen v2)** |
| ``llm.cost.*`` | ``CostPayload`` | Per-call cost in USD |
| ``llm.cache.*`` | ``CachePayload`` | Cache hit/miss, backend, TTL |
| ``llm.eval.*`` | ``EvalScenarioPayload`` | Scores, labels, evaluator identity |
| ``llm.guard.*`` | ``GuardPayload`` | Safety classifier output, block decisions |
| ``llm.fence.*`` | ``FencePayload`` | Topic constraints, allow/block lists |
| ``llm.prompt.*`` | ``PromptPayload`` | Prompt template version, rendered text |
| ``llm.redact.*`` | ``RedactPayload`` | PII audit record — what was found and removed |
| ``llm.diff.*`` | ``DiffPayload`` | Prompt/response delta between two events |
| ``llm.template.*`` | ``TemplatePayload`` | Template registry metadata |

```python
from tracium.namespaces.trace import SpanPayload
from tracium import Event

payload = SpanPayload(
    span_name="call-llm",
    span_id="abc123",
    trace_id="def456",
    start_time_ns=1_000_000_000,
    end_time_ns=1_340_000_000,
    status="ok",
)

event = Event(
    event_type="llm.trace.span.completed",
    source="my-app@1.0.0",
    payload=payload.to_dict(),
)
```
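
Since the namespace is encoded as a prefix of the event type string, a consumer can recover it with a simple prefix match. A minimal sketch using the ten prefixes from the table above; `namespace_of` is a hypothetical helper, not an SDK function:

```python
# Namespace prefixes as listed in the table above.
NAMESPACES = (
    "llm.trace", "llm.cost", "llm.cache", "llm.eval", "llm.guard",
    "llm.fence", "llm.prompt", "llm.redact", "llm.diff", "llm.template",
)


def namespace_of(event_type: str) -> str:
    """Return the built-in namespace an event type belongs to."""
    for namespace in NAMESPACES:
        # Match on a full dotted segment so "llm.costly.x" is not "llm.cost".
        if event_type.startswith(namespace + "."):
            return namespace
    raise ValueError(f"unknown namespace for event type: {event_type!r}")


print(namespace_of("llm.trace.span.completed"))   # llm.trace
print(namespace_of("llm.guard.blocked"))          # llm.guard
```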

---

## Quality standards

- **1 837 tests** — unit, integration, property-based (Hypothesis), and performance benchmarks
- **96 % line and branch coverage** — measured with ``pytest-cov``
- **Zero required dependencies** — the entire core runs on Python's standard library alone
- **Typed** — full ``py.typed`` marker; works with mypy and pyright out of the box
- **Frozen v2 trace schema** — ``llm.trace.*`` payload fields will never break between minor releases

---

## Project structure

```
tracium/
├── __init__.py       <- Public API surface (start here)
├── event.py          <- The Event envelope
├── types.py          <- EventType enum
├── config.py         <- configure() / get_config() / TraciumConfig
├── _span.py          <- Span, AgentRun, AgentStep context managers
├── _tracer.py        <- Tracer — top-level tracing entry point
├── _stream.py        <- Internal dispatch: redact → sign → export
├── _cli.py           <- CLI entry-point (8 sub-commands)
├── signing.py        <- HMAC signing & audit chains
├── redact.py         <- PII redaction
├── validate.py       <- JSON Schema validation
├── consumer.py       <- Consumer registry & schema-version compatibility
├── governance.py     <- Event governance policies
├── deprecations.py   <- Per-event-type deprecation tracking
├── compliance/       <- Compatibility checklist suite
├── export/
│   ├── jsonl.py      <- Local file export (async)
│   ├── webhook.py    <- HTTP POST export
│   ├── otlp.py       <- OpenTelemetry export
│   ├── datadog.py    <- Datadog APM traces + metrics
│   └── grafana.py    <- Grafana Loki export
├── exporters/
│   ├── jsonl.py      <- SyncJSONLExporter
│   └── console.py    <- SyncConsoleExporter
├── stream.py         <- EventStream fan-out router (+ Kafka source)
├── integrations/
│   ├── langchain.py  <- LangChain callback handler
│   ├── llamaindex.py <- LlamaIndex event handler
│   ├── openai.py     <- OpenAI tracing wrapper
│   └── ...           (anthropic, groq, ollama, together)
├── namespaces/       <- Typed payload dataclasses
│   ├── trace.py        (SpanPayload, AgentRunPayload, AgentStepPayload — frozen v2)
│   ├── cost.py
│   ├── cache.py
│   └── ...
├── models.py         <- Optional Pydantic v2 models
└── migrate.py        <- Schema migration helpers
examples/             <- Runnable sample scripts
├── openai_chat.py    <- OpenAI + JSONL export
├── agent_workflow.py <- Multi-step agent + console exporter
├── langchain_chain.py<- LangChain callback handler
└── secure_pipeline.py<- HMAC signing + PII redaction together
```

---

## Development setup

```bash
git clone https://github.com/veerarag1973/agentobs.git
cd agentobs

python -m venv .venv
.venv\Scripts\activate          # Windows
# source .venv/bin/activate     # macOS / Linux

pip install -e ".[dev]"
pytest                          # run all 1 837 tests
```

<details>
<summary><strong>Code quality commands</strong></summary>

```bash
ruff check .                  # linting
ruff format .                 # auto-format
mypy tracium                  # type checking
pytest --cov                  # tests + coverage report (>=90% required)
```

</details>

<details>
<summary><strong>Build the docs locally</strong></summary>

```bash
pip install -e ".[docs]"
cd docs
sphinx-build -b html . _build/html   # open _build/html/index.html
```

</details>

---

## Compatibility and versioning

``tracium`` implements **RFC-0001 AGENTOBS** (Observability Schema Standard for Agentic AI Systems). The current schema version is **2.0**.

This project follows [Semantic Versioning](https://semver.org/):

- **Patch** releases (``1.0.x``) — bug fixes only, fully backwards-compatible
- **Minor** releases (``1.x.0``) — new features, backwards-compatible
- **Major** releases (``x.0.0``) — breaking changes, announced in advance

The ``llm.trace.*`` namespace payload schema is **additionally frozen at v2**: even a major release will not remove or rename fields from ``SpanPayload``, ``AgentRunPayload``, or ``AgentStepPayload``.
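
As a quick illustration of the release rules above, the following sketch classifies the jump between two version strings. It handles only plain `MAJOR.MINOR.PATCH` versions (no pre-release or build tags), and `bump_kind` is a hypothetical helper rather than part of the SDK.

```python
from typing import Tuple


def _parse(version: str) -> Tuple[int, int, int]:
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def bump_kind(old: str, new: str) -> str:
    """Classify the change from old to new as major, minor, patch, or none."""
    o, n = _parse(old), _parse(new)
    if n[0] != o[0]:
        return "major"      # may contain breaking changes
    if n[1] != o[1]:
        return "minor"      # new features, backwards-compatible
    if n[2] != o[2]:
        return "patch"      # bug fixes only
    return "none"


print(bump_kind("1.0.1", "1.0.2"))   # patch
print(bump_kind("1.0.1", "1.1.0"))   # minor
print(bump_kind("1.0.1", "2.0.0"))   # major
```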

---

## Changelog

See [docs/changelog.md](docs/changelog.md) for the full version history.

---

## Contributing

Contributions are welcome! Please read the [Contributing Guide](docs/contributing.md) first, then open an issue or pull request.

Key rules:
- All new code must maintain **>= 90 % test coverage**
- Follow the existing **Google-style docstrings**
- Run ``ruff`` and ``mypy`` before submitting

---

## License

[MIT](LICENSE) — free for personal and commercial use.

---

<p align="center">
  Made with care for the AI observability community.<br/>
  <a href="docs/index.md">Docs</a> ·
  <a href="docs/quickstart.md">Quickstart</a> ·
  <a href="docs/api/index.md">API Reference</a> ·
  <a href="https://github.com/veerarag1973/agentobs/issues">Report a bug</a>
</p>
