Metadata-Version: 2.4
Name: agentguard47
Version: 1.2.10
Summary: Zero-dependency runtime control for production Python agents - stop loops, retry storms, and budget burn
Author-email: BMD PAT LLC <pat@bmdpat.com>
License-Expression: MIT
Project-URL: Homepage, https://agentguard47.com
Project-URL: Documentation, https://github.com/bmdhodl/agent47#readme
Project-URL: Repository, https://github.com/bmdhodl/agent47
Project-URL: Issues, https://github.com/bmdhodl/agent47/issues
Project-URL: Changelog, https://github.com/bmdhodl/agent47/releases
Keywords: agents,coding-agents,ai-agents,multi-agent,llm,guardrails,runtime-guardrails,loop-detection,budget-guard,retry-guard,runtime-enforcement,runtime-control,production-agents,coding-agent-safety,local-first,retry-storms,budget-control,langchain,openai,anthropic
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Testing
Classifier: Topic :: System :: Monitoring
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.3.84; extra == "langchain"
Provides-Extra: langgraph
Requires-Dist: langgraph>=0.6.11; extra == "langgraph"
Provides-Extra: crewai
Requires-Dist: crewai>=0.28; extra == "crewai"
Provides-Extra: otel
Requires-Dist: opentelemetry-api>=1.41.0; extra == "otel"
Requires-Dist: opentelemetry-sdk>=1.41.0; extra == "otel"
Dynamic: license-file

<!-- Generated by scripts/generate_pypi_readme.py. Edit README.md and CHANGELOG.md instead. -->

# AgentGuard

**Stop runaway Python agents before they burn money.**

AgentGuard47 is the zero-dependency runtime control layer for production Python
agents. Start local with static guards that raise exceptions on budget overruns,
tool loops, retry storms, and timeouts. Add the hosted dashboard when agent work
becomes shared, expensive, or risky.

The SDK is the free local proof path:
- no runtime dependencies
- no dashboard required
- no network calls unless you opt into `HttpSink`
- guard trips stop the agent in-process

[![PyPI](https://img.shields.io/pypi/v/agentguard47)](https://pypi.org/project/agentguard47/)
[![Downloads](https://img.shields.io/pypi/dm/agentguard47)](https://pypi.org/project/agentguard47/)
[![Python](https://img.shields.io/pypi/pyversions/agentguard47)](https://pypi.org/project/agentguard47/)
[![CI](https://github.com/bmdhodl/agent47/actions/workflows/ci.yml/badge.svg)](https://github.com/bmdhodl/agent47/actions/workflows/ci.yml)
[![Coverage](https://img.shields.io/badge/coverage-93%25-brightgreen)](https://github.com/bmdhodl/agent47)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/bmdhodl/agent47/blob/v1.2.10/LICENSE)
[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/bmdhodl/agent47/badge)](https://scorecard.dev/viewer/?uri=github.com/bmdhodl/agent47)
[![GitHub stars](https://img.shields.io/github/stars/bmdhodl/agent47?style=social)](https://github.com/bmdhodl/agent47)

```bash
pip install agentguard47
```

## Local proof in 60 seconds

```bash
agentguard doctor
agentguard demo
agentguard quickstart --framework raw
```

`doctor` verifies the install without network calls. `demo` proves budget, loop,
and retry protection offline. `quickstart` prints the smallest starter for the
stack you actually use.

First value moment: you should see `BudgetGuard`, `LoopGuard`, and
`RetryGuard` stop simulated runaway behavior before you wire a real provider or
share any data.

## Copy-paste safe repo setup

Use this when you want a coding agent or teammate to add AgentGuard to a repo
without hidden network behavior:

```bash
pip install agentguard47
agentguard doctor
agentguard quickstart --framework raw --write
python agentguard_raw_quickstart.py
agentguard report .agentguard/traces.jsonl
```

Optional shared local defaults, saved as `.agentguard.json` in the repo root:

```json
{
  "profile": "coding-agent",
  "service": "my-agent",
  "trace_file": ".agentguard/traces.jsonl",
  "budget_usd": 5.0
}
```

Keep the first PR local-only. No API keys, no dashboard settings, no hosted
sink. Add `HttpSink` later only when retained incidents, alerts, or team
visibility are needed.

## Wrap one agent run

```python
from agentguard import BudgetGuard, JsonlFileSink, LoopGuard, Tracer

tracer = Tracer(
    sink=JsonlFileSink(".agentguard/traces.jsonl"),
    service="support-agent",
    guards=[
        BudgetGuard(max_cost_usd=5.00, warn_at_pct=0.8),
        LoopGuard(max_repeats=3),
    ],
)

with tracer.trace("agent.run") as span:
    span.event("tool.call", data={"tool": "search", "query": "refund policy"})
    # Call your agent or tool here. Guards fire during runtime events.
```

Then inspect the local proof:

```bash
agentguard report .agentguard/traces.jsonl
agentguard incident .agentguard/traces.jsonl
```

## What the hosted dashboard adds

Keep the first integration local. Add `HttpSink` when you need retained
incidents, alerts, team visibility, hosted decision history, or dashboard-driven
remote kill signals. `HttpSink` mirrors trace and decision events to the
dashboard; it does not execute remote kill signals by itself.

Dashboard contract details: [`docs/guides/dashboard-contract.md`](https://github.com/bmdhodl/agent47/blob/main/docs/guides/dashboard-contract.md)

## Why this wedge

AgentGuard stays focused on coding-agent safety on purpose.

In an April 2026 report, a16z said that `29%` of the Fortune 500 and about
`19%` of the Global 2000 were live paying customers of leading AI startups,
with coding described as the dominant enterprise AI use case and support/search
next behind it. The report also cited repeated claims of `10-20x`
productivity gains from AI coding tools. Source:
[AI Adoption by the Numbers](https://www.a16z.news/p/ai-adoption-by-the-numbers).

That supports the public SDK strategy in this repo:
- stay narrow on coding-agent runtime safety
- make the first proof local, cheap, and easy to trust
- reuse the same runtime patterns for adjacent managed-agent workflows later,
  without turning the SDK into a generic observability platform

## Why runtime safety matters now

Agents are getting more autonomous. The guardrails around them are not keeping up.

- **Unchecked token burn is real.** Meta's internal "Claudeonomics" leaderboard
  tracked 60 trillion tokens consumed by 85,000 employees in 30 days. Some
  employees left agents running for hours just to climb the rankings. Meta shut
  the dashboard down days after it leaked. ([source](https://fortune.com/2026/04/09/meta-killed-employee-ai-token-dashboard/))
- **Coding-agent adoption can outrun budgets.** Briefs reported that Uber burned
  its 2026 AI tooling budget for Claude Code and Cursor by April, with monthly
  per-engineer API costs ranging from $500 to $2,000. AgentGuard is the local
  cap before those experiments become surprise bills. ([source](https://www.briefs.co/news/uber-torches-entire-2026-ai-budget-on-claude-code-in-four-months/))
- **Self-improving agents need guardrails that don't self-improve.** Cursor's
  Bugbot has auto-generated 44,000+ learned rules across 110,000+ repos. When
  agents write their own rules, the safety layer has to be external and
  deterministic. Not another model. Not another prompt.
  ([source](https://cursor.com/blog/bugbot-learning))
- **Layered agent architectures are the default now.** Orchestrators spawn
  sub-agents that spawn tool calls. Every layer multiplies the blast radius of a
  stuck loop or a retry storm. You need a guard that runs in-process, at every
  layer, and kills the run before it compounds.

AgentGuard is that layer. Zero dependencies. No network calls required. Raises
an exception and stops the agent mid-run.

## Token-metered pricing changes the failure mode

Most model APIs already bill on token-linked usage. That means a runaway agent
is not the only budget risk anymore. One oversized turn with a huge context
window or a verbose completion can erase the run budget on its own. Runtime
budget guards are no longer optional.

AgentGuard's `BudgetGuard` is built for that reality:
- cap spend for the whole run, not just call count
- warn before the limit is gone
- raise `BudgetExceeded` on the spike turn itself

Local proof:

```bash
python examples/per_token_budget_spike.py
agentguard report per_token_budget_spike_traces.jsonl
```

That example prices each turn from token counts, then shows a single
token-heavy turn blowing through the run budget without any network calls or
provider credentials.

## Why static guards

Cost control is table stakes. The harder problem is behavior control.

Recent evidence shows that frontier models scheme, deceive, and resist shutdown
when given autonomy:

- **Mythos Preview** (April 2026) found exploitable vulnerabilities in every
  major OS and browser during a controlled evaluation. The findings triggered a
  government emergency meeting. ([source](https://mythos.dev/preview))
- **Nature** (2026) published peer-reviewed evidence of LLMs disabling their own
  oversight mechanisms, scheming toward hidden objectives, and leaving concealed
  notes to future instances of themselves. ([source](https://www.nature.com/))
- **War games research** put GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash into
  simulated geopolitical conflicts. Every model showed spontaneous deception.
  None surrendered. Multiple runs escalated to nuclear strikes despite explicit
  taboo framing. ([source](https://arxiv.org/abs/2602.14740))

ML-based safety layers share the same failure mode as the agents they guard:
they can be persuaded, prompt-injected, or socially engineered into disabling
themselves. A model that schemes can also scheme past a model-based monitor.

AgentGuard's guards are static, deterministic, rule-based checks. They run
in-process. They raise exceptions. They cannot be convinced, negotiated with,
or talked out of a budget limit. That is the point.

Cost control tells you when to stop spending. Behavior control tells you when
to stop the agent. AgentGuard does both.

## Verify your install

Before wiring a real agent, validate the local SDK path:

```bash
agentguard doctor
```

`doctor` makes no network calls. It verifies local trace writing, confirms the
SDK can initialize in local-only mode, detects optional integrations already
installed in your environment, and prints the smallest correct next-step snippet.

## Generate a starter

When you know the stack you want to wire, print the exact starter snippet:

```bash
agentguard quickstart --framework raw
agentguard quickstart --framework openai
agentguard quickstart --framework langgraph --json
```

`quickstart` is designed for both humans and coding agents. It prints the
install command, the smallest credible starter file, and the next commands to
run after you validate the SDK locally.

If you want a real file instead of a printed snippet:

```bash
agentguard quickstart --framework raw --write
agentguard quickstart --framework openai --write --output agentguard_openai_quickstart.py
```

`--write` creates a local starter file you can run immediately. It refuses to
overwrite an existing file unless you pass `--force`.

## Coding-Agent Defaults

If you want humans and coding agents to share the same safe local defaults, add
a tiny `.agentguard.json` file to the repo:

```json
{
  "profile": "coding-agent",
  "service": "support-agent",
  "trace_file": ".agentguard/traces.jsonl",
  "budget_usd": 5.0
}
```

`agentguard.init(local_only=True)` and `agentguard doctor` will pick this up
automatically. Keep it local and static: no secrets, no API keys, no dashboard
settings.

Every `agentguard quickstart --framework ...` payload also has a matching
runnable file under [`examples/starters/`](https://github.com/bmdhodl/agent47/tree/v1.2.10/examples/starters). Those starter
files live in the repo for copy-paste onboarding and coding-agent setup; they
are not shipped inside the PyPI wheel.

For the repo-first onboarding flow, see
[`docs/guides/coding-agents.md`](https://github.com/bmdhodl/agent47/blob/v1.2.10/docs/guides/coding-agents.md).

For copy-paste setup snippets tailored to Codex, Claude Code, GitHub Copilot,
Cursor, and MCP-capable agents, see
[`docs/guides/coding-agent-safety-pack.md`](https://github.com/bmdhodl/agent47/blob/v1.2.10/docs/guides/coding-agent-safety-pack.md).

If you want AgentGuard to generate those repo-local instruction files for you:

```bash
agentguard skillpack --write
agentguard skillpack --target claude-code --write --output-dir .
```

`skillpack` writes a local `.agentguard.json` plus agent-specific instruction
files for Codex, Claude Code, Copilot, or Cursor. By default it writes into
`agentguard_skillpack/` so you can review the files before copying them into a
real repo.

## MCP Server for Coding-Agent Workflows

If your coding agent already uses MCP, AgentGuard also ships a published
read-only MCP server that exposes traces, decision events, alerts, usage,
costs, and budget health from the AgentGuard read API:

```bash
npx -y @agentguard47/mcp-server
```

The MCP server is intentionally narrow. Use the SDK to enforce safety where the
agent runs. Add MCP when you want Codex, Claude Code, Cursor, or another
MCP-compatible client to inspect traces and incidents without bespoke glue.

## Stateless Harnesses

If one managed-agent session can span multiple disposable harnesses or worker
processes, pass a shared `session_id` to correlate those traces above the
single-`trace_id` level:

```python
from agentguard import JsonlFileSink, Tracer

tracer = Tracer(
    sink=JsonlFileSink(".agentguard/traces.jsonl"),
    service="managed-harness-a",
    session_id="support-session-001",
)
```

Each tracer instance still creates its own `trace_id`, but every emitted span
and point event also carries the shared `session_id`. Guide:
[`docs/guides/managed-agent-sessions.md`](https://github.com/bmdhodl/agent47/blob/main/docs/guides/managed-agent-sessions.md)

## Try it in 60 seconds

No API keys. No dashboard. No network calls. Just run it:

```bash
pip install agentguard47
agentguard demo
```

```
AgentGuard offline demo
No API keys. No dashboard. No network calls.

1. BudgetGuard: stopping runaway spend
  warning fired at $0.84
  stopped on call 9: cost $1.08 exceeded $1.00

2. LoopGuard: stopping repeated tool calls
  stopped on repeated tool call: Loop detected ...

3. RetryGuard: stopping retry storms
  stopped retry storm: Retry limit exceeded ...

Local proof complete.
```

Prefer the example script instead of the CLI? This does the same local demo:

```bash
python examples/try_it_now.py
```

Want the coding-agent version of the failure? This local proof simulates a
review loop where repeated edit attempts burn the run budget and a stuck patch
retry storm gets stopped:

```bash
python examples/coding_agent_review_loop.py
agentguard incident coding_agent_review_loop_traces.jsonl
```

Sample output:
[`docs/examples/coding-agent-review-loop-incident.md`](https://github.com/bmdhodl/agent47/blob/main/docs/examples/coding-agent-review-loop-incident.md)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/bmdhodl/agent47/blob/v1.2.10/examples/quickstart.ipynb)

## Quickstart: Stop a Runaway Coding Agent in 4 Lines

```python
from agentguard import Tracer, BudgetGuard, patch_openai

tracer = Tracer(guards=[BudgetGuard(max_cost_usd=5.00, warn_at_pct=0.8)])
patch_openai(tracer)  # auto-tracks every OpenAI call

# Use OpenAI normally - AgentGuard tracks cost and kills the agent at $5
```

That's it. Every `ChatCompletion` call is tracked. When accumulated cost hits $4 (80%), your warning fires. At $5, `BudgetExceeded` is raised and the agent stops.

No config files. No dashboard required. No dependencies.

For a deterministic local proof before wiring a real agent, run:

```bash
agentguard doctor
agentguard quickstart --framework raw
agentguard demo
```

`agentguard doctor` verifies the install path. `agentguard quickstart` prints
the copy-paste starter for your stack. `agentguard demo` then proves SDK-only
enforcement with a realistic local run. Keep the first integration local and
only add hosted pieces after you need retained incidents or team-visible
follow-through.

## The Problem

Coding agents are cheap to start and expensive to leave unattended:
- **Cost overruns average 340%** on autonomous agent tasks ([source](https://arxiv.org/abs/2401.15811))
- A single stuck retry or tool loop can burn through your budget in minutes
- Existing tracing tools show you what happened after the burn, not stop the run while it is still happening

**AgentGuard is built to stop a runaway coding agent mid-run, not just explain the damage later.**

| | AgentGuard | LangSmith | Langfuse | Portkey |
|---|---|---|---|---|
| Hard budget enforcement | **Yes** | No | No | No |
| Kill agent mid-run | **Yes** | No | No | No |
| Loop detection | **Yes** | No | No | No |
| Cost tracking | **Yes** | Yes | Yes | Yes |
| Zero dependencies | **Yes** | No | No | No |
| Self-hosted option | **Yes** | No | Yes | No |
| Price | **Free (MIT)** | $2.50/1k traces | $59/mo | $49/mo |

See also: [AgentGuard vs Vercel AI Gateway](https://github.com/bmdhodl/agent47/blob/v1.2.10/docs/competitive/vercel-ai-gateway.md) -- in-process SDK vs gateway proxy, compared across 7 axes; and [Where AgentGuard fits in the agent security stack](https://github.com/bmdhodl/agent47/blob/main/docs/competitive/agent-security-stack.md) -- identity, MCP governance, sandboxing, and runtime behavior as separate layers.

## Guards

Guards are runtime checks that raise exceptions when limits are hit. The agent stops immediately.

| Guard | What it stops | Example |
|-------|--------------|---------|
| `BudgetGuard` | Dollar/token/call overruns | `BudgetGuard(max_cost_usd=5.00)` |
| `LoopGuard` | Exact repeated tool calls | `LoopGuard(max_repeats=3)` |
| `FuzzyLoopGuard` | Similar tool calls, A-B-A-B patterns | `FuzzyLoopGuard(max_tool_repeats=5)` |
| `TimeoutGuard` | Wall-clock time limits | `TimeoutGuard(max_seconds=300)` |
| `RateLimitGuard` | Calls-per-minute throttling | `RateLimitGuard(max_calls_per_minute=60)` |
| `RetryGuard` | Retry storms on the same flaky tool | `RetryGuard(max_retries=3)` |
| `BudgetAwareEscalation` | Hard turns that should switch to a stronger model | `BudgetAwareEscalation(..., escalate_on=EscalationSignal.TOKEN_COUNT(threshold=2000))` |

```python
from agentguard import BudgetGuard, BudgetExceeded

budget = BudgetGuard(
    max_cost_usd=10.00,
    warn_at_pct=0.8,
    on_warning=lambda msg: print(f"WARNING: {msg}"),
)

# In your agent loop:
budget.consume(tokens=1500, calls=1, cost_usd=0.03)
# At 80% → warning callback fires
# At 100% → BudgetExceeded raised, agent stops
```

```python
from agentguard import RetryGuard, RetryLimitExceeded, Tracer

retry_guard = RetryGuard(max_retries=3)
tracer = Tracer(guards=[retry_guard])

with tracer.trace("agent.run") as span:
    try:
        span.event("tool.retry", data={"tool_name": "search", "attempt": 1})
        span.event("tool.retry", data={"tool_name": "search", "attempt": 2})
        span.event("tool.retry", data={"tool_name": "search", "attempt": 3})
        span.event("tool.retry", data={"tool_name": "search", "attempt": 4})
    except RetryLimitExceeded:
        # Retry storm stopped
        pass
```

```python
from agentguard import BudgetAwareEscalation, EscalationSignal

guard = BudgetAwareEscalation(
    primary_model="ollama/llama3.1:8b",
    escalate_model="claude-opus-4-6",
    escalate_on=(
        EscalationSignal.TOKEN_COUNT(threshold=2000),
        EscalationSignal.CONFIDENCE_BELOW(threshold=0.45),
    ),
)

model = guard.select_model(token_count=2430, confidence=0.39)
```

`BudgetAwareEscalation` gives you an advisor-style pattern without hiding the
provider call inside the SDK. AgentGuard decides when the current turn is too
hard for the cheap model; your app still chooses how to invoke the stronger
model.

Guide:
[`docs/guards/budget-aware-escalation.md`](https://github.com/bmdhodl/agent47/blob/main/docs/guards/budget-aware-escalation.md)

## Integrations

### LangChain

```bash
pip install agentguard47[langchain]
```

```python
from agentguard import Tracer, BudgetGuard
from agentguard.integrations.langchain import AgentGuardCallbackHandler

tracer = Tracer(guards=[BudgetGuard(max_cost_usd=5.00)])
handler = AgentGuardCallbackHandler(
    tracer=tracer,
    budget_guard=BudgetGuard(max_cost_usd=5.00),
)

# Pass to any LangChain component
llm = ChatOpenAI(callbacks=[handler])
```

### LangGraph

```bash
pip install agentguard47[langgraph]
```

```python
from agentguard.integrations.langgraph import guarded_node

@guarded_node(tracer=tracer, budget_guard=BudgetGuard(max_cost_usd=5.00))
def research_node(state):
    return {"messages": state["messages"] + [result]}
```

### CrewAI

```bash
pip install agentguard47[crewai]
```

```python
from agentguard.integrations.crewai import AgentGuardCrewHandler

handler = AgentGuardCrewHandler(
    tracer=tracer,
    budget_guard=BudgetGuard(max_cost_usd=5.00),
)

agent = Agent(role="researcher", step_callback=handler.step_callback)
```

### OpenAI / Anthropic Auto-Instrumentation

```python
from agentguard import Tracer, BudgetGuard, patch_openai, patch_anthropic

tracer = Tracer(guards=[BudgetGuard(max_cost_usd=5.00)])
patch_openai(tracer)      # auto-tracks all ChatCompletion calls
patch_anthropic(tracer)   # auto-tracks all Messages calls
```

## Multi-Agent Safety

When multiple agents share state, a common failure mode is the reactive loop: Agent A updates shared state, Agent B reacts, Agent A reacts to B's update, and the cycle repeats. Without an explicit termination condition, these loops consume tokens indefinitely without converging on a result.

Anthropic's [multi-agent coordination patterns](https://www.anthropic.com/engineering/built-multi-agent-research-system) guide calls out this exact risk for shared-state architectures and recommends time budgets and threshold-based stopping. AgentGuard's `BudgetGuard` and `TimeoutGuard` are those stopping conditions.

```python
from agentguard import BudgetGuard, BudgetExceeded, TimeoutGuard, TimeoutExceeded

# Shared budget across both agents. When either hits the limit, the loop stops.
budget = BudgetGuard(max_cost_usd=2.00, warn_at_pct=0.8,
                     on_warning=lambda msg: print(f"WARN: {msg}"))
timeout = TimeoutGuard(max_seconds=120)

shared_state = {"revision": 0, "content": ""}

try:
    with timeout:
        while True:
            timeout.check()
            # Agent A: writer
            shared_state["content"] = f"draft v{shared_state['revision']}"
            budget.consume(tokens=500, calls=1, cost_usd=0.01)

            # Agent B: reviewer
            shared_state["revision"] += 1
            budget.consume(tokens=300, calls=1, cost_usd=0.008)
except (BudgetExceeded, TimeoutExceeded) as e:
    print(f"Terminated: {e}")
    print(f"Final state: revision {shared_state['revision']}")
```

The guards are static and deterministic. No agent can talk its way past a dollar limit or a wall-clock timeout.

## Cost Tracking

Built-in pricing for OpenAI, Anthropic, Google, Mistral, and Meta models. Updated monthly.

```python
from agentguard import estimate_cost

# Single call estimate
cost = estimate_cost("gpt-4o", input_tokens=1000, output_tokens=500)
# → $0.00625

# Track across a trace — cost is auto-accumulated per span
with tracer.trace("agent.run") as span:
    span.cost.add("gpt-4o", input_tokens=1200, output_tokens=450)
    span.cost.add("claude-sonnet-4-5-20250929", input_tokens=800, output_tokens=300)
    # cost_usd included in trace end event
```

## Tracing

Full structured tracing with zero dependencies — JSONL output, spans, events, and cost data.

```python
from agentguard import Tracer, JsonlFileSink, BudgetGuard

tracer = Tracer(
    sink=JsonlFileSink("traces.jsonl"),
    guards=[BudgetGuard(max_cost_usd=5.00)],
)

with tracer.trace("agent.run") as span:
    span.event("reasoning", data={"thought": "search docs"})
    with span.span("tool.search", data={"query": "quantum computing"}):
        pass  # your tool logic
    span.cost.add("gpt-4o", input_tokens=1200, output_tokens=450)
```

```bash
$ agentguard report traces.jsonl

AgentGuard report
  Total events: 9
  Spans: 6  Events: 3
  Estimated cost: $0.01
  Savings ledger: exact 800 tokens / $0.0010, estimated 1500 tokens / $0.0075
```

When a run trips a guard or needs escalation, render a shareable incident report:

```bash
agentguard incident traces.jsonl
agentguard incident traces.jsonl --format html > incident.html
```

The incident report summarizes guard triggers, exact-vs-estimated savings, and
the dashboard upgrade path for retained alerts, team visibility, and remote kill
signal management.

## Decision Tracing

Capture agent proposals, human edits, overrides, approvals, and binding
outcomes through the normal AgentGuard event path.

```python
from agentguard import JsonlFileSink, Tracer, decision_flow

tracer = Tracer(
    sink=JsonlFileSink(".agentguard/traces.jsonl"),
    service="approval-flow",
)

with tracer.trace("agent.run") as run:
    with decision_flow(
        run,
        workflow_id="deploy-approval",
        object_type="deployment",
        object_id="deploy-042",
        actor_type="agent",
        actor_id="release-bot",
    ) as decision:
        decision.proposed({"action": "deploy", "environment": "staging"})
        decision.edited(
            {"action": "deploy", "environment": "production"},
            actor_type="human",
            actor_id="reviewer-123",
            reason="Customer approved direct rollout",
        )
        decision.approved(actor_type="human", actor_id="reviewer-123")
        decision.bound(
            actor_type="system",
            actor_id="deploy-api",
            binding_state="applied",
            outcome="success",
        )
```

Every decision event includes a stable schema in `event.data`:

- `decision_id`
- `workflow_id`
- `trace_id`
- `object_type`
- `object_id`
- `actor_type`
- `actor_id`
- `event_type`
- `proposal`
- `final`
- `diff`
- `reason`
- `comment`
- `timestamp`
- `binding_state`
- `outcome`

Default `binding_state` values are dashboard-parseable strings:
`proposed`, `edited`, `overridden`, and `approved`. `decision.bound` requires
the caller to provide the binding state, such as `applied`, `merged`, or
`failed`.

Guide: [`docs/guides/decision-tracing.md`](https://github.com/bmdhodl/agent47/blob/main/docs/guides/decision-tracing.md)

For local JSONL traces, you can extract the normalized decision events without
writing your own parser:

```bash
agentguard decisions .agentguard/traces.jsonl
agentguard decisions .agentguard/traces.jsonl --workflow-id deploy-approval --json
```

For retained traces exposed through MCP, use the `get_trace_decisions` tool to
pull the same normalized decision payloads from a hosted trace by `trace_id`.

## Evaluation

Assert properties of your traces in tests or CI.

```python
from agentguard import EvalSuite

result = (
    EvalSuite("traces.jsonl")
    .assert_no_loops()
    .assert_budget_under(tokens=50_000)
    .assert_completes_within(seconds=30)
    .assert_total_events_under(500)
    .assert_no_budget_exceeded()
    .assert_no_errors()
    .run()
)
```

```bash
agentguard eval traces.jsonl --ci   # exits non-zero on failure
```

## CI Cost Gates

Fail your CI pipeline if an agent run exceeds a cost budget. No competitor offers this.

```yaml
# .github/workflows/cost-gate.yml (simplified)
- name: Run agent with budget guard
  run: |
    python3 -c "
    from agentguard import Tracer, BudgetGuard, JsonlFileSink
    tracer = Tracer(
        sink=JsonlFileSink('ci_traces.jsonl'),
        guards=[BudgetGuard(max_cost_usd=5.00)],
    )
    # ... your agent run here ...
    "

- name: Evaluate traces
  uses: bmdhodl/agent47/.github/actions/agentguard-eval@main
  with:
    trace-file: ci_traces.jsonl
    assertions: "no_errors,max_cost:5.00"
```

Full workflow: [`docs/ci/cost-gate-workflow.yml`](https://github.com/bmdhodl/agent47/blob/v1.2.10/docs/ci/cost-gate-workflow.yml)

## Incident Reports

Turn a trace into a postmortem-style incident summary:

```bash
agentguard incident traces.jsonl --format markdown
agentguard incident traces.jsonl --format html > incident.html
```

Use this when a run hits `guard.budget_warning`, `guard.budget_exceeded`,
`guard.loop_detected`, or a fatal error. AgentGuard will summarize the run,
separate exact and estimated savings, and suggest the next control-plane step.

## Async Support

Full async API mirrors the sync API.

```python
from agentguard import AsyncTracer, BudgetGuard, patch_openai_async

tracer = AsyncTracer(guards=[BudgetGuard(max_cost_usd=5.00)])
patch_openai_async(tracer)

# All async OpenAI calls are now tracked and budget-enforced
```

## Optional Hosted Dashboard

For teams that need retained history, alerts, team visibility, and hosted
decision history, the SDK can mirror traces to the hosted dashboard:

```python
from agentguard import Tracer, HttpSink, BudgetGuard

tracer = Tracer(
    sink=HttpSink(
        url="https://app.agentguard47.com/api/ingest",
        api_key="ag_...",
        batch_size=20,
        flush_interval=10.0,
        compress=True,
    ),
    guards=[BudgetGuard(max_cost_usd=50.00)],
    metadata={"env": "prod"},
    sampling_rate=0.1,  # 10% of traces
)
```

Keep the first integration local. Add `HttpSink` only when you need retained
incidents, alerts, or hosted follow-through. `HttpSink` does not poll or execute
dashboard remote kill signals by itself; local guards remain the authoritative
runtime stop path.

Hosted contract details: [`docs/guides/dashboard-contract.md`](https://github.com/bmdhodl/agent47/blob/main/docs/guides/dashboard-contract.md)

## Architecture

```
Your Agent Code
    │
    ▼
┌─────────────────────────────────────┐
│           Tracer / AsyncTracer       │  ← trace(), span(), event()
│  ┌───────────┐  ┌────────────────┐  │
│  │  Guards    │  │  CostTracker   │  │  ← runtime intervention
│  └───────────┘  └────────────────┘  │
└──────────┬──────────────────────────┘
           │ emit(event)
    ┌──────┼──────────┬───────────┐
    ▼      ▼          ▼           ▼
 JsonlFile  HttpSink  OtelTrace  Stdout
  Sink      (gzip,    Sink       Sink
            retry)
```

## What's in this repo

| Directory | Description | License |
|-----------|-------------|---------|
| `sdk/` | Python SDK — guards, tracing, evaluation, integrations | MIT |
| `mcp-server/` | Read-only MCP surface for traces, alerts, usage, costs, and budget health | MIT |
| `site/` | Landing page | MIT |

> Dashboard is in a separate private repo ([agent47-dashboard](https://github.com/bmdhodl/agent47-dashboard)).

## Security

- **Zero runtime dependencies** — one package, nothing to audit, no supply chain risk
- **[OpenSSF Scorecard](https://scorecard.dev/viewer/?uri=github.com/bmdhodl/agent47)** — automated security analysis on every push
- **CodeQL scanning** — GitHub's semantic code analysis on every PR
- **Bandit security linting** — Python-specific security checks in CI

## Contributing

See [CONTRIBUTING.md](https://github.com/bmdhodl/agent47/blob/v1.2.10/CONTRIBUTING.md) for dev setup, test commands, and PR guidelines.

## Commercial Support

Need help rolling out coding-agent safety in production? BMD Pat LLC offers:

- **$500 Async Azure Audit** -- cost, reliability, and governance review. No meetings. Results in 5 business days.
- **Custom agent guardrails** -- production-grade cost controls, compliance tooling, kill switches.

[Start a project](https://bmdpat.com/start) | [See the research](https://bmdpat.com/research)

## License

MIT (BMD PAT LLC)

## Latest Release Notes (1.2.10)

### Activation Proof Path
- Tightened the README and getting-started path around `doctor`, `demo`, and `quickstart` so first-time SDK users can reach local guard proof faster.
- Added a coding-agent review-loop proof artifact that shows budget and retry guards stopping a simulated review/refinement loop without API keys or network calls.
- Added sync coverage for the public sample incident and generated PyPI README so release-facing activation assets do not silently drift.

### Release And Distribution Hygiene
- Added an opt-in activation metrics design doc that defines allowed activation questions and local-first consent boundaries without adding telemetry.
- Hardened release discussion category handling so missing GitHub Discussion categories do not block the package release path.
- Updated the package build timestamp seed to the ZIP-safe reproducible epoch so local and CI release builds do not fail on pre-1980 metadata.
- Clarified hosted ingest language in incident reporting so `HttpSink` is described as event mirroring for retained alerts and follow-up, not a remote kill switch by itself.

Full changelog: [CHANGELOG.md](https://github.com/bmdhodl/agent47/blob/v1.2.10/CHANGELOG.md)
