Metadata-Version: 2.3
Name: edge-agent
Version: 0.4.0
Summary: Small stdlib-only agent framework with tools, MCP, chains, and modern agent types
Author: danieldagot
Author-email: danieldagot <daniel.willnotreplay@gmail.com>
License: MIT
Requires-Dist: pytest>=8.0 ; extra == 'dev'
Requires-Python: >=3.11
Provides-Extra: dev
Description-Content-Type: text/markdown

# edge-agent

**Edge Agent** is a small, modern AI agent framework for Python 3.11+: tools, multi-step tool chaining, MCP servers, typed agent pipelines (guardrails, routers, evaluators, fallbacks), interactive sessions, and a swappable LLM provider — implemented in a **compact, readable codebase** with **zero runtime dependencies** (stdlib only).

Many popular agent stacks make simple workflows feel heavyweight: large dependency trees, lots of boilerplate, and framework surface area that dwarfs the problem you are solving. Edge Agent is built for the opposite: **full-featured agent behavior without the bloat**, so you can ship agents that stay easy to read, test, and own.

> **Built-in providers: Gemini, Ollama, and Amazon Bedrock.** Edge Agent ships with providers for Google's Gemini models, locally running Ollama instances, and Amazon Bedrock. The provider interface is open for extension — see [Custom Providers](#custom-providers) to add your own.

## Features

- **Zero runtime dependencies** — uses only the Python standard library
- **Tool support** — define tools with a simple `@tool` decorator
- **Tool chaining** — the agent loop lets the LLM call tools in sequence automatically
- **Per-agent toolsets** — each agent gets its own set of tools, keeping LLM context focused
- **Structured output** — set `output_type` to a dataclass and get typed responses back
- **Template variables** — use `{{currentDate}}`, `{{url:...}}`, or custom placeholders in instructions
- **File-based prompts** — pass a `Path` as instructions to load prompts from disk
- **MCP support** — connect to any MCP server and use its tools natively
- **Chain orchestration** — compose agents into pipelines with built-in control flow
- **Five agent types** — Agent, Guardrail, Router, Evaluator, and Fallback
- **Interactive sessions** — REPL mode with conversation history
- **Provider abstraction** — swap LLM backends without changing agent code
- **Execution tracing** — every `run()` returns a `RunResult` with tool call records, timing, and per-agent steps
- **Built-in logging** — stdlib `logging` under the `edge_agent` namespace

## Installation

```bash
uv add edge-agent
```

Or install from source:

```bash
uv pip install -e .
```

### Package size

Edge Agent has **no runtime dependencies**, so installing it only adds its own source code. There is nothing else to download or resolve.

| Artifact | Size |
|---|---|
| Wheel (`.whl`) — what gets installed | **38 KB** |
| Source distribution (`.tar.gz`) | **41 KB** |

That's **0.038 MB** for the installable wheel — the entire footprint.

A local checkout may show a larger footprint on disk (for example ~200 KB) because of `__pycache__` and compiled `.pyc` files; those are not part of the published wheel. To see the exact distribution size, build the wheel (`uv build`) and check the `.whl` in `dist/`.

## Quick Start

```python
from edge_agent import Agent, tool

@tool
def search(query: str) -> str:
    """Search the web for information."""
    return f"Results for: {query}"

agent = Agent(
    instructions="You research topics and provide summaries.",
    tools=[search],
)

result = agent.run("Research quantum computing")
print(result)          # prints the final text (via __str__)
print(result.output)   # the final text response
print(result.steps)    # execution trace with tool call details
```

`Agent.run()` returns a `RunResult` — see [Execution Tracing](#execution-tracing) for details.

**Note:** Edge Agent ships with **Gemini** and **Ollama** providers. For Gemini, you need a [Google AI API key](https://aistudio.google.com/apikey) — resolved automatically from `GEMINI_API_KEY` or `GOOGLE_API_KEY` environment variables (`.env` files are loaded automatically). See [`.env.example`](.env.example) for the required format. For Ollama, see [Ollama provider](#ollama).

## Defining Tools

Use the `@tool` decorator on any typed function. The decorator inspects the function signature and docstring to build the JSON schema automatically:

```python
from edge_agent import tool

@tool
def get_weather(city: str, unit: str = "celsius") -> str:
    """Get the current weather for a city."""
    return f"22 degrees {unit} in {city}"
```

The decorated function remains callable as normal:

```python
get_weather(city="Tokyo")  # "22 degrees celsius in Tokyo"
```

## Tool Chaining

Tool chaining happens automatically. When the LLM calls a tool, the result is fed back into the conversation. The LLM can then call another tool (or the same tool with different arguments) before producing a final text response. A `max_turns` parameter prevents infinite loops:

```python
result = agent.run("Research quantum computing", max_turns=5)
```

## Structured Output

Set `output_type` to a dataclass and the result includes a parsed instance in `result.parsed`. The JSON schema is derived automatically from the dataclass fields, sent to the LLM via Gemini's `responseSchema`, and the JSON response is parsed back into a dataclass instance.

Tools and structured output work together — the agent calls tools as normal, and the final response is parsed into the dataclass:

```python
from dataclasses import dataclass
from edge_agent import Agent, tool

@tool
def get_city_data(city: str) -> str:
    """Look up factual data about a city."""
    return "Tokyo is the capital of Japan. Population: ~14 million."

@dataclass
class CityInfo:
    name: str
    country: str
    population_millions: float
    famous_for: str

agent = Agent(
    instructions="Use the tool to look up facts, then return structured data.",
    tools=[get_city_data],
    output_type=CityInfo,
)

result = agent.run("Tell me about Tokyo.")
print(result.parsed.name)                 # "Tokyo"
print(result.parsed.population_millions)  # 14.0
print(result.output)                      # raw JSON string
```

`output_type` can also be passed per-call via `run(output_type=...)` to override the class-level default:

```python
agent = Agent(instructions="Geography expert.")
result = agent.run("Tell me about Paris.", output_type=CityInfo)
print(result.parsed.name)  # "Paris"
```

Nested dataclasses, `list[...]` fields, and optional fields (with defaults) are all supported.

## Template Variables

Use `{{...}}` placeholders in instructions that get substituted at run time. Pass custom variables via `template_vars`:

```python
agent = Agent(
    instructions=(
        "Today is {{currentDate}}. The user's name is {{userName}}. "
        "Greet them and mention today's date."
    ),
)
result = agent.run("Hi!", template_vars={"userName": "Alice"})
```

Built-in variables:

| Variable | Value |
|---|---|
| `{{currentDate}}` | Today's date in ISO-8601 format (e.g. `2026-03-31`) |
| `{{url:https://...}}` | Fetched URL body (decoded as UTF-8) |

Unknown placeholders are left unchanged, so you can mix built-in and custom variables freely.

## File-Based Prompts

Pass a `pathlib.Path` as instructions to load prompts from a file on disk. The file is read once at agent construction time:

```python
from pathlib import Path
from edge_agent import Agent

agent = Agent(instructions=Path("prompts/assistant.md"))
```

This keeps long prompts out of your Python code and makes them easy to version-control separately.

## MCP (Model Context Protocol)

Connect to any [MCP server](https://modelcontextprotocol.io/) and use its tools as if they were local `@tool` functions. No extra dependencies — Edge Agent implements the MCP client using only `subprocess` and `json` from the standard library.

### Basic usage

```python
from edge_agent import Agent, MCPServer

server = MCPServer(
    "filesystem",
    command=["npx", "-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
)

with server:
    print(server.tools)  # tools discovered from the server

    with Agent(
        instructions="You are a helpful file assistant.",
        mcp_servers=[server],
    ) as agent:
        result = agent.run("List the files in /tmp")
        print(result)
```

### How it works

1. `MCPServer.connect()` launches the server as a subprocess
2. Performs the MCP handshake (JSON-RPC 2.0 over stdio)
3. Calls `tools/list` to discover available tools
4. Each tool becomes a regular `Tool` object — the agent loop treats it identically to a `@tool`-decorated function
5. When the LLM calls an MCP tool, the call is proxied to the server via `tools/call`

### MCPServer parameters

| Parameter | Default | Description |
|---|---|---|
| `name` | required | A label for this server (used in logs) |
| `command` | required | The command to launch the server (e.g. `["npx", "-y", "some-mcp-server"]`) |
| `env` | `None` | Extra environment variables to pass to the server process |

### Mixing local and MCP tools

MCP tools are merged with local `@tool` functions. Each tool name must be unique — duplicates raise `ValueError`:

```python
from edge_agent import Agent, MCPServer, tool

@tool
def summarize(text: str) -> str:
    """Summarize text in one sentence."""
    return f"Summary: {text[:80]}..."

server = MCPServer("fs", command=["npx", "-y", "@modelcontextprotocol/server-filesystem", "/tmp"])

with Agent(
    instructions="Read files and summarize them.",
    tools=[summarize],
    mcp_servers=[server],
) as agent:
    result = agent.run("Read /tmp/notes.txt and summarize it")
```

### Loading servers from a config file

`load_mcp_config()` reads a JSON file that defines multiple MCP servers (the same format used by Claude Desktop) and returns a `dict[str, MCPServer]`. You can load all servers at once or pick specific ones by name:

```python
from edge_agent import Agent, load_mcp_config

# mcp.json
# {
#   "mcpServers": {
#     "filesystem": {
#       "command": "npx",
#       "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
#     },
#     "brave-search": {
#       "command": "npx",
#       "args": ["-y", "@modelcontextprotocol/server-brave-search"],
#       "env": { "BRAVE_API_KEY": "sk-..." }
#     }
#   }
# }

# Load every server defined in the file
all_servers = load_mcp_config("mcp.json")

# Load only the servers you need
selected = load_mcp_config("mcp.json", servers=["filesystem"])
fs_server = selected["filesystem"]

with fs_server:
    with Agent(
        instructions="You are a helpful file assistant.",
        mcp_servers=[fs_server],
    ) as agent:
        result = agent.run("List the files in /tmp")
        print(result)
```

`load_mcp_config` parameters:

| Parameter | Default | Description |
|---|---|---|
| `path` | required | Path to the JSON config file (`str` or `pathlib.Path`) |
| `servers` | `None` | List of server names to load; `None` loads all servers |

The function returns unconnected `MCPServer` instances — connection happens when you enter the context manager or call `.connect()`. A `ValueError` is raised if you request a server name that doesn't exist in the config.

### Lifecycle

`MCPServer` supports the context manager protocol. You can also manage the lifecycle manually:

```python
server = MCPServer("my-server", command=["my-mcp-server"])
server.connect()       # launch process + handshake
print(server.tools)    # use tools
server.close()         # terminate process
```

When passed to an `Agent` via `mcp_servers`, the agent auto-connects any servers that aren't already connected. Use `agent.close()` (or the `with Agent(...) as agent:` pattern) to clean up.

## Chain

A `Chain` runs multiple agents sequentially. Each agent in the chain can have its own tools, instructions, and role. The key benefit: **each agent only sends its own tools to the LLM**, keeping the context focused and avoiding the problem of sending every tool on every API call.

```python
from edge_agent import Agent, Chain

chain = Chain(agents=[agent_a, agent_b, agent_c])
result = chain.run("user message")
```

### How Chain works

1. The chain iterates through its agents in order
2. Each agent processes the input and produces output
3. The agent's `agent_type` determines what happens next — the chain injects special control-flow tools and reacts to the agent's decisions
4. By default (`pass_original=True`), every agent receives the original user message. Set `pass_original=False` to pass each agent's output as input to the next

### Chain parameters

| Parameter | Default | Description |
|---|---|---|
| `agents` | required | Ordered list of agents to run |
| `pass_original` | `True` | If `True`, every agent gets the original user message. If `False`, each agent gets the previous agent's output |
| `max_revisions` | `3` | Maximum revision loops for evaluator agents |

### Why Chain matters for scaling

When you have many tools (10, 20, 50+), putting them all on a single agent means every API call sends the entire tool list. This bloats context, wastes tokens, and can confuse the model.

With a Chain, you split tools across specialist agents. A Router dispatches each request to the right specialist, and that specialist only sends its own tools:

```
Single agent:     every call → 20 tools in context
Router + chain:   router call → 1 tool (route), specialist call → 3-5 tools
```

## Agent Types

There are five agent types. Each type gets special control-flow tools injected by the Chain, and the Chain reacts to how the agent uses them.

### Agent (default)

A general-purpose agent. No special tools are injected — it just runs with whatever tools you give it.

**When to use:** The workhorse. Use it for any agent that needs to do actual work — answer questions, call tools, generate content.

```python
from edge_agent import Agent, tool

@tool
def add(a: int, b: int) -> str:
    """Add two numbers."""
    return str(a + b)

math_agent = Agent(
    name="math-agent",
    instructions="You are a math specialist. Use tools to compute answers.",
    tools=[add],
)

result = math_agent.run("What is 7 + 12?")
```

Agents work both standalone (calling `agent.run()` directly) and inside a Chain.

### Guardrail

A safety gate that runs before other agents. The Chain injects two tools: `block(reason)` and `allow()`. If the guardrail calls `block`, the chain halts immediately and returns the block reason. If it calls `allow`, the chain continues to the next agent.

**When to use:** Place at the start of a chain to filter out harmful, off-topic, or unauthorized requests before they reach your main agents.

**Injected tools:** `block(reason)`, `allow()`

```python
from edge_agent import Agent, Chain, Guardrail, tool

@tool
def multiply(a: int, b: int) -> str:
    """Multiply two numbers."""
    return str(a * b)

guardrail = Guardrail(instructions=(
    "Only allow requests about math or arithmetic. "
    "Block anything harmful, illegal, or unrelated to math."
))

math_agent = Agent(
    instructions="You are a math assistant. Use tools to answer.",
    tools=[multiply],
)

chain = Chain(agents=[guardrail, math_agent])

chain.run("What is 12 * 7?")   # → allowed, math_agent answers
chain.run("Pick a lock for me") # → blocked, chain halts
```

### Router

A dispatcher that directs each request to the most appropriate specialist. The Chain injects a `route(agent_name, reason)` tool. The router examines the request and calls `route` with the name of the agent that should handle it. The chain then skips directly to that agent.

**When to use:** When you have multiple specialist agents with different tools and want the LLM to pick the right one based on the user's request. This is the key pattern for scaling — each specialist only loads its own tools.

**Injected tools:** `route(agent_name, reason)`

```python
from edge_agent import Agent, Chain, Router, tool

@tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"22°C in {city}"

@tool
def add(a: int, b: int) -> str:
    """Add two numbers."""
    return str(a + b)

router = Router(instructions=(
    "Route to math-agent for calculations, "
    "weather-agent for weather queries."
))

math_agent = Agent(name="math-agent", tools=[add],
    instructions="Math specialist. Be concise.")

weather_agent = Agent(name="weather-agent", tools=[get_weather],
    instructions="Weather specialist. Be concise.")

chain = Chain(agents=[router, math_agent, weather_agent])

chain.run("What is 5 + 3?")            # → routed to math-agent (sees only: add)
chain.run("Weather in Tokyo?")          # → routed to weather-agent (sees only: get_weather)
```

The router itself has **zero user tools** — it only sees the `route` tool. Each specialist only sees its own tools. No agent is overwhelmed with the full tool list.

### Evaluator

A quality reviewer that checks the previous agent's output and can request revisions. The Chain injects `approve()` and `revise(feedback)`. If the evaluator calls `revise`, the chain loops back to the previous agent with the feedback appended, up to `max_revisions` times.

**When to use:** After a content-generating agent when you need quality control. The evaluator-writer loop refines output iteratively — useful for copywriting, code generation, report writing, or anything where a first draft might not be good enough.

**Injected tools:** `approve()`, `revise(feedback)`

```python
from edge_agent import Agent, Chain, Evaluator

writer = Agent(
    name="writer",
    instructions="Write a punchy product tagline. Just the tagline, nothing else.",
)

editor = Evaluator(
    name="editor",
    instructions=(
        "Review the tagline for clarity, impact, and brevity (under 10 words). "
        "Approve if it meets all criteria, otherwise request a revision."
    ),
)

chain = Chain(agents=[writer, editor], max_revisions=2)

chain.run("Wireless noise-cancelling headphones")
# writer drafts → editor reviews → (revise?) → writer redrafts → editor approves
```

### Fallback

An agent that can signal it cannot handle a request, so the chain moves on to the next agent. The Chain injects a `fail(reason)` tool. If the agent handles the request successfully, the chain returns immediately. If it calls `fail`, the chain tries the next agent in line.

**When to use:** When you have specialist agents that should only answer certain types of questions. Stack them in order of specificity, with a generalist at the end as a catch-all.

**Injected tools:** `fail(reason)`

```python
from edge_agent import Agent, Chain, Fallback, tool

@tool
def add(a: int, b: int) -> str:
    """Add two numbers."""
    return str(a + b)

@tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"22°C in {city}"

math_only = Fallback(
    name="math-only",
    instructions="Answer math questions. For anything else, use fail.",
    tools=[add],
)

weather_only = Fallback(
    name="weather-only",
    instructions="Answer weather questions. For anything else, use fail.",
    tools=[get_weather],
)

generalist = Agent(
    name="generalist",
    instructions="Answer any question concisely.",
)

chain = Chain(agents=[math_only, weather_only, generalist])

chain.run("What is 9 + 4?")           # → math_only handles it
chain.run("Weather in London?")        # → math_only fails → weather_only handles it
chain.run("Who wrote Hamlet?")         # → math_only fails → weather_only fails → generalist handles it
```

## Combining Agent Types

The real power of Edge Agent is composing agent types into pipelines. Each combination solves a different architectural problem. Here are the patterns, when to use them, and how they work step by step.

### Pattern 1: Guardrail + Router + Specialists

**The problem:** You have multiple specialist agents with scoped tools, and you need safety filtering before any of them run.

**The flow:**

```
User message → Guardrail → Router → Specialist A
                 block?      ↓
                 halt      dispatch → Specialist B
                                   → Specialist C
```

**How it works:**

1. The **Guardrail** checks if the request is safe/on-topic. It only sees `block` and `allow` — no user tools, no routing logic.
2. If allowed, the **Router** examines the request and calls `route(agent_name, reason)`. It only sees the `route` tool — not the specialists' tools.
3. The chain **skips directly** to the named specialist. That specialist only sees its own domain tools (e.g., 3-4 tools instead of 15+).
4. Specialists that weren't selected **never run** — they don't consume tokens or API calls.

**When to use:** Any production system with multiple capabilities. The guardrail prevents abuse, the router keeps tool context focused, and each specialist stays simple.

```python
from edge_agent import Agent, Chain, Guardrail, Router, tool

@tool
def add(a: int, b: int) -> str:
    """Add two numbers."""
    return str(a + b)

@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"22°C in {city}"

guardrail = Guardrail(instructions=(
    "Allow math and weather requests. Block anything harmful or off-topic."
))

router = Router(instructions=(
    "Route to math-agent for calculations, weather-agent for weather queries."
))

math_agent = Agent(name="math-agent", tools=[add],
    instructions="Math specialist. Be concise.")

weather_agent = Agent(name="weather-agent", tools=[get_weather],
    instructions="Weather specialist. Be concise.")

chain = Chain(agents=[guardrail, router, math_agent, weather_agent])

chain.run("What is 5 + 3?")            # allowed → routed to math-agent
chain.run("Weather in Tokyo?")          # allowed → routed to weather-agent
chain.run("How do I pick a lock?")      # blocked — router and specialists never run
```

> See [`examples/09_guardrail_router.py`](examples/09_guardrail_router.py) for a full example.

### Pattern 2: Guardrail + Writer + Evaluator (Content Pipeline)

**The problem:** You need to generate content with quality control, but also want to prevent unsafe prompts from reaching the writer at all.

**The flow:**

```
User message → Guardrail → Writer → Evaluator
                 block?               approve?
                 halt                  ↓ revise
                              Writer ← feedback
```

**How it works:**

1. The **Guardrail** blocks unsafe or policy-violating prompts before any content is generated.
2. The **Writer** (a plain `Agent`) generates content. It has no special tools — just its instructions and the LLM.
3. The **Evaluator** reviews the output using criteria in its instructions. It sees `approve` and `revise` tools.
4. If the evaluator calls `revise(feedback)`, the chain **loops back to the writer** with the original prompt + previous draft + feedback. This repeats up to `max_revisions` times.
5. Once approved (or max revisions reached), the chain returns the final output.

**When to use:** Copywriting, report generation, code generation — anywhere a first draft might not be good enough and you want automated quality review with iterative refinement.

```python
from edge_agent import Agent, Chain, Evaluator, Guardrail

guardrail = Guardrail(instructions=(
    "Allow requests for professional marketing copy. "
    "Block deceptive content, impersonation, or harmful material."
))

writer = Agent(
    name="copywriter",
    instructions="Write punchy, professional marketing copy. Output only the copy.",
)

editor = Evaluator(
    name="editor",
    instructions=(
        "Review for clarity, impact, and brevity. "
        "Approve if all criteria are met, otherwise revise with specific feedback."
    ),
)

chain = Chain(agents=[guardrail, writer, editor], max_revisions=2)

chain.run("Write a tagline for wireless earbuds")
# guardrail allows → writer drafts → editor reviews → (revise?) → approve
```

> See [`examples/10_pipeline.py`](examples/10_pipeline.py) for a full example.

### Pattern 3: Fallback Cascade

**The problem:** You have several specialist agents, and you don't know upfront which one can handle the request. You want them to try in order, with a generalist catch-all at the end.

**The flow:**

```
User message → Specialist A → Specialist B → ... → Generalist
                 fail?          fail?                (always answers)
                 try next       try next
```

**How it works:**

1. Each `Fallback` agent examines the request. If it's in their domain, they answer normally and the chain **returns immediately**.
2. If it's outside their domain, they call `fail(reason)` and the chain **moves to the next agent**.
3. The last agent is typically a plain `Agent` (no `fail` tool) that catches everything the specialists couldn't handle.

**When to use:** When routing logic is too complex for a single Router (e.g., domain boundaries are fuzzy), or when you want each specialist to self-assess rather than relying on a central dispatcher.

**Router vs. Fallback — how to choose:**

| | Router | Fallback cascade |
|---|---|---|
| **Decision maker** | One router decides for all | Each agent decides for itself |
| **API calls** | 1 (router) + 1 (specialist) = 2 | 1 per agent tried (worst case: all of them) |
| **Best when** | Domains are clearly distinct | Domains overlap or are hard to describe upfront |
| **Scaling** | Cheap — always 2 calls | Expensive if many specialists fail before one handles it |

```python
from edge_agent import Agent, Chain, Fallback, tool

@tool
def search_docs(query: str) -> str:
    """Search internal documentation."""
    return f"Doc result for: {query}"

@tool
def query_db(sql: str) -> str:
    """Run a database query."""
    return f"DB result for: {sql}"

docs_agent = Fallback(
    name="docs-agent",
    instructions="Answer documentation questions. Fail for anything else.",
    tools=[search_docs],
)

db_agent = Fallback(
    name="db-agent",
    instructions="Answer database questions. Fail for anything else.",
    tools=[query_db],
)

generalist = Agent(
    name="generalist",
    instructions="Answer any question as best you can.",
)

chain = Chain(agents=[docs_agent, db_agent, generalist])
```

### Pattern 4: Full Pipeline (All Types Combined)

**The problem:** You need everything — safety, intelligent routing, specialist tools, graceful degradation, and quality review — in one pipeline.

**The flow:**

```
User message → Guardrail → Router → Specialist (Fallback) → Evaluator
                 block?      ↓          fail?                  revise?
                 halt      dispatch     try next              loop back
```

**How it works:**

1. **Guardrail** blocks unsafe requests at the gate — nothing else runs.
2. **Router** dispatches to the right specialist based on the request content.
3. **Specialists** (as `Fallback` agents) attempt to handle the request. If one can't, the chain tries the next until one succeeds or the chain reaches a generalist.
4. **Evaluator** at the end reviews whatever output was produced and can request revisions.

**When to use:** Complex production systems where you need every layer of control. This is the most expensive pattern (more API calls), so use it when quality and safety are critical.

```python
from edge_agent import Agent, Chain, Evaluator, Fallback, Guardrail, Router, tool

@tool
def add(a: int, b: int) -> str:
    """Add two numbers."""
    return str(a + b)

@tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"22°C in {city}"

@tool
def search_docs(query: str) -> str:
    """Search documentation."""
    return f"Doc: {query}"

guardrail = Guardrail(
    name="safety",
    instructions="Allow math, weather, and support questions. Block harmful requests.",
)

router = Router(
    name="dispatcher",
    instructions=(
        "Route to: math-specialist for calculations, "
        "weather-specialist for weather, support for product questions."
    ),
)

math_specialist = Fallback(
    name="math-specialist",
    instructions="Handle math questions. Fail for anything else.",
    tools=[add],
)

weather_specialist = Fallback(
    name="weather-specialist",
    instructions="Handle weather questions. Fail for anything else.",
    tools=[get_weather],
)

support = Agent(
    name="support",
    instructions="Answer product and policy questions.",
    tools=[search_docs],
)

quality = Evaluator(
    name="quality",
    instructions="Review for accuracy and clarity. Approve or revise.",
)

chain = Chain(
    agents=[guardrail, router, math_specialist, weather_specialist, support, quality],
    max_revisions=1,
)
```

> See [`examples/11_full_chain.py`](examples/11_full_chain.py) for a full example.

### Choosing the Right Pattern

| Pattern | Agents used | API calls per request | Best for |
|---|---|---|---|
| **Single agent** | Agent | 1+ (tool chaining) | Simple tasks, few tools |
| **Guardrail + Agent** | Guardrail → Agent | 2+ | Safety-gated single capability |
| **Router + Specialists** | Router → Agent(s) | 2+ | Multiple capabilities, clear domains |
| **Guardrail + Router** | Guardrail → Router → Agent(s) | 3+ | Safety + multiple capabilities |
| **Writer + Evaluator** | Agent → Evaluator | 2+ (with revision loops) | Content generation with quality control |
| **Content Pipeline** | Guardrail → Agent → Evaluator | 3+ | Safety + generation + quality |
| **Fallback Cascade** | Fallback(s) → Agent | 1-N | Fuzzy domains, self-assessing specialists |
| **Full Pipeline** | All types | 4+ | Production systems needing every layer |

## Agent Type Reference

| Type | Class | Injected tools | Chain behavior |
|---|---|---|---|
| `"agent"` | `Agent` | none | Runs normally |
| `"guardrail"` | `Guardrail` | `block(reason)`, `allow()` | `block` halts the chain |
| `"router"` | `Router` | `route(agent_name, reason)` | Runs the named agent, returns its result |
| `"evaluator"` | `Evaluator` | `approve()`, `revise(feedback)` | `revise` loops back to previous agent |
| `"fallback"` | `Fallback` | `fail(reason)` | `fail` skips to the next agent |

## Execution Tracing

Every call to `Agent.run()` or `Chain.run()` returns a `RunResult` with full execution traces — what tools were called, with what arguments, what they returned, and how long each took.

### RunResult structure

| Field | Type | Description |
|---|---|---|
| `output` | `str` | The final text response |
| `parsed` | `Any \| None` | The parsed dataclass when `output_type` is used |
| `steps` | `list[AgentStep]` | One entry per agent that ran |

`str(result)` returns `result.output`, so `print(result)` and f-strings work naturally.

### AgentStep fields

| Field | Type | Description |
|---|---|---|
| `agent_name` | `str` | Name of the agent |
| `agent_type` | `str` | `"agent"`, `"guardrail"`, `"router"`, etc. |
| `tools_used` | `list[ToolCallRecord]` | Every tool call made during this step |
| `output` | `str` | The agent's text output |
| `turns` | `int` | Number of LLM turns used |

### ToolCallRecord fields

| Field | Type | Description |
|---|---|---|
| `name` | `str` | Tool name |
| `arguments` | `dict` | Arguments passed to the tool |
| `result` | `str` | The tool's return value |
| `duration_ms` | `float` | Execution time in milliseconds |

### Single agent example

```python
from edge_agent import Agent, tool

@tool
def calculate(expression: str) -> str:
    """Evaluate a math expression."""
    return str(eval(expression))

agent = Agent(instructions="You are a calculator.", tools=[calculate])
result = agent.run("What is 2 + 2?")

print(result.output)                     # "The answer is 4."
print(result.steps[0].agent_name)        # "agent-1"
print(result.steps[0].tools_used[0].name)        # "calculate"
print(result.steps[0].tools_used[0].arguments)   # {"expression": "2 + 2"}
print(result.steps[0].tools_used[0].result)      # "4"
print(result.steps[0].tools_used[0].duration_ms) # 0.02
```

### Chain example

With a Chain, `result.steps` contains one entry per agent that ran, so you can trace the full pipeline:

```python
from edge_agent import Agent, Chain, Guardrail, Evaluator

guard = Guardrail(name="safety", instructions="Allow safe requests.")
writer = Agent(name="writer", instructions="Write copy.")
reviewer = Evaluator(name="reviewer", instructions="Review quality.")

chain = Chain(agents=[guard, writer, reviewer])
result = chain.run("Write a tagline for headphones")

for step in result.steps:
    print(f"{step.agent_name} ({step.agent_type}): {len(step.tools_used)} tool(s)")
# safety (guardrail): 1 tool(s)    — called allow()
# writer (agent): 0 tool(s)        — produced the tagline
# reviewer (evaluator): 1 tool(s)  — called approve()
```

## Session

Wrap any agent in a `Session` for an interactive terminal REPL with conversation history preserved across turns:

```python
from edge_agent import Agent, Session, tool

@tool
def search(query: str) -> str:
    """Search for information."""
    return f"Results for: {query}"

agent = Agent(instructions="You are a helpful assistant.", tools=[search])
session = Session(agent)
session.start()
```

## Providers

### Gemini (default)

```python
from edge_agent.providers import GeminiProvider

provider = GeminiProvider(
    model="gemini-3.1-flash-lite-preview",  # optional, has a sensible default
    api_key="your-key",                     # optional, resolved from env
    verify_ssl=False,                       # optional, default True — disable for local dev
)
```

You can also set the default model with the **`EDGE_AGENT_MODEL`** environment variable (legacy: **`TINYAGENT_MODEL`**).

TLS certificate verification is **enabled by default**. To disable it (e.g. behind a corporate proxy), pass `verify_ssl=False` or set `EDGE_AGENT_VERIFY_SSL=false` in your environment.

### Ollama

Run agents against models on your local machine via [Ollama](https://ollama.com/). No API key required.

```python
from edge_agent import Agent
from edge_agent.providers import OllamaProvider

provider = OllamaProvider(
    model="llama3.2",                         # optional, default "llama3.2"
    base_url="http://localhost:11434",         # optional, default localhost
    timeout=120,                              # optional, default 120s
)

agent = Agent(instructions="You are a helpful assistant.", provider=provider)
result = agent.run("Hello!")
print(result)
```

| Parameter | Default | Env var | Description |
|---|---|---|---|
| `model` | `"llama3.2"` | `OLLAMA_MODEL` | Ollama model name (e.g. `mistral`, `codellama`) |
| `base_url` | `"http://localhost:11434"` | `OLLAMA_HOST` | Ollama server URL |
| `timeout` | `120` | — | Request timeout in seconds |

All agent types (`Agent`, `Guardrail`, `Router`, `Evaluator`, `Fallback`), `Chain`, `Session`, structured output (`output_type`), and tool calling work with the Ollama provider. Use a model that supports function calling and structured output (e.g. `llama3.2`, `mistral`).

See [`examples/14_ollama.py`](examples/14_ollama.py) for a runnable local demo using `OllamaProvider` with tools.

### Bedrock

Run agents against Amazon Bedrock models using the Converse API. Authenticate with a **Bedrock API key** (Bearer token) — no `boto3` required.

**Setup:**

1. Open the [Amazon Bedrock console](https://console.aws.amazon.com/bedrock) → **API keys** → **Generate**
2. Export the key:

```bash
export AWS_BEARER_TOKEN_BEDROCK=<your-key>
# Optional: set model (default: us.anthropic.claude-sonnet-4-20250514-v1:0)
export BEDROCK_MODEL_ID=us.anthropic.claude-sonnet-4-20250514-v1:0
# Optional: set region (default: us-east-1)
export AWS_DEFAULT_REGION=us-east-1
```

**Usage — minimal (env vars):**

```python
from edge_agent import Agent
from edge_agent.providers import BedrockProvider

# API key, model, and region all resolved from env vars
provider = BedrockProvider()

agent = Agent(instructions="You are a helpful assistant.", provider=provider)
result = agent.run("Hello!")
print(result)
```

**Usage — explicit (all parameters inline):**

```python
from edge_agent import Agent
from edge_agent.providers import BedrockProvider

provider = BedrockProvider(
    api_key="your-key",                                      # or set AWS_BEARER_TOKEN_BEDROCK env var
    model_id="us.anthropic.claude-sonnet-4-20250514-v1:0",   # or set BEDROCK_MODEL_ID env var
    region_name="us-east-1",                                 # or set AWS_DEFAULT_REGION env var
)

agent = Agent(instructions="You are a helpful assistant.", provider=provider)
result = agent.run("Hello!")
print(result)
```

**Choosing a model:**

Pass any Bedrock model ID or cross-region inference profile ID as `model_id`:

```python
# Anthropic Claude
provider = BedrockProvider(model_id="us.anthropic.claude-sonnet-4-20250514-v1:0")

# Amazon Titan
provider = BedrockProvider(model_id="amazon.titan-text-lite-v1")

# Meta Llama
provider = BedrockProvider(model_id="meta.llama3-8b-instruct-v1:0")
```

| Parameter | Default | Env var | Description |
|---|---|---|---|
| `api_key` | — | `AWS_BEARER_TOKEN_BEDROCK` | Bedrock API key (Bearer token) |
| `model_id` | `"us.anthropic.claude-sonnet-4-20250514-v1:0"` | `BEDROCK_MODEL_ID` | Bedrock model ID or inference profile ID |
| `region_name` | `"us-east-1"` | `AWS_DEFAULT_REGION` | AWS region |
| `timeout` | `120` | — | Request timeout in seconds |
| `max_retries` | `3` | — | Retries on throttling (429) |
| `retry_backoff` | `2.0` | — | Backoff multiplier between retries |
| `inference_config` | `None` | — | Generic Converse params: `maxTokens`, `temperature`, `topP`, `stopSequences` |
| `additional_model_request_fields` | `None` | — | Model-family-specific params (e.g. `top_k` for Anthropic) |
| `supports_tool_use` | `True` | — | Set `False` if the model doesn't support tool calling |
| `supports_structured_output` | `False` | — | Set `True` only if the model natively supports JSON output |

**Inference config vs. additional model request fields:**

`inference_config` is for parameters common across all Bedrock models (temperature, max tokens). `additional_model_request_fields` is for parameters specific to a model family (e.g. Anthropic's `top_k`). Keep them separate — Bedrock rejects model-specific params in the generic config.

```python
provider = BedrockProvider(
    inference_config={"maxTokens": 1024, "temperature": 0.7},
    additional_model_request_fields={"top_k": 40},
)
```

**Structured output:**

Bedrock's Converse API does not have a native `responseSchema` parameter. When you use `output_type`, the provider injects a JSON-mode prompt into the system message and the agent loop parses the response. This works well with capable models (Claude, Llama, etc.) but is not as strict as Gemini's native schema enforcement.

**Capability flags:**

Not all Bedrock models support the same features. Use capability flags to avoid silent failures:

```python
# Model without tool support
provider = BedrockProvider(supports_tool_use=False)

# Model with native JSON output
provider = BedrockProvider(supports_structured_output=True)
```

If `supports_tool_use=False` and the agent has tools, the provider raises immediately with a clear error.

**Known limitations:**

- No streaming support (uses Converse, not ConverseStream)
- Structured output relies on prompt-based JSON mode, not native schema enforcement
- Text-only (no image/audio input or output)
- No prompt management resource support
- API key auth only (AWS Profile and Access Keys auth coming soon)

See [`examples/15_bedrock.py`](examples/15_bedrock.py) for a runnable demo with tool calling and structured output.

### Custom Providers

You can add support for any LLM by implementing the `Provider` abstract class:

```python
from edge_agent.providers import Provider
from edge_agent import Message, Tool

class MyProvider(Provider):
    def chat(
        self,
        messages: list[Message],
        tools: list[Tool] | None = None,
        output_schema: dict[str, object] | None = None,
    ) -> Message:
        # Make your API call and return a Message
        # output_schema is a JSON schema dict for structured output
        ...
```

## Logging

Logging is silent by default. Opt in by configuring the `edge_agent` logger:

```python
import logging

logging.basicConfig()
logging.getLogger("edge_agent").setLevel(logging.DEBUG)
```

## Examples

See the [`examples/`](examples/) directory:

| Example | What it demonstrates |
|---|---|
| `01_hello.py` | Minimal agent, no tools |
| `02_tools.py` | `@tool` decorator, multi-step tool chaining |
| `03_session.py` | Interactive REPL with conversation history |
| `04_guardrail.py` | Guardrail → worker chain |
| `05_router.py` | Router → specialist dispatch |
| `06_evaluator.py` | Writer → evaluator revision loop |
| `07_fallback.py` | Fallback cascade with generalist catch-all |
| `08_custom_provider.py` | Custom LLM provider implementation |
| `09_guardrail_router.py` | Guardrail + Router + specialists combined |
| `10_pipeline.py` | Content pipeline: Guardrail → Writer → Evaluator |
| `11_full_chain.py` | All agent types in one chain |
| `12_mcp.py` | MCP server connection and tool usage |
| `13_mcp_config.py` | Load MCP servers from a JSON config file |
| `14_ollama.py` | Explicit `OllamaProvider` demo with local tool calling |
| `15_bedrock.py` | Amazon Bedrock: API key auth, tool calling, structured output |
| `multi_tool_demo/` | Tools across multiple files, flat vs. router comparison |
| `mcp_demo/` | Load MCP servers from a `mcp.json` config file and run an agent against them |
| [`advanced_features_demo.py`](advanced_features_demo.py) | Template variables, file-based prompts, and structured output |

## Development

```bash
# Install with dev dependencies
uv sync --dev

# Run tests
uv run pytest tests/ -v
```
