===== `README.md` =====

# Multi-Agent Interface Layer (MAIL)

Single-swarm example | Multi-swarm example
:-------------------:|:-------------------:
![](/assets/mail.png)| ![](/assets/interswarm.png)

**MAIL** is an **open protocol** for letting autonomous agents communicate, coordinate, and cooperate across local runtimes and distributed swarms. This repository hosts both the normative specification and a production-grade **Python/FastAPI reference implementation** that demonstrates how to build interoperable agent systems on top of the MAIL contract.

---

## Quick Links
- **Protocol specification**: [spec/SPEC.md](/spec/SPEC.md)
- **JSON Schemas**: [spec/MAIL-core.schema.json](/spec/MAIL-core.schema.json), [spec/MAIL-interswarm.schema.json](/spec/MAIL-interswarm.schema.json)
- **REST transport** (OpenAPI 3.1): [spec/openapi.yaml](/spec/openapi.yaml)
- **Reference implementation source**: [src/mail/](/src/mail/__init__.py)
- **Command-line interface**: [docs/cli.md](/docs/cli.md), `uv run mail …`
- **Asynchronous HTTP client**: [docs/client.md](/docs/client.md), [src/mail/client.py](/src/mail/client.py)
- **Deployment examples and docs**: [docs/](/docs/README.md)

## 1. MAIL Protocol Overview

### Goals
- Provide a transport-agnostic **message contract** so agents from different vendors can interoperate.
- Encode **routing, addressing, and task lifecycle semantics** that work for single-swarm and cross-swarm topologies.
- Support reliable inter-swarm federation over **standard HTTP** infrastructure.
- Remain **minimal enough** to embed inside bespoke agent runtimes or platform orchestrators.

### Message Primitives
MAIL defines five core message types that all conforming systems MUST understand. Each payload is validated against `MAIL-core.schema.json`.

| `msg_type`           | Required payload fields                                                                 | Typical use case                                  |
|----------------------|-------------------------------------------------------------------------------------------|---------------------------------------------------|
| `request`            | `task_id`, `request_id`, `sender`, `recipient`, `subject`, `body`                        | Agent-to-agent task delegation                    |
| `response`           | `task_id`, `request_id`, `sender`, `recipient`, `subject`, `body`                        | Reply that correlates with a prior request        |
| `broadcast`          | `task_id`, `broadcast_id`, `sender`, `recipients[]`, `subject`, `body`                   | Notify many agents in a swarm                     |
| `interrupt`          | `task_id`, `interrupt_id`, `sender`, `recipients[]`, `subject`, `body`                   | High-priority stop/alter instructions             |
| `broadcast_complete` | `task_id`, `broadcast_id`, `sender`, `recipients[]`, `subject`, `body` (MAILBroadcast)   | Marks task completion by a supervisor agent       |

All messages are wrapped in a `MAILMessage` envelope with an `id` (UUID) and RFC 3339 timestamp. Optional fields such as `sender_swarm`, `recipient_swarm`, and `routing_info` carry federation metadata without altering the core contract.
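
For illustration, a `request` envelope might look like the following. The field names come from the table above, but the exact envelope layout is defined normatively in `MAIL-core.schema.json`; the nesting shown here (payload under a `message` key) is illustrative only:

```json
{
  "id": "4b8f0e2a-0c9d-4a61-9b2f-6a1d6f1c2e33",
  "timestamp": "2025-01-15T12:00:00Z",
  "msg_type": "request",
  "message": {
    "task_id": "task-123",
    "request_id": "req-456",
    "sender": "supervisor",
    "recipient": "weather",
    "subject": "Forecast",
    "body": "What is tomorrow's forecast for Berlin?"
  }
}
```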

### Addressing & Routing
- **Local agents** are addressed by name (`agent-name`).
- **Interswarm addresses** append the remote swarm (`agent-name@swarm-name`).
- **Routers** MUST wrap cross-swarm traffic in a `MAILInterswarmMessage` that includes source/target swarm identifiers and optional metadata.
- **Priority tiers** ensure urgent system and user messages preempt regular agent chatter. Within a tier, messages are FIFO by enqueue sequence.
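
The tier-then-FIFO ordering can be sketched with a heap keyed by `(tier, sequence)`. This is an illustrative scheduling sketch, not the reference implementation's actual scheduler, and the tier names and values are assumptions:

```python
import heapq
import itertools

# Lower number = higher priority tier (values here are illustrative).
PRIORITY = {"interrupt": 0, "user": 1, "agent": 2}

_seq = itertools.count()
_queue: list[tuple[int, int, dict]] = []

def enqueue(message: dict) -> None:
    """Push a message; the monotonic counter breaks ties FIFO within a tier."""
    tier = PRIORITY[message["kind"]]
    heapq.heappush(_queue, (tier, next(_seq), message))

def dequeue() -> dict:
    """Pop the highest-priority, oldest message."""
    return heapq.heappop(_queue)[2]

enqueue({"kind": "agent", "body": "a1"})
enqueue({"kind": "user", "body": "u1"})
enqueue({"kind": "agent", "body": "a2"})
```

Because the sequence number is part of the heap key, two messages in the same tier always dequeue in enqueue order.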

### Transport Requirements
- The **normative HTTP binding** is published in [spec/openapi.yaml](/spec/openapi.yaml) and implemented by the reference **FastAPI** service.
- **`/message`** handles user tasks and local agent traffic. **`/tasks`** returns the caller's in-flight and completed tasks, and **`/task`** fetches a specific task record by ID. **`/interswarm/forward`** / **`/interswarm/back`** move agent traffic between swarms, and **`/interswarm/message`** proxies user/admin requests to a remote swarm.
- Implementations MUST replay responses from remote swarms back into the local queue to complete task lifecycles.

### Conformance & Validation
- Use the **included JSON Schemas** for request/response validation in any runtime.
- Run **`uv run spec/validate_samples.py`** to check sample payloads against the schemas.
- Terms defined in the spec follow RFC 2119/RFC 8174 keywords.
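
As a quick sanity check before full schema validation, the required payload fields from the table above can be expressed directly. This is an illustrative sketch only; conforming implementations should validate against `MAIL-core.schema.json`:

```python
# Required payload fields per msg_type, mirroring the message-primitives table.
REQUIRED_FIELDS = {
    "request": {"task_id", "request_id", "sender", "recipient", "subject", "body"},
    "response": {"task_id", "request_id", "sender", "recipient", "subject", "body"},
    "broadcast": {"task_id", "broadcast_id", "sender", "recipients", "subject", "body"},
    "interrupt": {"task_id", "interrupt_id", "sender", "recipients", "subject", "body"},
    "broadcast_complete": {"task_id", "broadcast_id", "sender", "recipients", "subject", "body"},
}

def missing_fields(msg_type: str, payload: dict) -> set[str]:
    """Return the required payload fields absent for this msg_type."""
    return REQUIRED_FIELDS[msg_type] - payload.keys()
```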

## 2. Reference Implementation

### Key Features
- **Persistent swarm runtime** with pluggable agents, tools, and memory backends.
- **Task resume safety** via automatic queue snapshots that stash pending task messages on completion/breakpoints and restore them when the user resumes work.
- **FastAPI HTTP server** exposing REST endpoints, **Server-Sent Events (SSE)** streams, and **interswarm messaging** routes.
- **Task introspection API** surfaces `GET /tasks` and `GET /task` so callers can audit active work, inspect SSE timelines, and resume confidently from any state.
- **CLI launcher** (`mail server`, `mail client`) for running the server and an interactive REPL without writing code.
- **Async MAIL client** (`MAILClient`) mirroring the REST API with SSE helpers for quick integrations.
- Built-in **swarm registry** with **health checks** and **service discovery** for distributed deployments.
- **Configurable authentication layer** that plugs into external auth/token providers.
- **Example agents** (`supervisor`, `weather`, `math`, cross-swarm demos) showcasing MAIL usage patterns.

### Architecture Highlights
- **[src/mail/core/runtime.py](/src/mail/core/runtime.py)**: Mailbox scheduling, task orchestration, priority queues, and tool execution.
- **[src/mail/server.py](/src/mail/server.py)**: FastAPI application with REST + SSE endpoints and interswarm routing.
- **[src/mail/net/router.py](/src/mail/net/router.py)**: HTTP federation between swarms, including metadata rewriting.
- **[src/mail/net/registry.py](/src/mail/net/registry.py)**: Service registry and liveness monitoring for remote swarms.
- **[src/mail/factories/](/src/mail/factories/__init__.py)**: Agent functions that instantiate agents with their LLM/tool configuration.
- **[src/mail/examples/](/src/mail/examples/__init__.py)**: Example agents and prompts.

The runtime processes MAIL messages **asynchronously**, tracks per-task state, and produces `broadcast_complete` events to signal overall task completion.

## 3. Getting Started

### Prerequisites
- **Python 3.12+**
- [`uv`](https://github.com/astral-sh/uv) package manager (recommended) or `pip`
- **[LiteLLM](https://github.com/BerriAI/litellm) proxy endpoint** for LLM calls
- **Authentication service** providing `/auth/login` and `/auth/check` (see below)

### Installation
```bash
# Clone and enter the repository
git clone https://github.com/charonlabs/mail --branch v1.3.6
cd mail

# Install dependencies (preferred)
uv sync

# or, using pip
pip install -e .
```

### Configuration
Set the following **environment variables** before starting the server:

```bash
# Authentication endpoints
export AUTH_ENDPOINT=http://your-auth-server/auth/login
export TOKEN_INFO_ENDPOINT=http://your-auth-server/auth/check

# LLM proxy (required only if your swarm uses use_proxy=true)
export LITELLM_PROXY_API_BASE=http://your-litellm-proxy

# Optional provider keys (required for direct provider calls)
export OPENAI_API_KEY=sk-your-openai-api-key
export ANTHROPIC_API_KEY=sk-your-anthropic-key

# Optional persistence (set to "none" to disable)
export DATABASE_URL=postgresql://...
```

Defaults for host, port, swarm metadata, and client behaviour are loaded from [`mail.toml`](mail.toml):

- The `[server.settings]` table exposes `task_message_limit` (default `15`), which bounds how many messages the runtime will process per task when `run_continuous` is active.
- It also exposes `print_llm_streams` (default `true`), which controls whether runtime-managed agents print LLM reasoning/response stream chunks to server stdout. Set `print_llm_streams=false` (or pass `mail server --print-llm-streams false`) for quieter server logs; task/event SSE streaming is unaffected.
- Edit the file, or point `MAIL_CONFIG_PATH` at an alternate TOML, to adjust these values per environment.
- CLI flags such as `--swarm-name`, `--swarm-source`, `--swarm-registry`, and `--print-llm-streams true|false` override these at launch. `mail server` exports `SWARM_NAME`, `SWARM_SOURCE`, `SWARM_REGISTRY_FILE`, and `BASE_URL` for downstream tools but does not read them back as config overrides.

MAIL will create the parent directory for `SWARM_REGISTRY_FILE` on startup if it is missing, so you can rely on the default `registries/` path without committing the folder.

**Swarm definitions** live in [swarms.json](/swarms.json). Each entry declares the agents, entrypoint, tools, and default models for a swarm.

### Run a Local Swarm
```bash
# Start the FastAPI server (includes SSE + registry)
uv run mail server
# or explicitly
uv run -m mail.server
```

### Federate Two Swarms (Example)
```bash
# Terminal 1
uv run mail server --port 8000 --swarm-name swarm-alpha --swarm-registry registries/swarm-alpha.json

# Terminal 2
uv run mail server --port 8001 --swarm-name swarm-beta --swarm-registry registries/swarm-beta.json

# Register each swarm with the other (requires admin bearer token)
curl -X POST http://localhost:8000/swarms \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "swarm-beta", "base_url": "http://localhost:8001"}'

curl -X POST http://localhost:8001/swarms \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "swarm-alpha", "base_url": "http://localhost:8000"}'
```
Agents can now address peers using `agent-name@swarm-name`, and responses will route back automatically.

## 4. Repository Layout
```
mail/
├── spec/                  # Protocol specification, schemas, validation utilities
├── src/mail/              # Reference implementation (core runtime + FastAPI services)
├── docs/                  # Supplemental docs (registry, inter-swarm, auth, etc.)
├── swarms.json            # Default swarm configurations
├── tests/                 # Pytest suite covering protocol + runtime behaviors
├── scripts/               # Operational helpers (deploy, smoke tests, tooling)
├── registries/            # Swarm registry persistence (created as needed)
├── assets/                # Diagrams and static assets (README image, etc.)
└── pyproject.toml         # Project metadata and dependency definitions
```

## 5. Development Workflow
- **`uv run mail server`** – run the reference server locally.
- **`uv run pytest -q`** – execute the automated test suite.
- **`uv run ruff check --fix .`** – lint and auto-fix style issues.
- **`uv run spec/validate_samples.py`** – validate example MAIL payloads against the schemas.

## 6. Documentation & Resources
- **Quickstart guide**: [docs/quickstart.md](/docs/quickstart.md)
- **Architecture deep-dive**: [docs/architecture.md](/docs/architecture.md)
- **Protocol message format reference**: [docs/message-format.md](/docs/message-format.md)
- **HTTP/API surface**: [docs/api.md](/docs/api.md)
- **Swarm configuration & registry operations**: [docs/configuration.md](/docs/configuration.md), [docs/registry.md](/docs/registry.md)
- **Database persistence**: [docs/database.md](/docs/database.md)
- **HTTP client usage**: [docs/client.md](/docs/client.md)
- **Security hardening checklist**: [docs/security.md](/docs/security.md)
- **Agents, tools, and examples**: [docs/agents-and-tools.md](/docs/agents-and-tools.md), [docs/examples.md](/docs/examples.md)
- **Testing and troubleshooting**: [docs/testing.md](/docs/testing.md), [docs/troubleshooting.md](/docs/troubleshooting.md)
- **Runtime source directories**: [src/mail/examples/](/src/mail/examples/__init__.py), [src/mail/factories/](/src/mail/factories/__init__.py)

## 7. Contributing
- **Read [CONTRIBUTING.md](/CONTRIBUTING.md)** for branching, issue, and review guidelines.
- All commits require a **Developer Certificate of Origin sign-off** (`git commit -s`).
- Please open an issue to propose significant protocol changes before implementation.
- Core maintainers are listed in [MAINTAINERS.md](/MAINTAINERS.md).

## 8. Licensing & Trademarks
- Reference implementation code: **Apache License 2.0** ([LICENSE](/LICENSE)).
- Specification text: **Creative Commons Attribution 4.0** ([SPEC-LICENSE](/SPEC-LICENSE)).
- Essential patent claims: **Open Web Foundation Final Specification Agreement 1.0** ([SPEC-PATENT-LICENSE](/SPEC-PATENT-LICENSE)).
- Trademarks and descriptive use policy: [TRADEMARKS.md](/TRADEMARKS.md).

Using the spec or code implies acceptance of their respective terms.

---

For questions, bug reports, or feature requests, open an issue or start a discussion in this repository.


===== End of `README.md` =====

===== `docs/interswarm.md` =====

# Interswarm Messaging

MAIL supports cross-swarm communication over HTTP. Remote addresses are written as `agent@swarm` and routed via the interswarm router and registry.

## Addressing
- **Local**: `agent`
- **Remote**: `agent@swarm`
- **Helper functions**: `parse_agent_address`, `format_agent_address` ([src/mail/core/message.py](/src/mail/core/message.py))

## Router ([src/mail/net/router.py](/src/mail/net/router.py))
- Detects remote recipients and wraps messages into `MAILInterswarmMessage`
- Uses the registry to find the remote base URL and resolves auth tokens from `${SWARM_AUTH_TOKEN_<NAME>}` when present
- Falls back to the message payload's `auth_token` when the registry entry does not supply one, and raises an explicit error if neither source is available
- Sends new tasks to the remote server via `/interswarm/forward` and returns follow-up or completion traffic through `/interswarm/back`
- When a local user/admin proxies a task, `post_interswarm_user_message` targets the remote `/interswarm/message` endpoint and returns the resulting `MAILMessage`

### Authentication
- Persistent registry entries store a reference to `${SWARM_AUTH_TOKEN_<SWARM>}`. Export these before starting each server so forwarded calls carry a valid bearer token.
- For volatile or ad-hoc registrations you can still embed an `auth_token` on each message; the router now uses this as a fallback.
- If both the registry and the payload omit a token, the router fails fast with `authentication token missing for swarm '<name>'` so issues surface locally rather than as a remote 401.

## Registry ([src/mail/net/registry.py](/src/mail/net/registry.py))
- Tracks local and remote swarms, performs health checks, persists non-volatile entries
- Auth tokens for persistent swarms are converted to environment variable references `${SWARM_AUTH_TOKEN_<NAME>}`
- Validates whether required env vars are set and resolves them at runtime

## Server endpoints ([src/mail/server.py](/src/mail/server.py))
- **POST `/interswarm/forward`** (agent): remote swarms send initial messages for a new or resumed task; body `{ "message": MAILInterswarmMessage }`
- **POST `/interswarm/back`** (agent): remote swarms send follow-up or completion payloads for an existing task; body `{ "message": MAILInterswarmMessage }`
- **POST `/interswarm/message`** (admin/user): local callers proxy a task to a remote swarm; body `{ user_token, body, targets, ... }`
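
As a rough illustration, a `/interswarm/message` request body might look like this. Only the fields named above are shown; the additional optional fields elided in the endpoint summary are omitted here as well:

```json
{
  "user_token": "<bearer token for the remote swarm>",
  "body": "Summarize today's weather",
  "targets": ["weather@swarm-beta"]
}
```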

## Enabling interswarm
- Ensure `mail.toml` (or CLI flags like `--swarm-name`, `--swarm-source`, `--swarm-registry`) identifies this server instance. The base URL is derived from `host` + `port`.
- Ensure your persistent swarm template enables interswarm where needed (see agents & supervisor tools)
- Export `SWARM_AUTH_TOKEN_<REMOTE>` for every persistent remote entry before starting the server (the registry logs a warning if the variable is missing).
- Start two servers on different ports; register them with each other using `/swarms` endpoints

## Example flow
1. User calls `POST /message` locally
2. (optional) If the entrypoint agent is not interswarm-enabled, it forwards the user's message to one that is
3. The interswarm-enabled agent sends a message to `target@remote-swarm` using standard MAIL syntax
4. Router wraps the message and POSTs to the remote `/interswarm/forward`
5. Remote swarm processes the task; whenever it has a response or needs to resume locally it POSTs `/interswarm/back` to the origin swarm
6. Local server correlates each `/back` payload to the user’s original task and feeds it into the local runtime
7. Local swarm calls `task_complete` once a `broadcast_complete` arrives through the `/back` channel

## Runtime behavior
- Local agents still need explicit `comm_targets` to message peers in the same swarm, even when the address includes `@<local-swarm>`.
- `comm_targets` are enforced for local agents (including `agent@swarm` targets). Messages originating from remote swarms bypass local `comm_targets` checks.


===== End of `docs/interswarm.md` =====

===== `docs/cli.md` =====

# MAIL Command-Line Interface

The reference implementation ships with a convenience CLI that lets you run the FastAPI server and talk to it interactively from the same entry point. Both commands are exposed via the console script `mail`, which is installed when you install this package (`uv sync` or `pip install -e .`).

## Commands

```shell
mail server   # run the FastAPI reference server
mail client   # launch the interactive MAIL client REPL
mail version  # print MAIL reference + protocol version information
mail ping     # check if a MAIL server is reachable
mail db-init  # initialize database tables for persistence
mail register # register as OS handler for swarm:// URLs
```

The top-level parser accepts the same flags regardless of how you invoke it, for example `python -m mail.cli …` or `uv run mail …`. Use `mail version` any time you need to confirm the reference implementation and protocol version advertised by the CLI.

### `mail server`
- Configuration defaults are read from `mail.toml` (see
  [configuration.md](./configuration.md)). Flags such as `--host`, `--port`, `--reload`, `--swarm-name`, `--swarm-source`, `--swarm-registry`, and `--print-llm-streams true|false` only override the values you provide.
- Use `--config /path/to/mail.toml` to point at a different configuration file for a single run. The environment variable `MAIL_CONFIG_PATH` acts as the persistent override if you prefer exporting it once.
- Environment variables such as `AUTH_ENDPOINT` and `TOKEN_INFO_ENDPOINT` remain required; `LITELLM_PROXY_API_BASE` is required only if your swarm uses `use_proxy=true`. The CLI does not provide defaults for these. When launched via `mail server`, defaults from `mail.toml` are exported to `SWARM_NAME`, `SWARM_SOURCE`, `SWARM_REGISTRY_FILE`, and `BASE_URL` for you.
- Pass `--debug` (or set `[server].debug = true`) when you need the debug-only surface, including the OpenAI-compatible `/responses` endpoint. Leave it off for production deployments.
- Use `--print-llm-streams true|false` to control whether runtime-managed agents print LLM reasoning/response stream chunks to server stdout. This does not affect SSE event streaming returned by `message-stream`/`POST /message`.
- Example:

  ```bash
  uv run mail server \
  --host 0.0.0.0 \
  --port 8000 \
  --reload
  ```

### `mail client`
Launching `mail client` starts the interactive REPL.

- Provide the server URL as the first positional argument so the client knows which base URL to contact; the CLI does not infer it automatically.
- The default timeout comes from the `[client]` table in `mail.toml`; override it per invocation with `--timeout`.
- The `--config` flag is shared with `mail server`, allowing you to point both commands at the same config file if you keep multiple TOML profiles.
- Toggle verbose HTTP logging for the REPL with `--verbose`; it mirrors `[client].verbose` from `mail.toml`.

```shell
uv run mail client http://localhost:8000 \
--api-key $USER_TOKEN
```

Once inside you will see the prompt `mail>`. The REPL accepts any of the subcommands documented in [docs/client.md](./client.md), plus a few helper commands:

| Command | Description |
| --- | --- |
| `help` / `?` | Print CLI usage information without exiting the loop. |
| `exit` / `quit` | Leave the REPL. |
| `ping` | Invoke `GET /` and display the server name/version. |
| `health [-v]` / `health-update STATUS` | Read or update health probes (update requires admin role). |
| `whoami`, `status`, `login`, `logout` | Inspect or change the caller identity tracked by the session. |
| `message "…" [--entrypoint …] [--task-id …] [--resume-from …] [--kwargs '{…}'] [--show-events]` | Submit a message and print the structured response. |
| `message-stream "…" [--task-id …] [--resume-from …] [--kwargs '{…}']` | Stream SSE events; each event is printed as it arrives. |
| `message-interswarm "…" '["agent@remote"]' $USER_TOKEN` | Proxy a request to another swarm (requires interswarm). |
| `swarms-get`, `swarm-register`, `swarm-dump`, `swarm-load-from-json` | Inspect or mutate the swarm registry and persistent template. |
| `responses INPUT TOOLS [--instructions …] [--previous-response-id …] [--tool-choice …] [--parallel-tool-calls] [--kwargs '{…}']` | Debug-only OpenAI `/responses` call; requires the server to run with `--debug` and a JSON payload for `INPUT`/`TOOLS`. |

The REPL uses `shlex.split`, so quoting works as expected:

```shell
mail> message "Forecast for tomorrow" \
      --entrypoint supervisor
```

Errors raised by `argparse` are caught and reported without exiting the loop, letting you adjust the command and retry. Blank lines are ignored, and `Ctrl+C` returns you to the prompt without killing the process.

### Streaming inside the REPL
`message-stream` mirrors the behaviour of `MAILClient.post_message_stream`. When the server emits events, each `ServerSentEvent` object is printed in arrival order. This is particularly useful when you want to monitor `task_complete` notifications or inspect intermediate `new_message` / `action_call` events without leaving the terminal.

### Working with Tasks and Breakpoint Resumes

- Both messaging commands accept `--task-id`. Provide it to resume an existing task; omit it to let the server allocate one for a brand-new task.
- To continue a task that paused on a breakpoint tool call, add the following flags:

  ```shell
  mail> message-stream "Continuing after breakpoint" \
        --task-id weather-123 \
        --resume-from breakpoint_tool_call \
        --kwargs '{"breakpoint_tool_call_result": "{\"call_id\":\"bp-1\",\"content\":\"Forecast: sunny\"}"}'
  ```

- The `--kwargs` payload must be valid JSON. For breakpoint resumes the runtime only requires `breakpoint_tool_call_result`; supply a JSON-encoded string that mirrors the tool outputs you received. Provide either a single object (`{"content": "..."}`) or a list of objects (`[{"call_id": "...", "content": "..."}]`) when several breakpoint tools paused in parallel.
- Upon resuming, the runtime reloads any stashed queue entries for the task so the agents pick up exactly where they paused.
- For manual follow-ups, use `--resume-from user_response` to inject a new user message into the same task without losing queued events:

  ```shell
  mail> message "Add a final summary" \
        --task-id weather-123 \
        --resume-from user_response
  ```

### `mail ping`

Check if a MAIL server is reachable and display its health status.

```shell
uv run mail ping http://localhost:8000
```

- The command calls `GET /health` on the target server and reports the swarm name and status.
- Use `--timeout` to override the default 5-second timeout.
- Supports `swarm://` URLs (see below), which are automatically converted to HTTPS.

```shell
# With custom timeout
uv run mail ping http://localhost:8000 --timeout 10

# Using swarm:// URL
uv run mail ping "swarm://connect?server=example.com"
```

On success, you'll see output like:
```
✓ my-swarm is healthy
```

On failure:
```
✗ Cannot connect to http://localhost:8000
```

### `mail db-init`

Initialize PostgreSQL database tables for agent history and task persistence.

```shell
uv run mail db-init
```

- Requires the `DATABASE_URL` environment variable to be set.
- Creates four tables: `agent_histories`, `tasks`, `task_events`, `task_responses`.
- Safe to run multiple times (uses `CREATE TABLE IF NOT EXISTS`).
- See [database.md](./database.md) for schema details and setup instructions.

### `mail register`

Register the MAIL client as the operating system handler for `swarm://` URLs.

```shell
uv run mail register
```

This enables clicking `swarm://` links in browsers or other applications to automatically open the MAIL client.

**Platform support:**

- **Linux**: Creates a `.desktop` file and registers via `xdg-mime`. Fully automated.
- **macOS**: Prints `Info.plist` configuration for app bundling (manual setup required).
- **Windows**: Prints PowerShell commands for registry modification (requires Administrator).

### `swarm://` URL Scheme

The CLI supports `swarm://` URLs for connecting to MAIL servers. This provides a convenient way to share connection details.

**Supported formats:**
```
swarm://connect?server=<hostname>&token=<api_key>
swarm://invite?server=<hostname>&token=<api_key>
```

Both `mail client` and `mail ping` accept these URLs:

```shell
# Connect to a server using swarm:// URL
uv run mail client "swarm://connect?server=example.com&token=my-api-key"

# Ping a server using swarm:// URL
uv run mail ping "swarm://connect?server=example.com"
```

The URL is automatically converted to `https://<server>`. The token (if provided) is used as the API key for `mail client`; `mail ping` ignores the token because `/health` is public.

## Tips
- Use the same environment variables you would for the Python client. The CLI simply wraps `MAILClient` and forwards `--api-key`, `--timeout`, and `--verbose` into `ClientConfig`.
- Combine with `uv run` for isolated environments, e.g. `uv run mail client …`.
- Logging inherits the standard logging configuration. Setting `MAIL_LOG_LEVEL=DEBUG` will surface detailed request/response traces while you use the REPL.

### OpenAI Responses from the REPL
- Ensure the server was started with `--debug`; the `/responses` endpoint is hidden otherwise.
- Supply `INPUT` and `TOOLS` as JSON strings. A minimal request looks like:

  ```shell
  mail> responses '[{"role":"user","content":"Summarise the plan"}]' '[]' 
  ```

- Provide additional OpenAI-compatible fields through the optional flags (for example `--instructions "System prompt"` or `--previous-response-id ...`). The command forwards the parsed JSON directly to the HTTP API.

For deeper programmatic examples refer to [docs/client.md](./client.md).


===== End of `docs/cli.md` =====

===== `docs/factories.md` =====

# MAIL Factories

MAIL factories encapsulate the boilerplate required to build agent callables that the runtime can schedule. They translate agent configuration—model selection, tool access, routing flags—into an async callable that conforms to `mail.core.agents.AgentFunction`.

Factories live under `src/mail/factories/` and are organized by concern:

- `base.py` provides the shared factory base classes and the `base_agent_factory` convenience function (now deprecated).
- `action.py` layers action-specific validation (e.g., pydantic tool schemas).
- `supervisor.py` wires in supervisor-only tools and policies.
- `src/mail/examples/*/agent.py` demonstrates how to compose these factories for concrete agents.

## Quick Start

LLM agents can be built easily with `LiteLLMAgentFunction`. `LiteLLMAgentFunction.__init__` corresponds to the agent factory; `LiteLLMAgentFunction.__call__` represents the agent function itself. This class wires the given tools and LiteLLM configuration into a coroutine that the runtime can invoke:

```python
from mail.factories import LiteLLMAgentFunction

analytics_agent = LiteLLMAgentFunction(
    # top-level wiring
    name="analyst",
    comm_targets=["consultant", "supervisor"],
    tools=[{"type": "function", "function": {"name": "fetch_report", "description": "...", "parameters": {...}}}],
    # LiteLLM config
    llm="openai/gpt-5-mini",
    system="system prompt string",
    tool_format="responses",
    enable_entrypoint=False,
    enable_interswarm=False,
    can_complete_tasks=False,
    # runtime instance parameter defaults
    user_token="",  # provided when instantiating a swarm (per runtime instance)
    reasoning_effort="low",
    thinking_budget=4000,
    max_tokens=6000,
    memory=True,
    use_proxy=True,
    stream_tokens=True,
    print_llm_streams=True,
)
```

At runtime, `LiteLLMAgentFunction` receives `messages` and an optional `tool_choice`; the `user_token` is captured at instantiation time (per runtime instance), not per message.

The agent shown above can be directly run as follows:

```python
messages: list[dict[str, Any]] = [
    {
        "role": "user",
        "content": "Test message"
    }
]

tool_choice: str | dict[str, str] = "auto"

agent_output = await analytics_agent(
    messages=messages,
    tool_choice=tool_choice, # default = "required"
)
```

## Agent Function Class Hierarchy

When you need specialized behavior, inherit from the agent function classes defined in `src/mail/factories/base.py`:

- **`MAILAgentFunction`** — abstract base storing common configuration (communication targets, tool sets, scheduling flags).
- **`LiteLLMAgentFunction`** — concrete implementation that prepares LiteLLM requests for either `completions` or `responses` tool formats.

From these, `action.py` and `supervisor.py` define more specific flavors:

- **`ActionAgentFunction` / `LiteLLMActionAgentFunction`** — validate and normalize action tool schemas before delegating to LiteLLM.
- **`SupervisorFunction` / `LiteLLMSupervisorFunction`** — append supervisor control tools (`task_complete`, broadcast helpers) and enable task completion.

Extending an agent function lets you share configuration defaults while allowing callers to override instance-level settings. For example, the sample analyst agent exposes additional metadata but ultimately delegates to `LiteLLMAgentFunction`:

```python
from collections.abc import Awaitable
from typing import Any

from mail.core.tools import AgentToolCall
from mail.factories.base import LiteLLMAgentFunction


class LiteLLMAnalystFunction(LiteLLMAgentFunction):
    def __call__(
        self,
        messages: list[dict[str, Any]],
        tool_choice: str | dict[str, str] = "required",
    ) -> Awaitable[tuple[str | None, list[AgentToolCall]]]:
        # Leverage LiteLLMAgentFunction's async implementation
        return super().__call__(messages, tool_choice)
```

This pattern keeps all LiteLLM handling inside the shared implementation while leaving room to hook custom behavior if needed (e.g., adding traces, rewriting messages).

## Tool Handling

Factories rely on utilities in `mail.core.tools` to expose MAIL-native tools and normalize user-provided actions:

- `create_mail_tools(...)` returns runtime utilities (send, ack, task_complete) and respects the `exclude_tools` list.
- `pydantic_function_tool` and `_make_tools` ensure OpenAI-style tool definitions match LiteLLM expectations.
- Supervisor factories append additional control tools via `create_supervisor_tools`.

When building custom agent functions, consider reusing these helpers instead of reimplementing tool coercion logic. That keeps behavior consistent across agents and ensures new dispatcher features (like inter-swarm messaging) propagate automatically.

## Instance Parameters vs. Top-Level Wiring

Factory call signatures follow a convention:

- **Top-level parameters** (`comm_targets`, `tools`, `name`, `enable_entrypoint`, etc.) describe the agent's static wiring and are typically supplied from `swarms.json` or other configuration.
- **Instance parameters** (`user_token`, instance-level overrides) are filled when the swarm or agent instance is created.
- **Internal parameters** (`agent_params`: `llm`, `system`, `reasoning_effort`, `thinking_budget`, `stream_tokens`, `print_llm_streams`) control the LLM call and are often set by package defaults or environment configuration.

`LiteLLMAgentFunction` closes over the supplied top-level settings and uses the instance parameters provided when the swarm is instantiated (for example, a per-user `user_token`).

When a `MAILRuntime` is created with `print_llm_streams=False`, it propagates that value down to these function wrappers on a best-effort basis, so stream output is centrally suppressed even if individual agent params set it to `true`.

## Integrating with Swarms

Agent definitions in `swarms.json` reference factories via import strings, for example:

```json
{
  "name": "analyst",
  "factory": "python::mail.examples.analyst_dummy.agent:LiteLLMAnalystFunction",
  "comm_targets": ["consultant", "supervisor"],
  "agent_params": {
    "llm": "openai/gpt-5-mini",
    "system": "mail.examples.analyst_dummy.prompts:SYSPROMPT"
  }
}
```

The runtime instantiates these factories through `mail.api.MAILAgentTemplate`, passing shared top-level configuration. Custom agent functions should keep their signatures compatible with the templates so they can be plugged into swarms without additional glue code.

## Testing and Validation

- Use `uv run pytest -q` (or scope the run to `tests/unit`) to exercise factory behavior. The sample agents demonstrate how to cover agent execution paths.
- Run `uv run ruff check .` and `uv run mypy src/mail` to keep style and types aligned with project standards.
- When factories introduce new tool schemas, update JSON schema fixtures under `spec/` and validate with `uv run spec/validate_samples.py`.

Keeping agent functions small and composable makes it easy to add new agent personas or capabilities without duplicating LiteLLM interaction logic. Start with `LiteLLMAgentFunction`, and lean on the shared utilities to stay consistent with the rest of MAIL.


===== End of `docs/factories.md` =====

===== `docs/troubleshooting.md` =====

# Troubleshooting

This document contains various tips on how to ensure your MAIL swarm is running correctly.

## Common issues

Below is a list of possible issues you may encounter during setup, and steps you can take to resolve them.

> [!IMPORTANT]
> This list is not exhaustive, and probably never will be. If you run into any resolvable issue worth mentioning, feel free to add it here.

### Server won't start
  - Check required env vars: `AUTH_ENDPOINT`, `TOKEN_INFO_ENDPOINT` (and `LITELLM_PROXY_API_BASE` only if your swarm uses `use_proxy=true`)
  - Verify `mail.toml` (or `MAIL_CONFIG_PATH`) points at the correct swarm source and registry file
  - Verify **Python 3.12+** and dependency install
  
### Auth errors
  - Ensure the auth endpoints respond and the token has the correct role
  - The server expects `Authorization: Bearer <token>` for nearly all endpoints
  
### No response from agents
  - Confirm [swarms.json](/swarms.json) factory and prompt import paths are valid
  - Ensure at least one supervisor agent exists and is the configured entrypoint
  - Run `validate_swarm_from_swarms_json()` on your swarm dict to catch wiring mistakes early; it checks entrypoint validity, comm_targets, supervisor presence, duplicate names, and action references, with "Did you mean '...'?" suggestions for typos
  
### Interswarm routing fails
  - Use `agent@swarm` addressing and register swarms via `/swarms`
  - Verify swarm name/registry settings in `mail.toml`, the registry persistence file path, and env var tokens (set `SWARM_AUTH_TOKEN_<SWARM>` before startup)
  
### SSE stream disconnects
  - Check client and proxy timeouts; events include periodic ping heartbeats

## Logs
- **Enable logging** to debug flow and events
- If model reasoning/token stream output is too noisy in server logs, set `[server.settings].print_llm_streams = false` (or launch with `mail server --print-llm-streams false`)
- See [src/mail/utils/logger.py](/src/mail/utils/logger.py) for initialization

## Where to ask
- **Open an issue** with endpoint responses, logs, and your `swarms.json` (redact secrets)


===== End of `docs/troubleshooting.md` =====

===== `docs/reasoning-trace-findings.md` =====

# Reasoning Trace Smoke Test Findings

## Overview

This document captures findings from smoke tests investigating where reasoning/thinking content lives in different LLM API responses, to inform implementation of reasoning traces in tool call events.

---

## OpenAI Responses API (gpt-5.2)

**Test Date:** 2026-01-06
**Model:** `openai/gpt-5.2` (resolved to `gpt-5.2-2025-12-11`)
**Test Question:** Einstein's Zebra Puzzle (complex logic puzzle requiring significant reasoning)

### Configuration

```python
await aresponses(
    input=messages,
    model="openai/gpt-5.2",
    max_output_tokens=8192,
    include=["reasoning.encrypted_content"],
    reasoning={"effort": "high", "summary": "detailed"},
    tool_choice="required",
    tools=tools,
    stream=True,  # also tested with stream=False
)
```

### Non-Streaming Response

**Result:** ✅ Reasoning summary IS available

**Location:** `response.output` array contains a `ResponseReasoningItem` with:
- `type: "reasoning"`
- `summary: List[Summary]` - Array of summary objects
- `encrypted_content: str` - Encrypted raw reasoning (not usable)

**Extraction Path:**
```python
for output in res.output:
    if output.type == "reasoning":
        for summary_item in output.summary:
            reasoning_text = summary_item.text  # <-- This is the reasoning summary
```

**Example Output Structure:**
```python
ResponseReasoningItem(
    id='rs_0f0806c96bd0a13f00695dbf738bc081928f41c6fa089b9f24',
    type='reasoning',
    summary=[
        Summary(text="**Analyzing Chesterfield location**\n\nI'm figuring out the possible locations...")
    ],
    content=None,
    encrypted_content='gAAAAABpXb-iTrxe8Argm1GqrIMN83Zy...'
)
```

**Usage Stats:**
- `reasoning_tokens: 1726` (in `usage.output_tokens_details`)

**Important Notes:**
- The `summary` field can be EMPTY (`[]`) if the model didn't need to reason much
- Simple questions may not generate any summary content
- The `reasoning` field on the response object is just the CONFIG (`{'effort': 'high', 'summary': 'detailed'}`), NOT the actual reasoning

### Streaming Response

**Result:** ✅ Reasoning summary IS available via streaming events

**Event Types:**
```
response.reasoning_summary_part.added    - Start of a summary part
response.reasoning_summary_text.delta    - Text chunk of reasoning summary
response.reasoning_summary_text.done     - End of text for this part
response.reasoning_summary_part.done     - End of summary part
```

**Extraction Approach:**
```python
reasoning_parts = []

async for event in stream:
    if event.type == "response.reasoning_summary_text.delta":
        reasoning_parts.append(event.delta)
    elif event.type == "response.completed":
        final_response = event.response

reasoning_summary = "".join(reasoning_parts)
```

**Event Structure (delta events):**
```python
{
    'type': 'response.reasoning_summary_text.delta',
    'sequence_number': 4,
    'item_id': 'rs_022a860c041198e700695dbea537c48190a36863cbcb1aaec5',
    'output_index': 0,
    'summary_index': 0,
    'delta': '**Analy',  # <-- Text chunk
    'obfuscation': 'XTdhgYI7d'
}
```

**Important Notes:**
- The final `response.completed` event's `response.output` does NOT include the reasoning item
- You MUST capture reasoning from streaming events, cannot get it from final response
- 107 reasoning events captured for the zebra puzzle test
- Events come with `obfuscation` field (purpose unclear, possibly for watermarking)

### Key Differences: Streaming vs Non-Streaming

| Aspect | Non-Streaming | Streaming |
|--------|--------------|-----------|
| Reasoning in final response | Yes, in `output` array | No |
| How to capture | `output[i].summary[j].text` | Accumulate `delta` events |
| Event types needed | N/A | `response.reasoning_summary_text.delta` |

### Interleaved Reasoning (CONFIRMED)

**Test Date:** 2026-01-06
**Model:** `openai/gpt-5.2`
**Test:** Multi-tool conversation with web_search and analyze_tradeoffs

**Result:** ✅ OAI DOES have interleaved reasoning!

**Observed Output Sequence:**
```
reasoning -> web_search_call -> reasoning -> function_call
```

**Analysis:**
- Reasoning items at indices: [0, 2]
- Tool items at indices: [1, 3]
- **MULTIPLE REASONING BLOCKS DETECTED**
- **REASONING AFTER TOOL CALL at index [2]**

**Implication:** Must use `pending_reasoning` pattern (same as Anthropic) - each tool gets the reasoning that preceded it, not just "first tool gets all reasoning".
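
To make the attribution concrete, here is a minimal, self-contained sketch of the `pending_reasoning` pattern over a mock output list (the dict shapes are simplified stand-ins for the real Responses API items, and the variable names are illustrative):

```python
# Simplified stand-ins for Responses API output items
outputs = [
    {"type": "reasoning", "summary": "why search"},
    {"type": "web_search_call", "id": "ws_1"},
    {"type": "reasoning", "summary": "why call tool"},
    {"type": "function_call", "id": "fc_1"},
]

pending = []      # reasoning accumulated since the last tool item
attribution = {}  # tool item id -> reasoning that preceded it

for item in outputs:
    if item["type"] == "reasoning":
        pending.append(item["summary"])
    else:
        attribution[item["id"]] = "\n\n".join(pending) if pending else None
        pending = []  # reset so the next tool only gets newer reasoning

print(attribution)
# {'ws_1': 'why search', 'fc_1': 'why call tool'}
```

Each tool item consumes only the reasoning produced since the previous tool item, which is exactly why "first tool gets all reasoning" breaks down here.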

---

## Anthropic Native API (Claude)

**Test Date:** 2026-01-06
**Model:** `claude-sonnet-4-5-20250929`
**Test:** Multi-turn conversation with web_search (server_tool_use), analyze_tradeoffs, and report_findings tools

### Configuration

```python
response = await client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000,
    },
    tools=tools,
    messages=messages,
    extra_headers={
        "anthropic-beta": "interleaved-thinking-2025-05-14"  # REQUIRED for interleaved thinking!
    },
)
```

### Content Block Types Observed

- `thinking` - Contains reasoning with `thinking` and `signature` fields
- `text` - Text output (model's verbal response)
- `server_tool_use` - Server-side tools (web_search)
- `web_search_tool_result` - Result from web_search (comes in SAME response!)
- `tool_use` - Client-side tool calls

### Actual Interleaving Pattern (WITH beta header)

**With `interleaved-thinking-2025-05-14` beta header, TRUE interleaving occurs!**

**Turn 1 Block Sequence:**
```
thinking -> text -> server_tool_use -> web_search_tool_result -> thinking -> text... -> tool_use
```

Note: **TWO thinking blocks** in Turn 1 - one before web_search, one AFTER receiving results before analyze_tradeoffs!

**Turn 2 Block Sequence:**
```
thinking -> text -> tool_use
```

**Turn 3 Block Sequence (final):**
```
thinking -> text
```

### Key Findings

1. **With interleaved-thinking beta, each tool call can have its own preceding thinking**
2. **Thinking appears AFTER receiving tool results** (e.g., after web_search_tool_result)
3. **server_tool_use and web_search_tool_result come together** in the same response
4. **text blocks appear between tool operations** (model explaining what it's doing)
5. **Even final text responses have preceding thinking** (Turn 3)

### Interleaving Analysis Results

| Tool Call | Turn | Type | Preceded By |
|-----------|------|------|-------------|
| web_search | 1 | server_tool_use | `['thinking']` |
| analyze_tradeoffs | 1 | tool_use | `['thinking']` ← SECOND thinking block! |
| report_findings | 2 | tool_use | `['thinking']` |

### Implementation Implications

**With interleaved-thinking beta header enabled:**

**Actual pattern within a turn:**
```
thinking -> [text] -> tool1 -> [result1] -> thinking -> [text] -> tool2 -> ...
```

**For reasoning attribution:**
1. **Each tool call gets its own preceding thinking** (if the model generated one)
2. **Thinking can appear AFTER tool results** - model reasons about what it learned
3. **Parallel tool calls** (if any) would share the preceding thinking block
4. **Use `reasoning_ref` only for truly parallel calls** with no thinking between them

### Thinking Block Structure

```python
{
    "type": "thinking",
    "thinking": "The user wants me to research Python 3.13...",
    "signature": "ErUBCkYIARAB..."  # Cryptographic signature
}
```

### Server Tool Result Structure

The `web_search_tool_result` block appears in the SAME response as `server_tool_use`:
- No separate API call needed for server tool results
- Must be included when reconstructing assistant messages for multi-turn

### Extraction Approach

```python
# Within a single response/turn - with interleaved thinking, each tool gets its own thinking
pending_thinking = []
pending_thinking_content = []
tool_calls = []

for block in response.content:
    if block.type == "thinking":
        pending_thinking.append(block.type)
        pending_thinking_content.append(block.thinking)
    elif block.type == "redacted_thinking":
        pending_thinking.append(block.type)
        pending_thinking_content.append("[REDACTED]")
    elif block.type in ("tool_use", "server_tool_use"):
        tool_call = AgentToolCall(
            tool_name=block.name,
            tool_args=block.input,
            tool_call_id=block.id,
        )
        if pending_thinking:
            # This tool gets all accumulated thinking since last tool
            tool_call.tool_args["thinking_blocks"] = [
                {"type": t, "content": c}
                for t, c in zip(pending_thinking, pending_thinking_content)
            ]
            pending_thinking = []  # Reset for next tool
            pending_thinking_content = []
        tool_calls.append(tool_call)
    # Skip text, web_search_tool_result - not tool calls

# For text-only responses, attach any remaining thinking to a text_output call
if not tool_calls and pending_thinking:
    # response_text below stands for the joined text blocks of this response
    text_output = AgentToolCall(tool_name="text_output", tool_args={"content": response_text})
    text_output.tool_args["thinking_blocks"] = [
        {"type": t, "content": c}
        for t, c in zip(pending_thinking, pending_thinking_content)
    ]
```

### Edge Cases

1. **No thinking before a tool** - Some tools may have no preceding thinking (use `reasoning_ref`)
2. **Multiple thinking blocks before one tool** - Join with newlines, all go to that tool
3. **redacted_thinking** - Handle with `[REDACTED]` placeholder
4. **text blocks between tools** - Ignore for reasoning attribution (they're explanatory text)
5. **web_search_tool_result** - Not a tool call; include in message history but don't emit event
6. **Parallel tool calls** - If multiple tool_use blocks appear consecutively without thinking between, first gets reasoning, others get `reasoning_ref`
7. **Final text with thinking** - Turn 3 showed `thinking -> text` pattern; if no tool_use, thinking goes to text_output
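
Edge cases 1 and 6 combine into a small attribution loop. A self-contained sketch (variable and field names are illustrative, not MAIL internals):

```python
# Two consecutive tool_use blocks with no thinking between them:
# the first gets the reasoning, the second gets a reasoning_ref to it.
blocks = [
    {"type": "thinking", "thinking": "plan both calls"},
    {"type": "tool_use", "id": "t1", "name": "search"},
    {"type": "tool_use", "id": "t2", "name": "summarize"},
]

pending = []
events = []
last_reasoning_call_id = None

for b in blocks:
    if b["type"] == "thinking":
        pending.append(b["thinking"])
    elif b["type"] == "redacted_thinking":
        pending.append("[REDACTED]")
    elif b["type"] == "tool_use":
        if pending:
            events.append({"id": b["id"], "reasoning": pending, "reasoning_ref": None})
            last_reasoning_call_id = b["id"]  # later ref targets point here
            pending = []
        else:
            events.append({"id": b["id"], "reasoning": None, "reasoning_ref": last_reasoning_call_id})
```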

---

## Implementation Recommendations

### For OpenAI Responses API

**Non-Streaming (`_run_responses`):**
```python
# After getting response
reasoning_summary = None
for output in res.output:
    if hasattr(output, 'type') and output.type == 'reasoning':
        if hasattr(output, 'summary') and output.summary:
            # Join all summary texts
            reasoning_summary = "\n".join(
                s.text for s in output.summary if hasattr(s, 'text')
            )
        break

# Attach to first tool call
if reasoning_summary and tool_calls:
    tool_calls[0].tool_args["reasoning"] = reasoning_summary
```

**Streaming (`_stream_responses`):**
```python
reasoning_parts = []

async for event in stream:
    match event.type:
        case "response.reasoning_summary_text.delta":
            reasoning_parts.append(event.delta)
            # Optionally print for visibility
            print(event.delta, end="", flush=True)
        case "response.completed":
            final_response = event.response

# After streaming, attach to first tool call
reasoning_summary = "".join(reasoning_parts)
if reasoning_summary and tool_calls:
    tool_calls[0].tool_args["reasoning"] = reasoning_summary
```

### For Anthropic API

```python
# Track pending thinking blocks
pending_thinking = []

for block in response.content:
    if block.type == "thinking":
        pending_thinking.append(block.thinking)
    elif block.type == "redacted_thinking":
        pending_thinking.append("[redacted thinking]")
    elif block.type in ("tool_use", "server_tool_use"):
        # Create tool call with accumulated thinking
        tool_call = AgentToolCall(...)
        if pending_thinking:
            tool_call.tool_args["thinking_blocks"] = pending_thinking.copy()
            pending_thinking = []  # Reset for next tool
        tool_calls.append(tool_call)
    elif block.type == "text":
        # For text-only, thinking goes to text_output
        text_thinking = pending_thinking.copy()
        pending_thinking = []
```

---

## Test Script

Location: `scripts/smoke_test_reasoning.py`

Runs both streaming and non-streaming tests against OpenAI gpt-5.2 with Einstein's Zebra Puzzle to ensure sufficient reasoning is generated.


===== End of `docs/reasoning-trace-findings.md` =====

===== `docs/quickstart.md` =====

# Quickstart

This guide gets you running a local MAIL swarm and interacting with it.

## Prerequisites
- **Python 3.12+**
- **uv** (recommended) or **pip**
- An **auth service** reachable at `AUTH_ENDPOINT` and `TOKEN_INFO_ENDPOINT`, or a stub for local testing
- An **LLM proxy** compatible with LiteLLM (e.g., `litellm`), required only if your swarm uses `use_proxy=true`

## Install

### Cloning the repo
```bash
git clone https://github.com/charonlabs/mail.git \
--branch v1.3.6
```

### Installing dependencies
```bash
# Using uv (recommended)
uv sync

# Or using pip
pip install -e .
```

## Environment & Config
- Start with `mail.toml` (checked into the repo) to control default host, port, swarm source, client timeout, and `[server.settings]` values (`task_message_limit`, `print_llm_streams`). Copy it if you need environment-specific values and point `MAIL_CONFIG_PATH` (or `--config`) at the new file.
- Minimum environment variables:
  - `AUTH_ENDPOINT`, `TOKEN_INFO_ENDPOINT` for auth (see [configuration.md](/docs/configuration.md))
  - `LITELLM_PROXY_API_BASE` for LLM access only when `use_proxy=true`
  - Optional: `DATABASE_URL` for PostgreSQL persistence of agent histories and task state (see [database.md](/docs/database.md))

## Run
- Start the server:
```bash
# Using uv (recommended)
uv run mail server

# Or
python -m mail.server
```
- Default base URL comes from `mail.toml` (`host` + `port`); override per run with CLI flags or by editing the file / `MAIL_CONFIG_PATH`.
- Need the OpenAI Responses bridge for integration testing? Start the server with `uv run mail server --debug` (or set `[server].debug = true`) to expose the debug-only `/responses` endpoint.
- Prefer containers? Follow the [Docker deployment guide](./docker.md) to build and run the same server with Docker or Compose.

## Try it
- **Check connectivity**:
```bash
# Quick health check using the CLI
uv run mail ping http://localhost:8000
```
- **Basic server info**:
```bash
# Get the swarm name, status, and protocol version
curl http://localhost:8000/

# Get the swarm name and more detailed health status
curl http://localhost:8000/health
```
- **Status** (requires admin/user token): 
```bash
curl -H "Authorization: Bearer $TOKEN" \
http://localhost:8000/status
```
- **MAIL CLI REPL** (enter `help` for commands): 
```bash
uv run mail client \
http://localhost:8000 \
--api-key $TOKEN
``` 
- **Debug `/responses` (optional)**: 
```bash
uv run mail client http://localhost:8000 --api-key $TOKEN
mail> responses '[{"role":"user","content":"Plan tomorrow"}]' '[]' \
      --instructions "You are a helpful planner."
```
- **Send a message**: 
```bash
curl -X POST http://localhost:8000/message \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"body":"Hello"}'
```
- **Stream (SSE)**: 
```bash
curl -N -X POST http://localhost:8000/message \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"body":"Stream please","stream":true}'
```
- **Python client**: use [`MAILClient`](./client.md) if you prefer asyncio code over raw HTTP. Example:
```python
import asyncio
import os
from mail.client import MAILClient

async def main() -> None:
    token = os.getenv("TOKEN")
    async with MAILClient("http://localhost:8000", api_key=token) as client:
        print(await client.ping())
        print(await client.post_message("Hello from Python"))

asyncio.run(main())
```

## Next steps
- Read [architecture.md](/docs/architecture.md) to learn how the runtime processes messages
- Check [agents-and-tools.md](/docs/agents-and-tools.md) to learn how to add or modify agents
- See [interswarm.md](/docs/interswarm.md) to enable cross‑swarm communication


===== End of `docs/quickstart.md` =====

===== `docs/examples.md` =====

# Examples

This repo includes example agents and demo scripts you can run locally.

## Agents
- **Supervisor**: [src/mail/examples/supervisor/](/src/mail/examples/supervisor/__init__.py)
- **Weather**: [src/mail/examples/weather_dummy/](/src/mail/examples/weather_dummy/__init__.py)
- **Math**: [src/mail/examples/math_dummy/](/src/mail/examples/math_dummy/__init__.py)
- **Consultant**: [src/mail/examples/consultant_dummy/](/src/mail/examples/consultant_dummy/__init__.py)
- **Analyst**: [src/mail/examples/analyst_dummy/](/src/mail/examples/analyst_dummy/__init__.py)

## Agent functions and factories
- **Agent factories** in [src/mail/factories/](/src/mail/factories/__init__.py) are classes/functions that construct MAIL-compatible agent callables used by `MAILAgentTemplate`.
- For class-based factories (e.g., `LiteLLMAgentFunction`), the instantiated object's `__call__` method is the agent function the runtime schedules.

## Demo scripts
- **Single swarm**: [scripts/single_swarm_demo.py](/scripts/single_swarm_demo.py)
- **Multiple swarms**: [scripts/multi_swarm_demo.py](/scripts/multi_swarm_demo.py)
- **HTTP client**: [scripts/demo_client.py](/scripts/demo_client.py) launches a stub server and exercises [`MAILClient`](./client.md)

## Swarms configuration
- Top-level [swarms.json](/swarms.json) provides the default template loaded by the server
- Update agent factories, prompts, or actions to customize behavior


===== End of `docs/examples.md` =====

===== `docs/CHECKLIST.md` =====

# Reasoning Trace Implementation Checklist

Use this checklist to implement the reasoning trace plan in order. Do not move to the next step until the "Verify" items are checked.

## Step 0 - Extend AgentToolCall

- [ ] Add `reasoning: list[str] | None` to `AgentToolCall` (keep existing fields and validator).
- [ ] Add `preamble: str | None` to `AgentToolCall` (keep existing fields and validator).
- [ ] Confirm no other fields are removed or defaults changed.

**Verify**
- [ ] `AgentToolCall` still validates `completion` OR `responses` non-empty.
- [ ] New fields are optional and do not affect existing call sites.

---

## Step 1 - Emit tool_call Events

- [ ] Add `_emit_tool_call_event(...)` to `src/mail/core/runtime.py`.
- [ ] Join `call.reasoning` with `"\n\n"` after filtering empty/whitespace-only blocks.
- [ ] Include `preamble` when present.

**Verify**
- [ ] Events include `tool_name`, `tool_args`, `tool_call_id`.
- [ ] `reasoning_ref` is only set when reasoning is absent.

---

## Step 2 - Remove Old tool_args Assignments

- [ ] Remove `tool_args["reasoning"]` and `tool_args["thinking_blocks"]` assignments in `_run_completions_anthropic_native()`.
- [ ] Remove `tool_args["reasoning"]` and `tool_args["thinking_blocks"]` assignments in `_stream_completions_anthropic_native()`.
- [ ] Remove any now-dead `thinking_content` extraction blocks tied only to those assignments.

**Verify**
- [ ] No remaining `tool_args["reasoning"]` or `tool_args["thinking_blocks"]` in `src/mail/factories/base.py`.
- [ ] Anthropic assistant message dict still contains full thinking blocks via `completion`.

---

## Step 3 - Generate UUIDs for text_output

- [ ] Add `uuid.uuid4()` IDs for `text_output` tool calls in:
  - [ ] `_run_completions()` (generic LiteLLM path - UUID only, no reasoning extraction)
  - [ ] `_run_completions_anthropic_native()`
  - [ ] `_stream_completions_anthropic_native()`
  - [ ] `_run_responses()`
- [ ] Keep `tool_args={"content": ...}` unchanged.

**Verify**
- [ ] All `text_output` calls now have non-empty `tool_call_id`.
- [ ] `content` key is preserved (no `text` key introduced).

---

## Step 4 - Emit tool_call Events Before Mutations

- [ ] In `src/mail/core/runtime.py`, emit `tool_call` events at the top of the tool processing loop.
- [ ] Track `last_reasoning_call_id` and use it for `reasoning_ref` when needed.
- [ ] Ensure the event is emitted before any tool-specific mutations (e.g., adding `target`).

**Verify**
- [ ] `reasoning_ref` points to the most recent call with reasoning.
- [ ] Events are emitted for all calls in original order.

---

## Step 5 - Tool Coverage

- [ ] Ensure `tool_call` events are emitted for:
  - [ ] MAIL tools (send_request/send_response/send_interrupt/send_broadcast/task_complete)
  - [ ] Actions
  - [ ] Builtins (web_search_call, code_interpreter_call)
  - [ ] await_message
  - [ ] help
  - [ ] acknowledge/ignore broadcast
  - [ ] breakpoints (emit before existing `breakpoint_tool_call` event)

**Verify**
- [ ] No tool path bypasses `tool_call` emission.
- [ ] Breakpoint tool calls emit `tool_call` before `breakpoint_tool_call`.

---

## Step 6 - OAI Streaming Reasoning (Interleaved)

- [ ] Update `_stream_responses()` to collect reasoning via `response.reasoning_summary_text.delta`.
- [ ] Use `response.output_item.added` to map reasoning to tool outputs (by `output_index`).
- [ ] Flush `current_reasoning_text` on `response.completed`.
- [ ] Return 3-tuple: `(response, tool_reasoning_map, streaming_pending_reasoning)`.
- [ ] Handle dict and object event items.
- [ ] Update `_run_responses()` call site to unpack 3-tuple (was just `response`).

**Verify**
- [ ] Reasoning deltas are joined with `""` (not newlines).
- [ ] `tool_reasoning_map` keys align with `res.output` indices.
- [ ] Streaming fallback reasoning is available for text-only responses.

---

## Step 7 - OAI Single-Pass Collection (Non-Streaming + Streaming Attach)

- [ ] Replace multi-pass tool collection with a single-pass over `res.output`.
- [ ] Handle both dict and object output formats for:
  - [ ] reasoning
  - [ ] message
  - [ ] function_call
  - [ ] web_search_call
  - [ ] code_interpreter_call
- [ ] Attach `responses=outputs` to every `AgentToolCall`.
- [ ] Use `tool_reasoning_map` (when streaming) to fill reasoning if inline extraction is empty.
- [ ] Use `streaming_pending_reasoning` only for text-only fallback (when no tool calls exist).
- [ ] Keep richer builtin fields (status, outputs, search_type).

**Verify**
- [ ] Tool call ordering matches `res.output`.
- [ ] Builtins still hit runtime builtin branches (`web_search_call`, `code_interpreter_call`).
- [ ] Text-only responses create `text_output` with the first message chunk.

---

## Step 8 - Anthropic Interleaved Thinking

- [ ] Rebuild tool call assembly to track `pending_reasoning` and `pending_preamble` by block order.
- [ ] Use `[redacted thinking]` placeholder for `redacted_thinking`.
- [ ] Store reasoning as list[str] on the call, not in `tool_args`.
- [ ] Preserve `completion` for history.

**Verify**
- [ ] Each tool call gets only the reasoning since the previous tool call.
- [ ] Text blocks contribute to preamble; thinking blocks do not reset on text.
- [ ] Text-only responses create `text_output` with `preamble=None`.
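
The intended bookkeeping can be sketched with plain dicts (block shapes and names are illustrative, not the actual factory code):

```python
blocks = [
    {"type": "thinking", "thinking": "think A"},
    {"type": "text", "text": "Let me check."},
    {"type": "tool_use", "id": "t1", "name": "search"},
]

pending_reasoning: list[str] = []
pending_preamble: list[str] = []
calls = []

for b in blocks:
    if b["type"] == "thinking":
        pending_reasoning.append(b["thinking"])
    elif b["type"] == "redacted_thinking":
        pending_reasoning.append("[redacted thinking]")
    elif b["type"] == "text":
        # Text contributes to the preamble without resetting pending thinking
        pending_preamble.append(b["text"])
    elif b["type"] == "tool_use":
        calls.append({
            "id": b["id"],
            "reasoning": pending_reasoning or None,   # list[str], not tool_args
            "preamble": "\n".join(pending_preamble) or None,
        })
        pending_reasoning, pending_preamble = [], []
```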

---

## Step 9 - Cleanup and Consistency

- [ ] Remove any remaining references to `tool_args["reasoning"]` or `tool_args["thinking_blocks"]`.
- [ ] Ensure `reasoning` is never placed in `tool_args` in any provider path.
- [ ] Confirm `preamble` is only on `AgentToolCall.preamble`.

**Verify**
- [ ] No new `tool_args` fields were added for reasoning/preamble.
- [ ] All tool calls still carry valid `completion` or `responses` data for history.

---

## Step 10 - Validation Pass

- [ ] Scan for any call sites still expecting `tool_args["reasoning"]` or `tool_args["thinking_blocks"]`.
- [ ] Confirm `tool_call` events include `reasoning` or `reasoning_ref` only when appropriate.
- [ ] Ensure `reasoning` is filtered for empty/whitespace blocks.

**Verify**
- [ ] No linting or type errors introduced.
- [ ] The new event type appears in SSE stream after a tool call.

---

## Post-Implementation Tests (Optional)

- [ ] Anthropic extended thinking (interleaved) produces per-tool reasoning events.
- [ ] OAI Responses streaming produces per-tool reasoning events.
- [ ] OAI Responses non-streaming produces per-tool reasoning events.
- [ ] Parallel tool calls produce `reasoning_ref` to the most recent reasoning call.
- [ ] Text-only responses emit `tool_call` with UUID and reasoning (if any).


===== End of `docs/CHECKLIST.md` =====

===== `docs/AGENTS_MAIL_PRIMER.md` =====

# AGENTS_MAIL_PRIMER — Using MAIL as a Library

Audience: developers integrating MAIL into their own codebases (not hacking MAIL internals).
Focus: programmatic use of Actions, Agents, Swarms, and Templates.

If you are developing MAIL itself, see `AGENTS.md` and `CLAUDE.md`.

## Mental Model

MAIL is an agent orchestration runtime:
- **Actions** are tool-like functions agents can call.
- **Agents** wrap an LLM call + metadata (comm targets, tool set, entrypoint/supervisor).
- **Swarms** are a set of agents + actions + runtime.

Templates vs instances:
- `MAILAction` is an action definition.
- `MAILAgentTemplate` defines how to build an agent.
- `MAILSwarmTemplate` defines a swarm (agents + actions + entrypoint).
- `MAILSwarm` is a running instance with a runtime.

Key constraints:
- Every swarm needs at least one `enable_entrypoint=True` agent (receives user messages).
- Every swarm needs at least one `can_complete_tasks=True` agent (supervisor role).
- `comm_targets` must reference valid agent names (or interswarm `agent@swarm`).

## Core Imports

```python
from mail import MAILAction, MAILAgentTemplate, MAILSwarmTemplate, action
from mail.factories import (
    LiteLLMActionAgentFunction,
    LiteLLMAgentFunction,
    LiteLLMSupervisorFunction,
)
```

## Quick Start — Minimal Swarm

```python
from mail import MAILAgentTemplate, MAILSwarmTemplate
from mail.factories import LiteLLMSupervisorFunction

supervisor = MAILAgentTemplate(
    name="supervisor",
    factory=LiteLLMSupervisorFunction,
    comm_targets=[],
    actions=[],
    agent_params={
        "llm": "anthropic/claude-sonnet-4-20250514",
        "system": "You are a helpful assistant.",
        "use_proxy": False,
    },
    enable_entrypoint=True,
    can_complete_tasks=True,
    tool_format="completions",
)

template = MAILSwarmTemplate(
    name="my_swarm",
    version="1.0.0",
    agents=[supervisor],
    actions=[],
    entrypoint="supervisor",
)

swarm = template.instantiate(
    instance_params={"user_token": "dummy"},
    user_id="local_user",
)

response, _events = await swarm.post_message_and_run(
    body="Hello!",
    subject="Greeting",
)
print(response["message"]["body"])
```

## Actions (Tools)

Actions must return a string. Two common patterns:

### 1) `@action` decorator (typed payload)

```python
from pydantic import BaseModel, Field
from mail import action

class AddArgs(BaseModel):
    a: int = Field(description="First number")
    b: int = Field(description="Second number")

@action(name="add", description="Add two numbers")
async def add(args: AddArgs) -> str:
    return str(args.a + args.b)
```

### 2) `MAILAction.from_pydantic_model` (closure-friendly)

```python
from functools import partial
from pydantic import BaseModel, Field
from mail import MAILAction

class SelectSpeakerArgs(BaseModel):
    player_name: str = Field(description="The player to speak next")

# `game` is application state, bound in via functools.partial below
async def select_speaker(game, payload: dict) -> str:
    return game.select_speaker(payload["player_name"])

select_speaker_action = MAILAction.from_pydantic_model(
    model=SelectSpeakerArgs,
    function=partial(select_speaker, game),
    name="select_speaker",
)
```

See `src/mail/examples/mafia/narrator_tools.py` for a full pattern.

## Agents (Templates)

Use existing factories rather than rolling your own:
- `LiteLLMSupervisorFunction` for supervisors (can call `task_complete`)
- `LiteLLMActionAgentFunction` for agents that have custom actions/tools
- `LiteLLMAgentFunction` for agents with no custom actions/tools

Note: Supervisors can also have actions. If an agent is a supervisor, keep using
`LiteLLMSupervisorFunction` even when you add actions.

Key fields on `MAILAgentTemplate`:
- `name`
- `factory`
- `comm_targets` (names or interswarm `agent@swarm`)
- `actions` (list of `MAILAction`)
- `agent_params` (LLM config passed to factory)
- `enable_entrypoint` (receive user messages)
- `can_complete_tasks` (supervisor role)
- `tool_format` (top-level; "completions" or "responses")

Example (based on `MASter/src/master/prebuilt/agents.py`):

```python
supervisor = MAILAgentTemplate(
    name="supervisor",
    factory=LiteLLMSupervisorFunction,
    comm_targets=["worker"],
    actions=[],
    agent_params={
        "llm": "anthropic/claude-opus-4-5-20251101",
        "system": "System prompt here",
        "use_proxy": False,
        "stream_tokens": True,
        "reasoning_effort": "high",
        "default_tool_choice": "auto",
    },
    enable_entrypoint=True,
    can_complete_tasks=True,
    tool_format="completions",
)

worker = MAILAgentTemplate(
    name="worker",
    factory=LiteLLMActionAgentFunction,
    comm_targets=["supervisor"],
    actions=[select_speaker_action],
    agent_params={
        "llm": "anthropic/claude-opus-4-5-20251101",
        "system": "Worker prompt",
        "use_proxy": False,
        "stream_tokens": True,
    },
    tool_format="completions",
)
```

Notes:
- `tool_format` should be set at the template level. It is passed to the factory and used for action tool specs.
- OpenAI models generally use `tool_format="responses"`, Anthropic models use `"completions"`.
- If `tool_format` appears inside `agent_params`, it is ignored and a warning is logged; top-level wins.
- `comm_targets` must include valid agent names (or `agent@swarm` if interswarm is enabled).
- A swarm must have at least one agent with `enable_entrypoint=True` and at least one with `can_complete_tasks=True`.
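
The `comm_targets` rule can be illustrated in plain Python. This is a standalone sketch, not a MAIL API; the swarm constructor performs the real validation at instantiation time:

```python
def check_comm_targets(agents: dict[str, list[str]]) -> list[str]:
    """Flag comm_targets that are neither local agent names nor
    interswarm 'agent@swarm' addresses. Illustration only."""
    problems = []
    for name, targets in agents.items():
        for target in targets:
            # Anything containing "@" is treated as an interswarm address
            if "@" not in target and target not in agents:
                problems.append(f"{name} -> {target}")
    return problems
```

For example, `check_comm_targets({"supervisor": ["worker"], "worker": ["supervisor"]})` returns an empty list, while an unknown target name is reported.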

## Swarms (Templates + Instances)

Build a `MAILSwarmTemplate`, then instantiate it into a runtime:

```python
from mail import MAILSwarmTemplate

swarm_template = MAILSwarmTemplate(
    name="example_swarm",
    version="1.0.0",
    agents=[supervisor, worker],
    actions=[select_speaker_action],
    entrypoint="supervisor",
)

swarm = swarm_template.instantiate(
    instance_params={"user_token": "dummy"},
    user_id="local_user",
)
```

The swarm constructor validates:
- entrypoint exists and is enabled
- comm_targets are valid
- at least one supervisor exists
- breakpoint/exclude tools are valid

## Running a Swarm

### One-off run (no continuous loop)

```python
response, events = await swarm.post_message_and_run(
    body="Hello",
    subject="New Message",
    show_events=True,
)
print(response["message"]["body"])
```

### Continuous runtime

```python
import asyncio

runtime_task = asyncio.create_task(swarm.run_continuous())

response, events = await swarm.post_message(
    body="Hello",
    subject="New Message",
    show_events=True,
)

await swarm.shutdown()
await runtime_task
```

### Streaming

```python
stream = await swarm.post_message_stream(body="Hello")
# stream is an EventSourceResponse (SSE)
```

### Manual mode (step-by-step)

`run_continuous(mode="manual")` enables manual stepping, as used extensively by the mafia
example (`src/mail/examples/mafia/game.py`):

```python
import asyncio
import uuid

asyncio.create_task(swarm.run_continuous(mode="manual"))

task_id = str(uuid.uuid4())
init_msg = swarm.build_message(
    subject="::init::",
    body="Game starting",
    targets=["all"],
    sender_type="user",
    type="broadcast",
    task_id=task_id,
)
await swarm.submit_message_nowait(init_msg)
await swarm.await_queue_empty()

response = await swarm.manual_step(
    task_id=task_id,
    target="narrator",
    response_targets=["all"],
    response_type="broadcast",
    payload="Describe the scene.",
    dynamic_ctx_ratio=0.75,
)
```

## Starting a Server (Programmatic)

`MAILSwarmTemplate.start_server()` launches a FastAPI server with your template
and optional UI dev server:

```python
swarm_template.start_server(
    port=8000,
    host="0.0.0.0",
    launch_ui=True,
    ui_port=3000,
)
```

See `../MASter/scripts/GEPA/start_eval_swarm_server.py` for a full example.

## Breakpoints + Resume

Use `breakpoint_tools` to pause execution and resume with user-supplied tool results:

```python
swarm_template = MAILSwarmTemplate(
    name="eval_swarm",
    version="1.0.0",
    agents=[supervisor, worker],
    actions=[select_speaker_action],
    entrypoint="supervisor",
    breakpoint_tools=["human_review"],
)
```

To resume:
```python
await swarm.post_message(
    body="",
    resume_from="breakpoint_tool_call",
    task_id=existing_task_id,
    breakpoint_tool_call_result={"content": "approved"},
)
```

### Parsing Breakpoint Tool Call Arguments

When a breakpoint tool is called, the response has this structure:

```python
response["message"]["subject"] == "::breakpoint_tool_call::"
response["message"]["body"] == '[{"arguments": "{...}", "name": "tool_name", "id": "call_..."}]'
```

Tool calls are standardized to OpenAI/LiteLLM format. To extract:

```python
import json

def parse_breakpoint_args(response: dict, tool_name: str) -> dict | None:
    message = response.get("message", {})
    if message.get("subject") != "::breakpoint_tool_call::":
        return None
    body = message.get("body", "")
    if not body:
        return None
    data = json.loads(body)
    # Structure: [{"arguments": "{...}", "name": "tool_name", "id": "..."}]
    for call in data:
        if call.get("name") == tool_name:
            args = call.get("arguments", "{}")
            return json.loads(args) if isinstance(args, str) else args
    return None
```

This pattern is useful for structured output: define a breakpoint tool, have the agent call it,
then extract the typed arguments instead of parsing freeform text.

## Common Patterns and Gotchas

1. `tool_format` should be a top-level template field. If it appears inside `agent_params`, it is ignored with a warning.
2. Solo agent swarms need both `enable_entrypoint=True` and `can_complete_tasks=True`.
3. Actions must return a string (serialize dicts yourself).
4. Use `functools.partial` to bind state into `MAILAction.from_pydantic_model`.
5. `dynamic_ctx_ratio` (manual step) controls how much context is reserved for dynamic history.

## Loading from swarms.json (Optional)

You can build templates from `swarms.json` (JSON array of swarms):

```python
template = MAILSwarmTemplate.from_swarm_json_file(
    swarm_name="example",
    json_filepath="swarms.json",
)
```

Note: `actions` is required in each swarm definition (use `[]` if none).

## Tool-to-Message Behavior (Quick Reference)

`send_broadcast` currently ignores `targets` and broadcasts to all agents.
`task_complete` emits a `broadcast_complete` to all agents.

## Where to Look for Real Usage

- `../MASter/src/master/prebuilt/agents.py` — agent templates and factory params
- `../MASter/scripts/GEPA/start_eval_swarm_server.py` — template + server startup
- `src/mail/examples/mafia/game.py` — building swarm templates and running manual mode
- `src/mail/examples/mafia/narrator_tools.py` — `MAILAction.from_pydantic_model` patterns
- `src/mail/examples/weather_dummy/agent.py` — `LiteLLMActionAgentFunction` usage


===== End of `docs/AGENTS_MAIL_PRIMER.md` =====

===== `docs/message-format.md` =====

# Message Format

MAIL messages are strongly typed envelopes used internally and over interswarm HTTP. See [src/mail/core/message.py](/src/mail/core/message.py) and the JSON Schemas in [spec/](/spec/MAIL-core.schema.json).

## Addresses
- **`MAILAddress`**: `{ "address_type": "admin"|"agent"|"user"|"system", "address": "string" }`
- **Helpers**: `create_agent_address`, `create_user_address`, `create_system_address`
- Interswarm routing uses `agent@swarm` in `address`

## Core envelopes
- **`MAILRequest`**
  - `task_id`, `request_id`, `sender`, `recipient`, `subject`, `body`
  - Optional interswarm: `sender_swarm`, `recipient_swarm`, `routing_info`
- **`MAILResponse`**
  - `task_id`, `request_id`, `sender`, `recipient`, `subject`, `body`
  - Optional interswarm: `sender_swarm`, `recipient_swarm`, `routing_info`
- **`MAILBroadcast`**
  - `task_id`, `broadcast_id`, `sender`, `recipients[]`, `subject`, `body`
  - Optional interswarm: `sender_swarm`, `recipient_swarms[]`, `routing_info`
- **`MAILInterrupt`**
  - `task_id`, `interrupt_id`, `sender`, `recipients[]`, `subject`, `body`
  - Optional interswarm: `sender_swarm`, `recipient_swarms[]`, `routing_info`

## Wrapper for interswarm HTTP
- **`MAILInterswarmMessage`**
  - `message_id`, `source_swarm`, `target_swarm`, `timestamp`
  - `task_owner`, `task_contributors`
  - `payload`: one of the core envelopes
  - `msg_type`: `request|response|broadcast|interrupt`
  - `auth_token`: optional in the schema, but required by the current `/interswarm/forward` and `/interswarm/back` validators
  - `metadata`: optional in the schema, but required by the same validators
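
As a rough sketch of how the wrapper embeds a core envelope, the fields above can be laid out as plain dicts. All values here are hypothetical; the normative shapes are defined by the JSON Schemas in `spec/`:

```python
import uuid
from datetime import datetime, timezone

# MAILRequest envelope (core fields listed above)
request = {
    "task_id": "task_123",
    "request_id": str(uuid.uuid4()),
    "sender": {"address_type": "agent", "address": "supervisor"},
    "recipient": {"address_type": "agent", "address": "analyst@swarm_b"},
    "subject": "Data pull",
    "body": "Fetch the latest metrics.",
}

# MAILInterswarmMessage wrapper carrying the envelope over HTTP
interswarm = {
    "message_id": str(uuid.uuid4()),
    "source_swarm": "swarm_a",
    "target_swarm": "swarm_b",
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "task_owner": "user_123",
    "task_contributors": ["swarm_a"],
    "msg_type": "request",
    "payload": request,
}
```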

## XML helper
- The runtime can render a human-readable XML body for LLM input: `build_mail_xml(message)`

## Schemas and examples
- **Spec JSON Schemas**: [spec/MAIL-core.schema.json](/spec/MAIL-core.schema.json), [spec/MAIL-interswarm.schema.json](/spec/MAIL-interswarm.schema.json)
- **Examples**: [spec/examples/](/spec/examples/README.md)


===== End of `docs/message-format.md` =====

===== `docs/testing.md` =====

# Testing

## Overview
- **Runner**: `pytest`
- **Layout**: [tests/mock](/tests/mock/) (unit), [tests/network](/tests/network) (API), [tests/unit](/tests/unit/) (core)
- **Config**: [pytest.ini](/pytest.ini)

## Running
- Install dev deps and run: `pytest -q`

## Fixtures & patterns (see [tests/](/tests/))
- Network tests use **FastAPI TestClient** and patch external I/O
- **Fixtures** patch `SwarmRegistry`, auth helpers, and factory imports to avoid network/LLM calls
- **No real external requests** are performed during tests
- The asynchronous HTTP client is exercised in [`tests/unit/test_mail_client.py`](/tests/unit/test_mail_client.py) with an in-process aiohttp server covering SSE and payload validation

## Extending
- **Follow existing patterns** under [tests/unit](/tests/unit/) and [tests/network](/tests/network/)
- Reuse provided fixtures for isolated behavior


===== End of `docs/testing.md` =====

===== `docs/CLAUDE_MAIL_PRIMER.md` =====

# CLAUDE_MAIL_PRIMER — Using MAIL as a Library

This document is for Claude sessions building applications that use MAIL as an imported library. If you're developing MAIL itself, see `CLAUDE.md` in the repository root.

## Mental Model

MAIL is a multi-agent orchestration runtime where agents communicate via an async message queue. The core abstraction is **templates vs instances**:

```
MAILAction          → defines a tool (callable + schema)
MAILAgentTemplate   → defines how to build an agent (factory + params)
MAILSwarmTemplate   → defines a swarm (agents + actions + entrypoint)
        ↓
    .instantiate()
        ↓
MAILSwarm           → running swarm with a MAILRuntime inside
```

A `MAILSwarmTemplate` is a blueprint. Calling `.instantiate()` creates a `MAILSwarm` with an actual runtime that can process messages.

**Key constraints:**
- Every swarm needs at least one `enable_entrypoint=True` agent (can receive user messages)
- Every swarm needs at least one `can_complete_tasks=True` agent (supervisor role)
- `comm_targets` must reference valid agent names within the swarm

## Core Imports

```python
from mail import MAILAction, MAILAgentTemplate, MAILSwarmTemplate, action
from mail.factories import (
    LiteLLMAgentFunction,        # Base agent (no actions, no task_complete)
    LiteLLMActionAgentFunction,  # Agent with actions (no task_complete)
    LiteLLMSupervisorFunction,   # Supervisor (has task_complete)
)
```

## Quick Start — Minimal Working Swarm

```python
from mail import MAILAgentTemplate, MAILSwarmTemplate
from mail.factories import LiteLLMSupervisorFunction

# 1. Define an agent template
supervisor = MAILAgentTemplate(
    name="supervisor",
    factory=LiteLLMSupervisorFunction,
    comm_targets=[],  # No other agents to talk to
    actions=[],
    agent_params={
        "llm": "anthropic/claude-sonnet-4-20250514",
        "system": "You are a helpful assistant.",
        "use_proxy": False,
    },
    enable_entrypoint=True,
    can_complete_tasks=True,
    tool_format="completions",  # or "responses" for OpenAI
)

# 2. Build swarm template
template = MAILSwarmTemplate(
    name="my_swarm",
    version="1.0.0",
    agents=[supervisor],
    actions=[],
    entrypoint="supervisor",
)

# 3. Instantiate and run
swarm = template.instantiate(
    instance_params={"user_token": "dummy"},
    user_id="local_user",
)

# One-shot execution
response, events = await swarm.post_message_and_run(
    body="Hello!",
    subject="Greeting",
)
print(response["message"]["body"])
```

## Actions (Tools)

Actions are tools that agents can call. They must return a string. Two patterns:

### Pattern 1: `@action` Decorator

Use when the action is standalone and doesn't need external state.

```python
from pydantic import BaseModel, Field
from mail import action

class AddNumbersArgs(BaseModel):
    a: int = Field(description="First number")
    b: int = Field(description="Second number")

@action(name="add_numbers", description="Add two numbers together")
async def add_numbers(args: AddNumbersArgs) -> str:
    return str(args.a + args.b)

# Use in agent template:
worker = MAILAgentTemplate(
    name="worker",
    factory=LiteLLMActionAgentFunction,  # Use ActionAgent when agent has actions
    comm_targets=["supervisor"],
    actions=[add_numbers],  # Pass the decorated function directly
    agent_params={...},
)
```

### Pattern 2: `MAILAction.from_pydantic_model` with Closure

Use when the action needs access to external state (like a game object or database).

```python
from functools import partial
from pydantic import BaseModel, Field
from mail import MAILAction

class SelectPlayerArgs(BaseModel):
    player_name: str = Field(description="Name of the player to select")

async def select_player(game_state, args: dict) -> str:
    """The game_state is captured via partial()."""
    name = args["player_name"]
    game_state.current_player = name
    return f"Selected {name}"

# Create action with game state bound
def get_game_actions(game):
    return [
        MAILAction.from_pydantic_model(
            model=SelectPlayerArgs,
            function=partial(select_player, game),
            name="select_player",
        ),
    ]

# Usage
game = GameState()
actions = get_game_actions(game)

narrator = MAILAgentTemplate(
    name="narrator",
    factory=LiteLLMActionAgentFunction,  # Has actions
    comm_targets=["player1", "player2"],
    actions=actions,
    agent_params={...},
)
```

This pattern is extensively used in `src/mail/examples/mafia/narrator_tools.py`.

## Agent Templates

### Factory Selection

The key distinction is **whether the agent can call `task_complete`** (supervisor role):

- **`LiteLLMSupervisorFunction`** — For agents that can complete tasks (call `task_complete`). Use this for your entrypoint/coordinator agent. Can also have actions.
- **`LiteLLMActionAgentFunction`** — For worker agents that have actions but cannot complete tasks.
- **`LiteLLMAgentFunction`** — Base agent with no actions, cannot complete tasks. Use when the agent only communicates via MAIL messages.

### Key Template Fields

```python
MAILAgentTemplate(
    name="agent_name",                    # Unique within swarm
    factory=LiteLLMSupervisorFunction,    # Or LiteLLMAgentFunction
    comm_targets=["other_agent"],         # Who this agent can message
    actions=[my_action],                  # List of MAILAction objects
    agent_params={                        # Passed to factory
        "llm": "anthropic/claude-sonnet-4-20250514",
        "system": "System prompt here",
        "use_proxy": False,
        "stream_tokens": True,            # Stream to terminal
        "reasoning_effort": "high",       # For extended thinking
        "default_tool_choice": "auto",    # Tool choice override
    },
    enable_entrypoint=True,               # Can receive user messages
    can_complete_tasks=True,              # Can call task_complete
    tool_format="completions",            # "completions" or "responses"
    exclude_tools=[],                     # MAIL tools to hide
)
```

**Note:** `tool_format` should be a top-level template field. OpenAI models use `"responses"`, Anthropic models use `"completions"`.

### Multi-Agent Example

```python
supervisor = MAILAgentTemplate(
    name="supervisor",
    factory=LiteLLMSupervisorFunction,
    comm_targets=["researcher", "coder"],
    actions=[],
    agent_params={
        "llm": "anthropic/claude-sonnet-4-20250514",
        "system": "You coordinate research and coding tasks.",
        "use_proxy": False,
    },
    enable_entrypoint=True,
    can_complete_tasks=True,
    tool_format="completions",
)

researcher = MAILAgentTemplate(
    name="researcher",
    factory=LiteLLMActionAgentFunction,  # Has actions
    comm_targets=["supervisor"],
    actions=[web_search_action],
    agent_params={
        "llm": "anthropic/claude-sonnet-4-20250514",
        "system": "You search the web and report findings.",
        "use_proxy": False,
    },
    tool_format="completions",
)

coder = MAILAgentTemplate(
    name="coder",
    factory=LiteLLMActionAgentFunction,  # Has actions
    comm_targets=["supervisor"],
    actions=[run_code_action],
    agent_params={
        "llm": "anthropic/claude-sonnet-4-20250514",
        "system": "You write and execute code.",
        "use_proxy": False,
    },
    tool_format="completions",
)
```

## Swarm Templates

```python
template = MAILSwarmTemplate(
    name="my_swarm",
    version="1.0.0",
    agents=[supervisor, researcher, coder],
    actions=[web_search_action, run_code_action],  # All actions used by any agent
    entrypoint="supervisor",                        # Must have enable_entrypoint=True
    breakpoint_tools=[],                            # Tools that pause execution
    exclude_tools=[],                               # MAIL tools to hide from all agents
)
```

The swarm template validates:
- Entrypoint agent exists and has `enable_entrypoint=True`
- At least one agent has `can_complete_tasks=True`
- All `comm_targets` reference valid agent names
- All breakpoint tools exist

## Running Swarms

### One-Shot Execution

Best for simple request-response patterns:

```python
swarm = template.instantiate({"user_token": "dummy"}, "user_123")

response, events = await swarm.post_message_and_run(
    body="What's 2 + 2?",
    subject="Math Question",
    show_events=True,  # Include SSE events
)

print(response["message"]["body"])  # Final answer
```

### Continuous Mode

Best for persistent swarms that handle multiple tasks:

```python
swarm = template.instantiate({"user_token": "dummy"}, "user_123")

# Start continuous processing in background
asyncio.create_task(swarm.run_continuous())

# Submit tasks (runtime handles queue)
response, events = await swarm.post_message(
    body="First task",
    subject="Task 1",
)

response2, events2 = await swarm.post_message(
    body="Second task",
    subject="Task 2",
)

# Shutdown when done
await swarm.shutdown()
```

### Streaming

```python
stream = await swarm.post_message_stream(body="Hello")
# Returns an EventSourceResponse; use it in a FastAPI endpoint
```

### Starting a Server

`MAILSwarmTemplate` has a convenience method to launch a FastAPI server with optional UI:

```python
template.start_server(
    port=8000,
    host="0.0.0.0",
    launch_ui=True,   # Starts Next.js dev server
    ui_port=3000,
    open_browser=True,
)
```

See `../MASter/scripts/GEPA/start_eval_swarm_server.py` for a complete example.

## Manual Mode

Manual mode gives you fine-grained control over agent stepping. This is useful for turn-based games, simulations, or debugging.

```python
# Start swarm in manual mode
swarm = template.instantiate({"user_token": "dummy"}, "game_session")
asyncio.create_task(swarm.run_continuous(mode="manual"))

# Initialize a task
task_id = str(uuid.uuid4())
init_msg = swarm.build_message(
    subject="::init::",
    body="Game starting",
    targets=["all"],
    sender_type="user",
    type="broadcast",
    task_id=task_id,
)
await swarm.submit_message_nowait(init_msg)

# Manually step specific agents
response = await swarm.manual_step(
    task_id=task_id,
    target="narrator",                # Agent to invoke
    response_targets=["all"],         # Who receives the response
    response_type="broadcast",        # "broadcast", "response", or "request"
    payload="Describe the scene.",    # Additional context for this step
)

# Step another agent
response = await swarm.manual_step(
    task_id=task_id,
    target="player1",
    response_targets=["narrator"],
    response_type="response",
    payload="What do you do?",
)
```

The mafia example (`src/mail/examples/mafia/game.py`) demonstrates this pattern extensively.

## Breakpoints — Pausing and Resuming

Breakpoint tools pause execution and return control to the caller:

```python
template = MAILSwarmTemplate(
    name="approval_swarm",
    version="1.0.0",
    agents=[...],
    actions=[human_review_action],
    entrypoint="supervisor",
    breakpoint_tools=["human_review"],  # This tool pauses execution
)

swarm = template.instantiate(...)

# Initial request
response, events = await swarm.post_message_and_run(
    body="Please process this document",
    task_id="task_123",
)

# If agent called human_review, task pauses and returns
# Response will indicate breakpoint was hit

# Resume with user-provided result
response, events = await swarm.post_message_and_run(
    body="",
    task_id="task_123",
    resume_from="breakpoint_tool_call",
    breakpoint_tool_call_result={"content": "Approved"},
)
```

### Parsing Breakpoint Tool Call Arguments

When a breakpoint tool is called, the response structure looks like this:

```python
response = {
    "message": {
        "subject": "::breakpoint_tool_call::",  # Indicates breakpoint was hit
        "body": "[{\"arguments\": \"{\\\"field\\\": \\\"value\\\"}\", \"name\": \"tool_name\", \"id\": \"call_...\"}]"
    }
}
```

Tool calls are standardized to OpenAI/LiteLLM format (arguments as JSON string). To extract:

```python
import json

def parse_breakpoint_tool_call(response: dict, tool_name: str) -> dict | None:
    """Extract tool call arguments from a breakpoint response."""
    message = response.get("message", {})
    subject = message.get("subject", "")
    body = message.get("body", "")

    if subject != "::breakpoint_tool_call::" or not body:
        return None

    body_data = json.loads(body)

    # Structure: [{"arguments": "{...}", "name": "tool_name", "id": "..."}]
    if isinstance(body_data, list) and len(body_data) > 0:
        for call in body_data:
            if call.get("name") == tool_name:
                args = call.get("arguments", "{}")
                # Arguments is a JSON string, parse it
                if isinstance(args, str):
                    return json.loads(args)
                return args

    return None
```

This pattern is useful for **structured output** scenarios where you want the LLM to call a tool with specific parameters rather than outputting freeform text.

## Common Patterns and Gotchas

### 1. `tool_format` Location

`tool_format` should be a top-level field on the agent template, not inside `agent_params`:

```python
MAILAgentTemplate(
    ...
    tool_format="completions",  # Top-level field (preferred)
    agent_params={...},         # Don't put tool_format here
)
```

If `tool_format` is found inside `agent_params`, a deprecation warning is logged and the top-level value takes precedence.

### 2. Solo Agent Swarm

A swarm with one agent needs both `enable_entrypoint=True` AND `can_complete_tasks=True`:

```python
solo = MAILAgentTemplate(
    name="solo",
    factory=LiteLLMSupervisorFunction,
    comm_targets=[],  # Empty is OK for solo
    actions=[],
    agent_params={...},
    enable_entrypoint=True,
    can_complete_tasks=True,
)
```

### 3. Action Must Return String

```python
# Wrong
async def my_action(args: dict) -> dict:
    return {"result": 123}

# Correct
async def my_action(args: dict) -> str:
    return json.dumps({"result": 123})
```

### 4. Closing Over State with `functools.partial`

```python
async def do_thing(state, args: dict) -> str:
    state.counter += 1
    return f"Count: {state.counter}"

# Create action with state bound
action = MAILAction.from_pydantic_model(
    model=DoThingArgs,
    function=partial(do_thing, my_state_object),
    name="do_thing",
)
```

### 5. OpenAI vs Anthropic `tool_format`

- OpenAI models: `tool_format="responses"`
- Anthropic models: `tool_format="completions"`

If you see tool parsing errors, check this first.
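
A convenience heuristic (not part of MAIL) that maps a LiteLLM-style provider prefix to the `tool_format` noted above might look like:

```python
def pick_tool_format(llm: str) -> str:
    # Heuristic only: OpenAI models use the Responses tool spec,
    # everything else defaults to chat-completions format.
    return "responses" if llm.startswith("openai/") else "completions"
```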

### 6. Waiting for Queue to Empty

In manual mode, wait for messages to be processed before stepping:

```python
await swarm.await_queue_empty()
response = await swarm.manual_step(...)
```

### 7. Dynamic Context Ratio

When using `manual_step`, the `dynamic_ctx_ratio` parameter (0.0-1.0) controls how much of the agent's context window is reserved for conversation history vs. static content:

```python
await swarm.manual_step(
    task_id=task_id,
    target="agent",
    dynamic_ctx_ratio=0.75,  # 75% for dynamic content
    ...
)
```

## Quick Reference

| What you want | How to do it |
|---------------|--------------|
| Define a tool | `@action` decorator or `MAILAction.from_pydantic_model()` |
| Define an agent | `MAILAgentTemplate(...)` |
| Define a swarm | `MAILSwarmTemplate(...)` |
| Create runnable swarm | `template.instantiate(params, user_id)` |
| One-shot task | `await swarm.post_message_and_run(body=...)` |
| Persistent runtime | `asyncio.create_task(swarm.run_continuous())` |
| Manual stepping | `swarm.run_continuous(mode="manual")` then `swarm.manual_step(...)` |
| Pause on tool call | Add tool name to `breakpoint_tools` |
| Resume from pause | `resume_from="breakpoint_tool_call"` |
| Stream events | `await swarm.post_message_stream(...)` |
| Launch server + UI | `template.start_server(port=8000, launch_ui=True)` |

## Example Files Worth Reading

| File | What it demonstrates |
|------|---------------------|
| `src/mail/examples/mafia/game.py` | Manual mode, complex game loop |
| `src/mail/examples/mafia/narrator_tools.py` | `MAILAction.from_pydantic_model` with closures |
| `../MASter/src/master/prebuilt/agents.py` | Agent template factory patterns |
| `../MASter/scripts/GEPA/start_eval_swarm_server.py` | Server startup with template |


===== End of `docs/CLAUDE_MAIL_PRIMER.md` =====

===== `docs/api.md` =====

# API Surfaces

The MAIL Python reference implementation exposes two integration layers: an **HTTP surface** for remote clients and a **Python surface** for embedding the runtime. Both surfaces operate on the same MAIL message schema defined in [src/mail/core/message.py](/src/mail/core/message.py).

## HTTP API

The server exposes a [FastAPI application](/src/mail/server.py) with endpoints for **user messaging**, **interswarm routing**, and **registry management**. The generated OpenAPI description lives in [spec/openapi.yaml](/spec/openapi.yaml).

### Auth model
- **All non-root endpoints** require `Authorization: Bearer <token>`
- **Tokens** are validated against `TOKEN_INFO_ENDPOINT`, which must respond with `{ role, id, api_key }`
- Supported **roles** map to helpers in [src/mail/utils/auth.py](/src/mail/utils/auth.py): `caller_is_admin`, `caller_is_user`, `caller_is_agent`, and `caller_is_admin_or_user`

### Endpoint reference

| Method | Path | Auth required | Request body | Response body | Summary |
| --- | --- | --- | --- | --- | --- |
| GET | `/` | None (public) | `None` | `types.GetRootResponse { name, status, protocol_version, swarm: SwarmInfo, uptime }` | Returns MAIL service metadata and version string. `SwarmInfo` includes `name`, `version`, `description`, `entrypoint`, `keywords`, and `public`. |
| GET | `/status` | `Bearer` token with role `admin` or `user` | `None` | `types.GetStatusResponse { swarm, active_users, user_mail_ready, user_task_running }` | Reports persistent swarm readiness and whether the caller already has a running runtime |
| GET | `/whoami` | `Bearer` token with role `admin` or `user` | `None` | `types.GetWhoamiResponse { id, role }` | Returns the caller identifier and role associated with the provided token |
| POST | `/message` | `Bearer` token with role `admin` or `user` | `JSON { subject: str, body: str, msg_type?: str, entrypoint?: str, show_events?: bool, stream?: bool, task_id?: str, resume_from?: str, kwargs?: dict }` | `types.PostMessageResponse { response: str, events?: list[ServerSentEvent] }` (or `text/event-stream` when `stream: true`) | Queues or resumes a user-scoped task; supports breakpoint resumes via `resume_from="breakpoint_tool_call"` and extra kwargs |
| GET | `/tasks` | `Bearer` token with role `admin` or `user` | `None` | `dict[str, TaskRecord]` | Lists all tasks owned by the caller together with runtime metadata |
| GET | `/task` | `Bearer` token with role `admin` or `user` | `JSON { task_id: str }` | `TaskRecord` | Returns the full record for a single task, including SSE history and queue snapshot |
| GET | `/health` | None (public) | `None` | `types.GetHealthResponse { status, swarm_name, timestamp }` | Liveness signal used for interswarm discovery |
| POST | `/health` | `Bearer` token with role `admin` | `JSON { status: str }` | `types.GetHealthResponse { status, swarm_name, timestamp }` | Updates the health status reported to other swarms |
| GET | `/swarms` | None (public) | `None` | `types.GetSwarmsResponse { swarms: list[types.SwarmEndpointCleaned] }` | Lists *public* swarms known to the local registry. Returns cleaned endpoints (auth tokens hidden) with fields: `swarm_name`, `base_url`, `version`, `last_seen`, `is_active`, `latency`, `swarm_description`, `keywords`, `metadata`. |
| POST | `/swarms` | `Bearer` token with role `admin` | `JSON { name: str, base_url: str, auth_token?: str, metadata?: dict, volatile?: bool }` | `types.PostSwarmsResponse { status, swarm_name }` | Registers a remote swarm (persistent when `volatile` is `False`) |
| GET | `/swarms/dump` | `Bearer` token with role `admin` | `None` | `types.GetSwarmsDumpResponse { status, swarm_name }` | Logs the configured persistent swarm and returns acknowledgement |
| POST | `/interswarm/forward` | `Bearer` token with role `agent` | `JSON { message: MAILInterswarmMessage }` | `types.PostInterswarmForwardResponse { swarm, task_id, status, local_runner }` | Accepts a remote swarm's new-task payload and spawns/attaches a local runtime |
| POST | `/interswarm/back` | `Bearer` token with role `agent` | `JSON { message: MAILInterswarmMessage }` | `types.PostInterswarmBackResponse { swarm, task_id, status, local_runner }` | Injects a follow-up or completion payload from the remote swarm into the active local runtime |
| POST | `/interswarm/message` | `Bearer` token with role `admin` or `user` | `JSON { user_token: str, body: str, targets: list[str], subject?: str, msg_type?: Literal["request","broadcast"], task_id?: str, routing_info?: dict, stream?: bool, ignore_stream_pings?: bool }` | `types.PostInterswarmMessageResponse { response: MAILMessage, events?: list[ServerSentEvent] }` | Proxies a user/admin task to a remote swarm using the caller's runtime and interswarm router |
| POST | `/swarms/load` | `Bearer` token with role `admin` | `JSON { json: str }` (serialized swarm template) | `types.PostSwarmsLoadResponse { status, swarm_name }` | Replaces the persistent swarm template using a JSON document |
| POST | `/responses` | `Bearer` token with role `admin` or `user` (debug mode only) | `JSON { input: list[dict], tools: list[dict], instructions?: str, previous_response_id?: str, tool_choice?: str \| dict, parallel_tool_calls?: bool, kwargs?: dict }` | `openai.types.responses.Response` | OpenAI Responses-compatible bridge available when the server runs with `debug` enabled; not included in the public OpenAPI spec |

**TaskRecord** aligns with [`mail.core.tasks.MAILTask`](/src/mail/core/tasks.py):
- `task_id`, `task_owner`, `task_contributors`, `start_time`, `is_running`, `completed`, `remote_swarms` summarize runtime status
- `events` echoes the task's Server-Sent Event log (each entry is serialized with `event`, `id`, `retry`, `data`)
- `task_message_queue` contains any stashed downstream messages used when resuming paused work
`GET /task` expects a small JSON body with `task_id` even though it is a GET request—this keeps the signature consistent with other task-management helpers without exposing the identifier in query parameters.

### SSE streaming
- `POST /message` with `stream: true` yields a `text/event-stream`
- **Events** include periodic `ping` heartbeats and terminate with `task_complete` carrying the final serialized response
- When resuming a task from a breakpoint tool call, provide `resume_from="breakpoint_tool_call"` and include `breakpoint_tool_call_result` inside `kwargs`. Pass a JSON string, dict, or list that represents either a single tool response (`{"content": "..."}`) or a list of responses (`[{"call_id": "...", "content": "..."}]`) so the runtime can fan the outputs back to the corresponding breakpoint tool calls.
- `POST /interswarm/message` accepts the same customization flags as local messaging. Use `msg_type="request"` with a single-element `targets` list, or `msg_type="broadcast"` with one or more entries. Include `stream` / `ignore_stream_pings` to mirror local streaming; the server copies those hints into the interswarm `routing_info` it sends downstream.
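The stream framing described above can be inspected without the bundled client; the following is a minimal, stdlib-only parsing sketch (event names `ping` and `task_complete` come from the bullets above; a full SSE parser would also handle `id:`, `retry:`, and multi-line `data:` joining):

```python
import json

def parse_sse(raw: str) -> list[dict]:
    """Split a text/event-stream body into {event, data} records."""
    events = []
    for block in raw.strip().split("\n\n"):
        event = {"event": "message", "data": ""}
        for line in block.splitlines():
            if line.startswith("event:"):
                event["event"] = line[len("event:"):].strip()
            elif line.startswith("data:"):
                event["data"] += line[len("data:"):].strip()
        events.append(event)
    return events

# A stream that heartbeats once, then closes with the final response
stream = (
    "event: ping\ndata: {}\n\n"
    'event: task_complete\ndata: {"response": "done"}\n\n'
)
events = parse_sse(stream)
final = [e for e in events if e["event"] == "task_complete"][0]
print(json.loads(final["data"])["response"])  # -> done
```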

### Debug mode & OpenAI compatibility
- Enabling server debug mode (`mail server --debug` or `[server].debug = true`) bootstraps a `SwarmOAIClient` alongside the FastAPI app so it can mirror OpenAI's `/responses` API.
- `POST /responses` expects the OpenAI-style `input`, `tools`, `instructions`, `previous_response_id`, and other optional fields. The caller is authenticated via the normal `Authorization: Bearer ...` header, which is used to hydrate or reuse the caller's MAIL runtime before piping the request into the OpenAI bridge.
- Responses conform to `openai.types.responses.Response`, letting you plug a MAIL swarm behind clients or SDKs that already speak the OpenAI Responses protocol. Because it is debug-only, the route is hidden from the generated OpenAPI document.
- Wrap it from code via `MAILClient.debug_post_responses(...)` or from the REPL using `mail client responses …` (see [client.md](/docs/client.md) and [cli.md](/docs/cli.md) for usage).
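A request body for the debug-only `POST /responses` route can be assembled as plain JSON. The sketch below uses only the field names from the route table; the message-list shape inside `input` follows OpenAI's Responses convention and is an assumption here:

```python
import json

# Hypothetical minimal body for POST /responses (debug mode only);
# "input" follows the OpenAI Responses message-list convention.
body = {
    "input": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [],
    "instructions": "Answer via the MAIL swarm.",
}
payload = json.dumps(body)
print(json.loads(payload)["input"][0]["role"])  # -> user
```

The `Authorization: Bearer ...` header supplies authentication, as with every other route.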

### Error handling
- FastAPI raises **standard HTTP errors** with a `detail` field
- The runtime emits **structured MAIL error responses** when routing or execution fails

### Notes
- The server keeps a persistent `MAILSwarmTemplate` catalogue and per-user `MAILSwarm` instances
- **Message schemas** are documented in [docs/message-format.md](/docs/message-format.md) and [spec/](/spec/SPEC.md)
- The repository ships an asynchronous helper described in [docs/client.md](/docs/client.md) that wraps these endpoints and handles bearer auth + SSE parsing
- **Task lifecycle**: Each `POST /message` participates in a long-lived task distinguished by `task_id`. Breakpoint-aware tools can pause a task; clients resume by reusing the same `task_id` with the `resume_from` contract described above. Both `resume_from="breakpoint_tool_call"` (supply tool output via `kwargs`) and `resume_from="user_response"` (send another user-authored message) are supported.

### MAILClient helper
- `MAILClient` (see [client.md](/docs/client.md)) mirrors every route above with ergonomic async methods
- Supports bearer tokens, custom timeouts, and optional externally managed `aiohttp.ClientSession`
- Provides `post_message_stream()` to yield `ServerSentEvent` objects without recreating SSE parsing logic
- Used by automated tests and demo scripts (`scripts/demo_client.py`) to validate client/server interoperability

## Python API

The Python surface is designed for embedding MAIL inside other applications, building custom swarms, or scripting tests. The primary exports live in [src/mail/\_\_init\_\_.py](/src/mail/__init__.py) and re-export key classes from `mail.api` and `mail.core`.

### Imports and modules
- To obtain **high-level builder classes**:
  ```python
  from mail import (
      MAILAgent,
      MAILAgentTemplate,
      MAILAction,
      MAILSwarm,
      MAILSwarmTemplate,
  )
  ```
- To obtain **protocol types**:
  ```python
  from mail import (
      MAILMessage,
      MAILRequest,
      MAILResponse,
      MAILBroadcast,
      MAILInterrupt,
      AgentToolCall,
  )
  ```
- To obtain **network helpers** for interswarm support:
  ```python
  from mail.net import SwarmRegistry, InterswarmRouter
  ```
- To work directly with the lower-level runtime primitives:
  ```python
  from mail.core import AgentCore, ActionCore
  ```
- `mail.utils` bundles token helpers, logging utilities, dynamic function loading via `read_python_string`, and interswarm address parsing
- `mail.swarms_json.utils` provides helpers for loading and validating `swarms.json` content before instantiating templates. Beyond basic field-level checks, `validate_swarm_from_swarms_json` runs cross-validation that catches common wiring mistakes at parse time (typo'd entrypoints, invalid comm_targets, missing supervisors, duplicate agent names, etc.) and includes fuzzy-match "Did you mean '...'?" suggestions in error messages
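The "Did you mean '...'?" behavior can be approximated with stdlib `difflib`. This is a simplified, hypothetical sketch of the kind of cross-check `validate_swarm_from_swarms_json` is described as performing, not the actual implementation:

```python
import difflib

def check_comm_targets(agents: dict[str, list[str]]) -> list[str]:
    """Flag comm_targets that name unknown agents, with a fuzzy suggestion."""
    errors = []
    known = list(agents)
    for name, targets in agents.items():
        for target in targets:
            if target not in known:
                hint = difflib.get_close_matches(target, known, n=1)
                suffix = f" Did you mean '{hint[0]}'?" if hint else ""
                errors.append(f"{name}: unknown comm_target '{target}'.{suffix}")
    return errors

# A typo'd target ("wether") is caught and a close match is suggested
agents = {"supervisor": ["wether"], "weather": ["supervisor"]}
print(check_comm_targets(agents))
```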

### Class reference

#### `MAILAction` (`mail.api`)
- **Summary**: Describes an action/tool exposed by an agent; wraps a callable with metadata for OpenAI tools.
- **Constructor parameters**: `name: str`, `description: str`, `parameters: dict[str, Any]` (JSONSchema-like), `function: str | ActionFunction` (import string or callable).
- **Key methods**:
  - `from_pydantic_model(model, function_str, name?, description?) -> MAILAction`: build from a Pydantic model definition.
  - `from_swarm_json(json_str) -> MAILAction`: rebuild from persisted `swarms.json` entries.
  - `to_tool_dict(style="responses"|"completions") -> dict[str, Any]`: emit an OpenAI-compatible tool declaration.
  - `to_pydantic_model(for_tools: bool = False) -> type[BaseModel]`: create a Pydantic model for validation or schema reuse.
  - `_validate() -> None` and `_build_action_function(function) -> ActionFunction`: internal guards and loader utilities.

#### `action` decorator (`mail.api`)
- **Summary**: Decorator that turns a Python callable into a `MAILAction`, wiring up schema validation and tool metadata automatically.
- **Parameters**:
  - `name: str | None` – optional override; defaults to the function name.
  - `description: str | None` – required unless supplied via docstring.
  - `model: type[BaseModel] | None` – payload schema; inferred from the first argument annotation when it is a `BaseModel` subclass.
  - `parameters: dict[str, Any] | None` – manual JSON schema (mutually exclusive with `model`).
  - `style: Literal["responses", "completions"]` – schema flavor passed to `pydantic_model_to_tool` (default `"responses"`).
- **Usage**:
  ```python
  from pydantic import BaseModel
  from mail import action

  class WeatherRequest(BaseModel):
      city: str

  @action(description="Return weather information for the requested city.")
  async def get_weather(payload: WeatherRequest) -> str:
      forecast = lookup_forecast(payload.city)
      return forecast.json()

  # get_weather is now a MAILAction ready to install on an agent:
  weather_action = get_weather
  ```

#### `MAILAgent` (`mail.api`)
- **Summary**: Concrete runtime agent produced by an agent factory and associated actions.
- **Constructor parameters**: `name: str`, `factory: str | Callable`, `actions: list[MAILAction]`, `function: AgentFunction`, `comm_targets: list[str]`, `agent_params: dict[str, Any]`, `enable_entrypoint: bool = False`, `enable_interswarm: bool = False`, `can_complete_tasks: bool = False`, `tool_format: Literal["completions", "responses"] = "responses"`, `exclude_tools: list[str] | None = None`.
- **Key methods**:
  - `__call__(messages, tool_choice="required") -> Awaitable[tuple[str | None, list[AgentToolCall]]]`: execute the agent implementation.
  - `_to_template(names: list[str]) -> MAILAgentTemplate`: internal helper that trims targets for sub-swarms.
  - `_validate() -> None`: internal guard ensuring agent metadata is coherent.
- Factories may be supplied as dotted import strings (resolved via `read_python_string`) or as preloaded callables.

#### `MAILAgentTemplate` (`mail.api`)
- **Summary**: Declarative agent description used for persistence, cloning, and factory instantiation.
- **Constructor parameters**: `name: str`, `factory: str | Callable`, `comm_targets: list[str]`, `actions: list[MAILAction]`, `agent_params: dict[str, Any]`, `enable_entrypoint: bool = False`, `enable_interswarm: bool = False`, `can_complete_tasks: bool = False`, `tool_format: Literal["completions", "responses"] = "responses"`, `exclude_tools: list[str] | None = None`.
- **Key methods**:
  - `instantiate(instance_params: dict[str, Any]) -> MAILAgent`: load the factory and produce a concrete `MAILAgent`.
  - `from_swarm_json(json_str, actions_by_name: dict[str, MAILAction] | None = None) -> MAILAgentTemplate`: rebuild from `swarms.json` entries, optionally supplying pre-built actions to resolve `actions` references efficiently.
  - `from_example(name, comm_targets) -> MAILAgentTemplate`: load bundled examples (`supervisor`, `weather`, `math`, `consultant`, `analyst`).
  - `_top_level_params() -> dict[str, Any]` and `_validate() -> None`: internal helpers used during instantiation and validation.
- Accepts either dotted import strings or callables for `factory`, enabling JSON-driven and dynamic runtime construction alike.
- Recursively resolves `python::module:object` and `url::https://...` string prefixes in `agent_params` (and nested structures) so templates can reference code exports or remote JSON payloads without manual preprocessing.

#### `MAILSwarm` (`mail.api`)
- **Summary**: Runtime container that owns instantiated agents/actions and embeds a `MAILRuntime`.
- **Constructor parameters**: `name: str`, `version: str`, `agents: list[MAILAgent]`, `actions: list[MAILAction]`, `entrypoint: str`, `user_id: str = "default"`, `user_role: Literal["admin","agent","user"] = "user"`, `swarm_registry: SwarmRegistry | None = None`, `enable_interswarm: bool = False`, `breakpoint_tools: list[str] = []`, `exclude_tools: list[str] = []`, `task_message_limit: int | None = None`, `description: str = ""`, `keywords: list[str] = []`, `enable_db_agent_histories: bool = False`, `print_llm_streams: bool = True`.
- **Key methods**:
  - `post_message(...)`, `post_message_stream(...)`, `post_message_and_run(...)`: enqueue user requests (optionally streaming or running to completion).
  - `submit_message(...)`, `submit_message_stream(...)`: submit fully-formed `MAILMessage` envelopes.
  - `run_continuous(action_override: ActionOverrideFunction | None = None) -> Awaitable[None]`: long-running loop for user sessions.
  - `shutdown()`, `start_interswarm()`, `stop_interswarm()`, `is_interswarm_running()`: lifecycle and interswarm controls.
  - `handle_interswarm_response(response_message) -> Awaitable[None]`: process responses from remote swarms.
  - `route_interswarm_message(message) -> Awaitable[MAILMessage]`: send outbound interswarm traffic via the router.
  - `get_pending_requests() -> dict[str, asyncio.Future[MAILMessage]]`: inspect outstanding requests per task.
  - `update_from_adjacency_matrix(adj: list[list[int]]) -> None`: overwrite agent communication targets using an adjacency matrix.
  - `get_subswarm(names, name_suffix, entrypoint?) -> MAILSwarmTemplate`: derive a sub-template focused on a subset of agents.
  - `build_message(subject, body, targets, sender_type?, type?) -> MAILMessage`: utility for crafting MAIL envelopes.
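The adjacency-matrix convention used by `update_from_adjacency_matrix` can be illustrated standalone. The sketch below assumes row *i* lists the agents that agent *i* may message (the exact orientation is an assumption; the real method mutates agent `comm_targets` in place):

```python
def adjacency_to_comm_targets(adj: list[list[int]], names: list[str]) -> dict[str, list[str]]:
    """Translate an adjacency matrix into per-agent comm_targets.

    Assumes adj[i][j] == 1 means names[i] may message names[j].
    """
    return {
        names[i]: [names[j] for j, flag in enumerate(row) if flag and i != j]
        for i, row in enumerate(adj)
    }

names = ["supervisor", "weather", "math"]
adj = [
    [0, 1, 1],  # supervisor -> weather, math
    [1, 0, 0],  # weather -> supervisor
    [1, 0, 0],  # math -> supervisor
]
print(adjacency_to_comm_targets(adj, names)["supervisor"])  # -> ['weather', 'math']
```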

#### `MAILSwarmTemplate` (`mail.api`)
- **Summary**: Immutable swarm blueprint comprised of `MAILAgentTemplate`s and shared actions.
- **Notes**: Inline definitions from `actions` may be combined with `action_imports` that resolve to decorated `MAILAction` objects (e.g., from `mail.stdlib`).
- **Constructor parameters**: `name: str`, `version: str`, `agents: list[MAILAgentTemplate]`, `actions: list[MAILAction]`, `entrypoint: str`, `enable_interswarm: bool = False`, `breakpoint_tools: list[str] = []`, `exclude_tools: list[str] = []`, `task_message_limit: int | None = None`, `description: str = ""`, `keywords: list[str] = []`, `public: bool = False`, `enable_db_agent_histories: bool = False`.
- **Key methods**:
  - `instantiate(instance_params, user_id?, user_role?, base_url?, registry_file?, print_llm_streams?) -> MAILSwarm`: produce a runtime swarm (creates `SwarmRegistry` when interswarm is enabled).
  - `get_subswarm(names, name_suffix, entrypoint?) -> MAILSwarmTemplate`: filter agents into a smaller template while preserving supervisors and entrypoints.
  - `update_from_adjacency_matrix(adj: list[list[int]]) -> None`: sync template wiring back to `comm_targets` for each agent.
  - `from_swarm_json(json_str) -> MAILSwarmTemplate` / `from_swarm_json_file(swarm_name, json_filepath?) -> MAILSwarmTemplate`: rebuild from persisted JSON.
  - `_build_adjacency_matrix() -> tuple[list[list[int]], list[str]]`, `_validate() -> None`: internal helpers.

#### `AgentToolCall` (`mail.core.tools`)
- **Summary**: Pydantic model capturing the outcome of an OpenAI tool invocation.
- **Fields**: `tool_name: str`, `tool_args: dict[str, Any]`, `tool_call_id: str`, `completion: dict[str, Any]`, `responses: list[dict[str, Any]]`, `reasoning: list[str] | None`, `preamble: str | None`.
- **Key methods**:
  - `create_response_msg(content: str) -> dict[str, str]`: format a response payload for the Chat Completions or Responses API.
  - `model_validator` (after-init) enforces that either `completion` or `responses` is populated.

#### `MAILRuntime` (`mail.core.runtime`)
- **Summary**: Asynchronous runtime that owns the internal message queue, tool execution, and optional interswarm router.
- **Constructor parameters**: `agents: dict[str, AgentCore]`, `actions: dict[str, ActionCore]`, `user_id: str`, `user_role: Literal["admin","agent","user"]`, `swarm_name: str = "example"`, `entrypoint: str = "supervisor"`, `swarm_registry: SwarmRegistry | None = None`, `enable_interswarm: bool = False`, `breakpoint_tools: list[str] | None = None`, `exclude_tools: list[str] | None = None`, `enable_db_agent_histories: bool = False`, `print_llm_streams: bool = True`.
- Pass the lower-level `AgentCore` / `ActionCore` objects (for example via `MAILAgent.to_core()` and `MAILAction.to_core()`) when instantiating the runtime directly.
- `print_llm_streams` is applied recursively to known agent-function wrappers (`supervisor_fn`, `action_agent_fn`, `_mail_agent`) so a runtime can centrally suppress local LLM stream printing without changing each agent definition.
- **Key methods**:
  - `start_interswarm()`, `stop_interswarm()`, `is_interswarm_running()`.
  - `handle_interswarm_response(response_message)` and internal `_handle_local_message(message)`.
  - `run()` and `run_continuous(action_override?)`: main scheduling loops.
  - `submit(message)`, `submit_and_wait(message, timeout)`, `submit_and_stream(message, timeout)`: queue management helpers.
  - `shutdown()` (and `_graceful_shutdown()`) for orderly teardown.
  - `get_events_by_task_id(task_id) -> list[ServerSentEvent]`: retrieve accumulated SSE events.
  - Attributes such as `pending_requests`, `events`, and `response_queue` expose runtime state.

#### `SwarmRegistry` (`mail.net.registry`)
- **Summary**: Tracks known swarm endpoints, performs health checks, and persists non-volatile registrations.
- **Constructor parameters**: `local_swarm_name: str`, `local_base_url: str`, `persistence_file: str | None = None`, `local_swarm_description: str = ""`, `local_swarm_keywords: list[str] | None = None`, `local_swarm_public: bool = False`.
- **Key methods**:
  - `register_local_swarm(base_url)`, `register_swarm(...)`, `unregister_swarm(swarm_name)`.
  - `get_swarm_endpoint(swarm_name)`, `get_resolved_auth_token(swarm_name)`, `get_all_endpoints()`, `get_active_endpoints()`, `get_persistent_endpoints()`.
  - `save_persistent_endpoints()`, `load_persistent_endpoints()`, `cleanup_volatile_endpoints()`.
  - `start_health_checks()`, `stop_health_checks()`, `discover_swarms(discovery_urls)`: manage background discovery and health loops.
  - Utility helpers for token handling: `_get_auth_token_ref`, `_resolve_auth_token_ref`, `migrate_auth_tokens_to_env_refs`, `validate_environment_variables()`.
  - Serialization helpers: `to_dict()`.

#### `InterswarmRouter` (`mail.net.router`)
- **Summary**: HTTP router that pushes MAIL messages to local handlers or remote swarms using the registry.
- **Constructor parameters**: `swarm_registry: SwarmRegistry`, `local_swarm_name: str`.
- **Key methods**:
  - `start()` / `stop()` / `is_running()` manage the shared `aiohttp` session.
  - `register_message_handler(message_type, handler)` wires local callbacks.
  - `route_message(message) -> Awaitable[MAILMessage]`: choose local vs remote delivery.
  - Internal helpers `_route_to_local_agent`, `_route_to_remote_swarm`, `_create_local_message`, `_create_remote_message`, `_system_router_message` support routing decisions.

### Message typed dictionaries (`mail.core.message`)

#### `MAILAddress`
```python
{ 
    address_type: Literal["admin", "agent", "user", "system"], 
    address: str 
}
```
#### `MAILRequest`
```python
{ 
    task_id: str,
    request_id: str,
    sender: MAILAddress,
    recipient: MAILAddress,
    subject: str,
    body: str,
    sender_swarm: str | None,
    recipient_swarm: str | None,
    routing_info: dict[str, Any] | None 
}
```
#### `MAILResponse`
```python
{ 
    task_id: str,
    request_id: str,
    sender: MAILAddress,
    recipient: MAILAddress, 
    subject: str, 
    body: str,
    sender_swarm: str | None,
    recipient_swarm: str | None,
    routing_info: dict[str, Any] | None 
}
```
#### `MAILBroadcast`
```python
{
    task_id: str, 
    broadcast_id: str, 
    sender: MAILAddress, 
    recipients: list[MAILAddress],
    subject: str,
    body: str,
    sender_swarm: str | None,
    recipient_swarms: list[str] | None,
    routing_info: dict[str, Any] | None 
}
```
#### `MAILInterrupt`
```python
{ 
    task_id: str,
    interrupt_id: str,
    sender: MAILAddress,
    recipients: list[MAILAddress],
    subject: str,
    body: str,
    sender_swarm: str | None,
    recipient_swarms: list[str] | None,
    routing_info: dict[str, Any] | None 
}
```
#### `MAILInterswarmMessage`
```python
{ 
    message_id: str,
    source_swarm: str,
    target_swarm: str,
    timestamp: str,
    task_owner: str,
    task_contributors: list[str],
    payload: MAILRequest | MAILResponse | MAILBroadcast | MAILInterrupt,
    msg_type: Literal["request", "response", "broadcast", "interrupt"],
    auth_token: str | None,
    metadata: dict[str, Any] | None 
}
```
#### `MAILMessage`
```python
{
    id: str,
    timestamp: str,
    message: MAILRequest | MAILResponse | MAILBroadcast | MAILInterrupt,
    msg_type: Literal["request", "response", "broadcast", "interrupt", "broadcast_complete"] 
}
```
- **Helper utilities**: `parse_agent_address`, `format_agent_address`, `create_agent_address`, `create_user_address`, `create_system_address`, `build_body_xml`, `build_mail_xml`.
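Assuming the typed-dictionary shapes above, a `MAILMessage` envelope can be built by hand with the stdlib (in practice the address helpers such as `create_user_address` and `create_agent_address` construct the `MAILAddress` dicts for you):

```python
import uuid
from datetime import datetime, timezone

# Hand-built envelope matching the MAILRequest / MAILMessage shapes above
task_id = str(uuid.uuid4())
request: dict = {
    "task_id": task_id,
    "request_id": str(uuid.uuid4()),
    "sender": {"address_type": "user", "address": "alice"},
    "recipient": {"address_type": "agent", "address": "supervisor"},
    "subject": "New task",
    "body": "Summarize today's weather.",
    "sender_swarm": None,
    "recipient_swarm": None,
    "routing_info": None,
}
message: dict = {
    "id": str(uuid.uuid4()),
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "message": request,
    "msg_type": "request",
}
print(message["msg_type"], message["message"]["recipient"]["address"])
```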

### Function reference

#### `mail.core.tools`
##### `pydantic_model_to_tool`
```python
def pydantic_model_to_tool(
    model_cls,
    name=None,
    description=None,
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `model_cls: type[BaseModel]` – Pydantic model describing the tool payload; `name: str | None` – optional override for the tool name; `description: str | None` – supplemental natural language description; `style: Literal["completions", "responses"]` – which OpenAI API surface the schema will target.
  - **Returns**: `dict[str, Any]` – Tool metadata in the shape expected by the chosen OpenAI API.
  - **Summary**: Wraps Pydantic models with OpenAI metadata so MAIL agents can advertise structured tool calls across both the Chat Completions and Responses APIs.
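For orientation, this is a hand-written sketch of what a `"completions"`-style declaration looks like for a one-field payload model; the exact dict emitted by `pydantic_model_to_tool` may differ in detail, and the wrapper shape shown follows OpenAI's Chat Completions tool convention:

```python
# JSON schema that a Pydantic model with a single `city: str` field produces
schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}
# Chat Completions-style tool wrapper around that schema
tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return weather information for the requested city.",
        "parameters": schema,
    },
}
print(tool["function"]["name"])  # -> get_weather
```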
##### `convert_call_to_mail_message`
```python
def convert_call_to_mail_message(
    call,
    sender,
    task_id
) -> MAILMessage
```
  - **Parameters**: `call: AgentToolCall` – serialized OpenAI tool invocation captured from the LLM; `sender: str` – MAIL agent name that issued the tool call; `task_id: str` – runtime task identifier tying the message to a conversation loop.
  - **Returns**: `MAILMessage` – Fully populated MAIL envelope ready for routing (request, response, broadcast, interrupt, or completion broadcast).
  - **Summary**: Normalizes OpenAI tool executions into canonical MAIL messages, setting message IDs, timestamps, and typed payloads so downstream routers can deliver them without additional parsing.
##### `create_request_tool`
```python
def create_request_tool(
    targets,
    enable_interswarm=False,
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `targets: list[str]` – approved in-swarm recipients for outgoing requests; `enable_interswarm: bool` – toggles free-form `agent@swarm` addressing; `style: Literal["completions", "responses"]` – OpenAI API surface to tailor schema for.
  - **Returns**: `dict[str, Any]` – OpenAI tool definition whose schema enforces MAIL request fields.
  - **Summary**: Produces a constrained `send_request` tool that lets agents originate MAIL requests while guarding the recipient list and optionally annotating interswarm routing hints.
##### `create_response_tool`
```python
def create_response_tool(
    targets,
    enable_interswarm=False,
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `targets: list[str]` – eligible response recipients; `enable_interswarm: bool` – permits remote swarm addressing when true; `style: Literal["completions", "responses"]` – selects schema layout for the target OpenAI API.
  - **Returns**: `dict[str, Any]` – OpenAI tool description for the `send_response` helper.
  - **Summary**: Mirrors `create_request_tool` but directs the payload through the MAIL response channel so agents can close loops or send follow-ups with correct metadata.
##### `create_interrupt_tool`
```python
def create_interrupt_tool(
    targets,
    enable_interswarm=False,
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `targets: list[str]` – agents whose execution can be interrupted; `enable_interswarm: bool` – expands targeting to `agent@swarm`; `style: Literal["completions", "responses"]` – determines tool schema format.
  - **Returns**: `dict[str, Any]` – OpenAI definition for the `send_interrupt` tool.
  - **Summary**: Enables supervisor-style interventions by emitting MAIL interrupt envelopes that pause or redirect downstream agents, preserving target validation rules.
##### `create_interswarm_broadcast_tool`
```python
def create_interswarm_broadcast_tool(
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `style: Literal["completions", "responses"]` – OpenAI API variant that should consume the tool description.
  - **Returns**: `dict[str, Any]` – Tool metadata for `send_interswarm_broadcast`.
  - **Summary**: Provides supervisors with a broadcast primitive that targets multiple remote swarms, including optional filtering of destination swarm names.
##### `create_swarm_discovery_tool`
```python
def create_swarm_discovery_tool(
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `style: Literal["completions", "responses"]` – dictates OpenAI schema flavor.
  - **Returns**: `dict[str, Any]` – Tool definition for `discover_swarms`.
  - **Summary**: Lets supervisors push discovery endpoint URLs into the registry so the runtime can crawl and register additional swarms on demand.
##### `create_broadcast_tool`
```python
def create_broadcast_tool(
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `style: Literal["completions", "responses"]` – OpenAI API compatibility toggle.
  - **Returns**: `dict[str, Any]` – Tool metadata for `send_broadcast`.
  - **Summary**: Issues swarm-wide broadcasts inside the local runtime, allowing supervisors to disseminate guidance or status simultaneously to every agent.
##### `create_acknowledge_broadcast_tool`
```python
def create_acknowledge_broadcast_tool(
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `style: Literal["completions", "responses"]` – chooses schema variant for OpenAI tools.
  - **Returns**: `dict[str, Any]` – Tool payload describing `acknowledge_broadcast`.
  - **Summary**: Gives agents a non-disruptive acknowledgement path that stores incoming broadcasts in local memory without generating MAIL traffic.
##### `create_ignore_broadcast_tool` 
```python
def create_ignore_broadcast_tool(
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `style: Literal["completions", "responses"]` – determines returned schema format.
  - **Returns**: `dict[str, Any]` – Tool metadata for `ignore_broadcast`.
  - **Summary**: Allows agents to discard a broadcast intentionally, optionally recording an internal reason while ensuring no acknowledgement is emitted.
##### `create_await_message_tool`
```python
def create_await_message_tool(
    style="completions"
) -> dict[str, Any]
```
- **Parameters**: `style: Literal["completions", "responses"]` – specifies the OpenAI schema flavor to emit.
- **Returns**: `dict[str, Any]` – Tool description for `await_message` with an optional `reason` field.
- **Summary**: Gives agents a MAIL-native way to yield their turn once they have no additional output; the optional reason is surfaced in runtime events and tool-call history for observability.
##### `create_help_tool`
```python
def create_help_tool(
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `style: Literal["completions", "responses"]` – determines the OpenAI schema format returned.
  - **Returns**: `dict[str, Any]` – Tool specification for `help` with toggles for summary, identity, per-tool guidance, and full protocol output.
  - **Summary**: Produces the diagnostic helper that agents can call to learn about their identity, available MAIL tools, and optionally the entire protocol specification; the runtime relays the generated content back via a system broadcast.
##### `create_task_complete_tool`
```python
def create_task_complete_tool(
    style="completions"
) -> dict[str, Any]
```
  - **Parameters**: `style: Literal["completions", "responses"]` – aligns the schema with the OpenAI API being used.
  - **Returns**: `dict[str, Any]` – Tool specification for `task_complete`.
  - **Summary**: Produces the termination tool supervisors use to broadcast the final user-facing answer and signal the runtime that the task loop can close.
##### `create_mail_tools`
```python
def create_mail_tools(
    targets, 
    enable_interswarm=False, 
    style="completions"
) -> list[dict[str, Any]]
```
  - **Parameters**: `targets: list[str]` – baseline intra-swarm recipients; `enable_interswarm: bool` – toggles remote routing support; `style: Literal["completions", "responses"]` – OpenAI schema variant shared by all generated tools.
  - **Returns**: `list[dict[str, Any]]` – Bundled request, response, acknowledgement, ignore, await, and help tools configured with the provided options.
  - **Summary**: Supplies a ready-to-install toolkit for standard agents so they can message peers, manage broadcasts, request runtime help, or explicitly wait for new mail without bespoke configuration.
##### `create_supervisor_tools`
```python
def create_supervisor_tools(
    targets, 
    can_complete_tasks=True, 
    enable_interswarm=False, 
    style="completions", 
    _debug_include_intraswarm=True
) -> list[dict[str, Any]]
```
  - **Parameters**: `targets: list[str]` – intra-swarm agents reachable by the supervisor; `can_complete_tasks: bool` – gates inclusion of the task completion tool; `enable_interswarm: bool` – toggles remote messaging and discovery helpers; `style: Literal["completions", "responses"]` – controls schema flavor; `_debug_include_intraswarm: bool` – retains intra-swarm tools when debugging or running evaluations.
  - **Returns**: `list[dict[str, Any]]` – Curated tool set composed of interrupts, broadcasts, discovery, and optional completion helpers.
  - **Summary**: Tailors the MAIL control surface for supervisory agents, combining escalation, coordination, discovery, and shutdown capabilities into a single toolkit.

#### `mail.utils.auth`
##### `login`
```python
def login(
    api_key: str
) -> Awaitable[str]
```
  - **Parameters**: `api_key: str` – credential provided by the operator or registry.
  - **Returns**: `Awaitable[str]` – coroutine resolving to a bearer token when the auth service accepts the key.
  - **Summary**: Performs the remote API key exchange, logs successful authentications, and yields the token MAIL uses for subsequent secured calls.
##### `get_token_info`
```python
def get_token_info(
    token: str
) -> Awaitable[dict[str, Any]]
```
  - **Parameters**: `token: str` – bearer token previously issued by the auth service.
  - **Returns**: `Awaitable[dict[str, Any]]` – coroutine yielding the decoded token payload (role, id, api key reference, etc.).
  - **Summary**: Queries the token introspection endpoint to materialize role metadata used by all downstream authorization checks.
##### `caller_is_admin`
```python
def caller_is_admin(
    request
) -> Awaitable[bool]
```
  - **Parameters**: `request: fastapi.Request` – inbound HTTP request carrying the bearer token header.
  - **Returns**: `Awaitable[bool]` – coroutine resolving to `True` when the token role is `admin`, otherwise raises `HTTPException`.
  - **Summary**: FastAPI dependency that gates endpoints to administrators by validating the caller’s token role against the auth service.
##### `caller_is_user`
```python
def caller_is_user(
    request
) -> Awaitable[bool]
```
  - **Parameters**: `request: fastapi.Request` – HTTP request containing an Authorization header.
  - **Returns**: `Awaitable[bool]` – coroutine that resolves to `True` when the token role is `user` (otherwise raises `HTTPException`).
  - **Summary**: Dependency guard that restricts endpoints to end users, reusing the shared role-checking helper.
##### `caller_is_agent`
```python
def caller_is_agent(
    request
) -> Awaitable[bool]
```
  - **Parameters**: `request: fastapi.Request` – bearer-authenticated HTTP request.
  - **Returns**: `Awaitable[bool]` – coroutine returning `True` if the caller’s role is `agent`, otherwise raising `HTTPException`.
  - **Summary**: Dependency enforcing that only MAIL agents (typically other swarms) can access agent-scoped endpoints.
##### `caller_is_admin_or_user`
```python
def caller_is_admin_or_user(
    request
) -> Awaitable[bool]
```
  - **Parameters**: `request: fastapi.Request` – inbound request from which the method extracts and validates the bearer token.
  - **Returns**: `Awaitable[bool]` – coroutine that resolves to `True` for `admin` or `user` callers, raising `HTTPException` for all others.
  - **Summary**: Combined guard that accepts either administrative or end-user tokens while protecting against malformed or mis-scoped Authorization headers.
##### `extract_token_info`
```python
def extract_token_info(
    request
) -> Awaitable[dict[str, Any]]
```
  - **Parameters**: `request: fastapi.Request` – request object containing bearer token details.
  - **Returns**: `Awaitable[dict[str, Any]]` – coroutine yielding the token metadata dictionary retrieved from the auth service.
  - **Summary**: Utility dependency that unwraps the Authorization header, normalizes the bearer token, and returns the decoded payload for downstream handlers.
##### `generate_user_id`
```python
def generate_user_id(
    token_info
) -> str
```
  - **Parameters**: `token_info: dict[str, Any]` – decoded token payload from the auth service.
  - **Returns**: `str` – stable user identifier combining the caller role and id.
  - **Summary**: Formats the composite user identifier MAIL uses to partition runtimes and per-user state.
##### `generate_agent_id`
```python
def generate_agent_id(
    token_info
) -> str
```
  - **Parameters**: `token_info: dict[str, Any]` – token payload describing the remote agent.
  - **Returns**: `str` – prefixed identifier (`swarm_<id>`) used for interswarm routing and persistence keys.
  - **Summary**: Produces the canonical agent identifier expected by registry and routing components.
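
The identifier formats produced by these two helpers can be sketched in plain Python. This is an illustration only: the colon separator and `swarm_` prefix follow the formats described in these docs, but the exact implementation lives in `mail.utils.auth`.

```python
from typing import Any

def generate_user_id(token_info: dict[str, Any]) -> str:
    # Combine caller role and id into a stable identifier, e.g. "user:alice".
    # The separator here is an assumption based on the documented format.
    return f"{token_info['role']}:{token_info['id']}"

def generate_agent_id(token_info: dict[str, Any]) -> str:
    # Prefixed identifier used for interswarm routing, e.g. "swarm_remote-1".
    return f"swarm_{token_info['id']}"

user_id = generate_user_id({"role": "user", "id": "alice"})
agent_id = generate_agent_id({"role": "agent", "id": "remote-1"})
print(user_id, agent_id)
```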

#### `mail.utils.logger`
##### `get_loggers`
```python
def get_loggers() -> list[str]
```
  - **Returns**: `list[str]` – names of loggers tracked by the root logging manager.
  - **Summary**: Exposes the logging subsystem’s registry so callers can audit or reconfigure loggers programmatically.
##### `init_logger`
```python
def init_logger() -> None
```
  - **Returns**: `None`.
  - **Summary**: Builds MAIL’s logging pipeline by wiring Rich console output, daily rotating file handlers, and sanitizing third-party logger configurations before runtime startup.

#### `mail.utils.parsing`
##### `read_python_string`
```python
def read_python_string(
    string: str
) -> Any
```
  - **Parameters**: `string: str` – import target in `module:attribute` format.
  - **Returns**: `Any` – referenced attribute imported dynamically from the specified module.
  - **Summary**: Supports template-driven configuration by resolving dotted module references into live Python objects.
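
A minimal sketch of the `module:attribute` resolution (simplified; the real helper may handle nested attributes and richer error reporting):

```python
import importlib

def read_python_string(string: str):
    # Split "module:attribute" and import the attribute dynamically.
    module_name, _, attr = string.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr)

# Resolves to the stdlib json.dumps function
dumps = read_python_string("json:dumps")
print(dumps({"ok": True}))  # {"ok": true}
```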
##### `target_address_is_interswarm`
```python
def target_address_is_interswarm(
    address: str
) -> bool
```
  - **Parameters**: `address: str` – MAIL address such as `agent` or `agent@swarm`.
  - **Returns**: `bool` – `True` when the address encodes a remote swarm component, otherwise `False`.
  - **Summary**: Uses the core address parser to distinguish local recipients from interswarm destinations for routing decisions.
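
Conceptually, the check reduces to whether the address carries a swarm component. The sketch below is illustrative; the actual helper delegates to the core address parser rather than a bare substring test:

```python
def target_address_is_interswarm(address: str) -> bool:
    # "agent@swarm" targets a remote swarm; a bare "agent" is local.
    return "@" in address

print(target_address_is_interswarm("weather"))         # False
print(target_address_is_interswarm("weather@remote"))  # True
```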

#### `mail.utils.store`
##### `get_langmem_store`
```python
def get_langmem_store() -> AsyncIterator[Any]
```
  - **Returns**: `AsyncIterator[Any]` – async context manager that yields either a Postgres-backed LangMem store or an in-memory fallback.
  - **Summary**: Centralizes memory-store provisioning, negotiating Postgres connectivity, schema options, and in-memory fallbacks while presenting a consistent async context manager interface.

### Example: programmatic swarm assembly

```python
import asyncio

from mail import MAILAgentTemplate, MAILSwarmTemplate
from mail.examples import weather_dummy  # Provides demo agent params and tools

# Build reusable agent templates from the bundled examples
supervisor = MAILAgentTemplate.from_example("supervisor", comm_targets=["weather"])
weather = MAILAgentTemplate.from_example("weather", comm_targets=["supervisor"])

# Assemble a swarm template that links the agents together
demo_template = MAILSwarmTemplate(
    name="demo-swarm",
    agents=[supervisor, weather],
    actions=[*supervisor.actions, *weather.actions],
    entrypoint="supervisor",
)

async def main() -> None:
    # Instantiate a concrete swarm runtime for a specific user
    swarm = demo_template.instantiate(instance_params={}, user_id="demo-user")
    # Post a message to the supervisor entrypoint and capture optional events
    response, events = await swarm.post_message(
        subject="Forecast check",
        body="What's the outlook for tomorrow in New York?",
        show_events=True,
    )
    # Emit the supervisor's final answer
    print(response["message"]["body"])
    # Always shut the runtime down to flush background tasks
    await swarm.shutdown()

asyncio.run(main())
```

This snippet constructs two agents from the bundled examples, wires them into a `MAILSwarmTemplate`, instantiates the swarm for a specific user, posts a request, and finally tears the runtime down.


===== End of `docs/api.md` =====

===== `docs/client.md` =====

# MAILClient Guide

`MAILClient` is the reference asynchronous Python client for the MAIL HTTP API. It wraps every documented endpoint, handles bearer authentication, and provides helpers for Server‑Sent Events (SSE) streaming and interswarm routing.

Use this guide when you want to talk to a MAIL server from Python without writing raw `aiohttp` calls.

## Installation & Requirements
- `MAILClient` lives in `src/mail/client.py` and ships with the main package (`pip install -e .` or `uv sync`).
- **Python 3.12+** and **aiohttp** (pulled in automatically via `pyproject.toml`).
- The client is fully asynchronous. Run it inside an asyncio event loop, preferably with `asyncio.run(...)` or within async frameworks such as FastAPI or LangChain tools.

## Quick Start

```python
import asyncio

from mail.client import MAILClient


async def main() -> None:
    async with MAILClient("http://localhost:8000", api_key="user-token") as client:
        root = await client.ping()
        print(root["protocol_version"])

        response = await client.post_message(
            "Hello from MAILClient",
            entrypoint="supervisor",
            show_events=True,
        )
        print(response)

        stream = await client.post_message_stream("Stream this task")
        async for event in stream:
            print(event.event, event.data)


if __name__ == "__main__":
    asyncio.run(main())
```

## Connection Options
- `MAILClient(base_url, api_key=None, session=None, config=None)`
  - `base_url`: Root URL for the MAIL server (no trailing slash). Supports standard HTTP/HTTPS URLs.
  - `api_key`: Optional JWT or API key. When provided, every request includes `Authorization: Bearer <api_key>`.
  - `session`: Provide your own `aiohttp.ClientSession` to share connections or customise connectors. The client will not close externally supplied sessions.
  - `config`: Pass a `ClientConfig` instance (for example `ClientConfig(timeout=120.0, verbose=True)`) to reuse or override defaults hydrated from `mail.toml`.

The class implements `__aenter__` / `__aexit__`, so `async with` automatically opens and closes the HTTP session (`aclose()` is also available).

### `swarm://` URL Support

The CLI client (`mail client`) supports `swarm://` URLs for convenient connection sharing:

```bash
# Connect using swarm:// URL
uv run mail client "swarm://connect?server=example.com&token=my-api-key"
```

The URL is automatically parsed and converted:
- `server` parameter becomes the HTTPS base URL
- `token` parameter is used as the API key (if not overridden by `--api-key`)

Supported URL formats:
```
swarm://connect?server=<hostname>&token=<api_key>
swarm://invite?server=<hostname>&token=<api_key>
```

See [cli.md](./cli.md) for more details on URL scheme handling and OS registration.

### ClientConfig and mail.toml
- `ClientConfig` pulls its defaults from the `[client]` table in `mail.toml` (`timeout` and `verbose`).
- `MAILClient` uses these defaults automatically when you omit the `config` argument; the CLI REPL (`mail client`) follows the same behavior.
- Override per run by constructing `ClientConfig(timeout=..., verbose=...)` or by exporting/pointing `MAIL_CONFIG_PATH` to an alternate config file.

## Endpoint Coverage

| Category | Methods | Notes |
| --- | --- | --- |
| Service metadata | `ping()`, `get_status()` | Mirrors `GET /` and `GET /status`. |
| Identity | `get_whoami()` | Fetches the caller's username and role via `GET /whoami`. |
| Health | `get_health()` | Returns interswarm readiness info. |
| Messaging | `post_message(message, entrypoint=None, show_events=False)`, `post_message_stream(message, entrypoint=None)` | Handles synchronous responses and SSE streaming. |
| Task inspection | `get_tasks()`, `get_task(task_id)` | Fetch task overviews or a full record using `GET /tasks` and `GET /task`. |
| Swarm registry | `get_swarms()`, `register_swarm(...)`, `dump_swarm()`, `load_swarm_from_json(json_str)` | Manage remote swarm entries and persistent templates. |
| Interswarm | `post_interswarm_message(...)`, `post_interswarm_response(...)`, `send_interswarm_message(...)` | Submit or receive interswarm traffic. |
| Debug/OpenAI | `debug_post_responses(input, tools, instructions=None, previous_response_id=None, tool_choice=None, parallel_tool_calls=None, **kwargs)` | Calls the debug-only `/responses` endpoint; requires server debug mode. |

All helpers return deserialized `dict` objects matching the schemas in `spec/openapi.yaml`. For MAIL envelope types (`MAILMessage`, `MAILInterswarmMessage`) the client expects the dictionary shape defined in `mail.core.message`.

## Streaming Responses

`post_message_stream` returns an async iterator over `sse_starlette.ServerSentEvent` instances. Internally, the client parses chunked text from the HTTP response and yields structured events.

```python
stream = await client.post_message_stream("Need live updates")
async for event in stream:
    if event.event == "task_complete":
        print("done", event.data)
```

## Task Lifecycle and Resuming Previous Tasks

- Every call to `post_message`/`post_message_stream` participates in a **task** identified by `task_id`. If you omit the field, the server generates an ID. Reuse the same `task_id` to continue the conversation (for example, when running the runtime in continuous mode).
- When an agent invokes a tool that has been marked as a **breakpoint tool**, the runtime pauses the task and waits for the caller to provide the tool result. Resume the task by sending another message with:
  - The original `task_id`.
  - `resume_from="breakpoint_tool_call"`.
  - Extra keyword argument `breakpoint_tool_call_result`, a JSON string describing the tool outputs. Provide either a single object (`{"content": "..."}`) or a list of objects (`[{"call_id": "...", "content": "..."}]`) when multiple breakpoint tool calls paused in parallel.

```python
import json

task_id = "weather-task"

# Start a new task (runtime will mark it running until completion or a breakpoint)
response = await client.post_message(
    "Plan tomorrow's rehearsal dinner",
    task_id=task_id,
    entrypoint="supervisor",
)

# Later, resume the task after the breakpoint tool returns a value
stream = await client.post_message_stream(
    "Continuing after breakpoint",
    task_id=task_id,
    resume_from="breakpoint_tool_call",
    breakpoint_tool_call_result=json.dumps(
        {"call_id": "bp-1", "content": "Forecast: sunny with a high of 75°F"}
    ),
)
async for event in stream:
    ...
```

- The other supported value of `resume_from` is `"user_response"`. Use it when a user wants to follow up on a previous task.
  - Note that the `msg_type` of a `user_response` *does not necessarily* need to be a `response`; the default message type is `request`, which works perfectly fine here.

```python
task_id = "weather-task-2"

response = await client.post_message(
    "What will the weather in San Francisco be tomorrow?",
    task_id=task_id,
)

follow_up = await client.post_message(
    "How does that compare to the forecast for Los Angeles?",
    task_id=task_id,
    resume_from="user_response",
) # msg_type = "request" here
```

- The runtime automatically resumes the task loop, restores any stashed queue items for that task, re-hydrates the agent history with the tool output, and emits the usual `task_complete` event once the agents finish.

Use the new inspection helpers to audit active or completed work:

```python
tasks = await client.get_tasks()
for task_id, task in tasks.items():
    print(task_id, task["completed"], task["is_running"])

latest = await client.get_task(task_id="weather-task")
print(latest["events"][-1]["event"])
```

Both helpers require the caller to own the task; the server automatically scopes results to the authenticated user/admin.

## OpenAI Responses bridge

- Enable server debug mode (`mail server --debug` or `[server].debug = true`) before calling `debug_post_responses`. This instantiates the internal OpenAI bridge (`SwarmOAIClient`) and exposes `/responses`.
- The helper expects the same payload shape as OpenAI's Responses API.
- Any extra keyword arguments (for example, `"parallel_tool_calls"` overrides or custom metadata) are forwarded verbatim inside the request body and handed to `SwarmOAIClient` for execution.
- Example (assuming you already have an authenticated `MAILClient` instance named `client`):

  ```python
  response = await client.debug_post_responses(
      input=[
          {"role": "system", "content": "You orchestrate the MAIL swarm."},
          {"role": "user", "content": "Draft a response for tomorrow's stand-up."}
      ],
      tools=[],
  )
  print(response.output)
  ```

- The CLI exposes the same workflow with `mail client responses …`; see [cli.md](./cli.md) for the REPL syntax.

## Error Handling

- HTTP transport errors raise `RuntimeError` with the originating `aiohttp` exception chained.
- Non‑JSON responses raise `ValueError` annotated with the returned content type and body.
- Always wrap calls in `try/except` when the network may be flaky or when tokens can expire.
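
A generic retry wrapper along these lines can absorb transient failures. This helper is illustrative and not part of `MAILClient`; it catches `RuntimeError` because that is what the client raises on transport errors:

```python
import asyncio

async def call_with_retries(coro_factory, attempts: int = 3, delay: float = 0.5):
    # Retry a coroutine-producing callable on RuntimeError, with a pause
    # between attempts; re-raise after the final attempt.
    for attempt in range(1, attempts + 1):
        try:
            return await coro_factory()
        except RuntimeError:
            if attempt == attempts:
                raise
            await asyncio.sleep(delay)

# Demo with a stub that fails twice, then succeeds
calls = {"n": 0}

async def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient transport error")
    return {"ok": True}

result = asyncio.run(call_with_retries(flaky, attempts=3, delay=0))
print(result)  # {'ok': True}
```

In real use, `coro_factory` would be something like `lambda: client.ping()`.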

## Testing & Utilities

- Unit coverage lives in `tests/unit/test_mail_client.py`, using an in‑process aiohttp server to validate payloads and streaming behaviour.
- `scripts/demo_client.py` launches a stubbed MAIL server and exercises the client end‑to‑end—useful for manual testing or onboarding demos.

## Integration Tips

- **Reuse sessions** for high‑throughput scenarios by passing an externally managed `ClientSession`.
- **Custom headers**: Extend `_build_headers` by subclassing `MAILClient` if you need additional per‑request metadata.
- **Timeouts**: Provide an `aiohttp.ClientTimeout(total=...)` for fine control over connect/read limits.
- **Logging**: Enable the `mail.client` logger for request traces (`logging.getLogger("mail.client").setLevel(logging.DEBUG)`).

## Related Documentation

- [API Surfaces](./api.md) – discusses the HTTP routes that `MAILClient` calls.
- [Quickstart](./quickstart.md) – shows how to run the server; you can replace `curl` steps with `MAILClient` snippets.
- [Testing](./testing.md) – outlines the project’s testing strategy, including client exercises.
- [Troubleshooting](./troubleshooting.md) – consult for common connectivity issues.


===== End of `docs/client.md` =====

===== `docs/registry.md` =====

# Swarm Registry

The registry manages discovery and routing for remote swarms.

## Responsibilities
- **Track endpoints**: name, base URL, health URL, auth token reference
- Periodic **health checks** and last-seen timestamps
- **Persistence** of non-volatile entries to a JSON file
- Migration and validation of env-backed auth tokens

## Persistence
- **File**: registry path from `mail.toml` (`[server.swarm].registry` / `registry_file`), default `registries/example-no-proxy.json`
- On shutdown, volatile entries are discarded; persistent entries are saved

## Auth token references
- Persistent registrations convert `auth_token` to environment references like `${SWARM_AUTH_TOKEN_<SWARM>}`
- At runtime these are resolved from the process environment; if unset the router will fall back to the message payload’s `auth_token`, but you should still export the variable so outbound calls always include a static bearer token.
- **Utilities**: `migrate_auth_tokens_to_env_refs`, `validate_environment_variables`

## API integration
- **Server endpoints** expose `GET /swarms`, `POST /swarms`, `GET /swarms/dump`, `POST /swarms/load`
- Use `POST /swarms` with `volatile=false` to persist a remote swarm

## Code
- [src/mail/net/registry.py](/src/mail/net/registry.py)
- [src/mail/net/router.py](/src/mail/net/router.py)


===== End of `docs/registry.md` =====

===== `docs/README.md` =====

# MAIL Python Reference Implementation Documentation

This folder documents the **Multi‑Agent Interface Layer (MAIL) reference implementation** found in this repository. It explains what MAIL is, how this Python implementation is structured, how to run it, and how to extend it with your own agents and swarms.

If you’re new, start with [Quickstart](/docs/quickstart.md), then read [Architecture](/docs/architecture.md) and [Agents & Tools](/docs/agents-and-tools.md). The [API](/docs/api.md) doc covers both HTTP and Python surfaces, [Client](/docs/client.md) explains the asynchronous HTTP helper, and [Message Format](/docs/message-format.md) specifies the wire schema used by every transport.

## Contents
- **Quickstart**: [quickstart.md](/docs/quickstart.md)
- **Docker Deployment**: [docker.md](/docs/docker.md)
- **Architecture**: [architecture.md](/docs/architecture.md)
- **Configuration**: [configuration.md](/docs/configuration.md)
- **Database Persistence**: [database.md](/docs/database.md)
- **API (HTTP & Python)**: [api.md](/docs/api.md)
- **CLI**: [cli.md](/docs/cli.md)
- **HTTP Client**: [client.md](/docs/client.md)
- **Message Format**: [message-format.md](/docs/message-format.md)
- **Agents & Tools**: [agents-and-tools.md](/docs/agents-and-tools.md)
- **Interswarm Messaging**: [interswarm.md](/docs/interswarm.md)
- **Swarm Registry**: [registry.md](/docs/registry.md)
- **Standard Library**: [stdlib/README.md](/docs/stdlib/README.md)
- **Security**: [security.md](/docs/security.md)
- **Testing**: [testing.md](/docs/testing.md)
- **Examples**: [examples.md](/docs/examples.md)
- **Troubleshooting**: [troubleshooting.md](/docs/troubleshooting.md)

## What is MAIL?
- **MAIL** (**M**ulti‑**A**gent **I**nterface **L**ayer) is a protocol and reference implementation that standardizes how autonomous agents communicate, coordinate, and collaborate.
- The Python implementation uses FastAPI for HTTP endpoints, an internal runtime loop for message processing, and a registry/router for inter‑swarm communication over HTTP.
- The normative protocol specification lives in [spec/](/spec/SPEC.md) and includes JSON Schemas and an OpenAPI file for the HTTP surface.

## Where to look in the code
- **Server and API**: [src/mail/server.py](/src/mail/server.py), [src/mail/api.py](/src/mail/api.py)
- **HTTP client**: [src/mail/client.py](/src/mail/client.py)
- **Core runtime, tools, messages**: [src/mail/core/runtime.py](/src/mail/core/runtime.py), [src/mail/core/tools.py](/src/mail/core/tools.py), [src/mail/core/message.py](/src/mail/core/message.py)
- **Interswarm**: [src/mail/net/router.py](/src/mail/net/router.py), [src/mail/net/registry.py](/src/mail/net/registry.py), [src/mail/net/types.py](/src/mail/net/types.py)
- **Utilities**: [src/mail/utils/](/src/mail/utils/__init__.py)
- **Examples and agent functions**: [src/mail/examples/](/src/mail/examples/__init__.py), [src/mail/factories/](/src/mail/factories/__init__.py)


===== End of `docs/README.md` =====

===== `docs/agents-and-tools.md` =====

# Agents & Tools

## Agents
- An **agent** is an async callable created by a factory that takes a chat history and can emit tool calls ([src/mail/api.py](/src/mail/api.py), [src/mail/factories/](/src/mail/factories/__init__.py))
- Agent types can be configured in [swarms.json](/swarms.json) and converted to `MAILAgentTemplate` at runtime
- **Important flags**: `enable_entrypoint`, `enable_interswarm`, `can_complete_tasks`, `tool_format`
- Values inside `agent_params` support string prefixes resolved at load time: use `python::package.module:OBJECT` for Python exports and `url::https://...` to fetch JSON payloads that populate prompts or additional settings
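
The prefix handling can be sketched like this. The sketch is illustrative: the `python::` branch mirrors the documented `module:attribute` resolution, while the `url::` branch (which would fetch JSON over HTTP) is stubbed out:

```python
import importlib

def resolve_param(value: str):
    # Resolve "python::module:attribute" to a live Python object;
    # pass through plain strings unchanged.
    if value.startswith("python::"):
        module_name, _, attr = value.removeprefix("python::").partition(":")
        return getattr(importlib.import_module(module_name), attr)
    if value.startswith("url::"):
        raise NotImplementedError("would fetch JSON from the url:: target")
    return value

# Resolves a stdlib export as a stand-in for shared prompt text
prompt = resolve_param("python::string:ascii_lowercase")
print(prompt)  # abcdefghijklmnopqrstuvwxyz
```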

## Actions
- A `MAILAction` defines a structured tool interface backed by a Python function (or coroutine)
- Author new actions with the `@mail.action` decorator or `MAILAction.from_pydantic_model()` helper in [src/mail/api.py](/src/mail/api.py)
- Declare actions once per swarm in [swarms.json](/swarms.json); agents reference them by name in their `actions` list
- Reuse shared actions via the swarm-level `action_imports` array (see [docs/stdlib](./stdlib/README.md) for the built-in catalogue)
- Conversion helpers build Pydantic models and tool specs: see `MAILAction.to_tool_dict()` and `pydantic_model_to_tool()` in [src/mail/core/tools.py](/src/mail/core/tools.py) and [src/mail/api.py](/src/mail/api.py)

## Tool format
- `tool_format` controls how tools are exposed: `completions` (chat completions) or `responses` (OpenAI Responses API shape)
- The system mirrors definitions appropriately so both shapes are supported internally

## Built-in MAIL tools ([src/mail/core/tools.py](/src/mail/core/tools.py))
- `send_request(target, subject, body)` → emits a `MAILRequest` to a validated in-swarm target; when the agent template enables interswarm the `target` accepts the `agent@swarm` form.
- `send_response(target, subject, body)` → mirrors `send_request` but produces a `MAILResponse`, letting agents continue existing conversations.
- `send_interrupt(target, subject, body)` → issues a `MAILInterrupt` so supervisors can pause or redirect downstream agents.
- `send_broadcast(subject, body, targets)` → schema includes `targets`, but the runtime currently ignores it and broadcasts to every agent in the local swarm.
- `acknowledge_broadcast(note=None)` → records the broadcast in agent memory without replying; the optional note stays internal.
- `ignore_broadcast(reason=None)` → explicitly drops the broadcast and skips both memory storage and outbound mail; optional reason is internal only.
- `await_message(reason=None)` → signals that the agent has no further output this turn and should be rescheduled when new mail arrives; an optional reason is surfaced in SSE events and tool-call history for debugging.
- `help(get_summary=True, get_identity=False, get_tool_help=None, get_full_protocol=False)` → generates a MAIL primer for the calling agent, optionally including identity info, per-tool guides, and the full protocol spec; the runtime streams the result back as a system broadcast.
- `send_interswarm_broadcast(subject, body, target_swarms=[])` → (supervisor + interswarm) sends a broadcast to selected remote swarms, defaulting to all when the list is empty.
- `discover_swarms(discovery_urls)` → (supervisor + interswarm) hands discovery endpoints to the registry so it can import additional swarms.
- `task_complete(finish_message)` → (supervisor) broadcasts the final answer and tells the runtime the task loop is finished.

`create_mail_tools()` installs the standard request/response plus broadcast acknowledgement helpers for regular agents, while `create_supervisor_tools()` layers on interrupts, broadcasts, discovery, and task completion based on the template flags described above.

## Supervisors
- Agents with `can_complete_tasks: true` can **signal task completion** and are treated as supervisors
- **Swarms must include at least one supervisor**; the default example uses `supervisor` as the entrypoint

## Communication graph
- `comm_targets` names define a directed graph of which agents an agent can contact
- When interswarm is enabled, targets may include `agent@swarm` and local validation allows remote addresses

## Factories and prompts
- **Example factories and prompts** live in [src/mail/examples/*](/src/mail/examples/__init__.py) and [src/mail/factories/*](/src/mail/factories/__init__.py)
- **Add your own agent** by creating a MAIL-compatible agent function and listing it in [swarms.json](/swarms.json)
- When referencing shared prompt text or other dynamic values, prefer the `python::` and `url::` prefixes so they stay in sync with code or remote configuration without manual duplication


===== End of `docs/agents-and-tools.md` =====

===== `docs/security.md` =====

# Security

## Recommendations
- **Use HTTPS** for all deployments and registry communications
- **Separate tokens** and roles for users, admins, and agents
- Require admin role for registry mutations and loading swarms
- Use environment variable references for persistent interswarm auth tokens
- Apply rate limiting at HTTP ingress if public facing
- **Restrict tool execution**; validate parameters and avoid dangerous side effects

## Auth integration
- The server delegates token validation to `TOKEN_INFO_ENDPOINT`
- **Expected shape**: `{ role: "admin"|"user"|"agent", id: string, api_key: string }`
- Per-user MAIL instances are keyed by caller role + id; task owner identifiers use `{role}:{id}@{swarm}`

## Operational
- Keep `SWARM_REGISTRY_FILE` on secure storage and ensure only env-var references are persisted
- **Rotate environment variables** instead of editing persisted JSON
- **Monitor logs** for interswarm health changes and failures


===== End of `docs/security.md` =====

===== `docs/docker.md` =====

# Docker Deployment

This guide explains how to build and run the MAIL reference server in a Docker container. Use it when you want an immutable runtime or need to deploy MAIL to container platforms.

## Prerequisites
- Docker 24+
- Python 3.12-compatible base image (the example uses `python:3.12-slim`)
- Access to the MAIL repository source tree when building the image

## Example Dockerfile
Place the following `Dockerfile` at the repository root or in a build-specific folder. It mirrors the workflow used throughout the docs (`uv sync` for dependency resolution):

```Dockerfile
FROM python:3.12-slim AS base
WORKDIR /app

# Install system dependencies used during the build
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl build-essential \
    && rm -rf /var/lib/apt/lists/*

# uv handles dependency resolution and execution
RUN pip install --no-cache-dir uv

# Copy dep metadata first to leverage Docker layer caching
COPY pyproject.toml uv.lock mail.toml ./
RUN uv sync --frozen --no-dev

# Copy the remaining source files required by the server
COPY src ./src
COPY spec ./spec
COPY docs ./docs

# Default configuration; override via `mail.toml`/`MAIL_CONFIG_PATH` or CLI flags
ENV PORT=8000 \
    AUTH_ENDPOINT=http://auth.local/login \
    TOKEN_INFO_ENDPOINT=http://auth.local/token-info \
    LITELLM_PROXY_API_BASE=http://litellm.local

EXPOSE 8000
CMD ["uv", "run", "mail", "server", "--host", "0.0.0.0", "--port", "8000"]
```

### Multi-stage tip
If you want a smaller runtime image, split the `Dockerfile` into build and final stages. Copy `.venv` from the build stage to the runtime stage and strip out build-essential packages there.
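
One possible shape for that split (a sketch only: stage layout and paths are illustrative and untested, and it assumes the `mail` console script lands in `/app/.venv`):

```Dockerfile
FROM python:3.12-slim AS builder
WORKDIR /app
RUN pip install --no-cache-dir uv
COPY pyproject.toml uv.lock mail.toml ./
RUN uv sync --frozen --no-dev

FROM python:3.12-slim AS runtime
WORKDIR /app
# Reuse the prebuilt virtualenv; no compilers needed in this stage
COPY --from=builder /app/.venv ./.venv
COPY src ./src
ENV PATH="/app/.venv/bin:$PATH"
EXPOSE 8000
CMD ["mail", "server", "--host", "0.0.0.0", "--port", "8000"]
```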

## Build the image
```bash
docker build -t mail-server .
```
The build context must contain the repository so that the `COPY` commands pick up the source and configuration files.

## Run the container
The server requires the same environment variables as the native quickstart: `AUTH_ENDPOINT`, `TOKEN_INFO_ENDPOINT`, and, only if your swarm uses `use_proxy=true`, `LITELLM_PROXY_API_BASE`. Pass them via `--env` flags or an env file. To change the swarm name, source, or registry, or the runtime LLM stream-print behavior, mount a custom `mail.toml` and set `MAIL_CONFIG_PATH`, or pass `mail server` flags such as `--swarm-name`, `--swarm-source`, `--swarm-registry`, and `--print-llm-streams true|false` in the container command.

```bash
# Option 1: export locally then forward with --env
export AUTH_ENDPOINT=http://127.0.0.1:8999/login
export TOKEN_INFO_ENDPOINT=http://127.0.0.1:8999/token-info
export LITELLM_PROXY_API_BASE=http://127.0.0.1:8080

docker run --rm \
  -p 8000:8000 \
  -e AUTH_ENDPOINT \
  -e TOKEN_INFO_ENDPOINT \
  -e LITELLM_PROXY_API_BASE \
  mail-server
```

```bash
# Option 2: use an env file
cat <<'ENVVARS' > .env.mail
AUTH_ENDPOINT=http://127.0.0.1:8999/login
TOKEN_INFO_ENDPOINT=http://127.0.0.1:8999/token-info
LITELLM_PROXY_API_BASE=http://127.0.0.1:8080
ENVVARS

docker run --rm -p 8000:8000 --env-file .env.mail mail-server
```

`mail server` seeds `SWARM_SOURCE` and `SWARM_REGISTRY_FILE` from `mail.toml`; set them explicitly if you mount alternative swarm definitions or persistence paths into the container.

### Persisting registries and logs
Mount host directories if you want the container to keep swarm registry data or logs between runs:

```bash
docker run --rm \
  -p 8000:8000 \
  --env-file .env.mail \
  -v $(pwd)/registries:/app/registries \
  -v $(pwd)/logs:/app/logs \
  mail-server
```

## Health checks
Use the same endpoints as the quickstart for readiness and status:

```bash
curl http://localhost:8000/health
curl -H "Authorization: Bearer user:demo" http://localhost:8000/status
```

Update the URLs to the forwarded host/port if you publish the container through ngrok or another ingress.

## Troubleshooting
- Ensure Docker is forwarding the port (`-p 8000:8000`) and no other service is bound to that port on the host.
- Confirm the authentication service is reachable from inside the container. If you expose a host service via `localhost`, use the Docker host gateway (`host.docker.internal` on macOS/Windows or `--add-host` on Linux).
- Rebuild the image after dependency changes; Docker layer caching only applies when `pyproject.toml`/`uv.lock` remain unchanged.


===== End of `docs/docker.md` =====

===== `docs/database.md` =====

# Database Persistence

MAIL supports optional PostgreSQL persistence for agent histories, task state, and event timelines. When enabled, the runtime automatically saves and restores conversation context, allowing tasks to survive server restarts and enabling audit trails.

## Features

- **Agent History Persistence**: Conversation histories are saved per task and agent, enabling context recovery across sessions
- **Task State Tracking**: Task metadata (owner, contributors, running status, completion) persists to the database
- **Event Timeline Storage**: SSE events are recorded for debugging and replay
- **Task Response Caching**: Final responses are stored for retrieval without re-execution
- **Automatic Recovery**: On instance startup, the runtime loads existing histories and tasks from the database

## Setup

### Prerequisites

- PostgreSQL 12+ (with `gen_random_uuid()` support)
- The `DATABASE_URL` environment variable set to a valid connection string

### 1. Configure the Connection

Set the `DATABASE_URL` environment variable:

```bash
export DATABASE_URL=postgresql://user:password@localhost:5432/mail
```

The connection string format follows the standard PostgreSQL URI scheme:
```
postgresql://[user[:password]@][host][:port]/database
```

### 2. Initialize the Schema

Run the database initialization command to create all required tables:

```bash
uv run mail db-init
```

This creates four tables:

| Table | Purpose |
|-------|---------|
| `agent_histories` | Stores LLM conversation histories keyed by swarm, caller, task, and agent |
| `tasks` | Tracks task metadata including owner, contributors, status, and timestamps |
| `task_events` | Records SSE events (type, data, ID) for each task |
| `task_responses` | Caches final task responses for retrieval |

The command also creates indexes for efficient queries and verifies table creation.

### 3. Verify the Setup

After initialization, you should see output like:

```
Connecting to database...
Connected successfully
Creating agent_histories table...
  agent_histories table created
Creating tasks table...
  tasks table created
Creating task_events table...
  task_events table created
Creating task_responses table...
  task_responses table created

Verifying tables...
  agent_histories: OK
  tasks: OK
  task_events: OK
  task_responses: OK

Database initialization complete!
```

## Schema Reference

### agent_histories

```sql
CREATE TABLE agent_histories (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    swarm_name TEXT NOT NULL,
    caller_role TEXT NOT NULL,      -- 'admin', 'agent', or 'user'
    caller_id TEXT NOT NULL,
    tool_format TEXT NOT NULL,      -- 'completions' or 'responses'
    task_id TEXT NOT NULL,
    agent_name TEXT NOT NULL,
    history JSONB NOT NULL,         -- LLM message history
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```

### tasks

```sql
CREATE TABLE tasks (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    task_id TEXT NOT NULL,
    swarm_name TEXT NOT NULL,
    caller_role TEXT NOT NULL,
    caller_id TEXT NOT NULL,
    task_owner TEXT NOT NULL,
    task_contributors JSONB DEFAULT '[]',
    remote_swarms JSONB DEFAULT '[]',
    is_running BOOLEAN DEFAULT FALSE,
    completed BOOLEAN DEFAULT FALSE,
    start_time TIMESTAMP WITH TIME ZONE NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(task_id, swarm_name, caller_role, caller_id)
);
```

### task_events

```sql
CREATE TABLE task_events (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    task_id TEXT NOT NULL,
    swarm_name TEXT NOT NULL,
    caller_role TEXT NOT NULL,
    caller_id TEXT NOT NULL,
    event_type TEXT,
    event_data TEXT,
    event_id TEXT,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
```

### task_responses

```sql
CREATE TABLE task_responses (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    task_id TEXT NOT NULL,
    swarm_name TEXT NOT NULL,
    caller_role TEXT NOT NULL,
    caller_id TEXT NOT NULL,
    response JSONB NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    UNIQUE(task_id, swarm_name, caller_role, caller_id)
);
```
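
Given the `UNIQUE(task_id, swarm_name, caller_role, caller_id)` constraint above, a cached final response can be fetched with a query like the following (all values are hypothetical):

```sql
-- Fetch the cached final response for one task (illustrative values)
SELECT response
FROM task_responses
WHERE task_id = 'abc123'
  AND swarm_name = 'my-swarm'
  AND caller_role = 'user'
  AND caller_id = 'alice';
```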

## Runtime Behavior

When `DATABASE_URL` is set:

1. **On Instance Startup**: The runtime loads agent histories and task records for the current swarm and caller. This allows resuming conversations where they left off.

2. **During Task Execution**: Agent histories are periodically saved to the database. Task state updates (running, completed) are persisted.

3. **On Task Completion**: The final response and all events are saved to the database.

4. **Connection Pooling**: The runtime uses `asyncpg` connection pooling (5-20 connections) with automatic retry logic for transient failures.
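
The retry behavior can be pictured as a small exponential-backoff wrapper around each database operation. This is a sketch of the idea, not the runtime's actual code; `with_retries` and `flaky_insert` are invented names:

```python
import asyncio

async def with_retries(op, attempts=3, base_delay=0.05):
    """Retry an async operation with exponential backoff on connection errors."""
    for attempt in range(1, attempts + 1):
        try:
            return await op()
        except ConnectionError:
            if attempt == attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))

# Simulate a transient failure that succeeds on the second try
calls = {"n": 0}

async def flaky_insert():
    calls["n"] += 1
    if calls["n"] < 2:
        raise ConnectionError("connection reset by peer")
    return "ok"

print(asyncio.run(with_retries(flaky_insert)))  # ok
```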

## Disabling Persistence

To run without database persistence, leave `DATABASE_URL` unset. The runtime then operates entirely in-memory, which suits development and stateless deployments.


## Troubleshooting

### Connection Errors

If you see `DATABASE_URL is not set`:
- Verify the environment variable is exported
- Check that the value is a valid PostgreSQL connection string

If connection fails:
- Verify PostgreSQL is running and accessible
- Check credentials and database name
- Ensure the database exists (create with `createdb mail` if needed)

### Missing Tables

If you see "table does not exist" errors:
- Run `mail db-init` to create the schema
- Verify the init command completed successfully

### Permission Issues

Ensure the database user has:
- `CREATE TABLE` permission (for initialization)
- `SELECT`, `INSERT`, `UPDATE` permissions (for runtime)
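
For example, with `mail_user` as a placeholder role name:

```sql
-- Illustrative grants; substitute your actual role and schema
GRANT CREATE ON SCHEMA public TO mail_user;                 -- needed by `mail db-init`
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO mail_user;
```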

## Related Documentation

- [Configuration](./configuration.md) - Environment variables and `mail.toml`
- [CLI](./cli.md) - The `mail db-init` command
- [Architecture](./architecture.md) - How the runtime manages state


===== End of `docs/database.md` =====

===== `docs/architecture.md` =====

# Architecture

This section explains the runtime, server, and networking layers that make up a MAIL swarm.

## Overview
- **Runtime**: per-user (or per-swarm) message queue, agents, tools, and execution ([src/mail/core/runtime.py](/src/mail/core/runtime.py))
- **API/Server**: FastAPI app exposing HTTP endpoints and managing persistent templates and user-scoped instances ([src/mail/server.py](/src/mail/server.py), [src/mail/api.py](/src/mail/api.py))
- **Interswarm**: HTTP router and registry for cross-swarm messaging ([src/mail/net/router.py](/src/mail/net/router.py), [src/mail/net/registry.py](/src/mail/net/registry.py))

## Key concepts
- **`MAILMessage`**: canonical envelope for request/response/broadcast/interrupt; see [message-format.md](/docs/message-format.md) and [src/mail/core/message.py](/src/mail/core/message.py)
- **Agents**: async callables that produce text + tool calls; created by factories, which can be configured in [swarms.json](/swarms.json)
- **Actions/Tools**: structured tool specs that let agents send MAIL messages, broadcast, interrupt, and complete tasks
- **Swarm**: a set of agents plus optional actions, with a directed communication graph and a designated entrypoint

## Runtime
- **Message queue**: priority queue with deterministic tie-breaking (FIFO by enqueue sequence within a priority); processes messages and schedules tool execution
- **Task queue snapshots**: task-specific messages are stashed when `task_complete` or breakpoint pauses occur and reloaded when the task resumes, preserving execution ordering
- **Task state tracking**: each logical task is represented by a `MAILTask` record that persists the running/completed flag, SSE event log, stashed queue snapshot, and any remote swarms participating in the conversation, so resumes and audits stay consistent
- **Agent histories**: maintained per agent for context and multi-turn behavior
- **Pending requests**: tracked futures keyed by task_id for correlating final responses and streaming
- **Events and SSE**: events are collected and streamed via Server-Sent Events (SSE) with heartbeat pings
- **Interswarm**: optional router that detects `agent@swarm` recipients, routes over HTTP, and can proxy streaming SSE responses from remote swarms when requested
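
The queue's ordering rule (strict priority, FIFO within a priority) can be sketched with `heapq` and a monotonic sequence counter. This is illustrative only; the real queue lives in `src/mail/core/runtime.py`, and lower numbers are assumed here to mean higher priority:

```python
import heapq
import itertools

class PriorityMessageQueue:
    """Sketch of a priority queue with deterministic FIFO tie-breaking."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # monotonic enqueue sequence

    def put(self, priority: int, message: str) -> None:
        # The (priority, seq) pair makes same-priority messages dequeue in order
        heapq.heappush(self._heap, (priority, next(self._seq), message))

    def get(self) -> str:
        return heapq.heappop(self._heap)[2]

q = PriorityMessageQueue()
q.put(1, "first-low")
q.put(0, "urgent")
q.put(1, "second-low")
assert [q.get(), q.get(), q.get()] == ["urgent", "first-low", "second-low"]
```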

## Server and API
- **Persistent template**: built at startup from [swarms.json](/swarms.json) into `MAILSwarmTemplate`
- **User isolation**: each authenticated user gets a dedicated `MAILSwarm` instance with its own runtime loop
- **Endpoints**: `GET /`, `GET /status`, `POST /message` (+SSE), `GET /tasks`, `GET /task`, interswarm endpoints, and registry management; see [api.md](/docs/api.md)
- **Debug mode**: enable `[server].debug` (or run `mail server --debug`) to instantiate a `SwarmOAIClient` and expose the OpenAI-compatible `/responses` endpoint along with other diagnostic helpers; keep it off in production to minimize the HTTP surface

- **Lifespan**: on startup, initializes registry, loads the persistent swarm, and starts health checks; on shutdown, cleans up instances and saves persistent registry state

## Interswarm
- **Router**: inspects recipient addresses; local vs remote routing; wraps messages into `MAILInterswarmMessage` for HTTP
- **Registry**: tracks local/remote swarms, performs health checks, stores persistent endpoints, supports env-backed auth tokens
- **Addressing**: use `agent@swarm` to target remote swarms; local addresses use just `agent`

## Files to read
- **Runtime and tools**: [src/mail/core/runtime.py](/src/mail/core/runtime.py), [src/mail/core/tools.py](/src/mail/core/tools.py)
- **HTTP Server**: [src/mail/server.py](/src/mail/server.py)
- **Interswarm types**: [src/mail/net/types.py](/src/mail/net/types.py)
- **Router and registry**: [src/mail/net/router.py](/src/mail/net/router.py), [src/mail/net/registry.py](/src/mail/net/registry.py)
- **Message types**: [src/mail/core/message.py](/src/mail/core/message.py)


===== End of `docs/architecture.md` =====

===== `docs/manual-mail-game-guide.md` =====

# Building Games with Manual MAIL Stepping

A comprehensive guide for creating multi-agent games using the MAIL framework's manual stepping mode. This document is designed to give another Claude (or developer) everything they need to create their own game.

---

## Table of Contents

1. [What is MAIL?](#what-is-mail)
2. [Manual Mode vs Continuous Mode](#manual-mode-vs-continuous-mode)
3. [Core Architecture](#core-architecture)
4. [The Mafia Game: A Complete Example](#the-mafia-game-a-complete-example)
5. [Step-by-Step: Building Your Own Game](#step-by-step-building-your-own-game)
6. [API Reference](#api-reference)
7. [Common Patterns](#common-patterns)
8. [Tips and Best Practices](#tips-and-best-practices)

---

## What is MAIL?

**MAIL (Multi-Agent Interface Layer)** is a framework for orchestrating communication between multiple AI agents. Each agent:
- Has its own LLM-backed "brain"
- Maintains its own conversation history
- Can send messages to other agents
- Can use tools/actions to affect game state
- Operates within a **swarm** (a collection of agents working together)

The key insight: MAIL manages message routing, agent histories, and tool execution so you can focus on game logic.

---

## Manual Mode vs Continuous Mode

MAIL has two execution modes:

### Continuous Mode (Default)
```python
await swarm.run_continuous(mode="continuous")
```
- Agents autonomously process messages from a queue
- Agents decide when to respond and to whom
- Good for open-ended multi-agent conversations
- You submit a message and wait for task completion

### Manual Mode (For Games)
```python
await swarm.run_continuous(mode="manual")
```
- **You control exactly which agent speaks and when**
- You specify who receives the response (broadcast vs. private)
- You inject context/instructions via payloads
- Perfect for turn-based games where you need deterministic flow

**Why manual mode for games?** Games have structured phases (night, day, voting). You need to:
- Control who acts when (e.g., Doctor acts before Mafia)
- Send private messages to specific players
- Inject phase-specific instructions
- Accumulate messages in buffers before prompting agents

---

## Core Architecture

### Key Classes

```
MAILSwarmTemplate  →  MAILSwarm  →  MAILRuntime
      ↓                   ↓              ↓
   (config)         (instantiated)   (message queue,
                                      agent histories,
                                      manual stepping)
```

#### 1. MAILAgentTemplate / MAILAgent
Defines an agent's configuration:
- `name`: Agent identifier
- `factory`: Function that creates the LLM-backed agent
- `comm_targets`: List of agents this agent can communicate with
- `actions`: Tools/actions the agent can use
- `agent_params`: LLM config (model, system prompt, etc.)
- `can_complete_tasks`: If True, agent can end a task
- `enable_entrypoint`: If True, can receive initial messages

#### 2. MAILSwarmTemplate / MAILSwarm
Groups agents together:
- `agents`: List of agent templates
- `entrypoint`: Default agent for incoming messages
- `actions`: All actions available in the swarm

#### 3. MAILAction
Defines tools agents can use:
- `name`: Tool name
- `description`: What the tool does
- `parameters`: JSON schema for arguments
- `function`: Async function to execute

#### 4. Game Class (Your Custom Code)
Your game state manager that:
- Tracks game state (phases, players, votes, etc.)
- Provides tool callbacks (actions that modify state)
- Orchestrates the game loop using `manual_step`

---

## The Mafia Game: A Complete Example

The Mafia implementation demonstrates every key concept. Let's break it down:

### File Structure
```
mail/examples/mafia/
├── game.py           # Game state + orchestration loop
├── narrator_tools.py # Actions for the Narrator agent
├── prompts.py        # System prompts for agents
├── roles.py          # Role definitions (Mafia, Doctor, etc.)
└── personas.py       # Character personalities
```

### How It Works

#### 1. Game Initialization

```python
@dataclass
class Game:
    players: list[Agent] = field(default_factory=list)
    _swarm: MAILSwarm | None = None
    task_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    phase: GamePhase = GamePhase.SETUP
    # ... more state fields

    @staticmethod
    def create(n: int, valid_llms: list[str] | None = None) -> "Game":
        # 1. Calculate roles based on player count
        roles = calculate_roles(n)

        # 2. Create player agents with personas and roles
        players = []
        for role in roles:
            players.append(Agent(
                persona=random_persona,
                role=role,
                llm="openai/gpt-5-mini"
            ))

        game = Game(players=players)
        player_names = [p.persona.name for p in players]

        # 3. Build agent templates
        agents = [player.build_agent_template() for player in players]
        agents.append(build_narrator_template(game, player_names))

        # 4. Create and instantiate swarm
        template = build_agent_swarm(agents)
        swarm = template.instantiate({"user_token": "dummy"}, "MafiaGame")
        game._swarm = swarm

        # 5. Start in MANUAL mode!
        asyncio.create_task(swarm.run_continuous(mode="manual"))

        return game
```

Key insight: The swarm runs in the background with `mode="manual"`. It doesn't process messages automatically; it waits for `manual_step` calls.

#### 2. The manual_step Function

This is the heart of manual mode. Here's how it works:

```python
async def manual_step(
    self,
    task_id: str,           # Identifies the game session
    target: str,            # Agent to prompt
    response_targets: list[str] | None = None,  # Who receives response
    response_type: Literal["broadcast", "response", "request"] = "broadcast",
    payload: str | None = None,    # Instructions for the agent
    dynamic_ctx_ratio: float = 0.0,  # Context compression (0-1)
    _llm: str | None = None,       # Override LLM for this step
    _system: str | None = None,    # Override system prompt
) -> MAILMessage:
```

What happens internally:
1. Waits for message queue to be empty
2. Collects buffered messages for this agent
3. Formats them into public/private message format
4. Appends your payload as additional context
5. Sends to the target agent
6. Agent generates response using its LLM
7. Response is routed based on `response_targets`
8. Returns the response message

#### 3. Stepping Agents in the Game

The game provides wrapper methods:

```python
async def step_narrator(self, payload: str = "") -> MAILMessage:
    """Step the narrator with a broadcast response."""
    await self.swarm.await_queue_empty()

    response = await self.swarm.manual_step(
        task_id=self.task_id,
        target="Narrator",
        response_targets=["all"],      # Everyone hears this
        response_type="broadcast",
        payload=payload,               # Phase-specific instructions
        dynamic_ctx_ratio=0.75,        # Compress context to save tokens
        _llm=self.narrator_llm,
        _system=create_narrator_system_prompt(),
    )
    return response

async def step_agent(
    self,
    agent_name: str,
    broadcast: bool = False,
    targets: list[str] | None = None,
    payload: str = "",
) -> MAILMessage:
    """Step a player agent."""
    a = self.get_player_by_name(agent_name)

    if broadcast:
        # Public message - everyone hears
        response = await self.swarm.manual_step(
            task_id=self.task_id,
            target=agent_name,
            response_targets=["all"],
            response_type="broadcast",
            payload=payload,
            _llm=a.llm,
            _system=create_agent_system_prompt(a.persona, a.role),
        )
    else:
        # Private message - only specified targets hear
        response_targets = targets or ["Narrator"]
        response = await self.swarm.manual_step(
            task_id=self.task_id,
            target=agent_name,
            response_targets=response_targets,
            response_type="response",
            payload=payload,
            _llm=a.llm,
            _system=create_agent_system_prompt(a.persona, a.role),
        )
    return response
```

#### 4. Game Loop Example: Night Phase

```python
async def run_night_phase(self) -> None:
    self.phase = GamePhase.NIGHT

    # 1. Narrator announces night
    await self.step_narrator(payload=f"""
=== NIGHT {self.day_number} ===
The night falls. Prompt the Doctor to choose who to protect.
""")

    # 2. Doctor acts (private to Narrator)
    if doctor:
        await self.step_agent(
            doctor.persona.name,
            broadcast=False,
            targets=["Narrator"],
            payload="""
[PRIVATE - Only the Narrator sees this]
Choose one player to protect tonight.
Your response must end with: "I protect [player_name]"
""",
        )

        # 3. Narrator processes Doctor's choice (uses tool)
        await self.step_narrator(payload="""
Use the doctor_protect tool to record the doctor's choice.
Then prompt the Detective.
""")

    # 4. Detective acts (private)
    if detective:
        await self.step_agent(
            detective.persona.name,
            broadcast=False,
            targets=["Narrator"],
            payload="[PRIVATE] Choose one player to investigate...",
        )

        # Narrator uses detective_investigate tool
        await self.step_narrator(payload="Use detective_investigate tool...")

    # 5. Mafia members vote (each votes privately)
    for mafia in self.get_mafia_members():
        await self.step_agent(
            mafia.persona.name,
            broadcast=False,
            targets=["Narrator"] + other_mafia,  # Mafia see each other
            payload="Vote for who to kill...",
        )

    # 6. Narrator records all mafia votes
    await self.step_narrator(payload="Use mafia_vote_kill for each vote...")

    # 7. Resolve night actions
    self.resolve_night_actions()
```

#### 5. Defining Actions (Tools)

Actions let agents affect game state. Here's how Mafia defines them:

```python
# narrator_tools.py

class DoctorProtectArgs(BaseModel):
    """Record the doctor's protection target."""
    target_name: str = Field(description="Player to protect")

async def doctor_protect(game: "Game", args: dict) -> str:
    """Called when Narrator uses doctor_protect tool."""
    target = args["target_name"]
    game.protected_player = target  # Modify game state!
    return f"Doctor protected {target} for the night"

# Create action from Pydantic model
def get_narrator_actions(game: "Game") -> list[MAILAction]:
    return [
        MAILAction.from_pydantic_model(
            model=DoctorProtectArgs,
            function=partial(doctor_protect, game),  # Curry the game
            name="doctor_protect",
        ),
        # ... more actions
    ]
```

The Narrator agent template includes these actions:
```python
def build_narrator_template(game: "Game", player_names: list[str]) -> MAILAgentTemplate:
    actions = get_narrator_actions(game)  # Actions that modify game state

    return MAILAgentTemplate(
        name="Narrator",
        factory=base_agent_factory,
        comm_targets=player_names,
        actions=actions,  # Narrator can use these tools
        agent_params={
            "llm": "openai/gpt-5-mini",
            "system": create_narrator_system_prompt(),
            # ...
        },
        can_complete_tasks=True,
    )
```

---

## Step-by-Step: Building Your Own Game

### Step 1: Define Your Game State

```python
from dataclasses import dataclass, field
from enum import Enum

class GamePhase(Enum):
    SETUP = "setup"
    PLAYER_TURN = "player_turn"
    CHALLENGE = "challenge"
    RESOLUTION = "resolution"
    GAME_OVER = "game_over"

@dataclass
class MyGame:
    players: list["Player"] = field(default_factory=list)
    current_player_idx: int = 0
    phase: GamePhase = GamePhase.SETUP
    _swarm: MAILSwarm | None = None
    task_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    # Game-specific state
    scores: dict[str, int] = field(default_factory=dict)
    current_challenge: str | None = None
```

### Step 2: Define Player/Agent Structure

```python
@dataclass
class Player:
    name: str
    personality: str
    llm: str = "openai/gpt-5-mini"

    def build_agent_template(self) -> MAILAgentTemplate:
        system_prompt = f"""You are {self.name}. {self.personality}

Play the game strategically while staying in character."""

        return MAILAgentTemplate(
            name=self.name,
            factory=base_agent_factory,
            comm_targets=["GameMaster"],  # Can talk to GM
            actions=[],  # Players have no special actions
            agent_params={
                "llm": self.llm,
                "system": system_prompt,
                "user_token": "dummy",
                "use_proxy": False,
            },
            enable_entrypoint=True,
            can_complete_tasks=True,
        )
```

### Step 3: Define Game Master Actions

```python
from pydantic import BaseModel, Field
from functools import partial

class ScorePointsArgs(BaseModel):
    """Award points to a player."""
    player_name: str = Field(description="Player to award points to")
    points: int = Field(description="Number of points to award")

async def score_points(game: "MyGame", args: dict) -> str:
    player = args["player_name"]
    points = args["points"]
    game.scores[player] = game.scores.get(player, 0) + points
    return f"Awarded {points} points to {player}. Total: {game.scores[player]}"

class SetChallengeArgs(BaseModel):
    """Set the current challenge."""
    challenge: str = Field(description="The challenge description")

async def set_challenge(game: "MyGame", args: dict) -> str:
    game.current_challenge = args["challenge"]
    return f"Challenge set: {args['challenge']}"

def get_gamemaster_actions(game: "MyGame") -> list[MAILAction]:
    return [
        MAILAction.from_pydantic_model(
            model=ScorePointsArgs,
            function=partial(score_points, game),
            name="score_points",
        ),
        MAILAction.from_pydantic_model(
            model=SetChallengeArgs,
            function=partial(set_challenge, game),
            name="set_challenge",
        ),
    ]
```

### Step 4: Create the Game Master Agent

```python
def build_gamemaster_template(
    game: "MyGame",
    player_names: list[str]
) -> MAILAgentTemplate:
    system = """You are the Game Master. You:
- Run the game fairly and create engaging challenges
- Use your tools to set challenges and award points
- Keep the game moving and entertaining

Available tools:
- score_points(player_name, points): Award points
- set_challenge(challenge): Set a new challenge
"""

    actions = get_gamemaster_actions(game)

    return MAILAgentTemplate(
        name="GameMaster",
        factory=base_agent_factory,
        comm_targets=player_names,
        actions=actions,
        agent_params={
            "llm": "openai/gpt-5-mini",
            "system": system,
            "user_token": "dummy",
            "use_proxy": False,
        },
        enable_entrypoint=True,
        can_complete_tasks=True,
    )
```

### Step 5: Build the Swarm

```python
def build_swarm(agents: list[MAILAgentTemplate]) -> MAILSwarmTemplate:
    # Collect all actions from agents
    actions = []
    for agent in agents:
        actions.extend(agent.actions)

    return MAILSwarmTemplate(
        name="my_game",
        agents=agents,
        actions=actions,
        entrypoint=agents[0].name,  # GameMaster is entrypoint
        enable_interswarm=False,
    )
```

### Step 6: Initialize the Game

```python
@staticmethod
def create(player_configs: list[dict]) -> "MyGame":
    game = MyGame()

    # Create players
    for config in player_configs:
        game.players.append(Player(
            name=config["name"],
            personality=config["personality"],
        ))

    # Build agent templates
    player_names = [p.name for p in game.players]
    agents = [p.build_agent_template() for p in game.players]
    agents.insert(0, build_gamemaster_template(game, player_names))

    # Create swarm
    template = build_swarm(agents)
    swarm = template.instantiate({"user_token": "dummy"}, "MyGame")

    # Start in MANUAL mode
    asyncio.create_task(swarm.run_continuous(mode="manual"))

    game._swarm = swarm
    return game
```

### Step 7: Implement Stepping Helpers

```python
async def step_gamemaster(self, payload: str = "") -> MAILMessage:
    await self.swarm.await_queue_empty()
    return await self.swarm.manual_step(
        task_id=self.task_id,
        target="GameMaster",
        response_targets=["all"],
        response_type="broadcast",
        payload=payload,
    )

async def step_player(
    self,
    player_name: str,
    private: bool = False,
    payload: str = ""
) -> MAILMessage:
    await self.swarm.await_queue_empty()

    if private:
        targets = ["GameMaster"]
        resp_type = "response"
    else:
        targets = ["all"]
        resp_type = "broadcast"

    return await self.swarm.manual_step(
        task_id=self.task_id,
        target=player_name,
        response_targets=targets,
        response_type=resp_type,
        payload=payload,
    )
```

### Step 8: Implement the Game Loop

```python
async def run(self) -> str:
    """Main game loop."""
    # Setup phase
    await self.start_game()

    # Game rounds
    while not self.is_game_over():
        await self.run_round()

    # Announce winner
    return await self.announce_winner()

async def start_game(self):
    self.phase = GamePhase.SETUP

    # Initialize scores
    for player in self.players:
        self.scores[player.name] = 0

    # Send initial message to create task
    player_names = [p.name for p in self.players]
    init_msg = self.swarm.build_message(
        subject="::init::",
        body=f"Game starting with: {', '.join(player_names)}",
        targets=["all"],
        sender_type="user",
        type="broadcast",
        task_id=self.task_id,
    )
    await self.swarm.submit_message_nowait(init_msg)

    # GM welcomes players
    await self.step_gamemaster(payload=f"""
Welcome the players and explain the game rules.
Players: {', '.join(player_names)}
""")

async def run_round(self):
    # 1. GM sets a challenge
    self.phase = GamePhase.CHALLENGE
    await self.step_gamemaster(payload="""
Use set_challenge to create a new challenge for this round.
Then announce it to the players.
""")

    # 2. Each player responds
    self.phase = GamePhase.PLAYER_TURN
    for player in self.players:
        await self.step_player(
            player.name,
            private=False,
            payload=f"""
The challenge is: {self.current_challenge}
Give your response!
""",
        )

    # 3. GM evaluates and scores
    self.phase = GamePhase.RESOLUTION
    await self.step_gamemaster(payload=f"""
Evaluate each player's response to: {self.current_challenge}
Use score_points to award points based on creativity and effort.
Current scores: {self.scores}
""")

def is_game_over(self) -> bool:
    return max(self.scores.values(), default=0) >= 10

async def announce_winner(self) -> str:
    self.phase = GamePhase.GAME_OVER
    winner = max(self.scores, key=self.scores.get)

    await self.step_gamemaster(payload=f"""
The game is over! {winner} wins with {self.scores[winner]} points!
Give a dramatic conclusion and congratulate everyone.
Final scores: {self.scores}
""")

    return winner
```

### Step 9: Run the Game

```python
async def main():
    game = MyGame.create([
        {"name": "Alice", "personality": "Witty and competitive"},
        {"name": "Bob", "personality": "Laid-back but strategic"},
        {"name": "Charlie", "personality": "Enthusiastic and creative"},
    ])

    winner = await game.run()
    print(f"Winner: {winner}")

if __name__ == "__main__":
    asyncio.run(main())
```

---

## API Reference

### MAILSwarm.manual_step()

```python
async def manual_step(
    task_id: str,
    target: str,
    response_targets: list[str] | None = None,
    response_type: Literal["broadcast", "response", "request"] = "broadcast",
    payload: str | None = None,
    dynamic_ctx_ratio: float = 0.0,
    _llm: str | None = None,
    _system: str | None = None,
) -> MAILMessage
```

| Parameter | Description |
|-----------|-------------|
| `task_id` | Unique identifier for this game session |
| `target` | Name of the agent to prompt |
| `response_targets` | List of agents to receive the response. Use `["all"]` for broadcast |
| `response_type` | `"broadcast"` (to all), `"response"` (to specific targets), `"request"` (for delegation) |
| `payload` | Additional context/instructions appended to agent's input |
| `dynamic_ctx_ratio` | Compress context to this ratio (0.0 = no compression, 0.75 = aggressive) |
| `_llm` | Override the agent's LLM for this step |
| `_system` | Override the agent's system prompt for this step |

### MAILSwarm.build_message()

```python
def build_message(
    subject: str,
    body: str,
    targets: list[str],
    sender_type: Literal["admin", "agent", "user"] = "user",
    type: Literal["request", "response", "broadcast", "interrupt"] = "request",
    task_id: str | None = None,
) -> MAILMessage
```

### MAILSwarm.submit_message_nowait()

```python
async def submit_message_nowait(message: MAILMessage) -> None
```
Submits a message to the swarm without waiting for processing. Useful for initialization.

### MAILSwarm.await_queue_empty()

```python
async def await_queue_empty() -> None
```
Waits until all pending messages are processed. Call before `manual_step`.

### MAILAction.from_pydantic_model()

```python
@staticmethod
def from_pydantic_model(
    model: type[BaseModel],
    function: Callable,
    name: str | None = None,
    description: str | None = None,
) -> MAILAction
```

---

## Common Patterns

### Pattern 1: Public vs Private Communication

```python
# Public - everyone hears
await self.swarm.manual_step(
    target="Alice",
    response_targets=["all"],
    response_type="broadcast",
    payload="Share your thoughts with everyone",
)

# Private - only GameMaster hears
await self.swarm.manual_step(
    target="Alice",
    response_targets=["GameMaster"],
    response_type="response",
    payload="[PRIVATE] Tell me your secret strategy",
)

# Group private - only specified agents hear
await self.swarm.manual_step(
    target="Alice",
    response_targets=["Bob", "Charlie"],
    response_type="broadcast",
    payload="[TEAM ONLY] Discuss strategy with your teammates",
)
```

### Pattern 2: Structured Response Requests

```python
# Force specific response format
await self.step_player(
    player_name,
    payload="""
Choose your action for this turn.

Your response MUST end with one of:
- "I choose: ATTACK"
- "I choose: DEFEND"
- "I choose: HEAL"
""",
)
```
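
On the game side, a forced format like this is usually paired with a small parser so the loop can detect malformed responses and re-prompt. `parse_choice` below is a hypothetical helper, not part of MAIL:

```python
import re

def parse_choice(response: str) -> str:
    """Extract the declared action; raise so the game loop can re-prompt."""
    match = re.search(r'I choose:\s*(ATTACK|DEFEND|HEAL)\s*"?\s*$', response.strip())
    if match is None:
        raise ValueError("response did not end with a valid choice")
    return match.group(1)

assert parse_choice('After weighing my options... "I choose: DEFEND"') == "DEFEND"
```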

### Pattern 3: Injecting Game State

```python
# Give agent current state
await self.step_gamemaster(payload=f"""
=== ROUND {self.round_number} ===

Current standings:
{self.format_scores()}

Remaining items: {self.remaining_items}

Decide who should go next and set the next challenge.
""")
```

### Pattern 4: Tool Result Processing

```python
# Step agent with tools, process results
await self.step_gamemaster(payload="""
Use score_points to award points to the winner.
Then announce the results.
""")

# The tool modifies game state directly via callback
# You can check state after the step returns
print(f"Updated scores: {self.scores}")
```

### Pattern 5: Message Buffering

Messages sent to an agent accumulate in their buffer until they're stepped:

```python
# These all go into Alice's buffer
await self.swarm.submit_message_nowait(msg_from_bob)
await self.swarm.submit_message_nowait(msg_from_charlie)
await self.swarm.submit_message_nowait(msg_from_david)

# When we step Alice, she sees all buffered messages + payload
await self.step_player("Alice", payload="Respond to everyone above")
```

---

## Tips and Best Practices

### 1. Always await_queue_empty() before manual_step()
```python
await self.swarm.await_queue_empty()  # Ensure clean state
response = await self.swarm.manual_step(...)
```

### 2. Use payload for phase-specific instructions
The payload is your control channel. Use it to:
- Tell agents what phase they're in
- Specify required response formats
- Inject current game state
- Give role-specific secret information
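
One simple way to keep these payloads consistent across phases is a small builder that stitches the phase banner, game state, and instructions together. `build_phase_payload` is a hypothetical helper:

```python
def build_phase_payload(phase: str, state: dict, instructions: str) -> str:
    """Assemble a phase-specific payload: banner, state lines, instructions."""
    lines = [f"=== {phase.upper()} ===", ""]
    lines += [f"{key}: {value}" for key, value in state.items()]
    lines += ["", instructions]
    return "\n".join(lines)

payload = build_phase_payload("voting", {"round": 3, "alive": 5}, "Cast your vote now.")
print(payload)
```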

### 3. Use dynamic_ctx_ratio for long games
```python
# Compress to 75% to save tokens in long games
await self.swarm.manual_step(
    ...,
    dynamic_ctx_ratio=0.75,
)
```

### 4. Override system prompts for special situations
```python
# Temporarily change agent behavior
await self.swarm.manual_step(
    target="Alice",
    _system="You are now being interrogated. Answer truthfully.",
    payload="What did you do last night?",
)
```

### 5. Design tools that return informative messages
```python
async def score_points(game, args):
    player, points = args["player"], args["points"]
    game.scores[player] = game.scores.get(player, 0) + points
    # The return message helps the agent understand what happened
    return f"Awarded {points} to {player}. New total: {game.scores[player]}"
```

### 6. Handle tool errors gracefully
```python
class NarratorError(Exception):
    """Tool validation error - message goes back to agent."""
    pass

async def my_tool(game, args):
    if not valid_target(args["target"]):
        raise NarratorError(f"Invalid target: {args['target']}")
    # ... rest of logic
```
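One way the game loop might route these errors is sketched below. This is illustrative, not the library's actual dispatcher, and it restates `NarratorError` so the snippet runs standalone: validation failures become an ordinary tool-result message the agent can read and correct, while unexpected exceptions still propagate to the operator.

```python
import asyncio

class NarratorError(Exception):
    """Raised by tools for recoverable validation problems."""

async def run_tool(game, tool, args):
    # NarratorError -> feedback message; anything else propagates.
    try:
        return await tool(game, args)
    except NarratorError as exc:
        return f"Tool error: {exc}"

async def vote(game, args):
    if args["target"] not in game["players"]:
        raise NarratorError(f"Invalid target: {args['target']}")
    return f"Vote recorded for {args['target']}"

game = {"players": ["Alice", "Bob"]}
ok = asyncio.run(run_tool(game, vote, {"target": "Alice"}))
bad = asyncio.run(run_tool(game, vote, {"target": "Zed"}))
```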

### 7. Use unique task_id per game session
```python
@dataclass
class Game:
    task_id: str = field(default_factory=lambda: str(uuid.uuid4()))
```

### 8. Consider adding interactive mode for debugging
```python
def _interactive_wait(self, agent_name: str, payload: str) -> str:
    if self.interactive:
        print(f"About to step: {agent_name}")
        print(f"Payload: {payload}")
        extra = input("Additional payload (or Enter to continue): ")
        return extra
    return ""
```

---

## Summary

Building games with manual MAIL stepping involves:

1. **Define game state** in a dataclass
2. **Create agent templates** for each player/role
3. **Define actions** as Pydantic models + async functions
4. **Build and instantiate** the swarm in manual mode
5. **Write stepping helpers** that wrap `manual_step`
6. **Implement the game loop** using your stepping helpers
7. **Use payload injection** to control agent behavior per phase
8. **Use response_targets** to control who hears what

The Mafia example demonstrates all these patterns in a complex, multi-phase game with hidden roles, private communication, and sophisticated state management.

---

## Quick Reference

```python
# Initialize
template = MAILSwarmTemplate(name="game", agents=[...], ...)
swarm = template.instantiate({...}, "GameID")
asyncio.create_task(swarm.run_continuous(mode="manual"))

# Send initial message
msg = swarm.build_message(subject="init", body="...", targets=["all"], ...)
await swarm.submit_message_nowait(msg)

# Step an agent
await swarm.await_queue_empty()
response = await swarm.manual_step(
    task_id="...",
    target="AgentName",
    response_targets=["all"],      # or a list of specific agent names
    response_type="broadcast",     # or "response"
    payload="Phase instructions here",
)

# Define tools
class MyToolArgs(BaseModel):
    arg1: str = Field(description="...")

async def my_tool(game: Game, args: dict) -> str:
    # Modify game state
    return "Result message"

action = MAILAction.from_pydantic_model(
    model=MyToolArgs,
    function=partial(my_tool, game),
    name="my_tool",
)
```


===== End of `docs/manual-mail-game-guide.md` =====

===== `docs/configuration.md` =====

# Configuration

This page describes the configuration surfaces for a MAIL deployment: the project-level `mail.toml`, relevant environment variables, and the `swarms.json` swarm template.

## mail.toml

`mail.toml` provides defaults for both the server and client reference implementations. The CLI, API, and configuration models read from this file the first time configuration is needed.

```toml
[server]
port = 8000
host = "0.0.0.0"
reload = false
debug = false

[server.swarm]
name = "example-no-proxy"
source = "swarms.json"
registry = "registries/example-no-proxy.json"

[server.settings]
task_message_limit = 15
print_llm_streams = true

[client]
timeout = 3600.0
verbose = false
```

- The `[server]` table controls how Uvicorn listens (`port`, `host`, `reload`) and whether debug-only integrations are exposed (`debug`).
- The `[server.swarm]` table specifies the persistent swarm template (`source`), the registry persistence file (`registry` or `registry_file`), and the runtime swarm name (`name`).
- The `[server.settings]` table exposes `task_message_limit`, which caps how many MAIL messages a task will process in continuous mode before yielding control, and `print_llm_streams`, which controls whether runtime-managed agents print LLM reasoning/response stream output to the server console. Set `print_llm_streams = false` for quieter logs (SSE event streaming is unchanged).
- The `[client]` table exposes `timeout` (seconds) and `verbose` (bool). They feed `ClientConfig`, which in turn sets the default timeout and whether the CLI/HTTP client emit debug logs.
- Instantiating `ServerConfig()` or `ClientConfig()` with no arguments uses these values as defaults; if a key is missing or the file is absent, the literal defaults above are applied.
- The CLI command `mail server` accepts `--port`, `--host`, `--reload`, `--debug`, `--swarm-name`, `--swarm-source`, `--swarm-registry`, and `--print-llm-streams true|false`. Provided flags override the file-driven defaults, while omitted flags continue to use `mail.toml` values.
- The CLI command `mail client` honors `timeout` from `[client]` and allows `--timeout` to override it per invocation.
- Set `MAIL_CONFIG_PATH` to point at an alternate `mail.toml` (for example per environment). `mail server --config /path/to/mail.toml` temporarily overrides this variable for the lifetime of the command.
- Toggle `[server].debug` (or pass `mail server --debug`) when you need the optional OpenAI-compatible `/responses` endpoint or other debug helpers exposed by the FastAPI app. Leave it `false` for production deployments to keep the surface minimal.
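As an illustration of the precedence (all paths and values here are hypothetical):

```shell
# Defaults come from mail.toml in the working directory...
uv run mail server

# ...or from an alternate file named via MAIL_CONFIG_PATH...
MAIL_CONFIG_PATH=/etc/mail/staging.toml uv run mail server

# ...while explicit flags override whatever the file provides.
uv run mail server --port 9000 --debug --print-llm-streams false
```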

## Environment variables

### Required
- `AUTH_ENDPOINT`: URL for login endpoint used by the server (Bearer API key -> temporary token)
- `TOKEN_INFO_ENDPOINT`: URL for token info endpoint (Bearer temporary token -> {role,id,api_key})

### Conditional
- `LITELLM_PROXY_API_BASE`: Base URL for your LiteLLM-compatible proxy (required only if your swarm uses `use_proxy=true`).

### Exported by the server (not config overrides)
- `SWARM_NAME`, `SWARM_SOURCE`, `SWARM_REGISTRY_FILE`, `BASE_URL` are set by `mail server` based on the active config. Change them via `mail.toml` or `mail server --swarm-name/--swarm-source/--swarm-registry`.

### Database Persistence (Optional)
- `DATABASE_URL`: PostgreSQL connection string for agent history and task persistence. Format: `postgresql://user:password@host:port/database`. When set, the runtime automatically saves and restores conversation histories, task state, and event timelines. Run `mail db-init` to create the required tables. See [database.md](./database.md) for details.

### Provider Keys
- Optional provider keys consumed by your proxy (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`)
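A minimal environment might therefore look like this (every URL and key below is a placeholder):

```shell
# Required: auth endpoints used by the server
export AUTH_ENDPOINT="https://auth.example.com/login"
export TOKEN_INFO_ENDPOINT="https://auth.example.com/token-info"

# Conditional: only if the swarm sets use_proxy=true
export LITELLM_PROXY_API_BASE="https://proxy.example.com"

# Optional: persistence and provider keys consumed by your proxy
export DATABASE_URL="postgresql://mail:secret@localhost:5432/mail"
export OPENAI_API_KEY="sk-..."
```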

## swarms.json
- Defines the persistent swarm template loaded on server startup
- Sets the entrypoint agent and the set of available agents and actions
- Agents are built via factories referenced by import path strings; prompts and actions are configured per agent
- `action_imports` lets you reuse decorated `MAILAction` objects (for example from `mail.stdlib`) without duplicating their schema in `actions`
- Optional swarm-level fields include `description`, `keywords`, `public`, `breakpoint_tools`, `exclude_tools`, and `enable_db_agent_histories`
- Agent entries may also include `exclude_tools` to hide specific MAIL tools for that agent
- A formal JSON schema for `swarms.json` can be found in [docs/swarms-schema.json](/docs/swarms-schema.json)

### Minimal example
```json
[
    {
        "name": "example",
        "version": "1.3.6",
        "entrypoint": "supervisor",
        "enable_interswarm": true,
        "agents": [
            {
                "name": "supervisor",
                "factory": "python::mail.factories.supervisor:LiteLLMSupervisorFunction",
                "comm_targets": ["weather", "math"],
                "enable_entrypoint": true,
                "can_complete_tasks": true,
                "agent_params": {
                    "llm": "openai/gpt-5-mini",
                    "system": "url::https://example.com/sysprompts/supervisor.json"
                }
            },
            {
                "name": "weather",
                "factory": "python::mail.examples.weather_dummy.agent:LiteLLMWeatherFunction",
                "comm_targets": ["supervisor", "math"],
                "actions": ["get_weather_forecast"],
                "agent_params": {
                    "llm": "openai/gpt-5-mini",
                    "system": "url::https://example.com/sysprompts/weather.json"
                }
            },
            {
                "name": "math",
                "factory": "python::mail.examples.math_dummy.agent:LiteLLMMathFunction",
                "comm_targets": ["supervisor", "weather"],
                "actions": ["calculate_expression"],
                "agent_params": {
                    "llm": "openai/gpt-5-mini",
                    "system": "url::https://example.com/sysprompts/supervisor.json"
                }
            }
        ],
        "actions": [
            {
                "name": "get_weather_forecast",
                "description": "Get the weather forecast for a given location",
                "parameters": { 
                    "type": "object",
                    "properties": {
                        "location": { "type": "string", "description": "The location to get the weather forecast for" },
                        "days_ahead": { "type": "integer", "description": "The number of days ahead to get the weather forecast for" },
                        "metric": { "type": "boolean", "description": "Whether to use metric units" }
                    }
                },
                "function": "python::mail.examples.weather_dummy.actions:get_weather_forecast"
            },
            {
                "name": "calculate_expression",
                "description": "Evaluate a basic arithmetic expression inside the math agent",
                "parameters": { 
                    "type": "object",
                    "properties": {
                        "expression": { "type": "string", "description": "Expression to evaluate" },
                        "precision": { "type": "integer", "minimum": 0, "maximum": 12, "description": "Optional number of decimal places" }
                    },
                    "required": ["expression"]
                },
                "function": "python::mail.examples.math_dummy.actions:calculate_expression"
            }
        ],
        "action_imports": [
            "python::mail.stdlib.mcp.actions:mcp_ping",
            "python::mail.stdlib.mcp.actions:mcp_list_tools"
        ]
    }
]
```

### Prefixed string references

#### python::
- `python::package.module:attribute` strings resolve to Python objects at load time; use this for reusing constants such as prompts or tool factories

#### url::
- `url::https://example.com/prompt.json` strings are fetched with `httpx` and replaced by the response JSON encoded as a string
- `url::` fetch failures return the original URL unless you set `raise_on_error` when calling `mail.utils.parsing.read_url_string`, which converts errors into descriptive `RuntimeError`s

Nested dictionaries and lists inside `agent_params` (and other configuration blocks) are resolved recursively, so you can mix plain literals with both prefix formats.
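To make the resolution concrete, here is a simplified sketch of how a `python::` reference maps to an object (not the actual loader, which also handles `url::` strings and recursive resolution of nested structures):

```python
import importlib

def resolve_python_ref(ref: str):
    """Resolve a 'python::package.module:attribute' string to the
    named Python object, as described above (simplified sketch)."""
    path = ref.removeprefix("python::")
    module_path, _, attribute = path.partition(":")
    return getattr(importlib.import_module(module_path), attribute)

# e.g. a factory-style reference pointing at a stdlib attribute
tau = resolve_python_ref("python::math:tau")
```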

### Validity of `comm_targets`
- `comm_targets` are cross-validated at parse time by `validate_swarm_from_swarms_json`: each target must reference an existing agent name or use interswarm `agent@swarm` addressing
- Typo'd targets produce error messages with fuzzy-match suggestions (e.g. "Did you mean 'analyst'?")
- `MAILSwarmTemplate._validate()` and `MAILSwarm._validate()` perform additional runtime checks with the same fuzzy-match suggestions

### Validity of `entrypoint`
- The swarm parameter `entrypoint` is required; it must reference exactly one agent with `enable_entrypoint = True` by name
- Cross-validation enforces this at parse time: if the entrypoint doesn't match any agent name, or the matching agent lacks `enable_entrypoint: true`, a clear error is raised
- Default swarm entrypoints are not automatically inferred from the swarm configuration; you must specify this default yourself
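The parse-time check amounts to something like the following sketch (the real validation lives in `validate_swarm_from_swarms_json`):

```python
def validate_entrypoint(swarm: dict) -> None:
    """Sketch of the entrypoint cross-check: the entrypoint must name
    exactly one agent, and that agent must enable the entrypoint flag."""
    agents = {a["name"]: a for a in swarm["agents"]}
    entrypoint = swarm["entrypoint"]
    if entrypoint not in agents:
        raise ValueError(f"entrypoint '{entrypoint}' matches no agent name")
    if not agents[entrypoint].get("enable_entrypoint", False):
        raise ValueError(f"agent '{entrypoint}' lacks enable_entrypoint: true")

# Passes: entrypoint names an agent with the flag set
validate_entrypoint({
    "entrypoint": "supervisor",
    "agents": [{"name": "supervisor", "enable_entrypoint": True}],
})
```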

### Versioning
- Though a `version` string is required for every swarm, the swarm builder does not strictly enforce version conformance
- It is therefore best practice to keep your swarm versions in sync with the MAIL reference implementation version you are using

### Other notes
- Actions are declared once at the swarm level and referenced by name in each agent's `actions` list; [see agents-and-tools.md](/docs/agents-and-tools.md)
- The helpers in `mail.swarms_json.utils` can be used to validate and load `swarms.json` prior to instantiating templates
- Cross-validation checks run automatically during `validate_swarm_from_swarms_json` and cover: entrypoint validity, `enable_entrypoint` / `can_complete_tasks` flags, `comm_targets` references, duplicate agent names, and agent action references


===== End of `docs/configuration.md` =====

