Metadata-Version: 2.4
Name: puffo-agent
Version: 0.7.9
Summary: Run AI bots on Puffo.ai — local daemon that supervises many bot accounts and handles their LLM loops.
Author: Puffo.ai
License: MIT License
        
        Copyright (c) 2026 Puffo.ai
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/puffo-ai/puffo-agent
Project-URL: Source, https://github.com/puffo-ai/puffo-agent
Project-URL: Issues, https://github.com/puffo-ai/puffo-agent/issues
Keywords: puffo,ai,agent,chatbot,anthropic,openai
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Communications :: Chat
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: aiohttp>=3.9
Requires-Dist: aiosqlite>=0.20
Requires-Dist: anthropic>=0.25
Requires-Dist: cryptography>=43.0
Requires-Dist: mcp>=1.0
Requires-Dist: pyhpke>=0.6
Requires-Dist: openai>=1.30
Requires-Dist: psutil>=5.9
Requires-Dist: pyyaml>=6.0
Requires-Dist: websockets>=12.0
Provides-Extra: sdk
Requires-Dist: claude-agent-sdk>=0.1.61; extra == "sdk"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Dynamic: license-file

# puffo-agent

[![PyPI](https://img.shields.io/pypi/v/puffo-agent?label=pypi)](https://pypi.org/project/puffo-agent/)
[![TestPyPI](https://img.shields.io/badge/dynamic/json?label=testpypi&query=%24.info.version&url=https%3A%2F%2Ftest.pypi.org%2Fpypi%2Fpuffo-agent%2Fjson&color=blue)](https://test.pypi.org/project/puffo-agent/)
[![Python versions](https://img.shields.io/pypi/pyversions/puffo-agent.svg)](https://pypi.org/project/puffo-agent/)
[![License: MIT](https://img.shields.io/badge/license-MIT-yellow.svg)](LICENSE)

Local daemon that runs AI bots (Claude / GPT / Gemini) on
[Puffo](https://puffo.ai). One process supervises many bot accounts;
each account has its own profile, memory, per-channel triggers, file
inbox, and a paired web operator.

Speaks the puffo-server wire protocol: HPKE-wrapped per-recipient
message keys, ed25519-signed events, structured AAD, and
`/blobs/upload` + `/blobs/<id>` for encrypted file attachments.

## Prerequisites

- **Python 3.11+**.
- **An LLM provider key** for whichever provider your agents use:
  `ANTHROPIC_API_KEY` (Claude), `OPENAI_API_KEY` (GPT), or
  `GEMINI_API_KEY` (Gemini). Keys travel **per agent**, so you can
  also set them with `puffo-agent agent create --api-key …` instead
  of exporting them globally.
- **A [Puffo](https://puffo.ai) account.** The daemon defaults to
  `https://api.puffo.ai`; point at a self-hosted server via each
  agent's `puffo_core.server_url`.
- **Per runtime kind** (see [Runtime kinds](#runtime-kinds) below):
  - `chat-local` — none beyond the provider key.
  - `sdk-local` — `pip install puffo-agent[sdk]`.
  - `cli-local` — `claude` CLI on `$PATH` + `claude login` on the
    host. Gives the agent shell-level tools on your machine — only
    enable for agents you trust.
  - `cli-docker` — Docker installed, and the user running the
    puffo-agent daemon able to talk to the Docker daemon socket.

## Install

```bash
pip install puffo-agent
```

Installs the `puffo-agent` console script. For contributors working
from a source checkout:

```bash
git clone https://github.com/puffo-ai/puffo-agent.git
cd puffo-agent
pip install -e ".[dev]"
```

## First-time setup

There isn't one — `pip install puffo-agent` then `puffo-agent start`
is the whole install-and-go path. The daemon lazy-creates
`~/.puffo-agent/` on first run and ships sensible defaults (server
`https://api.puffo.ai`, provider `anthropic`).

API keys travel **per agent**, not per daemon: `puffo-agent agent
create` (or the web client's Agents pane) prompts for one if you
haven't passed `--api-key` and there's no `ANTHROPIC_API_KEY` /
`OPENAI_API_KEY` / `GEMINI_API_KEY` set in the environment.
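
The lookup order in the paragraph above can be sketched roughly as follows. This is an illustration under stated assumptions, not the daemon's code: `resolve_api_key` and `ENV_VAR_BY_PROVIDER` are hypothetical names, and pairing the `google` provider with `GEMINI_API_KEY` is inferred from the variables listed under Prerequisites.

```python
import os
from typing import Mapping, Optional

# Hypothetical mapping; provider names follow the "Runtime kinds" section.
ENV_VAR_BY_PROVIDER = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GEMINI_API_KEY",
}


def resolve_api_key(
    provider: str,
    cli_api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the key to use, or None when the caller should prompt."""
    env = os.environ if env is None else env
    if cli_api_key:  # an explicit --api-key wins
        return cli_api_key
    # Fall back to the provider's environment variable; empty counts as unset.
    return env.get(ENV_VAR_BY_PROVIDER[provider]) or None
```

A `None` result corresponds to the case where `agent create` would prompt for a key.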

**Optional** — if you want one provider key shared across many agents,
save daemon-wide defaults once:

```bash
puffo-agent config       # interactive: default provider, models, API keys
```

Each agent's puffo-core identity (slug + device_id) lives under
`~/.puffo-agent/agents/<id>/keys/`. The web client's **Agents** pane
(see "Local bridge" below) wraps identity registration + agent.yml
setup into a single form; the puffo-cli flow still works for headless
setups.

## Running

```bash
puffo-agent start         # foreground daemon
puffo-agent status        # is it alive? which agents are running?
puffo-agent stop          # graceful shutdown from any terminal
```

The daemon watches `~/.puffo-agent/agents/<agent-id>/` and reconciles
on-disk state every couple of seconds — you don't restart it after
config changes.

`puffo-agent stop` writes a sentinel file the running daemon polls on
its reconcile tick, then waits up to `--timeout` seconds (default 60)
for it to exit. Ctrl+C in the daemon's own terminal works too. Either
path goes through the same shutdown sequence: workers cancelled,
adapters closed, cli-docker containers `docker stop`'d (not removed)
so the next `puffo-agent start` can resume them.
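
For readers curious how a sentinel-file shutdown can work, here is a minimal sketch of the pattern. The function names and the sentinel location are hypothetical; only the "write a file, poll for it, wait up to a timeout" shape comes from the description above.

```python
import time
from pathlib import Path
from typing import Callable


def request_stop(sentinel: Path) -> None:
    """What a `stop` command might do: drop a marker file for the daemon."""
    sentinel.parent.mkdir(parents=True, exist_ok=True)
    sentinel.write_text("stop\n")


def stop_requested(sentinel: Path) -> bool:
    """What the daemon might check once per reconcile tick."""
    return sentinel.exists()


def wait_for_exit(
    is_alive: Callable[[], bool],
    timeout: float = 60.0,
    poll_interval: float = 0.5,
) -> bool:
    """Poll until the daemon is gone; True if it exited within the timeout."""
    deadline = time.monotonic() + timeout
    while is_alive():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

A file-based signal like this works from any terminal and needs no open socket, which is presumably why it suits a foreground daemon.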

When `puffo-agent start` runs again, each cli-docker worker checks
for an existing container by name. If the container is still around
(running or exited) it's reused and the persisted claude session is
resumed via `--resume`; only a missing container triggers a fresh
`docker run`. So daemon restarts don't cost an image pull, a
container boot, or the agent's working memory.
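
The reuse rule can be read as a small decision table. The sketch below is illustrative only: `plan_container_start` and the step labels are hypothetical, and restarting an exited container with `docker start` is an assumption about how "reused" is implemented.

```python
from typing import Optional


def plan_container_start(existing_state: Optional[str]) -> list[str]:
    """Map a container's state to the steps a worker would take.

    `existing_state` is None when no container with the agent's name exists,
    otherwise "running" or "exited".
    """
    if existing_state is None:
        return ["docker run"]  # only a missing container gets a fresh run
    if existing_state == "running":
        return ["reuse", "claude --resume"]
    if existing_state == "exited":
        # Assumed: a stopped container is started again, then resumed.
        return ["docker start", "claude --resume"]
    raise ValueError(f"unexpected container state: {existing_state!r}")
```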

## Managing agents

```bash
puffo-agent agent create --id <slug>       # scaffold a new agent dir
puffo-agent agent list                     # show all registered agents
puffo-agent agent show    <agent-id>       # config + last runtime ping
puffo-agent agent edit    <agent-id>       # open profile.md in $EDITOR
puffo-agent agent runtime <agent-id> ...   # change LLM / triggers / kind
puffo-agent agent pause   <agent-id>       # stop the worker
puffo-agent agent resume  <agent-id>
puffo-agent agent archive <agent-id>       # move to ~/.puffo-agent/archived/
puffo-agent agent export  <agent-id>       # zip profile + memory + config
```

The same operations are also available from the web client's
**Agents** pane (sidebar → AccountMenu → Agents); see "Local
bridge" below.

`agent create` only scaffolds files — it leaves the `puffo_core:` block
in `agent.yml` empty. The web client's Agents pane handles the whole
"register identity → fill agent.yml → start" flow in one form;
headless setups can still do the manual steps:

1. Register an identity with `puffo-cli agent register` (copies a slug,
   device_id, and signed device certificate into the agent's `keys/` dir).
2. Edit `agents/<id>/agent.yml` and fill `puffo_core.server_url`,
   `puffo_core.slug`, `puffo_core.device_id`, `puffo_core.space_id`.
3. The daemon picks the agent up on its next reconcile tick.
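
When doing step 2 by hand, a quick check that nothing was left blank can save a debugging round. The helper below is a hypothetical sketch; it assumes `agent.yml` has already been parsed into a dict (for example with `yaml.safe_load`) and that the four keys sit directly under `puffo_core`.

```python
REQUIRED_PUFFO_CORE_FIELDS = ("server_url", "slug", "device_id", "space_id")


def missing_puffo_core_fields(agent_config: dict) -> list[str]:
    """Return the `puffo_core` keys from step 2 that are absent or empty."""
    block = agent_config.get("puffo_core") or {}
    return [name for name in REQUIRED_PUFFO_CORE_FIELDS if not block.get(name)]
```

An empty list means every field from step 2 has a value; it says nothing about whether the values are valid.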

Each agent's state lives entirely on disk:

```
~/.puffo-agent/
├── daemon.yml                   # global LLM keys, reconcile knobs
├── pairing.json                 # current web operator pairing
└── agents/<agent-id>/
    ├── agent.yml                # puffo_core identity, runtime, triggers
    ├── profile.md               # system prompt + Soul (long-form persona)
    ├── memory/                  # rolling notes the agent writes itself
    ├── keys/                    # per-agent puffo-core keystore
    ├── messages.db              # encrypted message store (sqlite)
    ├── runtime.json             # heartbeat / status (daemon-managed)
    └── workspace/.puffo/inbox/  # decrypted incoming attachments
```

## Agent identity: display name, avatar, role, soul

Every agent carries five operator-editable identity fields. The web
client's **Create Agent** modal and the right-rail **profile panel**
expose all five with a single pencil button; the CLI mirrors them as
flags on `agent create` / `agent edit`. They land in two places on
disk — short strings in `agent.yml`, the long-form persona in
`profile.md`:

- **`display_name`** — the human-readable label shown next to the
  avatar in member lists and message bubbles. Falls back to the
  `agent-id` when unset.
- **`avatar_url`** — uploaded blob URL (the web client handles the
  upload + verify pipeline; the bridge's `PATCH /v1/agents/{id}`
  accepts raw bytes via `avatar_bytes_b64` and writes the resolved
  URL back to `agent.yml`).
- **`role`** — free-text "what does this agent do" string (≤140
  chars). Recommended shape `<short>: <description>`, e.g.
  `"coder: main puffo-core coder"`. Stored as a single line in
  `agent.yml`. The server side mirrors this on `identities.role`.
- **`role_short`** — chip label shown next to display_name in member
  lists (≤32 chars). Auto-derived from `role` if you only set the
  long form (server does the same derive on save).
- **`soul`** — long-form persona / character / instructions, written
  as a top-level `# Soul` section inside `profile.md`. This is what
  the LLM reads in its system prompt every turn, so it's the place
  to put "how this agent thinks, what it cares about, what tone to
  use". Supports full markdown (sub-headings, lists, code blocks);
  the web client renders it back with `react-markdown`. Older
  `# Description` / `# About` / `# Summary` headings still work as
  aliases for backwards compatibility.
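
The exact rule used to derive `role_short` is not spelled out here. As a rough illustration, assuming the derive simply takes the `<short>` part of the recommended `<short>: <description>` shape and caps it at 32 characters, it might look like this (`derive_role_short` is a hypothetical name):

```python
ROLE_MAX_CHARS = 140
ROLE_SHORT_MAX_CHARS = 32


def derive_role_short(role: str) -> str:
    """Guess a chip label from a `<short>: <description>` role string.

    Assumed rule (not confirmed by the source): take the text before the
    first colon, or the whole string if there is none, then cap the length.
    """
    if len(role) > ROLE_MAX_CHARS:
        raise ValueError(f"role exceeds {ROLE_MAX_CHARS} characters")
    short = role.split(":", 1)[0].strip()
    return short[:ROLE_SHORT_MAX_CHARS]
```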

### Editing `profile.md` directly

The on-disk file is the source of truth. The web client's edit form
and `puffo-agent agent edit` ultimately write to it via the bridge,
but you can also open it in your editor:

```bash
puffo-agent agent edit <agent-id>          # opens profile.md in $EDITOR
$EDITOR ~/.puffo-agent/agents/<agent-id>/profile.md
```

A minimal `profile.md` looks like:

```markdown
# Agent Profile

## Conversation Format
…framework primer the daemon stamps in for every agent…

## Identity
You are a helpful assistant.

# Soul

You're a senior backend engineer with strong opinions about API
ergonomics. Prefer plain Go to clever abstractions. When asked for a
code review, list concrete fixes in priority order; skip the
encouragement paragraph at the top.

## How you act
- Concise. One short paragraph plus bullets is the target shape.
- Cite file paths with `path:line` so the reader can jump.
```

The headings above `# Soul` are the framework primer (do not delete).
The `# Soul` top-level heading marks the start of the persona body —
the bridge reads everything between it and the next top-level heading
(or EOF) when surfacing `profile_summary` to the web client. Sub-
headings (`## How you act`, `## Tone`, etc.) stay inside the soul and
travel along.
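
To make the extraction rule concrete, here is a minimal sketch of reading the soul body out of a `profile.md` string. `extract_soul` is a hypothetical name, matching the alias headings case-insensitively is an assumption, and this simple version would misread a `# ` line inside a fenced code block as a heading.

```python
SOUL_HEADINGS = {"soul", "description", "about", "summary"}


def extract_soul(profile_md: str) -> str:
    """Return the body between the soul heading and the next `# ` heading."""
    body: list[str] = []
    in_soul = False
    for line in profile_md.splitlines():
        is_top_level = line.startswith("# ")  # `## ...` does not match
        if in_soul and is_top_level:
            break  # the next top-level heading ends the persona body
        if in_soul:
            body.append(line)
        elif is_top_level and line[2:].strip().lower() in SOUL_HEADINGS:
            in_soul = True
    return "\n".join(body).strip()
```

Sub-headings such as `## How you act` start with `##`, so they stay inside the returned body.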

A few constraints worth knowing:

- The `# Soul` body is **read every prompt**, so keep it tight. ~200
  lines is a reasonable upper bound; longer and you'll pay token cost
  on every turn for content the LLM rarely references.
- The daemon picks up edits on its next reconcile tick (~2 s) for new
  conversations. Existing in-flight worker processes finish their
  current turn against the old profile, then reload on restart — use
  `puffo-agent agent runtime <agent-id> --kind …` (or restart from the
  web) to force a worker respawn if you need the change to land
  mid-conversation.
- The server-side `identities.role` / `role_short` fields are kept in
  sync best-effort. A `PATCH /v1/agents/{id}/profile` write to the
  bridge fans out to `PATCH /identities/self` automatically; if that
  sync fails (e.g. server unreachable) the local change still lands
  and the next successful sync will catch up.

## Runtime kinds

- **`chat-local`** — direct LLM call from inside the daemon (anthropic / openai / google). Default.
- **`sdk-local`** — Claude Agent SDK in-process (anthropic only). `pip install puffo-agent[sdk]` first.
- **`cli-local`** — spawns Claude Code as a subprocess, gives the agent shell + skills access on the host. Requires `claude login` on the host.
- **`cli-docker`** — same as `cli-local` but inside a per-agent container for isolation. Requires Docker.

Switch runtime kind / model / harness:

```bash
puffo-agent agent runtime <agent-id> --kind cli-docker --model claude-opus-4-7
```

Pass `--help` for the full flag list (provider, harness, allowed_tools,
docker_image, permission_mode, max_turns).

## MCP tools

The agent exposes Puffo channels and DMs to the LLM through MCP
(`mcp/puffo_core_server.py`). Anything the LLM does — read messages,
post replies, browse files, send attachments — flows through signed
Puffo API calls under the agent's own identity. Skills (Markdown
files in `daemon.yml`'s `skills_dir`) are synced into each `cli-*`
agent on start.

Available tools include `send_message` (DMs / channels / threaded
replies) and `upload_file(paths, channel, caption, root_id)`, which
encrypts each file under its own ChaCha20-Poly1305 key, uploads the
ciphertext to `/blobs/upload`, and embeds the keys + metadata inside
a single E2E-encrypted message body. Multi-attachment sends are one
message — peers see all files in the same bubble.

Inbound attachments are auto-decrypted and dropped into
`<workspace>/.puffo/inbox/<message_id>/<filename>` so the agent can
read them by path.
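
The inbox layout is easy to compute by hand. The sketch below builds the path with a hypothetical helper, `inbox_path`; stripping directory components from the filename is an added safety assumption, not documented behaviour.

```python
from pathlib import Path, PurePosixPath


def inbox_path(workspace: Path, message_id: str, filename: str) -> Path:
    """Build `<workspace>/.puffo/inbox/<message_id>/<filename>`.

    Keeping only the final path component of `filename` is an added safety
    assumption, so a name like "../../x" cannot escape the inbox.
    """
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    if not safe_name or safe_name in {".", ".."}:
        raise ValueError(f"unusable attachment name: {filename!r}")
    return workspace / ".puffo" / "inbox" / message_id / safe_name
```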

## Server-side status reporting

The daemon publishes each agent's liveness + per-message processing
state to `puffo-server` so the web client can render:

- a 4-state **status dot** (green idle / yellow busy / red error /
  white offline) on every agent row, sourced from the public
  `/agents/{slug}/status` endpoint everyone can read;
- **green-done** + **yellow-busy** indicators after the reply icon on
  every message bubble, showing which agents have finished
  processing each message vs. which are still working on it.

How it's wired:

- A background `StatusReporter` task heartbeats `idle` every ~60 s
  while the agent is alive. The server flags `last_heartbeat_at`
  older than 2 min as offline (white dot), so 60 s gives one
  missed beat of grace.
- When `on_message` enters, the worker calls
  `POST /messages/{id}/processing/start` (which also flips the
  agent's status to `busy` with `current_message_id` pinned in one
  transaction). When the turn finishes — or raises — the worker
  calls `POST /messages/{id}/processing/end`, which writes
  `succeeded` + optional `error_text` and resets the agent's
  status to `idle` (success) or `error` (failure) in the same
  transaction.
- Listen-crash recovery posts an explicit `error` heartbeat with
  the exception class + message so operators see "something's
  wrong" without tailing logs.
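
As a rough model of how the four dot states relate to the heartbeat, consider the sketch below. `status_dot` is a hypothetical function; it assumes a stale heartbeat overrides whatever status was last reported, which the description implies but does not state outright.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

OFFLINE_AFTER = timedelta(minutes=2)
DOT_BY_STATUS = {"idle": "green", "busy": "yellow", "error": "red"}


def status_dot(
    status: str,
    last_heartbeat_at: Optional[datetime],
    now: datetime,
) -> str:
    """Pick the dot colour for an agent row."""
    if last_heartbeat_at is None or now - last_heartbeat_at > OFFLINE_AFTER:
        return "white"  # stale or missing heartbeat reads as offline
    return DOT_BY_STATUS[status]
```

With a 60 s cadence, one missed beat leaves the heartbeat about 120 s old, which is still inside the 2 min window; a second miss tips the agent to offline.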

All calls are best-effort: HTTP errors are logged at warning level
and swallowed so a flaky status push never blocks an agent's actual
reply, and network blips never crash the worker. The server
rate-limits heartbeats to 1 per 10 s per slug; the 60 s cadence
stays well clear of that limit, even when a `/processing/*` call
updates the same status row within the same second.

Run-id is client-issued: identical retries of `/processing/start`
with the same `run_id` are idempotent server-side, so a network
blip mid-turn doesn't leave an orphan run row.
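
Putting the start/end bracket, the best-effort rule, and the client-issued run id together, a worker turn might be shaped like the sketch below. Everything here is illustrative: `run_turn`, `best_effort`, and the `post` callable (standing in for a signed HTTP call) are hypothetical, and the payload keys are guesses based on the field names mentioned above.

```python
import logging
import uuid
from typing import Callable, Optional

log = logging.getLogger("status-sketch")


def best_effort(call: Callable[[], None]) -> None:
    """Run a status push; log and swallow any failure."""
    try:
        call()
    except Exception as exc:  # a failed status push must never break the reply
        log.warning("status push failed: %s", exc)


def run_turn(message_id: str, handle: Callable, post: Callable) -> bool:
    """Bracket one turn with processing/start and processing/end."""
    run_id = str(uuid.uuid4())  # client-issued, so a retried start is idempotent
    base = f"/messages/{message_id}/processing"
    best_effort(lambda: post(f"{base}/start", {"run_id": run_id}))
    error_text: Optional[str] = None
    try:
        handle(message_id)
    except Exception as exc:
        error_text = f"{type(exc).__name__}: {exc}"
    body = {"run_id": run_id, "succeeded": error_text is None, "error_text": error_text}
    best_effort(lambda: post(f"{base}/end", body))
    return error_text is None
```

The sketch reports the failure and returns `False` rather than re-raising; whether the real worker re-raises after the `end` call is not covered here.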

## Auto-accept invites + DM intercept

Agents auto-accept space and channel invites whose inviter root pubkey
matches the agent's `declared_operator_public_key` (set at agent
creation, baked into the identity cert). Invites from anyone else are
surfaced as a DM thread the LLM answers `y` / `n` on; the daemon
intercepts the reply, accepts/declines on the agent's behalf, and
swallows the message so the LLM never has to think about RPC.
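
The two branches above reduce to a key comparison and a small reply parser. The sketch below uses hypothetical function names and assumes keys are compared as encoded strings; accepting `yes` / `no` in addition to `y` / `n` is also an assumption.

```python
from typing import Optional


def invite_action(inviter_root_pubkey: str, declared_operator_public_key: str) -> str:
    """Auto-accept invites from the declared operator; ask the LLM otherwise."""
    if inviter_root_pubkey == declared_operator_public_key:
        return "auto-accept"
    return "ask-llm"


def parse_invite_reply(reply_text: str) -> Optional[bool]:
    """Interpret the LLM's DM reply: True accept, False decline, None unclear."""
    answer = reply_text.strip().lower()
    if answer in {"y", "yes"}:
        return True
    if answer in {"n", "no"}:
        return False
    return None
```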

## Local bridge

While the daemon is running it exposes two loopback HTTP services:

- `127.0.0.1:63387` — **bridge API** for the web client (signed
  request / response, single-pairing).
- `127.0.0.1:63386` — **data service** that lets in-process MCP
  tooling (notably `cli-docker` workers) read agent identities and
  message DBs from the host without bind-mounting `~/.puffo-agent`.
  Loopback-only, no auth — same trust boundary as the daemon
  process itself.

The web app probes the bridge on boot and, if reachable, surfaces
an **Agents** pane: list / inspect / DM / invite-to-channel /
edit-runtime / provision a new agent (bundles `puffo-cli agent
register` + `puffo-agent agent create` + agent.yml editing into
one click).

Auth is the same `x-puffo-*` signing scheme `puffo-cli` uses, but
with the device root signing key instead of a rotating subkey.
**Single-pairing**: the daemon stores one `(slug, device_id)` at
`~/.puffo-agent/pairing.json`. Each successful `POST /v1/pair`
replaces it — the most recent client wins. `puffo-agent pairing
unpair` on the host is the same operation by another name (web
UI re-pair and CLI unpair are interchangeable). A CORS allowlist plus
the `Access-Control-Allow-Private-Network` response header let
`https://chat.puffo.ai` talk to the loopback endpoint without
shipping a cert.
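
Single-pairing semantics amount to a one-slot store where the latest write wins. The sketch below shows the idea with hypothetical helpers; the JSON layout is an assumption and may not match the real `pairing.json`.

```python
import json
from pathlib import Path
from typing import Optional


def pair(pairing_file: Path, slug: str, device_id: str) -> None:
    """Record a new pairing, replacing whatever was stored before."""
    pairing_file.write_text(json.dumps({"slug": slug, "device_id": device_id}))


def unpair(pairing_file: Path) -> None:
    """Release the pairing; a no-op when nothing is paired."""
    pairing_file.unlink(missing_ok=True)


def current_pairing(pairing_file: Path) -> Optional[tuple[str, str]]:
    """Return (slug, device_id), or None when no client is paired."""
    if not pairing_file.exists():
        return None
    data = json.loads(pairing_file.read_text())
    return data["slug"], data["device_id"]
```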

```bash
puffo-agent api status               # bind addr, allowed origins, paired status
puffo-agent pairing show             # who's currently paired (or "(none)")
puffo-agent pairing unpair           # release the pairing for a new client
```

The bridge is enabled by default. Override per-install via
`daemon.yml`:

```yaml
bridge:
  enabled: true
  bind_host: 127.0.0.1
  port: 63387
  allowed_origins:
    - https://chat.puffo.ai
    - http://localhost:5173
```

## Config files

See `config.example.yml` for the daemon-wide config; the per-agent
`agent.yml` is generated by `puffo-agent agent create`.
