# MindRoom

> AI agents that live in Matrix and work everywhere via bridges.

## What is MindRoom?

MindRoom is an AI agent orchestration system with Matrix integration. It provides:

- **Multi-agent collaboration** - Configure multiple specialized agents that can work together
- **Matrix-native** - Agents live in Matrix rooms and respond to messages
- **Persistent memory** - Agent and team-scoped memory that persists across conversations
- **100+ tool integrations** - Connect to external services like GitHub, Slack, Gmail, and more
- **Hot-reload configuration** - Update `config.yaml` and agents restart automatically
- **Scheduled tasks** - Schedule agents to run at specific times with cron expressions or natural language
- **Voice messages** - Speech-to-text transcription with intelligent command recognition
- **Image analysis** - Pass images to vision-capable AI models for analysis
- **Authorization** - Fine-grained access control for users and rooms

> [!TIP]
> **Matrix is the backbone** - MindRoom agents communicate through the Matrix protocol, which means they can be bridged to Discord, Slack, Telegram, and other platforms.

## Quick Start

### Recommended: Full Stack Docker Compose (bundled dashboard + Matrix + MindRoom client)

**Prereqs:** Docker + Docker Compose.

```bash
git clone https://github.com/mindroom-ai/mindroom-stack
cd mindroom-stack
cp .env.example .env
$EDITOR .env  # add at least one AI provider key

docker compose up -d
```

Open:

- MindRoom UI: http://localhost:8765
- MindRoom client: http://localhost:8080
- Matrix homeserver: http://localhost:8008

The stack uses published `mindroom`, `mindroom-cinny`, and `mindroom-tuwunel` images by default.

If you access the stack from another device, set `CLIENT_HOMESERVER_URL=http://<host-ip>:8008` in `.env` before starting it.

### Manual Install (advanced)

Use this if you already have a Matrix homeserver and want to run MindRoom directly.

```bash
# Using uv
uv tool install mindroom

# Or using pip
pip install mindroom
```

### Basic Usage (manual)

1. Create a `config.yaml`:

```yaml
agents:
  assistant:
    display_name: Assistant
    role: A helpful AI assistant
    model: default
    rooms: [lobby]

models:
  default:
    provider: openai
    id: gpt-5.4

defaults:
  tools: [scheduler]
  markdown: true
```
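Teams extend the same file. A hedged sketch of a two-agent team (key names such as `mode` and `members` are illustrative here; the teams configuration docs define the exact schema):

```yaml
# Illustrative only - check the teams configuration guide for exact key names
agents:
  researcher:
    display_name: Researcher
    role: Finds and summarizes sources
    model: default
    rooms: [lab]
  writer:
    display_name: Writer
    role: Drafts polished prose
    model: default
    rooms: [lab]

teams:
  research-team:
    display_name: Research Team
    mode: coordinate        # or: collaborate
    members: [researcher, writer]
    rooms: [lab]
```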

2. Set up your environment in `.env`:

```bash
# Matrix homeserver (must allow open registration)
MATRIX_HOMESERVER=https://matrix.example.com

# AI provider API keys
OPENAI_API_KEY=your_api_key
```

3. Run MindRoom:

```bash
mindroom run
```

For local development with a host-installed backend plus Dockerized Synapse + Cinny (Linux/macOS), you can bootstrap the local stack with:

```bash
mindroom local-stack-setup --synapse-dir /path/to/mindroom-stack/local/matrix
mindroom run
```

## Features

| Feature                      | Description                                                                                                                                                                                                |
| ---------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Agents**                   | Single-specialty actors with specific tools and instructions                                                                                                                                               |
| **Teams**                    | Collaborative bundles of agents (coordinate or collaborate modes)                                                                                                                                          |
| **Router**                   | Built-in traffic director that routes messages to the right agent                                                                                                                                          |
| **Memory**                   | Mem0-inspired memory system with agent and team scopes                                                                                                                                                     |
| **Knowledge Bases**          | File-backed RAG indexing with per-agent base assignment                                                                                                                                                    |
| **Tools**                    | 100+ integrations for external services                                                                                                                                                                    |
| **Skills**                   | OpenClaw-compatible skills system for extended agent capabilities                                                                                                                                          |
| **Scheduling**               | Schedule tasks with cron expressions or natural language                                                                                                                                                   |
| **Voice**                    | Speech-to-text transcription for voice messages                                                                                                                                                            |
| **Images**                   | Pass user-sent images to vision-capable AI models                                                                                                                                                          |
| **File & Video Attachments** | Context-scoped file and video handling with attachment IDs                                                                                                                                                 |
| **Cultures**                 | Shared evolving principles across groups of agents                                                                                                                                                         |
| **Interactive Q&A**          | Clickable multiple-choice questions via Matrix reactions                                                                                                                                                   |
| **Authorization**            | Fine-grained user and room access control                                                                                                                                                                  |
| **OpenAI-Compatible API**    | Use agents from LibreChat, Open WebUI, or any OpenAI client                                                                                                                                                |
| **Streaming**                | Progressive message edits with presence-based gating and tool-call markers                                                                                                                                 |
| **Chat Commands**            | Built-in `!schedule <task>`, `!list_schedules`, `!cancel_schedule <id>`, `!edit_schedule <id> <task>`, `!help [topic]`, `!reload-plugins`, `!config <operation>`, and `!hi` commands handled by the router |
| **Hot Reload**               | Config changes are detected and agents restart automatically                                                                                                                                               |
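Chat commands are sent as ordinary messages in a room the router watches. An illustrative exchange (the schedule ID shown is hypothetical; real IDs come from `!list_schedules`):

```
!schedule every weekday at 9am, post a stand-up reminder
!list_schedules
!cancel_schedule a1b2c3
!help scheduling
```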

## Architecture

```
┌─────────────────────────────────────────────────────┐
│                 Matrix Homeserver                    │
└─────────────────────┬───────────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────────┐
│              MultiAgentOrchestrator                  │
│  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐   │
│  │ Router  │ │ Agent 1 │ │ Agent 2 │ │  Team   │   │
│  └─────────┘ └─────────┘ └─────────┘ └─────────┘   │
└─────────────────────────────────────────────────────┘
```

## Documentation

- [Getting Started](https://docs.mindroom.chat/getting-started/index.md) - Installation and first steps
- [Hosted Matrix Deployment](https://docs.mindroom.chat/deployment/hosted-matrix/index.md) - Run only `uvx mindroom` locally against hosted Matrix
- [Configuration](https://docs.mindroom.chat/configuration/index.md) - All configuration options
- [Cultures](https://docs.mindroom.chat/configuration/cultures/index.md) - Configure shared agent cultures
- [Dashboard](https://docs.mindroom.chat/dashboard/index.md) - Web UI for configuration
- [OpenAI-Compatible API](https://docs.mindroom.chat/openai-api/index.md) - Use agents from any OpenAI-compatible client
- [Tools](https://docs.mindroom.chat/tools/index.md) - Available tool integrations
- [OpenClaw Import](https://docs.mindroom.chat/openclaw/index.md) - Reuse OpenClaw workspace files in MindRoom
- [MCP](https://docs.mindroom.chat/mcp/index.md) - Configure native MCP client servers and expose their tools to agents
- [Skills](https://docs.mindroom.chat/skills/index.md) - OpenClaw-compatible skills system
- [Plugins](https://docs.mindroom.chat/plugins/index.md) - Extend with custom tools, OAuth providers, and skills
- [OAuth Framework](https://docs.mindroom.chat/oauth-framework/index.md) - Build scoped OAuth-backed tool integrations
- [Knowledge Bases](https://docs.mindroom.chat/knowledge/index.md) - Configure RAG-backed document indexing
- [Memory System](https://docs.mindroom.chat/memory/index.md) - How agent memory works
- [Scheduling](https://docs.mindroom.chat/scheduling/index.md) - Schedule tasks with cron or natural language
- [Voice Messages](https://docs.mindroom.chat/voice/index.md) - Voice message transcription
- [Image Messages](https://docs.mindroom.chat/images/index.md) - Image analysis with vision models
- [File & Video Attachments](https://docs.mindroom.chat/attachments/index.md) - Context-scoped file and video handling
- [Streaming Responses](https://docs.mindroom.chat/streaming/index.md) - Progressive message edits with presence-based gating
- [Chat Commands](https://docs.mindroom.chat/chat-commands/index.md) - Built-in `!schedule <task>`, `!list_schedules`, `!cancel_schedule <id>`, `!edit_schedule <id> <task>`, `!help [topic]`, `!reload-plugins`, `!config <operation>`, and `!hi` commands
- [Interactive Q&A](https://docs.mindroom.chat/interactive/index.md) - Clickable multiple-choice questions via Matrix reactions
- [Authorization](https://docs.mindroom.chat/authorization/index.md) - User and room access control
- [Matrix Space](https://docs.mindroom.chat/matrix-space/index.md) - Optional root Matrix Space for grouping managed rooms
- [Architecture](https://docs.mindroom.chat/architecture/index.md) - How it works under the hood
- [Deployment](https://docs.mindroom.chat/deployment/index.md) - Docker and Kubernetes deployment
- [Bridges](https://docs.mindroom.chat/deployment/bridges/index.md) - Connect Telegram, Slack, and other platforms to Matrix
- [Sandbox Proxy](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md) - Isolate code-execution tools in a sandbox
- [Google Services OAuth](https://docs.mindroom.chat/deployment/google-services-oauth/index.md) - Admin OAuth setup for Gmail/Calendar/Drive/Sheets
- [Google Services OAuth (Individual)](https://docs.mindroom.chat/deployment/google-services-user-oauth/index.md) - Single-user OAuth setup
- [CLI Reference](https://docs.mindroom.chat/cli/index.md) - Command-line interface
- [Support](https://docs.mindroom.chat/support/index.md) - Contact and troubleshooting help
- [Privacy Policy](https://docs.mindroom.chat/privacy/index.md) - Privacy and data handling information
- [Terms of Service](https://docs.mindroom.chat/terms/index.md) - Terms for using MindRoom services and clients

## License

- **Repository (except `saas-platform/`)**: Apache License 2.0
- **SaaS Platform** (`saas-platform/`): Business Source License 1.1 (converts to Apache 2.0 on 2030-02-06)

# Getting Started

This guide will help you set up MindRoom and create your first AI agent.

## Recommended: Hosted Matrix + Local MindRoom (`uv` only)

If you do not want to self-host Matrix yet, this is the simplest setup. You only run MindRoom locally.

**Prerequisite:** Install [uv](https://docs.astral.sh/uv/getting-started/installation/).

### 1. Initialize local config

```bash
uvx mindroom config init --profile public
```

This creates:

- `~/.mindroom/config.yaml`
- `~/.mindroom/.env` prefilled with `MATRIX_HOMESERVER=https://mindroom.chat`

The `--profile public` template defaults to the `openai` provider. Use `--provider` to select a different provider preset:

```bash
# Use Anthropic Claude
uvx mindroom config init --profile public --provider anthropic

# Use Codex CLI ChatGPT subscription auth
uvx mindroom config init --profile public-codex

# Use Vertex AI Claude (Google Cloud)
uvx mindroom config init --profile public-vertexai-anthropic
```

`public-codex` is the canonical profile name for hosted Matrix with Codex CLI subscription auth. The shorter `codex` profile alias is also accepted. Run `codex login` before starting MindRoom when using this profile.

`public-vertexai-anthropic` is the canonical profile name for Vertex AI Claude on hosted Matrix. Aliases `public-vertexai-claude`, `vertexai-anthropic`, and `vertexai-claude` are also accepted.

Other profiles:

- `--profile full` — rich starter config with interactive provider selection (default)
- `--profile minimal` or `--minimal` — bare-minimum config

### 2. Add model API key(s)

```bash
$EDITOR ~/.mindroom/.env
```

Set at least one key:

- `ANTHROPIC_API_KEY=...`, or
- `OPENAI_API_KEY=...`, or
- `OPENROUTER_API_KEY=...`, or
- For Codex CLI subscription auth: run `codex login`.
- For Vertex AI Claude: set `ANTHROPIC_VERTEX_PROJECT_ID` and `CLOUD_ML_REGION` and authenticate with `gcloud auth application-default login`.

### 3. Pair your local install from chat UI

1. Open `https://chat.mindroom.chat` and sign in.
2. Go to `Settings -> Local MindRoom`.
3. Click `Generate Pair Code`.
4. Run locally:

```bash
uvx mindroom connect --pair-code ABCD-EFGH
```

Notes:

- The pair code is short-lived (valid for 10 minutes); generate a new one if it expires.
- `mindroom connect` writes local provisioning values (including `MINDROOM_NAMESPACE`) into `~/.mindroom/.env` by default.
- Use `--no-persist-env` to export variables only for the current shell session instead of writing to `.env`.

### 4. Run MindRoom

```bash
uvx mindroom run
```

### 5. Verify

**In chat:** Send a message mentioning your agent in a room where it is configured.

**Dashboard:** Access the web dashboard at `http://localhost:8765` to configure agents, models, and tools. Protect the dashboard API in non-localhost environments by setting `MINDROOM_API_KEY` in your `.env`.

**Preflight check:** Run `mindroom doctor` before `mindroom run` to verify config, API keys, Matrix connectivity, and storage in one pass.

For a detailed architecture and credential model, see: [Hosted Matrix deployment guide](https://docs.mindroom.chat/deployment/hosted-matrix/index.md).

## Alternative: Full Stack Docker Compose (bundled dashboard + Matrix + MindRoom client)

Use this when you want everything local: the bundled MindRoom dashboard, Matrix homeserver, and a Matrix client in one stack.

**Prereqs:** Docker + Docker Compose.

### 1. Clone the full stack repo

```bash
git clone https://github.com/mindroom-ai/mindroom-stack
cd mindroom-stack
```

### 2. Add your API keys

```bash
cp .env.example .env
$EDITOR .env  # add at least one AI provider key
```

### 3. Start everything

```bash
docker compose up -d
```

Open:

- MindRoom UI: http://localhost:8765
- MindRoom client: http://localhost:8080
- Matrix homeserver: http://localhost:8008

The stack uses published `mindroom`, `mindroom-cinny`, and `mindroom-tuwunel` images by default.

If you access the stack from another device, set `CLIENT_HOMESERVER_URL=http://<host-ip>:8008` in `.env` before starting it.

## Manual Install (advanced)

Use this if you already have a Matrix homeserver and want to run MindRoom directly.

### Prerequisites

- Python 3.12 or higher
- A Matrix homeserver (or use a public one like matrix.org)
- API keys for your preferred AI provider (Anthropic, OpenAI, etc.)

### Installation

=== "uv (recommended)"

    ```bash
    uv tool install mindroom
    ```

=== "pip"

    ```bash
    pip install mindroom
    ```

=== "From source"

    ```bash
    git clone https://github.com/mindroom-ai/mindroom
    cd mindroom
    uv sync
    source .venv/bin/activate
    ```

### Configuration

#### 1. Create your config file

Create a `config.yaml` in your working directory:

```yaml
agents:
  assistant:
    display_name: Assistant
    role: A helpful AI assistant that can answer questions
    model: default
    include_default_tools: true
    rooms: [lobby]
    # Optional: file-based context (OpenClaw-style)
    # context_files: [SOUL.md, USER.md]

models:
  default:
    provider: openai
    id: gpt-5.4

defaults:
  tools: [scheduler]
  markdown: true

timezone: America/Los_Angeles
```

#### 2. Set up environment variables

Create a `.env` file with your credentials:

```bash
# Matrix homeserver (must allow open registration for agent accounts)
MATRIX_HOMESERVER=https://matrix.example.com

# Optional: For self-signed certificates (development)
# MATRIX_SSL_VERIFY=false

# Optional: For federation setups where server_name differs from homeserver hostname
# MATRIX_SERVER_NAME=example.com

# AI provider API keys
OPENAI_API_KEY=your_openai_key
# OPENROUTER_API_KEY=your_openrouter_key
# ANTHROPIC_API_KEY=your_anthropic_key

# Optional: protect the dashboard API (recommended for non-localhost)
# MINDROOM_API_KEY=your-secret-key
```

#### Optional: Bootstrap local Synapse + Cinny with Docker (Linux/macOS)

If you want a local Matrix + client setup without running the full `mindroom-stack` app, use the helper command:

```bash
mindroom local-stack-setup --synapse-dir /path/to/mindroom-stack/local/matrix
```

If you're running from source in this repo, use:

```bash
uv run mindroom local-stack-setup --synapse-dir /path/to/mindroom-stack/local/matrix
```

This starts Synapse from the `mindroom-stack` compose files, starts a MindRoom Cinny container, waits for both services to be healthy, and by default writes local Matrix settings to `.env` next to your active `config.yaml`.

> [!NOTE]
> MindRoom automatically creates Matrix user accounts for each agent. Your Matrix homeserver must allow open registration, or you need to configure it to allow registration from localhost. If registration fails, check your homeserver's registration settings.

#### 3. Run MindRoom

```bash
mindroom run
```

MindRoom will:

1. Connect to your Matrix homeserver
2. Create Matrix users for each agent
3. Create any rooms that don't exist and join them
4. Start listening for messages

## Next Steps

- Learn about [agent configuration](https://docs.mindroom.chat/configuration/agents/index.md)
- Learn about [OpenClaw workspace import](https://docs.mindroom.chat/openclaw/index.md) if you want file-based memory/context patterns
- Explore [available tools](https://docs.mindroom.chat/tools/index.md)
- Set up [teams for multi-agent collaboration](https://docs.mindroom.chat/configuration/teams/index.md)

# Web Dashboard

MindRoom includes a web dashboard for configuring agents, teams, rooms, and integrations without editing YAML files. Changes are synchronized to `config.yaml` in real-time.

## Accessing the Dashboard

**Standalone Mode:**

```bash
mindroom run
```

The dashboard will be available at `http://localhost:8765`. When running from a source checkout, MindRoom will build the dashboard assets on first start if Bun is available.

**SaaS Platform:** Access your dashboard at `https://<instance-id>.mindroom.chat`

## Dashboard Tabs

### Dashboard (Overview)

The main dashboard shows system stats and monitoring:

- **Stats cards** - Agents (with status breakdown), rooms, teams, models, and voice status
- **Network graph** - Visual representation of agent-room-team relationships (desktop only)
- **Search and filter** - Filter by agents, rooms, or teams
- **Export Config** - Download configuration as JSON

### Agents

Configure AI agents:

- **Display name** and **Role description**
- **Model** - Select from configured models
- **Memory backend** - Inherit global memory backend or override per agent (`mem0`, `file`, or `none`)
- **Tools** - Organized into configured tools (green badge) and default tools (no config needed)
- **Instructions** - Custom behavior instructions
- **Rooms** - Where the agent operates
- **Learning** - Enable or disable Agno Learning per agent (enabled by default)
- **Learning mode** - Choose `always` (automatic extraction) or `agentic` (tool-driven)

### Teams

Configure multi-agent collaboration:

- **Display name** and **Team purpose**
- **Collaboration mode** - Coordinate (sequential) or Collaborate (parallel)
- **Team model** - Optional model override
- **Team members** and **Team rooms**

### Rooms

Manage Matrix room configuration:

- **Display name** and **Description**
- **Room model** - Optional model override
- **Agents in room** - Select which agents have access

### External Rooms

View and manage rooms that agents have joined but are not in the configuration:

- **Per-agent view** with room names and IDs
- **Bulk selection** and **Leave rooms** functionality
- **Open in Matrix** - Link to view in your Matrix client

### Models & API Keys

Configure AI model providers:

- **Add/edit models** with provider, model ID, host URL, and advanced settings
- **Provider filter** to show models by provider
- **Test connection** to verify model accessibility
- **Provider API keys** section for configuring credentials

**Runtime-supported providers:** OpenAI, Codex CLI subscription auth (`codex`), Anthropic, Google Gemini (`google`/`gemini`), Vertex AI Claude (`vertexai_claude`), Ollama, OpenRouter, Groq, DeepSeek, Cerebras

### Memory

Configure global memory defaults:

- **Backend** - Global default backend (`mem0`, `file`, or `none`)
- **Provider** - Ollama (local), OpenAI, or Sentence Transformers
- **Model** - Provider-specific embedding models
- **Host URL** - For Ollama provider
- **File backend settings** - Path and file memory tuning options
- **Auto-flush settings** - Background extraction and flush controls for file-backed memory

Per-agent overrides are configured from the **Agents** tab using the **Memory backend** selector.
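As a rough sketch of how these settings map onto `config.yaml` (the key names below are illustrative, not authoritative; see the Memory System docs for the exact schema):

```yaml
# Illustrative key names only - the Memory docs define the real schema
memory:
  backend: file            # global default: mem0, file, or none
  provider: ollama         # embedding provider: ollama, openai, or sentence-transformers
  host: http://localhost:11434

agents:
  assistant:
    memory: mem0           # per-agent override of the global backend
```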

### Knowledge

Manage file-backed RAG knowledge bases:

- **Create/edit/delete knowledge bases** with `path` and refresh-on-access `watch` settings
- **Configure Git repository, branch, filtering, credentials service, and sync options**
- **Upload and remove files** for non-Git-backed knowledge bases
- **Reindex** a knowledge base on demand
- **Track index status** (`file_count` and `indexed_count`)
- **Assign agents** to a specific knowledge base from the Agents tab

Git-backed knowledge bases are managed from the dashboard, but file mutations still belong in the repository.

- The dashboard hides upload, dropzone, and per-file delete controls for Git-backed bases.
- `/api/knowledge/bases/{base_id}/files` reflects the manager's filtered file set (for example `include_patterns`/`exclude_patterns`).
- Private HTTPS repo auth can be managed in the **Credentials** tab, then referenced by `knowledge_bases.<id>.git.credentials_service`.
- `POST /api/knowledge/bases/{base_id}/reindex` syncs Git first for Git-backed bases before rebuilding the index.
- `POST /api/knowledge/bases/{base_id}/upload` and `DELETE /api/knowledge/bases/{base_id}/files/{path}` reject Git-backed bases with `409`; update the repository and reindex instead.
- Chat/runtime requests use last successfully published indexes and do not wait for indexing or Git sync.
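Putting those pieces together, a Git-backed base might look like this (repository URL is hypothetical, and the `repo`/`branch` key names are assumptions; `path`, `watch`, and `git.credentials_service` come from the sections above):

```yaml
knowledge_bases:
  handbook:
    path: ./knowledge/handbook
    watch: true
    git:
      repo: https://github.com/example/handbook    # hypothetical repository
      branch: main
      credentials_service: github_private          # managed in the Credentials tab
```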

### Credentials

Manage service credentials directly from the dashboard:

- **List configured credential services** from `CredentialsManager`
- **Create/select service names** (for example `github_private` or `model:sonnet`)
- **Edit raw JSON credential payloads** and save via `/api/credentials/{service}`
- **Test credentials existence** using `/api/credentials/{service}/test`
- **Delete credential sets** using `/api/credentials/{service}`
- **Reuse credentials for Git knowledge sync** by setting `knowledge_bases.<id>.git.credentials_service` to the same service name
- `GITHUB_TOKEN` auto-seeds `github_private` (`username: x-access-token`, `token: <GITHUB_TOKEN>`, `_source: env`) unless the service is UI-managed

### Culture

Configure shared culture rules that apply across agents:

- **Create/edit/delete cultures** with description and mode
- **Assign agents** to cultures
- **Mode selection** - `automatic` (always active), `agentic` (agent decides when to update), or `manual` (read-only)

### Schedules

View and manage scheduled tasks across rooms:

- **List all schedules** with room, status, schedule type, and next run time
- **Edit schedule timing** and description
- **Cancel schedules** by task ID

### Skills

Manage OpenClaw-compatible skills:

- **List installed skills** with origin and edit status
- **View skill content** (SKILL.md)
- **Create new skills** with name and description
- **Edit user-created skills**
- **Delete user-created skills**

### Voice

Configure voice message handling:

- **Enable/disable** voice message support
- **Speech-to-Text** - OpenAI Whisper or self-hosted
- **Command Intelligence** - Model selection for command recognition

### Integrations

Connect external services to enable agent capabilities:

- **Categories** - Email & Calendar, Communication, Shopping, Entertainment, Social, Development, Research, Smart Home, Information
- **Search and filter** by status (Available, Unconfigured, Configured)
- **OAuth flows** for Google (6 endpoints), Spotify (4 endpoints), Home Assistant (7 endpoints), and more

## Features

### Real-time Sync

The sync status indicator in the header shows:

- **Synced** - All changes saved
- **Syncing...** - Save in progress
- **Sync Error** - Sync failed
- **Disconnected** - Lost connection to backend

### Theme and Responsive Design

Toggle between dark and light themes. The dashboard adapts to desktop and mobile devices.

## API Endpoints

The dashboard communicates with the backend API at `/api/`:

### Configuration

| Method | Endpoint                     | Description                                                 |
| ------ | ---------------------------- | ----------------------------------------------------------- |
| POST   | `/api/config/load`           | Fetch current configuration                                 |
| PUT    | `/api/config/save`           | Save full configuration                                     |
| GET    | `/api/config/raw`            | Fetch the raw `config.yaml` source for recovery editing     |
| PUT    | `/api/config/raw`            | Replace the entire raw `config.yaml` source during recovery |
| GET    | `/api/config/agents`         | List all agents                                             |
| POST   | `/api/config/agents`         | Create new agent                                            |
| PUT    | `/api/config/agents/{id}`    | Update agent                                                |
| DELETE | `/api/config/agents/{id}`    | Delete agent                                                |
| GET    | `/api/config/teams`          | List all teams                                              |
| POST   | `/api/config/teams`          | Create new team                                             |
| PUT    | `/api/config/teams/{id}`     | Update team                                                 |
| DELETE | `/api/config/teams/{id}`     | Delete team                                                 |
| GET    | `/api/config/models`         | List model configurations                                   |
| PUT    | `/api/config/models/{id}`    | Update model configuration                                  |
| GET    | `/api/config/room-models`    | Get room model overrides                                    |
| PUT    | `/api/config/room-models`    | Update room model overrides                                 |
| POST   | `/api/config/agent-policies` | Get backend-derived agent policies for a draft config       |

When `/api/config/load` returns validation errors, the dashboard fetches `/api/config/raw`, opens the recovery editor, and saves a full replacement through `PUT /api/config/raw` before retrying the structured reload.

### Credentials

| Method | Endpoint                                                | Description                           |
| ------ | ------------------------------------------------------- | ------------------------------------- |
| GET    | `/api/credentials/list`                                 | List services with credentials        |
| GET    | `/api/credentials/{service}/status`                     | Get credential status                 |
| GET    | `/api/credentials/{service}`                            | Get credentials for editing           |
| POST   | `/api/credentials/{service}`                            | Set credentials                       |
| POST   | `/api/credentials/{service}/api-key`                    | Set API key                           |
| GET    | `/api/credentials/{service}/api-key`                    | Get masked API key                    |
| POST   | `/api/credentials/{service}/test`                       | Test credentials validity             |
| DELETE | `/api/credentials/{service}`                            | Delete credentials                    |
| POST   | `/api/credentials/{service}/copy-from/{source_service}` | Copy credentials from another service |

Credentials support scoping via query parameters:

- `agent_name` — scope credentials to a specific agent
- `execution_scope` — scope credentials to a specific worker scope (e.g., `shared`, `unscoped`)

### Knowledge

| Method | Endpoint                                      | Description                                                            |
| ------ | --------------------------------------------- | ---------------------------------------------------------------------- |
| GET    | `/api/knowledge/bases`                        | List configured knowledge bases                                        |
| GET    | `/api/knowledge/bases/{base_id}/files`        | List files in a knowledge base                                         |
| POST   | `/api/knowledge/bases/{base_id}/upload`       | Upload one or more files for a non-Git-backed base                     |
| DELETE | `/api/knowledge/bases/{base_id}/files/{path}` | Delete a file from disk for a non-Git-backed base and schedule refresh |
| GET    | `/api/knowledge/bases/{base_id}/status`       | Get indexing status                                                    |
| POST   | `/api/knowledge/bases/{base_id}/reindex`      | Rebuild the index for a base                                           |

### Skills

| Method | Endpoint                   | Description                                     |
| ------ | -------------------------- | ----------------------------------------------- |
| GET    | `/api/skills`              | List all installed skills                       |
| GET    | `/api/skills/{skill_name}` | Get skill detail (content, origin, edit status) |
| POST   | `/api/skills`              | Create a new user skill                         |
| PUT    | `/api/skills/{skill_name}` | Update a user skill's content                   |
| DELETE | `/api/skills/{skill_name}` | Delete a user skill                             |

### Schedules

| Method | Endpoint                   | Description                               |
| ------ | -------------------------- | ----------------------------------------- |
| GET    | `/api/schedules`           | List scheduled tasks (filterable by room) |
| PUT    | `/api/schedules/{task_id}` | Edit a scheduled task                     |
| DELETE | `/api/schedules/{task_id}` | Cancel a scheduled task                   |

### Workers

| Method | Endpoint               | Description                   |
| ------ | ---------------------- | ----------------------------- |
| GET    | `/api/workers`         | List active sandbox workers   |
| POST   | `/api/workers/cleanup` | Clean up idle sandbox workers |

### Health & Readiness

| Method | Endpoint      | Description                                                                                                                                                                                                                                        |
| ------ | ------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| GET    | `/api/health` | Returns `{"status": "healthy"}` when the HTTP server is running and Matrix sync is active. Returns `503` with `{"status": "unhealthy", "stale_sync_entities": [...]}` when Matrix sync has been stale for >180s (after watchdog recovery attempts) |
| GET    | `/api/ready`  | Returns `{"status": "ready"}` when the orchestrator has finished startup. Returns `503` with `{"status": "<phase>", "detail": "..."}` otherwise                                                                                                    |

MindRoom tracks runtime phases internally:

| Phase      | Meaning                                               |
| ---------- | ----------------------------------------------------- |
| `idle`     | Process not started                                   |
| `starting` | Startup in progress (detail message available)        |
| `ready`    | Orchestrator booted, serving requests                 |
| `failed`   | Startup or runtime failure (detail message available) |

Use `/api/health` for liveness probes and `/api/ready` for readiness probes in container orchestrators. Note: `/api/health` returns `503` when Matrix sync is stale (>180s without successful sync, after the 120s watchdog timeout has attempted recovery). Configure liveness probe `failureThreshold` to allow sufficient time for watchdog self-healing.
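
For container orchestrators, the two endpoints map directly onto standard probes. A minimal Kubernetes sketch, assuming the API listens on port 8765 (adjust to your `mindroom run --api-port` value):

```
livenessProbe:
  httpGet:
    path: /api/health
    port: 8765
  periodSeconds: 10
  failureThreshold: 6   # allow time for the sync watchdog to self-heal
readinessProbe:
  httpGet:
    path: /api/ready
    port: 8765
  periodSeconds: 5
```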

### Tools & Matrix

| Method | Endpoint                        | Description                      |
| ------ | ------------------------------- | -------------------------------- |
| GET    | `/api/tools`                    | List available tools             |
| GET    | `/api/rooms`                    | List configured rooms            |
| GET    | `/api/matrix/agents/rooms`      | Get all agents' room memberships |
| GET    | `/api/matrix/agents/{id}/rooms` | Get specific agent's rooms       |
| POST   | `/api/matrix/rooms/leave`       | Leave a single room              |
| POST   | `/api/matrix/rooms/leave-bulk`  | Leave multiple rooms             |

# Configuration

MindRoom is configured through a `config.yaml` file. This section covers all configuration options.

## Configuration File

MindRoom searches for the configuration file in this order (first match wins):

1. `MINDROOM_CONFIG_PATH` environment variable (if set)
1. `./config.yaml` (current working directory)
1. `~/.mindroom/config.yaml` (home directory)

Data storage (`mindroom_data/`) is placed next to the config file by default.

You can also validate a specific file directly:

```
mindroom config validate --path /path/to/config.yaml
```

## MCP Servers

MindRoom can connect to external Model Context Protocol servers through the top-level `mcp_servers` block. See [MCP](https://docs.mindroom.chat/mcp/index.md) for transport-specific config, tool naming, examples, and agent setup.

## Tool Approval

Use the top-level `tool_approval` block to gate tool calls behind human approval in Matrix conversations. Rules are evaluated in order and the first matching rule wins. Each rule must set exactly one of `action` or `script`:

- `action: require_approval` always pauses the tool call and sends a Matrix approval card.
- `script: ./approval_scripts/review.py` runs `check(tool_name, arguments, agent_name) -> bool` and requires approval only when it returns `True`.

`timeout_days` sets the default approval expiry window and can be overridden per rule.

Responding to an approval card:

- React with `✅` to approve the tool call.
- Reply with a message to deny the tool call; the reply text is recorded as the denial reason.
- Only the original human requester can approve or deny their pending tool call.

Approval responses only resolve the live Matrix approval card in the same room; approval IDs are used only as a live client hint. If MindRoom restarts before a tool call is approved, the live tool call is cancelled. On startup, MindRoom attempts to mark recent unresolved approval cards sent by the current router as expired. Agent-authored, system-authored, and configured bridge-bot-authored tool calls are denied instead of entering the approval flow.

The OpenAI-compatible `/v1/chat/completions` endpoint has no approval transport, so any tool function that matches a required-approval rule, including script-based rules, is hidden from the `/v1` tool schema instead of being exposed and blocked later.

```
tool_approval:
  default: auto_approve
  timeout_days: 7
  rules:
    - match: slack_*
      action: require_approval
    - match: run_shell_command
      script: ./approval_scripts/shell_review.py
      timeout_days: 3
```
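
A sketch of what `approval_scripts/shell_review.py` from the config above could look like. The marker list and the `command` argument name are illustrative assumptions, not MindRoom built-ins:

```
# Approval hook: MindRoom calls check(tool_name, arguments, agent_name)
# and requires human approval only when it returns True.

# Illustrative markers for commands that should always be reviewed.
DESTRUCTIVE_MARKERS = ("rm -rf", "sudo ", "mkfs", "dd if=")

def check(tool_name: str, arguments: dict, agent_name: str) -> bool:
    if tool_name != "run_shell_command":
        return False
    # "command" is an assumed argument key for this sketch.
    command = str(arguments.get("command", ""))
    return any(marker in command for marker in DESTRUCTIVE_MARKERS)
```

Returning `False` lets the tool call proceed without an approval card, so keep the default path permissive only for tools you trust.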

## Environment Variables

### Core

| Variable                   | Description                                                                                                                                                                             | Default                                     |
| -------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------- |
| `MINDROOM_CONFIG_PATH`     | Path to `config.yaml`                                                                                                                                                                   | `./config.yaml` → `~/.mindroom/config.yaml` |
| `MINDROOM_STORAGE_PATH`    | Data storage directory                                                                                                                                                                  | `mindroom_data/` next to config             |
| `MINDROOM_CONFIG_TEMPLATE` | Path to a config template. When set and `config.yaml` does not exist, MindRoom copies this template to the config path. Used in Docker containers to seed config from bundled templates | Same as config path                         |
| `LOG_LEVEL`                | Logging level for `mindroom run` (`DEBUG`, `INFO`, `WARNING`, `ERROR`)                                                                                                                  | `INFO`                                      |
| `MINDROOM_LOGGER_LEVELS`   | Optional comma- or semicolon-separated logger level overrides, for example `mindroom:DEBUG,httpx:WARNING,httpcore:WARNING,anthropic:INFO,nio:WARNING`                                   | unset                                       |
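
As documented above, `MINDROOM_LOGGER_LEVELS` accepts `name:LEVEL` pairs separated by commas or semicolons. A minimal sketch of that format (not MindRoom's actual parser):

```
def parse_logger_levels(spec: str) -> dict[str, str]:
    """Parse "name:LEVEL" pairs separated by "," or ";"."""
    levels: dict[str, str] = {}
    for entry in spec.replace(";", ",").split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, _, level = entry.partition(":")
        levels[name.strip()] = level.strip().upper()
    return levels
```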

### Matrix

| Variable             | Description                | Default                     |
| -------------------- | -------------------------- | --------------------------- |
| `MATRIX_HOMESERVER`  | Matrix homeserver URL      | `http://localhost:8008`     |
| `MATRIX_SERVER_NAME` | Server name for federation | *(derived from homeserver)* |
| `MATRIX_SSL_VERIFY`  | Verify SSL certificates    | `true`                      |

### API Keys

Set the API key for each provider you use in `config.yaml`:

| Variable             | Provider                                                            |
| -------------------- | ------------------------------------------------------------------- |
| `ANTHROPIC_API_KEY`  | Anthropic (Claude)                                                  |
| `OPENAI_API_KEY`     | OpenAI                                                              |
| `GOOGLE_API_KEY`     | Google (Gemini)                                                     |
| `OPENROUTER_API_KEY` | OpenRouter                                                          |
| `DEEPSEEK_API_KEY`   | DeepSeek                                                            |
| `CEREBRAS_API_KEY`   | Cerebras                                                            |
| `GROQ_API_KEY`       | Groq                                                                |
| `OLLAMA_HOST`        | Ollama (host URL, not a key)                                        |
| `OPENAI_BASE_URL`    | Base URL for OpenAI-compatible APIs (e.g., local inference servers) |

All API key variables also support a `_FILE` suffix for file-based secrets (e.g., `ANTHROPIC_API_KEY_FILE=/run/secrets/anthropic-api-key`). See [Model Configuration — File-based Secrets](https://docs.mindroom.chat/configuration/models/#file-based-secrets) for details.
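
For example, a Docker Compose sketch wiring a secret file into the `_FILE` variable (the service and secret names here are illustrative):

```
services:
  mindroom:
    image: mindroom
    environment:
      ANTHROPIC_API_KEY_FILE: /run/secrets/anthropic-api-key
    secrets:
      - anthropic-api-key

secrets:
  anthropic-api-key:
    file: ./secrets/anthropic-api-key.txt
```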

### Codex CLI Subscription Auth

The `codex` provider does not use an API key environment variable. Run `codex login` so `~/.codex/auth.json` contains ChatGPT OAuth tokens. Set `CODEX_HOME` only if your Codex CLI state lives outside `~/.codex`.

### Operational

| Variable                                             | Description                                                                                                                                                                                                                                      | Default                          |
| ---------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------------------- |
| `MINDROOM_NAMESPACE`                                 | Installation namespace for Matrix identity isolation (4–32 lowercase alphanumeric chars)                                                                                                                                                         | *(none)*                         |
| `MINDROOM_PORT`                                      | Port used by Google OAuth callback URL construction and deployment tooling. Does **not** change the API server bind port — use `mindroom run --api-port` for that                                                                                | `8765`                           |
| `MINDROOM_API_KEY`                                   | API key for authenticating dashboard/API requests (`mindroom config init` auto-generates one; unset = open access)                                                                                                                               | *(none)*                         |
| `MINDROOM_NO_AUTO_INSTALL_TOOLS`                     | Set to `1`/`true`/`yes` to disable automatic tool dependency installation                                                                                                                                                                        | *(unset — auto-install enabled)* |
| `MINDROOM_MATRIX_HOMESERVER_STARTUP_TIMEOUT_SECONDS` | Seconds to wait for homeserver to become reachable at startup (0 = skip). MindRoom polls the homeserver's `/_matrix/client/versions` endpoint with exponential backoff retry, detecting permanent errors (e.g., wrong URL) vs transient failures | *(wait indefinitely)*            |
| `MINDROOM_WORKER_BACKEND`                            | Worker backend for tool execution (`static_runner` or `kubernetes`)                                                                                                                                                                              | `static_runner`                  |

### OpenAI-Compatible API

| Variable                              | Description                                                            | Default                                          |
| ------------------------------------- | ---------------------------------------------------------------------- | ------------------------------------------------ |
| `OPENAI_COMPAT_API_KEYS`              | Comma-separated API keys for authenticating `/v1/*` requests           | *(none — locked without this or the flag below)* |
| `OPENAI_COMPAT_ALLOW_UNAUTHENTICATED` | Set to `true` to allow unauthenticated `/v1/*` access (local dev only) | *(unset — locked)*                               |

See [OpenAI-Compatible API](https://docs.mindroom.chat/openai-api/index.md) for the full auth matrix.

### Provisioning / Pairing

These are set automatically by `mindroom connect` and stored in `.env`:

| Variable                       | Description                                              |
| ------------------------------ | -------------------------------------------------------- |
| `MINDROOM_PROVISIONING_URL`    | Provisioning service URL (e.g., `https://mindroom.chat`) |
| `MINDROOM_LOCAL_CLIENT_ID`     | Client ID from hosted pairing                            |
| `MINDROOM_LOCAL_CLIENT_SECRET` | Client secret from hosted pairing                        |

### Frontend / Development

| Variable                       | Description                                                 | Default            |
| ------------------------------ | ----------------------------------------------------------- | ------------------ |
| `MINDROOM_FRONTEND_DIST`       | Override path to pre-built frontend assets                  | *(auto-detected)*  |
| `MINDROOM_AUTO_BUILD_FRONTEND` | Set to `0` to skip automatic frontend build                 | *(enabled)*        |
| `DOCKER_CONTAINER`             | Set to `true` when running inside the packaged Docker image | *(unset)*          |
| `BROWSER_EXECUTABLE_PATH`      | Path to browser executable for the browser tool             | *(system default)* |

### Vertex AI

| Variable                         | Description                                  |
| -------------------------------- | -------------------------------------------- |
| `ANTHROPIC_VERTEX_PROJECT_ID`    | Google Cloud project ID for Vertex AI Claude |
| `ANTHROPIC_VERTEX_BASE_URL`      | Custom Vertex AI base URL                    |
| `CLOUD_ML_REGION`                | Google Cloud region for Vertex AI            |
| `GOOGLE_CLOUD_PROJECT`           | Google Cloud project ID                      |
| `GOOGLE_CLOUD_LOCATION`          | Google Cloud region                          |
| `GOOGLE_APPLICATION_CREDENTIALS` | Path to Google service account JSON          |

Authenticate with `gcloud auth application-default login` or set `GOOGLE_APPLICATION_CREDENTIALS`.

### Worker / Sandbox

| Variable                       | Description                                | Default  |
| ------------------------------ | ------------------------------------------ | -------- |
| `MINDROOM_SANDBOX_PROXY_URL`   | Sandbox proxy endpoint URL (static runner) | *(none)* |
| `MINDROOM_SANDBOX_PROXY_TOKEN` | Auth token for the sandbox proxy           | *(none)* |

See [Sandbox Proxy](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md) for the full list of `MINDROOM_SANDBOX_*` variables, including Kubernetes backend variables (`MINDROOM_SANDBOX_KUBERNETES_*`).

### SaaS-Only

| Variable      | Description                                                     | Default  |
| ------------- | --------------------------------------------------------------- | -------- |
| `CUSTOMER_ID` | Tenant identity for worker key derivation (SaaS platform only)  | *(none)* |
| `ACCOUNT_ID`  | Account identity for worker key derivation (SaaS platform only) | *(none)* |

## Basic Structure

```
# Agent definitions (at least one recommended)
agents:
  assistant:
    display_name: Assistant        # Required: Human-readable name
    role: A helpful AI assistant   # Optional: Description of purpose
    model: sonnet                  # Optional: Model name (default: "default")
    tools:                         # Optional: Agent-specific tools (merged with defaults.tools)
      - file
      - shell
      # - shell:                    # Per-agent tool config overrides (single-key dict):
      #     extra_env_passthrough: "DAWARICH_*"
    include_default_tools: true    # Optional: Per-agent opt-out for defaults.tools
    skills: []                     # Optional: List of skill names
    instructions: []               # Optional: Custom instructions
    rooms: [lobby]                 # Optional: Rooms to auto-join
    accept_invites: true           # Optional: Accept authorized ad-hoc room invites
    markdown: true                 # Optional: Override default (inherits from defaults section)
    worker_tools: [shell, file]    # Optional: Override default (inherits from defaults section)
    worker_scope: user_agent       # Optional: Reuse one proxied runtime per requester+agent
    learning: true                 # Optional: Override default (inherits from defaults section)
    learning_mode: always          # Optional: Override default (inherits from defaults section)
    memory_backend: file           # Optional: Per-agent memory backend override (mem0, file, or none)
    knowledge_bases: [docs]         # Optional: Assign one or more configured knowledge bases
    context_files:                 # Optional: Load files into each freshly built agent instance
      - SOUL.md
      - AGENTS.md
      - USER.md
      - IDENTITY.md
      - TOOLS.md
      - HEARTBEAT.md
  researcher:
    display_name: Researcher
    role: Research and gather information
    model: sonnet
  writer:
    display_name: Writer
    role: Write and edit content
    model: sonnet
  developer:
    display_name: Developer
    role: Write code and implement features
    model: sonnet
  reviewer:
    display_name: Reviewer
    role: Review code and provide feedback
    model: sonnet

# Model configurations (at least a "default" model is recommended)
models:
  default:
    provider: anthropic            # Required: openai, anthropic, ollama, google, gemini, vertexai_claude, groq, cerebras, openrouter, deepseek
    id: claude-sonnet-4-6            # Required: Model ID for the provider
  sonnet:
    provider: anthropic            # Required: openai, anthropic, ollama, google, gemini, vertexai_claude, groq, cerebras, openrouter, deepseek
    id: claude-sonnet-4-6            # Required: Model ID for the provider
    host: null                     # Optional: Host URL (e.g., for Ollama)
    api_key: null                  # Optional: API key (usually from env vars)
    extra_kwargs: null             # Optional: Provider-specific parameters
    context_window: null           # Optional: Needed on the active runtime model for replay safety; explicit compaction.model also needs its own window for summary generation

# Team configurations (optional)
teams:
  research_team:
    display_name: Research Team    # Required: Human-readable name
    role: Collaborative research   # Required: Description of team purpose
    agents: [researcher, writer]   # Required: List of agent names
    mode: collaborate              # Optional: "coordinate" or "collaborate" (default: coordinate)
    model: sonnet                  # Optional: Model for team coordination (default: "default")
    num_history_runs: 8            # Optional: Team-scoped replay policy
    num_history_messages: null     # Optional: Mutually exclusive with num_history_runs
    max_tool_calls_from_history: 6 # Optional: Limit replayed tool call messages
    compaction:                    # Optional: Team-scoped required-compaction overrides
      # Soft thresholds do not compact by themselves while history still fits.
      enabled: true
      threshold_percent: 0.8
      reserve_tokens: 16384
    rooms: []                      # Optional: Rooms to auto-join

# Culture configurations (optional)
cultures:
  engineering:
    description: Follow clean code principles and write tests  # Shared principles
    agents: [developer, reviewer]  # Agents assigned (each agent can belong to at most one culture)
    mode: automatic                # automatic, agentic, or manual

# Router configuration (optional)
router:
  model: default                   # Optional: Model for routing (default: "default")

# Default settings for all agents (optional)
defaults:
  tools:                           # Default: ["scheduler"] (added to every agent; set [] to disable)
    - scheduler                    # Plain string or single-key dict with inline config overrides
  markdown: true                   # Default: true
  enable_streaming: true           # Default: true (stream responses via message edits)
  streaming:
    update_interval: 5.0           # Default: 5.0 (steady-state seconds between streamed edits)
    min_update_interval: 0.5       # Default: 0.5 (fast-start seconds between early edits)
    interval_ramp_seconds: 15.0    # Default: 15.0 (set 0 to disable interval ramping)
    max_idle: 2.0                  # Default: 2.0 (event-driven idle ceiling before the next edit)
  learning: true                   # Default: true
  learning_mode: always            # Default: always (or agentic)
  max_preload_chars: 50000         # Hard cap for preloaded context from context_files
  show_stop_button: true           # Default: true (global only, cannot be overridden per-agent)
  num_history_runs: null           # Number of prior runs to include (null = all)
  num_history_messages: null       # Max messages from history (null = use num_history_runs)
  compress_tool_results: false     # Safer default; enabling can invalidate Anthropic/Vertex Claude prompt caches
  # Required compaction is enabled by default.
  # Soft thresholds do not compact by themselves while history still fits.
  # Set enabled: false to disable automatic pre-reply compaction globally.
  compaction:
    enabled: true
    threshold_percent: 0.8
    reserve_tokens: 16384
  max_tool_calls_from_history: null  # Limit tool call messages replayed from history (null = no limit)
  show_tool_calls: true            # Default: true (show tool details inline; hidden mode still allows generic worker warmup copy)
  worker_tools: null               # Default: null (tool names to route through workers; null = use MindRoom's default routing policy, [] = disable)
  worker_scope: null               # Default: null (no runtime reuse; set shared/user/user_agent to enable)
  worker_grantable_credentials: null  # Default: null (deny by default; list credential service names to make available inside isolated workers, e.g. [openai, github_private])
  allow_self_config: false         # Default: false (allow agents to modify their own config via a tool)
  thread_summary_temperature: 0.2  # Default: 0.2 (set null to omit temperature and use provider defaults)
  thread_summary_first_threshold: 1  # Default: 1 (first automatic thread summary after first message)
  thread_summary_subsequent_interval: 10  # Default: 10 (messages between later automatic thread summaries)

# defaults.tools are appended to each agent's tools list with duplicates removed.
# Set agents.<name>.include_default_tools: false to opt out a specific agent.
# defaults.streaming is also global-only and controls streamed message edit cadence.
# Tools can be plain strings or single-key dicts with per-agent config overrides.

# MindRoom uses `defaults.thread_summary_temperature` for automatic thread summaries on providers that support runtime temperature overrides.
# Set it to `null` to omit the field and use provider defaults.
# MindRoom always omits temperature for Vertex Claude thread summaries because the provider rejects that field on this path.

# `defaults.worker_grantable_credentials` is a list of credential service names.
# Use built-in names like `openai`, `anthropic`, `google`, `openrouter`, `deepseek`, `cerebras`, `groq`, `ollama`, and `github_private`, or custom shared credential service names you saved through the dashboard or API.
# Google OAuth client config and Google OAuth token services stay in the primary runtime and cannot be mirrored into isolated workers.
# If a tool runs inside an isolated worker, only the services listed here are available to that worker.
# Leave this unset to keep isolated workers deny-by-default for shared credentials.
# This setting never injects provider env vars such as `OPENAI_API_KEY`.

# For worker-routed tools, it only controls which shared credentials MindRoom may load inside isolated workers.
# This setting also does not control local shared-only integrations that stay in the main runtime, such as `homeassistant`.
# Those tools keep using normal shared credentials even when `worker_grantable_credentials` is empty.
# `google_vertex_adc` is intentionally not supported here because isolated workers do not receive ADC files or `GOOGLE_APPLICATION_CREDENTIALS`; use that auth path only in the main runtime.
# Sandbox-proxied execution is stricter than direct local execution: ordinary runtime `.env` values and provider env do not carry over unless they are explicitly passed through.

# Required compaction is destructive inside the active session.
# It uses one Matrix lifecycle notice that is edited in place.
# It runs before a reply when raw history exceeds the hard replay budget.
# It also runs before the next reply after a manual compact_context request.
# Otherwise MindRoom leaves the stored session unchanged and relies on replay fitting for that reply.
# It rewrites the stored session summary and removes compacted raw runs from the live session.
# Agno then replays only the summary plus recent runs.
# Use __MINDROOM_INHERIT__ inside a tool override to clear one inherited authored field
# while keeping the rest of defaults.tools for that agent.
# See agents.md for the full per-agent tool configuration syntax.
# These thresholds only affect automatic thread summaries; manual `set_thread_summary`
# tool calls write immediately and reset the automatic baseline from the new message count.

# Memory system configuration (optional)
memory:
  backend: mem0                    # Global default backend (mem0, file, or none); agents can override with memory_backend
  team_reads_member_memory: false  # Default: false (when true, team reads can access member agent memories)
  embedder:
    provider: openai               # Default: openai (openai, ollama, huggingface, sentence_transformers)
    config:
      model: text-embedding-3-small  # Default embedding model
      api_key: null                # Optional: From env var
      host: null                   # Optional: For self-hosted
      dimensions: null             # Optional: Embedding dimension override (e.g., 256)
  llm:                             # Optional: LLM for memory operations
    provider: ollama
    config: {}
  file:                            # File-backed memory settings (when backend: file)
    path: null                     # Optional: fallback root for file memory paths
    max_entrypoint_lines: 200      # Default: 200 (max lines preloaded from MEMORY.md)
  auto_flush:                      # Background memory auto-flush (file backend only)
    enabled: false                 # Default: false (enable background flush worker)
    flush_interval_seconds: 1800   # Default: 1800 (loop interval)
    idle_seconds: 120              # Default: 120 (idle time before flush eligibility)
    max_dirty_age_seconds: 600     # Default: 600 (force flush after this many seconds dirty)
    stale_ttl_seconds: 86400       # Default: 86400 (drop stale flush-state entries older than this)
    max_cross_session_reprioritize: 5  # Default: 5 (same-agent dirty sessions reprioritized per prompt)
    retry_cooldown_seconds: 30     # Default: 30 (cooldown before retrying a failed extraction)
    max_retry_cooldown_seconds: 300  # Default: 300 (upper bound for retry cooldown backoff)
    batch:
      max_sessions_per_cycle: 10   # Default: 10 (max sessions processed per auto-flush loop)
      max_sessions_per_agent_per_cycle: 3  # Default: 3 (max sessions per agent per loop)
    extractor:
      no_reply_token: NO_REPLY     # Default: NO_REPLY (token indicating no durable memory)
      max_messages_per_flush: 20   # Default: 20 (max messages considered per extraction)
      max_chars_per_flush: 12000   # Default: 12000 (max chars considered per extraction)
      max_extraction_seconds: 30   # Default: 30 (timeout for one extraction job)
      include_memory_context:
        memory_snippets: 5         # Default: 5 (max MEMORY.md snippets for dedupe context)
        snippet_max_chars: 400     # Default: 400 (max chars per snippet)
#
# See docs/memory.md for full auto-flush behavior and tuning guidance.
#
# Set memory.embedder.provider: sentence_transformers to run embeddings in-process.
# MindRoom auto-installs that optional extra on first use.

# Knowledge base configuration (optional)
# Keys must be non-empty single path components, so do not use "", ., .., /, or \ in a knowledge base ID.
knowledge_bases:
  docs:
    path: ./knowledge_docs         # Folder containing documents for this base (this value is the default)
    watch: false                   # Direct external edits require reindex; API mutations still schedule refresh
    chunk_size: 5000               # Default: 5000 (max characters per indexed chunk)
    chunk_overlap: 0               # Default: 0 (overlapping characters between chunks)
    git:                           # Optional: Sync this folder from a Git repository
      repo_url: https://github.com/pipefunc/pipefunc
      branch: main
      poll_interval_seconds: 300  # Interval for background Git refresh scheduling
      lfs: false                   # Optional: enable Git LFS support (requires git-lfs on the runtime host)
      sync_timeout_seconds: 3600   # Optional: abort a hung git command after this many seconds
      skip_hidden: true
      include_patterns: ["docs/**"]  # Optional: root-anchored glob filters
      exclude_patterns: []
      credentials_service: github_private # Optional: service in CredentialsManager

# Voice message handling (optional)
voice:
  enabled: false                   # Default: false
  visible_router_echo: false       # Optional: show the normalized voice text from the router
  stt:
    provider: openai               # Default: openai
    model: whisper-1               # Default: whisper-1
    api_key: null
    host: null
  intelligence:
    model: default                 # Model for command recognition

# Internal MindRoom user account (optional, omit for hosted/public profiles)
# When present, defaults are: username: mindroom_user, display_name: MindRoomUser
mindroom_user:
  username: mindroom_user          # Set before first startup (localpart only)
  display_name: MindRoomUser       # Can be changed later

# Matrix room onboarding/discoverability (optional)
matrix_room_access:
  mode: single_user_private        # Default keeps invite-only/private behavior
  multi_user_join_rule: public     # In multi_user mode: public or knock
  publish_to_room_directory: false # Publish managed rooms in server room directory
  invite_only_rooms: []            # Room keys/aliases/IDs that stay invite-only/private
  reconcile_existing_rooms: false  # Explicit migration of existing managed rooms

# Authorization (optional)
authorization:
  global_users: []                 # Users with access to all rooms
  room_permissions: {}             # Keys: room ID (!id), full alias (#alias:domain), or managed room key (alias)
  default_room_access: false       # Default: false
  aliases: {}                      # Map canonical Matrix user IDs to bridge aliases (see authorization docs)
  agent_reply_permissions: {}      # Per-agent/team/router (or '*') reply allowlists; supports globs like '*:example.com'

# Room-specific model overrides (optional)
# Keys are room aliases, values are model names from the models section
# Example: room_models: {dev: sonnet, lobby: gpt4o}
room_models: {}

# Non-MindRoom bot accounts to exclude from multi-human detection (optional)
# These accounts won't trigger the mention requirement in threads
bot_accounts:
  - "@telegram:example.com"

# Plugin paths (optional)
plugins: []

# Matrix Space grouping (optional)
matrix_space:
  enabled: true                    # Default: true (create a root Matrix Space for managed rooms)
  name: MindRoom                   # Default: "MindRoom" (display name for the root Space)

# Matrix delivery policy (optional)
matrix_delivery:
  ignore_unverified_devices: false # Default: false (keep Matrix E2EE device-trust checks enabled)

# Timezone for scheduled tasks (optional)
timezone: America/Los_Angeles      # Default: UTC
```

`matrix_delivery.ignore_unverified_devices` is an explicit opt-in for outgoing encrypted Matrix sends. Leave it `false` to preserve Matrix E2EE device-trust checks. Setting it to `true` can improve bot delivery when rooms contain unverified devices, but Matrix may encrypt messages for devices the bot has not verified.

## Credential Seeds

MindRoom can bootstrap additional shared credential services at startup from explicit seed declarations. Use this for deployment-managed credentials that should live in `CredentialsManager` without one-off migration scripts. Seeded credentials are marked `_source=env`: MindRoom updates them on later startups, but it never overwrites dashboard-managed credentials (`_source=ui`) or legacy credentials with no source marker.

Set `MINDROOM_CREDENTIAL_SEEDS_FILE` to a JSON file path, or set `MINDROOM_CREDENTIAL_SEEDS_JSON` to the equivalent inline JSON. Relative file paths resolve from the config directory. Credential fields can read from environment variables, from files, or from literal values:

```
[
  {
    "service": "example_oauth_client",
    "credentials": {
      "client_id": {"env": "EXAMPLE_CLIENT_ID"},
      "client_secret": {"env": "EXAMPLE_CLIENT_SECRET"}
    }
  }
]
```

Env refs use the existing secret convention: if `EXAMPLE_CLIENT_SECRET` is unset, MindRoom also checks `EXAMPLE_CLIENT_SECRET_FILE` and reads that file. If any declared field is missing or empty, MindRoom skips that seed instead of creating a partial credential document.
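For example, with file-based secrets (as used by Docker secrets), the seed above can be satisfied either way — the values here are placeholders:

```
export EXAMPLE_CLIENT_ID=my-client-id
export EXAMPLE_CLIENT_SECRET_FILE=/run/secrets/example_client_secret
```

MindRoom reads `EXAMPLE_CLIENT_ID` directly and, since `EXAMPLE_CLIENT_SECRET` is unset, falls back to reading the file named by `EXAMPLE_CLIENT_SECRET_FILE`.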

## Debug Logging

`debug.log_llm_requests` enables pre-provider request assembly logging for troubleshooting. When enabled, MindRoom writes JSONL request records under `debug.llm_request_log_dir` (default: `mindroom_data/logs/llm_requests`). Those records include prompts, messages, tool schemas, model parameters, correlation IDs, requester metadata, and source Matrix event metadata. The same flag also records successful tool-call rows in `mindroom_data/tracking/tool_calls.jsonl` so tool activity can be correlated with LLM request logs. Tool failures are always recorded in `tool_calls.jsonl`, even when request logging is disabled. These artifacts can contain sensitive prompt, argument, and result data, so leave the flag disabled unless you are actively debugging.
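A minimal opt-in sketch using the two keys named above (the directory value shown is just the documented default made explicit):

```
debug:
  log_llm_requests: true                                # opt-in; records contain sensitive data
  llm_request_log_dir: mindroom_data/logs/llm_requests  # default when omitted
```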

## Managed Avatars

MindRoom can generate managed avatars for agents, teams, rooms, and the optional root Matrix Space. Use the optional `avatars.prompts` block to override the built-in prompt styles without editing Python code. Every field is optional and falls back to MindRoom's built-in defaults when omitted.

```
avatars:
  prompts:
    character_style: "professional AI avatar portrait, abstract geometric silhouette"
    room_style: "minimalist wayfinding icon, precise geometry, strong silhouette"
    agent_system_prompt: "You are creating distinctive visual elements for a professional AI agent avatar."
    team_system_prompt: "You are creating distinctive visual elements for a professional AI team avatar."
    room_system_prompt: "You are creating a refined, minimalist icon design for a room avatar."
```

`mindroom avatars generate` only creates missing local avatar files by default. Run `mindroom avatars generate --force` to overwrite existing managed workspace avatar files after changing prompts or styles. `mindroom avatars sync` only fills missing Matrix avatars by default. Run `mindroom avatars sync --force` to replace existing Matrix room or root-space avatars.
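The command pairs described above, side by side for reference:

```
mindroom avatars generate          # create missing local avatar files only
mindroom avatars generate --force  # overwrite existing managed workspace avatars
mindroom avatars sync              # fill missing Matrix avatars only
mindroom avatars sync --force      # replace existing Matrix room/root-space avatars
```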

## Internal User Username

- Configure `mindroom_user.username` with the Matrix localpart you want before first startup.
- After the account is created, `mindroom_user.username` is locked and cannot be changed in-place.
- You can safely change `mindroom_user.display_name` at any time.

## Sections

- [Agents](https://docs.mindroom.chat/configuration/agents/index.md) - Configure individual AI agents
- [Models](https://docs.mindroom.chat/configuration/models/index.md) - Configure AI model providers
- [Teams](https://docs.mindroom.chat/configuration/teams/index.md) - Configure multi-agent collaboration
- [Toolkits](https://docs.mindroom.chat/configuration/toolkits/index.md) - Configure dynamic tool bundles that agents load on demand
- [Cultures](https://docs.mindroom.chat/configuration/cultures/index.md) - Configure shared agent cultures
- [Router](https://docs.mindroom.chat/configuration/router/index.md) - Configure message routing
- [Memory](https://docs.mindroom.chat/memory/index.md) - Configure memory providers and behavior
- [Knowledge Bases](https://docs.mindroom.chat/knowledge/index.md) - Configure file-backed knowledge bases
- [Voice](https://docs.mindroom.chat/voice/index.md) - Configure speech-to-text voice processing
- [Authorization](https://docs.mindroom.chat/authorization/index.md) - Configure user and room access control
- [Matrix Space](https://docs.mindroom.chat/matrix-space/index.md) - Configure the root Matrix Space for managed rooms
- [Skills](https://docs.mindroom.chat/skills/index.md) - Skill format, gating, and allowlists
- [Plugins](https://docs.mindroom.chat/plugins/index.md) - Plugin manifest and tool/skill loading

## Notes

- All top-level sections are optional with sensible defaults, but at least one agent is recommended for Matrix interactions
- A model named `default` is required unless agents, teams, and the router all specify explicit non-`default` models
- Agents can set `knowledge_bases`, but each entry must exist in the top-level `knowledge_bases` section
- `agents.<name>.accept_invites` defaults to `true`; when enabled, authorized ad-hoc room invites are accepted and persisted across restarts without adding those rooms to the static `rooms` list
- Approval-gated tools require the router to be joined to the Matrix room; in ad-hoc rooms accepted through `accept_invites`, approval works only if the router is already joined to that room
- `agents.<name>.context_files` load files from the agent's workspace into each agent instance, so edits take effect on the next reply without restarting (see [Agents](https://docs.mindroom.chat/configuration/agents/index.md))
- `agents.<name>.room_thread_modes` overrides `thread_mode` for specific rooms, and resolution is room-aware for agents, teams, and router decisions (see [Agents](https://docs.mindroom.chat/configuration/agents/index.md))
- `memory.backend` sets the global memory default, and `agents.<name>.memory_backend` overrides it per agent
- `memory.backend: none`, `memory: none`, or `agents.<name>.memory_backend: none` disables built-in durable memory for the effective agent without disabling Agno Learning
- `defaults.max_preload_chars` caps preloaded file context (`context_files`)
- When `authorization.default_room_access` is `false`, only users in `global_users` or room-specific `room_permissions` can interact with agents
- `authorization.agent_reply_permissions` can further restrict which users specific agents/teams/router will reply to
- `authorization.aliases` maps bridge bot user IDs to canonical users so bridged messages inherit the same permissions (see [Authorization](https://docs.mindroom.chat/authorization/index.md))
- `authorization.room_permissions` accepts room IDs, full room aliases, and managed room keys
- `matrix_room_access.mode` defaults to `single_user_private`; this preserves current private/invite-only behavior
- In `multi_user` mode, MindRoom sets managed room join rules and directory visibility from config
- In `multi_user` mode, MindRoom also reconciles managed room power levels so `com.mindroom.thread.tags` can be written at PL0
- Publishing to the room directory requires the managing service account (typically router) to have moderator/admin power in each room
- Thread-tag power-level reconciliation also requires the managing service account to be joined and able to update `m.room.power_levels`
- The `memory` system works out of the box with OpenAI; use `memory.llm` for memory summarization with a different provider
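A sketch combining several of the authorization notes above — the user IDs and room keys are placeholders, and the list-of-users shape for `room_permissions` values is an assumption not spelled out in the inline comments:

```
authorization:
  default_room_access: false       # closed by default: unlisted users get no access
  global_users:
    - "@alice:example.com"         # may interact with agents in every room
  room_permissions:
    dev:                           # managed room key
      - "@bob:example.com"
    "!abc123:example.com":         # raw room ID also accepted
      - "@carol:example.com"
```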

# Agent Configuration

Agents are the core building blocks of MindRoom. Each agent is a specialized AI actor with specific capabilities.

## Basic Agent

```
agents:
  assistant:
    display_name: Assistant
    role: A helpful AI assistant
    model: sonnet
    rooms: [lobby]
```

## Full Configuration

```
agents:
  developer:
    # Display name shown in Matrix
    display_name: Developer

    # Role description - guides the agent's behavior
    role: Generate code, manage files, execute shell commands

    # Model to use (defined in models section)
    model: sonnet

    # Tools the agent can use (plain names or inline config overrides)
    tools:
      - file
      - shell
      - github
      # Per-agent tool config override (single-key dict syntax):
      # - shell:
      #     extra_env_passthrough: "DAWARICH_*"
      #     enable_run_shell_command: true

    # Skills the agent can use (defined in skills section or plugins)
    skills:
      - my_custom_skill

    # Custom instructions
    instructions:
      - Always read files before modifying them
      - Use clear variable names
      - Add comments for complex logic

    # Rooms to join (will be created if they don't exist)
    rooms:
      - lobby
      - dev

    # Accept authorized ad-hoc room invites for this agent
    accept_invites: true

    # Enable markdown formatting
    markdown: true

    # Enable Agno Learning for this agent
    learning: true

    # Learning mode: always (automatic) or agentic (tool-driven)
    learning_mode: always

    # Memory backend override for this agent (optional: mem0, file, or none)
    memory_backend: file

    # Assign agent to one or more configured knowledge bases (optional)
    knowledge_bases: [docs]

    # Optional: additional files loaded into each freshly built agent instance
    context_files:
      - SOUL.md
      - AGENTS.md
      - USER.md
      - IDENTITY.md
      - TOOLS.md
      - HEARTBEAT.md

    # Whether to include defaults.tools for this agent (default: true)
    include_default_tools: true

    # Response mode: "thread" (replies in Matrix threads) or "room" (plain room messages)
    thread_mode: thread

    # Optional room-specific overrides for thread mode
    # Keys may be managed room aliases/names or Matrix room IDs
    room_thread_modes:
      lobby: thread
      bridge_telegram: room
      "!abc123:example.com": room

    # Participate in room-level startup prewarm for rooms already joined at first sync (default: true)
    startup_thread_prewarm: true

    # Tools to run in the sandbox proxy instead of the main process (optional, inherits from defaults)
    worker_tools: [shell, file]

    # How sandbox runtimes are shared (optional, inherits from defaults)
    worker_scope: user_agent

    # Allow this agent to read and modify its own config at runtime
    allow_self_config: false

    # Delegate tasks to other agents via tool calls
    delegate_to:
      - research
      - finance

    # History context controls (all optional, inherit from defaults)
    num_history_runs: null
    num_history_messages: null
    compress_tool_results: false
    max_tool_calls_from_history: null

    # Required compaction is enabled by default.
    # Soft thresholds do not compact by themselves while history still fits.
    # Set enabled: false to disable automatic pre-reply compaction for this agent.
    compaction:
      enabled: true
      threshold_percent: 0.8
      reserve_tokens: 16384
```

## Configuration Options

| Option                        | Type   | Default               | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| ----------------------------- | ------ | --------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `display_name`                | string | *required*            | Human-readable name shown in Matrix as the bot's display name                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| `role`                        | string | `""`                  | System prompt describing the agent's purpose — guides its behavior and expertise                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| `model`                       | string | `"default"`           | Model name (must match a key in the `models` section)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `tools`                       | list   | `[]`                  | Agent-specific tool entries — plain strings or single-key dicts with config overrides (see [Tools](https://docs.mindroom.chat/tools/index.md) and [Per-Agent Tool Configuration](#per-agent-tool-configuration)); effective tools are `tools + defaults.tools` with duplicates removed                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `include_default_tools`       | bool   | `true`                | When `true`, append `defaults.tools` to this agent's `tools`; set to `false` to opt this agent out                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| `skills`                      | list   | `[]`                  | Skill names the agent can use (see [Skills](https://docs.mindroom.chat/skills/index.md))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| `instructions`                | list   | `[]`                  | Extra lines appended to the system prompt after the role                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| `rooms`                       | list   | `[]`                  | Room aliases to auto-join; rooms are created if they don't exist                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| `accept_invites`              | bool   | `true`                | Accept authorized inbound Matrix room invites for this agent. Invited room IDs are persisted so ad-hoc memberships survive restarts and room cleanup. Set to `false` to ignore new invites for this agent. Approval-gated tools still require the router to be joined to the room, so ad-hoc invited rooms only support approval if the router is already joined there                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `markdown`                    | bool   | `null`                | When enabled, the agent is instructed to format responses as Markdown. Inherits from `defaults.markdown` (default: `true`)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| `learning`                    | bool   | `null`                | Enable [Agno Learning](https://docs.agno.com/agents/learning) — the agent builds a persistent profile of user preferences and adapts over time. Inherits from `defaults.learning` (default: `true`)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| `learning_mode`               | string | `null`                | `always`: agent automatically learns from every interaction. `agentic`: agent decides when to learn via a tool call. Inherits from `defaults.learning_mode` (default: `"always"`)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| `memory_backend`              | string | `null`                | Memory backend override for this agent (`"mem0"`, `"file"`, or `"none"`). Inherits from global `memory.backend` when omitted                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| `private`                     | object | `null`                | Optional requester-private state for one shared agent definition. `private.per` defines which requester boundary gets a separate private instance of the agent's state. Private agents must not set `worker_scope`. Internally, MindRoom reuses that same requester boundary for worker execution, but `private.per` is still a different public config concept from `worker_scope`. `private.root` defaults to `<agent_name>_data`, `private.template_dir` copies a local template into each requester root without overwriting existing files, `private.context_files` loads private-root-relative files into role context, and `private.knowledge` adds PrivateAgentKnowledge indexed from that private root. `private` does not implicitly enable file memory, context files, or private knowledge, and private agents cannot participate in teams yet |
| `knowledge_bases`             | list   | `[]`                  | Knowledge base IDs from top-level `knowledge_bases` — gives the agent RAG access to the indexed documents                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `context_files`               | list   | `[]`                  | File paths (relative to the agent's workspace) loaded into each agent instance and prepended to role context (under `Personality Context`)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| `thread_mode`                 | string | `"thread"`            | `thread`: responses are sent in Matrix threads (default). `room`: responses are sent as plain room messages with a single persistent session per room — ideal for bridges (Telegram, Signal, WhatsApp) and mobile                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| `room_thread_modes`           | map    | `{}`                  | Per-room thread mode overrides keyed by room alias/name or Matrix room ID. Values are `thread` or `room`. Overrides apply before `thread_mode` fallback                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| `startup_thread_prewarm`      | bool   | `true`                | When enabled, this bot may prewarm recent thread snapshots for rooms already joined when first sync completes, which can reduce cold-cache latency for early thread replies after startup                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `num_history_runs`            | int    | `null`                | Number of prior Agno runs to include as history context (`null` = all). Mutually exclusive with `num_history_messages`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `num_history_messages`        | int    | `null`                | Max messages from history. Mutually exclusive with `num_history_runs`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `compress_tool_results`       | bool   | `null`                | Compress tool results in history to save context. Inherits from `defaults.compress_tool_results` (default: `false`). On Anthropic and Vertex Claude models, setting this to `true` can mutate replayed tool messages and invalidate prompt-cache prefixes                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `compaction`                  | object | `defaults.compaction` | Per-agent required-compaction overrides                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| `max_tool_calls_from_history` | int    | `null`                | Limit tool call messages replayed from history (`null` = no limit)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| `show_tool_calls`             | bool   | `null`                | Show tool-call markers and trace metadata in Matrix messages. Inherits from `defaults.show_tool_calls` (default: `true`). When `false`, inline markers and `io.mindroom.tool_trace` are omitted from sent Matrix message content. Routed tools may still show generic worker warmup text such as `Preparing isolated worker...`, but that copy never includes tool identifiers or tool-trace metadata. Note: this flag is not currently enforced by the OpenAI-compatible `/v1/chat/completions` path.                                                                                                                                                                                                                                                                                                                                                     |
| `worker_tools`                | list   | `null`                | Tool names to run in the [sandbox proxy](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md) instead of the main process. Inherits from `defaults.worker_tools`. When omitted everywhere, MindRoom uses its built-in default. Set to `[]` to disable proxying for this agent                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| `worker_scope`                | string | `null`                | How sandbox runtimes are shared for non-private agents. `shared`: one per agent. `user`: one per user (shared across agents). `user_agent`: one per user+agent pair. Inherits from `defaults.worker_scope`. Do not set this when the agent uses `private`, because `private.per` already defines the requester partition for that agent                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| `allow_self_config`           | bool   | `null`                | Give this agent a scoped tool to read and modify its own configuration at runtime. Inherits from `defaults.allow_self_config` (default: `false`). Lighter-weight alternative to the `config_manager` tool                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| `delegate_to`                 | list   | `[]`                  | Agent names this agent can delegate tasks to via tool calls (see [Agent Delegation](#agent-delegation))                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |

Each entry in an agent's `knowledge_bases` list must match a key defined under the top-level `knowledge_bases` section in `config.yaml`.
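
For example, a minimal pairing of the two sections might look like this (the base name and path are illustrative):

```
knowledge_bases:
  company_docs:
    path: ./company_docs

agents:
  research:
    knowledge_bases: [company_docs]   # must match the top-level key exactly
```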

Per-agent fields with a `null` default inherit from the `defaults` section at runtime. Per-agent values override them. `memory.backend` is the global memory default, and `agents.<name>.memory_backend` overrides it per agent. Use `memory_backend: none` for stateless agents that should skip prompt memory lookup, automatic memory persistence, and the explicit `memory` tool. `show_stop_button` and `enable_streaming` are global-only settings in `defaults` and cannot be overridden per-agent. The dashboard Agents tab exposes this as the **Memory Backend** selector for each agent.
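
As a sketch, the global default and per-agent override relate like this (agent names are illustrative):

```
memory:
  backend: file            # global default for all agents

agents:
  relay:
    memory_backend: none   # stateless: skips memory lookup, persistence, and the memory tool
```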

Startup thread prewarm is a background, best-effort cache warmup for rooms already joined when first sync completes. Agents use `agents.<name>.accept_invites`, while the router uses its own `router.accept_invites` option with the same durable invite semantics. Teams do not currently expose a separate `accept_invites` option, but accepted team invites are still persisted as durable desired membership. Invite acceptance still respects your normal authorization rules, so unauthorized senders cannot force an entity to join and persist a room. Approval-gated tools are stricter than plain ad-hoc chat access: they only work in rooms the router has already joined.

MindRoom compacts in one visible lifecycle. Per-agent compaction supports `enabled`, `threshold_tokens`, `threshold_percent`, `reserve_tokens`, and `model`. When the active runtime model has a known `context_window`, MindRoom always computes a per-run replay plan that reduces or disables persisted replay before the model call if needed. That replay planner can keep configured replay, reduce raw replay, fall back to summary-only replay, or disable persisted replay for the run. Replay safety always uses the active runtime model window.

Automatic destructive compaction is enabled by default through `defaults.compaction`, but it runs only when raw history exceeds the hard replay budget for the next reply. `threshold_tokens` and `threshold_percent` set a soft trigger budget for planning metadata and compaction notices; crossing that soft trigger while still within the hard budget leaves the stored session unchanged and relies on replay fitting. Use `reserve_tokens` to leave hard-budget headroom, use `model` to choose the summary model, or set `enabled: false` to disable automatic pre-reply compaction for this agent. If you set `compaction.model`, that summary model must also define its own `context_window`, but only for the durable summary-generation pass.

If the current reply needs required compaction to preserve usable history, MindRoom sends `Compacting history...`, compacts before the model call, and edits that same notice with the result. Manual `compact_context` records a durable request that runs before the next reply in the same conversation scope, and it remains available when a compaction model and context window are configured. MindRoom does not run a separate background post-response compaction path.

Compaction rewrites the persisted Agno session in SQLite. Older compacted runs are removed from `session.runs` and replaced by the merged `session.summary`, so raw pre-compaction runs are not retained for later audit or debugging.
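
A sketch of a per-agent compaction override (the model alias and numbers are illustrative, not recommendations):

```
agents:
  code:
    compaction:
      enabled: true
      threshold_percent: 80    # soft trigger for planning metadata and notices
      reserve_tokens: 4000     # headroom subtracted from the hard replay budget
      model: haiku             # summary model; must define its own context_window
```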

Learning data is persisted under `agents/<name>/learning/<agent>.db`, so it survives container restarts when the storage directory is mounted. `context_files` are resolved relative to the agent's workspace directory (`agents/<name>/workspace/`). When the effective memory backend is `file`, the agent's canonical file memory root is that same workspace directory. Absolute paths and `..` traversal are rejected.

## Per-Agent Tool Configuration

Tools can be plain strings or single-key dicts with inline config overrides. This lets you customize tool behavior per agent without affecting other agents that use the same tool.

```
agents:
  code:
    tools:
      - file                              # no override, uses defaults
      - shell:                            # per-agent override
          extra_env_passthrough: "DAWARICH_*"
          enable_run_shell_command: true
  research:
    tools:
      - shell                             # uses global defaults (no overrides)
      - duckduckgo
```

### Merge Order

MindRoom resolves tool configuration in layers:

1. Tool constructor defaults (hardcoded in tool code)
1. Credentials (dashboard or credential store)
1. `defaults.tools` overrides (global inline config)
1. `agents.<name>.tools` overrides (per-agent inline config)
1. Runtime overrides (sandbox proxy, init overrides)

Within the authored layers (`defaults.tools` and `agents.<name>.tools`), each field has three possible states:

- Key omitted: keep the value from the next lower layer unchanged.
- Concrete value: override the next lower layer with that value.
- `__MINDROOM_INHERIT__`: clear an inherited authored override and fall back to the next lower layer.

When the same tool appears in both `defaults.tools` and `agents.<name>.tools`, MindRoom merges them field-by-field. Per-agent values win for overlapping keys, non-overlapping keys are kept from both, and `__MINDROOM_INHERIT__` removes the inherited authored value instead of passing the literal string to the tool.

### Defaults with Overrides

`defaults.tools` also accepts the single-key dict syntax for global overrides that apply to all agents:

```
defaults:
  tools:
    - scheduler
    - shell:
        enable_run_shell_command: true     # global default for all agents
```

### Clearing An Inherited Override

Use `__MINDROOM_INHERIT__` when an agent should keep the tool but stop inheriting one authored field from `defaults.tools`.

Optional-field example:

```
defaults:
  tools:
    - shell:
        extra_env_passthrough: "DAWARICH_*"
        enable_run_shell_command: true

agents:
  research:
    tools:
      - shell:
          extra_env_passthrough: __MINDROOM_INHERIT__
```

`research` still inherits `enable_run_shell_command: true`, but `extra_env_passthrough` falls back to the lower layer (persisted tool config if set, otherwise the tool's normal default). For sandboxed `shell`, provider API keys and other committed runtime credentials are denied by default in both worker startup env and command env. Use `extra_env_passthrough` when a specific exported process env value must be visible to shell commands.

Required non-secret field example:

```
defaults:
  tools:
    - clickup:
        master_space_id: "space-default"

agents:
  ops:
    tools:
      - clickup:
          master_space_id: __MINDROOM_INHERIT__
```

`ops` still uses the `clickup` tool, but `master_space_id` no longer inherits `"space-default"`. MindRoom falls back to the next lower layer, which is usually the stored tool config from the dashboard or credential store.

### `include_default_tools` vs `__MINDROOM_INHERIT__`

- `include_default_tools: false` is coarse-grained: it removes every tool and every override inherited from `defaults.tools` for that agent.
- `__MINDROOM_INHERIT__` is fine-grained: it keeps the tool and the rest of the inherited fields, but clears one specific authored override.

### Security Restrictions

Not all config fields can be overridden inline:

- `type="password"` fields are blocked (credentials must go through the dashboard or credential store)
- `base_dir` is blocked (runtime-only, set by the workspace system)
- Fields with `authored_override: false` in the tool metadata are blocked

MindRoom validates overrides at config load time and rejects unknown field names, wrong value types, and blocked fields with a clear error message.

### Backward Compatibility

Existing configs with plain string tool lists work unchanged:

```
tools: [shell, file, duckduckgo]   # still valid
```

### Config Manager

The `!config` chat command and the `config_manager` tool preserve inline overrides when updating tool lists. Adding or removing tools via chat does not discard existing per-agent overrides on other tools.

## Worker Routing

`worker_tools` decides which tools run in the sandbox proxy instead of the main MindRoom process. When omitted, MindRoom routes `coding`, `file`, `python`, and `shell` through the proxy by default. `worker_scope` controls how those sandbox runtimes are reused between calls. The shared-only integrations require `worker_scope` unset or `shared`. That list includes `spotify`, `homeassistant`, and all configured `mcp_<server_id>` tools. Separately, `gmail`, `google_calendar`, `google_drive`, `google_sheets`, and `homeassistant` always stay local regardless of `worker_tools` (they are never proxied to the sandbox). `spotify` can still be proxied through the sandbox.
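
For instance, per-agent routing overrides might look like this (agent names are illustrative):

```
agents:
  code:
    worker_tools: [shell, python]   # route only these through the sandbox proxy
    worker_scope: user_agent        # one runtime per user+agent pair
  research:
    worker_tools: []                # disable proxying for this agent entirely
```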

The supported `worker_scope` values are:

- `shared`: one runtime per agent, shared by all users.
- `user`: one runtime per user, shared across that user's agents.
- `user_agent`: one runtime per user+agent pair.

Leave `worker_scope` unset for unscoped execution — calls still run in the sandbox, but each call gets a fresh runtime instead of a persistent one. `worker_scope` also affects dashboard credential support and OpenAI-compatible agent eligibility.

### Filesystem Isolation

`worker_scope` controls runtime reuse, not filesystem security. When the effective memory backend is `file`, tools like `shell`, `file`, `python`, and `coding` get a default working directory (`base_dir`) at the agent's canonical workspace root. Without file-backed workspace state, those tools keep their normal defaults such as the current directory. Even when set, `base_dir` is a convenience, not a hard boundary.

Isolation depends on the worker backend:

- **Kubernetes dedicated workers** (`shared`, `user_agent`, unscoped): the runtime can only see its own agent's storage directory plus its worker-local scratch space. This is the strongest isolation available today.
- **Kubernetes dedicated workers** (`user`): the runtime can see all agents' storage, because `user` mode intentionally shares one runtime across multiple agents for a single user. Treat this as a shared workstation.
- **Shared-runner and local backends**: no hard filesystem boundary today, regardless of scope.

Use `user_agent` if you need per-agent filesystem isolation.

For per-workspace env that an agent can edit (PATH, package indexes, npm cache locations, etc.), drop a `.mindroom/worker-env.sh` script in the agent workspace; MindRoom sources it before each worker-routed `shell` or `python` request. MindRoom-owned workspace identity, cache, and virtualenv env names are reasserted after the hook, so hooks cannot redirect `HOME`, `MINDROOM_AGENT_WORKSPACE`, `XDG_CONFIG_HOME`, `XDG_DATA_HOME`, `XDG_STATE_HOME`, `XDG_CACHE_HOME`, `PIP_CACHE_DIR`, `UV_CACHE_DIR`, `PYTHONPYCACHEPREFIX`, or `VIRTUAL_ENV`. With `worker_scope: user`, the same runtime can move between several agent workspaces, and the hook is discovered from the current request's workspace — different agents get different overlays automatically. See [Workspace env hook](https://docs.mindroom.chat/deployment/sandbox-proxy/#workspace-env-hook-mindroomworker-envsh) for filename, filtering, and failure semantics.
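
A minimal hook might look like this (the bin path and index URL are made up for illustration):

```
# .mindroom/worker-env.sh
# Prepend a workspace-local bin directory to PATH.
export PATH="$HOME/.local/bin:$PATH"
# Point pip at an internal package index (example URL).
export PIP_INDEX_URL="https://pypi.example.internal/simple"
# Redirecting HOME or VIRTUAL_ENV here has no effect:
# MindRoom reasserts those variables after the hook runs.
```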

### Where Agent Data Lives

Agents without `private` store all their data in one canonical directory: `agents/<name>/` (context files, workspace, memory, sessions, learning). Changing `worker_scope` changes how tool runtimes are isolated. It does **not** change where that non-private agent's data lives. All runtimes for the same non-private agent read and write the same storage directory. If multiple runtimes run concurrently, files and databases in that directory must tolerate concurrent access. Agents that use `private` are different. They materialize one canonical state root per requester-scoped private instance under `private_instances/<scope-key>/<agent>/`. Workers mount those canonical private-instance roots. They do not own them.

The dashboard's generic credential forms only work for unscoped agents and agents with `worker_scope=shared`. OAuth providers that support scoped dashboard flows, such as the Google Drive, Gmail, Calendar, and Sheets providers, are the exception. For those providers, the dashboard can connect scoped `user` and `user_agent` credentials, but the Google tools still execute in the primary MindRoom runtime. Tools without a scoped OAuth provider still manage `user` and `user_agent` credentials through their worker runtime instead.

For more details on storage layout and isolation, see [Sandbox Proxy Isolation](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md).

## Private Instances

Use `private` when one shared agent definition should behave like a template that materializes a separate requester-local instance at runtime. The YAML definition stays shared. The private root, copied files, file-memory workspace, and private knowledge path do not. Private agents cannot participate in teams yet. That restriction also applies transitively: a shared team member that reaches a private agent through `delegate_to` is rejected.

`private.per` is not a second spelling of `worker_scope`. `private.per` chooses who gets a separate private instance of the agent's state. MindRoom then uses that same requester partition for worker execution, but that is an internal consequence of private execution, not the public meaning of `worker_scope`.

```
knowledge_bases:
  company_docs:
    path: ./company_docs
    watch: false

agents:
  mind:
    display_name: Mind
    role: A persistent personal AI companion
    model: sonnet
    tools: [file, shell]
    worker_tools: [file, shell]
    memory_backend: file
    private:
      per: user
      root: mind_data
      template_dir: ./mind_template
      context_files:
        - SOUL.md
        - AGENTS.md
        - USER.md
        - IDENTITY.md
        - TOOLS.md
        - HEARTBEAT.md
        - MEMORY.md
      knowledge:
        path: memory
        watch: false
    knowledge_bases: [company_docs]
```

Example template directory:

```
mind_template/
├── SOUL.md
├── AGENTS.md
├── USER.md
├── IDENTITY.md
├── TOOLS.md
├── HEARTBEAT.md
├── MEMORY.md
└── memory/
```

In the example above, each requester gets their own effective `mind_data/` root under a canonical private-instance state root in shared storage. That private root is not created next to `config.yaml`. It is not stored under `workers/<worker>/`. Workers mount the same canonical private-instance root when they execute that requester scope. For a `mind` agent with `private.per: user`, different users get different private `mind_data/` trees even though the agent definition is shared.

### Private Fields

| Field                             | Type                   | Default             | Description                                                                                                                                                                                                                                            |
| --------------------------------- | ---------------------- | ------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `private.per`                     | `user` or `user_agent` | *required*          | Which requester boundary gets its own private instance of the agent's state. MindRoom also uses that same boundary for the agent's internal execution scope                                                                                            |
| `private.root`                    | string                 | `<agent_name>_data` | Private root name under the canonical private-instance state root. Must be a relative path and cannot escape with `..`                                                                                                                                 |
| `private.template_dir`            | string                 | `null`              | Optional local directory copied recursively into each private root without overwriting existing files. Relative paths are resolved from `config.yaml`, and absolute paths are also allowed. MindRoom raises an error when the directory does not exist |
| `private.context_files`           | list                   | `null`              | Optional files loaded into role context from inside the private root. Each path is relative to the private root and cannot escape it                                                                                                                   |
| `private.knowledge`               | object                 | `null`              | Optional PrivateAgentKnowledge indexed from inside the private root. Sub-fields below. See [Knowledge Bases](https://docs.mindroom.chat/knowledge/#private-agent-knowledge)                                                                            |
| `private.knowledge.enabled`       | bool                   | `true`              | Whether to index PrivateAgentKnowledge for this private agent instance. Set to `false` to disable indexing                                                                                                                                             |
| `private.knowledge.path`          | string                 | `null`              | Path to a private knowledge directory relative to the private root                                                                                                                                                                                     |
| `private.knowledge.watch`         | bool                   | `true`              | When true, PrivateAgentKnowledge schedules background refresh on access. When false, direct external edits require explicit refresh                                                                                                                    |
| `private.knowledge.chunk_size`    | int                    | `5000`              | Maximum characters per indexed chunk (min: 128)                                                                                                                                                                                                        |
| `private.knowledge.chunk_overlap` | int                    | `0`                 | Overlapping characters between adjacent chunks (min: 0)                                                                                                                                                                                                |
| `private.knowledge.git`           | object                 | `null`              | Optional Git sync configuration for PrivateAgentKnowledge (same schema as top-level `knowledge_bases.<id>.git`)                                                                                                                                        |

### Runtime Behavior

1. MindRoom resolves the canonical private-instance state root from `private.per`.
1. MindRoom creates the effective private root inside that canonical private-instance state root.
1. If `private.template_dir` is set, MindRoom copies the template directory into the private root without overwriting files that already exist there.
1. MindRoom loads any `private.context_files` from that private root when the agent is created or reloaded.
1. If `memory_backend: file` is enabled, MindRoom uses that same private root as the file-memory root for that requester.
1. If `private.knowledge.path` is configured, MindRoom indexes that private-root-relative path as PrivateAgentKnowledge for that requester only.

### Important Rules

- `private` is explicit opt-in.
- `private` does not automatically enable file memory.
- `private` does not automatically load any context files.
- `private` does not automatically create a private knowledge base.
- Private agents cannot participate in teams yet.
- Shared team members that reach a private agent through `delegate_to` are rejected for the same reason.
- If `private.template_dir` is omitted, MindRoom still creates the private root.
- Private agents require an active requester-scoped runtime context.
- MindRoom raises an error instead of silently falling back to a shared config-relative path when that requester scope is missing.
- Set `memory_backend: file` if you want `MEMORY.md` and `memory/` inside the private root to be the agent's actual file memory.
- Set `memory_backend: none` if the private agent should stay stateless while still using its private files and knowledge configuration.
- Set `private.context_files` explicitly for any copied files you want loaded into role context.
- Set `private.knowledge.path` explicitly for any copied files or folders you want indexed as PrivateAgentKnowledge.
- Omit `private.knowledge` entirely, or set `private.knowledge.enabled: false`, when you do not want PrivateAgentKnowledge indexing.
- `private` cannot be combined with `worker_scope`.
- Top-level `knowledge_bases` remain shared or company-wide corpora, so one agent can use both PrivateAgentKnowledge and shared knowledge in the same run.
- Top-level `context_files` remain the shared workspace-relative mechanism used by single-user setups, including the default `mindroom config init` output.
- Custom templates are fully supported.
- The Mind-style filenames shown above are a convention, not a requirement, unless you choose to reference them in `private.context_files` or `private.knowledge.path`.

## Thread Mode Resolution

Thread mode is resolved per message using the current room ID. For an agent, MindRoom checks `room_thread_modes` in this order:

1. An exact room ID key.
1. The managed room key/alias associated with that room ID.
1. Each configured `room_thread_modes` key resolved to a room ID and matched against the current room.

If none match, it falls back to `thread_mode`.

For a team, MindRoom resolves mode per member agent for that room. If all member agents resolve to the same mode, the team uses that mode. If member modes differ, the team defaults to `thread`.

For the router, MindRoom resolves mode using agents relevant to the active room. This includes agents directly configured for the room and agents included via `teams.<name>.rooms`. If all relevant agents resolve to the same mode, the router uses that mode. If modes are mixed, the router defaults to `thread`.
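
Putting the resolution rules together, a per-agent override might look like this (the room ID and alias are illustrative):

```
agents:
  code:
    thread_mode: thread               # fallback when no override matches
    room_thread_modes:
      "!abcdef:example.org": room     # exact Matrix room ID
      lobby: room                     # managed room key/alias
```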

## File-Based Context Loading

You can inject file content directly into an agent's role context without using a knowledge base.

`context_files` behavior:

- Paths are relative to the agent's workspace (`agents/<name>/workspace/`)
- `private.context_files` paths are resolved relative to the effective private root
- Existing files are loaded in list order and added under `Personality Context`
- Missing files are skipped with a warning in logs

MindRoom loads the files when it builds an agent instance. The normal Matrix and OpenAI-compatible reply paths build fresh agent instances per reply/request, so editing a context file affects the next reply without restarting the process.
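
For example (agent and file names are illustrative):

```
agents:
  assistant:
    context_files:
      - PERSONA.md          # loaded from agents/assistant/workspace/PERSONA.md
      - notes/STYLE.md      # subdirectories are fine; `..` traversal is rejected
```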

## Agent Delegation

Agents can delegate tasks to other agents using the `delegate_to` field. When configured, a delegation tool is automatically added to the agent — no need to include `"delegate"` in the `tools` list.

The delegated agent runs as a fresh, one-shot instance with no shared session or history. It executes the task and returns its response as the tool result.

```
agents:
  leader:
    display_name: Leader
    role: Orchestrate tasks by delegating to specialist agents
    model: sonnet
    delegate_to: [code, research]
    rooms: [lobby]

  code:
    display_name: CodeAgent
    role: Generate code, manage files
    model: sonnet
    tools: [file, shell]
    delegate_to: [research]  # can further delegate
    rooms: [lobby]

  research:
    display_name: ResearchAgent
    role: Research topics and provide summaries
    model: sonnet
    tools: [duckduckgo]
    rooms: [lobby]
```

**Constraints:**

- Targets must reference existing agent names in the config
- An agent cannot delegate to itself
- Recursive delegation is supported (agent A delegates to B, B delegates to C) up to a maximum depth of 3

## Naming Rules

Agent and team YAML keys must contain only alphanumeric characters and underscores (matching `^[a-zA-Z0-9_]+$`). Agent and team names must be distinct — the same key cannot appear in both `agents:` and `teams:`.

## Rich Prompt Agents

Certain agent names (the YAML key, not `display_name`) have built-in rich prompts:

`code`, `research`, `calculator`, `general`, `shell`, `summary`, `finance`, `news`, `data_analyst`

When using these names, the built-in prompt replaces the `role` field and any custom `instructions` are ignored.

## Defaults

The `defaults` section sets fallback values for all agents. Any agent that omits a setting inherits the value from here.

```
defaults:
  tools:                                # Tools added to every agent by default (set [] to disable)
    - scheduler
    # Per-agent tool config overrides also work in defaults:
    # - shell:
    #     enable_run_shell_command: true
  markdown: true                        # Format responses as Markdown
  learning: true                        # Enable Agno Learning
  learning_mode: always                 # "always" or "agentic"
  max_preload_chars: 50000              # Hard cap for preloaded context from context_files
  show_stop_button: true                # Show a stop button while agent is responding (global-only, cannot be overridden per-agent)
  num_history_runs: null                # Number of prior runs to include (null = all)
  num_history_messages: null            # Max messages from history (null = use num_history_runs)
  enable_streaming: true                # Stream agent responses via progressive message edits
  streaming:
    update_interval: 5.0                # Steady-state seconds between streamed edits
    min_update_interval: 0.5            # Fast-start seconds between early edits
    interval_ramp_seconds: 15.0         # Set 0 to disable interval ramping
    max_idle: 2.0                       # Event-driven idle ceiling before the next edit
  compress_tool_results: false          # Safer default; enabling can invalidate Anthropic/Vertex Claude prompt caches
  compaction:
    enabled: true
    threshold_percent: 0.8
    reserve_tokens: 16384
  max_tool_calls_from_history: null     # Limit tool call messages replayed from history (null = no limit)
  show_tool_calls: true                 # Show tool-call markers and trace metadata; hidden mode still allows generic worker warmup copy
  worker_tools: null                    # Tool names to route through workers (null = use MindRoom's default routing policy, [] = disable)
  worker_scope: null                    # Worker runtime reuse for proxied tools (shared, user, user_agent)
  allow_self_config: false              # Allow agents to read/modify their own config at runtime
```

`defaults.streaming` is global-only and controls the timing of progressive message edits for streaming responses.

To opt a specific agent out of the default tools:

```
agents:
  researcher:
    display_name: Researcher
    role: Focus on deep research
    include_default_tools: false
    tools: [web_search]
```

# Model Configuration

Models define the AI providers and model IDs used by agents.

## Supported Providers

- `anthropic` - Claude models (Anthropic)
- `openai` - GPT models and OpenAI-compatible endpoints
- `codex` or `openai_codex` - OpenAI models available through a local Codex CLI ChatGPT subscription login
- `google` or `gemini` - Google Gemini models
- `vertexai_claude` - Anthropic Claude models on Google Vertex AI
- `ollama` - Local models via Ollama
- `groq` - Groq-hosted models (fast inference)
- `openrouter` - OpenRouter-hosted models (access to many providers)
- `cerebras` - Cerebras-hosted models
- `deepseek` - DeepSeek models

## Model Config Fields

Each model configuration supports the following fields:

| Field            | Required | Default | Description                                                                                                                                                                                               |
| ---------------- | -------- | ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `provider`       | Yes      | -       | The AI provider (see supported providers above)                                                                                                                                                           |
| `id`             | Yes      | -       | Model ID specific to the provider                                                                                                                                                                         |
| `host`           | No       | `null`  | Host URL for self-hosted models (e.g., Ollama)                                                                                                                                                            |
| `api_key`        | No       | `null`  | API key (usually read from environment variables)                                                                                                                                                         |
| `extra_kwargs`   | No       | `null`  | Additional provider-specific parameters                                                                                                                                                                   |
| `context_window` | No       | `null`  | Context window size in tokens. MindRoom needs it on the active runtime model to enforce replay budgets, and an explicit `compaction.model` also needs its own `context_window` for destructive compaction |

## Configuration Examples

```
models:
  # Anthropic Claude
  sonnet:
    provider: anthropic
    id: claude-sonnet-4-6
    context_window: 200000

  haiku:
    provider: anthropic
    id: claude-haiku-4-5
    context_window: 200000

  # OpenAI
  gpt:
    provider: openai
    id: gpt-5.4

  # OpenAI via Codex CLI subscription
  codex:
    provider: codex
    id: gpt-5.5

  # Google Gemini (both 'google' and 'gemini' work as provider names)
  gemini:
    provider: google
    id: gemini-3.1-pro-preview

  # Anthropic Claude on Vertex AI
  vertex_claude:
    provider: vertexai_claude
    id: claude-sonnet-4-6
    extra_kwargs:
      project_id: your-gcp-project
      region: us-central1

  # Local via Ollama
  local:
    provider: ollama
    id: llama3.2
    host: http://localhost:11434  # Uses dedicated host field

  # OpenRouter (access to many model providers)
  openrouter:
    provider: openrouter
    id: anthropic/claude-sonnet-4.6

  # Groq (fast inference)
  groq:
    provider: groq
    id: llama-3.1-70b-versatile

  # Cerebras
  cerebras:
    provider: cerebras
    id: llama3.1-8b

  # DeepSeek
  deepseek:
    provider: deepseek
    id: deepseek-chat

  # Custom OpenAI-compatible endpoint (e.g., vLLM, llama.cpp server)
  custom:
    provider: openai
    id: my-model
    extra_kwargs:
      base_url: http://localhost:8080/v1
```

## Codex Subscription Models

Use `provider: codex` when you want MindRoom to call models exposed through an authenticated local Codex CLI session instead of the regular OpenAI API. Run `codex login` first so `~/.codex/auth.json` contains ChatGPT OAuth tokens. MindRoom refreshes the access token when needed and sends requests to the Codex Responses endpoint. The model ID may be either the bare Codex slug, such as `gpt-5.5`, or the LLM-plugin-style form `openai-codex/gpt-5.5`. If you keep Codex state outside `~/.codex`, pass `extra_kwargs.codex_home`. For starter config generation, use `mindroom config init --profile public-codex` or `mindroom config init --provider codex`.

```
models:
  default:
    provider: codex
    id: gpt-5.5
    context_window: 258000
    # Prompt caching is enabled automatically per active agent session.
    extra_kwargs:
      reasoning_effort: medium
```

Set Codex reasoning effort through `extra_kwargs.reasoning_effort`. Agno maps this to the Responses API `reasoning.effort` field. Supported effort values are `minimal`, `low`, `medium`, and `high`. The starter Codex profile uses `medium`.

MindRoom sends a Codex prompt-cache key plus the Codex CLI session headers for each active agent session. By default, that key is derived from the current execution identity, so separate Matrix threads can run concurrently without sharing one global cache key. You can set `extra_kwargs.prompt_cache_key` to override the derived key for a model, but avoid a single low-cardinality value for many busy threads unless you intentionally want those requests routed together.

Live testing against the Codex subscription endpoint reported `cached_tokens` only when the request included Codex CLI-style session headers tied to the prompt-cache key. Repeated long requests then reported cache hits, while requests without those headers stayed at `cached_tokens: 0`, and `prompt_cache_retention` was rejected. Treat Codex prompt caching as best-effort rather than guaranteed.
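For instance, a model entry that overrides the derived cache key might look like the following sketch (the key value is illustrative; see the caveat above about low-cardinality keys):

```
models:
  codex_shared:
    provider: codex
    id: gpt-5.5
    extra_kwargs:
      reasoning_effort: high
      prompt_cache_key: research-team   # overrides the identity-derived key
```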

## Context Window

When `context_window` is set, MindRoom uses it to budget persisted replay and required destructive compaction. Whenever the active runtime model has a known `context_window`, MindRoom applies a final replay-fit step that reduces or disables persisted replay for the current run as needed. The budget uses a chars/4 approximation and reserves headroom for the current prompt and output.

Automatic destructive compaction is enabled by default through `defaults.compaction` and runs only when history exceeds the hard replay budget for the next reply; set `enabled: false` in `defaults.compaction` or in a per-agent/per-team `compaction` override to disable automatic pre-reply compaction. Use `threshold_tokens` or `threshold_percent` to set the soft trigger budget that appears in planning metadata and compaction notices; crossing that soft trigger while still within the hard budget leaves the stored session unchanged and relies on replay fitting for that reply. Use `reserve_tokens` to leave hard-budget headroom for the current prompt and output. Required compaction runs before the reply with a Matrix lifecycle notice that is edited in place.

Manual `compact_context` records a durable request that runs before the next reply in the same conversation scope, and remains available when a compaction model and context window are configured. The final replay-fit step still uses the active runtime window, but destructive compaction itself can be available whenever an explicit `compaction.model` has its own `context_window`. If you set `compaction.model`, that summary model must also define its own `context_window` for the durable summary-generation pass.

MindRoom does not mutate configured `num_history_runs` to fit the window. Instead, it computes the replay plan that actually fits the current call and uses compaction to keep future replay healthy. If needed, that replay plan can reduce raw replay, fall back to summary-only replay, or disable persisted replay entirely for the run.

```
models:
  default:
    provider: anthropic
    id: claude-sonnet-4-6
    context_window: 200000  # 200K tokens
```

This is useful for models with smaller context windows or long-running conversations that accumulate persisted history.
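Combining the fields above, a sketch that routes destructive compaction through a dedicated summary model. Model names are illustrative, and this assumes `model` is accepted under `defaults.compaction` just as it is under team-scoped `compaction`; note the summary model needs its own `context_window`:

```
models:
  default:
    provider: anthropic
    id: claude-sonnet-4-6
    context_window: 200000
  summarizer:
    provider: anthropic
    id: claude-haiku-4-5
    context_window: 200000    # required for the durable summary-generation pass

defaults:
  compaction:
    enabled: true
    threshold_percent: 0.8
    reserve_tokens: 16384
    model: summarizer         # summaries generated by this model
```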

## Extra Kwargs

The `extra_kwargs` field passes additional parameters directly to the underlying [Agno](https://docs.agno.com/) model class. Common options include:

- `base_url` - Custom API endpoint (useful for OpenAI-compatible servers)
- `temperature` - Sampling temperature
- `max_tokens` - Maximum tokens in response
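For example, a sketch combining these options (the endpoint and values are illustrative):

```
models:
  local_vllm:
    provider: openai
    id: my-model
    extra_kwargs:
      base_url: http://localhost:8000/v1   # OpenAI-compatible server
      temperature: 0.2
      max_tokens: 4096
```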

## Environment Variables

API keys are read from environment variables:

```
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
GROQ_API_KEY=...
OPENROUTER_API_KEY=...
CEREBRAS_API_KEY=...
DEEPSEEK_API_KEY=...
```

For Ollama, you can also set:

```
OLLAMA_HOST=http://localhost:11434
```

For Vertex AI Claude, set these instead of an API key:

```
ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project
CLOUD_ML_REGION=us-central1
```

Authenticate with `gcloud auth application-default login` or set `GOOGLE_APPLICATION_CREDENTIALS` to a service account key file.

### File-based Secrets

For container environments (Kubernetes, Docker Swarm), you can also use file-based secrets by appending `_FILE` to any environment variable name:

```
# Instead of setting the key directly:
ANTHROPIC_API_KEY=sk-ant-...

# Point to a file containing the key:
ANTHROPIC_API_KEY_FILE=/run/secrets/anthropic-api-key
```

This works for all API key environment variables (e.g., `OPENAI_API_KEY_FILE`, `GOOGLE_API_KEY_FILE`, etc.).
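In Docker Compose, that might look like the following sketch (service and secret names are illustrative):

```
services:
  mindroom:
    image: mindroom
    environment:
      ANTHROPIC_API_KEY_FILE: /run/secrets/anthropic-api-key
    secrets:
      - anthropic-api-key

secrets:
  anthropic-api-key:
    file: ./secrets/anthropic-api-key.txt
```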

# Team Configuration

Teams allow multiple agents to collaborate on tasks. MindRoom supports two collaboration modes.

## Team Modes

### Coordinate Mode

The team coordinator analyzes the task and delegates different subtasks to specific team members:

```
teams:
  dev_team:
    display_name: Dev Team
    role: Development team for building features
    agents: [architect, coder, reviewer]
    mode: coordinate
```

In coordinate mode, the coordinator analyzes the task and selects which agents should handle which subtasks based on their roles. The coordinator decides whether to run tasks sequentially or in parallel based on dependencies, then synthesizes all outputs into a cohesive response.

### Collaborate Mode

All agents work on the same task simultaneously and their outputs are synthesized:

```
teams:
  research_team:
    display_name: Research Team
    role: Research team for comprehensive analysis
    agents: [researcher, analyst, writer]
    mode: collaborate
```

In collaborate mode, the task is delegated to all team members simultaneously. Each agent works on the same task independently, and the coordinator synthesizes all perspectives into a final response. This is useful when you want diverse perspectives on the same problem.

## Full Configuration

```
teams:
  super_team:
    # Display name shown in Matrix
    display_name: Super Team

    # Description of the team's purpose (required)
    role: Multi-disciplinary team for complex tasks

    # Agents in this team (must be defined in agents section)
    agents:
      - code
      - research
      - finance

    # Collaboration mode: coordinate or collaborate (default: coordinate)
    mode: collaborate

    # Rooms the team responds in
    rooms:
      - team-room

    # Model for team coordination (default: "default")
    model: sonnet

    # Participate in room-level startup prewarm for rooms already joined at first sync (default: true)
    startup_thread_prewarm: true

    # Team-scoped replay controls (optional; inherit from defaults when omitted)
    num_history_runs: 8
    num_history_messages: null
    max_tool_calls_from_history: 6

    # Team-scoped required-compaction overrides (optional)
    # Soft thresholds do not compact by themselves while history still fits.
    compaction:
      enabled: true
      threshold_percent: 0.8
      reserve_tokens: 16384
```

## Configuration Fields

| Field                         | Required | Default                                | Description                                                                                                                                                                               |
| ----------------------------- | -------- | -------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `display_name`                | Yes      | -                                      | Human-readable name shown in Matrix                                                                                                                                                       |
| `role`                        | Yes      | -                                      | Description of the team's purpose                                                                                                                                                         |
| `agents`                      | Yes      | -                                      | List of agent names that compose this team                                                                                                                                                |
| `mode`                        | No       | `coordinate`                           | Collaboration mode: `coordinate` or `collaborate`                                                                                                                                         |
| `rooms`                       | No       | `[]`                                   | List of room names the team responds in                                                                                                                                                   |
| `model`                       | No       | `default`                              | Model used for team coordination and synthesis                                                                                                                                            |
| `startup_thread_prewarm`      | No       | `true`                                 | When enabled, this bot may prewarm recent thread snapshots for rooms already joined when first sync completes, which can reduce cold-cache latency for early thread replies after startup |
| `num_history_runs`            | No       | `defaults.num_history_runs`            | Number of prior team-scoped runs to replay                                                                                                                                                |
| `num_history_messages`        | No       | `defaults.num_history_messages`        | Max messages from team-scoped history replayed into the next run                                                                                                                          |
| `max_tool_calls_from_history` | No       | `defaults.max_tool_calls_from_history` | Max tool call messages replayed from team-scoped history                                                                                                                                  |
| `compaction`                  | No       | `defaults.compaction`                  | Team-scoped required-compaction overrides                                                                                                                                                 |

Team YAML keys follow the same naming rules as agents: alphanumeric characters and underscores only, and no overlap with agent names.

`num_history_runs` and `num_history_messages` are mutually exclusive, just like the agent-level settings. When a named team sets these fields, the team scope uses the team-owned policy instead of inheriting one member's history policy.

Team-scoped compaction supports `enabled`, `threshold_tokens`, `threshold_percent`, `reserve_tokens`, and `model`. When the active team model has a known `context_window`, MindRoom always computes a final replay plan for the shared team scope and reduces or disables persisted replay for the run when needed. Replay safety always uses the active team model window.

Automatic destructive compaction is enabled by default through `defaults.compaction`, but it runs only when raw history exceeds the hard replay budget for the next reply. `threshold_tokens` and `threshold_percent` set a soft trigger budget for planning metadata and compaction notices; crossing that soft trigger while still within the hard budget leaves the stored session unchanged and relies on replay fitting. Use `enabled: false` to disable automatic pre-reply compaction for a team.

If you set `compaction.model`, that summary model must also define its own `context_window`, but only for the durable summary-generation pass. Manual `compact_context` remains available when a compaction model and context window are configured. Compaction uses an in-room lifecycle notice that is edited in place.

Startup thread prewarm is a background, best-effort cache warmup for rooms already joined when first sync completes.

## When to Use Each Mode

| Mode          | Use Case                                      | Example                                                                                   |
| ------------- | --------------------------------------------- | ----------------------------------------------------------------------------------------- |
| `coordinate`  | Agents need to do different subtasks          | "Get weather and news" - coordinator assigns weather to one agent, news to another        |
| `collaborate` | Want diverse perspectives on the same problem | "What do you think about X?" - all agents analyze the same question and share their views |

## Dynamic Team Formation

When multiple agents are mentioned in a message (e.g., `@code @research analyze this`), MindRoom automatically forms an ad-hoc team. Note that in threads with multiple human participants, stale thread context does not auto-form a team: a fresh explicit `@mention` in the current message is required before agents respond. Dynamic teams form in these scenarios:

1. **Multiple agents explicitly tagged** - e.g., `@code @research analyze this`
1. **Thread with previously mentioned agents** - Follow-up messages in a thread where multiple agents were mentioned earlier, as long as the thread has not become a multi-human conversation that now requires a fresh explicit mention
1. **Thread with multiple agent participants** - Continuing a conversation where multiple agents have responded, as long as the thread has not become a multi-human conversation that now requires a fresh explicit mention
1. **DM room with multiple agents** - Messages in a DM room containing multiple agents (main timeline only)

### Mode Selection

For dynamic teams, the collaboration mode is selected by AI based on the task:

- Tasks with different subtasks for each agent use **coordinate** mode
- Tasks asking for opinions or brainstorming use **collaborate** mode

When AI mode selection is unavailable or fails, MindRoom falls back to:

- **coordinate** when multiple agents are explicitly tagged in the message (they likely have different roles to fulfill)
- **collaborate** for all other cases, such as agents from thread history or DM rooms (likely discussing the same topic)

Dynamic teams do not have a named `teams:` entry, so their history replay and compaction policy comes from `defaults`, not from any participating agent's overrides.

# Culture Configuration

Cultures let a group of agents share evolving principles, practices, and conventions. A culture is backed by [Agno's CultureManager](https://docs.agno.com/agents/culture) and persists its knowledge in a SQLite database under `mindroom_data/culture/<culture_name>.db`.

## Basic Culture

```
cultures:
  engineering:
    description: Follow clean code principles and write tests
    agents: [developer, reviewer]
```

## Full Configuration

```
cultures:
  engineering:
    # Describes the shared principles this culture captures
    description: Follow clean code principles, write tests, and review before merging

    # Agents assigned to this culture (must be defined in agents section)
    agents:
      - developer
      - reviewer

    # How the culture is updated: automatic, agentic, or manual (default: automatic)
    mode: automatic
```

## Configuration Fields

| Field         | Required | Default       | Description                                                                                                             |
| ------------- | -------- | ------------- | ----------------------------------------------------------------------------------------------------------------------- |
| `description` | No       | `""`          | Description of the shared principles and practices the culture captures                                                 |
| `agents`      | No       | `[]`          | Agent names assigned to this culture (must exist in the `agents` section). Each agent can belong to at most one culture |
| `mode`        | No       | `"automatic"` | How culture knowledge is updated (see modes below)                                                                      |

## Culture Modes

| Mode        | Behavior                                                                                                  |
| ----------- | --------------------------------------------------------------------------------------------------------- |
| `automatic` | Culture knowledge is automatically extracted from every agent interaction and added to the shared context |
| `agentic`   | The agent decides when to update culture knowledge via a tool call                                        |
| `manual`    | Culture context is read-only; the description is included in agent context but knowledge is never updated |

All modes include the culture description in the agent's context. The difference is whether and how the culture's knowledge base evolves over time.

## Rules

- Each agent can belong to **at most one** culture. Assigning the same agent to multiple cultures is a validation error.
- All agents listed in a culture must exist in the top-level `agents` section.
- Culture state is persisted to `mindroom_data/culture/<culture_name>.db` and survives restarts.
- Culture managers are cached and shared across agents in the same culture — if two agents belong to the same culture, they share the same `CultureManager` instance.
- Changes to a culture's `description` or `mode` in `config.yaml` invalidate the cache, so the manager is recreated on the next hot-reload.

# Router Configuration

The router is a built-in system component that handles intelligent message routing and room management. It decides which agent should respond when no specific agent is mentioned, sends welcome messages to new rooms, and manages various system-level tasks.

## Configuration

```
router:
  # Model for routing decisions (defaults to "default")
  model: haiku

  # Accept authorized room invites and preserve them across restarts (default: true)
  accept_invites: true

  # Participate in room-level startup prewarm for rooms already joined at first sync (default: true)
  startup_thread_prewarm: true
```

The router has three configuration options:

| Option                   | Type   | Default     | Description                                                                                                                                                                                 |
| ------------------------ | ------ | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `model`                  | string | `"default"` | Model to use for routing decisions                                                                                                                                                          |
| `accept_invites`         | bool   | `true`      | When enabled, the router accepts authorized room invites, persists accepted room IDs, rejoins them after restart, and preserves them during room cleanup                                    |
| `startup_thread_prewarm` | bool   | `true`      | When enabled, the router may prewarm recent thread snapshots for rooms already joined when first sync completes, which can reduce cold-cache latency for early thread replies after startup |

Startup thread prewarm is a background, best-effort cache warmup for rooms already joined when first sync completes.

## How Routing Works

When a message arrives in a room without a specific agent mention:

1. The router checks if there are configured agents in that room
1. It analyzes the message content and any recent thread context (up to 3 previous messages)
1. Based on the available agents' roles, tools, and instructions, it selects the best match
1. The router posts a message mentioning the selected agent (e.g., "@agent could you help with this?")
1. The mentioned agent sees the mention and responds in the thread

The router uses a structured output schema to ensure consistent routing decisions, including the agent name and reasoning for the selection.

## Router Responsibilities

The router is a special system agent that handles several important tasks beyond message routing:

### Command Handling

The router exclusively handles all commands:

- `!help [topic]` - Get help on commands or specific topics
- `!hi` - Show the welcome message again
- `!schedule <task>` - Schedule tasks and reminders
- `!list_schedules` - List scheduled tasks
- `!cancel_schedule <id>` - Cancel a scheduled task
- `!edit_schedule <id> <task>` - Edit an existing scheduled task
- `!config <operation>` - Manage configuration

Even in single-agent rooms, commands are always processed by the router.

### Welcome Messages

When the router joins a room with no messages (or only a previous welcome message), it automatically sends a welcome message listing:

- All available agents in that room with their descriptions
- How to interact with agents (mentions, commands)
- Quick command reference

Use `!hi` in any room to see the welcome message again.

### Room Management

The router creates and manages rooms:

- Creates configured rooms that don't exist yet
- Invites agents and users to their configured rooms
- Applies `matrix_room_access` policy for managed rooms (when enabled)
- Reconciles managed room power levels so the custom thread-tags state event can be written at PL0
- Generates AI-powered room topics based on configured agents
- Has admin privileges to manage room membership
- Cleans up orphaned bots on startup

By default (`matrix_room_access.mode: single_user_private`), rooms remain invite-only and private in the room directory. In `multi_user` mode, the router can set join rules (`public`/`knock`) and optionally publish rooms to the server directory. That same reconciliation path also updates `m.room.power_levels` for managed rooms, so the router must be joined and able to edit room power levels when thread tags are enabled.
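As a sketch, switching managed rooms to multi-user mode (only the `mode` key appears in this section; any additional join-rule or directory-publishing options are not documented here):

```
matrix_room_access:
  mode: multi_user   # default: single_user_private (invite-only, unlisted)
```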

### Voice Message Processing

Audio events are handled through the shared media pipeline on all bots. The router only posts a visible handoff when it must disambiguate between multiple eligible responders in a multi-agent room. When the responder is already clear, normalized audio follows the normal direct agent or team dispatch rules without an extra router message. Set `voice.visible_router_echo: true` if you also want the router to post the normalized voice text as a display-only message when it is allowed to reply. See [Voice Messages](https://docs.mindroom.chat/voice/index.md) for the detailed dispatch behavior.

### Configuration Confirmations

The router handles interactive configuration changes. When a config change is requested, the router posts a confirmation message with reactions, and only the router processes the confirmation reactions.

### Scheduled Task Restoration

When the router joins a room, it restores any previously scheduled tasks and pending configuration changes to ensure they persist across restarts.

## Routing Behavior Details

### Single-Agent Optimization

When there's only one agent configured in a room, the router skips AI routing entirely. The single agent handles messages directly, which is faster and more efficient.

### Multi-Human Thread Protection

When multiple human users have posted in a thread, the router and agents require an explicit `@mention` before responding. This prevents agents from injecting themselves into human-to-human conversations.

The rules are:

1. **Mentioned agents always respond** — an explicit `@agent` overrides all other rules.
1. **Non-thread messages** — agents auto-respond if they're the only agent in the room, regardless of how many humans are present.
1. **Threads with one human** — normal auto-response behavior applies (the agent continues the conversation).
1. **Threads with two or more humans** — agents stay silent unless explicitly mentioned.
1. **Mentioning a non-agent user** — if a message tags only humans (not agents), agents stay silent.
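The precedence above can be sketched as a small decision function. This is an illustrative model of the rules, not MindRoom's actual implementation:

```python
def should_auto_respond(
    mentioned_agents: set[str],
    mentioned_users: set[str],
    agent: str,
    in_thread: bool,
    humans_in_thread: int,
    agents_in_room: int,
) -> bool:
    # Rule 1: an explicit @agent mention always wins.
    if agent in mentioned_agents:
        return True
    # Rule 5: a message that tags only humans silences agents.
    if mentioned_users and not mentioned_agents:
        return False
    # Rule 2: outside threads, a lone agent auto-responds.
    if not in_thread:
        return agents_in_room == 1
    # Rules 3-4: inside threads, auto-response requires at most one human.
    return humans_in_thread <= 1
```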

#### Bot accounts

By default, any Matrix user that is not a MindRoom agent counts as a "human" for the rules above. This includes bridge bots (Telegram, Slack, etc.) and other non-MindRoom bots. If a bridge bot relays a message into a thread, it looks like a second human to MindRoom and triggers the mention requirement.

To prevent this, list those accounts in `bot_accounts`:

```
bot_accounts:
  - "@telegram:example.com"
  - "@slackbot:example.com"
```

Accounts in this list are treated like MindRoom agents for response logic — their messages and mentions don't count toward the multi-human detection.

### Routing Fallback

If routing fails (model error, invalid suggestion, etc.), the router sends a helpful error message: "⚠️ I couldn't determine which agent should help with this. Please try mentioning an agent directly with @ or rephrase your request."

Users can always mention agents directly with `@agent_name` to bypass routing.

## Note on the Router Agent

The router is always present and cannot be disabled. It automatically joins any room with configured agents. If no `router` section is configured, it uses the default model.
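If you want the router on a specific model, a minimal sketch (assuming the `router` section accepts a `model` key the way agents do):

```
router:
  model: sonnet  # falls back to the default model when omitted
```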

# Tools

MindRoom includes 100+ built-in tools and presets that agents can use to work with files, services, external APIs, and Matrix-native workflows.

## Enabling Tools

Tools are enabled per-agent in the configuration. Each tool entry can be a plain string or a single-key dict with inline config overrides:

```
agents:
  assistant:
    display_name: Assistant
    role: A helpful assistant with file and web access
    model: sonnet
    tools:
      - file
      - shell:
          extra_env_passthrough: "DAWARICH_*"
      - github
      - duckduckgo
```

You can also assign tools to all agents globally:

```
defaults:
  tools:
    - scheduler
```

`defaults.tools` are merged into each agent's own `tools` list with duplicates removed. Set `defaults.tools: []` to disable global default tools, or set `agents.<name>.include_default_tools: false` to opt out a specific agent. When the same tool appears in both `defaults.tools` and an agent's `tools` with inline overrides, the per-agent overrides take priority, with non-overlapping keys merged from both. See [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration) for the full override syntax and merge order. Configured MCP servers also appear here as dynamic tools named `mcp_<server_id>`. See [MCP](https://docs.mindroom.chat/mcp/index.md) for the `mcp_servers` config and naming rules.
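For example, under the merge rules above, this configuration gives `assistant` both `scheduler` and `shell`, with the per-agent `shell` overrides taking priority and non-overlapping keys merged in from the default entry (values are illustrative):

```
defaults:
  tools:
    - scheduler
    - shell:
        extra_env_passthrough: "CI_*"

agents:
  assistant:
    tools:
      - shell:
          shell_path_prepend:
            - /opt/custom/bin
# Effective tools for assistant: scheduler, shell
# Effective shell config: shell_path_prepend from the agent entry,
# extra_env_passthrough merged in from defaults.
```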

## Browse By Topic

- [Execution & Coding](https://docs.mindroom.chat/tools/execution-and-coding/index.md) - Local files, shell, Python, coding helpers, and worker-routed execution tools.
- [Data & Databases](https://docs.mindroom.chat/tools/data-and-databases/index.md) - SQL, databases, Google Drive files, spreadsheets, tabular analysis, and financial/business datasets.
- [Web Search](https://docs.mindroom.chat/tools/web-search/index.md) - Search engines and search APIs.
- [Web Scraping & Browser](https://docs.mindroom.chat/tools/web-scraping-and-browser/index.md) - Crawlers, extractors, browser automation, and page-reading tools.
- [Research Sources](https://docs.mindroom.chat/tools/research-sources/index.md) - ArXiv, Wikipedia, PubMed, and Hacker News.
- [AI & Generation](https://docs.mindroom.chat/tools/ai-and-generation/index.md) - Image, video, speech, and transcription APIs.
- [Media & Content](https://docs.mindroom.chat/tools/media-and-content/index.md) - Media processing, brand/media retrieval, and Spotify.
- [Matrix & Attachments](https://docs.mindroom.chat/tools/matrix-and-attachments/index.md) - Matrix-native messaging, thread tags and summaries, low-level Matrix API access, and attachment-aware workflows.
- [Messaging & Social](https://docs.mindroom.chat/tools/messaging-and-social/index.md) - Email, chat, and social/community integrations.
- [Project Management](https://docs.mindroom.chat/tools/project-management/index.md) - Git hosting, issue trackers, docs platforms, and task managers.
- [Calendar & Scheduling](https://docs.mindroom.chat/tools/calendar-and-scheduling/index.md) - Calendar APIs and MindRoom scheduling tools.
- [Memory & Storage](https://docs.mindroom.chat/tools/memory-and-storage/index.md) - Explicit memory tools and external memory providers.
- [Agent Orchestration](https://docs.mindroom.chat/tools/agent-orchestration/index.md) - Subagents, delegation, config tools, OpenClaw compatibility, and Claude Agent sessions.
- [Automation & Platforms](https://docs.mindroom.chat/tools/automation-and-platforms/index.md) - Infrastructure automation, generic APIs, and platform aggregators.
- [Location, Commerce, & Home](https://docs.mindroom.chat/tools/location-commerce-and-home/index.md) - Maps, weather, commerce, and Home Assistant.

## Tool Presets And Implied Tools

Some entries are config-only presets rather than runtime toolkits. `openclaw_compat` expands to a native bundle of MindRoom tools. Some tools also imply companion tools through `Config.IMPLIED_TOOLS`. Today `matrix_message` implies `attachments`, so the effective tool set includes both even when only `matrix_message` is configured explicitly.
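A sketch of how implied-tool expansion behaves. This mirrors the described `Config.IMPLIED_TOOLS` behavior for illustration and is not MindRoom's actual code:

```python
# Hypothetical mirror of Config.IMPLIED_TOOLS, shown for illustration only.
IMPLIED_TOOLS = {"matrix_message": ["attachments"]}

def expand_implied(tools: list[str]) -> list[str]:
    """Return the effective tool list with implied companions appended once."""
    effective = list(tools)
    for tool in tools:
        for implied in IMPLIED_TOOLS.get(tool, []):
            if implied not in effective:
                effective.append(implied)
    return effective
```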

## Tool Runtime Context

When a tool runs inside a Matrix-connected agent, it receives a `ToolRuntimeContext` via a context variable. This context carries the current `room_id`, source `thread_id`, canonical `resolved_thread_id`, `requester_id`, `agent_name`, the Matrix client, the active config, and runtime paths. `thread_id` preserves the raw inbound thread provenance, while `resolved_thread_id` is the canonical thread scope after compatible plain replies and other transitive resolution are applied.

Tools like `matrix_message`, `matrix_room`, `thread_tags`, and `matrix_api` use this context to act on the correct room and canonical thread without the caller passing explicit IDs. `thread_tags` can also target another authorized room, but it still checks the target room's canonical thread root and requester membership before writing the shared tag state.

`thread_tags` call behavior:

- `tag_thread()` and `untag_thread()` still use the active thread when the caller explicitly repeats the current `room_id`.
- `list_thread_tags()` uses the active thread by default, but passing `room_id` without `thread_id` forces room-wide listing even from inside an active thread.
- `list_thread_tags(tag=...)` narrows both thread-specific and room-wide responses to the requested tag only.
- `list_thread_tags(include_tag=..., exclude_tag=...)` filters which threads are returned: `include_tag` keeps only threads with that tag, `exclude_tag` removes threads with that tag. Both can be combined. Unlike `tag` (which narrows the output payload), these filter which threads appear at all.
- `thread_tags` validates and normalizes predefined payload schemas for `blocked.data.blocked_by`, `waiting.data.waiting_on`, `priority.data.level`, and `due.data.deadline`.
- `thread_tags` intentionally replaces the removed experimental `thread_resolution` tool and does not auto-read old `com.mindroom.thread.resolution` markers.

`matrix_api` defaults `room_id` to the active room, supports authorized cross-room targeting, never infers event IDs or state keys from thread context, and also supports room-scoped full-text search through `action="search"`.
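Putting the `thread_tags` listing behaviors together (call shapes as described above; the room ID and tag names are illustrative):

```
list_thread_tags()                                  # active thread by default
list_thread_tags(room_id="!room:example.com")       # room-wide listing, even inside a thread
list_thread_tags(tag="priority")                    # narrow the output payload to one tag
list_thread_tags(include_tag="blocked", exclude_tag="waiting")  # filter which threads appear
```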

## Worker-Routed Execution

Some tools default to running in a sandboxed worker container instead of the primary agent process. The current worker-routed defaults are `file`, `shell`, `python`, and `coding`. See [Sandbox Proxy Isolation](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md) for deployment details and worker-scope behavior.
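You can change which tools route through the worker with `defaults.worker_tools` or per-agent `worker_tools`. A sketch, assuming the list value shape:

```
defaults:
  worker_tools:
    - shell
    - python

agents:
  editor:
    worker_tools: []  # run everything in the primary process for this agent
```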

## Shared-Only Integrations

Some dashboard integrations are restricted to shared or unscoped execution and cannot be used by agents with isolating worker scopes. The current shared-only integrations are `spotify`, `homeassistant`, and all configured `mcp_<server_id>` tools.

## Automatic Dependency Installation

Each tool declares its optional Python dependencies in `pyproject.toml`. When a tool is enabled but its dependencies are missing, MindRoom can auto-install the required extra at runtime. Set `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` to disable that behavior.
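To disable auto-install, set the variable in the environment that runs MindRoom:

```shell
# Disable runtime auto-install of missing tool extras.
export MINDROOM_NO_AUTO_INSTALL_TOOLS=1
```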

## Related Docs

- [MCP](https://docs.mindroom.chat/mcp/index.md) - Configure native MCP client servers and expose them as MindRoom tools.
- [Plugins](https://docs.mindroom.chat/plugins/index.md) - Extend MindRoom with custom tools and skills.
- [Attachments](https://docs.mindroom.chat/attachments/index.md) - Attachment lifecycle and context scoping.
- [Scheduling](https://docs.mindroom.chat/scheduling/index.md) - Chat command scheduling and task behavior.
- [OpenClaw Workspace Import](https://docs.mindroom.chat/openclaw/index.md) - `openclaw_compat` preset and workspace portability.

# Execution & Coding

Use these tools to inspect and edit local files, run shell commands or Python, work in a code-aware workspace, manage Docker resources, and generate local artifacts such as exports and charts.

## What This Page Covers

This page documents the built-in tools in the `execution-and-coding` group. Use these tools when you need local execution, coding-oriented file access, lightweight computation, or artifact generation inside the agent runtime.

## Tools On This Page

- `file` - Generic local file reads, writes, listings, searches, and chunk edits.
- `shell` - Shell command execution with background handles, runtime env passthrough, and PATH overrides.
- `python` - Python code execution, file helpers, and package installation in the active runtime.
- `coding` - Code-oriented reads, precise edits, grep, file discovery, and directory listing.
- `docker` - Local Docker container, image, volume, and network management.
- `calculator` - Exact arithmetic and small numeric helper functions.
- `reasoning` - Internal `think` and `analyze` scratchpad tools for structured reasoning.
- `file_generation` - JSON, CSV, PDF, and text file export helpers.
- `visualization` - Matplotlib-backed chart generation.
- `sleep` - Intentional delays and pauses.

## Common Setup Notes

All ten tools on this page are exposed as `setup_type: none` in the live tool registry, so they do not require dashboard OAuth setup or credential forms before they appear as available. `src/mindroom/api/integrations.py` currently has no dedicated integration endpoints for them because they are local-runtime tools rather than OAuth-backed services.

MindRoom's built-in default worker-routed set is `coding`, `file`, `python`, and `shell`. You can override the effective routed set with `defaults.worker_tools` or `agents.<name>.worker_tools`. When `worker_scope` is unset, worker-routed calls still execute in the sandbox, but they use a fresh runtime per call instead of a persistent scoped worker:

- `worker_scope: shared` reuses one runtime per agent.
- `worker_scope: user` reuses one runtime per requester across that requester's agents.
- `worker_scope: user_agent` reuses one runtime per requester-agent pair.

`worker_scope` controls runtime reuse, not filesystem security. See [Sandbox Proxy Isolation](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md) for the deployment model, storage visibility rules, and scope tradeoffs.

When an agent has a canonical workspace root, MindRoom injects that workspace as `base_dir` for tools that expose a `base_dir` constructor field. In normal `config.yaml` authoring, `base_dir` is therefore usually runtime-managed rather than something you set inline. Workspace-backed agents also receive the optional `mindroom_output_path` argument on eligible tools: set it to a workspace-relative file path to save the full supported tool output to that file and return a compact receipt to the model. Agents without a resolved workspace do not receive this argument. No extra configuration is required beyond that workspace root, and `MINDROOM_TOOL_OUTPUT_REDIRECT_MAX_BYTES` only overrides the default 64 MiB per-output write cap.

Missing optional dependencies can auto-install at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set. That matters most here for `docker`, `file_generation`, and `visualization`, which depend on Docker access, `reportlab`, and `matplotlib`.

```
defaults:
  worker_scope: user_agent

agents:
  code:
    display_name: Code
    role: Edit code, run local checks, and export artifacts
    model: sonnet
    memory_backend: file
    tools:
      - coding
      - file
      - python:
          restrict_to_base_dir: true
      - shell:
          extra_env_passthrough:
            - GITHUB_TOKEN
            - INTERNAL_API_*
          shell_path_prepend:
            - /run/wrappers/bin
            - /opt/custom/bin
      - file_generation:
          output_directory: exports
      - visualization:
          output_dir: charts
```

## `file`

`file` is the generic local filesystem toolkit for read, write, list, search, delete, and chunk-based edits.

### What It Does

`file` exposes `save_file()`, `read_file()`, `delete_file()`, `list_files()`, `search_files()`, `read_file_chunk()`, and `replace_file_chunk()`. The underlying Agno toolkit resolves paths against `base_dir` and rejects paths that escape that root. `read_file()` enforces `max_file_length` and `max_file_lines`, and it tells the caller to use chunk reads when a file is too large. `search_files()` uses glob patterns relative to `base_dir` rather than full-text search. MindRoom marks `file` as worker-routed by default, so it usually executes in the sandboxed worker runtime unless you override `worker_tools`.

### Configuration

| Option                      | Type      | Required | Default    | Notes                                                                                                                                                      |
| --------------------------- | --------- | -------- | ---------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `base_dir`                  | `text`    | `no`     | `null`     | Runtime-managed working root when an agent workspace exists, otherwise the current directory. This field is not normally authored inline in `config.yaml`. |
| `enable_save_file`          | `boolean` | `no`     | `true`     | Enable `save_file()`.                                                                                                                                      |
| `enable_read_file`          | `boolean` | `no`     | `true`     | Enable `read_file()`.                                                                                                                                      |
| `enable_delete_file`        | `boolean` | `no`     | `false`    | Enable `delete_file()`.                                                                                                                                    |
| `enable_list_files`         | `boolean` | `no`     | `true`     | Enable `list_files()`.                                                                                                                                     |
| `enable_search_files`       | `boolean` | `no`     | `true`     | Enable `search_files()`.                                                                                                                                   |
| `enable_read_file_chunk`    | `boolean` | `no`     | `true`     | Enable `read_file_chunk()`.                                                                                                                                |
| `enable_replace_file_chunk` | `boolean` | `no`     | `true`     | Enable `replace_file_chunk()`.                                                                                                                             |
| `expose_base_directory`     | `boolean` | `no`     | `false`    | Include absolute file paths and `base_directory` in `search_files()` output.                                                                               |
| `max_file_length`           | `number`  | `no`     | `10000000` | Maximum character count for `read_file()`.                                                                                                                 |
| `max_file_lines`            | `number`  | `no`     | `100000`   | Maximum line count for `read_file()`.                                                                                                                      |
| `line_separator`            | `text`    | `no`     | `"\n"`     | Separator used by the chunk helpers.                                                                                                                       |
| `all`                       | `boolean` | `no`     | `false`    | Enable every upstream `file` function at once.                                                                                                             |

### Example

```
agents:
  editor:
    tools:
      - file:
          enable_delete_file: true
          max_file_lines: 2000
```

```
read_file("README.md")
read_file_chunk("src/mindroom/tools/file.py", 0, 80)
replace_file_chunk("docs/notes.md", 10, 12, "Updated text")
list_files(directory="src")
search_files("**/*.py")
save_file("temporary notes\n", "scratch/notes.txt")
```

### Notes

- `file` is the compatibility-friendly general file toolkit, but `coding` is a better default for code-editing agents.
- `delete_file()` is disabled by default, so destructive access is opt-in.
- `search_files()` matches filesystem globs, not content inside files.

## `shell`

`shell` runs argv-style commands and is MindRoom's most configurable execution tool for sandboxed command-line work.

### What It Does

`shell` exposes `run_shell_command()`, `check_shell_command()`, and `kill_shell_command()`. `run_shell_command()` expects a list of arguments, not a shell-parsed string. If the command exits within `timeout`, the tool returns the last `tail` lines of stdout, or stderr on non-zero exit. If the timeout is exceeded, the process keeps running in the background and the tool returns a `shell:...` handle. Use `check_shell_command(handle)` to poll a backgrounded command and `kill_shell_command(handle)` to stop it. MindRoom keeps up to 16 backgrounded shell processes per runner and automatically sweeps finished handle records after roughly 10 minutes. Unlike upstream Agno's simple shell wrapper, proxied MindRoom shell execution uses a deny-by-default env, supports explicit exported-process-env passthrough patterns, and supports PATH prepends. MindRoom marks `shell` as worker-routed by default, so it usually executes in the sandboxed worker runtime.

### Configuration

| Option                     | Type      | Required | Default | Notes                                                                                                                                                                                                  |
| -------------------------- | --------- | -------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `base_dir`                 | `text`    | `no`     | `null`  | Runtime-managed working directory when an agent workspace exists. This field is not normally authored inline in `config.yaml`.                                                                         |
| `enable_run_shell_command` | `boolean` | `no`     | `true`  | Enable `run_shell_command()` and the companion handle APIs.                                                                                                                                            |
| `all`                      | `boolean` | `no`     | `false` | Enable all shell functions.                                                                                                                                                                            |
| `extra_env_passthrough`    | `text`    | `no`     | `null`  | Extra exported process env var names or glob patterns exposed to sandboxed shell execution beyond the small default system env. This matches exported process env, not config-adjacent `.env` entries. |
| `shell_path_prepend`       | `text`    | `no`     | `null`  | Extra PATH entries prepended for shell subprocesses only.                                                                                                                                              |

### Example

```
agents:
  ops:
    worker_scope: user_agent
    tools:
      - shell:
          extra_env_passthrough:
            - GITHUB_TOKEN
            - INTERNAL_API_*
          shell_path_prepend:
            - /run/wrappers/bin
            - /opt/custom/bin
```

```
run_shell_command(["git", "status", "--short"], tail=50)
run_shell_command(["bash", "-lc", "sleep 300 && echo done"], timeout=2)
check_shell_command("shell:abcd1234")
kill_shell_command("shell:abcd1234")
```

### Notes

- `extra_env_passthrough` only affects sandboxed `shell` calls and matches exported process env, not config-adjacent `.env` entries. MindRoom forwards no committed runtime `.env` values by default; matched values pass through except runner control names (`MINDROOM_API_KEY`, `MINDROOM_LOCAL_CLIENT_SECRET`, `MINDROOM_SANDBOX_PROXY_TOKEN`, `MINDROOM_SANDBOX_STARTUP_MANIFEST_PATH`) and names starting with `MINDROOM_SANDBOX_`.
- In authored YAML, `extra_env_passthrough` and `shell_path_prepend` can be written as lists, and MindRoom normalizes them to the tool's comma-or-newline form.
- Background handles survive multiple requests to the same long-lived runner process, but they do not survive runner restarts.
- `shell_path_prepend` deduplicates PATH entries and only changes subprocess PATH, not the main MindRoom process PATH.
- For per-workspace env that an agent can edit on the fly (PATH prefixes, `NPM_CONFIG_PREFIX`, `NPM_CONFIG_CACHE`, `PIP_INDEX_URL`, etc.), drop a `.mindroom/worker-env.sh` script in the workspace and `export` the values you want — see "Workspace env hook" in `docs/deployment/sandbox-proxy.md`.
- MindRoom-owned env names are reasserted after the hook and cannot be redirected from `.mindroom/worker-env.sh`: `HOME`, `MINDROOM_AGENT_WORKSPACE`, `XDG_CONFIG_HOME`, `XDG_DATA_HOME`, `XDG_STATE_HOME`, `XDG_CACHE_HOME`, `PIP_CACHE_DIR`, `UV_CACHE_DIR`, `PYTHONPYCACHEPREFIX`, and `VIRTUAL_ENV`.
- Example:

```
mkdir -p .mindroom .local/bin .cache/npm
cat > .mindroom/worker-env.sh <<'EOF'
export NPM_CONFIG_PREFIX="$PWD/.local"
export NPM_CONFIG_CACHE="$PWD/.cache/npm"
export PATH="$PWD/.local/bin:$PATH"
EOF
```

## `python`

`python` executes arbitrary Python code, runs Python files, exposes a few file helpers, and can install packages into the active interpreter environment.

### What It Does

`python` exposes `save_to_file_and_run()`, `run_python_code()`, `pip_install_package()`, `uv_pip_install_package()`, `run_python_file_return_variable()`, `read_file()`, and `list_files()`. The upstream toolkit can execute arbitrary Python code and warns that it should be used with human supervision. MindRoom wraps the installer functions so both `pip_install_package()` and `uv_pip_install_package()` install into the current interpreter environment through MindRoom's shared installer path. When `python` is worker-routed, that means package installs land in the active worker environment rather than the primary agent process. MindRoom marks `python` as worker-routed by default, so sandbox execution is the normal path when a sandbox backend is configured.

### Configuration

| Option                 | Type      | Required | Default | Notes                                                                                                                     |
| ---------------------- | --------- | -------- | ------- | ------------------------------------------------------------------------------------------------------------------------- |
| `base_dir`             | `text`    | `no`     | `null`  | Runtime-managed working root when an agent workspace exists. This field is not normally authored inline in `config.yaml`. |
| `safe_globals`         | `text`    | `no`     | `null`  | Advanced raw constructor input that maps to the upstream `safe_globals` dict parameter.                                   |
| `safe_locals`          | `text`    | `no`     | `null`  | Advanced raw constructor input that maps to the upstream `safe_locals` dict parameter.                                    |
| `restrict_to_base_dir` | `boolean` | `no`     | `true`  | Constrain the file helper methods to `base_dir`.                                                                          |

### Example

```
agents:
  analyst:
    tools:
      - python:
          restrict_to_base_dir: true
```

```
run_python_code("total = sum(i * i for i in range(10))", variable_to_return="total")
save_to_file_and_run("scripts/demo.py", "result = 6 * 7", variable_to_return="result")
run_python_file_return_variable("scripts/demo.py", variable_to_return="result")
pip_install_package("rich")
read_file("scripts/demo.py")
list_files()
```

### Notes

- `restrict_to_base_dir` only constrains the file helper paths, not what arbitrary Python code can do once executed.
- `safe_globals` and `safe_locals` are exposed directly from the upstream constructor and are mainly useful for advanced programmatic wiring, not typical hand-written YAML.
- If you need runtime-scoped environment isolation, rely on worker-routed execution instead of assuming in-process Python emulation is a security boundary.
- Worker-routed `python` execution also receives `.mindroom/worker-env.sh` overlay env via `os.environ` (e.g., `PIP_INDEX_URL`). See "Workspace env hook" in `docs/deployment/sandbox-proxy.md`.
- Workspace identity, worker cache, and virtualenv env names remain controlled by MindRoom, so hooks cannot redirect `HOME`, `MINDROOM_AGENT_WORKSPACE`, `XDG_CONFIG_HOME`, `XDG_DATA_HOME`, `XDG_STATE_HOME`, `XDG_CACHE_HOME`, `PIP_CACHE_DIR`, `UV_CACHE_DIR`, `PYTHONPYCACHEPREFIX`, or `VIRTUAL_ENV`.

## `coding`

`coding` is MindRoom's code-oriented local toolkit with line-numbered reads, precise text edits, grep, file discovery, and directory listing.

### What It Does

`coding` exposes `read_file()`, `edit_file()`, `write_file()`, `grep()`, `find_files()`, and `ls()`. `read_file()` adds line numbers and pagination hints when output is truncated. `edit_file()` requires the old text to match exactly one location and returns a unified diff after the edit. If exact matching fails, `edit_file()` falls back to whitespace-and-Unicode-normalized fuzzy matching. `grep()` prefers `rg` when available and falls back to Python regex search otherwise. `find_files()` filters hidden and gitignored paths, and `ls()` keeps dotfiles visible while adding `/` markers to directories. All path resolution stays inside `base_dir`. MindRoom marks `coding` as worker-routed by default.

### Configuration

| Option     | Type   | Required | Default | Notes                                                                                                                                              |
| ---------- | ------ | -------- | ------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| `base_dir` | `text` | `no`     | `null`  | Runtime-managed working directory for code operations when an agent workspace exists. This field is not normally authored inline in `config.yaml`. |

### Example

```
agents:
  code:
    tools:
      - coding
```

```
read_file("src/mindroom/tools/shell.py", offset=1, limit=120)
grep("default_execution_target", path="src/mindroom/tools")
find_files("**/*.md", path="docs")
edit_file("docs/example.md", "old text", "new text")
write_file("scratch/todo.txt", "first line\nsecond line\n")
ls("src/mindroom")
```

### Notes

- Prefer `coding` over `file` for code-editing agents because it gives better read pagination, better search, and safer text replacement behavior.
- Recursive `grep()` and `find_files()` filter hidden and gitignored paths automatically, but explicit file targets are not filtered.
- `edit_file()` refuses ambiguous edits, so widen the surrounding context in `old_text` when a match is not unique.

## `docker`

`docker` manages local containers, images, volumes, and networks through the Docker Python client.

### What It Does

`docker` exposes container operations such as `list_containers()`, `run_container()`, `exec_in_container()`, `start_container()`, `stop_container()`, `remove_container()`, `get_container_logs()`, and `inspect_container()`. It also exposes image, volume, and network operations including `pull_image()`, `build_image()`, `tag_image()`, `list_volumes()`, `create_volume()`, `list_networks()`, and `connect_container_to_network()`. On startup, the toolkit checks common Docker socket locations and pings the Docker daemon. `docker` defaults to primary execution, so it normally runs beside the main agent process rather than in the worker sandbox.

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
agents:
  platform:
    tools:
      - docker
```

```
list_containers()
run_container("postgres:16", detach=True, environment={"POSTGRES_PASSWORD": "example"})
get_container_logs("postgres", tail=50)
list_images()
list_networks()
```

### Notes

- The runtime that executes `docker` must have access to a working Docker socket or daemon.
- If you explicitly route `docker` through workers with `worker_tools`, the worker runtime also needs Docker access.
- `get_container_logs(stream=True)` does not return a live stream payload to the model and instead returns a status message telling you to use non-streaming mode.

## `calculator`

`calculator` provides exact small-math helper functions without requiring arbitrary code execution.

### What It Does

`calculator` exposes `add()`, `subtract()`, `multiply()`, `divide()`, `exponentiate()`, `factorial()`, `is_prime()`, and `square_root()`. Each function returns a small JSON payload describing the operation and result. `calculator` defaults to primary execution.

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
agents:
  math:
    tools:
      - calculator
```

```
add(2, 3)
divide(22, 7)
factorial(6)
is_prime(97)
square_root(144)
```

### Notes

- Errors such as division by zero, negative factorials, and negative square roots are returned as JSON error payloads instead of raising Python exceptions into the model.
- Use `calculator` for exact arithmetic when you do not need the broader power and risk of `python`.
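The error-payload behavior described in the notes above can be sketched as follows. This is a hypothetical illustration of the pattern, not MindRoom's actual implementation:

```python
import json

def divide(a: float, b: float) -> str:
    """Return a JSON payload describing the operation. Errors come back as
    JSON error payloads rather than Python exceptions, mirroring how
    calculator-style tool functions report problems to the model."""
    if b == 0:
        return json.dumps({"operation": "division", "error": "Division by zero is undefined"})
    return json.dumps({"operation": "division", "operands": [a, b], "result": a / b})
```

Because failures arrive as ordinary payloads, the model can read the `error` field and recover instead of the tool call aborting.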

## \[`reasoning`\]

`reasoning` gives an agent an internal scratchpad for structured `think` and `analyze` steps.

### What It Does

`reasoning` exposes `think()` and `analyze()`. Both functions write reasoning steps into run-scoped session state keyed by the current Agno `run_id`. `think()` records an intermediate thought plus an optional next action. `analyze()` records the result of a prior step and maps `next_action` onto `continue`, `validate`, or `final_answer`. These steps are intended for the agent's internal reasoning flow rather than user-visible output. `reasoning` defaults to primary execution.

### Configuration

| Option              | Type      | Required | Default | Notes                                                                          |
| ------------------- | --------- | -------- | ------- | ------------------------------------------------------------------------------ |
| `enable_think`      | `boolean` | `no`     | `true`  | Enable `think()`.                                                              |
| `enable_analyze`    | `boolean` | `no`     | `true`  | Enable `analyze()`.                                                            |
| `add_instructions`  | `boolean` | `no`     | `false` | Inject the toolkit's reasoning instructions into the model prompt.             |
| `add_few_shot`      | `boolean` | `no`     | `false` | Append built-in or custom few-shot examples when instructions are being added. |
| `instructions`      | `text`    | `no`     | `null`  | Replace the default reasoning instructions with your own text.                 |
| `few_shot_examples` | `text`    | `no`     | `null`  | Provide custom few-shot examples for the reasoning toolkit.                    |
| `all`               | `boolean` | `no`     | `false` | Enable all reasoning functions.                                                |

### Example

```
agents:
  researcher:
    tools:
      - reasoning:
          add_instructions: true
          add_few_shot: true
```

```
think(
    title="Plan the investigation",
    thought="I should inspect the config and then compare it with the runtime behavior.",
    action="Read the relevant files.",
)
analyze(
    title="Evaluate the findings",
    result="The config and runtime behavior match.",
    analysis="I have enough information to answer clearly.",
    next_action="final_answer",
)
```

### Notes

- `instructions` replaces the default reasoning prompt text entirely.
- `few_shot_examples` only takes effect when `add_few_shot` is enabled and instructions are being injected.
- The stored reasoning steps live in session state for the current run, which lets later steps see the full scratchpad history.

## \[`file_generation`\]

`file_generation` creates export artifacts as JSON, CSV, PDF, or plain text and can optionally save them to disk.

### What It Does

`file_generation` exposes `generate_json_file()`, `generate_csv_file()`, `generate_pdf_file()`, and `generate_text_file()`. Each function returns a `ToolResult` with a generated file artifact attached. If `output_directory` is set, the generated file is also written to disk and the result message includes that file path. If `output_directory` is unset, the file still exists in the tool result payload but is not persisted to disk by the toolkit itself. PDF generation is automatically disabled when `reportlab` is unavailable, even if `enable_pdf_generation` is left on. `file_generation` defaults to primary execution.

### Configuration

| Option                   | Type      | Required | Default | Notes                                                         |
| ------------------------ | --------- | -------- | ------- | ------------------------------------------------------------- |
| `output_directory`       | `text`    | `no`     | `null`  | Optional directory where generated files are written to disk. |
| `enable_json_generation` | `boolean` | `no`     | `true`  | Enable `generate_json_file()`.                                |
| `enable_csv_generation`  | `boolean` | `no`     | `true`  | Enable `generate_csv_file()`.                                 |
| `enable_pdf_generation`  | `boolean` | `no`     | `true`  | Enable `generate_pdf_file()` when `reportlab` is available.   |
| `enable_txt_generation`  | `boolean` | `no`     | `true`  | Enable `generate_text_file()`.                                |
| `all`                    | `boolean` | `no`     | `false` | Enable all file-generation functions.                         |

### Example

```
agents:
  reporter:
    tools:
      - file_generation:
          output_directory: exports
          enable_pdf_generation: true
```

```
generate_json_file({"status": "ok", "items": 3}, filename="summary.json")
generate_csv_file([{"name": "alpha", "value": 1}, {"name": "beta", "value": 2}], filename="data.csv")
generate_pdf_file("Quarterly summary", filename="report.pdf", title="Q1 Report")
generate_text_file("Plain text export", filename="notes.txt")
```

### Notes

- Filenames are auto-generated when omitted, and missing file extensions are appended automatically for the matching export type.
- `generate_json_file()` accepts dicts, lists, or strings, and plain strings are wrapped into JSON when they are not already valid JSON.
- Use a real `output_directory` if you want the artifact to remain on disk for later shell or file-tool access.
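The string-wrapping rule for `generate_json_file()` noted above can be sketched like this. The helper name and the `"content"` wrapper key are assumptions for illustration, not the toolkit's actual internals:

```python
import json

def normalize_json_content(content) -> str:
    """Dicts and lists serialize directly; strings that already parse as
    JSON pass through unchanged; any other string gets wrapped into a
    JSON object. (Hypothetical sketch of the documented behavior.)"""
    if isinstance(content, (dict, list)):
        return json.dumps(content, indent=2)
    try:
        json.loads(content)
        return content  # already valid JSON
    except (ValueError, TypeError):
        return json.dumps({"content": content}, indent=2)
```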

## \[`visualization`\]

`visualization` creates chart images with matplotlib and writes them to an output directory.

### What It Does

`visualization` exposes `create_bar_chart()`, `create_line_chart()`, `create_pie_chart()`, `create_scatter_plot()`, and `create_histogram()`. The upstream toolkit switches matplotlib to the non-interactive `Agg` backend and auto-creates `output_dir` if it does not exist. Each chart function accepts dict-like data, list-based data, or JSON strings, normalizes the data, saves a PNG image, and returns a JSON payload with the file path and status. `visualization` defaults to primary execution.

### Configuration

| Option                       | Type      | Required | Default    | Notes                                             |
| ---------------------------- | --------- | -------- | ---------- | ------------------------------------------------- |
| `output_dir`                 | `text`    | `no`     | `"charts"` | Directory where generated chart images are saved. |
| `enable_create_bar_chart`    | `boolean` | `no`     | `true`     | Enable `create_bar_chart()`.                      |
| `enable_create_line_chart`   | `boolean` | `no`     | `true`     | Enable `create_line_chart()`.                     |
| `enable_create_pie_chart`    | `boolean` | `no`     | `true`     | Enable `create_pie_chart()`.                      |
| `enable_create_scatter_plot` | `boolean` | `no`     | `true`     | Enable `create_scatter_plot()`.                   |
| `enable_create_histogram`    | `boolean` | `no`     | `true`     | Enable `create_histogram()`.                      |
| `all`                        | `boolean` | `no`     | `false`    | Enable all chart functions.                       |

### Example

```
agents:
  analyst:
    tools:
      - visualization:
          output_dir: charts
          enable_create_pie_chart: false
```

```
create_bar_chart({"Mon": 12, "Tue": 18, "Wed": 9}, title="Requests per day")
create_line_chart({"Jan": 3, "Feb": 8, "Mar": 13}, title="Growth")
create_scatter_plot(
    x=[1, 2, 3],
    y=[2, 5, 7],
    title="Experiment results",
)
create_histogram([1, 1, 2, 3, 5, 8, 13], title="Value distribution")
```

### Notes

- The toolkit saves PNG files to disk immediately, so later tool calls can read or send those files.
- If you explicitly route `visualization` through workers, the chart files will be created in the worker-visible filesystem instead of the primary process filesystem.
- `matplotlib` must be importable in the runtime that executes the tool.

## \[`sleep`\]

`sleep` is a minimal delay utility for workflows that need an intentional pause between steps.

### What It Does

`sleep` exposes a single `sleep()` function that blocks for the requested number of seconds and then returns a confirmation string. `sleep` defaults to primary execution.

### Configuration

| Option         | Type      | Required | Default | Notes                       |
| -------------- | --------- | -------- | ------- | --------------------------- |
| `enable_sleep` | `boolean` | `no`     | `true`  | Enable `sleep()`.           |
| `all`          | `boolean` | `no`     | `false` | Enable all sleep functions. |

### Example

```
agents:
  scheduler_helper:
    tools:
      - sleep
```

```
sleep(5)
```

### Notes

- `sleep` is useful for deliberate polling loops or staged workflows, but it still ties up the runtime that executes it while the delay is in progress.
- If you explicitly route `sleep` through workers, the delay occurs in the worker runtime instead of the primary process.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration)
- [Sandbox Proxy Isolation](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md)

# Data & Databases

Use these tools to query SQL and graph databases, analyze tabular files, work with Google datasets, Drive files, and spreadsheets, and fetch financial or business data.

## What This Page Covers

This page documents the built-in tools in the `data-and-databases` group. Use these tools when you need database access, dataframe-style analysis, Google Drive file lookup, spreadsheet automation, or market and company data.

## Tools On This Page

- \[`sql`\] - Generic SQLAlchemy-backed SQL access for databases that can be reached by URL or engine.
- \[`postgres`\] - PostgreSQL-specific table inspection, query analysis, querying, and export.
- \[`redshift`\] - Amazon Redshift warehouse access with password or IAM-based authentication.
- \[`neo4j`\] - Neo4j graph inspection and Cypher queries.
- \[`duckdb`\] - Local analytical SQL with file loading, exports, and full-text helpers.
- \[`csv`\] - Pre-registered CSV reading and DuckDB-backed SQL queries over CSV files.
- \[`pandas`\] - In-memory dataframe creation and dataframe method execution.
- \[`google_bigquery`\] - BigQuery dataset inspection and SQL queries.
- \[`google_drive`\] - Google Drive file listing, metadata search, and file reading through the per-service Google Drive OAuth provider.
- \[`google_sheets`\] - Google Sheets access through the per-service Google Sheets OAuth provider, with read support verified by default and create/update support when enabled.
- \[`openbb`\] - Stock prices, company search, news, profiles, and price targets through OpenBB.
- \[`yfinance`\] - Yahoo Finance market data, fundamentals, news, and history.
- \[`financial_datasets_api`\] - Structured financial statements, filings, ownership, and crypto data from Financial Datasets.

## Common Setup Notes

- `sql`, `postgres`, `redshift`, `neo4j`, `google_bigquery`, `google_drive`, `google_sheets`, and `financial_datasets_api` are registered as `requires_config`, so they stay unavailable in the dashboard until their required config or auth is present.
- `duckdb`, `csv`, `pandas`, `openbb`, and `yfinance` are `setup_type: none`, so they can be enabled immediately once their optional Python dependencies are installed.
- MindRoom validates inline tool overrides against the declared `config_fields`, and `type="password"` fields such as `password`, `secret_access_key`, and `api_key` must go through the dashboard or credential store instead of inline YAML.
- Several fields on this page are advanced constructor inputs rather than normal `config.yaml` values, including `db_engine`, `connection`, `credentials`, `duckdb_connection`, `duckdb_kwargs`, `obb`, and `session`.
- Token-like fields such as `openbb_pat` are better kept in stored credentials even when the current metadata does not mark them as password fields.
- `src/mindroom/api/integrations.py` currently contains only Spotify-specific OAuth endpoints; Google Drive and Google Sheets use the generic `/api/oauth/google_drive/*` and `/api/oauth/google_sheets/*` flows.
- `google_drive` and `google_sheets` declare per-service `auth_provider` values and store OAuth tokens separately from editable tool settings.
- `csv` queries use DuckDB under the hood; `duckdb` is the better fit when you need to create tables from files, export results, or repeatedly load local and S3 data.
- Missing optional dependencies can auto-install at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set.
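As a concrete sketch of the password-field rule, an inline override can carry the non-secret connection fields while the secret stays out of `config.yaml` entirely (host and names below are placeholders):

```
agents:
  warehouse:
    tools:
      - postgres:
          host: db.internal      # non-secret fields may live inline
          db_name: analytics
          user: analyst
          # password is type="password": set it through the dashboard or
          # credential store, never inline in config.yaml
```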

## \[`sql`\]

`sql` is the generic SQL toolkit for database engines that SQLAlchemy can open directly.

### What It Does

`sql` exposes `list_tables()`, `describe_table()`, and `run_sql_query()`. The toolkit can connect through `db_url`, an existing `db_engine`, or a URL assembled from `user`, `password`, `host`, `port`, `schema`, and `dialect`. `list_tables()` and `describe_table()` use SQLAlchemy inspection, and `run_sql_query()` returns JSON rows with a default limit of 10 unless you pass `limit=None`. If you pass a `tables` mapping, `list_tables()` returns that mapping directly instead of live database introspection. For dialects where database name and schema are distinct concepts, `db_url` is the safest authored configuration because the generic `schema` field is used both in the constructed URL path and in later table inspection calls.

### Configuration

| Option                  | Type       | Required | Default | Notes                                                                      |
| ----------------------- | ---------- | -------- | ------- | -------------------------------------------------------------------------- |
| `db_url`                | `url`      | `no`     | `null`  | Preferred authored connection string.                                      |
| `db_engine`             | `text`     | `no`     | `null`  | Advanced programmatic SQLAlchemy `Engine` input, not the normal YAML path. |
| `user`                  | `text`     | `no`     | `null`  | Username for URL assembly when not using `db_url`.                         |
| `password`              | `password` | `no`     | `null`  | Database password stored through the dashboard or credential store.        |
| `host`                  | `url`      | `no`     | `null`  | Database host for URL assembly.                                            |
| `port`                  | `number`   | `no`     | `null`  | Database port for URL assembly.                                            |
| `schema`                | `text`     | `no`     | `null`  | Schema name, and also the path segment in the assembled generic URL.       |
| `dialect`               | `text`     | `no`     | `null`  | SQLAlchemy dialect prefix such as `postgresql`, `mysql`, or `sqlite`.      |
| `tables`                | `text`     | `no`     | `null`  | Advanced predeclared table metadata mapping.                               |
| `enable_list_tables`    | `boolean`  | `no`     | `true`  | Enable `list_tables()`.                                                    |
| `enable_describe_table` | `boolean`  | `no`     | `true`  | Enable `describe_table()`.                                                 |
| `enable_run_sql_query`  | `boolean`  | `no`     | `true`  | Enable `run_sql_query()`.                                                  |
| `all`                   | `boolean`  | `no`     | `false` | Enable the full upstream SQL tool surface.                                 |

### Example

```
agents:
  analyst:
    tools:
      - sql:
          db_url: sqlite:////tmp/analytics.db
          enable_run_sql_query: true
```

```
list_tables()
describe_table("events")
run_sql_query("SELECT * FROM events ORDER BY created_at DESC", limit=20)
```

### Notes

- Use `db_url` for normal YAML authoring, because `db_engine` expects a live SQLAlchemy object rather than a string.
- If you need passwords, store them through the dashboard or credential store instead of inline YAML.
- This generic toolkit is useful for simple SQL inspection, but `postgres` and `redshift` expose richer warehouse-style helpers such as query inspection and exports.
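The URL-assembly path described above can be sketched as simple string composition. The exact format the upstream toolkit produces may differ, which is one more reason `db_url` is the safer authored option:

```python
# Hypothetical sketch of assembling a SQLAlchemy-style URL from the
# individual fields when db_url is not given. Note that the generic
# `schema` field lands in the URL path segment, which is why dialects
# that distinguish database name from schema are better served by db_url.
def assemble_db_url(dialect: str, user: str, password: str,
                    host: str, port: int, schema: str) -> str:
    return f"{dialect}://{user}:{password}@{host}:{port}/{schema}"
```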

## \[`postgres`\]

`postgres` is the PostgreSQL-specific toolkit for read-only schema inspection, query review, querying, and CSV export.

### What It Does

`postgres` exposes `show_tables()`, `describe_table()`, `summarize_table()`, `inspect_query()`, `run_query()`, and `export_table_to_path()`. The toolkit opens a Psycopg connection, sets `search_path` to `table_schema`, and marks the connection read-only. `inspect_query()` runs `EXPLAIN`, which makes it the safe first step before a larger `run_query()`. `export_table_to_path()` writes queryable table output to a local file path from the running process.

### Configuration

| Option         | Type       | Required | Default  | Notes                                                                                             |
| -------------- | ---------- | -------- | -------- | ------------------------------------------------------------------------------------------------- |
| `connection`   | `text`     | `no`     | `null`   | Programmatic existing Psycopg connection object, not usable from the UI or normal YAML authoring. |
| `host`         | `url`      | `yes`    | `null`   | PostgreSQL server hostname.                                                                       |
| `port`         | `number`   | `no`     | `5432`   | PostgreSQL server port.                                                                           |
| `db_name`      | `text`     | `yes`    | `null`   | Database name.                                                                                    |
| `user`         | `text`     | `yes`    | `null`   | PostgreSQL username.                                                                              |
| `password`     | `password` | `yes`    | `null`   | PostgreSQL password stored through the dashboard or credential store.                             |
| `table_schema` | `text`     | `no`     | `public` | Schema used for table operations and connection `search_path`.                                    |

### Example

```
agents:
  warehouse:
    tools:
      - postgres:
          host: warehouse.internal
          port: 5432
          db_name: analytics
          user: analyst
          table_schema: reporting
```

```
show_tables()
describe_table("daily_revenue")
inspect_query("SELECT * FROM daily_revenue WHERE day >= CURRENT_DATE - INTERVAL '7 days'")
run_query("SELECT day, total FROM daily_revenue ORDER BY day DESC LIMIT 7")
export_table_to_path("daily_revenue", "/tmp/daily_revenue.csv")
```

### Notes

- `postgres` is read-only by design in the upstream toolkit, which makes it a safer default than generic unrestricted SQL access.
- Use `connection` only for programmatic instantiation where you already have a Psycopg connection object.
- Secrets such as `password` must be stored outside authored YAML.

## \[`redshift`\]

`redshift` is the warehouse-oriented toolkit for Amazon Redshift clusters with either password or IAM authentication.

### What It Does

`redshift` exposes `show_tables()`, `describe_table()`, `summarize_table()`, `inspect_query()`, `run_query()`, and `export_table_to_path()`. The upstream connector supports standard `user` and `password` authentication, IAM authentication through `profile`, and IAM authentication through explicit AWS credentials. When IAM auth is enabled, the toolkit can fall back to environment variables such as `REDSHIFT_HOST`, `REDSHIFT_DATABASE`, `REDSHIFT_CLUSTER_IDENTIFIER`, `AWS_REGION`, and `AWS_PROFILE`. `table_schema` defaults to `public`, and `ssl` defaults to `true`.

### Configuration

| Option               | Type       | Required | Default  | Notes                                                                   |
| -------------------- | ---------- | -------- | -------- | ----------------------------------------------------------------------- |
| `host`               | `url`      | `yes`    | `null`   | Redshift cluster endpoint.                                              |
| `port`               | `number`   | `no`     | `5439`   | Redshift port.                                                          |
| `database`           | `text`     | `yes`    | `null`   | Database name.                                                          |
| `user`               | `text`     | `yes`    | `null`   | Username for password auth.                                             |
| `password`           | `password` | `yes`    | `null`   | Password for standard authentication.                                   |
| `iam`                | `boolean`  | `no`     | `false`  | Use IAM-based auth instead of password auth.                            |
| `cluster_identifier` | `text`     | `no`     | `null`   | Cluster identifier, required for IAM auth against provisioned clusters. |
| `region`             | `text`     | `no`     | `null`   | AWS region for IAM auth.                                                |
| `db_user`            | `text`     | `no`     | `null`   | Database user for IAM auth.                                             |
| `access_key_id`      | `password` | `no`     | `null`   | Optional AWS access key for IAM auth.                                   |
| `secret_access_key`  | `password` | `no`     | `null`   | Optional AWS secret key for IAM auth.                                   |
| `session_token`      | `password` | `no`     | `null`   | Optional AWS session token for temporary credentials.                   |
| `profile`            | `text`     | `no`     | `null`   | AWS profile name for IAM auth.                                          |
| `ssl`                | `boolean`  | `no`     | `true`   | Enable SSL.                                                             |
| `table_schema`       | `text`     | `no`     | `public` | Schema used for table operations.                                       |

### Example

```
agents:
  warehouse:
    tools:
      - redshift:
          host: my-cluster.abc123.us-east-1.redshift.amazonaws.com
          database: dev
          iam: true
          cluster_identifier: analytics-prod
          region: us-east-1
          db_user: analyst
```

```
show_tables()
describe_table("fact_orders")
inspect_query("SELECT COUNT(*) FROM fact_orders WHERE created_at >= current_date - 30")
run_query("SELECT created_at::date, COUNT(*) FROM fact_orders GROUP BY 1 ORDER BY 1 DESC")
export_table_to_path("fact_orders", "/tmp/fact_orders.csv")
```

### Notes

- If `iam: true`, the toolkit can use `profile` or explicit AWS credentials instead of `user` and `password`.
- If you use password auth, store `password` through the dashboard or credential store rather than inline YAML.
- `redshift` is the better fit than generic `sql` when you want Redshift-aware connection options and warehouse export helpers.

## \[`neo4j`\]

`neo4j` is the graph database toolkit for labels, relationship types, schema discovery, and Cypher queries.

### What It Does

`neo4j` exposes `list_labels()`, `list_relationship_types()`, `get_schema()`, and `run_cypher_query()`. It uses the Neo4j Python driver and can target a specific database when `database` is set. The individual enable flags let you expose schema discovery without allowing arbitrary Cypher execution.

### Configuration

| Option                      | Type       | Required | Default | Notes                                                            |
| --------------------------- | ---------- | -------- | ------- | ---------------------------------------------------------------- |
| `uri`                       | `url`      | `no`     | `null`  | Neo4j connection URI such as `bolt://localhost:7687`.            |
| `user`                      | `text`     | `yes`    | `null`  | Neo4j username.                                                  |
| `password`                  | `password` | `yes`    | `null`  | Neo4j password stored through the dashboard or credential store. |
| `database`                  | `text`     | `no`     | `null`  | Optional target database name.                                   |
| `enable_list_labels`        | `boolean`  | `no`     | `true`  | Enable `list_labels()`.                                          |
| `enable_list_relationships` | `boolean`  | `no`     | `true`  | Enable `list_relationship_types()`.                              |
| `enable_get_schema`         | `boolean`  | `no`     | `true`  | Enable `get_schema()`.                                           |
| `enable_run_cypher`         | `boolean`  | `no`     | `true`  | Enable `run_cypher_query()`.                                     |
| `all`                       | `boolean`  | `no`     | `false` | Enable the full Neo4j toolkit.                                   |

### Example

```
agents:
  graph:
    tools:
      - neo4j:
          uri: bolt://graph.internal:7687
          user: neo4j
          database: analytics
          enable_run_cypher: true
```

```
list_labels()
list_relationship_types()
get_schema()
run_cypher_query("MATCH (u:User)-[:PLACED]->(o:Order) RETURN u.id, count(o) AS orders LIMIT 10")
```

### Notes

- `uri` is optional in metadata, but in practice you should supply it unless the runtime injects a connection some other way.
- Store the Neo4j password through the dashboard or credential store instead of inline YAML.
- Disable `enable_run_cypher` when you want schema visibility without free-form graph queries.

## \[`duckdb`\]

`duckdb` is the local analytical SQL engine for file-backed analytics, table creation, exports, and full-text indexing.

### What It Does

`duckdb` exposes `show_tables()`, `describe_table()`, `inspect_query()`, `run_query()`, `summarize_table()`, `create_table_from_path()`, `export_table_to_path()`, `load_local_path_to_table()`, `load_local_csv_to_table()`, `load_s3_path_to_table()`, `load_s3_csv_to_table()`, `create_fts_index()`, and `full_text_search()`. If `db_path` is unset, DuckDB runs in memory. `create_table_from_path()` can load CSV or other file formats directly into a table, and `export_table_to_path()` defaults to `PARQUET`. `init_commands`, `connection`, and `config` are advanced constructor inputs that are passed directly to the upstream toolkit.

### Configuration

| Option          | Type      | Required | Default | Notes                                                              |
| --------------- | --------- | -------- | ------- | ------------------------------------------------------------------ |
| `db_path`       | `text`    | `no`     | `null`  | Path to a persistent DuckDB database file.                         |
| `connection`    | `text`    | `no`     | `null`  | Advanced existing DuckDB connection object, not normal YAML input. |
| `init_commands` | `text`    | `no`     | `null`  | Advanced startup commands passed through to the toolkit.           |
| `read_only`     | `boolean` | `no`     | `false` | Open the database in read-only mode.                               |
| `config`        | `text`    | `no`     | `null`  | Advanced raw DuckDB config mapping.                                |

### Example

```
agents:
  analyst:
    tools:
      - duckdb:
          db_path: data/analytics.duckdb
          read_only: false
```

```
show_tables(show_tables=True)
create_table_from_path("/workspace/data/orders.parquet", table="orders", replace=True)
inspect_query("SELECT customer_id, COUNT(*) FROM orders GROUP BY 1")
run_query("SELECT customer_id, COUNT(*) AS orders FROM orders GROUP BY 1 ORDER BY orders DESC LIMIT 10")
export_table_to_path("orders", format="CSV", path="/tmp")
```

### Notes

- Use `duckdb` when you need repeatable local analytics over files, especially Parquet, CSV, or S3-backed datasets.
- If you only need quick reads and SQL over a few predeclared CSVs, `csv` is lighter.
- The upstream `show_tables()` signature currently expects `show_tables=True`.

## \[`csv`\]

`csv` is the lightweight CSV analysis toolkit for pre-registered files with optional DuckDB-backed SQL queries.

### What It Does

`csv` exposes `list_csv_files()`, `read_csv_file()`, `get_columns()`, and `query_csv_file()`. The toolkit works with a preconfigured list of CSV paths and exposes each one by its filename stem. `read_csv_file()` returns JSON rows and respects either the per-call `row_limit` or the configured default `row_limit`. `query_csv_file()` loads the target CSV into DuckDB and runs one SQL statement against it.

### Configuration

| Option                  | Type      | Required | Default | Notes                                                                |
| ----------------------- | --------- | -------- | ------- | -------------------------------------------------------------------- |
| `csvs`                  | `text`    | `no`     | `null`  | Advanced pre-registered CSV path list passed through to the toolkit. |
| `row_limit`             | `number`  | `no`     | `null`  | Default row cap for `read_csv_file()`.                               |
| `duckdb_connection`     | `text`    | `no`     | `null`  | Advanced existing DuckDB connection object.                          |
| `duckdb_kwargs`         | `text`    | `no`     | `null`  | Advanced DuckDB connection kwargs mapping.                           |
| `enable_read_csv_file`  | `boolean` | `no`     | `true`  | Enable `read_csv_file()`.                                            |
| `enable_list_csv_files` | `boolean` | `no`     | `true`  | Enable `list_csv_files()`.                                           |
| `enable_get_columns`    | `boolean` | `no`     | `true`  | Enable `get_columns()`.                                              |
| `enable_query_csv_file` | `boolean` | `no`     | `true`  | Enable `query_csv_file()`.                                           |
| `all`                   | `boolean` | `no`     | `false` | Enable the full CSV toolkit.                                         |

### Example

```
agents:
  analyst:
    tools:
      - csv:
          row_limit: 200
```

```
list_csv_files()
read_csv_file("sales_2025", row_limit=50)
get_columns("sales_2025")
query_csv_file("sales_2025", 'SELECT "region", COUNT(*) FROM sales_2025 GROUP BY 1')
```

### Notes

- `csv` is most useful when the runtime or a higher-level wrapper has already pre-registered the CSV paths for the tool.
- `query_csv_file()` requires DuckDB and only runs the first SQL statement you provide.
- If you need richer file loading or exports, use `duckdb` instead.
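The "exposed by its filename stem" rule above can be sketched in a few lines. This is an illustrative registry only; MindRoom's internal bookkeeping may differ:

```python
from pathlib import Path

def register_csvs(paths: list[str]) -> dict[str, str]:
    """Map each configured CSV path to its filename stem, so a file like
    /data/sales_2025.csv is addressed by the name "sales_2025"."""
    return {Path(p).stem: p for p in paths}
```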

## \[`pandas`\]

`pandas` is the in-memory dataframe toolkit for creating named dataframes and running dataframe methods on them.

### What It Does

`pandas` exposes `create_pandas_dataframe()` and `run_dataframe_operation()`. `create_pandas_dataframe()` calls a top-level Pandas constructor such as `read_csv` or `read_json` and stores the resulting dataframe under a caller-chosen name. `run_dataframe_operation()` then calls a dataframe method such as `head`, `tail`, `describe`, or `groupby` on that stored dataframe. Stored dataframes live in the current process memory only and are not persisted to disk.

### Configuration

| Option                           | Type      | Required | Default | Notes                               |
| -------------------------------- | --------- | -------- | ------- | ----------------------------------- |
| `enable_create_pandas_dataframe` | `boolean` | `no`     | `true`  | Enable `create_pandas_dataframe()`. |
| `enable_run_dataframe_operation` | `boolean` | `no`     | `true`  | Enable `run_dataframe_operation()`. |
| `all`                            | `boolean` | `no`     | `false` | Enable the full Pandas toolkit.     |

### Example

```
agents:
  analyst:
    tools:
      - pandas
```

```
create_pandas_dataframe(
    "sales",
    "read_csv",
    {"filepath_or_buffer": "/workspace/data/sales.csv"},
)
run_dataframe_operation("sales", "head", {"n": 5})
run_dataframe_operation("sales", "describe", {})
```

### Notes

- The toolkit keeps dataframe state in memory on the current runtime process, so a restart clears it.
- `create_pandas_dataframe()` rejects empty dataframes and duplicate dataframe names.
- Use `pandas` when you want dataframe-native operations rather than SQL.

## \[`google_bigquery`\]

`google_bigquery` is the BigQuery toolkit for listing tables, describing schemas, and running SQL inside one dataset.

### What It Does

`google_bigquery` exposes `list_tables()`, `describe_table()`, and `run_sql_query()`. The toolkit builds a `bigquery.Client` at initialization and scopes query jobs to `project.dataset` through the default query job configuration. MindRoom's metadata marks `dataset`, `project`, and `location` as required, even though the upstream toolkit can fall back to `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION` if those values are omitted. `credentials` is an advanced programmatic Google credentials object for cases where the process should not rely on default application credentials.

### Configuration

| Option                  | Type      | Required | Default | Notes                                                                      |
| ----------------------- | --------- | -------- | ------- | -------------------------------------------------------------------------- |
| `dataset`               | `text`    | `yes`    | `null`  | BigQuery dataset name.                                                     |
| `project`               | `text`    | `yes`    | `null`  | Google Cloud project ID.                                                   |
| `location`              | `text`    | `yes`    | `null`  | BigQuery location such as `US` or `EU`.                                    |
| `credentials`           | `text`    | `no`     | `null`  | Advanced Google credentials object passed directly to the BigQuery client. |
| `enable_list_tables`    | `boolean` | `no`     | `true`  | Enable `list_tables()`.                                                    |
| `enable_describe_table` | `boolean` | `no`     | `true`  | Enable `describe_table()`.                                                 |
| `enable_run_sql_query`  | `boolean` | `no`     | `true`  | Enable `run_sql_query()`.                                                  |
| `all`                   | `boolean` | `no`     | `false` | Enable the full BigQuery toolkit.                                          |

### Example

```
agents:
  analyst:
    tools:
      - google_bigquery:
          project: my-gcp-project
          dataset: analytics
          location: US
```

```
list_tables()
describe_table("events")
run_sql_query("SELECT event_name, COUNT(*) AS total FROM events GROUP BY 1 ORDER BY total DESC LIMIT 20")
```

### Notes

- Configure `dataset`, `project`, and `location` explicitly in MindRoom, because that is the path the current registry metadata documents and validates.
- `google_bigquery` does not use a MindRoom Google OAuth provider.
- If `credentials` is unset, the BigQuery SDK falls back to the process's default Google Cloud authentication behavior.

## \[`google_drive`\]

`google_drive` is the Google Drive toolkit for listing, searching, and reading files from the connected user's Drive account.

### What It Does

MindRoom exposes `google_drive_list_files()`, `google_drive_search_files()`, and `google_drive_read_file()` through the Google Drive OAuth provider. `google_drive_list_files()` returns recent Drive files visible to the connected account. `google_drive_search_files()` searches Drive metadata. `google_drive_read_file()` reads Google Workspace files and non-Google files up to the configured `max_read_size`. When no usable MindRoom OAuth credentials exist, the wrapper raises `OAuthConnectionRequired` instead of falling back to a local token flow.

### Configuration

| Option          | Type      | Required | Default    | Notes                                                    |
| --------------- | --------- | -------- | ---------- | -------------------------------------------------------- |
| `list_files`    | `boolean` | `no`     | `true`     | Enable recent file listing.                              |
| `search_files`  | `boolean` | `no`     | `true`     | Enable Drive metadata search.                            |
| `read_file`     | `boolean` | `no`     | `true`     | Enable file content reads.                               |
| `max_read_size` | `number`  | `no`     | `10485760` | Maximum non-Google-Workspace file size to read in bytes. |

### Example

```
agents:
  assistant:
    tools:
      - google_drive:
          max_read_size: 10485760
```

```
google_drive_list_files()
google_drive_search_files("name contains 'budget'")
google_drive_read_file("1AbCdEfGhIjKlMnOpQrStUvWxYz")
```

### Notes

- `google_drive` uses the per-service `google_drive` OAuth provider and always runs in the primary MindRoom runtime.
- The provider requests Drive read-only access plus OpenID email/profile scopes.
- Configure Google OAuth through [Google Services OAuth (Admin Setup)](https://docs.mindroom.chat/deployment/google-services-oauth/index.md) or [Google Services OAuth (Individual Setup)](https://docs.mindroom.chat/deployment/google-services-user-oauth/index.md).

## \[`google_sheets`\]

`google_sheets` is the Google Sheets toolkit for spreadsheet access through the Google Sheets OAuth provider.

### What It Does

MindRoom exposes Agno's `read_sheet()`, `create_sheet()`, and `update_sheet()` operations, wrapping Agno's Google Sheets toolkit with `ScopedOAuthClientMixin` so that stored Google credentials load from MindRoom's credential store instead of only local token files. The `google_sheets` OAuth provider requests Sheets access, and the dashboard config fields `read`, `create`, and `update` map onto Agno's constructor flags, gating only which Agno methods are exposed. If `spreadsheet_id` or `spreadsheet_range` is unset, you can still pass them per call. When no usable MindRoom OAuth credentials exist, the wrapper raises `OAuthConnectionRequired` instead of falling back to Agno's local token flow.

### Configuration

| Option              | Type      | Required | Default | Notes                                                                                           |
| ------------------- | --------- | -------- | ------- | ----------------------------------------------------------------------------------------------- |
| `spreadsheet_id`    | `text`    | `no`     | `null`  | Default spreadsheet ID. Leave unset to work with multiple spreadsheets by passing IDs per call. |
| `spreadsheet_range` | `text`    | `no`     | `null`  | Default range such as `Sheet1!A1:Z100`.                                                         |
| `read`              | `boolean` | `no`     | `true`  | Enable read operations.                                                                         |
| `create`            | `boolean` | `no`     | `false` | Enable spreadsheet creation.                                                                    |
| `update`            | `boolean` | `no`     | `false` | Enable sheet updates.                                                                           |

### Example

```
agents:
  ops:
    worker_scope: shared
    tools:
      - google_sheets:
          spreadsheet_id: 1AbCdEfGhIjKlMnOpQrStUvWxYz
          spreadsheet_range: Sheet1!A1:G200
```

```
read_sheet()
read_sheet(
    spreadsheet_id="1AbCdEfGhIjKlMnOpQrStUvWxYz",
    spreadsheet_range="Sheet1!A1:B20",
)
```

### Notes

- `google_sheets` uses the per-service `google_sheets` OAuth provider and always runs in the primary MindRoom runtime.
- Configure Google OAuth through [Google Services OAuth (Admin Setup)](https://docs.mindroom.chat/deployment/google-services-oauth/index.md) or [Google Services OAuth (Individual Setup)](https://docs.mindroom.chat/deployment/google-services-user-oauth/index.md).
- The dashboard marks the tool available only when stored Google Sheets credentials include the required Sheets scope.

## \[`openbb`\]

`openbb` is the finance toolkit for stock quotes, symbol lookup, company news, company profiles, and price targets through OpenBB providers.

### What It Does

`openbb` exposes `get_stock_price()`, `search_company_symbol()`, `get_company_news()`, `get_company_profile()`, and `get_price_targets()`. The toolkit logs into OpenBB when `openbb_pat` or `OPENBB_PAT` is available, but it still works without a PAT when the selected provider supports unauthenticated access. The default provider is `yfinance`, which makes the tool usable without premium OpenBB credentials for many common quote lookups. You can selectively enable additional research functions without exposing the full OpenBB surface.

### Configuration

| Option                         | Type      | Required | Default    | Notes                                                                                            |
| ------------------------------ | --------- | -------- | ---------- | ------------------------------------------------------------------------------------------------ |
| `obb`                          | `text`    | `no`     | `null`     | Advanced preconfigured OpenBB instance object.                                                   |
| `openbb_pat`                   | `text`    | `no`     | `null`     | Optional OpenBB PAT for premium providers. Usually stored in credentials instead of inline YAML. |
| `provider`                     | `text`    | `no`     | `yfinance` | Data provider such as `yfinance`, `benzinga`, `fmp`, `intrinio`, `polygon`, `tiingo`, or `tmx`.  |
| `enable_get_stock_price`       | `boolean` | `no`     | `true`     | Enable `get_stock_price()`.                                                                      |
| `enable_search_company_symbol` | `boolean` | `no`     | `false`    | Enable `search_company_symbol()`.                                                                |
| `enable_get_company_news`      | `boolean` | `no`     | `false`    | Enable `get_company_news()`.                                                                     |
| `enable_get_company_profile`   | `boolean` | `no`     | `false`    | Enable `get_company_profile()`.                                                                  |
| `enable_get_price_targets`     | `boolean` | `no`     | `false`    | Enable `get_price_targets()`.                                                                    |
| `all`                          | `boolean` | `no`     | `false`    | Enable the full OpenBB toolkit.                                                                  |

### Example

```
agents:
  market:
    tools:
      - openbb:
          provider: yfinance
          enable_search_company_symbol: true
          enable_get_company_news: true
```

```
get_stock_price("AAPL,MSFT")
search_company_symbol("Nvidia")
get_company_news("AAPL", num_stories=5)
get_company_profile("MSFT")
get_price_targets("NVDA")
```

### Notes

- `openbb_pat` is optional, and the default `provider: yfinance` keeps the tool useful even without premium OpenBB access.
- Use `obb` only when you are constructing the toolkit programmatically with a live OpenBB object.
- This tool overlaps with `yfinance`, but `openbb` is the better fit when you want provider switching or OpenBB-specific data sources.

## \[`yfinance`\]

`yfinance` is the direct Yahoo Finance toolkit for quotes, company info, fundamentals, statements, recommendations, news, and price history.

### What It Does

`yfinance` exposes `get_current_stock_price()`, `get_company_info()`, `get_historical_stock_prices()`, `get_stock_fundamentals()`, `get_income_statements()`, `get_key_financial_ratios()`, `get_analyst_recommendations()`, `get_company_news()`, and `get_technical_indicators()`. Unlike `openbb`, it does not require a PAT or provider selection. The optional `session` field is an advanced programmatic HTTP session hook for callers that need custom transport behavior.

### Configuration

| Option    | Type   | Required | Default | Notes                                      |
| --------- | ------ | -------- | ------- | ------------------------------------------ |
| `session` | `text` | `no`     | `null`  | Advanced programmatic HTTP session object. |

### Example

```
agents:
  market:
    tools:
      - yfinance
```

```
get_current_stock_price("AAPL")
get_company_info("MSFT")
get_historical_stock_prices("NVDA", period="6mo", interval="1d")
get_company_news("TSLA", num_stories=5)
```

### Notes

- `yfinance` has no required setup fields for normal use.
- `session` is only useful for programmatic customization and is not a common `config.yaml` setting.
- Choose `yfinance` when you want the widest Yahoo Finance surface with the fewest setup requirements.

## \[`financial_datasets_api`\]

`financial_datasets_api` is the structured market-data toolkit for company info, statements, ownership, filings, news, search, and crypto prices.

### What It Does

`financial_datasets_api` exposes methods such as `get_income_statements()`, `get_balance_sheets()`, `get_cash_flow_statements()`, `get_segmented_financials()`, `get_financial_metrics()`, `get_company_info()`, `get_stock_prices()`, `get_earnings()`, `get_insider_trades()`, `get_institutional_ownership()`, `get_news()`, `get_sec_filings()`, `get_crypto_prices()`, and `search_tickers()`. The toolkit sends authenticated HTTP requests to `https://api.financialdatasets.ai` with an `X-API-KEY` header. If `api_key` is unset, it falls back to `FINANCIAL_DATASETS_API_KEY`; if neither is set, the tool returns an API-key-not-set error instead of working partially.

### Configuration

| Option    | Type       | Required | Default | Notes                                                                                      |
| --------- | ---------- | -------- | ------- | ------------------------------------------------------------------------------------------ |
| `api_key` | `password` | `no`     | `null`  | Financial Datasets API key stored through the dashboard, credential store, or environment. |

### Example

```
agents:
  market:
    tools:
      - financial_datasets_api
```

```
search_tickers("cloud software", limit=5)
get_company_info("MSFT")
get_stock_prices("AAPL", interval="1d", limit=30)
get_income_statements("NVDA", period="quarterly", limit=4)
get_news("TSLA", limit=10)
```

### Notes

- Configure the API key through the dashboard or credential store, or provide `FINANCIAL_DATASETS_API_KEY` in the environment.
- Use this tool when you need more structured financial datasets than `yfinance` or the default OpenBB provider exposes.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration)
- [Google Services OAuth (Admin Setup)](https://docs.mindroom.chat/deployment/google-services-oauth/index.md)
- [Google Services OAuth (Individual Setup)](https://docs.mindroom.chat/deployment/google-services-user-oauth/index.md)

# Web Search

Use these tools to search the public web, query paid search APIs, access Google- or Baidu-oriented results, or point agents at a self-hosted SearXNG instance.

## What This Page Covers

This page documents the built-in tools in the `web-search` group. Use these tools when you need general web discovery, current-events search, answer-style search APIs, Google or Baidu specific results, or a self-hosted metasearch backend.

## Tools On This Page

- \[`duckduckgo`\] - No-key DuckDuckGo-backed web and news search through the shared DDGS backend.
- \[`googlesearch`\] - No-key Google-backed web and news search through the shared DDGS backend.
- \[`baidusearch`\] - No-key Baidu search tuned for Chinese-language discovery.
- \[`tavily`\] - API-backed current-information search with optional answer, context, and URL extraction modes.
- \[`exa`\] - API-backed research search with content fetching, similar-page lookup, answers, and deep research tasks.
- \[`serpapi`\] - API-backed Google and YouTube SERP access.
- \[`serper`\] - API-backed Google web, news, and scholar search plus lightweight webpage scraping.
- \[`searxng`\] - Self-hosted SearXNG search across web, images, maps, music, science, news, and video.
- \[`linkup`\] - API-backed web search that can return either raw search results or sourced answers.

## Common Setup Notes

`duckduckgo`, `googlesearch`, and `baidusearch` are `setup_type: none`, so they work out of the box once their optional Python dependencies are available. `tavily`, `exa`, `serpapi`, `serper`, and `linkup` are `status=requires_config` and are intended to be configured with a stored `api_key`. `searxng` is also `status=requires_config`, but it needs a reachable `host` URL instead of an API key.

None of the tools on this page declare an `auth_provider`, and `src/mindroom/api/integrations.py` currently only exposes Spotify OAuth routes, so these tools use ordinary tool credentials or SDK environment variables rather than a dedicated dashboard OAuth flow. Password fields such as `api_key` should be stored through the dashboard or credential store instead of inline YAML. Current upstream SDKs also support environment variables such as `TAVILY_API_KEY`, `TAVILY_API_BASE_URL`, `EXA_API_KEY`, `SERP_API_KEY`, `SERPER_API_KEY`, and `LINKUP_API_KEY`. Missing optional dependencies can auto-install at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set.

Choosing between them:

- `duckduckgo` and `googlesearch` are the simplest no-key defaults for general search and basic news lookups.
- `baidusearch` is the better fit when you want Baidu indexing or Chinese-language defaults.
- `tavily` and `linkup` are useful when you want answer-oriented search output instead of only result lists.
- `exa` is the deepest research option on this page when you need domain filters, date filters, content fetches, find-similar, or a long-running research task.
- `serpapi` and `serper` are Google-focused paid APIs, with `serpapi` covering Google and YouTube verticals and `serper` covering Google web, news, scholar, and a scrape endpoint.
- `searxng` is the best fit when you control your own search stack or want SearXNG categories such as images, maps, music, science, and video.
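
For the API-backed tools, one low-friction option is exporting the upstream SDK environment variables mentioned above before starting the MindRoom process. The values here are placeholders; real keys come from each provider's dashboard:

```shell
# Placeholder values, not real keys.
export TAVILY_API_KEY="tvly-placeholder"
export EXA_API_KEY="exa-placeholder"
export SERPER_API_KEY="serper-placeholder"

# Confirm the variables are set before starting MindRoom.
echo "${TAVILY_API_KEY:+TAVILY_API_KEY is set}"
```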

## \[`duckduckgo`\]

`duckduckgo` is the simplest built-in web search option for general search and news without any API key setup.

### What It Does

`duckduckgo` wraps Agno's `DuckDuckGoTools`, which is a convenience layer over the shared `WebSearchTools` backend with `backend="duckduckgo"`. It exposes `web_search(query, max_results=5)` and `search_news(query, max_results=5)`. `modifier` prepends extra query text, `fixed_max_results` caps all calls, and `proxy`, `timeout`, and `verify_ssl` control the underlying DDGS client. The tool returns JSON strings from DDGS rather than a MindRoom-specific normalized response format.

### Configuration

| Option              | Type      | Required | Default | Notes                                             |
| ------------------- | --------- | -------- | ------- | ------------------------------------------------- |
| `enable_search`     | `boolean` | `no`     | `true`  | Enable `web_search()`.                            |
| `enable_news`       | `boolean` | `no`     | `true`  | Enable `search_news()`.                           |
| `modifier`          | `text`    | `no`     | `null`  | Prepends fixed query text to every web search.    |
| `fixed_max_results` | `number`  | `no`     | `null`  | Caps result count for both web and news searches. |
| `proxy`             | `url`     | `no`     | `null`  | Optional proxy for DDGS requests.                 |
| `timeout`           | `number`  | `no`     | `10`    | Request timeout in seconds.                       |
| `verify_ssl`        | `boolean` | `no`     | `true`  | Verify TLS certificates for DDGS requests.        |

### Example

```
agents:
  researcher:
    tools:
      - duckduckgo:
          enable_news: true
          fixed_max_results: 8
```

```
web_search("latest Matrix client features", max_results=5)
search_news("Matrix ecosystem", max_results=5)
```

### Notes

- Pick `duckduckgo` when you want the lowest-friction no-key option for general web and news search.
- Pick `googlesearch` instead when you want Google-style ranking but still do not want a paid API.
- Pick `tavily`, `exa`, `serper`, or `serpapi` when you need provider-backed APIs, answer generation, or more vertical-specific search behavior.

## \[`googlesearch`\]

`googlesearch` uses the same DDGS-powered search surface as `duckduckgo`, but it hardwires the backend to Google.

### What It Does

MindRoom registers `googlesearch` as a custom wrapper around Agno's `WebSearchTools` with `backend="google"`. It exposes `web_search(query, max_results=5)` and `search_news(query, max_results=5)`. Runtime behavior matches the `duckduckgo` tool surface, including `modifier`, `fixed_max_results`, `proxy`, `timeout`, and `verify_ssl`. This is still a DDGS-backed scraper-style search path rather than an official Google paid search API.

### Configuration

| Option              | Type      | Required | Default | Notes                                                                                                              |
| ------------------- | --------- | -------- | ------- | ------------------------------------------------------------------------------------------------------------------ |
| `enable_search`     | `boolean` | `no`     | `true`  | Enable `web_search()`.                            |
| `enable_news`       | `boolean` | `no`     | `true`  | Enable `search_news()`.                           |
| `modifier`          | `text`    | `no`     | `null`  | Prepends fixed query text to every web search.    |
| `fixed_max_results` | `number`  | `no`     | `null`  | Caps result count for both web and news searches. |
| `proxy`             | `url`     | `no`     | `null`  | Optional proxy for DDGS requests.                 |
| `timeout`           | `number`  | `no`     | `10`    | Request timeout in seconds.                       |
| `verify_ssl`        | `boolean` | `no`     | `true`  | Verify TLS certificates for DDGS requests.        |

The current registry metadata marks `enable_search`, `enable_news`, `fixed_max_results`, `proxy`, `timeout`, and `verify_ssl` as text fields; the wrapper expects the types shown above.

### Example

```
agents:
  researcher:
    tools:
      - googlesearch:
          modifier: site:docs.mindroom.chat
          fixed_max_results: 6
```

```
web_search("MindRoom Matrix threads", max_results=5)
search_news("open source Matrix news", max_results=5)
```

### Notes

- Pick `googlesearch` when you want Google-backed ranking without introducing an API key dependency.
- If you need a first-party paid Google SERP API with more predictable structure, use `serper` or `serpapi` instead.
- The current MindRoom wrapper makes this tool available without dedicated dashboard integration or OAuth.

## \[`baidusearch`\]

`baidusearch` is the Baidu-specific search tool for Chinese-language search and Baidu-indexed results.

### What It Does

`baidusearch` exposes one method, `baidu_search(query, max_results=5, language="zh")`. `fixed_language` overrides the per-call `language`, and non-two-letter language values are normalized through `pycountry` when possible. If language normalization fails, the upstream tool falls back to `zh`. The returned payload is a JSON array with `title`, `url`, `abstract`, and `rank`.

### Configuration

| Option                | Type      | Required | Default | Notes                                                                                                                             |
| --------------------- | --------- | -------- | ------- | --------------------------------------------------------------------------------------------------------------------------------- |
| `fixed_max_results`   | `number`  | `no`     | `null`  | Caps result count for every call.                                                                                                 |
| `fixed_language`      | `text`    | `no`     | `null`  | Forces a default search language, with `zh` as the upstream fallback.                                                             |
| `headers`             | `text`    | `no`     | `null`  | Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to `search()`. |
| `proxy`               | `url`     | `no`     | `null`  | Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to `search()`. |
| `timeout`             | `number`  | `no`     | `10`    | Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to `search()`. |
| `debug`               | `boolean` | `no`     | `false` | Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to `search()`. |
| `enable_baidu_search` | `boolean` | `no`     | `true`  | Enable `baidu_search()`.                                                                                                          |
| `all`                 | `boolean` | `no`     | `false` | Enable the full upstream toolkit surface.                                                                                         |

### Example

```
agents:
  cn_research:
    tools:
      - baidusearch:
          fixed_language: zh
          fixed_max_results: 8
```

```
baidu_search("Matrix 协议 新闻", max_results=5, language="zh")
```

### Notes

- Pick `baidusearch` when Chinese-language search quality matters more than Google-style ranking.
- Use `duckduckgo` or `googlesearch` for simpler English-centric general search defaults.
- The current installed upstream `baidu_search()` path only forwards keyword and result count, so `headers`, `proxy`, `timeout`, and `debug` are best treated as placeholders until the wrapper or upstream call path is tightened.

## \[`tavily`\]

`tavily` is the built-in current-information search API with optional context mode and URL extraction.

### What It Does

`tavily` can expose `web_search_using_tavily(query, max_results=5)`, `web_search_with_tavily(query)`, and `extract_url_content(urls)`, depending on the enable flags. `enable_search_context` switches the search surface from the normal result-list call to the context-oriented call, so you get one search method or the other instead of both. `web_search_using_tavily()` can include an AI-generated answer and returns either JSON or Markdown depending on `format`. `extract_url_content()` accepts one URL or a comma-separated URL list and formats extracted page content as Markdown or plain text depending on `extract_format`.

### Configuration

| Option                  | Type       | Required | Default    | Notes                                                                           |
| ----------------------- | ---------- | -------- | ---------- | ------------------------------------------------------------------------------- |
| `api_key`               | `password` | `yes`    | `null`     | Tavily API key. The upstream SDK also checks `TAVILY_API_KEY`.                  |
| `api_base_url`          | `url`      | `no`     | `null`     | Optional base URL override. The upstream SDK also checks `TAVILY_API_BASE_URL`. |
| `enable_search`         | `boolean`  | `no`     | `true`     | Enable Tavily search.                                                           |
| `enable_search_context` | `boolean`  | `no`     | `false`    | Use `web_search_with_tavily()` instead of `web_search_using_tavily()`.          |
| `enable_extract`        | `boolean`  | `no`     | `false`    | Enable `extract_url_content()`.                                                 |
| `all`                   | `boolean`  | `no`     | `false`    | Enable the full upstream toolkit surface.                                       |
| `max_tokens`            | `number`   | `no`     | `6000`     | Token budget for context output and filtered result formatting.                 |
| `include_answer`        | `boolean`  | `no`     | `true`     | Include the answer field in search output when available.                       |
| `search_depth`          | `text`     | `no`     | `advanced` | Tavily search depth, currently `basic` or `advanced`.                           |
| `extract_depth`         | `text`     | `no`     | `basic`    | Tavily extract depth, currently `basic` or `advanced`.                          |
| `include_images`        | `boolean`  | `no`     | `false`    | Include images in extract responses when supported.                             |
| `include_favicon`       | `boolean`  | `no`     | `false`    | Include favicons in extract responses when supported.                           |
| `extract_timeout`       | `number`   | `no`     | `null`     | Optional extraction timeout in seconds.                                         |
| `extract_format`        | `text`     | `no`     | `markdown` | Extraction output format, currently `markdown` or `text`.                       |
| `format`                | `text`     | `no`     | `markdown` | Search output format, currently `json` or `markdown`.                           |

### Example

```
agents:
  newsdesk:
    tools:
      - tavily:
          enable_extract: true
          include_answer: true
          search_depth: advanced
          format: markdown
```

```
web_search_using_tavily("latest Matrix bridge updates", max_results=5)
extract_url_content("https://matrix.org/blog/")
```
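
With `enable_search_context` on, the toolkit swaps the search method rather than adding a second one, so the config and the call change shape. A sketch using the flags from the table above:

```
agents:
  newsdesk:
    tools:
      - tavily:
          enable_search_context: true
          max_tokens: 4000
```

```
web_search_with_tavily("current state of Matrix bridges")
```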

### Notes

- Pick `tavily` when you want current-information search plus an optional synthesized answer or URL extraction in the same toolkit.
- Use `enable_search_context` when you want a compact context blob rather than a normal result list.
- If you want deeper research features such as similar-page search, date filters, and long-running structured research tasks, use `exa` instead.

## \[`exa`\]

`exa` is the research-heavy search toolkit for web search, content retrieval, similar-page discovery, answer generation, and deep research tasks.

### What It Does

`exa` can expose `search_exa(query, num_results=5, category=None)`, `get_contents(urls)`, `find_similar(url, num_results=5)`, `exa_answer(query, text=False)`, and `research(instructions, output_schema=None)`. Search results can include title, author, published date, URL, and truncated page text. The toolkit supports domain allowlists and denylists, crawl-date and publish-date filters, category and type filters, answer-model selection, and a separate `research_model` for long-running research tasks. `enable_research` is off by default, so deep research is opt-in even when the rest of the toolkit is enabled.

### Configuration

| Option                 | Type       | Required | Default        | Notes                                                                                                                                  |
| ---------------------- | ---------- | -------- | -------------- | -------------------------------------------------------------------------------------------------------------------------------------- |
| `enable_search`        | `boolean`  | `no`     | `true`         | Enable `search_exa()`.                                                                                                                 |
| `enable_get_contents`  | `boolean`  | `no`     | `true`         | Enable `get_contents()`.                                                                                                               |
| `enable_find_similar`  | `boolean`  | `no`     | `true`         | Enable `find_similar()`.                                                                                                               |
| `enable_answer`        | `boolean`  | `no`     | `true`         | Enable `exa_answer()`.                                                                                                                 |
| `enable_research`      | `boolean`  | `no`     | `false`        | Enable `research()`.                                                                                                                   |
| `all`                  | `boolean`  | `no`     | `false`        | Enable the full upstream toolkit surface.                                                                                              |
| `text`                 | `boolean`  | `no`     | `true`         | Include page text in results.                                                                                                          |
| `text_length_limit`    | `number`   | `no`     | `1000`         | Maximum text length per result.                                                                                                        |
| `summary`              | `boolean`  | `no`     | `false`        | Request result summaries where supported.                                                                                              |
| `api_key`              | `password` | `yes`    | `null`         | Exa API key. The upstream SDK also checks `EXA_API_KEY`.                                                                               |
| `num_results`          | `number`   | `no`     | `null`         | Default result count override.                                                                                                         |
| `livecrawl`            | `text`     | `no`     | `always`       | Exposed in MindRoom metadata, but the current installed upstream call path on this branch does not pass it through to search requests. |
| `start_crawl_date`     | `text`     | `no`     | `null`         | Include results crawled on or after this date.                                                                                         |
| `end_crawl_date`       | `text`     | `no`     | `null`         | Include results crawled on or before this date.                                                                                        |
| `start_published_date` | `text`     | `no`     | `null`         | Include results published on or after this date.                                                                                       |
| `end_published_date`   | `text`     | `no`     | `null`         | Include results published on or before this date.                                                                                      |
| `type`                 | `text`     | `no`     | `null`         | Optional content type filter such as article, blog, or video.                                                                          |
| `category`             | `text`     | `no`     | `null`         | Optional category filter such as `news`, `github`, or `research paper`.                                                                |
| `include_domains`      | `string[]` | `no`     | `null`         | Domain allowlist. The current registry metadata exposes this as a text field, but runtime expects a list of domains.                   |
| `exclude_domains`      | `string[]` | `no`     | `null`         | Domain denylist. The current registry metadata exposes this as a text field, but runtime expects a list of domains.                    |
| `show_results`         | `boolean`  | `no`     | `false`        | Emit debug logs with raw parsed results.                                                                                               |
| `model`                | `text`     | `no`     | `null`         | Answer model for `exa_answer()`, currently `exa` or `exa-pro`.                                                                         |
| `timeout`              | `number`   | `no`     | `30`           | Timeout in seconds for API operations.                                                                                                 |
| `research_model`       | `text`     | `no`     | `exa-research` | Model for `research()`, currently `exa-research` or `exa-research-pro`.                                                                |

### Example

```
agents:
  analyst:
    tools:
      - exa:
          enable_research: true
          category: news
          include_domains:
            - matrix.org
            - element.io
          research_model: exa-research
```

```
search_exa("Matrix sliding sync adoption", num_results=5)
find_similar("https://matrix.org/blog/")
exa_answer("What changed in the Matrix ecosystem this week?")
research("Compare hosted Matrix bridges for small teams.")
```

### Notes

- Pick `exa` when you need the richest research surface on this page rather than a simple search box.
- `model` only affects `exa_answer()`, and `research_model` only affects `research()`.
- The current wrapper exposes `livecrawl`, but the upstream call path installed in this worktree does not apply that setting to search requests, so do not rely on it to change behavior yet.
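
Since `research()` accepts an optional `output_schema`, a structured deep-research call might look like the sketch below. The JSON-Schema shape of the argument is an assumption; verify the exact accepted format against the upstream Exa docs.

```
research(
    "Compare hosted Matrix bridges for small teams.",
    output_schema={
        "type": "object",
        "properties": {
            "bridges": {"type": "array", "items": {"type": "string"}},
            "recommendation": {"type": "string"},
        },
    },
)
```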

## \[`serpapi`\]

`serpapi` is the Google and YouTube search toolkit for agents that need a paid SERP provider instead of DDGS-backed scraping.

### What It Does

`serpapi` exposes `search_google(query, num_results=10)` and `search_youtube(query)`. `search_google()` returns a JSON payload with `search_results`, `recipes_results`, `shopping_results`, `knowledge_graph`, and `related_questions`. `search_youtube()` returns `video_results`, `movie_results`, and `channel_results`. MindRoom does not add extra behavior here beyond registering the tool metadata and dependency set.

### Configuration

| Option                  | Type       | Required | Default | Notes                                                     |
| ----------------------- | ---------- | -------- | ------- | --------------------------------------------------------- |
| `api_key`               | `password` | `yes`    | `null`  | SerpApi key. The upstream SDK also checks `SERP_API_KEY`. |
| `enable_search_google`  | `boolean`  | `no`     | `true`  | Enable `search_google()`.                                 |
| `enable_search_youtube` | `boolean`  | `no`     | `false` | Enable `search_youtube()`.                                |
| `all`                   | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                 |

### Example

```
agents:
  researcher:
    tools:
      - serpapi:
          enable_search_youtube: true
```

```
search_google("Matrix bridges", num_results=10)
search_youtube("Matrix conference talks")
```

### Notes

- Pick `serpapi` when you specifically want Google plus YouTube search from one paid provider.
- `serpapi` is a better fit than `googlesearch` when you want a provider-backed API instead of DDGS-backed scraping.
- `serper` is the better fit when you need Google news, Google Scholar, or a scrape endpoint instead of YouTube search.
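
Because `search_google()` returns a JSON payload keyed by result section, downstream code usually guards each section before reading it. The payload below is a hypothetical minimal shape built from the documented top-level keys; real SerpApi responses carry many more fields.

```python
import json

# Hypothetical minimal search_google() payload using the documented
# top-level keys; real SerpApi responses contain many more fields.
raw = json.dumps({
    "search_results": [
        {"title": "Matrix bridges", "link": "https://matrix.org/bridges/"},
    ],
    "knowledge_graph": {"title": "Matrix (protocol)"},
    "related_questions": [{"question": "What is a Matrix bridge?"}],
})

payload = json.loads(raw)
# Guard each section: a key may be absent when Google returns no such block.
links = [r["link"] for r in payload.get("search_results", [])]
kg_title = payload.get("knowledge_graph", {}).get("title")
```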

## \[`serper`\]

`serper` is the Google API toolkit for web, news, scholar, and lightweight scrape calls.

### What It Does

`serper` exposes `search_web(query, num_results=None)`, `search_news(query, num_results=None)`, `search_scholar(query, num_results=None)`, and `scrape_webpage(url, markdown=False)`. `location`, `language`, and `date_range` become shared request parameters across the search endpoints. The search methods return raw JSON responses from Serper. `scrape_webpage()` hits Serper's scrape endpoint and can optionally request Markdown output.

### Configuration

| Option                  | Type       | Required | Default | Notes                                                          |
| ----------------------- | ---------- | -------- | ------- | -------------------------------------------------------------- |
| `api_key`               | `password` | `yes`    | `null`  | Serper API key. The upstream SDK also checks `SERPER_API_KEY`. |
| `location`              | `text`     | `no`     | `us`    | Google location code sent as `gl`.                             |
| `language`              | `text`     | `no`     | `en`    | Search language code sent as `hl`.                             |
| `num_results`           | `number`   | `no`     | `10`    | Default result count for search calls.                         |
| `date_range`            | `text`     | `no`     | `null`  | Shared date-range filter sent as `tbs`.                        |
| `enable_search`         | `boolean`  | `no`     | `true`  | Enable `search_web()`.                                         |
| `enable_search_news`    | `boolean`  | `no`     | `true`  | Enable `search_news()`.                                        |
| `enable_search_scholar` | `boolean`  | `no`     | `true`  | Enable `search_scholar()`.                                     |
| `enable_scrape_webpage` | `boolean`  | `no`     | `true`  | Enable `scrape_webpage()`.                                     |
| `all`                   | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                      |

### Example

```
agents:
  analyst:
    tools:
      - serper:
          location: us
          language: en
          enable_search_scholar: true
```

```
search_web("latest Matrix rooms UX", num_results=5)
search_news("Matrix foundation news", num_results=5)
search_scholar("Matrix protocol paper", num_results=5)
scrape_webpage("https://matrix.org/blog/", markdown=True)
```

### Notes

- Pick `serper` when you want Google news and scholar in the same paid toolkit.
- `serper` also covers quick scrape calls, which makes it a good bridge between search and light extraction workflows.
- If you want YouTube search instead of scholar or scraping, use `serpapi` instead.
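
Because `date_range` is forwarded to Google as `tbs`, standard recency values such as `qdr:d` (past day) or `qdr:w` (past week) are the likely format; treat the exact accepted values as an assumption to verify against Serper's docs. A config sketch:

```
agents:
  newsdesk:
    tools:
      - serper:
          date_range: qdr:w
          enable_search_news: true
```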

## \[`searxng`\]

`searxng` points an agent at your own SearXNG instance instead of a hosted paid API.

### What It Does

`searxng` exposes `search_web(query, max_results=5)`, `image_search(query, max_results=5)`, `it_search(query, max_results=5)`, `map_search(query, max_results=5)`, `music_search(query, max_results=5)`, `news_search(query, max_results=5)`, `science_search(query, max_results=5)`, and `video_search(query, max_results=5)`. All of those calls route through the same `/search?format=json` endpoint on the configured `host`. If `engines` is set, the tool appends those engine names to the SearXNG request. `fixed_max_results` truncates every category response to a consistent maximum.

### Configuration

| Option              | Type       | Required | Default | Notes                                                                                                                              |
| ------------------- | ---------- | -------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| `host`              | `url`      | `yes`    | `null`  | Base URL for the SearXNG instance. Use the instance root, not a prebuilt `/search` URL.                                            |
| `engines`           | `string[]` | `no`     | `[]`    | Optional engine allowlist. The current registry metadata exposes this as a text field, but runtime expects a list of engine names. |
| `fixed_max_results` | `number`   | `no`     | `null`  | Caps result count for all categories.                                                                                              |

### Example

```
agents:
  privacy_research:
    tools:
      - searxng:
          host: https://search.example.com
          engines:
            - duckduckgo
            - wikipedia
          fixed_max_results: 6
```

```
search_web("Matrix federation guide", max_results=5)
news_search("Matrix news", max_results=5)
science_search("decentralized messaging protocol", max_results=5)
image_search("Matrix logo", max_results=5)
```

### Notes

- Pick `searxng` when you want a self-hosted or privacy-preserving search backend under your own control.
- `searxng` is the only tool on this page that exposes image, map, music, science, and video categories through the same configuration.
- If your SearXNG deployment needs auth or reverse-proxy policy, handle that at the instance or network layer, because the current MindRoom tool metadata only exposes `host`, `engines`, and `fixed_max_results`.
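
Under the hood every category call is one GET against the same endpoint. A minimal stdlib sketch of that request shape, assuming the standard SearXNG `/search` query parameters (`q`, `format`, `categories`, `engines`):

```python
from urllib.parse import urlencode

def build_searxng_url(host, query, categories=None, engines=None):
    """Sketch of the request every category helper ultimately issues."""
    params = {"q": query, "format": "json"}
    if categories:
        params["categories"] = ",".join(categories)
    if engines:
        params["engines"] = ",".join(engines)
    return f"{host.rstrip('/')}/search?{urlencode(params)}"

url = build_searxng_url(
    "https://search.example.com",
    "Matrix federation guide",
    categories=["news"],
    engines=["duckduckgo", "wikipedia"],
)
```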

## \[`linkup`\]

`linkup` is a web-search API that can return either search-result lists or sourced answers.

### What It Does

`linkup` exposes `web_search_with_linkup(query, depth=None, output_type=None)`. `depth` controls how aggressively Linkup searches, and `output_type` controls whether the response is a `searchResults` list or a `sourcedAnswer`. The configured defaults are applied when the call does not override them. The tool returns the raw response from the Linkup SDK rather than a MindRoom-specific normalized envelope.

### Configuration

| Option                          | Type       | Required | Default         | Notes                                                                                                                                                     |
| ------------------------------- | ---------- | -------- | --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `api_key`                       | `password` | `yes`    | `null`          | Linkup API key. The upstream SDK documents `LINKUP_API_KEY`, but the current MindRoom wrapper works most reliably when the key is stored explicitly in tool config. |
| `depth`                         | `text`     | `no`     | `standard`      | Default search depth, currently `standard` or `deep`.                                                                                                     |
| `output_type`                   | `text`     | `no`     | `searchResults` | Default output type, currently `searchResults` or `sourcedAnswer`.                                                                                        |
| `enable_web_search_with_linkup` | `boolean`  | `no`     | `true`          | Enable `web_search_with_linkup()`.                                                                                                                        |
| `all`                           | `boolean`  | `no`     | `false`         | Enable the full upstream toolkit surface.                                                                                                                 |

### Example

```
agents:
  briefings:
    tools:
      - linkup:
          depth: deep
          output_type: sourcedAnswer
```

```
web_search_with_linkup(
    "Summarize the latest Matrix bridge announcements",
    depth="deep",
    output_type="sourcedAnswer",
)
```

### Notes

- Pick `linkup` when you want a sourced answer directly from the search provider instead of stitching one together downstream.
- Pick `tavily` when you also want built-in extract calls, and pick `exa` when you need broader research primitives such as `find_similar()` or `research()`.
- The current MindRoom wrapper initializes the Linkup client from the explicit `api_key` argument, so a stored tool credential is more reliable than relying on environment-only fallback on this branch.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration)

# Web Scraping & Browser

Use these tools to read pages, extract article text, crawl websites, and drive either a local browser runtime or a hosted browser API.

## What This Page Covers

This page documents the built-in tools in the `web-scraping-and-browser` group. Use these tools when you need lightweight text extraction, structured scraping APIs, or browser automation against live websites.

## Tools On This Page

- \[`crawl4ai`\] - Local Crawl4AI crawling with readable-text extraction and optional query-aware filtering.
- \[`website`\] - Simple website reader and optional knowledge-base ingester.
- \[`trafilatura`\] - Local page extraction, metadata extraction, HTML-to-text conversion, batch extraction, and focused crawling.
- \[`newspaper`\] - News-article reader backed by the `newspaper4k` dependency.
- \[`jina`\] - Jina Reader URL reading and optional web search with an optional API key.
- \[`firecrawl`\] - Firecrawl API for scrape, crawl, map, and search jobs.
- \[`spider`\] - Spider Cloud API for search, scrape, and crawl.
- \[`scrapegraph`\] - ScrapeGraph AI extraction, markdown conversion, search scraping, and agentic crawling.
- \[`apify`\] - Apify Actor runner that turns configured actors into tool functions.
- \[`brightdata`\] - Bright Data scraping, screenshots, SERP queries, and feed endpoints.
- \[`oxylabs`\] - Oxylabs Google search, Amazon data, and general web scraping.
- \[`agentql`\] - AgentQL browser-assisted scraping with optional custom extraction queries.
- \[`browserbase`\] - Browserbase-hosted browser sessions with remote navigation, screenshots, and page reads.
- \[`browser`\] - MindRoom's local Playwright browser controller.
- \[`web_browser_tools`\] - Host OS browser opener for launching a real browser tab or window.

## Common Setup Notes

`crawl4ai`, `website`, `trafilatura`, `newspaper`, and `web_browser_tools` are the lowest-friction no-config options on this page. `firecrawl`, `browserbase`, `agentql`, `scrapegraph`, `apify`, `brightdata`, and `oxylabs` are all credentialed tools that normally need stored credentials or SDK environment variables before they are useful. `spider` also needs credentials in practice even though the current MindRoom metadata marks it as `setup_type: none`, because the installed `spider-client` raises when `SPIDER_API_KEY` is missing. `jina` is the middle ground: the installed `JinaReaderTools` only adds an `Authorization` header when `api_key` is present, so public `read_url()` usage works without a key while authenticated plans can still set one.

`browser` is local Playwright automation, `browserbase` is a hosted browser API that you connect to over CDP, and `web_browser_tools` simply asks the host operating system to open a browser tab or window. `crawl4ai`, `agentql`, `browserbase`, and `browser` also depend on a working browser runtime, and `web_browser_tools` only makes sense on a host that can open a real desktop browser.

`src/mindroom/api/integrations.py` currently only exposes Spotify OAuth routes on this branch, so none of the tools on this page have a dedicated MindRoom OAuth flow. Store password fields through the dashboard or credential store instead of inline YAML, and use environment variables such as `FIRECRAWL_API_KEY`, `SPIDER_API_KEY`, `BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`, `AGENTQL_API_KEY`, `SGAI_API_KEY`, `APIFY_API_TOKEN`, `BRIGHT_DATA_API_KEY`, `OXYLABS_USERNAME`, `OXYLABS_PASSWORD`, and `JINA_API_KEY` when you prefer SDK-native auth. Missing optional dependencies can auto-install at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set.
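
When you prefer SDK-native auth, those environment variables can be supplied through the stack's `.env` file; every value below is a placeholder:

```
# Placeholder values - replace with real credentials
FIRECRAWL_API_KEY=...
SPIDER_API_KEY=...
BROWSERBASE_API_KEY=...
BROWSERBASE_PROJECT_ID=...
AGENTQL_API_KEY=...
SGAI_API_KEY=...
APIFY_API_TOKEN=...
BRIGHT_DATA_API_KEY=...
OXYLABS_USERNAME=...
OXYLABS_PASSWORD=...
JINA_API_KEY=...
```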

## No-Config Scrapers

### \[`crawl4ai`\]

`crawl4ai` is the best local option on this page when you want one tool that can fetch readable page content from one URL or a short URL list.

#### What It Does

`crawl4ai` exposes `crawl(url, search_query=None)`. It accepts either one URL string or a list of URLs and returns readable extracted content for each one. When you pass `search_query`, the tool enables BM25-based content filtering to keep the extracted text focused on that query. When `use_pruning` is enabled without a query, the tool uses Crawl4AI pruning to trim noisy page content. The current implementation bypasses the Crawl4AI cache for fresher reads and truncates the result to `max_length` when needed. This is a local crawler rather than a hosted API, so it does not need an API key, but it still needs a working browser runtime.

#### Configuration

| Option              | Type      | Required | Default            | Notes                                                                                                                    |
| ------------------- | --------- | -------- | ------------------ | ------------------------------------------------------------------------------------------------------------------------ |
| `max_length`        | `number`  | `no`     | `5000`             | Maximum returned character count after extraction.                                                                       |
| `timeout`           | `number`  | `no`     | `60`               | Crawl timeout in seconds.                                                                                                |
| `use_pruning`       | `boolean` | `no`     | `false`            | Enable pruning-based cleanup when no `search_query` is provided.                                                         |
| `pruning_threshold` | `number`  | `no`     | `0.48`             | Threshold passed to Crawl4AI pruning mode.                                                                               |
| `bm25_threshold`    | `number`  | `no`     | `1.0`              | Threshold passed to BM25 filtering when `search_query` is used.                                                          |
| `headless`          | `boolean` | `no`     | `true`             | Launch Crawl4AI's browser in headless mode.                                                                              |
| `wait_until`        | `text`    | `no`     | `domcontentloaded` | Playwright wait condition before extraction.                                                                             |
| `proxy_config`      | `object`  | `no`     | `null`             | Raw browser proxy config passed into Crawl4AI `BrowserConfig`; the current MindRoom metadata exposes this as text.       |
| `enable_crawl`      | `boolean` | `no`     | `true`             | Enable `crawl()`.                                                                                                        |
| `all`               | `boolean` | `no`     | `false`            | Enable the full upstream toolkit surface.                                                                                |

#### Example

```
agents:
  researcher:
    tools:
      - crawl4ai:
          max_length: 8000
          use_pruning: true
          wait_until: networkidle
```

```
crawl("https://matrix.org/blog/", search_query="bridges and federation")
```

#### Notes

- Use `crawl4ai` when you want a local scraper instead of a hosted API.
- `proxy_config` maps directly to Crawl4AI browser settings, so treat it as an advanced raw config object.
- For heavily protected or browser-hostile sites, `browserbase`, `brightdata`, or `browser` can be a better fit.
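
The str-or-list behavior of the `url` argument follows a common normalization pattern. A stdlib sketch of the idea, not the actual MindRoom implementation; `fetch` stands in for the real Crawl4AI extraction:

```python
def crawl_sketch(url, max_length=5000, fetch=lambda u: f"content of {u}"):
    """Sketch of crawl()'s input handling: accepts one URL string or a
    list of URLs and truncates each extracted text to max_length.
    The fetch callable is a stand-in for the real extraction step."""
    urls = [url] if isinstance(url, str) else list(url)
    return {u: fetch(u)[:max_length] for u in urls}

print(crawl_sketch("https://matrix.org/blog/", max_length=10))
# → {'https://matrix.org/blog/': 'content of'}
```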

### \[`website`\]

`website` is the lightest built-in page reader on this page.

#### What It Does

With normal MindRoom YAML configuration, `website` exposes `read_url(url)` and returns JSON-serialized `Document` objects from Agno's `WebsiteReader`. If a `Knowledge` object is injected programmatically through the `knowledge` constructor argument, the tool exposes `add_website_to_knowledge(url)` instead of `read_url()`. That means the same registry entry can act either as a simple page reader or as a knowledge-base ingestion hook depending on how it is constructed. In normal hand-authored `config.yaml`, you should treat this as a quick page-reading tool.

#### Configuration

| Option      | Type     | Required | Default | Notes                                                                                                                               |
| ----------- | -------- | -------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------- |
| `knowledge` | `object` | `no`     | `null`  | Advanced programmatic `Knowledge` object injection that changes the tool surface from `read_url()` to `add_website_to_knowledge()`. |

#### Example

```
agents:
  assistant:
    tools:
      - website
```

```
read_url("https://docs.mindroom.chat")
```

#### Notes

- `website` is the simplest default when you just need to read one page.
- The `knowledge` field is not typical hand-written YAML and is mainly useful in programmatic setups.
- If you need metadata-only extraction, batch extraction, or crawling, `trafilatura` is usually a better fit.

### \[`trafilatura`\]

`trafilatura` is the most capable local extractor on this page when you want text extraction, metadata, HTML conversion, and lightweight crawling from one toolkit.

#### What It Does

`trafilatura` exposes `extract_text()`, `extract_metadata_only()`, `crawl_website()`, `html_to_text()`, and `extract_batch()`. It fetches pages locally through Trafilatura and can return plain text, Markdown, JSON, XML, CSV, or HTML output depending on `output_format`. `extract_metadata_only()` returns metadata without full article text. `extract_batch()` loops over multiple URLs and returns one JSON payload with successes and failures. `crawl_website()` uses Trafilatura's focused spider support when that module is importable in the runtime. If the spider module is missing, the tool skips crawler registration instead of exposing a broken crawl function.

#### Configuration

| Option                         | Type      | Required | Default  | Notes                                                                                 |
| ------------------------------ | --------- | -------- | -------- | ------------------------------------------------------------------------------------- |
| `output_format`                | `text`    | `no`     | `txt`    | Default extraction format such as `txt`, `json`, `markdown`, `xml`, `csv`, or `html`. |
| `include_comments`             | `boolean` | `no`     | `true`   | Include comment content in extracted output.                                          |
| `include_tables`               | `boolean` | `no`     | `true`   | Keep table content in extracted output.                                               |
| `include_images`               | `boolean` | `no`     | `false`  | Include image information where Trafilatura supports it.                              |
| `include_formatting`           | `boolean` | `no`     | `false`  | Preserve formatting markers in extracted output.                                      |
| `include_links`                | `boolean` | `no`     | `false`  | Preserve links in extracted output.                                                   |
| `with_metadata`                | `boolean` | `no`     | `false`  | Include metadata in extraction output.                                                |
| `favor_precision`              | `boolean` | `no`     | `false`  | Bias extraction toward precision.                                                     |
| `favor_recall`                 | `boolean` | `no`     | `false`  | Bias extraction toward recall.                                                        |
| `target_language`              | `text`    | `no`     | `null`   | Optional ISO 639-1 language filter such as `en` or `de`.                              |
| `deduplicate`                  | `boolean` | `no`     | `false`  | Deduplicate repeated content segments.                                                |
| `max_tree_size`                | `number`  | `no`     | `null`   | Optional parser tree-size limit.                                                      |
| `max_crawl_urls`               | `number`  | `no`     | `10`     | Maximum URLs to visit when crawling.                                                  |
| `max_known_urls`               | `number`  | `no`     | `100000` | Maximum discovered URLs to track while crawling.                                      |
| `enable_extract_text`          | `boolean` | `no`     | `true`   | Enable `extract_text()`.                                                              |
| `enable_extract_metadata_only` | `boolean` | `no`     | `true`   | Enable `extract_metadata_only()`.                                                     |
| `enable_html_to_text`          | `boolean` | `no`     | `true`   | Enable `html_to_text()`.                                                              |
| `enable_extract_batch`         | `boolean` | `no`     | `true`   | Enable `extract_batch()`.                                                             |
| `enable_crawl_website`         | `boolean` | `no`     | `true`   | Enable `crawl_website()` when Trafilatura spider support is available.                |
| `all`                          | `boolean` | `no`     | `false`  | Enable the full upstream toolkit surface.                                             |

#### Example

```
agents:
  analyst:
    tools:
      - trafilatura:
          output_format: markdown
          with_metadata: true
          include_links: true
```

```
extract_text("https://matrix.org/blog/", output_format="markdown")
extract_metadata_only("https://matrix.org/blog/")
```

#### Notes

- `trafilatura` is the strongest no-key option when you want more than a plain page read.
- `crawl_website()` depends on Trafilatura spider support in the runtime, so verify the crawler function exists if crawling matters to your workflow.
- For news-article specific extraction with titles, authors, and summaries, `newspaper` can be a better fit.

### \[`newspaper`\]

`newspaper` is the article-focused extractor for news pages and blog posts.

#### What It Does

`newspaper` exposes `read_article(url)`. It returns JSON with whichever article fields were extracted successfully, including title, authors, text, publish date, and optional summary. `article_length` truncates article text after extraction. The registry name is `newspaper`, but the underlying module and dependency still come from `newspaper4k`. That means old references to `newspaper4k` are stale for current MindRoom config.

#### Configuration

| Option                | Type      | Required | Default | Notes                                          |
| --------------------- | --------- | -------- | ------- | ---------------------------------------------- |
| `include_summary`     | `boolean` | `no`     | `false` | Include article summary when available.        |
| `article_length`      | `number`  | `no`     | `null`  | Truncate article text to this many characters. |
| `enable_read_article` | `boolean` | `no`     | `true`  | Enable `read_article()`.                       |
| `all`                 | `boolean` | `no`     | `false` | Enable the full upstream toolkit surface.      |

#### Example

```
agents:
  newsdesk:
    tools:
      - newspaper:
          include_summary: true
          article_length: 6000
```

```
read_article("https://matrix.org/blog/")
```

#### Notes

- Use `newspaper` in `tools:`, not `newspaper4k`.
- This tool is tuned for article-style pages rather than arbitrary websites.
- For generic site crawling or metadata extraction across many URLs, use `trafilatura` or `crawl4ai`.

### \[`jina`\]

`jina` wraps Jina Reader's read and search endpoints and is the easiest hosted option on this page when you want an optional-key reader rather than a strict credential gate.

#### What It Does

`jina` exposes `read_url(url)` and, when enabled, `search_query(query)`. `read_url()` prepends the target URL to `base_url`, which defaults to `https://r.jina.ai/`. `search_query()` posts the query to `search_url`, which defaults to `https://s.jina.ai/`. When `search_query_content` is false, the tool adds `X-Respond-With: no-content` to avoid returning full page text in search results. Returned content is truncated to `max_content_length`. The installed implementation only adds the `Authorization` header when an API key is present, so unauthenticated public-reader usage still works.

#### Configuration

| Option                 | Type       | Required | Default              | Notes                                                                            |
| ---------------------- | ---------- | -------- | -------------------- | -------------------------------------------------------------------------------- |
| `api_key`              | `password` | `no`     | `null`               | Optional Jina API key, with `JINA_API_KEY` as the SDK fallback.                  |
| `base_url`             | `url`      | `no`     | `https://r.jina.ai/` | Base URL for `read_url()`.                                                       |
| `search_url`           | `url`      | `no`     | `https://s.jina.ai/` | Base URL for `search_query()`.                                                   |
| `max_content_length`   | `number`   | `no`     | `10000`              | Maximum returned character count.                                                |
| `timeout`              | `number`   | `no`     | `null`               | Optional Jina timeout header in seconds.                                         |
| `search_query_content` | `boolean`  | `no`     | `true`               | Return full content in search results instead of metadata-only search summaries. |
| `enable_read_url`      | `boolean`  | `no`     | `true`               | Enable `read_url()`.                                                             |
| `enable_search_query`  | `boolean`  | `no`     | `false`              | Enable `search_query()`.                                                         |
| `all`                  | `boolean`  | `no`     | `false`              | Enable the full upstream toolkit surface.                                        |

#### Example

```
agents:
  researcher:
    tools:
      - jina:
          enable_search_query: true
          search_query_content: false
```

```
read_url("https://matrix.org/blog/")
search_query("latest Matrix bridge updates")
```

#### Notes

- `jina` works without a key for public reader endpoints, but a key is still useful for authenticated plans or rate limits.
- The current MindRoom metadata marks this tool as `requires_config`, but the installed code only treats auth as optional.
- Pick `jina` when you specifically want Jina Reader semantics instead of local extraction libraries.

## API-Based Scrapers

### \[`firecrawl`\]

`firecrawl` is the hosted scraper on this page that covers scrape, crawl, map, and search from one API.

#### What It Does

`firecrawl` exposes `scrape_website()`, `crawl_website()`, `map_website()`, and `search_web()`. `formats` is applied to scrape, crawl, and search requests. `limit` acts as the default result cap for crawl and search operations. `poll_interval` controls how often crawl jobs are polled. `search_params` is passed through to Firecrawl search calls as raw provider-specific options. The upstream tool falls back to `FIRECRAWL_API_KEY` when `api_key` is not provided directly.

#### Configuration

| Option           | Type       | Required | Default                     | Notes                                                                                                                     |
| ---------------- | ---------- | -------- | --------------------------- | ------------------------------------------------------------------------------------------------------------------------- |
| `api_key`        | `password` | `yes`    | `null`                      | Firecrawl API key, with `FIRECRAWL_API_KEY` as the SDK fallback.                                                          |
| `enable_scrape`  | `boolean`  | `no`     | `true`                      | Enable `scrape_website()`.                                                                                                |
| `enable_crawl`   | `boolean`  | `no`     | `false`                     | Enable `crawl_website()`.                                                                                                 |
| `enable_mapping` | `boolean`  | `no`     | `false`                     | Enable `map_website()`.                                                                                                   |
| `enable_search`  | `boolean`  | `no`     | `false`                     | Enable `search_web()`.                                                                                                    |
| `all`            | `boolean`  | `no`     | `false`                     | Enable the full upstream toolkit surface.                                                                                 |
| `formats`        | `string[]` | `no`     | `null`                      | Requested Firecrawl formats such as `markdown` or `html`, while the current MindRoom metadata exposes this field as text. |
| `limit`          | `number`   | `no`     | `10`                        | Default page or result limit for crawl and search.                                                                        |
| `poll_interval`  | `number`   | `no`     | `30`                        | Crawl polling interval in seconds.                                                                                        |
| `search_params`  | `object`   | `no`     | `null`                      | Raw Firecrawl search parameters object, while the current MindRoom metadata exposes this field as text.                   |
| `api_url`        | `url`      | `no`     | `https://api.firecrawl.dev` | Firecrawl API base URL.                                                                                                   |

#### Example

```
agents:
  research:
    tools:
      - firecrawl:
          enable_crawl: true
          enable_search: true
          limit: 5
```

```
scrape_website("https://matrix.org/blog/")
search_web("latest Matrix bridges")
```

#### Notes

- Use `firecrawl` when you want scrape, crawl, map, and search in one hosted API.
- `formats` and `search_params` are raw upstream arguments, so verify them against your Firecrawl plan and endpoint version.
- This is usually a better fit than `crawl4ai` when you want provider-hosted crawling instead of local browser work.

### \[`spider`\]

`spider` is Spider Cloud's search, scrape, and crawl toolkit for LLM-ready output.

#### What It Does

`spider` exposes `search_web(query, max_results=5)`, `scrape(url)`, and `crawl(url, limit=None)`. The current wrapper calls Spider search with `fetch_page_content: false`, so search is primarily discovery rather than full-content extraction. `scrape()` and `crawl()` request Markdown-style output from Spider. `optional_params` is merged into Spider API requests as a raw provider options object. The installed `spider-client` constructor raises when no API key is available, even though the current MindRoom metadata says this tool is available without setup.
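The `optional_params` merge can be sketched as a plain dictionary update. The payload field names below are hypothetical, not the real Spider API schema; only the merge order is the point, since `optional_params` overrides the wrapper's defaults:

```
def build_search_payload(query, max_results=5, optional_params=None):
    # Hypothetical field names for illustration only.
    payload = {"query": query, "limit": max_results, "fetch_page_content": False}
    payload.update(optional_params or {})
    return payload
```

Because the merge happens last, a key like `limit` in `optional_params` wins over the wrapper's own default.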

#### Configuration

| Option            | Type      | Required | Default | Notes                                                                                                                                     |
| ----------------- | --------- | -------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------- |
| `max_results`     | `number`  | `no`     | `null`  | Default result count override for `search_web()`.                                                                                         |
| `url`             | `url`     | `no`     | `null`  | Optional default URL constructor argument from the upstream toolkit.                                                                      |
| `optional_params` | `object`  | `no`     | `null`  | Raw Spider API parameters merged into search, scrape, and crawl requests, while the current MindRoom metadata exposes this field as text. |
| `enable_search`   | `boolean` | `no`     | `true`  | Enable `search_web()`.                                                                                                                    |
| `enable_scrape`   | `boolean` | `no`     | `true`  | Enable `scrape()`.                                                                                                                        |
| `enable_crawl`    | `boolean` | `no`     | `true`  | Enable `crawl()`.                                                                                                                         |
| `all`             | `boolean` | `no`     | `false` | Enable the full upstream toolkit surface.                                                                                                 |

#### Example

```
agents:
  crawler:
    tools:
      - spider:
          max_results: 8
          enable_crawl: true
```

```
search_web("MindRoom Matrix setup", max_results=5)
scrape("https://matrix.org/blog/")
```

#### Notes

- Treat `spider` as a credentialed tool and set `SPIDER_API_KEY`, even though the current MindRoom metadata still says `setup_type: none`.
- `optional_params` is a raw provider object and is best used only when you already know the Spider API field names you want.
- If you want a cleaner, explicitly credentialed hosted scraper with clearer metadata, `firecrawl` is usually simpler.

### \[`scrapegraph`\]

`scrapegraph` is the prompt-driven extractor on this page for turning web pages into structured answers.

#### What It Does

`scrapegraph` exposes `smartscraper()`, `markdownify()`, `crawl()`, `agentic_crawler()`, `searchscraper()`, and `scrape()`. `smartscraper()` extracts structured data from one page based on a natural-language prompt. `markdownify()` returns a Markdown version of a page. `crawl()` applies a prompt plus JSON schema across a crawl. `agentic_crawler()` performs automated steps in the browser and can optionally run AI extraction over the resulting content. `searchscraper()` searches the web before extracting information. `render_heavy_js` only affects the low-level `scrape()` path.

#### Configuration

| Option                   | Type       | Required | Default | Notes                                                         |
| ------------------------ | ---------- | -------- | ------- | ------------------------------------------------------------- |
| `api_key`                | `password` | `yes`    | `null`  | ScrapeGraph API key, with `SGAI_API_KEY` as the SDK fallback. |
| `enable_smartscraper`    | `boolean`  | `no`     | `true`  | Enable `smartscraper()`.                                      |
| `enable_markdownify`     | `boolean`  | `no`     | `false` | Enable `markdownify()`.                                       |
| `enable_crawl`           | `boolean`  | `no`     | `false` | Enable `crawl()`.                                             |
| `enable_searchscraper`   | `boolean`  | `no`     | `false` | Enable `searchscraper()`.                                     |
| `enable_agentic_crawler` | `boolean`  | `no`     | `false` | Enable `agentic_crawler()`.                                   |
| `enable_scrape`          | `boolean`  | `no`     | `false` | Enable raw `scrape()`.                                        |
| `render_heavy_js`        | `boolean`  | `no`     | `false` | Ask ScrapeGraph to render heavy JavaScript for `scrape()`.    |
| `all`                    | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                     |

#### Example

```
agents:
  extractor:
    tools:
      - scrapegraph:
          enable_searchscraper: true
          enable_agentic_crawler: true
```

```
smartscraper("https://matrix.org/blog/", "Extract the title, date, and three main points.")
markdownify("https://matrix.org/blog/")
```

#### Notes

- If you disable `enable_smartscraper` without enabling `all`, the installed upstream toolkit auto-enables `markdownify()` so the tool still has a useful default surface.
- Use `scrapegraph` when you want prompt-shaped extraction rather than generic page text.
- For purely local extraction with no hosted API dependency, use `crawl4ai` or `trafilatura`.

### \[`apify`\]

`apify` is the dynamic tool on this page: its callable surface depends on which Actors you register.

#### What It Does

`apify` does not expose one fixed method like the other tools on this page. Instead, it reads the configured Actor IDs and registers one tool function per Actor at startup. Each generated tool uses the Actor's input schema to build parameters and returns that Actor's dataset items as JSON. Without configured `actors`, there is no practical tool surface. This is best thought of as a hosted Actor adapter rather than a single scraper API.
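The per-Actor registration pattern can be sketched as follows. This is purely illustrative: the name derivation and schema handling here are hypothetical, not the installed implementation, so check the runtime tool list for the real callable names:

```
def register_actor_tools(actors, run_actor):
    """Build one callable per configured Actor ID (illustrative sketch)."""
    tools = {}
    for actor_id in actors:
        # Hypothetical derivation, e.g. "apify/rag-web-browser" -> "apify_rag_web_browser"
        name = actor_id.replace("/", "_").replace("-", "_")
        tools[name] = lambda actor=actor_id, **kwargs: run_actor(actor, kwargs)
    return tools
```

Each generated callable forwards its keyword arguments as the Actor input and returns whatever the Actor run produces.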

#### Configuration

| Option            | Type       | Required | Default | Notes                                                                                                                                                                                                   |
| ----------------- | ---------- | -------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `apify_api_token` | `password` | `yes`    | `null`  | Apify API token, with `APIFY_API_TOKEN` as the SDK fallback.                                                                                                                                            |
| `actors`          | `text`     | `yes`    | `null`  | Actor ID string such as `apify/rag-web-browser`. The current MindRoom metadata also claims comma-separated lists, but the installed upstream class treats a plain string as a single Actor ID. |

#### Example

```
agents:
  extractor:
    tools:
      - apify:
          actors: apify/rag-web-browser
```

#### Notes

- `actors` is the important field here, because it determines which functions actually exist at runtime.
- The current metadata advertises comma-separated Actor IDs, but the installed upstream constructor does not split plain strings, so the safest documented path on this branch is a single Actor ID.
- Generated tool names are derived from the Actor ID, so check the runtime tool list if you need the exact callable name.

### \[`brightdata`\]

`brightdata` is the hosted toolkit for markdown scraping, screenshots, SERP queries, and provider-specific web data feeds.

#### What It Does

`brightdata` exposes `scrape_as_markdown()`, `get_screenshot()`, `search_engine()`, and `web_data_feed()`. `scrape_as_markdown()` uses the configured web-unlocker zone and returns Markdown output. `get_screenshot()` returns a `ToolResult` with an image artifact instead of just raw text. `search_engine()` supports Google, Bing, and Yandex search through Bright Data's SERP infrastructure. `web_data_feed()` accesses Bright Data feed endpoints for supported source types. Zone selection is controlled by `serp_zone` and `web_unlocker_zone`, which can also be overridden by environment variables.

#### Configuration

| Option                   | Type       | Required | Default         | Notes                                                                        |
| ------------------------ | ---------- | -------- | --------------- | ---------------------------------------------------------------------------- |
| `api_key`                | `password` | `yes`    | `null`          | Bright Data API key, with `BRIGHT_DATA_API_KEY` as the SDK fallback.         |
| `enable_scrape_markdown` | `boolean`  | `no`     | `true`          | Enable `scrape_as_markdown()`.                                               |
| `enable_screenshot`      | `boolean`  | `no`     | `true`          | Enable `get_screenshot()`.                                                   |
| `enable_search_engine`   | `boolean`  | `no`     | `true`          | Enable `search_engine()`.                                                    |
| `enable_web_data_feed`   | `boolean`  | `no`     | `true`          | Enable `web_data_feed()`.                                                    |
| `all`                    | `boolean`  | `no`     | `false`         | Enable the full upstream toolkit surface.                                    |
| `serp_zone`              | `text`     | `no`     | `serp_api`      | SERP zone, with `BRIGHT_DATA_SERP_ZONE` able to override it.                 |
| `web_unlocker_zone`      | `text`     | `no`     | `web_unlocker1` | Web unlocker zone, with `BRIGHT_DATA_WEB_UNLOCKER_ZONE` able to override it. |
| `verbose`                | `boolean`  | `no`     | `false`         | Emit extra Bright Data request logging.                                      |
| `timeout`                | `number`   | `no`     | `600`           | Timeout in seconds.                                                          |

#### Example

```
agents:
  research:
    tools:
      - brightdata:
          enable_web_data_feed: false
          timeout: 300
```

```
scrape_as_markdown("https://matrix.org/blog/")
search_engine("Matrix hosting", engine="google", num_results=5)
```

#### Notes

- `brightdata` is the better fit than `firecrawl` when screenshots and feed endpoints matter.
- Zone environment variables can override the inline config values, so document your deployment defaults if multiple zones exist.
- `get_screenshot()` returns an image artifact rather than a file path string, which is useful for agents that need to hand the screenshot to a model immediately.

### \[`oxylabs`\]

`oxylabs` is the e-commerce and SERP-oriented scraper on this page.

#### What It Does

`oxylabs` exposes `search_google()`, `get_amazon_product()`, `search_amazon_products()`, and `scrape_website()`. It uses the Oxylabs realtime client for Google and Amazon scraping rather than a generic HTML fetch path. `search_google()` returns parsed organic results with title, URL, description, and position. The Amazon functions expose both product-detail and product-search workflows. `scrape_website()` is the generic fallback when you just want one URL scraped. This tool is credentialed with a username and password pair rather than one API key.

#### Configuration

| Option     | Type       | Required | Default | Notes                                                          |
| ---------- | ---------- | -------- | ------- | -------------------------------------------------------------- |
| `username` | `text`     | `yes`    | `null`  | Oxylabs username, with `OXYLABS_USERNAME` as the SDK fallback. |
| `password` | `password` | `yes`    | `null`  | Oxylabs password, with `OXYLABS_PASSWORD` as the SDK fallback. |

#### Example

```
agents:
  commerce:
    tools:
      - oxylabs
```

```
search_google("Matrix hosting", domain_code="com")
search_amazon_products("ergonomic keyboard", domain_code="com")
```

#### Notes

- `oxylabs` needs both `username` and `password`, so it is not a single-key setup like `firecrawl` or `brightdata`.
- Use `domain_code` to switch between regional Google and Amazon domains.
- Pick `oxylabs` when Google SERP plus Amazon data matters more than generic website crawling.

## Browser Tools

### \[`agentql`\]

`agentql` is the browser-assisted extractor for sites where you want AgentQL queries rather than plain text scraping.

#### What It Does

`agentql` exposes `scrape_website(url)` and, when enabled, `custom_scrape_website(url)`. `scrape_website()` uses a built-in query that extracts generic page text. `custom_scrape_website()` only becomes useful when `agentql_query` is non-empty. The installed upstream toolkit registers the custom scrape function automatically when `agentql_query` is set, even if `enable_custom_scrape_website` is false. The current upstream implementation launches Playwright with `headless=False`, which matters on headless-only runtimes.

#### Configuration

| Option                         | Type       | Required | Default | Notes                                                                 |
| ------------------------------ | ---------- | -------- | ------- | --------------------------------------------------------------------- |
| `api_key`                      | `password` | `yes`    | `null`  | AgentQL API key, with `AGENTQL_API_KEY` as the SDK fallback.          |
| `enable_scrape_website`        | `boolean`  | `no`     | `true`  | Enable `scrape_website()`.                                            |
| `enable_custom_scrape_website` | `boolean`  | `no`     | `false` | Enable `custom_scrape_website()`; only useful when `agentql_query` is also set. |
| `all`                          | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                             |
| `agentql_query`                | `text`     | `no`     | `""`    | Custom AgentQL query used by `custom_scrape_website()`.               |

#### Example

```
agents:
  extractor:
    tools:
      - agentql:
          agentql_query: |
            {
              title
              links[]
            }
```

```
scrape_website("https://matrix.org/blog/")
custom_scrape_website("https://matrix.org/blog/")
```

#### Notes

- The installed upstream code launches Playwright with `headless=False`, so this tool may need a GUI-capable runtime or virtual display.
- Setting `agentql_query` is enough to register the custom scrape function on this branch.
- Use `agentql` when you want AgentQL query semantics rather than a generic readable-text scraper.

### \[`browserbase`\]

`browserbase` is the hosted browser session tool for navigation, screenshots, and page-content reads over a remote browser.

#### What It Does

`browserbase` exposes `navigate_to()`, `screenshot()`, `get_page_content()`, and `close_session()`, plus async variants for async agent execution. The tool auto-creates a Browserbase session, stores its `connect_url`, and connects to it over Playwright CDP. `get_page_content()` returns visible cleaned text when `parse_html` is true and raw HTML when `parse_html` is false. Long page content is truncated to `max_content_length`. `base_url` configures the Browserbase API endpoint, not the website you want to visit. This is simpler than `browser` when you only need remote navigation, screenshots, and page reads.

#### Configuration

| Option                    | Type       | Required | Default  | Notes                                                                                        |
| ------------------------- | ---------- | -------- | -------- | -------------------------------------------------------------------------------------------- |
| `api_key`                 | `password` | `yes`    | `null`   | Browserbase API key, with `BROWSERBASE_API_KEY` as the SDK fallback.                         |
| `project_id`              | `text`     | `yes`    | `null`   | Browserbase project ID, with `BROWSERBASE_PROJECT_ID` as the SDK fallback.                   |
| `base_url`                | `url`      | `no`     | `null`   | Optional Browserbase API endpoint override, with `BROWSERBASE_BASE_URL` as the SDK fallback. |
| `enable_navigate_to`      | `boolean`  | `no`     | `true`   | Enable `navigate_to()`.                                                                      |
| `enable_screenshot`       | `boolean`  | `no`     | `true`   | Enable `screenshot()`.                                                                       |
| `enable_get_page_content` | `boolean`  | `no`     | `true`   | Enable `get_page_content()`.                                                                 |
| `enable_close_session`    | `boolean`  | `no`     | `true`   | Enable `close_session()`.                                                                    |
| `all`                     | `boolean`  | `no`     | `false`  | Enable the full upstream toolkit surface.                                                    |
| `parse_html`              | `boolean`  | `no`     | `true`   | Return cleaned visible text instead of raw HTML.                                             |
| `max_content_length`      | `number`   | `no`     | `100000` | Maximum returned character count for page content.                                           |

#### Example

```
agents:
  browser_worker:
    tools:
      - browserbase:
          parse_html: true
          max_content_length: 20000
```

```
navigate_to("https://matrix.org/blog/")
get_page_content()
```

#### Notes

- `browserbase` needs both `api_key` and `project_id`.
- It still depends on local Playwright support because the client connects to the remote browser over CDP.
- Use `browserbase` when you want a hosted browser session but do not need the broader local action surface of `browser`.

### \[`browser`\]

`browser` is MindRoom's local browser controller for multi-step browser sessions, snapshots, screenshots, PDFs, uploads, dialogs, and low-level actions.

#### What It Does

`browser` exposes one callable, `browser(action=...)`, with actions such as `status`, `start`, `stop`, `profiles`, `tabs`, `open`, `focus`, `close`, `snapshot`, `screenshot`, `navigate`, `console`, `pdf`, `upload`, `dialog`, and `act`. It manages named browser profiles, with `mindroom` as the default profile name. It creates tabs, tracks the active tab, records console entries, and resolves temporary element refs from `snapshot()` into later `act()` and `screenshot()` calls. `snapshot()` can return either `ai` or `aria` format. `act()` currently supports `click`, `type`, `press`, `hover`, `drag`, `select`, `fill`, `resize`, `wait`, `evaluate`, and `close`. Only `target="host"` is supported on this branch, so sandbox or node targeting fields currently return an error. If `output_dir` is unset, screenshots and PDFs are written under `<storage>/browser`. The runtime picks Chromium from `BROWSER_EXECUTABLE_PATH`, `chromium`, or `google-chrome-stable` when available.
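The Chromium lookup order described above can be sketched as a small resolver. This is an illustration of the documented order, not the runtime's actual code:

```
import os
import shutil

def resolve_chromium(env=None, which=shutil.which):
    """BROWSER_EXECUTABLE_PATH wins; otherwise the first binary found on PATH."""
    env = os.environ if env is None else env
    explicit = env.get("BROWSER_EXECUTABLE_PATH")
    if explicit:
        return explicit
    for candidate in ("chromium", "google-chrome-stable"):
        found = which(candidate)
        if found:
            return found
    return None
```

Setting `BROWSER_EXECUTABLE_PATH` therefore short-circuits the `PATH` search entirely.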

#### Configuration

| Option       | Type   | Required | Default | Notes                                                                                                                                |
| ------------ | ------ | -------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| `output_dir` | `text` | `no`     | `null`  | Optional directory for screenshots, PDFs, and other browser artifacts, with `<storage>/browser` as the runtime default when omitted. |

#### Example

```
agents:
  browser_worker:
    tools:
      - browser:
          output_dir: browser-artifacts
```

```
browser(action="open", targetUrl="https://matrix.org/blog/")
browser(action="snapshot", snapshotFormat="ai")
browser(action="act", request={"kind": "click", "ref": "e1"})
browser(action="screenshot", fullPage=True)
```

#### Notes

- This tool is local Playwright automation rather than a hosted browser API.

### \[`web_browser_tools`\]

`web_browser_tools` is the simplest browser-related tool on this page: it just opens a URL in the host's real browser.

#### What It Does

`web_browser_tools` exposes `open_page(url, new_window=False)`. It uses Python's standard-library `webbrowser` module to open a tab or window on the host operating system. It does not return page content, DOM state, screenshots, or automation handles. This makes it useful for human handoff or local desktop workflows, but not for scraping.

#### Configuration

| Option             | Type      | Required | Default | Notes                                     |
| ------------------ | --------- | -------- | ------- | ----------------------------------------- |
| `enable_open_page` | `boolean` | `no`     | `true`  | Enable `open_page()`.                     |
| `all`              | `boolean` | `no`     | `false` | Enable the full upstream toolkit surface. |

#### Example

```
agents:
  assistant:
    tools:
      - web_browser_tools
```

```
open_page("https://docs.mindroom.chat")
open_page("https://matrix.org/blog/", new_window=True)
```

#### Notes

- `web_browser_tools` only makes sense on a host that can launch a real browser window or tab.
- This tool is not a scraper and does not feed page content back to the model.
- Use `browser` or `browserbase` when you need browser automation or content returned to the agent.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration)

# Research Sources

Use these tools to query source-specific knowledge bases such as ArXiv, Wikipedia, PubMed, and Hacker News instead of doing general web search.

## What This Page Covers

This page documents the built-in tools in the `research-sources` group. Use these tools when you want paper-only search, encyclopedia summaries, biomedical literature lookup, or Hacker News story and user data.

## Tools On This Page

- \[`arxiv`\] - Search ArXiv and optionally download papers to extract page text.
- \[`wikipedia`\] - Fetch Wikipedia summaries, or update an injected knowledge base from Wikipedia.
- \[`pubmed`\] - Search PubMed for medical and life-science literature with concise or expanded result formatting.
- \[`hackernews`\] - Fetch top Hacker News stories and basic user details from the public API.

## Common Setup Notes

All four tools are `setup_type: none`, so they work out of the box and do not require API keys or OAuth. `src/mindroom/api/integrations.py` currently only exposes Spotify OAuth routes on this branch, so these tools have no dedicated dashboard auth flow. Missing optional Python dependencies can auto-install at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set. MindRoom does not add Matrix runtime-context behavior or worker-routing overrides for these tools. Use [Web Search](https://docs.mindroom.chat/tools/web-search/index.md) instead when you need broader web discovery, news search, or provider-backed search APIs.
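To opt out of automatic dependency installation, set the variable in the MindRoom process environment before startup, for example:

```
export MINDROOM_NO_AUTO_INSTALL_TOOLS=1
```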

## \[`arxiv`\]

`arxiv` searches ArXiv by query and can download selected PDFs to extract text from their pages.

### What It Does

By default `arxiv` exposes `search_arxiv_and_return_articles(query, num_articles=10)` and `read_arxiv_papers(id_list, pages_to_read=None)`. Search results are returned as JSON with title, short ID, entry URL, authors, categories, publish timestamp, PDF URL, links, summary, and comment. Reading papers downloads each PDF locally, parses it with `pypdf`, and returns the same metadata plus per-page extracted text.

### Configuration

| Option                     | Type      | Required | Default | Notes                                                                    |
| -------------------------- | --------- | -------- | ------- | ------------------------------------------------------------------------ |
| `enable_search_arxiv`      | `boolean` | `no`     | `true`  | Enable `search_arxiv_and_return_articles()`.                             |
| `enable_read_arxiv_papers` | `boolean` | `no`     | `true`  | Enable `read_arxiv_papers()`.                                            |
| `all`                      | `boolean` | `no`     | `false` | Enable the full upstream toolkit surface.                                |
| `download_dir`             | `text`    | `no`     | `null`  | Local directory where downloaded PDFs are stored before text extraction. |

### Example

```
agents:
  researcher:
    tools:
      - arxiv:
          download_dir: mindroom_data/arxiv
```

```
search_arxiv_and_return_articles("matrix protocol", num_articles=5)
read_arxiv_papers(["2103.03404v1"], pages_to_read=3)
```

### Notes

- `read_arxiv_papers()` expects ArXiv IDs such as `2103.03404v1`, not a free-text search query.
- If `download_dir` is not set, the upstream toolkit writes PDFs to its default local `arxiv_pdfs` directory before parsing them.
- Use `duckduckgo`, `googlesearch`, or `exa` from [Web Search](https://docs.mindroom.chat/tools/web-search/index.md) when you need broader search beyond ArXiv papers.

## \[`wikipedia`\]

`wikipedia` is the lightweight encyclopedia lookup tool for summary-style retrieval from Wikipedia.

### What It Does

In normal MindRoom usage `wikipedia` exposes `search_wikipedia(query)`, which returns one JSON document containing the queried title and `wikipedia.summary(query)` content. If an upstream `Knowledge` object is injected, the toolkit instead exposes `search_wikipedia_and_update_knowledge_base(topic)`, which inserts the topic into that knowledge base and returns relevant documents from it. This makes `wikipedia` a simple direct lookup tool by default, with an advanced knowledge-base update mode for custom integrations.

### Configuration

| Option      | Type      | Required | Default | Notes                                                                                                                                         |
| ----------- | --------- | -------- | ------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
| `knowledge` | `text`    | `no`     | `null`  | Advanced upstream hook for injecting a `Knowledge` object. In typical MindRoom YAML usage you leave this unset and use direct summary search. |
| `all`       | `boolean` | `no`     | `false` | Exposed in metadata, but the current upstream implementation does not change behavior for this toolkit.                                       |

### Example

```
agents:
  researcher:
    tools:
      - wikipedia
```

```
search_wikipedia("Matrix protocol")
```

### Notes

- `knowledge` is not a normal string option at runtime, so the usual MindRoom configuration is just `- wikipedia`.
- Search uses the upstream `wikipedia.summary()` call, so ambiguous topics work best with a specific query.
- Use [Web Search](https://docs.mindroom.chat/tools/web-search/index.md) when you need multiple result links or broader web coverage instead of one encyclopedia summary.

## \[`pubmed`\]

`pubmed` searches PubMed through NCBI E-utilities and formats article metadata for medical and life-science research.

### What It Does

`pubmed` exposes `search_pubmed(query, max_results=10)`. It first looks up PubMed IDs through `esearch`, then fetches article XML through `efetch`, and finally returns a JSON list of formatted result strings. Default output includes title, publication year, and summary text. When `results_expanded` is enabled, each result also includes first author, journal, publication type, DOI, PubMed URL, full-text URL when available, keywords, and MeSH terms.
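The esearch-then-efetch sequence can be sketched as follows. The endpoint URLs are the public NCBI E-utilities routes; the parsing helper and the inline sample XML are simplified assumptions for illustration, not MindRoom's actual implementation.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(query: str, max_results: int, email: str) -> str:
    # Step 1: look up PubMed IDs matching the query.
    params = {"db": "pubmed", "term": query, "retmax": max_results, "email": email}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"

def efetch_url(ids: list[str], email: str) -> str:
    # Step 2: fetch full article XML for those IDs.
    params = {"db": "pubmed", "id": ",".join(ids), "retmode": "xml", "email": email}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"

def titles_from_xml(xml_text: str) -> list[str]:
    # Simplified extraction of ArticleTitle elements from an efetch response.
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("ArticleTitle")]

# Offline demo with a trimmed-down efetch response.
sample = (
    "<PubmedArticleSet><PubmedArticle><MedlineCitation><Article>"
    "<ArticleTitle>CRISPR in the clinic</ArticleTitle>"
    "</Article></MedlineCitation></PubmedArticle></PubmedArticleSet>"
)
print(titles_from_xml(sample))  # ['CRISPR in the clinic']
```
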

### Configuration

| Option                 | Type      | Required | Default                  | Notes                                                                                                   |
| ---------------------- | --------- | -------- | ------------------------ | ------------------------------------------------------------------------------------------------------- |
| `email`                | `text`    | `no`     | `your_email@example.com` | Contact email sent to NCBI E-utilities. A real email is recommended even though no API key is required. |
| `max_results`          | `number`  | `no`     | `null`                   | Default result cap used when the call does not pass `max_results`.                                      |
| `results_expanded`     | `boolean` | `no`     | `false`                  | Return richer metadata instead of the concise title and summary format.                                 |
| `enable_search_pubmed` | `boolean` | `no`     | `true`                   | Enable `search_pubmed()`.                                                                               |
| `all`                  | `boolean` | `no`     | `false`                  | Enable the full upstream toolkit surface.                                                               |

### Example

```
agents:
  clinician:
    tools:
      - pubmed:
          email: research@example.com
          max_results: 5
          results_expanded: true
```

```
search_pubmed("CRISPR therapy", max_results=5)
```

### Notes

- `pubmed` does not need an API key, but the upstream client sends the configured `email` with requests to NCBI.
- Concise mode truncates long abstracts to about 200 characters, so use `results_expanded: true` when you need more context in each result.
- The tool returns a JSON list of formatted text blocks rather than a deeply nested article schema.

## \[`hackernews`\]

`hackernews` reads the public Hacker News Firebase API for top-story and user-profile data.

### What It Does

By default `hackernews` exposes `get_top_hackernews_stories(num_stories=10)` and `get_user_details(username)`. Top-story lookups return the raw story objects from the Hacker News item endpoint, with an extra `username` field copied from `by`. User lookups return a smaller JSON object with karma, about text, and total submitted item count.
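The two-request flow behind top-story lookups can be sketched like this. The endpoints are the public Hacker News Firebase routes; the injectable `fetch` parameter is an assumption added here so the sketch can be exercised offline, and the `username` copy mirrors the behavior described above.

```python
import json
from urllib.request import urlopen

HN = "https://hacker-news.firebaseio.com/v0"

def fetch_json(url: str):
    # Default network fetcher; swap in a stub for offline testing.
    with urlopen(url) as resp:
        return json.load(resp)

def top_stories(num_stories: int = 10, fetch=fetch_json) -> list[dict]:
    ids = fetch(f"{HN}/topstories.json")[:num_stories]
    stories = [fetch(f"{HN}/item/{item_id}.json") for item_id in ids]
    for story in stories:
        # Mirror the tool's extra `username` field copied from `by`.
        story["username"] = story.get("by")
    return stories

# Offline demo with a stubbed fetcher.
fake = {
    f"{HN}/topstories.json": [1, 2],
    f"{HN}/item/1.json": {"id": 1, "title": "First", "by": "pg"},
    f"{HN}/item/2.json": {"id": 2, "title": "Second", "by": "dang"},
}
print([s["username"] for s in top_stories(2, fetch=fake.get)])  # ['pg', 'dang']
```
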

### Configuration

| Option                    | Type      | Required | Default | Notes                                     |
| ------------------------- | --------- | -------- | ------- | ----------------------------------------- |
| `enable_get_top_stories`  | `boolean` | `no`     | `true`  | Enable `get_top_hackernews_stories()`.    |
| `enable_get_user_details` | `boolean` | `no`     | `true`  | Enable `get_user_details()`.              |
| `all`                     | `boolean` | `no`     | `false` | Enable the full upstream toolkit surface. |

### Example

```
agents:
  tech_watch:
    tools:
      - hackernews
```

```
get_top_hackernews_stories(num_stories=5)
get_user_details("pg")
```

### Notes

- This tool uses public Hacker News endpoints and does not need credentials.
- `get_top_hackernews_stories()` is best for front-page monitoring and lightweight discussion sourcing, not full web search.
- Pair it with [Web Search](https://docs.mindroom.chat/tools/web-search/index.md) or [Web Scraping & Browser](https://docs.mindroom.chat/tools/web-scraping-and-browser/index.md) when you want to follow story links and inspect the linked pages themselves.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Web Search](https://docs.mindroom.chat/tools/web-search/index.md)
- [Web Scraping & Browser](https://docs.mindroom.chat/tools/web-scraping-and-browser/index.md)

# AI & Generation

Use these tools to transcribe audio, generate images and videos, synthesize speech, and call provider-hosted media generation APIs.

## What This Page Covers

This page documents the built-in tools in the `ai-and-generation` group. Use these tools when you need OpenAI- or Google-style multimodal generation, provider-specific media APIs, or text-to-speech and audio workflows.

## Tools On This Page

- \[`openai`\] - OpenAI-backed transcription, image generation, and text-to-speech.
- \[`gemini`\] - Google-backed image generation and Vertex-only video generation.
- \[`groq`\] - Groq-backed audio transcription, translation, and speech generation.
- \[`replicate`\] - Replicate-hosted image or video generation from prompt-driven models.
- \[`fal`\] - Fal-hosted media generation and a fixed image-to-image workflow.
- \[`dalle`\] - Dedicated OpenAI DALL-E image generation.
- \[`cartesia`\] - Voice listing, voice localization, and text-to-speech.
- \[`eleven_labs`\] - Voice listing, sound effect generation, and text-to-speech.
- \[`desi_vocal`\] - Hindi and Indian-language voice listing and text-to-speech.
- \[`lumalabs`\] - Luma AI video generation and image-to-video workflows.
- \[`modelslabs`\] - ModelsLab media generation for MP4, GIF, MP3, and WAV outputs.

## Common Setup Notes

Every tool on this page is `status=requires_config` in the live registry and is meant to be configured with provider credentials. These tools do not use an `auth_provider`, and `src/mindroom/api/integrations.py` currently only exposes Spotify OAuth routes, so setup is done through stored tool credentials or provider SDK environment variables rather than a dedicated dashboard OAuth flow. Password fields such as `api_key` should be stored through the dashboard or credential store instead of inline YAML.

Missing optional dependencies can auto-install at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set. Most generation calls on this page return `ToolResult` media attachments rather than only raw text, so they are best suited to agents that can pass generated images, videos, or audio back to the user.

`openai` and `dalle` both use the OpenAI Python SDK and the same `OPENAI_API_KEY`, but they expose different tool surfaces. `gemini` uses `GOOGLE_API_KEY` in Gemini API mode, and MindRoom also maps the provider name `gemini` to shared Google credentials in its provider credential helpers. The current upstream SDK implementations also honor provider env vars such as `OPENAI_API_KEY`, `GOOGLE_API_KEY`, `GROQ_API_KEY`, `REPLICATE_API_KEY`, `FAL_API_KEY`, `CARTESIA_API_KEY`, `ELEVEN_LABS_API_KEY`, `DESI_VOCAL_API_KEY`, `LUMAAI_API_KEY`, and `MODELS_LAB_API_KEY`.
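If you prefer SDK environment variables over stored credentials, the setup can look like this sketch. The variable names come from the list above; the values are placeholders.

```shell
# Hypothetical environment setup before launching MindRoom.
export OPENAI_API_KEY="sk-example"        # shared by the openai and dalle tools
export GOOGLE_API_KEY="google-example"    # gemini in Gemini API mode
export GROQ_API_KEY="groq-example"
export MINDROOM_NO_AUTO_INSTALL_TOOLS=1   # opt out of auto-installing tool deps
```
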

## \[`openai`\]

`openai` is the general OpenAI media toolkit for audio transcription, image generation, and text-to-speech.

### What It Does

`openai` exposes `transcribe_audio(audio_path)`, `generate_image(prompt)`, and `generate_speech(text_input)`. `transcribe_audio()` expects a local file path and sends it to the configured transcription model, which defaults to `whisper-1`. `generate_image()` uses the configured `image_model`, defaults to `dall-e-3`, and returns attached image bytes rather than only a remote URL. The current implementation handles both `gpt-image-*` style models and older DALL-E response formats internally. `generate_speech()` uses the configured OpenAI TTS model, voice, and output format and returns an attached audio artifact.
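The dual response-format handling mentioned above can be pictured with this simplified stand-in, not the real implementation: it assumes `gpt-image-*` responses carry base64 `b64_json` data while older DALL-E responses carry a hosted `url`, which matches the OpenAI image API's documented response fields.

```python
import base64

def image_bytes_or_url(model: str, item: dict):
    """Return ('bytes', data) or ('url', link) depending on the model family."""
    if model.startswith("gpt-image"):
        # gpt-image-* models return base64-encoded image data.
        return "bytes", base64.b64decode(item["b64_json"])
    # Older DALL-E models return a hosted image URL by default.
    return "url", item["url"]

kind, payload = image_bytes_or_url(
    "gpt-image-1", {"b64_json": base64.b64encode(b"\x89PNG").decode()}
)
print(kind)  # bytes
```
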

### Configuration

| Option                     | Type       | Required | Default     | Notes                                                                |
| -------------------------- | ---------- | -------- | ----------- | -------------------------------------------------------------------- |
| `api_key`                  | `password` | `yes`    | `null`      | OpenAI API key. The upstream SDK also checks `OPENAI_API_KEY`.       |
| `enable_transcription`     | `boolean`  | `no`     | `true`      | Enable `transcribe_audio()`.                                         |
| `enable_image_generation`  | `boolean`  | `no`     | `true`      | Enable `generate_image()`.                                           |
| `enable_speech_generation` | `boolean`  | `no`     | `true`      | Enable `generate_speech()`.                                          |
| `all`                      | `boolean`  | `no`     | `false`     | Enable all three OpenAI media functions.                             |
| `transcription_model`      | `text`     | `no`     | `whisper-1` | Model used by `transcribe_audio()`.                                  |
| `text_to_speech_voice`     | `text`     | `no`     | `alloy`     | Default voice for `generate_speech()`.                               |
| `text_to_speech_model`     | `text`     | `no`     | `tts-1`     | Default TTS model for `generate_speech()`.                           |
| `text_to_speech_format`    | `text`     | `no`     | `mp3`       | Output format for generated speech, such as `mp3`, `wav`, or `opus`. |
| `image_model`              | `text`     | `no`     | `dall-e-3`  | Image generation model for `generate_image()`.                       |
| `image_quality`            | `text`     | `no`     | `null`      | Optional image quality override passed through to the API.           |
| `image_size`               | `text`     | `no`     | `null`      | Optional image size override passed through to the API.              |
| `image_style`              | `text`     | `no`     | `null`      | Optional image style override passed through to the API.             |

### Example

```
agents:
  creator:
    tools:
      - openai:
          transcription_model: whisper-1
          image_model: dall-e-3
          text_to_speech_voice: alloy
```

```
transcribe_audio("recordings/intro.wav")
generate_image("A retro-futurist Matrix control room with warm lighting.")
generate_speech("Status update complete.")
```

### Notes

- `openai` is the broad OpenAI media tool, while \[`dalle`\] is the narrower image-only wrapper.
- `transcribe_audio()` expects a readable local path, not a URL.
- If you only want image generation with explicit DALL-E-specific options like `n`, `size`, `quality`, and `style`, use \[`dalle`\] instead.

## \[`gemini`\]

`gemini` is the Google media toolkit for image generation through Imagen and video generation through Veo.

### What It Does

`gemini` exposes `generate_image(prompt)` and `generate_video(prompt)`. `generate_image()` uses the configured `image_generation_model`, which defaults to `imagen-3.0-generate-002`, and returns attached image bytes. `generate_video()` uses the configured `video_generation_model`, which defaults to `veo-2.0-generate-001`, polls until the long-running operation completes, and returns attached video artifacts. The current implementation requires Vertex AI mode for video generation and returns an error if `vertexai` is not enabled. In non-Vertex mode, the tool uses the Gemini API through `GOOGLE_API_KEY`.
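The long-running-operation polling described above follows a standard pattern, sketched here with a stand-in operation object rather than the real google-genai SDK types. The 5-second interval matches the behavior noted later in this section.

```python
import time

def wait_for_video(operation, refresh, poll_seconds: float = 5.0):
    # Poll the long-running operation until it reports completion,
    # then return its result.
    while not operation.done:
        time.sleep(poll_seconds)
        operation = refresh(operation)
    return operation.result

# Offline demo: a stand-in operation that finishes after two refreshes.
class FakeOp:
    def __init__(self, remaining):
        self.remaining = remaining
        self.done = remaining == 0
        self.result = "video-artifact" if self.done else None

print(wait_for_video(FakeOp(2), lambda op: FakeOp(op.remaining - 1),
                     poll_seconds=0))  # video-artifact
```
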

### Configuration

| Option                   | Type       | Required | Default                   | Notes                                                                                                   |
| ------------------------ | ---------- | -------- | ------------------------- | ------------------------------------------------------------------------------------------------------- |
| `api_key`                | `password` | `no`     | `null`                    | Google API key for Gemini API mode. Required unless `vertexai: true` with working Vertex configuration. |
| `vertexai`               | `boolean`  | `no`     | `false`                   | Use Vertex AI instead of the direct Gemini API. Required for `generate_video()`.                        |
| `project_id`             | `text`     | `no`     | `null`                    | Vertex project override. Falls back to `GOOGLE_CLOUD_PROJECT` when omitted.                             |
| `location`               | `text`     | `no`     | `null`                    | Vertex location override. Falls back to `GOOGLE_CLOUD_LOCATION` when omitted.                           |
| `image_generation_model` | `text`     | `no`     | `imagen-3.0-generate-002` | Model used by `generate_image()`.                                                                       |
| `video_generation_model` | `text`     | `no`     | `veo-2.0-generate-001`    | Model used by `generate_video()`.                                                                       |
| `enable_generate_image`  | `boolean`  | `no`     | `true`                    | Enable `generate_image()`.                                                                              |
| `enable_generate_video`  | `boolean`  | `no`     | `true`                    | Enable `generate_video()`.                                                                              |
| `all`                    | `boolean`  | `no`     | `false`                   | Enable both generation functions.                                                                       |

### Example

```
agents:
  studio:
    tools:
      - gemini:
          vertexai: true
          project_id: my-gcp-project
          location: us-central1
          image_generation_model: imagen-3.0-generate-002
          video_generation_model: veo-2.0-generate-001
```

```
generate_image("A minimal poster for a Matrix developer conference.")
generate_video("A slow cinematic flythrough of a neon data center.")
```

### Notes

- `generate_video()` only works in Vertex AI mode on this branch.
- In MindRoom's provider credential helpers, `gemini` maps to shared Google credentials rather than its own independent provider bucket.
- The current tool polls every 5 seconds until the video operation finishes, and that polling interval is not exposed as a tool config field.

## \[`groq`\]

`groq` is the audio-focused toolkit for fast transcription, translation, and speech generation.

### What It Does

`groq` exposes `transcribe_audio(audio_source)`, `translate_audio(audio_source)`, and `generate_speech(text_input)`. `transcribe_audio()` and `translate_audio()` accept either a local file path or a public URL. `translate_audio()` translates the source audio to English using the configured translation model. `generate_speech()` uses the configured Groq TTS model and voice and returns an attached WAV artifact. All three functions use the Groq SDK directly and require a Groq API key.

### Configuration

| Option                    | Type       | Required | Default            | Notes                                                      |
| ------------------------- | ---------- | -------- | ------------------ | ---------------------------------------------------------- |
| `api_key`                 | `password` | `yes`    | `null`             | Groq API key. The upstream SDK also checks `GROQ_API_KEY`. |
| `transcription_model`     | `text`     | `no`     | `whisper-large-v3` | Model used by `transcribe_audio()`.                        |
| `translation_model`       | `text`     | `no`     | `whisper-large-v3` | Model used by `translate_audio()`.                         |
| `tts_model`               | `text`     | `no`     | `playai-tts`       | Model used by `generate_speech()`.                         |
| `tts_voice`               | `text`     | `no`     | `Chip-PlayAI`      | Voice used by `generate_speech()`.                         |
| `enable_transcribe_audio` | `boolean`  | `no`     | `true`             | Enable `transcribe_audio()`.                               |
| `enable_translate_audio`  | `boolean`  | `no`     | `true`             | Enable `translate_audio()`.                                |
| `enable_generate_speech`  | `boolean`  | `no`     | `true`             | Enable `generate_speech()`.                                |
| `all`                     | `boolean`  | `no`     | `false`            | Enable all three audio functions.                          |

### Example

```
agents:
  audio:
    tools:
      - groq:
          transcription_model: whisper-large-v3
          tts_model: playai-tts
          tts_voice: Chip-PlayAI
```

```
transcribe_audio("samples/interview.mp3")
translate_audio("https://example.com/spanish-briefing.mp3")
generate_speech("Your transcript is ready.")
```

### Notes

- `transcribe_audio()` and `translate_audio()` are more flexible than \[`openai`\] because they accept either local files or public URLs.
- The current Groq TTS path always asks the API for `wav` output and returns an `audio/wav` artifact.
- Use \[`openai`\] instead if you want OpenAI Whisper or OpenAI TTS specifically.

## \[`replicate`\]

`replicate` is the generic Replicate wrapper for prompt-driven image or video generation.

### What It Does

`replicate` exposes one call, `generate_media(prompt)`. It runs the configured Replicate model with `input={"prompt": prompt}` and expects one `FileOutput` or an iterable of `FileOutput` objects. The current implementation infers whether each output is an image or a video from the returned file URL extension. Generated artifacts are attached by remote URL rather than downloaded into MindRoom-managed bytes.
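The extension-based inference can be sketched as below. The extension sets are illustrative assumptions, not the wrapper's exact lists.

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".gif"}
VIDEO_EXTS = {".mp4", ".webm", ".mov"}

def media_kind(url: str) -> str:
    # Infer the artifact type from the file extension in the output URL.
    ext = PurePosixPath(urlparse(url).path).suffix.lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    return "unknown"  # nonstandard model outputs fall through

print(media_kind("https://replicate.delivery/abc/out.mp4"))  # video
```
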

### Configuration

| Option                  | Type       | Required | Default            | Notes                                                                           |
| ----------------------- | ---------- | -------- | ------------------ | ------------------------------------------------------------------------------- |
| `api_key`               | `password` | `yes`    | `null`             | Replicate API key. The upstream implementation also checks `REPLICATE_API_KEY`. |
| `model`                 | `text`     | `no`     | `minimax/video-01` | Replicate model ref used by `generate_media()`.                                 |
| `enable_generate_media` | `boolean`  | `no`     | `true`             | Enable `generate_media()`.                                                      |
| `all`                   | `boolean`  | `no`     | `false`            | Enable the full toolkit, which is currently just `generate_media()`.            |

### Example

```
agents:
  video:
    tools:
      - replicate:
          model: minimax/video-01
```

```
generate_media("A short looping animation of code flowing across a terminal.")
```

### Notes

- The current wrapper only supports models that accept a single `prompt` input field.
- Output parsing depends on file extensions in returned URLs, so nonstandard model outputs can fail even if the Replicate run itself succeeds.
- Use \[`fal`\], \[`lumalabs`\], or \[`modelslabs`\] instead when you want a narrower wrapper with a more opinionated provider-specific flow.

## \[`fal`\]

`fal` is the Fal wrapper for prompt-driven media generation plus a dedicated image-to-image path.

### What It Does

`fal` exposes `generate_media(prompt)` and, when enabled, `image_to_image(prompt, image_url=None)`. `generate_media()` calls `fal_client.subscribe()` with the configured `model` and a single `prompt` argument and returns the first `image` or `video` URL from the provider result. `image_to_image()` is a separate fixed workflow that always uses `fal-ai/flux/dev/image-to-image` rather than the configured `model`. The current implementation streams queue log messages to the MindRoom process logs while the job is running.
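Extracting the first image or video URL from a provider result can be sketched as follows. The payload shape here is an assumption for illustration: either an `images` list of `{"url": ...}` dicts or a single `video` dict.

```python
def first_media_url(result: dict):
    # Pull the first image or video URL from a Fal-style result payload.
    images = result.get("images") or []
    if images:
        return "image", images[0]["url"]
    video = result.get("video")
    if video:
        return "video", video["url"]
    return None  # no recognizable media in the result

print(first_media_url({"video": {"url": "https://fal.media/out.mp4"}}))
# ('video', 'https://fal.media/out.mp4')
```
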

### Configuration

| Option                  | Type       | Required | Default                | Notes                                                               |
| ----------------------- | ---------- | -------- | ---------------------- | ------------------------------------------------------------------- |
| `api_key`               | `password` | `yes`    | `null`                 | Fal API key. The upstream implementation also checks `FAL_API_KEY`. |
| `model`                 | `text`     | `no`     | `fal-ai/hunyuan-video` | Model used by `generate_media()`.                                   |
| `enable_generate_media` | `boolean`  | `no`     | `true`                 | Enable `generate_media()`.                                          |
| `enable_image_to_image` | `boolean`  | `no`     | `false`                | Enable `image_to_image()`.                                          |
| `all`                   | `boolean`  | `no`     | `false`                | Enable both Fal functions.                                          |

### Example

```
agents:
  visuals:
    tools:
      - fal:
          model: fal-ai/hunyuan-video
          enable_image_to_image: true
```

```
generate_media("A cinematic drone shot over a rainy cyberpunk street.")
image_to_image(
    "Turn this product photo into a watercolor illustration.",
    image_url="https://example.com/source.png",
)
```

### Notes

- `model` only affects `generate_media()`.
- `image_to_image()` ignores `model` and always calls Fal's `fal-ai/flux/dev/image-to-image` route on this branch.
- Returned media are attached by remote URL rather than stored bytes.

## \[`dalle`\]

`dalle` is the dedicated DALL-E image generation wrapper.

### What It Does

`dalle` exposes one call, `create_image(prompt)`. It uses the OpenAI image API directly with the configured `model`, `n`, `size`, `quality`, and `style`. Unlike \[`openai`\], this wrapper is image-only and exposes DALL-E-specific request options directly in the tool config. Generated images are returned as provider-hosted URLs with optional revised prompts when the API supplies them.
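The pre-flight constraints can be pictured with this illustrative validator. The allowed-size sets mirror OpenAI's documented DALL-E sizes; the tool's exact validation set may differ.

```python
ALLOWED_SIZES = {
    "dall-e-2": {"256x256", "512x512", "1024x1024"},
    "dall-e-3": {"1024x1024", "1792x1024", "1024x1792"},
}

def validate_request(model: str, n: int, size: str) -> None:
    # Reject unsupported models, counts, and sizes before calling the API.
    if model not in ALLOWED_SIZES:
        raise ValueError(f"model must be one of {sorted(ALLOWED_SIZES)}")
    if model == "dall-e-3" and n != 1:
        raise ValueError("dall-e-3 only supports n=1")
    if size not in ALLOWED_SIZES[model]:
        raise ValueError(f"{size} is not valid for {model}")

validate_request("dall-e-3", 1, "1792x1024")  # passes silently
```
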

### Configuration

| Option                | Type       | Required | Default     | Notes                                                                                                    |
| --------------------- | ---------- | -------- | ----------- | -------------------------------------------------------------------------------------------------------- |
| `model`               | `text`     | `no`     | `dall-e-3`  | DALL-E model used by `create_image()`. The current implementation only accepts `dall-e-3` or `dall-e-2`. |
| `n`                   | `number`   | `no`     | `1`         | Number of images to request. `dall-e-3` only supports `1` in the current implementation.                 |
| `size`                | `text`     | `no`     | `1024x1024` | Output size. The current implementation validates it against a fixed allowed set.                        |
| `quality`             | `text`     | `no`     | `standard`  | Image quality, currently `standard` or `hd`.                                                             |
| `style`               | `text`     | `no`     | `vivid`     | Image style, currently `vivid` or `natural`.                                                             |
| `api_key`             | `password` | `yes`    | `null`      | OpenAI API key. The upstream SDK also checks `OPENAI_API_KEY`.                                           |
| `enable_create_image` | `boolean`  | `no`     | `true`      | Enable `create_image()`.                                                                                 |
| `all`                 | `boolean`  | `no`     | `false`     | Enable the full toolkit, which is currently just `create_image()`.                                       |

### Example

```
agents:
  illustrator:
    tools:
      - dalle:
          model: dall-e-3
          size: 1792x1024
          quality: hd
          style: vivid
```

```
create_image("A cover illustration for a Matrix automation handbook.")
```

### Notes

- Use \[`dalle`\] when you want explicit DALL-E request controls instead of the broader \[`openai`\] toolkit.
- `dall-e-3` plus `n > 1` is rejected before the API call.
- The current implementation does not expose image edits, variations, or `response_format` controls.

## \[`cartesia`\]

`cartesia` is the voice toolkit for listing voices, localizing voices into new languages, and generating speech.

### What It Does

`cartesia` exposes `list_voices()`, `localize_voice(name, description, language, original_speaker_gender, voice_id=None)`, and `text_to_speech(transcript, voice_id=None)`. `list_voices()` returns a filtered JSON list of voice IDs, names, descriptions, and languages. `localize_voice()` creates a localized derivative of an existing voice, using `default_voice_id` unless you pass a different `voice_id`. `text_to_speech()` uses the configured `model_id` and voice ID and returns attached MP3 audio bytes. The current implementation hardcodes MP3 output at 44.1 kHz and 128 kbps.

### Configuration

| Option                  | Type       | Required | Default                                | Notes                                                                                       |
| ----------------------- | ---------- | -------- | -------------------------------------- | ------------------------------------------------------------------------------------------- |
| `api_key`               | `password` | `yes`    | `null`                                 | Cartesia API key. The upstream SDK also checks `CARTESIA_API_KEY`.                          |
| `model_id`              | `text`     | `no`     | `sonic-2`                              | Model used by `text_to_speech()`.                                                           |
| `default_voice_id`      | `text`     | `no`     | `78ab82d5-25be-4f7d-82b3-7ad64e5b85b2` | Default source voice for localization and TTS when no call-specific `voice_id` is supplied. |
| `enable_text_to_speech` | `boolean`  | `no`     | `true`                                 | Enable `text_to_speech()`.                                                                  |
| `enable_list_voices`    | `boolean`  | `no`     | `true`                                 | Enable `list_voices()`.                                                                     |
| `enable_localize_voice` | `boolean`  | `no`     | `false`                                | Enable `localize_voice()`.                                                                  |
| `all`                   | `boolean`  | `no`     | `false`                                | Enable all Cartesia functions.                                                              |

### Example

```
agents:
  voice:
    tools:
      - cartesia:
          model_id: sonic-2
          enable_localize_voice: true
```

```
list_voices()
localize_voice(
    name="French Support Voice",
    description="Warm and clear support voice.",
    language="fr",
    original_speaker_gender="female",
)
text_to_speech("Deployment complete.")
```

### Notes

- `localize_voice()` is disabled by default, so voice cloning or localization is opt-in.
- `voice_id` can be overridden per call for both `localize_voice()` and `text_to_speech()`.
- The current TTS path always returns MP3 bytes even though the tool config does not expose an output-format option.

## \[`eleven_labs`\]

`eleven_labs` is the ElevenLabs toolkit for voices, sound effects, and text-to-speech.

### What It Does

`eleven_labs` exposes `get_voices()`, `generate_sound_effect(prompt, duration_seconds=None)`, and `text_to_speech(prompt)`. `get_voices()` returns voice IDs, names, and descriptions from the ElevenLabs account. `generate_sound_effect()` turns a text description into an attached audio artifact. `text_to_speech()` uses the configured `voice_id`, `model_id`, and `output_format` and returns attached audio bytes. If `target_directory` is set, the current implementation also saves generated audio files to disk in that directory.

### Configuration

| Option                         | Type       | Required | Default                  | Notes                                                                              |
| ------------------------------ | ---------- | -------- | ------------------------ | ---------------------------------------------------------------------------------- |
| `voice_id`                     | `text`     | `no`     | `JBFqnCBsd6RMkjVDRZzb`   | Default voice used by `text_to_speech()`.                                          |
| `api_key`                      | `password` | `yes`    | `null`                   | ElevenLabs API key. The upstream implementation also checks `ELEVEN_LABS_API_KEY`. |
| `target_directory`             | `text`     | `no`     | `null`                   | Optional directory where generated audio is also saved locally.                    |
| `model_id`                     | `text`     | `no`     | `eleven_multilingual_v2` | Model used by `text_to_speech()`.                                                  |
| `output_format`                | `text`     | `no`     | `mp3_44100_64`           | Output codec and bitrate preset for generated audio.                               |
| `enable_get_voices`            | `boolean`  | `no`     | `true`                   | Enable `get_voices()`.                                                             |
| `enable_generate_sound_effect` | `boolean`  | `no`     | `true`                   | Enable `generate_sound_effect()`.                                                  |
| `enable_text_to_speech`        | `boolean`  | `no`     | `true`                   | Enable `text_to_speech()`.                                                         |
| `all`                          | `boolean`  | `no`     | `false`                  | Enable all ElevenLabs functions.                                                   |

### Example

```
agents:
  audio_fx:
    tools:
      - eleven_labs:
          model_id: eleven_multilingual_v2
          output_format: mp3_44100_64
          target_directory: generated-audio
```

```
get_voices()
generate_sound_effect("Mechanical keyboard typing in a quiet office.", duration_seconds=4)
text_to_speech("The build succeeded.")
```

### Notes

- `target_directory` is optional and only affects local file saving, not the returned attachment.
- The current implementation always emits `audio/mpeg` artifacts, even when you choose a PCM- or u-law-style output format.
- `generate_sound_effect()` is useful when you want non-speech audio from the same provider toolkit.

## \[`desi_vocal`\]

`desi_vocal` is the speech toolkit for Hindi and other Indian-language voices.

### What It Does

`desi_vocal` exposes `get_voices()` and `text_to_speech(prompt, voice_id=None)`. `get_voices()` returns a provider voice list with ID, name, gender, voice type, supported languages, and preview URL. `text_to_speech()` posts the prompt to DesiVocal's generation API and returns the resulting audio as a remote URL attachment. The default `voice_id` can be overridden per call.

### Configuration

| Option                  | Type       | Required | Default                                | Notes                                                                                                                                 |
| ----------------------- | ---------- | -------- | -------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| `api_key`               | `password` | `yes`    | `null`                                 | DesiVocal API key. The current TTS request sends it as `X_API_KEY`, and the upstream implementation also checks `DESI_VOCAL_API_KEY`. |
| `voice_id`              | `text`     | `no`     | `f27d74e5-ea71-4697-be3e-f04bbd80c1a8` | Default voice used by `text_to_speech()`.                                                                                             |
| `enable_get_voices`     | `boolean`  | `no`     | `true`                                 | Enable `get_voices()`.                                                                                                                |
| `enable_text_to_speech` | `boolean`  | `no`     | `true`                                 | Enable `text_to_speech()`.                                                                                                            |
| `all`                   | `boolean`  | `no`     | `false`                                | Enable both DesiVocal functions.                                                                                                      |

### Example

```
agents:
  hindi_voice:
    tools:
      - desi_vocal:
          voice_id: f27d74e5-ea71-4697-be3e-f04bbd80c1a8
```

```
get_voices()
text_to_speech("नमस्ते, आपकी रिपोर्ट तैयार है।")
```

### Notes

- This is the most language-specific TTS tool on this page and is the best fit when you want Hindi or Indian-language voices.
- The current `get_voices()` implementation reads a public voice list endpoint, but `text_to_speech()` needs the API key.
- Generated audio is returned as a provider-hosted URL rather than inline bytes.

## \[`lumalabs`\]

`lumalabs` is the Luma AI toolkit for text-to-video and image-to-video generation.

### What It Does

`lumalabs` exposes `generate_video(prompt, loop=False, aspect_ratio="16:9", keyframes=None)` and `image_to_video(prompt, start_image_url, end_image_url=None, loop=False, aspect_ratio="16:9")`. Both calls create a Luma generation job and poll until it completes or times out. `generate_video()` optionally accepts provider-style keyframes, while `image_to_video()` builds the required keyframe structure from one or two image URLs. Completed jobs return remote video URL attachments. If `wait_for_completion` is false, the current implementation returns `Async generation unsupported`.
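
The wait-and-poll options combine into a simple poll loop. A minimal sketch of that shape, with `check_status` standing in as a placeholder for the provider status call (not the actual Luma client):

```python
import time

def poll_until_complete(check_status, poll_interval=3.0, max_wait_time=300.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll a job-status callable until it reports completion or the deadline passes.

    `check_status` is a placeholder for whatever returns the provider job state;
    here it is assumed to yield "completed", "failed", or "pending".
    """
    deadline = clock() + max_wait_time
    while clock() < deadline:
        state = check_status()
        if state == "completed":
            return "completed"
        if state == "failed":
            return "failed"
        sleep(poll_interval)  # wait poll_interval seconds between status checks
    return "timeout"
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting; the real tool presumably polls wall-clock time directly.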

### Configuration

| Option                  | Type       | Required | Default | Notes                                                                                                                              |
| ----------------------- | ---------- | -------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| `api_key`               | `password` | `yes`    | `null`  | Luma AI API key. The upstream implementation also checks `LUMAAI_API_KEY`.                                                         |
| `wait_for_completion`   | `boolean`  | `no`     | `true`  | Poll until the provider job completes. Setting it to `false` is not useful on this branch because async return is not implemented. |
| `poll_interval`         | `number`   | `no`     | `3`     | Seconds between status polls.                                                                                                      |
| `max_wait_time`         | `number`   | `no`     | `300`   | Maximum wait time in seconds before timing out.                                                                                    |
| `enable_generate_video` | `boolean`  | `no`     | `true`  | Enable `generate_video()`.                                                                                                         |
| `enable_image_to_video` | `boolean`  | `no`     | `true`  | Enable `image_to_video()`.                                                                                                         |
| `all`                   | `boolean`  | `no`     | `false` | Enable both Luma functions.                                                                                                        |

### Example

```
agents:
  motion:
    tools:
      - lumalabs:
          poll_interval: 5
          max_wait_time: 600
```

```
generate_video("A calm flythrough of a futuristic coworking space.", aspect_ratio="16:9")
image_to_video(
    "Animate this concept art into a short reveal shot.",
    start_image_url="https://example.com/frame0.png",
    end_image_url="https://example.com/frame1.png",
)
```

### Notes

- `image_to_video()` requires remote image URLs, not local file paths.
- `wait_for_completion: false` does not currently provide a job handle or async response.
- Use \[`gemini`\] instead when you specifically want Google's Veo-backed video path.

## \[`modelslabs`\]

`modelslabs` is the ModelsLab wrapper for provider-hosted MP4, GIF, MP3, or WAV generation.

### What It Does

`modelslabs` exposes one call, `generate_media(prompt)`. The current wrapper chooses one of several provider endpoints based on `file_type` and sends a fixed payload template for that media class. For MP4 and GIF generation, it currently uses the provider's text-to-video endpoint and returns future-link URLs with an ETA. For MP3 and WAV generation, it uses provider voice endpoints and returns audio URLs. If `wait_for_completion` is enabled, the tool polls the provider fetch endpoint until the media is ready or the timeout is reached.
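
Reading the option table below, the effective wait budget is the provider ETA plus `add_to_eta`, capped by `max_wait_time`. A sketch of that arithmetic (an assumption inferred from the option descriptions, not the wrapper's actual code):

```python
def fetch_deadline_seconds(provider_eta, add_to_eta=15, max_wait_time=60):
    """Combine the provider ETA with the configured padding, capped by max_wait_time.

    This mirrors the option descriptions as a sketch; the real wrapper may
    compute its polling budget differently.
    """
    return min(provider_eta + add_to_eta, max_wait_time)
```

For example, a 30-second provider ETA with the defaults yields a 45-second budget, while a 90-second ETA is capped at 60 seconds.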

### Configuration

| Option                | Type       | Required | Default | Notes                                                                                               |
| --------------------- | ---------- | -------- | ------- | --------------------------------------------------------------------------------------------------- |
| `api_key`             | `password` | `yes`    | `null`  | ModelsLab API key. The upstream implementation also checks `MODELS_LAB_API_KEY`.                    |
| `file_type`           | `text`     | `no`     | `mp4`   | Output type for `generate_media()`, currently `mp4`, `gif`, or audio types such as `mp3` and `wav`. |
| `wait_for_completion` | `boolean`  | `no`     | `false` | Poll the provider fetch endpoint until the output is ready.                                         |
| `add_to_eta`          | `number`   | `no`     | `15`    | Extra seconds added to the provider ETA before timing out.                                          |
| `max_wait_time`       | `number`   | `no`     | `60`    | Maximum total wait time in seconds.                                                                 |

### Example

```
agents:
  generator:
    tools:
      - modelslabs:
          file_type: gif
          wait_for_completion: true
          max_wait_time: 90
```

```
generate_media("A looping animation of messages flowing through a Matrix bridge.")
```

### Notes

- Despite the broad provider branding, the current wrapper exposes one opinionated `generate_media()` path rather than a generic arbitrary-model interface.
- MP4 and GIF generation currently use a fixed provider-side video template, including default dimensions and a hardcoded model ID.
- Returned media are provider URLs, and the success message usually includes the provider ETA rather than immediate ready-to-view bytes.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [OpenAI-Compatible API](https://docs.mindroom.chat/openai-api/index.md)

# Media & Content

Use these tools to process local video files, search GIFs and stock images, inspect YouTube videos, fetch brand assets, and work with Spotify content.

## What This Page Covers

This page documents the built-in tools in the `media-and-content` group. Use these tools when you need local video processing, media lookup, brand asset retrieval, or Spotify-backed search and playlist workflows.

## Tools On This Page

- \[`moviepy_video_tools`\] - Local video helpers for audio extraction, SRT creation, and caption burn-in.
- \[`giphy`\] - GIF search that returns Giphy-hosted animated images.
- \[`youtube`\] - YouTube URL inspection for video metadata, captions, and timestamped transcript lines.
- \[`unsplash`\] - Stock photo search and photo metadata lookup from Unsplash.
- \[`brandfetch`\] - Brand asset and identity lookup by domain, brand ID, ISIN, stock ticker, or brand name.
- \[`spotify`\] - Spotify search, playlist, profile, recommendation, and playback actions.

## Common Setup Notes

`moviepy_video_tools` and `youtube` are `setup_type: none`, so they do not need dashboard OAuth or stored API credentials. `giphy`, `unsplash`, `brandfetch`, and `spotify` all use stored credentials, and password-type fields such as `api_key`, `access_key`, and `access_token` should be managed through the dashboard or credential store instead of inline YAML. The upstream toolkits for `giphy`, `unsplash`, and `brandfetch` also fall back to provider-specific environment variables such as `GIPHY_API_KEY`, `UNSPLASH_ACCESS_KEY`, `BRANDFETCH_API_KEY`, and `BRANDFETCH_CLIENT_ID`.

These tools operate on external URLs or local file paths rather than Matrix attachment IDs directly. When you pass local files, the paths must exist inside the runtime that executes the tool. Missing optional Python dependencies can auto-install at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set; that matters most on this page for `moviepy_video_tools`, `giphy`, `brandfetch`, `spotify`, and `youtube`.

`spotify` is the only tool on this page with dedicated integration routes in `src/mindroom/api/integrations.py`. MindRoom treats `spotify` as a shared-only integration, so dashboard credential management and tool support require `worker_scope` unset or `shared`, not `user` or `user_agent`.

## \[`moviepy_video_tools`\]

`moviepy_video_tools` is the local video-processing toolkit for extracting audio, saving SRT text, and burning captions into a rendered video file.

### What It Does

`moviepy_video_tools` exposes `extract_audio(video_path, output_path)`, `create_srt(transcription, output_path)`, and `embed_captions(video_path, srt_path, output_path=None, font_size=24, font_color="white", stroke_color="black", stroke_width=1)`. Despite the `enable_process_video` config name, the current upstream method it enables is specifically `extract_audio()`, not a general-purpose video editing surface. `create_srt()` writes the provided transcription text directly to disk, so it expects the caller to already have SRT-formatted content. `embed_captions()` reads an SRT file, converts it to word timings, and renders word-highlighted captions onto a new MP4 output. This tool works entirely on local files, so it is only useful when the agent runtime can read the source media and write the output paths.
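
Because `create_srt()` writes its input verbatim, the caller must supply SRT-formatted text. A hypothetical helper (not part of the toolkit) that builds that text from `(start_seconds, end_seconds, text)` cues:

```python
def to_srt(cues):
    """Format (start_seconds, end_seconds, text) cues as SRT text.

    `create_srt()` writes its input verbatim, so the caller is responsible
    for producing text in this shape first.
    """
    def stamp(seconds):
        ms = round(seconds * 1000)
        h, rem = divmod(ms, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    blocks = []
    for i, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{i}\n{stamp(start)} --> {stamp(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"
```

Passing the result of `to_srt(...)` as the `transcription` argument produces a valid `.srt` file that `embed_captions()` can consume.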

### Configuration

| Option                     | Type      | Required | Default | Notes                                     |
| -------------------------- | --------- | -------- | ------- | ----------------------------------------- |
| `enable_process_video`     | `boolean` | `no`     | `true`  | Enable `extract_audio()`.                 |
| `enable_generate_captions` | `boolean` | `no`     | `true`  | Enable `create_srt()`.                    |
| `enable_embed_captions`    | `boolean` | `no`     | `true`  | Enable `embed_captions()`.                |
| `all`                      | `boolean` | `no`     | `false` | Enable the full upstream toolkit surface. |

### Example

```
agents:
  editor:
    tools:
      - moviepy_video_tools:
          enable_embed_captions: true
```

```
extract_audio("clips/demo.mp4", "clips/demo.wav")
create_srt(transcription_srt, "clips/demo.srt")
embed_captions("clips/demo.mp4", "clips/demo.srt", output_path="clips/demo_captioned.mp4")
```

### Notes

- `moviepy` is the declared Python dependency, and the upstream toolkit also expects FFmpeg support for real audio and video processing.
- `embed_captions()` defaults the output filename to `<video>_captioned.mp4` when `output_path` is omitted.
- Use this tool for simple local media transforms, not remote video discovery or hosting.

## \[`giphy`\]

`giphy` searches Giphy for animated GIFs and returns image artifacts that agents can reuse in a response.

### What It Does

`giphy` exposes `search_gifs(query)`. The upstream method signature includes the active agent or team object, but MindRoom callers only provide the search query because the runtime injects the current tool context. Successful calls return a `ToolResult` with both plain-text URLs and attached image artifacts for each GIF. `limit` is fixed at toolkit construction time, so callers do not set result count per request.

### Configuration

| Option               | Type       | Required | Default | Notes                                                            |
| -------------------- | ---------- | -------- | ------- | ---------------------------------------------------------------- |
| `api_key`            | `password` | `yes`    | `null`  | Giphy API key. The upstream toolkit also checks `GIPHY_API_KEY`. |
| `limit`              | `number`   | `no`     | `1`     | Number of GIFs returned per search.                              |
| `enable_search_gifs` | `boolean`  | `no`     | `true`  | Enable `search_gifs()`.                                          |
| `all`                | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                        |

### Example

```
agents:
  social:
    tools:
      - giphy:
          limit: 3
```

```
search_gifs("matrix code review celebration")
```

### Notes

- The upstream metadata marks `api_key` as optional, but successful requests effectively require a real Giphy API key, which is why it is listed as required above.
- `search_gifs()` returns hosted GIF URLs, not downloaded local files.
- Use this when you want animated reaction media rather than stock photography or brand assets.

## \[`youtube`\]

`youtube` works from a YouTube video URL and extracts metadata, captions, or timestamped transcript lines.

### What It Does

`youtube` exposes `get_youtube_video_data(url)`, `get_youtube_video_captions(url)`, and `get_video_timestamps(url)`. `get_youtube_video_data()` uses YouTube's oEmbed endpoint and returns metadata such as title, author, thumbnail, size, and provider fields. `get_youtube_video_captions()` and `get_video_timestamps()` use `youtube_transcript_api` against the parsed video ID. The current tool does not perform keyword-based YouTube search. It expects a specific YouTube URL and then fetches metadata or transcript-derived output for that video.
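
For reference, the kind of video-ID parsing involved can be sketched like this; the helper name and the URL shapes covered are illustrative, not the toolkit's actual parser:

```python
from urllib.parse import urlparse, parse_qs

def youtube_video_id(url):
    """Extract the video ID from common YouTube URL shapes (illustrative only)."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    if host.endswith("youtu.be"):
        # Short links carry the ID as the first path segment.
        return parsed.path.lstrip("/").split("/")[0] or None
    if "youtube.com" in host:
        if parsed.path == "/watch":
            # Standard watch URLs carry the ID in the `v` query parameter.
            return parse_qs(parsed.query).get("v", [None])[0]
        for prefix in ("/embed/", "/shorts/"):
            if parsed.path.startswith(prefix):
                return parsed.path[len(prefix):].split("/")[0]
    return None
```

URLs that do not resolve to a video ID are exactly the cases where the toolkit returns its plain-text error strings.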

### Configuration

| Option                        | Type       | Required | Default | Notes                                                         |
| ----------------------------- | ---------- | -------- | ------- | ------------------------------------------------------------- |
| `enable_get_video_captions`   | `boolean`  | `no`     | `true`  | Enable `get_youtube_video_captions()`.                        |
| `enable_get_video_data`       | `boolean`  | `no`     | `true`  | Enable `get_youtube_video_data()`.                            |
| `enable_get_video_timestamps` | `boolean`  | `no`     | `true`  | Enable `get_video_timestamps()`.                              |
| `all`                         | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                     |
| `languages`                   | `string[]` | `no`     | `null`  | Preferred transcript languages, for example `["en", "es"]`.   |
| `proxies`                     | `text`     | `no`     | `null`  | Optional proxy mapping forwarded to `youtube_transcript_api`. |

### Example

```
agents:
  researcher:
    tools:
      - youtube:
          languages: [en]
```

```
get_youtube_video_data("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
get_youtube_video_captions("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
get_video_timestamps("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
```

### Notes

- If you need keyword-based YouTube discovery rather than URL-based transcript or metadata extraction, use a search tool such as `serpapi` instead of `youtube`.
- `languages` only affects transcript retrieval methods, not `get_youtube_video_data()`.
- Invalid or unsupported URLs return plain-text error strings from the upstream toolkit.

## \[`unsplash`\]

`unsplash` searches Unsplash for stock photography, fetches one photo's metadata, or requests random photo selections.

### What It Does

`unsplash` exposes `search_photos(query, per_page=10, page=1, orientation=None, color=None)`, `get_photo(photo_id)`, `get_random_photo(query=None, orientation=None, count=1)`, and optionally `download_photo(photo_id)`. `search_photos()` returns a JSON payload with total counts plus a simplified list of photo metadata, author info, and image URLs. `get_photo()` adds extra fields such as EXIF data, views, downloads, and location when the API returns them. `get_random_photo()` supports an optional query filter and returns one or more formatted photo records. `download_photo()` does not fetch the image binary. It triggers Unsplash's required download-tracking endpoint and returns the download URL that the caller can fetch separately.
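
Since the optional filters are simply omitted when unset, a call like `search_photos(...)` maps to query parameters along these lines (a hypothetical illustration, not the toolkit's request code):

```python
def search_params(query, per_page=10, page=1, orientation=None, color=None):
    """Assemble query parameters, leaving out optional filters that are unset."""
    params = {"query": query, "per_page": per_page, "page": page}
    if orientation is not None:
        params["orientation"] = orientation
    if color is not None:
        params["color"] = color
    return params
```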

### Configuration

| Option                    | Type       | Required | Default | Notes                                                                        |
| ------------------------- | ---------- | -------- | ------- | ---------------------------------------------------------------------------- |
| `access_key`              | `password` | `yes`    | `null`  | Unsplash access key. The upstream toolkit also checks `UNSPLASH_ACCESS_KEY`. |
| `enable_search_photos`    | `boolean`  | `no`     | `true`  | Enable `search_photos()`.                                                    |
| `enable_get_photo`        | `boolean`  | `no`     | `true`  | Enable `get_photo()`.                                                        |
| `enable_get_random_photo` | `boolean`  | `no`     | `true`  | Enable `get_random_photo()`.                                                 |
| `enable_download_photo`   | `boolean`  | `no`     | `false` | Enable `download_photo()`.                                                   |
| `all`                     | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                                    |

### Example

```
agents:
  designer:
    tools:
      - unsplash:
          enable_download_photo: true
```

```
search_photos("conference stage lighting", per_page=5, orientation="landscape")
get_random_photo(query="workspace desk", count=3)
get_photo("abcd1234")
```

### Notes

- `download_photo()` is off by default because it exists mainly for Unsplash API compliance and usage tracking.
- The tool returns URLs and metadata, not local downloaded image files.
- Use `unsplash` for stock photography, not logos or brand identity assets.

## \[`brandfetch`\]

`brandfetch` retrieves brand identity data such as logos, colors, fonts, and related brand metadata.

### What It Does

`brandfetch` exposes `search_by_identifier(identifier)` and optionally `search_by_brand(name)`. `search_by_identifier()` uses the Brand API and accepts domains, Brandfetch brand IDs, ISINs, or stock tickers. `search_by_brand()` uses the Brand Search API and is useful when you only know the brand name and need to discover the canonical brand entry first. The two methods use different credentials. `search_by_identifier()` requires `api_key`, while `search_by_brand()` requires `client_id`.

### Configuration

| Option                        | Type       | Required | Default                        | Notes                                                                                                   |
| ----------------------------- | ---------- | -------- | ------------------------------ | ------------------------------------------------------------------------------------------------------- |
| `api_key`                     | `password` | `no`     | `null`                         | Brandfetch API key for `search_by_identifier()`. The upstream toolkit also checks `BRANDFETCH_API_KEY`. |
| `client_id`                   | `text`     | `no`     | `null`                         | Brandfetch Client ID for `search_by_brand()`. The upstream toolkit also checks `BRANDFETCH_CLIENT_ID`.  |
| `enable_search_by_identifier` | `boolean`  | `no`     | `true`                         | Enable `search_by_identifier()`.                                                                        |
| `enable_search_by_brand`      | `boolean`  | `no`     | `false`                        | Enable `search_by_brand()`.                                                                             |
| `base_url`                    | `url`      | `no`     | `https://api.brandfetch.io/v2` | Base Brandfetch API URL.                                                                                |
| `timeout`                     | `number`   | `no`     | `20.0`                         | Request timeout in seconds.                                                                             |
| `all`                         | `boolean`  | `no`     | `false`                        | Enable the full upstream toolkit surface.                                                               |
| `async_tools`                 | `boolean`  | `no`     | `false`                        | Deprecated upstream flag that is no longer needed for normal use.                                       |

### Example

```
agents:
  branding:
    tools:
      - brandfetch:
          enable_search_by_brand: true
          timeout: 10
```

```
search_by_identifier("openai.com")
search_by_brand("OpenAI")
```

### Notes

- The credential you need depends on which Brandfetch API surface you enable.
- `search_by_identifier()` is the better default when you already know the brand domain or ticker.
- `async_tools` is kept only for upstream compatibility and should be left at its default.

## \[`spotify`\]

`spotify` is the richest content tool on this page, covering music search, recommendations, playlists, profile lookups, and limited playback control.

### What It Does

`spotify` exposes a broad toolkit including `search_tracks()`, `search_playlists()`, `search_artists()`, `search_albums()`, `get_user_playlists()`, `get_track_recommendations()`, `get_artist_top_tracks()`, `get_album_tracks()`, `get_my_top_tracks()`, `get_my_top_artists()`, `create_playlist()`, `add_tracks_to_playlist()`, `get_playlist()`, `update_playlist_details()`, `remove_tracks_from_playlist()`, `get_current_user()`, `play_track()`, and `get_currently_playing()`.

The tool itself consumes an `access_token`, but MindRoom also provides a dedicated dashboard OAuth flow in `src/mindroom/api/integrations.py` via `/api/integrations/spotify/connect`, `/spotify/status`, `/spotify/callback`, and `/spotify/disconnect`. That OAuth flow stores `access_token` plus extra metadata such as `refresh_token`, `expires_at`, and `username`. By default the connect flow requests the scopes `user-read-private`, `user-read-email`, `user-read-playback-state`, `user-read-currently-playing`, and `user-top-read`. The upstream playlist and playback methods need additional Spotify scopes beyond that base dashboard flow, so manual token provisioning or a broadened OAuth scope set is still required if you want playlist modification or playback control to succeed.
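
One way to reason about the scope gap is a simple set check: compare the scopes an action needs against what the dashboard flow granted. The granted set below comes from the defaults listed above; `playlist-modify-private` is only an example of the kind of extra Spotify scope a playlist write might need, not a claim about the tool's exact requirements:

```python
# Default scopes requested by the dashboard connect flow (from the docs).
GRANTED_SCOPES = {
    "user-read-private", "user-read-email", "user-read-playback-state",
    "user-read-currently-playing", "user-top-read",
}

def missing_scopes(required, granted=GRANTED_SCOPES):
    """Return the scopes an action needs that the current token does not carry."""
    return sorted(set(required) - set(granted))
```

For example, `missing_scopes({"user-top-read"})` is empty, so `get_my_top_tracks()` works with the base flow, while a playlist-modifying scope would come back as missing.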

### Configuration

| Option           | Type       | Required | Default | Notes                                                    |
| ---------------- | ---------- | -------- | ------- | -------------------------------------------------------- |
| `access_token`   | `password` | `yes`    | `null`  | Spotify OAuth access token used by the toolkit.          |
| `default_market` | `text`     | `no`     | `US`    | Default market code for search and album lookup methods. |
| `timeout`        | `number`   | `no`     | `30`    | Request timeout in seconds.                              |

### Example

```
agents:
  dj:
    worker_scope: shared
    tools:
      - spotify:
          default_market: GB
```

```
search_tracks("ambient coding music", max_results=5)
get_my_top_tracks(time_range="short_term", limit=10)
create_playlist("MindRoom Picks", description="Tracks from this week's chat")
get_currently_playing()
```

### Notes

- `spotify` is shared-only in MindRoom, so agents using `worker_scope=user` or `worker_scope=user_agent` will see it marked unsupported and the dashboard status/connect routes will reject that scope.
- The redirect URI defaults to the API callback URL, but `SPOTIFY_REDIRECT_URI` can override it when the dashboard is behind a different public URL.
- `play_track()` requires an active Spotify device and returns a specific `NO_ACTIVE_DEVICE` error when playback cannot start anywhere.
- The current OAuth helper marks saved Spotify credentials as UI-managed so unscoped and shared execution can mirror them correctly.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration)
- [Sandbox Proxy Isolation](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md)

# Matrix & Attachments

Use these tools to work inside the active Matrix room and thread, send follow-up messages, manage thread tags and summaries, and reuse files that belong to the current conversation.

## What This Page Covers

This page documents the built-in tools in the `matrix-and-attachments` group. Use these tools when you need to send or inspect Matrix messages, manage thread tags or summaries, or handle attachment IDs that are scoped to the current room and thread.

## Tools On This Page

- \[`matrix_message`\] - Send, reply, react, read, edit, or inspect Matrix conversation context.
- \[`thread_tags`\] - Add, remove, and inspect shared tags on a Matrix thread.
- \[`thread_summary`\] - Set or update a Matrix thread summary from the current room and thread context.
- \[`matrix_api`\] - Use a low-level Matrix event and state API with explicit room and event IDs.
- \[`attachments`\] - List, inspect, and register context-scoped attachment IDs for later tool calls.

## Common Setup Notes

These tools depend on the active `ToolRuntimeContext`, so they only work when an agent is running in a Matrix-connected conversation. `matrix_message` implies `attachments` through `Config.IMPLIED_TOOLS`, so enabling `matrix_message` makes the `attachments` toolkit available even when you do not list it separately. Attachment IDs are context-scoped `att_*` values, and the runtime only exposes IDs from the current conversation plus any IDs registered during the current tool run. Current source in this worktree exposes `matrix_message`, `thread_tags`, `thread_summary`, `matrix_api`, and `attachments` in this area.

## \[`matrix_message`\]

`matrix_message` is the main Matrix-native tool for sending, reading, reacting to, editing, and inspecting conversation context.

### What It Does

`matrix_message` supports `send`, `reply`, `thread-reply`, `react`, `read`, `thread-list`, `edit`, and `context`.

`send` targets the room timeline by default, even when the current conversation is inside a thread. When a room-level `send` includes both text and attachments, the text is posted to the room timeline and the attachments are threaded under that new text event. When a room-level `send` includes multiple attachments and no text, the first attachment is posted to the room timeline and the remaining attachments are threaded under it. When `send` uses an explicit `thread_id`, both text and attachments stay in that existing thread instead of creating a new attachment thread. In `thread_mode: room`, room-level `send` stays plain room messaging and does not auto-thread attachments unless you pass an explicit `thread_id`.

`reply` and `thread-reply` inherit the current thread when one can be resolved, and they return an error when no thread target is available. `read`, `edit`, and `context` also inherit the current thread when one can be resolved, while `thread_id="room"` forces room-level scope instead of thread inheritance. `thread-list` uses the current thread when one is active, and it requires an explicit `thread_id` when there is no active thread context.

`react` requires `target` and uses `👍` when `message` is empty. `read` defaults to 20 messages and caps `limit` at 50. `thread-list` returns recent thread messages plus `edit_options` for messages that the current Matrix account can edit. Only `send`, `reply`, and `thread-reply` accept attachments, with a combined cap of five `attachment_ids` plus `attachment_file_paths` per call. The tool rate-limits each `(agent_name, requester_id, room_id)` combination to 12 weighted actions per 30 seconds, where each attachment increases the weight of a send or reply.
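
The rate limit described above behaves like a weighted sliding window. A minimal sketch under that reading (only the 12-per-30-seconds budget and the attachment weighting come from these docs; the bookkeeping details are assumptions):

```python
import time
from collections import defaultdict, deque

class WeightedRateLimiter:
    """Sliding-window limiter: at most `budget` weight per `window` seconds per key.

    A plain action is assumed to cost 1, and each attachment to add 1 to a
    send or reply, matching the behavior described above.
    """

    def __init__(self, budget=12, window=30.0, clock=time.monotonic):
        self.budget = budget
        self.window = window
        self.clock = clock
        self.events = defaultdict(deque)  # key -> deque of (timestamp, weight)

    def allow(self, key, attachments=0):
        now = self.clock()
        q = self.events[key]
        while q and q[0][0] <= now - self.window:
            q.popleft()  # drop events that fell outside the window
        weight = 1 + attachments
        if sum(w for _, w in q) + weight > self.budget:
            return False
        q.append((now, weight))
        return True
```

The key would be the `(agent_name, requester_id, room_id)` tuple, so each agent-requester-room combination has its own budget.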

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
agents:
  assistant:
    tools:
      - matrix_message
```

```
matrix_message(action="context")
matrix_message(action="send", message="Posting this to the room timeline.", thread_id="room")
matrix_message(
    action="reply",
    message="I reviewed the thread and attached the export.",
    attachment_file_paths=["/tmp/report.csv"],
)
matrix_message(action="react", target="$event123", message="✅")
```

### Notes

- `ignore_mentions` defaults to `True`, which writes `com.mindroom.skip_mentions=True` so visible mentions do not wake other agents accidentally.
- Set `ignore_mentions=False` only for deliberate self-handoffs or cross-agent dispatch; in that mode the tool preserves normal mention handling and records `com.mindroom.original_sender` for human requesters.
- Use `action="context"` before a follow-up write when you want to inspect the resolved `room_id`, `thread_id`, and `reply_to_event_id`.
- Successful attachment sends also return `attachment_thread_id`, which identifies the thread root used for the uploaded files.
- If you need to send existing conversation files, pass `attachment_ids` from the current context or use the `attachments` tool to inspect them first.

## \[`thread_tags`\]

`thread_tags` lets agents add, remove, and inspect shared thread tags using Matrix room state.

### What It Does

`thread_tags` exposes `tag_thread()`, `untag_thread()`, and `list_thread_tags()`. All three operations default to the current room and active resolved thread context. When there is no active resolved thread context, pass `thread_id` explicitly. The tool normalizes the supplied event into the canonical thread root before reading or writing state. Tags are stored as `com.mindroom.thread.tags` room state. Each `(thread_root_id, tag)` pair uses its own state event, and the state key is the JSON array `[thread_root_id, tag]`. Writes fail unless both the running Matrix client and the human requester have enough power to send that state event in the target room. When the requester differs from the bot account, the requester must also be joined to the target room.
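
The state-key convention can be illustrated directly; the exact JSON serialization settings are an assumption:

```python
import json

def thread_tag_state_key(thread_root_id, tag):
    """Build the state key for a com.mindroom.thread.tags state event.

    Each (thread_root_id, tag) pair gets its own state event, keyed by the
    JSON array of the two values, per the description above.
    """
    return json.dumps([thread_root_id, tag])
```

Keying each pair separately means adding or removing one tag touches only that pair's state event rather than rewriting a combined tag list.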

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
agents:
  assistant:
    tools:
      - thread_tags
```

```
tag_thread("blocked")
untag_thread("blocked")
list_thread_tags(thread_id="$threadRootEvent")
```

### Notes

- This tool writes shared room state, so it is stricter than `matrix_message` about Matrix permissions.
- Tag writes and removals return the updated canonical tag state for the target thread.
- `list_thread_tags()` can inspect the active thread or an explicitly provided `thread_id`.

## \[`thread_summary`\]

`thread_summary` lets agents set or replace the current thread summary explicitly instead of waiting for the automatic summarizer.

### What It Does

`thread_summary` exposes `set_thread_summary(summary, thread_id=None, room_id=None)`. The tool defaults to the active room and current resolved thread from `ToolRuntimeContext`. When there is no active resolved thread context, pass `thread_id` explicitly. The tool normalizes the target to the canonical thread root before sending a new `m.notice` summary event with `io.mindroom.thread_summary` metadata. Manual summaries are marked with `model_name="manual"` and update the cached last-summary count so later automatic summaries continue from the new baseline. A per-thread async lock prevents concurrent duplicate manual summaries from racing each other.

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
agents:
  assistant:
    tools:
      - thread_summary
```

```
set_thread_summary("Decision: ship the current plan and revisit logs tomorrow.")
set_thread_summary(
    "Summary for the import thread.",
    thread_id="$threadRoot",
    room_id="!ops:example.org",
)
```

### Notes

- `summary` must be a non-empty string up to 300 characters after whitespace normalization.
- The tool writes a normal Matrix notice event, so the updated summary remains visible in the thread timeline.
- Automatic thread summaries still exist, but this tool gives an agent an explicit override path when a human asks for a manual summary refresh.
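
The validation rule above (non-empty, at most 300 characters after whitespace normalization) can be sketched as a small helper; the function name and exact normalization are illustrative assumptions:

```python
def normalize_summary(summary: str) -> str:
    # Collapse whitespace runs, then enforce the documented limits:
    # non-empty and at most 300 characters after normalization.
    normalized = " ".join(summary.split())
    if not normalized:
        raise ValueError("summary must be a non-empty string")
    if len(normalized) > 300:
        raise ValueError("summary must be at most 300 characters")
    return normalized
```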

## \[`matrix_api`\]

`matrix_api` exposes a small low-level Matrix API surface for explicit room, event, and state operations, including room-scoped search.

### What It Does

`matrix_api` supports `send_event`, `get_state`, `put_state`, `redact`, `get_event`, and `search`. It defaults `room_id` to the active room, but it also supports authorized cross-room access when the requester is allowed to act there. It never infers thread IDs, event IDs, or state keys from thread context, so callers must pass those identifiers explicitly for low-level operations. `send_event`, `put_state`, and `redact` are rate-limited per `(agent_name, requester_id, room_id)` and audited in logs. Dangerous state event types like `m.room.power_levels` and `m.room.encryption` are blocked by default. Pass `allow_dangerous=true` only when you intentionally want to change critical room state. Hard-blocked state event types like `m.room.create` remain blocked. `search` is read-only, scopes results to one room via `room_id`, uses the top-level `limit` parameter, and rejects `filter.limit`. When `event_context={"include_profile": true}` is requested, returned context preserves `profile_info` for matching senders.
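
The per-`(agent_name, requester_id, room_id)` rate limiting can be illustrated with a generic sliding-window sketch; the key shape comes from this page, while the window size and call limit are placeholder assumptions:

```python
import time
from collections import defaultdict, deque

class WriteRateLimiter:
    """Sliding-window limiter keyed by (agent_name, requester_id, room_id).

    Illustrative only: MindRoom's actual limits and window are not
    documented here.
    """

    def __init__(self, max_calls: int = 5, window_s: float = 60.0):
        self.max_calls = max_calls
        self.window_s = window_s
        self._calls: dict[tuple[str, str, str], deque] = defaultdict(deque)

    def allow(self, agent_name: str, requester_id: str, room_id: str) -> bool:
        now = time.monotonic()
        q = self._calls[(agent_name, requester_id, room_id)]
        # Drop timestamps that fell out of the window, then check capacity.
        while q and now - q[0] > self.window_s:
            q.popleft()
        if len(q) >= self.max_calls:
            return False
        q.append(now)
        return True
```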

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
agents:
  assistant:
    tools:
      - matrix_api
```

```
matrix_api(action="get_event", event_id="$event123")
matrix_api(action="get_state", event_type="m.room.topic")
matrix_api(
    action="put_state",
    event_type="com.example.marker",
    state_key="status",
    content={"value": "ready"},
)
matrix_api(action="redact", event_id="$event123", reason="Cleanup")
matrix_api(
    action="search",
    search_term="deployment incident",
    keys=["content.body"],
    event_context={"before_limit": 1, "after_limit": 1, "include_profile": True},
)
```

### Notes

- Use this tool when you need exact Matrix event or state control rather than the higher-level `matrix_message` convenience actions.
- Use `action="search"` when you need one-room full-text event search without falling back to homeserver-wide or ad-hoc history scans.
- The tool returns structured JSON payloads for both success and error cases.
- Because it is intentionally low-level, it requires explicit IDs instead of deriving them from reply or thread context.

## \[`attachments`\]

`attachments` lets agents inspect and register files that are scoped to the current Matrix conversation.

### What It Does

`attachments` exposes `list_attachments()`, `get_attachment()`, and `register_attachment()`. `list_attachments()` returns the attachment IDs currently available in tool runtime context, the resolved metadata payloads, and any `missing_attachment_ids`. `get_attachment()` returns a single attachment record, including the runtime-local path, when called with only an attachment ID. `get_attachment(attachment_id, mindroom_output_path="relative/path")` saves the attachment bytes into the agent workspace and returns a `mindroom_tool_output` save receipt with the saved path, byte count, binary format, and SHA256 digest. Use `mindroom_output_path` before handing attachments to worker-routed workspace tools such as `file`, `coding`, `python`, or `shell`, because the runtime-local path may not exist inside the worker workspace. In worker-routed shell and python tools, the agent workspace is also exposed as `~`, `$HOME`, and `$MINDROOM_AGENT_WORKSPACE`, so a saved path like `incoming/file.txt` can also be read as `~/incoming/file.txt`. The path must be relative to the workspace; it must not be empty or absolute, must not point at the workspace root, and must not contain `..` or NUL bytes or use environment or user expansion. `register_attachment()` turns a local file path into a new context-scoped `att_*` ID and appends that ID to the current runtime context so later tool calls in the same run can reuse it. Attachment records include kind, filename, MIME type, room ID, thread ID, sender, creation time, and an `available` flag that reports whether the local file still exists. This tool does not send files by itself, but its IDs can be passed to `matrix_message` for `send`, `reply`, or `thread-reply`.
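
The `mindroom_output_path` rules can be sketched as a validation helper; the function name and exact error handling are illustrative assumptions:

```python
def validate_output_path(path: str) -> str:
    """Check a mindroom_output_path against the documented rules.

    The rule list comes from the docs; the checks and messages here
    are an illustrative reconstruction, not MindRoom code.
    """
    if not path:
        raise ValueError("path must not be empty")
    if "\x00" in path:
        raise ValueError("path must not contain NUL bytes")
    if path.startswith("/"):
        raise ValueError("path must be relative to the workspace")
    if path.startswith("~") or "$" in path:
        raise ValueError("user/environment expansion is not allowed")
    segments = [s for s in path.split("/") if s not in ("", ".")]
    if not segments:
        raise ValueError("path must not point at the workspace root")
    if ".." in segments:
        raise ValueError("path must not contain '..'")
    return "/".join(segments)
```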

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
agents:
  assistant:
    tools:
      - attachments
```

```
list_attachments()
get_attachment("att_abc123")
get_attachment("att_abc123", mindroom_output_path="incoming/plan.pdf")
register_attachment("/tmp/plan.pdf")
matrix_message(action="reply", message="Sharing the plan here.", attachment_ids=["att_abc123"])
```

### Notes

- `attachment_id` values must be non-empty `att_*` IDs that are already present in the current tool runtime context.
- Registering a new file attaches it to the current `room_id` and `thread_id`, which prevents accidental reuse across unrelated conversations.
- For the full attachment lifecycle, media kinds, retention rules, and Matrix ingestion flow, use the dedicated [Attachments](https://docs.mindroom.chat/attachments/index.md) guide.

## Related Matrix Runtime Features

Automatic thread summaries are still implemented in `src/mindroom/thread_summary.py` as bot runtime behavior. The summarizer posts one `m.notice` summary after a thread reaches the configured first threshold (one message by default), and then again after every ten additional messages by default, using the model named by `defaults.thread_summary_model`, falling back to `default`. MindRoom uses `defaults.thread_summary_temperature` for automatic summaries when the provider supports runtime temperature overrides, and always omits temperature for Vertex Claude summaries. The `thread_summary` tool complements that automatic behavior by letting an agent publish a manual summary immediately and advance the stored summary baseline.
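
The documented cadence (first summary after one message, then every ten more) can be sketched as a simple check; the function and argument names are illustrative:

```python
def should_summarize(message_count: int, last_summary_count: int,
                     first_threshold: int = 1, interval: int = 10) -> bool:
    # Defaults mirror the documented behavior: summarize once the thread
    # reaches the first threshold, then after every `interval` more
    # messages past the last summarized count.
    if last_summary_count == 0:
        return message_count >= first_threshold
    return message_count - last_summary_count >= interval
```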

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Attachments](https://docs.mindroom.chat/attachments/index.md)
- [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration)

# Messaging & Social

Use these tools to read and send email, post into chat systems, deliver SMS or WhatsApp messages, work with social platforms, and schedule or inspect meetings.

## What This Page Covers

This page documents the built-in tools in the `messaging-and-social` group. Use these tools when you need outbound communication, mailbox access, team-chat delivery, social/community lookups, or Zoom meeting management.

## Tools On This Page

- \[`gmail`\] - Gmail mailbox access and message composition through the Google Gmail OAuth provider.
- \[`slack`\] - Slack channel messaging, threaded replies, channel listing, and history reads.
- \[`discord`\] - Discord bot messaging, channel inspection, history reads, and message deletion.
- \[`telegram`\] - Telegram bot delivery to one configured chat.
- \[`whatsapp`\] - WhatsApp Business API text and template messaging.
- \[`twilio`\] - Twilio SMS delivery, call lookup, and recent message listing.
- \[`webex`\] - Webex room messaging and room listing.
- \[`resend`\] - Transactional email delivery through Resend.
- \[`email`\] - Simple SMTP email sending through Gmail SMTP.
- \[`x`\] - X posting, replying, DMs, profile lookup, timeline reads, and recent-post search.
- \[`reddit`\] - Reddit read access plus optional posting and replies with user auth.
- \[`zoom`\] - Zoom Server-to-Server OAuth meeting scheduling and management.

## Common Setup Notes

- `gmail` uses `auth_provider="google_gmail"` and connects through the generic `/api/oauth/google_gmail/*` flow. Its OAuth tokens are stored separately from editable Gmail tool settings.
- `homeassistant` stays local even when other tools are routed through the sandbox proxy. MindRoom enforces that restriction both at config-validation time and again during tool construction.
- Password fields should be stored through the dashboard or credential store instead of inline YAML.
- Several metadata fields on this page are marked `required: false`, but the installed SDKs still need the corresponding token or secret in practice.
- Useful environment fallbacks on this page include `SLACK_TOKEN`, `DISCORD_BOT_TOKEN`, `TELEGRAM_TOKEN`, `WHATSAPP_ACCESS_TOKEN`, `WHATSAPP_PHONE_NUMBER_ID`, `TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `WEBEX_ACCESS_TOKEN`, `RESEND_API_KEY`, `X_BEARER_TOKEN`, `X_CONSUMER_KEY`, `X_CONSUMER_SECRET`, `X_ACCESS_TOKEN`, `X_ACCESS_TOKEN_SECRET`, `REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, `REDDIT_USERNAME`, `REDDIT_PASSWORD`, `ZOOM_ACCOUNT_ID`, `ZOOM_CLIENT_ID`, and `ZOOM_CLIENT_SECRET`.
- The generic-looking `email` tool is not a fully configurable SMTP client on this branch. Its installed upstream implementation is hard-wired to Gmail SMTP over `smtp.gmail.com:465` and does not expose SMTP host, port, or TLS configuration fields.

## \[`gmail`\]

`gmail` is the mailbox-oriented tool for reading, searching, drafting, sending, and labeling Gmail messages through the Google Gmail OAuth provider.

### What It Does

MindRoom wraps Agno's `GmailTools` with `ScopedOAuthClientMixin`, so Gmail credentials are loaded from MindRoom's scoped OAuth credential store instead of a local `token.json` file. The wrapper refreshes stored Google tokens when needed and raises `OAuthConnectionRequired` with a connect URL when no usable stored OAuth credentials are available. It does not fall back to Agno's local OAuth flow when MindRoom credentials are missing. It only bypasses MindRoom OAuth when configured for Google service-account auth. The current installed Gmail toolkit exposes `get_latest_emails()`, `get_emails_from_user()`, `get_unread_emails()`, `get_starred_emails()`, `get_emails_by_context()`, `get_emails_by_date()`, `get_emails_by_thread()`, `search_emails()`, `create_draft_email()`, `send_email()`, `send_email_reply()`, `mark_email_as_read()`, `mark_email_as_unread()`, `list_custom_labels()`, `apply_label()`, `remove_label()`, and `delete_custom_label()`. Draft and send operations accept local file-system paths for attachments.

### Configuration

| Option                 | Type      | Required | Default | Notes                                                          |
| ---------------------- | --------- | -------- | ------- | -------------------------------------------------------------- |
| `get_latest_emails`    | `boolean` | `no`     | `true`  | Declared in MindRoom's registry metadata as a capability flag. |
| `get_emails_from_user` | `boolean` | `no`     | `true`  | Declared in MindRoom's registry metadata as a capability flag. |
| `get_unread_emails`    | `boolean` | `no`     | `true`  | Declared in MindRoom's registry metadata as a capability flag. |
| `get_starred_emails`   | `boolean` | `no`     | `true`  | Declared in MindRoom's registry metadata as a capability flag. |
| `search_emails`        | `boolean` | `no`     | `true`  | Declared in MindRoom's registry metadata as a capability flag. |
| `create_draft_email`   | `boolean` | `no`     | `true`  | Declared in MindRoom's registry metadata as a capability flag. |
| `send_email`           | `boolean` | `no`     | `true`  | Declared in MindRoom's registry metadata as a capability flag. |
| `send_email_reply`     | `boolean` | `no`     | `true`  | Declared in MindRoom's registry metadata as a capability flag. |

### Example

```
agents:
  assistant:
    worker_scope: shared
    tools:
      - gmail
```

```
get_latest_emails(10)
search_emails("label:unread from:billing@example.com", 10)
send_email("alice@example.com", "Project update", "Here is the latest status.")
send_email_reply("thread_id", "message_id", "alice@example.com", "Re: Project update", "Thanks for the update.")
apply_label("is:unread category:promotions", "Needs Review", count=10)
```

### Notes

- Connect Gmail through the `google_gmail` OAuth provider rather than storing a Gmail-specific API key.
- The Gmail provider requests Gmail read, modify, and compose scopes.
- `gmail` always runs in the primary MindRoom runtime so worker runtimes do not receive Google OAuth secrets.
- Agno's Gmail constructor accepts per-method selector kwargs (`get_latest_emails`, `get_unread_emails`, `search_emails`, etc.), and the MindRoom wrapper forwards them via `**kwargs`.
- Use those selector kwargs to disable specific methods you do not want the agent calling.
- Attachment arguments are local file paths in the current runtime, not Matrix attachment IDs.

## \[`slack`\]

`slack` is the Slack bot toolkit for posting to channels, replying in threads, listing channels, and reading recent channel history.

### What It Does

The installed upstream tool exposes `send_message()`, `send_message_thread()`, `list_channels()`, and `get_channel_history()`. It uses Slack's `WebClient` from `slack_sdk`. `markdown` maps to the `mrkdwn` flag on `chat_postMessage()`. Channel-history responses are normalized into a smaller JSON structure instead of returning the full Slack API payload.
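
The history normalization can be illustrated with a sketch; the exact fields kept are an assumption, since this page only says the full Slack payload is trimmed:

```python
import json

def normalize_history(slack_response: dict) -> str:
    """Reduce a Slack conversations.history payload to a few fields.

    Illustrative reconstruction: the real field selection may differ.
    """
    messages = [
        {"ts": m.get("ts"), "user": m.get("user"), "text": m.get("text")}
        for m in slack_response.get("messages", [])
    ]
    return json.dumps({"messages": messages})
```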

### Configuration

| Option                       | Type       | Required | Default | Notes                                                        |
| ---------------------------- | ---------- | -------- | ------- | ------------------------------------------------------------ |
| `token`                      | `password` | `no`     | `null`  | Slack bot token, or use `SLACK_TOKEN`. Required in practice. |
| `markdown`                   | `boolean`  | `no`     | `true`  | Enable Slack markdown rendering on sent messages.            |
| `enable_send_message`        | `boolean`  | `no`     | `true`  | Enable `send_message()`.                                     |
| `enable_send_message_thread` | `boolean`  | `no`     | `true`  | Enable `send_message_thread()`.                              |
| `enable_list_channels`       | `boolean`  | `no`     | `true`  | Enable `list_channels()`.                                    |
| `enable_get_channel_history` | `boolean`  | `no`     | `true`  | Enable `get_channel_history()`.                              |
| `all`                        | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                    |

### Example

```
agents:
  support:
    tools:
      - slack:
          markdown: true
          enable_get_channel_history: true
```

```
list_channels()
send_message("C0123456789", "Deployment finished.")
send_message_thread("C0123456789", "Following up in the same thread.", "1743112345.678900")
get_channel_history("C0123456789", limit=25)
```

### Notes

- Although `token` is marked optional in metadata, the installed Slack toolkit raises if no token or `SLACK_TOKEN` is present.
- Use channel IDs for the most reliable calls, especially for threaded replies and history reads.
- `get_channel_history()` returns a simplified JSON view rather than the raw Slack response.

## \[`discord`\]

`discord` is the Discord bot toolkit for channel messaging, channel metadata, channel history, server channel listing, and message deletion.

### What It Does

The installed upstream tool exposes `send_message()`, `get_channel_messages()`, `get_channel_info()`, `list_channels()`, and `delete_message()`. It talks directly to `https://discord.com/api/v10` with a bot token in the `Authorization` header. This is a raw REST wrapper rather than a gateway-connected real-time bot runtime. Most functions expect Discord IDs such as `channel_id`, `guild_id`, and `message_id`, not display names.
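
The raw REST shape can be sketched with the standard library; the helper is illustrative, though the endpoint and `Bot` auth header follow Discord's documented v10 REST conventions:

```python
import json
import urllib.request

DISCORD_API = "https://discord.com/api/v10"

def build_send_message_request(bot_token: str, channel_id: str,
                               content: str) -> urllib.request.Request:
    # POST /channels/{channel_id}/messages with the bot token in the
    # Authorization header, as the toolkit's send_message() implies.
    url = f"{DISCORD_API}/channels/{channel_id}/messages"
    body = json.dumps({"content": content}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bot {bot_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```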

### Configuration

| Option                        | Type       | Required | Default | Notes                                                                |
| ----------------------------- | ---------- | -------- | ------- | -------------------------------------------------------------------- |
| `bot_token`                   | `password` | `no`     | `null`  | Discord bot token, or use `DISCORD_BOT_TOKEN`. Required in practice. |
| `enable_send_message`         | `boolean`  | `no`     | `true`  | Enable `send_message()`.                                             |
| `enable_get_channel_messages` | `boolean`  | `no`     | `true`  | Enable `get_channel_messages()`.                                     |
| `enable_get_channel_info`     | `boolean`  | `no`     | `true`  | Enable `get_channel_info()`.                                         |
| `enable_list_channels`        | `boolean`  | `no`     | `true`  | Enable `list_channels()`.                                            |
| `enable_delete_message`       | `boolean`  | `no`     | `true`  | Enable `delete_message()`.                                           |
| `all`                         | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                            |

### Example

```
agents:
  community:
    tools:
      - discord:
          enable_delete_message: false
```

```
list_channels("123456789012345678")
get_channel_info("123456789012345678")
get_channel_messages("123456789012345678", limit=25)
send_message("123456789012345678", "Hello from MindRoom.")
```

### Notes

- `bot_token` is effectively required even though the metadata marks it optional.
- This tool uses ordinary Discord REST calls and does not manage slash commands, presence, or gateway subscriptions.
- Deletion is enabled by default, so disable `enable_delete_message` if you only want read and send access.

## \[`telegram`\]

`telegram` is the simplest chat-delivery tool on this page, with one configured destination and one send function.

### What It Does

The installed upstream tool exposes only `send_message()`. It posts to Telegram Bot API `sendMessage` for the configured `chat_id`. The tool instance is bound to one chat destination, so callers do not pass a chat ID per request. Responses are returned as the raw Telegram API response text.
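
The Bot API call shape can be sketched as follows; the helpers are illustrative, using Telegram's standard `https://api.telegram.org/bot<token>/<method>` URL scheme:

```python
import urllib.parse

def build_send_message_url(token: str) -> str:
    # sendMessage endpoint for the configured bot token.
    return f"https://api.telegram.org/bot{token}/sendMessage"

def build_payload(chat_id: str, text: str) -> bytes:
    # chat_id is fixed per tool instance, so callers only supply text.
    return urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
```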

### Configuration

| Option                | Type       | Required | Default | Notes                                                              |
| --------------------- | ---------- | -------- | ------- | ------------------------------------------------------------------ |
| `chat_id`             | `text`     | `yes`    | `null`  | Telegram chat or channel ID that this tool instance will target.   |
| `token`               | `password` | `no`     | `null`  | Telegram bot token, or use `TELEGRAM_TOKEN`. Required in practice. |
| `enable_send_message` | `boolean`  | `no`     | `true`  | Enable `send_message()`.                                           |
| `all`                 | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                          |

### Example

```
agents:
  notifier:
    tools:
      - telegram:
          chat_id: "-1001234567890"
```

```
send_message("Nightly backup completed.")
```

### Notes

- Set both `chat_id` and `token` for a usable configuration.
- Because the destination chat is fixed in config, this tool is best for a single bot-to-chat delivery path rather than general multi-chat Telegram automation.
- If you need different Telegram destinations, configure different agents or different tool credentials per scope.

## \[`whatsapp`\]

`whatsapp` is the WhatsApp Business API toolkit for text messages and template messages through Meta's Graph API.

### What It Does

The installed upstream tool can expose either `send_text_message_sync()` and `send_template_message_sync()` or their async variants, depending on `async_mode`. It posts to `https://graph.facebook.com/<version>/<phone_number_id>/messages`. `recipient_waid` sets a default recipient so callers can omit the `recipient` argument. If no default recipient is configured, every send call must provide one.
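
The Graph API endpoint shape can be sketched directly from the description above; the payload layout follows the standard WhatsApp Cloud API text-message format and is illustrative, not MindRoom code:

```python
def build_messages_url(version: str, phone_number_id: str) -> str:
    # https://graph.facebook.com/<version>/<phone_number_id>/messages
    return f"https://graph.facebook.com/{version}/{phone_number_id}/messages"

def build_text_payload(recipient: str, text: str) -> dict:
    # Standard Cloud API text-message body shape.
    return {
        "messaging_product": "whatsapp",
        "to": recipient,
        "type": "text",
        "text": {"body": text},
    }
```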

### Configuration

| Option            | Type       | Required | Default | Notes                                                                                       |
| ----------------- | ---------- | -------- | ------- | ------------------------------------------------------------------------------------------- |
| `access_token`    | `password` | `no`     | `null`  | WhatsApp Business API token, or use `WHATSAPP_ACCESS_TOKEN`. Required in practice.          |
| `phone_number_id` | `text`     | `no`     | `null`  | WhatsApp Business phone number ID, or use `WHATSAPP_PHONE_NUMBER_ID`. Required in practice. |
| `version`         | `text`     | `no`     | `v22.0` | Graph API version prefix.                                                                   |
| `recipient_waid`  | `text`     | `no`     | `null`  | Default recipient phone number or WhatsApp ID.                                              |
| `async_mode`      | `boolean`  | `no`     | `false` | Register async send functions instead of sync send functions.                               |

### Example

```
agents:
  pager:
    tools:
      - whatsapp:
          version: v22.0
          recipient_waid: "+15551234567"
          async_mode: false
```

```
send_text_message_sync("The deployment finished.")
send_template_message_sync(template_name="deployment_notice", language_code="en_US")
```

### Notes

- Set both `access_token` and `phone_number_id` even though the metadata marks them optional.
- `async_mode: true` changes the registered function names to the async variants, which matters if you are debugging tool traces or reading model-generated tool calls.
- Template sends expect a preapproved WhatsApp template name and, optionally, template `components`.

## \[`twilio`\]

`twilio` is the telecom-oriented toolkit for SMS sends, call lookups, and recent message history.

### What It Does

The installed upstream tool exposes `send_sms()`, `get_call_details()`, and `list_messages()`. It supports two auth modes: `account_sid` plus `auth_token`, or `account_sid` plus `api_key` and `api_secret`. Optional `region` and `edge` are passed into the Twilio client for regional routing. `send_sms()` validates that both `to` and `from_` are in E.164 format before sending.
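
The E.164 check can be sketched with the standard pattern; the toolkit's exact regex may differ:

```python
import re

# E.164: a leading +, a non-zero first digit, and up to 15 digits total.
E164_RE = re.compile(r"^\+[1-9]\d{1,14}$")

def is_e164(number: str) -> bool:
    return bool(E164_RE.match(number))
```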

### Configuration

| Option                    | Type       | Required | Default | Notes                                                                  |
| ------------------------- | ---------- | -------- | ------- | ---------------------------------------------------------------------- |
| `account_sid`             | `text`     | `no`     | `null`  | Twilio account SID, or use `TWILIO_ACCOUNT_SID`. Required in practice. |
| `auth_token`              | `password` | `no`     | `null`  | Twilio auth token for the simpler auth mode.                           |
| `api_key`                 | `password` | `no`     | `null`  | Twilio API key for key-and-secret auth.                                |
| `api_secret`              | `password` | `no`     | `null`  | Twilio API secret for key-and-secret auth.                             |
| `region`                  | `text`     | `no`     | `null`  | Optional Twilio region such as `au1`.                                  |
| `edge`                    | `text`     | `no`     | `null`  | Optional Twilio edge such as `sydney`.                                 |
| `debug`                   | `boolean`  | `no`     | `false` | Enable Twilio HTTP client logging.                                     |
| `enable_send_sms`         | `boolean`  | `no`     | `true`  | Enable `send_sms()`.                                                   |
| `enable_get_call_details` | `boolean`  | `no`     | `true`  | Enable `get_call_details()`.                                           |
| `enable_list_messages`    | `boolean`  | `no`     | `true`  | Enable `list_messages()`.                                              |
| `all`                     | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                              |

### Example

```
agents:
  phone_ops:
    tools:
      - twilio:
          region: au1
          edge: sydney
          enable_list_messages: true
```

```
send_sms("+15551234567", "+15557654321", "Build passed.")
get_call_details("CA1234567890abcdef")
list_messages(limit=10)
```

### Notes

- `account_sid` is required in both supported auth modes.
- Use either `auth_token` or the `api_key` plus `api_secret` pair; the two auth modes are mutually exclusive.
- `send_sms()` rejects non-E.164 numbers before it reaches Twilio's API.

## \[`webex`\]

`webex` is the Cisco Webex toolkit for room messaging and room listing.

### What It Does

The installed upstream tool exposes `send_message()` and `list_rooms()`. It authenticates with `webexpythonsdk.WebexAPI` using one access token. Despite the broader description in the registry, the current tool surface is messaging-centric and does not manage meetings. Responses are returned as JSON strings built from the SDK result objects.

### Configuration

| Option                | Type       | Required | Default | Notes                                                                  |
| --------------------- | ---------- | -------- | ------- | ---------------------------------------------------------------------- |
| `enable_send_message` | `boolean`  | `no`     | `true`  | Enable `send_message()`.                                               |
| `enable_list_rooms`   | `boolean`  | `no`     | `true`  | Enable `list_rooms()`.                                                 |
| `all`                 | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                              |
| `access_token`        | `password` | `no`     | `null`  | Webex access token, or use `WEBEX_ACCESS_TOKEN`. Required in practice. |

### Example

```
agents:
  meetings:
    tools:
      - webex:
          enable_list_rooms: true
```

```
list_rooms()
send_message("Y2lzY29zcGFyazovL3VzL1JPT00v...", "Agenda is ready.")
```

### Notes

- Set `access_token` even though metadata marks it optional.
- The current tool surface is room messaging and room discovery only.
- Use room IDs, not room titles, when sending messages.

## \[`resend`\]

`resend` is the transactional-email API toolkit for HTML email delivery through Resend.

### What It Does

The installed upstream tool exposes one function, `send_email()`. It sets `resend.api_key` on each call and sends email through `resend.Emails.send()`. The current implementation passes the message body as HTML, not plain text. `from_email` is stored on the tool instance and reused for every call.

### Configuration

| Option              | Type       | Required | Default | Notes                                                                              |
| ------------------- | ---------- | -------- | ------- | ---------------------------------------------------------------------------------- |
| `api_key`           | `password` | `no`     | `null`  | Resend API key, or use `RESEND_API_KEY`. Required in practice.                     |
| `from_email`        | `text`     | `no`     | `null`  | Sender identity used for every call. Usually required by the provider in practice. |
| `enable_send_email` | `boolean`  | `no`     | `true`  | Enable `send_email()`.                                                             |
| `all`               | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                                          |

### Example

```
agents:
  mailer:
    tools:
      - resend:
          from_email: no-reply@example.com
```

```
send_email("alice@example.com", "Welcome", "<p>Your account is ready.</p>")
```

### Notes

- `from_email` is not marked required in metadata, but most Resend setups need a verified sender.
- The `body` argument is sent as HTML.
- Use `resend` when you want provider-managed transactional delivery rather than Gmail mailbox access.

## \[`email`\]

`email` is the simplest SMTP mailer on this page, but on this branch it is specifically wired for Gmail SMTP rather than generic SMTP.

### What It Does

The installed upstream tool exposes one function, `email_user()`. It builds an `EmailMessage`, then logs in with `smtplib.SMTP_SSL("smtp.gmail.com", 465)`. That means the sender account must be a Gmail account or a Google Workspace account compatible with Gmail SMTP. The current implementation sends plain-text bodies only.
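
The message construction can be sketched with the standard library; `build_email` is a hypothetical helper mirroring the described behavior, and actual sending would go through `smtplib.SMTP_SSL("smtp.gmail.com", 465)`:

```python
from email.message import EmailMessage

def build_email(sender_name: str, sender_email: str,
                receiver_email: str, subject: str, body: str) -> EmailMessage:
    # Plain-text message only, matching the current implementation.
    msg = EmailMessage()
    msg["From"] = f"{sender_name} <{sender_email}>"
    msg["To"] = receiver_email
    msg["Subject"] = subject
    msg.set_content(body)
    return msg
```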

### Configuration

| Option              | Type       | Required | Default | Notes                                                                |
| ------------------- | ---------- | -------- | ------- | -------------------------------------------------------------------- |
| `receiver_email`    | `text`     | `no`     | `null`  | Default recipient for all `email_user()` calls.                      |
| `sender_name`       | `text`     | `no`     | `null`  | Display name shown in the `From` header.                             |
| `sender_email`      | `text`     | `no`     | `null`  | Gmail sender address used for SMTP login.                            |
| `sender_passkey`    | `password` | `no`     | `null`  | Gmail app password or equivalent SMTP passkey. Required in practice. |
| `enable_email_user` | `boolean`  | `no`     | `true`  | Enable `email_user()`.                                               |
| `all`               | `boolean`  | `no`     | `false` | Enable the full upstream toolkit surface.                            |

### Example

```
agents:
  alerts:
    tools:
      - email:
          receiver_email: oncall@example.com
          sender_name: MindRoom Alerts
          sender_email: alerts@gmail.com
```

```
email_user("Nightly report", "The report finished successfully.")
```

### Notes

- Despite the generic tool name, there are no host, port, TLS, or auth-mechanism overrides in the current implementation.
- This tool is best for simple Gmail-based outbound mail, not arbitrary SMTP providers.
- If you need mailbox reads, drafts, replies, or labels, use `gmail` instead.
- If you need provider-managed transactional delivery with HTML bodies, use `resend` instead.

## \[`x`\]

`x` is the X or Twitter toolkit for recent-post search, user lookup, timeline access, posting, replying, and DMs.

### What It Does

The installed upstream tool exposes `create_post()`, `reply_to_post()`, `send_dm()`, `get_user_info()`, `get_home_timeline()`, and `search_posts()`. It uses `tweepy.Client` and passes through `wait_on_rate_limit`. `search_posts()` calls `search_recent_tweets()` and, when `include_post_metrics` is enabled, augments each returned post with reply, retweet, like, and quote counts. The implementation clamps `max_results` for search to the Twitter API's supported `10` to `100` range.
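
The `max_results` clamp described above amounts to a one-liner; the helper name here is hypothetical, but the `10` to `100` bounds are the Twitter recent-search API limits:

```python
# Clamp a requested page size into the recent-search API's supported range.
def clamp_max_results(requested: int) -> int:
    return max(10, min(100, requested))
```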

### Configuration

| Option                 | Type       | Required | Default | Notes                                                           |
| ---------------------- | ---------- | -------- | ------- | --------------------------------------------------------------- |
| `bearer_token`         | `password` | `no`     | `null`  | Bearer token, or use `X_BEARER_TOKEN`.                          |
| `consumer_key`         | `password` | `no`     | `null`  | OAuth consumer key, or use `X_CONSUMER_KEY`.                    |
| `consumer_secret`      | `password` | `no`     | `null`  | OAuth consumer secret, or use `X_CONSUMER_SECRET`.              |
| `access_token`         | `password` | `no`     | `null`  | OAuth access token, or use `X_ACCESS_TOKEN`.                    |
| `access_token_secret`  | `password` | `no`     | `null`  | OAuth access token secret, or use `X_ACCESS_TOKEN_SECRET`.      |
| `include_post_metrics` | `boolean`  | `no`     | `false` | Include public metrics in `search_posts()` output.              |
| `wait_on_rate_limit`   | `boolean`  | `no`     | `false` | Let Tweepy wait for rate limits instead of failing immediately. |

### Example

```
agents:
  social:
    tools:
      - x:
          include_post_metrics: true
          wait_on_rate_limit: true
```

```
search_posts("mindroom matrix", max_results=10)
get_user_info("mindroom_ai")
create_post("MindRoom now supports another release.")
reply_to_post("1890123456789012345", "Thanks for the feedback.")
```

### Notes

- A bearer token can support read-style endpoints such as search, but posting, replying, DMs, and home-timeline access generally need full OAuth user credentials.
- The toolkit also defines `get_my_info()`, but MindRoom's current registered tool list on this branch does not advertise a separate config flag for it.
- `reply_to_post()` builds a `twitter.com` URL in its response, while `create_post()` builds an `x.com` URL, so response formatting is currently inconsistent upstream.

## \[`reddit`\]

`reddit` is the Reddit toolkit for reading user and subreddit data, listing trending communities, and optionally posting or replying with user credentials.

### What It Does

The installed upstream tool exposes `get_user_info()`, `get_top_posts()`, `get_subreddit_info()`, `get_trending_subreddits()`, `get_subreddit_stats()`, `create_post()`, `reply_to_post()`, and `reply_to_comment()`. With only `client_id` and `client_secret`, the tool initializes a read-only `praw.Reddit` client. If `username` and `password` are also configured, the tool enables posting and replying with authenticated user actions. `reddit_instance` is an advanced programmatic injection point for an existing `praw.Reddit` object rather than a normal YAML value.
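
The read-only versus authenticated decision can be sketched as a small keyword-argument builder. The helper name is hypothetical; the real tool passes equivalent keyword arguments straight to `praw.Reddit`, and `RedditTools v1.0` is the documented user-agent fallback.

```python
# Decide which kwargs to hand to praw.Reddit: script-app credentials alone
# give a read-only client; adding username/password enables write methods.
def reddit_client_kwargs(client_id, client_secret, user_agent=None,
                         username=None, password=None):
    kwargs = {
        "client_id": client_id,
        "client_secret": client_secret,
        "user_agent": user_agent or "RedditTools v1.0",
    }
    if username and password:
        # Unlocks create_post(), reply_to_post(), and reply_to_comment().
        kwargs.update(username=username, password=password)
    return kwargs
```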

### Configuration

| Option            | Type       | Required | Default | Notes                                                                          |
| ----------------- | ---------- | -------- | ------- | ------------------------------------------------------------------------------ |
| `reddit_instance` | `text`     | `no`     | `null`  | Advanced programmatic Reddit client injection, not normal YAML authoring.      |
| `client_id`       | `text`     | `no`     | `null`  | Reddit app client ID, or use `REDDIT_CLIENT_ID`. Required in practice.         |
| `client_secret`   | `password` | `no`     | `null`  | Reddit app client secret, or use `REDDIT_CLIENT_SECRET`. Required in practice. |
| `user_agent`      | `text`     | `no`     | `null`  | Optional custom user agent, with `RedditTools v1.0` as the upstream fallback.  |
| `username`        | `text`     | `no`     | `null`  | Reddit username for posting and replying.                                      |
| `password`        | `password` | `no`     | `null`  | Reddit password for posting and replying.                                      |

### Example

```
agents:
  research:
    tools:
      - reddit:
          user_agent: MindRoomResearchBot/1.0
```

```
get_top_posts("matrixdotorg", time_filter="week", limit=10)
get_subreddit_info("python")
get_trending_subreddits()
get_user_info("spez")
```

### Notes

- `client_id` and `client_secret` are enough for read-only operations.
- `create_post()`, `reply_to_post()`, and `reply_to_comment()` additionally require `username` and `password`.
- The current MindRoom metadata treats `reddit_instance` as text, but the upstream constructor expects an already constructed `praw.Reddit` object.

## \[`zoom`\]

`zoom` is the Zoom meeting-management toolkit for scheduling, listing, inspecting, deleting, and reading recordings through Zoom's Server-to-Server OAuth flow.

### What It Does

The installed upstream tool exposes `get_access_token()`, `schedule_meeting()`, `get_upcoming_meetings()`, `list_meetings()`, `get_meeting_recordings()`, `delete_meeting()`, and `get_meeting()`. It exchanges `account_id`, `client_id`, and `client_secret` against `https://zoom.us/oauth/token` with `grant_type=account_credentials`. The generated access token is cached in process until shortly before expiry. Meeting creation always targets `users/me/meetings` and applies a fixed set of default meeting settings in the upstream implementation.
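
The in-process token cache described above follows a common pattern: reuse the token until shortly before its expiry, then fetch a fresh one. This sketch is an assumption about the shape of that logic, with the actual `POST https://zoom.us/oauth/token` exchange abstracted behind an injected `token_fetcher`:

```python
# Cache an OAuth access token in process, refreshing shortly before expiry.
# token_fetcher stands in for the account_credentials grant exchange.
import time

class TokenCache:
    def __init__(self, token_fetcher, early_refresh_seconds=60):
        self._fetch = token_fetcher          # returns (access_token, expires_in)
        self._early = early_refresh_seconds  # refresh margin before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self, now=None):
        now = time.time() if now is None else now
        if self._token is None or now >= self._expires_at - self._early:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + expires_in
        return self._token
```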

### Configuration

| Option          | Type       | Required | Default | Notes                                                                                              |
| --------------- | ---------- | -------- | ------- | -------------------------------------------------------------------------------------------------- |
| `account_id`    | `text`     | `no`     | `null`  | Zoom account ID from a Server-to-Server OAuth app, or use `ZOOM_ACCOUNT_ID`. Required in practice. |
| `client_id`     | `text`     | `no`     | `null`  | Zoom client ID, or use `ZOOM_CLIENT_ID`. Required in practice.                                     |
| `client_secret` | `password` | `no`     | `null`  | Zoom client secret, or use `ZOOM_CLIENT_SECRET`. Required in practice.                             |

### Example

```
agents:
  coordinator:
    tools:
      - zoom:
          account_id: your_account_id
          client_id: your_client_id
```

```
schedule_meeting("Weekly sync", "2026-04-02T16:00:00", 30, timezone="America/Los_Angeles")
get_upcoming_meetings()
list_meetings(type="scheduled")
get_meeting("81234567890")
```

### Notes

- Although MindRoom marks `zoom` as `setup_type: oauth`, the current implementation is not an interactive browser OAuth flow.
- The installed tool expects Server-to-Server OAuth credentials and exchanges them directly for an access token on demand.
- `get_meeting_recordings()` can request a download token and TTL per call, but those are call arguments rather than config fields.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Automation & Platforms](https://docs.mindroom.chat/tools/automation-and-platforms/index.md) - For `aws_ses` when you want an alternative outbound-email provider.

# Project Management

Use these tools to work with source hosts, issue trackers, knowledge bases, kanban boards, task managers, and support help centers.

## What This Page Covers

This page documents the built-in tools in the `project-management` group. Use these tools when you need repository context, issue tracking, documentation updates, board workflows, personal task management, or support article search.

## Tools On This Page

- \[`github`\] - GitHub repositories, issues, pull requests, files, branches, code search, and review requests.
- \[`bitbucket`\] - Bitbucket workspace and repository inspection for repositories, commits, pull requests, and issues.
- \[`jira`\] - Jira issue lookup, creation, JQL search, comments, and worklogs.
- \[`linear`\] - Linear GraphQL access for viewer info, teams, issues, and issue updates.
- \[`clickup`\] - ClickUp space, list, task, and task-lifecycle operations.
- \[`confluence`\] - Confluence page lookup, space discovery, page creation, and page updates.
- \[`notion`\] - Notion database page creation, page appends, and tag-based search.
- \[`trello`\] - Trello boards, lists, cards, and card moves.
- \[`todoist`\] - Todoist task creation, updates, completion, deletion, and project discovery.
- \[`zendesk`\] - Zendesk Help Center article search.

## Common Setup Notes

All tools on this page are registered as `status=requires_config`, so they stay unavailable in the dashboard until their required credentials or connection fields are present. None of these tools declare an `auth_provider`, and `src/mindroom/api/integrations.py` currently only exposes Spotify OAuth routes, so project-management tools are configured through stored tool credentials or environment variables rather than a dedicated dashboard OAuth flow. Password and token fields should be stored through the dashboard or credential store instead of inline YAML. Most upstream SDKs also read environment variables:

- `github`: `GITHUB_ACCESS_TOKEN`
- `bitbucket`: `BITBUCKET_USERNAME`, `BITBUCKET_PASSWORD`, `BITBUCKET_TOKEN`
- `jira`: `JIRA_SERVER_URL`, `JIRA_USERNAME`, `JIRA_PASSWORD`, `JIRA_TOKEN`
- `linear`: `LINEAR_API_KEY`
- `clickup`: `CLICKUP_API_KEY`, `MASTER_SPACE_ID`
- `confluence`: `CONFLUENCE_URL`, `CONFLUENCE_USERNAME`, `CONFLUENCE_API_KEY`, `CONFLUENCE_PASSWORD`
- `notion`: `NOTION_API_KEY`, `NOTION_DATABASE_ID`
- `trello`: `TRELLO_API_KEY`, `TRELLO_API_SECRET`, `TRELLO_TOKEN`
- `todoist`: `TODOIST_API_TOKEN`
- `zendesk`: `ZENDESK_USERNAME`, `ZENDESK_PASSWORD`, `ZENDESK_COMPANY_NAME`

Several registry fields on this page are marked optional in metadata even though the upstream tool effectively requires them at runtime, so the notes below call out the practical requirement level for each tool. Missing optional dependencies can auto-install at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set.

## \[`github`\]

`github` is the broadest repository-hosting tool on this page, covering repository search, repository stats, issues, pull requests, files, branches, and code search.

### What It Does

`github` exposes repository discovery methods such as `search_repositories()`, `list_repositories()`, `get_repository()`, `get_repository_with_stats()`, `list_branches()`, `get_repository_languages()`, and `get_repository_stars()`. It also exposes issue and pull request workflows such as `list_issues()`, `get_issue()`, `comment_on_issue()`, `edit_issue()`, `get_pull_request()`, `get_pull_request_comments()`, `create_pull_request()`, `create_pull_request_comment()`, and `create_review_request()`. The file-management surface includes `create_file()`, `get_file_content()`, `update_file()`, `delete_file()`, `get_directory_content()`, and `get_branch_content()`. `base_url` lets the same tool talk to GitHub Enterprise, but it must point at the API root rather than the normal web UI root.

### Configuration

| Option         | Type       | Required | Default | Notes                                                                                |
| -------------- | ---------- | -------- | ------- | ------------------------------------------------------------------------------------ |
| `access_token` | `password` | `no`     | `null`  | GitHub personal access token or GitHub App token.                                    |
| `base_url`     | `url`      | `no`     | `null`  | Optional GitHub Enterprise API base URL such as `https://github.example.com/api/v3`. |

### Example

```
agents:
  maintainer:
    tools:
      - github:
          base_url: https://github.example.com/api/v3
```

```
get_repository("mindroom-ai/mindroom")
list_issues("mindroom-ai/mindroom", state="open", page=1, per_page=20)
get_pull_request("mindroom-ai/mindroom", 123)
```

### Notes

- `access_token` is marked optional in MindRoom metadata, but the upstream client raises at startup if neither `access_token` nor `GITHUB_ACCESS_TOKEN` is present.
- Use `base_url` only for GitHub Enterprise, and set it to the API endpoint such as `/api/v3` rather than the human-facing site root.
- `github` is the best fit on this page when you need repository file operations or rich pull-request inspection in addition to issue tracking.

## \[`bitbucket`\]

`bitbucket` is the Bitbucket repository tool for a configured workspace and repository slug.

### What It Does

`bitbucket` exposes `list_repositories()`, `get_repository_details()`, `create_repository()`, `list_repository_commits()`, `list_all_pull_requests()`, `get_pull_request_details()`, `get_pull_request_changes()`, and `list_issues()`. The tool always authenticates with a configured `username` plus either `password` or `token`, and it scopes most operations to the configured `workspace` and `repo_slug`. If `server_url` has no scheme, the upstream tool normalizes it to `https://<server_url>/<api_version>`.
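
The scheme-less `server_url` normalization can be sketched as below. Only the documented case (no scheme present) is taken from the source; how the upstream tool joins an explicit-scheme URL with `api_version` is an assumption here.

```python
# Build the Bitbucket API base URL from server_url and api_version.
# Scheme-less hosts become https://<server_url>/<api_version>.
def normalize_base_url(server_url="api.bitbucket.org", api_version="2.0"):
    if not server_url.startswith(("http://", "https://")):
        return f"https://{server_url}/{api_version}"
    return f"{server_url.rstrip('/')}/{api_version}"  # assumed join for explicit schemes
```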

### Configuration

| Option        | Type       | Required | Default             | Notes                                                  |
| ------------- | ---------- | -------- | ------------------- | ------------------------------------------------------ |
| `username`    | `text`     | `yes`    | `null`              | Bitbucket username.                                    |
| `password`    | `password` | `no`     | `null`              | App password, used when `token` is not supplied.       |
| `token`       | `password` | `no`     | `null`              | Access token, used instead of `password` when present. |
| `workspace`   | `text`     | `yes`    | `null`              | Bitbucket workspace name.                              |
| `repo_slug`   | `text`     | `yes`    | `null`              | Repository slug used by most repository-scoped calls.  |
| `server_url`  | `url`      | `no`     | `api.bitbucket.org` | Bitbucket host or full base URL.                       |
| `api_version` | `text`     | `no`     | `2.0`               | Bitbucket REST API version appended to `server_url`.   |

### Example

```
agents:
  maintainer:
    tools:
      - bitbucket:
          username: buildbot
          workspace: mindroom
          repo_slug: docs
```

```
get_repository_details()
list_all_pull_requests(state="OPEN")
list_repository_commits(count=10)
```

### Notes

- Provide either `password` or `token`, and use an app password for Bitbucket Cloud unless you have a reason to use token-based auth.
- `repo_slug` is not just a default, because most methods are hard-scoped to that repository and the current `create_repository()` call path also posts through the configured `repo_slug` endpoint on this branch.
- `list_repositories()` is the workspace-wide overview method, while the pull-request, commit, and issue methods all use the configured repository context.

## \[`jira`\]

`jira` is the issue-tracking toolkit for issue lookup, issue creation, JQL search, comments, and worklogs.

### What It Does

`jira` can expose `get_issue()`, `create_issue()`, `search_issues()`, `add_comment()`, and `add_worklog()` through individual enable flags. `server_url` is required at runtime, and the upstream client authenticates with `username` plus `token` when both are present, falls back to `username` plus `password`, and otherwise attempts anonymous access. `search_issues()` uses plain JQL, which makes it the main entry point for filtered issue lists and backlog queries.
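
The credential fallback order reads naturally as a small selector. This is a sketch with hypothetical names; the real client feeds the chosen pair into its basic-auth setup.

```python
# Pick the Jira auth mode: token first, then password, then anonymous.
def jira_auth(username=None, password=None, token=None):
    if username and token:
        return ("basic", (username, token))     # preferred for Atlassian Cloud
    if username and password:
        return ("basic", (username, password))  # self-hosted deployments
    return ("anonymous", None)
```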

### Configuration

| Option                 | Type       | Required | Default | Notes                                                                          |
| ---------------------- | ---------- | -------- | ------- | ------------------------------------------------------------------------------ |
| `server_url`           | `url`      | `no`     | `null`  | Jira base URL such as `https://example.atlassian.net`.                         |
| `username`             | `text`     | `no`     | `null`  | Jira username or Atlassian account email.                                      |
| `password`             | `password` | `no`     | `null`  | Jira password for self-hosted deployments.                                     |
| `token`                | `password` | `no`     | `null`  | Jira or Atlassian API token, preferred over `password` for cloud deployments.  |
| `enable_get_issue`     | `boolean`  | `no`     | `true`  | Enable `get_issue()`.                                                          |
| `enable_create_issue`  | `boolean`  | `no`     | `true`  | Enable `create_issue()`.                                                       |
| `enable_search_issues` | `boolean`  | `no`     | `true`  | Enable `search_issues()`.                                                      |
| `enable_add_comment`   | `boolean`  | `no`     | `true`  | Enable `add_comment()`.                                                        |
| `enable_add_worklog`   | `boolean`  | `no`     | `true`  | Enable `add_worklog()`.                                                        |
| `all`                  | `boolean`  | `no`     | `false` | Enable the full upstream Jira tool surface regardless of the per-method flags. |

### Example

```
agents:
  delivery:
    tools:
      - jira:
          server_url: https://mindroom.atlassian.net
          username: bot@example.com
          enable_add_worklog: false
```

```
get_issue("PROJ-123")
search_issues("project = PROJ AND status != Done", max_results=20)
add_comment("PROJ-123", "Reviewed and ready for testing.")
```

### Notes

- `server_url` is marked optional in metadata, but the upstream client raises if neither `server_url` nor `JIRA_SERVER_URL` is available.
- For Atlassian Cloud, use `username` plus `token` instead of `password`.
- If your Jira deployment allows anonymous API access, the tool can still work without credentials, but most hosted installations do not permit that.

## \[`linear`\]

`linear` is the GraphQL-backed issue tracker tool for viewer info, teams, issues, and issue updates.

### What It Does

`linear` exposes `get_user_details()`, `get_teams_details()`, `get_issue_details()`, `create_issue()`, `update_issue()`, `get_user_assigned_issues()`, `get_workflow_issues()`, and `get_high_priority_issues()`. All calls go to `https://api.linear.app/graphql`, and the tool expects a Linear API key in either `api_key` or `LINEAR_API_KEY`. The read methods are useful for discovering the IDs you need before calling `create_issue()` or `update_issue()`.

### Configuration

| Option    | Type       | Required | Default | Notes           |
| --------- | ---------- | -------- | ------- | --------------- |
| `api_key` | `password` | `no`     | `null`  | Linear API key. |

### Example

```
agents:
  delivery:
    tools:
      - linear
```

```
get_user_details()
get_teams_details()
get_high_priority_issues()
```

### Notes

- `api_key` is marked optional in metadata, but the upstream client raises if neither `api_key` nor `LINEAR_API_KEY` is present.
- `get_issue_details()` takes a Linear issue ID rather than an issue key, so use `get_teams_details()` or other Linear discovery steps first when you only know the human-readable issue key from the UI.
- `linear` is the best fit on this page when your workflow is already centered on Linear IDs, teams, and workflow states rather than repository-native pull requests.

## \[`clickup`\]

`clickup` is the ClickUp task-management tool for spaces, lists, tasks, and task lifecycle operations.

### What It Does

`clickup` exposes `list_tasks()`, `create_task()`, `get_task()`, `update_task()`, `delete_task()`, `list_spaces()`, and `list_lists()`. The tool uses `master_space_id` to call ClickUp's `team/{id}/space` endpoints, so this field is effectively the team or workspace identifier used to discover spaces. Name-based space and list lookup is case-insensitive and also supports regex-style matching in the current upstream implementation. `list_tasks()` aggregates tasks across all lists in a space, while `create_task()` creates into the first list returned for the matched space.
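
The case-insensitive, regex-style name matching can be sketched with `re`. The helper name and item shape are illustrative; the matching semantics (case-insensitive, pattern-based) follow the description above.

```python
# Match a space or list by name, case-insensitively, allowing regex patterns.
import re

def find_by_name(items, name):
    pattern = re.compile(name, re.IGNORECASE)
    for item in items:
        if pattern.search(item["name"]):
            return item
    return None
```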

### Configuration

| Option            | Type       | Required | Default | Notes                                                  |
| ----------------- | ---------- | -------- | ------- | ------------------------------------------------------ |
| `api_key`         | `password` | `yes`    | `null`  | ClickUp API key.                                       |
| `master_space_id` | `text`     | `yes`    | `null`  | ClickUp team or workspace ID used to enumerate spaces. |

### Example

```
agents:
  delivery:
    tools:
      - clickup:
          master_space_id: "90123456"
```

```
list_spaces()
list_lists("Engineering")
create_task("Engineering", "ISSUE-075", "Draft the project-management tool page")
```

### Notes

- The runtime also checks `CLICKUP_API_KEY` and `MASTER_SPACE_ID`, so you can keep both values in stored credentials or environment instead of YAML.
- `create_task()` always uses the first list returned for the matching space on this branch, so use `list_lists()` first if list placement matters.
- `update_task()` passes arbitrary keyword updates through to the ClickUp API, which makes it the most flexible write method once you have a task ID.

## \[`confluence`\]

`confluence` is the Atlassian wiki tool for space discovery and page retrieval, creation, and updates.

### What It Does

`confluence` exposes `get_page_content()`, `get_space_key()`, `create_page()`, `update_page()`, `get_all_space_detail()`, and `get_all_page_from_space()`. The tool resolves a space by human-readable name or by key, and other space-scoped methods depend on that resolution step. `get_page_content()` defaults to `expand="body.storage"`, and `create_page()` and `update_page()` pass raw body content to the Confluence API. At runtime the tool accepts either `api_key` or `password`, with the current implementation preferring `api_key` or `CONFLUENCE_API_KEY` when both are present.

### Configuration

| Option       | Type       | Required | Default | Notes                                                                |
| ------------ | ---------- | -------- | ------- | -------------------------------------------------------------------- |
| `url`        | `url`      | `no`     | `null`  | Confluence base URL.                                                 |
| `username`   | `text`     | `no`     | `null`  | Confluence username or Atlassian account email.                      |
| `password`   | `password` | `no`     | `null`  | Confluence password for self-hosted deployments.                     |
| `api_key`    | `password` | `no`     | `null`  | Confluence API key, preferred over `password` for cloud deployments. |
| `verify_ssl` | `boolean`  | `no`     | `true`  | Verify TLS certificates when connecting to Confluence.               |

### Example

```
agents:
  docs:
    tools:
      - confluence:
          url: https://mindroom.atlassian.net/wiki
          username: docs@example.com
```

```
get_all_space_detail()
get_page_content("Engineering", "Runbook")
create_page("Engineering", "Release Notes", "<p>Initial draft</p>")
```

### Notes

- `url`, `username`, and one of `api_key` or `password` are all required in practice even though the registry marks them optional.
- For Atlassian Cloud, use `username` plus `api_key`, and reserve `password` for self-hosted or older installations.
- Set `verify_ssl: false` only for self-signed or internal deployments where you understand the TLS tradeoff.

## \[`notion`\]

`notion` is the Notion database tool for page creation, content appends, and tag-based search.

### What It Does

`notion` can expose `create_page()`, `update_page()`, and `search_pages()` through individual enable flags. The current upstream implementation assumes the target database has a title property named `Name` and a select property named `Tag`. `create_page()` creates a page with a title, a tag, and one initial paragraph block. `update_page()` appends a paragraph block to an existing page instead of rewriting the whole page. `search_pages()` queries the database directly over HTTP and filters by the `Tag` select value.
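
The hard-coded schema implies a request payload roughly like the one below. The property shapes follow the public Notion API; the helper name is illustrative and the exact upstream payload may differ in detail.

```python
# Build a Notion create-page payload against a database whose schema has a
# "Name" title property and a "Tag" select property.
def create_page_payload(database_id, title, tag, paragraph):
    return {
        "parent": {"database_id": database_id},
        "properties": {
            "Name": {"title": [{"text": {"content": title}}]},
            "Tag": {"select": {"name": tag}},
        },
        "children": [{
            "object": "block",
            "type": "paragraph",
            "paragraph": {"rich_text": [{"text": {"content": paragraph}}]},
        }],
    }
```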

### Configuration

| Option                | Type       | Required | Default | Notes                                                                            |
| --------------------- | ---------- | -------- | ------- | -------------------------------------------------------------------------------- |
| `api_key`             | `password` | `yes`    | `null`  | Notion integration token.                                                        |
| `database_id`         | `text`     | `yes`    | `null`  | Notion database ID.                                                              |
| `enable_create_page`  | `boolean`  | `no`     | `true`  | Enable `create_page()`.                                                          |
| `enable_update_page`  | `boolean`  | `no`     | `true`  | Enable `update_page()`.                                                          |
| `enable_search_pages` | `boolean`  | `no`     | `true`  | Enable `search_pages()`.                                                         |
| `all`                 | `boolean`  | `no`     | `false` | Enable the full upstream Notion tool surface regardless of the per-method flags. |

### Example

```
agents:
  docs:
    tools:
      - notion:
          database_id: 0123456789abcdef0123456789abcdef
          enable_update_page: false
```

```
search_pages("docs")
create_page("ISSUE-075", "docs", "Draft the project-management tool page")
update_page("PAGE_ID", "Added rollout notes")
```

### Notes

- The integration must be shared with the target database before the tool can create or search pages.
- The database schema must include a `Name` title property and a `Tag` select property, because those names are hard-coded in the current upstream implementation.
- `all: true` overrides the individual enable flags when you want the full Notion surface.

## \[`trello`\]

`trello` is the board-management tool for boards, lists, cards, and card moves.

### What It Does

`trello` exposes `create_card()`, `get_board_lists()`, `move_card()`, `get_cards()`, `create_board()`, `create_list()`, and `list_boards()`. `create_card()` looks up the target list by case-insensitive `list_name` within a board and then creates the card there. `move_card()` works by card ID and destination list ID, which makes `get_board_lists()` and `get_cards()` the normal discovery helpers before edits. If the Trello client cannot initialize, the current upstream methods return `"Trello client not initialized"` instead of structured JSON.

### Configuration

| Option       | Type       | Required | Default | Notes              |
| ------------ | ---------- | -------- | ------- | ------------------ |
| `api_key`    | `password` | `no`     | `null`  | Trello API key.    |
| `api_secret` | `password` | `no`     | `null`  | Trello API secret. |
| `token`      | `password` | `no`     | `null`  | Trello user token. |

### Example

```
agents:
  planner:
    tools:
      - trello
```

```
list_boards(board_filter="open")
get_board_lists("BOARD_ID")
create_card("BOARD_ID", "To Do", "Write docs", "Draft the new tool page")
```

### Notes

- The registry marks all three fields optional, but a working Trello client effectively needs `api_key`, `api_secret`, and `token`.
- `list_boards()` accepts filters such as `all`, `open`, `closed`, `organization`, `public`, and `starred`.
- Use `get_board_lists()` first when you need list IDs for `move_card()` or when you want to confirm the exact list names present on a board.

## \[`todoist`\]

`todoist` is the personal task-management tool for creating, updating, completing, deleting, and listing tasks and projects.

### What It Does

`todoist` exposes `create_task()`, `get_task()`, `update_task()`, `close_task()`, `delete_task()`, `get_active_tasks()`, and `get_projects()`. `create_task()` supports optional `project_id`, natural-language `due_string`, `priority`, and `labels`. `update_task()` is the richest write method, with support for content, description, labels, priority, `due_string`, `due_date`, `due_datetime`, `due_lang`, `assignee_id`, and `section_id`. `close_task()` marks a task complete, while `delete_task()` permanently removes it.

### Configuration

| Option      | Type       | Required | Default | Notes              |
| ----------- | ---------- | -------- | ------- | ------------------ |
| `api_token` | `password` | `no`     | `null`  | Todoist API token. |

### Example

```
agents:
  planner:
    tools:
      - todoist
```

```
create_task("Write project-management docs", due_string="tomorrow", priority=4)
get_active_tasks()
close_task("TASK_ID")
```

### Notes

- `api_token` is marked optional in metadata, but the upstream client raises if neither `api_token` nor `TODOIST_API_TOKEN` is present.
- Use `get_projects()` first when you want to target a specific project with `project_id`.
- `priority` follows Todoist's `1` to `4` scale, where `4` is the highest priority.

## \[`zendesk`\]

`zendesk` is the help-center search tool on this page.

### What It Does

`zendesk` can expose `search_zendesk()` through the `enable_search_zendesk` flag. The current upstream implementation calls the Zendesk Help Center articles search endpoint at `https://<company_name>.zendesk.com/api/v2/help_center/articles/search.json`. Search results are reduced to cleaned article body text with HTML tags removed. This tool does not expose ticket lookup or ticket updates on this branch.
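
The body cleanup can be sketched as a simple tag-stripping pass. This is an assumed implementation of the "HTML tags removed" step, not the literal upstream code:

```python
# Reduce an article body to cleaned text: drop tags, collapse whitespace.
import re

def strip_html(body: str) -> str:
    text = re.sub(r"<[^>]+>", " ", body)
    return re.sub(r"\s+", " ", text).strip()
```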

### Configuration

| Option                  | Type       | Required | Default | Notes                                                                             |
| ----------------------- | ---------- | -------- | ------- | --------------------------------------------------------------------------------- |
| `username`              | `text`     | `no`     | `null`  | Zendesk username.                                                                 |
| `password`              | `password` | `no`     | `null`  | Zendesk password.                                                                 |
| `company_name`          | `text`     | `no`     | `null`  | Zendesk subdomain used to build the API URL.                                      |
| `enable_search_zendesk` | `boolean`  | `no`     | `true`  | Enable `search_zendesk()`.                                                        |
| `all`                   | `boolean`  | `no`     | `false` | Enable the full upstream Zendesk tool surface regardless of the per-method flags. |

### Example

```
agents:
  support:
    tools:
      - zendesk:
          username: support@example.com
          company_name: acme
```

```
search_zendesk("Matrix onboarding")
```

### Notes

- `username`, `password`, and `company_name` are all required in practice even though the registry marks them optional.
- `company_name` is the Zendesk subdomain, not the human-readable company display name.
- Because the current tool returns cleaned article body text without titles or URLs, it is better for knowledge lookup than for navigational link retrieval.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration)
- [Dashboard](https://docs.mindroom.chat/dashboard/index.md)

# Calendar & Scheduling

Use these tools to read external calendars, manage bookings, and schedule future agent work inside MindRoom.

## What This Page Covers

This page documents the built-in tools in the `calendar-and-scheduling` group. Use these tools when you need Google Calendar access, Cal.com booking APIs, or Matrix-native scheduled tasks that post back into MindRoom later.

## Tools On This Page

- \[`google_calendar`\] - Read Google Calendar data and, when enabled, create, update, or delete events through Google OAuth.
- \[`cal_com`\] - Query Cal.com availability and manage bookings through the Cal.com API.
- \[`scheduler`\] - Schedule, edit, list, and cancel MindRoom tasks and reminders in the current Matrix conversation.

## Common Setup Notes

`google_calendar` is a per-service Google OAuth integration. It uses the `google_calendar` OAuth provider instead of an API key form, and it always runs in the primary MindRoom runtime so worker runtimes do not receive Google OAuth secrets. Use [Google Services OAuth (Admin Setup)](https://docs.mindroom.chat/deployment/google-services-oauth/index.md) or [Google Services OAuth (Individual Setup)](https://docs.mindroom.chat/deployment/google-services-user-oauth/index.md) to connect Google before enabling `google_calendar`.

`cal_com` is a standard credential-backed tool with its own config fields and no shared-only restriction.

`scheduler` is MindRoom's built-in scheduling system, so it needs no dashboard OAuth setup or API keys. Unlike the two calendar API tools, `scheduler` depends on the active Matrix `ToolRuntimeContext`, so it only works from a live room or thread. MindRoom also includes `scheduler` in `defaults.tools` by default on this branch.

## \[`google_calendar`\]

`google_calendar` wraps Agno's Google Calendar toolkit with MindRoom-scoped Google OAuth credentials.

### What It Does

`google_calendar` exposes `list_events()`, `fetch_all_events()`, `find_available_slots()`, `list_calendars()`, `create_event()`, `update_event()`, and `delete_event()`. MindRoom loads the connected Google account from its unified credential store instead of relying on a per-process `token.json`. The OAuth provider requests a consistent Google Calendar scope, while MindRoom gates write methods with the `allow_update` setting. Write calls are still part of the tool surface, but they are only exposed when `allow_update: true` is configured. When no usable MindRoom OAuth credentials exist, the wrapper raises `OAuthConnectionRequired` instead of falling back to Agno's local token flow. `find_available_slots()` derives openings from the user's current calendar events plus working-hours settings inferred from Google Calendar settings and locale.
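
The `allow_update` gate described above can be pictured with a small illustrative sketch. This is not MindRoom source code, just a minimal model of the documented behavior: write methods join the tool surface only when the flag is enabled.

```
# Illustrative sketch, not MindRoom source: the write surface is exposed
# only when allow_update is set.
class CalendarToolSurface:
    def __init__(self, allow_update: bool = False):
        # Read methods are always available.
        self.methods = ["list_events", "fetch_all_events",
                        "find_available_slots", "list_calendars"]
        if allow_update:
            # Write methods are added only on explicit opt-in.
            self.methods += ["create_event", "update_event", "delete_event"]

read_only = CalendarToolSurface()
writable = CalendarToolSurface(allow_update=True)
```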

### Configuration

| Option         | Type      | Required | Default   | Notes                                         |
| -------------- | --------- | -------- | --------- | --------------------------------------------- |
| `calendar_id`  | `text`    | `no`     | `primary` | Google Calendar ID to query or update.        |
| `allow_update` | `boolean` | `no`     | `false`   | Expose create, update, and delete operations. |

### Example

```
agents:
  assistant:
    worker_scope: shared
    tools:
      - google_calendar:
          calendar_id: primary
          allow_update: true
```

```
list_events(limit=5)
find_available_slots(start_date="2026-04-01", end_date="2026-04-03", duration_minutes=30)
create_event(
    start_date="2026-04-02T15:00:00",
    end_date="2026-04-02T15:30:00",
    title="Deployment review",
    attendees=["ops@example.com"],
    add_google_meet_link=True,
)
```

### Notes

- `calendar_id` defaults to `primary`, and `list_calendars()` can return the other calendar IDs available to the connected account.
- If the Google Calendar connection is missing the required calendar scope, `google_calendar` stays unavailable until the user reconnects and grants it.
- Use the Google Services OAuth guides for consent-screen setup, redirect URIs, and environment variables.

## \[`cal_com`\]

`cal_com` talks to the Cal.com v2 booking API for availability lookup and booking management.

### What It Does

`cal_com` exposes `get_available_slots()`, `create_booking()`, `get_upcoming_bookings()`, `reschedule_booking()`, and `cancel_booking()`. The toolkit uses one configured `event_type_id` as the default booking type for slot lookup and booking creation. Responses are converted from UTC into `user_timezone` before they are returned. The per-method enable flags let you narrow the exposed call surface when an agent should only inspect availability or only manage existing bookings.
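
The UTC-to-local conversion described above can be sketched with the standard library; the timezone name is just the one used in the example config below, not a requirement.

```
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# A slot returned by the API in UTC...
utc_slot = datetime(2026, 4, 3, 17, 0, tzinfo=timezone.utc)

# ...is rendered in the configured user_timezone before being returned.
local_slot = utc_slot.astimezone(ZoneInfo("America/Los_Angeles"))
print(local_slot.isoformat())  # 2026-04-03T10:00:00-07:00
```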

### Configuration

| Option                         | Type       | Required | Default | Notes                                                                                              |
| ------------------------------ | ---------- | -------- | ------- | -------------------------------------------------------------------------------------------------- |
| `api_key`                      | `password` | `no`     | `null`  | Cal.com API key. Configure this through the dashboard or credential store rather than inline YAML. |
| `event_type_id`                | `number`   | `no`     | `null`  | Default Cal.com event type ID used for slot lookup and new bookings.                               |
| `user_timezone`                | `text`     | `no`     | `null`  | IANA timezone used when formatting returned booking times.                                         |
| `enable_get_available_slots`   | `boolean`  | `no`     | `true`  | Enable `get_available_slots()`.                                                                    |
| `enable_create_booking`        | `boolean`  | `no`     | `true`  | Enable `create_booking()`.                                                                         |
| `enable_get_upcoming_bookings` | `boolean`  | `no`     | `true`  | Enable `get_upcoming_bookings()`.                                                                  |
| `enable_reschedule_booking`    | `boolean`  | `no`     | `true`  | Enable `reschedule_booking()`.                                                                     |
| `enable_cancel_booking`        | `boolean`  | `no`     | `true`  | Enable `cancel_booking()`.                                                                         |
| `all`                          | `boolean`  | `no`     | `false` | Enable every Cal.com operation at once.                                                            |

### Example

```
agents:
  scheduler_assistant:
    tools:
      - cal_com:
          event_type_id: 123456
          user_timezone: America/Los_Angeles
          enable_cancel_booking: false
```

```
get_available_slots(start_date="2026-04-01", end_date="2026-04-07")
create_booking(
    start_time="2026-04-03T17:00:00+00:00",
    name="Alex Example",
    email="alex@example.com",
)
get_upcoming_bookings(email="alex@example.com")
```

### Notes

- Although the metadata marks `api_key` and `event_type_id` as optional fields, the runtime only works properly when those values are supplied either through stored credentials or the `CALCOM_API_KEY` and `CALCOM_EVENT_TYPE_ID` environment variables.
- If `user_timezone` is omitted, the upstream toolkit falls back to `America/New_York`.
- `api_key` is a password field, so MindRoom blocks inline YAML overrides for it in normal authored config.
- All current requests go to `https://api.cal.com/v2`.

## \[`scheduler`\]

`scheduler` is MindRoom's built-in task scheduler for future messages, reminders, and recurring agent work.

### What It Does

`scheduler` exposes `schedule()`, `edit_schedule()`, `list_schedules()`, and `cancel_schedule()`. It reuses the same backend as `!schedule`, `!edit_schedule`, `!list_schedules`, and `!cancel_schedule`. By default `schedule()` posts back into the current room or thread scope, while `new_thread=True` schedules a future room-level root message. Scheduled tasks are stored in Matrix room state and persist across restarts. The scheduler validates mentioned agents against the current room or thread before it saves a task. If no Matrix room context is available, the tool returns an unavailable error instead of creating a task.

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
agents:
  assistant:
    tools:
      - scheduler
```

```
schedule("tomorrow at 9am @ops check the deployment")
schedule("every weekday at 8am post the on-call handoff summary", new_thread=True)
list_schedules()
edit_schedule("a1b2c3d4", "tomorrow at 10am @ops check the deployment")
cancel_schedule("a1b2c3d4")
```

### Notes

- `scheduler` needs no dashboard setup and is included in `defaults.tools` by default unless you explicitly disable that inheritance.
- Editing preserves the original schedule type, so switching between one-time and recurring schedules requires cancelling the old task and creating a new one.
- Conditional phrases such as `if` and `when` are converted into recurring polling schedules rather than real event subscriptions.
- Use [Scheduling](https://docs.mindroom.chat/scheduling/index.md) for the full command syntax, timezone behavior, persistence details, and command-line aliases.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Scheduling](https://docs.mindroom.chat/scheduling/index.md)
- [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration)
- [Google Services OAuth (Admin Setup)](https://docs.mindroom.chat/deployment/google-services-oauth/index.md)
- [Google Services OAuth (Individual Setup)](https://docs.mindroom.chat/deployment/google-services-user-oauth/index.md)

# Memory & Storage

Use these tools to explicitly manage MindRoom memory or connect an agent to external memory providers.

## What This Page Covers

This page documents the built-in tools in the `memory-and-storage` group. Use these tools when you need explicit memory CRUD operations, direct provider-specific memory access, or a clear separation between MindRoom memory and third-party memory services.

## Tools On This Page

- \[`memory`\] - Explicitly add, search, list, read, update, and delete MindRoom memories visible to the current agent.
- \[`mem0`\] - Direct Mem0 toolkit for user-scoped persistent memory outside MindRoom's built-in memory API.
- \[`zep`\] - Direct Zep Cloud toolkit for session memory and user-graph search.

## Common Setup Notes

`memory` is MindRoom-native and has no tool-specific configuration fields. It operates on the same MindRoom memory backend configured through `memory.backend` or `agents.<name>.memory_backend`, so it follows the effective `mem0`, `file`, or `none` backend for that agent. If the effective backend is `none`, MindRoom does not attach the `memory` tool to that agent. Use [Memory System](https://docs.mindroom.chat/memory/index.md) for the canonical docs on backend selection, automatic extraction, file-backed memory, Agno Learning, and storage layout.

`mem0` and `zep` are separate upstream Agno toolkits that talk to external memory providers directly. Enabling `mem0` or `zep` does not change MindRoom's own memory backend, automatic memory extraction, or the behavior of the `memory` tool. `mem0` can work with a hosted Mem0 API key or with local/default upstream Mem0 configuration. `zep` requires a Zep API key, either through stored credentials or the `ZEP_API_KEY` environment variable. If optional dependencies for these tools are missing, MindRoom can auto-install them at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set.

This page does not document conversation-scoped file attachments even though they are storage-like. Use [Matrix & Attachments](https://docs.mindroom.chat/tools/matrix-and-attachments/index.md) and [Attachments](https://docs.mindroom.chat/attachments/index.md) for attachment IDs, retention, and Matrix media flow.
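
As a hypothetical sketch of the backend selection described above (agent names and the chosen backends are illustrative, not defaults):

```
memory:
  backend: mem0            # global default backend

agents:
  archivist:
    memory_backend: file   # per-agent override of the global backend
    tools:
      - memory
  stateless:
    memory_backend: none   # the memory tool is not attached to this agent
```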

## \[`memory`\]

`memory` gives an agent explicit control over the MindRoom memories available in its current scope.

### What It Does

`memory` exposes `add_memory()`, `search_memories()`, `list_memories()`, `get_memory()`, `update_memory()`, and `delete_memory()`. It complements MindRoom's automatic post-response memory extraction by letting the agent deliberately remember or inspect something on demand. The tool always uses the current agent's configured MindRoom memory backend, so the same calls work whether that agent uses built-in `mem0` storage or file-backed memory. Agents with `memory_backend: none` do not receive this tool. Search and list results include memory IDs, and those IDs are then used with `get_memory()`, `update_memory()`, and `delete_memory()`. The tool is bound to the current agent's MindRoom scope and can reach any agent or team memories that MindRoom makes visible to that agent.

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
memory:
  backend: file

agents:
  assistant:
    tools:
      - memory
```

```
add_memory("The user prefers terse release notes.")
search_memories("release notes", limit=3)
list_memories(limit=20)
get_memory("abc123")
update_memory("abc123", "The user prefers terse release notes with dates.")
delete_memory("abc123")
```

### Notes

- The tool uses whichever MindRoom backend is active for the agent, so enable and tune that backend through [Memory System](https://docs.mindroom.chat/memory/index.md), not through tool-local options.
- This is the right tool when you want explicit control over MindRoom's built-in durable memory rather than a separate provider account.
- The tool returns user-facing error strings on failures instead of raising raw exceptions into the conversation.

## \[`mem0`\]

`mem0` connects an agent directly to the upstream Mem0 toolkit.

### What It Does

`mem0` exposes `add_memory()`, `search_memory()`, `get_all_memories()`, and `delete_all_memories()`. It uses the upstream `mem0ai` client directly rather than MindRoom's built-in memory API. If `api_key` is set, or `MEM0_API_KEY` is present in the environment, the toolkit connects to Mem0's hosted platform client. If no API key is present but `config` is supplied, the toolkit initializes upstream Mem0 from that local config object. If neither `api_key` nor `config` is supplied, the toolkit falls back to upstream Mem0 defaults. Operations need a `user_id`, either from tool config or from the run context, and they return an error string when no user ID can be resolved.
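
The initialization precedence described above can be sketched as follows. This is an illustrative model of the documented order only; the real toolkit constructs `mem0ai` clients rather than returning strings.

```
# Illustrative precedence sketch, not the upstream implementation.
def resolve_mem0_mode(api_key=None, config=None, env=None):
    env = env or {}
    if api_key or env.get("MEM0_API_KEY"):
        return "platform"      # hosted Mem0 platform client
    if config:
        return "local-config"  # upstream Memory.from_config(config)
    return "defaults"          # upstream Mem0 defaults

print(resolve_mem0_mode(config={"vector_store": {}}))  # local-config
```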

### Configuration

| Option                       | Type       | Required | Default | Notes                                                                                                                    |
| ---------------------------- | ---------- | -------- | ------- | ------------------------------------------------------------------------------------------------------------------------ |
| `config`                     | `text`     | `no`     | `null`  | Advanced upstream Mem0 config passed through to `Memory.from_config()`, typically used for local or self-managed setups. |
| `api_key`                    | `password` | `no`     | `null`  | Optional Mem0 Platform API key, also read from `MEM0_API_KEY`.                                                           |
| `user_id`                    | `text`     | `no`     | `null`  | Fixed user scope for all calls when you do not want to rely on runtime user context.                                     |
| `org_id`                     | `text`     | `no`     | `null`  | Optional Mem0 organization ID for platform usage, also read from `MEM0_ORG_ID`.                                          |
| `project_id`                 | `text`     | `no`     | `null`  | Optional Mem0 project ID for platform usage, also read from `MEM0_PROJECT_ID`.                                           |
| `infer`                      | `boolean`  | `no`     | `true`  | Let Mem0 infer facts from added content.                                                                                 |
| `enable_add_memory`          | `boolean`  | `no`     | `true`  | Enable `add_memory()`.                                                                                                   |
| `enable_search_memory`       | `boolean`  | `no`     | `true`  | Enable `search_memory()`.                                                                                                |
| `enable_get_all_memories`    | `boolean`  | `no`     | `true`  | Enable `get_all_memories()`.                                                                                             |
| `enable_delete_all_memories` | `boolean`  | `no`     | `true`  | Enable `delete_all_memories()`.                                                                                          |
| `all`                        | `boolean`  | `no`     | `false` | Enable all currently exposed Mem0 methods regardless of the individual flags.                                            |

### Example

```
agents:
  assistant:
    tools:
      - mem0:
          user_id: assistant-memory
          infer: true
```

```
add_memory("The user prefers terse release notes.")
search_memory("release notes")
get_all_memories()
delete_all_memories()
```

### Notes

- `api_key` is optional because the toolkit can use local/default upstream Mem0 initialization instead of the hosted Mem0 platform.
- This toolkit is separate from MindRoom's `memory.backend: mem0` setting, so enabling `mem0` here does not configure or replace MindRoom's built-in memory backend.
- If you want MindRoom's automatic memory extraction and built-in memory retrieval to use Mem0, configure that in [Memory System](https://docs.mindroom.chat/memory/index.md) instead of relying on this toolkit alone.
- Store API keys outside authored YAML even when the current metadata marks them as optional.

## \[`zep`\]

`zep` connects an agent directly to Zep Cloud for conversational memory and user-graph search.

### What It Does

`zep` exposes `add_zep_message()`, `get_zep_memory()`, and `search_zep_memory()`. The toolkit requires a Zep API key at initialization time and raises an error when neither `api_key` nor `ZEP_API_KEY` is available. If `session_id` is omitted, the toolkit generates a new session ID automatically. If `user_id` is omitted, the toolkit generates a new Zep user and creates it in the remote account. If `user_id` is provided but the user does not exist yet, the toolkit attempts to create it before use. `get_zep_memory(memory_type="context")` returns either session context or raw message history. `search_zep_memory(query, search_scope="edges")` searches the Zep user graph by facts or nodes. `ignore_assistant_messages` skips assistant-role content when messages are added to Zep.

### Configuration

| Option                      | Type       | Required | Default | Notes                                                                        |
| --------------------------- | ---------- | -------- | ------- | ---------------------------------------------------------------------------- |
| `session_id`                | `text`     | `no`     | `null`  | Optional stable Zep thread ID, otherwise a new UUID is generated.            |
| `user_id`                   | `text`     | `no`     | `null`  | Optional stable Zep user ID, otherwise a new user is generated and created.  |
| `api_key`                   | `password` | `yes`    | `null`  | Required in practice unless it is provided through `ZEP_API_KEY`.            |
| `ignore_assistant_messages` | `boolean`  | `no`     | `false` | Ignore assistant-role messages when adding session messages.                 |
| `enable_add_zep_message`    | `boolean`  | `no`     | `true`  | Enable `add_zep_message()`.                                                  |
| `enable_get_zep_memory`     | `boolean`  | `no`     | `true`  | Enable `get_zep_memory()`.                                                   |
| `enable_search_zep_memory`  | `boolean`  | `no`     | `true`  | Enable `search_zep_memory()`.                                                |
| `instructions`              | `text`     | `no`     | `null`  | Override the default tool instructions injected for the model.               |
| `add_instructions`          | `boolean`  | `no`     | `false` | Add the tool instructions to the model prompt.                               |
| `all`                       | `boolean`  | `no`     | `false` | Enable all currently exposed Zep methods regardless of the individual flags. |

### Example

```
agents:
  assistant:
    tools:
      - zep:
          user_id: assistant-user
          session_id: release-review
```

```
add_zep_message(role="user", content="The user prefers terse release notes.")
get_zep_memory(memory_type="context")
search_zep_memory(query="release notes", search_scope="edges")
```

### Notes

- `zep` is an external provider toolkit and does not change MindRoom's built-in memory backend or the behavior of the `memory` tool.
- Use explicit `user_id` and `session_id` values when you want continuity across runs instead of a fresh generated identity.
- `search_scope="edges"` returns fact-style results, while `search_scope="nodes"` returns node summaries.
- Store the API key through credentials or `ZEP_API_KEY` rather than inline YAML.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Memory System](https://docs.mindroom.chat/memory/index.md)
- [Matrix & Attachments](https://docs.mindroom.chat/tools/matrix-and-attachments/index.md)
- [Attachments](https://docs.mindroom.chat/attachments/index.md)

# Agent Orchestration

Use these tools and presets to coordinate other agents, change runtime configuration, import OpenClaw-style workspaces, and keep long-lived Claude coding sessions alive across turns.

## What This Page Covers

This page documents the built-in tools in the `agent-orchestration` group. Use these tools when you need multi-agent coordination, runtime config changes, config-only presets, or persistent Claude Agent SDK sessions.

## Tools On This Page

- \[`subagents`\] - Spawn Matrix-backed sub-agent sessions and message them later by session key or label.
- \[`delegate`\] - Run another configured agent as a one-shot specialist and return its answer inline.
- \[`config_manager`\] - Inspect MindRoom config and create, update, validate, or template agents and teams.
- \[`self_config`\] - Let an agent read and update only its own configuration.
- \[`openclaw_compat`\] - Config-only preset that expands to native MindRoom tools.
- \[`claude_agent`\] - Persistent Claude Agent SDK sessions with optional gateway support and per-session labels.

## Common Setup Notes

All six entries on this page are MindRoom-native orchestration features rather than third-party OAuth integrations. Only \[`claude_agent`\] has tool-specific credential fields.

\[`delegate`\] and \[`self_config`\] can be added automatically based on agent config, so they are not limited to explicit `tools:` entries. `agents.<name>.delegate_to` auto-enables \[`delegate`\] when the list is non-empty and the current delegation depth is below the hard limit of 3. `agents.<name>.allow_self_config` or `defaults.allow_self_config` auto-enables \[`self_config`\]. \[`config_manager`\] and \[`self_config`\] both save changes by revalidating the full runtime config before rewriting `config.yaml`.

\[`subagents`\] requires a live Matrix tool runtime context with `room_id`, `requester_id`, Matrix client access, and a writable storage path.

\[`openclaw_compat`\] is a config preset, not a runtime toolkit. `Config.expand_tool_names()` expands presets and implied tools while deduping and preserving order. For \[`openclaw_compat`\], that means `matrix_message` is added directly and `attachments` is added indirectly through `Config.IMPLIED_TOOLS`.

## \[`subagents`\]

`subagents` creates and tracks Matrix-backed sub-agent sessions that can continue across multiple tool calls.

### What It Does

`subagents` exposes `agents_list()`, `sessions_spawn()`, `sessions_send()`, and `list_sessions()`. All four calls return JSON strings with a `status` field, a `tool` field, and operation-specific payload data. `agents_list()` returns the configured agent IDs and the current agent name.

`sessions_spawn(task, summary, tag, label=None, agent_id=None)` requires a non-empty task plus a normalized summary and tag. It posts a fresh room-level Matrix message that mentions the target agent, then treats the resulting event ID as the root of a new isolated session thread. After the spawn succeeds, it writes the requested thread summary and tag through the lower-level thread summary and thread tag APIs. If you pass a `label` and the current `(agent_name, room_id, requester_id)` scope already has a matching tracked session, `sessions_spawn()` reuses that session instead of creating a new one and still applies the requested summary and tag to the existing thread. If the post-spawn summary or tag write fails, the spawn still succeeds and the response includes a `warnings` list describing the follow-up failure.

`sessions_send()` sends a follow-up message into an existing tracked session. If you omit `session_key`, it defaults to the current room or thread session key from `create_session_id(room_id, thread_id)`. If you pass `label` without `session_key`, it resolves the most recent in-scope session with that label. If you pass `agent_id`, it prefixes the outgoing message with `@mindroom_<agent_id>` before sending.

Tracked sessions are persisted in `subagents/session_registry.json` under the current runtime storage root. `list_sessions()` paginates those tracked sessions with a default `limit` of 50 and a maximum of 200. Isolated spawned sessions require thread-capable agents: if the target agent uses `thread_mode=room`, `sessions_spawn()` fails and threaded `sessions_send()` calls to that session also fail.
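
The scoped label resolution described above can be sketched like this. The data shapes here are illustrative, not the actual `session_registry.json` schema.

```
# Illustrative sketch: labels resolve only within the
# (agent_name, room_id, requester_id) scope, most recent session wins.
def find_labeled_session(sessions, agent_name, room_id, requester_id, label):
    in_scope = [
        s for s in sessions
        if (s["agent_name"], s["room_id"], s["requester_id"], s.get("label"))
        == (agent_name, room_id, requester_id, label)
    ]
    return max(in_scope, key=lambda s: s["created_at"], default=None)

sessions = [
    {"agent_name": "coordinator", "room_id": "!a", "requester_id": "@u",
     "label": "incident-42", "created_at": 1, "session_key": "s1"},
    {"agent_name": "coordinator", "room_id": "!a", "requester_id": "@u",
     "label": "incident-42", "created_at": 2, "session_key": "s2"},
]
print(find_labeled_session(sessions, "coordinator", "!a", "@u", "incident-42")["session_key"])  # s2
```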

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
agents:
  coordinator:
    display_name: Coordinator
    role: Break work into long-running threaded sub-sessions
    model: sonnet
    tools:
      - subagents
```

```
agents_list()
sessions_spawn(
    task="Review the failing deployment and propose a rollback plan.",
    summary="Investigate the failing deployment and propose a safe rollback plan.",
    tag="incident-rollback",
    label="incident-42",
    agent_id="ops",
)
sessions_send(
    message="Add a short list of commands we should run first.",
    label="incident-42",
)
list_sessions(limit=20)
```

### Notes

- Session tracking is scoped to the current `agent_name`, `room_id`, and `requester_id`, so labels are not global across unrelated conversations.
- `sessions_spawn()` returns normalized `summary` and `tag` values in the success payload and may include `warnings` if the follow-up summary or tag write fails after the session is created.
- Use \[`subagents`\] when you want a continuing Matrix thread that other agents or humans can revisit later.
- Use \[`delegate`\] instead when you want a one-shot specialist answer returned directly as the tool result.

## \[`delegate`\]

`delegate` runs another configured agent as a fresh one-shot specialist and returns that agent's response inline.

### What It Does

`delegate` exposes one tool call, `delegate_task(agent_name, task)`. The delegated agent is created with `create_agent()` and runs independently with no shared session or chat history from the caller. The caller waits for the delegated agent to finish, and the delegated agent's `response.content` becomes the tool result. MindRoom gives the delegated agent any already-published last-good knowledge indexes and schedules missing or stale refresh work in the background. Interactive questions are disabled for delegated runs. Unlike \[`subagents`\], \[`delegate`\] does not create a Matrix thread, does not write to the room timeline, and does not keep a reusable session handle. If `agent_name` is not in the caller's allowed `delegate_to` list, the tool returns an error string. Empty tasks are rejected.
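
The allowlist and depth checks described above can be sketched as follows (illustrative only, not MindRoom source; only the `delegate_to` allowlist and the depth limit of 3 come from the docs).

```
MAX_DELEGATION_DEPTH = 3  # documented hard limit

def check_delegation(target, delegate_to, current_depth):
    # Targets outside the caller's delegate_to list are rejected.
    if target not in delegate_to:
        return f"Error: '{target}' is not in this agent's delegate_to list"
    # Recursive delegation stops at the depth limit.
    if current_depth >= MAX_DELEGATION_DEPTH:
        return "Error: maximum delegation depth reached"
    return "ok"
```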

### Configuration

This tool has no tool-specific inline configuration fields. Enable it by setting `delegate_to` on the agent config. MindRoom adds the tool automatically when `delegate_to` is non-empty, so listing `delegate` in `tools:` is usually unnecessary.

### Example

```
agents:
  lead:
    display_name: Lead
    role: Coordinate specialist agents
    model: sonnet
    delegate_to:
      - code
      - research

  code:
    display_name: Code
    role: Implement and debug code changes
    model: sonnet
    tools:
      - coding
      - shell

  research:
    display_name: Research
    role: Gather sources and summarize findings
    model: sonnet
    tools:
      - duckduckgo
```

```
delegate_task(
    agent_name="research",
    task="Summarize the three main risks in this proposal and cite supporting facts.",
)
```

### Notes

- `Config.validate_delegate_to()` rejects self-delegation and unknown target agents at config-load time.
- Recursive delegation is supported, but only up to a maximum depth of 3.
- Use \[`subagents`\] when you need an ongoing threaded workflow.
- Use \[`delegate`\] when you need a synchronous specialist answer inside the current run.

## \[`config_manager`\]

`config_manager` is the broad config-authoring toolkit for inspecting MindRoom and creating or updating agents and teams.

### What It Does

`config_manager` exposes `get_info()`, `manage_agent()`, and `manage_team()`. `get_info(info_type, name=None)` supports `mindroom_docs`, `config_schema`, `available_models`, `agents`, `teams`, `available_tools`, `tool_details`, `agent_config`, and `agent_template`. `tool_details` requires `name` and reads from live `TOOL_METADATA`, so it includes real config fields and statuses from the current worktree. `agent_config` returns the authored YAML for a specific agent. `agent_template` generates starter YAML for one of the built-in template types: `researcher`, `developer`, `social`, `communicator`, `analyst`, or `productivity`.

`manage_agent()` supports `create`, `update`, and `validate`. Agent creates and updates validate tool names against the live registry and validate knowledge base IDs against the current config. When a plain string tool list replaces an existing tool list, `config_manager` preserves inline overrides for retained tools instead of flattening them away. On create, `include_default_tools` falls back to `true` when you omit it.

`manage_team()` creates a new team with `coordinate` or `collaborate` mode and rejects unknown member agents or duplicate team names. All writes go through full runtime config validation before `config.yaml` is saved.

### Configuration

This tool has no tool-specific inline configuration fields.

### Example

```
agents:
  builder:
    display_name: Builder
    role: Create and maintain MindRoom agents and teams
    model: sonnet
    tools:
      - config_manager
```

```
get_info("available_tools")
get_info("tool_details", name="claude_agent")
manage_agent(
    operation="create",
    agent_name="triage",
    display_name="Triage",
    role="Sort incoming requests and hand them to the right specialist.",
    tools=["duckduckgo", "subagents"],
    model="default",
    rooms=["lobby"],
)
manage_team(
    team_name="incident_team",
    display_name="Incident Team",
    role="Coordinate incident response across ops and code agents.",
    agents=["ops", "code"],
    mode="coordinate",
)
```

### Notes

- \[`config_manager`\] is broader and more privileged than \[`self_config`\] because it can inspect and modify other agents and teams.
- `manage_team()` creates teams, but it does not expose a separate update operation on this branch.
- Use [Agent Configuration](https://docs.mindroom.chat/configuration/agents/index.md) for the full authored schema outside the tool's curated helper surface.

## \[`self_config`\]

`self_config` lets an agent inspect and update only its own config entry.

### What It Does

`self_config` exposes `get_own_config()` and `update_own_config()`. `get_own_config()` returns the current agent's authored YAML block. `update_own_config()` only changes fields that you pass explicitly. On this branch, `update_own_config()` can modify `display_name`, `role`, `instructions`, `tools`, `model`, `rooms`, `markdown`, `learning`, `learning_mode`, `knowledge_bases`, `skills`, `include_default_tools`, `show_tool_calls`, `thread_mode`, `num_history_runs`, `num_history_messages`, `compress_tool_results`, `max_tool_calls_from_history`, and `context_files`.

The update path validates tool names against the live registry and validates knowledge base IDs against the current config. It also preserves inline tool overrides for retained tools when a string-only tool list is provided. Updates are validated through `AgentConfig.model_validate()` before the file is saved. Only the current agent can be changed. There is no path to modify other agents or teams through this tool.

### Configuration

This tool has no tool-specific inline configuration fields. The normal way to enable it is `agents.<name>.allow_self_config: true` or `defaults.allow_self_config: true`.

### Example

```
defaults:
  allow_self_config: false

agents:
  research:
    display_name: Research
    role: Research and summarize external sources
    model: sonnet
    allow_self_config: true
    tools:
      - duckduckgo
      - wikipedia
```

```
get_own_config()
update_own_config(
    instructions=[
        "Cite sources for factual claims.",
        "Prefer concise summaries with clear takeaways.",
    ],
    tools=["duckduckgo", "wikipedia", "subagents"],
    thread_mode="room",
    context_files=["SOUL.md", "USER.md"],
)
```

### Notes

- `self_config` blocks privileged self-escalation by rejecting `config_manager` in its `tools` update list.
- `include_default_tools=True` is also rejected when `defaults.tools` contains blocked privileged tools such as `config_manager`.
- Use \[`self_config`\] for narrow self-tuning at runtime and \[`config_manager`\] for full config-authoring workflows.

## \[`openclaw_compat`\]

`openclaw_compat` is a config-only preset for OpenClaw-style workspace portability.

### What It Does

`openclaw_compat` is not a runtime toolkit. The registered factory returns an empty `Toolkit`, and the real behavior comes from `Config.TOOL_PRESETS`. `Config.expand_tool_names()` expands `openclaw_compat` into `shell`, `coding`, `duckduckgo`, `website`, `browser`, `scheduler`, `subagents`, and `matrix_message`. `matrix_message` then implies `attachments`, so the effective enabled set also includes `attachments` even though the preset does not list it directly. Preset expansion dedupes while preserving order, so adding `openclaw_compat` alongside one of its member tools does not create duplicates. This preset is meant for OpenClaw-compatible workspace behavior inside MindRoom rather than for cloning the full OpenClaw gateway control plane.
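
The expansion rules above can be sketched as a small order-preserving dedup pass. This is an illustrative sketch, not `Config.expand_tool_names()` itself; the `PRESETS` and `IMPLIES` tables are hand-written from the membership and implication documented in this section.

```python
# Illustrative sketch of preset expansion: members are emitted in order,
# duplicates are skipped, and matrix_message pulls in attachments.

PRESETS = {
    "openclaw_compat": [
        "shell", "coding", "duckduckgo", "website",
        "browser", "scheduler", "subagents", "matrix_message",
    ],
}
IMPLIES = {"matrix_message": ["attachments"]}

def expand_tool_names(names):
    seen, out = set(), []
    for name in names:
        for member in PRESETS.get(name, [name]):
            if member not in seen:
                seen.add(member)
                out.append(member)
            for implied in IMPLIES.get(member, []):
                if implied not in seen:
                    seen.add(implied)
                    out.append(implied)
    return out

# A member tool listed alongside the preset does not create a duplicate.
print(expand_tool_names(["duckduckgo", "openclaw_compat"]))
```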

### Configuration

This preset has no inline configuration fields.

### Example

```
agents:
  openclaw:
    display_name: OpenClawAgent
    role: OpenClaw-style personal assistant with a file-first workspace
    model: opus
    include_default_tools: false
    learning: false
    memory_backend: file
    context_files:
      - SOUL.md
      - AGENTS.md
      - USER.md
      - IDENTITY.md
      - TOOLS.md
      - HEARTBEAT.md
    tools:
      - openclaw_compat
      - python
```

### Notes

- \[`openclaw_compat`\] is a preset name that belongs in `tools:` but does not expose callable runtime methods of its own.
- Use the dedicated [OpenClaw Workspace Import](https://docs.mindroom.chat/openclaw/index.md) guide for workspace layout, file memory behavior, and migration details.
- If you only need one or two of the member tools, configure those tools directly instead of using the preset.

## \[`claude_agent`\]

`claude_agent` keeps persistent Claude Agent SDK coding sessions alive across turns and exposes explicit session lifecycle controls.

### What It Does

`claude_agent` exposes `claude_start_session()`, `claude_send()`, `claude_session_status()`, `claude_interrupt()`, and `claude_end_session()`. `claude_send()` automatically creates the session if it does not already exist, so `claude_start_session()` is optional. Session keys are namespaced by agent identity and Agno run session ID, with optional `session_label` suffixes for parallel sub-sessions. Calls that share a session key are serialized by an `asyncio.Lock`, so concurrent calls to one label run one after the other. Different `session_label` values create distinct Claude sessions that can proceed independently. Idle sessions expire after `session_ttl_minutes`, which defaults to 60 minutes. The process-wide session manager keeps at most `max_sessions` active sessions per agent namespace, defaulting to 200. `resume` and `fork_session` only apply when creating a new session. `fork_session=True` requires a non-empty `resume` session ID. If a session already exists for the computed key, passing `resume` or `fork_session` returns an error instead of silently changing the live session. `claude_session_status()` reports age, idle time, and the underlying Claude session ID once Claude has returned a result. On SDK failures, the tool includes recent Claude CLI stderr lines in its error output to help debug gateway or CLI issues.
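
The locking model can be sketched with a per-key `asyncio.Lock`. This is an illustrative stub, not the actual session manager: `claude_send_stub` and the session-key strings are assumptions, and `asyncio.sleep` stands in for real SDK work.

```python
# Illustrative sketch: calls sharing one session key run one after the
# other, while different session labels proceed independently.
import asyncio
from collections import defaultdict

locks: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)
events: list[str] = []

async def claude_send_stub(session_key: str, prompt: str) -> None:
    async with locks[session_key]:   # one in-flight call per session key
        events.append(f"start:{session_key}")
        await asyncio.sleep(0.01)    # stand-in for Claude SDK work
        events.append(f"end:{session_key}")

async def main() -> None:
    await asyncio.gather(
        claude_send_stub("agent:run:bugfix", "a"),
        claude_send_stub("agent:run:bugfix", "b"),  # waits for "a"
        claude_send_stub("agent:run:docs", "c"),    # runs concurrently
    )

asyncio.run(main())
```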

### Configuration

| Option                       | Type       | Required | Default   | Notes                                                                                                                               |
| ---------------------------- | ---------- | -------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------- |
| `api_key`                    | `password` | `no`     | `null`    | Anthropic API key or gateway-compatible key material. Usually stored in credentials JSON or dashboard setup instead of inline YAML. |
| `anthropic_base_url`         | `url`      | `no`     | `null`    | Optional Anthropic-compatible gateway root URL. Use the host root, not a `/v1` suffix.                                              |
| `anthropic_auth_token`       | `password` | `no`     | `null`    | Optional bearer token for Anthropic-compatible gateways.                                                                            |
| `disable_experimental_betas` | `boolean`  | `no`     | `false`   | Sets `CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1` for gateway compatibility.                                                          |
| `cwd`                        | `text`     | `no`     | `null`    | Working directory passed to the Claude Agent SDK client.                                                                            |
| `model`                      | `text`     | `no`     | `null`    | Claude model override. When omitted, the tool falls back to the current agent model ID when one is available.                       |
| `permission_mode`            | `text`     | `no`     | `default` | One of `default`, `acceptEdits`, `plan`, or `bypassPermissions`. Invalid values fall back to `default`.                             |
| `continue_conversation`      | `boolean`  | `no`     | `false`   | Continue the same Claude conversation context across queries in one session.                                                        |
| `allowed_tools`              | `text`     | `no`     | `null`    | Comma-separated Claude Code tool names to allow.                                                                                    |
| `disallowed_tools`           | `text`     | `no`     | `null`    | Comma-separated Claude Code tool names to deny.                                                                                     |
| `max_turns`                  | `number`   | `no`     | `null`    | Maximum Claude turns per query. Values below 1 are normalized up to 1.                                                              |
| `system_prompt`              | `text`     | `no`     | `null`    | Extra system prompt passed directly to the Claude Agent SDK.                                                                        |
| `cli_path`                   | `text`     | `no`     | `null`    | Optional path to the Claude CLI executable.                                                                                         |
| `session_ttl_minutes`        | `number`   | `no`     | `60`      | Idle-session expiration window in minutes. Values below 1 are normalized up to 1.                                                   |
| `max_sessions`               | `number`   | `no`     | `200`     | Maximum live sessions per agent namespace. Values below 1 are normalized up to 1.                                                   |

### Example

```
agents:
  code:
    display_name: Code Agent
    role: Coding assistant with persistent Claude sessions
    model: default
    tools:
      - claude_agent:
          model: claude-sonnet-4-6
          cwd: /workspace/project
          permission_mode: acceptEdits
          continue_conversation: true
          session_ttl_minutes: 180
          max_sessions: 20
```

```
{
  "api_key": "sk-ant-or-proxy-key",
  "model": "claude-sonnet-4-6",
  "permission_mode": "default",
  "continue_conversation": true,
  "session_ttl_minutes": 60,
  "max_sessions": 200
}
```

```
{
  "api_key": "sk-dummy",
  "anthropic_base_url": "http://litellm.local",
  "anthropic_auth_token": "sk-dummy",
  "disable_experimental_betas": true
}
```

```
claude_send(
    prompt="Refactor the failing test and explain the diff.",
    session_label="bugfix",
)
claude_session_status(session_label="bugfix")
claude_interrupt(session_label="bugfix")
claude_end_session(session_label="bugfix")
```

### Notes

- Dashboard setup and `mindroom_data/credentials/claude_agent_credentials.json` both feed the same tool credential fields, because runtime credentials are stored as `<service>_credentials.json`.
- For Anthropic-compatible gateways, set `anthropic_base_url` to the gateway root without `/v1`, because the Claude client appends its own API path.
- Some gateways reject Claude beta headers, so `disable_experimental_betas: true` is the compatibility switch for that case.
- When you use MindRoom's OpenAI-compatible API, keep the same `X-Session-Id` across requests so the same Claude session key is reused.
- See [OpenAI-Compatible API](https://docs.mindroom.chat/openai-api/index.md) for request-level session continuity details.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Agent Configuration](https://docs.mindroom.chat/configuration/agents/index.md)
- [OpenClaw Workspace Import](https://docs.mindroom.chat/openclaw/index.md)
- [OpenAI-Compatible API](https://docs.mindroom.chat/openai-api/index.md)

# Automation & Platforms

Use these tools to manage AWS-backed automation, edit Airflow DAG files, run code in hosted sandboxes, call arbitrary HTTP APIs, and bridge into broader integration platforms.

## What This Page Covers

This page documents the built-in tools in the `automation-and-platforms` group. Use these tools when you need infrastructure automation, generic API access, remote execution environments, or platform-level integration brokers.

## Tools On This Page

- \[`aws_lambda`\] - AWS Lambda function listing and invocation.
- \[`aws_ses`\] - Amazon SES outbound email sending.
- \[`airflow`\] - Local Airflow DAG file reads and writes.
- \[`e2b`\] - Hosted E2B code-execution sandbox with file, command, and server helpers.
- \[`daytona`\] - Persistent remote sandbox and dev-environment execution.
- \[`composio`\] - Dynamic Composio-backed integration toolset for connected external apps.
- \[`custom_api`\] - Generic HTTP requests to arbitrary APIs.

## Common Setup Notes

- All seven tools on this page default to the primary agent runtime instead of MindRoom's worker-routed execution set.
- `aws_lambda`, `airflow`, and `custom_api` are registered as `setup_type: none`, while `aws_ses`, `e2b`, `daytona`, and `composio` are marked `requires_config`.
- `src/mindroom/api/integrations.py` currently only exposes Spotify OAuth routes on this branch, so none of the tools on this page have a dedicated MindRoom OAuth flow.
- Password fields such as `api_key` and `password` must be stored through the dashboard or credential store instead of inline YAML.
- `aws_lambda` and `aws_ses` both rely on standard boto3 credential resolution, so normal AWS environment variables, shared config files, or instance-role credentials are the real authentication path. That matters especially for `aws_ses`, because the current registry marks it as `setup_type: api_key` even though the tool itself does not expose an API-key field.
- `e2b` accepts `api_key` inline from stored credentials or falls back to `E2B_API_KEY`. `daytona` accepts stored credentials or environment fallback through `DAYTONA_API_KEY`, and `api_url` can also fall back to `DAYTONA_API_URL`. `composio` can fall back to cached Composio user data or `COMPOSIO_API_KEY` when `api_key` is not stored directly.
- Several fields on this page are advanced raw constructor inputs rather than friendly hand-authored YAML values, including `sandbox_options`, `sandbox_env_vars`, `sandbox_labels`, `workspace_config`, `connected_account_ids`, `metadata`, `processors`, and `headers`.
- Missing optional dependencies can auto-install at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set.

## \[`aws_lambda`\]

`aws_lambda` lists Lambda functions and invokes them in the configured AWS region.

### What It Does

`aws_lambda` exposes `list_functions()` and `invoke_function(function_name, payload="{}")`. The toolkit constructs a boto3 Lambda client at init time with `region_name`. `invoke_function()` passes the payload through to boto3 as a string and returns the Lambda status code plus the decoded response payload. `list_functions()` currently makes one direct `list_functions()` call rather than using a paginator, so very large accounts may need a richer AWS-specific path than this thin wrapper.
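
Because the wrapper forwards `payload` verbatim as a string, structured requests should be serialized first. A minimal sketch, assuming a hypothetical `build_payload` helper (not part of the tool):

```python
# Illustrative sketch: serialize a structured request to a JSON string
# before handing it to invoke_function(), which forwards it as-is.
import json

def build_payload(**fields) -> str:
    return json.dumps(fields)

payload = build_payload(date="2026-03-31", dry_run=False)
print(payload)
# then: invoke_function("daily-report", payload=payload)
```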

### Configuration

| Option                   | Type      | Required | Default     | Notes                                                      |
| ------------------------ | --------- | -------- | ----------- | ---------------------------------------------------------- |
| `region_name`            | `text`    | `no`     | `us-east-1` | AWS region used when constructing the boto3 Lambda client. |
| `enable_list_functions`  | `boolean` | `no`     | `true`      | Enable `list_functions()`.                                 |
| `enable_invoke_function` | `boolean` | `no`     | `true`      | Enable `invoke_function()`.                                |
| `all`                    | `boolean` | `no`     | `false`     | Enable the full upstream Lambda toolkit surface.           |

### Example

```
agents:
  automation:
    tools:
      - aws_lambda:
          region_name: us-west-2
```

```
list_functions()
invoke_function("daily-report", payload='{"date": "2026-03-31"}')
```

### Notes

- Configure AWS credentials through the standard boto3 chain rather than expecting a MindRoom-specific key field.
- The payload is a raw string in the upstream wrapper, so JSON requests should be serialized before invocation.
- Use this tool for simple invocation workflows, not full Lambda administration.

## \[`aws_ses`\]

`aws_ses` sends plain-text outbound email through Amazon SES with a configured sender identity.

### What It Does

`aws_ses` exposes `send_email(subject, body, receiver_email)`. The toolkit builds a boto3 SES client with `region_name` and then sends a plain-text message from `"{sender_name} <{sender_email}>"`. It validates that `subject` and `body` are non-empty before sending. The current wrapper does not add HTML email support, templates, attachments, or richer SES delivery controls on top of the basic send call.

### Configuration

| Option              | Type      | Required | Default     | Notes                                                                                                          |
| ------------------- | --------- | -------- | ----------- | -------------------------------------------------------------------------------------------------------------- |
| `sender_email`      | `text`    | `no`     | `null`      | Sender email address used in the SES `Source` header. This is effectively required in practice.                |
| `sender_name`       | `text`    | `no`     | `null`      | Display name used in the SES `Source` header. Set this together with `sender_email` for a clean sender format. |
| `region_name`       | `text`    | `no`     | `us-east-1` | AWS region used when constructing the boto3 SES client.                                                        |
| `enable_send_email` | `boolean` | `no`     | `true`      | Enable `send_email()`.                                                                                         |
| `all`               | `boolean` | `no`     | `false`     | Enable the full upstream SES toolkit surface.                                                                  |

### Example

```
agents:
  notifications:
    tools:
      - aws_ses:
          sender_email: alerts@example.com
          sender_name: MindRoom Alerts
          region_name: us-east-1
```

```
send_email(
    subject="Nightly sync complete",
    body="The nightly sync finished successfully.",
    receiver_email="ops@example.com",
)
```

### Notes

- Despite the current `requires_config` and `api_key` registry metadata, authentication still comes from the standard boto3 AWS credential chain.
- Verify the sender identity in SES before relying on this tool for real mail delivery.
- This wrapper only sends plain-text email.

## \[`airflow`\]

`airflow` is a local DAG-file helper for reading and writing Airflow Python files.

### What It Does

`airflow` exposes `save_dag_file(contents, dag_file)` and `read_dag_file(dag_file)`. If `dags_dir` is a string, the upstream toolkit resolves it relative to the current working directory at tool initialization time. `save_dag_file()` creates missing parent directories before writing the target DAG file. This tool manages DAG source files only. It does not talk to the Airflow scheduler, trigger DAG runs, inspect task state, or call the Airflow REST API.
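
The path behavior can be sketched with `pathlib`. This is an illustrative sketch rather than the upstream toolkit's code: a string `dags_dir` resolves against the current working directory, and the save path creates missing parent directories before writing.

```python
# Illustrative sketch of save_dag_file() path handling (not upstream code).
from pathlib import Path

def save_dag_file(dags_dir: str, contents: str, dag_file: str) -> Path:
    base = Path(dags_dir)
    if not base.is_absolute():
        base = Path.cwd() / base  # resolved relative to cwd, as at init time
    target = base / dag_file
    target.parent.mkdir(parents=True, exist_ok=True)  # create missing dirs
    target.write_text(contents)
    return target

path = save_dag_file("dags", "from airflow import DAG\n", "generated/new_job.py")
print(path)
```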

### Configuration

| Option                 | Type      | Required | Default | Notes                                                                                                    |
| ---------------------- | --------- | -------- | ------- | -------------------------------------------------------------------------------------------------------- |
| `dags_dir`             | `text`    | `no`     | `null`  | Base directory for DAG files, resolved relative to the current working directory when given as a string. |
| `enable_save_dag_file` | `boolean` | `no`     | `true`  | Enable `save_dag_file()`.                                                                                |
| `enable_read_dag_file` | `boolean` | `no`     | `true`  | Enable `read_dag_file()`.                                                                                |
| `all`                  | `boolean` | `no`     | `false` | Enable the full upstream Airflow toolkit surface.                                                        |

### Example

```
agents:
  airflow_editor:
    tools:
      - airflow:
          dags_dir: dags
```

```
read_dag_file("daily_reporting.py")
save_dag_file("from airflow import DAG\n", "generated/new_job.py")
```

### Notes

- Use `airflow` when the job is editing DAG source files, not when you need live Airflow control-plane access.
- `dags_dir` is not a MindRoom-managed workspace root like `base_dir` on some local execution tools.
- Keep the configured directory aligned with the filesystem path your Airflow deployment actually watches.

## \[`e2b`\]

`e2b` provides a hosted code-execution sandbox with Python execution, file transfer, command execution, and temporary public URLs.

### What It Does

`e2b` requires an API key from stored credentials or `E2B_API_KEY`. The toolkit creates one E2B sandbox at initialization time and reuses it for subsequent tool calls from that toolkit instance. It exposes `run_python_code()`, `upload_file()`, `download_png_result()`, `download_chart_data()`, `download_file_from_sandbox()`, `run_command()`, `stream_command()`, `run_background_command()`, `kill_background_command()`, `list_files()`, `read_file_content()`, `write_file_content()`, `watch_directory()`, `get_public_url()`, `run_server()`, `set_sandbox_timeout()`, `get_sandbox_status()`, `shutdown_sandbox()`, and `list_running_sandboxes()`. The media helpers operate on the most recent `run_python_code()` result, which is why chart and PNG download flows are companion actions instead of standalone reads. `timeout` is passed into `Sandbox.create(...)`, and `sandbox_options` is unpacked directly into that constructor as keyword arguments.

### Configuration

| Option            | Type       | Required | Default | Notes                                                                                                                                                |
| ----------------- | ---------- | -------- | ------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
| `api_key`         | `password` | `no`     | `null`  | E2B API key. The tool also falls back to `E2B_API_KEY`.                                                                                              |
| `timeout`         | `number`   | `no`     | `300`   | Sandbox timeout in seconds passed into `Sandbox.create(...)`.                                                                                        |
| `sandbox_options` | `text`     | `no`     | `null`  | Advanced raw sandbox-constructor options. The upstream constructor expects a dict-like object, while current MindRoom metadata exposes this as text. |

### Example

```
agents:
  remote_exec:
    tools:
      - e2b:
          timeout: 600
```

```
run_python_code("print('hello from e2b')")
upload_file("data/report.csv", "workspace/report.csv")
run_server("python -m http.server 8000", port=8000)
```

### Notes

- The tool fails fast if no API key is available or if sandbox creation fails during initialization.
- `sandbox_options` is mainly useful for advanced programmatic setup rather than normal handwritten YAML.
- Use `e2b` when you want a cloud code interpreter with file and server helpers, not just a single command runner.

## \[`daytona`\]

`daytona` runs code and shell commands in a remote sandbox that can persist across agent sessions.

### What It Does

`daytona` requires an API key from stored credentials or `DAYTONA_API_KEY`. `api_url` can also fall back to `DAYTONA_API_URL`. The toolkit exposes `run_code()`, `run_shell_command()`, `create_file()`, `read_file()`, `list_files()`, `delete_file()`, and `change_directory()`. When `persistent` is true, the tool stores the active sandbox ID in `agent.session_state` and tries to reuse that sandbox on later calls. It also tracks a working directory in `agent.session_state`, and `run_shell_command()` treats `cd ...` specially so later relative-path commands and file operations stay in that directory. If no reusable sandbox exists, the toolkit creates one automatically unless `auto_create_sandbox` is disabled. The bundled default instructions describe a code-write, execute, and show-results workflow, but those instructions are only added to the agent prompt when `add_instructions: true`.
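
The working-directory tracking can be sketched in simplified form. This is an illustrative sketch, not the real toolkit: the in-memory `session_state` dict and the `/workspace` starting directory are assumptions standing in for `agent.session_state`.

```python
# Illustrative sketch: "cd ..." updates tracked session state instead of
# being executed, so later relative paths resolve against that directory.
import posixpath

session_state = {"cwd": "/workspace"}

def run_shell_command(command: str) -> str:
    cwd = session_state["cwd"]
    if command.startswith("cd "):
        target = command[3:].strip()
        session_state["cwd"] = (
            target if target.startswith("/")
            else posixpath.normpath(posixpath.join(cwd, target))
        )
        return f"cwd is now {session_state['cwd']}"
    return f"(would run {command!r} in {cwd})"

print(run_shell_command("cd project"))  # updates tracked cwd only
print(run_shell_command("ls -la"))      # would run in /workspace/project
```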

### Configuration

| Option                | Type       | Required | Default  | Notes                                                                                                                                                                 |
| --------------------- | ---------- | -------- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `api_key`             | `password` | `no`     | `null`   | Daytona API key. The tool also falls back to `DAYTONA_API_KEY`.                                                                                                       |
| `api_url`             | `url`      | `no`     | `null`   | Daytona API URL. The tool also falls back to `DAYTONA_API_URL`.                                                                                                       |
| `sandbox_id`          | `text`     | `no`     | `null`   | Explicit sandbox ID to reuse instead of creating or looking up a session-backed sandbox.                                                                              |
| `sandbox_language`    | `text`     | `no`     | `PYTHON` | Primary sandbox language, such as `PYTHON`, `JAVASCRIPT`, or `TYPESCRIPT`.                                                                                            |
| `sandbox_target`      | `text`     | `no`     | `null`   | Daytona target passed into `DaytonaConfig`.                                                                                                                           |
| `sandbox_os`          | `text`     | `no`     | `null`   | Declared sandbox OS field. The current creation path stores this value but does not pass it into `CreateSandboxFromSnapshotParams`.                                   |
| `auto_stop_interval`  | `number`   | `no`     | `60`     | Auto-stop interval in minutes for created sandboxes.                                                                                                                  |
| `sandbox_os_user`     | `text`     | `no`     | `null`   | OS user for the sandbox.                                                                                                                                              |
| `sandbox_env_vars`    | `text`     | `no`     | `null`   | Advanced raw environment-variable mapping for the sandbox. The upstream constructor expects a dict-like object, while current MindRoom metadata exposes this as text. |
| `sandbox_labels`      | `text`     | `no`     | `{}`     | Advanced raw label mapping for the sandbox. The upstream constructor expects a dict-like object, while current MindRoom metadata exposes this as text.                |
| `organization_id`     | `text`     | `no`     | `null`   | Daytona organization ID.                                                                                                                                              |
| `timeout`             | `number`   | `no`     | `300`    | Timeout in seconds for sandbox operations.                                                                                                                            |
| `auto_create_sandbox` | `boolean`  | `no`     | `true`   | Create a new sandbox automatically when lookup or reuse fails.                                                                                                        |
| `verify_ssl`          | `boolean`  | `no`     | `false`  | Verify Daytona SSL certificates. The default `false` path monkey-patches the Daytona client to disable SSL verification warnings and checks.                          |
| `persistent`          | `boolean`  | `no`     | `true`   | Reuse the same sandbox across agent-session calls instead of creating a fresh sandbox each time.                                                                      |
| `sandbox_public`      | `boolean`  | `no`     | `null`   | Whether created sandboxes should be public.                                                                                                                           |
| `instructions`        | `text`     | `no`     | `null`   | Custom toolkit instructions that replace the bundled default instructions.                                                                                            |
| `add_instructions`    | `boolean`  | `no`     | `false`  | Add the toolkit instructions into the agent prompt.                                                                                                                   |

### Example

```
agents:
  remote_dev:
    tools:
      - daytona:
          api_url: https://api.daytona.io
          sandbox_language: PYTHON
          auto_stop_interval: 30
          persistent: true
          add_instructions: true
```

```
run_code("print('hello from daytona')")
run_shell_command("pwd && ls -la")
change_directory("project")
create_file("main.py", "print('ok')")
```

### Notes

- `sandbox_env_vars` and `sandbox_labels` are advanced constructor inputs rather than convenient hand-authored YAML fields.
- `verify_ssl: false` is not a cosmetic flag here, because the current implementation actively patches the Daytona client to skip certificate verification.
- Use `sandbox_id` when you want to pin the tool to a known sandbox instead of letting session-state reuse choose one.

## \[`composio`\]

`composio` is a dynamic bridge into Composio's connected-app ecosystem rather than a fixed list of built-in actions.

### What It Does

The registered MindRoom tool instantiates `composio_agno.ComposioToolSet`. That upstream object does not expose a stable fixed method list like `aws_lambda` or `custom_api`. Instead, its main surface is `get_tools(actions=..., apps=..., tags=...)`, which wraps selected Composio actions into Agno `Toolkit` objects at runtime. The resulting callable tools therefore depend on your Composio workspace, connected accounts, and action selection rather than a static MindRoom-defined function list. MindRoom's current registry metadata on this branch documents the connection and workspace fields, but it does not expose separate per-agent app or action filter fields in `config.yaml`.

### Configuration

| Option                  | Type       | Required | Default   | Notes                                                                                                                                                                  |
| ----------------------- | ---------- | -------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `api_key`               | `password` | `no`     | `null`    | Composio API key. The SDK can also fall back to cached Composio user data or `COMPOSIO_API_KEY`.                                                                       |
| `base_url`              | `url`      | `no`     | `null`    | Optional Composio API base URL.                                                                                                                                        |
| `entity_id`             | `text`     | `no`     | `default` | Composio entity identifier used when executing actions.                                                                                                                |
| `workspace_id`          | `text`     | `no`     | `null`    | Optional Composio workspace identifier.                                                                                                                                |
| `workspace_config`      | `text`     | `no`     | `null`    | Advanced raw workspace configuration. The upstream constructor expects a workspace-config object, while current MindRoom metadata exposes this as text.                |
| `connected_account_ids` | `text`     | `no`     | `null`    | Advanced raw mapping of app names to connected account IDs. The upstream constructor expects a dict-like object, while current MindRoom metadata exposes this as text. |
| `metadata`              | `text`     | `no`     | `null`    | Advanced raw metadata mapping used by Composio actions and processors.                                                                                                 |
| `processors`            | `text`     | `no`     | `null`    | Advanced raw processor mapping for request, response, or schema hooks.                                                                                                 |
| `output_dir`            | `text`     | `no`     | `null`    | Optional output directory for file-based results.                                                                                                                      |
| `lockfile`              | `text`     | `no`     | `null`    | Optional lockfile path for action-version locking and concurrency control.                                                                                             |
| `max_retries`           | `number`   | `no`     | `3`       | Maximum retries for failed Composio operations.                                                                                                                        |
| `verbosity_level`       | `number`   | `no`     | `null`    | Optional verbosity level.                                                                                                                                              |
| `output_in_file`        | `boolean`  | `no`     | `false`   | Write operation output to files instead of only returning it directly.                                                                                                 |
| `allow_tracing`         | `boolean`  | `no`     | `false`   | Enable tracing support for debugging.                                                                                                                                  |
| `lock`                  | `boolean`  | `no`     | `true`    | Enable lockfile-based coordination.                                                                                                                                    |
| `logging_level`         | `text`     | `no`     | `INFO`    | Composio logging level such as `DEBUG`, `INFO`, `WARNING`, `ERROR`, or `CRITICAL`.                                                                                     |

### Example

```
agents:
  integrations:
    tools:
      - composio:
          entity_id: default
          workspace_id: workspace_123
          logging_level: INFO
```

### Notes

- The resulting callable actions are dynamic and depend on the Composio workspace rather than on fixed MindRoom-defined function names.
- `workspace_config`, `connected_account_ids`, `metadata`, and `processors` are advanced constructor inputs; the current metadata model exposes them as plain text, so they are awkward to author by hand in YAML.
- Use `composio` when you want one external platform to broker many app integrations instead of configuring each app-specific tool directly in MindRoom.

## `custom_api`

`custom_api` is the generic escape hatch for making HTTP requests to APIs that do not have a dedicated MindRoom tool.

### What It Does

`custom_api` exposes `make_request(endpoint, method="GET", params=None, data=None, headers=None, json_data=None)`. If `base_url` is set, the tool joins it with the passed endpoint. If `username` and `password` are configured, the request uses HTTP Basic Auth. If `api_key` is configured, the tool adds `Authorization: Bearer <api_key>` to the default headers. Per-call headers are merged on top of configured default headers. The response body is parsed as JSON when possible and otherwise returned as plain text inside a JSON envelope with `status_code`, response `headers`, and `data`. Non-2xx responses still return a structured result object, with an added `"error": "Request failed"` field.
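
The header-merging and response-envelope rules above can be sketched as pure functions. This is an illustrative model of the described behavior, not the actual wrapper code:

```python
def build_headers(default_headers, api_key=None, per_call=None):
    # Configured default headers come first; a bearer token is added when
    # api_key is set; per-call headers are merged last, so they win on conflicts.
    headers = dict(default_headers or {})
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    headers.update(per_call or {})
    return headers

def envelope(status_code, headers, body):
    # Non-2xx responses still return a structured result object,
    # with an added "error" marker.
    result = {"status_code": status_code, "headers": headers, "data": body}
    if not (200 <= status_code < 300):
        result["error"] = "Request failed"
    return result
```

For example, a per-call `X-A` header overrides a configured default `X-A`, and a 404 response still comes back as a structured envelope rather than an exception.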

### Configuration

| Option                | Type       | Required | Default | Notes                                                                                                                                           |
| --------------------- | ---------- | -------- | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| `base_url`            | `url`      | `no`     | `null`  | Base URL joined with the `endpoint` argument when set.                                                                                          |
| `username`            | `text`     | `no`     | `null`  | Optional HTTP Basic Auth username.                                                                                                              |
| `password`            | `password` | `no`     | `null`  | Optional HTTP Basic Auth password stored through the dashboard or credential store.                                                             |
| `api_key`             | `password` | `no`     | `null`  | Optional bearer token stored through the dashboard or credential store.                                                                         |
| `headers`             | `text`     | `no`     | `null`  | Advanced raw default-header mapping. The upstream constructor expects a dict-like object, while current MindRoom metadata exposes this as text. |
| `verify_ssl`          | `boolean`  | `no`     | `true`  | Verify SSL certificates for outgoing HTTPS requests.                                                                                            |
| `timeout`             | `number`   | `no`     | `30`    | Request timeout in seconds.                                                                                                                     |
| `enable_make_request` | `boolean`  | `no`     | `true`  | Enable `make_request()`.                                                                                                                        |
| `all`                 | `boolean`  | `no`     | `false` | Enable the full upstream custom-API toolkit surface.                                                                                            |

### Example

```
agents:
  api_bridge:
    tools:
      - custom_api:
          base_url: https://api.example.com/v1
          verify_ssl: true
          timeout: 20
```

```
make_request("health")
make_request("users/42", method="GET")
make_request("reports", method="POST", json_data={"range": "7d"})
```

### Notes

- If `base_url` is omitted, `endpoint` must be a full URL.
- If both Basic Auth and `api_key` are configured, the request sends both the `Authorization: Bearer ...` header and the Basic Auth credentials; the wrapper does not treat them as mutually exclusive.
- `headers` is an advanced constructor input; the current metadata model exposes it as plain text rather than a structured mapping, so it is awkward to author by hand in YAML.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Execution & Coding](https://docs.mindroom.chat/tools/execution-and-coding/index.md)
- [Project Management](https://docs.mindroom.chat/tools/project-management/index.md)
- [MCP](https://docs.mindroom.chat/mcp/index.md)

# Location, Commerce, & Home

Use these tools to search places, fetch weather data, analyze Shopify stores, and control Home Assistant devices.

## What This Page Covers

This page documents the built-in tools in the `location-commerce-and-home` group. Use these tools when you need physical-world lookup data, store analytics, or smart home control.

## Tools On This Page

- `google_maps` - Google Maps place search, directions, geocoding, address validation, elevation, and timezone lookups.
- `openweather` - Current weather, multi-day forecast, air pollution, and location geocoding from OpenWeather.
- `shopify` - Shopify Admin API analytics for shop info, products, orders, customers, inventory, and sales trends.
- `homeassistant` - Home Assistant entity state queries, device control, scene activation, and generic service calls.

## Common Setup Notes

All four tools on this page are `requires_config`, so they only become available once the needed credentials or integration setup is present.

- MindRoom validates inline overrides against the declared `config_fields`, and `type="password"` fields such as `key`, `api_key`, `access_token`, and `HOMEASSISTANT_TOKEN` must be stored through the dashboard or credential store instead of inline YAML.
- `google_maps`, `openweather`, and `shopify` are standard credential-backed tools with no dedicated MindRoom integration routes in `src/mindroom/api/integrations.py`. Their upstream Agno toolkits also support environment fallbacks through `GOOGLE_MAPS_API_KEY`, `OPENWEATHER_API_KEY`, `SHOPIFY_SHOP_NAME`, and `SHOPIFY_ACCESS_TOKEN`.
- `homeassistant` is different: MindRoom ships a dedicated integration flow in `src/mindroom/api/homeassistant_integration.py` with both OAuth and long-lived-token setup paths.
- `homeassistant` is also a shared-only integration, so it requires `worker_scope` to be unset or `shared`. Like the Google OAuth tools and unlike `spotify`, it always stays local and is never proxied through worker sandbox routing.
- Missing optional dependencies can auto-install at first use unless `MINDROOM_NO_AUTO_INSTALL_TOOLS=1` is set.

## `google_maps`

`google_maps` is the Google Maps toolkit for place search, routing, geocoding, address validation, and location metadata.

### What It Does

`google_maps` exposes `search_places()`, `get_directions()`, `validate_address()`, `geocode_address()`, `reverse_geocode()`, `get_distance_matrix()`, `get_elevation()`, and `get_timezone()`. The upstream toolkit builds both a `googlemaps.Client` and a `google.maps.places_v1.PlacesClient`. `search_places()` returns rich place details including name, formatted address, rating, reviews, phone number, website, and opening hours. `validate_address()` uses Google's Address Validation API rather than normal geocoding. MindRoom does not add extra runtime behavior on top of the upstream toolkit beyond metadata, dependency management, and credential storage.

### Configuration

| Option | Type       | Required | Default | Notes                                                                 |
| ------ | ---------- | -------- | ------- | --------------------------------------------------------------------- |
| `key`  | `password` | `no`     | `null`  | Google Maps API key stored through the dashboard or credential store. |

### Example

```
agents:
  local_guide:
    tools:
      - google_maps
```

```
search_places("coffee shops near Pike Place Market")
get_directions("Seattle, WA", "Portland, OR", mode="driving")
geocode_address("1600 Amphitheatre Parkway, Mountain View, CA")
reverse_geocode(47.6205, -122.3493)
validate_address("1600 Amphitheatre Pkwy, Mountain View, CA", region_code="US")
```

### Notes

- `key` is optional in MindRoom metadata only because the upstream toolkit can also read `GOOGLE_MAPS_API_KEY` from the runtime environment.
- In practice, the tool still needs a valid Google Maps API key before any call succeeds.
- If you plan to use `validate_address()`, enable the Address Validation API for the same Google Cloud project as the key.

## `openweather`

`openweather` is the OpenWeather toolkit for weather, forecast, air quality, and place geocoding.

### What It Does

`openweather` exposes `get_current_weather()`, `get_forecast()`, `get_air_pollution()`, and `geocode_location()`. The weather, forecast, and air pollution methods geocode the requested location first and then query OpenWeather by latitude and longitude. `units` controls whether the toolkit requests `standard`, `metric`, or `imperial` output from the API. `get_forecast()` uses the 5-day forecast endpoint and caps the response to 40 three-hour entries. MindRoom does not add custom behavior here beyond metadata, dependency management, and credential storage.

### Configuration

| Option                   | Type       | Required | Default  | Notes                                                                 |
| ------------------------ | ---------- | -------- | -------- | --------------------------------------------------------------------- |
| `api_key`                | `password` | `no`     | `null`   | OpenWeather API key stored through the dashboard or credential store. |
| `units`                  | `text`     | `no`     | `metric` | Measurement units passed to the OpenWeather API.                      |
| `enable_current_weather` | `boolean`  | `no`     | `true`   | Enable `get_current_weather()`.                                       |
| `enable_forecast`        | `boolean`  | `no`     | `true`   | Enable `get_forecast()`.                                              |
| `enable_air_pollution`   | `boolean`  | `no`     | `true`   | Enable `get_air_pollution()`.                                         |
| `enable_geocoding`       | `boolean`  | `no`     | `true`   | Enable `geocode_location()`.                                          |
| `all`                    | `boolean`  | `no`     | `false`  | Enable the full OpenWeather toolkit.                                  |

### Example

```
agents:
  weather:
    tools:
      - openweather:
          units: imperial
          enable_air_pollution: false
```

```
get_current_weather("San Francisco")
get_forecast("Chicago", days=3)
get_air_pollution("Los Angeles")
geocode_location("Reykjavik", limit=3)
```

### Notes

- `api_key` is optional in MindRoom metadata only because the upstream toolkit can also read `OPENWEATHER_API_KEY` from the runtime environment.
- In practice, the tool still needs a valid OpenWeather API key before any call succeeds.
- Because weather lookups reuse the first geocoding match, ambiguous location names can resolve to an unexpected city unless you make the query more specific.

## `shopify`

`shopify` is the Shopify Admin GraphQL toolkit for store analytics, catalog inspection, order reporting, customer lookups, and inventory visibility.

### What It Does

`shopify` exposes `get_shop_info()`, `get_products()`, `get_orders()`, `get_top_selling_products()`, `get_products_bought_together()`, `get_sales_by_date_range()`, `get_order_analytics()`, `get_product_sales_breakdown()`, `get_customer_order_history()`, `get_inventory_levels()`, `get_low_stock_products()`, `get_sales_trends()`, `get_average_order_value()`, and `get_repeat_customers()`. The toolkit talks to Shopify's Admin GraphQL endpoint at `https://<shop_name>.myshopify.com/admin/api/<api_version>/graphql.json`. Most list-style methods cap query size to Shopify's first-page limits, such as 250 products or orders. `get_orders()` supports `created_after`, `created_before`, and financial status filters, and the date filters expect `YYYY-MM-DD`. MindRoom does not wrap the Shopify API further, so behavior comes directly from the upstream Agno toolkit.
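
The endpoint construction described above can be written as a small helper. This is illustrative only, not part of MindRoom's API:

```python
def admin_graphql_endpoint(shop_name, api_version="2025-10"):
    # Shopify Admin GraphQL URL built from the store subdomain and API version,
    # as described above.
    return f"https://{shop_name}.myshopify.com/admin/api/{api_version}/graphql.json"
```

So a `shop_name` of `my-store` with the default `api_version` yields `https://my-store.myshopify.com/admin/api/2025-10/graphql.json`.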

### Configuration

| Option         | Type       | Required | Default   | Notes                                                                            |
| -------------- | ---------- | -------- | --------- | -------------------------------------------------------------------------------- |
| `shop_name`    | `text`     | `yes`    | `null`    | Store subdomain such as `my-store` from `my-store.myshopify.com`.                |
| `access_token` | `password` | `yes`    | `null`    | Shopify Admin API access token stored through the dashboard or credential store. |
| `api_version`  | `text`     | `no`     | `2025-10` | Shopify Admin API version used in GraphQL requests.                              |
| `timeout`      | `number`   | `no`     | `30`      | Request timeout in seconds.                                                      |

### Example

```
agents:
  store_analyst:
    tools:
      - shopify:
          shop_name: my-store
          api_version: 2025-10
          timeout: 45
```

```
get_shop_info()
get_products(max_results=25, status="ACTIVE")
get_orders(max_results=50, created_after="2026-03-01", created_before="2026-03-31")
get_low_stock_products(threshold=10)
get_average_order_value(days=30)
```

### Notes

- Create a custom app in Shopify Admin and grant the scopes you need before generating the access token.
- The upstream toolkit explicitly expects `read_orders`, `read_products`, `read_customers`, and `read_analytics` for its full analytics surface.
- `shop_name` and `access_token` can also come from `SHOPIFY_SHOP_NAME` and `SHOPIFY_ACCESS_TOKEN`, but MindRoom's documented configuration path is stored tool credentials.

## `homeassistant`

`homeassistant` is MindRoom's custom Home Assistant toolkit for entity state queries, device control, scenes, automations, and generic service calls.

### What It Does

`homeassistant` exposes `get_entity_state()`, `list_entities()`, `turn_on()`, `turn_off()`, `toggle()`, `set_brightness()`, `set_color()`, `set_temperature()`, `activate_scene()`, `trigger_automation()`, and `call_service()`. The toolkit calls Home Assistant's REST API through `/api/states` and `/api/services/...`. `list_entities()` returns a simplified response and limits the output to the first 50 entities to avoid huge payloads. `set_brightness()` validates a `0` to `255` range, `set_color()` validates each RGB channel in the same range, and `call_service()` expects extra service data as a JSON string. MindRoom adds important runtime behavior here by loading scoped credentials, enforcing shared-only integration rules, and returning a clear error when the agent's `worker_scope` does not allow the integration.
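
The validation rules described above can be sketched as follows. This is an illustrative model, not the actual toolkit code:

```python
import json

def validate_brightness(value):
    # set_brightness() accepts values in the 0-255 range.
    if not 0 <= value <= 255:
        raise ValueError("brightness must be between 0 and 255")
    return value

def validate_rgb(r, g, b):
    # set_color() validates each RGB channel in the same 0-255 range.
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("each RGB channel must be between 0 and 255")
    return (r, g, b)

def parse_service_data(data=None):
    # call_service() expects extra service data as a JSON string.
    return json.loads(data) if data else {}
```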

### Configuration

| Option                | Type       | Required | Default | Notes                                                                                                                                |
| --------------------- | ---------- | -------- | ------- | ------------------------------------------------------------------------------------------------------------------------------------ |
| `HOMEASSISTANT_URL`   | `url`      | `yes`    | `null`  | Dashboard field label for the Home Assistant base URL. The dedicated integration flow stores the normalized value as `instance_url`. |
| `HOMEASSISTANT_TOKEN` | `password` | `yes`    | `null`  | Dashboard field label for a long-lived access token. OAuth setup stores `access_token` and `refresh_token` instead.                  |

### Example

```
agents:
  home:
    worker_scope: shared
    tools:
      - homeassistant
```

```
list_entities("light")
get_entity_state("climate.thermostat")
turn_on("light.living_room")
set_brightness("light.living_room", 128)
activate_scene("scene.movie_time")
call_service("notify", "send_message", data='{"message": "Dinner is ready"}')
```

### Notes

- `homeassistant` requires `worker_scope` to be unset or `shared`, and it is unavailable for `worker_scope: user` or `worker_scope: user_agent`.
- `homeassistant`, `gmail`, `google_calendar`, `google_drive`, and `google_sheets` always stay local and are never proxied through the sandbox, even if you change `worker_tools`.
- The current setup path is the dedicated Home Assistant integration flow in the dashboard or `src/mindroom/api/homeassistant_integration.py`, not generic env-to-credentials syncing.
- That integration supports both OAuth and long-lived access tokens, and the OAuth flow requires a Home Assistant OAuth application with the callback URL `/api/homeassistant/callback` on the MindRoom dashboard host.
- The runtime tool itself looks for stored `instance_url` plus either `access_token` or `long_lived_token`, which is why tool availability checks differ from the raw metadata field names.

## Related Docs

- [Tools Overview](https://docs.mindroom.chat/tools/index.md)
- [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration)
- [Sandbox Proxy Isolation](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md)

# Built-in Tools

!!! note "This page has been reorganized"

    The built-in tools documentation has been split into dedicated category pages for better navigation. Please use the links below to find the tool you're looking for.

## Browse By Category

| Category                      | Tools                                                                            | Link                                                                                                 |
| ----------------------------- | -------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
| **Execution & Coding**        | file, shell, python, coding, docker, calculator, reasoning, sleep                | [→ execution-and-coding](https://docs.mindroom.chat/tools/execution-and-coding/index.md)             |
| **Data & Databases**          | sql, postgres, neo4j, duckdb, csv, pandas, google_drive, google_sheets, yfinance | [→ data-and-databases](https://docs.mindroom.chat/tools/data-and-databases/index.md)                 |
| **Web Search**                | duckduckgo, googlesearch, tavily, exa, serpapi, serper, searxng                  | [→ web-search](https://docs.mindroom.chat/tools/web-search/index.md)                                 |
| **Web Scraping & Browser**    | firecrawl, crawl4ai, browser, jina, website, trafilatura                         | [→ web-scraping-and-browser](https://docs.mindroom.chat/tools/web-scraping-and-browser/index.md)     |
| **Research Sources**          | arxiv, wikipedia, pubmed, hackernews                                             | [→ research-sources](https://docs.mindroom.chat/tools/research-sources/index.md)                     |
| **AI & Generation**           | openai, gemini, groq, replicate, fal, dalle, eleven_labs                         | [→ ai-and-generation](https://docs.mindroom.chat/tools/ai-and-generation/index.md)                   |
| **Media & Content**           | youtube, spotify, giphy, moviepy, unsplash, brandfetch                           | [→ media-and-content](https://docs.mindroom.chat/tools/media-and-content/index.md)                   |
| **Matrix & Attachments**      | matrix_message, thread_tags, thread_summary, matrix_api, attachments             | [→ matrix-and-attachments](https://docs.mindroom.chat/tools/matrix-and-attachments/index.md)         |
| **Messaging & Social**        | gmail, slack, discord, telegram, whatsapp, email, x, reddit                      | [→ messaging-and-social](https://docs.mindroom.chat/tools/messaging-and-social/index.md)             |
| **Project Management**        | github, jira, linear, clickup, notion, trello, todoist                           | [→ project-management](https://docs.mindroom.chat/tools/project-management/index.md)                 |
| **Calendar & Scheduling**     | google_calendar, cal_com, scheduler                                              | [→ calendar-and-scheduling](https://docs.mindroom.chat/tools/calendar-and-scheduling/index.md)       |
| **Memory & Storage**          | memory, mem0, zep                                                                | [→ memory-and-storage](https://docs.mindroom.chat/tools/memory-and-storage/index.md)                 |
| **Agent Orchestration**       | subagents, delegate, claude_agent, config_manager, openclaw_compat               | [→ agent-orchestration](https://docs.mindroom.chat/tools/agent-orchestration/index.md)               |
| **Automation & Platforms**    | aws_lambda, aws_ses, airflow, e2b, daytona, composio, custom_api                 | [→ automation-and-platforms](https://docs.mindroom.chat/tools/automation-and-platforms/index.md)     |
| **Location, Commerce & Home** | google_maps, openweather, shopify, homeassistant                                 | [→ location-commerce-and-home](https://docs.mindroom.chat/tools/location-commerce-and-home/index.md) |

For tool configuration, presets, and runtime context, see the [Tools Overview](https://docs.mindroom.chat/tools/index.md).

# MCP

Native MCP client support is now documented at [MCP](https://docs.mindroom.chat/mcp/index.md). Use that page for `mcp_servers`, transport examples, tool naming, and agent setup.

This compatibility page remains only so older links continue to resolve.

# OpenClaw Workspace Import

MindRoom supports a practical OpenClaw-compatible workflow focused on workspace portability:

- Reuse your OpenClaw markdown files (`SOUL.md`, `AGENTS.md`, `USER.md`, `MEMORY.md`, etc.) after copying them into the agent's canonical MindRoom workspace
- Use the `openclaw_compat` preset to enable a native MindRoom tool bundle
- Use MindRoom's unified memory backend (`memory.backend`) for persistence
- Optionally add semantic recall over workspace files via knowledge bases

## What this is (and is not)

MindRoom is compatible with OpenClaw workspace patterns, not a full OpenClaw gateway clone.

Works well:

- File-based identity and memory documents
- OpenClaw-inspired behavior and instructions
- Native MindRoom tool bundle via the `openclaw_compat` preset
- Native Matrix messaging via the `matrix_message` tool in the preset bundle
- Native sub-agent session orchestration via the `subagents` tool in the preset bundle

Not included:

- OpenClaw gateway control plane
- Device nodes and canvas platform tools
- OpenClaw alias-name wrapper APIs like `exec`, `process`, `web_search`, and `web_fetch`
- `tts` and `image` aliases (use MindRoom's native TTS/image tools directly)
- Heartbeat runtime - schedule heartbeats via `cron`/`scheduler` instead

## The `openclaw_compat` preset

`openclaw_compat` is a config macro, not a runtime toolkit. `Config.get_agent_tools` expands it into native MindRoom tools and dedupes while preserving order.

Preset expansion:

- `shell`
- `coding`
- `duckduckgo`
- `website`
- `browser`
- `scheduler`
- `subagents`
- `matrix_message`
- `attachments` (auto-implied by `matrix_message` via `IMPLIED_TOOLS`, not listed in the preset directly)

Memory is not a separate OpenClaw subsystem in MindRoom. It uses the normal MindRoom memory backend.
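
The expand-and-dedupe behavior can be modeled roughly like this. It is a sketch based on the preset contents listed above; the real `Config.get_agent_tools` implementation may differ in detail:

```python
PRESETS = {
    "openclaw_compat": [
        "shell", "coding", "duckduckgo", "website",
        "browser", "scheduler", "subagents", "matrix_message",
    ],
}
# attachments is auto-implied by matrix_message rather than listed in the preset.
IMPLIED_TOOLS = {"matrix_message": ["attachments"]}

def expand_tools(tools):
    expanded = []
    for t in tools:
        expanded.extend(PRESETS.get(t, [t]))
    for t in list(expanded):
        expanded.extend(IMPLIED_TOOLS.get(t, []))
    # Dedupe while preserving first-seen order.
    return list(dict.fromkeys(expanded))
```

With `tools: [openclaw_compat, python, shell]`, the duplicate `shell` collapses into the preset's first occurrence and `attachments` is appended via the implied-tools pass.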

## Drop-in config

Use this as a starting point for importing an OpenClaw workspace into MindRoom's canonical agent workspace:

```
agents:
  openclaw:
    display_name: OpenClawAgent
    include_default_tools: false
    learning: false
    memory_backend: file
    model: opus
    role: OpenClaw-style personal assistant with persistent file-based identity and memory.
    rooms: [personal]

    instructions:
      - You wake up fresh each session with no memory of previous conversations. Your context files are already loaded into your system prompt.
      - Important long-term context is persisted by the configured MindRoom memory backend. If something must be preserved exactly, write/update the relevant file directly.
      - MEMORY.md is curated long-term memory; daily files are short-lived notes and logs.
      - Ask before external/public actions and destructive operations.
      - Before answering prior-history questions, search memory files first with `search_knowledge_base` when configured.

    context_files:
      - SOUL.md
      - AGENTS.md
      - USER.md
      - IDENTITY.md
      - TOOLS.md
      - HEARTBEAT.md

    knowledge_bases: [openclaw_memory]

    tools:
      - openclaw_compat
      - python

    skills:
      - transcribe

knowledge_bases:
  openclaw_memory:
    path: ${MINDROOM_STORAGE_PATH}/agents/openclaw/workspace/memory
    watch: false

memory:
  file:
    max_entrypoint_lines: 200
  auto_flush:
    enabled: true
```

- When using `memory_backend: file`, the file backend automatically loads `MEMORY.md` from the canonical workspace root, so there is no need to add it to `context_files`. If you switch to `mem0`, add `MEMORY.md` back to `context_files` if you still want it preloaded.
- The `openclaw_compat` preset already expands to native shell, coding, duckduckgo, website, browser, scheduler, sub-agent orchestration, and `matrix_message` tools (`attachments` is auto-implied by `matrix_message`), so listing those tools individually is not necessary.
- Copy or sync your OpenClaw files into `agents/openclaw/workspace/` before using this config so `context_files`, file memory, and `openclaw_memory` all read the same canonical workspace.
- This example sets `watch: false`, so direct external file edits require an explicit reindex, while dashboard/API mutations still schedule a refresh after a successful mutation. For shared local non-Git knowledge bases, `watch: true` still starts a live filesystem watcher.
- If `openclaw_memory` is Git-backed, update the repository and reindex instead of using dashboard upload or delete actions.

## Recommended workspace layout

```
mindroom_data/
└── agents/
    └── openclaw/
        └── workspace/
            ├── SOUL.md
            ├── AGENTS.md
            ├── USER.md
            ├── IDENTITY.md
            ├── MEMORY.md
            ├── TOOLS.md
            ├── HEARTBEAT.md
            └── memory/
                ├── YYYY-MM-DD.md
                └── topic-notes.md
```

## Unified memory behavior

OpenClaw-compatible agents use the same memory system as every other MindRoom agent:

- `memory.backend: mem0` for vector memory (global default)
- `memory.backend: file` for file-first memory (global default)
- `memory.backend: none` or `memory: none` to disable built-in durable memory globally
- `memory_backend: file` on an individual agent to override the global default
- `memory_backend: none` on an individual agent to keep that agent stateless
- agents that use file memory store it under `agents/<name>/workspace/`, not under the shared global `memory.file.path` tree
- `context_files` and OpenClaw-style knowledge paths should point into that same canonical workspace if you want one consistent file-first workflow
- optional `knowledge_bases` for semantic recall over arbitrary workspace folders

Recommended for OpenClaw-style setups: `memory_backend: file` with the canonical workspace layout and `memory.auto_flush.enabled: true`.
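
The global default plus per-agent overrides can be combined like this (agent names are illustrative):

```yaml
memory:
  backend: mem0            # global default: vector memory

agents:
  openclaw:
    memory_backend: file   # override: file-first memory in the agent workspace
  scratchpad:
    memory_backend: none   # override: keep this agent stateless
```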

## Context Management

MindRoom includes built-in context controls for OpenClaw-style agents:

- **Conversation history** is stored in Agno sessions, but MindRoom decides what replay summary and raw history messages are injected into each run.
- **Replay depth** is controlled with `num_history_runs` or `num_history_messages`, and optional required compaction is controlled with `compaction` (see [Agents](https://docs.mindroom.chat/configuration/agents/index.md)).
- **Preloaded role context** from `context_files` is hard-capped by `defaults.max_preload_chars` (configured in `config.yaml` under `defaults`). When the combined context exceeds this limit, chunks are trimmed from the end and a truncation marker is inserted.
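
The trimming behavior can be sketched like this. It is an illustrative model; the marker text and exact trimming granularity are assumptions:

```python
def preload_context(chunks, max_chars, marker="[context truncated]"):
    # Keep chunks from the front until the combined size would exceed the cap,
    # trim the rest from the end, and append a truncation marker.
    kept, total, truncated = [], 0, False
    for chunk in chunks:
        if total + len(chunk) > max_chars:
            truncated = True
            break
        kept.append(chunk)
        total += len(chunk)
    if truncated:
        kept.append(marker)
    return kept
```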

## Known limitations

**Threading model:** MindRoom responds in Matrix threads by default. OpenClaw uses continuous room-level conversations. To match this behavior on mobile or via bridges (Telegram, Signal, WhatsApp), set `thread_mode: room` on the agent - this sends plain room messages with a single persistent session per room instead of creating threads.
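
For example (agent name illustrative):

```yaml
agents:
  openclaw:
    thread_mode: room   # plain room messages, one persistent session per room
```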

## Privacy guidance

`context_files` apply to all rooms for that agent. If `MEMORY.md` is sensitive:

- Keep the agent in private rooms only, or
- Split into private/public agents and exclude sensitive files from the public agent

## Skills

For details on skill eligibility gating (`openclaw.os`, `openclaw.requires`, `openclaw.always`), see [Skills](https://docs.mindroom.chat/skills/index.md).

Skills are loaded from `~/.mindroom/skills/<name>/`. To use an OpenClaw skill like `transcribe`, copy the skill directory from your OpenClaw workspace:

```
mkdir -p ~/.mindroom/skills
cp -r /path/to/openclaw-workspace/skills/transcribe ~/.mindroom/skills/
```

Set required environment variables (for example `WHISPER_URL`) as defined in the skill's `SKILL.md` frontmatter.

# Skills

MindRoom uses Agno's skills system with OpenClaw-compatible metadata. Skills are instruction packs (a `SKILL.md` file) with optional scripts and references that guide agents without adding new code capabilities.

## Skill directory structure

A skill is a directory containing:

```
my-skill/
├── SKILL.md         # Required: instructions; YAML frontmatter is recommended
├── scripts/         # Optional: executable scripts
│   └── audit.sh
└── references/      # Optional: reference documents
    └── examples.md
```

Agents access skills via `get_skill_instructions()`, scripts via `get_skill_script()`, and references via `get_skill_reference()`.

## SKILL.md format (OpenClaw compatible)

```
---
name: repo-quick-audit
description: Quick repository audit checklist
metadata: '{openclaw:{requires:{bins:["git"], env:["GITHUB_TOKEN"]}}}'
---

# Repo Quick Audit

1. Check CI status
2. Review open issues
```

Notes:

- `metadata` can be a JSON5 string (shown above) or a YAML mapping.
- If `name` is omitted, MindRoom falls back to the skill directory name.
- If `description` is omitted or blank, MindRoom falls back to the resolved skill name.
- If YAML frontmatter is omitted entirely, the skill still loads with those same name/description fallbacks. Frontmatter is still recommended for clearer listings and metadata.
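
For comparison, the same metadata written as a YAML mapping instead of a JSON5 string:

```yaml
---
name: repo-quick-audit
description: Quick repository audit checklist
metadata:
  openclaw:
    requires:
      bins: [git]
      env: [GITHUB_TOKEN]
---
```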

## Frontmatter fields

| Field           | Type                    | Description                                                                           |
| --------------- | ----------------------- | ------------------------------------------------------------------------------------- |
| `name`          | string                  | Unique skill identifier                                                               |
| `description`   | string                  | Brief summary shown to users/models; defaults to the skill name when omitted or blank |
| `metadata`      | mapping or JSON5 string | OpenClaw metadata and custom fields                                                   |
| `license`       | string                  | Informational only; accepted but not used by the runtime                              |
| `compatibility` | string                  | Informational only; accepted but not used by the runtime                              |
| `allowed-tools` | list                    | Reserved; accepted in frontmatter but not enforced by the runtime                     |

## Eligibility gating (OpenClaw metadata)

If `metadata.openclaw` is present, MindRoom filters skills using these rules:

- `os: ["linux", "darwin", "windows"]`
- `always: true` bypasses `requires`, but it does not bypass an OS mismatch
- `requires.env`: env var set or credential key exists
- `requires.config`: config path is truthy (e.g., `agents.code.tools`)
- `requires.bins`: all binaries must exist in PATH
- `requires.anyBins`: at least one binary must exist in PATH

Skills without `metadata.openclaw` are always eligible.

## Skill locations and precedence

MindRoom resolves skills for each agent from these locations, in this order:

1. Bundled skills: `skills/` at the repository root (if present)
1. Plugin-provided skill directories (see [Plugins](https://docs.mindroom.chat/plugins/index.md))
1. User skills: `~/.mindroom/skills/`
1. Agent workspace skills: `<storage>/agents/<agent>/workspace/skills/`

If multiple skills share the same name, the last one wins (agent workspace > user > plugin > bundled).
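"Last one wins" falls out naturally if each location is merged in precedence order. A minimal sketch, assuming each root yields a `{name: path}` mapping (the data shape is an assumption):

```python
def merge_skill_roots(bundled: dict, plugin: dict, user: dict, workspace: dict) -> dict:
    """Merge skill roots so later locations override earlier ones."""
    merged: dict = {}
    # Order matters: workspace > user > plugin > bundled.
    for root in (bundled, plugin, user, workspace):
        merged.update(root)
    return merged
```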

Agent workspace skills are only available to the owning agent at runtime. They do not appear in the global skills API or dashboard listing because those views are not agent-scoped.

## Configuring skills

Add skills to an agent allowlist in `config.yaml`:

```
agents:
  developer:
    display_name: Developer
    role: A coding assistant
    model: sonnet
    skills:
      - repo-quick-audit
      - code-review
```

The `skills:` list is an allowlist for bundled, plugin, and user skills. If `skills` is empty or unset, the agent gets no bundled, plugin, or user skills. Workspace skills under `<storage>/agents/<agent>/workspace/skills/` are still auto-loaded for that agent. This lets an agent create or receive skills in its own workspace without editing `config.yaml`.

Workspace auto-loading is a runtime capability, not a proactive behavior policy. If you want agents to create skills on their own when they notice reusable workflows, add that guidance to the agent's prompt or instructions.

## Using skills at runtime

Agents see available skills in the system prompt and can load details using these tools:

- `get_skill_instructions(skill_name)` - Load the full instructions for a skill
- `get_skill_reference(skill_name, reference_path)` - Access reference documentation
- `get_skill_script(skill_name, script_path, execute=False, args=None, timeout=30)` - Read or execute scripts

Workspace skill scripts can be read with `get_skill_script(..., execute=False)` but cannot be executed through `get_skill_script(..., execute=True)`. Agents that have shell or file execution permissions can still read and execute workspace files through their normal authorized tools.

## Skill vs tool

| Aspect       | Skills                    | Tools            |
| ------------ | ------------------------- | ---------------- |
| Definition   | Markdown + YAML           | Python code      |
| Location     | File system               | Code/plugins     |
| Filtering    | Automatic by requirements | Always available |
| Instructions | Rich markdown             | Docstrings       |
| Invocation   | Model via skill tools     | Model only       |

## Hot reloading

MindRoom polls skill directories every second. When a `SKILL.md` file is added, removed, or modified, the skill cache is automatically cleared so agents pick up the new instructions on their next request. For workspace skills created during an agent turn, assume they become available on the next agent run rather than in the same response.
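One way to implement this kind of poll is mtime tracking, sketched below. This is a toy illustration, not MindRoom's actual watcher:

```python
from pathlib import Path


class SkillCache:
    """Toy mtime tracker illustrating add/remove/modify detection per poll."""

    def __init__(self) -> None:
        self._mtimes: dict[Path, float | None] = {}

    def changed(self, skill_md: Path) -> bool:
        """True when SKILL.md was added, removed, or modified since the last poll."""
        current = skill_md.stat().st_mtime if skill_md.exists() else None
        previous = self._mtimes.get(skill_md)
        self._mtimes[skill_md] = current
        # First observation of an existing file also reads as a change here.
        return current != previous
```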

## Best practices

1. Keep skills focused - one skill per capability
1. Declare dependencies with `metadata.openclaw.requires`
1. Use descriptive names like `code-review`

# Plugins

> [!WARNING] **Plugins execute arbitrary Python code in the same process as MindRoom.** A malicious plugin has full access to your credentials, Matrix sessions, file system, and network. Only install plugins you trust and have reviewed.

MindRoom plugins extend agents with custom tools, [hooks](https://docs.mindroom.chat/hooks/index.md), and skills. A plugin is a directory with a `mindroom.plugin.json` manifest, one or more Python modules, and optionally skill directories. Plugins are loaded from paths listed under `plugins:` in `config.yaml`.

## Plugin structure

A plugin is a directory containing `mindroom.plugin.json`:

```
my-plugin/
├── mindroom.plugin.json   # Required manifest
├── tools.py               # Tool factories (optional)
├── oauth.py               # OAuth providers (optional)
├── hooks.py               # Event hooks (optional)
└── skills/                # Skill directories (optional)
    └── my-skill/
        └── SKILL.md
```

A plugin must have at least one of `tools_module`, `hooks_module`, `oauth_module`, or `skills`. A tools-only plugin exposes callable functions to agents. A plugin with `oauth_module` registers OAuth providers whose state, callbacks, and scoped credential storage are handled by MindRoom core. A hooks-only plugin observes or transforms events without adding agent-facing tools. Many plugins combine several of these.

## Manifest format

The manifest is a JSON file named `mindroom.plugin.json` at the plugin root:

```
{
  "name": "my-plugin",
  "tools_module": "tools.py",
  "oauth_module": "oauth.py",
  "hooks_module": "hooks.py",
  "skills": ["skills"]
}
```

| Field          | Type            | Required | Description                                                                                                                                                                                                       |
| -------------- | --------------- | -------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `name`         | string          | **yes**  | Plugin identifier. Must be lowercase ASCII letters, digits, `-`, and `_` only (pattern: `^[a-z0-9_-]+$`). Must be unique across all configured plugins. Invalid or duplicate names abort plugin loading entirely. |
| `tools_module` | string          | no       | Relative path to the Python module containing `@register_tool_with_metadata` factories. Must exist on disk if declared.                                                                                           |
| `oauth_module` | string          | no       | Relative path to the Python module containing `register_oauth_providers(settings, runtime_paths)`. Must exist on disk if declared.                                                                                |
| `hooks_module` | string          | no       | Relative path to the Python module containing `@hook`-decorated functions. Must exist on disk if declared.                                                                                                        |
| `skills`       | list of strings | no       | Relative directories containing skill subdirectories (each with a `SKILL.md`). Each directory must exist on disk.                                                                                                 |

Unknown fields are silently ignored. Invalid, duplicate, or malformed manifests are configuration errors and stop all plugin loading. All declared module files and skill directories must exist on disk.
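The name and uniqueness rules can be sketched as a validation pass. This is an illustrative check mirroring the documented rules, not the real loader:

```python
import re

NAME_RE = re.compile(r"^[a-z0-9_-]+$")  # pattern from the manifest table


def validate_manifests(manifests: list[dict]) -> None:
    """Raise on the first invalid or duplicate name, aborting all loading."""
    seen: set[str] = set()
    for manifest in manifests:
        name = manifest.get("name")
        if not name or not NAME_RE.match(name):
            raise ValueError(f"invalid plugin name: {name!r}")
        if name in seen:
            raise ValueError(f"duplicate plugin name: {name!r}")
        seen.add(name)
```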

If `hooks_module` is omitted, MindRoom auto-scans `tools_module` for `@hook`-decorated functions. If both fields point at the same file, MindRoom imports it once and reuses it for both tool registration and hook discovery.

## Configure plugins

Add plugin paths under `plugins:` in `config.yaml`:

```
plugins:
  - ./plugins/my-plugin
  - python:my_skill_pack
  - path: ./plugins/personal-context
    enabled: true
    settings:
      dawarich_url: http://dawarich.local
      api_key: secret
    hooks:
      enrich_with_location:
        priority: 20
      audit_messages:
        enabled: false
```

### Entry formats

Plugin entries can be **strings** (path only) or **objects** (with options). Both forms can be mixed in the same list.

**String entry** — just the path:

```
plugins:
  - ./plugins/my-plugin
```

**Object entry** — path plus options:

```
plugins:
  - path: ./plugins/my-plugin
    enabled: true
    settings:
      api_key: secret
    hooks:
      my_hook:
        enabled: false
```

| Field      | Type   | Default    | Description                                                            |
| ---------- | ------ | ---------- | ---------------------------------------------------------------------- |
| `path`     | string | *required* | Plugin path (see resolution rules below)                               |
| `enabled`  | bool   | `true`     | Set to `false` to disable the plugin without removing it from the list |
| `settings` | dict   | `{}`       | Free-form key-value config passed to the plugin at load time           |
| `hooks`    | dict   | `{}`       | Per-hook overrides keyed by hook function name                         |

Each hook override supports:

| Field        | Type | Default | Description                                                 |
| ------------ | ---- | ------- | ----------------------------------------------------------- |
| `enabled`    | bool | `true`  | Disable a specific hook without removing it from the plugin |
| `priority`   | int  | `null`  | Override the hook's default execution priority              |
| `timeout_ms` | int  | `null`  | Override the hook's default timeout                         |

### Path resolution

Paths are resolved in this order:

1. **Absolute paths** — used as-is
1. **Relative paths** — resolved relative to the directory containing `config.yaml`
1. **Python package specs** — see below

## Python package plugins

MindRoom can resolve plugins from installed Python packages:

```
plugins:
  - my_skill_pack
  - python:my_skill_pack
  - pkg:my_skill_pack:plugins/demo
  - module:my_skill_pack:plugins/demo
```

Rules:

- A bare package name (no slashes, no `.` or `..` prefix) is tried as a Python package first.
- `python:`, `pkg:`, and `module:` are explicit prefixes that force package resolution.
- `:sub/path` after the package name points to a subdirectory inside the package.

MindRoom resolves the package location via `importlib` and looks for `mindroom.plugin.json` in that directory.

## Tools module

A tools module is a Python file that registers one or more tool factories using the `@register_tool_with_metadata` decorator. Each factory function returns a **Toolkit class** (not an instance). MindRoom instantiates the class when building agents.

## OAuth providers

An OAuth module registers provider definitions without registering FastAPI routes. MindRoom core owns state generation, callback consumption, authenticated requester binding, scoped OAuth token writes, status checks, and disconnect handling. The provider module supplies only provider-specific details such as endpoint URLs, scopes, token credential service names, optional tool config service names, optional PKCE requirements, token parsing, optional claim validators, and display metadata.

Declare the module in the manifest:

```
{
  "name": "drive-plugin",
  "tools_module": "tools.py",
  "oauth_module": "oauth.py"
}
```

Then expose `register_oauth_providers(settings, runtime_paths)`:

```
from __future__ import annotations

from mindroom.oauth import OAuthProvider


def register_oauth_providers(settings, runtime_paths):
    del runtime_paths
    return [
        OAuthProvider(
            id="acme_drive",
            display_name="Acme Drive",
            authorization_url="https://accounts.acme.example/oauth/authorize",
            token_url="https://accounts.acme.example/oauth/token",
            scopes=("files.read",),
            credential_service="acme_drive_oauth",
            tool_config_service="acme_drive",
            client_config_services=(
                settings.get("client_config_service", "acme_drive_oauth_client"),
            ),
            allowed_email_domains=tuple(settings.get("allowed_email_domains", [])),
            allowed_hosted_domains=tuple(settings.get("allowed_hosted_domains", [])),
        ),
    ]
```

OAuth provider IDs are exposed through `/api/oauth/{provider}/connect`, `/api/oauth/{provider}/authorize`, `/api/oauth/{provider}/callback`, `/api/oauth/{provider}/status`, and `/api/oauth/{provider}/disconnect`. Dashboard flows normally call `connect` and use the returned provider authorization URL. Conversation flows should show the browser-openable `authorize` URL, because that URL first authenticates the MindRoom user and then redirects to the external provider.

Conversation-issued links include an opaque connect token so the callback can verify the requester before storing scoped credentials. The connect token is also bound to the runtime requester, and redemption fails unless the authenticated dashboard user resolves to that requester. The callback stores tokens under `credential_service` using the resolved requester and agent execution scope, including private `user` and `user_agent` scopes.

If the tool also has editable dashboard settings, declare `tool_config_service` and store those settings separately through the normal credentials API. For example, an Acme Drive provider can store OAuth tokens in `acme_drive_oauth` while the `acme_drive` tool settings document contains only options such as file-size limits or capability toggles.

Set `pkce_code_challenge_method="S256"` when the upstream OAuth provider requires PKCE. MindRoom stores the verifier in pending state and passes it as the fifth argument to custom `token_exchanger` callbacks.

Tokens and client secrets must never be written to `config.yaml`, prompt files, logs, or tool responses.

OAuth-backed tools should set `setup_type=SetupType.OAUTH` and `auth_provider="<provider_id>"` in `@register_tool_with_metadata`. When credentials are missing, return a concise instruction containing a browser-openable URL built with `mindroom.oauth.build_oauth_connect_instruction(provider, runtime_paths, worker_target=...)`. The user can complete OAuth and retry the same tool request.

Deployment restrictions belong in plugin settings. Use `allowed_email_domains` to restrict verified email claims by domain. Use `allowed_hosted_domains` when the provider supplies a verified hosted-domain claim. If a configured restriction cannot be checked from verified claims, MindRoom fails the callback closed and does not save credentials.

### Minimal example

```
from __future__ import annotations

from typing import TYPE_CHECKING

from mindroom.tool_system.metadata import (
    SetupType,
    ToolCategory,
    ToolStatus,
    register_tool_with_metadata,
)

if TYPE_CHECKING:
    from agno.tools import Toolkit


@register_tool_with_metadata(
    name="greeter",
    display_name="Greeter",
    description="A simple greeting tool",
    category=ToolCategory.DEVELOPMENT,
    status=ToolStatus.AVAILABLE,
    setup_type=SetupType.NONE,
)
def greeter_tools() -> type[Toolkit]:
    from agno.tools import Toolkit

    class GreeterTools(Toolkit):
        def __init__(self) -> None:
            super().__init__(name="greeter", tools=[self.greet])

        def greet(self, name: str) -> str:
            """Greet someone by name."""
            return f"Hello, {name}!"

    return GreeterTools
```

After registering the plugin, assign the tool to agents in `config.yaml`:

```
plugins:
  - ./plugins/my-greeter

agents:
  assistant:
    tools:
      - greeter
```

### Decorator fields

All `@register_tool_with_metadata` arguments are keyword-only.

**Required fields:**

| Field          | Type           | Description                                                                        |
| -------------- | -------------- | ---------------------------------------------------------------------------------- |
| `name`         | string         | Tool identifier — used to reference the tool in `config.yaml` agent `tools:` lists |
| `display_name` | string         | Human-readable name shown in the dashboard                                         |
| `description`  | string         | Brief description of what the tool does                                            |
| `category`     | `ToolCategory` | Category for dashboard grouping (see values below)                                 |

`ToolCategory` values: `COMMUNICATION`, `DEVELOPMENT`, `EMAIL`, `ENTERTAINMENT`, `INFORMATION`, `INTEGRATIONS`, `PRODUCTIVITY`, `RESEARCH`, `SHOPPING`, `SMART_HOME`, `SOCIAL`.

**Optional fields:**

| Field                      | Type                          | Default     | Description                                                                                                          |
| -------------------------- | ----------------------------- | ----------- | -------------------------------------------------------------------------------------------------------------------- |
| `status`                   | `ToolStatus`                  | `AVAILABLE` | `AVAILABLE` or `REQUIRES_CONFIG` — controls whether the tool appears as ready or needs setup in the dashboard        |
| `setup_type`               | `SetupType`                   | `NONE`      | `NONE`, `API_KEY`, `OAUTH`, or `SPECIAL` — tells the dashboard what kind of setup flow to show                       |
| `config_fields`            | list of `ConfigField`         | `None`      | Describes constructor parameters configurable through the dashboard (see [ConfigField](#configfield))                |
| `dependencies`             | list of strings               | `None`      | Python packages the tool requires (see [Dependencies](#dependencies))                                                |
| `docs_url`                 | string                        | `None`      | Link to external documentation                                                                                       |
| `icon`                     | string                        | `None`      | Icon name for the dashboard (e.g., `"FaGoogle"`, `"Home"`)                                                           |
| `icon_color`               | string                        | `None`      | Tailwind color class for the icon (e.g., `"text-blue-500"`)                                                          |
| `helper_text`              | string                        | `None`      | Markdown help text shown in the dashboard setup panel                                                                |
| `auth_provider`            | string                        | `None`      | OAuth provider identifier when using OAuth-based setup                                                               |
| `managed_init_args`        | tuple of `ToolManagedInitArg` | `()`        | Declares which MindRoom-managed values the toolkit constructor expects (see [Managed init args](#managed-init-args)) |
| `default_execution_target` | `ToolExecutionTarget`         | `PRIMARY`   | `PRIMARY` or `WORKER` — controls whether the tool runs on the primary agent or a sandbox worker                      |

### Dependencies

The `dependencies` field lists Python packages that the tool requires at runtime. MindRoom checks whether each package is importable before the tool is instantiated.

**For built-in tools** (those shipped inside `src/mindroom/tools/`), missing dependencies trigger automatic installation via `uv sync` or `pip install` using the matching optional extra from MindRoom's `pyproject.toml`.

**For plugin tools**, automatic installation does **not** apply — there is no matching optional extra in MindRoom's package metadata. If the listed dependencies are not already installed in the environment, MindRoom raises an `ImportError` with a message listing the missing packages. Plugin authors should document their dependencies in their README so users can install them manually:

```
pip install openviking-client aiohttp
```

Even though plugin dependencies are not auto-installed, listing them in the decorator is still useful: MindRoom surfaces a clear error at tool load time rather than failing with an obscure traceback when an agent tries to call the tool mid-conversation.
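An importability check like the one described can be sketched with `importlib.util.find_spec` (a hypothetical helper; the package names below are illustrative):

```python
import importlib.util


def missing_dependencies(dependencies: list[str]) -> list[str]:
    """Return the declared packages that cannot be imported in this environment."""
    return [pkg for pkg in dependencies if importlib.util.find_spec(pkg) is None]


missing_dependencies(["json", "surely_not_installed_pkg_xyz"])
# -> ['surely_not_installed_pkg_xyz']
```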

You can disable automatic dependency installation entirely (for built-in tools too) by setting the environment variable `MINDROOM_NO_AUTO_INSTALL_TOOLS=1`.

### ConfigField

Each `ConfigField` describes one constructor parameter that can be configured through the dashboard or credentials store.

| Field         | Type   | Default    | Description                                                             |
| ------------- | ------ | ---------- | ----------------------------------------------------------------------- |
| `name`        | string | *required* | Constructor kwarg name (e.g., `"api_key"`)                              |
| `label`       | string | *required* | Display label shown in the dashboard                                    |
| `type`        | string | `"text"`   | Input type: `text`, `password`, `url`, `number`, `boolean`, or `select` |
| `required`    | bool   | `True`     | Whether the field must be set before the tool can be used               |
| `default`     | any    | `None`     | Default value when not configured                                       |
| `placeholder` | string | `None`     | Placeholder text shown in the input                                     |
| `description` | string | `None`     | Help text for the field                                                 |
| `options`     | list   | `None`     | For `select` type: list of `{"label": "...", "value": "..."}` dicts     |
| `validation`  | dict   | `None`     | Optional validation rules (min, max, pattern, etc.)                     |

Example — a tool that requires an API key:

```
from agno.tools import Toolkit

from mindroom.tool_system.metadata import (
    ConfigField,
    SetupType,
    ToolCategory,
    ToolStatus,
    register_tool_with_metadata,
)

@register_tool_with_metadata(
    name="weather",
    display_name="Weather",
    description="Get current weather data",
    category=ToolCategory.INFORMATION,
    status=ToolStatus.REQUIRES_CONFIG,
    setup_type=SetupType.API_KEY,
    config_fields=[
        ConfigField(name="api_key", label="API Key", type="password"),
        ConfigField(
            name="units",
            label="Units",
            type="select",
            required=False,
            default="metric",
            options=[
                {"label": "Metric (°C)", "value": "metric"},
                {"label": "Imperial (°F)", "value": "imperial"},
            ],
        ),
    ],
)
def weather_tools() -> type[Toolkit]:
    ...
```

### Managed init args

If your toolkit constructor needs MindRoom-managed runtime values, declare them with `managed_init_args`. MindRoom does **not** auto-detect constructor parameter names — undeclared managed args are not passed through.

| Value                 | Constructor kwarg     | Description                                                             |
| --------------------- | --------------------- | ----------------------------------------------------------------------- |
| `RUNTIME_PATHS`       | `runtime_paths`       | Storage paths, environment values, and data directory access            |
| `CREDENTIALS_MANAGER` | `credentials_manager` | Read and write the per-tool credentials store                           |
| `WORKER_TARGET`       | `worker_target`       | Resolved worker routing context (scope, execution identity, worker key) |

Example:

```
from agno.tools import Toolkit
from mindroom.tool_system.metadata import ToolCategory, ToolManagedInitArg, register_tool_with_metadata


@register_tool_with_metadata(
    name="needs_runtime",
    display_name="Needs Runtime",
    description="Example tool that needs runtime paths",
    category=ToolCategory.DEVELOPMENT,
    managed_init_args=(ToolManagedInitArg.RUNTIME_PATHS,),
)
def needs_runtime_tools() -> type[Toolkit]:
    class NeedsRuntimeTools(Toolkit):
        def __init__(self, *, runtime_paths):
            self.runtime_paths = runtime_paths
            super().__init__(name="needs_runtime", tools=[])

    return NeedsRuntimeTools
```

## MCP via plugins (advanced)

MindRoom supports native MCP servers in `config.yaml` — see [MCP](https://docs.mindroom.chat/mcp/index.md) for the normal setup path. This plugin pattern is still useful when you want a custom wrapper around Agno `MCPTools`:

```
from agno.tools.mcp import MCPTools
from mindroom.tool_system.metadata import (
    SetupType,
    ToolCategory,
    ToolStatus,
    register_tool_with_metadata,
)


class FilesystemMCPTools(MCPTools):
    def __init__(self, **kwargs):
        super().__init__(
            command="npx -y @modelcontextprotocol/server-filesystem /path/to/dir",
            **kwargs,
        )


@register_tool_with_metadata(
    name="mcp_filesystem",
    display_name="MCP Filesystem",
    description="Tools from an MCP filesystem server",
    category=ToolCategory.DEVELOPMENT,
    status=ToolStatus.AVAILABLE,
    setup_type=SetupType.NONE,
)
def mcp_filesystem_tools():
    return FilesystemMCPTools
```

Reference the plugin and tool in `config.yaml`:

```
plugins:
  - ./plugins/mcp-filesystem

agents:
  assistant:
    tools:
      - mcp_filesystem
```

The factory function must return the toolkit class, not an instance. MCP toolkits are async; Agno's async agent runs (`arun`, `aprint_response`) handle MCP connect and disconnect automatically.

## Plugin skills

List skill directories in the manifest `skills` array. Each listed directory is added to MindRoom's skill search roots. Skill subdirectories must contain a `SKILL.md` file with YAML frontmatter (name, description, requirements).

## Hooks

Plugins can ship typed event hooks for message enrichment, response transformation, lifecycle observation, tool call gating, reactions, schedules, and custom events. See the [Hooks](https://docs.mindroom.chat/hooks/index.md) page for full documentation including:

- The `@hook` decorator and all parameters
- The built-in events and their execution modes
- The enrichment pipeline (`message:enrich`)
- Custom events
- Error handling without cooldowns or circuit breakers
- Testing patterns

## Live development (hot reload)

Plugins hot-reload automatically. When you edit any file inside a configured plugin directory, MindRoom notices the change on the next poll, waits out the debounce window, re-imports the plugin's modules in place, swaps the new hooks and tools into the live registry, and the next event invokes your new code. In practice the new code is usually live about 1-2 seconds after a save. No service restart and no agent session disruption.

### How it works

- A background watcher polls each configured plugin root every ~1s and debounces saves over a ~1s window.
- On change, the synthetic plugin package subtree is evicted from `sys.modules`, `load_plugins()` re-runs, a fresh `HookRegistry` is built, and the live registry is swapped atomically.
- Module-level `asyncio.Task` objects, and one-level containers like `dict[..., Task]`, on the old module are best-effort cancelled before the swap.
- The watcher ignores `__pycache__/`, `*.pyc`, `*.pyo`, editor swap files (`*.swp`, `*~`, `.#*`, `*.tmp`), and tool caches (`.ruff_cache/`, `.mypy_cache/`, `.pytest_cache/`).
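The ignore rules above can be sketched as a path filter. The patterns are copied from the list; the matching logic itself is an assumption:

```python
import fnmatch
from pathlib import PurePath

IGNORED_DIRS = {"__pycache__", ".ruff_cache", ".mypy_cache", ".pytest_cache"}
IGNORED_FILE_PATTERNS = ("*.pyc", "*.pyo", "*.swp", "*~", ".#*", "*.tmp")


def should_ignore(path: str) -> bool:
    """True when a changed path should not trigger a plugin reload."""
    parts = PurePath(path).parts
    if any(part in IGNORED_DIRS for part in parts):
        return True  # anywhere under an ignored cache directory
    name = PurePath(path).name
    return any(fnmatch.fnmatch(name, pattern) for pattern in IGNORED_FILE_PATTERNS)
```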

### Iterating on a plugin

```
# 1. Edit any file under your plugin
$EDITOR ~/.mindroom/plugins/my-plugin/hooks.py

# 2. Save. Watch the journal:
journalctl -u mindroom.service -f | grep -E 'Reloading plugins|Plugin reload complete'

# 3. Trigger your hook (send a message, fire the matching event, etc.)
#    The new code path is live.
```

You can break and fix a plugin freely. A broken save that prevents reload, such as an import error or plugin validation error, can deactivate the affected plugin set, and the next valid save reloads it successfully. A hook that only raises at runtime is different: that failure is logged for that event, and the hook is tried again on the next matching event. There is no quarantine, failure threshold, or cooldown. Each save just reloads.

### Manual reload

If you need to force a reload, for example because the watcher missed something or you want to confirm a swap explicitly, an admin user can send the chat command:

```
!reload-plugins
```

The bot replies with the active plugin set and the count of cancelled background tasks. Admin gating uses `authorization.global_users` from `config.yaml`.

### Caveats and tradeoffs

The hot-reload path is intentionally best-effort, not transactional.

- **In-flight turns keep their old code.** A reload swaps the registry for new events, but any callback already running on the old module finishes there, and only new events use the new module.
- **No partial-write detection.** If your editor saves the file in two writes, the watcher may briefly load the half-written first state, log an import error, and then reload again on the second write.
- **CPU-bound infinite loops still wedge the event loop.** The hook dispatcher uses `asyncio.timeout()` for cooperative cancellation, so truly blocking CPU code is not preempted.
- **Background resources held by the old module can leak until natural cleanup.** Only `asyncio.Task` objects directly attached to module globals are cancelled, so plugins that hold long-lived non-task resources need their own cleanup bookkeeping.
- **New plugins added to disk are not auto-enabled.** You still have to add them under `plugins:` in `config.yaml`, because the watcher only reloads plugins that are already configured.

### Production tip

Hot reload is enabled by default in production. Edit any configured plugin directory directly while `mindroom.service` is running. `~/.mindroom/plugins/<name>/` is the common local layout, and active agent sessions, in-flight conversations, and streaming responses continue untouched.

## Community plugins

The [mindroom-ai](https://github.com/mindroom-ai) organization maintains a collection of open-source plugins. Clone any of them into your plugins directory and add the path to `config.yaml`:

```
git clone https://github.com/mindroom-ai/ping-hook-plugin.git ~/.mindroom/plugins/ping-hook
```

```
plugins:
  - ~/.mindroom/plugins/ping-hook
```

### Hooks-only plugins

| Plugin                                                                          | Description                                                                                                                                |
| ------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| [ping-hook-plugin](https://github.com/mindroom-ai/ping-hook-plugin)             | Minimal example — responds to `!ping-hook` with a pong message. Good starting point for learning the hook system.                          |
| [shell-guard-plugin](https://github.com/mindroom-ai/shell-guard-plugin)         | Blocks dangerous shell commands (e.g., `systemctl restart mindroom`) via `tool:before_call` gating.                                        |
| [voice-enrich-plugin](https://github.com/mindroom-ai/voice-enrich-plugin)       | Injects AI-only metadata when a voice-transcribed message arrives, warning the model about possible transcription errors.                  |
| [location-enrich-plugin](https://github.com/mindroom-ai/location-enrich-plugin) | Enriches prompts with real-time GPS location from [Dawarich](https://dawarich.app/), including place matching and movement classification. |
| [restart-resume-plugin](https://github.com/mindroom-ai/restart-resume-plugin)   | Re-activates threads tagged `pending-restart` after a bot restart.                                                                         |

### Hooks + tools plugins

| Plugin                                                                      | Description                                                                                                                                  |
| --------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| [thread-snooze-plugin](https://github.com/mindroom-ai/thread-snooze-plugin) | Snooze and unsnooze threads — temporarily resolves a thread and wakes it at a specified time.                                                |
| [thread-goal-plugin](https://github.com/mindroom-ai/thread-goal-plugin)     | Persistent per-thread goals stored in Matrix room state that survive context compaction and restarts.                                        |
| [workloop-plugin](https://github.com/mindroom-ai/workloop-plugin)           | Autonomous work plans with dependencies, priorities, auto-poke, and template-driven task creation.                                           |
| [openviking-plugin](https://github.com/mindroom-ai/openviking-plugin)       | Long-term memory via [OpenViking](https://github.com/volcengine/OpenViking) — automatic memory extraction, recall, and compaction archiving. |

# OAuth Integration Framework

MindRoom owns OAuth state, callback handling, credential scoping, and token persistence because those steps decide which human and agent scope receive access to an external account. Providers supply only provider-specific metadata and parsing behavior, such as OAuth endpoints, scopes, client config services, optional PKCE requirements, token response parsing, claim validation, the token credential service name used by OAuth, and the optional tool config service name used by dashboard settings.

The generic API surface is `/api/oauth/{provider}/connect`, `/api/oauth/{provider}/authorize`, `/api/oauth/{provider}/callback`, `/api/oauth/{provider}/status`, and `/api/oauth/{provider}/disconnect`. Dashboard flows can call `connect` to receive an authorization URL, while conversation flows can show the `authorize` URL so the user opens a normal authenticated MindRoom page before MindRoom redirects to the external provider.

Dashboard OAuth state is opaque, time-limited, single-use, and bound to the authenticated MindRoom user plus the persisted agent execution scope resolved by the existing credentials target machinery. Conversation OAuth links use an additional opaque, time-limited, single-use connect token that binds the browser flow to the requester that produced the missing-credentials tool result. That connect token also carries the requester identity from the tool runtime, and MindRoom rejects redemption unless the authenticated dashboard user resolves to the same requester for scoped credentials.

Standalone deployments should set `MINDROOM_OWNER_USER_ID` through pairing so dashboard credential management and agent-issued OAuth links resolve to the owner Matrix user instead of the generic dashboard API-key principal. `MINDROOM_OWNER_USER_ID` is a single-owner shortcut and is not suitable for a hosted multi-user private-agent deployment. Hosted deployments that put MindRoom behind an external access layer should enable trusted upstream auth and configure the exact headers MindRoom may trust. When trusted upstream auth is enabled, MindRoom reads the configured stable user ID and optional email headers into `request.scope["auth_user"]`. For Matrix-backed private agents, the trusted identity must resolve to a Matrix user ID, either from a configured Matrix user ID header or from `MINDROOM_TRUSTED_UPSTREAM_EMAIL_TO_MATRIX_USER_ID_TEMPLATE`. The email-to-Matrix template must contain exactly one `{localpart}` placeholder and requires `MINDROOM_TRUSTED_UPSTREAM_EMAIL_HEADER`.

If a browser request cannot map to the requester stored in the conversation connect token, the OAuth authorize or callback path fails closed and no credential is saved. The access layer must strip any client-supplied copies of the trusted headers before injecting verified values.

Plugins may declare an `oauth_module` in `mindroom.plugin.json`. That module exposes `register_oauth_providers(settings, runtime_paths)` and returns `OAuthProvider` objects. This keeps FastAPI routing and state handling in core while still letting plugin authors define provider IDs, scopes, token exchange details, optional claim validators, and tool metadata.

OAuth token writes always go through `resolve_request_credentials_target()` and `save_scoped_credentials()`. For private agents, the target worker key is derived from the authenticated requester and the agent's saved `worker_scope`, so a user-owned OAuth token lands under the same scope normal tools will read at runtime. If MindRoom cannot resolve the authenticated dashboard user to the requester carried by a conversation-issued link, the link fails closed and no credential is saved.

Credential placement and visibility policy is centralized in `src/mindroom/credential_policy.py`. That module owns service classification, OAuth token field filtering, local-only credential service names, and worker-grantable rejections. Storage, API routing, OAuth provider loading, and worker identity derivation stay in their existing modules. Tools should declare `auth_provider` and, when credentials are missing, return a concise connect instruction that points at the generic `authorize` route for the provider and agent. Google OAuth tools always execute in the primary MindRoom runtime so worker runtimes never need Google OAuth client config or user refresh tokens.

OAuth token documents and editable tool setting documents should be separate services. The OAuth callback writes only the provider's `credential_service`, while dashboard configuration reads and writes the provider's `tool_config_service` when one is declared. OAuth app client config is stored separately from both of those services. Providers declare `client_config_services` in lookup order, and MindRoom reads `client_id`, `client_secret`, and optional `redirect_uri` from those services. Providers can also declare shared client config services for shared app IDs and secrets. Every client config service name must end with `_oauth_client` so credential placement and worker allowlist validation can identify plugin client config services without loading provider code. Shared client config services do not supply redirect URIs because each provider must use its own callback route.

Client config services are local-only deployment configuration and cannot be mirrored into worker containers. Generic credential responses redact `client_secret` for client config services. Generic credential saves preserve the existing `client_secret` only when the saved `client_id` is unchanged. Changing `client_id` requires submitting the matching new `client_secret`. First-time client config saves require both fields to be non-empty. Client config services cannot be copied through the generic copy route. Generic credentials endpoints do not return OAuth token fields and reject direct writes to OAuth token services.

Providers that require PKCE should set `pkce_code_challenge_method="S256"`. MindRoom generates one verifier per OAuth flow, stores it in pending server-side state, adds the S256 challenge to the authorization URL, and passes the verifier into token exchange. Custom `token_exchanger` callbacks receive `(provider, code, client_config, runtime_paths, code_verifier)` so they can include the verifier in provider-specific exchange requests.
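
The S256 challenge itself is standard (RFC 7636): base64url-encode the SHA-256 of the verifier, with padding stripped. A self-contained sketch:

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Return a (code_verifier, code_challenge) pair for the S256 method."""
    # One fresh verifier per OAuth flow, stored server-side until exchange.
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge

# Sanity check against the RFC 7636 appendix B test vector.
digest = hashlib.sha256(b"dBjftJeZ4CVP-mB92K27uhbUJU1p1r_wW1gFWFOEjXk").digest()
assert base64.urlsafe_b64encode(digest).rstrip(b"=").decode() == \
    "E9Melhoa2OwvFrEMTJguCHaoeK1t8URWbuGJSstw-cM"
```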

Identity restrictions are provider settings, not MindRoom policy. Providers can enforce allowed email domains, allowed hosted-domain claims, and custom claim validators. If a configured restriction cannot be checked from verified provider claims, the callback fails closed and no credential is saved.

Built-in Google providers use the generic framework for Drive, Calendar, Sheets, and Gmail. Each provider has minimal service-specific scopes, stores OAuth tokens under its own `*_oauth` service, stores editable tool settings separately, and uses `/api/oauth/*`. Each provider first checks its provider-specific client config service, then the shared `google_oauth_client` service. The shared `google_oauth_client` service supplies only `client_id` and `client_secret`; MindRoom derives the provider-specific redirect URI.

# Knowledge Bases

Knowledge bases give your agents access to your own documents through RAG (Retrieval-Augmented Generation). Drop files into a folder, point a knowledge base at it, and agents can search the indexed content when answering questions.

## How It Works

1. You configure a knowledge base pointing to a folder of documents
2. MindRoom indexes the files into a vector database (ChromaDB) using an embedder
3. Agents assigned to that knowledge base get a search tool that queries the indexed documents
4. When the agent uses the tool, relevant document chunks are included in its context

```
Indexing (scheduled refresh):

  ┌──────────────┐      ┌──────────┐      ┌──────────┐
  │ Files/Folder │ ───▶ │ Embedder │ ───▶ │ ChromaDB │
  └──────────────┘      └──────────┘      └──────────┘
         ▲
         │ on-access/API refresh
         │ git sync during refresh

Querying (agentic RAG):

  ┌───────┐  search   ┌──────────┐
  │ Agent │ ────────▶ │ ChromaDB │
  │       │ ◀──────── │          │
  └───────┘  chunks   └──────────┘
```

## Quick Start

Add a knowledge base and assign it to an agent:

```
knowledge_bases:
  docs:
    path: ./knowledge_docs
    watch: false
    chunk_size: 5000
    chunk_overlap: 0

agents:
  assistant:
    display_name: Assistant
    role: A helpful assistant with access to our docs
    knowledge_bases: [docs]
```

Place files in `./knowledge_docs/`, then trigger a reindex from the dashboard/API or let MindRoom watch shared local bases with `watch: true`. Chat uses the last successfully published index and continues without blocking when a base is missing, stale, or failed. When a watched file changes, MindRoom marks the published index stale, refreshes in the background, and atomically publishes the replacement when it succeeds. When `watch: false`, direct external file edits require an explicit reindex, while dashboard/API upload and delete actions still schedule a refresh after a successful mutation.

Knowledge base IDs are the keys under `knowledge_bases`. Use a non-empty single path component such as `docs` or `company_docs`, not `""`, `.`, `..`, or names containing `/` or `\`.

## Configuration

### Basic Knowledge Base

```
knowledge_bases:
  my_docs:
    path: ./knowledge_docs/my_docs   # Folder containing documents
    watch: false                      # Direct external edits require reindex; API mutations still schedule refresh
    chunk_size: 5000                  # Max characters per chunk
    chunk_overlap: 0                  # Overlap between adjacent chunks
```

| Field           | Type   | Default            | Description                                                                                                                                                                                                                                                 |
| --------------- | ------ | ------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `path`          | string | `./knowledge_docs` | Folder path (relative to the config file directory or absolute)                                                                                                                                                                                             |
| `watch`         | bool   | `true`             | When true, shared local folders watch filesystem changes and schedule background published-index refresh without blocking reads. When false, direct external edits require explicit reindex; dashboard/API upload and delete actions still schedule refresh |
| `chunk_size`    | int    | `5000`             | Maximum characters per chunk for text-like files (minimum: `128`)                                                                                                                                                                                           |
| `chunk_overlap` | int    | `0`                | Overlap characters between adjacent chunks (must be `< chunk_size`)                                                                                                                                                                                         |
| `git`           | object | `null`             | Optional Git repository sync settings                                                                                                                                                                                                                       |

Use smaller `chunk_size` values when your embedding server has lower token or batch limits. If chunks are too large for the embedder, indexing retries will keep failing with embedder 500 errors.
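
To see how the two knobs interact, here is an illustrative chunker (a sketch, not MindRoom's actual implementation) that applies the documented constraints:

```python
def chunk_text(text: str, chunk_size: int = 5000, chunk_overlap: int = 0) -> list[str]:
    """Split text into fixed-size character chunks with optional overlap."""
    if chunk_size < 128:
        raise ValueError("chunk_size minimum is 128")
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be < chunk_size")
    # Each chunk starts (chunk_size - chunk_overlap) characters after the last,
    # so adjacent chunks share chunk_overlap characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("a" * 300, chunk_size=128, chunk_overlap=28)
print([len(c) for c in chunks])  # → [128, 128, 100]
```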

### Private Agent Knowledge

Use `agents.<name>.private.knowledge` when one shared agent definition should index PrivateAgentKnowledge from each requester's private root.

```
knowledge_bases:
  company_docs:
    path: ./company_docs
    watch: false

agents:
  mind:
    display_name: Mind
    role: A persistent personal AI companion
    model: sonnet
    private:
      per: user
      root: mind_data
      template_dir: ./mind_template
      knowledge:
        path: memory
        watch: false
    knowledge_bases: [company_docs]
```

With this configuration, each requester's private knowledge path becomes `<their private root>/memory`. The template source is explicit, so you can see and edit the files being copied into each requester's private root. `private.template_dir` only copies files. PrivateAgentKnowledge is enabled only when you explicitly configure `private.knowledge.path`. `private.knowledge.path` must be relative to the private root and cannot be absolute or escape with `..`. It can point to any folder inside the private root, including `.` for the private root itself.

MindRoom keeps a separate index per effective private root, so one requester's indexed data is not shared with another requester's runtime. For isolating scopes such as `user` and `user_agent`, MindRoom refreshes the private index on access instead of keeping a background watcher alive for every requester root. Git-backed knowledge syncs during scheduled or explicit refreshes.

Top-level `knowledge_bases` remain the shared/global mechanism, so the same agent can combine PrivateAgentKnowledge with shared company knowledge. PrivateAgentKnowledge applies to the normal agent runtime path, not the OpenAI-compatible `/v1` API. If you enable `private.knowledge.git`, use a dedicated subtree such as `kb_repo`. Do not point Git-backed private knowledge at `.` or `memory/`, and do not use a Git checkout path that your template or private file memory also writes into.

| Field                             | Type   | Default | Description                                                                                                                                                             |
| --------------------------------- | ------ | ------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `private.knowledge.enabled`       | bool   | `true`  | Whether PrivateAgentKnowledge indexing is active for this agent                                                                                                         |
| `private.knowledge.path`          | string | `null`  | Private-root-relative folder to index. Required when `private.knowledge.enabled` is `true`; set `enabled: false` to disable private knowledge                           |
| `private.knowledge.watch`         | bool   | `true`  | When true, PrivateAgentKnowledge schedules background refresh on access. When false, direct external edits require explicit refresh                                     |
| `private.knowledge.chunk_size`    | int    | `5000`  | Maximum characters per indexed chunk                                                                                                                                    |
| `private.knowledge.chunk_overlap` | int    | `0`     | Overlap characters between adjacent chunks. Must be smaller than `chunk_size`                                                                                           |
| `private.knowledge.git`           | object | `null`  | Optional Git sync configuration for PrivateAgentKnowledge. Git-backed private knowledge must use a dedicated subtree outside requester-writable memory/template content |

Use `private.knowledge` when the data itself should be private to that requester's private instance. Use top-level `knowledge_bases` when the same documents should stay shared across agents or users.

### Multiple Knowledge Bases

You can define multiple knowledge bases and assign them to different agents:

```
knowledge_bases:
  engineering:
    path: ./knowledge_docs/engineering
    watch: false
    chunk_size: 5000
    chunk_overlap: 0
  product:
    path: ./knowledge_docs/product
    watch: false
    chunk_size: 5000
    chunk_overlap: 0
  legal:
    path: ./knowledge_docs/legal
    watch: false
    chunk_size: 1000
    chunk_overlap: 100

agents:
  developer:
    display_name: Developer
    role: Engineering assistant
    knowledge_bases: [engineering]

  pm:
    display_name: Product Manager
    role: Product planning assistant
    knowledge_bases: [product, engineering]  # Can access multiple bases

  compliance:
    display_name: Compliance
    role: Legal and compliance reviewer
    knowledge_bases: [legal]
```

When an agent has multiple knowledge bases, results are interleaved fairly so no single base dominates the top results.
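
As a sketch, a round-robin merge produces this kind of fairness (assumption: MindRoom's exact interleaving order may differ):

```python
from itertools import chain, zip_longest

_MISSING = object()

def interleave(results_per_base: list[list[str]]) -> list[str]:
    """Round-robin merge: take one result from each base per pass."""
    merged = chain.from_iterable(zip_longest(*results_per_base, fillvalue=_MISSING))
    return [r for r in merged if r is not _MISSING]

print(interleave([["eng1", "eng2", "eng3"], ["prod1"]]))
# → ['eng1', 'prod1', 'eng2', 'eng3']
```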

## Git-Backed Knowledge Bases

Knowledge bases can sync from a Git repository. MindRoom starts a background refresh for configured shared Git knowledge bases when runtime support starts. After that, it schedules another background refresh every `poll_interval_seconds`. Reads keep using the last published index while a refresh is running.

```
knowledge_bases:
  pipefunc_docs:
    path: ./knowledge_docs/pipefunc
    watch: false
    chunk_size: 1200
    chunk_overlap: 120
    git:
      repo_url: https://github.com/pipefunc/pipefunc
      branch: main
      poll_interval_seconds: 300
      lfs: false
      skip_hidden: true
      include_patterns:
        - "docs/**"
```

### Git Configuration Fields

| Field                   | Type   | Default    | Description                                                                                                    |
| ----------------------- | ------ | ---------- | -------------------------------------------------------------------------------------------------------------- |
| `repo_url`              | string | *required* | HTTPS repository URL to clone/fetch                                                                            |
| `branch`                | string | `main`     | Branch to track                                                                                                |
| `poll_interval_seconds` | int    | `300`      | Interval for scheduling background Git refreshes                                                               |
| `credentials_service`   | string | `null`     | Service name in CredentialsManager for private repos                                                           |
| `lfs`                   | bool   | `false`    | Enable Git LFS support and hydrate the checkout after sync. Requires `git-lfs` on the machine running MindRoom |
| `sync_timeout_seconds`  | int    | `3600`     | Abort one Git command if it exceeds this timeout                                                               |
| `skip_hidden`           | bool   | `true`     | Skip files/folders starting with `.`                                                                           |
| `include_patterns`      | list   | `[]`       | Root-anchored glob patterns to include                                                                         |
| `exclude_patterns`      | list   | `[]`       | Root-anchored glob patterns to exclude                                                                         |

When `lfs: true`, install `git-lfs` on the runtime host for `uv run` or `uvx` flows. Bundled container images already include it.

### Sync Behavior

- Chat and runtime requests never wait for Git sync or indexing.
- Missing, stale, or failed knowledge schedules a per-binding refresh and the current request continues with availability metadata.
- Explicit dashboard/API reindex runs Git sync first for Git-backed bases and then rebuilds a candidate index.
- When `lfs: true`, MindRoom disables implicit LFS smudge during clone/checkout/reset and explicitly hydrates the checkout after sync, keeping the working tree complete even when indexing filters only include some file types.
- Local edits to Git-tracked files are discarded during refresh sync, and tracked deletions are restored from the remote checkout.
- Git-backed bases reject dashboard/API file upload and delete mutations; update the repository and reindex instead.
- A successful refresh publishes a new index, which becomes the last successfully published one; a failed refresh preserves the previous index and records the error in status metadata.

### File Filtering with Patterns

Patterns are matched from the repository root. `*` matches one path segment, `**` matches zero or more segments.

```
knowledge_bases:
  project_docs:
    path: ./knowledge_docs/project
    git:
      repo_url: https://github.com/org/project
      include_patterns:
        - "docs/**"                    # All files under docs/
        - "README.md"                  # Root README only
        - "content/posts/*/index.md"   # Specific nested files
      exclude_patterns:
        - "docs/internal/**"           # Exclude internal docs
```

- If `include_patterns` is empty, all non-hidden files are eligible
- If `include_patterns` is set, a file must match at least one pattern
- `exclude_patterns` are applied last and remove matching files
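
An illustrative matcher with these semantics (a sketch, not MindRoom's actual code; within a single segment, `fnmatch`-style wildcards are assumed):

```python
from fnmatch import fnmatchcase

def glob_match(pattern: str, path: str) -> bool:
    """Root-anchored match: '*' spans one path segment, '**' zero or more."""
    def match(pat: list[str], segs: list[str]) -> bool:
        if not pat:
            return not segs
        head, rest = pat[0], pat[1:]
        if head == "**":
            # Try consuming zero, one, two, ... leading segments.
            return any(match(rest, segs[i:]) for i in range(len(segs) + 1))
        return bool(segs) and fnmatchcase(segs[0], head) and match(rest, segs[1:])
    return match(pattern.split("/"), path.split("/"))

assert glob_match("docs/**", "docs/guide/intro.md")
assert glob_match("content/posts/*/index.md", "content/posts/2024/index.md")
assert not glob_match("README.md", "src/README.md")   # root-anchored
assert not glob_match("docs/internal/**", "docs/public/a.md")
```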

Multiple knowledge bases may point at the same root when they use the same source ownership settings. This is the preferred way to expose separate views of a large repository without cloning it more than once.

```
knowledge_bases:
  project_docs:
    path: ./knowledge_docs/project
    git:
      repo_url: https://github.com/org/project
      branch: main
      include_patterns: ["docs/**"]
  project_source:
    path: ./knowledge_docs/project
    git:
      repo_url: https://github.com/org/project
      branch: main
      include_patterns: ["src/**"]
```

### Private Repository Authentication

For private HTTPS repositories, store credentials and reference them in the config.

**Step 1:** Store credentials via the API or Dashboard (Credentials tab):

```
curl -X POST http://localhost:8765/api/credentials/github_private \
  -H "Content-Type: application/json" \
  -d '{"credentials":{"username":"x-access-token","token":"ghp_your_token_here"}}'
```

**Step 2:** Reference the service name in your knowledge base config:

```
knowledge_bases:
  private_docs:
    path: ./knowledge_docs/private
    git:
      repo_url: https://github.com/org/private-repo
      credentials_service: github_private
```

Accepted credential fields:

| Fields                  | Notes                                           |
| ----------------------- | ----------------------------------------------- |
| `username` + `token`    | Standard GitHub/GitLab access token auth        |
| `username` + `password` | Basic HTTP auth                                 |
| `api_key`               | Uses `x-access-token` as username automatically |

## Embedder Configuration

Knowledge bases use the same embedder configured in the `memory` section:

```
memory:
  embedder:
    provider: openai        # or "ollama", "huggingface", or "sentence_transformers"
    config:
      model: text-embedding-3-small
      host: null             # For self-hosted (Ollama)
      dimensions: null       # Optional: embedding dimension override (e.g., 256)
```

| Provider                | Model Example                            | Notes                                                                     |
| ----------------------- | ---------------------------------------- | ------------------------------------------------------------------------- |
| `openai`                | `text-embedding-3-small`                 | Requires `OPENAI_API_KEY`                                                 |
| `ollama`                | `nomic-embed-text`                       | Self-hosted, set `host` or `OLLAMA_HOST`                                  |
| `sentence_transformers` | `sentence-transformers/all-MiniLM-L6-v2` | Fully local Python runtime; auto-installs the optional extra on first use |

## Storage

Knowledge data is stored under `<storage_path>/knowledge_db/<sanitized_base_id>_<hash>/`. Each successful refresh publishes a generation-specific ChromaDB collection whose name begins with `mindroom_knowledge_<sanitized_base_id>_<hash>`. The base ID is sanitized to alphanumerics, hyphens, and underscores only, and the hash is a digest of the resolved knowledge path. For PrivateAgentKnowledge, the effective private-root path is part of that hash, so each requester-local root gets an isolated index.
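
A sketch of the naming scheme (the digest algorithm and length here are assumptions; the docs only state that the hash derives from the resolved path):

```python
import hashlib
import re

def storage_dir_name(base_id: str, resolved_path: str) -> str:
    # Keep only alphanumerics, hyphens, and underscores in the base ID.
    sanitized = re.sub(r"[^A-Za-z0-9_-]", "_", base_id)
    # Assumed digest: sha256 truncated to 8 hex chars (illustrative only).
    digest = hashlib.sha256(resolved_path.encode("utf-8")).hexdigest()[:8]
    return f"{sanitized}_{digest}"

print(storage_dir_name("company docs", "/srv/mindroom/knowledge_docs"))
```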

The storage path defaults to `mindroom_data/` next to your `config.yaml`, or can be set with `MINDROOM_STORAGE_PATH`.

## Dashboard Management

The web dashboard provides a Knowledge tab for managing knowledge bases without editing YAML:

- Create, edit, and delete knowledge bases
- Configure chunk size and overlap per knowledge base
- Configure Git sync settings
- Upload and remove files for non-Git-backed bases
- Trigger a full reindex on demand
- Monitor indexing status (file count vs. indexed count)
- Assign knowledge bases to agents from the Agents tab

## API Endpoints

See the [Dashboard API reference](https://docs.mindroom.chat/dashboard/#knowledge) for the full list of knowledge base endpoints (list, upload, delete, reindex, status).

## Hot Reload

Knowledge base configuration supports hot reload. Changing `config.yaml` does not immediately reinitialize every configured knowledge base: agents keep using the last successfully published indexes until a refresh for their resolved binding succeeds. Changed settings mark existing published indexes stale or unavailable, depending on query compatibility, and scheduled refresh rebuilds the affected binding in the background.

# Memory System

MindRoom supports three memory backends:

- `mem0`: vector memory (semantic retrieval + extraction via Mem0)
- `file`: markdown memory files (`MEMORY.md` plus optional dated notes)
- `none`: disabled memory for stateless agents

Set the global default backend with `memory.backend`. Override the backend per agent with `agents.<name>.memory_backend`. When an agent uses `memory_backend: file`, its file memory lives in its canonical workspace root. When an agent uses `memory_backend: none`, MindRoom skips prompt memory lookup, automatic memory persistence, and the explicit `memory` tool for that agent. Use `agents.<name>.private` when one shared agent definition should keep file memory inside a requester-local private root. `private` changes where private files live. It does not switch the memory backend by itself.

OpenClaw compatibility uses this same backend selection; there is no separate OpenClaw-only memory engine.

Optional:

- `memory.team_reads_member_memory: true` allows team-context memory reads to include member agent scopes.

## Memory Scopes

| Scope | User ID Format               | Description                                |
| ----- | ---------------------------- | ------------------------------------------ |
| Agent | `agent_<name>`               | Agent preferences and durable user context |
| Team  | `team_<agent1>+<agent2>+...` | Shared team conversation memory            |

Notes:

- Team IDs are sorted agent names joined by `+`.
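
The team ID derivation is just sorting and joining (a sketch of the documented format):

```python
def team_memory_id(agent_names: list[str]) -> str:
    """Build the team memory user ID from participating agent names."""
    return "team_" + "+".join(sorted(agent_names))

print(team_memory_id(["coder", "assistant"]))  # → team_assistant+coder
```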

## Backend: `mem0`

`mem0` keeps the existing behavior:

- semantic retrieval before response
- automatic extraction after turns
- storage in Chroma-backed Mem0 collections

Example:

```
memory:
  backend: mem0
  embedder:
    provider: openai
    config:
      model: text-embedding-3-small
      dimensions: null             # Optional: embedding dimension override (e.g., 256)
```

Fully local embedder example:

```
memory:
  backend: mem0
  embedder:
    provider: sentence_transformers
    config:
      model: sentence-transformers/all-MiniLM-L6-v2
```

MindRoom auto-installs the optional `sentence_transformers` extra the first time this provider is used.

Ollama embedder example:

```
memory:
  backend: mem0
  embedder:
    provider: ollama
    config:
      model: nomic-embed-text
      host: http://localhost:11434
```

Supported embedder providers: `openai`, `ollama`, `huggingface`, `sentence_transformers`.

### Memory LLM

The memory system uses an LLM for extraction. Configure it with `memory.llm`:

```
memory:
  llm:
    provider: ollama    # ollama, openai, or anthropic
    config:
      model: llama3.2
```

Supported LLM providers: `ollama` (default), `openai`, `anthropic`.

## Backend: `none`

`none` disables MindRoom's built-in durable memory for the effective agent.

Global example:

```
memory:
  backend: none
```

Shorthand example:

```
memory: none
```

Per-agent stateless override:

```
memory:
  backend: mem0

agents:
  scratch:
    display_name: Scratch
    role: Stateless scratchpad agent
    memory_backend: none
```

Disabled memory does not disable Agno Learning. Set `learning: false` separately if you also want to disable learning.

## Backend: `file`

`file` keeps memory in markdown files and treats files as source-of-truth.

Example:

```
memory:
  backend: file
  file:
    max_entrypoint_lines: 200
```

`memory.file.path` is an optional fallback root for file-memory paths. It does not relocate canonical agent file memory (which always lives under the agent's workspace root). It can affect team file memory when the resolution determines the configured path should be used.

Per-agent override example:

```
memory:
  backend: mem0

agents:
  coder:
    display_name: Coder
    role: Write and review code
    memory_backend: file
```

For shared agents, file memory now lives directly under `agents/<name>/workspace/`. For requester-private agents, file memory lives directly under the effective private root. Use `private` when you need per-requester file-memory isolation.

Private instance example:

```
agents:
  mind:
    display_name: Mind
    role: A persistent personal AI companion
    memory_backend: file
    private:
      per: user
      root: mind_data
      template_dir: ./mind_template
```

In this setup, each requester gets their own private `mind_data/` root inside a canonical private-instance state root in shared storage. When `memory_backend: file` is enabled, that private root becomes the agent's effective file-memory root.

If `./mind_template/` contains `MEMORY.md` and `memory/`, those files are copied into each private root on first use and then remain editable per requester. Later runs backfill newly added scaffold files without overwriting requester edits. MindRoom does not invent `MEMORY.md` or `memory/` for private agents; put those files in your template directory if you want them scaffolded into each private root.

If `memory_backend` is not `file`, `private` still creates private files and directories, but it does not make file memory active. Use `private` for requester-isolated workspaces.

### File layout

Agent file memory is stored under each agent's canonical workspace root:

- `agents/<agent>/workspace/MEMORY.md`
- `agents/<agent>/workspace/memory/YYYY-MM-DD.md`

Team file memory is mirrored under each participating agent's storage directory:

- `agents/<agent>/memory_files/team_<sorted_members>/MEMORY.md`
- `agents/<agent>/memory_files/team_<sorted_members>/memory/YYYY-MM-DD.md`
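The team directory key is derived from the sorted member list, so every participating agent's mirror agrees on the path. A minimal sketch, assuming underscore-joined names (the exact separator is an implementation detail not specified here):

```python
def team_memory_dir(agent: str, members: list[str]) -> str:
    # Sort members so all participating agents derive the same key.
    key = "team_" + "_".join(sorted(members))
    return f"agents/{agent}/memory_files/{key}"
```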

## File Auto-Flush Worker

When the effective backend is `file` for at least one agent, you can enable background auto-flush:

```
memory:
  backend: file
  auto_flush:
    enabled: true
    flush_interval_seconds: 1800
    idle_seconds: 120
    max_dirty_age_seconds: 600
    stale_ttl_seconds: 86400
    max_cross_session_reprioritize: 5
    retry_cooldown_seconds: 30       # Cooldown before retrying a failed extraction
    max_retry_cooldown_seconds: 300   # Upper bound for retry cooldown backoff
    batch:
      max_sessions_per_cycle: 10
      max_sessions_per_agent_per_cycle: 3
    extractor:
      no_reply_token: NO_REPLY
      max_messages_per_flush: 20
      max_chars_per_flush: 12000
      max_extraction_seconds: 30
      include_memory_context:
        memory_snippets: 5
        snippet_max_chars: 400
```

High-level behavior:

1. Turns mark sessions dirty.
1. Background worker picks eligible dirty sessions in bounded batches.
1. Worker runs a model-driven extraction (not keyword heuristics) to produce durable memories.
1. If extractor returns `NO_REPLY`, nothing is written.
1. Successful writes append to memory files via normal memory APIs.
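The cycle above can be sketched as a bounded batching loop. This is an illustrative model, not MindRoom's implementation; `extract` and `write` stand in for the model-driven extractor and the memory-file append, and the session record shape is hypothetical.

```python
import time

NO_REPLY = "NO_REPLY"  # extractor token meaning "nothing worth persisting"

def flush_cycle(sessions, extract, write, now=None, *,
                idle_seconds=120, max_dirty_age_seconds=600,
                max_sessions_per_cycle=10):
    """Run one auto-flush cycle over dirty sessions (illustrative only)."""
    now = now if now is not None else time.time()
    eligible = [
        s for s in sessions
        if s["dirty"] and (
            now - s["last_activity"] >= idle_seconds            # session went quiet
            or now - s["dirty_since"] >= max_dirty_age_seconds  # or stayed dirty too long
        )
    ]
    for session in eligible[:max_sessions_per_cycle]:           # bounded batch
        result = extract(session)                               # model-driven extraction
        if result != NO_REPLY:                                  # NO_REPLY => write nothing
            write(session, result)
        session["dirty"] = False
```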

## UI Configuration

The Dashboard **Memory** page supports:

- backend selection (`mem0`, `file`, or `none`)
- team/member read toggle (`team_reads_member_memory`)
- embedder provider/model/host
- file backend settings (`path`, `max_entrypoint_lines`)
- auto-flush settings (intervals, idle/age thresholds, retries)
- batch sizing
- extractor settings (`no_reply_token`, message/char/time limits, `include_memory_context` dedupe bounds)

Save from the Memory page to persist changes to `config.yaml`. Use the Dashboard **Agents** page to set an agent-specific **Memory Backend** override.

## Optional Memory Tool

For explicit agent-controlled memory operations, add the `memory` tool:

```
agents:
  assistant:
    tools: [memory]
```

This exposes `add_memory`, `search_memories`, `list_memories`, `get_memory`, `update_memory`, and `delete_memory`.

## Agno Learning

MindRoom integrates Agno's built-in Learning system, which lets agents learn and adapt from conversations. Learning is separate from the memory backends above — it uses Agno's own SQLite-backed storage in each agent's state root (`learning/`).

### Configuration

```
defaults:
  learning: true          # Enable learning for all agents (default: true)
  learning_mode: always   # "always" (extract after every turn) or "agentic" (agent decides via tool)
```

Per-agent override:

```
agents:
  assistant:
    learning: false       # Disable learning for this agent
  research:
    learning_mode: agentic  # Agent controls when to learn
```

| Field           | Type   | Default  | Description                                                                                      |
| --------------- | ------ | -------- | ------------------------------------------------------------------------------------------------ |
| `learning`      | bool   | `true`   | Enable Agno Learning for the agent                                                               |
| `learning_mode` | string | `always` | `always`: automatic extraction after every turn. `agentic`: agent decides via tool when to learn |

Agents inherit `learning` and `learning_mode` from `defaults` unless explicitly overridden. Disabled agents do not create or update learning state. Learning data persists in `agents/<name>/learning/<agent>.db` within the agent's state root.
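The inheritance rule is a straightforward agent-over-defaults lookup. A minimal sketch using the field names from the table above (the function itself is illustrative, not MindRoom's API):

```python
def resolve_learning(defaults: dict, agent: dict) -> tuple[bool, str]:
    # Agent-level values win; otherwise fall back to defaults, then built-ins.
    enabled = agent.get("learning", defaults.get("learning", True))
    mode = agent.get("learning_mode", defaults.get("learning_mode", "always"))
    return enabled, mode
```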

# Voice Messages

MindRoom can surface Matrix voice messages as attachment-aware prompts for agents. If STT is configured, MindRoom also transcribes the audio and routes it through the normal text pipeline. If STT is unavailable, disabled, or fails, the audio still remains available as an attachment and falls back to `🎤 [Attached voice message]`.

## Overview

When a voice message is received:

1. The audio event is handled through the shared media pipeline.
1. Audio is downloaded and decrypted, if needed, and registered as a context-scoped attachment.
1. If STT is configured and succeeds, the audio is transcribed and lightly normalized for mentions and commands.
1. If STT is unavailable, disabled, or fails, MindRoom falls back to `🎤 [Attached voice message]`.
1. The normalized text plus attachment metadata is dispatched using the normal routing and thread logic.
1. If routing is ambiguous in a multi-agent room, the router posts a visible handoff message.
1. If `voice.visible_router_echo` is enabled and the router is present and allowed to reply, the router also posts the normalized voice text as a display-only message.
1. Otherwise, no extra router message is posted and the chosen agent replies directly.
1. The responding agent receives the original audio attachment alongside the normalized prompt.

## Configuration

Enable STT and voice-intelligence formatting in `config.yaml`:

```
voice:
  enabled: true
  visible_router_echo: false
  stt:
    provider: openai
    model: whisper-1
    # Optional: custom endpoint (without /v1 suffix)
    # host: http://localhost:8080
  intelligence:
    model: default  # Model used for command recognition
```

Or use the dashboard's Voice tab.

With `voice.enabled: false`, audio messages are still surfaced as attachments with the fallback prompt. Enabling voice adds STT and command-recognition on top of that attachment flow. With `voice.visible_router_echo: true`, the router also posts the normalized transcript or fallback text for inspection when it is present in the room and allowed to reply.

## STT Providers

MindRoom uses the OpenAI-compatible transcription API. Any service that implements the `/v1/audio/transcriptions` endpoint will work.

### OpenAI Whisper (Cloud)

```
voice:
  enabled: true
  stt:
    provider: openai
    model: whisper-1
```

Requires `OPENAI_API_KEY` environment variable.

### Self-Hosted Whisper

```
voice:
  enabled: true
  stt:
    provider: openai
    model: whisper-1
    host: http://localhost:8080
```

Note: do not include `/v1` in the host URL -- MindRoom appends `/v1/audio/transcriptions` automatically.

Use with [faster-whisper-server](https://github.com/fedirz/faster-whisper-server) or similar OpenAI-compatible STT servers.

### Custom API Key

For self-hosted solutions that require authentication:

```
voice:
  enabled: true
  stt:
    provider: openai
    model: whisper-1
    host: http://localhost:8080
    api_key: your-custom-api-key
```

If `api_key` is not set, MindRoom falls back to the `OPENAI_API_KEY` environment variable.

## Command Recognition

The intelligence component uses an AI model to analyze transcriptions and format them properly:

1. **Agent mentions** - Converts spoken agent names to `@agent` format
1. **Mention sanitization** - Mentions of agents not available in the current room have their `@` stripped so the agent is not falsely targeted
1. **Command patterns** - Identifies and formats `!command` syntax
1. **Speculative command rejection** - Commands the AI invents that were not in the original transcription are rejected to prevent false positives
1. **Smart formatting** - Handles speech recognition errors and natural language variations

### Intelligence Model

The intelligence model processes raw transcriptions to recognize commands and agent names:

```
voice:
  intelligence:
    model: default  # Uses the default model from your models config
```

You can specify a different model for faster or more accurate command recognition.

## How It Works

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Voice Msg   │────▶│ Download &  │────▶│ Transcribe  │────▶│ Format with │
│ (Audio)     │     │ Decrypt     │     │ (STT)       │     │ AI (LLM)    │
└─────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                                                  │
                                                                  ▼
                                                         ┌──────────────────┐
                                                         │ Normal Dispatch  │
                                                         │ Decision         │
                                                         └──────────────────┘
                                                           │            │
                                                           │            │
                                                           ▼            ▼
                                                 ┌──────────────┐  ┌──────────────┐
                                                 │ Visible      │  │ No Visible   │
                                                 │ Router       │  │ Router       │
                                                 │ Handoff      │  │ Handoff      │
                                                 └──────────────┘  └──────────────┘
                                                           │            │
                                                           └──────┬─────┘
                                                                  ▼
                                                           ┌─────────────┐
                                                           │ Agent       │
                                                           │ Responds    │
                                                           └─────────────┘
```

## Dispatch Behavior

### Single-agent rooms or explicitly targeted audio

If only one eligible agent is visible, that agent responds directly to the normalized audio event. If the audio caption or transcript explicitly mentions an agent, that targeted agent responds directly as well. In these cases, the router does not post an extra visible routing handoff. The transcript or fallback text is used internally for dispatch, not echoed to the room as a separate message. If `voice.visible_router_echo` is enabled, the router still posts a display-only copy of the normalized voice text, but agents ignore that echo and continue responding to the original audio event.

### Multi-agent rooms where the router must choose

If multiple agents are available and the audio does not already target one of them, the router uses the normalized text to do the usual routing step. The router then posts a normal handoff message such as `@home could you help with this?`. The selected agent responds to that router handoff, and the handoff carries the original audio attachment metadata forward. This is the case where a visible router message appears. If `voice.visible_router_echo` is also enabled, the router first posts the normalized voice text as a display-only echo and then posts the normal handoff.

### No router, or router cannot reply

Audio still works when the router is absent. In that case, agents handle the normalized audio directly using the same mention, thread, and permission rules as normal text messages. The same direct handling also applies when the router is present but is not allowed to reply to the original sender. In these cases, there is no visible router echo because the router does not handle the event. If multiple eligible agents remain and the audio does not already target one of them, there is no automatic handoff until the user mentions an agent.

### Visibility rule

MindRoom does not automatically post the transcript to the room. A visible router message appears only when the router must disambiguate between multiple eligible responders. If the responder is already clear from room shape, thread context, or explicit targeting, the chosen agent replies directly without an extra router message. Setting `voice.visible_router_echo: true` adds a visible router-authored echo of the normalized voice text when the router is actually allowed to process the event, without changing which event agents actually answer.
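The rule reduces to a single predicate. A hedged sketch (argument names are illustrative, not MindRoom's API):

```python
def router_posts_handoff(eligible_agents: list[str], explicitly_targeted: bool) -> bool:
    # A visible handoff appears only when the router must disambiguate
    # between several eligible responders and no agent is already targeted.
    return len(eligible_agents) > 1 and not explicitly_targeted
```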

### Attachment access

The original audio is always registered as a context-scoped attachment before dispatch continues. That means the responding agent can inspect the file directly, use audio-capable models, or fetch it later with the `attachments` tool. This is true whether the prompt came from a transcript, a fallback message, or a router handoff.

## Matrix Integration

Voice messages in Matrix are:

- Detected as `RoomMessageAudio` or `RoomEncryptedAudio` events
- Downloaded from the Matrix media server
- Decrypted if end-to-end encrypted (using the encryption key from the event)
- Registered as audio attachments before dispatch
- Sent to the STT service via the OpenAI-compatible API when transcription is enabled
- Normalized once per room and thread context, even though multiple bots may observe the event

Audio callbacks are registered on all bots because audio follows the shared media pipeline. Shared normalization prevents repeated download and STT work for the same event. Reply-permission checks still use the original human sender, not a later router relay.

## Environment Variables

| Variable         | Description                                                          |
| ---------------- | -------------------------------------------------------------------- |
| `OPENAI_API_KEY` | For OpenAI Whisper API (used as fallback if no `api_key` configured) |

## Text-to-Speech Tools

MindRoom also supports text-to-speech (TTS) through agent tools. These are separate from voice message transcription and allow agents to generate audio responses:

- **OpenAI** - Speech synthesis via `openai` tool
- **ElevenLabs** - High-quality AI voices and sound effects via `eleven_labs` tool
- **Cartesia** - Voice AI with optional voice localization via `cartesia` tool
- **Groq** - Fast speech generation via `groq` tool

See the [Tools documentation](https://docs.mindroom.chat/tools/index.md) for configuration details.

## Voice Fallback (No STT Available)

When STT is unavailable, disabled, or transcription fails, MindRoom falls back to raw audio passthrough:

1. The voice message audio is downloaded and saved locally as an attachment
1. The normalized text becomes `🎤 [Attached voice message]`
1. The raw audio is registered as an attachment ID available to agents in the room or thread context
1. When an agent responds, it automatically receives the raw audio as an Agno `Audio` object

This means voice messages still reach agents even without STT. Agents with audio-capable models can process the raw audio directly, and tool-using agents can retrieve the file by attachment ID. Attachment IDs in this fallback path use the same context-scoping rules described in [File & Video Attachments](https://docs.mindroom.chat/attachments/index.md).

## Limitations

- Only OpenAI-compatible STT APIs are supported
- Audio quality and background noise affect transcription accuracy
- Without STT, routing has less textual context, so explicit `@mentions` or existing thread context are more reliable in multi-agent rooms
- Without STT, agents receive raw audio instead of transcription, so the model or tools must support audio inputs to process it

## Tips

- **Say the agent name first** - "Hey @assistant, what's the weather?"
- **Use display names** - The AI converts spoken names like "HomeAssistant" to the correct `@home` mention

# Image Messages

MindRoom can process images sent to Matrix rooms, passing them to vision-capable AI models for analysis.

## Overview

When a user sends an image in a Matrix room:

1. The agent determines whether it should respond (via mention, thread participation, or DM)
1. The image is downloaded and decrypted (if E2E encrypted)
1. The image is wrapped as an `agno.media.Image` and passed to the AI model
1. The agent responds with its analysis

Image support works automatically for all agents -- no configuration is needed. The AI model must support vision (e.g., Claude, GPT-5.4).

## Supported Formats

MindRoom detects image format from file byte signatures:

- PNG
- JPEG
- GIF
- WebP
- BMP
- TIFF

If the declared MIME type in the Matrix event does not match the detected byte signature, MindRoom logs a warning and uses the detected type.

## How It Works

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ Image Msg   │────>│ Download &  │────>│ Pass to AI  │
│ (Matrix)    │     │ Decrypt     │     │ Model       │
└─────────────┘     └─────────────┘     └─────────────┘
                                              │
                                              v
                                        ┌─────────────┐
                                        │ Agent       │
                                        │ Responds    │
                                        └─────────────┘
```

## Usage

Send an image in a Matrix room and mention the agent in the caption:

- **With caption**: `@assistant What does this diagram show?` -- the caption is used as the prompt
- **Without caption**: The agent receives `[Attached image]` as the prompt and describes what it sees
- **Bare filename**: If the body is just a filename (e.g., `IMG_1234.jpg`), it is treated the same as no caption

Images work in both direct messages and threads, and with both individual agents and teams.

## Captions (MSC2530)

If the Matrix event's `filename` field differs from `body`, the `body` is used as a user caption. This follows [MSC2530](https://github.com/matrix-org/matrix-spec-proposals/pull/2530) semantics and works with clients that set the caption in the body.

## Image Persistence

Images are saved under `mindroom_data/attachments/` and `mindroom_data/incoming_media/` and registered as attachment records with 30-day retention. In addition to being passed to the AI model as vision input, each image is also registered as an `att_*` attachment ID so agents can reference it via tool calls. See [Attachments](https://docs.mindroom.chat/attachments/index.md) for details on retention and context scoping.

## Encryption

Both unencrypted and E2E encrypted images are supported. Encrypted images are decrypted transparently using the key material from the Matrix event.

## Caching

AI response caching is automatically skipped when images are present, since image payloads are large and unlikely to repeat.

## Media Fallback

If a model rejects inline media (images, audio, video, or documents), MindRoom automatically retries the request without the inline media. The retried prompt includes `[Inline media unavailable for this model]` to inform the agent that attachments were dropped. Agents can still reference the files via attachment IDs and tools.

This fallback is transparent — no user action is required. It detects provider-specific error patterns such as unsupported media type, base64 field validation failures, and capability rejections.

## Limitations

- **Routing in multi-agent rooms** -- in multi-agent rooms without an `@mention`, the router selects the best agent based on the image caption.
- **Bridge mention detection** uses `m.mentions` in the event, falling back to parsing HTML pills from `formatted_body` when `m.mentions` is absent (e.g., mautrix-telegram). Bridges that set neither may not trigger agent responses.
- **Model support** -- the configured model must support vision. Text-only models will ignore the image or return an error. If the model rejects the image entirely, the [media fallback](#media-fallback) retries without the inline image.

# Attachments

MindRoom can process files, images, audio, and videos sent to Matrix rooms, passing them to agents for analysis or action. Supported attachment kinds: `audio`, `file`, `image`, `video`.

## Overview

When a user sends a file, image, audio message, or video in a Matrix room:

1. The agent determines whether it should respond (via mention, thread participation, or DM)
1. The media is downloaded and decrypted (if E2E encrypted)
1. The file is saved locally and registered as a context-scoped attachment
1. The agent receives the media as an Agno `File`, `Video`, `Audio`, or `Image` object plus an attachment ID it can reference in tool calls
1. The agent responds with its analysis or takes action on the file

Attachment support works automatically for all agents -- no configuration is needed.

## How It Works

```
┌──────────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│ File/Image/Audio │────>│ Download &  │────>│ Register    │────>│ Pass to AI  │
│ /Video (Matrix)  │     │ Decrypt     │     │ Attachment  │     │ Model       │
└──────────────────┘     └─────────────┘     └─────────────┘     └─────────────┘
                                                                  │
                                                                  v
                                                            ┌─────────────┐
                                                            │ Agent       │
                                                            │ Responds    │
                                                            └─────────────┘
```

## Usage

Send a file, image, audio message, or video in a Matrix room and mention the agent in the caption:

- **With caption**: `@assistant Summarize this document` -- the caption is used as the prompt
- **Without caption**: The agent receives `[Attached file]`, `[Attached image]`, `[Attached audio]`, or `[Attached video]` as the prompt
- **Bare filename**: If the body is just the filename (e.g., `report.pdf`), it is treated the same as no caption

Attachments work in both direct messages and threads, and with both individual agents and teams.

## Attachment IDs

Each uploaded file or video is assigned a stable attachment ID (e.g., `att_abc123`). The agent's prompt is augmented with the available IDs:

```
Available attachment IDs: att_abc123. Use tool calls to inspect or process them.
```

Attachment IDs are **context-scoped** -- an attachment registered in one room or thread is not accessible from another. This prevents cross-room data leakage for ID-based access. Voice raw-audio fallback uses the same attachment ID mechanism; see [Voice Fallback](https://docs.mindroom.chat/voice/#voice-fallback-no-stt-available).

## The `attachments` Tool

Agents can use the optional `attachments` tool to interact with context-scoped attachments programmatically.

### Enabling

Add `attachments` to the agent's tool list:

```
agents:
  assistant:
    tools:
      - attachments
```

### Operations

| Operation                                              | Description                                                                                                                                    |
| ------------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| `list_attachments(target?)`                            | List metadata for attachments in the current context (ID, kind, local_path, filename, MIME type, size, room_id, thread_id, sender, created_at) |
| `get_attachment(attachment_id, mindroom_output_path?)` | Return one context attachment record, or save its bytes to a workspace-relative path and return a save receipt                                 |
| `register_attachment(file_path)`                       | Register a local file path as a context attachment ID (`att_*`)                                                                                |

When `mindroom_output_path` is omitted, `get_attachment()` returns the attachment metadata response, including the runtime-local `local_path`. For worker-routed agents, prefer `get_attachment("att_...", mindroom_output_path="incoming/file.ext")` before processing an attachment with `file`, `coding`, `python`, or `shell`, because the runtime-local path may not exist inside the worker workspace.

`mindroom_output_path` must be a file path relative to the agent workspace. It must not:

- be empty or absolute
- point at the workspace root
- contain `..` or NUL bytes
- use environment or user expansion

When the save succeeds, the response includes `mindroom_tool_output` with `status: "saved_to_file"`, `path`, byte count, `format: "binary"`, and `sha256`. In worker-routed shell and python tools, that workspace is also exposed as `~`, `$HOME`, and `$MINDROOM_AGENT_WORKSPACE`, so `incoming/file.ext` and `~/incoming/file.ext` refer to the same saved file.
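The path constraints can be checked with a small validator. An illustrative sketch of the documented rules, not MindRoom's actual code:

```python
from pathlib import PurePosixPath

def validate_output_path(raw: str) -> PurePosixPath:
    """Reject the documented invalid shapes for mindroom_output_path."""
    if not raw or "\x00" in raw:
        raise ValueError("path must be non-empty and free of NUL bytes")
    if "~" in raw or "$" in raw:
        raise ValueError("no user or environment expansion")
    path = PurePosixPath(raw)
    if path.is_absolute():
        raise ValueError("path must be workspace-relative")
    parts = path.parts  # pathlib already drops '.' segments
    if not parts or ".." in parts:
        raise ValueError("path must name a file inside the workspace")
    return path
```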

Use `matrix_message(action="send"|"reply"|"thread-reply", attachment_ids=..., attachment_file_paths=...)` to send attachments. `attachment_ids` accepts only context attachment IDs (`att_*`); `attachment_file_paths` accepts local file paths and auto-registers them in the current context before sending.

### Why use this tool?

Not all AI models support direct file inputs. The `attachments` tool lets any model work with files by calling tools that operate on attachment IDs, even if the model itself cannot ingest the raw bytes.

## Encryption

Both unencrypted and E2E encrypted files and videos are supported. Encrypted media is decrypted transparently using the key material from the Matrix event.

## Caching

AI response caching is automatically skipped when files, images, audio, or videos are present, since media payloads are large and unlikely to repeat.

## Retention

MindRoom automatically prunes attachment metadata and managed `incoming_media/` files older than 30 days. Pruning runs opportunistically during new attachment registration.
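Opportunistic pruning means expiry is enforced as a side effect of registering new attachments. A minimal in-memory sketch (the record shape and function name are hypothetical; the real pruning also removes managed `incoming_media/` files on disk):

```python
import time

RETENTION_SECONDS = 30 * 24 * 3600  # 30-day retention

def register_attachment(records: dict, att_id: str, path: str, now=None):
    """Register a new attachment and opportunistically prune expired ones."""
    now = now if now is not None else time.time()
    expired = [aid for aid, rec in records.items()
               if now - rec["created_at"] > RETENTION_SECONDS]
    for aid in expired:            # prune during registration, no background job
        del records[aid]
    records[att_id] = {"path": path, "created_at": now}
```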

## Limitations

- **Routing in multi-agent rooms** -- in multi-agent rooms without an `@mention`, the router selects the best agent based on the file caption.
- **Model support** -- the configured model must support file or video inputs for direct analysis. Models that do not can still use the `attachments` tool to inspect and process files via tool calls.

# Chat Commands

MindRoom provides chat commands that users can type in any Matrix room where MindRoom agents are present. Commands start with `!` and are handled by the router agent.

## Quick Reference

| Command                      | Description                                      |
| ---------------------------- | ------------------------------------------------ |
| `!help [topic]`              | Get help on commands or a specific topic         |
| `!hi`                        | Show the welcome message again                   |
| `!schedule <task>`           | Schedule a task or reminder                      |
| `!list_schedules`            | List pending scheduled tasks                     |
| `!cancel_schedule <id>`      | Cancel a scheduled task                          |
| `!edit_schedule <id> <task>` | Edit an existing scheduled task                  |
| `!config <operation>`        | View and modify configuration                    |
| `!reload-plugins`            | Force-reload all configured plugins (admin only) |

## Who Handles Commands

The **router** handles all commands exclusively. Even in single-agent rooms, commands are always processed by the router, not the agent. Commands work in both main room messages and within threads.

Voice messages that contain commands (e.g., spoken `!schedule`) are recognized after transcription and processed the same way.

## Permission Behavior

Commands are subject to the same authorization rules as normal messages. The sender must be authorized to interact with agents in the room (via `global_users`, `room_permissions`, or `default_room_access`). See [Authorization](https://docs.mindroom.chat/authorization/index.md) for details.

For `!config set`, only the user who requested the change can confirm or cancel it via reactions. Pending config changes expire after 24 hours.

## Commands

### `!help`

Display available commands or get detailed help on a specific topic.

```
!help
!help schedule
!help config
!help cancel_schedule
!help edit_schedule
```

**Topics:** `schedule`, `config`, `list_schedules`, `inspect_schedules`, `cancel`, `cancel_schedule`, `edit`, `edit_schedule`

### `!hi`

Show the welcome message for the current room, listing available agents, their roles and tools, and quick-start instructions.

```
!hi
```

### `!schedule`

Schedule a one-time or recurring task using natural language. Tasks run in the thread where they were created.

```
!schedule <natural-language-request>
```

**One-time tasks:**

```
!schedule in 5 minutes Check the deployment
!schedule tomorrow at 3pm Send the weekly report
```

**Recurring tasks:**

```
!schedule Every hour, @shell check server status
!schedule Daily at 9am, @finance market report
!schedule Weekly on Friday, @analyst prepare weekly summary
```

**Conditional workflows (polling-based):**

Conditional requests are converted to recurring cron-based polling schedules. These are periodic checks, not real event subscriptions.

```
!schedule If I get an email about "urgent", @phone_agent call me
!schedule When Bitcoin drops below $40k, @crypto_agent notify me
```

Include `@agent_name` in your schedule to target specific agents. The scheduler validates that mentioned agents are available in the room before creating the task.

Schedules use the timezone from `config.yaml` (defaults to UTC). See [Scheduling](https://docs.mindroom.chat/scheduling/index.md) for full details.

### `!list_schedules`

List pending scheduled tasks in the current room or thread.

```
!list_schedules
```

**Aliases:** `!listschedules`, `!list-schedules`, `!list_schedule`, `!listschedule`, `!list-schedule`, `!inspect_schedules`, `!inspectschedules`, `!inspect-schedules`, `!inspect_schedule`, `!inspectschedule`, `!inspect-schedule`

### `!cancel_schedule`

Cancel a specific scheduled task or all tasks in the room.

```
!cancel_schedule <task-id>
!cancel_schedule all
```

Use `!list_schedules` to find task IDs.

**Aliases:** `!cancelschedule`, `!cancel-schedule`

### `!edit_schedule`

Replace an existing scheduled task with new timing and content.

```
!edit_schedule <task-id> <new-task-description>
```

The task description is re-parsed to update timing and content. Schedule type cannot be changed (one-time to recurring or vice versa) -- cancel and recreate instead.

**Aliases:** `!editschedule`, `!edit-schedule`

### `!config`

View and modify MindRoom configuration from chat. Changes are validated against the Pydantic config schema before applying.

**View configuration:**

```
!config show
!config get agents
!config get models.default
!config get agents.analyst.display_name
```

**Modify configuration:**

```
!config set agents.analyst.display_name "Research Expert"
!config set models.default.id gpt-5.4
!config set defaults.markdown false
!config set timezone America/New_York
```

**Path syntax:**

- Use dot notation to navigate nested config (e.g., `agents.analyst.role`)
- Arrays use indexes (e.g., `agents.analyst.tools.0` for first tool)
- String values with spaces must be quoted
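
As a rough illustration, the dot-path lookup can be sketched in Python. This is a hypothetical helper (not MindRoom's actual code), and the config values are made up to mirror the examples above:

```python
from typing import Any

# Hypothetical sketch of dot-path navigation -- not MindRoom's implementation.
def get_path(config: dict, path: str) -> Any:
    """Walk nested config with dot notation; numeric parts index into lists."""
    node: Any = config
    for part in path.split("."):
        if isinstance(node, list):
            node = node[int(part)]  # e.g. agents.analyst.tools.0
        else:
            node = node[part]
    return node

config = {
    "agents": {"analyst": {"display_name": "Analyst", "tools": ["web_search"]}},
    "models": {"default": {"id": "gpt-5.4"}},
}
print(get_path(config, "agents.analyst.display_name"))  # Analyst
print(get_path(config, "agents.analyst.tools.0"))       # web_search
```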

#### Confirmation flow

When you use `!config set`, MindRoom:

1. Validates the proposed change against the config schema
1. Shows a preview with the current and new values
1. Adds reaction buttons to the preview message
1. Waits for the requester to react with ✅ (confirm) or ❌ (cancel)

Only the user who requested the change can confirm or cancel it. Pending changes are persisted in Matrix room state and survive restarts. Unconfirmed changes expire after 24 hours.

Changes are saved to `config.yaml` immediately on confirmation and take effect for new agent interactions.

### `!reload-plugins`

Force-reload every configured plugin from disk. Admin-only.

```
!reload-plugins
```

Plugins are also auto-reloaded on file save, typically about 1-2 seconds after save — see [plugins.md / Live development](https://docs.mindroom.chat/plugins/#live-development-hot-reload) for details. This command is the manual override: useful if the auto-watcher missed something, or to confirm a swap explicitly.

**Reply format:**

```
✅ Reloaded N plugins; cancelled K tasks; active: <plugin names>
```

**Permission:** Caller must be in `authorization.global_users`. Aliases: `!reload-plugins`, `!reload_plugins`.

## Stop Button

MindRoom supports cancelling in-progress responses via a reaction-based stop button, not a chat command.

When `defaults.show_stop_button` is `true` (the default), MindRoom adds a 🛑 reaction to the agent's message while it is generating. React with 🛑 on the message to cancel the response. The agent finalizes the partial text with `**[Response cancelled by user]**`.

The stop button only works on messages currently being generated. Only non-agent users can trigger cancellation — agent reactions are ignored.

See [Streaming — Cancellation](https://docs.mindroom.chat/streaming/#cancellation-and-errors) for details on how cancelled responses are finalized.

## Unknown Commands

Any message starting with `!` that does not match a known command returns an error message suggesting `!help`.

# Interactive Q&A

MindRoom agents can present clickable multiple-choice questions to users using Matrix reactions. When an agent's response contains a specially formatted JSON block, MindRoom automatically renders it as a numbered list with emoji reactions that users can click to respond.

## How It Works

1. An agent includes an `interactive` code block in its response.
1. MindRoom parses the JSON, formats the options as a numbered list, and adds emoji reactions to the message.
1. The user clicks a reaction emoji or types the option number.
1. MindRoom captures the selection and feeds it back to the agent as a follow-up prompt (`"The user selected: <value>"`).

The entire flow happens within the thread where the original question was asked.

## JSON Format

Agents emit interactive questions by wrapping JSON in an `interactive` code block:

````
```interactive
{
    "question": "What approach would you prefer?",
    "options": [
        {"emoji": "🚀", "label": "Fast and automated", "value": "fast"},
        {"emoji": "🔍", "label": "Careful and manual", "value": "careful"}
    ]
}
```
````

### Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `question` | string | No | The question text shown above options. Defaults to `"Please choose an option:"`. |
| `options` | array | Yes | List of option objects (max 5). |
| `options[].emoji` | string | No | Emoji shown as a reaction button. Defaults to `"❓"`. |
| `options[].label` | string | No | Human-readable label for the option. Defaults to `"Option"`. |
| `options[].value` | string | No | Value passed back to the agent when selected. Defaults to the label in lowercase. |
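
The defaults in the table can be illustrated with a small parser sketch. This is hypothetical code that applies the documented defaults, not MindRoom's implementation:

```python
import json

# Hypothetical sketch applying the documented defaults -- not MindRoom's code.
def parse_interactive(block: str) -> dict:
    data = json.loads(block)
    options = []
    for opt in data.get("options", [])[:5]:  # max 5 options, extras truncated
        label = opt.get("label", "Option")
        options.append({
            "emoji": opt.get("emoji", "❓"),
            "label": label,
            "value": opt.get("value", label.lower()),  # default: lowercased label
        })
    return {
        "question": data.get("question", "Please choose an option:"),
        "options": options,
    }

parsed = parse_interactive('{"options": [{"label": "Fast"}]}')
print(parsed["question"])             # Please choose an option:
print(parsed["options"][0]["value"])  # fast
```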

### Rendered Output

The JSON block is replaced with a formatted message:

```
What approach would you prefer?

1. 🚀 Fast and automated
2. 🔍 Careful and manual

React with an emoji or type the number to respond.
```

The corresponding emoji reactions are added to the message as clickable buttons.

## User Response Methods

Users can respond in two ways:

- **Reaction**: Click one of the emoji reactions added to the message.
- **Text**: Send a message with a single-digit option number (e.g., `1` or `2`) in the same thread. Only digits 1–5 are recognized; multi-digit numbers like `10` are ignored.

Both methods trigger the same follow-up behavior: the agent receives the selected value and continues the conversation.
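
The single-digit rule can be expressed as a one-line check (illustrative only):

```python
import re

# Illustrative check for the documented rule: only a lone digit 1-5 counts.
def is_option_reply(text: str) -> bool:
    return re.fullmatch(r"[1-5]", text.strip()) is not None

print(is_option_reply("2"))   # True
print(is_option_reply("10"))  # False: multi-digit numbers are ignored
print(is_option_reply("7"))   # False: only 1-5 are recognized
```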

## Agent Integration

Agents don't need any special tools or configuration to use interactive questions.
Any agent can include an `interactive` code block in its response text.
You can guide agents to use this feature through their `instructions` or `role`:

```
agents:
  assistant:
    display_name: Assistant
    role: A helpful assistant
    instructions:
      - >
        When the user needs to choose between options, present them using
        an interactive code block with JSON containing question and options
        (each with emoji, label, and value fields).
```

## Limitations

- Maximum of **5 options** per question. Additional options are silently truncated.
- Only **one active question per message**. If a response contains multiple interactive blocks, only the first is processed.
- Questions are tracked **in memory** and do not persist across restarts.
- Only human users can respond; reactions from other agents are ignored.
- Only the agent that created the question processes reactions to it.

# Streaming Responses

MindRoom streams agent responses to Matrix by progressively editing a single message. Instead of waiting for the full response, users see text appear in real time as the model generates it.

## How It Works

1. Agent starts generating a response.
1. MindRoom sends an initial message with the first chunk of text plus an in-progress marker (`⋯`).
1. As more text arrives, MindRoom edits the same message with the accumulated content.
1. When the response is complete, the final edit removes the `⋯` marker.

```
User sends message
       │
       ▼
┌──────────────┐     presence check
│ Agent starts │ ──────────────────▶ Is user online?
│ generating   │                          │
└──────┬───────┘                    ┌─────┴─────┐
       │                           Yes          No
       ▼                            │            │
  Stream chunks                     ▼            ▼
  via edits                    Streaming     Single message
  with ⋯ marker               (progressive   (sent when
       │                       edits)          complete)
       ▼
  Final edit
  (⋯ removed)
```

## Configuration

Streaming is enabled by default. Disable it globally in `config.yaml`:

```
defaults:
  enable_streaming: false   # Default: true
```

`enable_streaming` is a global-only setting under `defaults` and cannot be overridden per agent.

Tune the streaming edit cadence globally under `defaults.streaming`:

```
defaults:
  enable_streaming: true
  streaming:
    update_interval: 5.0         # Default: 5.0 steady-state seconds between edits
    min_update_interval: 0.5     # Default: 0.5 fast-start seconds between early edits
    interval_ramp_seconds: 15.0  # Default: 15.0; set 0 to disable ramping
    max_idle: 2.0                # Default: 2.0 event-driven idle ceiling before the next edit
```

These timing settings are global-only. Agents inherit them from `defaults` and cannot override them individually.

## Presence-Based Streaming

Even when streaming is enabled, MindRoom only streams to users who are currently online. This is checked via `should_use_streaming()` which queries the Matrix presence API.

| Presence State | Streaming Used?                        |
| -------------- | -------------------------------------- |
| `online`       | Yes                                    |
| `unavailable`  | Yes                                    |
| `offline`      | No (single message sent when complete) |

If the presence check fails, MindRoom defaults to non-streaming (safer, fewer API calls). When no requester user ID is available, MindRoom defaults to streaming.

## In-Progress Marker

While a response is being generated, the message ends with `⋯` followed by zero to two dots that cycle as edits arrive. This gives users a visual indicator that the agent is still working.

```
Hello! I can help you with that ⋯
Hello! I can help you with that ⋯.
Hello! I can help you with that ⋯..
Hello! I can help you with that ⋯
```

If no text has arrived yet, a `Thinking...` placeholder is shown with the marker. The marker is removed on the final edit.
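
The cycling marker could be produced like this (an illustrative sketch of the pattern shown above):

```python
# Illustrative: the marker cycles through zero to two trailing dots per edit.
def marker(edit_count: int) -> str:
    return "⋯" + "." * (edit_count % 3)

for i in range(4):
    print(f"Hello! I can help you with that {marker(i)}")
```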

## Throttling

MindRoom throttles edits to avoid overwhelming the Matrix homeserver:

- **Time-based**: `defaults.streaming.update_interval` sets the steady-state interval between edits (default: 5 seconds).
- **Character-based**: An edit is also triggered when enough new characters have accumulated. The character threshold ramps from 48 characters (fast start) to 240 characters (steady-state) over the ramp-up period.
- **Ramp-up**: `defaults.streaming.min_update_interval` and `defaults.streaming.interval_ramp_seconds` control how quickly the time-based interval ramps from a fast start to the steady-state value. By default it ramps from 0.5s to 5s over 15 seconds. Setting `interval_ramp_seconds: 0` disables the ramp and uses the steady-state interval immediately.
- **Shared ramp window**: The same ramp window also controls the built-in character threshold ramp from 48 characters (fast start) to 240 characters (steady-state).
- **Minimum interval**: A hard floor (0.35s) applies to character-triggered and idle-triggered throttled edits. Time-triggered edits still follow the current ramped interval.
- **Idle flush**: `defaults.streaming.max_idle` triggers an edit on the next streaming event after 2.0s without a new delta, but only once the minimum character-update interval (the 0.35s floor above) has also elapsed. This is event-driven and does not run on a background timer.
- **Tool-start boundary refresh**: Visible tool-start markers request an immediate refresh so the marker can surface without waiting for later text. Rapid back-to-back tool starts are coalesced by the single delivery owner instead of forcing one Matrix edit per tool.
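
The ramp described above can be sketched with simple linear interpolation. This is a hypothetical model of the documented behavior, not MindRoom's code:

```python
# Hypothetical sketch of the documented ramp: the edit interval grows linearly
# from min_update_interval to update_interval over interval_ramp_seconds, and
# the character threshold ramps from 48 to 240 over the same window.
def ramped(elapsed: float, start: float, end: float, ramp: float) -> float:
    if ramp <= 0:
        return end  # ramp disabled: use the steady-state value immediately
    frac = min(elapsed / ramp, 1.0)
    return start + (end - start) * frac

# Defaults: 0.5s -> 5.0s interval and 48 -> 240 chars, both over 15 seconds.
print(ramped(0.0, 0.5, 5.0, 15.0))   # 0.5  (fast start)
print(ramped(7.5, 0.5, 5.0, 15.0))   # 2.75 (halfway through the ramp)
print(ramped(30.0, 0.5, 5.0, 15.0))  # 5.0  (steady state)
print(ramped(7.5, 48, 240, 15.0))    # 144.0 (character threshold, halfway)
```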

## Tool Calls During Streaming

When an agent calls tools during a streamed response, MindRoom shows inline markers in the message text:

```
🔧 `web_search` [1] ⏳       ← tool call started (pending)
🔧 `web_search` [1]          ← tool call completed
```

The number in brackets (`[N]`) is a 1-indexed counter per message. Each marker maps to `io.mindroom.tool_trace.events[N-1]` in the message metadata.

When `show_tool_calls` is disabled for an entity, tool markers are omitted from the message text and tool-trace metadata is not attached. If a routed tool needs an isolated worker, streaming may still show a generic worker warmup line such as `Preparing isolated worker...`. That hidden-tool warmup copy never includes tool names or tool-trace metadata.

## Cancellation and Errors

Users can cancel an in-progress response by reacting with 🛑 on the message being generated (see [Stop Button](https://docs.mindroom.chat/chat-commands/#stop-button)). An explicit user stop finalizes the streamed message with:

```
<partial text so far>

**[Response cancelled by user]**
```

Other non-error interruptions finalize the streamed message with one of these notes:

```
<partial text so far>

**[Response interrupted]**
```

```
<partial text so far>

**[Response interrupted by service restart]**
```

If an error occurs during streaming, the message is finalized with:

```
<partial text so far>

**[Response interrupted by an error: <error description>]**
```

## Large Streamed Messages

If a streamed response exceeds the Matrix event size limit (55KB for new messages, 27KB for edits), the large message system automatically uploads a JSON sidecar and includes a preview in the event body. See [Matrix Integration — Large Messages](https://docs.mindroom.chat/architecture/matrix/#large-messages) for details.

## Visibility Toggles

Two global defaults control what users see during streaming:

```
defaults:
  show_tool_calls: true     # Default: true — show inline tool markers and tool-trace metadata
  show_stop_button: true    # Default: true — add 🛑 reaction for cancellation
```

When `show_tool_calls` is `false`, inline tool markers (`🔧 tool_name [N]`) are omitted from the message text and `io.mindroom.tool_trace` metadata is not attached. The agent still shows typing activity during hidden tool calls. If a routed tool needs an isolated worker, users may still see generic worker progress copy such as `Preparing isolated worker...` or `Preparing isolated worker... 17s elapsed.`. Hidden-tool mode never includes tool identifiers or tool-trace metadata in that worker progress text. `show_tool_calls` can also be overridden per agent in the agent config.

When `show_stop_button` is `false`, the 🛑 reaction is not added to in-progress messages. Streaming itself still works — only the cancellation affordance is removed. `show_stop_button` is a global-only setting under `defaults`.

`enable_streaming` is also global-only and cannot be overridden per agent.

## Room Mode

When an agent operates in `thread_mode: room` (see [Thread Mode Resolution](https://docs.mindroom.chat/configuration/agents/#thread-mode-resolution)), streaming skips all thread relations and sends plain room messages. This is used for bridges and mobile clients that don't support Matrix threads.

## Replacement Streaming

MindRoom also supports a `ReplacementStreamingResponse` variant where each chunk replaces the entire message content instead of appending to it. This is used for structured live rendering where the full document is rebuilt on each tick.

# Scheduling

Schedule agents to perform tasks at specific times or intervals using natural language. Tasks run in the thread where they were created.

## Commands

### Schedule a Task

```
!schedule <natural-language-request>
```

**One-Time Tasks:**

```
!schedule in 5 minutes Check the deployment
!schedule tomorrow at 3pm Send the weekly report
```

**Recurring Tasks:**

```
!schedule Every hour, @shell check server status
!schedule Daily at 9am, @finance market report
!schedule Weekly on Friday, @analyst prepare weekly summary
```

**Conditional Workflows (polling-based):**

Conditional or event-like requests are converted to recurring cron-based polling schedules. The AI picks an appropriate polling frequency based on urgency, and the condition is embedded in the task message so the agent checks it on each poll cycle. These are **not** real event subscriptions — they are periodic checks.

```
!schedule If I get an email about "urgent", @phone_agent call me
!schedule When Bitcoin drops below $40k, @crypto_agent notify me
```

### Edit a Schedule

```
!edit_schedule <task-id> <new-task-description>
```

Edits an existing scheduled task by ID. The task description is re-parsed to update timing and content.

### List and Cancel Schedules

```
!list_schedules                  # Show pending tasks
!cancel_schedule <task-id>       # Cancel specific task
!cancel_schedule all             # Cancel all tasks in room
```

Aliases: `!listschedules`, `!list-schedules`, `!list_schedule`, `!listschedule`, `!list-schedule`, `!inspect_schedules`, `!inspectschedules`, `!inspect-schedules`, `!inspect_schedule`, `!inspectschedule`, `!inspect-schedule`, `!cancelschedule`, `!cancel-schedule`, `!editschedule`, `!edit-schedule`

Use `!help schedule` for detailed inline help on scheduling commands.

## Agent Mentions

Include `@agent_name` in your schedule to have specific agents respond. The scheduler validates that mentioned agents are available in the room before creating the task.

## Timezone

Schedules use the timezone from `config.yaml` (defaults to UTC):

```
timezone: America/Los_Angeles
```
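
Timezone values are IANA names. You can sanity-check one with the Python standard library before putting it in `config.yaml` (illustrative; requires system tzdata):

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError

# Illustrative: validate an IANA timezone name before using it in config.yaml.
def is_valid_timezone(name: str) -> bool:
    try:
        ZoneInfo(name)
        return True
    except (ZoneInfoNotFoundError, ValueError):
        return False

print(is_valid_timezone("America/Los_Angeles"))  # True
print(is_valid_timezone("Pacific/Nowhere"))      # False
```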

## Limitations

- **Schedule type cannot be changed** — editing a one-time task to be recurring (or vice versa) is not supported. Cancel the existing task and create a new one instead.
- **Conditional workflows are polling** — event-like schedules (`If ...`, `When ...`) are converted to recurring cron polls, not real event subscriptions.

## Persistence

Schedules are stored in Matrix room state and persist across restarts:

- New schedules use the live runtime to start their in-memory runners immediately.
- Edits are state-only Matrix writes; running tasks pick up edited state on their next poll rather than relying on caller-supplied caches or restart hooks.
- Past one-time tasks are automatically skipped during restoration.
- Only the router restores persisted schedules after startup; individual agents do not restore their own.
- On shutdown, the router cancels its in-memory scheduled tasks before exiting.

# Authorization

MindRoom controls which Matrix users can interact with agents.

Room access (joinability/discoverability) is configured separately through `matrix_room_access`.

## Configuration

Configure authorization in `config.yaml`:

```
authorization:
  # Users with access to all rooms
  global_users:
    - "@admin:example.com"
    - "@developer:example.com"

  # Room-specific permissions (room ID, full alias, or managed room key)
  room_permissions:
    "!abc123:example.com":
      - "@user1:example.com"
      - "@user2:example.com"
    "#lobby:example.com":
      - "@user3:example.com"
    "ops":
      - "@user4:example.com"

  # Default for rooms not in room_permissions
  default_room_access: false

  # Optional: per-agent/team/router reply allowlists
  # Keys must match an agent name, team name, "router", or "*"
  # Values are canonical Matrix user IDs or glob patterns (aliases are resolved)
  # Examples: "*:example.com", "@admin:*", "*"
  agent_reply_permissions:
    "*":
      - "@admin:example.com"
    code:
      - "@admin:example.com"
    research:
      - "@developer:example.com"
    router:
      - "*"

# Optional: configure the internal MindRoom user identity (omit for hosted/public profiles)
mindroom_user:
  username: mindroom_user          # Set before first startup (cannot be changed later)
  display_name: MindRoomUser

# Optional: room onboarding/discoverability policy
matrix_room_access:
  mode: single_user_private        # default
  multi_user_join_rule: public     # public or knock (multi_user only)
  publish_to_room_directory: false # publish managed rooms to public directory
  invite_only_rooms: []            # room keys/aliases/IDs that stay restricted
  reconcile_existing_rooms: false  # migrate existing managed rooms when true
```

**Defaults** (when `authorization` block is omitted):

- `global_users: []`
- `room_permissions: {}`
- `default_room_access: false`
- `agent_reply_permissions: {}`

This means only MindRoom system users (agents, teams, router, and the configured internal user if present) can interact with agents by default.

`mindroom_user.username` is a one-time setting used to create the internal Matrix account. After the account exists, keep the same username and only change `mindroom_user.display_name` for visible name changes.

For `authorization.room_permissions`, MindRoom accepts these key formats:

- Room ID: `!roomid:example.com`
- Full room alias: `#alias:example.com`
- Managed room key: `alias` (the configured room name/key used by MindRoom)

## Matrix Room Onboarding for OIDC Users

When users authenticate through Synapse OIDC, they are regular Matrix users. To let them join managed MindRoom rooms by alias without manual invites:

1. Set `matrix_room_access.mode: multi_user`.
1. Set `multi_user_join_rule` to `public` (direct join) or `knock` (request access).
1. Set `publish_to_room_directory: true` if rooms should appear in Explore/public room directory.

If you keep `mode: single_user_private` (default), managed rooms remain invite-only and private in the directory.

### Required Service Account Permissions

MindRoom applies room join rules and directory visibility using its managing account (typically the router account, e.g. `@mindroom_router:<domain>`).

- The managing account must be joined to the room.
- The managing account must have enough power to send `m.room.join_rules`.
- To publish to the room directory, Synapse requires moderator/admin-level power in that room.

If permissions are insufficient, MindRoom logs actionable warnings including the Matrix API error and required permission hint.

## Migration Guide (Existing Deployments)

Use this opt-in migration flow to move existing managed rooms to multi-user onboarding safely:

1. Update config:
   - `matrix_room_access.mode: multi_user`
   - choose `multi_user_join_rule`
   - set `publish_to_room_directory` as needed
   - optionally list restricted rooms in `invite_only_rooms`
1. Enable reconciliation once: `matrix_room_access.reconcile_existing_rooms: true`
1. Restart MindRoom and verify logs for each managed room.
1. After migration is complete, set `reconcile_existing_rooms: false` again (recommended steady state).

Only managed rooms (rooms configured through MindRoom agents/teams) are reconciled.

## Matrix ID Format

User IDs follow the Matrix format: `@localpart:homeserver.domain`

Examples: `@alice:matrix.org`, `@bob:example.com`, `@admin:company.internal`

## Authorization Flow

Authorization checks are performed in order:

1. **Internal system user** - When `mindroom_user` is configured, `@{mindroom_user.username}:{domain}` is always authorized. When omitted (hosted/public profiles), this check is skipped. Note: the same user ID from a different domain is NOT authorized.
1. **MindRoom agents/teams/router** - Configured agents, teams, and the router are authorized
1. **Alias resolution** - If the sender matches a bridge alias in `aliases`, it is resolved to the canonical user ID for the remaining checks
1. **Global users** - Users in `global_users` have access to all rooms
1. **Room permissions** - If any matching room identifier exists in `room_permissions` (room ID, full alias, or managed room key), user must be in that list (does NOT fall through to `default_room_access`)
1. **Default access** - Rooms not in `room_permissions` use `default_room_access`

> [!TIP] Set `default_room_access: false` and explicitly grant access via `global_users` or `room_permissions` for better security.
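
A simplified sketch of the ordered checks follows. It is hypothetical (it omits alias resolution and system users, and is not MindRoom's implementation), but it shows the key behavior: a room listed in `room_permissions` does not fall through to `default_room_access`:

```python
# Simplified sketch of the documented check order -- not MindRoom's code.
def is_authorized(user: str, room: str, auth: dict) -> bool:
    if user in auth.get("global_users", []):
        return True
    room_perms = auth.get("room_permissions", {})
    if room in room_perms:
        return user in room_perms[room]  # explicit list wins, no fall-through
    return auth.get("default_room_access", False)

auth = {
    "global_users": ["@admin:example.com"],
    "room_permissions": {"!abc123:example.com": ["@user1:example.com"]},
    "default_room_access": False,
}
print(is_authorized("@admin:example.com", "!other:example.com", auth))  # True
print(is_authorized("@user1:example.com", "!abc123:example.com", auth)) # True
print(is_authorized("@user2:example.com", "!abc123:example.com", auth)) # False
```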

## Bridge Aliases

When using Matrix bridges (e.g., mautrix-telegram, mautrix-signal), messages from the bridged platform arrive with a different Matrix user ID. Use `aliases` to map these bridge-created IDs to a canonical user so they inherit the same permissions:

```
authorization:
  global_users:
    - "@alice:example.com"
  room_permissions:
    "!room1:example.com":
      - "@bob:example.com"
  aliases:
    "@alice:example.com":
      - "@telegram_123:example.com"
      - "@signal_456:example.com"
    "@bob:example.com":
      - "@telegram_789:example.com"
```

In this example, messages from `@telegram_123:example.com` are treated as `@alice:example.com` (global access), and messages from `@telegram_789:example.com` are treated as `@bob:example.com` (access to `!room1:example.com` only).

## Per-Agent Reply Permissions

Use `authorization.agent_reply_permissions` to restrict which users each agent can reply to.

- The map key is an entity name: agent name, team name, `router`, or `*`.
- The `*` key is a default rule for entities that do not have an explicit entry.
- The value is a list of allowed Matrix user IDs.
- Values support glob-style matching (for example `*:example.com`).
- A `*` user entry means "allow any sender" for that specific entity.
- If an entity is not present in the map, it has no extra reply restriction.
- Alias mapping from `authorization.aliases` is applied before matching, so bridged IDs inherit canonical user permissions.
- Internal MindRoom identities (agents, teams, router, and the internal `mindroom_user`) always bypass reply permissions — they are system participants, not end users.
- `bot_accounts` are **not** exempt. Bridge bots listed in `bot_accounts` are still subject to reply permission checks.
- Keys that do not match any configured agent, team, `router`, or `*` are rejected at config load time.
- For voice messages, the permission check uses the original human sender, not the router that posted the transcription.

```
authorization:
  global_users:
    - "@alice:example.com"
    - "@bob:example.com"
  aliases:
    "@alice:example.com":
      - "@telegram_111:example.com"
  agent_reply_permissions:
    "*":
      - "@alice:example.com"
    code:
      - "@alice:example.com"
    research:
      - "@bob:example.com"
    router:
      - "*"
```

In this example, `*` restricts all entities to Alice by default, `research` overrides that and replies only to Bob, and `router` can reply to anyone.
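
The glob matching and `*` fallback described above can be sketched with the standard library's `fnmatch` (illustrative; MindRoom's matcher may differ in detail):

```python
from fnmatch import fnmatch

# Illustrative sketch of per-entity reply allowlists with glob patterns.
def may_reply(entity: str, sender: str, perms: dict) -> bool:
    rules = perms.get(entity, perms.get("*"))
    if rules is None:
        return True  # no entry and no "*" default: no extra restriction
    return any(fnmatch(sender, pattern) for pattern in rules)

perms = {
    "*": ["@alice:example.com"],
    "research": ["@bob:example.com"],
    "router": ["*"],
}
print(may_reply("code", "@alice:example.com", perms))      # True (via "*" rule)
print(may_reply("research", "@alice:example.com", perms))  # False (override)
print(may_reply("router", "@anyone:anywhere.org", perms))  # True
```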

## Bot Accounts

The `bot_accounts` field is a **top-level** config option (not under `authorization:`). It lists Matrix user IDs of non-MindRoom bots — such as bridge bots for Telegram, Slack, or other platforms — that should be treated like agents for response logic. Bots in this list won't trigger the multi-human-thread mention requirement.

```
# Top-level config, not under authorization:
bot_accounts:
  - "@telegram_bot:example.com"
  - "@slack_bot:example.com"
```

For more details on how `bot_accounts` affects routing behavior, see the [Router configuration](https://docs.mindroom.chat/configuration/router/index.md) page.

# Matrix Space

MindRoom can create and maintain a root Matrix Space that groups all managed rooms together. This makes it easy for users to discover and navigate MindRoom rooms in their Matrix client.

## Configuration

```
matrix_space:
  enabled: true    # Default: true
  name: MindRoom   # Default: "MindRoom"
```

| Field     | Type   | Default      | Description                                                                   |
| --------- | ------ | ------------ | ----------------------------------------------------------------------------- |
| `enabled` | bool   | `true`       | Whether to create and maintain a root Matrix Space for managed MindRoom rooms |
| `name`    | string | `"MindRoom"` | Display name for the root Matrix Space when enabled                           |

## Behavior

When `enabled` is `true`, MindRoom creates a Space on startup and adds all managed rooms as children. Rooms created later (by agents joining new rooms or config changes) are automatically added to the Space.

Set `enabled: false` to disable Space creation entirely. The `name` field controls the Space's display name and can be changed at any time.

# OpenAI-Compatible API

MindRoom exposes an OpenAI-compatible chat completions API so any chat frontend can use MindRoom agents as selectable "models". LibreChat, Open WebUI, LobeChat, ChatBox, BoltAI, and anything else that speaks the OpenAI protocol works out of the box.

## How It Works

The frontend calls `GET /v1/models` and sees your agents in the model picker. The user picks an agent and chats. The frontend sends standard OpenAI requests; MindRoom routes them to the selected agent with all its tools, instructions, and memory. The frontend doesn't know it's talking to an agent — it's transparent.

```
Chat Frontend (LibreChat, Open WebUI, etc.)
│
│  GET  /v1/models           → returns your agents as "models"
│  POST /v1/chat/completions → routes to the selected agent
│
└──→ MindRoom API ──→ ai_response() / stream_agent_response()
                         │
                         └──→ agents, tools, memory, knowledge bases
```

No Matrix auth dependency. You can run the OpenAI-compatible API standalone or alongside the Matrix bot.

## Setup

### 1. Set API keys

Add to your `.env`:

```
# Option A: Set API keys (recommended for production)
OPENAI_COMPAT_API_KEYS=sk-my-secret-key-1,sk-my-secret-key-2

# Option B: Allow unauthenticated access (local dev only)
OPENAI_COMPAT_ALLOW_UNAUTHENTICATED=true
```

Without either of these, the API returns 401 on all requests.

### 2. Start MindRoom

```
# Full MindRoom runtime (Matrix bot + API server + dashboard)
uv run mindroom run

# Or via just
just start-mindroom-dev
```

The API is available at `http://localhost:8765/v1/`.

> [!IMPORTANT] If the dashboard and `/v1/*` share a domain behind a reverse proxy, route `/v1/*` to the MindRoom runtime (in addition to `/api/*`). Otherwise OpenAI-compatible requests can be handled by the dashboard and fail.

### 3. Verify

```
# List available agents
curl -H "Authorization: Bearer sk-my-secret-key-1" \
  http://localhost:8765/v1/models

# Chat (non-streaming)
curl -H "Authorization: Bearer sk-my-secret-key-1" \
  -H "Content-Type: application/json" \
  -d '{"model":"general","messages":[{"role":"user","content":"Hello"}]}' \
  http://localhost:8765/v1/chat/completions

# Chat (streaming)
curl -N -H "Authorization: Bearer sk-my-secret-key-1" \
  -H "Content-Type: application/json" \
  -d '{"model":"general","messages":[{"role":"user","content":"Hello"}],"stream":true}' \
  http://localhost:8765/v1/chat/completions
```

## Client Configuration

### LibreChat

Add to your `librechat.yaml`:

```
endpoints:
  custom:
    - name: "MindRoom"
      apiKey: "${MINDROOM_API_KEY}"
      baseURL: "http://localhost:8765/v1"
      models:
        default: ["general"]
        fetch: true
      modelDisplayLabel: "MindRoom"
      titleConvo: true
      titleModel: "general"
      dropParams: ["stop", "frequency_penalty", "presence_penalty", "top_p"]
      headers:
        # Highest-priority session key used by MindRoom
        X-Session-Id: "{{LIBRECHAT_BODY_CONVERSATIONID}}"
        # Backward-compatible fallback used by MindRoom
        X-LibreChat-Conversation-Id: "{{LIBRECHAT_BODY_CONVERSATIONID}}"
```

`X-Session-Id` is recommended when you want deterministic MindRoom session continuity. This is especially important for tools that keep long-lived sessions inside the MindRoom runtime. `X-LibreChat-Conversation-Id` alone is still enough to keep continuity if you already use it.

### Open WebUI

1. Go to **Admin Settings > Connections > OpenAI > Manage**
1. Set API URL to `http://localhost:8765/v1`
1. Set API Key to one of your `OPENAI_COMPAT_API_KEYS`
1. Agents appear automatically in the model picker

### Any OpenAI-compatible client

Point the base URL at `http://localhost:8765/v1` and set the API key. MindRoom implements the OpenAI-compatible `GET /v1/models` and `POST /v1/chat/completions` endpoints.

## Features

### Model selection

Each agent in `config.yaml` appears as a selectable model. The model ID is the agent's internal name (e.g., `code`, `research`), and the display name comes from `display_name`. Only shared agents appear in `/v1/models`: an agent must be unscoped or explicitly configured with `worker_scope=shared`. Agents that use `agents.<name>.private` are not listed, because `private.per` creates requester-private instances with an isolating execution scope. Note that an OpenAI-compatible run can expose fewer tool functions than the same agent in Matrix, since `tool_approval` hides approval-gated functions from `/v1`.

### Auto-routing

Select the `auto` model to let MindRoom's router pick the best agent for each message, the same routing logic used in Matrix rooms. Once routing resolves a specific agent, session continuity and streamed identity bind to that resolved agent name, not the literal `auto` label.

### Teams

Teams are exposed as `team/<team_name>` models. Selecting `team/super_team` runs the full team collaboration or coordination workflow.

### Streaming

`stream: true` returns Server-Sent Events in the standard OpenAI format: role chunk, content chunks, finish chunk, `[DONE]`.

Tool calls appear inline as text in the stream (not as native OpenAI `tool_calls` deltas). MindRoom currently emits tool events in stream chunks as inline `<tool id="N" state="start|done">...</tool>` content.
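
Clients that want structured tool events can parse these inline tags out of the accumulated stream text, for example (illustrative, assuming the tag shape shown above):

```python
import re

# Illustrative parser for the inline tool tags shown above.
TOOL_TAG = re.compile(r'<tool id="(\d+)" state="(start|done)">(.*?)</tool>', re.S)

text = 'Looking that up... <tool id="1" state="start">web_search</tool> found it.'
for tool_id, state, body in TOOL_TAG.findall(text):
    print(tool_id, state, body)  # 1 start web_search
```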

### Multimodal messages

When a message's `content` is an array of content parts (the OpenAI multimodal format), MindRoom extracts only the `text` parts and concatenates them as the prompt. Non-text parts such as `image_url` are silently ignored by the current implementation. Agents still process the text normally with all their configured tools and instructions.
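The extraction behavior can be sketched as follows (the helper name is ours, and joining text parts with newlines is an assumption rather than confirmed behavior):

```python
def extract_prompt(content) -> str:
    """Keep only "text" parts and concatenate them as the prompt;
    non-text parts such as image_url are dropped, per the docs above."""
    if isinstance(content, str):
        return content
    return "\n".join(p["text"] for p in content if p.get("type") == "text")

parts = [
    {"type": "text", "text": "Describe this image:"},
    {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    {"type": "text", "text": "Focus on the colors."},
]
```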

### Session continuity

Session IDs are derived from request headers:

1. `X-Session-Id` header (explicit control)
1. `X-LibreChat-Conversation-Id` header (automatic with LibreChat)
1. Random UUID fallback

Agent memory and conversation history persist across requests with the same session ID. For persistent MindRoom tool sessions (for example a long-running coding session), prefer `X-Session-Id`.

Session IDs are namespaced internally with a hash of the API key to prevent cross-key session collision. Two different API keys using the same `X-Session-Id` value will not share a session.
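The namespacing idea can be illustrated like this (a sketch only; MindRoom's actual derivation may use a different hash or format):

```python
import hashlib

def namespaced_session(api_key: str, session_id: str) -> str:
    """Prefix the session ID with a digest of the API key so that two
    keys never collide on the same X-Session-Id value."""
    key_ns = hashlib.sha256(api_key.encode()).hexdigest()[:12]
    return f"{key_ns}:{session_id}"

a = namespaced_session("key-one", "build-42")
b = namespaced_session("key-two", "build-42")
assert a != b  # same X-Session-Id, different API keys -> distinct sessions
```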

### Claude Agent tool sessions

If an agent enables the `claude_agent` tool, the same `X-Session-Id` keeps the Claude session alive across turns. This lets a user continue one long coding flow instead of starting a fresh Claude process on every request. See the `claude_agent` section in [Agent Orchestration](https://docs.mindroom.chat/tools/agent-orchestration/index.md) for configuration details.

Parallel Claude sub-sessions are supported by using different `session_label` values in tool calls:

- Same `session_label`: one shared Claude session (serialized by a per-session lock)
- Different `session_label`: independent Claude sessions that can run concurrently

### Knowledge bases

Agents with configured `knowledge_bases` in `config.yaml` get RAG support automatically. No additional API configuration needed. For Git-backed knowledge bases, missing or stale published indexes schedule the same per-binding refresh flow used by the Matrix runtime. Explicit dashboard/API reindex runs Git sync first and then rebuilds a candidate index.

## What's ignored

The API accepts but ignores these OpenAI parameters (the agent's own config controls them):

- `temperature`, `top_p`, `max_tokens`, `max_completion_tokens`
- `tools`, `tool_choice` (agents use their configured tools)
- `n`, `stop`, `frequency_penalty`, `presence_penalty`, `seed`
- `response_format`, `logprobs`, `logit_bias`
- `stream_options` (usage stats are always zeros)

Client `system` / `developer` messages are prepended to the prompt; they augment the agent's built-in instructions rather than replacing them.

## Authentication

| `OPENAI_COMPAT_API_KEYS` | `OPENAI_COMPAT_ALLOW_UNAUTHENTICATED` | Behavior                                                          |
| ------------------------ | ------------------------------------- | ----------------------------------------------------------------- |
| Set                      | (any)                                 | Bearer token required, must match one of the comma-separated keys |
| Unset                    | `true`                                | No authentication required                                        |
| Unset                    | Unset/`false`                         | All requests return 401 (locked)                                  |
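The decision table above, expressed as code (a sketch; the real parsing inside MindRoom may differ in detail):

```python
def openai_compat_auth_mode(env: dict) -> str:
    """Resolve the /v1 auth behavior from the two environment variables."""
    if env.get("OPENAI_COMPAT_API_KEYS"):
        return "bearer-required"  # token must match one of the comma-separated keys
    if env.get("OPENAI_COMPAT_ALLOW_UNAUTHENTICATED", "").lower() == "true":
        return "open"             # no authentication required
    return "locked"               # all requests return 401
```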

The OpenAI-compatible API uses its own auth (`OPENAI_COMPAT_API_KEYS`), separate from the dashboard API auth. In standalone mode, the dashboard `/api/*` endpoints can be protected with `MINDROOM_API_KEY`; the browser dashboard uses a same-origin auth cookie, while CLI and curl clients can still send `Authorization: Bearer ...`. These are independent: `MINDROOM_API_KEY` secures the dashboard, while `OPENAI_COMPAT_API_KEYS` secures the `/v1/*` chat completions endpoints.

## Limitations

- **Token usage is always zeros** — Agno doesn't expose token counts
- **No native `tool_calls` format** — tool results appear inline in content text
- **`show_tool_calls` config is Matrix-only today** — OpenAI-compatible `/v1/chat/completions` currently includes tool-call text/events regardless of `show_tool_calls: false`
- **No room memory** — only agent-scoped memory (no `room_id` in API requests)
- **No requester-private instances** — `/v1` currently supports only shared agents that are unscoped or configured with `worker_scope=shared`, so `agents.<name>.private` and other isolating execution scopes are not available there
- **Tool approval is Matrix-only** — `/v1` hides tool functions matched by required-approval rules, including script-based rules, because approval cards need a live Matrix room, thread, and runtime process
- **Scheduler tool unavailable** — scheduling requires a Matrix context; without one, the tool returns an error message

# Architecture

MindRoom's architecture consists of several key components working together.

## Overview

```
┌─────────────────────────────────────────────────────────┐
│                   Matrix Homeserver                      │
│              (Synapse, Conduit, etc.)                    │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│              MultiAgentOrchestrator                      │
│  ┌─────────────────────────────────────────────────┐    │
│  │                   Matrix Client                  │    │
│  │         (nio, sync loops, presence)             │    │
│  └─────────────────────────────────────────────────┘    │
│                                                          │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐    │
│  │ Router  │  │ Agent 1 │  │ Agent 2 │  │  Team   │    │
│  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘    │
│       │            │            │            │          │
│  ┌────▼────────────▼────────────▼────────────▼────┐    │
│  │              Agno Runtime                       │    │
│  │         (LLM calls, tool execution)            │    │
│  └─────────────────────────────────────────────────┘    │
│                                                          │
│  ┌─────────────────────────────────────────────────┐    │
│  │                Memory System                     │    │
│  │  (Mem0 + ChromaDB, agent/team scopes)           │    │
│  └─────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────┘
```

## Components

- [Matrix Integration](https://docs.mindroom.chat/architecture/matrix/index.md) - How MindRoom connects to Matrix
- [Agent Orchestration](https://docs.mindroom.chat/architecture/orchestration/index.md) - How agents are managed

## Key Internal Modules

| Module                  | Purpose                                                                             |
| ----------------------- | ----------------------------------------------------------------------------------- |
| `orchestrator.py`       | MultiAgentOrchestrator — boots entities, manages sync loops, hot-reload             |
| `orchestration/`        | Extracted orchestrator helpers (sync loops, config diffing, room invitations)       |
| `runtime_state.py`      | Shared runtime readiness state for health/ready endpoints                           |
| `runtime_resolution.py` | Authoritative runtime resolution for agent materialization                          |
| `team_exact_members.py` | Runtime resolution for team member materialization                                  |
| `model_loading.py`      | Authoritative model instantiation and provider-specific loader selection            |
| `ai_runtime.py`         | Agent-run input preparation, queued-notice hooks, and inline-media fallback helpers |
| `agent_storage.py`      | Agent session and learning SQLite storage construction helpers                      |
| `agent_descriptions.py` | Shared agent description rendering for routing and delegation                       |
| `agent_policy.py`       | Derives canonical execution policies from authored agent config                     |
| `workspaces.py`         | Agent workspace scaffolding, template seeding, context file resolution              |
| `bot.py`                | AgentBot and TeamBot runtime for Matrix event handling                              |
| `routing.py`            | Intelligent agent selection when no agent is mentioned                              |
| `streaming.py`          | Response streaming via progressive message edits                                    |
| `media_inputs.py`       | Shared media-input container passed across bot, teams, and AI layers                |
| `media_fallback.py`     | Retries model requests without inline media when models reject media inputs         |
| `avatar_generation.py`  | Generates and manages avatar assets for agents, rooms, and spaces                   |
| `response_tracker.py`   | Duplicate response prevention                                                       |
| `topic_generator.py`    | AI-generated room topics                                                            |
| `background_tasks.py`   | Non-blocking async task management with GC protection                               |

## Data Flow

1. **Message arrives** from Matrix homeserver
1. **Router decides** which agent should handle it (if no explicit mention)
1. **Agent processes** the message using the Agno runtime
1. **Tools execute** as needed (file operations, API calls, etc.)
1. **Response sent** back to Matrix room
1. **Memory updates** asynchronously in background

# Matrix Integration

MindRoom uses the Matrix protocol for all agent communication. The integration is implemented in `src/mindroom/matrix/`.

## Why Matrix?

- **Federated** - Connect to any Matrix homeserver
- **Bridgeable** - Bridge to Discord, Slack, Telegram, and more
- **Open** - Open standard and open-source implementations
- **End-to-End Encryption** - Secure communication with encrypted room support

## Matrix Client

MindRoom uses `mindroom-nio` for Matrix communication with SSL context handling and encryption key storage.

### Environment Variables

| Variable             | Default                 | Description                              |
| -------------------- | ----------------------- | ---------------------------------------- |
| `MATRIX_HOMESERVER`  | `http://localhost:8008` | Matrix homeserver URL                    |
| `MATRIX_SERVER_NAME` | (from homeserver)       | Federation server name                   |
| `MATRIX_SSL_VERIFY`  | `true`                  | Set to `false` for dev/self-signed certs |

Streaming behavior is configured in `config.yaml` with `defaults.enable_streaming` (default: `true`).

## Agent Users

Each agent gets its own Matrix user with the `mindroom_` prefix:

```
@mindroom_assistant:example.com
@mindroom_router:example.com  (built-in routing agent)
```

Users are automatically created during orchestrator startup and credentials are persisted in `mindroom_data/matrix_state.yaml`.

## Room Management

Agents can join existing rooms, create new rooms with AI-generated topics, respond to invites automatically, leave unconfigured rooms, and set room avatars.

Rooms are auto-created via `_ensure_room_exists()` (private) and `ensure_all_rooms_exist()` (public). DM rooms can be detected with the async helper `is_dm_room(client, room_id) -> bool`.

## Threading (MSC3440)

MindRoom emits thread replies following [MSC3440](https://github.com/matrix-org/matrix-spec-proposals/blob/main/proposals/3440-threading-via-relations.md), using `m.relates_to` with `rel_type: m.thread`.

Explicit `m.thread` metadata remains the primary source of thread conversation context. For clients or bridges that send plain replies without thread metadata (`m.in_reply_to` but no `rel_type: m.thread`), MindRoom applies a transitive compatibility rule. If a reply chain eventually reaches explicit thread `T` or a proven thread root, MindRoom treats the new reply as part of `T`. Replies that never reach threaded context stay room-level.

### Resolution Rules

When deriving context for an incoming event, MindRoom:

1. Uses explicit `m.thread` relations as the primary inbound thread identity.
1. Lets plain replies inherit thread membership transitively when their reply chain reaches a threaded ancestor or proven thread root.
1. Lets edits, reactions, redactions, and other target-bound operations inherit the canonical thread membership of their target event.
1. May start a new thread under a room-root event when agent thread mode requires it.

```
User: @assistant help with this code
├── Assistant: I can help! Let me look at it...
├── User: It should return a list
└── Assistant: Here's the updated version...
```

Use `build_message_content()` from `message_builder.py` to construct thread-aware messages, and `EventInfo.from_event()` to analyze event relations (threads, edits, replies, reactions).

## Message Flow

### Sync Loop

Each agent bot runs its own sync loop with a 30-second long-polling timeout. Sync loops are wrapped with `sync_forever_with_restart()` for automatic restart on connection failures.

Events are processed in background tasks:

1. Sync receives event via long-polling
1. Event callback triggered (`_on_message`, `_on_invite`, etc.)
1. Background task created for async processing
1. Agent responds in thread

### Streaming Responses

Agents stream responses by progressively editing messages. Streaming is enabled only when the requesting user is online (checked via `should_use_streaming()`), saving API calls for offline users. See [Streaming Responses](https://docs.mindroom.chat/streaming/index.md) for the full feature documentation.

Tool call telemetry is emitted as plain inline markers and mirrored in `io.mindroom.tool_trace` metadata on the same message content.

Marker format:

```
🔧 `tool_name` [N] ⏳     ← pending
🔧 `tool_name` [N]        ← completed
```

Where `N` is 1-indexed per message and maps to `io.mindroom.tool_trace.events[N-1]`.
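The marker-to-trace mapping can be sketched like this (the helper name is hypothetical; the 1-indexed `[N]` convention is from the docs above):

```python
import re

def trace_event_for_marker(marker: str, trace: dict) -> dict:
    """Map an inline marker like '🔧 `fetch` [2]' to its
    io.mindroom.tool_trace entry."""
    n = int(re.search(r"\[(\d+)\]", marker).group(1))
    return trace["events"][n - 1]  # markers are 1-indexed

trace = {"events": [{"tool": "search"}, {"tool": "fetch"}]}
```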

## Presence

Agents set their Matrix presence with status messages containing model and role information (e.g., "🤖 Model: anthropic/claude-sonnet-4-6 | 💼 Code assistant | 🔧 5 tools available").

**Presence States:**

- **online** - Agent running and ready
- **unavailable** - Agent idle but connected (treated as online for streaming)
- **offline** - Agent stopped or disconnected

## Typing Indicators

Agents show typing indicators while processing via `typing_indicator()` context manager. The indicator auto-refreshes at `min(timeout/2, 15)` seconds to remain visible during long operations.
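The refresh cadence matches the formula above (the helper name is ours):

```python
def refresh_interval(timeout: float) -> float:
    """Typing-indicator refresh cadence: half the timeout,
    capped at 15 seconds, so the indicator never expires mid-operation."""
    return min(timeout / 2, 15)
```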

## Mentions

Mentions are parsed via `format_message_with_mentions()` which handles multiple formats:

- `@calculator` - Short agent name
- `@mindroom_calculator` - Full username
- `@mindroom_calculator:localhost` - Full Matrix ID

Returns content with `m.mentions` and `formatted_body` containing clickable links.

## Large Messages

Messages exceeding the 64KB Matrix event limit are automatically handled by `prepare_large_message()`:

- Messages > 55,000 bytes and edits > 27,000 bytes use a fallback event
- Full original Matrix message content is uploaded as a JSON sidecar (`message-content.json`)
- Preview text included in message body (maximum that fits)
- Custom metadata dict `io.mindroom.long_text` contains `version: 2`, `encoding: "matrix_event_content_json"`, original and preview sizes, and a completeness flag
- Preview event is compact (for example no inline `io.mindroom.tool_trace`), while the sidecar preserves full content fidelity
- Encrypted rooms: sidecar JSON is encrypted before upload (`message-content.json.enc`)
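The size thresholds above reduce to a simple check (the function name is illustrative, not MindRoom's API):

```python
def uses_fallback_event(size_bytes: int, is_edit: bool) -> bool:
    """Messages over 55,000 bytes (27,000 for edits) get a fallback
    event plus a JSON sidecar upload, per the thresholds above."""
    limit = 27_000 if is_edit else 55_000
    return size_bytes > limit
```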

## Response Tracking

MindRoom prevents duplicate responses using a `ResponseTracker` that records which events have already been processed. When a sync reconnection or retry delivers the same event twice, the tracker suppresses the duplicate so only one agent response is sent per triggering message. Tracking state is persisted under `mindroom_data/tracking/` and survives restarts.
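The dedup idea reduces to a seen-set check; here is a minimal in-memory sketch (the real `ResponseTracker` persists its state and its method names may differ):

```python
class ResponseTrackerSketch:
    """In-memory sketch of duplicate-response suppression."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def should_respond(self, event_id: str) -> bool:
        """True only the first time an event ID is seen."""
        if event_id in self._seen:
            return False
        self._seen.add(event_id)
        return True
```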

## Room Cleanup

On startup, MindRoom detects orphaned bot memberships left over from a previous configuration. When an agent is removed from `config.yaml`, its Matrix bot account may still be a member of rooms it previously joined. The cleanup process leaves those rooms safely without ejecting currently configured entities from their required rooms. This runs automatically — no manual intervention is needed.

## Identity Management

The `MatrixID` class handles Matrix user ID parsing and agent identification:

```
mid = MatrixID.parse("@mindroom_assistant:example.com")
mid.username  # "mindroom_assistant"
mid.domain    # "example.com"
mid.full_id   # "@mindroom_assistant:example.com"

# Create from agent name
mid = MatrixID.from_agent("assistant", "example.com", runtime_paths)

# Extract agent name (returns "code" if configured, None otherwise)
agent_name = extract_agent_name("@mindroom_code:localhost", config, runtime_paths)
```

## Root Space

MindRoom can create and maintain a root Matrix Space that groups all managed rooms.

```
matrix_space:
  enabled: true        # Default: true
  name: MindRoom       # Display name for the Space
```

When enabled, `ensure_root_space()` creates the Space on first boot (or resolves an existing one by alias), links all managed rooms as children, and sets the Space avatar from workspace or bundled assets. The Space name is reconciled on each startup to match the configured value.

## Delivery Policy

Outgoing encrypted Matrix sends keep nio's device-trust checks enabled by default.

```
matrix_delivery:
  ignore_unverified_devices: false
```

Operators can set `matrix_delivery.ignore_unverified_devices` to `true` when bot delivery should proceed even if encrypted rooms contain unverified devices. This is a security tradeoff because Matrix may encrypt outgoing events for devices the bot has not verified.

## Configuration

Matrix settings are derived from `config.yaml`:

```
agents:
  assistant:
    rooms: [lobby, dev]  # Room aliases (auto-created if needed)

teams:
  research_team:
    rooms: [research]
```

Room aliases are resolved to room IDs automatically. Full room IDs (starting with `!`) are also supported.

When a room doesn't exist, it's created with an AI-generated topic, power users are invited, and managed avatars are resolved from workspace overrides or bundled defaults if available.

# Agent Orchestration

The `MultiAgentOrchestrator` (in `src/mindroom/orchestrator.py`) manages the lifecycle of all agents, teams, and the router.

## Boot Sequence

```
main() entry
       │
       ▼
┌──────────────────┐
│ Sync Provider    │
│ Credentials      │
│ (.env/bootstrap  │
│ env → shared     │
│ credentials)     │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Initialize()    │
│ ─────────────────│
│ 1. Parse config  │
│    (Pydantic)    │
│ 2. Load plugins  │
│ 3. Create "user" │
│    Matrix account│
│    (mindroom_user)│
│ 4. Create bots   │
│    for entities  │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│    Start()       │
│ ─────────────────│
│ 1. try_start()   │
│    each bot      │
│ 2. Setup rooms   │
│    & memberships │
│ 3. Create sync   │
│    tasks         │
└────────┬─────────┘
         │
         ▼
┌──────────────────────────────────────┐
│  Auxiliary Tasks (auto-restart)      │
│ ─────────────────────────────────────│
│ • config watcher (file polling)      │
│ • skills watcher (skill cache)       │
│ • API server (if enabled)            │
│  (each wrapped in                    │
│   _run_auxiliary_task_forever)        │
└───────────────┬──────────────────────┘
                │
                ▼
┌──────────────────────────────────────┐
│  Bot Sync Tasks (asyncio.gather)     │
│ ─────────────────────────────────────│
│ • One sync loop per bot              │
│ • sync_forever_with_restart()        │
│ • Awaited until shutdown             │
└──────────────────────────────────────┘
```

**Key details:**

- **Entity order**: Router first, then agents, then teams
- **Room setup** (`_setup_rooms_and_memberships`): Router creates rooms, invites agents/users, bots join
- **Sync loops**: Each bot runs `sync_forever_with_restart()` with automatic retry
- **Internal user identity**: `mindroom_user.username` is bootstrap-only; only `display_name` should change later

## Hot Reload

Config changes are detected via polling (`watch_file()` checks `st_mtime` every second):

1. On change, `update_config()` is called
1. `_identify_entities_to_restart()` computes diff using `model_dump(exclude_none=True)`
1. Affected entities are stopped, recreated, and restarted
1. Removed entities run `cleanup()` (leave rooms, stop bot)
1. New/restarted bots go through room setup

Skills are watched separately via `_watch_skills_task()` with cache invalidation.
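The polling step can be sketched as follows (a minimal sketch; the real `watch_file()` in the MindRoom source may differ in detail):

```python
import asyncio
from pathlib import Path

async def watch_file_sketch(path: Path, on_change, interval: float = 1.0) -> None:
    """Poll st_mtime on an interval and invoke a callback on change."""
    last = path.stat().st_mtime
    while True:
        await asyncio.sleep(interval)
        mtime = path.stat().st_mtime
        if mtime != last:
            last = mtime
            on_change()  # in MindRoom this triggers update_config()
```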

## Orchestration Subpackage

The `src/mindroom/orchestration/` subpackage contains helpers extracted from the monolithic orchestrator:

- **`runtime.py`** — Sync loop helpers: `sync_forever_with_restart()` with linear backoff (capped at 60s), `cancel_task()`, and `create_logged_task()` for safe asyncio task creation.
- **`config_updates.py`** — Config diffing and reload planning: `build_config_update_plan()` computes a `ConfigUpdatePlan` by calling `_identify_entities_to_restart()`, which diffs old and new configs using `model_dump(exclude_none=True)`.
- **`rooms.py`** — Room invitation helpers: `get_authorized_user_ids_to_invite()` and `get_root_space_user_ids_to_invite()` compute which users should be invited to managed rooms and the root Matrix space.

### Runtime Resolution

Agent and team materialization is handled by dedicated top-level modules (not inside the `orchestration/` subpackage):

- **`src/mindroom/runtime_resolution.py`** — Resolves `ResolvedAgentRuntime` (the full set of runtime parameters for one agent instance) including `ResolvedKnowledgeBinding` for knowledge base attachment.
- **`src/mindroom/team_exact_members.py`** — Resolves `ResolvedExactTeamMembers` for team materialization via `materialize_exact_requested_team_members()`.
- **`src/mindroom/agent_policy.py`** — Resolves canonical execution policies and private-team eligibility derived from authored agent config.
- **`src/mindroom/model_loading.py`** — Owns `get_model_instance()` and provider-specific model loader selection.
- **`src/mindroom/ai_runtime.py`** — Owns agent-run input copying, queued-notice hooks, and inline-media fallback helpers used during execution.
- **`src/mindroom/agent_storage.py`** — Owns agent session and learning SQLite storage construction helpers.
- **`src/mindroom/agent_descriptions.py`** — Owns shared agent description rendering used by routing and delegation.
- **`src/mindroom/runtime_state.py`** — Shared runtime readiness state with `set_runtime_starting()`, `set_runtime_ready()`, and `set_runtime_failed()` used by health endpoints.

## Message Handling

Event callbacks are wrapped in `_create_task_wrapper()` to run as background tasks, ensuring the sync loop is never blocked.

**`_on_message` flow:**

1. Skip own messages (except voice transcriptions from router)
1. Check sender authorization and handle edits
1. Check if already responded (`ResponseTracker`)
1. Router handles commands exclusively
1. Extract message context (mentions, thread history, non-agent mention detection)
1. Skip messages from other agents (unless mentioned)
1. Router performs AI routing when no agent mentioned and thread doesn't have multiple human participants
1. Check for team formation or individual response
1. Generate response and store memory

**Message edits**: When a user edits a message that already received an agent response, the agent regenerates its response for the updated content. The agent edits its own previous reply in place rather than sending a new message. Edits from other agents are ignored, and the feature requires that the original response event ID is tracked by the `ResponseTracker`.

**`_on_media_message`**: Handles media events (images, videos, files, and audio). Downloads and decrypts media data, then processes it through the agent. When no agent is mentioned, AI routing is used to select the appropriate agent, similar to text messages.

**`_on_reaction`**: Handles `ReactionEvent` for the interactive Q&A system (e.g., confirming or rejecting agent suggestions) and config confirmation workflows.

**Routing** (when no agent mentioned): Router uses `suggest_agent_for_message()` to pick the best agent based on room configuration and message content. Only routes when multiple agents are available. In threads where multiple non-agent users have posted, routing is skipped entirely — an explicit `@mention` is required. Non-MindRoom bots listed in `bot_accounts` are excluded from this detection.

## Concurrency

- Each bot runs its own sync loop via `sync_forever_with_restart()`
- Sync loop failures trigger automatic restart with linear backoff (5s, 10s, 15s, ... up to 60s max)
- Event callbacks run as background tasks (never block the sync loop)
- `ResponseTracker` prevents duplicate replies
- `StopManager` handles cancellation of in-progress responses
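The restart backoff schedule reduces to a one-liner (the function name is illustrative):

```python
def restart_delay(attempt: int, step: float = 5.0, cap: float = 60.0) -> float:
    """Linear backoff for sync-loop restarts: 5s, 10s, 15s, ...
    capped at 60s, per the schedule above."""
    return min(step * attempt, cap)

delays = [restart_delay(n) for n in range(1, 14)]
```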

### Graceful Shutdown

On `orchestrator.stop()`:

1. Set `self.running = False`
1. Cancel config reload task
1. Stop memory auto-flush worker
1. Shut down the per-binding knowledge refresh scheduler
1. Cancel pending bot start tasks
1. Stop the MCP manager
1. Cancel all sync tasks
1. Signal all bots to stop (`bot.running = False`)
1. Call `bot.stop()` for each bot concurrently (waits 5s for background tasks, cancels scheduled tasks, closes Matrix client)

# Deployment

MindRoom can be deployed in various ways depending on your needs.

## Deployment Options

| Method                                                                                                | Best For                                                           |
| ----------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------ |
| [Hosted Matrix + local MindRoom](https://docs.mindroom.chat/deployment/hosted-matrix/index.md)        | Simplest setup: run only `uvx mindroom run` locally                |
| Full Stack (Docker Compose)                                                                           | All-in-one: bundled dashboard + Matrix (Tuwunel) + MindRoom client |
| [Docker (single container)](https://docs.mindroom.chat/deployment/docker/index.md)                    | Single MindRoom runtime or when you already have Matrix            |
| [Kubernetes](https://docs.mindroom.chat/deployment/kubernetes/index.md)                               | Multi-tenant SaaS, production                                      |
| [Trusted upstream browser auth](https://docs.mindroom.chat/deployment/trusted-upstream-auth/index.md) | Hosted private agents behind an authenticated access layer         |
| Direct                                                                                                | Development, simple setups                                         |

## Bridges

Connect external messaging platforms to Matrix:

- [Bridges overview](https://docs.mindroom.chat/deployment/bridges/index.md) - available bridges and how they work
- [Telegram bridge](https://docs.mindroom.chat/deployment/bridges/telegram/index.md) - bridge Telegram chats via mautrix-telegram

## Google Services (Gmail/Calendar/Drive/Sheets)

Use these guides if you want users to connect Google accounts in the MindRoom frontend:

- [Google Services OAuth (Admin Setup)](https://docs.mindroom.chat/deployment/google-services-oauth/index.md) - one-time setup for shared/team deployments
- [Google Services OAuth (Individual Setup)](https://docs.mindroom.chat/deployment/google-services-user-oauth/index.md) - single-user bring-your-own OAuth app setup

For private personal-agent tools, use the generic [OAuth Framework](https://docs.mindroom.chat/oauth-framework/index.md) and the Google Drive section in the individual setup guide. For hosted multi-user private agents, also configure [Trusted Upstream Browser Auth](https://docs.mindroom.chat/deployment/trusted-upstream-auth/index.md) so agent-issued OAuth links authenticate as the requester that triggered them.

## Quick Start

### Hosted Matrix + local MindRoom (simplest)

```
# Creates ~/.mindroom/config.yaml and ~/.mindroom/.env by default
uvx mindroom config init --profile public
$EDITOR ~/.mindroom/.env
uvx mindroom connect --pair-code ABCD-EFGH
uvx mindroom run
```

Generate the pair code at `https://chat.mindroom.chat` under `Settings -> Local MindRoom`.

See [Hosted Matrix deployment](https://docs.mindroom.chat/deployment/hosted-matrix/index.md) for the full walkthrough.

### Full Stack (recommended)

```
git clone https://github.com/mindroom-ai/mindroom-stack
cd mindroom-stack
cp .env.example .env
$EDITOR .env  # add at least one AI provider key

docker compose up -d
```

The stack exposes MindRoom at `http://localhost:8765`, the MindRoom client at `http://localhost:8080`, and Matrix at `http://localhost:8008`. The stack uses published `mindroom`, `mindroom-cinny`, and `mindroom-tuwunel` images by default. If you access it from another device, set `CLIENT_HOMESERVER_URL=http://<host-ip>:8008` in `.env` before starting it.

### Direct (Development)

```
mindroom run --storage-path ./mindroom_data
```

The config file path can be set via `MINDROOM_CONFIG_PATH`; otherwise it defaults to `./config.yaml`, then `~/.mindroom/config.yaml`.
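The lookup order reduces to a small resolver (a sketch; the real resolver may differ in detail):

```python
import os
from pathlib import Path

def resolve_config_path() -> Path:
    """MINDROOM_CONFIG_PATH, then ./config.yaml, then ~/.mindroom/config.yaml."""
    if override := os.environ.get("MINDROOM_CONFIG_PATH"):
        return Path(override)
    local = Path("./config.yaml")
    if local.exists():
        return local
    return Path.home() / ".mindroom" / "config.yaml"
```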

If you want local Matrix + Cinny with a host-installed MindRoom runtime (Linux/macOS), use:

```
mindroom local-stack-setup --synapse-dir /path/to/mindroom-stack/local/matrix
mindroom run --storage-path ./mindroom_data
```

### Docker (single container)

```
docker run -d \
  --name mindroom \
  -p 8765:8765 \
  -v ./config.yaml:/app/config.yaml:ro \
  -v ./mindroom_data:/app/mindroom_data \
  --env-file .env \
  ghcr.io/mindroom-ai/mindroom:latest
```

See the [Docker deployment guide](https://docs.mindroom.chat/deployment/docker/index.md) for the full single-container setup.

### Kubernetes

See the [Kubernetes deployment guide](https://docs.mindroom.chat/deployment/kubernetes/index.md) for Helm chart configuration.

## Required Configuration

Full stack:

```
# .env in the full stack repo
OPENAI_API_KEY=sk-...
# Add other providers as needed
```

Direct and single-container deployments:

1. **Matrix homeserver** - Set `MATRIX_HOMESERVER` (must allow open registration for agent accounts)
1. **AI provider keys** - At least one of `OPENAI_API_KEY`, `OPENROUTER_API_KEY`, etc.
1. **Persistent storage** - Mount `mindroom_data/` to persist agent state (including `sessions/`, `learning/`, and memory data)

See the [Docker guide](https://docs.mindroom.chat/deployment/docker/#environment-variables) for the complete environment variable reference.

Hosted `mindroom.chat` deployments additionally use values from `mindroom connect` (`MINDROOM_LOCAL_CLIENT_ID`, `MINDROOM_LOCAL_CLIENT_SECRET`, and `MINDROOM_NAMESPACE`) to bootstrap agent registrations and avoid collisions on shared homeservers.

# Hosted Matrix + Local Backend

This guide covers the simplest production-like setup:

- Matrix homeserver is hosted at `https://mindroom.chat`
- Web chat runs at `https://chat.mindroom.chat`
- You run only `mindroom run` locally via `uvx`

## What Runs Where

| Component            | Runs on                          | Purpose                                 |
| -------------------- | -------------------------------- | --------------------------------------- |
| `chat.mindroom.chat` | Hosted web app                   | Login UI and pairing UI                 |
| `mindroom.chat`      | Hosted Matrix + provisioning API | Matrix transport + local onboarding API |
| `uvx mindroom run`   | Your machine/server              | Agent orchestration, tools, model calls |

## Prerequisites

- Python 3.12+
- `uv` installed
- A Matrix account that can sign in to `chat.mindroom.chat`
- At least one AI provider API key, or a local Codex CLI ChatGPT subscription login

## 1. Initialize Local Config

```
uvx mindroom config init --profile public
```

This creates `~/.mindroom/config.yaml` and `~/.mindroom/.env` with hosted defaults. Use `uvx mindroom config init --profile public-codex` if you want the starter config to use `provider: codex`.

## 2. Add AI Provider Key

Edit `~/.mindroom/.env` and set at least one provider key:

```
OPENAI_API_KEY=...
# or OPENROUTER_API_KEY=...
```

For Codex CLI subscription auth, run `codex login` instead of adding an API key. MindRoom reads `~/.codex/auth.json` by default.

## 3. Pair This Install

1. Open `https://chat.mindroom.chat`.
1. Go to `Settings -> Local MindRoom`.
1. Click `Generate Pair Code`.
1. Run locally:

```
uvx mindroom connect --pair-code ABCD-EFGH
```

Pair code behavior:

- Valid for 600 seconds (10 minutes).
- Only used to bootstrap local pairing.

After successful pairing, local provisioning credentials are written to `~/.mindroom/.env` by default unless you use `--no-persist-env`.
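For reference, the persisted entries look like this (the values shown are placeholders, not real credentials):

```
MINDROOM_LOCAL_CLIENT_ID=local-client-1a2b3c4d
MINDROOM_LOCAL_CLIENT_SECRET=replace-with-issued-secret
MINDROOM_NAMESPACE=alice
```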

## 4. Start MindRoom

```
uvx mindroom run
```

MindRoom then:

1. Connects to `MATRIX_HOMESERVER`
1. Creates/updates configured agent Matrix users
1. Joins/creates configured rooms
1. Starts processing messages

## Credential Model (Important)

`mindroom connect` returns local provisioning credentials:

- `MINDROOM_LOCAL_CLIENT_ID`
- `MINDROOM_LOCAL_CLIENT_SECRET`
- `MINDROOM_NAMESPACE`

`MINDROOM_LOCAL_CLIENT_ID` and `MINDROOM_LOCAL_CLIENT_SECRET` are **not Matrix user access tokens**. `MINDROOM_NAMESPACE` is appended to managed agent usernames and room aliases to avoid collisions on shared homeservers.

These credentials can only call provisioning-service endpoints that accept local client credentials (for example, agent registration flows). Revoke them from `Settings -> Local MindRoom` in the chat UI.

## Trust Model (Hosted Server vs Message Privacy)

For message *content*, this setup can be effectively zero-trust toward the homeserver operator when rooms are end-to-end encrypted.

- In E2EE rooms, the homeserver stores ciphertext and cannot read message bodies.
- The local `mindroom run` process holds your agent account keys and performs decryption locally.

Important limits:

- This does **not** hide metadata (room membership, timestamps, event IDs, sender IDs, traffic patterns).
- If a room is not encrypted, the homeserver can read plaintext.
- Any model/tool providers you send content to can still see the prompts/data you send to them.

The precise claim, then: end-to-end encrypted Matrix message content is protected from the hosted homeserver. That does not make every other part of the system invisible.

## If You Self-Host Later

You can keep the same local flow and switch endpoints:

- `MATRIX_HOMESERVER=https://your-matrix.example.com`
- `MINDROOM_PROVISIONING_URL=https://your-matrix.example.com` (or your dedicated provisioning host)

Then run `mindroom connect` again with a fresh pair code from your own UI.

# Bridges

MindRoom uses [mautrix](https://docs.mau.fi/bridges/) bridges to connect external messaging platforms to Matrix. Bridges run as appservices alongside Synapse, creating ghost users for external contacts and relaying messages bidirectionally.

## Available Bridges

| Bridge                                                                      | Platform  | Mode                       | Status    |
| --------------------------------------------------------------------------- | --------- | -------------------------- | --------- |
| [Telegram](https://docs.mindroom.chat/deployment/bridges/telegram/index.md) | Telegram  | Puppet (login as yourself) | Available |
| Slack                                                                       | Slack     | -                          | Planned   |
| Email                                                                       | IMAP/SMTP | -                          | Planned   |

## How Bridges Work

Each bridge registers as a Matrix [Application Service](https://spec.matrix.org/latest/application-service-api/) with Synapse. The bridge:

1. Creates ghost users on Matrix for external contacts
1. Creates Matrix rooms for external chats
1. Relays messages between the external platform and Matrix in real time

In **puppet mode**, you log into your real account on the external platform. Your messages appear as coming from you on both sides, not from a bot.

## Adding a New Bridge

1. Create a config directory: `telegram-bridge/`, `slack-bridge/`, etc.
1. Add the bridge service to `compose.yaml`
1. Generate a registration file and mount it into Synapse
1. Add the registration path to `homeserver.yaml` under `app_service_config_files`
1. Restart Synapse and start the bridge
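The compose and homeserver changes above can be sketched as follows for a hypothetical Slack bridge (the service name, image tag, and paths are illustrative, not prescribed):

```
# compose.yaml -- illustrative bridge service
services:
  slack-bridge:
    image: dock.mau.dev/mautrix/slack:latest
    restart: unless-stopped
    volumes:
      - ./slack-bridge:/data

# homeserver.yaml -- register the generated registration file with Synapse
app_service_config_files:
  - /bridges/slack-registration.yaml
```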

# Telegram Bridge

Bridge Telegram and Matrix using [mautrix-telegram](https://docs.mau.fi/bridges/python/telegram/) in **puppet mode**. Each user logs in with their own Telegram account, so messages appear as the real user on both sides.

## What Can You Do With This?

The bridge enables two main use cases:

1. **Talk to MindRoom agents from Telegram** -- Link a Telegram group to a Matrix room (like Lobby) so you can chat with AI agents directly from the Telegram app, without opening Element.
1. **Access Telegram chats from Matrix** -- Your existing Telegram conversations appear as Matrix rooms in Element, so you can use one client for everything.

Most users want use case 1. See [Bridging Matrix rooms to Telegram](#step-2-bridge-matrix-rooms-to-telegram) after setup.

## Architecture

```
Telegram Cloud <--> mautrix-telegram <--> Synapse <--> Element
                    (bridge bot)         (homeserver)   (client)
```

- **mautrix-telegram** runs locally and connects outbound to Telegram's API -- your Matrix server does NOT need to be publicly accessible
- Each Matrix user can log into their own Telegram account (puppeting)
- Messages flow bidirectionally in real time

## Prerequisites

### 1. Telegram API Credentials

1. Go to [my.telegram.org](https://my.telegram.org) and log in
1. Click "API development tools"
1. Create an app (title: "MindRoom Bridge", short name: "mindroom")
1. Note the **api_id** (numeric) and **api_hash** (string)

### 2. Telegram Bot

1. Message [@BotFather](https://t.me/BotFather) on Telegram
1. Send `/newbot`, choose a name and username
1. Note the **bot token** (format: `123456789:ABCdefGHI...`)

## Setup

### 1. Add credentials to config

Edit `telegram-bridge/config.yaml` and replace the placeholders in the `telegram:` section:

```
telegram:
    api_id: 12345678          # Your numeric api_id
    api_hash: abcdef123456    # Your api_hash string
    bot_token: 123456:ABC...  # Your bot token from BotFather
```

Also update the same values in your `.env`:

```
TELEGRAM_API_ID=12345678
TELEGRAM_API_HASH=abcdef123456
TELEGRAM_BOT_TOKEN=123456:ABC...
```

### 2. Recreate Synapse and start the bridge

Synapse needs a new volume mount for the bridge registration file, so it must be **recreated** (not just restarted):

```
# Recreate Synapse to pick up the new volume mount and bridge registration
docker compose up -d synapse

# Wait for Synapse to become healthy
docker compose ps synapse

# Start the bridge
docker compose up -d telegram-bridge
```

> **Note:** `docker compose restart synapse` will NOT work here because the `registration.yaml` volume mount is new in `compose.yaml`. A restart reuses the existing container; `up -d` recreates it with the updated mounts.

### 3. Verify

```
# Check bridge logs
docker compose logs telegram-bridge --tail 20

# Look for "Startup actions complete"
```

## Usage

### Step 1: Log in to Telegram via the bridge

Before you can bridge anything, you must link your Telegram account:

1. Open Element at your Element URL
1. Start a DM with `@telegrambot:your.matrix.domain`
1. Send `login`
1. Enter your phone number in international format (e.g., `+1234567890`)
1. Enter the verification code sent to your Telegram app
1. Your existing Telegram chats will appear as Matrix rooms

### Step 2: Bridge Matrix rooms to Telegram

This is the primary use case -- talking to MindRoom agents from Telegram.

The bridge connects a **Telegram group** to a **Matrix room**. You need a Telegram group on the Telegram side because that's what you'll open in the Telegram app to send and receive messages.

**For each Matrix room you want to access from Telegram** (e.g., Lobby):

1. **Create a Telegram group** in the Telegram app (e.g., name it "MindRoom Lobby")
1. **Add your bridge bot** (e.g., `@your_bridge_bot`) to that Telegram group
1. **In Element**, go to the Matrix room you want to bridge (e.g., Lobby)
1. **Invite the bridge bot**: invite `@telegrambot:your.matrix.domain` to the room
1. **Link the rooms**: in the Matrix room, send `!tg bridge` -- the bot will list your Telegram groups and let you pick which one to link

Once linked:

- Messages you send in the **Telegram group** appear in the **Matrix room** -- MindRoom agents will see and respond to them
- Agent responses in the **Matrix room** appear in the **Telegram group**
- You can chat with MindRoom agents entirely from the Telegram app

Repeat for any other Matrix rooms you want accessible from Telegram.

> **Why can't I just invite the bot directly?** The bridge bot (`@telegrambot`) is Matrix-side infrastructure -- it manages the bridge but isn't a Telegram chat. To use Telegram as your client, there must be a Telegram group for the Telegram app to display. The bridge connects that group to the Matrix room bidirectionally.

### Accessing Telegram chats from Matrix

After logging in (step 1), your Telegram chats automatically appear as Matrix rooms in Element. This lets you use Element as a unified client for both Matrix and Telegram conversations.

- **Private chats**: Automatically bridged as Matrix DMs
- **Groups**: Automatically bridged if within `sync_create_limit` (default: 30)
- **Additional groups**: Use `search <query>` in the bridge bot DM to find and bridge more

### Bot Commands Reference

Send these to `@telegrambot:your.matrix.domain` in a DM, or in a bridged room:

| Command          | Description                                                     |
| ---------------- | --------------------------------------------------------------- |
| `login`          | Link your Telegram account                                      |
| `logout`         | Unlink your Telegram account                                    |
| `ping`           | Check bridge connection status                                  |
| `search <query>` | Search your Telegram chats                                      |
| `!tg bridge`     | Link current Matrix room to a Telegram group (send in the room) |
| `unbridge`       | Unlink current room from Telegram                               |
| `sync`           | Re-sync Telegram chat list                                      |
| `help`           | Show all commands                                               |

## Configuration Reference

Key settings in `telegram-bridge/config.yaml`:

| Setting                       | Default              | Description                                 |
| ----------------------------- | -------------------- | ------------------------------------------- |
| `bridge.username_template`    | `telegram_{userid}`  | Matrix username pattern for Telegram ghosts |
| `bridge.displayname_template` | `{displayname} (TG)` | Display name pattern for Telegram users     |
| `bridge.sync_create_limit`    | `30`                 | Max chats to auto-create on first sync      |
| `bridge.sync_direct_chats`    | `true`               | Auto-bridge private chats                   |
| `bridge.encryption.allow`     | `true`               | Allow E2EE in bridged rooms                 |
| `bridge.permissions`          | See config           | Who can use the bridge and at what level    |

### Permission Levels

Set in `bridge.permissions`:

- `relaybot` - Messages relayed through the bot (not puppeted)
- `user` - Can use the bridge but not log in
- `puppeting` - Can log in with their Telegram account
- `full` - Full access including creating portals
- `admin` - Bridge administration

Default config gives `full` to all users on your homeserver domain.
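A minimal `bridge.permissions` block reflecting that default might look like this (replace `your.matrix.domain` with your homeserver domain; the admin entry is optional):

```
bridge:
    permissions:
        "*": relaybot
        "your.matrix.domain": full
        "@admin:your.matrix.domain": admin
```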

## Troubleshooting

### Bridge won't start

- Check credentials: `api_id` must be numeric, `api_hash` must be a hex string, `bot_token` must be a valid BotFather token
- Check logs: `docker compose logs telegram-bridge --tail 50`
- Verify Synapse is healthy: `docker compose ps`

### Login fails

- Ensure `api_id` and `api_hash` are from the same Telegram app
- The bot token must be from a bot you own (not revoked)
- If you get "FLOOD_WAIT", wait the indicated time before retrying

### Messages not bridging

- Check the bridge is connected: DM the bot and send `ping`
- Verify Synapse has the registration: check `app_service_config_files` in `homeserver.yaml`
- Check bridge permissions in `config.yaml` - your user domain must have `full` or `puppeting`

### Double puppeting

To make your messages from Matrix appear as your real Telegram account (not the bridge bot):

1. This is automatic when you log in via `login` - puppet mode is the default
1. If messages still show as the bot, check `bridge.sync_with_custom_puppets` in config

### Database issues

The bridge uses SQLite stored in the `telegram-bridge` data volume. To reset:

```
docker compose stop telegram-bridge
rm <data-dir>/telegram-bridge/mautrix-telegram.db
docker compose up -d telegram-bridge
```

Note: This will require re-logging into Telegram.

### Registration out of sync

If Synapse reports appservice errors, regenerate the registration:

```
docker compose stop telegram-bridge
rm telegram-bridge/registration.yaml
# Temporarily set valid api_id in config.yaml, then:
docker compose run --rm --no-deps --entrypoint \
  "python -m mautrix_telegram -g -c /data/config.yaml -r /data/registration.yaml" \
  telegram-bridge
docker compose restart synapse
docker compose up -d telegram-bridge
```

## Maintenance

### Updating

```
docker compose pull telegram-bridge
docker compose up -d telegram-bridge
```

### Backup

Important data locations:

- `telegram-bridge/config.yaml` - Bridge configuration
- `telegram-bridge/registration.yaml` - Appservice registration
- Telegram bridge data volume - SQLite database with session data

# Google Services OAuth

MindRoom uses the generic OAuth framework for Google tools. Each Google service has its own provider ID, callback URL, token service, OAuth client config service, and editable tool settings service. There is no bundled `/api/google/*` OAuth flow.

## Providers

| Tool            | Provider ID       | Callback path                         | Token service           | Client config service          | Settings service  | Scopes                                                    |
| --------------- | ----------------- | ------------------------------------- | ----------------------- | ------------------------------ | ----------------- | --------------------------------------------------------- |
| Google Drive    | `google_drive`    | `/api/oauth/google_drive/callback`    | `google_drive_oauth`    | `google_drive_oauth_client`    | `google_drive`    | Drive read-only plus OpenID email/profile                 |
| Google Calendar | `google_calendar` | `/api/oauth/google_calendar/callback` | `google_calendar_oauth` | `google_calendar_oauth_client` | `google_calendar` | Calendar read/write plus OpenID email/profile             |
| Google Sheets   | `google_sheets`   | `/api/oauth/google_sheets/callback`   | `google_sheets_oauth`   | `google_sheets_oauth_client`   | `google_sheets`   | Sheets read/write, plus OpenID email/profile              |
| Gmail           | `google_gmail`    | `/api/oauth/google_gmail/callback`    | `google_gmail_oauth`    | `google_gmail_oauth_client`    | `gmail`           | Gmail readonly, modify, compose plus OpenID email/profile |

## Google Cloud Setup

Create an OAuth client in Google Cloud Console. Enable only the APIs for the tools you plan to use. Add one authorized redirect URI for each provider you enable.

For local development, the redirect URIs are:

```
http://localhost:8765/api/oauth/google_drive/callback
http://localhost:8765/api/oauth/google_calendar/callback
http://localhost:8765/api/oauth/google_sheets/callback
http://localhost:8765/api/oauth/google_gmail/callback
```

For production, replace the origin with your public MindRoom origin.

## Stored Client Config

OAuth app client config is stored through normal credential storage, separate from user OAuth tokens and editable tool settings. Use one provider-specific service when one Google Cloud OAuth client should apply to only that provider. Use `google_oauth_client` when one shared Google Cloud OAuth client should apply to every Google provider. Provider-specific services win over `google_oauth_client`.

Store these fields on the client config service:

```
{
  "client_id": "your-client-id.apps.googleusercontent.com",
  "client_secret": "your-client-secret",
  "redirect_uri": "https://mindroom.example.com/api/oauth/google_drive/callback"
}
```

Notes on client config behavior:

- `redirect_uri` is optional when `MINDROOM_PUBLIC_URL` or the local default origin is correct. Only provider-specific client config services use a stored `redirect_uri`; the shared `google_oauth_client` service ignores it and derives each provider's callback URI.
- Dashboard credential responses redact `client_secret`. Saving redacted client config may omit or blank `client_secret` only when `client_id` is unchanged; changing `client_id` requires submitting the matching new `client_secret`. First-time saves require both `client_id` and `client_secret`.
- Client config services are not worker-grantable, are never mirrored into worker containers, and cannot be copied into ordinary credential services.

For non-interactive deployments, you can seed the shared client config service at startup with `MINDROOM_CREDENTIAL_SEEDS_FILE`:

```
[
  {
    "service": "google_oauth_client",
    "credentials": {
      "client_id": {"env": "GOOGLE_CLIENT_ID"},
      "client_secret": {"env": "GOOGLE_CLIENT_SECRET"}
    }
  }
]
```

The referenced env vars may also use the `*_FILE` convention, such as `GOOGLE_CLIENT_SECRET_FILE`. MindRoom updates env-sourced seeded credentials on restart, but it does not overwrite dashboard-managed client config.
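For example, to load the client secret from a mounted secret file rather than a plain environment variable (the path is illustrative):

```
GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET_FILE=/run/secrets/google_client_secret
```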

## Environment Variables

Optional account restrictions are service-specific:

```
GOOGLE_DRIVE_ALLOWED_EMAIL_DOMAINS=example.com
GOOGLE_CALENDAR_ALLOWED_HOSTED_DOMAINS=example.com
GOOGLE_SHEETS_ALLOWED_EMAIL_DOMAINS=example.com
GOOGLE_GMAIL_ALLOWED_HOSTED_DOMAINS=example.com
```

## Runtime Behavior

Dashboard and agent-issued connect links use `/api/oauth/{provider}/connect` or `/api/oauth/{provider}/authorize`. OAuth callback state is stored server-side as an opaque token and bound to the authenticated dashboard user and scoped credential target. Disconnecting a provider removes the token service for the selected scope and preserves that provider's editable tool settings.

# Google Services OAuth For Individuals

This guide is for one person running MindRoom with their own Google OAuth client. MindRoom uses per-service generic OAuth providers instead of a legacy all-Google route.

## Choose Providers

Enable only the APIs your agents need. Add the matching tool to the agent config.

```
agents:
  personal:
    display_name: Personal
    role: Help with my Google workspace
    worker_scope: user_agent
    tools:
      - google_drive
      - google_calendar
      - google_sheets
      - gmail
```

## Create OAuth Credentials

Open Google Cloud Console and create an OAuth client. Add one redirect URI per provider you use.

```
http://localhost:8765/api/oauth/google_drive/callback
http://localhost:8765/api/oauth/google_calendar/callback
http://localhost:8765/api/oauth/google_sheets/callback
http://localhost:8765/api/oauth/google_gmail/callback
```

## Configure MindRoom

For a single personal OAuth client, store shared Google OAuth app client config under `google_oauth_client` through the dashboard credentials API or raw credentials editor:

```
{
  "client_id": "your-client-id.apps.googleusercontent.com",
  "client_secret": "your-client-secret"
}
```

For provider-specific client config, use `google_drive_oauth_client`, `google_calendar_oauth_client`, `google_sheets_oauth_client`, or `google_gmail_oauth_client`. Behavior to keep in mind:

- Provider-specific client config wins over the shared `google_oauth_client` service, which supplies only the shared client ID and secret.
- MindRoom derives each provider's redirect URI from `MINDROOM_PUBLIC_URL` or the local default origin.
- OAuth app client config is stored separately from user OAuth tokens and is never mirrored into worker containers.
- First-time dashboard client setup requires `client_id` and `client_secret`. Later edits may leave `client_secret` blank only when `client_id` is unchanged; changing `client_id` requires submitting the matching new `client_secret`.

When using standalone dashboard API-key auth, also set `MINDROOM_OWNER_USER_ID` to your Matrix user ID, such as `@alice:matrix.example.com`. Do not use `MINDROOM_OWNER_USER_ID` as the identity model for hosted multi-user private agents. Use [Trusted Upstream Browser Auth](https://docs.mindroom.chat/deployment/trusted-upstream-auth/index.md) for those deployments.
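For a standalone single-owner deployment, the relevant `.env` entries might look like this (values are placeholders):

```
MINDROOM_API_KEY=replace-with-a-long-random-string
MINDROOM_OWNER_USER_ID=@alice:matrix.example.com
```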

## Connect

Open the MindRoom dashboard and connect the integration required by each tool. If an agent tries a Google tool before it is connected, the tool result includes a MindRoom connect URL for that exact provider and agent scope. After the browser OAuth flow completes, retry the original request.

OAuth tokens are stored under provider token services such as `google_drive_oauth`. Editable tool settings are stored separately under services such as `google_drive`, `google_calendar`, `google_sheets`, and `gmail`. OAuth app client config is stored separately under services such as `google_oauth_client` or `google_drive_oauth_client`.

# Trusted Upstream Browser Auth

Use trusted upstream auth when MindRoom API and browser routes sit behind a deployment-owned access layer that has already authenticated the human. This mode is disabled by default. Do not enable it unless the reverse proxy or identity gateway strips client-supplied copies of the trusted headers and injects verified values itself.

## Why It Exists

Agent-issued OAuth links are normal browser links such as `/api/oauth/google_drive/authorize?connect_token=...`. The connect token records the Matrix requester that triggered the missing-credentials tool result. In a hosted multi-user private-agent deployment, the browser opening that link must authenticate as the same requester. The standalone `MINDROOM_OWNER_USER_ID` setting maps every dashboard request to one Matrix user, so it is only appropriate for single-owner deployments. It is not a hosted multi-user identity solution.

## Environment

Configure the header names that your access layer owns:

```
MINDROOM_TRUSTED_UPSTREAM_AUTH_ENABLED=true
MINDROOM_TRUSTED_UPSTREAM_USER_ID_HEADER=X-MindRoom-User-Id
MINDROOM_TRUSTED_UPSTREAM_EMAIL_HEADER=X-MindRoom-User-Email
MINDROOM_TRUSTED_UPSTREAM_MATRIX_USER_ID_HEADER=X-MindRoom-Matrix-User-Id
MINDROOM_TRUSTED_UPSTREAM_EMAIL_TO_MATRIX_USER_ID_TEMPLATE='@{localpart}:example.org'
```

- `MINDROOM_TRUSTED_UPSTREAM_USER_ID_HEADER` is required when trusted upstream auth is enabled. The user ID value must be stable for the authenticated browser user.
- `MINDROOM_TRUSTED_UPSTREAM_EMAIL_HEADER` is optional unless `MINDROOM_TRUSTED_UPSTREAM_EMAIL_TO_MATRIX_USER_ID_TEMPLATE` is set. When the email-to-Matrix template is set, the email header is required because MindRoom derives the Matrix localpart from that trusted email value. When present, the email value is stored in `request.scope["auth_user"]["email"]`.
- `MINDROOM_TRUSTED_UPSTREAM_MATRIX_USER_ID_HEADER` is optional for shared dashboard access. For private `user` and `user_agent` OAuth flows, the trusted identity must resolve to the requester identity used by Matrix-backed tool execution, so prefer this header when your access layer can supply a real Matrix ID.
- When the access layer only supplies email, set `MINDROOM_TRUSTED_UPSTREAM_EMAIL_TO_MATRIX_USER_ID_TEMPLATE` to derive the Matrix ID from the trusted email localpart. For example, the template `@{localpart}:example.org` maps `alice@example.com` to `@alice:example.org`. The template must contain exactly one `{localpart}` placeholder, and derived Matrix IDs must pass MindRoom's Matrix user ID parser.
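The template derivation can be illustrated with a small shell sketch (values are placeholders; MindRoom's real implementation also runs the result through its Matrix user ID parser, which this sketch omits):

```
# Illustrative only: map a trusted email header value to a Matrix user ID
# using the template @{localpart}:example.org
email="alice@example.com"
localpart="${email%%@*}"                 # everything before the first '@'
matrix_id="@${localpart}:example.org"    # substitute into the template
echo "$matrix_id"
```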

## Instance Chart

For the hosted instance chart, configure the equivalent values:

```
trustedUpstreamAuth:
  enabled: "true"
  userIdHeader: X-MindRoom-User-Id
  emailHeader: X-MindRoom-User-Email
  matrixUserIdHeader: X-MindRoom-Matrix-User-Id
  emailToMatrixUserIdTemplate: "@{localpart}:example.org"
```

The chart renders these values as the `MINDROOM_TRUSTED_UPSTREAM_*` runtime environment variables. The instance chart fails rendering when `trustedUpstreamAuth.emailToMatrixUserIdTemplate` is set without `trustedUpstreamAuth.emailHeader`. The template value must contain exactly one `{localpart}` placeholder. When using the platform provisioner, configure the platform chart with matching provisioner values:

```
provisioner:
  trustedUpstreamAuth:
    enabled: "true"
    userIdHeader: X-MindRoom-User-Id
    emailHeader: X-MindRoom-User-Email
    matrixUserIdHeader: X-MindRoom-Matrix-User-Id
    emailToMatrixUserIdTemplate: "@{localpart}:example.org"
```

The platform chart renders these as `INSTANCE_TRUSTED_UPSTREAM_*` variables on the provisioner deployment. The platform chart fails rendering when `provisioner.trustedUpstreamAuth.emailToMatrixUserIdTemplate` is set without `provisioner.trustedUpstreamAuth.emailHeader`.

## Security Boundary

Trusted upstream auth is provider-neutral. Cloudflare Access, an ingress controller, an OAuth2 proxy, or another gateway can provide the headers as long as MindRoom only receives gateway-verified values. Never expose a MindRoom instance with this mode enabled directly to browsers or the public internet. If the configured trusted user ID header is missing, MindRoom returns `401`. If a trusted browser identity does not map to the Matrix requester stored in an OAuth connect token, MindRoom returns `403`. Existing Supabase platform auth and standalone API-key auth remain available when trusted upstream auth is not enabled.

# Docker Deployment

Deploy MindRoom using Docker for simple, containerized deployments.

## Quick Start

MindRoom ships as a single runtime container that serves:

- the bot orchestrator
- the dashboard UI at `http://localhost:8765`
- the dashboard API at `http://localhost:8765/api`
- the OpenAI-compatible API at `http://localhost:8765/v1`

Run it with:

```
docker run -d \
  --name mindroom \
  -p 8765:8765 \
  -v ./config.yaml:/app/config.yaml:ro \
  -v ./mindroom_data:/app/mindroom_data \
  --env-file .env \
  ghcr.io/mindroom-ai/mindroom:latest
```

## Docker Compose

Create a `docker-compose.yml`:

```
services:
  mindroom:
    image: ghcr.io/mindroom-ai/mindroom:latest
    container_name: mindroom
    restart: unless-stopped
    ports:
      - "8765:8765"
    volumes:
      - ./config.yaml:/app/config.yaml:ro
      - ./mindroom_data:/app/mindroom_data
    env_file:
      - .env
    environment:
      - MINDROOM_STORAGE_PATH=/app/mindroom_data
      - LOG_LEVEL=${LOG_LEVEL:-INFO}
      - MATRIX_HOMESERVER=${MATRIX_HOMESERVER}
      # Optional: for self-signed certificates
      # - MATRIX_SSL_VERIFY=false
      # Optional: override server name for federation
      # - MATRIX_SERVER_NAME=example.com
```

Run with:

```
docker compose up -d
```

## Environment Variables

Key environment variables (set in `.env` or pass directly):

| Variable                 | Description                                                                                                                                                                               | Default                                         |
| ------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------- |
| `MATRIX_HOMESERVER`      | Matrix server URL                                                                                                                                                                         | `http://localhost:8008`                         |
| `MATRIX_SSL_VERIFY`      | Verify SSL certificates                                                                                                                                                                   | `true`                                          |
| `MATRIX_SERVER_NAME`     | Server name for federation (optional)                                                                                                                                                     | -                                               |
| `MINDROOM_STORAGE_PATH`  | Data storage directory                                                                                                                                                                    | Relative to config file                         |
| `LOG_LEVEL`              | Logging level                                                                                                                                                                             | `INFO`                                          |
| `MINDROOM_LOGGER_LEVELS` | Optional per-logger overrides, for example `mindroom:DEBUG,httpx:WARNING,httpcore:WARNING,anthropic:INFO,nio:WARNING`; set `nio.crypto:WARNING` to inspect Matrix crypto decrypt warnings | -                                               |
| `MINDROOM_CONFIG_PATH`   | Path to config.yaml                                                                                                                                                                       | `./config.yaml`, then `~/.mindroom/config.yaml` |
| `ANTHROPIC_API_KEY`      | Anthropic API key (if using Claude models)                                                                                                                                                | -                                               |
| `OPENAI_API_KEY`         | OpenAI API key (if using OpenAI models)                                                                                                                                                   | -                                               |
| `MINDROOM_PORT`          | Port used by Google OAuth callback URL construction and deployment tooling. Does **not** change the API server bind port -- use `mindroom run --api-port` for that                         | `8765`                                          |
| `MINDROOM_API_KEY`       | API key for dashboard auth (standalone)                                                                                                                                                   | - (open access)                                 |

To change the API server port or bind address, pass `--api-port` or `--api-host` to the `mindroom run` command. For example, add `command: ["mindroom", "run", "--api-port", "9000"]` to the Docker Compose service.

Streaming responses are configured in `config.yaml` via `defaults.enable_streaming` (default: `true`).

If `MINDROOM_API_KEY` is set, the browser dashboard will prompt for the key via a same-origin login page before loading the UI.

## Building from Source

Build from the repository root:

```
docker build -t mindroom:dev -f local/instances/deploy/Dockerfile.mindroom .
```

The Dockerfile uses a multi-stage build with `uv` for dependency management and runs as a non-root user (UID 1000).

A `Dockerfile.mindroom-minimal` variant is also available; it builds a smaller image without pre-installed tool extras, which is useful for sandbox runners.

## With Local Matrix

For development, run MindRoom alongside a local Matrix server:

```
# Start Matrix (Synapse + Postgres + Redis)
cd local/matrix && docker compose up -d

# Verify Matrix is running
curl -s http://localhost:8008/_matrix/client/versions

# Start MindRoom using the docker-compose.yml you created above
docker compose up -d
```

The local Matrix stack includes:

- **Synapse**: Matrix homeserver on port 8008
- **PostgreSQL**: Database backend
- **Redis**: Caching layer

If you're running the backend on the host (not in Docker), you can use `mindroom local-stack-setup` to start Synapse + MindRoom Cinny and persist local Matrix env vars automatically:

```
mindroom local-stack-setup --synapse-dir /path/to/mindroom-stack/local/matrix
mindroom run
```

## Health Checks

The container exposes a health endpoint on port 8765:

```
curl http://localhost:8765/api/health
```
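Under Docker Compose, the same endpoint can back a container healthcheck. A sketch, assuming `curl` is available inside the image (interval values are illustrative):

```yaml
services:
  mindroom:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8765/api/health"]
      interval: 30s
      timeout: 5s
      retries: 3
```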

## Data Persistence

MindRoom stores data in the `mindroom_data` directory:

- `sessions/` - Per-agent conversation history (SQLite)
- `learning/` - Per-agent Agno Learning state (SQLite, persistent across restarts)
- `chroma/` - ChromaDB vector store for agent/team memories
- `knowledge_db/` - Knowledge base vector stores
- `culture/` - Shared culture state
- `tracking/` - Response tracking to avoid duplicates
- `credentials/` - Synchronized secrets from `.env`
- `logs/` - Application logs
- `matrix_state.yaml` - Matrix connection state
- `encryption_keys/` - Matrix E2EE keys (if enabled)

## Sandbox Proxy Isolation

When configured, `shell`, `file`, and `python` tool calls can be proxied to a separate **sandbox-runner** sidecar container. The sidecar runs the same image but without access to secrets, credentials, or the primary data volume. This provides real process-level isolation for code-execution tools. Without proxy configuration, all tools execute locally in the MindRoom process.

See [Sandbox Proxy Isolation](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md) for full documentation including Docker Compose examples, Kubernetes shared-sidecar and dedicated-worker modes, host-machine-with-container mode, credential leases, and environment variable reference.

> [!TIP] For production, use a reverse proxy (Traefik, Nginx) in front of the MindRoom container when you want TLS, host routing, or additional auth layers. See `local/instances/deploy/docker-compose.yml` for an example with Traefik labels.

# Sandbox Proxy Isolation

When agents have code-execution tools (`shell`, `file`, `python`), they can read and modify anything on the filesystem, including config files, credentials, and application code. The **sandbox proxy** isolates these tools by forwarding their calls to a separate worker runtime that has no direct access to the primary process secrets. This page describes the current sandboxed execution model.

## How it works

```
┌──────────────────────────┐         HTTP          ┌──────────────────────────┐
│ Primary MindRoom runtime │  ── tool call ──▶     │ Worker runtime           │
│ has secrets              │  ◀── result ───       │ no primary secrets       │
│ has credentials          │                       │ leased credentials only  │
│ has orchestration state  │                       │ agent state + caches     │
└──────────────────────────┘                       └──────────────────────────┘
```

1. Agent invokes `shell.run_shell_command(...)` or another worker-routed tool.
1. The primary MindRoom runtime resolves the target worker from the configured backend plus worker scope.
1. The call is forwarded over HTTP to the target worker runtime.
1. The worker executes the tool against the agent's storage directory plus any worker-local caches and returns the result.
1. All other tools such as API tools or Matrix-bound tools execute in the primary MindRoom runtime as usual.

The static worker runtime authenticates requests with `MINDROOM_SANDBOX_PROXY_TOKEN`. Kubernetes dedicated workers derive a separate runner token for each worker from that control-plane token and the worker key. Compromising one dedicated worker token does not authorize requests to another dedicated worker runner. For tools that need credentials, such as a shell tool that calls an authenticated API, the primary MindRoom runtime can create a short-lived **credential lease** that the worker consumes once. Credentials never become part of the normal tool arguments or the model prompt.
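The derivation algorithm itself is internal to MindRoom; the per-worker property it provides can be illustrated with a hypothetical HMAC-based scheme (the real derivation may differ):

```python
import hashlib
import hmac


def derive_runner_token(control_plane_token: str, worker_key: str) -> str:
    """Hypothetical sketch: one deterministic token per worker key.

    Knowing one derived token reveals neither the control-plane token
    nor any other worker's token.
    """
    return hmac.new(
        control_plane_token.encode(),
        f"runner:{worker_key}".encode(),
        hashlib.sha256,
    ).hexdigest()


a = derive_runner_token("control-secret", "user1-agentA")
b = derive_runner_token("control-secret", "user2-agentB")
assert a != b  # distinct per worker
assert a == derive_runner_token("control-secret", "user1-agentA")  # deterministic
```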

MindRoom currently ships two worker backend shapes:

- `static_runner`: one shared sandbox-runner process, usually a sidecar container or a local HTTP service.
- `kubernetes`: dedicated worker pods created on demand from the primary runtime, with one logical worker per worker key.

## Where Agent Data Lives

Each agent stores all its persistent data (context files, workspace files, memory, sessions, learning) in one directory: `agents/<name>/`. This directory is shared across all worker scopes — switching `worker_scope` changes how tool runtimes are isolated, not where agent data lives. Worker runtimes may keep their own virtualenvs, caches, and scratch files, but those are not agent data. Multiple runtimes may access the same agent directory concurrently, so files and databases there must tolerate concurrent access.

## Deployment modes

### Docker Compose (`static_runner`)

Add a `sandbox-runner` service alongside MindRoom. Both use the same image. The runner just has a different entrypoint and no access to `.env` or the primary data volume.

```
services:
  mindroom:
    image: ghcr.io/mindroom-ai/mindroom:latest
    env_file: .env
    volumes:
      - ./config.yaml:/app/config.yaml:ro
      - ./mindroom_data:/app/mindroom_data
    environment:
      - MINDROOM_WORKER_BACKEND=static_runner
      - MINDROOM_SANDBOX_PROXY_URL=http://sandbox-runner:8766
      - MINDROOM_SANDBOX_PROXY_TOKEN=${MINDROOM_SANDBOX_PROXY_TOKEN}
      - MINDROOM_SANDBOX_EXECUTION_MODE=selective
      - MINDROOM_SANDBOX_PROXY_TOOLS=shell,file,python

  sandbox-runner:
    image: ghcr.io/mindroom-ai/mindroom:latest
    command: ["/app/run-sandbox-runner.sh"]
    user: "1000:1000"
    volumes:
      - sandbox-workspace:/app/workspace
    environment:
      - MINDROOM_SANDBOX_RUNNER_MODE=true
      - MINDROOM_SANDBOX_PROXY_TOKEN=${MINDROOM_SANDBOX_PROXY_TOKEN}
      - MINDROOM_CONFIG_PATH=/app/config.yaml
      - MINDROOM_STORAGE_PATH=/app/workspace/.mindroom

volumes:
  sandbox-workspace:
```

Do not mount the full `mindroom_data` tree into the runner because it contains credentials, Matrix encryption keys, sessions, and logs.

> [!IMPORTANT] The `sandbox-workspace` Docker volume is created as root by default. The runner runs as UID 1000, so you must fix ownership after first creating the volume:

```
docker run --rm -v sandbox-workspace:/workspace busybox chown -R 1000:1000 /workspace
```

Alternatively, omit the `user:` directive to run as root (less secure).

Key differences from the primary MindRoom runtime:

- **No `env_file`** — runner has no API keys, no Matrix credentials
- **Scratch workspace** — a dedicated volume for worker-local files (caches, virtualenvs)
- **`MINDROOM_STORAGE_PATH`** — pointed at a writable location inside the scratch workspace for tool registry and cache files

> [!WARNING] **Filesystem isolation depends on the worker backend.** Static shared-runner deployments should not mount the primary MindRoom storage tree into the runner. Local in-process execution still shares the primary process filesystem. Kubernetes dedicated workers restrict mounts so each runtime only sees its own agent's directory (for `shared`, `user_agent`, and unscoped modes). The `user` scope is intentionally broader: it shares one runtime across multiple agents per user, so agents in that runtime can see each other's files. Use `user_agent` for per-agent filesystem isolation.

### Kubernetes shared sidecar (`workerBackend: static_runner`)

In Kubernetes the shared runner can still run as a second container in the same pod, sharing `localhost` networking. This is the `workerBackend: static_runner` Helm mode. See `cluster/k8s/instance/templates/deployment-mindroom.yaml` for the full manifest. The sidecar gets:

- An `emptyDir` volume for worker-local scratch files and caches.
- Access to the same shared storage that holds agent data directories.
- Read-only access to config for plugin tool registration.
- No access to the primary secrets volume.

### Kubernetes dedicated workers (`workerBackend: kubernetes`)

In dedicated-worker mode the primary MindRoom runtime creates worker Deployments and Services on demand. Each worker pod runs the sandbox-runner app and is addressed through an internal cluster Service. Each dedicated worker needs access to its agent's storage directory. Worker-local files (caches, virtualenvs, metadata) are kept separate per worker. When a worker is idle, its Deployment scales to zero, but agent data and worker caches are preserved.

The runtime chart stores derived worker tokens as per-worker keys in one chart-created worker-auth Secret when workers run in the release namespace. If `workers.kubernetes.namespace` is set to a separate worker namespace, the runtime chart can instead manage per-worker auth Secrets in that namespace. The hosted instance chart stores derived worker tokens as per-worker keys in a pre-created tenant auth Secret. The hosted instance worker-manager Role does not grant broad Secret API access in the shared `mindroom-instances` namespace.

Use the instance Helm chart with values like:

```
workerBackend: kubernetes
workerCleanupIntervalSeconds: 30
storageAccessMode: ReadWriteMany
kubernetesWorkerPort: 8766
kubernetesWorkerReadyTimeoutSeconds: 60
kubernetesWorkerIdleTimeoutSeconds: 1800
sandbox_proxy_token: "replace-me"
```

Important notes for this mode:

- `storageAccessMode` should be `ReadWriteMany` because multiple dedicated workers may need concurrent access to the same agent storage.
- If you must keep `ReadWriteOnce`, set `controlPlaneNodeName` so the control plane and dedicated workers stay on the same node.
- `kubernetesWorkerImage` and `kubernetesWorkerImagePullPolicy` default to the main MindRoom image settings when left empty.
- The chart creates the worker-manager ServiceAccount, Role, RoleBinding, and worker-specific NetworkPolicy rules automatically when this backend is enabled.

The runtime and hosted instance charts grant narrow access to one worker-auth Secret in shared runtime namespaces, while explicitly separate runtime worker namespaces may use per-worker auth Secret CRUD.

- The primary runtime does not need `MINDROOM_SANDBOX_PROXY_URL` in this mode because worker endpoints come from the Kubernetes worker handles.
- Dynamic worker pods default to `enableServiceLinks: false` so Kubernetes does not inject sibling Service names into the runner environment.
- Runner ingress defaults to allowing the MindRoom control-plane pod to reach worker runner ports, while worker-to-worker ingress is denied by NetworkPolicy.
- The authenticated `/api/workers` and `/api/workers/cleanup` endpoints on the primary runtime expose backend-neutral worker lifecycle information.

Untrusted code-execution tools may still share the runner container's process namespace and may be able to inspect the runner process environment through `/proc` on some container runtimes. For dedicated Kubernetes workers, the exposed environment contains only that worker's derived runner token, not the shared control-plane token. This leaves same-worker token exposure as a local containment risk, while per-worker credentials and NetworkPolicy limit cross-worker blast radius.

For the full Helm-side deployment guidance, see [Kubernetes Deployment](https://docs.mindroom.chat/deployment/kubernetes/index.md).

### Host machine + Docker sandbox container

Run MindRoom directly on the host while isolating code-execution tools in a Docker container:

```
# 1. Start the sandbox runner container
docker run -d \
  --name mindroom-sandbox-runner \
  -p 8766:8766 \
  -e MINDROOM_WORKER_BACKEND=static_runner \
  -e MINDROOM_SANDBOX_RUNNER_MODE=true \
  -e MINDROOM_SANDBOX_PROXY_TOKEN=your-secret-token \
  -e MINDROOM_STORAGE_PATH=/app/workspace/.mindroom \
  ghcr.io/mindroom-ai/mindroom:latest \
  /app/run-sandbox-runner.sh

# 2. Start MindRoom on the host with proxy config
export MINDROOM_WORKER_BACKEND=static_runner
export MINDROOM_SANDBOX_PROXY_URL=http://localhost:8766
export MINDROOM_SANDBOX_PROXY_TOKEN=your-secret-token
export MINDROOM_SANDBOX_EXECUTION_MODE=selective
export MINDROOM_SANDBOX_PROXY_TOOLS=shell,file,python
mindroom run
```

Or add the proxy variables to your `.env` file:

```
MINDROOM_WORKER_BACKEND=static_runner
MINDROOM_SANDBOX_PROXY_URL=http://localhost:8766
MINDROOM_SANDBOX_PROXY_TOKEN=your-secret-token
MINDROOM_SANDBOX_EXECUTION_MODE=selective
MINDROOM_SANDBOX_PROXY_TOOLS=shell,file,python
```

This gives you the convenience of running MindRoom natively while keeping code-execution tools inside a container boundary.

> [!TIP] If you use plugin tools that also need proxying, mount your `config.yaml` into the runner container so it can register them:

```
docker run -d \
  --name mindroom-sandbox-runner \
  -p 8766:8766 \
  -v ./config.yaml:/app/config.yaml:ro \
  -e MINDROOM_CONFIG_PATH=/app/config.yaml \
  -e MINDROOM_SANDBOX_RUNNER_MODE=true \
  -e MINDROOM_SANDBOX_PROXY_TOKEN=your-secret-token \
  -e MINDROOM_STORAGE_PATH=/app/workspace/.mindroom \
  ghcr.io/mindroom-ai/mindroom:latest \
  /app/run-sandbox-runner.sh
```

## Environment variable reference

### Primary MindRoom runtime (proxy client)

| Variable                                        | Description                                                                                                                                                           | Default                                       |
| ----------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------- |
| `MINDROOM_WORKER_BACKEND`                       | Worker backend name: `static_runner` or `kubernetes`                                                                                                                  | `static_runner`                               |
| `MINDROOM_SANDBOX_PROXY_URL`                    | URL of the shared sandbox runner when using `static_runner`                                                                                                           | *(none — proxy disabled for `static_runner`)* |
| `MINDROOM_SANDBOX_PROXY_TOKEN`                  | Static-runner bearer token and Kubernetes control-plane secret used to derive per-worker runner tokens                                                                | *(required for worker-routed execution)*      |
| `MINDROOM_SANDBOX_EXECUTION_MODE`               | `selective`, `all`, `off`                                                                                                                                             | *(unset — uses proxy tools list)*             |
| `MINDROOM_SANDBOX_PROXY_TOOLS`                  | Comma-separated tool names to proxy                                                                                                                                   | `*` (all, unless mode is `selective`)         |
| `MINDROOM_SANDBOX_PROXY_TIMEOUT_SECONDS`        | HTTP timeout for proxy calls                                                                                                                                          | `120`                                         |
| `MINDROOM_ATTACHMENT_INLINE_SAVE_MAX_BYTES`     | Maximum attachment bytes the primary runtime will inline when saving context attachments into a worker workspace with `get_attachment(..., mindroom_output_path=...)` | `16777216` (16 MiB)                           |
| `MINDROOM_SANDBOX_CREDENTIAL_LEASE_TTL_SECONDS` | Credential lease lifetime                                                                                                                                             | `60`                                          |
| `MINDROOM_SANDBOX_CREDENTIAL_POLICY_JSON`       | JSON mapping tool selectors to credential services                                                                                                                    | `{}`                                          |

When `MINDROOM_WORKER_BACKEND=kubernetes`, the primary runtime resolves worker endpoints through the Kubernetes backend and does not use `MINDROOM_SANDBOX_PROXY_URL`. The Helm chart sets the Kubernetes backend environment variables automatically. If you deploy that mode without Helm, see [Kubernetes Deployment](https://docs.mindroom.chat/deployment/kubernetes/index.md) and `src/mindroom/workers/backends/kubernetes_config.py` for the required environment surface.

### Sandbox runner

| Variable                                             | Description                                                                                                                        | Default                                                      |
| ---------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------ |
| `MINDROOM_SANDBOX_RUNNER_PORT`                       | Port the sandbox runner listens on                                                                                                 | `8766`                                                       |
| `MINDROOM_SANDBOX_RUNNER_MODE`                       | Set to `true` to indicate runner mode                                                                                              | `false`                                                      |
| `MINDROOM_SANDBOX_PROXY_TOKEN`                       | Runner bearer token. Static runners use the shared primary token; Kubernetes dedicated workers receive a per-worker derived token. | *(required)*                                                 |
| `MINDROOM_SANDBOX_RUNNER_EXECUTION_MODE`             | `inprocess` or `subprocess`                                                                                                        | `inprocess`                                                  |
| `MINDROOM_SANDBOX_RUNNER_SUBPROCESS_TIMEOUT_SECONDS` | Subprocess timeout                                                                                                                 | `120`                                                        |
| `MINDROOM_STORAGE_PATH`                              | Writable directory for tool registry init and worker-local caches (e.g., `/app/workspace/.mindroom`)                               | `mindroom_data` next to config *(will fail if not writable)* |
| `MINDROOM_CONFIG_PATH`                               | Path to config.yaml (for plugin tool registration)                                                                                 | *(optional)*                                                 |

## Execution modes

| Mode                         | Behavior                                                                                                   |
| ---------------------------- | ---------------------------------------------------------------------------------------------------------- |
| `selective`                  | Only tools listed in `MINDROOM_SANDBOX_PROXY_TOOLS` are proxied. Recommended.                              |
| `all` / `sandbox_all`        | Every tool call goes through the proxy                                                                     |
| `off` / `local` / `disabled` | Proxy disabled even if URL is set                                                                          |
| *(unset)*                    | If `MINDROOM_SANDBOX_PROXY_TOOLS` is `*` or unset, proxies all tools; if set to a list, proxies only those |
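The table above can be sketched as a small decision function (illustrative only, not the actual implementation):

```python
def should_proxy(tool: str, mode, proxy_tools: str) -> bool:
    """Illustrative sketch of the execution-mode table.

    `mode` is MINDROOM_SANDBOX_EXECUTION_MODE (or None when unset);
    `proxy_tools` is the MINDROOM_SANDBOX_PROXY_TOOLS string.
    """
    if mode in ("off", "local", "disabled"):
        return False
    if mode in ("all", "sandbox_all"):
        return True
    tools = [t.strip() for t in proxy_tools.split(",") if t.strip()]
    if mode == "selective":
        return tool in tools
    # mode unset: '*' (or empty) proxies everything; a list proxies only those
    return proxy_tools.strip() in ("", "*") or tool in tools


assert should_proxy("shell", "selective", "shell,file") is True
assert should_proxy("web_search", "selective", "shell,file") is False
assert should_proxy("shell", "off", "*") is False
assert should_proxy("anything", None, "*") is True
```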

## Shell env and PATH

When `shell` runs through the sandbox proxy, it receives only a small non-secret system env by default: `PATH`, `HOME`, `USER`, `TMPDIR`, locale variables, proxy variables, and certificate path variables. Committed runtime `.env` values and provider credentials are not forwarded implicitly, and worker startup env also denies provider API keys such as `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` by default.

Configure `extra_env_passthrough` with exact names or glob patterns for exported process env variables you want shell execution to inherit. `extra_env_passthrough` matches exported process env, not config-adjacent `.env` entries.

To prevent the runner from leaking its own control-plane credentials to tools, shell passthrough drops names in a small explicit denylist (`MINDROOM_API_KEY`, `MINDROOM_LOCAL_CLIENT_SECRET`, `MINDROOM_SANDBOX_PROXY_TOKEN`, `MINDROOM_SANDBOX_STARTUP_MANIFEST_PATH`) and any name starting with `MINDROOM_SANDBOX_`. Everything else that matches your configured names or globs passes through, including service tokens and provider credentials. If you don't want a value to reach shell commands, don't match it with `extra_env_passthrough`.
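The passthrough rules can be sketched as a filter over the exported process env. This is illustrative, not the actual implementation, and the service variable names in the example are hypothetical:

```python
from fnmatch import fnmatch

# Explicit denylist from the docs; MINDROOM_SANDBOX_* is dropped wholesale.
DENYLIST = {
    "MINDROOM_API_KEY",
    "MINDROOM_LOCAL_CLIENT_SECRET",
    "MINDROOM_SANDBOX_PROXY_TOKEN",
    "MINDROOM_SANDBOX_STARTUP_MANIFEST_PATH",
}


def passthrough_env(process_env, patterns):
    """Keep exported vars matching a configured name or glob, minus the denylist."""
    out = {}
    for name, value in process_env.items():
        if name in DENYLIST or name.startswith("MINDROOM_SANDBOX_"):
            continue
        if any(fnmatch(name, p) for p in patterns):
            out[name] = value
    return out


env = {
    "DAWARICH_TOKEN": "t",                # hypothetical service token
    "MINDROOM_SANDBOX_PROXY_TOKEN": "s",  # always dropped
    "OPENAI_API_KEY": "k",                # passes only if a pattern matches it
}
assert passthrough_env(env, ["DAWARICH_*"]) == {"DAWARICH_TOKEN": "t"}
```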

If proxied shell commands need extra PATH entries such as wrapper directories, configure `shell_path_prepend`. This prepends the configured entries ahead of the runtime PATH while preserving the existing PATH order and removing duplicates. That keeps PATH handling deployment-specific instead of baking host-specific directories into the shell tool itself.
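The described PATH handling can be sketched as follows (illustrative, not the actual implementation):

```python
def build_path(shell_path_prepend, runtime_path):
    """Prepend configured entries, preserve existing PATH order, drop duplicates."""
    entries = list(shell_path_prepend) + runtime_path.split(":")
    seen, out = set(), []
    for entry in entries:
        if entry and entry not in seen:
            seen.add(entry)
            out.append(entry)
    return ":".join(out)


# A wrapper dir already present later in PATH is deduplicated, not repeated.
assert build_path(["/opt/wrappers"], "/usr/bin:/bin:/opt/wrappers") == "/opt/wrappers:/usr/bin:/bin"
```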

Shell commands that exceed their timeout return a background handle. Use `check_shell_command(handle)` to poll and `kill_shell_command(handle)` to stop the process. These handles are process-local to the sandbox runner: they survive multiple requests to the same runner process, but not runner restarts. To make that work, shell background-handle requests stay owned by the long-lived runner process even when `MINDROOM_SANDBOX_RUNNER_EXECUTION_MODE=subprocess`.

## Workspace home contract

For worker-routed shell or python requests with a resolved workspace, MindRoom sets `HOME` and `MINDROOM_AGENT_WORKSPACE` to that workspace before running the tool. This includes agent-routed calls, worker-keyed calls whose prepared runtime has a `base_dir`, and unkeyed static-sidecar calls with an explicit absolute `base_dir` override. It also sets `XDG_CONFIG_HOME`, `XDG_DATA_HOME`, and `XDG_STATE_HOME` under that workspace.

Workspace identity variables and worker cache variables are owned by MindRoom for the request:

- `HOME`, `MINDROOM_AGENT_WORKSPACE`, `XDG_CONFIG_HOME`, `XDG_DATA_HOME`, and `XDG_STATE_HOME` stay under the workspace.
- `XDG_CACHE_HOME`, `PIP_CACHE_DIR`, `UV_CACHE_DIR`, and `PYTHONPYCACHEPREFIX` stay under the worker cache directory when a worker root exists.
- `VIRTUAL_ENV` is preserved from the active worker environment and is not pointed at the agent workspace.

MindRoom reasserts these owned variables after request env passthrough and after `.mindroom/worker-env.sh`, so hooks can read them but cannot redirect them. The practical contract is that `pwd`, `~`, `Path.home()`, attachment `mindroom_output_path` saves, and file/coding relative paths all refer to the same workspace. For example, after `get_attachment("att_...", mindroom_output_path="incoming/file.txt")`, worker-routed shell can read both `incoming/file.txt` and `~/incoming/file.txt`.
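The owned-variable contract can be sketched as an env-assembly step (illustrative only; the exact subpaths under the workspace and cache root are assumptions):

```python
def workspace_env(workspace, worker_cache, base_env):
    """Reassert MindRoom-owned env names over whatever passthrough or hooks set."""
    env = dict(base_env)
    env.update({
        "HOME": workspace,
        "MINDROOM_AGENT_WORKSPACE": workspace,
        "XDG_CONFIG_HOME": f"{workspace}/.config",      # subpaths are illustrative
        "XDG_DATA_HOME": f"{workspace}/.local/share",
        "XDG_STATE_HOME": f"{workspace}/.local/state",
    })
    if worker_cache:  # cache vars stay under the worker root when one exists
        env.update({
            "XDG_CACHE_HOME": f"{worker_cache}/xdg",
            "PIP_CACHE_DIR": f"{worker_cache}/pip",
            "UV_CACHE_DIR": f"{worker_cache}/uv",
            "PYTHONPYCACHEPREFIX": f"{worker_cache}/pycache",
        })
    # VIRTUAL_ENV is left exactly as the active worker environment set it.
    return env


env = workspace_env(
    "/agents/code/workspace",
    "/worker/cache",
    {"HOME": "/tmp/hook-tried-this", "VIRTUAL_ENV": "/worker/venv"},
)
assert env["HOME"] == "/agents/code/workspace"  # a hook cannot redirect HOME
assert env["VIRTUAL_ENV"] == "/worker/venv"     # preserved from the worker env
```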

## Workspace env hook (`.mindroom/worker-env.sh`)

Agents can drop a shell script at `<workspace>/.mindroom/worker-env.sh` to set custom env for worker-routed tool calls without changing config or redeploying.

The runner sources this script with `bash` after applying the workspace home contract and before each worker-routed `shell` or `python` request, then merges its exported env into the tool's execution environment.

**Discovery:**

- For agent-routed worker requests, the hook lives at the resolved agent workspace root as `.mindroom/worker-env.sh`.
- For shared and unscoped agents that means `agents/<agent>/workspace/.mindroom/worker-env.sh`.
- For private agents that means `private_instances/<scope>/<agent>/workspace/.mindroom/worker-env.sh`.
- For `worker_scope: user`, the hook follows the per-request workspace, so one shared user runtime can pick up different hooks as it works in different agent workspaces.
- For unkeyed static-sidecar proxy calls (no `worker_key`), the hook is discovered from `tool_init_overrides["base_dir"]` only when that value is an absolute path; relative strings are ignored on this path because there is no canonical workspace root to resolve them against.

**Semantics:**

- Edits take effect on the next worker-routed tool call. No pod restart, no config reload, no Helm change.
- The script must `export FOO=bar` for values to overlay; bare `FOO=bar` does not persist (no `set -a`).
- Filesystem side effects inside the worker sandbox are allowed because the hook is arbitrary agent-editable shell.
- Shell aliases, functions, and `cd` do not persist — only exported env crosses the boundary.
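For instance, a minimal hook illustrating the export rule (service names and values are hypothetical):

```shell
# <workspace>/.mindroom/worker-env.sh
export GIT_AUTHOR_NAME="mindroom-agent"
export DAWARICH_API_URL="https://dawarich.example.invalid"  # hypothetical service
SCRATCH=/tmp/scratch  # not exported: does NOT reach the tool environment
```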

**Filtering:**

`.mindroom/worker-env.sh` is sourced by bash that inherits the runner's process env, which contains tokens the runner needs to function (sandbox proxy auth, etc.). To prevent the runner from leaking its own control-plane credentials to tools, the overlay drops names in a small explicit denylist (`MINDROOM_API_KEY`, `MINDROOM_LOCAL_CLIENT_SECRET`, `MINDROOM_SANDBOX_PROXY_TOKEN`, `MINDROOM_SANDBOX_STARTUP_MANIFEST_PATH`) and any name starting with `MINDROOM_SANDBOX_`. Bash bookkeeping vars (`PWD`, `OLDPWD`, `SHLVL`, `_`, `PIPESTATUS`) are also dropped because they're noise, not values the script meant to export. After MindRoom-owned env names are reasserted, other exported values pass through, including service tokens and provider credentials you intentionally export from the hook. If you don't want a value to reach tools, don't export it.

**Limits and failure handling:**

- Script ≤ 64 KiB; stdout and stderr capture each ≤ 256 KiB; total overlay ≤ 128 KiB; per-value ≤ 32 KiB.
- Hook execution times out after 10 seconds.
- Symlinks that escape the workspace are rejected.
- Any failure (non-zero exit, timeout, escape, missing `bash`) returns the tool call as `ok: false` with `failure_kind: "tool"` and an error mentioning `.mindroom/worker-env.sh`.
- Hook failures do not poison the worker; only the requesting tool call fails.

This hook works identically for the static sidecar and dedicated Kubernetes worker backends because it runs inside the sandbox runner per request. It is not a true container startup hook — it does not change pod templates, recreate Deployments, or alter Helm values. For an example, see `docs/tools/execution-and-coding.md`.

## Credential leases

Some proxied tools need credentials (e.g., a `shell` tool that runs `git push` and needs an SSH key). Rather than giving the runner permanent access to secrets, the primary MindRoom runtime creates a **credential lease** — a short-lived, single-use token that the runner exchanges for credentials during execution.

Configure which credentials are shared via `MINDROOM_SANDBOX_CREDENTIAL_POLICY_JSON`:

```
export MINDROOM_SANDBOX_CREDENTIAL_POLICY_JSON='{"shell": ["github"], "python": ["openai"]}'
```

This shares the `github` credential service with `shell` tool calls and `openai` with `python` tool calls. Credentials are never stored in the runner — each lease is consumed on use and expires after the configured TTL.

## Security considerations

- The worker runtime never gets the primary runtime API key files, Matrix client state, or orchestrator authority.
- The sandbox token authenticates proxy traffic, so use a strong random value.
- Kubernetes dedicated workers derive per-worker runner tokens from the control-plane token.
- Credential leases are single-use by default and expire after 60 seconds.
- The worker container `securityContext` drops all capabilities and disables privilege escalation.
- With `workerBackend: static_runner`, the Kubernetes sidecar uses `emptyDir` scratch space and shares access to the same agent storage directories as the main process.
- With `workerBackend: kubernetes`, dedicated workers for `shared`, `user_agent`, and unscoped execution only mount their own agent's directory plus their worker scratch space. `user` mode intentionally mounts the broader `agents/` tree since it shares one runtime across agents.
- The primary MindRoom runtime does not mount the sandbox-runner router, so `/api/sandbox-runner/` exists only in runner or dedicated worker processes.

### Sandbox-runner API endpoints

These endpoints are served by the sandbox-runner process, not the primary MindRoom runtime. All requests require the runner's `MINDROOM_SANDBOX_PROXY_TOKEN` in the `x-mindroom-sandbox-token` header.

| Method | Endpoint                              | Description                                                     |
| ------ | ------------------------------------- | --------------------------------------------------------------- |
| POST   | `/api/sandbox-runner/leases`          | Create a one-time credential lease for an upcoming tool call    |
| POST   | `/api/sandbox-runner/execute`         | Execute a tool call with optional credential override via lease |
| GET    | `/api/sandbox-runner/workers`         | List known workers with lifecycle metadata                      |
| POST   | `/api/sandbox-runner/workers/cleanup` | Mark idle workers for cleanup without deleting persisted state  |

Credential leases are single-use: once consumed by an `/execute` call, the lease cannot be replayed.
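A call against these endpoints might be built like the following sketch. The header name and paths are documented above; the request body fields (here an empty JSON object) are illustrative assumptions:

```python
import json
import urllib.request


def runner_request(base_url, token, path, payload):
    """Build an authenticated POST request to the sandbox-runner API."""
    return urllib.request.Request(
        f"{base_url}{path}",
        data=json.dumps(payload).encode(),
        headers={
            "x-mindroom-sandbox-token": token,  # required on every request
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = runner_request(
    "http://sandbox-runner:8766",
    "your-secret-token",
    "/api/sandbox-runner/workers/cleanup",
    {},
)
# urllib stores header names capitalized
assert req.get_header("X-mindroom-sandbox-token") == "your-secret-token"
```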

## Per-agent configuration

MindRoom owns the default local-versus-worker routing policy. You can override which tools are routed through the sandbox proxy per agent (or set a default for all agents) in `config.yaml`.

Per-agent tool config overrides (inline `shell: {extra_env_passthrough: "DAWARICH_*"}` syntax in agent `tools` lists) are threaded through the sandbox proxy so workers receive the merged overrides alongside credentials and runtime overrides. See [Per-Agent Tool Configuration](https://docs.mindroom.chat/configuration/agents/#per-agent-tool-configuration) for the full syntax.

```
defaults:
  worker_tools: [shell, file]        # route shell+file through the sandbox proxy for all agents by default

agents:
  code:
    tools: [file, shell, calculator]
    # inherits worker_tools from defaults → shell and file proxied

  research:
    tools: [web_search, calculator]
    worker_tools: []                 # explicitly no proxying

  untrusted:
    tools: [shell, file, python]
    worker_tools: [shell, file, python]   # proxy everything
```

The `worker_tools` field has three states:

| Value               | Behavior                                                                                                                                                  |
| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `null` (omitted)    | Use MindRoom's built-in default routing policy. Today that defaults to `coding`, `file`, `python`, and `shell` when those tools are enabled for the agent |
| `[]` (empty list)   | Explicitly disable sandbox proxying for this agent                                                                                                        |
| `["shell", "file"]` | Proxy exactly these tools for this agent                                                                                                                  |

Agent-level `worker_tools` overrides `defaults.worker_tools`. With `MINDROOM_WORKER_BACKEND=static_runner`, a sandbox proxy URL (`MINDROOM_SANDBOX_PROXY_URL`) must still be configured for proxying to take effect. With `MINDROOM_WORKER_BACKEND=kubernetes`, worker endpoints are resolved dynamically and `MINDROOM_SANDBOX_PROXY_URL` is not used.
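For the static-runner case, a minimal environment sketch looks like the following (the URL and token are placeholders, not defaults; adjust them to your deployment):

```
# Placeholder values for static-runner proxying
export MINDROOM_WORKER_BACKEND=static_runner
export MINDROOM_SANDBOX_PROXY_URL=http://localhost:8766
export MINDROOM_SANDBOX_PROXY_TOKEN=replace-me
```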

## Worker Scope

`worker_tools` controls which tools run in the sandbox proxy. `worker_scope` controls how those sandbox runtimes are shared between calls. Some credential-backed tools always stay local regardless of `worker_tools`: `gmail`, `google_calendar`, `google_drive`, `google_sheets`, and `homeassistant`. Additionally, `spotify` is a shared-only integration: it requires `worker_scope` to be unset or `shared`, but it can still be proxied through the sandbox.

You can set `worker_scope` per agent or in `defaults`:

```
defaults:
  worker_tools: [shell, file]
  worker_scope: user_agent

agents:
  code:
    tools: [shell, file]
    # inherits worker_scope=user_agent

  reviewer:
    tools: [shell, file]
    worker_scope: shared

  bridge_helper:
    tools: [shell]
    worker_scope: user
```

The supported values are:

| Value        | Behavior                                               |
| ------------ | ------------------------------------------------------ |
| `shared`     | One runtime per agent, shared by all users             |
| `user`       | One runtime per user, shared across that user's agents |
| `user_agent` | One runtime per user+agent pair                        |

If `worker_scope` is unset, proxied tools still run in the sandbox, but each call gets a fresh runtime instead of a persistent one.

**Important notes:**

- `worker_scope` does **not** change where agent data is stored. All scopes read and write the same agent storage directory (`agents/<name>/`).
- The dashboard's generic credential forms only work for unscoped agents and agents with `worker_scope=shared`. OAuth providers that support scoped dashboard flows, such as the Google Drive, Gmail, Calendar, and Sheets providers, are the exception. For those providers, the dashboard can connect scoped `user` and `user_agent` credentials, but the Google tools still execute in the primary MindRoom runtime. Tools without a scoped OAuth provider still manage `user` and `user_agent` credentials through their worker runtime.
- `user` mode shares one runtime across multiple agents for a single user, so agents in that runtime can access each other's files. Use `user_agent` for per-agent isolation.

## Without configured worker routing

With `MINDROOM_WORKER_BACKEND=static_runner` and no `MINDROOM_SANDBOX_PROXY_URL`, tool calls execute directly in the primary MindRoom runtime process. This is fine for development but not recommended for production deployments where agents run untrusted code. With `MINDROOM_WORKER_BACKEND=kubernetes`, worker-routed tool calls fail closed when the backend is misconfigured instead of silently running locally.

# Kubernetes Deployment

Deploy MindRoom on Kubernetes for production multi-tenant deployments.

## Architecture

MindRoom uses three Helm charts:

- **Instance Chart** (`cluster/k8s/instance/`) - Individual MindRoom runtime with bundled dashboard/API plus Matrix/Synapse
- **Platform Chart** (`cluster/k8s/platform/`) - SaaS control plane (API, frontend, provisioner)
- **Runtime Chart** (`cluster/k8s/runtime/`) - MindRoom runtime only, for clusters that provide Matrix, storage, secrets, ingress, and platform services externally

## Prerequisites

- Kubernetes cluster (tested with k3s via kube-hetzner)
- kubectl and helm installed
- NGINX Ingress Controller
- cert-manager (for TLS certificates)

## Instance Deployment

### Via Provisioner API (Recommended)

```
export KUBECONFIG=./cluster/terraform/terraform-k8s/mindroom-k8s_kubeconfig.yaml

# Provision, check status, view logs
./cluster/scripts/mindroom-cli.sh provision 1
./cluster/scripts/mindroom-cli.sh status
./cluster/scripts/mindroom-cli.sh logs 1
```

### Direct Helm Installation

For debugging only:

```
helm upgrade --install instance-1 ./cluster/k8s/instance \
  --namespace mindroom-instances \
  --create-namespace \
  --set customer=1 \
  --set accountId="your-account-uuid" \
  --set baseDomain=mindroom.chat \
  --set anthropic_key="your-key" \
  --set openrouter_key="your-key" \
  --set supabaseUrl="https://your-project.supabase.co" \
  --set supabaseAnonKey="your-anon-key" \
  --set supabaseServiceKey="your-service-key"
```

Only enable trusted upstream auth when the instance is behind a verified access layer that strips client-supplied copies of those headers and injects authenticated values itself:

```
helm upgrade --install instance-1 ./cluster/k8s/instance \
  --namespace mindroom-instances \
  --reuse-values \
  --set-string trustedUpstreamAuth.enabled=true \
  --set trustedUpstreamAuth.userIdHeader=X-MindRoom-User-Id \
  --set trustedUpstreamAuth.emailHeader=X-MindRoom-User-Email \
  --set trustedUpstreamAuth.matrixUserIdHeader=X-MindRoom-Matrix-User-Id \
  --set trustedUpstreamAuth.emailToMatrixUserIdTemplate='@{localpart}:example.org'
```

When using the provisioner, configure the platform chart with `provisioner.trustedUpstreamAuth.enabled="true"` and the matching `provisioner.trustedUpstreamAuth.*Header` values. If your access layer cannot supply a Matrix ID header, configure `provisioner.trustedUpstreamAuth.emailToMatrixUserIdTemplate` with the same template. The email-to-Matrix template must contain exactly one `{localpart}` placeholder and requires the matching `emailHeader` value in both the instance and platform chart configuration.
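As a sketch, the matching platform chart values might look like this (the key names come from the text above; the exact nesting is an assumption about your values file layout):

```
provisioner:
  trustedUpstreamAuth:
    enabled: "true"
    userIdHeader: X-MindRoom-User-Id
    emailHeader: X-MindRoom-User-Email
    matrixUserIdHeader: X-MindRoom-Matrix-User-Id
    emailToMatrixUserIdTemplate: "@{localpart}:example.org"
```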

## Runtime-Only Deployment

Use the runtime chart when you already operate the surrounding platform and only want Kubernetes to run the MindRoom runtime.

The chart intentionally does not create Matrix, ingress, a model gateway, or platform services.

```
helm upgrade --install mindroom-runtime ./cluster/k8s/runtime \
  --namespace mindroom \
  --create-namespace \
  -f runtime-values.yaml
```

Typical production values point at existing resources:

```
config:
  create: false
  existingConfigMap: mindroom-config
  key: config.yaml

storage:
  create: false
  existingClaim: mindroom-data

matrix:
  homeserverUrl: http://matrix.example.svc.cluster.local:8008
  serverName: example.com
  registrationToken:
    existingSecret: mindroom-secrets
    key: MATRIX_REGISTRATION_TOKEN

env:
  envFrom:
    - secretRef:
        name: mindroom-secrets

workers:
  backend: kubernetes
  sandbox:
    proxyToken:
      existingSecret: mindroom-sandbox-proxy
      key: MINDROOM_SANDBOX_PROXY_TOKEN
```

See `cluster/k8s/runtime/README.md` and `cluster/k8s/runtime/values.yaml` for the full values surface.

## Worker Backends

The instance and runtime charts support two worker backend modes for worker-routed tools such as `shell`, `file`, and `python`. The dedicated-worker provisioning flow is implemented today, and both modes store agent data in the same per-agent directory structure.

| Helm value                     | Behavior                                                            | Best for                                                                              |
| ------------------------------ | ------------------------------------------------------------------- | ------------------------------------------------------------------------------------- |
| `workerBackend: static_runner` | Runs one shared sandbox-runner sidecar inside the main MindRoom pod | Simpler deployments                                                                   |
| `workerBackend: kubernetes`    | Creates dedicated worker Deployments and Services on demand         | Stronger runtime isolation per agent (filesystem isolation depends on `worker_scope`) |

### Shared Sidecar Mode

`workerBackend: static_runner` is the default. The primary runtime talks to a shared sidecar over `localhost`. This keeps the deployment simple, but all proxied tool calls share the same runner process. The runner reads and writes the same agent storage directories as the main process.

### Dedicated Worker Mode

`workerBackend: kubernetes` enables the built-in Kubernetes worker backend. The primary runtime creates worker Deployments and Services on demand and routes tool calls to the matching worker. Each worker pod runs the sandbox-runner app and accesses the same agent storage directory as every other runtime for that agent, while worker-local files (caches, virtualenvs, metadata) are kept separate per worker. When a worker is idle, its Deployment scales to zero, but agent data and worker caches are preserved.

Worker auth tokens are stored differently per chart. The runtime chart stores derived worker tokens as per-worker keys in one chart-created worker-auth Secret when workers run in the release namespace; if `workers.kubernetes.namespace` is set to a separate worker namespace, the runtime chart can instead manage per-worker auth Secrets in that namespace. The hosted instance chart stores derived worker tokens as per-worker keys in a pre-created tenant auth Secret, and its worker-manager Role does not grant broad Secret API access in the shared `mindroom-instances` namespace.

> [!WARNING] **Filesystem isolation depends on `worker_scope`.** With `shared`, `user_agent`, or unscoped execution, each worker can only see its own agent's storage directory — this is the strongest isolation available. With `user`, the worker can see all agents' storage because it shares one runtime across multiple agents for a single user. Use `user_agent` for per-agent filesystem isolation.

Typical Helm values look like:

```
workerBackend: kubernetes
workerCleanupIntervalSeconds: 30
storageAccessMode: ReadWriteMany
controlPlaneNodeName: ""
kubernetesWorkerImage: ""
kubernetesWorkerImagePullPolicy: ""
kubernetesWorkerServiceAccountName: ""
kubernetesWorkerNamePrefix: "mindroom-worker"
kubernetesWorkerStorageSubpathPrefix: "workers"
kubernetesWorkerPort: 8766
kubernetesWorkerReadyTimeoutSeconds: 60
kubernetesWorkerIdleTimeoutSeconds: 1800
sandbox_proxy_token: "replace-me"
```

The runtime chart exposes the same concepts under the nested `workers.*` values.

Important behavior and constraints:

- `kubernetesWorkerImage` and `kubernetesWorkerImagePullPolicy` default to the main MindRoom image settings when left empty.
- `workerCleanupIntervalSeconds` controls how often the primary runtime runs idle-worker cleanup.
- `kubernetesWorkerIdleTimeoutSeconds` controls when a worker is considered idle and eligible to scale down.
- `kubernetesWorkerReadyTimeoutSeconds` controls how long the primary runtime waits for a worker Deployment to become ready.
- `kubernetesWorkerPort` is the internal Service and container port used by dedicated workers.
- Dedicated workers need access to the shared instance PVC so they can reach agent storage directories.
- For `shared`, `user_agent`, and unscoped execution, mounts are narrowed to just the target agent's directory plus the worker's scratch space.
- Shared credentials are copied into each dedicated worker as needed instead of exposing the whole shared credentials directory inside agent-isolated pods.
- Dedicated workers start with no shared credentials by default.
- Only services listed in `defaults.worker_grantable_credentials` are available inside a dedicated worker.
- `google_vertex_adc` is intentionally unsupported for dedicated workers because workers do not receive ADC files or `GOOGLE_APPLICATION_CREDENTIALS`; keep Vertex ADC usage in the primary runtime.
- Dedicated worker runtime env stays deny-by-default for provider and arbitrary `.env` values, while basic runtime plumbing such as `PATH`, `VIRTUAL_ENV`, and linker vars is set separately.
- This matches the broader sandbox-proxy contract for `python` and `shell`: proxied execution is intentionally stricter than direct local execution and does not inherit ordinary runtime `.env` or provider env by default.
- For agent-editable per-workspace env (extra PATH entries, package indexes, npm cache dirs, etc.), use the request-time `.mindroom/worker-env.sh` overlay documented in [Sandbox Proxy Isolation](https://docs.mindroom.chat/deployment/sandbox-proxy/#workspace-env-hook-mindroomworker-envsh). The overlay is sourced inside the running worker per request, so it does not change the worker Deployment, the startup manifest, the pod-template hash, or any Helm value, and does not require a worker restart when edited.
- MindRoom-owned workspace identity, cache, and virtualenv env names remain controlled by the worker runtime and cannot be redirected by `.mindroom/worker-env.sh`: `HOME`, `MINDROOM_AGENT_WORKSPACE`, `XDG_CONFIG_HOME`, `XDG_DATA_HOME`, `XDG_STATE_HOME`, `XDG_CACHE_HOME`, `PIP_CACHE_DIR`, `UV_CACHE_DIR`, `PYTHONPYCACHEPREFIX`, and `VIRTUAL_ENV`.
- Worker-local caches may still live under `kubernetesWorkerStorageSubpathPrefix/<worker-dir>/`.
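For example, a hypothetical `.mindroom/worker-env.sh` overlay adding extra PATH entries, a private package index, and an npm cache directory (all values are placeholders) could be:

```
# Hypothetical per-workspace overlay; sourced inside the running worker per request
export PATH="$HOME/.local/bin:$PATH"
export PIP_INDEX_URL="https://pypi.example.org/simple"
export NPM_CONFIG_CACHE="$HOME/.cache/npm"
```

Because the overlay is sourced at request time, editing it takes effect on the next tool call without touching the worker Deployment.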

### Storage Requirements

Dedicated workers need access to the same PVC as the primary runtime. Set `storageAccessMode: ReadWriteMany` so multiple workers can access agent storage concurrently. If your storage class only supports `ReadWriteOnce`, set `controlPlaneNodeName` so the control plane and dedicated workers stay on the same node. The chart enforces this constraint during template rendering.
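For an RWO-only storage class, a values sketch (the node name is a placeholder) pins the control plane and workers to one node:

```
workerBackend: kubernetes
storageAccessMode: ReadWriteOnce
controlPlaneNodeName: worker-node-1
```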

### RBAC And Network Policy

When `workerBackend: kubernetes` is enabled, the chart creates:

- A worker-manager ServiceAccount for the primary runtime.
- A Role and RoleBinding that allow managing worker Deployments and Services in the instance namespace.
- In the runtime chart's default same-namespace mode, a chart-created worker-auth Secret plus narrow `get` and `patch` access to only that Secret.
- In the runtime chart's explicit separate worker namespace mode, Secret CRUD for per-worker auth Secrets in that worker namespace.
- In the hosted instance chart, a pre-created tenant worker-auth Secret plus narrow `get` and `patch` access to only that Secret.
- NetworkPolicy rules that allow the primary runtime to reach the internal worker port while denying worker-to-worker runner ingress.

### Operations

The authenticated dashboard API exposes `/api/workers` to list active or idle workers and `/api/workers/cleanup` to trigger cleanup manually. Dedicated workers are internal-only cluster Services and are authenticated with per-worker runner tokens derived from the primary runtime's `sandbox_proxy_token`. See [Sandbox Proxy Isolation](https://docs.mindroom.chat/deployment/sandbox-proxy/index.md) for the execution model, credential leases, and non-Kubernetes deployment modes.

## Secrets Management

API keys are mounted as files at `/etc/secrets/` (not environment variables). MindRoom reads paths from `*_API_KEY_FILE` environment variables:

```
env:
  - name: ANTHROPIC_API_KEY_FILE
    value: "/etc/secrets/anthropic_key"
  - name: OPENROUTER_API_KEY_FILE
    value: "/etc/secrets/openrouter_key"
```
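To sketch the mechanism locally (paths and key value are placeholders, not the real cluster mount):

```
# Simulate the file-mounted secret layout
mkdir -p /tmp/etc-secrets
printf 'sk-demo' > /tmp/etc-secrets/anthropic_key
export ANTHROPIC_API_KEY_FILE=/tmp/etc-secrets/anthropic_key
# MindRoom resolves the key by reading the file this variable points at
cat "$ANTHROPIC_API_KEY_FILE"
```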

## Ingress

Each instance gets three hosts:

- `{customer}.{baseDomain}` - MindRoom dashboard and API
- `{customer}.api.{baseDomain}` - Direct API access
- `{customer}.matrix.{baseDomain}` - Matrix/Synapse server

## Platform Deployment

```
# Create values file from example
cp cluster/k8s/platform/values-staging.example.yaml cluster/k8s/platform/values-staging.yaml
# Edit with your configuration

helm upgrade --install platform ./cluster/k8s/platform \
  -f ./cluster/k8s/platform/values-staging.yaml \
  --namespace mindroom-staging
```

The namespace must match `mindroom-{environment}` where `environment` is set in values.

Platform ingress hosts:

- `app.{domain}` - Platform frontend
- `api.{domain}` - Platform backend API
- `webhooks.{domain}/stripe` - Stripe webhooks

## Local Development with Kind

```
just cluster-kind-fresh              # Start cluster with everything
just cluster-kind-port-frontend      # http://localhost:3000
just cluster-kind-port-backend       # http://localhost:8000
just cluster-kind-down               # Clean up
```

See `cluster/k8s/kind/README.md` for details.

## CLI Helper

```
./cluster/scripts/mindroom-cli.sh list              # List instances
./cluster/scripts/mindroom-cli.sh status            # Overall status
./cluster/scripts/mindroom-cli.sh logs <id>         # View logs
./cluster/scripts/mindroom-cli.sh provision <id>    # Create instance
./cluster/scripts/mindroom-cli.sh deprovision <id>  # Remove instance
./cluster/scripts/mindroom-cli.sh upgrade <id>      # Upgrade instance
```

Reads configuration from `saas-platform/.env`.

## Provisioner API

All endpoints require a bearer token (`PROVISIONER_API_KEY`).

| Endpoint                           | Method | Description                        |
| ---------------------------------- | ------ | ---------------------------------- |
| `/system/provision`                | POST   | Create or re-provision an instance |
| `/system/instances/{id}/start`     | POST   | Start a stopped instance           |
| `/system/instances/{id}/stop`      | POST   | Stop a running instance            |
| `/system/instances/{id}/restart`   | POST   | Restart an instance                |
| `/system/instances/{id}/uninstall` | DELETE | Remove an instance                 |
| `/system/sync-instances`           | POST   | Sync states between DB and K8s     |

Example provision request:

```
curl -X POST "https://api.mindroom.chat/system/provision" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $PROVISIONER_API_KEY" \
  -d '{"account_id": "uuid", "subscription_id": "sub-123", "tier": "starter"}'
```

The provisioner creates the namespace, generates URLs, deploys via Helm, and updates status in Supabase.

## Deployment Scripts

```
cd saas-platform
./deploy.sh platform-frontend          # Deploy platform frontend
./deploy.sh platform-backend           # Deploy platform backend
./redeploy-mindroom.sh                 # Redeploy all customer MindRoom instances
```

## Multi-Tenant Architecture

Each customer instance gets:

- Separate Kubernetes deployment in `mindroom-instances` namespace
- Isolated PersistentVolumeClaim for data
- Own Matrix/Synapse server (SQLite)
- Independent ConfigMap configuration
- Dedicated ingress routes

Platform services run in `mindroom-{environment}` namespace.

# CLI Reference

MindRoom provides a command-line interface for managing agents.

## Basic Usage

```
mindroom [OPTIONS] COMMAND [ARGS]...
```

## Commands

```
 Usage: root [OPTIONS] COMMAND [ARGS]...

 AI agents that live in Matrix and work everywhere via bridges.

 Quick start:
 mindroom config init   Create a starter config
 mindroom run           Start the system

╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
│ --install-completion            Install completion for the current shell.              │
│ --show-completion               Show completion for the current shell, to copy it or   │
│                                 customize the installation.                            │
│ --help                -h        Show this message and exit.                            │
╰────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────────────────────╮
│ version             Show the current version of Mindroom.                              │
│ run                 Run the mindroom multi-agent system.                               │
│ doctor              Check your environment for common issues.                          │
│ connect             Pair this local MindRoom install with the hosted provisioning      │
│                     service.                                                           │
│ local-stack-setup   Start local Synapse + MindRoom Cinny using Docker only.            │
│ config              Manage MindRoom configuration files.                               │
│ avatars             Generate and sync managed avatar assets.                           │
╰────────────────────────────────────────────────────────────────────────────────────────╯
```

## version

Show the current MindRoom version.

```
 Usage: root version [OPTIONS]

 Show the current version of Mindroom.

╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
│ --help  -h        Show this message and exit.                                          │
╰────────────────────────────────────────────────────────────────────────────────────────╯
```

## run

Start MindRoom with your configuration.

```
 Usage: root run [OPTIONS]

 Run the mindroom multi-agent system.

 This command starts the multi-agent bot system which automatically:
 - Creates all necessary user and agent accounts
 - Creates all rooms defined in config.yaml
 - Manages agent room memberships
 - Starts the bundled dashboard/API server (disable with --no-api)

╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
│ --log-level     -l              TEXT     Set the logging level (DEBUG, INFO, WARNING,  │
│                                          ERROR)                                        │
│                                          [env var: LOG_LEVEL]                          │
│                                          [default: INFO]                               │
│ --storage-path  -s              PATH     Base directory for persistent MindRoom data   │
│                                          (state, sessions, tracking)                   │
│ --api               --no-api             Start the bundled dashboard/API server        │
│                                          alongside the bot                             │
│                                          [default: api]                                │
│ --api-port                      INTEGER  Port for the bundled dashboard/API server     │
│                                          [default: 8765]                               │
│ --api-host                      TEXT     Host for the bundled dashboard/API server     │
│                                          [default: 0.0.0.0]                            │
│ --help          -h                       Show this message and exit.                   │
╰────────────────────────────────────────────────────────────────────────────────────────╯
```

## avatars

Generate and sync managed avatar assets.

```
 Usage: root avatars [OPTIONS] COMMAND [ARGS]...

 Generate and sync managed avatar assets.

╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
│ --help  -h        Show this message and exit.                                          │
╰────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────────────────────╮
│ generate   Generate missing managed avatar files in the workspace.                     │
│ sync       Sync configured room and root-space avatars to Matrix using the initialized │
│            router account.                                                             │
╰────────────────────────────────────────────────────────────────────────────────────────╯
```

## avatars generate

Generate missing managed avatar files in the workspace. In a source checkout, generated files are written under `./avatars/`. In containerized deployments, generated overrides are written under the persistent MindRoom storage path. Existing managed files are skipped by default. Use `--force` to overwrite them after changing avatar prompts or styles.

```
 Usage: root avatars generate [OPTIONS]

 Generate missing managed avatar files in the workspace.

╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
│ --force            Overwrite existing managed workspace avatar files.                  │
│ --help   -h        Show this message and exit.                                         │
╰────────────────────────────────────────────────────────────────────────────────────────╯
```

## avatars sync

Sync configured room and root-space avatars to Matrix using the initialized router account. Existing Matrix avatars are skipped by default. Use `--force` to replace them.

```
 Usage: root avatars sync [OPTIONS]

 Sync configured room and root-space avatars to Matrix using the initialized router
 account.

╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
│ --force            Replace existing Matrix room and root-space avatars.                │
│ --help   -h        Show this message and exit.                                         │
╰────────────────────────────────────────────────────────────────────────────────────────╯
```

## doctor

Check your environment for common issues before running `mindroom run`.

Runs a series of checks in one pass:

- **Config file** exists and is valid YAML with correct Pydantic schema
- **Providers** — validates API keys for each configured provider (Anthropic, OpenAI, Ollama, Vertex AI Claude, etc.)
- **Memory config** — checks memory LLM and embedder reachability (Ollama, OpenAI embeddings, sentence-transformers)
- **Matrix homeserver** — verifies the homeserver is reachable via `/_matrix/client/versions`
- **Storage** — confirms the storage directory is writable

```
 Usage: root doctor [OPTIONS]

 Check your environment for common issues.

 Runs connectivity, configuration, and credential checks in a single pass
 so you can fix everything before running `mindroom run`.

╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
│ --help  -h        Show this message and exit.                                          │
╰────────────────────────────────────────────────────────────────────────────────────────╯
```

## config

Manage MindRoom configuration files. The `config` subgroup contains commands for creating, viewing, editing, and validating your `config.yaml`.

```
 Usage: root config [OPTIONS] COMMAND [ARGS]...

 Manage MindRoom configuration files.

╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
│ --help  -h        Show this message and exit.                                          │
╰────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────────────────────╮
│ init       Create a starter config.yaml with example agents and models.                │
│ show       Display the current config file with syntax highlighting.                   │
│ edit       Open config.yaml in your default editor.                                    │
│ validate   Validate config.yaml and check for common issues.                           │
│ path       Show the resolved config file path and search locations.                    │
╰────────────────────────────────────────────────────────────────────────────────────────╯
```

### config init

Create a starter `config.yaml` with example agents, models, and sensible defaults.

Profiles control the template style:

- `--profile full` (default) — rich example config with interactive provider selection
- `--profile minimal` — bare-minimum config
- `--profile public` — hosted Matrix (`mindroom.chat`) with prefilled homeserver settings
- `--profile public-codex` — hosted Matrix with Codex CLI subscription defaults
- `--profile public-vertexai-anthropic` — hosted Matrix with Vertex AI Claude defaults

Provider presets (`--provider`) set the default model: `anthropic`, `codex`, `openai`, `openrouter`, or `vertexai_claude`.

```
# Hosted Matrix quickstart (creates ~/.mindroom/config.yaml)
mindroom config init --profile public

# Minimal config with Anthropic
mindroom config init --profile minimal --provider anthropic

# Hosted Matrix with Codex CLI ChatGPT subscription auth
mindroom config init --profile public-codex

# Hosted Matrix with Vertex AI Claude
mindroom config init --profile public-vertexai-anthropic

# Force overwrite existing config
mindroom config init --force
```

The `public-codex` profile and `--provider codex` preset generate `provider: codex` with `id: gpt-5.5` and `context_window: 258000`. They set `extra_kwargs.reasoning_effort: medium`. Prompt caching is enabled automatically per active agent session; leave `prompt_cache_key` unset unless you intentionally want to override the derived key. Run `codex login` first so MindRoom can read `~/.codex/auth.json`.
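Assuming the usual `models` section layout in `config.yaml` (the wrapper keys here are illustrative, not generated verbatim), the resulting model entry looks roughly like:

```
models:
  default:
    provider: codex
    id: gpt-5.5
    context_window: 258000
    extra_kwargs:
      reasoning_effort: medium
```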

### config show

Display the current config file with syntax highlighting.

```
# Show config with syntax highlighting
mindroom config show

# Print raw YAML (useful for piping)
mindroom config show --raw

# Show config at a specific path
mindroom config show --path /custom/path/config.yaml
```

### config edit

Open `config.yaml` in your default editor. Editor preference: `$EDITOR` → `$VISUAL` → `nano` → `vim` → `vi`.

```
mindroom config edit
```

### config validate

Validate `config.yaml` and check for common issues. Parses the YAML config using Pydantic and reports errors in a friendly format. Also checks whether required API keys are set as environment variables.

```
mindroom config validate
```

### config path

Show the resolved config file path and all search locations.

```
mindroom config path
```

## connect

Pair this local MindRoom install with a provisioning service.

The default provisioning URL is `https://mindroom.chat` unless you override it with `--provisioning-url` or `MINDROOM_PROVISIONING_URL`.

```
mindroom connect --pair-code ABCD-EFGH
```

On success (default `--persist-env`), this writes to `.env` next to `config.yaml`:

- `MINDROOM_PROVISIONING_URL`
- `MINDROOM_LOCAL_CLIENT_ID`
- `MINDROOM_LOCAL_CLIENT_SECRET`
- `MINDROOM_NAMESPACE`
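As a sketch, the written `.env` might look like this (all values are placeholders):

```
MINDROOM_PROVISIONING_URL=https://mindroom.chat
MINDROOM_LOCAL_CLIENT_ID=client-0123abcd
MINDROOM_LOCAL_CLIENT_SECRET=secret-placeholder
MINDROOM_NAMESPACE=my-namespace
```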

If your config still contains the owner placeholder token `__MINDROOM_OWNER_USER_ID_FROM_PAIRING__`, `connect` will auto-replace it when pairing returns a valid `owner_user_id`.

Use `--no-persist-env` if you want to export variables only for the current shell session.

```
mindroom connect --pair-code ABCD-EFGH --no-persist-env
```

Use `--provisioning-url` for non-default deployments:

```
mindroom connect \
  --pair-code ABCD-EFGH \
  --provisioning-url https://matrix.example.com
```

## local-stack-setup

Start local Synapse and the MindRoom Cinny client container for development.

By default this command also writes `MATRIX_HOMESERVER`, `MATRIX_SERVER_NAME`, and `MATRIX_SSL_VERIFY=false` into `.env` next to your active `config.yaml` so `mindroom run` works without inline env exports.

```
 Usage: root local-stack-setup [OPTIONS]

 Start local Synapse + MindRoom Cinny using Docker only.

╭─ Options ──────────────────────────────────────────────────────────────────────────────╮
│ --synapse-dir                                 PATH                 Directory           │
│                                                                    containing Synapse  │
│                                                                    docker-compose.yml  │
│                                                                    (from               │
│                                                                    mindroom-stack      │
│                                                                    settings).          │
│                                                                    [default:           │
│                                                                    local/matrix]       │
│ --homeserver-url                              TEXT                 Homeserver URL that │
│                                                                    Cinny and MindRoom  │
│                                                                    should use.         │
│                                                                    [default:           │
│                                                                    http://localhost:8… │
│ --server-name                                 TEXT                 Matrix server name  │
│                                                                    (default: inferred  │
│                                                                    from                │
│                                                                    --homeserver-url    │
│                                                                    hostname).          │
│ --cinny-port                                  INTEGER RANGE        Local host port for │
│                                               [1<=x<=65535]        the MindRoom Cinny  │
│                                                                    container.          │
│                                                                    [default: 8080]     │
│ --cinny-image                                 TEXT                 Docker image for    │
│                                                                    MindRoom Cinny.     │
│                                                                    [default:           │
│                                                                    ghcr.io/mindroom-a… │
│ --cinny-container-n…                          TEXT                 Container name for  │
│                                                                    MindRoom Cinny.     │
│                                                                    [default:           │
│                                                                    mindroom-cinny-loc… │
│ --skip-synapse                                                     Skip starting       │
│                                                                    Synapse (assume it  │
│                                                                    is already          │
│                                                                    running).           │
│ --persist-env             --no-persist-env                         Persist Matrix      │
│                                                                    local dev settings  │
│                                                                    to .env next to     │
│                                                                    config.yaml.        │
│                                                                    [default:           │
│                                                                    persist-env]        │
│ --help                -h                                           Show this message   │
│                                                                    and exit.           │
╰────────────────────────────────────────────────────────────────────────────────────────╯
```

## Examples

### Basic run

```
mindroom run
```

### Debug logging

```
mindroom run --log-level DEBUG
```

To debug MindRoom internals without enabling debug logs from every dependency, keep the global level at `INFO` and set targeted logger overrides:

```
LOG_LEVEL=INFO MINDROOM_LOGGER_LEVELS="mindroom:DEBUG,httpx:WARNING,httpcore:WARNING,anthropic:INFO,nio:WARNING" mindroom run
```
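The override string is a comma-separated list of `logger:LEVEL` pairs. As an illustrative sketch (not MindRoom's actual implementation), applying such a spec with Python's standard `logging` module could look like:

```
import logging

def apply_logger_levels(spec: str) -> None:
    """Apply overrides like 'mindroom:DEBUG,httpx:WARNING' to named loggers."""
    for entry in spec.split(","):
        name, _, level = entry.strip().partition(":")
        # Look up the numeric level (e.g. logging.DEBUG) by name and set it
        logging.getLogger(name).setLevel(getattr(logging, level.upper()))

logging.basicConfig(level=logging.INFO)  # global default stays at INFO
apply_logger_levels("mindroom:DEBUG,httpx:WARNING,nio:WARNING")
```

Because the global level stays at `INFO`, only the named loggers deviate from it; everything else inherits the default.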

Matrix crypto decryption warnings from `nio.crypto` are quieted by default because missing Megolm sessions can produce bursts of warnings that are diagnostically useful but very high-volume. To inspect those warnings while debugging encryption state, explicitly restore that logger:

```
LOG_LEVEL=INFO MINDROOM_LOGGER_LEVELS="nio.crypto:WARNING" mindroom run
```

### Custom storage path

```
mindroom run --storage-path /data/mindroom
```

### Pair local install with hosted provisioning

```
mindroom connect --pair-code ABCD-EFGH
```

### Start local Synapse + Cinny (default local setup)

```
mindroom local-stack-setup --synapse-dir /path/to/mindroom-stack/local/matrix
```

### Start local stack without writing `.env`

```
mindroom local-stack-setup --no-persist-env
```

### Show version

```
mindroom version
```

### Preflight environment check

```
mindroom doctor
```

### Initialize a config

```
mindroom config init --profile public
```

### Validate your config

```
mindroom config validate
```
