Metadata-Version: 2.4
Name: hermes-memory-pgvector
Version: 0.3.0
Summary: Postgres + pgvector memory provider plugin for hermes-agent. Multi-agent storage layer with per-minion themes, async writer, no LLM in the memory hot path.
Author: Andrea Borghi
License: BSD-3-Clause
Project-URL: Homepage, https://github.com/andreab67/hermes-memory-pgvector
Project-URL: Source, https://github.com/andreab67/hermes-memory-pgvector
Project-URL: Issues, https://github.com/andreab67/hermes-memory-pgvector/issues
Project-URL: Roadmap, https://github.com/andreab67/hermes-memory-pgvector/blob/main/ROADMAP.md
Keywords: hermes-agent,memory-provider,pgvector,postgres,multi-agent,llm-memory,semantic-search
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: System Administrators
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Database
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: System :: Distributed Computing
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: psycopg[binary]<4,>=3.3.4
Requires-Dist: psycopg-pool<4,>=3.3.1
Requires-Dist: PyYAML<7,>=6.0
Provides-Extra: test
Requires-Dist: pytest<9,>=7.4; extra == "test"
Dynamic: license-file

# hermes-memory-pgvector

**Postgres + pgvector memory provider for [hermes-agent](https://github.com/NousResearch/hermes-agent).** A shared memory substrate for a fleet of cooperating hermes-agent minions — built on Postgres and a single embedding endpoint you probably already run, with no LLM in the memory hot path.

```text
each minion → X-Hermes-Session-Key: <theme>
            → hermes-agent gateway
            → pgvector plugin
                ├── memory_entries  (mirrors built-in MEMORY.md / USER.md per theme)
                └── conversations   (every substantive turn, semantically searchable)
```

## Why it exists

Existing memory providers each solve a piece of the problem; the gap for **fleet deployments** is wide:

- **Built-in `memory` tool** persists to per-host `MEMORY.md` / `USER.md`. Two minions on the same host stomp on each other; minions on different hosts have no shared substrate.
- **Honcho** offers cross-session user modelling but requires a full external service, an LLM in the memory hot path for its deriver + dialectic loops, and its own ontology layered on top of the built-in tool. In high-concurrency fleet use it produces retry storms, embedding-endpoint queue backups, and gateway↔Honcho circular dependencies.
- **Holographic** is a fine in-process fact store but uses SQLite — a poor fit for many minions writing concurrently from many hosts.
- **Other providers** (Mem0, Hindsight, OpenViking, ByteRover, RetainDB, Supermemory) all either require a paid cloud, require LLM mediation for memory ops, or both.

What was missing: a **storage layer** that gives the built-in `memory` model durable, multi-tenant, semantically-searchable backing, with no LLM in the hot path, scoped cleanly per-minion so a marketing agent's notes don't pollute a trading agent's recall. That's what this plugin provides.

## Design philosophy

1. **Storage layer, not a memory model.** The agent keeps using `memory(action='add', target='memory'|'user', …)`. We mirror those writes via `on_memory_write`. No new ontology for the agent to learn.
2. **No LLM in the memory hot path.** Embeddings are vector math, not LLM calls. There is no deriver, no dialectic, no dream cycle — the failure modes that hurt Honcho cannot occur here by construction.
3. **Per-agent themes by default, cross-theme recall on explicit demand.** Every row carries `agent_identity` (resolved from `X-Hermes-Session-Key` header, profile name, workspace, or `'default'`). Recall is scoped to the current theme unless the agent asks for `scope='all'`.
4. **Fail-soft everywhere.** Embed endpoint down → degrade to text-only writes. Async writer queue full → drop with a one-time warning. DB down → log + skip. No exception escapes into the agent loop.
5. **Admin/runtime separation.** DDL (`CREATE EXTENSION vector`, `CREATE TABLE`, `CREATE INDEX`) runs once with superuser. The runtime user has DML only on the migrated schema. `ensure_schema()` at runtime is verify-only with a clear `SchemaNotApplied` error if the operator forgot the migration.

## Features (v0.3.0)

| Hook / surface | Behavior |
|---|---|
| `initialize()` | Verifies schema, opens `psycopg_pool.ConnectionPool`, bulk-imports existing `MEMORY.md` + `USER.md` content. |
| `on_memory_write(action, target, content, meta)` | Mirrors built-in `memory` writes into `memory_entries` (add / replace / remove). |
| `sync_turn(user, assistant, session_id)` | Captures every substantive (`>= 40` chars + not boilerplate) chat turn into `conversations`. |
| `prefetch(query)` | Top-K semantically similar `memory_entries` in current theme, injected ambient. |
| `recall_memory(query, scope, target, limit)` tool | Explicit cross-theme search of durable memory entries. |
| `recall_conversation(query, scope, limit)` tool | Explicit search over past chat turns. `scope ∈ {current, session, all, <theme>}`. |

Internals:

- **`psycopg_pool.ConnectionPool`** (min=1, max=4, lazy + thread-safe) shared across the agent thread and the async-writer drain thread.
- **`AsyncWriter`** — bounded queue + daemon drain thread. Memory write hooks return in microseconds. Worker embeds + writes in the background. Crash-resilient (auto-restart on next enqueue).
- **Single migration** (`pgvector/migrations/001_schema.sql`) — `memory_entries` + `conversations` + HNSW indexes. Same tuning operators typically use elsewhere.
- **Boilerplate filter** for turn capture — length floor + acknowledgement regex (`"ok"`, `"thanks"`, `"continue"`, …) so the recall table stays high-signal.

## Multi-agent / per-minion themes

Each systemd-run minion sets one header on its OpenAI client; everything else flows automatically:

```python
client = AsyncOpenAI(
    base_url="http://127.0.0.1:8642/v1",
    api_key=API_KEY,
    default_headers={"X-Hermes-Session-Key": "marketing"},   # ← theme
)
```

The gateway plumbs `X-Hermes-Session-Key` through as `gateway_session_key=…` in `MemoryProvider.initialize` kwargs. The plugin reads it with **priority over the profile default**, so `agent_identity='default'` from unprofiled API traffic does not collapse every minion into one shared scope.

Convention: lowercase, dash-separated, stable. Examples that work well:

- `marketing`, `sales`, `morning-report`, `incident`
- `intraday-<agent_name>` for fan-out workers (e.g. `intraday-trading`, `intraday-sre`, `intraday-marketing`)

## Install

### Option 1: clone + run the installer script (recommended)

```bash
git clone https://github.com/andreab67/hermes-memory-pgvector.git
cd hermes-memory-pgvector
./scripts/install.sh
```

That:

1. `pip install`s `psycopg[binary]`, `psycopg-pool`, `PyYAML` (with the upper-bound pins).
2. Copies `pgvector/` into `$HERMES_HOME/plugins/pgvector/` (defaults to `~/.hermes/plugins/pgvector/`).
3. Prints the admin migration + activation commands you run next.

### Option 2: manual

```bash
# Python deps
pip install 'psycopg[binary]>=3.3.4,<4' 'psycopg-pool>=3.3.1,<4' 'PyYAML>=6.0,<7'

# Plugin module
mkdir -p ~/.hermes/plugins
cp -r pgvector ~/.hermes/plugins/pgvector
```

### Then (admin once)

```bash
# Apply the schema migration (CREATE EXTENSION needs superuser)
sudo -u postgres psql -d <your-memory-db> \
     -f ~/.hermes/plugins/pgvector/migrations/001_schema.sql

# Hand ownership of the new tables to the hermes runtime role
sudo -u postgres psql -d <your-memory-db> -c "
ALTER TABLE memory_entries OWNER TO hermes;
ALTER SEQUENCE memory_entries_id_seq OWNER TO hermes;
ALTER TABLE conversations OWNER TO hermes;
ALTER SEQUENCE conversations_id_seq OWNER TO hermes;
"

# Activate
hermes config set memory.provider pgvector
sudo systemctl restart hermes.service     # or however you run hermes
hermes memory status                       # expect: Provider: pgvector; Status: available
```

## Configuration

Lives in `$HERMES_HOME/config.yaml` under `plugins.pgvector` — every value optional, sensible defaults shown:

```yaml
plugins:
  pgvector:
    dsn: "dbname=hermes_memory user=hermes host=/var/run/postgresql"
    embed_url: "http://your-embed-endpoint:11434"
    embed_model: "nomic-embed-text"
    prefetch_limit: 5
    min_similarity: 0.30
    embed_on_write: true
    scope_default: "current"
    write_queue_maxsize: 256
    bulk_sync_on_init: true
    sync_turns: true
    turn_min_chars: 40
```

The embed endpoint can be any OpenAI-compatible `/v1/embeddings` or Ollama-native `/api/embed` URL that returns **768-dim vectors** (the schema is hard-coded to `vector(768)` to match `nomic-embed-text`). Use a different model only if it produces 768-dim output, or edit the migration before applying it.

## Schema

```sql
CREATE TABLE memory_entries (
  id              BIGSERIAL PRIMARY KEY,
  agent_identity  TEXT NOT NULL DEFAULT 'default',
  target          TEXT NOT NULL CHECK (target IN ('memory', 'user')),
  content         TEXT NOT NULL,
  embedding       vector(768),
  created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  metadata        JSONB NOT NULL DEFAULT '{}'::jsonb,
  UNIQUE (agent_identity, target, content)
);

CREATE TABLE conversations (
  id              BIGSERIAL PRIMARY KEY,
  session_id      TEXT NOT NULL,
  agent_identity  TEXT NOT NULL DEFAULT 'default',
  role            TEXT NOT NULL CHECK (role IN ('user','assistant','system','tool')),
  content         TEXT NOT NULL,
  ts              TIMESTAMPTZ NOT NULL DEFAULT now(),
  embedding       vector(768),
  metadata        JSONB NOT NULL DEFAULT '{}'::jsonb
);
```

Indexes: HNSW on each `embedding` column (m=16, ef_construction=64) plus per-agent + per-session btree timelines. Full DDL in [`pgvector/migrations/001_schema.sql`](pgvector/migrations/001_schema.sql).

## Tests

```bash
pip install -e ".[test]"

# Skip mode (no DB, no embed endpoint): everything skips gracefully
pytest tests/

# Live mode (against a throwaway Postgres + your embed endpoint)
export PG_TEST_DSN='dbname=hermes_test user=postgres host=/var/run/postgresql'
export PG_TEST_EMBED_URL='http://your-embed-endpoint:11434'
pytest tests/
```

DB tests skip when `PG_TEST_DSN` is unset; live embed tests skip when `PG_TEST_EMBED_URL` is unset.

## Roadmap

See [`ROADMAP.md`](ROADMAP.md) for the full milestone table. Highlights:

- **M1 (v0.1, v0.1.1)** ✅ Shared storage with per-agent themes, async writer, connection pool, bulk import from `MEMORY.md`/`USER.md`
- **M2 (v0.2)** ✅ Conversation transcript table with `sync_turn` capture + `recall_conversation` tool
- **M3 (v0.3)** ✅ Identity propagation for stateless API minions via `X-Hermes-Session-Key`
- **M4 (v0.4)** ⏳ `on_delegation()` + `on_session_end()` capture for agent-of-agents observability
- **M5 (v0.5–v0.6)** ⏳ TTL/decay, partial HNSW indexes per-theme, metrics, bulk-import CLI
- **M6 (v1.0)** ⏳ Stable config schema, full docs, CI coverage

The roadmap exists so the multi-agent positioning isn't a one-off claim — each milestone has to pass the test *"does this make N cooperating agents more capable?"* before it lands. The `What's not on the roadmap` section in `ROADMAP.md` lists what was deliberately rejected (LLM-mediated dialectic, fact-store ontologies, background derivers, in-plugin RBAC) so the boundaries are explicit.

## Rollback

```bash
hermes config set memory.provider none
sudo systemctl restart hermes.service

# Optional — drop the tables (data loss, irreversible)
sudo -u postgres psql -d <your-memory-db> -c "
DROP TABLE IF EXISTS conversations;
DROP TABLE IF EXISTS memory_entries;
"

# Optional — remove the plugin files
rm -rf ~/.hermes/plugins/pgvector
```

## Why a standalone plugin (not an upstream PR)?

Per the hermes-agent [`CONTRIBUTING.md`](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md):

> We are no longer accepting new memory providers into this repo. The set of built-in providers under `plugins/memory/` is closed. If you want to add a new memory backend, publish it as a standalone plugin repo that users install into `~/.hermes/plugins/` (or via a pip entry point).

The discovery system (`plugins/memory/__init__.py` in hermes-agent) scans `$HERMES_HOME/plugins/<name>/` for any directory whose `__init__.py` calls `register_memory_provider`. This plugin's `pgvector/__init__.py` does exactly that — no upstream change required.

## Contributing

Bug reports + PRs welcome. Open an issue describing the failure mode + your environment (hermes-agent version, Postgres version, embed endpoint), or a PR with a focused change + test.

## License

[BSD 3-Clause](LICENSE) © 2026 Andrea Borghi.
