Metadata-Version: 2.4
Name: engram-memory
Version: 0.5.0b1
Summary: Memory layer for AI agents — biologically-inspired forgetting, multi-agent trust, and plug-and-play integrations
Author: Engram Team
License: MIT
Project-URL: Homepage, https://github.com/Ashish-dwi99/Engram
Project-URL: Repository, https://github.com/Ashish-dwi99/Engram
Project-URL: Issues, https://github.com/Ashish-dwi99/Engram/issues
Project-URL: Documentation, https://github.com/Ashish-dwi99/Engram#readme
Project-URL: Changelog, https://github.com/Ashish-dwi99/Engram/blob/main/CHANGELOG.md
Keywords: memory-layer,mcp,claude,cursor,codex,ai,agents,forgetting,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.0
Requires-Dist: google-generativeai>=0.3.0
Requires-Dist: qdrant-client>=1.7.0
Requires-Dist: requests>=2.28.0
Provides-Extra: gemini
Requires-Dist: google-generativeai>=0.3.0; extra == "gemini"
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == "openai"
Provides-Extra: ollama
Requires-Dist: ollama>=0.4.0; extra == "ollama"
Provides-Extra: qdrant
Requires-Dist: qdrant-client>=1.7.0; extra == "qdrant"
Provides-Extra: mcp
Requires-Dist: mcp>=1.0.0; extra == "mcp"
Provides-Extra: api
Requires-Dist: fastapi>=0.109.0; extra == "api"
Requires-Dist: uvicorn[standard]>=0.27.0; extra == "api"
Provides-Extra: all
Requires-Dist: openai>=1.0.0; extra == "all"
Requires-Dist: ollama>=0.4.0; extra == "all"
Requires-Dist: mcp>=1.0.0; extra == "all"
Requires-Dist: fastapi>=0.109.0; extra == "all"
Requires-Dist: uvicorn[standard]>=0.27.0; extra == "all"
Requires-Dist: aiosqlite>=0.19.0; extra == "all"
Provides-Extra: async
Requires-Dist: aiosqlite>=0.19.0; extra == "async"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: aiosqlite>=0.19.0; extra == "dev"
Requires-Dist: build>=1.0.0; extra == "dev"
Requires-Dist: twine>=5.0.0; extra == "dev"
Dynamic: license-file

<h1 align="center">
  <br>
  <img src="https://img.shields.io/badge/engram-PMK-black?style=for-the-badge" alt="Engram" height="32">
  <br>
  Engram
  <br>
</h1>

<h3 align="center">
  The Personal Memory Kernel for AI Agents
</h3>

<p align="center">
  Hit a rate limit in Claude Code? Open Codex — it already knows what you were doing.<br>
  One memory kernel. Shared across every agent. Bio-inspired forgetting. Staged writes. Episodic recall.
</p>

<p align="center">
  <b>⚠ Early-stage software — not recommended for production use. APIs may change. Use at your own risk.</b>
</p>

<p align="center">
  <a href="https://pypi.org/project/engram-memory"><img src="https://img.shields.io/badge/python-3.9%2B-blue.svg" alt="Python 3.9+"></a>
  <a href="https://github.com/Ashish-dwi99/Engram/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue.svg" alt="MIT License"></a>
  <a href="https://github.com/Ashish-dwi99/Engram/actions"><img src="https://github.com/Ashish-dwi99/Engram/actions/workflows/test.yml/badge.svg" alt="Tests"></a>
  <a href="https://github.com/Ashish-dwi99/Engram"><img src="https://img.shields.io/github/stars/Ashish-dwi99/Engram?style=social" alt="GitHub Stars"></a>
</p>

<p align="center">
  <a href="#-quick-start">Quick Start</a> &middot;
  <a href="#-why-engram">Why Engram</a> &middot;
  <a href="#%EF%B8%8F-architecture">Architecture</a> &middot;
  <a href="#-integrations">Integrations</a> &middot;
  <a href="#-api--sdk">API & SDK</a> &middot;
  <a href="#-longmemeval-on-colab-gpu">LongMemEval</a> &middot;
  <a href="https://github.com/Ashish-dwi99/Engram/blob/main/CHANGELOG.md">Changelog</a>
</p>

---

## Why Engram

Every AI agent you use starts with amnesia. But the real pain isn't just forgetting — it's what happens when you **switch agents**.

You're 40 minutes into a refactor with Claude Code. You've touched six files, picked a migration strategy, mapped out the remaining TODOs. Then you hit a rate limit. Or your terminal crashes. Or you just need Codex for the next part. So you switch — and the new agent has **zero context**. You re-paste file paths, re-explain decisions, re-describe the plan. Half the time the new agent contradicts something you'd already decided.

**Engram fixes this.** It's a Personal Memory Kernel (PMK) — one memory store shared across all your agents. When Claude Code pauses, it saves a session digest. When Codex picks up, it loads that digest and continues where you left off. No re-explanation. No cold starts.

But Engram isn't just a handoff bus. It tackles four fundamental problems with how AI memory works today (the bolded rows below), plus a longer list of gaps other memory layers leave open:

| Problem | Other Memory Layers | **Engram** |
|:--------|:--------------------|:-----------|
| **Switching agents = cold start** | Manual copy/paste context | **Handoff bus — session digests, auto-resume** |
| **Nobody forgets** | Store everything forever | **Ebbinghaus decay curve, ~45% less storage** |
| **Agents write with no oversight** | Store directly | **Staging + verification + trust scoring** |
| **No episodic memory** | Vector search only | **CAST scenes (time/place/topic)** |
| Multi-modal encoding | Single embedding | **5 retrieval paths (EchoMem)** |
| Cross-agent memory sharing | Per-agent silos | **Scoped retrieval with all-but-mask privacy** |
| Reference-aware decay | No | **If other agents use it, don't delete it** |
| Knowledge graph | Sometimes | **Entity extraction + linking** |
| MCP + REST | One or the other | **Both, plug-and-play** |
| Local-first | Cloud-required | **127.0.0.1:8100 by default** |

---

## Quick Start

```bash
pip install engram-memory          # 1. Install from PyPI
export GEMINI_API_KEY="your-key"   # 2. Set one key before starting Engram
engram install                     # 3. Auto-configure Claude Code, Cursor, Codex
```

Restart your agent. Done — it now has persistent memory across sessions.

### PyPI Install Options

```bash
# Default runtime (Gemini + local Qdrant + MemoryClient deps)
pip install engram-memory

# Full stack extras (MCP server + REST API + async + all providers)
pip install "engram-memory[all]"

# OpenAI provider add-on
pip install "engram-memory[openai]"

# Ollama provider add-on
pip install "engram-memory[ollama]"
```

### API Key: When and How to Provide It

Engram reads provider credentials when a process initializes `Memory()` (for example: `engram`, `engram-api`, `engram-mcp`, or your Python app).

1. Set env vars **before** starting those processes.
2. If you change keys, restart the process.
3. Default provider is Gemini, so set `GEMINI_API_KEY` or `GOOGLE_API_KEY` unless you override provider config.

```bash
# Default (Gemini)
export GEMINI_API_KEY="your-key"
engram-api
```

```bash
# OpenAI provider
export OPENAI_API_KEY="your-key"
engram-api
```

```bash
# Ollama (local; no cloud key)
export OLLAMA_HOST="http://localhost:11434"
engram-api
```

For remote usage via `MemoryClient`, provider API keys are needed on the **server** running Engram.  
The client only needs:
- `ENGRAM_ADMIN_KEY` (or `admin_key=...`) when minting sessions via `/v1/sessions`
- Bearer session token for normal read/write API calls
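As a concrete sketch of that split, here is the remote flow over plain HTTP using `requests` (the endpoint and payload shapes mirror the curl examples in the API section below; the admin-key header name and the token response field are assumptions, so check the interactive docs at `/docs`):

```python
import requests  # already a base dependency of engram-memory

SERVER = "http://my-engram-host:8100"  # Engram server; provider keys live here

# 1) Mint a capability session. How ENGRAM_ADMIN_KEY is transmitted
#    (the header name below) is an assumption -- verify against /docs.
resp = requests.post(
    f"{SERVER}/v1/sessions",
    headers={"X-Admin-Key": "your-admin-key"},
    json={
        "user_id": "u123",
        "agent_id": "planner",
        "allowed_confidentiality_scopes": ["work"],
        "capabilities": ["search", "propose_write"],
        "ttl_minutes": 60,
    },
)
token = resp.json()["token"]  # assumed response field name

# 2) Normal read/write calls use the Bearer session token
results = requests.post(
    f"{SERVER}/v1/search",
    headers={"Authorization": f"Bearer {token}"},
    json={"query": "UI preferences", "user_id": "u123"},
).json()
```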

**Or with Docker:**

```bash
docker compose up -d               # API at http://localhost:8100
```

---

## Architecture

Engram is a **Personal Memory Kernel** — not just a vector store with an API. It has opinions about how memory should work:

1. **Switching agents shouldn't mean starting over.** When an agent pauses — rate limit, crash, tool switch — it saves a session digest. The next agent loads it and continues. Zero re-explanation.
2. **Memory has a lifecycle.** New memories start in short-term (SML), get promoted to long-term (LML) through repeated access, and fade away through Ebbinghaus decay if unused.
3. **Agents are untrusted writers.** Every write is a proposal that lands in staging. Trusted agents can auto-merge; untrusted ones wait for approval.
4. **Scoping is mandatory.** Every memory is scoped by user. Agents see only what they're allowed to — everything else gets the "all but mask" treatment (structure visible, details redacted).

```
┌─────────────────────────────────────────────────────────────────┐
│                    Agent Orchestrator                            │
│              (Claude Code / Cursor / Codex / Custom)             │
└─────────────────────────┬───────────────────────────────────────┘
                          │
              ┌───────────┴───────────┐
              │                       │
              ▼                       ▼
        ┌──────────┐           ┌──────────┐
        │   MCP    │           │   REST   │
        │  Server  │           │   API    │
        └────┬─────┘           └────┬─────┘
             └───────────┬──────────┘
                         ▼
        ┌────────────────────────────────────┐
        │         Policy Gateway             │
        │   Scopes · Masking · Quotas ·      │
        │   Capability Tokens · Trust Score  │
        └────────────────┬───────────────────┘
                         │
              ┌──────────┴──────────┐
              ▼                     ▼
   ┌──────────────────┐  ┌──────────────────┐
    │ Retrieval Engine │  │Ingestion Pipeline│
   │  ┌─────────────┐ │  │                  │
   │  │Semantic     │ │  │  Text → Views    │
   │  │(hybrid+graph│ │  │  Views → Scenes  │
   │  │+categories) │ │  │  Scenes → LML    │
   │  ├─────────────┤ │  │                  │
   │  │Episodic     │ │  └────────┬─────────┘
   │  │(CAST scenes)│ │           │
   │  └─────────────┘ │           ▼
   │                  │  ┌──────────────────┐
   │  Intersection    │  │Write Verification│
   │  Promotion:      │  │                  │
   │  match in both → │  │ Invariant checks │
   │  boost score     │  │ Conflict → stash │
   └──────────────────┘  │ Trust scoring    │
                         └────────┬─────────┘
                                  │
              ┌───────────────────┼───────────────────┐
              ▼                   ▼                   ▼
   ┌──────────────────┐  ┌──────────────┐  ┌──────────────────┐
   │  Staging (SML)   │  │ Long-Term    │  │    Indexes       │
   │  Proposals+Diffs │  │ Store (LML)  │  │ Vector + Graph   │
   │  Conflict Stash  │  │ Canonical    │  │ + Episodic       │
   └──────────────────┘  └──────────────┘  └──────────────────┘
              │                   │                   │
              └───────────────────┼───────────────────┘
                                  ▼
                       ┌──────────────────┐
                       │   FadeMem GC     │
                       │  Ref-aware decay │
                       │  If other agents │
                       │  use it → keep   │
                       └──────────────────┘
```

### The Memory Stack

Engram combines five systems, each handling a different aspect of how memory should work:

#### FadeMem — Decay & Consolidation

Memories fade based on time and access patterns, following the Ebbinghaus forgetting curve. Frequently accessed memories get promoted from short-term (SML) to long-term (LML). Unused memories weaken and eventually get forgotten. **Reference-aware:** if other agents still reference a memory, it won't be garbage collected — even if the original agent stopped using it.

```
New Memory → Short-term (SML)
                  │
                  │ Accessed frequently?
                  ▼
            ┌──────────┐
       No ← │  Decay   │ → Yes
            └──────────┘
            │           │
            ▼           ▼
       Forgotten    Promoted to Long-term (LML)
```
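
A minimal sketch of the decay math, assuming a plain exponential form of the Ebbinghaus curve (parameter names mirror `FadeMemConfig` from the Configuration section; the shipped implementation may differ):

```python
import math

# Illustrative decay math only -- parameter names mirror FadeMemConfig.
SML_DECAY_RATE = 0.15           # short-term memories fade fast
LML_DECAY_RATE = 0.02           # long-term memories fade slowly
FORGETTING_THRESHOLD = 0.1      # below this strength, a memory is dropped
PROMOTION_ACCESS_THRESHOLD = 3  # accesses needed for SML -> LML promotion

def retention(strength: float, days_idle: float, rate: float) -> float:
    """Ebbinghaus-style exponential forgetting curve."""
    return strength * math.exp(-rate * days_idle)

def next_state(layer: str, strength: float, days_idle: float,
               access_count: int) -> tuple:
    rate = SML_DECAY_RATE if layer == "sml" else LML_DECAY_RATE
    s = retention(strength, days_idle, rate)
    if s < FORGETTING_THRESHOLD:
        return ("forgotten", s)
    if layer == "sml" and access_count >= PROMOTION_ACCESS_THRESHOLD:
        return ("lml", s)  # consolidation: promote to long-term
    return (layer, s)
```

Under this toy model, a never-accessed SML memory crosses the 0.1 forgetting threshold after roughly 15 idle days (ln 10 / 0.15), while an LML memory holds on for about 115.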

#### EchoMem — Multi-Modal Encoding

Each memory is encoded through multiple retrieval paths — keywords, paraphrases, implications, and question forms. This creates 5x the retrieval surface area compared to single-embedding approaches. Important memories get deeper processing (1.6x strength multiplier).

```
Input: "User prefers TypeScript over JavaScript"
  ↓
  raw:          "User prefers TypeScript over JavaScript"
  paraphrase:   "TypeScript is the user's preferred language"
  keywords:     ["typescript", "javascript", "preference"]
  implications: ["values type safety", "modern tooling"]
  question:     "What language does the user prefer?"
```
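
To see why the extra views matter, here is a toy scoring rule (not Engram's actual ranking): a memory is retrievable through whichever of its encoded views best matches the query.

```python
import math

def cosine(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def echo_score(query_vec: list, view_vecs: list) -> float:
    # A memory is reachable through ANY of its encoded views (raw,
    # paraphrase, keywords, implications, question), so scoring against
    # the best-matching view is what widens the retrieval surface ~5x.
    return max(cosine(query_vec, v) for v in view_vecs)
```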

#### CategoryMem — Dynamic Organization

Categories aren't predefined — they emerge from content and evolve over time. As new memories arrive, the category tree grows, splits, and merges. Categories themselves decay when unused, keeping the taxonomy clean.

#### CAST Scenes — Episodic Narrative Memory

Inspired by the Contextual Associative Scene Theory of memory, Engram clusters interactions into **scenes** defined by three dimensions: time, place, and topic. Each scene has characters, a synopsis, and links to the semantic memories extracted from it.

```
Scene: "Engram v2 architecture session"
  Time:       2026-02-09 12:00–12:25
  Place:      repo:Engram (digital)
  Characters: [self, collaborator]
  Synopsis:   "Designed staged writes and scoped retrieval..."
  Views:      [view_1, view_2, view_3]
  Memories:   [mem_1, mem_2]  ← semantic facts extracted
```
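
A structural sketch of a scene record; the field names mirror the illustration above, not necessarily Engram's internal schema:

```python
from dataclasses import dataclass, field

@dataclass
class Scene:
    title: str
    time_range: tuple[str, str]           # ISO start/end
    place: str                            # e.g. "repo:Engram (digital)"
    characters: list[str]
    synopsis: str
    view_ids: list[str] = field(default_factory=list)
    memory_ids: list[str] = field(default_factory=list)  # extracted semantic facts
```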

#### Handoff Bus — Cross-Agent Continuity

When an agent pauses work — rate limit, crash, you switch tools — it saves a session digest: task summary, decisions made, files touched, remaining TODOs, blockers. The next agent calls `get_last_session` and gets the full context. No re-explanation needed.

```
Claude Code (rate limited)
  → save_session_digest(task, decisions, files, todos, blockers)
  → Session stored in handoff bus

Codex (picks up)
  → get_last_session(repo="/my-project")
  → Gets full context: task, decisions, files, todos
  → Continues where Claude Code stopped
```

---

### Key Flows

#### Read: Query → Context Packet

```
Agent calls search_memory or POST /v1/search
  → Policy Gateway enforces scope, quotas, masking
  → Dual retrieval: semantic index + episodic index (parallel)
  → Intersection promotion: results matching in both get boosted
  → Returns Context Packet (token-budgeted, with scene citations)
```

The dual retrieval approach reduces "similar but wrong time/place" errors. If a memory appears in both semantic search and the relevant episodic scene, it gets a confidence boost.
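
A schematic of that promotion step (the multiplier is illustrative, not Engram's exact scoring):

```python
# Results found by BOTH the semantic and episodic indexes get boosted.
BOOST = 1.25  # illustrative multiplier

def merge_results(semantic: dict, episodic: dict) -> list:
    """semantic/episodic map memory_id -> relevance score in [0, 1]."""
    merged = {}
    for mem_id in set(semantic) | set(episodic):
        base = max(semantic.get(mem_id, 0.0), episodic.get(mem_id, 0.0))
        if mem_id in semantic and mem_id in episodic:
            base *= BOOST  # matched in both -> higher confidence
        merged[mem_id] = base
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
```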

#### Write: Agent Proposal → Staging

```
Agent calls propose_write or POST /v1/memories
  → Lands in Staging SML as a Proposal Commit
  → Provenance recorded (agent, time, scope, trust score)
  → Verification runs:
      • Invariant contradiction check → stash if conflict
      • Duplication detection
      • PII risk detection → require manual approval if high
  → High-trust agents: auto-merge
  → Others: wait for user approval or daily digest
```

#### "All But Mask" Policy

When an agent queries data outside its scope, it sees structure but not details:

```json
{
  "type": "private_event",
  "time": "2026-02-10T17:00:00Z",
  "importance": "high",
  "details": "[REDACTED]"
}
```

Agents can still operate (scheduling, planning) without seeing secrets.
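
The masking rule itself is easy to picture: whitelist the structural fields and redact everything else. A minimal sketch (field names are illustrative, not Engram's schema):

```python
SAFE_FIELDS = {"type", "time", "importance"}

def mask_out_of_scope(record: dict) -> dict:
    """Return the record with every non-structural field redacted."""
    return {k: (v if k in SAFE_FIELDS else "[REDACTED]")
            for k, v in record.items()}

# Applied to the private event above, this yields exactly the JSON shown:
# structure (type, time, importance) survives, details do not.
```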

---

## Integrations

Engram is plug-and-play. Run `engram install` and it auto-configures everything:

### Claude Code (MCP + Plugin)

```bash
engram install    # Writes MCP config to ~/.claude.json
```

`engram install` also bootstraps workspace continuity rules (in your current
project directory) so agents call handoff tools automatically:

- `AGENTS.md`
- `CLAUDE.md`
- `CURSOR.md`
- `.cursor/rules/engram-continuity.mdc`

Set `ENGRAM_INSTALL_SKIP_WORKSPACE_RULES=1` to disable this behavior.

**MCP tools** give Claude reactive memory — it stores and retrieves when you ask.

The optional **Claude Code plugin** makes memory **proactive** — relevant context is injected automatically before Claude sees your message:

```bash
# Inside Claude Code:
/plugin install engram-memory --path ~/.engram/claude-plugin
```

What the plugin adds:

| Component | What it does |
|:----------|:-------------|
| **UserPromptSubmit hook** | Before each reply, queries Engram and injects matching memories into context. Stdlib-only, no extra deps. Under 2s latency. |
| `/engram:remember <text>` | Save a fact or preference on the spot |
| `/engram:search <query>` | Search memories by topic |
| `/engram:forget <id>` | Delete a memory (confirms before removing) |
| `/engram:status` | Show memory-store stats at a glance |
| **Skill** | Standing instructions telling Claude when to save, search, and surface memories |

**Without plugin** — Claude reacts to explicit requests:
```
You: Remember that I prefer TypeScript
Claude: [calls remember tool] Done.
```

**With plugin** — memory is proactive and invisible:
```
--- Session A ---
You: /engram:remember I prefer TypeScript for all new projects

--- Session B (new conversation, no history) ---
You: What stack should I use for the new API?
[Hook injects "TypeScript preference" before Claude sees the message]
Claude: Based on your preferences, I'd recommend TypeScript...
```
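
For the curious, such a hook is roughly shaped like this (a stdlib-only sketch; the actual hook's input fields and search payload are assumptions, so consult the plugin source):

```python
#!/usr/bin/env python3
"""Sketch of a UserPromptSubmit hook, stdlib-only like the shipped one."""
import json
import sys
import urllib.request

def main() -> None:
    event = json.load(sys.stdin)            # hook input from Claude Code
    query = event.get("prompt", "")         # assumed field name
    req = urllib.request.Request(
        "http://127.0.0.1:8100/v1/search",
        data=json.dumps({"query": query, "user_id": "default"}).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=2) as resp:  # stay under 2s
            memories = json.load(resp).get("results", [])
    except OSError:
        return  # Engram offline: inject nothing, never block the prompt
    if memories:
        print("Relevant memories:")          # stdout is injected as context
        for m in memories[:5]:
            print(f"- {m.get('content', '')}")

if __name__ == "__main__":
    main()
```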

### Cursor

`engram install` writes MCP config to `~/.cursor/mcp.json`. Restart Cursor to load.
It also sets `ENGRAM_MCP_AGENT_ID=cursor` for deterministic handoff identity.

### OpenAI Codex

`engram install` writes MCP config to `~/.codex/config.toml`. Restart Codex to load.
It also sets `ENGRAM_MCP_AGENT_ID=codex` for deterministic handoff identity.

### OpenClaw

`engram install` deploys the Engram skill to OpenClaw's skills directory.

### Any Agent Runtime

Any tool-calling agent can connect via REST:

```bash
engram-api    # Starts on http://127.0.0.1:8100
```

---

## MCP Tools

Once configured, your agent has access to these tools:

| Tool | Description |
|:-----|:------------|
| `add_memory` | Store a new memory (lands in staging by default) |
| `search_memory` | Semantic + keyword + episodic search |
| `get_all_memories` | List all stored memories for a user |
| `get_memory` | Get a specific memory by ID |
| `update_memory` | Update memory content |
| `delete_memory` | Remove a memory |
| `get_memory_stats` | Storage statistics and health |
| `apply_memory_decay` | Run the forgetting algorithm |
| `engram_context` | Session-start digest — load top memories from prior sessions |
| `remember` | Quick-save a fact (no LLM extraction, stores directly) |
| `propose_write` | Create a staged write proposal (default safe path) |
| `list_pending_commits` | Inspect staged write queue |
| `resolve_conflict` | Resolve invariant conflicts (accept proposed or keep existing) |
| `search_scenes` / `get_scene` | Episodic CAST scene retrieval with masking policy |
| `save_session_digest` | Save handoff context when pausing or switching agents |
| `get_last_session` | Load session context from the last active agent |
| `list_sessions` | Browse handoff history across agents |

---

## API & SDK

### REST API

```bash
engram-api    # http://127.0.0.1:8100
              # Interactive docs at /docs
```

```bash
# 1. Create a capability session token
curl -X POST http://localhost:8100/v1/sessions \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "u123",
    "agent_id": "planner",
    "allowed_confidentiality_scopes": ["work", "personal"],
    "capabilities": ["search", "propose_write", "read_scene"],
    "ttl_minutes": 1440
  }'

# 2. Propose a write (default: staging)
curl -X POST http://localhost:8100/v1/memories \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers dark mode", "user_id": "u123", "mode": "staging"}'

# 3. Search (returns context packet with scene citations)
curl -X POST http://localhost:8100/v1/search \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"query": "UI preferences", "user_id": "u123"}'

# 4. Review staged commits
curl "http://localhost:8100/v1/staging/commits?user_id=u123&status=PENDING"
curl -X POST http://localhost:8100/v1/staging/commits/<id>/approve

# 5. Episodic scene search
curl -X POST http://localhost:8100/v1/scenes/search \
  -H "Content-Type: application/json" \
  -d '{"query": "architecture discussion", "user_id": "u123"}'

# 6. Namespace & trust management
curl -X POST http://localhost:8100/v1/namespaces \
  -H "Content-Type: application/json" \
  -d '{"user_id": "u123", "namespace": "workbench"}'
curl "http://localhost:8100/v1/trust?user_id=u123&agent_id=planner"

# 7. Sleep-cycle maintenance
curl -X POST http://localhost:8100/v1/sleep/run \
  -H "Content-Type: application/json" \
  -d '{"user_id": "u123", "apply_decay": true, "cleanup_stale_refs": true}'

# 8. Zero-intervention handoff (session bus)
curl -X POST http://localhost:8100/v1/handoff/resume \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"user_id":"u123","agent_id":"frontend","repo_path":"/repo","objective":"continue latest task"}'

curl -X POST http://localhost:8100/v1/handoff/checkpoint \
  -H "Authorization: Bearer <TOKEN>" \
  -H "Content-Type: application/json" \
  -d '{"user_id":"u123","agent_id":"frontend","repo_path":"/repo","task_summary":"implemented card layout"}'

curl "http://localhost:8100/v1/handoff/lanes?user_id=u123"
```

### Python SDK

```python
from engram import Engram

memory = Engram()

# Add a memory
memory.add("User prefers Python over JavaScript", user_id="u123")

# Search with dual retrieval
results = memory.search("programming preferences", user_id="u123")

# Cross-agent knowledge sharing
memory.add(
    "The API rate limit is 100 req/min",
    user_id="team_alpha",
    agent_id="researcher",
    categories=["technical", "api"]
)

# Another agent finds it
results = memory.search("rate limits", user_id="team_alpha")
```

**Full Memory interface:**

```python
from engram import Memory

memory = Memory()

# Lifecycle
memory.add(content, user_id, agent_id=None, categories=None, metadata=None)
memory.get(memory_id)
memory.update(memory_id, content)
memory.delete(memory_id)

# Search
memory.search(query, user_id, agent_id=None, limit=10, categories=None)
memory.get_all(user_id, agent_id=None, layer=None, limit=100)

# Memory management
memory.promote(memory_id)                # SML → LML
memory.demote(memory_id)                 # LML → SML
memory.fuse(memory_ids)                  # Combine related memories
memory.decay(user_id=None)               # Apply forgetting
memory.history(memory_id)                # Access history

# Knowledge graph
memory.get_related_memories(memory_id)   # Graph traversal
memory.get_memory_entities(memory_id)    # Extracted entities
memory.get_entity_memories(entity_name)  # All memories with entity
memory.get_memory_graph(memory_id)       # Visualization data

# Categories
memory.get_category_tree()
memory.search_by_category(category_id)
memory.stats(user_id=None, agent_id=None)
```

**Async support:**

```python
from engram.memory.async_memory import AsyncMemory

async with AsyncMemory() as memory:
    await memory.add("User prefers Python", user_id="u1")
    results = await memory.search("programming", user_id="u1")
```

### CLI

```bash
engram install                     # Auto-configure all integrations
engram status                      # Version, config paths, DB stats
engram serve                       # Start REST API server

engram add "User prefers Python"   # Add a memory
engram search "preferences"        # Search
engram list --layer lml            # List long-term memories
engram stats                       # Memory statistics
engram decay                       # Apply forgetting
engram categories                  # List categories

engram export -o memories.json     # Export
engram import memories.json        # Import (Engram or Mem0 format)
```

---

## Configuration

```bash
# LLM & Embeddings (choose one)
export GEMINI_API_KEY="your-key"                      # Gemini (default)
export OPENAI_API_KEY="your-key"                      # OpenAI
export OLLAMA_HOST="http://localhost:11434"            # Ollama (local, no key)

# v2 features (all enabled by default)
export ENGRAM_V2_POLICY_GATEWAY="true"                # Token + scope enforcement
export ENGRAM_V2_STAGING_WRITES="true"                # Writes land in staging
export ENGRAM_V2_DUAL_RETRIEVAL="true"                # Semantic + episodic search
export ENGRAM_V2_REF_AWARE_DECAY="true"               # Preserve referenced memories
export ENGRAM_V2_TRUST_AUTOMERGE="true"               # Auto-approve for trusted agents
export ENGRAM_V2_AUTO_MERGE_TRUST_THRESHOLD="0.85"    # Trust threshold for auto-merge
```

**Python config:**

```python
from engram.configs.base import MemoryConfig, FadeMemConfig, EchoMemConfig, CategoryMemConfig

config = MemoryConfig(
    fadem=FadeMemConfig(
        enable_forgetting=True,
        sml_decay_rate=0.15,
        lml_decay_rate=0.02,
        promotion_access_threshold=3,
        forgetting_threshold=0.1,
    ),
    echo=EchoMemConfig(
        enable_echo=True,
        auto_depth=True,
        deep_multiplier=1.6,
    ),
    category=CategoryMemConfig(
        enable_categories=True,
        auto_categorize=True,
        enable_category_decay=True,
        max_category_depth=3,
    ),
)
```
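
To apply it, pass the config object at construction time (assuming `Memory` accepts a `config` keyword; check the SDK reference):

```python
from engram import Memory

memory = Memory(config=config)  # assumed constructor signature
```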

---

## Multi-Agent Memory

Engram is designed for agent orchestrators. Every memory is scoped by `user_id` and optionally `agent_id`:

```python
# Research agent stores knowledge
memory.add("OAuth 2.0 with JWT tokens",
           user_id="project_123", agent_id="researcher")

# Implementation agent searches shared knowledge
results = memory.search("authentication", user_id="project_123")
# → Finds researcher's discovery

# Review agent adds findings
memory.add("Security review passed",
           user_id="project_123", agent_id="reviewer")
```

**Agent trust scoring** determines write permissions:
- High-trust agents (>0.85): proposals auto-merge
- Medium-trust: queued for daily digest review
- Low-trust: require explicit approval
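
In code, that routing amounts to a threshold check (0.85 matches the default `ENGRAM_V2_AUTO_MERGE_TRUST_THRESHOLD`; the medium/low cutoff is illustrative):

```python
AUTO_MERGE_THRESHOLD = 0.85   # matches the documented default
REVIEW_THRESHOLD = 0.5        # illustrative cutoff for "medium" trust

def route_proposal(trust_score: float) -> str:
    """Decide what happens to a staged write proposal."""
    if trust_score > AUTO_MERGE_THRESHOLD:
        return "auto_merge"
    if trust_score >= REVIEW_THRESHOLD:
        return "daily_digest"
    return "explicit_approval"
```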

---

## Research

Engram is based on:

> **FadeMem: Biologically-Inspired Forgetting for Efficient Agent Memory**
> [arXiv:2601.18642](https://arxiv.org/abs/2601.18642)

| Metric | Result |
|:-------|:-------|
| Storage Reduction | ~45% |
| Multi-hop Reasoning | +12% accuracy |
| Retrieval Precision | +8% on LTI-Bench |

Biological inspirations:

- Ebbinghaus Forgetting Curve → exponential decay
- Spaced Repetition → access boosts strength
- Sleep Consolidation → SML → LML promotion
- Production Effect → echo encoding
- Elaborative Encoding → deeper processing = stronger memory

---

## LongMemEval on Colab (GPU)

Use this flow to benchmark Engram on LongMemEval in Google Colab with GPU acceleration.

```bash
# 1) In Colab: Runtime -> Change runtime type -> GPU

# 2) Install Engram + GPU reader dependencies
pip install -U engram-memory transformers accelerate

# 3) Download LongMemEval data
mkdir -p /content/longmemeval
cd /content/longmemeval
curl -L -o longmemeval_s_cleaned.json \
  https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned/resolve/main/longmemeval_s_cleaned.json

# 4) Run Engram benchmark (HF reader on GPU)
python -m engram.benchmarks.longmemeval \
  --dataset-path /content/longmemeval/longmemeval_s_cleaned.json \
  --output-jsonl /content/longmemeval/engram_hypotheses.jsonl \
  --retrieval-jsonl /content/longmemeval/engram_retrieval.jsonl \
  --answer-backend hf \
  --hf-model Qwen/Qwen2.5-1.5B-Instruct \
  --embedder-provider simple \
  --llm-provider mock \
  --vector-store-provider memory \
  --history-db-path /content/engram-longmemeval.db \
  --top-k 8 \
  --max-questions 100 \
  --skip-abstention
```

Notes:
- The output file is evaluator-compatible (`question_id`, `hypothesis` per line).
- `--include-debug-fields` adds retrieval diagnostics into each output row.
- The command above uses `simple` embedder + `mock` LLM for memory operations, so **no Gemini/OpenAI key is required**.

To run with Gemini only (no extra reader packages), use the base install and set the key **before** starting the run:

```bash
pip install -U engram-memory
export GEMINI_API_KEY="your-key"

python -m engram.benchmarks.longmemeval \
  --dataset-path /content/longmemeval/longmemeval_s_cleaned.json \
  --output-jsonl /content/longmemeval/engram_hypotheses.jsonl \
  --answer-backend engram-llm \
  --llm-provider gemini \
  --embedder-provider gemini \
  --vector-store-provider memory
```

Optional official QA scoring from the LongMemEval repo:

```bash
cd /content
git clone https://github.com/xiaowu0162/LongMemEval.git
cd /content/LongMemEval/src/evaluation
export OPENAI_API_KEY="your-key"
python evaluate_qa.py gpt-4o /content/longmemeval/engram_hypotheses.jsonl /content/longmemeval/longmemeval_s_cleaned.json
```

---

## Docker

```bash
# Quick start
docker compose up -d

# Or build manually
docker build -t engram .
docker run -p 8100:8100 -v engram-data:/data \
  -e GEMINI_API_KEY="your-key" engram
```

---

## Manual Integration Setup

<details>
<summary><b>Claude Code / Claude Desktop</b></summary>

Add to `~/.claude.json` (CLI) or `claude_desktop_config.json` (Desktop):

```json
{
  "mcpServers": {
    "engram-memory": {
      "command": "python",
      "args": ["-m", "engram.mcp_server"],
      "env": {
        "GEMINI_API_KEY": "your-api-key"
      }
    }
  }
}
```
</details>

<details>
<summary><b>Cursor</b></summary>

Add to `~/.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "engram-memory": {
      "command": "python",
      "args": ["-m", "engram.mcp_server"],
      "env": {
        "GEMINI_API_KEY": "your-api-key"
      }
    }
  }
}
```
</details>

<details>
<summary><b>OpenAI Codex</b></summary>

Add to `~/.codex/config.toml`:

```toml
[mcp_servers.engram-memory]
command = "python"
args = ["-m", "engram.mcp_server"]

[mcp_servers.engram-memory.env]
GEMINI_API_KEY = "your-api-key"
```
</details>

---

## Troubleshooting

<details>
<summary><b>Claude Code doesn't see the memory tools</b></summary>

- Restart Claude Code after running `engram install`
- Check that `~/.claude.json` has an `mcpServers.engram-memory` section
- Verify your API key: `echo $GEMINI_API_KEY`
</details>

<details>
<summary><b>The hook isn't injecting memories</b></summary>

- Check that `engram-api` is running: `curl http://127.0.0.1:8100/health`
- Verify the plugin is activated: run `/plugin` in Claude Code
- Check script permissions: `ls -l ~/.engram/claude-plugin/engram-memory/hooks/prompt_context.py`
</details>

<details>
<summary><b>API won't start (port in use)</b></summary>

- Check: `lsof -i :8100`
- Kill the process: `kill <PID>`
- Or use a different port: `ENGRAM_API_PORT=8200 engram-api`
</details>

---

## Contributing

```bash
git clone https://github.com/Ashish-dwi99/Engram.git
cd Engram
pip install -e ".[dev]"
pytest tests/ -v
```

---

## License

MIT License — see [LICENSE](LICENSE) for details.

---

<p align="center">
  <b>Switch agents without losing context. Stop re-explaining yourself.</b>
  <br><br>
  <a href="https://github.com/Ashish-dwi99/Engram">GitHub</a> &middot;
  <a href="https://github.com/Ashish-dwi99/Engram/issues">Issues</a> &middot;
  <a href="https://github.com/Ashish-dwi99/Engram/blob/main/CHANGELOG.md">Changelog</a>
</p>
