Metadata-Version: 2.4
Name: reeve
Version: 0.1.9
Summary: Lifetime persistent memory for LLMs. A lightweight SDK for structured knowledge graphs.
Author: Ankesh Kumar
License-Expression: MIT
Keywords: memory,neo4j,knowledge-graph,openai,embeddings,long-term-memory,personal-ai
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Database
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: neo4j>=5.0
Requires-Dist: langchain>=0.2
Requires-Dist: langchain-ollama>=0.2
Requires-Dist: numpy>=1.24
Requires-Dist: python-dotenv>=1.0
Requires-Dist: PyJWT>=2.0
Requires-Dist: starlette>=0.30.0
Requires-Dist: google-auth>=2.0.0
Requires-Dist: requests>=2.25.0
Requires-Dist: sseclient-py>=1.7.2
Provides-Extra: openai
Requires-Dist: langchain-openai>=0.1; extra == "openai"
Requires-Dist: openai>=1.0; extra == "openai"
Provides-Extra: streamlit
Requires-Dist: streamlit>=1.30; extra == "streamlit"
Provides-Extra: mcp
Requires-Dist: mcp>=1.6; extra == "mcp"
Requires-Dist: uvicorn>=0.20.0; extra == "mcp"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"
Requires-Dist: mypy>=1.8; extra == "dev"
Dynamic: license-file

# Reeve

**Lifetime persistent memory for LLMs — a knowledge graph that remembers everything, knows what changed, and never forgets what matters.**

Reeve is a lightweight SDK for long-term AI memory. It transforms unstructured chat history into a structured Neo4j knowledge graph, ensuring that every fact you store today remains accurately retrievable a lifetime from now — in context, and evolving with you.

```python
import os
from reeve import store_memory, query_memory

# Set your API key
os.environ["REEVE_API_KEY"] = "sk-XXXXXXXXXXXXXXXXXXXXXXXX"

# Store a memory
store_memory("I just started as a software engineer at Google in San Francisco")

# Months later...
store_memory("I moved from San Francisco to New York")

# Reeve connects the dots across time
print(query_memory("Where do I live?"))        # → "New York."
print(query_memory("Did I ever live in SF?"))  # → "Yes, before moving to New York."
```

The memory of your move was stored months after your initial job announcement. But when you need it, Reeve connects the dots — because it actually *understands* the relationship between locations, time, and your career history.

No context windows. No token limits. No forgetting. **Persistent memory that outlives any single conversation — or any single year.**

---

## Why This Exists

LLMs are stateless. Every conversation starts from zero. The common workarounds all break down over time:

| Approach | Works for a week | Breaks at scale |
|---|---|---|
| **Chat history** | ✓ | Grows unbounded, no structure, can't handle contradictions |
| **Vector stores** (Pinecone, ChromaDB) | ✓ | Flat chunks — "I moved to NYC" never invalidates "I live in SF" |
| **RAG pipelines** | ✓ | Retrieves similar text, not relevant *knowledge* |
| **Context-window managers** (MemGPT) | ✓ | Clever paging, but still raw text — no entity tracking, no state evolution |

Ask any of them: *"What changed about me since last year?"* — silence.

Ask Reeve — it knows, because it actually tracks entities, states, and time.

## How It Works

Reeve doesn't store text. It **understands** it.

Every input is parsed by an LLM into structured semantics — entities, actions, states, roles, emotions, locations — and written into a **Neo4j knowledge graph** with typed relationships. This isn't an embedding dump. It's a living, evolving model of everything you've told it.

### What Happens When You Store a Memory

*"I love playing football"* becomes a structured graph:

```
(Episode) ──INVOLVES──▶ (Entity: you)
    │
    ├──HAS_STATE──▶ (State: hobby=football, sentiment=love) ──OF_ENTITY──▶ (Entity: you)
    └──INVOLVES──▶ (Entity: football)
```

*"I just started at Google as a software engineer in San Francisco"*:

```
(Episode) ──INVOLVES──▶ (Entity: Google)
    │                        │
    ├──HAS_ACTION──▶ (Action: started working) ──BY_ENTITY──▶ (Entity: you)
    │                                          ──ON_ENTITY──▶ (Entity: Google)
    ├──HAS_STATE──▶ (State: role=software engineer) ──OF_ENTITY──▶ (Entity: you)
    └──AT_LOCATION──▶ (Location: San Francisco)
```

Months later, when you ask *"My friend invited me to play football, should I go?"*, Reeve retrieves the football episode by semantic similarity, reads the `sentiment=love` state attached to it, and answers: *"Yes! You love playing football."*

No explicit link was ever made between the question and the stored memory — the knowledge graph makes the connection automatically.

### What Happens When Facts Change

*"I moved from San Francisco to New York"* — the old state is **superseded**, not deleted or duplicated. Full history is preserved:

```
(State: city=New York, active=true) ──SUPERSEDES──▶ (State: city=San Francisco, active=false)
```

Ask *"Where do I live?"* → gets the current answer: New York.  
Ask *"Where was I living when I joined Google?"* → still knows it was San Francisco.

**This is what persistent memory actually means.** Not similarity search — structured, evolving knowledge with a complete audit trail.
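
The supersede pattern can be sketched in memory as follows. This is a minimal illustration, not Reeve's implementation: `State` and `StateStore` are hypothetical names, and the real graph lives in Neo4j rather than a Python dict.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class State:
    key: str
    value: str
    active: bool = True
    supersedes: State | None = None  # the state this one replaced, if any


@dataclass
class StateStore:
    """Hypothetical in-memory stand-in for the graph's State nodes."""

    current: dict[str, State] = field(default_factory=dict)

    def update(self, key: str, value: str) -> State:
        old = self.current.get(key)
        if old is not None and old.value == value:
            return old  # same fact restated; nothing to supersede
        if old is not None:
            old.active = False  # keep the old node, only mark it inactive
        new = State(key=key, value=value, supersedes=old)
        self.current[key] = new
        return new

    def history(self, key: str) -> list[str]:
        """Walk the SUPERSEDES chain from newest to oldest."""
        values: list[str] = []
        node = self.current.get(key)
        while node is not None:
            values.append(node.value)
            node = node.supersedes
        return values


store = StateStore()
store.update("city", "San Francisco")
store.update("city", "New York")
print(store.current["city"].value)  # New York
print(store.history("city"))        # ['New York', 'San Francisco']
```

Marking the old state inactive instead of deleting it is what keeps both "where do I live now" and "where did I live then" answerable.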

### Triple-Scoring Retrieval: Why the Right Memory Surfaces

Most systems rank by vector similarity alone. That works for a week. Over years, your "I got married" memory gets buried under thousands of newer entries.

Reeve scores every candidate across **three scoring dimensions** simultaneously:

```
score = 0.65 × vector_similarity + 0.30 × importance + 0.05 × recency
```

- **Vector similarity** — semantic relevance via ANN search
- **Importance** — LLM-assigned at ingestion (landmark life events score higher)
- **Recency** — exponential decay with a 365-day half-life

The key: memories above importance 0.75 **bypass recency decay entirely**. Your wedding, your child's birth, a cancer diagnosis — these surface instantly no matter how old they are. Yesterday's lunch order won't outrank them.
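
A minimal sketch of the formula above, using the documented default weights, half-life, and importance floor. The function names are illustrative, and treating the floor as inclusive (`>=`) is an assumption.

```python
W_SIMILARITY, W_IMPORTANCE, W_RECENCY = 0.65, 0.30, 0.05
HALF_LIFE_DAYS = 365.0
IMPORTANCE_FLOOR = 0.75


def recency(age_days: float, importance: float) -> float:
    """Exponential decay with a 365-day half-life, skipped for landmark memories."""
    if importance >= IMPORTANCE_FLOOR:  # assumption: the floor is inclusive
        return 1.0
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def score(similarity: float, importance: float, age_days: float) -> float:
    return (
        W_SIMILARITY * similarity
        + W_IMPORTANCE * importance
        + W_RECENCY * recency(age_days, importance)
    )


# A ten-year-old wedding versus yesterday's lunch, equally similar to the query.
wedding = score(similarity=0.80, importance=0.95, age_days=3650)
lunch = score(similarity=0.80, importance=0.10, age_days=1)
print(wedding > lunch)  # True
```

With only 5% of the score tied to recency, importance dominates whenever two candidates are similarly relevant.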

### Entity Resolution: One Identity, Many Names

*"Google"*, *"my company"*, *"the office"*, *"work"* — all resolve to the same canonical entity through 3-layer matching:

1. **Case-insensitive exact** — instant lookup
2. **Substring containment** — "my company Google" → Google
3. **Embedding similarity** — cosine ≥ 0.88 catches semantic equivalents
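
The three layers above might be sketched like this. It is an illustration under assumptions: `resolve_entity` is a hypothetical helper, the `embed` callable stands in for whichever embedding model is configured, and the toy vectors exist only to make the example runnable.

```python
from __future__ import annotations

import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def resolve_entity(
    mention: str,
    known: dict[str, Vector],
    embed: Callable[[str], Vector],
    threshold: float = 0.88,
) -> str | None:
    """Return the canonical name for `mention`, or None if nothing matches."""
    lowered = mention.casefold()

    # Layer 1: case-insensitive exact match.
    for name in known:
        if name.casefold() == lowered:
            return name

    # Layer 2: the canonical name appears inside the mention.
    for name in known:
        if name.casefold() in lowered:
            return name

    # Layer 3: embedding similarity against every known entity.
    if not known:
        return None
    vec = embed(mention)
    best = max(known, key=lambda name: cosine(vec, known[name]))
    return best if cosine(vec, known[best]) >= threshold else None


# Toy 2-d embeddings, purely for demonstration.
known = {"Google": [1.0, 0.0]}
toy = {"work": [0.9, 0.1], "pizza": [0.0, 1.0]}
embed = lambda text: toy.get(text, [0.0, 0.0])

print(resolve_entity("google", known, embed))             # Google
print(resolve_entity("my company Google", known, embed))  # Google
print(resolve_entity("work", known, embed))               # Google
print(resolve_entity("pizza", known, embed))              # None
```

Ordering the layers from cheapest to most expensive means the embedding call is only made when the string checks fail.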

### Designed to Last a Lifetime

Most memory systems assume weeks of data. Reeve is engineered for a **lifetime of use**:

| What could go wrong | How Reeve handles it |
|---|---|
| Graph grows to millions of nodes | Dynamic candidate pool scales with size (2% of episodes, 50–500) |
| Old memories become noise | Consolidation engine LLM-summarizes old low-importance episodes |
| Entity names fragment over years | Background 3-pass deduplication merges fragmented nodes |
| Facts become outdated | SUPERSEDES chain — current state always queryable, history preserved |
| Embedding models get replaced | Model version tags + batch reindex utility |
| Disaster strikes | Full JSON backup with timestamped exports |

**Store a memory on day one. Query it a lifetime later. It's still there, still accurate, still in context.**
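
As a rough illustration of the dynamic candidate pool in the table above (2% of episodes, clamped to 50–500), a hypothetical helper could look like this. The function name and the rounding choice are assumptions.

```python
def candidate_pool_size(
    total_episodes: int,
    ratio: float = 0.02,
    minimum: int = 50,
    maximum: int = 500,
) -> int:
    """Scale the ANN candidate count with graph size, within fixed bounds."""
    return max(minimum, min(maximum, round(total_episodes * ratio)))


for total in (1_000, 10_000, 1_000_000):
    print(total, candidate_pool_size(total))
# 1000 50
# 10000 200
# 1000000 500
```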

---

## Features

- **Pluggable LLM providers** — Ollama (local, free) or OpenAI (cloud); switch via one env var
- **Semantic extraction** — LLM parses text into structured entities, actions, states, roles, emotions, importance, and location
- **Neo4j knowledge graph** — Episodes, Entities, Actions, States, Roles, Locations with typed relationships
- **3-layer entity resolution** — case-insensitive → substring → embedding similarity (≥ 0.88)
- **State contradiction handling** — new facts `SUPERSEDE` old ones with full audit trail
- **Vector ANN search** — provider-matched embeddings indexed in Neo4j
- **Lifetime recency tuning** — 5% recency weight, 365-day half-life, importance floor bypass
- **Temporal queries** — supports "last 3 months", "in 2024", "March 2025" natively
- **Memory consolidation** — LLM-summarizes old low-importance episodes
- **Backup & export** — full graph dump to timestamped JSON
- **Entity deduplication** — 3-pass background scan and merge
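
To make the temporal-query feature concrete, here is a small sketch that turns the three example phrases into date ranges. It is not Reeve's parser: `parse_time_range` is a hypothetical name, and the half-open range convention is an assumption.

```python
from __future__ import annotations

import calendar
import re
from datetime import date, timedelta

MONTHS = ("january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december")
MONTH_YEAR_RE = re.compile(r"\b(" + "|".join(MONTHS) + r") (\d{4})\b")


def _shift_months(d: date, months: int) -> date:
    """Move `d` back by whole calendar months (negative moves forward)."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) - months, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return date(year, month0 + 1, min(d.day, last_day))


def parse_time_range(text: str, today: date) -> tuple[date, date] | None:
    """Return a half-open [start, end) range, or None if no phrase matches."""
    t = text.lower()
    m = re.search(r"\blast (\d+) months?\b", t)
    if m:
        return _shift_months(today, int(m.group(1))), today + timedelta(days=1)
    m = MONTH_YEAR_RE.search(t)
    if m:
        start = date(int(m.group(2)), MONTHS.index(m.group(1)) + 1, 1)
        return start, _shift_months(start, -1)
    m = re.search(r"\bin (\d{4})\b", t)
    if m:
        year = int(m.group(1))
        return date(year, 1, 1), date(year + 1, 1, 1)
    return None


today = date(2025, 5, 31)
print(parse_time_range("What happened in the last 3 months?", today))
print(parse_time_range("Where did I work in 2024?", today))
print(parse_time_range("What did I do in March 2025?", today))
```

The resulting range could then be applied as a filter on episode timestamps before scoring.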

## Installation

```bash
pip install reeve
```

### Prerequisites

- **Python 3.10+**
- **Neo4j 5.x** (Aura or self-hosted with vector index support)
- **LLM provider** — one of:
  - **Ollama** (default, free, local) — [Install Ollama](https://ollama.com)
  - **OpenAI** — requires API key from [platform.openai.com](https://platform.openai.com/api-keys)

### Configuration

Copy the example environment file and fill in your credentials:

```bash
cp .env.example .env
```

#### Option A — Ollama (local, default)

```bash
# Install & start Ollama (Homebrew shown; for other platforms see ollama.com)
brew install ollama
ollama serve          # keep running in a separate terminal

# Pull required models
ollama pull llama3.2:3b
ollama pull nomic-embed-text
```

Your `.env` only needs Neo4j credentials — Ollama settings default automatically:

```env
LLM_PROVIDER=ollama
NEO4J_URI=neo4j+s://your-instance.databases.neo4j.io
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password
```

#### Option B — OpenAI (cloud)

```env
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-api-key
NEO4J_URI=neo4j+s://your-instance.databases.neo4j.io
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password
```

Create the required Neo4j vector index (run once in Neo4j Browser — set dimension to match your provider):

```cypher
// Ollama (nomic-embed-text): 768 dimensions
// OpenAI (text-embedding-ada-002): 1536 dimensions
CREATE VECTOR INDEX episode_embedding IF NOT EXISTS
FOR (ep:Episode) ON (ep.embedding)
OPTIONS {indexConfig: {
  `vector.dimensions`: 768,
  `vector.similarity_function`: 'cosine'
}};
```
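
If you would rather generate the statement than edit the dimension by hand, a helper along these lines could build it. `vector_index_cypher` is a hypothetical function, not part of Reeve; it only assembles the same Cypher shown above.

```python
def vector_index_cypher(dimensions: int, index_name: str = "episode_embedding") -> str:
    """Build the CREATE VECTOR INDEX statement for a given embedding size."""
    if dimensions <= 0:
        raise ValueError("dimensions must be positive")
    if not index_name.isidentifier():
        # The name is interpolated into the query, so keep it to a plain identifier.
        raise ValueError("index_name must be a plain identifier")
    return (
        f"CREATE VECTOR INDEX {index_name} IF NOT EXISTS\n"
        "FOR (ep:Episode) ON (ep.embedding)\n"
        "OPTIONS {indexConfig: {\n"
        f"  `vector.dimensions`: {dimensions},\n"
        "  `vector.similarity_function`: 'cosine'\n"
        "}}"
    )


print(vector_index_cypher(1536))  # statement sized for text-embedding-ada-002
```

The returned string can be pasted into Neo4j Browser or executed once through a driver session.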

Verify your setup:

```bash
reeve config
```

## Quick Start

### Python API

```python
import os
from reeve import store_memory, query_memory

# Set your API key
os.environ["REEVE_API_KEY"] = "sk-XXXXXXXXXXXXXXXXXXXXXXXX"

# Store memories
store_memory("I love playing football", speaker="ankesh")
store_memory("I work at Google as a software engineer", speaker="ankesh")

# Query — even months later, across different conversations
answer = query_memory("My friend invited me to play football, should I go?", speaker="ankesh")
print(answer)  # "Yes, you should! You love playing football."
```

### CLI

```bash
# Interactive chat
reeve chat --speaker ankesh

# Store a single memory
reeve store "I love playing football" --speaker ankesh

# Query — connects to stored knowledge automatically
reeve query "My friend invited me to play football, should I go?" --speaker ankesh

# Show active provider, models & run health checks
reeve config

# Admin operations
reeve backup --speaker ankesh
reeve dedup
reeve consolidate --speaker ankesh
reeve reindex --old-model nomic-embed-text
```

Or via `python -m`:

```bash
python -m reeve chat --speaker ankesh
```

## Architecture

```
User Input
  │
  ├── Statement ──▶ operator.py ──▶ reconciler.py ──▶ Neo4j Graph
  │                  (LLM)          (entity resolution,
  │                                   state contradiction,
  │                                   graph writing)
  │
  └── Question  ──▶ retriever.py ──▶ Neo4j Vector Index
                     (triple-scoring)    + Graph Traversal
                    ──▶ llm_interface.py ──▶ Answer
                         (LLM)
```

## Switching Providers

Reeve auto-detects the correct embedding dimension and warns you about mismatches. Run `reeve config` at any time to check your setup.

### Ollama → OpenAI

```bash
# 1. Update .env
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-api-key

# 2. Install the optional OpenAI dependency
pip install "reeve[openai]"

# 3. Verify (will warn if Neo4j index dimension doesn't match)
reeve config

# 4. Recreate the vector index (1536-dim for text-embedding-ada-002)
#    In Neo4j Browser:
#      DROP INDEX episode_embedding;
#      CREATE VECTOR INDEX episode_embedding IF NOT EXISTS
#      FOR (ep:Episode) ON (ep.embedding)
#      OPTIONS {indexConfig: {`vector.dimensions`: 1536, `vector.similarity_function`: 'cosine'}};

# 5. Re-embed all episodes with the new model
reeve reindex --old-model nomic-embed-text
```

### OpenAI → Ollama

```bash
# 1. Install & start Ollama
brew install ollama
ollama serve          # keep running in a separate terminal
ollama pull llama3.2:3b && ollama pull nomic-embed-text

# 2. Update .env
LLM_PROVIDER=ollama

# 3. Verify
reeve config

# 4. Recreate the vector index (768-dim for nomic-embed-text)
#    In Neo4j Browser:
#      DROP INDEX episode_embedding;
#      CREATE VECTOR INDEX episode_embedding IF NOT EXISTS
#      FOR (ep:Episode) ON (ep.embedding)
#      OPTIONS {indexConfig: {`vector.dimensions`: 768, `vector.similarity_function`: 'cosine'}};

# 5. Re-embed all episodes
reeve reindex --old-model text-embedding-ada-002
```

### Using a Custom Model

Set these in `.env` to use any Ollama-compatible model:

```env
OLLAMA_CHAT_MODEL=mistral       # any chat model
OLLAMA_EMBED_MODEL=mxbai-embed-large  # any embedding model
EMBEDDING_DIM=1024              # override auto-detection for unknown models
```
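
How an `EMBEDDING_DIM` override could interact with auto-detection is sketched below. This is an assumption about the behaviour, not Reeve's code: `resolve_embedding_dim` and `KNOWN_DIMS` are hypothetical, and only the two dimensions documented in this README are listed.

```python
from __future__ import annotations

import os
from typing import Mapping

# Dimensions stated in this README for the default models.
KNOWN_DIMS = {"nomic-embed-text": 768, "text-embedding-ada-002": 1536}


def resolve_embedding_dim(model: str, env: Mapping[str, str] | None = None) -> int:
    """Prefer an explicit EMBEDDING_DIM, else fall back to the known table."""
    source = os.environ if env is None else env
    override = source.get("EMBEDDING_DIM")
    if override:
        return int(override)
    if model in KNOWN_DIMS:
        return KNOWN_DIMS[model]
    raise ValueError(f"Unknown model {model!r}: set EMBEDDING_DIM explicitly")


print(resolve_embedding_dim("nomic-embed-text", env={}))                          # 768
print(resolve_embedding_dim("mxbai-embed-large", env={"EMBEDDING_DIM": "1024"}))  # 1024
```

Whatever value is resolved must also match `vector.dimensions` on the Neo4j index.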

## Advanced Usage

Use the lower-level API for fine-grained control:

```python
from reeve.operator import operator_extract
from reeve.reconciler import reconcile, consolidate
from reeve.retriever import retrieve
from reeve.backup import save_backup
from reeve.entity_dedup import deduplicate_entities

# Extract semantics
semantics = operator_extract("Alice presented her paper at MIT")

# Write to graph
episode_id = reconcile(semantics, speaker="ankesh", raw_text="...")

# Retrieve context
context = retrieve("What did Alice do?", speaker="ankesh")

# Admin operations
consolidate("ankesh")
save_backup(speaker="ankesh")
deduplicate_entities(dry_run=False)
```

## Configuration Reference

| Variable | Default | Description |
|---|---|---|
| `LLM_PROVIDER` | `ollama` | LLM backend: `ollama` (local) or `openai` (cloud) |
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama server address |
| `OLLAMA_CHAT_MODEL` | `llama3.2:3b` | Ollama chat model |
| `OLLAMA_EMBED_MODEL` | `nomic-embed-text` | Ollama embedding model (768-dim) |
| `OPENAI_API_KEY` | — | OpenAI API key (required when provider=openai) |
| `OPENAI_CHAT_MODEL` | `gpt-4o` | OpenAI chat model |
| `OPENAI_EMBED_MODEL` | `text-embedding-ada-002` | OpenAI embedding model (1536-dim) |
| `EMBEDDING_DIM` | auto | Override auto-detected embedding dimension |
| `WEIGHT_SIMILARITY` | `0.65` | Vector similarity weight in scoring |
| `WEIGHT_IMPORTANCE` | `0.30` | Importance weight in scoring |
| `WEIGHT_RECENCY` | `0.05` | Recency weight (low for lifetime safety) |
| `RECENCY_HALF_LIFE_DAYS` | `365` | Days for recency to decay by 50% |
| `IMPORTANCE_FLOOR` | `0.75` | Episodes above this ignore recency decay |
| `VECTOR_CANDIDATES_MIN` | `50` | Minimum ANN candidates |
| `VECTOR_CANDIDATES_MAX` | `500` | Maximum ANN candidates |
| `VECTOR_CANDIDATES_RATIO` | `0.02` | Fraction of total episodes to search |
| `CONSOLIDATION_AGE_DAYS` | `90` | Episodes older than this are eligible |
| `CONSOLIDATION_BATCH_SIZE` | `50` | Max episodes per consolidation round |
| `CONSOLIDATION_IMPORTANCE_CAP` | `0.3` | Only consolidate below this importance |
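
For illustration, the scoring variables in the table could be loaded with their documented defaults as shown here. `ScoringConfig` is a hypothetical class; Reeve's own configuration loader may differ.

```python
from __future__ import annotations

import os
from dataclasses import dataclass, fields
from typing import Mapping


@dataclass(frozen=True)
class ScoringConfig:
    # Field names mirror the environment variables, lower-cased.
    weight_similarity: float = 0.65
    weight_importance: float = 0.30
    weight_recency: float = 0.05
    recency_half_life_days: float = 365.0
    importance_floor: float = 0.75

    @classmethod
    def from_env(cls, env: Mapping[str, str] | None = None) -> ScoringConfig:
        source = os.environ if env is None else env
        overrides = {
            f.name: float(source[f.name.upper()])
            for f in fields(cls)
            if f.name.upper() in source
        }
        return cls(**overrides)


config = ScoringConfig.from_env({"WEIGHT_RECENCY": "0.10"})
print(config.weight_recency)     # 0.1
print(config.weight_similarity)  # 0.65
```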

## Lifetime Design Decisions

| Problem | Solution |
|---|---|
| Recency bias buries old memories | Recency weight = 5%, half-life = 365 days |
| Important memories forgotten | Importance floor bypass (≥ 0.75 → recency = 1.0) |
| Fixed candidate pool too small | Dynamic pool: 2% of graph, clamped 50–500 |
| Graph grows unbounded | Consolidation engine merges old trivia |
| State contradictions | SUPERSEDES chain — only latest value active |
| Entity fragmentation | 3-layer entity resolution + background dedup |
| Embedding model changes | Model version tag + batch reindex utility |
| Vector index lag | 5-minute recent-episode safety net |
| No disaster recovery | Full JSON export with incremental backup |
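
The consolidation defaults from the Configuration Reference (90 days, importance below 0.3, batches of 50) suggest an eligibility filter roughly like the one below. This is a sketch under assumptions: `Episode` and `consolidation_batch` are hypothetical, and processing the oldest episodes first is a guess.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Episode:
    text: str
    importance: float
    created_at: datetime


def consolidation_batch(
    episodes: list[Episode],
    now: datetime,
    age_days: int = 90,
    importance_cap: float = 0.3,
    batch_size: int = 50,
) -> list[Episode]:
    """Pick old, low-importance episodes that are candidates for summarizing."""
    cutoff = now - timedelta(days=age_days)
    eligible = [
        ep for ep in episodes
        if ep.created_at < cutoff and ep.importance < importance_cap
    ]
    eligible.sort(key=lambda ep: ep.created_at)  # assumption: oldest first
    return eligible[:batch_size]


now = datetime(2025, 1, 1)
episodes = [
    Episode("Had a sandwich for lunch", 0.10, now - timedelta(days=200)),
    Episode("Got married", 0.95, now - timedelta(days=400)),
    Episode("Grabbed a coffee", 0.20, now - timedelta(days=10)),
]
print([ep.text for ep in consolidation_batch(episodes, now)])
# ['Had a sandwich for lunch']
```

The wedding is old but too important to consolidate, and the coffee is trivial but too recent.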

## Development

```bash
# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Lint
ruff check .

# Type check
mypy reeve/
```

## Tech Stack

- **Python 3.10+**
- **Neo4j 5.x** — Graph database with vector index
- **Ollama** (default) — Local LLM & embeddings (llama3.2, nomic-embed-text)
- **OpenAI** (optional) — GPT-4o (chat) + text-embedding-ada-002 (embeddings)
- **LangChain** — Unified LLM/embedding wrappers
- **NumPy** — Cosine similarity computation

## License

MIT — see [LICENSE](LICENSE).
