Metadata-Version: 2.4
Name: claw-memory
Version: 0.4.0
Summary: A lightweight MCP memory server for AI agents. Markdown-first, zero external dependencies.
Author: OpenClaw Contributors
License-Expression: Apache-2.0
License-File: LICENSE
License-File: NOTICE
Keywords: agent,ai,markdown,mcp,memory
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.11
Requires-Dist: mcp>=1.0.0
Requires-Dist: python-frontmatter>=1.1.0
Requires-Dist: sqlite-vec>=0.1.6
Requires-Dist: tiktoken>=0.8.0
Requires-Dist: watchfiles>=1.0.0
Provides-Extra: all
Requires-Dist: ollama>=0.4.0; extra == 'all'
Requires-Dist: openai>=1.0.0; extra == 'all'
Requires-Dist: sentence-transformers>=3.0.0; extra == 'all'
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.24.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.8.0; extra == 'dev'
Provides-Extra: local
Requires-Dist: sentence-transformers>=3.0.0; extra == 'local'
Provides-Extra: ollama
Requires-Dist: ollama>=0.4.0; extra == 'ollama'
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == 'openai'
Description-Content-Type: text/markdown

# OpenClaw Memory

[![PyPI version](https://img.shields.io/pypi/v/claw-memory.svg)](https://pypi.org/project/claw-memory/)

A lightweight MCP memory server designed for AI agents. Markdown files as the single source of truth, zero external dependencies.

[Chinese documentation](README_CN.md)

## Features

- **Markdown-first** — All memories stored as human-readable Markdown files, git-friendly
- **No external services** — Pure Python + SQLite; nothing to deploy or run alongside
- **Smart writing** — Quality gate, auto-routing, conflict detection, reinforcement counting
- **Salience-based retrieval** — Multi-dimensional scoring: semantic similarity + reinforcement + recency + access frequency
- **Token budget aware** — Never exceed your context window budget
- **Session primer** — Cold-start with structured context in ~500 tokens
- **Project isolation** — Global user memory + per-project working memory
- **Privacy protection** — Regex-based sensitive information filtering
- **No LLM required (V1)** — only an embedding model is needed (a fully local option is available)

## Quick Start

### 1. Install + Initialize (two commands)

```bash
# Step 1: Install from PyPI (recommended)
pip install "claw-memory[local]"        # Local embedding (works offline)
# or pip install "claw-memory[openai]" # OpenAI embedding (more accurate)
# or pip install "claw-memory[ollama]" # Ollama embedding

# Step 2: Initialize in any project, then restart Cursor
cd /path/to/your/project
claw-memory init
```

**Or install from source** (for development):

```bash
cd /path/to/claw-memory
pip install -e ".[local]"          # Local embedding
# or pip install -e ".[openai]"   # OpenAI embedding
```

> **macOS Note**: If you see `'sqlite3.Connection' object has no attribute 'enable_load_extension'`, your Python was compiled without SQLite extension support. pyenv users can fix this with:
> ```bash
> LDFLAGS="-L$(brew --prefix sqlite3)/lib" CPPFLAGS="-I$(brew --prefix sqlite3)/include" \
> PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" pyenv install <version> --force
> ```
> See [Troubleshooting](docs/cursor-usage-guide.md#sqlite-扩展加载问题) for details.

**The `init` command automatically handles all configuration:**

- Creates `~/.openclaw_memory/user/` global memory directory with template files
- Creates `.openclaw_memory/` project memory directory (journal, agent, etc.)
- Creates `.openclaw_memory.toml` project config (auto-detects embedding provider)
- Creates `.cursor/mcp.json` MCP server configuration
- Creates `.cursor/rules/memory.mdc` agent usage guide
- Creates `.openclaw_memory/.gitignore` (keeps index files out of git)

Optional flags:

```bash
# Specify embedding provider
claw-memory init --provider openai

# Specify project name
claw-memory init --name "my-awesome-project"

# Initialize global memory only (skip project-level files)
claw-memory init --global-only
```

**After init completes, restart Cursor and the agent will automatically use the memory tools.**

### Other Commands

```bash
# Start MCP server (usually called automatically by Cursor)
claw-memory serve

# SSE mode (for web clients)
claw-memory serve --transport sse --port 8765

# One-shot index of existing memory files
claw-memory index

# Open memory viewer in browser
claw-memory web
claw-memory web --port 8767              # Custom port
claw-memory web --no-open                # Don't auto-open browser
```

### Web Memory Viewer

Browse and search all your memories in a clean web UI:

```bash
claw-memory web
```

This opens a local web server at `http://127.0.0.1:8767` with:

- **File tree navigation** — Browse global and project memories organized by category
- **Markdown rendering** — View memory files with full Markdown rendering and syntax highlighting
- **Full-text search** — Search across all memory files instantly
- **Dark / light theme** — Toggle to match your preference
- **Metadata display** — View frontmatter metadata (type, importance, reinforcement, dates) as badges

### Manual Configuration (optional)

If you prefer not to use `init`, you can manually create `.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "claw-memory": {
      "command": "python",
      "args": ["-m", "openclaw_memory"],
      "env": {
        "OPENCLAW_EMBEDDING_PROVIDER": "openai",
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```

## MCP Tools

| Tool | When to use | Description |
|------|-------------|-------------|
| `memory_primer()` | Start of every session | Returns structured context: user identity, project info, preferences, recent activity, active tasks |
| `memory_search(query, scope?, max_tokens?)` | When you need to recall specific information | Semantic search with salience scoring and token budget control |
| `memory_log(content, type?)` | When you discover information worth remembering | Auto-classifies, deduplicates, detects conflicts, and routes to the right file |
| `memory_session_end(summary)` | End of session | Writes structured session summary, updates tasks and primer |
| `memory_update_tasks(tasks_json)` | When task status changes | Updates TASKS.md and primer |
| `memory_read(path)` | When you need full file content | Reads and returns complete Markdown file |

> **Detailed usage examples**: See [Cursor Usage Guide](docs/cursor-usage-guide.md) for a complete walkthrough of a multi-session scenario showing how the agent uses each tool in practice.
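The typical tool sequence over a session can be sketched as follows. Here `call_tool` is a hypothetical stand-in for an MCP client invocation (not part of this package); the tool names and parameters follow the table above:

```python
# Sketch of one agent session: prime -> search -> log -> summarize.
# `call_tool` is an assumed MCP-client helper, illustrative only.

def run_session(call_tool):
    # Cold start: load structured context (~500 tokens)
    context = call_tool("memory_primer")
    # Recall specific information within a token budget
    hits = call_tool("memory_search",
                     query="webhook handling", max_tokens=1500)
    # Record something worth remembering; the server classifies and routes it
    call_tool("memory_log",
              content="Stripe webhooks must be verified with the signing secret",
              type="pattern")
    # Close out: write a structured summary, update tasks and primer
    call_tool("memory_session_end",
              summary="Implemented webhook verification; tests passing")
    return context, hits
```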

## How It Works

### Memory Directory Structure

```
~/.openclaw_memory/              # Global (cross-project)
├── config.toml                  # Global configuration
├── user/
│   ├── preferences.md           # Your preferences
│   ├── instructions.md          # Your rules for the agent
│   └── entities.md              # People, tools, projects
└── index.db                     # Global vector index

<project>/.openclaw_memory/      # Per-project
├── .openclaw_memory.toml        # Project configuration
├── PRIMER.md                    # Auto-maintained session primer
├── TASKS.md                     # Active task tracking
├── journal/YYYY-MM-DD.md        # Structured daily session logs
├── agent/
│   ├── patterns.md              # Reusable solution patterns
│   └── decisions.md             # Architecture decisions (ADRs)
└── index.db                     # Project vector index
```

### Smart Writing Pipeline

```
Input --> Quality Gate --> Privacy Filter --> Smart Router --> Reinforcement/Conflict Check --> Write
```

1. **Quality Gate**: Rejects noise (too short, filler phrases, pure code, speculation)
2. **Privacy Filter**: Blocks API keys, passwords, internal IPs (configurable regex)
3. **Smart Router**: Auto-classifies content to the right file by keyword patterns
4. **Reinforcement**: If highly similar memory exists (>0.92), increments reinforcement count instead of duplicating
5. **Conflict Detection**: If similar memory exists (0.85-0.92) with new info, replaces the old entry
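The reinforcement/conflict thresholds in steps 4–5 amount to a three-way decision on the best-match similarity. A minimal sketch (the function name and structure are illustrative, not the actual API):

```python
REINFORCE_THRESHOLD = 0.92  # above this: same memory, bump the counter
CONFLICT_LOW = 0.85         # 0.85-0.92: same topic, new info replaces old

def decide_write_action(similarity: float) -> str:
    """Map the best-match similarity of an incoming memory to a write action."""
    if similarity > REINFORCE_THRESHOLD:
        return "reinforce"  # increment reinforcement count, no new entry
    if similarity >= CONFLICT_LOW:
        return "replace"    # conflict: overwrite the stale entry
    return "append"         # genuinely new memory
```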

### Salience-Based Retrieval

```
salience = 0.50 * semantic_similarity
         + 0.20 * reinforcement_score
         + 0.20 * recency_decay
         + 0.10 * access_frequency
```

Memories that are frequently mentioned (high reinforcement), recently updated, and often recalled naturally rank higher — no manual importance tuning needed.
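A sketch of the formula in Python. The weights and the exponential recency decay match the description above; the saturating normalizations for reinforcement and access counts are assumptions, since the README does not specify them:

```python
import time

def salience(similarity, reinforcement, last_updated_ts, access_count,
             now=None, half_life_days=30.0):
    """Weighted salience score; normalization choices are illustrative."""
    now = now if now is not None else time.time()
    age_days = max(0.0, (now - last_updated_ts) / 86400.0)
    recency = 0.5 ** (age_days / half_life_days)             # halves every 30 days
    reinforcement_score = 1.0 - 1.0 / (1.0 + reinforcement)  # saturates toward 1
    access_score = 1.0 - 1.0 / (1.0 + access_count)
    return (0.50 * similarity
            + 0.20 * reinforcement_score
            + 0.20 * recency
            + 0.10 * access_score)
```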

### Token Budget

```python
# Returns as many relevant memories as fit within 1500 tokens
results = memory_search("webhook handling", max_tokens=1500)
```
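Budget-aware retrieval can be sketched as greedy packing: walk the results in salience order and keep whatever still fits. This sketch uses a crude whitespace token estimate in place of the tiktoken counts the server depends on:

```python
def pack_within_budget(ranked_memories, max_tokens=1500):
    """ranked_memories: list of (salience, text), highest salience first.
    Greedily keep results that fit; skip (rather than stop at) oversized ones,
    so a smaller lower-ranked memory can still use leftover budget."""
    kept, used = [], 0
    for _, text in ranked_memories:
        cost = len(text.split())  # crude stand-in for a real token count
        if used + cost > max_tokens:
            continue
        kept.append(text)
        used += cost
    return kept
```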

## Configuration

### Project config (`.openclaw_memory.toml`)

```toml
[project]
name = "my-project"
description = "E-commerce platform"

[embedding]
provider = "openai"              # openai | ollama | local
model = "text-embedding-3-small" # optional, uses provider default

[privacy]
enabled = true
patterns = [
    'sk-[a-zA-Z0-9]{20,}',
    'ghp_[a-zA-Z0-9]{36}',
    'password\s*[:=]\s*\S+',
]

[search]
default_max_tokens = 1500
recency_half_life_days = 30
```
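The privacy filter reduces to scanning candidate content against the configured regexes before anything is written. A minimal sketch using the example patterns from the config above (the function name is illustrative):

```python
import re

# Example patterns from the [privacy] config section above
PATTERNS = [
    r'sk-[a-zA-Z0-9]{20,}',       # OpenAI-style API keys
    r'ghp_[a-zA-Z0-9]{36}',       # GitHub personal access tokens
    r'password\s*[:=]\s*\S+',     # inline password assignments
]

def contains_sensitive(text: str) -> bool:
    """Return True if any configured pattern matches, blocking the write."""
    return any(re.search(p, text) for p in PATTERNS)
```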

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `OPENCLAW_EMBEDDING_PROVIDER` | Embedding provider | `local` |
| `OPENCLAW_EMBEDDING_MODEL` | Model name | Provider default |
| `OPENCLAW_MEMORY_ROOT` | Override memory root path | Auto-detect |
| `OPENAI_API_KEY` | OpenAI API key | — |

## Embedding Providers

| Provider | Dimension | Dependency | Use case |
|----------|-----------|------------|----------|
| OpenAI `text-embedding-3-small` | 1536 | API key | Best accuracy |
| Ollama `nomic-embed-text` | 768 | Local Ollama | Offline / privacy |
| sentence-transformers `all-MiniLM-L6-v2` | 384 | Pure local | No external service |

## Design Decisions

This project was designed by analyzing four existing memory systems:

- **memsearch**: Markdown as source of truth, content-hash dedup, hybrid search
- **OpenViking**: Directory-based organization, L0/L1/L2 progressive loading
- **memU**: Reinforcement counting for importance, salience scoring formula
- **claude-mem**: Structured session summaries, project isolation, privacy tags

Key differences from all four:

- **Zero LLM dependency** in V1 (only embedding model needed)
- **Zero external service dependency** (pure Python + SQLite)
- **Smart writing pipeline** replaces LLM-based extraction with rule-based routing
- **Reinforcement + rules** hybrid for importance (data-driven + heuristic)
- **Token budget aware** retrieval (none of the four do this)

## License

Apache-2.0. See [LICENSE](LICENSE) for details.
