Metadata-Version: 2.4
Name: memir
Version: 0.3.3
Summary: Local memory for coding agents: deterministic zero-token writes, a first-class failure guard (never repeat a mistake), and a token-efficient briefing. Local CPU semantic recall — no API keys, no cloud.
Author: Jabir Al Nahian
License: MIT
Project-URL: Homepage, https://github.com/jabir-al-nahian/memir
Project-URL: Repository, https://github.com/jabir-al-nahian/memir
Project-URL: Issues, https://github.com/jabir-al-nahian/memir/issues
Project-URL: Changelog, https://github.com/jabir-al-nahian/memir/blob/main/CHANGELOG.md
Keywords: agent,memory,llm,coding-agent,mcp,model-context-protocol,failure-memory,context-window,local-first,rag,sqlite,embeddings,model2vec
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE.md
Requires-Dist: model2vec>=0.3
Requires-Dist: numpy>=1.21
Provides-Extra: nli
Requires-Dist: torch>=2.2; extra == "nli"
Requires-Dist: transformers>=4.40; extra == "nli"
Provides-Extra: rerank
Requires-Dist: torch>=2.2; extra == "rerank"
Requires-Dist: transformers>=4.40; extra == "rerank"
Provides-Extra: contextual
Requires-Dist: torch>=2.2; extra == "contextual"
Requires-Dist: transformers>=4.40; extra == "contextual"
Provides-Extra: ml
Requires-Dist: torch>=2.2; extra == "ml"
Requires-Dist: transformers>=4.40; extra == "ml"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"
Requires-Dist: twine>=5.0; extra == "dev"
Dynamic: license-file

# 🧠 Memir

**The memoir your coding agent writes for itself.**
Long-term memory for coding agents that *doesn't* eat your context window.
Local-first. Zero API keys. And it **never lets the agent repeat a mistake.**

[![CI](https://github.com/jabir-al-nahian/memir/actions/workflows/ci.yml/badge.svg)](https://github.com/jabir-al-nahian/memir/actions/workflows/ci.yml)
![Python](https://img.shields.io/badge/python-3.10%2B-blue)
![License](https://img.shields.io/badge/license-MIT-green)
![Runs on CPU](https://img.shields.io/badge/runs%20on-CPU%20only-brightgreen)

> *Memir* = **mem**ory + **memoir** + **Mímir**, the Norse keeper of memory and
> wisdom. Your agent keeps a memoir; Memir is where it lives.

---

## The problem

Every coding agent forgets. Switch chats, hit the context limit, start a new
session — and it re-asks what it already knew, re-decides what you already
decided, and **repeats the exact mistake it made an hour ago.**

The common "fix" is to stuff the whole history back into the prompt every turn.
That burns thousands of tokens per request and *still* overflows as the project
grows.

## The idea

Memir stores three kinds of memory in a tiny local SQLite file:

| kind | what it captures |
|------|------------------|
| **fact** | durable truths about the project (endpoints, conventions, constraints) |
| **decision** | a choice **and the reason** for it, so it isn't relitigated |
| **failure** | what was **tried**, **why it failed**, and the **lesson** — a first-class memory |

Then it hands the agent a **token-budgeted briefing** at the start of a session
and a **failure guard** it can check *before* trying something.

> **Writing a memory costs 0 tokens and 0 API calls**: it's a local,
> sub-millisecond insert. Compare that to memory frameworks that run an LLM
> extraction call on *every* `add`.
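
To make the "local insert" idea concrete, here is a minimal sketch of a three-kind memory table with write-time dedup, using only the standard library. It is an illustration, not Memir's actual schema: the table layout, the `write_memory` helper, and the normalise-then-hash dedup rule are all assumptions made for this example.

```python
import hashlib
import sqlite3

# Hypothetical schema for illustration only; Memir's real tables may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS memories (
    id     INTEGER PRIMARY KEY,
    kind   TEXT NOT NULL,
    text   TEXT NOT NULL,
    reason TEXT,            -- why a decision was made / why an attempt failed
    lesson TEXT,            -- what to do instead (failures only)
    digest TEXT NOT NULL UNIQUE
);
"""

KINDS = {"fact", "decision", "failure"}


def _digest(kind: str, text: str) -> str:
    """Hash the case-folded, whitespace-collapsed text so trivial restatements collide."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(f"{kind}:{normalized}".encode("utf-8")).hexdigest()


def write_memory(conn, kind, text, reason=None, lesson=None) -> bool:
    """Insert one memory; return False if an equivalent row already exists."""
    if kind not in KINDS:
        raise ValueError(f"unknown memory kind: {kind!r}")
    cur = conn.execute(
        "INSERT OR IGNORE INTO memories (kind, text, reason, lesson, digest) "
        "VALUES (?, ?, ?, ?, ?)",
        (kind, text, reason, lesson, _digest(kind, text)),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 when the UNIQUE digest caused the row to be ignored


conn = sqlite3.connect(":memory:")  # a file path would persist across sessions
conn.executescript(SCHEMA)
write_memory(conn, "fact", "The billing job MUST run in UTC.")
```

The write path is a single parameterised `INSERT`, so there is no model call and nothing to bill. Exact-hash dedup only catches near-verbatim repeats; merging genuine paraphrases would need the semantic layer.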

---

## Why it's different

- **First-class FAILURE memory.** Most memory tools remember facts. Memir
  remembers *mistakes* — what failed, why, and what to do instead — and blocks
  the repeat. That's the load-bearing feature for a coding agent.
- **0-token, deterministic writes.** No LLM in the capture path. Nothing to bill,
  nothing to rate-limit, nothing to go down.
- **Local semantic recall by default.** Ships with a retrieval-tuned, numpy-only
  embedding model ([model2vec](https://github.com/MinishLab/model2vec)
  `potion-retrieval-32M`) that runs in ~1 ms on a **CPU** — no GPU, no torch, no
  API. Paraphrase matching works out of the box, and recall falls back
  gracefully to keyword matching if the model can't be loaded.
- **Anti-bloat.** Write-time dedup plus a *safe* `consolidate()` that only merges
  genuine restatements — it will **never** collapse two distinct facts.
- **Bounded context cost.** A naive "replay everything" agent's prompt grows
  without bound; the briefing here stays inside a token budget you set.
- **Plugs into any agent via MCP.** One stdio server works with Cursor, Claude
  Desktop, Cline, VS Code, Windsurf, …
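
The "bounded context cost" point can be sketched as a greedy packer that adds memories until a token budget is spent. This is a hedged illustration of the idea, not Memir's briefing algorithm: the `Memory` dataclass, the failures-first priority order, and the 4-characters-per-token estimate are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    kind: str   # "fact", "decision" or "failure"
    text: str


# Assumed ordering: failures first, since repeating a mistake costs the most.
PRIORITY = {"failure": 0, "decision": 1, "fact": 2}


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token); a real tokenizer would differ."""
    return max(1, (len(text) + 3) // 4)


def build_briefing(memories, budget_tokens: int) -> str:
    """Greedily pack the highest-priority memories that fit the budget."""
    lines, used = [], 0
    for m in sorted(memories, key=lambda m: PRIORITY.get(m.kind, 99)):
        line = f"[{m.kind}] {m.text}"
        cost = estimate_tokens(line)
        if used + cost > budget_tokens:
            continue  # too big for what is left; a shorter memory may still fit
        lines.append(line)
        used += cost
    return "\n".join(lines)


notes = [
    Memory("fact", "Deploys go out via the blue/green pipeline only."),
    Memory("failure", "One-shot POST to /v1/ingest is dropped above 256KB; chunk it."),
]
print(build_briefing(notes, budget_tokens=300))
```

However many memories accumulate, the output never exceeds the estimated budget, which is the property that keeps per-session cost flat as the project grows.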

---

## Install

```bash
pip install memir
```

The base install is light — `model2vec` + `numpy` only (no torch, no GPU, no
API key). Optional **local CPU** accuracy features are opt-in extras:

```bash
pip install "memir[rerank]"      # cross-encoder reranker (precision second stage)
pip install "memir[nli]"         # NLI reasoner (contradiction detection)
pip install "memir[contextual]"  # contextual BGE encoder (stronger multi-hop)
pip install "memir[ml]"          # all of the above
```

Warm the model cache once so later runs work fully offline:

```bash
memir setup      # downloads the CPU model, initialises the store
memir start      # starts the MCP server (stdio) for your editor/agent
```

---

## Quickstart (Python)

```python
from memir import Memir

brain = Memir("my-project")          # local model loads automatically (CPU)

# remember things (0 tokens, sub-ms, no API)
brain.remember_fact("The billing job MUST run in UTC; our partner reports in UTC.")
brain.record_decision("Use SQLite FTS5 for recall.", reason="100x faster than scanning.")

# record a failure so it's never repeated
brain.record_failure(
    attempt="POSTed the whole batch to /v1/ingest in one request.",
    reason="The gateway silently drops bodies >256KB and returns an empty 200.",
    lesson="Chunk uploads under 256KB and verify each ack id.",
)

# BEFORE trying something, check the failure guard
if hits := brain.check_failure("send the full batch to /v1/ingest at once"):
    print("⛔", hits[0].lesson)   # -> Chunk uploads under 256KB and verify each ack id.

# start a fresh session fully informed, within a token budget
print(brain.briefing(budget_tokens=300))
```
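
One way to wire the guard into an agent loop is a small wrapper that checks before acting and records on failure. The sketch below relies only on the `check_failure` and `record_failure` calls shown above; `guarded_run`, `BlockedByFailureMemory`, and the `lesson_hint` parameter are hypothetical names introduced for this example, not part of Memir.

```python
from typing import Any, Callable, Protocol, Sequence


class FailureGuard(Protocol):
    """Structural type for anything exposing the two calls used in the quickstart."""

    def check_failure(self, attempt: str) -> Sequence[Any]: ...
    def record_failure(self, *, attempt: str, reason: str, lesson: str) -> Any: ...


class BlockedByFailureMemory(RuntimeError):
    """Raised when a similar attempt is already recorded as a failure."""


def guarded_run(
    brain: FailureGuard,
    description: str,
    action: Callable[[], Any],
    lesson_hint: str = "Investigate before retrying.",
) -> Any:
    """Consult the failure guard, run the action, and record any new failure."""
    hits = brain.check_failure(description)
    if hits:
        # Surface the stored lesson instead of repeating the mistake.
        raise BlockedByFailureMemory(hits[0].lesson)
    try:
        return action()
    except Exception as exc:
        brain.record_failure(
            attempt=description,
            reason=f"{type(exc).__name__}: {exc}",
            lesson=lesson_hint,
        )
        raise  # the caller still sees the original error
```

With the `brain` from the quickstart, `guarded_run(brain, "send the full batch to /v1/ingest at once", upload)` would be expected to raise `BlockedByFailureMemory` before `upload` (a placeholder callable) ever runs.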

### Auto-capture from real errors

```python
# parses the traceback, decides it's non-obvious, stores it as a failure:
brain.capture_error("RuntimeError: gateway returned 200 with empty body — silently dropped")

# a textbook typo is judged self-evident and skipped (no bloat):
brain.capture_error("NameError: name 'foo' is not defined")   # -> None
```
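
To feed real exceptions into `capture_error` without sprinkling `try`/`except` everywhere, a context manager is one option. This is a sketch under the assumption that `capture_error` accepts a full formatted traceback string (the examples above pass only the final error line); `capture_failures` is a hypothetical helper, not a Memir API.

```python
import traceback
from contextlib import contextmanager


@contextmanager
def capture_failures(brain):
    """Forward any exception raised inside the block to brain.capture_error."""
    try:
        yield
    except Exception:
        # format_exc() returns the active traceback as a single string.
        brain.capture_error(traceback.format_exc())
        raise  # never swallow the error; memory capture is a side effect only
```

Usage would look like `with capture_failures(brain): run_migration()`, where `run_migration` stands in for any step that might fail. Whether a given error is stored or skipped as self-evident remains Memir's decision.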

---

## Command line

```bash
memir remember "Deploys go out via the blue/green pipeline only."
memir decision "Adopt SQLite WAL" --why "concurrent reads during writes"
memir failure  "Dropped the prod table" --why "ran migration w/o backup" \
               --lesson "snapshot before every migration"
memir recall   "how do we deploy?"
memir check    "run the migration now"      # failure guard
memir briefing --budget 300
memir stats
memir doctor                                 # environment + MCP config check
```

---

## Use it from your editor (MCP)

Memir ships a Model Context Protocol server (stdio JSON-RPC).

**Cursor / Windsurf / Claude Desktop** — add to your MCP config:

```json
{
  "mcpServers": {
    "memir": {
      "command": "memir",
      "args": ["start"],
      "env": { "MEMIR_PROJECT": "my-project", "MEMIR_DB": ".memir/memir.db" }
    }
  }
}
```

**VS Code** (`.vscode/mcp.json`):

```json
{ "servers": { "memir": { "command": "memir", "args": ["start"] } } }
```

Tools exposed: `remember_fact`, `record_decision`, `record_failure`, `recall`,
`check_failure`, `briefing`, `capture_error`, `stats`, `consolidate`.
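
For a sense of what travels over stdio, the sketch below builds the JSON-RPC 2.0 `tools/call` request an MCP client sends to invoke one of these tools. The envelope follows the MCP specification; the `attempt` argument name is an assumption, since the tool schemas are not listed here, and a real client must complete the `initialize` handshake first.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialise one MCP tools/call request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the payload must stay on one line.
    return json.dumps(message)


# "attempt" is a hypothetical argument key used for illustration.
print(tool_call_request(1, "check_failure", {"attempt": "run the migration now"}))
```

In practice your editor's MCP client builds these messages for you; the sketch is mainly useful when debugging the server by hand.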

Set `MEMIR_EMBED=0` to force keyword-only mode (skips the model entirely).
Override the model with `MEMIR_MODEL=minishlab/potion-base-8M` (30 MB, faster).

---

## Benchmarks

Memir is measured against real memory backends (mem0, cognee, graphiti, letta)
and an oracle upper bound under a single uniform LLM judge. The honest headline:
**Memir matches the top accuracy tier while injecting ~10× fewer memory tokens
per turn**, locally and with zero-token writes. See
[docs/BENCHMARKS.md](docs/BENCHMARKS.md) for the full tables, the competitor
head-to-heads, and the documented limits (relational multi-hop at large scale).

---

## License

[MIT](LICENSE.md), a permissive open-source license. Free to use, modify, and
distribute, including commercially, as long as the copyright and license notice
is kept with the code.
