Metadata-Version: 2.4
Name: neruva-mcp
Version: 0.25.2
Summary: MCP server for Neruva agent memory + reasoning substrate (Python mirror of @neruva/mcp 0.23 on npm). Pair with neruva-record 0.16+ for warm daemon + TTL cache + reflect-cadence: cache-hit p95 ~80ms (faster than Mem0). AUTO-PILOT: agent_route_intent_prompt (18-pattern router) + agent_reflect_prompt (self-curation) + auto-extract on records_ingest. Plus Records, 5-engine KG, federated recall, Pearl do-operator, HD analogy, CBR, ToM, EFE, rule induction, replay, code_kg_*.
Author-email: Clouthier Simulation Labs <info@neruva.io>
License: MIT
Project-URL: Homepage, https://neruva.io/claude-code/
Project-URL: Documentation, https://neruva.io/docs/
Project-URL: Source, https://github.com/CloutSimLabs/neruva
Keywords: mcp,model-context-protocol,neruva,agent-memory,agent-context,agentic-ai,knowledge-graph,causal-inference,pearl-do-operator,graph-rag,rag,vector-database,langchain,langgraph,crewai,claude-code,ai,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: mcp>=1.0.0
Requires-Dist: httpx>=0.27
Dynamic: license-file

# neruva-mcp

MCP server for [Neruva](https://neruva.io) — memory + reasoning substrate for AI agents. Knowledge graph (5 engines), Pearl do-operator, HD analogy, episodic CBR, deterministic replay. Drop into Claude Code / Cursor / Codex / Gemini CLI / Goose in one line.

> **For Claude Code users:** see [neruva.io/claude-code](https://neruva.io/claude-code) for the 30-second install + first-queries to try.

## The 5 benchmark crowns

| Benchmark | Score | Position |
|---|---|---|
| **LongMemEval** (end-to-end memory QA) | **93.3%** | top-4 globally, +22pp vs Zep |
| **DFAH compliance** (det × acc) | **100% / 88%** | first system to break the trade-off, 2.75× prior SOTA |
| **Hi-ToM** (higher-order theory-of-mind) | **84.58%** with **99.58%** byte-identical replay | 15× GPT-4 at order 4 · depth-1000 nested beliefs proven |
| **AMA-Bench** (causal + memory reasoning) | **63.3%** | new SOTA, +6pp over paper baseline |
| **Latency** (p95 cache-hit) | **80ms** | 2.5× faster than Mem0 |

Full breakdown: [neruva.io/benchmarks](https://neruva.io/benchmarks). **The only memory stack hitting world-class scores on memory, reasoning, AND audit determinism — simultaneously.**

## What's new in 0.25.0 — `qa_optimized` + `recency_first` graduated

The LongMemEval 93.3% recipe lifted into the substrate. `agent_recall` now accepts two opt-in flags (both default off, backwards compatible):

- **`mode="qa_optimized"`** — 3× over-fetches the candidate pool and BM25-RRF-fuses (k=60) over the text. Lifts records whose text contains specific entity tokens (brand names, amounts, dates, IDs) that pure semantic embedding under-weights. Worth **~+20pp** on memory-QA benchmarks.
- **`recency_first=true`** — re-sorts returned records by `ts` desc after ranking. Use for knowledge-update questions where the most recent statement wins over older contradictory ones.

Pair them for the strongest memory-QA recipe. Auto-router in `neruva-record` 0.19+ suggests both automatically when the intent classifier hits `recall_extended`.

## What's new in 0.24.0 — Code-graph bare-name resolution

`code_kg_module_of` and `code_kg_class_of` now accept short identifiers like `bind` (no module prefix). When the substrate sees an unqualified name, it falls back to a 1-hop `called_by` lookup to find the likely qualified target.

## What's new in 0.22.0 — Auto-pilot surface (the moat)

Two new tools complete the **auto-pilot** that makes the substrate
use itself. The agent automatically routes user intents to the right
cognitive primitive AND self-curates memory across sessions, without
the user telling it which Neruva tool to call.

- **`agent_route_intent_prompt`** — returns the canonical 18-pattern
  intent classifier (counterfactual / analogy / theory-of-mind / rule
  induction / causal / planning / recall / comparison / state /
  composition / decision / mistake + 6 code-graph navigation intents).
  Pair with `NERUVA_AUTO_ROUTE=1` in `neruva-record` for hands-free
  routing on every user prompt.
- **`agent_reflect_prompt`** — returns the canonical reflection prompt
  that extracts durable decisions / facts / mistakes / open questions
  from recent turns. Pair with `NERUVA_AUTO_REFLECT=1` in
  `neruva-record` for hands-free self-curation. Next session boots
  with curated context, not raw transcript.

Both endpoints are pattern-C: substrate emits a prompt, caller LLM
runs it in its normal turn, structured result pushed back via
existing tools. Substrate stays $0/call. Combined with the existing
`hd_kg_extraction_prompt` (Layer 1 — auto-extract on
`records_ingest`), the three layers form a complete auto-pilot.

See `neruva-record` v0.11+ for the SDK that wires these into Claude
Code's hook system automatically.

## What's new in 0.21.0 — code-graph MCP tools

- **5 new `code_kg_*` tools** for sub-ms structural code queries
  against KGs built locally via `neruva-record-code-index`:
  `code_kg_callees`, `code_kg_callers`, `code_kg_class_of`,
  `code_kg_module_of`, `code_kg_imports`. Thin wrappers over
  `hd_kg_query` with "Call this when..." routing nudges.
- **Tool-description routing nudges.** All high-leverage tools
  (records_*, agent_recall/context/remember, hd_kg_query, hd_analogy,
  hd_causal_query, agent_counterfactual_rollout, agent_model_belief(_add),
  agent_register_action, agent_plan_efe, agent_induce_rule,
  agent_extract_schema, agent_hierarchical_decode) lead with
  "Call this when..." so LLMs route into the right substrate
  primitive without explicit prompting.

## What's new in 0.18.3 — depth-unlimited theory of mind + 125× faster cleanup

- **Theory of mind is now depth-unlimited** (v0.5.4 substrate fix).
  Position-tagged at every chain index via non-commutative permutation
  binding. Inner-position swaps correctly reject; recursive self-
  reference (same agent at multiple chain positions) works natively.
- **Cleanup acceleration via FAISS-binary popcount.** OPB query stage 2
  uses SIMD popcount over sign-quantized atoms with deterministic
  float32 cosine rerank. Substantially faster on warm queries; replay
  bit-identical.
- **551× compression on stored OPB pages** (rank-12 SVD). Persistence
  blobs that were >100 MB now fit in under 1 MB at perfect recall on
  round-trip.

## The 9-level cognitive ladder — no LLM vendor ships rows 3-9

The substrate now exposes the full **9-level cognitive ladder**. Every primitive runs sub-100ms, deterministic from seed, behind one MCP install.

| # | Capability | MCP tool(s) | Frontier LLM equivalent |
|---:|---|---|---|
| 1 | **Vector retrieval** (OPB pages + spectral routing) | `records_query(engine="opb")` | Pinecone/Zep (Level 1 only) |
| 2 | **KG + Pearl do-operator + HD analogy + CBR** | `hd_kg_*` · `agent_causal_query` · `hd_analogy` · `hd_cbr_*` | nobody |
| 3 | **Theory of Mind (nested belief)** | `agent_model_belief_add` · `agent_model_belief` | hallucinates at depth |
| 4 | **Counterfactual rollouts ("what if k → a'?")** | `agent_counterfactual_rollout` | confabulates |
| 5 | **Schema lifting (analogical pattern matching)** | `agent_extract_schema` | needs fine-tuning |
| 6 | **Active Inference planning (Friston EFE)** | `agent_register_action` · `agent_plan_efe` | not a primitive |
| 7 | **Few-shot rule induction** | `agent_induce_rule` | fine-tune (many examples) |
| 8 | **Persistent rule storage** | `agent_persist_rule` · `agent_recall_rule` | re-feed demos every recall |
| 9 | **Continual learning, zero forgetting** | `agent_continual_train` · `agent_continual_predict` | catastrophic forgetting |
| + | **Hierarchical chunking (recursive L^K decode)** | `agent_hierarchical_add` · `agent_hierarchical_decode` | not a primitive |

**~80 tools** across Records, KG, Causal, Analogy, CBR, Blend, federated `agent_*`, the 9 cognitive primitives above, self-introspection.

### Why this is unique

Every primitive in rows 3-9 is a graduated, production-shipped engine. No published memory vendor offers more than rows 1-2. Substrate-augmented small LLMs can match frontier-class agentic capabilities at a fraction of the cost per recall.

## Install

```bash
# In Claude Code (any directory, user scope):
claude mcp add-json neruva '{"command":"npx","args":["-y","@neruva/mcp@latest"],"env":{"NERUVA_API_KEY":"nv_..."}}'
```

Or one-line install via npx for any MCP host:

```bash
npx -y @neruva/mcp@latest    # one-off
npm i -g @neruva/mcp         # then `neruva-mcp`
```

Get an API key at https://app.neruva.io (free tier, no credit card).

## Wire into a host

### Claude Code
```bash
claude mcp add-json neruva '{"command":"npx","args":["-y","@neruva/mcp@latest"],"env":{"NERUVA_API_KEY":"..."}}'
```

### Cursor (`~/.cursor/mcp.json`)
```json
{
  "mcpServers": {
    "neruva": {
      "command": "npx",
      "args": ["-y", "@neruva/mcp@latest"],
      "env": { "NERUVA_API_KEY": "..." }
    }
  }
}
```

### Codex (`~/.codex/config.toml`)
```toml
[mcp_servers.neruva]
command = "npx"
args = ["-y", "@neruva/mcp@latest"]
env = { NERUVA_API_KEY = "..." }
```

### Gemini CLI (`~/.gemini/settings.json`)
```json
{ "mcpServers": { "neruva": { "command": "npx", "args": ["-y", "@neruva/mcp@latest"], "env": { "NERUVA_API_KEY": "..." } } } }
```

### Goose (`~/.config/goose/config.yaml`)
```yaml
extensions:
  neruva:
    type: stdio
    cmd: npx
    args: ["-y", "@neruva/mcp@latest"]
    env:
      NERUVA_API_KEY: nv_...
```
For Goose auto-pilot (pattern-C route / reflect / extract via your LLM): `pip install neruva-goose`.

## The substrate, in one paragraph

Five layers, one API. **Records** = typed agentic events (decisions, mistakes, tool_calls, llm_turns; auto-embedded at D=1024). **Knowledge Graph** = mutable structured state across 5 engines, sub-ms cosine retrieval, matrix-power N-hop derive. **Causal** = Pearl's do-operator (`observation` vs `intervention` arithmetically distinct). **Analogy** = `a:b::c:?` in HD feature space. **Concept Blending** = provenance-preserving merge of multiple memories. **CBR** = factored episode store. The new **federated agent_*** layer (agent_remember / agent_recall / agent_context) routes across all substrates so a single call handles "where does X store, and how do I get it back?"

Deterministic from a seed. Replayable bit-exactly. Portable as `.neruva` containers — your data is yours.

## Three-line LangChain integration

```python
# pip install neruva-langchain
from neruva_langchain import NeruvaChatMessageHistory
history = NeruvaChatMessageHistory(namespace="user_alice")
# wire into any chain that takes BaseChatMessageHistory
```

Same pattern: `neruva-langgraph` (BaseCheckpointSaver + BaseStore), `neruva-crewai` (Storage interface + 3 memory flavors).

## Auto-record for Claude Code

```bash
pip install neruva-record && neruva-record-install
```

Every Claude Code session lands in your Neruva account: tool calls, chat turns, secrets-redacted client-side, queryable across sessions.

## Why use this over a vector DB or Zep

| | Vector DB | Zep | Neruva |
|---|---|---|---|
| KG engines | 0 | 1 | 5 |
| Causal queries (Pearl do-operator) | ❌ | ❌ | ✅ |
| Provable replay (deterministic snapshot/restore) | ❌ | ❌ | ✅ |
| Anomaly detection (quorum disagreement) | ❌ | ❌ | ✅ |
| Federated context (records+KG one call) | ❌ | partial | ✅ |
| Portable container | ❌ | ❌ | ✅ `.neruva` |
| p95 latency | varies | varies | <100ms |
| Cost per recall vs context-stuffing | varies | varies | dramatically lower |

## Auth

Set `NERUVA_API_KEY` in env. `NERUVA_URL` defaults to `https://api.neruva.io`.

Optional: `NERUVA_AUTO_RECORD=namespace[:ttl_days]` — every tool call this agent makes auto-records into the named records namespace. Fire-and-forget, never blocks or breaks the call.

## Update flow

The startup banner prints when a newer version is available:
```
[neruva-mcp] update available: you have 0.16.0, latest is 0.16.1.
```

If registered with `@neruva/mcp@latest`, a Claude Code restart auto-updates.

## License

MIT
