Metadata-Version: 2.4
Name: neruva-mcp
Version: 0.27.0
Summary: MCP server for Neruva -- reliability infrastructure for production AI agents. Agent stops forgetting / drifting / repeating mistakes. v0.27: qbound KG engine (47.5% EM on MemoryAgentBench CR-MH vs <7% wall -- conflict resolution where docs override LLM priors). 6 KG engines, 7-layer federated recall, Pearl do-operator, HD analogy, CBR, ToM, code_kg_*. Free tier, no card.
Author-email: Clouthier Simulation Labs <info@neruva.io>
License: MIT
Project-URL: Homepage, https://neruva.io/claude-code/
Project-URL: Documentation, https://neruva.io/docs/
Project-URL: Source, https://github.com/CloutSimLabs/neruva
Keywords: mcp,model-context-protocol,neruva,agent-memory,agent-context,agentic-ai,knowledge-graph,causal-inference,pearl-do-operator,graph-rag,rag,vector-database,langchain,langgraph,crewai,claude-code,ai,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: mcp>=1.0.0
Requires-Dist: httpx>=0.27
Dynamic: license-file

# neruva-mcp

MCP server for [Neruva](https://neruva.io) — **reliability infrastructure for production AI agents**. One MCP install. Your agent stops forgetting, stops drifting, stops repeating mistakes — every failure replayable bit-for-bit.

Backed by a 6-engine knowledge graph (incl. `qbound`: **47.5% EM on MemoryAgentBench CR-MH vs <7% published wall** — measurably outperforms Mem0, MemGPT, LangMem on document-vs-LLM-prior conflict resolution), Pearl do-operator for causal queries, HD analogy, episodic CBR, deterministic snapshot/restore.

Drop into Claude Code / Cursor / Codex / Gemini CLI / Goose in one line. **Free tier, no card.**

> **For Claude Code users:** see [neruva.io/claude-code](https://neruva.io/claude-code) for the 30-second install + first-queries to try.

## The 5 benchmark crowns

| Benchmark | Score | Position |
|---|---|---|
| **LongMemEval** (end-to-end memory QA) | **93.3%** | top-4 globally, +22pp vs Zep |
| **DFAH compliance** (det × acc) | **100% / 88%** | first system to break the trade-off, 2.75× prior SOTA |
| **Hi-ToM** (higher-order theory-of-mind) | **84.58%** with **99.58%** byte-identical replay | 15× GPT-4 at order 4 · depth-1000 nested beliefs proven |
| **AMA-Bench** (causal + memory reasoning) | **63.3%** | new SOTA, +6pp over paper baseline |
| **Latency** (p95 cache-hit) | **80ms** | 2.5× faster than Mem0 |

Full breakdown: [neruva.io/benchmarks](https://neruva.io/benchmarks). **The only memory stack hitting world-class scores on memory, reasoning, AND audit determinism — simultaneously.**

## What's new in 0.27.0 — `qbound` engine for conflict resolution

New 6th KG engine: **`qbound`** (question-conditioned binding) for workloads where document-stated facts must override the LLM's training-time priors. Measured at **47.5% EM on MemoryAgentBench CR-MH** (multi-hop conflict resolution) — vs the published <7% wall that Mem0, MemGPT, and LangMem all sit at on the same subset.

Use it when your agent reads CRM records, compliance docs, or internal policies that contradict what an LLM "knows" from pretraining:

```python
await session.call_tool(
    "hd_kg_add_facts",
    {
        "kg": "compliance",
        "engine": "qbound",  # <-- new
        "facts": [
            {"subject": "PolicyX", "relation": "current_version", "object": "v2024.3"},
            # ...
        ],
    },
)
```

Drop-in: same shard format as `opb`, same query surface. Backward-compat with all existing tools.

## What's new in 0.26.0 — 7-layer federated recall

`agent_recall_full` lands. One question, seven buckets returned in parallel (~150ms p95):

- **records** — semantic embedding search (existing)
- **kg** — entity-overlap + cosine across knowledge graphs (existing)
- **rules** — HD signature cosine across stored rule libraries (new)
- **cbr** — structural-distance nearest-neighbour across case episode stores (new)
- **scm** — variable-name match across causal models (new)
- **tom** — name-resolved chain lookup across theory-of-mind belief stores (new)
- **continual** — predictive next-token recall against trained K-gram learners (new)

Records + KG work universally. The other five layers light up when you pass opt-in text labels at ingest time (`chain_names`/`prop_name` on `agent_model_belief_add`, `token_names` on `agent_continual_train`, `var_names` on `hd_causal_add_worlds`, `axis_vocab_names` on `hd_cbr_add_episodes`). Empty buckets return a `hint` field with the exact param to pass next time.

Per-layer prefix scoping (`rules_prefix`, `cbr_prefix`, `scm_prefix`, `tom_prefix`, `continual_prefix`, `kg_prefix`) bounds wall time on tenants with many resources. Existing `agent_recall` (records + KG only) unchanged.

## What's new in 0.25.0 — `qa_optimized` + `recency_first` graduated

The LongMemEval 93.3% recipe lifted into the substrate. `agent_recall` now accepts two opt-in flags (both default off, backwards compatible):

- **`mode="qa_optimized"`** — 3× over-fetches the candidate pool and BM25-RRF-fuses (k=60) over the text. Lifts records whose text contains specific entity tokens (brand names, amounts, dates, IDs) that pure semantic embedding under-weights. Worth **~+20pp** on memory-QA benchmarks.
- **`recency_first=true`** — re-sorts returned records by `ts` desc after ranking. Use for knowledge-update questions where the most recent statement wins over older contradictory ones.

Pair them for the strongest memory-QA recipe. Auto-router in `neruva-record` 0.19+ suggests both automatically when the intent classifier hits `recall_extended`.

## What's new in 0.24.0 — Code-graph bare-name resolution

`code_kg_module_of` and `code_kg_class_of` now accept short identifiers like `bind` (no module prefix). When the substrate sees an unqualified name, it falls back to a 1-hop `called_by` lookup to find the likely qualified target.

## What's new in 0.22.0 — Auto-pilot surface (the moat)

Two new tools complete the **auto-pilot** that makes the substrate
use itself. The agent automatically routes user intents to the right
cognitive primitive AND self-curates memory across sessions, without
the user telling it which Neruva tool to call.

- **`agent_route_intent_prompt`** — returns the canonical 18-pattern
  intent classifier (counterfactual / analogy / theory-of-mind / rule
  induction / causal / planning / recall / comparison / state /
  composition / decision / mistake + 6 code-graph navigation intents).
  Pair with `NERUVA_AUTO_ROUTE=1` in `neruva-record` for hands-free
  routing on every user prompt.
- **`agent_reflect_prompt`** — returns the canonical reflection prompt
  that extracts durable decisions / facts / mistakes / open questions
  from recent turns. Pair with `NERUVA_AUTO_REFLECT=1` in
  `neruva-record` for hands-free self-curation. Next session boots
  with curated context, not raw transcript.

Both endpoints are pattern-C: substrate emits a prompt, caller LLM
runs it in its normal turn, structured result pushed back via
existing tools. Substrate stays $0/call. Combined with the existing
`hd_kg_extraction_prompt` (Layer 1 — auto-extract on
`records_ingest`), the three layers form a complete auto-pilot.

See `neruva-record` v0.11+ for the SDK that wires these into Claude
Code's hook system automatically.

## What's new in 0.21.0 — code-graph MCP tools

- **5 new `code_kg_*` tools** for sub-ms structural code queries
  against KGs built locally via `neruva-record-code-index`:
  `code_kg_callees`, `code_kg_callers`, `code_kg_class_of`,
  `code_kg_module_of`, `code_kg_imports`. Thin wrappers over
  `hd_kg_query` with "Call this when..." routing nudges.
- **Tool-description routing nudges.** All high-leverage tools
  (records_*, agent_recall/context/remember, hd_kg_query, hd_analogy,
  hd_causal_query, agent_counterfactual_rollout, agent_model_belief(_add),
  agent_register_action, agent_plan_efe, agent_induce_rule,
  agent_extract_schema, agent_hierarchical_decode) lead with
  "Call this when..." so LLMs route into the right substrate
  primitive without explicit prompting.

## What's new in 0.18.3 — depth-unlimited theory of mind + 125× faster cleanup

- **Theory of mind is now depth-unlimited** (v0.5.4 substrate fix).
  Position-tagged at every chain index via non-commutative permutation
  binding. Inner-position swaps correctly reject; recursive self-
  reference (same agent at multiple chain positions) works natively.
- **Cleanup acceleration via FAISS-binary popcount.** OPB query stage 2
  uses SIMD popcount over sign-quantized atoms with deterministic
  float32 cosine rerank. Substantially faster on warm queries; replay
  bit-identical.
- **551× compression on stored OPB pages** (rank-12 SVD). Persistence
  blobs that were >100 MB now fit in under 1 MB at perfect recall on
  round-trip.

## The 9-level cognitive ladder — no LLM vendor ships rows 3-9

The substrate now exposes the full **9-level cognitive ladder**. Every primitive runs sub-100ms, deterministic from seed, behind one MCP install.

| # | Capability | MCP tool(s) | Frontier LLM equivalent |
|---:|---|---|---|
| 1 | **Vector retrieval** (OPB pages + spectral routing) | `records_query(engine="opb")` | Pinecone/Zep (Level 1 only) |
| 2 | **KG + Pearl do-operator + HD analogy + CBR** | `hd_kg_*` · `agent_causal_query` · `hd_analogy` · `hd_cbr_*` | nobody |
| 3 | **Theory of Mind (nested belief)** | `agent_model_belief_add` · `agent_model_belief` | hallucinates at depth |
| 4 | **Counterfactual rollouts ("what if k → a'?")** | `agent_counterfactual_rollout` | confabulates |
| 5 | **Schema lifting (analogical pattern matching)** | `agent_extract_schema` | needs fine-tuning |
| 6 | **Active Inference planning (Friston EFE)** | `agent_register_action` · `agent_plan_efe` | not a primitive |
| 7 | **Few-shot rule induction** | `agent_induce_rule` | fine-tune (many examples) |
| 8 | **Persistent rule storage** | `agent_persist_rule` · `agent_recall_rule` | re-feed demos every recall |
| 9 | **Continual learning, zero forgetting** | `agent_continual_train` · `agent_continual_predict` | catastrophic forgetting |
| + | **Hierarchical chunking (recursive L^K decode)** | `agent_hierarchical_add` · `agent_hierarchical_decode` | not a primitive |

**~80 tools** across Records, KG, Causal, Analogy, CBR, Blend, federated `agent_*`, the 9 cognitive primitives above, self-introspection.

### Why this is unique

Every primitive in rows 3-9 is a graduated, production-shipped engine. No published memory vendor offers more than rows 1-2. Substrate-augmented small LLMs can match frontier-class agentic capabilities at a fraction of the cost per recall.

## Install

```bash
# In Claude Code (any directory, user scope):
claude mcp add-json neruva '{"command":"npx","args":["-y","@neruva/mcp@latest"],"env":{"NERUVA_API_KEY":"nv_..."}}'
```

Or one-line install via npx for any MCP host:

```bash
npx -y @neruva/mcp@latest    # one-off
npm i -g @neruva/mcp         # then `neruva-mcp`
```

Get an API key at https://app.neruva.io (free tier, no credit card).

## Wire into a host

### Claude Code
```bash
claude mcp add-json neruva '{"command":"npx","args":["-y","@neruva/mcp@latest"],"env":{"NERUVA_API_KEY":"..."}}'
```

### Cursor (`~/.cursor/mcp.json`)
```json
{
  "mcpServers": {
    "neruva": {
      "command": "npx",
      "args": ["-y", "@neruva/mcp@latest"],
      "env": { "NERUVA_API_KEY": "..." }
    }
  }
}
```

### Codex (`~/.codex/config.toml`)
```toml
[mcp_servers.neruva]
command = "npx"
args = ["-y", "@neruva/mcp@latest"]
env = { NERUVA_API_KEY = "..." }
```

### Gemini CLI (`~/.gemini/settings.json`)
```json
{ "mcpServers": { "neruva": { "command": "npx", "args": ["-y", "@neruva/mcp@latest"], "env": { "NERUVA_API_KEY": "..." } } } }
```

### Goose (`~/.config/goose/config.yaml`)
```yaml
extensions:
  neruva:
    type: stdio
    cmd: npx
    args: ["-y", "@neruva/mcp@latest"]
    env:
      NERUVA_API_KEY: nv_...
```
For Goose auto-pilot (pattern-C route / reflect / extract via your LLM): `pip install neruva-goose`.

## The substrate, in one paragraph

Five layers, one API. **Records** = typed agentic events (decisions, mistakes, tool_calls, llm_turns; auto-embedded at D=1024). **Knowledge Graph** = mutable structured state across 6 engines (hadamard, opb, qbound, multishard, quorum, feature_bundle), sub-ms cosine retrieval, matrix-power N-hop derive. **Causal** = Pearl's do-operator (`observation` vs `intervention` arithmetically distinct). **Analogy** = `a:b::c:?` in HD feature space. **Concept Blending** = provenance-preserving merge of multiple memories. **CBR** = factored episode store. The new **federated agent_*** layer (agent_remember / agent_recall / agent_context) routes across all substrates so a single call handles "where does X store, and how do I get it back?"

Deterministic from a seed. Replayable bit-exactly. Portable as `.neruva` containers — your data is yours.

## Three-line LangChain integration

```python
# pip install neruva-langchain
from neruva_langchain import NeruvaChatMessageHistory
history = NeruvaChatMessageHistory(namespace="user_alice")
# wire into any chain that takes BaseChatMessageHistory
```

Same pattern: `neruva-langgraph` (BaseCheckpointSaver + BaseStore), `neruva-crewai` (Storage interface + 3 memory flavors).

## Auto-record for Claude Code

```bash
pip install neruva-record && neruva-record-install
```

Every Claude Code session lands in your Neruva account: tool calls, chat turns, secrets-redacted client-side, queryable across sessions.

## Why use this over a vector DB or Zep

| | Vector DB | Zep | Mem0 | Neruva |
|---|---|---|---|---|
| KG engines | 0 | 1 | 1 | **6** |
| Causal queries (Pearl do-operator) | ❌ | ❌ | ❌ | ✅ |
| Provable replay (deterministic snapshot/restore) | ❌ | ❌ | ❌ | ✅ |
| Anomaly detection (quorum disagreement) | ❌ | ❌ | ❌ | ✅ |
| Conflict-resolution multi-hop (MAB-CR-MH) | n/a | <7% | <7% (21% CR agg) | **47.5%** (qbound) |
| Federated context (records+KG one call) | ❌ | partial | partial | ✅ |
| Portable container | ❌ | ❌ | ❌ | ✅ `.neruva` |
| p95 latency | varies | varies | varies | <100ms |
| Cost per recall vs context-stuffing | varies | varies | varies | dramatically lower |

### KG engine selector

Pick `engine` on first `hd_kg_add_facts` call to a new KG:

| engine | best for | storage |
|---|---|---|
| `hadamard` (default) | small KGs (<10k facts), latency-critical | 32 KB/shard |
| `opb` | large KGs (>10k facts), matrix-power N-hop derivation | 256 MB/shard |
| `qbound` | conflict-resolution where documents override LLM priors (47.5% on MAB-CR-MH) | similar to opb |
| `multishard` | very large KGs, sharded across K=16 hadamard buckets | scales linearly |
| `quorum` | adversarial/anomaly detection via n-shard quorum | n × hadamard |
| `feature_bundle` | typed-feature workloads (color, size, role) | 128 features × D |

## Auth

Set `NERUVA_API_KEY` in env. `NERUVA_URL` defaults to `https://api.neruva.io`.

Optional: `NERUVA_AUTO_RECORD=namespace[:ttl_days]` — every tool call this agent makes auto-records into the named records namespace. Fire-and-forget, never blocks or breaks the call.

## Update flow

The startup banner prints when a newer version is available:
```
[neruva-mcp] update available: you have 0.16.0, latest is 0.16.1.
```

If registered with `@neruva/mcp@latest`, a Claude Code restart auto-updates.

## License

MIT
