Metadata-Version: 2.4
Name: neruva-mcp
Version: 0.28.1
Summary: MCP server for Neruva -- reliability infrastructure for production AI agents. Agent stops forgetting / drifting / repeating mistakes. v0.27: qbound KG engine for conflict resolution where docs override LLM priors. 6 KG engines, 7-layer federated recall, counterfactual queries, HD analogy, CBR, belief tracking, code_kg_*. Free tier, no card.
Author-email: Clouthier Simulation Labs <info@neruva.io>
License: MIT
Project-URL: Homepage, https://neruva.io/claude-code/
Project-URL: Documentation, https://neruva.io/docs/
Project-URL: Source, https://github.com/CloutSimLabs/neruva
Keywords: mcp,model-context-protocol,neruva,agent-memory,agent-context,agentic-ai,knowledge-graph,causal-inference,pearl-do-operator,graph-rag,rag,vector-database,langchain,langgraph,crewai,claude-code,ai,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: mcp>=1.0.0
Requires-Dist: httpx>=0.27
Dynamic: license-file

# neruva-mcp

MCP server for [Neruva](https://neruva.io) — **reliability infrastructure for production AI agents**. One MCP install. Your agent stops forgetting, stops drifting, stops repeating mistakes — every failure replayable bit-for-bit.

Backed by a 6-engine knowledge graph (incl. `qbound` for document-vs-LLM-prior conflict resolution), counterfactual queries, HD analogy, episodic CBR, deterministic snapshot/restore.

Drop into Claude Code / Cursor / Codex / Gemini CLI / Goose in one line. **Free tier, no card.**

> **For Claude Code users:** see [neruva.io/claude-code](https://neruva.io/claude-code) for the 30-second install + first-queries to try.

## Benchmarks

| Test | Score | What it means |
|---|---|---|
| **Does the agent learn from mistakes?** | **+34 points** | Same Claude Haiku model, no retraining. With Neruva running the agent climbs from 84% to 93% across 2000 tasks. Without Neruva it stays flat at 59%. Three independent runs. |
| **Memory QA across long histories** (LongMemEval) | **93.3%** | top-4 globally, +22pp vs Zep |
| **Compliance + audit determinism** (DFAH) | **100% / 88%** | first system to do both at once, 2.75× prior best |
| **Latency** (p95 cache-hit) | **80ms** | 2.5× faster than Mem0 |

Full breakdown: [neruva.io/benchmarks](https://neruva.io/benchmarks). **The only memory stack hitting world-class scores on memory, reasoning, AND audit determinism — simultaneously.**

## What's new in 0.27.0 — `qbound` engine for conflict resolution

New 6th KG engine: **`qbound`** (question-conditioned binding) for workloads where document-stated facts must override the LLM's training-time priors.

Use it when your agent reads CRM records, compliance docs, or internal policies that contradict what an LLM "knows" from pretraining:

```python
await session.call_tool(
    "hd_kg_add_facts",
    {
        "kg": "compliance",
        "engine": "qbound",  # <-- new
        "facts": [
            {"subject": "PolicyX", "relation": "current_version", "object": "v2024.3"},
            # ...
        ],
    },
)
```

Drop-in: same shard format as `opb`, same query surface. Backward-compat with all existing tools.

## What's new in 0.26.0 — 7-layer federated recall

`agent_recall_full` lands. One question, seven buckets returned in parallel (~150ms p95):

- **records** — semantic embedding search (existing)
- **kg** — entity-overlap + cosine across knowledge graphs (existing)
- **rules** — HD signature cosine across stored rule libraries (new)
- **cbr** — structural-distance nearest-neighbour across case episode stores (new)
- **scm** — variable-name match across causal models (new)
- **tom** — name-resolved chain lookup across theory-of-mind belief stores (new)
- **continual** — predictive next-token recall against trained K-gram learners (new)

Records + KG work universally. The other five layers light up when you pass opt-in text labels at ingest time (`chain_names`/`prop_name` on `agent_model_belief_add`, `token_names` on `agent_continual_train`, `var_names` on `hd_causal_add_worlds`, `axis_vocab_names` on `hd_cbr_add_episodes`). Empty buckets return a `hint` field with the exact param to pass next time.

Per-layer prefix scoping (`rules_prefix`, `cbr_prefix`, `scm_prefix`, `tom_prefix`, `continual_prefix`, `kg_prefix`) bounds wall time on tenants with many resources. Existing `agent_recall` (records + KG only) unchanged.

## What's new in 0.25.0 — `qa_optimized` + `recency_first` graduated

The LongMemEval 93.3% recipe lifted into the substrate. `agent_recall` now accepts two opt-in flags (both default off, backwards compatible):

- **`mode="qa_optimized"`** — 3× over-fetches the candidate pool and BM25-RRF-fuses (k=60) over the text. Lifts records whose text contains specific entity tokens (brand names, amounts, dates, IDs) that pure semantic embedding under-weights. Worth **~+20pp** on memory-QA benchmarks.
- **`recency_first=true`** — re-sorts returned records by `ts` desc after ranking. Use for knowledge-update questions where the most recent statement wins over older contradictory ones.

Pair them for the strongest memory-QA recipe. Auto-router in `neruva-record` 0.19+ suggests both automatically when the intent classifier hits `recall_extended`.

## What's new in 0.24.0 — Code-graph bare-name resolution

`code_kg_module_of` and `code_kg_class_of` now accept short identifiers like `bind` (no module prefix). When the substrate sees an unqualified name, it falls back to a 1-hop `called_by` lookup to find the likely qualified target.

## What's new in 0.22.0 — Auto-pilot surface (the moat)

Two new tools complete the **auto-pilot** that makes the substrate
use itself. The agent automatically routes user intents to the right
cognitive primitive AND self-curates memory across sessions, without
the user telling it which Neruva tool to call.

- **`agent_route_intent_prompt`** — returns the canonical 18-pattern
  intent classifier (counterfactual / analogy / theory-of-mind / rule
  induction / causal / planning / recall / comparison / state /
  composition / decision / mistake + 6 code-graph navigation intents).
  Pair with `NERUVA_AUTO_ROUTE=1` in `neruva-record` for hands-free
  routing on every user prompt.
- **`agent_reflect_prompt`** — returns the canonical reflection prompt
  that extracts durable decisions / facts / mistakes / open questions
  from recent turns. Pair with `NERUVA_AUTO_REFLECT=1` in
  `neruva-record` for hands-free self-curation. Next session boots
  with curated context, not raw transcript.

Both endpoints are pattern-C: substrate emits a prompt, caller LLM
runs it in its normal turn, structured result pushed back via
existing tools. Substrate stays $0/call. Combined with the existing
`hd_kg_extraction_prompt` (Layer 1 — auto-extract on
`records_ingest`), the three layers form a complete auto-pilot.

See `neruva-record` v0.11+ for the SDK that wires these into Claude
Code's hook system automatically.

## What's new in 0.21.0 — code-graph MCP tools

- **5 new `code_kg_*` tools** for sub-ms structural code queries
  against KGs built locally via `neruva-record-code-index`:
  `code_kg_callees`, `code_kg_callers`, `code_kg_class_of`,
  `code_kg_module_of`, `code_kg_imports`. Thin wrappers over
  `hd_kg_query` with "Call this when..." routing nudges.
- **Tool-description routing nudges.** All high-leverage tools
  (records_*, agent_recall/context/remember, hd_kg_query, hd_analogy,
  hd_causal_query, agent_counterfactual_rollout, agent_model_belief(_add),
  agent_register_action, agent_plan_efe, agent_induce_rule,
  agent_extract_schema, agent_hierarchical_decode) lead with
  "Call this when..." so LLMs route into the right substrate
  primitive without explicit prompting.

## What's new in 0.18.3 — depth-unlimited theory of mind + 125× faster cleanup

- **Theory of mind is now depth-unlimited** (v0.5.4 substrate fix).
  Position-tagged at every chain index via non-commutative permutation
  binding. Inner-position swaps correctly reject; recursive self-
  reference (same agent at multiple chain positions) works natively.
- **Cleanup acceleration via FAISS-binary popcount.** OPB query stage 2
  uses SIMD popcount over sign-quantized atoms with deterministic
  float32 cosine rerank. Substantially faster on warm queries; replay
  bit-identical.
- **551× compression on stored OPB pages** (rank-12 SVD). Persistence
  blobs that were >100 MB now fit in under 1 MB at perfect recall on
  round-trip.

## The 9-level cognitive ladder — no LLM vendor ships rows 3-9

The substrate now exposes the full **9-level cognitive ladder**. Every primitive runs sub-100ms, deterministic from seed, behind one MCP install.

| # | Capability | MCP tool(s) | Frontier LLM equivalent |
|---:|---|---|---|
| 1 | **Vector retrieval** (OPB pages + spectral routing) | `records_query(engine="opb")` | Pinecone/Zep (Level 1 only) |
| 2 | **KG + Pearl do-operator + HD analogy + CBR** | `hd_kg_*` · `agent_causal_query` · `hd_analogy` · `hd_cbr_*` | nobody |
| 3 | **Theory of Mind (nested belief)** | `agent_model_belief_add` · `agent_model_belief` | hallucinates at depth |
| 4 | **Counterfactual rollouts ("what if k → a'?")** | `agent_counterfactual_rollout` | confabulates |
| 5 | **Schema lifting (analogical pattern matching)** | `agent_extract_schema` | needs fine-tuning |
| 6 | **Active Inference planning (Friston EFE)** | `agent_register_action` · `agent_plan_efe` | not a primitive |
| 7 | **Few-shot rule induction** | `agent_induce_rule` | fine-tune (many examples) |
| 8 | **Persistent rule storage** | `agent_persist_rule` · `agent_recall_rule` | re-feed demos every recall |
| 9 | **Continual learning, zero forgetting** | `agent_continual_train` · `agent_continual_predict` | catastrophic forgetting |
| + | **Hierarchical chunking (recursive L^K decode)** | `agent_hierarchical_add` · `agent_hierarchical_decode` | not a primitive |

**~80 tools** across Records, KG, Causal, Analogy, CBR, Blend, federated `agent_*`, the 9 cognitive primitives above, self-introspection.

### Why this is unique

Every primitive in rows 3-9 is a graduated, production-shipped engine. No published memory vendor offers more than rows 1-2. Substrate-augmented small LLMs can match frontier-class agentic capabilities at a fraction of the cost per recall.

## Install

```bash
# In Claude Code (any directory, user scope):
claude mcp add-json neruva '{"command":"npx","args":["-y","@neruva/mcp@latest"],"env":{"NERUVA_API_KEY":"nv_..."}}'
```

Or one-line install via npx for any MCP host:

```bash
npx -y @neruva/mcp@latest    # one-off
npm i -g @neruva/mcp         # then `neruva-mcp`
```

Get an API key at https://app.neruva.io (free tier, no credit card).

## Wire into a host

### Claude Code
```bash
claude mcp add-json neruva '{"command":"npx","args":["-y","@neruva/mcp@latest"],"env":{"NERUVA_API_KEY":"..."}}'
```

### Cursor (`~/.cursor/mcp.json`)
```json
{
  "mcpServers": {
    "neruva": {
      "command": "npx",
      "args": ["-y", "@neruva/mcp@latest"],
      "env": { "NERUVA_API_KEY": "..." }
    }
  }
}
```

### Codex (`~/.codex/config.toml`)
```toml
[mcp_servers.neruva]
command = "npx"
args = ["-y", "@neruva/mcp@latest"]
env = { NERUVA_API_KEY = "..." }
```

### Gemini CLI (`~/.gemini/settings.json`)
```json
{ "mcpServers": { "neruva": { "command": "npx", "args": ["-y", "@neruva/mcp@latest"], "env": { "NERUVA_API_KEY": "..." } } } }
```

### Goose (`~/.config/goose/config.yaml`)
```yaml
extensions:
  neruva:
    type: stdio
    cmd: npx
    args: ["-y", "@neruva/mcp@latest"]
    env:
      NERUVA_API_KEY: nv_...
```
For Goose auto-pilot (pattern-C route / reflect / extract via your LLM): `pip install neruva-goose`.

## The substrate, in one paragraph

Five layers, one API. **Records** = typed agentic events (decisions, mistakes, tool_calls, llm_turns; auto-embedded at D=1024). **Knowledge Graph** = mutable structured state across 6 engines (hadamard, opb, qbound, multishard, quorum, feature_bundle), sub-ms cosine retrieval, matrix-power N-hop derive. **Causal** = Pearl's do-operator (`observation` vs `intervention` arithmetically distinct). **Analogy** = `a:b::c:?` in HD feature space. **Concept Blending** = provenance-preserving merge of multiple memories. **CBR** = factored episode store. The new **federated agent_*** layer (agent_remember / agent_recall / agent_context) routes across all substrates so a single call handles "where does X store, and how do I get it back?"

Deterministic from a seed. Replayable bit-exactly. Portable as `.neruva` containers — your data is yours.

## Three-line LangChain integration

```python
# pip install neruva-langchain
from neruva_langchain import NeruvaChatMessageHistory
history = NeruvaChatMessageHistory(namespace="user_alice")
# wire into any chain that takes BaseChatMessageHistory
```

Same pattern: `neruva-langgraph` (BaseCheckpointSaver + BaseStore), `neruva-crewai` (Storage interface + 3 memory flavors).

## Auto-record for Claude Code

```bash
pip install neruva-record && neruva-record-install
```

Every Claude Code session lands in your Neruva account: tool calls, chat turns, secrets-redacted client-side, queryable across sessions.

## Why use this over a vector DB or Zep

| | Vector DB | Zep | Mem0 | Neruva |
|---|---|---|---|---|
| KG engines | 0 | 1 | 1 | **6** |
| Counterfactual queries | ❌ | ❌ | ❌ | ✅ |
| Provable replay (deterministic snapshot/restore) | ❌ | ❌ | ❌ | ✅ |
| Anomaly detection (quorum disagreement) | ❌ | ❌ | ❌ | ✅ |
| Federated context (records+KG one call) | ❌ | partial | partial | ✅ |
| Portable container | ❌ | ❌ | ❌ | ✅ `.neruva` |
| p95 latency | varies | varies | varies | <100ms |
| Cost per recall vs context-stuffing | varies | varies | varies | dramatically lower |

### KG engine selector

Pick `engine` on first `hd_kg_add_facts` call to a new KG:

| engine | best for | storage |
|---|---|---|
| `hadamard` (default) | small KGs (<10k facts), latency-critical | 32 KB/shard |
| `opb` | large KGs (>10k facts), matrix-power N-hop derivation | 256 MB/shard |
| `qbound` | conflict-resolution where documents override LLM priors | similar to opb |
| `multishard` | very large KGs, sharded across K=16 hadamard buckets | scales linearly |
| `quorum` | adversarial/anomaly detection via n-shard quorum | n × hadamard |
| `feature_bundle` | typed-feature workloads (color, size, role) | 128 features × D |

## Auth

Set `NERUVA_API_KEY` in env. `NERUVA_URL` defaults to `https://api.neruva.io`.

Optional: `NERUVA_AUTO_RECORD=namespace[:ttl_days]` — every tool call this agent makes auto-records into the named records namespace. Fire-and-forget, never blocks or breaks the call.

## Update flow

The startup banner prints when a newer version is available:
```
[neruva-mcp] update available: you have 0.16.0, latest is 0.16.1.
```

If registered with `@neruva/mcp@latest`, a Claude Code restart auto-updates.

## License

MIT
