Git for LLM memory

Temporal memory for AI agents.

Zaxy turns agent work into inspectable memory: local Eventloom logs for audit, hash-linked provenance for replay, graph projections for connected reasoning, Memory Checkout for compact context, and MCP tools for retrieval, capture, and feedback.

agent-memory-run
01 Eventloom log: append-only JSONL + hash chain
02 Hybrid extraction: typed rules + fallback extractors
03 Neo4j temporal graph: NEXT_EVENT / PREVIOUS_EVENT provenance
04 Memory Checkout: active working set + cited context
05 MCP response: capabilities, checkout, capture, feedback
[Figure: Zaxy temporal knowledge graph showing connected agent memory entities and relationships]
1161 tests | 91.97% coverage | ruff clean | mypy clean | PyPI 0.3.0

Why Zaxy

Vector memory is useful, but it is not a source of truth.

Markdown files and chunk RAG flatten history. They can retrieve similar text, but they do not preserve causal chains, fact lifetimes, invalidations, or the evidence that led an agent to a decision. Zaxy keeps the raw event stream, seals it with hashes, and projects it into a graph built for multi-hop, temporal, and provenance-aware reasoning.

Relational

Entities and edges preserve actor, task, dependency, and provenance relationships.

Temporal

Facts are versioned, so retrieval can ask what is true now or what was true at a previous point in time.

Replayable

The Eventloom log remains the immutable record behind every graph projection and checkout.
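
To make the "append-only JSONL + hash chain" idea concrete, here is a minimal sketch of a hash-linked log. It is not Eventloom's actual record format: the field names (`seq`, `prev`, `payload`, `hash`), the all-zero genesis value, and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder hash that precedes the first record


def seal(prev_hash: str, payload: dict) -> str:
    """Hash the previous record's hash together with a canonical payload encoding."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log: list, payload: dict) -> None:
    """Append one JSONL line whose hash covers the line before it."""
    prev_hash = json.loads(log[-1])["hash"] if log else GENESIS
    record = {
        "seq": len(log),
        "prev": prev_hash,
        "payload": payload,
        "hash": seal(prev_hash, payload),
    }
    log.append(json.dumps(record, sort_keys=True))


def verify(log: list) -> bool:
    """Recompute every hash in order; an edited payload or reordered line breaks the chain."""
    prev_hash = GENESIS
    for line in log:
        record = json.loads(line)
        if record["prev"] != prev_hash:
            return False
        if record["hash"] != seal(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True


log = []
append_event(log, {"type": "task.started", "actor": "agent-1"})
append_event(log, {"type": "fact.recorded", "text": "deploy uses TLS"})
```

Because each record's hash depends on the one before it, changing an early payload invalidates that record and everything after it, which is what makes a replay from the log trustworthy.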

Architecture

One memory fabric, five layers.

Eventloom: append, verify, replay
Extract: entities, edges, embeddings
Neo4j: temporal graph + provenance path
Checkout: bounded cited context
MCP: agent-native tools

Pathlight traces memory operations without becoming the storage layer. Neo4j answers graph questions. Eventloom remains the audit trail. The graph projects sealed Eventloom paths through NEXT_EVENT and PREVIOUS_EVENT edges. MCP gives agent frameworks a stable interface over stdio or SSE.

MCP tools

Memory operations agents can call directly.

memory_capabilities

Tell the model what Zaxy can do, which capture paths are healthy, and when to check out memory.

memory_checkout

Return the current cited prompt state, active working set, provenance, Checkout diagnostics, and warnings.

memory_append

Append typed events, extract graph facts, and trace the operation.

memory_query

Fuse exact lookup, keyword search, vector similarity, and traversal.

memory_replay

Rebuild session history from Eventloom, optionally from a sequence number.

memory_invalidate

Close graph fact validity windows without deleting historical evidence.
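
The idea behind memory_invalidate, closing a validity window rather than deleting a fact, can be sketched in a few lines. This is an in-memory illustration only: the `Fact` class, its field names, and the half-open window convention are assumptions, not Zaxy's graph schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Fact:
    subject: str
    predicate: str
    value: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means the fact is still valid


def invalidate(facts: List[Fact], subject: str, predicate: str, at: datetime) -> None:
    """Close the open window for matching facts instead of deleting them."""
    for fact in facts:
        if fact.subject == subject and fact.predicate == predicate and fact.valid_to is None:
            fact.valid_to = at


def as_of(facts: List[Fact], at: datetime) -> List[Fact]:
    """Return facts whose window contains `at` (half-open: valid_from <= at < valid_to)."""
    return [
        f for f in facts
        if f.valid_from <= at and (f.valid_to is None or at < f.valid_to)
    ]


def day(d: int) -> datetime:
    """Helper for readable example timestamps in January 2025 (UTC)."""
    return datetime(2025, 1, d, tzinfo=timezone.utc)


facts = [Fact("service", "owner", "alice", day(1))]
invalidate(facts, "service", "owner", day(10))          # alice stops being the owner
facts.append(Fact("service", "owner", "bob", day(10)))  # bob takes over
```

Here `as_of(facts, day(5))` returns the alice fact and `as_of(facts, day(12))` returns the bob fact, while both records stay in the list as historical evidence.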

Model-facing checkout contract

Models consume answerability, required_action, current_citation_count, and feedback payloads directly. A checkout without current citations or with warnings tells the model to refresh memory or ask the user instead of guessing from stale context. The canonical fixture lives at docs/examples/memory-checkout-contract.json.

{
  "quality": {
    "answerability": "answer_from_memory",
    "confidence": 0.75,
    "required_action": null
  },
  "diagnostics": {
    "current_citation_count": 1,
    "warning_count": 0
  },
  "guidance": {
    "feedback": {
      "tool": "memory_feedback",
      "payloads": [{"feedback": "used"}]
    }
  }
}
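
As a rough illustration of how a client might act on this contract, the sketch below reads the fields shown in the fixture and decides what to do next. The function name `next_step` and the fallback label `refresh_or_ask_user` are hypothetical; only the field names and the `answer_from_memory` value come from the fixture above.

```python
def next_step(checkout: dict) -> str:
    """Decide what to do with a checkout payload.

    Assumed policy: an explicit required_action wins; otherwise a checkout with
    no current citations or with warnings should not be answered from memory.
    """
    quality = checkout.get("quality", {})
    diagnostics = checkout.get("diagnostics", {})

    if quality.get("required_action"):
        return quality["required_action"]
    if diagnostics.get("current_citation_count", 0) < 1:
        return "refresh_or_ask_user"  # hypothetical label, not a Zaxy value
    if diagnostics.get("warning_count", 0) > 0:
        return "refresh_or_ask_user"
    return quality.get("answerability", "refresh_or_ask_user")


# The payload from the fixture shown above.
checkout = {
    "quality": {
        "answerability": "answer_from_memory",
        "confidence": 0.75,
        "required_action": None,
    },
    "diagnostics": {"current_citation_count": 1, "warning_count": 0},
    "guidance": {
        "feedback": {"tool": "memory_feedback", "payloads": [{"feedback": "used"}]}
    },
}

decision = next_step(checkout)
```

With the fixture payload the decision is `answer_from_memory`; dropping `current_citation_count` to 0 or raising `warning_count` switches it to the fallback label.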

Retrieval

Hybrid search for agent context.

Zaxy routes queries through exact entity lookup, Neo4j full-text search, vector similarity, graph traversal, and verbatim Eventloom retrieval. Memory Checkout turns those lanes into compact cited context so agents receive connected facts instead of raw transcript piles. Checkout diagnostics show source lane mix, citation coverage, excluded superseded context, and feedback guidance. Temporal filters let callers retrieve only facts valid at a point in time.
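
The text above does not say how Zaxy combines its retrieval lanes, so the following sketch uses reciprocal rank fusion (RRF) as a stand-in to show what fusing several ranked lists can look like. The lane names and fact identifiers are made up for the example, and `k = 60` is simply the constant commonly used with RRF.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def reciprocal_rank_fusion(lanes: Dict[str, List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Score each item by the sum of 1 / (k + rank) over every lane that returned it."""
    scores = defaultdict(float)
    for ranked_ids in lanes.values():
        for rank, item_id in enumerate(ranked_ids, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken by id so the output is deterministic.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical per-lane results, best match first.
lanes = {
    "exact": ["fact:owner"],
    "fulltext": ["fact:owner", "fact:deploy"],
    "vector": ["fact:deploy", "fact:owner", "fact:tls"],
}

fused = reciprocal_rank_fusion(lanes)
```

Items that several lanes agree on rise to the top, so `fact:owner` outranks `fact:deploy`, which outranks `fact:tls`. A real checkout would additionally attach citations and apply temporal filters before building context.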

Projection backends

Eventloom stays canonical. Graph backends are projections.

Neo4j remains the default production graph projection. pgGraph is experimental behind PROJECTION_BACKEND=pggraph for teams evaluating Postgres-local graph traversal, pgvector ranking, and temporal retrieval without moving the source of truth out of Eventloom.

Default: Neo4j

Battle-tested graph projection for temporal entities, source citations, inferred edges, and traversal-backed retrieval.

Experimental: pgGraph

Available behind an explicit backend selector and guarded by same-harness quality, citation, latency, and operations comparisons.

Recoverable projection

Rebuild a selected backend from Eventloom with zaxy reproject --projection-backend pggraph --reset-projection.
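
Why a projection is recoverable can be shown with a toy fold over the log: start from an empty graph and apply every event in order. The event shapes below (`entity` and `edge` records with `id`, `name`, `src`, `rel`, `dst`) are invented for this sketch and are not the Eventloom schema or the internals of zaxy reproject.

```python
import json
from typing import Dict, Iterable


def reproject(jsonl_lines: Iterable[str]) -> Dict[str, object]:
    """Build a fresh in-memory projection from the log; the log itself is never modified."""
    nodes = {}
    edges = []
    for line in jsonl_lines:
        event = json.loads(line)
        if event["type"] == "entity":
            nodes[event["id"]] = event["name"]
        elif event["type"] == "edge":
            edges.append((event["src"], event["rel"], event["dst"]))
        # Unknown event types are skipped so older projections tolerate newer logs.
    return {"nodes": nodes, "edges": edges}


# Hypothetical log contents, one JSON object per line.
log_lines = [
    json.dumps({"type": "entity", "id": "a1", "name": "agent-1"}),
    json.dumps({"type": "entity", "id": "t1", "name": "deploy task"}),
    json.dumps({"type": "edge", "src": "a1", "rel": "WORKED_ON", "dst": "t1"}),
    json.dumps({"type": "heartbeat"}),
]

first = reproject(log_lines)
second = reproject(log_lines)  # simulate a reset followed by a second rebuild
```

Because the fold is deterministic, discarding a projection and rebuilding it from the same log yields the same graph, which is the property that lets the graph backend be treated as disposable.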

Deterministic capture

Useful by default, explicit when packet capture costs money.

local-codex

The Codex preset renders the official MCP install command, writes local session JSONL capture config, and supports a managed zaxy capture start watcher.

Observer hooks

Stable hook sinks record lifecycle, command, file-edit, tool-call, and transcript observations when the client supports them.

Packet analyzer

Provider packet capture remains opt-in for diagnostics and high-fidelity audit because it can consume API quota.

Runtime debugging

Read-only local dashboard for memory inspection.

Run zaxy dashboard --host 127.0.0.1 --port 8765 to inspect the active workspace without giving the browser a mutation path. The dashboard resolves one workspace by default and shows Eventloom sessions, graph projection status, recent events, graph neighborhoods, Checkout diagnostics, Last bootstrap, Last checkout, Last feedback, and stale-memory warnings.

Read-only by design

API routes reject writes; the dashboard is for runtime memory/debugging, not editing facts or invalidating history.

Local scope

The header shows workspace, Eventloom path, domain, session, read-only mode, and graph connection status.

Useful when degraded

If Neo4j is unreachable, Eventloom still renders a fallback provenance graph so sessions are not blank.

Production posture

Built for auditable local-first deployments.

Secret files

Docker/Kubernetes-style *_FILE config keeps production secrets out of plaintext env files.

Remote MCP auth

SSE requests require bearer auth and are scoped by per-client session headers.

TLS-ready Neo4j

Production compose and certificate scripts support encrypted Bolt connections.
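
The `*_FILE` convention mentioned under Secret files is a common Docker pattern: instead of putting a secret in `NAME`, the environment carries `NAME_FILE`, a path to a file that holds it. A minimal reader might look like the sketch below. The variable name `NEO4J_PASSWORD` and the rule that the file takes precedence are assumptions for illustration, not documented Zaxy behaviour.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def read_setting(name: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the value from NAME_FILE if set, otherwise fall back to NAME."""
    env = os.environ if env is None else env
    file_path = env.get(f"{name}_FILE")
    if file_path:
        # Secret files usually end with a newline, so strip surrounding whitespace.
        return Path(file_path).read_text(encoding="utf-8").strip()
    return env.get(name)


# Demonstration with a temporary file standing in for a mounted secret.
with tempfile.TemporaryDirectory() as tmp:
    secret_file = Path(tmp) / "neo4j_password"
    secret_file.write_text("s3cret\n", encoding="utf-8")
    from_file = read_setting("NEO4J_PASSWORD", {"NEO4J_PASSWORD_FILE": str(secret_file)})

from_env = read_setting("NEO4J_PASSWORD", {"NEO4J_PASSWORD": "dev-only"})
missing = read_setting("NEO4J_PASSWORD", {})
```

Passing `env` explicitly keeps the helper easy to test; in a real process it would default to `os.environ`.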

Benchmark evidence

Reproducible guardrails, not just a headline score.

Full 500-question LongMemEval-compatible guardrail: the current run reports Zaxy checkout mean 0.724, Answer@5 0.628, R@5 0.972, citation coverage 1.000, p95 1472.11 ms, and p99 2652.55 ms. The 100-question headline remains archived evidence, not the only release gate. Competitor numbers below are public external disclosures with different harnesses or metrics; they are not same-harness results.

The older 650-paired-query context suite still matters as fixture evidence: with OpenAI text-embedding-3-small, Zaxy reached a 1.000 mean score, a mean delta of +0.480 over the vector and markdown+vector baselines. The archived 100-question LongMemEval-compatible report also keeps the same-harness BM25 baseline visible at 0.840 R@5.
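
For readers unfamiliar with the R@5 figures quoted here, the sketch below computes one common definition of recall at k: the share of questions for which at least one relevant item appears in the top k retrieved results. The source does not state which definition the Zaxy harness uses, so treat this as an illustration of the metric rather than a reproduction of the benchmark; the data is invented.

```python
from typing import List, Sequence, Set


def recall_at_k(retrieved: Sequence[List[str]], relevant: Sequence[Set[str]], k: int = 5) -> float:
    """Fraction of questions with at least one relevant id among the top-k retrieved ids."""
    if len(retrieved) != len(relevant):
        raise ValueError("retrieved and relevant must have one entry per question")
    if not retrieved:
        return 0.0
    hits = sum(
        1 for ranked, gold in zip(retrieved, relevant)
        if gold & set(ranked[:k])
    )
    return hits / len(retrieved)


# Two hypothetical questions: the first finds its evidence, the second does not.
retrieved = [["a", "b", "c"], ["x", "y", "z"]]
relevant = [{"c"}, {"q"}]

score = recall_at_k(retrieved, relevant, k=5)
```

In this toy case `score` is 0.5. Under this definition, a reported R@5 of 0.972 over 500 questions would correspond to 486 questions with a relevant item in the top five.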

Full 500-question guardrail

0.972 R@5

Current hash-backed checkout run with 1.000 citation coverage and archived latency.

Archived headline

0.970 mean

100-question LongMemEval-compatible run with 0.950 Answer@5 and same-harness BM25 baseline.

pgGraph evaluation

0.958 R@5

pgGraph remains experimental; same-harness checkout control currently reports 0.714 mean and 1.000 citation coverage.

Comparison Evidence

Local rows are archived Zaxy harness results. Competitor rows are public disclosures with different harnesses or metrics.

| System | Metric | Reported result | Evidence type |
| --- | --- | --- | --- |
| Zaxy | Full 500-question LongMemEval-compatible R@5 | 0.972 (Mean 0.724; Answer@5 0.628; citation coverage 1.000) | Same harness; archived report |
| Zaxy | 100-question headline remains archived evidence | 0.970 (Answer@5 0.950; R@5 1.000) | Same harness; archived reports |
| BM25 baseline | Legacy full-set LongMemEval-compatible R@5 | 0.770 (Same 500-question hash workload baseline) | Same harness; archived report |
| pgGraph checkout | Backend-evaluation R@5 | 0.958 (Mean 0.714; Answer@5 0.632; experimental backend) | Same harness; archived report |
| MemPalace | LongMemEval R@5 | 96.6% raw; 98.4% held-out hybrid | External; public disclosure |
| Agent Memory | LongMemEval-S R@5 | 95.2% | External; public disclosure |
| Mem0 | LOCOMO accuracy | +26% (Accuracy over OpenAI Memory) | External; different metric |

Install

Five-minute local smoke test, then expose memory through MCP.

pipx install zaxy-memory
zaxy init . --domain my-project --preset local-codex --capture start --infra check
zaxy memory log --eventloom-path .eventloom --session-id my-project-default --limit 5
zaxy memory bootstrap --eventloom-path .eventloom --session-id my-project-default
zaxy doctor --eventloom-path .eventloom
Data lives in .eventloom/
MCP config is generated or printed by zaxy init
Graph posture is explicit: check first, start when needed
Production uses Docker secrets and deployment preflight
scripts/release-check.sh --root .

What happens when you run init

Zaxy writes `.env.local`, records session genesis and heartbeat, checks graph posture, and prints the MCP command or config path for the selected client.

What stays local

Session history lives in `.eventloom/` as append-only JSONL. Neo4j or pgGraph can be rebuilt from that log because the graph is a projection.

How you prove it worked

`zaxy memory log` shows recent events, `zaxy memory bootstrap` shows model-facing startup guidance, and `zaxy doctor` reports missing capture or backend posture.

Documentation

Full operator and integrator documentation.