Coordinator memory for agent teams

Run many agents without losing the plot.

Zaxy Coordinate gives multi-agent projects a parent mission, isolated worker sessions, cited findings, conflict diagnostics, approval packets, and accepted merge-back into one replayable project history.

[Figure: Zaxy Coordinate graph showing worker sessions, findings, coordinator review, parent mission, and Eventloom source of truth]
1972 tests passed | 92.04% coverage | ruff clean | mypy clean | PyPI 1.0.1 | external verification requested

The wedge

Worker-local claims are not project truth.

Spawning agents is easy. The hard part is turning ten isolated investigations into one trustworthy state of work. Zaxy keeps worker sessions separate, records findings with evidence, marks stale and conflicting claims, and promotes only accepted findings into the parent mission session.
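To make "stale and conflicting claims" concrete, here is a minimal Python sketch of the kind of diagnostics described above. It is not Zaxy's implementation: the `Finding` fields, the `diagnose` function, and the logical-timestamp staleness rule are all assumptions chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical shape of a worker-local finding (not Zaxy's schema)."""
    worker: str
    subject: str                 # what the claim is about
    claim: str                   # the worker's conclusion
    evidence: tuple[str, ...]    # citations backing the claim
    observed_at: int             # logical time the evidence was gathered


def diagnose(findings, subject_changed_at):
    """Return (conflicts, stale, missing_evidence) for a batch of findings.

    subject_changed_at maps a subject to the logical time it last changed;
    a finding observed before that time is treated as stale (an assumption).
    """
    by_subject = defaultdict(list)
    for finding in findings:
        by_subject[finding.subject].append(finding)

    # Conflict: two or more distinct claims about the same subject.
    conflicts = {
        subject: group
        for subject, group in by_subject.items()
        if len({f.claim for f in group}) > 1
    }
    # Stale: evidence gathered before the subject last changed.
    stale = [
        f for f in findings
        if f.observed_at < subject_changed_at.get(f.subject, 0)
    ]
    # Missing evidence: a claim with no citation at all.
    missing_evidence = [f for f in findings if not f.evidence]
    return conflicts, stale, missing_evidence
```

None of these diagnostics changes project state; they only flag worker-local claims for review before anything is promoted.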

Parent mission

The coordinator owns the accepted project history, decisions, handoff, and Memory Checkout state.

Worker sessions

Agents investigate in isolated Eventloom logs, so exploration does not contaminate authoritative memory.

Approval packets

Human or coordinator-agent review accepts, rejects, defers, or promotes findings with cited provenance.
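The review flow above can be sketched as a small state machine. This is an illustrative model only, assuming that "promote" is the step that copies an already accepted finding into the parent mission; the class names, status strings, and the `decide` helper are hypothetical and are not Zaxy's API.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """Hypothetical worker finding awaiting review."""
    finding_id: str
    worker: str
    claim: str
    evidence: list[str]
    status: str = "proposed"  # proposed | accepted | rejected | deferred | promoted


@dataclass
class ParentMission:
    """Hypothetical parent mission holding only promoted findings."""
    objective: str
    accepted_history: list[dict] = field(default_factory=list)


def decide(parent: ParentMission, finding: Finding, decision: str, reviewer: str) -> Finding:
    """Apply one review decision; only 'promote' touches the parent mission."""
    if decision == "accept":
        if not finding.evidence:
            raise ValueError("accepting requires cited evidence")
        finding.status = "accepted"
    elif decision == "reject":
        finding.status = "rejected"
    elif decision == "defer":
        finding.status = "deferred"
    elif decision == "promote":
        if finding.status != "accepted":
            raise ValueError("only accepted findings can be promoted")
        finding.status = "promoted"
        # Record provenance alongside the claim so the history stays citable.
        parent.accepted_history.append({
            "finding_id": finding.finding_id,
            "worker": finding.worker,
            "claim": finding.claim,
            "evidence": list(finding.evidence),
            "reviewer": reviewer,
        })
    else:
        raise ValueError(f"unknown decision: {decision}")
    return finding
```

The design point is that a worker can never write to the parent directly: the parent history grows only through an explicit, attributed promotion.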

Architecture

Event-sourced coordination, not a shared scratchpad.

Mission: objective and parent state
Workers: isolated sessions
Findings: evidence and confidence
Review: conflicts, stale claims, approvals
Checkout: accepted cited context

Eventloom remains the append-only source of truth. The embedded Kuzu projection is the default local graph runtime; Neo4j, pgGraph, LatticeDB, and Pathlight are advanced integration tracks. MCP exposes both memory primitives and first-class coordination tools.
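The relationship between an append-only log and a rebuildable projection can be illustrated in a few lines of Python. This is a generic event-sourcing sketch, not Eventloom's format: the event `type` values, field names, and file name are assumptions, and the projection here is a plain dict rather than a graph.

```python
import json
import tempfile
from pathlib import Path


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as a JSON line; existing lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")


def rebuild_projection(log_path: Path) -> dict:
    """Replay the log from the start and derive the current state."""
    projection = {"findings": {}, "accepted": []}
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue
            event = json.loads(line)
            # Event type names below are hypothetical.
            if event["type"] == "finding.recorded":
                projection["findings"][event["finding_id"]] = event["claim"]
            elif event["type"] == "finding.accepted":
                projection["accepted"].append(event["finding_id"])
    return projection


def demo() -> dict:
    """Write three events, then rebuild the projection twice from the log."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "session.jsonl"
        append_event(log_path, {"type": "finding.recorded", "finding_id": "f1",
                                "claim": "token expiry is misconfigured"})
        append_event(log_path, {"type": "finding.recorded", "finding_id": "f2",
                                "claim": "cache is cold"})
        append_event(log_path, {"type": "finding.accepted", "finding_id": "f1"})
        first = rebuild_projection(log_path)
        second = rebuild_projection(log_path)
        assert first == second  # replay is deterministic
        return first
```

Because the projection is derived entirely from the log, it can be deleted and rebuilt at any time, which is why the log rather than the graph is the source of truth.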

Interfaces

CLI, MCP, dashboard, and adapters all speak the same coordination model.

zaxy coordinate brief

Replay mission state, accepted findings, conflicts, stale claims, missing evidence, and next decisions.

coordination_checkout

Return accepted parent state plus diagnostic worker-local findings without treating them as truth.

coordination_approval_packet

Generate a reviewable payload for accept/reject/defer/promote decisions.

memory_checkout

Keep the lower-level model-facing contract with answerability, required_action, current_citation_count, and memory_feedback guidance.
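As a rough illustration of how a caller might act on that contract, the sketch below gates an answer on three of the fields named above. Only the field names come from this page; the payload shape, the `"answerable"` value, and the `should_answer` helper are assumptions, so check the real schema before relying on any of it.

```python
def should_answer(checkout: dict, min_citations: int = 1) -> tuple[bool, str]:
    """Decide whether a model should answer from a checkout-like payload.

    The payload is a plain dict here; the real memory_checkout response
    may be structured differently.
    """
    required_action = checkout.get("required_action")
    citations = checkout.get("current_citation_count", 0)
    answerability = checkout.get("answerability")

    # An outstanding action takes priority over everything else.
    if required_action:
        return False, f"required_action pending: {required_action}"
    # Refuse to answer when nothing is cited.
    if citations < min_citations:
        return False, f"only {citations} citation(s), need {min_citations}"
    # "answerable" is a hypothetical value used for illustration.
    if answerability != "answerable":
        return False, f"answerability is {answerability!r}"
    return True, "ok"
```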

CoordinationAdapter

Dependency-light Python wrapper with LangGraph and CrewAI helper paths.

dashboard --enable-coordinate-review

Opt-in human review controls on top of replay-backed mission state; read-only remains the default.

Benchmark evidence

CoordinationBench is the coordination benchmark.

Zaxy Coordinate now has a frozen first-party adapter for the external CoordinationBench scorer. CoordinationBench v1 and v1-scale score accepted-finding precision and recall, stale rejection, duplicate consolidation, evidence grounding, and answerability. Zaxy scores 1.000 on those public reproducibility lanes, then lands at a 0.606 mean on public-derived holdout packs. Treat the holdout number as the real current baseline.
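For readers unfamiliar with the first two metrics, the following sketch shows the standard set-based definitions of precision and recall applied to accepted findings. It is not the CoordinationBench scorer; the function name and the finding identifiers are illustrative.

```python
def precision_recall(accepted: set[str], gold: set[str]) -> tuple[float, float]:
    """Standard precision and recall over sets of finding identifiers.

    accepted: findings the system promoted into the parent state.
    gold: findings the benchmark labels as correct to accept.
    """
    true_positives = len(accepted & gold)
    precision = true_positives / len(accepted) if accepted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Four findings accepted, three of them correct, five correct findings in total.
print(precision_recall({"a", "b", "c", "d"}, {"a", "b", "c", "e", "f"}))
# prints (0.75, 0.6)
```

Precision drops when unverified worker claims are accepted wholesale; recall drops when correct findings are never promoted.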

Holdout mean

0.606 overall

Frozen-adapter public-derived packs expose the next work: final answering, stale interpretation, and broader conflict detection.

Public lanes

1.000 overall

The v1 and v1-scale scores are first-party public-label reproducibility results, not representative leaderboard claims.

Competitors

disclosure only

Mem0, Agent Memory, and ActiveGraph adapters are listed as not_run until pinned same-harness runner manifests exist.

CoordinationBench Results

Frozen adapter result: on CoordinationBench v1, Zaxy scores 1.000 overall versus 0.200 for the flat-transcript baseline, while the public-derived holdouts show the current generalization gap.

| Lane | Overall | Accepted precision | Conflict recall | Stale rejection | Answerability |
| --- | --- | --- | --- | --- | --- |
| v1-audited | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| v1-scale | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
| public-derived holdout mean | 0.606 | 0.846 | 0.375 | 0.000 | 0.000 |
| source-weight baseline v1 | 0.411 | 0.683 | 0.800 | 0.333 | 0.600 |
| Markdown notes | 0.400 | 0.000 | 0.000 | 0.000 | 0.000 |
| BM25 worker logs | 0.333 | 0.000 | 0.000 | 0.000 | 0.000 |
| Flat transcript | 0.200 | 0.000 | 0.000 | 0.000 | 0.000 |

Install

Five-minute local smoke test, then expose coordination through MCP.

pipx install zaxy-memory
zaxy init
zaxy memory log --eventloom-path .eventloom --limit 5
zaxy memory bootstrap --eventloom-path .eventloom
zaxy doctor --eventloom-path .eventloom
zaxy coordinate start "ship auth refactor" --mission auth-main
zaxy coordinate worker create --mission auth-main --worker auth-api
zaxy coordinate assign --mission auth-main --worker auth-api "trace failures"
zaxy coordinate brief --mission auth-main
zaxy coordinate checkout --mission auth-main

What happens when you run init

Zaxy writes `.env.local`, records session genesis and heartbeat, checks graph posture, and prints the MCP command or config path.

What stays local

Session history lives in `.eventloom/` as append-only JSONL. The graph is a rebuildable projection.

How you prove it worked

`memory log`, `memory bootstrap`, `doctor`, and `hook-status` expose Last checkout, capture readiness, and stale-memory warnings.

Documentation

Start with Coordinate. Keep the rest as operator reference.

Operator and internals reference