Plan: elfmem Init, Doctor & Setup
Overview
Introduce a first-run setup experience across the CLI, MCP, and library layers:
- elfmem init: create ~/.elfmem/, generate config.yaml from code defaults, optionally seed SELF
- elfmem doctor: diagnose setup gaps (directory, config, DB, SELF blocks, API keys)
- elfmem_setup MCP tool: agent-native SELF seeding, callable by Claude Code mid-conversation
- Bug fix: ~ path expansion missing in config.py:from_yaml(), causing FileNotFoundError
No new dependencies. No new files (except tests). Minimal changes to existing modules.
Problem Statement
Current onboarding flow (broken):
1. User installs elfmem
2. Creates ~/.elfmem/ manually
3. Creates config.yaml manually (copy from docs)
4. Runs: elfmem serve --db ~/.elfmem/agent.db --config ~/.elfmem/config.yaml
→ If ~ not expanded: FileNotFoundError ← BUG
5. SELF frame is empty; no discovery path exists
6. status() says "Memory is empty. Call learn()." — no SELF hint
Target onboarding flow (fixed):
1. User installs elfmem
2. Runs: elfmem init --self "I am Claude Code, an AI engineering assistant"
→ Creates ~/.elfmem/
→ Generates config.yaml from ElfmemConfig() code defaults
→ Seeds SELF block via remember(content, tags=["self"])
3. Runs: elfmem serve (ELFMEM_DB/ELFMEM_CONFIG set) ← works
4. SELF frame has identity blocks ← works
5. status() says "Memory is empty. Seed your identity: elfmem init --self '...'" ← discoverable
Before/After Analysis
Before: Control Flow for Config Loading
elfmem serve --db ~/.elfmem/agent.db --config ~/.elfmem/config.yaml
→ cli.serve()
→ mcp.main(db_path="~/.elfmem/agent.db", config_path="~/.elfmem/config.yaml")
→ SmartMemory.open(db_path, config="~/.elfmem/config.yaml")
→ MemorySystem.from_config(db_path, "~/.elfmem/config.yaml")
→ _resolve_config("~/.elfmem/config.yaml")
→ ElfmemConfig.from_yaml("~/.elfmem/config.yaml")
→ open("~/.elfmem/config.yaml") ← FileNotFoundError: ~ not expanded
After: Control Flow for Config Loading
elfmem serve --db ~/.elfmem/agent.db --config ~/.elfmem/config.yaml
→ cli.serve()
→ mcp.main(db_path="~/.elfmem/agent.db", config_path="~/.elfmem/config.yaml")
→ SmartMemory.open(db_path, config="~/.elfmem/config.yaml")
→ MemorySystem.from_config(db_path, "~/.elfmem/config.yaml")
→ _resolve_config("~/.elfmem/config.yaml")
→ ElfmemConfig.from_yaml("~/.elfmem/config.yaml")
→ open(Path("~/.elfmem/config.yaml").expanduser()) ← works
Before: Control Flow for elfmem init (does not exist)
After: Control Flow for elfmem init
elfmem init --self "I am Claude Code"
→ cli.init(self_description="I am Claude Code", config_dir="~/.elfmem")
→ _ensure_config_dir("~/.elfmem") # mkdir -p
→ _write_default_config("~/.elfmem/config.yaml") # generate from ElfmemConfig()
→ SmartMemory.managed(db_path, config) # open DB (created if absent)
→ mem.remember("I am Claude Code", tags=["self"]) # seed SELF
→ echo: "✓ Created ~/.elfmem/config.yaml"
"✓ Created ~/.elfmem/agent.db"
"✓ SELF block stored (block_id: a1b2c3d4)"
Before: Control Flow for elfmem_setup (does not exist)
After: Control Flow for elfmem_setup
Claude: [calls elfmem_setup("I am Claude Code", values=["clean code"])]
→ mcp.elfmem_setup(identity="I am Claude Code", values=["clean code"])
→ _mem().remember("I am Claude Code", tags=["self"])
→ _mem().remember("clean code", tags=["self", "value"])
→ {"status": "setup_complete", "blocks_created": 2, "blocks": [...]}
Before: _derive_health() suggestion (empty state)
if active_count == 0 and inbox_count == 0:
    return "good", "Memory is empty. Call learn() to add knowledge."
After: _derive_health() suggestion (empty state)
if active_count == 0 and inbox_count == 0:
    return "good", "Memory is empty. Seed your identity: elfmem init --self '...'"
Files to Create/Modify
| File | Action | Scope |
|---|---|---|
| src/elfmem/config.py | Modify | 1-line bug fix: add .expanduser() in from_yaml(); add render_default_config() (Step 3) |
| src/elfmem/db/queries.py | Modify | Add count_self_blocks(conn) -> int helper |
| src/elfmem/api.py | Modify | Update empty-state suggestion in _derive_health() |
| src/elfmem/cli.py | Modify | Add init and doctor commands |
| src/elfmem/mcp.py | Modify | Add elfmem_setup tool |
| src/elfmem/guide.py | Modify | Add setup guide entry; update OVERVIEW |
| tests/test_init.py | Create | Tests for init, doctor, and elfmem_setup |
Implementation Steps
Step 0 - Bug Fix: ~ Path Expansion in config.py
File: src/elfmem/config.py
Problem: from_yaml() uses open(path) which does not expand ~. Any path like
~/.elfmem/config.yaml raises FileNotFoundError.
Before (config.py:179):
@classmethod
def from_yaml(cls, path: str) -> ElfmemConfig:
    with open(path) as f:
        data = yaml.safe_load(f)
    data = {k: v for k, v in (data or {}).items() if v is not None}
    return cls.model_validate(data)
After (one line change):
@classmethod
def from_yaml(cls, path: str) -> ElfmemConfig:
    with open(Path(path).expanduser()) as f:  # ← .expanduser() added
        data = yaml.safe_load(f)
    data = {k: v for k, v in (data or {}).items() if v is not None}
    return cls.model_validate(data)
Note: Path is already imported at the top of config.py. This is a true one-line fix.
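As a quick illustration of why the change is non-breaking, the sketch below exercises `Path.expanduser()` on the three kinds of path the plan mentions. The helper name `resolve_config_path` is hypothetical and is not part of elfmem; it only wraps the standard-library call.

```python
from pathlib import Path


def resolve_config_path(path: str) -> Path:
    """Expand a leading ~ to the user's home directory (hypothetical helper)."""
    return Path(path).expanduser()


if __name__ == "__main__":
    # A leading ~ is replaced with the current user's home directory.
    print(resolve_config_path("~/.elfmem/config.yaml"))
    # Absolute and relative paths pass through unchanged.
    print(resolve_config_path("/etc/elfmem/config.yaml"))
    print(resolve_config_path("config.yaml"))
```

Plain `open()` never performs this expansion, because `~` is a shell convention rather than a filesystem feature. The path reaches `from_yaml()` unexpanded when it comes from an environment variable, a quoted argument, or a hard-coded default.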
Step 1 - DB Helper: count_self_blocks in db/queries.py
File: src/elfmem/db/queries.py
Purpose: The doctor command needs to check whether any SELF blocks exist without
creating a full SmartMemory session. A direct SQL COUNT is the minimal approach.
Add after existing query functions:
async def count_self_blocks(conn: AsyncConnection) -> int:
    """Count active blocks that carry the 'self' tag or any 'self/*' tag.

    Uses SQLite json_each() to inspect the JSON tags array.
    Returns 0 if no SELF blocks exist (memory has not been seeded).
    """
    result = await conn.execute(
        text(
            "SELECT COUNT(*) FROM blocks "
            "WHERE status = 'active' "
            "AND EXISTS ("
            "  SELECT 1 FROM json_each(tags) "
            "  WHERE value = 'self' OR value LIKE 'self/%'"
            ")"
        )
    )
    row = result.fetchone()
    return int(row[0]) if row else 0
Imports needed: text from sqlalchemy (already imported in queries.py).
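The predicate can be checked in isolation with the standard-library `sqlite3` module. The sketch below is synchronous and uses an assumed three-column `blocks` table (`id`, `status`, `tags`); the real elfmem schema may differ, and `count_self_blocks_sync` and `make_demo_db` are hypothetical names used only for this illustration. It also assumes a SQLite build with the JSON functions enabled.

```python
import json
import sqlite3

# Same predicate as the planned helper, run synchronously for illustration.
COUNT_SELF_SQL = """
SELECT COUNT(*) FROM blocks
WHERE status = 'active'
AND EXISTS (
    SELECT 1 FROM json_each(blocks.tags)
    WHERE value = 'self' OR value LIKE 'self/%'
)
"""


def count_self_blocks_sync(conn: sqlite3.Connection) -> int:
    """Count active rows whose JSON tags array holds 'self' or 'self/...'."""
    row = conn.execute(COUNT_SELF_SQL).fetchone()
    return int(row[0]) if row else 0


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory table with an assumed (id, status, tags) shape."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE blocks (id TEXT PRIMARY KEY, status TEXT, tags TEXT)")
    rows = [
        ("a", "active", ["self"]),           # counted
        ("b", "active", ["self/identity"]),  # counted
        ("c", "active", ["other"]),          # wrong tag
        ("d", "active", ["selfish"]),        # prefix only, no slash
        ("e", "archived", ["self"]),         # not active
    ]
    conn.executemany(
        "INSERT INTO blocks VALUES (?, ?, ?)",
        [(block_id, status, json.dumps(tags)) for block_id, status, tags in rows],
    )
    return conn
```

Matching on `'self/%'` rather than `'self%'` keeps unrelated tags such as `selfish` out of the count.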
Step 2 - Update Empty-State Suggestion in api.py
File: src/elfmem/api.py
Purpose: When memory is empty, the current suggestion mentions learn(). It should
point users toward SELF setup first — that is the correct starting action.
Before (api.py:785–786):
if active_count == 0 and inbox_count == 0:
    return "good", "Memory is empty. Call learn() to add knowledge."
After (suggestion updated):
if active_count == 0 and inbox_count == 0:
    return "good", "Memory is empty. Seed your identity: elfmem init --self '...'"
This is the only change to api.py. The function signature of _derive_health() does
not change; no callers need updating.
Step 3 - Add Config Generator to config.py
File: src/elfmem/config.py
Purpose: elfmem init generates a commented config.yaml. The content must be
derived from ElfmemConfig() defaults so the generated file always matches the code.
Add module-level function (after ElfmemConfig class):
def render_default_config() -> str:
    """Render a commented default config.yaml string from ElfmemConfig() defaults.

    Used by `elfmem init` to generate ~/.elfmem/config.yaml.
    Values are sourced from ElfmemConfig() so they always match code defaults.
    """
    import textwrap

    d = ElfmemConfig()
    return textwrap.dedent(f"""\
        # elfmem configuration
        # Generated by: elfmem init
        # Edit as needed. All sections are optional; missing keys use code defaults.
        # API keys are NOT stored here. Set them as environment variables:
        # ANTHROPIC_API_KEY, OPENAI_API_KEY, GROQ_API_KEY, etc.
        llm:
          model: "{d.llm.model}"
          temperature: {d.llm.temperature}
          max_tokens: {d.llm.max_tokens}
          timeout: {d.llm.timeout}
          max_retries: {d.llm.max_retries}
        embeddings:
          model: "{d.embeddings.model}"
          dimensions: {d.embeddings.dimensions}
          timeout: {d.embeddings.timeout}
        memory:
          inbox_threshold: {d.memory.inbox_threshold}
          curate_interval_hours: {d.memory.curate_interval_hours}
          self_alignment_threshold: {d.memory.self_alignment_threshold}
          contradiction_threshold: {d.memory.contradiction_threshold}
          near_dup_exact_threshold: {d.memory.near_dup_exact_threshold}
          near_dup_near_threshold: {d.memory.near_dup_near_threshold}
          similarity_edge_threshold: {d.memory.similarity_edge_threshold}
          edge_degree_cap: {d.memory.edge_degree_cap}
          top_k: {d.memory.top_k}
          search_window_hours: {d.memory.search_window_hours}
          outcome_prior_strength: {d.memory.outcome_prior_strength}
          outcome_reinforce_threshold: {d.memory.outcome_reinforce_threshold}
          penalize_threshold: {d.memory.penalize_threshold}
        # Custom prompts (optional; uncomment to override library defaults):
        # prompts:
        #   process_block_file: "~/.elfmem/prompts/process_block.txt"
        #   contradiction_file: "~/.elfmem/prompts/contradiction.txt"
    """)
Why this approach:
- Values come from ElfmemConfig() — if defaults change in code, generated config reflects them
- textwrap.dedent with f-string is clean and readable
- Import of textwrap is local to keep module-level imports clean
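The dedent-plus-f-string pattern can be tried on a much smaller scale. In the sketch below, `LLMDefaults` and `Defaults` are hypothetical stand-ins for `ElfmemConfig` with made-up default values; only the rendering technique is the point.

```python
import textwrap
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class LLMDefaults:
    """Hypothetical stand-in for the llm section of ElfmemConfig."""
    model: str = "example-model"
    temperature: float = 0.0


@dataclass(frozen=True)
class Defaults:
    """Hypothetical stand-in for ElfmemConfig."""
    llm: LLMDefaults = field(default_factory=LLMDefaults)


def render(d: Optional[Defaults] = None) -> str:
    """Render YAML text whose values come from the defaults object."""
    d = d or Defaults()
    # dedent() strips the common 8-space margin but keeps the extra
    # 2-space indent that nests keys under their section.
    return textwrap.dedent(f"""\
        # example configuration
        llm:
          model: "{d.llm.model}"
          temperature: {d.llm.temperature}
        """)


if __name__ == "__main__":
    print(render(), end="")
```

Because the values are interpolated at call time, changing a default in the dataclass changes the rendered text without touching the template. The plan relies on the same property for `ElfmemConfig()`.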
Step 4 - CLI Commands: init and doctor in cli.py
File: src/elfmem/cli.py
4a. elfmem init command
Signature:
@app.command()
def init(
    self_description: Annotated[
        str | None, typer.Option("--self", help="Seed SELF frame with identity description")
    ] = None,
    db: Annotated[
        str, typer.Option("--db", envvar="ELFMEM_DB", help="Database path")
    ] = "~/.elfmem/agent.db",
    config_path: Annotated[
        str, typer.Option("--config", envvar="ELFMEM_CONFIG", help="Config YAML path")
    ] = "~/.elfmem/config.yaml",
    force: Annotated[
        bool, typer.Option("--force", help="Overwrite existing config")
    ] = False,
    json_output: Annotated[bool, typer.Option("--json")] = False,
) -> None:
    """Initialise elfmem: create config directory, generate config, and optionally seed SELF."""
Logic flow:
1. Expand ~ in db and config_path with os.path.expanduser()
2. Create parent directory of config_path with Path.mkdir(parents=True, exist_ok=True)
3. If config file does not exist (or --force):
→ Write render_default_config() to config_path
→ Record: "created config"
else:
→ Record: "config already exists, skipped"
4. If --self provided:
→ _run(_init_self(db_path, config_path, self_description))
→ Record: LearnResult summary
5. Print summary (or JSON)
Async helper (added at bottom of cli.py with other async helpers):
async def _init_self(db_path: str, config: str, content: str) -> LearnResult:
    async with SmartMemory.managed(db_path, config=config) as mem:
        return await mem.remember(content, tags=["self"])
Output (text mode):
elfmem init
✓ Config: ~/.elfmem/config.yaml (created)
✓ Database: ~/.elfmem/agent.db (ready)
elfmem init --self "I am Claude Code, a software engineering assistant"
✓ Config: ~/.elfmem/config.yaml (created)
✓ Database: ~/.elfmem/agent.db (ready)
✓ SELF: Stored block a1b2c3d4. Status: created.
Next: elfmem serve --db ~/.elfmem/agent.db --config ~/.elfmem/config.yaml
Output (JSON mode):
{
  "config_path": "/Users/emson/.elfmem/config.yaml",
  "config_action": "created",
  "db_path": "/Users/emson/.elfmem/agent.db",
  "self_block": {"block_id": "a1b2c3d4", "status": "created"}
}
Idempotency rules:
- Config: skip if exists (unless --force)
- DB: MemorySystem.from_config() already handles create-if-absent
- SELF block: remember() returns "duplicate_rejected" for exact duplicates — safe to re-run
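The config half of these rules can be sketched as a small standalone function. The name `write_config_if_absent` and the returned labels `"created"` and `"skipped"` are assumptions made for illustration; the plan does not fix either, and it does not say what label a `--force` overwrite should report.

```python
import tempfile
from pathlib import Path


def write_config_if_absent(config_path: str, content: str, force: bool = False) -> str:
    """Write content to config_path unless the file already exists.

    Returns "created" when the file was written and "skipped" when an
    existing file was left untouched (illustrative labels).
    """
    path = Path(config_path).expanduser()
    # Equivalent of mkdir -p on the parent directory.
    path.parent.mkdir(parents=True, exist_ok=True)
    if path.exists() and not force:
        return "skipped"
    path.write_text(content, encoding="utf-8")
    return "created"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = str(Path(tmp) / ".elfmem" / "config.yaml")
        print(write_config_if_absent(target, "llm: {}\n"))              # first run
        print(write_config_if_absent(target, "llm: {}\n"))              # re-run
        print(write_config_if_absent(target, "llm: {}\n", force=True))  # forced
```

Returning a label instead of printing inside the helper lets the same result feed both the text summary and the JSON output.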
4b. elfmem doctor command
Signature:
@app.command()
def doctor(
    db: Annotated[
        str | None, typer.Option("--db", envvar="ELFMEM_DB", help="Database path")
    ] = None,
    config: Annotated[
        str | None, typer.Option("--config", envvar="ELFMEM_CONFIG", help="Config YAML path")
    ] = None,
    json_output: Annotated[bool, typer.Option("--json")] = False,
) -> None:
    """Diagnose your elfmem setup. Reports what is configured and what is missing."""
Logic flow:
1. Resolve default paths: db → "~/.elfmem/agent.db", config → "~/.elfmem/config.yaml"
2. Expand ~ with os.path.expanduser() on both paths
3. Check config_dir exists (parent of config path)
4. Check config file exists
5. Check db file exists
6. Check API keys: ANTHROPIC_API_KEY, OPENAI_API_KEY (warn if both absent)
7. If db exists: open engine, run count_self_blocks(conn), close
8. Print results with ✓/✗/⚠ prefixes
DB check (async helper):
async def _doctor_self_count(db_path: str) -> int:
    """Count active SELF blocks. Returns -1 if DB is not accessible."""
    from elfmem.db.engine import create_engine
    from elfmem.db.queries import count_self_blocks

    try:
        engine = await create_engine(db_path)
        try:
            async with engine.connect() as conn:
                return await count_self_blocks(conn)
        finally:
            # Dispose the engine even when the COUNT query raises.
            await engine.dispose()
    except Exception:
        return -1
Note: This does NOT open a SmartMemory session — just a raw engine connect to run the COUNT query. No LLM calls, no session tracking, no side effects.
Output (text mode):
elfmem doctor
✓ Config dir: ~/.elfmem/ (exists)
✓ Config: ~/.elfmem/config.yaml (exists)
✓ Database: ~/.elfmem/agent.db (exists)
✓ SELF: 3 SELF blocks found
✓ API keys: ANTHROPIC_API_KEY set
elfmem doctor (empty state)
✓ Config dir: ~/.elfmem/ (exists)
✓ Config: ~/.elfmem/config.yaml (exists)
✓ Database: ~/.elfmem/agent.db (exists)
✗ SELF: No SELF blocks found
Suggestion: elfmem init --self "Describe your agent identity here"
⚠ API keys: Neither ANTHROPIC_API_KEY nor OPENAI_API_KEY is set
Suggestion: export ANTHROPIC_API_KEY="sk-ant-..."
Exit code:
- 0 if all checks pass
- 1 if any ✗ checks fail (useful for CI/CD pipelines)
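One way to keep the printed prefixes and the exit code in sync is to derive both from a single list of check results. The sketch below is illustrative only: the `Check` type and both function names are hypothetical. It assumes that ⚠ warnings do not affect the exit code, which is how the rule above reads.

```python
from dataclasses import dataclass
from typing import List, Literal

Level = Literal["ok", "warn", "fail"]
PREFIX = {"ok": "✓", "warn": "⚠", "fail": "✗"}


@dataclass(frozen=True)
class Check:
    """One diagnostic result (hypothetical structure)."""
    label: str
    level: Level
    detail: str


def render_checks(checks: List[Check]) -> str:
    """Format each check as '<prefix> <label>: <detail>', one per line."""
    return "\n".join(f"{PREFIX[c.level]} {c.label}: {c.detail}" for c in checks)


def exit_code(checks: List[Check]) -> int:
    """Return 1 if any check failed; warnings alone do not fail the run."""
    return 1 if any(c.level == "fail" for c in checks) else 0


if __name__ == "__main__":
    checks = [
        Check("Config", "ok", "~/.elfmem/config.yaml (exists)"),
        Check("SELF", "fail", "No SELF blocks found"),
        Check("API keys", "warn", "Neither ANTHROPIC_API_KEY nor OPENAI_API_KEY is set"),
    ]
    print(render_checks(checks))
    raise SystemExit(exit_code(checks))
```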
Step 5 - MCP Tool: elfmem_setup in mcp.py
File: src/elfmem/mcp.py
Add after elfmem_guide:
@mcp.tool()
async def elfmem_setup(
    identity: str,
    values: list[str] | None = None,
) -> dict[str, Any]:
    """Bootstrap agent identity in the SELF frame.

    Call this on first use to establish who you are. Creates SELF-tagged blocks
    from a natural language description. Safe to call multiple times; exact
    duplicates are silently rejected.

    identity: Natural language description of agent role, personality, constraints.
    values: Optional list of core values or principles (each stored as a block).
    Returns: blocks_created count and per-block status.

    Example:
        elfmem_setup(
            identity="I am Claude Code, an AI-powered software engineering assistant.",
            values=["clean minimal code", "always confirm destructive operations"]
        )
    """
    results = []
    identity_result = await _mem().remember(identity, tags=["self"])
    results.append(identity_result.to_dict())
    if values:
        for value in values:
            r = await _mem().remember(value, tags=["self", "value"])
            results.append(r.to_dict())
    created = sum(1 for r in results if r["status"] == "created")
    return {
        "status": "setup_complete",
        "blocks_created": created,
        "blocks": results,
    }
Design notes:
- Uses _mem().remember() directly — same path as elfmem_remember
- tags=["self"] is what places blocks in the SELF frame
- values are tagged ["self", "value"] for finer-grained retrieval later
- No new infrastructure — calls existing SmartMemory.remember()
- Idempotent: remember() returns "duplicate_rejected" for identical content
Step 6 - Guide Entry: setup in guide.py
File: src/elfmem/guide.py
Add to GUIDES dict:
"setup": AgentGuide(
    name="setup",
    what="Bootstrap agent identity by seeding the SELF frame with core identity blocks.",
    when=(
        "First use: before any other operations. "
        "Also when the agent's role, values, or constraints change significantly."
    ),
    when_not=(
        "Every session: once seeded, SELF blocks persist and decay slowly. "
        "Don't re-seed unchanged identity on every run."
    ),
    cost="Fast per block. One LLM call per block during consolidate().",
    returns=(
        "dict with blocks_created (int) and blocks (list of LearnResult dicts). "
        "status='setup_complete' always. blocks_created=0 means all were duplicates."
    ),
    next=(
        "SELF blocks are in inbox until consolidate() runs (auto on session close). "
        "After consolidation, elfmem_recall(frame='self') returns your identity context."
    ),
    example=(
        "elfmem_setup(\n"
        "    identity='I am Claude Code, an AI-powered software engineering assistant.',\n"
        "    values=['clean minimal code', 'confirm before destructive operations']\n"
        ")"
    ),
),
Update the OVERVIEW string: add a setup row to the operations table.
Key Invariants
- No new dependencies: textwrap is stdlib; all other imports are already present
- Idempotent init: re-running elfmem init is safe (config skipped if exists, SELF block returns duplicate_rejected for unchanged content)
- Bug fix is non-breaking: Path(path).expanduser() is a drop-in; absolute paths and relative paths are unaffected (expanduser is a no-op when ~ is absent)
- doctor has no write side effects: it performs read-only filesystem checks and a SQL COUNT only; it does not start sessions, create blocks, or modify the database
- elfmem_setup is a thin wrapper: calls SmartMemory.remember() directly; no special SELF machinery; tags are the only mechanism
- SELF blocks are ordinary blocks: no schema changes, no new DB tables; tags=["self"] is sufficient for the SELF frame filter
- Config generated from defaults: render_default_config() calls ElfmemConfig(); if defaults change in code, regenerated configs will reflect them
- doctor exit code: exits 1 on any ✗ failure; 0 on clean; CI/CD safe
Done Criteria
Step 0 - Bug Fix
- elfmem serve --config ~/.elfmem/config.yaml does not raise FileNotFoundError
- ElfmemConfig.from_yaml("~/.elfmem/config.yaml") works correctly
- Absolute paths (/Users/emson/.elfmem/config.yaml) still work
Step 1 - DB Helper
- count_self_blocks(conn) returns 0 on empty DB
- count_self_blocks(conn) returns correct count after inserting SELF-tagged blocks
- count_self_blocks(conn) counts tags=["self"] and tags=["self/identity"] but not tags=["other"]
Step 2 - API suggestion
- status() on empty DB returns suggestion containing elfmem init --self
- status() on non-empty DB returns normal suggestions (unchanged)
Step 3 - Config generator
- render_default_config() returns valid YAML (parseable by yaml.safe_load)
- Values in rendered YAML match ElfmemConfig() defaults
- ElfmemConfig.from_yaml(rendered_file) produces ElfmemConfig() (round-trip)
Step 4 - CLI init / doctor
- elfmem init creates ~/.elfmem/config.yaml and prints confirmation
- elfmem init on existing config prints "already exists, skipped" (no --force)
- elfmem init --force overwrites existing config
- elfmem init --self "..." stores a SELF-tagged block and reports block_id
- elfmem init --self "..." re-run produces duplicate_rejected, not an error
- elfmem init --json outputs valid JSON
- elfmem doctor prints ✓ for every check when fully configured
- elfmem doctor prints ✗ SELF: No SELF blocks on empty DB
- elfmem doctor prints ⚠ when API keys absent
- elfmem doctor exits 1 when any check fails
Step 5 - MCP elfmem_setup
- elfmem_setup(identity="...") returns {"status": "setup_complete", "blocks_created": 1, ...}
- elfmem_setup(identity="...", values=["v1", "v2"]) returns blocks_created: 3
- Re-calling with same identity returns blocks_created: 0 (all duplicates)
- SELF blocks appear when elfmem_recall(frame="self") is called after consolidation
Step 6 - Guide
- system.guide("setup") returns a non-empty guide string
- system.guide() overview table includes setup
- elfmem guide setup (CLI) prints the setup guide
Regression
- All existing tests pass unchanged
- elfmem remember, recall, status, outcome, curate, serve, guide unaffected
File Locations Summary
src/elfmem/
├── config.py ← Step 0 (bug fix: expanduser) + Step 3 (render_default_config)
├── db/
│ └── queries.py ← Step 1 (count_self_blocks helper)
├── api.py ← Step 2 (empty-state suggestion text only)
├── cli.py ← Step 4 (init + doctor commands + async helpers)
├── mcp.py ← Step 5 (elfmem_setup tool)
└── guide.py ← Step 6 (setup guide entry + OVERVIEW update)
tests/
└── test_init.py ← New: tests for steps 0–6
Implementation Order
Implement in this order to maintain a working system at each step:
Step 0 → fixes the existing ~ bug (unblocks everything)
Step 1 → adds count_self_blocks (needed by doctor)
Step 2 → improves status() suggestion (standalone, no deps)
Step 3 → adds render_default_config (needed by init)
Step 4a → adds elfmem init command
Step 4b → adds elfmem doctor command
Step 5 → adds elfmem_setup MCP tool
Step 6 → adds guide entry (polish, last)
Each step is independently testable. Steps 2, 3, 5, 6 have no dependencies on each other.