Metadata-Version: 2.4
Name: eml-memory-curate
Version: 0.1.0
Summary: Markdown-frontmatter memory curator: tiered indexing, semantic search, and contradiction flagging on top of any memory directory.
Author: Monogate Research
License: PROPRIETARY-PRE-RELEASE
Project-URL: Homepage, https://monogate.org
Project-URL: Repository, https://github.com/agent-maestro/eml-memory-curate
Keywords: memory,agentic,llm,claude,embeddings,semantic-search,knowledge-graph
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: embeddings
Requires-Dist: sentence-transformers>=2.2; extra == "embeddings"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: ruff>=0.1; extra == "dev"
Dynamic: license-file

# eml-memory-curate

Markdown-frontmatter memory curator for agentic Claude / LLM sessions.

It adds tiered indexing, semantic search, and contradiction flagging on top of
any directory of Markdown files. It is the sister package to
[`eml-memory`](https://pypi.org/project/eml-memory/): that package ships
the typed JSONL store, while this one curates a Markdown index of memory
"topics", with one file per topic.

## Install

```bash
# core (text search only)
pip install eml-memory-curate

# with semantic search
# (quoted so shells such as zsh do not expand the brackets)
pip install "eml-memory-curate[embeddings]"
```

## Quick start

Each topic lives in its own Markdown file with optional frontmatter:

```markdown
---
name: Buzzard drama
description: Capacitor C7 spec mismatch caught at QC stage 3.
type: feedback
---

The vendor-supplied capacitor on revision B07 has a 30% lower
ESR rating than the BOM specifies. Caught by QC stage 3 thermal
imaging on 2026-04-28...
```
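
For orientation, a frontmatter block like the one above can be read with a few lines of plain Python. This is a minimal sketch, not the package's own parser: `split_frontmatter` is a hypothetical helper, and it assumes flat `key: value` lines between two `---` fences rather than full YAML.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Return (metadata, body); metadata is empty when there is no frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        # Index of the closing fence, searching from the second line onward.
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        return {}, text  # unterminated fence: treat the whole file as body
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not "key: value"
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


sample = """---
name: Buzzard drama
description: Capacitor C7 spec mismatch caught at QC stage 3.
type: feedback
---

Body text.
"""
meta, body = split_frontmatter(sample)
print(meta["type"])  # feedback
```

Because the frontmatter is optional, the sketch returns an empty metadata dict and the untouched text when no opening fence is present.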

Run the curator on the directory:

```bash
$ eml-memory-curate audit ~/.claude-memory
  Memory dir:    ~/.claude-memory
  Entries:       111
  Total bytes:   424,049
  ...

$ eml-memory-curate curate ~/.claude-memory
  Curated 111 entries into 4 tier files.
    MEMORY.md           63 entries   22,920 bytes  (OK 22KB)
    MEMORY_RECENT.md    25 entries   12,400 bytes
    MEMORY_ARCHIVE.md   23 entries   18,200 bytes
    TAGS.md            111 entries   38,100 bytes

$ eml-memory-curate query ~/.claude-memory "buzzard drama"
  Top 3 semantic match(es) for 'buzzard drama':
    [0.821] 2026-04-28 [feedback    ] feedback_buzzard.md
        Capacitor C7 spec mismatch caught at QC stage 3.
    ...
```

## What it does

| subcommand    | description |
|---------------|-------------|
| `audit`       | Counts, sizes, oldest/newest, tier preview |
| `curate`      | Regenerate `MEMORY.md` + `MEMORY_RECENT.md` + `MEMORY_ARCHIVE.md` + `TAGS.md` |
| `query`       | Semantic search (with text fallback) |
| `tags`        | Show all inferred topic tags |
| `embed`       | Build the embedding index (requires `[embeddings]`) |
| `contradict`  | Flag numeric conflicts between recent and older same-tag memories |

## Design

* **Per-topic files are sacred.** The curator never edits or deletes
  source `.md` files. Only the four index files are regenerated.
* **Hot-tier byte cap.** `MEMORY.md` stays under a configurable cap
  (default 22 KB) so it loads uncompressed in an LLM session. Older
  / less-critical entries flow into `MEMORY_RECENT.md` then
  `MEMORY_ARCHIVE.md`.
* **Evergreen rules.** Entries typed `feedback`, `user`, or `reference`
  stay in the hot tier regardless of age. Sessions / projects age out.
* **Contradiction scan is advisory.** When a recent memory says
  "21 backends" and an older same-tag memory said "18 backends",
  the curator flags it for human review. It never auto-resolves.
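
The tiering rules above can be sketched as follows. This is one possible reading, not the package's implementation: the `Entry` dataclass, the `assign_tiers` function, the `session` and `project` type names, the day thresholds, and the policy of demoting the oldest non-evergreen entries when the byte cap is exceeded are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date

EVERGREEN_TYPES = {"feedback", "user", "reference"}  # from the evergreen rule


@dataclass
class Entry:  # hypothetical stand-in for a parsed topic file
    name: str
    type: str
    modified: date
    index_bytes: int  # bytes this entry would occupy in an index file


def assign_tiers(entries, today, hot_max_bytes, recent_days, archive_days):
    """Split entries into (hot, recent, archive) lists."""
    hot, recent, archive = [], [], []
    for entry in entries:
        age = (today - entry.modified).days
        if entry.type in EVERGREEN_TYPES or age <= recent_days:
            hot.append(entry)
        elif age <= archive_days:
            recent.append(entry)
        else:
            archive.append(entry)

    # Enforce the byte cap by demoting the oldest non-evergreen hot entries.
    # Evergreen entries are never demoted, so they alone can exceed the cap.
    demotable = sorted(
        (e for e in hot if e.type not in EVERGREEN_TYPES),
        key=lambda e: e.modified,
    )
    used = sum(e.index_bytes for e in hot)
    for entry in demotable:
        if used <= hot_max_bytes:
            break
        hot.remove(entry)
        recent.append(entry)
        used -= entry.index_bytes
    return hot, recent, archive


today = date(2026, 5, 1)
entries = [
    Entry("feedback_buzzard.md", "feedback", date(2025, 1, 10), 400),
    Entry("session_a.md", "session", date(2026, 4, 29), 700),
    Entry("session_b.md", "session", date(2026, 4, 20), 600),
    Entry("project_c.md", "project", date(2026, 3, 1), 500),
    Entry("project_d.md", "project", date(2025, 6, 1), 500),
]
hot, recent, archive = assign_tiers(
    entries, today, hot_max_bytes=1200, recent_days=14, archive_days=90
)
print([e.name for e in hot])      # ['feedback_buzzard.md', 'session_a.md']
print([e.name for e in recent])   # ['project_c.md', 'session_b.md']
print([e.name for e in archive])  # ['project_d.md']
```

In this example the old `feedback` entry stays hot because of its type, while `session_b.md` is recent enough to qualify for the hot tier but is demoted to keep the hot tier within the illustrative 1,200-byte cap.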

## Configuration

Override the memory directory with `$EML_MEMORY_CURATE_DIR`, or pass it
as the first positional argument to any subcommand. The tier knobs
(`--hot-max-bytes`, `--recent-days`, `--archive-days`, `--title`) are
options of the `curate` subcommand.
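
A wrapper script that needs the same directory can resolve it in a similar way. The sketch below is an assumption about precedence, not documented behavior: `resolve_memory_dir` is a hypothetical helper that prefers an explicit argument over `$EML_MEMORY_CURATE_DIR`, and it does not model whatever default the CLI itself falls back to.

```python
import os
from pathlib import Path


def resolve_memory_dir(cli_arg=None, environ=os.environ):
    """Pick the memory directory: explicit argument first, then the environment.

    Assumed precedence; the CLI's own default directory is not modeled here.
    """
    raw = cli_arg or environ.get("EML_MEMORY_CURATE_DIR")
    if not raw:
        raise ValueError("no memory directory given")
    return Path(raw).expanduser()


# Passing a plain dict as `environ` keeps the example independent of the real environment.
from_arg = resolve_memory_dir("/tmp/notes", {"EML_MEMORY_CURATE_DIR": "/tmp/other"})
from_env = resolve_memory_dir(None, {"EML_MEMORY_CURATE_DIR": "/tmp/other"})
print(from_arg.name, from_env.name)  # notes other
```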

## Python API

```python
from pathlib import Path

from eml_memory_curate import collect_entries, tier_entries, find_contradictions

entries = collect_entries(Path("~/.claude-memory").expanduser())
hot, recent, archive = tier_entries(entries)
conflicts = find_contradictions(entries, recent_n=10)
```

## Roadmap

* **v0.2** — sync adapter for `eml-memory`'s typed JSONL store, so the
  two packages share one source of truth.
* **v0.3** — registry-drift check (volatile keys vs live values), once
  generalized away from the monogate-research builder.
* **v1.0** — bundled embedding model + SQLite backend.

## License

PROPRIETARY-PRE-RELEASE.
