Metadata-Version: 2.4
Name: edgemem
Version: 0.1.3
Summary: A 0-LLM-token memory management system for edge-friendly long-term agent memory, with optional LLM enhancement.
Author-email: soulless <cuizy@connect.hku.hk>
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click<8.2
Requires-Dist: dateparser
Requires-Dist: numpy
Requires-Dist: openai
Requires-Dist: spacy
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: pytest; extra == "dev"
Dynamic: license-file

# EdgeMem

EdgeMem is a 0-LLM-token memory management system for multi-session agent conversations.
It builds lightweight local memory indexes from dialogue turns, routes questions through timeline and graph structure, and optionally calls an LLM only for final answer generation and evaluation.
Because the core ingestion and retrieval path runs locally and consumes no LLM tokens, EdgeMem suits edge devices; LLM-enhanced features are optional, and more are planned.

## Highlights

- **Edge-first ingestion**: local entity extraction, keyword extraction, time parsing, and JSON persistence.
- **Structured memory retrieval**: timeline retrieval, hypergraph routing, tri-graph diffusion, optional embedding retrieval, and score fusion.
- **LoCoMo and LongMemEval-S support**: reproducible CLIs for long-term memory QA retrieval and generation.
- **Explainable evidence**: every retrieved snippet includes source, score, session, speaker, and timestamp metadata.
- **Standalone visual demo**: `docs/edgemem_visual_demo.html` demonstrates graph-routed memory vs summary-only memory.
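
The score fusion step can be pictured with a small sketch. This README does not give EdgeMem's actual fusion formula, so the code below swaps in one common technique, a weighted sum of min-max normalized scores. The function names, retriever labels, turn ids, and weights are all hypothetical.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 1.0 everywhere."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def fuse(
    score_lists: dict[str, dict[str, float]],
    weights: dict[str, float],
) -> list[tuple[str, float]]:
    """Weighted sum of normalized scores per turn id, highest score first."""
    fused: dict[str, float] = {}
    for retriever, scores in score_lists.items():
        weight = weights.get(retriever, 1.0)
        for turn_id, value in min_max(scores).items():
            fused[turn_id] = fused.get(turn_id, 0.0) + weight * value
    # Sort by descending score, breaking ties by turn id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical raw scores from two retrievers over three turns.
ranked = fuse(
    {
        "timeline": {"t1": 2.0, "t2": 1.0, "t3": 0.0},
        "trigraph": {"t2": 0.9, "t3": 0.3},
    },
    weights={"timeline": 0.5, "trigraph": 0.5},
)
print(ranked)
```

Normalizing before summing keeps a retriever with a larger raw score range from dominating the fused ranking.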

## Project Layout

```text
edgemem_op/
  edgemem/                    # Core Python package
    cli/                      # answer/evaluate/baseline CLIs
    data_models.py            # MemoryNode, HyperEdge, HyperGraph, Evidence
    timeline.py               # Timeline retriever
    hypergraph.py             # Hypergraph memory and routing
    trigraph.py               # Entity/keyword-turn-session graph retriever
    embedding_retriever.py    # Optional embedding retriever
    generator.py              # Evidence-to-answer prompt
    eval_judge.py             # LLM-as-judge evaluation
  edgemem_memory_plugin/      # Embeddable memory plugin facade
  scripts/                    # Analysis and demo trace utilities
  docs/                       # Method notes, formulation, visual demo
  evaluation/data/locomo/     # Place LoCoMo data here locally; not committed
  longmemeval/...             # Place LongMemEval-S cleaned data here locally; not committed
  examples/traces/            # Exported retrieval trace example
  requirements.txt            # Minimal runtime dependencies
  pyproject.toml              # Package metadata
```

## Installation

```bash
cd edgemem_op
python -m pip install -r requirements.txt
edgemem-init
edgemem-doctor
```

To use a larger spaCy model:

```bash
edgemem-init --model en_core_web_md
export EDGEMEM_SPACY_MODEL=en_core_web_md
```

## Configuration

EdgeMem reads OpenAI-compatible API settings from environment variables:

```bash
export LLM_API_KEY="your-key"
export LLM_BASE_URL="https://api.openai.com/v1"  # optional
export LLM_MODEL="gpt-4o-mini"

export EMB_API_KEY="$LLM_API_KEY"                # only needed for embed/triembed
export EMB_BASE_URL="$LLM_BASE_URL"
export EMB_MODEL="text-embedding-3-small"
```
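
As an illustration of how these variables fit together, the sketch below collects them into a small settings object. It is not EdgeMem's own loader; `ApiSettings` and `read_settings` are hypothetical names, and only the environment variable names come from the block above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ApiSettings:
    """Hypothetical container for one OpenAI-compatible endpoint."""
    api_key: Optional[str]
    base_url: Optional[str]
    model: Optional[str]


def read_settings(prefix: str, env: Optional[Mapping[str, str]] = None) -> ApiSettings:
    """Read <PREFIX>_API_KEY, <PREFIX>_BASE_URL, and <PREFIX>_MODEL.

    Missing or empty variables become None so callers can detect
    an unconfigured endpoint.
    """
    env = os.environ if env is None else env
    return ApiSettings(
        api_key=env.get(f"{prefix}_API_KEY") or None,
        base_url=env.get(f"{prefix}_BASE_URL") or None,
        model=env.get(f"{prefix}_MODEL") or None,
    )


# Explicit mappings are passed here so the example does not depend on the shell.
llm = read_settings("LLM", {"LLM_API_KEY": "your-key", "LLM_MODEL": "gpt-4o-mini"})
emb = read_settings("EMB", {})
print(llm)
print("embedding retrieval configured:", emb.api_key is not None)
```

Calling `read_settings("LLM")` with no second argument reads from the real process environment.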

For local development, you can also copy `my_config.example.py` to `my_config.py`. Do not commit `my_config.py`.

## Python Memory Plugin

EdgeMem can be embedded directly as a long-term memory plugin:

```python
from edgemem_memory_plugin import EdgeMemMemoryPlugin, MemoryEvent, MemoryQuery

memory = EdgeMemMemoryPlugin()
memory.write_event(
    MemoryEvent(
        user_id="demo_user",
        session_id="session_1",
        text="I care about turn-level R@5 for LongMemEval.",
    )
)

hits = memory.retrieve(
    MemoryQuery(
        user_id="demo_user",
        query="What metric do I care about?",
        top_k=3,
        mode="fused",
        rerank="trigraph",
    )
)
print(memory.format_context(hits))
```

Run the included example:

```bash
python edgemem_memory_plugin/examples/basic_usage.py
```

## Quick Start

After installation, run the package smoke tests:

```bash
edgemem-init
edgemem-doctor
edgemem-memory-demo
edgemem-smoke-longmemeval --help
edgemem-smoke-locomo --help
```

If the spaCy model download is blocked, the package can still run basic smoke
tests with spaCy's blank English fallback. A blank pipeline has no trained NER
component, so entity extraction quality will be lower.

Timeline-only baseline:

```bash
python -m edgemem.cli.answer \
  --dataset evaluation/data/locomo/locomo10.json \
  --mode timeline \
  --question "When did Caroline go to the LGBTQ support group?"
```

Fused timeline + tri-graph retrieval:

```bash
python -m edgemem.cli.answer \
  --dataset evaluation/data/locomo/locomo10.json \
  --mode fused \
  --rerank trigraph \
  --context-window 1 \
  --question "When did Caroline go to the LGBTQ support group?"
```

Run the LoCoMo evaluation:

```bash
python -m edgemem.cli.evaluate \
  --dataset evaluation/data/locomo/locomo10.json \
  --mode fused \
  --rerank trigraph \
  --run-name locomo_all_trigraph_context \
  --max-workers 10 \
  --context-window 1 \
  --verbose
```

Run LoCoMo retrieval-only detailed metrics without answer generation:

```bash
python -m edgemem.cli.evaluate_locomo_retrieval \
  --dataset evaluation/data/locomo/locomo10.json \
  --mode fused \
  --rerank trigraph \
  --top-k 5 \
  --ks 1,3,5,10 \
  --context-window 1 \
  --run-name locomo_retrieval_detailed \
  --verbose
```

This writes turn/session/context retrieval metrics to
`results/locomo_retrieval_summary_<run-name>.json`.
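
The `--ks 1,3,5,10` option suggests metrics reported at several cutoffs. The exact definitions used by the CLI are not spelled out here, so the following is only a sketch of one common definition of recall@k. The `recall_at_k` helper and the turn ids are hypothetical.

```python
def recall_at_k(ranked_ids: list[str], gold_ids: list[str], k: int) -> float:
    """Fraction of gold ids that appear among the first k retrieved ids."""
    gold = set(gold_ids)
    if not gold:
        return 0.0
    hits = gold.intersection(ranked_ids[:k])
    return len(hits) / len(gold)


# Parse cutoffs the same way a comma-separated CLI value could be parsed.
ks = [int(k) for k in "1,3,5,10".split(",")]

# Hypothetical retrieved turn ids (best first) and gold evidence turn ids.
ranked = ["D1:4", "D2:7", "D1:3", "D3:1", "D1:9"]
gold = ["D1:3", "D1:9"]

metrics = {f"R@{k}": recall_at_k(ranked, gold, k) for k in ks}
print(metrics)
```

The same function works at session level if turn ids are first mapped to their session ids.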

Convenience smoke test wrapper:

```bash
edgemem-smoke-locomo \
  --dataset evaluation/data/locomo/locomo10.json \
  --limit 5
```

Add `--with-llm` and set `LLM_API_KEY` and `LLM_MODEL` (plus `LLM_BASE_URL` if
needed) to generate answers and run the LLM-as-judge evaluation.

Run EdgeMem retrieval on the cleaned LongMemEval-S dataset:

```bash
python -m edgemem.cli.evaluate_longmemeval \
  --dataset longmemeval/longmemeval-data-cleaned/data/longmemeval_s_cleaned.json \
  --mode fused \
  --rerank trigraph \
  --retrieval-top-k 50 \
  --generation-top-k 5 \
  --context-window 1 \
  --run-name longmemeval_s_cleaned_edgemem \
  --verbose
```

This command evaluates retrieval without calling an LLM. Add `--generate` to also write
`results/longmemeval_hypotheses_<run-name>.jsonl` with `{question_id, hypothesis}` lines
for the official LongMemEval QA evaluator.
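
A hypotheses file in this shape is plain JSON Lines, one object per line. The sketch below writes and reads such a file with the standard library. The helper names, the question ids, and the answers are made up for illustration, and nothing here calls EdgeMem itself.

```python
import json
import tempfile
from pathlib import Path


def write_hypotheses(path: Path, rows: list[tuple[str, str]]) -> None:
    """Write one {"question_id", "hypothesis"} JSON object per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for question_id, hypothesis in rows:
            record = {"question_id": question_id, "hypothesis": hypothesis}
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_hypotheses(path: Path) -> list[dict]:
    """Load every non-empty line as a JSON object."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "longmemeval_hypotheses_demo.jsonl"
    write_hypotheses(out, [("q_001", "Turn-level R@5"), ("q_002", "Unknown")])
    records = read_hypotheses(out)

print(records)
```

Reading the file back this way is a quick check that every line parses before handing it to an external evaluator.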

Convenience smoke test wrapper:

```bash
edgemem-smoke-longmemeval \
  --dataset longmemeval/longmemeval-data-cleaned/data/longmemeval_s_cleaned.json \
  --limit 3
```

Dataset pages:

- LoCoMo: https://github.com/snap-stanford/locomo
- LongMemEval: https://github.com/xiaowu0162/LongMemEval

## Visual Demo

Open this file directly in a browser:

```text
docs/edgemem_visual_demo.html
```

To export a real retrieval trace from the current code:

```bash
python scripts/export_edgemem_trace.py \
  --dataset evaluation/data/locomo/locomo10.json \
  --conversation-index 0 \
  --qa-index 0 \
  --top-k 5 \
  --max-nodes 80 \
  --output examples/traces/edgemem_trace_demo.json
```

## Notes for Open Source Release

- `my_config.py`, `.env`, caches, generated results, benchmark datasets, and real API keys are intentionally excluded from the repository.
- Download LoCoMo and LongMemEval data separately before running benchmark CLIs.
- The core non-embedding retrieval path requires no LLM during ingestion or retrieval.
- Remote model API calls are made by `generator.py`, `eval_judge.py`, embedding retrieval, and scripts that explicitly rejudge results.
