Metadata-Version: 2.4
Name: MemShan
Version: 2.0.0
Summary: MCP memory server combining Method of Loci, Major System, Songlines, and PAO mnemonic layers
Author-email: Shan Konduru <shan.konduru@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/ShanKonduru/MemSh-n
Project-URL: Repository, https://github.com/ShanKonduru/MemSh-n
Project-URL: Issues, https://github.com/ShanKonduru/MemSh-n/issues
Keywords: mcp,memory,ai,chromadb,knowledge-graph,ollama
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: mcp[cli]
Requires-Dist: chromadb
Requires-Dist: sentence-transformers
Requires-Dist: networkx
Requires-Dist: spacy
Requires-Dist: httpx
Requires-Dist: openai>=1.0.0
Requires-Dist: google-genai
Requires-Dist: anthropic
Requires-Dist: python-dotenv
Requires-Dist: pydantic>=2.0
Requires-Dist: pydantic-settings
Requires-Dist: rich
Requires-Dist: mempalace
Provides-Extra: demo
Requires-Dist: streamlit; extra == "demo"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-html; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: pytest-mock; extra == "dev"

<p align="center">
  <img src="logo/MemShānLogo.png" alt="MemShān Logo" width="600"/>
</p>

<h1 align="center">MemShān</h1>
<p align="center"><em>Memory &amp; Context Manager for AI</em></p>

<p align="center">
  <img src="https://img.shields.io/badge/python-3.11+-blue?logo=python" alt="Python 3.11+"/>
  <img src="https://img.shields.io/badge/MCP-Model%20Context%20Protocol-blueviolet" alt="MCP"/>
  <img src="https://img.shields.io/badge/LLM-Ollama%20%7C%20OpenAI%20%7C%20Gemini%20%7C%20Anthropic-orange" alt="LLM Providers"/>
  <img src="https://img.shields.io/badge/vector%20store-ChromaDB-green" alt="ChromaDB"/>
  <img src="https://img.shields.io/badge/graph-NetworkX-red" alt="NetworkX"/>
</p>

---

## What is MemShān?

MemShān is a Python-based **MCP (Model Context Protocol) memory server** that gives AI assistants
a structured, multi-layered long-term memory. It combines four proven cognitive mnemonic techniques
into a single retrieval pipeline:

| Layer | Cognitive Technique | What It Does |
|---|---|---|
| **Base** | Method of Loci | ChromaDB **Wings** and **Rooms** — spatial memory palace for semantic search |
| **Layer 1** | Major System | Converts numbers in text to phonetic tags for precise numeric lookup |
| **Layer 2** | Songlines | NetworkX **Knowledge Graph** records context trails between memory chunks |
| **Layer 3** | PAO System | Compresses session logs into **Subject-Action-Object triplets** for long-term archival |
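As a concrete illustration of Layer 1, the traditional Major System assigns each digit a consonant sound. The sketch below uses the classic mapping and an illustrative `phonetic_tag` helper, not MemShān's actual encoder:

```python
import re

# Traditional Major System digit -> consonant sound mapping.
MAJOR_MAP = {
    "0": "s", "1": "t", "2": "n", "3": "m", "4": "r",
    "5": "l", "6": "j", "7": "k", "8": "f", "9": "p",
}

def phonetic_tag(number: str) -> str:
    """Convert a digit string like '1945' into a consonant tag like 't-p-r-l'."""
    return "-".join(MAJOR_MAP[d] for d in number if d in MAJOR_MAP)

def tag_numbers(text: str) -> list[str]:
    """Extract every number in the text and return one phonetic tag per number."""
    return [phonetic_tag(n) for n in re.findall(r"\d+", text)]
```

A query phrased as "the year tagged t-p-r-l" can then match a chunk containing "1945" even when the surrounding wording differs.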

---

## Architecture

See [ARCHITECTURE.md](ARCHITECTURE.md) for the full design, data-flow diagrams, and technology decisions.  
See [TASK_EXECUTION_PLAN.md](TASK_EXECUTION_PLAN.md) for the phase-by-phase build plan.

---

## MCP Tools

| Tool | Description |
|---|---|
| `store_memory` | Store text into a Wing/Room; all layers run automatically |
| `retrieve_memory` | Unified query: semantic search + graph expansion + numeric tag matching |
| `add_context_trail` | Manually link two memory chunks in the Songlines graph |
| `get_context_trail` | Return the narrative path between two concept nodes |
| `snapshot_session` | Compress a session log into SAO triplets → long-term storage |
| `list_rooms` | List all Wings and Rooms in the memory palace |

---

## LLM Provider Support

MemShān defaults to **Ollama (local, fully offline)**. Switch providers via a single env var — no code changes required.

| Provider | `LLM_PROVIDER` value | Requires |
|---|---|---|
| **Ollama** *(default)* | `ollama` | Ollama running locally |
| OpenAI | `openai` | `OPENAI_API_KEY` in `.env` |
| Google Gemini | `gemini` | `GEMINI_API_KEY` in `.env` |
| Anthropic Claude | `anthropic` | `ANTHROPIC_API_KEY` in `.env` |
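Under the hood this pattern is usually a small factory keyed on the env var. A hypothetical sketch (the class and function names here are illustrative, not MemShān's real API):

```python
import os

# Stub adapters standing in for the real provider clients.
class OllamaClient: ...
class OpenAIClient: ...
class GeminiClient: ...
class AnthropicClient: ...

_PROVIDERS = {
    "ollama": OllamaClient,
    "openai": OpenAIClient,
    "gemini": GeminiClient,
    "anthropic": AnthropicClient,
}

def make_llm_client():
    """Pick the LLM client class from LLM_PROVIDER, defaulting to local Ollama."""
    provider = os.getenv("LLM_PROVIDER", "ollama").lower()
    try:
        return _PROVIDERS[provider]()
    except KeyError:
        raise ValueError(f"Unknown LLM_PROVIDER: {provider!r}") from None
```

Because the lookup happens at call time, flipping `LLM_PROVIDER` in `.env` swaps the backend without touching code.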

---

## Installation

### Prerequisites

- Python 3.11+
- [Ollama](https://ollama.com) installed and running (for default local LLM)
- Windows: batch scripts provided (`.bat`)
- Linux / macOS: shell scripts provided (`.sh`) — make executable with `chmod +x *.sh`

### Windows Quick Start

```bat
REM 1. Initialize git
000_init.bat

REM 2. Create virtual environment
001_env.bat

REM 3. Activate virtual environment
002_activate.bat

REM 4. Install dependencies
003_setup.bat
```

### Linux / macOS Quick Start

```bash
# Make scripts executable (one-time)
chmod +x *.sh

# 1. Initialize git
./000_init.sh

# 2. Create virtual environment
./001_env.sh

# 3. Activate virtual environment (must be sourced)
source 002_activate.sh

# 4. Install dependencies
./003_setup.sh
```

### Manual (inside activated venv)

```bash
pip install -r requirements.txt
```

### Configure `.env`

Copy and edit the environment file:

```env
# LLM Provider (default: ollama)
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.2

# Optional providers (uncomment and add keys to switch)
# LLM_PROVIDER=openai
# OPENAI_API_KEY=sk-...

# ChromaDB storage
CHROMA_PERSIST_DIR=./data/chroma

# Embeddings
EMBEDDING_MODEL=all-MiniLM-L6-v2
```
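The server itself reads these values through pydantic-settings; as a rough stdlib-only sketch of the same idea, with the defaults shown above:

```python
import os
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Settings:
    """Environment-driven configuration mirroring the .env keys above."""
    llm_provider: str = field(
        default_factory=lambda: os.getenv("LLM_PROVIDER", "ollama"))
    ollama_base_url: str = field(
        default_factory=lambda: os.getenv("OLLAMA_BASE_URL", "http://localhost:11434"))
    ollama_model: str = field(
        default_factory=lambda: os.getenv("OLLAMA_MODEL", "llama3.2"))
    chroma_persist_dir: str = field(
        default_factory=lambda: os.getenv("CHROMA_PERSIST_DIR", "./data/chroma"))
    embedding_model: str = field(
        default_factory=lambda: os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2"))
```

Any key left out of `.env` falls back to the default shown in the file above.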

---

## Usage

```bat
REM Windows — Run the MCP server
004_run.bat
```

```bash
# Linux / macOS
./004_run.sh
```

```powershell
# Equivalent (inside activated venv)
python main.py
```

---

## Testing

```bat
REM Windows
005_run_test.bat
005_run_code_cov.bat
```

```bash
# Linux / macOS
./005_run_test.sh
./005_run_code_cov.sh
```

```powershell
# Equivalent inside activated venv (Windows)
.venv\Scripts\pytest tests/ -v
.venv\Scripts\pytest tests/ --cov=src --cov-report=term-missing
```

```bash
# Equivalent inside activated venv (Linux / macOS)
.venv/bin/pytest tests/ -v
.venv/bin/pytest tests/ --cov=src --cov-report=term-missing
```

Coverage target: **100% line AND branch coverage per module** — no exceptions.

---

## Security Scanning

MemShān enforces a **zero-tolerance vulnerability policy** on the `main` branch.
All 90 transitive dependencies are audited via [pip-audit](https://pypi.org/project/pip-audit/)
before every commit that changes `requirements.txt`.

### Run the security scan

```bat
REM Windows — Full scan: requirements + installed environment → JSON + HTML reports
006_pip_audit.bat
```

```bash
# Linux / macOS
./006_pip_audit.sh
```

The script runs **two passes**:

| Pass | Scope | Output |
|---|---|---|
| Requirements scan | Direct deps + full transitive tree (as pip resolves) | `security_reports\pip_audit_<TS>.json` + `.html` |
| Environment scan | Everything installed in the venv | `security_reports\pip_audit_env_<TS>.json` + `.html` |

Both JSON files are converted automatically to self-contained, dark-themed HTML
reports with package tables, CVE details, and a filter bar.

### Utility script

`tools/pip_audit_to_html.py` — reusable converter. Accepts pip-audit JSON from a
file or stdin and writes a timestamped HTML report.

```powershell
# Pipe directly
$env:PYTHONUTF8="1"
python -m pip_audit -r requirements.txt --format json 2>$null |
    python tools/pip_audit_to_html.py

# From a saved file
python tools/pip_audit_to_html.py security_reports/audit.json

# Custom output path
python tools/pip_audit_to_html.py audit.json --output reports/my_report.html
```

### Copilot prompt

Use the `/pip-audit` prompt in GitHub Copilot Chat to run the full scan interactively:
`.github/prompts/pip-audit.prompt.md`

### Policy

- Run `006_pip_audit.bat` (Windows) or `./006_pip_audit.sh` (Linux / macOS) before every commit that adds or changes dependencies.
- Resolve **ALL** findings before pushing to `main`.
- Reports are gitignored — only the scripts and prompt are committed.

### Test Scenario Markers

Every test file must cover three scenario groups:

```python
@pytest.mark.positive  # happy path
@pytest.mark.negative  # error / failure conditions
@pytest.mark.edge      # boundary values, empty inputs, None, single-item collections
```

---

## Script Reference

### Core Scripts

| Windows (`.bat`) | Linux / macOS (`.sh`) | Purpose |
|---|---|---|
| `000_init.bat` | `000_init.sh` | Initializes git and sets user name / email |
| `001_env.bat` | `001_env.sh` | Creates a `.venv` virtual environment |
| `002_activate.bat` | `source 002_activate.sh` | Activates the virtual environment |
| `003_setup.bat` | `003_setup.sh` | Installs `requirements.txt` and initializes MemPalace |
| `004_run.bat` | `004_run.sh` | Runs the MCP server (`main.py`) |
| `005_run_test.bat` | `005_run_test.sh` | Runs the full pytest suite with HTML report |
| `005_run_code_cov.bat` | `005_run_code_cov.sh` | Runs tests with HTML coverage report |
| `006_pip_audit.bat` | `006_pip_audit.sh` | pip-audit security scan → JSON + HTML reports |
| `008_deactivate.bat` | `source 008_deactivate.sh` | Deactivates the virtual environment |

### MemPalace Utility Scripts

| Windows (`.bat`) | Linux / macOS (`.sh`) | Purpose |
|---|---|---|
| `007_mp_mine.bat` | `007_mp_mine.sh` | Mine workspace files into MemPalace |
| `007_mp_status.bat` | `007_mp_status.sh` | Show palace drawer counts and status |
| `007_mp_search.bat` | `007_mp_search.sh` | Search the palace with optional wing/room filters |
| `007_mp_compress.bat` | `007_mp_compress.sh` | Compress drawers using AAAK Dialect (~30× token reduction) |
| `007_mp_diary.bat` | `007_mp_diary.sh` | Read or write agent diary entries |
| `007_mp_wakeup.bat` | `007_mp_wakeup.sh` | Output L0 + L1 context (~600-900 tokens) for session start |
| `007_mp_repair.bat` | `007_mp_repair.sh` | Rebuild vector index after corruption or abrupt exit |

> **Linux / macOS note:** All `.sh` scripts must be made executable once: `chmod +x *.sh`  
> `002_activate.sh` and `008_deactivate.sh` must be **sourced** (`source <script>`), not executed.

---

## Project Structure

### Implemented

```
src/
├── config.py                    # ✅ Pydantic BaseSettings — all env config + LLM factory
├── llm/                         # ✅ LLM provider adapters
│   ├── client.py                #    LLMClient ABC
│   ├── ollama_client.py         #    Ollama (default, fully offline)
│   ├── openai_client.py         #    OpenAI
│   ├── gemini_client.py         #    Google Gemini
│   └── anthropic_client.py      #    Anthropic Claude
├── base/                        # ✅ Method of Loci — Base Layer
│   ├── loci_store.py            #    ChromaDB Wings/Rooms abstraction
│   └── embedder.py              #    sentence-transformers wrapper
└── layers/
    ├── major_system/            # ✅ Layer 1 — Numerical Precision
    │   └── phonetic_encoder.py  #    Numbers → phonetic consonant tags
    └── songlines/               # ✅ Layer 2 — Contextual Continuity
        └── knowledge_graph.py   #    NetworkX directed graph; Context Trails; GraphML persistence
```

### Planned (upcoming phases)

```
src/
├── server.py                    # 🔲 MCP server entry point (FastMCP) — Phase 9
└── layers/
    ├── pao/                     # 🔲 Layer 3 — Episodic Compression (SAO triplets) — Phase 6
    │   └── snapshot.py
    ├── pipeline/                # 🔲 Unified retrieval pipeline — Phase 7
    │   └── retrieval.py
    └── tools/                   # 🔲 MCP tool definitions — Phase 8
        └── mcp_tools.py
```

---

## Success Metrics & Observability

MemShān measures intelligence *density*, not just retrieval correctness. The scorecard below
bridges technical performance of the mnemonic layers with enterprise engineering goals.

### 1. Key Performance Indicators (KPIs)

*How the server runs — quantified technical efficiency per retrieval layer.*

| Category | KPI | Target | Rationale |
|---|---|---|---|
| **Retrieval Quality** | Faithfulness / Groundedness | > 95% | Prevents hallucinated context when traversing a Songline trail |
| **Numerical Precision** | Numerical Recall Accuracy | 100% | Validates the Major System phonetic-tag pipeline for stats and dates |
| **Compression** | Context Compression Ratio | ≥ 5 : 1 | Measures how efficiently the PAO System converts raw session logs to actionable SAO triplets |
| **Latency** | P99 Retrieval Latency | < 200 ms | Loci lookups must not throttle the agent's reasoning loop |
| **Observability** | PulseGuard Hit Rate | 100% of queries | Ensures every retrieval event is logged and validated for semantic drift |

---

### 2. Key Result Areas (KRAs)

*What MemShān achieves — enterprise-level value domains.*

#### KRA 1 — Amnesia-Proof Persistence

> **Metric: Context Retention Span**

Measure how many consecutive sessions an agent maintains perfect continuity on a complex,
multi-phase project without requiring a re-prompt or context refresh.

- **Baseline:** standard zero-shot / short-context agent — continuity typically breaks after 1–2 sessions.
- **MemShān target:** ≥ 10 sessions with no loss of project state.

#### KRA 2 — Quality Engineering Modernisation

> **Metric: Defect Detection Velocity**

Measure how much faster AI-assisted QE reviews identify architectural flaws when MemShān provides
accumulated project memory versus cold zero-shot prompts.

- **Baseline:** cold-prompt review time per module (measured in minutes).
- **MemShān target:** ≥ 40% reduction in time-to-defect-detection.

#### KRA 3 — Resource Optimisation

> **Metric: Token-to-Knowledge Density (TKD)**

$$\text{TKD} = \frac{\text{Relevant facts delivered to model}}{\text{Tokens consumed from context window}}$$

High TKD means MemShān surfaces more signal within the model's 128k / 200k token budget.

- **Target:** TKD ≥ 3× vs. naive full-log injection.
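In code, the ratio is a plain division; the fact and token counts below are invented purely to illustrate the ≥ 3× comparison:

```python
def token_to_knowledge_density(relevant_facts: int, tokens_consumed: int) -> float:
    """TKD = relevant facts delivered / context-window tokens consumed."""
    if tokens_consumed <= 0:
        raise ValueError("tokens_consumed must be positive")
    return relevant_facts / tokens_consumed

# Hypothetical comparison: MemShān retrieval vs. naive full-log injection.
memshan_tkd = token_to_knowledge_density(relevant_facts=30, tokens_consumed=1_000)
baseline_tkd = token_to_knowledge_density(relevant_facts=40, tokens_consumed=8_000)
assert memshan_tkd / baseline_tkd >= 3  # meets the >= 3x target
```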

---

### 3. Layer-Specific Experimental Metrics

*Proving that each mnemonic addition (beyond standard vector RAG) earns its place.*

#### Songlines — Narrative Coherence

**Test:** Ask the AI to reconstruct a project's event history from Songline graph traversal.

| Metric | Description | Pass Threshold |
|---|---|---|
| **Temporal Accuracy** | Events retrieved in correct causal / chronological order | ≥ 95% |
| **Sequence Drift** | Compared against standard vector-only RAG (which frequently reorders events) | Songlines must outperform by ≥ 20 pp |

#### PAO System — Reconstruction Fidelity

**Test:** Compress a session log into SAO triplets, then ask the AI to reconstruct the full system state from triplets alone.

| Metric | Description | Pass Threshold |
|---|---|---|
| **Reconstruction Fidelity** | Facts present in original log that survive compression → decompression | ≥ 98% |
| **Data Leakage Rate** | Critical facts lost during PAO snapshot | 0% for facts tagged `critical` |

#### Major System — Numeric Round-Trip

**Test:** Store text containing numerical data, query using rephrased numeric context.

| Metric | Description | Pass Threshold |
|---|---|---|
| **Phonetic Tag Recall** | Correct chunk retrieved via phonetic tag alone | 100% |
| **False Positive Rate** | Unrelated numeric chunks surfaced in results | < 2% |

---

### 4. Observability Dashboard Checklist

*Recommended metrics to surface in Grafana / a custom VeredianAI UI.*

| Signal | What to Track | Alert Condition |
|---|---|---|
| **Memory Growth Rate** | Wing/Room document count over time | Sudden spike > 3× daily average |
| **Room Utilisation** | Query hit-count per Room (hot vs. cold context) | Room unutilised for > 30 days → archive candidate |
| **Semantic Drift Alarm** | Distance between query embedding and retrieved chunk | Cosine distance > 0.4 → PulseGuard flag |
| **Hallucination Rate** | % of responses where retrieved chunk was not used by model | Target < 5% |
| **LLM Fallback Rate** | % of PAO extractions that fell back from spaCy to LLM | Target < 20% (high rate = spaCy model needs retraining) |
| **P99 / P95 Latency** | Per-layer breakdown: embed → query → graph expand → merge | P99 > 200 ms → alert |
| **Provider Error Rate** | Per LLM provider: 4xx / 5xx / timeout rate | Any provider > 1% error rate → alert |
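The semantic-drift alarm reduces to a cosine-distance threshold on embedding vectors. A stdlib sketch (the 3-d vectors stand in for real embeddings; 0.4 is the threshold from the table):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def drift_flagged(query_vec: list[float], chunk_vec: list[float],
                  threshold: float = 0.4) -> bool:
    """PulseGuard-style check: flag retrievals that drift past the threshold."""
    return cosine_distance(query_vec, chunk_vec) > threshold
```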

---

## Engineering Standards

| Standard | Requirement |
|---|---|
| **SOLID** | Every production class in `src/` must demonstrably satisfy all five SOLID principles |
| **OOP** | Encapsulation, composition over inheritance, constructor injection throughout |
| **Test Coverage** | **100% line AND branch** coverage per module — no exceptions |
| **Test Scenarios** | Every test file: `@pytest.mark.positive` + `@pytest.mark.negative` + `@pytest.mark.edge` |
| **Database Changes** | All schema changes via Alembic migration — never mutate schema directly |
| **Security Scanning** | `pip-audit -r requirements.txt` before every commit with dependency changes |
| **Session Close** | Write MemPalace diary entry + update `Session_N.md` + commit & push every session |
| **Drift Review** | Compare implementation vs architecture docs every 3 sessions |

See [`.github/copilot-instructions.md`](.github/copilot-instructions.md) for full details.

---

## Contributing

Contributions are welcome. Please read [ARCHITECTURE.md](ARCHITECTURE.md) and
[TASK_EXECUTION_PLAN.md](TASK_EXECUTION_PLAN.md) before submitting a PR.

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/my-feature`
3. Commit your changes: `git commit -m "feat: add my feature"`
4. Push to the branch: `git push origin feature/my-feature`
5. Open a Pull Request

---

## License

MIT License — see [LICENSE](LICENSE) for details.
