Metadata-Version: 2.4
Name: mempalace-gpu
Version: 3.1.0
Summary: GPU-accelerated fork of mempalace — NVIDIA CUDA, AMD ROCm, Apple MPS. 3-6x faster mining.
Author: phobicdotno
License-Expression: MIT
Project-URL: Homepage, https://github.com/phobicdotno/mempalace-gpu
Project-URL: Repository, https://github.com/phobicdotno/mempalace-gpu
Project-URL: Bug Tracker, https://github.com/phobicdotno/mempalace-gpu/issues
Project-URL: Upstream, https://github.com/milla-jovovich/mempalace
Keywords: ai,memory,llm,rag,chromadb,mcp,vector-database,claude,chatgpt,embeddings,gpu,cuda,rocm,mps,apple-silicon
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Utilities
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: chromadb>=0.4.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: build>=1.0; extra == "dev"
Requires-Dist: twine>=4.0; extra == "dev"
Provides-Extra: gpu
Requires-Dist: sentence-transformers>=2.2.0; extra == "gpu"
Requires-Dist: torch>=2.0.0; extra == "gpu"
Dynamic: license-file

# mempalace-gpu

> GPU-accelerated fork of [milla-jovovich/mempalace](https://github.com/milla-jovovich/mempalace)

This fork adds GPU-accelerated embeddings and batch processing to MemPalace. It supports **NVIDIA (CUDA)**, **AMD (ROCm)**, and **Apple Silicon (MPS)**. For documentation on MemPalace itself (palace structure, AAAK dialect, MCP tools, benchmarks), see the [upstream README](https://github.com/milla-jovovich/mempalace#readme).

---

## What this fork adds

### GPU-accelerated embeddings

Embeddings are computed via `sentence-transformers` on GPU when available, falling back to ChromaDB's default CPU/ONNX model when not.

```bash
mempalace mine ~/myproject --device auto    # auto-detect best GPU
mempalace mine ~/myproject --device cuda    # NVIDIA
mempalace mine ~/myproject --device rocm    # AMD
mempalace mine ~/myproject --device mps     # Apple Silicon (M1-M5)
mempalace mine ~/myproject --device cpu     # force CPU
```

The device can also be set with the `MEMPALACE_DEVICE` environment variable or the `"device"` key in `~/.mempalace/config.json`.
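The three configuration sources and the `auto` mode can be pictured with a minimal sketch. It is not the code in `mempalace/embeddings.py`: the function names `detect_device` and `resolve_device` are hypothetical, and the precedence shown (CLI flag, then environment variable, then config file, then `auto`) is an assumption. The PyTorch checks themselves are standard; ROCm builds of PyTorch report through `torch.cuda` and set `torch.version.hip`.

```python
import json
import os
from pathlib import Path

VALID_DEVICES = {"auto", "cuda", "rocm", "mps", "cpu"}


def detect_device():
    """Return the best available backend name, or "cpu" if PyTorch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"  # no GPU extras installed
    if torch.cuda.is_available():
        # ROCm builds expose AMD GPUs through the torch.cuda API;
        # torch.version.hip is a version string on those builds, None otherwise.
        return "rocm" if getattr(torch.version, "hip", None) else "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def resolve_device(cli_value=None, env=None, config_path=None):
    """Pick a device from the CLI flag, env var, or config file (assumed order)."""
    env = os.environ if env is None else env
    path = Path(config_path) if config_path else Path.home() / ".mempalace" / "config.json"

    choice = cli_value or env.get("MEMPALACE_DEVICE")
    if not choice and path.is_file():
        try:
            data = json.loads(path.read_text())
            choice = data.get("device") if isinstance(data, dict) else None
        except (OSError, json.JSONDecodeError):
            choice = None  # unreadable config: fall through to auto

    choice = (choice or "auto").lower()
    if choice not in VALID_DEVICES:
        raise ValueError(f"unknown device {choice!r}; expected one of {sorted(VALID_DEVICES)}")
    return detect_device() if choice == "auto" else choice
```

Calling `resolve_device("cpu")` returns `"cpu"` without importing PyTorch, while `resolve_device()` with nothing configured falls through to auto-detection.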

### Batch processing

`collection.add()` calls are batched (100 documents per call instead of 1), reducing ChromaDB overhead regardless of CPU or GPU mode.
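As a rough illustration of the batching idea, the sketch below buffers documents and forwards them in groups of 100. The `BatchedWriter` class is hypothetical and is not the implementation in `mempalace/miner.py`; it only assumes that `collection` has an `add(ids=..., documents=..., metadatas=...)` method, which matches ChromaDB's `Collection.add()`.

```python
class BatchedWriter:
    """Buffer documents and send them to a collection in fixed-size batches."""

    def __init__(self, collection, batch_size=100):
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self.collection = collection
        self.batch_size = batch_size
        self._ids, self._documents, self._metadatas = [], [], []

    def add(self, doc_id, document, metadata):
        """Queue one document; flush automatically when the buffer is full."""
        self._ids.append(doc_id)
        self._documents.append(document)
        self._metadatas.append(metadata)
        if len(self._ids) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send whatever is buffered in a single collection.add() call."""
        if not self._ids:
            return
        self.collection.add(
            ids=self._ids,
            documents=self._documents,
            metadatas=self._metadatas,
        )
        # Start fresh lists so the collection keeps its own references.
        self._ids, self._documents, self._metadatas = [], [], []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Flush the final partial batch only if the block finished cleanly.
        if exc_type is None:
            self.flush()
        return False
```

Used as `with BatchedWriter(collection) as writer: ...`, 250 queued documents result in three `add()` calls (100, 100, 50) instead of 250.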

### Self-update MCP tool

The MCP server includes a `mempalace_self_update` tool that pulls the latest version from PyPI, callable directly from your AI assistant.
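One plausible shape for such a tool is a thin wrapper around `pip install --upgrade`. The sketch below is an assumption about how that could work, not the code in `mempalace/mcp_server.py`: the helper names `build_update_command` and `self_update` are hypothetical, and the distribution name is taken from this package's metadata.

```python
import subprocess
import sys

PACKAGE_NAME = "mempalace-gpu"  # distribution name from the package metadata


def build_update_command(package=PACKAGE_NAME):
    """Build a pip command that upgrades the package in the current interpreter."""
    # Using sys.executable keeps the upgrade inside the active environment.
    return [sys.executable, "-m", "pip", "install", "--upgrade", package]


def self_update(run=subprocess.run):
    """Run the upgrade and return a small status dict (hypothetical result format)."""
    result = run(build_update_command(), capture_output=True, text=True)
    output = (result.stdout or result.stderr or "").strip()
    return {"ok": result.returncode == 0, "output": output}
```

The `run` parameter exists only so the function can be exercised without touching the network; a real server would need to be restarted to load the upgraded code.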

---

## Performance

Measured on two real-world codebases with an NVIDIA GPU (CUDA). Each pair of runs mines the same files into the same drawers; only the device changes.

| Test | Files | Drawers | Size | CPU | GPU | Speedup |
|------|-------|---------|------|-----|-----|---------|
| Large mixed codebase (JS/TS/Dart/Python/HTML) | 118 | 13,673 | ~1.7 GB | 156.7s | 26.3s | **6.0x** |
| Medium Flutter app (Dart/YAML/JSON) | 145 | 2,906 | ~85 MB | 37.3s | 10.7s | **3.5x** |

In these two runs, the speedup grew with drawer count: more chunks mean more embedding work, which is where the GPU helps most.
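The figures in the table can be rechecked with a few lines of arithmetic. The snippet below uses only the drawer counts and timings listed above; the `summarize` helper is introduced here for illustration.

```python
# (label, drawers, CPU seconds, GPU seconds), copied from the table above
RUNS = [
    ("Large mixed codebase", 13673, 156.7, 26.3),
    ("Medium Flutter app", 2906, 37.3, 10.7),
]


def summarize(drawers, cpu_s, gpu_s):
    """Return the speedup and the drawers-per-second rate on each device."""
    return {
        "speedup": round(cpu_s / gpu_s, 1),
        "cpu_rate": round(drawers / cpu_s),
        "gpu_rate": round(drawers / gpu_s),
    }


for label, drawers, cpu_s, gpu_s in RUNS:
    stats = summarize(drawers, cpu_s, gpu_s)
    print(f"{label}: {stats['speedup']}x, "
          f"{stats['cpu_rate']} -> {stats['gpu_rate']} drawers/s")
```

This reproduces the 6.0x and 3.5x speedups. It also shows that CPU throughput is similar in both runs (about 87 and 78 drawers per second), while GPU throughput is roughly twice as high on the larger run (about 520 versus 272 drawers per second).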

---

## Installation

```bash
# Clone this fork
git clone https://github.com/phobicdotno/mempalace-gpu.git
cd mempalace-gpu
```

### NVIDIA (CUDA)

```bash
pip install -e ".[gpu]"
```

### AMD (ROCm)

```bash
# Install PyTorch with ROCm first
pip install torch --index-url https://download.pytorch.org/whl/rocm6.2
# Then install mempalace with GPU extras
pip install -e ".[gpu]"
```

### Apple Silicon (MPS)

```bash
# PyTorch ships with MPS support on macOS by default
pip install -e ".[gpu]"
```

### CPU only (still gets batch processing)

```bash
pip install -e .
```

### Staying in sync with upstream

```bash
git remote add upstream https://github.com/milla-jovovich/mempalace.git
git fetch upstream
git merge upstream/main
```

---

## Changes from upstream

| File | Change |
|------|--------|
| `mempalace/embeddings.py` | **New** -- GPU detection (NVIDIA/AMD/Apple), embedding factory, batch flush |
| `mempalace/miner.py` | Batched `collection.add()`, content hashing, `update()` command |
| `mempalace/convo_miner.py` | Batched `collection.add()` |
| `mempalace/config.py` | `device` property (auto/cuda/rocm/mps/cpu) |
| `mempalace/cli.py` | `--device` flag, `update` subcommand |
| `mempalace/mcp_server.py` | `mempalace_self_update` tool, shared embeddings |
| `mempalace/searcher.py` | Shared embedding function for vector compatibility |
| `mempalace/layers.py` | Shared embedding function |
| `mempalace/palace_graph.py` | Shared embedding function |
| `pyproject.toml` | `gpu` optional dependency group |

All other files are unmodified from upstream. Existing palaces remain compatible.

---

## License

MIT -- same as upstream.
