Metadata-Version: 2.4
Name: motion_rag
Version: 0.2.0
Summary: Local-first Python SDK for adaptive document retrieval.
Author: Redouane
License: MIT
Project-URL: Homepage, https://github.com/Rvey/motion_rag
Project-URL: Repository, https://github.com/Rvey/motion_rag
Project-URL: Issues, https://github.com/Rvey/motion_rag/issues
Keywords: retrieval,embeddings,citations,cli,docs
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Provides-Extra: dev
Requires-Dist: pytest>=8.3; extra == "dev"
Provides-Extra: pgvector
Requires-Dist: psycopg[binary]>=3.2; extra == "pgvector"
Requires-Dist: pgvector>=0.3; extra == "pgvector"
Provides-Extra: mongo
Requires-Dist: pymongo>=4.8; extra == "mongo"
Provides-Extra: redis
Requires-Dist: redis>=5.0; extra == "redis"
Provides-Extra: pdf-structured
Requires-Dist: docling<3; extra == "pdf-structured"
Provides-Extra: sbert
Requires-Dist: sentence-transformers>=3.0; extra == "sbert"

# Motion RAG

Motion RAG is a local-first Python SDK for adaptive document retrieval.

This repository currently contains the package foundation for V1:

- package metadata and installable layout
- shared domain and config models
- core protocols for later phases
- a narrow `MotionRAG` public facade
- logging and test scaffolding
- deterministic chunk contextualization
- vector embeddings with OpenRouter integration
- local-first in-memory SQLite indexing primitives
- optional pgvector and MongoDB vector backends

The implementation intentionally grows one phase at a time so later retrieval and generation layers can build on stable models and inspectable metadata.

To exercise the current pipeline against a real local file or directory, run:

```bash
python3 scripts/test_real_file.py /path/to/file-or-folder
```

Add `--json` for structured output.

To benchmark retrieval and answer latency against real local files, run:

```bash
python3 scripts/benchmark_real_queries.py /path/to/file-or-folder --dataset queries.json
```
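
The dataset schema is defined by the benchmark script itself; purely as a hypothetical illustration (field names are assumptions, not the script's documented format), a minimal `queries.json` might pair each question with the evidence you expect in the answer:

```json
[
  {
    "question": "What are the refund rules?",
    "expected_evidence": "refund window and conditions"
  }
]
```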

To benchmark the bundled sample PDFs with a ready-made 4-question dataset and detailed per-query evidence output, run:

```bash
python3 scripts/benchmark_pdf_embeddings.py
```

To compare local fallback embeddings against OpenRouter embeddings on the same query set, run:

```bash
python3 scripts/benchmark_real_queries.py /path/to/file-or-folder --dataset queries.json --compare-embeddings
```

For stronger PDF structure extraction, install the optional `pdf-structured` extra, which pulls in Docling:

```bash
python3 -m pip install -e ".[pdf-structured]"
```

When Docling is installed, Motion RAG uses it for PDF parsing:

```bash
python3 scripts/test_real_file.py /path/to/file.pdf
```

The Docling path maps the structured `DoclingDocument` body tree directly into Motion RAG sections and blocks. Picture content is skipped for now, while cleaner body text, list and table structure, and page provenance signals are preserved.

For vector embeddings, Motion RAG uses OpenRouter when `OPENROUTER_API_KEY` is set. Without a key, it falls back to a deterministic local embedding implementation so tests and offline runs still work.
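
The deterministic fallback can be pictured as a simple hash-based embedder. The sketch below illustrates the idea only; it is not the library's actual implementation, and the function name and 256-dimension default are assumptions (the benchmarks below refer to a "HashEmbed (256d)" variant):

```python
import hashlib
import math

def hash_embed(text: str, dim: int = 256) -> list[float]:
    """Deterministically map text to a fixed-size vector.

    Each token's SHA-256 digest selects buckets to increment, so the
    same input always yields the same vector, with no network calls.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        # Use consecutive pairs of digest bytes as bucket indices.
        for i in range(0, len(digest), 2):
            bucket = ((digest[i] << 8) | digest[i + 1]) % dim
            vec[bucket] += 1.0
    # L2-normalize so cosine similarity reduces to a dot product.
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]
```

Because the vectors depend only on the input text, offline test runs see stable embeddings from one invocation to the next.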

Vector storage defaults to in-memory SQLite for local runs. For persistent backends, set `index.vector_backend` to `pgvector` or `mongo` and provide the matching connection settings in `IndexConfig`.

## CLI

```bash
motion-rag ingest ./docs --index docs
motion-rag query docs "What are the refund rules?"
motion-rag retrieve docs "What are the refund rules?" --debug
motion-rag eval docs --dataset eval.json
```

## SDK

```python
from motion_rag import MotionRAG, MotionRAGConfig

rag = MotionRAG(MotionRAGConfig())
rag.index.create(name="docs", source="./docs")
result = rag.query.ask(index="docs", question="What are the refund rules?")
```

## Docs

The VitePress source lives in `docs/`.

## Publishing

Releases are tagged automatically from `main`, then published to PyPI from the GitHub release workflow.

After release, install with:

```bash
pip install motion-rag
```

## Benchmark Results

Run on 2 PDFs (217 chunks, 4 queries) using `scripts/benchmark_quality.py` with Gemini 2.5 Flash Lite as quality evaluator.

### Master Summary — Latency, Recall & Quality

```
Variant                       Build  Chunks  R_mean    R_p95     R@1  R@3  R@5  MRR    Qual  Rel  Comp  Backend
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────
SQLite + HashEmbed (256d)      3.0s   217    693.5ms  749.8ms  1.00 1.00 1.00 1.0000  5.00 5.00 5.00  sqlite+hnsw
SQLite + OpenRouter (1536d)    2.7s   217    656.5ms  708.5ms  1.00 1.00 1.00 1.0000  5.00 5.00 5.00  sqlite+hnsw
pgvector HNSW + OR (1536d)     4.1s   217    488.3ms  550.3ms  1.00 1.00 1.00 1.0000  4.00 4.00 4.00  pgvector+hnsw
pgvector brute + OR (1536d)    3.6s   217    540.9ms  783.3ms  1.00 1.00 1.00 1.0000  4.00 4.00 4.00  pgvector+none
HNSW + reranker OFF            4.2s   217    529.6ms  537.9ms  1.00 1.00 1.00 1.0000  4.00 4.00 4.00  pgvector+hnsw
SQLite + ST (384d) ──★──      2.7s   217    617.7ms  634.6ms  1.00 1.00 1.00 1.0000  5.00 5.00 5.00  sqlite+hnsw
pgvec HNSW + ST (384d) ──★──  3.8s   217    506.3ms  581.4ms  1.00 1.00 1.00 1.0000  4.00 4.00 4.00  pgvector+hnsw
```

★ = sbert integration (local sentence-transformers, no API calls)

### Chunking Strategy Comparison

```
Strategy         Chunks  Build   R_mean    R@3  MRR    Quality  Ev
────────────────────────────────────────────────────────────────────
section            147   2.5s    602.1ms  1.00 1.0000  5.00     6
recursive_token    132   2.0s    594.1ms  1.00 1.0000  5.00     6
parent_child       282   3.2s    649.9ms  1.00 1.0000  5.00     7
auto               217   3.0s    734.6ms  1.00 1.0000  5.00     7
```
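
The `recursive_token` row above refers to splitting text into token-bounded chunks by trying coarser separators first and recursing with finer ones. A minimal sketch of that idea, with whitespace tokens and a toy size limit standing in for the library's real tokenizer and settings:

```python
def recursive_chunk(text: str, max_tokens: int = 200,
                    separators: tuple[str, ...] = ("\n\n", "\n")) -> list[str]:
    """Split text so no chunk exceeds max_tokens whitespace tokens,
    preferring paragraph breaks, then line breaks, then a hard split."""
    if len(text.split()) <= max_tokens:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: hard-split the token stream.
        tokens = text.split()
        return [" ".join(tokens[i:i + max_tokens])
                for i in range(0, len(tokens), max_tokens)]
    sep, rest = separators[0], separators[1:]
    chunks: list[str] = []
    for part in text.split(sep):
        chunks.extend(recursive_chunk(part, max_tokens, rest))
    return chunks
```

Keeping paragraph boundaries where possible is what lets chunk counts differ so much across strategies (132 vs 282 chunks above) while retrieval quality stays flat.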

### Speedup vs SQLite HashEmbedding (baseline)

```
Variant                          Latency    Speedup  Savings
──────────────────────────────────────────────────────────────
SQLite + HashEmbed (256d)        693.5ms    1.00x     0.0%
SQLite + OpenRouter (1536d)      656.5ms    1.06x     5.3%
pgvector HNSW + OR (1536d)       488.3ms    1.42x    29.6%  ← fastest
pgvector brute + OR (1536d)      540.9ms    1.28x    22.0%
HNSW + reranker OFF              529.6ms    1.31x    23.6%
SQLite + ST (384d)               617.7ms    1.12x    10.9%  ← zero network
pgvector HNSW + ST (384d)        506.3ms    1.37x    27.0%
```

### Per-Query Quality (overall 1-5)

```
Question                          Hash   SQL+OR  pg+HNSW  ST    pg+ST
──────────────────────────────────────────────────────────────────────
Sofitel Strasbourg?                5      5       5        5     5
Sofitel eco-cert 2025?             5      5       5        5     5
Barton Hills home size?            5      5       5        5     5
Email for guidance?                5      5       1        5     1
```

### Algorithm Comparison — Key Metrics

```
Metric                  HashEmbed  SQL+OR   pg+HNSW  pg+brute  NoRerank  Best
──────────────────────────────────────────────────────────────────────────────
Retrieve Mean            693.5ms   656.5ms  488.3ms  540.9ms   529.6ms   → pgvector HNSW
Retrieve P95             749.8ms   708.5ms  550.3ms  783.3ms   537.9ms   → HNSW + NO rerank
Build Time                 3.0s      2.7s     4.1s     3.6s      4.2s    → SQLite + OpenRouter
Recall@1/3/5/MRR          1.00      1.00     1.00     1.00      1.00     → ALL EQUAL
Quality (Overall)         5.00      5.00     4.00     4.00      4.00     → SQLite variants
```

### Key Takeaways

| Insight | Data |
|---|---|
| **pgvector HNSW is fastest** | 488ms mean (1.42x speedup), lowest P95 jitter |
| **All backends have equal recall** | R@1/3/5 = 1.00, MRR = 1.000 for every variant |
| **sbert local embedding = zero network** | 618ms mean, same quality as OpenRouter, no API key needed |
| **Cross-encoder reranker improves ranking** | Adds ~300-800ms on the first run (model loading); cached afterwards |
| **Chunking strategy doesn't affect quality** | Section, recursive, parent-child all achieve 5.0 quality |
| **Reranker off saves ~3%** | With negligible quality difference on simple queries |
| **pgvector has a quality blind spot** | Scores 1 vs 5 on 1/4 queries — likely `<=>` operator encoding difference |
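
The reranking step these takeaways refer to can be sketched generically: take the top-k candidates from the vector search, then re-order them with a second, more expensive scorer. Here a toy lexical-overlap scorer stands in for the actual cross-encoder model:

```python
def rerank(query: str, candidates: list[str], top_k: int = 3) -> list[str]:
    """Re-order candidates by a second-stage score.

    A real system would score (query, passage) pairs with a
    cross-encoder; this toy scorer uses token overlap instead.
    """
    q_tokens = set(query.lower().split())

    def score(passage: str) -> float:
        p_tokens = set(passage.lower().split())
        return len(q_tokens & p_tokens) / (len(q_tokens) or 1)

    return sorted(candidates, key=score, reverse=True)[:top_k]
```

The design trade-off in the table falls out of this shape: the second stage touches only a handful of candidates, so it can afford a heavier model, but that model has to be loaded once before the first query.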

### Run the Benchmark Yourself

```bash
# Full quality benchmark (requires OpenRouter API key)
python3 scripts/benchmark_quality.py

# Latency + recall benchmark
python3 scripts/benchmark_comprehensive.py
```
