Metadata-Version: 2.4
Name: turborag-ahx47
Version: 0.1.1
Summary: My own RAG library built on turbovec
Author-email: AHX47 <abdo47hak47@gmail.com>
Project-URL: Homepage, https://github.com/AHX47/turborag
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: turbovec
Requires-Dist: numpy>=1.24
Requires-Dist: llama-cpp-python>=0.2.0
Dynamic: requires-python

# TurboRag-ahx47

**TurboRag** is a fully offline, low‑CPU, low‑RAM RAG (Retrieval-Augmented Generation) engine.  
This package (`turborag-ahx47`) is a custom build that leverages:

- **TurboVec** – quantized (Q4) vector index (8× smaller than float32, faster than FAISS)
- **llama-cpp-python** – runs all models as Q4_K_M GGUF files on CPU
- **Optional REST API** – can be added via FastAPI (not included in this core library)
- **SDKs** – Python only for now

> **Note:** The original project name `turborag` was already taken on PyPI. This is the official library under the name `turborag-ahx47`.
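
The "8× smaller than float32" figure follows from simple arithmetic if Q4 is read as 4 bits per stored value, compared with 32 bits for a float32. The sketch below works that out for a hypothetical index; the vector count and the 768-dimension width are illustrative assumptions, and the calculation ignores any per-vector overhead (scales, IDs, headers) that a real TurboVec index may add.

```python
def index_size_bytes(num_vectors: int, dim: int, bits_per_value: int) -> int:
    """Raw storage for the vector payload only (no scales, IDs, or headers)."""
    return num_vectors * dim * bits_per_value // 8


num_vectors = 100_000  # illustrative corpus size
dim = 768              # assumed embedding width, for illustration only

float32_bytes = index_size_bytes(num_vectors, dim, 32)
q4_bytes = index_size_bytes(num_vectors, dim, 4)

print(f"float32: {float32_bytes / 1e6:.1f} MB")     # float32: 307.2 MB
print(f"Q4:      {q4_bytes / 1e6:.1f} MB")          # Q4:      38.4 MB
print(f"ratio:   {float32_bytes / q4_bytes:.0f}x")  # ratio:   8x
```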

---

## Features

- **No GPU required, no internet at runtime** – everything runs offline on CPU.
- **Tiny memory footprint** – EmbeddingGemma 300M (≈150 MB) + Qwen 0.5B (≈300 MB).
- **TurboVec Q4 index** – 8× compression, fast brute‑force search.
- **Built‑in SQLite document store** – metadata and chunk storage.
- **Easy‑to‑use Python API** – add documents, ask questions, get answers with sources.
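
To make "brute‑force search" concrete: the query is scored against every stored vector and the `k` best matches are kept, with no tree or graph structure involved. The following is a plain-Python illustration of that idea only. It is not TurboVec's implementation, and the use of cosine similarity and the helper names are assumptions made for the example.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def brute_force_top_k(query, vectors, k=5):
    """Score every stored vector against the query and keep the k best."""
    scored = ((cosine(query, vec), idx) for idx, vec in enumerate(vectors))
    return heapq.nlargest(k, scored)  # list of (score, index), best first


vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
top = brute_force_top_k([1.0, 0.1], vectors, k=2)
print([idx for _, idx in top])  # [0, 2]
```

The cost grows linearly with the number of vectors, which is why a compact quantized representation matters for this approach.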

---

## Installation

### Prerequisites

- Python 3.10 or higher
- Rust (only if you want to build TurboVec from source – not required for normal use)

### Install from PyPI

```bash
pip install turborag-ahx47
```

This also installs the required dependencies: `turbovec`, `numpy`, and `llama-cpp-python`.

### Optional: Build TurboVec from source (advanced)

If you need a custom version of TurboVec, you can build it manually:

```bash
git clone https://github.com/RyanCodrai/turbovec.git
cd turbovec/turbovec-python
pip install maturin
maturin develop --release
```

For most users, the pre‑built `turbovec` wheel is sufficient.

---

## Quick Start

### 1. Download required models

You need two GGUF models:
- **Embedding model**: `embeddinggemma-300m-q4_k_m.gguf` (≈150 MB)  
  Download from: [Hugging Face](https://huggingface.co/sabafallah/embeddinggemma-300m-Q4_K_M-GGUF/resolve/main/embeddinggemma-300m-q4_k_m.gguf)
- **LLM model** (e.g., Qwen 0.5B): `qwen-0.5b-q4_k_m.gguf` (≈300 MB)  
  Download from your preferred source.

Place them in a folder, e.g., `models/`.
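
Before creating the engine, it can help to confirm that both files are where you expect them. A minimal sketch, assuming the `models/` folder and the file names used above; `missing_models` is a hypothetical helper, not part of the library.

```python
from pathlib import Path


def missing_models(model_dir, filenames):
    """Return the expected model files that are not present in model_dir."""
    folder = Path(model_dir)
    return [name for name in filenames if not (folder / name).is_file()]


expected = ["embeddinggemma-300m-q4_k_m.gguf", "qwen-0.5b-q4_k_m.gguf"]
missing = missing_models("models", expected)
if missing:
    print("Missing model files:", ", ".join(missing))
```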

### 2. Use the library

```python
from turborag import TurboRag

# Create RAG instance
rag = TurboRag.create(
    embed_model="models/embeddinggemma-300m-q4_k_m.gguf",
    llm_model="models/qwen-0.5b-q4_k_m.gguf",
)

# Add a document
rag.add_document("Paris is the capital of France.")

# Ask a question
answer, sources = rag.ask("What is the capital of France?")
print(answer)  # e.g. "Paris" (exact wording depends on the LLM)
print(sources)  # List of source chunks
```

---

## API Reference

### `TurboRag.create(embed_model, llm_model, **kwargs)`

Class method to instantiate the RAG engine.

| Parameter | Type | Description |
|-----------|------|-------------|
| `embed_model` | `str` | Path to the embedding GGUF file (Gemma 300M). |
| `llm_model` | `str` | Path to the LLM GGUF file (e.g., Qwen 0.5B). |
| `chunk_size` | `int` | (optional) Chunk size for splitting documents, default 512. |
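| `chunk_overlap` | `int` | (optional) Overlap between chunks, default 50. |
| `doc_store` | `SQLiteDocStore` | (optional) Document store instance; see "Using a custom document store". |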

Returns: `TurboRag` instance.
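
To illustrate how `chunk_size` and `chunk_overlap` interact, here is a sketch of a sliding-window splitter. It assumes sizes are counted in characters, which this document does not specify, and `split_into_chunks` is a hypothetical helper. The library's own splitter may work differently.

```python
def split_into_chunks(text, chunk_size=512, chunk_overlap=50):
    """Split text into windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = split_into_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij']
```

A larger overlap keeps more context across chunk boundaries at the cost of storing and embedding more chunks.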

### `rag.add_document(text, metadata=None)`

Adds a document to the index.

| Parameter | Type | Description |
|-----------|------|-------------|
| `text` | `str` | Document content. |
| `metadata` | `dict` | (optional) Additional metadata. |
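
One way to make use of `metadata` is to record where each document came from. The sketch below reads text files from a folder and pairs each with a small metadata dict. The `load_text_files` helper, the `docs` folder, and the `"source"` key are all hypothetical choices for this example.

```python
from pathlib import Path


def load_text_files(folder, pattern="*.txt"):
    """Yield (text, metadata) pairs for each matching file in folder."""
    for path in sorted(Path(folder).glob(pattern)):
        text = path.read_text(encoding="utf-8")
        yield text, {"source": path.name}


# Usage with the API described above (rag is an existing TurboRag instance):
# for text, meta in load_text_files("docs"):
#     rag.add_document(text, metadata=meta)
```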

### `rag.ask(question, k=5)`

Retrieves the `k` most relevant chunks and generates an answer from them.

| Parameter | Type | Description |
|-----------|------|-------------|
| `question` | `str` | User query. |
| `k` | `int` | Number of chunks to retrieve (default 5). |

Returns: `(answer, sources)` where `answer` is a string and `sources` is a list of chunk texts.
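
Since `ask` returns the answer together with the chunks it was based on, a small formatter can present both. This is a sketch with a hypothetical `format_answer` helper; it relies only on the return shape described above (a string and a list of chunk texts).

```python
def format_answer(answer, sources, max_chars=80):
    """Render an answer followed by a numbered list of shortened sources."""
    lines = [answer.strip(), "", "Sources:"]
    for i, chunk in enumerate(sources, start=1):
        snippet = " ".join(chunk.split())  # collapse newlines and extra spaces
        if len(snippet) > max_chars:
            snippet = snippet[: max_chars - 3] + "..."
        lines.append(f"[{i}] {snippet}")
    return "\n".join(lines)


# In real use: answer, sources = rag.ask("What is the capital of France?")
print(format_answer("Paris", ["Paris is the capital of France."]))
```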

### `rag.search(query, k=5)`

Performs a pure vector search without generation.

| Parameter | Type | Description |
|-----------|------|-------------|
| `query` | `str` | Search query. |
| `k` | `int` | Number of results. |

Returns: List of tuples `(chunk_text, score, metadata)`.
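
Because each result carries its metadata, search hits can be narrowed after retrieval. A minimal sketch, assuming the tuple shape described above and that `metadata` may be `None` when a document was added without it. The `filter_by_metadata` helper, the `"lang"` key, and the sample results (including their scores) are made up for illustration.

```python
def filter_by_metadata(results, key, value):
    """Keep only results whose metadata has key == value."""
    return [
        (text, score, meta)
        for text, score, meta in results
        if (meta or {}).get(key) == value
    ]


# Hand-written sample data in the shape returned by rag.search(...)
results = [
    ("Paris is the capital of France.", 0.91, {"lang": "en"}),
    ("Paris est la capitale de la France.", 0.88, {"lang": "fr"}),
    ("Berlin is the capital of Germany.", 0.42, None),
]
english = filter_by_metadata(results, "lang", "en")
print([text for text, _, _ in english])  # ['Paris is the capital of France.']
```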

---

## Advanced Usage

### Using a custom document store

```python
from turborag import TurboRag
from turborag.store import SQLiteDocStore

store = SQLiteDocStore("my_docs.db")
rag = TurboRag.create(
    embed_model="models/embeddinggemma-300m-q4_k_m.gguf",
    llm_model="models/qwen-0.5b-q4_k_m.gguf",
    doc_store=store,
)
```

### Batching documents

```python
docs = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Madrid is the capital of Spain.",
]
rag.add_documents(docs)  # list of strings
```

### Changing the LLM at runtime

```python
rag.set_llm_model("models/deepseek-1.3b-q4_k_m.gguf")
```

---

## Dependencies

- `turbovec` (quantized vector index)
- `llama-cpp-python` (GGUF inference)
- `numpy` (vector operations)
- `sqlite3` (part of the Python standard library, used for the document store)

---

## Troubleshooting

| Issue | Solution |
|-------|----------|
| `ImportError: cannot import name 'TurboRag'` | Check that `turborag-ahx47` is installed in the active environment (`pip show turborag-ahx47`) and that you import from `turborag`. |
| `OSError: Llama model not found` | Provide the correct absolute or relative path to the GGUF file. |
| `turbovec.IdMapIndex not found` | Reinstall `turbovec` with `pip install --upgrade turbovec`. |
| High RAM usage | Reduce `chunk_size` or use a smaller LLM. |

---

## License

This project is licensed under the MIT License.

---

## Links

- **PyPI package**: [turborag-ahx47](https://pypi.org/project/turborag-ahx47/)
- **Source code**: [GitHub](https://github.com/AHX47/turborag)
- **Report issues**: [Issue tracker](https://github.com/AHX47/turborag/issues)

---

## Acknowledgements

- [TurboVec](https://github.com/RyanCodrai/turbovec) – efficient quantized vector search
- [llama.cpp](https://github.com/ggerganov/llama.cpp) – GGUF model inference
- [Gemma embedding model](https://huggingface.co/sabafallah/embeddinggemma-300m-Q4_K_M-GGUF)

