Metadata-Version: 2.4
Name: rag-mcp
Version: 0.1.4
Summary: This is a local rag-mcp solution with chromadb using langchain and docling
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: accelerate>=1.6.0
Requires-Dist: langchain-chroma>=0.2.2
Requires-Dist: langchain-community>=0.3.21
Requires-Dist: langchain-docling>=0.2.0
Requires-Dist: langchain-ollama>=0.3.0
Requires-Dist: langchain-text-splitters>=0.3.8
Requires-Dist: mcp[cli]>=1.6.0
Requires-Dist: sentence-transformers>=4.0.2

# RAG MCP: Document Processing Server

A Retrieval-Augmented Generation (RAG) server built on the Model Context Protocol (MCP) for intelligent document processing and question answering.

## Overview

RAG MCP indexes documents in a range of formats and lets you run semantic searches against them. It splits each document into chunks, embeds the chunks with a HuggingFace BGE or Ollama model, and stores the vectors in a local Chroma database so the content can be searched with natural language queries.

## Features

- **Document Indexing**: Support for various document formats (PDF, DOCX, XLSX, PPTX, Markdown, AsciiDoc, HTML, XHTML, CSV)
- **Semantic Search**: Query your documents using natural language
- **Flexible Embedding Models**: Choose between HuggingFace BGE (default) or Ollama embeddings
- **Hardware Acceleration**: Automatic device selection (CUDA, MPS, or CPU) for HuggingFace embeddings
- **Persistent Storage**: Vector embeddings are stored locally for future use

## Requirements

- Python 3.11+
- An environment in which the server process can read the documents you want to index
- (Optional) Ollama installed and running if using Ollama embeddings.

## Installation

### 1. Install uv

First, install uv, a Python package installer and resolver:

#### On macOS/Linux:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

#### On Windows:
```powershell
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

### 2. Run RAG MCP

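Once uv is installed, you can run RAG MCP directly with `uvx` (set the required `PERSIST_DIRECTORY` variable first, as described under Environment Variables below):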

```bash
uvx rag-mcp
```

This will start the MCP server and make it available for document processing.

**Environment Variables:**

You can configure RAG MCP using environment variables:

-   `PERSIST_DIRECTORY` (Required): Path to the directory where the vector database will be stored (e.g., `/path/to/your/persist/directory`). A `chromadb` subfolder will be created here.
-   `USE_OLLAMA_EMBEDDING` (Optional): Set to `True` to use Ollama embeddings instead of the default HuggingFace BGE embeddings. Requires Ollama to be running. Defaults to `False`.
-   `OLLAMA_EMBEDDING_MODEL` (Optional): Specifies the Ollama model name to use for embeddings (e.g., `nomic-embed-text`). Only used if `USE_OLLAMA_EMBEDDING=True`. Defaults to `bge-m3:latest`.

Example:
```bash
PERSIST_DIRECTORY=/data/rag_db USE_OLLAMA_EMBEDDING=True OLLAMA_EMBEDDING_MODEL=nomic-embed-text uvx rag-mcp
```
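
The following is a minimal sketch of how these three variables could be read and validated in Python. It is not taken from the rag-mcp source: the `Settings` class and `load_settings` function are hypothetical names, and the case-insensitive handling of `True` is an assumption (the server itself may be stricter).

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    persist_directory: Path  # includes the `chromadb` subfolder
    use_ollama: bool
    ollama_model: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables and fail early if the required one is missing."""
    persist = env.get("PERSIST_DIRECTORY")
    if not persist:
        raise ValueError("PERSIST_DIRECTORY is required")
    # Assumption: any casing of "true" enables Ollama embeddings.
    use_ollama = env.get("USE_OLLAMA_EMBEDDING", "False").strip().lower() == "true"
    # Documented default model when none is specified.
    model = env.get("OLLAMA_EMBEDDING_MODEL", "bge-m3:latest")
    return Settings(Path(persist) / "chromadb", use_ollama, model)


# Mirrors the shell example above, passed as a plain mapping.
example = load_settings({
    "PERSIST_DIRECTORY": "/data/rag_db",
    "USE_OLLAMA_EMBEDDING": "True",
    "OLLAMA_EMBEDDING_MODEL": "nomic-embed-text",
})
print(example)
```

Passing the environment as a mapping, rather than reading `os.environ` inside the function body, keeps the parsing easy to exercise without touching the real process environment.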

## IDE Integration

### VS Code Integration

To integrate with Visual Studio Code, create an `mcp.json` file (for a workspace-level setup, `.vscode/mcp.json`) with the following content:

```jsonc
{
    "servers": {
        "rag-mcp-server": {
            "type": "stdio",
            "command": "uvx",
            "args": [
                "rag-mcp"
            ],
            "env": {
                "PERSIST_DIRECTORY": "/path/to/your/persist/directory",
                // Optional: Uncomment and set to true to use Ollama
                // "USE_OLLAMA_EMBEDDING": "True",
                // Optional: Specify Ollama model if using Ollama
                // "OLLAMA_EMBEDDING_MODEL": "nomic-embed-text"
            }
        }
    }
}
```

Replace `/path/to/your/persist/directory` with the desired path. Adjust Ollama settings as needed.

### Cursor Integration

To integrate with Cursor, go to Cursor Settings > MCP and paste this configuration:

```json
{
    "mcpServers": {
        "rag-mcp": {
            "command": "uvx",
            "args": ["rag-mcp"],
            "env": {
                "PERSIST_DIRECTORY": "/path/to/your/persist/directory"
            }
        }
    }
}
```

Replace `/path/to/your/persist/directory` with the desired path. To use Ollama embeddings, add `"USE_OLLAMA_EMBEDDING": "True"` to `env`, and optionally `"OLLAMA_EMBEDDING_MODEL": "nomic-embed-text"` to choose a model.

## Usage

<!-- ... existing usage instructions ... -->

## Supported Embedding Models

The system supports the following embedding models:

1.  **HuggingFace BGE Embeddings** (default): High-quality embeddings that work offline. Optimized for available hardware (CUDA, MPS, CPU).
    *   Uses `BAAI/bge-m3` model.
2.  **Ollama Embeddings** (optional): Uses embeddings generated by a running Ollama instance. Enable by setting `USE_OLLAMA_EMBEDDING=True`.
    *   Default model: `bge-m3:latest`.
    *   Specify a different model using `OLLAMA_EMBEDDING_MODEL`.
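
The automatic device selection mentioned for the HuggingFace backend can be approximated as follows. This is a sketch under assumptions, not the project's actual code: `pick_device` is a hypothetical helper, and the CUDA, then MPS, then CPU preference order is inferred from the order in which this README lists the devices.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu", preferring an accelerator when PyTorch reports one."""
    try:
        import torch  # installed alongside sentence-transformers
    except ImportError:
        # Without PyTorch there is nothing to probe, so fall back to CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds may lack the MPS backend module entirely.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```

The returned string is the kind of value that LangChain's HuggingFace embedding classes accept through `model_kwargs={"device": ...}`.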

## How It Works

1.  **Document Loading**: Uses DoclingLoader to parse various document formats
2.  **Text Splitting**: Documents are split into manageable chunks using RecursiveCharacterTextSplitter
3.  **Embedding Generation**: Text chunks are converted to vector embeddings using either HuggingFace BGE or Ollama.
4.  **Storage**: Embeddings are stored in a Chroma vector database
5.  **Retrieval**: When queried, the system embeds the query and returns the most semantically similar chunks to help answer the question
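
The five steps above can be illustrated with a deliberately simplified, standard-library-only sketch. Everything in it is a stand-in: paragraph splitting replaces RecursiveCharacterTextSplitter, word counts replace BGE or Ollama embeddings, and an in-memory list replaces Chroma. The names `split_text`, `embed`, `cosine`, and `ToyStore` are hypothetical and exist only to show how the stages fit together.

```python
import math
import re
from collections import Counter


def split_text(text: str) -> list[str]:
    """Stand-in splitter: one chunk per blank-line-separated paragraph."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real model returns dense vectors)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    def add(self, chunks: list[str]) -> None:
        # Store each chunk next to its embedding.
        self.rows.extend((chunk, embed(chunk)) for chunk in chunks)

    def search(self, query: str, k: int = 1) -> list[str]:
        # Embed the query, then rank stored chunks by similarity to it.
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


document = "Chroma stores the vectors on disk.\n\nDocling parses PDF and DOCX files."
store = ToyStore()
store.add(split_text(document))
top = store.search("Which component parses PDF files?")
print(top)
```

In the real pipeline the ranking relies on semantic similarity rather than shared words, which is what allows a query to match content that uses different vocabulary.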

## Acknowledgments

- This project uses [LangChain](https://www.langchain.com/) and [Docling](https://github.com/docling-project/docling) for intelligent document parsing
- Vector storage provided by [Chroma](https://www.trychroma.com/)
- Embedding models from [HuggingFace](https://huggingface.co/) and, optionally, [Ollama](https://ollama.com/)
- Built on the [Model Context Protocol](https://modelcontextprotocol.io/) (MCP)
