Metadata-Version: 2.3
Name: pprag
Version: 0.1.2
Summary: Proxy-Pointer RAG suite for text, multimodal, and cross-document comparison workflows
Author: Partha Sarkar
Author-email: Partha Sarkar <partha.sarkarx@gmail.com>
Requires-Dist: google-generativeai ; extra == 'compare'
Requires-Dist: langchain-community ; extra == 'compare'
Requires-Dist: langchain-core ; extra == 'compare'
Requires-Dist: langchain-text-splitters ; extra == 'compare'
Requires-Dist: faiss-cpu ; extra == 'compare'
Requires-Dist: python-dotenv ; extra == 'compare'
Requires-Dist: streamlit ; extra == 'compare'
Requires-Dist: llama-cloud ; extra == 'compare'
Requires-Dist: google-generativeai ; extra == 'full'
Requires-Dist: langchain-community ; extra == 'full'
Requires-Dist: langchain-core ; extra == 'full'
Requires-Dist: langchain-text-splitters ; extra == 'full'
Requires-Dist: faiss-cpu ; extra == 'full'
Requires-Dist: python-dotenv ; extra == 'full'
Requires-Dist: pandas ; extra == 'full'
Requires-Dist: openpyxl ; extra == 'full'
Requires-Dist: llama-cloud ; extra == 'full'
Requires-Dist: streamlit ; extra == 'full'
Requires-Dist: pillow ; extra == 'full'
Requires-Dist: numpy ; extra == 'full'
Requires-Dist: markdown ; extra == 'full'
Requires-Dist: pdfservices-sdk ; extra == 'full'
Requires-Dist: streamlit ; extra == 'multimodal'
Requires-Dist: google-generativeai ; extra == 'multimodal'
Requires-Dist: pillow ; extra == 'multimodal'
Requires-Dist: python-dotenv ; extra == 'multimodal'
Requires-Dist: faiss-cpu ; extra == 'multimodal'
Requires-Dist: numpy ; extra == 'multimodal'
Requires-Dist: markdown ; extra == 'multimodal'
Requires-Dist: langchain-community ; extra == 'multimodal'
Requires-Dist: langchain-core ; extra == 'multimodal'
Requires-Dist: langchain-text-splitters ; extra == 'multimodal'
Requires-Dist: pdfservices-sdk ; extra == 'multimodal'
Requires-Dist: google-generativeai ; extra == 'text'
Requires-Dist: langchain-community ; extra == 'text'
Requires-Dist: langchain-core ; extra == 'text'
Requires-Dist: langchain-text-splitters ; extra == 'text'
Requires-Dist: faiss-cpu ; extra == 'text'
Requires-Dist: python-dotenv ; extra == 'text'
Requires-Dist: pandas ; extra == 'text'
Requires-Dist: openpyxl ; extra == 'text'
Requires-Dist: llama-cloud ; extra == 'text'
Requires-Python: >=3.10
Provides-Extra: compare
Provides-Extra: full
Provides-Extra: multimodal
Provides-Extra: text
Description-Content-Type: text/markdown

<p align="center">
  <img src="assets/banner.png" alt="Proxy-Pointer Banner" width="100%">
</p>

# Proxy-Pointer Suite -- Text, Multimodal RAG, and Cross-Document Comparison 🔍

**Structural RAG for Complex Documents** — A high-fidelity retrieval pipeline that uses document hierarchy as the primary retrieval anchor, eliminating "hallucination by chunking." Proxy-Pointer indexes **structural pointers** (breadcrumbs like `Paper > Section > Sub-section`) rather than raw text fragments, ensuring the LLM always understands exactly where it is in a document.

**Retrieve precise text, get grounded visual citations, or perform agentic section-by-section document comparisons.**
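
To make the idea of a structural pointer concrete, here is a minimal sketch that derives breadcrumbs from Markdown headings. It is an illustration only, not the package's tree builder: the `breadcrumbs` function, the `root` argument, and the heading regex are all assumptions introduced for this example.

```python
import re

# ATX headings only ("# Title" through "###### Title"); an assumption of this sketch.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # fenced code blocks are skipped so "# comments" are not read as headings


def breadcrumbs(markdown: str, root: str) -> list[str]:
    """Return one 'Root > Section > Sub-section' pointer per heading."""
    stack: list[tuple[int, str]] = []  # (level, title) of the currently open ancestors
    pointers: list[str] = []
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match is None:
            continue
        level, title = len(match.group(1)), match.group(2)
        # Close every heading at the same or a deeper level before opening this one.
        while stack and stack[-1][0] >= level:
            stack.pop()
        stack.append((level, title))
        pointers.append(" > ".join([root] + [t for _, t in stack]))
    return pointers


if __name__ == "__main__":
    doc = "# Method\nSome text.\n## Training\n## Evaluation\n# Results\n"
    for pointer in breadcrumbs(doc, root="Paper"):
        print(pointer)
```

Each pointer names the full path to a section, which is the kind of key a structure-first index can attach to content in place of an anonymous text fragment.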

---

## Three Implementations, One Architecture

| Feature              | [Text-Only](./Text-Only)                 | [MultiModal](./MultiModal)                                                       | [DocComparator](./DocComparator)                 |
| :------------------- | :------------------------------------ | :---------------------------------------------------------------------------- | :-------------------------------------------- |
| **Core Goal**  | Maximum precision for text-based RAG  | Unified reasoning across text & visuals                                       | Agentic Cross-Document Comparison             |
| **Input**      | Structured Markdown (LlamaParse)      | Markdown + Figures/Tables (Adobe Extract)                                     | PDF or MD (Mixed format supported)            |
| **Output**     | Text-based answers                    | Text + **AI-Verified Visual Evidence** 🖼️                                    | Side-by-side analytical reports               |
| **LLM**        | Gemini 3.1 Flash-Lite                 | Gemini 3.1 Flash-Lite                                                         | Gemini 3 Flash                                |
| **Embeddings** | gemini-embedding-001 (1536d)          | gemini-embedding-001 (1536d)                                                  | gemini-embedding-001 (1536d)                  |
| **Vision**     | —                                    | ✅ Gemini 3.1 Flash-Lite                                                      | —                                            |
| **Retrieval**  | Structural re-ranking (k=5)           | Anchor-aware re-ranking + image selection                                     | Multi-Stage Proxy-Pointer retrieval           |
| **Benchmark**  | 100% on FinanceBench                  | 96% across 20-query, 5-paper suite                                            | N/A (Dynamic Agentic Evaluation)              |
| **Use Case**   | 10-K Financials, Legal, Documentation | Anything with Images, Diagrams, Charts                                        | Credit Agreements, Contracts, Research Papers |
| **Interface**  | CLI / Python API                      | Streamlit UI with visual citations                                            | Streamlit UI with markdown export             |

---

## How It Works

```mermaid
graph TD
    A[Documents] -->|PDF Extraction| B[Markdown]
    B -->|Tree Builder| C[Structure Trees]
    C -->|Noise Filter| D[Clean Nodes]
    D -->|Embed + Index| E[FAISS]
    E -->|"Query, Dedup, Re-Rank"| F[Top Sections]
    F -->|Synthesize + Cite| G[Grounded Answer]
```

1. **Structure trees** map every section, sub-section, figure, and table in a document
2. **Noise filtering** removes TOC, glossaries, and boilerplate using an LLM
3. **Broad vector recall** (k=200) retrieves candidates, then **LLM re-ranking** selects the best structural matches
4. **Full section loading** gives the synthesizer complete context — not truncated chunks
5. *(MultiModal only)* **Anchor-aware retrieval** surfaces figures/tables physically linked to retrieved sections
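
The recall, dedup, re-rank, and full-section steps above can be sketched in plain Python. This is a toy model under stated assumptions, not the package's implementation: the `Chunk` class, `cosine`, and `retrieve_sections` are hypothetical names, the vectors stand in for real embeddings, a sorted list stands in for the FAISS index, and the `rerank` callable stands in for the LLM re-ranker. The defaults `recall_k=200` and `final_k=5` mirror the values quoted in this README.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Chunk:
    pointer: str                 # breadcrumb of the section this chunk belongs to
    vector: tuple[float, ...]    # toy embedding (a real pipeline would use a model)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_sections(
    query_vec: Sequence[float],
    chunks: Sequence[Chunk],
    sections: dict[str, str],
    rerank: Callable[[list[str]], list[str]],
    recall_k: int = 200,
    final_k: int = 5,
) -> list[tuple[str, str]]:
    # 1. Broad recall: take the recall_k chunks most similar to the query.
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c.vector), reverse=True)
    candidates = ranked[:recall_k]
    # 2. Dedup: keep each section pointer once, preserving recall order.
    pointers = list(dict.fromkeys(c.pointer for c in candidates))
    # 3. Re-rank the pointers (an LLM in the described pipeline, a callable here).
    best = rerank(pointers)[:final_k]
    # 4. Load the full section text for each selected pointer.
    return [(p, sections[p]) for p in best]


if __name__ == "__main__":
    sections = {"Doc > A": "full A", "Doc > B": "full B", "Doc > C": "full C"}
    chunks = [
        Chunk("Doc > A", (1.0, 0.0)),
        Chunk("Doc > A", (0.9, 0.1)),
        Chunk("Doc > B", (0.0, 1.0)),
        Chunk("Doc > C", (0.7, 0.7)),
    ]
    print(retrieve_sections((1.0, 0.0), chunks, sections, rerank=lambda ps: ps, recall_k=3))
```

Deduplicating by pointer before re-ranking means the re-ranker sees each section once, no matter how many of its chunks were recalled, and the synthesizer receives whole sections instead of fragments.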

---

## Which One Should I Use?

**[Text-Only](./Text-Only)**: Best when your documents are purely text-based and the hierarchy (e.g., `Signatory > Item 1A > Risk Factors`) is the only context needed. It scored 100% on FinanceBench, which is built from financial filings such as 10-Ks.

**[MultiModal](./MultiModal)**: Best when your documents contain diagrams, charts, and tables that are essential to the answer. It uses anchor-aware retrieval to surface the exact images tied to a technical discussion, and scored 96% on a 20-query suite across 5 research papers (CLIP, GaLore, NemoBot, VectorFusion, VectorPainter).

**[DocComparator](./DocComparator)**: Best when you need deep, section-by-section comparisons between two complex documents. It uses agentic RAG and targeted personas (such as Senior Legal Counsel) to untangle legal trade-offs and methodological differences that surface-level keyword matching misses.

---

## Architecture Deep Dive

For the full technical story behind the architecture:

1. [Proxy-Pointer Framework for Structure-Aware Enterprise Document Intelligence](https://towardsdatascience.com/proxy-pointer-framework-for-structure-aware-enterprise-document-intelligence/) — Hierarchical understanding and comparison of contracts, research papers, and more
2. [Proxy-Pointer RAG: Multimodal Answers Without Multimodal Embeddings](https://towardsdatascience.com/proxy-pointer-rag-multimodal-answers-without-multimodal-embeddings/) — Structure is all you need
3. [Proxy-Pointer RAG: Structure Meets Scale — 100% Accuracy with Smarter Retrieval](https://towardsdatascience.com/proxy-pointer-rag-structure-meets-scale-100-accuracy-with-smarter-retrieval/) — Scaling to multi-document, LLM re-ranking, and benchmark results
4. [Proxy-Pointer RAG: Achieving Vectorless Accuracy at Vector RAG Scale and Cost](https://towardsdatascience.com/proxy-pointer-rag-achieving-vectorless-accuracy-at-vector-rag-scale-and-cost/) — Core architecture & the pointer-based retrieval idea

---

## 5-Minute Quickstart

> **Important Note for PyPI Users:** While installing via PyPI (`pip install pprag`) gives you the CLI and code, the application relies on specific local folder structures (like `data/`) and environment variable templates. We strongly recommend cloning the repository first to get the necessary `.env.example` templates and sample data folders for each workflow.

### 1. Clone

```bash
git clone https://github.com/Proxy-Pointer/Proxy-Pointer-RAG.git
cd Proxy-Pointer-RAG
```

### 2. Create Virtual Environment & Install Dependencies

We strongly recommend creating a virtual environment first:

```bash
python -m venv venv
# Then activate it with the command for your platform:
#   Windows:     venv\Scripts\activate
#   macOS/Linux: source venv/bin/activate
```

You can then install dependencies using standard `pip` or using `uv` (recommended for developers).

#### Option A: Standard pip
Install the package and your desired modality:

```bash
pip install pprag                 # minimal CLI shell
pip install "pprag[text]"         # text-only structural RAG
pip install "pprag[multimodal]"   # multimodal RAG with visual citations
pip install "pprag[compare]"      # cross-document comparison
pip install "pprag[full]"         # all modalities
```
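
If you are unsure whether an extra installed everything it needs, a quick import check can help. The sketch below is an assumption-laden helper, not part of `pprag`: `missing_distributions` and the `IMPORT_NAMES` table are hypothetical, and the table only covers a few of the distributions named in the package metadata whose import names differ from (or match) their PyPI names.

```python
from importlib.util import find_spec

# Distribution name -> top-level import name, for a few dependencies of the extras.
IMPORT_NAMES = {
    "faiss-cpu": "faiss",
    "python-dotenv": "dotenv",
    "pillow": "PIL",
    "streamlit": "streamlit",
    "numpy": "numpy",
    "pandas": "pandas",
}


def missing_distributions(distributions, import_names=IMPORT_NAMES):
    """Return the distributions whose top-level module cannot be found."""
    missing = []
    for dist in distributions:
        # Fall back to the usual dash-to-underscore convention for unknown names.
        module = import_names.get(dist, dist.replace("-", "_"))
        if find_spec(module) is None:
            missing.append(dist)
    return missing


if __name__ == "__main__":
    print(missing_distributions(["faiss-cpu", "python-dotenv", "numpy"]))
```

An empty list means every listed module is importable in the active environment; anything else names the distributions to reinstall.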

#### Option B: For Developers (using uv)
If you want to tinker with the code, this project uses [`uv`](https://docs.astral.sh/uv/) for lightning-fast dependency management.

```bash
pip install uv
uv sync --all-extras
# Remember to prefix commands with `uv run` if you use this method!
```

### 3. Configure API keys

Navigate into the folder for the modality you want to run (e.g., `Text-Only`, `MultiModal`, or `DocComparator`), copy the template, and add your API keys:

```bash
cd Text-Only
cp .env.example .env
# Edit .env → add your GOOGLE_API_KEY
# Note: Also review other commented variables, especially the FAISS trust settings required for local index loading!
```
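
Before building an index, it can be worth confirming that the key is actually visible to Python. The helper below is a minimal sketch using only the standard library; `missing_keys` is a hypothetical name, and it assumes the variables from `.env` have already been loaded into the process environment (for example by `python-dotenv`, which the extras list as a dependency) or exported in your shell.

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys(["GOOGLE_API_KEY"])
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Treating a blank value as missing catches the common case of copying `.env.example` without filling it in.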

### 4. Build the index

Build the FAISS index from scratch for your chosen modality:

```bash
# Prefix with `uv run` if you installed via Option B
pprag text index --fresh
# or `pprag multimodal index --fresh`
```

### 5. Start querying / Serve

Launch the CLI or Web UI:

```bash
# Prefix with `uv run` if you installed via Option B
pprag text ask
# or `pprag multimodal serve`
# or `pprag compare serve`
```

Each implementation also has its own self-contained README with a detailed quickstart:

- **[Text-Only → Get Started](./Text-Only/README.md)**
- **[MultiModal → Get Started](./MultiModal/README.md)**
- **[DocComparator → Get Started](./DocComparator/README.md)**

All include sample data so you can clone, build the index, and start exploring immediately.

---

## Author
**Partha Sarkar**

## Contact

- **GitHub Issues**: For bug reports
- **General Questions**: Reach out on [LinkedIn](https://www.linkedin.com/in/partha-sarkar-lets-talk-ai) or [Email](mailto:partha.sarkarx@gmail.com)

---

## License

© 2026 Partha Sarkar (Proxy-Pointer). Licensed under [MIT](LICENSE).
