Metadata-Version: 2.4
Name: deeplightrag
Version: 1.0.0
Summary: DeepLightRAG: High-performance Document Indexing and Retrieval System (use with any LLM)
Author-email: Phuong Nguyen <nhphuong.code@gmail.com>
Maintainer-email: Phuong Nguyen <nhphuong.code@gmail.com>
License: MIT
Project-URL: Homepage, https://github.com/png261/DeepLightRag
Project-URL: Repository, https://github.com/png261/DeepLightRag
Project-URL: Bug Tracker, https://github.com/png261/DeepLightRag/issues
Project-URL: Changelog, https://github.com/png261/DeepLightRag/releases
Keywords: rag,retrieval,augmented,generation,ocr,vision,graph,nlp,llm,deepseek,document-processing
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Text Processing
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.24.0
Requires-Dist: networkx>=3.0
Requires-Dist: Pillow>=10.0.0
Requires-Dist: PyYAML>=6.0
Requires-Dist: tqdm>=4.65.0
Requires-Dist: typing-extensions>=4.0.0; python_version < "3.10"
Requires-Dist: pdf2image>=1.16.0
Requires-Dist: PyMuPDF>=1.23.0
Requires-Dist: easyocr>=1.7.0
Requires-Dist: torch>=2.0.0
Requires-Dist: torchvision>=0.15.0
Requires-Dist: transformers>=4.40.0
Requires-Dist: accelerate>=0.24.0
Requires-Dist: sentence-transformers>=2.2.0
Requires-Dist: gliner>=0.1.12
Requires-Dist: faiss-cpu>=1.7.4
Provides-Extra: gpu
Requires-Dist: bitsandbytes>=0.41.0; extra == "gpu"
Provides-Extra: macos
Requires-Dist: mlx>=0.21.0; extra == "macos"
Requires-Dist: mlx-lm>=0.19.0; extra == "macos"
Provides-Extra: llm
Requires-Dist: google-generativeai>=0.3.0; extra == "llm"
Requires-Dist: openai>=1.0.0; extra == "llm"
Requires-Dist: anthropic>=0.25.0; extra == "llm"
Provides-Extra: advanced-re
Requires-Dist: opennre>=1.1.0; extra == "advanced-re"
Provides-Extra: web
Requires-Dist: streamlit>=1.30.0; extra == "web"
Requires-Dist: plotly>=5.18.0; extra == "web"
Requires-Dist: pandas>=2.0.0; extra == "web"
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: pytest-xdist>=3.3.0; extra == "dev"
Requires-Dist: pytest-timeout>=2.1.0; extra == "dev"
Requires-Dist: pytest-mock>=3.11.0; extra == "dev"
Requires-Dist: black>=23.9.0; extra == "dev"
Requires-Dist: ruff>=0.0.290; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"
Requires-Dist: pre-commit>=3.3.0; extra == "dev"
Requires-Dist: build>=0.10.0; extra == "dev"
Requires-Dist: twine>=4.0.0; extra == "dev"
Requires-Dist: wheel>=0.41.0; extra == "dev"
Provides-Extra: docs
Requires-Dist: sphinx>=7.1.0; extra == "docs"
Requires-Dist: sphinx-rtd-theme>=1.3.0; extra == "docs"
Requires-Dist: sphinx-autodoc-typehints>=1.24.0; extra == "docs"
Requires-Dist: myst-parser>=1.0.0; extra == "docs"
Provides-Extra: all
Requires-Dist: deeplightrag[advanced-re,dev,docs,gpu,llm,web]; extra == "all"
Dynamic: license-file

# DeepLightRAG

DeepLightRAG is a high-performance document indexing and retrieval system designed to work with any Large Language Model (LLM). It builds a dual-layer graph (a Visual-Spatial layer and an Entity-Relationship layer) so that retrieval is both context-aware and visually grounded.

## Features

- **Dual-Layer Graph**: Combines visual layout awareness with semantic entity relationships.
- **Visually Grounded Retrieval**: Retrieves not only text but also visual regions and their spatial context.
- **Robust OCR**: Uses DeepSeek-OCR for text extraction, with EasyOCR as a fallback.
- **Advanced NER**: Uses GLiNER for zero-shot entity recognition.
- **Flexible LLM Support**: Compatible with OpenAI, Google Gemini, Anthropic, and local LLMs via MLX/Ollama.

## Installation

```bash
pip install deeplightrag
```
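
The package metadata also declares optional extras (`gpu`, `macos`, `llm`, `advanced-re`, `web`, `dev`, `docs`, and `all`, which bundles every extra except `macos`). These install with the usual pip syntax, for example `pip install "deeplightrag[llm]"`. The sketch below is one way to check which extras look usable in the current environment. The extra names come from the metadata, but the import names in `EXTRA_MODULES` and the helper functions are illustrative assumptions, not part of DeepLightRAG.

```python
from importlib.util import find_spec

# Extra name -> modules its dependencies are ASSUMED to provide.
# Extra names come from the package metadata; import names are assumptions.
EXTRA_MODULES = {
    "gpu": ["bitsandbytes"],
    "macos": ["mlx", "mlx_lm"],
    "llm": ["google.generativeai", "openai", "anthropic"],
    "advanced-re": ["opennre"],
    "web": ["streamlit", "plotly", "pandas"],
}


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the module in this environment."""
    try:
        return find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises if a parent package (e.g. "google") is missing.
        return False


def missing_modules(extra: str) -> list[str]:
    """List the modules of an extra that cannot be located."""
    return [name for name in EXTRA_MODULES[extra] if not is_importable(name)]


# Print a short availability report, one line per extra.
for extra in EXTRA_MODULES:
    missing = missing_modules(extra)
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print(f"{extra}: {status}")
```

An extra reported as missing can be added later by re-running `pip install` with that extra in brackets.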

## Usage

Index a document:
```bash
deeplightrag index document.pdf
```

Query the index:
```bash
deeplightrag query "What is the main topic?"
```
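
To drive the two commands above from a script, a thin `subprocess` wrapper is usually enough. This is a minimal sketch: it uses only the `index` and `query` subcommands shown above, the helper names are hypothetical, and it assumes the CLI writes its result to standard output.

```python
import shutil
import subprocess

CLI = "deeplightrag"  # console command shown in the usage examples


def index_command(document_path: str) -> list[str]:
    """Build the argument list for `deeplightrag index <document>`."""
    return [CLI, "index", document_path]


def query_command(question: str) -> list[str]:
    """Build the argument list for `deeplightrag query "<question>"`."""
    return [CLI, "query", question]


def run(argv: list[str]) -> str:
    """Run a command and return its stdout (assumed to hold the result)."""
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"{argv[0]!r} is not on PATH; is the package installed?")
    # check=True raises CalledProcessError on a non-zero exit status.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# Example (needs the package installed and a real PDF):
#   run(index_command("document.pdf"))
#   print(run(query_command("What is the main topic?")))
```

Passing the arguments as a list avoids shell quoting problems when a question contains quotes or spaces.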

## License

Released under the MIT License. See the `LICENSE` file for details.
