Metadata-Version: 2.3
Name: nexrag
Version: 0.1.0
Summary: Framework-agnostic RAG pipeline SDK. Plug in any component, swap any stage, configure everything in YAML
Keywords: rag,retrieval-augmented-generation,llm,vector-database,embeddings,ai,nlp,pipeline,sdk
Author: KevinRawal
Author-email: KevinRawal <kevinrawal30@gmail.com>
License: TBD
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Dist: pydantic>=2.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: nexrag[all-providers] ; extra == 'all'
Requires-Dist: nexrag[all-loaders] ; extra == 'all'
Requires-Dist: nexrag[pdf] ; extra == 'all-loaders'
Requires-Dist: nexrag[word] ; extra == 'all-loaders'
Requires-Dist: nexrag[excel] ; extra == 'all-loaders'
Requires-Dist: nexrag[html] ; extra == 'all-loaders'
Requires-Dist: nexrag[openai] ; extra == 'all-providers'
Requires-Dist: nexrag[anthropic] ; extra == 'all-providers'
Requires-Dist: nexrag[ollama] ; extra == 'all-providers'
Requires-Dist: nexrag[chromadb] ; extra == 'all-providers'
Requires-Dist: nexrag[huggingface] ; extra == 'all-providers'
Requires-Dist: anthropic>=0.20 ; extra == 'anthropic'
Requires-Dist: chromadb>=0.5 ; extra == 'chromadb'
Requires-Dist: pytest>=8.0 ; extra == 'dev'
Requires-Dist: pytest-cov>=5.0 ; extra == 'dev'
Requires-Dist: ruff>=0.4 ; extra == 'dev'
Requires-Dist: mypy>=1.10 ; extra == 'dev'
Requires-Dist: pre-commit>=3.7 ; extra == 'dev'
Requires-Dist: types-pyyaml ; extra == 'dev'
Requires-Dist: openpyxl>=3.1 ; extra == 'excel'
Requires-Dist: beautifulsoup4>=4.12 ; extra == 'html'
Requires-Dist: lxml>=5.0 ; extra == 'html'
Requires-Dist: sentence-transformers>=2.0 ; extra == 'huggingface'
Requires-Dist: ollama>=0.1 ; extra == 'ollama'
Requires-Dist: openai>=1.0 ; extra == 'openai'
Requires-Dist: pypdf>=4.0 ; extra == 'pdf'
Requires-Dist: python-docx>=1.0 ; extra == 'word'
Requires-Python: >=3.12
Project-URL: Homepage, https://github.com/kevinrawal/nexrag
Project-URL: Repository, https://github.com/kevinrawal/nexrag
Project-URL: Issues, https://github.com/kevinrawal/nexrag/issues
Project-URL: Changelog, https://github.com/kevinrawal/nexrag/blob/main/CHANGELOG.md
Provides-Extra: all
Provides-Extra: all-loaders
Provides-Extra: all-providers
Provides-Extra: anthropic
Provides-Extra: chromadb
Provides-Extra: dev
Provides-Extra: excel
Provides-Extra: html
Provides-Extra: huggingface
Provides-Extra: ollama
Provides-Extra: openai
Provides-Extra: pdf
Provides-Extra: word
Description-Content-Type: text/markdown

# NexRAG

> Framework-agnostic RAG pipeline SDK. Plug in any component, swap any stage, configure everything in YAML.

[![PyPI version](https://img.shields.io/pypi/v/nexrag.svg)](https://pypi.org/project/nexrag/)
[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
![License](https://img.shields.io/badge/license-TBD-lightgrey.svg)

---

## What is NexRAG?

NexRAG is a RAG (Retrieval-Augmented Generation) pipeline SDK for Python. It is designed for production use, but is currently pre-alpha.

**NexRAG owns the pipeline shape. You own the components.**

Every stage (loading, chunking, embedding, retrieval, generation) is defined by an interface, and NexRAG ships a default implementation for each. To swap a stage, implement its interface and declare your implementation in YAML. No framework lock-in. No magic. No hidden behavior.
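
As a rough illustration of what "implementing the interface" could look like, here is a minimal sketch using a `typing.Protocol`. The `Chunker` contract, its `chunk` method, and `ParagraphChunker` are hypothetical names chosen for this example. They are not NexRAG's published API, which may differ.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Chunker(Protocol):
    """Hypothetical stage contract: split text into chunks."""

    def chunk(self, text: str) -> list[str]: ...


class ParagraphChunker:
    """Custom implementation: one chunk per non-empty paragraph."""

    def chunk(self, text: str) -> list[str]:
        return [p.strip() for p in text.split("\n\n") if p.strip()]


def run_chunking(chunker: Chunker, text: str) -> list[str]:
    # The caller depends only on the contract, not on a concrete class.
    return chunker.chunk(text)


chunks = run_chunking(ParagraphChunker(), "First clause.\n\nSecond clause.\n")
print(chunks)
```

Because the contract is structural, `ParagraphChunker` does not need to inherit from anything. Any object with a matching `chunk` method satisfies it.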

---

## Quickstart

> **Note:** NexRAG v1.0 is under active development. The example below shows the planned API and may change. This section will be updated on first release.

```python
from nexrag import NexRAG

pipeline = NexRAG.from_config("nexrag.yaml")

# Ingest documents
pipeline.ingest("docs/contracts/")

# Query
result = pipeline.query("What are the termination clauses?")
print(result.answer)
print(result.source_chunks)
```

```yaml
# nexrag.yaml
nexrag:
  version: "1.0"

ingestion:
  chunker:
    strategy: recursive
    chunk_size: 512
  embedder:
    provider: openai
    model: text-embedding-3-small
  vector_db:
    provider: chroma
    default_collection: contracts

query:
  llm:
    provider: openai
    model: gpt-4o
```
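
To make the shape of this config concrete, the sketch below turns an already-parsed config into typed, validated objects. It is only an illustration: the dataclasses `ChunkerConfig` and `ProviderConfig` are hypothetical, the `raw` dict stands in for what a YAML parser would return, and NexRAG's own schema and loader may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkerConfig:
    strategy: str
    chunk_size: int

    def __post_init__(self) -> None:
        # Fail early on values that cannot work.
        if self.chunk_size <= 0:
            raise ValueError("chunk_size must be a positive integer")


@dataclass(frozen=True)
class ProviderConfig:
    provider: str
    model: str


# Stand-in for the dict a YAML parser would return for nexrag.yaml.
raw = {
    "ingestion": {
        "chunker": {"strategy": "recursive", "chunk_size": 512},
        "embedder": {"provider": "openai", "model": "text-embedding-3-small"},
    },
    "query": {"llm": {"provider": "openai", "model": "gpt-4o"}},
}

chunker_cfg = ChunkerConfig(**raw["ingestion"]["chunker"])
embedder_cfg = ProviderConfig(**raw["ingestion"]["embedder"])
llm_cfg = ProviderConfig(**raw["query"]["llm"])
print(chunker_cfg, embedder_cfg, llm_cfg)
```

Unknown keys raise a `TypeError` at construction time, so a typo in the config surfaces immediately rather than being silently ignored.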

---

## Installation

```bash
# Core only
pip install nexrag

# With OpenAI support
pip install "nexrag[openai]"

# With everything
pip install "nexrag[all]"
```
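
Since provider and loader dependencies are optional extras, it can help to check which ones are present in the current environment. The snippet below is a small sketch using only the standard library. The helper `is_installed` and the `EXTRA_MODULES` mapping are written for this example and are not part of NexRAG.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return find_spec(module_name) is not None


# Illustrative subset: extra name -> top-level module that extra installs.
EXTRA_MODULES = {"openai": "openai", "anthropic": "anthropic", "pdf": "pypdf"}

for extra, module in EXTRA_MODULES.items():
    if is_installed(module):
        print(f"{extra}: available")
    else:
        print(f'{extra}: missing, try: pip install "nexrag[{extra}]"')
```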

---

## Design Principles

| Principle | What it means |
|---|---|
| Interface-first | Every stage is a contract. Implementation is secondary. |
| Config-driven | YAML configures the pipeline. Code defines the logic. |
| Zero lock-in | Core depends only on pydantic and PyYAML. It has no dependency on LangChain, LlamaIndex, or any AI SDK. |
| Explicit over implicit | No hidden defaults. Every behavior is declared or documented. |
| Extensible by design | New components plug in without touching core. |
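
One common way to get "plug in without touching core" is a name-to-factory registry that config values are resolved against. The sketch below shows that pattern under assumptions: `register_embedder`, `build_embedder`, and `CharCountEmbedder` are hypothetical names for illustration, and NexRAG's actual extension mechanism may work differently.

```python
from typing import Callable

# Hypothetical registry: maps a provider name (as it might appear in YAML)
# to a factory that builds the component.
_EMBEDDERS: dict[str, Callable[..., object]] = {}


def register_embedder(name: str):
    """Class decorator that registers an embedder under `name`."""
    def decorator(cls):
        if name in _EMBEDDERS:
            raise ValueError(f"embedder {name!r} is already registered")
        _EMBEDDERS[name] = cls
        return cls
    return decorator


def build_embedder(name: str, **kwargs):
    """Look up a registered embedder and construct it with `kwargs`."""
    try:
        factory = _EMBEDDERS[name]
    except KeyError:
        raise KeyError(f"unknown embedder {name!r}; known: {sorted(_EMBEDDERS)}") from None
    return factory(**kwargs)


@register_embedder("char-count")
class CharCountEmbedder:
    """Toy embedder for illustration: text length as a 1-dimensional vector."""

    def __init__(self, scale: float = 1.0) -> None:
        self.scale = scale

    def embed(self, text: str) -> list[float]:
        return [len(text) * self.scale]


embedder = build_embedder("char-count", scale=0.5)
print(embedder.embed("hello"))
```

The registering module lives entirely in user code. The lookup function never needs to know which implementations exist ahead of time.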

---

## Architecture

NexRAG has two independent pipelines:

```
INGESTION  →  Loader → Sanitizer → Chunker → Embedder → VectorDB
QUERY      →  Embedder → Retriever → PromptBuilder → LLM → PipelineResult
```

See [Architecture Documentation](docs/) for full pipeline diagrams.
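
To show how the stages hand data to each other, here is a toy, in-memory walk-through of both flows. Everything in it is a stand-in invented for this sketch: the bag-of-words "embedding", the list used as a vector store, and the sample text. It stops before the LLM call and does not use NexRAG's real components.

```python
import math
from collections import Counter


def sanitize(text: str) -> str:
    return " ".join(text.split())  # collapse runs of whitespace


def chunk(text: str) -> list[str]:
    return [s.strip() for s in text.split(".") if s.strip()]


def embed(text: str) -> Counter:
    return Counter(text.lower().split())  # toy bag-of-words "vector"


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Ingestion: Loader -> Sanitizer -> Chunker -> Embedder -> VectorDB
store: list[tuple[str, Counter]] = []
document = "Either party may terminate with 30 days notice.   Payment is due monthly."
for piece in chunk(sanitize(document)):
    store.append((piece, embed(piece)))

# Query: Embedder -> Retriever -> PromptBuilder -> LLM (LLM call omitted)
question = "When may a party terminate?"
q_vec = embed(question)
best_chunk, _ = max(store, key=lambda item: cosine(q_vec, item[1]))
prompt = f"Context: {best_chunk}\nQuestion: {question}"
print(prompt)
```

The two flows share only the embedder and the store, which is why they can run independently: ingestion can happen offline, and queries only need read access to the stored vectors.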

---

## Supported Providers (V1)

| Category | Providers |
|---|---|
| Embedders | OpenAI, HuggingFace, Ollama |
| Vector DBs | ChromaDB (local + remote) |
| LLMs | OpenAI, Anthropic, Ollama |
| Loaders | PDF, TXT/MD, Word, Excel, JSON, HTML, Code |

---

## Contributing

NexRAG is in early development. Contribution guidelines will be published with v1.0.

---

## Changelog

See [CHANGELOG.md](CHANGELOG.md).