Metadata-Version: 2.4
Name: llama-index-retrievers-dewey
Version: 0.1.0
Summary: LlamaIndex retriever integration for Dewey — managed RAG backend
Author-email: Dewey <hi@meetdewey.com>
License-Expression: MIT
Project-URL: Homepage, https://meetdewey.com
Project-URL: Repository, https://github.com/meetdewey/llama-index-retrievers-dewey
Project-URL: Documentation, https://meetdewey.com/docs
Keywords: llama-index,llamaindex,dewey,rag,retrieval,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: meetdewey>=1.0
Requires-Dist: llama-index-core>=0.12
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Requires-Dist: pytest-mock>=3; extra == "dev"
Requires-Dist: llama-index-core>=0.12; extra == "dev"

# LlamaIndex Dewey Retriever

[Dewey](https://meetdewey.com) is a managed document intelligence backend that handles the full RAG pipeline — PDF conversion, section extraction, chunking, embedding, and hybrid retrieval — behind a single REST API.

This package provides a `DeweyRetriever` that integrates Dewey's hybrid semantic + BM25 search directly into LlamaIndex pipelines.

## Installation

```bash
pip install llama-index-retrievers-dewey
```
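The package also declares a `dev` extra (pytest, pytest-asyncio, pytest-mock) for running the test suite:

```bash
pip install "llama-index-retrievers-dewey[dev]"
```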

## Usage

```python
from llama_index.retrievers.dewey import DeweyRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.llms.anthropic import Anthropic

retriever = DeweyRetriever(
    api_key="dwy_live_...",       # Dewey project API key
    collection_id="3f7a1b2c-...", # Collection to search
    k=8,                          # Number of chunks to retrieve
)

# Use with any LlamaIndex query engine
query_engine = RetrieverQueryEngine.from_args(
    retriever=retriever,
    llm=Anthropic(model="claude-haiku-4-5-20251001"),
)
response = query_engine.query("What are the main findings?")
print(response)
```

### Direct retrieval

```python
nodes = retriever.retrieve("attention mechanism scaling")
for n in nodes:
    print(f"[{n.score:.3f}] {n.node.metadata['filename']} › {n.node.metadata['section_title']}")
    print(f"  {n.node.text[:120]}...")
```

### As part of an agent

```python
from llama_index.core.tools import RetrieverTool
from llama_index.core.agent import ReActAgent
from llama_index.llms.openai import OpenAI

tool = RetrieverTool.from_defaults(
    retriever=retriever,
    description="Search AI research papers for relevant information.",
)
agent = ReActAgent.from_tools([tool], llm=OpenAI(model="gpt-4o-mini"), verbose=True)
response = agent.chat("Compare the Transformer and RAG architectures.")
print(response)
```

## Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `api_key` | `str` | required | Dewey project API key (`dwy_live_...`) |
| `collection_id` | `str` | required | UUID of the collection to search |
| `k` | `int` | `10` | Max chunks to retrieve (1–50) |
| `base_url` | `str` | `https://api.meetdewey.com/v1` | API base URL; override for self-hosted or local development |
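As a sketch of how these parameters compose, the hypothetical helper below merges user overrides with the documented defaults and checks the stated `k` range. It is for illustration only and is not part of the package API — in practice you pass the keyword arguments directly to `DeweyRetriever`.

```python
# Defaults as documented in the table above (illustrative, not the library's source).
DEFAULTS = {
    "k": 10,
    "base_url": "https://api.meetdewey.com/v1",
}

def build_retriever_kwargs(api_key: str, collection_id: str, **overrides) -> dict:
    """Merge overrides with documented defaults and validate the k range (1-50)."""
    kwargs = {**DEFAULTS, **overrides, "api_key": api_key, "collection_id": collection_id}
    if not 1 <= kwargs["k"] <= 50:
        raise ValueError("k must be between 1 and 50")
    return kwargs

# Equivalent to the Usage example: DeweyRetriever(**kwargs)
kwargs = build_retriever_kwargs("dwy_live_...", "3f7a1b2c-...", k=8)
```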

## What Dewey returns

Each retrieved node includes:

- `node.text` — chunk content
- `node.metadata["filename"]` — source document filename
- `node.metadata["section_title"]` — section heading
- `node.metadata["section_level"]` — heading depth (1 = top-level)
- `node.metadata["document_id"]` — document UUID
- `node.metadata["section_id"]` — section UUID
- `score` — relevance score from hybrid RRF ranking
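The metadata fields above are enough to build human-readable citations for retrieved chunks. A minimal sketch, using a plain dict to stand in for `node.metadata` (the field names follow the list above; the indentation-by-level formatting is just one possible convention):

```python
def format_citation(metadata: dict, score: float) -> str:
    """Build a 'filename › section' citation line from a chunk's metadata."""
    indent = "  " * (metadata["section_level"] - 1)  # deeper sections indent further
    return f"{indent}{metadata['filename']} › {metadata['section_title']} ({score:.3f})"

chunk = {
    "filename": "attention.pdf",
    "section_title": "Scaled Dot-Product Attention",
    "section_level": 2,
}
print(format_citation(chunk, 0.812))
# →   attention.pdf › Scaled Dot-Product Attention (0.812)
```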

## Resources

- [Dewey documentation](https://meetdewey.com/docs)
- [Free tier signup](https://meetdewey.com) — no credit card required
- [Python SDK (`meetdewey`)](https://pypi.org/project/meetdewey/)
