Metadata-Version: 2.4
Name: langchain-dewey
Version: 0.2.0
Summary: LangChain integration for Dewey — retriever, vector store, and research tool
Author-email: Dewey <hi@meetdewey.com>
License-Expression: MIT
Project-URL: Homepage, https://meetdewey.com
Project-URL: Repository, https://github.com/meetdewey/langchain-dewey
Keywords: langchain,dewey,rag,retrieval,vector-store,llm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: meetdewey>=1.0
Requires-Dist: langchain-core>=0.3
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: langchain>=0.3; extra == "dev"

# langchain-dewey

[![CI](https://github.com/meetdewey/langchain-dewey/actions/workflows/ci.yml/badge.svg)](https://github.com/meetdewey/langchain-dewey/actions/workflows/ci.yml)

LangChain integration for [Dewey](https://meetdewey.com) — retriever, vector store, and research tool.

## Installation

```bash
pip install langchain-dewey
```

## Components

### DeweyRetriever

Drop-in LangChain retriever backed by Dewey's hybrid semantic + BM25 search.

```python
from langchain_dewey import DeweyRetriever
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

retriever = DeweyRetriever(
    api_key="dwy_live_...",
    collection_id="3f7a1b2c-...",
    k=8,
)

qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=retriever)
answer = qa.invoke("What are the main findings?")["result"]
```

Each returned `Document` carries citation metadata in its `metadata` dict:

| Field | Description |
|---|---|
| `score` | Relevance score (0–1) |
| `document_id` | Dewey document ID |
| `filename` | Original filename |
| `section_id` | Section ID |
| `section_title` | Section heading |
| `section_level` | Heading depth (1 = top-level) |
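
Downstream code might format these fields into display citations, for example (a sketch: `format_citation` is a hypothetical helper, and the sample dict simply mirrors the fields in the table above):

```python
def format_citation(metadata: dict) -> str:
    """Render a one-line citation from a Dewey hit's metadata fields."""
    location = f"{metadata['filename']} > {metadata['section_title']}"
    return f"{location} (score {metadata['score']:.2f})"

# Shape of `doc.metadata` on each retrieved Document:
hit = {
    "score": 0.87,
    "document_id": "doc_123",
    "filename": "paper.pdf",
    "section_id": "sec_4",
    "section_title": "Results",
    "section_level": 2,
}
print(format_citation(hit))  # paper.pdf > Results (score 0.87)
```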

### DeweyVectorStore

Full `VectorStore` implementation. Dewey manages its own embeddings, so the
`embedding` argument is accepted for interface compatibility but unused.

**Wrap an existing collection:**

```python
from langchain_dewey import DeweyVectorStore

store = DeweyVectorStore(
    api_key="dwy_live_...",
    collection_id="3f7a1b2c-...",
)
docs = store.similarity_search("retrieval augmented generation", k=5)
```

**Build from texts:**

```python
store = DeweyVectorStore.from_texts(
    texts=["Neural networks learn via backpropagation.", "..."],
    embedding=None,
    api_key="dwy_live_...",
    collection_name="my-docs",
)
docs = store.similarity_search("how does training work?")
```

`add_texts` uploads each string as a `.txt` document and blocks until Dewey
finishes processing them (configurable via `poll_interval` and `poll_timeout`).
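
The polling loop behaves roughly like the sketch below: re-check processing status every `poll_interval` seconds until the documents are ready or `poll_timeout` elapses. This is a minimal illustration of the assumed semantics, not Dewey's actual implementation:

```python
import time


def wait_until_processed(check_status, poll_interval=2.0, poll_timeout=300.0):
    """Poll `check_status()` until it returns True or the timeout elapses."""
    deadline = time.monotonic() + poll_timeout
    while time.monotonic() < deadline:
        if check_status():
            return True
        time.sleep(poll_interval)
    raise TimeoutError(f"documents not processed within {poll_timeout}s")
```

Raising on timeout (rather than returning `False`) means a stuck upload fails loudly instead of silently yielding an empty collection.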

**With LangChain document loaders:**

```python
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("research_paper.pdf")
pages = loader.load_and_split()

store = DeweyVectorStore.from_documents(
    pages,
    embedding=None,
    api_key="dwy_live_...",
    collection_name="research",
)
```

### create_research_tool

Creates a LangChain `@tool` that runs a full Dewey research query — searching,
reading, and synthesising across multiple documents — for use with agents.

```python
from langchain_dewey import create_research_tool
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate

research = create_research_tool(
    api_key="dwy_live_...",
    collection_id="3f7a1b2c-...",
    depth="balanced",  # "quick" | "balanced" | "deep" | "exhaustive"
)

llm = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful research assistant."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])
agent = create_openai_tools_agent(llm, [research], prompt)
executor = AgentExecutor(agent=agent, tools=[research])
executor.invoke({"input": "Summarise the key findings across all documents."})
```

## Requirements

- Python 3.9+
- `meetdewey >= 1.0`
- `langchain-core >= 0.3`

## Development

```bash
pip install -e ".[dev]"
pytest
```
