Metadata-Version: 2.4
Name: llama-index-retrievers-quelvio
Version: 0.1.0
Summary: Quelvio for LlamaIndex — your company's brain as a LlamaIndex retriever.
Project-URL: Homepage, https://quelvio.com
Project-URL: Documentation, https://docs.quelvio.com
Project-URL: Source, https://github.com/Quelvio/quelvio-llama-index
Project-URL: Issues, https://github.com/Quelvio/quelvio-llama-index/issues
Project-URL: Changelog, https://github.com/Quelvio/quelvio-llama-index/blob/main/CHANGELOG.md
Author-email: Quelvio <engineering@quelvio.com>
License: MIT
License-File: LICENSE
Keywords: agent,enterprise-search,knowledge-base,llama-index,llamaindex,llm,quelvio,rag,retrieval
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: httpx<1.0,>=0.27
Requires-Dist: llama-index-core<0.13,>=0.11
Requires-Dist: pydantic<3.0,>=2.5
Requires-Dist: typing-extensions>=4.7
Provides-Extra: dev
Requires-Dist: build>=1.2; extra == 'dev'
Requires-Dist: mypy>=1.10; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-httpx>=0.30; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: respx>=0.21; extra == 'dev'
Requires-Dist: ruff>=0.5; extra == 'dev'
Description-Content-Type: text/markdown

# llama-index-retrievers-quelvio

> Quelvio for LlamaIndex — your company's brain as a LlamaIndex retriever.

`llama-index-retrievers-quelvio` is the official Python integration that plugs
Quelvio's enterprise knowledge API into [LlamaIndex](https://docs.llamaindex.ai).
It ships `QuelvioRetriever`, a `BaseRetriever` implementation wired to your
organization's connected sources (Google Drive, SharePoint, Confluence, Slack,
Notion, and the rest of your content fabric) and scoped to the running user's
individual permissions.

[![PyPI version](https://img.shields.io/pypi/v/llama-index-retrievers-quelvio.svg)](https://pypi.org/project/llama-index-retrievers-quelvio/)
[![Python versions](https://img.shields.io/pypi/pyversions/llama-index-retrievers-quelvio.svg)](https://pypi.org/project/llama-index-retrievers-quelvio/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)

## Why Quelvio (and not vanilla RAG)?

A naive RAG pipeline embeds every chunk it can find and ranks by cosine
similarity. That's why most internal copilots confidently quote a
three-year-old draft. Quelvio is a managed company-brain that does the
work a generic vector store can't:

- **Authority scoring.** Every chunk is ranked by *who authored it*, *how
  fresh it is*, and *how many downstream documents reference it* — not just
  semantic similarity to the question.
- **Lifecycle awareness.** Drafts, deprecated docs, and superseded
  decisions are demoted automatically; chunks return a `lifecycle_state`
  the LLM can quote when hedging.
- **Per-employee permissioning.** Every query is scoped to the running
  user's identity. Results never include documents the user can't already
  read in the source system (Drive ACLs, Confluence space restrictions,
  SharePoint groups).
- **Synthesized answers with citations.** The API returns a final answer
  *plus* the chunks that informed it, so your agent can hand the user a
  link to the source of truth, not a hallucination.

## Install

```bash
pip install llama-index-retrievers-quelvio llama-index-core
```

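Requires Python 3.10+ and `llama-index-core>=0.11,<0.13`. The examples below
that import `llama_index.llms.anthropic` also need the separate
`llama-index-llms-anthropic` package.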

## Quickstart

```python
from llama_index_retrievers_quelvio import QuelvioRetriever

retriever = QuelvioRetriever(api_key="qlv_pat_...")  # or set QUELVIO_API_KEY
nodes = retriever.retrieve("what's our refund policy?")

for n in nodes:
    print(n.score, n.metadata.get("source_url"), n.text)
```

Each returned `NodeWithScore` carries the chunk's `source_url`,
`authority_score`, `taxonomy_domain`, `chunk_id`, and (when present) the
author's name, email, and department on the underlying
`TextNode.metadata`. The `score` is the chunk's authority score when
available, so a downstream node postprocessor such as
`SimilarityPostprocessor` can filter on it directly.
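
To make the score-based filtering concrete, here is a minimal plain-Python sketch of what a cutoff filter does with scored chunks. `ScoredChunk` and `filter_by_score` are hypothetical stand-ins written for this illustration (they are not part of this package or of LlamaIndex), and the `0.7` cutoff assumes a 0 to 1 score scale, which this README does not specify.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScoredChunk:
    """Hypothetical stand-in for a NodeWithScore: text, optional score, metadata."""

    text: str
    score: Optional[float] = None
    metadata: dict = field(default_factory=dict)


def filter_by_score(chunks: list[ScoredChunk], cutoff: float) -> list[ScoredChunk]:
    """Keep chunks scoring at least `cutoff`, highest score first.

    Chunks without a score are dropped; that choice is an assumption of this sketch.
    """
    kept = [c for c in chunks if c.score is not None and c.score >= cutoff]
    return sorted(kept, key=lambda c: c.score, reverse=True)


chunks = [
    ScoredChunk("Refund policy draft", score=0.42),
    ScoredChunk("Refund policy v3", score=0.91),
    ScoredChunk("Unscored note"),
]
top = filter_by_score(chunks, cutoff=0.7)  # cutoff scale is assumed, not documented
print([c.text for c in top])
```

In a real pipeline the same idea would be applied to the `NodeWithScore` objects returned by `retrieve()`, with a cutoff chosen after inspecting the scores your workspace actually returns.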

## Authentication

`llama-index-retrievers-quelvio` resolves a bearer token from the first
non-empty source, in order:

| Precedence | Source                          | Notes                                              |
| ---------- | ------------------------------- | -------------------------------------------------- |
| 1          | `api_key=...` constructor arg   | Highest priority; never persisted, never logged.   |
| 2          | `QUELVIO_API_KEY` env var       | Best for CI, notebooks, and one-off scripts.       |

Three token types are accepted — the wire format is identical, so the
library does not need to know which kind you provided:

- **Personal Access Token (PAT).** Long-lived bearer tied to a human
  user. Generate at <https://enterprise.quelvio.com/account> → *Personal
  API Keys* → *Create token*. Best for ad-hoc use and CI.
- **OAuth access token.** Short-lived token from the device-code flow
  (`quelvio login` in the [CLI](https://github.com/Quelvio/quelvio-cli)).
- **Service Account key.** Long-lived, machine-scoped. Generate at
  *Settings* → *Service Accounts*. Best for production agents.

The token is held privately on the client; it never appears in
`repr()`, exception messages, or any log line emitted by this library.
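
The precedence and redaction rules above can be sketched in a few lines of standard-library Python. This is an illustration of the described behavior, not the library's source: `BearerToken` and `resolve_token` are hypothetical names, and the `Authorization: Bearer ...` header is the conventional way to send a bearer token, assumed here rather than confirmed by this README.

```python
import os
from typing import Mapping, Optional


class BearerToken:
    """Holds a token privately so it never shows up in repr() or str()."""

    def __init__(self, value: str) -> None:
        self._value = value

    def header(self) -> dict[str, str]:
        # Assumed wire format: the standard HTTP bearer-token header.
        return {"Authorization": f"Bearer {self._value}"}

    def __repr__(self) -> str:
        return "BearerToken('***')"


def resolve_token(
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> BearerToken:
    """Return the first non-empty token: constructor arg, then QUELVIO_API_KEY."""
    env = os.environ if env is None else env
    for candidate in (api_key, env.get("QUELVIO_API_KEY")):
        if candidate and candidate.strip():
            return BearerToken(candidate.strip())
    # The error message names the sources but never echoes a token value.
    raise ValueError("No API key found: pass api_key=... or set QUELVIO_API_KEY")


token = resolve_token(api_key=None, env={"QUELVIO_API_KEY": "env-token"})
print(token)  # the token value itself is not printed
```

Passing `env` explicitly keeps the function easy to test; production code would normally rely on the `os.environ` default.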

## Configuration

| Constructor arg / env var       | Default                       | Purpose                                            |
| ------------------------------- | ----------------------------- | -------------------------------------------------- |
| `api_key` / `QUELVIO_API_KEY`   | *(required)*                  | Bearer token (PAT, OAuth, or Service Account).     |
| `base_url` / `QUELVIO_API_BASE` | `https://api.quelvio.com`     | API base — point at `api-dev` for staging.         |
| `timeout`                       | `30.0` seconds                | Per-request HTTP timeout.                          |
| `max_sources`                   | `5`                           | Max chunks returned per query (1–50).              |
| `mode`                          | `"standard"`                  | `fast` / `standard` / `deep`.                      |
| `domain`                        | `None`                        | Restrict to one taxonomy domain.                   |
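
As a rough illustration of how the defaults and ranges in this table fit together, the sketch below models them as a validated settings object. `RetrieverSettings` and `settings_from_env` are hypothetical helpers for this example only; whether the real constructor raises on out-of-range values is an assumption, and the staging URL is a placeholder.

```python
import os
from dataclasses import dataclass
from typing import Any, Mapping, Optional

VALID_MODES = ("fast", "standard", "deep")
DEFAULT_BASE_URL = "https://api.quelvio.com"


@dataclass(frozen=True)
class RetrieverSettings:
    """Hypothetical mirror of the configuration table, with its defaults."""

    base_url: str = DEFAULT_BASE_URL
    timeout: float = 30.0
    max_sources: int = 5
    mode: str = "standard"
    domain: Optional[str] = None

    def __post_init__(self) -> None:
        # Validation behavior is assumed; the table only documents the ranges.
        if not 1 <= self.max_sources <= 50:
            raise ValueError("max_sources must be between 1 and 50")
        if self.mode not in VALID_MODES:
            raise ValueError(f"mode must be one of {VALID_MODES}")
        if self.timeout <= 0:
            raise ValueError("timeout must be positive")


def settings_from_env(
    env: Optional[Mapping[str, str]] = None, **overrides: Any
) -> RetrieverSettings:
    """Explicit base_url wins, then QUELVIO_API_BASE, then the default."""
    env = os.environ if env is None else env
    base_url = (
        overrides.pop("base_url", None)
        or env.get("QUELVIO_API_BASE")
        or DEFAULT_BASE_URL
    )
    return RetrieverSettings(base_url=base_url.rstrip("/"), **overrides)


staging = settings_from_env(
    env={"QUELVIO_API_BASE": "https://staging.example.test/"},  # placeholder URL
    mode="deep",
    max_sources=8,
)
print(staging)
```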

## Examples

### 1. Simple Q&A with citations

```python
from llama_index_retrievers_quelvio import QuelvioRetriever

retriever = QuelvioRetriever()  # reads QUELVIO_API_KEY

nodes = retriever.retrieve("how do we handle on-call escalations?")
for n in nodes:
    title = n.metadata.get("title", "(untitled)")
    url = n.metadata.get("source_url", "(no link)")
    authority = n.metadata.get("authority_score", "—")
    print(f"[authority {authority}] {title}\n  {url}\n  {n.text[:160]}\n")
```
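
If you format citations in more than one place, a small helper that tolerates missing metadata keeps the output consistent. The sketch below works on a plain metadata mapping so it runs without any dependencies. It assumes `title` and `lifecycle_state` appear as metadata keys (this README confirms only `source_url`, `authority_score`, `taxonomy_domain`, `chunk_id`, and the author fields), and the `"deprecated"` value is a made-up example.

```python
from typing import Any, Mapping


def format_citation(metadata: Mapping[str, Any], text: str, width: int = 160) -> str:
    """Render one chunk as a citation, skipping any field that is absent."""
    title = metadata.get("title") or "(untitled)"
    url = metadata.get("source_url") or "(no link)"
    authority = metadata.get("authority_score")
    state = metadata.get("lifecycle_state")  # assumed key name

    tags = []
    if authority is not None:
        tags.append(f"authority {authority}")
    if state:
        tags.append(f"lifecycle {state}")
    prefix = f"[{', '.join(tags)}] " if tags else ""

    # Collapse whitespace so the snippet stays on one line, then truncate.
    snippet = " ".join(text.split())[:width]
    return f"{prefix}{title}\n  {url}\n  {snippet}"


print(
    format_citation(
        {
            "title": "On-call runbook",
            "source_url": "https://example.test/runbook",  # placeholder URL
            "authority_score": 0.88,
            "lifecycle_state": "deprecated",  # hypothetical value
        },
        "Page the secondary after 15 minutes.",
    )
)
```

With retriever results, the call would look like `format_citation(n.metadata, n.text)` inside the loop shown above.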

### 2. RAG QueryEngine using QuelvioRetriever + LLM

```python
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response_synthesizers import get_response_synthesizer
from llama_index.llms.anthropic import Anthropic
from llama_index_retrievers_quelvio import QuelvioRetriever

retriever = QuelvioRetriever(mode="deep", max_sources=8)

llm = Anthropic(model="claude-sonnet-4-6")
synthesizer = get_response_synthesizer(llm=llm, response_mode="compact")

query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=synthesizer,
)

response = query_engine.query("Summarize our Q4 OKR review decisions.")
print(response)
for src in response.source_nodes:
    print(f"  • {src.metadata.get('title', '(untitled)')} — {src.metadata.get('source_url', '(no url)')}")
```

### 3. Multi-step LlamaIndex agent using QuelvioRetriever as a knowledge tool

```python
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.core.tools import RetrieverTool, ToolMetadata
from llama_index.llms.anthropic import Anthropic
from llama_index_retrievers_quelvio import QuelvioRetriever

retriever = QuelvioRetriever(mode="standard", max_sources=6)

quelvio_tool = RetrieverTool(
    retriever=retriever,
    metadata=ToolMetadata(
        name="quelvio_query",
        description=(
            "Look up factual information from THIS company's internal knowledge "
            "base — policies, decisions, on-call runbooks, OKRs, customer "
            "playbooks. Use whenever the user asks about internal company info."
        ),
    ),
)

agent = FunctionAgent(
    tools=[quelvio_tool],
    llm=Anthropic(model="claude-sonnet-4-6"),
    system_prompt=(
        "You are an internal assistant. Use the quelvio_query tool whenever "
        "the user asks about anything company-specific. Always cite source URLs."
    ),
)

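# Top-level `await` only works in a notebook or async REPL. In a plain
# script, wrap the call in a coroutine and drive it with asyncio.run().
import asyncio


async def main() -> None:
    print(await agent.run("What's our parental leave policy?"))


asyncio.run(main())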
```

## Related packages

- **[`quelvio-langchain`](https://github.com/Quelvio/quelvio-langchain-python)** —
  the LangChain Python integration (sibling package).
- **[`@quelvio/langchain`](https://github.com/Quelvio/quelvio-langchain-js)** —
  the LangChain.js integration.
- **[`@quelvio/cli`](https://github.com/Quelvio/quelvio-cli)** — query the
  brain from your terminal, scriptable in CI, JSON output.
- **[`@quelvio/mcp-server`](https://github.com/Quelvio/quelvio-mcp-server)** —
  use Quelvio from any Model Context Protocol client (Claude Desktop,
  Cursor, VS Code, etc.).
- **[Quelvio docs](https://docs.quelvio.com)** — concepts, API reference,
  source connectors.

## Development

```bash
git clone https://github.com/Quelvio/quelvio-llama-index
cd quelvio-llama-index
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest
```

Linting and type-checking:

```bash
ruff check src tests
ruff format --check src tests
mypy src
```

## Contributing

Issues and pull requests welcome at
<https://github.com/Quelvio/quelvio-llama-index>. Please run `ruff check`,
`ruff format`, `mypy`, and `pytest` before opening a PR.

## License

MIT — see [LICENSE](./LICENSE).
