Metadata-Version: 2.4
Name: verbatim-client
Version: 0.2.0
Summary: Python SDK and CLI for the Verbatim platform — grounded, verbatim answers from curated document collections.
Project-URL: Homepage, https://verbatim.krlabs.eu
Project-URL: Documentation, https://verbatim.krlabs.eu/docs
Project-URL: Repository, https://github.com/KRLabsOrg/verbatim-client
Project-URL: Issues, https://github.com/KRLabsOrg/verbatim-client/issues
Author-email: KR Labs <info@krlabs.eu>
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: academic,citations,cli,papers,rag,research,sdk,search,verbatim
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: httpx>=0.25.0
Requires-Dist: pydantic>=2.0
Requires-Dist: rich>=13.0
Requires-Dist: typer>=0.9.0
Provides-Extra: dev
Requires-Dist: build>=1.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: respx>=0.20; extra == 'dev'
Requires-Dist: twine>=4.0; extra == 'dev'
Description-Content-Type: text/markdown

# verbatim-client

[![PyPI version](https://img.shields.io/pypi/v/verbatim-client.svg)](https://pypi.org/project/verbatim-client/)
[![License](https://img.shields.io/pypi/l/verbatim-client.svg)](https://github.com/KRLabsOrg/verbatim-client/blob/main/LICENSE)

Python SDK and CLI for the [Verbatim platform](https://verbatim.krlabs.eu) — grounded, verbatim answers from curated document collections.

Verbatim is a hosted RAG platform from [KR Labs](https://krlabs.eu). Every answer is grounded in exact, citable spans from real documents. The platform is organized around **collections** — curated document sets like the ACL Anthology, or private collections we host for organizations.

This library is the **Python client** for talking to `verbatim.krlabs.eu` directly. It wraps the REST API with typed [Pydantic](https://docs.pydantic.dev/) responses and ships a `verbatim` command-line tool.

## How it relates to the other Verbatim packages

| Package | What it's for |
|---|---|
| **`verbatim-client`** (this) | Python SDK + CLI for the hosted platform |
| [`verbatim-rag`](https://github.com/KRLabsOrg/verbatim-rag) | Open-source RAG core to self-host your own Verbatim-style stack |
| [`verbatim-mcp`](https://github.com/KRLabsOrg/verbatim-mcp) | MCP server wrapping the hosted platform for agentic clients |
| [`verbatim-skill`](https://github.com/KRLabsOrg/verbatim-skill) | Claude Code plugin exposing the platform as slash commands |

## Install

```bash
pip install verbatim-client
```

## Get an API key

1. Sign up at [verbatim.krlabs.eu](https://verbatim.krlabs.eu)
2. Navigate to **API Keys** and create a key
3. Export it:

```bash
export VERBATIM_API_KEY=vb_your_key_here
```

## SDK usage

### Sync

```python
from verbatim_client import VerbatimClient

with VerbatimClient() as client:
    # RAG over the default collection (ACL Anthology)
    result = client.query("What is the attention mechanism in transformers?")
    print(result.answer)
    for cite in result.structured_answer.citations:
        print(f"  [{cite.number}] {cite.text}")

    # Search papers (no LLM, no quota)
    papers = client.search_papers("attention mechanisms", year=2017, limit=5)
    for p in papers:
        print(p.id, p.title)

    # Search papers AND get the chunks that produced the match — useful for
    # feeding directly into a local extractor or your own pipeline. Free.
    papers = client.search_papers(
        "attention mechanisms", limit=5, include_chunks=True,
    )
    for p in papers:
        for chunk in p.matched_chunks or []:
            print(f"  [{chunk.score:.2f}] {chunk.text[:80]}…")

    # Paper detail + BibTeX + full text
    paper = client.get_paper("2017.acl-1.1")
    bibtex = client.get_paper_bibtex("2017.acl-1.1").bibtex
    content = client.get_paper_content("2017.acl-1.1").content

    # Browse facets
    venues = client.facets("venue", q="acl", limit=10)
    for v in venues.items:
        print(v.value, v.count)

    # Cross-collection query (when other collections are available)
    result = client.query(
        "Recent advances in language modeling",
        collection_ids=["anthology", "biorxiv"],
    )
```

### Async

```python
import asyncio
from verbatim_client import AsyncVerbatimClient

async def main():
    async with AsyncVerbatimClient() as client:
        result = await client.query("What is BERT?")
        print(result.answer)

asyncio.run(main())
```
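
When several questions need answering, the async client can in principle be driven concurrently. The helper below is a minimal sketch, not part of the package: `ask_many` and `max_concurrency` are hypothetical names, and it assumes that calling `client.query` concurrently on one client instance is safe. It accepts any object with an async `query` method.

```python
import asyncio
from typing import Any, Sequence


async def ask_many(
    client: Any, questions: Sequence[str], max_concurrency: int = 3
) -> list[Any]:
    """Run several queries concurrently, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def ask(question: str) -> Any:
        # The semaphore caps the number of in-flight requests.
        async with semaphore:
            return await client.query(question)

    # gather() returns results in the same order as the input questions.
    return await asyncio.gather(*(ask(q) for q in questions))
```

Inside `async with AsyncVerbatimClient() as client:`, this would be called as `results = await ask_many(client, ["What is BERT?", "What is ELMo?"])`. A small concurrency cap is a cautious choice given that the platform enforces rate limits and quotas.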

### Streaming

NDJSON streaming exposes documents → highlights → answer stages as they're produced:

```python
from verbatim_client import VerbatimClient

with VerbatimClient() as client:
    for event in client.query_stream("What is BERT?"):
        if event["type"] == "documents":
            print(f"Retrieved {len(event['data'])} documents")
        elif event["type"] == "answer" and event.get("done"):
            print("Final answer:", event["data"]["answer"])
```
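
For readers unfamiliar with NDJSON, the format is one JSON object per line. The sketch below shows how such a stream could be decoded and reduced to the final answer without the SDK. `iter_ndjson` and `final_answer` are hypothetical helpers, and the event shape (`type`, `data`, `done`) is assumed from the fields used in the example above rather than taken from an API specification.

```python
import json
from typing import Any, Iterable, Iterator, Optional


def iter_ndjson(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Decode newline-delimited JSON: one JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        yield json.loads(line)


def final_answer(events: Iterable[dict[str, Any]]) -> Optional[str]:
    """Return the answer text from the last completed `answer` event, if any."""
    answer = None
    for event in events:
        if event.get("type") == "answer" and event.get("done"):
            answer = event["data"]["answer"]
    return answer
```

Because both helpers accept any iterable, they work on lines read lazily from a response body as well as on a list of strings captured for testing.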

### Verbatim Transform (bring your own context)

`transform` is collection-agnostic — pass your own context inline:

```python
from verbatim_client import VerbatimClient

with VerbatimClient() as client:
    result = client.transform(
        "What are the main findings?",
        context=[
            {"content": "Full text of document 1...", "title": "Doc 1"},
            {"content": "Full text of document 2...", "title": "Doc 2"},
        ],
    )
    print(result.answer)
```
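
The `context` list can be assembled from local files. The following is a minimal sketch under the assumption that each entry needs only the `content` and `title` keys shown above; `build_context` is a hypothetical helper, not part of the package.

```python
from pathlib import Path
from typing import Iterable, Union


def build_context(paths: Iterable[Union[str, Path]]) -> list[dict[str, str]]:
    """Read text files into the list-of-dicts shape used by `transform`."""
    context = []
    for path in map(Path, paths):
        context.append(
            {
                "content": path.read_text(encoding="utf-8"),
                # Use the file name without its extension as the title.
                "title": path.stem,
            }
        )
    return context
```

The result can then be passed as `client.transform("What are the main findings?", context=build_context(["notes.md", "report.txt"]))`, where the file names are placeholders.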

## CLI

The package installs a `verbatim` command:

```bash
# Ask a question
verbatim query "What is BERT?"

# Search papers
verbatim search "attention mechanisms" --year 2017 --limit 5

# Search + include the matched chunks per paper (free, no LLM)
verbatim search "attention mechanisms" --include-chunks --json | jq '.[].matched_chunks'

# List collections
verbatim collections list

# Paper metadata, BibTeX, full text
verbatim paper get 2017.acl-1.1
verbatim paper bibtex 2017.acl-1.1 > vaswani-2017.bib
verbatim paper content 2017.acl-1.1 > vaswani-2017.md

# Facet autocomplete (fuzzy)
verbatim facets author --q vaswani --limit 5

# Verbatim transform over your own context
verbatim transform "What did they find?" --context ./mycontext.json
```

Every command honors `VERBATIM_API_KEY` and `VERBATIM_API_URL` from the environment; override per-invocation with `--api-key` and `--base-url`.

Add `--json` to most commands to get machine-readable output:

```bash
verbatim search "attention" --json | jq '.[] | {id, title}'
```

## Configuration

| Env var | Default | Description |
|---|---|---|
| `VERBATIM_API_KEY` | — | Required. Your platform API key. |
| `VERBATIM_API_URL` | `https://verbatim.krlabs.eu` | API base URL. Override for dev / self-hosted. |
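
The precedence described for the CLI (an explicit value first, then the environment variable, then the default) can be sketched as follows. This illustrates the lookup order only; it is not the package's actual implementation, and `resolve_settings` is a hypothetical name.

```python
import os
from typing import Mapping, Optional

DEFAULT_API_URL = "https://verbatim.krlabs.eu"


def resolve_settings(
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> tuple[str, str]:
    """Resolve (api_key, base_url): explicit argument, then env var, then default."""
    env = os.environ if env is None else env
    key = api_key or env.get("VERBATIM_API_KEY")
    if not key:
        # The API key has no default, so fail early with a clear message.
        raise RuntimeError("VERBATIM_API_KEY is not set and no API key was passed")
    url = base_url or env.get("VERBATIM_API_URL") or DEFAULT_API_URL
    return key, url
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.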

## Errors

API errors raise `VerbatimError`:

```python
from verbatim_client import VerbatimClient, VerbatimError

try:
    with VerbatimClient() as client:
        client.query("...")
except VerbatimError as e:
    print(e.status_code, e.detail)
```

Common statuses:
- `401` — authentication required
- `403` — no access to this collection (or non-default collection while the platform's Milvus filter is disabled)
- `404` — collection or paper not found
- `429` — rate limit or quota exceeded
- `451` — legal acceptance required
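
Of the statuses above, `429` is the one that may be worth retrying. The sketch below retries with exponential backoff and re-raises everything else. It is an illustration under assumptions: `call_with_retry` is a hypothetical helper, the `VerbatimError` class here is a local stand-in exposing only the `status_code` and `detail` attributes shown earlier, and a retry helps only when the `429` reflects a short-lived rate limit rather than an exhausted quota.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class VerbatimError(Exception):
    """Stand-in for illustration; in real code import it from verbatim_client."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(f"{status_code}: {detail}")
        self.status_code = status_code
        self.detail = detail


def call_with_retry(
    call: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Invoke `call`, retrying only on HTTP 429 with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return call()
        except VerbatimError as err:
            # Auth, access, and not-found errors will not succeed on retry.
            if err.status_code != 429 or attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise AssertionError("unreachable")
```

With a real client this would be used as `call_with_retry(lambda: client.query("What is BERT?"))`.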

## License

Apache 2.0. See [LICENSE](LICENSE).
