Metadata-Version: 2.4
Name: docreadi-mcp
Version: 0.1.1
Summary: MCP server for DocReadi — document data extraction for finance, in your AI client.
Project-URL: Homepage, https://docreadi.com
Project-URL: Documentation, https://docreadi.com/api/docs-guide
Project-URL: Repository, https://github.com/greenmartian138/doc_intelligence
Author-email: DocReadi <support@docreadi.com>
License: Proprietary
Keywords: docreadi,documents,extraction,finance,invoice,mcp,ocr
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Office/Business :: Financial
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: httpx<1,>=0.27.0
Requires-Dist: mcp>=1.2.0
Description-Content-Type: text/markdown

# docreadi-mcp

A **local** [MCP](https://modelcontextprotocol.io) server for **DocReadi** —
document data extraction for finance. It runs on your machine, reads documents
off your disk, and calls the hosted DocReadi API with your API key, so an AI
client (Claude Desktop, Claude Code, Cursor, …) can extract documents and query
your DocReadi corpus without leaving the chat.

> **Status: v1 published** — on PyPI as
> [`docreadi-mcp`](https://pypi.org/project/docreadi-mcp/); run with
> `uvx docreadi-mcp` (see Install / configure below). Plan + history:
> `../../MCP_SERVER_PLAN.md`.

## How it works

Your MCP client spawns `docreadi-mcp` as a local subprocess and talks to it over
stdio. The server is a thin courier — all extraction/storage happens on the
hosted DocReadi API; the server just translates tool calls into HTTPS requests
and (for ingestion) reads local files. Your API key lives only in your client
config, never in this repo.
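
As a rough illustration of the "thin courier" idea, the sketch below turns a tool call into an unsent HTTPS request. It is not the package's actual code: the `ROUTES` paths, the `build_request` helper, and the bearer-token header are all assumptions made for the example, and it uses the standard library rather than the `httpx` client the package depends on.

```python
import json
import urllib.request

# Hypothetical tool -> (HTTP method, path template) mapping. The real routes
# belong to the DocReadi API and are not documented in this README.
ROUTES = {
    "get_document": ("GET", "/api/documents/{document_id}"),
    "run_report": ("POST", "/api/reports/{report_id}/run"),
}


def build_request(tool, args, base_url, api_key):
    """Translate one tool call into an unsent HTTPS request."""
    method, template = ROUTES[tool]
    url = base_url.rstrip("/") + template.format(**args)
    # Assumption: bearer-token auth. The actual header scheme may differ.
    headers = {"Authorization": f"Bearer {api_key}"}
    body = None
    if method == "POST":
        body = json.dumps(args).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=body, headers=headers, method=method)


req = build_request(
    "get_document", {"document_id": "doc_123"}, "https://docreadi.com", "dr_live_example"
)
print(req.get_method(), req.full_url)
```

Nothing is sent here; the point is only that the server holds no extraction logic of its own and each tool reduces to one request against the hosted API.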

## Install / configure

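Add the server to your MCP client config with your DocReadi API key
(get one at **docreadi.com → Settings → API keys**). If your client parses its
config as strict JSON, drop the `//` comment lines, which strict parsers reject: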

```jsonc
{
  "mcpServers": {
    "docreadi": {
      "command": "uvx",
      "args": ["docreadi-mcp"],
      "env": {
        "DOCREADI_API_KEY": "dr_live_…"
        // Optional override if self-hosting (add a comma after the line above):
        // "DOCREADI_BASE_URL": "https://docreadi.com"
      }
    }
  }
}
```

| Env var | Required | Default |
|---|---|---|
| `DOCREADI_API_KEY` | yes | — |
| `DOCREADI_BASE_URL` | no | `https://docreadi.com` |
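
The table above could be read in code roughly as follows. This is a minimal sketch, not the contents of `docreadi_mcp/config.py`: the `Config` class and `load_config` function are hypothetical names, and stripping whitespace and a trailing slash are choices made for the example.

```python
import os
from dataclasses import dataclass

DEFAULT_BASE_URL = "https://docreadi.com"  # documented default


@dataclass(frozen=True)
class Config:
    api_key: str
    base_url: str


def load_config(env=None):
    """Read settings from the environment, applying the documented default."""
    env = os.environ if env is None else env
    api_key = env.get("DOCREADI_API_KEY", "").strip()
    if not api_key:
        # The key is the only required variable.
        raise ValueError("DOCREADI_API_KEY is required")
    base_url = env.get("DOCREADI_BASE_URL", "").strip() or DEFAULT_BASE_URL
    return Config(api_key=api_key, base_url=base_url.rstrip("/"))


cfg = load_config({"DOCREADI_API_KEY": "dr_live_example"})
print(cfg.base_url)
```

Passing the environment in as a mapping keeps the function testable without touching `os.environ`.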

## Tools

| Tool | What it does |
|---|---|
| `check_connection` | First-run self-test — is DocReadi reachable and is your API key valid? Never errors; returns a diagnostic. |
| `extract_document` | Extract structured data from a **local** PDF/JPG/PNG — uploads, waits for extraction, returns the fields + line items + totals + confidence. |
| `get_document` | Fetch a document's status + extracted data by id. |
| `extract_adhoc` | One-off, non-destructive extraction of caller-defined fields from a document already in DocReadi (e.g. "who signed these delivery notes?"). |
| `classify_document` | (Re)classify a document's type; returns the type + reasoning. |
| `search_documents` | Search/list the workspace's documents (by vendor, number, type, status, or date range) — a summary row per match. |
| `list_counterparties` | List vendors/customers (VAT, aliases, doc counts), paginated. |
| `list_reports` | List the workspace's saved reports. |
| `run_report` | Run a saved report and return its rows; the saved-query alternative to `search_documents` for querying the corpus. |
| `export_report_csv` | Run a saved report and write the result as a CSV to a local path. |

_(`search_documents` is the raw query surface; for richer column selection and
saved queries, author a report in the UI and use `run_report` /
`export_report_csv`.)_

## Verify your key (`--check`)

Before wiring it into a client, confirm the service is reachable and your key
works — straight from a terminal, no MCP client needed:

```bash
DOCREADI_API_KEY=dr_live_… uvx docreadi-mcp --check
# ✓ Connected — DocReadi reachable at https://docreadi.com, API key valid.
```

Exit codes: `0` connected · `1` reachable but the key failed · `2` no key set.
(`docreadi-mcp --version` prints the version; running with no flags starts the
server over stdio.)
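
For scripted setups (a CI step or an installer, say), the exit codes above can be interpreted programmatically. The sketch below is one way to do it; `describe_check` and `run_check` are hypothetical helper names, and only the three documented codes are mapped.

```python
import subprocess

# Meanings taken from the exit codes listed above.
CHECK_EXIT_CODES = {
    0: "connected",
    1: "reachable, but the API key failed",
    2: "no API key set",
}


def describe_check(code):
    """Map a --check exit code to a short human-readable meaning."""
    return CHECK_EXIT_CODES.get(code, f"unexpected exit code {code}")


def run_check(command=("uvx", "docreadi-mcp", "--check")):
    """Run the self-test and return (exit_code, meaning).

    The child process inherits DOCREADI_API_KEY from the caller's environment.
    """
    completed = subprocess.run(list(command), check=False)
    return completed.returncode, describe_check(completed.returncode)
```

Calling `run_check()` with `DOCREADI_API_KEY` exported returns a pair such as `(0, "connected")` on success.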

## Development

This package lives in the DocReadi monorepo so it's reviewed and CI'd alongside
the API it wraps. Its layering keeps the dependency-free parts testable in the
main CI without the `mcp` SDK:

- `docreadi_mcp/config.py` — env config.
- `docreadi_mcp/catalog.py` — pure-data tool→endpoint registry (the maintenance
  anchor: `tests/test_mcp_catalog_contract.py` binds it to `agent_guide.GUIDE`).
- `docreadi_mcp/client.py` — httpx client (tested with `MockTransport`).
- `docreadi_mcp/server.py` — thin FastMCP glue (lazy-imports the `mcp` SDK).

```bash
# from the repo root
pip install -e services/mcp          # installs mcp + httpx
DOCREADI_API_KEY=dr_… python -m docreadi_mcp   # run over stdio
```
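
The lazy import noted for `server.py` is what lets the other modules be tested without the `mcp` SDK installed. A minimal sketch of that pattern follows; the function names and the placeholder tool body are illustrative assumptions, not the actual contents of `server.py`.

```python
def build_server():
    """Create the MCP server, importing the SDK only at call time."""
    # Deferred import: code that merely imports this module (config, catalog,
    # client tests) does not need the `mcp` package installed.
    from mcp.server.fastmcp import FastMCP

    server = FastMCP("docreadi")

    @server.tool()
    def check_connection() -> str:
        """Placeholder body; the real tool calls the DocReadi API."""
        return "not implemented in this sketch"

    return server


def main():
    # FastMCP's run() uses the stdio transport by default.
    build_server().run()
```

Importing this module has no side effects; the SDK is only required once `build_server()` or `main()` is actually called.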

**Maintenance contract:** when a DocReadi API endpoint that a tool wraps
changes, update `docreadi_mcp/catalog.py` (and the tool itself); otherwise the
contract test (`tests/test_mcp_catalog_contract.py`) fails CI. See
`../../MCP_SERVER_PLAN.md` §5.
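
To make the contract concrete, here is a sketch of how a pure-data catalog can be checked against the API's own endpoint listing. Everything in it is assumed for illustration: the `ToolSpec` shape, the example paths, and the `GUIDE` stand-in (the real structure of `agent_guide.GUIDE` is not described in this README).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    name: str
    method: str
    path: str


# Hypothetical catalog entries; real paths are defined by the DocReadi API.
CATALOG = (
    ToolSpec("get_document", "GET", "/api/documents/{id}"),
    ToolSpec("list_reports", "GET", "/api/reports"),
)

# Hypothetical stand-in for the API-side endpoint listing.
GUIDE = {
    ("GET", "/api/documents/{id}"),
    ("GET", "/api/reports"),
}


def missing_endpoints(catalog, guide):
    """Return catalog entries whose endpoint the API no longer advertises."""
    return [spec for spec in catalog if (spec.method, spec.path) not in guide]


def test_catalog_matches_guide():
    # Fails as soon as an endpoint is renamed or removed on the API side.
    assert missing_endpoints(CATALOG, GUIDE) == []
```

Because the catalog is plain data, a check like this can run in the main CI without the `mcp` SDK installed.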
