Metadata-Version: 2.4
Name: arke-terminal
Version: 0.1.192
Summary: Local-first RAG for legal teams.
Project-URL: Repository, https://github.com/padolgot/arke-terminal
Author-email: Pavel Dolgoter <mikepromogratus@proton.me>
License-Expression: MIT
License-File: LICENSE
Keywords: embeddings,legal,local,rag,search
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.12
Requires-Dist: click>=8.1
Requires-Dist: extract-msg>=0.48
Requires-Dist: httpx>=0.28.1
Requires-Dist: llama-cpp-python>=0.3
Requires-Dist: msal>=1.36.0
Requires-Dist: numpy>=2.0
Requires-Dist: pdfplumber>=0.11
Requires-Dist: python-docx>=1.1
Requires-Dist: python-dotenv>=1.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

# Arke Terminal

AI document search for legal teams. Privilege-safe, on-premise.

![Dashboard](/.github/media/dashboard.png)

Cloud AI breaks attorney-client privilege (*United States v. Heppner*, *Hamid v SSHD*). Arke runs on your server. Your documents never leave your network.

Hybrid search (semantic + keyword) over your documents. Ask questions, get answers with source references. Click a source to open it in your default application.

## Quick start (Docker)

```bash
docker compose up --build
```

Opens dashboard at [localhost:8000](http://localhost:8000). Pulls models automatically on first run.

## Quick start (local)

Requires Python 3.12+, Postgres with `pgvector`, and [Ollama](https://ollama.com).

```bash
pip install arke-terminal
cp .env.example .env   # fill in DATABASE_URL and DATA_PATH
arke ingest ./your-documents
arke serve
```

## CLI

```bash
arke ingest <path>                       # index documents
arke ask "query"                         # search from terminal
arke serve                               # start dashboard + API
arke sweep <fast|medium|thorough> -l 30  # run eval benchmark
```

## Library

```python
from arke import Arke, Config

cfg = Config(database_url="postgresql://...", data_path="./docs")

async with Arke(cfg) as engine:
    await engine.ingest("./docs")
    result = await engine.ask("What are the termination clauses?")
    print(result.answer)
    for hit in result.hits:
        print(hit.chunk.source, hit.similarity)
```

## Input formats

**Plain text** (.txt) — loaded directly, source = relative path from root.

**JSONL** — one document per line:

```json
{"content": "...", "source": "optional", "created_at": "2026-04-01T12:00:00Z", "metadata": {}}
```

Only `content` is required.
