Metadata-Version: 2.4
Name: sententia
Version: 0.0.0
Summary: FastAPI-based search and RAG engine for local Markdown files powered by FAISS and multilingual E5 embeddings
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: faiss-cpu
Requires-Dist: sentence-transformers
Requires-Dist: httpx
Requires-Dist: fastapi
Requires-Dist: uvicorn
Requires-Dist: pydantic>=2.0
Requires-Dist: numpy
Requires-Dist: mcp>=1.27.0
Provides-Extra: test
Requires-Dist: pytest>=8.0; extra == "test"
Requires-Dist: pytest-cov>=5.0; extra == "test"
Requires-Dist: ruff>=0.15.0; extra == "test"
Provides-Extra: docs
Requires-Dist: mkdocs>=1.6.0; extra == "docs"
Requires-Dist: mkdocs-material>=9.5.0; extra == "docs"
Dynamic: license-file

# Sententia

FastAPI-based search and RAG engine for local Markdown files, powered by FAISS and multilingual E5 embeddings.

## Quick Start

Install:

```bash
pip install sententia
```

Run with an OpenAI-compatible LLM:

```bash
sententia /path/to/markdown/docs \
  --llm-protocol openai \
  --llm-url https://api.openai.com \
  --llm-model gpt-4o \
  --llm-token sk-...
```

Run with Anthropic:

```bash
sententia /path/to/markdown/docs \
  --llm-protocol anthropic \
  --llm-url https://api.anthropic.com \
  --llm-model claude-sonnet-4-6 \
  --llm-token sk-ant-...
```

Run with Ollama (no API key needed):

```bash
sententia /path/to/markdown/docs \
  --llm-protocol ollama \
  --llm-url http://localhost:11434 \
  --llm-model llama3
```

## Features

- **Semantic search** — FAISS index with `intfloat/multilingual-e5-base` embeddings
- **RAG Q&A** — retrieve relevant chunks and generate answers via LLM
- **Multiple LLM providers** — OpenAI, Anthropic, Ollama
- **Dual interface** — REST API server or MCP server (Model Context Protocol)
- **File access** — read original Markdown files by relative path

## Usage

```
sententia DATA_DIR [OPTIONS]
```

| Argument/option  | Description                                           | Default      |
|------------------|-------------------------------------------------------|--------------|
| `DATA_DIR`       | Path to Markdown files directory                      | *(required)* |
| `--index-path`   | Path to FAISS index file (persisted)                  | in-memory    |
| `--llm-protocol` | LLM provider: `openai`, `anthropic`, `ollama`         | *(required)* |
| `--llm-url`      | LLM API base URL (without `/v1` version path)         | *(required)* |
| `--llm-model`    | LLM model identifier                                  | *(required)* |
| `--llm-token`    | API key (falls back to `SENTENTIA_LLM_TOKEN` env var) | —            |
| `--host`         | Server bind address                                   | `0.0.0.0`    |
| `--port`         | Server port                                           | `8000`       |
| `--mcp`          | Run as MCP server instead of REST API                 | `false`      |
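
The `--llm-token` fallback in the table above can be pictured with the following sketch. This is not Sententia's actual code: `resolve_llm_token` is a hypothetical helper, and the precedence (explicit flag first, then `SENTENTIA_LLM_TOKEN`) is an assumption inferred from the "falls back" wording.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "SENTENTIA_LLM_TOKEN"  # name taken from the options table


def resolve_llm_token(
    cli_token: Optional[str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: prefer the CLI value, else the environment variable."""
    if env is None:
        env = os.environ
    if cli_token:
        return cli_token
    # A missing or empty variable means "no token" (assumed fine for Ollama).
    return env.get(ENV_VAR) or None
```

Exporting the variable keeps the key out of shell history and process listings, which is a common reason to prefer it over passing `--llm-token` directly.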

## Reference

### REST API

**Search documents**

```
POST /search
```

```json
{ "query": "how to configure logging", "top": 10 }
```

Response:

```json
{
  "results": [
    { "text": "...", "source": "docs/guide.md", "score": 0.87 }
  ]
}
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | `string` | `""` | Search query |
| `top` | `integer` | `10` | Number of results (1–100) |
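
A minimal client sketch for `POST /search`, using only the Python standard library. The base URL assumes a server on the default port from the Usage table; `build_search_request` and `search` are illustrative names, not part of Sententia.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: local server on the default port


def build_search_request(
    query: str, top: int = 10, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST /search request."""
    # Mirror the documented 1-100 range on the client side.
    if not 1 <= top <= 100:
        raise ValueError("top must be between 1 and 100")
    body = json.dumps({"query": query, "top": top}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query: str, top: int = 10, base_url: str = BASE_URL) -> list[dict]:
    """Send the request and return the "results" list from the response."""
    request = build_search_request(query, top, base_url)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]
```

With a running server, `search("how to configure logging", top=3)` should return a list of dictionaries with the `text`, `source`, and `score` keys shown in the response example.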

**Ask a question (RAG)**

```
POST /ask
```

```json
{ "query": "how to configure logging" }
```

Response:

```json
{
  "answer": "To configure logging...",
  "sources": ["docs/guide.md", "docs/reference.md"]
}
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `query` | `string` | `""` | Question to answer |

**Read file**

```
GET /files/{path}
```

Response:

```json
{ "text": "file contents...", "source": "docs/guide.md" }
```
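
When building the URL for `GET /files/{path}`, the relative path may need percent-encoding. The sketch below assumes that the server accepts nested paths with `/` separators (as the `docs/guide.md` example suggests) and that the server runs on the default port; `file_url` is a hypothetical helper.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # assumption: local server on the default port


def file_url(relative_path: str, base_url: str = BASE_URL) -> str:
    """Build the GET /files/{path} URL for a relative Markdown path."""
    # Reject absolute paths and parent-directory segments before sending.
    if relative_path.startswith("/") or ".." in relative_path.split("/"):
        raise ValueError("path must be relative and must not contain '..'")
    # quote() keeps "/" by default and escapes spaces and other unsafe characters.
    return f"{base_url}/files/{quote(relative_path)}"
```

The client-side check is only a convenience; it does not replace whatever path validation the server itself performs.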

### MCP Tools

Launch with the `--mcp` flag to expose the following tools via the Model Context Protocol instead of the REST API.

**`search`** — Search for relevant documents in the knowledge base.

| Parameter | Type      | Default | Description       |
|-----------|-----------|---------|-------------------|
| `query`   | `string`  | —       | Search query      |
| `top`     | `integer` | `10`    | Number of results |

Returns a list of `{ text, source, score }`.

**`ask`** — Ask a question and get an answer based on indexed documents.

| Parameter | Type     | Default | Description        |
|-----------|----------|---------|--------------------|
| `query`   | `string` | —       | Question to answer |

Returns `{ answer, sources }`.

**`files`** — Read file content by path.

| Parameter | Type     | Default | Description        |
|-----------|----------|---------|--------------------|
| `path`    | `string` | —       | Relative file path |

Returns `{ text, source }`.

## Development

Clone and install in editable mode with the `test` extra (pytest, pytest-cov, and ruff):

```bash
git clone https://github.com/qarium/sententia.git
cd sententia
pip install -e ".[test]"
```

Run tests:

```bash
pytest
```

Lint:

```bash
ruff check sententia tests
```
