# Gnosis MCP — Full Reference

> Zero-config MCP server that makes your markdown docs searchable by AI agents.
> SQLite default, PostgreSQL optional. Works with Claude Code, Cursor, Windsurf, Cline.
> PyPI: gnosis-mcp | CLI: gnosis-mcp | Import: gnosis_mcp

## Install

```bash
pip install gnosis-mcp               # SQLite (default, zero config)
pip install "gnosis-mcp[postgres]"   # + PostgreSQL support (quoted so zsh does not glob the brackets)
```

## Quick Setup (SQLite)

```bash
gnosis-mcp ingest ./docs/   # Auto-creates DB + loads markdown
gnosis-mcp search "query"   # Test it works
gnosis-mcp serve             # Start MCP server
```

## Quick Setup (PostgreSQL)

```bash
export GNOSIS_MCP_DATABASE_URL="postgresql://user:pass@localhost:5432/mydb"
gnosis-mcp init-db          # Create tables (idempotent)
gnosis-mcp ingest ./docs/   # Load markdown files
gnosis-mcp check            # Verify connection + schema
gnosis-mcp serve
```

## Editor Config

The same server entry works in every editor; only the config file location differs (and, for VS Code, the top-level key). Add it to the appropriate config file:

| Editor | Config File |
|--------|------------|
| Claude Code | `.claude/mcp.json` |
| Cursor | `.cursor/mcp.json` |
| VS Code (Copilot) | `.vscode/mcp.json` (note: uses `"servers"` not `"mcpServers"`) |
| Windsurf | `~/.codeium/windsurf/mcp_config.json` |
| JetBrains | Settings > Tools > AI Assistant > MCP Servers |
| Cline | Cline MCP settings panel |

SQLite (no env needed):

```json
{
  "mcpServers": {
    "docs": {
      "command": "gnosis-mcp",
      "args": ["serve"]
    }
  }
}
```

PostgreSQL:

```json
{
  "mcpServers": {
    "docs": {
      "command": "gnosis-mcp",
      "args": ["serve"],
      "env": {
        "GNOSIS_MCP_DATABASE_URL": "postgresql://user:pass@localhost:5432/mydb"
      }
    }
  }
}
```
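
Since VS Code expects `"servers"` as the top-level key while the other editors use `"mcpServers"`, it can help to generate the config instead of hand-editing it. The following is a minimal sketch; `editor_config` and the `"vscode"` label are hypothetical names used for illustration and are not part of gnosis-mcp.

```python
import json
from typing import Optional


def editor_config(editor: str, database_url: Optional[str] = None) -> dict:
    """Build the MCP config dict for one editor (hypothetical helper)."""
    entry = {"command": "gnosis-mcp", "args": ["serve"]}
    if database_url:
        # Only the PostgreSQL setup needs an env block.
        entry["env"] = {"GNOSIS_MCP_DATABASE_URL": database_url}
    # VS Code uses "servers"; the other editors listed above use "mcpServers".
    top_key = "servers" if editor == "vscode" else "mcpServers"
    return {top_key: {"docs": entry}}


print(json.dumps(editor_config("vscode"), indent=2))
```

Write the printed JSON to the config file listed in the table for your editor.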

## Backends

| | SQLite (default) | PostgreSQL |
|---|---|---|
| Install | `pip install gnosis-mcp` | `pip install gnosis-mcp[postgres]` |
| Config | Nothing | Set `GNOSIS_MCP_DATABASE_URL` |
| Search | FTS5 keyword (BM25) | tsvector + pgvector hybrid |
| Embeddings | Binary blobs | Native vector type + HNSW index |
| Multi-table | No | Yes (UNION ALL) |

Auto-detection: if `GNOSIS_MCP_DATABASE_URL` is set to a `postgresql://...` URL, the PostgreSQL backend is used; if it is not set, SQLite is used. Override with `GNOSIS_MCP_BACKEND=sqlite` or `GNOSIS_MCP_BACKEND=postgres`.
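
The detection rule can be sketched roughly as follows. This is an illustration of the rule as described here, not the actual code in `config.py`; `detect_backend` is a hypothetical name, and treating an unrecognized `GNOSIS_MCP_BACKEND` value as `auto` is an assumption.

```python
import os
from typing import Mapping, Optional


def detect_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "sqlite" or "postgres" following the auto-detection rule (sketch)."""
    env = os.environ if env is None else env
    forced = env.get("GNOSIS_MCP_BACKEND", "auto").strip().lower()
    if forced in ("sqlite", "postgres"):
        # An explicit override wins over URL sniffing.
        return forced
    # Assumption: any other value behaves like "auto".
    url = env.get("GNOSIS_MCP_DATABASE_URL", "")
    return "postgres" if url.startswith("postgresql://") else "sqlite"


print(detect_backend({"GNOSIS_MCP_DATABASE_URL": "postgresql://user:pass@localhost:5432/mydb"}))
```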

## Tools (6)

### Read Tools (always available)

1. **search_docs(query, category?, limit?, query_embedding?)** — Search docs using keyword (FTS5/tsvector) or hybrid semantic+keyword search.
   - query: string (required) — search text
   - category: string (optional) — filter by category
   - limit: int (default 5, max configurable) — result count
   - query_embedding: list[float] (optional) — pre-computed embedding for hybrid search (PostgreSQL)

2. **get_doc(path, max_length?)** — Get full document by file path. Reassembles chunks in order.
   - path: string (required) — e.g. "guides/quickstart.md"
   - max_length: int (optional) — truncate at N characters

3. **get_related(path)** — Find related documents via bidirectional link graph.
   - path: string (required)
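
Editors normally issue these calls for you, but when debugging it helps to see what travels over the wire. MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request; the sketch below builds one for `search_docs`. The helper name `tool_call_request` and the query and category values are illustrative only.

```python
import json


def tool_call_request(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
    )


# Example arguments: the category and query values are made up for illustration.
request = tool_call_request(
    1, "search_docs", {"query": "connection pooling", "category": "guides", "limit": 5}
)
print(request)
```

Optional parameters such as `category` and `limit` can simply be left out of `arguments`.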

### Write Tools (require GNOSIS_MCP_WRITABLE=true)

4. **upsert_doc(path, content, title?, category?, audience?, tags?, embeddings?)** — Insert or replace document. Auto-chunks at paragraph boundaries. Optional `embeddings` accepts pre-computed vectors (one per chunk).

5. **delete_doc(path)** — Delete document, its chunks, and links.

6. **update_metadata(path, title?, category?, audience?, tags?)** — Update metadata fields on all chunks.
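
To give a feel for what "auto-chunks at paragraph boundaries" can mean, here is a minimal sketch that greedily packs paragraphs into chunks of at most `max_chars` characters. It is an assumption about the general approach, not the server's actual algorithm; `chunk_paragraphs` is a hypothetical name.

```python
from typing import List


def chunk_paragraphs(content: str, max_chars: int = 4000) -> List[str]:
    """Greedily pack blank-line-separated paragraphs into chunks (sketch)."""
    paragraphs = [p.strip() for p in content.split("\n\n") if p.strip()]
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    # Note: a single paragraph longer than max_chars is kept whole here.
    return chunks


text = "a" * 30 + "\n\n" + "b" * 30 + "\n\n" + "c" * 30
print([len(c) for c in chunk_paragraphs(text, max_chars=70)])
```

If you pass `embeddings` to `upsert_doc`, the list presumably has to line up with the chunks the server produces, so check the resulting chunk count before relying on pre-computed vectors.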

## Resources (3)

- **gnosis://docs** — List all documents with title, category, chunk count
- **gnosis://docs/{path}** — Read document content by path
- **gnosis://categories** — List categories with document counts

## Configuration (Environment Variables)

All settings are read from `GNOSIS_MCP_*` environment variables. None are required for SQLite.

### Core Settings
- GNOSIS_MCP_DATABASE_URL — PostgreSQL URL or SQLite file path (default: SQLite at ~/.local/share/gnosis-mcp/docs.db)
- GNOSIS_MCP_BACKEND — Force backend: auto, sqlite, postgres (default: auto)
- GNOSIS_MCP_SCHEMA — Database schema, PostgreSQL only (default: public)
- GNOSIS_MCP_CHUNKS_TABLE — Chunks table name, comma-separated for multi-table on PG (default: documentation_chunks)
- GNOSIS_MCP_LINKS_TABLE — Links table name (default: documentation_links)
- GNOSIS_MCP_SEARCH_FUNCTION — Custom search function, PostgreSQL only (default: none)
- GNOSIS_MCP_EMBEDDING_DIM — Embedding vector dimension for init-db (default: 1536)
- GNOSIS_MCP_POOL_MIN — Min pool connections, PostgreSQL only (default: 1)
- GNOSIS_MCP_POOL_MAX — Max pool connections, PostgreSQL only (default: 3)
- GNOSIS_MCP_WRITABLE — Enable write tools: true/1/yes (default: false)
- GNOSIS_MCP_WEBHOOK_URL — URL to POST on doc changes (default: none)

### Embedding
- GNOSIS_MCP_EMBED_PROVIDER — Embedding provider: openai, ollama, or custom (default: none)
- GNOSIS_MCP_EMBED_MODEL — Embedding model name (default: text-embedding-3-small)
- GNOSIS_MCP_EMBED_API_KEY — API key for embedding provider (default: none)
- GNOSIS_MCP_EMBED_URL — Custom embedding endpoint URL (default: none)
- GNOSIS_MCP_EMBED_BATCH_SIZE — Chunks per embedding batch, min 1 (default: 50)

### Tuning
- GNOSIS_MCP_CONTENT_PREVIEW_CHARS — Characters in search previews, min 50 (default: 200)
- GNOSIS_MCP_CHUNK_SIZE — Max chars per chunk, min 500 (default: 4000)
- GNOSIS_MCP_SEARCH_LIMIT_MAX — Max search result limit, min 1 (default: 20)
- GNOSIS_MCP_WEBHOOK_TIMEOUT — Webhook timeout seconds, min 1 (default: 5)
- GNOSIS_MCP_TRANSPORT — Server transport: stdio or sse (default: stdio)
- GNOSIS_MCP_LOG_LEVEL — Logging: DEBUG/INFO/WARNING/ERROR/CRITICAL (default: INFO)
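
As an illustration of how defaults and minimums like these are typically applied, the sketch below parses a few of the variables above into a frozen dataclass. `TuningConfig`, `load_tuning`, and `_int_setting` are hypothetical names, not the real `GnosisMcpConfig`, and raising on a below-minimum value (rather than clamping) is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

TRUTHY = {"true", "1", "yes"}


@dataclass(frozen=True)
class TuningConfig:
    writable: bool
    chunk_size: int
    search_limit_max: int
    content_preview_chars: int


def _int_setting(env: Mapping[str, str], name: str, default: int, minimum: int) -> int:
    """Read an integer setting, falling back to its default and enforcing its minimum."""
    raw = env.get(name)
    value = default if raw is None else int(raw)
    if value < minimum:
        # Assumption: reject invalid values instead of silently clamping them.
        raise ValueError(f"{name} must be >= {minimum}, got {value}")
    return value


def load_tuning(env: Optional[Mapping[str, str]] = None) -> TuningConfig:
    env = os.environ if env is None else env
    return TuningConfig(
        writable=env.get("GNOSIS_MCP_WRITABLE", "false").strip().lower() in TRUTHY,
        chunk_size=_int_setting(env, "GNOSIS_MCP_CHUNK_SIZE", 4000, 500),
        search_limit_max=_int_setting(env, "GNOSIS_MCP_SEARCH_LIMIT_MAX", 20, 1),
        content_preview_chars=_int_setting(env, "GNOSIS_MCP_CONTENT_PREVIEW_CHARS", 200, 50),
    )


print(load_tuning({"GNOSIS_MCP_WRITABLE": "yes", "GNOSIS_MCP_CHUNK_SIZE": "1000"}))
```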

### Column Overrides (for existing tables with non-standard names)
- GNOSIS_MCP_COL_FILE_PATH (default: file_path)
- GNOSIS_MCP_COL_TITLE (default: title)
- GNOSIS_MCP_COL_CONTENT (default: content)
- GNOSIS_MCP_COL_CHUNK_INDEX (default: chunk_index)
- GNOSIS_MCP_COL_CATEGORY (default: category)
- GNOSIS_MCP_COL_AUDIENCE (default: audience)
- GNOSIS_MCP_COL_TAGS (default: tags)
- GNOSIS_MCP_COL_EMBEDDING (default: embedding)
- GNOSIS_MCP_COL_TSV (default: tsv)
- GNOSIS_MCP_COL_SOURCE_PATH (default: source_path)
- GNOSIS_MCP_COL_TARGET_PATH (default: target_path)
- GNOSIS_MCP_COL_RELATION_TYPE (default: relation_type)

## Custom Search Function (PostgreSQL)

Your function must accept:
```sql
(p_query_text text, p_categories text[], p_limit integer)
```

And return columns: file_path, title, content, category, combined_score.

Optionally, your function can also accept `p_embedding vector(N)` for hybrid search. Gnosis will try passing it automatically when `query_embedding` is provided.
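
To make the contract concrete, here is a sketch of how a caller could build the query for such a function using `$n` placeholders (the style asyncpg uses). Whether Gnosis calls the function positionally or by name is not stated here; named notation is used only to make the parameter mapping visible. `build_search_call` and `my_search` are hypothetical names, and the optional `p_embedding` argument is left out.

```python
import re
from typing import List, Optional, Tuple

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_search_call(
    function_name: str,
    query: str,
    categories: Optional[List[str]],
    limit: int,
    schema: str = "public",
) -> Tuple[str, tuple]:
    """Return (sql, args) for calling a custom search function (sketch)."""
    for ident in (schema, function_name):
        # Identifiers cannot be bound as parameters, so validate them strictly.
        if not _IDENT.match(ident):
            raise ValueError(f"unsafe SQL identifier: {ident!r}")
    sql = (
        "SELECT file_path, title, content, category, combined_score "
        f"FROM {schema}.{function_name}("
        "p_query_text => $1, p_categories => $2, p_limit => $3)"
    )
    return sql, (query, categories, limit)


sql, args = build_search_call("my_search", "connection pooling", ["guides"], 5)
print(sql)
```

The selected columns mirror the ones your function is required to return, so a mismatch in your function's signature or result set should surface as a SQL error instead of silently wrong results.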

## CLI

```
gnosis-mcp ingest <path> [--dry-run]                       # Load markdown files
gnosis-mcp serve [--transport stdio|sse] [--ingest PATH]   # Start MCP server (optionally ingest first)
gnosis-mcp search <query> [-n LIMIT] [-c CAT] [--embed]    # Search (--embed for hybrid, PG only)
gnosis-mcp stats                                           # Show document/chunk/category counts
gnosis-mcp check                                           # Verify connection + schema
gnosis-mcp embed [--provider P] [--model M] [--dry-run]    # Backfill NULL embeddings via API
gnosis-mcp init-db [--dry-run]                             # Create tables (or preview SQL)
gnosis-mcp export [-f json|markdown] [-c CAT]              # Export documents as JSON or markdown
gnosis-mcp --version                                       # Show version
```

## Ingest

`gnosis-mcp ingest <path>` scans a file or directory for markdown files and loads them into the database.

- Chunks by H2 headers, which keeps each section together instead of splitting at arbitrary character limits
- Parses YAML-like frontmatter for title, category, audience, tags
- Content hashing: skips unchanged files on re-run
- Category inferred from parent directory name
- Title extracted from first H1 heading
- Skips tiny files (<50 chars)
- Use `--dry-run` to preview without writing
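
The steps above can be sketched in a few lines of Python. This is a simplified illustration, not the code in `ingest.py`: the function names are hypothetical, SHA-256 is an assumed hash algorithm, and frontmatter parsing is omitted.

```python
import hashlib
import re
from typing import List, Optional


def split_by_h2(markdown: str) -> List[str]:
    """Split a document into chunks that each start at an H2 heading."""
    chunks: List[str] = []
    current: List[str] = []
    in_fence = False
    for line in markdown.splitlines():
        if line.startswith("```"):
            in_fence = not in_fence  # ignore "## " lines inside code fences
        if line.startswith("## ") and not in_fence and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def extract_title(markdown: str) -> Optional[str]:
    """Return the text of the first H1 heading, if any."""
    match = re.search(r"^# (.+)$", markdown, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


def content_hash(markdown: str) -> str:
    # Assumption: SHA-256; the actual hash function is not specified here.
    return hashlib.sha256(markdown.encode("utf-8")).hexdigest()


def needs_ingest(markdown: str, previous_hash: Optional[str]) -> bool:
    """Skip tiny files and files whose content hash has not changed."""
    if len(markdown) < 50:
        return False
    return content_hash(markdown) != previous_hash


doc = "# Guide\n\nIntro text.\n\n## Install\n\npip install it.\n\n## Usage\n\nRun it.\n"
print(extract_title(doc), len(split_by_h2(doc)))
```

Text before the first H2 (here the H1 and the intro) ends up in its own leading chunk.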

## Architecture

```
src/gnosis_mcp/
├── backend.py         # DocBackend Protocol + create_backend() factory
├── pg_backend.py      # PostgreSQL backend — asyncpg, tsvector, pgvector, UNION ALL
├── sqlite_backend.py  # SQLite backend — aiosqlite, FTS5 MATCH + bm25()
├── sqlite_schema.py   # SQLite DDL — tables, FTS5, triggers, indexes
├── config.py          # GnosisMcpConfig frozen dataclass, backend auto-detection
├── db.py              # Backend lifecycle + FastMCP lifespan
├── server.py          # FastMCP server: 6 tools + 3 resources + webhook helper
├── ingest.py          # File scanner: markdown chunking, frontmatter, content hashing
├── schema.py          # PostgreSQL DDL — tables, indexes, HNSW, hybrid search functions
├── embed.py           # Embedding sidecar: provider abstraction (openai/ollama/custom)
└── cli.py             # argparse CLI: serve, init-db, ingest, search, embed, stats, export, check
```

Default install dependencies: `mcp` and `aiosqlite`. Optional: `asyncpg` (via the `[postgres]` extra).
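
For readers unfamiliar with the `FTS5 MATCH + bm25()` combination named for the SQLite backend, here is a self-contained sketch using the standard-library `sqlite3` module. The table name `chunks_fts` and its columns are illustrative and do not reflect the schema in `sqlite_schema.py`; it also assumes your Python's SQLite build includes FTS5.

```python
import sqlite3
from typing import Iterable, List, Tuple


def build_index(docs: Iterable[Tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS5 index over (file_path, content) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE chunks_fts USING fts5(file_path UNINDEXED, content)"
    )
    conn.executemany(
        "INSERT INTO chunks_fts (file_path, content) VALUES (?, ?)", list(docs)
    )
    return conn


def search(conn: sqlite3.Connection, query: str, limit: int = 5) -> List[Tuple[str, float]]:
    """Keyword search ranked by BM25; lower bm25() values rank first."""
    return conn.execute(
        "SELECT file_path, bm25(chunks_fts) FROM chunks_fts "
        "WHERE chunks_fts MATCH ? ORDER BY bm25(chunks_fts) LIMIT ?",
        (query, limit),
    ).fetchall()


conn = build_index(
    [
        ("guides/install.md", "Install the package with pip."),
        ("guides/config.md", "Set environment variables to configure the server."),
        ("faq.md", "How do I install extras?"),
    ]
)
for path, score in search(conn, "install"):
    print(path, score)
```

Raw user input can contain FTS5 query syntax (quotes, `AND`, `NEAR`), so a real backend would need to escape or sanitize the query before passing it to `MATCH`.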

## License

MIT — https://github.com/nicholasglazer/gnosis-mcp
