Metadata-Version: 2.4
Name: lorewiki
Version: 0.2.8
Summary: Local-first knowledge base for LLM-assisted coding, with hybrid retrieval (BM25 + hierarchy + optional vector) over SQLite FTS5.
Project-URL: Documentation, https://github.com/JochenYang/Lore-wiki
Project-URL: Source, https://github.com/JochenYang/Lore-wiki
Author: LoreWiki Team
License: MIT
License-File: LICENSE
Keywords: cli,fts5,knowledge-base,llm,rag,sqlite
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Documentation
Classifier: Topic :: Text Processing :: Indexing
Requires-Python: >=3.10
Requires-Dist: click>=8.0
Requires-Dist: httpx>=0.27
Requires-Dist: loguru>=0.7
Requires-Dist: pydantic-settings>=2.2
Requires-Dist: pydantic>=2.6
Requires-Dist: python-frontmatter>=1.1
Requires-Dist: rich>=13.7
Requires-Dist: tomli-w>=1.0
Requires-Dist: tomli>=2.0; python_version < '3.11'
Requires-Dist: typer>=0.12
Provides-Extra: all
Requires-Dist: sentence-transformers>=2.7; extra == 'all'
Requires-Dist: sqlite-vec>=0.1.6; extra == 'all'
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Provides-Extra: vector
Requires-Dist: sentence-transformers>=2.7; extra == 'vector'
Requires-Dist: sqlite-vec>=0.1.6; extra == 'vector'
Description-Content-Type: text/markdown

<p align="center">
  <img src="https://raw.githubusercontent.com/JochenYang/Lore-wiki/main/assets/logo.png" alt="LoreWiki" width="320" />
</p>

<p align="center">
  <b><a href="README.md">English</a></b> · <a href="docs/README_zh-CN.md">中文</a>
</p>

> Local-first knowledge base for LLM-assisted coding, with hybrid retrieval
> over SQLite FTS5.

### Built with

[![Python](https://img.shields.io/badge/python-3.10%20%7C%203.11%20%7C%203.12-3776AB?logo=python&logoColor=white&style=for-the-badge)](https://www.python.org/)
[![SQLite](https://img.shields.io/badge/SQLite-+FTS5-003B57?logo=sqlite&logoColor=white&style=for-the-badge)](https://www.sqlite.org/)

### Tools

[![uv](https://img.shields.io/badge/uv-pkg%20%2B%20tool-5C2D91?logo=astral&logoColor=white&style=for-the-badge)](https://docs.astral.sh/uv/)
[![ruff](https://img.shields.io/badge/ruff-0%20errors-D7FF64?logo=ruff&logoColor=black&style=for-the-badge)](https://docs.astral.sh/ruff/)
[![pytest](https://img.shields.io/badge/pytest-240%20passed-0A9EDC?logo=pytest&logoColor=white&style=for-the-badge)](tests/)
[![License](https://img.shields.io/badge/License-MIT-22B14C?style=for-the-badge)](LICENSE)

---

LoreWiki indexes your team's Markdown wiki and exposes it through a single
CLI plus an [opencode](https://opencode.ai) skill consumable by Codex /
Aider / Claude Code / any shell-using LLM agent. The vault is also a
plain folder of `.md` files, so Obsidian / Logseq / VS Code can open
it directly.

**Key numbers from the example_wiki benchmark** (10 hand-authored queries):

| Mode          | Recall@5 | Avg latency |
|---------------|----------|-------------|
| BM25          | 80%      | 1.7 ms      |
| Hierarchy     | 90%      | 0.8 ms      |
| **Mix (RRF)** | **100%** | 3.0 ms      |

## Features

- **Hybrid retrieval**: FTS5 BM25 + hierarchy tree navigation, fused via
  Reciprocal Rank Fusion (no score normalisation needed).
- **Chinese + English friendly**: trigram tokenizer + bigram/LIKE fallback for
  short CJK queries (e.g. `"幂等"` (idempotent), `"认证"` (auth)).
- **Optional LLM integration** (Ollama or OpenAI-compatible). Gracefully
  degrades to "return the top-k chunks" when the LLM is offline.
- **Single-binary CLI + opencode skill**: one command surface, one
  opencode skill (or any shell-using agent) for AI consumption, and
  the on-disk vault as the "UI". No server processes, no extra
  dependencies.
- **One `lorewiki add`** to author a note end-to-end (body via
  `--body` / `--file` / stdin) with auto-reindex so the new doc is
  immediately retrievable.
- **Second-brain / topics**: one isolated vault per knowledge domain
  under `~/lorewiki/topics/`, shared across every project.
- **Zero external services**: SQLite is the only dependency for retrieval.
  LLM is opt-in.
- **Single-package install**: `pip install lorewiki` and you have
  everything; the data lives in your home and is fully owned.
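
The CJK bullet above can be illustrated with a short Python sketch. FTS5's `trigram` tokenizer indexes three-character windows, so a two-character query such as `幂等` yields no trigram and needs a substring fallback. This is an assumption about the general technique, not LoreWiki's actual implementation: the table `notes_fts`, its columns, and the function `search_notes` are hypothetical, and the snippet needs an SQLite build with FTS5 and the `trigram` tokenizer (3.34 or newer).

```python
import sqlite3


def search_notes(conn: sqlite3.Connection, query: str, limit: int = 5) -> list[str]:
    """Return matching paths: FTS5 MATCH for 3+ characters, LIKE below that."""
    if len(query) >= 3:
        # Quote the query so FTS5 reads it as a single phrase; under the
        # trigram tokenizer that behaves like a substring match.
        phrase = '"' + query.replace('"', '""') + '"'
        rows = conn.execute(
            "SELECT path FROM notes_fts WHERE notes_fts MATCH ? "
            "ORDER BY bm25(notes_fts) LIMIT ?",
            (phrase, limit),
        )
    else:
        # Too short to form a trigram: fall back to a plain substring scan,
        # escaping LIKE wildcards so they are matched literally.
        escaped = query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
        rows = conn.execute(
            "SELECT path FROM notes_fts WHERE body LIKE ? ESCAPE '\\' LIMIT ?",
            (f"%{escaped}%", limit),
        )
    return [path for (path,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE notes_fts USING fts5(path, body, tokenize='trigram')")
    conn.execute("INSERT INTO notes_fts VALUES ('api/retry.md', '如何实现幂等重试')")
    print(search_notes(conn, "幂等"))  # two characters, so the LIKE branch runs
```

The length check counts Unicode code points, which matches how the trigram tokenizer splits text.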

## Installation

LoreWiki ships as a single Python wheel on **PyPI** (the only
distribution channel). Pick your preferred installer:

### uv (recommended, full feature set)

```bash
# Install — isolated per-tool venv, the lorewiki.exe (Windows)
# or lorewiki binary (macOS/Linux) is added to your PATH.
uv tool install lorewiki

# With the optional vector-retrieval extra (sqlite-vec + sentence-transformers):
uv tool install 'lorewiki[vector]'

# Upgrade:
uv tool upgrade lorewiki

# Uninstall (does NOT touch ~/.lorewiki/ — your data is yours):
uv tool uninstall lorewiki
```

If you don't have `uv` yet:

```bash
# macOS / Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell):
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

Plain `pip` works too (it installs the same `lorewiki` entry point):

```bash
pip install lorewiki              # core CLI
pip install 'lorewiki[vector]'    # opt-in: vector retrieval
```

> The `[rest]` and `[mcp]` extras from 0.1.x are gone as of 0.2.0.
> The CLI + opencode skill replaced the FastAPI / MCP server surface.
> The `[all]` extra is now an alias for `[vector]`.

### From source (for contributors)

```bash
git clone https://github.com/JochenYang/Lore-wiki
cd Lore-wiki
uv tool install --editable .              # dev install
uv tool install --editable '.[dev]'       # + pytest / ruff / coverage
```

Python **3.10+** is required. After install, `lorewiki --version`
should print a banner ending with `v0.2.x`.

> **Windows PowerShell + CJK note**: starting with 0.2.0, LoreWiki
> forces UTF-8 on stdout/stderr unconditionally — CJK characters
> round-trip cleanly through the shell without `chcp 65001`. If you
> hit garbled output on an older release, upgrade with
> `uv tool upgrade lorewiki` or prefix the command with
> `chcp 65001 |`.

For deeper install info (PATH troubleshooting, where data lives,
backups, common errors, how publishing works), see
[`docs/install.md`](docs/install.md).

## Quickstart

```bash
# 1. Create a wiki + sample Markdown
lorewiki init --path ./my-wiki

# 2. Index the Markdown into SQLite + FTS5 (one-time, then incremental)
lorewiki index --path ./my-wiki

# 3. Search (default output is structured JSON for agents; --human for Rich Table)
lorewiki search "用户登录接口" --path ./my-wiki --mode mix --top-k 5
lorewiki search "用户登录接口" --path ./my-wiki --mode mix --top-k 5 --human

# 4. Ask (LLM-assisted answer, gracefully falls back to top chunks)
lorewiki ask "如何实现幂等重试" --path ./my-wiki

# 5. Author a note from the CLI (writes + re-indexes in one go)
#    Three equivalent ways to provide the body:
lorewiki add --title "Python Design" --module "patterns" --tag python,design \
    --body "Some deep details about Python design patterns." \
    --path ./my-wiki

#    --file: read the body from a file
lorewiki add --title "From File" --module "patterns" \
    --file ./drafts/python-design.md --path ./my-wiki

#    stdin pipe (any of these is fine on Windows + PowerShell, even
#    with CJK content; 0.2.2+ scrubs UTF-16 surrogates automatically)
echo "Some deep details about Python design patterns." \
  | lorewiki add --title "From Pipe" --module "patterns" --path ./my-wiki

# 6. Browse the index / hierarchy / status
lorewiki status --path ./my-wiki
lorewiki tree   --path ./my-wiki      # Rich-Tree view of the hierarchy
lorewiki show   index.md --path ./my-wiki   # print a doc body (cleaned)
```

**Config resolution** (later wins):

1. `<wiki>/.lorewiki/config.toml` — per-wiki defaults
2. `~/.lorewiki/config.toml` — user-wide overrides
3. `LOREWIKI_*` env vars — shell-level overrides

Edit any of these with `lorewiki config list / get / set` (TOML-aware,
no hand-editing required).
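
The "later wins" rule amounts to a layered merge. The Python sketch below shows one way such a merge could work; it is not LoreWiki's own loader. The helper names `merge_layers` and `env_layer` are hypothetical, and the mapping from `LOREWIKI_*` variables to config keys (strip the prefix, lowercase, keep the value as a string) is an assumption made for illustration.

```python
from collections.abc import Mapping
from typing import Any


def merge_layers(*layers: Mapping[str, Any]) -> dict[str, Any]:
    """Deep-merge mappings left to right; later layers override earlier ones."""
    merged: dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, Mapping):
                # Merge nested tables (e.g. [llm]) key by key instead of
                # replacing the whole table.
                base = merged.get(key)
                merged[key] = merge_layers(base if isinstance(base, Mapping) else {}, value)
            else:
                merged[key] = value
    return merged


def env_layer(environ: Mapping[str, str], prefix: str = "LOREWIKI_") -> dict[str, str]:
    """Hypothetical mapping: LOREWIKI_RRF_K=60 becomes {"rrf_k": "60"}."""
    return {
        name[len(prefix):].lower(): value
        for name, value in environ.items()
        if name.startswith(prefix)
    }


if __name__ == "__main__":
    per_wiki = {"retrieval_mode": "mix", "llm": {"enabled": False, "backend": "ollama"}}
    user_wide = {"llm": {"enabled": True}}
    shell = {"LOREWIKI_RETRIEVAL_MODE": "bm25"}
    # Order of arguments mirrors the numbered list: per-wiki, user-wide, env.
    print(merge_layers(per_wiki, user_wide, env_layer(shell)))
```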

## Topics — your second brain

The per-wiki mode above is fine for a single project. The
**shared-brain** workflow is **topics** — isolated vaults under
`~/lorewiki/topics/`, queryable from any project:

```bash
lorewiki topic create react                              # empty vault
lorewiki topic create react --source ~/notes/react       # copy mode (default)
lorewiki topic create react --source ~/notes/react --link  # symlink mode
lorewiki topic use react                                 # activate
lorewiki index                                           # index the active topic
lorewiki search "useState closure"                       # query the active topic
lorewiki ask "props drilling 对比"                       # LLM answer from active topic
```

Layout produced:

```
~/lorewiki/                          # central root
├── config.toml                      # global: LLM key, retrieval mode
├── current                          # text: name of active topic
└── topics/
    └── react/                       # one topic = one vault
        ├── .lorewiki/index.db       # hidden lorewiki metadata
        ├── api/auth.md
        └── architecture.md
```

**Topic resolution priority** (first match wins): `--topic` flag →
`LOREWIKI_TOPIC` env → `~/lorewiki/current` file → `--path` (legacy
per-wiki mode) → cwd `.lorewiki/config.toml` (legacy per-project mode).

The legacy per-project mode is **permanently supported** — no
migration required. Topics are a convenience, not a replacement.

The vault root is plain Markdown with a hidden `.lorewiki/`
directory, so **Obsidian / Logseq / VS Code can open it directly**
without lorewiki installed. That cross-tool friendliness is the
whole point of the "second brain" framing.

Topic names: lowercase ASCII, digits, hyphens, 1-64 chars, no
leading/trailing hyphens. Reserved names (`init`, `index`,
`current`, Windows device names) are rejected.
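
The naming rule can be expressed as a small validator. This Python sketch is a reading of the rule as stated above, not the project's own code: the function name `is_valid_topic_name` is hypothetical, and the exact set of Windows device names (`con`, `prn`, `aux`, `nul`, `com1`-`com9`, `lpt1`-`lpt9`) is an assumption based on the usual reserved list.

```python
import re

# Lowercase ASCII letters, digits and hyphens; 1-64 characters; the first
# and last character must be a letter or digit (no leading/trailing hyphen).
_TOPIC_NAME = re.compile(r"[a-z0-9](?:[a-z0-9-]{0,62}[a-z0-9])?")

# Assumed list of Windows device names; the README does not enumerate them.
_WINDOWS_DEVICES = {
    "con", "prn", "aux", "nul",
    *(f"com{i}" for i in range(1, 10)),
    *(f"lpt{i}" for i in range(1, 10)),
}
_RESERVED = {"init", "index", "current"} | _WINDOWS_DEVICES


def is_valid_topic_name(name: str) -> bool:
    """Check a candidate topic name against the rule described above."""
    return _TOPIC_NAME.fullmatch(name) is not None and name not in _RESERVED


if __name__ == "__main__":
    for candidate in ("react", "react-native", "-react", "index", "React"):
        print(candidate, is_valid_topic_name(candidate))
```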

## How it works

For a one-query end-to-end walkthrough (CLI dispatch → config
resolution → retriever selection → RRF fusion → output) plus a
deep dive on **how the LLM config actually takes effect**
(three configuration paths, `build_client` dispatch, why pure
`httpx` instead of SDKs), see [`docs/how-it-works.md`](docs/how-it-works.md).

A higher-level architecture overview lives in
[`docs/architecture.md`](docs/architecture.md). Per-phase
self-critique notes are in `docs/critique/phase-{0..6}.md`.

## Configuration

```toml
# ./my-wiki/.lorewiki/config.toml

retrieval_mode = "mix"            # mix | bm25 | hierarchy | vector
rrf_k = 60
chunk_max_tokens = 800
chunk_overlap_tokens = 100
chunk_min_chars = 40
snippet_chars = 240

[mix_weights]
bm25 = 1.0
hierarchy = 0.8
vector = 0.5

[llm]
enabled = false                   # set true to enable `ask`'s LLM path
backend = "ollama"                # ollama | openai
ollama_url = "http://localhost:11434"
ollama_model = "llama3.2"
openai_api_key = ""
openai_base_url = ""              # leave blank for api.openai.com
openai_model = "gpt-4o-mini"
timeout_seconds = 30.0
```

Programmatic access:

```bash
lorewiki config list --path ./my-wiki
lorewiki config get llm.backend --path ./my-wiki
lorewiki config set retrieval_mode '"bm25"' --path ./my-wiki
```

## LLM setup

### Ollama (local, recommended)

```bash
ollama pull llama3.2
lorewiki config set llm.enabled true     --path ./my-wiki
lorewiki config set llm.backend '"ollama"' --path ./my-wiki
lorewiki ask "what's our retry policy?" --path ./my-wiki
```

### OpenAI-compatible (any provider that speaks the `/v1/chat/completions` schema)

> **Note on Azure OpenAI**: Azure's path is different
> (`/openai/deployments/<deployment>/chat/completions?api-version=...`)
> and is **not** currently supported. Use OpenRouter or a self-hosted
> vLLM-compatible endpoint, or wait for the phase-7 Azure support
> (or open an issue if you need it sooner).

```bash
lorewiki config set llm.enabled true     --path ./my-wiki
lorewiki config set llm.backend '"openai"' --path ./my-wiki
lorewiki config set llm.openai_api_key '"sk-..."' --path ./my-wiki
# Optional: point at a compatible proxy (OpenRouter, vLLM, ...).
lorewiki config set llm.openai_base_url '"https://openrouter.ai/api/v1"' --path ./my-wiki
```

If the LLM is unreachable, `ask` returns the top-K chunks with a clear
"degraded" notice — your workflow never breaks because the model is down.

## REST API

The FastAPI / REST surface was removed in 0.2.0. The CLI is the only
programmatic surface; agents consume it through the opencode skill
(see below) or by shelling out.
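
Shelling out from a Python-based agent might look like the sketch below. The subcommand and flags (`search`, `--path`, `--mode`, `--top-k`) are the ones shown in the Quickstart; the helper names are hypothetical, and the code assumes that stdout carries a single JSON document whose shape it does not inspect.

```python
import json
import subprocess
from typing import Any


def build_search_argv(query: str, wiki_path: str, mode: str = "mix", top_k: int = 5) -> list[str]:
    """Assemble the argument vector for `lorewiki search`."""
    return [
        "lorewiki", "search", query,
        "--path", wiki_path,
        "--mode", mode,
        "--top-k", str(top_k),
    ]


def run_search(query: str, wiki_path: str, mode: str = "mix", top_k: int = 5) -> Any:
    """Run the CLI and parse its default JSON output (requires lorewiki on PATH)."""
    proc = subprocess.run(
        build_search_argv(query, wiki_path, mode, top_k),
        capture_output=True,
        text=True,
        encoding="utf-8",
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)


if __name__ == "__main__":
    print(build_search_argv("用户登录接口", "./my-wiki"))
```

Passing the arguments as a list avoids shell quoting issues with CJK or multi-word queries.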

## The Markdown vault as your "UI"

LoreWiki no longer ships a built-in web UI. The recommended
ways to consume the data are:

- **The CLI** (this document) — the single source of truth.
- **The active topic's vault directory** — every topic is a plain
  folder of `.md` files under `~/lorewiki/topics/<name>/` (or
  `<wiki>/.lorewiki/...` in per-wiki mode). Open it in Obsidian,
  VS Code, Cursor, or any Markdown editor for the full rendered
  view, no extra tooling required.
- **The opencode skill** (below) — for AI agents.

## opencode skill (Codex / Aider / any shell-using agent)

For agents that can already run shell commands, the CLI is lighter-weight
than MCP. LoreWiki ships an official [opencode](https://opencode.ai) skill
in [`skills/lorewiki/SKILL.md`](skills/lorewiki/SKILL.md).

One-time install (after `uv tool install --editable .` puts `lorewiki` on your PATH):

```powershell
# Windows
.\skills\install.ps1            # copy mode
.\skills\install.ps1 -Symlink   # symlink mode (lets you edit SKILL.md live)
```

```bash
# macOS / Linux
./skills/install.sh             # copy mode
./skills/install.sh --symlink   # symlink mode
```

Restart opencode and the agent will auto-trigger the skill on cues like
`查 wiki` / `search the wiki` / `lorewiki ...`. See
[`skills/README.md`](skills/README.md) for full details.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│            CLI + opencode skill · vault-as-folder          │
├─────────────────────────────────────────────────────────────┤
│  Indexer  │  Retriever (BM25 + Hierarchy + RRF)  │  LLM    │
├─────────────────────────────────────────────────────────────┤
│        SQLite + FTS5 (documents · docs_fts · hierarchy)     │
└─────────────────────────────────────────────────────────────┘
```

See `docs/lorewiki dev document.md` for the full design plan and
`docs/critique/phase-{0..6}.md` for per-phase self-critique notes.

## Development

```bash
pip install -e ".[dev]"
ruff check lorewiki skills tests  # lint
pytest -q                        # unit + integration tests
pytest --cov=lorewiki            # coverage report
```

The `example_wiki/` directory is a curated 5-file benchmark
fixture — not a starter. See `example_wiki/README.md` for what
it is and how to use it.

## Roadmap

- **Vector retrieval** (sqlite-vec + sentence-transformers) — opt-in,
  via `pip install 'lorewiki[vector]'`.
- **Incremental file-watcher** (`lorewiki update --watch`).
- **PDF / Word ingestion** beyond Markdown.
- **Atomic write of `~/lorewiki/current`** (currently best-effort).

## Contributing

See [`CONTRIBUTING.md`](CONTRIBUTING.md) for the workflow. Bug
reports and feature requests go to the issue tracker; PRs are
welcome — see the testing / linting commands above.

## License

[MIT](LICENSE) · Copyright (c) 2026 LoreWiki contributors.
