Metadata-Version: 2.4
Name: codelumen
Version: 3.1.2
Summary: AST-based semantic code search that knows the neighborhood — every result comes with its call graph. MCP server included.
Author: Ahmed Gamil
License-Expression: MIT
Project-URL: Homepage, https://github.com/ahmeedgamil/codelumen
Project-URL: Documentation, https://github.com/ahmeedgamil/codelumen#readme
Project-URL: Repository, https://github.com/ahmeedgamil/codelumen
Project-URL: Issues, https://github.com/ahmeedgamil/codelumen/issues
Keywords: code-search,semantic-search,rag,ast,tree-sitter,mcp,llm,embeddings
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development
Classifier: Topic :: Software Development :: Code Generators
Classifier: Topic :: Text Processing :: Indexing
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi>=0.110.0
Requires-Dist: uvicorn[standard]>=0.27.0
Requires-Dist: pydantic>=2.5.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: typer>=0.9.0
Requires-Dist: rich>=13.0.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: tree-sitter>=0.22.0
Requires-Dist: tree-sitter-python>=0.23.0
Requires-Dist: tree-sitter-javascript>=0.23.0
Requires-Dist: tree-sitter-typescript>=0.23.0
Requires-Dist: tree-sitter-java>=0.23.0
Requires-Dist: tree-sitter-go>=0.23.0
Requires-Dist: tree-sitter-php>=0.23.0
Requires-Dist: tree-sitter-c-sharp>=0.23.0
Requires-Dist: tree-sitter-ruby>=0.23.0
Requires-Dist: tree-sitter-rust>=0.23.0
Requires-Dist: tree-sitter-cpp>=0.23.0
Requires-Dist: qdrant-client>=1.7.0
Requires-Dist: mcp>=1.0.0
Provides-Extra: voyage
Requires-Dist: voyageai>=0.2.0; extra == "voyage"
Provides-Extra: openai
Requires-Dist: openai>=1.10.0; extra == "openai"
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.18.0; extra == "anthropic"
Provides-Extra: cohere
Requires-Dist: cohere>=4.40; extra == "cohere"
Provides-Extra: local
Requires-Dist: sentence-transformers>=2.3.0; extra == "local"
Requires-Dist: torch>=2.1.0; extra == "local"
Provides-Extra: chroma
Requires-Dist: chromadb>=0.5.0; extra == "chroma"
Provides-Extra: all
Requires-Dist: voyageai>=0.2.0; extra == "all"
Requires-Dist: openai>=1.10.0; extra == "all"
Requires-Dist: anthropic>=0.18.0; extra == "all"
Requires-Dist: cohere>=4.40; extra == "all"
Requires-Dist: sentence-transformers>=2.3.0; extra == "all"
Requires-Dist: torch>=2.1.0; extra == "all"
Requires-Dist: chromadb>=0.5.0; extra == "all"
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23.0; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: ruff>=0.2.0; extra == "dev"
Requires-Dist: build>=1.0.0; extra == "dev"
Requires-Dist: twine>=4.0.0; extra == "dev"
Requires-Dist: openai>=1.10.0; extra == "dev"
Dynamic: license-file

<!-- mcp-name: io.github.ahmeedgamil/codelumen -->

<div align="center">

<picture>
  <source media="(prefers-color-scheme: dark)" srcset="assets/codelumen_dark.png">
  <source media="(prefers-color-scheme: light)" srcset="assets/codelumen_light.png">
  <img alt="codelumen logo" src="assets/codelumen_light.png" width="400">
</picture>

# codelumen

**AST-based semantic code search that knows the neighborhood — every result comes with what it calls and what calls it.**

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
[![Build Status](https://img.shields.io/github/actions/workflow/status/ahmeedgamil/codelumen/ci.yml?branch=main&logo=github)](https://github.com/ahmeedgamil/codelumen/actions)
[![MCP](https://img.shields.io/badge/MCP-compatible-7C3AED.svg)](https://modelcontextprotocol.io/)
[![Tree-sitter](https://img.shields.io/badge/parser-tree--sitter-1F8B4C.svg)](https://tree-sitter.github.io/)

[**Install**](#install) ·
[**Quickstart**](#60-second-quickstart) ·
[**MCP Setup**](#mcp--wire-it-into-your-ai-editor) ·
[**Architecture**](#architecture) ·
[**Config**](#configuration) ·
[**Troubleshooting**](#troubleshooting)


</div>

---

```text
   ┌──────────────┐    parse + embed     ┌──────────────┐    LLM enrich    ┌──────────────┐
   │  source tree │ ───────────────────► │ code  vector │ ──────────────► │  description │
   │              │                      │ + BM25 sparse│                  │  + queries   │
   └──────────────┘                      └──────────────┘                  └──────────────┘

                                Search via CLI, HTTP, or MCP.
```

codelumen indexes your codebase the way a developer thinks about it: every function, method, and class becomes a chunk, with its call graph, docstring, and signature attached. You search by intent (*"how do we retry transient payment failures?"*) and get back the few chunks that actually answer the question.

Indexing runs in two stages: **structural** (always runs, free, takes seconds) and **enrichment** (optional, LLM-paid, drains a `pending` queue). The index is queryable after stage 1; stage 2 only makes natural-language matches sharper.

The package ships an **MCP server** so Claude Desktop, Cursor, Cline, Continue, Kiro, Zed, and any other MCP-aware client can call codelumen as a tool — your AI agent gets `search`, `find_symbol`, and `get_chunk_context` next to its built-in `read_file` and `grep`.

---

## Features

- **AST-aware chunking** via Tree-sitter — every function, method, and class becomes a first-class searchable unit
- **Call graph included** — each chunk knows what it calls and what calls it
- **Hybrid retrieval** — dense vectors (`code`, `description`, `developer_queries`) + BM25 sparse, all in one query
- **Token-efficient for AI agents** — replaces dozens of `read_file` + `grep` calls with one `search`
- **MCP server out of the box** — works with Claude Desktop, Cursor, Cline, Continue, Kiro, Zed
- **10 languages** — Python, JavaScript, TypeScript, Java, Go, PHP, C#, Ruby, Rust, C++
- **Free tier works** — local embeddings + structural pipeline need zero API keys
- **Incremental indexing** — `--changed` re-indexes only what `git diff` touched

---

## Install

```bash
# Local-only (free, offline embeddings via sentence-transformers)
pip install "codelumen[local]"

# With everything: voyage + openai + anthropic + cohere + chroma
pip install "codelumen[all]"

# Pick exactly what you need
pip install "codelumen[anthropic,local]"
```

**Prefer an isolated CLI install?** [pipx](https://pipx.pypa.io) keeps codelumen and its deps out of your global environment — recommended for a command-line tool:

```bash
pipx install "codelumen[local]"
```

**Just want to try it without installing?** With [uv](https://docs.astral.sh/uv/), run it straight from PyPI:

```bash
uvx --from "codelumen[local]" codelumen index .
```

> **Requires** Python 3.11+. First index downloads the embedding model (~420 MB for the default `all-mpnet-base-v2`) and caches it.

---

## 60-second quickstart

No config files, no API keys, no setup. codelumen works offline with a local
embedder by default. Just `cd` into any project and index it:

```bash
# 1. structural index — no LLM, free, no config needed
cd ~/code/your-project
codelumen index .

# 2. search — pure retrieval, ~80 ms (a background daemon stays warm)
codelumen search "how does the payment retry logic work?"

# 3. (optional) LLM enrichment — needs an Anthropic/OpenAI/OpenRouter key, or a local Ollama server
codelumen enrich

# 4. (optional) full RAG with answer generation
codelumen query "where is auth handled?"
```

The first command auto-starts a background **daemon** that loads the embedding
model once and keeps it warm — so every later search (from any terminal or
your editor) is instant. Each project gets its own index automatically under
`~/.codelumen/indexes/`, keyed by repo root. One global config lives at
`~/.codelumen/config.yaml` (created on first run); there is **no per-project
config file** to manage.

---

## CLI commands

| Command | What it does |
|---|---|
| `codelumen index <path>` | Stage 1: parse + embed + upsert. No LLM. |
| `codelumen index <path> --changed` | Incremental — only files in `git diff HEAD~1`. |
| `codelumen index <path> --reset` | Rebuild from scratch (after changing the embedding model). |
| `codelumen enrich` | Stage 2: drain pending chunks through an LLM. |
| `codelumen enrich --force` | Re-enrich every chunk. |
| `codelumen compact` | Prune orphaned records from `.codelumen/enrichment.jsonl`. |
| `codelumen search "query"` | Pure retrieval. `--format json/paths/compact/table`. |
| `codelumen query "question"` | Full RAG: retrieval + answer generation. |
| `codelumen status` | Per-state chunk counts + provider summary. |
| `codelumen projects` | List every indexed project in `~/.codelumen/indexes/`. |
| `codelumen serve` | Run the daemon in the foreground (it otherwise auto-starts). |
| `codelumen stop` | Stop the background daemon. |
| `codelumen doctor` | Daemon + config health check. |

> All commands act on the **current project** (nearest git root of your cwd).
> Override with `--root /path/to/repo`. They're thin clients to the daemon — no
> model loading, no `config.yaml` flag.
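
As a rough illustration of what "nearest git root of your cwd" means, the sketch below walks up from a starting directory until it finds a `.git` entry. `find_git_root` is a hypothetical helper written for this example, not codelumen's own resolver, which may handle more cases.

```python
from pathlib import Path
from typing import Optional


def find_git_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor of `start` (inclusive) that contains a .git entry."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        # .git is a directory in a normal clone and a file in worktrees/submodules
        if (candidate / ".git").exists():
            return candidate
    return None
```

Calling `find_git_root(Path.cwd())` from anywhere inside a repository returns the repository root, or `None` when no ancestor is a git checkout.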

---

## MCP — wire it into your AI editor

The package ships an MCP server (`codelumen-mcp`) that exposes eight tools — five for search, three for agent-driven enrichment:

| Tool | Use it for |
|---|---|
| `search` | Semantic search over the index. |
| `search_many` | Several searches in one call — results grouped per query. |
| `find_symbol` | Exact-name lookup for a function/method/class. |
| `get_chunk_context` | Full source + calls + called_by for a symbol. |
| `index_status` | Sanity-check the index. |
| `list_pending_enrichments` | Get a batch of chunks needing LLM enrichment. |
| `save_enrichment` | Persist a summary the agent wrote. |
| `enrichment_progress` | Loop sentinel — pending vs done. |

**Register it once, globally** — it then works in every project you open, with
no per-project setup. This is the entire config — drop it into your editor's
*user-level* MCP file:

```json
{
  "mcpServers": {
    "codelumen": {
      "command": "codelumen-mcp"
    }
  }
}
```

<details>
<summary><b>Where that file lives, per editor</b> (click to expand)</summary>

| Editor | Config file |
|---|---|
| **Claude Desktop** (macOS) | `~/Library/Application Support/Claude/claude_desktop_config.json` |
| **Claude Desktop** (Windows) | `%APPDATA%\Claude\claude_desktop_config.json` |
| **Cursor** | `~/.cursor/mcp.json` (global) or `.cursor/mcp.json` (per-project) |
| **Windsurf** | `~/.codeium/windsurf/mcp_config.json` |
| **Cline** (VS Code) | Cline panel → **MCP Servers** → *Configure* (edits `cline_mcp_settings.json`) |
| **Continue** | `~/.continue/config.yaml` → under a `mcpServers:` block |
| **Kiro** | `~/.kiro/settings/mcp.json` |
| **Claude Code** | `claude mcp add codelumen codelumen-mcp` |

> A few clients use a slightly different shape — e.g. VS Code's native MCP uses a
> top-level `"servers"` key instead of `"mcpServers"`. If yours differs, keep the
> `command: "codelumen-mcp"` part and match the client's MCP docs for the wrapper.

</details>

That's the whole config — no `config.yaml` path, no per-project entry. The
server is a thin proxy: it forwards each call to the warm daemon, tagged with
the project the editor currently has open (its nearest git root). Most editors
launch the server with the workspace as the working directory, so this Just
Works; if yours doesn't, pass the root explicitly:

```json
{ "command": "codelumen-mcp", "args": ["--root", "${workspaceFolder}"] }
```

After restarting the editor, the agent sees `codelumen.search`,
`codelumen.find_symbol`, etc. The daemon loads the model once for the whole
machine — so no matter how many editors and terminals you have open, there's
one model in memory and every call is ~80 ms.

### Teach the agent *when* to use it (optional)

The MCP server gives the agent the tools; a short **skill** teaches it to reach
for semantic search before grep/read. Install it once, user-level, into every
AI editor you use:

```bash
codelumen install-skill            # auto-detects installed editors
codelumen install-skill -t all     # or force every supported target
```

Supported: Claude Code (skill), Cursor & Windsurf (rules), Kiro (steering),
Codex (`AGENTS.md`). It writes to each editor's own convention at the user
level — so, like everything else here, it's set up once and applies to every
project. Re-run any time to update in place.

### Why this matters for token usage

Without semantic search, an AI agent asked *"where do we validate emails?"* runs `grep -r email`, gets 200 hits, and reads dozens of files. With codelumen it calls `search("validate email")`, gets 3 ranked chunks back as JSON (~500 tokens), and reads only what it needs.

### Worked example — *"How does the OrderService validate orders?"*

I indexed `sample_project/`, then asked exactly that. The agent calls **two** MCP tools and is done — no `read_file`, no grep.

<details>
<summary><b>Step 1 — <code>search</code> narrows the question to a few candidates</b></summary>

```json
{
  "query": "how does OrderService validate an order before processing",
  "embedding_dim": 768,
  "results": [
    {
      "score": 0.455,
      "qualified_name": "OrderService::_validate",
      "file": "orders.py", "line_start": 115, "line_end": 124,
      "type": "method", "enrichment_state": "pending"
    },
    {
      "score": 0.397,
      "qualified_name": "OrderService",
      "file": "orders.py", "line_start": 49, "line_end": 124,
      "type": "class",
      "summary": "High-level orchestrator for placing and processing orders. Coordinates payment authorization (PaymentProcessor) and customer notification (NotificationService)."
    },
    {
      "score": 0.363,
      "qualified_name": "OrderService::place_order",
      "file": "orders.py", "line_start": 67, "line_end": 93,
      "summary": "Validate, charge, and confirm an order in one shot."
    }
  ]
}
```

The `_validate` method ranks #1. The class and `place_order` rank just below — useful context, not the answer.

</details>

<details>
<summary><b>Step 2 — <code>get_chunk_context</code> pulls the body, calls, and called_by without opening the file</b></summary>

```json
{
  "found": true,
  "chunk": {
    "qualified_name": "OrderService::_validate",
    "file": "orders.py", "line_start": 115, "line_end": 124,
    "source": "def _validate(self, order: Order) -> None:\n    if order.is_empty():\n        raise ValueError(\"Order has no line items\")\n    if order.total_cents() < self.MIN_ORDER_CENTS:\n        raise ValueError(...)\n    if \"@\" not in order.customer_email:\n        raise ValueError(\"customer_email must be a valid email address\")",
    "calls": ["order.is_empty", "ValueError", "order.total_cents"],
    "called_by": [
      {"function": "place_order", "class_name": "OrderService", "file": "orders.py", "line": 67}
    ]
  }
}
```

That's the entire answer: three checks (non-empty, minimum total, email contains `@`), called once from `place_order`. Total cost: ~1.4 KB of JSON, two MCP calls, sub-second.

</details>

Compared to the grep-then-read alternative — open `orders.py` (3.5 KB), scan for `validate`, then trace into `place_order` to confirm the call — the MCP path uses roughly **40% of the tokens** and skips file I/O entirely.

---

## Two ways to enrich

Stage 2 (description / developer_queries vectors) needs an LLM. You pick how:

### Option A — direct API
codelumen calls Claude / GPT / OpenRouter / Ollama itself.

```bash
codelumen enrich
```

Best for CI, batch jobs, or users without an AI editor. Set `llm.provider` and `llm.<provider>.api_key` in `~/.codelumen/config.yaml`.

### Option B — your AI agent does it
Skip the API key. The agent in your editor (Claude in Cursor, Kiro, Cline, Continue, etc.) loops over pending chunks via the MCP tools and writes summaries itself. **No codelumen-side LLM cost.** Just say:

> *"enrich the codebase index"*

The agent calls `list_pending_enrichments`, writes a summary for each chunk, saves it with `save_enrichment`, and repeats until `enrichment_progress` reports zero pending. This runs on the model and subscription you already pay for.
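
The loop the agent follows can be sketched in a few lines. The three tool names come from the MCP table above, but everything else here is an assumption made for illustration: `FakeEnrichmentTools` is an in-memory stand-in, and the argument and return shapes (`id`, `source`, `pending`, `done`) are hypothetical rather than the server's real schema.

```python
from typing import Callable


class FakeEnrichmentTools:
    """In-memory stand-in for the three enrichment tools (shapes are assumed)."""

    def __init__(self, chunks: dict[str, str]) -> None:
        self._source = dict(chunks)          # chunk id -> source text
        self.summaries: dict[str, str] = {}  # chunk id -> saved summary

    def list_pending_enrichments(self, limit: int = 2) -> list[dict]:
        pending = [cid for cid in self._source if cid not in self.summaries]
        return [{"id": cid, "source": self._source[cid]} for cid in pending[:limit]]

    def save_enrichment(self, chunk_id: str, summary: str) -> None:
        self.summaries[chunk_id] = summary

    def enrichment_progress(self) -> dict:
        done = len(self.summaries)
        return {"pending": len(self._source) - done, "done": done}


def enrich_all(tools: FakeEnrichmentTools, summarize: Callable[[str], str]) -> int:
    """Drain the pending queue batch by batch; return how many chunks were enriched."""
    enriched = 0
    while tools.enrichment_progress()["pending"] > 0:
        for chunk in tools.list_pending_enrichments():
            tools.save_enrichment(chunk["id"], summarize(chunk["source"]))
            enriched += 1
    return enriched
```

In the real flow the `summarize` step is the agent itself writing the summary; here it is just a callable so the loop can run on its own.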

---

## Portable enrichment — pay once, share, survive model changes

Enrichment (the LLM-written summaries + developer queries) is the only part of codelumen that costs tokens. codelumen persists that text to **`.codelumen/enrichment.jsonl`** in your repo, keyed by each chunk's content hash — separate from the vector index (which lives in `~/.codelumen/indexes/` and is disposable).

**Commit `.codelumen/enrichment.jsonl` to git.** Doing so gives you:

- **Share across a team** — a teammate clones, runs `codelumen index`, and the enrichment is restored from the file and re-embedded locally. They pay **zero** LLM tokens for code that's already enriched.
- **Survive embedding-model changes** — switch from a local model to Voyage/OpenAI, run `codelumen index --reset`, and your enrichment is re-embedded into the new vector space with **no re-enrichment**. (Without this, changing models meant re-paying for everything.)
- **Survive branch switches / reverts** — old records are kept as a content-addressed cache, so checking out an older revision reuses its enrichment instantly.

It's content-addressed, so it's always safe: if a chunk's code changed, its hash misses and that chunk is simply re-enriched — stale summaries can never attach to changed code. Run `codelumen compact` to prune records no longer referenced by any code. The file holds only text and contains no secrets.

```text
.codelumen/enrichment.jsonl   ← commit this (portable enrichment, text only)
~/.codelumen/indexes/<id>/    ← never committed (vectors, machine-local)
```
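
To make the content-addressing idea concrete, here is a minimal sketch of a hash-keyed JSONL lookup. It is not codelumen's storage code: the use of SHA-256 and the record field names `hash` and `summary` are assumptions chosen for the example.

```python
import hashlib
import json
from typing import Optional


def content_hash(source: str) -> str:
    """Hash the chunk source. SHA-256 is an assumption; any stable digest works."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def load_records(jsonl_text: str) -> dict[str, dict]:
    """Parse JSONL into {hash: record}; a later line wins over an earlier one."""
    records: dict[str, dict] = {}
    for line in jsonl_text.splitlines():
        if line.strip():
            record = json.loads(line)
            records[record["hash"]] = record
    return records


def lookup(records: dict[str, dict], source: str) -> Optional[dict]:
    """Return the stored enrichment for this exact source, or None on a miss."""
    return records.get(content_hash(source))
```

Because the key is derived from the source text, any edit to a chunk produces a different hash and therefore a miss, which is why a stale summary cannot attach to changed code.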

---

## OpenRouter — one key, every model

If you don't want to manage Anthropic + OpenAI + Google accounts separately, point codelumen at [OpenRouter](https://openrouter.ai):

```yaml
llm:
  provider: "openrouter"
  openrouter:
    api_key: "${OPENROUTER_API_KEY}"
    model: "anthropic/claude-sonnet-4.6"   # or openai/gpt-5.5, anthropic/claude-opus-4.7, etc.
```

Use any [supported model slug](https://openrouter.ai/models). Same `codelumen enrich` and `codelumen query` commands as before.

---

## Architecture

Three named dense vectors per chunk: `code` (always), `description` (post-enrichment), `developer_queries` (post-enrichment), plus a `bm25` sparse vector. Pre-enrichment, search still works on `code` + BM25 alone.
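
How the per-vector rankings are merged is not described in this section. Reciprocal rank fusion (RRF) is one common way to combine several ranked lists, and the sketch below uses it purely as an illustration; the function name and the constant `k = 60` are assumptions, not codelumen internals.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists of chunk ids: score(id) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by id so the output is deterministic
    return sorted(scores, key=lambda cid: (-scores[cid], cid))
```

A chunk that appears in both the `code` ranking and the BM25 ranking accumulates two contributions, so it tends to rise above a chunk that only one of them found.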

**Stage 1** (`codelumen index`) parses files with Tree-sitter, builds the call graph, computes a source hash, embeds the raw source into the `code` vector, and upserts to Qdrant. No LLM calls. Every chunk lands in state `pending`.
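
The shape of a Stage 1 record can be sketched with Python's standard `ast` module. This is a stand-in, not the Tree-sitter pipeline: it handles Python only, labels methods as plain functions, skips the embedding and upsert steps, and the field names and SHA-256 hash are assumptions.

```python
import ast
import hashlib


def extract_chunks(source: str) -> list[dict]:
    """Return one chunk per function, method, or class with its calls and source hash."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        text = ast.get_source_segment(source, node) or ""
        # Outgoing edges of the call graph: every callee expression inside this node
        calls = sorted({
            ast.unparse(call.func)
            for call in ast.walk(node)
            if isinstance(call, ast.Call)
        })
        chunks.append({
            "name": node.name,
            "type": "class" if isinstance(node, ast.ClassDef) else "function",
            "line_start": node.lineno,
            "line_end": node.end_lineno,
            "calls": calls,
            "source_hash": hashlib.sha256(text.encode("utf-8")).hexdigest(),
            "state": "pending",  # every chunk starts out waiting for enrichment
        })
    return chunks
```

Inverting the `calls` lists across all chunks would give the `called_by` side of the graph.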

**Stage 2** (`codelumen enrich`) drains the pending queue through an LLM, generating a `logic_summary` and 5–10 `developer_queries` per chunk. It embeds them into the `description` and `developer_queries` vectors and mirrors the text to the portable `.codelumen/enrichment.jsonl` store. State flips to `fresh`. Trivial chunks get a free template; tests get pattern-skipped.
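
The routing decision for each chunk (skip, template, or LLM) might look roughly like the sketch below. The pattern list and the three-line threshold mirror the sample configuration; the matching semantics are assumptions: this version uses `fnmatch`, whose `*` also crosses `/`, so it is looser than a strict glob, and it counts only non-blank lines when deciding what is trivial.

```python
from fnmatch import fnmatchcase

SKIP_PATTERNS = ["**/test_*.py", "**/*_test.go", "**/tests/**"]
MAX_TRIVIAL_LINES = 3


def is_skipped(path: str, patterns: list[str] = SKIP_PATTERNS) -> bool:
    """True when a repo-relative POSIX path matches any skip pattern."""
    # Prefixing "/" lets a "**/x" pattern match top-level files as well
    return any(fnmatchcase("/" + path, pattern) for pattern in patterns)


def is_trivial(source: str, max_lines: int = MAX_TRIVIAL_LINES) -> bool:
    """A chunk is trivial when it has at most `max_lines` non-blank lines."""
    return sum(1 for line in source.splitlines() if line.strip()) <= max_lines


def enrichment_action(path: str, source: str) -> str:
    """Decide how a chunk is handled: 'skip', 'template', or 'llm'."""
    if is_skipped(path):
        return "skip"
    if is_trivial(source):
        return "template"
    return "llm"
```

Only chunks routed to `"llm"` cost tokens, which is why the skip and trivial rules are worth tuning on large test-heavy repositories.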

---

## Configuration

codelumen uses **one global config** at `~/.codelumen/config.yaml` (created with
working defaults on first run) — the daemon loads it for every project. You can
still point at an explicit file with `$CODELUMEN_CONFIG`. Environment-variable
refs (`${VAR}`) resolve from `~/.codelumen/.env`, a local `.env.codelumen`/`.env`,
or the real environment. Indexes are stored centrally under
`~/.codelumen/indexes/<id>/`, one per project root — you don't set a `path`.

```yaml
embeddings:
  provider: "local"
  local:
    model: "sentence-transformers/all-mpnet-base-v2"

llm:
  provider: "anthropic"
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"
    model: "claude-sonnet-4-6"

vector_db:
  provider: "qdrant"        # path is managed per-project by the daemon

enrichment:
  enabled: true
  version: 1
  llm_concurrency: 4
  skip_patterns: ["**/test_*.py", "**/*_test.go", "**/tests/**"]
  skip_trivial:
    enabled: true
    max_lines: 3
```
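
The `${VAR}` references in the YAML above can be pictured as a simple substitution pass. The sketch below is an illustration under stated assumptions, not codelumen's loader: `parse_dotenv` and `resolve_refs` are hypothetical helpers, the `.env` parsing is deliberately minimal, and the lookup order is whatever order the caller passes the sources in, since no precedence is documented here.

```python
import re
from typing import Mapping

_VAR_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def parse_dotenv(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; comments and blank lines are ignored."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_refs(value: str, *sources: Mapping[str, str]) -> str:
    """Replace each ${VAR} using the first source that defines it; keep unknown refs."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        for source in sources:
            if name in source:
                return source[name]
        return match.group(0)

    return _VAR_REF.sub(replace, value)
```

For example, `resolve_refs("${ANTHROPIC_API_KEY}", parse_dotenv(env_text), os.environ)` would consult the parsed `.env` contents first and fall back to the real environment.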

### Supported languages

<div align="center">

![Python](https://img.shields.io/badge/Python-3776AB?logo=python&logoColor=white)
![JavaScript](https://img.shields.io/badge/JavaScript-F7DF1E?logo=javascript&logoColor=black)
![TypeScript](https://img.shields.io/badge/TypeScript-3178C6?logo=typescript&logoColor=white)
![Java](https://img.shields.io/badge/Java-ED8B00?logo=openjdk&logoColor=white)
![Go](https://img.shields.io/badge/Go-00ADD8?logo=go&logoColor=white)
![PHP](https://img.shields.io/badge/PHP-777BB4?logo=php&logoColor=white)
![C#](https://img.shields.io/badge/C%23-239120?logo=csharp&logoColor=white)
![Ruby](https://img.shields.io/badge/Ruby-CC342D?logo=ruby&logoColor=white)
![Rust](https://img.shields.io/badge/Rust-000000?logo=rust&logoColor=white)
![C++](https://img.shields.io/badge/C%2B%2B-00599C?logo=cplusplus&logoColor=white)

</div>

---

## Troubleshooting

| Symptom | Cause | Fix |
|---|---|---|
| `qdrant... already accessed by another instance` | Something opened an index directly while the daemon holds it | The daemon is the sole index owner by design: use the CLI or MCP tools and don't open `~/.codelumen/indexes/*` yourself. To restart it, run `codelumen stop`; the daemon auto-starts again on the next command. |
| Search returns nothing on `description` / `developer_queries` | Stage 2 hasn't run | Run `codelumen enrich`, or use the MCP `list_pending_enrichments` flow. |
| Dimension mismatch on query | Embedding model changed since last index | Rebuild the index from scratch with `codelumen index . --reset`. |
| First command takes ~15 s | Daemon is loading the model (one time, machine-wide) | Normal on first use; every call after is ~80 ms. Run `codelumen doctor` to confirm it's warm. |
| Daemon won't start / commands hang | Bad config or port in use | Check `~/.codelumen/daemon.log`; set `CODELUMEN_DAEMON_PORT` if 7711 is taken. |
| `429` rate limit on Voyage / Anthropic | Free tier, low limits | Add billing or lower `enrichment.llm_concurrency`. |
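
For the `429` row, it may help to see why lowering `enrichment.llm_concurrency` reduces pressure on a rate limit. The sketch below caps in-flight requests with an `asyncio.Semaphore`; it is a generic pattern, not codelumen's scheduler, and `enrich_bounded` and `call_llm` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable


async def enrich_bounded(
    chunks: list[str],
    call_llm: Callable[[str], Awaitable[str]],
    concurrency: int = 4,
) -> list[str]:
    """Run call_llm over all chunks with at most `concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def one(chunk: str) -> str:
        async with semaphore:  # wait here while the limit is reached
            return await call_llm(chunk)

    # gather preserves input order regardless of completion order
    return list(await asyncio.gather(*(one(chunk) for chunk in chunks)))
```

With `concurrency=1` the calls run strictly one after another, which is the gentlest setting for a free-tier key.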

---

## Status

| | |
|---|---|
| **Version** | `3.1.2` |
| **Python** | 3.11, 3.12 |
| **License** | MIT |
| **Stage 1 (structural)** | Stable |
| **MCP server** | Stable — main shipping surface |
| **Stage 2 (enrichment)** | Beta — rate-limit-sensitive on free LLM tiers |

Tune `enrichment.llm_concurrency`, or use the agent-driven enrichment flow if you hit rate limits.

---

## Contributing

Issues and PRs welcome.

```bash
# Run tests
pytest

# Build a release
python -m build
```

See [open issues](https://github.com/ahmeedgamil/codelumen/issues) for things to pick up.

---

<div align="left">

**Developed and designed by [Ahmed Gamil](https://github.com/ahmeedgamil)**

MIT licensed — see [LICENSE](LICENSE).

</div>
