Metadata-Version: 2.4
Name: combfind
Version: 0.1.9
Summary: Queryable concept map of a codebase for LLM coding agents
Author-email: karolinkostial@gmail.com
License: MIT
Keywords: llm,code-search,embeddings,tree-sitter
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: click>=8.1
Requires-Dist: scikit-learn>=1.3
Requires-Dist: numpy>=1.26
Requires-Dist: sentence-transformers>=3.0
Requires-Dist: sqlite-vec>=0.1
Requires-Dist: tree-sitter>=0.22
Requires-Dist: tree-sitter-go>=0.23
Requires-Dist: tree-sitter-python>=0.23
Provides-Extra: llm
Requires-Dist: llama-cpp-python>=0.2; extra == "llm"
Requires-Dist: huggingface_hub>=0.20; extra == "llm"
Provides-Extra: hdbscan
Requires-Dist: hdbscan>=0.8; extra == "hdbscan"
Provides-Extra: scip
Requires-Dist: protobuf>=5.0; extra == "scip"
Provides-Extra: output
Requires-Dist: rich>=13.0; extra == "output"
Requires-Dist: tqdm>=4.66; extra == "output"
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: ruff>=0.4; extra == "dev"

# combfind

Give an AI agent a codebase. combfind tells it where to look.

combfind builds a local index of a repository so an agent can find the right files and functions for a task with a plain-text query, without reading the entire codebase.

## Install

```bash
pip3 install "combfind[llm]" \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
```

Or with uv:

```bash
uv pip install "combfind[llm]" \
  --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
```

Download a model (one-time, ~2 GB):

```bash
combfind download-model
```

## Usage

```bash
# Index a repository (requires a local LLM)
combfind init /path/to/repo --db repo.db

# Query it
combfind query "how does authentication work" --db repo.db
combfind query "where are database migrations" --db repo.db --format json
```
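
For agents that drive combfind from Python, a thin wrapper around the CLI might look like the sketch below. It uses only the `query` subcommand and the `--db`, `--top-k`, and `--format` flags documented in this README. The helper names `build_query_command` and `run_query` are hypothetical and not part of combfind, and the sketch assumes the `combfind` executable is on `PATH` and prints only the JSON document to stdout.

```python
import json
import subprocess


def build_query_command(query: str, db: str = ".combfind.db", top_k: int = 5) -> list[str]:
    """Build the argv list for a JSON-formatted `combfind query` call."""
    return [
        "combfind", "query", query,
        "--db", db,
        "--top-k", str(top_k),
        "--format", "json",
    ]


def run_query(query: str, db: str = ".combfind.db", top_k: int = 5) -> list[dict]:
    """Run the query and parse stdout as JSON.

    Assumption: combfind is installed and writes nothing but JSON to stdout.
    """
    completed = subprocess.run(
        build_query_command(query, db, top_k),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)
```

Passing the arguments as a list rather than a shell string means the query text needs no quoting or escaping.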

### Query output (JSON)

```json
[
  {
    "rank": 1,
    "concept": "Token Refresh",
    "role": "implementation",
    "score": 0.87,
    "files": [{"path": "auth/service.py", "start_line": 42, "end_line": 91}],
    "symbols": ["AuthService.refresh", "AuthService.validate"],
    "why_relevant": "Handles session token validation and refresh logic.",
    "sibling_implementations": []
  }
]
```
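
An agent usually needs only the file ranges from this output. As a minimal sketch, the snippet below parses the sample above and flattens it into `(path, start_line, end_line)` tuples ordered by rank. The function name `top_locations` is illustrative and not part of combfind.

```python
import json

# Sample copied from the JSON output shown above.
sample = """
[
  {
    "rank": 1,
    "concept": "Token Refresh",
    "role": "implementation",
    "score": 0.87,
    "files": [{"path": "auth/service.py", "start_line": 42, "end_line": 91}],
    "symbols": ["AuthService.refresh", "AuthService.validate"],
    "why_relevant": "Handles session token validation and refresh logic.",
    "sibling_implementations": []
  }
]
"""


def top_locations(results: list[dict]) -> list[tuple[str, int, int]]:
    """Flatten results into (path, start_line, end_line) tuples, best rank first."""
    locations = []
    for result in sorted(results, key=lambda r: r["rank"]):
        for entry in result["files"]:
            locations.append((entry["path"], entry["start_line"], entry["end_line"]))
    return locations


print(top_locations(json.loads(sample)))
# prints [('auth/service.py', 42, 91)]
```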

### Init options

| Flag | Default | Description |
|------|---------|-------------|
| `--db` | `<repo_path>/.combfind.db` | Path of the index database to write |
| `--llm-model` | auto-detected | Path to a GGUF model file |
| `--exclude-paths` | - | Paths to skip, relative to the repo root (repeatable) |
| `--exclude-regex` | - | Regex matched against file paths to skip |

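Before indexing a large repository, it can help to preview which files a candidate `--exclude-regex` pattern would skip. The sketch below is one way to do that in plain Python. It assumes the pattern is applied with search semantics (a match anywhere in the repo-relative path), which this README does not specify, so treat the result as an approximation. The function name `preview_excluded` and the sample paths are hypothetical.

```python
import re


def preview_excluded(paths: list[str], pattern: str) -> list[str]:
    """Return the paths a pattern would skip, ASSUMING re.search semantics."""
    compiled = re.compile(pattern)
    return [path for path in paths if compiled.search(path)]


# Hypothetical repo-relative paths used only for illustration.
paths = ["auth/service.py", "vendor/lib/util.go", "auth/service_test.go"]
print(preview_excluded(paths, r"(^vendor/|_test\.go$)"))
# prints ['vendor/lib/util.go', 'auth/service_test.go']
```
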
### Query options

| Flag | Default | Description |
|------|---------|-------------|
| `--db` | `.combfind.db` | Database to query |
| `--top-k` | 5 | Number of results to return |
| `--format` | `text` | Output format: `text` or `json` |

## Environment variables

| Variable | Default | Description |
|----------|---------|-------------|
| `COMBFIND_LOG_LEVEL` | `info` | Log verbosity: `debug`, `info`, `warning`, `error` |

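When combfind is launched from a Python process, the log level can be set per invocation instead of globally. The sketch below builds an environment mapping suitable for the `env=` argument of `subprocess.run`. The helper name `combfind_env` is hypothetical, and the accepted values are the four listed in the table above.

```python
import os


def combfind_env(log_level: str = "debug") -> dict[str, str]:
    """Copy the current environment and set COMBFIND_LOG_LEVEL."""
    allowed = {"debug", "info", "warning", "error"}  # values listed in the table
    if log_level not in allowed:
        raise ValueError(f"unsupported log level: {log_level!r}")
    env = os.environ.copy()  # leave the parent process environment untouched
    env["COMBFIND_LOG_LEVEL"] = log_level
    return env
```
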
## Supported languages

Python and Go. More languages can be added via tree-sitter grammars.
