Metadata-Version: 2.4
Name: kinetic-context
Version: 0.2.2
Summary: Repository-level code context engine — find the right code, fast.
Author: Z.ai
License: MIT
Project-URL: Homepage, https://github.com/notlousybook/kinetic-context
Project-URL: Repository, https://github.com/notlousybook/kinetic-context
Project-URL: Issues, https://github.com/notlousybook/kinetic-context/issues
Keywords: code,search,context,retrieval,embeddings,ast,tree-sitter,mcp,model-context-protocol,code-intelligence
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Code Generators
Classifier: Topic :: Text Processing :: Indexing
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: tree-sitter>=0.23
Requires-Dist: tree-sitter-python>=0.23
Requires-Dist: tree-sitter-javascript>=0.23
Requires-Dist: tree-sitter-typescript>=0.23
Requires-Dist: tree-sitter-go>=0.23
Requires-Dist: tree-sitter-rust>=0.23
Requires-Dist: tree-sitter-java>=0.23
Requires-Dist: rank-bm25>=0.2.2
Requires-Dist: networkx>=3.0
Requires-Dist: numpy>=1.24
Requires-Dist: requests>=2.31
Requires-Dist: rich>=13.0
Requires-Dist: tqdm>=4.66
Dynamic: license-file

# kinetic-context

> Repository-level code context engine — find the right code, fast.

`kinetic-context` is a self-contained, installable code context engine. Point it at any repository and it builds a multi-layer index (AST chunks + embeddings + a code knowledge graph + BM25) that you can query with natural language or identifiers. It returns **code blocks with line ranges** — not just file paths — ready to paste into an LLM prompt.

It is designed to be the context layer for coding agents. It ships with a Rich CLI, an MCP server (so Claude Code, Cursor, Continue, and Zed can mount it natively), a TCP JSON mode for any other agent, and a Python library.

---

## Highlights

- **Multi-language**: Python, JavaScript, TypeScript, Go, Rust, Java — via tree-sitter.
- **Structure-aware chunking** (`cAST`): never splits a function mid-body. 35% overlap. Hierarchical summaries at file and repository level.
- **Hybrid retrieval**: dense (Mistral Codestral Embed, 1536-dim) + BM25 (code-aware tokenizer that splits camelCase, snake_case, and dotted identifiers) + a Code Knowledge Graph with 8 relationship types.
- **Reciprocal Rank Fusion** with brute-force-optimized weights across 5 channels (dense, BM25, graph, PRF, patch-reverse-engineering).
- **Zerank-2 reranker** with instruction-following via XML tags, tuned per query intent.
- **Novel signals** invented for this engine:
  - **Cross-resolution resonance** — L2+L3+L4 consensus boost
  - **Code DNA fingerprinting** — structural similarity (param count, complexity, call count)
  - **Semantic bridges** — virtual graph edges between similar functions
  - **File cohort memory** — files that co-occur in correct answers get boosted across queries
  - **Score distribution shape analysis** — adaptive cutoff detects bimodal / uniform / power-law score distributions
  - **Score gap amplification** — sigmoid sharpening of close scores
  - **Post-rerank filename tie-breaker** — the fix for "correct directory, wrong file" at scale
  - **Adversarial anti-centroid** — penalty for chunks near the worst BM25 hits
  - **Query-to-patch reverse engineering** — generate a hypothetical fix, embed it as a 5th RRF channel
- **Incremental indexing** with SHA-256 Merkle root hashing. Only changed files are re-embedded. Subsequent `index` runs are fast.
- **Per-repo isolated storage** under `~/.kinetic/<slug>_<hash>/`. Same-name folders in different parents do not collide. A `manifest.json` records the source path, root hash, file count, and last-indexed timestamp so `kinetic status` can answer "reindex needed?" in <50ms without spinning up the engine.
- **Hookable everywhere**: MCP server (stdio JSON-RPC), TCP JSON, Python library, Rich CLI.
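
The code-aware BM25 tokenizer mentioned above can be illustrated with a small sketch. This is not the engine's actual tokenizer; the function name `code_tokens` and the regular expression are assumptions chosen to show how camelCase, snake_case, and dotted identifiers might be split into lowercase terms.

```python
import re

# Hypothetical pattern: acronym runs (HTTP in HTTPServer), then capitalized
# or lowercase words, then digit runs.
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def code_tokens(text: str) -> list[str]:
    """Split identifiers on camelCase, snake_case, and dots; lowercase the parts."""
    tokens: list[str] = []
    # First cut on anything that is not a letter or digit (dots, underscores, spaces).
    for part in re.split(r"[^A-Za-z0-9]+", text):
        # Then cut each remaining part on case and digit boundaries.
        tokens.extend(m.group(0).lower() for m in _WORD.finditer(part))
    return tokens


print(code_tokens("flask.app.add_url_rule"))
print(code_tokens("getUserName"))
```

Splitting this way lets a query such as "add url rule" match the identifier `add_url_rule` lexically, which a whitespace tokenizer would miss.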

---

## Install

```bash
pip install kinetic-context
```

Requires Python 3.10+. All heavy dependencies (tree-sitter, numpy, networkx, rich) are installed automatically by pip; there is no Qdrant/Pinecone/Milvus/Postgres to set up.

Set your API keys (a Mistral key for embeddings + a ZeroEntropy key for Zerank reranking):

```bash
export MISTRAL_API_KEY=...
export ZEROENTROPY_API_KEY=...
```

You can also override the embedder / reranker URLs and model IDs in `KCEConfig` if you want to point at a different provider.

---

## Quick start

```bash
# Index a repo (first run; takes a few minutes for a 1k-file repo)
kinetic index ./my-repo

# Subsequent runs only re-embed files whose SHA-256 hash changed
kinetic index ./my-repo

# Search — returns code blocks with line ranges, not just file paths
kinetic query "authentication middleware" --code

# JSON output (for agent hooks / piping)
kinetic query "how does routing work" --json | jq '.results[0]'

# Check if the index is up to date
kinetic status ./my-repo

# List all indexed repos
kinetic list
```

### Query output example

```
────────────────────────────────────────────────────────────────────────
 kinetic-context • semantic_question • zerank • 68ms
────────────────────────────────────────────────────────────────────────
 Query: how does routing work in flask
────────────────────────────────────────────────────────────────────────
Found 8 code blocks across 8 files

#1  src/flask/sansio/app.py  L240-262 (23 lines)  • add_url_rule (function)  • python
    def add_url_rule(self, rule, endpoint=None, view_func=None, ...)
  240 │ def add_url_rule(
  241 │     self,
  242 │     rule: str,
  243 │     endpoint: str | None = None,
  ...
```

---

## Hooking into coding agents

### MCP server (Claude Code, Cursor, Continue, Zed, anything MCP-aware)

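Start the MCP server manually (it speaks JSON-RPC over stdio, so it is normally launched by the MCP client rather than left running on its own):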

```bash
kinetic mcp
```

Or add it to your agent's MCP config:

```json
{
  "mcpServers": {
    "kinetic": {
      "command": "kinetic",
      "args": ["mcp"]
    }
  }
}
```

Four tools are exposed:

| Tool | Description |
|---|---|
| `kinetic_index` | Index (or incrementally update) a repo |
| `kinetic_query` | Search the indexed codebase, returns code blocks with line ranges |
| `kinetic_status` | Check whether the index is up to date |
| `kinetic_list_indexes` | List all indexed repos |

### TCP JSON server (any agent)

```bash
kinetic serve --port 7878
```

Send a single-line JSON request, get a single-line JSON response:

```bash
echo '{"query": "session cookie signing", "top": 5}' | nc localhost 7878
```
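
The same request can be sent from Python with the standard library. This is a minimal sketch, assuming the server reads one newline-terminated JSON object and answers with one JSON line, as the `nc` example suggests. The helper names (`encode_request`, `decode_response`, `kinetic_request`) are hypothetical, and the response schema is not assumed beyond "a JSON object".

```python
import json
import socket


def encode_request(query: str, top: int = 5) -> bytes:
    """Serialize the request as a single newline-terminated JSON line."""
    return (json.dumps({"query": query, "top": top}) + "\n").encode("utf-8")


def decode_response(raw: bytes) -> dict:
    """Parse the single-line JSON reply."""
    return json.loads(raw.decode("utf-8").strip())


def kinetic_request(query: str, top: int = 5, host: str = "localhost",
                    port: int = 7878, timeout: float = 10.0) -> dict:
    """Send one request to a running `kinetic serve` and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_request(query, top))
        buf = bytearray()
        # Read until the server finishes its line or closes the connection.
        while not buf.endswith(b"\n"):
            chunk = sock.recv(65536)
            if not chunk:
                break
            buf.extend(chunk)
    return decode_response(bytes(buf))
```

With the server from the previous step running, `kinetic_request("session cookie signing", top=5)` would return the decoded reply as a `dict`.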

### Python library

```python
from kce.engine import KCEEngine
from kce.config import KCEConfig
from kce.store.registry import Registry

cfg = KCEConfig()
cfg.index_dir = Registry().resolve("/path/to/repo")
cfg.ensure_dirs()
engine = KCEEngine(cfg)
engine.index("/path/to/repo")

result = engine.query("how does authentication work")
for cid in result.final_chunk_ids[:10]:
    chunk = engine.chunk_index[cid]
    print(f"{chunk.rel_path}:{chunk.start_line}-{chunk.end_line}  {chunk.name}")
    print(chunk.content)
```

---

## Storage layout

Every indexed repo gets its own directory under `~/.kinetic/`:

```
~/.kinetic/
  flask_f9cbd6f6/                       <- slug = name + short hash of abspath
    manifest.json                       <- source path, root hash, file count, last indexed
    chunks.jsonl                        <- one CodeChunk per line
    embeddings.npy                      <- (N, 1536) float32 matrix
    embeddings_ids.json                 <- row order
    ckg.graphml                         <- Code Knowledge Graph
    incremental_state.json              <- per-file SHA-256 hashes
    change_log.jsonl                    <- append-only change history
    embed_cache/embeddings.db           <- Mistral API cache (SQLite)
    zerank_cache/rerank.db              <- Zerank API cache (SQLite)
    file_summaries.json
    repo_summary.txt
  django_ea20f8f1/
    ...
```

### Why per-repo isolation?

- Two repos with the same folder name (e.g. `~/work/api` and `~/side/api`) get different slugs because the slug includes a short hash of the absolute path. No collisions.
- Caches live next to the index, so deleting a repo's index (`kinetic forget`) also frees its cache.
- The `manifest.json` records a Merkle root hash over all file content hashes. Recomputing this on startup takes <50ms even for a 1k-file repo, so `kinetic status` is effectively instant.
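
To make the slug scheme concrete, here is a rough sketch of how a `name_hash` directory name could be derived. The exact hash function and length used by `Registry` are not documented here; SHA-256 truncated to 8 hex characters is an assumption based on the `flask_f9cbd6f6` example, and `repo_slug` is a hypothetical helper.

```python
import hashlib
import re
from pathlib import Path


def repo_slug(repo_path: str, hash_len: int = 8) -> str:
    """Build '<folder name>_<short hash of absolute path>' (assumed scheme)."""
    abspath = str(Path(repo_path).expanduser().resolve())
    # Keep the folder name filesystem-safe; fall back to 'repo' for '/'.
    name = re.sub(r"[^A-Za-z0-9_.-]+", "_", Path(abspath).name) or "repo"
    digest = hashlib.sha256(abspath.encode("utf-8")).hexdigest()[:hash_len]
    return f"{name}_{digest}"


# Same folder name, different parents: the slugs differ.
print(repo_slug("~/work/api"))
print(repo_slug("~/side/api"))
```

Because the hash covers the absolute path rather than the file contents, the slug stays stable as the repo changes.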

### Efficient incremental updates

1. On `kinetic index <repo>`, we first compute the current Merkle root from disk.
2. We compare to the stored root in `manifest.json`. If they match, the index is up to date — we're done in <100ms.
3. If they differ, we walk the per-file SHA-256 hashes in `incremental_state.json` and identify exactly which files were added, modified, or removed.
4. Only those files are re-parsed, re-summarized, and re-embedded. Existing chunks for unchanged files are reused.
5. The Mistral embed cache (SQLite, keyed by SHA-256 of the embed text) means even a chunk whose text didn't change but whose `chunk_id` was regenerated will not trigger an API call.

For a 1k-file repo where 5 files changed, a re-index takes seconds, not minutes.
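
The steps above can be sketched in a few lines. This is an illustration, not the engine's code: the function names, the leaf encoding (`path`, NUL byte, hash), and the rule of duplicating the last node on odd levels are all assumptions.

```python
import hashlib


def merkle_root(file_hashes: dict[str, str]) -> str:
    """Fold per-file hashes into one root; leaves are ordered by path."""
    level = [
        hashlib.sha256(f"{path}\0{file_hashes[path]}".encode("utf-8")).digest()
        for path in sorted(file_hashes)
    ]
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


def diff_state(old: dict[str, str], new: dict[str, str]):
    """Return (added, modified, removed) paths between two hash snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(p for p in new.keys() & old.keys() if new[p] != old[p])
    return added, modified, removed


old = {"a.py": "1", "b.py": "2", "c.py": "3"}  # stand-ins for SHA-256 digests
new = {"a.py": "1", "b.py": "9", "d.py": "4"}
if merkle_root(old) != merkle_root(new):  # cheap check first
    print(diff_state(old, new))  # only then walk the per-file hashes
```

Comparing roots first keeps the common "nothing changed" case cheap; the per-file walk only runs when the roots differ.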

---

## Architecture

![Architecture](docs/architecture.png)

The pipeline has four layers:

1. **Ingestion** — tree-sitter parses each source file into an AST. The `cAST` chunker walks the AST and produces chunks that respect syntactic boundaries (a function is never split mid-body). Each chunk is enriched with its signature, docstring, decorators, and scope. A Code Knowledge Graph (NetworkX) is built with 8 relationship types: `CALLS`, `INHERITS`, `IMPLEMENTS`, `IMPORTS`, `CONTAINS`, `USES_TYPE`, `OVERRIDES`, `DEPENDS_ON`.

2. **Storage** — chunks go to JSONL, embeddings go to a single numpy matrix on disk (+ L2-normalized in memory for fast cosine), the graph goes to GraphML. Each repo gets its own directory under `~/.kinetic/`. SHA-256 Merkle root hashing drives incremental updates.

3. **Retrieval & Ranking** — the query coordinator picks one of 5 query types (identifier lookup, semantic question, code completion, bug diagnosis, architecture query). For each type, it applies intent-aware boosts (source vs test vs config files), runs multi-query BM25 + dense + graph retrieval, fuses the results with weighted Reciprocal Rank Fusion, applies novel signals (resonance, DNA, semantic bridges, cohort memory), then reranks the top candidates with Zerank-2.

4. **Output** — the final ranked chunks are returned as code blocks with line ranges, signature, and docstring. They can be rendered by the Rich CLI, serialized to JSON, or shipped over MCP / TCP to any coding agent.
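
The fusion step in layer 3 can be illustrated with a generic weighted Reciprocal Rank Fusion. This is a sketch of the standard formula, not the engine's implementation: `weighted_rrf` is a hypothetical name, the smoothing constant `k = 60` is the conventional default and an assumption here, and the chunk IDs are made up. The weights are the `rrf_dense`, `rrf_bm25`, and `rrf_graph` defaults from the configuration table.

```python
from collections import defaultdict


def weighted_rrf(rankings: dict[str, list[str]],
                 weights: dict[str, float],
                 k: int = 60) -> list[tuple[str, float]]:
    """Score each chunk as the sum over channels of weight / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for channel, ranked_ids in rankings.items():
        w = weights.get(channel, 0.0)
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            scores[chunk_id] += w / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


rankings = {
    "dense": ["chunk_a", "chunk_b", "chunk_c"],
    "bm25": ["chunk_b", "chunk_a"],
    "graph": ["chunk_c"],
}
weights = {"dense": 0.75, "bm25": 0.05, "graph": 0.03}
for chunk_id, score in weighted_rrf(rankings, weights):
    print(f"{chunk_id}  {score:.5f}")
```

RRF only uses ranks, so channels with incomparable score scales (cosine similarity, BM25, graph proximity) can be combined without normalizing them first.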

---

## Benchmarks

We benchmark on two real-world repos with hand-curated query sets. We report Context F1@10, Recall@10, Precision@10, and end-to-end latency. **We deliberately do not compare to other context engines** — those numbers are easy to get wrong and we'd rather show our own results honestly than risk an unfair comparison.
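
For readers unfamiliar with the metrics, the following sketch shows one common way to compute precision, recall, and F1 at a cutoff for a single query. It is an assumption that the benchmark scripts work this way: the function name `prf_at_k`, the choice of dividing precision by the number of results actually returned, and the item granularity (files vs. chunks) are all hypothetical.

```python
def prf_at_k(retrieved: list[str], relevant: set[str],
             k: int = 10) -> tuple[float, float, float]:
    """Precision, recall, and F1 over the top-k retrieved items for one query."""
    top = retrieved[:k]
    hits = len(set(top) & relevant)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical query: 4 results returned, 2 of the 3 relevant items found.
p, r, f1 = prf_at_k(["a.py", "b.py", "c.py", "d.py"], {"a.py", "c.py", "x.py"})
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```

If F1 is computed per query and then averaged, the reported F1 need not equal the harmonic mean of the reported average precision and recall.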

### Aggregate metrics

![Aggregate metrics](docs/benchmarks/aggregate_metrics.png)

### Per-query F1 (sorted, so the worst queries are at the top)

![Per-query F1](docs/benchmarks/per_query_f1.png)

### Query outcome distribution (perfect / partial / failed)

![Perfect vs failed](docs/benchmarks/perfect_vs_failed.png)

### Latency distribution

![Latency distribution](docs/benchmarks/latency_distribution.png)

### Quality vs repository scale

![Scale vs quality](docs/benchmarks/scale_vs_quality.png)

### Numbers

| Corpus | Files | Chunks | Graph (nodes/edges) | Queries | F1@10 | Recall@10 | Precision@10 | Avg latency |
|---|---:|---:|---:|---:|---:|---:|---:|---:|
| Flask | 83 | 1,382 | 1,594 / 7,906 | 30 | 0.659 | 0.833 | 0.599 | 68 ms |
| Django (core) | 308 | 7,207 | 6,574 / 46,408 | 30 | 0.647 | 0.867 | 0.559 | 433 ms |

The Django benchmark uses the core Django packages (`django/db/`, `django/http/`, `django/urls/`, `django/template/`, `django/forms/`, `django/core/`, `django/contrib/auth/`, `django/contrib/sessions/`) — the parts of Django that real coding agents actually search. The full Django repo includes 3,000+ files of migrations, tests, and docs that bloat the index without improving retrieval quality on real-world queries.

Run the benchmarks yourself:

```bash
git clone https://github.com/pallets/flask /tmp/flask
git clone https://github.com/django/django /tmp/django
kinetic index /tmp/flask
kinetic index /tmp/django
python scripts/run_bench.py flask
python scripts/run_bench.py django
python scripts/gen_charts.py
```

---

## Configuration

All knobs are in `KCEConfig`. The defaults are tuned for the Mistral Codestral Embed + Zerank-2 combination. You can override:

| Setting | Default | What it controls |
|---|---|---|
| `mistral_embed_model` | `codestral-embed` | Embedding model |
| `zerank_model` | `zerank-2` | Reranker model |
| `chunk_max_tokens` | 512 | Max chunk size |
| `chunk_overlap_pct` | 0.35 | Chunk overlap |
| `bm25_k1`, `bm25_b` | 1.5, 0.75 | BM25 params |
| `rrf_dense`, `rrf_bm25`, `rrf_graph` | 0.75, 0.05, 0.03 | RRF channel weights |
| `retrieval_top_k` | 50 | Candidates per channel |
| `rerank_top_n` | 15 | Candidates sent to reranker |
| `final_top_n` | 10 | Final results returned |
| `filename_keyword_boost` | 2.0 | Post-rerank filename tie-breaker |
| `tier_c_penalty` | 0.4 | Penalty for abstract base classes |
| `context_budget_tokens` | 4096 | Context assembly budget |

API keys are read from `MISTRAL_API_KEY` and `ZEROENTROPY_API_KEY` env vars.

---

## Why not just use ripgrep / the IDE's built-in search?

Lexical search finds the keyword. It doesn't find the *concept*. Asking "how does routing work in flask" with grep gives you every line containing "route", most of which are irrelevant. `kinetic-context` returns the 8 functions that actually implement routing, with their full bodies and signatures, in 68 ms on the Flask benchmark.

## Why not just stuff the whole repo into the LLM context window?

A 1k-file repo is ~500k tokens just for the source. That's expensive and slow, and the LLM's attention degrades badly past ~100k tokens. `kinetic-context` returns the 10 chunks that matter. At the default settings (10 chunks of at most 512 tokens each) that is about 5k tokens, which fits in any context window and is roughly 100x fewer input tokens per query.

## Why a Code Knowledge Graph?

Because "what calls what" matters. When you ask "where is `request` used?", the graph says "the `request` proxy is defined in `flask/globals.py`, used in 47 places, and its setter is in `flask/app.py`". Pure lexical search sees the word `request` 500 times. The graph sees the relationship.

---

## License

MIT. See [LICENSE](LICENSE).

## Contributing

Issues and PRs welcome at [github.com/notlousybook/kinetic-context](https://github.com/notlousybook/kinetic-context).
