Metadata-Version: 2.4
Name: promptify-cmax
Version: 0.3.0
Summary: Call-graph-aware code context retrieval for AI coding agents (MCP server + CLI)
Project-URL: Repository, https://github.com/promptify-com/promptify-cmax
Project-URL: Issues, https://github.com/promptify-com/promptify-cmax/issues
Project-URL: Changelog, https://github.com/promptify-com/promptify-cmax/releases
Author-email: Promptify LLC <c@promptify.com>
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: ai-agents,claude-code,code-context,mcp,structural-retrieval
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development
Requires-Python: >=3.12
Requires-Dist: mcp>=1.0
Requires-Dist: tree-sitter-python>=0.23
Requires-Dist: tree-sitter-typescript>=0.23
Requires-Dist: tree-sitter>=0.23
Requires-Dist: watchdog>=4.0
Description-Content-Type: text/markdown

# promptify-cmax

**Call-graph-aware code context retrieval for AI coding agents.**

When you ask Claude Code or Cursor to *fix a specific function*, the agent typically falls back to `grep` and pulls every file that *mentions* the symbol — specs, plans, ADRs, unrelated definitions that happen to share a name. The model pays attention tax on all of it.

`promptify-cmax` returns the files most likely to need editing, ranked by **distance in the call graph** rather than surface name match. It exposes both an [MCP](https://modelcontextprotocol.io/) server (drop-in for Claude Code, Cursor, Continue) and a CLI.

## When to use this (vs. semantic retrieval)

The MCP-server space already has good tools for *discovery* — open-ended "how does auth work in this codebase?" questions. Tools like [`zilliztech/claude-context`](https://github.com/zilliztech/claude-context) use dense embeddings to find code that's *semantically* close to your query. That's the right call when you don't yet know the names of the symbols you're looking for.

`promptify-cmax` solves the complementary problem: *editing tasks where you already know the symbol*. Requests like "Fix `threshold_for_complexity`" or "why does `run_query` break when I change `Index.upsert`?" have a specific entry-point identifier, and the right files to read are the ones *structurally connected* to it (its callers and callees, direct and transitive). Embeddings can't see structural reachability; they retrieve by vector similarity of the text, which lets unrelated namesakes contaminate results. We run a breadth-first search (BFS) over a call graph keyed by fully qualified name (FQN), so two `helper()` functions in different files are different graph nodes.

The two approaches are orthogonal and can run side-by-side as separate MCP tools. A capable agent will pick the right one for the task.

| If your task looks like… | Use |
|---|---|
| "How does X work?" / unfamiliar codebase exploration | semantic retrieval (e.g. `claude-context`) |
| "Fix `func_name`" / "why does Y change when I edit Z?" / known-symbol editing | **promptify-cmax** |
| Pattern-matching across the codebase ("find all calls to deprecated API") | [`ast-grep` MCP](https://mcpservers.org/servers/ast-grep/ast-grep-mcp) |

## Why call-graph and not embeddings, for editing tasks

On SWE-bench-Verified Python bug-fix tasks at a 30 000-token budget, structural retrieval surfaces the file the agent needs to edit **24.6 pp more often than substring grep** (41.0 % vs 16.4 %). The direction of the effect held across three pre-registered spikes (v0.4 / v5 / v6) at n=250, n=250, and n=127. The v6 verdict is a clean PASS on a 127-instance sample fully disjoint from prior measurement runs:

| Budget | grep finds patch | structural finds patch | Δ |
|---|---|---|---|
| 5 000 tokens | 2.8 % | 16.6 % | **+13.8 pp** |
| **30 000 tokens** | **16.4 %** | **41.0 %** | **+24.6 pp** |
| 100 000 tokens | 39.1 % | 58.6 % | **+19.5 pp** |

Statistics: paired Wilcoxon p = 1.2 × 10⁻⁵, BCa-99 lower bound +10.9 pp, McNemar p = 1.9 × 10⁻⁵, JZS Bayes factor ≈ 4 800, multiverse 5/5 budgets directional. Cross-spike effect-size: +0.170 → +0.213 → +0.246 (consistent across three independent samples).

> **Audit trail:** the public claim above is lifted verbatim from [SPIKE-PCM-BENCH-FULLDISJOINT-V6 VERDICT.md §"Construct ceiling"](https://github.com/promptify-com/1metal-llm/blob/main/docs/research-spikes/2026-05-SPIKE-PCM-BENCH-FULLDISJOINT-V6/VERDICT.md#construct-ceiling-charter-construct-ceiling). The full v0.4 → v5 → v6 spike chain — including a PARITY verdict (paired-median degenerate on binary outcomes) and a FLAGGED-PASS verdict (overlap > 30 % auto-downgrade) — is preserved in the [research-spikes dossier](https://github.com/promptify-com/1metal-llm/tree/main/docs/research-spikes). The discipline ([ADR-0025](https://github.com/promptify-com/1metal-llm/blob/main/docs/adr/0025-promptify-cmax-public-claim-discipline.md)) gates every public-surface number on a `closed-go` spike's verdict.

What the bench measures: did structural retrieval surface the file the gold patch actually edits, anywhere in its ranked list, within the token budget (30 000 tokens for the headline number)? It does NOT measure end-to-end editing success (whether the agent ultimately produces a correct fix); SWE-bench's evaluation harness is out of scope. The v3-era "49× lower token cost" framing was empirically falsified at n=109 and is retired.
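
The "finds patch" metric described above can be sketched in a few lines. This is an illustration, not the benchmark harness: the `RankedFile` type, the per-file token counts, the rule of stopping at the first file that would overflow the budget, and counting a hit when *any* gold-patch file is surfaced are all assumptions made for the example, and the two instances are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankedFile:
    path: str
    tokens: int  # assumed token cost of putting this file in the prompt


def files_within_budget(ranked: list[RankedFile], budget: int) -> list[str]:
    """Walk the ranking top-down and stop at the first file that would overflow."""
    kept: list[str] = []
    used = 0
    for f in ranked:
        if used + f.tokens > budget:
            break
        kept.append(f.path)
        used += f.tokens
    return kept


def is_hit(ranked: list[RankedFile], gold_files: set[str], budget: int) -> bool:
    """True when at least one gold-patch file survives the budget cut (assumed rule)."""
    return any(path in gold_files for path in files_within_budget(ranked, budget))


def hit_rate(instances, budget: int) -> float:
    """Percentage of instances whose ranking is a hit at this budget."""
    hits = sum(is_hit(ranked, gold, budget) for ranked, gold in instances)
    return 100.0 * hits / len(instances)


# Two made-up instances (illustrative only, not benchmark data).
toy_instances = [
    ([RankedFile("a.py", 20_000), RankedFile("b.py", 8_000)], {"b.py"}),
    ([RankedFile("x.py", 25_000), RankedFile("y.py", 10_000)], {"y.py"}),
]

print(hit_rate(toy_instances, 30_000))   # first instance hits, second does not
print(hit_rate(toy_instances, 100_000))  # both hit once the budget is larger
```

The Δ column in the table is the difference between two such rates in percentage points, for example 41.0 − 16.4 = 24.6 at 30 000 tokens.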

The structural argument holds independently of the number: a senior engineer fixing a bug doesn't grep for the function name across the repo and read every match. They ask "what calls this, and what does this call?" That's a graph traversal, not a similarity ranking.

## Status

**v0.3 (general availability)** — Python and TypeScript indexing, FQN-aware call resolution, MCP server, ~33 tests. Wedge claim audited via the v0.4 → v5 → v6 spike chain (see "Why call-graph and not embeddings, for editing tasks" above). License: Apache-2.0. Go / Rust / Java / C# planned for Pro tier.

## Install

```bash
pip install promptify-cmax
```

Then index your project and wire it into Claude Code / Cursor / Continue. Five-minute walkthrough with copy-pasteable MCP config snippets and troubleshooting: **[QUICKSTART.md](https://github.com/promptify-com/promptify-cmax/blob/main/QUICKSTART.md)**.

## What it exposes

CLI:
- `promptify-cmax index --project-root <dir>` — build / incrementally update the structural index (one-time per repo, then automatic-on-change)
- `promptify-cmax query --project-root <dir> "<task>"` — return ranked files for a task description
- `promptify-cmax serve --project-root <dir>` — run as an MCP server over stdio

MCP tools (when run as `serve`):
- `structural_context(task, top_k=5)` — rank files by call-graph distance from the task's identifiers
- `reindex()` — rebuild after large code changes

## How it works

1. **Index** (one-time per repo, then incremental on file change): tree-sitter walks every Python and TypeScript source file, extracts function definitions, intra-function call sites, and module-level imports; persists everything to a single SQLite file at `.promptify/code-index.db`.
2. **Resolve** (query time): given a natural-language task, extract candidate identifiers (backtick / CamelCase / snake_case / dotted paths) and intersect with the symbols actually in the index.
3. **BFS** (query time): walk the call graph two hops in both directions; resolve each call edge to a *specific* `(file, function)` tuple via the caller's import bindings and same-file scope, so two functions named `helper` in different files never collapse into one node.
4. **Rank**: group reached nodes by file, sort by `(distance ASC, affected-function-count DESC)`, return the top-k.
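
Steps 2 to 4 can be sketched on a toy graph. This is a minimal illustration under stated assumptions, not the package's code: identifier extraction is reduced to one regex, call edges are treated as undirected for the two-hop walk, a file's distance is taken as the minimum over its reached functions, and every name below (`extract_identifiers`, `bfs_two_hops`, `rank_files`, the toy files) is hypothetical.

```python
import re
from collections import defaultdict, deque

Node = tuple[str, str]  # (file, function)

_DOTTED = re.compile(r"[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*")


def extract_identifiers(task: str, known: set[str]) -> set[str]:
    """Collect word-like tokens (and the parts of dotted paths), keep indexed ones."""
    candidates: set[str] = set()
    for token in _DOTTED.findall(task):
        candidates.add(token)
        candidates.update(token.split("."))
    return candidates & known


def bfs_two_hops(seeds: list[Node], calls: list[tuple[Node, Node]], max_hops: int = 2) -> dict[Node, int]:
    """Hop distance from the seeds, following call edges toward callers and callees."""
    neighbours: dict[Node, set[Node]] = defaultdict(set)
    for caller, callee in calls:
        neighbours[caller].add(callee)
        neighbours[callee].add(caller)
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in neighbours[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def rank_files(dist: dict[Node, int], top_k: int = 5) -> list[str]:
    """Sort files by (distance ASC, affected-function-count DESC), then by name."""
    per_file: dict[str, list[int]] = defaultdict(list)
    for (file, _func), d in dist.items():
        per_file[file].append(d)
    scored = sorted((min(ds), -len(ds), file) for file, ds in per_file.items())
    return [file for _, _, file in scored[:top_k]]


# Toy call graph: main -> run_query -> upsert -> flush -> write -> helper
calls = [
    (("cli.py", "main"), ("api.py", "run_query")),
    (("api.py", "run_query"), ("index.py", "upsert")),
    (("index.py", "upsert"), ("index.py", "flush")),
    (("index.py", "flush"), ("store.py", "write")),
    (("store.py", "write"), ("util.py", "helper")),
]
nodes = {n for edge in calls for n in edge}
task = "why does `run_query` break when I change `Index.upsert`?"
idents = extract_identifiers(task, {func for _file, func in nodes})
seeds = sorted(n for n in nodes if n[1] in idents)
ranked = rank_files(bfs_two_hops(seeds, calls))
print(ranked)
```

On this graph `index.py` ranks first because it holds two reached functions at distance 0 and 1, and `util.py` never appears because `helper` sits three hops from the nearest seed.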

The discipline that makes this useful: **fully-qualified-name resolution**, not bare-name matching. A naive call graph treats every `def main(): ...` in the repo as the same node — typically 100+ collisions in any non-trivial Python project. We resolve through imports, so cross-file false positives don't enter the BFS frontier.
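
To make the difference concrete, here is a small sketch of resolving a call through the caller's import bindings. It swaps in Python's standard `ast` module for tree-sitter so it runs with no dependencies, handles only absolute `from X import Y` statements, and assumes same-file definitions take precedence over imports; the module names and the `resolve_call` helper are hypothetical.

```python
import ast


def import_bindings(source: str) -> dict[str, tuple[str, str]]:
    """Map each locally bound name to (module, original name) for `from X import Y`."""
    bindings: dict[str, tuple[str, str]] = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module:
            for alias in node.names:
                bindings[alias.asname or alias.name] = (node.module, alias.name)
    return bindings


def defined_functions(source: str) -> set[str]:
    """Names of all functions defined anywhere in the source."""
    return {
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def resolve_call(caller_module: str, name: str, sources: dict[str, str]):
    """Resolve a call name to a specific (module, function) node, or None."""
    src = sources[caller_module]
    if name in defined_functions(src):  # same-file scope (assumed to win)
        return (caller_module, name)
    return import_bindings(src).get(name)  # otherwise follow the import


# Two unrelated `helper` functions and two callers that import different ones.
sources = {
    "pkg_a.util": "def helper():\n    return 'a'\n",
    "pkg_b.util": "def helper():\n    return 'b'\n",
    "app": "from pkg_a.util import helper\n\ndef main():\n    return helper()\n",
    "job": "from pkg_b.util import helper as h\n\ndef main():\n    return h()\n",
}

print(resolve_call("app", "helper", sources))  # the pkg_a.util node
print(resolve_call("job", "h", sources))       # the pkg_b.util node, despite the alias
```

A bare-name graph would merge both `helper` definitions (and both `main` functions) into single nodes, so a BFS from `app.main` would reach `pkg_b.util` even though nothing in `app` calls it.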

## Roadmap

- [x] Python + TypeScript indexing (v0.1)
- [x] FQN-aware call resolution
- [x] MCP server, CLI
- [ ] Go, Rust, Java, C# (Pro)
- [ ] Hosted multi-repo index (Pro)
- [ ] PR-bot / CI integration (Team)
- [ ] VSCode + JetBrains extensions

## Pro / Team

This package is the open-source core. [Promptify](https://promptify.com) is building a hosted layer for teams (multi-repo indexing that survives laptop churn, additional language support, token-savings analytics, editor extensions, SSO/SAML, CI integration). Pricing and signup haven't shipped yet — [watch the repo](https://github.com/promptify-com/promptify-cmax) or [open an issue](https://github.com/promptify-com/promptify-cmax/issues/new) if you'd like a heads-up when the hosted tier launches.

## Contributing

See [CONTRIBUTING.md](https://github.com/promptify-com/promptify-cmax/blob/main/CONTRIBUTING.md). Issues and PRs welcome.

## License

Apache-2.0. Copyright © 2026 Promptify LLC. See [LICENSE](https://github.com/promptify-com/promptify-cmax/blob/main/LICENSE) and [NOTICE](https://github.com/promptify-com/promptify-cmax/blob/main/NOTICE).
