Metadata-Version: 2.4
Name: axoniq
Version: 0.2.3
Summary: Graph-powered code intelligence engine — indexes codebases into a knowledge graph, exposed via MCP tools for AI agents and a CLI for developers.
Keywords: code-intelligence,knowledge-graph,mcp,tree-sitter,static-analysis,dead-code,claude-code
Author: harshkedia177
Author-email: harshkedia177 <harshkedia717@gmail.com>
License-Expression: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Code Generators
Requires-Dist: typer>=0.15.0
Requires-Dist: rich>=13.0.0
Requires-Dist: tree-sitter>=0.25.0
Requires-Dist: tree-sitter-python>=0.23.0
Requires-Dist: tree-sitter-javascript>=0.23.0
Requires-Dist: tree-sitter-typescript>=0.23.0
Requires-Dist: kuzu>=0.11.0
Requires-Dist: igraph>=1.0.0
Requires-Dist: leidenalg>=0.11.0
Requires-Dist: fastembed>=0.7.0
Requires-Dist: mcp>=1.0.0
Requires-Dist: watchfiles>=1.0.0
Requires-Dist: pathspec>=1.0.4
Requires-Dist: pytest>=8.0.0 ; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24.0 ; extra == 'dev'
Requires-Dist: pytest-cov>=6.0.0 ; extra == 'dev'
Requires-Dist: ruff>=0.9.0 ; extra == 'dev'
Requires-Dist: neo4j>=5.0.0 ; extra == 'neo4j'
Requires-Python: >=3.11
Project-URL: Homepage, https://github.com/harshkedia177/axon
Project-URL: Repository, https://github.com/harshkedia177/axon
Project-URL: Issues, https://github.com/harshkedia177/axon/issues
Provides-Extra: dev
Provides-Extra: neo4j
Description-Content-Type: text/markdown

# Axon

**Building the knowledge graph for AI code agents.**

Indexes any codebase into a structural knowledge graph — every dependency, call chain, cluster, and execution flow — then exposes it through smart MCP tools so AI agents never miss code.

```
$ axon analyze .

Walking files...               142 files found
Parsing code...                142/142
Tracing calls...               847 calls resolved
Analyzing types...             234 type relationships
Detecting communities...       8 clusters found
Detecting execution flows...   34 processes found
Finding dead code...           12 unreachable symbols
Analyzing git history...       18 coupled file pairs
Generating embeddings...       623 vectors stored

Done in 4.2s — 623 symbols, 1,847 edges, 8 clusters, 34 flows
```

---

## The Problem

Your AI agent edits `UserService.validate()`. It doesn't know that 47 functions depend on that return type, 3 execution flows pass through it, and `payment_handler.py` changes alongside it 80% of the time.

**Breaking changes ship.**

This happens because AI agents work with flat text. They grep for callers, miss indirect ones, and have no understanding of how code is *connected*. Context windows are finite. LSPs don't expose call graphs. Grepping gives you strings, not structure.

The agent needs a **knowledge graph** — not more text.

---

## How Axon Solves It

Most code intelligence tools give the agent raw files and hope it reads enough. Axon takes a different approach: **precompute structure at index time** so every tool call returns complete, actionable context.

A 12-phase pipeline runs once over your repo. After that:

- `axon_impact("validate")` returns all 47 affected symbols, grouped by depth (will break / may break / review), with confidence scores — in a single call
- `axon_query("auth handler")` returns hybrid-ranked results grouped by execution flow, not a flat list of name matches
- `axon_context("UserService")` returns callers, callees, type references, community membership, and dead code status — the full picture

**Three benefits:**

1. **Reliability** — the context is already in the tool response. No multi-step exploration that can miss code.
2. **Token efficiency** — one tool call instead of a 10-query search chain. Agents spend tokens on reasoning, not navigation.
3. **Model democratization** — even smaller models get full architectural clarity because the tools do the heavy lifting.

**Zero cloud dependencies.** Everything runs locally — parsing, graph storage, embeddings, search. No API keys, no data leaving your machine.

---

## TL;DR

```bash
pip install axoniq            # 1. Install
cd your-project && axon analyze .  # 2. Index (one command, ~5s for most repos)
```

Then add to `.mcp.json` or `.claude/settings.json`:

```json
{
  "mcpServers": {
    "axon": {
      "command": "axon",
      "args": ["serve", "--watch"]
    }
  }
}
```

Your AI agent now has full structural understanding of your codebase. The knowledge graph updates live as you edit.

---

## What You Get

### Find anything — by name, concept, or typo

**Hybrid Search (BM25 + Vector + Fuzzy)**

Three search strategies fused with [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf):

- **BM25 full-text search** — fast exact name and keyword matching via KuzuDB FTS
- **Semantic vector search** — conceptual queries via 384-dim embeddings (BAAI/bge-small-en-v1.5)
- **Fuzzy name search** — Levenshtein fallback for typos and partial matches

Results are ranked with test file down-ranking (0.5x) and source function/class boosting (1.2x), then **grouped by execution flow** so the agent sees architectural context in a single call.

### Know what breaks before you change it

**Impact Analysis with Depth Grouping**

When you're about to change a symbol, Axon traces upstream through the call graph, type references, and git coupling history. Results are grouped by depth for actionability:

- **Depth 1** — Direct callers (will break)
- **Depth 2** — Indirect callers (may break)
- **Depth 3+** — Transitive (review)

Every edge carries a confidence score (1.0 = exact match, 0.8 = receiver method, 0.5 = fuzzy) so you can prioritize what to review.

### Find what to delete

**Dead Code Detection**

Not just "zero callers" — a multi-pass analysis that understands your framework:

1. **Initial scan** — flags symbols with no incoming calls
2. **Exemptions** — entry points, exports, constructors, test code, dunder methods, `__init__.py` symbols, decorated functions, `@property` methods
3. **Override pass** — un-flags methods overriding non-dead base class methods
4. **Protocol conformance** — un-flags methods on Protocol-conforming classes
5. **Protocol stubs** — un-flags all methods on Protocol classes (interface contracts)

### Understand how code runs, not just where it sits

**Execution Flow Tracing**

Detects entry points using framework-aware patterns:
- **Python**: `@app.route`, `@router.get`, `@click.command`, `test_*` functions, `__main__` blocks
- **JavaScript/TypeScript**: Express handlers, exported functions, `handler`/`middleware` patterns

Then traces BFS execution flows from each entry point through the call graph, classifying flows as intra-community or cross-community.

### See your architecture without reading docs

**Community Detection**

Uses the [Leiden algorithm](https://www.nature.com/articles/s41598-019-41695-z) (igraph + leidenalg) to automatically discover functional clusters. Each community gets a cohesion score and auto-generated label. Agents can ask "what cluster does this symbol belong to?" and get the answer without reading a single design doc.

### Find hidden dependencies git knows about

**Change Coupling (Git History)**

Analyzes 6 months of git history to find dependencies that static analysis misses:

```
coupling(A, B) = co_changes(A, B) / max(changes(A), changes(B))
```

Files with coupling strength >= 0.3 and 3+ co-changes get linked. These show up in impact analysis — so when you change `user.py`, the agent also knows to check `user_test.py` and `auth_middleware.py`.

### Always up to date

**Watch Mode**

Live re-indexing powered by a Rust-based file watcher (watchfiles):

```bash
$ axon watch
Watching /Users/you/project for changes...

[10:32:15] src/auth/validate.py modified -> re-indexed (0.3s)
[10:33:02] 2 files modified -> re-indexed (0.5s)
```

File-local phases (parse, imports, calls, types) run immediately on change. Global phases (communities, processes, dead code) batch every 30 seconds.

### Structural diff, not text diff

**Branch Comparison**

Compare branches at the symbol level using git worktrees (no stashing required):

```bash
$ axon diff main..feature

Symbols added (4):
  + process_payment (Function) -- src/payments/stripe.py
  + PaymentIntent (Class) -- src/payments/models.py

Symbols modified (2):
  ~ checkout_handler (Function) -- src/routes/checkout.py

Symbols removed (1):
  - old_charge (Function) -- src/payments/legacy.py
```

### Clean call graphs

**Noise Filtering**

Built-in blocklist (138 entries) automatically filters language builtins (`print`, `len`, `isinstance`), JS/TS globals (`console`, `setTimeout`, `fetch`), React hooks (`useState`, `useEffect`), and common stdlib methods from the call graph. Your graph shows *your* code's relationships, not noise from `list.append()`.

---

## The Pipeline

Axon builds deep structural understanding through 12 sequential analysis phases:

| Phase | What It Does |
|-------|-------------|
| **File Walking** | Walks repo respecting `.gitignore`, filters by supported languages |
| **Structure** | Creates File/Folder hierarchy with CONTAINS relationships |
| **Parsing** | tree-sitter AST extraction — functions, classes, methods, interfaces, enums, type aliases |
| **Import Resolution** | Resolves import statements to actual files (relative, absolute, bare specifiers) |
| **Call Tracing** | Maps function calls with confidence scores. Noise filtering skips 138 language builtins |
| **Heritage** | Tracks class inheritance (EXTENDS) and interface implementation (IMPLEMENTS) |
| **Type Analysis** | Extracts type references from parameters, return types, and variable annotations |
| **Community Detection** | Leiden algorithm clusters related symbols into functional communities |
| **Process Detection** | Framework-aware entry point detection + BFS flow tracing |
| **Dead Code Detection** | Multi-pass analysis with override, protocol, and decorator awareness |
| **Change Coupling** | Git history analysis — finds files that always change together |
| **Embeddings** | 384-dim vectors for every symbol, enabling semantic search. Skip with `--no-embeddings` |

---

## MCP Integration

Axon exposes its full intelligence as an MCP server. Set it up once, and your AI agent has structural understanding of your codebase forever.

### Setup

**Claude Code** — add to `.mcp.json` or `.claude/settings.json`:

```json
{
  "mcpServers": {
    "axon": {
      "command": "axon",
      "args": ["serve", "--watch"]
    }
  }
}
```

**Cursor** — add to your MCP settings:

```json
{
  "axon": {
    "command": "axon",
    "args": ["serve", "--watch"]
  }
}
```

Or run `axon setup --claude` / `axon setup --cursor` to generate the config.

The `--watch` flag enables live re-indexing — the graph updates as you edit code.

### Tools

| Tool | What the agent gets |
|------|-------------|
| `axon_query` | Hybrid search (BM25 + vector + fuzzy) with results grouped by execution flow |
| `axon_context` | 360-degree view — callers, callees, type refs, confidence tags, dead code status |
| `axon_impact` | Blast radius grouped by depth — direct (will break), indirect (may break), transitive |
| `axon_dead_code` | All unreachable symbols grouped by file |
| `axon_detect_changes` | Map a `git diff` to affected symbols and execution flows |
| `axon_list_repos` | All indexed repositories with stats |
| `axon_cypher` | Read-only Cypher queries against the knowledge graph |

Every tool response includes a **next-step hint** guiding the agent through a natural investigation workflow:

```
query   -> "Next: Use context() on a specific symbol for the full picture."
context -> "Next: Use impact() if planning changes to this symbol."
impact  -> "Tip: Review each affected symbol before making changes."
```

### Resources

| URI | Description |
|-----|-------------|
| `axon://overview` | Node and relationship counts by type |
| `axon://dead-code` | Full dead code report |
| `axon://schema` | Graph schema reference for Cypher queries |

---

## How It Compares

| Capability | grep / ripgrep | LSP | Context window stuffing | Axon |
|-----------|---------------|-----|------------------------|------|
| Text search | Yes | No | Yes | Yes (hybrid BM25 + vector) |
| Find all callers | No | Partial | Hit-or-miss | Yes (full call graph with confidence) |
| Type relationships | No | Yes | No | Yes (param/return/variable roles) |
| Dead code detection | No | No | No | Yes (multi-pass, framework-aware) |
| Execution flow tracing | No | No | No | Yes (entry point -> flow) |
| Community detection | No | No | No | Yes (Leiden algorithm) |
| Change coupling (git) | No | No | No | Yes (6-month co-change analysis) |
| Impact analysis | No | No | No | Yes (depth-grouped with confidence) |
| AI agent integration | No | Partial | N/A | Yes (full MCP server) |
| Structural branch diff | No | No | No | Yes (node/edge level) |
| Watch mode | No | Yes | No | Yes (Rust-based, 500ms debounce) |
| Works offline | Yes | Yes | No | Yes |

---

## Supported Languages

| Language | Extensions | Parser |
|----------|-----------|--------|
| Python | `.py` | tree-sitter-python |
| TypeScript | `.ts`, `.tsx` | tree-sitter-typescript |
| JavaScript | `.js`, `.jsx`, `.mjs`, `.cjs` | tree-sitter-javascript |

---

## Installation

```bash
# With pip
pip install axoniq

# With uv (recommended)
uv add axoniq

# With Neo4j backend support
pip install axoniq[neo4j]
```

Requires **Python 3.11+**.

### From Source

```bash
git clone https://github.com/harshkedia177/axon.git
cd axon
uv sync --all-extras
uv run axon --help
```

---

## CLI Reference

```
axon analyze [PATH]          Index a repository (default: current directory)
    --full                   Force full rebuild (skip incremental)
    --no-embeddings          Skip vector embedding generation (faster indexing)

axon status                  Show index status for current repo
axon list                    List all indexed repositories (auto-populated on analyze)
axon clean                   Delete index for current repo
    --force / -f             Skip confirmation prompt

axon query QUERY             Hybrid search the knowledge graph
    --limit / -n N           Max results (default: 20)

axon context SYMBOL          360-degree view of a symbol
axon impact SYMBOL           Blast radius analysis
    --depth / -d N           BFS traversal depth (default: 3)

axon dead-code               List all detected dead code
axon cypher QUERY            Execute a raw Cypher query (read-only)

axon watch                   Watch mode — live re-indexing on file changes
axon diff BASE..HEAD         Structural branch comparison

axon setup                   Print MCP configuration JSON
    --claude                 For Claude Code
    --cursor                 For Cursor

axon mcp                     Start the MCP server (stdio transport)
axon serve                   Start the MCP server
    --watch, -w              Enable live file watching with auto-reindex
axon --version               Print version
```

---

## Example Workflows

### "I need to refactor the User class — what breaks?"

```bash
# See everything connected to User
axon context User

# Check blast radius — grouped by depth
axon impact User --depth 3

# Find files that always change with user.py
axon cypher "MATCH (a:File)-[r:CodeRelation]->(b:File) WHERE a.name = 'user.py' AND r.rel_type = 'coupled_with' RETURN b.name, r.strength ORDER BY r.strength DESC"
```

### "Is there dead code we should clean up?"

```bash
axon dead-code
```

### "What are the main execution flows?"

```bash
axon cypher "MATCH (p:Process) RETURN p.name, p.properties ORDER BY p.name"
```

### "Which parts of the codebase are most tightly coupled?"

```bash
axon cypher "MATCH (a:File)-[r:CodeRelation]->(b:File) WHERE r.rel_type = 'coupled_with' RETURN a.name, b.name, r.strength ORDER BY r.strength DESC LIMIT 20"
```

---

## Knowledge Graph Model

### Nodes

| Label | Description |
|-------|-------------|
| `File` | Source file |
| `Folder` | Directory |
| `Function` | Top-level function |
| `Class` | Class definition |
| `Method` | Method within a class |
| `Interface` | Interface / Protocol definition |
| `TypeAlias` | Type alias |
| `Enum` | Enumeration |
| `Community` | Auto-detected functional cluster |
| `Process` | Detected execution flow |

### Relationships

| Type | Description | Key Properties |
|------|-------------|----------------|
| `CONTAINS` | Folder -> File/Symbol hierarchy | -- |
| `DEFINES` | File -> Symbol it defines | -- |
| `CALLS` | Symbol -> Symbol it calls | `confidence` (0.0-1.0) |
| `IMPORTS` | File -> File it imports from | `symbols` (names list) |
| `EXTENDS` | Class -> Class it extends | -- |
| `IMPLEMENTS` | Class -> Interface it implements | -- |
| `USES_TYPE` | Symbol -> Type it references | `role` (param/return/variable) |
| `EXPORTS` | File -> Symbol it exports | -- |
| `MEMBER_OF` | Symbol -> Community it belongs to | -- |
| `STEP_IN_PROCESS` | Symbol -> Process it participates in | `step_number` |
| `COUPLED_WITH` | File -> File that co-changes with it | `strength`, `co_changes` |

### Node ID Format

```
{label}:{relative_path}:{symbol_name}

Examples:
  function:src/auth/validate.py:validate_user
  class:src/models/user.py:User
  method:src/models/user.py:User.save
```

---

## Architecture

```
Source Code (.py, .ts, .js, .tsx, .jsx)
    |
    v
+----------------------------------------------+
|         Ingestion Pipeline (12 phases)        |
|                                               |
|  walk -> structure -> parse -> imports        |
|  -> calls -> heritage -> types                |
|  -> communities -> processes -> dead_code     |
|  -> coupling -> embeddings                    |
+----------------------+-----------------------+
                       |
                       v
              +-----------------+
              | KnowledgeGraph  |  (in-memory during build)
              +--------+--------+
                       |
          +------------+------------+
          v            v            v
     +---------+ +---------+ +---------+
     | KuzuDB  | |  FTS    | | Vector  |
     | (graph) | | (BM25)  | | (HNSW)  |
     +----+----+ +----+----+ +----+----+
          +------------+------------+
                       |
              StorageBackend Protocol
                       |
              +--------+--------+
              v                 v
        +----------+     +----------+
        |   MCP    |     |   CLI    |
        |  Server  |     | (Typer)  |
        | (stdio)  |     |          |
        +----+-----+     +----+-----+
             |                |
        Claude Code      Terminal
        / Cursor         (developer)
```

### Tech Stack

| Layer | Technology | Purpose |
|-------|-----------|---------|
| Parsing | tree-sitter | Language-agnostic AST extraction |
| Graph Storage | KuzuDB | Embedded graph database with Cypher, FTS, and vector support |
| Graph Algorithms | igraph + leidenalg | Leiden community detection |
| Embeddings | fastembed | ONNX-based 384-dim vectors (~100MB, no PyTorch) |
| MCP Protocol | mcp SDK (FastMCP) | AI agent communication via stdio |
| CLI | Typer + Rich | Terminal interface with progress bars |
| File Watching | watchfiles | Rust-based file system watcher |
| Gitignore | pathspec | Full `.gitignore` pattern matching |

### Storage

Everything lives locally:

```
your-project/
+-- .axon/
    +-- kuzu/          # KuzuDB graph database (graph + FTS + vectors)
    +-- meta.json      # Index metadata and stats
```

Add `.axon/` to your `.gitignore`.

A global registry at `~/.axon/repos/` is automatically populated on `axon analyze`, enabling `axon list` to discover all indexed repositories across your machine.

The storage layer is abstracted behind a `StorageBackend` Protocol — KuzuDB is the default, with an optional Neo4j backend available via `pip install axoniq[neo4j]`.

---

## Development

```bash
git clone https://github.com/harshkedia177/axon.git
cd axon
uv sync --all-extras

# Run tests
uv run pytest

# Lint
uv run ruff check src/

# Run from source
uv run axon --help
```

---

## License

MIT

---

Built by [@harshkedia177](https://github.com/harshkedia177)
