Metadata-Version: 2.4
Name: repomap-ai
Version: 0.1.2
Summary: Token-efficient repository mapping tool for AI IDEs
Project-URL: Homepage, https://github.com/tushar22/repomap
Project-URL: Repository, https://github.com/tushar22/repomap
Project-URL: Issues, https://github.com/tushar22/repomap/issues
Author: tushar22
License-Expression: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.11
Requires-Dist: mcp>=1.0.0
Requires-Dist: networkx>=3.3
Requires-Dist: numpy>=1.26.0
Requires-Dist: rich>=13.0
Requires-Dist: scipy>=1.13.0
Requires-Dist: tiktoken>=0.7.0
Requires-Dist: tree-sitter-python>=0.23.0
Requires-Dist: tree-sitter-typescript>=0.23.0
Requires-Dist: tree-sitter>=0.25.0
Requires-Dist: typer>=0.12.0
Requires-Dist: watchdog>=4.0.0
Provides-Extra: dev
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Provides-Extra: languages
Requires-Dist: tree-sitter-c>=0.23.0; extra == 'languages'
Requires-Dist: tree-sitter-cpp>=0.23.0; extra == 'languages'
Requires-Dist: tree-sitter-go>=0.23.0; extra == 'languages'
Requires-Dist: tree-sitter-java>=0.23.0; extra == 'languages'
Requires-Dist: tree-sitter-javascript>=0.23.0; extra == 'languages'
Requires-Dist: tree-sitter-ruby>=0.23.0; extra == 'languages'
Requires-Dist: tree-sitter-rust>=0.23.0; extra == 'languages'
Provides-Extra: llm
Requires-Dist: httpx>=0.27.0; extra == 'llm'
Provides-Extra: mcp
Requires-Dist: mcp>=1.0.0; extra == 'mcp'
Provides-Extra: scale
Requires-Dist: numpy>=1.26.0; extra == 'scale'
Requires-Dist: rustworkx>=0.15.0; extra == 'scale'
Requires-Dist: scipy>=1.13.0; extra == 'scale'
Provides-Extra: visual
Requires-Dist: jinja2>=3.1.0; extra == 'visual'
Description-Content-Type: text/markdown

# RepoMap

[![PyPI](https://img.shields.io/pypi/v/repomap-ai?label=pip&color=blue)](https://pypi.org/project/repomap-ai/)
[![npm](https://img.shields.io/npm/v/repomap-ai?label=npm&color=red)](https://www.npmjs.com/package/repomap-ai)
[![Python](https://img.shields.io/badge/python-3.11%2B-blue)](https://python.org)
[![License: MIT](https://img.shields.io/badge/license-MIT-green)](LICENSE)

Token-efficient repository mapping tool for AI IDEs. RepoMap parses source code with tree-sitter, builds function-level dependency graphs, and outputs compact maps that fit within token budgets.

## What is RepoMap

RepoMap generates concise, structured maps of codebases designed for consumption by LLMs and AI-powered IDEs. Rather than dumping raw source code into a context window, RepoMap extracts symbols (functions, classes, methods), resolves their dependencies, ranks them by importance using PageRank, and formats the output to stay within a specified token budget. The result is a high-signal overview that helps AI tools understand project structure without exhausting context limits.

## Features

- **Tree-sitter parsing** for Python, TypeScript, and JavaScript
- **Dependency graph** with typed edges: calls, imports, reads, writes, extends, implements
- **PageRank ranking** to surface the most important symbols first
- **Token-budget-aware output** that fits maps within configurable token limits
- **Data model detection** for Pydantic, dataclass, and SQLAlchemy models
- **Entry point detection** to identify CLI commands, API routes, and main functions
- **Multiple output formats**: Markdown, JSON, XML
- **Interactive HTML explorer** for visual navigation of the dependency graph
- **MCP server** for direct integration with AI IDEs (Cursor, VS Code)
- **Incremental file watcher** that updates the map as code changes

## Installation

Choose whichever method suits you; each one gives you the same `repomap` command.

### Via npm / npx (no Python setup required)

```bash
# Run instantly without installing
npx repomap-ai generate .

# Or install globally
npm install -g repomap-ai
repomap generate .
```

> `npx repomap-ai` automatically installs the Python backend (`pip install repomap-ai`) on first run.

### Via pip (Python 3.11+)

```bash
pip install repomap-ai
repomap generate .
```

With optional extras:

```bash
# Visualization, MCP server, and performance extras
pip install "repomap-ai[visual,mcp,scale]"
```

### Via pipx (recommended for global CLI use)

```bash
pipx install repomap-ai
repomap generate .
```

### From source (development)

```bash
git clone https://github.com/tushar22/repomap.git
cd repomap
pip install -e ".[dev,visual,mcp]"
```

## Quick Start

```bash
# Generate a map of the current repository
repomap generate .

# Generate a map with a 2000-token budget in JSON format
repomap generate . --max-tokens 2000 --format json

# Focus on symbols around a specific function
repomap generate . --around "MyClass.process"

# Launch the interactive visual explorer
repomap visual . -o .repomap.html
```

## CLI Commands

### `repomap generate`

Generate a repository map.

```bash
repomap generate . [OPTIONS]
```

| Option | Description |
|---|---|
| `--max-tokens N` | Maximum token budget for output (default: 1000) |
| `--around SYMBOL` | Focus map around a specific symbol |
| `--format FORMAT` | Output format: `markdown` (default), `json`, `xml` |
| `--output FILE` | Write output to a file instead of stdout |
| `--scope SCOPE` | Limit to a subdirectory or file |
| `--verbose` | Show detailed processing information |

Examples:

```bash
# Default markdown output
repomap generate .

# JSON output focused on a specific class, saved to file
repomap generate . --format json --around "UserService" --output map.json

# Scoped to a subdirectory with a larger token budget
repomap generate . --scope src/core --max-tokens 4000 --verbose
```
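
For scripted use, the documented options can be assembled from Python. This is a minimal sketch: the helper names `build_generate_command` and `run_generate` are hypothetical and not part of RepoMap; only the flags come from the options table above.

```python
import shlex
import subprocess


def build_generate_command(path=".", max_tokens=None, fmt=None, around=None,
                           scope=None, output=None):
    """Assemble argv for `repomap generate` from the documented options."""
    cmd = ["repomap", "generate", path]
    if max_tokens is not None:
        cmd += ["--max-tokens", str(max_tokens)]
    if fmt is not None:
        cmd += ["--format", fmt]
    if around is not None:
        cmd += ["--around", around]
    if scope is not None:
        cmd += ["--scope", scope]
    if output is not None:
        cmd += ["--output", output]
    return cmd


def run_generate(**options):
    """Run the CLI and return its stdout; requires `repomap` on PATH."""
    result = subprocess.run(build_generate_command(**options),
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Print the command instead of running it, so this works without repomap installed.
    print(shlex.join(build_generate_command(fmt="json", around="UserService",
                                            output="map.json")))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with symbol names such as `MyClass.process`.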

### `repomap visual`

Generate an interactive HTML dependency graph explorer.

```bash
repomap visual . -o .repomap.html
```

Open the resulting HTML file in a browser to explore symbols, their relationships, and importance rankings interactively.

### `repomap stats`

Display symbol store statistics: the number of files parsed, symbols extracted, and edges in the dependency graph.

```bash
repomap stats .
```

### `repomap watch`

Start an incremental file watcher that updates the symbol store as files change.

```bash
repomap watch .
```

### `repomap serve`

Start an MCP server for integration with AI IDEs.

```bash
repomap serve . --transport stdio
```

### `repomap init`

Generate IDE configuration files for MCP integration.

```bash
repomap init .
```

This creates `.cursor/mcp.json` and `.vscode/mcp.json` in the target directory.

## Using with Existing Repos

Just `cd` into any repo and run `repomap generate .`; no configuration is needed.

### One-liner (zero install)

```bash
cd /path/to/any/repo
npx repomap-ai generate .
```

> First run auto-installs the Python backend in the background, then generates the map.

### Step-by-step walkthrough

1. **Go to the target repository:**

   ```bash
   cd /path/to/your/repo
   # or clone one first
   git clone https://github.com/example/project.git && cd project
   ```

2. **Generate a high-level map:**

   ```bash
   # via npx
   npx repomap-ai generate .

   # or via pip-installed CLI
   repomap generate .
   ```

3. **Focus on a specific area** using `--around` for a detailed view of a symbol and its dependencies:

   ```bash
   repomap generate . --around "handle_request"
   ```

4. **Explore visually** by generating the interactive HTML explorer:

   ```bash
   repomap visual . -o .repomap.html
   open .repomap.html   # macOS
   ```

5. **Set up MCP for IDE integration** so your AI assistant can query the map on demand:

   ```bash
   repomap init .
   repomap serve . --transport stdio
   ```

## MCP Integration

RepoMap exposes six tools through the Model Context Protocol (MCP):

| Tool | Description |
|---|---|
| `repomap_overview` | Get a token-budgeted overview of the entire repository |
| `repomap_around` | Explore symbols surrounding a specific function or class |
| `repomap_query` | Search for symbols by name or pattern |
| `repomap_data_model` | List detected data models (Pydantic, dataclass, SQLAlchemy) |
| `repomap_entry_points` | List detected entry points (CLI commands, routes, main) |
| `repomap_impact` | Analyze the impact of changing a specific symbol |

### Cursor

Add to `.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "repomap": {
      "command": "repomap",
      "args": ["serve", ".", "--transport", "stdio"]
    }
  }
}
```

### VS Code

Add to `.vscode/mcp.json`:

```json
{
  "servers": {
    "repomap": {
      "command": "repomap",
      "args": ["serve", ".", "--transport", "stdio"]
    }
  }
}
```

Alternatively, run `repomap init .` to generate both configuration files automatically.
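
If you prefer to script the setup yourself, the two files above can be written with a few lines of Python. This sketch is not the implementation of `repomap init`; the function name `write_mcp_configs` is hypothetical, and merging into an existing file rather than overwriting it is an assumption about what you would want.

```python
import json
from pathlib import Path

# Server entry shared by both IDEs, copied from the snippets above.
SERVER_ENTRY = {"command": "repomap", "args": ["serve", ".", "--transport", "stdio"]}

# Each IDE expects a different top-level key.
CONFIG_FILES = {
    ".cursor/mcp.json": "mcpServers",
    ".vscode/mcp.json": "servers",
}


def write_mcp_configs(repo_root):
    """Create or update both config files, keeping any other servers already listed."""
    written = []
    for relative_path, top_key in CONFIG_FILES.items():
        target = Path(repo_root) / relative_path
        config = json.loads(target.read_text()) if target.exists() else {}
        config.setdefault(top_key, {})["repomap"] = SERVER_ENTRY
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(json.dumps(config, indent=2) + "\n")
        written.append(target)
    return written
```

Calling `write_mcp_configs(".")` from the repository root produces the same two entries shown in the Cursor and VS Code sections.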

## Output Formats

### Markdown (default)

Human-readable format, suitable for most LLMs. Selected with `--format markdown`.

### JSON

Structured format for programmatic consumption. Selected with `--format json`. Includes full symbol metadata, edges, and ranking scores.

### XML

Optimized for Claude. Selected with `--format xml`. Uses a compact tag structure that Claude parses efficiently, reducing token usage compared to equivalent JSON.

## Configuration

RepoMap reads configuration from `.repomaprc` (INI-style) or `pyproject.toml` under `[tool.repomap]`.

### pyproject.toml

```toml
[tool.repomap]
max_tokens = 1000
output_format = "markdown"
exclude_patterns = ["**/node_modules/**", "**/venv/**", "**/.git/**"]
```

### .repomaprc

```ini
max_tokens = 1000
output_format = markdown
exclude_patterns = **/node_modules/**, **/venv/**, **/.git/**
```

| Option | Default | Description |
|---|---|---|
| `max_tokens` | `1000` | Maximum token budget for generated maps |
| `output_format` | `markdown` | Default output format: `markdown`, `json`, `xml` |
| `exclude_patterns` | `[]` | Glob patterns for files and directories to skip |

## Architecture

RepoMap follows a four-stage pipeline:

```
Source Files --> Parser --> Graph --> Ranker --> Formatter --> Output
```

1. **Parser** -- Uses tree-sitter grammars to extract symbols (functions, classes, methods, variables) and their relationships from Python, TypeScript, and JavaScript source files.

2. **Graph** -- Builds a directed dependency graph using NetworkX. Edges are typed (calls, imports, reads, writes, extends, implements) and weighted by relationship strength.

3. **Ranker** -- Applies PageRank over the dependency graph to score each symbol by structural importance. Supports boosting for entry points and data models.

4. **Formatter** -- Serializes the ranked symbols into the requested output format (Markdown, JSON, XML), pruning low-ranked symbols to stay within the token budget as measured by tiktoken.
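
The Ranker and Formatter stages can be illustrated with a small, dependency-free sketch. It is not RepoMap's code: the power-iteration `pagerank` below stands in for a graph library's PageRank, `estimate_tokens` is a crude word count standing in for a tiktoken measurement, and the call graph, signatures, and function names are all hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {n: (1 - damping + damping * dangling) / len(nodes) for n in nodes}
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


def estimate_tokens(text):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def select_within_budget(signatures, rank, max_tokens):
    """Walk symbols from highest to lowest rank, keeping each one that still fits."""
    chosen, used = [], 0
    for name in sorted(signatures, key=lambda n: rank.get(n, 0.0), reverse=True):
        cost = estimate_tokens(signatures[name])
        if used + cost <= max_tokens:
            chosen.append(name)
            used += cost
    return chosen


# Hypothetical call graph: an edge (a, b) means "a calls b".
edges = [("main", "load_config"), ("main", "run"), ("run", "parse"),
         ("run", "load_config"), ("parse", "tokenize")]
signatures = {
    "main": "def main() -> None",
    "run": "def run(config: Config) -> Report",
    "parse": "def parse(path: Path) -> Tree",
    "load_config": "def load_config(path: Path) -> Config",
    "tokenize": "def tokenize(text: str) -> list[str]",
}
rank = pagerank(edges)
print(select_within_budget(signatures, rank, max_tokens=12))
```

Because rank flows along edges from caller to callee, symbols that many others depend on score highest, while a top-level `main` that nothing calls scores lowest and is the first to be pruned when the budget is tight.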

## License

Released under the MIT License. See [LICENSE](LICENSE) for details.
