Metadata-Version: 2.4
Name: codesurface
Version: 0.8.0
Summary: MCP server that indexes a codebase's public API at startup and serves it via compact tool responses. Pluggable parsers for C#, Go, Java, TypeScript, Python, C++, and more.
Project-URL: Homepage, https://github.com/Codeturion/codesurface
Project-URL: Repository, https://github.com/Codeturion/codesurface
Project-URL: Issues, https://github.com/Codeturion/codesurface/issues
Author-email: Codeturion <fuatcankoseoglu@gmail.com>
License: ## TL;DR
        
        Free to use, fork, modify, and share for any personal or non-commercial purpose.
        Commercial use requires permission — contact fuatcankoseoglu@gmail.com.
        
        Full license terms below.
        
        ---
        
        # PolyForm Noncommercial License 1.0.0
        
        Copyright (c) 2026 Codeturion
        
        <https://polyformproject.org/licenses/noncommercial/1.0.0>
        
        ## Acceptance
        
        In order to get any license under these terms, you must agree to them as both strict obligations and conditions to all your licenses.
        
        ## Copyright License
        
        The licensor grants you a copyright license for the software to do everything you might do with the software that would otherwise infringe the licensor's copyright in it for any permitted purpose. However, you may only distribute the software according to Distribution License and make changes or new works based on the software according to Changes and New Works License.
        
        ## Distribution License
        
        The licensor grants you an additional copyright license to distribute copies of the software. Your license to distribute covers distributing the software with changes and new works permitted by Changes and New Works License.
        
        ## Notices
        
        You must ensure that anyone who gets a copy of any part of the software from you also gets a copy of these terms or the URL for them above, as well as copies of any plain-text lines beginning with `Required Notice:` that the licensor provided with the software.
        
        ## Changes and New Works License
        
        The licensor grants you an additional copyright license to make changes and new works based on the software for any permitted purpose.
        
        ## Patent License
        
        The licensor grants you a patent license for the software that covers patent claims the licensor can license, or becomes able to license, that you would infringe by using the software.
        
        ## Noncommercial Purposes
        
        Any noncommercial purpose is a permitted purpose.
        
        ## Personal Uses
        
        Personal use for research, experiment, and testing for the benefit of public knowledge, personal study, private entertainment, hobby projects, amateur pursuits, or religious observance, without any anticipated commercial application, is use for a permitted purpose.
        
        ## Noncommercial Organizations
        
        Use by any charitable organization, educational institution, public research organization, public safety or health organization, environmental protection organization, or government institution is use for a permitted purpose regardless of the source of funding or obligations resulting from the funding.
        
        ## Fair Use
        
        You may have "fair use" rights for the software under the law. These terms do not limit them.
        
        ## No Other Rights
        
        These terms do not allow you to sublicense or transfer any of your licenses to anyone else, or prevent the licensor from granting licenses to anyone else. These terms do not imply any other licenses.
        
        ## Patent Defense
        
        If you make any written claim that the software infringes or contributes to infringement of any patent, your patent license for the software granted under these terms ends immediately. If your company makes such a claim, your patent license ends immediately for work on behalf of your company.
        
        ## Violations
        
        The first time you are notified in writing that you have violated any of these terms, or done anything with the software not covered by your licenses, your licenses can nonetheless continue if you come into full compliance with these terms, and take practical steps to correct past violations, within 32 days of receiving notice. Otherwise, all your licenses end immediately.
        
        ## No Liability
        
        As far as the law allows, the software comes as is, without any warranty or condition, and the licensor will not be liable to you for any damages arising out of these terms or the use or nature of the software, under any kind of legal claim.
        
        ## Definitions
        
        The **licensor** is the individual or entity offering these terms, and the **software** is the software the licensor makes available under these terms.
        
        **You** refers to the individual or entity agreeing to these terms.
        
        **Your company** is any legal entity, sole proprietorship, or other kind of organization that you work for, plus all organizations that have control over, are under the control of, or are under common control with that organization. **Control** means ownership of substantially all the assets of an entity, or the power to direct its management and policies by vote, contract, or otherwise. Control can be direct or indirect.
        
        **Your licenses** are all the licenses granted to you for the software under these terms.
        
        **Use** means anything you do with the software requiring one of your licenses.
License-File: LICENSE
Keywords: api,code-intelligence,csharp,go,golang,index,java,mcp,python,typescript
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Requires-Python: >=3.10
Requires-Dist: mcp[cli]>=1.8.0
Description-Content-Type: text/markdown

<!-- mcp-name: io.github.Codeturion/codesurface -->

# codesurface

[![PyPI Version](https://img.shields.io/pypi/v/codesurface.svg)](https://pypi.org/project/codesurface/)
[![PyPI Downloads](https://img.shields.io/pypi/dm/codesurface.svg)](https://pypi.org/project/codesurface/)
[![MCP Registry](https://img.shields.io/badge/MCP-Registry-green)](https://registry.modelcontextprotocol.io/?q=codesurface)
[![GitHub Stars](https://img.shields.io/github/stars/Codeturion/codesurface)](https://github.com/Codeturion/codesurface)
[![GitHub Last Commit](https://img.shields.io/github/last-commit/Codeturion/codesurface)](https://github.com/Codeturion/codesurface)
[![Languages](https://img.shields.io/badge/languages-C%23%20%7C%20TS%20%7C%20Java%20%7C%20Go%20%7C%20Python-blueviolet)](https://github.com/Codeturion/codesurface)
[![License: PolyForm Noncommercial](https://img.shields.io/badge/License-PolyForm%20Noncommercial%201.0.0-blue.svg)](https://polyformproject.org/licenses/noncommercial/1.0.0/)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![Blog Post](https://img.shields.io/badge/Blog-Benchmark%20Write--up-blue)](https://www.codeturion.me/blog/reducing-llm-agent-hallucinations-through-constrained-api-retrieval)

**MCP server that indexes your codebase's public API at startup and serves it via compact tool responses — saving tokens vs reading source files.**

codesurface parses source files, extracts public classes, methods, properties, fields, and events, and serves them through 5 MCP tools. It works with Claude Code, Cursor, Windsurf, or any MCP-compatible AI tool.

**Supported languages:** C# (`.cs`), Go (`.go`), Java (`.java`), Python (`.py`), TypeScript/TSX (`.ts`, `.tsx`)
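
To make "extracts public classes/methods" concrete, here is a minimal sketch for Python sources using the standard-library `ast` module. It is an illustration only, not the parser codesurface ships: the `public_api` helper, the record tuple layout, and the rule that a leading underscore means private are all assumptions.

```python
import ast

def public_api(source: str) -> list[tuple[str, str, int, int]]:
    """Return (kind, qualified_name, line_start, line_end) for public definitions."""
    records = []

    def visit(body, prefix):
        for node in body:
            if not isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                continue
            if node.name.startswith("_"):
                continue  # assumption: a leading underscore marks a private name
            if isinstance(node, ast.ClassDef):
                kind = "CLASS"
            else:
                kind = "METHOD" if prefix else "FUNCTION"
            name = f"{prefix}{node.name}"
            # lineno/end_lineno are 1-indexed and cover the whole definition
            records.append((kind, name, node.lineno, node.end_lineno))
            if isinstance(node, ast.ClassDef):
                visit(node.body, f"{name}.")  # recurse into class members only

    visit(ast.parse(source).body, "")
    return records

SAMPLE = '''
class MergeService:
    def try_merge(self, a, b):
        return a + b

    def _helper(self):
        pass

def _private():
    pass
'''

for record in public_api(SAMPLE):
    print(record)
```

Only `MergeService` and `MergeService.try_merge` are reported; `_helper` and `_private` are skipped.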

## Quick Start

Add to your `.mcp.json`:

```json
{
  "mcpServers": {
    "codesurface": {
      "command": "uvx",
      "args": ["codesurface", "--project", "/path/to/your/src"]
    }
  }
}
```

Point `--project` at any directory containing supported source files — a Unity `Assets/Scripts` folder, a Spring Boot project, a .NET `src/` tree, a Node.js/React project, a Python package, etc. Languages are auto-detected.
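
One plausible way to auto-detect languages is to map file extensions to parsers. The sketch below assumes exactly that; the `EXTENSION_TO_LANGUAGE` table and `detect_languages` function are hypothetical names, and only the extensions themselves come from the supported-languages list.

```python
from collections import Counter
from pathlib import PurePosixPath

# Extensions from the supported-languages list; the language labels are illustrative.
EXTENSION_TO_LANGUAGE = {
    ".cs": "csharp",
    ".go": "go",
    ".java": "java",
    ".py": "python",
    ".ts": "typescript",
    ".tsx": "typescript",
}

def detect_languages(paths):
    """Count supported source files per language and ignore everything else."""
    counts = Counter()
    for path in paths:
        language = EXTENSION_TO_LANGUAGE.get(PurePosixPath(path).suffix.lower())
        if language:
            counts[language] += 1
    return dict(counts)

print(detect_languages(["src/App.tsx", "src/api.ts", "server/Main.java", "README.md"]))
```

For a real directory, the same function could be fed `pathlib.Path(root).rglob("*")`.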

Restart your AI tool and ask: *"What methods does MyService have?"*

## CLAUDE.md Snippet

Add this to your project's `CLAUDE.md` (or equivalent instructions file). **This step is important.** Without it, the AI has the tools but won't know when to reach for them.

````markdown
## Codebase API Lookup (codesurface MCP)

Use codesurface MCP tools BEFORE Grep, Glob, Read, or Task (subagents) for any class/method/field lookup. This applies to you AND any subagents you spawn.

| Tool | Use when | Example |
|------|----------|---------|
| `search` | Find APIs by keyword | `search("MergeService")` |
| `get_signature` | Need exact signature | `get_signature("TryMerge")` |
| `get_class` | See all members on a class | `get_class("BlastBoardModel")` |
| `get_stats` | Codebase overview | `get_stats()` |

Every result includes file path + line numbers. Use them for targeted reads:
- `File: Service.cs:32` → `Read("Service.cs", offset=32, limit=15)`
- `File: Converter.java:504-506` → `Read("Converter.java", offset=504, limit=10)`

Never read a full file when you have a line number. Only fall back to Grep/Read for implementation details (method bodies, control flow).
````

## Tools

| Tool | Purpose | Example |
|------|---------|---------|
| `search` | Find APIs by keyword | "MergeService", "BlastBoard", "GridCoord" |
| `get_signature` | Exact signature by name or FQN | "TryMerge", "CampGame.Services.IMergeService.TryMerge" |
| `get_class` | Full class reference card — all public members | "BlastBoardModel" → all methods/fields/properties |
| `get_stats` | Overview of indexed codebase | File count, record counts, namespace breakdown |
| `reindex` | Incremental index update (mtime-based) | Only processes changed, new, or deleted files. Also runs automatically on query misses |
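
The mtime-based update behind `reindex` can be pictured as a diff between two `{path: mtime}` snapshots. This is a sketch under that assumption, not the server's actual code; `snapshot` and `diff_by_mtime` are hypothetical helpers.

```python
from pathlib import Path

def snapshot(root, suffixes=(".py",)):
    """Map each matching file under root to its last-modified time."""
    return {
        str(p): p.stat().st_mtime
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in suffixes
    }

def diff_by_mtime(previous, current):
    """Classify paths as added, changed, or deleted between two snapshots."""
    added = sorted(p for p in current if p not in previous)
    deleted = sorted(p for p in previous if p not in current)
    changed = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return added, changed, deleted

# Hand-written snapshots so the example does not depend on the file system.
previous = {"a.py": 100.0, "b.py": 100.0, "c.py": 100.0}
current = {"a.py": 100.0, "b.py": 250.0, "d.py": 300.0}
added, changed, deleted = diff_by_mtime(previous, current)
print(added, changed, deleted)
```

Only the added and changed files need parsing again; records for deleted files can simply be dropped.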

## Tested On

| Project | Language | Files | Records | Time |
|---------|----------|-------|---------|------|
| [vscode](https://github.com/microsoft/vscode) | TypeScript | 6,611 | 88,293 | 9.3s |
| [Paper](https://github.com/PaperMC/Paper) | Java | 2,909 | 33,973 | 2.3s |
| [client-go](https://github.com/kubernetes/client-go) | Go | 219 | 2,760 | 0.4s |
| [langchain](https://github.com/langchain-ai/langchain) | Python | 1,880 | 12,418 | 1.1s |
| [pydantic](https://github.com/pydantic/pydantic) | Python | 365 | 9,648 | 0.3s |
| [guava](https://github.com/google/guava) | Java | 891 | 8,377 | 2.4s |
| [immich](https://github.com/immich-app/immich) | TypeScript | 919 | 7,957 | 0.6s |
| [fastapi](https://github.com/tiangolo/fastapi) | Python | 881 | 5,713 | 0.5s |
| [ant-design](https://github.com/ant-design/ant-design) | TypeScript | 2,947 | 5,452 | 0.9s |
| [dify](https://github.com/langgenius/dify) | TypeScript | 4,903 | 5,038 | 1.9s |
| [crawlee-python](https://github.com/apify/crawlee-python) | Python | 386 | 2,473 | 0.3s |
| [flask](https://github.com/pallets/flask) | Python | 63 | 872 | <0.1s |
| [cobra](https://github.com/spf13/cobra) | Go | 15 | 249 | <0.1s |
| [gin](https://github.com/gin-gonic/gin) | Go | 41 | 574 | <0.1s |
| Unity game (private) | C# | 129 | 1,018 | 0.1s |

## Line Numbers for Targeted Reads

Every record includes `line_start` and `line_end` (1-indexed). Multi-line declarations span the full signature:

```
[METHOD] com.google.common.base.Converter.from
  Signature: static Converter<A, B> from(Function<...> forward, Function<...> backward)
  File: Converter.java:504-506          ← multi-line signature

[METHOD] server.AlbumController.createAlbum
  Signature: createAlbum(@Auth() auth: AuthDto, @Body() dto: CreateAlbumDto)
  File: album.controller.ts:46          ← single-line
```

This lets AI agents do **targeted reads** instead of reading full files:

```python
# Instead of reading the entire 600-line file:
Read("Converter.java")                     # 600 lines, ~12k tokens

# Read just the method + context:
Read("Converter.java", offset=504, limit=10)  # 10 lines, ~200 tokens
```
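
An agent-side helper could turn a `File:` line from a tool response into read arguments. The sketch below assumes the `File: path:start[-end]` format shown above; the `read_window` name and the 5-line default context are illustrative choices.

```python
import re

# Matches "File: <path>:<start>" with an optional "-<end>" suffix.
FILE_REF = re.compile(r"^File:\s*(?P<path>.+?):(?P<start>\d+)(?:-(?P<end>\d+))?\s*$")

def read_window(line: str, context: int = 5):
    """Turn 'File: path:start[-end]' into (path, offset, limit) for a targeted read."""
    match = FILE_REF.match(line.strip())
    if match is None:
        raise ValueError(f"not a file reference: {line!r}")
    start = int(match["start"])
    end = int(match["end"] or start)  # single-line records have no end
    limit = (end - start + 1) + context  # declaration lines plus trailing context
    return match["path"], start, limit

print(read_window("File: Converter.java:504-506"))
print(read_window("File: album.controller.ts:46"))
```

The returned offset and limit map directly onto a `Read(path, offset=..., limit=...)` call.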

## Benchmarks

Measured across 5 real-world projects in 5 languages, each using a 10-step cross-cutting research workflow.

![Total Tokens — Cross-Language Comparison](https://raw.githubusercontent.com/Codeturion/codesurface/master/docs/images/01-total-tokens.png)

| Language | Project | Files | Records | MCP | Skilled | Naive | MCP vs Skilled |
|----------|---------|------:|--------:|----:|--------:|------:|---------------:|
| C# | Unity game | 129 | 1,034 | **1,021** | 4,453 | 11,825 | 77% fewer |
| TypeScript | immich | 694 | 8,344 | **1,451** | 4,500 | 14,550 | 68% fewer |
| Java | guava | 891 | 8,377 | **1,851** | 4,200 | 26,700 | 56% fewer |
| Go | gin | 38 | 534 | **1,791** | 2,770 | 15,300 | 35% fewer |
| Python | codesurface | 9 | 40 | **753** | 2,000 | 10,400 | 62% fewer |
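
The "MCP vs Skilled" column follows from the token counts in the table. A short check, assuming the percentages are rounded to whole numbers:

```python
# (language, MCP tokens, skilled-agent tokens, naive-agent tokens) from the table above
ROWS = [
    ("C#", 1021, 4453, 11825),
    ("TypeScript", 1451, 4500, 14550),
    ("Java", 1851, 4200, 26700),
    ("Go", 1791, 2770, 15300),
    ("Python", 753, 2000, 10400),
]

def percent_fewer(mcp: int, baseline: int) -> int:
    """Token reduction relative to a baseline, rounded to a whole percent."""
    return round(100 * (1 - mcp / baseline))

for language, mcp, skilled, naive in ROWS:
    print(
        f"{language}: {percent_fewer(mcp, skilled)}% fewer than skilled, "
        f"{percent_fewer(mcp, naive)}% fewer than naive"
    )
```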

![Hallucination Risk](https://raw.githubusercontent.com/Codeturion/codesurface/master/docs/images/04-hallucination.png)

Even with follow-up reads for implementation detail, the hybrid MCP + targeted Read approach uses **44% fewer tokens** than a skilled Grep+Read agent and **87% fewer** than a naive agent:

![Hybrid Workflow](https://raw.githubusercontent.com/Codeturion/codesurface/master/docs/images/03-hybrid.png)

### Per-question breakdown

![Per Question](https://raw.githubusercontent.com/Codeturion/codesurface/master/docs/images/02-per-step.png)

See [workflow-benchmark.md](workflow-benchmark.md) for the full step-by-step analysis across all languages.

## Multiple Projects

Each server instance indexes the single directory passed to `--project`. To index multiple codebases, run separate instances with different server names:

```json
{
  "mcpServers": {
    "codesurface-backend": {
      "command": "uvx",
      "args": ["codesurface", "--project", "/path/to/backend/src"]
    },
    "codesurface-frontend": {
      "command": "uvx",
      "args": ["codesurface", "--project", "/path/to/frontend/src"]
    }
  }
}
```

Each instance gets its own in-memory index and tools. The AI agent sees both and can query across projects.
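
With more than a couple of projects, the config can be generated rather than written by hand. A small sketch, assuming the `codesurface-<name>` naming pattern used above; `mcp_config` is a hypothetical helper, not part of the package.

```python
import json

def mcp_config(projects: dict[str, str]) -> dict:
    """Build an .mcp.json-style mapping with one codesurface server per project."""
    return {
        "mcpServers": {
            f"codesurface-{name}": {
                "command": "uvx",
                "args": ["codesurface", "--project", path],
            }
            for name, path in projects.items()
        }
    }

config = mcp_config({
    "backend": "/path/to/backend/src",
    "frontend": "/path/to/frontend/src",
})
print(json.dumps(config, indent=2))
```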

## Setup Details

<details>
<summary>Alternative installation methods</summary>

**Using pip install:**
```bash
pip install codesurface
```
```json
{
  "mcpServers": {
    "codesurface": {
      "command": "codesurface",
      "args": ["--project", "/path/to/your/src"]
    }
  }
}
```

</details>

<details>
<summary>Project structure</summary>

```
codesurface/
├── src/codesurface/
│   ├── server.py           # MCP server — 5 tools
│   ├── db.py               # SQLite + FTS5 database layer
│   └── parsers/
│       ├── base.py         # BaseParser ABC
│       ├── csharp.py       # C# parser
│       ├── go.py           # Go parser
│       ├── java.py         # Java parser
│       ├── python_parser.py # Python parser
│       └── typescript.py   # TypeScript/TSX parser
├── pyproject.toml
└── README.md
```
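
For readers unfamiliar with FTS5, this is roughly what keyword search over API records looks like with Python's `sqlite3` module. The table name, columns, and sample rows are hypothetical and do not reflect the schema in `db.py`; the sketch also assumes the local SQLite build includes FTS5.

```python
import sqlite3

def build_index(rows):
    """Create an in-memory FTS5 table and load (fqn, kind, signature, file) rows."""
    conn = sqlite3.connect(":memory:")
    # "file" is stored but not searchable
    conn.execute("CREATE VIRTUAL TABLE api USING fts5(fqn, kind, signature, file UNINDEXED)")
    conn.executemany("INSERT INTO api VALUES (?, ?, ?, ?)", rows)
    return conn

def search(conn, term):
    """Return fully qualified names matching term, best match first."""
    # Quote the term as an FTS5 phrase so dots in names are not parsed as query syntax.
    phrase = '"' + term.replace('"', '""') + '"'
    cursor = conn.execute(
        "SELECT fqn FROM api WHERE api MATCH ? ORDER BY rank", (phrase,)
    )
    return [row[0] for row in cursor]

# Hypothetical sample records; signatures and file locations are made up.
conn = build_index([
    ("CampGame.Services.IMergeService.TryMerge", "METHOD",
     "bool TryMerge(Item a, Item b)", "IMergeService.cs:12"),
    ("BlastBoardModel", "CLASS", "class BlastBoardModel", "BlastBoardModel.cs:8"),
])
print(search(conn, "TryMerge"))
print(search(conn, "CampGame.Services"))
```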

</details>

<details>
<summary>Troubleshooting</summary>

**"No codebase indexed"**
- Ensure `--project` points to a directory containing supported source files (`.cs`, `.go`, `.java`, `.py`, `.ts`, `.tsx`)
- The server indexes at startup — check stderr for the "Indexed N records" message

**Server won't start**
- Check Python version: `python --version` (needs 3.10+)
- Check `mcp[cli]` is installed: `pip install "mcp[cli]"` (the quotes stop shells such as zsh from treating the brackets as a glob pattern)

**Stale results after editing source files**
- The index auto-refreshes on query misses — if you add a new class and query it, the server reindexes and retries automatically
- You can also call `reindex()` manually to force an incremental update
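
The refresh-on-miss behavior can be thought of as "look up, reindex once on a miss, look up again". The sketch below illustrates that pattern with a plain dict as the index; `lookup_with_refresh` and the `reindex` callback are hypothetical and not the server's API.

```python
def lookup_with_refresh(name, index, reindex):
    """Look up name; on a miss, refresh the index once and retry."""
    hit = index.get(name)
    if hit is None:
        reindex(index)  # hypothetical callback that updates the index in place
        hit = index.get(name)  # retry exactly once to avoid looping on true misses
    return hit

# Simulate a class that was added to disk after the index was built.
on_disk = {"NewClass": "class NewClass"}
index = {}
calls = []

def reindex(idx):
    calls.append("reindex")
    idx.update(on_disk)

print(lookup_with_refresh("NewClass", index, reindex))
print(lookup_with_refresh("DoesNotExist", index, reindex))
```

The first lookup succeeds after one refresh; the second still returns `None`, so a genuinely missing name costs one extra reindex rather than an endless retry.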

</details>

---

## Contact

fuatcankoseoglu@gmail.com

## License

[PolyForm Noncommercial 1.0.0](https://polyformproject.org/licenses/noncommercial/1.0.0/)

Free to use, fork, modify, and share for any personal or non-commercial purpose.
Commercial use requires permission.
