Metadata-Version: 2.4
Name: contextai-graph
Version: 0.2.0
Summary: Code graph extraction for LLM-assisted debugging
Author: Shobhit Goel
License: MIT
Project-URL: Homepage, https://github.com/shobhit0521/ContextAI
Project-URL: Repository, https://github.com/shobhit0521/ContextAI
Keywords: code-graph,static-analysis,llm,mcp,ast,debugging
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Quality Assurance
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.0.0
Requires-Dist: networkx>=3.0
Requires-Dist: pyvis>=0.3.0
Dynamic: license-file

# ContextAI

> **Code graph extraction for LLM-assisted debugging.**
> Turn a codebase into a directed property graph so an LLM gets *surgical, structured, bounded* context — the broken node plus its neighbors — instead of the whole repository.

<p align="left">
  <img alt="Python" src="https://img.shields.io/badge/python-3.11%2B-blue" />
  <img alt="Tests" src="https://img.shields.io/badge/tests-153%20passing-brightgreen" />
  <img alt="Status" src="https://img.shields.io/badge/status-alpha-orange" />
</p>

---

## Why

LLMs fail on large codebases for three reasons:

- **Too much context** → noise, confusion, hallucination
- **Too little context** → blind spots, wrong fixes
- **No structure** → the model can't see how components relate

Bugs don't live in isolation — they live in the *relationships* between components. ContextAI makes those relationships explicit and traversable by representing every meaningful piece of code as a **node** and every relationship as an **edge**. When debugging, you feed the LLM only the broken node, its adjacent nodes, and the connecting edges.

---

## What it does

```
   source code ──▶ extraction pipeline ──▶ directed property graph ──▶ bounded LLM context
                   (AST · framework ·          (nodes + edges,           (a node + its
                    patterns · runtime)         fully typed schema)        neighborhood)
```

ContextAI scans a Python project — `.py` modules **and `.ipynb` notebooks** — and emits a typed graph (`graph.json`) plus an interactive visualization (`graph.html`). Each node carries its signature, side effects, error-handling profile, and complexity; each edge carries its contract, criticality, and failure behavior.
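
One way to picture the notebook path is to flatten the code cells into a single source string while remembering where each cell starts, so locations stay cell-aware. The sketch below is an illustration only, not the project's `notebook_extractor.py`: the helper names `flatten_notebook` and `cell_for_line` are hypothetical, and the only thing assumed about the input is the standard nbformat v4 layout (`cells`, `cell_type`, `source`).

```python
import ast
import json


def flatten_notebook(nb_json):
    """Join all code cells into one source string.

    Returns (source, starts) where starts is a list of
    (first_line_in_flattened_source, cell_index) pairs.
    """
    notebook = json.loads(nb_json)
    lines = []
    starts = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        src = cell.get("source", "")
        # nbformat stores source as a string or as a list of line strings
        text = "".join(src) if isinstance(src, list) else src
        starts.append((len(lines) + 1, index))
        lines.extend(text.splitlines())
    return "\n".join(lines), starts


def cell_for_line(starts, lineno):
    """Map a line number in the flattened source back to its cell index."""
    cell = None
    for first_line, index in starts:
        if first_line <= lineno:
            cell = index
    return cell


demo_notebook = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title\n"]},
        {"cell_type": "code", "source": ["def f():\n", "    return 1\n"]},
        {"cell_type": "code", "source": "x = f()"},
    ]
})

source, starts = flatten_notebook(demo_notebook)
tree = ast.parse(source)  # the flattened source is ordinary Python
```

Cells containing IPython magics (for example `%matplotlib inline`) are not valid Python and would need to be filtered or rewritten before `ast.parse`; this sketch does not attempt that.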

---

## Quickstart

**Requirements:** Python 3.11+

```bash
# install
pip install -e .          # or: pip install -r requirements.txt

# extract a graph from any file or directory
contextai path/to/your/project/      # or: python3 run.py path/to/your/project/

# outputs (under out/, gitignored):
#   out/graph.json   — all nodes + edges
#   out/index.html   — dashboard: coverage · connectivity health · interactive graph
#   out/graph.html   — raw interactive graph (pyvis, self-contained)
```

Open **`out/index.html`** for the full picture — extraction coverage, connectivity health
(fragmentation, isolated nodes, sinks/sources), node-type distribution, and the live graph
with a code/location details panel, all in one self-contained file.

Run it against the bundled benchmark (the VS Code Flask tutorial):

```bash
python3 run.py python-sample-vscode-flask-tutorial/
```
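
After a run, a quick sanity check is to count nodes and edges by type. The sketch below assumes, without confirmation from the schema, that `graph.json` is an object with top-level `nodes` and `edges` lists whose items carry a `type` field; adjust the keys if the real file differs. The `summarize` helper is hypothetical.

```python
import json
from collections import Counter


def summarize(graph):
    """Count nodes and edges per type in a loaded graph dict.

    Assumes top-level "nodes" and "edges" lists (an assumption, see above).
    """
    node_types = Counter(n.get("type", "UNKNOWN") for n in graph.get("nodes", []))
    edge_types = Counter(e.get("type", "UNKNOWN") for e in graph.get("edges", []))
    return {"nodes": dict(node_types), "edges": dict(edge_types)}


# In-memory stand-in for: json.load(open("out/graph.json"))
sample = json.loads("""
{
  "nodes": [
    {"id": "app.home", "type": "ROUTE_HANDLER"},
    {"id": "app.about", "type": "ROUTE_HANDLER"},
    {"id": "app.helper", "type": "FUNCTION"}
  ],
  "edges": [
    {"id": "e1", "type": "CALLS", "from": "app.home", "to": "app.helper"}
  ]
}
""")

summary = summarize(sample)
```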

---

## Architecture

| Layer | Tools | Gets you | Status |
|---|---|---|---|
| **1. AST + types** | Python `ast` (`.py` + `.ipynb`) | functions, classes, imports, call sites, signatures, side effects | ✅ Implemented |
| **2. Framework conventions** | custom parsers | route → handler, templates, static assets (Flask) | ✅ Flask; others planned |
| **2.5. Pattern matching** | regex on source | DB tables, hardcoded URLs, cache ops, template refs | 🟡 Partial |
| **3. Runtime tracing** | `sys.settrace` + asyncio hooks | actual call chains, dynamic dispatch, async fan-out | ✅ Implemented |
| **4. LLM pass** | Claude / GPT | semantic intent, implicit relationships | ⏳ Planned |

No single method captures everything: **static analysis** sees what code *says*, **runtime tracing** sees what it *does*. ContextAI merges both into one graph.
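
To make layer 2 in the table above concrete, here is a minimal sketch of recovering route → handler pairs from Flask-style decorators with the standard `ast` module. It is not the project's extractor; `find_routes` is a hypothetical helper, and it only recognizes the literal form `@<something>.route("<path>")`.

```python
import ast

SOURCE = '''
@app.route("/")
def home():
    return "hi"

@app.route("/about/")
def about():
    return "about"

def helper():
    pass
'''


def find_routes(source):
    """Return (path, handler_name) pairs for functions decorated with .route("...")."""
    routes = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for dec in node.decorator_list:
            is_route_call = (
                isinstance(dec, ast.Call)
                and isinstance(dec.func, ast.Attribute)
                and dec.func.attr == "route"
            )
            if not is_route_call or not dec.args:
                continue
            first = dec.args[0]
            # Only literal string paths are resolvable statically
            if isinstance(first, ast.Constant) and isinstance(first.value, str):
                routes.append((first.value, node.name))
    return routes


routes = find_routes(SOURCE)
```

Paths built at runtime (variables, f-strings, `add_url_rule` calls) are invisible to this check, which is one reason the static layers are paired with runtime tracing.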

### Pipeline

```
run.py
  ├─ Pass 1: walk files → emit NODES   (ast_extractor, flask_convention_extractor)
  └─ Pass 2: resolve IDs → emit EDGES  (edge_extractor)

graph/extractors/runtime/   ← trace a running app and merge real call chains
  ├─ tracer.py              sys.settrace + asyncio monkey-patches
  ├─ call_log.py            captured calls → call_log.json
  ├─ script_runner.py       drive a target script/entry point under the tracer
  └─ edge_injector.py       merge runtime calls into the static graph
```
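
The two-pass shape above can be sketched in a few lines: nodes must exist before edges, because an edge is only emitted when its target resolves to a known node ID. This is a simplified illustration using only the standard library; the ID format (`module.name`) and the helpers `extract_nodes` / `extract_edges` are assumptions, not the project's actual functions.

```python
import ast

SOURCE = '''
def load(path):
    return open(path).read()

def main():
    data = load("a.txt")
    print(data)
'''


def extract_nodes(source, module="demo"):
    """Pass 1: one FUNCTION node per function definition, keyed by ID."""
    nodes = {}
    for item in ast.walk(ast.parse(source)):
        if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
            node_id = f"{module}.{item.name}"
            nodes[node_id] = {
                "id": node_id,
                "type": "FUNCTION",
                "name": item.name,
                "line": item.lineno,
                "ast": item,  # kept so pass 2 can walk the body
            }
    return nodes


def extract_edges(nodes, module="demo"):
    """Pass 2: resolve plain-name calls to known node IDs, emit CALLS edges."""
    edges = []
    for node_id, node in nodes.items():
        for item in ast.walk(node["ast"]):
            if isinstance(item, ast.Call) and isinstance(item.func, ast.Name):
                target = f"{module}.{item.func.id}"
                if target in nodes:  # builtins such as print() do not resolve
                    edges.append({"type": "CALLS", "from": node_id, "to": target})
    return edges


nodes = extract_nodes(SOURCE)
edges = extract_edges(nodes)
```

Here `main` calls both `load` and `print`, but only `load` resolves to a node, so a single `CALLS` edge is produced. Method calls, imports, and cross-module resolution are deliberately left out of the sketch.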

---

## The graph schema

[`schema.py`](schema.py) is the source of truth (Pydantic v2).

**Nodes** carry: `id`, `type`, `name`, `location`, `code`, `signature` (typed inputs/outputs), `side_effects`, `error_handling`, and `metadata` (complexity, test coverage, staleness).

**Edges** carry: `id`, `type`, `from`, `to`, `direction`, `contract` (input/output shape), `criticality`, `on_failure` (retry / default / throw / circuit-break), and `performance`.
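
For orientation, the field lists above can be mirrored with plain dataclasses. This is a stand-in, not the real schema: `schema.py` uses Pydantic v2, and the field types and defaults below are guesses. Only the field names and the `on_failure` options come from the description above.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Node:
    id: str
    type: str                      # e.g. "FUNCTION", "API_ENDPOINT"
    name: str
    location: Dict[str, Any]       # assumed shape, e.g. file + line range
    code: str = ""
    signature: Dict[str, Any] = field(default_factory=dict)
    side_effects: List[str] = field(default_factory=list)
    error_handling: Dict[str, Any] = field(default_factory=dict)
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    id: str
    type: str                      # e.g. "CALLS", "IMPORTS"
    from_: str                     # "from" is a Python keyword, renamed here
    to: str
    direction: Optional[str] = None
    contract: Dict[str, Any] = field(default_factory=dict)
    criticality: Optional[str] = None
    on_failure: Optional[str] = None   # retry / default / throw / circuit-break
    performance: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self):
        """Serialize with the on-disk key name "from" instead of "from_"."""
        data = asdict(self)
        data["from"] = data.pop("from_")
        return data


node = Node(id="demo.load", type="FUNCTION", name="load",
            location={"file": "demo.py", "line": 2})
edge = Edge(id="e1", type="CALLS", from_="demo.main", to="demo.load",
            on_failure="retry")
edge_dict = edge.to_dict()
```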

<details>
<summary><b>Node types</b></summary>

- **Boundary:** `API_ENDPOINT`, `MESSAGE_CONSUMER`, `CRON_JOB`
- **Logic:** `FUNCTION`, `CLASS`, `MIDDLEWARE`, `ROUTE_HANDLER`
- **Data:** `SCHEMA`, `MODEL`, `DTO`, `DATABASE`, `TABLE`, `COLLECTION`
- **Infra:** `MESSAGE_QUEUE`, `FILE_STORAGE`, `EXTERNAL_LIBRARY`
- **Scaffolding:** `MODULE_INIT`, `ENTRY_POINT`, `FILE`, `TEMPLATE`, `STATIC_ASSET`
</details>

<details>
<summary><b>Edge types</b></summary>

| Category | Edges |
|---|---|
| Call | `CALLS`, `CALLS_ASYNC`, `DELEGATES_TO` |
| Dependency | `IMPORTS`, `IMPORTS_SIDE_EFFECT`, `INHERITS`, `IMPLEMENTS`, `INSTANTIATES`, `INJECTS` |
| Data | `READS`, `WRITES`, `VALIDATES`, `TRANSFORMS`, `MAPS_TO`, `RETURNS` |
| Communication | `HANDLES`, `GUARDS`, `PUBLISHES_TO`, `SUBSCRIBES_TO`, `CALLS_EXTERNAL`, `RENDERS`, `SERVES_STATIC` |
| App wiring | `USES_APP_INSTANCE` |

</details>

---

## Analysis tools

Each tool reads `graph.json` from the current directory by default; note that the Quickstart run writes its graph to `out/graph.json`.

```bash
python3 tools/graph_connectivity.py     # health score, isolated nodes, islands
python3 tools/coverage_from_graph.py    # % of source lines covered by nodes
python3 tools/graph_duplicates.py       # duplicate / overlapping node ranges
python3 tools/diff_graph.py a.json b.json   # diff two graphs
```
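
As a rough idea of what a connectivity check computes, the sketch below finds isolated nodes and islands (weakly connected components) from a list of node IDs and directed edges. It uses only the standard library and is not the code in `tools/graph_connectivity.py`; the `connectivity` function and its return shape are hypothetical, and no health score is computed.

```python
from collections import defaultdict


def connectivity(node_ids, edges):
    """Return isolated nodes and islands, ignoring edge direction.

    node_ids: iterable of node IDs
    edges: iterable of (source_id, target_id) pairs between known nodes
    """
    neighbors = defaultdict(set)
    for src, dst in edges:
        if src == dst:
            continue  # a self-loop does not connect a node to anything else
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    isolated = sorted(n for n in node_ids if not neighbors[n])

    seen = set()
    islands = []
    for start in node_ids:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # iterative depth-first search
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        islands.append(sorted(component))
    return {"isolated": isolated, "islands": islands}


report = connectivity(
    ["a", "b", "c", "d"],
    [("a", "b"), ("b", "c")],
)
```

A graph with many small islands usually means edges are being missed, which is the kind of fragmentation the dashboard's connectivity view is meant to surface.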

### Runtime tracing

```bash
python3 tools/run_with_tracing.py \
  --target your_app/main.py \
  --project-root your_app/ \
  --build-static \
  --output graph_runtime.json
```

Runs your app under the tracer, then merges observed call chains into the static graph (confirming static edges, filling gaps, and adding runtime-only edges).
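
The core mechanism is `sys.settrace`, which invokes a callback whenever a Python function is entered. The sketch below records caller → callee pairs for a single function call; it is a simplified illustration, not the project's `tracer.py`, and it leaves out the asyncio hooks and project-root filtering. The `trace_calls` helper is hypothetical.

```python
import sys
from collections import Counter


def trace_calls(func, *args, **kwargs):
    """Run func under sys.settrace and count (caller, callee) name pairs."""
    calls = Counter()

    def tracer(frame, event, arg):
        if event == "call":
            caller = frame.f_back.f_code.co_name if frame.f_back else "<root>"
            calls[(caller, frame.f_code.co_name)] += 1
        return None  # no per-line tracing inside the frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return result, calls


def helper(x):
    return x * 2


def entry():
    return helper(1) + helper(2)


result, calls = trace_calls(entry)
```

Each recorded pair corresponds to a candidate `CALLS` edge, and the count shows how often that edge was exercised during the run.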

---

## Public API

[`graph/api.py`](graph/api.py) is the only surface the MCP server (and any other client) should import — never reach into extractors or `GraphStore` directly.

```python
from graph.api import (
    build_graph, load_graph, find_node, get_context, list_gaps, get_edge_path,
    run_trace, merge_trace, start_trace, stop_trace,
)
```

**Querying.** `get_context(store, node_id, depth=2, direction="in")` returns a bounded subgraph around a node. It is **incoming-biased** by default: deep predecessors (who calls this — the blast radius), one shallow successor hop, and a successor pull around gap nodes. Pass `direction="out"` / `"both"` to change the bias.
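
The incoming-biased default can be approximated as a breadth-first walk over predecessors up to `depth`, plus one hop of direct successors. The sketch below shows that idea on a plain edge list. It is not `get_context` itself: the `bounded_context` function is hypothetical and it omits the successor pull around gap nodes.

```python
from collections import deque


def bounded_context(edges, node_id, depth=2):
    """Select node_id, its predecessors up to `depth`, and its direct successors.

    edges: list of (source_id, target_id) pairs
    Returns (selected_node_ids, edges_between_selected_nodes).
    """
    preds, succs = {}, {}
    for src, dst in edges:
        preds.setdefault(dst, set()).add(src)
        succs.setdefault(src, set()).add(dst)

    selected = {node_id}
    queue = deque([(node_id, 0)])
    while queue:
        current, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand past the depth bound
        for parent in preds.get(current, ()):
            if parent not in selected:
                selected.add(parent)
                queue.append((parent, dist + 1))

    selected |= succs.get(node_id, set())  # one shallow successor hop

    kept = [(s, d) for s, d in edges if s in selected and d in selected]
    return selected, kept


chain = [("x", "a"), ("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
selected, kept = bounded_context(chain, "c", depth=2)
```

With `depth=2` around `c`, the callers `b` and `a` are included, `x` is three hops away and is dropped, and only the immediate successor `d` is pulled in.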

**Runtime tracing** has three capture modes, all converging on one merge:

| Mode | Entry point | Use it for |
|---|---|---|
| One-shot script / IDE run | `run_trace(target, project_root, …)` | capture + merge a single script or entry point in one call |
| Long-running session | `start_trace(project_root)` … `stop_trace(project_root, base_graph, output)` | a server or worker traced across **many requests** without restarting — every call in between is unioned into one capture |
| Per-request web | `TracingMiddleware` / `AsyncTracingMiddleware` ([`tools/trace_middleware.py`](tools/trace_middleware.py)) | trace one request at a time, triggered by an `X-Trace: 1` header |

All three feed `merge_trace(base_graph, call_log, project_root, output)` — the single seam that folds a runtime call log onto a base graph. The base is a parameter: pass the static graph to merge a single action, or a prior runtime graph to **accumulate a sequence** of actions (call counts sum, edges union). The static graph is never mutated — every merge writes a fresh overlay.
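
The merge semantics described above (call counts sum, edges union, base never mutated) can be sketched on plain dictionaries. This is an illustration of the idea, not `merge_trace`: the data shapes, the `source` labels, and the `merge_call_log` helper are all assumptions.

```python
import copy


def merge_call_log(base_edges, call_log):
    """Fold runtime call counts onto a base edge map and return a new map.

    base_edges: {(caller, callee): edge_dict}
    call_log:   {(caller, callee): observed_call_count}
    """
    merged = copy.deepcopy(base_edges)  # the base graph is left untouched
    for pair, count in call_log.items():
        edge = merged.get(pair)
        if edge is None:
            # Seen only at runtime, e.g. dynamic dispatch
            merged[pair] = {"type": "CALLS", "source": "runtime", "call_count": count}
        else:
            edge["call_count"] = edge.get("call_count", 0) + count
            if edge.get("source") == "static":
                edge["source"] = "static+runtime"  # static edge confirmed
    return merged


static_graph = {("main", "load"): {"type": "CALLS", "source": "static"}}

# First action: confirms one static edge and discovers a runtime-only edge
first = merge_call_log(static_graph, {("main", "load"): 2, ("main", "plugin"): 1})

# Second action merged onto the prior runtime graph: counts accumulate
second = merge_call_log(first, {("main", "load"): 3})
```

Passing `first` as the base for the second merge is what accumulates a sequence of actions; passing `static_graph` again would instead produce an independent overlay for that one action.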

---

## Testing

```bash
pip install pytest
pytest -q
```

The suite (**153 tests**) follows an *inductive* strategy: every atomic extraction pattern (structural, web/API, data access, messaging, signatures, data flow, async runtime) has a minimal fixture and an exact-count assertion. The premise is that an extractor which handles every base case correctly should also handle their combinations.

```
tests/
  test_static_induction.py   structural · web · data · messaging · signatures · data flow
  test_phase2_edges.py       call resolution, dynamic dispatch, super(), properties
  test_phase3_runtime.py     tracer capture + edge-injector merge + end-to-end
  test_phase3_http.py        HTTP / async routing patterns
  test_runtime_api.py        public API: direction-aware context, merge/run/session tracing
  test_notebook_extractor.py .ipynb flattening + node/edge extraction across cells
  test_dashboard.py          metrics builder + self-contained dashboard generation
  fixtures/                  minimal atomic patterns per test
```
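
A base-case test in this style pairs a tiny fixture with an exact expected count, so any over- or under-extraction fails loudly. The example below is a hypothetical stand-in that counts definitions with the standard `ast` module; the real tests exercise the project's extractors, and `count_definitions` is not part of the codebase.

```python
import ast

# Minimal fixture: exactly one class containing exactly one method
FIXTURE = '''
class Repo:
    def get(self, key):
        return key
'''


def count_definitions(source):
    """Count class and function definitions in a source string."""
    counts = {"CLASS": 0, "FUNCTION": 0}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            counts["CLASS"] += 1
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            counts["FUNCTION"] += 1
    return counts


def test_single_class_with_one_method():
    # Exact-count assertion: not "at least one", but precisely these totals
    assert count_definitions(FIXTURE) == {"CLASS": 1, "FUNCTION": 1}


test_single_class_with_one_method()
```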

---

## Project structure

```
schema.py                       NodeSchema + EdgeSchema (Pydantic, source of truth)
run.py                          entry point: extract → store → visualize (contextai CLI)
graph/
  api.py                        public API consumed by the MCP server
  extractors/
    ast_extractor.py            Python AST → nodes (functions, classes, schemas, …)
    edge_extractor.py           all edge types
    notebook_extractor.py       .ipynb → flatten code cells → reuse AST pipeline
    flask_convention_extractor.py  templates + static assets
    runtime/                    sys.settrace tracer + call log + script runner + edge injector
  store/graph_store.py          NetworkX graph + JSON persistence + direction-aware neighbor traversal
  visualizer/visualizer.py      pyvis HTML output (out/graph.html)
tools/                          connectivity, coverage, duplicates, diff, tracing
benchmarks/flask-tutorial/      hand-authored ground-truth graph (diff target)
dashboard/                      self-contained dashboard → out/index.html
  metrics.py                    reuses the coverage + connectivity tools → one payload
  dashboard.py / template.html  embed graph + metrics into a single HTML file
out/                            generated artifacts (gitignored)
docs/                           design + planning notes
tests/                          inductive test suite + fixtures
```

---

## Roadmap

- [x] Static extraction (AST) — nodes, edges, signatures, side effects
- [x] Flask framework conventions (routes, templates, static assets)
- [x] Runtime tracing (sync + async call chains)
- [x] Inductive test suite (153 tests)
- [x] Jupyter `.ipynb` notebook extraction — code cells → AST pipeline, with cell-aware locations
- [ ] **MCP server** — expose the graph to LLM clients as a tool ([`docs/MCP_SERVER_PLAN.md`](docs/MCP_SERVER_PLAN.md))
- [ ] **LLM integration** — neighborhood-context retrieval for debugging
- [ ] More frameworks (Django, FastAPI), git/version metadata, derived impact edges (`AFFECTS`, `DEPENDS_ON`, `TRIGGERS`)
- [ ] Multi-language extraction (JS/TS → `UI_COMPONENT`)

See [`docs/`](docs/) for design and planning notes.

---

## Status

**Alpha.** The static extractor and runtime tracer work and are covered by tests. The LLM/MCP consumption layer — the part that turns the graph into better debugging answers — is in active development.

---

## License

Released under the MIT License. See the `LICENSE` file for the full text.
