Metadata-Version: 2.4
Name: intentic-ike
Version: 0.5.1
Summary: intentic Knowledge Engine — context-efficient knowledge serving for AI agents
Author-email: intentic <hello@intentic.io>
License: MIT
Project-URL: Homepage, https://github.com/pedrams/ike
Project-URL: Repository, https://github.com/pedrams/ike
Project-URL: Documentation, https://github.com/pedrams/ike/blob/main/docs/build-with-ai.md
Project-URL: Issues, https://github.com/pedrams/ike/issues
Project-URL: Changelog, https://github.com/pedrams/ike/releases
Keywords: knowledge-engine,mcp,ai-agents,knowledge-base,context-engineering,model-context-protocol
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.1
Requires-Dist: python-frontmatter>=1.1
Requires-Dist: markdown-it-py>=3.0
Requires-Dist: tiktoken>=0.8
Requires-Dist: filelock>=3.16
Requires-Dist: mdformat>=0.7.17
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == "dev"
Requires-Dist: pytest-cov>=6.0; extra == "dev"
Provides-Extra: mcp
Requires-Dist: fastmcp>=2.0; extra == "mcp"
Provides-Extra: source
Requires-Dist: tree-sitter-language-pack==1.5.0; extra == "source"
Requires-Dist: networkx>=3.0; extra == "source"
Dynamic: license-file

# ike — intentic Knowledge Engine

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://python.org)
[![PyPI](https://img.shields.io/pypi/v/intentic-ike.svg)](https://pypi.org/project/intentic-ike/)

**Make your knowledge base and source code queryable by any AI tool. One command.**

```bash
pip install intentic-ike[mcp]
ike init --kb-root ./docs
```

ike scans your docs, fixes quality issues, and generates config files so that Claude Code, Cursor, Codex, Windsurf, and every other AI coding tool can query your knowledge base with precision. With the `[source]` extra, ike also indexes source code, routing via PageRank and fetching at the symbol level.

**Before ike:** Your AI tool reads entire files, wastes context, gets confused.
**After ike:** Your AI tool asks for exactly the section or function it needs. 80-95% fewer tokens.

```
# Without ike: 17,500 tokens (4 full files dumped into context)
$ cat company/vision.md company/icp.md company/positioning.md company/messaging.md | wc -w

# With ike: 3,088 tokens (2 targeted sections)
$ ike route "ICP definition"
→ company/icp-definition.md#ideal-customer-profile-icp (2,836 tokens)
$ ike fetch company/icp-definition.md --section ideal-customer-profile-icp
```

## Why ike Exists

I run multiple AI coding tools across several repos. 81 knowledge documents. Strategy, ICP definitions, architecture decisions, operational workflows. My AI tools already had access to all of it through a <a href="https://intentic.io/single-prompt-ai-analysis-broken" target="_blank">structured knowledge base</a>. The problem wasn't access. It was efficiency.

Every time an AI tool needed context, it loaded entire files. Full documents dumped into the context window when only one section was relevant. The right knowledge existed, but there was no way to deliver the right amount of context at the right time to the right agent.

When an AI tool needed my ICP definition, it read the entire file. 7,306 tokens. The relevant section is 807 tokens. 89% waste. Across every knowledge lookup in every session across every tool, that adds up fast. Context window spent on content the AI doesn't need is context window unavailable for reasoning.

ike solved this. I measured the difference across my real knowledge base:

| Scenario | Without ike | With ike | Savings |
|---|---|---|---|
| Signal analysis (4 company files) | 17,500 tokens | 3,088 tokens | 82% |
| Ripple analysis (2 architecture files) | 7,715 tokens | 1,334 tokens | 83% |
| Architecture lookup (1 section) | 4,825 tokens | 259 tokens | 95% |
| Vision section (from 7K doc) | 7,306 tokens | 807 tokens | 89% |

Real queries. Real knowledge base. Not a synthetic benchmark.

### What ike actually does

ike sits between your AI tools and your documentation. Instead of reading files, your AI tool asks ike what's relevant, gets back a ranked list with token counts, and fetches only the sections it needs.

Two steps. Route first (~100 tokens), then fetch what matters.

```
Route: "What do you have about deployment?"
→ engineering/architecture.md#deployment (842 tokens)

Fetch: "Give me that section."
→ The actual content, nothing else.
```

The AI sees the token cost before loading anything. It decides what's worth the context budget. No full-file dumps.

For source code, ike parses your codebase into an AST, builds a dependency graph between files, ranks them by importance using PageRank, and lets your AI tool fetch a single function by name. Not the file. The function.

### The setup

One command. `ike init --kb-root ./docs`. ike scans your Markdown files, builds a search index, and generates config files that Claude Code, Cursor, Codex, Windsurf, and 10+ other tools auto-discover. No manual wiring.

Docs need headers (H2+) and ideally YAML frontmatter. If they don't have frontmatter, `ike doctor --yes` infers it from the content. Title from H1, domain from directory, summary from first paragraph. Non-destructive, metadata only.
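
The inference heuristic is simple enough to sketch. This is an illustrative approximation of what `ike doctor` does (the function name and exact rules here are assumptions, not ike's actual code): title from the first H1, domain from the parent directory, summary from the first paragraph.

```python
import re
from pathlib import Path

def infer_frontmatter(md_path: Path) -> dict:
    """Approximate ike doctor's inference: title from H1,
    domain from directory, summary from first paragraph.
    Illustrative sketch only, not ike's implementation."""
    text = md_path.read_text(encoding="utf-8")
    # Title: first level-1 heading, else a titleized filename
    h1 = re.search(r"^# (.+)$", text, re.MULTILINE)
    title = h1.group(1).strip() if h1 else md_path.stem.replace("-", " ").title()
    # Domain: the file's immediate parent directory
    domain = md_path.parent.name or "general"
    # Summary: first non-heading, non-empty paragraph (truncated)
    summary = ""
    for block in text.split("\n\n"):
        block = block.strip()
        if block and not block.startswith("#"):
            summary = " ".join(block.split())[:200]
            break
    return {"title": title, "domain": domain, "summary": summary}
```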

### Why this matters

Context window is the bottleneck. Not model intelligence, not speed, not cost per token. When an AI tool runs out of context, it starts dropping information or hallucinating. Every token wasted on irrelevant content is a token unavailable for reasoning.

80-95% savings per lookup means your AI tools can work with your entire knowledge base instead of running out of context after two files. Write your knowledge once, query it from any AI tool.

## Quick Start

### Option A: Let your AI tool do it

Open your AI editor and paste this prompt:

<details>
<summary><strong>Copy this prompt into Claude Code, Cursor, Codex, or any AI coding tool</strong></summary>

```
I need to set up ike (intentic Knowledge Engine) to make this repository's
documentation queryable by AI tools.

ike is a CLI + MCP server that indexes Markdown files and serves them via
2-step retrieval: route (find relevant sections, ~100 tokens) then fetch
(load specific content). This is much more efficient than reading entire
files. 80-95% token savings.

Repo: https://github.com/pedrams/ike
Docs: https://github.com/pedrams/ike/blob/main/docs/build-with-ai.md

Please do the following:

1. INSTALL: Run `pip install intentic-ike[mcp]`
   - Requires Python 3.12+
   - The [mcp] extra includes the MCP server for IDE integration

2. FIND MY DOCS: Look for the directory containing my Markdown documentation.
   Common locations: ./docs, ./knowledge, ./wiki, or the repo root.
   List the .md files you find so I can confirm.

3. INITIALIZE: Run `ike init --kb-root <path-to-docs>`
   This will:
   - Scan all .md files and report quality (ready/fixable/needs review)
   - Generate .mcp.json (MCP server config for Claude Code, Cursor, etc.)
   - Generate AGENTS.md (universal AI tool discovery file)
   If .mcp.json already exists, ike merges. It won't overwrite your config.

4. FIX DOCS: Run `ike doctor --yes`
   This auto-adds missing YAML frontmatter (title, domain, summary) to docs
   that need it. Only metadata is added. Document content is never changed.
   If you prefer to review each fix: run `ike doctor` without --yes.

5. FIX CROSS-REFERENCES: Run `ike doctor --cross-refs`
   This analyzes orphaned documents (no incoming links) and suggests which
   docs should link to each other based on shared domains and keywords.
   For each suggestion:
   - Read both documents to understand their relationship
   - Add a `## Related` section at the bottom with Markdown links,
     or add inline links where they fit naturally in the content
   - Only add links that make semantic sense. Skip weak matches
   After adding links, run `ike lint` to verify orphan count decreased.

6. VERIFY: Run these commands and show me the output:
   - `ike route "test"` (should return matching sections with token counts)
   - `ike lint` (should show remaining quality issues, if any)
   - `cat .mcp.json` (should show the MCP server config)
   - `cat AGENTS.md` (should show the AI tool discovery file)

7. DONE: Tell me to restart my AI tool so the MCP connection activates.

If any step fails:
- "ike: command not found" → pip install didn't work, try: python -m ike --help
- "ike serve fails with ImportError" → install with MCP: pip install intentic-ike[mcp]
- "route returns no results" → run: ike index --rebuild
- "doctor changes too much" → use: ike doctor (interactive) instead of --yes
```

</details>

### Option B: Manual setup (4 commands)

```bash
pip install intentic-ike[mcp]       # 1. Install
cd ~/my-repo
ike init --kb-root ./docs           # 2. Scan docs, generate .mcp.json + AGENTS.md
ike doctor --yes                    # 3. Fix missing frontmatter
ike route "test"                    # 4. Verify
# Restart your AI tool
```

## How It Works

**2-step retrieval:** Route first (~100 tokens), then fetch only what you need.

```bash
# Step 1: Find relevant sections
$ ike route "deployment strategy"
{
  "chunks": [
    {"file_path": "engineering/architecture.md",
     "section_id": "deployment", "score": 3.0, "token_count": 842}
  ]
}

# Step 2: Load only what you need
$ ike fetch engineering/architecture.md --section deployment
## Deployment
CI/CD pipeline deploys to Hetzner Cloud...
```

The AI tool decides what to load based on token counts. No context wasted.

## Source Code Intelligence

Index source code for symbol-level retrieval. Same 2-step flow, but for code.

```bash
pip install intentic-ike[source]        # adds tree-sitter + networkx
ike index --source ./src                # index source code via AST
ike route "auth middleware" --source     # PageRank-weighted code routing
ike fetch src/auth.py --symbol AuthService.validate_token  # exact method
ike symbols --file auth.py              # list all symbols
```

**How it works:**
1. tree-sitter parses source files into AST, extracts functions/classes/methods as chunks
2. A name-based Def/Ref graph links files that share symbol names (same language only)
3. PageRank ranks files by importance. Defining files score higher than files that merely reference them
4. At query time: `score = 0.7 * pagerank + 0.3 * keyword_match`
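
The steps above can be sketched with networkx (the same graph library the `[source]` extra installs). This is a simplified stand-in, not ike's actual code: the keyword scoring and graph construction here are assumptions, but the combining formula matches step 4.

```python
import networkx as nx

def rank_files(edges: list[tuple[str, str]],
               keyword_hits: dict[str, float]) -> dict[str, float]:
    """Combine PageRank over a Def/Ref graph with keyword match scores.

    edges: (referencing_file, defining_file) pairs -- each edge points
    at the file that defines a shared symbol name, so defining files
    accumulate rank (step 3).
    keyword_hits: per-file keyword match score in [0, 1] (simplified).
    """
    g = nx.DiGraph(edges)
    pr = nx.pagerank(g)
    files = set(g.nodes) | set(keyword_hits)
    # Step 4: score = 0.7 * pagerank + 0.3 * keyword_match
    return {f: 0.7 * pr.get(f, 0.0) + 0.3 * keyword_hits.get(f, 0.0)
            for f in files}

# Example: auth.py defines AuthService, which api.py and cli.py reference,
# so auth.py outranks the files that merely use it.
scores = rank_files(
    edges=[("api.py", "auth.py"), ("cli.py", "auth.py")],
    keyword_hits={"auth.py": 1.0, "api.py": 0.4},
)
```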

**Language support:**

| Tier | Languages | Extraction |
|------|-----------|-----------|
| Tier 1 | Python, TypeScript, JavaScript, Go, Rust | AST-based, qualified names (`Class.method`), decorators |
| Tier 2 | 25+ languages (Java, C/C++, Ruby, PHP, Swift, Kotlin, Scala, Lua, R, …) | Text-chunking fallback (~50 lines/chunk) |

**MCP tools:** `route(query, source=True)`, `fetch(file_path, symbol="Class.method")`, `query(text, source=True)`

## AI Tool Compatibility

ike generates discovery files for every major AI coding tool:

| AI Tool | How it discovers ike | Generated by |
|---------|---------------------|-------------|
| **Claude Code** | `.mcp.json` + `AGENTS.md` | `ike init` |
| **Cursor** | `.mcp.json` + `AGENTS.md` | `ike init` |
| **Codex** (OpenAI) | `AGENTS.md` | `ike init` |
| **Windsurf** | `.mcp.json` + `AGENTS.md` | `ike init` |
| **Claude Desktop** | `.mcp.json` | `ike init` |
| **OpenCode** | `AGENTS.md` | `ike init` |
| **Hermes** | `.mcp.json` + `AGENTS.md` | `ike init` |
| **VS Code + Copilot** | `.mcp.json` | `ike init` |
| **Zed** | `.mcp.json` | `ike init` |
| **Aider** | `AGENTS.md` | `ike init` |

**MCP tools** (for MCP-capable editors): `route`, `fetch`, `query`, `lint`, with `source` and `symbol` parameters for code
**CLI commands** (for everything else): `ike route`, `ike fetch`, `ike query`, `ike lint`, `ike symbols`

## Commands

### Setup

| Command | What it does |
|---------|-------------|
| `ike init --kb-root ./docs` | Scan docs, generate `.mcp.json` + `AGENTS.md` |
| `ike doctor --yes` | Auto-fix missing frontmatter |
| `ike doctor` | Interactive frontmatter fix (review each) |
| `ike doctor --cross-refs` | Suggest missing cross-references for orphans |
| `ike serve` | Start MCP server (stdio transport) |
| `ike migrate-mcp` | Migrate old `.mcp.json` to portable format |

### Query

| Command | What it does |
|---------|-------------|
| `ike route "query"` | Find relevant sections (~100 tokens response) |
| `ike route "query" --source` | Find relevant source code (PageRank-weighted) |
| `ike fetch path/file.md` | Load entire file |
| `ike fetch path/file.md --section id` | Load specific section |
| `ike fetch path/file.py --symbol Class.method` | Load specific function/method |
| `ike query "text" --depth deep` | Route + fetch in one step |
| `ike symbols --file pattern` | List indexed source symbols |
| `ike index --source ./src` | Index source code via tree-sitter AST |

### Maintenance

| Command | What it does |
|---------|-------------|
| `ike lint` | Check for missing frontmatter, broken refs, orphans |
| `ike lint --freshness 30` | Also flag docs older than 30 days |
| `ike list` | List all indexed sections |
| `ike index --rebuild` | Rebuild search index |

### Global options

```bash
ike --kb-root /path/to/kb route "query"   # custom KB root
export IKE_KB_ROOT=/path/to/kb            # or via env var
ike --version                             # show version
```

## Context Efficiency

Tested against a real 81-doc knowledge base:

| Scenario | Without ike | With ike | Savings |
|----------|------------|---------|---------|
| Signal analysis (4 company files) | 17,500 tokens | 3,088 tokens | **82%** |
| Ripple analysis (2 architecture files) | 7,715 tokens | 1,334 tokens | **83%** |
| Architecture lookup (1 section) | 4,825 tokens | 259 tokens | **95%** |
| Vision section (from 7K doc) | 7,306 tokens | 807 tokens | **89%** |

Every token saved is a token available for reasoning.

## Document Format

ike works with any Markdown. For best results, docs should have YAML frontmatter:

```yaml
---
title: "Authentication Architecture"
domain: "engineering"
summary: "OAuth2 + JWT auth flow for the API gateway."
---
```

**Don't have frontmatter?** Run `ike doctor --yes`. It infers title from H1, domain from directory path, summary from the first paragraph.
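
A lint-style check for these fields is easy to write yourself. The sketch below is stdlib-only and hypothetical (ike itself uses the python-frontmatter package, and its actual checks may differ); it only looks for the three fields shown above at the top level of the frontmatter block.

```python
def missing_fields(md_text: str,
                   required: tuple[str, ...] = ("title", "domain", "summary")) -> list[str]:
    """Return which expected frontmatter keys a Markdown doc lacks.
    Minimal stdlib-only sketch; ike uses python-frontmatter internally."""
    lines = md_text.splitlines()
    keys: set[str] = set()
    if lines and lines[0].strip() == "---":
        for line in lines[1:]:
            if line.strip() == "---":      # end of frontmatter block
                break
            if ":" in line and not line.startswith((" ", "\t")):
                keys.add(line.split(":", 1)[0].strip())
    return [k for k in required if k not in keys]

doc = """---
title: "Authentication Architecture"
domain: "engineering"
---
## Auth Flow
"""
print(missing_fields(doc))  # → ['summary']
```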

## Install

```bash
pip install intentic-ike          # CLI only
pip install intentic-ike[mcp]     # CLI + MCP server
pip install intentic-ike[source]  # CLI + source code indexing (tree-sitter + networkx)
pip install intentic-ike[mcp,source]  # everything
```

Requires Python 3.12+. Works on Linux, macOS, Windows (WSL).

## MCP Server Configuration

`ike init` generates `.mcp.json` automatically. Or configure manually:

```json
{
  "mcpServers": {
    "ike": {
      "command": "ike",
      "args": ["serve"],
      "env": { "IKE_KB_ROOT": "/path/to/your/docs" }
    }
  }
}
```

## Architecture

```
AI Tool (Claude Code, Cursor, Codex, ...)
        |
   ike CLI / MCP Server
        |
   Engine (facade)
        |
   ┌────┬─────┬──────┬─────┬─────┬──────────┬─────────┐
Parser Index Fetcher Writer Linter CodeParser CodeGraph
   |     |                            |          |
markdown-it  SQLite WAL          tree-sitter  NetworkX

- **Plugin CLI.** Commands auto-discovered from `ike/commands/*_cmd.py`
- **Thread-safe.** SQLite with per-thread connections (safe for MCP thread pool)
- **Lazy MCP.** `fastmcp` only imported when `ike serve` runs
- **Lazy source.** `tree-sitter-language-pack` and `networkx` only imported when `--source` is used
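
The "thread-safe" bullet refers to the standard per-thread-connection pattern for SQLite. A minimal sketch of that pattern (a hypothetical illustration, not ike's actual class) using `threading.local`:

```python
import sqlite3
import threading

class PerThreadDB:
    """One SQLite connection per thread via threading.local.
    Sketch of the pattern only, not ike's implementation: each thread
    in an MCP thread pool lazily opens its own connection, so no
    connection is ever shared across threads."""

    def __init__(self, path: str):
        self._path = path
        self._local = threading.local()

    @property
    def conn(self) -> sqlite3.Connection:
        if not hasattr(self._local, "conn"):
            c = sqlite3.connect(self._path)
            c.execute("PRAGMA journal_mode=WAL")  # WAL, as in the diagram above
            self._local.conn = c
        return self._local.conn
```

`sqlite3` raises an error if a connection created in one thread is used from another (its default `check_same_thread=True`); giving each thread its own lazily created connection sidesteps that entirely, and WAL mode lets readers and a writer coexist.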

## Development

```bash
git clone https://github.com/pedrams/ike && cd ike
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,mcp,source]"
pytest                                  # 281 tests, ~10s
```

## License

MIT
