Metadata-Version: 2.4
Name: contextledger
Version: 0.1.2
Summary: Cross-tool decision memory for AI coding assistants — local-first MCP server
License-Expression: MIT
License-File: LICENSE
Requires-Python: >=3.12
Requires-Dist: anthropic>=0.40
Requires-Dist: click>=8.1
Requires-Dist: mcp>=1.12
Description-Content-Type: text/markdown

# ContextLedger

> **One decision memory. Shared across every AI coding tool you use.**

ContextLedger is the `CLAUDE.md` that works in Cursor, Codex, Gemini, and Windsurf too — a local-first MCP server that reads the rules files and git history you already maintain, then makes that decision context queryable from every AI coding assistant on your machine.

**Status:** v0.1.2 release candidate — local-first CLI + MCP server. [Read the positioning doc](./POSITIONING.md).

---

## The problem

You use Claude Code in the morning, Cursor for refactoring, Codex for quick scripts, and Gemini for research. Each tool reads its own rules file (`CLAUDE.md`, `.cursorrules`, `AGENTS.md`, `.windsurfrules`) and forgets your decisions the moment a session ends. You re-explain the same architectural choices to each tool, re-litigate decisions you already rejected, and watch agents suggest approaches you killed three months ago.

Your decisions live everywhere — git commit messages, PR descriptions, multi-level rules files. No single AI tool sees them all.

## What ContextLedger does

ContextLedger runs as a **local MCP server**. Once indexed:

1. **Reads what you already wrote.** v0.1 indexes git commit messages via LLM extraction (decisions, constraints, rejected alternatives) and ingests `CLAUDE.md` / `AGENTS.md` / `.cursorrules` / `.windsurfrules` / `.augment/` as searchable text. No new authoring required.
2. **Builds a unified decision corpus.** Decisions, rejected alternatives, and architectural rationale become first-class queryable entities — not a code knowledge graph with WHY bolted on.
3. **Serves it to every MCP client.** Claude Code, Cursor, Windsurf, JetBrains (via MCP), VS Code, Codex CLI, Gemini CLI — any tool that speaks MCP gets the same answers.

Your decisions stop disappearing between sessions. Your tools stop suggesting things you already rejected.

---

## Three jobs ContextLedger does

### 1. Stop re-explaining decisions across tools

> *"Why did we use Redis here?"* — ask in Claude Code, Cursor, or Codex. Get the same answer, sourced from the original commit message.

Your decision lives once. Every tool reads it.

### 2. Surface rejected alternatives before agents re-suggest them

> *"What database options did we consider for the events table?"* — get all three options you evaluated, including the two you rejected and why.

When an agent is about to suggest the rejection, ContextLedger surfaces the prior reasoning.

### 3. Onboard new AI tools without re-authoring rules

> Add a new AI assistant to your workflow. It immediately inherits every rule you wrote in CLAUDE.md, .cursorrules, AGENTS.md, etc.

No re-translation. No per-tool rules duplication.

---

## Install

ContextLedger is a Python CLI + MCP server. Three install paths — pick one:

### A. Source checkout (current default)

```bash
git clone https://github.com/Balachadal/contextledger.git
cd contextledger
uv sync --group dev
```

Run as `uv run contextledger <command>`.

### B. Install from a local checkout

After cloning as in A:

```bash
uv tool install .         # or: pipx install .
```

Run as `contextledger <command>` directly — the entry point lands on `PATH`.

### C. From PyPI (post-launch — not yet available)

The PyPI publish workflow is wired (`.github/workflows/release.yml`) but the
first `v*` tag has not been pushed yet, so the package is **not on PyPI
today**. Once it is published:

```bash
# NOT YET — v0.1.2 is not on PyPI yet. Use path A or B above for now.
pipx install contextledger
# or
uv tool install contextledger
```

Run as `contextledger <command>`.

### Backends for indexing

`index` needs one of:

- `ANTHROPIC_API_KEY` exported in your shell (used with `--llm api`, the default), **or**
- the local `claude` CLI on `PATH`, authenticated against a Claude Pro/Max
  subscription (used with `--llm claude-cli` — no API key needed).

`query`, `serve`, and `doctor` need neither.

No Docker. No Neo4j. No cloud account.

---

## 5-minute quickstart

Commands below assume `contextledger` is on `PATH` (install path B above;
path C is post-launch). On a source checkout (path A), prefix every command
with `uv run`.

```bash
# 0. Sanity-check your setup — read-only, does not create the database.
contextledger doctor

# 1. Index this repo (claude subscription path; ~50 commits ≈ 5–10 min).
contextledger index . --llm claude-cli --limit 50

# 2. Ask the corpus a real question.
contextledger query "why did we choose sqlite"

# 3. (Optional) Wire it into Claude Code as an MCP server.
#    Requires a Claude Code version that supports `claude mcp add`.
#    Older versions: add the entry to Claude Code's config file manually.
claude mcp add contextledger -- contextledger serve --repo "$(pwd)"
```

Anthropic API instead of the Claude Code subscription:

```bash
ANTHROPIC_API_KEY=… contextledger index . --limit 50
```

After indexing, ask any MCP-speaking AI client: *"Why did we kill the Redis
approach?"* — the answer is sourced from your actual git history, with the
commit hash cited.

### Indexing throughput

| Backend | Per-commit cost | Notes |
|---|---|---|
| `--llm api` | ~1–2 s/commit | Direct Anthropic SDK call, runs sequentially. |
| `--llm claude-cli` | ~5–10 s/commit | Each call spawns a fresh Claude Code process (~1–2 s warm-up + model latency). v0.1.x will move to `claude-agent-sdk` for single-process parallel calls. |

For a 100-commit repo, expect ~2–3 min on `--llm api` and ~10–15 min on `--llm claude-cli`. Re-indexing is idempotent, so safe to interrupt and resume.

---

## Manually backfill a missing decision

Sometimes a decision lives in a Slack thread, a meeting, or someone's head
— never in a commit message or rules file. `contextledger decision add`
writes one straight into the local corpus, with no LLM call and no
`ANTHROPIC_API_KEY` required. This is a **backfill mechanism for missing
decisions**, not a note-taking system: use it sparingly, when `query`
turns up nothing because the rationale was never written down.

```bash
# Interactive — prompts for summary, constraints, rejected alternatives,
# forward context, files governed, and optional raw source text.
contextledger decision add --repo .

# Non-interactive — pass at least --summary plus any of the field flags.
contextledger decision add --repo . \
  --summary "Use SQLite for the local corpus" \
  --constraint "single-file deployment" \
  --rejected "Postgres::Operational overhead for a local tool" \
  --forward-context "Revisit if multi-user sync lands" \
  --file src/contextledger/storage.py

# Optional --id pins a meaningful, human-readable ref. Citation will read
# `manual:session-ttl-policy` in query output.
contextledger decision add --repo . --id session-ttl-policy
```

Manual decisions are stored with `source_type="manual"` and become
searchable through the normal `query` / MCP path. A `--id` that already
exists is refused with a clear error — no silent overwrites of
hand-authored content.

### Browse manual decisions

`decision list` shows your manually recorded decisions newest-first;
`decision show` prints one by its `source_ref` (the `--id` you set, or the
auto-UUID4). Both are local-only — no LLM, no git, no
`ANTHROPIC_API_KEY`.

```bash
contextledger decision list --repo .
contextledger decision show session-ttl-policy --repo .
```

`show` resolves the `source_ref` only — the same value that appears in
`query` citations as `manual:<source_ref>`. Manual-only for now; commit
and rules-file records remain accessible through `query` and
`get_change_narrative`. An empty `decision list` prints a one-line hint
pointing at `decision add`, so a fresh corpus tells you what's next
rather than nothing.

---

## Before editing a file

`contextledger before-edit FILE` surfaces relevant decisions, constraints,
rejected alternatives, rules guidance, and the file's change history before
you (or an AI agent) edit it — so prior intent has a chance to push back
before code is written. Read-only: no LLM call, no network, no writes.

```bash
contextledger before-edit src/contextledger/storage.py
contextledger before-edit src/contextledger/storage.py --task "add sqlite-vec embeddings"
```

The optional `--task` description sharpens the surfaced context against your
intended change. Low-confidence decisions are filtered out, so output stays a
high-signal guardrail. If the corpus has nothing relevant, output falls back
to a hint pointing at `contextledger decision add`.

The same surface is exposed as the `before_edit(file, task)` MCP tool — AI
agents can call it as a pre-edit check before suggesting code.

---

## Where your data goes

ContextLedger is local-first by design. Specifically:

- **Storage location.** The decision corpus lives in
  `~/.contextledger/<repo-hash>/decisions.db` (SQLite + FTS5). The repo
  itself is never written to — `.contextledger/` is not created inside
  the repository.
- **Indexing with `--llm api`.** Each commit's subject and body are sent
  to the Anthropic API for structured extraction (forced tool-use,
  `claude-sonnet-4-6` by default). Output: a decision record stored
  locally. File contents are not sent.
- **Indexing with `--llm claude-cli`.** Each commit's subject and body
  are passed to the local `claude` CLI, which reaches Anthropic through
  your Claude Code subscription auth (no API key). Same payload, same
  result on disk.
- **Rules files** (`CLAUDE.md`, `AGENTS.md`, `.cursorrules`,
  `.cursor/rules/*.{md,mdc}`, `.windsurfrules`, `.augment/`,
  `.github/copilot-instructions.md`) are read directly off disk and
  stored verbatim in the local DB. No LLM call, no network.
- **Queries** (`contextledger query`, the `query_decision` and
  `get_change_narrative` MCP tools) are 100% local: SQLite FTS5 lookups
  against the local DB. No network.

`index` is the only command that reaches a cloud LLM. `query`, `serve`,
and `doctor` only read the local DB and never make network calls. If a
repo's commit messages contain anything you'd rather not send to a cloud
LLM, run `index` in **rules-only** mode:

```bash
contextledger index . --rules-only
```

Local-only; no LLM call; rules files only. Skips git commits, requires
no `ANTHROPIC_API_KEY`, and still keeps the rules corpus current
(including stale-rules purge on re-index). `index --dry-run` (without
`--rules-only`) still calls the LLM, so it isn't a privacy preview.

---

## MCP tools (v0.1)

Two tools exposed to any MCP client.

### `query_decision(question: str) → str`

Natural-language query over the decision corpus, with stopword-tolerant lexical search (FTS5). Returns the matching decisions with source citations.

```
> query_decision("Why did we choose Anthropic over OpenAI?")
→ Decision: ...
  Constraints: ...
  Rejected alternatives: OpenAI gpt-5.4 (reason: Latency benchmark failed); ...
  Source: commit:a8fe977 (2026-04-12)
```

### `get_change_narrative(file: str, since: str = "") → str`

Why did this file evolve? Returns chronological decisions whose commits touched the file.

```
> get_change_narrative("src/db/schema.py", since="2025-12-01")
→ Change narrative for src/db/schema.py:
  - [2026-04-12] Adopt Redis for sessions (source: commit:a8fe977)
  - …
```

### `before_edit(file: str, task: str = "") → str`

Pre-edit context guardrail. Combines decisions, constraints, rejected
alternatives, rules guidance, and the file's change history so an agent's
edit can respect prior intent. Low-confidence decisions filtered; read-only.

```
> before_edit("src/db/schema.py", task="add embeddings column")
→ Relevant decisions:
  - Adopt Redis for sessions (source: commit:a8fe977)
  Rejected alternatives:
  - PostgreSQL JSON column (latency budget exceeded)
  Rules guidance:
  - …
  File history:
  - [2026-04-12] …
```

### Roadmap (not in v0.1)

- `get_module_contract(path)` — module intent + invariants from rules files / docstrings. Lands in v0.2 alongside structured rules-file extraction.
- `find_related_decisions(concept)` — semantic search. Lands in v0.2 with sqlite-vec embeddings.

---

## Troubleshooting

Run `contextledger doctor` first — it usually points at the right fix.

| Symptom | Cause | Fix |
|---|---|---|
| `doctor` says `git repo: NO` | The path isn't a git repository | `cd` into a git working tree, or `git init` first. Indexing reads `git log`. |
| `doctor` says `exists: no` for the DB | You haven't indexed yet | Run `contextledger index . --llm claude-cli --limit 50` (or the `--llm api` variant). |
| `index` errors with `ANTHROPIC_API_KEY is required` | Default backend is `api` and the key isn't set | Either `export ANTHROPIC_API_KEY=…` or pass `--llm claude-cli` to use the Claude Code subscription. |
| `doctor` says `claude CLI: not on PATH` and you want the subscription backend | Claude Code CLI not installed or not in this shell's `PATH` | Install Claude Code, confirm with `claude --version`, then re-run `doctor`. |
| `query` returns *"No matching decisions found…"* | Either the corpus is empty or the query terms don't appear in any record. The hint that follows points at `decision add` for cases where the rationale was never written down. | `doctor` shows the row counts. If zero, run `index` first. If non-zero, try simpler terms — FTS5 is lexical, not semantic (semantic search lands in v0.2). Backfill with `contextledger decision add` if the rationale lives only in a Slack thread or someone's head. |
| Old guidance from a deleted/renamed rules file keeps surfacing | Stale rows from before the file was removed | Re-run `contextledger index .` — it purges `rules_file` rows whose source paths no longer exist on disk. |

Still stuck? `doctor` output plus the failing command is the fastest issue to triage.

---

## Why ContextLedger (and not the alternatives)

| If you want... | Use |
|---|---|
| **Cross-tool decision memory** | **ContextLedger** |
| A full codebase knowledge graph | [Repowise](https://github.com/repowise-dev/repowise), [Sourcegraph](https://sourcegraph.com), [Greptile](https://greptile.com) |
| Code dependency / call graph | [Ctxo](https://github.com/alperhankendi/Ctxo), [GitNexus](https://github.com/abhigyanpatwari/GitNexus) |
| AI session capture per-commit | [XHawk](https://xhawk.ai) |
| Spec-driven workflow inside one assistant | [Augment Intent](https://augmentcode.com/product/intent) |
| Inline code annotations in JetBrains | [JetBrains Recap & Insights](https://blog.jetbrains.com/ai/) |

ContextLedger is for the niche where you switch between two or more of these AI coding tools and need the decision context to follow you. If you stay inside one tool's ecosystem, that tool's native features are usually enough.

[Full competitive analysis →](./POSITIONING.md)

---

## Roadmap

- **v0.1 (May 2026):** 2 MCP tools (`query_decision`, `get_change_narrative`). Indexes git commits via Claude tool-use; ingests CLAUDE.md / AGENTS.md / .cursorrules / .windsurfrules / .augment/ as searchable text. SQLite + FTS5 persistence (lexical search). Source-installable; PyPI / Homebrew tap follow.
- **v0.2:** Structured decision extraction from rules files. `get_module_contract` and `find_related_decisions` MCP tools. sqlite-vec semantic search. PR description ingestion. Local LLM support (Ollama). Background re-indexing.
- **v0.3:** Cross-machine sync (Pro tier). Team decision corpus (Team tier).
- **v1.0:** Whatever the Week-14 decision gate metrics tell us to build.

## License

MIT. Use it, fork it, commercialize it. We don't care.

## Contributing

v0.1 is being built solo to ship fast. Contributions reopen post-launch. Watch the repo for the v0.1 release tag.

## Built by

A solo AI builder who orchestrates Claude Code + Codex + Gemini across 7 active projects daily and got tired of re-explaining the same decisions to each tool.

Built in France. Station F target cohort. Bootstrapped to traction first.
