Metadata-Version: 2.4
Name: graphify-intent
Version: 0.2.0
Summary: Intent-layer enrichment for graphify knowledge graphs: extract the decisions, mechanisms, constraints, and trade-offs behind your docs as grounded, anchored graph nodes.
Project-URL: Homepage, https://github.com/willneill/graphify-intent
Project-URL: Repository, https://github.com/willneill/graphify-intent
Project-URL: Issues, https://github.com/willneill/graphify-intent/issues
Project-URL: Changelog, https://github.com/willneill/graphify-intent/blob/main/CHANGELOG.md
Author-email: Will Neill <willneill@gmail.com>
License: MIT
License-File: LICENSE
Keywords: claude,claude-code,code-intelligence,documentation,embeddings,graphify,intent,knowledge-graph,llm,rag,static-analysis
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Documentation
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Software Development :: Documentation
Classifier: Typing :: Typed
Requires-Python: >=3.10
Provides-Extra: dev
Requires-Dist: mypy>=1.11; extra == 'dev'
Requires-Dist: networkx>=3; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=5; extra == 'dev'
Requires-Dist: pytest>=7; extra == 'dev'
Requires-Dist: ruff>=0.6; extra == 'dev'
Provides-Extra: embeddings
Requires-Dist: model2vec>=0.3; extra == 'embeddings'
Description-Content-Type: text/markdown

# graphify-intent

[![PyPI version](https://img.shields.io/pypi/v/graphify-intent.svg)](https://pypi.org/project/graphify-intent/)
[![Python versions](https://img.shields.io/pypi/pyversions/graphify-intent.svg)](https://pypi.org/project/graphify-intent/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![CI](https://github.com/willneill/graphify-intent/actions/workflows/ci.yml/badge.svg)](https://github.com/willneill/graphify-intent/actions/workflows/ci.yml)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)

Post-process a [graphify](https://github.com/safishamsi/graphify) knowledge graph with an **intent layer** — the *why* behind your docs. Where graphify extracts *what* exists, `graphify-intent` extracts the decisions, mechanisms, constraints, and trade-offs your prose encodes, each with a grounded rationale, and anchors them to the concepts graphify already found. A pipeline of LLM passes — extract → anchor → cross-doc relate, plus an opt-in cross-document concept-resolution pass — produces a sidecar JSON and an enriched `graph.json`.

```mermaid
flowchart LR
    D["docs/*.md"] --> A
    G["graph.json"] --> B
    A["Pass A<br/>extract intent"] --> B["Pass B<br/>anchor to concepts"]
    B --> C["Pass C<br/>cross-doc intent"]
    B --> R["Pass D<br/>concept resolution<br/>(opt-in)"]
    A --> OUT
    C --> OUT
    R --> OUT
    OUT["outputs:<br/>.graphify_intent.json<br/>graph.enriched.json<br/>enrichment_report.md"]
```

## Requirements

`graphify-intent` post-processes graphify graphs, so you need `graphifyy` installed in the same Python environment. By default it uses your **Claude Pro/Max subscription** via the Claude Code CLI — **no API key required**. To use a billed API key instead, see [LLM backend](#llm-backend).

## Installation

Install into the same environment as graphify:

```bash
pip install graphify-intent
# or with uv:
uv pip install graphify-intent
```

To enable Pass D's embedding-based concept resolution, add the optional extra:

```bash
pip install "graphify-intent[embeddings]"   # local model2vec embedder for Pass D (quoted so zsh does not glob the brackets)
```

Without it, Pass D still runs — it degrades gracefully to a lexical-similarity fallback.

## Quick start

```bash
# Uses your Claude Pro/Max subscription by default (needs the `claude` CLI).
graphify-intent \
  --graph graphify-out/graph.json \
  --docs docs/ \
  --passes A,B,C
```

## LLM backend

By default, `graphify-intent` uses your **Claude Pro/Max subscription** through the
[Claude Code CLI](https://docs.claude.com/en/docs/claude-code) (`claude`) — no API key,
no per-token cost. It falls back to a billed API key if the CLI isn't available.

| `--backend` | Behaviour |
|-------------|-----------|
| *(unset)* | **Subscription** if the `claude` CLI is on your `PATH`, else a detected API key |
| `subscription` | Force the Claude Code CLI (subscription auth) |
| `api` | Force a billed API key (`ANTHROPIC_API_KEY` / `GEMINI_API_KEY`) |
| `claude` / `gemini` / … | Force a specific graphify backend |

You can also set `GRAPHIFY_INTENT_BACKEND`. For subscription mode, install the `claude`
CLI and run it once to sign in. For API mode, install the matching extra
(`graphifyy[anthropic]` or `graphifyy[gemini]`) and provide the key.
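
The selection rules in the table can be sketched roughly as follows. This is an illustration, not the project's actual code: `resolve_backend` is a hypothetical helper, and two details are assumptions this README does not state. The first is that the `--backend` flag takes precedence over `GRAPHIFY_INTENT_BACKEND`. The second is that a detected `ANTHROPIC_API_KEY` maps to the `claude` backend and `GEMINI_API_KEY` to `gemini`.

```python
import os
import shutil


def resolve_backend(flag=None, env=None, which=shutil.which):
    """Hypothetical sketch of backend auto-selection (not the project's code)."""
    env = os.environ if env is None else env
    # Assumption: the CLI flag wins over the environment variable.
    choice = flag or env.get("GRAPHIFY_INTENT_BACKEND") or None
    has_cli = which("claude") is not None

    # Assumption: which detected key maps to which graphify backend name.
    if env.get("ANTHROPIC_API_KEY"):
        api_backend = "claude"
    elif env.get("GEMINI_API_KEY"):
        api_backend = "gemini"
    else:
        api_backend = None

    # Unset or forced subscription: use the Claude Code CLI when present.
    if choice in (None, "subscription") and has_cli:
        return "subscription"
    if choice == "subscription":
        raise RuntimeError("the `claude` CLI was not found on PATH")
    # Unset (CLI missing) or forced API: fall back to a detected key.
    if choice in (None, "api"):
        if api_backend is None:
            raise RuntimeError("no API key detected")
        return api_backend
    return choice  # an explicit provider name such as "gemini"
```

Passing `env` and `which` as parameters keeps the sketch testable without touching the real environment or `PATH`.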

### Keep API keys out of `.env` and your shell history

If you do use an API key, prefer a password-manager CLI over inlining the secret. With
[1Password's `op`](https://developer.1password.com/docs/cli/), inject it at runtime so it
never lands on disk in plaintext:

```bash
export ANTHROPIC_API_KEY="$(op read 'op://<vault>/<item>/credential')"
graphify-intent --backend api --graph graphify-out/graph.json --docs docs/
```

…or wrap the command with `op run` so the secret lives only for that process:

```bash
op run --env-file=.env.op -- graphify-intent --backend api --graph graphify-out/graph.json --docs docs/
```

A real key committed in `.env` risks leaking into git history, CI logs, and backups; a
manager keeps it encrypted, access-audited, and revocable. Best of all, the default
subscription backend needs no key at all.

## CLI flags

| Flag | Default | Description |
|------|---------|-------------|
| `--graph` | `graphify-out/graph.json` | Path to the graphify output `graph.json` |
| `--docs` | `docs/` | Directory containing markdown source files |
| `--passes` | `A,B,C` | Comma-separated passes: `A` extract, `B` anchor, `C` cross-doc relate, `D` cross-doc concept resolution (opt-in) |
| `--backend` | *(auto)* | Backend selection (see [LLM backend](#llm-backend)) |
| `--min-confidence` | `0.75` | Minimum `confidence_score` for Pass C intent edges and Pass D `same_concept` edges |
| `--max-concurrency` | `4` | Maximum number of concurrent LLM requests |
| `--max-cost-usd` | `2.00` | Cost guard for **API** backends: prompts if the estimate exceeds this (skipped for the free subscription backend) |
| `--yes` | `false` | Skip the cost-guard prompt (non-interactive / CI) |

Set `--backend` to `subscription`, `api`, or an explicit provider name (see [LLM backend](#llm-backend)).
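
How `--max-cost-usd` and `--yes` interact can be pictured with a small sketch. The function name `should_proceed`, its parameters, and the prompt wording are hypothetical. Only the behaviour comes from the table above: the guard is skipped for the subscription backend, it prompts when the estimate exceeds the limit, and `--yes` suppresses the prompt.

```python
def should_proceed(estimate_usd, max_cost_usd=2.00, backend="subscription",
                   assume_yes=False, ask=input):
    """Hypothetical cost guard: return True if the run may continue."""
    if backend == "subscription":
        return True  # no per-token cost, so the guard is skipped
    if estimate_usd <= max_cost_usd:
        return True  # within budget, no prompt needed
    if assume_yes:
        return True  # --yes: non-interactive / CI runs skip the prompt
    answer = ask(
        f"Estimated cost ${estimate_usd:.2f} exceeds "
        f"${max_cost_usd:.2f}. Continue? [y/N] "
    )
    return answer.strip().lower() in {"y", "yes"}
```

Injecting `ask` rather than calling `input` directly lets a test stand in for the user.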

## Outputs

All outputs are written to the same directory as `graph.json`:

| File | Contents |
|------|----------|
| `.graphify_intent.json` | Sidecar: all intent nodes and edges added across runs |
| `graph.enriched.json` | Original graph merged with the intent layer (graphify node-link format) |
| `enrichment_report.md` | Before/after metrics: node/link/isolated counts, intent breakdown by kind, anchored vs orphaned, grounding coverage |
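
To sanity-check an enriched graph yourself, something like the sketch below may help. It assumes the usual node-link layout (a top-level `nodes` list whose entries have an `id`, and a `links` list whose entries have `source`, `target`, and a `relation` field). Those key names are assumptions here, so check them against your own `graph.enriched.json`. `summarize` is a hypothetical helper, not part of the package.

```python
from collections import Counter


def summarize(graph: dict) -> dict:
    """Count nodes, links, and isolated nodes in a node-link dict."""
    nodes = graph.get("nodes", [])
    links = graph.get("links", [])
    connected = set()
    for link in links:
        connected.update((link["source"], link["target"]))
    return {
        "nodes": len(nodes),
        "links": len(links),
        # A node is isolated when no link mentions it at either end.
        "isolated": sum(1 for n in nodes if n["id"] not in connected),
        "relations": dict(Counter(link.get("relation", "?") for link in links)),
    }


# Tiny inline sample standing in for a real enriched graph.
sample = {
    "nodes": [{"id": "jwt"}, {"id": "intent_1"}, {"id": "orphan"}],
    "links": [{"source": "intent_1", "target": "jwt", "relation": "rationale_for"}],
}
print(summarize(sample))
```

For a real run, the same function could be fed the parsed contents of `graph.enriched.json`, for example via `json.loads(Path("graphify-out/graph.enriched.json").read_text())`.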

## How it works

**Pass A — intent extraction.** Each markdown document is split into heading-bounded sections. Per section, the LLM extracts 0-N *intent units* — a `decision`, `mechanism`, `constraint`, or `tradeoff` — each with a one-sentence `claim`, the `rationale` (the *why*), any `alternatives` considered, and a deterministic ID grounded to the section's character span.
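
A minimal sketch of heading-bounded splitting with character spans, plus one way a deterministic ID could be derived from them. The names `Section`, `split_sections`, and `intent_id`, and the hashing scheme, are illustrative assumptions. The project's real ID format is not described here. The sketch also ignores fenced code blocks and any text before the first heading, which a real splitter would need to handle.

```python
import hashlib
import re
from dataclasses import dataclass

# ATX headings only ("# Title" through "###### Title").
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


@dataclass(frozen=True)
class Section:
    title: str
    start: int  # character offset of the heading line
    end: int    # offset where the next heading (or end of text) begins


def split_sections(text: str) -> list[Section]:
    """Split markdown into heading-bounded sections with character spans."""
    matches = list(HEADING.finditer(text))
    sections = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append(Section(match.group(1), match.start(), end))
    return sections


def intent_id(source_file: str, section: Section, kind: str, claim: str) -> str:
    """Hypothetical ID scheme: hash of file, span, kind, and normalized claim."""
    key = f"{source_file}:{section.start}-{section.end}:{kind}:{claim.strip().lower()}"
    return "intent_" + hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]


doc = "# Auth\nWe use JWT.\n## Storage\nTokens live in cookies.\n"
sections = split_sections(doc)
print([(s.title, s.start, s.end) for s in sections])
print(intent_id("docs/auth.md", sections[0], "decision", "We use JWT."))
```

Because the ID depends only on the inputs, re-extracting the same claim from an unchanged section yields the same ID, which is what makes skipping by ID on re-runs possible.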

**Pass B — anchoring.** Each intent node is linked to the graph concept(s) it explains via a `rationale_for` edge. Candidate graph nodes are found **without embeddings** — same `source_file` plus lexical overlap — then the LLM adjudicates which the intent actually explains.
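
The embedding-free candidate step could look roughly like this. It is a sketch under stated assumptions: graph nodes are taken to carry `label` and `source_file` fields, overlap is counted on lowercase alphanumeric tokens, and `candidate_nodes` and `min_overlap` are hypothetical names. The LLM adjudication that follows is not shown.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def candidate_nodes(intent: dict, graph_nodes: list[dict], min_overlap: int = 1) -> list[dict]:
    """Same-file graph nodes sharing tokens with the intent, best match first."""
    intent_tokens = tokens(intent["claim"] + " " + intent["rationale"])
    scored = []
    for node in graph_nodes:
        if node.get("source_file") != intent["source_file"]:
            continue  # anchoring only considers concepts from the same file
        overlap = len(intent_tokens & tokens(node.get("label", "")))
        if overlap >= min_overlap:
            scored.append((overlap, node))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [node for _, node in scored]


intent = {
    "source_file": "docs/auth.md",
    "claim": "Use JWT for session tokens",
    "rationale": "Stateless verification scales horizontally",
}
nodes = [
    {"id": "jwt", "label": "JWT", "source_file": "docs/auth.md"},
    {"id": "session", "label": "Session tokens", "source_file": "docs/auth.md"},
    {"id": "jwt_other", "label": "JWT", "source_file": "docs/api.md"},
    {"id": "db", "label": "Postgres", "source_file": "docs/auth.md"},
]
print([n["id"] for n in candidate_nodes(intent, nodes)])
```

A cheap lexical filter like this keeps the LLM's job small: it only has to judge a handful of plausible concepts per intent instead of the whole graph.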

**Pass C — cross-doc intent.** A single LLM call over all intent nodes surfaces intent-level relationships between them — `supersedes`, `trade_off_against`, `constrains`, `motivated_by`, `contradicts`. Edges below `--min-confidence` are discarded.
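
The post-processing of Pass C output can be sketched as a plain filter. The relation names, the `confidence_score` field, and the `0.75` default come from this README. The function name `filter_edges`, the `source`/`target` keys, and the extra checks (unknown IDs, self-loops) are assumptions added for illustration.

```python
ALLOWED_RELATIONS = {
    "supersedes",
    "trade_off_against",
    "constrains",
    "motivated_by",
    "contradicts",
}


def filter_edges(edges: list[dict], known_ids: set[str], min_confidence: float = 0.75) -> list[dict]:
    """Keep only well-formed edges at or above the confidence threshold."""
    kept = []
    for edge in edges:
        if edge.get("relation") not in ALLOWED_RELATIONS:
            continue  # relation outside the intent vocabulary
        source, target = edge.get("source"), edge.get("target")
        if source not in known_ids or target not in known_ids or source == target:
            continue  # dangling endpoint or self-loop
        score = edge.get("confidence_score")
        if not isinstance(score, (int, float)) or score < min_confidence:
            continue  # missing score, or below --min-confidence
        kept.append(edge)
    return kept


proposed = [
    {"source": "i1", "target": "i2", "relation": "supersedes", "confidence_score": 0.9},
    {"source": "i1", "target": "i3", "relation": "contradicts", "confidence_score": 0.5},
    {"source": "i2", "target": "i3", "relation": "constrains", "confidence_score": 0.75},
    {"source": "i2", "target": "i9", "relation": "motivated_by", "confidence_score": 0.95},
    {"source": "i1", "target": "i2", "relation": "related_to", "confidence_score": 0.99},
]
kept = filter_edges(proposed, known_ids={"i1", "i2", "i3"})
print([(e["source"], e["relation"], e["target"]) for e in kept])
```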

**Pass D — cross-doc concept resolution (opt-in).** Enable with `--passes A,B,C,D`.
Pass D unifies the graphify concepts the intent layer touches **non-destructively**,
emitting `same_concept` edges between existing concept nodes (nothing is merged or
rewritten). Cross-document intent linkage then emerges transitively —
`intent —rationale_for→ JWT —same_concept— token ←rationale_for— intent`. Candidate
pairs are cross-file only and are shortlisted ("blocked") by **embedding cosine** similarity when a
local model is available (optional `[embeddings]` extra) or by **lexical Jaccard** similarity
otherwise. The LLM then adjudicates each pair, and edges below `--min-confidence` are dropped. Every edge records its
`method` (embedding/lexical) and `similarity` for provenance.
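
The lexical fallback for Pass D candidate generation might look like the sketch below. It assumes concepts carry `id`, `label`, and `source_file` fields, uses token-set Jaccard on labels, and picks an arbitrary `threshold` of `0.5`. None of these details are confirmed by this README beyond the use of Jaccard similarity and the `method` and `similarity` fields. The LLM adjudication of each pair is not shown.

```python
import re
from itertools import combinations


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """|A ∩ B| / |A ∪ B|, defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(concepts: list[dict], threshold: float = 0.5) -> list[dict]:
    """Cross-file concept pairs whose label similarity clears the threshold."""
    pairs = []
    for left, right in combinations(concepts, 2):
        if left["source_file"] == right["source_file"]:
            continue  # candidate pairs are cross-file only
        similarity = jaccard(tokens(left["label"]), tokens(right["label"]))
        if similarity >= threshold:
            pairs.append({
                "source": left["id"],
                "target": right["id"],
                "method": "lexical",
                "similarity": similarity,
            })
    return pairs


concepts = [
    {"id": "a", "label": "JWT token", "source_file": "docs/auth.md"},
    {"id": "b", "label": "JWT access token", "source_file": "docs/api.md"},
    {"id": "c", "label": "JWT token", "source_file": "docs/auth.md"},
    {"id": "d", "label": "Postgres", "source_file": "docs/db.md"},
]
for pair in candidate_pairs(concepts):
    print(pair["source"], pair["target"], round(pair["similarity"], 3))
```

Note that a purely lexical measure scores `JWT` against `token` at zero, since the labels share no tokens. Pairs like that are where the embedding path is likely to matter.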

## Idempotency

Re-running `graphify-intent` is safe. Intent nodes already present in the sidecar (`.graphify_intent.json`) are skipped by ID, and edges are skipped by `(source, target, relation)` key, so a re-run adds only genuinely new intent. You can also run passes separately (e.g. `--passes A` then `--passes B,C`): when `A` is not in the run, later passes seed from the existing sidecar.
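
The skip rules above amount to a set-membership check before each append. The sketch below assumes a sidecar shaped like `{"nodes": [...], "edges": [...]}` and a hypothetical `merge_into_sidecar` helper. The real sidecar layout may differ.

```python
def merge_into_sidecar(sidecar: dict, new_nodes: list[dict], new_edges: list[dict]) -> tuple[int, int]:
    """Append only unseen nodes and edges; return how many of each were added."""
    seen_ids = {node["id"] for node in sidecar["nodes"]}
    seen_edges = {(e["source"], e["target"], e["relation"]) for e in sidecar["edges"]}

    added_nodes = 0
    for node in new_nodes:
        if node["id"] in seen_ids:
            continue  # already present: skipped by ID
        sidecar["nodes"].append(node)
        seen_ids.add(node["id"])
        added_nodes += 1

    added_edges = 0
    for edge in new_edges:
        key = (edge["source"], edge["target"], edge["relation"])
        if key in seen_edges:
            continue  # already present: skipped by (source, target, relation)
        sidecar["edges"].append(edge)
        seen_edges.add(key)
        added_edges += 1

    return added_nodes, added_edges


sidecar = {"nodes": [], "edges": []}
nodes = [{"id": "intent_1", "kind": "decision"}]
edges = [{"source": "intent_1", "target": "jwt", "relation": "rationale_for"}]
print(merge_into_sidecar(sidecar, nodes, edges))  # first run adds everything
print(merge_into_sidecar(sidecar, nodes, edges))  # second run adds nothing
```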

## Development

```bash
git clone <repo>
cd graphify-intent
pip install -e ".[dev]"
python -m pytest tests/ -v
```

Tests cover every module — IDs, section splitting with spans, relation/confidence validation, all four passes (including Pass D's candidate resolution and embedding fallback), merge/enriched-graph assembly, the report, backend resolution, and an end-to-end smoke test. The LLM boundary is injected/mocked, so the suite needs neither graphify nor an API key.

## Note on graphify peer dependency

`graphify-intent` does not declare `graphifyy` as a hard dependency because the package name and extras vary by installation method. Install it separately:

```bash
pip install graphifyy              # core: enough for the default subscription backend
pip install "graphifyy[anthropic]" # add for --backend api with Anthropic
pip install "graphifyy[gemini]"    # add for --backend api with Gemini
```

For the default subscription backend you also need the [Claude Code CLI](https://docs.claude.com/en/docs/claude-code) (`claude`) installed and signed in. If graphify is not found at runtime, `graphify-intent` exits with a clear error message.

## Architecture decisions

Design rationale is recorded as ADRs in [`docs/adr/`](docs/adr/):

| ADR | Decision |
|-----|----------|
| [0001](docs/adr/0001-direct-schema-enforced-llm-calls.md) | Direct, schema-enforced LLM calls |
| [0002](docs/adr/0002-intent-rationale-wedge-and-data-model.md) | Intent/rationale as the product wedge + node data model |
| [0003](docs/adr/0003-span-level-grounding-and-provenance.md) | Span-level grounding & provenance |
| [0004](docs/adr/0004-embedding-free-same-file-anchoring.md) | Embedding-free, same-file anchoring (v1) |
| [0005](docs/adr/0005-extend-graphify-relation-vocabulary.md) | Extend graphify's relation vocabulary deliberately |
| [0006](docs/adr/0006-enriched-graph-assembly-node-link.md) | Enriched-graph assembly on the node-link format |
| [0007](docs/adr/0007-determinism-idempotency-confidence.md) | Determinism, idempotency, and a single confidence policy |
| [0008](docs/adr/0008-subscription-first-llm-backend.md) | Subscription-first LLM backend via the Claude CLI |
| [0009](docs/adr/0009-cross-document-concept-resolution.md) | Cross-document concept resolution (Pass D) |

## License and attribution

`graphify-intent` is licensed under the [MIT License](LICENSE) — Copyright (c) 2026 Will Neill.

This project is an independent post-processor built to interoperate with
[graphify](https://github.com/safishamsi/graphify) by Safi Shamsi. It reuses graphify's
graph schema and relation vocabulary and calls graphify as a separately-installed runtime
dependency; no graphify source code is bundled with or distributed as part of this project.
graphify is licensed under the MIT License (Copyright (c) 2026 Safi Shamsi) — see the
`ACKNOWLEDGEMENT AND ATTRIBUTION` section of this repository's [LICENSE](LICENSE) file and
the [upstream license](https://github.com/safishamsi/graphify/blob/v8/LICENSE) for the
full text. With thanks to the graphify project.
