Metadata-Version: 2.4
Name: koshas
Version: 0.0.9
Summary: kosha (कोश) — a treasury of your repo and environment context for coding agents. FTS5 + vector search + call graph, no LLMs required.
Project-URL: Repository, https://github.com/vedicreader/kosha
Project-URL: Documentation, https://vedicreader.github.io/kosha/
Author-email: Karthik <karthik.rajgopal@hotmail.com>
License: Apache-2.0
License-File: LICENSE
Keywords: code graph,code search,devtool,nbdev,repo-context
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Python: >=3.10
Requires-Dist: fastprogress>=1.1.5
Requires-Dist: litesearch>=0.0.22
Requires-Dist: pyan3>=2.4.3
Requires-Dist: pyskills>=0.0.4
Requires-Dist: watchfiles>=1.1.1
Description-Content-Type: text/markdown

# kosha


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

kosha (कोश) — a treasury of repo and installed-package context for
humans and coding assistants.

> Persistent knowledge of your codebase and installed packages — FTS5 +
> vector search + call graph, merged with Reciprocal Rank Fusion. Each
> result includes the code snippet, callers, callees, and PageRank. No
> LLMs required.

## Install

kosha is a **dev dependency** — it indexes code at development time so
AI coding assistants can search it. It does not ship with your
application.

``` sh
uv add --dev koshas        # uv (recommended)
pip install --group dev koshas   # pip >= 25.1; also installs the dev group from pyproject.toml
```

## Setup (one-time per project)

``` python
from kosha import Kosha     # import path assumed from the `kosha.skill` entry point

Kosha(install_skill=True)   # writes .agents/skills/kosha/SKILL.md
```

Commit `.agents/skills/kosha/SKILL.md` so every contributor — human and
AI — picks up the skill automatically.

## Sync (once per session, re-run when things change)

``` python
k = Kosha()                                       # auto-detects git repo root
k.sync(pkgs=['fasthtml', 'fastcore', 'litesearch'])
```

Subsequent calls are incremental: only changed files and new package
versions are re-indexed. **Re-run `k.sync()` whenever the env or repo
changes materially**: after `uv add` / `pip install` / version bumps,
after pulling or merging significant code changes, or when results look
stale. Calling it too often is harmless; calling it too rarely silently
feeds the agent stale context.

The indexes live at:

- `.kosha/code.db` — repo chunks + embeddings (project-local)
- `.kosha/graph.db` — call graph (project-local)
- `$XDG_DATA_HOME/kosha/env.db` — installed packages (shared across
  repos)
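
To check which indexes already exist before syncing, the three locations
above can be resolved by hand. This is a minimal sketch, not kosha's own
code: the helper `index_paths` is hypothetical, and the fallback to
`~/.local/share` when `XDG_DATA_HOME` is unset follows the XDG Base
Directory convention, which kosha may or may not apply in the same way.

```python
import os
from pathlib import Path

def index_paths(repo_root, environ=None):
    """Return the three index locations listed above for a given repo root."""
    environ = os.environ if environ is None else environ
    # Assumption: XDG default of ~/.local/share when XDG_DATA_HOME is unset or empty.
    data_home = environ.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    root = Path(repo_root)
    return {
        "code": root / ".kosha" / "code.db",                 # repo chunks + embeddings
        "graph": root / ".kosha" / "graph.db",               # call graph
        "env": Path(data_home) / "kosha" / "env.db",         # installed packages
    }

# Illustrative paths only; pass your real repo root instead.
paths = index_paths("/work/myapp", environ={"XDG_DATA_HOME": "/tmp/xdg"})
for name, path in paths.items():
    print(f"{name:5} {path}  exists={path.exists()}")
```

A missing `code.db` or `graph.db` suggests that `k.sync()` has not yet
run in this checkout.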

## The four-step workflow

The high-value pattern: **inventory → disambiguate → narrow → trace.**
Skip steps that don’t apply (table at the end of this section).

### Step 1 — Inventory

Query the indexes directly to see what’s installed and what each package
exposes. `pkgs_in_env`, `dep_stack`, and `public_api` are all indexed
SQLite reads, sub-second on typical envs, so no cache file is needed. (A
markdown cache without invalidation goes silently stale when packages
change, and the reads are fast enough that a cache would buy nothing.)

``` python
pkgs   = k.pkgs_in_env(pyproject=True)                              # [{name, version}, ...]
layers = k.dep_stack(seeds=[p['name'] for p in pkgs], depth=2)      # BFS, by coupling
for p in pkgs:
    api = k.public_api(p['name'], limit=30)                         # public surface + docstrings
    # use `pkgs`, `layers`, `api` directly — no intermediate file needed
```

### Step 2 — Disambiguate

If the task names a domain (“payments”, “ui”, “http client”) and several
installed packages could plausibly serve it, ask the user which to use
before searching deeper. `env_context()` skips graph enrichment, so this
call is cheap, and it auto-detects package names in plain query tokens.

``` python
hits = k.env_context('toast notification', limit=30)         # cheap, no graph enrichment
by_pkg = {}
for r in hits: by_pkg.setdefault(r['metadata'].get('package'), []).append(r)
candidates = sorted(by_pkg, key=lambda p: -len(by_pkg[p]))[:4]
# If multiple candidates have comparable hit counts, present them and wait for the user.
```
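
The snippet above leaves "comparable hit counts" to judgment. One way to
make the rule explicit is sketched below. The helper
`ambiguous_packages` and its `ratio` threshold are hypothetical and not
part of kosha, and the sample hits are hand-written stand-ins in the
same shape as `env_context()` results.

```python
from collections import Counter

def ambiguous_packages(hits, ratio=0.5, top=4):
    """Return packages whose hit count is at least `ratio` times the leader's.

    `ratio` is a hypothetical threshold for "comparable"; tune it to taste.
    """
    counts = Counter(r["metadata"].get("package") for r in hits)
    counts.pop(None, None)              # ignore hits with no package recorded
    ranked = counts.most_common(top)    # highest hit count first
    if not ranked:
        return []
    leader = ranked[0][1]
    return [pkg for pkg, n in ranked if n >= ratio * leader]

# Hand-written sample hits (not real kosha output).
sample_hits = (
    [{"metadata": {"package": "monsterui"}}] * 6
    + [{"metadata": {"package": "fasthtml"}}] * 4
    + [{"metadata": {"package": "fastcore"}}] * 1
)
close = ambiguous_packages(sample_hits)
if len(close) > 1:
    print("Ask the user to choose between:", close)
```

With a single package in `close`, the agent can proceed straight to the
next step without interrupting the user.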

### Step 3 — Narrow

Once the package set is known, run `context()` **once** with a
`package:` filter. The result is already enriched with `pagerank`,
`callers`, `callees`, and `co_dispatched`, so do **not** loop and call
`ni()` afterwards; it would only re-fetch the same fields.

``` python
results = k.context('toast notification package:monsterui', limit=10)
for r in results:
    m = r['metadata']
    print(f"{m['mod_name']}  L{m.get('lineno','?')}  pr={r.get('pagerank',0):.4f}  "
          f"callers={list(r['callers'])[:2]}  callees={list(r['callees'])[:2]}")
```

### Step 4 — Trace

Reach for these only when the task spans packages or you need entry
points.

``` python
k.api_call_paths('myapp', 'fasthtml', k=15)        # how myapp reaches into fasthtml
k.short_path('myapp.routes.checkout', 'stripe.Webhook.construct_event')
k.top_nodes('fasthtml', k=5)                       # entry points to read first
k.neighbors('myapp.payments.verify', depth=2)
```
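
For intuition about what a shortest call chain and an N-hop
neighbourhood mean, here is a self-contained sketch on a toy call graph.
It illustrates the general technique (breadth-first search) and does not
show kosha's implementation; the graph, the helper names, and the choice
to follow only caller-to-callee edges are all assumptions.

```python
from collections import deque

# Toy call graph: caller -> callees. Names are illustrative only.
CALLS = {
    "myapp.routes.checkout": ["myapp.payments.verify", "myapp.cart.total"],
    "myapp.payments.verify": ["stripe.Webhook.construct_event"],
    "myapp.cart.total": [],
    "stripe.Webhook.construct_event": [],
}

def shortest_call_chain(graph, src, dst):
    """Breadth-first search: return the shortest caller->callee chain, or None."""
    queue, parent = deque([src]), {src: None}
    while queue:
        node = queue.popleft()
        if node == dst:
            chain = []
            while node is not None:      # walk parent links back to src
                chain.append(node)
                node = parent[node]
            return chain[::-1]
        for callee in graph.get(node, []):
            if callee not in parent:     # first visit is the shortest route
                parent[callee] = node
                queue.append(callee)
    return None

def within_hops(graph, start, depth):
    """Return every node reachable from `start` in at most `depth` calls."""
    seen, frontier = {start}, [start]
    for _ in range(depth):
        nxt = []
        for node in frontier:
            for callee in graph.get(node, []):
                if callee not in seen:
                    seen.add(callee)
                    nxt.append(callee)
        frontier = nxt
    return seen - {start}

chain = shortest_call_chain(CALLS, "myapp.routes.checkout", "stripe.Webhook.construct_event")
print(" -> ".join(chain))
print(sorted(within_hops(CALLS, "myapp.routes.checkout", depth=1)))
```

Because breadth-first search visits nodes in order of distance, the
first chain that reaches the target is also a shortest one.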

### When to skip steps

| Situation                                           | Steps     |
|-----------------------------------------------------|-----------|
| Trivial “how does X work” lookup in a known package | 3 only    |
| First time working in this repo / env               | 1 → 3     |
| Task names a domain, multiple packages could fit    | 1 → 2 → 3 |
| Task spans packages, or you need entry points       | 1 → 3 → 4 |

## `context()` reference

The main entry point. Parses `key:value` filters, auto-detects package
names, fans out repo + env searches in parallel, and merges with chained
RRF. With `graph=True` (default) each result is enriched from
`.kosha/graph.db`.
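
Reciprocal Rank Fusion itself is small enough to sketch. The version
below is the textbook formula, where each ranked list contributes
`1 / (k + rank)` per item. It is not kosha's code: the constant `k=60`
is the commonly used default rather than a value taken from kosha, the
document ids are made up, and reading "chained" as fusing an
already-fused list with another ranking is an assumption.

```python
from collections import defaultdict

def rrf(rankings, k=60):
    """Fuse ranked lists of ids; rank starts at 1, higher fused score is better."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

fts    = ["toast", "alert", "modal"]     # hypothetical keyword ranking
vector = ["alert", "toast", "banner"]    # hypothetical embedding ranking
repo   = rrf([fts, vector])              # first fusion: keyword + vector

env    = ["banner", "alert"]             # hypothetical env-side ranking
merged = rrf([repo, env])                # second fusion: one reading of "chained"
print(merged)
```

RRF only needs ranks, not raw scores, which is why it can combine
keyword and vector results whose scores are not on a common scale.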

Each result is a dict with:

- `content` — code snippet
- `metadata` — `{mod_name, path, lineno, type, package?, docstring?}`
- graph fields — `pagerank`, `in_degree`, `out_degree`, `callers`,
  `callees`, `co_dispatched`

(`co_dispatched` lists siblings registered together at module level,
such as route groups and plugin tables; it is the pattern to follow when
adding a new handler. Because each result is a dict, `r.keys()` shows
the fields actually present.)

### Filters

| Token          | Example            | Effect                               |
|----------------|--------------------|--------------------------------------|
| `package:name` | `package:fasthtml` | Env search restricted to one package |
| `file:glob`    | `file:routes*`     | Repo results by filename             |
| `path:pattern` | `path:api/*`       | Repo results by path                 |
| `lang:ext`     | `lang:py`          | By language                          |
| `type:node`    | `type:FunctionDef` | By AST node type                     |

Plurals (`packages:`, `paths:`) and comma-separated values are
supported.

``` python
# Combined: functions in payments/, restricted to one package
k.context('handle stripe webhook type:FunctionDef path:payments/ package:fasthtml', limit=5)
```
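
To make the token rules concrete, here is a rough sketch of how a query
string could be split into free text and filters, including plural keys
and comma-separated values. It illustrates the idea only and is not
kosha's parser; `split_filters`, `FILTER_KEYS`, and the regular
expression are hypothetical.

```python
import re

FILTER_KEYS = {"package", "file", "path", "lang", "type"}
TOKEN = re.compile(r"^(\w+):(\S+)$")

def split_filters(query):
    """Separate `key:value` filter tokens from free-text search terms."""
    terms, filters = [], {}
    for tok in query.split():
        m = TOKEN.match(tok)
        # Accept plural keys such as `packages:` by dropping a trailing "s".
        key = m.group(1).removesuffix("s") if m else None
        if key in FILTER_KEYS:
            values = [v for v in m.group(2).split(",") if v]
            filters.setdefault(key, []).extend(values)
        else:
            terms.append(tok)            # unknown keys stay in the free text
    return " ".join(terms), filters

text, filters = split_filters(
    "handle stripe webhook type:FunctionDef path:payments/ packages:fasthtml,fastcore"
)
print(text)
print(filters)
```

Tokens with an unrecognised key, such as a URL containing `:`, fall
through to the free-text part in this sketch rather than being dropped.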

## Graph API

| Call | Returns |
|----|----|
| `k.ni(node)` | Node row + callers + callees + co_dispatched + pagerank |
| `k.short_path(a, b)` | Shortest call chain between two nodes |
| `k.neighbors(node, depth=2)` | All nodes within N hops |
| `k.graph.ranked(k=10, module='pkg')` | Top nodes by PageRank |
| `k.public_api(pkg, limit=200)` | Public entries (`__all__` + `@patch`) with docstrings |
| `k.gn(where=...)` / `k.ge(where=...)` | Direct `graph_nodes` / `graph_edges` queries |

## CLI

For shell harnesses (GitHub Copilot CLI, Claude Code hooks, scripts).
Output is Markdown by default; `--as_json` pipes cleanly into `jq`. Run
`kosha` with no args to see all subcommands.

``` bash
kosha sync                                          # once per session; re-run after env or repo changes
kosha context "embed a query" --as_json | jq '.[].metadata.mod_name'
kosha public-api fastcore
kosha api-paths kosha litesearch --k 10
kosha ni "fastcore.basics.merge"
```
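
When `jq` is not available, the same extraction can be done in Python.
This is a small sketch that assumes only the shape implied by the `jq`
filter above, a JSON array of objects each carrying
`metadata.mod_name`. The helper `mod_names` is hypothetical and the
sample payload is hand-written, not real kosha output.

```python
import json

def mod_names(payload):
    """Python equivalent of jq '.[].metadata.mod_name' on --as_json output."""
    return [item["metadata"]["mod_name"] for item in json.loads(payload)]

# Hand-written sample in the assumed shape (not real kosha output).
sample = (
    '[{"metadata": {"mod_name": "pkg.search.embed"}},'
    ' {"metadata": {"mod_name": "pkg.core.query"}}]'
)
print(mod_names(sample))
```

In a script, `payload` would be the captured standard output of
`kosha context "embed a query" --as_json`.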

## Harness install

``` bash
# Project-local (default; auto-discovered by Claude Code, Continue.dev, Cursor, Copilot)
#   .agents/skills/kosha/SKILL.md   ← written by Kosha(install_skill=True); commit it
# Claude Code, global across all projects on this machine:
mkdir -p ~/.claude/skills/kosha && cp .agents/skills/kosha/SKILL.md ~/.claude/skills/kosha/
# Other harnesses: place SKILL.md at whatever path the harness scans (e.g. .continue/skills/kosha/).
```

## pyskills

kosha is registered as a
[pyskill](https://github.com/AnswerDotAI/pyskills) (entry point
`kosha.skill`). LLM hosts can call `list_pyskills()` to discover it
without importing it, then call `doc(kosha.skill)` for the full surface
and instantiate `Kosha` directly.
