Metadata-Version: 2.4
Name: koshas
Version: 0.0.8
Summary: kosha (कोश) — a treasury of your repo and environment context for coding agents. FTS5 + vector search + call graph, no LLMs required.
Project-URL: Repository, https://github.com/vedicreader/kosha
Project-URL: Documentation, https://vedicreader.github.io/kosha/
Author-email: Karthik <karthik.rajgopal@hotmail.com>
License: Apache-2.0
License-File: LICENSE
Keywords: code graph,code search,devtool,nbdev,repo-context
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Requires-Python: >=3.10
Requires-Dist: fastprogress>=1.1.5
Requires-Dist: litesearch>=0.0.22
Requires-Dist: pyan3>=2.4.3
Requires-Dist: pyskills>=0.0.4
Requires-Dist: watchfiles>=1.1.1
Description-Content-Type: text/markdown

# kosha



kosha (कोश) - A treasury of your repo and environment context for humans
and coding assistants.

> kosha gives you persistent knowledge of your codebase and installed
> packages, indexed with FTS5 + vector search + call graph and merged
> with Reciprocal Rank Fusion. Results include the code snippet,
> callers, callees, and PageRank. No LLMs required.

## Install

kosha is a **dev dependency** — it runs at development time so your AI
coding assistant can search your code. It does not ship with your
application.

``` sh
# uv (recommended)
uv add --dev koshas

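# pip (install into your dev environment; `--group` reads groups from pyproject.toml rather than adding a package to one)
pip install koshas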
```

## One-time project setup

Run this once to drop a `SKILL.md` into `.agents/skills/kosha/` — the
file your AI harness reads to know kosha exists and how to call it.

``` python
from kosha.skill import Kosha

Kosha(install_skill=True)   # writes .agents/skills/kosha/SKILL.md at your repo root
# Commit this file so every contributor (and every AI) gets it automatically.
```

## Sync once per session

Index your repo code, installed packages, and call graph in one call.
Subsequent calls are incremental — only changed files and new package
versions are re-indexed.

``` python
k = Kosha()   # auto-detects git repo root

k.sync(pkgs=['fasthtml', 'fastcore', 'litesearch'])
# Indexes:
#   .kosha/code.db   — your repo code chunks + embeddings
#   .kosha/graph.db  — call graph (callers, callees, PageRank)
#   ~/.local/share/kosha/env.db — installed packages (shared across repos)
```
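
One way to picture the incremental behaviour is a comparison of content fingerprints between the last indexed state and the files currently on disk. The sketch below is an illustration under assumptions, not kosha's implementation: how kosha detects changes is not described here, and `digest` and `plan_reindex` are hypothetical names.

```python
import hashlib

def digest(text: str) -> str:
    """Content fingerprint; any stable hash would do."""
    return hashlib.sha256(text.encode()).hexdigest()

def plan_reindex(indexed: dict, current: dict):
    """Compare {path: digest} maps and return (to_index, to_drop)."""
    to_index = sorted(p for p, h in current.items() if indexed.get(p) != h)
    to_drop = sorted(p for p in indexed if p not in current)
    return to_index, to_drop

# Toy state: one file edited, one added, one deleted, one untouched.
indexed = {'app/routes.py': digest('v1'), 'app/db.py': digest('same'), 'app/old.py': digest('x')}
current = {'app/routes.py': digest('v2'), 'app/db.py': digest('same'), 'app/new.py': digest('y')}

to_index, to_drop = plan_reindex(indexed, current)
print(to_index)  # files whose content is new or changed
print(to_drop)   # files that disappeared since the last sync
```

Untouched files never appear in either list, which is what keeps a repeat sync cheap.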

## Searching — `context()`

The main entry point. Parses optional `key:value` filters, auto-detects
package names, fans out searches in parallel, and merges everything with
chained RRF.

With `graph=True` (default) each result is enriched with call graph data
from `.kosha/graph.db`.

``` python
results = k.context('how do I render a toast notification', limit=10)

for r in results:
    m = r['metadata']
    print(f"{m['mod_name']}  (line {m.get('lineno','?')})")
    print(f"  pagerank={r.get('pagerank',0):.5f}  callers={r['callers'][:2]}")
    print(f"  {r['content'][:100]}")
    print()
```
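
For intuition about the merge step, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists. It assumes the commonly used constant `k=60` and a single fusion pass; how kosha chains several fusions, and which constant it uses, is not specified here. The function name `rrf` and the hit ids are made up for illustration.

```python
from collections import defaultdict

def rrf(rankings, k=60):
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) over the lists it appears in."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical hit lists from a keyword search and a vector search.
fts_hits = ['a.merge', 'a.split', 'b.join']
vec_hits = ['b.join', 'a.merge', 'c.load']

fused = rrf([fts_hits, vec_hits])
print(fused)
```

Items that rank well in both lists rise to the top even when neither list agrees on the exact order, and no score normalisation between the two searches is needed.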

## What each result contains

Every result is a plain dict — code snippet plus structural context from
the call graph:

``` python
{
  # The code
  'content':  'def merge(*ds):\n    "Merge all dicts"\n    return {k:v for d in ds ...}',

  # Where it lives
  'metadata': {
      'mod_name': 'fastcore.basics.merge',   # fully-qualified — use in ni() / short_path()
      'path':     '/path/to/fastcore/basics.py',
      'lineno':   655,
      'type':     'FunctionDef',
      'package':  'fastcore',                # present on package results
  },

  # Structural position in the codebase
  'pagerank':      0.00027,  # centrality — higher = more load-bearing
  'in_degree':     8,        # number of callers
  'out_degree':    12,       # number of callees
  'callers':       ['fastcore.script.call_parse._f', ...],
  'callees':       ['fastcore.basics.NS.__iter__', ...],
  'co_dispatched': [],       # functions registered alongside this one
}
```
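
To see why a higher `pagerank` suggests a more load-bearing function, the sketch below runs a plain power-iteration PageRank over a toy call graph. It is an illustration under assumptions: edges point from caller to callee, the damping factor is 0.85, and the node names are invented. kosha's actual computation may differ.

```python
def pagerank(edges, damping=0.85, iters=50):
    """Power-iteration PageRank over (caller, callee) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out = {n: [] for n in nodes}
    for caller, callee in edges:
        out[caller].append(callee)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            targets = out[n] or nodes  # nodes with no callees spread their rank evenly
            share = damping * rank[n] / len(targets)
            for t in targets:
                nxt[t] += share
        rank = nxt
    return rank

# Toy graph: two entry points both rely on `verify`.
edges = [('route', 'verify'), ('cli', 'verify'), ('verify', 'hmac_eq'), ('route', 'render')]
ranks = pagerank(edges)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(f'{name:8s} {ranks[name]:.4f}')
```

Functions that many others depend on, directly or through a chain of calls, accumulate rank, while entry points that nothing calls stay near the baseline.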

`co_dispatched` is particularly useful: it lists functions assigned
together in the same list, dict, or route group at module level — the
pattern to follow when adding a new handler or plugin.
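
The idea behind `co_dispatched` can be sketched with the standard `ast` module: look for module-level assignments whose list, tuple, set, or dict value references several functions defined in the same module. This is only an approximation under stated assumptions; the helper names `co_dispatch_groups` and `co_dispatched` and the sample source are hypothetical, and kosha's own detection rules (for example route groups) are not shown here.

```python
import ast

SOURCE = '''
def on_paid(e): ...
def on_refund(e): ...
def helper(x): ...
HANDLERS = {"paid": on_paid, "refund": on_refund}
'''

def co_dispatch_groups(source):
    """Return lists of function names that share a module-level container."""
    tree = ast.parse(source)
    funcs = {n.name for n in tree.body
             if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))}
    groups = []
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        val = node.value
        if isinstance(val, ast.Dict):
            elems = val.values
        elif isinstance(val, (ast.List, ast.Tuple, ast.Set)):
            elems = val.elts
        else:
            continue
        names = [e.id for e in elems if isinstance(e, ast.Name) and e.id in funcs]
        if len(names) > 1:
            groups.append(names)
    return groups

def co_dispatched(source, name):
    """Functions registered in the same container as `name`."""
    return sorted({n for g in co_dispatch_groups(source) if name in g for n in g if n != name})

print(co_dispatch_groups(SOURCE))
print(co_dispatched(SOURCE, 'on_paid'))
```

A new handler would follow the same shape as its siblings: define the function, then add it to the container that already lists them.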

## Filter syntax

Add `key:value` tokens anywhere in your query to narrow results. Plural
forms and comma-separated values are supported, for example
`packages:fasthtml,monsterui`.

| Token          | Example            | Effect                             |
|----------------|--------------------|------------------------------------|
| `package:name` | `package:fasthtml` | Restrict env search to one package |
| `file:glob`    | `file:routes*`     | Restrict repo results by filename  |
| `path:pattern` | `path:api/*`       | Restrict repo results by path      |
| `lang:ext`     | `lang:py`          | Filter by language                 |
| `type:node`    | `type:FunctionDef` | Filter by AST node type            |

Filters can be combined and stacked:
`"stripe webhook path:payments/ type:FunctionDef"`

``` python
# parseq strips filter tokens from a query — fast, no DB needed
bare, filt = parseq('stripe webhook path:payments/ type:FunctionDef')
print(f'query:   {bare!r}')
print(f'filters: {dict(filt)}')
```

``` python
# Restrict to a specific package
results = k.context('render a table package:fasthtml', limit=5)

# Functions only, in the payments directory
results = k.context('handle stripe webhook type:FunctionDef path:payments/', limit=5)

# Multiple packages — fan-out in parallel, results merged
results = k.context('payments page packages:fasthtml,monsterui', limit=15)
```
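
The filter syntax above can be approximated in a few lines. The sketch below is not `parseq` itself: `split_filters` is a hypothetical stand-in, and it assumes that plural keys are singularised by dropping a trailing `s`, that values split on commas, and that unknown `key:value` tokens stay in the query text.

```python
import re
from collections import defaultdict

KNOWN = {'package', 'file', 'path', 'lang', 'type'}   # keys from the table above
TOKEN = re.compile(r'^(\w+):(\S+)$')

def split_filters(query):
    """Separate free-text words from key:value filter tokens."""
    words, filters = [], defaultdict(list)
    for tok in query.split():
        m = TOKEN.match(tok)
        key = m.group(1).lower() if m else None
        if key and key.endswith('s') and key[:-1] in KNOWN:
            key = key[:-1]                             # packages: -> package:
        if key in KNOWN:
            filters[key].extend(v for v in m.group(2).split(',') if v)
        else:
            words.append(tok)                          # ordinary word or unknown key
    return ' '.join(words), dict(filters)

bare, filt = split_filters('payments page packages:fasthtml,monsterui type:FunctionDef')
print(repr(bare))
print(filt)
```

Because tokens are recognised by key, they can appear anywhere in the query without disturbing the remaining search text.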

## The structural layer — CodeGraph

`k.graph` is a `CodeGraph` backed by `.kosha/graph.db`. After
`k.sync()`, the graph covers your repo and every indexed package. You
can traverse it directly, or let `context()` enrich results
automatically.

``` python
# Full structural info for any node
k.ni('fastcore.basics.merge')
# → {node, flavor, file, pagerank, in_degree, out_degree, callers, callees, co_dispatched}

# Top nodes by PageRank within a module
k.graph.ranked(10, module='fastcore.basics')

# Shortest call chain between two nodes
k.short_path('apswutils.db.Table.upsert', 'apswutils.db.Table.insert_chunk')
# → ['...upsert', '...upsert_all', '...insert_all', '...insert_chunk']

# Everything within 2 hops of a node
k.neighbors('myapp.payments.verify_webhook', depth=2)

# Direct table queries
k.gn(where='node like "%stripe%"')    # graph_nodes
k.ge(where='caller like "%route%"')   # graph_edges
```
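
Conceptually, a shortest call chain and an N-hop neighbourhood are both breadth-first traversals. The sketch below shows the idea on a toy adjacency dict that mirrors the chain in the example above. It is an assumption-laden illustration: it follows caller-to-callee edges only, uses shortened node names, and its function names merely echo the methods shown; kosha's own traversal may behave differently.

```python
from collections import deque

def short_path(graph, src, dst):
    """Breadth-first search: first path found is a shortest one, or None."""
    prev, queue = {src: None}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:        # walk predecessor links back to src
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None

def neighbors(graph, start, depth=2):
    """All nodes reachable from `start` in at most `depth` hops."""
    seen, frontier = {start}, {start}
    for _ in range(depth):
        frontier = {n for f in frontier for n in graph.get(f, ())} - seen
        seen |= frontier
    return seen - {start}

# Toy call graph: caller -> callees.
calls = {
    'upsert': ['upsert_all'],
    'upsert_all': ['insert_all'],
    'insert_all': ['insert_chunk', 'log'],
}

print(short_path(calls, 'upsert', 'insert_chunk'))
print(sorted(neighbors(calls, 'upsert', depth=2)))
```

Since edges are directed, a path from `insert_chunk` back to `upsert` does not exist in this toy graph, so the function returns `None`.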

## API discovery

Two functions for quickly surfacing what a package exposes.

**`pkg_url(pkg)`** — returns the best web URL for an installed package
from its metadata (Source Code \> Repository \> Home-page). Useful when
you need to fetch docs or browse source.
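
The preference order can be sketched with the standard library's `importlib.metadata`. This is a rough approximation rather than `pkg_url` itself: `best_url` and `installed_url` are hypothetical helpers, and case-insensitive label matching is an assumption.

```python
from importlib import metadata

def best_url(project_urls, home_page=None):
    """Pick a URL from 'Label, url' entries: Source Code > Repository > Home-page."""
    labelled = {}
    for entry in project_urls:
        label, _, url = entry.partition(',')
        labelled[label.strip().lower()] = url.strip()
    for label in ('source code', 'repository'):
        if label in labelled:
            return labelled[label]
    return home_page

def installed_url(pkg):
    """Same lookup against an installed distribution's metadata."""
    md = metadata.metadata(pkg)
    return best_url(md.get_all('Project-URL') or [], md.get('Home-page'))

# Project-URL entries as they appear in this package's own metadata.
urls = [
    'Repository, https://github.com/vedicreader/kosha',
    'Documentation, https://vedicreader.github.io/kosha/',
]
print(best_url(urls))
```

Calling `installed_url('some-package')` only works for a distribution that is installed in the current environment; otherwise `importlib.metadata` raises `PackageNotFoundError`.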

**`k.public_api(module, min_callees, limit)`** — queries the code graph
for public entry points: functions with `in_degree=0` (no caller anywhere
in the indexed code) and no underscore prefix. Returns
each node’s name, graph metrics, and docstring pulled from the indexed
content.

``` python
from kosha import pkg_url

# Package web URL — useful for WebFetch or browsing docs
pkg_url('fastcore')   # → 'https://github.com/fastai/fastcore'
pkg_url('httpx')      # → 'https://github.com/encode/httpx'

# Public API surface from the call graph
# Returns functions with in_degree=0 (no internal callers) + their docstrings
api = k.public_api(module='fastcore', min_callees=1, limit=20)
for fn in api:
    print(fn['node'], '|', (fn.get('docstring') or '')[:60])   # tolerate a missing or empty docstring

# Without module filter — all public entry points across everything indexed
all_entry_points = k.public_api(min_callees=0)
```
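
The selection rule described for `public_api` (no callers, no underscore prefix, at least `min_callees` callees) can be sketched over a plain edge list. The function `public_entry_points` and the node names are hypothetical; the real method queries the graph database and also returns metrics and docstrings.

```python
def public_entry_points(edges, min_callees=0):
    """Nodes with in_degree == 0, enough callees, and a public (non-underscore) name."""
    nodes = {n for edge in edges for n in edge}
    in_deg = dict.fromkeys(nodes, 0)
    out_deg = dict.fromkeys(nodes, 0)
    for caller, callee in edges:
        out_deg[caller] += 1
        in_deg[callee] += 1

    def is_public(node):
        # Judge by the last dotted component, e.g. 'app._setup' is private.
        return not node.rsplit('.', 1)[-1].startswith('_')

    return sorted(n for n in nodes
                  if in_deg[n] == 0 and out_deg[n] >= min_callees and is_public(n))

# Toy (caller, callee) edges.
edges = [
    ('app.main', 'app.load'),
    ('app.main', 'app._setup'),
    ('app._debug', 'app.load'),
    ('app.load', 'app.parse'),
]
entry = public_entry_points(edges, min_callees=1)
print(entry)
```

Here `app._debug` also has no callers but is excluded by its underscore prefix, and `app.load` is excluded because other indexed code calls it.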

## Composing a plan — the full workflow

The highest-value pattern strings `context` → `short_path` → `ni`
together. Each step narrows the search space and adds structural
evidence before you write a line of code.

**Step 1** — find the key functions (graph-enriched)

``` python
results = k.context('webhook signature verification payments', limit=20, graph=True)
# Sort by pagerank to find the structural load-bearers
key = sorted(results, key=lambda r: -r.get('pagerank', 0))
```

**Step 2** — map the call chains

``` python
from itertools import combinations
nodes = [r['metadata']['mod_name'] for r in key[:8]]
paths = [p for a, b in combinations(nodes, 2) if (p := k.short_path(a, b))]
paths.sort(key=len)   # shortest = tightest coupling between your key nodes
```

**Step 3** — drill into the join points

``` python
for node in nodes[:5]:
    info = k.ni(node)
    # callers       → where to hook in upstream
    # callees       → what you can reuse
    # co_dispatched → pattern to follow when adding a new handler alongside existing ones
```

**Step 4** — write your plan, grounded in `mod_name:lineno`

``` python
for r in key[:5]:
    m = r['metadata']
    print(f"{m['mod_name']}  line {m.get('lineno','?')}  pagerank={r.get('pagerank',0):.5f}")
```

Quoting `mod_name` + `lineno` in each step of your plan anchors the plan
to the actual code.

## Using with Claude Code and other harnesses

### Project-local (commit alongside code)

The `Kosha(install_skill=True)` call above writes
`.agents/skills/kosha/SKILL.md`. Most agent harnesses (Claude Code,
Continue.dev, Cursor, Copilot) auto-discover skills in
`.agents/skills/`. Committing this file means every contributor — human
and AI — gets it automatically.

### Claude Code — global (all projects on this machine)

``` bash
mkdir -p ~/.claude/skills/kosha
cp .agents/skills/kosha/SKILL.md ~/.claude/skills/kosha/SKILL.md
```

Once installed globally, Claude Code will load the kosha skill at the
start of every session in every repo.

### Other harnesses

Place `SKILL.md` wherever the harness discovers agent skills. Common
locations:

- `.agents/skills/kosha/SKILL.md`: general convention
- `.continue/skills/kosha/SKILL.md`: Continue.dev
- If the path differs, configure it in the harness settings

## CLI

kosha ships a `kosha` command for shell-based harnesses (GitHub Copilot,
Claude Code hooks, scripts). Default output is readable markdown; add
`--as_json` for JSON piping.

``` bash
# Sync repo + all pyproject.toml deps into .kosha/ databases
kosha sync

# Fan-out search (repo + env + graph enrichment)
kosha context "embed a query" --limit 10
kosha context "embed a query" --limit 5 --as_json | jq '.[].metadata.mod_name'

# Repo-only / env-only
kosha repo-context "parse filters"
kosha env-context "fastcore store_attr"

# Node info — callers, callees, co_dispatched, pagerank
kosha ni "kosha.core.Kosha"
kosha ni "fastcore.basics.merge" --as_json

# Package public API (respects __all__ + @patch methods)
kosha public-api fastcore
kosha public-api kosha.graph

# Shortest call-graph paths between two packages
kosha api-paths kosha litesearch --k 10

# BFS dependency layers ordered by coupling strength
kosha dep-stack --seeds kosha --depth 2

# Top PageRank nodes in a package's public API
kosha top-nodes fastcore --k 5

# Live incremental re-index (blocking, Ctrl-C to stop)
kosha watch
```
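
When a script rather than `jq` consumes the `--as_json` output, the same extraction can be done in Python. The sketch below assumes the JSON is a list of result dicts shaped like the ones described earlier, as the `jq '.[].metadata.mod_name'` example implies; the sample payload is invented for illustration and `mod_names` is a hypothetical helper.

```python
import json

def mod_names(payload: str):
    """Equivalent of jq '.[].metadata.mod_name' on a JSON array of results."""
    return [hit['metadata']['mod_name'] for hit in json.loads(payload)]

# Invented sample standing in for the CLI's JSON output.
sample = json.dumps([
    {'content': 'def merge(*ds): ...', 'metadata': {'mod_name': 'fastcore.basics.merge'}},
    {'content': 'class Kosha: ...', 'metadata': {'mod_name': 'kosha.core.Kosha'}},
])

names = mod_names(sample)
print(names)
```

In a real pipeline the payload would come from the command's standard output, for example via `subprocess.run(..., capture_output=True, text=True).stdout`.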

## pyskills integration

kosha registers itself as a
[pyskill](https://github.com/AnswerDotAI/pyskills) — a Python-native
plugin that LLM hosts (e.g. solveit) can discover and load without
importing upfront.

``` python
from pyskills.core import list_pyskills, doc

# Discovery — no import needed
list_pyskills()   # → {'kosha.skill': 'kosha — FTS5 + vector search...', ...}

# Load and inspect
import kosha.skill
doc(kosha.skill)          # module overview: Kosha class + all allowed methods
doc(kosha.skill.Kosha)    # class detail: __init__, all public methods with signatures

# Use normally
from kosha.skill import Kosha
k = Kosha()
k.sync()
k.context("how does routing work")
```

The `allow({Kosha: ...})` declaration in `kosha.skill` tells pyskills
hosts which methods are safe to call in sandboxed environments — all
public methods are permitted.
