# kosha


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

kosha (कोश) - A treasury of your repo and environment context for humans
and coding assistants.

> kosha gives you persistent knowledge of your codebase and installed
> packages, indexed with FTS5 + vector search + call graph and merged
> with Reciprocal Rank Fusion. Results include the code snippet, callers,
> callees, and PageRank. No LLMs required.

## Install

kosha is a **dev dependency** — it runs at development time so your AI
coding assistant can search your code. It does not ship with your
application.

``` sh
# uv (recommended)
uv add --dev koshas

# pip (no `add --dev` equivalent; install into your development environment)
pip install koshas
```

## One-time project setup

Run this once to drop a `SKILL.md` into `.agents/skills/kosha/` — the
file your AI harness reads to know kosha exists and how to call it.

``` python
Kosha(install_skill=True)   # writes .agents/skills/kosha/SKILL.md at your repo root
# Commit this file so every contributor (and every AI) gets it automatically.
```

## Sync once per session

Index your repo code, installed packages, and call graph in one call.
Subsequent calls are incremental — only changed files and new package
versions are re-indexed.

``` python
k = Kosha()   # auto-detects git repo root

k.sync(pkgs=['fasthtml', 'fastcore', 'litesearch'])
# Indexes:
#   .kosha/code.db   — your repo code chunks + embeddings
#   .kosha/graph.db  — call graph (callers, callees, PageRank)
#   ~/.local/share/kosha/env.db — installed packages (shared across repos)
```
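This README does not say how `sync()` decides what has changed. One common way to get incremental behaviour is to compare content hashes against the previous run, and the sketch below illustrates only that general idea. `changed_files` and the `seen` dict are hypothetical names, not part of kosha.

```python
import hashlib
from pathlib import Path

def changed_files(root, seen):
    """Return .py files under `root` whose content hash is new or different.

    `seen` maps path -> sha256 hex digest from the previous run and is
    updated in place, so a second call with no edits returns [].
    """
    changed = []
    for p in sorted(Path(root).rglob('*.py')):
        digest = hashlib.sha256(p.read_bytes()).hexdigest()
        if seen.get(str(p)) != digest:
            changed.append(p)
            seen[str(p)] = digest
    return changed
```

Persisting `seen` between sessions (for example in a small SQLite table) is what would make later runs cheap. Only the files returned here would need re-chunking and re-embedding.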

## Searching — `context()`

The main entry point. Parses optional `key:value` filters, auto-detects
package names, fans out searches in parallel, and merges everything with
chained RRF.

With `graph=True` (the default), each result is enriched with call graph
data from `.kosha/graph.db`.

``` python
results = k.context('how do I render a toast notification', limit=10)

for r in results:
    m = r['metadata']
    print(f"{m['mod_name']}  (line {m.get('lineno','?')})")
    print(f"  pagerank={r.get('pagerank',0):.5f}  callers={r.get('callers', [])[:2]}")
    print(f"  {r['content'][:100]}")
    print()
```
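The merge step is described as chained RRF. As a rough illustration of plain Reciprocal Rank Fusion (not kosha's actual implementation, and without the chaining), each item scores the sum of `1 / (k + rank)` over the lists it appears in. The function name `rrf`, the sample ids, and the constant `k=60` (the value conventionally used with RRF) are assumptions made for this sketch.

```python
from collections import defaultdict

def rrf(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    return sorted(scores, key=lambda item: -scores[item])

# Hypothetical result ids from a keyword search and a vector search
fts = ['a.toast', 'a.alert', 'b.render']
vec = ['b.render', 'a.toast', 'c.flash']
fused = rrf([fts, vec])
print(fused)  # ['a.toast', 'b.render', 'a.alert', 'c.flash']
```

Items that rank well in both lists rise to the top, and no score normalization between the keyword and vector searches is needed.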

## What each result contains

Every result is a plain dict — code snippet plus structural context from
the call graph:

``` python
{
  # The code
  'content':  'def merge(*ds):\n    "Merge all dicts"\n    return {k:v for d in ds ...}',

  # Where it lives
  'metadata': {
      'mod_name': 'fastcore.basics.merge',   # fully-qualified — use in ni() / short_path()
      'path':     '/path/to/fastcore/basics.py',
      'lineno':   655,
      'type':     'FunctionDef',
      'package':  'fastcore',                # present on package results
  },

  # Structural position in the codebase
  'pagerank':      0.00027,  # centrality — higher = more load-bearing
  'in_degree':     8,        # number of callers
  'out_degree':    12,       # number of callees
  'callers':       ['fastcore.script.call_parse._f', ...],
  'callees':       ['fastcore.basics.NS.__iter__', ...],
  'co_dispatched': [],       # functions registered alongside this one
}
```

`co_dispatched` is particularly useful: it lists functions assigned
together in the same list, dict, or route group at module level — the
pattern to follow when adding a new handler or plugin.
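To make the idea concrete, here is a minimal sketch of how functions grouped in a module-level list or dict could be detected with Python's standard `ast` module. This is an assumption about the general technique, not kosha's own code. `dispatch_groups` and the sample source are hypothetical.

```python
import ast

def dispatch_groups(source):
    """Map each module-level container name to the bare names it holds."""
    groups = {}
    for stmt in ast.parse(source).body:      # top-level statements only
        if not isinstance(stmt, ast.Assign):
            continue
        value = stmt.value
        if isinstance(value, (ast.List, ast.Tuple, ast.Set)):
            items = value.elts
        elif isinstance(value, ast.Dict):
            items = value.values
        else:
            continue
        names = [n.id for n in items if isinstance(n, ast.Name)]
        if len(names) > 1:                   # a group needs at least two members
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    groups[target.id] = names
    return groups

src = '''
def on_paid(e): ...
def on_refund(e): ...
HANDLERS = {'paid': on_paid, 'refund': on_refund}
'''
print(dispatch_groups(src))  # {'HANDLERS': ['on_paid', 'on_refund']}
```

In this sketch, `on_refund` would be reported as co-dispatched with `on_paid`, which tells you that a new handler belongs in `HANDLERS` as well.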

## Filter syntax

Add `key:value` tokens anywhere in your query to narrow results. Plural
forms and comma-separated values are supported.

<table>
<colgroup>
<col style="width: 29%" />
<col style="width: 37%" />
<col style="width: 33%" />
</colgroup>
<thead>
<tr>
<th>Token</th>
<th>Example</th>
<th>Effect</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>package:name</code></td>
<td><code>package:fasthtml</code></td>
<td>Restrict env search to one package</td>
</tr>
<tr>
<td><code>file:glob</code></td>
<td><code>file:routes*</code></td>
<td>Restrict repo results by filename</td>
</tr>
<tr>
<td><code>path:pattern</code></td>
<td><code>path:api/*</code></td>
<td>Restrict repo results by path</td>
</tr>
<tr>
<td><code>lang:ext</code></td>
<td><code>lang:py</code></td>
<td>Filter by language</td>
</tr>
<tr>
<td><code>type:node</code></td>
<td><code>type:FunctionDef</code></td>
<td>Filter by AST node type</td>
</tr>
</tbody>
</table>

Filters can be combined and stacked:
`"stripe webhook path:payments/ type:FunctionDef"`

``` python
# parseq strips filter tokens from a query — fast, no DB needed
bare, filt = parseq('stripe webhook path:payments/ type:FunctionDef')
print(f'query:   {bare!r}')
print(f'filters: {dict(filt)}')
```
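For intuition about what such a parser does, the sketch below splits a query into bare words and filters with a regular expression. It is not the real `parseq`, and its return shape may differ. `split_filters`, `FILTER_KEYS`, and the singular-key normalization are assumptions based on the table above. It needs Python 3.9+ for `str.removesuffix`.

```python
import re

FILTER_KEYS = {'package', 'file', 'path', 'lang', 'type'}
TOKEN = re.compile(r'(\w+):(\S+)')

def split_filters(query):
    """Return (bare_query, filters), where filters maps key -> list of values."""
    filters, words = {}, []
    for word in query.split():
        m = TOKEN.fullmatch(word)
        # Treat plural keys such as `packages:` the same as `package:`
        key = m.group(1).removesuffix('s') if m else None
        if key in FILTER_KEYS:
            filters.setdefault(key, []).extend(m.group(2).split(','))
        else:
            words.append(word)               # unknown keys stay in the query
    return ' '.join(words), filters

print(split_filters('payments page packages:fasthtml,monsterui'))
# ('payments page', {'package': ['fasthtml', 'monsterui']})
```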

``` python
# Restrict to a specific package
results = k.context('render a table package:fasthtml', limit=5)

# Functions only, in the payments directory
results = k.context('handle stripe webhook type:FunctionDef path:payments/', limit=5)

# Multiple packages — fan-out in parallel, results merged
results = k.context('payments page packages:fasthtml,monsterui', limit=15)
```

## The structural layer — CodeGraph

`k.graph` is a `CodeGraph` backed by `.kosha/graph.db`. After
`k.sync()`, the graph covers your repo and every indexed package. You
can traverse it directly, or let `context()` enrich results
automatically.

``` python
# Full structural info for any node
k.ni('fastcore.basics.merge')
# → {node, flavor, file, pagerank, in_degree, out_degree, callers, callees, co_dispatched}

# Top nodes by PageRank within a module
k.graph.ranked(10, module='fastcore.basics')

# Shortest call chain between two nodes
k.short_path('apswutils.db.Table.upsert', 'apswutils.db.Table.insert_chunk')
# → ['...upsert', '...upsert_all', '...insert_all', '...insert_chunk']

# Everything within 2 hops of a node
k.neighbors('myapp.payments.verify_webhook', depth=2)

# Direct table queries
k.gn(where='node like "%stripe%"')    # graph_nodes
k.ge(where='caller like "%route%"')   # graph_edges
```
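A shortest call chain like the one returned by `short_path` can be found with a breadth-first search over caller → callee edges. The sketch below shows that technique on a plain edge list. It is not kosha's implementation, and it assumes edges are followed only from caller to callee. `shortest_chain` and the shortened node names are hypothetical.

```python
from collections import deque

def shortest_chain(edges, src, dst):
    """Return the shortest caller-to-callee chain from src to dst, or None."""
    graph = {}
    for caller, callee in edges:
        graph.setdefault(caller, []).append(callee)
    prev, queue = {src: None}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            chain = []
            while node is not None:          # walk back to src via prev links
                chain.append(node)
                node = prev[node]
            return chain[::-1]
        for nxt in graph.get(node, []):
            if nxt not in prev:              # first visit is the shortest route
                prev[nxt] = node
                queue.append(nxt)
    return None

edges = [('upsert', 'upsert_all'), ('upsert_all', 'insert_all'),
         ('insert_all', 'insert_chunk'), ('upsert', 'log')]
print(shortest_chain(edges, 'upsert', 'insert_chunk'))
# ['upsert', 'upsert_all', 'insert_all', 'insert_chunk']
```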

## Composing a plan — the full workflow

The highest-value pattern strings `task_context` → `context` →
`short_path` → `ni` together. Each step narrows the search space and
adds structural evidence before you write a line of code.

**Step 1** — discover the landscape

``` python
tc = k.task_context('add webhook verification to the payments flow', depth=2)
# tc['packages']   → which packages are involved
# tc['dep_layers'] → what each package pulls in; use to decide what to pass to sync()
```

**Step 2** — find the key functions (graph-enriched)

``` python
results = k.context('webhook signature verification payments', limit=20, graph=True)
# Sort by pagerank to find the structural load-bearers
key = sorted(results, key=lambda r: -r.get('pagerank', 0))
```

**Step 3** — map the call chains

``` python
from itertools import combinations
nodes = [r['metadata']['mod_name'] for r in key[:8]]
paths = [p for a, b in combinations(nodes, 2) if (p := k.short_path(a, b))]
paths.sort(key=len)   # shortest = tightest coupling between your key nodes
```

**Step 4** — drill into the join points

``` python
for node in nodes[:5]:
    info = k.ni(node)
    # callers       → where to hook in upstream
    # callees       → what you can reuse
    # co_dispatched → pattern to follow when adding a new handler alongside existing ones
```

**Step 5** — write your plan, grounded in `mod_name:lineno`

``` python
for r in key[:5]:
    m = r['metadata']
    print(f"{m['mod_name']}  line {m.get('lineno','?')}  pagerank={r.get('pagerank',0):.5f}")
```

Quoting `mod_name` + `lineno` in each step of your plan anchors the plan
to the actual code.
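As a small illustration, a helper along these lines could turn results into `mod_name:lineno` anchors ready to paste into a plan. `plan_anchors` is a hypothetical helper, and the sample dicts (including the line number and PageRank values) are made up to mirror the result shape shown earlier.

```python
def plan_anchors(results, top=5):
    """Return 'mod_name:lineno' strings for the highest-PageRank results."""
    ranked = sorted(results, key=lambda r: -r.get('pagerank', 0))
    return [
        f"{r['metadata']['mod_name']}:{r['metadata'].get('lineno', '?')}"
        for r in ranked[:top]
    ]

# Hypothetical results; the second one has no recorded line number
sample = [
    {'metadata': {'mod_name': 'myapp.payments.verify_webhook', 'lineno': 42},
     'pagerank': 0.002},
    {'metadata': {'mod_name': 'myapp.payments.routes.webhook'},
     'pagerank': 0.005},
]
print(plan_anchors(sample))
# ['myapp.payments.routes.webhook:?', 'myapp.payments.verify_webhook:42']
```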

## Using with Claude Code and other harnesses

### Project-local (commit alongside code)

The `Kosha(install_skill=True)` call above writes
`.agents/skills/kosha/SKILL.md`. Most agent harnesses (Claude Code,
Continue.dev, Cursor, Copilot) auto-discover skills in
`.agents/skills/`. Committing this file means every contributor — human
and AI — gets it automatically.

### Claude Code — global (all projects on this machine)

``` bash
mkdir -p ~/.claude/skills/kosha
cp .agents/skills/kosha/SKILL.md ~/.claude/skills/kosha/SKILL.md
```

Once installed globally, Claude Code will load the kosha skill at the
start of every session in every repo.

### Other harnesses

Place `SKILL.md` wherever the harness discovers agent skills. Common
locations:

- `.agents/skills/kosha/SKILL.md` (general convention)
- `.continue/skills/kosha/SKILL.md` (Continue.dev)

If your harness uses a different path, configure it in the harness
settings.