Index the charm docs

Build a local index of Canonical’s charm-ecosystem documentation so the agent answers from cited passages instead of plausible-sounding paraphrase.

Cantrip can index Canonical’s official documentation sites (Juju, ops, charmcraft, rockcraft, jubilant, charmlibs). Once a site is indexed, the agent calls docs_search before answering “how do I…” questions, and you can pull cited passages into a message with @docs <site> <query>. Both return passages tagged with their source URLs rather than paraphrased text, so answers can be traced back to the documentation instead of resting on plausible-sounding guesses.

One-time setup

The index needs an embed provider. Voyage is the Anthropic-ecosystem recommendation; OpenAI’s text-embedding-3-small works too. Configure either in your shell:

$ export CANTRIP_EMBED_PROVIDER=voyage
$ export VOYAGE_API_KEY=<your-key>

See Configure embed and rerank for the OpenAI variant and the OPENAI_EMBED_BASE_URL override for self-hosted vLLM.
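
Before a long indexing run it can help to confirm that both variables are actually exported. The function below is a minimal sketch for the Voyage setup shown above; check_embed_env is a hypothetical helper name, not a cantrip command.

```shell
# Hypothetical preflight helper (not part of cantrip): fail early if the
# Voyage configuration from the shell snippet above is incomplete.
check_embed_env() {
  if [ "${CANTRIP_EMBED_PROVIDER:-}" != "voyage" ]; then
    echo "CANTRIP_EMBED_PROVIDER is not set to voyage" >&2
    return 1
  fi
  if [ -z "${VOYAGE_API_KEY:-}" ]; then
    echo "VOYAGE_API_KEY is empty" >&2
    return 1
  fi
  echo "embed provider: voyage"
}
```

Run check_embed_env in the same shell you will index from; a non-zero exit status means one of the two variables is missing.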

Index a site

$ cantrip docs index --site ops
Indexing ops (https://documentation.ubuntu.com/ops/latest/) …
  pages: 184  chunks: 612  embed-batches: 10  errors: 0

Re-running the same command re-crawls the site and replaces stored rows, keyed by a stable hash of (url, ordinal), so you do not need to clear the cache by hand. Crawl errors (timeouts, 404s) do not abort the run; the remaining pages are still indexed.
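
To see why a stable (url, ordinal) key makes re-indexing safe to repeat, consider the sketch below. It is illustrative only: the actual key format cantrip stores is not described here, and chunk_key is a hypothetical name. It assumes sha256sum from GNU coreutils is available.

```shell
# Illustrative only: derive a stable key from a page URL and a chunk's
# position on that page. The real cantrip key format may differ.
chunk_key() {
  printf '%s\n%s\n' "$1" "$2" | sha256sum | cut -d' ' -f1
}

# The same (url, ordinal) pair always yields the same key, so a re-crawl
# overwrites the stored row instead of adding a duplicate.
chunk_key "https://documentation.ubuntu.com/ops/latest/howto/manage-secrets.html" 0
chunk_key "https://documentation.ubuntu.com/ops/latest/howto/manage-secrets.html" 1
```

The two calls print different keys because the ordinal differs; repeating either call prints the same key again.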

--all indexes every registered site:

$ cantrip docs index --all
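
If you only want a subset of sites, one option is a small loop around --site. This is a sketch under two assumptions: that cantrip docs index exits non-zero when a run fails, and that CANTRIP_BIN, a hypothetical variable introduced here for dry runs, is not already used by cantrip itself.

```shell
# Index only the sites named on the command line.
# CANTRIP_BIN is a hypothetical override for dry runs, not a cantrip setting.
index_sites() {
  failed=0
  for site in "$@"; do
    if ! "${CANTRIP_BIN:-cantrip}" docs index --site "$site"; then
      echo "indexing $site failed" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example: index_sites ops charmcraft jubilant
```

Setting CANTRIP_BIN=echo prints the commands that would run without crawling anything.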

See what’s indexed

$ cantrip docs list
Cache root: /home/<you>/.cache/cantrip/docs-index

Site         Indexed  Chunks   Description
------------ -------- -------- ----------------------------------------
juju         no       -        Juju documentation (operator framework + CLI)
ops          yes      612      ops library reference (charm authoring API)
charmcraft   no       -        charmcraft reference (charm packaging tooling)
rockcraft    no       -        rockcraft reference (OCI image packaging)
jubilant     no       -        Jubilant (integration-testing helpers)
charmlibs    no       -        charm libraries reference (PyPI charm libs)

Query an indexed site from the shell:

$ cantrip docs search ops "how do secrets work"
[0.842] https://documentation.ubuntu.com/ops/latest/howto/manage-secrets.html
    Manage charm secrets
    Charm secrets are content-addressed values that …

[0.781] https://documentation.ubuntu.com/ops/latest/reference/secrets.html
    Secret reference
    A secret is created with a label and content via the …

Pipe the output into a fuzzy finder like fzf — or straight into your editor — for quick navigation.
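
A fuzzy finder works best on one candidate per line, so it can help to reduce each hit to its URL first. The filter below is a sketch that assumes the "[score] URL" layout shown in the sample output above; hit_urls is a hypothetical helper name.

```shell
# Keep only the URL from each "[score] URL" hit line; title and snippet
# lines are indented and do not match the pattern.
hit_urls() {
  awk '/^\[[0-9.]+\] / { print $2 }'
}

# Example (requires fzf to be installed):
#   cantrip docs search ops "how do secrets work" | hit_urls | fzf
```

If the output format changes, only the awk pattern needs adjusting.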

Use it in chat

In the TUI or web chat, the agent calls docs_search automatically when answering ecosystem questions, and you can inject specific passages with the @docs mention:

How do I model a database relation? @docs ops relation

The mention expands inline before the message reaches the LLM, attaching the top hits as a context block with their URLs. See @-mention context providers for the full mention catalogue.

Costs

Every batch of chunks goes through the embed provider, so the cost is roughly your embed model’s per-million-token rate multiplied by the corpus size in tokens. At voyage-3’s $0.06/1M rate, indexing the ops reference (~600 chunks × 500 tokens ≈ 300,000 tokens) costs about two cents. The /cost slash command shows a separate By role section once embed traffic exists, so retrieval spend is easy to spot.
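
The estimate can be reproduced for any corpus with a one-line calculation. This sketch takes the chunk count, an assumed average of tokens per chunk, and the per-million-token rate; embed_cost_usd is a hypothetical helper name, and the result is an estimate rather than a billed amount.

```shell
# Estimate embedding cost in USD:
#   chunks x tokens-per-chunk x (rate per 1M tokens) / 1,000,000
embed_cost_usd() {
  awk -v chunks="$1" -v tokens="$2" -v rate="$3" \
    'BEGIN { printf "%.4f\n", chunks * tokens * rate / 1000000 }'
}

embed_cost_usd 600 500 0.06   # prints 0.0180
```

With the ops figures quoted above the result is roughly $0.018. Substitute your own chunk count from cantrip docs list to estimate other sites.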

Caveats