Response schemas
Cantrip ships JSON schemas for the recurring structured-output shapes its callers consume. Pass one to `complete_structured()` (or directly to `provider.complete(response_schema=…)`) when you want a parseable reply rather than free text.
When to use a schema
Pass a response_schema when the caller will parse the
reply rather than show it to the user. Recipes consume planner
briefings. The oracle returns a structured second opinion that
downstream code feeds back into decision points. Acceptance
reports populate test matrices. Free-form prose stays untyped
— schemas are for data, not conversation.
Schemas are plain dicts matching JSON Schema draft 2020-12 — the same surface every supported provider already accepts. No Pydantic, no attrs, no DSL. Pass a built-in or hand-roll your own.
Built-in schemas
Importable from cantrip.llm.schemas. Resolve by name
through BUILTIN_SCHEMAS for caller paths driven by
config (recipes, settings).
PLANNER_BRIEFING-
Output of a planner LLM call: a list of work-queue tasks each
with a
title, acategory(one ofresearch/build/deploy/test/debug/infra/confirm— mirrorscantrip.agent.queue.TaskCategory), an optional description and dependency list. ORACLE_ANSWER-
Shape of an oracle reply when the caller wants more
than free-form prose:
answer(required), optionalconfidencein[0, 1], and lists ofcaveatsandreferences. CHECK_RESULT-
Output of a prompt-based "Check" — the LLM evaluates
a named rule against the active charm and returns
status: pass | fail, amessage, and optionallyseverity,evidence, and asuggested_fix. ACCEPTANCE_REPORT-
Acceptance-test report — what the agent produces after
exercising a deployed charm.
appandoverall_status(pass | fail | partial) are required;coveragerecords which areas were exercised;findingsis a list of issues to surface.
Provider matrix
Native enforcement is an optimisation — Cantrip-side validation runs regardless, so providers without native support still satisfy the contract via the corrective-retry path.
| Provider | Native enforcement | Wire mechanism |
|---|---|---|
| Gemini | Yes | response_mime_type=application/json + response_schema |
| OpenAI-compatible (vLLM, Fireworks, OpenRouter, OpenCode Zen, inference-snap) | Yes (where the backend supports it) | response_format: {type: json_schema, json_schema: {...}} |
| Anthropic Claude | No | Argument accepted but ignored; relies on Cantrip-side validation. |
The provider.supports_response_schema property
distinguishes the two so callers that require native
enforcement can short-circuit early.
Calling complete_structured
The high-level entry point lives at
cantrip.llm.structured.complete_structured. It calls
the provider with the schema, parses the reply, validates it, and
returns a dict — or raises StructuredOutputError on
unrecoverable failure.
from cantrip.llm.schemas import ORACLE_ANSWER
from cantrip.llm.structured import complete_structured
answer = await complete_structured(
provider,
messages=[
Message(role=Role.SYSTEM, content="You are an architecture oracle."),
Message(role=Role.USER, content="Should this charm use Pebble or systemd?"),
],
schema=ORACLE_ANSWER,
)
print(answer["answer"]) # always a string
print(answer.get("confidence")) # optional, in [0, 1] if present
For the lower-level path, pass response_schema=…
directly to provider.complete() and validate
manually with
cantrip.llm.structured.validate_against_schema.
Use this when you need provider-specific kwargs (custom
thinking_budget, tool choices) the helper doesn't
expose.
Validation and retry
Validation strips wrapping ```json code fences,
parses the result with json.loads, and runs
jsonschema.validate against the schema. Failures
raise StructuredOutputError carrying the raw text,
the schema, and the underlying parser or validator error.
complete_structured retries once by default
(retries=1). On failure it appends the malformed
reply as an assistant turn and a corrective USER turn that quotes
the schema and the validation error, asking the model to emit
valid JSON. Set retries=0 for one-shot calls (CI),
or higher when burning extra tokens to coax a recalcitrant model
is acceptable.
When all retries are exhausted, the last error is raised so the caller can surface the most recent malformed output to the user — earlier attempts are discarded.