quant · advisor
A governed 4B advisor over your corpus — exact source-id citations, trusted refusals, local on a DGX Spark
A corpus-grounded advisor is only trustworthy if it cites exactly and refuses cleanly, and prompting alone doesn't get there: the prompted 30B teacher scored 8/21 on the frozen curveball bench with 3 fabrications, while this 4B-SFT-v0.2 lane scores 18/21 with refusals 9/9 and zero private-state leaks. The default Q4_K_M GGUF is 2.6 GB and serves at ~70 tok/s on a DGX Spark (Q8_0 runs at 42 tok/s with identical bench behavior), so the whole answer/refuse/route loop runs locally. The sibling release Orionfold/Advisor-bench carries the frozen OOD bench (pool 75, held-out 28, curveballs 40+21) and the sha-pinned 182-source corpus manifest.
- Governed Q&A over an enterprise or personal corpus with exact source-id citations
- Refusal-gated handling of out-of-corpus and private-state questions
- The local citation lane in a governed routing stack that escalates to a frontier model with receipts
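
As a rough illustration of the answer/refuse/route loop these use cases imply, the sketch below gates a model reply on its citations. Everything in it is an assumption for illustration: the `[src:ID]` citation format, the `REFUSE` prefix, and the `gate` function are hypothetical names, not the release's actual output format or API.

```python
import re

# Hypothetical citation syntax; the model card does not specify the real one.
CITATION_RE = re.compile(r"\[src:([A-Za-z0-9_-]+)\]")


def gate(answer: str, manifest_ids: set[str]) -> str:
    """Classify a reply as 'answer', 'refuse', or 'escalate'.

    - 'refuse'   : the reply starts with the (assumed) REFUSE marker.
    - 'answer'   : at least one citation, and every cited id is in the manifest.
    - 'escalate' : no citations, or any cited id is missing from the manifest.
    """
    if answer.strip().upper().startswith("REFUSE"):
        return "refuse"
    cited = set(CITATION_RE.findall(answer))
    if cited and cited <= manifest_ids:
        return "answer"
    return "escalate"


# Toy manifest; the real corpus manifest lists 182 sha-pinned sources.
manifest = {"doc-001", "doc-002"}
print(gate("Rates rose in Q3 [src:doc-001].", manifest))
print(gate("REFUSE: this is outside the corpus.", manifest))
print(gate("Rates rose in Q3 [src:doc-999].", manifest))
```

In a routing stack, an `escalate` result would be the point to hand the question to a frontier model and record a receipt; how that handoff works is outside this sketch.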
Audience: operators who want a corpus-grounded local advisor, not a hosted assistant, whose citation and refusal behavior is bench-proven (frozen OOD curveballs, strict scoring).
| Variant | Throughput (tok/s) | Score on advisor curveball-v0.2, frozen OOD bench (n=21, scored==strict; refusals 9/9, 0 private-state risk) |
|---|---|---|
| Q4_K_M (sweet spot) | 70.0 | 0.86 |
| Q8_0 | 42.0 | 0.86 |
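
The 0.86 in the table is consistent with the 18/21 curveball result quoted above, assuming the score is simply passed items divided by n=21 and rounded to two decimals (the card does not state the rounding rule). A quick check, with `bench_score` as a hypothetical helper name:

```python
def bench_score(passed: int, total: int = 21) -> float:
    """Fraction of bench items passed, rounded to two decimals."""
    return round(passed / total, 2)


print(bench_score(18))     # 4B-SFT-v0.2 lane: 0.86
print(bench_score(8))      # prompted 30B teacher: 0.38
print(round(70 / 42, 2))   # Q4_K_M vs Q8_0 throughput ratio: 1.67
```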