Weave QuickDistill

Quick Start:
  1. Sync traces: Fetch a Weave project to import your production LLM calls
  2. Select traces: Choose the traces you want to use as your ground-truth "strong model" examples
  3. Run End-to-End Test: Click the button to automatically run cheaper models on your selected traces and evaluate their outputs with an LLM judge
  4. Mine Hard Examples: Find where weak models struggle most and export those examples to build better prompts
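The four steps above can be sketched in plain Python. This is a minimal, self-contained illustration, not the tool's actual implementation: traces are represented as dicts, and `run_weak_model` and `judge_score` are hypothetical stand-ins for real calls to a cheaper model and an LLM judge.

```python
# Sketch of the end-to-end flow on already-synced, selected traces.
# run_weak_model and judge_score are placeholders (assumptions), standing in
# for real API calls to a cheaper model and an LLM judge.

def run_weak_model(prompt: str) -> str:
    # Placeholder: a real version would call a cheaper model's API.
    return prompt.upper()

def judge_score(prompt: str, strong: str, weak: str) -> float:
    # Placeholder judge: 1.0 on exact match, otherwise a length-ratio score.
    if weak == strong:
        return 1.0
    return min(len(weak), len(strong)) / max(len(weak), len(strong), 1)

def end_to_end(traces: list[dict]) -> list[dict]:
    # For each selected trace, run the weak model and score it against
    # the strong-model (ground-truth) output.
    results = []
    for t in traces:
        weak_out = run_weak_model(t["input"])
        score = judge_score(t["input"], t["strong_output"], weak_out)
        results.append({**t, "weak_output": weak_out, "score": score})
    return results
```

Low-scoring entries in the result list are the "hard examples" that step 4 mines.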
⚡ AUTOMATIC WORKFLOW
Select traces, then run end-to-end testing or mine hard examples — all in one click
⚙️ TOOLS
Judges Prompt Builder + Cost Explorer
📋 Manual Workflow (Legacy)
  1. Export selected traces as a test set
  2. Run inference with weak models on the test set
  3. Evaluate results with judges to compare outputs
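The export step of the manual workflow can be sketched as a JSONL dump of hard examples. This is an assumed file layout for illustration, not the tool's real export format; `export_test_set`, the `score` field, and the `threshold` cutoff are all hypothetical.

```python
import json

def export_test_set(results: list[dict], path: str, threshold: float = 0.5) -> int:
    # Keep traces where the weak model scored below the threshold,
    # hardest (lowest score) first, and write them as JSONL.
    hard = sorted((r for r in results if r["score"] < threshold),
                  key=lambda r: r["score"])
    with open(path, "w") as f:
        for r in hard:
            row = {"input": r["input"], "expected": r["strong_output"]}
            f.write(json.dumps(row) + "\n")
    return len(hard)
```

Each JSONL row pairs an input with the strong model's output, so the file can serve directly as a judged test set for prompt iteration.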
✅ Fully supported: OpenAI (chat.completions, responses), Anthropic (Messages), Google Gemini (generate_content, Chat)