Weave QuickDistill
Quick Start:
1. Sync traces: Fetch a Weave project to import traces from your LLM calls.
2. Select evaluation data: Choose a subset of traces to use as input for weak model evaluation, then click 'Export Selected to Test Set'.
3. Generate weak outputs: Run inference with smaller models on your test set.
4. Evaluate quality: Use judges to compare weak model responses against strong model outputs to find the best budget model.
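The last two Quick Start steps (generating weak outputs and judging them against strong outputs) can be illustrated with a minimal sketch. This is not QuickDistill's implementation: the function names `overlap_judge` and `rank_weak_models`, the model names, and the sample outputs are all hypothetical, and the word-overlap judge is only a stand-in for whatever judge you configure under Manage Judges (typically an LLM-based one).

```python
from statistics import mean


def overlap_judge(strong: str, weak: str) -> float:
    """Hypothetical stand-in judge: Jaccard similarity of lowercase word sets.

    Returns a score in [0, 1]; 1.0 means the two answers use the same words.
    """
    s, w = set(strong.lower().split()), set(weak.lower().split())
    if not s and not w:
        return 1.0  # two empty answers are treated as identical
    return len(s & w) / len(s | w)


def rank_weak_models(strong_outputs, weak_outputs, judge=overlap_judge):
    """Score each weak model against the strong outputs and sort best-first.

    strong_outputs: list of reference answers from the strong model.
    weak_outputs:   dict mapping model name -> list of answers, aligned
                    index-by-index with strong_outputs.
    """
    scores = {}
    for model, outputs in weak_outputs.items():
        if len(outputs) != len(strong_outputs):
            raise ValueError(f"{model}: expected {len(strong_outputs)} outputs")
        scores[model] = mean(
            judge(s, w) for s, w in zip(strong_outputs, outputs)
        )
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up example data (assumption, not taken from any real project).
strong = ["Paris is the capital of France", "2 + 2 = 4"]
weak = {
    "small-model-a": ["Paris is the capital of France", "2 + 2 = 4"],
    "small-model-b": ["I do not know", "5"],
}
ranking = rank_weak_models(strong, weak)
print(ranking)
```

Swapping `overlap_judge` for a function that calls an LLM judge keeps the same ranking logic; only the scoring callable changes.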
Project:
No projects loaded
Fetch New Project
Refresh Current
Filter by Operation:
Primary supported: openai.chat.completions.create
All Operations
Filter by Model:
All Models
Select All Filtered
Export Selected to Test Set (0)
Run Weak Models
Run Evaluation
Manage Judges
Total: 0
Shown: 0
Run Weak Model Inference
Select Strong Model Export:
Select which strong model traces to run weak models on
W&B Models:
OpenRouter Models (one per line):
Enter custom model strings from OpenRouter or other providers
Number of Examples:
Will use first N examples from the selected export
Run Inference
Cancel
Run Evaluation
Select Weak Model Results:
Select Judge:
Create/manage judges
Run Evaluation
Close
Completed Evaluations: