Note: Uses LiteLLM format. Examples: openai/gpt-5, anthropic/claude-3.5-sonnet, openai/gpt-4o
Scalar: Instruct model to return JSON with 'score' or 'scores' key
Boolean: Instruct model to return JSON with 'correct' key
Required placeholders: {strong_output}, {weak_output}
Optional placeholder: {question}