modernization_classic (v1.0.0)

2 run summaries · 2 models

Mean win rate

Model      Win rate
Qwen-32B   1.000
Llama-70B  0.000

Per-axis rollup

Model      Scenario       functional_accuracy  completeness
Qwen-32B   cobol_billing  88.5                 82.0
Llama-70B  cobol_billing  72.0                 68.0
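
The report does not say how the mean win rate is derived from the per-axis scores. As a rough sketch only, the snippet below assumes a common head-to-head definition: for each (scenario, axis) pair, a model's win rate is the fraction of other models it strictly beats, and the mean win rate is the average over all pairs. The function name `mean_win_rate` and the `scores` layout are hypothetical and not part of this report's tooling; the numbers are copied from the rollup table above.

```python
# Per-axis scores copied from the rollup table; keys are (scenario, axis).
scores = {
    "Qwen-32B": {
        ("cobol_billing", "functional_accuracy"): 88.5,
        ("cobol_billing", "completeness"): 82.0,
    },
    "Llama-70B": {
        ("cobol_billing", "functional_accuracy"): 72.0,
        ("cobol_billing", "completeness"): 68.0,
    },
}


def mean_win_rate(scores):
    """Assumed definition (not confirmed by the report): per (scenario, axis),
    the fraction of other models strictly beaten, averaged over all pairs."""
    models = list(scores)
    axes = sorted(set().union(*(s.keys() for s in scores.values())))
    result = {}
    for m in models:
        per_axis = []
        for axis in axes:
            # Only compare against models that also have a score on this axis.
            others = [o for o in models if o != m and axis in scores[o]]
            if axis not in scores[m] or not others:
                continue
            wins = sum(scores[m][axis] > scores[o][axis] for o in others)
            per_axis.append(wins / len(others))
        result[m] = sum(per_axis) / len(per_axis) if per_axis else float("nan")
    return result


rates = mean_win_rate(scores)
for model, rate in rates.items():
    print(f"{model}\t{rate:.3f}")
```

Under this assumption the sketch gives 1.000 for Qwen-32B and 0.000 for Llama-70B, which is consistent with the win rates reported above. With only two models and one scenario, other definitions could produce the same figures.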