| Task ID | Band | Score | Passed | Cost |
|---|
| python-security-fix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-security-fix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-bugfix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-bugfix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-performance-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-performance-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-test-writing-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-test-writing-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-multi-file-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-multi-file-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-refactor-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-refactor-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-config-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-config-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-recovery-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-recovery-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-dependency-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-dependency-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-explain-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-explain-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-security-fix-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-security-fix-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-bugfix-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-bugfix-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-performance-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-performance-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-test-writing-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-test-writing-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-multi-file-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-multi-file-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-refactor-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-refactor-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-config-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-config-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-recovery-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-recovery-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-dependency-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-dependency-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-explain-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-explain-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-security-fix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-security-fix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-bugfix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-bugfix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-performance-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-performance-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-test-writing-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-test-writing-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-multi-file-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-multi-file-easy-001 | easy | 0.740 | ✓ | $0.0010 |
Latest run: 42bce1ae-c13b-4fab-a3b2-66046d333adb | Latest model: coder | Latest score: 0.740 | Recorded at: 2026-04-27T16:27:18.067977+00:00
| Run ID | Model | Git SHA | Score | Created |
|---|
| 42bce1ae-c13b-4fab-a3b2-66046d333adb | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T16:27:18.067977+00:00 |
| 4574785f-7767-430e-83b2-f3b1e0b39bfc | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T16:27:17.836205+00:00 |
| 8eedd7b3-8260-404c-a0c1-a575b1a2f90b | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T16:27:17.726656+00:00 |
| af15d949-1fee-473c-8e18-a2136dbd2ca6 | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T16:27:17.584418+00:00 |
| 47c44d25-5344-4bb0-ba11-7d44d45f2848 | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T16:27:17.420972+00:00 |
| Run ID | Task ID | Taxonomy | Score | Cost | Created |
|---|
| 42bce1ae-c13b-4fab-a3b2-66046d333adb | typescript-multi-file-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:27:18.067977+00:00 |
| 4574785f-7767-430e-83b2-f3b1e0b39bfc | python-multi-file-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:27:17.836205+00:00 |
| 8eedd7b3-8260-404c-a0c1-a575b1a2f90b | typescript-test-writing-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:27:17.726656+00:00 |
| af15d949-1fee-473c-8e18-a2136dbd2ca6 | python-test-writing-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:27:17.584418+00:00 |
| 47c44d25-5344-4bb0-ba11-7d44d45f2848 | typescript-performance-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:27:17.420972+00:00 |
| 48b7562e-eea5-4ee8-8de0-4b81cf6833b3 | python-performance-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:27:17.334352+00:00 |
| baba7e2c-128a-411b-8967-2515efc52979 | typescript-bugfix-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:27:17.227381+00:00 |
| 1da723ba-d735-4bd2-8953-636784b1014a | python-bugfix-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:27:17.133150+00:00 |
| f8373f60-b96b-4c10-9b8a-e83571af670e | typescript-security-fix-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:27:17.053236+00:00 |
| afbd58f8-ca1b-4f2d-b77c-55743afb892a | python-security-fix-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:27:16.968150+00:00 |