| Task ID | Band | Score | Passed | Cost |
|---|---|---|---|---|
| python-security-fix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-security-fix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-bugfix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-bugfix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-performance-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-performance-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-test-writing-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-test-writing-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-multi-file-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-multi-file-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-refactor-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-refactor-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-config-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-config-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-recovery-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-recovery-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-dependency-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-dependency-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-explain-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-explain-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-security-fix-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-security-fix-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-bugfix-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-bugfix-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-performance-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-performance-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-test-writing-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-test-writing-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-multi-file-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-multi-file-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-refactor-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-refactor-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-config-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-config-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-recovery-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-recovery-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-dependency-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-dependency-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-explain-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| shell-explain-medium-001 | medium | 0.740 | ✓ | $0.0010 |
| python-security-fix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-security-fix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-bugfix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-bugfix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-performance-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-performance-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-test-writing-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-test-writing-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-multi-file-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| typescript-multi-file-easy-001 | easy | 0.740 | ✓ | $0.0010 |

Latest run: bd6f10ac-2f94-4d89-9155-045dbf0b18d7 | Latest model: coder | Latest score: 0.740 | Recorded at: 2026-04-27T16:14:37.078591+00:00

| Run ID | Model | Git SHA | Score | Created |
|---|---|---|---|---|
| bd6f10ac-2f94-4d89-9155-045dbf0b18d7 | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T16:14:37.078591+00:00 |
| fddda8e4-2787-408c-8c3f-a48687f86ad6 | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T16:14:37.021572+00:00 |
| ccf7f6e1-5799-4822-8542-82c0127356f3 | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T16:14:36.971405+00:00 |
| cbf38656-5716-4244-99f3-4331144e3d36 | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T16:14:36.886328+00:00 |
| f58bc820-e605-401f-8f4d-acda7736a4ba | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T16:14:36.814328+00:00 |

| Run ID | Task ID | Taxonomy | Score | Cost | Created |
|---|---|---|---|---|---|
| bd6f10ac-2f94-4d89-9155-045dbf0b18d7 | typescript-multi-file-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:14:37.078591+00:00 |
| fddda8e4-2787-408c-8c3f-a48687f86ad6 | python-multi-file-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:14:37.021572+00:00 |
| ccf7f6e1-5799-4822-8542-82c0127356f3 | typescript-test-writing-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:14:36.971405+00:00 |
| cbf38656-5716-4244-99f3-4331144e3d36 | python-test-writing-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:14:36.886328+00:00 |
| f58bc820-e605-401f-8f4d-acda7736a4ba | typescript-performance-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:14:36.814328+00:00 |
| b81b274a-f40f-4e75-9152-b8f8d6527c6b | python-performance-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:14:36.740310+00:00 |
| 3b2fffb8-8232-4e6e-829b-7f56cc4636ed | typescript-bugfix-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:14:36.686940+00:00 |
| 0cfd4276-8533-4dde-b0aa-0470c956013d | python-bugfix-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:14:36.623191+00:00 |
| 032ae8b9-860e-495c-8b88-0b8e6e3d5cb9 | typescript-security-fix-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:14:36.572760+00:00 |
| 980705cc-5aea-4de5-b967-0e2f8d9b31ac | python-security-fix-easy-001 | wrong-logic | 0.740 | $0.0010 | 2026-04-27T16:14:36.541799+00:00 |