Profile: gdm-swebench-lite-v1 | Tasks: 8 | Pass rate: 37.5% | Total cost: $0.0080

| Task ID | Band | Score | Passed | Cost |
|---|---|---|---|---|
| python-bugfix-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| canary-typescript-session-003 | hard | 0.310 | ✗ | $0.0010 |
| canary-python-security-001 | hard | 0.310 | ✗ | $0.0010 |
| canary-typescript-auth-006 | hard | 0.310 | ✗ | $0.0010 |
| canary-python-cache-005 | hard | 0.310 | ✗ | $0.0010 |
| python-config-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| python-dependency-easy-001 | easy | 0.740 | ✓ | $0.0010 |
| canary-shell-ops-004 | hard | 0.310 | ✗ | $0.0010 |
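
The summary figures can be checked against the task table. The sketch below is a minimal illustration, not the report generator's actual code: it assumes the pass rate is the share of rows marked ✓ and the summary cost is the sum of the per-task costs. The `rows` list and the `summarize` helper are hypothetical names introduced here.

```python
# Rows copied from the task table: (task_id, band, score, passed, cost_usd).
rows = [
    ("python-bugfix-easy-001", "easy", 0.740, True, 0.0010),
    ("canary-typescript-session-003", "hard", 0.310, False, 0.0010),
    ("canary-python-security-001", "hard", 0.310, False, 0.0010),
    ("canary-typescript-auth-006", "hard", 0.310, False, 0.0010),
    ("canary-python-cache-005", "hard", 0.310, False, 0.0010),
    ("python-config-easy-001", "easy", 0.740, True, 0.0010),
    ("python-dependency-easy-001", "easy", 0.740, True, 0.0010),
    ("canary-shell-ops-004", "hard", 0.310, False, 0.0010),
]


def summarize(rows):
    """Return (task_count, pass_rate_percent, total_cost_usd).

    Assumption: pass rate = passed tasks / all tasks, cost = sum of task costs.
    """
    tasks = len(rows)
    passed = sum(1 for row in rows if row[3])
    pass_rate = 100.0 * passed / tasks
    total_cost = sum(row[4] for row in rows)
    return tasks, pass_rate, total_cost


tasks, pass_rate, total_cost = summarize(rows)
print(f"Tasks: {tasks} | Pass rate: {pass_rate:.1f}% | Cost: ${total_cost:.4f}")
# prints "Tasks: 8 | Pass rate: 37.5% | Cost: $0.0080"
```

Under these assumptions, 3 of 8 tasks pass (all three in the easy band) and eight tasks at $0.0010 each sum to $0.0080, matching the summary line.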

Latest run: 5be9cc23-1170-4cda-85c6-f15ea1f570cc | Latest model: coder | Latest score: 0.310 | Recorded at: 2026-04-27T15:44:15.046881+00:00

| Run ID | Model | Git SHA | Score | Created |
|---|---|---|---|---|
| 5be9cc23-1170-4cda-85c6-f15ea1f570cc | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.310 | 2026-04-27T15:44:15.046881+00:00 |
| 66def481-7d91-4305-8a90-da35804589c5 | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T15:44:14.987334+00:00 |
| e536c52f-4907-4bb6-b8fa-f29aacd3237c | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.740 | 2026-04-27T15:44:14.943627+00:00 |
| 98a3a464-d329-48d9-b236-62e444d5cc2c | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.310 | 2026-04-27T15:44:14.900759+00:00 |
| 05a670a3-9308-4336-af1f-ddcc986fcd29 | coder | 4669773b4fbe9d507f1396f38777a1b36998faf3 | 0.310 | 2026-04-27T15:44:14.853498+00:00 |

pass_rate vs cost_usd (Pareto frontier marked with *)

    * [#######-------------] 37.5% @ $0.0010 (coder)
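
For readers unfamiliar with the marker, the sketch below shows one common way to decide which entries earn the `*`. It rests on assumptions rather than the report's actual implementation: a point is on the frontier when no other point has a pass rate at least as high and a cost at least as low, with one of the two strictly better, and the 20-character bar is filled by truncating `pass_rate * 20`. The function names `pareto_frontier` and `render` are hypothetical.

```python
def pareto_frontier(points):
    """Return the names of points that no other point dominates.

    Each point is (name, pass_rate, cost_usd); a higher pass_rate and a
    lower cost_usd are both considered better.
    """
    frontier = []
    for name, rate, cost in points:
        dominated = any(
            (r >= rate and c <= cost) and (r > rate or c < cost)
            for _, r, c in points
        )
        if not dominated:
            frontier.append(name)
    return frontier


def render(points, width=20):
    """Render one chart line per point, marking frontier points with '*'."""
    on_frontier = set(pareto_frontier(points))
    lines = []
    for name, rate, cost in points:
        filled = int(rate * width)  # assumption: the bar length is truncated
        bar = "#" * filled + "-" * (width - filled)
        marker = "*" if name in on_frontier else " "
        lines.append(f"{marker} [{bar}] {rate * 100:.1f}% @ ${cost:.4f} ({name})")
    return lines


# The only point in this report: model "coder" at 37.5% and $0.0010.
points = [("coder", 0.375, 0.0010)]
print("\n".join(render(points)))
# prints "* [#######-------------] 37.5% @ $0.0010 (coder)"
```

With a single model in the report, that model is on the frontier by definition; the marker only becomes informative once several models are compared.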