You are analyzing an agent trial run. Evaluate the trial against the criteria below, and provide a short rationale for each.

You are in the trial directory. Read trial files using relative paths (e.g., "result.json", "agent/trajectory.json").

{task_section}

Read ALL relevant files before evaluating:

Trial files:
- result.json — trial outcome, reward, exception info
- agent/trajectory.json — the agent's full action history
- verifier/test-stdout.txt (if it exists) — test output
- exception.txt (if it exists) — failure details

Evaluate each criterion one at a time. For each criterion, think about the evidence before making your judgment. Reference specific files, trajectory steps, or test output in your explanations.

Your response must include:
- "trial_name": the trial directory name
- "summary": a 3-5 sentence overview of what happened. Include what the agent attempted, any key errors or issues, and how close the agent got to solving the task (e.g., passed some tests, had the right approach but got stuck, or failed early without progress).
- "checks": evaluation of each criterion below, with outcome (pass/fail/not_applicable) and explanation
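
For illustration only, the required fields could be modeled roughly as in the sketch below. This is not a required format. Only "trial_name", "summary", "checks", and the three outcome values come from the list above; the class names, the "name" field on each check, and the choice of a list for "checks" are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List

# The three outcome values named in the requirements.
ALLOWED_OUTCOMES = ("pass", "fail", "not_applicable")


@dataclass
class Check:
    name: str         # hypothetical: identifier of the criterion being evaluated
    outcome: str      # one of ALLOWED_OUTCOMES
    explanation: str  # rationale citing files, trajectory steps, or test output

    def __post_init__(self):
        # Reject any outcome outside the three allowed values.
        if self.outcome not in ALLOWED_OUTCOMES:
            raise ValueError("outcome must be one of: " + ", ".join(ALLOWED_OUTCOMES))


@dataclass
class TrialAnalysis:
    trial_name: str   # the trial directory name
    summary: str      # 3-5 sentence overview
    checks: List[Check] = field(default_factory=list)


# Placeholder values only; a real response is filled in from the trial files.
example = TrialAnalysis(
    trial_name="example-trial",
    summary="...",
    checks=[Check(name="example_criterion", outcome="pass", explanation="...")],
)
```

Validating the outcome at construction time is one way to catch a misspelled value such as "n/a" before the analysis is reported.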

Guidance:
{criteria_guidance}
