{% extends "base.html" %}
{% block title %}{{ data.review_name }}{% endblock %}
{% block content %}
{{ data.paper.title }}
Reminder: this output is a draft-polishing aid, not a peer-review
generator. Most venues have strict policies against using LLMs in assigned reviews; use it at your
own discretion, and disclose when you have used it. Every comment is a
suggestion to evaluate, not a finding to accept. AI reviewers hallucinate, miss context,
and over-confidently flag non-issues. Expect to reject roughly half of what you see.
LLM: {{ data.review_name }}
{{ data.llm_provider }} / {{ data.llm_model }}
· Base URL: {{ data.llm_base_url or '(default)' }}
{% if data.launched_at %}
· Launched: {{ data.launched_at }}
{% endif %}
{% if data.ended_at %}
· Ended: {{ data.ended_at }}
{% endif %}
Everything produced by this review lives in a single directory. Paths below are absolute, so you can copy-paste them straight into a shell.
Run directory: {{ run_files.run_dir }}
| File | Size | Description |
|---|---|---|
| {{ f.abs_path }} | {{ f.size }} | {{ f.description }} |
{{ data.selected|length }} reviewers selected by topic similarity, then diversified across personas.
| ID | Domain | Persona | Selection relevance |
|---|---|---|---|
| {{ r.id }} | {{ r.domain }} | {{ r.persona }} | {{ "%.3f"|format(r.score) }} |
Format-fix retries: {{ data.n_format_repairs or 0 }} of {{ data.n_reviewers_total }} reviewer(s) (incl. the clarity reviewer) needed a markdown-repair pass to produce usable output. {% if (data.n_format_repairs or 0) > 0 %} A high count suggests the model is deviating from the expected comment format — consider a different provider/model or tightening the persona prompt. {% endif %}
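For context, here is a hedged sketch of what such a repair pass might look like. `parse_comments` and `build_repair_prompt` are hypothetical stand-ins, not this tool's actual API; the real pipeline may differ.

```python
# Hedged sketch of a markdown-repair pass (assumed mechanics, not the
# tool's actual implementation): if a reviewer's raw output does not
# parse into the expected comment format, re-prompt the model once to
# reformat it, and count that as one format-fix retry.
def review_with_repair(llm, prompt, parse_comments, build_repair_prompt):
    raw = llm(prompt)
    try:
        return parse_comments(raw), 0             # parsed cleanly
    except ValueError:
        repaired = llm(build_repair_prompt(raw))  # one repair attempt
        return parse_comments(repaired), 1        # counted in the stat above
```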
{% endif %}
Always-on reviewer focused on writing quality (flow, terminology,
grammar, figures, structure). Not part of the ranked issues, and
not compared against human reviewers during Validation.
Reviewer: {{ data.clarity_review._reviewer_id }} /
{{ data.clarity_review._persona }}.
{{ c.category or 'clarity' }},
section: {{ c.section_reference or 'general' }}
{{ c.summary }}
{% endif %} {% if c.description %}{{ c.description }}
{% endif %}
Ordered by commonality × importance: issues multiple reviewers raise, weighted by how severe they are, rise to the top.
Each issue cluster groups comments that look like the same concern (cosine similarity on summary + keywords). Clusters are then scored:
score = num_distinct_reviewers × (0.5·avg_severity + 0.5·max_severity)
where the severity weights are minor = 1, moderate = 2, major = 3.
Higher-severity comments AND broader reviewer agreement both push a cluster up. Intuition: one reviewer's minor issue scores 1·(0.5·1 + 0.5·1) = 1; three reviewers agreeing on a moderate issue scores 3·(0.5·2 + 0.5·2) = 6; five reviewers on a mix of major and moderate scores 5·(0.5·2.4 + 0.5·3) = 13.5.
The severity badge next to each cluster shows the worst individual comment in the cluster, not an average. A cluster can legitimately have score 6 with a MINOR label — that means six reviewers all flagged the same thing as minor, and none escalated it to moderate or major. The count of agreeing reviewers still pushes its score up. The small coloured chips beside the label ("6 minor" etc.) show the severity mix so you can tell a broadly-agreed nit from a genuinely severe issue at a glance.
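To make the arithmetic concrete, here is a minimal sketch of this scoring rule. The `Comment` type and function name are illustrative assumptions; the formula, the severity weights, and the three worked examples come directly from the text above.

```python
# Minimal sketch of the cluster-scoring rule above. The Comment type and
# function name are illustrative assumptions; the formula, the severity
# weights, and the worked examples come directly from this page.
from dataclasses import dataclass
from math import isclose

SEVERITY_WEIGHT = {"minor": 1, "moderate": 2, "major": 3}

@dataclass
class Comment:
    reviewer_id: str
    severity: str  # "minor" | "moderate" | "major"

def cluster_score(cluster: list[Comment]) -> float:
    weights = [SEVERITY_WEIGHT[c.severity] for c in cluster]
    avg_sev = sum(weights) / len(weights)
    n_reviewers = len({c.reviewer_id for c in cluster})
    return n_reviewers * (0.5 * avg_sev + 0.5 * max(weights))

# The three worked examples from the text:
assert isclose(cluster_score([Comment("r1", "minor")]), 1.0)
assert isclose(cluster_score([Comment(f"r{i}", "moderate") for i in range(3)]), 6.0)
mixed = [Comment(f"r{i}", s) for i, s in
         enumerate(["major", "major", "moderate", "moderate", "moderate"])]
assert isclose(cluster_score(mixed), 13.5)
```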
Each cluster row shows two tags pulled from the representative comment (the highest-severity member); a sketch of how both are normalized follows below:
category — a classification of what kind of
concern this is, snapped to a fixed vocabulary at parse
time. Examples: novelty, evaluation,
methodology, reproducibility,
presentation, deployment,
related_work, correctness. This is
what the validator uses to route each miss to an expected
persona (Methodology Critic catches methodology,
Novelty Hunter catches novelty, etc.). Shown as
general if the LLM didn't pick a specific one.
section — a normalized anchor into where in
the paper the comment applies, extracted from the
reviewer's text by pattern matching: Section 3.2,
Table 2, Figure 4,
Algorithm 1, Eq. 7, or named parts
like Abstract, Introduction,
Related Work, Conclusion,
References. Shown as general when
the comment doesn't point at any specific part of the paper.
Clusters group comments by semantic similarity, so the representative's category and section are shared by most of the cluster — but individual members may phrase section references slightly differently. Expand the "other reviewers raised the same issue" details below each cluster to see the full text each reviewer wrote.
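The sketch below illustrates the two parse-time normalizations described above. The helper names and the exact regular expression are assumptions; the category vocabulary and the recognized section forms are the ones listed on this page.

```python
# Illustrative sketch of the two parse-time normalizations described
# above. The helper names and regex are assumptions; the category
# vocabulary and section forms are the ones listed on this page.
import re

CATEGORIES = {
    "novelty", "evaluation", "methodology", "reproducibility",
    "presentation", "deployment", "related_work", "correctness",
}

# Numbered anchors (Section 3.2, Table 2, Figure 4, Algorithm 1, Eq. 7)
# and named parts (Abstract, Introduction, Related Work, ...).
SECTION_RE = re.compile(
    r"\b(Section\s+\d+(?:\.\d+)*|Table\s+\d+|Figure\s+\d+|"
    r"Algorithm\s+\d+|Eq\.?\s*\d+|Abstract|Introduction|"
    r"Related Work|Conclusion|References)\b",
    re.IGNORECASE,
)

def snap_category(raw: str | None) -> str:
    """Snap a free-text category to the fixed vocabulary, else 'general'."""
    if raw is None:
        return "general"
    key = raw.strip().lower().replace(" ", "_")
    return key if key in CATEGORIES else "general"

def extract_section(text: str) -> str:
    """Return the first recognizable section anchor, else 'general'."""
    m = SECTION_RE.search(text)
    return m.group(1) if m else "general"

print(snap_category("Related work"))                          # related_work
print(extract_section("The ablation in Table 2 omits ..."))   # Table 2
```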
Description. {{ rep.description }}
{% if c.members|length > 1 %}{{ m._reviewer_id }}
({{ m._persona }})
{{ (m.severity or 'minor')|upper }}
category: {{ m.category or 'general' }},
section: {{ m.section_reference or 'general' }}
{{ m.summary }}
{% endif %} {% if m.description %}{{ m.description }}
{% endif %}
No issues were raised: surprisingly, the reviewers produced no comments for this paper.
{% endfor %}
{% endblock %}