{% extends "base.html" %} {% block title %}AI review database{% endblock %} {% block content %}

AI review database

A reviewer database is a markdown file defining a pool of AI reviewer personas for a specific field. The bundled default covers computer architecture. To run reviews for a different field, build a database externally and upload it here.

Available databases

Source files on disk: {{ databases_dir }}

Database | Reviewers | Path

{% for db in databases %}
{{ db.label }} | {{ db.n_reviewers }} | {{ db.path }} | View {% if db.can_delete %}{% else %} default {% endif %}
{% endfor %}

Upload a database

Upload a .md reviewer database file. The file is parsed on upload; malformed databases are rejected with an error message.

For filename rules, naming conventions, and how labels in the table are derived, see the Filename handling section in Database Format.

Build a new database with an external AI

This tool does not ship a generator — only the bundled computer-architecture default is built in. For any other research field, the recommended workflow is to hand a compact YAML configuration (listing field, domains, personas) to a strong general-purpose LLM (Claude, GPT, Gemini, … whichever you have access to) and ask it to expand that config into a full reviewer-database markdown file.

Step 1 — Start from the YAML template

Download the bundled config as a starting point:

Download template YAML

Open the file and edit the following keys:

  1. field — short name of your research area (e.g. "computer vision", "cryptography", "programming languages").
  2. domains — the sub-areas in your field. Aim for ~10 domains with good coverage of the field's important sub-areas. Each domain has an id (D1..DN), a name, short keyword list, and typical venues.
  3. personas — the reviewer archetypes. Aim for ~20 personas covering a range of review styles (strict methodologist, practical engineer, theorist, first-principles skeptic, writing pedant, …). Each persona has a name, priority concerns, and style description.
  4. validation_attribution — optional, but required if you plan to validate AI reviews against human reviews using this database. Three sub-keys — category_vocab (closed list of category strings), category_to_persona (lowercase category → persona name), and sub_rating_to_persona (sub-rating name → persona name). Every persona on the right-hand side must match a name from the personas: list above. See Database Format §1.4 for the field-by-field spec.
Persona-name requirement for full validation attribution. The review pipeline accepts any persona names you pick. The validation pipeline, however, routes missed / partial comments back to a persona through the database's attribution tables; the bundled default uses a fixed map of 20 canonical names. Personas whose names don't appear in the database's attribution tables still generate reviews normally, but they receive no miss / false-alarm attribution during validation and show up as orphan entries in the calibration report. Each DB ships its own attribution tables (section 7 of the DB markdown, Validation Attribution Tables), so a custom DB with field-specific personas simply declares its own category / sub-rating → persona maps alongside the reviewer entries.
The 20 canonical persona names shipped with the bundled DB
  • Novelty Hunter
  • Methodology Critic
  • Literature Scholar
  • Empirical Evaluator
  • Theorist
  • Industry Pragmatist
  • Scalability Analyst
  • Performance Specialist
  • Energy & Efficiency Advocate
  • Reproducibility Champion
  • Clarity & Presentation Editor
  • Benchmark & Workload Expert
  • Hardware Implementation Engineer
  • Software/Systems Integrator
  • Security & Correctness Auditor
  • Cost-Benefit Analyst
  • Deployment Veteran
  • Formal Methods Expert
  • Cross-Disciplinary Thinker
  • Visionary & Future-Work Critic

Source of truth: the ## 7. Validation Attribution Tables section of the bundled comparch_reviewer_db.md. Custom DBs can override these maps by writing their own section 7.

Final reviewer count = domains × personas. The bundled default is 10 × 20 = 200 reviewers. A smaller field can use 5 × 10 = 50 with similar coverage; very broad fields may want 15 × 20 = 300.
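
To make the shape concrete, here is a rough sketch of what an edited config could look like for a hypothetical computer-vision field. Every field, domain, persona, and category value below is an invented placeholder, and the exact key names and layout should be taken from the downloaded template rather than from this sketch:

```yaml
field: "computer vision"

domains:
  - id: D1
    name: "Image Classification"
    keywords: [image classification, convolutional networks, vision transformers, ImageNet]
    venues: [CVPR, ICCV, ECCV]
  - id: D2
    name: "Object Detection"
    keywords: [object detection, region proposals, anchor-free detectors, COCO]
    venues: [CVPR, ECCV, NeurIPS]
  # ... continue to roughly 10 domains

personas:
  - name: "Methodology Critic"
    priority: "experimental design, complete ablations, fair baselines"
    style: "terse, evidence-driven, pushes hard on unsupported claims"
  - name: "Clarity & Presentation Editor"
    priority: "readable figures, precise terminology, coherent structure"
    style: "constructive, detail-oriented, focused on exposition"
  # ... continue to roughly 20 personas

# Optional: only needed if you will validate AI reviews against human reviews.
validation_attribution:
  category_vocab:
    - methodology
    - clarity
  category_to_persona:
    methodology: "Methodology Critic"
    clarity: "Clarity & Presentation Editor"
  sub_rating_to_persona:
    presentation: "Clarity & Presentation Editor"
```

With only two domains and two personas this sketch would generate just 4 reviewers; a real config should be filled out to something like the 10 × 20 scale discussed above.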

Step 2 — Prompt an external LLM

Paste the edited YAML into a chat with Claude / ChatGPT / Gemini along with a prompt like the one below. Request the output as a single .md file.

I have a YAML configuration describing a research field, its sub-domains,
and reviewer personas. Generate a complete reviewer-database markdown file
from this config.

For each (domain × persona) pair, emit ONE reviewer block in this exact format:

    #### R{NNN} — {Persona name for this domain}

    - **Domain:** {domain.name}
    - **Persona:** {persona.name}
    - **Focus:** {1-sentence description of what this reviewer prioritizes,
                  tailored to this domain}
    - **Review Style:** {1-sentence description of tone and depth}
    - **Keywords:** {8-15 comma-separated terms — mostly from domain.keywords,
                     plus 2-3 persona-specific ones}
    - **System Prompt:**

    ```text
    You are a peer reviewer for {field} specializing in {domain.name}.
    Your persona: {persona.name} — {persona.style}.
    Your priorities: {persona.priority}.

    You are reviewing a paper draft. Produce a review in strict markdown:

    # Review
    **Reviewer ID:** R{NNN}
    **Domain:** {domain.name}
    **Persona:** {persona.name}
    **Topic Relevance:** <0-5 float>
    **Overall Recommendation:** 
    **Confidence:** <1-5>

    ## Comment {N}
    - **Severity:** 
    - **Category:** 
    - **Section Reference:** 
    - **Summary:**
    - **Description:** <2-6 sentences of substantive critique>
    - **Keywords:** <3-7 topic tags, comma-separated>

    Write 3-8 comments covering novelty, methodology, evaluation, and clarity, as relevant to your priorities. If the paper is off-topic for your domain, say so explicitly and keep Topic Relevance low.
    ```

RULES:
- Number reviewers sequentially R001, R002, ... across all domain×persona pairs. Iterate domains first, then personas (R001..R020 = domain 1 × all personas).
- The reviewer count must equal (# domains) × (# personas).
- Preserve the YAML's keyword lists in the **Keywords** field.
- Each system prompt MUST include the markdown output specification verbatim — do not abbreviate it. The reviewers need this to produce parseable output.
- Open with `# {Field title} Reviewer Database`, then a `## 5. Reviewer Entries` heading, then each reviewer block under a `### Domain {id}: {name}` heading. Reviewer blocks are separated by blank lines. No other prose between blocks.

If the YAML includes a `validation_attribution:` mapping, also append a section 7 at the end of the file — this is what the validation pipeline parses to attribute missed human comments back to the persona that should have caught them. Use this exact structure:

    ---

    ## 7. Validation Attribution Tables

    Persona names on the right-hand side must match a `#### R### — ` heading above.

    ```yaml
    category_vocab:
      - {item from YAML's validation_attribution.category_vocab}
      - ...
    category_to_persona:
      {key}: {persona name}
      ...
    sub_rating_to_persona:
      {key}: {persona name}
      ...
    ```

Copy the YAML's `validation_attribution` content verbatim into the fenced `yaml` block. Do not invent persona names — use only the persona names from the `personas:` list. If `validation_attribution:` is absent from the YAML, omit section 7 entirely (the database is review-only).

Here is the YAML config:

{PASTE YAML HERE}
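
The prompt above asks for one reviewer block per (domain × persona) pair. For orientation, a single generated block for a hypothetical computer-vision config might look roughly like the sketch below; every name, keyword, and sentence in it is illustrative, and the system prompt body is elided rather than shown:

```text
### Domain D1: Image Classification

#### R001 — Methodology Critic

- **Domain:** Image Classification
- **Persona:** Methodology Critic
- **Focus:** Prioritizes sound experimental design, complete ablations, and fair baselines in image-classification papers.
- **Review Style:** Terse and evidence-driven; pushes hard on unsupported claims.
- **Keywords:** image classification, convolutional networks, vision transformers, ImageNet, data augmentation, ablation study, fair baselines, statistical significance
- **System Prompt:** (omitted here; in a real database this is a fenced text block repeating the markdown output specification from the prompt above verbatim, with this reviewer's domain, persona, and priorities filled in)
```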

Step 3 — Save and upload

Save the LLM's output as my_field_reviewer_database.md (or any descriptive filename) and upload it using the form above. The server parses it on upload — if any reviewer block is malformed you'll get a clear error; otherwise the database becomes available in the reviewer-database dropdown on the Review page.

Reference files on disk:

{% endblock %}