Human Review {{ item.module_type }}

Assess the response quality and confirm the evaluator signal.

{% if error %}
{{ error }}
{% endif %}
{% if item %}
Project: {{ project }}
Module: {{ item.module_type }}
Raw Score: {{ item.raw_score | round(4) if item.raw_score is not none else '-' }}
Calibrated Score: {{ item.calibrated_score | round(4) if item.calibrated_score is not none else '-' }}
Trigger: {{ item.trigger_reason }}
Created: {{ item.created_at.strftime('%Y-%m-%d %H:%M UTC') if item.created_at else '-' }}
{% else %}
Review item not available
{% endif %}

User Request

{{ item.prompt or "No request available" }}

Model Response

{{ item.model_response }}
{% if item.judge_details %}

Evaluation Details

{% if review_context == 'complete_view' %}
Random Sample - Complete View
{% endif %}
{% if item.judge_details['heuristic_only'] is defined and item.judge_details['heuristic_only'] %}
Heuristic Only
{% endif %}
{% if item.judge_details['heuristic_only'] is defined and not item.judge_details['heuristic_only'] %}
Judge Enabled
{% endif %}
Module: {{ item.judge_details['module'] }}
Label: {{ item.judge_details['label'] }}
Confidence: {{ item.judge_details['confidence'] | round(4) }}
Explanation: {{ item.judge_details['explanation'] }}
{% if item.judge_details.get('hallucination_label') or item.judge_details.get('hallucination_risk_level') %}
Hallucination Analysis
Risk Level: {% set h_risk = item.judge_details.get('hallucination_risk_level', 'N/A') %} {% if h_risk == 'high' %} {{ h_risk | upper }} {% elif h_risk == 'medium' %} {{ h_risk | upper }} {% elif h_risk == 'low' %} {{ h_risk | upper }} {% else %} {{ h_risk }} {% endif %}
{% if item.judge_details.get('hallucination_type') %}
Type: {{ item.judge_details.get('hallucination_type', 'N/A') }}
{% endif %}
{% if item.judge_details.get('hallucination_source') %}
Source: {{ item.judge_details.get('hallucination_source', 'N/A') }}
{% endif %}
{% if item.judge_details.get('hallucination_patterns_found') %}
Detected Patterns: {{ (item.judge_details.get('hallucination_patterns_found', [])) | join(', ') }}
{% endif %}
{% if item.judge_details.get('hallucination_category') %}
Category: {{ item.judge_details.get('hallucination_category', 'N/A') }}
{% endif %}
{% if item.judge_details.get('hallucination_subtypes') %}
Subtypes: {{ (item.judge_details.get('hallucination_subtypes', [])) | join(', ') }}
{% endif %}
{% if item.judge_details.get('hallucination_risk_reason') %}
Risk Reason: {{ item.judge_details.get('hallucination_risk_reason', 'N/A') }}
{% endif %}
Adversarial Analysis
Risk Level: {% set a_risk = item.judge_details.get('adversarial_risk_level', 'N/A') %} {% if a_risk == 'high' %} {{ a_risk | upper }} {% elif a_risk == 'medium' %} {{ a_risk | upper }} {% elif a_risk == 'low' %} {{ a_risk | upper }} {% else %} {{ a_risk }} {% endif %}
{% if item.judge_details.get('adversarial_attack_type') %}
Attack Type: {{ item.judge_details.get('adversarial_attack_type', 'N/A') }}
{% endif %}
{% if item.judge_details.get('adversarial_subtype') %}
Subtype: {{ item.judge_details.get('adversarial_subtype', 'N/A') }}
{% endif %}
{% if item.judge_details.get('adversarial_risk_reason') %}
Risk Reason: {{ item.judge_details.get('adversarial_risk_reason', 'N/A') }}
{% endif %}
{% else %}
{% if item.judge_details['attack_type'] %}
Attack Type: {{ item.judge_details['attack_type'] }}
{% endif %}
{% if item.judge_details['subtype'] %}
Subtype: {{ item.judge_details['subtype'] }}
{% endif %}
{% if item.judge_details['risk_level'] %}
Risk Level: {{ item.judge_details['risk_level'] }}
{% endif %}
{% if item.judge_details['risk_reason'] %}
Risk Reason: {{ item.judge_details['risk_reason'] }}
{% endif %}
{% if item.judge_details['type'] %}
Hallucination Type: {{ item.judge_details['type'] }}
{% endif %}
{% if item.judge_details['source'] %}
Source: {{ item.judge_details['source'] }}
{% endif %}
{% if item.judge_details['patterns_found'] %}
Detected Patterns: {{ item.judge_details['patterns_found'] | join(', ') }}
{% endif %}
{% if item.judge_details['category'] %}
Category: {{ item.judge_details['category'] }}
{% endif %}
{% if item.judge_details['subtypes'] %}
Subtypes: {{ item.judge_details['subtypes'] | join(', ') }}
{% endif %}
{% endif %}
{% endif %}

Overall Assessment

Is the model response acceptable?

Agreement with Automated Evaluation

Do you agree with the automated evaluation?

Issue Classification

Which issues were present? (Select all that apply)

Judge Error Analysis

Why was the automated evaluation incorrect? (Complete only if you selected Disagree or Partially Agree above.)

Reviewer Notes

Additional comments