*(Campaign overview table in the web interface, showing each campaign's name, creation date, annotation labels, and status/progress.)*
To give you better control over the results, we do not compute the inter-annotator agreement (IAA) directly in factgenie. Instead, the process is the following:

1. Export the annotation files from factgenie.
2. Compute the agreement metrics on the exported files with external tools.
For your convenience, we provide a 👉️ Jupyter notebook 👈️ showing how you can compute the Pearson r coefficient (dataset-level and example-level error-count correlations) along with the γ (gamma) score (a fine-grained metric based on span alignment) using the files exported from factgenie.
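If you prefer to script the computation yourself instead of running the notebook, the sketch below shows the general idea using SciPy for the Pearson correlation and the `pygamma-agreement` package for the gamma score. The file names and the record fields (`example_idx`, `annotations` with `start`/`text`/`type`) are illustrative assumptions about the exported files, not the exact factgenie export schema; adapt them to the files you actually exported.

```python
# Minimal sketch: example-level Pearson r and gamma score for two annotator groups.
# File names and JSONL fields below are assumptions, not the exact factgenie format.
import json

from scipy.stats import pearsonr
from pyannote.core import Segment
from pygamma_agreement import CombinedCategoricalDissimilarity, Continuum


def load_annotations(path):
    """Load one exported JSONL file: one record per annotated example."""
    with open(path) as f:
        return [json.loads(line) for line in f]


group_a = load_annotations("annotations_group_a.jsonl")  # hypothetical file names
group_b = load_annotations("annotations_group_b.jsonl")

# --- Example-level error-count correlation (Pearson r) ---
counts_a = {rec["example_idx"]: len(rec["annotations"]) for rec in group_a}
counts_b = {rec["example_idx"]: len(rec["annotations"]) for rec in group_b}
shared = sorted(set(counts_a) & set(counts_b))
r, p_value = pearsonr([counts_a[i] for i in shared], [counts_b[i] for i in shared])
print(f"example-level Pearson r = {r:.3f} (p = {p_value:.3g})")

# --- Span-level agreement (gamma) for a single example ---
# Gamma aligns the spans annotated by the two groups and compares their categories.
example_idx = shared[0]
continuum = Continuum()
for name, group in [("group_a", group_a), ("group_b", group_b)]:
    record = next(rec for rec in group if rec["example_idx"] == example_idx)
    for ann in record["annotations"]:
        start = ann["start"]
        end = start + len(ann["text"])
        continuum.add(name, Segment(start, end), str(ann["type"]))

gamma = continuum.compute_gamma(CombinedCategoricalDissimilarity()).gamma
print(f"gamma for example {example_idx}: {gamma:.3f}")
```

In practice you would loop the gamma computation over all shared examples and average the scores; the notebook linked above covers the full procedure.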
*(Comparison selection table with columns: Dataset, Split, Outputs, Example count, and Groups for comparison; at least 2 groups are needed.)*