Add dataset
  • Example datasets
  • Direct upload
  • Register custom loader

You can download example datasets from the list below, optionally together with model outputs and annotated campaigns.

Once downloaded, a dataset appears in the list of local datasets.

You can also help other users by adding your own (or a public) dataset to the list! See the contribution guidelines on the factgenie wiki.

Dataset | Splits | Outputs | Campaigns | Source | Download Size | Description
{% for dataset_id, dataset in resources.items() %}
{{ dataset_id }} | {% for split in dataset.splits %} {{ split }} {% endfor %} | {% if not dataset.outputs %} - {% endif %} {% for output in dataset.outputs %} {{ output }} {% endfor %} | {% if not dataset.annotations %} - {% endif %} {% for ann in dataset.annotations %} {{ ann }} {% endfor %} | {% if dataset.source %} link {% else %} - {% endif %} | {% if dataset['download-size'] %} {{ dataset['download-size']|safe }} {% else %} - {% endif %} | {% if dataset.description %} {{ dataset.description|safe }} {% else %} No description {% endif %}
{% endfor %}
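
The table above is rendered from a `resources` mapping. As a rough sketch of the shape the template appears to expect, the same fallbacks can be written in plain Python. The key names are taken from the template; the example values and the `format_row` helper are hypothetical and only serve as an illustration.

```python
# Hypothetical entry; only the key names come from the template.
resources = {
    "example-dataset": {
        "splits": ["dev", "test"],
        "outputs": [],
        "annotations": ["example-campaign"],
        "source": "https://example.com/dataset",
        "download-size": "1.2 MB",
        "description": "",
    }
}


def format_row(dataset_id, dataset):
    """Return the table cells for one dataset, mirroring the template's fallbacks."""
    return [
        dataset_id,
        " ".join(dataset.get("splits") or []),
        " ".join(dataset.get("outputs") or []) or "-",
        " ".join(dataset.get("annotations") or []) or "-",
        "link" if dataset.get("source") else "-",
        dataset.get("download-size") or "-",
        dataset.get("description") or "No description",
    ]


for dataset_id, dataset in resources.items():
    print(" | ".join(format_row(dataset_id, dataset)))
```

Empty lists and missing values fall back to "-" (or "No description"), as in the template's `{% if %}` branches.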

Here you can upload a dataset in one of the basic formats that have predefined loaders.

Supported formats
  • text: a plain text file, one input example per line.
  • jsonl: a JSON Lines file, one JSON object per line.
  • csv: a CSV file, one input example per row.
  • html: a ZIP file containing HTML files, one input example per file (the files will be sorted numerically by filename).
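
To make the four formats concrete, here is a minimal sketch of how each one could be split into input examples using only the Python standard library. It is not the code of the predefined loaders. The function names are hypothetical, and the sketch assumes that blank lines are skipped, that the CSV file has a header row, and that "sorted numerically" means ordering by the first integer in each filename.

```python
import csv
import io
import json
import re
import zipfile


def load_text(data: str) -> list:
    """Plain text: one example per line (blank lines skipped, an assumption)."""
    return [line for line in data.splitlines() if line.strip()]


def load_jsonl(data: str) -> list:
    """JSON Lines: one JSON object per line."""
    return [json.loads(line) for line in data.splitlines() if line.strip()]


def load_csv(data: str) -> list:
    """CSV: one example per row (assumes the first row is a header)."""
    return list(csv.DictReader(io.StringIO(data)))


def numeric_key(name: str):
    """Sort key: first integer in the base filename, then the name itself."""
    base = name.rsplit("/", 1)[-1]
    match = re.search(r"\d+", base)
    return (int(match.group()) if match else float("inf"), base)


def load_html_zip(data: bytes) -> list:
    """ZIP of HTML files: one example per file, in numeric filename order."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = sorted(
            (n for n in archive.namelist() if n.lower().endswith(".html")),
            key=numeric_key,
        )
        return [archive.read(n).decode("utf-8") for n in names]
```

Numeric sorting matters for the ZIP case: with plain string sorting, `10.html` would come before `2.html`.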
Dataset files

To add a custom dataset locally, register the dataset in data/datasets.yaml.

You can either use one of the existing loader classes (see datasets/basic.py) or create your own loader in datasets by subclassing the Dataset class.

See data/datasets_TEMPLATE.yaml for an example.
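
As an illustration of the subclassing route, here is a minimal sketch. The `Dataset` base class below is a simplified stand-in defined only so that the example runs on its own; it is not factgenie's actual class, and the method names (`load_examples`, `render`) and the `.tsv` file layout are assumptions. Check datasets/basic.py for the real interface before adapting it.

```python
import html
from pathlib import Path


class Dataset:
    """Stand-in base class (hypothetical, NOT factgenie's real Dataset)."""

    def __init__(self, data_dir):
        self.data_dir = Path(data_dir)

    def load_examples(self, split):
        raise NotImplementedError

    def render(self, example):
        raise NotImplementedError


class TabSeparatedDataset(Dataset):
    """Hypothetical loader: reads <data_dir>/<split>.tsv, one example per line."""

    def load_examples(self, split):
        path = self.data_dir / f"{split}.tsv"
        with path.open(encoding="utf-8") as f:
            # Each non-blank line becomes a list of tab-separated fields.
            return [line.rstrip("\n").split("\t") for line in f if line.strip()]

    def render(self, example):
        # Escape each field and show the example as a simple HTML list.
        items = "".join(f"<li>{html.escape(field)}</li>" for field in example)
        return f"<ul>{items}</ul>"
```

The sketch separates loading (turning files into examples) from rendering (turning one example into HTML for display), which is a common split for this kind of loader; whether factgenie divides the work the same way should be confirmed against its own classes.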