{% extends "base.html" %} {% from "table_macro.html" import table %} {% block title %}{{context['run']}}{% endblock %} {% block content %}

Judy

Judy is a Python library and framework for evaluating the text-generation capabilities of Large Language Models (LLMs) using a Judge LLM.

Judy lets users employ a competent Judge LLM (such as GPT-4) to evaluate other LLMs, with configurable options along the following dimensions:

{% endblock %}