Metadata-Version: 2.4
Name: verifiers
Version: 0.1.14
Summary: Verifiers: Environments for LLM Reinforcement Learning
Project-URL: Homepage, https://github.com/primeintellect-ai/verifiers
Project-URL: Documentation, https://github.com/primeintellect-ai/verifiers
Project-URL: Repository, https://github.com/primeintellect-ai/verifiers.git
Project-URL: Issues, https://github.com/primeintellect-ai/verifiers/issues
Author-email: William Brown <williambrown97@gmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: agentic-rl,agents,environments,eval,grpo,harness,llm,multi-turn,reinforcement-learning,rl,rlvr,tool-use,train,verifiers
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: <3.14,>=3.10
Requires-Dist: aiolimiter>=1.2.1
Requires-Dist: anthropic>=0.78.0
Requires-Dist: datasets<4.7.0,>=3.0.0
Requires-Dist: gepa
Requires-Dist: httpx>=0.27.0
Requires-Dist: jinja2>=3.1.6
Requires-Dist: math-verify>=0.8.0
Requires-Dist: mcp>=1.14.1
Requires-Dist: msgpack>=1.1.2
Requires-Dist: nest-asyncio>=1.6.0
Requires-Dist: numpy
Requires-Dist: openai-agents>=0.0.7
Requires-Dist: openai>=1.108.1
Requires-Dist: prime-sandboxes>=0.2.21
Requires-Dist: prime-tunnel>=0.1.6
Requires-Dist: pydantic>=2.11.9
Requires-Dist: pyzmq>=27.1.0
Requires-Dist: regex<2026.4.4
Requires-Dist: requests
Requires-Dist: rich
Requires-Dist: setproctitle>=1.3.0
Requires-Dist: tenacity>=8.5.0
Requires-Dist: textual
Requires-Dist: tomli; python_version < '3.11'
Requires-Dist: typing-extensions; python_version < '3.12'
Requires-Dist: wget>=3.2
Provides-Extra: browser
Requires-Dist: aiohttp>=3.9.0; extra == 'browser'
Requires-Dist: python-dotenv>=1.0.0; extra == 'browser'
Requires-Dist: stagehand>=3.0.0; extra == 'browser'
Provides-Extra: openenv
Requires-Dist: openenv-core[core]==0.2.1; extra == 'openenv'
Provides-Extra: renderers
Requires-Dist: renderers>=0.1.6; extra == 'renderers'
Provides-Extra: rg
Requires-Dist: reasoning-gym; extra == 'rg'
Provides-Extra: rl
Requires-Dist: accelerate>=1.4.0; extra == 'rl'
Requires-Dist: deepspeed>=0.17.6; extra == 'rl'
Requires-Dist: flash-attn>=2.8.3; extra == 'rl'
Requires-Dist: liger-kernel>=0.5.10; extra == 'rl'
Requires-Dist: peft; extra == 'rl'
Requires-Dist: requests; extra == 'rl'
Requires-Dist: torch<2.9.0,>=2.8.0; extra == 'rl'
Requires-Dist: transformers>=4.56.2; extra == 'rl'
Requires-Dist: vllm<0.11.0,>=0.10.0; extra == 'rl'
Requires-Dist: wandb; extra == 'rl'
Provides-Extra: ta
Requires-Dist: nltk; extra == 'ta'
Requires-Dist: textarena; extra == 'ta'
Description-Content-Type: text/markdown

<p align="center">
  <picture>
    <source media="(prefers-color-scheme: light)" srcset="https://github.com/user-attachments/assets/40c36e38-c5bd-4c5a-9cb3-f7b902cd155d">
    <source media="(prefers-color-scheme: dark)" srcset="https://github.com/user-attachments/assets/6414bc9b-126b-41ca-9307-9e982430cde8">
    <img alt="Prime Intellect" src="https://github.com/user-attachments/assets/6414bc9b-126b-41ca-9307-9e982430cde8" width="312" style="max-width: 100%;">
  </picture>
</p>

---

<h3 align="center">
Verifiers: Environments for LLM Reinforcement Learning
</h3>

<p align="center">
  <a href="https://docs.primeintellect.ai/verifiers">Documentation</a> •
  <a href="https://app.primeintellect.ai/dashboard/environments?ex_sort=most_stars">Environments Hub</a> •
  <a href="https://github.com/PrimeIntellect-ai/prime-rl">PRIME-RL</a>
</p>

---

<p align="center">
  <a href="https://github.com/PrimeIntellect-ai/verifiers/actions/workflows/style.yml">
    <img src="https://github.com/PrimeIntellect-ai/verifiers/actions/workflows/style.yml/badge.svg" alt="Style" />
  </a>
  <a href="https://github.com/PrimeIntellect-ai/verifiers/actions/workflows/test.yml">
    <img src="https://github.com/PrimeIntellect-ai/verifiers/actions/workflows/test.yml/badge.svg" alt="Test" />
  </a>
  <a href="https://github.com/PrimeIntellect-ai/verifiers/actions/workflows/publish-envs.yml">
    <img src="https://github.com/PrimeIntellect-ai/verifiers/actions/workflows/publish-envs.yml/badge.svg" alt="Envs" />
  </a>
</p>

## News & Updates

- [04/17/26] v0.1.12 is released, featuring a new composable Task/Agent/Environment architecture, upstreamed opencode and RLM harnesses/tasksets, major `RLMEnv` improvements (context dropping, prompt builder, hardened transport), multi-worker env server support, expanded `vf-tui` capabilities, and richer eval configuration.
- [03/12/26] v0.1.11 is released, featuring a unified client stack, major `RLMEnv` and env server reliability improvements, a substantially refined eval TUI, new pass@k and ablation sweep support, and bundled opencode environments.
- [02/10/26] v0.1.10 is released, featuring OpenEnv and BrowserEnv integrations, resumed evals, improved rollout and token tracking, safer sandbox lifecycle behavior, refreshed workspace setup, and opencode harbor improvements.
- [01/08/26] v0.1.9 is released, featuring a number of new experimental environment class types, monitor rubrics for automatic metric collection, improved workspace setup flow, improved error handling, bug fixes, and a documentation overhaul.
- [11/19/25] v0.1.8 is released, featuring a major refactor of the rollout system to use trajectory-based tracking for token-in token-out training across turns, as well as support for truncated or branching rollouts.
- [11/07/25] Verifiers v0.1.7 is released! This includes an improved quickstart configuration for training with [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl), a new included "nano" trainer (`vf.RLTrainer`, replacing `vf.GRPOTrainer`), and a number of bug fixes and improvements to the documentation.
- [10/27/25] A new iteration of the Prime Intellect [Environments Program](https://docs.google.com/spreadsheets/d/13UDfRDjgIZXsMI2s9-Lmn8KSMMsgk2_zsfju6cx_pNU/edit?gid=0#gid=0) is live!  


# Overview

Verifiers is our library for creating environments to train and evaluate LLMs.

Environments contain everything required to run and evaluate a model on a particular task:
- A *dataset* of task inputs
- A *harness* for the model (tools, sandboxes, context management, etc.)
- A reward function or *rubric* to score the model's performance

Environments can be used for training models with reinforcement learning (RL), evaluating capabilities, generating synthetic data, experimenting with agent harnesses, and more. 

Verifiers is tightly integrated with the [Environments Hub](https://app.primeintellect.ai/dashboard/environments?ex_sort=most_stars), as well as our training framework [prime-rl](https://github.com/PrimeIntellect-ai/prime-rl) and our [Hosted Training](https://app.primeintellect.ai/dashboard/training) platform.

## Getting Started

Ensure you have `uv` installed, as well as the `prime` [CLI](https://docs.primeintellect.ai/cli-reference/introduction) tool:
```bash
# install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# install the prime CLI
uv tool install prime
# log in to the Prime Intellect platform
prime login
```
To set up a new workspace for developing environments, do:
```bash
# run inside your workspace directory, e.g. ~/dev/my-lab
prime lab setup
```

This sets up a Python project if needed (with `uv init`), installs `verifiers` (with `uv add verifiers`), creates the recommended workspace structure, and downloads useful starter files:
```
configs/
├── endpoints.toml      # OpenAI-compatible API endpoint configuration
├── rl/                 # Example configs for Hosted Training
├── eval/               # Example multi-environment eval configs
└── gepa/               # Example configs for prompt optimization
.prime/
└── skills/             # Bundled workflow skills for create/browse/review/eval/GEPA/train/brainstorm
environments/
└── AGENTS.md           # Documentation for AI coding agents
AGENTS.md               # Top-level documentation for AI coding agents
CLAUDE.md               # Claude-specific pointer to AGENTS.md
```

Alternatively, add `verifiers` to an existing project:
```bash
uv add verifiers && prime lab setup --skip-install
```

Environments built with Verifiers are self-contained Python modules. To initialize a fresh environment template, do:
```bash
prime env init my-env # creates a new template in ./environments/my_env
```
This creates a new module called `my_env` with a basic environment template:
```
environments/my_env/
├── my_env.py           # Main implementation
├── pyproject.toml      # Dependencies and metadata
└── README.md           # Documentation
```

For OpenEnv integration, use:
```bash
prime env init my-openenv --openenv
```
Then copy your OpenEnv project into `environments/my_openenv/proj/` and build the image with:
```bash
uv run vf-build my-openenv
```

Environment modules should expose a `load_environment` function that returns an Environment instance and can accept custom arguments. For example:
```python
# my_env.py
import verifiers as vf

def load_environment(dataset_name: str = 'gsm8k') -> vf.Environment:
    dataset = vf.load_example_dataset(dataset_name)  # rows with 'question' and 'answer' columns
    async def correct_answer(completion, answer) -> float:
        # exact-match check on the final assistant message
        completion_ans = completion[-1]['content']
        return 1.0 if completion_ans == answer else 0.0
    rubric = vf.Rubric(funcs=[correct_answer])
    env = vf.SingleTurnEnv(dataset=dataset, rubric=rubric)
    return env
```
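
To sanity-check an installed environment module before running a full eval, you can load it programmatically. This is a minimal sketch, assuming the module has been installed with `prime env install my-env` and that the top-level `vf.load_environment` helper is available in your version (extra keyword arguments are forwarded to the module's `load_environment`):
```python
# Minimal local smoke test (illustrative; use `prime eval run` for scored evals).
import verifiers as vf

# Resolve the installed environment by name and forward custom args.
env = vf.load_environment("my-env", dataset_name="gsm8k")

# Inspect one task input to confirm the dataset loaded as expected.
print(env.dataset[0])
```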

For composable environments with reusable tasksets, toolsets, custom programs,
or custom harnesses, use the v1 BYO Harness path:
```python
# my_env.py
import verifiers.v1 as vf

def source():
    yield {
        "prompt": [{"role": "user", "content": "Reverse abc."}],
        "answer": "cba",
        "max_turns": 1,
    }

@vf.reward(weight=1.0)
async def contains_answer(task, state) -> float:
    return float(task["answer"] in str(state.get("completion") or ""))

def load_taskset(config: vf.TasksetConfig | None = None):
    return vf.Taskset(source=source, rewards=[contains_answer], config=config)

def load_environment(config: vf.EnvConfig | None = None) -> vf.Env:
    config = config or vf.EnvConfig()
    return vf.Env(taskset=load_taskset(config=config.taskset))
```
If no harness is passed, `vf.Env` uses the base endpoint-backed harness. See
**[BYO Harness](docs/byo-harness.md)** for the advanced v1 taskset/harness API.
Reusable taskset and harness packages live under `verifiers.v1.packages` while
the v1 API stabilizes, and are re-exported from `verifiers.v1` for normal use.
For example, Harbor task directories can run through the bundled OpenCode CLI
harness with:

```python
env = vf.Env(
    taskset=vf.HarborTaskset(tasks="/path/to/harbor/tasks"),
    harness=vf.OpenCode(),
)
```
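
As with the single-turn example above, a v1 environment is normally wrapped in a `load_environment` entry point so the package can be installed, evaluated, and trained like any other. A minimal sketch under that assumption (the tasks path is a placeholder):

```python
# my_harbor_env.py — sketch: expose the Harbor/OpenCode setup through the
# standard entry point so `prime env install` / `prime eval run` can use it.
import verifiers.v1 as vf

def load_environment(tasks_dir: str = "/path/to/harbor/tasks") -> vf.Env:
    return vf.Env(
        taskset=vf.HarborTaskset(tasks=tasks_dir),
        harness=vf.OpenCode(),
    )
```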

The same environment package is the unit used by evals and `prime-rl`. The
trainer owns model, endpoint, sampling, and rollout count; v1-specific taskset
and harness options stay under `env.taskset` and `env.harness`:

```toml
# configs/rl/my-v1-env.toml
model = "Qwen/Qwen3-30B-A3B-Instruct-2507"
max_steps = 100
batch_size = 256
rollouts_per_example = 8

[sampling]
max_tokens = 4096

[[env]]
id = "my-env"

[env.args]
arg1 = "non-th-arg"  # top-level env arg (i.e. not a taskset/harness option)

[env.harness]
max_turns = 1

[env.taskset.scoring.contains_answer]
weight = 1.0
```

```bash
prime env install my-env
uv run prime-rl configs/rl/my-v1-env.toml
```

To install the environment module into your project, do:
```bash
prime env install my-env # installs from ./environments/my_env
```

To install an environment from the Environments Hub into your project, do:
```bash
prime env install primeintellect/math-python
```

To run a local evaluation with any OpenAI-compatible model, do:
```bash
prime eval run my-env -m openai/gpt-5-nano # run and save eval results locally
```
Evaluations use [Prime Inference](https://docs.primeintellect.ai/inference/overview) by default; configure your own API endpoints in `./configs/endpoints.toml`.
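
The generated `configs/endpoints.toml` is the source of truth for the schema; as a rough illustration only (the section and field names below are assumptions, not the authoritative format), an entry maps a short endpoint name to a model identifier, base URL, and API-key environment variable:
```toml
# configs/endpoints.toml — illustrative shape; check the file created by
# `prime lab setup` for the exact section and field names.
[my-endpoint]
model = "gpt-5-nano"
url = "https://api.openai.com/v1"
key = "OPENAI_API_KEY"
```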

View local evaluation results in the terminal UI:
```bash
prime eval tui
```

To publish the environment to the [Environments Hub](https://app.primeintellect.ai/dashboard/environments?ex_sort=most_stars), do:
```bash
prime env push --path ./environments/my_env
```

To run an evaluation directly from the Environments Hub, do:
```bash
prime eval run primeintellect/math-python
```

## Documentation

**[Environments](docs/environments.md)** — Create datasets, rubrics, and custom multi-turn interaction protocols.

**[BYO Harness](docs/byo-harness.md)** — Build composable v1 taskset/harness environments with custom tools, sandboxes, users, and custom programs.

**[Evaluation](docs/evaluation.md)** — Evaluate models using your environments.

**[Training](docs/training.md)** — Train models in your environments with reinforcement learning.

**[Development](docs/development.md)** — Contributing to verifiers.

**[API Reference](docs/reference.md)** — Understanding the API and data structures.

**[FAQs](docs/faqs.md)** — Other frequently asked questions.


## Citation

Originally created by Will Brown ([@willccbb](https://github.com/willccbb)).

If you use this code in your research, please cite:

```bibtex
@misc{brown_verifiers_2025,
  author       = {William Brown},
  title        = {{Verifiers}: Environments for LLM Reinforcement Learning},
  howpublished = {\url{https://github.com/PrimeIntellect-ai/verifiers}},
  note         = {Commit abcdefg • accessed DD Mon YYYY},
  year         = {2025}
}
```
