Metadata-Version: 2.3
Name: verifiable-labs-code-humaneval-multiturn
Version: 0.1.1
Summary: Multi-turn procedural code-execution RL environment with test-feedback rollouts and conformal coverage
Project-URL: Homepage, https://github.com/stelioszach03/verifiable-labs-envs
Project-URL: Issues, https://github.com/stelioszach03/verifiable-labs-envs/issues
Author: Stelios Zacharioudakis
License: Apache-2.0
Keywords: code-execution,evaluation,humaneval,multi-turn,reinforcement-learning,rlvr,verifiable-labs
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.11
Requires-Dist: numpy>=1.26
Requires-Dist: verifiable-labs-envs-core>=0.1.1
Description-Content-Type: text/markdown

# verifiable-labs-code-humaneval-multiturn

Multi-turn procedural code-execution RL environment from the
Verifiable Labs catalogue. Same problem distribution as
`code-humaneval`, but the model gets up to **3 turns** with
visible-test feedback between them.

| Turn | What the model sees                                                         | What it returns          |
|------|------------------------------------------------------------------------------|--------------------------|
| 1    | Function signature + docstring + visible test block                          | First implementation     |
| 2    | Visible test pass/fail counts (no test source, no oracle)                    | Revised implementation   |
| 3    | Same as turn 2; the final attempt is scored against visible ∪ hidden tests   | Final implementation     |
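
The turn structure in the table can be sketched as a small loop. This is an illustration only, not the package's API: `rollout`, `count_passes`, the `policy` callable, the `(args, expected)` test format, and the feedback string are all hypothetical names and shapes chosen for the example. Stopping early once every visible test passes is also an assumption, motivated by the "up to 3 turns" wording.

```python
from typing import Callable, Optional

MAX_TURNS = 3  # "up to 3 turns", per the table above


def count_passes(source: str, func_name: str, tests: list) -> int:
    """Count how many (args, expected) cases the candidate source passes."""
    namespace: dict = {}
    try:
        # Illustration only: a real harness would sandbox untrusted code.
        exec(source, namespace)
        func = namespace[func_name]
    except Exception:
        return 0  # code that fails to load passes nothing
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a raising test case counts as a failure
    return passed


def rollout(
    policy: Callable[[int, Optional[str]], str],
    func_name: str,
    visible_tests: list,
) -> tuple:
    """Run up to MAX_TURNS attempts; return (final source, turns used)."""
    feedback: Optional[str] = None  # turn 1 has no feedback yet
    source = ""
    for turn in range(1, MAX_TURNS + 1):
        source = policy(turn, feedback)
        passed = count_passes(source, func_name, visible_tests)
        failed = len(visible_tests) - passed
        # Only counts are fed back: no test source, no expected values.
        feedback = f"visible tests: {passed} passed, {failed} failed"
        if failed == 0:
            break  # assumed early stop; the real environment may differ
    return source, turn
```

Scoring the returned source against the visible and hidden tests would happen outside this loop, so the policy never sees hidden-test results.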

A **turn-count penalty** of 5% per extra turn (capped at 10%) keeps
multi-turn from being a free win: a rollout that uses all three turns
scores 0.9× the equivalent single-turn reward. Hidden tests are
**never** shown to the model (R10: visible-test pass/fail counts are
the only feedback signal).
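
The penalty arithmetic can be written out as a short sketch. The function name `turn_penalty_multiplier` and its parameters are hypothetical and not taken from the package; only the 5% step, the 10% cap, and the 0.9× result for three turns come from the description above.

```python
def turn_penalty_multiplier(
    turns_used: int,
    per_extra_turn: float = 0.05,  # 5% per turn beyond the first
    cap: float = 0.10,             # total penalty never exceeds 10%
) -> float:
    """Return the factor applied to the single-turn reward."""
    if turns_used < 1:
        raise ValueError("turns_used must be at least 1")
    penalty = min(per_extra_turn * (turns_used - 1), cap)
    return 1.0 - penalty


# One turn is unpenalised, two turns cost 5%, three turns cost 10%.
for turns in (1, 2, 3):
    print(turns, round(turn_penalty_multiplier(turns), 2))
```

With a three-turn limit the cap is reached exactly on the last turn, so it only becomes a separate constraint if the turn limit is raised.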

## Install

```bash
pip install verifiable-labs-code-humaneval-multiturn
```

Source of truth + full docs:
https://github.com/stelioszach03/verifiable-labs-envs
