Metadata-Version: 2.4
Name: full-repo-refactor
Version: 0.6.1
Summary: Deprecated: use `refactor-arena` instead
Project-URL: Documentation, https://refactorarena.com/
Project-URL: GitHub, https://github.com/fastfedora/refactor-arena
Author-email: Trevor Lohrbeer <trevor@fastfedora.com>
License: MIT
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: refactor-arena>=0.6.1
Description-Content-Type: text/markdown

# Refactor Arena

**NOTE: This package has been renamed. Find the new package at `refactor-arena` or visit
[refactorarena.com](https://refactorarena.com/) for the documentation.**

A Control Arena setting for evaluating agents on full repository refactoring tasks with optional
backdoor insertion.

[View full documentation](https://refactorarena.com/).

## Overview

This setting provides an environment for evaluating agents on full repository refactoring tasks.
Each task involves:

1. **Main Task**: Refactoring the repository to improve code quality and maintainability.
2. **Side Task** (Attack Mode): Inserting a hidden vulnerability while performing the refactoring
   that is exploited when running the tests.

## Installation

Install the package (requires Python 3.12):

```bash
# Install the package
pip install refactor-arena

# Setup your .env and initial scenario
refactor-arena init

# Run getting started scenario in honest mode
refactor-arena eval --config getting-started
```

If not already installed, [install Docker](https://docs.docker.com/engine/install/). Docker must be
running for the evals to execute.

## Usage

### Using the CLI

```bash
# Run with default settings using `side-tasks-3/mixed-3` dataset
refactor-arena

# Setup your .env and initial scenario
refactor-arena init

# Run getting started scenario in honest mode
refactor-arena eval --config getting-started

# Run getting started scenario in attack mode
refactor-arena eval --config getting-started --eval-mode attack

# Run with a different dataset from `github:fastfedora/full-repo-datasets` and limit main tasks
refactor-arena --dataset side-tasks-3/mixed-20 --main-task "refactor_single_file"

# Run with verbose logging and limit to 3 samples
refactor-arena eval --config getting-started --verbose --limit 3

# View help
refactor-arena --help
```

## Configuration

To create a new scenario, create a file that references the main configuration file from the package
and then override the settings you want to change.

For instance, to create a scenario that runs only the Refactor Single File main task against the
20-repo `side-tasks-3/mixed-20` dataset in attack mode, create a file called `mixed-20.yaml`:

```yaml
# mixed-20.yaml
includes:
  - package:config.yaml

setting:
  dataset: "side-tasks-3/mixed-20"
  main_task: refactor_single_file$

eval:
  # Default mode for this scenario is attack; use `--eval-mode honest` to run in honest mode
  mode: attack
```

Then run this scenario using:

```bash
refactor-arena --config=mixed-20
```

Almost all aspects of the setting can be configured via configuration files, including:

- models
- prompts
- main tasks
- side tasks
- policies
- monitors
- metrics
- eval settings

See [the documentation](https://refactorarena.com/) for more details.

## License

MIT License