Metadata-Version: 2.4
Name: meta-agents-research-environments
Version: 1.0.0
Summary: Meta Agents Research Environments is a research-driven environment designed to simulate complex, real-life tasks that span several minutes and require multiple steps to be solved. Unlike static simulation environments, this platform introduces a dynamic setting where the state of the environment evolves and new information is continuously integrated.
Project-URL: Homepage, https://github.com/facebookresearch/meta-agents-research-environments/
Project-URL: Repository, https://github.com/facebookresearch/meta-agents-research-environments.git
License: MIT License
License-File: LICENSE
Requires-Python: ==3.10.*
Requires-Dist: click==8.1.8
Requires-Dist: datasets==4.0.0
Requires-Dist: deepdiff>=8.6.1
Requires-Dist: docstring-parser==0.16
Requires-Dist: fastapi==0.116.1
Requires-Dist: fsspec==2024.12.0
Requires-Dist: geographiclib==2.0
Requires-Dist: huggingface-hub==0.33.4
Requires-Dist: inputimeout==1.0.4
Requires-Dist: jinja2==3.1.6
Requires-Dist: joblib==1.4.2
Requires-Dist: litellm==1.71.1
Requires-Dist: mammoth==1.8.0
Requires-Dist: markdownify==0.14.1
Requires-Dist: mcp[cli]==1.11.0
Requires-Dist: mysql-connector-python==9.2.0
Requires-Dist: numpy==2.2.6
Requires-Dist: pandas==2.2.3
Requires-Dist: pathvalidate==3.2.1
Requires-Dist: pdfminer-six==20231228
Requires-Dist: phonenumbers==8.13.53
Requires-Dist: pillow==11.1.0
Requires-Dist: polars-lts-cpu==1.33.1
Requires-Dist: puremagic==1.27
Requires-Dist: pydantic==2.10.6
Requires-Dist: pypdf==6.0.0
Requires-Dist: python-dotenv==1.0.1
Requires-Dist: python-pptx==1.0.2
Requires-Dist: rapidfuzz==3.12.1
Requires-Dist: retry==0.9.2
Requires-Dist: s3fs==2024.12.0
Requires-Dist: starlette==0.47.2
Requires-Dist: strawberry-graphql==0.275.5
Requires-Dist: termcolor==2.5.0
Requires-Dist: tqdm-joblib==0.0.4
Requires-Dist: uvicorn==0.35.0
Requires-Dist: wsproto==1.2.0
Provides-Extra: dev
Requires-Dist: autoflake==2.3.1; extra == 'dev'
Requires-Dist: build==1.2.1; extra == 'dev'
Requires-Dist: codespell==2.4.1; extra == 'dev'
Requires-Dist: debugpy==1.8.12; extra == 'dev'
Requires-Dist: flake8-tidy-imports==4.11.0; extra == 'dev'
Requires-Dist: flake8==7.1.1; extra == 'dev'
Requires-Dist: hatchling==1.21.1; extra == 'dev'
Requires-Dist: pre-commit==2.21.0; extra == 'dev'
Requires-Dist: pyright==1.1.399; extra == 'dev'
Requires-Dist: pytest==8.3.4; extra == 'dev'
Requires-Dist: ruff==0.11.5; extra == 'dev'
Requires-Dist: twine==5.0.0; extra == 'dev'
Requires-Dist: typer==0.16.0; extra == 'dev'
Description-Content-Type: text/markdown

# Meta Agents Research Environments (ARE)

[![PyPI version](https://badge.fury.io/py/meta-agents-research-environments.svg)](https://badge.fury.io/py/meta-agents-research-environments)
[![Python 3.10](https://img.shields.io/badge/python-3.10-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)

> A research environment for simulating complex, real-life tasks that require multi-step reasoning and dynamic adaptation.

Meta Agents Research Environments (ARE) is a platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike static benchmarks, ARE introduces evolving environments where agents must adapt their strategies as new information becomes available, mirroring real-world challenges. In particular, ARE runs the [Gaia2](#example-gaia2-benchmark) benchmark, a follow-up to [Gaia](https://arxiv.org/abs/2311.12983) that evaluates a broader range of agent capabilities.

## Table of Contents

- [Background](#background)
- [Install](#install)
- [Usage](#usage)
- [API](#api)
- [Contributing](#contributing)
- [License](#license)
- [Citation](#citation)

## Background

ARE addresses critical gaps in AI agent evaluation by providing:

- **Dynamic Environments**: Scenarios that evolve over time with new information and changing conditions
- **Multi-Step Reasoning**: Complex tasks requiring 10+ steps and several minutes to complete
- **Real-World Focus**: Grounded situations that mirror real-world challenges
- **Comprehensive Evaluation**: The Gaia2 benchmark with 800 scenarios across multiple domains


### Getting Started

| | |
|---|---|
| **[Quick Start](https://facebookresearch.github.io/meta-agents-research-environments/quickstart.html)** | Get up and running with your first scenario in just a few minutes with step-by-step instructions. |
| **[Gaia2 Evaluation](https://facebookresearch.github.io/meta-agents-research-environments/user_guide/gaia2_evaluation.html)** | Build and evaluate your agents on the Gaia2 benchmark, a comprehensive suite of 800 dynamic scenarios across 10 universes. |
| **[Gaia2 Blog Post](https://huggingface.co/blog/gaia2)** | Learn more about Gaia2 on the Hugging Face blog. |
| **[Paper](https://ai.meta.com/research/publications/are-scaling-up-agent-environments-and-evaluations/)** | Read the research paper detailing the Gaia2 benchmark and evaluation methodology. |
| **[Demo](https://huggingface.co/spaces/meta-agents-research-environments/demo)** | Try the ARE demo on Hugging Face and explore the agent platform directly in your browser, no installation required. |
| **[Gaia2 Leaderboard](https://huggingface.co/spaces/meta-agents-research-environments/leaderboard)** | Check the self-published results from Gaia2 Benchmark runs. |
| **[Learn More](https://facebookresearch.github.io/meta-agents-research-environments/foundations/index.html)** | Dive deeper into the core concepts of agents, environments, apps, events, and scenarios. |


## Install

For complete installation instructions and setup options, see the [Installation Guide](https://facebookresearch.github.io/meta-agents-research-environments/user_guide/installation.html).

### Prerequisites

First, install [uv](https://docs.astral.sh/uv/), a fast Python package installer and resolver.

### Quick Start with uvx

The fastest way to get started is using `uvx` to run commands directly:

```bash
# Run Gaia2 benchmark scenarios
uvx --from meta-agents-research-environments are-benchmark gaia2-run --hf meta-agents-research-environments/gaia2 --hf_split validation -l 1

# Run custom scenarios
uvx --from meta-agents-research-environments are-run -s scenario_tutorial -a default
```

All the commands in this README and the documentation are available through `uvx`.

### Traditional Installation

Alternatively, install the package directly:

```bash
# With uv (recommended)
uv pip install meta-agents-research-environments

# With pip
pip install meta-agents-research-environments
```

## Usage

### Basic Commands

After installation, these command-line tools are available:

**Run Individual Scenarios**
```bash
are-run -s scenario_find_image_file -a default
```

**Benchmark Evaluation**
```bash
are-benchmark run -d /path/to/scenarios --agent default --limit 10
```

**Gaia2 Evaluation**
```bash
are-benchmark gaia2-run --hf meta-agents-research-environments/gaia2 --hf_split validation -l 5
```

**Interactive GUI**
```bash
are-gui -s scenario_find_image_file
```

The GUI provides a web-based interface for interactive scenario exploration and real-time agent monitoring. When started, it typically runs at `http://localhost:8080`. The interface supports different view modes:

- **Playground Mode**: Chat-like interface for direct agent interaction
- **Scenarios Mode**: Structured task execution and evaluation with DAG visualization

![Scenario DAG Visualization](docs/_static/ui/scenario_dag_visualization.png)

For detailed information about the GUI features, navigation, and workspace usage, see the [Understanding UI Guide](https://facebookresearch.github.io/meta-agents-research-environments/user_guide/understanding_ui.html).


### Model Configuration

ARE supports multiple AI model providers through LiteLLM:

```bash
# Llama API
export LLAMA_API_KEY="your-api-key"
are-benchmark run --hf meta-agents-research-environments/gaia2 --hf_split validation \
  --model Llama-3.1-70B-Instruct --provider llama-api --agent default

# Local deployment
are-benchmark run --hf meta-agents-research-environments/gaia2 --hf_split validation \
  --model your-local-model --provider local \
  --endpoint "http://localhost:8000" --agent default
```

For detailed information on configuring different model providers, environment variables, and advanced options, see the [LLM Configuration Guide](https://facebookresearch.github.io/meta-agents-research-environments/user_guide/llm_configuration.html).

Run any command with `--help` to see all available options.
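
For scripted runs, the Llama API command shown above can also be assembled from Python. The following is a minimal sketch and not part of ARE: `require_env` and `build_run_command` are hypothetical helper names, and the flags are copied verbatim from the Llama API example in this section, so confirm them with `--help` for your installed version.

```python
import os
import shlex


def require_env(name: str, env=os.environ) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = env.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the benchmark")
    return value


def build_run_command(dataset: str, split: str, model: str, provider: str,
                      agent: str = "default") -> list[str]:
    """Assemble the argument list for `are-benchmark run`.

    Assumption: the flags mirror the Llama API example in this README.
    """
    return [
        "are-benchmark", "run",
        "--hf", dataset,
        "--hf_split", split,
        "--model", model,
        "--provider", provider,
        "--agent", agent,
    ]


if __name__ == "__main__":
    cmd = build_run_command(
        dataset="meta-agents-research-environments/gaia2",
        split="validation",
        model="Llama-3.1-70B-Instruct",
        provider="llama-api",
    )
    # Print the command instead of executing it, so the sketch is safe to run anywhere.
    print(shlex.join(cmd))
```

To actually launch the run, one option is to call `require_env("LLAMA_API_KEY")` first and then pass `cmd` to `subprocess.run(cmd, check=True)`.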

### Example: Gaia2 Benchmark

```bash
# Set up your model configuration
export LLAMA_API_KEY="your-api-key"

# Run on the validation split to test your setup
are-benchmark run --hf meta-agents-research-environments/gaia2 --hf_split validation \
  --model meta-llama/Llama-3.3-70B-Instruct --model_provider novita \
  --agent default --limit 10 --output_dir ./validation_results

# Run complete Gaia2 evaluation for leaderboard submission
are-benchmark gaia2-run --hf meta-agents-research-environments/gaia2 \
  --model Llama-3.1-70B-Instruct --provider llama-api \
  --agent default --output_dir ./gaia2_results \
  --hf_upload my-org/gaia2-results
```

## API

### Core Concepts

- **Agents**: AI entities that interact with the environment using the ReAct (Reasoning + Acting) framework
- **Apps**: Interactive applications (email, calendar, file system) that provide APIs for agent interaction
- **Events**: Dynamic elements that make environments evolve over time
- **Scenarios**: Complete tasks combining apps, events, and validation logic
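
To make the relationship between these concepts concrete, here is a toy model in plain Python. It is an illustrative sketch only and does not use the ARE API: `ToyInbox`, `ToyEvent`, and `ToyScenario` are hypothetical names invented for this example. It shows an app holding state, events that change that state at scheduled times, and a scenario that bundles both with validation logic.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToyInbox:
    """Hypothetical stand-in for an app: holds state and exposes a small API."""
    messages: list[str] = field(default_factory=list)

    def receive(self, text: str) -> None:
        self.messages.append(text)


@dataclass(order=True)
class ToyEvent:
    """Hypothetical event: changes app state at a given time (ordered by time only)."""
    time: int
    apply: Callable[[ToyInbox], None] = field(compare=False)


@dataclass
class ToyScenario:
    """Hypothetical scenario: an app, scheduled events, and validation logic."""
    app: ToyInbox
    events: list[ToyEvent]
    validate: Callable[[ToyInbox], bool]

    def advance_to(self, now: int) -> None:
        """Fire every pending event scheduled at or before `now`, in time order."""
        due = sorted(e for e in self.events if e.time <= now)
        for event in due:
            event.apply(self.app)
        self.events = [e for e in self.events if e.time > now]


scenario = ToyScenario(
    app=ToyInbox(),
    events=[
        ToyEvent(5, lambda inbox: inbox.receive("Meeting moved to 3pm")),
        ToyEvent(1, lambda inbox: inbox.receive("Meeting at 2pm")),
    ],
    validate=lambda inbox: inbox.messages[-1:] == ["Meeting moved to 3pm"],
)

scenario.advance_to(2)   # only the first message has arrived
early = list(scenario.app.messages)
scenario.advance_to(10)  # the environment has evolved since the agent last looked
print(early, scenario.app.messages, scenario.validate(scenario.app))
```

In this picture, an agent would be a loop that reads and acts on the app between calls to `advance_to`, which is why a plan based on the earlier state may need to be revised once later events fire.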

### Documentation

Comprehensive documentation is available at:

- **Main Documentation**: [docs/index.rst](https://facebookresearch.github.io/meta-agents-research-environments/index.html)
- **Tutorials**: [docs/tutorials/](https://facebookresearch.github.io/meta-agents-research-environments/tutorials/)
- **API Reference**: [docs/api_reference/](https://facebookresearch.github.io/meta-agents-research-environments/api_reference/)

Key documentation sections:

- [Core Concepts](https://facebookresearch.github.io/meta-agents-research-environments/foundations/index.html) - Understanding agents, apps, events, and scenarios
- [Benchmarking Guide](https://facebookresearch.github.io/meta-agents-research-environments/user_guide/benchmarking.html) - Complete benchmarking and evaluation reference
- [Gaia2 Evaluation](https://facebookresearch.github.io/meta-agents-research-environments/user_guide/gaia2_evaluation.html) - Detailed Gaia2 benchmark submission guide
- [Scenario Development](https://facebookresearch.github.io/meta-agents-research-environments/tutorials/scenario_development.html) - Creating custom scenarios
- [CLI Reference](https://facebookresearch.github.io/meta-agents-research-environments/api_reference/cli_reference.html) - Complete command-line interface documentation

### Quick Links

- **Installation Guide**: [docs/user_guide/installation.rst](https://facebookresearch.github.io/meta-agents-research-environments/user_guide/installation.html)
- **Quickstart Tutorial**: [docs/quickstart.rst](https://facebookresearch.github.io/meta-agents-research-environments/quickstart.html)

## Contributing

We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details on:

- Setting up the development environment
- Running tests and linting
- Submitting pull requests
- Creating new scenarios and apps

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Citation

If you use Meta Agents Research Environments in your work, please cite:

```bibtex
TODO
```
