Metadata-Version: 2.4
Name: chakra-marina
Version: 0.13.0
Summary: A framework for evaluating and optimizing agents and models using sandboxed environments.
Author: Alex Shaw
Author-email: Alex Shaw <alexgshaw64@gmail.com>
License-Expression: Apache-2.0
License-File: LICENSE
Requires-Dist: pydantic>=2.11.7
Requires-Dist: shortuuid>=1.0.13
Requires-Dist: typer>=0.16.0
Requires-Dist: requests>=2.32.4
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: rich>=14.1.0
Requires-Dist: toml>=0.10.2
Requires-Dist: tenacity>=9.1.2
Requires-Dist: python-dotenv>=1.1.1
Requires-Dist: litellm>=1.83.14
Requires-Dist: jinja2>=3.1.6
Requires-Dist: datasets>=4.4.1
Requires-Dist: dirhash>=0.5.0
Requires-Dist: claude-agent-sdk>=0.1.17
Requires-Dist: packaging>=25.0
Requires-Dist: fastapi>=0.128.0
Requires-Dist: uvicorn>=0.38.0
Requires-Dist: ruff>=0.13.0
Requires-Dist: pathspec>=1.0.3
Requires-Dist: supabase>=2.28.2
Requires-Dist: httpx>=0.27.0
Requires-Dist: chakra-marina[cloud] ; extra == 'all'
Requires-Dist: chakra-marina[tinker] ; extra == 'all'
Requires-Dist: chakra-marina[cwsandbox] ; extra == 'cloud'
Requires-Dist: chakra-marina[wandb] ; extra == 'cloud'
Requires-Dist: chakra-marina[e2b] ; extra == 'cloud'
Requires-Dist: chakra-marina[daytona] ; extra == 'cloud'
Requires-Dist: chakra-marina[islo] ; extra == 'cloud'
Requires-Dist: chakra-marina[modal] ; extra == 'cloud'
Requires-Dist: chakra-marina[runloop] ; extra == 'cloud'
Requires-Dist: chakra-marina[gke] ; extra == 'cloud'
Requires-Dist: chakra-marina[tensorlake] ; extra == 'cloud'
Requires-Dist: chakra-marina[novita] ; extra == 'cloud'
Requires-Dist: cwsandbox>=0.23.3 ; extra == 'cwsandbox'
Requires-Dist: daytona>=0.165.0 ; extra == 'daytona'
Requires-Dist: e2b>=2.25.0 ; extra == 'e2b'
Requires-Dist: dockerfile-parse>=2.0.1 ; extra == 'e2b'
Requires-Dist: kubernetes>=32.0.0 ; extra == 'gke'
Requires-Dist: islo>=0.3.0 ; extra == 'islo'
Requires-Dist: dockerfile-parse>=2.0.1 ; extra == 'islo'
Requires-Dist: harbor-langsmith ; extra == 'langsmith'
Requires-Dist: modal>=1.4.0 ; extra == 'modal'
Requires-Dist: novita-sandbox==2.0.0a3 ; extra == 'novita'
Requires-Dist: dockerfile-parse>=2.0.1 ; extra == 'novita'
Requires-Dist: runloop-api-client>=1.2.0 ; extra == 'runloop'
Requires-Dist: dockerfile-parse>=2.0.1 ; extra == 'runloop'
Requires-Dist: tensorlake>=0.5.18 ; extra == 'tensorlake'
Requires-Dist: tinker>=0.14.0 ; extra == 'tinker'
Requires-Dist: tinker-cookbook>=0.1.0 ; extra == 'tinker'
Requires-Dist: wandb>=0.27 ; extra == 'wandb'
Requires-Dist: cwsandbox>=0.23.3 ; extra == 'wandb'
Requires-Python: >=3.12
Provides-Extra: all
Provides-Extra: cloud
Provides-Extra: cwsandbox
Provides-Extra: daytona
Provides-Extra: e2b
Provides-Extra: gke
Provides-Extra: islo
Provides-Extra: langsmith
Provides-Extra: modal
Provides-Extra: novita
Provides-Extra: runloop
Provides-Extra: tensorlake
Provides-Extra: tinker
Provides-Extra: wandb
Description-Content-Type: text/markdown

# Marina

Marina is a fork of [Harbor](https://github.com/harbor-framework/harbor), a framework for evaluating and optimizing AI agents and language models. You can use Marina to:

- Evaluate arbitrary agents like Claude Code, OpenHands, Codex CLI, and more.
- Build and share your own benchmarks and environments.
- Conduct experiments in thousands of environments in parallel through providers like Daytona and Modal.
- Generate rollouts for RL optimization.

```bash
marina run -p examples/tasks/hello-screenshot --agent openhands-sdk --model openrouter/openai/gpt-5.5 --ae LLM_SUPPORTS_VISION=true
```

## Changes

We track upstream Harbor closely, with a few additions of our own.

**Agent kwargs** (`--ak key=value`)

- `disable_builtin_tools` — drop the default terminal/file-editor/task-tracker tools
- `disable_stuck_detection` — turn off the SDK's stuck-agent detection

**Environment variables** (`--ae NAME=value`)

- `LLM_SUPPORTS_VISION` — force vision support for models LiteLLM misclassifies
- `OPENROUTER_REASONING_ENABLED` / `_EXCLUDE` — control OpenRouter reasoning (auto-on for Opus 4.7)
- `OPENROUTER_VERBOSITY` — set OpenRouter verbosity
- `LLM_THINKING_DISPLAY` — thinking display mode (default `summarized`)
- `SYSTEM_MESSAGE_SUFFIX` / `USER_MESSAGE_SUFFIX` — append text to the OpenHands SDK system / user prompts
- `AWS_BEARER_TOKEN_BEDROCK` — bearer-token auth for Bedrock

Vision support also writes MCP image observations to `/logs/agent/trajectory-images/`, referenced from the ATIF trajectory.
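
Agent kwargs and environment variables can be combined in one invocation. The sketch below assembles such a command and prints a preview instead of running it. The kwarg value `true` and the suffix text are assumptions for illustration; the preview shows the arguments space-separated, without the original quoting.

```bash
# Hold the arguments as positional parameters so each one stays a single word.
# Assumption: boolean agent kwargs accept the literal value "true".
set -- run -p examples/tasks/hello-screenshot \
  --agent openhands-sdk \
  --model openrouter/openai/gpt-5.5 \
  --ak disable_stuck_detection=true \
  --ae LLM_SUPPORTS_VISION=true \
  --ae USER_MESSAGE_SUFFIX="Answer in one sentence."

# Preview only; replace `echo marina` with `marina` to actually run it.
echo marina "$@"
```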

## Installation

Marina is published to PyPI as [`chakra-marina`](https://pypi.org/project/chakra-marina/).

**Install the CLI.** To put `marina` on your PATH:

```bash
uv tool install chakra-marina
```

Or with pip:

```bash
pip install chakra-marina
```

Add extras for cloud providers: `chakra-marina[daytona]` for a single provider,
`chakra-marina[cloud]` for all cloud providers, or `chakra-marina[all]` for the cloud providers plus `tinker`.
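
Several extras can be requested at once using standard pip requirement syntax. A minimal sketch, with `daytona` and `modal` picked only as examples from the extras the package declares; it prints the install command rather than running it:

```bash
# Comma-separated extras; use any names the package declares.
extras="daytona,modal"
requirement="chakra-marina[${extras}]"

# Quote the requirement: shells such as zsh treat unquoted [ ] as a glob pattern.
echo "pip install '${requirement}'"
```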

## Example: Running a task

Run the bundled `hello-screenshot` task locally with Docker. It's a vision smoke
test: the agent calls an MCP `take_screenshot` tool, sees a solid-color image, and
reports the color — exercising Marina's image pipeline end to end.

```bash
export LLM_API_KEY=<YOUR-KEY>
marina run -p examples/tasks/hello-screenshot \
   --agent openhands-sdk \
   --model openrouter/openai/gpt-5.5 \
   --ae LLM_SUPPORTS_VISION=true \
   --ae SYSTEM_MESSAGE_SUFFIX="You run fully autonomously — never ask for confirmation."
```

`--ae LLM_SUPPORTS_VISION=true` enables vision (required for this task), and
`--ae SYSTEM_MESSAGE_SUFFIX="..."` appends text to the OpenHands SDK system prompt.
Both apply only to the `openhands-sdk` agent (see [Changes](#changes)).

To run on a cloud provider (like Daytona) instead of local Docker, pass the `--env` flag:

```bash
export LLM_API_KEY=<YOUR-KEY>
export DAYTONA_API_KEY=<YOUR-KEY>
marina run -p examples/tasks/hello-screenshot \
   --agent openhands-sdk \
   --model openrouter/openai/gpt-5.5 \
   --ae LLM_SUPPORTS_VISION=true \
   --env daytona
```

When running a whole benchmark (many tasks), raise `--n-concurrent` to fan out across
hundreds or thousands of environments in parallel.
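
As a rough sketch of such a run, the command below reuses the flags from the examples above and adds `--n-concurrent`. The benchmark path and the value `64` are placeholders, and it is an assumption that `-p` accepts a directory containing many tasks; `marina run --help` lists the actual options.

```bash
# Hypothetical values: point TASKS_PATH at your own benchmark directory.
TASKS_PATH="path/to/your/benchmark"
N_CONCURRENT=64   # illustrative number of environments to run at once

set -- run -p "$TASKS_PATH" \
  --agent openhands-sdk \
  --model openrouter/openai/gpt-5.5 \
  --env daytona \
  --n-concurrent "$N_CONCURRENT"

# Preview only; swap `echo marina` for `marina` to launch the run.
echo marina "$@"
```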

To see all supported agents and other options, run:

```bash
marina run --help
```
