Metadata-Version: 2.4
Name: gl-computer-use
Version: 0.0.0b6
Summary: GL Computer Use SDK — desktop automation via natural-language prompts.
Author-email: Christopher Julius Limantoro <christopherlimantoro@gmail.com>
Requires-Python: <3.14,>=3.12
Description-Content-Type: text/markdown
Requires-Dist: pydantic<3.0.0,>=2.11.4
Requires-Dist: pydantic-settings<3.0.0,>=2.0.0
Requires-Dist: python-dotenv<2.0.0,>=1.0.0
Requires-Dist: e2b-desktop<3.0.0,>=2.3.0
Requires-Dist: cua-agent<1.0.0,>=0.8.2
Requires-Dist: aiofiles<25.0.0,>=23.0.0
Requires-Dist: structlog<26.0.0,>=24.0.0
Provides-Extra: minio
Requires-Dist: aiobotocore<3,>=2.13; extra == "minio"
Provides-Extra: recording
Requires-Dist: playwright<2.0.0,>=1.40.0; extra == "recording"
Requires-Dist: pillow<13.0.0,>=12.2.0; extra == "recording"
Provides-Extra: agents
Requires-Dist: gui-agents>=0.3.0; extra == "agents"
Requires-Dist: pyautogui>=0.9.54; extra == "agents"
Provides-Extra: opensandbox
Requires-Dist: opensandbox>=0.1.7; extra == "opensandbox"
Provides-Extra: retries
Requires-Dist: tenacity<10.0.0,>=8.2.0; extra == "retries"
Provides-Extra: observability
Requires-Dist: gl-computer-use[retries]; extra == "observability"
Requires-Dist: gl-observability-binary==0.1.4; extra == "observability"
Provides-Extra: otel
Requires-Dist: gl-computer-use[observability]; extra == "otel"
Provides-Extra: all
Requires-Dist: gl-computer-use[agents,minio,observability,opensandbox,recording,retries]; extra == "all"

# GL Computer Use

## Description

A typed Python SDK for desktop automation via natural-language prompts. GL Computer Use wraps cloud desktop sandboxes and computer-use agents into a clean async API with live streaming, human-in-the-loop takeover, structured observability, and swappable providers.

### Key Features

- **Streaming and non-streaming run modes**: `run()` for live events, `run_once()` for a single result, `run_sync()` for non-async scripts and Jupyter notebooks.
- **Swappable agents**: `cua` (trycua/cua, default) or `agents` (simular-ai/Agent-S).
- **Swappable sandboxes**: `e2b` (E2B Desktop, default) or `opensandbox` (Alibaba OpenSandbox).
- **Live desktop URL**: noVNC streaming URL surfaced via the `SANDBOX_READY` event or `StreamClient.stream_url`.
- **Human-in-the-loop takeover**: pause an agent loop and hand control to a human, then resume with optional guidance.
- **Artifact storage**: local disk by default, MinIO/S3 via the `minio` extra.
- **Structured logging** with optional OpenTelemetry tracing/metrics and Sentry via the `observability` extra.
- **Custom provider registration**: plug in your own sandbox, agent, or artifact store without modifying the SDK.

---

## Installation

Install the core SDK:

```bash
pip install gl-computer-use
```

Install optional extras only when you need them:

```bash
pip install "gl-computer-use[recording]"     # WebM session recording via Playwright
pip install "gl-computer-use[agents]"        # Agent-S (simular-ai) support
pip install "gl-computer-use[opensandbox]"   # Alibaba OpenSandbox support
pip install "gl-computer-use[minio]"         # MinIO / S3-compatible artifact store
pip install "gl-computer-use[observability]" # OTLP tracing/metrics + Sentry via gl-observability
pip install "gl-computer-use[all]"           # all of the above
```

API keys required at runtime:

1. E2B API key — [e2b.dev](https://e2b.dev) (when using `sandbox="e2b"`)
2. Anthropic API key (for the default `claude-sonnet-4-6` model) or OpenAI API key

### Session recording setup (optional, one-time)

WebM recordings require Playwright's Chromium binaries (~130 MB, stored under `~/.cache/ms-playwright/`):

```bash
pip install "gl-computer-use[recording]"
gl-computer-use-setup
```

If you skip this step, the SDK falls back to GIF recording via screenshot stitching.

---

## Quick Start

### Streaming events

`run()` returns a `StreamClient`; iterate it to receive events. The terminal `TASK_COMPLETED` event carries the final `TaskResult`.

```python
import asyncio
from gl_computer_use import GLComputerUseClient


async def main() -> None:
    client = GLComputerUseClient()
    stream = await client.run("Open Firefox and navigate to google.com")

    async for event in stream:
        if event.event_type == "SANDBOX_READY" and event.stream_url:
            print(f"Watch live at: {event.stream_url}")
        elif event.event_type == "STEP_COMPLETED":
            print(f"Step {event.step_index}: {event.action.type if event.action else '—'}")
        elif event.event_type == "TASK_COMPLETED":
            print(f"Status: {event.result.status}")
            print(f"Output: {event.result.output}")


asyncio.run(main())
```

### Single-result async

`run_once()` awaits the task and returns the final `TaskResult` directly. It raises `TaskFailedError` or `TaskCancelledError` on any non-`COMPLETED` outcome.

```python
import asyncio
from gl_computer_use import GLComputerUseClient


async def main() -> None:
    client = GLComputerUseClient()
    result = await client.run_once("Open a terminal and check Python version")
    print(result.status, result.output, len(result.steps))


asyncio.run(main())
```

### Synchronous / Jupyter

`run_sync()` is a plain synchronous method — no `asyncio.run()`, no `await`. It detects whether an event loop is already running and dispatches via `ThreadPoolExecutor` when needed, so it works in regular scripts and Jupyter notebooks (no `nest_asyncio` required).

```python
from gl_computer_use import GLComputerUseClient

result = GLComputerUseClient().run_sync("Open the file manager")
print(result.status)
```
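
The loop-detection behaviour described above can be illustrated with the standard library alone. The following is a minimal sketch of the general technique, not the SDK's actual implementation; `run_coroutine_sync` and `fake_task` are hypothetical names introduced for this example.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Coroutine, TypeVar

T = TypeVar("T")


def run_coroutine_sync(coro: Coroutine[Any, Any, T]) -> T:
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread (plain script): run directly.
        return asyncio.run(coro)
    # A loop is already running (for example in Jupyter): run the coroutine
    # on a fresh loop in a worker thread and block until it finishes.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def fake_task(prompt: str) -> str:
    # Hypothetical stand-in for an async SDK call; not part of gl_computer_use.
    await asyncio.sleep(0)
    return f"done: {prompt}"
```

Calling `run_coroutine_sync(fake_task("x"))` works both from a plain script and from code that is already inside an event loop. In the second case the calling thread blocks until the worker thread finishes, which is the trade-off that makes `nest_asyncio` unnecessary.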

---

## Configuration

Configure the SDK through environment variables (prefix `GLCU_`) or by passing a `GLComputerUseConfig` object directly. For the environment route, create a `.env` file in your working directory:

```dotenv
GLCU_E2B_API_KEY=sk-e2b-...
GLCU_ANTHROPIC_API_KEY=sk-ant-...

# Optional overrides
GLCU_MODEL=anthropic/claude-sonnet-4-6
GLCU_TIMEOUT=300
GLCU_MAX_STEPS=50
```

Critical fields:

| Variable | Default | Description |
|---|---|---|
| `GLCU_E2B_API_KEY` | `None` | E2B Desktop API key (required when `sandbox="e2b"`) |
| `GLCU_ANTHROPIC_API_KEY` | `None` | Anthropic API key (required for `anthropic/*` models) |
| `GLCU_OPENAI_API_KEY` | `None` | OpenAI API key (required for `openai/*` models) |
| `GLCU_MODEL` | `"anthropic/claude-sonnet-4-6"` | LLM in `provider/name` format |
| `GLCU_AGENT` | `"cua"` | Agent provider: `"cua"` or `"agents"` |
| `GLCU_SANDBOX` | `"e2b"` | Sandbox provider: `"e2b"` or `"opensandbox"` |
| `GLCU_ARTIFACT` | `"local"` | Artifact store: `"local"` or `"minio"` |
| `GLCU_TIMEOUT` | `300.0` | Task timeout in seconds |
| `GLCU_MAX_STEPS` | `50` | Maximum agent loop iterations |
| `GLCU_LOCAL_ARTIFACT_DIR` | `"./artifacts"` | Directory for saved screenshots and recordings |
| `GLCU_LOG_LEVEL` | `"INFO"` | `DEBUG`, `INFO`, `WARNING`, or `ERROR` |
| `GLCU_LOG_FORMAT` | `"json"` | `"json"` (structured) or `"console"` (human-readable) |

OpenSandbox, MinIO, Agent-S, and observability (OTLP/Sentry/PII) have additional `GLCU_*` env vars — see `GLComputerUseConfig` in `gl_computer_use/config.py` for the full list.
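
Because the required credentials depend on the chosen sandbox and model, a small pre-flight check can catch a missing key before a task starts. The sketch below encodes only the rules stated in the table above; `missing_credentials` is a hypothetical helper, not an SDK function, and it inspects a plain mapping such as `os.environ` (it does not parse `.env` files itself).

```python
import os
from typing import Mapping

DEFAULT_MODEL = "anthropic/claude-sonnet-4-6"
DEFAULT_SANDBOX = "e2b"

# Model provider prefix -> credential variable, per the table above.
MODEL_KEYS = {
    "anthropic": "GLCU_ANTHROPIC_API_KEY",
    "openai": "GLCU_OPENAI_API_KEY",
}


def missing_credentials(env: Mapping[str, str]) -> list[str]:
    """Return the GLCU_* credential variables that look required but unset."""
    missing: list[str] = []
    sandbox = env.get("GLCU_SANDBOX", DEFAULT_SANDBOX)
    if sandbox == "e2b" and not env.get("GLCU_E2B_API_KEY"):
        missing.append("GLCU_E2B_API_KEY")
    # GLCU_MODEL uses the provider/name format; the prefix picks the key.
    provider = env.get("GLCU_MODEL", DEFAULT_MODEL).split("/", 1)[0]
    key = MODEL_KEYS.get(provider)
    if key and not env.get(key):
        missing.append(key)
    return missing


if __name__ == "__main__":
    print(missing_credentials(os.environ))
```

An empty list means the credentials implied by the current sandbox and model settings are present; the SDK itself reports such problems as `ConfigError`.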

---

## Provider Agnosticism

Swap agents and sandboxes via config alone — no code changes:

| Agent | Sandbox | Config |
|---|---|---|
| CUA (default) | E2B (default) | `GLComputerUseClient()` |
| CUA | OpenSandbox | `GLComputerUseConfig(sandbox="opensandbox")` |
| Agent-S | E2B | `GLComputerUseConfig(agent="agents")` |
| Agent-S | OpenSandbox | `GLComputerUseConfig(agent="agents", sandbox="opensandbox")` |

```python
from gl_computer_use import GLComputerUseClient, GLComputerUseConfig

client = GLComputerUseClient(GLComputerUseConfig(agent="agents", sandbox="opensandbox"))
```

---

## Runtime API

The client exposes three run methods:

| Method | Returns | Use when |
|---|---|---|
| `await client.run(prompt, ...)` | `StreamClient` | You need live event streaming or the `SANDBOX_READY` URL before the task finishes |
| `await client.run_once(prompt, ...)` | `TaskResult` | You only need the final result and are already in an async context |
| `client.run_sync(prompt, ...)` | `TaskResult` | You only need the final result and are in a non-async script or Jupyter notebook |

All three methods accept the same parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `prompt` | `str` | — | Task description |
| `config` | `GLComputerUseConfig \| None` | `None` | Per-call config override |
| `timeout` | `float \| None` | `None` | Max seconds (falls back to `config.timeout`) |
| `files` | `list[File] \| None` | `None` | Files to upload to the sandbox before the task |
| `retrieve_files` | `list[str] \| None` | `None` | Sandbox paths to download after completion |
| `on_takeover_needed` | `Callable \| None` | `None` | Takeover callback |

`run_once()` and `run_sync()` raise `TaskFailedError` / `TaskCancelledError` directly instead of returning a result with a non-`COMPLETED` status.

---

## Live Desktop (noVNC)

When using the E2B sandbox, a noVNC HTTP endpoint is started alongside the desktop. The SDK waits until that endpoint is reachable before surfacing the URL.

```python
import webbrowser

# Runs inside an async function, with `client` created as in Quick Start.
stream = await client.run("do something")

# Option A: read the attribute before iterating
print(stream.stream_url)

# Option B: react to the first SANDBOX_READY event
async for event in stream:
    if event.event_type == "SANDBOX_READY" and event.stream_url:
        webbrowser.open(event.stream_url)
```
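
The "wait until reachable" step can be pictured as a simple polling loop. This is a minimal sketch of that idea using only the standard library; `http_reachable` and `wait_until` are hypothetical helpers, and the SDK's real readiness check, timeouts, and intervals may differ.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: not reachable yet.
        return False


def wait_until(probe: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Call probe() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller could combine the two as `wait_until(lambda: http_reachable(url))` and only surface `url` once the call returns `True`.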

---

## Takeover

Pass `on_takeover_needed` to `run()` / `run_once()` / `run_sync()`. The agent pauses when a takeover condition is detected, and your callback receives a `TakeoverContext` with the session state and a `resume()` function. Without a callback, a `TakeoverRequiredError` is raised. See `examples/takeover.py` and `examples/takeover_caller_initiated.py`.
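
To make the callback shape concrete, here is a self-contained sketch that uses a stand-in context object. Everything in it is an assumption for illustration: `StubTakeoverContext`, its `reason` field, and the exact `resume(guidance=...)` signature are hypothetical, and whether the real callback is sync or async should be checked against the bundled examples.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StubTakeoverContext:
    """Hypothetical stand-in; the real TakeoverContext may differ."""

    reason: str
    resumed_with: list[Optional[str]] = field(default_factory=list)

    def resume(self, guidance: Optional[str] = None) -> None:
        # Record the guidance so the sketch can be checked without the SDK.
        self.resumed_with.append(guidance)


def on_takeover_needed(ctx: StubTakeoverContext) -> None:
    """Let a human act on the desktop, then hand control back."""
    print(f"Takeover requested: {ctx.reason}")
    # A real callback would wait here until the human is done,
    # for example by blocking on input() or a UI signal.
    ctx.resume(guidance="Login completed manually; continue with the task.")
```

The pattern is the same regardless of the exact fields: inspect the context, let the human finish, then call `resume()` with optional guidance for the agent.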

---

## Errors

All SDK exceptions extend `GLComputerUseError`:

- `ConfigError` — bad or missing credentials.
- `SandboxProvisionError` — the sandbox could not be allocated.
- `GLTimeoutError` — no event received within the configured timeout.
- `TaskFailedError` — the agent terminated with an error (`TASK_FAILED`).
- `TaskCancelledError` — the task was cancelled (`TASK_CANCELLED`).
- `TakeoverRequiredError` — takeover was needed but no callback was supplied.

```python
import asyncio

from gl_computer_use import (
    GLComputerUseClient,
    ConfigError,
    SandboxProvisionError,
    GLTimeoutError,
    TaskFailedError,
)


async def main() -> None:
    try:
        result = await GLComputerUseClient().run_once("do something", timeout=60.0)
    except ConfigError as e:
        print("Check your API keys:", e)
    except SandboxProvisionError as e:
        print("Sandbox failed to start:", e)
    except GLTimeoutError as e:
        print("Took too long:", e)
    except TaskFailedError as e:
        print("Agent failed:", e)
    else:
        print(result.status)


asyncio.run(main())
```
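
Some of these errors, such as a sandbox that fails to provision, may be transient, so callers sometimes wrap a run in a retry loop. The following is a generic sketch with exponential backoff built on `asyncio` only; `retry_async` is a hypothetical helper, and this README does not state whether the SDK already retries internally (the `retries` extra installs `tenacity`, which offers a more complete alternative).

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_async(
    call: Callable[[], Awaitable[T]],
    retry_on: tuple[type[Exception], ...],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Await call(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await call()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Delays grow as base_delay * 1, 2, 4, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

With the SDK this could be used as `await retry_async(lambda: client.run_once("do something"), (SandboxProvisionError,))`. Passing a zero-argument callable matters because each retry needs a fresh coroutine.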

---

## Observability

The SDK uses `structlog` for structured logging (JSON by default; set `GLCU_LOG_FORMAT=console` for human-readable output). Every log line carries `session_id`, `task_id`, and `component`. Distributed tracing and metrics via OTLP, plus Sentry error tracking, are available through the `observability` extra and are delegated to GDP Labs' [`gl-observability`](../gl-observability) SDK. Optional regex-based PII redaction is enabled with `GLCU_PII_REDACTION_ENABLED=true`.
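
Regex-based redaction fits naturally into a `structlog` processor, which is a callable taking `(logger, method_name, event_dict)` and returning the event dict. The sketch below shows the general idea with two illustrative patterns; `redact_pii` and `PII_PATTERNS` are hypothetical, and the patterns the SDK actually applies are not documented here.

```python
import re
from typing import Any, MutableMapping

# Illustrative patterns only; the SDK's actual patterns may differ.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    (re.compile(r"\b\d(?:[ -]?\d){12,15}\b"), "[CARD]"),
]


def redact_pii(
    logger: Any, method_name: str, event_dict: MutableMapping[str, Any]
) -> MutableMapping[str, Any]:
    """structlog-style processor: mask PII in every top-level string value."""
    for key, value in event_dict.items():
        if isinstance(value, str):
            for pattern, replacement in PII_PATTERNS:
                value = pattern.sub(replacement, value)
            event_dict[key] = value
    return event_dict
```

A processor like this would be placed before the final renderer in a `structlog` processor chain so that masked values are what gets serialized. Nested structures are left untouched in this sketch.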

---

## Custom Providers

Plug in alternative sandboxes, agents, or artifact stores without modifying the SDK:

```python
from gl_computer_use import register_sandbox, GLComputerUseClient, GLComputerUseConfig
from gl_computer_use.sandbox.base import BaseSandbox


class MyCustomSandbox(BaseSandbox):
    ...  # implement abstract methods


register_sandbox("my-sandbox", MyCustomSandbox)
client = GLComputerUseClient(config=GLComputerUseConfig(sandbox="my-sandbox"))
```

`register_agent` and `register_artifact` work the same way for custom agents and artifact stores.
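
Conceptually, registration maps a provider name to a class so that a config string such as `sandbox="my-sandbox"` can be resolved at run time. The sketch below shows that name-to-class registry pattern in isolation; the `*_sketch` functions and `InMemorySandbox` are hypothetical, and the SDK's internal registry (including how it treats duplicate names) may work differently.

```python
# Hypothetical registry; the SDK's internal data structure may differ.
_SANDBOXES: dict[str, type] = {}


def register_sandbox_sketch(name: str, cls: type) -> None:
    """Map a provider name to the class that implements it."""
    if name in _SANDBOXES:
        # Assumption for this sketch: duplicate names are rejected.
        raise ValueError(f"sandbox {name!r} is already registered")
    _SANDBOXES[name] = cls


def create_sandbox_sketch(name: str, **kwargs: object) -> object:
    """Instantiate the provider registered under name."""
    try:
        cls = _SANDBOXES[name]
    except KeyError:
        known = ", ".join(sorted(_SANDBOXES)) or "none"
        raise ValueError(f"unknown sandbox {name!r} (registered: {known})") from None
    return cls(**kwargs)


class InMemorySandbox:
    """Toy provider used only for this illustration."""

    def __init__(self, label: str = "demo") -> None:
        self.label = label


register_sandbox_sketch("in-memory", InMemorySandbox)
```

The practical consequence is ordering: a provider has to be registered before a client is built with a config that names it.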

---

## Local Development Setup

```bash
git clone git@github.com:GDP-ADMIN/gl-sdk.git
cd gl-sdk/libs/gl-computer-use
uv sync --all-extras
uv run gl-computer-use-setup
source .venv/bin/activate
```

Run checks:

```bash
uv run pytest           # tests
uv run ruff check .     # lint
uv run ruff check --fix # auto-fix lint
uv run mypy gl_computer_use/  # type-check
```

---

## Contributing

Please refer to the [Python Style Guide](https://docs.google.com/document/d/1uRggCrHnVfDPBnG641FyQBwUwLoFw0kTzNqRm92vUwM/edit?usp=sharing) for code style, documentation standards, and SCA requirements.
