Metadata-Version: 2.4
Name: agt_sandbox
Version: 4.0.0
Summary: Deprecated — use agent-governance-toolkit-cli instead. Agent Sandbox: Docker-based execution isolation for AI agents
Project-URL: Homepage, https://github.com/microsoft/agent-governance-toolkit
Project-URL: Repository, https://github.com/microsoft/agent-governance-toolkit
Project-URL: Documentation, https://github.com/microsoft/agent-governance-toolkit#readme
Project-URL: Bug Tracker, https://github.com/microsoft/agent-governance-toolkit/issues
Author-email: Microsoft Corporation <agentgovtoolkit@microsoft.com>
Maintainer-email: Agent Governance Toolkit Team <agentgovtoolkit@microsoft.com>
License: MIT
License-File: LICENSE
Keywords: agents,ai,containers,docker,governance,gvisor,isolation,kata,safety,sandbox
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: agent-governance-toolkit-cli<5.0,>=4.0.0
Provides-Extra: azure
Requires-Dist: azure-core<2.0,>=1.32.0; extra == 'azure'
Requires-Dist: azure-identity<2.0,>=1.19.0; extra == 'azure'
Provides-Extra: dev
Requires-Dist: mypy<3.0,>=2.1.0; extra == 'dev'
Requires-Dist: pytest-asyncio<2.0,>=1.3.0; extra == 'dev'
Requires-Dist: pytest-cov<8.0,>=7.1.0; extra == 'dev'
Requires-Dist: pytest<10.0,>=9.0.3; extra == 'dev'
Requires-Dist: ruff<1.0,>=0.15.13; extra == 'dev'
Provides-Extra: docker
Requires-Dist: docker<8.0,>=7.1.0; extra == 'docker'
Provides-Extra: full
Requires-Dist: agent-governance-toolkit-core<5.0,>=4.0.0; extra == 'full'
Requires-Dist: docker<8.0,>=7.1.0; extra == 'full'
Requires-Dist: hyperlight-sandbox<0.5,>=0.4.0; extra == 'full'
Provides-Extra: hyperlight
Requires-Dist: hyperlight-sandbox<0.5,>=0.4.0; extra == 'hyperlight'
Provides-Extra: policy
Requires-Dist: agent-governance-toolkit-core<5.0,>=4.0.0; extra == 'policy'
Description-Content-Type: text/markdown

# Agent Sandbox

Public Preview — execution isolation for AI agents with policy-driven
resource limits, tool proxies, network enforcement, and filesystem
checkpointing. Ships three interchangeable backends behind the same
`SandboxProvider` ABC.

Part of the [Agent Governance Toolkit](https://github.com/microsoft/agent-governance-toolkit).

## Providers at a glance

| Provider | Isolation primitive | Best for | Extra |
|----------|--------------------|----------|-------|
| `DockerSandboxProvider` | Hardened OCI container (runc, auto-upgrades to gVisor / Kata) | Local dev, CI, self-hosted runners | `agt-sandbox[docker]` |
| `HyperLightSandboxProvider` | KVM / mshv / WHP micro-VM via [hyperlight-sandbox](https://github.com/hyperlight-dev/hyperlight-sandbox) | Sub-millisecond cold start, per-call VM isolation | `agt-sandbox[hyperlight]` |
| `ACASandboxProvider` | [Azure Container Apps sandbox](https://github.com/microsoft/azure-container-apps) (managed) | Production, multi-tenant, no infra to run | `agt-sandbox[azure]` + the [early-access SDK wheel](https://github.com/microsoft/azure-container-apps/releases) |

All three implement the same async + sync API (`create_session`,
`execute_code`, `destroy_session`, plus `*_async` variants) and consume
the same `PolicyDocument` for resource caps, network allowlists, and
tool allowlists.
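
Because the three providers share one method surface, calling code can be written once against that surface. The sketch below is illustrative only: `run_once` and `EchoProvider` are hypothetical names, and `EchoProvider` is a stand-in that echoes the code instead of executing it, so the example runs without Docker, Hyperlight, or Azure installed.

```python
from types import SimpleNamespace


def run_once(provider, agent_id: str, code: str) -> str:
    """Create a session, run one snippet, and always clean up the session."""
    handle = provider.create_session(agent_id)
    try:
        out = provider.execute_code(agent_id, handle.session_id, code)
        return out.result.stdout
    finally:
        # Runs even if execute_code raises, so no session is leaked.
        provider.destroy_session(agent_id, handle.session_id)


class EchoProvider:
    """Hypothetical stand-in that mimics the shared sync method names."""

    def __init__(self):
        self.live = set()  # session ids that have not been destroyed yet

    def create_session(self, agent_id):
        session_id = f"{agent_id}-session"
        self.live.add(session_id)
        return SimpleNamespace(session_id=session_id)

    def execute_code(self, agent_id, session_id, code):
        # Echo the code back instead of executing it.
        return SimpleNamespace(result=SimpleNamespace(stdout=f"ran: {code}"))

    def destroy_session(self, agent_id, session_id):
        self.live.discard(session_id)


provider = EchoProvider()
print(run_once(provider, "agent-1", "print('hello')"))
```

Passing a real provider instance in place of `EchoProvider()` should work unchanged, assuming it exposes the sync methods listed above.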

## Installation

```bash
# Everything (Docker + Hyperlight + policy engine):
pip install "agt-sandbox[full]"

# Pick what you need:
pip install "agt-sandbox[docker]"
pip install "agt-sandbox[hyperlight]"
pip install "agt-sandbox[azure,policy]"
```

The Azure data-plane SDK ships as an early-access wheel — pin the URL:

```bash
pip install https://github.com/microsoft/azure-container-apps/releases/download/python-sdk-v0.1.0b1-early-access/azure_containerapps_sandbox-0.1.0b1-py3-none-any.whl
```

## Quick start (all three providers)

```python
from agent_sandbox import (
    DockerSandboxProvider,
    HyperLightSandboxProvider,
    ACASandboxProvider,
)

# Pick one:
provider = DockerSandboxProvider()
# provider = HyperLightSandboxProvider(backend="wasm")
# provider = ACASandboxProvider(
#     resource_group="my-rg", sandbox_group="agents",
#     region="eastus2", disk="python-3.13",
#     ensure_group_location="eastus2",
# )

handle = provider.create_session("agent-1")
out = provider.execute_code("agent-1", handle.session_id, "print('hello')")
print(out.result.stdout)
provider.destroy_session("agent-1", handle.session_id)
```

---

## 1. `DockerSandboxProvider` — local hardened containers

Each agent session runs in its own container with capabilities dropped,
no privilege escalation, a read-only root filesystem, a non-root user,
and no network by default.

```python
import asyncio
from agent_sandbox import (
    DockerSandboxProvider,
    IsolationRuntime,
    SandboxConfig,
)

async def run_agent_task():
    provider = DockerSandboxProvider(
        image="python:3.12-slim",
        runtime=IsolationRuntime.AUTO,   # auto-upgrade to gVisor / Kata
    )
    config = SandboxConfig(
        timeout_seconds=30,
        memory_mb=256,
        cpu_limit=0.5,
        network_enabled=False,
        read_only_fs=True,
    )

    session = await provider.create_session_async("research-agent", config=config)
    try:
        execution = await provider.execute_code_async(
            "research-agent", session.session_id,
            "import json, math; print(json.dumps([math.sqrt(x) for x in range(5)]))",
        )
        print(execution.result.stdout)

        checkpoint = provider.save_state(
            "research-agent", session.session_id, "after-step-1",
        )
        print(f"Checkpoint saved: {checkpoint.image_tag}")
    finally:
        await provider.destroy_session_async("research-agent", session.session_id)

asyncio.run(run_agent_task())
```

### What the Docker sandbox enforces

| Control | Default |
|---------|---------|
| Linux capabilities | All dropped (`--cap-drop=ALL`) |
| Privilege escalation | Blocked (`--security-opt=no-new-privileges`) |
| Root filesystem | Read-only |
| Container user | `nobody` (UID 65534) |
| PID limit | 256 |
| Network | Disabled unless explicitly allowed |
| Runtime | `runc` (auto-upgrades to gVisor or Kata when available) |
| State | `save_state` / `restore_state` via image commit |
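
For readers who want to see what these defaults look like at the Docker SDK level, here is a minimal sketch that builds keyword arguments for docker-py's `containers.run`. It is not taken from the provider's source; `hardened_run_kwargs` is a hypothetical helper, and the provider may configure containers differently.

```python
def hardened_run_kwargs(memory_mb: int = 256, cpu_limit: float = 0.5) -> dict:
    """Build docker-py `containers.run` keyword arguments mirroring the table."""
    return {
        "cap_drop": ["ALL"],                     # drop every Linux capability
        "security_opt": ["no-new-privileges"],   # block privilege escalation
        "read_only": True,                       # read-only root filesystem
        "user": "65534",                         # the `nobody` UID
        "pids_limit": 256,                       # cap the number of processes
        "network_disabled": True,                # no network unless allowed
        "mem_limit": f"{memory_mb}m",            # memory cap
        "nano_cpus": int(cpu_limit * 1_000_000_000),  # CPU cap in 1e-9 CPUs
    }


kwargs = hardened_run_kwargs()
print(kwargs["mem_limit"], kwargs["nano_cpus"])
```

With the `docker` extra installed, these could be passed as `docker.from_env().containers.run("python:3.12-slim", ["python", "-c", "print('hi')"], **kwargs)`.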

---

## 2. `HyperLightSandboxProvider` — micro-VM isolation

Backed by the upstream [hyperlight-sandbox](https://github.com/hyperlight-dev/hyperlight-sandbox)
runtime. Each session is a fresh micro-VM on KVM (Linux), mshv (Azure
HCL), or WHP (Windows) — typical cold start is well under a millisecond.
Tools are registered as host functions and invoked synchronously from
the guest, gated by the session's `policy.tool_allowlist`.

```python
from agent_sandbox import HyperLightSandboxProvider

def fetch_arxiv(query: str) -> str:
    return f"<results for {query}>"

provider = HyperLightSandboxProvider(
    backend="wasm",                 # or "hyperlightjs" / "nanvix"
    module="python_guest",          # only meaningful for backend="wasm"
    tools={"fetch_arxiv": fetch_arxiv},
)

if not provider.is_available():
    raise SystemExit(f"Hyperlight unavailable: {provider.unavailable_reason}")

handle = provider.create_session("agent-1")
out = provider.execute_code(
    "agent-1", handle.session_id,
    "print(fetch_arxiv('cs.CL'))",
)
print(out.result.stdout)
provider.destroy_session("agent-1", handle.session_id)
```

Notes:
- Each session owns one OS thread that is the sole code path touching
  its `Sandbox` — required by the upstream runtime.
- `provider.is_available()` probes for a hypervisor and returns `False`
  if none is present (e.g. on macOS hosts without WHP / KVM
  passthrough); `provider.unavailable_reason` then explains why.
- Only tools listed in a session's `policy.tool_allowlist` are exposed
  to that session's guest; the rest stay host-side.
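
The allowlist gating described in the last note can be pictured as a simple filter over the registered tools. This is a sketch of the idea only, not the provider's implementation; `exposed_tools` and `delete_files` are hypothetical names.

```python
from typing import Callable, Mapping


def exposed_tools(
    registered: Mapping[str, Callable[..., object]],
    tool_allowlist: list[str],
) -> dict[str, Callable[..., object]]:
    """Return only the registered tools that the session's allowlist names."""
    allowed = set(tool_allowlist)
    return {name: fn for name, fn in registered.items() if name in allowed}


def fetch_arxiv(query: str) -> str:
    return f"<results for {query}>"


def delete_files(path: str) -> str:  # hypothetical tool that should stay host-side
    return f"deleted {path}"


registered = {"fetch_arxiv": fetch_arxiv, "delete_files": delete_files}
guest_tools = exposed_tools(registered, ["fetch_arxiv"])
print(sorted(guest_tools))  # ['fetch_arxiv']
```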

---

## 3. `ACASandboxProvider` — Azure Container Apps

Runs each session inside a managed Azure Container Apps sandbox via the
early-access `azure-containerapps-sandbox` Python SDK
([complete reference](https://github.com/microsoft/azure-container-apps/blob/main/docs/early/python-sdk/complete-reference.md)).
Same API as the other providers; the rest of your code is unchanged.

```bash
pip install "agt-sandbox[azure,policy]"
pip install https://github.com/microsoft/azure-container-apps/releases/download/python-sdk-v0.1.0b1-early-access/azure_containerapps_sandbox-0.1.0b1-py3-none-any.whl

az login   # or use managed identity in hosted compute
```

```python
from agent_sandbox import ACASandboxProvider

provider = ACASandboxProvider(
    resource_group="my-rg",          # must already exist
    sandbox_group="agents",          # auto-created if ensure_group_location is set
    region="eastus2",                # selects the data-plane endpoint
    subscription_id=None,            # falls back to AZURE_SUBSCRIPTION_ID env var
    disk="python-3.13",              # public disk image with python3 preinstalled
    ensure_group_location="eastus2", # create the sandbox group on first use
)

if not provider.is_available():
    raise SystemExit(f"ACA unavailable: {provider.unavailable_reason}")

handle = provider.create_session("agent-1")
out = provider.execute_code(
    "agent-1", handle.session_id, "print('hello azure')"
)
print(out.result.stdout)
provider.destroy_session("agent-1", handle.session_id)
provider.close()
```

The provider holds one `SandboxGroupClient` per `(resource_group,
sandbox_group)` pair and caches the per-sandbox `SandboxClient` returned
by `begin_create_sandbox().result()`. When a `PolicyDocument` is
supplied, `network_allowlist` is translated into a fail-closed egress
policy (`defaultAction: Deny` + per-host `Allow` rules) and applied via
`SandboxClient.set_egress_policy`. Set `defaults.network_default: allow`
in the policy if you explicitly want the SDK's default-allow behaviour.
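
To make the fail-closed translation concrete, the sketch below turns an allowlist into a default-deny structure. Only `defaultAction`, `Deny`, and `Allow` come from the description above; the `rules`, `host`, and `action` keys and the `build_egress_policy` helper are assumptions for illustration, and the payload accepted by `SandboxClient.set_egress_policy` may be shaped differently.

```python
def build_egress_policy(network_allowlist: list[str], network_default: str = "deny") -> dict:
    """Translate an allowlist into a default-deny policy with per-host allow rules."""
    if network_default == "allow":
        # Explicit opt-in to default-allow: no per-host rules are needed.
        return {"defaultAction": "Allow", "rules": []}
    return {
        "defaultAction": "Deny",  # fail closed: anything not listed is blocked
        "rules": [{"host": host, "action": "Allow"} for host in network_allowlist],
    }


policy = build_egress_policy(["api.openai.com", "*.github.com"])
print(policy["defaultAction"], len(policy["rules"]))
```

Note that an empty allowlist yields a policy that denies all egress, which is the fail-closed behaviour the text describes.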

A complete worked example (8 verified branches against live Azure —
allow / policy-deny / egress-block / sanity / tool-allowed /
tool-denied / remote-execution proof / egress audit) lives at
[`examples/quickstart/aca_sandbox_test.py`](../../examples/quickstart/aca_sandbox_test.py)
and reads its policy from
[`examples/quickstart/policies/aca_research_agent.yaml`](../../examples/quickstart/policies/aca_research_agent.yaml).

---

## Policy-driven configuration

All three providers consume the same `agent_os.policies.PolicyDocument`.
Sandbox resource caps, network allowlists, and tool allowlists are
native fields on the schema as of AGT 3.3, so policies live in YAML:

```yaml
name: research-agent
version: "2"

defaults:
  action: allow
  max_cpu: 1.0
  max_memory_mb: 2048
  timeout_seconds: 90
  network_default: deny

network_allowlist:
  - api.openai.com
  - "*.github.com"

tool_allowlist:
  - fetch_arxiv

rules:
  - name: deny-shell-out
    condition: { field: code, operator: contains, value: subprocess }
    action: deny
    priority: 100
    message: "shell-out blocked by research-agent policy"
```
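
The policy above mixes an exact host with a wildcard entry. How the toolkit matches wildcards is not specified here; the sketch below assumes shell-style glob semantics via the standard library's `fnmatch`, under which `*.github.com` covers subdomains but not the bare `github.com`. `host_allowed` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def host_allowed(host: str, network_allowlist: list[str]) -> bool:
    """Check a hostname against exact and glob-style allowlist entries."""
    host = host.lower().rstrip(".")  # normalise case and a trailing root dot
    return any(fnmatchcase(host, pattern.lower()) for pattern in network_allowlist)


allowlist = ["api.openai.com", "*.github.com"]
for candidate in ("api.openai.com", "raw.github.com", "github.com", "example.org"):
    print(candidate, host_allowed(candidate, allowlist))
```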

```python
from agent_os.policies import PolicyDocument

# Run inside an async function; `provider` is any of the three providers above.
policy = PolicyDocument.from_yaml("policies/aca_research_agent.yaml")
handle = await provider.create_session_async("agent-1", policy=policy)
```

## License

MIT
