Python SDK for agents that control the desktop, browse the web, and call any tool — with built-in safety, memory, and human approval.
pip install gantrygraph
from gantrygraph import GantryEngine, gantry_tool from gantrygraph.perception import DesktopScreen from gantrygraph.actions import MouseKeyboardTools from langchain_anthropic import ChatAnthropic @gantry_tool async def notify(channel: str, msg: str) -> str: """Post a message to a Slack channel.""" return await slack.post(channel, msg) agent = GantryEngine( llm=ChatAnthropic(model="claude-sonnet-4-6"), perception=DesktopScreen(), tools=[MouseKeyboardTools(), notify], max_steps=30, ) agent.run("File the quarterly report and ping #finance")
Composable primitives. No framework lock-in.
Screenshot the desktop or a browser, decide what to click, type, or navigate — no APIs needed.
Desktop guide →Wrap a function with @gantry_tool or connect GitHub, Postgres, Notion via MCP — two lines.
Workspace sandboxing, human approval gates, budget caps. Know exactly what runs before it runs.
Add guardrails →