Metadata-Version: 2.4
Name: ai-desktop-use
Version: 0.1.0
Summary: AI-powered desktop automation SDK - like Browser-Use but for your entire desktop
Project-URL: Homepage, https://github.com/anthropic-ai/desktop-use
Project-URL: Documentation, https://desktop-use.dev
Project-URL: Repository, https://github.com/anthropic-ai/desktop-use
Author-email: DolphyTech <admin@dolphytech.com>
License-Expression: MIT
Keywords: ai,automation,claude,computer-use,desktop,rpa
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Desktop Environment
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Requires-Dist: anthropic>=0.80.0
Requires-Dist: mss>=9.0.0
Requires-Dist: pillow>=10.0.0
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.3.0; extra == 'dev'
Provides-Extra: memory
Requires-Dist: httpx>=0.27.0; extra == 'memory'
Description-Content-Type: text/markdown

# 🖥️ Desktop-Use

**AI-powered desktop automation SDK** — like [Browser-Use](https://github.com/browser-use/browser-use) but for your entire desktop.

```python
from desktop_use import Agent

agent = Agent(task="Open Calculator and calculate 2+2")
result = agent.run_sync()
```

## ✨ Features

- **🤖 AI-Powered** - Uses Claude's vision to understand and interact with any desktop application
- **🖱️ Full Control** - Mouse, keyboard, screenshots - everything you need
- **🍎 macOS First** - Native performance with cliclick (Windows/Linux coming soon)
- **📐 Ultra-wide Ready** - Automatic resolution scaling for any display size
- **🧠 Memory Integration** - Optional Remembra integration for persistent context
- **⚡ Simple API** - One-liner to automate complex tasks

## 🚀 Quick Start

### Installation

```bash
# Install cliclick (macOS only)
brew install cliclick

# Install Desktop-Use
pip install ai-desktop-use
```

### Grant Permissions

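Go to **System Settings → Privacy & Security → Accessibility** and enable your terminal app. macOS also gates screen capture behind the **Screen Recording** permission in the same Privacy & Security pane, so enable it there as well if screenshots come back blank.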

### Run Your First Task

```bash
# CLI
desktop-use "Open Spotlight and search for Notes"

# Python
python -c "from desktop_use import run; run('Open Calculator')"
```

## 📖 Usage

### Python SDK

```python
from desktop_use import Agent, AgentConfig

# Simple usage
agent = Agent(task="Open FaceTime")
result = agent.run_sync()

# With configuration
config = AgentConfig(
    model="claude-sonnet-4-20250514",
    max_steps=20,
    verbose=True
)
agent = Agent(task="Create a new folder called 'Test' on Desktop", config=config)
result = agent.run_sync()

if result.success:
    print(f"Done: {result.final_message}")
else:
    print(f"Failed: {result.error}")
```

### Low-Level Control

```python
from desktop_use import Desktop

desktop = Desktop()

# Direct control
desktop.click(100, 200)
desktop.type("Hello World")
desktop.press("return")
desktop.hotkey("cmd", "space")  # Open Spotlight

# Screenshot
img = desktop.screenshot()
img.save("screen.png")
```

### CLI

```bash
# Basic usage
desktop-use "Open Safari and go to google.com"

# With options
desktop-use --max-steps 25 "Fill out the form on the current page"
desktop-use --model claude-opus-4-20250514 "Complex multi-step task"
desktop-use --quiet "Background task"
```

## 🖥️ Platform Support

| Platform | Status | Mouse/Keyboard | Screenshots |
|----------|--------|----------------|-------------|
| macOS | ✅ Ready | cliclick | native/mss |
| Windows | 🔜 Soon | pyautogui | mss |
| Linux | 🔜 Soon | pyautogui | mss |

## 🔧 Requirements

- Python 3.10+
- macOS 12+ (for now)
- [cliclick](https://github.com/BlueM/cliclick) (`brew install cliclick`)
- Anthropic API key (the `anthropic` SDK reads it from the `ANTHROPIC_API_KEY` environment variable by default)

## 📐 How It Works

1. **Capture** - Takes a screenshot of your desktop
2. **Scale** - Resizes the screenshot so its longest edge is at most 1568px, to fit Claude's vision constraints
3. **Analyze** - Claude sees the screenshot and decides what to do
4. **Execute** - Mouse/keyboard commands are executed via cliclick
5. **Repeat** - Loop until task is complete or max steps reached
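
The steps above can be sketched as a plain loop. This is a minimal illustration, not Desktop-Use's actual implementation: `Action`, `LoopResult`, `run_loop`, and the three injected callables are hypothetical names, and the default of 20 steps is an assumption borrowed from the configuration example earlier in this README.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Action:
    """Hypothetical action record; kind == "done" ends the loop."""
    kind: str
    payload: dict


@dataclass
class LoopResult:
    """Hypothetical result type for this sketch only."""
    success: bool
    steps: int
    error: Optional[str] = None


def run_loop(
    capture: Callable[[], bytes],
    decide: Callable[[bytes, str], Action],
    execute: Callable[[Action], None],
    task: str,
    max_steps: int = 20,  # assumed default, not confirmed by the library
) -> LoopResult:
    for step in range(1, max_steps + 1):
        screenshot = capture()             # steps 1-2: capture (and scale)
        action = decide(screenshot, task)  # step 3: the model picks an action
        if action.kind == "done":
            return LoopResult(success=True, steps=step)
        execute(action)                    # step 4: perform the click or key press
    # step 5: the step budget ran out before the model reported completion
    return LoopResult(success=False, steps=max_steps, error="max steps reached")
```

Passing the capture, decide, and execute stages in as callables keeps the sketch runnable without a screen or an API key, and makes the stopping conditions (task done, or step budget exhausted) easy to see.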

The coordinate scaling is automatic - Claude works in scaled coordinates, and Desktop-Use maps them back to your actual resolution.
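
To make the mapping concrete, here is a minimal sketch of one way such scaling could work, assuming a single uniform scale factor derived from the 1568px longest-edge limit. `Scaler` is a hypothetical helper written for this README, not part of the Desktop-Use API, and the library's real rounding and clamping behavior may differ.

```python
from dataclasses import dataclass

MAX_EDGE = 1568  # longest-edge limit from the "Scale" step


@dataclass(frozen=True)
class Scaler:
    """Hypothetical helper; not part of the Desktop-Use API."""
    real_width: int
    real_height: int
    max_edge: int = MAX_EDGE

    @property
    def factor(self) -> float:
        # Never upscale: screens already within the limit keep factor 1.0.
        return min(1.0, self.max_edge / max(self.real_width, self.real_height))

    @property
    def scaled_size(self) -> tuple[int, int]:
        # Size of the screenshot the model sees.
        return (
            round(self.real_width * self.factor),
            round(self.real_height * self.factor),
        )

    def to_real(self, x: int, y: int) -> tuple[int, int]:
        # Map a point from scaled coordinates back to real screen pixels,
        # clamped so it always lands on the screen.
        rx = min(self.real_width - 1, max(0, round(x / self.factor)))
        ry = min(self.real_height - 1, max(0, round(y / self.factor)))
        return rx, ry
```

For example, `Scaler(3136, 1344)` has a factor of 0.5, so the model sees a 1568x672 image and a click it requests at `(784, 336)` maps back to `(1568, 672)` on the real display.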

## 🧠 Memory Integration (Optional)

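Connect to [Remembra](https://remembra.dev) for persistent memory. The HTTP client this needs ships in the optional `memory` extra, so install it first with `pip install "ai-desktop-use[memory]"`: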

```python
config = AgentConfig(
    memory_url="http://localhost:8787",
    memory_project_id="my-automation"
)
agent = Agent(task="Do what I asked yesterday", config=config)
```

## 🛣️ Roadmap

- [x] macOS support with cliclick
- [x] Ultra-wide display support
- [x] CLI interface
- [ ] Windows support (pyautogui)
- [ ] Linux support (pyautogui/xdotool)
- [ ] MCP server for Claude Desktop
- [ ] Async operations
- [ ] Action recording/playback
- [ ] Visual debugging mode

## 🤝 Contributing

Contributions welcome! See [CONTRIBUTING.md](CONTRIBUTING.md).

## 📄 License

MIT License - see [LICENSE](LICENSE).

---

Built with ❤️ by [DolphyTech](https://dolphytech.com)
