Metadata-Version: 2.4
Name: ghost-pc
Version: 0.2.13
Summary: Control your Windows PC from WhatsApp with AI vision
Project-URL: Homepage, https://github.com/nexarlabs-ai/GhostPC
Project-URL: Repository, https://github.com/nexarlabs-ai/GhostPC
Project-URL: Issues, https://github.com/nexarlabs-ai/GhostPC/issues
Author-email: Nexar Labs <hello@nexarlabs.ai>
License: MIT
License-File: LICENSE
Keywords: ai,computer-use,desktop-automation,gemini,whatsapp,windows
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Win32 (MS Windows)
Classifier: Intended Audience :: End Users/Desktop
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: Microsoft :: Windows :: Windows 10
Classifier: Operating System :: Microsoft :: Windows :: Windows 11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Communications :: Chat
Classifier: Topic :: Desktop Environment
Requires-Python: >=3.12
Requires-Dist: aiohttp>=3.9.0
Requires-Dist: aiosqlite>=0.20.0
Requires-Dist: bettercam>=1.0.0
Requires-Dist: click>=8.1.0
Requires-Dist: google-adk>=1.0.0
Requires-Dist: google-genai>=1.0.0
Requires-Dist: pillow>=10.0.0
Requires-Dist: psutil>=5.9.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pynput>=1.7.0
Requires-Dist: pystray>=0.19.0
Requires-Dist: pyturbojpeg>=1.7.0
Requires-Dist: qrcode>=7.0.0
Requires-Dist: questionary>=2.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: typing-extensions>=4.15.0
Provides-Extra: all
Requires-Dist: aiortc>=1.14.0; extra == 'all'
Requires-Dist: apscheduler>=3.11.0; extra == 'all'
Requires-Dist: chromadb>=1.0.0; extra == 'all'
Requires-Dist: cloudflared; extra == 'all'
Requires-Dist: edge-tts>=6.1.0; extra == 'all'
Requires-Dist: playwright>=1.40.0; extra == 'all'
Requires-Dist: pycaw>=20240210; extra == 'all'
Requires-Dist: pywin32>=306; extra == 'all'
Requires-Dist: pywinauto>=0.6.8; extra == 'all'
Requires-Dist: rapidocr-onnxruntime>=1.3.0; extra == 'all'
Requires-Dist: sounddevice>=0.5.5; extra == 'all'
Requires-Dist: watchdog>=6.0.0; extra == 'all'
Requires-Dist: windows-toasts>=1.1.0; extra == 'all'
Provides-Extra: browser
Requires-Dist: playwright>=1.40.0; extra == 'browser'
Provides-Extra: desktop
Requires-Dist: pycaw>=20240210; extra == 'desktop'
Requires-Dist: pywin32>=306; extra == 'desktop'
Requires-Dist: pywinauto>=0.6.8; extra == 'desktop'
Requires-Dist: windows-toasts>=1.1.0; extra == 'desktop'
Provides-Extra: memory
Requires-Dist: chromadb>=1.0.0; extra == 'memory'
Provides-Extra: notifications
Requires-Dist: apscheduler>=3.11.0; extra == 'notifications'
Requires-Dist: watchdog>=6.0.0; extra == 'notifications'
Provides-Extra: ocr
Requires-Dist: rapidocr-onnxruntime>=1.3.0; extra == 'ocr'
Provides-Extra: tunnel
Requires-Dist: cloudflared; extra == 'tunnel'
Provides-Extra: voice
Requires-Dist: edge-tts>=6.1.0; extra == 'voice'
Requires-Dist: sounddevice>=0.5.5; extra == 'voice'
Provides-Extra: wakeword
Requires-Dist: openwakeword>=0.6.0; (python_version < '3.12') and extra == 'wakeword'
Provides-Extra: webrtc
Requires-Dist: aiortc>=1.14.0; extra == 'webrtc'
Description-Content-Type: text/markdown

# GhostPC

> Control your Windows PC from WhatsApp with AI vision.

GhostPC is a local-first AI desktop agent that gives you full control of your Windows PC through WhatsApp messages. Send a text, and the AI sees your screen, moves your mouse, types on your keyboard, runs commands, launches apps, and streams your screen live to your phone.

## Features

- **AI Vision** — Gemini Computer Use sees your screen and takes actions
- **Full Desktop Control** — Mouse, keyboard, scroll, drag-and-drop
- **Terminal Commands** — Run PowerShell/cmd from your phone
- **App Launching** — "Open Chrome", "Open VS Code"
- **Live Screen Streaming** — MJPEG stream viewable in any mobile browser
- **WhatsApp Integration** — Baileys-powered, QR code pairing
- **Clipboard** — Read/write system clipboard remotely
- **File Browsing** — List, search, and read files on your PC
- **Emergency Stop** — Ctrl+Alt+Q kills all agent activity instantly
- **System Tray** — Status icon showing connection state

## Quick Start

### Prerequisites

- **Windows 10/11** (screen capture and input injection are Windows-only)
- **Python 3.12+** (uv downloads it automatically if it is missing)
- **uv** (installs dependencies and runs the agent)
- **Node.js 20+** (for the WhatsApp bridge)
- **Gemini API key** from [Google AI Studio](https://aistudio.google.com/apikey)

### Install & Run

```bash
# Clone the repo
git clone https://github.com/nexarlabs-ai/GhostPC.git
cd GhostPC

# Set your API key (Git Bash syntax; in PowerShell: $env:GEMINI_API_KEY = "your-key-here")
export GEMINI_API_KEY=your-key-here

# Run (uv handles everything)
uv run ghost
```

That's it. Scan the QR code with WhatsApp, and start texting commands.

### Quick Commands

| Command | What it does |
|---------|-------------|
| Any other message | The AI interprets it and carries out the request |
| `screenshot` / `ss` | Send current screen as image |
| `watch` / `live` | Get live screen stream URL |
| `stop` | Cancel current action |
| `reset` | Clear conversation history |
| `help` | Show command list |
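
The routing implied by the table above can be sketched roughly as follows. This is an illustration only, not GhostPC's actual dispatcher: the `ALIASES` table, the `route_message` function, and the `"agent"` route name are all hypothetical, and the sketch assumes quick commands match only when they are the whole message.

```python
# Hypothetical alias table; the canonical names mirror the Quick Commands table.
ALIASES = {
    "screenshot": "screenshot",
    "ss": "screenshot",
    "watch": "watch",
    "live": "watch",
    "stop": "stop",
    "reset": "reset",
    "help": "help",
}


def route_message(text: str) -> tuple[str, str]:
    """Return (route, payload) for an incoming WhatsApp message.

    A message that is exactly a quick command (case-insensitive, surrounding
    whitespace ignored) maps to that command. Anything else is passed to the
    AI agent unchanged apart from trimming.
    """
    normalized = text.strip().lower()
    if normalized in ALIASES:
        return ALIASES[normalized], ""
    return "agent", text.strip()
```

With this approach, `ss` and `screenshot` share one handler, while a sentence that merely starts with a command word (for example "stop the music") still reaches the AI.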

## Configuration

All settings are read from environment variables. Create a `.env` file or export them:

```env
# Required
GEMINI_API_KEY=your-gemini-api-key

# Optional
GHOST_SCREEN_FPS=10               # Live stream FPS (1-60)
GHOST_JPEG_QUALITY=70             # JPEG quality (1-100)
GHOST_CAPTURE_RESOLUTION=1280x720 # Capture resolution
GHOST_STREAM_PORT=8443            # Stream server port
GHOST_EMERGENCY_HOTKEY=ctrl+alt+q # Emergency stop hotkey
GHOST_ALLOWED_NUMBERS=+1234567890 # WhatsApp allowlist (comma-separated)
GHOST_LOG_LEVEL=INFO              # Logging level
```
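
As a rough sketch of how these variables might be read and validated, the snippet below parses a subset of them into a frozen dataclass. It is not the project's real settings code: `GhostSettings`, `load_settings`, and `_int_in_range` are hypothetical names, and treating the example values above as defaults is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class GhostSettings:
    api_key: str
    screen_fps: int
    jpeg_quality: int
    capture_resolution: tuple[int, int]
    stream_port: int
    allowed_numbers: tuple[str, ...]


def _int_in_range(env, name, default, low, high):
    """Read an integer variable and reject values outside [low, high]."""
    value = int(env.get(name, default))
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value


def load_settings(env=None) -> GhostSettings:
    """Build settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "")
    if not api_key:
        raise ValueError("GEMINI_API_KEY is required")

    # "1280x720" -> (1280, 720); raises ValueError on malformed input.
    raw_resolution = env.get("GHOST_CAPTURE_RESOLUTION", "1280x720")
    width, height = (int(part) for part in raw_resolution.lower().split("x"))

    # Comma-separated allowlist; blank entries are dropped.
    raw_numbers = env.get("GHOST_ALLOWED_NUMBERS", "")
    numbers = tuple(n.strip() for n in raw_numbers.split(",") if n.strip())

    return GhostSettings(
        api_key=api_key,
        screen_fps=_int_in_range(env, "GHOST_SCREEN_FPS", 10, 1, 60),
        jpeg_quality=_int_in_range(env, "GHOST_JPEG_QUALITY", 70, 1, 100),
        capture_resolution=(width, height),
        stream_port=_int_in_range(env, "GHOST_STREAM_PORT", 8443, 1, 65535),
        allowed_numbers=numbers,
    )
```

Failing fast on an out-of-range value at startup is usually easier to debug than a stream that silently runs at the wrong frame rate.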

## Architecture

```
Python Agent (asyncio event loop)
├── Orchestrator — wires all subsystems together
├── ADK Agent — Gemini Computer Use model + tools
│   ├── ComputerUseToolset → DesktopComputer(BaseComputer)
│   │   ├── BetterCam screen capture → PNG for model
│   │   └── Win32 SendInput (mouse/keyboard/scroll)
│   └── Custom tools (terminal, apps, clipboard, files)
├── AgentRunner — manages sessions, tool callbacks
├── BaileysBridge — spawns Node.js, JSON-RPC over stdio
├── MJPEGServer — aiohttp, streams JPEG frames + WebSocket chat
└── GhostTray — pystray system tray icon

Node.js Bridge (subprocess)
├── Baileys socket — WhatsApp connection
├── stdin reader — JSON-RPC commands from Python
└── stdout writer — JSON-RPC events to Python
```

## Security

| Layer | Protection |
|-------|-----------|
| WhatsApp Access | Phone number allowlist |
| Emergency Stop | Ctrl+Alt+Q hotkey |
| Screen Data | Only sent to Gemini API for analysis |
| Live Viewing | Token-authenticated URLs |
| Terminal | Command blocklist (no `format`, `del /s`, etc.) |
| Files | Configurable allowed directories |
| Coordinates | Clamped to screen dimensions |
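
Two of the layers above, coordinate clamping and the terminal blocklist, can be sketched in a few lines. This is an illustration under assumptions rather than GhostPC's implementation: `clamp_point`, `is_blocked`, and `BLOCKED_PREFIXES` are hypothetical, and the blocklist holds only the two entries named in the table.

```python
def clamp_point(x: int, y: int, width: int, height: int) -> tuple[int, int]:
    """Clamp a click target to valid pixel coordinates on a width x height screen."""
    return (min(max(x, 0), width - 1), min(max(y, 0), height - 1))


# Hypothetical blocklist; only the entries mentioned in the table are included.
BLOCKED_PREFIXES = ("format", "del /s")


def is_blocked(command: str) -> bool:
    """Return True if the command starts with a blocked prefix.

    Matching is case-insensitive and collapses repeated whitespace. The prefix
    must be followed by a space or the end of the command, so "formatter.exe"
    is not mistaken for "format".
    """
    normalized = " ".join(command.lower().split())
    return any(
        normalized == prefix or normalized.startswith(prefix + " ")
        for prefix in BLOCKED_PREFIXES
    )
```

Prefix matching of this kind is easy to bypass (for instance by wrapping the command in `cmd /c`), so it is best treated as a guard against accidents, with the phone number allowlist as the real access control.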

## Development

```bash
# Install dev dependencies
uv sync --group dev

# Run tests
uv run pytest tests/ -v

# Lint
uv run ruff check src/

# Format
uv run ruff format src/

# Type check
uv run pyright src/
```

## License

MIT
