Metadata-Version: 2.4
Name: android-device-mcp-server
Version: 0.1.3
Summary: MCP server for Android device interaction via ADB
Project-URL: Repository, https://github.com/nickaigi/MK.Android.Patrol
License-Expression: MIT
Keywords: adb,android,automation,claude,mcp,testing
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Software Development :: Quality Assurance
Classifier: Topic :: Software Development :: Testing
Classifier: Typing :: Typed
Requires-Python: >=3.12
Requires-Dist: mcp>=1.0.0
Requires-Dist: pillow>=10.0
Description-Content-Type: text/markdown

# android-device-mcp-server

An [MCP (Model Context Protocol)](https://modelcontextprotocol.io) server that gives Claude full observation and control of a connected Android device via ADB.

## Installation

```bash
# Recommended: run ephemerally with uvx (no install needed)
uvx android-device-mcp-server

# Or install with pip
pip install android-device-mcp-server
android-device-mcp-server
```

## Add to a project

In `.mcp.json` (or Claude Code's MCP config):

```json
{
  "mcpServers": {
    "android-device": {
      "command": "uvx",
      "args": ["android-device-mcp-server"]
    }
  }
}
```

## Prerequisites

- **Python 3.12+**
- `adb` on `PATH` (`brew install android-platform-tools` or from the Android SDK)
- At least one connected device or running emulator (`adb devices` to verify)
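
A quick way to verify the last two prerequisites from Python is sketched below. It is a minimal check, not part of this package: `parse_adb_devices` and `check_prerequisites` are hypothetical helper names, and the parser assumes the usual `adb devices` layout of a header line followed by `serial<TAB>state` rows.

```python
import shutil
import subprocess


def parse_adb_devices(output: str) -> list[tuple[str, str]]:
    """Parse `adb devices` output into (serial, state) pairs."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        # Skip the header, blank lines, and adb daemon notices ("* daemon ...").
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices.append((parts[0], parts[1]))
    return devices


def check_prerequisites(adb_path: str = "adb") -> list[str]:
    """Return serials of devices in the ready `device` state, or raise if adb is missing."""
    resolved = shutil.which(adb_path)
    if resolved is None:
        raise RuntimeError(f"{adb_path!r} not found on PATH")
    result = subprocess.run(
        [resolved, "devices"],
        capture_output=True,
        text=True,
        timeout=10,
        check=True,
    )
    return [serial for serial, state in parse_adb_devices(result.stdout) if state == "device"]
```

Calling `check_prerequisites()` returns an empty list when adb is installed but no device is ready, for example when a device is still `unauthorized` or `offline`.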

---

## Tools

### Observation

| Tool | Description | Returns |
|------|-------------|---------|
| `capture_screen` | Screenshot + UI hierarchy in one call | JPEG image + compact UI text |
| `get_screenshot` | Screenshot only | JPEG image |
| `get_ui_hierarchy` | UI hierarchy only | Compact element format |
| `get_device_info` | Device model, API level, screen size | Formatted text |
| `get_installed_packages` | List installed apps | Sorted package names |
| `get_logcat` | Filtered logcat output | Logcat text |

**`capture_screen` is the primary tool.** It returns both a compressed JPEG screenshot
and the UI hierarchy in a compact text format in a single call, giving Claude visual
context and structured element data together.

The compact UI format looks like:

```
0 FrameLayout @540,1200
1 Button "Pay Now" #pay_btn @540,550 [C]
2 View (Payments) @405,2300 [C]
3 ScrollView @540,1300 [S]
```

Flags: `[C]`=clickable, `[LC]`=long-clickable, `[S]`=scrollable, `[*]`=selected,
`[F]`=focused, `[v]`=checked, `[?]`=checkable, `[D]`=disabled, `[P]`=password

### Actions

| Tool | Description | Parameters |
|------|-------------|------------|
| `tap` | Tap at coordinates | `x`, `y` |
| `tap_element` | Tap element by text, content-desc, or resource-id | `text` |
| `type_text` | Type text into the focused field | `text` |
| `swipe` | Swipe gesture | `start_x`, `start_y`, `end_x`, `end_y`, `duration_ms` |
| `scroll_down` / `scroll_up` | Scroll using screen center | `amount` (pixels, default 500) |
| `press_key` | Press a device key | `key` |
| `long_press` | Long press at coordinates | `x`, `y`, `duration_ms` |

**Prefer `tap_element` over `tap`.** It finds the element by text, content-description,
or resource-id and computes the tap coordinates automatically. If no exact match is
found, it falls back to partial text matching (e.g., "Pay" matches "Pay Now").

Supported keys for `press_key`: `back`, `home`, `enter`, `tab`, `delete`, `volume_up`,
`volume_down`, `power`, `menu`, `search`, `dpad_up`, `dpad_down`, `dpad_left`,
`dpad_right`, `dpad_center`, `space`, `escape`, `wakeup`.

### App Lifecycle

| Tool | Description | Parameters |
|------|-------------|------------|
| `launch_app` | Launch app by package name | `package`, `activity` (optional) |
| `stop_app` | Force stop app | `package` |
| `clear_app_data` | Clear app data and cache | `package` |
| `install_apk` | Install APK from local path | `apk_path`, `replace_existing` |
| `uninstall_app` | Uninstall app | `package`, `keep_data` |

### Utility

| Tool | Description | Parameters |
|------|-------------|------------|
| `wait` | Wait for a duration (max 60s) | `seconds` |
| `wait_for_element` | Poll UI until element appears | `text`, `timeout_sec`, `poll_interval_sec` |
| `list_devices` | List all connected ADB devices | _(none)_ |
| `set_device` | Select target device (multi-device) | `device_id` |

---

## Configuration

All settings can be overridden via environment variables:

| Variable | Default | Description |
|----------|---------|-------------|
| `ADB_PATH` | `adb` | Path to the adb binary |
| `SCREENSHOT_TIMEOUT` | `15.0` | Screenshot capture timeout (seconds) |
| `UI_DUMP_TIMEOUT` | `10.0` | UI hierarchy dump timeout (seconds) |
| `ACTION_TIMEOUT` | `10.0` | Action command timeout (seconds) |
| `DEFAULT_SWIPE_DURATION_MS` | `300` | Default swipe duration (milliseconds) |
| `UI_SETTLE_WAIT_SEC` | `0.5` | Wait after tap/scroll before the next action (seconds) |
| `SCREENSHOT_MAX_WIDTH` | `320` | Max screenshot width (pixels) |
| `SCREENSHOT_QUALITY` | `50` | JPEG compression quality (1-100) |

Invalid values are logged and replaced with their defaults; they do not raise an error.
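
The fallback behaviour can be pictured with a small loader like the one below. It is a sketch of the pattern, not the server's implementation: `env_number` is a hypothetical helper, and the range limits passed to it are assumptions apart from the 1-100 quality range given in the table.

```python
import logging
import math
import os

logger = logging.getLogger(__name__)


def env_number(name, default, cast=float, lo=None, hi=None, environ=None):
    """Read a numeric setting from the environment, falling back to `default`."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    try:
        value = cast(raw)
    except ValueError:
        value = None
    valid = (
        value is not None
        and math.isfinite(value)  # rejects "nan" and "inf"
        and (lo is None or value >= lo)
        and (hi is None or value <= hi)
    )
    if not valid:
        # Log the problem and continue with the default instead of raising.
        logger.warning("Invalid %s=%r; using default %r", name, raw, default)
        return default
    return value


def load_settings(environ=None):
    """Collect a few of the documented settings into a dict."""
    return {
        "screenshot_timeout": env_number("SCREENSHOT_TIMEOUT", 15.0, lo=0, environ=environ),
        "screenshot_max_width": env_number("SCREENSHOT_MAX_WIDTH", 320, int, lo=1, environ=environ),
        "screenshot_quality": env_number("SCREENSHOT_QUALITY", 50, int, 1, 100, environ=environ),
    }
```

Passing `environ` explicitly keeps the loader easy to test without touching the real process environment.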

---

## How the UI hierarchy works

`adb shell uiautomator dump` produces an XML tree of every visible UI element with its
text, content-description, bounds, and interaction flags. This server parses that XML
and returns a compact text format optimized for LLM consumption. The compact format
reduces token usage by ~80% compared to raw XML while preserving the information Claude
needs to reason about and interact with the UI.
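
To make the transformation concrete, here is a minimal sketch that turns a uiautomator dump into lines resembling the compact format. It relies on the attribute names uiautomator emits (`class`, `text`, `content-desc`, `resource-id`, `bounds`, and the boolean flags). The rest is assumed: `compact_hierarchy` is a hypothetical function, the coordinates are taken to be the center of `bounds`, and the real server may filter, order, or number nodes differently.

```python
import re
import xml.etree.ElementTree as ET

# uiautomator encodes bounds as "[left,top][right,bottom]".
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")

# XML attribute -> compact flag, following the flag legend in this README.
FLAG_ATTRS = [
    ("clickable", "C"),
    ("long-clickable", "LC"),
    ("scrollable", "S"),
    ("selected", "*"),
    ("focused", "F"),
    ("checked", "v"),
    ("checkable", "?"),
    ("password", "P"),
]


def compact_hierarchy(xml_text: str) -> str:
    """Convert a uiautomator XML dump into one compact line per node."""
    lines = []
    for node in ET.fromstring(xml_text).iter("node"):
        m = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if m is None:
            continue  # skip nodes without usable bounds
        x1, y1, x2, y2 = map(int, m.groups())
        # Keep only the short class name, e.g. android.widget.Button -> Button.
        parts = [str(len(lines)), node.get("class", "").rsplit(".", 1)[-1]]
        if node.get("text"):
            parts.append(f'"{node.get("text")}"')
        if node.get("content-desc"):
            parts.append(f'({node.get("content-desc")})')
        rid = node.get("resource-id", "")
        if rid:
            # Drop the "package:id/" prefix from the resource-id.
            parts.append("#" + rid.rsplit("/", 1)[-1])
        parts.append(f"@{(x1 + x2) // 2},{(y1 + y2) // 2}")
        parts += [f"[{flag}]" for attr, flag in FLAG_ATTRS if node.get(attr) == "true"]
        if node.get("enabled") == "false":
            parts.append("[D]")
        lines.append(" ".join(parts))
    return "\n".join(lines)
```

Most of the savings in a format like this come from dropping attributes that are empty or `false` and collapsing four bounds values into one coordinate pair.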

For Compose-based apps, class names are generic (`android.view.View` instead of `Button`,
`TextField`, etc.), but text, content-description, bounds, and clickability are all
present, which is everything needed.
