Metadata-Version: 2.4
Name: pyvisionauto
Version: 0.1.3
Summary: PyVisionAuto: Cross-platform desktop automation toolkit with visual image matching, mouse/keyboard control, and screen recording
Author: PyVisionAuto contributors
License-Expression: LicenseRef-Proprietary
Project-URL: Homepage, https://pypi.org/project/pyvisionauto/
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: Microsoft :: Windows
Classifier: Environment :: X11 Applications
Classifier: Topic :: Software Development :: Testing
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: license.dat
Requires-Dist: opencv-python>=4.8.0
Requires-Dist: mss>=9.0.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: pyautogui>=0.9.54
Requires-Dist: pillow>=10.0.0
Provides-Extra: test
Requires-Dist: pytest>=8.0; extra == "test"
Dynamic: license-file

# PyVisionAuto

[![PyPI version](https://img.shields.io/pypi/v/pyvisionauto.svg)](https://pypi.org/project/pyvisionauto/)
![Python](https://img.shields.io/pypi/pyversions/pyvisionauto)
![Platform](https://img.shields.io/badge/platform-Linux%20%7C%20Windows-blue)

**PyVisionAuto** is a desktop automation toolkit built around visual image matching.
It also provides mouse automation, keyboard automation, and screen recording.

## Scope

- Linux (X11 session) and Windows
- Real physical display required
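
A script can check these constraints before it starts automating. The sketch below is a heuristic, not part of PyVisionAuto: the function name `unsupported_reason` is hypothetical, and it assumes the common Linux convention that `XDG_SESSION_TYPE` and `DISPLAY` describe the current session.

```python
from __future__ import annotations

import os
import sys
from collections.abc import Mapping


def unsupported_reason(
    env: Mapping[str, str] = os.environ,
    platform: str = sys.platform,
) -> str | None:
    """Return a short reason if the session looks unsupported, else None.

    Hypothetical helper: it only inspects environment variables and
    does not touch the display itself.
    """
    if platform == "win32":
        return None
    if not platform.startswith("linux"):
        return f"unsupported platform: {platform}"
    # A Wayland session is not an X11 session, even if XWayland sets DISPLAY.
    if env.get("XDG_SESSION_TYPE") == "wayland":
        return "Wayland session detected; an X11 session is required"
    # Without DISPLAY there is no X server to capture from (likely headless).
    if not env.get("DISPLAY"):
        return "DISPLAY is not set; a real display is required"
    return None
```

Calling `unsupported_reason()` with no arguments inspects the current process environment; passing a plain dict makes the check easy to exercise in tests.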

## Install

```bash
pip install pyvisionauto
```

## System dependencies

### Linux

- **python3-tk** — Required for border overlay highlight
- **xdotool** — Preferred for window activation
- **wmctrl** — Fallback for window activation
- **ffmpeg** — Required for screen recording; install via `sudo apt install ffmpeg`
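
To see which of these tools are present before running a script, a small check with the standard library's `shutil.which` may help. This is a sketch only: the helper names are hypothetical, and the xdotool-then-wmctrl order simply mirrors the "preferred" and "fallback" wording above rather than PyVisionAuto's actual lookup logic.

```python
from __future__ import annotations

import shutil
from collections.abc import Callable

# Command-line tools named in the Linux dependency list above.
LINUX_TOOLS = ("xdotool", "wmctrl", "ffmpeg")


def missing_tools(which: Callable[[str], str | None] = shutil.which) -> list[str]:
    """Return the listed tools that cannot be found on PATH."""
    return [tool for tool in LINUX_TOOLS if which(tool) is None]


def pick_window_tool(which: Callable[[str], str | None] = shutil.which) -> str | None:
    """Return 'xdotool' if available, else 'wmctrl', else None."""
    for tool in ("xdotool", "wmctrl"):
        if which(tool) is not None:
            return tool
    return None
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use both functions can be called without arguments.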

### Windows

- **tkinter** — Bundled with most Python installations
- **ffmpeg** — Required for screen recording; download it from [ffmpeg.org](https://ffmpeg.org/download.html), extract the archive, and add its `bin` folder to the system `PATH`

### Verify ffmpeg installation

```bash
# Check if ffmpeg is installed and accessible
ffmpeg -version
```

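> **Note:** Screen recording (via the `Recorder` API) requires ffmpeg. On Linux it uses the `x11grab` input device; on Windows it uses the `gdigrab` input device. These are capture devices rather than codecs, and they are included in typical ffmpeg builds for their respective platforms.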
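
If recording fails even though `ffmpeg -version` works, it can be worth confirming that the build actually lists the capture device. The sketch below parses the output of `ffmpeg -devices`; the helper names are hypothetical, and the row format shown in the comment is an assumption based on common ffmpeg builds.

```python
from __future__ import annotations

import shutil
import subprocess


def device_listed(devices_output: str, name: str) -> bool:
    """Return True if `name` appears as a device row in `ffmpeg -devices` output."""
    for line in devices_output.splitlines():
        parts = line.split()
        # Assumed row format: " D  x11grab   X11 screen capture, using XCB"
        if len(parts) >= 2 and parts[1] == name:
            return True
    return False


def ffmpeg_has_device(name: str) -> bool:
    """Return True if ffmpeg is on PATH and reports the given device."""
    exe = shutil.which("ffmpeg")
    if exe is None:
        return False
    result = subprocess.run(
        [exe, "-hide_banner", "-devices"],
        capture_output=True,
        text=True,
        check=False,
    )
    return device_listed(result.stdout, name)
```

For example, `ffmpeg_has_device("x11grab")` on Linux or `ffmpeg_has_device("gdigrab")` on Windows gives a quick yes/no answer.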

## Quick start

### Basic usage: Find and click

```python
from pyvisionauto import Screen

screen = Screen()
# Wait for image to appear on screen, highlight it, then click
screen.wait("login_button.png", timeout=10).highlight().click()
```

### Advanced example: Record automation with screen capture

This example demonstrates screen recording combined with visual automation:

```python
from pyvisionauto import Screen, Recorder
from pathlib import Path

# Initialize screen automation and recorder
screen = Screen()
recorder = Recorder()

# Ensure ffmpeg is installed for recording:
# - Linux: sudo apt install ffmpeg
# - Windows: Download from ffmpeg.org and add to PATH

# Start recording the entire screen to a video file
# The output is written to the path given below, relative to the current directory
video_path = Path("automation_demo.mp4")
recorder.start_recording(output_path=video_path)

try:
    # Activate target application window
    screen.activate_window("Calculator")
    
    # Wait for "1" button to appear (max 10 seconds) and highlight it
    screen.wait("button_1.png", timeout=10).highlight().click()
    
    # Click "+" button, waiting up to 5 seconds for it to appear
    screen.click("button_plus.png", timeout=5)
    
    # Type the second operand using keyboard automation (1 + 6 = 7, matching result_7.png below)
    screen.type_text("6")
    
    # Wait for and click "=" button
    screen.wait("button_equals.png", timeout=5).highlight().click()
    
    # Check that result appears (with 3 second timeout)
    screen.wait("result_7.png", timeout=3).highlight()
    
finally:
    # Stop recording (important: cleanup resources)
    recorder.stop_recording()
    print(f"✓ Automation recorded to {video_path}")
```
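
The `try`/`finally` pattern above can also be wrapped in a context manager so the recording is stopped even when a step raises. This is a sketch, not part of PyVisionAuto: `recording` is a hypothetical helper, and it relies only on the `start_recording(output_path=...)` and `stop_recording()` calls shown in the example above.

```python
from contextlib import contextmanager


@contextmanager
def recording(recorder, output_path):
    """Start a recording on entry and always stop it on exit.

    `recorder` is any object exposing start_recording(output_path=...)
    and stop_recording(), as the Recorder in the example above does.
    """
    recorder.start_recording(output_path=output_path)
    try:
        yield recorder
    finally:
        # Runs on normal exit and when the body raises.
        recorder.stop_recording()


# Usage sketch (requires pyvisionauto and ffmpeg to be installed):
#
#     from pyvisionauto import Screen, Recorder
#
#     screen = Screen()
#     with recording(Recorder(), "automation_demo.mp4"):
#         screen.wait("login_button.png", timeout=10).highlight().click()
```

Starting the recording before entering the `try` block matches the original example: if `start_recording` itself fails, there is nothing to stop.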

### Activate a window before matching

```python
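from pyvisionauto import Screen

screen = Screen()
# Bring the target window to the foreground, then match and click inside it
screen.activate_window("Calculator")
screen.click("button.png")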
```

## Runtime screenshot

Highlighted match region during runtime:

![PyVisionAuto runtime screenshot with highlighted region](screenshot.png)

## Platform differences

| Feature | Linux | Windows |
|---|---|---|
| Screen capture & template matching | Supported | Supported |
| Mouse / keyboard automation | Supported | Supported |
| Highlight overlay | Supported | Supported |
| Window activation | xdotool / wmctrl | pyautogui (pygetwindow) |
| Screen recording | ffmpeg + x11grab | ffmpeg + gdigrab |

> Screen recording requires ffmpeg to be installed and available on the system `PATH`. Linux uses the `x11grab` input device; Windows uses `gdigrab`.
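
For orientation, the per-platform difference amounts to a different ffmpeg input device and input name. The sketch below builds a plausible full-screen capture command for each platform. It is an illustration under assumptions, not the command PyVisionAuto runs internally: `build_capture_command` is a hypothetical helper, and the frame rate and fallback display `:0` are arbitrary choices.

```python
from __future__ import annotations

import os
import sys


def build_capture_command(
    output_path: str,
    platform: str = sys.platform,
    display: str | None = None,
    framerate: int = 30,
) -> list[str]:
    """Build an ffmpeg argument list for full-screen capture (illustrative only)."""
    common = ["ffmpeg", "-y"]  # -y: overwrite the output file if it exists
    if platform.startswith("linux"):
        # x11grab reads from an X display such as ":0"
        display = display or os.environ.get("DISPLAY", ":0")
        source = ["-f", "x11grab", "-framerate", str(framerate), "-i", display]
    elif platform == "win32":
        # gdigrab uses the special input name "desktop" for the whole screen
        source = ["-f", "gdigrab", "-framerate", str(framerate), "-i", "desktop"]
    else:
        raise RuntimeError(f"no capture device known for platform {platform!r}")
    return common + source + [output_path]
```

The returned list can be passed to `subprocess.Popen` for manual experiments when diagnosing recording problems outside the library.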

## Notes

- Wayland-only and headless environments are not currently supported.
- On Windows with high-DPI scaling (display scaling above 100%), match and click coordinates may be inaccurate.
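
The high-DPI note comes down to the difference between logical and physical pixels. On Windows, 96 DPI corresponds to 100% scaling, so a display at 150% reports 144 DPI. The sketch below shows that arithmetic and the standard Win32 call for opting a process into DPI awareness. Both helper names are hypothetical, and whether enabling DPI awareness resolves the issue for PyVisionAuto is an assumption that has not been verified here.

```python
import sys

BASELINE_DPI = 96  # Windows treats 96 DPI as 100% scaling


def logical_to_physical(x: int, y: int, dpi: int) -> tuple[int, int]:
    """Scale logical coordinates to physical pixels for the given DPI."""
    scale = dpi / BASELINE_DPI
    return round(x * scale), round(y * scale)


def enable_dpi_awareness() -> bool:
    """Mark the current process as DPI-aware on Windows; no-op elsewhere."""
    if sys.platform != "win32":
        return False
    import ctypes

    # SetProcessDPIAware is a real user32 function; it returns nonzero on success.
    return bool(ctypes.windll.user32.SetProcessDPIAware())
```

At 150% scaling, `logical_to_physical(100, 200, 144)` gives `(150, 300)`, which is the kind of offset that can make a click land away from the matched region.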
