Metadata-Version: 2.4
Name: nano-wait-vision
Version: 0.1.0
Summary: Vision extension for nano-wait automation
Author: Luiz Seabra De Marco
License: MIT
Project-URL: Repository, https://github.com/LuizSeabraDeMarco/NanoWaitVision
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: nano-wait
Requires-Dist: opencv-python
Requires-Dist: pytesseract
Requires-Dist: pyautogui
Requires-Dist: numpy
Dynamic: license-file

# 👁️ Nano-Wait-Vision — Visual Execution Extension

[![PyPI Version](https://img.shields.io/pypi/v/nano-wait-vision.svg)](https://pypi.org/project/nano-wait-vision/)
[![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)

**nano-wait-vision** is the official computer vision extension for **nano-wait**. It integrates visual awareness capabilities (such as OCR, icon detection, and screen states) into the adaptive waiting engine, enabling more robust and deterministic automations.

> 📦 **Critical Dependency:** This package **DEPENDS on `nano-wait`**. It does not replace `nano-wait`, but rather extends it.

---

## 🧭 Table of Contents

1.  [What is Nano-Wait-Vision?](#-what-is-nano-wait-vision)
2.  [Added Features](#-added-features)
3.  [Quick Start](#-quick-start)
    *   [Installation](#installation)
    *   [Usage Examples](#usage-examples)
4.  [Installation & Dependencies (READ THIS)](#-installation--dependencies-read-this)
    *   [Python Dependencies](#python-dependencies-via-pip)
    *   [Mandatory External Dependency (OCR)](#-mandatory-external-dependency-ocr)
5.  [Mental Model and Operation](#-mental-model--how-nano-wait-vision-works)
6.  [VisionState — Visual Operation Return](#visionstate--visual-operation-return)
7.  [Ideal Use Cases](#-ideal-use-cases)
8.  [Design Philosophy](#-design-philosophy)
9.  [Relationship with `nano-wait`](#-relationship-with-nano-wait)

---

## 🧠 What is Nano-Wait-Vision?

Nano-Wait-Vision is a **deterministic vision engine** for Python automation. It lets scripts **wait for real visual conditions** on the screen instead of relying solely on time-based waits.

It was developed to operate **in conjunction** with `nano-wait`, establishing a clear division of responsibilities:

| Component | Function |
| :--- | :--- |
| ⏱️ `nano-wait` | **When to check** (Manages the pace and adaptive waiting) |
| 👁️ `nano-wait-vision` | **What to check** (Provides visual awareness) |

---

## 🧩 Added Features

`nano-wait-vision` extends the `nano-wait` waiting engine with the following visual capabilities:

*   **👁️ OCR (Optical Character Recognition):** Reading text directly from the screen.
*   **🖼️ Icon Detection:** Locating visual elements (template matching) with high precision.
*   **🧠 Explicit Visual States:** Defining and waiting for specific graphical interface states.
*   **📚 Persistent Visual Memory:** Uses efficient computer vision techniques, without the need for heavy Machine Learning models.
*   **🖥️ Screen-Based Automation:** Ideal for RPA (Robotic Process Automation) tasks and GUI testing.

👉 All these features use the **adaptive waiting engine** of `nano-wait` as their foundation.

---

## 🚀 Quick Start

### Installation

First, install the main package (`nano-wait`), and then the vision extension:

```bash
pip install nano-wait
pip install nano-wait-vision
```

### Usage Examples

#### Simple Visual Observation

```python
from nano_wait_vision import VisionMode

vision = VisionMode()
state = vision.observe()

print(state.detected, state.text)
```

#### Wait for Text to Appear on Screen

```python
from nano_wait_vision import VisionMode

vision = VisionMode()
state = vision.wait_text("Welcome", timeout=10)

if state.detected:
    print("Text 'Welcome' detected on screen.")
```

#### Wait for Icon to Appear

```python
from nano_wait_vision import VisionMode

vision = VisionMode()
# Assumes 'ok.png' is an image file of the icon to be searched
state = vision.wait_icon("ok.png", timeout=10)

if state.detected:
    print("Icon 'ok.png' detected on screen.")
```

---

## ⚠️ Installation & Dependencies (READ THIS)

Nano-Wait-Vision is not a lightweight library, as it depends on graphical automation and OCR at the operating system level.

### Python Dependencies (via pip)

The following dependencies are automatically installed:

*   `nano-wait`
*   `opencv-python`
*   `pytesseract`
*   `pyautogui`
*   `numpy`

### 🧠 Mandatory External Dependency (OCR)

👉 **Tesseract OCR** must be installed and accessible in the operating system's PATH for the OCR functionality to work.

| Operating System | Installation Command |
| :--- | :--- |
| **macOS** | `brew install tesseract` |
| **Ubuntu / Debian** | `sudo apt install tesseract-ocr` |
| **Windows** | Download an installer via the [official Tesseract repository](https://github.com/tesseract-ocr/tesseract) and add it to PATH. |

⚠️ **Warning:** Without Tesseract installed, any OCR functionality will fail immediately.
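
Because OCR fails when the binary is missing, a script can check for it before starting an automation. The sketch below uses only the Python standard library; `tesseract_available` is a hypothetical helper name chosen for illustration and is not part of `nano-wait-vision`.

```python
import shutil


def tesseract_available(binary: str = "tesseract") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if tesseract_available():
        print("Tesseract found on PATH.")
    else:
        print("Tesseract not found; install it before using OCR features.")
```

Running this check once at startup gives a clear message instead of an error in the middle of a workflow.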

---

## 🧠 Mental Model — How `nano-wait-vision` Works

`nano-wait-vision` does not execute in isolation. It operates within the conceptual cycle of `nano-wait`:

`observe` → `reason` → `wait` → `observe`

It integrates two cooperative engines:

| 👁️ Vision Engine (What is happening?) | ⏱️ `nano-wait` (When to check?) |
| :--- | :--- |
| OCR (text) | CPU |
| Icons (template matching) | Memory |
| Explicit visual states | Smart Mode |
| | Adaptive pace |

👉 The vision engine **never** executes `time.sleep()` directly. It always delegates the pace and waiting to `nano-wait`.
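
The division of labor described above can be sketched as a small polling loop. This is an illustrative sketch, not the library's implementation: `wait_until`, `observe`, `wait`, and `clock` are hypothetical names, and the pacing function is injected so that the loop itself never sleeps.

```python
import time
from typing import Callable


def wait_until(
    observe: Callable[[], bool],
    wait: Callable[[], None],
    timeout: float,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `observe` until it returns True or `timeout` seconds elapse.

    The loop never sleeps on its own; pacing is delegated to `wait`,
    which stands in for the adaptive waiting engine.
    """
    deadline = clock() + timeout
    while True:
        if observe():            # observe + reason: is the condition met?
            return True
        if clock() >= deadline:  # give up once the timeout has passed
            return False
        wait()                   # wait: the pacing engine decides how long


if __name__ == "__main__":
    attempts = {"count": 0}

    def fake_observe() -> bool:
        # Pretend the visual condition becomes true on the third check.
        attempts["count"] += 1
        return attempts["count"] >= 3

    print(wait_until(fake_observe, wait=lambda: time.sleep(0.01), timeout=1.0))
```

Keeping the pacing behind a callable is what makes the loop testable: a fake `wait` and a fake `clock` let the logic run without any real delay.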

---

## VisionState — Visual Operation Return

All visual operations return a `VisionState` object with the observation information:

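```python
from dataclasses import dataclass
from typing import Optional

# Field layout of VisionState, written as a dataclass for illustration only.
@dataclass
class VisionState:
    name: str             # name of the observed state
    detected: bool        # whether the visual condition was met
    confidence: float     # detection confidence
    text: Optional[str]   # text involved in the observation, if any
    icon: Optional[str]   # icon involved in the observation, if any
```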

👉 **Always validate** the `detected` field in critical automations to ensure the visual condition has been met.
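
One way to enforce this is a small guard that raises when the condition was not met. The sketch below is an assumption-laden illustration: the `VisionState` class is a local stand-in that mirrors the fields listed above (it is not imported from the library), `require_detected` and `VisualConditionError` are hypothetical names, and it assumes that a higher `confidence` value means a stronger match.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VisionState:
    """Local stand-in mirroring the documented fields (not the library class)."""
    name: str
    detected: bool
    confidence: float
    text: Optional[str] = None
    icon: Optional[str] = None


class VisualConditionError(RuntimeError):
    """Hypothetical error raised when a required visual condition is not met."""


def require_detected(state: VisionState, min_confidence: float = 0.0) -> VisionState:
    """Return `state` unchanged if it was detected with enough confidence."""
    if not state.detected:
        raise VisualConditionError(f"{state.name!r} was not detected")
    if state.confidence < min_confidence:
        raise VisualConditionError(
            f"{state.name!r} detected with confidence {state.confidence:.2f}, "
            f"below the required {min_confidence:.2f}"
        )
    return state


if __name__ == "__main__":
    ok = VisionState(name="login_screen", detected=True, confidence=0.93, text="Welcome")
    print(require_detected(ok, min_confidence=0.8).text)
```

In a real script, the value passed to `require_detected` would be the result of a call such as `vision.wait_text(...)`, so the automation stops at the first unmet condition instead of continuing on a wrong screen.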

---

## 🧪 Ideal Use Cases

Use `nano-wait-vision` if your project involves:

*   **RPA (Robotic Process Automation)**
*   **GUI (Graphical User Interface) Automation**
*   **Visual Testing** (ensuring elements appear correctly)
*   **OCR-Driven Workflows**
*   **Systems without an API** (where the only interface is the screen)
*   **Screen-Based Alternatives to Selenium**

If you only need smart waiting based on time and system resources, use `nano-wait` alone.

---

## 🧩 Design Philosophy

The project adheres to the following principles:

*   **Deterministic:** Predictable and consistent results.
*   **No Opaque ML:** Avoids complex and hard-to-debug Machine Learning models.
*   **Reproducible:** Automations must be repeatable across different environments.
*   **Explainable:** The vision state must be easy to understand and track.
*   **Based on Real Screen State:** Operates directly on what the user sees.
*   **Integrated into System Context:** Works in harmony with the adaptive waiting engine.

---

## 📌 Relationship with `nano-wait`

| Project | Description |
| :--- | :--- |
| `nano-wait` | Main product, adaptive waiting engine. |
| `nano-wait-vision` | Official vision extension, adds visual capabilities. |

Both are published separately on PyPI but are designed to work as a unified system.
