Metadata-Version: 2.4
Name: widget-eval
Version: 0.1.0
Summary: Evaluation pipeline for widget generation — runs quality metrics and generates statistics reports.
Author: Houston Zhang
License: Apache-2.0
Project-URL: Homepage, https://github.com/WebAgent-Arena/widget-eval
Keywords: widget,evaluation,benchmark,ui-generation
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: widget-quality>=0.1.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: openpyxl>=3.1.0
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"

# Widget Eval

Evaluation pipeline for widget generation — runs [widget-quality](https://github.com/WebAgent-Arena/widget-quality) metrics and generates statistics reports.

## Installation

```bash
pip install widget-eval
```

## Usage

```bash
# Basic usage (CPU)
widget-eval --gt_dir /path/to/GT --pred_dir /path/to/results

# Use GPU
widget-eval --gt_dir /path/to/GT --pred_dir /path/to/results --cuda

# Skip evaluation, only generate statistics
widget-eval --gt_dir /path/to/GT --pred_dir /path/to/results --skip_eval

# Custom output directory and workers
widget-eval --gt_dir /path/to/GT --pred_dir /path/to/results --output_dir /path/to/stats --workers 8
```

## Pipeline

1. **Evaluation** — Computes 12 quality metrics (layout, legibility, style, perceptual, geometry) for each ground-truth/prediction pair
2. **Statistics** — Aggregates results into `metrics_stats.json` and `metrics.xlsx`
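
The aggregated `metrics_stats.json` can then be consumed programmatically. The snippet below is a minimal sketch: the per-metric `mean`/`std` schema (and the metric names) shown here are assumptions for illustration, not the file's documented format.

```python
import json
from pathlib import Path

# Hypothetical schema: a mapping from metric name to summary statistics.
# The actual keys written by widget-eval may differ.
sample = {
    "layout_score": {"mean": 0.82, "std": 0.07},
    "legibility_score": {"mean": 0.91, "std": 0.04},
}
stats_path = Path("metrics_stats.json")
stats_path.write_text(json.dumps(sample, indent=2))

# Load the aggregated report and print a simple summary table.
stats = json.loads(stats_path.read_text())
for metric, vals in stats.items():
    print(f"{metric:20s} mean={vals['mean']:.3f} std={vals['std']:.3f}")
```

The same dictionary can be fed into `pandas.DataFrame.from_dict(stats, orient="index")` for further analysis, since pandas is already a dependency.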

## License

Apache-2.0
