Metadata-Version: 2.4
Name: img-to-text
Version: 0.1.0
Summary: Extract and organize screenshot text using OCR
License: MIT
License-File: LICENSE
Keywords: cli,ocr,screenshot,text-extraction
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Text Processing
Requires-Python: >=3.10
Requires-Dist: pillow>=10.0
Provides-Extra: all-ocr
Requires-Dist: easyocr>=1.7; extra == 'all-ocr'
Requires-Dist: pytesseract>=0.3.10; extra == 'all-ocr'
Provides-Extra: dev
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Provides-Extra: easyocr
Requires-Dist: easyocr>=1.7; extra == 'easyocr'
Provides-Extra: tesseract
Requires-Dist: pytesseract>=0.3.10; extra == 'tesseract'
Description-Content-Type: text/markdown

# img-to-text

Extract and organize text from screenshots with OCR.

<p align="center">
  <a href="https://github.com/bhayanak/image-to-text/actions/workflows/ci.yml"><img src="https://github.com/bhayanak/image-to-text/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
  <a href="https://codecov.io/gh/bhayanak/image-to-text"><img src="https://codecov.io/gh/bhayanak/image-to-text/graph/badge.svg" alt="Coverage"></a>
  <a href="https://codecov.io/gh/bhayanak/image-to-text"><img src="https://img.shields.io/badge/coverage-98%25-brightgreen" alt="Coverage 98%"></a>
  <a href="https://pypi.org/project/img-to-text/"><img src="https://img.shields.io/pypi/v/img-to-text" alt="PyPI"></a>
  <a href="https://pypi.org/project/img-to-text/"><img src="https://img.shields.io/pypi/pyversions/img-to-text" alt="Python"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="License: MIT"></a>
</p>

## Features

- OCR from screenshots using `pytesseract` or `easyocr`
- Automatic pre-processing for better OCR quality
- Structured extraction into labeled sections when possible
- Full text is always preserved in output
- Report output as Markdown, JSON, and plain text

## Install

```bash
pip install img-to-text
```

The base install pulls in only Pillow. The OCR engines are optional extras: `img-to-text[tesseract]`, `img-to-text[easyocr]`, or `img-to-text[all-ocr]` for both.

For local development, install in editable mode with the `dev` and `all-ocr` extras:

```bash
pip install -e ".[dev,all-ocr]"
```

## Usage

```bash
img-to-text extract screenshot.png
img-to-text extract screenshot.png --format all
img-to-text extract screenshot1.png screenshot2.png --stdout --format json
img-to-text raw screenshot.png --engine easyocr
```
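
The same commands can be driven from a script. The sketch below is a minimal, hedged example: it uses only the subcommand and flags shown above (`extract`, `--stdout`, `--format`), and it assumes that `--stdout --format json` prints a single JSON document to standard output. The helper names `build_extract_command` and `extract_as_json` are hypothetical and not part of the package.

```python
import json
import subprocess


def build_extract_command(paths, fmt="json", to_stdout=True):
    """Build the argv list for `img-to-text extract` (hypothetical helper).

    Only flags that appear in the usage examples are used.
    """
    cmd = ["img-to-text", "extract", *paths]
    if to_stdout:
        cmd.append("--stdout")
    cmd += ["--format", fmt]
    return cmd


def extract_as_json(paths):
    """Run the CLI and parse what it prints.

    Assumption: with `--stdout --format json` the command writes one JSON
    document to stdout. Raises CalledProcessError on a non-zero exit code.
    """
    result = subprocess.run(
        build_extract_command(paths),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Calling `extract_as_json(["screenshot1.png", "screenshot2.png"])` requires the `img-to-text` command to be on `PATH` and at least one OCR engine to be installed.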

### Output formats

- `md`: organized Markdown report
- `json`: structured JSON records including `full_text`
- `txt`: plain text grouped by record
- `both`: Markdown + JSON (default)
- `all`: Markdown + JSON + plain text
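
Since the `json` format keeps the complete OCR text under `full_text`, a report can be post-processed with a few lines of Python. This is a sketch under stated assumptions: the report layout is not documented here, so the code accepts either a single record or a list of records, and the file name `report.json` in the usage note is hypothetical. `load_full_texts` is an illustrative helper, not part of the package.

```python
import json
from pathlib import Path


def load_full_texts(report_path):
    """Return the `full_text` value of every record in a JSON report.

    Assumptions: the report is either one record (a JSON object) or a list
    of records, and each record stores its OCR text under `full_text`.
    Records without that key contribute an empty string.
    """
    data = json.loads(Path(report_path).read_text(encoding="utf-8"))
    records = data if isinstance(data, list) else [data]
    return [r.get("full_text", "") for r in records if isinstance(r, dict)]
```

For example, `"\n\n".join(load_full_texts("report.json"))` would concatenate the text of all records into one string.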

## License

[MIT](LICENSE)