Metadata-Version: 2.4
Name: anyocr
Version: 0.2.0
Summary: A lightweight, unified OCR toolkit with a one-liner API. Supports Surya, EasyOCR, PaddleOCR, Tesseract, and Vision LLMs through a single interface.
Project-URL: Homepage, https://github.com/vietanhdev/anyocr
Project-URL: Documentation, https://github.com/vietanhdev/anyocr#readme
Project-URL: Repository, https://github.com/vietanhdev/anyocr
Project-URL: Issues, https://github.com/vietanhdev/anyocr/issues
Project-URL: Author, https://www.nrl.ai
Author-email: Viet-Anh Nguyen <vietanh.dev@gmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: easyocr,ocr,paddleocr,surya,tesseract,text-recognition,vlm
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Requires-Dist: click>=8.0
Requires-Dist: numpy>=1.21.0
Requires-Dist: pillow>=9.0.0
Provides-Extra: all
Requires-Dist: anyllm>=0.1.0; extra == 'all'
Requires-Dist: easyocr>=1.6.0; extra == 'all'
Requires-Dist: paddleocr>=2.6.0; extra == 'all'
Requires-Dist: paddlepaddle>=2.4.0; extra == 'all'
Requires-Dist: pytesseract>=0.3.10; extra == 'all'
Requires-Dist: surya-ocr>=0.6.0; extra == 'all'
Provides-Extra: benchmark
Requires-Dist: numpy>=1.21.0; extra == 'benchmark'
Provides-Extra: dev
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Provides-Extra: easyocr
Requires-Dist: easyocr>=1.6.0; extra == 'easyocr'
Provides-Extra: llm
Requires-Dist: anyllm>=0.1.0; extra == 'llm'
Provides-Extra: paddle
Requires-Dist: paddleocr>=2.6.0; extra == 'paddle'
Requires-Dist: paddlepaddle>=2.4.0; extra == 'paddle'
Provides-Extra: pdf
Requires-Dist: pdf2image>=1.16.0; extra == 'pdf'
Provides-Extra: progress
Requires-Dist: tqdm>=4.60.0; extra == 'progress'
Provides-Extra: surya
Requires-Dist: surya-ocr>=0.6.0; extra == 'surya'
Provides-Extra: tesseract
Requires-Dist: pytesseract>=0.3.10; extra == 'tesseract'
Provides-Extra: vlm
Requires-Dist: anyllm>=0.1.0; extra == 'vlm'
Description-Content-Type: text/markdown

# anyocr

<p align="center"><img src="logo.svg" alt="anyocr logo" width="120"></p>

![PyPI](https://img.shields.io/pypi/v/anyocr)
![Python](https://img.shields.io/pypi/pyversions/anyocr)
![License](https://img.shields.io/pypi/l/anyocr)

A lightweight, unified OCR toolkit with a one-liner API. Supports multiple backends (Surya, EasyOCR, PaddleOCR, Tesseract, and Vision LLMs) through a single interface.

**Runs completely offline after the first model download.** OCR models are cached locally, so later runs need no internet connection (URL inputs still require network access to fetch the image).

**Author:** [Viet-Anh Nguyen](https://www.nrl.ai) | [GitHub](https://github.com/vietanhdev)

---

## Features

- **One-liner API** -- `anyocr.read("image.png")` and you're done
- **Multiple backends** -- Surya, EasyOCR, PaddleOCR, Tesseract, and Vision LLMs, all through one interface
- **Auto-detection** -- automatically selects the best available backend
- **Flexible input** -- accepts file paths, numpy arrays, PIL Images, bytes, or URLs
- **Batch processing** -- process multiple images in one call with `read_batch`
- **Preprocessing** -- built-in image enhancement for better accuracy
- **Type-safe** -- full type hints and dataclass results

## Installation

Install the core package (no OCR backend):

```bash
pip install anyocr
```

Then install your preferred backend(s):

```bash
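# Quotes keep shells such as zsh from expanding the square brackets

# EasyOCR -- recommended for general use
pip install "anyocr[easyocr]"

# PaddleOCR -- best for CJK languages
pip install "anyocr[paddle]"

# Tesseract -- widest language support (requires the tesseract system package)
pip install "anyocr[tesseract]"

# Surya
pip install "anyocr[surya]"

# Vision LLMs (via anyllm)
pip install "anyocr[vlm]"

# Install all backends
pip install "anyocr[all]"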
```

### Tesseract system dependency

If using the Tesseract backend, you also need the system binary:

```bash
# Ubuntu/Debian
sudo apt install tesseract-ocr

# macOS
brew install tesseract

# Windows -- download from https://github.com/UB-Mannheim/tesseract/wiki
```

## Quick Start

```python
import anyocr

# Read text from an image (auto-selects best available backend)
result = anyocr.read("document.png")
print(result.text)

# Use a specific backend
result = anyocr.read("document.png", backend="easyocr")

# Multi-language support
result = anyocr.read("document.png", lang=["en", "vi"])

# Access detailed results
for line in result.lines:
    print(f"Text: {line.text}")
    print(f"  BBox: {line.bbox}")
    print(f"  Confidence: {line.confidence:.2f}")

# Batch processing
results = anyocr.read_batch(["page1.png", "page2.png", "page3.png"])

# Check available backends
print(anyocr.available_backends())  # e.g. ['easyocr', 'tesseract']

# Set default backend
anyocr.set_default_backend("easyocr")
```

### Flexible image input

```python
import anyocr
from PIL import Image
import numpy as np

# File path
result = anyocr.read("photo.jpg")

# PIL Image
img = Image.open("photo.jpg")
result = anyocr.read(img)

# Numpy array
arr = np.array(img)
result = anyocr.read(arr)

# Bytes
with open("photo.jpg", "rb") as f:
    result = anyocr.read(f.read())

# URL
result = anyocr.read("https://example.com/image.png")
```
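
The inputs above differ only in how the image has to be loaded. As a rough illustration of that dispatch, here is a minimal sketch that labels an input by kind. `classify_input` is a hypothetical helper written for this README, not part of the anyocr API, and anyocr's own loader may decide differently.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_input(image) -> str:
    """Return a label describing how an input would need to be loaded."""
    if isinstance(image, (bytes, bytearray)):
        return "bytes"
    if isinstance(image, Path):
        return "path"
    if isinstance(image, str):
        # Only http(s) strings are treated as URLs; everything else is a path.
        scheme = urlparse(image).scheme.lower()
        return "url" if scheme in ("http", "https") else "path"
    # Duck typing avoids importing Pillow or numpy just for a type check.
    # PIL images are checked first because they also expose array data.
    if hasattr(image, "convert") and hasattr(image, "size"):
        return "pil"
    if hasattr(image, "__array_interface__"):
        return "array"
    raise TypeError(f"Unsupported image input: {type(image).__name__}")


print(classify_input("https://example.com/image.png"))  # url
print(classify_input("photo.jpg"))                      # path
```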

### Preprocessing

```python
import anyocr

# Enable preprocessing for scanned documents
result = anyocr.read(
    "scan.png",
    do_preprocessing=True,
    do_deskew=True,
    do_enhance_contrast=True,
    contrast_factor=1.8,
)
```
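
To give a feel for what a value like `contrast_factor=1.8` means, the sketch below applies the usual definition of a contrast factor to a row of grayscale pixels. It assumes the factor behaves like the one in Pillow's `ImageEnhance.Contrast` (1.0 leaves the image unchanged, larger values increase contrast). anyocr's actual preprocessing pipeline is not documented here and may differ. `enhance_contrast` is a hypothetical name.

```python
from typing import List


def enhance_contrast(pixels: List[int], factor: float) -> List[int]:
    """Scale each grayscale value's distance from the image mean by `factor`."""
    if not pixels:
        return []
    mean = sum(pixels) / len(pixels)
    # factor > 1 pushes values away from the mean; factor == 1 is a no-op.
    stretched = (mean + factor * (p - mean) for p in pixels)
    # Clamp to the valid 8-bit range.
    return [max(0, min(255, round(v))) for v in stretched]


row = [100, 125, 150]
print(enhance_contrast(row, 1.8))  # [80, 125, 170]
```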

## Backend Comparison

| Feature | EasyOCR | PaddleOCR | Tesseract |
|---|---|---|---|
| **Speed** | Medium | Fast | Fast |
| **Accuracy** | High | Very High | Good |
| **Languages** | 80+ | 80+ | 100+ |
| **GPU support** | Yes | Yes | No |
| **CJK quality** | Good | Excellent | Fair |
| **Setup** | pip only | pip only | System binary required |
| **Model size** | Large | Medium | Small |
| **Best for** | General use | CJK / documents | Legacy systems |

## Supported Languages

All backends support English. For other languages:

- **EasyOCR:** 80+ languages including Vietnamese, Chinese, Japanese, Korean, Thai, Arabic, Hindi, and more. [Full list](https://www.jaided.ai/easyocr/)
- **PaddleOCR:** 80+ languages with excellent CJK support. [Full list](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_en/multi_languages_en.md)
- **Tesseract:** 100+ languages. Install additional language packs via your system package manager. [Full list](https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html)
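
Language codes are not uniform across engines: Tesseract names its language data with three-letter codes such as `eng` and `vie`, and combines several languages with `+`. The sketch below shows one way to translate the two-letter codes used in this README. anyocr may already do this translation internally, so it matters mainly if you call Tesseract directly. `TESSERACT_CODES` and `to_tesseract_lang` are hypothetical names, and the table covers only a few languages.

```python
from typing import Dict, List

# Hypothetical mapping from two-letter codes to Tesseract language data names.
# Extend it for the languages you need.
TESSERACT_CODES: Dict[str, str] = {
    "en": "eng",
    "vi": "vie",
    "ja": "jpn",
    "ko": "kor",
}


def to_tesseract_lang(codes: List[str]) -> str:
    """Join language codes into Tesseract's 'eng+vie' style string."""
    unknown = [c for c in codes if c not in TESSERACT_CODES]
    if unknown:
        raise ValueError(f"No Tesseract mapping for: {', '.join(unknown)}")
    return "+".join(TESSERACT_CODES[c] for c in codes)


print(to_tesseract_lang(["en", "vi"]))  # eng+vie
```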

## API Reference

### `anyocr.read(image, backend=None, lang=None, do_preprocessing=False, **kwargs)`

Read text from a single image.

- **image** -- file path, numpy array, PIL Image, bytes, or URL
- **backend** -- `"easyocr"`, `"paddleocr"`, `"tesseract"`, or `None` (auto-select)
- **lang** -- list of language codes, e.g. `["en", "vi"]`
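- **do_preprocessing** -- enable the image preprocessing pipeline
- **`**kwargs`** -- extra options such as `do_deskew`, `do_enhance_contrast`, and `contrast_factor` (see Preprocessing)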
- **Returns:** `OCRResult`

### `anyocr.read_batch(images, **kwargs)`

Read text from multiple images. Accepts the same options as `read()`.
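
A common use for batch reading is a folder of scans. The sketch below only builds the list of paths; `collect_image_paths` and the suffix set are hypothetical and not part of anyocr.

```python
from pathlib import Path
from typing import List

# File extensions treated as images in this sketch (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def collect_image_paths(folder: str) -> List[str]:
    """Return the image files directly inside `folder`, sorted by path."""
    root = Path(folder)
    return sorted(
        str(p)
        for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

The resulting list can then be passed on, for example `anyocr.read_batch(collect_image_paths("scans"), backend="easyocr")`.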

### `anyocr.available_backends()`

Returns a list of installed backend names.
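
For intuition, detecting installed backends can be done without importing them. The sketch below checks whether each backend's Python module is importable and then picks the first match from a preference list. The module mapping and both function names are assumptions made for illustration; anyocr's own detection and selection order may differ.

```python
from importlib.util import find_spec
from typing import Dict, List

# Backend name -> top-level module it needs (assumed for this sketch).
BACKEND_MODULES: Dict[str, str] = {
    "easyocr": "easyocr",
    "paddleocr": "paddleocr",
    "tesseract": "pytesseract",
}


def detect_backends(modules: Dict[str, str] = BACKEND_MODULES) -> List[str]:
    """List backends whose Python module can be found, without importing it."""
    # Note: for Tesseract this does not prove the system binary is installed.
    return [name for name, module in modules.items() if find_spec(module) is not None]


def pick_backend(preferred: List[str], available: List[str]) -> str:
    """Return the first preferred backend that is available."""
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError("No OCR backend is installed")


print(detect_backends())
```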

### `anyocr.set_default_backend(name)`

Set the default backend for all subsequent calls.

### `OCRResult`

- `.text` -- full recognized text (lines joined by newlines)
- `.lines` -- list of `TextLine` objects
- `.confidence` -- average confidence score
- `.backend` -- name of the backend used
- `.language` -- language codes used

### `TextLine`

- `.text` -- recognized text
- `.bbox` -- bounding box as `(x_min, y_min, x_max, y_max)`
- `.confidence` -- confidence score (0.0 to 1.0)
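
The fields above are enough to post-process results, for example to drop low-confidence lines and restore top-to-bottom reading order. The sketch below uses a stand-in `TextLine` dataclass that mirrors the documented attributes so it runs without any backend installed. `confident_text` is a hypothetical helper, not part of anyocr.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TextLine:
    """Stand-in with the attributes documented above."""
    text: str
    bbox: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    confidence: float


def confident_text(lines: List[TextLine], min_confidence: float = 0.5) -> str:
    """Join lines at or above the threshold, ordered top to bottom, then left to right."""
    kept = [line for line in lines if line.confidence >= min_confidence]
    kept.sort(key=lambda line: (line.bbox[1], line.bbox[0]))
    return "\n".join(line.text for line in kept)


lines = [
    TextLine("world", (0, 20, 50, 30), 0.9),
    TextLine("noise", (0, 40, 10, 50), 0.2),
    TextLine("hello", (0, 0, 50, 10), 0.8),
]
print(confident_text(lines))
```

Because the helper reads only `.text`, `.bbox`, and `.confidence`, it should also accept `result.lines` from `anyocr.read()`, assuming those objects expose the attributes as documented.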

## Local-First / Edge AI

This package is designed to work completely offline. After the initial model download,
no internet connection is required.

```bash
# Pre-download models for offline use
python -m anyocr download

# Download models for specific languages
python -m anyocr download --lang en,vi

# Download for a specific backend
python -m anyocr download --backend easyocr --lang en
```

```python
import anyocr

# Pre-download English models for the default backend
anyocr.download_models(lang=["en"])

# Pre-download for a specific backend and languages
anyocr.download_models(lang=["en", "vi"], backend="easyocr")
```

## Development

```bash
git clone https://github.com/vietanhdev/anyocr.git
cd anyocr
pip install -e ".[dev]"
pytest tests/ -v
```

## License

MIT License. See [LICENSE](LICENSE) for details.

---

Built by [Viet-Anh Nguyen](https://www.nrl.ai) | [NRL.ai](https://www.nrl.ai)
