Metadata-Version: 2.4
Name: nougat-ocr-cli
Version: 0.1.3
Summary: Simple CLI wrapper for Nougat OCR with GPU acceleration support
Project-URL: Homepage, https://github.com/rubenffuertes/nougat-ocr-cli
Project-URL: Repository, https://github.com/rubenffuertes/nougat-ocr-cli
Project-URL: Issues, https://github.com/rubenffuertes/nougat-ocr-cli/issues
Author-email: Ruben Fernandez-Fuertes <fernandezfuertesruben@gmail.com>
License: MIT
License-File: LICENSE
Keywords: cli,document,extraction,gpu,nougat,ocr,pdf
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing :: General
Classifier: Topic :: Utilities
Requires-Python: <3.13,>=3.11
Requires-Dist: albumentations==1.3.1
Requires-Dist: nougat-ocr>=0.1.17
Requires-Dist: pypdfium2<5.0.0,>=4.30.0
Requires-Dist: torch>=2.0.0
Requires-Dist: transformers<4.36.0,>=4.35.0
Provides-Extra: dev
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Description-Content-Type: text/markdown

# Nougat OCR CLI

Simple, batteries-included CLI wrapper for [Nougat OCR](https://github.com/facebookresearch/nougat) with GPU acceleration.

## Features

- GPU acceleration (CUDA & Apple Metal)
- Simple command-line interface
- Batch processing support
- Clean Markdown output
- Automatic model downloading
- Python API with type hints

## Installation

### From PyPI

```bash
pip install nougat-ocr-cli
```

### From GitHub

```bash
pip install git+https://github.com/rubenffuertes/nougat-ocr-cli.git
```

### From source

```bash
git clone https://github.com/rubenffuertes/nougat-ocr-cli.git
cd nougat-ocr-cli
uv pip install -e .
```

## CLI Usage

```bash
# Basic usage - outputs to current directory
nougat-ocr-cli document.pdf

# Specify output directory
nougat-ocr-cli document.pdf -o output/

# Process specific pages (zero-indexed)
nougat-ocr-cli document.pdf --pages 0-5
nougat-ocr-cli document.pdf --pages 1,3,5,7

# Use smaller model for faster processing
nougat-ocr-cli document.pdf --model 0.1.0-small

# Use full precision (FP32) for better accuracy
nougat-ocr-cli document.pdf --full-precision

# Set batch size manually
nougat-ocr-cli document.pdf --batch-size 4
```

### CLI Options

| Option | Description |
|--------|-------------|
| `input` | Input PDF file to process |
| `-o, --output` | Output directory (default: current directory) |
| `--model` | Model version (default: 0.1.0-base) |
| `--batch-size` | Batch size for processing (default: auto-detected) |
| `--full-precision` | Use FP32 instead of BF16 |
| `--no-markdown` | Disable markdown post-processing |
| `--pages` | Zero-indexed pages to process, as a range or comma-separated list (e.g., '0-5' or '1,3,5') |
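
To illustrate how a `--pages` value might map onto the zero-indexed page list accepted by the Python API, here is a minimal sketch. `parse_pages` is a hypothetical helper, not the package's own parser, and it assumes that ranges such as `0-5` include both endpoints; the CLI's actual behavior may differ.

```python
def parse_pages(spec: str) -> list[int]:
    """Turn a spec like '0-5' or '1,3,5' into a sorted list of page indices.

    Assumption: ranges are inclusive on both ends.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)
```

Under these assumptions, `parse_pages("0-5")` yields pages 0 through 5, and duplicates across overlapping parts are collapsed.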

## Python API

```python
from nougat_wrapper import NougatOCR
from pathlib import Path

# Initialize (loads model to GPU automatically)
ocr = NougatOCR()

# Extract text from PDF
result = ocr.extract_text(Path("paper.pdf"))

print(f"Extracted {result.pages} pages")
print(f"Failed pages: {result.placeholder_pages}")
print(result.text)  # Markdown output
```

### Advanced Usage

```python
from pathlib import Path

from nougat_wrapper import NougatOCR

ocr = NougatOCR(
    model_tag="0.1.0-small",  # Use smaller model
    batch_size=4,             # Process 4 pages at once
    full_precision=True,      # Use FP32 instead of BF16
)

# Only OCR pages 0, 1, 2 (zero-indexed)
pdf_path = Path("paper.pdf")
result = ocr.extract_text(pdf_path, pages=[0, 1, 2])
```
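
When converting several PDFs from Python, reusing one `NougatOCR` instance avoids reloading the model for each file. The sketch below shows one way to do that. `extract_many` is a hypothetical helper, not part of the package, and the `<stem>.md` output naming is an assumption. It takes the extraction step as a callable so that it does not depend on any API beyond what is shown above.

```python
from pathlib import Path
from typing import Callable, Iterable


def extract_many(
    extract: Callable[[Path], str],
    pdfs: Iterable[Path],
    output_dir: Path,
) -> list[Path]:
    """Write the text returned by `extract` for each PDF to `<stem>.md` in `output_dir`.

    Assumption: the `.md` file naming is illustrative, not the CLI's convention.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf in pdfs:
        target = output_dir / f"{pdf.stem}.md"
        target.write_text(extract(pdf), encoding="utf-8")
        written.append(target)
    return written
```

With the `ocr` object from the examples above, a call could look like `extract_many(lambda p: ocr.extract_text(p).text, Path("papers").glob("*.pdf"), Path("out"))`.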

## Requirements

- Python 3.11 only (3.12+ not supported due to nougat-ocr dependencies)
- GPU recommended (CUDA or Apple Metal)
- ~1.3 GB for model weights (auto-downloaded)

## License

MIT
