Metadata-Version: 2.4
Name: fast-paddleocr-mcp
Version: 0.2.0
Summary: Fast PaddleOCR MCP server - Extract text from images using PaddleOCR with optimized performance
Project-URL: Homepage, https://github.com/yourusername/PaddleOCR-MCP
Project-URL: Documentation, https://github.com/yourusername/PaddleOCR-MCP#readme
Project-URL: Repository, https://github.com/yourusername/PaddleOCR-MCP
Project-URL: Issues, https://github.com/yourusername/PaddleOCR-MCP/issues
Author: PaddleOCR-MCP Contributors
License: MIT
License-File: LICENSE
Keywords: image-processing,mcp,model-context-protocol,ocr,paddleocr,text-recognition
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Requires-Dist: paddleocr>=2.7.0
Requires-Dist: paddlepaddle>=2.5.0
Requires-Dist: pillow>=10.0.0
Description-Content-Type: text/markdown

# PaddleOCR-MCP

A PaddleOCR MCP (Model Context Protocol) server and CLI tool that extracts text from images and writes the results as Markdown. Optimized for fast inference with automatic GPU detection.

## Installation

### CLI Tool

This tool can be run using `uvx`:

```bash
uvx --from . paddleocr-md <image_path> [-o output.md]
```

Or install it locally:

```bash
uv pip install -e .
paddleocr-md <image_path> [-o output.md]
```

### MCP Server

The MCP (Model Context Protocol) server lets the tool integrate with MCP clients such as Cursor and Claude Desktop.

**Run MCP server with uvx:**
```bash
uvx --from . python -m paddleocr_cli.mcp_server
```
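
For reference, most MCP clients are configured with a small JSON block that names the launch command. Below is a minimal sketch in the Claude Desktop / Cursor style, mirroring the `uvx` invocation above; the server name `paddleocr` and the absolute project path are placeholders, not values defined by this package:

```json
{
  "mcpServers": {
    "paddleocr": {
      "command": "uvx",
      "args": [
        "--from", "/absolute/path/to/PaddleOCR-MCP",
        "python", "-m", "paddleocr_cli.mcp_server"
      ]
    }
  }
}
```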

See [MCP_README.md](MCP_README.md) for detailed MCP server documentation and integration instructions.

## Usage

### Basic Usage

The tool is optimized for speed by default (see [Default Optimization Settings](#default-optimization-settings) below for details):
- **Fast mode enabled** (skips textline orientation classification)
- **PP-OCRv4** (faster mobile models)
- **640px image size limit** (faster processing)
- **Auto GPU detection** (uses the GPU if available, falls back to CPU)

```bash
# Output is saved as <image_path>.md (e.g. image.png -> image.png.md)
# Uses: fast mode + PP-OCRv4 + 640px + auto GPU detection
uvx --from . paddleocr-md image.png

# Specify custom output path
uvx --from . paddleocr-md image.png -o result.md

# Force CPU mode
uvx --from . paddleocr-md image.png --cpu

# Disable fast mode for better accuracy on rotated text
uvx --from . paddleocr-md image.png --no-fast

# Use PP-OCRv5 for better accuracy (slower)
uvx --from . paddleocr-md image.png --ocr-version PP-OCRv5
```

### Default Optimization Settings

The tool is optimized for speed by default with these settings:

- ✅ **Fast mode enabled**: Disables textline orientation classification (skips one model)
- ✅ **PP-OCRv4**: Uses faster mobile models (PP-OCRv4_mobile_det, PP-OCRv4_mobile_rec)
- ✅ **640px image size limit**: Faster processing (vs default 960px)
- ✅ **Auto GPU detection**: Automatically uses GPU if available, falls back to CPU
- ✅ **Document preprocessing disabled**: Skips unnecessary preprocessing steps
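
As an illustration, these defaults roughly correspond to constructing the pipeline like this. This is a minimal sketch, assuming the PaddleOCR 3.x keyword names (`device`, `use_textline_orientation`, `text_det_limit_side_len`, ...); the package's actual initialization code may differ:

```python
import paddle
from paddleocr import PaddleOCR

# Auto GPU detection: use the GPU when the installed PaddlePaddle
# build has CUDA support, otherwise fall back to the CPU.
device = "gpu" if paddle.device.is_compiled_with_cuda() else "cpu"

ocr = PaddleOCR(
    ocr_version="PP-OCRv4",              # faster mobile det/rec models
    device=device,
    use_textline_orientation=False,      # fast mode: skip the orientation model
    use_doc_orientation_classify=False,  # document preprocessing disabled
    use_doc_unwarping=False,
    text_det_limit_side_len=640,         # 640px image size limit (library default: 960)
    text_det_limit_type="max",           # the limit applies to the longest side
    # enable_hpi=True,                   # assumed 3.x flag for --hpi (needs HPI deps)
)

result = ocr.predict("image.png")
```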

#### Customization Options

1. **`--no-fast`**: Disable fast mode for better accuracy
   - Enables textline orientation classification
   - Better accuracy on rotated text, but slower

2. **`--cpu`**: Force CPU mode
   - Overrides auto GPU detection and always runs on the CPU

3. **`--gpu`**: Force GPU mode
   - Fails if no GPU is available
   - Use this when you want to guarantee GPU usage

4. **`--ocr-version PP-OCRv5`**: Use the higher-accuracy version
   - PP-OCRv5 is more accurate but slower than the default PP-OCRv4
   - Uses server models

5. **`--max-size <pixels>`**: Adjust the image processing size
   - Default: 640px
   - Larger values (e.g., 960, 1280) = better accuracy, slower
   - Smaller values (e.g., 480) = faster, may reduce accuracy
   - See the resizing sketch after this list

6. **`--hpi`**: High-Performance Inference
   - Automatically selects best inference backend (Paddle Inference, OpenVINO, ONNX Runtime, TensorRT)
   - Requires HPI dependencies: `paddleocr install_hpi_deps cpu/gpu`
   - Best performance but requires additional setup
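
As a concrete illustration of the `--max-size` trade-off, capping the longest side can be done with Pillow before OCR. A minimal sketch; the helper `cap_image_size` is illustrative and not part of this package's API:

```python
from PIL import Image

def cap_image_size(path: str, max_size: int = 640) -> Image.Image:
    """Downscale an image so its longest side is at most max_size pixels."""
    img = Image.open(path)
    # thumbnail() resizes in place, preserves the aspect ratio,
    # and never upscales images that are already small enough.
    img.thumbnail((max_size, max_size), Image.Resampling.LANCZOS)
    return img

fast = cap_image_size("photo.jpg")                      # default 640px limit
accurate = cap_image_size("photo.jpg", max_size=1280)   # slower, keeps more detail
```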

### Examples

```bash
# Basic usage (uses all optimizations by default: fast + PP-OCRv4 + 640px + auto GPU)
uvx --from . paddleocr-md photo.jpg

# Process with custom output
uvx --from . paddleocr-md document.png -o extracted_text.md

# Better accuracy (slower) - disable fast mode and use PP-OCRv5
uvx --from . paddleocr-md image.png --no-fast --ocr-version PP-OCRv5 --max-size 960

# Force CPU mode
uvx --from . paddleocr-md image.png --cpu

# Use High-Performance Inference (requires HPI dependencies)
uvx --from . paddleocr-md image.png --hpi
```

## Output Format

The tool generates a Markdown file containing:
- The source image path
- The detected text, one bullet per line

Example output (`test_image.png.md`):
```markdown
# OCR Result

**Source Image:** `test_image.png`

---

- HelloPaddleOcR
- 10000C
```
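
The format is simple enough to reproduce in a few lines. A hypothetical sketch (the helper `write_markdown` is illustrative, not this package's actual function):

```python
from pathlib import Path
from typing import List, Optional

def write_markdown(image_path: str, lines: List[str],
                   output_path: Optional[str] = None) -> Path:
    """Write detected text lines in the output format shown above."""
    # Default output name: append .md to the image path (image.png -> image.png.md).
    out = Path(output_path) if output_path else Path(image_path + ".md")
    bullets = "\n".join(f"- {line}" for line in lines)
    out.write_text(
        f"# OCR Result\n\n**Source Image:** `{image_path}`\n\n---\n\n{bullets}\n",
        encoding="utf-8",
    )
    return out

write_markdown("test_image.png", ["HelloPaddleOcR", "10000C"])
```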

## Requirements

- Python >= 3.8
- PaddleOCR >= 2.7.0
- PaddlePaddle >= 2.5.0
- Pillow >= 10.0.0

## License

MIT
