Metadata-Version: 2.4
Name: nuoyi
Version: 0.4.10
Summary: A simple tool to transform PDF and DOCX to Markdown using marker-pdf
Project-URL: Homepage, https://github.com/cycleuser/NuoYi
Project-URL: Documentation, https://github.com/cycleuser/NuoYi#readme
Project-URL: Repository, https://github.com/cycleuser/NuoYi.git
Project-URL: Issues, https://github.com/cycleuser/NuoYi/issues
Project-URL: Changelog, https://github.com/cycleuser/NuoYi/blob/main/CHANGELOG.md
Author-email: CycleUser <cycleuser@cycleuser.org>
License-Expression: GPL-3.0-or-later
License-File: LICENSE
Keywords: amd,converter,directml,document,docx,gpu,markdown,marker-pdf,ocr,pdf,radeon
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Environment :: X11 Applications :: Qt
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Text Processing :: Markup :: Markdown
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Requires-Dist: marker-pdf>=1.0.0
Requires-Dist: pillow>=9.0.0
Requires-Dist: pymupdf>=1.23.0
Requires-Dist: python-docx>=0.8.11
Provides-Extra: all
Requires-Dist: build>=1.0.0; extra == 'all'
Requires-Dist: docling>=1.0.0; extra == 'all'
Requires-Dist: flask>=2.0.0; extra == 'all'
Requires-Dist: magic-pdf>=1.3.0; extra == 'all'
Requires-Dist: pdfplumber>=0.10.0; extra == 'all'
Requires-Dist: pymupdf4llm>=0.0.17; extra == 'all'
Requires-Dist: pyside6>=6.5.0; extra == 'all'
Requires-Dist: pytest-cov>=4.0.0; extra == 'all'
Requires-Dist: pytest>=7.0.0; extra == 'all'
Requires-Dist: ruff>=0.1.0; extra == 'all'
Requires-Dist: twine>=4.0.0; extra == 'all'
Provides-Extra: all-cuda
Requires-Dist: build>=1.0.0; extra == 'all-cuda'
Requires-Dist: docling>=1.0.0; extra == 'all-cuda'
Requires-Dist: flask>=2.0.0; extra == 'all-cuda'
Requires-Dist: magic-pdf>=1.3.0; extra == 'all-cuda'
Requires-Dist: nvidia-cublas-cu12>=12.0.0; extra == 'all-cuda'
Requires-Dist: nvidia-cuda-nvrtc-cu12>=12.0.0; extra == 'all-cuda'
Requires-Dist: nvidia-cuda-runtime-cu12>=12.0.0; extra == 'all-cuda'
Requires-Dist: nvidia-cufft-cu12>=11.0.0; extra == 'all-cuda'
Requires-Dist: nvidia-curand-cu12>=10.0.0; extra == 'all-cuda'
Requires-Dist: nvidia-cusolver-cu12>=11.0.0; extra == 'all-cuda'
Requires-Dist: nvidia-cusparse-cu12>=12.0.0; extra == 'all-cuda'
Requires-Dist: nvidia-nvjitlink-cu12>=12.0.0; extra == 'all-cuda'
Requires-Dist: nvidia-nvtx-cu12>=12.0.0; extra == 'all-cuda'
Requires-Dist: pdfplumber>=0.10.0; extra == 'all-cuda'
Requires-Dist: pymupdf4llm>=0.0.17; extra == 'all-cuda'
Requires-Dist: pyside6>=6.5.0; extra == 'all-cuda'
Requires-Dist: pytest-cov>=4.0.0; extra == 'all-cuda'
Requires-Dist: pytest>=7.0.0; extra == 'all-cuda'
Requires-Dist: ruff>=0.1.0; extra == 'all-cuda'
Requires-Dist: twine>=4.0.0; extra == 'all-cuda'
Provides-Extra: cuda
Requires-Dist: nvidia-cublas-cu12>=12.0.0; extra == 'cuda'
Requires-Dist: nvidia-cuda-nvrtc-cu12>=12.0.0; extra == 'cuda'
Requires-Dist: nvidia-cuda-runtime-cu12>=12.0.0; extra == 'cuda'
Requires-Dist: nvidia-cufft-cu12>=11.0.0; extra == 'cuda'
Requires-Dist: nvidia-curand-cu12>=10.0.0; extra == 'cuda'
Requires-Dist: nvidia-cusolver-cu12>=11.0.0; extra == 'cuda'
Requires-Dist: nvidia-cusparse-cu12>=12.0.0; extra == 'cuda'
Requires-Dist: nvidia-nvjitlink-cu12>=12.0.0; extra == 'cuda'
Requires-Dist: nvidia-nvtx-cu12>=12.0.0; extra == 'cuda'
Provides-Extra: dev
Requires-Dist: build>=1.0.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Requires-Dist: twine>=4.0.0; extra == 'dev'
Provides-Extra: docling
Requires-Dist: docling>=1.0.0; extra == 'docling'
Provides-Extra: engines
Requires-Dist: docling>=1.0.0; extra == 'engines'
Requires-Dist: magic-pdf>=1.3.0; extra == 'engines'
Requires-Dist: pdfplumber>=0.10.0; extra == 'engines'
Requires-Dist: pymupdf4llm>=0.0.17; extra == 'engines'
Provides-Extra: gui
Requires-Dist: pyside6>=6.5.0; extra == 'gui'
Provides-Extra: mineru
Requires-Dist: magic-pdf>=1.3.0; extra == 'mineru'
Provides-Extra: pdfplumber
Requires-Dist: pdfplumber>=0.10.0; extra == 'pdfplumber'
Provides-Extra: pymupdf
Requires-Dist: pymupdf4llm>=0.0.17; extra == 'pymupdf'
Provides-Extra: web
Requires-Dist: flask>=2.0.0; extra == 'web'
Description-Content-Type: text/markdown

# NuoYi

A simple tool to transform PDF and DOCX to Markdown.

[中文文档](README_CN.md)

NuoYi uses [marker-pdf](https://github.com/VikParuchuri/marker) for high-quality PDF conversion with OCR and layout detection. All processing is done **fully offline** after the initial model download.

## Features

- **PDF to Markdown**: High-quality conversion using marker-pdf with surya OCR
- **DOCX to Markdown**: Native support for Microsoft Word documents
- **Automatic GPU/CPU Selection**: Detects available VRAM and falls back to CPU if needed
- **Batch Processing**: Convert entire directories of documents
- **GUI Interface**: PySide6-based graphical interface for easy batch conversion
- **Image Extraction**: Automatically extracts and saves images from PDFs
- **Multi-language Support**: 10 languages supported including Chinese, English, Japanese, French, Russian, German, Spanish, Portuguese, Italian, Korean

## Installation

**Requires Python 3.10 or higher** (marker-pdf requires Python >= 3.10).

### From PyPI

```bash
pip install nuoyi
```

### With GUI support

```bash
pip install nuoyi[gui]
```

### With NVIDIA CUDA support (IMPORTANT for GPU users)

If you encounter `CUBLAS_STATUS_NOT_INITIALIZED` errors when using GPU, install the CUDA libraries:

```bash
pip install nuoyi[cuda]
```

Or manually:

```bash
pip install nvidia-cublas nvidia-cuda-runtime nvidia-cufft nvidia-cusolver nvidia-cusparse nvidia-curand nvidia-cuda-nvrtc nvidia-nvtx
```

**Why is this needed?** PyTorch's CUDA packages sometimes don't include all required NVIDIA libraries. The `nvidia-*` packages ensure complete CUDA library installation for marker-pdf to work properly.

### Full installation with all features

```bash
pip install nuoyi[all-cuda]
```

### From source

```bash
git clone https://github.com/cycleuser/NuoYi.git
cd NuoYi
pip install -e .
```

### macOS Installation Notes

marker-pdf fully supports macOS (both Intel and Apple Silicon). On macOS, PyTorch is installed automatically without CUDA. Apple Silicon Macs can use MPS acceleration via `--device mps`.

If you encounter torch installation issues on macOS, install the CPU-only version of PyTorch first:

```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
pip install nuoyi
```

## Usage

### Command Line Interface

```bash
# Convert a single PDF file
nuoyi paper.pdf

# Specify output file
nuoyi paper.pdf -o output/result.md

# Convert a DOCX file
nuoyi document.docx -o document.md

# Batch convert all files in a directory
nuoyi ./papers --batch

# Batch convert with custom output directory
nuoyi ./papers --batch -o ./output

# Force CPU mode (for low VRAM GPUs)
nuoyi paper.pdf --device cpu

# Force OCR even for digital PDFs
nuoyi paper.pdf --force-ocr

# Specify page range
nuoyi paper.pdf --page-range "0-5,10,15-20"

# Specify languages
nuoyi paper.pdf --langs "zh,en,ja"

# Disable OCR models for digital PDFs (saves ~1.5GB VRAM)
nuoyi paper.pdf --disable-ocr-models

# Low VRAM mode for 4-6GB GPUs
nuoyi paper.pdf --low-vram
```

### GUI Mode

```bash
nuoyi --gui
```

The GUI provides:
- Directory selection for input/output
- File list with status tracking
- Device selection (auto/CPU/CUDA)
- Force OCR option
- Page range and language configuration
- Real-time progress and logging

**Startup interface:**

![Startup](images/1-启动界面.png)

**Select input directory:**

![Select directory](images/2-选择路径.png)

**Configure device and options:**

![Configure](images/3-选择模型.png)

**Conversion result (viewed in VS Code):**

![Result](images/4-结果.png)

### Python API

```python
from nuoyi import MarkerPDFConverter, DocxConverter

# Convert PDF (full models, ~3GB VRAM)
pdf_converter = MarkerPDFConverter(
    force_ocr=False,
    langs="zh,en",
    device="auto"  # or "cpu", "cuda", "mps"
)
markdown_text, images = pdf_converter.convert_file("input.pdf")

# Convert PDF (minimal models for digital PDFs, ~1.5GB VRAM)
pdf_converter_minimal = MarkerPDFConverter(
    disable_ocr_models=True,  # Saves ~1.5GB VRAM
    langs="zh,en",
    device="auto"
)
markdown_text, images = pdf_converter_minimal.convert_file("digital.pdf")

# Convert PDF (low VRAM mode)
pdf_converter_low_vram = MarkerPDFConverter(
    low_vram=True,
    langs="zh,en",
    device="auto"
)
markdown_text, images = pdf_converter_low_vram.convert_file("input.pdf")

# Convert DOCX
docx_converter = DocxConverter()
markdown_text = docx_converter.convert_file("input.docx")
```

## Supported Languages

| Code | Language |
|------|----------|
| `zh` | Chinese / 中文 |
| `en` | English |
| `ja` | Japanese / 日本語 |
| `fr` | French / Français |
| `ru` | Russian / Русский |
| `de` | German / Deutsch |
| `es` | Spanish / Español |
| `pt` | Portuguese / Português |
| `it` | Italian / Italiano |
| `ko` | Korean / 한국어 |

Use `nuoyi --list-langs` to see the full list. Default: `zh,en`.

## Command Line Options

| Option | Description |
|--------|-------------|
| `input` | Input PDF/DOCX file or directory (with --batch) |
| `-o, --output` | Output file path (single file) or directory (batch mode) |
| `--force-ocr` | Force OCR even for digital PDFs with embedded text |
| `--page-range` | Page range to convert, e.g. '0-5,10,15-20' |
| `--langs` | Comma-separated languages (default: zh,en). See `--list-langs` |
| `--list-langs` | List all supported languages and exit |
| `--batch` | Process all PDF/DOCX files in the input directory |
| `--device` | Device for model inference: auto (default), cpu, cuda, or mps |
| `--low-vram` | Enable low VRAM mode for 4-6GB GPUs |
| `--disable-ocr-models` | Disable OCR models for digital PDFs (~1.5GB VRAM saved) |
| `--gui` | Launch PySide6 GUI mode |
| `-V, --version` | Show version and exit |

## Memory Management

NuoYi automatically manages GPU memory:

- **Auto mode** (default): Detects available VRAM and uses GPU if sufficient (>6GB free)
- **CPU mode**: Forces CPU processing (slower but no VRAM limit)
- **CUDA mode**: Forces GPU processing (may OOM on large PDFs)
- **MPS mode**: For Apple Silicon Macs

### Low VRAM Options

For GPUs with limited VRAM (4-6GB):

1. **Use `--low-vram` flag**: Enables aggressive memory optimization
   ```bash
   nuoyi paper.pdf --low-vram
   ```

2. **Disable OCR models** (for digital PDFs only): Saves ~1.5GB VRAM
   ```bash
   nuoyi paper.pdf --disable-ocr-models
   ```
   
   ⚠️ **Warning**: This disables OCR features. Only suitable for:
   - Digital PDFs with embedded text (not scanned documents)
   - PDFs without complex tables requiring OCR
   - PDFs without mathematical formulas requiring OCR

3. **Use CPU mode**: No VRAM limitation but slower
   ```bash
   nuoyi paper.pdf --device cpu
   ```

4. **Use pymupdf engine**: Fast, no GPU required
   ```bash
   nuoyi paper.pdf --engine pymupdf
   ```

If CUDA out of memory occurs during conversion, NuoYi automatically retries with aggressive memory cleanup.

## Dependencies

### Required
- `marker-pdf>=1.0.0` - PDF conversion engine
- `PyMuPDF>=1.23.0` - PDF page counting
- `python-docx>=0.8.11` - DOCX conversion
- `Pillow>=9.0.0` - Image processing

### Optional
- `PySide6>=6.5.0` - GUI support (install with `pip install nuoyi[gui]`)

## Model Download

### Download Location

Models are downloaded automatically on first run and stored in:

```
~/.cache/huggingface/hub/
```

The models are from [Hugging Face](https://huggingface.co/) and include:
- `vikp/surya_det` - Layout detection model
- `vikp/surya_rec` - Text recognition model
- `vikp/surya_order` - Reading order model
- Other marker-pdf related models

Total size: approximately **2-3 GB**.

### For Users in China

Hugging Face may be blocked or slow in mainland China due to GFW. You can use a mirror:

```bash
# Set Hugging Face mirror (add to ~/.bashrc or run before nuoyi)
export HF_ENDPOINT=https://hf-mirror.com

# Then run nuoyi normally
nuoyi paper.pdf
```

Alternatively, you can download models manually and place them in the cache directory.

### Custom Model Path

The current version does not support custom model paths to keep the tool simple and avoid configuration complexity. Models are always stored in the default Hugging Face cache location.

## Notes

- After initial model download, everything works fully offline
- Use `--device cpu` if you encounter CUDA out of memory errors
- Legacy `.doc` format is not supported; convert to `.docx` first

## Agent Integration (OpenAI Function Calling)

NuoYi exposes OpenAI-compatible tools for LLM agents:

```python
from nuoyi.tools import TOOLS, dispatch

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=TOOLS,
)

result = dispatch(
    tool_call.function.name,
    tool_call.function.arguments,
)
```

## CLI Help

![CLI Help](images/nuoyi_help.png)

## License

GPL-3.0 License - see [LICENSE](LICENSE) for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Acknowledgments

- [marker-pdf](https://github.com/VikParuchuri/marker) - The excellent PDF conversion engine
- [surya](https://github.com/VikParuchuri/surya) - OCR and layout detection models
