Metadata-Version: 2.3
Name: pdfembed
Version: 0.1.0
Summary: CLI/TUI tool to run OCR locally and overlay a searchable text layer on PDFs.
Author: harumiWeb
Author-email: harumiWeb <ganaharumi@outlook.jp>
License: BSD-3-Clause
Requires-Dist: numpy>=1.26.4
Requires-Dist: onnxocr>=2025.5
Requires-Dist: opencv-python>=4.11.0.86
Requires-Dist: pypdf>=6.4.0
Requires-Dist: pypdfium2>=5.1.0
Requires-Dist: reportlab>=4.4.5
Requires-Dist: textual>=6.6.0
Requires-Python: >=3.12
Description-Content-Type: text/markdown

## pdfembed

<img width="1077" height="610" alt="Image" src="https://github.com/user-attachments/assets/0610ecf0-fec0-4657-abe5-c4387a91bbae" />

---

CLI/TUI tool to run OCR locally and overlay a searchable text layer on PDFs. By default, a Textual-based TUI launches; use `--cli` for the classic CLI.

License: BSD-3-Clause (see `LICENSE`).

### Quickstart

- TUI (default):  
  `python -m pdfembed.cli` or `pdfembed`

- CLI:  
  `python -m pdfembed.cli --cli --file sample.pdf --dpi 300`
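
For batch jobs it can be handy to drive the CLI mode from a script. The following is a minimal sketch, not part of pdfembed itself: `build_command` and `run_ocr` are hypothetical helper names, and the sketch assumes the package is installed in the current interpreter and that the CLI exits with a non-zero status on failure, which this README does not state. It uses only the `--cli`, `--file`, `--dpi`, and `--output` flags documented here.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pdfs, dpi=300, output=None):
    """Build the argv list for one non-interactive pdfembed run."""
    # Launch through the current interpreter, same as `python -m pdfembed.cli`.
    cmd = [sys.executable, "-m", "pdfembed.cli", "--cli", "--file"]
    cmd += [str(Path(p)) for p in pdfs]
    cmd += ["--dpi", str(dpi)]
    if output is not None:
        cmd += ["--output", str(output)]
    return cmd


def run_ocr(pdfs, dpi=300, output=None):
    """Run pdfembed; raises CalledProcessError on a non-zero exit status.

    Assumption: pdfembed signals failure through its exit status.
    """
    subprocess.run(build_command(pdfs, dpi, output), check=True)
```

Calling `run_ocr(["sample.pdf"], dpi=300)` should correspond to the CLI command shown above, launched through the current interpreter.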

### TUI Controls

- `f`: select PDF file(s) (opens a file dialog; multiple selection allowed)  
- `o`: select output folder (opens a folder dialog; defaults to the first PDF's directory)  
- `v`: toggle overlay visibility (debug)  
- `s`: start OCR  
- `q`: quit  
- The TUI always renders at the default DPI (300); to use a different value, run in CLI mode with `--dpi`.

While OCR is running, a "Processing... please wait" indicator is shown and other keys are ignored until completion.

### CLI Options (key ones)

- `--file <pdf1> [pdf2 ...]` or `--dir <folder>`: input PDFs  
- `--output <dir>`: output directory (default: input location)  
- `--dpi <int>`: render DPI (default 300)  
- `--visible`: make overlay text visible (debug)  
- `--font <path>`: TTF font for overlay text  
- `--log-level <LEVEL>`: logging level (INFO by default)
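
To get a feel for what `--dpi` changes: PDF page sizes are measured in points (1/72 inch by default), so a page rendered at a given DPI is `points * dpi / 72` pixels along each side. The sketch below shows that arithmetic, plus the reverse mapping an overlay step would need to place OCR boxes back on the page. It illustrates the general technique only: `render_size` and `pixel_box_to_points` are hypothetical names, and whether pdfembed computes positions exactly this way is an assumption. The sketch also ignores page rotation and non-zero page origins.

```python
POINTS_PER_INCH = 72  # default size of a PDF user-space unit


def render_size(width_pt, height_pt, dpi=300):
    """Return the (width, height) in pixels of a page rendered at `dpi`."""
    return (round(width_pt * dpi / POINTS_PER_INCH),
            round(height_pt * dpi / POINTS_PER_INCH))


def pixel_box_to_points(box_px, page_height_pt, dpi=300):
    """Map an (x0, y0, x1, y1) pixel box with a top-left origin to PDF
    points with a bottom-left origin (the usual PDF coordinate system)."""
    x0, y0, x1, y1 = (v * POINTS_PER_INCH / dpi for v in box_px)
    # Flip the y axis: the top of the image is the top of the page.
    return (x0, page_height_pt - y1, x1, page_height_pt - y0)


print(render_size(612, 792))       # US Letter at 300 DPI -> (2550, 3300)
print(render_size(612, 792, 150))  # same page at 150 DPI -> (1275, 1650)
```

A higher DPI gives the OCR engine more pixels per glyph at the cost of larger images: doubling the DPI quadruples the pixel count.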

### Dependencies

- Textual (TUI)
- tkinter (file dialogs; part of the Python standard library)
- onnxocr / pypdfium2 / pypdf / reportlab / opencv-python / numpy
