Metadata-Version: 2.4
Name: pdfto
Version: 0.1.0
Summary: Convert PDFs to Markdown or plain text via font-aware heuristics
License: MIT
Requires-Python: >=3.10
Requires-Dist: pymupdf>=1.23
Requires-Dist: typer>=0.12
Provides-Extra: dev
Requires-Dist: pytest>=8; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Description-Content-Type: text/markdown

# pdfto

Convert PDF files to Markdown or plain text from the command line.

## Install

```bash
pip install pdfto
```

## Usage

```bash
# Convert to Markdown (output alongside input)
pdfto --markdown example.pdf

# Convert to Markdown with explicit output path
pdfto --markdown example.pdf -o output.md

# Convert to plain text
pdfto --text example.pdf -o output.txt

# Write to stdout
pdfto --markdown example.pdf -o -

# Batch: multiple files
pdfto --markdown a.pdf b.pdf -o ./converted/

# Batch: entire directory (recursive)
pdfto --markdown ./docs/ -o ./out/
```

## Options

| Flag | Description |
|------|-------------|
| `--markdown` | Convert to Markdown (`.md`) |
| `--text` | Convert to plain text (`.txt`) |
| `-o`, `--output` | Output file, directory, or `-` for stdout |
| `--force` | Overwrite existing output files |
| `--quiet` | Suppress progress messages |
| `--version` | Show version and exit |

## How it works

pdfto uses [PyMuPDF](https://pymupdf.readthedocs.io/) to extract font metadata for each text span: the font size plus bold, italic, and monospace flags. Heading levels are assigned from the ratio of a span's font size to the document's body text size. No ML model or GPU is required.
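
The font-size-ratio idea can be sketched roughly as follows. This is an illustration under stated assumptions, not pdfto's actual code: the function names, the ratio thresholds, and the choice of "size covering the most characters" as the body size are all hypothetical. The span dictionaries mimic the `size`, `flags`, and `text` keys that PyMuPDF returns from `page.get_text("dict")`, and the flag bits follow PyMuPDF's documented span flags.

```python
from collections import Counter

# PyMuPDF span flag bits (as documented for span dictionaries).
FLAG_ITALIC = 1 << 1
FLAG_MONO = 1 << 3
FLAG_BOLD = 1 << 4

# Hypothetical (minimum ratio, heading level) pairs, largest ratio first.
# pdfto's real cut-offs may differ.
HEADING_RATIOS = [(1.8, 1), (1.4, 2), (1.15, 3)]


def body_size(spans):
    """Return the font size that covers the most characters."""
    if not spans:
        raise ValueError("no spans to analyse")
    weights = Counter()
    for span in spans:
        weights[round(span["size"], 1)] += len(span["text"].strip())
    return weights.most_common(1)[0][0]


def heading_level(size, body):
    """Map a font size to a heading level (1-3), or 0 for body text."""
    ratio = size / body
    for threshold, level in HEADING_RATIOS:
        if ratio >= threshold:
            return level
    return 0


def to_markdown(spans):
    """Render spans as Markdown lines, one span per line."""
    body = body_size(spans)
    lines = []
    for span in spans:
        text = span["text"].strip()
        if not text:
            continue
        level = heading_level(span["size"], body)
        if level:
            lines.append("#" * level + " " + text)
        elif span["flags"] & FLAG_MONO:
            lines.append("`" + text + "`")
        elif span["flags"] & FLAG_BOLD:
            lines.append("**" + text + "**")
        elif span["flags"] & FLAG_ITALIC:
            lines.append("*" + text + "*")
        else:
            lines.append(text)
    return "\n".join(lines)
```

With PyMuPDF installed, the input spans could be collected by walking the `blocks`, `lines`, and `spans` lists of each page's `get_text("dict")` result. Weighting the body size by character count rather than span count keeps a document with many short captions or footnotes from skewing the baseline.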
