Metadata-Version: 2.4
Name: paper-reader
Version: 0.1.0
Summary: Local PDF paper reader with side-by-side machine translation, powered by pdf2zh.
Project-URL: Homepage, https://github.com/linzhenkun2025/paper-reader
Project-URL: Issues, https://github.com/linzhenkun2025/paper-reader/issues
Author: linzhenkun2025
License: MIT
License-File: LICENSE
Keywords: arxiv,pdf,pdf2zh,reader,translation
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.11
Requires-Dist: flask-cors>=4.0
Requires-Dist: flask>=3.0
Requires-Dist: pdf2zh>=1.9
Requires-Dist: pymupdf>=1.24
Provides-Extra: dev
Requires-Dist: build; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: twine; extra == 'dev'
Description-Content-Type: text/markdown

# Paper Reader

A local web app for reading PDF papers side by side with an automatic Chinese (or other-language) translation, powered by pdf2zh.

---

## Features

- **Side-by-side panes**: original PDF on the left, translated PDF on the right, with synced scrolling.
- **Clickable links**: internal page jumps and external URLs both work in the rendered view.
- **Auto-numbered table of contents**: outline panel generated from the PDF's bookmark tree.
- **Selectable text layer**: copy text from both the original and translated panes.
- **Bounded directory browser**: serves only files under the configured directory; no path-traversal escapes.
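
The bounded directory browser relies on rejecting any request that resolves outside the served directory. A minimal sketch of that kind of check follows; the function name `resolve_inside` is hypothetical and this is not necessarily how Paper Reader implements it.

```python
from pathlib import Path


def resolve_inside(root: Path, requested: str) -> Path:
    """Resolve `requested` against `root`, rejecting anything that escapes it.

    Hypothetical helper for illustration only.
    """
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks, so the check
    # below sees the real location the request points at.
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"{requested!r} is outside the served directory")
    return candidate
```

With `root` set to `~/papers`, a request for `attention.pdf` resolves normally, while `../secrets.pdf` or an absolute path such as `/etc/passwd` raises `PermissionError`.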

## Requirements

- Python >= 3.11

`pdf2zh` is installed automatically as a dependency.

## Install

```sh
pipx install paper-reader        # once published
# or from a local checkout:
pipx install .
```

After installing, run `paper-reader doctor` once to apply two colorspace bug-fix patches to the installed pdf2zh/pdfminer (without them, color figures may render red). The command is idempotent, so it is safe to run repeatedly.

```sh
paper-reader doctor
```

> **Why is this needed?** pdf2zh 1.9.11 (the current release at the time of writing) still ships both colorspace bugs. The `doctor` command patches them in place so you do not have to wait for an upstream fix.
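
To illustrate what "idempotent" means for an in-place patch, here is a generic sketch. It is not the actual `doctor` implementation, and the function name `apply_patch` and the snippets passed to it are hypothetical. The idea is to check for the patched text first, so that a second run changes nothing.

```python
def apply_patch(source: str, old: str, new: str) -> tuple[str, bool]:
    """Replace `old` with `new` once; return (text, changed).

    Hypothetical helper: running it on already-patched text is a no-op.
    """
    if new in source:
        # Already patched on a previous run; leave the text untouched.
        return source, False
    if old not in source:
        # Neither form is present, so the upstream code has probably changed.
        raise ValueError("expected snippet not found; refusing to patch")
    return source.replace(old, new, 1), True
```

Calling `apply_patch` twice with the same arguments yields the same text both times, with `changed` set to `True` only on the first call.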

## Usage

```sh
paper-reader [DIR] [options]
```

- With no `DIR`, Paper Reader serves the **current working directory**.
- `paper-reader ~/papers` serves that directory.
- Open the URL printed on startup (default: http://127.0.0.1:8733).
- Pass `--open` to have the browser open automatically.

Examples:

```sh
paper-reader                      # serve current directory
paper-reader ~/papers             # serve ~/papers
paper-reader ~/papers --open      # serve ~/papers and open browser
paper-reader --port 9000          # use a different port
```

## Configuration

Settings are resolved in this order (highest priority first):

**CLI flag > environment variable > config file > default**

The config file location is `~/.config/paper-reader/config.toml` (respects `XDG_CONFIG_HOME`).
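
For reference, honoring `XDG_CONFIG_HOME` usually means falling back to `~/.config` when the variable is unset or empty. A minimal sketch, assuming that convention (the function name `default_config_path` is hypothetical):

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def default_config_path(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the config file path, honoring XDG_CONFIG_HOME when it is set."""
    env = os.environ if env is None else env
    # Per the XDG convention, an unset or empty variable means ~/.config.
    base = env.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return Path(base) / "paper-reader" / "config.toml"
```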

| Option | CLI flag | Env var | Default |
|---|---|---|---|
| Browse directory | `DIR` (positional) | `PAPER_DIR` | current working directory |
| Port | `--port` | `PAPER_PORT` | `8733` |
| Host | `--host` | `PAPER_HOST` | `127.0.0.1` |
| Cache directory | `--cache-dir` | `PAPER_CACHE` | `~/.cache/paper-reader` (respects `XDG_CACHE_HOME`) |
| Source language | `--from` | `PAPER_FROM` | `en` |
| Target language | `--to` | `PAPER_TO` | `zh` |
| Translation engine | `--engine` | `PAPER_ENGINE` | `google` (choices: `google`, `microsoft`, `mymemory`) |
| Concurrency | `--concurrency` | `PAPER_CONCURRENCY` | `3` |
| pdf2zh path | `--pdf2zh` | `PAPER_PDF2ZH` | auto-discovered |
| Open browser | `--open` | _(CLI only)_ | off |
| Config file path | `--config` | _(CLI only)_ | `~/.config/paper-reader/config.toml` |
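
The precedence order (CLI flag, then environment variable, then config file, then default) can be sketched as a simple lookup chain. This is an illustration under assumptions, not Paper Reader's actual code: the function `resolve_setting` and the `DEFAULTS` and `ENV_VARS` dictionaries are hypothetical, and `file_cfg` stands for the already-parsed contents of `config.toml`.

```python
from typing import Any, Mapping

# Only three options are shown; names and defaults are taken from the table above.
DEFAULTS = {"port": 8733, "host": "127.0.0.1", "engine": "google"}
ENV_VARS = {"port": "PAPER_PORT", "host": "PAPER_HOST", "engine": "PAPER_ENGINE"}


def resolve_setting(
    key: str,
    cli: Mapping[str, Any],
    env: Mapping[str, str],
    file_cfg: Mapping[str, Any],
) -> Any:
    """Return the value for `key`, using the highest-priority source that sets it."""
    if cli.get(key) is not None:  # 1. CLI flag
        return cli[key]
    env_name = ENV_VARS[key]
    if env_name in env:  # 2. environment variable
        # Environment values are strings; coerce to the default's type.
        return type(DEFAULTS[key])(env[env_name])
    if key in file_cfg:  # 3. config file
        return file_cfg[key]
    return DEFAULTS[key]  # 4. built-in default
```

For example, with `PAPER_PORT=9001` in the environment and `port = 9002` in the config file, running with `--port 9000` would use port 9000, and running without the flag would use 9001.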

### Example `~/.config/paper-reader/config.toml`

```toml
port = 8733
host = "127.0.0.1"
engine = "google"
lang_from = "en"
lang_to = "zh"
concurrency = 4
cache_dir = "~/.cache/paper-reader"
```

## How It Works

Flask serves a [pdf.js](https://mozilla.github.io/pdf.js/) frontend that renders two PDF panes side by side. Translation is performed per page by invoking the external `pdf2zh` CLI as a subprocess; translated PDFs are cached, so each page is translated only once.
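
The translate-once behavior can be pictured as a content-addressed cache. The sketch below is an assumption about how such a cache might be keyed, not a description of Paper Reader's actual cache layout. The names `cache_key` and `translated_page` are hypothetical, and `translate` is a stand-in callable for the `pdf2zh` subprocess call, which is deliberately not shown.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cache_key(pdf_bytes: bytes, page: int, lang_from: str, lang_to: str, engine: str) -> str:
    """Derive a key that changes if the file, page, languages, or engine change."""
    h = hashlib.sha256()
    h.update(pdf_bytes)
    h.update(f"|{page}|{lang_from}|{lang_to}|{engine}".encode())
    return h.hexdigest()


def translated_page(
    cache_dir: Path,
    pdf_bytes: bytes,
    page: int,
    translate: Callable[[bytes, int], bytes],  # hypothetical stand-in for pdf2zh
    lang_from: str = "en",
    lang_to: str = "zh",
    engine: str = "google",
) -> Path:
    """Return the cached translated page, translating only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    out = cache_dir / f"{cache_key(pdf_bytes, page, lang_from, lang_to, engine)}.pdf"
    if not out.exists():
        out.write_bytes(translate(pdf_bytes, page))
    return out
```

Keying on the file contents rather than the file name means a renamed PDF still hits the cache, while changing the target language or engine produces a fresh translation.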

## License & Credits

This project is licensed under the **MIT License** (see [LICENSE](LICENSE)).

It invokes **[pdf2zh / PDFMathTranslate](https://github.com/Byaidu/PDFMathTranslate)** (AGPL-3.0) as a separate command-line program via subprocess. This is aggregation, not a derivative work: the Paper Reader source code remains MIT, while pdf2zh's AGPL obligations remain with pdf2zh itself.
