Metadata-Version: 2.4
Name: transklate
Version: 0.1.0
Summary: Use Tesseract OCR to convert Hebrew PDFs to text files and then translate them.
Author-email: Frédérique Michèle Rey <frederique.rey@hu-berlin.de>
Maintainer-email: Frédérique Michèle Rey <frederique.rey@hu-berlin.de>
License-Expression: MIT
License-File: LICENCE.txt
Keywords: OCR,hebrew,transcribe,translation
Classifier: Programming Language :: Python
Requires-Python: >=3.9
Requires-Dist: alive-progress==3.2.0
Requires-Dist: click==8.1.*
Requires-Dist: deep-translator==1.11.4
Requires-Dist: ironpdf==2025.1.1.1
Requires-Dist: pymupdf==1.25.2
Requires-Dist: pytesseract==0.3.13
Requires-Dist: regex==2024.11.6
Description-Content-Type: text/markdown

# Transklate - A small tool to convert Hebrew PDFs to TXT and translate them

This small tool automates the following two steps:
1. Transcribe a PDF written in Hebrew into a TXT file using Tesseract.
   - The PDF is converted to PNG images in a temporary folder.
   - Each image is then converted to a string by Tesseract.
2. Translate the TXT file into the language of your choice using Google Translate.
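
The two steps above can be sketched roughly as follows. This is not the package's actual source, only an illustration built on the declared dependencies (`pymupdf`, `pytesseract`, `deep-translator`). The function names, the 300 dpi rendering, the `heb` Tesseract language pack and the 4500-character chunk size are all assumptions made for this example. The chunking is there because, to the best of my knowledge, `deep-translator` rejects inputs longer than about 5000 characters per request.

```python
import tempfile
from pathlib import Path


def split_into_chunks(text: str, max_chars: int = 4500) -> list[str]:
    """Split text on line boundaries so chunks stay within max_chars.

    A single line longer than max_chars is kept whole (hypothetical helper).
    """
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        # Start a new chunk when adding this line would exceed the limit.
        if current and len(current) + len(line) > max_chars:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


def pdf_to_text(pdf_path: str, tesseract_lang: str = "heb") -> str:
    """Render each page to a PNG in a temporary folder, then OCR it."""
    import pymupdf      # third-party, imported lazily
    import pytesseract  # requires the Tesseract binary and the Hebrew data

    pages = []
    with tempfile.TemporaryDirectory() as tmp_dir:
        with pymupdf.open(pdf_path) as doc:
            for number, page in enumerate(doc):
                png_path = Path(tmp_dir) / f"page_{number:04d}.png"
                # 300 dpi is an assumed value; higher usually helps OCR.
                page.get_pixmap(dpi=300).save(str(png_path))
                pages.append(
                    pytesseract.image_to_string(str(png_path), lang=tesseract_lang)
                )
    return "\n".join(pages)


def translate_text(text: str, target: str = "en") -> str:
    """Translate the text chunk by chunk with Google Translate."""
    from deep_translator import GoogleTranslator  # third-party, lazy import

    translator = GoogleTranslator(source="auto", target=target)
    return "\n".join(
        translator.translate(chunk) or "" for chunk in split_into_chunks(text)
    )
```

With those helpers, `translate_text(pdf_to_text("file_name.pdf"), target="en")` would run both steps in sequence. Splitting on line boundaries keeps sentences from being cut in the middle of a word, though it can still separate a sentence that spans two lines.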

At the moment the translation is not very good (and not as good as you'd expect from Google Translate online).

## Installation
```bash
pip install transklate
```

## Basic CLI use
```bash
transklate <file_name.pdf> --lang en
```
For the output language, pass one of the language codes supported by Google Translate (for example `en`).
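
One way to look up the available codes is to ask `deep-translator`, which is among the declared dependencies. The sketch below assumes that `--lang` accepts the same codes that `deep-translator` reports for Google Translate; the README does not confirm this. Both function names are hypothetical.

```python
def google_language_codes() -> dict:
    """Return a mapping of language name to code, as reported by deep-translator."""
    from deep_translator import GoogleTranslator  # third-party, lazy import

    return GoogleTranslator().get_supported_languages(as_dict=True)


def is_supported(code: str, languages: dict) -> bool:
    """Check whether a code appears among the values of a name-to-code mapping."""
    return code in languages.values()


if __name__ == "__main__":
    # Print every code next to its language name.
    for name, code in sorted(google_language_codes().items()):
        print(code, name)
```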
