Metadata-Version: 2.4
Name: deocr
Version: 0.1.1
Summary: A reverse OCR tool that renders huggingface-compatible datasets to images of specified sizes (e.g., `512x512`).
Author: Moenupa
Author-email: Moenupa <moenupa@gmail.com>
License-Expression: MIT
License-File: LICENSE
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Dist: jsonargparse[signatures]>=4.26.1
Requires-Dist: datasets
Requires-Dist: markdown
Requires-Dist: playwright>=1.49.1,<1.56.0
Requires-Python: >=3.9, <3.14
Project-URL: Homepage, https://github.com/Moenupa/DeOCR
Project-URL: Issues, https://github.com/Moenupa/DeOCR/issues
Description-Content-Type: text/markdown

# DeOCR

DeOCR (de-cor) is a reverse OCR tool that renders Hugging Face-compatible datasets to images of specified sizes (e.g., `512x512`). It can serve as the text-to-image data pre-processing component in pipelines such as [DeepSeek-OCR](https://github.com/deepseek-ai/DeepSeek-OCR).

```mermaid
---
title: DeOCR Usage in LLM Pipeline
---
flowchart LR
  TEXTDATA[/"some context in text form"/]
  MMDATA[/"Does this particular car <br/> &lt;image&gt; present in here &lt;image&gt; ?"/]
  HFDATASET[("huggingface dataset")] 
  subgraph DeOCR
    CSS1["cli --style red-text textit"]
    CSS2["cli --style default"]
    CSS3["cli --style default"]
    MAPPER["DeOCR Dataset Mapper"]
  end
  TEXTDATA --> CSS1 --> IMG1[["some context in text form"]]:::redText
  TEXTDATA --> CSS2 --> IMG2[["some context in text form"]]
  MMDATA --> CSS3 --> IMG3[["Does this particular car <br/> 🖼️🖼️🖼️🖼️🖼️🖼️🖼️<br/>🖼️🖼️🖼️🚗🖼️🖼️🖼️<br/>🖼️🖼️🖼️🖼️🖼️🖼️🖼️<br/> present in here <br/> 🖼️🖼️🖼️🖼️🖼️🖼️🖼️<br/>🖼️🖼️🖼️🖼️🖼️🖼️🖼️<br/>🖼️🖼️🖼️🖼️🖼️🖼️🖼️<br/>?"]]
  HFDATASET --> MAPPER --> DEOCRDATASET[("🖼️ imagified dataset")]
  DEOCRDATASET & IMG1 & IMG2 & IMG3 -.-> MODEL["LLMs or VLMs<br/> Evaluation"]
  classDef redText color:#ff0000,font-style:italic;
  IMG1 ~~~|"fa:fa-mobile-screen A screenshot of text <br/>w. special formatting"| IMG1
  IMG2 ~~~|"fa:fa-mobile-screen A plain screenshot of text"| IMG2
  IMG3 ~~~|"fa:fa-mobile-screen A screenshot of both text and images"| IMG3
```

<details><summary>Here is an output example, sized <code>512x512</code>, with a random string as context</summary>

![a 512x512 example](assets/output_sample_w512_h512.png)

</details>
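
To make the text-to-image idea concrete, here is a minimal sketch of rendering a string to a fixed-size screenshot. It is not DeOCR's actual API: `build_page` and `render_png` are hypothetical helpers, and the approach (wrap the text in HTML, then screenshot it with headless Chromium via Playwright) is only an assumption based on the package's dependency list.

```python
import html


def build_page(text: str, width: int = 512, height: int = 512) -> str:
    """Wrap plain text in a fixed-size HTML page (hypothetical helper, not DeOCR's API)."""
    # Escape markup so the text is displayed literally, and keep line breaks.
    body = html.escape(text).replace("\n", "<br/>")
    return (
        "<!doctype html><html><head><meta charset='utf-8'>"
        f"<style>html,body{{margin:0;width:{width}px;height:{height}px;overflow:hidden}}"
        "body{font:16px monospace;word-break:break-all}</style></head>"
        f"<body>{body}</body></html>"
    )


def render_png(text: str, out_path: str, width: int = 512, height: int = 512) -> None:
    """Screenshot the page with headless Chromium (needs `playwright install chromium`)."""
    # Imported lazily so build_page stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        # The viewport size determines the size of the screenshot.
        page = browser.new_page(viewport={"width": width, "height": height})
        page.set_content(build_page(text, width, height))
        page.screenshot(path=out_path)
        browser.close()


# Example call (assumed file name):
# render_png("some context in text form", "sample_w512_h512.png")
```

Separating page construction from the screenshot step keeps the styling logic testable without launching a browser.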

# Quick Start

```sh
# run inside your activated Python environment
pip install deocr
# then download the Chromium build that Playwright drives
playwright install chromium
```

<details><summary>Alternatively, install from source</summary>

```sh
# with uv
uv add "deocr @ git+https://github.com/Moenupa/DeOCR.git"
# with pip (also works inside a conda environment)
pip install "git+https://github.com/Moenupa/DeOCR.git"
# then, in the same environment, download the Chromium build that Playwright drives
playwright install chromium
```

</details>

<details><summary>For development</summary>

Please use uv to manage the environment:

```sh
git clone https://github.com/Moenupa/DeOCR.git
cd DeOCR
uv venv
uv sync --dev
source .venv/bin/activate
playwright install chromium
pre-commit install
```

</details>
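
After any of the install routes above, a quick sanity check can confirm that the Python packages are importable. The sketch below is only an illustration: `check_install` is a hypothetical helper, and the import name `deocr` is assumed to match the package name. It does not verify that the Chromium browser itself was downloaded.

```python
import importlib.util


def check_install(modules=("deocr", "datasets", "markdown", "playwright", "jsonargparse")):
    """Map each top-level module name to whether it can be found (hypothetical helper)."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


# Print a small report for the current environment.
for name, ok in check_install().items():
    print(f"{name:<14}{'ok' if ok else 'MISSING'}")
```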
