Metadata-Version: 2.4
Name: arabic-rt
Version: 0.1.3
Summary: Arabic shaping, BiDi, and un-baking for games, TTS, and real-time clients.
Project-URL: Homepage, https://github.com/balswyan/arabic-rt
Project-URL: Repository, https://github.com/balswyan/arabic-rt
Project-URL: Issues, https://github.com/balswyan/arabic-rt/issues
Author: Bandar AlSwyan
License-Expression: MPL-2.0
License-File: LICENSE
Keywords: arabic,bidi,games,i18n,internationalization,presentation-forms,reshaper,rtl,shaping,text-to-speech,tts,unity
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Mozilla Public License 2.0 (MPL 2.0)
Classifier: Natural Language :: Arabic
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Games/Entertainment
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Text Processing :: Linguistic
Requires-Python: >=3.9
Provides-Extra: dev
Requires-Dist: arabic-reshaper>=3; extra == 'dev'
Requires-Dist: pytest>=7; extra == 'dev'
Requires-Dist: python-bidi>=0.4; extra == 'dev'
Description-Content-Type: text/markdown

# arabic-rt

**Arabic shaping, BiDi, and *un-baking* for games, TTS, and real-time clients.**

[![PyPI](https://img.shields.io/pypi/v/arabic-rt.svg)](https://pypi.org/project/arabic-rt/)
[![Python](https://img.shields.io/pypi/pyversions/arabic-rt.svg)](https://pypi.org/project/arabic-rt/)
[![License: MPL-2.0](https://img.shields.io/badge/license-MPL--2.0-brightgreen.svg)](LICENSE)
[![Open in Spaces](https://huggingface.co/datasets/huggingface/badges/resolve/main/open-in-hf-spaces-sm.svg)](https://huggingface.co/spaces/balswyan/arabic-rt)

**[🤗 Try the live demo](https://huggingface.co/spaces/balswyan/arabic-rt)** · **[📦 PyPI](https://pypi.org/project/arabic-rt/)** · **[🎮 .NET / Unity version](https://github.com/balswyan/arabic-rt-dotnet)**

Most Arabic libraries can turn logical Arabic into correctly *shaped, right-to-left* text. `arabic-rt` does that too — but it also does the part almost nothing else does: it can **reverse** the process, turning baked presentation-form text back into clean logical Arabic. That round-trip is what makes Arabic work in places it normally breaks: multiplayer game chat, naive text renderers, and text-to-speech.

- 🔁 **Bake and un-bake.** `fix()` → renders correctly even on clients that do *zero* Arabic processing. `unfix()` → recovers logical Arabic for TTS, search, or logging.
- 🎮 **Built for real-time clients.** A `GAME` preset handles word-by-word chat readers (joins words so they aren't split, keeps the first words on top when wrapping).
- 🧩 **Zero dependencies.** Pure Python. Drop it in anywhere.
- ✅ **Validated.** Forward output matches `arabic_reshaper` + `python-bidi` byte-for-byte; `unfix(fix(x)) == x` is covered by tests. The C# port produces byte-identical output, so text baked in Unity reads back in Python and vice versa.

> Pure shaping/BiDi is well served by existing tools. `arabic-rt`'s reason to exist is the **real-time / game** niche and the **un-baking** capability built for it.

## Try it

A live, no-install demo — type Arabic and watch it shaped, baked, and un-baked in real time:
**https://huggingface.co/spaces/balswyan/arabic-rt**

## Install

```bash
pip install arabic-rt
```

## Quick start

```python
import arabic_rt as ar

baked = ar.fix("مرحبا بالعالم")     # visual-order presentation forms (renders anywhere)
ar.unfix(baked)                      # -> "مرحبا بالعالم"  (back to logical, for TTS/search)
ar.shape("سلم")                      # -> "ﺳﻠﻢ"  (contextual shaping only, no reorder)

ar.contains_arabic("hi مرحبا")       # True
ar.is_shaped(baked)                  # True
```

### Game chat (word-by-word readers)

```python
ar.fix("مرحبا بالعالم", ar.GAME)     # words joined so a naive reader shows the whole phrase
```
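
To see why a non-breaking joiner helps, consider a hypothetical chat client that shows a message one space-separated chunk at a time. The `naive_reader` function below is an assumption made up for illustration, not part of `arabic-rt`, and the `GAME` preset may well do more than swap the separator:

```python
def naive_reader(message: str) -> list[str]:
    """Hypothetical client: splits a message on ASCII spaces only."""
    return message.split(" ")


NBSP = "\u00A0"  # the same character used for word_joiner in the Options example

phrase = "مرحبا بالعالم"
joined = phrase.replace(" ", NBSP)

print(len(naive_reader(phrase)))  # 2: the reader sees two separate words
print(len(naive_reader(joined)))  # 1: the phrase survives as one chunk
```

Note that Python's argument-less `str.split()` treats U+00A0 as whitespace, so the sketch splits on `" "` explicitly to model a reader that only knows about ASCII spaces.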

### Tune it yourself

```python
from arabic_rt import Options, fix

opts = Options(
    combine_allah=True,      # collapse الله -> ﷲ
    reverse_word_order=True, # full RTL line (False = shape per word, keep typed order)
    word_joiner="\u00A0",    # separator for naive word-by-word readers
    prevent_word_split=True,
    max_line_chars=18,       # wrap long lines ourselves (first words on top, each line RTL)
)
fix("نص عربي طويل", opts)
```
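
The comment on `max_line_chars` says wrapping keeps the first words on top, with each line right-to-left. A rough sketch of why the wrap has to happen before the reversal, using only the standard library. `wrap_then_reverse` is a hypothetical helper, plain character reversal stands in for real shaping and BiDi reordering, Latin placeholder words keep the reversal easy to read, and none of this is claimed to be the library's actual algorithm:

```python
def wrap_then_reverse(logical: str, max_chars: int) -> str:
    """Greedily wrap logical-order words, then reverse each line on its own."""
    lines: list[str] = []
    current = ""
    for word in logical.split():
        candidate = f"{current} {word}" if current else word
        if current and len(candidate) > max_chars:
            lines.append(current)  # line is full: close it and start a new one
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    # Line order stays logical (first words on top); only the characters
    # inside each line are flipped into visual order.
    return "\n".join(line[::-1] for line in lines)


print(wrap_then_reverse("one two three four", 10))
```

Reversing the whole string first and letting the client wrap it afterwards would instead put the last words on the top line, which is the problem that wrapping before reversal avoids.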

## Why "un-baking" matters

To make Arabic show up correctly on a client that does no shaping, you "bake" it into final presentation glyphs in visual (reversed) order. The catch: once baked, the text is no longer made of base Arabic letters but of presentation-form code points in reversed order, so a text-to-speech engine reads gibberish and search and logging break. `unfix()` reverses the bake (presentation forms → base letters, ligatures expanded, order restored) so the *display* can stay baked while the *voice* and *data* see clean Arabic.
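
The difference between baked and logical text is visible at the code-point level. The sketch below uses only the standard library: Unicode compatibility normalization (NFKC) maps Arabic presentation forms back to their base letters. `naive_unbake` is a hypothetical helper that only handles a single line of pure Arabic. It is not how `unfix()` is implemented; a real un-bake presumably also has to cope with mixed-direction text and the library's own options:

```python
import unicodedata

# "ﺳﻠﻢ" as presentation forms: initial seen, medial lam, final meem
shaped = "\uFEB3\uFEE0\uFEE2"

# Each glyph is its own code point, distinct from the base letter.
for ch in shaped:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")


def naive_unbake(visual: str) -> str:
    """Reverse visual order, then fold presentation forms to base letters."""
    return unicodedata.normalize("NFKC", visual[::-1])


baked = shaped[::-1]  # visual (reversed) order, as a bake would store it
logical = naive_unbake(baked)
print(logical == "\u0633\u0644\u0645")  # True: base letters seen, lam, meem
```

Reversing before normalizing matters for ligatures: a lam-alef ligature such as U+FEFB expands to lam followed by alef, which is only the right order once the string is back in logical sequence.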

## API

| Function | Purpose |
| --- | --- |
| `fix(text, opts=None, **overrides)` | Logical Arabic → baked visual presentation forms. No-op on non-Arabic or already-shaped text. |
| `unfix(text)` | Baked Arabic → logical Arabic. No-op on text that isn't baked. |
| `shape(text, *, combine_allah=False)` | Contextual shaping only; order preserved. |
| `contains_arabic(text)` / `is_shaped(text)` | Quick boolean checks: does the text contain Arabic, and is it already shaped? |
| `Options` / `GAME` | Config dataclass and a ready preset for game chat. |
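
The exact rules behind `contains_arabic` and `is_shaped` are not spelled out here. As an approximation only, checks of this kind can be built from Unicode block ranges; `has_arabic` and `looks_shaped` below are hypothetical stand-ins, not the library's functions, and the real ones may differ in edge cases:

```python
ARABIC_BLOCK = range(0x0600, 0x0700)          # Arabic: U+0600..U+06FF
PRESENTATION_FORMS_A = range(0xFB50, 0xFE00)  # Forms-A: U+FB50..U+FDFF
PRESENTATION_FORMS_B = range(0xFE70, 0xFF00)  # Forms-B: U+FE70..U+FEFF


def looks_shaped(text: str) -> bool:
    """True if any character is an Arabic presentation form."""
    return any(
        ord(ch) in PRESENTATION_FORMS_A or ord(ch) in PRESENTATION_FORMS_B
        for ch in text
    )


def has_arabic(text: str) -> bool:
    """True if any character is a base Arabic letter or a presentation form."""
    return any(ord(ch) in ARABIC_BLOCK for ch in text) or looks_shaped(text)


print(has_arabic("hi مرحبا"))                 # True
print(has_arabic("hello"))                    # False
print(looks_shaped("\uFEB3\uFEE0\uFEE2"))     # True: presentation forms
print(looks_shaped("\u0633\u0644\u0645"))     # False: base letters only
```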

## Also available for .NET & Unity

The same engine, ported to C# with **byte-for-byte identical output**, targeting `netstandard2.0/2.1` (Unity-compatible):
**[github.com/balswyan/arabic-rt-dotnet](https://github.com/balswyan/arabic-rt-dotnet)** · `dotnet add package ArabicRt`

## A note on display fonts

`arabic-rt` produces correct *text*; how it *looks* is your font's job. For rendering shaped Arabic (e.g. in the demo or a UI), a quality **Naskh** face such as **Noto Naskh Arabic** or **Amiri** (both SIL OFL) looks far better than a generic system font.

## Validation

Install the dev extras, which pull in `pytest` and the reference libraries `arabic-reshaper` and `python-bidi`, then run the suite:

```bash
pip install -e ".[dev]"
pytest -q
```

## License & author

Licensed under the **Mozilla Public License 2.0 (MPL-2.0)** — see [`LICENSE`](LICENSE). Use it freely, including in closed-source games and apps; modifications to `arabic-rt`'s own files stay open.

Created by **Bandar AlSwyan**.

---

<div dir="ltr" align="left">

<h2>Quick overview</h2>

<p><b>arabic-rt</b> is a library for processing Arabic text: it shapes letters (joining them in their correct forms), orders them right to left, and, <b>most importantly</b>, can reverse the process, converting "baked" text (reversed presentation forms) back into clean logical Arabic.</p>

<p>This ability to "un-bake" (<code>unfix</code>) is what makes Arabic work in places where it usually breaks: multiplayer game chat, engines that do not process Arabic, and text-to-speech (TTS) systems. The text displays correctly for everyone, while the speech or search engine reads a clean logical copy.</p>

<p>🤗 Try the live demo: <a href="https://huggingface.co/spaces/balswyan/arabic-rt">huggingface.co/spaces/balswyan/arabic-rt</a></p>

<ul>
<li><b>fix()</b>: logical Arabic → ready-made presentation forms that display correctly on any client, even one that does no processing.</li>
<li><b>unfix()</b>: reverses the process to recover logical Arabic (for speech, search, and logs).</li>
<li><b>GAME</b>: a ready preset for game chat that reads words one at a time.</li>
<li>No dependencies, and verified character for character against arabic_reshaper and python-bidi.</li>
</ul>

<p>Also available for .NET and Unity: <a href="https://github.com/balswyan/arabic-rt-dotnet">arabic-rt-dotnet</a>. Licensed under MPL-2.0. Created by <b>Bandar AlSwyan</b>.</p>

</div>
