Metadata-Version: 2.4
Name: gangajal
Version: 0.1.0
Summary: Fast, deterministic profanity filter using WebAssembly - works across all languages
Author: Saugat
Project-URL: Repository, https://github.com/SaugatEDITH/gangajal
Project-URL: Bug-Tracker, https://github.com/SaugatEDITH/gangajal/issues
Project-URL: Documentation, https://github.com/SaugatEDITH/gangajal/blob/master/README.md
Keywords: profanity,filter,censor,moderation,wasm,webassembly,unicode
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: wasmtime
Dynamic: license-file

# Gangajal — Global Profanity Filtering Engine

**Secure • Fast • Cross-language • No raw bad words in repo**

Gangajal is a deterministic, high-performance profanity filter that runs in **any programming language with a WebAssembly runtime**.  
It combines Unicode normalization, Bloom filters, and SHA-256 hashing, so the published dictionary contains only hashes, never the raw words, while lookups stay fast.

![GitHub stars](https://img.shields.io/github/stars/SaugatEDITH/gangajal)
![PyPI version](https://img.shields.io/pypi/v/gangajal)

## ✨ Features

- Supports **all world languages**: text is normalized with Unicode NFKC, and letters and combining marks are kept
- Zero raw profanity words ever in the repository
- Extremely fast (Bloom filter + binary search)
- One WASM core — works in Node.js, Python, Go, .NET, Java, Rust, browsers, etc.
- Deterministic & reproducible
- Easy admin tools to update the dictionary safely
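
As a rough illustration of the normalization step named in the first bullet, the sketch below applies NFKC and keeps only letters and combining marks. It is not the engine's actual code: `normalize_token` is a hypothetical helper, and the real rules (case handling, for example) are defined in SPEC.md.

```python
import unicodedata


def normalize_token(text: str) -> str:
    """Hypothetical helper: NFKC-normalize, then keep letters and combining marks."""
    normalized = unicodedata.normalize("NFKC", text)
    # General categories starting with "L" are letters, "M" are combining marks.
    return "".join(
        ch for ch in normalized
        if unicodedata.category(ch)[0] in ("L", "M")
    )


# Full-width letters fold to ASCII and punctuation is dropped.
print(normalize_token("ｂａｄ"))
print(normalize_token("b.a-d!"))
```

Keeping combining marks matters for scripts such as Devanagari, where vowel signs and the virama are marks rather than letters.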

## 🚀 Quick Start

### Install

```bash
pip install gangajal
```

> **Alternative: Install from GitHub release**  
> 1. Download `gangajal-python.tar.xz` from [Releases](https://github.com/SaugatEDITH/gangajal/releases)  
> 2. Extract it  
> 3. `pip install ./gangajal-python`

### Usage

```python
from gangajal import validate, reload_assets

# Full mask mode (0): masks the entire word
print(validate("hello badword here", 0))  # "hello ******* here"

# Partial mask mode (1): keeps the first character
print(validate("hello badword here", 1))  # "hello b****** here"

# Reload assets without restarting Python
reload_assets()
```

### Modes

- `mode=0`: Full mask - replaces entire word with `*` (e.g., `badword` → `*******`)
- `mode=1`: Partial mask - keeps first character (e.g., `badword` → `b******`)
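
The two modes can be pictured with a small stand-alone sketch. `mask_word` is a hypothetical helper written only for illustration; it is not part of the `gangajal` API and the WASM core may handle edge cases differently.

```python
def mask_word(word: str, mode: int) -> str:
    """Hypothetical helper mirroring the two documented mask modes."""
    if not word:
        return word
    if mode == 0:
        # Full mask: every character becomes "*".
        return "*" * len(word)
    if mode == 1:
        # Partial mask: keep the first character, mask the rest.
        return word[0] + "*" * (len(word) - 1)
    raise ValueError(f"unsupported mode: {mode}")


print(mask_word("badword", 0))  # *******
print(mask_word("badword", 1))  # b******
```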

---

## Architecture

- **Admin Tools** (private) → generate safe binary assets  
- **Binary Assets** (public) → `badwords.bloom` + `badwords.hash.bin`  
- **WASM Core** → same engine everywhere  
- **Language Bindings** → JS, Python, Go, .NET, etc.

Full specification → [SPEC.md](https://github.com/SaugatEDITH/gangajal/blob/master/SPEC.md)
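
To show how the two binary assets could work together, here is a minimal sketch of a Bloom filter pre-check followed by a binary search over sorted SHA-256 digests. Everything in it is an assumption made for illustration: the filter size, the number of probes, the way bit positions are derived, and the helper names `build_assets` and `is_listed`. The real on-disk formats of `badwords.bloom` and `badwords.hash.bin` are described in SPEC.md.

```python
import hashlib
from bisect import bisect_left

BLOOM_BITS = 1 << 16   # assumed filter size in bits
NUM_PROBES = 4         # assumed number of bit positions per word


def _digest(word: str) -> bytes:
    return hashlib.sha256(word.encode("utf-8")).digest()


def _bit_positions(digest: bytes) -> list[int]:
    # Assumption: derive each probe from a 4-byte slice of the digest.
    return [
        int.from_bytes(digest[4 * i:4 * i + 4], "big") % BLOOM_BITS
        for i in range(NUM_PROBES)
    ]


def build_assets(words):
    """Admin side: turn raw words into a Bloom bitmap and sorted digests."""
    bloom = bytearray(BLOOM_BITS // 8)
    digests = sorted({_digest(w) for w in words})
    for d in digests:
        for pos in _bit_positions(d):
            bloom[pos // 8] |= 1 << (pos % 8)
    return bytes(bloom), digests


def is_listed(word: str, bloom: bytes, digests: list) -> bool:
    """Runtime side: cheap Bloom rejection, then exact binary search."""
    d = _digest(word)
    if any(not bloom[pos // 8] & (1 << (pos % 8)) for pos in _bit_positions(d)):
        return False  # definitely not in the dictionary
    i = bisect_left(digests, d)
    return i < len(digests) and digests[i] == d


bloom, digests = build_assets(["badword"])
print(is_listed("badword", bloom, digests))
print(is_listed("hello", bloom, digests))
```

In this sketch the Bloom filter rejects most clean words without touching the digest table, and the binary search removes the filter's false positives, so only hashes ever need to be published.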

---

## License

MIT © Saugat
