Metadata-Version: 2.4
Name: transfuzzy
Version: 0.1.1
Summary: TransFuzzy is a robust transliteration system that bridges the gap between Indic scripts and the Latin alphabet.
Author: Goutham
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: black>=26.3.1
Requires-Dist: build>=1.4.2
Requires-Dist: flask>=3.1.3
Requires-Dist: flask-cors>=6.0.2
Requires-Dist: fuzzywuzzy>=0.18.0
Requires-Dist: indic-transliteration>=2.3.81
Requires-Dist: jellyfish>=1.2.1
Requires-Dist: joblib>=1.5.3
Requires-Dist: langdetect>=1.0.9
Requires-Dist: matplotlib>=3.10.8
Requires-Dist: numpy>=2.4.4
Requires-Dist: pandas>=3.0.1
Requires-Dist: python-levenshtein>=0.27.3
Requires-Dist: ruff>=0.15.8
Requires-Dist: scikit-learn==1.5.2
Requires-Dist: scipy>=1.17.1
Requires-Dist: sentence-transformers>=5.3.0
Requires-Dist: twine>=6.2.0
Dynamic: license-file

# 🔤 TransFuzzy

**Multilingual AI-powered name matching: phonetic, semantic, and ML scoring in one CLI tool.**

TransFuzzy matches names across **Indic and Latin scripts** by combining phonetic algorithms, string similarity, and transformer embeddings in a single pipeline.

---

## 🚀 Installation

```bash
pip install transfuzzy
```

---

## ⚡ Usage

### Start API Server

```bash
transfuzzy
# or
transfuzzy serve
```

The server runs at:

```
http://localhost:5000
```

---

### 🔍 CLI Prediction

```bash
transfuzzy predict "Rahul"
```

```bash
transfuzzy predict "Rahul" --top 5
```

```bash
transfuzzy predict "Rahul" --json
```

---

### Example Output

```
🔍 Similar names:

1. Rahul
2. Raahul
3. Rahool
4. Rahil
```

---

## 🌐 Supported Languages

* English (Latin)
* Hindi (Devanagari)
* Telugu
* Tamil
* Kannada
* Malayalam
* Gujarati
* Punjabi (Gurmukhi)

Names can be entered in any supported script, for example:

```
"Rahul"
"राहुल"
"రాహుల్"
```
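To illustrate how input in different scripts might be told apart, here is a minimal sketch that classifies a name by the Unicode block of its characters. The `detect_script` helper and the `SCRIPT_RANGES` table are hypothetical names used for illustration, not part of the TransFuzzy API; the package's own detection may work differently.

```python
# Unicode block ranges for the Indic scripts listed above.
SCRIPT_RANGES = {
    "Devanagari": (0x0900, 0x097F),
    "Gurmukhi": (0x0A00, 0x0A7F),
    "Gujarati": (0x0A80, 0x0AFF),
    "Tamil": (0x0B80, 0x0BFF),
    "Telugu": (0x0C00, 0x0C7F),
    "Kannada": (0x0C80, 0x0CFF),
    "Malayalam": (0x0D00, 0x0D7F),
}


def detect_script(name: str) -> str:
    """Return the script of the first character that belongs to a known script.

    Hypothetical helper for illustration only.
    """
    for ch in name:
        code = ord(ch)
        for script, (low, high) in SCRIPT_RANGES.items():
            if low <= code <= high:
                return script
        if ch.isascii() and ch.isalpha():
            return "Latin"
    return "Unknown"


for sample in ("Rahul", "राहुल", "రాహుల్"):
    print(sample, "->", detect_script(sample))
```

Checking only the first recognizable character keeps the sketch short; mixed-script input would need a per-character count instead.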

---

## 🧠 How It Works

```
Input Name
   ↓
Script Detection → Transliteration
   ↓
Candidate Filtering (~73k names)
   ↓
Similarity Metrics (8 features)
   ↓
ML Model (Random Forest)
   ↓
Hybrid Scoring
   ↓
Top Matches
```
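The scoring and ranking stages of the diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: it uses a single feature (normalized edit distance) instead of the eight features mentioned, the 50/50 blend weight is invented, and `model_probability` is a stand-in for the trained Random Forest. All function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance computed row by row with dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def edit_similarity(a: str, b: str) -> float:
    """Normalize edit distance to a similarity in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def hybrid_score(similarity: float, probability: float, weight: float = 0.5) -> float:
    """Blend a similarity feature with a model probability (assumed weight)."""
    return weight * similarity + (1.0 - weight) * probability


def rank(query, candidates, model_probability, top=5):
    """Score every candidate and return the best `top` names."""
    scored = []
    for candidate in candidates:
        similarity = edit_similarity(query.lower(), candidate.lower())
        probability = model_probability(query, candidate)
        scored.append((hybrid_score(similarity, probability), candidate))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [candidate for _, candidate in scored[:top]]


# Stand-in for the trained model: a constant probability, so the ranking
# below is driven by edit similarity alone.
names = ["Rahil", "Rohit", "Raahul", "Rahul"]
print(rank("Rahul", names, lambda query, candidate: 0.5, top=3))
```

In a real pipeline the probability would come from the classifier's output for each query and candidate pair, and candidate filtering would shrink the name list before this step runs.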

---

## 📡 API Usage

### POST `/similar_names`

```json
{
  "name": "Rahul"
}
```

Response:

```json
{
  "similar_names": ["Rahul", "Raahul", "Rahool"]
}
```
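A minimal client sketch for this endpoint, using only the Python standard library. It assumes the server is running at `http://localhost:5000` as shown earlier and that it accepts a JSON body with a `Content-Type: application/json` header; `build_request` and `fetch_similar_names` are hypothetical helper names, not part of the package.

```python
import json
import urllib.request


def build_request(name: str, base_url: str = "http://localhost:5000") -> urllib.request.Request:
    """Build the POST request for /similar_names without sending it."""
    payload = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/similar_names",
        data=payload,
        headers={"Content-Type": "application/json"},  # assumed to be required
        method="POST",
    )


def fetch_similar_names(name: str, base_url: str = "http://localhost:5000") -> list:
    """Send the request and return the list under the `similar_names` key."""
    with urllib.request.urlopen(build_request(name, base_url), timeout=10) as response:
        body = json.load(response)
    return body["similar_names"]
```

With the server started via `transfuzzy serve`, calling `fetch_similar_names("Rahul")` should return the list shown in the response above.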

---

## 🏗️ Project Structure

```
src/transfuzzy/
├── cli.py          # CLI entrypoint
├── app.py          # Flask API
├── core/           # ML pipeline
├── dir/            # processing steps
├── db/             # dataset + model
├── utils/          # helpers
├── templates/      # UI
└── static/         # frontend
```

---

## 🧪 Training

```bash
uv run python scripts/enrich.py
uv run python scripts/train.py
```

---

## ⚙️ Development

```bash
git clone https://github.com/your-username/transfuzzy.git
cd transfuzzy
uv sync
uv run transfuzzy
```

---

## ✨ Features

* 🔊 Phonetic matching (Soundex, Metaphone)
* 📐 String similarity (Levenshtein, Jaro-Winkler)
* 🧠 Semantic embeddings (Sentence Transformers)
* 🌲 ML model (Random Forest)
* ⚡ Optimized inference pipeline
* 💻 CLI + API + Web UI

---

## 📄 License

MIT © Goutham

---

## 🔥 Vision

TransFuzzy is designed for real-world systems like:

* KYC verification
* Government databases
* Search & deduplication
* Multilingual identity matching

---

<p align="center">
  Built with ❤️ for AI-powered applications
</p>
