Metadata-Version: 2.4
Name: purrfectkit
Version: 0.2.4
Summary: **PurrfectKit** is a Python library for effortless Retrieval-Augmented Generation (RAG) workflows.
Keywords: rag,nlp,llms,python,ai,ocr,document-processing,multilingual,text-extraction
Author: SUWALUTIONS
Author-email: SUWALUTIONS <suwa@suwalutions.com>
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Topic :: Text Processing :: General
Classifier: Natural Language :: English
Classifier: Natural Language :: Thai
Requires-Dist: python-magic<=0.4.27
Requires-Dist: sentence-transformers<=5.1.0
Requires-Dist: transformers<=4.53.0
Requires-Dist: docling<=2.31.1
Requires-Dist: markitdown<=0.1.1
Requires-Dist: pymupdf4llm<=0.0.27
Requires-Dist: pdf2image<=1.17.0
Requires-Dist: pytesseract<=0.3.13
Requires-Dist: easyocr<=1.7.2
Requires-Dist: surya-ocr<=0.14.0
Requires-Dist: python-doctr<=1.0.0
Requires-Dist: pandas<=2.3.2
Requires-Dist: langchain-text-splitters<=1.0.0
Requires-Dist: tiktoken<=0.12.0
Requires-Dist: ruff<=0.6.0 ; extra == 'dev'
Requires-Dist: mypy<=1.11.0 ; extra == 'dev'
Requires-Dist: pre-commit<=3.8.0 ; extra == 'dev'
Requires-Dist: detect-secrets<=1.5.0 ; extra == 'dev'
Requires-Dist: codecov-cli<=11.2.4 ; extra == 'dev'
Requires-Dist: sphinx<=8.2.3 ; extra == 'docs'
Requires-Dist: sphinx-rtd-theme<=3.0.2 ; extra == 'docs'
Requires-Dist: pytest<=8.4.2 ; extra == 'test'
Requires-Dist: pytest-cov<=7.0.0 ; extra == 'test'
Requires-Dist: pytest-mock<=3.15.1 ; extra == 'test'
Maintainer: KHARAPSY
Maintainer-email: KHARAPSY <kharapsy@suwalutions.com>
Requires-Python: >=3.10
Project-URL: Documentation, https://suwalutions.github.io/PurrfectKit
Project-URL: Issues, https://github.com/SUWALUTIONS/PurrfectKit/issues
Project-URL: Repository, https://github.com/SUWALUTIONS/PurrfectKit
Provides-Extra: dev
Provides-Extra: docs
Provides-Extra: test
Description-Content-Type: text/markdown

![PurrfectMeow Logo](https://github.com/suwalutions/PurrfectKit/blob/meow/docs/_static/repo-logo.png)

# PurrfectKit

[![Python 3.10–3.13](https://img.shields.io/badge/python-3.10–3.13-blue)](https://www.python.org)
[![PyPI](https://img.shields.io/pypi/v/purrfectkit?color=gold&label=PyPI)](https://pypi.org/project/purrfectkit/)
[![Downloads](https://img.shields.io/pypi/dm/purrfectkit?color=purple)](https://pypistats.org/packages/purrfectkit)
[![codecov](https://codecov.io/github/suwalutions/PurrfectKit/branch/meow/graph/badge.svg?token=Z6YETHJXCL)](https://codecov.io/github/suwalutions/PurrfectKit)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![Docker](https://img.shields.io/docker/v/suwalutions/purrfectkit?label=docker)](https://ghcr.io/suwalutions/purrfectkit)
[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)


**PurrfectKit** is your all-in-one, dependency-smart, configuration-friendly toolkit that turns even the most advanced Retrieval-Augmented Generation (RAG) workflows into a smooth, beginner-friendly experience.


🧩 5 Cats Will Lead You To The Purrfect Way.

🐱 **Suphalak** – Seamlessly reads and loads content from files.

🐱 **Malet** – Splits content into high-quality, model-friendly chunks.

🐱 **WichienMaat** – Embeds chunks into powerful vector representations.

🐱 **KhaoManee** – Searches and retrieves the most relevant vectors.

🐱 **Kornja** – Generates final responses enriched by retrieved knowledge (Under Development).
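
To make the four implemented stages concrete, here is a minimal, library-free sketch of the same load, chunk, embed, and search flow. It is only an illustration: the function names (`load`, `chunk`, `embed`, `cosine`, `search`) are hypothetical, the "embedding" is a toy bag-of-words count rather than a neural model, and none of this is PurrfectKit's actual implementation.

```python
import math
from collections import Counter


def load(text: str) -> str:
    """Stage 1 (loading): collapse runs of whitespace in raw text."""
    return " ".join(text.split())


def chunk(text: str, size: int) -> list[str]:
    """Stage 2 (chunking): group words into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Stage 3 (embedding): toy bag-of-words vector (assumption: stands in for a real model)."""
    return Counter(word.strip(".,;:!?") for word in text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], top_k: int) -> list[str]:
    """Stage 4 (retrieval): rank chunks by similarity to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


raw = "Siamese cats are vocal.  Persians have long fur.\nBengal cats love water."
chunks = chunk(load(raw), size=4)
print(search("long fur", chunks, top_k=1))
```

In a real workflow each toy function would be replaced by the corresponding cat: Suphalak for loading, Malet for chunking, WichienMaat for embedding, and KhaoManee for searching.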


> **_NOTE:_** The Thai cat-themed naming isn’t just cute—it makes learning and remembering the RAG process surprisingly fun and intuitive.


Whether you're a student, researcher, hobbyist, or production-level engineer, this toolkit gives you a clean, guided workflow that “**just works**”.

## Quickstart

PurrfectKit aims to be plug-and-play, but a few lightweight system tools are required.

### Prerequisites

#### Linux (Ubuntu / Debian)

    # Install Python (if not already installed)
    sudo apt update
    sudo apt install -y python3 python3-pip
    
    # Install Tesseract OCR
    sudo apt install -y tesseract-ocr tesseract-ocr-tha
    
    # Install FFmpeg
    sudo apt install -y ffmpeg
    
    # Install libmagic
    sudo apt install -y libmagic1

#### macOS

    # Install Homebrew if missing
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    
    # Install Python
    brew install python
    
    # Install Tesseract OCR
    brew install tesseract
    
    # Install FFmpeg
    brew install ffmpeg
    
    # Install libmagic
    brew install libmagic

#### Windows

**Install Python**

Download the installer from the official website: [https://www.python.org/downloads/](https://www.python.org/downloads/)

✔ Make sure to check “Add Python to **PATH**” during installation.

**Install Tesseract OCR**

Download the Windows installer: [https://github.com/UB-Mannheim/tesseract/wiki](https://github.com/UB-Mannheim/tesseract/wiki)

✔ Make sure to add the installation path to your **System PATH**.
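
Once the prerequisites are installed, a quick check like the following can confirm that they are visible to Python. This is a sketch using only the standard library; the helper name `find_missing_tools` is hypothetical, and the library lookup for libmagic is a best-effort check that may behave differently across platforms.

```python
import ctypes.util
import shutil


def find_missing_tools(tools: list[str]) -> list[str]:
    """Return the executables that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Tesseract and FFmpeg are command-line programs, so they are looked up on PATH.
missing = find_missing_tools(["tesseract", "ffmpeg"])

# libmagic is a shared library rather than an executable, so it is looked up
# through the system's library search instead (best effort).
if ctypes.util.find_library("magic") is None:
    missing.append("libmagic")

if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All prerequisites were found.")
```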



### Installation
```bash
pip install purrfectkit
```

### Usage
```python
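from purrfectmeow.meow.felis import DocTemplate, MetaFile
from purrfectmeow import Suphalak, Malet, WichienMaat, KhaoManee

file_path = 'test/test.pdf'

# Read metadata for the file.
metadata = MetaFile.get_metadata(file_path)

# Suphalak: read the file content.
with open(file_path, 'rb') as f:
    content = Suphalak.reading(f, 'test.pdf')

# Malet: split the content into token-based chunks.
chunks = Malet.chunking(content, chunk_method='token', chunk_size='500', chunk_overlap='25')

# Build document templates from the chunks and the metadata.
docs = DocTemplate.create_template(chunks, metadata)

# WichienMaat: embed the chunks and the query ("ทดสอบ" is Thai for "test").
embedding = WichienMaat.embedding(chunks)
query = WichienMaat.embedding("ทดสอบ")

# KhaoManee: search the chunk embeddings with the query embedding.
results = KhaoManee.searching(query, embedding, docs, 2)
print(results)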
```
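
The usage example passes a chunk size of 500 and an overlap of 25. As a rough illustration of what such settings usually mean, the sketch below computes the token ranges of a generic sliding window in which each chunk starts `chunk_size - chunk_overlap` tokens after the previous one. The function `chunk_windows` is hypothetical and shows the general idea only; Malet's actual splitting logic may differ.

```python
def chunk_windows(n_tokens: int, chunk_size: int, chunk_overlap: int) -> list[tuple[int, int]]:
    """Return (start, end) token offsets for overlapping chunks over n_tokens tokens."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new chunk advances
    windows = []
    start = 0
    while start < n_tokens:
        end = min(start + chunk_size, n_tokens)
        windows.append((start, end))
        if end == n_tokens:
            break  # the last chunk reached the end of the document
        start += step
    return windows


# A 1200-token document with 500-token chunks and a 25-token overlap.
print(chunk_windows(1200, 500, 25))
# → [(0, 500), (475, 975), (950, 1200)]
```

Each pair of neighbouring chunks shares 25 tokens, which helps keep sentences that straddle a boundary retrievable from either side.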

## License

PurrfectKit is released under the [MIT License](LICENSE).
