Metadata-Version: 2.4
Name: purrfectkit
Version: 0.2.8
Summary: **PurrfectKit** is a Python library for effortless Retrieval-Augmented Generation (RAG) workflows.
Keywords: rag,nlp,llms,python,ai,ocr,document-processing,multilingual,text-extraction
Author: SUWALUTIONS
Author-email: SUWALUTIONS <suwa@suwalutions.com>
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Topic :: Text Processing :: General
Classifier: Natural Language :: English
Classifier: Natural Language :: Thai
Requires-Dist: python-magic>=0.4.27
Requires-Dist: pytesseract>=0.3.13
Requires-Dist: pillow>=10.4.0
Requires-Dist: sentence-transformers==5.2.0
Requires-Dist: pandas>=2.3.3
Requires-Dist: pdf2image>=1.17.0
Requires-Dist: pymupdf4llm>=0.2.9
Requires-Dist: markitdown[all]>=0.1.4
Requires-Dist: easyocr>=1.7.2
Requires-Dist: python-doctr>=1.0.0
Requires-Dist: typhoon-ocr>=0.4.1
Requires-Dist: tiktoken>=0.12.0
Requires-Dist: langchain-text-splitters>=1.1.0
Requires-Dist: ollama>=0.6.1
Requires-Dist: openai>=2.15.0
Requires-Dist: docling>=2.68
Requires-Dist: surya-ocr>=0.17.0
Requires-Dist: ruff>=0.14.11 ; extra == 'dev'
Requires-Dist: mypy>=1.19.1 ; extra == 'dev'
Requires-Dist: types-pyyaml>=6.0.12 ; extra == 'dev'
Requires-Dist: pre-commit>=4.5.1 ; extra == 'dev'
Requires-Dist: detect-secrets>=1.5.0 ; extra == 'dev'
Requires-Dist: codecov-cli>=11.2.6 ; extra == 'dev'
Requires-Dist: sphinx<=9.0.0 ; extra == 'docs'
Requires-Dist: sphinx-rtd-theme>=3.1.0 ; extra == 'docs'
Requires-Dist: pytest>=9.0.2 ; extra == 'test'
Requires-Dist: pytest-cov>=7.0.0 ; extra == 'test'
Requires-Dist: pytest-mock>=3.15.1 ; extra == 'test'
Maintainer: KHARAPSY
Maintainer-email: KHARAPSY <kharapsy@suwalutions.com>
Requires-Python: >=3.10
Project-URL: Documentation, https://suwalutions.github.io/PurrfectKit
Project-URL: Repository, https://github.com/SUWALUTIONS/PurrfectKit
Project-URL: Issues, https://github.com/SUWALUTIONS/PurrfectKit/issues
Provides-Extra: dev
Provides-Extra: docs
Provides-Extra: test
Description-Content-Type: text/markdown

![PurrfectMeow Logo](https://github.com/suwalutions/PurrfectKit/blob/meow/docs/_static/repo-logo.png?raw=true)

# PurrfectKit

[![Python 3.10–3.13](https://img.shields.io/badge/python-3.10–3.13-blue)](https://www.python.org)
[![PyPI](https://img.shields.io/pypi/v/purrfectkit?color=gold&label=PyPI)](https://pypi.org/project/purrfectkit/)
[![Downloads](https://img.shields.io/pypi/dm/purrfectkit?color=purple)](https://pypistats.org/packages/purrfectkit)
[![codecov](https://codecov.io/github/suwalutions/PurrfectKit/branch/meow/graph/badge.svg?token=Z6YETHJXCL)](https://codecov.io/github/suwalutions/PurrfectKit)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![Docker](https://img.shields.io/docker/v/suwalutions/purrfectkit?label=docker)](https://ghcr.io/suwalutions/purrfectkit)
[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)


**PurrfectKit** is your all-in-one, dependency-smart, configuration-friendly toolkit that turns even the most advanced Retrieval-Augmented Generation (RAG) workflows into a smooth, beginner-friendly experience.


🧩 5 Cats Will Lead You To The Purrfect Way.    

🐱 **Suphalak** – Seamlessly reads and loads content from files.

🐱 **Malet** – Splits content into high-quality, model-friendly chunks.

🐱 **WichienMaat** – Embeds chunks into powerful vector representations.

🐱 **KhaoManee** – Searches and retrieves the most relevant vectors.

🐱 **Kornja** – Generates final responses enriched by retrieved knowledge (Under Development).


> **_NOTE:_** The Thai cat-themed naming isn’t just cute—it makes learning and remembering the RAG process surprisingly fun and intuitive.
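
To make the embed-and-search stages above more concrete, here is a minimal, self-contained sketch. It is not PurrfectKit's implementation: the `embed`, `cosine`, and `search` helpers are hypothetical names, and a bag-of-words count vector stands in for a real embedding model. It only illustrates the general idea of ranking chunks by similarity to a query.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Rank chunks by similarity to the query and keep the best top_k."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


chunks = [
    "siamese cats have blue eyes",
    "vector search ranks chunks by similarity",
    "ocr extracts text from scanned pages",
]
results = search("how does vector search rank chunks", chunks, top_k=1)
print(results[0][1])
```

A real pipeline would swap the toy `embed` for a dense embedding model, but the ranking step keeps the same shape.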


Whether you're a student, researcher, hobbyist, or production-level engineer, this toolkit gives you a clean, guided workflow that “**just works**”.

## Quickstart

PurrfectKit aims to be plug-and-play, but a few lightweight system tools are required.

### Prerequisites

#### Linux (Ubuntu / Debian)

    # Install Python (if not already)
    sudo apt update
    sudo apt install -y python3 python3-pip
    
    # Install Tesseract OCR
    sudo apt install -y tesseract-ocr tesseract-ocr-tha
    
    # Install FFmpeg
    sudo apt install -y ffmpeg
    
    # Install libmagic
    sudo apt install -y libmagic1

#### macOS

    # Install Homebrew if missing
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
    
    # Install Python
    brew install python
    
    # Install Tesseract OCR
    brew install tesseract
    
    # Install FFmpeg
    brew install ffmpeg
    
    # Install libmagic
    brew install libmagic

#### Windows

**Install Python.** Download the installer from the official website: [https://www.python.org/downloads/](https://www.python.org/downloads/)

✔ Make sure to check “Add Python to **PATH**” during installation.

**Install Tesseract OCR.** Download the Windows installer: [https://github.com/UB-Mannheim/tesseract/wiki](https://github.com/UB-Mannheim/tesseract/wiki)

✔ Make sure to add the installation path to your **System PATH**.
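
Once the system tools are installed, a quick check like the one below may help confirm they are discoverable. This is a sketch using only the Python standard library; `check_prerequisites` is a hypothetical helper, not part of PurrfectKit. It assumes the executables are named `tesseract` and `ffmpeg` and that the shared library is registered as `magic`, which may not hold on every platform (Windows in particular).

```python
import shutil
from ctypes.util import find_library


def check_prerequisites() -> dict[str, bool]:
    """Report whether each system dependency can be located."""
    return {
        # Executables are looked up on PATH.
        "tesseract": shutil.which("tesseract") is not None,
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # libmagic is a shared library, so ask the system loader instead.
        "libmagic": find_library("magic") is not None,
    }


status = check_prerequisites()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

If an entry is reported as missing right after installation, opening a new terminal so the updated PATH is picked up is usually the first thing to try.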



### Installation
```bash
pip install purrfectkit
```

### Usage
```python
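# Note: the distribution is named purrfectkit, but the import package is purrfectmeow.
from purrfectmeow.meow.felis import DocTemplate, MetaFile
from purrfectmeow import Suphalak, Malet, WichienMaat, KhaoManee

file_path = 'test/test.pdf'

# Collect file-level metadata to attach to the chunks later.
metadata = MetaFile.get_metadata(file_path)

# Suphalak: read the file contents.
with open(file_path, 'rb') as f:
    content = Suphalak.reading(f, 'test.pdf')

# Malet: split the content into token-based chunks.
chunks = Malet.chunking(content, chunk_method='token', chunk_size='500', chunk_overlap='25')
docs = DocTemplate.create_template(chunks, metadata)

# WichienMaat: embed the chunks and the query ("ทดสอบ" is Thai for "test").
embedding = WichienMaat.embedding(chunks)
query = WichienMaat.embedding("ทดสอบ")

# KhaoManee: search the chunk embeddings with the query embedding.
results = KhaoManee.searching(query, embedding, docs, 2)
print(results)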

```
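
The `chunk_size` and `chunk_overlap` arguments in the example above suggest sliding-window chunking, where consecutive chunks share some tokens. The sketch below shows that general technique under stated assumptions: `chunk_tokens` is a hypothetical helper, whitespace-separated words stand in for real tokenizer output, and the sizes are plain integers. It is not how `Malet.chunking` is implemented.

```python
def chunk_tokens(tokens: list[str], chunk_size: int, chunk_overlap: int) -> list[list[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Whitespace "tokens" as a stand-in for a real tokenizer.
tokens = "a b c d e f g h i j".split()
windows = chunk_tokens(tokens, chunk_size=4, chunk_overlap=1)
for window in windows:
    print(window)
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of some duplicated tokens.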

## License

PurrfectKit is released under the [MIT License](LICENSE).
