Metadata-Version: 2.4
Name: purrfectkit
Version: 0.2.3
Summary: PurrfectKit is a Python library for effortless Retrieval-Augmented Generation (RAG) workflows.
Keywords: rag,nlp,llms,python,ai,ocr,document-processing,multilingual,text-extraction
Author: SUWALUTIONS
Author-email: SUWALUTIONS <suwa@suwalutions.com>
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Topic :: Text Processing :: General
Classifier: Natural Language :: English
Classifier: Natural Language :: Thai
Requires-Dist: python-magic<=0.4.27
Requires-Dist: sentence-transformers<=5.1.0
Requires-Dist: transformers<=4.53.0
Requires-Dist: docling<=2.31.1
Requires-Dist: markitdown<=0.1.1
Requires-Dist: pymupdf4llm<=0.0.27
Requires-Dist: pdf2image<=1.17.0
Requires-Dist: pytesseract<=0.3.13
Requires-Dist: easyocr<=1.7.2
Requires-Dist: surya-ocr<=0.14.0
Requires-Dist: python-doctr<=1.0.0
Requires-Dist: pandas<=2.3.2
Requires-Dist: langchain-text-splitters<=1.0.0
Requires-Dist: tiktoken<=0.12.0
Requires-Dist: ruff<=0.6.0 ; extra == 'dev'
Requires-Dist: mypy<=1.11.0 ; extra == 'dev'
Requires-Dist: pre-commit<=3.8.0 ; extra == 'dev'
Requires-Dist: detect-secrets<=1.5.0 ; extra == 'dev'
Requires-Dist: codecov-cli<=11.2.4 ; extra == 'dev'
Requires-Dist: sphinx<=8.2.3 ; extra == 'docs'
Requires-Dist: sphinx-rtd-theme<=3.0.2 ; extra == 'docs'
Requires-Dist: pytest<=8.4.2 ; extra == 'test'
Requires-Dist: pytest-cov<=7.0.0 ; extra == 'test'
Requires-Dist: pytest-mock<=3.15.1 ; extra == 'test'
Maintainer: KHARAPSY
Maintainer-email: KHARAPSY <kharapsy@suwalutions.com>
Requires-Python: >=3.10
Project-URL: Documentation, https://suwalutions.github.io/PurrfectKit
Project-URL: Issues, https://github.com/SUWALUTIONS/PurrfectKit/issues
Project-URL: Repository, https://github.com/SUWALUTIONS/PurrfectKit
Provides-Extra: dev
Provides-Extra: docs
Provides-Extra: test
Description-Content-Type: text/markdown

![PurrfectMeow Logo](https://github.com/suwalutions/PurrfectKit/blob/meow/docs/_static/repo-logo.png)

# PurrfectKit

[![Python 3.10–3.13](https://img.shields.io/badge/python-3.10–3.13-blue)](https://www.python.org)
[![PyPI](https://img.shields.io/pypi/v/purrfectkit?color=gold&label=PyPI)](https://pypi.org/project/purrfectkit/)
[![Downloads](https://img.shields.io/pypi/dm/purrfectkit?color=purple)](https://pypistats.org/packages/purrfectkit)
[![codecov](https://codecov.io/github/suwalutions/PurrfectKit/branch/meow/graph/badge.svg?token=Z6YETHJXCL)](https://codecov.io/github/suwalutions/PurrfectKit)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
[![Docker](https://img.shields.io/docker/v/suwalutions/purrfectkit?label=docker)](https://ghcr.io/suwalutions/purrfectkit)
[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

**PurrfectKit** is a toolkit that breaks Retrieval-Augmented Generation (RAG) down into five steps:
1. Suphalak - read content from files
2. Malet - split content into chunks
3. WichienMaat - embed chunks into vectors
4. KhaoManee - search vectors with queries
5. Kornja - generate answers from vectors

> **_NOTE:_** Each step is named after a Thai cat breed, which keeps the workflow memorable and fun.
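
As a rough illustration of the five steps listed above, here is a minimal, standard-library-only sketch. It does not use or reflect PurrfectKit's implementation: the function names (`read_text`, `chunk_words`, `embed`, `search`, `build_prompt`) and the sample text are hypothetical, bag-of-words counts stand in for a real embedding model, and the last step only assembles a prompt instead of calling a language model.

```python
import math
import re
from collections import Counter


def read_text(raw: bytes) -> str:
    """Step 1 (read): decode raw file bytes into text."""
    return raw.decode("utf-8")


def chunk_words(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Step 2 (split): sliding window over words with a fixed overlap."""
    step = chunk_size - chunk_overlap
    if step <= 0:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    # Stop early enough that the last window is not just a repeat of the overlap.
    last_start = max(len(words) - chunk_overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text: str) -> Counter:
    """Step 3 (embed): term counts as a toy stand-in for an embedding vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], top_k: int) -> list[str]:
    """Step 4 (search): rank chunks by similarity to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Step 5 (generate): assemble the prompt a language model would answer."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


raw = (
    b"Siamese cats are vocal and social. "
    b"Korat cats have a silver-blue coat. "
    b"Khao Manee cats are pure white."
)
question = "Which cat has a silver-blue coat?"
chunks = chunk_words(read_text(raw), chunk_size=6, chunk_overlap=0)
top = search(question, chunks, top_k=1)
print(build_prompt(question, top))
```

In a real pipeline each toy function would be replaced by the corresponding PurrfectKit step; the point here is only the data flow from bytes to chunks to vectors to ranked context.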

## Quickstart

### Prerequisites
- Python 3.10 or later
- Tesseract OCR (the engine that the `pytesseract` dependency calls)


### Installation
```bash
pip install purrfectkit
```

### Usage
```python
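from purrfectmeow.meow.felis import DocTemplate, MetaFile
from purrfectmeow import Suphalak, Malet, WichienMaat, KhaoManee

file_path = 'test/test.pdf'

# Collect file metadata for the document template
metadata = MetaFile.get_metadata(file_path)

# 1. Suphalak: read the file content
with open(file_path, 'rb') as f:
    content = Suphalak.reading(f, 'test.pdf')

# 2. Malet: split the content into token-based chunks
chunks = Malet.chunking(content, chunk_method='token', chunk_size='500', chunk_overlap='25')
docs = DocTemplate.create_template(chunks, metadata)

# 3. WichienMaat: embed the chunks and the query ("ทดสอบ" is Thai for "test")
embedding = WichienMaat.embedding(chunks)
query = WichienMaat.embedding("ทดสอบ")

# 4. KhaoManee: search the chunk embeddings with the query embedding
results = KhaoManee.searching(query, embedding, docs, 2)
print(results)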
```

## License

PurrfectKit is released under the [MIT License](LICENSE).
