Metadata-Version: 2.4
Name: quaynor
Version: 2.0.0
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
License-File: LICENSE
Summary: Local GGUF inference, tool calling, and streaming chat for Python (Quaynor bindings).
Keywords: AI,LLM,Large Language Models,Inference,Tool calling,Local LLM,Local Inference,On-device AI,SLM,Edge AI,Realtime LLM,Sustainable AI,Inference engine,Inference library,Local AI
Author: Quaynor
License: MIT
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Documentation, https://www.quaynor.site/
Project-URL: Homepage, https://www.quaynor.site/
Project-URL: Repository, https://github.com/iBz-04/quaynor

# Quaynor

**Run LLMs locally and efficiently on any device**

Quaynor is a lightweight, fast inference engine that makes it simple to run open-source LLMs directly inside your Python applications. 

## Key Features

- **Run locally, offline, for free** - No API keys or cloud services required
- **Fast, simple tool calling** - Just pass normal Python functions
- **Reliable tool execution** - Automatically derives a constraining grammar from each function's signature, so the model emits only valid calls
- **Infinite conversations** - Conversation-aware, preemptive context shifting frees space before the window overflows, preventing mid-conversation crashes
- **GPU accelerated** - Vulkan (Linux/Windows) and Metal (macOS/iOS), where supported
- **Thousands of compatible models** - Works with any LLM in GGUF format
- **Powered by llama.cpp** - Built on the proven [llama.cpp](https://github.com/ggml-org/llama.cpp) engine
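
The tool-calling bullets above can be sketched as follows. This is a minimal illustration, not confirmed API: the `tools=` keyword and the exact call shape are assumptions, so check the documentation for the real interface. The key idea is that the tool is just an ordinary, typed Python function.

```python
# A plain Python function used as a tool. An engine can derive a
# constrained grammar from its name, typed parameters, and docstring.
def get_weather(city: str) -> str:
    """Return a short weather summary for the given city."""
    return f"It is sunny in {city}."

# With a local model present, the function could then be passed to the
# chat directly. The `tools=` keyword below is a hypothetical name used
# for illustration only:
#
#   from quaynor import Chat
#   chat = Chat('./model.gguf')
#   print(chat.ask('What is the weather in Accra?',
#                  tools=[get_weather]).completed())
```

Because the grammar is derived from the signature, anything you would add for a human reader (type hints, a docstring) also helps the model call the function correctly.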

## Quick Start

```python
from quaynor import Chat

chat = Chat('./model.gguf')
response = chat.ask('Hello world?').completed()
print(response)
```

## Installation

```bash
pip install quaynor
```

## Documentation

Full documentation: https://www.quaynor.site/

## License

MIT — see the repository `LICENSE` file.

