Metadata-Version: 2.4
Name: grillyoptimum
Version: 0.1.0
Summary: HuggingFace Optimum-compatible Vulkan backend — optional grilly extension
Author-email: Nicolas Cloutier <ncloutier@grillcheeseai.com>
License: MIT
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: grilly>=0.4.0
Requires-Dist: grillyinference>=0.1.0
Requires-Dist: numpy
Requires-Dist: transformers
Provides-Extra: optimum
Requires-Dist: optimum; extra == "optimum"
Dynamic: license-file

# GrillyOptimum (alpha; not production-ready)

HuggingFace Optimum-compatible Vulkan backend — optional [grilly](https://github.com/grillcheese/grilly) extension.

## Features

- **from_pretrained()** — load Llama-family models directly from the Hugging Face Hub
- **generate()** — HuggingFace-compatible generation interface
- **Pipeline integration** — use with `transformers.pipeline("text-generation")`
- **Vulkan acceleration** — native fp16 inference on RDNA2 GPUs

## Quick Start

```bash
pip install grillyoptimum

# Optional: pull in HuggingFace Optimum via the "optimum" extra
pip install "grillyoptimum[optimum]"
```

```python
from grillyoptimum import VulkanModelForCausalLM
from transformers import AutoTokenizer

model = VulkanModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

# HF-style generation
input_ids = tokenizer.encode("Hello, world!", return_tensors="np")
output = model.generate(input_ids, max_new_tokens=100, temperature=0.7)
print(tokenizer.decode(output[0]))

# Pipeline
from grillyoptimum.pipeline import create_text_generation_pipeline
pipe = create_text_generation_pipeline("meta-llama/Llama-3.2-3B-Instruct")
result = pipe("Explain quantum computing")
print(result[0]["generated_text"])
```

## Requirements

- Python 3.12+
- grilly >= 0.4.0
- grillyinference >= 0.1.0
- numpy
- transformers
- optimum (optional, via the `optimum` extra)

## License

MIT
