Metadata-Version: 2.4
Name: xorfice
Version: 0.1.39
Summary: SOTA Omni-Modal Personal AI Orchestrator & Engine
Author-email: Xoron-Dev <contact@xoron.dev>
Project-URL: Homepage, https://github.com/xoron-dev/xorfice
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: torch>=2.0.0
Requires-Dist: triton
Requires-Dist: transformers
Requires-Dist: fastapi
Requires-Dist: uvicorn
Requires-Dist: pydantic
Requires-Dist: safetensors
Requires-Dist: hf_transfer
Requires-Dist: huggingface_hub
Requires-Dist: rich
Requires-Dist: readchar

# 📦 Xorfice: The SOTA Omni-Modal Orchestration Engine

`xorfice` is the official, high-performance Python package for Xoron-Dev. It is more than a model wrapper: it is a complete inference and agentic orchestration layer designed for the next era of multimodal AI.

## 🚀 Installation

Stable version from PyPI:
```bash
pip install xorfice
```

Development version from source:
```bash
git clone https://gitlab.com/joeycristini56/Xoron-Dev.git
cd Xoron-Dev
pip install -e ./xorfice_pkg
```

---

## 🛠️ The SOTA Orchestrator: XoronEngine

The `XoronEngine` is a heavy-duty, production-ready inference manager that handles the entire lifecycle of the Xoron-Dev model.

### 🌟 Key Capabilities
- **Automatic Weights Management:** Snapshot downloads from HuggingFace Hub with local caching.
- **Multimodal Routing:** Single entry point for Text, Image, Video, and Audio.
- **Dynamic Optimization:** Auto-tunes hardware affinity based on CUDA, VRAM, and NUMA node detection.

### ⚙️ Developer Usage
```python
from xorfice import XoronEngine

# Initialize the engine
# Auto-detects hardware and optimizes for max performance
# Correct model slug: Backup-bdg/Xoron-Dev-MultiMoe
engine = XoronEngine(
    model_path="Backup-bdg/Xoron-Dev-MultiMoe",
    max_vram_experts=4,  # Keep at most 4 of the 8 experts resident in VRAM; offload the rest to CPU
    device="cuda"
)

# Multimodal Generation with Streaming
# The engine natively handles URLs, local paths, and raw tensors
for token in engine.generate(
    prompt="Explain this video and analyze the speaker's tone.",
    videos="https://example.com/demo.mp4",
    audios="path/to/voice.wav",
    stream=True
):
    print(token, end="", flush=True)
```

---

## 🏗️ Native SOTA Optimizations

Xorfice implements elite-level performance features right out of the box:

### 1. MoE Expert Offloading
Our custom `LRUExpertCache` manages Mixture of Experts (MoE) layers dynamically. By keeping only the most frequently used experts in VRAM, we enable **5B+ parameter models to run smoothly on 8GB consumer GPUs**.
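The idea behind LRU expert offloading can be illustrated with a minimal, self-contained sketch. This is *not* the actual `LRUExpertCache` implementation; the class name, the string-placeholder "weights", and the capacity of 4 are illustrative assumptions standing in for real GPU/CPU tensor movement:

```python
from collections import OrderedDict

class LRUExpertCacheSketch:
    """Illustrative LRU policy for MoE expert placement (a sketch, not
    xorfice's real LRUExpertCache): at most `capacity` experts stay
    "resident" (think VRAM); the least-recently-used expert is evicted
    (think offloaded to CPU) when a new one must be loaded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.resident = OrderedDict()  # expert_id -> placeholder weights

    def fetch(self, expert_id: int):
        if expert_id in self.resident:
            # Cache hit: mark this expert as most recently used.
            self.resident.move_to_end(expert_id)
            return self.resident[expert_id]
        if len(self.resident) >= self.capacity:
            # Evict the least-recently-used expert to make room.
            self.resident.popitem(last=False)
        # Simulate loading the expert's weights into VRAM.
        self.resident[expert_id] = f"weights-of-expert-{expert_id}"
        return self.resident[expert_id]

cache = LRUExpertCacheSketch(capacity=4)
for eid in [0, 1, 2, 3, 0, 4]:  # re-touching 0 makes expert 1 the LRU victim
    cache.fetch(eid)
print(sorted(cache.resident))   # -> [0, 2, 3, 4]
```

Because the router in an MoE layer tends to reuse a small working set of experts, this policy keeps hot experts on the GPU while cold ones live in host memory.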

### 2. Paged KV Cache
Inspired by vLLM, xorfice uses **Paged Attention** to manage Key-Value memory. This allows for massively increased throughput and supports long contexts spanning thousands of tokens for long-chain reasoning.
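The bookkeeping side of paged KV memory can be sketched as follows. This is an assumed, simplified model of vLLM-style block tables, not xorfice's actual internals; `BLOCK_SIZE = 16` and the pool size are illustrative:

```python
BLOCK_SIZE = 16  # tokens stored per physical KV block (assumed value)

class PagedKVAllocator:
    """Sketch of paged KV-cache bookkeeping: each sequence maps to a list
    of physical block ids, and blocks are handed out from a shared free
    pool only when a sequence actually crosses a block boundary."""

    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))
        self.block_tables = {}  # seq_id -> [physical block ids]
        self.seq_lens = {}      # seq_id -> tokens written so far

    def append_token(self, seq_id: int) -> None:
        n = self.seq_lens.get(seq_id, 0)
        if n % BLOCK_SIZE == 0:
            # The sequence crossed a block boundary: claim a fresh block.
            self.block_tables.setdefault(seq_id, []).append(self.free_blocks.pop())
        self.seq_lens[seq_id] = n + 1

    def free(self, seq_id: int) -> None:
        # Return all of a finished sequence's blocks to the shared pool.
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)

alloc = PagedKVAllocator(num_blocks=8)
for _ in range(40):                 # 40 tokens need ceil(40 / 16) = 3 blocks
    alloc.append_token(seq_id=0)
print(len(alloc.block_tables[0]))   # -> 3
```

Allocating KV memory in fixed-size pages rather than one contiguous slab per sequence is what eliminates fragmentation and lets many long sequences share a fixed memory budget.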

### 3. Integrated Agentic Memory
Xorfice includes a **FlatFileMemoryManager** that persists user interactions across sessions, allowing Xoron-Dev to "learn" from conversations without full fine-tuning.
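A flat-file session memory of this kind can be sketched in a few lines. The class below is an illustrative stand-in, not the real `FlatFileMemoryManager` (whose on-disk schema may differ); it assumes a simple JSON-lines layout where each turn is appended and a later session reloads the file:

```python
import json
import os
import tempfile

class FlatFileMemorySketch:
    """Illustrative flat-file session memory (assumed behavior): appends
    (role, text) turns to a JSON-lines file and reloads them on startup,
    so conversation context survives across sessions."""

    def __init__(self, path: str):
        self.path = path

    def remember(self, role: str, text: str) -> None:
        # Append-only writes keep persistence cheap and crash-tolerant.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"role": role, "text": text}) + "\n")

    def recall(self) -> list:
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

path = os.path.join(tempfile.mkdtemp(), "memory.jsonl")
mem = FlatFileMemorySketch(path)
mem.remember("user", "My name is Ada.")
mem.remember("assistant", "Nice to meet you, Ada.")

# A fresh instance (i.e. a new session) sees the persisted turns.
print(len(FlatFileMemorySketch(path).recall()))  # -> 2
```

Injecting the recalled turns back into the prompt is what lets the model appear to "learn" across sessions without touching its weights.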

### 4. Zero-Shot Voice Cloning
Using our SOTA `VoiceManager`, you can clone voices instantly by uploading a short 5-second sample. No retraining required—pure latent-space adaptation.

---

## 🎨 Creative Capabilities
Xorfice exposes raw diffusion pipelines through the `engine.generate_image()` and `engine.generate_video()` methods, allowing for **Text-to-Video (T2V)**, **Image-to-Video (I2V)**, **Text-to-Image (T2I)**, **Image-to-Image (I2I)**, and **Video-to-Video (V2V)** workflows.

---

## 🤝 Open Source & Contributing
Interested in pushing the boundaries of SOTA AI? Check out our [Architecture Deep Dive](https://xoron.dev/docs/architecture).

*Xorfice: Powering the next generation of omni-modal agents.*
