Metadata-Version: 2.4
Name: turboprivate-ai
Version: 0.1.4
Summary: Unified platform for self-hosted LLM inference + enterprise safety governance
Project-URL: Homepage, https://github.com/Kubenew/turboprivate-ai
Project-URL: Source, https://github.com/Kubenew/turboprivate-ai
Project-URL: Issues, https://github.com/Kubenew/turboprivate-ai/issues
Author: TurboPrivate AI
License: Apache-2.0
License-File: LICENSE
Requires-Python: >=3.11
Requires-Dist: click>=8.1.0
Requires-Dist: fastapi>=0.111.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: pydantic-settings>=2.3.0
Requires-Dist: pydantic>=2.7.0
Requires-Dist: pyyaml>=6.0.1
Requires-Dist: rich>=13.7.0
Requires-Dist: uvicorn[standard]>=0.29.0
Provides-Extra: all
Requires-Dist: autoawq>=0.2.0; extra == 'all'
Requires-Dist: beautifulsoup4>=4.12.0; extra == 'all'
Requires-Dist: celery>=5.4.0; extra == 'all'
Requires-Dist: chromadb>=0.5.0; extra == 'all'
Requires-Dist: helm-sdk>=0.1.0; extra == 'all'
Requires-Dist: httpx>=0.27.0; extra == 'all'
Requires-Dist: kubernetes>=30.0.0; extra == 'all'
Requires-Dist: opentelemetry-api>=1.25.0; extra == 'all'
Requires-Dist: opentelemetry-instrumentation-fastapi>=0.46b0; extra == 'all'
Requires-Dist: opentelemetry-sdk>=1.25.0; extra == 'all'
Requires-Dist: outlines>=0.0.40; extra == 'all'
Requires-Dist: passlib[bcrypt]>=1.7.4; extra == 'all'
Requires-Dist: pdfplumber>=0.10.0; extra == 'all'
Requires-Dist: pgvector>=0.3.0; extra == 'all'
Requires-Dist: prometheus-client>=0.20.0; extra == 'all'
Requires-Dist: pypdf>=4.2.0; extra == 'all'
Requires-Dist: pytest-asyncio>=0.24.0; extra == 'all'
Requires-Dist: pytest-localserver>=0.9.0; extra == 'all'
Requires-Dist: pytest>=8.0; extra == 'all'
Requires-Dist: python-docx>=1.1.0; extra == 'all'
Requires-Dist: python-jose[cryptography]>=3.3.0; extra == 'all'
Requires-Dist: python-terraform>=0.10.1; extra == 'all'
Requires-Dist: redis>=5.0.0; extra == 'all'
Requires-Dist: ruff>=0.1; extra == 'all'
Requires-Dist: scikit-learn>=1.5.0; extra == 'all'
Requires-Dist: sentence-transformers>=3.0.0; extra == 'all'
Requires-Dist: torch>=2.3.0; extra == 'all'
Requires-Dist: transformers>=4.41.0; extra == 'all'
Requires-Dist: vllm>=0.5.0; extra == 'all'
Provides-Extra: auth
Requires-Dist: passlib[bcrypt]>=1.7.4; extra == 'auth'
Requires-Dist: python-jose[cryptography]>=3.3.0; extra == 'auth'
Provides-Extra: dev
Requires-Dist: httpx>=0.27.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24.0; extra == 'dev'
Requires-Dist: pytest-localserver>=0.9.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.1; extra == 'dev'
Provides-Extra: full
Requires-Dist: autoawq>=0.2.0; extra == 'full'
Requires-Dist: beautifulsoup4>=4.12.0; extra == 'full'
Requires-Dist: celery>=5.4.0; extra == 'full'
Requires-Dist: chromadb>=0.5.0; extra == 'full'
Requires-Dist: helm-sdk>=0.1.0; extra == 'full'
Requires-Dist: kubernetes>=30.0.0; extra == 'full'
Requires-Dist: opentelemetry-api>=1.25.0; extra == 'full'
Requires-Dist: opentelemetry-instrumentation-fastapi>=0.46b0; extra == 'full'
Requires-Dist: opentelemetry-sdk>=1.25.0; extra == 'full'
Requires-Dist: outlines>=0.0.40; extra == 'full'
Requires-Dist: passlib[bcrypt]>=1.7.4; extra == 'full'
Requires-Dist: pdfplumber>=0.10.0; extra == 'full'
Requires-Dist: pgvector>=0.3.0; extra == 'full'
Requires-Dist: prometheus-client>=0.20.0; extra == 'full'
Requires-Dist: pypdf>=4.2.0; extra == 'full'
Requires-Dist: python-docx>=1.1.0; extra == 'full'
Requires-Dist: python-jose[cryptography]>=3.3.0; extra == 'full'
Requires-Dist: python-terraform>=0.10.1; extra == 'full'
Requires-Dist: redis>=5.0.0; extra == 'full'
Requires-Dist: scikit-learn>=1.5.0; extra == 'full'
Requires-Dist: sentence-transformers>=3.0.0; extra == 'full'
Requires-Dist: torch>=2.3.0; extra == 'full'
Requires-Dist: transformers>=4.41.0; extra == 'full'
Requires-Dist: vllm>=0.5.0; extra == 'full'
Provides-Extra: inference
Requires-Dist: autoawq>=0.2.0; extra == 'inference'
Requires-Dist: outlines>=0.0.40; extra == 'inference'
Requires-Dist: torch>=2.3.0; extra == 'inference'
Requires-Dist: transformers>=4.41.0; extra == 'inference'
Requires-Dist: vllm>=0.5.0; extra == 'inference'
Provides-Extra: infra
Requires-Dist: helm-sdk>=0.1.0; extra == 'infra'
Requires-Dist: kubernetes>=30.0.0; extra == 'infra'
Requires-Dist: python-terraform>=0.10.1; extra == 'infra'
Provides-Extra: memory
Requires-Dist: beautifulsoup4>=4.12.0; extra == 'memory'
Requires-Dist: chromadb>=0.5.0; extra == 'memory'
Requires-Dist: pdfplumber>=0.10.0; extra == 'memory'
Requires-Dist: pypdf>=4.2.0; extra == 'memory'
Requires-Dist: python-docx>=1.1.0; extra == 'memory'
Requires-Dist: sentence-transformers>=3.0.0; extra == 'memory'
Provides-Extra: observability
Requires-Dist: opentelemetry-api>=1.25.0; extra == 'observability'
Requires-Dist: opentelemetry-instrumentation-fastapi>=0.46b0; extra == 'observability'
Requires-Dist: opentelemetry-sdk>=1.25.0; extra == 'observability'
Requires-Dist: prometheus-client>=0.20.0; extra == 'observability'
Provides-Extra: safety
Requires-Dist: pgvector>=0.3.0; extra == 'safety'
Requires-Dist: scikit-learn>=1.5.0; extra == 'safety'
Requires-Dist: sentence-transformers>=3.0.0; extra == 'safety'
Provides-Extra: worker
Requires-Dist: celery>=5.4.0; extra == 'worker'
Requires-Dist: redis>=5.0.0; extra == 'worker'
Description-Content-Type: text/markdown

# TurboPrivate AI — Private & Safe Enterprise AI Platform

<p align="center">
  <a href="https://pypi.org/project/turboprivate-ai/"><img src="https://img.shields.io/pypi/v/turboprivate-ai?color=blue&logo=pypi" alt="PyPI version"></a>
  <a href="https://pypi.org/project/turboprivate-ai/"><img src="https://img.shields.io/pypi/pyversions/turboprivate-ai?logo=python" alt="Python versions"></a>
  <a href="https://github.com/Kubenew/turboprivate-ai/actions"><img src="https://img.shields.io/github/actions/workflow/status/Kubenew/turboprivate-ai/ci.yml?branch=main&logo=github" alt="CI status"></a>
  <a href="https://pypi.org/project/turboprivate-ai/"><img src="https://img.shields.io/pypi/dm/turboprivate-ai?logo=pypi" alt="Downloads"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/License-Apache_2.0-blue" alt="License"></a>
  <a href="https://github.com/Kubenew/turboprivate-ai"><img src="https://img.shields.io/github/stars/Kubenew/turboprivate-ai?logo=github" alt="Stars"></a>
</p>

<p align="center">
  <strong>Run powerful LLMs on your own hardware — 40–60% cheaper than public clouds, with built-in enterprise safety & governance.</strong>
</p>

---

## Why TurboPrivate AI?

- **Full data sovereignty** — nothing leaves your infrastructure
- **Dramatic cost reduction** — INT4/AWQ quantization + smart routing
- **Enterprise Safety** — powered by Mythos Safe (defensive evaluation, jailbreak protection, audit)
- **OpenAI compatible** — drop-in replacement for your existing applications
- **One-command deploy** — from bare metal to production in minutes

## Key Features

- **TurboQuant Engine** — State-of-the-art INT4/AWQ quantization with minimal quality loss
- **Mythos Safe** — Multi-layer defensive safety (pre & post-flight gates)
- **Private RAG** — Secure document ingestion and retrieval
- **Full-stack observability** — Prometheus, Grafana, OpenTelemetry
- **Enterprise ready** — RBAC, audit trail, multi-tenancy, compliance support
- **Hardware flexibility** — RTX 4090, A100/H100, or even CPU-only

## Performance (RTX 4090)

| Model | Quant | Tokens/sec | VRAM Usage | Cost vs Groq/AWS |
|---|---|---|---|---|
| Llama 3.1 8B | INT4 | 110+ | ~5.8 GB | **~8x cheaper** |
| Qwen2.5 32B | INT4 | 45+ | ~22 GB | **~6x cheaper** |
| Llama 3.1 70B | INT4 | 18+ | ~48 GB | **~5x cheaper** |

## Quick Start

```bash
# 1. Deploy full stack (K8s)
turbo deploy --provider bare-metal --gpu auto

# 2. Serve model
turbo model serve meta-llama/Llama-3.1-8B --quant int4

# 3. Chat
turbo chat
```

Or use Docker Compose for quick testing:

```bash
docker compose up -d                    # dev
# docker compose -f docker-compose.prod.yml up -d  # production (GPU)
```

## Pricing

| Tier | Price | Best For | Includes |
|---|---|---|---|
| **PoC / Pilot** | €15,000 – €35,000 | 4–8 weeks trial | Deployment, 2 models, training, support |
| **Enterprise License** | €65,000 / year | Single cluster, up to 10 users | Full features, unlimited models, SLA 99.5% |
| **Enterprise Plus** | €120,000 – €180,000 / year | Multiple clusters, 50+ users | Priority support, custom verifiers, SOC2 |
| **Managed Service** | €8,000 – €25,000 / month | No ops team | Fully managed by us |

**Volume discounts** available for 3+ clusters.  
All prices exclude hardware.

Interested in a private demo?  
📅 [Book a 30-min PoC Call](mailto:felix@turboprivate.ai) | ✉️ [Contact Sales](mailto:felix@turboprivate.ai)

## Architecture

```
CLI / SDK / Dashboard
        ↓
   API Gateway (FastAPI · Auth · Rate Limiting)
        ↓
┌─────────────────┐  ┌───────────────────┐
│  Mythos Safe    │  │  TurboQuant INT4  │
│  Verifiers ·    │  │  vLLM/llama.cpp   │
│  Audit Trail    │  │  Inference Engine │
└─────────────────┘  └───────────────────┘
        ↓
   Memory & RAG (TurboMemory · pdf2struct)
        ↓
┌──────────┐ ┌──────────┐ ┌──────────┐
│  K3s     │ │Monitoring│ │ Storage  │
│  Cluster │ │Prom/Graf │ │ PG/Redis │
└──────────┘ └──────────┘ └──────────┘
```

## Demo

<p align="center">
  <img src="https://raw.githubusercontent.com/Kubenew/turboprivate-ai/main/demo/turboprivate-demo.gif" alt="TurboPrivate AI deployment demo" width="100%">
</p>

## Documentation

- [Architecture](docs/ARCHITECTURE.md) — Full system design
- [Deployment](docs/DEPLOYMENT.md) — Production deployment guide
- [CLI Reference](turbo/cli.py) — All CLI commands
- [API Reference](turbo/api/main.py) — FastAPI routes
- [Safety Gate](turbo/safety/gate.py) — Verifier configuration
- [Demo Assets](demo/) — GIF recording tape + deploy script

## Changelog

### 0.1.4 (2026-05-13)
- Production-hardened Helm charts (configmap, ingress, services templates)
- Enhanced rate limiter with token bucket algorithm + per-route limits
- Improved safety gate middleware with pre/post-flight hook chain
- Realtime metrics visualization in dashboard endpoint
- TurboQuant v3 quantization pipeline: AWQ + INT4 mixed-precision
- Backup/restore CLI with age-encrypted snapshots
- K3s provisioner with multi-node discovery + node labels
- vLLM backend: speculative decoding toggle + prefix caching
- llama.cpp backend: flash attention + GPU offloading
- Worker refinements: quantize retry, eval timeout, ingestion dedup
- CLI enhancements: model status, deploy progress, backup summary
- PII detector regex expansion (passport, SSN, phone variants)
- Vulnerability verifier: CVE-2025 scoring + dependency jail status
- PDF/image ingestion with OCR fallback in RAG pipeline

### 0.1.3 (2026-05-13)
- Extended demo GIF to 61s with 5-scene animation (intro, deploy, serve+chat, safety block, dashboard)
- Switched README GIF to absolute GitHub raw URL for PyPI rendering

### 0.1.2 (2026-05-11)
- Enterprise-ready README with pricing table and benchmarks
- Added docs/ARCHITECTURE.md with system design diagrams
- Added docs/DEPLOYMENT.md with production deployment guide
- Added examples/ with HTTP, safety, RAG, and quantization samples
- Added .env.example with all configuration options
- Added benchmarks/ with RTX 4090 performance results
- Switched license from MIT to Apache 2.0
- Added `turbo doctor` CLI command for system health checks
- Added GitHub Actions Docker build workflow
- Updated pyproject.toml with `full` install extra

### 0.1.1 (2026-05-11)
- Migrated to hatchling build system
- Fixed missing `InferenceEngine` import in `turbo.inference`
- Fixed `TracerProvider` bug in OpenTelemetry instrumentation
- Added structured logging to all exception handlers
- Consolidated Celery workers into shared `worker.celery_app`
- Added CI workflow with ruff linting + pytest
- Improved graceful shutdown (audit trail flush)
- Updated dependencies (replaced `unstructured` with actual used libs)

## License

Apache 2.0 — see [LICENSE](LICENSE).

---

<p align="center">
  Built by <a href="https://github.com/Kubenew">Kubenew</a> — ex-HPE engineer, 12+ years enterprise infrastructure
</p>
