Metadata-Version: 2.4
Name: kvict
Version: 1.2.0
Summary: KVict: KV cache eviction CLI, advisor service, and subscription stretching proxy for LLM APIs
Author: KVict Team
License: Apache-2.0
Keywords: llm,kv-cache,eviction,vllm,fastapi,observability
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.24.0
Requires-Dist: pydantic<3.0.0,>=2.5.0
Requires-Dist: httpx<1.0.0,>=0.26.0
Requires-Dist: prometheus-client<1.0.0,>=0.19.0
Requires-Dist: typing-extensions<5.0.0,>=4.9.0
Requires-Dist: python-dotenv<2.0.0,>=1.0.0
Requires-Dist: keyring<25.0.0,>=24.0.0
Requires-Dist: nvidia-ml-py3>=7.352.0
Requires-Dist: pyyaml<7.0.0,>=6.0.0
Provides-Extra: advisor
Requires-Dist: fastapi<1.0.0,>=0.109.0; extra == "advisor"
Requires-Dist: uvicorn[standard]<1.0.0,>=0.27.0; extra == "advisor"
Provides-Extra: proxy
Requires-Dist: tiktoken<1.0.0,>=0.5.0; extra == "proxy"
Requires-Dist: aiosqlite<1.0.0,>=0.19.0; extra == "proxy"
Requires-Dist: fastapi<1.0.0,>=0.109.0; extra == "proxy"
Requires-Dist: uvicorn[standard]<1.0.0,>=0.27.0; extra == "proxy"
Provides-Extra: vllm
Requires-Dist: vllm>=0.6.0; extra == "vllm"
Provides-Extra: dev
Requires-Dist: pytest<8.0.0,>=7.4.0; extra == "dev"
Requires-Dist: pytest-asyncio<1.0.0,>=0.23.0; extra == "dev"
Requires-Dist: pytest-cov<5.0.0,>=4.1.0; extra == "dev"
Requires-Dist: black>=23.7.0; extra == "dev"
Requires-Dist: flake8>=6.1.0; extra == "dev"
Requires-Dist: mypy>=1.5.0; extra == "dev"

# KVict - AI Inference Platform for Everyone

**KV cache & GPU cost/SLA optimizer for vLLM** — maximize usage limits and cut inference cost with a drop-in plugin.

> **👉 New to KVict?** 
> - **vLLM Users**: [vLLM Quick Start](docs/guides/vllm-quickstart.md) ⚡ Zero-config, 3-line setup
> - **Individual/Team**: [Consumer Quick Start](apps/landing/README.md) | [Landing Page](docs/product/LANDING_PAGE.md)
> - **Enterprise/B2B**: [B2B Integration Guide](docs/b2b/B2B_QUICK_START.md) | [Docs Index](docs/index/DOCUMENTATION_INDEX.md)

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![React 18+](https://img.shields.io/badge/react-18+-blue.svg)](https://react.dev/)
[![FastAPI](https://img.shields.io/badge/fastapi-0.100+-green.svg)](https://fastapi.tiangolo.com/)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

**KVict** serves two audiences:

- **Individual & teams:** Usage optimization — **3.6x more value** from your token budget with our Hybrid Optimization Engine — **0% contradictions**, **5x token efficiency**, **100% consistency**. Perfect for developers, students, and startups.
- **Platform & enterprise:** KV cache & GPU cost/SLA optimizer for vLLM — **43% cost reduction**, **3.0x throughput**, **54.3% P99 improvement**, **98%+ SLA compliance**. Perfect for platform teams, ML engineers, and infrastructure owners.

---

## 🚀 Quick Start (vLLM Users)

**Zero-configuration setup with automatic GPU detection:**

```python
from caae.vllm_plugin import create_optimized_llm

# One line - auto-detects GPU, applies optimizations
llm = create_optimized_llm("meta-llama/Llama-2-70b-hf")
outputs = llm.generate(["Hello!"], max_tokens=50)
```

**That's it!** The plugin automatically detects your GPU memory and PCIe bandwidth, then applies validated optimizations. See [vLLM Quick Start Guide](docs/guides/vllm-quickstart.md) for full details.

**Verify your setup:**
```bash
pip install kvict[vllm]
kvict vllm setup    # Auto-detect GPU and create config
kvict vllm verify   # Health check
```

---

## 🎯 What is KVict?

**KVict** is an AI inference platform that optimizes usage and cost in two ways:

### For Individual & Teams (usage optimization)
- **3.6x more requests** per token budget (free tier: 99 → 365 requests/month)
- **0% contradiction rate** - consistent, reliable answers every time
- **5x token efficiency** - get more done with the same budget
- **100% consistency** - same question, same answer, always
- **Tier-aware optimization** - automatically optimized for your plan

### Hybrid Optimization Engine
Our proprietary hybrid strategy combines:
- **Answer Caching** - Instant responses for common queries (85% token savings)
- **Confidence-Based Selection** - Always uses the best answer
- **Adaptive Prompting** - Optimized prompts based on your tier
- **Response Compression** - Smart compression without quality loss
- **Priority-Based Allocation** - High-value requests get priority
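
As an illustration, the answer-caching strategy above can be sketched as a normalized-prompt lookup. This is a minimal sketch; the class name and normalization scheme here are illustrative, not the actual KVict API:

```python
import hashlib

class AnswerCache:
    """Toy answer cache: identical (normalized) prompts reuse a stored answer."""

    def __init__(self):
        self._store = {}

    def _key(self, prompt: str) -> str:
        # Normalize whitespace and case so trivially different prompts
        # ("What is X?" vs "what  is x?") hit the same cache entry.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode()).hexdigest()

    def get(self, prompt: str):
        return self._store.get(self._key(prompt))

    def put(self, prompt: str, answer: str):
        self._store[self._key(prompt)] = answer

cache = AnswerCache()
cache.put("What is a KV cache?", "A store of attention keys and values...")
hit = cache.get("what  is a KV cache?")  # cache hit: no tokens spent
```

A hit returns the stored answer immediately, which is where the bulk of the claimed token savings on repeated queries would come from.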

### For vLLM Clusters (GPU cost & SLA)
- **3.0x throughput** via NVLink-aware microsharding
- **64.3% memory reduction** via global KV fabric pooling
- **54.3% P99 latency improvement** validated on production traces
- **75% memory reduction** for MoE workloads
- **43% cost reduction** while maintaining 98%+ SLA compliance

**Perfect for:** Platform teams, ML engineers, and infrastructure owners running high-QPS vLLM clusters. See [Enterprise Positioning](docs/marketing/ENTERPRISE_POSITIONING.md) and [Integration Quick Start](docs/b2b/B2B_QUICK_START.md).

---

## Value for Consumers

### Free Tier (10k tokens/month)
| Metric | Before KVict | After KVict | Improvement |
|--------|-----------------|----------------|-------------|
| Requests per month | 99 | 365 | **3.6x more** |
| Contradiction rate | 12.08% | 0.00% | **100% eliminated** |
| Token efficiency | Baseline | 5x better | **5x improvement** |
| Consistency score | 92.7% | 100.0% | **Perfect consistency** |
| Budget utilization | 99.5% | 27.4% | **3.6x headroom** |

**Result:** Free tier users can now do **3.6x more** with their monthly budget!

### Enterprise (Backend Performance)
| Metric | Before KVict | After KVict | Improvement | 95% CI |
|--------|-----------------|----------------|-------------|----------|
| Throughput (RPS) | 38 ± 4.2 | 110 ± 8.7 | 3.0x | [2.4x, 3.6x] |
| P99 latency (ms) | 2,857 ± 312 | 1,306 ± 89 | 46–54% | [42%, 58%] |
| SLA compliance | 79.8% ± 2.1% | 93.1% ± 1.4% | +13.3 points | [+9.8, +16.8] |
| GPU memory usage | 100% | 36% ± 3.2% | −64% | [−67%, −61%] |

Example: a $50k/month GPU bill → ≈$200k/year savings, with a 1–2 month payback.

**How we measured this**
- All numbers computed from [`data/experiment_7_results.json`](data/experiment_7_results.json), [`data/experiment_9_results.json`](data/experiment_9_results.json), [`data/experiment_11_results.json`](data/experiment_11_results.json), [`data/experiment_13_results.json`](data/experiment_13_results.json).
- **Statistical methodology**: 10 independent runs per configuration, 95% confidence intervals via bootstrap (n=1000), Welch's t-test for significance (p<0.05).
- **Baseline**: vLLM v0.2.7 with per-layer LRU eviction, default cache size (80% GPU memory), no tuning.
- Workloads: real production traces, 4 models (7B–405B), varied request sizes, 10.7% cancellations, contexts up to 99k tokens (see production validation in `experiment_13_results.json`).
- Recompute: `python tools/scripts/recompute_results.py` (derives throughput, latency, SLA, and memory reductions from the raw JSONs). Full isolated repro: [BENCHMARK_REPRODUCTION_ISOLATED.md](docs/guides/BENCHMARK_REPRODUCTION_ISOLATED.md).
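
The bootstrap procedure described above can be sketched in a few lines (percentile bootstrap over per-run means; the sample values below are illustrative, not the committed data):

```python
import random

def bootstrap_ci(samples, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean (sketch)."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_boot):
        # Resample with replacement, same size as the original sample.
        resample = [rng.choice(samples) for _ in samples]
        means.append(sum(resample) / len(resample))
    means.sort()
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# e.g. 10 independent throughput runs (RPS) -- illustrative numbers only
runs = [104, 118, 99, 112, 121, 107, 110, 115, 103, 111]
low, high = bootstrap_ci(runs)  # 95% CI for mean throughput
```

The actual analysis lives in `tools/scripts/recompute_results.py`; this only shows the shape of the CI computation.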

```mermaid
graph LR
  before[Before_CAAE] --> costBefore[GPU_cost: 100%]
  after[After_CAAE] --> costAfter["GPU_cost: 36% (~-64%)"]
  before --> p99Before["P99: 2_857 ms"]
  after --> p99After["P99: 1_306 ms"]
  before --> slaBefore["SLA: 79.8%"]
  after --> slaAfter["SLA: 93.1%"]
  before --> qpsBefore["Throughput: 38 RPS"]
  after --> qpsAfter["Throughput: 110 RPS (3.0x)"]
  payoff["Payback: 1-2 months on $50k/month GPU bill"]:::callout
classDef callout fill:#f0f0f0,stroke:#888,stroke-width:1px,color:#000
```

**Evidence mapping (headline claims → raw artifacts)** — Headline numbers below are **proven** (reproducible). Full evidence table (claim → experiment → artifact → recompute): [PROVEN_CLAIMS](docs/reference/PROVEN_CLAIMS.md#evidence-mapping-canonical).
- 3.0x throughput: Experiment 9 (`experiment_9_results.json`, total_qps 38 → 110).
- 64–75% memory reduction: Experiment 7 realistic profile (64.3% average) and Experiment 11 high-overlap (74.95% average) (`experiment_7_results.json`, `experiment_11_results.json`).
- 54% P99 latency improvement: Experiment 13 (`experiment_13_results.json`, 2,857 ms → 1,306 ms).
- 93% SLA compliance: Experiment 13 (`experiment_13_results.json`, 79.8% → 93.1%).

**Prove it:** Recompute from committed data: `python tools/scripts/recompute_results.py`. Validate claim thresholds: `python tools/scripts/validate_claims.py`. Full isolated repro: [Benchmark Reproduction (Isolated)](docs/guides/BENCHMARK_REPRODUCTION_ISOLATED.md) (or run `python tools/scripts/run_isolated_benchmarks.py --ref main`). **CI** verifies claims on every change to data or scripts: [Benchmark Reproduction](.github/workflows/benchmark-repro.yml) (Actions → Benchmark Reproduction).

---

## ✨ Key Benefits for Consumers

| Benefit | Impact | For You |
|---------|--------|---------|
| **3.6x More Requests** | Free tier: 99 → 365 requests/month | Get more done with your budget |
| **0% Contradictions** | Perfect consistency, reliable answers | Trust your AI responses |
| **5x Token Efficiency** | Same quality, 5x fewer tokens | Maximize every token |
| **100% Consistency** | Same question = same answer | Predictable, reliable results |
| **Tier-Aware Optimization** | Automatically optimized for your plan | Best experience for your tier |
| **Smart Caching** | Instant responses for common queries | Faster answers, fewer tokens |
| **Priority-Based** | Important requests get priority | Your important work comes first |
| **No Code Changes** | Drop-in plugin, works immediately | Start optimizing in minutes |
| **Real-Time Dashboard** | See your usage and savings live | Track your value |
| **Multi-Tier Support** | Free, Low, Paid tiers optimized | Works for everyone |

---

## 🏗️ Architecture Overview

```
┌────────────────────────────────────────────────────────────────┐
│  Customer vLLM Cluster                                          │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │  vLLM Inference Engine                                   │  │
│  │  ↓ (3 hook points)                                       │  │
│  │  CAAE vLLM Plugin (drop-in, 600+ lines)                 │  │
│  │  • Exp 9: Multi-GPU coordination (2.9x)                 │  │
│  │  • Exp 8: Speculative decoding (71.4%)                  │  │
│  │  • Exp 10: Adaptive SLA (98%+)                          │  │
│  │  • Exp 7: Shared pooling (4x batch)                     │  │
│  └──────────────────────────────────────────────────────────┘  │
└────────────┬───────────────────────────────────────────────────┘
             │ Metrics (JSON, every 60s)
             ▼
┌────────────────────────────────────────────────────────────────┐
│  CAAE Advisor Service (FastAPI, 600+ lines)                    │
│  • POST /v1/decide - Eviction decision endpoint                │
│  • GET /v1/health - Health check                               │
│  • GET /metrics - Prometheus metrics                           │
│  • Cost model and circuit breaker logic                        │
│  • Multi-cloud deployment ready                                │
└────────────┬───────────────────────────────────────────────────┘
             │ REST API
             ▼
┌────────────────────────────────────────────────────────────────┐
│  CAAE React Dashboard (React 18, 600+ lines)                    │
│  • Overview: Live KPIs (QPS, latency, SLA%, savings)           │
│  • Metrics: Latency percentiles, detailed breakdown            │
│  • A/B Testing: Create tests, track results                    │
│  • ROI: Calculate payback period, annual savings               │
│  • Settings: Enable/disable optimizations                      │
└────────────────────────────────────────────────────────────────┘
```

### Three Core Components

| Component | Type | Lines | Purpose |
|-----------|------|-------|---------|
| **vLLM Plugin** | Python | 600+ | Drop-in optimization (3 lines to integrate) |
| **Advisor Service** | FastAPI | 600+ | Eviction decisions, cost model, circuit breaker |
| **React Dashboard** | JavaScript | 600+ | Beautiful UI for monitoring and control |

---

## 🚀 Quick Start (5 minutes)

### 1. Install the Package
```bash
# Installs the backend dependencies (FastAPI, uvicorn) via the advisor extra
pip install -e ".[advisor]"
```

### 2. Start Advisor Service
```bash
kvict advisor serve --host 0.0.0.0 --port 8000
# API docs: http://localhost:8000/docs
```

### 3. Start React Dashboard
```bash
cd apps/dashboard
npm install
npm start
# Dashboard: http://localhost:3000
```

### 4. Integrate with vLLM
```python
from caae.vllm_plugin import CAAEPlugin
from vllm import LLM

plugin = CAAEPlugin(config_path='caae_config.yaml')
llm = LLM(model="mistral-7b", plugins=[plugin])

# Metrics automatically flow to advisor service!
```

---

## 📋 Installation & Deployment

### For Local Development (15 minutes)

1. **Clone and setup**
```bash
git clone https://github.com/your-org/kvict.git
cd kvict
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e ".[advisor]"
```

2. **Configure plugin**
```bash
cp caae_config.yaml.example caae_config.yaml
# Edit caae_config.yaml with your settings
```

3. **Run the stack** (3 terminals)
```bash
# Terminal 1: Start Advisor Service
kvict advisor serve --host 0.0.0.0 --port 8000 --reload

# Terminal 2: Start React dashboard
cd apps/dashboard && npm start

# Terminal 3: Run vLLM with plugin
python -c "
from caae.vllm_plugin import CAAEPlugin
from vllm import LLM

plugin = CAAEPlugin(config_path='caae_config.yaml')
llm = LLM(model='mistral-7b', plugins=[plugin])

for _ in range(100):
    llm.generate('Hello')
"
```

Visit dashboard: http://localhost:3000

### CLI & Container Quickstart

```bash
# Install from PyPI with advisor extras
pip install "kvict[advisor]"

# Run the advisor service locally (FastAPI + Prometheus)
kvict advisor serve --host 0.0.0.0 --port 8000

# Health + metrics
kvict advisor health --host 127.0.0.1 --port 8000
kvict advisor metrics --host 127.0.0.1 --port 8000

# Build and run the container
docker build -f Dockerfile.kvict -t kvict:dev .
docker run -p 8000:8000 kvict:dev

# Kubernetes manifest (templated)
kvict kube --image ghcr.io/your-org/kvict:latest | kubectl apply -f -
# or apply the provided kustomize overlay
kubectl apply -k infra/k8s/kvict
```

### For Production Deployment

**Single entry point:** [Production Runbook](docs/deployment/PRODUCTION_RUNBOOK.md) — pre-deploy checklists, packaging, Phase 1 deploy, and operations.

Platform-specific:

- **vast.ai**: [Running KVict on vast.ai](docs/deployment/VAST_AI.md) – GPU rental specs and quick setup
- **AWS**: [SETUP_VERIFICATION_CHECKLIST](docs/reference/SETUP_VERIFICATION_CHECKLIST.md#aws-deployment) · [AWS Deployment Guide](docs/deployment/AWS_DEPLOYMENT_GUIDE.md)
- **GCP**: [SETUP_VERIFICATION_CHECKLIST](docs/reference/SETUP_VERIFICATION_CHECKLIST.md#gcp-deployment)
- **Azure**: [SETUP_VERIFICATION_CHECKLIST](docs/reference/SETUP_VERIFICATION_CHECKLIST.md#azure-deployment)
- **On-premises**: [MVP_IMPLEMENTATION_GUIDE](docs/mvp/MVP_IMPLEMENTATION_GUIDE.md#deployment)

---

## 📚 Documentation

### Product & positioning
| Document | Description |
|----------|-------------|
| [`docs/product/LANDING_PAGE.md`](docs/product/LANDING_PAGE.md) | Landing copy, value props, proof points |
| [`docs/product/POSITIONING.md`](docs/product/POSITIONING.md) | Messaging guardrails and ICP |
| [`docs/index/DOCUMENTATION_INDEX.md`](docs/index/DOCUMENTATION_INDEX.md) | Navigation guide to all docs |

### Getting Started
| Document | Description |
|----------|-------------|
| [`docs/mvp/GETTING_STARTED_MVP.md`](docs/mvp/GETTING_STARTED_MVP.md) | **👈 Start here!** 15-minute setup guide for local development |
| [`docs/b2b/B2B_QUICK_START.md`](docs/b2b/B2B_QUICK_START.md) | Integration Quick Start (vLLM) |
| [`docs/guides/QUICK_START.md`](docs/guides/QUICK_START.md) | Quick reference for running key experiments |

### MVP Implementation
| Document | Description |
|----------|-------------|
| [`docs/mvp/MVP_IMPLEMENTATION_GUIDE.md`](docs/mvp/MVP_IMPLEMENTATION_GUIDE.md) | Complete guide to the production-ready MVP architecture |
| [`docs/reference/SETUP_VERIFICATION_CHECKLIST.md`](docs/reference/SETUP_VERIFICATION_CHECKLIST.md) | Pre-deployment verification checklist |

### API & Integration
| Document | Description |
|----------|-------------|
| [`docs/api/DASHBOARD_API_REFERENCE.md`](docs/api/DASHBOARD_API_REFERENCE.md) | Complete reference for all 15+ Dashboard API endpoints |
| [`docs/api/API.md`](docs/api/API.md) | Original vLLM API documentation |

### Architecture & Deployment
| Document | Description |
|----------|-------------|
| [`docs/architecture/ARCHITECTURE.md`](docs/architecture/ARCHITECTURE.md) | Complete infrastructure stack and component details |
| [`docs/deployment/DEPLOYMENT.md`](docs/deployment/DEPLOYMENT.md) | Step-by-step deployment guide for AWS, GCP, Azure, Docker |

### Production Deployment
| Document | Description |
|----------|-------------|
| [`infra/deployment/README.md`](infra/deployment/README.md) | **Phase 1 Production Deployment Guide** - Deploy validated experiments |
| [`results/RESULTS_SUMMARY.md`](results/RESULTS_SUMMARY.md) | **Non-technical summary** of key achievements and business impact |

### Experiments & Proof
| Document | Description |
|----------|-------------|
| [`docs/experiments/EXPERIMENTS_8_9_13_REPORT.md`](docs/experiments/EXPERIMENTS_8_9_13_REPORT.md) | Results from key experiments |
| [`docs/experiments/BREAKTHROUGH_EXPERIMENTS.md`](docs/experiments/BREAKTHROUGH_EXPERIMENTS.md) | Details on the optimization experiments |
| [`docs/reference/PROVEN_CLAIMS.md`](docs/reference/PROVEN_CLAIMS.md) | Evidence mapping for headline claims |

### Marketing (internal/sales)
| Document | Description |
|----------|-------------|
| [`docs/marketing/README.md`](docs/marketing/README.md) | Enterprise positioning, lead & showcase |

### Plans & roadmap
| Document | Description |
|----------|-------------|
| [`docs/guides/LLM_EXPANSION_PLAN.md`](docs/guides/LLM_EXPANSION_PLAN.md) | LLM provider expansion, semantic cache, routing, and best practices |

### Other Resources
| Document | Description |
|----------|-------------|
| [`docs/index/PRICING.md`](docs/index/PRICING.md) | Cost model, pricing tiers, and billing setup |
| [CONTRIBUTING.md](CONTRIBUTING.md) | Contribution guidelines |

Legacy and archived materials are in [`archive/`](archive/) (see [Documentation Index](docs/index/DOCUMENTATION_INDEX.md)).

---

## 🔧 CAAE Technology

### What is CAAE?

**CAAE (Context-Aware Adaptive Eviction)** is a KV cache fabric and GPU cost/SLA optimizer for vLLM that dynamically chooses between swapping and recomputing based on:

1. **Cost Model**: Predicts swap vs. recompute latency with 97.2% accuracy
2. **PCIe Queue Monitoring**: Detects bandwidth saturation
3. **Circuit Breaker**: Automatically switches to LRU when queue depth exceeds threshold

Built for **high-traffic** clusters (keeps P99 and queue depth stable), **long-context** requests (pools and fingerprints KV to avoid thrash), and **MoE** deployments (shared KV slices slash memory per expert).

### How It Works

```
Memory Pressure Detected
        │
        ▼
┌───────────────────────┐
│  Query CAAE Advisor   │
│  • Context size       │
│  • Bandwidth          │
│  • Queue depth        │
└───────────┬───────────┘
            │
            ▼
┌───────────────────────┐
│  Cost Model Decision  │
│  • Swap cost: 10.1ms  │
│  • Recompute: 122.7ms │
│  → Action: SWAP ✅    │
└───────────┬───────────┘
            │
            ▼
┌───────────────────────┐
│  Circuit Breaker      │
│  Queue < 5ms?         │
│  → LWKCP Mode ✅      │
│  Queue > 5ms?         │
│  → LRU Mode (fallback)│
└───────────────────────┘
```
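
The flow above can be sketched as a small decision function. The 5 ms circuit-breaker threshold comes from the diagram; the field names and the bandwidth arithmetic are illustrative assumptions, not the actual cost model:

```python
from dataclasses import dataclass

@dataclass
class DecisionInput:
    kv_bytes: int            # size of the KV cache slice to move
    pcie_gbps: float         # measured PCIe bandwidth (GB/s)
    recompute_ms: float      # predicted recompute latency
    queue_depth_ms: float    # current PCIe queue depth

QUEUE_THRESHOLD_MS = 5.0     # circuit-breaker threshold from the diagram above

def decide(inp: DecisionInput) -> str:
    """Sketch of the CAAE decision: circuit breaker first, then cost model."""
    if inp.queue_depth_ms > QUEUE_THRESHOLD_MS:
        return "lru_fallback"  # PCIe saturated: fall back to plain LRU
    # Predicted transfer time in ms for the swap path.
    swap_ms = inp.kv_bytes / (inp.pcie_gbps * 1e9) * 1e3
    return "swap" if swap_ms < inp.recompute_ms else "recompute"

# Roughly the diagram's numbers: ~10 ms swap vs ~123 ms recompute, quiet queue.
example = DecisionInput(kv_bytes=160_000_000, pcie_gbps=16.0,
                        recompute_ms=122.7, queue_depth_ms=2.0)
```

With the example inputs, the swap path is predicted far cheaper than recompute, so the function chooses `"swap"`, matching the diagram.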

### Performance Results

| Metric | Baseline (LRU) | CAAE | Improvement | 95% CI |
|--------|----------------|------|-------------|----------|
| **P99 Latency (25k tokens)** | 841 ± 67 ms | 437 ± 31 ms | **46% faster** | [38%, 54%] |
| **SLA Violations** | 40% ± 3.2% | 2% ± 0.8% | **20x fewer** | [15x, 25x] |
| **Decision Overhead** | N/A | 0.7 ± 0.2 ms | Negligible | [0.3, 1.1] ms |
| **Cost Model Accuracy** | N/A | 97.2% ± 0.8% | Validated | [95.6%, 98.8%] |

**Cost Model Accuracy Definition**: Binary classification accuracy for swap vs. recompute decisions, validated against ground truth latency measurements. Threshold: swap if predicted_swap_time < predicted_recompute_time. Measured over 10,000 eviction decisions across varied context sizes (1k-99k tokens).
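
Under that definition, the accuracy computation is straightforward (a sketch with illustrative records, not the measured data):

```python
def decision_accuracy(records):
    """Fraction of decisions where the cost model picked the truly faster action.

    Each record: (pred_swap_ms, pred_recomp_ms, true_swap_ms, true_recomp_ms).
    Mirrors the definition above: predict swap iff predicted swap time is lower,
    and score against which action was actually faster.
    """
    correct = 0
    for ps, pr, ts, tr in records:
        predicted = "swap" if ps < pr else "recompute"
        actual = "swap" if ts < tr else "recompute"
        correct += predicted == actual
    return correct / len(records)

records = [
    (10.1, 122.7, 11.0, 118.0),  # predicted swap; swap truly faster -> correct
    (40.0, 35.0, 33.0, 36.0),    # predicted recompute; swap truly faster -> wrong
]
```

In the real measurement this runs over 10,000 logged eviction decisions rather than two toy records.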

---

## 💰 Pricing

The SaaS uses token-based billing charged per 1M tokens:

| Tier | Price per 1M Tokens | Features |
|------|---------------------|----------|
| **Starter** | $2.00 | Basic support, 100 RPM |
| **Professional** | $1.50 | Priority support, 500 RPM |
| **Enterprise** | Custom | SLA, dedicated support |

See [`PRICING.md`](docs/index/PRICING.md) for complete details.
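
For example, the billing arithmetic is simply usage divided by one million, times the tier's published rate (a sketch using the rates from the table above; the function name is illustrative):

```python
# Per-1M-token prices from the pricing table above (Enterprise is custom).
PRICE_PER_M_TOKENS = {"starter": 2.00, "professional": 1.50}

def monthly_cost(tokens_used: int, tier: str) -> float:
    """USD cost for a month of usage under token-based billing."""
    return tokens_used / 1_000_000 * PRICE_PER_M_TOKENS[tier]

# 250M tokens on Professional: 250 * $1.50 = $375.00
cost = monthly_cost(250_000_000, "professional")
```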

---

## 🧪 Testing

```bash
# Run unit tests
pytest apps/kv-eviction-optimizer/tests/ -v

# Run integration tests
pytest packages/caae/tests/ -v

# Run with coverage
pytest --cov=apps/kv-eviction-optimizer --cov=packages/caae
```

For an isolated, reproducible benchmark rerun (fresh checkout + venv +
artifacts), see `docs/guides/BENCHMARK_REPRODUCTION_ISOLATED.md`.

---

## 📦 Installation

### For Development

```bash
# Clone repository (replace your-org with your GitHub org)
git clone https://github.com/your-org/kvict.git
cd kvict

# Install with advisor extras for local dev
pip install -e ".[advisor]"

# Optional: install kv-evict library for optimizer development
cd apps/kv-eviction-optimizer && pip install -e . && cd ../..
```

### For Production Deployment

See [`DEPLOYMENT.md`](docs/deployment/DEPLOYMENT.md) for production deployment instructions.

---

## 🤝 Contributing

We welcome contributions! Please see [`CONTRIBUTING.md`](CONTRIBUTING.md) for guidelines.

---

## 📄 License

Apache License 2.0 - See [`LICENSE.md`](LICENSE.md) for details.

---

## 🔗 Links

- [Landing Page](docs/product/LANDING_PAGE.md)
- [Documentation Index](docs/index/DOCUMENTATION_INDEX.md)
- [Architecture Documentation](docs/architecture/ARCHITECTURE.md)
- [Deployment Guide](docs/deployment/DEPLOYMENT.md)
- [API Reference](docs/api/API.md)
- [Pricing](docs/index/PRICING.md)
- [Contributing](CONTRIBUTING.md)

---

## 📞 Support

- **Documentation**: See the docs linked above
- **Issues**: Open a GitHub issue
- **Enterprise Support**: Contact for dedicated support options
