Metadata-Version: 2.4
Name: local-video-gen-studio
Version: 0.1.0
Summary: Desktop app for local video generation using SANA-WM
Author-email: ZachOS <zach@zachos.dev>
Requires-Python: >=3.10
Requires-Dist: accelerate>=0.27.0
Requires-Dist: aiosqlite>=0.19.0
Requires-Dist: diffusers>=0.27.0
Requires-Dist: fastapi>=0.110.0
Requires-Dist: ffmpeg-python>=0.2.0
Requires-Dist: numpy>=1.26.0
Requires-Dist: opencv-python>=4.9.0
Requires-Dist: pillow>=10.2.0
Requires-Dist: pydantic-settings>=2.1.0
Requires-Dist: pydantic>=2.6.0
Requires-Dist: stripe>=8.0.0
Requires-Dist: torch>=2.2.0
Requires-Dist: transformers>=4.38.0
Requires-Dist: uvicorn[standard]>=0.27.0
Provides-Extra: dev
Requires-Dist: black>=24.0.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.2.0; extra == 'dev'
Description-Content-Type: text/markdown

# Local Video Gen Studio

Generate stunning AI video clips from text, with no subscriptions and no cloud fees. Just your GPU and total creative control.

## What is this?

Local Video Gen Studio is a cross-platform desktop application for generating video on your own machine. Built around SANA-WM (a 2.6B-parameter open-source video model), it lets creators generate 720p video clips from text prompts in minutes. Process batches overnight, export to your editing timeline, and keep everything private: no cloud, no surveillance, no vendor lock-in.

Perfect for YouTubers building B-roll libraries, course creators illustrating concepts, and indie game developers prototyping visual effects without licensing costs.

## Features

- **Text-to-Video Generation** – Convert detailed prompts into 720p video clips up to 60 seconds
- **Batch Processing** – Queue multiple videos and let them render unattended, for example overnight
- **Local-Only Inference** – Your GPU, your data, your privacy. No cloud uploads
- **GPU Auto-Detection** – Detects NVIDIA, AMD, or Apple (Metal) GPUs and picks matching settings automatically
- **Prompt Templates** – Pre-built templates for common use cases (product demos, tutorials, game footage)
- **One-Click Model Download** – Automatic SANA-WM model initialization on first launch
- **Export Ready** – Direct MP4 output compatible with any video editor
- **Cross-Platform** – macOS, Windows, and Linux via Tauri

## Quick Start

### Requirements
- 12GB+ VRAM (NVIDIA RTX 3060 12GB / RTX 4070 or equivalent; the RTX 3060 Ti has only 8GB)
- Python 3.10+
- Node.js 16+
- 50GB free disk space (for model + cache)
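
The Python and disk-space requirements can be checked before installing. The snippet below is a minimal sketch, not part of the project: `preflight_problems` is a hypothetical helper, and the thresholds are simply copied from the list above. VRAM is not checked here because that needs a GPU library.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)      # from the Requirements list
MIN_FREE_DISK_GB = 50     # model + cache, from the Requirements list


def preflight_problems(path=".", python_version=None, free_bytes=None):
    """Return a list of human-readable problems; an empty list means OK.

    python_version and free_bytes can be passed in for testing; by default
    they are read from the running interpreter and the disk holding `path`.
    """
    version = tuple(python_version or sys.version_info[:2])
    if free_bytes is None:
        free_bytes = shutil.disk_usage(path).free

    problems = []
    if version < MIN_PYTHON:
        problems.append(f"Python {version[0]}.{version[1]} found, 3.10+ required")
    free_gb = free_bytes / 1024**3
    if free_gb < MIN_FREE_DISK_GB:
        problems.append(f"{free_gb:.1f} GB free, {MIN_FREE_DISK_GB} GB required")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems():
        print("WARNING:", problem)
```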

### Installation

1. **Clone and install dependencies**
   ```bash
   git clone https://github.com/yourusername/local-video-gen-studio.git
   cd local-video-gen-studio
   
   # Frontend
   npm install
   
   # Backend
   python -m venv venv
   source venv/bin/activate  # Windows: venv\Scripts\activate
   pip install -r backend/requirements.txt
   ```

2. **Configure environment**
   ```bash
   cp .env.example .env
   # Edit .env with your NVIDIA_API_KEY (optional) and output paths
   ```

3. **Launch the app**
   ```bash
   npm run tauri dev
   ```

The app will download SANA-WM (~3.2GB) on first run and auto-detect your GPU.
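
How the app detects the GPU is not documented here. As a rough illustration only, a PyTorch-based check might look like the sketch below. `pick_device` is a hypothetical name, and the fallback to `"cpu"` is an assumption rather than confirmed app behavior.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch can see."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed; nothing to detect.
        return "cpu"

    # True for NVIDIA CUDA, and also for AMD GPUs on ROCm builds of PyTorch.
    if torch.cuda.is_available():
        return "cuda"

    # Apple GPUs are exposed through the Metal Performance Shaders backend.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print("Selected device:", pick_device())
```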

## Usage

### Generate a Single Video

1. Open Local Video Gen Studio
2. Enter a detailed text prompt (e.g., "A silver MacBook Pro sliding across a wooden desk, warm lighting, cinematic")
3. Select resolution (480p free tier, 720p Pro)
4. Click **Generate**
5. Monitor progress in the queue panel
6. Export to MP4 when complete

### Batch Generate Variations

1. Create a prompt template with variables: `"A {color} {object} {action}, {lighting}"`
2. Upload a CSV with variations
3. Click **Queue All** to enqueue every variation; large batches can run overnight
4. Retrieve finished videos in the exports folder
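
To make the template step concrete, here is a minimal sketch of how a template and a CSV could expand into prompts. It assumes the CSV has a header row whose column names match the template variables. The exact format the app expects is not specified here, and `expand_prompts` is a hypothetical helper.

```python
import csv
import io


def expand_prompts(template: str, csv_text: str) -> list[str]:
    """Fill the template once per CSV row, using the header row as variable names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [template.format(**row) for row in reader]


TEMPLATE = "A {color} {object} {action}, {lighting}"

# Hypothetical CSV contents: one column per variable, one row per variation.
CSV_TEXT = """color,object,action,lighting
silver,laptop,sliding across a desk,warm lighting
red,mug,spinning slowly,soft daylight
"""

for prompt in expand_prompts(TEMPLATE, CSV_TEXT):
    print(prompt)
# A silver laptop sliding across a desk, warm lighting
# A red mug spinning slowly, soft daylight
```

A row that lacks one of the template's variables raises `KeyError`, which makes malformed CSV files fail early instead of producing half-filled prompts.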

### Example Prompts

```
"Overhead shot of coffee pouring into a white ceramic mug, steam rising, warm sunlight"

"Low-angle tracking shot through a forest canopy, autumn leaves, misty morning light"

"Screen recording aesthetic: cursor clicking through a minimalist dashboard, smooth transitions"
```

## Tech Stack

**Frontend:**
- TypeScript + React (Vite)
- Tauri (lightweight Rust native shell)
- CSS3 for responsive UI

**Backend:**
- Python 3.10+
- SANA-WM (2.6B parameter open-source video model)
- PyTorch/CUDA for inference
- FastAPI for communication between the Tauri frontend and the Python backend
- SQLite for job queue persistence

**Infrastructure:**
- Stripe (optional monetization integration)
- Local GPU orchestration (no external services)
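
As an illustration of the SQLite-backed job queue listed under Backend, here is a minimal sketch. The table layout, status values, and function names are assumptions, not the project's actual schema. It uses the standard-library `sqlite3` module instead of `aiosqlite` to stay self-contained.

```python
import sqlite3

# Hypothetical schema: one row per generation job.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    prompt TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'queued'
)
"""


def open_queue(path=":memory:"):
    """Open (or create) the queue database; pass a file path to persist jobs."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def enqueue(conn, prompt):
    """Add a job and return its id."""
    cur = conn.execute("INSERT INTO jobs (prompt) VALUES (?)", (prompt,))
    conn.commit()
    return cur.lastrowid


def claim_next(conn):
    """Mark the oldest queued job as running and return (id, prompt), or None."""
    row = conn.execute(
        "SELECT id, prompt FROM jobs WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
    conn.commit()
    return row


def finish(conn, job_id, ok=True):
    """Record the final state of a job."""
    status = "done" if ok else "failed"
    conn.execute("UPDATE jobs SET status = ? WHERE id = ?", (status, job_id))
    conn.commit()
```

Because jobs live in a database file rather than in memory, a queued batch survives an app restart. This sketch assumes a single worker; several workers would need the select and update in `claim_next` wrapped in one transaction.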

## Development

```bash
# Run development server (hot reload)
npm run tauri dev

# Build production binaries
npm run tauri build

# Backend development (watch mode)
cd backend && python main.py --dev

# Type checking
npm run type-check

# Lint
npm run lint
```

## Roadmap

- [ ] Frame interpolation (60fps output)
- [ ] LoRA fine-tuning for style consistency
- [ ] Real-time preview in editor
- [ ] Audio sync generation
- [ ] Commercial license watermark removal

## Licensing

MIT License – See LICENSE file for details.

---

**Questions?** Open an issue or check out [OVERVIEW.md](./OVERVIEW.md) for deeper architecture docs.