Metadata-Version: 2.3
Name: sora-extend
Version: 0.1.3
Summary: AI-Planned, Scene-Exact Video Generation with Continuity using Sora 2
Author: Mason Hall
Author-email: Mason Hall <masonhall@gmail.com>
Requires-Dist: openai>=1.0.0
Requires-Dist: requests>=2.31.0
Requires-Dist: typer>=0.9.0
Requires-Dist: rich>=13.0.0
Requires-Dist: moviepy>=1.0.3
Requires-Dist: opencv-python-headless>=4.8.0
Requires-Dist: imageio>=2.31.0
Requires-Dist: imageio-ffmpeg>=0.4.9
Requires-Python: >=3.12
Description-Content-Type: text/markdown

# Sora Extend

**AI-Planned, Scene-Exact Video Generation with Continuity using Sora 2**

Originally built by [Matt Shumer](https://x.com/mattshumer_). CLI package conversion by Mason Hall.

## Usage

```bash
uvx sora-extend generate "A tron motorcycle racing through NYC at night" --api-key <your-api-key>
```

## Overview

Sora Extend enables you to generate extended videos (>12s) using Sora 2 by:

1. **AI Planning**: Using an LLM (e.g., GPT-5) to intelligently plan N scene prompts from a base idea, optimized for continuity
2. **Chained Generation**: Rendering each segment with Sora 2, passing the prior segment's final frame as `input_reference` for seamless transitions
3. **Automatic Concatenation**: Combining all segments into a single MP4

This approach allows you to create videos much longer than Sora's standard 12-second limit while maintaining visual continuity across segments.


## Installation

Both commands below install in editable mode from a local clone of the repository.

### Using pip

```bash
pip install -e .
```

### Using uv (recommended)

```bash
uv pip install -e .
```

## Prerequisites

- Python 3.12 or higher
- OpenAI API key with access to:
  - Sora 2 (video generation)
  - A reasoning model for planning (e.g., GPT-5, or a fallback such as gpt-4o)

**Note:** This tool also supports [Echo](https://echo.merit.systems) API keys. If your API key starts with `echo_`, the tool will automatically route requests through Echo's API gateway.

## Performance

This package is optimized for fast startup using **lazy imports**:

- **Initial import**: ~1-2ms (heavy dependencies not loaded)
- **CLI startup**: Fast, loads only what's needed
- **No breaking changes**: All APIs work exactly the same

Heavy libraries (OpenCV, MoviePy, Typer, Rich) are only loaded when their functions are actually called, making programmatic usage extremely fast.
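A common way to implement this pattern is a proxy object that defers the real import until first attribute access. The sketch below illustrates the general technique (not necessarily this package's exact mechanism; the `_lazy` helper name is hypothetical):

```python
import importlib


def _lazy(module_name):
    """Return a stand-in that imports `module_name` only when an
    attribute is first accessed, keeping initial import time low."""

    class _LazyModule:
        def __getattr__(self, attr):
            # The real import happens here, on first use.
            module = importlib.import_module(module_name)
            return getattr(module, attr)

    return _LazyModule()


# 'json' stands in for a heavy dependency like cv2 or moviepy.
heavy = _lazy("json")
result = heavy.dumps({"ok": True})  # triggers the actual import
```

Until `heavy.dumps` is touched, nothing beyond `importlib` is loaded, which is why a bare `import sora_extend` can stay in the low-millisecond range.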

For more details and benchmarks, see [PERFORMANCE.md](PERFORMANCE.md).

## Usage

### Basic Command

```bash
sora-extend generate "Your video description here"
```

### Full Example

```bash
export OPENAI_API_KEY="your-api-key-here"

sora-extend generate \
  "Gameplay footage of a game releasing in 2027, a car driving through a futuristic city" \
  --seconds 8 \
  --num-segments 3 \
  --output-dir ./my_video
```

This will create a 24-second video (3 segments × 8 seconds each).

### Command-Line Options

```
sora-extend generate PROMPT [OPTIONS]

Arguments:
  PROMPT  Base prompt describing the video concept [required]

Options:
  -k, --api-key TEXT          OpenAI API key (or set OPENAI_API_KEY env var)
  -s, --seconds INTEGER       Duration of each segment in seconds (4, 8, or 12) [default: 8]
  -n, --num-segments INTEGER  Number of segments to generate [default: 2]
  --sora-model TEXT           Sora model to use (sora-2 or sora-2-pro) [default: sora-2]
  --planner-model TEXT        Planning model for generating prompts [default: gpt-5]
  --size TEXT                 Video size (must stay constant) [default: 1280x720]
  -o, --output-dir PATH       Output directory for generated videos [default: sora_ai_planned_chain]
  --no-concatenate            Skip concatenating segments into final video
  -q, --quiet                 Suppress progress output
  --help                      Show this message and exit
```

### Examples

**Create a 16-second video (2 × 8s segments):**
```bash
sora-extend generate "A drone flying over a cyberpunk city at night"
```

**Create a 48-second video using 12-second segments:**
```bash
sora-extend generate "A nature documentary about arctic foxes" \
  --seconds 12 \
  --num-segments 4
```

**Generate segments without concatenating:**
```bash
sora-extend generate "Product showcase for a new smartphone" \
  --seconds 8 \
  --num-segments 3 \
  --no-concatenate
```

**Use Sora 2 Pro model:**
```bash
sora-extend generate "High-quality cinematic scene" \
  --sora-model sora-2-pro \
  --seconds 12 \
  --num-segments 2
```

## How It Works

### 1. AI Planning Phase

The planner model (default: GPT-5) receives your base prompt and generates detailed, scene-specific prompts for each segment. These prompts include:

- **Context sections**: Guidance for the AI about what came before
- **Continuity instructions**: Explicit directions to start from the previous segment's final frame
- **Visual consistency**: Maintained lighting, style, and subject identity across segments

**Detailed steps:**
- [1.1] Preparing planning request
- [1.2] Calling the planner model (e.g., GPT-5)
- [1.3] Parsing AI response
- [1.4] Validating and formatting segments
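Steps [1.3] and [1.4] come down to pulling a structured list out of free-form model text and checking it. Below is a minimal sketch of that parse-and-validate step; the function name, and the assumption that the planner replies with a JSON array, are illustrative rather than the package's actual wire format:

```python
import json
import re


def parse_planned_segments(raw_text: str, expected_count: int) -> list[str]:
    """Extract a JSON array of scene prompts from the planner's reply
    and validate that it contains the expected number of segments."""
    match = re.search(r"\[.*\]", raw_text, re.DOTALL)
    if match is None:
        raise ValueError("Planner returned no JSON array")
    segments = json.loads(match.group(0))
    if len(segments) != expected_count:
        raise ValueError(
            f"Expected {expected_count} segments, got {len(segments)}"
        )
    return [str(s) for s in segments]


prompts = parse_planned_segments(
    'Here is the plan: ["Scene 1: wide shot", "Scene 2: close-up"]', 2
)
```

Validating the count up front is what lets the tool fail fast (see the "Planner returned no text" troubleshooting entry) instead of burning Sora credits on a malformed plan.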

### 2. Chained Video Generation

For each segment:
1. Generate video using Sora 2 with the planned prompt
2. Extract the final frame from the generated video
3. Use that frame as the `input_reference` for the next segment

This creates smooth transitions between segments.

**Detailed steps (per segment N):**
- [2.N.1] Creating video generation job
- [2.N.2] Job started confirmation
- [2.N.3] Polling for completion
- [2.N.4] Downloading video segment
- [2.N.5] Video saved confirmation
- [2.N.6] Extracting last frame for continuity
- [2.N.7] Reference frame saved
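The polling in step [2.N.3] is a standard poll-until-terminal loop. A generic sketch follows; the interval, timeout, and status strings are illustrative, and `get_status` would wrap whatever job-status call the OpenAI SDK provides:

```python
import time


def poll_until_done(get_status, interval_s=5.0, timeout_s=1800.0):
    """Call `get_status()` repeatedly until it reports a terminal state.
    Returns the final status, or raises on failure or timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status == "completed":
            return status
        if status == "failed":
            raise RuntimeError("Create video failed")
        time.sleep(interval_s)
    raise TimeoutError("Video generation did not finish in time")
```

Using `time.monotonic()` for the deadline keeps the timeout correct even if the system clock is adjusted mid-run.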

### 3. Concatenation

All segments are concatenated into a single MP4 file using MoviePy, preserving the framerate and quality.

**Detailed steps:**
- [3.1] Loading video segments
- [3.2] Combining video segments
- [3.3] Writing final video
- [3.4] Cleaning up resources
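Because segments are written with zero-padded names (see the Output Structure section), step [3.1] can rely on plain lexicographic sorting to load them in generation order. A small sketch of that gathering step, with a hypothetical helper name; the actual joining is done by MoviePy:

```python
from pathlib import Path


def ordered_segments(out_dir: str) -> list[Path]:
    """Collect segment files in generation order; zero-padded names
    make lexicographic sort match numeric order."""
    return sorted(Path(out_dir).glob("segment_*.mp4"))
```

The `segment_*.mp4` glob also skips the `*_last.jpg` reference frames, so only video files reach the concatenation step.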

## Output Structure

```
sora_ai_planned_chain/
├── segment_01.mp4        # First video segment
├── segment_01_last.jpg   # Last frame of first segment
├── segment_02.mp4        # Second video segment
├── segment_02_last.jpg   # Last frame of second segment
├── ...
└── combined.mp4          # Final concatenated video
```

## API Usage

You can also use Sora Extend as a Python library:

```python
from openai import OpenAI
from pathlib import Path
from sora_extend import plan_prompts_with_ai, chain_generate_sora, concatenate_segments

# Initialize client (supports both OpenAI and Echo API keys)
api_key = "your-api-key"
if api_key.startswith("echo_"):
    client = OpenAI(api_key=api_key, base_url="https://echo.router.merit.systems")
else:
    client = OpenAI(api_key=api_key)

output_dir = Path("./output")
output_dir.mkdir(exist_ok=True)

# Plan segments
segments = plan_prompts_with_ai(
    base_prompt="A journey through space",
    seconds_per_segment=8,
    num_generations=3,
    planner_model="gpt-5",
    client=client,
)

# Generate videos
segment_paths = chain_generate_sora(
    segments=segments,
    size="1280x720",
    model="sora-2",
    api_key="your-api-key",
    out_dir=output_dir,
)

# Concatenate
final_video = concatenate_segments(segment_paths, output_dir / "final.mp4")
print(f"Video saved to: {final_video}")
```

## Configuration

### Environment Variables

- `OPENAI_API_KEY`: Your OpenAI API key (required)
  - Supports both OpenAI keys and Echo keys (starting with `echo_`)
  - Echo keys automatically route through `https://echo.router.merit.systems`

### Supported Video Sizes

- `1280x720` (default)
- `1920x1080`
- `1080x1920` (vertical)
- Other sizes supported by Sora 2

### Segment Durations

Recommended values: 4, 8, or 12 seconds

The total video duration will be: `seconds_per_segment × num_segments`
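As a quick worked check of that formula (the helper is purely illustrative):

```python
def total_duration(seconds_per_segment: int, num_segments: int) -> int:
    """Total runtime of the final concatenated video, in seconds."""
    return seconds_per_segment * num_segments


total_duration(8, 3)   # the 24-second Full Example above
total_duration(12, 4)  # the 48-second arctic-foxes example
```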

## Tips for Best Results

1. **Clear Base Prompts**: Be specific about the setting, style, and mood
2. **Consistent Style**: Mention visual style in your base prompt (e.g., "cinematic", "documentary style")
3. **Subject Continuity**: If featuring a character/object, describe them clearly
4. **Reasonable Segment Count**: Start with 2-3 segments to test, then scale up
5. **Check Planner Output**: Review the AI-planned prompts before generation (run without `--quiet` so they are printed)

## Troubleshooting

**"Planner returned no text"**
- Try a different planner model: `--planner-model gpt-4o`

**"Create video failed"**
- Check your API key has Sora 2 access
- Verify your prompt doesn't violate content policies

**Discontinuities between segments**
- Try shorter segments (e.g., 4s instead of 12s)
- Make your base prompt more specific about visual style

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is based on the original Jupyter notebook by Matt Shumer.

## Credits

- Original concept and notebook: [Matt Shumer](https://x.com/mattshumer_)
- CLI package conversion: Mason Hall


