Metadata-Version: 2.4
Name: demo-video-recorder
Version: 0.1.2
Summary: Scriptable demo video recording for apps, terminals, and AI agents.
License: MIT License
        
        Copyright (c) 2026 Huang Yuanjin
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: edge-tts>=7.2
Requires-Dist: playwright>=1.60.0
Requires-Dist: pyrootutils>=1.0
Description-Content-Type: text/markdown

# demo-video-recorder

Scriptable demo video recording for Python agents and humans.

`demo-video-recorder` helps you write small Python scripts that drive a CLI, browser UI, or native app window and turn that interaction into an MP4. It handles screen or browser capture, subtitle timing, optional burned-in subtitles, and optional narration audio through TTS.

It is especially useful when a coding agent needs to inspect a project, write a deterministic `record_demo.py`, react to app output, and produce a clean demo video without hand-recording the workflow.

## Install

```bash
pip install demo-video-recorder
```

With uv:

```bash
uv add demo-video-recorder
```

Recording depends on `ffmpeg` and `ffprobe` being available:

```bash
ffmpeg -version
ffprobe -version
```

For browser demos, install the Playwright browser binaries once in the environment where you installed the package:

```bash
python -m playwright install chromium
```

Linux capture is not implemented yet. Windows capture uses `gdigrab`; macOS capture uses `avfoundation`. `WebUIRecorder` defaults to Playwright video capture, so headless browser demos do not need macOS Screen Recording permission unless you explicitly use `video_backend="ffmpeg"`.

## macOS Notes

On macOS, the first real screen recording (except playwright recording for webui apps) may require granting Screen Recording permission to Terminal, iTerm, your IDE, or whichever Python host runs the script. You can preflight this from Python:

```python
from demo_video_recorder import check_screen_recording_access

result = check_screen_recording_access(prompt=True)
print(result)
```

High-quality burned subtitles require an `ffmpeg` build with the `subtitles` filter, which depends on `libass`. If your active `ffmpeg` does not support it, install a libass-enabled build such as `ffmpeg-full` and put it on `PATH`:

```bash
brew install ffmpeg-full
export PATH="/opt/homebrew/opt/ffmpeg-full/bin:$PATH"
ffmpeg -hide_banner -filters | rg subtitles
```

## Quick Start: CLI Demo

```python
from demo_video_recorder import CLIDemoRecorder


def main():
    r = CLIDemoRecorder("out/cli-demo.mp4")
    try:
        r.open_terminal(
            title="CLI Demo",
            window_size=(1200, 900),
            start_recording=True,
            clear=True,
        )
        r.explain("We'll run the app and use its help command.")
        r.run(["python", "app.py"], interactive=True, command_label="python app.py")
        r.expect_output(">")
        r.input("help")
        r.expect_regex(r"Commands?:")
        r.explain("The app prints the available commands, so the demo can keep going from real output.")
        r.input("quit")
        r.stop_app()
    finally:
        r.close()
        if r.is_recording:
            r.stop_recording()


if __name__ == "__main__":
    main()
```

Useful CLI helpers:

- `open_terminal(...)`: configures the terminal and can start recording.
- `clear()`: clears the current terminal with `clear` or `cls`.
- `run(..., interactive=True)`: starts a CLI app and streams stdout/stderr to the recorded terminal.
- `input("text")`: types into the active CLI app.
- `expect_output("text")` and `expect_regex(r"...")`: wait for real app output.
- `mark_output()` and `output_since(marker)`: isolate output caused by one action.
- `explain("...")`: adds narration subtitles and optional spoken narration.
- `stop_recording()`: finalizes the MP4.

When `new_window=True` is used, the recorder re-runs the script in a dedicated terminal session. On Windows it opens a new console. On macOS it opens a new Terminal.app window and captures that window when bounds are available. Worker stdout and stderr are mirrored to `out/<name>.worker.log`.

## Quick Start: Web UI Demo

`WebUIRecorder` is built for browser demos. It defaults to Playwright's own page video recorder, which works in headless browser contexts and then passes the raw MP4 through the same subtitle and narration pipeline.

```python
from demo_video_recorder import WebUIRecorder


def main():
    r = WebUIRecorder("out/web-demo.mp4", headless=True, viewport=(1280, 720))
    try:
        r.serve("dist", 8000)
        r.open_web("/")
        r.explain("The local web app is open.")
        r.find_input(label="Email address").fill("ada@example.com")
        r.find_select(label="Plan").select_option(label="Pro")
        r.find(role="button", name="Continue").click()
        r.find("main", text="Welcome").highlight()
        r.explain("The workflow is complete and the confirmation is visible.")
    finally:
        r.close()
        if r.is_recording:
            r.stop_recording()


if __name__ == "__main__":
    main()
```

Useful Web UI helpers:

- `serve(path, port=8000)`: serves a static folder over localhost.
- `open_web(url=None)`: opens a URL. Bare domains become `https://...`; relative paths use the served folder.
- `find(...)`: bs4-style visible element lookup.
- `find_optional(...)`: returns `None` instead of raising when an element is absent.
- `find_input(...)`: finds `input` and `textarea` controls.
- `find_select(...)`: finds `select` controls.
- Element methods include `highlight()`, `click()`, `double_click()`, `hover()`, `wait()`, `text()`, and `attribute()`.
- Input methods include `fill()`, `type()`, `clear()`, `set_range()`, `set_date()`, `set_color()`, `set_files()`, `press()`, `check()`, `uncheck()`, and `select_option()`.

`find()` accepts Beautiful Soup style names and attrs plus Playwright-friendly selectors:

```python
r.find("button", text="Save")
r.find("input", {"name": "email"})
r.find("input", _class="field-control")
r.find(selector="[data-testid='submit']")
r.find(role="button", name="Continue")
r.find(label="Email address").fill("ada@example.com")
```

Prefer robust selectors in this order: role and accessible name, label or placeholder, test id, then CSS selector.

## Quick Start: Native App Window

```python
from demo_video_recorder import DemoVideoRecorder


def main():
    r = DemoVideoRecorder("out/app-demo.mp4")
    try:
        r.open_app(["notepad.exe"], title_hint="Untitled - Notepad", capture_window=True)
        r.start_capture_window()
        r.explain("The app window is open and being captured.")
    finally:
        r.close()
        if r.is_recording:
            r.stop_recording()


if __name__ == "__main__":
    main()
```

## Narration Audio

Add `EdgeTTSBackend` when you want spoken narration in addition to subtitles:

```python
from demo_video_recorder import CLIDemoRecorder, EdgeTTSBackend

tts = EdgeTTSBackend(
    save_dir="out/demo.tts",
    speaker="en-US-AvaMultilingualNeural",
    speed="+0%",
    volume="+0%",
    cache=True,
)

r = CLIDemoRecorder("out/demo.mp4", tts=tts)
```

When TTS is enabled, `explain()` uses the generated audio duration instead of estimating from word count. If synthesis latency would show up as dead air in the capture, pre-generate longer narration:

```python
prepared = r.synthesize_if_tts_enabled(
    "This narration is prepared before the visible interaction begins."
)
r.explain(prepared)
```

For several known cues, prepare them on the recorder instead of retyping async glue:

```python
intro, finish = r.prepare_cues(
    ["The app is open.", "The result is now visible."],
    async_tts=True,
)
```

List available Edge voices:

```python
from demo_video_recorder import EdgeTTSBackend

tts = EdgeTTSBackend(save_dir="out/voices")
print("\n".join(tts.list_speakers()))
```

If Edge TTS repeatedly fails for a service or network reason, you can fall back
to native OS speech on macOS or Windows:

```python
from demo_video_recorder import NativeTTSBackend

tts = NativeTTSBackend(save_dir="out/demo.tts", cache=True)
```

## Defaults

```python
from demo_video_recorder import DEFAULTS, FAST_SMOKE_TEST_DEFAULTS

DEFAULTS.words_per_minute          # 170
DEFAULTS.min_pause_seconds         # 2.0
DEFAULTS.command_lead_seconds      # 0.0
DEFAULTS.typed_character_delay     # 0.018
DEFAULTS.capture_framerate         # 15
DEFAULTS.video_scale_width         # 1280
```

Use `FAST_SMOKE_TEST_DEFAULTS` for quick local script checks, not polished final videos.

## FOR AI AGENT: PLEASE READ

This package ships a complete guide for coding agents. Before writing a recording script, read it from the installed package:

```bash
python -c "import importlib.resources as r; print((r.files('demo_video_recorder') / 'AGENT.md').read_text())"
```

The guide covers environment checks, macOS permissions, subtitle support, CLI and Web UI recording patterns, output-aware interactions, TTS pre-synthesis, and final video verification.
