Metadata-Version: 2.4
Name: caption-cast
Version: 0.1.0
Summary: 20 audio-driven caption styles for video. Composes text-fx + lyric-sync + audio-arrange.
Project-URL: Homepage, https://github.com/tomastimelock/caption-cast
Project-URL: Issues, https://github.com/tomastimelock/caption-cast/issues
Author-email: Trollfabriken AITrix AB <dev@trollfabriken.se>
License: MIT
Keywords: animation,audio,captions,ffmpeg,karaoke,lyrics,subtitles,video
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Multimedia :: Graphics
Classifier: Topic :: Multimedia :: Video
Requires-Python: >=3.10
Requires-Dist: emoji>=2.10
Requires-Dist: ffmpeg-python>=0.2
Requires-Dist: numpy>=1.24
Requires-Dist: pillow>=10.0
Requires-Dist: pydantic>=2.5
Requires-Dist: pysubs2>=1.7
Requires-Dist: text-fx-engine>=0.1
Provides-Extra: all
Requires-Dist: audio-arrange>=0.1; extra == 'all'
Requires-Dist: lyric-sync>=0.1; extra == 'all'
Provides-Extra: beats
Requires-Dist: audio-arrange>=0.1; extra == 'beats'
Provides-Extra: dev
Requires-Dist: build; extra == 'dev'
Requires-Dist: pytest-cov>=4; extra == 'dev'
Requires-Dist: pytest>=7; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Provides-Extra: lyrics
Requires-Dist: lyric-sync>=0.1; extra == 'lyrics'
Description-Content-Type: text/markdown

# caption-cast

20 audio-driven caption styles for video.

[![PyPI](https://img.shields.io/pypi/v/caption-cast)](https://pypi.org/project/caption-cast/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)

---

caption-cast renders styled, timed captions onto video. It provides 20 named caption styles, each
a fixed composition of text-fx effects, timing behaviour, and layout rules. It does not do speech
transcription, audio separation, or beat detection itself — those are handled by sibling packages
(lyric-sync and audio-arrange respectively), which caption-cast can consume when installed as
optional extras.

The library was built to serve the caption layer of Trollfabriken's automated video pipeline, where
clip metadata already carries subtitle or lyric timecodes and the remaining work is rendering them
consistently across styles.

---

## What it solves

| Pain point | Resolution |
|---|---|
| Karaoke-style word-by-word highlighting requires tight timing control and per-word colour switching | `burn_lyrics()` with any `karaoke_*` style handles word-level timing from a pysubs2 `SSAFile` or from lyric-sync output |
| Subtitle burn-in varies wildly in font, position, and animation across tools | 20 named styles give a reproducible, versioned rendering contract; pick a slug, get the same output every time |
| Beat-synchronised caption pulses need audio analysis glued to render | `beat_synced_captions()` accepts an `audio-arrange` beat grid and schedules caption emphasis automatically |

---

## Installation

Core package (requires ffmpeg on PATH):

```bash
pip install caption-cast
```

With lyric-sync support (LRC / MusicXML ingestion, word-level timing):

```bash
pip install "caption-cast[lyrics]"
```

With audio-arrange support (beat detection, tempo analysis):

```bash
pip install "caption-cast[beats]"
```

Full extras:

```bash
pip install "caption-cast[all]"
```

Development:

```bash
pip install "caption-cast[dev]"
```

ffmpeg must be available on your system PATH. Install via:

- Ubuntu/Debian: `sudo apt-get install ffmpeg`
- macOS: `brew install ffmpeg`
- Windows: `choco install ffmpeg` or download from https://ffmpeg.org/download.html

---

## Quick start

### 1. Burn subtitles from an SRT file

```python
from caption_cast import burn_subtitles

burn_subtitles(
    video="input.mp4",
    subtitles="dialogue.srt",
    style="clean_lower_third",
    output="output.mp4",
)
```

The `subtitles` parameter accepts a path to any pysubs2-readable format (SRT, ASS, VTT, SSA).

### 2. Karaoke word-by-word highlighting from an LRC file

```python
from caption_cast import burn_lyrics

burn_lyrics(
    video="music_video.mp4",
    lyrics="song.lrc",          # LRC with word-level timestamps
    style="karaoke_neon",
    output="music_video_captioned.mp4",
    highlight_color="#FF3399",
    base_color="#FFFFFF",
)
```

If `lyric-sync` is installed, you can pass a `lyric_sync.LyricTrack` object directly instead of a
file path. See lyric-sync's README for how to produce one from MusicXML or Spotify API data.

### 3. Beat-synchronised caption pulses

```python
from caption_cast import beat_synced_captions
from audio_arrange import analyse_beats   # requires caption-cast[beats]

beat_grid = analyse_beats("track.wav")

beat_synced_captions(
    video="clip.mp4",
    subtitles="lyrics.srt",
    beat_grid=beat_grid,
    style="pulse_bold",
    output="clip_captioned.mp4",
)
```

On each beat onset the active caption receives a scale/opacity pulse governed by the style's
`beat_emphasis` parameters. The default pulse lasts 80 ms and can be overridden per call.

---

## The 20 styles

| Slug | Display name | Primary use case |
|---|---|---|
| `clean_lower_third` | Clean Lower Third | Documentary, interview subtitles |
| `clean_center` | Clean Center | Narrative subtitles, no music |
| `minimal_top` | Minimal Top | B-roll with top-aligned text |
| `bold_lower_third` | Bold Lower Third | Social media, short-form video |
| `bold_center` | Bold Center | Impact titles doubling as captions |
| `outline_lower_third` | Outline Lower Third | Subtitles over bright or variable backgrounds |
| `outline_center` | Outline Center | Lyrics on music videos with complex backgrounds |
| `drop_shadow_lower` | Drop Shadow Lower | Standard broadcast-style lower third |
| `karaoke_classic` | Karaoke Classic | Word-by-word left-to-right wipe highlight |
| `karaoke_neon` | Karaoke Neon | Word highlight with neon glow effect |
| `karaoke_fill` | Karaoke Fill | Word highlight with solid fill colour swap |
| `karaoke_bounce` | Karaoke Bounce | Word highlight with vertical bounce on activation |
| `karaoke_wave` | Karaoke Wave | Sequential per-character wave on active word |
| `pulse_bold` | Pulse Bold | Caption scales on beat onset, bold weight |
| `pulse_glow` | Pulse Glow | Caption glow radius pulses on beat onset |
| `pulse_color` | Pulse Color | Caption colour shifts on beat onset |
| `fade_word` | Fade Word | Each word fades in on its start timecode |
| `slide_up` | Slide Up | Line slides up into position on entry |
| `typewriter` | Typewriter | Characters reveal left-to-right at constant rate |
| `pop_center` | Pop Center | Caption pops to full size from zero scale |

All 20 styles are parametric. Every parameter has a default; pass keyword arguments to `apply_caption`,
`burn_lyrics`, or `burn_subtitles` to override individual parameters without changing the base style.

---

## Lyrics input formats

caption-cast accepts timed lyrics in three ways:

**pysubs2-readable files** — SRT, ASS, SSA, VTT. Word-level timing in ASS/SSA format is supported
for karaoke styles. Pass the file path as the `lyrics` or `subtitles` argument.

**LRC files** — Standard LRC with line timestamps. Enhanced LRC with word-level `<mm:ss.xx>` tags
is required for karaoke styles. Pass the file path; caption-cast parses LRC internally via pysubs2.

**lyric-sync LyricTrack** — When `lyric-sync >= 0.1` is installed, pass a `LyricTrack` object
directly. This is the preferred path when you need to ingest MusicXML, parse Spotify's lyrics API,
or align phoneme boundaries. lyric-sync produces word-level timecodes that map cleanly onto the
karaoke styles.

```python
# With lyric-sync installed
from lyric_sync import parse_lrc, align_words
from caption_cast import burn_lyrics

track = parse_lrc("song.lrc")
aligned = align_words(track, audio="song.wav")   # optional phoneme alignment

burn_lyrics(
    video="clip.mp4",
    lyrics=aligned,          # LyricTrack object accepted directly
    style="karaoke_wave",
    output="out.mp4",
)
```

---

## CLI

caption-cast ships a CLI entry point at `caption-cast`.

List all styles:

```bash
caption-cast styles
```

Burn subtitles from the command line:

```bash
caption-cast burn \
  --video input.mp4 \
  --subtitles dialogue.srt \
  --style clean_lower_third \
  --output output.mp4
```

Burn karaoke lyrics:

```bash
caption-cast lyrics \
  --video music_video.mp4 \
  --lyrics song.lrc \
  --style karaoke_neon \
  --highlight-color "#FF3399" \
  --output out.mp4
```

Inspect a style's parameters:

```bash
caption-cast info karaoke_neon
```

Get version:

```bash
caption-cast --version
```

All `burn` and `lyrics` subcommand options map 1-to-1 to the Python API keyword arguments.
Run `caption-cast <subcommand> --help` for a full parameter list.

---

## Composition with the Trollfabriken stack

caption-cast sits between the ingestion packages (lyric-sync, audio-arrange) and the render engine
(text-fx). The dependency chain is:

```
lyric-sync ──┐
              ├──► caption-cast ──► text-fx ──► ffmpeg
audio-arrange ┘
```

caption-cast does not need lyric-sync or audio-arrange at runtime if you supply pre-timed subtitle
files. Install the extras only when you need programmatic lyrics ingestion or beat analysis.

For title cards and motion-graphic intro/outro sequences, see title-fx, which builds on text-fx
with a separate catalog of 44 cinematic title effects.

---

## Package structure

```
caption-cast/
  src/
    caption_cast/
      __init__.py          # public API: apply_caption, burn_lyrics, burn_subtitles,
      api.py               #             beat_synced_captions, list_styles, get_style_info
      cli.py               # caption-cast CLI entry point
      renderer.py          # delegates to text-fx render pipeline
      timing.py            # timecode parsing, word-level offset resolution
      beat.py              # beat grid → caption emphasis schedule
      styles/
        __init__.py        # style registry
        catalog.py         # style definitions (parametric dataclasses)
      data/
        caption_styles.json  # serialised style catalog (shipped in wheel)
  tests/
```

---

## License

MIT. Copyright 2026 Trollfabriken AITrix AB.

Part of the [Trollfabriken stack](https://github.com/tomastimelock).

- PyPI: https://pypi.org/project/caption-cast/
- Issues: https://github.com/tomastimelock/caption-cast/issues
- text-fx (render engine): https://github.com/tomastimelock/text-fx
- lyric-sync (lyrics ingestion): https://github.com/tomastimelock/lyric-sync
- audio-arrange (beat detection): https://github.com/tomastimelock/audio-arrange
