// open source audio pipeline

xil-pipeline

Script → Voice → Production.
Fully automated.

Transform AI-generated scripts into fully produced podcasts and audiobooks. Multi-voice TTS via ElevenLabs, layered sound effects, music beds, and ambient audio — all orchestrated from a single structured script format.

xil-pipeline — terminal
# Scaffold a new project workspace
$ xil-init my-show --show "My Podcast"
$ cd my-show
# Scan the sample script (pre-flight check)
$ xil-scan scripts/sample_S01E01.md
# Parse into structured JSON
$ xil-parse scripts/sample_S01E01.md --episode S01E01
# Preview TTS cost (no API calls)
$ xil-produce --episode S01E01 --dry-run
⠿ Parsing voices... ✓ done
⠿ Mixing audio layers... ✓ done
⠿ Exporting master... ✓ episode.mp3
🎙 Runtime: 12m 34s · 48kHz stereo · VBR MP3

The Pipeline

From raw idea to finished audio in five stages.

🧠
AI Script
Generate structured .xil.yaml scripts via LLM prompts
📝
Parse
Validate script format, assign voices, cues & timing
🎙
TTS Render
ElevenLabs synthesizes each voice track in parallel
🎵
Layer Audio
Mix SFX, music beds, and ambient sound layers
📦
Export
Output MP3, WAV, or chapter-split audiobook files
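
The parallel TTS stage can be pictured with a small sketch. This is an illustration only, not xil-pipeline's actual code: the `Line` shape, `fake_synthesize`, and `render_all` are hypothetical names, and the stub returns placeholder bytes instead of calling ElevenLabs. It also assumes one request per line, which may differ from how the real renderer batches work.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """One spoken line in a parsed script (hypothetical shape)."""
    index: int
    voice: str
    text: str


def fake_synthesize(line: Line) -> bytes:
    """Stand-in for a real TTS request; returns placeholder bytes, not audio."""
    return f"[{line.voice}] {line.text}".encode("utf-8")


def render_all(lines, synthesize=fake_synthesize, max_workers=4):
    """Render every line concurrently and return the clips in script order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even if calls finish out of order.
        return list(pool.map(synthesize, lines))


script = [
    Line(0, "host", "Welcome to the show."),
    Line(1, "guest", "Glad to be here."),
    Line(2, "host", "Let's get started."),
]
clips = render_all(script)
```

Threads suit this kind of work because each request spends most of its time waiting on the network; keeping results in script order means the later mixing stage does not have to re-sort clips.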

Hear It In Action

Sample outputs generated entirely by xil-pipeline — AI script, ElevenLabs voices, layered audio.

🎙
Tech Deep Dive — "The Rise of Open Source AI"
 2 voices · 8m 42s · Podcast format
podcast elevenlabs music-bed
📖
Audiobook Sample — "Chapter 1: The Signal"
 1 narrator · 14m 20s · Audiobook format
audiobook ambient-sfx chapters
🎭
Drama Short — "The Interview" (3-voice scene)
 3 voices · 5m 08s · Drama format
drama multi-voice sfx-layers

// Demo audio coming soon — watch the repo for release announcements.

Up and Running in Minutes

Install, write a script, render audio. That's the whole loop.

terminal
# Scaffold a new project workspace (creates a copy of the sample script)
$ xil-init my-show --show "My Podcast"
$ cd my-show

# Scan the sample script (pre-flight check)
$ xil-scan scripts/sample_S01E01.md

# Parse into structured JSON
$ xil-parse scripts/sample_S01E01.md --episode S01E01

# Preview TTS character cost (no API calls)
$ xil-produce --episode S01E01 --dry-run

# Generate voice and SFX stems (requires ELEVENLABS_API_KEY — see Environment below)
$ xil-produce --episode S01E01

# Export DAW layers for mixing in Audacity
$ xil-daw --episode S01E01

# Produce final master MP3
$ xil-master --episode S01E01
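
For unattended runs, the same steps could be chained from a short Python driver. This is a sketch under assumptions: it uses only the commands and flags shown above, it assumes each command exits with a non-zero status on failure, and `build_commands` and `run_episode` are hypothetical helper names, not part of xil-pipeline.

```python
import os
import subprocess


def build_commands(script_path, episode):
    """Return the quick-start commands, in order, as argv lists."""
    return [
        ["xil-scan", script_path],
        ["xil-parse", script_path, "--episode", episode],
        ["xil-produce", "--episode", episode, "--dry-run"],
        ["xil-produce", "--episode", episode],
        ["xil-daw", "--episode", episode],
        ["xil-master", "--episode", episode],
    ]


def run_episode(script_path, episode):
    """Run each step in turn, stopping at the first failure."""
    # The produce step needs an API key, so fail early if it is missing.
    if not os.environ.get("ELEVENLABS_API_KEY"):
        raise RuntimeError("ELEVENLABS_API_KEY is not set")
    for argv in build_commands(script_path, episode):
        # check=True raises CalledProcessError on a non-zero exit status
        # (assumes the xil-* commands signal failure that way).
        subprocess.run(argv, check=True)


commands = build_commands("scripts/sample_S01E01.md", "S01E01")
```

Calling `run_episode("scripts/sample_S01E01.md", "S01E01")` from inside the project directory would then run the whole loop; dropping it into a cron job or CI step is a matter of invoking that one function.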

Explore the Docs

Everything you need to build, extend, and contribute to xil-pipeline.

What xil-pipeline Does

A full audio production stack, driven by code.

🧠
AI Script Generation
Prompt any LLM to generate structured .xil.yaml scripts. Built-in templates for podcasts, audiobooks, interviews, and drama formats.
🎙
Multi-Voice TTS
Assign unique ElevenLabs voices to each character or narrator. Parallel rendering keeps production fast even for long-form content.
🎵
Layered Audio Mixing
Stack music beds, ambient sounds, and SFX cues with per-layer volume envelopes, crossfades, and ducking when voices are active.
📦
Flexible Export
Output to MP3, WAV, or FLAC. Audiobook mode splits by chapter markers. Podcast mode embeds ID3 metadata automatically.
🔌
Plugin Architecture
Swap in alternative TTS providers, custom audio processors, or your own script generators via a clean plugin interface.
CLI-First Design
Fully scriptable. Integrate into CI/CD pipelines, cron jobs, or content automation workflows with zero GUI required.
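
To make "ducking when voices are active" concrete, here is a minimal sketch of the general technique. It is not xil-pipeline's mixer: `duck_gains`, its parameters, and the frame-based voice-activity input are all assumptions chosen for illustration. The music bed's gain ramps down toward `duck_gain` while a voice is active and ramps back to full level afterwards.

```python
def duck_gains(voice_active, duck_gain=0.3, ramp_frames=4):
    """Per-frame gain for a music bed: full level in silence, reduced under speech.

    The gain moves linearly toward its target, covering the full distance
    between 1.0 and duck_gain in ramp_frames frames.
    """
    if ramp_frames < 1:
        raise ValueError("ramp_frames must be at least 1")
    step = (1.0 - duck_gain) / ramp_frames
    gains = []
    current = 1.0
    for active in voice_active:
        target = duck_gain if active else 1.0
        if current > target:
            current = max(target, current - step)  # ramp down, clamp at target
        elif current < target:
            current = min(target, current + step)  # ramp up, clamp at target
        gains.append(current)
    return gains


# Hypothetical voice-activity flags, one per frame.
activity = [False, False, True, True, True, True, True, False, False]
gains = duck_gains(activity, duck_gain=0.5, ramp_frames=2)
# gains == [1.0, 1.0, 0.75, 0.5, 0.5, 0.5, 0.5, 0.75, 1.0]
```

Multiplying each music frame by the matching gain produces the ducked bed. The ramp avoids an abrupt level jump at the start and end of speech, which is the same reason crossfades are used between layers.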

Join the Build

xil-pipeline is early-stage and actively welcoming contributors. Every PR counts.

  1. Fork & Clone

     Fork the repo on GitHub and clone it locally. Read CONTRIBUTING.md to get oriented.

  2. Pick an Issue

     Browse open issues tagged "good first issue" or "help wanted" to find a starting point.

  3. Build & Test

     Run the test suite, make your changes, and add tests for new functionality.

  4. Open a PR

     Submit your pull request with a clear description. We review quickly and give constructive feedback.

⭐ Star on GitHub 🐛 Browse Issues
License: AGPL-3.0
🟢 Active Development · PRs Welcome

// good first issues

Add support for SSML tags in voice cues
Write unit tests for the YAML parser
Add OpenAI script generation template
Document the plugin API with examples
Build a sample audiobook demo script