Metadata-Version: 2.4
Name: gencast
Version: 1.1.0
Summary: Generate conversational podcasts from documents using AI
Author-email: Mae Capacite <cadrianmae@users.noreply.github.com>
License: MIT
Project-URL: Homepage, https://github.com/cadrianmae/podcast-ai
Project-URL: Repository, https://github.com/cadrianmae/podcast-ai
Project-URL: Issues, https://github.com/cadrianmae/podcast-ai/issues
Keywords: podcast,ai,tts,openai,anthropic,notebooklm,education
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Education
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: click>=8.1.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: jinja2>=3.1.0
Requires-Dist: litellm>=1.30.0
Requires-Dist: openai>=1.0.0
Requires-Dist: pydub>=0.25.1
Requires-Dist: numpy>=1.24.0
Requires-Dist: scipy>=1.11.0
Requires-Dist: httpx>=0.25.0
Requires-Dist: mistralai>=1.0.0
Requires-Dist: pypdf>=3.0.0
Requires-Dist: audioop-lts>=0.2.0; python_version >= "3.13"
Requires-Dist: rich>=13.0.0
Requires-Dist: srt>=3.5.0
Requires-Dist: tiktoken>=0.7.0
Provides-Extra: test
Requires-Dist: pytest>=7.0.0; extra == "test"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "test"
Requires-Dist: vcrpy>=5.0.0; extra == "test"
Requires-Dist: ffmpeg-python>=0.2.0; extra == "test"
Provides-Extra: dev
Requires-Dist: basedpyright>=1.21.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Provides-Extra: all
Requires-Dist: gencast[dev,test]; extra == "all"

# gencast

Generate conversational podcasts from documents using AI. A cost-effective, customisable, local-first alternative to NotebookLM.

```text
gencast notebook.yaml  ->  podcast.m4a (with embedded subtitles)
```

## Install

```bash
pip install gencast
```

System dependency: `ffmpeg` (for audio combining and M4A muxing).

API keys (export them, or run `gencast init` to be prompted):

```bash
export OPENAI_API_KEY="sk-..."          # required (TTS + Whisper)
export ANTHROPIC_API_KEY="sk-ant-..."   # required (default outline + transcript)
export MISTRAL_API_KEY="..."            # optional (better PDF extraction)
```

## Quickstart

```bash
gencast init                        # interactive notebook wizard
gencast preview notebook.yaml       # outline-only dry run (free)
gencast generate notebook.yaml      # full pipeline -> out/<basename>.m4a
```

Or generate in one shot from a Markdown file (uses default profiles):

```bash
gencast generate path/to/lecture.md
```

## Three-axis profile system

Each notebook composes three orthogonal profiles:

```yaml
speaker_profile: revision-duo       # WHO speaks (1-4 voices, personas)
episode_profile: exam-revision      # WHAT kind of podcast (briefing, segments, models)
room_profile:    small-room         # HOW it sounds (spatial pipeline)
```

List bundled profiles:

```bash
gencast list-profiles --type speakers
gencast list-profiles --type episodes
gencast list-profiles --type rooms
```

Profiles cascade in precedence order: project
(`./gencast/profiles/<kind>/<name>.yaml`), then XDG
(`~/.config/gencast/profiles/<kind>/<name>.yaml`), then bundled defaults.
Override per-notebook via an `overrides:` block in the notebook YAML.
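
The cascade can be pictured as a first-match lookup over three directories. The sketch below is illustrative only and is not gencast's own code: `resolve_profile` and the `bundled_dir` location are hypothetical, the `rooms` directory name is a guess based on `--type rooms`, and it assumes an earlier match fully shadows later ones rather than merging with them.

```python
from pathlib import Path


def resolve_profile(kind, name, project_root, xdg_config, bundled_dir):
    """Return the first existing profile file in cascade order, or None.

    Hypothetical helper: the search order mirrors the README
    (project, then XDG, then bundled), nothing more.
    """
    filename = f"{name}.yaml"
    candidates = [
        Path(project_root) / "gencast" / "profiles" / kind / filename,  # project
        Path(xdg_config) / "gencast" / "profiles" / kind / filename,    # XDG
        Path(bundled_dir) / kind / filename,                            # bundled (assumed layout)
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    # "bundled-profiles" is a placeholder directory, not a real gencast path.
    found = resolve_profile(
        "rooms", "small-room", Path.cwd(), Path.home() / ".config", "bundled-profiles"
    )
    print(found)
```

To see which profiles the cascade actually exposes, `gencast list-profiles` remains the authoritative source.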

## Worked example

`./photosynthesis/notebook.yaml`:

```yaml
title: Photosynthesis revision
sources:
  - lectures/photosynthesis.md
  - lectures/calvin-cycle.md
speaker_profile: revision-duo
episode_profile: exam-revision
room_profile: small-room
output:
  basename: photosynthesis-revision
  formats: [m4a]
overrides:
  briefing_suffix: |
    Pay specific attention to the distinction between the light-dependent
    reactions and the Calvin cycle. Include one worked Q&A on this distinction.
```

```bash
gencast generate photosynthesis/notebook.yaml
# -> photosynthesis/out/photosynthesis-revision.m4a
```
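
The output location in this example follows the `out/<basename>.<format>` pattern next to the notebook. A small sketch of that mapping, assuming the pattern holds for every entry in `formats` (`expected_outputs` is a hypothetical helper, not part of gencast):

```python
from pathlib import Path


def expected_outputs(notebook_path, basename, formats):
    """Guess where outputs land: <notebook dir>/out/<basename>.<format>."""
    out_dir = Path(notebook_path).parent / "out"
    return [out_dir / f"{basename}.{fmt}" for fmt in formats]


if __name__ == "__main__":
    # Values taken from the worked example's notebook.yaml.
    for path in expected_outputs(
        "photosynthesis/notebook.yaml", "photosynthesis-revision", ["m4a"]
    ):
        print(path)
```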

## Cost

Typical 10-min podcast (~5K-token source, 6 segments, 2 speakers):

| Component | Default model | Cost |
|---|---|---|
| Outline | `claude-haiku-4-5` | ~$0.005 |
| Transcript (with prompt cache) | `claude-sonnet-4-5` | ~$0.10 |
| TTS | `openai/tts-1-hd` | ~$0.06 |
| Subtitles | native (no Whisper) | $0.00 |
| **Total** | | **~$0.17** |

Use `--model` overrides or different episode profiles to trade quality for cost.
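
As a quick check of the table's arithmetic, the component figures sum to $0.165, which rounds to the quoted ~$0.17. The snippet uses `Decimal` so the half-cent rounds predictably. The numbers are copied from the table, not from live pricing.

```python
from decimal import Decimal, ROUND_HALF_UP

# Approximate per-component costs from the table above (USD).
costs = {
    "outline": Decimal("0.005"),
    "transcript": Decimal("0.10"),
    "tts": Decimal("0.06"),
    "subtitles": Decimal("0.00"),
}

total = sum(costs.values(), Decimal("0"))
rounded = total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(f"exact sum: ${total}")    # exact sum: $0.165
print(f"rounded:   ${rounded}")  # rounded:   $0.17
```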

## Caches

- **TTS cache** -- `~/.cache/gencast/tts/` -- always on. Re-runs pay only for changed sentences.
- **LLM cache** -- `~/.cache/gencast/llm/` -- opt-in via `--cache-llm`. Off by default since dialogue is non-deterministic.
- **PDF extract cache** -- `~/.cache/gencast/extract/` -- always on for Mistral PDF extraction.

Manage:

```bash
gencast cache status
gencast cache clear --type tts --yes
```
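
One way to picture why TTS re-runs only pay for changed sentences is a content-addressed cache. The sketch below is an assumption for illustration, not gencast's implementation: the key scheme (a SHA-256 over model, voice and sentence text), the `.mp3` extension and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def tts_cache_path(cache_dir, model, voice, sentence):
    """Map (model, voice, sentence) to a cache file via a content hash."""
    # NUL separators keep ("ab", "c") and ("a", "bc") from colliding.
    payload = "\x00".join([model, voice, sentence]).encode("utf-8")
    key = hashlib.sha256(payload).hexdigest()
    return Path(cache_dir) / f"{key}.mp3"


def sentences_to_synthesize(cache_dir, model, voice, sentences):
    """Return only the sentences with no cached audio yet."""
    return [
        s for s in sentences
        if not tts_cache_path(cache_dir, model, voice, s).exists()
    ]
```

Under this scheme, editing one sentence changes only that sentence's key, so every other sentence is still a cache hit.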

## CLI reference

```text
gencast NB.yaml                       generate (alias for `gencast generate NB.yaml`)
gencast init [--copy NB] [--minimal]  interactive notebook wizard
gencast preview NB.yaml               outline-only dry run
gencast generate NB.yaml              full pipeline -> m4a + sidecars
gencast estimate NB.yaml [--json] [--no-suggestions]
                                      predict USD cost before running. +-25% uncertainty.
gencast estimate --rates-only [--json]
                                      dump per-1k-token rates for bundled-default models.
gencast list-profiles [--type X]      enumerate profiles in cascade
gencast subtitle audio.mp3            re-subtitle external audio (Whisper)
gencast cache status [--type X]       inspect cache sizes
gencast cache clear [--type X] [--yes]
```

Verbosity: `-v`, `-vv`, `-q`, `--silent`, `--log-file PATH`.

### Cost preview

Predict cost before generating:

```bash
gencast estimate my-lecture.yaml
# gencast estimate -- my-lecture.yaml
# ================================================================
# Source:    12,840 tokens  (1 file)
#
# Stage breakdown                                          est. USD
# ------------------------------------------------------  --------
# Extract                                                    $0.00
# Outline      claude-haiku-4-5      . 13.0k in              $0.04
# Transcript   claude-sonnet-4-5     . 6 segs/~1.4k          $0.18
# TTS          openai/tts-1-hd       . ~4,500 chars          $0.14
# Whisper      whisper-1             . ~6.0 min              $0.04
#                                                          --------
#                                                  Total:    $0.40
#                                                            +-25%
#
# Cheaper alternatives
#   transcript   claude-sonnet-4-5  -> claude-haiku-4-5  saves ~$0.13 (-72%)
#                  (quality trade-off -- see docs)
```
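
The figures in the sample output can be reproduced by hand. The sketch assumes the `+-25%` applies multiplicatively to the total and that the `-72%` is the saving relative to the transcript stage's own cost. Both helpers are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def uncertainty_band(total, percent):
    """Return (low, high) for total +/- percent, rounded to cents."""
    spread = total * Decimal(percent) / Decimal(100)
    low = (total - spread).quantize(CENT, rounding=ROUND_HALF_UP)
    high = (total + spread).quantize(CENT, rounding=ROUND_HALF_UP)
    return low, high


def saving_percent(saving, stage_cost):
    """Saving as a whole-number percentage of the stage's cost."""
    ratio = saving / stage_cost * 100
    return int(ratio.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


if __name__ == "__main__":
    low, high = uncertainty_band(Decimal("0.40"), 25)
    print(low, high)                                         # 0.30 0.50
    print(saving_percent(Decimal("0.13"), Decimal("0.18")))  # 72
```

So a $0.40 estimate should be read as roughly $0.30 to $0.50.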

For scripts and skills, use `--json`:

```bash
gencast estimate my-lecture.yaml --json
```
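
From a Python script, the same command can be wrapped with `subprocess`. This is a minimal sketch: it assumes the JSON document is written to stdout, and because the schema is not described here it returns the parsed object without reading any particular field. `estimate_command` and `run_estimate` are hypothetical helper names.

```python
import json
import subprocess


def estimate_command(notebook):
    """Build the argv for `gencast estimate <notebook> --json`."""
    return ["gencast", "estimate", str(notebook), "--json"]


def run_estimate(notebook):
    """Run the estimate and parse its stdout as JSON (assumed output channel)."""
    result = subprocess.run(
        estimate_command(notebook),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Calling `run_estimate("my-lecture.yaml")` requires `gencast` on `PATH`.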

For the rate table only (used by Claude Code skills via dynamic context injection):

```bash
gencast estimate --rates-only --json
gencast estimate --rates-only --provider anthropic --json
gencast estimate --rates-only --all-models --json   # all ~2,700 LiteLLM models
```

## Tests

```bash
pytest tests/unit                          # fast, no API calls
pytest tests/component                     # vcrpy cassettes, no keys needed once recorded
GENCAST_TEST_E2E=1 pytest tests/e2e        # real API calls, costs a few cents
GENCAST_TEST_AUDIO=1 pytest tests/audio    # TTS + spatial audio (requires OPENAI_API_KEY)
```

## Specs and design

- [v1.0 design](docs/superpowers/specs/2026-05-07-gencast-v1-rewrite-design.md)
- [Plan A -- foundation](docs/superpowers/plans/2026-05-07-gencast-v1-plan-a-foundation.md)
- [Plan B -- pipeline](docs/superpowers/plans/2026-05-07-gencast-v1-plan-b-pipeline.md)
- [Plan C -- finishing](docs/superpowers/plans/2026-05-08-gencast-v1-plan-c-finishing.md)
- [Future work](docs/future-work.md)

## License

MIT.
