Metadata-Version: 2.4
Name: gencast
Version: 1.2.2
Summary: Generate conversational podcasts from documents using AI
Author-email: Mae Capacite <cadrianmae@users.noreply.github.com>
License: MIT
Project-URL: Homepage, https://github.com/cadrianmae/podcast-ai
Project-URL: Repository, https://github.com/cadrianmae/podcast-ai
Project-URL: Issues, https://github.com/cadrianmae/podcast-ai/issues
Keywords: podcast,ai,tts,openai,anthropic,notebooklm,education
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Education
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: click>=8.1.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pyyaml>=6.0
Requires-Dist: jinja2>=3.1.0
Requires-Dist: litellm>=1.30.0
Requires-Dist: openai>=1.0.0
Requires-Dist: pydub>=0.25.1
Requires-Dist: numpy>=1.24.0
Requires-Dist: scipy>=1.11.0
Requires-Dist: httpx>=0.25.0
Requires-Dist: mistralai>=1.0.0
Requires-Dist: pypdf>=3.0.0
Requires-Dist: audioop-lts>=0.2.0; python_version >= "3.13"
Requires-Dist: rich>=13.0.0
Requires-Dist: srt>=3.5.0
Requires-Dist: tiktoken>=0.7.0
Provides-Extra: test
Requires-Dist: pytest>=7.0.0; extra == "test"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "test"
Requires-Dist: vcrpy>=5.0.0; extra == "test"
Requires-Dist: ffmpeg-python>=0.2.0; extra == "test"
Provides-Extra: dev
Requires-Dist: basedpyright>=1.21.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Provides-Extra: all
Requires-Dist: gencast[dev,test]; extra == "all"

# gencast

Generate conversational podcasts from documents using AI. A cost-effective, customisable, local-first alternative to NotebookLM.

```text
gencast notebook.yaml  ->  podcast.m4a (with embedded subtitles)
```

## Install

```bash
pip install gencast
```

System dependency: `ffmpeg` (for audio combining and M4A muxing).

API keys (export or use `gencast init` to be prompted):

```bash
export OPENAI_API_KEY="sk-..."          # required (TTS + Whisper)
export ANTHROPIC_API_KEY="sk-ant-..."   # required (default outline + transcript)
export MISTRAL_API_KEY="..."            # optional (better PDF extraction)
```
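Before a first run it can help to confirm that the prerequisites above are in place. The following is a minimal sketch, not part of gencast: the `preflight` helper is hypothetical, and it only checks that `ffmpeg` is on `PATH` and that the two required keys are set (the optional `MISTRAL_API_KEY` is deliberately not checked).

```python
import os
import shutil
from typing import Callable, Mapping, Optional

# Keys the README marks as required; MISTRAL_API_KEY is optional and skipped.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def preflight(
    env: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    for key in REQUIRED_KEYS:
        if not env.get(key):
            problems.append(f"{key} is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("missing:", problem)
```

Passing `env` and `which` as parameters keeps the check easy to exercise without touching the real environment.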

## Quickstart

```bash
gencast init                        # interactive notebook wizard
gencast preview notebook.yaml       # outline-only dry run (free)
gencast generate notebook.yaml      # full pipeline -> out/<basename>.m4a
```

Or one-shot from a markdown file (uses default profiles):

```bash
gencast generate path/to/lecture.md
```

## Three-axis profile system

Each notebook composes three orthogonal profiles:

```yaml
speaker_profile: revision-duo       # WHO speaks (1-4 voices, personas)
episode_profile: exam-revision      # WHAT kind of podcast (briefing, segments, models)
room_profile:    small-room         # HOW it sounds (spatial pipeline)
```

List bundled profiles:

```bash
gencast list-profiles --type speakers
gencast list-profiles --type episodes
gencast list-profiles --type rooms
```

Profiles are resolved in a cascade, highest priority first:

1. `./gencast/profiles/<kind>/<name>.yaml` (project)
2. `~/.config/gencast/profiles/<kind>/<name>.yaml` (XDG)
3. bundled defaults

Override per-notebook via an `overrides:` block in the notebook YAML.
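The lookup order can be sketched as a first-match search over the two on-disk locations, falling back to the bundled default. This is an illustration only, not gencast's actual resolver: `default_roots` and `resolve_profile` are hypothetical names, and using `rooms` as the `<kind>` directory is an assumption borrowed from the `--type` values.

```python
from pathlib import Path
from typing import Optional, Sequence


def default_roots(cwd: Path, home: Path) -> list[Path]:
    """Project directory first, then the XDG config directory."""
    return [
        cwd / "gencast" / "profiles",
        home / ".config" / "gencast" / "profiles",
    ]


def resolve_profile(kind: str, name: str, roots: Sequence[Path]) -> Optional[Path]:
    """Return the first existing <root>/<kind>/<name>.yaml, else None.

    None stands for "fall back to the bundled default".
    """
    for root in roots:
        candidate = root / kind / f"{name}.yaml"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    roots = default_roots(Path.cwd(), Path.home())
    # "rooms" as the directory name is an assumption, see the note above.
    print(resolve_profile("rooms", "small-room", roots))
```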

## Worked example

`./photosynthesis/notebook.yaml`:

```yaml
title: Photosynthesis revision
sources:
  - lectures/photosynthesis.md
  - lectures/calvin-cycle.md
speaker_profile: revision-duo
episode_profile: exam-revision
room_profile: small-room
output:
  basename: photosynthesis-revision
  formats: [m4a]
overrides:
  briefing_suffix: |
    Pay specific attention to the distinction between the light-dependent
    reactions and the Calvin cycle. Include one worked Q&A on this distinction.
```

```bash
gencast generate photosynthesis/notebook.yaml
# -> photosynthesis/out/photosynthesis-revision.m4a
```
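Scripts that post-process the result need to know where the file lands. Judging from the example above, the output goes into an `out/` directory next to the notebook and is named after `output.basename`. A small sketch of that convention follows; `expected_output` is a hypothetical helper, and the rule is inferred from this README rather than from the gencast source.

```python
from pathlib import PurePosixPath


def expected_output(notebook: str, basename: str, fmt: str = "m4a") -> PurePosixPath:
    """Place the output in an out/ directory next to the notebook file."""
    return PurePosixPath(notebook).parent / "out" / f"{basename}.{fmt}"


print(expected_output("photosynthesis/notebook.yaml", "photosynthesis-revision"))
# prints: photosynthesis/out/photosynthesis-revision.m4a
```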

## Cost

Typical 10-min podcast (~5K-token source, 6 segments, 2 speakers):

| Component | Default model | Cost |
|---|---|---|
| Outline | `claude-haiku-4-5` | ~$0.005 |
| Transcript (with prompt cache) | `claude-sonnet-4-5` | ~$0.10 |
| TTS | `openai/tts-1-hd` | ~$0.06 |
| Subtitles | native (no Whisper) | $0.00 |
| **Total** | | **~$0.17** |

Use `--model` overrides or different episode profiles to trade quality for cost.
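To see where the money goes, the table can be summed and broken into shares. The figures below are copied from the table; the script is only a worked check of the arithmetic and uses `Decimal` to avoid floating-point rounding surprises.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-component figures copied from the cost table (approximate USD).
COMPONENTS = {
    "outline": Decimal("0.005"),
    "transcript": Decimal("0.10"),
    "tts": Decimal("0.06"),
    "subtitles": Decimal("0.00"),
}

total = sum(COMPONENTS.values(), Decimal("0"))
rounded = total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
shares = {name: cost / total for name, cost in COMPONENTS.items()}

print(f"total ~${rounded}")  # prints: total ~$0.17
for name, share in shares.items():
    print(f"{name:<11}{share:.0%}")
```

The transcript stage is the largest share, which is why swapping its model is the most effective cost lever.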

## Caches

- **TTS cache** -- `~/.cache/gencast/tts/` -- always on. Re-runs pay only for sentences that changed.
- **LLM cache** -- `~/.cache/gencast/llm/` -- opt-in via `--cache-llm`. Off by default since dialogue is non-deterministic.
- **PDF extract cache** -- `~/.cache/gencast/extract/` -- always on for Mistral PDF extraction.

Manage:

```bash
gencast cache status
gencast cache clear --type tts --yes
```
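The TTS cache's "only changed sentences" behaviour is what a content-addressed cache gives you. The sketch below shows the general idea under stated assumptions; it is not gencast's implementation. The key layout (model, voice, sentence), the SHA-256 digest, and both function names are hypothetical.

```python
import hashlib


def cache_key(model: str, voice: str, sentence: str) -> str:
    """Hypothetical key: a SHA-256 digest over everything that affects the audio."""
    payload = "\x1f".join((model, voice, sentence)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def sentences_to_synthesise(model, voice, sentences, cached_keys):
    """Return only the sentences whose key is not already in the cache."""
    return [s for s in sentences if cache_key(model, voice, s) not in cached_keys]


# First run caches two sentences; the second run edits one of them.
first = ["Plants absorb light.", "The Calvin cycle fixes carbon."]
cached = {cache_key("tts-1-hd", "alloy", s) for s in first}
second = ["Plants absorb light.", "The Calvin cycle fixes carbon dioxide."]
print(sentences_to_synthesise("tts-1-hd", "alloy", second, cached))
# prints: ['The Calvin cycle fixes carbon dioxide.']
```

Because the voice and model are part of the key in this sketch, changing either would also invalidate the cached audio.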

## CLI reference

```text
gencast NB.yaml                       generate (alias for `gencast generate NB.yaml`)
gencast init [--copy NB] [--minimal]  interactive notebook wizard
gencast preview NB.yaml               outline-only dry run
gencast generate NB.yaml              full pipeline -> m4a + sidecars
gencast estimate NB.yaml [--json] [--no-suggestions]
                                      predict USD cost before running. +-25% uncertainty.
gencast estimate --rates-only [--json]
                                      dump per-1k-token rates for bundled-default models.
gencast list-profiles [--type X]      enumerate profiles in cascade
gencast subtitle audio.mp3            re-subtitle external audio (Whisper)
gencast cache status [--type X]       inspect cache sizes
gencast cache clear [--type X] [--yes]
```

Verbosity: `-v`, `-vv`, `-q`, `--silent`, `--log-file PATH`.

### Cost preview

Predict cost before generating:

```bash
gencast estimate my-lecture.yaml
# gencast estimate -- my-lecture.yaml
# ================================================================
# Source:    12,840 tokens  (1 file)
#
# Stage breakdown                                          est. USD
# ------------------------------------------------------  --------
# Extract                                                    $0.00
# Outline      claude-haiku-4-5      . 13.0k in              $0.04
# Transcript   claude-sonnet-4-5     . 6 segs/~1.4k          $0.18
# TTS          openai/tts-1-hd       . ~4,500 chars          $0.14
# Whisper      whisper-1             . ~6.0 min              $0.04
#                                                          --------
#                                                  Total:    $0.40
#                                                            +-25%
#
# Cheaper alternatives
#   transcript   claude-sonnet-4-5  -> claude-haiku-4-5  saves ~$0.13 (-72%)
#                  (quality trade-off -- see docs)
```
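The sample output can be checked by hand: the stages sum to the total, the +-25% band gives a plausible range, and the quoted saving is a share of the transcript stage. A short worked check, with all figures copied from the sample above:

```python
from decimal import Decimal

# Stage estimates copied from the sample output (USD).
STAGES = {
    "extract": Decimal("0.00"),
    "outline": Decimal("0.04"),
    "transcript": Decimal("0.18"),
    "tts": Decimal("0.14"),
    "whisper": Decimal("0.04"),
}
UNCERTAINTY = Decimal("0.25")  # the +-25% shown under the total

total = sum(STAGES.values(), Decimal("0"))
low = total * (1 - UNCERTAINTY)
high = total * (1 + UNCERTAINTY)

# Saving quoted for switching the transcript model, relative to that stage.
saving = Decimal("0.13")
saving_share = saving / STAGES["transcript"]

print(f"total ${total}, plausible range ${low:.2f} to ${high:.2f}")
# prints: total $0.40, plausible range $0.30 to $0.50
print(f"transcript saving: {saving_share:.0%}")
# prints: transcript saving: 72%
```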

For scripts and skills, use `--json`:

```bash
gencast estimate my-lecture.yaml --json
```
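From Python, the JSON form can be consumed with `subprocess` and `json`. This is a sketch under two assumptions: that the JSON document is written to stdout, and that the caller inspects the parsed object itself, since the schema is not described here. `estimate_command` and `run_estimate` are hypothetical helper names; only the documented `--json` and `--no-suggestions` flags are used.

```python
import json
import subprocess


def estimate_command(notebook: str, suggestions: bool = True) -> list[str]:
    """Build the argv for `gencast estimate NB.yaml --json`."""
    argv = ["gencast", "estimate", notebook, "--json"]
    if not suggestions:
        argv.append("--no-suggestions")
    return argv


def run_estimate(notebook: str) -> object:
    """Run the CLI and parse stdout as JSON; raises if the command fails."""
    result = subprocess.run(
        estimate_command(notebook), capture_output=True, text=True, check=True
    )
    # Assumption: the JSON document is the only thing written to stdout.
    return json.loads(result.stdout)
```

Calling `run_estimate("my-lecture.yaml")` requires `gencast` on `PATH`; `check=True` turns a non-zero exit into a `CalledProcessError` instead of a silent parse failure.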

For the rate table only (used by Claude Code skills via dynamic context injection):

```bash
gencast estimate --rates-only --json
gencast estimate --rates-only --provider anthropic --json
gencast estimate --rates-only --all-models --json   # all ~2,700 LiteLLM models
```

## Tests

```bash
pytest tests/unit                          # fast, no API calls
pytest tests/component                     # vcrpy cassettes, no keys needed once recorded
GENCAST_TEST_E2E=1 pytest tests/e2e        # real API calls, costs a few cents
GENCAST_TEST_AUDIO=1 pytest tests/audio    # TTS + spatial audio (requires OPENAI_API_KEY)
```

## Specs and design

- [v1.0 design](docs/superpowers/specs/2026-05-07-gencast-v1-rewrite-design.md)
- [Plan A -- foundation](docs/superpowers/plans/2026-05-07-gencast-v1-plan-a-foundation.md)
- [Plan B -- pipeline](docs/superpowers/plans/2026-05-07-gencast-v1-plan-b-pipeline.md)
- [Plan C -- finishing](docs/superpowers/plans/2026-05-08-gencast-v1-plan-c-finishing.md)
- [Future work](docs/future-work.md)

## Claude Code integration

gencast ships with a Claude Code plugin that exposes four skills for conversational use inside Claude Code. The plugin is bundled with the `gencast` Python package, so no separate install is needed once you have run `pip install "gencast>=1.2.0"` (the quotes stop the shell from treating `>=` as a redirect).

In Claude Code, install the plugin once:

```text
/plugin install gencast
```

Then trigger any of the four skills with natural language:

| Skill | Example trigger | What it does |
|---|---|---|
| `notebook-init` | "draft a gencast notebook from these notes" | Builds `notebook.yaml` conversationally; picks profiles from the bundled catalogue. |
| `source-check` | "are these sources good for a podcast?" | Token-counts sources and predicts USD cost via `gencast estimate`. |
| `review-transcript` | "review this gencast transcript" | Reads `transcript.json` and flags awkward phrasing and flow problems. Advisory only; it does not auto-regenerate. |
| `cost-explain` | "explain my gencast cost.json" | Plain-language cost-by-stage breakdown with optimisation suggestions. |

Skills are workflow recipes that shell out to the gencast CLI. The CLI remains the source of truth and is fully usable without Claude Code or the plugin.

Source for the skills lives at `skills/` in the gencast repo; see `.claude-plugin/plugin.json` for the manifest.

## License

MIT.
