Metadata-Version: 2.4
Name: kentui
Version: 0.1.0
Summary: Interactive CLI for kenkui — convert ebooks to audiobooks locally
Author-email: Sumner MacArthur <spn1kolat3sla@gmail.com>
License-Expression: GPL-3.0-or-later
Project-URL: Homepage, https://github.com/D1zzl3D0p/kentui
Project-URL: Bug Tracker, https://github.com/D1zzl3D0p/kentui/issues
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: kenkui>=1.2.0
Requires-Dist: inquirerpy>=0.3.4
Requires-Dist: rich>=13.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Dynamic: license-file

# kentui

![Python](https://img.shields.io/badge/python-3.12+-blue)
![Platform](https://img.shields.io/badge/platform-macOS%20%7C%20Linux%20%7C%20Windows-lightgrey)
![License](https://img.shields.io/github/license/D1zzl3D0p/kentui)
![PyPI](https://img.shields.io/pypi/v/kentui)

> **Freaky fast audiobook generation from ebooks. No GPU. No nonsense.**

kentui is the interactive CLI for [kenkui](https://github.com/D1zzl3D0p/kenkui) — an ebook-to-audiobook converter powered by [Kyutai's pocket-tts](https://github.com/kyutai-labs/pocket-tts), running entirely on CPU.

kentui handles the interactive parts: configuration wizard, voice management, chapter selection, and progress display. The actual conversion engine is kenkui, which kentui depends on.

---

## ✨ Features

- Freaky fast audiobook generation
- No GPU needed, 100% CPU
- Super high-quality text-to-speech
- Interactive hub with live status panel and Escape to go back
- **Multi-voice narration** — different voices for different characters, powered by an LLM
- **Chapter-voice mode** — assign a distinct voice to each chapter
- **Voice pool template** — persistent global defaults for automatic voice assignment
- **Credits chapter** — synthesized audio appended to every m4b
- Three tiers of voices: compiled, built-in, and custom
- Flexible chapter selection with presets and manual override
- Broadcast-quality audio post-processing chain
- Supports EPUB, MOBI/AZW/AZW3/AZW4, and FB2

---

## 🚀 Quick Start

### Requirements

- Python **3.12+**

### Install

```bash
pip install kentui
```

Or with uv / pipx:

```bash
uv tool install kentui
pipx install kentui
```

Compiled voices (~440 MB) are downloaded automatically on first run. To download them ahead of time:

```bash
kentui voices download
```

### Run

```bash
kentui book.epub
```

That's it. An interactive wizard walks you through the setup, then kentui runs the job and shows a live progress bar. You'll get a `book.m4b` alongside your ebook when it's done.

---

## 📚 Usage

### Interactive wizard (default)

```bash
kentui book.epub
```

This opens the configuration hub: a live status panel of your current settings, followed by a menu:

```
┌─ Current Settings ───────────────────────────────────────────┐
│  Mode:          Multi-voice                                   │
│  NLP:           Anthropic · claude-haiku-4-5                 │
│  TTS Provider:  pocket-tts · local                           │
│  Narrator:      sarah                                         │
│  Chapters:      content-only (42 selected)                   │
│  Quality:       temp 0.8 · 30 LSD steps · 96k               │
└───────────────────────────────────────────────────────────────┘

  > Submit Job
    Narrator Voice →
    Chapters →
    Narration Mode →
    Series →
    Advanced Options →
    Cancel
```

Press **Escape** at any step to go back. All settings persist to `~/.config/kenkui/last_job_profile.toml` and pre-load on the next run.

### Headless mode

Pass a config file with `-c` to skip the wizard entirely:

```bash
kentui book.epub -c my-config.toml
```

The command exits with status 0 on success and 1 on failure.
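
Because headless mode reports its result through the exit status, it is easy to drive from a script. Here is a rough sketch of a batch converter, assuming `kentui` is on your `PATH`. The helper names `headless_command` and `convert_all` are hypothetical; only the `kentui <book> -c <config>` invocation and the 0/1 exit codes come from this README.

```python
import subprocess
from pathlib import Path


def headless_command(book: Path, config: str) -> list[str]:
    """Build the documented headless invocation: kentui <book> -c <config>."""
    return ["kentui", str(book), "-c", config]


def convert_all(folder: Path, config: str) -> list[Path]:
    """Convert every .epub in `folder` and return the books that failed."""
    failed = []
    for book in sorted(folder.glob("*.epub")):
        # Headless mode exits 0 on success and 1 on failure.
        result = subprocess.run(headless_command(book, config))
        if result.returncode != 0:
            failed.append(book)
    return failed
```

For example, `convert_all(Path("~/books").expanduser(), "my-config.toml")` would return an empty list if every book converted cleanly.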

### `kentui add`

```bash
# Interactive wizard
kentui add book.epub

# Headless
kentui add book.epub -c my-config.toml
```

### Pipeline step commands

```bash
kentui parse book.epub       # Stage 1-2 NLP: entity scan + character clustering
kentui attribute book.epub   # Stage 3-4 NLP: speaker attribution
kentui generate book.epub    # TTS + stitch (requires prior NLP cache)
```
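
The step commands build on each other, so a script that calls them should stop at the first failure. A minimal sketch, assuming the stages run in the order listed above and that each command signals failure with a non-zero exit status (documented for headless mode, assumed here for the step commands). `stage_commands` and `run_pipeline` are illustrative names, not part of kentui.

```python
import subprocess

# Order matters: `generate` needs the NLP cache written by the earlier stages.
STAGES = ("parse", "attribute", "generate")


def stage_commands(book: str) -> list[list[str]]:
    """Return one `kentui <stage> <book>` command per pipeline stage."""
    return [["kentui", stage, book] for stage in STAGES]


def run_pipeline(book: str) -> None:
    """Run the stages in order, stopping at the first one that fails."""
    for command in stage_commands(book):
        # check=True raises CalledProcessError on a non-zero exit status
        # (assumed to mean failure, as in headless mode).
        subprocess.run(command, check=True)
```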

---

## 🎙️ Narration Modes

### Single voice

The default. One voice narrates everything.

### Multi-voice (character narration)

kenkui uses an NLP pipeline to identify characters and assigns each a distinct voice. The narrator gets its own voice too.

Two NLP backends are available: **Ollama** (local, default) and **cloud providers** (Anthropic, OpenAI, Google).

#### Ollama (default)

**Requirements:**
- [Ollama](https://ollama.com) running locally (`ollama serve`)
- The NLP model pulled with `ollama pull llama3.2` (`llama3.2` is the default)
- A spaCy model, downloaded automatically if missing

#### Cloud providers (Anthropic, OpenAI, Google)

Run `kentui config` and answer yes to "Configure a cloud NLP provider API key?" to set up credentials.

**Default models:**

| Provider | Default model |
|----------|--------------|
| `anthropic` | `claude-sonnet-4-6` |
| `openai` | `gpt-4o` |
| `google` | `gemini/gemini-2.0-flash` |

**How voice assignment works:**

After the character scan completes, voices are assigned in three tiers, highest priority first:

1. **Series record** — named character → pinned voice (highest priority)
2. **Voice pool template** — role + gender + rank → voice
3. **Round-robin pool** — any remaining characters
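
The three tiers can be pictured as a simple fall-through lookup. This is an illustrative sketch, not kenkui's actual code: the data shapes (a dict of pinned names, a dict keyed by role, gender, and rank, and a list for the pool) and the function name `assign_voices` are all assumptions made for the example.

```python
from itertools import cycle


def assign_voices(characters, series_record, template, pool):
    """Resolve one voice per character, trying the three tiers in order."""
    fallback = cycle(pool)  # tier 3: round-robin over a non-empty pool
    assigned = {}
    for char in characters:
        name = char["name"]
        key = (char["role"], char["gender"], char["rank"])
        if name in series_record:      # tier 1: voice pinned in the series record
            assigned[name] = series_record[name]
        elif key in template:          # tier 2: role + gender + rank
            assigned[name] = template[key]
        else:                          # tier 3: next voice from the pool
            assigned[name] = next(fallback)
    return assigned
```

A character pinned in the series record keeps that voice even when the template would also match, which is what keeps casting stable across books in a series.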

### Chapter-voice mode

Assign a distinct voice to each chapter. The wizard presents each chapter title and lets you pick a voice.

---

## 🗂️ Voice Pool Template

The voice pool template (`~/.config/kenkui/voice_pool.toml`) pre-assigns voices by character role, gender, and rank. It applies automatically to every multi-voice job.

```toml
[protagonist.male]
1 = "david"
2 = "james"
pool = ["oliver", "ethan"]

[protagonist.female]
1 = "sarah"
pool = ["emma", "claire"]

[supporting.male]
pool = ["oliver", "ethan", "marcus"]

[minor]
pool = []  # fallback: any non-excluded voice
```

---

## 🎙️ Voice System

Voices come in three tiers:

| Tier | Source | Auth required? |
|------|--------|---------------|
| **Compiled** | Downloaded from HuggingFace on first run | No |
| **Built-in** | 8 pocket-tts defaults | No |
| **Custom** | `.wav` files (user-provided or fetched) | Yes (HuggingFace) |

**Built-in voices:**
```
alba, marius, javert, jean, fantine, cosette, eponine, azelma
```

### Voice manager

```bash
kentui voices
```

Launches an interactive voice manager: browse, audition, manage the exclusion pool, and look up the character cast for a completed multi-voice book.

### Voice commands

```bash
# List voices (with optional filters)
kentui voices list
kentui voices list --gender Female
kentui voices list --accent Scottish
kentui voices list --source compiled

# Audition a voice
kentui voices audition <voice>
kentui voices audition <voice> --text "Your preview text here."

# Download compiled voices
kentui voices download
kentui voices download --force

# Fetch custom voices from HuggingFace
kentui voices fetch --repo user/repo-name

# Manage auto-assignment pool
kentui voices exclude <voice>
kentui voices include <voice>

# Look up a book's character cast
kentui voices cast <title>
```

---

## ⚙️ Configuration

```bash
# Create or edit the default config
kentui config

# Create a named config profile
kentui config fast-mode

# Use a named config
kentui book.epub -c fast-mode
```

### Key settings

| Key | Default | Description |
|-----|---------|-------------|
| `workers` | `cpu_count - 2` | Parallel TTS worker processes |
| `m4b_bitrate` | `96k` | Output audio bitrate |
| `temp` | `0.7` | Sampling temperature |
| `lsd_decode_steps` | `1` | LSD decode steps (higher = better quality, slower) |
| `default_voice` | `alba` | Fallback voice |
| `default_chapter_preset` | `content-only` | Default chapter filter preset |
| `pause_line_ms` | `800` | Pause between lines (ms) |
| `pause_chapter_ms` | `2000` | Pause between chapters (ms) |
| `pause_scene_break_ms` | `4000` | Pause at scene breaks (ms) |
| `nlp_provider` | `ollama` | NLP backend |
| `nlp_model` | `llama3.2` | Model for speaker attribution |
| `credits_enabled` | `true` | Append synthesized credits audio |

---

## 📖 Chapter Selection

| Preset | Description |
|--------|-------------|
| `content-only` | Body chapters only *(default)* |
| `chapters-only` | Titled chapters only |
| `with-parts` | Chapters and part headings |
| `all` | Every item in the ebook |
| `none` | Skip everything |

After selecting a preset, the wizard shows a checkbox list of all chapters with the preset's defaults pre-selected.
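
One way to think about presets is as filters that decide which boxes start out ticked. The sketch below is only a mental model: the `kind` tags, the `Chapter` class, and the exact rule behind each preset are assumptions based on the one-line descriptions in the table, not kenkui's real classification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chapter:
    title: str
    kind: str  # hypothetical tag: "front", "body", "part", or "back"


# Assumed reading of each preset from the table above.
PRESETS = {
    "content-only": lambda c: c.kind == "body",
    "chapters-only": lambda c: c.kind == "body" and bool(c.title),
    "with-parts": lambda c: c.kind in {"body", "part"},
    "all": lambda c: True,
    "none": lambda c: False,
}


def preselect(chapters: list[Chapter], preset: str) -> list[int]:
    """Return the indices that would start out ticked in the checkbox list."""
    return [i for i, chapter in enumerate(chapters) if PRESETS[preset](chapter)]
```

Whatever the preset ticks is only a starting point, since the checkbox list lets you override every entry by hand.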

---

## 🔊 Audio Post-Processing

kenkui applies a broadcast-quality effects chain: noise reduction → high-pass filter → low shelf EQ → presence boost → de-esser → compressor → limiter → autogain. All parameters are configurable via `kentui config`.

---

## FAQ

**Do I need a GPU?**
No. kenkui is 100% CPU-based.

**What ebook formats does it support?**
EPUB, MOBI/AZW/AZW3/AZW4, and FB2.

**What output format does it use?**
M4B, with chapters, metadata, and embedded covers.

**Do I need Ollama for multi-voice?**
No. Multi-voice works with either Ollama (local) or a cloud provider (Anthropic, OpenAI, or Google). Run `kentui config` to set up a cloud provider.

**Does it upload my books anywhere?**
With the default Ollama backend: no. With a cloud NLP provider, the book text is sent to that provider's API for the character scan. Nothing else is uploaded.

---

## 🙏 Special Thanks

Thanks to **Project Gutenberg** for providing some of the public-domain books included with kenkui.

---

## Voice Dataset Credits

kenkui's compiled voices are derived from two publicly available speech corpora.

### CSTR VCTK Corpus

> Veaux, Christoph; Yamagishi, Junichi; MacDonald, Kirsten. (2019). *CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit*. University of Edinburgh. The Centre for Speech Technology Research (CSTR).

Licensed under [Creative Commons Attribution 4.0 (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/).
Commercial use is permitted with attribution.

### EARS Dataset

Licensed under [Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/).

> **Note:** Compiled voices sourced from EARS (identifiable by `EARS` in the voice name via `kentui voices list`) **may not be used for commercial purposes**. If you are building a commercial product with kenkui, use only VCTK-sourced or built-in voices.
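
If you keep a list of voice names for a commercial project, the rule in the note is easy to apply mechanically. A minimal sketch, assuming you already have the names as strings (for example, copied from `kentui voices list`); `commercially_usable` is a hypothetical helper and the sample names in the usage line are invented.

```python
def commercially_usable(voice_names: list[str]) -> list[str]:
    """Drop voices whose name contains "EARS" (CC BY-NC 4.0 sourced)."""
    return [name for name in voice_names if "EARS" not in name]
```

For instance, `commercially_usable(["alba", "EARS_example", "vctk_example"])` keeps only the first and last entries.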
