Metadata-Version: 2.4
Name: soniox-transcribe
Version: 0.1.0
Summary: Generate subtitles from audio/video using Soniox Speech-to-Text API
Project-URL: Homepage, https://github.com/qyhfrank/soniox-transcribe
Project-URL: Repository, https://github.com/qyhfrank/soniox-transcribe
Project-URL: Issues, https://github.com/qyhfrank/soniox-transcribe/issues
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Requires-Python: >=3.10
Requires-Dist: httpx>=0.27.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: typer>=0.12.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.8.0; extra == 'dev'
Description-Content-Type: text/markdown

# soniox-transcribe

CLI tool for generating subtitles from audio/video files using the [Soniox](https://soniox.com) Speech-to-Text API.

## Features

- **Subtitle formats**: SRT, VTT, TXT
- **Video support**: Automatically extracts audio from video files via ffmpeg
- **Speaker diarization**: Speaker labels enabled by default
- **60+ languages**: With automatic language identification
- **Context support**: Custom terms and background text to improve accuracy (up to ~8,000 tokens)

## Installation

```bash
pip install soniox-transcribe
```

For video file support, [ffmpeg](https://ffmpeg.org) must be installed:

```bash
# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg
```
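
For orientation, extracting an audio track by hand looks roughly like the sketch below. The ffmpeg flags shown (`-vn` to drop video, `-ac 1` for mono, `-ar 16000` for a 16 kHz sample rate) and the helper names `build_extract_command` and `extract_audio` are illustrative assumptions, not necessarily what soniox-transcribe runs internally.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def build_extract_command(video: Path, audio: Path) -> List[str]:
    """Build an ffmpeg command that writes the audio track of `video` to `audio`.

    The flag choices are assumptions made for illustration only.
    """
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it already exists
        "-i", str(video),
        "-vn",           # drop the video stream
        "-ac", "1",      # downmix to mono
        "-ar", "16000",  # resample to 16 kHz
        str(audio),
    ]


def extract_audio(video: Path) -> Path:
    """Run ffmpeg and return the path of the extracted .wav file."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    audio = video.with_suffix(".wav")
    subprocess.run(build_extract_command(video, audio), check=True)
    return audio
```

Checking for ffmpeg with `shutil.which` before running gives a clear error when it is missing.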

## Configuration

Set your Soniox API key as an environment variable:

```bash
export SONIOX_API_KEY="<your-api-key>"
```

Get your API key at [console.soniox.com](https://console.soniox.com).

Alternatively, create a `.env` file in your working directory:

```
SONIOX_API_KEY=<your-api-key>
```
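
The two lookup locations above can be sketched with the standard library alone. This is an approximation under stated assumptions: the tool itself depends on pydantic-settings and may resolve the key differently, the precedence shown (environment variable first, then `.env`) is assumed, and `load_api_key` is a hypothetical helper.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(env_file: Path = Path(".env")) -> Optional[str]:
    """Return SONIOX_API_KEY from the environment, else from a .env file."""
    key = os.environ.get("SONIOX_API_KEY")
    if key:
        return key
    if env_file.is_file():
        for line in env_file.read_text(encoding="utf-8").splitlines():
            # Split "NAME=value" on the first "=" only.
            name, sep, value = line.strip().partition("=")
            if sep and name.strip() == "SONIOX_API_KEY":
                # Strip optional surrounding quotes; treat empty as missing.
                return value.strip().strip("'\"") or None
    return None
```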

## Usage

```bash
# Basic transcription (outputs .srt by default)
soniox-transcribe video.mp4

# Specify output file (format inferred from extension)
soniox-transcribe video.mp4 -o video.vtt

# Explicit format
soniox-transcribe audio.wav -f txt

# Transcribe from URL
soniox-transcribe --url https://example.com/audio.mp3

# Disable speaker diarization
soniox-transcribe interview.wav --no-speaker

# Add custom vocabulary
soniox-transcribe lecture.mp3 -t "Transformer" -t "BERT" -t "GPT"

# Provide background text for better accuracy
soniox-transcribe meeting.mp3 --context-text "Discussion about Q4 revenue targets"

# Use a context file
soniox-transcribe meeting.mp3 --context-text-file agenda.txt
```
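
The "format inferred from extension" behavior shown above could be sketched as follows. `resolve_format` is a hypothetical helper, and the precedence (explicit `-f` first, then the output extension, then the `srt` default) is an assumption rather than the tool's confirmed logic.

```python
from pathlib import Path
from typing import Optional

SUPPORTED_FORMATS = ("srt", "vtt", "txt")


def resolve_format(output: Optional[Path], explicit: Optional[str] = None) -> str:
    """Pick a subtitle format from an explicit flag or the output extension."""
    if explicit is not None:
        fmt = explicit.lower()
        if fmt not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported format: {explicit}")
        return fmt
    if output is not None:
        # ".VTT" -> "vtt"; unknown extensions fall through to the default.
        ext = output.suffix.lower().lstrip(".")
        if ext in SUPPORTED_FORMATS:
            return ext
    return "srt"
```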

### Options

| Option                 | Description                                             |
| ---------------------- | ------------------------------------------------------- |
| `-o`, `--output`       | Output file path (default: input with .srt extension)   |
| `-f`, `--format`       | Output format: `srt`, `vtt`, `txt`                      |
| `-u`, `--url`          | Transcribe from a public URL instead of a local file    |
| `-l`, `--language`     | Language hints (default: `zh`, `en`); repeatable        |
| `--no-speaker`         | Disable speaker diarization and labels                  |
| `--max-chars`          | Max characters per subtitle line (default: 42)          |
| `-m`, `--model`        | Soniox model (default: `stt-async-v4`)                  |
| `-t`, `--context-term` | Custom terms/vocabulary; repeatable                     |
| `--context-text`       | Background text for accuracy improvement                |
| `--context-text-file`  | File containing background text                         |
| `-c`, `--context`      | JSON string with full context configuration             |
| `--context-file`       | JSON file with full context configuration               |
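
To illustrate what a per-line character limit such as `--max-chars` means in practice, here is a minimal sketch using Python's `textwrap`. It is not the tool's actual line-breaking logic, which may also account for timing and speaker changes; `wrap_subtitle` is a hypothetical helper.

```python
import textwrap
from typing import List


def wrap_subtitle(text: str, max_chars: int = 42) -> List[str]:
    """Split text into lines of at most `max_chars` characters at word boundaries.

    A single word longer than `max_chars` is kept whole, so that line may
    exceed the limit.
    """
    return textwrap.wrap(text, width=max_chars, break_long_words=False)


sentence = "The quick brown fox jumps over the lazy dog near the quiet riverbank today"
for line in wrap_subtitle(sentence):
    print(line)
```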

## Supported Formats

**Audio**: mp3, wav, flac, aac, ogg, m4a, wma

**Video**: mp4, mkv, avi, mov, webm, flv, wmv, m4v (requires ffmpeg)

## License

[MIT](LICENSE)
