Metadata-Version: 2.4
Name: makesub
Version: 0.1.1
Summary: Generate SRT subtitles from video/audio files using Whisper
Author-email: Akshay Gupta <hi@akshaygpt.com>
License: MIT
Project-URL: Homepage, https://github.com/akshaygpt/makesub
Keywords: subtitles,srt,whisper,transcription,video
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Topic :: Multimedia :: Video
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Environment :: Console
Requires-Python: >=3.9
Description-Content-Type: text/markdown
Requires-Dist: openai-whisper>=20231117
Requires-Dist: torch>=1.12

# makesub

**makesub** is a command-line tool that automatically generates SRT subtitle files from any video or audio file. It uses [OpenAI Whisper](https://github.com/openai/whisper), a state-of-the-art speech recognition model, to transcribe spoken audio into accurate, timestamped subtitles.

No API key required. Everything runs locally on your machine.

```bash
makesub lecture.mp4
# → lecture.srt
```

---

## Who is this for?

- Content creators who want subtitles for YouTube videos, reels, or podcasts
- Developers building subtitle pipelines
- Researchers transcribing interviews or recordings
- Anyone who needs fast, offline, accurate subtitles from a video file

---

## Features

- Generates standard `.srt` subtitle files ready for use in any video editor or player
- Powered by OpenAI Whisper: no API key needed, and no internet connection needed once the model has been downloaded
- Supports 99+ languages, with automatic language detection via `--language auto`
- Native Apple Silicon support (MPS acceleration on M1/M2/M3 Macs)
- Handles MP4, MOV, MKV, AVI, MP3, WAV, M4A, and any format ffmpeg can read
- Clear error messages for common problems (missing ffmpeg, no audio track, silent video, etc.)
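
For readers unfamiliar with the `.srt` format mentioned above, here is a minimal sketch of how timestamped segments can be rendered as SRT text. It assumes segments shaped like the dictionaries Whisper returns (`start`, `end`, `text`). The function names `format_timestamp` and `segments_to_srt` are illustrative and are not necessarily how makesub is implemented internally.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render an iterable of {"start", "end", "text"} dicts as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each cue is: a 1-based index, a time range, then the subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

Each cue is a numbered block with a `start --> end` line, and SRT uses a comma (not a period) before the milliseconds.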

---

## Requirements

- Python 3.9+
- [ffmpeg](https://ffmpeg.org/) — required for audio decoding

Install ffmpeg on macOS:
```bash
brew install ffmpeg
```

Install ffmpeg on Ubuntu/Debian:
```bash
sudo apt install ffmpeg
```
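
If you call makesub from your own scripts, you may want to check for ffmpeg up front. The sketch below uses only the Python standard library. The helper name `ffmpeg_available` is hypothetical and not part of makesub.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the named executable can be found on PATH."""
    return shutil.which(executable) is not None
```

A script could call `ffmpeg_available()` before starting a long batch and exit early with an install hint if it returns `False`.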

---

## Installation

```bash
pip install makesub
```

> **Apple Silicon (M1/M2/M3):** Install PyTorch first to ensure you get the MPS-accelerated build, then install makesub:
> ```bash
> pip install torch
> pip install makesub
> ```
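
To confirm which compute device PyTorch can see after installation, a rough check like the one below may help. The preference order (CUDA, then MPS, then CPU) is an assumption made for this sketch, and `pick_device` is a hypothetical helper, not makesub's actual device-selection logic.

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed; fall back so the sketch still runs.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is present in PyTorch 1.12 and later.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"
```

On an Apple Silicon Mac with an MPS-enabled PyTorch build, this should report `mps`. If it reports `cpu`, reinstalling PyTorch as described above is a reasonable first step.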

---

## Usage

```bash
makesub <video_or_audio_file> [options]
```

The subtitle file is written to the same directory as the input file by default.

```bash
makesub video.mp4
# Output: video.srt
```

### Options

| Flag | Default | Description |
|------|---------|-------------|
| `--model` | `base` | Whisper model: `tiny`, `base`, `small`, `medium`, `large`, `large-v3`, `turbo` |
| `--language` | `en` | Language code (e.g. `en`, `fr`, `de`, `ja`, `zh`). Use `auto` to detect automatically |
| `--output` | alongside input | Output `.srt` path or directory |
| `--device` | auto | Force compute device: `cpu`, `mps`, `cuda` |
| `--verbose` | off | Print each decoded segment in real time |
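
Because `--output` accepts either a file path or a directory, the final subtitle path depends on what you pass. The sketch below shows one plausible way to resolve it. The rule that any path not ending in `.srt` is treated as a directory is an assumption for illustration, and `resolve_output_path` is a hypothetical name rather than makesub's own code.

```python
from pathlib import Path
from typing import Optional


def resolve_output_path(input_file: str, output: Optional[str] = None) -> Path:
    """Work out where the .srt file would be written."""
    src = Path(input_file).expanduser()
    if output is None:
        # Default: same directory and base name as the input file.
        return src.with_suffix(".srt")
    out = Path(output).expanduser()
    # Assumption: an existing directory, or any path without a .srt
    # suffix, is treated as a target directory.
    if out.is_dir() or out.suffix.lower() != ".srt":
        return out / (src.stem + ".srt")
    return out
```

Under these assumptions, `resolve_output_path("clips/video.mp4", "subs")` points at `subs/video.srt`, while an explicit `.srt` path is used unchanged.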

### Examples

```bash
# Subtitle an English video (default)
makesub interview.mp4

# Use a more accurate model for better results
makesub documentary.mp4 --model medium

# Auto-detect the spoken language
makesub foreign_film.mp4 --language auto

# Subtitle a French video
makesub podcast.mp3 --language fr

# Save the subtitle file to a specific location
makesub recording.mov --output ~/Desktop/recording.srt

# Watch segments appear in real time (useful for long files)
makesub lecture.mp4 --verbose
```
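
For subtitle pipelines, the same commands can be generated for a whole folder. This sketch only builds the argument lists and uses nothing beyond the flags documented above. The `build_commands` helper and the `MEDIA_SUFFIXES` set are illustrative choices, not part of makesub.

```python
from pathlib import Path
from typing import List

# A subset of the formats listed in this README; extend as needed.
MEDIA_SUFFIXES = {".mp4", ".mov", ".mkv", ".mp3", ".wav", ".m4a"}


def build_commands(folder: str, model: str = "base",
                   language: str = "en") -> List[List[str]]:
    """Build one makesub command (as an argv list) per media file."""
    commands = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in MEDIA_SUFFIXES:
            commands.append(
                ["makesub", str(path), "--model", model, "--language", language]
            )
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to run the files one after another.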

---

## Choosing a model

Larger models are slower but produce significantly more accurate subtitles. The `base` model is a good starting point for most use cases.

| Model | Download size | Speed relative to `large-v3` | Best for |
|-------|------|----------------|----------|
| `tiny` | 75 MB | ~32x | Quick drafts, short clips |
| `base` | 145 MB | ~16x | Everyday use (default) |
| `small` | 465 MB | ~6x | Better accuracy, still fast |
| `medium` | 1.5 GB | ~2x | High accuracy |
| `large-v3` | 3 GB | 1x | Best possible accuracy |
| `turbo` | ~1.5 GB | ~8x | Fast with good accuracy |

Models are downloaded automatically on first use and cached in `~/.cache/whisper/`.
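
To see which models are already downloaded and how much disk space they use, you can inspect the cache directory. The sketch below assumes checkpoints are stored as `.pt` files in that directory. `cached_models` is a hypothetical helper, not a makesub command.

```python
from pathlib import Path
from typing import Dict, Optional


def cached_models(cache_dir: Optional[str] = None) -> Dict[str, float]:
    """Map each cached checkpoint name to its size in MB."""
    if cache_dir:
        root = Path(cache_dir).expanduser()
    else:
        root = Path.home() / ".cache" / "whisper"
    if not root.is_dir():
        # Nothing has been downloaded yet.
        return {}
    return {
        p.stem: round(p.stat().st_size / 1_000_000, 1)
        for p in sorted(root.glob("*.pt"))
    }
```

Deleting a file from the cache directory frees the space, and the model is downloaded again the next time it is requested.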

---

## Supported file formats

Any format that ffmpeg can decode, including:

`mp4` `mov` `mkv` `avi` `webm` `flv` `m4v` `mp3` `wav` `m4a` `aac` `ogg` `flac` `wma`

---

## Troubleshooting

**`ffmpeg not found`**
Install ffmpeg — see Requirements above.

**`No speech detected`**
The default language is `en`, so try `--language auto` (or the correct language code) if the speech is not in English. Also check that the file actually has an audio track.

**`Not enough memory to load the model`**
Switch to a smaller model than the one that failed, for example `--model small` or `--model tiny`.

**`Permission denied` reading a file on macOS**
Terminal may need Full Disk Access. Go to System Settings > Privacy & Security > Full Disk Access and enable your terminal app.

---

## License

MIT

---

## Acknowledgements

Built on top of [OpenAI Whisper](https://github.com/openai/whisper). Audio decoding powered by [ffmpeg](https://ffmpeg.org/).
