Metadata-Version: 2.4
Name: stt2desktop
Version: 0.2.0
Summary: Local speech-to-text for desktop using faster-whisper
Project-URL: Documentation, https://github.com/jedie/stt2desktop
Project-URL: Source, https://github.com/jedie/stt2desktop
Author-email: Jens Diemer <stt2desktop@jensdiemer.de>
License: GPL-3.0-or-later
License-File: LICENSE
Requires-Python: >=3.12
Requires-Dist: bx-py-utils
Requires-Dist: chime
Requires-Dist: cli-base-utilities>=0.27.1
Requires-Dist: evdev-binary
Requires-Dist: faster-whisper
Requires-Dist: rich
Requires-Dist: sounddevice
Requires-Dist: tyro
Description-Content-Type: text/markdown

# stt2desktop

[![tests](https://github.com/jedie/stt2desktop/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/jedie/stt2desktop/actions/workflows/tests.yml)
[![codecov](https://codecov.io/github/jedie/stt2desktop/branch/main/graph/badge.svg)](https://app.codecov.io/github/jedie/stt2desktop)
[![stt2desktop @ PyPi](https://img.shields.io/pypi/v/stt2desktop?label=stt2desktop%20%40%20PyPi)](https://pypi.org/project/stt2desktop/)
[![Python Versions](https://img.shields.io/pypi/pyversions/stt2desktop)](https://github.com/jedie/stt2desktop/blob/main/pyproject.toml)
[![License GPL-3.0-or-later](https://img.shields.io/pypi/l/stt2desktop)](https://github.com/jedie/stt2desktop/blob/main/LICENSE)

Local speech-to-text for desktop using [faster-whisper](https://github.com/SYSTRAN/faster-whisper).

Lets you dictate text into any application without sending audio to a cloud service.
Everything runs locally on your machine: no internet connection is required after the initial model download.

Currently only tested under Linux with KDE ;)

## How it works

1. Run `./cli.py listen` (the Whisper model is downloaded on first run and cached on disk)
2. Hold **Scroll Lock** to record from your microphone
3. Release **Scroll Lock** — the audio is transcribed locally by faster-whisper
4. The transcribed text is copied to the clipboard via `wl-copy` and pasted into the focused window via `ydotool key ctrl+v`

Tools used:

- [faster-whisper](https://github.com/SYSTRAN/faster-whisper) for local speech recognition
- [ydotool](https://github.com/ReimuNotMoe/ydotool) to simulate keyboard input (works on Wayland and X11)
- [wl-clipboard](https://github.com/bugaevc/wl-clipboard) (`wl-copy`) to paste text via clipboard — avoids keyboard layout issues
- [chime](https://github.com/MaxHalford/chime) to play notification sounds
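
The clipboard-based paste from step 4 can be sketched in Python roughly as follows. This is a minimal illustration, not the project's actual code: the function name `paste_via_clipboard` and the injectable `run` parameter are assumptions made for this example. Only the `wl-copy` and `ydotool key ctrl+v` commands are taken from the description above.

```python
import subprocess
from collections.abc import Callable


def paste_via_clipboard(text: str, run: Callable[..., object] = subprocess.run) -> None:
    """Copy `text` to the Wayland clipboard, then send Ctrl+V to the focused window.

    Hypothetical helper for illustration; `run` is injectable so the
    function can be exercised without wl-copy or ydotool installed.
    """
    # wl-copy reads the clipboard content from stdin when no text argument is given.
    run(['wl-copy'], input=text.encode('utf-8'), check=True)
    # Simulate the paste shortcut; this needs a running ydotoold daemon.
    run(['ydotool', 'key', 'ctrl+v'], check=True)
```

Going through the clipboard means the text never has to be mapped to individual key presses, which is where keyboard layout differences would otherwise produce wrong characters.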


## Prepare installation

Requirements: Python 3.12+, a working microphone, [wl-clipboard](https://github.com/bugaevc/wl-clipboard) and [ydotool and ydotoold](https://github.com/ReimuNotMoe/ydotool):

```bash
sudo apt install ydotool ydotoold wl-clipboard
sudo usermod -aG input $USER
echo 'KERNEL=="uinput", GROUP="input", MODE="0660"' | sudo tee /etc/udev/rules.d/60-uinput.rules
sudo udevadm control --reload-rules && sudo udevadm trigger
```

Then re-login (or run `newgrp input` in the current shell) so the group change takes effect.


## Install via pipx

You can install "stt2desktop" with [pipx](https://pipx.pypa.io/):

```bash
sudo apt install pipx
pipx install stt2desktop
```

Then run:

```bash
stt2desktop listen
```

The default global hotkey is **Scroll Lock** (labeled "Rollen" on German keyboards).
You can change it via the `--hotkey` option (see below).
Suggested alternative keys: `KEY_RIGHTCTRL`, `KEY_RIGHTALT`, `KEY_RIGHTMETA`, `KEY_RIGHTSHIFT` ;)

## CLI listen

[comment]: <> (✂✂✂ auto generated listen help start ✂✂✂)
```
usage: stt2desktop listen [-h] [LISTEN OPTIONS]

Start the STT listener. Hold the hotkey to record, release to transcribe and insert.

╭─ options ────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ -h, --help                show this help message and exit                                                            │
│ -v, --verbosity           Verbosity level; e.g.: -v, -vv, -vvv, etc. (repeatable)                                    │
│ --model {tiny_en,tiny,base_en,base,small_en,small,medium_en,medium,large_v1,large_v2,large_v3,large,distil_large_v2, │
│ distil_medium_en,distil_small_en,distil_large_v3,distil_large_v3_5,large_v3_turbo,turbo}                             │
│                           Whisper model to use for transcription. (default: small)                                   │
│ --hotkey STR              evdev key name to hold for recording. Release to transcribe and insert text. Examples:     │
│                           KEY_SCROLLLOCK, KEY_RIGHTCTRL, KEY_RIGHTALT. (default: KEY_SCROLLLOCK)                     │
│ --sample-rate INT         Audio sample rate in Hz. Whisper expects 16000. (default: 16000)                           │
│ --device STR              Device to run inference on, e.g. cpu or cuda. (default: auto)                              │
│ --compute-type STR        Quantization type, e.g. int8, float16, float32. (default: int8)                            │
│ --num-workers {None}|INT  Number of parallel transcription workers. Defaults to CPU count. (default: None)           │
│ --sounds, --no-sounds     Play notification sounds via chime. (default: True)                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```
[comment]: <> (✂✂✂ auto generated listen help end ✂✂✂)


### Whisper models

A selection of models with approximate values:

| Model    | Size    | Speed   | Accuracy         |
|----------|---------|---------|------------------|
| `tiny`   | ~75 MB  | fastest | lowest           |
| `base`   | ~145 MB | fast    | good             |
| `small`  | ~460 MB | slower  | better (default) |
| `medium` | ~1.5 GB | slow    | high             |

Larger models produce more accurate transcriptions but take longer to process ;)
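
For orientation, this is roughly how a model is loaded and used with the faster-whisper Python API. It is a hedged sketch and not necessarily how stt2desktop calls it: the helper names `join_segments` and `transcribe` are made up for this example, and the defaults simply mirror the CLI defaults listed above.

```python
from collections.abc import Iterable


def join_segments(segments: Iterable) -> str:
    """Join the text of transcribed segments into one string, skipping empty ones."""
    return ' '.join(text for segment in segments if (text := segment.text.strip()))


def transcribe(audio, model_name: str = 'small', device: str = 'auto', compute_type: str = 'int8') -> str:
    """Transcribe a file path or a 16 kHz float32 NumPy array (hypothetical helper)."""
    # Imported lazily so join_segments() works without faster-whisper installed.
    from faster_whisper import WhisperModel

    # The model is downloaded on first use and cached afterwards.
    model = WhisperModel(model_name, device=device, compute_type=compute_type)
    # `segments` is a lazy generator: the actual work happens while iterating.
    segments, _info = model.transcribe(audio)
    return join_segments(segments)
```

Loading the model is the expensive part, so a long-running listener would create the `WhisperModel` once and reuse it for every recording instead of rebuilding it per call as this short example does.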


## Troubleshooting

Use `pavucontrol` to check your audio setup and make sure the correct microphone is selected and working.

Test audio recording:

```bash
./cli.py test-recording
```
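
If you want to check the microphone level independently of the CLI, a short script along these lines can be used. It is a sketch under assumptions and unrelated to what `test-recording` does internally: `peak_dbfs` and `record_peak` are hypothetical helpers, and the recording uses the `sounddevice` package with the 16000 Hz sample rate that Whisper expects.

```python
import math


def peak_dbfs(peak: float) -> float:
    """Convert a peak amplitude (0.0 to 1.0 for float32 audio) to dBFS."""
    if peak <= 0:
        return -math.inf  # pure silence
    return 20 * math.log10(peak)


def record_peak(seconds: float = 3.0, sample_rate: int = 16000) -> float:
    """Record from the default microphone and return the peak level in dBFS."""
    # Imported lazily so peak_dbfs() works without sounddevice installed.
    import sounddevice as sd

    frames = int(seconds * sample_rate)
    recording = sd.rec(frames, samplerate=sample_rate, channels=1, dtype='float32')
    sd.wait()  # block until the recording has finished
    return peak_dbfs(float(abs(recording).max()))
```

A result of `-inf` or a very low value while you are speaking hints at a muted or wrongly selected input device.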

Some terminal commands to check your audio setup:

```bash
# List capture devices in PulseAudio sound server:
pactl list sources short

# Check current volume:
pactl list sources | grep -A1 "Name: .*input\|Volume:"

# Display the current state in PipeWire:
wpctl status
```
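
The source list can also be read from Python, for example to skip the monitor sources of output devices. This is only a sketch: `parse_sources` and `list_microphones` are hypothetical helpers, and the field names assume the usual tab-separated short format (index, name, driver, sample spec, state), which may differ between versions.

```python
import subprocess


def parse_sources(output: str) -> list[dict[str, str]]:
    """Parse the tab-separated output of `pactl list sources short`."""
    # Assumed column order of the short format.
    fields = ('index', 'name', 'driver', 'sample_spec', 'state')
    sources = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        sources.append(dict(zip(fields, line.split('\t'))))
    return sources


def list_microphones() -> list[dict[str, str]]:
    """Return capture sources, skipping '.monitor' sources of output devices."""
    result = subprocess.run(
        ['pactl', 'list', 'sources', 'short'],
        capture_output=True, text=True, check=True,
    )
    return [s for s in parse_sources(result.stdout) if not s.get('name', '').endswith('.monitor')]
```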

Set up loopback mode to hear yourself:

```bash
# Start:
pactl load-module module-loopback
# Undo:
pactl unload-module module-loopback
```



## Start development

At least `uv` is needed. Install it, e.g. via pipx:
```bash
sudo apt-get install pipx
pipx install uv
```

Clone the project and run the CLI help commands.
A virtual environment will be created/updated automatically.

```bash
~$ git clone https://github.com/jedie/stt2desktop.git
~$ cd stt2desktop
~/stt2desktop$ ./cli.py --help
~/stt2desktop$ ./dev-cli.py --help
```

[comment]: <> (✂✂✂ auto generated dev help start ✂✂✂)
```
usage: ./dev-cli.py [-h] {coverage,install,lint,mypy,nox,pip-audit,publish,test,update,update-readme-history,update-test-snapshot-files,version}



╭─ options ────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ -h, --help    show this help message and exit                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ subcommands ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ (required)                                                                                                           │
│   • coverage  Run tests and show coverage report.                                                                    │
│   • install   Install requirements and 'stt2desktop' via pip as editable.                                            │
│   • lint      Check/fix code style by run: "ruff check --fix"                                                        │
│   • mypy      Run Mypy (configured in pyproject.toml)                                                                │
│   • nox       Run nox                                                                                                │
│   • pip-audit                                                                                                        │
│               Run pip-audit check against current requirements files                                                 │
│   • publish   Build and upload this project to PyPi                                                                  │
│   • test      Run unittests                                                                                          │
│   • update    Update dependencies (uv.lock) and git pre-commit hooks                                                 │
│   • update-readme-history                                                                                            │
│               Update project history base on git commits/tags in README.md Will be exited with 1 if the README.md    │
│               was updated otherwise with 0.                                                                          │
│                                                                                                                      │
│               Also, callable via e.g.:                                                                               │
│                   python -m cli_base update-readme-history -v                                                        │
│   • update-test-snapshot-files                                                                                       │
│               Update all test snapshot files (by remove and recreate all snapshot files)                             │
│   • version   Print version and exit                                                                                 │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```
[comment]: <> (✂✂✂ auto generated dev help end ✂✂✂)


## History

[comment]: <> (✂✂✂ auto generated history start ✂✂✂)

* [v0.2.0](https://github.com/jedie/stt2desktop/compare/v0.1.2...v0.2.0)
  * 2026-04-22 - paste text via clipboard to avoid keyboard layout issues
  * 2026-04-16 - Add test commands and migrate to ydotool
* [v0.1.2](https://github.com/jedie/stt2desktop/compare/v0.1.1...v0.1.2)
  * 2026-03-30 - print warning when not running on Linux
  * 2026-03-30 - Update requirements
  * 2026-03-27 - Update README
* [v0.1.1](https://github.com/jedie/stt2desktop/compare/v0.1.0...v0.1.1)
  * 2026-03-27 - +Proposal for alternative hotkey
  * 2026-03-27 - fix color outputs
  * 2026-03-27 - Update requirements
  * 2026-03-27 - add missing license file.
* [v0.1.0](https://github.com/jedie/stt2desktop/compare/v0.0.1...v0.1.0)
  * 2026-03-27 - Use chime to play notification sounds
  * 2026-03-27 - Try to fix github CI run
  * 2026-03-27 - Cleanup README
  * 2026-03-27 - pipx usage

<details><summary>Expand older history entries ...</summary>

* [v0.0.1](https://github.com/jedie/stt2desktop/compare/b407f8f...v0.0.1)
  * 2026-03-26 - Add POC
  * 2026-03-26 - init

</details>


[comment]: <> (✂✂✂ auto generated history end ✂✂✂)
