Metadata-Version: 2.4
Name: fala
Version: 0.0.1
Summary: Voice dictation for Ubuntu/Wayland — hotkey-driven speech-to-text that types anywhere
Author: Ivar Soares Urdalen
Author-email: Ivar Soares Urdalen <ivar.urdalen@gmail.com>
License-Expression: MIT
License-File: LICENSE
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Intended Audience :: Developers
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Operating System :: POSIX :: Linux
Classifier: Environment :: Console
Requires-Dist: assemblyai
Requires-Dist: peewee
Requires-Dist: questionary
Requires-Dist: openai
Requires-Dist: rich
Requires-Dist: secretstorage
Requires-Dist: typer
Requires-Dist: websockets>=14.0
Requires-Dist: pyaudio>=0.2.11
Requires-Dist: numpy>=1.24.0
Requires-Python: >=3.11
Project-URL: Source, https://github.com/ivarurdalen/fala
Project-URL: Documentation, https://ivarurdalen.github.io/fala
Project-URL: Changelog, https://github.com/ivarurdalen/fala/blob/main/CHANGELOG.md
Description-Content-Type: text/markdown

# fala

*fala* means "speak" in Portuguese.

Voice dictation for Ubuntu/Wayland. Press a hotkey to start recording, press it again to stop — your speech is transcribed via the OpenAI API and typed wherever your cursor is.

Works system-wide: terminals, browsers, text editors, anything.

See [docs/architecture.md](docs/architecture.md) for a full diagram of how the components connect.

## How it works

1. Press your hotkey → recording starts (`parecord`)
2. Speak
3. Press hotkey again → recording stops, audio is sent to OpenAI for transcription, text is typed at the cursor via `ydotool`
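
The toggle flow above can be sketched in Python (an illustrative outline, not fala's actual implementation; `parecord` and `ydotool` are the real external tools, while the function names and the `transcribe` callback are hypothetical):

```python
import subprocess
import tempfile

def record_cmd(wav_path: str) -> list[str]:
    # parecord (from pulseaudio-utils) captures the default audio source
    return ["parecord", "--format=s16le", "--rate=16000", wav_path]

def type_cmd(text: str) -> list[str]:
    # ydotool types into whatever window currently has keyboard focus
    return ["ydotool", "type", text]

def dictate_once(transcribe) -> None:
    """One hotkey cycle: start recording, stop, transcribe, type."""
    wav = tempfile.NamedTemporaryFile(suffix=".wav", delete=False).name
    rec = subprocess.Popen(record_cmd(wav))     # first hotkey press
    input("Recording... press Enter to stop ")  # stands in for the second press
    rec.terminate()
    rec.wait()
    text = transcribe(wav)                      # e.g. a call to the OpenAI API
    subprocess.run(type_cmd(text), check=True)
```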

## System requirements

- Python 3.11+
- [uv](https://docs.astral.sh/uv/)

```bash
sudo apt install -y ffmpeg portaudio19-dev pulseaudio-utils
```

`ydotool` needs access to the input subsystem. Add your user to the `input` group and create the udev rule:

```bash
sudo usermod -aG input $USER
echo 'KERNEL=="uinput", MODE="0660", GROUP="input"' \
    | sudo tee /etc/udev/rules.d/60-uinput.rules
sudo udevadm control --reload-rules && sudo udevadm trigger
```

Then log out and back in for the group change to take effect.
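
Once you are back in, the rule's effect can be sanity-checked from Python (a hypothetical helper, not part of fala):

```python
import stat

def udev_mode_ok(mode: int) -> bool:
    """True if the device mode grants group read/write, matching the
    MODE="0660" set by the udev rule above."""
    return bool(mode & stat.S_IRGRP) and bool(mode & stat.S_IWGRP)

# On a live system, verify with:
#   import os; udev_mode_ok(os.stat("/dev/uinput").st_mode)
```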

### ydotool daemon

The `ydotool` package in Ubuntu's apt repositories predates the `ydotoold` daemon. Build and install the current version from source:

```bash
bash scripts/install_ydotool.sh
```

This clones, builds, installs, and starts the daemon as a systemd user service. Safe to re-run.

## Installation

Build and install `fala` as a global CLI tool using a wheel:

```bash
uv build
uv tool install --force dist/*.whl
```

This makes `fala` available in your PATH everywhere, no virtualenv activation needed.

To update after pulling changes:

```bash
uv build
uv tool install --force dist/*.whl
```

## API key setup

Store your OpenAI API key in the GNOME Keyring (encrypted at rest, unlocked automatically at login):

```bash
fala setup
```

You will be prompted to enter your key. It is stored securely and never written to disk in plaintext.

Alternatively, set it as an environment variable (e.g. for CI or headless use):

```bash
export OPENAI_API_KEY="sk-..."
```

The env var takes precedence over the keyring if both are set.
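
That precedence can be expressed as a small resolver (a sketch of the lookup order only; fala's internal function names may differ):

```python
from typing import Callable, Optional

def resolve_api_key(env: dict,
                    keyring_lookup: Callable[[str], Optional[str]],
                    name: str = "OPENAI_API_KEY") -> Optional[str]:
    """Return the key from the environment if set, else fall back to the keyring."""
    return env.get(name) or keyring_lookup(name)

# In practice: resolve_api_key(os.environ, my_keyring_get)
```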

## Hotkey setup

Install/update helper scripts directly:

```bash
fala install-scripts
```

Or via guided setup:

```bash
fala setup
```

Register a GNOME keyboard shortcut. The script will ask you to physically press the key combination you want:

```bash
bash scripts/setup_keybinding.sh
```

To remove the shortcut:

```bash
bash scripts/setup_keybinding.sh --revert
```

`fala setup` is idempotent and safe to re-run.

## CLI usage

Record from the microphone (press Enter to start, then Enter again to stop):

```bash
fala record
```

Transcribe an existing audio file:

```bash
fala transcribe path/to/audio.wav
```

Live streaming transcription to terminal:

```bash
fala stream
```

By default, `fala stream` prints transcript deltas directly in the terminal and redirects noisy stderr output (for example ALSA/JACK warnings) to `~/.config/fala/stream.log`.
Use `--type-into-active-window` to also inject text into the focused GUI input via `ydotool`.
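
The stderr redirection works at the file-descriptor level, because ALSA/JACK warnings come from C libraries that bypass Python's `sys.stderr`. A minimal sketch of the technique (a hypothetical helper, not fala's exact code):

```python
import os
import sys
from pathlib import Path

def redirect_stderr_to_log(log_path: Path) -> int:
    """Point fd 2 at a log file so C-level warnings land there.
    Returns a duplicate of the original stderr fd for restoring later."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    saved = os.dup(sys.stderr.fileno())
    log_fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    os.dup2(log_fd, sys.stderr.fileno())
    os.close(log_fd)
    return saved
```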

Record a high-clarity test fixture and generate a pytest integration test with the expected transcript baked in:

```bash
fala create-test-fixture clear_en_sentence --text "The quick brown fox jumps over the lazy dog."
```

Run integration tests (including realtime streaming integration with local WAV fixture):

```bash
just test-integration
```

Manual equivalent for all integration tests:

```bash
RUN_STREAMING_INTEGRATION=1 uv run pytest -m integration
```

Notes:
- The integration test uses `tests/fixtures/clear_en_sentence.wav`.
- It requires provider API keys to be available (`OPENAI_API_KEY` and/or `ASSEMBLYAI_API_KEY`); the `just` command loads both from keyring automatically if stored via `fala setup`.

Show usage analytics from your transcription log:

```bash
fala stats
fala stats --weeks 16
fala stats --typing-wpm 45 --speaking-wpm 130
```

`fala stats` prints rich tables with weekly usage, average characters per transcription, provider mix, streaks, and activity trends. It also estimates time saved by speaking vs typing.
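
The time-saved estimate reduces to simple arithmetic over your WPM settings. A sketch (assuming a straightforward formula; fala's exact accounting may differ):

```python
def minutes_saved(characters: int,
                  typing_wpm: float = 40.0,
                  speaking_wpm: float = 120.0,
                  chars_per_word: float = 5.0) -> float:
    """Minutes saved by speaking instead of typing the same text."""
    words = characters / chars_per_word
    return words / typing_wpm - words / speaking_wpm
```

For example, 1000 characters at these defaults is 200 words: five minutes to type versus about 1.7 minutes to speak, roughly 3.3 minutes saved.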

Set defaults once (persisted in `~/.config/fala/config.toml`):

```bash
fala config --typing-wpm 40 --speaking-wpm 120 --avg-characters-per-word 5
fala config --assemblyai-model u3-rt-pro
```

These defaults are used by `fala stats` unless you override them with command flags.
