Metadata-Version: 2.4
Name: bee-recorder
Version: 0.1.0b3
Summary: Local-first meeting recorder with AI transcription and speaker diarization
Author: Paulo Henrique
License: MIT
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: End Users/Desktop
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Multimedia :: Sound/Audio
Classifier: Topic :: Multimedia :: Sound/Audio :: Capture/Recording
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: click>=8.1
Requires-Dist: faster-whisper>=1.0
Requires-Dist: speechbrain>=1.0
Requires-Dist: rich>=13.0
Requires-Dist: soundfile>=0.12
Requires-Dist: numpy>=1.24
Requires-Dist: scikit-learn>=1.3
Requires-Dist: torch>=2.0
Requires-Dist: torchaudio>=2.0
Requires-Dist: huggingface-hub>=0.20
Requires-Dist: tomli>=2.0; python_version < "3.11"
Provides-Extra: summary
Requires-Dist: anthropic>=0.20; extra == "summary"

# Bee Recorder

Local-first meeting recorder for Linux. Captures system audio (Google Meet, Teams, Zoom, Discord, etc.) and your microphone in parallel, then transcribes everything with Whisper and identifies who said what with SpeechBrain — all on your machine. No audio leaves your computer, no API keys, no HuggingFace token.

The product is the **transcript** itself: a Markdown file you can read and a JSON file you can feed to any LLM later for analysis.

## Features

- **Fully local** — Whisper for transcription, SpeechBrain ECAPA-TDNN for speaker diarization. Nothing uploaded.
- **Zero-friction setup** — `bee setup` detects your hardware (RAM, GPU) and picks a Whisper model that will run well, then downloads it (~1.6 GB).
- **Channel-aware diarization** — microphone audio is diarized into `Mic_00`, `Mic_01`, ... (works for in-person meetings where multiple people share one mic), and remote participants from system audio are diarized into `Speaker_00`, `Speaker_01`, ...
- **Interactive labeling** — after processing, Bee shows representative quotes per speaker so you can match `Mic_00`/`Speaker_00` to real names.
- **Two outputs per recording** — `<id>.md` (human-readable) and `<id>.json` (structured for automation/LLM consumption).
- **Auto language detection** — works out of the box in any language Whisper supports.

## Requirements

- Linux with PulseAudio or PipeWire (Ubuntu, Fedora, Debian, Arch, Mint, Pop!_OS, etc.)
- Python 3.10+
- `ffmpeg` and `pulseaudio-utils` (for `pactl`)

```bash
# Ubuntu / Debian
sudo apt install ffmpeg pulseaudio-utils python3 python3-venv

# Fedora
sudo dnf install ffmpeg-free pulseaudio-utils python3

# Arch
sudo pacman -S ffmpeg libpulse python
```

macOS and Windows are not supported today (the audio capture layer relies on PulseAudio/PipeWire).

## Install

```bash
pipx install bee-recorder
bee setup       # detects hardware, downloads Whisper + SpeechBrain models
bee doctor      # verifies ffmpeg, audio server, models, config
```

Or with plain `pip` inside a venv:

```bash
python3 -m venv ~/.venvs/bee
~/.venvs/bee/bin/pip install bee-recorder
~/.venvs/bee/bin/bee setup
```

## Recording a meeting

### 1. Start the recording before joining

```bash
bee start -n weekly-2026-05-08
```

The name is optional — without `-n`, Bee uses a timestamp ID. Add `--no-mic` if you want to capture only the system audio.

### 2. Join the meeting normally

Bee runs in the background and captures whatever your audio output and microphone produce. It auto-detects audio sinks that appear during the call (for example, when you connect AirPods mid-meeting), so you don't need to restart anything.
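How Bee detects new sinks internally is not documented here. As a rough sketch of the general idea, the snippet below compares two snapshots of `pactl list short sinks` output (tab-separated: index, name, driver, sample spec, state) and reports the sinks that appeared in between. The function names and sample sink names are illustrative assumptions, not part of Bee.

```python
def parse_sink_names(pactl_output: str) -> set[str]:
    """Extract sink names from `pactl list short sinks` output.

    Each line is tab-separated: index, name, driver, sample spec, state.
    """
    names = set()
    for line in pactl_output.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1]:
            names.add(fields[1])
    return names


def new_sinks(before: str, after: str) -> set[str]:
    """Return sinks present in the `after` snapshot but not in `before`."""
    return parse_sink_names(after) - parse_sink_names(before)


# Two made-up snapshots, as if taken a few seconds apart.
before = (
    "0\talsa_output.pci-0000_00_1f.3.analog-stereo\tmodule-alsa-card.c"
    "\ts16le 2ch 44100Hz\tRUNNING\n"
)
after = before + (
    "5\tbluez_output.AA_BB_CC_DD_EE_FF.1\tmodule-bluez5-device.c"
    "\ts16le 2ch 48000Hz\tIDLE\n"
)
print(new_sinks(before, after))
```

On a real system, each snapshot could come from `subprocess.run(["pactl", "list", "short", "sinks"], capture_output=True, text=True, check=True).stdout`.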

### 3. Stop and process

```bash
bee stop
```

This stops the capture and runs:

1. Whisper transcription (mic + system audio)
2. SpeechBrain diarization on both audio streams (mic + system)
3. Merging — mic speakers become `Mic_XX`, remote participants become `Speaker_XX`
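The merging step can be pictured with a small sketch. It assumes each channel has already been transcribed and diarized into segments carrying a per-channel speaker index; the `Segment` class and `merge_channels` function are hypothetical names for illustration, not Bee's internal API.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float   # seconds from the start of the recording
    end: float
    speaker: int   # cluster index assigned by diarization, per channel
    text: str


def merge_channels(mic, system):
    """Interleave both channels by start time, prefixing labels by channel."""
    labeled = [(s.start, s.end, f"Mic_{s.speaker:02d}", s.text) for s in mic]
    labeled += [(s.start, s.end, f"Speaker_{s.speaker:02d}", s.text) for s in system]
    return sorted(labeled, key=lambda row: row[0])


# Made-up segments: two people sharing the mic, one remote participant.
mic = [Segment(0.0, 2.1, 0, "Shall we start?"), Segment(6.0, 7.5, 1, "Sounds good.")]
system = [Segment(2.4, 5.8, 0, "Yes, I can hear you both.")]

for start, end, label, text in merge_channels(mic, system):
    print(f"[{start:6.1f}s] {label}: {text}")
```

Keeping the two channels in separate label namespaces means index 0 on the mic and index 0 on system audio never collide.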

Processing time depends on the meeting length and your hardware. On a CPU with the `medium` model, expect roughly 0.5–1× real time (a 30-minute meeting takes 15–30 minutes to process). A GPU is much faster.
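The CPU estimate is simple arithmetic: meeting length multiplied by the real-time factor. A minimal sketch, where the function name is an assumption and the 0.5 and 1.0 factors are the rough figures quoted above rather than measured benchmarks:

```python
def estimate_cpu_minutes(meeting_minutes: float, low: float = 0.5, high: float = 1.0):
    """Return (best, worst) processing time in minutes for a real-time factor range."""
    return meeting_minutes * low, meeting_minutes * high


best, worst = estimate_cpu_minutes(30)
print(f"30-minute meeting: {best:.0f}-{worst:.0f} minutes")  # 30-minute meeting: 15-30 minutes
```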

If you want to stop now and process later:

```bash
bee stop --no-process
bee process weekly-2026-05-08    # transcribe + diarize on demand
```

### 4. Label the participants

After processing, Bee prints a table of representative quotes per speaker. To re-print it any time:

```bash
bee speakers weekly-2026-05-08
```

Then assign real names:

```bash
bee label weekly-2026-05-08 Mic_00="Alice" Speaker_00="Bob (Acme Corp)" Speaker_01="Carol (Acme Corp)"
```

The `label` command rewrites both the JSON and Markdown transcripts in place.
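Conceptually, labeling is a rename applied to every segment. The sketch below parses `ID=Name` arguments (the shell has already stripped the quotes by the time they reach the program) and applies them to a list of segments. The `speaker` and `text` keys and both function names are assumptions for illustration; Bee's real transcript schema may differ.

```python
def parse_labels(args):
    """Turn ['Mic_00=Alice', 'Speaker_00=Bob (Acme Corp)'] into a dict."""
    mapping = {}
    for arg in args:
        key, sep, name = arg.partition("=")
        if not sep or not key or not name:
            raise ValueError(f"expected ID=Name, got {arg!r}")
        mapping[key] = name
    return mapping


def apply_labels(segments, mapping):
    """Return new segments with speaker IDs replaced; unmapped IDs are kept."""
    return [
        {**seg, "speaker": mapping.get(seg["speaker"], seg["speaker"])}
        for seg in segments
    ]


# Made-up segments in a hypothetical schema.
segments = [
    {"speaker": "Mic_00", "text": "Shall we start?"},
    {"speaker": "Speaker_00", "text": "Yes, go ahead."},
    {"speaker": "Speaker_01", "text": "One moment."},
]
mapping = parse_labels(["Mic_00=Alice", "Speaker_00=Bob (Acme Corp)"])
relabeled = apply_labels(segments, mapping)
print([seg["speaker"] for seg in relabeled])
```

Speakers you leave out of the mapping keep their generated ID, so you can label a transcript incrementally.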

### 5. Read or share the transcript

```bash
bee show weekly-2026-05-08       # prints metadata + transcript
bee show weekly-2026-05-08 -t    # transcript only
bee list                          # all recordings
```

## Output files

Transcripts are written to `~/Documents/bee/transcripts/` by default (configurable):

| File | Contents |
|------|----------|
| `<id>.md` | Human-readable transcript with timestamps and speaker labels |
| `<id>.json` | Structured transcript with metadata header (duration, language, speakers, segments) for automation and LLM consumption |

Raw audio (`mic.wav`, `system.wav`) and a `metadata.json` are kept under `~/Documents/bee/recordings/<id>/` so you can re-process a recording any time with `bee process <id>`.
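As an example of consuming the JSON output from a script, the sketch below totals speaking time per participant. The exact schema is not specified here, so the field names (`segments`, `start`, `end`, `speaker`, `text`) are assumptions based on the table above; check a real `<id>.json` before relying on them.

```python
import json
from collections import defaultdict

# Hypothetical transcript; the real field names may differ.
raw = """
{
  "duration": 12.0,
  "language": "en",
  "speakers": ["Alice", "Speaker_00"],
  "segments": [
    {"start": 0.0, "end": 4.0, "speaker": "Alice", "text": "Shall we start?"},
    {"start": 4.0, "end": 9.5, "speaker": "Speaker_00", "text": "Yes, go ahead."},
    {"start": 9.5, "end": 12.0, "speaker": "Alice", "text": "Great."}
  ]
}
"""


def talk_time(transcript):
    """Sum seconds of speech per speaker label."""
    totals = defaultdict(float)
    for seg in transcript["segments"]:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


transcript = json.loads(raw)
print(talk_time(transcript))
```

For a real recording, replace `raw` with the contents of the transcript file, for example `Path("~/Documents/bee/transcripts/<id>.json").expanduser().read_text()`.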

## Optional: AI summary

Bee does **not** generate summaries by default — the transcript is the deliverable. If you want a quick summary, the `summarize` command shells out to the [Claude Code CLI](https://docs.claude.com/en/docs/claude-code) (must be installed and authenticated separately):

```bash
bee summarize ~/Documents/bee/transcripts/weekly-2026-05-08.md
```

This produces `weekly-2026-05-08_summary.md` next to the transcript.

For richer analyses (action items, sentiment, decisions), feed the JSON transcript into your LLM of choice. The structured format is designed for that.

## Command reference

| Command | What it does |
|---------|--------------|
| `bee start -n <name>` | Start recording |
| `bee start --no-mic` | Capture system audio only |
| `bee stop` | Stop and process (transcribe + diarize + ask to label) |
| `bee stop --no-process` | Just stop the recording |
| `bee stop --no-diarize` | Skip diarization (faster, no speaker labels) |
| `bee stop --no-identify` | Don't prompt for participant names interactively |
| `bee process <id>` | (Re)process a stopped recording |
| `bee speakers <id>` | Show preview quotes per detected speaker |
| `bee label <id> Speaker_00=Name ...` | Rename speakers in a transcript |
| `bee show <id>` | Print transcript and metadata |
| `bee list` | List all recordings |
| `bee summarize <file>` | Generate a Claude summary (requires the Claude Code CLI) |
| `bee watch` | Watch the transcripts folder and auto-summarize new files |
| `bee setup` | First-time setup: hardware detection + model download |
| `bee doctor` | Health check (ffmpeg, audio server, models, config) |
| `bee models list` | Show available and installed models |
| `bee models download <name>` | Download a specific Whisper or diarization model |
| `bee models usage` | Show disk usage of downloaded models |

## Configuration

Bee stores its config at `~/.config/bee/config.toml`. Defaults are written by `bee setup`:

```toml
[whisper]
model = "medium"        # tiny, base, small, medium, large-v3
device = "auto"         # auto, cpu, cuda
language = "auto"       # auto-detect, or "pt", "en", "es", ...

[audio]
sample_rate = 16000
channels = 1
```

To change the model later, download it (`bee models download large-v3`) and update `model` in the config. Setting `language` to a fixed code such as `"pt"` skips detection. Run `bee doctor` after editing.
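If you want to read the same config from your own scripts, the following sketch parses TOML and overlays it on the defaults shown above. It uses `tomllib` on Python 3.11+ and falls back to the `tomli` backport on 3.10, which matches the package's declared dependencies. The `load_config` helper and the `DEFAULTS` dict are illustrative assumptions, not Bee's actual loader.

```python
import sys

if sys.version_info >= (3, 11):
    import tomllib
else:
    import tomli as tomllib  # backport, needed only on Python 3.10

# Defaults copied from the example config above.
DEFAULTS = {
    "whisper": {"model": "medium", "device": "auto", "language": "auto"},
    "audio": {"sample_rate": 16000, "channels": 1},
}


def load_config(text: str) -> dict:
    """Parse TOML text and overlay it on the defaults, section by section."""
    user = tomllib.loads(text)
    return {
        section: {**values, **user.get(section, {})}
        for section, values in DEFAULTS.items()
    }


config = load_config('[whisper]\nmodel = "large-v3"\nlanguage = "pt"\n')
print(config["whisper"])  # {'model': 'large-v3', 'device': 'auto', 'language': 'pt'}
```

To load the real file, pass `Path("~/.config/bee/config.toml").expanduser().read_text()` as `text`.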

## Privacy

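Recording, transcription, and diarization all run locally. Audio files, transcripts, and models live on your disk. The only network traffic in that workflow is the model download from HuggingFace (anonymous, no token required) when you run `bee setup` or `bee models download`. The optional summary commands (`bee summarize`, `bee watch`) are the exception: they hand the transcript to the Claude Code CLI, so its contents leave your machine.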

## License

MIT
