Metadata-Version: 2.4
Name: anyscribecli
Version: 0.8.2
Summary: CLI tool to download, transcribe, and convert video/audio to structured markdown
Project-URL: Repository, https://github.com/rishmadaan/anyscribecli
Project-URL: Issues, https://github.com/rishmadaan/anyscribecli/issues
Project-URL: Documentation, https://github.com/rishmadaan/anyscribecli/tree/main/docs
Author: Rish Madaan
License-Expression: MIT
License-File: LICENSE
Keywords: cli,markdown,obsidian,transcription,whisper,youtube
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: End Users/Desktop
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Text Processing :: Markup :: Markdown
Requires-Python: >=3.10
Requires-Dist: beaupy>=3.0
Requires-Dist: fastapi>=0.110
Requires-Dist: httpx>=0.27
Requires-Dist: instaloader>=4.10
Requires-Dist: python-dotenv>=1.0
Requires-Dist: python-multipart>=0.0.6
Requires-Dist: pyyaml>=6.0
Requires-Dist: rich>=13.0
Requires-Dist: typer[all]>=0.9
Requires-Dist: uvicorn[standard]>=0.27
Requires-Dist: websockets>=12.0
Requires-Dist: yt-dlp>=2024.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Provides-Extra: local
Requires-Dist: faster-whisper>=1.0; extra == 'local'
Provides-Extra: mcp
Requires-Dist: mcp>=1.0; extra == 'mcp'
Description-Content-Type: text/markdown

# anyscribecli

**Download. Transcribe. Markdown.** Turns YouTube videos, Instagram reels, and local audio/video files into structured, searchable markdown — browsable in Obsidian.

Three equivalent surfaces, pick whichever fits:

- **Web UI** (`scribe ui`) — click-through, for humans who prefer browsers.
- **Terminal wizard** (`scribe onboard`, `scribe "<url>"`) — arrow-key prompts, for humans who live in terminals.
- **Headless flags** (`scribe onboard --provider X --yes --json`, `scribe "<url>" --json`) — flag-driven, structured output, for AI agents / CI / scripts.

None of the three is "primary"; all three cover the full product. Human users reach the full product through Web UI or terminal without ever needing the other. Agents reach it through flags alone. Shared backend, shared state — a transcription started from any surface is visible to all of them.

### Private by default, local-first

- **The Web UI is a local server, not a cloud service.** It runs at `127.0.0.1:8457` on your own machine. No account, no sign-up, no telemetry. There is no "anyscribe.com" backend.
- **Internet is only involved when *you* ask for it.** Three cases, all transparent:
  - **Downloading a YouTube or Instagram source** — obvious; you gave it the URL.
  - **Calling an API provider** (OpenAI, Deepgram, ElevenLabs, Sarvam, OpenRouter) — your audio goes to the provider you picked, and nothing else. Your data stays between you and them.
  - **Pulling a Whisper model** (one-time, only if you enable local transcription) — weights download from Hugging Face.
- **Fully offline is available.** Local files + the local provider (`scribe local setup --model base`) = zero network traffic. Your audio never leaves your machine. Same pipeline, same output format as the cloud providers.
- No analytics, no phone-home. `scribe update --check` reaches PyPI to compare versions, but only when you run it.

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![PyPI](https://img.shields.io/pypi/v/anyscribecli.svg)](https://pypi.org/project/anyscribecli/)
[![Platforms: macOS, Linux, Windows](https://img.shields.io/badge/platform-macOS%20%7C%20Linux%20%7C%20Windows-informational.svg)](https://pypi.org/project/anyscribecli/)

---

## What it does

```
URL or local file → Download/convert audio → Transcribe → Formatted Markdown → Obsidian Vault
```

- **6 transcription providers** — OpenAI Whisper, Deepgram Nova, ElevenLabs, OpenRouter, Sarvam AI, Local (offline)
- **Speaker diarization** — `--diarize` flag for multi-speaker transcripts (meetings, interviews, podcasts)
- **3 input sources** — YouTube, Instagram (reels + posts), local files (mp3, mp4, m4a, wav, opus, ogg, flac, webm)
- **Obsidian-native output** — YAML frontmatter, word count, reading time, tags
- **Master index + daily logs** — browse everything in Obsidian
- **Download-only mode** — grab video or audio without transcribing
- **Batch processing** — transcribe a list of URLs from a file
- **Web UI** — `scribe ui` launches a local dashboard (transcribe, browse history, manage settings, first-run onboarding wizard) at `127.0.0.1:8457` — served from your own machine, no cloud backend
- **Local-first, no account** — no sign-up, no telemetry, no SaaS layer; fully offline with the local provider + local files
- **Agent-friendly CLI** — `--json` output, structured exit codes, `--yes` for non-interactive runs on every consequential command; no silent defaults for choices an agent might make on the user's behalf
- **Three-surface onboarding parity** — wizard modal in the Web UI, interactive prompts in `scribe onboard`, flag-driven in `scribe onboard --yes ...`; all three write the same config

## Quick Start

### Install

**macOS / Linux** (one command — installs Python, ffmpeg, and scribe):
```bash
curl -fsSL https://raw.githubusercontent.com/rishmadaan/anyscribecli/main/install.sh | bash
```

**Windows** (PowerShell — installs Python, ffmpeg, and scribe):
```powershell
irm https://raw.githubusercontent.com/rishmadaan/anyscribecli/main/install.ps1 | iex
```

**Or install manually:**
```bash
pip install anyscribecli
```

### Get started

```bash
scribe ui    # opens web dashboard — guides you through setup
```

On first launch the web UI opens a full-screen **onboarding wizard** — pick a provider, paste the API key (with a live Test button), optionally enable offline transcription, confirm your workspace, done.

> **Windows:** If `scribe` isn't recognized, use `python -m anyscribecli ui`
>
> **Alternative paths** (same end state):
> - `scribe onboard` — interactive terminal wizard.
> - `scribe onboard --provider openai --api-key "$OPENAI_API_KEY" --yes --json` — headless mode for agents, CI, and scripts.

### Transcribe

```bash
# From a URL
scribe "https://www.youtube.com/watch?v=VIDEO_ID"

# From a local file
scribe /path/to/podcast.mp3
```

> **Always wrap URLs in quotes** — shells like zsh break URLs with `?` and `&`.

### Download (no transcription)

```bash
scribe download "https://www.youtube.com/watch?v=VIDEO_ID"            # video
scribe download "https://www.youtube.com/watch?v=VIDEO_ID" --audio-only  # audio
```

## Commands

| Command | Description |
|---------|-------------|
| `scribe onboard` | Interactive setup wizard (TUI) |
| `scribe onboard --provider X --api-key $KEY --yes` | Headless setup for agents / CI |
| `scribe transcribe "<url or file>"` | Transcribe a video or local file to markdown |
| `scribe download "<url>"` | Download video or audio only |
| `scribe batch <file>` | Batch transcribe URLs or file paths from a file |
| `scribe config show/set/path` | View and change settings |
| `scribe providers list/test` | Manage transcription providers |
| `scribe local setup --model <size>` | Install faster-whisper + download a Whisper model |
| `scribe local status` / `scribe local teardown` | Report / remove offline transcription |
| `scribe model list / pull / rm / reinstall / info` | Manage cached Whisper model weights |
| `scribe ui` | Launch the web UI in your browser |
| `scribe install-skill` | Install Claude Code skill |
| `scribe update` | Update to the latest version |
| `scribe doctor` | Check system health |

### Transcribe options

```bash
scribe transcribe "<url>"
  --provider, -p <name>    # Override provider (openai, deepgram, elevenlabs, local, etc.)
  --language, -l <code>    # Language code (default: auto-detect)
  --diarize, -d            # Enable speaker diarization (multi-speaker transcripts)
  --json, -j               # JSON output for scripting/AI agents
  --keep-media             # Keep the downloaded audio file
  --clipboard, -c          # Read URL from clipboard
  --quiet, -q              # Suppress progress output
```

Provide a URL, file path, or use interactive mode:
```bash
scribe transcribe "https://..."     # quoted URL (primary)
scribe transcribe /path/to/file.mp3 # local audio/video file
scribe transcribe                    # interactive prompt (no quoting needed)
scribe transcribe --clipboard        # read URL from system clipboard
```

### Download options

```bash
scribe download "<url>"
  --video / --audio-only     # Video (default) or audio only
  --json, -j                 # JSON output
  --quiet, -q                # Suppress progress
  --clipboard, -c            # Read URL from clipboard
```

### Batch options

```bash
scribe batch <file>
  --provider, -p <name>      # Override provider
  --language, -l <code>      # Override language
  --json, -j                 # JSON output
  --keep-media               # Keep audio files
  --quiet, -q                # Suppress progress
  --stop-on-error            # Stop at first failure
```

### JSON output

```bash
scribe transcribe "https://youtube.com/watch?v=abc123" --json
```

```json
{
  "success": true,
  "file": "~/anyscribe/sources/youtube/video-title.md",
  "title": "Video Title",
  "platform": "youtube",
  "duration": "12:34",
  "language": "en",
  "word_count": 1500,
  "provider": "openai"
}
```

## Prerequisites

The onboarding wizard checks for these and offers to install them:

| Dependency | Required | Install |
|------------|----------|---------|
| Python 3.10+ | Yes | [python.org](https://www.python.org/downloads/) |
| yt-dlp | Yes | `brew install yt-dlp` or `pip install yt-dlp` |
| ffmpeg | Yes | `brew install ffmpeg` or [ffmpeg.org](https://ffmpeg.org/) |
| API key | Yes (for cloud providers) | See [Provider Guide](docs/user/providers.md) |

## Directory structure

```
~/anyscribe/                              # Obsidian vault (configurable)
├── _index.md                             # Master index (newest first)
├── sources/
│   ├── youtube/<slug>.md
│   ├── instagram/<slug>.md
│   └── local/<slug>.md
└── daily/YYYY-MM-DD.md

~/.anyscribecli/                          # App internals (hidden)
├── config.yaml                           # Settings (no secrets)
├── .env                                  # API keys + passwords
├── downloads/                            # Downloads (separate from vault)
│   ├── audio/<platform>/                 # Kept audio (if keep_media=true)
│   └── video/<platform>/                 # Downloaded videos
├── sessions/                             # Login sessions
└── logs/                                 # Processing logs
```

> **Workspace is visible and configurable** — transcripts default to `~/anyscribe/` (no hidden dot-dir). Change it with `scribe config set workspace_path /your/path`. Downloads stay separate to keep the vault lightweight.

## Providers

| Provider | Best for | API key |
|----------|----------|---------|
| **OpenAI Whisper** (default) | General purpose, multilingual | `OPENAI_API_KEY` |
| **Deepgram Nova** | Diarization (auto-selected with `--diarize`), Hinglish | `DEEPGRAM_API_KEY` |
| **ElevenLabs Scribe** | High accuracy, 99 languages, word timestamps | `ELEVENLABS_API_KEY` |
| **Sarvam AI** | Indic languages (Hindi, Tamil, Telugu, etc.) | `SARGAM_API_KEY` |
| **OpenRouter** | Access to various AI models | `OPENROUTER_API_KEY` |
| **Local** (faster-whisper) | Offline, free, no API key needed | None |

See [Provider Guide](docs/user/providers.md) for detailed comparison, pricing, and setup.

## Configuration

```yaml
# ~/.anyscribecli/config.yaml
provider: openai          # Transcription provider
language: auto             # Language (auto-detect or ISO code)
keep_media: false          # Keep audio files after transcription
output_format: clean       # clean | timestamped | diarized
diarize: false             # enable speaker diarization
prompt_download: never     # never | ask | always — download video after transcription
local_file_media: skip     # skip | copy | move | ask — what to do with local files
workspace_path: ""         # empty = ~/anyscribe (default), or set a custom path
```

API keys and passwords live in `~/.anyscribecli/.env` (separate from config, never committed). You can set API keys directly:

```bash
scribe config set deepgram_api_key YOUR_KEY
scribe config set openai_api_key YOUR_KEY
```

> **Diarization auto-routing:** When you use `--diarize` without specifying a provider, scribe automatically switches to Deepgram (if configured) for best speaker detection. Override with `-p openai` if needed.

> **Web UI labels:** The CLI's `--diarize` flag is shown as `Multi-speaker` in the web UI, and the `diarized` output format is labelled `with-speaker-labels`. Wire values are unchanged — the rename is display-only so the UI reads in plain English.

See [Configuration Guide](docs/user/configuration.md) for all options.

## Claude Code Integration

scribe ships with a [Claude Code skill](https://code.claude.com/docs/en/skills) that teaches Claude how to transcribe, configure providers, and troubleshoot on your behalf. After installing scribe:

```bash
scribe install-skill
```

Or run `scribe onboard` — it auto-detects Claude Code and offers to install the skill. Once installed, Claude can use `/scribe` or auto-activate when you ask it to transcribe something.

## Documentation

| For | Where |
|-----|-------|
| First-time users | [Getting Started](docs/user/getting-started.md) |
| Command reference | [Commands](docs/user/commands.md) |
| All config options | [Configuration](docs/user/configuration.md) |
| Provider comparison | [Providers](docs/user/providers.md) |
| AI developers | [CLAUDE.md](CLAUDE.md) |
| Agent directives | [AGENTS.md](AGENTS.md) |
| Developer memory | [Building Docs](docs/building/) |

## Development

```bash
git clone https://github.com/rishmadaan/anyscribecli.git
cd anyscribecli
pip install -e ".[dev]"

ruff check src/          # lint
ruff format src/         # format
pytest                   # test
```

## License

MIT
