Metadata-Version: 2.4
Name: gcp-transcribe
Version: 0.1.1
Summary: Cloud-based MP3 transcription CLI built on Google Speech-to-Text v2 (Chirp 2)
Project-URL: Homepage, https://github.com/bengeos/gcp-transcribe
Project-URL: Repository, https://github.com/bengeos/gcp-transcribe
Project-URL: Issues, https://github.com/bengeos/gcp-transcribe/issues
License-Expression: MIT
Keywords: amharic,audio,chirp,cli,google-cloud,speech-to-text,transcription
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: End Users/Desktop
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Requires-Dist: google-cloud-speech>=2.27.0
Requires-Dist: google-cloud-storage>=2.18.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: rich>=13.7.0
Requires-Dist: typer>=0.12.0
Description-Content-Type: text/markdown

# gcp-transcribe

Cloud-based MP3 transcription CLI built on Google Speech-to-Text v2 (Chirp 2).
Submit a job, get a job ID back, then poll for status and fetch results when ready.

- Async batch workflow for long-form audio
- Multilingual via Chirp 2 (any BCP-47 language Chirp 2 supports)
- Outputs `.txt`, `.json`, and optional `.srt`
- Jobs survive across CLI invocations and reboots

## Prerequisites

- Python 3.10+
- A Google Cloud account with billing enabled
- The [`gcloud` CLI](https://cloud.google.com/sdk/docs/install) installed locally

## Set up Google Cloud

You can do this in the [Google Cloud Console](https://console.cloud.google.com/) or entirely from the terminal. The steps below use the `gcloud` CLI:

### 1. Create or select a project

```bash
# Create a new project
gcloud projects create <your-project-id> --name="Transcribe"

# Or pick an existing one
gcloud projects list
gcloud config set project <your-project-id>
```

### 2. Enable billing

A billing account must be linked to the project. List your billing accounts and link one:

```bash
gcloud billing accounts list
gcloud billing projects link <your-project-id> \
  --billing-account=<billing-account-id>
```

(Or do this once in the Console under **Billing → Link a billing account**.)

### 3. Enable the required APIs

```bash
gcloud services enable \
  speech.googleapis.com \
  storage.googleapis.com \
  --project=<your-project-id>
```

### 4. Authenticate

This tool uses Application Default Credentials, so you don't need a service-account key for local use:

```bash
gcloud auth application-default login
gcloud auth application-default set-quota-project <your-project-id>
```

The user account you log in with needs roles that allow Speech-to-Text recognition plus bucket and object management on the staging bucket. For personal use, **Owner** or **Editor** on the project is sufficient. For tighter scoping, the minimum is:

- `roles/speech.client`
- `roles/storage.admin` *(or `storage.objectAdmin` + `storage.bucketAdmin` if you'd rather not grant full admin)*

> Running in CI or a server? Create a service account, grant the same roles, download a JSON key, and point `GOOGLE_APPLICATION_CREDENTIALS` at it instead of running `gcloud auth ...`.

## Install

```bash
pip install gcp-transcribe
```

Or from source:

```bash
git clone https://github.com/bengeos/gcp-transcribe.git
cd gcp-transcribe
pip install -e .
```

## Configure

Create a `.env` file in your working directory (or export these as environment variables):

```env
GCP_PROJECT_ID=your-gcp-project-id
GCP_REGION=us-central1
GCS_BUCKET=your-staging-bucket
DEFAULT_LANGUAGES=en-US
```

| Key                 | Default                       | Notes                                          |
| ------------------- | ----------------------------- | ---------------------------------------------- |
| `GCP_PROJECT_ID`    | —                             | Required.                                      |
| `GCP_REGION`        | `us-central1`                 | Must be a Chirp 2-supported region.            |
| `GCS_BUCKET`        | `<project-id>-transcription`  | Auto-created on first run if it doesn't exist. |
| `DEFAULT_LANGUAGES` | `en-US`                       | Comma-separated BCP-47 codes.                  |
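
To make the defaults in the table concrete, here is a minimal Python sketch of how they could be resolved. It is an illustration only, not the tool's actual loader: `Settings` and `load_settings` are hypothetical names, and the sketch assumes the bucket default is derived from the project ID exactly as the table describes.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    project_id: str
    region: str
    bucket: str
    languages: tuple[str, ...]


def load_settings(env=None) -> Settings:
    """Resolve settings from a mapping (defaults to os.environ).

    Hypothetical helper: mirrors the defaults documented in the table above.
    """
    env = os.environ if env is None else env

    project_id = env.get("GCP_PROJECT_ID", "").strip()
    if not project_id:
        raise ValueError("GCP_PROJECT_ID is required")

    # Fall back to the documented defaults when a key is unset or empty.
    region = env.get("GCP_REGION", "").strip() or "us-central1"
    bucket = env.get("GCS_BUCKET", "").strip() or f"{project_id}-transcription"

    # DEFAULT_LANGUAGES is a comma-separated list of BCP-47 codes.
    raw = env.get("DEFAULT_LANGUAGES", "").strip() or "en-US"
    languages = tuple(code.strip() for code in raw.split(",") if code.strip())

    return Settings(project_id, region, bucket, languages)
```

For example, `load_settings({"GCP_PROJECT_ID": "demo"})` would resolve the bucket to `demo-transcription` and the languages to `("en-US",)`.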

Verify everything is wired up:

```bash
transcribe doctor
```

## Usage

### Submit a job

```bash
transcribe submit audio.mp3
# → prints a job ID
```

Multiple files in one job:

```bash
transcribe submit a.mp3 b.mp3 c.mp3
```

Override languages per call:

```bash
transcribe submit audio.mp3 -l en-US -l es-ES
```
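
If you script submissions from Python, the command line can be assembled from the pieces shown above. The sketch below only builds the argument list; `build_submit_command` is a hypothetical helper, and it uses no flags beyond the `-l` option documented here.

```python
def build_submit_command(files, languages=()):
    """Build the argv for `transcribe submit` (hypothetical helper).

    files:     one or more audio file paths
    languages: optional BCP-47 codes, each passed as its own -l flag
    """
    files = [str(path) for path in files]
    if not files:
        raise ValueError("at least one audio file is required")

    command = ["transcribe", "submit", *files]
    for code in languages:
        command += ["-l", code]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which avoids shell quoting problems with file names that contain spaces.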

### Check status

```bash
transcribe status <job-id>
```

The state is one of `running`, `done`, `failed`, or `fetched`.
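
A polling loop around these states might look like the following Python sketch. It rests on assumptions: `wait_for_job` is a hypothetical helper, `get_state` is a callable you supply (for example, a wrapper that runs `transcribe status` and extracts the state, whose output format is not specified here), and every state other than `running` is treated as terminal.

```python
import time
from typing import Callable

# Assumption: anything other than "running" means the job will not change further.
TERMINAL_STATES = {"done", "failed", "fetched"}


def wait_for_job(
    get_state: Callable[[], str],
    interval: float = 30.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_state() until it returns a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {state!r} after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` as a parameter keeps the loop easy to test with a stub instead of real waiting.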

### Fetch the transcript

```bash
transcribe result <job-id> --out ./transcripts
```

Add an `.srt` subtitle file:

```bash
transcribe result <job-id> --out ./transcripts --srt
```
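
For readers unfamiliar with the `.srt` format, the Python sketch below shows the standard SubRip cue layout: an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, and the text. It is not the tool's own writer; `srt_timestamp` and `srt_cue` are hypothetical names used only for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Render one subtitle cue; cues in a file are separated by a blank line."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
```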

GCS staging objects are deleted after a successful fetch. Use `--keep-remote` to retain them.

### Other commands

```bash
transcribe list                # all known jobs
transcribe forget <job-id>     # remove a job from the local manifest
transcribe doctor              # verify config + GCP auth
```

## Costs

Chirp 2 batch transcription costs roughly **$0.016 per minute** of audio, so a 3-hour file (180 minutes) comes to about $2.88.
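
The arithmetic can be sketched in Python as follows. `estimate_cost` is a hypothetical helper, and the rate is the approximate figure quoted above, not a value read from Google's current price list.

```python
from decimal import Decimal

# Approximate per-minute rate quoted in this README; verify against current pricing.
RATE_PER_MINUTE = Decimal("0.016")


def estimate_cost(duration_minutes) -> Decimal:
    """Estimate the transcription cost in USD, rounded to cents."""
    minutes = Decimal(str(duration_minutes))
    if minutes < 0:
        raise ValueError("duration must be non-negative")
    return (minutes * RATE_PER_MINUTE).quantize(Decimal("0.01"))
```

With these assumptions, `estimate_cost(180)` gives `Decimal("2.88")`, matching the 3-hour figure.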

## Limitations

- No speaker diarization (not supported by Chirp 2 in batch mode).
- Each file is capped at the Speech-to-Text v2 batch limit (currently 8 hours per file).

## License

MIT
