Metadata-Version: 2.4
Name: minyt
Version: 0.0.2
Summary: Download audio from a YouTube video and use the Gemini LLM for cleaner and smarter transcripts
Home-page: https://github.com/franckalbinet/minyt
Author: Franck Albinet
Author-email: franckalbinet@gmail.com
License: Apache Software License 2.0
Keywords: nbdev jupyter notebook python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: Apache Software License
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastcore
Requires-Dist: python-dotenv
Requires-Dist: google-genai
Requires-Dist: yt-dlp
Requires-Dist: ffmpeg-python
Requires-Dist: tqdm
Requires-Dist: rich
Provides-Extra: dev
Requires-Dist: ipykernel; extra == "dev"
Requires-Dist: nbdev; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: keywords
Dynamic: license
Dynamic: license-file
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# minyt


<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

[![PyPI
version](https://badge.fury.io/py/minyt.svg)](https://badge.fury.io/py/minyt)
[![License: Apache
2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Python
3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)

**minyt** (WIP) is a Python package that simplifies the process of
downloading YouTube audio and generating high-quality transcripts using
Google’s Gemini AI. It intelligently splits long audio files at natural
silence points and processes chunks in parallel for optimal performance.

## Features

- **YouTube Audio Download**: Extract audio from any YouTube video using
  `yt-dlp`
- **Smart Audio Splitting**: Automatically detect silence and split
  audio at natural break points
- **AI-Powered Transcription**: Use Google’s Gemini 2.0 Flash for
  accurate, context-aware transcriptions
- **Parallel Processing**: Process multiple audio chunks concurrently
  for faster results
- **Customizable**: Configure chunk sizes, silence detection, and
  transcription prompts
- **Clean Output**: Generate well-formatted transcripts ready for
  analysis

## Quick Start

### Installation

``` bash
pip install minyt
```

### Prerequisites

1.  **FFmpeg**: Required for audio processing

    ``` bash
    # macOS
    brew install ffmpeg

    # Ubuntu/Debian
    sudo apt update && sudo apt install ffmpeg

    # Windows
    # Download from https://ffmpeg.org/download.html
    ```

2.  **Google Gemini API Key**: Get your API key from [Google AI
    Studio](https://makersuite.google.com/app/apikey)

    ``` bash
    export GEMINI_API_KEY="your-api-key-here"
    ```
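
Both prerequisites can be checked from Python before running anything else. The sketch below is not part of minyt: `check_prerequisites` is a hypothetical helper that uses only the standard library, and it assumes the key is read from the `GEMINI_API_KEY` environment variable as exported above.

``` python
import os
import shutil


def check_prerequisites(env=None):
    """Hypothetical helper (not part of minyt): list missing prerequisites."""
    env = os.environ if env is None else env
    problems = []
    # shutil.which returns None when the executable is not on PATH
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    # An empty string counts as missing
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    return problems


for problem in check_prerequisites():
    print(f"Missing prerequisite: {problem}")
```

An empty result means both FFmpeg and the API key were found.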

### Basic Usage

``` python
import asyncio
from pathlib import Path
from minyt.core import *

# Download audio from a YouTube video
video_id = "dQw4w9WgXcQ"  # Replace with your video ID
audio_file = download_audio(video_id, Path("_audio"))

# Detect silence and find optimal split points
_, silence_data = detect_silence(audio_file)
silence_ends = parse_silence_ends(silence_data)
total_duration = get_audio_duration(audio_file)
split_points = find_split_points(silence_ends, total_duration, chunk_len=600)

# Split audio into manageable chunks
chunks = split_audio(audio_file, split_points, dest_dir="_audio_chunks")

# Transcribe all chunks using Gemini AI
async def main():
    transcript = await transcribe_audio(
        chunks_dir="_audio_chunks",
        dest_file="_transcripts/transcript.txt",
        prompt="Please transcribe this audio file verbatim, maintaining speaker clarity and context."
    )
    print("Transcript saved to: _transcripts/transcript.txt")

asyncio.run(main())
```

## Detailed Usage

### Step 1: Download YouTube Audio

``` python
from minyt.core import download_audio
from pathlib import Path

# Download audio from a YouTube video
video_id = "your-video-id-here"
audio_file = download_audio(video_id, Path("downloads"))
print(f"Audio downloaded to: {audio_file}")
```
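
`download_audio` is shown here with a bare video ID. If you start from a full URL instead, the ID can be pulled out with the standard library. This is a minimal sketch: `extract_video_id` is a hypothetical helper, not a minyt function, and it only covers the common `youtube.com/watch?v=...` and `youtu.be/...` link shapes.

``` python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url_or_id: str) -> str:
    """Hypothetical helper (not part of minyt): return the video ID from a URL or bare ID."""
    parsed = urlparse(url_or_id)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment
        return parsed.path.lstrip("/").split("/")[0]
    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Standard watch links carry the ID in the `v` query parameter
        ids = parse_qs(parsed.query).get("v")
        if ids:
            return ids[0]
        raise ValueError(f"No video ID found in URL: {url_or_id}")
    # No recognised host: assume the caller already passed a bare ID
    return url_or_id


video_id = extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ")
```

The resulting `video_id` can then be passed to `download_audio` as in the example above.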

### Step 2: Process Audio with Smart Splitting

``` python
from minyt.core import (
    detect_silence, parse_silence_ends, get_audio_duration,
    find_split_points, split_audio,
)

# Detect silence in the audio file
_, silence_data = detect_silence(audio_file)

# Parse silence end points
silence_ends = parse_silence_ends(silence_data)

# Find optimal split points (aiming for 10-minute chunks)
total_duration = get_audio_duration(audio_file)
split_points = find_split_points(silence_ends, total_duration, chunk_len=600)

# Split audio into chunks
chunks = split_audio(audio_file, split_points, dest_dir="audio_chunks")
print(f"Created {len(chunks)} audio chunks")
```
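
For intuition about the parsing step, the sketch below assumes that `silence_data` is the text log written by FFmpeg's `silencedetect` filter, which reports lines such as `silence_end: 13.1 | silence_duration: 0.755`. `silence_ends_from_log` is a hypothetical stand-in used for illustration; minyt's own `parse_silence_ends` may work differently.

``` python
import re

# Matches the timestamp in e.g. "silence_end: 13.1 | silence_duration: 0.755"
SILENCE_END_RE = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def silence_ends_from_log(log_text: str) -> list:
    """Hypothetical helper: collect every silence_end timestamp (seconds) from a log."""
    return [float(value) for value in SILENCE_END_RE.findall(log_text)]


# Made-up sample log in the silencedetect output format
sample_log = (
    "[silencedetect @ 0x1] silence_start: 12.345\n"
    "[silencedetect @ 0x1] silence_end: 13.1 | silence_duration: 0.755\n"
    "[silencedetect @ 0x1] silence_start: 41.0\n"
    "[silencedetect @ 0x1] silence_end: 42 | silence_duration: 1.0\n"
)
print(silence_ends_from_log(sample_log))  # [13.1, 42.0]
```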

### Step 3: Transcribe with Gemini AI

``` python
import asyncio
from minyt.core import transcribe_audio

async def transcribe_video():
    transcript = await transcribe_audio(
        chunks_dir="audio_chunks",
        dest_file="transcripts/final_transcript.txt",
        model="gemini-2.0-flash-001",  # Default model
        max_concurrent=3,  # Process 3 chunks simultaneously
        prompt="Please transcribe this audio accurately, preserving speaker names and technical terms."
    )
    return transcript

# Run transcription
transcript = asyncio.run(transcribe_video())
print("Transcription completed!")
```

## Configuration

### Environment Variables

``` bash
# Required
export GEMINI_API_KEY="your-gemini-api-key"

# Optional: Configure logging level
export LOG_LEVEL="INFO"
```

### Customization Options

``` python
# Silence detection with the default settings (-30dB threshold, 0.5s duration)
_, silence_data = detect_silence(audio_file)

# Custom chunk size (in seconds)
split_points = find_split_points(silence_ends, total_duration, chunk_len=300)  # 5-minute chunks

# Custom transcription settings
# (top-level `await` works in a notebook; in a script, call this from an async function)
transcript = await transcribe_audio(
    chunks_dir="chunks",
    dest_file="output.txt",
    model="gemini-2.0-flash-001",  # Replace with another Gemini model ID if needed
    max_concurrent=5,  # More parallel processing
    prompt="Custom transcription instructions here..."
)
```
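
To see how `chunk_len` influences where the audio is cut, here is a minimal sketch of one plausible selection rule: for each target time, take the nearest silence end that lies after the previous cut. It is an illustration only. `pick_split_points` is a hypothetical function, the timestamps are made up, and minyt's `find_split_points` may use a different rule.

``` python
def pick_split_points(silence_ends, total_duration, chunk_len=600):
    """Illustrative only: cut at the silence end nearest to each chunk_len target."""
    points = []
    target = chunk_len
    while target < total_duration:
        # Only consider silences after the previous cut and before the end
        last = points[-1] if points else 0.0
        candidates = [t for t in silence_ends if last < t < total_duration]
        if not candidates:
            break
        best = min(candidates, key=lambda t: abs(t - target))
        points.append(best)
        # The next chunk is measured from the cut that was actually chosen
        target = best + chunk_len
    return points


silence_ends = [95.0, 290.5, 310.2, 598.0, 905.4]
print(pick_split_points(silence_ends, total_duration=1000, chunk_len=300))
# [290.5, 598.0, 905.4]
print(pick_split_points(silence_ends, total_duration=1000, chunk_len=600))
# [598.0]
```

Smaller `chunk_len` values produce more, shorter chunks, which gives more room for parallel transcription at the cost of more API calls.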

## Project Structure

    minyt/
    ├── audio/            # Downloaded audio files
    ├── audio_chunks/     # Split audio chunks
    ├── transcripts/      # Generated transcripts
    ├── minyt/
    │   ├── __init__.py
    │   └── core.py       # Main functionality
    └── nbs/              # Jupyter notebooks (development)

## Development

### Install in Development Mode

``` bash
# Clone the repository
git clone https://github.com/franckalbinet/minyt.git
cd minyt

# Install in development mode (the dev extra provides nbdev and ipykernel)
pip install -e ".[dev]"

# Make changes in the nbs/ directory
# ...

# Compile changes to apply to minyt package
nbdev_prepare
```

### Dependencies

- `fastcore`: Core utilities
- `python-dotenv`: Loads environment variables from a `.env` file
- `google-genai`: Google Gemini AI client
- `yt-dlp`: YouTube video downloader
- `ffmpeg-python`: Audio processing
- `tqdm`: Progress bars
- `rich`: Enhanced console output

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
For major changes, please open an issue first to discuss what you would
like to change.

## License

This project is licensed under the Apache License 2.0 - see the
[LICENSE](LICENSE) file for details.

## Acknowledgments

- [yt-dlp](https://github.com/yt-dlp/yt-dlp) for YouTube video
  downloading
- [Google Gemini](https://ai.google.dev/) for AI-powered transcription
- [FFmpeg](https://ffmpeg.org/) for audio processing capabilities

## Support

If you encounter any issues or have questions:

1.  Check the [documentation](https://franckalbinet.github.io/minyt/)
2.  Open an [issue](https://github.com/franckalbinet/minyt/issues)
3.  Contact the maintainer: franckalbinet@gmail.com

------------------------------------------------------------------------
