Metadata-Version: 2.4
Name: easytranscribe
Version: 0.1.2
Summary: Easy speech-to-text transcription from audio files or live microphone input using Whisper.
Home-page: https://github.com/akhshyganesh/easytranscribe
Author: akhshyganesh
Author-email: 
License: MIT License
        
        Copyright (c) 2025 Akhshy Ganesh
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/akhshyganesh/easytranscribe
Project-URL: Bug Reports, https://github.com/akhshyganesh/easytranscribe/issues
Project-URL: Source Code, https://github.com/akhshyganesh/easytranscribe
Keywords: speech-to-text,whisper,transcription,audio,ai
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Operating System :: OS Independent
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: openai-whisper>=20240930
Requires-Dist: sounddevice>=0.4.6
Requires-Dist: numpy<2.3,>=1.21.0
Provides-Extra: dev
Requires-Dist: pytest>=6.0; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: flake8; extra == "dev"
Requires-Dist: mypy; extra == "dev"
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# EasyTranscribe

A simple Python-based voice assistant that captures speech from your microphone or from a recorded file, detects silence, and transcribes spoken words to text using OpenAI Whisper. Easily extensible for integration with local LLM tooling such as Ollama and models such as Gemma.

## Features

- Real-time microphone audio capture
- Automatic silence detection and recording stop
- Speech-to-text transcription using Whisper
- Comprehensive transcription logging with detailed metrics
- Easy integration with other AI models

## Installation

1. Clone the repository:
   ```bash
   git clone https://github.com/akhshyganesh/easytranscribe.git
   cd easytranscribe
   ```

2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

## Usage

Run the main script:
```bash
python main.py
```
Speak into your microphone. The assistant automatically stops recording after 3 seconds of silence and then transcribes your speech.

# easytranscribe

[![PyPI version](https://badge.fury.io/py/easytranscribe.svg)](https://badge.fury.io/py/easytranscribe)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

Easy speech-to-text transcription from audio files or live microphone input using OpenAI's Whisper.

## ✨ Features

- 🎤 **Live microphone transcription** with automatic silence detection
- 📁 **Audio file transcription** supporting multiple formats
- 📊 **Automatic logging** with timestamps and performance metrics
- 🔧 **Simple CLI interface** for quick usage
- 🐍 **Easy Python API** for integration into your projects
- 📈 **Log analysis tools** to view transcription history and statistics

## 🚀 Quick Start

### Installation

```bash
pip install easytranscribe
```

### Python API

**Live microphone transcription:**
```python
from easytranscribe import capture_and_transcribe

# Start live transcription (records until silence is detected)
text = capture_and_transcribe(model_name="base")
print(f"You said: {text}")
```

**Audio file transcription:**
```python
from easytranscribe import transcribe_audio_file

# Transcribe an audio file
text = transcribe_audio_file("path/to/audio.wav", model_name="base")
print(f"Transcription: {text}")
```
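
The single-file call above extends naturally to a whole folder. The following is a minimal sketch, not part of the package: `transcribe_folder` and `AUDIO_EXTENSIONS` are hypothetical names, and the extension list is an assumption rather than the package's official list of supported formats. The transcription function is passed in as a parameter so the helper does not depend on any particular signature.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable

# Extensions to pick up; this set is an assumption, not the package's official list.
AUDIO_EXTENSIONS = (".wav", ".mp3", ".m4a", ".flac")


def transcribe_folder(
    folder: str,
    transcribe: Callable[[str], str],
    extensions: Iterable[str] = AUDIO_EXTENSIONS,
) -> Dict[str, str]:
    """Run `transcribe` on every matching file in `folder`, keyed by file name."""
    wanted = {ext.lower() for ext in extensions}
    results: Dict[str, str] = {}
    for path in sorted(Path(folder).iterdir()):
        # Skip sub-directories and files with other extensions.
        if path.is_file() and path.suffix.lower() in wanted:
            results[path.name] = transcribe(str(path))
    return results
```

With the package installed, one way to use it would be `transcribe_folder("recordings", functools.partial(transcribe_audio_file, model_name="base"))`, where `recordings` is a placeholder directory name.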

**View transcription logs:**
```python
from easytranscribe import view_logs

# View today's logs with statistics
logs = view_logs(date="today", stats=True)
print(f"Total entries: {logs['total_count']}")
```

### Command Line Interface

**Live transcription:**
```bash
easytranscribe live --model base
```

**File transcription:**
```bash
easytranscribe file path/to/audio.wav --model base
```

**View logs:**
```bash
# View today's logs
easytranscribe logs --date today --stats

# View last 10 entries
easytranscribe logs --tail 10

# List available log dates
easytranscribe logs --list-dates
```
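
If you want to call the CLI from a Python script rather than import the API, a thin `subprocess` wrapper is one option. This is a sketch under assumptions: the helper names are hypothetical, only the `file` subcommand and `--model` flag shown above are used, and nothing is assumed about the format of what the CLI prints.

```python
import subprocess
from typing import List


def build_file_command(audio_path: str, model: str = "base") -> List[str]:
    """Build the argv for `easytranscribe file <path> --model <model>`."""
    return ["easytranscribe", "file", audio_path, "--model", model]


def run_file_transcription(audio_path: str, model: str = "base") -> str:
    """Run the CLI and return its raw stdout (output format is not assumed)."""
    completed = subprocess.run(
        build_file_command(audio_path, model),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.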

## 📋 Available Whisper Models

| Model  | Parameters | Speed | Accuracy | Use Case |
|--------|------------|-------|----------|----------|
| `tiny` | 39M | Fastest | Fair | Real-time, low resource |
| `base` | 74M | Fast | Good | Balanced performance |
| `small` | 244M | Medium | Better | Higher accuracy |
| `medium` | 769M | Slow | Very Good | Professional use |
| `large` | 1550M | Slowest | Best | Maximum accuracy |
| `turbo` | 809M | Fast | Excellent | Best balance (default) |
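
To make the trade-off concrete, here is a small sketch that picks the largest model fitting a given size budget. `MODEL_SIZES` and `largest_model_within` are hypothetical helpers, not part of the package; the numbers are the size figures from the table above, which Whisper publishes as parameter counts in millions.

```python
from typing import Optional

# Size figures copied from the table above (Whisper lists them as
# parameter counts, in millions).
MODEL_SIZES = {
    "tiny": 39,
    "base": 74,
    "small": 244,
    "medium": 769,
    "turbo": 809,
    "large": 1550,
}


def largest_model_within(budget: int) -> Optional[str]:
    """Return the biggest model whose size does not exceed `budget`, if any."""
    fitting = {name: size for name, size in MODEL_SIZES.items() if size <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)
```

Size is only a rough proxy: `turbo` is larger than `medium` but is listed as faster, so speed requirements may change the choice.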

## 🔧 Configuration

### Audio Settings (Live Recording)

The package automatically handles:
- ✅ Silence detection (3 seconds of silence stops recording)
- ✅ Minimum recording time (2 seconds)
- ✅ Audio level monitoring
- ✅ Automatic microphone input
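
The stop rule described above can be sketched as follows. This is an illustration, not the package's actual implementation: `should_stop` and `RMS_THRESHOLD` are hypothetical, the threshold value is an assumption, and only the 3-second and 2-second figures come from the list above.

```python
import numpy as np

SILENCE_SECONDS = 3.0        # from the list above
MIN_RECORDING_SECONDS = 2.0  # from the list above
RMS_THRESHOLD = 0.01         # assumed level below which a chunk counts as silent


def should_stop(chunks, chunk_seconds, threshold=RMS_THRESHOLD):
    """Return True once the recording is long enough and ends in enough silence.

    `chunks` is a list of equally long audio blocks (arrays of samples),
    oldest first; `chunk_seconds` is the duration of one block.
    """
    elapsed = len(chunks) * chunk_seconds
    if elapsed < MIN_RECORDING_SECONDS:
        return False
    trailing_silence = 0.0
    # Walk backwards from the newest block until a non-silent one is found.
    for chunk in reversed(chunks):
        samples = np.asarray(chunk, dtype=np.float64)
        rms = float(np.sqrt(np.mean(samples ** 2)))
        if rms >= threshold:
            break
        trailing_silence += chunk_seconds
    return trailing_silence >= SILENCE_SECONDS
```

In a live loop, each block read from the microphone would be appended to `chunks` and `should_stop` checked after every append.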

### Logging

Transcriptions are automatically logged to `logs/transcription_YYYY-MM-DD.log` with:
- 📅 Timestamp
- 🤖 Model used
- ⏱️ Processing time
- 🎵 Audio duration (for live recording)
- 📝 Transcribed text
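
The file naming pattern makes it easy to locate logs by date. The sketch below only relies on the `logs/transcription_YYYY-MM-DD.log` pattern stated above; the helper names are hypothetical, and nothing is assumed about the layout of the entries inside each file.

```python
from datetime import date
from pathlib import Path
from typing import List

PREFIX = "transcription_"


def log_path_for(day: date, log_dir: str = "logs") -> Path:
    """Return the expected log file path for a given day."""
    return Path(log_dir) / f"{PREFIX}{day.isoformat()}.log"


def available_log_dates(log_dir: str = "logs") -> List[date]:
    """List the dates that have a log file in `log_dir`, oldest first."""
    dates = []
    for path in Path(log_dir).glob(f"{PREFIX}*.log"):
        stamp = path.stem[len(PREFIX):]
        try:
            dates.append(date.fromisoformat(stamp))
        except ValueError:
            continue  # ignore files that do not follow the naming pattern
    return sorted(dates)
```

For example, `log_path_for(date.today())` gives the path where today's entries would be expected.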

## 🛠️ Development

### Install from Source

```bash
git clone https://github.com/akhshyganesh/easytranscribe.git
cd easytranscribe
pip install -e .
```

### Run Tests

```bash
python test/test_integration.py
```

## 📄 Requirements

- Python 3.8+
- OpenAI Whisper
- sounddevice (for microphone input)
- numpy

## 📖 Documentation

For comprehensive documentation, examples, and API reference, visit:

**🌐 [EasyTranscribe Documentation](https://akhshyganesh.github.io/easytranscribe/)**

The documentation includes:
- 🚀 [Quick Start Guide](https://akhshyganesh.github.io/easytranscribe/quickstart/)
- 💻 [CLI Usage](https://akhshyganesh.github.io/easytranscribe/cli/)
- 🐍 [Python API](https://akhshyganesh.github.io/easytranscribe/api/)
- 📝 [Examples](https://akhshyganesh.github.io/easytranscribe/examples/)
- ⚙️ [Configuration](https://akhshyganesh.github.io/easytranscribe/configuration/)
- 🔧 [Advanced Usage](https://akhshyganesh.github.io/easytranscribe/advanced/)

## 🤝 Contributing

Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## 📜 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [OpenAI Whisper](https://github.com/openai/whisper) for the amazing speech recognition model
- [sounddevice](https://github.com/spatialaudio/python-sounddevice) for microphone input handling
