Metadata-Version: 2.4
Name: fmus-vox
Version: 0.0.1
Summary: A speech processing library
Home-page: https://github.com/mexyusef/fmus-vox
Author: Yusef Ulum
Author-email: yusef314159@gmail.com
Project-URL: Bug Tracker, https://github.com/mexyusef/fmus-vox/issues
Project-URL: Documentation, https://fmus-vox.readthedocs.io/
Project-URL: Source Code, https://github.com/mexyusef/fmus-vox
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.20.0
Requires-Dist: scipy>=1.7.0
Requires-Dist: librosa>=0.9.0
Requires-Dist: soundfile>=0.10.0
Requires-Dist: pyrubberband>=0.3.0
Requires-Dist: sounddevice>=0.4.0
Requires-Dist: requests>=2.27.0
Requires-Dist: tqdm>=4.62.0
Requires-Dist: pydantic>=1.9.0
Provides-Extra: stt
Requires-Dist: torch>=1.10.0; extra == "stt"
Requires-Dist: openai-whisper>=20230314; extra == "stt"
Requires-Dist: transformers>=4.18.0; extra == "stt"
Requires-Dist: pyctcdecode>=0.3.0; extra == "stt"
Provides-Extra: tts
Requires-Dist: torch>=1.10.0; extra == "tts"
Requires-Dist: torchaudio>=0.10.0; extra == "tts"
Requires-Dist: phonemizer>=3.0.0; extra == "tts"
Requires-Dist: unidecode>=1.3.0; extra == "tts"
Provides-Extra: voice
Requires-Dist: torch>=1.10.0; extra == "voice"
Requires-Dist: resemblyzer>=0.1.0; extra == "voice"
Requires-Dist: praat-parselmouth>=0.4.0; extra == "voice"
Requires-Dist: phonemizer>=3.0.0; extra == "voice"
Requires-Dist: gruut>=2.0.0; platform_system != "Windows" and extra == "voice"
Requires-Dist: unidecode>=1.3.0; extra == "voice"
Provides-Extra: voice-yourtts
Requires-Dist: TTS>=0.10.0; extra == "voice-yourtts"
Provides-Extra: voice-sv2tts
Requires-Dist: tensorflow==1.15.0; python_version < "3.10" and extra == "voice-sv2tts"
Requires-Dist: numpy==1.19.3; python_version < "3.10" and extra == "voice-sv2tts"
Requires-Dist: librosa==0.8.0; python_version < "3.10" and extra == "voice-sv2tts"
Requires-Dist: webrtcvad==2.0.10; extra == "voice-sv2tts"
Requires-Dist: inflect>=5.3.0; extra == "voice-sv2tts"
Provides-Extra: wakeword
Requires-Dist: pvporcupine>=2.1.0; extra == "wakeword"
Requires-Dist: webrtcvad>=2.0.10; extra == "wakeword"
Provides-Extra: api
Requires-Dist: fastapi>=0.75.0; extra == "api"
Requires-Dist: uvicorn>=0.17.0; extra == "api"
Requires-Dist: python-multipart>=0.0.5; extra == "api"
Provides-Extra: cli
Requires-Dist: typer>=0.4.0; extra == "cli"
Requires-Dist: rich>=12.0.0; extra == "cli"
Provides-Extra: chatbot
Requires-Dist: langchain>=0.0.139; extra == "chatbot"
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=3.0.0; extra == "dev"
Requires-Dist: black>=22.1.0; extra == "dev"
Requires-Dist: isort>=5.10.0; extra == "dev"
Requires-Dist: mypy>=0.931; extra == "dev"
Requires-Dist: flake8>=4.0.0; extra == "dev"
Requires-Dist: sphinx>=4.4.0; extra == "dev"
Requires-Dist: sphinx-rtd-theme>=1.0.0; extra == "dev"
Provides-Extra: voice-all
Requires-Dist: TTS>=0.10.0; extra == "voice-all"
Requires-Dist: gruut>=2.0.0; platform_system != "Windows" and extra == "voice-all"
Requires-Dist: inflect>=5.3.0; extra == "voice-all"
Requires-Dist: librosa==0.8.0; python_version < "3.10" and extra == "voice-all"
Requires-Dist: numpy==1.19.3; python_version < "3.10" and extra == "voice-all"
Requires-Dist: phonemizer>=3.0.0; extra == "voice-all"
Requires-Dist: praat-parselmouth>=0.4.0; extra == "voice-all"
Requires-Dist: resemblyzer>=0.1.0; extra == "voice-all"
Requires-Dist: tensorflow==1.15.0; python_version < "3.10" and extra == "voice-all"
Requires-Dist: torch>=1.10.0; extra == "voice-all"
Requires-Dist: unidecode>=1.3.0; extra == "voice-all"
Requires-Dist: webrtcvad==2.0.10; extra == "voice-all"
Provides-Extra: full
Requires-Dist: TTS>=0.10.0; extra == "full"
Requires-Dist: black>=22.1.0; extra == "full"
Requires-Dist: fastapi>=0.75.0; extra == "full"
Requires-Dist: flake8>=4.0.0; extra == "full"
Requires-Dist: gruut>=2.0.0; platform_system != "Windows" and extra == "full"
Requires-Dist: inflect>=5.3.0; extra == "full"
Requires-Dist: isort>=5.10.0; extra == "full"
Requires-Dist: langchain>=0.0.139; extra == "full"
Requires-Dist: librosa==0.8.0; python_version < "3.10" and extra == "full"
Requires-Dist: mypy>=0.931; extra == "full"
Requires-Dist: numpy==1.19.3; python_version < "3.10" and extra == "full"
Requires-Dist: openai-whisper>=20230314; extra == "full"
Requires-Dist: phonemizer>=3.0.0; extra == "full"
Requires-Dist: praat-parselmouth>=0.4.0; extra == "full"
Requires-Dist: pvporcupine>=2.1.0; extra == "full"
Requires-Dist: pyctcdecode>=0.3.0; extra == "full"
Requires-Dist: pytest-cov>=3.0.0; extra == "full"
Requires-Dist: pytest>=7.0.0; extra == "full"
Requires-Dist: python-multipart>=0.0.5; extra == "full"
Requires-Dist: resemblyzer>=0.1.0; extra == "full"
Requires-Dist: rich>=12.0.0; extra == "full"
Requires-Dist: sphinx-rtd-theme>=1.0.0; extra == "full"
Requires-Dist: sphinx>=4.4.0; extra == "full"
Requires-Dist: tensorflow==1.15.0; python_version < "3.10" and extra == "full"
Requires-Dist: torch>=1.10.0; extra == "full"
Requires-Dist: torchaudio>=0.10.0; extra == "full"
Requires-Dist: transformers>=4.18.0; extra == "full"
Requires-Dist: typer>=0.4.0; extra == "full"
Requires-Dist: unidecode>=1.3.0; extra == "full"
Requires-Dist: uvicorn>=0.17.0; extra == "full"
Requires-Dist: webrtcvad==2.0.10; extra == "full"
Requires-Dist: webrtcvad>=2.0.10; extra == "full"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: project-url
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# fmus-vox

A speech processing library for Python.

## About

fmus-vox is a Python library that provides a rich set of tools for audio processing, speech-to-text (STT), text-to-speech (TTS), voice cloning, wake word detection, and conversational AI.

## Features

- **Audio Processing**: Load, manipulate, and analyze audio with an intuitive interface
- **Speech-to-Text**: Transcribe speech with support for multiple models (Whisper, Wav2Vec, etc.)
- **Text-to-Speech**: Synthesize natural-sounding speech with various voices and styles
- **Voice Cloning**: Create synthetic speech that mimics a specific voice
- **Wake Word Detection**: Detect custom wake words in audio streams
- **Conversational AI**: Build voice-driven conversational agents
- **Streaming**: Real-time audio processing with low latency
- **API**: Easy integration with web applications

## Installation

### Basic Installation

```bash
pip install fmus-vox
```

### With Specific Features

Quote the requirement so shells such as zsh don't try to expand the brackets:

```bash
# For speech-to-text capabilities
pip install "fmus-vox[stt]"

# For text-to-speech capabilities
pip install "fmus-vox[tts]"

# For voice cloning
pip install "fmus-vox[voice]"

# For wake word detection
pip install "fmus-vox[wakeword]"

# For the command-line interface
pip install "fmus-vox[cli]"

# For the API server
pip install "fmus-vox[api]"

# Extras can be combined
pip install "fmus-vox[stt,tts]"

# For all features
pip install "fmus-vox[full]"
```
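After installing, you can confirm which optional dependency groups a distribution declares by reading its `Provides-Extra` metadata with the standard library's `importlib.metadata`. This is generic stdlib behavior, not an fmus-vox API; the example inspects `pip` only because it is installed almost everywhere:

```python
import importlib.metadata as md

def extras_of(dist_name: str) -> list[str]:
    """Return the 'Provides-Extra' groups an installed distribution declares."""
    meta = md.metadata(dist_name)
    # get_all returns None when the distribution declares no extras
    return meta.get_all("Provides-Extra") or []

# Inspect any installed distribution, e.g. extras_of("fmus-vox")
print(extras_of("pip"))
```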

## Usage Examples

### Audio Processing

```python
from fmus_vox import Audio

# Load and process audio
audio = Audio.load("recording.wav")
processed = audio.normalize().denoise().resample(target_sr=16000)
processed.save("processed.wav")

# Record audio
audio = Audio.record(seconds=5)
audio.save("recording.wav")
```
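The README does not specify what `normalize()` does internally. A common convention in audio libraries is peak normalization, scaling the signal so its largest absolute sample hits a target level; here is a plain-NumPy sketch of that idea, purely illustrative and not fmus-vox's actual implementation:

```python
import numpy as np

def peak_normalize(samples: np.ndarray, peak: float = 1.0) -> np.ndarray:
    """Scale the signal so its largest absolute sample equals `peak`."""
    max_abs = float(np.max(np.abs(samples)))
    if max_abs == 0.0:
        return samples.copy()  # all-silence input: nothing to scale
    return samples * (peak / max_abs)

sig = np.array([0.1, -0.5, 0.25])
print(peak_normalize(sig).tolist())  # [0.2, -1.0, 0.5]
```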

### Speech-to-Text

```python
from fmus_vox import transcribe

# Simple transcription
text = transcribe("recording.wav")
print(f"Transcription: {text}")

# With specific model and language
text = transcribe("recording.wav", model="whisper-large", language="en")
```

### Text-to-Speech

```python
from fmus_vox import speak

# Simple synthesis
speak("Hello, welcome to fmus-vox!", output="welcome.wav")

# With voice styling
from fmus_vox import Speaker

speaker = Speaker(voice="en-female-1")
speaker.set_style("happy").set_speed(1.2).speak("This is exciting!")
```

### Voice Cloning

```python
from fmus_vox import clone_voice

# Clone voice and synthesize speech
clone_voice("my_voice.wav", "Hello with my voice", output="cloned.wav")

# Advanced usage
from fmus_vox import VoiceCloner

cloner = VoiceCloner()
voice_id = cloner.add_reference("my_voice.wav")
audio = cloner.synthesize("Hello with my voice", voice_id)
audio.save("cloned.wav")
```

### Real-time Voice Application

```python
from fmus_vox import VoiceApp

app = VoiceApp()

@app.on_wake("hey assistant")
def wake_handler():
    print("Wake word detected!")
    return True  # Start listening

@app.on_transcribe
def transcribe_handler(text):
    print(f"User said: {text}")
    if "weather" in text.lower():
        return "Today's weather is sunny."
    return "I didn't understand that command."

app.run()
```
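Wake-word pipelines like the one above typically gate the microphone stream with voice activity detection before running heavier models (the `wakeword` extra pulls in `webrtcvad` for this). As background, here is a minimal energy-based VAD sketch in plain NumPy; it illustrates the general technique and says nothing about fmus-vox's internals:

```python
import numpy as np

def frame_energy_vad(samples: np.ndarray, frame_len: int = 400,
                     threshold: float = 0.02) -> np.ndarray:
    """Flag each fixed-size frame as speech when its RMS energy exceeds
    the threshold. Trailing samples that don't fill a frame are dropped."""
    n_frames = len(samples) // frame_len
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    return rms > threshold

# 400 silent samples followed by 400 loud ones -> one quiet frame, one speech frame
sig = np.concatenate([np.zeros(400), 0.5 * np.ones(400)])
print(frame_energy_vad(sig).tolist())  # [False, True]
```

Real VADs (such as the WebRTC one) add spectral features and hysteresis so brief pauses inside an utterance are not cut off.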

## License

MIT License. See the [LICENSE](LICENSE) file for details.

## Contributing

Contributions are welcome! Please check out our [contributing guidelines](CONTRIBUTING.md).
