Metadata-Version: 2.4
Name: multiai-tts
Version: 0.2.1
Summary: TTS extension for multiai using OpenAI, Google GenAI and Azure Speech
Project-URL: Homepage, https://github.com/sekika/multiai-tts
Project-URL: Bug Tracker, https://github.com/sekika/multiai-tts/issues
Author: Katsutoshi Seki
License: MIT
License-File: LICENSE
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.8
Requires-Dist: azure-cognitiveservices-speech
Requires-Dist: google-genai
Requires-Dist: multiai
Requires-Dist: openai
Requires-Dist: pydub
Requires-Dist: sounddevice
Requires-Dist: soundfile
Description-Content-Type: text/markdown

# multiai-tts

`multiai-tts` is an extension library for [multiai](https://sekika.github.io/multiai/) that provides Text-to-Speech (TTS) capabilities using OpenAI, Google GenAI, and Azure Speech.

## Prerequisites

**API Key Configuration**

This library relies on the configuration provided by `multiai`. You must set up your API keys (OpenAI API Key, Google API Key, Azure TTS Key and Region) using `multiai`'s configuration files or environment variables before using this library.

For details on how to configure API keys, please refer to the **[multiai documentation](https://sekika.github.io/multiai/)**.

**System Requirements**

- `ffmpeg` must be installed on your system if you want to save audio in formats other than WAV (e.g., MP3).
- `pydub`, which is installed automatically as a dependency, performs the audio conversion and calls `ffmpeg` to do so.
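
If you are unsure whether `ffmpeg` is available, a quick check before choosing an output format can avoid a failed conversion. The following is a minimal sketch using only the Python standard library. The helper names `ffmpeg_available` and `choose_output_path` are hypothetical and are not part of `multiai-tts`.

```python
import shutil
from pathlib import Path
from typing import Optional


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on PATH."""
    return shutil.which("ffmpeg") is not None


def choose_output_path(path: str, have_ffmpeg: Optional[bool] = None) -> str:
    """Hypothetical helper: fall back to a .wav filename when ffmpeg is missing."""
    if have_ffmpeg is None:
        have_ffmpeg = ffmpeg_available()
    target = Path(path)
    # WAV needs no conversion, so the path is kept as-is in that case.
    if target.suffix.lower() == ".wav" or have_ffmpeg:
        return str(target)
    return str(target.with_suffix(".wav"))


if __name__ == "__main__":
    # Prints "output.mp3" when ffmpeg is installed, otherwise "output.wav".
    print(choose_output_path("output.mp3"))
```

The returned path can then be passed as the filename argument of `save_tts()`.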

## Installation

```bash
pip install multiai-tts
```

## Usage

### Google GenAI Example

```python
import sys
import multiai_tts

client = multiai_tts.Prompt()
client.set_tts_model('google', 'gemini-2.5-flash-preview-tts')
client.tts_voice_google = 'charon'

# Speak directly
client.speak("Please speak the following. Hello, this is a test from Google model.")
if client.error:
    print(client.error_message)
    sys.exit(1)

# Save to file
client.save_tts("Please speak the following. Saving this audio to mp3.", "output_google.mp3")
if client.error:
    print(client.error_message)
    sys.exit(1)
```

### OpenAI Example

```python
import sys
import multiai_tts

client = multiai_tts.Prompt()
client.set_tts_model('openai', 'gpt-4o-mini-tts')
client.tts_voice_openai = 'marin'

# Speak directly
client.speak("Hello, this is a test from OpenAI model.")
if client.error:
    print(client.error_message)
    sys.exit(1)

# Save to file
client.save_tts("Saving this audio to mp3.", "output_openai.mp3")
if client.error:
    print(client.error_message)
    sys.exit(1)
```

### Azure TTS Example

```python
import sys
import multiai_tts

client = multiai_tts.Prompt()
client.set_tts_provider('azure')
client.tts_voice_azure = 'en-US-JennyNeural'

# Speak directly
client.speak("Hello, this is a test from Azure TTS.")
if client.error:
    print(client.error_message)
    sys.exit(1)

# Save to file
client.save_tts("Saving this audio to mp3.", "output_azure.mp3")
if client.error:
    print(client.error_message)
    sys.exit(1)
```

## Notes

* For OpenAI and Google TTS, use `set_tts_model(provider, model)` to select both provider and model.
* For Azure, `set_tts_provider('azure')` is sufficient; the model parameter is not used.
* The Google example prefixes the text with “Please speak the following.”, while the OpenAI and Azure examples do not. Whether you need such a prefix depends on the model you use.
* `Prompt.get_wav()` fetches the raw audio data in memory without playing it; playback is a separate step.
* Error handling: after each call to `speak()` or `save_tts()`, check `client.error`. When it is set, `client.error_message` describes the failure.
* WAV is the default output format; saving to other formats such as MP3 requires `pydub` and `ffmpeg`.
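
The repeated `if client.error:` checks in the examples can be folded into a small helper. The following is a minimal sketch that relies only on the `error` and `error_message` attributes shown above. The names `TTSError` and `raise_on_error` are hypothetical and not part of `multiai-tts`, and the stand-in objects exist only so that the sketch runs without API keys.

```python
from types import SimpleNamespace


class TTSError(RuntimeError):
    """Hypothetical exception type; not part of multiai-tts."""


def raise_on_error(client, action: str) -> None:
    """Raise TTSError if the previous call left client.error set."""
    if getattr(client, "error", False):
        message = getattr(client, "error_message", "") or "unknown error"
        raise TTSError(f"{action} failed: {message}")


# Stand-in objects so the sketch runs without API keys; a real
# multiai_tts.Prompt() instance would be passed in the same way.
ok_client = SimpleNamespace(error=False, error_message="")
bad_client = SimpleNamespace(error=True, error_message="invalid API key")

raise_on_error(ok_client, "speak")  # no exception is raised
try:
    raise_on_error(bad_client, "save_tts")
except TTSError as exc:
    print(exc)  # save_tts failed: invalid API key
```

With a real client, you would call `raise_on_error(client, "speak")` immediately after `client.speak(...)`, and likewise after `client.save_tts(...)`.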