Metadata-Version: 2.4
Name: tts_webui_extension.openai_tts_api
Version: 1.4.1
Summary: OpenAI compatible TTS API with support for multiple TTS models
Author: rsxdalv
License: MIT
Project-URL: Homepage, https://github.com/rsxdalv/tts_webui_extension.openai_tts_api
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: openai
Requires-Dist: gradio
Requires-Dist: uvicorn
Requires-Dist: psutil
Requires-Dist: fastapi
Requires-Dist: pydantic
Requires-Dist: requests
Requires-Dist: scipy
Requires-Dist: numpy
Provides-Extra: chatterbox
Requires-Dist: tts_webui_extension.chatterbox; extra == "chatterbox"
Provides-Extra: f5-tts
Requires-Dist: tts_webui_extension.f5_tts; extra == "f5-tts"
Provides-Extra: higgs-v2
Requires-Dist: tts_webui_extension.higgs_v2; extra == "higgs-v2"
Provides-Extra: kitten-tts
Requires-Dist: tts_webui_extension.kitten_tts; extra == "kitten-tts"
Provides-Extra: kokoro
Requires-Dist: tts_webui_extension.kokoro; extra == "kokoro"
Provides-Extra: megatts3
Requires-Dist: tts_webui_extension.megatts3; extra == "megatts3"
Provides-Extra: mms
Requires-Dist: tts_webui_extension.mms; extra == "mms"
Provides-Extra: parler-tts
Requires-Dist: tts_webui_extension.parler_tts; extra == "parler-tts"
Provides-Extra: piper-tts
Requires-Dist: tts_webui_extension.piper_tts; extra == "piper-tts"
Provides-Extra: rvc
Requires-Dist: tts_webui_extension.rvc; extra == "rvc"
Provides-Extra: styletts2
Requires-Dist: tts_webui_extension.styletts2; extra == "styletts2"
Provides-Extra: vall-e-x
Requires-Dist: tts_webui_extension.vall_e_x; extra == "vall-e-x"
Dynamic: license-file

# OpenAI TTS API

This extension provides an OpenAI-compatible API for multiple TTS models, including Kokoro TTS, RVC, and Chatterbox.

Used as an extension in [TTS Webui](https://github.com/rsxdalv/tts-webui).

## Installation

```bash
pip install git+https://github.com/rsxdalv/extension_openai_tts_api@main
```

## Requirements

Each model requires its own extension to be installed separately.

```bash
# Kokoro TTS
pip install git+https://github.com/rsxdalv/extension_kokoro@main
# RVC
pip install git+https://github.com/rsxdalv/extension_rvc@main
# Chatterbox
pip install git+https://github.com/rsxdalv/extension_chatterbox@main
```

## Usage

### Authentication

You can optionally protect the API with an API key.

- Environment variable (takes priority): set `OPENAI_API_KEY` before starting the server/extension.
- UI setting: in the extension's Gradio UI, set "API Key (OpenAI-compatible)". This value is saved to your config and used when the environment variable is not set.
- If neither is set, the API is open and requires no authentication.

Clients should send the header `Authorization: Bearer <YOUR_API_KEY>`.
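As a minimal sketch of what a client needs to send, the helper below builds that header; the function name and its env-var fallback are illustrative, not part of the extension's API:

```python
import os

def auth_headers(api_key=None):
    """Build the Authorization header the API expects.

    Falls back to the OPENAI_API_KEY environment variable; returns an
    empty dict when no key is configured (the API is then open).
    """
    key = api_key or os.environ.get("OPENAI_API_KEY")
    return {"Authorization": f"Bearer {key}"} if key else {}

# Pass the headers with any HTTP client, e.g. requests:
# requests.post("http://localhost:7778/v1/audio/speech",
#               headers=auth_headers(),
#               json={"model": "...", "voice": "...", "input": "..."})
```

The OpenAI SDK examples below handle this automatically when you pass `api_key=` to the client constructor.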

### Kokoro TTS

```python
import asyncio

from openai import AsyncOpenAI
from openai.helpers import LocalAudioPlayer

# point the SDK at the local API server instead of api.openai.com
openai = AsyncOpenAI(api_key="sk-1234567890", base_url="http://localhost:7778/v1")

async def main() -> None:
    async with openai.audio.speech.with_streaming_response.create(
        model="hexgrad/Kokoro-82M",
        voice="af_heart",
        input="Today is a wonderful day to build something people love!",
    ) as response:
        await LocalAudioPlayer().play(response)

if __name__ == "__main__":
    asyncio.run(main())
```

### RVC

RVC runs as a voice-conversion pass on top of a base TTS model (Kokoro in this example); its parameters are passed through `extra_body`:

```python
import asyncio

from openai import AsyncOpenAI
from openai.helpers import LocalAudioPlayer

# point the SDK at the local API server instead of api.openai.com
openai = AsyncOpenAI(api_key="sk-1234567890", base_url="http://localhost:7778/v1")

async def main() -> None:
    async with openai.audio.speech.with_streaming_response.create(
        model="hexgrad/Kokoro-82M",
        voice="af_heart",
        input="Today is a wonderful day to build something people love!",
        extra_body={
            "params": {
                "use_gpu": True,
                "rvc_params": {
                    "pitch_up_key": "0",
                    "index_path": "CaitArcane\\added_IVF65_Flat_nprobe_1_CaitArcane_v2",
                    "pitch_collection_method": "harvest",
                    "model_path": "CaitArcane\\CaitArcane",
                    "index_rate": 0.66,
                    "filter_radius": 3,
                    "resample_sr": 0,
                    "rms_mix_rate": 1,
                    "protect": 0.33,
                },
            },
        },
    ) as response:
        await LocalAudioPlayer().play(response)

if __name__ == "__main__":
    asyncio.run(main())
```

### Chatterbox

```python
from openai import OpenAI

client = OpenAI(api_key="sk-1234567890", base_url="http://localhost:7778/v1")

with client.audio.speech.with_streaming_response.create(
    model="chatterbox",
    # voice="random",
    voice="voices/irohazaka.wav",
    input="Today is a wonderful day to build something people love!",
    # speed=1.0, # not implemented
    extra_body={
        "params": {
            "exaggeration": 0.5,
            "cfg_weight": 0.5,
            "temperature": 0.8,
        },
    },
) as response:
    audio = response.read()
    with open("audio.mp3", "wb") as f:
        f.write(audio)
```
