Metadata-Version: 2.4
Name: omnivoice-server
Version: 0.2.2
Summary: OpenAI-compatible HTTP server for OmniVoice TTS
Project-URL: Homepage, https://github.com/maemreyo/omnivoice-server
Project-URL: Documentation, https://github.com/maemreyo/omnivoice-server#readme
Project-URL: Repository, https://github.com/maemreyo/omnivoice-server
Project-URL: Issues, https://github.com/maemreyo/omnivoice-server/issues
Project-URL: Changelog, https://github.com/maemreyo/omnivoice-server/blob/main/CHANGELOG.md
Author-email: zamery <matthew.ngo1114@gmail.com>
License: MIT
License-File: LICENSE
Keywords: fastapi,omnivoice,openai,text-to-speech,tts,voice-cloning
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Requires-Python: >=3.10
Requires-Dist: fastapi>=0.115.0
Requires-Dist: huggingface-hub>=0.26.0
Requires-Dist: omnivoice<0.2.0,>=0.1.0
Requires-Dist: platformdirs>=3.0.0
Requires-Dist: psutil>=6.0.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: python-multipart>=0.0.12
Requires-Dist: soundfile>=0.12.0
Requires-Dist: uvicorn[standard]>=0.32.0
Provides-Extra: benchmark
Requires-Dist: psutil; extra == 'benchmark'
Requires-Dist: tqdm; extra == 'benchmark'
Provides-Extra: dev
Requires-Dist: httpx>=0.27.0; extra == 'dev'
Requires-Dist: mypy>=1.11.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24.0; extra == 'dev'
Requires-Dist: pytest-cov>=5.0.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.6.0; extra == 'dev'
Requires-Dist: torchcodec>=0.11; (sys_platform != 'win32') and extra == 'dev'
Provides-Extra: formats
Requires-Dist: pydub>=0.25.0; extra == 'formats'
Description-Content-Type: text/markdown

# omnivoice-server

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![CI](https://github.com/maemreyo/omnivoice-server/actions/workflows/ci.yml/badge.svg)](https://github.com/maemreyo/omnivoice-server/actions/workflows/ci.yml)
[![PyPI version](https://img.shields.io/pypi/v/omnivoice-server.svg)](https://pypi.org/project/omnivoice-server/)

OpenAI-compatible HTTP server for [OmniVoice](https://github.com/k2-fsa/OmniVoice) text-to-speech.

**Author:** zamery ([@maemreyo](https://github.com/maemreyo)) | **Email:** matthew.ngo1114@gmail.com

> **Early Development Notice**
>
> This is a new repository built on top of OmniVoice (released 2026). Both the upstream model and this server wrapper are under active development, so expect API changes and breaking updates, along with performance improvements as PyTorch MPS support matures.
>
> **Current Status**: Functional on CPU and CUDA. MPS (Apple Silicon) is currently broken; use CPU instead.

## Quick Links

| Category | Sections |
|----------|----------|
| **Getting Started** | [Features](docs/readme/sections/01-features.md) - [Quick Start](docs/readme/sections/02-quick-start.md) - [Verification Status](docs/readme/sections/03-verification-status.md) |
| **Usage** | [API Usage](docs/readme/sections/04-api-usage.md) - [CLI Usage](docs/readme/sections/05-cli-usage.md) - [Configuration](docs/readme/sections/06-configuration.md) |
| **Reference** | [API Reference](docs/readme/sections/07-api-reference.md) - [Advanced Features](docs/readme/sections/08-advanced-features.md) - [Examples](docs/readme/sections/09-examples.md) |
| **Deployment** | [Docker Deployment](docs/readme/sections/10-docker-deployment.md) - [Hardware Requirements](docs/readme/sections/12-hardware-requirements.md) - [Performance](docs/readme/sections/13-performance.md) |
| **Development** | [Development](docs/readme/sections/11-development.md) - [Troubleshooting](docs/readme/sections/14-troubleshooting.md) - [Known Limitations](docs/readme/sections/15-known-limitations.md) |
| **Project** | [Documentation Index](docs/readme/sections/16-documentation-index.md) - [License](docs/readme/sections/17-license.md) - [Contributing](docs/readme/sections/18-contributing.md) - [Acknowledgments](docs/readme/sections/19-acknowledgments.md) - [Support](docs/readme/sections/20-support.md) |

## Quick Start

**Prerequisites**: PyTorch must be installed first. See [Quick Start](docs/readme/sections/02-quick-start.md) for details.

```bash
# Install
pip install omnivoice-server

# Start server
omnivoice-server

# Test with curl
curl -X POST http://127.0.0.1:8880/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "omnivoice", "input": "Hello world!"}' \
  --output speech.wav
```
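The same request can be made from Python. The following is a minimal sketch using only the standard library; it assumes the host, port, endpoint path, and JSON fields shown in the curl example above, and the helper names `build_speech_request` and `synthesize_to_file` are hypothetical, not part of this package.

```python
import json
import urllib.request

# Assumed default address, taken from the curl example above.
BASE_URL = "http://127.0.0.1:8880"


def build_speech_request(
    text: str,
    model: str = "omnivoice",
    base_url: str = BASE_URL,
) -> urllib.request.Request:
    """Build the POST request for /v1/audio/speech without sending it."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, path: str = "speech.wav") -> None:
    """Send the request to a running server and save the response body."""
    with urllib.request.urlopen(build_speech_request(text)) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
```

With the server running, calling `synthesize_to_file("Hello world!")` should write the response body to `speech.wav`, mirroring the curl command. Building the request separately from sending it makes the payload easy to inspect without a live server.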

## Overview

**omnivoice-server** wraps the OmniVoice TTS model with an OpenAI-compatible HTTP API:

- **Voice Design**: Control gender, age, pitch, accent, dialect
- **Voice Cloning**: Clone from reference audio
- **Streaming**: Real-time audio streaming with chunked transfer
- **Voice Profiles**: Persistent storage for cloned voices
- **OpenAI-Compatible**: Drop-in replacement for OpenAI TTS endpoints

See [Features](docs/readme/sections/01-features.md) for the complete capability list.

## Verification Status

- **System**: Working on CPU and CUDA
- **MPS**: Broken on Apple Silicon (use CPU instead)
- **Performance**: Real-time factor (RTF) ~4.92 on CPU, ~0.2 on GPU (lower is faster; below 1.0 is faster than real time)

See [Verification Status](docs/readme/sections/03-verification-status.md) for benchmarks and audio samples.
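
To make the RTF figures concrete, the sketch below converts them into approximate wall-clock time. It assumes RTF means synthesis time divided by audio duration, which is the common convention and matches the GPU figure being the smaller one; the function name is hypothetical and the numbers are the approximate values quoted above.

```python
def estimated_synthesis_seconds(audio_seconds: float, rtf: float) -> float:
    """Estimate synthesis time, assuming RTF = synthesis time / audio duration."""
    if audio_seconds < 0 or rtf <= 0:
        raise ValueError("audio_seconds must be >= 0 and rtf must be > 0")
    return audio_seconds * rtf


# Approximate figures quoted in this README.
CPU_RTF = 4.92
GPU_RTF = 0.2

for label, rtf in (("CPU", CPU_RTF), ("GPU", GPU_RTF)):
    seconds = estimated_synthesis_seconds(10.0, rtf)
    print(f"{label}: ~{seconds:.1f} s to synthesize 10 s of audio")
```

Under that assumption, 10 seconds of speech takes roughly 49 seconds on CPU and about 2 seconds on GPU. Actual timings depend on hardware and input.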

## Documentation

This README provides quick links to detailed documentation. For complete information, see:

- Individual section files in `docs/readme/sections/`
- Technical docs in `docs/verification/`, `docs/system/`, `docs/architecture/`

## License

MIT - See [License](docs/readme/sections/17-license.md)

## Support

- [GitHub Issues](https://github.com/maemreyo/omnivoice-server/issues)
- [GitHub Discussions](https://github.com/maemreyo/omnivoice-server/discussions)
