Metadata-Version: 2.4
Name: voxedge
Version: 0.0.1a0
Summary: Pipecat for the edge — edge-native, local-first real-time voice conversation library
Author: Harvest Su
License: Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.24
Provides-Extra: rk
Requires-Dist: rknn-toolkit-lite2>=2.0; platform_machine == "aarch64" and extra == "rk"
Requires-Dist: rkvoice-stream>=0.1; platform_machine == "aarch64" and extra == "rk"
Provides-Extra: sherpa
Requires-Dist: sherpa-onnx>=1.10; extra == "sherpa"
Requires-Dist: soundfile>=0.12; extra == "sherpa"
Provides-Extra: jetson
Requires-Dist: onnxruntime>=1.16; extra == "jetson"
Requires-Dist: soundfile>=0.12; extra == "jetson"
Requires-Dist: piper-phonemize>=1.1.0; platform_machine == "aarch64" and extra == "jetson"
Requires-Dist: tokenizers>=0.15; extra == "jetson"
Requires-Dist: webrtcvad>=2.0.10; extra == "jetson"
Provides-Extra: translator
Requires-Dist: ctranslate2>=4.0; extra == "translator"
Requires-Dist: sentencepiece>=0.1.99; extra == "translator"
Provides-Extra: text
Provides-Extra: artifacts
Requires-Dist: huggingface_hub>=0.20; extra == "artifacts"

# voxedge

> Edge-native, local-first real-time voice conversation library — "Pipecat for the edge".

**Status: Phase 1a — pure-Python foundation (additive scaffolding only).**

voxedge is an edge-native library for low-latency, local-first real-time voice
conversation. It was originally extracted as the open-core foundation of a
production edge voice stack.

## What's here (Phase 1a)

- `voxedge/backends/base.py` — clean backend ABCs (`ASRBackend`/`ASRStream`,
  `TTSBackend`, `VADBackend`/`VADSession`, `LLMBackend`/`LLMEvent`) with **no
  env / profile coupling** — constructors take explicit params only.
- `voxedge/transport/base.py` — `Transport` ABC + `InProcessTransport`
  (zero-IPC asyncio queues, the default) + `WebSocketTransport` (duck-typed
  ws adapter, no FastAPI dependency).
- `voxedge/engine/conversation.py` — `ConversationEngine` + `Session`, the
  VAD-segmentation / barge-in / multi-turn / sentence-buffer / ASR→(LLM)→TTS
  orchestration loop ported from the original stack's `app/main.py`.
- `voxedge/backends/mock.py` — Mock backends so the whole engine runs
  end-to-end on a laptop with no CUDA.
- `voxedge/tests/` — proves the architecture runs end-to-end on Mac.
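
To make the "zero-IPC asyncio queues" idea concrete, here is a minimal, self-contained sketch of an in-process transport. It is not voxedge's actual `Transport` ABC or `InProcessTransport`: the class names `SketchTransport` and `QueueTransport`, the `send` / `receive` / `client_send` / `client_receive` methods, and the `echo_engine` helper are all assumptions made up for illustration.

```python
import asyncio
from abc import ABC, abstractmethod


class SketchTransport(ABC):
    """Hypothetical transport interface (NOT voxedge's real ABC)."""

    @abstractmethod
    async def send(self, message: bytes) -> None:
        """Deliver a message from the engine to the client."""

    @abstractmethod
    async def receive(self) -> bytes:
        """Wait for the next message from the client."""


class QueueTransport(SketchTransport):
    """In-process transport: each direction is a plain asyncio.Queue."""

    def __init__(self, maxsize: int = 0) -> None:
        # maxsize=0 means unbounded; a bound would add backpressure.
        self._to_engine = asyncio.Queue(maxsize)
        self._to_client = asyncio.Queue(maxsize)

    # Engine-facing side.
    async def send(self, message: bytes) -> None:
        await self._to_client.put(message)

    async def receive(self) -> bytes:
        return await self._to_engine.get()

    # Client-facing side (hypothetical helper methods).
    async def client_send(self, message: bytes) -> None:
        await self._to_engine.put(message)

    async def client_receive(self) -> bytes:
        return await self._to_client.get()


async def echo_engine(transport: SketchTransport, turns: int) -> None:
    """Stand-in for an engine loop: echo each incoming frame back."""
    for _ in range(turns):
        frame = await transport.receive()
        await transport.send(b"echo:" + frame)


async def demo() -> list[bytes]:
    transport = QueueTransport()
    engine_task = asyncio.create_task(echo_engine(transport, turns=2))
    replies = []
    for frame in (b"hello", b"world"):
        await transport.client_send(frame)
        replies.append(await transport.client_receive())
    await engine_task
    return replies


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because both sides share one event loop, no sockets or serialization are involved, which is presumably why an in-process transport makes a sensible default for local tests.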

## Design constraints

- **Pure Python.** No CUDA / torch / tensorrt in the core. Heavy adapters live
  behind optional extras such as `voxedge[rk]`, `voxedge[jetson]`, and
  `voxedge[sherpa]`; these are placeholders for now.
- **No env reads in the library.** All config is injected as explicit params.
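
One way to picture the "no env reads" rule is to keep all environment access in the application and hand the library plain values. The sketch below is illustrative only: `ASRConfig`, `ExplicitASR`, `config_from_env`, and the `APP_ASR_*` variable names are hypothetical and do not come from voxedge.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class ASRConfig:
    """Hypothetical config object; field names are illustrative only."""

    model_path: str
    sample_rate: int = 16000


class ExplicitASR:
    """Hypothetical backend: everything it needs arrives via the constructor."""

    def __init__(self, config: ASRConfig) -> None:
        # No os.environ lookups here; the caller decides every value.
        self.config = config


def config_from_env(env: Mapping[str, str]) -> ASRConfig:
    """Application-side helper: the only code that interprets env variables."""
    return ASRConfig(
        model_path=env["APP_ASR_MODEL"],
        sample_rate=int(env.get("APP_ASR_SAMPLE_RATE", "16000")),
    )


# A real application would pass os.environ; a dict keeps the sketch testable.
asr = ExplicitASR(config_from_env({"APP_ASR_MODEL": "/models/asr.onnx"}))
```

Keeping the environment lookup outside the backend means tests can build a backend from literal values without patching `os.environ`.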

## Quickstart (Phase 1a, mock backends)

```python
import asyncio

from voxedge.engine import ConversationEngine
from voxedge.transport import InProcessTransport
from voxedge.backends.mock import MockASR, MockTTS, MockVAD

# Mock backends stand in for real ASR / TTS / VAD, so this runs without CUDA.
engine = ConversationEngine(
    backends={"asr": MockASR(), "tts": MockTTS(), "vad": MockVAD()},
    multi_utterance=True,
)
# InProcessTransport is the default transport (zero-IPC asyncio queues).
transport = InProcessTransport()
asyncio.run(engine.run(transport))
```
