Metadata-Version: 2.1
Name: sense-voice-streaming-asr
Version: 0.1.1
Summary: Real-time streaming automatic speech recognition (ASR) with support for Chinese, English, Cantonese, Japanese, and Korean languages using SenseVoiceSmall model.
Author-email: nikoseven <nikoseven@outlook.com>
Maintainer-email: nikoseven <nikoseven@outlook.com>
License: Apache-2.0
Project-URL: Homepage, https://github.com/nikoseven/sense-voice-streaming-asr
Project-URL: Repository, https://github.com/nikoseven/sense-voice-streaming-asr
Project-URL: Documentation, https://github.com/nikoseven/sense-voice-streaming-asr#readme
Project-URL: Bug Tracker, https://github.com/nikoseven/sense-voice-streaming-asr/issues
Keywords: speech-recognition,asr,streaming,real-time,sensevoice,multilingual,chinese,english,japanese,korean,cantonese,voice-processing,audio-processing,artificial-intelligence,machine-learning
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Software Development :: Libraries
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Operating System :: OS Independent
Classifier: Natural Language :: Chinese (Simplified)
Classifier: Natural Language :: English
Classifier: Natural Language :: Japanese
Classifier: Natural Language :: Korean
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy
Requires-Dist: kaldi-native-fbank
Requires-Dist: sounddevice
Requires-Dist: sentencepiece

# 🎙️ SenseVoice Streaming ASR

[![PyPI](https://img.shields.io/pypi/v/sense-voice-streaming-asr)](https://pypi.org/project/sense-voice-streaming-asr/)
[![Python](https://img.shields.io/pypi/pyversions/sense-voice-streaming-asr)](https://pypi.org/project/sense-voice-streaming-asr/)
[![License](https://img.shields.io/pypi/l/sense-voice-streaming-asr)](LICENSE)

A lightweight, real-time streaming speech recognition engine powered by SenseVoiceSmall.


## ✨ Features

- **Streaming ASR**: Lightweight streaming speech recognition that runs in real time on CPU.
- **Integrated VAD**: Built-in Voice Activity Detection (VAD) to detect speech segments.
- **Multilingual**: Supports `Chinese`, `English`, `Japanese`, `Korean`, `Cantonese` out of the box.
- **Self-contained**: Models are bundled inside the package; no external downloads are needed.
- **Pure Python + ONNX**: No heavy dependencies; runs on CPU.

## 🚀 Installation

```bash
pip install sense-voice-streaming-asr
```

> Requires Python ≥ 3.8.
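
To confirm that the install succeeded, one option is to query the installed distribution's version through the standard library. This is a minimal sketch: the helper name `installed_version` is illustrative and is not part of this package's API; only the distribution name `sense-voice-streaming-asr` comes from this project.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "sense-voice-streaming-asr") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for illustration only;
    it is not provided by sense-voice-streaming-asr.
    """
    try:
        # importlib.metadata is in the standard library from Python 3.8 onward.
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("sense-voice-streaming-asr is not installed in this environment")
    else:
        print(f"sense-voice-streaming-asr {version} is installed")
```

Running the script in the same environment where `pip install` was executed should print the installed version; a "not installed" message usually means a different interpreter or virtual environment is active.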



## 🛠️ Development

To install in development mode:

```bash
git clone https://github.com/nikoseven/sense-voice-streaming-asr.git
cd sense-voice-streaming-asr
git submodule update --init  # fetch the SenseVoice models submodule
pip install -e .
```


## 📄 License

Apache 2.0 License.

This project incorporates code from [SenseVoice](https://github.com/FunAudioLLM/SenseVoice), which is licensed under Apache 2.0. The project as a whole is therefore distributed under the Apache 2.0 license. See the [LICENSE](LICENSE) file for the full license text.

## 🙏 Acknowledgements

- [SenseVoice](https://github.com/FunAudioLLM/SenseVoice) by FunAudioLLM
- [SenseVoice ONNX Models](https://www.modelscope.cn/models/iic/SenseVoiceSmall)
- [FSMN VAD](https://www.modelscope.cn/iic/speech_fsmn_vad_zh-cn-16k-common-onnx)
- [Kaldi-native-fbank](https://github.com/csukuangfj/kaldi-native-fbank)
- ONNX Runtime
