Metadata-Version: 2.4
Name: dolphinvoice
Version: 1.0.0
Summary: SDK for dolphinvoice
Author-email: dolphinvoice <development@dolphin-ai.jp>
License: Apache License 2.0
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: websocket-client>=1.4.0
Requires-Dist: requests
Requires-Dist: six
Requires-Dist: cryptography
Dynamic: license-file

# DolphinVoice Python SDK

The DolphinVoice SDK provides speech recognition and speech synthesis through three main modules:
- Real-time Speech Recognition (ASR)
- Audio File Transcription (FileAsr)
- Text to Speech (TTS)

## Table of Contents

* [Documentation](#documentation)
* [Installation](#installation)
* [Usage](#usage)
  * [Real-time Speech Recognition](#real-time-speech-recognition)
  * [Audio File Transcription](#audio-file-transcription)
  * [Text to Speech](#text-to-speech)
* [API Reference](#api-reference)
  * [Real-time Speech Recognition](#realtimeasrsdk)
  * [Audio File Transcription](#fileasr)
  * [Text to Speech](#shortttsdk)
* [Keywords](#keywords)
* [License](#license)

## Documentation

Find more detailed documentation and guides about the DolphinVoice SDK in the following resources:

- [DolphinVoice API Documentation](https://developers.dolphinvoice.ai/en/docs/api/intro/intro)
- [Developer Center](https://console.dolphinvoice.ai/login)

For technical support or any questions, please contact our developer support team: voice.contact@dolphin-ai.jp

## Installation

Install the DolphinVoice Python SDK from [PyPI](https://pypi.org/project/dolphinvoice/) using pip:

```bash
pip install dolphinvoice
```

## Usage

### Real-time Speech Recognition

```python
from dolphinvoice.speech_rec.callbacks import SpeechTranscriberCallback
from dolphinvoice import speech_rec
import time

class Callback(SpeechTranscriberCallback):
    def started(self, message):
        print('TranscriptionStarted: %s' % message)

    def result_changed(self, message):
        print('TranscriptionResultChanged: %s' % message)

    def sentence_begin(self, message):
        print('SentenceBegin: %s' % message)

    def sentence_end(self, message):
        print('SentenceEnd: %s' % message)

    def completed(self, message):
        print('TranscriptionCompleted: %s' % message)

    def task_failed(self, message):
        print('TaskFailed: %s' % message)

    def warning_info(self, message):
        print('Warning: %s' % message)

    def channel_closed(self):
        print('TranscriptionChannelClosed')

audio_path = 'demo.mp3'
client = speech_rec.SpeechClient(app_id='YOUR_APP_ID', app_secret='YOUR_APP_SECRET')

with client.create_transcriber(Callback()) as transcriber:
    transcriber.set_parameter({
        "lang_type": "en-US",
        "format": "mp3",
        "sample_rate": 16000,
    })
    transcriber.start()
    with open(audio_path, 'rb') as f:
        audio = f.read(7680)
        while audio:
            transcriber.send(audio)
            time.sleep(0.24)
            audio = f.read(7680)
    transcriber.stop()

```
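The chunk size and sleep interval in the example above are related. Assuming uncompressed 16-bit mono PCM at 16 kHz, 0.24 seconds of audio is exactly 7680 bytes, so sending one chunk per 0.24 seconds approximates real-time pacing. For compressed formats such as mp3, a fixed byte count does not correspond to a fixed duration, so the pacing is only approximate. The sketch below shows the arithmetic; `pcm_chunk_size` and `iter_chunks` are hypothetical helper names, not part of the SDK.

```python
import io


def pcm_chunk_size(sample_rate, seconds, sample_width=2, channels=1):
    """Bytes needed to hold `seconds` of uncompressed PCM audio."""
    return int(round(sample_rate * sample_width * channels * seconds))


def iter_chunks(stream, chunk_size):
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# 16 kHz * 2 bytes per sample * 1 channel * 0.24 s
chunk_size = pcm_chunk_size(16000, 0.24)
print(chunk_size)  # 7680

# Chunk an in-memory stream of 20000 zero bytes (stand-in for an audio file)
chunks = list(iter_chunks(io.BytesIO(bytes(20000)), chunk_size))
print([len(c) for c in chunks])  # [7680, 7680, 4640]
```

Each chunk yielded by `iter_chunks` could then be passed to `transcriber.send()`, followed by a sleep of the matching duration.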

### Audio File Transcription

```python
from dolphinvoice import speech_rec

client = speech_rec.SpeechClient(app_id='YOUR_APP_ID', app_secret='YOUR_APP_SECRET')

asrfile = client.create_asrfile()

audio = 'demo.mp3'
data = {
    "lang_type": "en-US",
    "format": "mp3",
    "sample_rate": 16000
}
result = asrfile.transcribe_file(audio, data)
print(result)
```
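When transcribing many files, the parameter dictionary can be built from the file name rather than written by hand. The following is a minimal sketch, assuming the `format` value matches the file extension (as it does for `mp3` in the example above). `build_params` is a hypothetical helper, not part of the SDK, and the set of formats the service accepts should be checked against the API documentation.

```python
import os


def build_params(audio_path, lang_type="en-US", sample_rate=16000):
    """Build a parameter dict, taking `format` from the file extension."""
    # Assumption: the service's format name equals the lowercase extension
    ext = os.path.splitext(audio_path)[1].lstrip(".").lower()
    if not ext:
        raise ValueError("cannot infer audio format from %r" % audio_path)
    return {"lang_type": lang_type, "format": ext, "sample_rate": sample_rate}


print(build_params("demo.mp3"))
# {'lang_type': 'en-US', 'format': 'mp3', 'sample_rate': 16000}
```

The returned dictionary has the same keys as `data` in the example above and could be passed as the second argument to `transcribe_file`.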

### Text to Speech

```python
from dolphinvoice.speech_syn.callbacks import SpeechSynthesizerCallback
from dolphinvoice import speech_syn

class MyCallback(SpeechSynthesizerCallback):
    def __init__(self, name):
        self._name = name
        self._fout = open(name, 'wb')

    def binary_data_received(self, raw):
        self._fout.write(raw)

    def on_message(self, message):
        print('Received : %s' % message)

    def started(self, message):
        print('MyCallback.OnSynthesizerStarted: %s' % message)

    def get_Timestamp(self, message):
        print('MyCallback.OnSynthesizerGetTimestamp: %s' % message)

    def get_Duration(self, message):
        print('MyCallback.OnSynthesizerGetDuration: %s' % message)

    def completed(self, message):
        print('MyCallback.OnSynthesizerCompleted: %s' % message)
        self._fout.close()

    def channel_closed(self):
        print('MyCallback.OnSynthesizerChannelClosed')

audio_name = 'syAudio.mp3'
client = speech_syn.SpeechClient(app_id='YOUR_APP_ID', app_secret='YOUR_APP_SECRET')
callback = MyCallback(audio_name)

with client.create_synthesizer(callback) as synthesizer:
    synthesizer.set_parameter({
        "text": "The weather is nice, let's go for a walk.",
        "lang_type": "en-US",
        "format": "mp3"
    })
    synthesizer.start()
    synthesizer.wait_completed()
```
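The callback in the example above opens the output file in `__init__` and closes it only in `completed`, so the handle may stay open if synthesis never completes. One alternative is to collect the audio in memory and write it out afterwards. The sketch below illustrates the idea with a plain stand-in class that uses the same method names; `BufferingCallback` and `audio_bytes` are hypothetical names, and in real use the class would presumably subclass `SpeechSynthesizerCallback`.

```python
import io


class BufferingCallback:
    """Stand-in with the same method names as the callback in the example.

    Assumption: in real use this would subclass SpeechSynthesizerCallback.
    """

    def __init__(self):
        self._buffer = io.BytesIO()
        self.done = False

    def binary_data_received(self, raw):
        # Append each audio fragment as it arrives
        self._buffer.write(raw)

    def completed(self, message):
        self.done = True

    def audio_bytes(self):
        """Return everything received so far as one bytes object."""
        return self._buffer.getvalue()


# Simulate the SDK delivering two audio fragments and a completion event
cb = BufferingCallback()
cb.binary_data_received(b"abc")
cb.binary_data_received(b"def")
cb.completed("done")
print(cb.audio_bytes())  # b'abcdef'
```

Once `done` is set, the buffered bytes can be written to disk in a single `with open(...)` block, which keeps the file handle's lifetime short.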

## API Reference

<a id="realtimeasrsdk"></a>

### Real-time Speech Recognition

The real-time speech recognition module is for processing real-time audio streams.

#### Methods
- `create_transcriber(callback: SpeechTranscriberCallback)` - Creates a transcriber and registers the callback that handles recognition events
- `set_parameter(params: Json)` - Specifies parameters
  - For the complete API documentation, refer to [DolphinVoice API Documentation](https://developers.dolphinvoice.ai/en/docs/api/asr/transcriber)
- `start()` - Starts a new recognition session
- `send(stream: Bytes)` - Sends audio stream to the recognition service
- `stop()` - Stops the current recognition session and releases resources


#### Events
- `TranscriptionStarted` - Triggered when the recognition session starts
- `SentenceBegin` - Triggered when a new sentence is detected
- `TranscriptionResultChanged` - Triggered when intermediate results are updated
- `SentenceEnd` - Triggered when a sentence is completed
- `TranscriptionCompleted` - Triggered when the entire recognition session is completed
- `Warning` - Triggered when a non-fatal warning occurs
- `TaskFailed` - Triggered when the recognition task fails

<a id="fileasr"></a>

### Audio File Transcription

The audio file transcription module is for processing pre-recorded audio files.

#### Methods
- `transcribe_file(audio: String, params: Json)` - Uploads and transcribes the audio file
  - For the complete API documentation, refer to [DolphinVoice API Documentation](https://developers.dolphinvoice.ai/en/docs/api/asr/file/file)

<a id="shortttsdk"></a>

### Text to Speech

The text-to-speech synthesis module is used to convert text into natural speech.

#### Methods
- `create_synthesizer(callback: SpeechSynthesizerCallback)` - Creates a synthesizer and registers the callback that handles synthesis events
- `set_parameter(params: Json)` - Specifies parameters
  - For the complete API documentation, refer to [DolphinVoice API Documentation](https://developers.dolphinvoice.ai/en/docs/api/tts/tts)
- `start()` - Starts a new synthesis session

#### Events
- `OnSynthesizerStarted` - Triggered when the synthesis process starts
- `OnSynthesizerGetDuration` - Provides the total duration of the synthesized audio
- `OnSynthesizerGetTimestamp` - Provides timestamp information for the synthesized text
- `OnSynthesizerCompleted` - Triggered when the synthesis process is completed

## Keywords

DolphinVoice DolphinAI ASR TTS Text-to-Speech Speech-to-Text Speech-Recognition Speech-Synthesis

## License

Apache License 2.0
