Metadata-Version: 2.4
Name: sarvam-transcribe-streaming-sdk
Version: 0.1.0
Summary: Python SDK for Sarvam Streaming Speech-to-Text
Project-URL: Homepage, https://www.sarvam.ai
Project-URL: Documentation, https://docs.sarvam.ai
Author-email: Sarvam AI <support@sarvam.ai>
License: Apache-2.0
License-File: LICENSE
Keywords: asr,indian-languages,sarvam,speech-to-text,streaming,transcription
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Requires-Python: >=3.9
Requires-Dist: websockets>=12.0
Description-Content-Type: text/markdown

# Sarvam Streaming Speech-to-Text — Python SDK

Real-time speech-to-text for Indian languages. Stream audio in and receive
partial transcripts as speech is recognized, followed by final, sentence-level
segments once each sentence is complete.

## Installation

```bash
pip install sarvam-transcribe-streaming-sdk
```

Requires Python 3.9+.

## Quick Start

```python
import asyncio
from sarvam_transcribe import SarvamStreamingClient
from sarvam_transcribe.handlers import TranscriptResultStreamHandler
from sarvam_transcribe.model import TranscriptEvent

SARVAM_STT_URL = "wss://stt.example.com"
SARVAM_API_KEY = "your-api-key"
AUDIO_FILE = "recording.pcm"    # raw PCM: 16 kHz, mono, 16-bit
CHUNK_SIZE = 16000               # bytes; 500 ms of audio at 32,000 bytes/s


class MyHandler(TranscriptResultStreamHandler):
    async def handle_transcript_event(self, event: TranscriptEvent):
        for result in event.transcript.results:
            for alt in result.alternatives:
                prefix = "..." if result.is_partial else ">>>"
                print(f"{prefix} {alt.transcript}")


async def main():
    client = SarvamStreamingClient(
        url=SARVAM_STT_URL,
        api_key=SARVAM_API_KEY,
    )
    # Open a streaming session for Hindi audio.
    stream = await client.start_stream_transcription(
        language_code="hi-IN",
    )

    async def send_audio():
        # Send the file chunk by chunk, sleeping between chunks so the
        # audio arrives at roughly real-time speed.
        with open(AUDIO_FILE, "rb") as f:
            while chunk := f.read(CHUNK_SIZE):
                await stream.input_stream.send_audio_event(audio_chunk=chunk)
                await asyncio.sleep(CHUNK_SIZE / (16000 * 2))  # real-time pacing
        # Signal that no more audio will follow.
        await stream.input_stream.end_stream()

    # Send audio and handle transcript events concurrently.
    handler = MyHandler(stream.output_stream)
    await asyncio.gather(send_audio(), handler.handle_events())


asyncio.run(main())
```

Output:
```
... प्रधानमंत्री ने
... प्रधानमंत्री ने आज कहा कि
>>> प्रधानमंत्री ने आज कहा कि देश आगे बढ़ रहा है।
... हमें
... हमें मिलकर काम
>>> हमें मिलकर काम करना होगा।
```

Lines prefixed with `...` are **partial results** (the transcript so far, which
may still change). Lines prefixed with `>>>` are **final results** (complete,
endpointed sentences ready for downstream use).
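
If only the final sentences matter downstream, the partials can be filtered out. The sketch below is a rough illustration rather than part of the SDK: `Alternative` and `Result` are hypothetical stand-ins that mirror only the attributes used in the Quick Start (`is_partial`, `alternatives`, `transcript`), and it assumes the first alternative is the preferred one.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-ins for the SDK's result objects; only the attributes
# used in the Quick Start are modeled here.
@dataclass
class Alternative:
    transcript: str


@dataclass
class Result:
    is_partial: bool
    alternatives: List[Alternative] = field(default_factory=list)


def final_transcripts(results) -> List[str]:
    """Return the first alternative of every final (non-partial) result."""
    finals = []
    for result in results:
        if result.is_partial or not result.alternatives:
            continue  # skip partial results and results with no text
        # Assumption: the first alternative is the preferred one.
        finals.append(result.alternatives[0].transcript)
    return finals


results = [
    Result(True, [Alternative("हमें")]),
    Result(True, [Alternative("हमें मिलकर काम")]),
    Result(False, [Alternative("हमें मिलकर काम करना होगा।")]),
]
print(final_transcripts(results))
```

Inside a handler, the same filter could be applied to `event.transcript.results` and the returned strings appended to a list for later use.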

## Supported Languages

| Language | Code |
|----------|------|
| Hindi | `hi-IN` |
| Bengali | `bn-IN` |
| Kannada | `kn-IN` |
| Malayalam | `ml-IN` |
| Marathi | `mr-IN` |
| Odia | `od-IN` |
| Punjabi | `pa-IN` |
| Tamil | `ta-IN` |
| Telugu | `te-IN` |
| English (Indian) | `en-IN` |
| Gujarati | `gu-IN` |
| Auto-detect | `unknown` |

## Audio Requirements

| Property | Value |
|----------|-------|
| Format | Raw PCM (no WAV header) |
| Sample rate | 16,000 Hz |
| Bit depth | 16-bit signed, little-endian |
| Channels | Mono (1 channel) |
| Recommended chunk size | 100–500 ms (3,200–16,000 bytes) |
| Max chunk size | 32 KB (about 1 second of audio) |
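
The byte counts in the table follow from the format: 16,000 samples per second × 2 bytes per sample × 1 channel = 32,000 bytes per second. The sketch below shows that arithmetic and one possible way to strip a WAV header using Python's standard `wave` module. The helper names `chunk_bytes` and `wav_to_pcm` are illustrative and not part of the SDK.

```python
import io
import wave

SAMPLE_RATE = 16_000  # Hz
SAMPLE_WIDTH = 2      # bytes per sample (16-bit)
CHANNELS = 1          # mono


def chunk_bytes(duration_ms: int) -> int:
    """Number of bytes holding `duration_ms` of audio in the required format."""
    return SAMPLE_RATE * SAMPLE_WIDTH * CHANNELS * duration_ms // 1000


def wav_to_pcm(wav_file) -> bytes:
    """Return the raw PCM frames of a WAV file (path or file-like object).

    Raises ValueError if the file is not 16 kHz, 16-bit, mono.
    """
    with wave.open(wav_file, "rb") as wav:
        actual = (wav.getframerate(), wav.getsampwidth(), wav.getnchannels())
        if actual != (SAMPLE_RATE, SAMPLE_WIDTH, CHANNELS):
            raise ValueError(f"unsupported WAV format: {actual}")
        return wav.readframes(wav.getnframes())


# Demo: build one second of silence as an in-memory WAV, then strip the header.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(CHANNELS)
    w.setsampwidth(SAMPLE_WIDTH)
    w.setframerate(SAMPLE_RATE)
    w.writeframes(b"\x00\x00" * SAMPLE_RATE)
buf.seek(0)

pcm = wav_to_pcm(buf)
print(chunk_bytes(100), chunk_bytes(500), len(pcm))
```

Audio recorded at a different sample rate or with more than one channel would need resampling or downmixing first, which this sketch does not attempt.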

## Documentation

For the full API reference, examples (microphone streaming, file transcription,
collecting results), error handling, and a migration guide from the AWS
Transcribe SDK, contact us at support@sarvam.ai.

## License

Apache-2.0
