Metadata-Version: 2.4
Name: humain-voice
Version: 0.16.0a3
Summary: Python SDK for SauTech Services
Author-email: HUMAIN <info@humain.com>, Yousseif Elshahawy <yelshahwy@humain.com>
License-Expression: Apache-2.0
Classifier: Programming Language :: Python :: 3.10
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: aiohttp>=3.12.15
Requires-Dist: anyio>=4.0.0
Requires-Dist: loguru>=0.7.3
Requires-Dist: pydantic>=2.11.7
Requires-Dist: python-socketio[client]>=5.13.0
Requires-Dist: tenacity>=8.0.0
Requires-Dist: websockets>=15.0.1
Provides-Extra: test
Requires-Dist: pytest>=8.0.0; extra == "test"
Requires-Dist: pytest-asyncio>=0.23.0; extra == "test"

## Sautech Python SDK

The official Python SDK for SauTech ASR Services. It supports two modes:

- Fast Transcription (file-based, async and sync)
- Realtime (streaming audio over websockets)

### Installation

```bash
pip install humain-voice
```

Requires Python >= 3.9.

### Configuration

Set the following environment variables provided by SauTech:

- `API_URL`: Base URL of the ASR service (e.g. `https://api.example.com`)
- `API_PATH`: API path for the specific service
- `API_KEY`: Your API key

You can export these in your shell:

```bash
export API_URL="https://api.example.com"
export API_PATH="/realtime/socket.io"
export API_KEY="<your_api_key>"
```

### Fast Transcription (sync)

#### Context manager

```python
import os
from pathlib import Path
from sautech import stt

audio_path = Path("samples/king_fahad_test.wav")

with stt.FastTranscriptionClient(
    api_url=os.getenv("API_URL"),
    api_key=os.getenv("API_KEY"),
    api_path=os.getenv("API_PATH"),
) as client:
    with audio_path.open("rb") as f:
        result = client.transcribe_sync(
            f,  # file-like object or bytes via f.read()
            stt.Language.ArEn,
            stt.ASRModel.BayanArEnV1,
        )
        print(result)
```

#### Manual (no context manager)

```python
import os
from pathlib import Path
from sautech import stt

audio_path = Path("samples/king_fahad_test.wav")

client = stt.FastTranscriptionClient(
    api_url=os.getenv("API_URL"),
    api_key=os.getenv("API_KEY"),
    api_path=os.getenv("API_PATH"),
)

with audio_path.open("rb") as f:
    result1 = client.transcribe_sync(
        f,
        stt.Language.ArEn,
        stt.ASRModel.BayanArEnV1,
    )
    print(result1)

with audio_path.open("rb") as f:
    result2 = client.transcribe_sync(
        f.read(),
        stt.Language.ArEn,
        stt.ASRModel.BayanArEnV1,
    )
    print(result2)

client.close_sync()
```

### Fast Transcription (async)

#### Context manager

```python
import os
import asyncio
from pathlib import Path
from sautech import stt

async def main():
    audio_bytes = Path("samples/king_fahad_test.wav").read_bytes()
    async with stt.FastTranscriptionClient(
        api_url=os.getenv("API_URL"),
        api_key=os.getenv("API_KEY"),
        api_path=os.getenv("API_PATH"),
    ) as client:
        result = await client.transcribe(
            audio_bytes,
            stt.Language.ArEn,
            stt.ASRModel.BayanArEnV1,
        )
        print(result)

asyncio.run(main())
```

#### Manual (no context manager)

```python
import os
import asyncio
from pathlib import Path
from sautech import stt

async def main():
    audio_bytes = Path("samples/king_fahad_test.wav").read_bytes()

    client = stt.FastTranscriptionClient(
        api_url=os.getenv("API_URL"),
        api_key=os.getenv("API_KEY"),
        api_path=os.getenv("API_PATH"),
    )

    result1 = await client.transcribe(
        audio_bytes,
        stt.Language.ArEn,
        stt.ASRModel.BayanArEnV1,
    )
    print(result1)

    result2 = await client.transcribe(
        audio_bytes,
        stt.Language.ArEn,
        stt.ASRModel.BayanArEnV1,
    )
    print(result2)

    await client.close()

asyncio.run(main())
```

You can also pass `on_response`, `on_file_upload`, and `on_error` callbacks to receive intermediate updates and handle lifecycle events during processing.

### Realtime Streaming

```python
import asyncio
import os
import wave
from sautech import stt

async def run():
    client = stt.RealtimeClient(
        api_url=os.getenv("API_URL"),
        api_key=os.getenv("API_KEY"),
        api_path=os.getenv("API_PATH"),
    )

    stream = await client.start_stream(
        language=stt.Language.ArEn,
        on_connect=lambda: print("connected"),
        on_disconnect=lambda: print("disconnected"),
        on_response=lambda t: print("response:", t),
        on_error=lambda e: print("error:", e),
    )

    with wave.open("samples/king_fahad_test.wav", "rb") as f:
        audio_bytes = f.readframes(f.getnframes())

    chunk_duration_s = 0.1
    sample_rate = 16000
    bytes_per_sample = 2  # 16-bit PCM
    chunk_size = int(sample_rate * bytes_per_sample * chunk_duration_s)

    for start in range(0, len(audio_bytes), chunk_size):
        end = min(start + chunk_size, len(audio_bytes))
        await stream.send(audio_bytes[start:end])
        await asyncio.sleep(chunk_duration_s)

    await stream.close(timeout=1)

asyncio.run(run())
```

#### Parallel streams (single client)

```python
import asyncio
import os
import wave
from sautech import stt

async def stream_audio(stream, audio_bytes: bytes):
    chunk_duration_s = 0.1
    sample_rate = 16000
    bytes_per_sample = 2
    chunk_size = int(sample_rate * bytes_per_sample * chunk_duration_s)

    for start in range(0, len(audio_bytes), chunk_size):
        end = min(start + chunk_size, len(audio_bytes))
        await stream.send(audio_bytes[start:end])
        await asyncio.sleep(chunk_duration_s)

    await stream.close(timeout=1)

async def run():
    client = stt.RealtimeClient(
        api_url=os.getenv("API_URL"),
        api_key=os.getenv("API_KEY"),
        api_path=os.getenv("API_PATH"),
    )

    stream_a = await client.start_stream(
        language=stt.Language.ArEn,
        on_response=lambda t: print("stream A:", t),
    )
    stream_b = await client.start_stream(
        language=stt.Language.ArEn,
        on_response=lambda t: print("stream B:", t),
    )

    with wave.open("samples/king_fahad_test.wav", "rb") as f:
        audio_bytes = f.readframes(f.getnframes())

    await asyncio.gather(
        stream_audio(stream_a, audio_bytes),
        stream_audio(stream_b, audio_bytes),
    )

asyncio.run(run())
```

### Error handling

The SDK exposes a structured error contract aligned with the platform's `ErrorResponse`. The platform fields `id`, `message`, `code`, `retryable`, and `timestamp` are all **optional** — server paths exist that omit any of them. The error callback always fires; whether an in-flight context is terminated depends on the **code's ownership** (see [ADR-0003](../../docs/adr/0003-structured-error-contract.md)).

#### Error code constants

```python
from sautech.errors import (
    ASR_TRANSCRIPTION_FAILED,
    ASR_MODEL_UNAVAILABLE,
    RATE_LIMIT_EXCEEDED,
    TTS_VOICE_LIST_FAILED,
    SERVER_INTERNAL,
    is_asr_code,
    is_tts_code,
    is_request_scoped_code,
    is_realtime_owned,
    is_tts_owned,
)
```

Unknown future codes pass through as strings. Legacy aliases such as `RATE_LIMITED`, `VALIDATION_FAILED`, and `INTERNAL_ERROR` remain exported for older deployments.

#### Socket.IO errors (TTS / realtime STT / fast STT)

The `on_error` callback receives an `ErrorResponse` model. Branch on `code` to decide your policy:

```python
from sautech.errors import ASR_TRANSCRIPTION_FAILED, ASR_MODEL_UNAVAILABLE

def on_error(err):
    # err is always non-None; any field can be None.
    print(f"code={err.code} retryable={err.retryable} msg={err.message}")
    if err.code == ASR_MODEL_UNAVAILABLE:
        # Tell the user the model is down; safe to retry later.
        ...
    elif err.code == ASR_TRANSCRIPTION_FAILED:
        # Final terminal error for this stream — no retry.
        ...
```

The realtime adapter only terminates an ASR stream context for ASR-shaped codes. A `TTS_VOICE_LIST_FAILED` arriving on the realtime socket fires the global `on_error` but **does not** kill the stream.

#### Batch transcription HTTP errors

`BatchTranscribeError` (and its subclasses) carry a structured `payload` plus per-field accessors:

```python
from sautech.stt.batchtranscription import (
    BatchTranscribeClient,
    BatchTranscribeError,
    BatchTranscribeRateLimitError,
)

try:
    result = await client.submit(audio, "ar")
except BatchTranscribeRateLimitError as err:
    # Caller decides retry policy; SDK never retries internally.
    print(f"rate-limited; retry_after={err.retry_after}s capacity={err.capacity}")
except BatchTranscribeError as err:
    print(f"status={err.status_code}")
    print(f"code={err.code} retryable={err.retryable}")
    print(f"detail={err.detail} job_id={err.job_id}")
    print(f"raw_body={err.raw_body!r}")  # always preserved
```

User-facing message preference: `detail → message → error → raw text`. `err.message` already follows that order.

> **Note.** The `max_retries` constructor argument is deprecated and is a no-op. The SDK never retries — callers decide policy based on `err.retryable`, `err.code`, and `err.capacity`.

### Types

Common enums are available under `sautech.stt`, for example `Language` and `ASRModel`.

### Examples

See complete examples in `python/examples/ft_client.py` and `python/examples/rt_client.py`. The batch transcription example (`python/examples/batch_transcribe_client.py`) shows structured-error handling end-to-end.
