Metadata-Version: 2.4
Name: attenlabs-sas
Version: 0.1.0
Summary: Python SDK for the SD Attention Server (SAS). Streams microphone and webcam to the inference server and emits typed events for predictions, VAD, conversation state, and speech audio ready for downstream LLM use.
Project-URL: Homepage, https://attentionlabs.ai
Author: Attention Labs
License: MIT
License-File: LICENSE
Keywords: attention,audio,realtime,sas,sd-attention,vad,video,websocket
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.10
Requires-Dist: numpy>=1.24
Requires-Dist: opencv-python>=4.8
Requires-Dist: sounddevice>=0.4.6
Requires-Dist: websocket-client>=1.6
Description-Content-Type: text/markdown

# attenlabs-sas

Python SDK for [Attention Labs](https://attentionlabs.ai) real-time attention detection.

## Sign up

Get your API token at [attentionlabs.ai/dashboard](https://attentionlabs.ai/dashboard).

## Install

```bash
pip install attenlabs-sas
```

Requires Python 3.10+. Microphone and webcam access go through `sounddevice` and `opencv-python`, both of which are installed automatically as dependencies.

## Quickstart

```python
import time
from sas import AttentionClient

client = AttentionClient(
    token="your-auth-token",
)

@client.on_connected
def _(): print("ws connected")

@client.on_started
def _(): print("server warmup complete")

@client.on_warmup_complete
def _(): print("first prediction received")

@client.on_prediction
def _(event):
    print(f"class={event.cls} conf={event.confidence:.2f} src={event.source} faces={event.num_faces}")

@client.on_vad
def _(event):
    print(f"vad={event.probability:.2f}")

@client.on_state
def _(event):
    print(f"conv state={event.state}")

@client.on_speech_ready
def _(event):
    # event.audio_pcm16 — np.int16 array @ 16 kHz mono
    # event.audio_base64 — same audio as base64 (ready for OpenAI Realtime etc.)
    # event.duration_sec — float
    # Forward to your LLM of choice.
    ...

@client.on_error
def _(event):
    print(f"ERROR: {event.title}: {event.message}")

client.start()
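try:
    while True:
        time.sleep(0.1)  # keep the main thread alive; callbacks run on SDK threads
except KeyboardInterrupt:
    pass
finally:
    client.stop()  # always release the mic, webcam, and WebSocket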
```

## API

### `AttentionClient(...)`

| argument              | type                        | default                                   | notes |
| --------------------- | --------------------------- | ----------------------------------------- | ----- |
| `token`               | `str` \| `None`             | `None`                                    | Sent as the WS subprotocol. Omit if the server has auth disabled. |
| `video`               | `CameraConfig`              | `CameraConfig()`                          | `device_index`, `width`, `height`, `jpeg_quality`. |
| `audio`               | `MicConfig`                 | `MicConfig()`                             | `device` (int index or name; `None` → system default), `channels`. |
| `initial_threshold`   | `float`                     | `0.7`                                     | Device-class confidence threshold, pushed to server after `started`. |
| `enable_audio`        | `bool`                      | `True`                                    | Disable mic capture entirely. |
| `enable_video`        | `bool`                      | `True`                                    | Disable webcam capture entirely. |

### Methods

| method                       | description |
| ---------------------------- | ----------- |
| `start()`                    | Opens WS, acquires mic+cam, starts capture threads. Non-blocking. Raises on handshake failure. |
| `stop()`                     | Tears down capture, joins threads, closes WS. |
| `mute()` / `unmute()`        | Pauses/resumes upstream audio; also signals server so VAD stops firing. |
| `mark_responding(bool)`      | Tells the server an LLM response is in flight; the server stops emitting predictions while `True`. |
| `set_threshold(value)`       | Sets the device-class trigger threshold (0 to 1). Server acks via the `config` event. |

### Events (decorator registration)

| decorator               | payload                                                                 |
| ----------------------- | ----------------------------------------------------------------------- |
| `@on_connected`         | —                                                                       |
| `@on_started`           | — (server-side warmup complete)                                         |
| `@on_warmup_complete`   | — (first non-zero-confidence prediction received)                       |
| `@on_prediction`        | `PredictionEvent(cls, confidence, source, num_faces)`                   |
| `@on_vad`               | `VadEvent(probability, is_speech)`                                      |
| `@on_state`             | `StateEvent(state: "listening" \| "sending" \| "cancelled" \| "idle")`  |
| `@on_speech_ready`      | `SpeechReadyEvent(audio_pcm16, audio_base64, duration_sec)`             |
| `@on_config`            | `ConfigEvent(model_class2_threshold)` (server ack of `set_threshold`)   |
| `@on_stats`             | `StatsEvent(rtt_ms, sent_video, skipped_video, sent_audio, uptime_s)`   |
| `@on_error`             | `AttentionErrorEvent(title, message, detail, code)`                     |
| `@on_disconnected`      | `DisconnectedEvent(code, reason, was_clean)`                            |

All callbacks fire on the WebSocket receive thread (or the heartbeat thread for `stats`/`error`). Keep them fast — offload heavy work to your own thread.
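
One way to keep callbacks fast is to hand each event to a worker thread through a queue. The sketch below uses only the standard library; `offload` and its `maxsize` parameter are hypothetical helper names, not part of the SDK, and dropping events when the queue is full is an assumption you may want to change.

```python
import queue
import threading


def offload(handler, maxsize=32):
    """Wrap `handler` so each call returns immediately and runs on a worker thread.

    Hypothetical helper, not part of the SDK.
    """
    pending = queue.Queue(maxsize=maxsize)

    def worker():
        while True:
            event = pending.get()
            try:
                handler(event)
            except Exception as exc:
                # Keep the worker alive if a single event fails.
                print(f"handler failed: {exc!r}")
            finally:
                pending.task_done()

    threading.Thread(target=worker, daemon=True).start()

    def callback(event):
        try:
            pending.put_nowait(event)
        except queue.Full:
            pass  # assumption: drop the event rather than block the receive thread

    callback.pending = pending  # exposed so callers can join() at shutdown
    return callback
```

Because a decorator is just a call, a wrapped handler can be registered with `client.on_speech_ready(offload(my_handler))`, where `my_handler` is your own function. A single worker preserves event order; a bounded queue trades dropped events for a receive thread that never stalls.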

## LLM integration

LLM routing is intentionally **not** part of the SDK. The `speech_ready` event hands you ready-to-send PCM16 audio (both as `np.int16` and as base64); forward it wherever you like.
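
If your downstream service expects a WAV file rather than raw samples, the base64 payload can be wrapped with the standard library. This is a minimal sketch that assumes `audio_base64` holds headerless little-endian PCM16 mono at 16 kHz, as the Quickstart comments suggest; `speech_to_wav_bytes` is a hypothetical helper, not part of the SDK.

```python
import base64
import io
import wave

SAMPLE_RATE_HZ = 16_000  # per the Quickstart comments: 16 kHz mono PCM16


def speech_to_wav_bytes(audio_base64: str) -> bytes:
    """Wrap base64-encoded raw PCM16 mono audio in a WAV container."""
    # Assumption: the payload is raw little-endian samples with no header.
    pcm = base64.b64decode(audio_base64)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return buf.getvalue()
```

Inside a `speech_ready` handler this would be called as `speech_to_wav_bytes(event.audio_base64)`.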

When your LLM starts speaking, call `client.mute()` and `client.mark_responding(True)`; when it finishes, call `client.unmute()` and `client.mark_responding(False)`. Otherwise the server keeps emitting predictions during the response.
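
The paired calls are easy to get wrong if playback raises partway through. A small context manager can guarantee the teardown; `llm_speaking` is a hypothetical helper name, and the sketch relies only on the four client methods documented above.

```python
from contextlib import contextmanager


@contextmanager
def llm_speaking(client):
    """Mute the mic and flag a response as in flight for the duration of the block.

    Hypothetical helper, not part of the SDK.
    """
    client.mute()
    client.mark_responding(True)
    try:
        yield
    finally:
        # Runs even if playback raises, so the client never stays muted.
        client.unmute()
        client.mark_responding(False)
```

Usage would look like `with llm_speaking(client): play_response(audio)`, where `play_response` stands in for your own playback code.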

A full CLI sample showing SDK + OpenAI Realtime wiring lives in a separate public repo: **sas-py-demo**.

## License

MIT
