Metadata-Version: 2.4
Name: sarvam-conv-ai-sdk
Version: 1.0.8
Summary: Add your description here
License-File: LICENSE
Requires-Python: >=3.11
Requires-Dist: fastapi>=0.118.2
Requires-Dist: httpx>=0.28.1
Requires-Dist: pydantic>=2.11.7
Requires-Dist: uvicorn>=0.37.0
Requires-Dist: websockets>=15.0.1
Provides-Extra: all
Requires-Dist: pyaudio>=0.2.14; extra == 'all'
Description-Content-Type: text/markdown

# Sarvam Conv AI SDK

The **Sarvam Conversational AI SDK** is a Python package that helps developers build and extend conversational agents. It provides core components to manage conversation flow, language preferences, and messaging, making it easier to develop interactive and context-aware AI experiences.

---

## Overview

The Sarvam Conv AI SDK enables developers to create tools that can:

* Facilitate agentic capabilities such as calling APIs in the middle of a conversation
* Manage agent-specific variables
* Control and modify the language used during conversations
* Send dynamic messages to both the user and the underlying language model (LLM)

---

## Installation

### Basic Installation

Install the SDK via pip:

```bash
pip install sarvam-conv-ai-sdk
```

### Audio Support (Optional)

If you want to use audio streaming features (microphone input and speaker output), you need to install PyAudio. This requires system-level dependencies:

#### Option 1: Install with audio support

```bash
pip install sarvam-conv-ai-sdk[all]
```

**Note:** You'll need to install PortAudio first:

- **macOS**: `brew install portaudio`
- **Ubuntu/Debian**: `sudo apt-get install portaudio19-dev`
- **Windows**: Download from [http://www.portaudio.com/download.html](http://www.portaudio.com/download.html)

#### Option 2: Use without PyAudio

The SDK works without PyAudio for non-playback environments; audio capture/playback features will not be available. You can still:
- Use the WebSocket client for real-time voice conversations (provide your own audio I/O)
- Build backend proxies for frontend applications

---

## AsyncSamvaadAgent

Build real-time voice interactions from a small set of inputs:

- You provide InteractionConfig: who the user is, which app to talk to, interaction type, and audio sample rate; optionally include overrides like agent_variables and initial language/state.
- You create AsyncSamvaadAgent with your API key, config, and optional audio interface plus callbacks for text/audio/events.
- Start the agent: it fetches a signed WebSocket URL, sends interaction_start, and streams audio/text both ways.

### Key features

* Real-time voice interaction — natural speak and listen
* Automatic audio management — built-in microphone input and speaker output
* Async/await support — non-blocking operations
* Callback handling — process text/audio/events asynchronously
* Connection management — robust WebSocket handling

Minimal example:

```python
import asyncio
from pydantic import SecretStr
from sarvam_conv_ai_sdk import AsyncSamvaadAgent, AsyncDefaultAudioInterface, InteractionConfig, InteractionType, ServerTextChunkMsg, SarvamToolLanguageName
from sarvam_conv_ai_sdk.messages.types import UserIdentifierType

async def handle_text(msg: ServerTextChunkMsg):
    print("Agent:", msg.text)

async def main(app_id: str, api_key: str):
    config = InteractionConfig(
        user_identifier_type=UserIdentifierType.CUSTOM,
        user_identifier="demo_user",
        org_id="org_ai",
        workspace_id="workspace_id",
        app_id=app_id,
        interaction_type=InteractionType.CALL,
        agent_variables={"agent_variable_1": "value"},
        initial_language_name=SarvamToolLanguageName.HINDI,
        sample_rate=16000,
    )

    agent = AsyncSamvaadAgent(
        api_key=SecretStr(api_key),
        config=config,
        audio_interface=AsyncDefaultAudioInterface(input_sample_rate=16000),
        text_callback=handle_text,
    )

    await agent.start()
    try:
        # Wait until the WebSocket disconnects or the agent is stopped
        await agent.wait_for_disconnect()
    finally:
        await agent.stop()

if __name__ == "__main__":
    asyncio.run(main(app_id="your_app_id", api_key="your_api_key"))
```

### AsyncSamvaadAgent parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| api_key | SecretStr | Yes | API key used to fetch a signed WebSocket URL |
| config | InteractionConfig | Yes | Interaction start configuration (user id, app id, sample rate, overrides) |
| audio_interface | AsyncAudioInterface or None | No | Automatic mic capture and speaker playback. Omit for headless usage (use `send_audio`) |
| text_callback | Callable[[ServerTextChunkMsg], Awaitable[None]] or None | No | Receives streaming text chunks from the agent |
| audio_callback | Callable[[ServerAudioChunkMsg], Awaitable[None]] or None | No | Receives audio chunks if not using `audio_interface` for playback |
| event_callback | Callable[[ServerEventBase], Awaitable[None]] or None | No | Receives events like interaction_connected, user_interrupt, interaction_end |
| base_url | str | No | Override base URL. Default: `https://apps.sarvam.ai/api/app-runtime/` |

Methods:
- `await agent.start()` — start and connect
- `await agent.stop()` — stop and cleanup
- `await agent.wait_for_connect(timeout: float | None = 5.0)` — wait until connected
- `await agent.wait_for_disconnect()` — wait until disconnected or stopped
- `agent.is_connected()` — connection status
- `await agent.send_audio(audio_bytes: bytes)` — send raw 16‑bit PCM audio
- `agent.get_interaction_id()` — current interaction id or `None`

Audio interface (optional): `AsyncDefaultAudioInterface(input_sample_rate: int = 16000)`
- Methods: `start(input_callback)`, `output(audio: bytes, sample_rate?: int)`, `interrupt()`, `stop()`
- Audio: LINEAR16 (16‑bit PCM mono). Supported sample rates: 8000, 16000

### What you must provide: InteractionConfig

Required fields:

- user_identifier_type: One of CUSTOM, EMAIL, PHONE_NUMBER, UNKNOWN
- user_identifier: The identifier value (string; phone number, email, or custom id). This id can be used to find the interaction's logs in the log analyser.
- org_id: Your organization, e.g., "sarvamai"
- workspace_id: Your workspace, e.g., "default"
- app_id: The target application id
- interaction_type: InteractionType.CALL (voice)
- sample_rate: 8000 or 16000 (16-bit PCM mono)
- version (optional): int; the app version to use. See the note below for the default.

> **Important**  
> If `version` is not provided, the SDK uses the latest committed version of the app.  
> The connection will fail if the provided `app_id` has no committed version.

Optional overrides (applied server-side at start):

- agent_variables: dict of key/value to seed the agent context
- initial_language_name: e.g., "English", "Hindi" (must be allowed by app)
- initial_state_name: starting state name, if your app uses states
- initial_bot_message: first message from the agent

Example config:

```python
from sarvam_conv_ai_sdk import InteractionConfig, InteractionType, SarvamToolLanguageName
from sarvam_conv_ai_sdk.messages.types import UserIdentifierType

config = InteractionConfig(
    user_identifier_type=UserIdentifierType.CUSTOM,
    user_identifier="demo_user_async",
    org_id="sarvamai",
    workspace_id="default",
    app_id="your_app_id",
    interaction_type=InteractionType.CALL,
    agent_variables={"user_language": "Hindi"},
    initial_language_name=SarvamToolLanguageName.HINDI,
    initial_state_name="greeting",
    sample_rate=16000,
)
```

### Quick start: local voice test

1) Install dependencies

```bash
brew install portaudio               # macOS
pip install "sarvam-conv-ai-sdk[all]"
```

2) Set credentials (or pass directly in code)

```bash
export SARVAM_APP_ID="your_app_id"
export SARVAM_API_KEY="your_api_key"
```

3) Run the example

```bash
python -m sarvam_conv_ai_sdk.examples.async_audio_example
```

The example uses AsyncDefaultAudioInterface to capture mic at 16kHz and play responses. You can override base_url in AsyncSamvaadAgent if you use a different environment.
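
If you write your own script instead of running the bundled example, the credentials exported in step 2 can be read from the environment. The following is a minimal sketch; `load_credentials` is a hypothetical helper, not part of the SDK, and only the two variable names come from this README.

```python
import os
from typing import Mapping


def load_credentials(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (app_id, api_key) read from SARVAM_APP_ID and SARVAM_API_KEY."""
    names = ("SARVAM_APP_ID", "SARVAM_API_KEY")
    # Collect every missing or empty variable so the error lists them all at once
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["SARVAM_APP_ID"], env["SARVAM_API_KEY"]
```

The returned pair can then be passed as `app_id` and `api_key` to a `main()` function like the one in the minimal example.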

### Headless mode (no PyAudio)

Use your own audio I/O. Create the agent without audio_interface and push raw 16‑bit PCM mono chunks that match config.sample_rate.

```python
agent = AsyncSamvaadAgent(api_key=SecretStr("your_api_key"), config=config, text_callback=handle_text)
await agent.start()

# Send raw audio bytes
await agent.send_audio(raw_pcm_bytes)  # LINEAR16 mono at 16kHz or 8kHz

await agent.stop()
```
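
One way to produce suitable chunks for `send_audio` is to read them from a WAV file with the standard-library `wave` module. This is a sketch under assumptions: `pcm_chunks` is a hypothetical helper, and the 20 ms chunk duration is an arbitrary choice, not a value required by the SDK.

```python
import wave
from typing import Iterator

SUPPORTED_SAMPLE_RATES = (8000, 16000)


def pcm_chunks(wav_file, chunk_ms: int = 20) -> Iterator[bytes]:
    """Yield raw 16-bit mono PCM chunks of roughly chunk_ms milliseconds.

    wav_file may be a path or a binary file-like object.
    """
    with wave.open(wav_file, "rb") as wav:
        # Reject anything that is not LINEAR16 mono at a supported rate
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("Expected 16-bit mono PCM audio")
        if wav.getframerate() not in SUPPORTED_SAMPLE_RATES:
            raise ValueError(f"Unsupported sample rate: {wav.getframerate()}")
        frames_per_chunk = wav.getframerate() * chunk_ms // 1000
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk
```

Each yielded chunk can be forwarded with `await agent.send_audio(chunk)`; the file's sample rate should match `config.sample_rate`.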

### Connect your frontend (backend proxy pattern)

See the section above for AsyncSamvaadAgent usage. For a full backend bridge, follow the same pattern in your server. Message shapes:

- Frontend → backend (init):

```json
{
  "type": "init",
  "app_id": "your_app_id",
  "context": {"language": "English", "user_name": "Priya"}
}
```

- Frontend → backend (text):

```json
{ "type": "text", "data": { "text": "Hello" } }
```

- Frontend → backend (audio):

```json
{ "type": "audio", "data": "<base64-raw-pcm>" }
```

Bridge essentials on the backend:

- Build InteractionConfig from init context; create AsyncSamvaadAgent with callbacks.
- Decode base64 and forward audio via await agent.send_audio(audio_bytes).
- In text/audio/event callbacks, websocket.send_json back to the frontend.

Minimal sketch:

```python
session.agent = AsyncSamvaadAgent(
    api_key=SecretStr(api_key),
    config=config,
    text_callback=session._handle_text,
    audio_callback=session._handle_audio,
    event_callback=session._handle_event,
)
await session.agent.start()
```
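
On the receiving side of the bridge, the three frontend message shapes shown above can be decoded with the standard library alone. The sketch below is illustrative: `parse_frontend_message` is a hypothetical helper, and the error handling is an assumption rather than documented SDK behaviour.

```python
import base64
import json
from typing import Any


def parse_frontend_message(raw: str) -> tuple[str, Any]:
    """Decode one frontend message into (type, payload)."""
    message = json.loads(raw)
    kind = message.get("type")
    if kind == "init":
        # app_id plus free-form context, used to build InteractionConfig
        return kind, {"app_id": message["app_id"], "context": message.get("context", {})}
    if kind == "text":
        return kind, message["data"]["text"]
    if kind == "audio":
        # Audio arrives as base64-encoded raw PCM; decode it before send_audio
        return kind, base64.b64decode(message["data"])
    raise ValueError(f"Unsupported message type: {kind!r}")
```

For an `audio` message, the decoded bytes can be passed directly to `await agent.send_audio(...)`.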

### Requirements for Async Audio

1. PyAudio installation:
   ```bash
   pip install sarvam-conv-ai-sdk[all]
   ```

2. System dependencies:
   - macOS: `brew install portaudio`
   - Ubuntu/Debian: `sudo apt-get install portaudio19-dev`
   - Windows: download from `http://www.portaudio.com/download.html`

3. Environment variables (optional convenience):
   ```bash
   export SARVAM_APP_ID="your_app_id"
   export SARVAM_API_KEY="your_api_key"
   ```

### Complete Example

See `sarvam_conv_ai_sdk/examples/async_audio_example.py` for a full, runnable script with mic capture, callbacks, and clean shutdown.

---

# Custom Tools
## Example Usage

```python
import httpx
from pydantic import Field

from sarvam_conv_ai_sdk import (
    SarvamInteractionTurnRole,
    SarvamOnEndTool,
    SarvamOnEndToolContext,
    SarvamOnStartTool,
    SarvamOnStartToolContext,
    SarvamTool,
    SarvamToolContext,
    SarvamToolLanguageName,
    SarvamToolOutput,
)

class OnStart(SarvamOnStartTool):  # The class must be named OnStart
    async def run(self, context: SarvamOnStartToolContext):
        user_id = context.get_user_identifier()
        async with httpx.AsyncClient() as client:
            response = await client.get(f"https://sarvam-flights.com/users/{user_id}")
            response.raise_for_status()
            user_data = response.json()

        source_destination = user_data.get("home_city")
        context.set_agent_variable("source_destination", source_destination)
        context.set_agent_variable("passenger_name", user_data.get("name"))
        
        # Store telephony call SID if available (for telephony channels)
        if context.provider_ref_id:
            context.set_agent_variable("call_sid", context.provider_ref_id)
        
        context.set_initial_language_name(SarvamToolLanguageName.ENGLISH)
        context.set_initial_bot_message(
            f"Hello! Would you like to book a flight from {source_destination}? Where would you like to go?",
        )
        return context


class BookFlight(SarvamTool):
    """Book a flight based on the user's travel preferences."""

    destination: str = Field(description="City of destination")
    travel_date: str = Field(description="Date of travel (YYYY-MM-DD)")

    async def run(self, context: SarvamToolContext) -> SarvamToolOutput:
        source_destination = context.get_agent_variable("source_destination")
        booking_data = {
            "source": source_destination,
            "destination": self.destination,
            "travel_date": self.travel_date,
            "passenger_name": context.get_agent_variable("passenger_name"),
        }

        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://sarvam-flights.com/book", json=booking_data
            )
            response.raise_for_status()
            booking_result = response.json()

        if booking_result.get("status") == "confirmed":
            context.set_agent_variable("booking_id", booking_result.get("booking_id"))
            context.set_end_conversation()
            return SarvamToolOutput(
                message_to_user=f"Flight booked successfully to {self.destination}!",
                context=context,
            )
        else:
            context.change_state("recommend_destinations")
            return SarvamToolOutput(
                message_to_llm="Booking failed. Please suggest similar destinations.",
                context=context,
            )


class OnEnd(SarvamOnEndTool):  # The class must be named OnEnd
    async def run(self, context: SarvamOnEndToolContext):
        # Treat a missing (None) feedback value as empty so .lower() is safe
        feedback = context.get_agent_variable("feedback") or ""
        negative_words = ["bad", "poor", "disappointed", "unhappy", "problem"]
        # Start from the stored feedback so is_negative is always defined
        is_negative = any(word in feedback.lower() for word in negative_words)

        # Also scan what the user said during the conversation
        interaction_transcript = context.get_interaction_transcript()
        for turn in interaction_transcript.interaction_transcript or []:
            if turn.role == SarvamInteractionTurnRole.USER and any(
                word in (turn.en_text or "").lower() for word in negative_words
            ):
                is_negative = True
                break
        context.set_agent_variable("feedback_sentiment", is_negative)
        
        # Log call details if telephony SID is available
        if context.provider_ref_id:
            async with httpx.AsyncClient() as client:
                await client.post(
                    "https://sarvam-flights.com/analytics/call-logs",
                    json={
                        "call_sid": context.provider_ref_id,
                        "user_id": context.get_user_identifier(),
                        "sentiment": is_negative,
                        "duration": (
                            interaction_transcript.interaction_end_time 
                            - interaction_transcript.interaction_start_time
                        ).total_seconds()
                    }
                )

        return context

```

---

## Base Classes

The SDK exposes three base classes for tool development:

### 1. `SarvamTool`

Primary base class for all operational tools invoked during conversation flow.

**Example:**

```python
class MyCustomTool(SarvamTool):
    """Brief description of the tool's purpose."""

    tool_variable: type = Field(description="Description of this input parameter")

    async def run(self, context: SarvamToolContext) -> SarvamToolOutput:
        # Custom tool logic
        return SarvamToolOutput(
            message_to_user="Response to user",
            message_to_llm="Context for LLM",
            context=context
        )
```

### 2. `SarvamOnStartTool`

Executed at the beginning of a conversation, typically for initialization. The class **must** be named `OnStart`.

### 3. `SarvamOnEndTool`

Executed at the end of a conversation, typically for cleanup or post-processing. The class **must** be named `OnEnd`.

---

## Context Classes and Methods

### `SarvamToolContext`

The context object passed to `SarvamTool.run()` methods.

#### Variable Management

* `get_agent_variable(variable_name: str) -> Any`
  Retrieve the value of a variable.

* `set_agent_variable(variable_name: str, value: Any) -> None`
  Update a variable's value.

#### Language Control

* `get_current_language() -> SarvamToolLanguageName`
  Returns the current language of the agent.

* `change_language(language: SarvamToolLanguageName) -> None`
  Update the language preference.

#### Conversation Flow

* `set_end_conversation() -> None`
  Explicitly end the conversation.

#### State Management

* `get_current_state() -> str`
  Returns the current state of the conversation.

* `change_state(state: str) -> None`
  Transition to a new state. **Note:** The new state must be one of the next valid states defined in the agent configuration.

#### Engagement Metadata

* `get_engagement_metadata() -> EngagementMetadata`
  Retrieve the engagement metadata containing information about the current interaction. 

---

### `SarvamOnStartToolContext`

The context object passed to `SarvamOnStartTool.run()` methods.

#### Variable Management

* `get_agent_variable(variable_name: str) -> Any`
  Retrieve the value of a variable.

* `set_agent_variable(variable_name: str, value: Any) -> None`
  Update a variable's value.

#### User Information

* `get_user_identifier() -> str`
  Get the user identifier.

#### Telephony Information

* `provider_ref_id: Optional[str]`
  The reference ID from the channel provider. For telephony providers, this would contain the Call SID (Session ID) which uniquely identifies a specific phone call. For other channel providers, this would contain their respective reference IDs. Defaults to `None` for channels that don't provide reference IDs.

#### Initialization Methods

* `set_initial_bot_message(message: str) -> None`
  Set the first message sent by the agent when the conversation starts.

* `set_initial_state_name(state_name: str) -> None`
  Set the initial state from which the agent should start.

* `set_initial_language_name(language: SarvamToolLanguageName) -> None`
  Define the initial language preference for the user.

#### Engagement Metadata

* `get_engagement_metadata() -> EngagementMetadata`
  Retrieve the engagement metadata containing information about the current interaction.

---

### `SarvamOnEndToolContext`

The context object passed to `SarvamOnEndTool.run()` methods.

#### Variable Management

* `get_agent_variable(variable_name: str) -> Any`
  Retrieve the value of a variable.

* `set_agent_variable(variable_name: str, value: Any) -> None`
  Update a variable's value.

#### User Information

* `get_user_identifier() -> str`
  Get the user identifier.

#### Telephony Information

* `provider_ref_id: Optional[str]`
  The reference ID from the channel provider. For telephony providers, this would contain the Call SID (Session ID) which uniquely identifies a specific phone call. For other channel providers, this would contain their respective reference IDs. Defaults to `None` for channels that don't provide reference IDs.

#### Engagement Metadata

* `get_engagement_metadata() -> EngagementMetadata`
  Retrieve the engagement metadata containing information about the current interaction.


#### Interaction Reattempt

* `set_retry_interaction`
  Request that the interaction be reattempted with the same agent. Useful when a business goal has not been met.

#### Interaction Transcript

* `get_interaction_transcript() -> SarvamInteractionTranscript`
  Retrieve the conversation history, containing user and agent messages in English, along with the timestamps at which the conversation began and ended. Timestamp format: `yyyy-mm-dd hh:mm:ss`

**Example transcript:**
```python
[
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.AGENT: 'agent'>, en_text='Hello! How can I help you today?'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.USER: 'user'>, en_text='I need to book a flight'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.AGENT: 'agent'>, en_text='I can help you with that. Where would you like to go?'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.USER: 'user'>, en_text='I want to go to Mumbai'),
    SarvamInteractionTurn(role=<SarvamInteractionTurnRole.AGENT: 'agent'>, en_text='Great! When would you like to travel?')
]
```
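
If the start and end timestamps reach your code as strings in the `yyyy-mm-dd hh:mm:ss` format (for example in exported logs), the duration can be computed as sketched below. `interaction_duration_seconds` is a hypothetical helper. When the SDK hands you `datetime` objects, as in the `OnEnd` example, subtract them directly instead.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # yyyy-mm-dd hh:mm:ss


def interaction_duration_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two formatted timestamps."""
    started = datetime.strptime(start, TIMESTAMP_FORMAT)
    ended = datetime.strptime(end, TIMESTAMP_FORMAT)
    return (ended - started).total_seconds()
```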

---

## Return Types

### `SarvamToolOutput`

The return type for `SarvamTool.run()` methods. Contains:

* `message_to_user: Optional[str]` - Message that is sent directly to the user
* `message_to_llm: Optional[str]` - Message that is sent to the LLM, which then responds
* `context: SarvamToolContext` - The updated context object

**Note:** At least one of `message_to_llm` or `message_to_user` must be set.

**Important:** When both `message_to_user` and `message_to_llm` are set, only `message_to_user` is sent to the user, while `message_to_llm` replaces it in the chat thread that forms the LLM's context.
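
The two rules above can be modelled in plain Python. This is an illustrative model only, not the SDK's implementation: `route_tool_messages` is a hypothetical function, and it assumes that a lone `message_to_user` is also what gets recorded in the chat thread.

```python
from typing import Optional


def route_tool_messages(
    message_to_user: Optional[str] = None,
    message_to_llm: Optional[str] = None,
) -> tuple[Optional[str], Optional[str]]:
    """Return (text sent to the user, text added to the LLM chat thread)."""
    if message_to_user is None and message_to_llm is None:
        raise ValueError("Set at least one of message_to_user or message_to_llm")
    # message_to_llm takes precedence in the thread when both are set
    thread_text = message_to_llm if message_to_llm is not None else message_to_user
    return message_to_user, thread_text
```

For example, `route_tool_messages("Booked!", "Booking confirmed, id stored")` sends `"Booked!"` to the user while the thread records the LLM-facing text.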

### `EngagementMetadata`

The engagement metadata object that can be retrieved from context objects using `get_engagement_metadata()`. Contains:

* `interaction_id: str` - Unique identifier for each conversation between user & agent.
* `attempt_id: Optional[str]` - Unique identifier for each attempt created on the platform
* `campaign_id: Optional[str]` - Campaign ID for the interaction
* `interaction_language: SarvamToolLanguageName` - The language used for the interaction (defaults to English)
* `app_id: str` - Application identifier of the agent for the interaction
* `app_version: int` - Version number of the agent
* `agent_phone_number: Optional[str]` - Phone number associated with the conversational agent application

---

## Supported Languages

The SDK supports multilingual conversations using the `SarvamToolLanguageName` enum. Available languages include:

* Bengali
* Gujarati
* Kannada
* Malayalam
* Tamil
* Telugu
* Punjabi
* Odia
* Marathi
* Hindi
* English

**Note:** Each agent allows only a subset of these languages, selected when the agent configuration is defined.

---

## Best Practices

1. **Always implement `run()`**: The `run()` method is the entry point for tool execution logic.
2. **Use `Field()` for parameters**: Ensures type safety and supplies the parameter descriptions that the LLM sees in its prompt.
3. **Gracefully handle errors**: Avoid accessing unset variables or using invalid types.
4. **Return the appropriate type**: `SarvamTool.run()` must return `SarvamToolOutput`, while `SarvamOnStartTool.run()` and `SarvamOnEndTool.run()` return their respective context objects.
5. **Write meaningful docstrings**: Clearly describe what each tool is intended to do as this directly impacts the performance of tool calling capabilities of the agent.
6. **Use async operations for I/O**: For the best performance, use `async/await` for external API calls to avoid blocking.
7. **Use context methods**: Use the provided context methods for variable management, language control, and messaging instead of directly accessing context attributes.
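
As an illustration of practice 3, a tool that receives a date as a string (like `travel_date` in `BookFlight`) could validate it before calling an external API. This is a sketch; `parse_travel_date` is a hypothetical helper, not part of the SDK.

```python
from datetime import date, datetime
from typing import Optional


def parse_travel_date(value: str) -> Optional[date]:
    """Return the date for a YYYY-MM-DD string, or None if it cannot be parsed."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except (TypeError, ValueError):
        # TypeError covers non-string input; ValueError covers bad or impossible dates
        return None
```

Inside `run()`, a `None` result could be turned into a `SarvamToolOutput` whose `message_to_llm` asks the user for a valid date, rather than letting the request fail.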


---

## Testing Your Tools

After creating a tool, you can test it locally to ensure it works as expected. Here's how to test your tools:

### Testing Steps

1. **Create the ToolContext**: Initialize the appropriate context object with test data
2. **Instantiate the tool class**: Use `tool.model_validate(tool_args)` to create a tool instance
3. **Run the tool**: Call the tool's `run()` method with the context
4. **Observe the returned object**: Check if the necessary changes have been made to the context

### Example Test: SarvamTool

```python
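import asyncio

# Assumption: EngagementMetadata is exported from the package root like the other names
from sarvam_conv_ai_sdk import EngagementMetadata, SarvamToolContext, SarvamToolLanguageName


# Test the BookFlight tool (defined in the example usage above)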
async def test_book_flight():
    # 1. Create the ToolContext
    context = SarvamToolContext(
        language=SarvamToolLanguageName.ENGLISH,
        allowed_languages=[SarvamToolLanguageName.ENGLISH],
        state="booking",
        next_valid_states=["recommend_destinations", "end"],
        agent_variables={
            "source_destination": "Mumbai",
            "passenger_name": "John Doe",
            "booking_id": "123"
        },
        engagement_metadata=EngagementMetadata(
            interaction_id="123",
            attempt_id="456",
            campaign_id="789",
            interaction_language=SarvamToolLanguageName.ENGLISH,
            app_id="101",
            app_version=1,
            agent_phone_number="+1234567890",
        ),
    )
    
    # 2. Instantiate the tool class
    tool_args = {
        "destination": "Delhi",
        "travel_date": "2024-03-15"
    }
    tool_instance = BookFlight.model_validate(tool_args)
    
    # 3. Run the tool
    result = await tool_instance.run(context)
    
    # 4. Observe the returned object
    print(f"Message to user: {result.message_to_user}")
    print(f"Message to LLM: {result.message_to_llm}")
    print(f"End conversation: {result.context.end_conversation}")
    print(f"Current state: {result.context.get_current_state()}")
    print(f"Agent variables: {result.context.agent_variables}")
    print(f"Current Language: {result.context.get_current_language()}")

# Run the test
asyncio.run(test_book_flight())
```

### Example Test: OnStart Tool

For `SarvamOnStartTool`, the testing approach is similar but it returns the context object directly:

```python
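import asyncio

# Assumption: EngagementMetadata is exported from the package root like the other names
from sarvam_conv_ai_sdk import (
    EngagementMetadata,
    SarvamOnStartToolContext,
    SarvamToolLanguageName,
)


# Testing OnStart tool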
async def test_on_start():
    context = SarvamOnStartToolContext(
        user_identifier="user123",
        agent_variables={"source_destination": "Mumbai", "passenger_name": "John Doe"},
        engagement_metadata=EngagementMetadata(
            interaction_id="123",
            attempt_id="456",
            campaign_id="789",
            interaction_language=SarvamToolLanguageName.ENGLISH,
            app_id="101",
            app_version=1,
            agent_phone_number="+1234567890",
        ),
        initial_bot_message=None,
        initial_state_name="start",
        initial_language_name=SarvamToolLanguageName.ENGLISH,
        provider_ref_id="CA1234567890abcdef1234567890abcdef",  # Optional: for telephony channels
    )
    
    tool_instance = OnStart()
    result = await tool_instance.run(context)
    
    print(f"Initial bot message: {result.initial_bot_message}")
    print(f"Initial state: {result.initial_state_name}")
    print(f"Initial Language Name: {result.initial_language_name}")
    print(f"Agent variables: {result.agent_variables}")
    print(f"Telephony Call SID: {result.provider_ref_id}")

# Run the test
asyncio.run(test_on_start())
```

### Example Test: OnEnd Tool

```python
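import asyncio
from datetime import datetime, timedelta

# Assumption: EngagementMetadata, SarvamInteractionTranscript and SarvamInteractionTurn
# are exported from the package root like the other names
from sarvam_conv_ai_sdk import (
    EngagementMetadata,
    SarvamInteractionTranscript,
    SarvamInteractionTurn,
    SarvamInteractionTurnRole,
    SarvamOnEndToolContext,
    SarvamToolLanguageName,
)


# Testing OnEnd tool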
async def test_on_end():
    context = SarvamOnEndToolContext(
        user_identifier="user123",
        agent_variables={"feedback": "I had a bad experience", "feedback_sentiment": False},
        engagement_metadata=EngagementMetadata(
            interaction_id="123",
            attempt_id="456",
            campaign_id="789",
            interaction_language=SarvamToolLanguageName.ENGLISH,
            app_id="101",
            app_version=1,
            agent_phone_number="+1234567890",
        ),
        interaction_transcript=SarvamInteractionTranscript(
            interaction_transcript=[
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.AGENT, en_text='Hello! How can I help you today?'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.USER, en_text='I need to book a flight'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.AGENT, en_text='I can help you with that. Where would you like to go?'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.USER, en_text='I want to go to Mumbai'),
                SarvamInteractionTurn(role=SarvamInteractionTurnRole.AGENT, en_text='Great! When would you like to travel?')
            ],
            interaction_start_time=datetime.now() - timedelta(minutes=2),
            interaction_end_time=datetime.now(),
        ),
        retry_interaction=False,
        provider_ref_id="CA1234567890abcdef1234567890abcdef",  # Optional: for telephony channels
    )
    
    tool_instance = OnEnd()
    result = await tool_instance.run(context)
    
    print(f"Agent variables: {result.agent_variables}")
    print(f"Interaction Retry: {result.retry_interaction}")
    print(f"Telephony Call SID: {result.provider_ref_id}")

# Run the test
asyncio.run(test_on_end())
```


## Best Practices for Async Audio

1. Use proper event loop setup for PyAudio compatibility:
   ```python
   loop = asyncio.new_event_loop()
   asyncio.set_event_loop(loop)
   ```

2. Handle connection states gracefully:
   ```python
   while agent.is_connected():
       await asyncio.sleep(1)
   ```

3. Implement proper cleanup in finally blocks:
   ```python
   finally:
       await agent.stop()
   ```

4. Use appropriate sample rates (typically 16000 Hz for input)

5. Handle interruptions with KeyboardInterrupt:
   ```python
   except KeyboardInterrupt:
       print("Stopping conversation...")
   ```
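
The fragments above can be combined into one lifecycle. The sketch below uses `StubAgent`, a hypothetical stand-in that mimics only the `start()`, `stop()`, and `is_connected()` methods listed for `AsyncSamvaadAgent`, so the control flow can be run without a network connection or PyAudio.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in for AsyncSamvaadAgent, for illustration only."""

    def __init__(self, ticks: int = 3) -> None:
        self._ticks = ticks
        self.stopped = False

    async def start(self) -> None:
        pass

    def is_connected(self) -> bool:
        # Pretend the connection drops after a few polls
        self._ticks -= 1
        return self._ticks >= 0

    async def stop(self) -> None:
        self.stopped = True


async def run_conversation(agent, poll_seconds: float = 1.0) -> None:
    await agent.start()
    try:
        while agent.is_connected():
            await asyncio.sleep(poll_seconds)
    finally:
        # Runs on normal exit, on errors, and on cancellation
        await agent.stop()


def main(agent) -> None:
    try:
        asyncio.run(run_conversation(agent))
    except KeyboardInterrupt:
        print("Stopping conversation...")
```

With a real agent, `main(agent)` would be called with an `AsyncSamvaadAgent` instance in place of the stub.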

---