Metadata-Version: 2.4
Name: polargrid-sdk
Version: 0.6.0
Summary: Python SDK for PolarGrid Edge AI Infrastructure with Full API Support
Project-URL: Homepage, https://polargrid.ai
Project-URL: Documentation, https://docs.polargrid.ai
Project-URL: Repository, https://github.com/your-org/polargrid-sdk
Project-URL: Issues, https://github.com/your-org/polargrid-sdk/issues
Author-email: PolarGrid Team <support@polargrid.ai>
License: MIT
Keywords: ai,edge,inference,llm,machine-learning,ml,polargrid,stt,tts,voice,whisper
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.9
Requires-Dist: httpx>=0.25.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: dev
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'dev'
Requires-Dist: pytest-cov>=4.0.0; extra == 'dev'
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Requires-Dist: respx>=0.20.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Description-Content-Type: text/markdown

# PolarGrid SDK

The official Python SDK for PolarGrid Edge AI Infrastructure with full API support and mock data capabilities.

## Features

- ✅ **Text Inference**: Completions and chat completions, with streaming support
- ✅ **Voice**: Text-to-speech and speech-to-text
- ✅ **Model Management**: Load, unload, and check model status
- ✅ **GPU Management**: Monitor and manage GPU resources
- ✅ **Mock Data Mode**: Develop without a backend (useful for frontend work)
- ✅ **Full Type Hints**: Complete type annotations with Pydantic models
- ✅ **Async & Sync**: Both async and synchronous clients
- ✅ **Error Handling**: Comprehensive error types
- ✅ **Retry Logic**: Automatic retry with exponential backoff

## Installation

```bash
pip install polargrid-sdk
```

## Quick Start

### Async Client (Recommended)

```python
import asyncio
from polargrid import PolarGrid

async def main():
    # Development mode (with mock data)
    client = PolarGrid(
        use_mock_data=True,  # Enable mock mode
        debug=True,          # See what's happening
    )

    # All methods work with realistic mock data
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [
            {"role": "user", "content": "Hello!"}
        ]
    })

    print(response.choices[0].message.content)

asyncio.run(main())
```

### Sync Client

```python
from polargrid import PolarGridSync

# For synchronous code
client = PolarGridSync(
    api_key="pg_your_api_key",
    use_mock_data=False,
)

response = client.chat_completion({
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Hello!"}]
})

print(response.choices[0].message.content)
```

## API Reference

### Text Inference

#### Chat Completions

```python
# Non-streaming
response = await client.chat_completion({
    "model": "llama-3.1-8b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is quantum computing?"}
    ],
    "max_tokens": 150,
    "temperature": 0.7,
})

print(response.choices[0].message.content)
```

```python
# Streaming
async for chunk in client.chat_completion_stream({
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Tell me a story"}],
}):
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
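
When the full text is needed after streaming finishes, the chunks can be joined as they arrive. The sketch below is not part of the SDK: `collect_stream` is a hypothetical helper, and `fake_stream` is a stand-in that only mimics the `chunk.choices[0].delta.content` shape used above so the example runs without a backend.

```python
import asyncio
from types import SimpleNamespace


async def collect_stream(chunks):
    """Join the delta.content pieces of a chat stream into one string."""
    parts = []
    async for chunk in chunks:
        content = chunk.choices[0].delta.content
        if content:  # skip empty or None deltas
            parts.append(content)
    return "".join(parts)


async def fake_stream():
    # Hypothetical stand-in for client.chat_completion_stream(...)
    for piece in ["Once ", None, "upon ", "a time"]:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


story = asyncio.run(collect_stream(fake_stream()))
print(story)
```

With a real client, the same helper would presumably be called as `await collect_stream(client.chat_completion_stream(request))`.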

#### Text Completions

```python
response = await client.completion({
    "prompt": "Once upon a time",
    "model": "llama-3.1-8b",
    "max_tokens": 100,
    "temperature": 0.8,
})

print(response.choices[0].text)
```

### Voice / Audio

#### Text-to-Speech

```python
# Generate audio
audio_buffer = await client.text_to_speech({
    "model": "tts-1",
    "input": "Hello from PolarGrid!",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0,
})

# Save to file
with open("speech.mp3", "wb") as f:
    f.write(audio_buffer)
```

```python
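# Streaming TTS: write chunks to a file as they arrive
# (assumes each chunk is a bytes object, like the non-streaming result)
with open("speech_stream.mp3", "wb") as audio_stream:
    async for chunk in client.text_to_speech_stream({
        "model": "tts-1",
        "input": "Long text to convert...",
        "voice": "nova",
    }):
        audio_stream.write(chunk)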
```

#### Speech-to-Text

```python
from pathlib import Path

# Transcribe audio
transcription = await client.transcribe(
    file=Path("recording.mp3"),
    request={
        "model": "whisper-1",
        "language": "en",
        "response_format": "json",
    }
)

print(transcription.text)
```

```python
# Verbose transcription with timestamps
from pathlib import Path

from polargrid.types import VerboseTranscriptionResponse

verbose = await client.transcribe(
    file=Path("recording.mp3"),
    request={
        "model": "whisper-1",
        "response_format": "verbose_json",
    }
)

if isinstance(verbose, VerboseTranscriptionResponse):
    for segment in verbose.segments:
        print(f"[{segment.start}s - {segment.end}s]: {segment.text}")
```
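
Segment timestamps map naturally onto subtitle files. The following sketch turns segments into SubRip (SRT) text. It assumes `segment.start` and `segment.end` are seconds expressed as numbers, as the print statement above suggests. `srt_timestamp` and `segments_to_srt` are hypothetical helpers, not SDK functions, and the sample segments are stand-ins.

```python
from types import SimpleNamespace


def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render segments (with .start, .end, .text) as numbered SRT blocks."""
    blocks = []
    for number, seg in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(blocks)


# Stand-in segments shaped like the entries of verbose.segments
sample = [
    SimpleNamespace(start=0.0, end=2.5, text=" Hello from PolarGrid!"),
    SimpleNamespace(start=2.5, end=4.0, text=" Second segment."),
]
print(segments_to_srt(sample))
```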

```python
from pathlib import Path

# Translate to English
translation = await client.translate(
    file=Path("spanish_audio.mp3"),
    request={
        "model": "whisper-1",
        "response_format": "json",
    }
)

print(translation.text)
```

### Model Management

```python
# List available models
response = await client.list_models()
for model in response.data:
    print(f"{model.id} ({model.owned_by})")
```

```python
# Load a model
result = await client.load_model({
    "model_name": "llama-3.1-70b",
    "force_reload": False,
})

print(result.message)
```

```python
# Check model status
status = await client.get_model_status()
print("Loaded models:", status.loaded)
print("Loading status:", status.loading_status)
```

```python
# Unload a model
await client.unload_model({"model_name": "gpt2"})

# Unload all models
result = await client.unload_all_models()
print(f"Unloaded {result.total_unloaded} models")
```

### GPU Management

```python
# Get detailed GPU status
gpu_status = await client.get_gpu_status()
for gpu in gpu_status.gpus:
    print(f"GPU {gpu.index}: {gpu.name}")
    print(f"  Memory: {gpu.memory.used_gb}GB / {gpu.memory.total_gb}GB")
    print(f"  Utilization: {gpu.utilization.gpu_percent}%")
    print(f"  Temperature: {gpu.temperature_c}°C")
```
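
The per-GPU fields can drive simple placement decisions, such as choosing where to load the next model. This is a minimal sketch that relies only on the `memory.used_gb` and `memory.total_gb` fields shown above. `gpu_with_most_free_memory` and `make_gpu` are hypothetical names, and the sample entries are stand-ins for `gpu_status.gpus`.

```python
from types import SimpleNamespace


def gpu_with_most_free_memory(gpus):
    """Return the GPU entry with the largest free memory, or None if empty."""
    return max(
        gpus,
        key=lambda gpu: gpu.memory.total_gb - gpu.memory.used_gb,
        default=None,
    )


def make_gpu(index, used_gb, total_gb):
    # Hypothetical stand-in shaped like an entry of gpu_status.gpus
    memory = SimpleNamespace(used_gb=used_gb, total_gb=total_gb)
    return SimpleNamespace(index=index, memory=memory)


sample = [make_gpu(0, 70.0, 80.0), make_gpu(1, 12.5, 80.0)]
best = gpu_with_most_free_memory(sample)
print(best.index)
```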

```python
# Get simplified memory info
memory = await client.get_gpu_memory()
print(f"Memory used: {memory.memory[0].used_gb}GB ({memory.memory[0].percent_used}%)")
```

```python
# Purge GPU memory
purge_result = await client.purge_gpu({"force": False})
print(f"Freed {purge_result.memory_freed_gb}GB")
print(f"Unloaded models: {purge_result.models_unloaded}")
print(purge_result.recommendation)
```

### Health Check

```python
health = await client.health()
print(f"Status: {health.status}")
print(f"Backend healthy: {health.backend.healthy}")
print(f"Models loaded: {health.backend.info.models_loaded}")
```

## Error Handling

```python
from polargrid import (
    PolarGrid,
    is_polargrid_error,
    AuthenticationError,
    ValidationError,
    RateLimitError,
    ServerError,
    NetworkError,
)

try:
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [{"role": "user", "content": "Hello"}],
    })
except Exception as error:
    if is_polargrid_error(error):
        print(f"PolarGrid Error: {error.message}")
        print(f"Request ID: {error.request_id}")

        if isinstance(error, AuthenticationError):
            # Handle auth errors
            pass
        elif isinstance(error, ValidationError):
            # Handle validation errors
            print("Details:", error.details)
        elif isinstance(error, RateLimitError):
            # Handle rate limits
            print(f"Retry after: {error.retry_after}s")
    else:
        # Not a PolarGrid error: re-raise instead of swallowing it
        raise
```

## Configuration Options

```python
client = PolarGrid(
    # API key (required for production, optional for mock mode)
    api_key="pg_your_api_key",

    # Base URL (default: https://api.polargrid.ai)
    base_url="https://api.polargrid.ai",

    # JWT token exchange URL (default: /api/auth/inference-token)
    auth_url="/api/auth/inference-token",

    # Request timeout in seconds (default: 30.0)
    timeout=30.0,

    # Maximum retry attempts (default: 3)
    max_retries=3,

    # Enable debug logging (default: False)
    debug=True,

    # Use mock data instead of real API (default: False)
    use_mock_data=True,
)
```
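
The SDK retries failed requests with exponential backoff, bounded by `max_retries`. The exact schedule is not documented here, so the sketch below only illustrates a common scheme. `backoff_delays`, `base_delay`, `factor`, and `max_delay` are assumptions for illustration and may not match the SDK's internals.

```python
def backoff_delays(max_retries=3, base_delay=0.5, factor=2.0, max_delay=30.0):
    """Delay in seconds before each retry: base_delay * factor**attempt, capped."""
    return [
        min(base_delay * factor**attempt, max_delay)
        for attempt in range(max_retries)
    ]


print(backoff_delays())  # [0.5, 1.0, 2.0]
```

Under these assumed values, three retries add at most 3.5 seconds of waiting on top of the per-request `timeout`.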

## Mock Data for Development

The SDK includes comprehensive mock data that matches the API spec.

### Why Use Mock Data?

1. **Frontend Development**: Build UI components before the backend is ready
2. **Testing**: Predictable responses for unit tests
3. **Demos**: Show realistic flows without production infrastructure
4. **Development**: Faster iteration without API calls

### What's Mocked?

- ✅ All text inference endpoints with realistic responses
- ✅ Voice TTS and STT with proper audio formats
- ✅ Model management with state simulation
- ✅ GPU metrics with realistic utilization data
- ✅ Streaming responses (both text and audio)

## Environment Variables

```bash
# API Key
export POLARGRID_API_KEY=pg_your_api_key

# Base URL (optional)
export NEXT_PUBLIC_INFERENCE_URL=https://api.polargrid.ai
```
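
This section does not say whether the client reads these variables on its own, so passing them explicitly is the safer route. The sketch below builds constructor arguments from the environment. `client_options_from_env` is a hypothetical helper, and falling back to mock mode when no key is set is an assumption of this example, not SDK behavior.

```python
import os


def client_options_from_env(env=os.environ):
    """Build keyword arguments for PolarGrid(...) from environment variables."""
    options = {}
    api_key = env.get("POLARGRID_API_KEY")
    if api_key:
        options["api_key"] = api_key
    base_url = env.get("NEXT_PUBLIC_INFERENCE_URL")
    if base_url:
        options["base_url"] = base_url.rstrip("/")
    # Assumption of this sketch: with no API key, fall back to mock mode
    options["use_mock_data"] = "api_key" not in options
    return options


print(client_options_from_env({"POLARGRID_API_KEY": "pg_your_api_key"}))
```

The result can then be unpacked into the constructor, for example `client = PolarGrid(**client_options_from_env())`.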

## Type Support

Full type hints with Pydantic models:

```python
from polargrid.types import (
    ChatCompletionRequest,
    ChatCompletionResponse,
    ModelInfo,
    GPUStatusResponse,
)
```

## Best Practices

### 1. Use Mock Data During Development

```python
import os

is_development = os.environ.get("ENV") == "development"

client = PolarGrid(
    api_key=os.environ.get("POLARGRID_API_KEY"),
    use_mock_data=is_development,
    debug=is_development,
)
```

### 2. Handle Errors Gracefully

```python
import asyncio

from polargrid import RateLimitError

async def with_retry(request):
    try:
        return await client.chat_completion(request)
    except RateLimitError as e:
        # Wait for the server-suggested delay (or 60s), then retry once
        await asyncio.sleep(e.retry_after or 60)
        return await client.chat_completion(request)
```
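
A single retry can still fail under sustained rate limiting. The sketch below generalizes the idea to a bounded number of attempts. To stay runnable without the SDK it defines a local stand-in `RateLimitError` that mirrors only the `retry_after` attribute used above. `call_with_retries`, `max_attempts`, and `default_wait` are hypothetical names.

```python
import asyncio


class RateLimitError(Exception):
    """Hypothetical stand-in for polargrid.RateLimitError."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


async def call_with_retries(call, max_attempts=3, default_wait=60.0):
    """Await call() until it succeeds, sleeping between rate-limited attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except RateLimitError as error:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            wait = default_wait if error.retry_after is None else error.retry_after
            await asyncio.sleep(wait)


calls = {"count": 0}


async def flaky_call():
    # Fails twice with a rate limit, then succeeds
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError(retry_after=0)
    return "ok"


result = asyncio.run(call_with_retries(flaky_call))
print(result, calls["count"])
```

With the real client, the wrapped call would presumably look like `await call_with_retries(lambda: client.chat_completion(request))`, importing `RateLimitError` from `polargrid` instead of defining it locally.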

### 3. Use Streaming for Long Responses

```python
# Better user experience for long-form content.
# `request` is a chat request dict; `update_ui` is your own display callback.
async for chunk in client.chat_completion_stream(request):
    if chunk.choices[0].delta.content:
        update_ui(chunk.choices[0].delta.content)
```

## Development

```bash
# Install dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov

# Type checking
mypy src/polargrid

# Linting
ruff check src/polargrid
```

## License

MIT

## Support

- Documentation: https://docs.polargrid.ai
- Issues: https://github.com/your-org/polargrid-sdk/issues
- Email: support@polargrid.ai

Made with ❄️ by the PolarGrid team.