Metadata-Version: 2.4
Name: lexigram-ai-llm
Version: 0.1.1
Summary: LLM client layer for the Lexigram Framework — OpenAI, Anthropic, Ollama, Cohere, Groq, Mistral
Project-URL: Homepage, https://github.com/lexigram-dev/lexigram
Project-URL: Repository, https://github.com/lexigram-dev/lexigram
Project-URL: Documentation, https://docs.lexigram.dev
Project-URL: Issues, https://github.com/lexigram-dev/lexigram/issues
Project-URL: Changelog, https://github.com/lexigram-dev/lexigram/blob/main/CHANGELOG.md
Author-email: Lexigram Framework Team <team@lexigram.dev>
Maintainer-email: Lexigram Framework Team <team@lexigram.dev>
License: MIT
License-File: LICENSE
Keywords: ai,async,framework,language-model,lexigram,llm,python
Classifier: Development Status :: 4 - Beta
Classifier: Framework :: AsyncIO
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Typing :: Typed
Requires-Python: >=3.11
Requires-Dist: lexigram-contracts>=0.1.0
Requires-Dist: lexigram>=0.1.1
Requires-Dist: typing-extensions>=4.0.0
Provides-Extra: admin
Requires-Dist: lexigram-admin>=0.1.1; extra == 'admin'
Provides-Extra: all
Requires-Dist: anthropic>=0.20.0; extra == 'all'
Requires-Dist: cohere>=4.0.0; extra == 'all'
Requires-Dist: groq>=0.4.0; extra == 'all'
Requires-Dist: instructor>=1.0.0; extra == 'all'
Requires-Dist: jinja2>=3.1.0; extra == 'all'
Requires-Dist: mistralai>=0.1.0; extra == 'all'
Requires-Dist: ollama>=0.1.0; extra == 'all'
Requires-Dist: openai>=1.0.0; extra == 'all'
Requires-Dist: redis>=5.0.0; extra == 'all'
Requires-Dist: tiktoken>=0.5.0; extra == 'all'
Requires-Dist: transformers>=4.30.0; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.20.0; extra == 'anthropic'
Provides-Extra: cache
Requires-Dist: redis>=5.0.0; extra == 'cache'
Provides-Extra: cohere
Requires-Dist: cohere>=4.0.0; extra == 'cohere'
Provides-Extra: dev
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: ruff>=0.8.0; extra == 'dev'
Provides-Extra: groq
Requires-Dist: groq>=0.4.0; extra == 'groq'
Provides-Extra: huggingface
Requires-Dist: transformers>=4.30.0; extra == 'huggingface'
Provides-Extra: mistral
Requires-Dist: mistralai>=0.1.0; extra == 'mistral'
Provides-Extra: ollama
Requires-Dist: ollama>=0.1.0; extra == 'ollama'
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == 'openai'
Requires-Dist: tiktoken>=0.5.0; extra == 'openai'
Provides-Extra: prompts
Requires-Dist: jinja2>=3.1.0; extra == 'prompts'
Provides-Extra: structured
Requires-Dist: instructor>=1.0.0; extra == 'structured'
Provides-Extra: test
Requires-Dist: lexigram-testing>=0.1.1; extra == 'test'
Requires-Dist: pytest-asyncio>=0.23.0; extra == 'test'
Requires-Dist: pytest-cov>=4.0.0; extra == 'test'
Requires-Dist: pytest-mock>=3.10.0; extra == 'test'
Requires-Dist: pytest>=8.0.0; extra == 'test'
Description-Content-Type: text/markdown

# lexigram-ai-llm

LLM client layer for the Lexigram Framework — OpenAI, Anthropic, Ollama, Cohere, Groq, Mistral

---

## Overview

`lexigram-ai-llm` provides typed, async-first clients for 18 providers, along with multi-provider routing, thinking/reasoning control, structured extraction, streaming, embeddings, and model management. Everything is wired through the DI container via `LLMModule`, and zero-config usage starts with sensible defaults.


> Full documentation: [docs.lexigram.dev](https://docs.lexigram.dev)

## Install

```bash
uv add lexigram-ai-llm
# Optional extras
uv add "lexigram-ai-llm[openai,anthropic,ollama]"
```

## Quick Start

```python
from lexigram import Application
from lexigram.di.module import Module, module

from lexigram.ai.llm import LLMModule
from lexigram.ai.llm.config import ClientConfig

@module(imports=[
    LLMModule.configure(
        ClientConfig(provider="anthropic", model="claude-sonnet-4-6")
    )
])
class AppModule(Module):
    pass

app = Application(modules=[AppModule])
if __name__ == "__main__":
    app.run()
```

## Configuration

> **Zero-config usage:** Call `LLMModule.configure()` with no arguments to use defaults.

### Option 1 — YAML file

```yaml
# application.yaml
ai_llm:
  provider: "anthropic"
  model: "claude-sonnet-4-6"
  api_key: "${LEX_AI_LLM__API_KEY}"
  temperature: 0.7
  max_tokens: null
```

### Option 2 — Profiles + Environment Variables *(recommended)*

```bash
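export LEX_AI_LLM__PROVIDER=anthropic
export LEX_AI_LLM__MODEL=claude-sonnet-4-6
export LEX_AI_LLM__API_KEY="<your-api-key>"
# Each field with an env var in the config reference follows the LEX_AI_LLM__<FIELD> pattern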
```

### Option 3 — Python

```python
from lexigram.ai.llm.config import ClientConfig
from lexigram.ai.llm import LLMModule

config = ClientConfig(
    provider="anthropic",
    model="claude-sonnet-4-6",
)
# Pass the configured module to @module(imports=[...]), as in Quick Start
llm_module = LLMModule.configure(config)
```

### Config reference

| Field | Default | Env var | Description |
|-------|---------|---------|-------------|
| `enabled` | `True` | `LEX_AI_LLM__ENABLED` | Enable the LLM subsystem |
| `provider` | `openai` | `LEX_AI_LLM__PROVIDER` | LLM provider |
| `model` | `gpt-4-turbo` | `LEX_AI_LLM__MODEL` | Model name |
| `api_key` | `None` | `LEX_AI_LLM__API_KEY` | Provider API key |
| `api_base` | `None` | `LEX_AI_LLM__API_BASE` | Custom endpoint (Azure, local, proxy) |
| `temperature` | `0.7` | `LEX_AI_LLM__TEMPERATURE` | Sampling temperature (0.0–2.0) |
| `max_tokens` | `None` | `LEX_AI_LLM__MAX_TOKENS` | Response token limit |
| `timeout` | `60.0` | `LEX_AI_LLM__TIMEOUT` | Request timeout in seconds |
| `enable_cache` | `False` | `LEX_AI_LLM__ENABLE_CACHE` | Cache responses |
| `cache_ttl` | `3600` | `LEX_AI_LLM__CACHE_TTL` | Cache TTL in seconds |
| `thinking` | `None` | — | Reasoning/thinking control configuration |

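The env var column in the table above follows one pattern: the prefix `LEX_AI_LLM__` plus the upper-cased field name. The sketch below reproduces that naming rule and shows how a string from the environment might be coerced and range-checked for `temperature`. Both functions are hypothetical illustrations; how Lexigram itself parses and validates these values is not documented here.

```python
from collections.abc import Mapping

ENV_PREFIX = "LEX_AI_LLM__"


def env_var_name(field: str) -> str:
    """Derive the env var name for a config field, per the table's pattern."""
    return ENV_PREFIX + field.upper()


def read_temperature(env: Mapping[str, str], default: float = 0.7) -> float:
    """Read the temperature from `env`, falling back to the documented default.

    Hypothetical helper: enforces the 0.0-2.0 range listed in the table.
    """
    raw = env.get(env_var_name("temperature"))
    value = default if raw is None else float(raw)
    if not 0.0 <= value <= 2.0:
        raise ValueError(f"temperature must be within 0.0-2.0, got {value}")
    return value


if __name__ == "__main__":
    print(env_var_name("max_tokens"))
    print(read_temperature({"LEX_AI_LLM__TEMPERATURE": "1.2"}))
```
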
## Module Factory Methods

| Method | Description |
|--------|-------------|
| `LLMModule.configure(config)` | Single-provider client |
| `LLMModule.configure(routing=LLMConfig())` | Multi-provider routing cascade |
| `LLMModule.stub()` | No-op client for tests |

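To make the "routing cascade" in the table above concrete, here is a minimal, library-independent sketch of sequential fallback: providers are tried in order and the first successful response wins. All names (`complete_with_fallback`, `AllProvidersFailed`, and the two demo providers) are hypothetical and do not reflect how `LLMConfig` routing is implemented.

```python
import asyncio
from collections.abc import Awaitable, Callable, Sequence

# A provider is modeled as an async function from prompt to completion text.
Completion = Callable[[str], Awaitable[str]]


class AllProvidersFailed(Exception):
    """Raised when every provider in the cascade has failed."""


async def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Completion]]
) -> tuple[str, str]:
    """Try each (name, provider) in order; return (name, text) of the first success."""
    errors: list[str] = []
    for name, complete in providers:
        try:
            return name, await complete(prompt)
        except Exception as exc:  # a real router would likely catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


async def flaky(prompt: str) -> str:
    """Demo provider that always fails."""
    raise TimeoutError("upstream timed out")


async def echo(prompt: str) -> str:
    """Demo provider that always succeeds."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    result = asyncio.run(
        complete_with_fallback("hi", [("primary", flaky), ("backup", echo)])
    )
    print(result)
```

Cost-optimized or latency-optimized strategies could presumably reuse the same loop with the provider list sorted by a different key, though that is an assumption about the design rather than something the README states.
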
## Key Features

- **18 providers**: OpenAI, Anthropic, Google Gemini, Azure, Ollama, Groq, Mistral, Cohere, and more
- **Multi-provider routing**: Sequential, cost-optimized, and latency-optimized strategies
- **Thinking/reasoning control**: Extended thinking with token budget and suppression
- **Structured extraction**: JSON schema and Pydantic model extraction
- **Streaming**: Async streaming response support
- **Embeddings**: Text embedding client that uses the same provider
- **Caching**: Response-level caching with configurable TTL

## Testing

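```python
from lexigram import Application
from lexigram.ai.llm import LLMModule

# Run under pytest-asyncio (included in the `test` extra)
async def test_with_stub_llm():
    async with Application.boot(modules=[LLMModule.stub()]) as app:
        # your test code
        ...
```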

## Key Source Files

| File | What it contains |
|------|-----------------|
| `src/lexigram/ai/llm/module.py` | `LLMModule.configure()` and `LLMModule.stub()` |
| `src/lexigram/ai/llm/config.py` | `ClientConfig` |
| `src/lexigram/ai/llm/routing/config.py` | `LLMConfig`, `ProviderConfig` for routing |
| `src/lexigram/ai/llm/di/provider.py` | `LLMProvider` — registers and boots the client |
| `src/lexigram/ai/llm/clients/` | Provider implementations |
| `src/lexigram/ai/llm/thinking/` | `ThinkingConfig` handling and suppression |
| `src/lexigram/ai/llm/exceptions.py` | Full exception hierarchy |
