Metadata-Version: 2.4
Name: llama-index-llms-grok
Version: 0.1.0
Summary: LlamaIndex integration for xAI Grok models using the official xai-sdk
License: MIT
License-File: LICENSE
Keywords: llama-index,grok,xai,llm,ai,machine-learning
Author: Jose Medina
Author-email: josemedina@gmail.com
Requires-Python: >=3.10
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Dist: llama-index-core (>=0.14.8)
Requires-Dist: xai-sdk (>=1.4.0,<2.0.0)
Project-URL: Bug Tracker, https://github.com/josemedina/llama-index-llms-grok/issues
Project-URL: Documentation, https://github.com/josemedina/llama-index-llms-grok#readme
Project-URL: Homepage, https://github.com/josemedina/llama-index-llms-grok
Project-URL: Repository, https://github.com/josemedina/llama-index-llms-grok
Description-Content-Type: text/markdown

# llama-index-llms-grok

LlamaIndex integration for xAI's Grok models using the official `xai-sdk`.

This library provides native support for the latest Grok models (including the Grok 4 and Grok 4.1 fast models, with and without reasoning) through xAI's modern Chat API, rather than the older OpenAI-compatible completions endpoint.

## Installation

```bash
pip install llama-index-llms-grok
```

## Setup

Get your API key from [console.x.ai](https://console.x.ai/) and set it as an environment variable:

```bash
export XAI_API_KEY=your_api_key_here
```
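
If the variable is set, the client picks it up automatically and no `api_key` argument is needed:

```python
from llama_index_llms_grok import Grok

# Reads XAI_API_KEY from the environment
llm = Grok()
```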

## Usage

### Basic Chat

```python
from llama_index_llms_grok import Grok
from llama_index.core.llms import ChatMessage

# Initialize with the default model (grok-4-1-fast-reasoning)
llm = Grok(api_key="your_api_key")  # or set XAI_API_KEY env var

# Chat
messages = [
    ChatMessage(role="system", content="You are a helpful assistant."),
    ChatMessage(role="user", content="Explain quantum computing briefly."),
]
response = llm.chat(messages)
print(response.message.content)
```

### Using Grok Fast (Non-Reasoning)

```python
from llama_index_llms_grok import GrokFast

llm = GrokFast()  # Uses grok-4-1-fast-non-reasoning model
response = llm.complete("What is the capital of France?")
print(response.text)
```

### Using Grok with Reasoning Mode

```python
from llama_index_llms_grok import GrokReasoning

# Reasoning models may take longer, so timeout is set to 3600s by default
llm = GrokReasoning(show_reasoning=True)  # Set to True to see thinking process
response = llm.complete("Solve this logic puzzle: ...")
print(response.text)
```

### Using Grok for Code

```python
from llama_index_llms_grok import GrokCode

llm = GrokCode()  # Uses grok-code-fast-1 model
response = llm.complete("Write a Python function to calculate fibonacci numbers.")
print(response.text)
```

### Using Grok Vision

```python
from llama_index_llms_grok import GrokVision

llm = GrokVision()  # Uses grok-2-vision-1212 model
# Accepts image inputs for visual understanding (see the sketch below)
```
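
A minimal image-understanding sketch, assuming `GrokVision` accepts llama-index-core content blocks (`TextBlock`/`ImageBlock`); this block-based calling convention and the image URL are assumptions, not documented behavior of this package:

```python
from llama_index_llms_grok import GrokVision
from llama_index.core.llms import ChatMessage, ImageBlock, TextBlock

llm = GrokVision()
# Assumption: images are passed as llama-index-core content blocks
message = ChatMessage(
    role="user",
    blocks=[
        ImageBlock(url="https://example.com/chart.png"),  # hypothetical URL
        TextBlock(text="Describe what this chart shows."),
    ],
)
response = llm.chat([message])
print(response.message.content)
```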

### Using Grok 3 Models

```python
from llama_index_llms_grok import Grok3, Grok3Mini

# Full Grok 3 model
llm = Grok3()

# Or lightweight Grok 3 Mini
llm_mini = Grok3Mini()
```

### Streaming

```python
from llama_index_llms_grok import Grok
from llama_index.core.llms import ChatMessage

llm = Grok()
messages = [ChatMessage(role="user", content="Tell me a story about AI.")]

for chunk in llm.stream_chat(messages):
    print(chunk.delta, end="", flush=True)
```
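
Async streaming is also available. A minimal sketch, assuming the standard LlamaIndex `astream_chat` interface:

```python
import asyncio

from llama_index_llms_grok import Grok
from llama_index.core.llms import ChatMessage


async def main():
    llm = Grok()
    messages = [ChatMessage(role="user", content="Tell me a story about AI.")]
    # astream_chat returns an async generator of response chunks
    gen = await llm.astream_chat(messages)
    async for chunk in gen:
        print(chunk.delta, end="", flush=True)


asyncio.run(main())
```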

### Custom Parameters

```python
from llama_index_llms_grok import Grok

llm = Grok(
    model="grok-4-1-fast-reasoning",
    temperature=0.7,
    max_tokens=1024,
    timeout=600,
)
```

## Available Models

### Language Models

#### Grok 4.1 (Latest - 2M Context Window)
- **`grok-4-1-fast-reasoning`** - Fast model with reasoning (default)
- **`grok-4-1-fast-non-reasoning`** - Fast model without reasoning (`GrokFast`)

#### Grok 4 (2M Context Window)
- **`grok-4-fast-reasoning`** - Fast Grok 4 model with reasoning
- **`grok-4-fast-non-reasoning`** - Fast Grok 4 model without reasoning

#### Specialized Models
- **`grok-code-fast-1`** - Optimized for code (256K context) (`GrokCode`)
- **`grok-4-0709`** - Specific version (256K context)

#### Grok 3 (131K Context Window)
- **`grok-3`** - Standard Grok 3 model (`Grok3`)
- **`grok-3-mini`** - Lightweight Grok 3 (`Grok3Mini`)

#### Grok 2
- **`grok-2-1212`** - Grok 2 from December 2024 (131K context)
- **`grok-2-vision-1212`** - Vision-enabled Grok 2 (32K context) (`GrokVision`)
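
Any chat model above can also be selected explicitly by passing its identifier to the `model` parameter (see Custom Parameters):

```python
from llama_index_llms_grok import Grok

llm = Grok(model="grok-3-mini")  # any chat model identifier from the list above
```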

### Image Generation Models
- **`grok-2-image-1212`** - Image generation (not yet supported in this package)

## Features

- ✅ Native xAI SDK integration using modern Chat API
- ✅ Support for all Grok models (2, 3, 4, 4.1)
- ✅ 2M context window support for Grok 4.1 models
- ✅ Specialized models: Code, Vision
- ✅ Reasoning and non-reasoning modes
- ✅ Streaming responses
- ✅ Automatic reasoning content handling
- ✅ Full LlamaIndex LLM interface compatibility
- ✅ Type hints and proper error handling
- ✅ Configurable timeouts for long-running reasoning tasks
- ✅ Async/await support

## Advanced Usage

### Accessing Reasoning Content

When using reasoning models with `show_reasoning=False` (the default), the thinking process is stripped from the response but remains accessible via `additional_kwargs`:

```python
from llama_index_llms_grok import GrokReasoning
from llama_index.core.llms import ChatMessage

llm = GrokReasoning(show_reasoning=False)
response = llm.chat([ChatMessage(role="user", content="Complex question...")])

# Access reasoning if available
if "reasoning_content" in response.message.additional_kwargs:
    print("Thinking:", response.message.additional_kwargs["reasoning_content"])
print("Answer:", response.message.content)
```

### Integration with LlamaIndex

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index_llms_grok import Grok

# Load documents
documents = SimpleDirectoryReader("./data").load_data()

# Create the index; Grok is used at query time below
llm = Grok(model="grok-4-1-fast-reasoning")
index = VectorStoreIndex.from_documents(documents)

# Query
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("What are the key points in these documents?")
print(response)
```
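
To make Grok the default LLM across an application, it can also be assigned to llama-index-core's global `Settings`:

```python
from llama_index.core import Settings
from llama_index_llms_grok import Grok

# Subsequent query engines and agents default to Grok
Settings.llm = Grok(model="grok-4-1-fast-non-reasoning")
```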

## Requirements

- Python >=3.10
- `xai-sdk>=1.4.0`
- `llama-index-core>=0.14.8`

## License

MIT

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## Links

- [xAI Documentation](https://docs.x.ai/)
- [LlamaIndex Documentation](https://docs.llamaindex.ai/)
- [GitHub Repository](https://github.com/josemedina/llama-index-llms-grok)

## Differences from llama-index-llms-openai

This integration uses xAI's native SDK instead of OpenAI compatibility mode:

- ✅ Access to latest Grok models immediately
- ✅ Native reasoning mode support
- ✅ Better error handling for xAI-specific features
- ✅ Future-proof as xAI adds new capabilities

