Metadata-Version: 2.4
Name: llm-ledger
Version: 0.1.0
Summary: Production-ready LLM client with audit trail, deterministic caching, and provenance tracking
Project-URL: Homepage, https://github.com/sireto/llm-ledger
Project-URL: Source, https://github.com/sireto/llm-ledger
Project-URL: Issues, https://github.com/sireto/llm-ledger/issues
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: litellm>=1.30.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: httpx>=0.25.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: sqlalchemy>=2.0.0
Requires-Dist: redis>=5.0.0
Requires-Dist: build>=1.3.0
Requires-Dist: twine>=6.2.0
Provides-Extra: dev
Requires-Dist: pytest>=7.4.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"

# LLM Ledger

A production-ready Python SDK that wraps LiteLLM and adds deterministic caching, provenance tracking, and a full audit trail.

## Features

- ✅ **Deterministic Caching** - SHA256-based content hashing for reproducible results
- ✅ **Provenance Tracking** - Full metadata and source tracking
- ✅ **Audit Trail** - Complete ledger of all LLM interactions
- ✅ **Token Accounting** - Automatic token usage and cost estimation
- ✅ **Multi-Provider** - Supports all LiteLLM providers (Anthropic, OpenAI, Azure, etc.)
- ✅ **Request Persistence** - SQLite/PostgreSQL storage with full audit trail
- ✅ **Retry Logic** - Automatic retry with exponential backoff via LiteLLM
- ✅ **Async Support** - Full async/await support for high throughput
- ✅ **Type Safe** - Complete type hints with Pydantic models

## Installation

```bash
pip install llm-ledger
```

Or install from source:
```bash
git clone https://github.com/sireto/llm-ledger
cd llm-ledger
pip install -e .
```

## Quick Start

### Basic Usage

```python
from llm_ledger import LedgerClient

client = LedgerClient()
response = client.quick_complete("Explain quantum computing in simple terms")
print(response)
```

### Fluent Builder Pattern

```python
response = (
    client.completion()
    .model("claude-sonnet-4")
    .system("You are a helpful assistant")
    .user("Summarize the key points from this document...")
    .temperature(0.0)
    .max_tokens(4000)
    .with_metadata(
        workflow_id="document_processing",
        chunk_id="section_3"
    )
    .execute()
)

print(f"Response: {response.content}")
print(f"Tokens: {response.usage.total_tokens}")
print(f"Cost: ${response.cost_estimate:.4f}")
```

### With Full Provenance

```python
from llm_ledger import LLMRequest, ProvenanceMetadata

metadata = ProvenanceMetadata(
    workflow_id="data_processing",
    chunk_id="chunk_12",
    section_id="section_3.2",
    source_id="document_v2.json",
    character_range=(1500, 2300),
    tags={"stage": "extraction"}
)

request = LLMRequest(
    model="claude-sonnet-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Analyze this text..."}
    ],
    temperature=0.0,
    metadata=metadata
)

response = client.complete(request)
```

### Async Usage

```python
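import asyncio

from llm_ledger import LedgerClient

client = LedgerClient()

async def process_chunks(chunks):
    # create_request is your own helper that builds an LLMRequest for one chunk
    requests = [create_request(chunk) for chunk in chunks]
    # Run all requests concurrently; results come back in input order
    return await asyncio.gather(
        *(client.complete_async(req) for req in requests)
    )

# chunks is your own list of inputs
responses = asyncio.run(process_chunks(chunks))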
```

## Configuration

### Environment Variables

Create a `.env` file:

```bash
# API Keys
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...

# Database
LLM_DATABASE_URL=postgresql://user:pass@localhost/llm_gateway

# Cache
LLM_CACHE_BACKEND=redis  # or "memory"
REDIS_URL=redis://localhost:6379/0
LLM_CACHE_TTL=3600  # seconds, 0 = no expiration

# Features
LLM_ENABLE_CACHE=true
LLM_ENABLE_PERSISTENCE=true

# Defaults
LLM_DEFAULT_MODEL=claude-sonnet-4
LLM_DEFAULT_TEMPERATURE=0.0
LLM_DEFAULT_MAX_TOKENS=4000
```
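
As a rough illustration of how these variables could map to typed settings, the sketch below reads them with the standard library only. `LedgerSettings`, `load_settings`, and `_as_bool` are hypothetical names for this example and are not part of the llm-ledger API. The fallback values simply mirror the sample `.env` file above, and the rule for parsing booleans is an assumption.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerSettings:
    """Hypothetical container for the variables listed above."""
    cache_backend: str
    cache_ttl: int
    enable_cache: bool
    enable_persistence: bool
    default_model: str
    default_temperature: float
    default_max_tokens: int


def _as_bool(value: str) -> bool:
    # Assumption: "true", "1", and "yes" (any case) count as enabled
    return value.strip().lower() in {"true", "1", "yes"}


def load_settings(env: Mapping[str, str] = os.environ) -> LedgerSettings:
    # Fallbacks mirror the sample .env file; the library's real defaults may differ
    return LedgerSettings(
        cache_backend=env.get("LLM_CACHE_BACKEND", "memory"),
        cache_ttl=int(env.get("LLM_CACHE_TTL", "3600")),
        enable_cache=_as_bool(env.get("LLM_ENABLE_CACHE", "true")),
        enable_persistence=_as_bool(env.get("LLM_ENABLE_PERSISTENCE", "true")),
        default_model=env.get("LLM_DEFAULT_MODEL", "claude-sonnet-4"),
        default_temperature=float(env.get("LLM_DEFAULT_TEMPERATURE", "0.0")),
        default_max_tokens=int(env.get("LLM_DEFAULT_MAX_TOKENS", "4000")),
    )


# Passing a plain dict instead of os.environ keeps the example reproducible
settings = load_settings({"LLM_CACHE_BACKEND": "redis", "LLM_CACHE_TTL": "0"})
print(settings.cache_backend, settings.cache_ttl, settings.default_model)
```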

### Programmatic Configuration

```python
from llm_ledger import LedgerClient
from llm_ledger.cache import RedisCache

client = LedgerClient(
    cache_backend=RedisCache("redis://localhost:6379/0"),
    database_url="postgresql://localhost/llm_gateway",
    enable_cache=True,
    enable_persistence=True,
    default_model="claude-sonnet-4"
)
```

## Caching

The SDK derives a SHA256 cache key from the request content, so an identical request (same prompt and parameters) returns the cached response:

```python
# First call - executes against LLM
response1 = client.quick_complete("What is machine learning?", temperature=0.0)
print(response1.from_cache)  # False
print(response1.latency_ms)  # e.g., 1200ms

# Second call - returns from cache
response2 = client.quick_complete("What is machine learning?", temperature=0.0)
print(response2.from_cache)  # True
print(response2.cache_key)   # SHA256 hash

# Different parameters = cache miss
response3 = client.quick_complete("What is machine learning?", temperature=0.5)
print(response3.from_cache)  # False
```

Cache statistics:
```python
stats = client.get_stats()
print(stats["cache_hit_rate"])  # e.g., 0.67
```
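
To make the idea of content-based keys concrete, here is a minimal sketch of one way such a key could be computed. It is not llm-ledger's actual implementation: `make_cache_key` and `hit_rate` are hypothetical helpers, and the choice of canonical JSON as the hashed representation is an assumption.

```python
import hashlib
import json


def make_cache_key(model, messages, **params):
    """Hypothetical helper: hash every field that can influence the output."""
    payload = {"model": model, "messages": messages, "params": params}
    # sort_keys plus fixed separators give a canonical serialization, so
    # logically equal requests always produce the same bytes
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hit_rate(hits, misses):
    """Fraction of lookups served from cache (0.0 when there were none)."""
    total = hits + misses
    return hits / total if total else 0.0


messages = [{"role": "user", "content": "What is machine learning?"}]
key_a = make_cache_key("claude-sonnet-4", messages, temperature=0.0, max_tokens=4000)
# Same parameters in a different keyword order: same key
key_b = make_cache_key("claude-sonnet-4", messages, max_tokens=4000, temperature=0.0)
# Different temperature: different key, so this would be a cache miss
key_c = make_cache_key("claude-sonnet-4", messages, temperature=0.5, max_tokens=4000)

print(key_a == key_b, key_a == key_c)
```

Hashing a canonical serialization rather than the raw Python objects is what makes the key stable across processes and keyword orderings.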

## Persistence & Querying

When persistence is enabled, all requests and responses are stored automatically and can be queried by their provenance metadata:

```python
from datetime import datetime

# Query by provenance metadata
requests = client.query_requests(
    workflow_id="data_processing",
    chunk_id="chunk_12"
)

# Retrieve a specific request or response by the ID of a stored record
request = client.get_request(request_id)
response = client.get_response(response_id)

# Summarize token usage over a date range
usage = client.persistence.get_token_usage_summary(
    start_date=datetime(2024, 1, 1),
    end_date=datetime(2024, 12, 31)
)
```
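
For readers who want a mental model of what such a ledger involves, the sketch below builds a tiny audit table in an in-memory SQLite database and runs the two kinds of query shown above. The table layout, the function names, and the sample rows are all invented for illustration and say nothing about llm-ledger's real schema.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE ledger (
        request_id   TEXT PRIMARY KEY,
        workflow_id  TEXT,
        chunk_id     TEXT,
        created_at   TEXT,     -- ISO 8601 timestamp
        total_tokens INTEGER
    )
    """
)
# Made-up sample rows, only here so the queries return something
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?, ?, ?, ?)",
    [
        ("req-1", "data_processing", "chunk_12", "2024-03-01T10:00:00", 850),
        ("req-2", "data_processing", "chunk_13", "2024-03-01T10:05:00", 640),
        ("req-3", "summaries", "chunk_01", "2025-01-15T09:00:00", 300),
    ],
)


def query_requests(conn, **filters):
    """Return request IDs whose metadata columns equal the given values."""
    allowed = {"workflow_id", "chunk_id"}  # whitelist: column names are not user input
    unknown = set(filters) - allowed
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    where = " AND ".join(f"{col} = ?" for col in filters) or "1 = 1"
    sql = f"SELECT request_id FROM ledger WHERE {where} ORDER BY request_id"
    return [row[0] for row in conn.execute(sql, tuple(filters.values()))]


def token_usage(conn, start_date, end_date):
    """Sum total_tokens for rows created within [start_date, end_date]."""
    sql = (
        "SELECT COALESCE(SUM(total_tokens), 0) FROM ledger "
        "WHERE created_at BETWEEN ? AND ?"
    )
    # ISO 8601 strings in one format sort chronologically, so BETWEEN works on text
    args = (start_date.isoformat(), end_date.isoformat())
    return conn.execute(sql, args).fetchone()[0]


matches = query_requests(conn, workflow_id="data_processing", chunk_id="chunk_12")
tokens_2024 = token_usage(conn, datetime(2024, 1, 1), datetime(2024, 12, 31))
print(matches, tokens_2024)
```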

## Advanced Features

### Custom Retry Logic

LiteLLM handles retries automatically. Set the retry count and timeout per call:

```python
response = client.complete(
    request,
    num_retries=5,
    timeout=600  # 10 minutes
)
```
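
The retries themselves are delegated to LiteLLM, so the following is only a generic sketch of exponential backoff with jitter, written to show what the technique does. `retry_with_backoff` and its parameters are hypothetical and do not describe LiteLLM's internals.

```python
import random
import time


def retry_with_backoff(func, num_retries=5, base_delay=0.5, max_delay=30.0,
                       sleep=time.sleep):
    """Call func(); after each failure wait up to base_delay * 2**attempt, then retry."""
    for attempt in range(num_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == num_retries:
                raise  # retries exhausted: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads out clients that failed at the same moment
            sleep(random.uniform(0, delay))


attempts = []


def flaky_call():
    # Stand-in for an LLM call that fails twice before succeeding
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient failure")
    return "ok"


# A no-op sleep keeps the demo instant; omit it to use real delays
result = retry_with_backoff(flaky_call, num_retries=5, sleep=lambda seconds: None)
print(result, len(attempts))
```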

### Batch Processing

```python
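import asyncio

async def run_batch(texts):
    # create_request is your own helper; `await` must live in an async function
    requests = [create_request(text) for text in texts]
    return await asyncio.gather(*(client.complete_async(r) for r in requests))

responses = asyncio.run(run_batch(texts))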
```
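
Launching every request at once can run into provider rate limits on large batches. One common way to cap the number of in-flight calls is an `asyncio.Semaphore`, sketched below. `gather_bounded` is a hypothetical helper, and `fake_complete` is a stand-in for `client.complete_async(request)` so the example runs without an API key.

```python
import asyncio
from functools import partial


async def gather_bounded(factories, limit=5):
    """Await zero-argument coroutine factories with at most `limit` running at once."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(factory):
        async with semaphore:  # blocks while `limit` calls are already in flight
            return await factory()

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(run_one(f) for f in factories))


async def fake_complete(text):
    # Stand-in for client.complete_async(request)
    await asyncio.sleep(0.01)
    return text.upper()


texts = ["alpha", "beta", "gamma"]
results = asyncio.run(
    gather_bounded([partial(fake_complete, t) for t in texts], limit=2)
)
print(results)
```

Passing factories rather than already-created coroutines means no call starts until a semaphore slot is free.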

### Cache Control

```python
# Bypass cache
response = client.complete(request, use_cache=False)

# Clear entire cache
client.clear_cache()

# Invalidate specific request
client.cache.invalidate(request)
```
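
The operations above (lookup, invalidate, clear) together with the `LLM_CACHE_TTL` setting can be pictured with a small in-memory cache. This is a sketch under assumptions, not the library's `memory` backend: `TTLMemoryCache` is a hypothetical class, it is keyed by a plain string rather than a request object, and it only borrows the documented convention that a TTL of 0 means no expiration.

```python
import time


class TTLMemoryCache:
    """Hypothetical in-memory cache; ttl=0 means entries never expire."""

    def __init__(self, ttl=3600, clock=time.monotonic):
        self.ttl = ttl
        self._clock = clock
        self._store = {}  # key -> (value, stored_at)

    def set(self, key, value):
        self._store[key] = (value, self._clock())

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if self.ttl and self._clock() - stored_at >= self.ttl:
            del self._store[key]  # expired: drop lazily on read
            return None
        return value

    def invalidate(self, key):
        self._store.pop(key, None)

    def clear(self):
        self._store.clear()


# A controllable clock makes expiry easy to demonstrate without sleeping
now = [0.0]
cache = TTLMemoryCache(ttl=10, clock=lambda: now[0])
cache.set("key-1", "cached response")
fresh = cache.get("key-1")    # still within the TTL
now[0] = 10.0
expired = cache.get("key-1")  # TTL elapsed, entry is gone
print(fresh, expired)
```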

## Production Deployment

### With PostgreSQL and Redis

```python
from llm_ledger import LedgerClient
from llm_ledger.cache import RedisCache

client = LedgerClient(
    cache_backend=RedisCache("redis://prod-redis:6379/0"),
    database_url="postgresql://user:pass@prod-db/llm_gateway",
    enable_cache=True,
    enable_persistence=True,
    cache_ttl=86400  # 24 hours
)
```

### With Docker Compose

```yaml
services:
  app:
    environment:
      - LLM_DATABASE_URL=postgresql://postgres:password@db/llm_gateway
      - REDIS_URL=redis://redis:6379/0
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
  
  db:
    image: postgres:15
    
  redis:
    image: redis:7
```

## Testing

```python
from llm_ledger.testing import MockLLMClient

# For unit tests
mock_client = MockLLMClient()
mock_client.add_response("test prompt", "test response")

response = mock_client.quick_complete("test prompt")
assert response == "test response"
```

## License

Apache License 2.0

## Contributing

Contributions are welcome. See `CONTRIBUTING.md` for guidelines.
