Metadata-Version: 2.4
Name: slim-protocol
Version: 1.0.0b1
Summary: Python SDK for fetching web content in SLIM format - optimized for AI consumption
Project-URL: Homepage, https://github.com/matteuccimarco/slim-client-py
Project-URL: Documentation, https://github.com/matteuccimarco/slim-client-py#readme
Project-URL: Repository, https://github.com/matteuccimarco/slim-client-py
Project-URL: Issues, https://github.com/matteuccimarco/slim-client-py/issues
Author-email: SLIM Protocol Team <team@slim-protocol.dev>
License-Expression: MIT
License-File: LICENSE
Keywords: ai,content-extraction,langchain,llamaindex,llm,slim,slim-protocol,swr,web-scraping
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.9
Requires-Dist: httpx>=0.27.0
Requires-Dist: pydantic>=2.0.0
Provides-Extra: all
Requires-Dist: langchain-core>=0.3.0; extra == 'all'
Requires-Dist: llama-index-core>=0.11.0; extra == 'all'
Provides-Extra: dev
Requires-Dist: mypy>=1.0.0; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.24.0; extra == 'dev'
Requires-Dist: pytest-httpx>=0.30.0; extra == 'dev'
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Requires-Dist: ruff>=0.8.0; extra == 'dev'
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.3.0; extra == 'langchain'
Provides-Extra: llamaindex
Requires-Dist: llama-index-core>=0.11.0; extra == 'llamaindex'
Description-Content-Type: text/markdown

# slim-protocol

Python SDK for fetching web content in **SLIM format** - optimized for AI consumption with ~90% token reduction.

## Features

- **One-line usage** - `slim = fetch_slim(url)`
- **Sync + Async** - `fetch_slim` for blocking code, `async_fetch_slim` for asyncio
- **Full type hints** - Typed package; response models can be imported for annotations
- **Pydantic models** - Validated response types
- **AI integrations** - LangChain and LlamaIndex support
- **Python 3.9+** - Wide compatibility

## Installation

```bash
pip install slim-protocol

# With LangChain integration (quotes keep shells such as zsh from expanding the brackets)
pip install "slim-protocol[langchain]"

# With LlamaIndex integration
pip install "slim-protocol[llamaindex]"

# All integrations
pip install "slim-protocol[all]"
```

## Quick Start

```python
from slim_protocol import fetch_slim

slim = fetch_slim("https://example.com")

# Access structured content
print(slim.payload.l1.title)      # Page title
print(slim.payload.l1.type)       # Content type (article, video, etc.)
print(slim.payload.l5.key_points) # Key points extracted

# Check compression metrics
print(slim.meta.tokens_estimate)     # Estimated tokens
print(slim.meta.compression_ratio)   # Compression achieved
```

## Async Usage

```python
import asyncio

from slim_protocol import async_fetch_slim


async def main():
    # Fetch a single page
    slim = await async_fetch_slim("https://example.com")
    print(slim.payload.l1.title)


async def fetch_many(urls):
    # Parallel fetching: start every request, then wait for all of them
    tasks = [async_fetch_slim(url) for url in urls]
    return await asyncio.gather(*tasks)


asyncio.run(main())
```
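
`asyncio.gather` starts every request at once, which can overwhelm a proxy when the URL list is long. A minimal sketch of capping concurrency with a semaphore follows. The helper name `gather_bounded` and its `limit` parameter are illustrative and not part of the SDK; the fetch coroutine is passed in so the helper works with `async_fetch_slim` or any stand-in.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")


async def gather_bounded(
    fetch: Callable[[str], Awaitable[T]],
    urls: Iterable[str],
    limit: int = 5,
) -> List[T]:
    """Run fetch(url) for every URL with at most `limit` calls in flight.

    Hypothetical helper, not part of slim-protocol.
    """
    semaphore = asyncio.Semaphore(limit)

    async def fetch_one(url: str) -> T:
        # Each call waits here until one of the `limit` slots is free
        async with semaphore:
            return await fetch(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(fetch_one(url) for url in urls))
```

Inside an async function this could be called as `results = await gather_bounded(async_fetch_slim, urls, limit=5)`.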

## API

### `fetch_slim(url, **options)`

Fetch web content in SLIM format (sync).

```python
slim = fetch_slim(
    "https://example.com",
    proxy_url="https://my-proxy.com",  # Override proxy URL
    timeout=60,                         # Timeout in seconds (default: 30)
    include_images=True,                # Include image metadata (default: True)
    include_videos=True,                # Include video metadata (default: True)
)
```

### `async_fetch_slim(url, **options)`

Fetch web content in SLIM format (async).

```python
# Call from inside an async function
slim = await async_fetch_slim("https://example.com", timeout=60)
```

### `configure(**options)`

Configure the SDK globally.

```python
from slim_protocol import configure

configure(
    proxy_url="https://my-proxy.com",
    timeout=60,
    debug=True,
)
```

### `is_valid_slim_url(url)`

Check if a URL is valid for fetching.

```python
from slim_protocol import fetch_slim, is_valid_slim_url

user_input = "https://example.com"  # e.g. a URL supplied by the user

if is_valid_slim_url(user_input):
    slim = fetch_slim(user_input)
```
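
When validating a batch of URLs, it can help to split them into accepted and rejected lists before fetching. The sketch below takes the validator as a parameter, so `is_valid_slim_url` can be passed in directly. The default `looks_like_http_url` is only a stand-in so the example runs without the SDK; it is an assumption and does not describe the rules `is_valid_slim_url` actually applies.

```python
from typing import Callable, Iterable, List, Tuple
from urllib.parse import urlparse


def looks_like_http_url(url: str) -> bool:
    """Stand-in validator (assumption): absolute http(s) URL with a host."""
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def partition_urls(
    urls: Iterable[str],
    is_valid: Callable[[str], bool] = looks_like_http_url,
) -> Tuple[List[str], List[str]]:
    """Split URLs into (accepted, rejected), preserving input order."""
    accepted: List[str] = []
    rejected: List[str] = []
    for url in urls:
        (accepted if is_valid(url) else rejected).append(url)
    return accepted, rejected
```

With the SDK installed, `partition_urls(urls, is_valid=is_valid_slim_url)` would apply the library's own check instead of the stand-in.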

## SLIM Pyramid Levels

The response contains hierarchical content levels:

| Level | Name | Contains |
|-------|------|----------|
| L1 | Identity | Title, type, author, description |
| L3 | Structure | Headings, sections, navigation |
| L5 | Insights | Key points, topics, entities |
| L7 | Full Content | Complete text content |

```python
# L1: Always present - basic identification
slim.payload.l1.title
slim.payload.l1.type
slim.payload.l1.author

# L3: Document structure
slim.payload.l3.sections
slim.payload.l3.structure

# L5: Extracted insights
slim.payload.l5.key_points
slim.payload.l5.topics
slim.payload.l5.summary

# L7: Full content
slim.payload.l7.full_content
```
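
Since only L1 is described as always present, code that reads deeper levels may want a fallback. The sketch below picks the richest text available, trying L7, then L5, then L1. It assumes that an absent level is either `None` or missing from the payload, which this README does not confirm; `deepest_text` is a hypothetical helper, and the payload here is a stand-in built with `SimpleNamespace` rather than a real `SlimPayload`.

```python
from types import SimpleNamespace
from typing import Any


def deepest_text(payload: Any) -> str:
    """Return the richest text available, falling back level by level.

    Hypothetical helper; assumes absent levels are None or missing.
    """
    candidates = (("l7", "full_content"), ("l5", "summary"), ("l1", "title"))
    for level_name, field in candidates:
        level = getattr(payload, level_name, None)
        value = getattr(level, field, None) if level is not None else None
        if value:
            return value
    return ""


# Stand-in payload with only L1 and L5 populated (illustrative data)
payload = SimpleNamespace(
    l1=SimpleNamespace(title="Example Domain"),
    l5=SimpleNamespace(summary="A placeholder page.", key_points=[]),
    l7=None,
)
print(deepest_text(payload))
```

With a real response, the call would be `deepest_text(slim.payload)`.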

## Error Handling

```python
from slim_protocol import fetch_slim
from slim_protocol.exceptions import (
    SlimError,
    SlimInvalidUrlError,
    SlimProxyError,
    SlimTimeoutError,
    SlimNetworkError,
)

url = "https://example.com"

try:
    slim = fetch_slim(url)
except SlimInvalidUrlError as e:
    print(f"Invalid URL: {e}")
except SlimTimeoutError as e:
    print(f"Timeout: {e}")
except SlimProxyError as e:
    print(f"Proxy error ({e.status_code}): {e}")
except SlimNetworkError as e:
    print(f"Network error: {e}")
except SlimError as e:
    print(f"Generic error: {e}")
    if e.suggestion:
        print(f"Suggestion: {e.suggestion}")
```
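
Timeouts and network errors are often transient, so retrying them with exponential backoff is a common pattern. The sketch below is generic: the exception types to retry are passed in, so it does not depend on the SDK. `call_with_retry` and its parameters are illustrative names, not part of slim-protocol.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exceptions with exponential backoff.

    Hypothetical helper, not part of slim-protocol.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

A call might look like `call_with_retry(lambda: fetch_slim(url), retry_on=(SlimTimeoutError, SlimNetworkError))`. Errors such as `SlimInvalidUrlError` are left out of `retry_on` because retrying cannot fix them.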

## LangChain Integration

```python
from slim_protocol.integrations.langchain import SlimLoader

# Load documents from URLs
loader = SlimLoader(urls=["https://example.com/article"])
documents = loader.load()

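# Use in a chain (needs the separate `langchain` and `langchain-openai`
# packages; the `langchain` extra only installs `langchain-core`)
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Create a vector store from the SLIM documents
# vectorstore = ...  # your vector store setup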

qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(),
    retriever=vectorstore.as_retriever(),
)
```

## LlamaIndex Integration

```python
from slim_protocol.integrations.llamaindex import SlimReader
from llama_index.core import VectorStoreIndex

# Load documents
reader = SlimReader()
documents = reader.load_data(urls=["https://example.com/article"])

# Create index
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

response = query_engine.query("What is this article about?")
```

## Environment Variables

Configure the SDK via environment variables:

```bash
export SLIM_PROXY_URL="https://my-proxy.com"
export SLIM_TIMEOUT="60"
export SLIM_DEBUG="true"
```
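
Environment variables are always strings, so the timeout and debug values have to be converted before they can be used as options. The sketch below shows one way such a mapping onto the `configure` keyword arguments could look. It is an assumption for illustration, not the SDK's own parsing: the function name `options_from_env` and the accepted boolean spellings are hypothetical.

```python
import os
from typing import Any, Dict, Mapping


def options_from_env(environ: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Translate SLIM_* variables into keyword options (illustrative only)."""
    options: Dict[str, Any] = {}
    if environ.get("SLIM_PROXY_URL"):
        options["proxy_url"] = environ["SLIM_PROXY_URL"]
    if environ.get("SLIM_TIMEOUT"):
        # Timeout is given in seconds; float() raises ValueError on bad input
        options["timeout"] = float(environ["SLIM_TIMEOUT"])
    if environ.get("SLIM_DEBUG"):
        # Assumed truthy spellings; the SDK may accept a different set
        flag = environ["SLIM_DEBUG"].strip().lower()
        options["debug"] = flag in ("1", "true", "yes", "on")
    return options
```

The result could be passed on explicitly with `configure(**options_from_env())`, for example in a test that injects its own mapping instead of `os.environ`.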

## Type Hints

All types are exported for use in your code:

```python
from slim_protocol import (
    SlimResponse,
    SlimPayload,
    SlimL1, SlimL3, SlimL5, SlimL7,
    SlimSource,
    SlimMeta,
    SlimConfig,
)

def process_slim(slim: SlimResponse) -> str:
    return slim.payload.l1.title
```

## Requirements

- Python 3.9+
- httpx >= 0.27.0
- pydantic >= 2.0.0

## License

MIT
