Metadata-Version: 2.4
Name: rec-llm
Version: 0.1.0
Summary: Recursive Language Models for unbounded context processing
Author: Grigori Gvadzabia
License: MIT
Project-URL: Homepage, https://github.com/ysz/recursive-llm
Project-URL: Documentation, https://github.com/ysz/recursive-llm
Project-URL: Repository, https://github.com/ysz/recursive-llm
Project-URL: Issues, https://github.com/ysz/recursive-llm/issues
Keywords: llm,ai,nlp,recursive,language-models
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: litellm>=1.0.0
Requires-Dist: RestrictedPython>=6.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: PyMuPDF>=1.24.0
Provides-Extra: dev
Requires-Dist: build>=1.2.2; extra == "dev"
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=24.0.0; extra == "dev"
Requires-Dist: ruff>=0.1.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: twine>=5.1.1; extra == "dev"
Provides-Extra: ui
Requires-Dist: gradio>=5.0.0; extra == "ui"
Provides-Extra: linearrag
Requires-Dist: numpy>=1.21.0; extra == "linearrag"
Requires-Dist: pandas>=1.3.0; extra == "linearrag"
Requires-Dist: pyarrow>=12.0.1; extra == "linearrag"
Requires-Dist: python-igraph>=0.11.8; extra == "linearrag"
Requires-Dist: scikit-learn>=1.3.2; extra == "linearrag"
Requires-Dist: scipy>=1.7.0; extra == "linearrag"
Requires-Dist: sentence-transformers>=2.2.2; extra == "linearrag"
Requires-Dist: spacy>=3.6.1; extra == "linearrag"
Requires-Dist: tqdm>=4.67.1; extra == "linearrag"
Requires-Dist: transformers>=4.30.2; extra == "linearrag"
Requires-Dist: huggingface-hub>=0.16.4; extra == "linearrag"
Provides-Extra: all
Requires-Dist: gradio>=5.0.0; extra == "all"
Requires-Dist: numpy>=1.21.0; extra == "all"
Requires-Dist: pandas>=1.3.0; extra == "all"
Requires-Dist: pyarrow>=12.0.1; extra == "all"
Requires-Dist: python-igraph>=0.11.8; extra == "all"
Requires-Dist: scikit-learn>=1.3.2; extra == "all"
Requires-Dist: scipy>=1.7.0; extra == "all"
Requires-Dist: sentence-transformers>=2.2.2; extra == "all"
Requires-Dist: spacy>=3.6.1; extra == "all"
Requires-Dist: tqdm>=4.67.1; extra == "all"
Requires-Dist: transformers>=4.30.2; extra == "all"
Requires-Dist: huggingface-hub>=0.16.4; extra == "all"
Dynamic: license-file

# Recursive Language Models (RLM)

Python implementation of Recursive Language Models for processing unbounded context lengths.

**Based on [the paper](https://alexzhang13.github.io/blog/2025/rlm/) by Alex Zhang and Omar Khattab (MIT, 2025)** | [arXiv](https://arxiv.org/abs/2512.24601)


## What is RLM?

RLM enables language models to process extremely long contexts (100k+ tokens) by:
- Storing context as a Python variable instead of in the prompt
- Allowing the LM to recursively explore and partition the context
- Avoiding "context rot" (performance degradation with long context)

Instead of this:
```python
llm.complete(prompt="Summarize this", context=huge_document)  # Context rot!
```

RLM does this:
```python
rlm = RLM(model="gpt-5-mini")
result = rlm.complete(
    query="Summarize this",
    context=huge_document  # Stored as variable, not in prompt
)
```

The LM can then peek, search, and recursively process the context adaptively.

## Installation

Install the core library:

```bash
pip install rec-llm
```

Install the Gradio UI:

```bash
pip install "rec-llm[ui]"
```

Install the Gradio UI plus LinearRAG-related extras:

```bash
pip install "rec-llm[all]"
```

If you're working from source instead:

```bash
# Clone the repository
git clone https://github.com/ysz/recursive-llm.git
cd recursive-llm

# Install core package
pip install -e .

# Install UI extras
pip install -e ".[ui]"

# Install all extras
pip install -e ".[all]"

# Install dev dependencies
pip install -e ".[dev]"
```

## Requirements

- Python 3.9 or higher
- An API key for your chosen LLM provider (OpenAI, Anthropic, etc.), or a local model setup (Ollama, llama.cpp, etc.)

## Quick Start

```python
from rlm import RLM

# Initialize with any LLM
rlm = RLM(model="gpt-5-mini")

# Process long context
result = rlm.complete(
    query="What are the main themes in this document?",
    context=long_document
)
print(result)
```

For document-heavy workflows, use the document processor to prepare chunked corpora with helper tools:

```python
from rlm import DocumentProcessor, RLM, SourceDocument

processor = DocumentProcessor(
    RLM(model="gpt-5-mini"),
    chunk_size_chars=4000,
    chunk_overlap_chars=400,
)

answer = processor.process_documents(
    "Find the retention requirements and compare them across the documents.",
    [
        SourceDocument(name="policy.md", text=policy_text),
        SourceDocument(name="runbook.md", text=runbook_text),
    ],
)
print(answer)
```

## API Keys Setup

Set your API key via an environment variable:

```bash
export OPENAI_API_KEY="sk-..."  # or ANTHROPIC_API_KEY, etc.
```

Or pass it directly in code:
```python
rlm = RLM(model="gpt-5-mini", api_key="sk-...")
```

## Supported Models

Works with 100+ LLM providers via LiteLLM:

```python
# OpenAI
rlm = RLM(model="gpt-5")
rlm = RLM(model="gpt-5-mini")

# Groq-hosted models
rlm = RLM(model="groq/llama-3.1-8b-instant")
rlm = RLM(model="groq/meta-llama/llama-4-scout-17b-16e-instruct")

# Anthropic
rlm = RLM(model="claude-sonnet-4")
rlm = RLM(model="claude-sonnet-4-20250514")

# Ollama (local)
rlm = RLM(model="ollama/llama3.2")
rlm = RLM(model="ollama/mistral")

# llama.cpp (local)
rlm = RLM(
    model="openai/local",
    api_base="http://localhost:8000/v1"
)

# Azure OpenAI
rlm = RLM(model="azure/gpt-4-deployment")

# And many more via LiteLLM...
```

For Groq, set `GROQ_API_KEY` and use a Groq model string through LiteLLM.
Text files, Markdown files, and PDFs can be passed into the document processor.

## Gradio UI

If you install the UI extra, you can launch the packaged app with:

```bash
rlm-gradio
```

Or from source:

```bash
python app.py
```

## Advanced Usage

### Two Models (Optimize Cost)

Use a cheaper model for recursive calls:

```python
rlm = RLM(
    model="gpt-5",              # Root LM (main decisions)
    recursive_model="gpt-5-mini"  # Recursive calls (cheaper)
)
```
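
One way to picture the split, as a sketch only: the root call (depth 0) uses `model`, and any deeper call uses `recursive_model`. The helper `pick_model` and its `depth` argument are hypothetical names chosen for illustration, and the fallback to the root model when no recursive model is set is an assumption, not documented library behavior.

```python
from typing import Optional


def pick_model(depth: int, model: str, recursive_model: Optional[str] = None) -> str:
    """Return the model name to use at a given recursion depth (hypothetical helper)."""
    # Depth 0 is the root call. Deeper calls fall back to the root model
    # when no cheaper recursive model was configured (assumed behavior).
    if depth == 0 or recursive_model is None:
        return model
    return recursive_model


print(pick_model(0, "gpt-5", "gpt-5-mini"))  # root call
print(pick_model(2, "gpt-5", "gpt-5-mini"))  # recursive call
```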

### Async API

For better performance with parallel recursive calls:

```python
import asyncio

from rlm import RLM

async def main():
    rlm = RLM(model="gpt-5-mini")
    # `query` and `context` are your own strings, as in the Quick Start
    result = await rlm.acomplete(query, context)
    print(result)

asyncio.run(main())
```
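
The general pattern behind running several sub-calls concurrently can be sketched with plain `asyncio`. This is an illustration only: `fake_sub_call` and `fan_out` are hypothetical names, the sleep stands in for a network request, and nothing here is taken from the library's internals.

```python
import asyncio
from typing import List


async def fake_sub_call(query: str, chunk: str) -> str:
    """Stand-in for one recursive LM call; a real call would await a network request."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return f"{query}: {len(chunk)} chars"


async def fan_out(query: str, chunks: List[str]) -> List[str]:
    # gather() runs the coroutines concurrently and preserves input order.
    return await asyncio.gather(*(fake_sub_call(query, c) for c in chunks))


results = asyncio.run(fan_out("count", ["aaaa", "bb", "cccccc"]))
print(results)
```

Because the waits overlap, total latency is close to that of the slowest sub-call, not the sum of all of them.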

### Configuration

```python
rlm = RLM(
    model="gpt-5-mini",
    max_depth=5,         # Maximum recursion depth
    max_iterations=20,   # Maximum REPL iterations
    # Optional LiteLLM params: temperature, timeout, etc.
)
```
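
To illustrate what a depth limit does, here is a toy recursion that halves a context until the pieces are small or the depth budget runs out. `recursive_count` and `leaf_size` are hypothetical; the library delegates to LM sub-calls, not a string search, and its exact stopping rule may differ.

```python
def recursive_count(context: str, needle: str, depth: int = 0,
                    max_depth: int = 5, leaf_size: int = 16) -> int:
    """Count lines containing needle by recursively halving the context (toy example)."""
    lines = context.splitlines()
    # Stop recursing when the piece is small enough or the depth budget is spent.
    if len(lines) <= leaf_size or depth >= max_depth:
        return sum(needle in line for line in lines)
    mid = len(lines) // 2
    left = "\n".join(lines[:mid])
    right = "\n".join(lines[mid:])
    return (recursive_count(left, needle, depth + 1, max_depth, leaf_size)
            + recursive_count(right, needle, depth + 1, max_depth, leaf_size))


log = "\n".join("ERROR disk full" if i % 10 == 0 else "ok" for i in range(1000))
print(recursive_count(log, "ERROR"))
```

A lower `max_depth` means fewer, larger leaf pieces; the answer stays the same here, but with real LM sub-calls each leaf would carry more context.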

### Large Document Processor

`DocumentProcessor` adds a reusable document-processing system on top of `RLM`:

- Normalizes one or many documents into named sources
- Splits large documents into overlapping, boundary-aware chunks
- Builds a manifest plus full chunk corpus for the RLM context
- Exposes helper tools inside the REPL: `find_chunks()`, `get_chunk()`, `get_document()`, and chunk metadata

This makes large-doc workflows less dependent on ad hoc string slicing in prompts and gives the model a structured way to localize relevant sections before deeper analysis.
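
A minimal sketch of overlapping, boundary-aware chunking, assuming character-based sizes like the `chunk_size_chars` and `chunk_overlap_chars` parameters shown in the Quick Start. `chunk_text` is a hypothetical helper written for illustration; the library's own splitter may pick boundaries differently.

```python
from typing import List


def chunk_text(text: str, chunk_size_chars: int = 4000,
               chunk_overlap_chars: int = 400) -> List[str]:
    """Split text into overlapping chunks, preferring paragraph, line, or word breaks."""
    if chunk_overlap_chars >= chunk_size_chars:
        raise ValueError("overlap must be smaller than chunk size")
    chunks: List[str] = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size_chars, len(text))
        if end < len(text):
            # Look for a natural boundary in the second half of the window.
            window_mid = start + chunk_size_chars // 2
            for sep in ("\n\n", "\n", " "):
                cut = text.rfind(sep, window_mid, end)
                if cut != -1:
                    end = cut + len(sep)
                    break
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back by the overlap so neighbouring chunks share context,
        # but always move forward by at least one character.
        start = max(end - chunk_overlap_chars, start + 1)
    return chunks


sample = "First paragraph.\n\nSecond paragraph is a bit longer.\n\nThird one."
for piece in chunk_text(sample, chunk_size_chars=40, chunk_overlap_chars=10):
    print(repr(piece))
```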

## How It Works

1. **Context is stored as a variable** in a Python REPL environment
2. **Root LM gets only the query** plus instructions
3. **LM can explore context** using Python code:
   ```python
   # Peek at context
   context[:1000]

   # Search with regex
   import re
   re.findall(r'pattern', context)

   # Recursive processing
   recursive_llm("extract dates", context[1000:2000])
   ```
4. **The LM returns its final answer** via a `FINAL(answer)` statement
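
The steps above can be sketched as a toy loop. Everything here is an assumption made for illustration: `run_repl_loop` and `scripted_model` are hypothetical names, the scripted function stands in for a real LM, and plain `exec` replaces the RestrictedPython sandbox the library uses, so do not run untrusted code this way.

```python
import io
import re
from contextlib import redirect_stdout
from typing import Callable, List

FINAL_RE = re.compile(r"FINAL\((.*)\)", re.DOTALL)


def run_repl_loop(next_action: Callable[[List[str]], str], context: str,
                  max_iterations: int = 20) -> str:
    """Toy loop: the model sees only REPL output, never the raw context."""
    namespace = {"context": context, "re": re}
    transcript: List[str] = []
    for _ in range(max_iterations):
        action = next_action(transcript).strip()
        match = FINAL_RE.fullmatch(action)
        if match:
            # Strip surrounding whitespace and quotes from the answer.
            return match.group(1).strip().strip("\"'")
        buffer = io.StringIO()
        with redirect_stdout(buffer):
            exec(action, namespace)  # plain exec: NOT sandboxed, illustration only
        transcript.append(buffer.getvalue())
    raise RuntimeError("Max iterations exceeded")


def scripted_model(transcript: List[str]) -> str:
    """Stand-in for the LM: inspect the context first, then answer from the output."""
    if not transcript:
        return "print(len(re.findall(r'ERROR', context)))"
    return f"FINAL({transcript[-1].strip()})"


answer = run_repl_loop(scripted_model, "ok\nERROR a\nok\nERROR b\n")
print(answer)
```

The context string never appears in what the scripted model receives; only the printed result of each code step does, which is the property that keeps prompts small.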

## Examples

See the `examples/` directory for complete working examples:
- `basic_usage.py` - Simple completion with OpenAI
- `document_processor.py` - Structured large-document processing
- `groq_usage.py` - Run RLM on Groq-hosted models
- `ollama_local.py` - Using Ollama locally
- `two_models.py` - Cost optimization with two models
- `long_document.py` - Processing 50k+ token documents
- `data_extraction.py` - Extract structured data from text
- `multi_file.py` - Process multiple documents
- `custom_config.py` - Advanced configuration

Run an example:
```bash
# Set your API key first
export OPENAI_API_KEY="sk-..."

# Run example
python examples/basic_usage.py
```

## Performance

### Paper Results

On the OOLONG benchmark (132k tokens):
- GPT-5: baseline
- RLM(GPT-5-Mini): **33% better than GPT-5** at similar cost

### Our Benchmark Results

Tested with GPT-5-Mini on structured data queries (counting, filtering) across 5 different test cases:

**60k token contexts:**
- **RLM**: 80% accurate (4/5 correct)
- **Direct OpenAI**: 0% accurate (0/5 correct, all returned approximations)

RLM wins on accuracy. Both approaches complete every request, but only RLM returns exact answers; the direct calls return approximations.

**150k+ token contexts:**
- **Direct OpenAI**: Fails (rate limit errors)
- **RLM**: Works (processes 1M+ tokens successfully)

**Token efficiency:** RLM uses ~2-3k tokens per query vs 95k+ for direct approach, since context is stored as a variable instead of being sent in prompts.
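
As a quick check of the figures above, the accuracy and token ratios work out as follows. The token counts are the approximate values quoted, and 95k is a lower bound, so the ratios are rough.

```python
# Figures quoted above; tokens-per-query values are approximate.
rlm_correct, direct_correct, cases = 4, 0, 5
rlm_tokens_low, rlm_tokens_high = 2_000, 3_000
direct_tokens = 95_000

rlm_accuracy = rlm_correct / cases
direct_accuracy = direct_correct / cases
reduction_low = direct_tokens / rlm_tokens_high   # most conservative ratio
reduction_high = direct_tokens / rlm_tokens_low   # most favorable ratio

print(f"RLM accuracy: {rlm_accuracy:.0%}, direct: {direct_accuracy:.0%}")
print(f"Token reduction: between {reduction_low:.1f}x and {reduction_high:.1f}x")
```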

## Development

```bash
# Clone repository
git clone https://github.com/ysz/recursive-llm.git
cd recursive-llm

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/ -v

# Run tests with coverage
pytest tests/ -v --cov=src/rlm --cov-report=term-missing

# Type checking
mypy src/rlm

# Linting
ruff check src/rlm

# Format code
black src/rlm tests examples
```

## Publishing To PyPI

```bash
# Install publishing tools
pip install -e ".[dev]"

# Build sdist + wheel
python -m build

# Check artifacts
python -m twine check dist/*

# Upload to TestPyPI first
python -m twine upload --repository testpypi dist/*

# Upload to PyPI
python -m twine upload dist/*
```

Before uploading, update the version in `pyproject.toml` and `src/rlm/__init__.py`.

## Architecture

```
RLM
├── Core (async completion logic)
├── REPL Executor (safe code execution via RestrictedPython)
├── Prompt Builder (system prompts)
└── Parser (extract FINAL() answers)
```

Built on top of LiteLLM for universal LLM support.

## Limitations

- REPL execution is sequential (no parallel code execution yet)
- No prefix caching (future enhancement)
- Recursion depth is limited (configurable via `max_depth`)
- No streaming support yet

## Troubleshooting

### "Max iterations exceeded"
- Increase `max_iterations` parameter
- Simplify your query
- Check if the model is getting stuck in a loop

### "API key not found"
- Set the appropriate environment variable (e.g., `OPENAI_API_KEY`)
- Or pass `api_key` parameter to RLM constructor

### "Model not found"
- Check model name format for your provider
- See LiteLLM docs: https://docs.litellm.ai/docs/providers

### Using Ollama
- Make sure Ollama is running: `ollama serve`
- Pull a model first: `ollama pull llama3.2`
- Use model format: `ollama/model-name`

## Contributing

Contributions welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Add tests for new features
4. Ensure all tests pass (`pytest tests/`)
5. Follow code style (use `black` and `ruff`)
6. Submit a pull request

## Citation

This implementation is based on the RLM paper by Alex Zhang and Omar Khattab.

**To cite this implementation:**
```bibtex
@software{rlm_python,
  title = {recursive-llm: Python Implementation of Recursive Language Models},
  author = {Gvadzabia, Grisha},
  year = {2025},
  url = {https://github.com/ysz/recursive-llm}
}
```

**To cite the original paper:**
```bibtex
@misc{zhang2025rlm,
  title = {Recursive Language Models},
  author = {Zhang, Alex and Khattab, Omar},
  year = {2025},
  month = {October},
  url = {https://alexzhang13.github.io/blog/2025/rlm/},
  eprint = {2512.24601},
  archivePrefix = {arXiv}
}
```

## License

MIT License - see the LICENSE file for details.

## Acknowledgments

Based on the Recursive Language Models paper by Alex Zhang and Omar Khattab from MIT CSAIL.

Built using:
- LiteLLM for universal LLM API support
- RestrictedPython for safe code execution

## Links

- **Paper**: https://alexzhang13.github.io/blog/2025/rlm/
- **arXiv**: https://arxiv.org/abs/2512.24601
- **LiteLLM Docs**: https://docs.litellm.ai/
- **Issues**: https://github.com/ysz/recursive-llm/issues
