Metadata-Version: 2.4
Name: slowburn
Version: 0.4.0
Summary: Cost-Sustainable Concurrent Execution for Long-Horizon LLM Agents
Project-URL: Homepage, https://github.com/adivekar-utexas/slowburn
Project-URL: Repository, https://github.com/adivekar-utexas/slowburn
Project-URL: Issues, https://github.com/adivekar-utexas/slowburn/issues
Author-email: Abhishek Divekar <adivekar@utexas.edu>
License-File: LICENSE
Keywords: agents,concurrency,cost-optimization,llm,long-horizon
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: concurry>=0.13.0
Requires-Dist: litellm>=1.0.0
Requires-Dist: morphic>=0.1.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: python-dotenv>=1.2.2
Provides-Extra: all
Requires-Dist: ag2>=0.8.0; extra == 'all'
Requires-Dist: crewai>=0.80.0; extra == 'all'
Requires-Dist: ipykernel; extra == 'all'
Requires-Dist: ipython; extra == 'all'
Requires-Dist: ipywidgets; extra == 'all'
Requires-Dist: jupyterlab>=4.5.6; extra == 'all'
Requires-Dist: langchain-core>=0.3.0; extra == 'all'
Requires-Dist: langchain>=1.0.0; extra == 'all'
Requires-Dist: langgraph>=0.3.0; extra == 'all'
Requires-Dist: pytest-cov>=4.0; extra == 'all'
Requires-Dist: pytest-timeout>=2.0; extra == 'all'
Requires-Dist: pytest>=7.0; extra == 'all'
Requires-Dist: ray>=2.0.0; extra == 'all'
Requires-Dist: ruff>=0.4; extra == 'all'
Provides-Extra: autogen
Requires-Dist: ag2>=0.8.0; extra == 'autogen'
Provides-Extra: crewai
Requires-Dist: crewai>=0.80.0; extra == 'crewai'
Provides-Extra: dev
Requires-Dist: ipykernel; extra == 'dev'
Requires-Dist: ipython; extra == 'dev'
Requires-Dist: ipywidgets; extra == 'dev'
Requires-Dist: jupyterlab>=4.5.6; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest-timeout>=2.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.3.0; extra == 'langchain'
Provides-Extra: langgraph
Requires-Dist: langchain>=1.0.0; extra == 'langgraph'
Requires-Dist: langgraph>=0.3.0; extra == 'langgraph'
Provides-Extra: ray
Requires-Dist: ray>=2.0.0; extra == 'ray'
Description-Content-Type: text/markdown

# SlowBurn: Cost-Sustainable Concurrent Execution for Long-Horizon LLM Agents
**Author**: Abhishek Divekar

[![PyPI version](https://img.shields.io/pypi/v/slowburn.svg)](https://pypi.org/project/slowburn/)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

---

<p align="center">
  <a href="https://drive.google.com/drive/folders/1_CWYaP9WP-p0X0_RVAJv1rrX-KNF2C1w?usp=drive_link">
    <img src="https://img.shields.io/badge/%E2%96%B6%EF%B8%8F_Watch_Demo_Video-red?style=for-the-badge&logoColor=white&logo=googledrive" alt="Watch Demo Video" height="40"/>
  </a>
</p>

<p align="center">
  <a href="https://drive.google.com/drive/folders/1_CWYaP9WP-p0X0_RVAJv1rrX-KNF2C1w?usp=drive_link">
    <picture>
      <img src="images/architecture.png" alt="SlowBurn Architecture - Click to Watch Demo Video" width="800"/>
    </picture>
  </a>
  <br/>
  <sub><b>Click the image above to watch the demo video</b></sub>
</p>

---

## Overview

Long-horizon LLM agents (autonomous coding assistants, deep research pipelines, multi-agent simulations) issue dozens to hundreds of API calls per task. Existing tools either passively monitor spending or hard-terminate the agent when a budget cap is reached, discarding all accumulated context.

SlowBurn takes a different approach: **when the budget is exhausted, the agent pauses rather than crashes.** Budget exhaustion becomes a flow-control signal (backpressure), not a fatal error. The agent sleeps until the rate-limit window refills, then resumes exactly where it left off with no context loss.

**What SlowBurn provides:**

- **CostLimit**: a dollar-denominated rate limit that composes with token and request rate limits, and blocks rather than terminates when exhausted
- **SlowBurnLLM**: an asyncio LLM worker with automatic per-call cost tracking, supporting 100+ models via [litellm](https://github.com/BerriAI/litellm) (text and vision)
- **Framework integrations**: drop-in hooks for [CrewAI](https://github.com/crewAIInc/crewAI), [AutoGen (AG2)](https://github.com/ag2ai/ag2), [LangGraph](https://github.com/langchain-ai/langgraph), and [LangChain](https://github.com/langchain-ai/langchain) that share a unified budget
- **CostReporter**: per-call, per-model cost attribution with JSON, Markdown, and LaTeX export
- **Global config**: all defaults centralized in `slowburn_config`, overridable at runtime via `temp_config()`
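The backpressure idea can be illustrated with a minimal, self-contained sketch. This is a conceptual model only, not SlowBurn's actual `CostLimit` implementation: a dollar budget refills linearly over a window, and a caller that would overspend blocks until enough budget has refilled, rather than raising an error.

```python
import threading
import time


class BlockingCostLimit:
    """Conceptual sketch of a dollar-denominated rate limit.

    The budget refills linearly over the window; acquire() blocks
    (backpressure) instead of raising when the budget is exhausted.
    """

    def __init__(self, budget_usd: float, window_seconds: float):
        self.budget_usd = budget_usd
        self.refill_rate = budget_usd / window_seconds  # dollars per second
        self.available = budget_usd
        self.last_refill = time.monotonic()
        self._lock = threading.Lock()

    def _refill(self) -> None:
        now = time.monotonic()
        self.available = min(
            self.budget_usd,
            self.available + (now - self.last_refill) * self.refill_rate,
        )
        self.last_refill = now

    def acquire(self, cost_usd: float) -> float:
        """Block until cost_usd of budget is available; return seconds slept."""
        slept = 0.0
        with self._lock:
            self._refill()
            if self.available < cost_usd:
                wait = (cost_usd - self.available) / self.refill_rate
                time.sleep(wait)  # the agent pauses here, context intact
                slept = wait
                self._refill()
            self.available -= cost_usd
        return slept


# Short window so the example runs quickly.
limit = BlockingCostLimit(budget_usd=0.02, window_seconds=3)
limit.acquire(0.015)           # fits within the budget: no pause
paused = limit.acquire(0.010)  # overdraws: blocks until budget refills
print(f"paused for ~{paused:.1f}s")
```

The key design choice this sketch mirrors is that exhaustion is a flow-control signal: the caller sleeps and then proceeds, so no exception propagates and no agent state is lost.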

## Quick Start

Create a cost-controlled LLM worker with a daily dollar budget, make calls, and inspect the cost report:

```python
from slowburn import create_llm

# Create a cost-controlled LLM worker: $5 daily budget, asyncio execution
llm = create_llm(model="gpt-4o-mini", budget_usd=5.0, window="daily")

# Make LLM calls (concurrent on the asyncio event loop)
result = llm.call_llm(prompt="Summarize this paper...").result()

# Check costs
reporter = llm.get_reporter().result()
print(f"Cost: ${reporter.total_cost():.4f}")
print(reporter.to_markdown())

llm.stop()
```

### Vision-Language Agents

Pass local files, URLs, or data-URLs as images for multimodal (VLM) calls:

```python
from pathlib import Path

result = llm.call_llm(
    prompt="Describe this image in detail.",
    images=[Path("photo.jpg")],       # local files, URLs, or data-URLs
    image_detail="high",
).result()
```

### Batch calls (concurrent)

Send multiple prompts in one call; they execute concurrently on the asyncio event loop under the same budget:

```python
results = llm.call_llm_batch(
    prompts=["Capital of France?", "Capital of Japan?", "Capital of Brazil?"],
).result()
# All 3 execute concurrently on the event loop
```

### Structured output with validators

Attach a validator function to parse and type-check the response; `ValueError` triggers an automatic retry:

```python
import re

def extract_number(text: str) -> int:
    match = re.search(r"\d+", text)
    if match is None:
        raise ValueError(f"No number found: {text!r}")  # triggers retry
    return int(match.group())

answer = llm.call_llm(
    prompt="What is 17 * 3? Reply with just the number.",
    validator=extract_number,    # retries automatically on ValueError
).result()
# answer = 51 (int, not str)
```
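The retry behavior can be sketched conceptually. SlowBurn handles this internally, and its actual policy (attempt counts, backoff) may differ; the sketch below, with a hypothetical `call_with_validation` helper and a simulated model, just shows the contract: the call is re-issued whenever the validator raises `ValueError`, and the validator's typed return value is what the caller receives.

```python
import re
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_validation(
    call: Callable[[], str],
    validator: Callable[[str], T],
    max_retries: int = 3,
) -> T:
    """Re-invoke `call` until `validator` accepts the response."""
    last_error: ValueError | None = None
    for _ in range(max_retries + 1):
        raw = call()
        try:
            return validator(raw)    # success: return the typed value
        except ValueError as exc:    # validator rejected: retry
            last_error = exc
    raise last_error


def extract_number(text: str) -> int:
    match = re.search(r"\d+", text)
    if match is None:
        raise ValueError(f"No number found: {text!r}")
    return int(match.group())


# Simulated LLM that answers badly once, then correctly.
responses = iter(["I think the answer is fifty-one?", "51"])

answer = call_with_validation(lambda: next(responses), extract_number)
# answer == 51 (int, not str): the first response failed validation and was retried
```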

### Global configuration

Override defaults (temperature, budget, timeouts) for a specific run using a context manager that restores on exit:

```python
from slowburn import slowburn_config, temp_config

# Inspect defaults
print(slowburn_config.defaults.temperature)    # 0.7
print(slowburn_config.defaults.budget_usd)     # 5.0

# Override for a specific run (restores on exit)
with temp_config(temperature=0.0, budget_usd=0.10):
    llm = create_llm(model="gpt-4o-mini")
    # temperature=0.0, budget_usd=$0.10
```

## Framework Integrations

SlowBurn provides drop-in hooks that add backpressure-based budget enforcement to existing agent frameworks. Each hook intercepts LLM calls at the framework's extension point and routes them through a shared limit set.
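Conceptually (a sketch only, not SlowBurn's internals), "sharing a limit set" means that the hooks for different frameworks all debit the same budget object, so spend from, say, a CrewAI crew and a LangGraph agent counts against one cap. The hook classes below are hypothetical stand-ins:

```python
import threading


class SharedBudget:
    """One dollar budget debited by every framework hook."""

    def __init__(self, budget_usd: float):
        self.remaining = budget_usd
        self._lock = threading.Lock()

    def debit(self, cost_usd: float) -> None:
        with self._lock:
            self.remaining -= cost_usd


class CrewHook:  # stands in for a CrewAI-style hook
    def __init__(self, budget: SharedBudget):
        self.budget = budget

    def on_llm_call(self, cost_usd: float) -> None:
        self.budget.debit(cost_usd)


class GraphHook:  # stands in for a LangGraph-style middleware
    def __init__(self, budget: SharedBudget):
        self.budget = budget

    def on_llm_call(self, cost_usd: float) -> None:
        self.budget.debit(cost_usd)


budget = SharedBudget(budget_usd=5.00)
CrewHook(budget).on_llm_call(1.25)   # CrewAI-side spend
GraphHook(budget).on_llm_call(0.75)  # LangGraph-side spend
print(budget.remaining)              # both debits hit the same budget: 3.0
```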

### AutoGen (AG2)

```python
from slowburn.integrations.autogen import SlowBurnModelClient

assistant.register_model_client(
    model_client_cls=SlowBurnModelClient,
    limit_set=limit_set,
    reporter=reporter,
)
```

### CrewAI

```python
from slowburn.integrations.crewai import SlowBurnCrewAI

sb = SlowBurnCrewAI(budget_usd=5.0, max_tokens=1000)
sb.install()
crew.kickoff()
print(sb.reporter.to_markdown())
```

### LangGraph

```python
from slowburn.integrations.langgraph import SlowBurnMiddleware

budget = SlowBurnMiddleware(budget_usd=5.0)
agent = create_agent(model="openai:gpt-4o-mini", middleware=[budget])
```

### LangChain

```python
from slowburn.integrations.langchain import SlowBurnCallbackHandler

handler = SlowBurnCallbackHandler(budget_usd=5.0)
llm = ChatOpenAI(model="gpt-4o-mini", callbacks=[handler])
```

## Case Study: Autonomous Code Improvement Agent

We deployed a ReAct agent that reads Python code, searches the web for best practices, writes improved code, and iterates three times. Every LLM call was routed through SlowBurn under a $0.02-per-30-second budget window.

| Iteration | Calls | Input Tokens | Output Tokens | Cost |
|---|---:|---:|---:|---:|
| 1: Best practices | 9 | 25K | 3K | $0.02 |
| 2: Type hints | 15 | 68K | 9K | $0.04 |
| 3: Edge cases | 15 | 62K | 7K | $0.03 |
| **Total** | **39** | **155K** | **19K** | **$0.09** |

Between iterations, backpressure paused the agent for ~18 seconds until the budget window refilled. Execution resumed with no loss of context.
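The ~18-second pause is consistent with a linear-refill model. This is an assumption for illustration (SlowBurn's windowing may differ), and the deficit value below is hypothetical: with a $0.02-per-30 s budget, a $0.012 overdraft refills in 18 s.

```python
budget_usd = 0.02
window_seconds = 30
refill_rate = budget_usd / window_seconds  # ~$0.000667 per second

deficit_usd = 0.012  # hypothetical overdraft at the end of an iteration
pause_seconds = deficit_usd / refill_rate
print(f"{pause_seconds:.1f}")  # 18.0 seconds, matching the observed ~18 s pause
```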

## Comparison with Alternatives

| Feature | SlowBurn | AgentBudget | LiteLLM | Langfuse | Prompto |
|---|---|---|---|---|---|
| Budget exhaustion | **Pauses** | Terminates | Terminates | --- | --- |
| Concurrent execution | Asyncio | --- | --- | --- | Async |
| Cost tracking | Per-call | Session | Per-key | Trace | --- |
| Dollar rate limit | Yes | --- | --- | --- | --- |
| Framework hooks | 4 | 2 | Proxy | Many | --- |
| Infrastructure | Zero | Zero | Proxy | Server | Zero |
| Paper-ready export | Markdown + LaTeX | --- | --- | --- | --- |

## Project Structure

```
slowburn/
├── src/slowburn/
│   ├── __init__.py                 # create_llm() entry point
│   ├── config.py                   # SlowBurnConfig, temp_config(), _NO_ARG sentinel
│   ├── llm_worker.py               # SlowBurnLLM asyncio worker (text + vision)
│   ├── cost_accounting.py          # estimate_input_tokens(), cost_controlled_call()
│   ├── limits.py                   # CostLimit (dollar-denominated rate limit)
│   ├── pricing.py                  # PricingCache (litellm + OpenRouter pricing)
│   ├── reporter.py                 # CostReporter (JSON, Markdown, LaTeX export)
│   ├── backpressure.py             # Backpressure warning logging
│   └── integrations/
│       ├── autogen.py              # AutoGen (AG2) ModelClient
│       ├── crewai.py               # CrewAI event bus / hooks middleware
│       ├── langchain.py            # LangChain callback handler
│       └── langgraph.py            # LangGraph agent middleware
├── demos/
│   ├── Demo.ipynb                      # Interactive demo notebook
│   ├── demo_native_research_agent.py   # Research agent with web search
│   ├── demo_native_code_agent.py       # Code improvement agent
│   ├── demo_crewai_research_team.py    # CrewAI multi-agent demo
│   ├── demo_autogen_debate.py          # AutoGen debate demo
│   ├── demo_langchain_reflection.py    # LangChain chain demo
│   └── demo_langgraph_plan_execute.py  # LangGraph agent demo
└── README.md
```

## Installation

```bash
pip install slowburn
```

With framework integrations:

```bash
pip install "slowburn[crewai]"       # CrewAI
pip install "slowburn[autogen]"      # AutoGen (AG2)
pip install "slowburn[langgraph]"    # LangGraph
pip install "slowburn[langchain]"    # LangChain
```

Everything:

```bash
pip install "slowburn[all]"
```

### From source (development)

```bash
git clone https://github.com/adivekar-utexas/slowburn.git
cd slowburn
pip install -e ".[dev]"

# Set API key
cp .env.example .env
# Edit .env with your OPENROUTER_API_KEY or OPENAI_API_KEY
```

### Running tests

```bash
# Unit tests (mocked, no API key needed)
pytest tests/ --ignore=tests/test_e2e_real_llm.py --ignore=tests/test_e2e_vision.py -v

# Full suite including real LLM calls (requires API key in .env)
pytest tests/ -v --timeout=120
```

### Running demos

```bash
# Interactive notebook
jupyter notebook demos/Demo.ipynb

# Research agent (terminal)
cd demos && python demo_native_research_agent.py

# Code improvement agent (terminal)
cd demos && python demo_native_code_agent.py
```

## Citation

If you use SlowBurn in your research, please cite:

```bibtex
@misc{divekar2026slowburn,
  author       = {Divekar, Abhishek},
  title        = {{SlowBurn}: Cost-Sustainable Concurrent Execution for Long-Horizon {LLM} Agents},
  year         = {2026},
  howpublished = {\url{https://github.com/adivekar-utexas/slowburn}},
}
```

## License

MIT
