Metadata-Version: 2.4
Name: agentslim
Version: 0.1.0
Summary: Make your AI agents leaner, faster, and cheaper — smart context management and token compression
Project-URL: Homepage, https://github.com/WastedSwl/agentslim
Project-URL: Documentation, https://github.com/WastedSwl/agentslim#readme
Project-URL: Issues, https://github.com/WastedSwl/agentslim/issues
Author: agentslim contributors
License: MIT
License-File: LICENSE
Keywords: agents,ai,compression,context,langchain,llm,openai,tokens
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.9
Provides-Extra: all
Requires-Dist: tiktoken>=0.5; extra == 'all'
Provides-Extra: dev
Requires-Dist: hatch; extra == 'dev'
Requires-Dist: pytest-cov; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: tiktoken>=0.5; extra == 'dev'
Provides-Extra: tiktoken
Requires-Dist: tiktoken>=0.5; extra == 'tiktoken'
Description-Content-Type: text/markdown

# agentslim 🪶

**Make your AI agents leaner, faster, and cheaper.**

`agentslim` is a zero-dependency Python toolkit (`tiktoken` is an optional extra) that reduces token consumption in LLM-powered agents while keeping the context they need to reason.

[![PyPI version](https://img.shields.io/pypi/v/agentslim.svg)](https://pypi.org/project/agentslim/)
[![Python](https://img.shields.io/pypi/pyversions/agentslim.svg)](https://pypi.org/project/agentslim/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

---

## Why?

Every token counts — literally. When building agents you routinely waste tokens on:

| Problem | Typical waste |
|---|---|
| Verbose JSON tool schemas | 200–800 tokens per request |
| Raw HTML web scrapes fed to the LLM | 60–80% noise |
| Naively truncated chat history | Lost context, broken reasoning |
| Sending entire source files to coding agents | 10× more than needed |

`agentslim` solves all four with one clean API.

---

## Install

```bash
pip install agentslim
```

For accurate token counting (uses `tiktoken` under the hood):

```bash
pip install "agentslim[tiktoken]"   # quotes keep shells such as zsh from globbing the brackets
```

---

## Quick Start

```python
from agentslim import Compressor, AgentMemory, ToolMinifier, CodeContext

# 1 ── Compress any content before sending it to the LLM
c = Compressor()
slim = c.compress(raw_html_or_json_or_text)   # auto-detects format

# 2 ── Smart context window with auto-summarization
mem = AgentMemory(max_tokens=6000)
mem.add("user", "Build me a FastAPI app")
mem.add("assistant", "Sure! Here's the plan...")
messages = mem.get_messages()   # ready for openai.chat.completions.create()

# 3 ── Minify tool schemas
slim_tools = ToolMinifier.minify(my_tools)          # shorter descriptions
hint_str   = ToolMinifier.to_compact_str(my_tools)  # one-liner per tool

# 4 ── Send only the relevant code chunk, not the whole file
snippet = CodeContext.extract_function("app.py", "handle_request")
outline = CodeContext.outline("app.py")   # class/function map
```

---

## Modules

### 🗜️ `Compressor` — Text / HTML / JSON compressor

Strips noise from content before it hits your LLM.

```python
from agentslim import Compressor
from agentslim.compressor import CompressorConfig

# Defaults — safe for most use cases
c = Compressor()

# Fine-grained control
c = Compressor(config=CompressorConfig(
    strip_html=True,
    remove_decorative_html=True,   # drops <script>, <style>, <nav>, etc.
    collapse_whitespace=True,
    remove_filler_phrases=True,    # "Certainly! As an AI language model..."
    compact_json=True,
    remove_python_comments=False,  # keep comments by default
))

clean = c.compress(raw_content)          # auto-detects JSON / HTML / text
clean = c.compress_html(html_string)
clean = c.compress_json(json_string)
clean = c.compress_text(plain_text)
clean = c.compress_code(source, language="python")  # or "js" / "ts"
```
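
To give a feel for what format-aware compression involves, here is a minimal standard-library sketch. It is not `agentslim`'s implementation: the helper names (`strip_html`, `compact_json`, `compress_auto`), the set of skipped tags, and the detection heuristic are all assumptions made for illustration.

```python
import json
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips a few non-content elements."""

    SKIP_TAGS = {"script", "style", "nav"}  # assumed set, for illustration only

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self.parts.append(data)


def collapse_whitespace(text):
    """Replace every run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text).strip()


def strip_html(html):
    """Return the visible text of an HTML string."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return collapse_whitespace(" ".join(parser.parts))


def compact_json(text):
    """Re-serialize JSON without insignificant whitespace."""
    return json.dumps(json.loads(text), separators=(",", ":"), ensure_ascii=False)


def compress_auto(text):
    """Hypothetical auto-detection: HTML if it starts with '<', then JSON, else text."""
    if text.lstrip().startswith("<"):
        return strip_html(text)
    try:
        return compact_json(text)
    except json.JSONDecodeError:
        return collapse_whitespace(text)
```

Trying JSON before falling back to plain text keeps the sketch short; a real detector would likely need more careful rules for mixed or malformed input.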

**Savings report:**

```python
from agentslim.utils import tokens_saved_report

report = tokens_saved_report(original, compressed, model="gpt-4o")
# {
#   'original_tokens': 1842,
#   'compressed_tokens': 612,
#   'tokens_saved': 1230,
#   'percent_saved': 66.8,
#   'cost_saved_usd': 0.003075
# }
```
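
The figures in the report follow from simple arithmetic. The sketch below reproduces the example numbers under one assumption: an input price of $2.50 per million tokens for `gpt-4o`, which matches the example output but may differ from the library's pricing table. `savings_report` is a hypothetical helper, not the library function.

```python
def savings_report(original_tokens, compressed_tokens, usd_per_million_input):
    """Compute token and cost savings from two token counts."""
    saved = original_tokens - compressed_tokens
    percent = round(100 * saved / original_tokens, 1) if original_tokens else 0.0
    return {
        "original_tokens": original_tokens,
        "compressed_tokens": compressed_tokens,
        "tokens_saved": saved,
        "percent_saved": percent,
        # Cost is priced per million input tokens (assumed rate).
        "cost_saved_usd": round(saved * usd_per_million_input / 1_000_000, 6),
    }


report = savings_report(1842, 612, usd_per_million_input=2.50)
print(report["tokens_saved"], report["percent_saved"])  # 1230 66.8
```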

---

### 🧠 `AgentMemory` — Smart sliding-window context manager

Instead of naively cutting old messages (which breaks reasoning), `AgentMemory` **auto-summarizes** the oldest messages into a compact system note.

```python
from agentslim import AgentMemory

mem = AgentMemory(
    max_tokens=6000,       # soft limit on the active window
    archive_ratio=0.4,     # archive the oldest 40% when limit is hit
    summarize_fn=None,     # optional: plug in your LLM for better summaries
)

mem.add("system", "You are a helpful assistant.")
mem.add("user", "Hello!")
mem.add("assistant", "Hi! How can I help?")

messages = mem.get_messages()   # list[dict] — pass directly to any OpenAI-compatible API
print(mem.stats())
# MemoryStats(active_messages=3, archived=0, active_tokens=24, ...)
```

**With a real LLM summarizer:**

```python
import openai

def gpt_summarize(messages):
    history = "\n".join(f"{m.role}: {m.content}" for m in messages)
    resp = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Summarize in 3 sentences."},
            {"role": "user",   "content": history},
        ],
    )
    return resp.choices[0].message.content

mem = AgentMemory(max_tokens=8000, summarize_fn=gpt_summarize)
```
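
The archive-and-summarize idea can be sketched in a few lines. This is an illustration under stated assumptions, not `AgentMemory`'s code: messages are plain dicts, tokens are estimated at roughly four characters each, only one archiving pass is made, and `fit_window`, `rough_tokens`, and `naive_summary` are hypothetical names.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]


def rough_tokens(text: str) -> int:
    # Crude estimate: about four characters per token.
    return max(1, len(text) // 4)


def naive_summary(messages: List[Message]) -> str:
    # Fallback when no summarizer is supplied: first 40 characters of each turn.
    return " | ".join(f"{m['role']}: {m['content'][:40]}" for m in messages)


def fit_window(
    messages: List[Message],
    max_tokens: int,
    archive_ratio: float = 0.4,
    summarize_fn: Optional[Callable[[List[Message]], str]] = None,
) -> List[Message]:
    """Replace the oldest share of messages with one summary note when over budget."""
    total = sum(rough_tokens(m["content"]) for m in messages)
    if total <= max_tokens:
        return list(messages)
    cut = max(1, int(len(messages) * archive_ratio))
    archived, recent = messages[:cut], messages[cut:]
    summary = (summarize_fn or naive_summary)(archived)
    note = {"role": "system", "content": "Summary of earlier conversation: " + summary}
    return [note] + recent


# Ten turns of 40 characters each: about 100 estimated tokens in total.
history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": "x" * 40}
    for i in range(10)
]
window = fit_window(history, max_tokens=50)
print(len(window))  # 7: one summary note plus the six most recent turns
```

Passing an LLM-backed `summarize_fn` only changes how the archived turns are condensed; the windowing logic stays the same.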

---

### 🛠️ `ToolMinifier` — Tool schema minifier

OpenAI function schemas are JSON-heavy. `ToolMinifier` cuts them down.

```python
from agentslim import ToolMinifier

# Option A: minify but keep JSON format (for the API)
slim_tools = ToolMinifier.minify(tools, max_desc=80)

# Option B: ultra-compact one-liner hint for system prompts
print(ToolMinifier.to_compact_str(tools))
# get_weather(location:string, unit:string?) -> Any  # Get current weather…
# send_email(to:string, subject:string, body:string) -> Any

# Option C: auto-generate schemas from Python functions
def search(query: str, max_results: int) -> str:
    """Search the web for real-time info."""
    ...

tools = ToolMinifier.from_python_functions(search)
```
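
For intuition, the two transformations above (function to schema, schema to one-liner) can be approximated with `inspect` and `typing`. The names `schema_from_function` and `compact_signature` are hypothetical, and the type mapping and the rule "parameters without defaults are required" are assumptions of this sketch, not documented `ToolMinifier` behaviour.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_function(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build an OpenAI-style tool schema from a function's signature and docstring."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    doc = inspect.getdoc(fn) or ""
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": doc.split("\n")[0],
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def compact_signature(tool: Dict[str, Any]) -> str:
    """Render a tool schema as name(arg:type, optional_arg:type?)."""
    fn = tool.get("function", tool)
    params = fn.get("parameters", {})
    required = set(params.get("required", []))
    parts = [
        f"{name}:{spec.get('type', 'any')}{'' if name in required else '?'}"
        for name, spec in params.get("properties", {}).items()
    ]
    return f"{fn['name']}({', '.join(parts)})"


def lookup(query: str, max_results: int = 5) -> str:
    """Look up a query and return a text summary."""
    return ""


print(compact_signature(schema_from_function(lookup)))
# lookup(query:string, max_results:integer?)
```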

| Format | Tokens (example) |
|---|---|
| Full verbose JSON | ~520 |
| `minify()` | ~310 |
| `to_compact_str()` | ~40 |

---

### 📄 `CodeContext` — Code-aware chunk extractor

Don't send 500-line files to your coding agent — send only what it needs.

```python
from agentslim import CodeContext

# Extract a single function (+ N lines of context)
snippet = CodeContext.extract_function("app.py", "process_payment", context_lines=3)

# Extract a class skeleton (signatures only)
skeleton = CodeContext.extract_class("service.py", "PaymentService", methods_only=True)

# Outline: class/function map of the whole file
outline = CodeContext.outline("app.py")
# ['class PaymentService (L12)', 'def charge (L28)', 'def refund (L45)']

# Folded view: function bodies replaced with '...'
folded = CodeContext.folded("large_module.py")

# Extract specific line range
chunk = CodeContext.extract_lines("app.py", start_line=120, end_line=145, context_lines=5)
```
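
An outline or single-function extract of this kind can be built on Python's `ast` module. The sketch below works on a source string rather than a file path, and `outline_source` and `extract_function_source` are hypothetical names; whether `CodeContext` uses `ast` internally is not stated here.

```python
import ast
from typing import List, Optional

SAMPLE = (
    "class PaymentService:\n"
    "    def charge(self):\n"
    "        return 1\n"
    "\n"
    "    def refund(self):\n"
    "        return 2\n"
)


def outline_source(source: str) -> List[str]:
    """List classes and functions with their line numbers, in file order."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            entries.append((node.lineno, f"class {node.name} (L{node.lineno})"))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entries.append((node.lineno, f"def {node.name} (L{node.lineno})"))
    return [label for _, label in sorted(entries)]


def extract_function_source(source: str, name: str) -> Optional[str]:
    """Return the source text of the first function called `name`, or None."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            return ast.get_source_segment(source, node)
    return None


print(outline_source(SAMPLE))
# ['class PaymentService (L1)', 'def charge (L2)', 'def refund (L5)']
```

Parsing with `ast` limits this approach to syntactically valid Python; other languages would need a different parser.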

| View | Tokens saved |
|---|---|
| Full source | 0% |
| Folded | ~55% |
| Outline only | ~85% |

---

### 📊 `utils` — Token counting & cost estimation

```python
from agentslim.utils import count_tokens, estimate_cost

tokens = count_tokens("Hello, world!")  # uses tiktoken if available

cost = estimate_cost(input_tokens=1000, output_tokens=200, model="gpt-4o")
# {'input_usd': 0.0025, 'output_usd': 0.002, 'total_usd': 0.0045}
```
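
A minimal sketch of how "use `tiktoken` if available" and per-model cost estimation might look. The fallback heuristic (about four characters per token), the `cl100k_base` encoding, and the `gpt-4o` prices ($2.50 input and $10.00 output per million tokens, chosen because they reproduce the example above) are assumptions; `rough_count_tokens` and `estimate_cost_usd` are hypothetical names.

```python
# Assumed prices in USD per million tokens; real prices change over time.
PRICES_USD_PER_MILLION = {"gpt-4o": {"input": 2.50, "output": 10.00}}


def rough_count_tokens(text: str) -> int:
    """Count tokens with tiktoken when possible, else estimate from length."""
    try:
        import tiktoken  # optional dependency

        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except Exception:
        # tiktoken is missing or its encoding data is unavailable:
        # fall back to roughly four characters per token, rounded up.
        return (len(text) + 3) // 4


def estimate_cost_usd(input_tokens: int, output_tokens: int, model: str = "gpt-4o") -> dict:
    """Price a request from its input and output token counts."""
    price = PRICES_USD_PER_MILLION[model]
    input_usd = input_tokens * price["input"] / 1_000_000
    output_usd = output_tokens * price["output"] / 1_000_000
    return {
        "input_usd": round(input_usd, 6),
        "output_usd": round(output_usd, 6),
        "total_usd": round(input_usd + output_usd, 6),
    }


print(estimate_cost_usd(1000, 200))
# {'input_usd': 0.0025, 'output_usd': 0.002, 'total_usd': 0.0045}
```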

Supported models: `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`, `gpt-3.5-turbo`,
`claude-3-5-sonnet`, `claude-3-haiku`, `gemini-1.5-pro`, `gemini-1.5-flash`.

---

## Compatibility

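`agentslim` is framework-agnostic. `AgentMemory.get_messages()` returns a list of `{"role": ..., "content": ...}` dicts, which OpenAI-compatible APIs accept as-is; SDKs with a different message shape (for example, Anthropic's separate `system` parameter) need only a small adapter: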

- ✅ OpenAI Python SDK
- ✅ LangChain / LangGraph
- ✅ LlamaIndex
- ✅ Anthropic SDK
- ✅ Google Generative AI SDK
- ✅ Any custom agent framework
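
Some SDKs expect the system prompt outside the message list; Anthropic's Messages API, for instance, takes it as a top-level `system` argument. A small adapter along these lines bridges the gap. `split_system` is a hypothetical helper, not part of `agentslim`, and joining multiple system messages with blank lines is an assumption of this sketch.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]


def split_system(messages: List[Message]) -> Tuple[str, List[Message]]:
    """Separate system messages from the user/assistant turns."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    chat = [m for m in messages if m["role"] != "system"]
    return "\n\n".join(system_parts), chat


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi! How can I help?"},
]
system_prompt, chat = split_system(messages)
print(system_prompt)  # You are a helpful assistant.
print(len(chat))      # 2
```

The returned pair can then be passed as the separate system prompt and message list that such an SDK expects.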

---

## Running tests

```bash
pip install -e ".[dev]"
pytest
```

---

## Contributing

PRs and issues welcome! See [CONTRIBUTING.md](CONTRIBUTING.md).

---

## License

MIT © agentslim contributors
