Metadata-Version: 2.4
Name: sir-sdk
Version: 0.0.2
Summary: Single-Input-Reasoning: one LLM call, full action graph execution with evolutionary memory
Project-URL: Homepage, https://github.com/tommasobredariol/sir-sdk
Project-URL: Repository, https://github.com/tommasobredariol/sir-sdk
Project-URL: Issues, https://github.com/tommasobredariol/sir-sdk/issues
Author-email: "Tommaso G. Bredariol" <tommasobredariol@gmail.com>
License-Expression: AGPL-3.0-or-later
License-File: LICENSE
Keywords: action-graph,agent,dag,evolutionary-memory,llm,reasoning,single-shot
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: click>=8.0
Requires-Dist: msgpack>=1.0
Requires-Dist: pydantic>=2.0
Requires-Dist: rich>=13.0
Provides-Extra: all
Requires-Dist: anthropic>=0.40; extra == 'all'
Requires-Dist: boto3>=1.35; extra == 'all'
Requires-Dist: ollama>=0.4; extra == 'all'
Requires-Dist: openai>=1.0; extra == 'all'
Provides-Extra: bedrock
Requires-Dist: boto3>=1.35; extra == 'bedrock'
Provides-Extra: claude
Requires-Dist: anthropic>=0.40; extra == 'claude'
Provides-Extra: dev
Requires-Dist: pytest-asyncio>=0.24; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Provides-Extra: gemini
Requires-Dist: openai>=1.0; extra == 'gemini'
Provides-Extra: mistral
Requires-Dist: openai>=1.0; extra == 'mistral'
Provides-Extra: ollama
Requires-Dist: ollama>=0.4; extra == 'ollama'
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == 'openai'
Provides-Extra: openrouter
Requires-Dist: openai>=1.0; extra == 'openrouter'
Provides-Extra: perplexity
Requires-Dist: openai>=1.0; extra == 'perplexity'
Description-Content-Type: text/markdown

<p align="center">
  <img src="static/logo.png" alt="SIR Logo" width="200">
</p>

<h1 align="center">SIR — Single-Input-Reasoning</h1>

<p align="center">
  <strong>One LLM call. Full action graph. Evolutionary memory.</strong>
</p>

<p align="center">
  <a href="https://pypi.org/project/sir-sdk/"><img src="https://img.shields.io/pypi/v/sir-sdk" alt="PyPI"></a>
  <a href="https://pypi.org/project/sir-sdk/"><img src="https://img.shields.io/pypi/pyversions/sir-sdk" alt="Python"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/license-AGPL--3.0-blue.svg" alt="License"></a>
</p>

---

SIR is a Python SDK that lets developers delegate complex multi-step tasks to an LLM with a **single inference call**. Instead of the traditional ReAct loop (think → act → observe → repeat), SIR asks the LLM to produce an entire **Directed Acyclic Graph (DAG)** of actions in one shot, then executes it locally with parallelism, fan-out, retry, conditional branching, speculative execution, and DAG branching.

## What makes SIR different

<table width="100%">
<thead>
<tr>
<th align="left" width="25%">Feature</th>
<th align="center" width="15%">ReAct</th>
<th align="center" width="15%">Plan & Execute</th>
<th align="center" width="15%">Chain-of-Tools</th>
<th align="center" width="15%"><strong>SIR</strong></th>
</tr>
</thead>
<tbody>
<tr><td>LLM calls per task</td><td align="center">N (one per step)</td><td align="center">1 + N</td><td align="center">1</td><td align="center"><strong>1</strong></td></tr>
<tr><td>Parallel execution</td><td align="center">❌</td><td align="center">❌</td><td align="center">❌</td><td align="center"><strong>Full DAG</strong></td></tr>
<tr><td>Adaptive tool selection</td><td align="center">✅ (slow)</td><td align="center">✅ (slow)</td><td align="center">❌ hardcoded</td><td align="center"><strong>✅ (1 call)</strong></td></tr>
<tr><td>Conditional branching</td><td align="center">Via LLM re-call</td><td align="center">Via LLM re-call</td><td align="center">❌</td><td align="center"><strong>Local eval</strong></td></tr>
<tr><td>Fan-out (map-reduce)</td><td align="center">Manual</td><td align="center">Manual</td><td align="center">❌</td><td align="center"><strong>Built-in</strong></td></tr>
<tr><td>Speculative execution</td><td align="center">❌</td><td align="center">❌</td><td align="center">❌</td><td align="center"><strong>✅</strong></td></tr>
<tr><td>DAG branching (multi-path)</td><td align="center">❌</td><td align="center">❌</td><td align="center">❌</td><td align="center"><strong>✅</strong></td></tr>
<tr><td>Post-LLM graph optimization</td><td align="center">❌</td><td align="center">❌</td><td align="center">❌</td><td align="center"><strong>✅</strong></td></tr>
<tr><td>Evolutionary memory</td><td align="center">❌</td><td align="center">❌</td><td align="center">❌</td><td align="center"><strong>dags.bin</strong></td></tr>
<tr><td>Token efficiency</td><td align="center">Low</td><td align="center">Low</td><td align="center">Medium</td><td align="center"><strong>High (compressed)</strong></td></tr>
<tr><td>Cost</td><td align="center">High (N calls)</td><td align="center">High (1+N)</td><td align="center">Medium (1)</td><td align="center"><strong>Minimal (1)</strong></td></tr>
</tbody>
</table>

## Installation

```bash
pip install sir-sdk                # core only
pip install "sir-sdk[ollama]"      # + Ollama support
pip install "sir-sdk[openai]"      # + OpenAI support
pip install "sir-sdk[claude]"      # + Anthropic Claude support
pip install "sir-sdk[gemini]"      # + Google Gemini support
pip install "sir-sdk[bedrock]"     # + AWS Bedrock support
pip install "sir-sdk[openrouter]"  # + OpenRouter support
pip install "sir-sdk[perplexity]"  # + Perplexity support
pip install "sir-sdk[mistral]"     # + Mistral support
pip install "sir-sdk[all]"         # everything
```

## Quick Start

```python
import requests

from sir import SIR, tool

@tool
def search_web(query: str) -> str:
    """Search the web."""
    return requests.get(f"https://api.search.com?q={query}").text

@tool
def summarize(text: str) -> str:
    """Summarize text."""
    return text[:200] + "..."

@tool
def translate(text: str, lang: str) -> str:
    """Translate text."""
    return f"[{lang}] {text}"  # placeholder; swap in a real translation call

sir = SIR(model="qwen2.5:14b")
result = sir.run(
    "Search latest AI news, summarize, and translate to Italian",
    tools=[search_web, summarize, translate],
)
print(result.final_result)
```

**That's it.** One LLM call → full DAG → parallel execution → result.

## How It Works

<p align="center">
  <img src="static/dag.png" alt="SIR DAG Example" width="100%">
</p>

```
sir.run(prompt, tools)
        │
        ▼
┌──────────────────────────────────┐
│ 1. Memory Lookup                 │ ← Semantic vector search in dags.bin
│ 2. Prompt Compilation            │ ← Compressed tool schemas + memory
│ 3. Single LLM Call               │ ← One inference → full action graph
│ 4. Graph Optimization            │ ← Dead-step elimination, dedup, dep relaxation
│ 5. Parallel Graph Execution      │ ← Topological sort → async + speculative
│ 6. Evolutionary Scoring          │ ← Score steps, deprecate bad ones
│ 7. Memory Persistence            │ ← Save to dags.bin
└──────────────────────────────────┘
```

## Benchmarks

### SIR vs Chain-of-Tools (Effectiveness)

SIR adaptively selects only the tools needed. Chain-of-Tools uses a hardcoded pipeline with unnecessary steps.

<p align="center">
  <img src="static/plots/sir_vs_chain.png" alt="SIR vs Chain-of-Tools" width="100%">
</p>

Benchmarked across 5 complexity levels (L1: 2 tools → L5: 11 parallel steps) using the same LLM:

<table width="100%">
<thead>
<tr>
<th align="left" width="40%">Metric</th>
<th align="center" width="30%"><strong>SIR</strong></th>
<th align="center" width="30%">Chain-of-Tools</th>
</tr>
</thead>
<tbody>
<tr><td>Avg Tool Efficiency</td><td align="center"><strong>100%</strong></td><td align="center">71%</td></tr>
<tr><td>Avg Step Efficiency</td><td align="center"><strong>94%</strong></td><td align="center">64%</td></tr>
<tr><td>Total Wasted Tools</td><td align="center"><strong>0</strong></td><td align="center">4</td></tr>
<tr><td>Total Wasted Steps</td><td align="center"><strong>0</strong></td><td align="center">13</td></tr>
<tr><td>Total Tokens</td><td align="center"><strong>5,693 (−16%)</strong></td><td align="center">6,769</td></tr>
<tr><td>Total Wall Time</td><td align="center"><strong>17s (−40%)</strong></td><td align="center">28s</td></tr>
</tbody>
</table>

## Architecture

### Graph Optimization (post-LLM)

After the LLM generates the DAG, SIR runs three compiler passes before execution (a sketch of the first pass follows the list):

- **Dead-step elimination** — removes steps whose output is never referenced
- **Duplicate merge** — merges steps calling the same tool with identical args
- **Dependency relaxation** — removes unnecessary dependencies to unlock parallelism
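
To make the first pass concrete, here is a minimal sketch of dead-step elimination, assuming steps are plain dicts with `id` and `depends_on` keys and a known final step id; the SDK's actual optimizer internals are not shown in this README and may differ:

```python
def eliminate_dead_steps(steps: list[dict], final_id: str) -> list[dict]:
    """Keep only steps reachable from the final step via depends_on edges;
    everything else produces output that is never referenced."""
    by_id = {s["id"]: s for s in steps}
    live: set[str] = set()
    stack = [final_id]
    while stack:
        sid = stack.pop()
        if sid in live:
            continue
        live.add(sid)
        stack.extend(by_id[sid].get("depends_on", []))
    return [s for s in steps if s["id"] in live]
```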

### Speculative Execution

While the current layer executes, SIR speculatively launches steps from the next layer if their dependencies are already available. This reduces total wall time on deep DAGs.
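
A minimal sketch of the idea with `asyncio`, assuming a hypothetical `run_step` coroutine that executes a single step: each step waits only on its own dependencies, so a next-layer step starts the moment its inputs are ready instead of waiting for the whole current layer to finish.

```python
import asyncio

async def run_dag(steps: dict[str, dict], run_step):
    """Execute a DAG of steps, launching each one as soon as its own
    dependencies complete (not when its layer completes)."""
    results: dict[str, object] = {}
    done = {sid: asyncio.Event() for sid in steps}

    async def runner(sid: str):
        for dep in steps[sid].get("depends_on", []):
            await done[dep].wait()          # wait only on this step's deps
        results[sid] = await run_step(steps[sid], results)
        done[sid].set()

    await asyncio.gather(*(runner(sid) for sid in steps))
    return results
```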

### DAG Branching (Multi-Path)

Steps can define `alternatives` — multiple tool strategies that race in parallel:

```json
{
  "id": "s1",
  "tool": "search",
  "args": {"query": "AI news"},
  "alternatives": [{"tool": "fetch_details", "args": {"entity": "AI"}}],
  "select": "fastest"
}
```

Strategies: `fastest` (first to succeed wins), `shortest`, `longest`.
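
A sketch of how a `fastest` race could be implemented with `asyncio`; this is illustrative, not the SDK's actual executor:

```python
import asyncio

async def race_fastest(candidates):
    """Run alternative tool calls concurrently; the first one to succeed
    wins, and the losers are cancelled. `candidates` is a list of
    coroutines, one per alternative."""
    tasks = [asyncio.ensure_future(c) for c in candidates]
    try:
        while tasks:
            finished, pending = await asyncio.wait(
                tasks, return_when=asyncio.FIRST_COMPLETED
            )
            for task in finished:
                if task.exception() is None:
                    return task.result()    # first success wins
            tasks = list(pending)           # all finished ones failed; keep waiting
        raise RuntimeError("all alternatives failed")
    finally:
        for task in tasks:
            task.cancel()                   # no-op for already-finished tasks
```

Because losing alternatives are cancelled once a winner returns, a slow fallback costs nothing when the primary tool succeeds first.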

### Token Compression

SIR uses compressed JSON aliases to minimize token usage:

<table width="100%">
<thead>
<tr>
<th align="left" width="40%">Full key</th>
<th align="center" width="30%">Alias</th>
<th align="center" width="30%">Savings</th>
</tr>
</thead>
<tbody>
<tr><td><code>tool</code></td><td align="center"><code>t</code></td><td align="center">3 tokens</td></tr>
<tr><td><code>args</code></td><td align="center"><code>a</code></td><td align="center">3 tokens</td></tr>
<tr><td><code>depends_on</code></td><td align="center"><code>d</code></td><td align="center">9 tokens</td></tr>
<tr><td><code>condition</code></td><td align="center"><code>c</code></td><td align="center">8 tokens</td></tr>
<tr><td><code>foreach</code></td><td align="center"><code>f</code></td><td align="center">6 tokens</td></tr>
<tr><td><code>final_step</code></td><td align="center"><code>fs</code></td><td align="center">9 tokens</td></tr>
</tbody>
</table>

The parser auto-expands aliases and is fully backward-compatible with full key names.
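
A sketch of what that expansion can look like, built from the alias table above (the `r` retry key shown later is omitted because its full name is not documented here); the real parser may differ:

```python
# Alias map taken from the table above.
ALIASES = {"t": "tool", "a": "args", "d": "depends_on",
           "c": "condition", "f": "foreach", "fs": "final_step"}

def expand_aliases(node):
    """Recursively rewrite short keys to their full names; full key names
    pass through unchanged, which is what makes the parser backward-compatible."""
    if isinstance(node, dict):
        return {ALIASES.get(k, k): expand_aliases(v) for k, v in node.items()}
    if isinstance(node, list):
        return [expand_aliases(v) for v in node]
    return node
```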

## Tool Modes

SIR gives developers control over how much autonomy the LLM has in selecting tools:

<table width="100%">
<thead>
<tr>
<th align="left" width="20%">Mode</th>
<th align="left" width="45%">Behavior</th>
<th align="left" width="35%">Use case</th>
</tr>
</thead>
<tbody>
<tr><td><code>adaptive</code> (default)</td><td>LLM picks the minimum tools needed</td><td>Generic prompts, many tools available</td></tr>
<tr><td><code>strict</code></td><td>ALL tools passed must be used; LLM decides order/parallelism only</td><td>Predictable pipelines, no surprises</td></tr>
<tr><td><code>required</code></td><td>Tools marked <code>required=True</code> are mandatory, others optional</td><td>Mix of fixed + flexible</td></tr>
</tbody>
</table>

```python
# Adaptive — LLM chooses
sir = SIR(tool_mode="adaptive")

# Strict — all tools must be used
sir = SIR(tool_mode="strict")

# Required — mark optional tools
@tool(required=False)
def cache(key: str, value: str) -> str: ...

sir = SIR(tool_mode="required")
```

## Evolutionary Memory (`dags.bin`)

SIR persists every executed action graph to a msgpack-encoded binary file, together with vector embeddings for semantic retrieval.

```
Run 1: LLM generates plan → execute → score → store in dags.bin
Run 2: Load prior plan → LLM sees scores/notes → improves plan → update
Run 3: Step X scored 2.1 → DEPRECATED → LLM replaces with better alternative
Run N: Converges to optimal action graph for this task
```

Each step stores the following fields (see the scoring sketch after this list):
- **score** (0-10) — exponential moving average
- **notes** — LLM annotations from previous runs
- **executions** — how many times it ran
- **deprecated** — `true` if score < threshold after ≥3 runs
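
A hypothetical scoring sketch built from the fields above. The smoothing factor is not documented, so `alpha` here is an assumption; the 3.0 threshold and the three-run minimum match the defaults described elsewhere in this README:

```python
def update_step_stats(step: dict, new_score: float,
                      alpha: float = 0.3, threshold: float = 3.0) -> dict:
    """Blend the latest run's 0-10 score into an exponential moving average,
    then deprecate steps that stay weak after three or more executions."""
    step["executions"] = step.get("executions", 0) + 1
    prev = step.get("score")
    step["score"] = new_score if prev is None else alpha * new_score + (1 - alpha) * prev
    step["deprecated"] = step["executions"] >= 3 and step["score"] < threshold
    return step
```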

## Advanced Features

### Conditional Branching
```json
{"id":"s3","t":"notify","a":{"msg":"$s2.result"},"d":["s2"],
 "c":{"ref":"$s2.result","op":"contains","val":"error"}}
```

### Fan-out (Map-Reduce)
```json
{"id":"s2","t":"process","a":{"item":"$item"},"d":["s1"],"f":"$s1.result"}
```

Supports both `$sN.result` references and inline arrays:
```json
{"id":"s1","t":"search","a":{"query":"$item"},"f":["topic A","topic B"]}
```
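
A sketch of how a `foreach` step can be expanded into one clone per item, shown with full key names (i.e. after alias expansion); the runtime's real fan-out machinery may differ:

```python
def expand_foreach(step: dict, results: dict) -> list[dict]:
    """Resolve the foreach source (inline list or "$sN.result"), then clone
    the step once per item with "$item" substituted into its args."""
    src = step["foreach"]
    items = results[src.lstrip("$").split(".")[0]] if isinstance(src, str) else src
    clones = []
    for i, item in enumerate(items):
        args = {k: (item if v == "$item" else v) for k, v in step["args"].items()}
        clone = {**step, "id": f"{step['id']}[{i}]", "args": args}
        clone.pop("foreach", None)  # clones are ordinary, non-fanned steps
        clones.append(clone)
    return clones
```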

### Retry Policy
```json
{"id":"s1","t":"unreliable_api","a":{"url":"..."},"r":3}
```
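
Here `"r": 3` asks the executor to retry the step on failure. A generic retry sketch with exponential backoff; the backoff timing is an assumption, not documented SDK behavior:

```python
import asyncio

async def call_with_retry(fn, retries: int = 3, base_delay: float = 0.5):
    """Retry a failing coroutine factory up to `retries` extra times,
    doubling the delay between attempts (0.5s, 1s, 2s, ...)."""
    for attempt in range(retries + 1):
        try:
            return await fn()
        except Exception:
            if attempt == retries:
                raise                       # out of attempts; surface the error
            await asyncio.sleep(base_delay * 2 ** attempt)
```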

## Providers

All providers read API keys from environment variables by default. You can also pass them explicitly.

### Ollama (default)
```python
from sir.providers import OllamaProvider
sir = SIR(provider=OllamaProvider(model="qwen2.5:14b"))
```

### OpenAI
```python
from sir.providers import OpenAIProvider
sir = SIR(provider=OpenAIProvider(model="gpt-4o"))  # reads OPENAI_API_KEY
```

### Claude (Anthropic)
```python
from sir.providers import ClaudeProvider
sir = SIR(provider=ClaudeProvider(model="claude-sonnet-4-20250514"))  # reads ANTHROPIC_API_KEY
```

### Gemini (Google)
```python
from sir.providers import GeminiProvider
sir = SIR(provider=GeminiProvider(model="gemini-2.5-flash"))  # reads GEMINI_API_KEY
```

### AWS Bedrock
```python
from sir.providers import BedrockProvider
sir = SIR(provider=BedrockProvider(model="anthropic.claude-sonnet-4-20250514-v1:0"))  # reads AWS_REGION + AWS_BEARER_TOKEN_BEDROCK
```

### OpenRouter
```python
from sir.providers import OpenRouterProvider
sir = SIR(provider=OpenRouterProvider(model="openai/gpt-4o"))  # reads OPENROUTER_API_KEY
```

### Perplexity
```python
from sir.providers import PerplexityProvider
sir = SIR(provider=PerplexityProvider(model="sonar-pro"))  # reads PERPLEXITY_API_KEY
```

### Mistral
```python
from sir.providers import MistralProvider
sir = SIR(provider=MistralProvider(model="mistral-large-latest"))  # reads MISTRAL_API_KEY
```

### Custom Provider
```python
from sir.providers.llm import LLMProvider

class MyProvider(LLMProvider):
    async def generate(self, messages, **kwargs) -> str:
        return await my_custom_llm(messages)
```

## Configuration

```python
sir = SIR(
    provider=OllamaProvider(model="qwen2.5:14b"),
    memory_path="dags.bin",           # binary memory file
    enable_memory=True,               # toggle memory system
    enable_optimizer=True,            # toggle graph compression
    enable_speculation=True,          # toggle speculative execution
    tool_mode="adaptive",             # "adaptive" | "strict" | "required"
    deprecation_threshold=3.0,        # score below this → deprecated
    similarity_threshold=0.78,        # semantic memory match threshold
    max_tokens=4096,                  # LLM output limit
    llm_retries=2,                    # retry on LLM/parse failure
)
```

## CLI

```bash
sir run "Search AI news and summarize" -t tools.py
sir run "..." -t tools.py --stream     # live streaming
sir inspect                             # view evolutionary memory
sir clear                               # clear memory
```

## License

AGPL-3.0 — See [LICENSE](LICENSE) for details.
