Metadata-Version: 2.4
Name: loom-runner
Version: 0.1.2
Summary: Durable checkpoint/resume runner for async state-machine loops built on loom-tailcalls.
Author: kroq86
License-Expression: MIT
Project-URL: Homepage, https://github.com/kroq86/loom-runner
Project-URL: Repository, https://github.com/kroq86/loom-runner
Project-URL: Issues, https://github.com/kroq86/loom-runner/issues
Project-URL: Loom stack, https://kroq86.github.io/loom-stack/
Project-URL: Documentation, https://kroq86.github.io/loom-stack/
Project-URL: Official showcase (loom-run), https://github.com/kroq86/loom-run
Keywords: async,agents,checkpoint,sqlite,state-machine,durable-execution
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: loom-tailcalls>=0.2.0
Requires-Dist: flow-xray>=0.3.0
Provides-Extra: dev
Requires-Dist: pytest>=7; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23; extra == "dev"
Dynamic: license-file

[![Loom Stack](https://kroq86.github.io/loom-stack/assets/loom-stack-banner.png)](https://kroq86.github.io/loom-stack/)

# loom-runner

[![PyPI](https://img.shields.io/pypi/v/loom-runner)](https://pypi.org/project/loom-runner/)
[![Loom stack](https://img.shields.io/badge/docs-loom--stack-8B7355)](https://kroq86.github.io/loom-stack/)

Small durable checkpoint/resume runner for async state-machine loops built on
top of `loom-tailcalls` and `flow-xray`. **Full stack overview:** [kroq86.github.io/loom-stack](https://kroq86.github.io/loom-stack/)

**Official showcases:** [loom-run](https://github.com/kroq86/loom-run) (dev chat + MCP) · [loom-ops](https://github.com/kroq86/loom-ops) (ops runbooks + HITL) — both wire runner + flow-xray; pick by domain.

This is not a planner, memory system, graph DSL, hosted tracing product, or
full agent SDK. It is the first slice of a Loom-based agent runtime: run a
typed async transition loop, checkpoint each state transition, resume later,
inspect history, and explain a run.

## Loom stack

**Overview:** [kroq86.github.io/loom-stack](https://kroq86.github.io/loom-stack/) — packages, flow, audience, quick start.

The stack is a pyramid, not five equal frameworks. Tail-call optimization is
the primitive, runner is the durable runtime, xray is the microscope, and the
apps prove the stack in real workflows.

| Layer | Project | Job |
| --- | --- | --- |
| Primitive | **[loom-tailcalls](https://github.com/kroq86/loom-tailcalls)** | Make async recursive/state-machine loops stack-safe |
| Runtime kernel | **[loom-runner](https://github.com/kroq86/loom-runner)** ← **this repo** | Make those loops durable, resumable, idempotent |
| Microscope | **[flow-xray](https://github.com/kroq86/flow-xray)** | Show what actually happened in one offline HTML trace |
| Proof app | **[loom-run](https://github.com/kroq86/loom-run)** | Chat agent reference implementation |
| Proof app | **[loom-ops](https://github.com/kroq86/loom-ops)** | Ops/runbook agent reference implementation |

```text
@tailrec agent loop  →  loom-runner run/resume  →  --trace trace.html
     (shape)                  (durability)              (flow-xray)
```

**This repo is the runtime kernel.** `loom-runner` is the library package and
CLI for durable execution. `loom-run` is a runnable chat showcase built on it;
the names are close, but the layer is different.

Dependency direction: `loom-runner` depends on `loom-tailcalls` and optionally
emits `flow-xray` traces. `loom-run` and `loom-ops` depend on `loom-runner`; the
kernel never depends on the apps.

## Who it is for

- Authors of **long-running async agent loops** who need checkpoint/resume without building their own store
- Users of **[loom-tailcalls](https://github.com/kroq86/loom-tailcalls)** who want persistence and CLI inspection on top of stack-safe transitions
- Users of **[flow-xray](https://github.com/kroq86/flow-xray)** who want `--trace trace.html` from the runner CLI
- Anyone who needs an **inspectable run** (`explain`, `history`, `attempts`, `tool-calls`) rather than a black box

**Not for you** if the agent is a single LLM call, or you already have LangGraph/Temporal (or similar) with persistence you are happy with.

This is not reasoning, planning, memory, or a path to AGI — it is a **durability + observability** primitive for state-machine-shaped agent runtimes.

Runtime transitions are logged as logical steps with attempt history. A retry
does not create a new transition: for the same `run_id`, `step_index`, and
stable input hash, the runner reuses the committed outcome. Transient errors
are retryable by default; validation, business, permission, and unknown errors
fail the run unless the caller supplies a different policy.
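
The reuse rule above can be pictured with a small standalone sketch. It is not the `loom-runner` implementation and does not use its API: `OutcomeLog`, `input_hash`, and the in-memory dictionaries are hypothetical stand-ins, and hashing canonical JSON with SHA-256 is an assumption about what a "stable input hash" could look like.

```python
import hashlib
import json


def input_hash(state: dict) -> str:
    """Hash canonical JSON (sorted keys) so dict ordering does not change the hash."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class OutcomeLog:
    """Hypothetical in-memory stand-in for a committed-outcome table."""

    def __init__(self):
        self.committed = {}  # (run_id, step_index, input_hash) -> outcome
        self.attempts = []   # one row per executed attempt

    def execute(self, run_id, step_index, state, fn):
        key = (run_id, step_index, input_hash(state))
        if key in self.committed:
            # Retry of a committed step: reuse the outcome, do not run fn again.
            return self.committed[key]
        self.attempts.append(key)
        outcome = fn(state)
        self.committed[key] = outcome
        return outcome


log = OutcomeLog()
calls = []


def increment(state):
    calls.append(state)
    return {"current": state["current"] + 1}


first = log.execute("demo", 0, {"current": 0, "target": 3}, increment)
# Same run_id, step_index, and logical input (keys in a different order).
again = log.execute("demo", 0, {"target": 3, "current": 0}, increment)
assert first == again and len(calls) == 1
```

The point of the key is that a retry is looked up rather than re-executed, so the transition function runs once per logical step even when the caller asks twice.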

Tool side effects are only idempotent when invoked through
`RunContext.call_tool(...)`. Direct tool calls or external effects inside a
transition are intentionally treated as unmanaged user code in this first
runtime slice.

Long runs can use bounded reads and explicit storage policies. By default the
runner keeps every checkpoint and every inline tool payload for maximum
inspectability. For larger runs, use `CheckpointPolicy(mode="interval",
every=N)` to retain only periodic history checkpoints while preserving the
current resumable state, and `PayloadPolicy(max_inline_bytes=N)` to replace
large managed tool payloads with hash/size metadata.
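
As a rough illustration of what the two policies trade away, the sketch below models them in plain Python. It does not call `loom-runner`; `retained_steps` and `store_payload` are hypothetical helpers, and the exact retention rule (keep indices divisible by `every`, plus the latest step) and the metadata field names are assumptions, not the library's documented behavior.

```python
import hashlib
import json


def retained_steps(total_steps: int, every: int) -> list[int]:
    """Assumed interval rule: keep indices divisible by `every`, plus the latest step."""
    kept = [i for i in range(total_steps) if i % every == 0]
    last = total_steps - 1
    if total_steps and last not in kept:
        kept.append(last)  # the current state must stay resumable
    return kept


def store_payload(payload: dict, max_inline_bytes: int) -> dict:
    """Keep small payloads inline; replace large ones with hash/size metadata."""
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    if len(raw) <= max_inline_bytes:
        return {"inline": payload}
    return {"sha256": hashlib.sha256(raw).hexdigest(), "size": len(raw)}


kept = retained_steps(total_steps=10, every=4)
small = store_payload({"ok": True}, max_inline_bytes=64)
large = store_payload({"body": "x" * 1000}, max_inline_bytes=64)
assert kept == [0, 4, 8, 9]
assert "inline" in small and "inline" not in large
```

Under these assumptions, interval retention shrinks history roughly by a factor of `every` while the last checkpoint keeps the run resumable, and a replaced payload can still be verified by hash even though its content is no longer stored.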

The import package remains `loom_agent`; the distribution and CLI are named
`loom-runner` because `loom-agent` is already occupied by an unrelated package
on PyPI.

## Install

From a source checkout (editable install with the `dev` extra):

```bash
python3.13 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
```

## Minimal Shape

```python
from dataclasses import dataclass

from loom_agent import AgentRunner, Complete, Continue, RunContext, SQLiteCheckpointStore


@dataclass(frozen=True)
class State:
    current: int
    target: int


async def step(state: State, ctx: RunContext):
    if state.current >= state.target:
        return Complete({"current": state.current})
    return Continue(State(current=state.current + 1, target=state.target))


runner = AgentRunner(
    step=step,
    store=SQLiteCheckpointStore("runs.sqlite"),
    encode_state=lambda state: {"current": state.current, "target": state.target},
    decode_state=lambda data: State(**data),
    encode_result=lambda result: result,
    decode_result=lambda data: data,
)
```
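
Before wiring a state type into a store, it can help to check that the encode/decode pair round-trips. The sketch below reuses the `State` shape and the two codecs from the example above without importing `loom_agent`; passing the encoded value through `json` is an assumption about how a store might serialize it, not a statement about `SQLiteCheckpointStore` internals.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    current: int
    target: int


def encode_state(state: State) -> dict:
    # Same mapping as the encode_state lambda in the example above.
    return {"current": state.current, "target": state.target}


def decode_state(data: dict) -> State:
    # Same mapping as the decode_state lambda in the example above.
    return State(**data)


original = State(current=2, target=5)
# Assumed serialization boundary: encoded state must survive a JSON round trip.
restored = decode_state(json.loads(json.dumps(encode_state(original))))
assert restored == original
```

If a state field is not JSON-friendly (for example a `datetime` or a `set`), this check fails early, which is cheaper than discovering it on resume.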

## Example

```bash
loom-runner run examples/counter_agent.py --run-id demo --db runs.sqlite --max-steps 5
loom-runner resume examples/counter_agent.py --run-id demo --db runs.sqlite --max-steps 100
loom-runner list examples/counter_agent.py --db runs.sqlite
loom-runner get examples/counter_agent.py --run-id demo --db runs.sqlite
loom-runner history examples/counter_agent.py --run-id demo --db runs.sqlite
loom-runner attempts examples/counter_agent.py --run-id demo --db runs.sqlite --limit 20
loom-runner tool-calls examples/counter_agent.py --run-id demo --db runs.sqlite --limit 20
loom-runner explain examples/counter_agent.py --run-id demo --db runs.sqlite
```

Add `--trace trace.html` to `run` or `resume` to emit a local `flow-xray` HTML
trace. The runner traces step leaves and keeps the tail-recursive driver as the
durable loop boundary.

Or run the example script directly:

```bash
python3.13 examples/counter_agent.py
```

## Tests

```bash
python3.13 -m pytest
```

## Runtime Benchmark

```bash
python3.13 scripts/bench_runtime.py --steps 100000
python3.13 scripts/bench_runtime.py --steps 100000 --checkpoint-every 100
```

The benchmark reports wall time, retained checkpoint rows, attempt rows, DB
size, and peak Python memory. It is a local regression tool, not a hosted-scale
performance claim.
