Metadata-Version: 2.4
Name: recursive-flow
Version: 0.4.2
Summary: Build Recursive Language Models as inspectable execution graphs.
Project-URL: Homepage, https://github.com/shyamsn97/recursive-flow
Project-URL: Repository, https://github.com/shyamsn97/recursive-flow
Project-URL: Issues, https://github.com/shyamsn97/recursive-flow/issues
Author-email: Shyam Sudhakaran <shyamsnair@protonmail.com>
License-Expression: MIT
License-File: LICENSE
Keywords: agent,graph,llm,recursive,repl
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Requires-Dist: jsonschema>=4.0
Requires-Dist: pydantic>=2.0
Requires-Dist: tenacity>=8.0
Provides-Extra: all
Requires-Dist: anthropic>=0.20; extra == 'all'
Requires-Dist: datasets>=2.19; extra == 'all'
Requires-Dist: daytona-sdk; extra == 'all'
Requires-Dist: dspy; extra == 'all'
Requires-Dist: e2b; extra == 'all'
Requires-Dist: gradio>=4.29; extra == 'all'
Requires-Dist: huggingface-hub>=0.23; extra == 'all'
Requires-Dist: kaleido>=0.2.1; extra == 'all'
Requires-Dist: mcp; extra == 'all'
Requires-Dist: modal; extra == 'all'
Requires-Dist: openai>=1.0; extra == 'all'
Requires-Dist: openpyxl>=3.1; extra == 'all'
Requires-Dist: pandas>=2.0; extra == 'all'
Requires-Dist: plotly>=5.0; extra == 'all'
Requires-Dist: pyarrow>=15.0; extra == 'all'
Requires-Dist: pypdf2>=3.0; extra == 'all'
Requires-Dist: python-docx>=1.1; extra == 'all'
Requires-Dist: rich>=13.0; extra == 'all'
Requires-Dist: tinker; extra == 'all'
Requires-Dist: tinker-cookbook; extra == 'all'
Requires-Dist: tqdm>=4.66; extra == 'all'
Requires-Dist: wandb>=0.16; extra == 'all'
Provides-Extra: anthropic
Requires-Dist: anthropic>=0.20; extra == 'anthropic'
Provides-Extra: daytona
Requires-Dist: daytona-sdk; extra == 'daytona'
Provides-Extra: dev
Requires-Dist: pytest-cov>=5.0; extra == 'dev'
Requires-Dist: pytest-timeout>=2.0; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: rich>=13.0; extra == 'dev'
Requires-Dist: ruff>=0.4; extra == 'dev'
Provides-Extra: dspy
Requires-Dist: dspy; extra == 'dspy'
Provides-Extra: e2b
Requires-Dist: e2b; extra == 'e2b'
Provides-Extra: eval
Requires-Dist: datasets>=2.19; extra == 'eval'
Requires-Dist: huggingface-hub>=0.23; extra == 'eval'
Requires-Dist: openpyxl>=3.1; extra == 'eval'
Requires-Dist: pandas>=2.0; extra == 'eval'
Requires-Dist: pyarrow>=15.0; extra == 'eval'
Requires-Dist: pypdf2>=3.0; extra == 'eval'
Requires-Dist: python-docx>=1.1; extra == 'eval'
Requires-Dist: tqdm>=4.66; extra == 'eval'
Requires-Dist: wandb>=0.16; extra == 'eval'
Provides-Extra: image
Requires-Dist: kaleido>=0.2.1; extra == 'image'
Requires-Dist: plotly>=5.0; extra == 'image'
Provides-Extra: mcp
Requires-Dist: mcp; extra == 'mcp'
Provides-Extra: modal
Requires-Dist: modal; extra == 'modal'
Provides-Extra: openai
Requires-Dist: openai>=1.0; extra == 'openai'
Provides-Extra: sandbox
Requires-Dist: daytona-sdk; extra == 'sandbox'
Requires-Dist: e2b; extra == 'sandbox'
Requires-Dist: modal; extra == 'sandbox'
Provides-Extra: tinker
Requires-Dist: tinker; extra == 'tinker'
Requires-Dist: tinker-cookbook; extra == 'tinker'
Provides-Extra: viewer
Requires-Dist: gradio>=4.29; extra == 'viewer'
Requires-Dist: plotly>=5.0; extra == 'viewer'
Requires-Dist: rich>=13.0; extra == 'viewer'
Description-Content-Type: text/markdown

# recursive-flow

<p align="center">
  <a href="https://pypi.org/project/recursive-flow/"><img src="https://img.shields.io/pypi/v/recursive-flow.svg?label=pypi" alt="PyPI" /></a>
  <a href="https://github.com/shyamsn97/recursive-flow/pkgs/container/recursive-flow"><img src="https://img.shields.io/badge/ghcr.io-recursive-flow-2496ED?logo=docker&logoColor=white" alt="Docker" /></a>
</p>

A Python library for building recursive agent graphs, based on [Recursive Language Models](https://arxiv.org/abs/2512.24601).

As LLMs get better at coding, strict agent harnesses become less important.
RLMs let the model decide how to view and manipulate context, when to
delegate pieces of it to sub-agents, and how to combine the results,
all through the same clean coding interface.

**recursive-flow** turns that recursive run into a live execution graph. Every
query, action, observation, child call, wait, resume, and result is a
typed node you can inspect, step, retrace, save, load, fork, and branch into
new run directories. It is for people building long-context agents, recursive
coding agents, and research loops where the execution trace needs to be
as controllable as the final answer is useful. Each `start` / `step`
returns a fresh `Graph` snapshot: a recursive structure where
`graph[agent_id]` returns the sub-agent graph for that agent.

<p align="center">
  <img src="docs/rlm_animation.gif" alt="recursive-flow animation" />
</p>

## RLMs as Graphs

RLMs delegate subtasks to children, those children can delegate to their
own children, and results bubble back up. **recursive-flow** represents the
whole run as one recursive type:

- **`Graph`** — one agent snapshot. Carries the agent's run-invariants
  flat on itself (`agent_id`, `depth`, `query`, `system_prompt`,
  `config`, `runtime`, `model`, `parent_agent_id`, `parent_node_id`),
  plus its `nodes` trajectory
  and a `children: dict[str, Graph]` of sub-agents. Cross-agent
  navigation is `graph[other_aid]`; subtree views are `graph.agents`,
  `graph.all_nodes`, `graph.edges`.
- **`Node`** — one immutable node in an agent's trajectory. The
  trajectory is a strict alternation of **observations** (inputs the
  system received) and **actions** (work the system did). Nine leaf
  types live under four base classes — see
  [`docs/node_model.md`](docs/node_model.md):
  - Observations: `UserQuery`, `LLMOutput`, `ExecOutput`,
    `SupervisingOutput`, `ErrorOutput`, `DoneOutput`.
  - Actions: `LLMAction`, `ExecAction`, `ResumeAction`.

The agent has one delegation call: `await launch_subagents([...])`. It always
takes a list of dict specs and always returns child answers as a `list[str]` in
the same order. A one-child delegation is just a one-item list. An agent that
delegates two children and combines their results writes one REPL block like
this:

```python
results = await launch_subagents([
    {"name": "search", "query": "Find evidence", "inputs": {"chunk": chunk_a}},
    {"name": "verify", "query": "Check the answer", "inputs": {"chunk": chunk_b}},
])
done(combine(results))
```

The `await` is the supervision point: it suspends the parent at a single
`WaitRequest`, the engine runs the children on its pool, then resumes the
parent with their results. The REPL supports top-level await and the engine
drives the resulting coroutine, roughly:

```python
out = coro.send(None)              # run until the await
# out is a WaitRequest([search, verify]) -> suspend the parent, run children
results = [c.result() for c in children]
coro.send(results)                 # resume; `results` is now the list
```

The REPL is stateful across blocks, so the next LLM turn can still see
`results`. The launcher must be awaited; a bare call or a top-level `yield`
is an error. Agents should use `launch_subagents(...)` for delegation.

See [`docs/internals.md`](docs/internals.md) for the full protocol.

The block above becomes this execution graph (one obs/action pair
per step):

```text
UserQuery(root)
  -> LLMAction -> LLMOutput(code="await launch_subagents([search, verify])")
  -> ExecAction -> SupervisingOutput(waiting_on=[root.search, root.verify])
      -> UserQuery(root.search)  -> ... -> DoneOutput(root.search)
      -> UserQuery(root.verify)  -> ... -> DoneOutput(root.verify)
  -> ResumeAction -> ExecOutput(resumed_from=[root.search, root.verify])
  -> LLMAction -> LLMOutput(code="done(combine(...))")
  -> ExecAction -> DoneOutput(root)
```

## Install

```bash
pip install recursive-flow               # core
pip install recursive-flow[openai]       # + OpenAI client
pip install recursive-flow[anthropic]    # + Anthropic client
pip install recursive-flow[tinker]       # + Tinker inference client
pip install recursive-flow[dspy]         # + DSPy adapter
pip install recursive-flow[sandbox]      # + Modal, E2B, and Daytona runtimes
pip install recursive-flow[viewer]       # + Gradio viewer (plotly)
pip install recursive-flow[image]        # + static image / GIF export (kaleido)
pip install recursive-flow[all]          # all of the above
```

From source:

```bash
git clone https://github.com/shyamsn97/recursive-flow && cd recursive-flow
pip install -e .
```

For local development, `make install` runs cleanup, formatting/lint checks
including `ruff check .`, then installs the package.

> **Security warning — `LocalRuntime` is not a sandbox.**
> Agent code runs as full Python in your process: filesystem, network,
> environment variables, subprocesses — the same privileges as your interpreter.
> LLM-generated code can be wrong or malicious (prompt injection, model errors,
> supply-chain risk). **Use `LocalRuntime` only for code you would run yourself.**
> For untrusted agents or anything exposed to the internet, use
> [`DockerRuntime`](docs/runtimes.md) or a remote sandbox
> ([`ModalRuntime`](docs/runtimes.md) / [`E2BRuntime`](docs/runtimes.md) /
> [`DaytonaRuntime`](docs/runtimes.md)). See [`docs/security.md`](docs/security.md).
> **Use at your own risk.**

## Quick start

This example builds a simple coding agent with file tools in a local working
directory. See [`examples/notebooks/coding_agent.ipynb`](./examples/notebooks/coding_agent.ipynb)
for the notebook version.

```python
from pathlib import Path

import rflow
from rflow.tools import FILE_TOOLS
from rflow.utils.viewer import open_viewer

workdir = Path("examples/_runs/quickstart")
runtime = rflow.LocalRuntime(working_directory=workdir)
runtime.register_tools(FILE_TOOLS)

# Sandbox agent code inside Docker instead: drop-in replacement, same interface.
# Build the image once with `docker build -t recursive-flow:local .`.
# runtime = rflow.DockerRuntime(
#     "recursive-flow:local",
#     working_directory=workdir,
#     mounts={workdir: "/workspace"},
#     workdir="/workspace",
# )
# runtime.register_tools(FILE_TOOLS)

agent = rflow.Flow(
    rflow.OpenAIClient(model="gpt-5"),
    runtime=runtime,
    max_depth=2,
    max_iters=20,
    child_max_iters=20,
    llm_clients={"fast": rflow.OpenAIClient(model="gpt-5-mini")},
)

query = "Build a Python text-based adventure game with combat and inventory."
graph = agent.start(query)
while not graph.finished:
    graph = agent.step(graph)
    print(graph.tree())

print(graph.result())
graph.save(workdir / "graph")
open_viewer(workdir / "graph")
```

`Flow` is configured directly: `max_depth`, `max_iters`, `child_max_iters`,
`max_concurrency`, `llm_max_concurrency`, `max_budget`, `max_messages`, and
`eager_children` are constructor kwargs. Normal agent LLM turns and
`llm_query_batched(...)` share the same `LLMChannel`, so concurrency and token
usage accounting are centralized.
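
To illustrate why routing every call through one shared object centralizes both limits and accounting, here is a minimal thread-based sketch. `ToyChannel` is hypothetical and is not the library's `LLMChannel`; the "LLM" is a string transform and a "token" is just a whitespace-separated word.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyChannel:
    """Hypothetical shared gate: caps in-flight calls and counts tokens."""

    def __init__(self, max_concurrency: int):
        self._slots = threading.BoundedSemaphore(max_concurrency)
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak_in_flight = 0
        self.total_tokens = 0

    def call(self, prompt: str) -> str:
        with self._slots:  # blocks while max_concurrency calls are running
            with self._lock:
                self.in_flight += 1
                self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
            reply = prompt.upper()  # placeholder for a real LLM request
            with self._lock:
                self.in_flight -= 1
                # Toy accounting: one "token" per whitespace-separated word.
                self.total_tokens += len(prompt.split()) + len(reply.split())
            return reply


channel = ToyChannel(max_concurrency=2)
prompts = [f"question number {i}" for i in range(8)]
with ThreadPoolExecutor(max_workers=8) as pool:
    replies = list(pool.map(channel.call, prompts))

print(channel.peak_in_flight, channel.total_tokens)
```

Every caller, whatever its origin, passes through the same semaphore and the same counter, so the cap and the totals hold globally rather than per caller.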

After a parent reaches its delegation wait (`await launch_subagents([...])`),
its children can be scheduled work-conservingly, so each child keeps making
progress without waiting on slower siblings. To turn this on, enable
`eager_children`:

```python
agent = rflow.Flow(
    rflow.OpenAIClient(model="gpt-5"),
    runtime=runtime,
    max_depth=2,
    child_max_iters=20,
    max_concurrency=8,
    llm_max_concurrency=4,
    eager_children=True,
)
```

With `eager_children=False`, a fast child that finishes `task_1` waits for the
rest of that parallel step before it can start `task_2`. With
`eager_children=True`, the fast child's `task_2` can start while a slow sibling
is still running `task_1`. See
[`examples/control/delegation/eager_children.py`](./examples/control/delegation/eager_children.py)
for a deterministic timestamped demo.
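
The difference can be seen in a toy timeline. The sketch below is an assumption-laden model, not the engine's scheduler: it ignores `max_concurrency` and LLM limits, and simply computes when each child's tasks would start under lockstep versus eager scheduling.

```python
def lockstep_starts(durations: dict[str, list[float]]) -> dict[str, list[float]]:
    """Every child starts task i only after all children finished task i-1."""
    starts = {child: [] for child in durations}
    clock = 0.0
    for i in range(max(len(tasks) for tasks in durations.values())):
        step_end = clock
        for child, tasks in durations.items():
            if i < len(tasks):
                starts[child].append(clock)
                step_end = max(step_end, clock + tasks[i])
        clock = step_end  # the slowest child sets the pace for the step
    return starts


def eager_starts(durations: dict[str, list[float]]) -> dict[str, list[float]]:
    """Each child starts its next task as soon as its own previous one ends."""
    starts = {}
    for child, tasks in durations.items():
        clock, child_starts = 0.0, []
        for duration in tasks:
            child_starts.append(clock)
            clock += duration
        starts[child] = child_starts
    return starts


durations = {"fast": [1.0, 1.0], "slow": [3.0, 1.0]}
print(lockstep_starts(durations))  # {'fast': [0.0, 3.0], 'slow': [0.0, 3.0]}
print(eager_starts(durations))     # {'fast': [0.0, 1.0], 'slow': [0.0, 3.0]}
```

In the lockstep model the fast child idles from t=1 to t=3; in the eager model it starts `task_2` at t=1 while the slow sibling is still on `task_1`.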

A saved run is a directory containing `graph.json` plus `agents/` logs. Reopen it
later with `Graph.load(path)` or `open_viewer(path)`.

## Drop-in `LLMClient`

`Flow` implements `LLMClient`, so it is a drop-in replacement for any raw LLM.

```python
import rflow


def ask(llm: rflow.LLMClient, q: str) -> str:
    return llm.chat([{"role": "user", "content": q}])

ask(rflow.OpenAIClient(model="gpt-4o-mini"), "2+2?")  # one LLM call
ask(rflow.Flow(rflow.OpenAIClient(model="gpt-4o-mini")), "2+2?")  # full agent loop
```

Nest agents by passing one `Flow` as another's `llm`. See
[`examples/drop_in_llm.py`](examples/drop_in_llm.py).
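
The drop-in idea is ordinary structural typing: anything exposing a compatible `chat(messages) -> str` method can stand in for a client. The sketch below uses `typing.Protocol` with toy classes; `ChatClient`, `EchoClient`, and `ToyAgent` are hypothetical names for illustration and do not reflect how `Flow` is implemented.

```python
from typing import Protocol


class ChatClient(Protocol):
    """Structural type: anything with a compatible chat() method."""

    def chat(self, messages: list[dict[str, str]]) -> str: ...


class EchoClient:
    """Toy 'raw LLM' that answers in a single call."""

    def chat(self, messages: list[dict[str, str]]) -> str:
        return f"echo({messages[-1]['content']})"


class ToyAgent:
    """Toy agent loop that wraps a client and exposes the same method."""

    def __init__(self, llm: ChatClient, max_iters: int = 2):
        self.llm = llm
        self.max_iters = max_iters

    def chat(self, messages: list[dict[str, str]]) -> str:
        content = messages[-1]["content"]
        for _ in range(self.max_iters):
            # Feed each answer back in as the next turn's input.
            content = self.llm.chat([{"role": "user", "content": content}])
        return content


def ask(llm: ChatClient, q: str) -> str:
    return llm.chat([{"role": "user", "content": q}])


print(ask(EchoClient(), "2+2?"))            # echo(2+2?)
print(ask(ToyAgent(EchoClient()), "2+2?"))  # echo(echo(2+2?))
nested = ToyAgent(ToyAgent(EchoClient()), max_iters=1)
print(ask(nested, "hi"))                    # echo(echo(hi))
```

Since `ToyAgent` satisfies the same protocol it consumes, nesting needs no extra machinery: the outer agent cannot tell whether its `llm` is a raw client or another agent.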

## Prompt sections and skills

The default system prompt is built from named sections. Sections can be static
text or callables with the signature `section(flow, graph) -> str`. That makes
project memory or skills ordinary files plus a prompt section that decides when
to include them:

```python
from pathlib import Path

import rflow
from rflow.prompts import DEFAULT_BUILDER

skill_path = Path("skills/numpy-linear-algebra/SKILL.md")

def skills_section(flow, graph):
    if not skill_path.exists():
        return ""
    return skill_path.read_text()

flow = rflow.Flow(rflow.OpenAIClient(model="gpt-4o-mini"))
flow.prompt_builder = DEFAULT_BUILDER.section(
    "skills",
    skills_section,
    title="Skills",
    before="tools",
)
```

See [`examples/skills.py`](examples/skills.py) for a runnable version.
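
To make the section mechanics concrete, here is a small sketch of a builder with named sections that may be static strings or callables. `ToyBuilder` is hypothetical: its `section(...)` signature mirrors the call shown above, but the immutability, the default title, and the rule that empty sections are skipped are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Union

Section = Union[str, Callable[[object, object], str]]


@dataclass(frozen=True)
class ToyBuilder:
    """Hypothetical immutable builder: an ordered tuple of named sections."""

    sections: tuple[tuple[str, str, Section], ...] = ()

    def section(self, name, content, title=None, before=None) -> "ToyBuilder":
        entry = (name, title or name.title(), content)
        # Replace any existing section with the same name.
        items = [s for s in self.sections if s[0] != name]
        names = [s[0] for s in items]
        index = names.index(before) if before in names else len(items)
        items.insert(index, entry)
        return ToyBuilder(tuple(items))

    def build(self, flow=None, graph=None) -> str:
        parts = []
        for _name, title, content in self.sections:
            text = content(flow, graph) if callable(content) else content
            if text:  # empty sections are skipped entirely
                parts.append(f"## {title}\n{text}")
        return "\n\n".join(parts)


base = ToyBuilder().section("role", "You are an agent.").section("tools", "read_file")
with_skill = base.section(
    "skills", lambda flow, graph: "Use numpy.", title="Skills", before="tools"
)
without_skill = base.section("skills", lambda flow, graph: "", before="tools")

print(with_skill.build())
```

A callable section that returns an empty string leaves the prompt unchanged, which is how a section can decide at build time whether a skill file is relevant.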

## Step and inspect

`step(graph) -> graph'` is one atomic graph transition. Every step
returns a fresh `Graph` snapshot, so the live tree is just `graph.tree()`:

```python
graph = agent.start(query)
while not graph.finished:
    graph = agent.step(graph)
print(graph.tree())
```

```text
root [supervising] {default}
├── root.scanner_auth [result] {fast} -> Found SQL injection in login.py
├── root.scanner_api  [supervising] {default}
│   ├── root.scanner_api.chunk_0 [result] {fast} -> Clean
│   └── root.scanner_api.chunk_1 [result] {fast} -> Payment flow is safe
└── root.scanner_db   [result] {fast} -> No issues found
```

Every transition follows the same obs → action → obs shape:

```text
LLMOutput  -> ExecAction -> ExecOutput          (REPL output, normal continuation)
                         -> DoneOutput          (code called done())
                         -> ErrorOutput         (code raised / no code block)
                         -> SupervisingOutput   (awaited a launcher — waiting on children)
SupervisingOutput -> ResumeAction -> ExecOutput / Done / Error / Supervising
                                                (children settled — supervisor unpaused)
ExecOutput -> LLMAction -> LLMOutput            (back to the LLM for the next turn)
```
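
The transition shapes above can be written down as a lookup table and used to check a trajectory. This is a sketch over type names only, not the library's validator. The `UserQuery` row is taken from the earlier execution trace; what follows an `ErrorOutput` is not stated in this README, so that row is deliberately omitted.

```python
# Observation -> (action the engine takes, observations that action can yield).
OUTCOMES = {"ExecOutput", "DoneOutput", "ErrorOutput", "SupervisingOutput"}
TRANSITIONS = {
    "UserQuery": ("LLMAction", {"LLMOutput"}),
    "ExecOutput": ("LLMAction", {"LLMOutput"}),
    "LLMOutput": ("ExecAction", OUTCOMES),
    "SupervisingOutput": ("ResumeAction", OUTCOMES),
}


def is_valid(trajectory: list[str]) -> bool:
    """Check that a trajectory is a chain of obs -> action -> obs triples."""
    if len(trajectory) % 2 == 0:  # must start and end on an observation
        return False
    for i in range(0, len(trajectory) - 2, 2):
        obs, action, next_obs = trajectory[i : i + 3]
        expected = TRANSITIONS.get(obs)
        if expected is None or action != expected[0] or next_obs not in expected[1]:
            return False
    return True


# The root agent's trajectory from the delegation trace earlier in this README.
root_trajectory = [
    "UserQuery", "LLMAction", "LLMOutput", "ExecAction", "SupervisingOutput",
    "ResumeAction", "ExecOutput", "LLMAction", "LLMOutput", "ExecAction",
    "DoneOutput",
]
print(is_valid(root_trajectory))                           # True
print(is_valid(["UserQuery", "ExecAction", "DoneOutput"])) # False
```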

Action nodes carry the work the engine did; observation nodes carry
what was returned. Every action is followed by exactly one
observation. The graph is queryable in plain Python:

```python
graph.tree()                                  # ASCII render
graph["root.scanner_api"]                     # sub-Graph rooted at that agent
graph.agents["root.scanner_api"].nodes        # node trajectory for one agent
graph.children                                # dict[str, Graph] for child agents
graph.all_nodes.find("n_abc...")              # bare Node lookup by id
graph.all_nodes.errors()                      # every ErrorOutput across agents
graph.all_nodes.results()                     # every DoneOutput across agents
graph.all_nodes.supervising()                 # every SupervisingOutput across agents
graph.all_nodes.where(type="llm_output", agent_id="root")  # kwargs match Node attrs
graph.all_nodes.where(lambda n: n.type == "exec_output")    # or pass a predicate
graph.to_dict()                               # full JSON-serializable payload
```
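
The `where(...)` call accepts either keyword filters or a predicate. One plausible way to support both is sketched below with toy types; `ToyNode` and `ToyNodeList` are hypothetical and only illustrate the filtering pattern, not the library's node collection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyNode:
    id: str
    type: str
    agent_id: str


class ToyNodeList(list):
    """Hypothetical list subclass with chainable filters."""

    def where(self, predicate=None, **attrs) -> "ToyNodeList":
        def keep(node) -> bool:
            if predicate is not None and not predicate(node):
                return False
            # Keyword filters must all match the node's attributes.
            return all(getattr(node, key) == value for key, value in attrs.items())

        return ToyNodeList(node for node in self if keep(node))

    def find(self, node_id: str):
        return next((node for node in self if node.id == node_id), None)


nodes = ToyNodeList([
    ToyNode("n_1", "llm_output", "root"),
    ToyNode("n_2", "exec_output", "root"),
    ToyNode("n_3", "llm_output", "root.search"),
])

print(nodes.where(type="llm_output", agent_id="root"))
print(nodes.where(lambda n: n.type == "exec_output"))
print(nodes.where(type="llm_output").where(lambda n: n.agent_id != "root"))
```

Returning the same list type from `where` is what lets filters chain.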

## Inject controller events

Because `Graph` is the control surface, external controllers can append typed
events and commit them through the normal step loop. This is useful for human
feedback, budget nudges, and forced finalization without losing traceability:

```python
import rflow

graph = graph.inject(
    target="root.scanner_api",
    node=rflow.ExecOutput(
        output="Injected controller observation: answer with current evidence.",
        content="Injected controller observation: answer with current evidence.",
    ),
)
graph = agent.step(graph)  # persists the observation, then continues

graph = graph.inject(
    target="root.scanner_api",
    node=rflow.ExecAction(code='done("best available answer")'),
)
graph = agent.step(graph)  # executes the action and writes DoneOutput
```

Injected nodes become ordinary graph nodes with the same shape as organic
nodes. See
[`docs/injections.md`](docs/injections.md) and
[`examples/control/controller_injection.py`](examples/control/controller_injection.py).

## Save, Load, Rewind, Branch

`Graph` is the durable run object. Save a run directory with `graph.save(...)`,
reopen it with `Graph.load(...)`, and keep step snapshots when you want rewind or
live checkpointing:

```python
history = [agent.start(query)]
while not history[-1].finished:
    history.append(agent.step(history[-1]))
    history[-1].save("runs/deep_research")  # overwrites the latest checkpoint

latest = rflow.Graph.load("runs/deep_research")
```

Branch by copying or loading a saved graph and continuing it with a `Flow`:

```python
branch = latest.copy(deep=True)
while not branch.finished:
    branch = agent.step(branch)
branch.save("runs/deep_research_repair")
```
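
Rewind and branching fall out of the snapshot-per-step design. The sketch below models that with plain dicts and `copy.deepcopy`; the `step` function and the state layout are invented for illustration and say nothing about how `Graph` stores its data.

```python
import copy


def step(state: dict) -> dict:
    """Toy pure transition: return a new snapshot, never mutate the input."""
    new = copy.deepcopy(state)
    new["nodes"].append(f"node_{len(new['nodes'])}")
    new["finished"] = len(new["nodes"]) >= 4
    return new


history = [{"nodes": ["node_0"], "finished": False}]
while not history[-1]["finished"]:
    history.append(step(history[-1]))

# Rewind to the second snapshot and branch; the original history is untouched.
branch = copy.deepcopy(history[1])
branch["nodes"].append("injected")
while not branch["finished"]:
    branch = step(branch)

print(history[-1]["nodes"])  # ['node_0', 'node_1', 'node_2', 'node_3']
print(branch["nodes"])       # ['node_0', 'node_1', 'injected', 'node_3']
```

Keeping every snapshot makes rewind a list index, and copying before editing keeps branches independent of the run they came from.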

Controller edits use the same graph surface (`replace_node`, `truncate_after`,
`inject`, `retrace_steps`) and then continue through `agent.step(graph)`. See
[`examples/showcase.py`](examples/showcase.py), [`docs/control.md`](docs/control.md),
and [`docs/injections.md`](docs/injections.md).

## Rich visualization

See the [notebook](./examples/notebooks/viz_walkthrough.ipynb) for a full showcase of the visualization utilities.

Because the run is a typed graph, every visualization is just a render of
that graph. You can view a saved run directory, a single `Graph`, or a list of
step snapshots.


### Gradio viewer

![](docs/static/gradio_ui.png)

`open_viewer(source)` launches a small browser app for inspecting a saved run
directory, a graph snapshot, or an in-memory trace:

```python
from rflow.utils.viewer import open_viewer

open_viewer("runs/deep_research")
```

From the CLI: `recursive-flow view runs/deep_research --port 7861`.

### Live terminal tree

`rflow.utils.viz.live(agent, graph)` drives the step loop and renders a
Rich tree as nodes are produced. The boids run (`Create a simple boids
simulation in plain HTML and JavaScript, split each component into
separate files`) settles to:

```text
root [result] {default:gpt-5} -> Boids simulation written to output/boids-simulation with modular JS (boid, simulation, renderer) and index.html entrypoint.
  root.index_html    [result] {fast:gpt-5-mini} -> ok
  root.styles_css    [result] {fast:gpt-5-mini} -> ok
  root.boid_js       [result] {fast:gpt-5-mini} -> ok
  root.simulation_js [result] {fast:gpt-5-mini} -> ok
  root.renderer_js   [result] {fast:gpt-5-mini} -> ok
  root.main_js       [result] {fast:gpt-5-mini} -> ok
```

The same render is available offline as `graph.tree()` on any snapshot.
Filename-flavored agent ids (`index.html` → `index_html`) are sanitized
because `.` is the parent/child delimiter in the agent tree.
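
A minimal sketch of dotted agent ids follows. The only rule the README states is that `index.html` becomes `index_html`; the exact character set replaced here, and the helper names `sanitize_name`, `child_id`, and `parent_of`, are assumptions for illustration.

```python
import re


def sanitize_name(name: str) -> str:
    # Assumed rule: anything other than letters, digits, or "_" becomes "_".
    return re.sub(r"[^0-9A-Za-z_]", "_", name)


def child_id(parent_id: str, name: str) -> str:
    # "." is reserved as the parent/child delimiter, so it never appears
    # inside a single path segment.
    return f"{parent_id}.{sanitize_name(name)}"


def parent_of(agent_id: str):
    head, sep, _tail = agent_id.rpartition(".")
    return head if sep else None  # the root agent has no parent


print(child_id("root", "index.html"))         # root.index_html
print(parent_of("root.scanner_api.chunk_0"))  # root.scanner_api
print(parent_of("root"))                      # None
```

Without the sanitizing step, `root.index.html` would parse as an agent `html` under a parent `root.index`.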

### Static renders

`recursive-flow render <path> -f F` writes a static visualization in any of:

```text
mermaid             # stateDiagram-v2 (default topology)
mermaid-flowchart   # flowchart TD, better for wide trees
mermaid-sequence    # sequenceDiagram of delegate / wait / resume
dot · d2            # Graphviz / D2 topology
tree · ascii-boxes  # text trees
gantt-html          # standalone HTML swimlane
report-md           # full Markdown summary (tree + cost + result + errors)
code-log            # every code block paired with its observation
error-summary       # ErrorOutput counts grouped by kind
tokens              # one-line ASCII sparkline of cumulative tokens
html                # self-contained interactive stepper, one slide per snapshot
image               # single PNG/SVG/PDF of the topology snapshot
steps               # one image per snapshot, written as step_NN.{png,svg,pdf}
```

```bash
recursive-flow render ./myproject -f mermaid-flowchart
recursive-flow render ./myproject -f gantt-html -o run.html
recursive-flow render ./myproject -f report-md  -o run.md
recursive-flow render ./myproject -f tokens
```

GitHub renders mermaid inline, so the output drops straight into a doc.
The example below is the `to_mermaid_flowchart(graph)` projection of the
boids run; it renders reliably across the GitHub-supported mermaid
versions:

```mermaid
flowchart TD
    root["root<br/><i>result</i><br/>Boids simulation written to output/boids-simulation..."]:::result
    root --> html["root.index_html<br/><i>result</i><br/>ok"]:::result
    root --> css["root.styles_css<br/><i>result</i><br/>ok"]:::result
    root --> boid["root.boid_js<br/><i>result</i><br/>ok"]:::result
    root --> sim["root.simulation_js<br/><i>result</i><br/>ok"]:::result
    root --> rend["root.renderer_js<br/><i>result</i><br/>ok"]:::result
    root --> main["root.main_js<br/><i>result</i><br/>ok"]:::result
    classDef result fill:#3fb95022,stroke:#3fb950,color:#c9d1d9;
```
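
For intuition about what such a projection involves, here is a small sketch that turns a parent-to-children mapping into Mermaid `flowchart TD` text. It is not `to_mermaid_flowchart`: the input shape, the node id scheme, and the lack of styling are simplifications chosen for this example.

```python
def to_flowchart(tree: dict[str, list[str]], root: str = "root") -> str:
    """Render a parent -> children mapping as Mermaid `flowchart TD` text."""

    def node_id(agent_id: str) -> str:
        # Use dot-free ids and keep the dotted agent id as the visible label.
        return agent_id.replace(".", "_")

    lines = ["flowchart TD", f'    {node_id(root)}["{root}"]']
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in tree.get(parent, []):
            lines.append(f'    {node_id(parent)} --> {node_id(child)}["{child}"]')
            stack.append(child)
    return "\n".join(lines)


tree = {"root": ["root.index_html", "root.boid_js"]}
print(to_flowchart(tree))
```

The output can be pasted into a fenced `mermaid` block like the one above.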

### Programmatic helpers

Everything the CLI does is one function call away:

```python
from rflow.utils.export import to_mermaid, to_mermaid_flowchart, to_mermaid_sequence, to_dot, to_d2
from rflow.utils.viz import (
    ascii_boxes, code_log, error_summary, message_stream, diff_system_prompts,
    gantt, gantt_html, token_sparkline, budget_burndown, bench_table,
    report_md, live, tee, slack_webhook, discord_webhook,
)
from rflow.utils.tracing import json_logs

print(token_sparkline(graphs))          # ▁▂▅█▂   15820 tok over 7 steps
print(error_summary(graph))             # ErrorOutput counts grouped by kind
print(message_stream("root.boid_js", graph))     # rendered transcript for one agent
print(report_md(graphs, title="run"))   # full Markdown report
gantt_html(graphs, "run.html")          # standalone HTML swimlane
json_logs(graph, "run.jsonl")           # one node per line
```
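
A sparkline like the one printed above can be produced by bucketing values into eight block characters. The function below is a generic sketch under that assumption; it is not `token_sparkline`, and it omits the trailing totals.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values: list[float]) -> str:
    """Map each value to one of eight block heights, scaled to the data range."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BARS[0] * len(values)  # flat series: draw the lowest bar
    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) / span * top)] for v in values)


print(sparkline([1, 2, 3, 4, 5, 6, 7, 8]))  # ▁▂▃▄▅▆▇█
print(sparkline([0, 700]))                  # ▁█
```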

### Image, GIF, and HTML exports

For blog posts, PR comments, papers, and CI artifacts, render the
graph straight to a PNG/SVG/PDF, an animated GIF, or a single
self-contained HTML stepper. Four public functions live in
`rflow.utils`; three of them have matching CLI verbs:

| Function                                | CLI verb        | Output                                | Use case                                   |
|-----------------------------------------|-----------------|---------------------------------------|--------------------------------------------|
| `save_image(graph, path)`               | `-f image`      | one PNG/SVG/PDF                       | hero image of a finished run               |
| `save_steps(graphs, dir/)`              | `-f steps`      | `step_NN.png` per snapshot            | blog slideshow, paper figure series        |
| `save_gif(graphs, path)`                | _(no verb yet)_ | animated GIF                          | quick preview / social posts               |
| `save_html(graphs, path)`               | `-f html`       | self-contained stepper (Plotly + CSS) | shareable URL-less artifact, PR comment    |

Quick start:

```python
import rflow
from rflow.utils import save_image, save_steps, save_html, save_gif

graph = rflow.Graph.load("runs/deep_research")

save_image(graph, "run_final.png")
save_html(graph, "viewer.html", title="run")

# If you kept an in-memory history list, playback exports still work:
save_steps(graphs, "frames/")                    # one PNG per step
save_gif(graphs, "trace.gif", duration=400)      # animated GIF (~2.5 fps)
```

Or use the graph shorthand (same defaults):

```python
graph.save_image("run_final.png")
graph.save_html("viewer.html")
```

#### Why the scaling knobs exist

The Plotly viewer, static image export, GIF export, and HTML stepper now
share the same default element scale (`element_mult=1.0`), so a saved
PNG looks much closer to the Gradio/Jupyter view. Dense graphs still
adaptively cap marker and label sizes to avoid turning large runs into
solid blobs.

Use these knobs only when a target medium needs a different balance:

| Knob               | Default | Effect                                                                                  |
|--------------------|---------|------------------------------------------------------------------------------------------|
| `element_mult`     | `1.0`   | Uniform multiplier on markers and fonts. The simplest "make it bigger" knob.            |
| `marker_mult`      | _(inherits)_ | Override just marker size and outline width. Useful when dots need more visual weight. |
| `text_mult`        | _(inherits)_ | Override just label font size. Smaller text means fewer label collisions.              |
| `normalize_labels` | `True`  | Force every label to `bottom center` so adjacent depths can't share a vertical band.     |

Pass `marker_mult` and/or `text_mult` to break the symmetry when labels
are colliding or nodes are too subtle for a specific export.
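
One way to read the "inherits" default is that an unset override falls back to `element_mult`. The helper below sketches that reading; `resolve_scales` is a hypothetical name, and the fallback rule is an assumption drawn from the table rather than from the library's source.

```python
def resolve_scales(element_mult=1.0, marker_mult=None, text_mult=None):
    """Return (marker, text) multipliers; unset overrides inherit element_mult."""
    marker = element_mult if marker_mult is None else marker_mult
    text = element_mult if text_mult is None else text_mult
    return marker, text


print(resolve_scales())                                  # (1.0, 1.0)
print(resolve_scales(element_mult=2.0))                  # (2.0, 2.0)
print(resolve_scales(marker_mult=3.5, text_mult=2.2))    # (3.5, 2.2)
print(resolve_scales(element_mult=2.0, text_mult=1.0))   # (2.0, 1.0)
```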

#### Recipes

**Hero PNG of a finished run** — defaults are tuned for this:

```python
graph.save_image("hero.png")
# == save_image(graph, "hero.png", width=1800, height=1350,
#               scale=2.0, element_mult=1.0, normalize_labels=True)
```

**Blog slideshow with dense subtrees** — fat markers, small labels,
square-ish canvas (the recipe behind `docs/blog.md`):

```python
save_steps(
    graphs,
    "blog/frames/",
    width=1600, height=1200, scale=2.0,
    marker_mult=3.5,        # fat node dots + edges
    text_mult=2.2,          # labels grow less than markers, so they collide less
    normalize_labels=True,  # already the default — explicit for the reader
)
```

**Standalone interactive stepper** — drop into a PR comment or
GitHub gist:

```python
save_html(workspace, "viewer.html", title="needle haystack run")
```

The HTML output embeds Plotly from CDN, includes per-slide
transcripts, and ships keyboard navigation (← / →) plus dot-style
slide indicators. Open it in any browser, attach it to an email,
upload it as a CI artifact — it works offline once the CDN script
is cached.

**Animated GIF** — needs `pip install recursive-flow[image] pillow`:

```python
save_gif(
    graphs,
    "trace.gif",
    duration=600,          # ms per frame; lower = faster
    loop=0,                # 0 = forever; 1 = play once
    width=1200, height=900,
)
```

#### From the CLI

Every knob above maps 1:1 to a CLI flag:

```bash
# blog slideshow recipe (matches the dense-tree recipe above)
recursive-flow render ./myproject \
  -f steps -o blog/frames/ \
  --width 1600 --height 1200 --scale 2.0 \
  --marker-mult 3.5 --text-mult 2.2

# self-contained interactive stepper
recursive-flow render ./myproject \
  -f html  -o stepper.html --title "boids walkthrough"

# single hero PNG with default scaling
recursive-flow render ./myproject \
  -f image -o hero.png

# opt out of label normalization (matches Gradio viewer defaults)
recursive-flow render ./myproject \
  -f html  -o stepper.html --no-normalize-labels
```

The CLI uses `element_mult=1.0` by default for `html`, `image`, and `steps`,
so static exports stay visually consistent with the interactive
viewer. Node sizes are uniform; token counts stay in hover/details, not
marker size. Override with `--element-mult`, `--marker-mult`, or
`--text-mult` for a specific medium.

#### Dependencies

- `save_image` / `save_steps` need `kaleido`. Install with
  `pip install recursive-flow[image]` or just `pip install kaleido`.
- `save_gif` additionally needs `Pillow`
  (`pip install recursive-flow[image] pillow`).
- `save_html` and `render_html` have **no static-image dependency** —
  they emit a single HTML file that embeds Plotly from CDN.

## DSPy Adapter

`RecursiveFlowLM` lets DSPy use a `Flow` agent anywhere it expects a language
model:

```python
import dspy
import rflow
from rflow.integrations.dspy import RecursiveFlowLM

flow = rflow.Flow(
    rflow.OpenAIClient(model="gpt-4o-mini"),
    max_depth=1,
    max_iters=5,
)

dspy.configure(lm=RecursiveFlowLM(flow, model="recursive-flow/gpt-4o-mini"))
qa = dspy.ChainOfThought("question -> answer")
print(qa(question="What is 17 * 23?").answer)
```

Install it with `pip install recursive-flow[openai,dspy]`. See
[`examples/providers/dspy_drop_in.py`](examples/providers/dspy_drop_in.py) for the runnable
version.

## Examples

Run the offline smoke suite with `python examples/run_examples.py`.
Add `--include-optional`, `--include-live`, `--include-sandbox`, or
`--include-manual` as needed. Most live examples share flags like `--no-viz`,
`--docker-image recursive-flow:local`, `--max-depth`, and `--max-iters`; see
[`examples/README.md`](examples/README.md).

| Example | What it shows |
|---|---|
| [`showcase.py`](examples/showcase.py) | Functional stepping, snapshots, save/load, and live terminal visualization. |
| [`structured_output.py`](examples/structured_output.py) | Root and child results validated with JSON Schema / Pydantic. |
| [`drop_in_llm.py`](examples/drop_in_llm.py) | `Flow` as an `LLMClient`, including nested flows. |
| [`skills.py`](examples/skills.py) | On-disk skill files loaded through a dynamic prompt section. |
| [`dspy_drop_in.py`](examples/providers/dspy_drop_in.py) | Use a `Flow` agent as the LM behind a DSPy program. |
| [`mcp_weather.py`](examples/providers/mcp_weather.py) | Start a local MCP weather server, delegate city forecasts, and combine advice. |
| [`tinker_agent.py`](examples/providers/tinker_agent.py) | Run the live terminal graph view with `TinkerClient` inference. |
| [`sandboxes/`](examples/sandboxes/) | Build a small web app while Python code runs inside Modal, E2B, or Daytona. |
| [`coding/agent.py`](examples/coding/agent.py) | Interactive coding agent that writes and edits files in a working directory. |
| [`needle/haystack.py`](examples/needle/haystack.py) | Needle-in-a-haystack over a massive in-memory `INPUTS["haystack"]`. |
| [`needle/filesystem.py`](examples/needle/filesystem.py) | Needle-in-a-haystack across many files with `FILE_TOOLS` and runtime working directories. |
| [`summarizer.py`](examples/summarizer.py) | Recursive map-reduce summarization over a long document. |
| [`eager_children.py`](examples/control/delegation/eager_children.py) | `eager_children=True` vs `False` — how child scheduling overlaps. |
| [`control/injection/`](examples/control/injection/) | Generate a baseline run, edit copies with graph injection/replacement, and continue variants. |
| [`fork_repair.py`](examples/control/branching/fork_repair.py) | Fork graph/workdir snapshots into independent repair branches and compare results. |
| [`best_of_n.py`](examples/control/branching/best_of_n.py) | Run N independent branches and pick the best result. |
| [`autoresearch/`](examples/autoresearch/) | Karpathy-style hill-climbing research loop with custom `@tool`s and delegation. |
| [`graph/`](examples/graph/) | Offline tour of the `Graph` API: query, navigate, mutate, save/load, timeline retrace, fork, render. |
| [`run_examples.py`](examples/run_examples.py) | Manifest-driven smoke runner for offline, optional, live, sandbox, and manual examples. |
| [`view_demo.py`](examples/view_demo.py) | Build synthetic `Graph` snapshots and launch the Gradio viewer. |
| [`notebooks/coding_agent.ipynb`](examples/notebooks/coding_agent.ipynb) | Build the agent, run the boids task end-to-end, and inspect the saved run/viewer. |
| [`notebooks/viz_walkthrough.ipynb`](examples/notebooks/viz_walkthrough.ipynb) | Visualization helpers against a saved fixture. |
| [`notebooks/node_basics.ipynb`](examples/notebooks/node_basics.ipynb) | `Graph` query API tour. |

## Benchmarks

The shared eval harness lives under [`benchmarks/eval/`](benchmarks/eval/).
It uses a task/runner registry, writes `results.jsonl` and `summary.json`, records
rflow graph-shape metrics, shows tqdm progress bars, and can log per-row metrics
to W&B. Real runs can compare the `vanilla` and `rflow` runners against the
upstream official RLM runner, ported from
[`avilum/minrlm/eval`](https://github.com/avilum/minrlm/tree/master/eval).
The harness also writes reports organized by model under
`eval-runs/<model>/<benchmark>/`, including one JSON file per question with the
prompt, inputs, expected answer, and each runner's solution.

For example, a run against the `fake` provider:

```bash
python -m benchmarks.eval \
  --provider fake \
  --model fake \
  --tasks sniah \
  --runners fake vanilla rflow \
  --seeds 0:3
```
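
Once a run has written `results.jsonl`, per-runner numbers can be recomputed outside the harness. The sketch below is a minimal example, not part of the harness: the `runner` and `correct` field names are assumptions made for illustration, so check the actual row schema in `benchmarks/eval/README.md` before relying on them.

```python
import json
from collections import defaultdict
from pathlib import Path


def summarize_results(path):
    """Return {runner: (n_rows, accuracy)} from a JSON Lines results file.

    The `runner` and `correct` keys are hypothetical; adjust them to the
    fields the harness actually writes.
    """
    totals = defaultdict(lambda: [0, 0])  # runner -> [rows, correct rows]
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        row = json.loads(line)
        bucket = totals[row.get("runner", "unknown")]
        bucket[0] += 1
        bucket[1] += int(bool(row.get("correct", False)))
    return {name: (n, hits / n) for name, (n, hits) in totals.items()}
```

Calling `summarize_results` with the path to a run's `results.jsonl` returns one `(row count, accuracy)` pair per runner, which is enough for a quick sanity check against `summary.json`.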

To run the full RLM-Bench-style table sweep with W&B logging:

```bash
make eval-benchmark EVAL_MODEL=gpt-5-mini
```

See [`benchmarks/eval/README.md`](benchmarks/eval/README.md) for task/runner
extension points and W&B usage.

## CLI

```bash
recursive-flow view ./myproject
recursive-flow render ./myproject -f mermaid
recursive-flow render ./myproject -f gantt-html -o run1.html
recursive-flow render ./myproject -f html       -o stepper.html
recursive-flow render ./myproject -f steps      -o frames/  --marker-mult 3.5 --text-mult 2.2
recursive-flow render ./myproject -f image      -o graph.png
recursive-flow version
```

`view` and `render` accept a workspace directory.
`render -f` accepts: `mermaid`, `mermaid-flowchart`, `mermaid-sequence`,
`dot`, `d2`, `tree`, `ascii-boxes`, `gantt-html`, `report-md`, `code-log`,
`error-summary`, `tokens`, `html`, `image`, and `steps`. See the
[Static renders](#static-renders) table and [Image, GIF, and HTML
exports](#image-gif-and-html-exports) for what each format produces and for the
scaling and label-normalization flags (`--marker-mult`, `--text-mult`,
`--normalize-labels` / `--no-normalize-labels`).
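
To produce several renders of one workspace in a single pass, the CLI can be driven from a short script. This is a sketch that only assumes the `recursive-flow` executable is on `PATH` and uses the `-f` and `-o` flags shown above. The helper names `render_command` and `render_all` are made up for this example and are not part of the package.

```python
import subprocess

# Formats listed for `render -f` in this README.
RENDER_FORMATS = {
    "mermaid", "mermaid-flowchart", "mermaid-sequence", "dot", "d2", "tree",
    "ascii-boxes", "gantt-html", "report-md", "code-log", "error-summary",
    "tokens", "html", "image", "steps",
}


def render_command(workspace, fmt, output=None):
    """Build the argv for one `recursive-flow render` call."""
    if fmt not in RENDER_FORMATS:
        raise ValueError(f"unknown render format: {fmt!r}")
    argv = ["recursive-flow", "render", str(workspace), "-f", fmt]
    if output is not None:
        argv += ["-o", str(output)]
    return argv


def render_all(workspace, targets):
    """Run one render per (format, output) pair; stop on the first failure."""
    for fmt, output in targets:
        subprocess.run(render_command(workspace, fmt, output), check=True)
```

For example, `render_all("./myproject", [("gantt-html", "run1.html"), ("image", "graph.png")])` issues the same two commands as the corresponding lines in the listing above. Validating the format name up front catches typos before any subprocess is started.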

## Roadmap

- [x] OOLONG long-context aggregation harness (`standard` / `rlm` / `rlm_tips`)
- [x] `LocalRuntime` + `DockerRuntime` — battle-tested
- [~] `ModalRuntime` / `E2BRuntime` / `DaytonaRuntime` — full support: native SDK file transfer, real-sandbox CI, depth>1 delegation, heavier example
- [~] OOLONG, LongBench-v2, CodeQA, SWE-bench, and other benchmarks under [`benchmarks/eval/`](benchmarks/eval/)
- [ ] **REPL security (local)**
- [ ] [RAO library module](docs/research/rao_implementation_plan.md): `rflow.rao` rollout collection, per-node rewards, leave-one-out advantages, depth weighting, trainer export
- [ ] [DeLM-style coordination](docs/research/delm_vs_rlmflow.md): shared task queue, verified shared context, multi-worker coordinator over `Flow` graphs

## Docs

The top-level docs are short, user-facing guides. The deep dive lives
in [`docs/internals.md`](docs/internals.md). Research notes live under
[`docs/research/`](docs/research/).

- [**Internals**](docs/internals.md): deep reference covering engine
  architecture, step lifecycle, REPL `await` protocol, runtime backends,
  graph persistence, and extension seams. This document is being refreshed
  after the `Flow`/`Graph` rewrite.
- [Blog post](docs/blog.md): long-form pitch covering why recursive language
  models, why graphs over flat traces, and a full needle-in-a-haystack
  walkthrough with the same exports the CLI ships.
- [Positioning](docs/positioning.md): when to use recursive-flow vs
  rlm-minimal, ypi, LangGraph, CrewAI, AutoGen, SWE-agent, Aider.
- [Control](docs/control.md): step loop, save/load resume, rewind,
  forks, `INPUTS`, `launch_subagents`, inline-first strategy, custom tools.
- [Node injection](docs/injections.md): append typed controller events to a
  running graph and commit them through `agent.step(graph)`.
- [Observability](docs/observability.md): querying the `Graph`,
  run layout, export helpers, live tree, gantt, topology
  exports, Gradio viewer, CLI.
- [Runtimes](docs/runtimes.md): `Runtime` protocol, shipped runtimes
  (Local / Docker / Modal / E2B / Daytona), writing your own.
- [Prompt customization](docs/prompt_customization.md): `PromptBuilder`
  sections, callable dynamic sections, workspace-backed skills/memory,
  deriving from the default prompt, full replacement.
- [Security](docs/security.md): trust model, Docker isolation knobs,
  engine-level caps, proxied tools, approval gates.
- [Changelog](CHANGELOG.md): release-by-release changes.

## References

- [Recursive Language Models](https://github.com/alexzhang13/rlm): the
  original RLM paper and implementation.
- [rlm-minimal](https://github.com/alexzhang13/rlm-minimal): the
  single-file reference recursive-flow grew from.
- [Scaling Managed Agents: Decoupling the brain from the hands](https://www.anthropic.com/engineering/managed-agents):
  Anthropic's writeup on separating harness, session, and sandbox
  interfaces for long-horizon agents.
- [ypi](https://github.com/rawwerks/ypi): recursive coding agent built
  on Pi. Our session layout and much of the default prompt
  (size-up → delegate → combine, guardrails, aggressive delegation) come
  from ypi's `SYSTEM_PROMPT.md`.

## License

See [LICENSE](LICENSE).

## Citation

```bibtex
@misc{sudhakaran2025recursive-flow,
  author = {Sudhakaran, Shyam},
  title = {recursive-flow},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/shyamsn97/recursive-flow}},
}
```
