Metadata-Version: 2.4
Name: fleet-rlm
Version: 0.4.6
Summary: Recursive Language Models with DSPy + Modal and an integrated Web UI for secure long-context code execution
Author: Qredence
License-Expression: MIT
Project-URL: Homepage, https://github.com/qredence/fleet-rlm
Project-URL: Repository, https://github.com/qredence/fleet-rlm
Project-URL: Issues, https://github.com/qredence/fleet-rlm/issues
Project-URL: Documentation, https://fleet-rlm.readthedocs.io/
Keywords: dspy,llm,modal,recursive-language-model,rlm,agents
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Typing :: Typed
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: AUTHORS.md
Requires-Dist: dspy==3.1.3
Requires-Dist: hydra-core<2,>=1.3
Requires-Dist: markitdown[all]<1,>=0.1.0
Requires-Dist: omegaconf<3,>=2.3
Requires-Dist: modal>=1.3.2
Requires-Dist: pypdf<7,>=6
Requires-Dist: pydantic<3,>=2.12.5
Requires-Dist: prompt-toolkit<4,>=3.0.50
Requires-Dist: python-dotenv>=1.2.1
Requires-Dist: pyyaml<7,>=6.0.3
Requires-Dist: rich<14,>=13.9
Requires-Dist: structlog<25,>=24.1.0
Requires-Dist: sqlmodel>=0.0.24
Requires-Dist: aiosqlite>=0.20.0
Requires-Dist: tomli>=2.0.0; python_version < "3.11"
Requires-Dist: typer<1,>=0.21.1
Requires-Dist: posthog>=7.9.1
Requires-Dist: asyncpg>=0.31.0
Requires-Dist: sqlalchemy>=2.0.46
Requires-Dist: greenlet>=3.3.1
Requires-Dist: psycopg>=3.3.2
Provides-Extra: dev
Requires-Dist: pre-commit>=3.7; extra == "dev"
Requires-Dist: pytest>=8.2; extra == "dev"
Requires-Dist: pytest-asyncio>=0.24; extra == "dev"
Requires-Dist: ruff>=0.8; extra == "dev"
Requires-Dist: ty>=0.0.1a16; extra == "dev"
Requires-Dist: build>=1.2; extra == "dev"
Requires-Dist: twine>=5.1; extra == "dev"
Provides-Extra: mcp
Requires-Dist: fastmcp<3,>=2.14.0; extra == "mcp"
Requires-Dist: httpx[socks]<1,>=0.28.1; extra == "mcp"
Requires-Dist: pydantic<3,>=2.12.5; extra == "mcp"
Provides-Extra: server
Requires-Dist: alembic<2,>=1.13; extra == "server"
Requires-Dist: asyncpg<1,>=0.29; extra == "server"
Requires-Dist: fastapi[standard]<1,>=0.115; extra == "server"
Requires-Dist: greenlet<4,>=3.0; extra == "server"
Requires-Dist: psycopg[binary]<4,>=3.2; extra == "server"
Requires-Dist: PyJWT<3,>=2.8; extra == "server"
Requires-Dist: pydantic<3,>=2.12.5; extra == "server"
Requires-Dist: sqlalchemy<3,>=2; extra == "server"
Requires-Dist: scalar-fastapi<2,>=1.5.0; extra == "server"
Requires-Dist: uvicorn[standard]<1,>=0.32; extra == "server"
Requires-Dist: websockets<17,>=14; extra == "server"
Provides-Extra: full
Requires-Dist: alembic<2,>=1.13; extra == "full"
Requires-Dist: asyncpg<1,>=0.29; extra == "full"
Requires-Dist: fastmcp<3,>=2.14.0; extra == "full"
Requires-Dist: httpx[socks]<1,>=0.28.1; extra == "full"
Requires-Dist: pydantic<3,>=2.12.5; extra == "full"
Requires-Dist: psycopg[binary]<4,>=3.2; extra == "full"
Requires-Dist: PyJWT<3,>=2.8; extra == "full"
Requires-Dist: sqlalchemy<3,>=2; extra == "full"
Requires-Dist: greenlet<4,>=3.0; extra == "full"
Requires-Dist: fastapi[standard]<1,>=0.115; extra == "full"
Requires-Dist: uvicorn[standard]<1,>=0.32; extra == "full"
Requires-Dist: scalar-fastapi<2,>=1.5.0; extra == "full"
Requires-Dist: websockets<17,>=14; extra == "full"
Dynamic: license-file

# fleet-rlm

[![PyPI version](https://img.shields.io/pypi/v/fleet-rlm.svg)](https://pypi.org/project/fleet-rlm/)
[![Python versions](https://img.shields.io/pypi/pyversions/fleet-rlm.svg)](https://pypi.org/project/fleet-rlm/)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![CI](https://github.com/Qredence/fleet-rlm/actions/workflows/ci.yml/badge.svg)](https://github.com/Qredence/fleet-rlm/actions/workflows/ci.yml)

[![PyPI Downloads](https://static.pepy.tech/personalized-badge/fleet-rlm?period=monthly&units=INTERNATIONAL_SYSTEM&left_color=MAGENTA&right_color=BLACK&left_text=downloads%2Fmonth)](https://pepy.tech/projects/fleet-rlm)

**Secure, cloud-sandboxed Recursive Language Models (RLM) with DSPy and Modal.**

`fleet-rlm` provides a production-ready implementation of **Recursive Language Modeling** aligned with the [DSPy RLM API](https://dspy.ai/api/modules/RLM/). It gives your AI agent a secure "computer" in the cloud to read, search, and analyze massive datasets without local resource constraints.

[Paper](https://arxiv.org/abs/2501.123) | [Contributing](CONTRIBUTING.md) | [Docs](docs/)

---

## Architecture

```mermaid
graph TB
    subgraph entry ["🚪 Entry Points"]
        CLI["CLI (Typer)"]
        WebUI["Web UI<br/>(React SPA)"]
        API["FastAPI<br/>(WS/REST)"]
        TUI["Ink TUI<br/>(stdio bridge)"]
        MCP["MCP Server"]
    end

    subgraph orchestration ["🧠 Orchestration Layer"]
        Agent["RLMReActChatAgent<br/>(dspy.Module)"]
        History["Chat History"]
        Memory["Core Memory<br/>(Persona/Human/Scratchpad)"]
        DocCache["Document Cache"]
    end

    subgraph tools ["🔧 ReAct Tools"]
        DocTools["📄 load_document<br/>read_file_slice<br/>chunk_by_*"]
        RecursiveTools["🔄 rlm_query<br/>llm_query<br/>(recursive delegation)"]
        ExecTools["⚡ execute_code<br/>edit_file<br/>search_code"]
    end

    subgraph execution ["⚙️ Execution Layer"]
        Interpreter["ModalInterpreter<br/>(JSON protocol)"]
        Profiles["Execution Profiles:<br/>ROOT | DELEGATE | MAINTENANCE"]
    end

    subgraph cloud ["☁️ Modal Cloud"]
        Sandbox["Sandbox Driver<br/>(Python REPL)"]
        Volume[("💾 Persistent Volume<br/>/data/<br/>• workspaces<br/>• artifacts<br/>• memory<br/>• session state")]
    end

    WebUI -->|"REST / WS"| API
    CLI --> Agent
    API --> Agent
    TUI --> Agent
    MCP --> Agent

    Agent --> History
    Agent --> Memory
    Agent --> DocCache

    Agent --> DocTools
    Agent --> RecursiveTools
    Agent --> ExecTools

    DocTools --> Interpreter
    RecursiveTools --> Interpreter
    ExecTools --> Interpreter

    Interpreter --> Profiles
    Interpreter -->|"stdin/stdout<br/>JSON commands"| Sandbox
    Sandbox -->|"read/write"| Volume

    style entry fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    style orchestration fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    style tools fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    style execution fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
    style cloud fill:#fce4ec,stroke:#c2185b,stroke-width:2px
```


**Layers:**

🚪 **Entry Points** → 🧠 **Orchestration** → 🔧 **Tools** → ⚙️ **Execution** → ☁️ **Modal Cloud**


## Features

- **Web UI First (0.4.6)**: Integrated React SPA (`src/frontend`) is now the primary interactive surface for chat, execution timeline, and artifact workflows.
- **Interactive Agent**: `RLMReActChatAgent` (a `dspy.Module`) combines fast, interactive chat with deep, recursive task execution via `rlm_query`.
- **DSPy Aligned**: Implements `dspy.RLM`, `dspy.Module`, and `dspy.Tool` interfaces — compatible with DSPy optimizers (`BootstrapFewShot`, `MIPROv2`).
- **Secure Sandbox**: Code runs in isolated **Modal** containers with persistent storage volumes, execution profiles, and sensitive data redaction.
- **Recursive Delegation**: All delegate tools (`rlm_query`, `analyze_long_document`, `grounded_answer`, etc.) spawn true recursive sub-agents via `spawn_delegate_sub_agent()` with unified depth enforcement.
- **PDF Ingestion**: Native document loading via MarkItDown with pypdf fallback; OCR guidance for scanned PDFs.
- **Session State**: Per-workspace, per-user session persistence with manifests stored on Modal volumes.
- **MCP Server**: Expose fleet-rlm capabilities as an MCP tool server via `serve-mcp`.
- **Execution Streams**: `/ws/chat` remains the primary interactive stream while `/ws/execution` provides structured execution lifecycle events for Artifact Canvas and observability clients.
- **Observability**: Real-time streaming of thoughts, tool execution, trajectory normalization, and structured logging.
- **LLM Analytics (Opt-in)**: PostHog `$ai_generation` events for DSPy LM calls with trace correlation, token metadata, latency, and payload redaction/truncation.

## PostHog LLM Analytics

PostHog analytics is disabled by default. To enable it, set both:

```bash
POSTHOG_ENABLED=true
POSTHOG_API_KEY=phc_...
```

Optional settings:

- `POSTHOG_HOST` (default: `https://us.i.posthog.com`)
- `POSTHOG_DISTINCT_ID` (runtime user identity takes precedence in `/ws/chat`)
- `POSTHOG_FLUSH_INTERVAL` / `POSTHOG_FLUSH_AT`
- `POSTHOG_ENABLE_DSPY_OPTIMIZATION` (default: `false`)
- `POSTHOG_INPUT_TRUNCATION` / `POSTHOG_OUTPUT_TRUNCATION`
- `POSTHOG_REDACT_SENSITIVE` (default: `true`)

Programmatic setup:

```python
from fleet_rlm import configure_analytics

configure_analytics()  # reads POSTHOG_* environment variables
```

Each DSPy LM call emits `$ai_generation` with:

- `$ai_trace_id`, `$ai_parent_trace_id`
- `$ai_model`, `$ai_provider`, `$ai_latency`
- `$ai_input`, `$ai_output_choices` (sanitized + truncated)
- `$ai_input_tokens`, `$ai_output_tokens`, `$ai_total_tokens`

## Quick Start

### 1. Install

```bash
uv pip install fleet-rlm
```

Optional extras for server and MCP support:

```bash
uv pip install fleet-rlm[server]   # FastAPI server + WebSocket
uv pip install fleet-rlm[mcp]      # MCP server
uv pip install fleet-rlm[full]     # All extras
```

### 2. Configure

Set up your Modal and LLM credentials:

```bash
modal setup
modal volume create rlm-volume-dspy
modal secret create LITELLM DSPY_LM_MODEL=openai/gemini-3-pro-preview DSPY_LLM_API_KEY=sk-...
```

Set up NeonDB + backend auth bootstrap:

```bash
# from repo root
cp .env.example .env
# Edit .env and set:
#   DATABASE_URL=postgresql://... (direct Neon endpoint)
#   AUTH_MODE=dev
#   AUTH_REQUIRED=false   # dev default; auth optional until Entra is wired
#   DEV_JWT_SECRET=...
```

Initialize DB schema:

```bash
# from repo root
uv run python scripts/db_init.py
```

### 3. Run

**Web UI (React SPA):**

`0.4.6` treats the React SPA as the primary interface. The backend serves the built frontend automatically.

```bash
# 1. Build the frontend (requires Bun)
cd src/frontend
bun install
bun run build
cd ../..

# 2. Build the Python package (bundles the UI into the wheel)
uv build

# 3. Install with server dependencies and run the Web UI server
uv pip install -e ".[server]"
uv run fleet web
```
Then navigate to `http://localhost:8000` in your browser.

OpenAPI source-of-truth is `openapi.yaml` at repository root. Frontend API types are generated from `src/frontend/openapi/fleet-rlm.openapi.yaml`, which should be synced from the root spec via frontend scripts.

**Interactive Chat (OpenTUI):**

```bash
# Requires OpenTUI / Bun
fleet-rlm code-chat --opentui
```

**Standalone Interactive Chat (Ink):**

```bash
# Ink runtime (supported standalone path)
fleet

# Force Ink explicitly
fleet --ui ink
```

**One-shot Tasks:**

```bash
# Basic question
fleet-rlm run-basic --question "What are the first 12 Fibonacci numbers?"

# Document analysis
fleet-rlm run-architecture --docs-path docs/architecture.md --query "Extract all components"
```

**Servers:**

```bash
# API server (FastAPI + WebSocket) via explicit command
uv run fleet-rlm serve-api --port 8000

# MCP server
fleet-rlm serve-mcp --transport stdio
```

WebSocket endpoints:

- `/api/v1/ws/chat` for interactive conversation and tool orchestration events.
- `/api/v1/ws/execution` for filtered execution lifecycle events (`execution_started`, `execution_step`, `execution_completed`) scoped by `workspace_id`, `user_id`, and `session_id`.

Issue a dev token:

```bash
# from repo root
uv run python scripts/dev_issue_token.py \
  --tid "00000000-0000-0000-0000-000000000123" \
  --oid "00000000-0000-0000-0000-000000000456" \
  --email dev@example.com \
  --name "Dev User"
```

Call an authenticated endpoint (debug headers):

```bash
curl -s http://127.0.0.1:8000/api/v1/auth/me \
  -H "X-Debug-Tenant-Id: 00000000-0000-0000-0000-000000000123" \
  -H "X-Debug-User-Id: 00000000-0000-0000-0000-000000000456" \
  -H "X-Debug-Email: dev@example.com" \
  -H "X-Debug-Name: Dev User"
```

Call an authenticated endpoint (JWT):

```bash
curl -s http://127.0.0.1:8000/api/v1/auth/me \
  -H "Authorization: Bearer ${DEV_TOKEN}"
```

Run DB smoke test:

```bash
# from repo root
uv run python scripts/db_smoke.py
```

`fleet` and `fleet-rlm code-chat` serve different interactive paths:

- `fleet` = standalone bridge chat launcher (Ink runtime)
- `fleet-rlm code-chat` = OpenTUI runtime (OpenTUI/Bun required)

## Development Setup

```bash
# Clone and install
git clone https://github.com/qredence/fleet-rlm.git
cd fleet-rlm
uv sync --extra dev

# With server/MCP support
uv sync --extra dev --extra server --extra mcp

# Build React frontend bundle for web UI
cd src/frontend
bun install
bun run check
cd ../..

# Build Ink frontend bundle for `fleet --ui ink`
cd tui-cli/tui-ink
bun install
bun run build
bun run test
cd ..

# Copy environment template
cp .env.example .env

# Quality gate
uv run ruff check src tests
uv run ruff format --check src tests
uv run ty check src --exclude "src/fleet_rlm/_scaffold/**"
uv run pytest -q

# Auto-fix formatting when needed
uv run ruff format src tests
```

## Documentation

- [Concepts](docs/explanation/rlm-concepts.md) — Core architecture (Agent, RLM, Sandbox)
- [User Flows](docs/user_flows.md) — Interaction diagrams (Chat, Tools, Delegation)
- [Architecture](docs/explanation/architecture.md) — System components and hierarchy
- [Tutorials](docs/tutorials/index.md) — Step-by-step lessons
- [How-To Guides](docs/how-to-guides/index.md) — Installation, deployment, troubleshooting
- [CLI Reference](docs/reference/cli.md) — Full CLI command reference
- [HTTP API Reference](docs/reference/http-api.md) — Server endpoints and WebSocket protocol
- [Source Layout](docs/reference/source-layout.md) — Package structure guide

## Contributing

We welcome contributions! Please see our [Contribution Guide](CONTRIBUTING.md) and run the quality gate before submitting:

```bash
uv run ruff check src tests
uv run ruff format --check src tests
uv run ty check src --exclude "src/fleet_rlm/_scaffold/**"
uv run pytest -q
```

## License

MIT License — see [LICENSE](LICENSE).

Based on [Recursive Language Modeling](https://arxiv.org/abs/2501.123) research by **Alex L. Zhang** (MIT CSAIL), **Omar Khattab** (Stanford), and **Tim Kraska** (MIT).
