The concrete pieces of a running Loopy, how they wire together, and what changes from laptop to production.
What loopy run actually stands up today.
loopy run schedules poll and cron triggers on an in-process scheduler (one
watermark-gated task per sensor) and hosts webhook sensors as HTTP routes, fanning
one URL out to every sensor on it. Webhook ingress can be signed: for
/hooks/github paths, loopy run verifies GitHub's X-Hub-Signature-256
HMAC at the edge when GITHUB_WEBHOOK_SECRET is set. A path left without a secret runs
unverified, which is acceptable for dev only, and says so loudly. The remaining gap is
durability, not the trigger type: the scheduler is in-process, so restart-survival and
single-firing across workers are still ahead (ARCHITECTURE B7/B8).
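The edge check described above follows GitHub's published scheme: an HMAC-SHA256 of the raw request body, hex-encoded and prefixed with sha256=. A minimal sketch, assuming the secret is passed in as a string; the function names are illustrative and are not Loopy's own API:

```python
import hashlib
import hmac
from typing import Optional


def sign(secret: str, body: bytes) -> str:
    """Build the header value a sender attaches (used here only for the demo)."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_github_signature(secret: str, body: bytes, header: Optional[str]) -> bool:
    """Return True if `header` is a valid X-Hub-Signature-256 value for `body`."""
    if not header or not header.startswith("sha256="):
        return False
    expected = sign(secret, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), header.encode())
```

The signature must be computed over the raw bytes as received, before any JSON parsing, because re-serializing the payload can change whitespace and invalidate the digest.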
A Loopy deployment has a build-time path and a run-time path, joined by one artifact: the manifest.
build time · CI / laptop · pure, no runtime deps
run time · a long-lived server
loopy compile parses the project, resolves the DAG, statically checks every
{{ event.* }} / {{ step.* }} reference, and emits manifest.json
(plus generates loopy.events for sensor authors). It executes nothing and has no runtime
dependency, so it belongs in CI. A green compile is the deploy gate.

loopy run manifest.json is the server. It loads the manifest, stands up
the sensor webhooks, and runs workflows as events arrive. It never reads .md:
the manifest is the complete IR, which is what makes "edit a workflow, recompile, redeploy" safe.

The deploy unit is therefore manifest.json + the project root. The root is
still needed at run time for two things only: the sensor module source and the sandbox env_files.
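To make the static reference check concrete, here is a minimal sketch of how a compile step could flag unresolved {{ event.* }} / {{ step.* }} references without executing anything. The regex grammar, the function name, and the shape of the inputs are assumptions for illustration, not Loopy's actual compiler:

```python
import re
from typing import Dict, List, Set

# Matches {{ event.field }} or {{ step_name.field }}; the exact grammar is assumed.
REF = re.compile(r"\{\{\s*([A-Za-z_]\w*)\.([A-Za-z_]\w*)\s*\}\}")


def unresolved_refs(
    prose: str,
    event_fields: Set[str],
    step_outputs: Dict[str, Set[str]],
) -> List[str]:
    """Return every reference that names an unknown event field or step output."""
    errors = []
    for scope, field in REF.findall(prose):
        # "event" resolves against the triggering event; anything else is a step name.
        known = event_fields if scope == "event" else step_outputs.get(scope, set())
        if field not in known:
            errors.append(f"{scope}.{field}")
    return errors
```

An empty result corresponds to a green compile for that step; any entry would be reported as a compile error before a manifest is written.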
The Loopy server in the middle, the outside world it integrates with on the edges. The agent
never calls the model from the Runtime process — the AgentHarness
shells out to claude inside the Sandbox, so model calls and tool side
effects originate from the Sandbox and are governed by its network: allowlist.
Diagram: the stages inside loopy run.

- SensorRunner: @sensor(webhook=…) routes · raw payload → registered Event
- EventBus: routes to every on: subscriber · fan-out + loop-backs
- Runtime: instantiates a run at the on: step · walk the after: DAG · render {{ event.* }}/{{ step.* }} · record outputs + history · publish emits: back to the bus ↺
- AgentHarness: claude -p inside the sandbox · validates typed output:/emits: · enforces budget
- Sandbox: egress from the sandbox is gated by the network: allowlist
| Piece | Responsibility | Today (loopy run) |
|---|---|---|
| SensorRunner | Hosts each @sensor(webhook=…) as an HTTP route; runs the author's fn to turn a raw vendor payload into a registered Event. The webhook/push edge — one language-pluggable surface. | FastAPISensorRunner on uvicorn; loads the real sensor module, or synthesizes events if it can't load. Webhook-only — poll sensors run on the PollScheduler (below). Started only when webhook sensors exist. |
| PollScheduler | The timer/pull edge. Fires each @sensor(poll=…) on its interval with a Tick (scheduled_at, last_run), normalizes the returned event(s), and delivers them to the EventReceiver — same output as a webhook, different trigger. | In-process asyncio (one task/sensor, sequential, watermark advances only on success, cold-start scans one window). Behind a Scheduler seam; durable timing is B7. |
| EventReceiver | Transport-neutral intake — accepts an Event from any sensor source, re-validates it against the registry, and publishes it to the EventBus. Lets a non-Python sensor feed the Python Runtime. | LocalEventReceiver (in-proc; publish-and-ack — validates then bus.publish, does not run the workflow; the Runtime drains the bus separately). |
| EventBus | Routes registered events to every workflow subscribing via on:. Handles fan-out (one event → many workflows) and loop-backs (emits: re-enters the EventBus). | InProcessEventBus (single process), or RedisEventBus (Redis Streams + consumer group — durable, at-least-once) with --bus redis. |
| Runtime | The engine. Instantiates a run at the on: step, walks the after: DAG, renders templates, records outputs + event-sourced history, publishes emits:. | InMemoryRuntime — covers B1–B6; single-process, non-durable. Durable timers / cron / resume stubbed (B7/B10). |
| AgentHarness | Runs a step's prose against its agent (model + skills); validates against the typed output:/emits:. | ClaudeCodeHarness — runs headless claude -p … --output-format json inside the sandbox, parses the envelope, feeds total_cost_usd to the budget enforcer. |
| SandboxProvider | Provisions compute + egress from the sandbox spec (image build + network: allowlist). The trust boundary. | local (subprocess, dev) or daytona (isolated cloud container). Selected per-sandbox via provider: in registry.yml (required on every sandbox — a missing one is compile error E214); the runtime routes each step to the backend its sandbox names. No launch-time flag. loopy init scaffolds daytona. |
| Secrets | Resolves a sandbox's env_file(s) at run time and injects them into the sandbox. Never in the manifest, never logged. | EnvFileSecretsResolver — reads dotenv relative to the project root; refuses paths that escape it. |
| Sensor secrets | Supplies credentials to in-process @sensor functions (poll + webhook). | A single runner-wide sensors/.env (load_sensor_env) merged into the process env at loopy run, so sensors read them via os.environ. Optional, gitignored, never in the manifest. |
| StateStore | Run history (event-sourced), step outputs, poll watermarks (last_run), event-id dedupe. | InMemoryStateStore — process-lifetime only. Now actively used: poll watermarks and the Redis bus's at-least-once dedupe. |
| RetryPolicy | Wraps side-effecting calls with backoff + idempotency key (run_id:step_id) so retries/replays don't double-fire. | ExponentialBackoffRetry; budget trips are terminal. |
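The PollScheduler row says the watermark advances only on success. A minimal sketch of that rule for a single firing, assuming a Tick with the two fields named in the table; fire_once, the watermarks dict, and the deliver callback are hypothetical stand-ins for the scheduler, the StateStore, and the EventReceiver:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Tick:
    scheduled_at: datetime
    last_run: Optional[datetime]  # None on a cold start


def fire_once(
    sensor_id: str,
    poll: Callable[[Tick], List[dict]],
    watermarks: Dict[str, datetime],
    deliver: Callable[[dict], None],
    now: datetime,
) -> bool:
    """Run one poll; advance the watermark only if polling and delivery both succeed."""
    tick = Tick(scheduled_at=now, last_run=watermarks.get(sensor_id))
    try:
        for event in poll(tick):
            deliver(event)
    except Exception:
        # Watermark untouched, so the next tick re-scans the same window.
        return False
    watermarks[sensor_id] = now
    return True
```

Because a failed poll leaves last_run where it was, a later successful poll covers the missed window. That makes delivery at-least-once, which is why event-id dedupe sits alongside the watermarks in the StateStore.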
Section 2 is the logical pipeline — the order data flows through. This is the
physical one: which OS process and trust domain each piece runs in. Architecturally there
are five tiers, and the EventBus is the seam —
everything above it is ingress, everything below it is execution, and the two
halves talk only through it.
A broker here means a networked EventBus (Redis/NATS/Kafka), as opposed to the in-process one.
| Tier | Component | Role |
|---|---|---|
| 1 | SensorRunner | the sensor surface — hosts developer @sensor code; produces events |
| 2 | EventReceiver | the ingress gateway — authenticates + re-validates, then publishes |
| 3 | EventBus | the seam — routes events to on: subscribers; in-process, or a broker when networked |
| 4 | Runtime | the engine — instantiates and drives runs; uses StateStore + RetryPolicy |
| (4) | AgentHarness | runs a step's agent; straddles Runtime ↔ Sandbox |
| 5 | Sandbox (via SandboxProvider) | the agent's exec domain — provisioned per spec |
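The seam in tier 3 can be pictured as a small interface with interchangeable implementations. The sketch below is a toy, single-process version showing fan-out (one event to many subscribers) and loop-backs (a handler publishing a new event); the method names and dict payloads are assumptions, not the real EventBus Protocol:

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Protocol, Tuple

Handler = Callable[[dict], None]


class EventBus(Protocol):
    def subscribe(self, event_type: str, handler: Handler) -> None: ...
    def publish(self, event_type: str, payload: dict) -> None: ...


class InProcessBus:
    """Toy single-process bus: fan-out to every subscriber, loop-backs queued FIFO."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Handler]] = defaultdict(list)
        self._queue: Deque[Tuple[str, dict]] = deque()
        self._draining = False

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subs[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        self._queue.append((event_type, payload))
        if self._draining:
            return  # a handler published (loop-back); the outer loop delivers it
        self._draining = True
        try:
            while self._queue:
                etype, body = self._queue.popleft()
                for handler in list(self._subs[etype]):
                    handler(body)
        finally:
            self._draining = False
```

Anything that satisfies the same two methods, such as a broker-backed class, could replace InProcessBus without the publishers or subscribers changing, which is the property the tier table relies on.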
Diagram: the five tiers, top to bottom.

- Source: POST /hooks/… ▼
- SensorRunner, the sensor surface (own process domain · UNTRUSTED): the @sensor fn turns a raw vendor payload into a candidate Event, then hands it off; it never touches the EventBus or the Runtime directly ▼ (POST /events when split out)
- EventReceiver, the ingress gateway (loopy-owned · TRUSTED): publish to the EventBus
- EventBus, the seam (modular · the only ingress↔execution boundary): on: subscriber · fan-out · loop-backs (a step's emits re-enters here ↺)
- Runtime, the engine (own process domain · N workers when distributed): on: step · walk the after: DAG · render templates · record history · publish emits back to the EventBus ↺
- AgentHarness orchestration: render the prompt, build the claude argv, parse + validate the result, enforce the budget
- Sandbox.exec(argv) → subprocess spawn (local) | control-plane RPC (daytona) ▼
- Sandbox (always a separate exec domain, even on a laptop): the claude CLI runs HERE; model API calls and every tool side effect (git push, open_pr) originate HERE · local subprocess or daytona remote container
- Sandbox network: allowlist ▼

Five facts this makes explicit, and that the logical pipeline leaves ambiguous:
1. SensorRunner is its own domain, and it's untrusted. It runs developer
code, possibly in another language (TypeScript via the SDK) in an app you already operate. It produces a
candidate event and hands it off; it never touches the EventBus or the
Runtime directly.
2. EventReceiver is the trusted front door, on the producer side of the
EventBus, not in the Runtime. Its only job is producer-facing:
authenticate the sender and re-validate the event against the registry contract (never
trust the producer), then publish. It runs no workflows. (Where it physically runs is the
next callout.)
3. EventBus is the one modular seam between ingress and execution. Swap
the in-process implementation for a broker (Redis/NATS/Kafka) without touching a tier
above or below; that is the entire point of the EventBus Protocol. It also buffers: the
EventReceiver keeps accepting and enqueuing while the Runtime is busy or
restarting.
4. Sandbox is the one hard process boundary that always exists, even on
a laptop. The agent never runs in the Runtime process; Sandbox.exec puts it in a
child subprocess (local) or a remote container (daytona). That boundary
is the trust/egress boundary: secrets are injected into it, and its network:
allowlist governs what the agent can reach.
5. AgentHarness straddles the Runtime ↔ Sandbox
boundary. Its orchestration (render the prompt, build the argv, parse/validate the JSON
envelope, enforce the budget) runs in the Runtime process; the agent it launches
runs in the Sandbox. It's the one component that reaches across.

Run the EventReceiver as its own small service, separate from the
engine. Give it one job: take the event, check it's valid, put it on the EventBus,
done. The Runtime reads events off the bus on its own.
Why not bundle it into the engine? If they share a process, then while the engine is busy or restarting you stop accepting events from your sensors. Keep them separate and the sensors keep delivering, the bus holds the backlog, and the engine catches up — and you can run several engine workers behind one receiver.
The one change that makes this possible: today receive()
runs the whole workflow and returns a RunId. Change it to just publish the event and return.
That's the whole difference between "the receiver is glued to the engine" and "the receiver is its own thing."
What to do right now: for single-node loopy run, leave
everything in one process — it's the simple starting point and it's fine for dev and small deployments. The
separate service is the production target, not day-one work. (It is never inside the SensorRunner —
untrusted — or inside the broker — Redis/NATS run no loopy code.)
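The receiver/engine split can be sketched with an asyncio queue standing in for the bus. This is a toy under stated assumptions: the Receiver class, the REGISTRY dict, and the engine coroutine are hypothetical stand-ins, and a real deployment would put a broker where the queue is.

```python
import asyncio
from typing import Dict, List, Set

# Hypothetical registry contract: event type -> required fields.
REGISTRY: Dict[str, Set[str]] = {"Incident": {"source", "title"}}


class Receiver:
    """Validates, enqueues, and returns; it never runs a workflow itself."""

    def __init__(self, bus: "asyncio.Queue[dict]") -> None:
        self._bus = bus

    async def receive(self, event: dict) -> str:
        required = REGISTRY.get(event.get("type", ""))
        if required is None or not required.issubset(event):
            raise ValueError("event does not match the registry contract")
        await self._bus.put(event)
        return "accepted"  # an acknowledgement, not a RunId


async def engine(bus: "asyncio.Queue[dict]", started: List[str]) -> None:
    """Stand-in for the Runtime: drains the bus on its own schedule."""
    while True:
        event = await bus.get()
        started.append(event["title"])
        bus.task_done()


async def demo() -> List[str]:
    bus: "asyncio.Queue[dict]" = asyncio.Queue()
    receiver = Receiver(bus)
    started: List[str] = []
    # The engine is not running yet: the receiver still accepts, the bus holds the backlog.
    await receiver.receive({"type": "Incident", "source": "sentry", "title": "a"})
    await receiver.receive({"type": "Incident", "source": "sentry", "title": "b"})
    worker = asyncio.create_task(engine(bus, started))
    await bus.join()  # the engine catches up
    worker.cancel()
    return started
```

Running asyncio.run(demo()) shows both events accepted before any engine exists, then drained in order once a worker starts. Several engine tasks could read from the same queue in the same way.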
| Boundary | Mechanism | Over the network? |
|---|---|---|
| Source → SensorRunner | HTTPS POST /hooks/… | yes, from the internet |
| SensorRunner → EventReceiver | in-proc call (1 node) or HTTPS POST /events (SDK) | only when sensors are split out |
| EventReceiver → EventBus → Runtime | EventBus.publish / subscribe | only here is the seam: in-process, or a broker |
| Runtime → Sandbox (exec) | subprocess spawn (local) / control-plane RPC (daytona) | only daytona |
| Sandbox → model API, GitHub, … | HTTPS egress, network:-gated | yes, from the Sandbox |
The tiers are logical; how many processes they occupy is a deploy choice.
The direction is a service-oriented architecture: the EventReceiver and the
Runtime become separate services that talk through a broker. For now we run them as one
node — the simple starting point — and evolve outward. Switching modes changes only which
EventBus you wire in; no .md changes, no recompile.
| Mode | Receiver + Runtime | EventBus (the seam) | Sandbox |
|---|---|---|---|
| Single-node, today (loopy run) | one node, one process | in-process | separate (always) |
| Service-oriented, the direction | separate services; N Runtime workers behind one EventReceiver | a broker (Redis / NATS / Kafka) | separate (always) |
Two things hold in both modes:
1. Sandbox is always separate. Even single-node runs the agent in its own
sandbox (a local subprocess or a daytona container); it is never part of the
receiver/engine node.
2. Sensors never write to the EventBus
directly. The SensorRunner is untrusted (developer code, maybe another language, maybe
in your own app); the EventReceiver is the gate that authenticates it and re-validates the
event against the registry contract before publishing. A sensor writing to the bus directly would skip that
check (and a remote one can't reach an in-process bus anyway, and shouldn't get direct broker write access).
That gate is the receiver's entire reason to exist. (See the sensor-surface
note for the SDK model that puts the SensorRunner in your own app.)

Today sensors are loopy-hosted in Python (the simple start). The plan is to expand to
developer-hosted sensors later (the dev's own app/language posting events to a loopy-owned
endpoint), which is what unlocks polyglot. Getting there cleanly depends on two behaviors, not
on extra machinery. The structure is already right (receive() takes a serializable Event,
the @sensor fn is a standalone payload → Event callable, the contract is generated as
static files), so the only corner risk is shipping the in-process version with the wrong behaviors and
calling the interface "ready to split." Two things land now:
1. EventReceiver re-validates every event against the manifest registry,
even though in-process the sensor "should" be correct. Skipping it bakes in sensor-trust; un-trusting it later
means adding the gate and auditing everything downstream.
2. receive() publishes and acknowledges; it does not run the workflow
synchronously. A remote receiver can't hold a connection open for a minutes-to-days run, so
synchronous-run-from-receive must never be depended on. The Runtime consumes off the bus
instead.

Deliberately deferred to the developer-hosted milestone (additive, easy to get wrong if built
speculatively): the HTTP POST /events endpoint, producer authentication, an external broker, and
contract versioning/distribution to remote sensors. The one rule while they're deferred: keep Event
serializable and never assume the sensor and receiver share an in-memory registry. (Tracked in
plans/future/sensor-ingress/.)
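The "keep Event serializable" rule can be illustrated with a round trip through JSON, where the receiving side checks the event against its own copy of the registry instead of trusting the producer's. The Event shape, to_wire, from_wire, and the registry format are assumptions for this sketch, not Loopy's real types:

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Set


@dataclass(frozen=True)
class Event:
    type: str
    payload: Dict[str, Any]


def to_wire(event: Event) -> bytes:
    """Sensor side: nothing but JSON leaves the process."""
    return json.dumps(asdict(event), sort_keys=True).encode()


def from_wire(raw: bytes, registry: Dict[str, Set[str]]) -> Event:
    """Receiver side: rebuild the event and check it against the receiver's own registry."""
    data = json.loads(raw)
    required = registry.get(data.get("type"))
    if required is None:
        raise ValueError(f"unregistered event type: {data.get('type')!r}")
    missing = required - set(data.get("payload", {}))
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return Event(type=data["type"], payload=data["payload"])
```

If an event survives this round trip in-process today, moving the sensor to another process or language later changes only the transport, not the contract.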
The README's incidents example, traced through a live deployment:
1. Sentry fires a webhook ── POST /hooks/sentry ──▶ SensorRunner
2. the @sensor fn maps the payload ── returns Incident(source=sentry, …) ──▶ EventReceiver
3. EventReceiver injects it ── Runtime.trigger(Incident) ──▶ EventBus.publish
4. EventBus routes Incident ── matches triage/investigate `on: Incident` ──▶ a run starts
5. investigate runs ── ClaudeCodeHarness runs `claude` in a sandbox ──▶ emits WorkItem
6. WorkItem re-enters the bus ── matches resolve/arbitrate `on: WorkItem` ──▶ a new run starts
7. arbitrate ▶ fix ▶ review ▶ ship ── one workflow, `after:` chain, outputs passed by reference
8. ship emits GoalShipped ── terminal announcement on the bus
Steps 4 and 6 are cross-workflow event seams — they go through the bus. The
arbitrate → fix → review → ship chain in step 7 is within-workflow: those
handoffs are outputs passed by reference ({{ fix.diff }}) and never touch the bus.
Events cross workflow boundaries; outputs stay inside one.
The same five-axis composition (Runtime · StateStore · EventBus · Harness · SandboxProvider) is
wired differently per environment. Only the wiring changes — no .md and no compile
step is touched.
loopy trigger … (sandboxes with provider: local)
InMemoryRuntime · in-process EventBus · in-memory StateStore · local Sandbox.
loopy run … (sandboxes with provider: daytona)
Same composition, long-lived, hosting the webhooks; agents in isolated Daytona sandboxes.
Swap Runtime → DBOS / Temporal; StateStore → Postgres / history. (The networked bus already ships — --bus redis, Redis Streams.)
Harness + sandbox unchanged.
… cron, day-spanning budgets), crash-recoverable resume, version pinning.

… contract.py interfaces.

loopy.yaml: deployment defaults for loopy run

The run-time wiring choices map to an optional config file so they need not be
retyped as flags. Absent the file, defaults apply unchanged.

    sensor_server:        # the host:port that binds the sensor-webhook listener
      host: 0.0.0.0
      port: 8000
    bus: redis            # inproc (single-process) | redis (networked broker)
Precedence is explicit flag > loopy.yaml > built-in default,
so --bus inproc still wins over a file that says redis. Connection
strings stay in the environment, never the file: bus: redis reads its URL from the
REDIS_URL env var (or the --redis-url flag), defaulting to
redis://localhost:6379. sandbox is not a config key and not a
launch flag — each sandbox's provider: is declared in registry.yml, and
the runtime routes each step to the backend its sandbox names. state: (durable StateStore) and limits: (spend caps)
are reserved for B10/B-cost and not yet read.
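The precedence rule above can be sketched as two small resolvers. The function names are illustrative. Two details are assumptions: that inproc is the built-in default for the bus, and that --redis-url outranks REDIS_URL (the text names both sources without ordering them).

```python
from typing import Mapping, Optional

# Built-in defaults; "inproc" as the bus default is an assumption for this sketch.
DEFAULTS = {"bus": "inproc", "redis_url": "redis://localhost:6379"}


def resolve_bus(flag: Optional[str], file_cfg: Mapping[str, object]) -> str:
    """Explicit flag > loopy.yaml > built-in default."""
    if flag is not None:
        return flag
    return str(file_cfg.get("bus", DEFAULTS["bus"]))


def resolve_redis_url(flag: Optional[str], env: Mapping[str, str]) -> str:
    """The URL never comes from the file: --redis-url, then REDIS_URL, then the default."""
    return flag or env.get("REDIS_URL") or DEFAULTS["redis_url"]
```

For example, resolve_bus("inproc", {"bus": "redis"}) returns "inproc", matching the statement that the flag wins over the file.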
loopy.env: the secret companion to loopy.yaml

Because connection strings and provider keys can't live in the YAML, a local-dev convenience
file supplies them: loopy run reads loopy.env at the project root and
merges it into the process env with non-override (a value already set in the
real/platform environment always wins), before resolving redis_url and before
any Daytona client is created. It holds infra creds only (REDIS_URL,
DAYTONA_API_KEY/DAYTONA_API_URL) and is gitignored; agent secrets stay in sandbox
env_files and sensor secrets in sensors/.env. In production you typically
skip the file and inject these from the platform's secret store, which the non-override semantics
respect.

    # loopy.env (project root; gitignored; local-dev convenience)
    REDIS_URL=redis://localhost:6379
    DAYTONA_API_KEY=dt-...
    DAYTONA_API_URL=https://...
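Non-override merging is simple to state in code: a value from the file is applied only when the key is not already set. A minimal sketch with a deliberately naive dotenv parser (no quoting or export handling); both function names are hypothetical and this is not the loader loopy run actually uses:

```python
from typing import Dict, MutableMapping


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines; blank lines and # comments are skipped."""
    pairs: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


def merge_non_override(file_vars: Dict[str, str], env: MutableMapping[str, str]) -> None:
    """Copy file values into env, leaving anything the platform already set untouched."""
    for key, value in file_vars.items():
        env.setdefault(key, value)
```

Passing os.environ as env gives the behavior described: a REDIS_URL injected by the platform survives, while keys missing from the environment are filled from the file.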
What you need in place for a successful run, in order:
1. loopy compile . (writes manifest.json; use
--check to validate without writing) exits 0: every
on:/emits: names a registered event, every {{ }} ref resolves,
the DAG is acyclic, every sensor declares a registered emits. Run it in CI.
2. manifest.json is the deploy artifact; carry the
project root alongside it (for sensor source + env_files).
3. env_files. Each sandbox's env_file must exist under
the project root and hold the keys the harness needs, at minimum the model API key
(ANTHROPIC_API_KEY for claude-code). The runtime refuses to start a step whose
sandbox can't supply the harness's required keys, and refuses paths that escape the root.
4. sensors/.env. Credentials an in-process @sensor needs
(a poll API key, a webhook-signing secret) go in a single runner-wide sensors/.env under the
project root; loopy run merges them into the process env (non-override) so sensors read them via
os.environ. Optional and gitignored. Keep infra creds (DAYTONA_API_KEY, REDIS_URL)
out of it; those are the server's own env.
5. DAYTONA_API_KEY / DAYTONA_API_URL (for the daytona provider, loopy's default sandbox; the SDK ships in the core deps),
and REDIS_URL / --redis-url (for --bus redis). In production these come from the platform;
for local dev drop them in loopy.env at the project root (the secret companion to loopy.yaml, merged
with non-override). Infra creds only: agent secrets stay in sandbox env_files, sensor secrets in sensors/.env.
For the local provider: nothing, but agents run as host subprocesses, so it's dev-only.
6. The sandbox network: list is the egress contract. It
must include every host an agent reaches (e.g. github.com for opening a PR),
and the model API endpoint must be reachable from inside the sandbox.
7. … @sensor(webhook="/hooks/…") path. loopy run prints the hosted paths on startup;
verify the count and paths match.

PollScheduler:
poll sensors fire watermark-gated (one task per sensor), and workflow on: cron(…) entries
fire Runtime.tick on each occurrence. Durable scheduling (survives restart,
single-firing across workers) is the B7/B8 work still ahead, behind the same Scheduler seam.

Webhook sensors are hosted by loopy run as HTTP routes, with signed
ingress for GitHub (X-Hub-Signature-256 verified at the edge when
GITHUB_WEBHOOK_SECRET is set). A /hooks/github route left without a secret runs
unverified, which is for dev only; set the secret before exposing it. A general per-source auth
framework for arbitrary providers is still ahead.

A budget trip on wall_clock or
spend.usd fails the run rather than looping.

Diagram: the whole path in one view.

- loopy compile → manifest.json
- loopy run consumes the manifest ▼
- SensorRunner (untrusted) → EventReceiver (trusted gate: authenticate + re-validate)
- EventBus.publish ══ THE SEAM · EventBus (in-process | broker: Redis/NATS) ══ subscribe ▼
- Runtime walks the DAG · emits loop back to the EventBus ↺ · AgentHarness orchestrates each step
- Sandbox.exec ▼
- Sandbox: the claude CLI runs here · model API + tool egress gated by network: · secrets injected at run time

The EventBus is the seam: ingress (SensorRunner → trusted
EventReceiver) on one side, execution (Runtime → AgentHarness) on the
other, talking only through it. The Sandbox is the other hard boundary — where
trust and egress are enforced. The manifest is the contract between author and Runtime; the
Sandbox is the contract between agent and the outside world.