Metadata-Version: 2.4
Name: car-runtime
Version: 0.22.1
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: Other/Proprietary License
Classifier: Operating System :: MacOS
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries
Requires-Dist: pytest>=7.0 ; extra == 'test'
Provides-Extra: test
License-File: LICENSE
Summary: Common Agent Runtime — Python bindings for deterministic AI agent execution
Keywords: ai,agent,runtime,llm,inference
Author-email: Parslee AI <hello@parslee.ai>
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/Parslee-ai/car
Project-URL: Issues, https://github.com/Parslee-ai/car/issues
Project-URL: Repository, https://github.com/Parslee-ai/car

# car-runtime (Python)

Python bindings for **Common Agent Runtime** (CAR) — a deterministic execution
layer for AI agents. Models propose; the runtime validates and executes.

As of v0.8, this package is a **thin daemon client**: every method
proxies to a singleton `car-server` daemon over WebSocket. Inference,
the per-session memory graph, and tool dispatch all live on the
daemon side. Start `car-server` once per host before using the
bindings (`car-server` ships in the same wheel; see "Start the
daemon" below).

Pre-built wheels (abi3, Python 3.9+) for:
- `macosx_15_0_arm64` — macOS 15+ on Apple Silicon (Intel Macs dropped — see CHANGELOG)
- `manylinux_2_17_x86_64`, `manylinux_2_28_aarch64`

Building from source on macOS: set `MACOSX_DEPLOYMENT_TARGET=15.0` when
invoking `maturin build`. MLX's bundled Metal shaders don't compile against
the default 11.0 target, and our release wheels target 15.0 because the
compiled extension pulls in libc++ symbols (notably
`std::exception_ptr::__from_native_exception_pointer`) that only exist on
macOS 15+. A lower target produces a wheel whose tag doesn't match what the
binary actually requires at `dlopen` time.

## Install

From a release wheel (substitute the current version for `X.Y.Z`):

```bash
pip install https://github.com/Parslee-ai/car/releases/download/vX.Y.Z/car_runtime-X.Y.Z-cp39-abi3-macosx_15_0_arm64.whl
```

Or build from source:

```bash
pip install maturin
cd car-rs/crates/car-ffi-pyo3
maturin develop --release
```

The import name is `car_runtime`: the PyPI package name `car-runtime` with the hyphen replaced by an underscore.

## Start the daemon

```bash
# The car-server binary ships inside the wheel. Default port 9100;
# auth on by default. Foreground (Ctrl-C to stop):
python -m car_runtime.server

# Or background:
python -m car_runtime.server &

# Override the URL the bindings use:
export CAR_DAEMON_URL=ws://127.0.0.1:9100
```

On macOS the SwiftUI menubar host (`CAR Host.app`) launches the
daemon for you. Install it once from the `.pkg` installer on the
latest GitHub release; from then on it supervises `car-server` and
keeps itself up to date via Sparkle.
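
Whichever way the daemon is started, it can help to confirm that something is listening before constructing a `CarRuntime`. The sketch below is a minimal pre-flight check using only the standard library. The helper name `daemon_reachable` is hypothetical and not part of `car_runtime`. It only tests that a TCP connection can be opened at the host and port taken from `CAR_DAEMON_URL` (falling back to the default port above); it does not perform a WebSocket handshake or authenticate.

```python
import os
import socket
from urllib.parse import urlparse

DEFAULT_URL = "ws://127.0.0.1:9100"  # default port stated in this README


def daemon_reachable(url=None, timeout=1.0):
    """Return True if a TCP connection to the daemon's host:port succeeds.

    Hypothetical helper: checks reachability only, not the WebSocket
    handshake and not authentication.
    """
    parsed = urlparse(url or os.environ.get("CAR_DAEMON_URL", DEFAULT_URL))
    host = parsed.hostname or "127.0.0.1"
    port = parsed.port or 9100
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    print("daemon reachable:", daemon_reachable())
```

A `False` result usually means the daemon is not running or `CAR_DAEMON_URL` points elsewhere. A `True` result only shows that the port is open.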

## Quickstart

```python
import json
from car_runtime import CarRuntime, verify

rt = CarRuntime()   # lazy-connects to ws://127.0.0.1:9100/

# Tools + policies — proxied to the daemon's per-session runtime
# and persist for the lifetime of this CarRuntime's WS connection.
rt.register_tool("shell")
rt.register_policy(
    "no_rm",
    "deny_tool_param",
    target="shell",
    key="command",
    pattern="rm -rf",
)

# Ground with facts (proxied to the daemon's per-session memgine).
rt.add_fact("project_language", "Python", "pattern")

# Static verification — runs on the daemon side, no model needed.
proposal = json.dumps({
    "actions": [{
        "id": "a1",
        "type": "tool_call",
        "tool": "shell",
        "parameters": {"command": "ls"},
        "dependencies": [],
    }],
})

check = json.loads(rt.verify_proposal(proposal))
if not check["valid"]:
    raise RuntimeError(f"invalid proposal: {check['issues']}")

# Proposal execution with a Python tool callback is not exposed on
# the PyO3 surface in v0.8 — connect to the daemon's WebSocket
# directly with `websockets` (or any WS library) and use:
#   - `proposal.submit` (your client → daemon)
#   - a `tools.execute` handler on the same connection (daemon → you)
# See docs/websocket-protocol.md and
# car-rs/examples/ws-client-python/ for a working sketch.
```
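
Proposals with more than one action are easier to assemble with a small builder. The sketch below reproduces the action shape used in the quickstart (`id`, `type`, `tool`, `parameters`, `dependencies`). The helpers `tool_call` and `build_proposal` are hypothetical names, and the duplicate-id and unknown-dependency checks are this sketch's own local sanity checks. The daemon's verifier remains the authority on what is valid.

```python
import json


def tool_call(action_id, tool, parameters, dependencies=()):
    """Build one action dict in the shape used by the quickstart proposal."""
    return {
        "id": action_id,
        "type": "tool_call",
        "tool": tool,
        "parameters": dict(parameters),
        "dependencies": list(dependencies),
    }


def build_proposal(*actions):
    """Serialize actions into a proposal JSON string.

    Local checks only (an assumption of this sketch, not the runtime's
    rules): ids must be unique and dependencies must name known ids.
    """
    ids = [a["id"] for a in actions]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate action ids")
    known = set(ids)
    for a in actions:
        missing = [d for d in a["dependencies"] if d not in known]
        if missing:
            raise ValueError(f"{a['id']} depends on unknown actions: {missing}")
    return json.dumps({"actions": list(actions)})


proposal = build_proposal(
    tool_call("a1", "shell", {"command": "ls"}),
    tool_call("a2", "shell", {"command": "pwd"}, dependencies=["a1"]),
)
# With a running daemon: check = json.loads(rt.verify_proposal(proposal))
```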

## Streaming inference

```python
import json
from car_runtime import CarRuntime

rt = CarRuntime()

def on_event(event_json: str) -> None:
    e = json.loads(event_json)
    if e["type"] == "text":
        print(e["data"], end="", flush=True)

rt.infer_stream(
    "Explain CAR in one sentence.",
    on_event,
    max_tokens=256,
)
```
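
When the full response is needed after streaming finishes, the callback can accumulate chunks instead of printing them. The sketch below is a hypothetical `TextCollector` class, not part of `car_runtime`. It assumes `infer_stream` accepts any callable that takes one JSON string, and that text events carry `type` and `data` fields as in the example above. Other event types are not documented here, so they are kept aside unmodified.

```python
import json


class TextCollector:
    """Callable that accumulates streamed text events (hypothetical helper)."""

    def __init__(self):
        self.chunks = []
        self.other_events = []

    def __call__(self, event_json: str) -> None:
        event = json.loads(event_json)
        if event.get("type") == "text":
            self.chunks.append(event["data"])
        else:
            # Event types other than "text" are stored as-is for inspection.
            self.other_events.append(event)

    @property
    def text(self) -> str:
        return "".join(self.chunks)


collector = TextCollector()
# With a running daemon:
#   rt.infer_stream("Explain CAR in one sentence.", collector, max_tokens=256)
#   print(collector.text)
```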

## Multi-agent coordination

```python
import json
from car_runtime import register_agent_runner, run_swarm

def agent_fn(spec_json: str, task: str) -> str:
    spec = json.loads(spec_json)
    # Call your LLM of choice, returning an AgentOutput JSON.
    return json.dumps({"name": spec["name"], "response": "...", "tool_calls": []})

# Option A: register once, then call run_* without passing agent_fn each time.
register_agent_runner(agent_fn)
result = run_swarm(
    "parallel",
    json.dumps([
        {"name": "researcher", "role": "gather facts", "model": "gpt-5"},
        {"name": "writer",     "role": "compose summary", "model": "claude-opus-4-7"},
    ]),
    "summarize the CAR paper",
)

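# Option B: pass agent_fn per call.
agents_json = json.dumps([
    {"name": "researcher", "role": "gather facts", "model": "gpt-5"},
])
task = "summarize the CAR paper"
result = run_swarm("parallel", agents_json, task, agent_fn=agent_fn)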
```
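
The `agent_fn` contract shown above (spec JSON and task in, AgentOutput JSON out) can be separated from the model call itself. The sketch below wraps any `(role, task) -> text` callable into that signature. `make_agent_fn` and `echo_model` are hypothetical names, and the output carries only the three fields used in the example above (`name`, `response`, `tool_calls`). Any further AgentOutput fields are not covered here.

```python
import json
from typing import Callable


def make_agent_fn(complete: Callable[[str, str], str]):
    """Wrap a (role, task) -> text callable into the agent_fn signature."""

    def agent_fn(spec_json: str, task: str) -> str:
        spec = json.loads(spec_json)
        response = complete(spec.get("role", ""), task)
        return json.dumps(
            {"name": spec["name"], "response": response, "tool_calls": []}
        )

    return agent_fn


def echo_model(role: str, task: str) -> str:
    """Stand-in for a real LLM call, for local testing only."""
    return f"[{role}] {task}"


agent_fn = make_agent_fn(echo_model)
sample = agent_fn(json.dumps({"name": "writer", "role": "compose summary"}), "hi")
print(sample)
```

Keeping the model call behind a plain callable lets the agent function be exercised without a daemon or a model provider.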

## API surface

The runtime (`CarRuntime`) exposes:

- **State:** `state_set`, `state_get`, `state_exists`, `state_snapshot`, `state_keys`
- **Memory:** `add_fact`, `query_facts`, `fact_count`, `build_context`,
  `build_context_fast`, `persist_memory`, `load_memory`, `consolidate`
- **Skills:** `ingest_skill`, `find_skill`, `report_outcome`, `distill_skills`,
  `ingest_distilled_skills`, `list_skills`, `domains_needing_evolution`,
  `repair_skill`, `evolve_skills`
- **Tools + policies:** `register_tool`, `register_agent_basics`,
  `register_policy`, `set_replan_config`
- **Inference:** `infer`, `infer_tracked`, `infer_with_context`,
  `infer_with_context_tracked`, `embed`, `rerank`, `classify`,
  `prepare_speech_runtime`, `transcribe`, `synthesize`, `infer_stream`
- **Models:** `list_models`, `pull_model`, `remove_model`,
  `list_models_unified`, `register_model`, `route_model`, `model_stats`
- **Execution:** `event_count`, `verify_proposal`, `execute_proposal`

Module-level standalone functions:

- **Verification:** `verify`, `simulate`, `optimize`, `equivalent`
- **Stateless execute:** `execute` (creates a fresh Runtime; for long-lived
  use, prefer `CarRuntime.execute_proposal`)
- **Multi-agent:** `register_agent_runner`, `run_swarm`, `run_pipeline`,
  `run_supervisor`, `run_map_reduce`, `run_vote`
- **Scheduler:** `create_task`, `run_task`, `run_task_loop`, `ensure_dream_task`
- **Planner:** `rank_proposals`

Structured returns are JSON-encoded strings — `json.loads` them on the Python
side. This keeps the FFI surface stable across binding and protocol changes.
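
Since decoding is left to the caller, a thin wrapper can cut down on repeated `json.loads` calls. The sketch below is a hypothetical helper, `call_json`, that calls a named method and decodes a string result. It should only be used with methods whose return value is a JSON-encoded string. A method that returns plain text would raise `json.JSONDecodeError`.

```python
import json


def call_json(obj, method, *args, **kwargs):
    """Call obj.<method>(...) and decode a JSON-encoded string result.

    Hypothetical helper: non-string results are passed through unchanged.
    """
    result = getattr(obj, method)(*args, **kwargs)
    if isinstance(result, (str, bytes)):
        return json.loads(result)
    return result


# With a running daemon and a CarRuntime instance `rt`:
#   check = call_json(rt, "verify_proposal", proposal)
#   if not check["valid"]:
#       raise RuntimeError(f"invalid proposal: {check['issues']}")
```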

## Type stubs

The wheel ships [`car_runtime.pyi`](./car_runtime.pyi) alongside the compiled
extension and a `py.typed` marker (PEP 561). `mypy`, `pyright`, Pylance, and
similar tools pick up the full method signatures, parameter docstrings, and
return-shape descriptions automatically — no extra install needed.

If you're editing the source tree (not the published wheel), the same files
live at `crates/car-ffi-pyo3/car_runtime.pyi` and must be kept in sync with
`src/lib.rs`. See `CLAUDE.md` for the FFI-bindings parity rule.

## Development

```bash
# Install dev deps.
pip install maturin pytest

# Build and install in editable mode.
cd car-rs/crates/car-ffi-pyo3
maturin develop

# Run the smoke tests.
pytest tests/ -v
```

## Architecture

This package is a thin PyO3 client to the singleton `car-server`
daemon over WebSocket — `CarRuntime()` lazy-connects, every method
proxies through the JSON-RPC dispatcher, and the daemon owns
inference / memory / tool dispatch / per-session state.

Before v0.8, in the `RuntimeMode::Embedded` path, the wheel hosted an
in-process `car-engine` + `car-memgine` and ran inference under
`py.allow_threads(...)`. That path was retired to close the
multi-tenant overcommit hazard that CAR-issue #139 was opened for:
two FFI consumers in different processes each spawned their own
admission semaphore and model cache, so concurrent runs could
overwhelm the host. v0.8 takes the harder path: one daemon per
host, and every client attaches to it.
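
The overcommit hazard can be illustrated with a toy model. The sketch below is not CAR code. It uses `threading.Semaphore` objects in a single process as stand-ins for per-process admission semaphores, and `LIMIT` and `CLIENTS` are hypothetical numbers. With one private semaphore per client the host admits `CLIENTS * LIMIT` requests at once, while a single shared semaphore caps admission at `LIMIT`.

```python
import threading


def admitted(semaphores, requests_per_client):
    """Count requests admitted when each client draws from its semaphore."""
    count = 0
    for sem in semaphores:
        for _ in range(requests_per_client):
            # Non-blocking acquire: a request is admitted or turned away.
            if sem.acquire(blocking=False):
                count += 1
    return count


LIMIT = 2    # hypothetical admission limit per semaphore
CLIENTS = 3  # hypothetical number of client processes

# Embedded model: every client owns a private semaphore.
private = [threading.Semaphore(LIMIT) for _ in range(CLIENTS)]

# Daemon model: every client draws from the daemon's single semaphore.
daemon_semaphore = threading.Semaphore(LIMIT)
shared = [daemon_semaphore] * CLIENTS

embedded_admitted = admitted(private, requests_per_client=4)
daemon_admitted = admitted(shared, requests_per_client=4)
print(embedded_admitted, daemon_admitted)  # prints "6 2"
```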

See the repo [README](https://github.com/Parslee-ai/car) for the
broader CAR architecture and `docs/proposals/daemon-as-default-runtime.md`
for the v0.8 rationale.

## License

Free for any use including commercial; free to redistribute unmodified.
Modification, reverse engineering, and derivative works are not permitted.
See [`LICENSE`](./LICENSE) for the full text. Copyright © 2026 Parslee AI.

