Metadata-Version: 2.4
Name: coda-node
Version: 0.1.0
Summary: Minimal server runtime for connecting execution backends to Coda.
Author: Conductor Quantum
License: MIT License
        
        Copyright (c) 2026 Conductor Quantum
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE
Requires-Python: >=3.11
Requires-Dist: cryptography>=41.0.0
Requires-Dist: fastapi>=0.115.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: pydantic-settings>=2.0.0
Requires-Dist: pydantic>=2.11.0
Requires-Dist: pyjwt>=2.0.0
Requires-Dist: redis>=5.0.0
Requires-Dist: uvicorn>=0.30.0
Description-Content-Type: text/markdown

# coda-node

Production-ready runtime for connecting an execution backend to the Coda
cloud platform.

It boots a FastAPI service, provisions or reconnects node credentials, manages
VPN health, consumes Redis jobs, and posts signed execution results back to
Coda.

## What It Does

- Provisions a node from a one-time node token
- Reconnects on restart with persisted JWT credentials
- Verifies and continuously monitors VPN connectivity
- Sends periodic heartbeats to keep QPU status "online"
- Consumes jobs from Redis Streams with crash recovery
- Optionally batches Redis jobs when the executor implements `batch_run`
- Sends JWT-signed webhook results to Coda with retry
- Drains in-flight work on graceful shutdown
- Supports pluggable execution backends via factory convention
- Auto-discovers executor factories from installed packages

## Install

```bash
uv sync --dev
```

Requires Python 3.11+.  Two equivalent CLI entry points are installed:

- `coda`
- `coda-node`

## Quick Start

Provision with a node token:

```bash
uv run coda-node start --token <node-token>
```

Or set the token as an environment variable:

```bash
export CODA_NODE_TOKEN=<node-token>
uv run coda start
```

After a successful first run, credentials are persisted to disk and
subsequent restarts reconnect automatically without a fresh token.

## How It Works

On startup the runtime:

1. Loads configuration from `CODA_`-prefixed environment variables, then
   persisted state on disk, then hardcoded defaults.
2. Connects to Coda using either a node token or persisted JWT
   credentials (with exponential-backoff retry on transient failures).
3. Brings up or validates VPN connectivity when required.
4. Starts the FastAPI service, a background Redis Streams consumer, and
   a heartbeat loop that periodically POSTs node status to the cloud.
5. Dispatches jobs to the configured executor, optionally in batches,
   and posts signed results back via webhook.
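
The retry behaviour in step 2 can be pictured with the sketch below. It is an illustration only, not the runtime's code: `connect_with_backoff`, the delay values, and the use of `ConnectionError` as the "transient failure" signal are all assumptions. Only the default of three attempts comes from `CODA_NODE_CONNECT_RETRIES`.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def connect_with_backoff(
    connect: Callable[[], Awaitable[T]],
    retries: int = 3,         # mirrors the CODA_NODE_CONNECT_RETRIES default
    base_delay: float = 1.0,  # assumed; the real delays are not documented
    max_delay: float = 30.0,  # assumed cap on a single wait
) -> T:
    """Call `connect` up to `retries` times, doubling the wait each time."""
    for attempt in range(1, retries + 1):
        try:
            return await connect()
        except ConnectionError:  # stand-in for a transient failure
            if attempt == retries:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await asyncio.sleep(delay)
    raise ValueError("retries must be at least 1")
```

With the assumed defaults, the waits between the three attempts are 1 s and 2 s.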

On shutdown the runtime waits up to `CODA_SHUTDOWN_DRAIN_TIMEOUT_SEC`
seconds for the in-flight job to finish, then cancels background tasks,
closes connections, and stops the managed VPN daemon.
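
A bounded drain of this kind can be sketched with `asyncio.wait`. The helper name `drain` and the choice to cancel the job once the timeout expires are assumptions made for illustration; the runtime's actual shutdown code may differ.

```python
import asyncio
from typing import Optional


async def drain(job_task: Optional[asyncio.Task], timeout: float = 30.0) -> bool:
    """Wait for the in-flight job; return True if it finished in time."""
    if job_task is None or job_task.done():
        return True  # nothing to wait for
    # asyncio.wait does not cancel the task when the timeout expires.
    done, _pending = await asyncio.wait({job_task}, timeout=timeout)
    if done:
        return True
    # Timed out: cancel the job and wait for the cancellation to land.
    job_task.cancel()
    await asyncio.gather(job_task, return_exceptions=True)
    return False
```

The default of 30 seconds matches the documented default of `CODA_SHUTDOWN_DRAIN_TIMEOUT_SEC`.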

## Endpoints

| Endpoint | Purpose |
|---|---|
| `GET /health` | Liveness probe.  Returns `200` if the process is running. |
| `GET /ready` | Readiness probe.  Returns `200` with component status when VPN and Redis are healthy; `503` when either is degraded or the check times out. |

The `/ready` response body always includes `vpn_state`, `redis_healthy`,
and `current_job` fields for observability.
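
For scripted checks, a small standard-library probe might look like the sketch below. It assumes the `/ready` body is JSON; `summarize_ready` and `probe_ready` are hypothetical helper names and are not part of `coda-node`.

```python
import json
import urllib.error
import urllib.request


def summarize_ready(status_code: int, body: dict) -> str:
    """Render one log line from a /ready response."""
    verdict = "ready" if status_code == 200 else "not ready"
    return (
        f"{verdict} (HTTP {status_code}): "
        f"vpn_state={body.get('vpn_state')!r}, "
        f"redis_healthy={body.get('redis_healthy')!r}, "
        f"current_job={body.get('current_job')!r}"
    )


def probe_ready(base_url: str, timeout: float = 5.0) -> str:
    """GET {base_url}/ready and summarize the result."""
    try:
        with urllib.request.urlopen(f"{base_url}/ready", timeout=timeout) as resp:
            return summarize_ready(resp.status, json.load(resp))
    except urllib.error.HTTPError as err:
        # urllib raises on a 503, but the error object still exposes the body.
        return summarize_ready(err.code, json.load(err))
```

Calling `probe_ready("http://localhost:8080")` targets the default `CODA_PORT` on the local machine.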

## Configuration

All settings are driven by `CODA_`-prefixed environment variables.  When
no node token is provided, the runtime automatically loads
previously persisted config from disk.
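
The layering (environment first, then persisted state, then defaults) can be illustrated with `collections.ChainMap`. This is a sketch of the lookup order only: the runtime uses Pydantic Settings, and the lower-cased key names and the `layered_settings` helper are assumptions.

```python
from collections import ChainMap
from typing import Mapping


def layered_settings(
    env: Mapping[str, str],
    persisted: Mapping[str, str],
    defaults: Mapping[str, str],
) -> ChainMap:
    """Look keys up in env first, then persisted state, then defaults."""
    # Keep only CODA_-prefixed variables and normalise CODA_PORT -> port.
    env_layer = {
        key.removeprefix("CODA_").lower(): value
        for key, value in env.items()
        if key.startswith("CODA_")
    }
    return ChainMap(env_layer, dict(persisted), dict(defaults))
```

A value set in the environment shadows the same key in the persisted state, which in turn shadows the default.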

### Core

| Variable | Default | Description |
|---|---|---|
| `CODA_NODE_TOKEN` | `""` | One-time node token for first-run provisioning. |
| `CODA_JWT_PRIVATE_KEY` | `""` | PEM-encoded RSA private key for direct JWT startup. |
| `CODA_JWT_KEY_ID` | `""` | `kid` header value for signed JWTs. |
| `CODA_REDIS_URL` | `""` | Redis connection string (`redis://…`). |
| `CODA_WEBAPP_URL` | `https://coda.conductorquantum.com` | Coda cloud base URL. Overridden by the node bundle on first connect. |
| `CODA_HOST` | `0.0.0.0` | Bind address for the FastAPI server. |
| `CODA_PORT` | `8080` | Bind port for the FastAPI server. |
| `CODA_EXECUTOR_FACTORY` | `""` | Import path for a custom executor (see below). Highest priority when set. |
| `CODA_DEVICE_CONFIG` | `""` | Path to a YAML device config read by the executor factory. Defaults to `./site/device.yaml` if that file exists. The runtime also looks for an optional top-level `executor_factory` key in this file when `CODA_EXECUTOR_FACTORY` is unset. |

Provide either `CODA_NODE_TOKEN` for auto-provisioning, or both
`CODA_JWT_PRIVATE_KEY` and `CODA_JWT_KEY_ID` for direct JWT startup.
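
The either/or rule can be expressed as a small check. The sketch below is hypothetical: `startup_mode`, its return labels, and the precedence given to the token when both forms are present are assumptions, not documented behaviour.

```python
from typing import Mapping


def startup_mode(env: Mapping[str, str]) -> str:
    """Classify which credential path the given environment selects."""
    token = env.get("CODA_NODE_TOKEN", "")
    private_key = env.get("CODA_JWT_PRIVATE_KEY", "")
    key_id = env.get("CODA_JWT_KEY_ID", "")
    if token:
        return "provision"  # assumed to win if both forms are supplied
    if private_key and key_id:
        return "direct-jwt"
    if private_key or key_id:
        # Only one half of the pair is set: report it rather than guess.
        raise ValueError(
            "CODA_JWT_PRIVATE_KEY and CODA_JWT_KEY_ID must be set together"
        )
    return "persisted"  # no credentials in env: rely on state on disk
```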

### VPN

| Variable | Default | Description |
|---|---|---|
| `CODA_VPN_REQUIRED` | `true` | Fail preflight if no VPN tunnel is detected. |
| `CODA_VPN_CHECK_INTERVAL_SEC` | `10` | Seconds between background VPN health checks. |
| `CODA_VPN_INTERFACE_HINT` | `null` | Specific TUN/TAP interface name to look for. |
| `CODA_ALLOW_DEGRADED_STARTUP` | `false` | Allow the server to start even if VPN preflight fails. |

### Resilience

| Variable | Default | Description |
|---|---|---|
| `CODA_NODE_CONNECT_RETRIES` | `3` | Max attempts when connecting to the Coda cloud. |
| `CODA_SHUTDOWN_DRAIN_TIMEOUT_SEC` | `30` | Seconds to wait for an in-flight job before forced shutdown. |
| `CODA_NODE_TIMEOUT_SEC` | `15` | HTTP timeout for node connect requests. |

## Persisted State

After successful node provisioning, the runtime writes:

| File | Contents |
|---|---|
| `/tmp/coda.config` | JSON with QPU identity, Redis URL, API paths, and VPN settings. |
| `/tmp/coda-private-key` | PEM-encoded RSA private key. |

Both files use `0600` permissions on POSIX systems and are validated on
read.  They enable token-free reconnects across restarts, preserving JWT
credentials, machine fingerprint, VPN profile path, and connection
settings.
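
On POSIX, owner-only files of this kind are usually created as in the sketch below. It shows the general technique rather than the runtime's own code; `write_private` and `is_owner_only` are hypothetical helpers.

```python
import os
import stat
from pathlib import Path


def write_private(path: Path, data: bytes) -> None:
    """Create or overwrite `path` so that only the owner can read or write it."""
    # Passing the mode to os.open avoids a window where the file is world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    # The mode argument is ignored for a pre-existing file, so enforce it again.
    os.chmod(path, 0o600)


def is_owner_only(path: Path) -> bool:
    """Return True when group and other have no permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0
```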

To wipe persisted state, run `coda reset`.

## CLI

```
coda start [--token TOKEN] [--host HOST] [--port PORT] [--daemon]
```

Start the node server.  Pass `--token` on first run for node
provisioning.  Use `--daemon` (or `-d`) to run as a background process.

```
coda stop
```

Stop the background daemon process.

```
coda status
```

Show daemon status (running/stopped, PID, log file, VPN interface).

```
coda logs [-n LINES]
```

Show recent daemon log output (default: last 50 lines).

```
coda doctor
```

Print a diagnostic summary (endpoints, executor, VPN interface, OpenVPN
status).

```
coda stop-vpn
```

Stop the managed OpenVPN daemon without clearing credentials.

```
coda reset
```

Stop the daemon and VPN, then remove all persisted runtime files.

## Executor Resolution

`load_executor()` resolves the execution backend in priority order:

1. **`CODA_EXECUTOR_FACTORY`** (explicit) -- set to a `module:attribute`
   import path to force a specific executor factory.
2. **`executor_factory` in `CODA_DEVICE_CONFIG`** -- optional top-level YAML
   key used when the env var is unset.
3. **Convention-based auto-discovery** -- scan installed packages for
   `<pkg>.executor_factory:create_executor`.  If exactly one match is
   found, use it.  If several match, log a warning and fall back to
   step 4.
4. **`NoopExecutor`** fallback -- returns deterministic all-zeros results,
   allowing the service to boot without hardware integration.
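
The priority order and the `module:attribute` lookup can be sketched as follows. Both functions are hypothetical illustrations, not the body of `load_executor()`; the list of discovered paths is taken as an input because the scanning mechanism is not described here.

```python
import importlib
from typing import Any, Optional, Sequence


def choose_factory_path(
    env_value: str,
    device_config_value: Optional[str],
    discovered: Sequence[str],
) -> Optional[str]:
    """Return the import path to load, or None to use the no-op fallback."""
    if env_value:              # 1. explicit CODA_EXECUTOR_FACTORY wins
        return env_value
    if device_config_value:    # 2. executor_factory key from the device YAML
        return device_config_value
    if len(discovered) == 1:   # 3. exactly one auto-discovered factory
        return discovered[0]
    return None                # 4. nothing found, or the match is ambiguous


def resolve_import_path(path: str) -> Any:
    """Import `module:attribute` and return the attribute."""
    module_name, sep, attribute = path.partition(":")
    if not sep or not module_name or not attribute:
        raise ValueError(f"expected 'module:attribute', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)
```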

### Factory Convention

Backend packages should expose a factory at
`<package>.executor_factory:create_executor`.  The factory must be
one of:

- A callable that accepts a `Settings` object and returns an object
  with an async `run(ir, shots)` method.
- A callable that accepts no parameters and returns such an object.
- A pre-built object that already has a `run` method.
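
One way to accept all three shapes is to inspect the factory before calling it. The sketch below is an assumption about how such dispatch could work; `build_executor` is a hypothetical name, and treating any declared parameter as "wants `Settings`" is a simplification.

```python
import inspect
from typing import Any


def build_executor(factory: Any, settings: Any) -> Any:
    """Turn a factory of any supported shape into an executor instance."""
    if not callable(factory):
        # Shape 3: a pre-built object is used as-is.
        executor = factory
    elif inspect.signature(factory).parameters:
        # Shape 1: the factory declares a parameter, so pass the settings.
        executor = factory(settings)
    else:
        # Shape 2: a zero-argument factory.
        executor = factory()
    if not hasattr(executor, "run"):
        raise TypeError(f"{type(executor).__name__} has no run() method")
    return executor
```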

### Example: Custom Executor

```python
from coda_node.server.executor import ExecutionResult
from coda_node.server.ir import NativeGateIR


class MyExecutor:
    async def run(self, ir: NativeGateIR, shots: int) -> ExecutionResult:
        # run_on_hardware is a placeholder for your own hardware call.
        counts = run_on_hardware(ir, shots)
        return ExecutionResult(
            counts=counts,
            execution_time_ms=42.0,
            shots_completed=shots,
        )


def create_executor() -> MyExecutor:
    return MyExecutor()
```

Place this in `my_package/executor_factory.py` and install the package.
The runtime will discover it automatically.  To select it explicitly
instead, set:

```bash
export CODA_EXECUTOR_FACTORY="my_package.executor_factory:create_executor"
```

## Error Handling

All domain exceptions inherit from `CodaError`, making it easy to
distinguish expected operational errors from unexpected bugs:

| Exception | When |
|---|---|
| `ConfigError` | Invalid or missing configuration. |
| `AuthError` | JWT signing or verification failure. |
| `VPNError` | VPN tunnel or health check failure. |
| `NodeError` | Node provisioning or reconnect failure. |
| `ExecutorError` | Executor loading or job execution failure. |
| `WebhookError` | Webhook delivery failure. |

Import `CodaError` from the top-level package and the specific
exceptions from `coda_node.errors`:

```python
from coda_node import CodaError
from coda_node.errors import ConfigError, VPNError
```
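
A shared base class lets a caller handle operational errors in one place while unexpected exceptions still propagate. So that the sketch runs without `coda-node` installed, it uses local stand-in classes that mirror the names in the table above; `run_guarded` is a hypothetical helper.

```python
from typing import Callable


class CodaError(Exception):
    """Local stand-in for coda_node.CodaError."""


class ConfigError(CodaError):
    """Local stand-in for coda_node.errors.ConfigError."""


class VPNError(CodaError):
    """Local stand-in for coda_node.errors.VPNError."""


def run_guarded(step: Callable[[], None]) -> int:
    """Run `step`; return 1 on an operational error, 0 on success."""
    try:
        step()
    except CodaError as exc:
        # Expected failure mode: report it and signal a non-zero exit code.
        print(f"operational error ({type(exc).__name__}): {exc}")
        return 1
    # Anything that is not a CodaError is left to propagate as a bug.
    return 0
```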

## Development

Install dependencies (including dev tools):

```bash
uv sync --dev
```

Run the full quality check suite:

```bash
uv run ruff check .
uv run ruff format --check .
uv run mypy src/coda_node
uv build
uv run pytest --cov --cov-report=term-missing
```

Run the opt-in live smoke tests against
`https://coda.conductorquantum.com`:

```bash
export CODA_RUN_E2E=1
export CODA_E2E_NODE_TOKEN=<node-token>
uv run pytest -m e2e
```

To include the live VPN tunnel check as well:

```bash
export CODA_RUN_VPN_E2E=1
uv run pytest -m e2e
```

Install pre-commit hooks (runs ruff format and lint on every commit):

```bash
uv run pre-commit install
```

Run all hooks manually:

```bash
uv run pre-commit run --all-files
```

## Architecture

- **FastAPI** -- HTTP service, lifespan management, health endpoints
- **Pydantic Settings** -- environment-driven configuration with layered
  defaults and persisted state
- **Redis Streams** -- job delivery with consumer groups and crash recovery
- **httpx** -- async HTTP for webhooks and node API calls
- **RS256 JWT** -- authentication between the node and the Coda cloud
- **OpenVPN** -- managed as a subprocess when VPN connectivity is required
