Metadata-Version: 2.4
Name: cybervisor
Version: 0.12.0
Summary: Autonomous CLI supervisor for staged AI workflows
Author: crzidea
Project-URL: Homepage, https://github.com/crzidea/cybervisor
Project-URL: Repository, https://github.com/crzidea/cybervisor
Project-URL: Issues, https://github.com/crzidea/cybervisor/issues
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Typing :: Typed
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: agent-client-protocol>=0.9.0
Requires-Dist: httpx>=0.27.0
Requires-Dist: psutil>=5.9.0
Requires-Dist: pyyaml>=6.0.3
Requires-Dist: setproctitle>=1.3.4; platform_system != "Windows"
Requires-Dist: websockets>=11.0

# cybervisor

`cybervisor` is an autonomous CLI supervisor for development runs. It executes a customizable multi-stage pipeline with Gemini CLI, Claude Code, or a mock agent, installs runtime hooks for non-interactive execution, enforces structured stage-result contracts, and keeps audit logs in JSONL.

`cybervisor` works best when it sits on top of a `speckit` repository. `speckit` gives the project durable product and planning memory under `.specify/`, and `cybervisor` turns that context into an autonomous execution loop with review, correction, and verification stages.

## What it does

- Runs a multi-stage pipeline defined in `cybervisor.yaml`
- Ships with a default pipeline of 5 to 10 stages, depending on the scaffold used
- Supports structured stage-result contracts and artifact-driven routing
- Fails fast when the selected agent CLI or hook verifier credentials are missing
- Writes non-secret hook runtime metadata under `.cybervisor/hooks/` for non-mock runs
- Keeps verifier credentials in `~/.cybervisor/config.yaml`
- Snapshots `.gemini/settings.json` or `.claude/settings.json` and restores them on exit
- Streams live agent output to stderr and persists per-stage logs under `.cybervisor/logs/stages/`
- Enforces single-instance execution: when the daemon is reachable, `run` checks for active daemon tasks before proceeding; when the daemon is unreachable, it falls back to `.cybervisor/instance.lock`; exits with `1` if another instance is already running in the same directory (see [Runtime Behavior](docs/runtime-behavior.md))
- Exits with `130` on `SIGINT` or `SIGTERM` after cleanup
- **Daemon mode (`cybervisor serve`)**: Long-running WebSocket server for headless pipeline execution and remote monitoring; supports task cancel, dynamic stop-stage updates, client reconnect with event replay, and background daemonization (see [WebSocket Protocol](docs/websocket-protocol.md))
- **Daemon client commands**: Six subcommands (`status`, `submit`, `attach`, `cancel`, `logs`, `stop-stage`) interact with a running daemon over WebSocket; `status` reports running task IDs and stages from the daemon's active registry; all support `--host` and `--port` overrides and exit with meaningful codes
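
The audit trail mentioned above is plain JSON Lines, so it can be inspected with standard tooling. A minimal sketch, assuming hypothetical `stage` and `status` field names (the real schema lives in the files under `.cybervisor/logs/`):

```python
import json

def summarize_stage_log(jsonl_text: str) -> dict[str, str]:
    """Map each stage name to its most recent status from a JSONL log.

    The "stage"/"status" field names are assumptions for illustration;
    inspect the actual log files for the real schema.
    """
    latest: dict[str, str] = {}
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if "stage" in event and "status" in event:
            latest[event["stage"]] = event["status"]
    return latest

sample = "\n".join([
    '{"stage": "Spec", "status": "started"}',
    '{"stage": "Spec", "status": "completed"}',
    '{"stage": "Implement", "status": "started"}',
])
print(summarize_stage_log(sample))
# {'Spec': 'completed', 'Implement': 'started'}
```

Because each line is an independent JSON object, the same approach works with `jq` or any streaming tool.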

## Requirements

- Python 3.11+
- [`uv`](https://docs.astral.sh/uv/)
- One of:
  - `gemini` on `PATH`
  - `claude` on `PATH`
  - `mock` mode for local deterministic runs (no API key needed)
- `~/.cybervisor/config.yaml` with verifier settings for non-mock runs

## Installation

Install the CLI onto your `PATH`:

```bash
uv tool install cybervisor
```

After installation, verify:

```bash
cybervisor --version
```

To update an existing installation later:

```bash
uv tool upgrade cybervisor
cybervisor --version
```

For the full update guide, run:

```bash
cybervisor docs updating
```

## Quick Start

Initialize the `cybervisor` scaffold in your project:

```bash
cybervisor init
```

`cybervisor init` detects your environment:
- If `.specify/` exists, it installs the **speckit** scaffold (integrated with `speckit` workflows).
- If `.specify/` is missing, it installs the **simple** scaffold (standalone artifacts in `.cybervisor/artifacts/`).

Both scaffolds create a `cybervisor.yaml` file containing the full pipeline configuration, including prompt templates and stage contracts.

Set your global default agent:

```bash
cybervisor use claude
```

For fast, API-key-free pipeline runs (CI, testing), set `agent_tool: mock` in your config instead:

```yaml
agent_tool: mock
```

When `agent_tool: mock` is set, the pipeline uses the built-in `MockAdapter` which completes each stage deterministically and writes contract artifacts driven by the loaded `PipelineConfig`. The hook verifier is still called via an LLM endpoint, so the `llm` section is still required (point it at any reachable URL with any key).
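
Under those constraints, a minimal mock-run config might look like the following sketch (the values are placeholders; only the `agent_tool`, `llm.api_key`, and `llm.base_url` keys shown elsewhere in this README are assumed):

```yaml
# ~/.cybervisor/config.yaml for a mock run
agent_tool: mock
llm:
  api_key: dummy-key                   # any value works for mock runs
  base_url: http://127.0.0.1:8080/v1   # any reachable URL
```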

Configure your verifier settings in `~/.cybervisor/config.yaml`:

```yaml
agent_tool: claude
llm:
  api_key: your-api-key
  # Optional overrides
  # base_url: https://api.openai.com/v1
  # model: gpt-4o

# Per-stage agent tool model overrides (top-level, not under llm)
# stage_models:
#   Spec: claude-sonnet-4-6
#   "Review Code": claude-opus-4-6
```

Run the supervisor:

```bash
cybervisor "Create a 360 feedback system"
printf "Create a 360 feedback system" | cybervisor run
```

## Usage

```bash
# Run with a prompt
cybervisor "Your task description"
cybervisor run "Your task description"
printf "Your task description" | cybervisor run

# Specify a custom config
cybervisor run "Your task" --config custom.yaml

# Control execution flow
cybervisor run "Your task" --start-stage "Implement"
cybervisor run "Your task" --end-stage "Review Code"
cybervisor run "Your task" --end-before "Verify"

# Set default agent
cybervisor use gemini

# Validate your configuration
cybervisor validate
cybervisor validate --show-guidance
```

For advanced stage configuration including cleanup paths, max iterations, per-stage model overrides, and contract authoring, see the [Pipeline Authoring Guide](docs/cybervisor-pipeline-authoring-guide.md) and [Configuration Reference](docs/configuration.md).

### Daemon Mode

`cybervisor serve` starts a long-running WebSocket daemon. Once running, use the client subcommands to submit tasks, monitor progress, and manage the pipeline remotely.

```bash
# Start the daemon server (WebSocket on ws://127.0.0.1:8765)
cybervisor serve
cybervisor serve --host 0.0.0.0 --port 9000
cybervisor serve --background   # Run in background via double-fork

# Check daemon connectivity and active tasks (exits 0 when reachable, 1 when not)
cybervisor status
cybervisor status --host 127.0.0.1 --port 8765
# Example output when a task is running:
#   Running task: abc123def456 (stage: Spec, cwd: /workspace/project, bounds: end_stage=Verify)
#   Daemon reachable at ws://127.0.0.1:8765
# Example output when no task is running:
#   No active tasks.
#   Daemon reachable at ws://127.0.0.1:8765
# Example output when daemon is down:
#   Daemon not reachable at ws://127.0.0.1:8765

# Check status of a specific task by ID (matches across all directories)
cybervisor status abc123def456

# Submit a task and stream events until completion
cybervisor submit "Your task description" --config cybervisor.yaml --start-stage Implement
printf "Your task description" | cybervisor submit          # read prompt from stdin
cat task_prompt.txt | cybervisor submit                     # multi-line prompts preserved
cybervisor submit "Your task" --task-id my-task-123   # explicit task ID
# On submit, the task ID is printed to stderr (e.g. "Task created: abc123def456")
# Use this ID with attach, cancel, logs, or stop-stage
# Note: submitting a new task in a directory that already has a running task is rejected
# with the error "A task is already running in this directory".

# Reconnect to a running or completed task (auto-detects task in current directory)
cybervisor attach

# Reconnect to a specific task by ID to replay buffered events
cybervisor attach my-task-123
# Note: event history exceeding 64 KB is automatically split into chunks on the server
# and reassembled by the client — this is transparent to the user

# Cancel an active task (auto-detects task in current directory; errors if zero tasks)
cybervisor cancel

# Cancel a specific task by ID (works from any directory)
cybervisor cancel my-task-123

# Dump all buffered events as JSON Lines (non-blocking)
cybervisor logs my-task-123

# Update the stop stage of a running task
cybervisor stop-stage --stage Verify                   # auto-detect task in current directory
cybervisor stop-stage my-task-123 --stage Verify       # explicit task ID

# Override daemon address for any client command
cybervisor submit "task" --host 0.0.0.0 --port 9000
```
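
The 64 KB history chunking noted above is transparent to users, but the reassembly idea is easy to sketch. The message shape below (`index`, `total`, `data` fields) is an assumption for illustration, not the real wire format; see [WebSocket Protocol](docs/websocket-protocol.md) for the actual protocol.

```python
import json

def reassemble_history(chunks: list[str]) -> list[dict]:
    """Reassemble chunked event history into a list of events.

    The {"index", "total", "data"} shape is hypothetical; the real
    wire format is defined in docs/websocket-protocol.md.
    """
    parsed = [json.loads(c) for c in chunks]
    parsed.sort(key=lambda m: m["index"])
    assert parsed[0]["total"] == len(parsed), "missing chunks"
    payload = "".join(m["data"] for m in parsed)
    return [json.loads(line) for line in payload.splitlines() if line]

# Chunks may arrive in any order; sorting by index restores the stream.
chunks = [
    json.dumps({"index": 1, "total": 2, "data": '{"event": "stage_end"}\n'}),
    json.dumps({"index": 0, "total": 2, "data": '{"event": "stage_start"}\n'}),
]
print(reassemble_history(chunks))
```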

**Exit codes for client commands:**
- `0` — success
- `1` — failure (daemon unreachable, task not found, invalid state, etc.)
- `130` — interrupted (SIGINT during `submit` or `attach`)
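
These codes make the client commands easy to script. A sketch of how a CI wrapper might branch on them (the helper names here are hypothetical, not part of `cybervisor`):

```python
import subprocess

def classify_exit(code: int) -> str:
    """Label a cybervisor client exit code per the table above."""
    if code == 0:
        return "success"
    if code == 130:
        return "interrupted"
    return "failure"

def run_and_classify(cmd: list[str]) -> str:
    """Run a client command and classify its outcome (illustrative only)."""
    return classify_exit(subprocess.run(cmd).returncode)

# e.g. run_and_classify(["cybervisor", "submit", "Your task"])
```
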

Treat `cybervisor validate` as the local readiness gate before merge or execution. A passing result means the config is not only parseable but also satisfies the stricter contract-authoring checks for route safety, complete routed examples, and authored prompt/guidance synchronization.

User-facing workflow or specification changes should be documented in tracked files under `docs/` and, when relevant, this README. Do not leave those changes only in local working directories such as `specs/` or `.cybervisor/artifacts/`, because they are not part of the committed project history.

## Recommended with speckit

The strongest setup is pairing `cybervisor` with `speckit`. `speckit` manages the long-lived product memory (specs, plans, tasks) in `.specify/`, while `cybervisor` provides the autonomous execution engine to drive those workflows.

## Development

If you are contributing to `cybervisor`:

```bash
uv sync
uv run mypy --strict src
uv run pytest
```

For self-hosted E2E or verify-stage smoke tests, do not run them from the repository root when the goal is to simulate a generated project. Create an isolated demo workspace first, typically with:

```bash
./scripts/e2e-demo-simple-project.sh
```

For a fast smoke test that exercises the full pipeline through `Verify` using a minimal feature prompt and mock LLM API:

```bash
./scripts/e2e-verify-smoke.sh
./scripts/e2e-verify-smoke.sh --agent claude   # use Claude Code adapter instead of mock
```

Both modes route all LLM calls (hook verifier and stage-agent) through the bundled mock API server, so no real API keys are needed.

Release helper:

```bash
./scripts/publish.sh patch  # or minor, major
```

The script requires a clean git working tree, bumps the package version, refreshes `uv.lock`, builds and publishes the package, then creates a release commit and annotated git tag like `v0.7.1`.

## Repository Layout

```text
src/cybervisor/        Core CLI package (split into focused subpackages)
  cli/                 CLI entry point (commands, parser, instance, docs)
  client/              Daemon WebSocket client (commands, connection, rendering)
  pipeline/            Pipeline execution (runner, artifacts, contract)
  server/              Daemon WebSocket server (daemon, handlers, tasks, cleanup)
  core_hooks/          Hook runtime (contracts, streaming, verifier, runner)
  adapters/            Agent adapter registry and tool-specific adapters
  config/              YAML config loading
  cli.py, client.py,   Thin backward-compatible re-exports
  pipeline.py, server.py
  hooks.py             Hook installer and runtime config
  agent_hook.py        Packaged cybervisor-agent-hook entry point
  preflight.py         Dependency pre-check
  signals.py           Signal handler
  logging.py           Structured logging
assets/hooks/          Hook prompt assets and fixtures
scripts/               Demo and utility scripts
templates/demo/        Demo project scaffold
tests/                 Unit and integration coverage
.specify/              Constitution and repo-specific scripts
AGENTS.md              Symlink to constitution
GEMINI.md              Symlink to AGENTS.md
CLAUDE.md              Symlink to AGENTS.md
.cybervisor/           Runtime state (instance.lock, daemon.lock, hooks/, logs/)
```
