scitex_agent_container API Reference

Top-level package surface re-exported from scitex_agent_container. Use scitex-agent-container list-python-apis for the authoritative runtime enumeration.

SciTeX Agent Container – Declarative agent management.

Provides a YAML-based framework for defining, managing, and orchestrating AI coding agent instances across container runtimes.

Modules:
  • config: YAML config loading and validation

  • lifecycle: Agent start/stop/restart/status

  • registry: File-based agent tracking

  • health: Health check implementation

  • runtimes: Container runtime adapters (docker, apptainer, screen)

class scitex_agent_container.AgentConfig(name, runtime='claude-code', model='sonnet', workdir='~/proj', python_venv='', env=<factory>, env_files=<factory>, screen_name='', labels=<factory>, container=<factory>, claude=<factory>, health=<factory>, watchdog=<factory>, restart=<factory>, hooks=<factory>, listen=<factory>, extensions=<factory>, telegram=<factory>, remote=<factory>, skills=<factory>, context_management=<factory>, startup_commands=<factory>, startup=<factory>, mcp_servers=<factory>, multiplexer='tmux', hosts_spec=<factory>, slurm=<factory>, scheduling=<factory>, orochi=<factory>, config_path='')[source]

Bases: object

Parsed agent configuration from a YAML definition file.

name: str
runtime: str = 'claude-code'
model: str = 'sonnet'
workdir: str = '~/proj'
python_venv: str = ''
env: dict[str, str]
env_files: list[str]
screen_name: str = ''
labels: dict[str, str]
container: ContainerSpec
claude: ClaudeSpec
health: HealthSpec
watchdog: WatchdogSpec
restart: RestartSpec
hooks: dict[str, list[str]]
listen: list[ListenPort]
extensions: Dict[str, Any]
telegram: TelegramSpec
remote: RemoteSpec
skills: SkillsSpec
context_management: ContextManagementConfig
startup_commands: list[StartupCommand]
startup: StartupSpec
mcp_servers: dict[str, dict]
multiplexer: str = 'tmux'
hosts_spec: HostsSpec
slurm: SlurmSpec
scheduling: SchedulingSpec
orochi: OrochiSpec
config_path: str = ''
property expanded_workdir: str
__init__(name, runtime='claude-code', model='sonnet', workdir='~/proj', python_venv='', env=<factory>, env_files=<factory>, screen_name='', labels=<factory>, container=<factory>, claude=<factory>, health=<factory>, watchdog=<factory>, restart=<factory>, hooks=<factory>, listen=<factory>, extensions=<factory>, telegram=<factory>, remote=<factory>, skills=<factory>, context_management=<factory>, startup_commands=<factory>, startup=<factory>, mcp_servers=<factory>, multiplexer='tmux', hosts_spec=<factory>, slurm=<factory>, scheduling=<factory>, orochi=<factory>, config_path='')
scitex_agent_container.load_config(path)[source]

Load and validate a YAML config, returning an AgentConfig.

Only scitex-agent-container/v3 is accepted. Older apiVersions (v1, v2) raise loud validation errors — no backward compatibility.

Return type:

AgentConfig

scitex_agent_container.validate_config(path)[source]

Validate a config file and return list of errors (empty = valid).

Return type:

list[str]

scitex_agent_container.agent_start(config_path, registry=None, no_preflight=False, force=False, session_override=None, resume_id_override=None, dry_run=False, foreground=False)[source]

Start an agent from a YAML config file.

Parameters:
  • config_path (str) – Path to the YAML agent definition.

  • registry (Registry | None) – Optional registry instance.

  • no_preflight (bool) – If True, skip SSH preflight checks (useful for slow hosts).

  • force (bool) – If True and the agent is already running, stop it first and then start fresh. Also tolerates stale registry entries and ghost screens (via force-stop).

  • session_override (str | None) – If set, override config.claude.session for this start invocation (one of continue-or-new | continue | new | resume).

  • resume_id_override (str | None) – If set, override config.claude.resume_id. Pass with session_override="resume" to launch claude --resume <id> without editing the YAML.

  • dry_run (bool) – If True, materialize the workspace files but skip the multiplexer / Claude Code launch and registry registration. Hooks (pre_start / post_start) are also skipped.

Return type:

bool

Returns True on success, False on failure.

scitex_agent_container.agent_stop(name, registry=None, force=False)[source]

Stop a running agent by name.

Parameters:
  • name (str) – Agent name.

  • registry (Registry | None) – Optional registry instance.

  • force (bool) – If True, do not fail when the agent is missing from the registry or when hooks/runtime.stop() raise; wipe stale state and return True. Useful for bulk cleanup.

Return type:

bool

scitex_agent_container.agent_restart(name, registry=None)[source]

Restart an agent by name.

Return type:

bool

scitex_agent_container.agent_status(name, registry=None)[source]

Get detailed status for an agent.

Return type:

dict

scitex_agent_container.agent_logs(name, lines=50, registry=None)[source]

Get recent logs from an agent.

Return type:

str

class scitex_agent_container.Registry(registry_dir=None)[source]

Bases: object

File-based registry for tracking running agent instances.

__init__(registry_dir=None)[source]
add(name, config_path, screen_name, pid=None)[source]

Register an agent as running.

Return type:

None

remove(name)[source]

Remove an agent from the registry.

Return type:

None

get(name)[source]

Get registry entry for an agent, or None if not found.

Return type:

dict | None

list_all()[source]

List all registered agents.

Return type:

list[dict]

exists(name)[source]

Check if an agent is registered.

Return type:

bool

cleanup_stale()[source]

Remove entries whose multiplexer sessions no longer exist.

Probes tmux first (tmux has-session), then screen (-ls). An entry is removed only when the session is absent from both multiplexers. This makes cleanup safe on mixed fleets where agents may run under either tmux or GNU screen.

Returns count removed.

Return type:

int
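
The removal rule described for cleanup_stale() can be sketched as below. This is a minimal illustration under stated assumptions, not the package's implementation: the names tmux_has_session, screen_has_session and is_stale are hypothetical, and the way a session name is located in the screen -ls listing is a guess at one workable approach.

```python
import subprocess
from typing import Callable, Sequence


def tmux_has_session(session: str) -> bool:
    """Probe tmux: exit status 0 from `tmux has-session` means the session exists."""
    try:
        result = subprocess.run(
            ["tmux", "has-session", "-t", session],
            capture_output=True,
            check=False,
        )
    except FileNotFoundError:  # tmux is not installed on this host
        return False
    return result.returncode == 0


def screen_has_session(session: str) -> bool:
    """Probe GNU screen by scanning `screen -ls` for an entry named `<pid>.<session>`."""
    try:
        result = subprocess.run(
            ["screen", "-ls"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:  # screen is not installed on this host
        return False
    for line in result.stdout.splitlines():
        fields = line.split()
        # Assumption: listing rows start with "<pid>.<session>".
        if fields and fields[0].endswith(f".{session}"):
            return True
    return False


def is_stale(
    session: str,
    probes: Sequence[Callable[[str], bool]] = (tmux_has_session, screen_has_session),
) -> bool:
    """An entry is stale only when no multiplexer reports the session.

    any() short-circuits, so tmux is probed first and screen only if tmux says no.
    """
    return not any(probe(session) for probe in probes)
```

Passing the probes as arguments keeps the decision logic testable on machines that have neither multiplexer installed.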

Config

YAML config loading and validation for agent definitions.

Public API:

AgentConfig, load_config, validate_config, resolve_config, ContainerSpec, ClaudeSpec, HealthSpec, WatchdogSpec, RestartSpec, TelegramSpec, RemoteSpec, SkillsSpec, StartupCommand

class scitex_agent_container.config.AgentConfig(name, runtime='claude-code', model='sonnet', workdir='~/proj', python_venv='', env=<factory>, env_files=<factory>, screen_name='', labels=<factory>, container=<factory>, claude=<factory>, health=<factory>, watchdog=<factory>, restart=<factory>, hooks=<factory>, listen=<factory>, extensions=<factory>, telegram=<factory>, remote=<factory>, skills=<factory>, context_management=<factory>, startup_commands=<factory>, startup=<factory>, mcp_servers=<factory>, multiplexer='tmux', hosts_spec=<factory>, slurm=<factory>, scheduling=<factory>, orochi=<factory>, config_path='')[source]

Bases: object

Parsed agent configuration from a YAML definition file.

name: str
runtime: str = 'claude-code'
model: str = 'sonnet'
workdir: str = '~/proj'
python_venv: str = ''
env: dict[str, str]
env_files: list[str]
screen_name: str = ''
labels: dict[str, str]
container: ContainerSpec
claude: ClaudeSpec
health: HealthSpec
watchdog: WatchdogSpec
restart: RestartSpec
hooks: dict[str, list[str]]
listen: list[ListenPort]
extensions: Dict[str, Any]
telegram: TelegramSpec
remote: RemoteSpec
skills: SkillsSpec
context_management: ContextManagementConfig
startup_commands: list[StartupCommand]
startup: StartupSpec
mcp_servers: dict[str, dict]
multiplexer: str = 'tmux'
hosts_spec: HostsSpec
slurm: SlurmSpec
scheduling: SchedulingSpec
orochi: OrochiSpec
config_path: str = ''
property expanded_workdir: str
__init__(name, runtime='claude-code', model='sonnet', workdir='~/proj', python_venv='', env=<factory>, env_files=<factory>, screen_name='', labels=<factory>, container=<factory>, claude=<factory>, health=<factory>, watchdog=<factory>, restart=<factory>, hooks=<factory>, listen=<factory>, extensions=<factory>, telegram=<factory>, remote=<factory>, skills=<factory>, context_management=<factory>, startup_commands=<factory>, startup=<factory>, mcp_servers=<factory>, multiplexer='tmux', hosts_spec=<factory>, slurm=<factory>, scheduling=<factory>, orochi=<factory>, config_path='')
class scitex_agent_container.config.ClaudeSpec(channels=<factory>, flags=<factory>, session='continue-or-new', continue_max_age_minutes=None, resume_id='', auto_accept=True)[source]

Bases: object

channels: list[str]
flags: list[str]
session: str = 'continue-or-new'
continue_max_age_minutes: int | None = None
resume_id: str = ''
auto_accept: bool = True
__init__(channels=<factory>, flags=<factory>, session='continue-or-new', continue_max_age_minutes=None, resume_id='', auto_accept=True)
class scitex_agent_container.config.ContainerSpec(runtime='none', image='scitex-agent-container:latest', volumes=<factory>, network='host', mount_host_claude=False)[source]

Bases: object

runtime: str = 'none'
image: str = 'scitex-agent-container:latest'
volumes: list[str]
network: str = 'host'
mount_host_claude: bool = False
__init__(runtime='none', image='scitex-agent-container:latest', volumes=<factory>, network='host', mount_host_claude=False)
class scitex_agent_container.config.ContextManagementConfig(trigger_at_percent=70.0, strategy='noop', warn_before_n_checks=0, check_interval_seconds=300, state_file='~/.scitex/agent-container/state/<agent>.json')[source]

Bases: object

Context-lifecycle policy for an agent.

Defaults mirror strategy="noop" so absence of the context_management block preserves existing behavior (sensor disabled).

trigger_at_percent: float = 70.0
strategy: str = 'noop'
warn_before_n_checks: int = 0
check_interval_seconds: int = 300
state_file: str = '~/.scitex/agent-container/state/<agent>.json'
property enabled: bool
__init__(trigger_at_percent=70.0, strategy='noop', warn_before_n_checks=0, check_interval_seconds=300, state_file='~/.scitex/agent-container/state/<agent>.json')
class scitex_agent_container.config.HealthSpec(enabled=False, interval=30, timeout=5, method='multiplexer-alive')[source]

Bases: object

enabled: bool = False
interval: int = 30
timeout: int = 5
method: str = 'multiplexer-alive'
__init__(enabled=False, interval=30, timeout=5, method='multiplexer-alive')
class scitex_agent_container.config.HookSpec(pre_start=<factory>, post_start=<factory>, pre_stop=<factory>, post_stop=<factory>, on_compact=<factory>, on_restart=<factory>, on_diff=<factory>)[source]

Bases: object

All hook points supported by the container.

Each entry is a list of opaque commands — shell strings or http(s) URLs. The container executes them fire-and-forget; errors are logged but never raised to the caller. Absent keys default to empty lists (feature disabled).

pre_start: list[str]
post_start: list[str]
pre_stop: list[str]
post_stop: list[str]
on_compact: list[str]
on_restart: list[str]
on_diff: list[str]
counts()[source]
Return type:

dict[str, int]

__init__(pre_start=<factory>, post_start=<factory>, pre_stop=<factory>, post_stop=<factory>, on_compact=<factory>, on_restart=<factory>, on_diff=<factory>)
class scitex_agent_container.config.HostsSpec(host='', hosts=<factory>)[source]

Bases: object

Where an agent should run, in either singleton or multi-instance form.

Mutually exclusive — exactly one of host or hosts may be set:

  • host (singular) — exactly one instance runs:
    • empty / absent: local singleton (runs wherever sac is invoked)

    • string: pinned to that host

    • list: priority order; first available host wins (fallback chain)

  • hosts (plural) — multiple instances run, one per host:
    • “all”: one per fleet host (replaces the old per-host mode)

    • list of host names: one per listed host (subset)

Validator (in _validation.py) enforces mutual exclusion + types. Loader composes effective ids: hosts triggers the <name>-<HOST> suffix; host keeps the bare name.

host: str | list[str] = ''
hosts: str | list[str]
__init__(host='', hosts=<factory>)
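
The mutual-exclusion and type rules described for HostsSpec could be checked along these lines. This is a sketch only: check_hosts_fields is a hypothetical name, the real validator lives in _validation.py and may differ, and rejecting any hosts string other than "all" is an assumption drawn from the documented values.

```python
from typing import Any


def _is_str_list(value: Any) -> bool:
    """True for a list whose items are all strings (an empty list qualifies)."""
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


def check_hosts_fields(host: Any = "", hosts: Any = None) -> list[str]:
    """Return human-readable errors for a host/hosts pair (empty list = valid)."""
    errors: list[str] = []
    hosts = [] if hosts is None else hosts

    # Both empty is allowed: that is the local singleton case.
    if host and hosts:
        errors.append("host and hosts are mutually exclusive")

    if not (isinstance(host, str) or _is_str_list(host)):
        errors.append("host must be a string or a list of strings")

    if isinstance(hosts, str):
        # Assumption: "all" is the only meaningful string value for hosts.
        if hosts not in ("", "all"):
            errors.append("hosts must be 'all' or a list of host names")
    elif not _is_str_list(hosts):
        errors.append("hosts must be 'all' or a list of host names")

    return errors
```

For example, check_hosts_fields(host="mba", hosts="all") reports the mutual-exclusion error, while check_hosts_fields(host=["mba", "nas"]) returns an empty list.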
class scitex_agent_container.config.ListenPort(port=0, proto='tcp', path='', name='', owner='')[source]

Bases: object

Declaration of a port/socket an external tool binds on behalf of an agent.

The container NEVER binds these — it just validates the shape and echoes them in status --json so orchestrators can see what sidecars are expected to exist. owner is free-form (e.g. "orochi") to identify the plugin that actually listens.

port: int = 0
proto: str = 'tcp'
path: str = ''
name: str = ''
owner: str = ''
__init__(port=0, proto='tcp', path='', name='', owner='')
class scitex_agent_container.config.OrochiSpec(enabled=False, hosts=<factory>, port=8559, token_env='SCITEX_OROCHI_TOKEN', channels=<factory>, heartbeat_interval=60)[source]

Bases: object

enabled: bool = False
hosts: list[str]
port: int = 8559
token_env: str = 'SCITEX_OROCHI_TOKEN'
channels: list[str]
heartbeat_interval: int = 60
__init__(enabled=False, hosts=<factory>, port=8559, token_env='SCITEX_OROCHI_TOKEN', channels=<factory>, heartbeat_interval=60)
class scitex_agent_container.config.ReadyPattern(regex='')[source]

Bases: object

A single regex the pane content must match for the agent to be ready.

regex: str = ''
__init__(regex='')
class scitex_agent_container.config.RemoteSpec(hops=<factory>, host='', user='', key='', port=22, timeout=60, login_shell=True, no_preflight=False)[source]

Bases: object

hops: list
host: str = ''
user: str = ''
key: str = ''
port: int = 22
timeout: int = 60
login_shell: bool = True
no_preflight: bool = False
property is_remote: bool

Return True if this agent should be deployed via SSH.

__init__(hops=<factory>, host='', user='', key='', port=22, timeout=60, login_shell=True, no_preflight=False)
class scitex_agent_container.config.RestartSpec(policy='never', max_retries=3, backoff_initial=30, backoff_max=300, backoff_multiplier=2)[source]

Bases: object

policy: str = 'never'
max_retries: int = 3
backoff_initial: int = 30
backoff_max: int = 300
backoff_multiplier: int = 2
__init__(policy='never', max_retries=3, backoff_initial=30, backoff_max=300, backoff_multiplier=2)
class scitex_agent_container.config.SchedulingSpec(mode='per-host', preferred_host='', fallback_hosts=<factory>)[source]

Bases: object

Fleet-wide scheduling policy for an agent (shared-host layout).

mode controls effective-id composition and launch-skip behavior:
  • per-host (default): agent is started on every host that runs sac start <name>; the effective id is <metadata.name>-<HOST> unless the name already ends with -<HOST>.

  • singleton: exactly one instance fleet-wide. The effective id stays as the bare <metadata.name>. Only launched on preferred-host; on other hosts the launch is a no-op.

fallback-hosts is recorded for observability but not acted on automatically — manual failover today.

mode: str = 'per-host'
preferred_host: str = ''
fallback_hosts: list[str]
__init__(mode='per-host', preferred_host='', fallback_hosts=<factory>)
class scitex_agent_container.config.SkillsSpec(required=<factory>, available=<factory>, injection_mode='at-import', match_by=<factory>, match_style='exact')[source]

Bases: object

required: list[str]
available: list[str]
injection_mode: str = 'at-import'
match_by: list[str]
match_style: str = 'exact'
__init__(required=<factory>, available=<factory>, injection_mode='at-import', match_by=<factory>, match_style='exact')
class scitex_agent_container.config.SlurmHeartbeatSpec(command='', interval_s=30, log_file='')[source]

Bases: object

Compute-node heartbeat daemon for the SLURM runtime.

On HPC clusters the host-level heartbeat pusher (systemd user timer, launchd plist) runs on the login node and cannot see tmux sessions living on the compute node the sbatch job landed on. Without a compute-node-local pusher, the hub marks the agent dead five minutes after the job starts (symptom: head-spartan alive in squeue but red on the dashboard — lead msg#15654).

Fix: the sbatch wrapper spawns a lightweight background loop that invokes command every interval_s seconds on the compute node itself. When command is empty the loop is skipped (opt-in).

The command is expected to be a self-contained shell invocation of a heartbeat pusher (e.g. python3 .../agent_meta.py --push). The wrapper exports SCITEX_OROCHI_AGENT / SCITEX_OROCHI_HOSTNAME via the pre_agent hook so the pushed payload registers with the correct fleet identity.

Fields:

  • command: Shell command line to run each tick. Empty disables.

  • interval_s: Seconds between ticks. 30 matches the login-node systemd timer cadence.

  • log_file: Absolute path (with ~ expansion) for stderr/stdout capture. Defaults to <logs_dir>/<jobid>.heartbeat.log when empty.

command: str = ''
interval_s: int = 30
log_file: str = ''
__init__(command='', interval_s=30, log_file='')
class scitex_agent_container.config.SlurmHooks(pre_submit='', pre_agent='', walltime_signal='', post_agent='', attach='')[source]

Bases: object

Plugin hook paths for the SLURM runtime.

Each field is a path to a shell fragment that is sourced (not exec’d) by the sbatch wrapper. Hooks can export env vars that persist into the agent process — this is exactly what e.g. Lmod module loads need.

Hook env vars (set by the wrapper before sourcing):

SAC_AGENT_ID, SAC_JOB_ID, SAC_WORKDIR, SAC_LOG_FILE, SAC_PHASE.

sac ships no default hooks; external orchestrators (orochi, etc.) provide their own scripts and reference them from agent YAML.

pre_submit: str = ''
pre_agent: str = ''
walltime_signal: str = ''
post_agent: str = ''
attach: str = ''
__init__(pre_submit='', pre_agent='', walltime_signal='', post_agent='', attach='')
class scitex_agent_container.config.SlurmSpec(partition='', time_limit='1-00:00:00', cpus_per_task=1, mem='4G', nodes=1, ntasks=1, gres='', job_name='', signal='B:USR1@3600', auto_resubmit=True, hold='tail -f /dev/null', logs_dir='~/slurm_logs', hooks=<factory>, heartbeat=<factory>, extra_directives=<factory>, reservation='')[source]

Bases: object

SLURM runtime configuration parsed from agent YAML’s spec.slurm.

partition: str = ''
time_limit: str = '1-00:00:00'
cpus_per_task: int = 1
mem: str = '4G'
nodes: int = 1
ntasks: int = 1
gres: str = ''
job_name: str = ''
signal: str = 'B:USR1@3600'
auto_resubmit: bool = True
hold: str = 'tail -f /dev/null'
logs_dir: str = '~/slurm_logs'
hooks: SlurmHooks
heartbeat: SlurmHeartbeatSpec
extra_directives: list[str]
reservation: str = ''
__init__(partition='', time_limit='1-00:00:00', cpus_per_task=1, mem='4G', nodes=1, ntasks=1, gres='', job_name='', signal='B:USR1@3600', auto_resubmit=True, hold='tail -f /dev/null', logs_dir='~/slurm_logs', hooks=<factory>, heartbeat=<factory>, extra_directives=<factory>, reservation='')
class scitex_agent_container.config.StartupCommand(delay=0, command='')[source]

Bases: object

delay: int = 0
command: str = ''
__init__(delay=0, command='')
class scitex_agent_container.config.StartupSpec(ready_patterns=<factory>, ready_idle_ticks=3, ready_poll_interval_seconds=0.5, ready_timeout_seconds=60.0, on_timeout='capture_and_proceed', commands=<factory>)[source]

Bases: object

Opt-in ready-state gate for startup commands (todo#291).

When ready_patterns is empty, legacy fire-and-hope behavior is preserved. Otherwise agent_start polls the tmux pane content and only dispatches commands once all patterns match against the tail of the capture AND the pane has been byte-identical for ready_idle_ticks consecutive polls.

ready_patterns: list[ReadyPattern]
ready_idle_ticks: int = 3
ready_poll_interval_seconds: float = 0.5
ready_timeout_seconds: float = 60.0
on_timeout: str = 'capture_and_proceed'
commands: list[StartupCommand]
__init__(ready_patterns=<factory>, ready_idle_ticks=3, ready_poll_interval_seconds=0.5, ready_timeout_seconds=60.0, on_timeout='capture_and_proceed', commands=<factory>)
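
The ready-state gate described for StartupSpec can be illustrated with a pure function over a sequence of pane captures. This is a sketch under assumptions: first_ready_poll and tail_chars are hypothetical names, "byte-identical for ready_idle_ticks consecutive polls" is read as that many consecutive identical captures, and neither the timeout handling (on_timeout) nor the legacy empty-patterns path is modelled.

```python
import re
from typing import Iterable, Optional, Sequence


def first_ready_poll(
    snapshots: Iterable[str],
    patterns: Sequence[str],
    idle_ticks: int = 3,
    tail_chars: int = 2000,
) -> Optional[int]:
    """Return the 0-based poll index at which the gate would open, else None."""
    compiled = [re.compile(p) for p in patterns]
    previous: Optional[str] = None
    identical = 0  # number of consecutive polls with the same pane content

    for index, pane in enumerate(snapshots):
        identical = identical + 1 if pane == previous else 1
        previous = pane

        tail = pane[-tail_chars:]  # patterns are matched against the tail only
        stable = identical >= idle_ticks
        if stable and all(rx.search(tail) for rx in compiled):
            return index
    return None
```

With captures ["boot", "boot\n> ", "boot\n> ", "boot\n> "] and the single pattern r">\s*$", the gate opens on the fourth poll (index 3), the first point at which the prompt has been visible and unchanged for three polls in a row.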
class scitex_agent_container.config.TelegramSpec(bot_token_env='SCITEX_AGENT_CONTAINER_TELEGRAM_BOT_TOKEN', allowed_users=<factory>, auto_connect=True, greeting='')[source]

Bases: object

bot_token_env: str = 'SCITEX_AGENT_CONTAINER_TELEGRAM_BOT_TOKEN'
allowed_users: list[str]
auto_connect: bool = True
greeting: str = ''
__init__(bot_token_env='SCITEX_AGENT_CONTAINER_TELEGRAM_BOT_TOKEN', allowed_users=<factory>, auto_connect=True, greeting='')
class scitex_agent_container.config.WatchdogSpec(enabled=False, interval=1.5, resp_y_n='1', resp_y_y_n='2', resp_waiting='/speak-and-call')[source]

Bases: object

enabled: bool = False
interval: float = 1.5
resp_y_n: str = '1'
resp_y_y_n: str = '2'
resp_waiting: str = '/speak-and-call'
__init__(enabled=False, interval=1.5, resp_y_n='1', resp_y_y_n='2', resp_waiting='/speak-and-call')
scitex_agent_container.config.compose_effective_name(raw_name, hosts_spec, hostname)[source]

Return the effective agent id given the directory-derived name, the host/hosts spec, and the current hostname.

Return type:

str

Rules:
  • If hosts: is set (multi-instance), append -<hostname> so each host’s instance has a unique id. Idempotent — names that already end with -<hostname> are not double-suffixed.

  • Otherwise (host: set, or both empty = local singleton): keep the bare raw_name. Singleton id stays stable across hosts.
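
The two rules above can be sketched as a small pure function. This is an illustration only: effective_name is a hypothetical stand-in that takes the plural hosts value directly rather than a HostsSpec instance.

```python
from typing import Sequence, Union


def effective_name(
    raw_name: str, hosts: Union[str, Sequence[str], None], hostname: str
) -> str:
    """Compose the effective id from the bare name and the plural `hosts` value."""
    if not hosts:
        # host: set, or nothing set at all: the singleton keeps the bare name.
        return raw_name

    suffix = f"-{hostname}"
    # Idempotent: a name that already carries the suffix is returned unchanged.
    return raw_name if raw_name.endswith(suffix) else raw_name + suffix
```

Calling it twice on its own output gives the same result, which is the idempotence property the first rule describes.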

scitex_agent_container.config.load_config(path)[source]

Load and validate a YAML config, returning an AgentConfig.

Only scitex-agent-container/v3 is accepted. Older apiVersions (v1, v2) raise loud validation errors — no backward compatibility.

Return type:

AgentConfig

scitex_agent_container.config.resolve_config(name_or_path)[source]

Resolve agent name or path to a config file path.

Return type:

str

Search order for short names (no slash, no .yaml/.yml suffix):
  1. Project-local — first .scitex/agent-container/agents/ found walking upward from cwd. Highest priority so checked-in test agents and CI fixtures override globals.

  2. ~/.scitex/agent-container/agents/<name>.yaml (sac install root)

  3. $SCITEX_AGENT_CONTAINER_YAML_DIRS (colon-separated extra dirs)

  4. Fleet layout — for each root in (~/.scitex/orochi, ~/.dotfiles/src/.scitex/orochi):

    1. <root>/<HOST>/agents/<name>/<name>.yaml (host override)

    2. <root>/shared/agents/<name>/<name>.yaml (shared default)

Pass an explicit path (with / or .yaml/.yml) to bypass the search entirely.
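
The search order above can be sketched as a function that lists candidate paths in priority order. The names is_short_name, find_project_agents_dir and candidate_paths are hypothetical, and the <name>.yaml filename used for the project-local and extra directories is an assumption (the source only spells out the filename for the install root and the fleet layout).

```python
from pathlib import Path
from typing import Optional


def is_short_name(name_or_path: str) -> bool:
    """Short names contain no slash and carry no .yaml/.yml suffix."""
    return "/" not in name_or_path and not name_or_path.endswith((".yaml", ".yml"))


def find_project_agents_dir(cwd: Path) -> Optional[Path]:
    """Walk upward from cwd and return the first project-local agents directory."""
    for base in (cwd, *cwd.parents):
        candidate = base / ".scitex" / "agent-container" / "agents"
        if candidate.is_dir():
            return candidate
    return None


def candidate_paths(
    name: str,
    home: Path,
    host: str,
    project_dir: Optional[Path] = None,
    extra_dirs: str = "",
) -> list[Path]:
    """List candidate config paths for a short name, highest priority first."""
    filename = f"{name}.yaml"
    candidates: list[Path] = []

    if project_dir is not None:  # 1. project-local
        candidates.append(project_dir / filename)

    # 2. install root
    candidates.append(home / ".scitex" / "agent-container" / "agents" / filename)

    # 3. colon-separated extra directories
    for extra in extra_dirs.split(":"):
        if extra:
            candidates.append(Path(extra) / filename)

    # 4. fleet layout: host override first, then the shared default
    for root in (
        home / ".scitex" / "orochi",
        home / ".dotfiles" / "src" / ".scitex" / "orochi",
    ):
        candidates.append(root / host / "agents" / name / filename)
        candidates.append(root / "shared" / "agents" / name / filename)

    return candidates
```

A resolver built on this would return the first candidate that exists on disk; project_dir would come from find_project_agents_dir(Path.cwd()) and extra_dirs from the SCITEX_AGENT_CONTAINER_YAML_DIRS environment variable.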

scitex_agent_container.config.resolve_hostname()[source]

Return the canonical host label for this machine.

Return type:

str

Resolution order (first non-empty wins):
  1. SCITEX_AGENT_CONTAINER_HOSTNAME env var (manual override).

  2. SCITEX_OROCHI_HOSTNAME env var.

  3. hostname_aliases[short hostname] from shared/config.yaml or ~/.scitex/agent-container/config.yaml.

  4. socket.gethostname() short form (identity fallback).

Raises:

RuntimeError – If none of the sources produces a non-empty value. This should be practically impossible (gethostname() returns something on any configured box) but is handled loudly rather than returning the empty string.
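
The resolution order above amounts to "first non-empty value wins". A minimal sketch, assuming the alias table has already been loaded into a dict; resolve_hostname_sketch and its parameters are hypothetical, and reading the config files is out of scope here.

```python
import socket
from typing import Mapping, Optional


def resolve_hostname_sketch(
    env: Mapping[str, str],
    aliases: Mapping[str, str],
    system_hostname: Optional[str] = None,
) -> str:
    """Return the first non-empty host label from the documented sources."""
    raw = socket.gethostname() if system_hostname is None else system_hostname
    short = raw.split(".")[0]  # short form of the system hostname

    candidates = (
        env.get("SCITEX_AGENT_CONTAINER_HOSTNAME", ""),  # 1. manual override
        env.get("SCITEX_OROCHI_HOSTNAME", ""),           # 2. orochi override
        aliases.get(short, ""),                          # 3. hostname_aliases entry
        short,                                           # 4. identity fallback
    )
    for value in candidates:
        if value:
            return value
    # Fail loudly instead of returning an empty label.
    raise RuntimeError("could not resolve a non-empty hostname")
```

In real use env would be os.environ; the system_hostname parameter exists only so the fallback branch can be exercised deterministically.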

scitex_agent_container.config.substitute_hostnames(obj, hostname=None)[source]

Recursively walk a dict/list/str and substitute hostname placeholders.

Non-string leaves (int, bool, None) are returned unchanged. The walk is pure-functional — the input is not mutated; a new structure is returned.

Parameters:
  • obj (Any) – YAML-parsed structure (dict/list/scalar).

  • hostname (str | None) – Override hostname (for tests). If None, calls resolve_hostname().

Return type:

Any
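
A non-mutating recursive walk of this kind can be sketched as follows. The placeholder syntax is not stated in this reference, so the "${HOST}" token below is purely hypothetical and is exposed as a parameter; substitute is likewise an illustrative name, and leaving dict keys untouched is an assumption.

```python
from typing import Any


def substitute(obj: Any, hostname: str, placeholder: str = "${HOST}") -> Any:
    """Return a copy of obj with the placeholder replaced in every string leaf."""
    if isinstance(obj, str):
        return obj.replace(placeholder, hostname)
    if isinstance(obj, dict):
        # Assumption: only values are substituted; keys are kept as-is.
        return {key: substitute(value, hostname, placeholder) for key, value in obj.items()}
    if isinstance(obj, list):
        return [substitute(item, hostname, placeholder) for item in obj]
    # int, bool, None and other non-string leaves pass through unchanged.
    return obj
```

Because every container is rebuilt rather than edited in place, the caller's parsed YAML structure is left intact.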

scitex_agent_container.config.validate_config(path)[source]

Validate a config file and return list of errors (empty = valid).

Return type:

list[str]

scitex_agent_container.config.validate_contributor_spec(path)[source]

Validate a contributor spec YAML file. Returns list of errors (empty = valid).

Return type:

list[str]

scitex_agent_container.config.validate_contributor_spec_raw(raw, path='<unknown>')[source]

Validate a contributor spec dict. Returns list of error strings.

Return type:

list[str]

Lifecycle

Agent lifecycle management – start, stop, restart, status.

scitex_agent_container.lifecycle._get_runtime(config)[source]

Return the appropriate runtime for the config.

scitex_agent_container.lifecycle._fallback_workdir(name)[source]

Return the workdir path used when the agent’s YAML can’t be loaded.

Canonical 2026-04-17 layout: ~/.scitex/orochi/runtime/workspaces/<id>/.

Return type:

str

scitex_agent_container.lifecycle._fire_forget_hook(agent_name, hook_name, commands, context=None)[source]

Invoke run_hook (non-blocking, handles URL + shell entries).

Called alongside the legacy synchronous _run_hooks path so existing YAML pipes/redirects keep working unchanged while external tools (orochi etc.) can additionally plug in via http(s):// URLs. The legacy path filters out URL entries to avoid double-dispatch of the same side-effect.

Return type:

None
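
The split between URL entries and shell entries that avoids double-dispatch can be sketched as a simple partition. The name split_hook_entries is hypothetical, and matching on the scheme prefix is an assumption about how URL entries are recognised.

```python
def split_hook_entries(commands: list[str]) -> tuple[list[str], list[str]]:
    """Partition hook entries into (http(s) URLs, shell commands), preserving order."""
    urls: list[str] = []
    shell: list[str] = []
    for entry in commands:
        is_url = entry.strip().lower().startswith(("http://", "https://"))
        (urls if is_url else shell).append(entry)
    return urls, shell
```

Under this reading, the legacy synchronous path would receive only the shell list, so an entry that is a URL is never handed to the shell.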

scitex_agent_container.lifecycle._run_hooks(hooks, extra_env=None)[source]

Execute a list of shell hook commands.

Parameters:
  • hooks (list[str]) – Shell commands to execute.

  • extra_env (dict[str, str] | None) – Additional env vars passed to hook subprocesses (e.g., SCITEX_AGENT_CONTAINER_CONFIG_PATH, SCITEX_AGENT_CONTAINER_SCREEN_NAME, SCITEX_AGENT_CONTAINER_NAME).

Return type:

None

scitex_agent_container.lifecycle.agent_start(config_path, registry=None, no_preflight=False, force=False, session_override=None, resume_id_override=None, dry_run=False, foreground=False)[source]

Start an agent from a YAML config file.

Parameters:
  • config_path (str) – Path to the YAML agent definition.

  • registry (Registry | None) – Optional registry instance.

  • no_preflight (bool) – If True, skip SSH preflight checks (useful for slow hosts).

  • force (bool) – If True and the agent is already running, stop it first and then start fresh. Also tolerates stale registry entries and ghost screens (via force-stop).

  • session_override (str | None) – If set, override config.claude.session for this start invocation (one of continue-or-new | continue | new | resume).

  • resume_id_override (str | None) – If set, override config.claude.resume_id. Pass with session_override="resume" to launch claude --resume <id> without editing the YAML.

  • dry_run (bool) – If True, materialize the workspace files but skip the multiplexer / Claude Code launch and registry registration. Hooks (pre_start / post_start) are also skipped.

Return type:

bool

Returns True on success, False on failure.

scitex_agent_container.lifecycle.agent_stop(name, registry=None, force=False)[source]

Stop a running agent by name.

Parameters:
  • name (str) – Agent name.

  • registry (Registry | None) – Optional registry instance.

  • force (bool) – If True, do not fail when the agent is missing from the registry or when hooks/runtime.stop() raise; wipe stale state and return True. Useful for bulk cleanup.

Return type:

bool

scitex_agent_container.lifecycle.agent_stop_all(registry=None, force=False)[source]

Stop every agent in the registry.

Returns a list of (name, success, message) tuples, one per agent. With force=True, continues through errors so a partial failure doesn’t block cleanup of the rest.

Return type:

list[tuple[str, bool, str]]

scitex_agent_container.lifecycle.agent_restart(name, registry=None)[source]

Restart an agent by name.

Return type:

bool

scitex_agent_container.lifecycle.agent_status(name, registry=None)[source]

Get detailed status for an agent.

Return type:

dict

scitex_agent_container.lifecycle.agent_logs(name, lines=50, registry=None)[source]

Get recent logs from an agent.

Return type:

str

Registry

Agent registry – track running agents via JSON files in a temp directory.

class scitex_agent_container.registry.Registry(registry_dir=None)[source]

Bases: object

File-based registry for tracking running agent instances.

__init__(registry_dir=None)[source]
add(name, config_path, screen_name, pid=None)[source]

Register an agent as running.

Return type:

None

remove(name)[source]

Remove an agent from the registry.

Return type:

None

get(name)[source]

Get registry entry for an agent, or None if not found.

Return type:

dict | None

list_all()[source]

List all registered agents.

Return type:

list[dict]

exists(name)[source]

Check if an agent is registered.

Return type:

bool

cleanup_stale()[source]

Remove entries whose multiplexer sessions no longer exist.

Probes tmux first (tmux has-session), then screen (-ls). An entry is removed only when the session is absent from both multiplexers. This makes cleanup safe on mixed fleets where agents may run under either tmux or GNU screen.

Returns count removed.

Return type:

int

Observability

Rich agent metadata collection (claude-hud-style).

Canonical source of truth for the metadata payload that is:
  1. Emitted by scitex-agent-container show-status <name> --json.

  2. POSTed by the MCP sidecar heartbeat to /api/agents/register/.

Ported 2026-04-12 from the pre-restructure ~/.scitex/orochi/agents/mamba-healer-mba/scripts/agent_meta.py — the fleet script now lives at ~/.scitex/orochi/shared/scripts/agent_meta.py (2026-04-17 runtime/ layout) and shells out to this module via sac status --json, so the collection logic still lives in one place.

Every field is best-effort: any failure leaves the field as its default ("", 0, 0.0, []) and never raises. The caller merges this dict on top of the base agent_status result.

scitex_agent_container.agent_meta.detect_multiplexer(session)[source]

Return 'tmux', 'screen', or '' if neither reports the session.

Return type:

str

scitex_agent_container.agent_meta._encode_claude_project(workdir)[source]

Replicate Claude Code’s cwd -> projects dir name encoding.

/ and . both become -, but triple-or-more dashes that come from hidden dirs (/.foo) are collapsed back to --.

Return type:

str
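
Read literally, the encoding rule is a character substitution followed by a collapse of long dash runs. The sketch below follows that literal reading; encode_project_dir is a hypothetical name and the real helper may handle edge cases differently.

```python
import re


def encode_project_dir(workdir: str) -> str:
    """Map a working directory to a dash-encoded directory name."""
    dashed = re.sub(r"[/.]", "-", workdir)   # "/" and "." both become "-"
    return re.sub(r"-{3,}", "--", dashed)    # runs of three or more dashes collapse to "--"
```

Under this reading a hidden directory such as /home/u/.scitex yields a double dash in the encoded name.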

scitex_agent_container.agent_meta._parse_skills(workdir)[source]

Parse the fenced code block tagged skills from the workspace CLAUDE.md.

Return type:

list[str]

scitex_agent_container.agent_meta.parse_subagent_count_from_pane_text(pane)[source]

Return the subagent count advertised by Claude Code’s status marker.

Claude Code emits a line of the form N local agent(s) running (or ... still running) in the tmux pane while subagent Agent calls are in flight. The match is anchored on the literal running trailer, so chat text that merely mentions "local agent" cannot produce a false positive. Anything else (no marker, empty pane) is reported as 0.

Return type:

int
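
A regular expression anchored on the running trailer might look like the sketch below. The exact pattern used by the package is not shown in this reference, so both the regex and the choice to report the last marker in the pane are assumptions; subagent_count is a hypothetical name.

```python
import re

# Matches "3 local agents running" and "1 local agent still running".
_MARKER = re.compile(r"(\d+)\s+local agents?\s+(?:still\s+)?running")


def subagent_count(pane: str) -> int:
    """Return the count from the most recent status marker, or 0 if none is present."""
    matches = _MARKER.findall(pane)
    # Assumption: the last marker in the capture is the most recent one.
    return int(matches[-1]) if matches else 0
```

Text such as "ask a local agent about it" has no leading number and no running trailer, so it does not match.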

scitex_agent_container.agent_meta._capture_pane(session, multiplexer, max_chars=10000)[source]

Return the current tmux pane contents, truncated. Empty on error.

Return type:

str

scitex_agent_container.agent_meta._classify_pane_state(pane_text)[source]

Heuristic pane-state classifier. Returns (state, stuck_prompt_text).

Return type:

tuple[str, str]

States:
  • “running”: agent is actively working (prompt >_ present, no stuck marker)

  • “idle_prompt”: prompt visible, no recent activity

  • “y_n_prompt”: y/n prompt blocking

  • “auth_error”: credential error shown

  • “compose_pending_unsent”: user text typed but not yet submitted

  • “limit_reached”: Anthropic rate limit warning visible

  • “unknown”: nothing matched

scitex_agent_container.agent_meta._config_candidates(workdir, filename)[source]

Return a prioritised list of candidate locations for filename.

Historically only <workdir>/<filename> was probed, which meant agents whose workspace wasn’t provisioned with that file pushed an empty claude_md / mcp_json to the hub. Walk a wider set of plausible locations so every agent gets populated content:

  1. <workdir>/<filename>

  2. <workdir>/.claude/<filename> (nested config style)

  3. Legacy sibling <workdir-parent>/mamba-<name>/<filename>

  4. Nearest enclosing git-root <filename>

  5. ~/.claude/<filename> (user-global fallback)

  6. ~/<filename>

Return type:

list[Path]

scitex_agent_container.agent_meta._parse_mcp_servers(workdir)[source]

Return a structured summary of MCP servers configured for this agent.

Parses <workdir>/.mcp.json into a flat list of {name, transport, url_host, command} entries so the dashboard can render a setup-audit table alongside installed plugins. URL hosts (not full URLs) and commands (not args) are surfaced because that is enough to verify the server is pointing at the right endpoint without exposing query-string secrets.

Returns [] if the file is missing or malformed — callers never get None.

Return type:

list[dict[str, Any]]
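A hedged sketch of the summary described above, assuming the file uses a top-level "mcpServers" mapping and an optional "type" field per server. The function name and the transport fallback ("http" when a URL is present, otherwise "stdio") are illustrative assumptions.

```python
import json
from pathlib import Path
from typing import Any
from urllib.parse import urlsplit


def parse_mcp_servers(workdir: str) -> list[dict[str, Any]]:
    """Flatten <workdir>/.mcp.json; return [] on any read or parse problem."""
    try:
        data = json.loads((Path(workdir) / ".mcp.json").read_text(encoding="utf-8"))
        servers = data.get("mcpServers", {})
    except (OSError, ValueError, AttributeError):
        return []
    if not isinstance(servers, dict):
        return []
    rows: list[dict[str, Any]] = []
    for name, spec in servers.items():
        if not isinstance(spec, dict):
            continue
        url = str(spec.get("url") or "")
        rows.append({
            "name": name,
            "transport": spec.get("type") or ("http" if url else "stdio"),
            # Host only: the path and query string may carry secrets.
            "url_host": urlsplit(url).hostname or "",
            # Command only: args are deliberately dropped.
            "command": str(spec.get("command") or ""),
        })
    return rows
```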

scitex_agent_container.agent_meta._read_sdk_session_state(name, workdir)[source]

Surface runtime: claude-session state on the status JSON.

Returns None for agents that aren’t using the SDK runtime (heartbeat file absent). For SDK agents, returns a dict with the persisted session id, accumulated per-turn token totals, and the latest heartbeat state. Best-effort: any IO / parse failure yields None so non-SDK agents never see this field populated and SDK agents on transient state-dir glitches degrade silently.

Return type:

dict | None

scitex_agent_container.agent_meta.collect_rich(*, name, workdir, session)[source]

Collect claude-hud-style metadata for one agent.

Parameters:
  • name (str) – Agent name (used only as a fallback identifier).

  • workdir (str) – Absolute workspace dir for the agent (used to locate CLAUDE.md and the Claude Code transcript JSONL files).

  • session (str) – Multiplexer session name (what tmux has-session -t checks).

Return type:

dict[str, Any]

scitex_agent_container.agent_meta._collect_action_summary_fields(agent_name)[source]

Return a flat dict of action-summary fields for collect_rich.

Runs inside a try/except so a corrupt or missing ~/.scitex/agent-container/actions.db never blocks a heartbeat. All keys are prefixed action_ so consumers know which subsystem they came from.

Return type:

dict[str, Any]

Ring-buffer event log for Claude Code hook events.

Claude Code invokes configured commands on PreToolUse, PostToolUse, UserPromptSubmit, and Stop. We capture the JSON payloads into a per-agent ring-buffer at ~/.scitex/agent-container/events/<agent>.jsonl so downstream consumers (agent_meta.collect_rich) can surface recent tool calls / prompts / stops without the agent itself having to act.

Design rules

  • Non-agentic. Pure file I/O plus a tiny regex on the payload.

  • Non-blocking. Writes are O(1); a rotation pass runs only when the file exceeds the cap.

  • Stdlib only. json, pathlib, fcntl for the advisory lock. No psutil / requests.

  • Fail-closed. Any exception inside the hook handler is swallowed so hooks can never break the agent session.
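The design rules above could be realised roughly as follows. This is a sketch under stated assumptions: the helper names append_jsonl and read_tail, the byte cap, and the number of lines kept on rotation are all hypothetical, and the code is POSIX-only because of fcntl.

```python
import fcntl
import json
import os
from pathlib import Path
from typing import Any


def append_jsonl(path: Path, record: dict[str, Any],
                 max_bytes: int = 256_000, keep_lines: int = 500) -> None:
    """Append one JSON line under an advisory lock; never raise."""
    try:
        line = json.dumps(record) + "\n"
        path.parent.mkdir(parents=True, exist_ok=True)
        with open(path, "a+", encoding="utf-8") as fh:
            fcntl.flock(fh, fcntl.LOCK_EX)  # released when fh is closed
            fh.write(line)
            fh.flush()
            # Rotation pass only when the file has outgrown the cap.
            if os.fstat(fh.fileno()).st_size > max_bytes:
                fh.seek(0)
                tail = fh.readlines()[-keep_lines:]
                fh.seek(0)
                fh.truncate()
                fh.writelines(tail)
    except Exception:
        return  # a logging failure must not break the agent session


def read_tail(path: Path, limit: int = 50) -> list[dict[str, Any]]:
    """Return up to `limit` newest records, oldest first."""
    try:
        lines = path.read_text(encoding="utf-8").splitlines()
    except OSError:
        return []
    records = []
    for raw in lines[-limit:]:
        try:
            records.append(json.loads(raw))
        except ValueError:
            continue  # skip a torn or corrupt line
    return records
```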

scitex_agent_container.event_log._preview_tool_input(tool_name, tool_input)[source]

Return a short human-readable preview for the tool input.

Return type:

str

scitex_agent_container.event_log.append_event(agent, kind, payload, *, root=None)[source]

Append a single hook event. Never raises.

Return type:

None

scitex_agent_container.event_log.read_recent(agent, limit=50, *, root=None)[source]

Return the last limit event records (oldest-first).

Return type:

list[dict[str, Any]]

scitex_agent_container.event_log._compute_open_agent_calls(recent_tools, now=None)[source]

Return Agent pretool events with no matching posttool (LIFO matching).

Walks recent_tools in chronological order, maintaining a stack of open Agent pretool events. Each Agent posttool pops the most-recent unmatched pretool (LIFO — subagents can be nested). Entries still on the stack at the end have no posttool in the observed window.

Each returned record adds age_seconds (wall-clock seconds since ts) so callers can threshold on stuck duration without re-parsing ISO timestamps.

Return type:

list[dict[str, Any]]

Caveats

  • The event log is a ring-buffer. A pretool that fired before the window started would look “open” even if it already completed. Consumers should cross-check subagent_count from the pane text before declaring an open call “stuck”.

  • Nested Agent calls are matched LIFO, which is the correct order when the outer agent waits for the inner one to complete.
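The LIFO matching described above can be illustrated with a short sketch. The record keys "tool", "kind" and "ts", and the kind values "pretool" / "posttool", are assumptions about the event shape; timestamps are assumed to be ISO-8601 with an explicit UTC offset.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def open_agent_calls(events: list[dict[str, Any]],
                     now: Optional[datetime] = None) -> list[dict[str, Any]]:
    """Return Agent pretool events that have no matching posttool."""
    now = now or datetime.now(timezone.utc)
    stack: list[dict[str, Any]] = []
    for ev in events:  # chronological order
        if ev.get("tool") != "Agent":
            continue
        if ev.get("kind") == "pretool":
            stack.append(ev)
        elif ev.get("kind") == "posttool" and stack:
            stack.pop()  # closes the most recent open call
    result = []
    for ev in stack:
        started = datetime.fromisoformat(ev["ts"])
        result.append({**ev, "age_seconds": (now - started).total_seconds()})
    return result
```

With two nested calls and one posttool, the inner call is closed and the outer one remains open, which matches the caveat about nesting.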

scitex_agent_container.event_log.summarize(agent, limit=50, *, root=None)[source]

Return a pre-aggregated view for the status payload.

Keys returned:

{
  "recent_tools":       [{ts, tool, input_preview, ...}, ...] last ``limit``
  "recent_prompts":     [{ts, prompt_preview}, ...]            last 5
  "agent_calls":        [{ts, input_preview}, ...]             last 20 Agent invocations
  "open_agent_calls":   [{ts, input_preview, age_seconds}, ...]
                          Agent pretool events with no matching posttool in the
                          observation window. Non-empty = subagent(s) may be stuck.
                          Cross-check with ``subagent_count`` before alerting.
  "background_tasks":   [{ts, input_preview}, ...]             unresolved Bash run_in_background starts
  "counts":             {tool_name: count_in_window}
  "last_tool_at":       ISO ts of newest tool (pretool kind) — "functional" heartbeat
  "last_tool_name":     tool name for last_tool_at
  "last_mcp_tool_at":   ISO ts of newest mcp__* tool — confirms MCP sidecar route
  "last_mcp_tool_name": tool name for last_mcp_tool_at
}
Return type:

dict[str, Any]

Self-snapshot subcommand (todo#286).

Collects a per-agent snapshot of the local host state (tmux/screen, proc counts, load, memory, fork-pressure, claude context-percent) and persists it to the container cache dir. On each run, the previous snapshot is rolled to <agent>.prev.json and the new one lands in <agent>.latest.json atomically. A flat dotted-key diff is computed against the previous snapshot so the dashboard can highlight what changed.

Kept deliberately stdlib-only: no psutil, no yaml, no new deps.
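A stdlib-only sketch of the two mechanisms described above: the flat dotted-key diff and the latest-to-prev roll. The helper names and the temporary-file suffix are illustrative assumptions; the real module also holds a per-agent lock around the roll.

```python
import json
import os
from pathlib import Path
from typing import Any

_MISSING = object()  # sentinel so "key absent" differs from "value is None"


def flatten(data: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Turn {"load": {"1m": 0.5}} into {"load.1m": 0.5}."""
    flat: dict[str, Any] = {}
    for key, value in data.items():
        dotted = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


def diff_fields(prev: dict[str, Any], latest: dict[str, Any]) -> list[str]:
    """Sorted dotted keys whose value was added, removed, or changed."""
    a, b = flatten(prev), flatten(latest)
    return sorted(k for k in a.keys() | b.keys()
                  if a.get(k, _MISSING) != b.get(k, _MISSING))


def roll_and_write(cache: Path, agent: str, snap: dict[str, Any]) -> None:
    """Roll <agent>.latest.json to <agent>.prev.json, then publish the new one."""
    cache.mkdir(parents=True, exist_ok=True)
    latest = cache / f"{agent}.latest.json"
    tmp = cache / f"{agent}.latest.json.tmp"
    tmp.write_text(json.dumps(snap), encoding="utf-8")
    if latest.exists():
        os.replace(latest, cache / f"{agent}.prev.json")
    os.replace(tmp, latest)  # atomic rename on the same filesystem
```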

scitex_agent_container.snapshot.register_sidecar(agent, kind, name, *, pid=None, thread=None)[source]

Register a sidecar so snapshot can introspect liveness.

kind is "thread" or "process". thread must be supplied for thread-kind sidecars; pid for process-kind.

Return type:

None

scitex_agent_container.snapshot.cache_dir()[source]
Return type:

Path

scitex_agent_container.snapshot._snapshot_lock(agent)[source]

Per-agent advisory lock around the latest->prev roll + write.

Uses a POSIX fcntl advisory lock, which is not available on Windows; this is acceptable because the container targets Unix. The lock file persists between calls (reusable); the advisory lock is released when the fd is closed via the with block.

Return type:

Iterator[None]

scitex_agent_container.snapshot._probe_screen_count()[source]

Return the number of live GNU screen sessions.

Contract:

  • None iff the screen binary is not installed at all.

  • 0 if screen is installed but no sessions are live (screen -ls prints No Sockets found ... and exits non-zero; that is NOT an error, it means zero sessions).

  • A positive int when one or more sessions are listed.

Return type:

int | None
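A sketch of the contract above. The session-line regex, the helper names, and the choice to report 0 when the binary exists but cannot be executed are assumptions; note that the exit code of screen -ls is intentionally ignored.

```python
import re
import shutil
import subprocess
from typing import Optional

# Session rows in `screen -ls` output are indented and look like "1234.name".
_SESSION_LINE = re.compile(r"^[ \t]+\d+\.\S+", re.MULTILINE)


def parse_screen_ls(output: str) -> int:
    """Count session rows; 'No Sockets found' output yields 0."""
    return len(_SESSION_LINE.findall(output))


def probe_screen_count() -> Optional[int]:
    """None if screen is not installed, otherwise the live session count."""
    if shutil.which("screen") is None:
        return None
    try:
        proc = subprocess.run(["screen", "-ls"], capture_output=True,
                              text=True, timeout=5)
    except (OSError, subprocess.SubprocessError):
        return 0  # assumption: installed but unusable is reported as zero
    return parse_screen_ls(proc.stdout)
```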

scitex_agent_container.snapshot._probe_claude_pid()[source]

Return the PID of the live claude CLI child, or None.

The naive pgrep -f claude matches ANY process whose full command line contains the substring claude — including the scitex-agent-container python wrapper itself (whose argv often mentions claude-code, claude_code, or a claude agent name). We must exclude that wrapper and only pick the real claude CLI child that runtimes/claude_code.py execs (command basename == claude).

Strategy: prefer pgrep -n -x claude (exact command-name match).

Return type:

int | None

scitex_agent_container.snapshot._probe_nproc()[source]

Return (current, max) process counts for fork-pressure math.

Return type:

tuple[int | None, int | None]

scitex_agent_container.snapshot.gather_snapshot(agent, *, session=None)[source]

Build a snapshot dict for agent. No I/O to cache dir.

Return type:

dict[str, Any]

scitex_agent_container.snapshot.compute_diff_fields(prev, latest)[source]
Return type:

list[str]

scitex_agent_container.snapshot.take_snapshot(agent, *, session=None, with_diff=True)[source]

Gather, persist, and return a snapshot for agent.

Return type:

dict[str, Any]

scitex_agent_container.snapshot.read_latest(agent)[source]
Return type:

dict[str, Any] | None

scitex_agent_container.snapshot.snapshot_tick(agent, *, session=None, agent_config=None)[source]

Daemon helper: take a snapshot, swallow errors.

When agent_config is supplied and the fresh snapshot has has_diff, the configured hooks.on_diff commands are fired via the non-blocking hook pool (todo#286 Phase 4).

Return type:

None

Pane Actions

Base class + engine for pane-mediated agent actions.

A PaneAction subclass describes what to do in four small, mostly-pure methods. The run_action() engine handles the shared plumbing: precondition gate, send, completion polling, timeout, logging to action_store, and uniform error handling.

Why a base class

The conversation that drove this design:

  1. Every agent action we actually perform has the same shape: detect state → send keystrokes with delays → ensure completion state → log the outcome. Hand-rolling that loop per action produces drift (different timeout semantics, inconsistent logging, copy-pasted sleep/try/except trees).

  2. Observer (liveness_probe) and actor code must stay split (operators may disable auto-response). This module is the actor half; it composes with, but does not import from, the observer.

  3. Logs are for ad-hoc debugging and manual revising of the package — the engine writes one structured row per run and that is the only learning surface. There is no runtime callback that mutates module state based on outcomes.

Subclass contract

A subclass implements four methods:

class MyAction(PaneAction):
    name = "my-action"

    def snapshot(self, ctx) -> dict:
        # Whatever the action needs to judge completion.
        return {
            "pane_tail": ctx.capture_fn()[-2000:],
            "context_pct": ctx.context_pct_fn(),
        }

    def precheck(self, before: dict) -> bool:
        # Safe to act given the current state?
        return "auth_error" not in before["pane_tail"]

    def send(self, ctx) -> None:
        # Side-effect: issue keystrokes via ctx.mux.
        ctx.mux.send_text_and_submit(ctx.session, "/compact")

    def is_complete(self, before: dict, now: dict) -> bool:
        # Compare before vs now snapshots.
        return (before["context_pct"] or 100) - (now["context_pct"] or 100) >= 20

Subclasses never call time.sleep, subprocess.run, or write to disk. Everything side-effecting flows through ctx or the engine.

Outcomes

Every run ends in exactly one of:

  • SUCCESS: is_complete became true before deadline.

  • PRECONDITION_FAIL: precheck rejected before we acted.

  • SEND_ERROR: send raised.

  • COMPLETION_TIMEOUT: deadline reached without is_complete.

  • SKIPPED_BY_POLICY: caller passed skip_reason to run_action.

class scitex_agent_container.action_base.ActionOutcome(value)[source]

Bases: str, Enum

Canonical terminal states of one run_action call.

SUCCESS = 'success'
PRECONDITION_FAIL = 'precondition_fail'
SEND_ERROR = 'send_error'
COMPLETION_TIMEOUT = 'completion_timeout'
SKIPPED_BY_POLICY = 'skipped_by_policy'
class scitex_agent_container.action_base.ActionContext(agent, session, mux, capture_fn, context_pct_fn=<factory>, extras=<factory>)[source]

Bases: object

Everything a subclass needs to snapshot + send.

All side-effecting callables live here so tests can inject fakes without monkey-patching subprocess or time.

agent: str
session: str
mux: Any
capture_fn: Callable[[], str]
context_pct_fn: Callable[[], float | None]
extras: dict[str, Any]
__init__(agent, session, mux, capture_fn, context_pct_fn=<factory>, extras=<factory>)
class scitex_agent_container.action_base.ActionAttempt(agent, action, outcome, elapsed_s, started_at, pane_before, pane_after, extras)[source]

Bases: object

Immutable record of one engine run. Mirrors a row in the attempts table.

agent: str
action: str
outcome: ActionOutcome
elapsed_s: float
started_at: str
pane_before: dict[str, Any] | None
pane_after: dict[str, Any] | None
extras: dict[str, Any]
as_store_record()[source]

Shape expected by action_store.append_attempt.

Return type:

dict[str, Any]

__init__(agent, action, outcome, elapsed_s, started_at, pane_before, pane_after, extras)
class scitex_agent_container.action_base.PaneAction[source]

Bases: ABC

Base class for one pane-mediated action.

name: ClassVar[str] = 'unknown'
abstractmethod snapshot(ctx)[source]

Return the observation needed to judge completion.

Called at least twice per run: once before send and once per poll afterwards. The contents are entirely up to the subclass — pane_tail, context_pct, pane_state, any custom field — because the engine does not inspect them; it only forwards them to is_complete/precheck and stores them in the attempt log.

Return type:

dict[str, Any]

abstractmethod precheck(before)[source]

Return True if the current state permits sending.

Return False to abort with PRECONDITION_FAIL — typical reasons: the agent is already busy, the pane shows an auth error, a y/n prompt is pending the user’s attention, etc.

Return type:

bool

abstractmethod send(ctx)[source]

Side-effecting keystroke emission.

Must only use ctx.mux + ctx.session. Raising any exception terminates the run with SEND_ERROR.

Return type:

None

abstractmethod is_complete(before, now)[source]

Pure comparison — True iff the action is done.

Return type:

bool

before_send(ctx)[source]

Hook for any prep work between precheck and send (e.g. mint a nonce). Default no-op.

Return type:

None

extras_at_end(ctx)[source]

Hook to emit action-specific fields for the attempt log (e.g. nonce value, command text). Default: ctx.extras.

Return type:

dict[str, Any]

scitex_agent_container.action_base.run_action(action, ctx, *, timeout_s=30.0, poll_interval_s=2.0, skip_reason=None, time_fn=<built-in function monotonic>, sleep_fn=<built-in function sleep>, store_root=None, write_to_store=True)[source]

Execute action against ctx and return the attempt record.

Lifecycle:

before = action.snapshot(ctx)            -- pre-state
if skip_reason:                          -- policy skip
    -> SKIPPED_BY_POLICY, no send
elif not action.precheck(before):
    -> PRECONDITION_FAIL, no send
else:
    action.before_send(ctx)
    try: action.send(ctx)
    except: -> SEND_ERROR, no polling
    poll snapshot(ctx) every poll_interval until
        is_complete(before, now) -> SUCCESS
        or deadline -> COMPLETION_TIMEOUT

Always writes one attempt row to action_store unless write_to_store=False (used by tests that want to inspect the returned ActionAttempt in isolation).

Return type:

ActionAttempt
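The lifecycle above, reduced to a self-contained sketch. It returns only the outcome and omits the attempt record and the store write; Outcome, run and CounterAction are stand-in names for illustration, not the package's classes. The injectable time_fn / sleep_fn pair is what lets a test drive the loop with a fake clock.

```python
import time
from enum import Enum
from typing import Any, Callable, Optional


class Outcome(str, Enum):
    SUCCESS = "success"
    PRECONDITION_FAIL = "precondition_fail"
    SEND_ERROR = "send_error"
    COMPLETION_TIMEOUT = "completion_timeout"
    SKIPPED_BY_POLICY = "skipped_by_policy"


def run(action: Any, ctx: Any, *, timeout_s: float = 30.0,
        poll_interval_s: float = 2.0, skip_reason: Optional[str] = None,
        time_fn: Callable[[], float] = time.monotonic,
        sleep_fn: Callable[[float], None] = time.sleep) -> Outcome:
    before = action.snapshot(ctx)              # pre-state
    if skip_reason:
        return Outcome.SKIPPED_BY_POLICY       # no send
    if not action.precheck(before):
        return Outcome.PRECONDITION_FAIL       # no send
    action.before_send(ctx)
    try:
        action.send(ctx)
    except Exception:
        return Outcome.SEND_ERROR              # no polling
    deadline = time_fn() + timeout_s
    while time_fn() < deadline:
        sleep_fn(poll_interval_s)
        if action.is_complete(before, action.snapshot(ctx)):
            return Outcome.SUCCESS
    return Outcome.COMPLETION_TIMEOUT


class CounterAction:
    """Toy action: complete once ctx['n'] has grown by `need`."""

    def __init__(self, need: int) -> None:
        self.need = need

    def snapshot(self, ctx: dict) -> dict:
        return {"n": ctx["n"]}

    def precheck(self, before: dict) -> bool:
        return before["n"] >= 0

    def before_send(self, ctx: dict) -> None:
        pass

    def send(self, ctx: dict) -> None:
        ctx["sent"] = True

    def is_complete(self, before: dict, now: dict) -> bool:
        return now["n"] - before["n"] >= self.need
```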

SQLite-backed attempt log for PaneAction executions.

Every run of a PaneAction (see action_base.py) writes one attempts row so the fleet accumulates structured evidence of what was tried and how it ended. The log is ad-hoc — humans and future tooling read it to debug / revise the package; there is no runtime post-hook that mutates behavior based on the log.

Scope

  • One host-level DB: ~/.scitex/agent-container/actions.db.

  • One row per attempt; agent is a column so cross-agent queries (e.g. “which hosts see the most silent probes?”) are trivial.

  • Snapshots (pane_before / pane_after) are stored as JSON blobs with a format discriminator so the on-disk schema can later accept diff-encoded variants without breaking readers.

Design rules (follows event_log.py conventions)

  • Non-agentic. Pure SQL + json.

  • Fail-closed. Any write-path exception is swallowed so a logging hiccup never takes an agent down.

  • Stdlib only. sqlite3 + json + pathlib.

  • Injectable. root override everywhere so tests use a temp directory.

scitex_agent_container.action_store._safe_float(value)[source]

Coerce to float, returning None for None / unparseable.

Return type:

float | None

scitex_agent_container.action_store._get_conn(db_path)[source]

Open a connection with WAL mode and the schema ensured.

WAL keeps concurrent reads safe while a single writer inserts.

Return type:

Connection
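A minimal sketch of opening such a connection with the standard sqlite3 module. The table layout shown here is a guess built from the record keys documented for append_attempt; the real schema may differ.

```python
import sqlite3
from pathlib import Path

# Hypothetical schema; column names follow the documented record keys.
_SCHEMA = """
CREATE TABLE IF NOT EXISTS attempts (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    ts          TEXT NOT NULL,
    agent       TEXT NOT NULL,
    action      TEXT NOT NULL,
    outcome     TEXT NOT NULL,
    elapsed_s   REAL,
    pane_before TEXT,
    pane_after  TEXT,
    extras      TEXT
)
"""


def get_conn(db_path: Path) -> sqlite3.Connection:
    """Open db_path in WAL mode and make sure the table exists."""
    db_path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(db_path))
    conn.execute("PRAGMA journal_mode=WAL")  # readers do not block the writer
    conn.execute(_SCHEMA)
    conn.commit()
    return conn
```

WAL mode needs a file-backed database, so tests should point db_path at a temporary directory rather than ":memory:".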

scitex_agent_container.action_store._truncate_snapshot(snap, max_chars=4096)[source]

Coerce a snapshot into the {format, text} wrapper.

Accepts:

  • None -> returns None

  • dict with ‘format’ key -> passes through after text truncation

  • plain str -> wraps as {“format”: “full”, “text”: s}

  • other dict -> returns {“format”: “full”, “text”: json-dump}

Return type:

dict[str, Any] | None
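The four accepted shapes can be sketched as below. Which end of the text is kept on truncation is not stated in the reference; this illustration keeps the head, and the function name is a stand-in.

```python
import json
from typing import Any, Optional


def truncate_snapshot(snap: Any, max_chars: int = 4096) -> Optional[dict[str, Any]]:
    """Coerce a snapshot into the {format, text} wrapper."""
    if snap is None:
        return None
    if isinstance(snap, dict) and "format" in snap:
        wrapped = dict(snap)  # copy so the caller's dict is untouched
        wrapped["text"] = str(wrapped.get("text", ""))[:max_chars]
        return wrapped
    if isinstance(snap, str):
        return {"format": "full", "text": snap[:max_chars]}
    # Any other value (typically a plain dict) is stored as a JSON dump.
    return {"format": "full", "text": json.dumps(snap, default=str)[:max_chars]}
```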

scitex_agent_container.action_store.append_attempt(record, *, root=None)[source]

Insert one attempt row. Never raises.

Required record keys:

  • agent (str)

  • action (str) — the action name

  • outcome (one of OUTCOMES)

  • elapsed_s (float)

Optional:

  • ts — ISO-8601 UTC; auto-populated if missing.

  • pane_before / pane_after — snapshot dicts / strings.

  • extras — action-specific fields (dict).

Return type:

None

scitex_agent_container.action_store._parse_since(since)[source]

Accept either a datetime, an ISO string, or a human-readable relative string like '24h' / '7d' / '30m' and return an ISO UTC string suitable for a ts >= ? comparison.

Return type:

str | None
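A sketch of the three accepted inputs. The extra now parameter is an addition for testability and is not part of the documented signature; treating naive datetimes as UTC and returning None for unparseable strings are likewise assumptions.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional, Union

_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_since(since: Union[datetime, str, None],
                now: Optional[datetime] = None) -> Optional[str]:
    """Normalise `since` to an ISO-8601 UTC string for a `ts >= ?` filter."""
    if since is None:
        return None
    if isinstance(since, datetime):
        moment = since
    else:
        text = str(since).strip()
        match = re.fullmatch(r"(\d+)([mhd])", text)
        if match:
            base = now or datetime.now(timezone.utc)
            moment = base - timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})
        else:
            try:
                moment = datetime.fromisoformat(text)
            except ValueError:
                return None
    if moment.tzinfo is None:
        moment = moment.replace(tzinfo=timezone.utc)  # assume UTC when naive
    return moment.astimezone(timezone.utc).isoformat()
```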

scitex_agent_container.action_store.query(*, agent=None, action=None, outcome=None, since=None, limit=50, offset=0, root=None)[source]

Return matching attempts, newest first.

Return type:

list[dict[str, Any]]

scitex_agent_container.action_store.stats(*, agent=None, since=None, root=None)[source]

Per-(action, outcome) counts + mean/p95 elapsed.

Returns a list of rows:

[{"action": "nonce-probe", "outcome": "success", "count": 42,
  "mean_elapsed_s": 3.1, "p95_elapsed_s": 5.9}, ...]

p95 is computed in Python (sqlite has no native percentile); sample sets are bounded by query limits so it stays cheap.

Return type:

list[dict[str, Any]]
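Because SQLite has no percentile aggregate, the p95 has to be computed after the rows are fetched. The sketch below uses the nearest-rank method, which is an assumption; the package may interpolate differently. The helper names are illustrative.

```python
from collections import defaultdict
from typing import Any, Optional


def p95(values: list[float]) -> Optional[float]:
    """Nearest-rank 95th percentile; None for an empty sample."""
    if not values:
        return None
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def aggregate(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Group rows by (action, outcome) and attach count, mean and p95."""
    groups: dict[tuple[str, str], list[float]] = defaultdict(list)
    for row in rows:
        groups[(row["action"], row["outcome"])].append(float(row["elapsed_s"]))
    return [
        {"action": action, "outcome": outcome, "count": len(vals),
         "mean_elapsed_s": sum(vals) / len(vals), "p95_elapsed_s": p95(vals)}
        for (action, outcome), vals in sorted(groups.items())
    ]
```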

scitex_agent_container.action_store.summarize(agent, *, limit=100, root=None)[source]

Compact summary for the status payload / heartbeat.

Keys:

{
  "last_action_at":      ISO ts or ""
  "last_action_name":    action name or ""  (renamed from
                                             "last_action" to avoid
                                             collision with the
                                             pre-existing orochi
                                             liveness timestamp field)
  "last_action_outcome": outcome string or ""
  "last_action_elapsed_s": float or None
  "counts":              {"<action>:<outcome>": n}
  "p95_elapsed_s_by_action": {"<action>": p95 float}
}
Return type:

dict[str, Any]

scitex_agent_container.action_store.purge_old(*, days=None, root=None)[source]

Delete rows older than days (default from env). Returns rows deleted. Safe to call periodically from a cron / daemon.

Return type:

int

scitex_agent_container.action_store._all_rows(root=None)[source]

Iterator over every row — used by tests / export tooling.

Return type:

Iterable[dict[str, Any]]

NonceProbeAction — functional-liveness probe via Repeat <nonce>.

A pane-diff is a false liveness signal (channel notifications land in the terminal even when the local agent is frozen). This action proves the full pane -> LLM -> pane loop is working by asking the agent to echo back a random token and watching for it.

Composition

  • action_base: PaneAction ABC + run_action engine.

  • liveness_probe: pure helpers (generate_nonce, pane_has_nonce_echo, pane_is_busy).

Outcome interpretation (via ActionOutcome)

  • SUCCESS: nonce echoed; agent is functionally alive.

  • COMPLETION_TIMEOUT: nonce never appeared; agent is either silent (frozen) or busy beyond our patience. Differentiate by inspecting the stored pane_after for busy markers, or by scheduling a second probe after a back-off.

  • PRECONDITION_FAIL: the pane was busy before we sent; we declined to interrupt an in-flight turn.

  • SEND_ERROR: send_text_and_submit itself raised (e.g. tmux session disappeared).

class scitex_agent_container.actions.nonce_probe.NonceProbeAction(nonce=None)[source]

Bases: PaneAction

Functional liveness probe.

Parameters:

nonce (Optional[str]) – Optional deterministic override. Tests use this to assert against a known token; production callers should leave it None so liveness_probe.generate_nonce() mints a fresh one per run.

name: ClassVar[str] = 'nonce-probe'
__init__(nonce=None)[source]
snapshot(ctx)[source]

Return the observation needed to judge completion.

Called at least twice per run: once before send and once per poll afterwards. The contents are entirely up to the subclass — pane_tail, context_pct, pane_state, any custom field — because the engine does not inspect them; it only forwards them to is_complete/precheck and stores them in the attempt log.

Return type:

dict[str, Any]

precheck(before)[source]

Refuse to probe a currently-busy pane.

Interrupting an in-flight response with our probe would corrupt the user’s actual work and skew quota accounting (the probe’s Repeat <nonce> message would land as a new user turn while the agent was mid-reply to the prior turn). Defer to a later attempt; the caller can retry.

Return type:

bool

before_send(ctx)[source]

Mint the nonce right before send so the before-snapshot’s nonce count is guaranteed to be zero.

Also deposits the nonce into ctx.extras so it lands in the attempt log’s extras column — forensic readers can see exactly what token we asked for.

Return type:

None

send(ctx)[source]

Side-effecting keystroke emission.

Must only use ctx.mux + ctx.session. Raising any exception terminates the run with SEND_ERROR.

Return type:

None

is_complete(before, now)[source]

True iff the nonce appears at least twice in the post-send pane tail: once from our own Repeat <nonce> prompt line, plus at least one more time from the agent’s echo.

Return type:

bool
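The two-occurrence rule can be sketched in a few lines. make_nonce is a hypothetical stand-in for liveness_probe.generate_nonce (whose token format is not documented here), and nonce_echoed is an illustrative name.

```python
import secrets


def make_nonce() -> str:
    """Random token that is unlikely to appear in ordinary pane text."""
    return "nonce-" + secrets.token_hex(4)


def nonce_echoed(pane_tail: str, nonce: str) -> bool:
    """True once the token shows up twice: our prompt plus the agent's echo."""
    return pane_tail.count(nonce) >= 2
```

Requiring two occurrences is what keeps the probe from succeeding on its own prompt line when the agent never answers.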

CompactAction — request a /compact and verify it via context_pct drop.

Why a separate completion signal matters

A probe that checks for a string in the pane (e.g. “Conversation compacted”) would be brittle — Claude Code’s TUI wording changes between releases, and the status message can scroll off before the poll sees it. Using the statusline-reported context_pct gives a numeric, unambiguous “did the compact succeed?” signal.

Outcome interpretation

  • SUCCESS: context_pct dropped by at least min_drop_pct (default 20 percentage points).

  • COMPLETION_TIMEOUT: context didn’t drop enough in time. Could mean: (a) compact hasn’t finished, (b) context_pct_fn returned None throughout (statusline unavailable), (c) the compact was rejected by Claude Code. Operator inspects the pane_after log column.

  • PRECONDITION_FAIL: pane was busy; we declined to interrupt an in-flight turn.

  • SEND_ERROR: send_text_and_submit itself raised.

Operators who want a scheduled auto-compact (interval or threshold) should build that policy layer on top of this action — it is intentionally policy-free.

class scitex_agent_container.actions.compact.CompactAction(min_drop_pct=20.0, command='/compact')[source]

Bases: PaneAction

Send /compact and wait for context_pct to fall.

Parameters:
  • min_drop_pct (float) – Percentage points the context must drop for this action to report SUCCESS. Default 20. Tune per operator preference; higher = stricter.

  • command (str) – Override the submitted text. Default "/compact". Present for tests that want to exercise the flow without assuming a specific Claude Code command string.

name: ClassVar[str] = 'compact'
__init__(min_drop_pct=20.0, command='/compact')[source]
snapshot(ctx)[source]

Return the observation needed to judge completion.

Called at least twice per run: once before send and once per poll afterwards. The contents are entirely up to the subclass — pane_tail, context_pct, pane_state, any custom field — because the engine does not inspect them; it only forwards them to is_complete/precheck and stores them in the attempt log.

Return type:

dict[str, Any]

precheck(before)[source]

Refuse to compact a currently-busy pane.

Return type:

bool

Two separate reasons:
  1. Interrupting an in-flight response would corrupt the user’s work (same as the nonce-probe rationale).

  2. Claude Code occasionally ignores /compact when it is mid-tool-call; the command lands but the TUI does not act on it until the current turn resolves, producing a confusing COMPLETION_TIMEOUT with no actual failure.

send(ctx)[source]

Side-effecting keystroke emission.

Must only use ctx.mux + ctx.session. Raising any exception terminates the run with SEND_ERROR.

Return type:

None

is_complete(before, now)[source]

True iff context_pct dropped by at least min_drop_pct percentage points between the before and now snapshots.

Either side being None (statusline unavailable or not yet populated) always returns False — the engine then either keeps polling until the deadline or reports COMPLETION_TIMEOUT. The attempt log still captures both snapshots so the operator can see why we couldn’t confirm.

Return type:

bool

scitex_agent_container.actions.compact._coerce_float(value)[source]

Tolerant float coercion — returns None for None / bad input. The statusline parser may emit strings or None depending on the Claude Code build, so be forgiving.

Return type:

Optional[float]
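A sketch of how tolerant coercion and the drop check fit together. The function names and the "context_pct" snapshot key are taken from the prose above or invented for illustration; this is not the package's code.

```python
from typing import Any, Optional


def coerce_float(value: Any) -> Optional[float]:
    """Return float(value), or None for None / unparseable input."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def compact_done(before: dict[str, Any], now: dict[str, Any],
                 min_drop_pct: float = 20.0) -> bool:
    """True iff context_pct fell by at least min_drop_pct points."""
    start = coerce_float(before.get("context_pct"))
    current = coerce_float(now.get("context_pct"))
    if start is None or current is None:
        return False  # cannot confirm; the engine keeps polling
    return start - current >= min_drop_pct
```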

Runtimes / Multiplexer

Multiplexer abstraction — dispatches to screen or tmux.

class scitex_agent_container.runtimes.multiplexer.MultiplexerProtocol(*args, **kwargs)[source]

Bases: Protocol

Common interface for screen/tmux managers.

static exists(session_name)[source]
Return type:

bool

static start(session_name, command, workdir, env_exports='', venv='')[source]
Return type:

bool

static stop(session_name)[source]
Return type:

bool

static capture_content(session_name)[source]
Return type:

str

static capture_logs(session_name, lines=50)[source]
Return type:

str

static send_keys(session_name, *keys)[source]
Return type:

None

static send_text_and_submit(session_name, text)[source]
Return type:

None

static attach(session_name)[source]
Return type:

None

__init__(*args, **kwargs)
scitex_agent_container.runtimes.multiplexer.get_multiplexer(config)[source]

Return the appropriate multiplexer class based on config.

Return type:

type

Modular TUI prompt detection and response for Claude Code.

Each prompt handler defines:

  • name: identifier for logging

  • detect(content) -> bool: whether this prompt is visible

  • respond(send_keys) -> None: keystrokes to accept the prompt

  • priority: lower = checked first (default 10)

Add new handlers by appending to PROMPT_HANDLERS or calling register_prompt().

class scitex_agent_container.runtimes.prompts.PromptHandler(name, detect, keys=<factory>, priority=10)[source]

Bases: object

A single TUI prompt detector and responder.

name: str
detect: Callable[[str], bool]
keys: list[str]
priority: int = 10
__init__(name, detect, keys=<factory>, priority=10)
scitex_agent_container.runtimes.prompts._detect_bypass_permissions(content)[source]

Bypass Permissions mode prompt with radio selector.

Return type:

bool

Matches:

  • “1. No, exit”

  • “2. Yes, I accept”

  • “Bypass Permissions”

  • “Enter to confirm”

scitex_agent_container.runtimes.prompts._detect_dev_channels(content)[source]

Development channels loading confirmation.

Return type:

bool

Matches:

  • “1. I am using this for local development”

  • “2. Exit”

  • “development channels” or “dangerously-load-development-channels”

  • “Enter to confirm”

scitex_agent_container.runtimes.prompts._detect_thinking_effort(content)[source]

Thinking effort level selector.

Return type:

bool

Matches:

  • “1. * Medium (recommended)” or similar

  • “thinking” in various casings

  • “Enter to confirm”

scitex_agent_container.runtimes.prompts._detect_skip_permissions_yn(content)[source]

Legacy y/n text prompt for skip-permissions (older Claude Code).

Matches text-based y/n prompts without radio selector.

Return type:

bool

scitex_agent_container.runtimes.prompts._detect_mcp_json_edit(content)[source]

Permission prompt when Claude tries to edit .mcp.json (runtime).

Matches “1. Yes” / “1. Proceed” / “1. Allow” + “.mcp.json” + “Enter to confirm”.

Return type:

bool

scitex_agent_container.runtimes.prompts._detect_press_enter_continue(content)[source]

Generic ‘Press Enter to continue’ runtime pause (context-window warning, etc).

Uses a strict last-5-lines window to avoid scrollback false positives (per pane-state-patterns.md: classify against last 5 visible lines only). Excluded: active tool calls and numbered radio selectors.

Return type:

bool
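A sketch of the last-5-lines window described above. The exact phrase match and the radio-selector regex are assumptions, and the exclusion for active tool calls is left out because its marker is not documented here.

```python
import re

# Assumed shape of a numbered radio option, e.g. "❯ 1. Yes" or "  2. No".
_RADIO_OPTION = re.compile(r"^\s*(?:❯\s*)?\d+\.\s", re.MULTILINE)


def press_enter_pending(content: str, window: int = 5) -> bool:
    """Look for the pause text only in the last `window` lines of the pane."""
    tail = "\n".join(content.splitlines()[-window:])
    if _RADIO_OPTION.search(tail):
        return False  # a numbered selector belongs to another handler
    return "press enter to continue" in tail.lower()
```

Restricting the search to the tail is what prevents old scrollback from re-triggering the handler.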

scitex_agent_container.runtimes.prompts._detect_file_trust(content)[source]

‘Do you trust the files in this folder?’ prompt (first-run or new cwd).

May appear when --dangerously-skip-permissions was not propagated to a subshell. Matches the LEGACY y/n text variant; the new radio-selector variant is handled by _detect_file_trust_radio().

Return type:

bool

scitex_agent_container.runtimes.prompts._detect_file_trust_radio(content)[source]

Radio-selector variant of the file-trust prompt.

Claude Code (>= ~2.1.x) asks “Is this a project you created or one you trust?” with numbered options instead of the legacy y/n text prompt. Appears on the first launch in any un-trusted workdir — including every throwaway tempdir the Haiku integration test uses.

Matches the exact option strings to avoid firing on the bypass-permissions dialog (which also says “Enter to confirm”).

Return type:

bool

scitex_agent_container.runtimes.prompts._detect_external_imports(content)[source]

External CLAUDE.md file imports prompt.

Appears when CLAUDE.md (or .claude/CLAUDE.md) contains @<absolute-path> imports pointing OUTSIDE the agent’s workdir. Triggered by the at-import skill-injection mode (sac PR #74) when skills live in ~/.claude/skills/ or the package source trees rather than the workspace itself.

Return type:

bool

Matches:

  • “Allow external CLAUDE.md file imports?”

  • “1. Yes, allow external imports”

  • “Enter to confirm”

scitex_agent_container.runtimes.prompts._detect_login_method(content)[source]

First-run login-method picker on a fresh HOME.

Appears when Claude Code can’t find OAuth credentials at ~/.claude/.credentials.json. Even with ANTHROPIC_API_KEY set in env, the 2.1.x CLI still asks which auth mode to use before it checks the env var. Blocks startup until dismissed.

Matches the exact option strings to avoid false positives on any user message that happens to say “login method”.

Return type:

bool

scitex_agent_container.runtimes.prompts._detect_theme_selection(content)[source]

First-run theme selection prompt.

Appears only on a fresh HOME (no ~/.claude/ saved theme). On dev machines it never shows, but in CI (a clean ubuntu VM) this is the first thing Claude Code asks. Blocks every downstream startup prompt until acknowledged.

Matches the radio variant: “Choose the text style…” + numbered options starting with “1. Auto (match terminal)”.

Return type:

bool

scitex_agent_container.runtimes.prompts._detect_compose_pending_unsent(content)[source]

Detect unsent text sitting in the Claude Code compose buffer.

The classifier in agent_meta._classify_pane_state matches ❯[ \t]+\S (non-whitespace after the prompt marker on the same line), meaning the user has typed something but not yet pressed Enter. We mirror that pattern here so the prompts system can submit it via a plain Enter keystroke.

Excluded: lines that are just the decorative separator below an empty prompt; those contain only whitespace after ❯.

Return type:

bool
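The pattern quoted above is small enough to show directly. Only the regular expression comes from the text; the function name and the line-by-line scan are illustrative choices.

```python
import re

# Prompt marker, then spaces or tabs, then a non-whitespace character
# on the same line.
_COMPOSE_PENDING = re.compile(r"❯[ \t]+\S")


def has_unsent_text(content: str) -> bool:
    """True if any pane line shows typed-but-unsubmitted text after ❯."""
    return any(_COMPOSE_PENDING.search(line) for line in content.splitlines())
```

Because the character class is [ \t] rather than \s, a bare prompt followed by a newline and a separator line does not match.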

scitex_agent_container.runtimes.prompts._detect_done(content)[source]

Check if claude is at the main input prompt (all TUI prompts done).

The status bar shows “bypass permissions” when ready.

Return type:

bool

scitex_agent_container.runtimes.prompts.register_prompt(handler)[source]

Add a custom prompt handler to the registry.

Return type:

None

scitex_agent_container.runtimes.prompts.detect_and_respond(content, accepted, send_keys_fn)[source]

Check content against all handlers, respond to the first match.

Parameters:
  • content (str) – Captured pane content.

  • accepted (set[str]) – Set of already-accepted prompt names.

  • send_keys_fn (Callable[..., None]) – Callable to send keystrokes (e.g., mux.send_keys).

Return type:

str | None

Returns:

Name of the matched prompt, or None if no match.
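A self-contained sketch of priority-ordered dispatch. Handler mirrors the documented PromptHandler fields, but the explicit handlers argument, the skipping of already-accepted names, and the mutation of accepted are assumptions made so the example can run on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Handler:
    name: str
    detect: Callable[[str], bool]
    keys: list[str] = field(default_factory=list)
    priority: int = 10  # lower = checked first


def detect_and_respond(content: str, accepted: set[str],
                       send_keys_fn: Callable[..., None],
                       handlers: list[Handler]) -> Optional[str]:
    """Respond to the first matching handler and return its name."""
    for handler in sorted(handlers, key=lambda h: h.priority):
        if handler.name in accepted:
            continue  # assumed: each prompt is answered at most once
        if handler.detect(content):
            send_keys_fn(*handler.keys)
            accepted.add(handler.name)
            return handler.name
    return None
```

A startup loop would call this once per pane capture until a readiness check such as is_ready(content) passes.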

scitex_agent_container.runtimes.prompts.is_ready(content)[source]

Check if claude is at the main input prompt (all TUI prompts done).

Return type:

bool