Container Isolation in sac

TL;DR. Apptainer’s defaults prioritize HPC convenience: it auto-binds $HOME, /tmp, /proc, /sys, /dev; inherits every environment variable; shares the host’s network, PID, and IPC namespaces; and uses the host’s UID/GID. For agent workloads — where the container is meant to be a security boundary — these defaults are upside-down. This document enumerates every leak path and the sac-side default flip that closes it.

This is also the reference for the Clew reproducibility experiments: “same SIF + same spec.yaml” can only be a real claim when isolation is declared, mechanically enforced, and externally verifiable. sac’s position: isolation level is a first-class spec.yaml field, sac chooses hardened by default, and the AgentCard publishes the agreed level so peers can audit it.


Apptainer’s default leak paths (10 categories)

1. Auto-bound filesystem paths

apptainer exec without options auto-binds:

| Path | What leaks | Impact |
|---|---|---|
| $HOME | dotfiles, .ssh/, .gitconfig, .bash_history, .cache/, app config | agent can read host identity AND mutate it |
| /tmp | other processes’ temp files, locks, sockets | cross-container contamination, host-state writes |
| /var/tmp | persistent temp files | state persists across runs |
| /proc | every host PID’s metadata | host process introspection |
| /sys | kernel settings, devices | host hardware fingerprint |
| /dev | device nodes (/dev/null, GPU, etc.) | device access |
| $PWD | the cwd at apptainer exec time | accidental scope creep |

Fix: --containall (cuts all of the above in one flag), or --contain --no-home for finer control.

2. Environment variable inheritance

Apptainer forwards essentially every host env var into the container:

| Variable | What leaks |
|---|---|
| $PATH | host bin paths (/usr/local/cuda/bin, etc.): references that don’t exist inside |
| $HOME | host home path: tools that expect a specific $HOME get confused |
| $USER, $LOGNAME | host username |
| $SSH_AUTH_SOCK | host ssh-agent socket: agent inherits ssh-agent identity |
| $DISPLAY, $WAYLAND_DISPLAY | host X / Wayland session |
| $DBUS_SESSION_BUS_ADDRESS | host D-Bus access |
| $AWS_*, $GCP_*, $ANTHROPIC_API_KEY, … | every cloud / API credential |
| $XDG_RUNTIME_DIR | host runtime dir (/run/user/$UID) |
| $LANG, $LC_* | locale (host vs container OS mismatch) |

Fix: --cleanenv wipes everything; pass only what’s needed via --env KEY=VAL.
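The allowlist approach can be sketched in a few lines of Python. This is an illustrative helper, not sac's actual code: the function name `build_env_args` and its parameters are hypothetical, and it only assembles the `--cleanenv` and `--env KEY=VAL` arguments named above.

```python
import os


def build_env_args(allowlist, environ=None):
    """Return apptainer argv fragments: --cleanenv plus one --env per allowed variable.

    Hypothetical helper for illustration; variables not on the allowlist,
    or not set on the host, are never forwarded.
    """
    environ = os.environ if environ is None else environ
    args = ["--cleanenv"]
    for key in allowlist:
        if key in environ:
            args += ["--env", f"{key}={environ[key]}"]
    return args
```

Passing an explicit `environ` mapping keeps the helper easy to test; in real use it would default to the host's `os.environ`, and anything outside the allowlist (cloud credentials, `$SSH_AUTH_SOCK`, and so on) is simply never mentioned on the command line.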

3. User / permission inheritance

| Item | Default | What leaks |
|---|---|---|
| UID / GID | host user inherited as-is | files created in container land on host with host UID |
| supplementary groups | host’s full group list | host’s group permissions effective inside |
| /etc/passwd, /etc/group | host’s partially visible | host user table partially exposed |
| /etc/resolv.conf | host’s used as-is | host DNS config |
| /etc/hosts | host’s used as-is | host hostname map |

Fix: --fakeroot for UID 0 mapping (with caveats — needs user namespaces in the kernel), or --no-mount /etc/passwd,/etc/group.

4. Network transparency

Default: the container shares the host’s network namespace.

| Leak | What’s possible |
|---|---|
| host loopback (127.0.0.1) | container reaches 127.0.0.1:7878 (sac listen) directly |
| host Unix sockets | /var/run/docker.sock, /run/postgresql/..., etc. |
| host /etc/resolv.conf | host DNS resolver |
| every host-bound port | every service the host exposes is reachable |
| host interfaces | eth0, wlan0 directly addressable |

This is the deepest one for sac. A container running an agent should NOT be able to bypass sac listen’s bearer auth by talking to its loopback directly. With shared netns it can.

Fix: --net --network=none (full isolation; agent loses Claude API access) OR --net --network=bridge (independent netns + explicit egress allowlist for api.anthropic.com). Bridge + allowlist is the realistic answer; pure isolation kills the agent.

Realistic trade-off (current sac default). sac currently uses the host netns — agents reach sac listen on 127.0.0.1 (A2A) and any MCP server on host loopback (orochi push channels, etc.) over the same path. Naive --network=bridge would isolate host services but break both A2A and MCP-over-loopback. The realistic path forward is --network=bridge + binding sac listen on the bridge interface + injecting an sac-host hostname into /etc/hosts so MCP URLs stay transport-stable (e.g. http://sac-host:7878) — none of which is shipped yet. We accept the host-loopback exposure today as a known limitation; see roadmap row below.
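One way to observe the host-loopback exposure from inside a container is a plain TCP connect probe. The sketch below is an assumption-laden illustration (the function name `loopback_port_open` is hypothetical, and it says nothing about how sac itself checks this): under a shared netns, probing 127.0.0.1:7878 while sac listen is running would be expected to return True, and under an isolated netns False.

```python
import socket


def loopback_port_open(port, host="127.0.0.1", timeout=0.5):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat all as "not reachable".
        return False
```

A probe like this only shows reachability, not whether bearer auth was bypassed; it is meant as a quick sanity check of which network namespace the agent actually landed in.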

5. Process / IPC / UTS namespaces

| Item | Default | Impact |
|---|---|---|
| PID namespace | shared | ps aux sees host PIDs; kill can target host processes (subject to UID checks) |
| IPC namespace | shared | host SysV IPC + shared memory accessible |
| UTS namespace | shared | hostname returns host’s name |
| cgroup | inherited | resource limits hit host-wide, not container-wide |

Fix: --pid, --ipc, --uts. Note that --containall already implies --pid and --ipc (along with --cleanenv); --uts is NOT included and must be specified separately.

6. SIF build-time leaks (Apptainer-specific)

The SIF itself can carry host-derived metadata baked at build time:

| Source | Example leak |
|---|---|
| .def %files | absolute host paths copied verbatim into the layer |
| build host’s /etc/passwd | snapshotted into the SIF |
| env-var-passed credentials | accidentally baked into a layer |
| absolute symlinks | targets pointing at the BUILD host |

Implication for Clew: “same SIF hash = same environment” is only true if the SIF was built without host-specific data leaking in.

Fix:

  • Build SIFs with --fix-perms --force (--fix-perms normalizes file permissions on the container content; --force overwrites any stale image at the output path)

  • Use ARG (build-time variables) in .def, never hardcoded paths

  • Build on a clean / CI host, not a developer laptop

7. Overlay state persistence

overlay.img is a writable layer that persists across container restarts. It’s a feature (hot-start cache for pip installs) AND a leak vector (previous-run state contaminates the next run):

| Persists across runs | Risk |
|---|---|
| /tmp writes | accumulating garbage; race conditions if two runs overlap |
| ~/.cache/pip, build artifacts | unintended reproducibility break (same SIF, different overlay → different result) |
| Claude session state | prior conversation leaks into next start |
| accidentally shared overlay | horizontal contamination between agents |

Fix:

  • Clew experiments: ephemeral overlay (create on start, destroy on stop)

  • Persistent overlay: include the overlay hash in the verification chain

  • One agent ↔ one overlay; never share
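Including a persistent overlay in the verification chain requires a stable digest of the image file. A minimal sketch, assuming SHA-256 is an acceptable digest (the source does not name one) and using a hypothetical helper name `file_sha256`:

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, read in chunks so large overlays fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The digest is only meaningful if it is taken while no container has the overlay mounted writable; recording it before launch and again after stop would show whether the run changed the overlay.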

Declarative auto-create (sac). sac drives overlay provisioning from the spec so new peers don’t require a manual apptainer overlay create setup step:

spec:
  apptainer:
    overlay: proj-<peer>.overlay.img   # path (workdir-relative ok)
    overlay_size: "5G"                  # units: M/MB/G/GB only
    overlay_create_if_missing: true     # default; set false to gate off

Semantics:

  • overlay_size set + overlay path missing + overlay_create_if_missing true (default) → sac runs apptainer overlay create --size <MB> <path> before launch.

  • overlay_size set but overlay_create_if_missing: false → sac raises FileNotFoundError (operator must pre-create).

  • overlay_size empty + overlay missing → sac raises FileNotFoundError early with a helpful message instead of letting apptainer fail cryptically at exec time.

  • K/KB units are explicitly rejected — apptainer’s overlay create --size takes integer MB.
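The four rules above can be read as a small decision function. The following is a sketch of that logic, not sac's implementation: the names `parse_overlay_size_mb` and `plan_overlay` are hypothetical, and treating G/GB as 1024 MB is an assumption (the source only states that apptainer receives an integer number of MB).

```python
import re

# Assumption: G/GB are converted at 1024 MB each; K/KB are deliberately absent.
_UNIT_TO_MB = {"M": 1, "MB": 1, "G": 1024, "GB": 1024}


def parse_overlay_size_mb(size):
    """Convert a string such as "5G" or "512MB" to an integer number of MB."""
    match = re.fullmatch(r"(\d+)\s*([A-Za-z]+)", size.strip())
    if match is None or match.group(2) not in _UNIT_TO_MB:
        raise ValueError(f"unsupported overlay_size {size!r}; use M, MB, G or GB")
    return int(match.group(1)) * _UNIT_TO_MB[match.group(2)]


def plan_overlay(path, exists, overlay_size="", create_if_missing=True):
    """Return the create command to run before launch, or None if the overlay exists."""
    if exists:
        return None
    if not overlay_size:
        raise FileNotFoundError(f"overlay {path} is missing and overlay_size is not set")
    if not create_if_missing:
        raise FileNotFoundError(f"overlay {path} is missing; pre-create it")
    size_mb = parse_overlay_size_mb(overlay_size)
    return ["apptainer", "overlay", "create", "--size", str(size_mb), path]
```

For example, `plan_overlay("proj-a.overlay.img", exists=False, overlay_size="5G")` yields a command with `--size 5120`, while the same call with `overlay_size="512K"` raises ValueError before apptainer is ever invoked.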

8. --writable vs --writable-tmpfs confusion

  • --writable — modifies the SIF itself. Destroys SIF immutability. Used by accident → SIF hash changes → “same SIF” claim broken.

  • --writable-tmpfs — tmpfs (in RAM); writes are ephemeral. Safe. Cannot combine with --overlay <image>; pick one.

  • --overlay — writable image file; persists. Safe for reproducibility if the overlay hash is recorded.

Fix for Clew: never --writable. Use --overlay (recorded) or --writable-tmpfs (ephemeral). For sac per-agent auditors we use --overlay so pip’s editable installs persist for hot-start; tmpfs is given up because the overlay catches /tmp writes anyway.

9. Indirect leakage via bind-mounts

The most subtle category — the container respects :ro on the bind itself, but the bound content can chain to host state:

| Path | How it leaks |
|---|---|
| ~/.ssh bound | ssh out to other hosts → mutate them OR receive callback → mutate this host |
| ~/proj/foo/.venv/bin/python symlinks to /opt/... | container follows the symlink, lands on container’s /opt (overlay) OR host /opt depending on bind |
| .git/config with [url "..."] directives | git clone reaches unexpected remotes |
| Python .pth editable-install files | sys.path lands on bound dirs containing more symlinks |
| Symlinks inside a bind-mount pointing OUT of the mount | container follows out of the mount to wherever the symlink points |

This is what bit the scitex-stats-auditor PoC. The bound ~/proj/scitex-stats/.venv/bin/python was a symlink pointing at /opt/python3.12/bin/python3.12. The container resolved it against its OWN /opt, found the path missing, and “fixed” it inside the overlay. The agent believed it had patched the host; it had only changed the overlay.

Fix:

  • Don’t bind ~/.ssh unless the agent provably needs it (most don’t)

  • Don’t bind venvs at all if you only need source — bind src/ only

  • Preflight: assert known host-only paths are NOT visible inside (see §sac defaults below)
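Symlinks that point out of a bind source can be found on the host before the directory is ever bound. The sketch below is one possible host-side check, with a hypothetical function name `find_escaping_symlinks`; it resolves targets against the host filesystem, so it flags where a link leaves the tree but cannot say what the same path will resolve to inside the container.

```python
import os


def find_escaping_symlinks(root):
    """Return sorted paths of symlinks under root whose resolved target lies outside root."""
    root_real = os.path.realpath(root)
    escaping = []
    # os.walk does not descend into symlinked directories by default,
    # but it still lists them in dirnames, so both lists are inspected.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                target = os.path.realpath(path)
                if os.path.commonpath([root_real, target]) != root_real:
                    escaping.append(path)
    return sorted(escaping)
```

Run against a project directory that contains a .venv, a check like this would be expected to report bin/python style links into /opt or /usr, which is a strong hint to bind src/ only.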

10. Apptainer version differences

Behavior varies across versions:

| Version | Difference |
|---|---|
| < 1.0 (legacy Singularity) | --containall slightly different scope |
| 1.0 to 1.2 | /dev bind looser |
| 1.3+ | some isolation tightened by default |
| any | --fakeroot requires user namespaces (host kernel config) |

Implication for Clew: “same SIF + same spec.yaml” can produce different results across Apptainer versions. Record apptainer --version in the experiment metadata.
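Recording the version is easier to compare across hosts when it is stored as a parsed tuple as well as the raw string. A minimal sketch, assuming the command prints a line such as "apptainer version 1.3.4" (the exact output format is an assumption, and `parse_apptainer_version` is a hypothetical helper):

```python
import re


def parse_apptainer_version(output):
    """Extract (major, minor, patch) from the text printed by `apptainer --version`."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError(f"no version number found in {output!r}")
    # A missing patch component is recorded as 0.
    return tuple(int(part) if part else 0 for part in match.groups())
```

Keeping the raw output next to the tuple in the experiment metadata preserves suffixes (distribution build tags and the like) that the tuple discards.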


sac’s hardened defaults

For agent workloads, sac’s recommended apptainer.raw_args (and where it’s safe to make these defaults) closes most leak paths:

spec:
  apptainer:
    raw_args:
      - "--containall"        # §1: filesystem isolation
      - "--cleanenv"          # §2: environment isolation
      # Network (§4): pick one based on workload —
      #   --net --network=none  → no egress (most secure; agent can't reach Claude API)
      #   --net --network=bridge  → bridged + allowlist (realistic for Claude agents)
      # PID / IPC (§5): already implied by --containall.
      # UTS is not; add "--uts" if you need a private hostname.

Canonical container HOME — /home/agent (D5)

Apptainer’s default behavior sets $HOME inside the container from the host operator’s passwd entry (e.g. /home/ywatanabe). Even under --containall the directory is scaffolded as a side effect, so any bind whose target sits under /home/<operator>/ populates $HOME and makes spec.yaml operator-specific.

sac auto-injects --home /home/agent (skipped only when apptainer.relaxed: true). Inside the container:

  • $HOME == /home/agent, regardless of the operator’s host username.

  • Bind targets use the canonical HOME: ~/proj/foo:/home/agent/proj/foo:ro.

  • The host operator’s home (/home/ywatanabe, /home/alice, …) is never created inside the container.

  • spec.yaml is operator-agnostic — the same file runs identically on any operator’s host.
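The source-side expansion and canonical target can be illustrated with a short helper. This is a sketch under stated assumptions, not sac's code: `canonical_bind` is a hypothetical name, and it only handles sources written as ~/... for simplicity.

```python
import posixpath

CANONICAL_HOME = "/home/agent"  # container-side HOME described above


def canonical_bind(source, operator_home, mode="ro"):
    """Expand a ~/-relative source against the operator's home and map it under /home/agent."""
    if not source.startswith("~/"):
        raise ValueError(f"expected a ~/-relative source, got {source!r}")
    relative = source[2:]
    host_path = posixpath.join(operator_home, relative)
    target = posixpath.join(CANONICAL_HOME, relative)
    return f"{host_path}:{target}:{mode}"
```

Two operators with different usernames get different host-side paths but the same container-side target, which is the property that lets one spec.yaml run unchanged on either host.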

Per-agent binds declare exactly what’s allowed; nothing else visible:

    binds:
      # Source side: ~ expands to operator's host home (sac expands it
      # before passing to apptainer). Target side: canonical
      # /home/agent/... — operator-independent.
      - ~/proj/<one-package>:/home/agent/proj/<one-package>:ro
      # NEVER bind ~/.ssh, ~/.gitconfig, ~/.claude unless the agent
      # provably needs them — reduces blast radius of indirect leaks (§9).

Preflight checks are sac-injected as a bash -c wrapper around the inner command; they run BEFORE any operator startup_commands:

# D5 preflight (auto-injected; not in spec.yaml)
test "$(id -u)" != "0" || exit 11           # not root (or userns-fakeroot — see below)
test "$HOME" = "/home/agent" || exit 12      # canonical HOME — drift = misconfigured

Interaction with apptainer.fakeroot: true. Under fakeroot the in-container id -u returns 0, which would normally trip the first check. sac detects fakeroot at preflight time via /proc/self/uid_map: when the map is a single-line 0 <host_uid> 1 with host_uid != 0, the agent is running as userns-fakeroot (root inside, operator on host) — the root-check is treated as passed. A bare id -u == 0 without that map (somehow escaped to real root) still fails fast.
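The uid_map rule described above is simple enough to express directly. The sketch below parses the text of /proc/self/uid_map and applies the single-line `0 <host_uid> 1` test; the function name `is_userns_fakeroot` is hypothetical, and the real preflight runs as a shell wrapper rather than Python.

```python
def is_userns_fakeroot(uid_map_text):
    """Return True when uid_map is a single line mapping uid 0 to one non-root host uid."""
    rows = [line.split() for line in uid_map_text.splitlines() if line.strip()]
    if len(rows) != 1 or len(rows[0]) != 3:
        return False
    try:
        inside_uid, host_uid, length = (int(field) for field in rows[0])
    except ValueError:
        return False
    return inside_uid == 0 and host_uid != 0 and length == 1
```

Real root outside any user namespace typically shows a map like `0 0 4294967295`, which this test rejects, so a bare uid 0 without the fakeroot mapping would still fail the root check.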

Two static lines, attestable by hash. The “no host leak” property falls out of --containall (no auto-mounts) + canonical HOME (no operator-specific $HOME scaffolding) + the declared binds: list (reviewable on the AgentCard). Any leak hard-stops the agent with an explicit error.


Roadmap for sac’s isolation surface

| Item | Status |
|---|---|
| apptainer.raw_args field (operator-declared) | ✅ shipped |
| Per-agent preflight in startup_commands | ✅ pattern documented (this doc) |
| Default --containall in apptainer argv if operator doesn’t override | ✅ shipped (auto-prepended when apptainer.relaxed: false) |
| apptainer.relaxed: true opt-out to disable hardened defaults | ✅ shipped (spec.apptainer.relaxed) |
| Default --cleanenv + --writable-tmpfs auto-prepend (D1) | ✅ shipped |
| sac-injected static preflight (D2; refined to D5 invariants) | ✅ shipped |
| AgentCard structured isolation block (D3) | ✅ shipped |
| sac agents check warns on host-mirroring bind targets (D4) | ✅ shipped |
| Canonical container $HOME=/home/agent auto-injected via --home (D5) | ✅ shipped |
| apptainer.fakeroot: true opt-in (userns root inside container) | ✅ shipped |
| Network: shared host netns for MCP/A2A interop (known limitation) | ⏳ planned migration to --network=bridge + bridge-IF bind + sac-host hostname injection; keeps MCP URLs transport-stable while closing host-loopback exposure |
| sac image overlay {init,reset,prune} for ephemeral-overlay workflows | ⏳ planned |

The AgentCard field is the differentiator: external verifiers (orochi, Clew) can attest “this agent ran at isolation level X” by reading the card alone, no SIF introspection required. The shape published at /.well-known/agent-card.json (D3 + D5):

"x-scitex-agent-container": {
  "isolation": {
    "level": "hardened",
    "containall": true,
    "cleanenv": true,
    "writable_tmpfs": false,
    "home_canonical": "/home/agent",
    "fakeroot": false,
    "preflight_passed": ["uid-nonzero", "home-canonical"],
    "preflight_allowed": [],
    "binds_count": 3,
    "binds_writable_count": 0
  }
}

level: hardened is the human shorthand for “all booleans align with sac defaults + preflight_allowed: []”. A run with any opt-out (relaxed: true, fakeroot: true, an entry in preflight_allowed) publishes level: custom plus the explicit booleans, so the verifier sees exactly what changed instead of a flat “non-standard” label.
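A verifier could recompute the level from the explicit fields instead of trusting the label. The sketch below is one possible reading of the rule, not the published algorithm: `isolation_level` is a hypothetical function, the set of fields it checks is an assumption, and `writable_tmpfs` is deliberately left out because the sample card shows it as false at level hardened.

```python
def isolation_level(block):
    """Derive "hardened" or "custom" from the AgentCard isolation block."""
    hardened = (
        block.get("containall") is True
        and block.get("cleanenv") is True
        and block.get("fakeroot") is False
        and block.get("home_canonical") == "/home/agent"
        and block.get("preflight_allowed") == []
    )
    return "hardened" if hardened else "custom"
```

Comparing the recomputed value with the published `level` gives a cheap consistency check: a card that claims hardened while listing an entry in `preflight_allowed` would be flagged without any SIF introspection.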

One-line summary for papers / READMEs

Apptainer’s default behavior prioritizes HPC convenience: it auto-binds the user’s home, /tmp, /proc, /sys, and /dev; inherits all environment variables; shares the host’s network, PID, and IPC namespaces; and uses the host’s UID/GID. For reproducibility, all of these defaults must be inverted. sac uses --containall (filesystem isolation), --cleanenv (environment isolation), --home /home/agent (canonical operator-independent HOME), --net --network=none or controlled bridge (network isolation), and explicit --bind declarations for every host path that must be visible. A sac-injected two-line preflight verifies id -u != 0 and $HOME == /home/agent at boot and fails hard on any breach.

See also