Container Isolation in sac
TL;DR. Apptainer’s defaults prioritize HPC convenience: it auto-binds $HOME, /tmp, /proc, /sys, and /dev; inherits every environment variable; shares the host’s network, PID, and IPC namespaces; and uses the host’s UID/GID. For agent workloads, where the container is meant to be a security boundary, these defaults are upside-down. This document enumerates every leak path and the sac-side default flip that closes it.
This is also the reference for the Clew reproducibility experiments: “same SIF + same spec.yaml” can only be a real claim when isolation is declared, mechanically enforced, and externally verifiable. sac’s position: isolation level is a first-class spec.yaml field, sac chooses hardened by default, and the AgentCard publishes the agreed level so peers can audit it.
Apptainer’s default leak paths (10 categories)
1. Auto-bound filesystem paths
apptainer exec without options auto-binds:
| Path | What leaks | Impact |
|---|---|---|
| | dotfiles, | agent can read host identity AND mutate it |
| | other processes’ temp files, locks, sockets | cross-container contamination, host-state writes |
| | persistent temp files | state persists across runs |
| | every host PID’s metadata | host process introspection |
| | kernel settings, devices | host hardware fingerprint |
| | device nodes ( | device access |
| | the cwd at | accidental scope creep |
Fix: --containall (cuts all of the above in one flag), or
--contain --no-home for finer control.
2. Environment variable inheritance
Apptainer forwards essentially every host env var into the container:
| Variable | What leaks |
|---|---|
| | host bin paths ( |
| | host home path; tools that expect a specific |
| | host username |
| | host ssh-agent socket; agent inherits ssh-agent identity |
| | host X / Wayland session |
| | host D-Bus access |
| | every cloud / API credential |
| | host runtime dir ( |
| | locale (host vs container OS mismatch) |
Fix: --cleanenv wipes everything; pass only what’s needed via
--env KEY=VAL.
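The allowlist approach above can be sketched as a small argv builder. This is a minimal illustration, not sac's implementation: the function name build_exec_argv, the variable EXAMPLE_API_KEY, and the file name agent.sif are hypothetical; only the --containall, --cleanenv, and --env flags come from the text.

```python
import shlex


def build_exec_argv(sif, command, env_allowlist, host_env):
    """Build an `apptainer exec` argv that forwards only allowlisted variables."""
    argv = ["apptainer", "exec", "--containall", "--cleanenv"]
    for key in sorted(env_allowlist):
        if key in host_env:
            # Every forwarded variable is explicit, so it is visible in review.
            argv += ["--env", f"{key}={host_env[key]}"]
    return argv + [sif, *command]


if __name__ == "__main__":
    # Hypothetical host environment: one wanted variable, one that must not leak.
    host_env = {"EXAMPLE_API_KEY": "dummy", "SSH_AUTH_SOCK": "/run/user/1000/ssh.sock"}
    argv = build_exec_argv("agent.sif", ["python", "-V"], {"EXAMPLE_API_KEY"}, host_env)
    print(shlex.join(argv))
```

Values passed with --env appear in the host's process list; for real secrets, Apptainer's --env-file option is one alternative worth considering.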
3. User / permission inheritance
| Item | Default | What leaks |
|---|---|---|
| UID / GID | host user inherited as-is | files created in container land on host with host UID |
| supplementary groups | host’s full group list | host’s group permissions effective inside |
| | host’s partially visible | host user table partially exposed |
| | host’s used as-is | host DNS config |
| | host’s used as-is | host hostname map |
Fix: --fakeroot for UID 0 mapping (with caveats — needs user
namespaces in the kernel), or --no-mount /etc/passwd,/etc/group.
4. Network transparency
Default: the container shares the host’s network namespace.
| Leak | What’s possible |
|---|---|
| host loopback (127.0.0.1) | container reaches |
| host Unix sockets | |
| host | host DNS resolver |
| every host-bound port | every service the host exposes is reachable |
| host interfaces | |
This is the deepest one for sac. A container running an agent should NOT be able to bypass sac listen’s bearer auth by talking to its loopback directly. With shared netns it can.
Fix: --net --network=none (full isolation; agent loses Claude
API access) OR --net --network=bridge (independent netns + explicit
egress allowlist for api.anthropic.com). Bridge + allowlist is the
realistic answer; pure isolation kills the agent.
Realistic trade-off (current sac default). sac currently uses the
host netns — agents reach sac listen on 127.0.0.1 (A2A) and any
MCP server on host loopback (orochi push channels, etc.) over the same
path. Naive --network=bridge would isolate host services but break
both A2A and MCP-over-loopback. The realistic path forward is
--network=bridge + binding sac listen on the bridge interface +
injecting an sac-host hostname into /etc/hosts so MCP URLs stay
transport-stable (e.g. http://sac-host:7878) — none of which is
shipped yet. We accept the host-loopback exposure today as a known
limitation; see roadmap row below.
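The bridge path described above is not shipped. Purely to illustrate its URL side, a helper could rewrite loopback MCP URLs to the sac-host alias so that specs keep one stable address. The function name to_bridge_url and the set of loopback names are assumptions made for this sketch.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the host spellings that mean "host loopback".
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


def to_bridge_url(url, alias="sac-host"):
    """Rewrite a loopback URL to use the bridge-side hostname alias."""
    parts = urlsplit(url)
    if parts.hostname not in LOOPBACK_HOSTS:
        return url  # already addressed to a non-loopback host
    # Keep the port if one was given; any userinfo is intentionally dropped.
    netloc = alias if parts.port is None else f"{alias}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(to_bridge_url("http://127.0.0.1:7878/mcp"))
```

The alias only resolves inside the container if the matching /etc/hosts entry is injected, which is part of the same unshipped work.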
5. Process / IPC / UTS namespaces
| Item | Default | Impact |
|---|---|---|
| PID namespace | shared | |
| IPC namespace | shared | host SysV IPC + shared memory accessible |
| UTS namespace | shared | |
| cgroup | inherited | resource limits hit host-wide, not container-wide |
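Fix: --pid, --ipc, --uts. Specify them explicitly: the Apptainer CLI help
describes --containall as also containing PID, IPC, and environment, but
UTS is not part of it, and spelling the flags out keeps the isolation
visible in spec.yaml.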
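One way to check whether a namespace is actually shared is to compare the namespace links under /proc. On Linux each link under /proc/<pid>/ns/ reads as a string such as pid:[4026531836]; identical inode numbers on host and in the container mean the namespace is shared. The helper names below are hypothetical, and the sketch assumes a Linux /proc.

```python
import os


def ns_id(link_target):
    """Split a namespace link such as 'pid:[4026531836]' into (kind, inode)."""
    kind, _, rest = link_target.partition(":[")
    if not rest.endswith("]"):
        raise ValueError(f"not a namespace link: {link_target!r}")
    return kind, int(rest[:-1])


def read_namespaces(pid="self", kinds=("pid", "ipc", "uts", "net")):
    """Read namespace inodes for a process from /proc (Linux only)."""
    out = {}
    for kind in kinds:
        name, inode = ns_id(os.readlink(f"/proc/{pid}/ns/{kind}"))
        out[name] = inode
    return out


def shared_namespaces(ours, theirs):
    """Return the namespace kinds whose inode is identical in both mappings."""
    return sorted(k for k in ours if k in theirs and ours[k] == theirs[k])
```

A caller would record read_namespaces() on the host, run it again inside the container, and pass both mappings to shared_namespaces; an empty result for the kinds of interest indicates they are isolated.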
6. SIF build-time leaks (Apptainer-specific)
The SIF itself can carry host-derived metadata baked at build time:
| Source | Example leak |
|---|---|
| | absolute host paths copied verbatim into the layer |
| build host’s | snapshotted into the SIF |
| env-var-passed credentials | accidentally baked into a layer |
| absolute symlinks | targets pointing at the BUILD host |
Implication for Clew: “same SIF hash = same environment” is only true if the SIF was built without host-specific data leaking in.
Fix:
Build SIFs with --fix-perms --force (normalizes host metadata)
Use ARG (build-time variables) in .def, never hardcoded paths
Build on a clean / CI host, not a developer laptop
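A pre-build lint can catch hardcoded host paths before they are baked into a SIF. The sketch below is an assumption-laden illustration: it treats anything under /home/<user> or /Users/<user> as host-specific, exempts the canonical /home/agent, and uses a hypothetical function name find_host_paths.

```python
import re

# Assumption: a "host-specific" path is one under /home/<user> or /Users/<user>.
HOST_PATH = re.compile(r"/(?:home|Users)/[^/\s:]+")


def find_host_paths(def_text, allowed=("/home/agent",)):
    """Return (line_number, path_prefix) for each host-specific path in a .def file."""
    hits = []
    for lineno, line in enumerate(def_text.splitlines(), start=1):
        if line.lstrip().startswith("#"):
            continue  # comments do not end up in the image
        for match in HOST_PATH.finditer(line):
            if match.group(0) not in allowed:
                hits.append((lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = "%files\n    /home/alice/proj/app /opt/app\n"
    for lineno, prefix in find_host_paths(sample):
        print(f"line {lineno}: host-specific path {prefix}")
```

A regex check like this only finds literal paths in the definition file; it cannot see paths that arrive through copied files or environment variables.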
7. Overlay state persistence
overlay.img is a writable layer that persists across container
restarts. It’s a feature (hot-start cache for pip installs) AND a
leak vector (previous-run state contaminates the next run):
| Persists across runs | Risk |
|---|---|
| | accumulating garbage; race conditions if two runs overlap |
| | unintended reproducibility break (same SIF, different overlay → different result) |
| Claude session state | prior conversation leaks into next start |
| accidentally shared overlay | horizontal contamination between agents |
Fix:
Clew experiments: ephemeral overlay (create on start, destroy on stop)
Persistent overlay: include the overlay hash in the verification chain
One agent ↔ one overlay; never share
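Including the overlay hash in the verification chain can be as simple as hashing the image file alongside the SIF. The record layout and the function names below are assumptions for illustration, not sac's actual schema.

```python
import hashlib


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verification_record(sif_path, overlay_path=None):
    """Build a record with the SIF hash and, for persistent overlays, the overlay hash."""
    record = {"sif_sha256": sha256_file(sif_path)}
    if overlay_path is not None:
        record["overlay_sha256"] = sha256_file(overlay_path)
    return record
```

For an ephemeral overlay the overlay_path argument is simply omitted, since the layer is created on start and destroyed on stop.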
8. --writable vs --writable-tmpfs confusion
--writable: modifies the SIF itself. Destroys SIF immutability. Used by accident → SIF hash changes → “same SIF” claim broken.
--writable-tmpfs: tmpfs (in RAM); writes are ephemeral. Safe. Cannot combine with --overlay <image>; pick one.
--overlay: writable image file; persists. Safe for reproducibility if the overlay hash is recorded.
Fix for Clew: never --writable. Use --overlay (recorded) or
--writable-tmpfs (ephemeral). For sac per-agent auditors we use
--overlay so pip’s editable installs persist for hot-start; tmpfs
is given up because the overlay catches /tmp writes anyway.
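The two rules above (never --writable; do not mix --writable-tmpfs with --overlay) are easy to lint before launch. A minimal sketch, assuming raw_args is a flat list of strings and using a hypothetical function name:

```python
def check_write_flags(raw_args):
    """Return a list of problems with write-related flags in raw_args."""
    # Normalize "--flag=value" to "--flag" so both spellings are recognized.
    flags = {arg.split("=", 1)[0] for arg in raw_args if arg.startswith("--")}
    problems = []
    if "--writable" in flags:
        problems.append("--writable breaks SIF immutability")
    if "--writable-tmpfs" in flags and "--overlay" in flags:
        problems.append("--writable-tmpfs cannot be combined with --overlay")
    return problems


if __name__ == "__main__":
    print(check_write_flags(["--containall", "--writable-tmpfs", "--overlay", "overlay.img"]))
```

Matching on the exact flag name matters here: --writable-tmpfs must not be mistaken for --writable.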
9. Indirect leakage via bind-mounts
The most subtle category — the container respects :ro on the bind
itself, but the bound content can chain to host state:
| Path | How it leaks |
|---|---|
| | ssh out to other hosts → mutate them OR receive callback → mutate this host |
| | container follows the symlink, lands on container’s |
| | |
| Python | |
| Symlinks inside a bind-mount pointing OUT of the mount | container follows out of the mount to wherever the symlink points |
This is what bit the scitex-stats-auditor PoC. The bound
~/proj/scitex-stats/.venv/bin/python was a symlink to
/opt/python3.12/bin/python3.12. The container resolved it against its
OWN /opt, decided the path was missing, and “fixed” it inside the
overlay. The agent believed it had patched the host.
Fix:
Don’t bind ~/.ssh unless the agent provably needs it (most don’t)
Don’t bind venvs at all if you only need source; bind src/ only
Preflight: assert known host-only paths are NOT visible inside (see §sac defaults below)
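Symlinks that point out of a bind source can be found on the host before the bind is declared. The following sketch walks a directory and reports links whose resolved target lies outside it; the function name escaping_symlinks is hypothetical.

```python
import os


def escaping_symlinks(root):
    """Yield (link_path, resolved_target) for symlinks under root that leave root."""
    root_real = os.path.realpath(root)
    # followlinks=False: report symlinked directories without descending into them.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.path.realpath(path)
            # commonpath compares whole path components, not string prefixes.
            if os.path.commonpath([root_real, target]) != root_real:
                yield path, target


if __name__ == "__main__":
    for link, target in escaping_symlinks("."):
        print(f"{link} -> {target}")
```

Run against a project directory with a virtualenv, this would typically flag the interpreter link in .venv/bin, which is the pattern from the PoC above.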
10. Apptainer version differences
Behavior varies across versions:
| Version | Difference |
|---|---|
| | |
| | |
| | some isolation tightened by default |
| any | |
Implication for Clew: “same SIF + same spec.yaml” can produce
different results across Apptainer versions. Record apptainer --version in the experiment metadata.
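Recording the runtime version can be automated at experiment start. This sketch shells out to apptainer --version and tolerates a missing binary; the metadata keys and function names are assumptions, not a Clew schema.

```python
import shutil
import subprocess


def apptainer_version(binary="apptainer"):
    """Return the output of `<binary> --version`, or None if it is unavailable."""
    exe = shutil.which(binary)
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True, check=False)
    return result.stdout.strip() or None


def experiment_metadata(sif_sha256):
    """Attach the runtime version to a (hypothetical) experiment metadata record."""
    return {"sif_sha256": sif_sha256, "apptainer_version": apptainer_version()}


if __name__ == "__main__":
    print(experiment_metadata("<sif hash goes here>"))
```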
sac’s hardened defaults
For agent workloads, sac’s recommended apptainer.raw_args (and
where it’s safe to make these defaults) closes most leak paths:
spec:
  apptainer:
    raw_args:
      - "--containall"   # §1: filesystem isolation
      - "--cleanenv"     # §2: environment isolation
      # Network (§4): pick one based on workload:
      #   --net --network=none    → no egress (most secure; agent can't reach Claude API)
      #   --net --network=bridge  → bridged + allowlist (realistic for Claude agents)
      # PID / IPC / UTS (§5): add explicitly if you need them.
      # "--pid", "--ipc", "--uts"
Canonical container HOME — /home/agent (D5)
Apptainer’s default behavior sets $HOME inside the container from
the host operator’s passwd entry (e.g. /home/ywatanabe). Even under
--containall the directory is scaffolded as a side-effect, so any
bind whose target descends /home/<operator>/ populates $HOME and
makes spec.yaml operator-specific.
sac auto-injects --home /home/agent (skipped only when
apptainer.relaxed: true). Inside the container:
$HOME == /home/agent, regardless of the operator’s host username.
Bind targets use the canonical HOME: ~/proj/foo:/home/agent/proj/foo:ro.
The host operator’s home (/home/ywatanabe, /home/alice, …) is never created inside the container.
spec.yaml is operator-agnostic: the same file runs identically on any operator’s host.
Per-agent binds declare exactly what’s allowed; nothing else visible:
binds:
# Source side: ~ expands to operator's host home (sac expands it
# before passing to apptainer). Target side: canonical
# /home/agent/... — operator-independent.
- ~/proj/<one-package>:/home/agent/proj/<one-package>:ro
# NEVER bind ~/.ssh, ~/.gitconfig, ~/.claude unless the agent
# provably needs them — reduces blast radius of indirect leaks (§9).
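The source-side expansion and the canonical-target convention can be sketched as one normalization step. This is an illustration only: the function name normalize_bind is hypothetical, and rejecting every target outside /home/agent is an assumption that is stricter than the text requires.

```python
import posixpath

CONTAINER_HOME = "/home/agent"


def normalize_bind(spec, host_home):
    """Expand '~' on the source side and require a target under the canonical HOME."""
    src, dst, *opts = spec.split(":")
    if src == "~" or src.startswith("~/"):
        src = host_home + src[1:]
    # Collapse "..", so "/home/agent/../alice" cannot slip through.
    dst = posixpath.normpath(dst)
    if dst != CONTAINER_HOME and not dst.startswith(CONTAINER_HOME + "/"):
        raise ValueError(f"bind target {dst!r} is outside {CONTAINER_HOME}")
    return ":".join([src, dst, *opts])


if __name__ == "__main__":
    print(normalize_bind("~/proj/foo:/home/agent/proj/foo:ro", "/home/alice"))
```

Because only the source side depends on the operator, two operators running the same spec produce different sources but identical in-container paths.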
Preflight checks are sac-injected as a bash -c wrapper around
the inner command; they run BEFORE any operator startup_commands:
# D5 preflight (auto-injected; not in spec.yaml)
test "$(id -u)" != "0" || exit 11 # not root (or userns-fakeroot — see below)
test "$HOME" = "/home/agent" || exit 12 # canonical HOME — drift = misconfigured
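How such a wrapper might be assembled is sketched below. Only the two preflight lines come from this document; the function name wrap_with_preflight and the use of exec to hand over to the inner command are assumptions, and the fakeroot exception is not modeled.

```python
import shlex

# The two static preflight lines, as shown above.
PREFLIGHT = (
    'test "$(id -u)" != "0" || exit 11\n'
    'test "$HOME" = "/home/agent" || exit 12\n'
)


def wrap_with_preflight(inner_argv):
    """Return a `bash -c` argv that runs the preflight, then execs the inner command."""
    # shlex.join quotes each argument so spaces and metacharacters survive bash parsing.
    script = PREFLIGHT + "exec " + shlex.join(inner_argv)
    return ["bash", "-c", script]


if __name__ == "__main__":
    print(wrap_with_preflight(["python", "-m", "agent"])[2])
```

Keeping the preflight text as a constant is what makes it attestable by hash: the same bytes are injected for every agent.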
Interaction with apptainer.fakeroot: true. Under fakeroot the
in-container id -u returns 0, which would normally trip the first
check. sac detects fakeroot at preflight time via /proc/self/uid_map:
when the map is a single-line 0 <host_uid> 1 with host_uid != 0,
the agent is running as userns-fakeroot (root inside, operator on
host) — the root-check is treated as passed. A bare id -u == 0
without that map (somehow escaped to real root) still fails fast.
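The uid_map rule described above reduces to a small parser. A minimal sketch, assuming the caller has already read /proc/self/uid_map into a string; the function name is hypothetical.

```python
def is_userns_fakeroot(uid_map_text):
    """True if the map is the single line '0 <host_uid> 1' with host_uid != 0."""
    lines = [line.split() for line in uid_map_text.splitlines() if line.strip()]
    if len(lines) != 1 or len(lines[0]) != 3:
        return False
    try:
        inside_uid, host_uid, count = (int(field) for field in lines[0])
    except ValueError:
        return False
    return inside_uid == 0 and count == 1 and host_uid != 0


if __name__ == "__main__":
    # Columns are: first uid inside the namespace, first uid outside, range length.
    print(is_userns_fakeroot("         0       1000          1\n"))
```

A process outside any user namespace sees the identity map (0 0 4294967295), which this check rejects, so real root still fails the preflight.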
Two static lines, attestable by hash. The “no host leak” property
falls out of --containall (no auto-mounts) + canonical HOME (no
operator-specific $HOME scaffolding) + the declared binds: list
(reviewable on the AgentCard). Any leak hard-stops the agent with an
explicit error.
Roadmap for sac’s isolation surface
| Item | Status |
|---|---|
| | ✅ shipped |
| Per-agent preflight in | ✅ pattern documented (this doc) |
| Default | ✅ shipped (auto-prepended when |
| | ✅ shipped ( |
| Default | ✅ shipped |
| sac-injected static preflight (D2; refined to D5 invariants) | ✅ shipped |
| AgentCard structured | ✅ shipped |
| | ✅ shipped |
| Canonical container | ✅ shipped |
| | ✅ shipped |
| Network: shared host netns for MCP/A2A interop (known limitation) | ⏳ planned migration to |
| | ⏳ planned |
The AgentCard field is the differentiator: external verifiers (orochi,
Clew) can attest “this agent ran at isolation level X” by reading the
card alone, no SIF introspection required. The shape published at
/.well-known/agent-card.json (D3 + D5):
"x-scitex-agent-container": {
  "isolation": {
    "level": "hardened",
    "containall": true,
    "cleanenv": true,
    "writable_tmpfs": false,
    "home_canonical": "/home/agent",
    "fakeroot": false,
    "preflight_passed": ["uid-nonzero", "home-canonical"],
    "preflight_allowed": [],
    "binds_count": 3,
    "binds_writable_count": 0
  }
}
level: hardened is the human shorthand for “all booleans align with
sac defaults + preflight_allowed: []”. A run with any opt-out
(relaxed: true, fakeroot: true, an entry in preflight_allowed)
publishes level: custom plus the explicit booleans, so the verifier
sees exactly what changed instead of a flat “non-standard” label.
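A verifier could recompute the level from the booleans rather than trusting the label. The sketch below assumes the hardened defaults are exactly the values in the example card; the constant and function names are hypothetical.

```python
# Assumption: defaults inferred from the example card in this document.
HARDENED_DEFAULTS = {
    "containall": True,
    "cleanenv": True,
    "writable_tmpfs": False,
    "fakeroot": False,
}


def isolation_level(isolation):
    """Return 'hardened' when every boolean matches the defaults and nothing is waived."""
    booleans_match = all(isolation.get(k) == v for k, v in HARDENED_DEFAULTS.items())
    nothing_waived = isolation.get("preflight_allowed") == []
    return "hardened" if booleans_match and nothing_waived else "custom"


def level_is_consistent(isolation):
    """Check that the published level agrees with the recomputed one."""
    return isolation.get("level") == isolation_level(isolation)
```

A card that publishes level: hardened while carrying fakeroot: true would fail level_is_consistent, which is the kind of mismatch an external auditor would want to flag.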
One-line summary for papers / READMEs
Apptainer’s default behavior prioritizes HPC convenience: it auto-binds the user’s home, /tmp, /proc, /sys, and /dev; inherits all environment variables; shares the host’s network, PID, and IPC namespaces; and uses the host’s UID/GID. For reproducibility, all of these defaults must be inverted. sac uses --containall (filesystem isolation), --cleanenv (environment isolation), --home /home/agent (canonical operator-independent HOME), and explicit --bind declarations for every host path that must be visible. Network isolation via --net --network=none or a controlled bridge is planned; sac currently shares the host network namespace as a known limitation. A sac-injected two-line preflight verifies id -u != 0 and $HOME == /home/agent at boot and fails hard on any breach.
See also
spec-reference.md: spec.apptainer.raw_args field
talking-to-agents.md: A2A surface (where isolation level should be advertised)
how-sac-works.md: overall architecture